Preface


This preface introduces the ARM Architecture Reference Manual, ARMv8, for ARMv8-A architecture profile. It 
contains the following sections: 


• About this manual.
• Using this manual.
• Conventions.
• Additional reading.
• Feedback.

Note

This document describes only the ARMv8-A architecture profile. For the behaviors required by the ARMv7-A and
ARMv7-R architecture profiles, see the ARM® Architecture Reference Manual, ARMv7-A and ARMv7-R edition.

About this manual


This manual describes the ARM® architecture v8, ARMv8. The architecture describes the operation of an
ARMv8-A Processing element (PE), and this manual includes descriptions of:


• The two Execution states, AArch64 and AArch32.
• The instruction sets:
  — In AArch32 state, the A32 and T32 instruction sets, that are compatible with earlier versions of the
    ARM architecture.
  — In AArch64 state, the A64 instruction set.
• The states that determine how a PE operates, including the current Exception level and Security state, and in
  AArch32 state the PE mode.
• The Exception model.
• The interprocessing model, that supports transitioning between AArch64 state and AArch32 state.
• The memory model, that defines memory ordering and memory management. This manual covers a single
  architecture profile, ARMv8-A, that defines a Virtual Memory System Architecture (VMSA).
• The programmers’ model, and its interfaces to System registers that control most PE and memory system
  features, and provide status information.
• The Advanced SIMD and floating-point instructions, that provide high-performance:
  — Single-precision and double-precision floating-point operations.
  — Conversions between double-precision, single-precision, and half-precision floating-point values.
  — Integer, single-precision floating-point, and in A64, double-precision vector operations in all
    instruction sets.
  — Double-precision floating-point vector operations in the A64 instruction set.
• The security model, that provides two security states to support secure applications.
• The virtualization model, that supports the virtualization of Non-secure operation.
• The Debug architecture, that provides software access to debug features.


This manual gives the assembler syntax for the instructions it describes, meaning that it describes instructions in 
textual form. However, this manual is not a tutorial for ARM assembler language, nor does it describe ARM 
assembler language, except at a very basic level. To make effective use of ARM assembler language, read the 
documentation supplied with the assembler being used. 


This manual is organized into parts: 


Part A      Provides an introduction to the ARMv8-A architecture, and an overview of the AArch64 and
            AArch32 Execution states.

Part B      Describes the application level view of the AArch64 Execution state, meaning the view from EL0.
            It describes the application level view of the programmers’ model and the memory model.

Part C      Describes the A64 instruction set, that is available in the AArch64 Execution state. The descriptions
            for each instruction also include the precise effects of each instruction when executed at EL0,
            described as unprivileged execution, including any restrictions on its use, and how the effects of the
            instruction differ at higher Exception levels. This information is of primary importance to authors
            and users of compilers, assemblers, and other programs that generate ARM machine code.

Part D      Describes the system level view of the AArch64 Execution state. It includes details of the System
            registers, most of which are not accessible from EL0, and the system level view of the programmers’
            model and the memory model. This part includes the description of self-hosted debug.

Part E      Describes the application level view of the AArch32 Execution state, meaning the view from
            EL0. It describes the application level view of the programmers’ model and the memory model.

            Note
            In AArch32 state, execution at EL0 is execution in User mode.

Part F      Describes the T32 and A32 instruction sets, that are available in the AArch32 Execution state. These
            instruction sets are backwards-compatible with earlier versions of the ARM architecture. This part
            describes the precise effects of each instruction when executed in User mode, described as
            unprivileged execution or execution at EL0, including any restrictions on its use, and how the effects
            of the instruction differ at higher Exception levels. This information is of primary importance to
            authors and users of compilers, assemblers, and other programs that generate ARM machine code.

            Note
            User mode is the only mode where software execution is unprivileged.

Part G      Describes the system level view of the AArch32 Execution state, that is generally compatible with
            earlier versions of the ARM architecture. This part includes details of the System registers, most of
            which are not accessible from EL0, and the instruction interface to those registers. It also describes
            the system level view of the programmers’ model and the memory model.

Part H      Describes the Debug architecture for external debug. This provides configuration, breakpoint and
            watchpoint support, and a Debug Communications Channel (DCC) to a debug host.

Part I      Describes additional features of the architecture that are not closely coupled to a processing element
            (PE), and therefore are accessed through memory-mapped interfaces. Some of these features are
            OPTIONAL.

Part J      Provides pseudocode that describes various features of the ARMv8 architecture.

Part K, Appendixes
            Provide additional information. Some appendixes give information that is not part of the ARMv8
            architectural requirements. The cover page of each appendix indicates its status.

Using this manual


The information in this manual is organized into parts, as described in this section. 


Part A, Introduction and Architecture Overview 


Part A gives an overview of the ARMv8-A architecture profile, including its relationship to the other ARM PE
architectures. It introduces the terminology used to describe the architecture, and gives an overview of the
Execution states, AArch64 and AArch32. It contains the following chapter:


Chapter A1 Introduction to the ARMv8 Architecture 


Read this for an introduction to the ARMv8 architecture.


Part B, The AArch64 Application Level Architecture 


Part B describes the AArch64 state application level view of the architecture. It contains the following chapters: 


Chapter B1 The AArch64 Application Level Programmers’ Model 


Read this for an application level description of the programmers’ model for software executing in 
AArch64 state. It describes execution at EL0 when EL0 is using AArch64 state.


Chapter B2 The AArch64 Application Level Memory Model 


Read this for an application level description of the memory model for software executing in 
AArch64 state. It describes the memory model for execution in EL0 when EL0 is using AArch64
state. It includes information about ARM memory types, attributes, and memory access controls. 


Part C, The A64 Instruction Set 


Part C describes the A64 instruction set, that is used in AArch64 state. It contains the following chapters: 


Chapter C1 The A64 Instruction Set 


Read this for a description of the A64 instruction set and common instruction operation details. 


Chapter C2 About the A64 Instruction Descriptions 


Read this to understand the format of the A64 instruction descriptions. 


Chapter C3 A64 Instruction Set Overview 
Read this for an overview of the individual A64 instructions, that are divided into five functional 
groups. 

Chapter C4 A64 Instruction Set Encoding 


Read this for a description of the A64 instruction set encoding. 


Chapter C5 The A64 System Instruction Class 
Read this for a description of the AArch64 system instructions and register descriptions, and the 
system instruction class encoding space. 

Chapter C6 A64 Base Instruction Descriptions 
Read this for information on key aspects of the A64 base instructions and for descriptions of the 
individual instructions, which are listed in alphabetical order. 

Chapter C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 


Read this for information on key aspects of the A64 Advanced SIMD and floating-point instructions 
and for descriptions of the individual instructions, which are listed in alphabetical order. 

Part D, The AArch64 System Level Architecture


Part D describes the AArch64 state system level view of the architecture. It contains the following chapters: 


Chapter D1 The AArch64 System Level Programmers’ Model 


Read this for a description of the AArch64 state system level view of the programmers’ model. 


Chapter D2 AArch64 Self-hosted Debug 


Read this for an introduction to, and a description of, self-hosted debug in AArch64 state. 


Chapter D3 The AArch64 System Level Memory Model 


Read this for a description of the AArch64 state system level view of the general features of the 
memory system. 


Chapter D4 The AArch64 Virtual Memory System Architecture 


Read this for a system level view of the AArch64 Virtual Memory System Architecture (VMSA), 
the memory system architecture of an ARMv8 implementation that is executing in AArch64 state. 


Chapter D5 The Performance Monitors Extension 


Read this for a description of an implementation of the ARM Performance Monitors, that are an 
optional non-invasive debug component. 


Chapter D6 The Generic Timer in AArch64 state 


Read this for a description of the AArch64 view of an implementation of the ARM Generic Timer. 


Chapter D7 AArch64 System Register Descriptions 


Read this for an introduction to, and description of, each of the AArch64 System registers. 


Part E, The AArch32 Application Level Architecture 


Part E describes the AArch32 state application level view of the architecture. It contains the following chapters: 


Chapter E1 The AArch32 Application Level Programmers’ Model 


Read this for an application level description of the programmers’ model for software executing in 
AArch32 state. It describes execution at EL0 when EL0 is using AArch32 state.


Chapter E2 The AArch32 Application Level Memory Model 


Read this for an application level description of the memory model for software executing in 
AArch32 state. It describes the memory model for execution in EL0 when EL0 is using AArch32
state. It includes information about ARM memory types, attributes, and memory access controls. 


Part F, The AArch32 Instruction Sets 


Part F describes the T32 and A32 instruction sets, that are used in AArch32 state. It contains the following chapters: 


Chapter F1 The AArch32 Instruction Sets Overview 


Read this for an overview of the T32 and A32 instruction sets. 


Chapter F2 About the T32 and A32 Instruction Descriptions 
Read this to understand the format of the T32 and A32 instruction descriptions. 


Chapter F3 The T32 Instruction Set Encoding 


Read this for a description of the T32 instruction set encoding. This includes the T32 encoding of 
the Advanced SIMD and floating-point instructions. 


Chapter F4 The A32 Instruction Set Encoding 


Read this for a description of the A32 instruction set encoding. This includes the A32 encoding of 
the Advanced SIMD and floating-point instructions. 

Chapter F5 T32 and A32 Base Instruction Set Instruction Descriptions


Read this for a description of each of the T32 and A32 base instructions. 


Chapter F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions


Read this for a description of each of the T32 and A32 Advanced SIMD and floating-point 
instructions. 


Part G, The AArch32 System Level Architecture 


Part G describes the AArch32 state system level view of the architecture. It contains the following chapters: 


Chapter G1 The AArch32 System Level Programmers’ Model 


Read this for a description of the AArch32 state system level view of the programmers’ model for 
execution in an Exception level that is using AArch32. 


Chapter G2 AArch32 Self-hosted Debug 


Read this for an introduction to, and a description of, self-hosted debug in AArch32 state.


Chapter G3 The AArch32 System Level Memory Model 


Read this for a system level view of the general features of the memory system. 


Chapter G4 The AArch32 Virtual Memory System Architecture 
Read this for a description of the AArch32 Virtual Memory System Architecture (VMSA). 


Chapter G5 The Generic Timer in AArch32 state 


Read this for a description of the AArch32 view of an implementation of the ARM Generic Timer. 


Chapter G6 AArch32 System Register Descriptions 
Read this for a description of each of the AArch32 System registers. 


Part H, External Debug 
Part H describes the architecture for external debug. It contains the following chapters: 


Chapter H1 About External Debug 


Read this for an introduction to external debug, and a definition of the scope of this part of the 
manual. 


Chapter H2 Debug State 


Read this for a description of debug state, which the PE might enter as the result of a Halting debug 
event. 


Chapter H3 Halting Debug Events 


Read this for a description of the external debug events referred to as Halting debug events. 


Chapter H4 The Debug Communication Channel and Instruction Transfer Register 


Read this for a description of the communication between a debugger and the PE debug logic using 
the Debug Communications Channel and the Instruction Transfer register. 


Chapter H5 The Embedded Cross-Trigger Interface 


Read this for a description of the embedded cross-trigger interface. 


Chapter H6 Debug Reset and Powerdown Support 


Read this for a description of reset and powerdown support in the Debug architecture. 

Chapter H7 The PC Sample-based Profiling Extension


Read this for a description of the PC Sample-based Profiling Extension that is an OPTIONAL 
extension to an ARMv8 implementation. 


Chapter H8 About the External Debug Registers 
Read this for some additional information about the external debug registers. 


Chapter H9 External Debug Register Descriptions 


Read this for a description of each external debug register. 


Part I, Memory-mapped Components of the ARMv8 Architecture
Part I describes the memory-mapped components in the architecture. It contains the following chapters: 


Chapter I1 System Level Implementation of the Generic Timer 


Read this for a definition of a system level implementation of the Generic Timer. 


Chapter I2 Recommended External Interface to the Performance Monitors 


Read this for a description of the recommended memory-mapped and external debug interfaces to 
the Performance Monitors. 


Chapter I3 External System Control Register Descriptions 


Read this for a description of each memory-mapped system control register. 


Part J, Architectural Pseudocode 


Part J contains pseudocode that describes various features of the ARM architecture. It contains the following 
chapter: 


Chapter J1 ARMv8 Pseudocode 


Read this for the pseudocode definitions that describe various features of the ARMv8 architecture, 
for operation in AArch64 state and in AArch32 state. 


Part K, Appendixes 
This manual contains the following appendixes: 


Appendix K1 Architectural Constraints on UNPREDICTABLE behaviors 


Read this for a description of the architecturally-required constraints on UNPREDICTABLE behaviors 
in the ARMv8 architecture, including AArch32 behaviors that were UNPREDICTABLE in previous
versions of the architecture. 


Appendix K2 Recommended External Debug Interface 


Read this for a description of the recommended external debug interface. 


Note

This description is not part of the ARM architecture specification. It is included here as
supplementary information, for the convenience of developers and users who might require this
information.





Appendix K3 Recommendations for Performance Monitors Event Numbers for IMPLEMENTATION 
DEFINED Events 


Read this for a description of ARM recommendations for the use of the IMPLEMENTATION DEFINED 
event numbers. 

Note

This description is not part of the ARM architecture specification. It is included here as
supplementary information, for the convenience of developers and users who might require this
information.





Appendix K4 Recommendations for reporting memory attributes on an interconnect 


Read this for the ARM recommendations about how the architectural memory attributes are 
reported on an interconnect. 


Appendix K5 ARMv8 Changes to the T32 and A32 Instruction Sets 
Read this for a summary of the changes that are introduced to the T32 and A32 instruction sets in 
ARMv8. 

Appendix K6 Legacy Instruction Syntax for AArch32 Instruction Sets 
Read this for information about the pre-UAL syntax of the AArch32 instruction sets, which can still 
be valid for the A32 instruction set. 

Appendix K7 Address Translation Examples 


Read this for examples of translation table lookups using the translation regimes described in 
Chapter D4 The AArch64 Virtual Memory System Architecture and Chapter G4 The AArch32 Virtual 
Memory System Architecture. 


Appendix K8 Example OS Save and Restore Sequences 


Read this for software examples that perform the OS Save and Restore sequences for an ARMv8 
debug implementation. 


Note
Chapter H6 Debug Reset and Powerdown Support describes the OS Save and Restore mechanism. 





Appendix K9 Recommended Upload and Download Processes for External Debug 


Read this for information about implementing and using the ARM architecture. 


Note


This description is not part of the ARM architecture specification. It is included here as 
supplementary information, for the convenience of developers and users who might require this 
information. 





Appendix K10 Barrier Litmus Tests 
Read this for examples of the use of barrier instructions provided by the ARMv8 architecture. 


Note


This description is not part of the ARM architecture specification. It is included here as 
supplementary information, for the convenience of developers and users who might require this 
information. 





Appendix K11 ARM Pseudocode Definition 
Read this for definitions of the AArch32 pseudocode. 


Appendix K12 Registers Index 


Read this for an alphabetic and functional index of AArch32 and AArch64 registers, and 
memory-mapped registers. 







Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




Conventions 
The following sections describe conventions that this book can use:

• Typographic conventions.
• Signals.
• Numbers.
• Pseudocode descriptions.
• Assembler syntax descriptions.


Typographic conventions 


The typographical conventions are: 


italic          Introduces special terminology, and denotes citations.

bold            Denotes signal names, and is used for terms in descriptive lists, where appropriate.

monospace       Used for assembler syntax descriptions, pseudocode, and source code examples.
                Also used in the main text for instruction mnemonics and for references to other items appearing in
                assembler syntax descriptions, pseudocode, and source code examples.

SMALL CAPITALS  Used in body text for a few terms that have specific technical meanings, and are defined in the
                Glossary.

Colored text    Indicates a link. This can be:
                • A URL, for example http://infocenter.arm.com.
                • A cross-reference, that includes the page number of the referenced information if it is not on
                  the current page, for example, Assembler syntax descriptions on page xxiv.
                • A link, to a chapter or appendix, or to a glossary entry, or to the section of the document that
                  defines the colored term, for example Simple sequential execution or SCTLR.


Signals 
In general this specification does not define hardware signals, but it does include some signal examples and
recommendations. The signal conventions are:

Signal level    The level of an asserted signal depends on whether the signal is active-HIGH or
                active-LOW. Asserted means:
                • HIGH for active-HIGH signals.
                • LOW for active-LOW signals.

Lower-case n    At the start or end of a signal name denotes an active-LOW signal.
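
The naming rule above can be sketched in a few lines of Python. This is an illustration only, not part of the architecture: the function names and the signal names in the usage note are assumptions, and the check is a simple heuristic that trusts the lower-case n convention.

```python
def is_active_low(signal_name: str) -> bool:
    """Apply the convention: a lower-case 'n' at the start or end marks active-LOW."""
    return signal_name.startswith("n") or signal_name.endswith("n")


def asserted_level(signal_name: str) -> str:
    """Return the level that means 'asserted' for the named signal."""
    return "LOW" if is_active_low(signal_name) else "HIGH"
```

With illustrative names, `asserted_level("nIRQ")` gives `"LOW"`, while `asserted_level("DBGEN")` gives `"HIGH"` because its final N is upper-case.
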
Numbers 


Numbers are normally written in decimal. Binary numbers are preceded by 0b, and hexadecimal numbers by 0x. In
both cases, the prefix and the associated value are written in a monospace font, for example 0xFFFF0000. To improve
readability, long numbers can be written with an underscore separator between every four characters, for example
0xFFFF_0000_0000_0000. Ignore any underscores when interpreting the value of a number.
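
As a minimal sketch of how a tool might read numbers written in this notation, the following Python helper strips the underscore separators and then interprets the prefix. The function name `parse_manual_number` is an assumption made for this example; the manual does not define such a function.

```python
def parse_manual_number(text: str) -> int:
    """Interpret a number written with the 0b / 0x prefixes and underscore separators."""
    cleaned = text.replace("_", "")  # underscores carry no value, they only aid reading
    if cleaned.startswith("0b"):
        return int(cleaned[2:], 2)   # binary
    if cleaned.startswith("0x"):
        return int(cleaned[2:], 16)  # hexadecimal
    return int(cleaned, 10)          # decimal is the default
```

For example, `parse_manual_number("0xFFFF_0000_0000_0000")` and `parse_manual_number("0xFFFF000000000000")` return the same value.
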


Pseudocode descriptions 


This manual uses a form of pseudocode to provide precise descriptions of the specified functionality. This 
pseudocode is written in monospace font, and is described in Appendix K11 ARM Pseudocode Definition. 

Assembler syntax descriptions


This manual contains numerous syntax descriptions for assembler instructions and for components of assembler 
instructions. These are shown in a monospace font, and use the conventions described in Structure of the A64 
assembler language on page C1-123, Appendix K11 ARM Pseudocode Definition, and Pseudocode operators and 
keywords on page K12-5648. 

Additional reading


This section lists relevant publications from ARM and third parties. 


See the Infocenter, http://infocenter.arm.com, for access to ARM documentation. 


ARM publications

• ARM® AMBA® 4 ATB Protocol Specification, ATBv1.0 and ATBv1.1 (ARM IHI 0032B).
• ARM® Architecture Reference Manual, ARMv7-A and ARMv7-R edition (ARM DDI 0406).
• ARM® Debug Interface Architecture Specification, ADIv5.0 to ADIv5.2 (ARM IHI 0031).
• ARM® Embedded Trace Macrocell Architecture Specification, ETMv4 (ARM IHI 0064).
• ARM® Generic Interrupt Controller Architecture Specification, GIC architecture version 3.0 and version 4.0
  (ARM IHI 0069).
• ARM® CoreSight™ SoC Technical Reference Manual (ARM DDI 0480).
• ARM® CoreSight™ v2.0 Architecture Specification (ARM IHI 0029).
• ARM® Procedure Call Standard for the ARM 64-bit Architecture (ARM IHI 0055).


Other publications

The following publications are referred to in this manual, or provide more information:

• Announcing the Advanced Encryption Standard (AES), Federal Information Processing Standards
  Publication 197, November 2001.
• IEEE Std 754-2008, IEEE Standard for Floating-point Arithmetic, August 2008.
• IEEE Std 754-1985, IEEE Standard for Floating-point Arithmetic, March 1985.
• Secure Hash Standard (SHA), Federal Information Processing Standards Publication 180-2, August 2002.
• The Galois/Counter Mode of Operation, McGrew, D. and Viega, J., Submission to NIST Modes of Operation
  Process, January 2004.
• Memory Consistency Models for Shared Memory-Multiprocessors, Gharachorloo, Kourosh, 1995, Stanford
  University Technical Report CSL-TR-95-685.
• JEDEC Solid State Technology Association, Standard Manufacturer’s Identification Code, JEP 106.

Feedback


ARM welcomes feedback on its documentation. 


Feedback on this manual 


If you have comments on the content of this manual, send an e-mail to errata@arm.com. Give: 


• The title.
• The number, ARM DDI 0487A.k.
• The page numbers to which your comments apply.
• A concise explanation of your comments.


ARM also welcomes general suggestions for additions and improvements. 

Chapter A1


Introduction to the ARMv8 Architecture 


This chapter introduces the ARM architecture. It contains the following sections: 


• About the ARM architecture.
• Architecture profiles.
• ARMv8 architectural concepts.
• Supported data types.
• Floating-point and Advanced SIMD support.
• Cryptographic Extension.
• The ARM memory model.

A1.1 About the ARM architecture


The ARM architecture described in this Architecture Reference Manual defines the behavior of an abstract machine, 
referred to as a processing element, often abbreviated to PE. Implementations compliant with the ARM architecture 
must conform to the described behavior of the processing element. It is not intended to describe how to build an 
implementation of the PE, nor to limit the scope of such implementations beyond the defined behaviors. 


Except where the architecture specifies differently, the programmer-visible behavior of an implementation that is 
compliant with the ARM architecture must be the same as a simple sequential execution of the program on the 
processing element. This programmer-visible behavior does not include the execution time of the program. 


The ARM Architecture Reference Manual also describes rules for software to use the processing element.

The ARM architecture includes definitions of:

• An associated debug architecture, see:
  — Chapter D2 AArch64 Self-hosted Debug.
  — Chapter G2 AArch32 Self-hosted Debug.
  — Part H of this manual, External Debug on page 4837.
• Associated trace architectures, that define trace macrocells that implementers can implement with the
  associated processor hardware. For more information see the Embedded Trace Macrocell Architecture
  Specification.


The ARM architecture is a Reduced Instruction Set Computer (RISC) architecture with the following RISC 
architecture features: 


• A large uniform register file.
• A load/store architecture, where data-processing operations only operate on register contents, not directly on
  memory contents.
• Simple addressing modes, with all load/store addresses determined from register contents and instruction
  fields only.
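
The load/store and addressing features listed above can be illustrated with a toy model. The sketch below is not ARM pseudocode and does not model any real instruction encoding: the class `ToyPE`, its method names, and the register numbering are all assumptions made for illustration. It shows that updating a value in memory takes three steps (load, operate on registers, store), and that each address is formed only from a register value plus an offset field.

```python
class ToyPE:
    """A toy processing element with a small register file and a sparse memory."""

    def __init__(self, num_regs: int = 4):
        self.regs = [0] * num_regs
        self.mem = {}

    def ldr(self, rd: int, rn: int, offset: int = 0) -> None:
        # Load: address = base register contents + instruction offset field.
        self.regs[rd] = self.mem.get(self.regs[rn] + offset, 0)

    def str_(self, rt: int, rn: int, offset: int = 0) -> None:
        # Store: the same addressing rule as the load.
        self.mem[self.regs[rn] + offset] = self.regs[rt]

    def add(self, rd: int, rn: int, rm: int) -> None:
        # Data processing reads and writes registers only, never memory.
        self.regs[rd] = self.regs[rn] + self.regs[rm]


def increment_word(pe: ToyPE, base_reg: int, offset: int, step_reg: int, scratch_reg: int) -> None:
    """Add regs[step_reg] to the word at regs[base_reg] + offset."""
    pe.ldr(scratch_reg, base_reg, offset)
    pe.add(scratch_reg, scratch_reg, step_reg)
    pe.str_(scratch_reg, base_reg, offset)
```

There is deliberately no method that adds directly to a memory location; `increment_word` has to go through a scratch register.
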


The architecture defines the interaction of the PE with memory, including caches, and includes a memory translation 
system. It also describes how multiple PEs interact with each other and with other observers in a system. 


This document defines the ARMv8-A architecture profile. See Architecture profiles on page A1-32 for more 
information. 


The ARM architecture supports implementations across a wide range of performance points. Implementation size, 
performance, and very low power consumption are key attributes of the ARM architecture. 


An important feature of the ARMv8 architecture is backwards compatibility, combined with the freedom for optimal
implementation in a wide range of standard and more specialized use cases. The ARMv8 architecture supports:

• A 64-bit Execution state, AArch64.
• A 32-bit Execution state, AArch32, that is compatible with previous versions of the ARM architecture.





Note

• The AArch32 Execution state is compatible with the ARMv7-A architecture profile, and enhances that
  profile to support some features included in the AArch64 Execution state.
• This document describes only the ARMv8-A architecture profile. For the behaviors required by the
  ARMv7-A and ARMv7-R architecture profiles, see the ARM® Architecture Reference Manual, ARMv7-A and
  ARMv7-R edition.





Features that are optional are explicitly defined as such in this Manual. 


Note 


The presence of an ID register field for a feature does not imply that the feature is optional. 

Both Execution states support SIMD and floating-point instructions:


• AArch32 state provides:
  — SIMD instructions in the base instruction sets, that operate on the 32-bit general-purpose registers.
  — Advanced SIMD instructions that operate on registers in the SIMD and floating-point register
    (SIMD&FP register) file.
  — Floating-point instructions that operate on registers in the SIMD&FP register file.
• AArch64 state provides:
  — Advanced SIMD instructions that operate on registers in the SIMD&FP register file.
  — Floating-point instructions that operate on registers in the SIMD&FP register file.


Note 


See Conventions on page xxiii for information about conventions used in this manual, including the use of SMALL 
CAPITALS for particular terms that have ARM-specific meanings that are defined in the Glossary. 

A1.2 Architecture profiles


The ARM architecture has evolved significantly since its introduction, and ARM continues to develop it. Eight
major versions of the architecture have been defined to date, denoted by the version numbers 1 to 8. Of these, the
first three versions are now obsolete.


The generic names AArch64 and AArch32 describe the 64-bit and 32-bit Execution states: 


AArch64 Is the 64-bit Execution state, meaning addresses are held in 64-bit registers, and instructions in the 
base instruction set can use 64-bit registers for their processing. AArch64 state supports the A64 
instruction set. 


AArch32 Is the 32-bit Execution state, meaning addresses are held in 32-bit registers, and instructions in the 
base instruction sets use 32-bit registers for their processing. AArch32 state supports the T32 and 
A32 instruction sets. 
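
One practical consequence of the register width is where arithmetic wraps. The following Python sketch is illustrative only: the table `REGISTER_WIDTH` and the function `add_in_state` are names invented for this example, and the function models only an addition on full-width general-purpose registers, not any specific instruction.

```python
# Width in bits of the registers that hold addresses in each Execution state.
REGISTER_WIDTH = {"AArch64": 64, "AArch32": 32}


def add_in_state(state: str, a: int, b: int) -> int:
    """Add two values and truncate the result to the register width of the state."""
    mask = (1 << REGISTER_WIDTH[state]) - 1
    return (a + b) & mask
```

Adding 1 to `0xFFFFFFFF` wraps to 0 when the result is held in a 32-bit register, but gives `0x1_0000_0000` when it is held in a 64-bit register.
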





Note 


The Base instruction set comprises the supported instructions other than the Advanced SIMD and floating-point 
instructions. 





See sections Execution state on page A1-33 and The ARM instruction sets on page A1-34 for more information.

ARM defines three architecture profiles:


A           Application profile, described in this manual:
            • Supports a Virtual Memory System Architecture (VMSA) based on a Memory Management
              Unit (MMU).

              Note
              An ARMv8-A implementation can be called an AArchv8-A implementation.

            • Supports the A64, A32, and T32 instruction sets.

R           Real-time profile:
            • Supports a Protected Memory System Architecture (PMSA) based on a Memory Protection
              Unit (MPU).
            • Supports the A32 and T32 instruction sets.

M           Microcontroller profile:
            • Implements a programmers' model designed for low-latency interrupt processing, with
              hardware stacking of registers and support for writing interrupt handlers in high-level
              languages.
            • Implements a variant of the R-profile PMSA.
            • Supports a variant of the T32 instruction set.
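
The three profiles can be summarized as a small lookup table. The Python sketch below restates only what the list above says; the structure, the field names, and the helper `profiles_supporting` are assumptions chosen for illustration, and the M-profile entries are marked as variants rather than treated as identical to the R-profile PMSA or to T32.

```python
# Summary of the profile descriptions; field names are illustrative.
PROFILES = {
    "A": {"name": "Application", "memory": "VMSA (MMU)",
          "instruction_sets": ("A64", "A32", "T32")},
    "R": {"name": "Real-time", "memory": "PMSA (MPU)",
          "instruction_sets": ("A32", "T32")},
    "M": {"name": "Microcontroller", "memory": "variant of the R-profile PMSA",
          "instruction_sets": ("T32 variant",)},
}


def profiles_supporting(instruction_set: str) -> list:
    """Return the profile letters whose list includes the given instruction set."""
    return [letter for letter, info in PROFILES.items()
            if instruction_set in info["instruction_sets"]]
```

For example, `profiles_supporting("A64")` returns only the A profile, which is the profile this manual describes.
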

Note

This Architecture Reference Manual describes only the ARMv8-A profile.

For information about the R and M architecture profiles, and earlier ARM architecture versions see:
• The ARM® Architecture Reference Manual, ARMv7-A and ARMv7-R edition.
• The ARM®v7-M Architecture Reference Manual.
• The ARM®v6-M Architecture Reference Manual.


A1.2.1 Debug architecture version 


The ARM Debug architecture is fully integrated with the architecture, and does not have a separate version number. 

A1.3 ARMv8 architectural concepts


ARMvV8 introduces major changes to the ARM architecture, while maintaining a high level of consistency with 
previous versions of the architecture. The ARMv8 Architecture Reference Manual includes significant changes in 
the terminology used to describe the architecture, and this section introduces both the ARMv8 architectural concepts 
and the associated terminology. 


The following subsections describe key ARMv8 architectural concepts. Each section introduces the corresponding
terms that are used to describe the architecture: 


° Execution state.
° The ARM instruction sets on page A1-34.
° System registers on page A1-34.
° ARMv8 Debug on page A1-35.


A1.3.1 Execution state 


The Execution state defines the PE execution environment, including: 


° The supported register widths.
° The supported instruction sets.
° Significant aspects of:
— The exception model.
— The Virtual Memory System Architecture (VMSA).
— The programmers’ model.


The Execution states are: 


AArch64 The 64-bit Execution state. This Execution state:
° Provides 31 64-bit general-purpose registers, of which X30 is used as the procedure link
register.
° Provides a 64-bit program counter (PC), stack pointers (SPs), and exception link registers
(ELRs).
° Provides 32 128-bit registers for SIMD vector and scalar floating-point support.
° Provides a single instruction set, A64. For more information, see The ARM instruction sets
on page A1-34.
° Defines the ARMv8 Exception model, with up to four Exception levels, EL0 - EL3, that
provide an execution privilege hierarchy, see Exception levels on page D1-1498.
° Provides support for 64-bit virtual addressing. For more information, including the limits on
address ranges, see Chapter D4 The AArch64 Virtual Memory System Architecture.
° Defines a number of Process state (PSTATE) elements that hold PE state. The A64
instruction set includes instructions that operate directly on various PSTATE elements.
° Names each System register using a suffix that indicates the lowest Exception level at which
the register can be accessed.
AArch32 The 32-bit Execution state. This Execution state:
° Provides 13 32-bit general-purpose registers, and a 32-bit PC, SP, and link register (LR). The
LR is used as both an ELR and a procedure link register.
Some of these registers have multiple banked instances for use in different PE modes.
° Provides a single ELR, for exception returns from Hyp mode.
° Provides 32 64-bit registers for Advanced SIMD vector and scalar floating-point support.
° Provides two instruction sets, A32 and T32. For more information, see The ARM instruction
sets on page A1-34.
° Supports the ARMv7-A exception model, based on PE modes, and maps this onto the
ARMv8 Exception model, that is based on the Exception levels.
° Provides support for 32-bit virtual addressing.
° Defines a number of Process state (PSTATE) elements that hold PE state. The A32 and T32
instruction sets include instructions that operate directly on various PSTATE elements, and
instructions that access PSTATE by using the Application Program Status Register (APSR)
or the Current Program Status Register (CPSR).


Later subsections give more information about the different properties of the Execution states. 


Transitioning between the AArch64 and AArch32 Execution states is known as interprocessing. The PE can move 
between Execution states only on a change of Exception level, and subject to the rules given in Interprocessing on
page D1-1607. This means different software layers, such as an application, an operating system kernel, and a 
hypervisor, executing at different Exception levels, can execute in different Execution states. 





A1.3.2 The ARM instruction sets 
In ARMv8 the possible instruction sets depend on the Execution state:
AArch64 AArch64 state supports only a single instruction set, called A64. This is a fixed-length instruction 
set that uses 32-bit instruction encodings. 
For information on the A64 instruction set, see Chapter C3 A64 Instruction Set Overview. 
AArch32 AArch32 state supports the following instruction sets: 
A32 This is a fixed-length instruction set that uses 32-bit instruction encodings. 
T32 This is a variable-length instruction set that uses both 16-bit and 32-bit instruction 
encodings. 
In previous documentation, these instruction sets were called the ARM and Thumb instruction sets. 
ARMv8 extends each of these instruction sets. In AArch32 state, the Instruction set state determines
the instruction set that the PE executes.
For information on the A32 and T32 instruction sets, see Chapter F1 The AArch32 Instruction Sets
Overview.
The ARMv8 instruction sets support SIMD and scalar floating-point instructions. See Floating-point and Advanced
SIMD support on page A1-46. 
A1.3.3 System registers 
System registers provide control and status information of architected features. 
The System registers use a standard naming format: <register_name>.<bit_field_name> to identify specific 
registers as well as control and status bits within a register. 
Bits can also be described by their numerical position in the form <register_name>[x:y] or the generic form 
bits[x:y]. 
In addition, in AArch64 state, most register names include the lowest Exception level that can access the register as 
a suffix to the register name: 
° <register_name>_ELx, where x is 0, 1, 2, or 3. 
For information about Exception levels, see Exception levels on page D1-1498. 
The System registers comprise: 
° The following registers that are described in this manual:
— General system control registers.
— Debug registers.
— Generic Timer registers.
— Optionally, Performance Monitor registers.
° Optionally, one or more of the following groups of registers that are defined in other ARM architecture
specifications:
— Trace System registers, as defined in the Embedded Trace Macrocell Architecture Specification,
ETMv4.


— Generic Interrupt Controller (GIC) System registers, see The ARM Generic Interrupt Controller 
System registers. 


For information about the AArch64 System registers, see Chapter D7 AArch64 System Register Descriptions. 


For information about the AArch32 System registers, see Chapter G6 AArch32 System Register Descriptions. 


The ARM Generic Interrupt Controller System registers 


From version 3 of the ARM Generic Interrupt Controller architecture, GICv3, the GIC architecture specification 
defines a System register interface to some of its functionality. The System register summaries in this manual 
include these registers, see: 


° About the GIC System registers on page C5-290, for more information about the AArch64 GIC System 
registers. 


° Generic Interrupt Controller System registers, functional groups on page G4-4207, for more information 
about the AArch32 GIC System registers. 


These sections give only short overviews of the GIC System registers. For more information, including descriptions 
of the registers, see the ARM® Generic Interrupt Controller Architecture Specification, GIC architecture version 3.0 
and version 4.0 (ARM THI 0069). 


Note 


The programmers’ model for earlier versions of the GIC architecture is wholly memory-mapped. 








A1.3.4 ARMv8 Debug 


ARMv8 supports the following:


Self-hosted debug 


In this model, the PE generates debug exceptions. Debug exceptions are part of the ARMv8 
Exception model. 


External debug 


In this model, debug events cause the PE to enter Debug state. In Debug state the PE is controlled 
by an external debugger. 


All ARMv8 implementations support both models. The model chosen by a particular user depends on the debug
requirements during different stages of the design and development life cycle of the product. For example, external 
debug might be used during debugging of the hardware implementation and OS bring-up, and self-hosted debug 
might be used during application development. 


For more information about self-hosted debug: 
° In AArch64 state, see Chapter D2 AArch64 Self-hosted Debug. 
° In AArch32 state, see Chapter G2 AArch32 Self-hosted Debug. 


For more information about external debug, see Part H External Debug. 
A1.4 Supported data types
The ARMv8 architecture supports the following integer data types:
Byte 8 bits. 
Halfword 16 bits. 
Word 32 bits. 
Doubleword 64 bits. 
Quadword 128 bits. 
The architecture also supports the following floating-point data types: 
° Half-precision, see Half-precision floating-point formats on page A1-40 for details. 
° Single-precision, see Single-precision floating-point format on page A1-42 for details. 
° Double-precision, see Double-precision floating-point format on page A1-43 for details.
It also supports: 
° Fixed-point interpretation of words and doublewords. See Fixed-point format on page A1-44. 
° Vectors, where a register holds multiple elements, each of the same data type. See Vector formats on 
page A1-37 for details. 
The ARMv8 architecture provides two register files:
° A general-purpose register file.
° A SIMD&FP register file. 
In each of these, the possible register widths depend on the Execution state. 
In AArch64 state: 
° A general-purpose register file contains 64-bit registers: 

— Many instructions can access these registers as 64-bit registers or as 32-bit registers, using only the 
bottom 32 bits. 

° A SIMD&FP register file contains 128-bit registers: 

— The quadword integer data types only apply to the SIMD&FP register file. 

— The floating-point data types only apply to the SIMD&FP register file. 

— While the AArch64 vector registers support 128-bit vectors, the effective vector length can be 64 bits
or 128 bits depending on the A64 instruction encoding used, see Instruction Mnemonics on
page C1-123.

For more information on the register files in AArch64 state, see Registers in AArch64 Execution state on 
page B1-59. 
In AArch32 state: 
° A general-purpose register file contains 32-bit registers: 
— Two 32-bit registers can support a doubleword. 
— Vector formatting is supported, see Figure A1-4 on page A1-40.
° A SIMD&FP register file contains 64-bit registers: 
—  AArch32 state does not support quadword integer or floating-point data types. 
Note 
Two consecutive 64-bit registers can be used as a 128-bit register. 
For more information on the register files in AArch32 state, see The general-purpose registers, and the PC, in 
AArch32 state on page E1-2291. 
A1.4.1 Vector formats
In an implementation that includes the SIMD instructions that operate on the SIMD&FP register file, a register can 
hold one or more packed elements, all of the same size and type. The combination of a register and a data type 
describes a vector of elements. The vector is considered to be an array of elements of the data type specified in the 
instruction. The number of elements in the vector is implied by the size of the data elements and the size of the 
register. 
Vector indices are in the range 0 to (number of elements - 1). An index of 0 refers to the least significant end of the
vector. 
Vector formats in AArch64 state 
In AArch64 state, the SIMD&FP registers can be referred to as Vn, where n is a value from 0 to 31.
The SIMD&FP registers support three data formats for loads, stores and data-processing operations:
° A single, scalar, element in the least significant bits of the register.
° A 64-bit vector of byte, halfword, or word elements. 
° A 128-bit vector of byte, halfword, word or doubleword elements. 
The element sizes are defined in Table A1-1 with the vector format described as:
° For a 128-bit vector: Vn{.2D, .4S, .8H, .16B}. 
° For a 64-bit vector: Vn{.1D, .2S, .4H, .8B}. 
Table A1-1 SIMD elements in AArch64 state
Mnemonic | Size
B | 8 bits
H | 16 bits
S | 32 bits
D | 64 bits
Figure A1-1 on page A1-38 shows the SIMD vectors in AArch64 state.
Figure A1-1 SIMD vectors in AArch64 state
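
The relationship between register size, element size, and vector index described above can be sketched in a few lines of Python. This is only an illustration: the names ELEMENT_BITS, lane_count, and split_lanes are hypothetical helpers, not part of the architecture or its pseudocode, and a register is modelled as a plain non-negative integer.

```python
# Element sizes in bits, as listed in Table A1-1.
ELEMENT_BITS = {"B": 8, "H": 16, "S": 32, "D": 64}


def lane_count(register_bits, mnemonic):
    """Number of elements implied by the register size and the element size."""
    return register_bits // ELEMENT_BITS[mnemonic]


def split_lanes(value, register_bits, mnemonic):
    """Split a register value into its elements.

    Index 0 of the returned list is the element at the least significant
    end of the register.
    """
    width = ELEMENT_BITS[mnemonic]
    mask = (1 << width) - 1
    count = lane_count(register_bits, mnemonic)
    return [(value >> (index * width)) & mask for index in range(count)]
```

For example, lane_count(128, "S") gives the four elements of a Vn.4S vector, and split_lanes applied to a 64-bit value with "H" returns its four halfwords, lowest first.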


Vector formats in AArch32 state 


Table A1-2 shows the available formats. Each instruction description specifies the data types that the instruction
supports. 


Table A1-2 Advanced SIMD data types in AArch32 state
Data type specifier | Meaning
.<size> | Any element of <size> bits
.F<size> | Floating-point number of <size> bits
.I<size> | Signed or unsigned integer of <size> bits
.P<size> | Polynomial over {0, 1} of degree less than <size>
.S<size> | Signed integer of <size> bits
.U<size> | Unsigned integer of <size> bits





Polynomial arithmetic over {0, 1} on page A1-45 describes the polynomial data type. 
The .F16 data type is the half-precision data type selected by the FPSCR.AHP bit. 


The .F32 data type is the ARM standard single-precision floating-point data type, see Single-precision 
floating-point format on page A1-42. 


The instruction definitions use a data type specifier to define the data types appropriate to the operation. Figure A1-2 
on page A1-39 shows the hierarchy of the Advanced SIMD data types. 
† Output format only. See VMULL instruction description.
‡ Available only if the Cryptographic Extension is implemented. See VMULL instruction description.
Figure A1-2 Advanced SIMD data type hierarchy in AArch32 state
For example, a multiply instruction must distinguish between integer and floating-point data types. 


An integer multiply instruction that generates a double-width (long) result must specify the input data types as 
signed or unsigned. However, some integer multiply instructions use modulo arithmetic, and therefore do not have 
to distinguish between signed and unsigned inputs. 


Figure A1-3 on page A1-40 shows the Advanced SIMD vectors in AArch32 state.


Note 


In AArch32 state, a pair of even and following odd numbered doubleword registers can be concatenated and treated 
as a single quadword register. 
Figure A1-3 Advanced SIMD vectors in AArch32 state


The AArch32 general-purpose registers support vector formats for use by the SIMD instructions in the Base
instruction set. Figure A1-4 shows these formats, in which a general-purpose register can be treated as either
two halfwords or four bytes.
Figure A1-4 Vector formatting in AArch32 state


A1.4.2 Half-precision floating-point formats 


ARMv8 supports two half-precision floating-point formats:
° IEEE half-precision, as described in the IEEE 754-2008 standard.
° Alternative half-precision.


Note 


Half-precision floating-point formats can only be converted to and from other floating-point formats. They cannot 
be used in any other data-processing operations. This applies to both AArch32 state and AArch64 state. 
The description of IEEE half-precision includes ARM-specific details that are left open by the standard, and is only
an introduction to the formats and to the values they can contain. For more information, especially on the handling 
of infinities, NaNs and signed zeros, see the IEEE 754 standard. 


For both half-precision floating-point formats, the layout of the 16-bit format is the same. The format is: 


S: bit[15] | exponent: bits[14:10] | fraction: bits[9:0]

The interpretation of the format depends on the value of the exponent field, bits[14:10] and on which half-precision
format is being used.
0 < exponent < 0x1F
The value is a normalized number and is equal to:
(-1)^S x 2^(exponent-15) x (1.fraction)
The minimum positive normalized number is 2^-14, or approximately 6.104 x 10^-5.
The maximum positive normalized number is (2 - 2^-10) x 2^15, or 65504.
Larger normalized numbers can be expressed using the alternative format when the
exponent == 0x1F.
exponent == 0
The value is either a zero or a denormalized number, depending on the fraction bits:
fraction == 0
The value is a zero. There are two distinct zeros:
+0 when S==0.
-0 when S==1.
fraction != 0
The value is a denormalized number and is equal to:
(-1)^S x 2^-14 x (0.fraction)
The minimum positive denormalized number is 2^-24, or approximately 5.960 x 10^-8.


exponent == 0x1F
The value depends on which half-precision format is being used:
IEEE half-precision
The value is either an infinity or a Not a Number (NaN), depending on the fraction bits:
fraction == 0
The value is an infinity. There are two distinct infinities:
+infinity When S==0. This represents all positive numbers that are too
big to be represented accurately as a normalized number.
-infinity When S==1. This represents all negative numbers with an
absolute value that is too big to be represented accurately as a
normalized number.
fraction != 0
The value is a NaN, and is either a quiet NaN or a signaling NaN.
The two types of NaN are distinguished by their most significant fraction
bit, bit[9]:
bit[9] == 0 The NaN is a signaling NaN. The sign bit can take any value,
and the remaining fraction bits can take any value except all
zeros.
bit[9] == 1 The NaN is a quiet NaN. The sign bit and remaining fraction
bits can take any value.
Alternative half-precision
The value is a normalized number and is equal to:
(-1)^S x 2^16 x (1.fraction)
The maximum positive normalized number is (2 - 2^-10) x 2^16, or 131008.


A1.4.3 Single-precision floating-point format


The single-precision floating-point format is as defined by the IEEE 754 standard. 


This description includes ARM-specific details that are left open by the standard. It is only intended as an 
introduction to the formats and to the values they can contain. For full details, especially of the handling of infinities, 
NaNs and signed zeros, see the IEEE 754 standard. 


A single-precision value is a 32-bit word with the format: 


S: bit[31] | exponent: bits[30:23] | fraction: bits[22:0]

The interpretation of the format depends on the value of the exponent field, bits[30:23]:
0 < exponent < 0xFF
The value is a normalized number and is equal to:
(-1)^S x 2^(exponent-127) x (1.fraction)
The minimum positive normalized number is 2^-126, or approximately 1.175 x 10^-38.
The maximum positive normalized number is (2 - 2^-23) x 2^127, or approximately 3.403 x 10^38.


exponent == 0
The value is either a zero or a denormalized number, depending on the fraction bits:
fraction == 0
The value is a zero. There are two distinct zeros:
+0 When S==0.
-0 When S==1.
These usually behave identically. In particular, the result is equal if +0 and -0 are
compared as floating-point numbers. However, they yield different results in some
circumstances. For example, the sign of the infinity produced as the result of dividing
by zero depends on the sign of the zero. The two zeros can be distinguished from each
other by performing an integer comparison of the two words.
fraction != 0
The value is a denormalized number and is equal to:
(-1)^S x 2^-126 x (0.fraction)
The minimum positive denormalized number is 2^-149, or approximately 1.401 x 10^-45.
Denormalized numbers are always flushed to zero in Advanced SIMD processing in AArch32 state.
They are optionally flushed to zero in floating-point processing and in Advanced SIMD processing
in AArch64 state. For details see Flush-to-zero on page A1-49.


exponent == 0xFF
The value is either an infinity or a Not a Number (NaN), depending on the fraction bits:
fraction == 0
The value is an infinity. There are two distinct infinities:
+infinity When S==0. This represents all positive numbers that are too big to be
represented accurately as a normalized number.
-infinity When S==1. This represents all negative numbers with an absolute value
that is too big to be represented accurately as a normalized number.
fraction != 0
The value is a NaN, and is either a quiet NaN or a signaling NaN.
The two types of NaN are distinguished by their most significant fraction bit, bit[22]:
bit[22] == 0
The NaN is a signaling NaN. The sign bit can take any value, and the
remaining fraction bits can take any value except all zeros.
bit[22] == 1
The NaN is a quiet NaN. The sign bit and remaining fraction bits can take
any value.


For details of the default NaN see NaN handling and the Default NaN on page A1-50. 


Note 


NaNs with different sign or fraction bits are distinct NaNs, but this does not mean software can use floating-point 
comparison instructions to distinguish them. This is because the IEEE 754 standard specifies that a NaN compares 
as unordered with everything, including itself. 
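
The statements above about signed zeros and NaNs can be checked with a short Python sketch that moves between a single-precision value and its 32-bit word. The helper names are hypothetical, and the sketch relies on Python's struct module and host floating-point behaviour rather than on any ARM instruction.

```python
import struct


def f32_bits(value):
    """Return the 32-bit single-precision encoding of value as an integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def f32_from_bits(bits):
    """Return the float whose single-precision encoding is bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def is_nan_word(bits):
    """True if the word encodes a NaN: exponent == 0xFF and fraction != 0."""
    return (bits >> 23) & 0xFF == 0xFF and bits & 0x7FFFFF != 0


def is_quiet_nan_word(bits):
    """True if the word encodes a quiet NaN, that is a NaN with bit[22] == 1."""
    return is_nan_word(bits) and (bits >> 22) & 1 == 1


pos_zero = f32_from_bits(0x00000000)
neg_zero = f32_from_bits(0x80000000)

# A floating-point comparison treats the two zeros as equal...
zeros_compare_equal = pos_zero == neg_zero
# ...but an integer comparison of the two words distinguishes them.
zeros_same_word = f32_bits(pos_zero) == f32_bits(neg_zero)

# A NaN compares as unordered with everything, including itself.
quiet_nan = f32_from_bits(0x7FC00000)
nan_equals_itself = quiet_nan == quiet_nan
```

Classifying NaNs from the integer word, as is_nan_word and is_quiet_nan_word do, avoids the floating-point comparison that the Note rules out.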








A1.4.4 Double-precision floating-point format 


The double-precision floating-point format is as defined by the IEEE 754 standard. Double-precision floating-point 
is supported by both floating-point and SIMD instructions in AArch64 state, and only by floating-point instructions 
in AArch32 state. 


This description includes implementation-specific details that are left open by the standard. It is only intended as an 
introduction to the formats and to the values they can contain. For full details, especially of the handling of infinities, 
NaNs and signed zeros, see the IEEE 754 standard. 


A double-precision value is a 64-bit doubleword, with the format: 


S: bit[63] | exponent: bits[62:52] | fraction: bits[51:0]


Double-precision values represent numbers, infinities and NaNs in a similar way to single-precision values, with 
the interpretation of the format depending on the value of the exponent: 


0 < exponent < 0x7FF 
The value is a normalized number and is equal to: 
(-1)^S x 2^(exponent-1023) x (1.fraction)
The minimum positive normalized number is 2^-1022, or approximately 2.225 x 10^-308.
The maximum positive normalized number is (2 - 2^-52) x 2^1023, or approximately 1.798 x 10^308.





exponent == 0
The value is either a zero or a denormalized number, depending on the fraction bits:
fraction == 0
The value is a zero. There are two distinct zeros that behave in the same way as the two
single-precision zeros:
+0 when S==0.
-0 when S==1.
fraction != 0
The value is a denormalized number and is equal to:
(-1)^S x 2^-1022 x (0.fraction)
The minimum positive denormalized number is 2^-1074, or approximately 4.941 x 10^-324.
Optionally, denormalized numbers are flushed to zero in floating-point calculations. For details see
Flush-to-zero on page A1-49.


exponent == 0x7FF 


The value is either an infinity or a NaN, depending on the fraction bits: 


fraction == 0
The value is an infinity. As for single-precision, there are two infinities:
+infinity When S==0.
-infinity When S==1.
fraction != 0
The value is a NaN, and is either a quiet NaN or a signaling NaN.
The two types of NaN are distinguished by their most significant fraction bit, bit[51] of
the doubleword:
bit[51] == 0
The NaN is a signaling NaN. The sign bit can take any value, and the
remaining fraction bits can take any value except all zeros.
bit[51] == 1
The NaN is a quiet NaN. The sign bit and the remaining fraction bits can
take any value.


For details of the default NaN see NaN handling and the Default NaN on page A1-50. 


Note 


NaNs with different sign or fraction bits are distinct NaNs, but this does not mean software can use floating-point 
comparison instructions to distinguish them. This is because the IEEE 754 standard specifies that a NaN compares 
as unordered with everything, including itself. 
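
The half-precision, single-precision, and double-precision layouts described in this chapter differ only in their field widths and exponent bias, so one decoder can illustrate all of them. The following Python sketch is not the manual's pseudocode. The names FORMATS and classify are hypothetical, and exact values are returned as Fraction objects so that no rounding is introduced.

```python
from fractions import Fraction

# (exponent bits, fraction bits, exponent bias) for each format.
FORMATS = {
    "half": (5, 10, 15),
    "single": (8, 23, 127),
    "double": (11, 52, 1023),
}


def classify(bits, fmt, alternative_half=False):
    """Return (kind, sign bit, exact value or None) for an encoded value.

    alternative_half selects the alternative half-precision format, in which
    exponent == 0x1F encodes a normalized number instead of infinity or NaN.
    """
    exp_bits, frac_bits, bias = FORMATS[fmt]
    sign = (bits >> (exp_bits + frac_bits)) & 1
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    sign_factor = -1 if sign else 1
    all_ones = (1 << exp_bits) - 1

    if exponent == 0:
        if fraction == 0:
            return ("zero", sign, Fraction(0))
        # Denormalized: (-1)^S x 2^(1 - bias) x (0.fraction)
        magnitude = Fraction(fraction, 1 << frac_bits) * Fraction(2) ** (1 - bias)
        return ("denormal", sign, sign_factor * magnitude)

    if exponent == all_ones and not (fmt == "half" and alternative_half):
        if fraction == 0:
            return ("infinity", sign, None)
        # The most significant fraction bit separates quiet from signaling.
        quiet = (fraction >> (frac_bits - 1)) & 1
        return ("quiet NaN" if quiet else "signaling NaN", sign, None)

    # Normalized: (-1)^S x 2^(exponent - bias) x (1.fraction)
    magnitude = (1 + Fraction(fraction, 1 << frac_bits)) * Fraction(2) ** (exponent - bias)
    return ("normal", sign, sign_factor * magnitude)
```

For example, classify(0x7BFF, "half") yields the largest IEEE half-precision normalized number, and classify(0x7FFF, "half", alternative_half=True) yields the largest alternative half-precision number.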








A1.4.5 Fixed-point format 


Fixed-point formats are used only for conversions between floating-point and fixed-point values. They apply to 
general-purpose registers. 


Fixed-point values can be signed or unsigned, and can be 16-bit or 32-bit. Conversion instructions take an argument 
that specifies the number of fraction bits in the fixed-point number. That is, it specifies the position of the binary 
point. 


A1.4.6 Conversion between floating-point and fixed-point values 


ARMvV8 supports the conversion of a scalar floating-point to or from a signed or unsigned fixed-point value in a 
general-purpose register. 


The instruction argument #fbits indicates that the general-purpose register holds a fixed-point number with fbits bits 
after the binary point, where fbits is in the range 1 to 64 for a 64-bit general-purpose register, or 1 to 32 for a 32-bit
general-purpose register. 


More specifically: 
° For a 64-bit register Xd:
— The integer part is Xd[63:#fbits].
— The fractional part is Xd[(#fbits-1):0].
° For a 32-bit register Wd or Rd:
— The integer part is Wd[31:#fbits] or Rd[31:#fbits].
— The fractional part is Wd[(#fbits-1):0] or Rd[(#fbits-1):0].

These instructions might generate the following exceptions:


Invalid Operation When the floating-point input is NaN or Infinity or when a numerical value cannot be
represented within the destination register. 


Inexact When the numeric result differs from the input. 


Input Denormal When flush-to-zero mode is enabled and the denormal input is replaced by a zero. 


Note 


An out of range fixed-point result is saturated to the destination size. 
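
As a rough illustration of the fixed-point interpretation and of saturation, the following Python sketch converts between a floating-point value and a fixed-point integer with fbits fraction bits. It is a simplified model under stated assumptions, not the behavior of any particular instruction: rounding is toward zero, a NaN input is reported by raising an error rather than by setting an exception flag, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def float_to_fixed(value, fbits, width=32, signed=True):
    """Convert value to a fixed-point integer with fbits fraction bits.

    Out of range results, including infinities, saturate to the limits of
    the destination width.
    """
    if not 1 <= fbits <= width:
        raise ValueError("fbits must be in the range 1 to the register width")
    if math.isnan(value):
        # Simplification: a real conversion signals Invalid Operation instead.
        raise ValueError("NaN input")
    low = -(1 << (width - 1)) if signed else 0
    high = (1 << (width - 1)) - 1 if signed else (1 << width) - 1
    if math.isinf(value):
        return high if value > 0 else low
    # Exact scaling by 2^fbits, then truncation toward zero (an assumption).
    scaled = int(Fraction(value) * (1 << fbits))
    return max(low, min(high, scaled))


def fixed_to_float(raw, fbits):
    """Interpret the integer raw as a fixed-point number with fbits fraction bits."""
    return raw / (1 << fbits)


def split_fixed(raw, fbits, width=32):
    """Return (integer part, fractional part) as bits[width-1:fbits] and bits[fbits-1:0]."""
    raw &= (1 << width) - 1
    return raw >> fbits, raw & ((1 << fbits) - 1)
```

With fbits set to 8, the value 1.5 becomes 0x180, whose integer part is 1 and whose fractional part is 0x80.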








Polynomial arithmetic over {0, 1} 


Some SIMD instructions that operate on SIMD&FP registers can operate on polynomials over {0, 1}, see Supported
data types on page A1-36. The polynomial data type represents a polynomial in x of the form b_(n-1) x^(n-1) + ... + b_1 x
+ b_0, where b_k is bit[k] of the value.


The coefficients 0 and 1 are manipulated using the rules of Boolean arithmetic:
° 0 + 0 = 1 + 1 = 0
° 0 + 1 = 1 + 0 = 1
° 0 x 0 = 0 x 1 = 1 x 0 = 0
° 1 x 1 = 1.


That is: 
° Adding two polynomials over {0, 1} is the same as a bitwise exclusive OR.


° Multiplying two polynomials over {0, 1} is the same as integer multiplication except that partial products are 
exclusive-ORed instead of being added. 


A64, A32 and T32 provide instructions for performing polynomial multiplication of 8-bit values. 


° For AArch32, see VMUL (integer and polynomial) on page F6-3533 and VMULL (integer and polynomial) 
on page F6-3537. 


° For AArch64, see PMUL on page C7-1137 and PMULL, PMULL2 on page C7-1139. 

The Cryptographic Extension adds the ability to perform long polynomial multiplies of 64-bit values. See PMULL, 
PMULL2 on page C7-1139. 

Pseudocode description of polynomial multiplication 

In pseudocode, polynomial addition is described by the EOR operation on bitstrings. 


Polynomial multiplication is described by the PolynomialMult() function defined in Chapter J1 ARMv8 Pseudocode.
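
The rules for polynomial arithmetic over {0, 1} can also be illustrated outside the architecture pseudocode. The following Python sketch is not the pseudocode function referred to above. The names poly_add and poly_mul are hypothetical, and operands are assumed to be non-negative integers whose bit[k] is the coefficient of x^k.

```python
def poly_add(a, b):
    """Add two polynomials over {0, 1}: a bitwise exclusive OR."""
    return a ^ b


def poly_mul(a, b):
    """Multiply two polynomials over {0, 1}.

    This follows integer long multiplication, except that the partial
    products are exclusive-ORed together instead of being added.
    """
    result = 0
    shift = 0
    while b >> shift:
        if (b >> shift) & 1:
            result ^= a << shift
        shift += 1
    return result
```

For example, poly_mul(0b11, 0b11) returns 0b101, because (x + 1) squared is x^2 + 1 over {0, 1}, whereas the integer product of 3 and 3 is 0b1001.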
A1.5 Floating-point and Advanced SIMD support
Note 
In AArch32 state, the SIMD instructions that operate on SIMD&FP registers are always described as the Advanced 
SIMD instructions, to distinguish them from the SIMD instructions in the base instruction sets, that operate on the 
32-bit general-purpose registers. The A64 instruction set does not provide any SIMD instructions that operate on 
the general-purpose registers, and therefore some AArch64 state descriptions use SIMD as a synonym for Advanced 
SIMD. Unless the context clearly indicates otherwise, this section describes the support for SIMD instructions that 
operate on SIMD&FP registers. 
ARMv8 permits the following levels of support for floating-point and Advanced SIMD instructions:
° Full floating-point and SIMD support without exception trapping. 
° Full floating-point and SIMD support with exception trapping. 
° No floating-point or SIMD support. This option is licensed only for implementations targeting specialised 
markets. 
Note 
All systems that support standard operating systems with rich application environments provide hardware 
support for floating-point and Advanced SIMD. It is a requirement of the ARM Procedure Call Standard for 
AArch64, see Procedure Call Standard for the ARM 64-bit Architecture. 
ARMv8 supports single-precision (32-bit) and double-precision (64-bit) floating-point data types and arithmetic as
defined by the IEEE 754 floating-point standard. It also supports the half-precision (16-bit) floating-point data type 
for data storage only, by supporting conversions between single-precision and half-precision data types and 
double-precision and half-precision data types. 
The SIMD instructions provide packed Single Instruction Multiple Data (SIMD) and single-element scalar 
operations, and support: 
° Single-precision and double-precision arithmetic in AArch64 state. 
° Single-precision arithmetic only in AArch32 state. 
Floating-point support in AArch64 state SIMD is IEEE 754-2008 compliant with: 
° Configurable rounding modes. 
° Configurable Default NaN behavior. 
° Configurable Flush-to-zero behavior. 
Floating-point computation using AArch32 Advanced SIMD instructions remains unchanged from ARMv7. A32 
and T32 Advanced SIMD floating-point always uses ARM standard floating-point arithmetic and performs 
IEEE 754 floating-point arithmetic with the following restrictions: 
° Denormalized numbers are flushed to zero, see Flush-to-zero on page A1-49. 
° Only default NaNs are supported, see NaN handling and the Default NaN on page A1-50. 
° The Round to Nearest rounding mode is used. 
° Untrapped floating-point exception handling is used for all floating-point exceptions. 
ARMv8 introduces new instructions for AArch32 state:
° Floating-point selection, see VSELEQ, VSELGE, VSELGT, VSELVS on page F6-3690.
° Floating-point maximum and minimum numbers, see VMAXNM on page F6-3471 and VMINNM on
page F6-3478.
° Floating-point integer conversions with directed rounding modes, see:
— VCVTA (Advanced SIMD) on page F6-3367 and VCVTA (floating-point) on page F6-3369.
— VCVTM (Advanced SIMD) on page F6-3374 and VCVTM (floating-point) on page F6-3376.
— VCVTN (Advanced SIMD) on page F6-3378 and VCVTN (floating-point) on page F6-3380.
— VCVTP (Advanced SIMD) on page F6-3382 and VCVTP (floating-point) on page F6-3384.
° Floating-point round to integral floating-point, see:
— VRINTA (Advanced SIMD) on page F6-3646 and VRINTA (floating-point) on page F6-3648.
— VRINTM (Advanced SIMD) on page F6-3650 and VRINTM (floating-point) on page F6-3652.
— VRINTN (Advanced SIMD) on page F6-3654 and VRINTN (floating-point) on page F6-3656.
— VRINTP (Advanced SIMD) on page F6-3658 and VRINTP (floating-point) on page F6-3660.
— VRINTR on page F6-3662.
— VRINTX (Advanced SIMD) on page F6-3664 and VRINTX (floating-point) on page F6-3666.
— VRINTZ (Advanced SIMD) on page F6-3668 and VRINTZ (floating-point) on page F6-3670.
° Floating-point conversions between half-precision and double-precision, see VCVTB on page F6-3371 and
VCVTT on page F6-3389.


If floating-point exception trapping is supported, floating-point exceptions, such as overflow or division by zero, 
can be handled without trapping. This applies to both floating-point and SIMD operations. When handled in this 
way, a floating-point exception causes a cumulative status register bit to be set to 1 and a default result to be 
produced by the operation. For more information about floating-point exceptions, see Floating-point exception 
traps on page D1-1552. 


In AArch64 state, the following registers control floating-point operation and return floating-point status 
information: 
° The Floating-Point Control Register, FPCR, controls: 

— The half-precision format where applicable, FPCR.AHP bit. 

— Default NaN behavior, FPCR.DN bit.

— Flush to zero behavior, FPCR.FZ bit. 

— Rounding mode support, FPCR.Rmode field. 


— Len and Stride fields associated with execution in AArch32 state, and only supported for a context 
save and restore from AArch64 state. These fields are obsolete in ARMv8 and can be implemented as 
RAZ/WI. If they are implemented as RW and are programmed to a nonzero value, they make some 
AArch32 floating-point instructions UNDEFINED. 


— Floating-point exception trap controls, the FPCR.{IDE, IXE, UFE, OFE, DZE, IOE} bits, see 
Floating-point exception traps on page D1-1552. In an implementation that does not support trapping 
of floating-point exceptions these bits are RESO. 


° The Floating-Point Status Register, FPSR, provides: 


— Cumulative floating-point exceptions flags, FPSR.{IDC, IXC, UFC, OFC, DZC, IOC and QC}. 
— The AArch32 floating-point comparison flags {N,Z,C,V}. These bits are RESO if AArch32 
floating-point is not implemented. 
Note 


In AArch64 state, the process state flags, PSTATE.{N,Z,C,V} are used for all data-processing 
compares and any associated conditional execution. 








AArch32 state provides a single Floating-Point Status and Control Register, FPSCR, combining the FPCR and 
FPSR fields. 
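
As an illustration of the control and status fields listed above, the following Python sketch decodes FPCR and FPSR values into named fields. The bit positions (for example FPCR.DN at bit 25 and FPSR.IOC at bit 0) and the RMode encodings are assumptions taken from the register descriptions, not from this section, and the names `decode`, `FPCR_FIELDS`, and `FPSR_FIELDS` are hypothetical.

```python
# Hypothetical helpers. Field positions are assumptions based on the
# FPCR and FPSR register descriptions: name -> (least significant bit, width).
FPCR_FIELDS = {
    "AHP": (26, 1), "DN": (25, 1), "FZ": (24, 1), "RMode": (22, 2),
    "IDE": (15, 1), "IXE": (12, 1), "UFE": (11, 1),
    "OFE": (10, 1), "DZE": (9, 1), "IOE": (8, 1),
}
FPSR_FIELDS = {
    "QC": (27, 1), "IDC": (7, 1), "IXC": (4, 1), "UFC": (3, 1),
    "OFC": (2, 1), "DZC": (1, 1), "IOC": (0, 1),
}
# Assumed RMode encodings, using the rounding mode abbreviations of this manual.
RMODE_NAMES = {0b00: "RN", 0b01: "RP", 0b10: "RM", 0b11: "RZ"}


def decode(value, fields):
    """Extract each named field from a 32-bit register value."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in fields.items()
    }


# Example: Default NaN and Flush-to-zero enabled, Round towards Zero selected.
fpcr = decode((1 << 25) | (1 << 24) | (0b11 << 22), FPCR_FIELDS)
# Example: Invalid Operation and Divide by Zero cumulative flags set.
fpsr = decode(0b11, FPSR_FIELDS)
```

Keeping the field layout in a table, rather than in scattered shifts and masks, makes it straightforward to check against the register descriptions.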


For system level information about the SIMD and floating-point support, see Advanced SIMD and floating-point 
support on page G1-3880. 
A1.5.1 Instruction support
The floating-point and SIMD support includes the following types of instructions:
• Load and store for single elements and vectors of multiple elements.
Note
Single elements are also referred to as scalar elements.
• Data processing on single and multiple elements for both integer and floating-point data types.
• Floating-point conversion:
—  Half-precision, single-precision, and double-precision conversions.
—  Single-precision, double-precision, and fixed point integer conversions.
—  Single-precision, double-precision, and integer conversions.
• Floating-point rounding.
For more information on the floating-point and SIMD instructions in AArch64 state, see Chapter C3 A64
Instruction Set Overview.
For more information on the floating-point and Advanced SIMD instructions in AArch32 state, see Chapter F1 The
AArch32 Instruction Sets Overview.
A1.5.2 Floating-point standards, and terminology 
The ARM architecture includes support for all the required features of ANSI/IEEE Std 754-2008, IEEE Standard for Binary
Floating-Point Arithmetic, referred to as IEEE 754-2008. However, some terms in this manual are based on the
1985 version of this standard, referred to as IEEE 754-1985:
• ARM floating-point terminology generally uses the IEEE 754-1985 terms. This section summarizes how
IEEE 754-2008 changes these terms.
• References to IEEE 754 that do not include the issue year apply to either issue of the standard.
Table A1-3 shows how the terminology in this manual differs from that used in IEEE 754-2008.

Table A1-3 Floating-point terminology

This manual                            IEEE 754-2008
Normalized (a)                         Normal
Denormal, or denormalized              Subnormal
Round towards Minus Infinity (RM)      roundTowardsNegative
Round towards Plus Infinity (RP)       roundTowardsPositive
Round towards Zero (RZ)                roundTowardZero
Round to Nearest (RN)                  roundTiesToEven
Round to Nearest with Ties to Away     roundTiesToAway
Rounding mode                          Rounding-direction attribute

a. Normalized number is used in preference to normal number, because of the other
specific uses of normal in this manual.
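
The difference between the rounding modes named in Table A1-3 is easiest to see on a tie case. The following Python sketch rounds a decimal value to an integer under each mode, using the standard `decimal` module as a stand-in. It is an illustration of the rounding directions only, not a model of binary floating-point rounding in the PE, and the mapping in `MODES` is an assumption made for this example.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)

# Assumed mapping from the manual's rounding mode names to decimal modes.
MODES = {
    "RM": ROUND_FLOOR,             # Round towards Minus Infinity
    "RP": ROUND_CEILING,           # Round towards Plus Infinity
    "RZ": ROUND_DOWN,              # Round towards Zero
    "RN": ROUND_HALF_EVEN,         # Round to Nearest, ties to even
    "TiesToAway": ROUND_HALF_UP,   # Round to Nearest with Ties to Away
}


def round_to_integral(text, mode):
    """Round the decimal value in text to an integer using the named mode."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))


# -2.5 lies exactly between -3 and -2, so every mode is distinguishable.
results = {mode: round_to_integral("-2.5", mode) for mode in MODES}
```

For the input -2.5, RM and Ties to Away select -3, while RP, RZ, and RN select -2.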
A1.5.3 ARM standard floating-point input and output values

ARMv8 provides full IEEE 754 floating-point arithmetic support. In AArch32 state, floating-point operations
performed using Advanced SIMD instructions are limited to ARM standard floating-point operation, regardless of
the selected rounding mode in the FPSCR. Unlike AArch32, AArch64 SIMD floating-point arithmetic is performed
using the rounding mode selected by the FPCR.

ARM standard floating-point arithmetic supports the following input formats defined by the IEEE 754
floating-point standard:

• Zeros.
• Normalized numbers.
• Denormalized numbers are flushed to 0 before floating-point operations, see Flush-to-zero.
• NaNs.
• Infinities.

ARM standard floating-point arithmetic supports the Round to Nearest (roundTiesToEven) rounding mode defined
by the IEEE 754 standard.

ARM standard floating-point arithmetic supports the following output result formats defined by the IEEE 754
standard:

• Zeros.
• Normalized numbers.
• Results that are less than the minimum normalized number are flushed to zero, see Flush-to-zero.
• NaNs produced in floating-point operations are always the default NaN, see NaN handling and the Default
NaN on page A1-50.
• Infinities.

A1.5.4 Flush-to-zero 

The performance of floating-point processing can be reduced when doing calculations involving denormalized
numbers and Underflow exceptions. In many algorithms, this performance can be recovered, without significantly
affecting the accuracy of the final result, by replacing the denormalized operands and intermediate results with
zeros. To permit this optimization, ARM floating-point implementations have a processing mode called
Flush-to-zero mode. AArch32 Advanced SIMD floating-point instructions always use Flush-to-zero mode.

Behavior in Flush-to-zero mode differs from standard IEEE 754 arithmetic in the following ways: 

• All inputs to floating-point operations that are double-precision denormalized numbers or single-precision
denormalized numbers are treated as though they were zero. This causes an Input Denormal exception, but
does not cause an Inexact exception. The Input Denormal exception occurs only in Flush-to-zero mode.

In AArch32 state the FPSCR contains a cumulative exception bit FPSCR.IDC and optional trap enable bit
FPSCR.IDE corresponding to the Input Denormal exception.

In AArch64 state the FPSR contains a cumulative exception bit FPSR.IDC and optional trap enable bit
FPCR.IDE corresponding to the Input Denormal exception.

The occurrence of all exceptions except Input Denormal is determined using the input values after
flush-to-zero processing has occurred.

• The result of a floating-point operation is flushed to zero if the result of the operation before rounding
satisfies the condition:

0 < Abs(result) < MinNorm, where:
—  MinNorm is 2^-126 for single-precision.
—  MinNorm is 2^-1022 for double-precision.

This causes the FPSR.UFC bit to be set to 1, and prevents any Inexact exception from occurring for the
operation.

Underflow exceptions occur only when a result is flushed to zero.

In all implementations Underflow exceptions that occur in Flush-to-zero mode are always treated as
untrapped, even when the Underflow trap enable bit, FPCR.UFE, is set to 1.

• An Inexact exception does not occur if the result is flushed to zero, even though the final result of zero is not
equivalent to the value that would be produced if the operation were performed with unbounded precision
and exponent range.


When an input or a result is flushed to zero the value of the sign bit of the zero is preserved. That is, the sign bit of 
the zero matches the sign bit of the input or result that is being flushed to zero. 


Flush-to-zero mode has no effect on half-precision numbers that are inputs to floating-point operations, or results 
from floating-point operations. 
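
The flushing rule described in this section can be sketched in a few lines of Python. This is a minimal model under stated assumptions: values are held as Python floats, the single-precision threshold of 2^-126 is used, and the hypothetical function `flush_to_zero` returns a flag that the caller would interpret as FPSR.IDC for an input or FPSR.UFC for a result before rounding. It does not model trapping or the Inexact exception.

```python
import math

# Smallest normalized single-precision magnitude, as given in this section.
MIN_NORM_SINGLE = 2.0 ** -126


def flush_to_zero(value, min_norm=MIN_NORM_SINGLE):
    """Return (value, flushed).

    A nonzero value smaller in magnitude than min_norm is replaced by a
    zero that keeps the sign of the original value.
    """
    if 0.0 < abs(value) < min_norm:
        return math.copysign(0.0, value), True
    return value, False


# A negative denormal input becomes -0.0 and would set the IDC flag.
flushed_input, idc = flush_to_zero(-(2.0 ** -130))

# A tiny unrounded result becomes +0.0 and would set the UFC flag.
flushed_result, ufc = flush_to_zero((2.0 ** -100) * (2.0 ** -100))
```

Zeros, normalized numbers, infinities, and NaNs pass through unchanged, because none of them satisfies the condition 0 < Abs(value) < MinNorm.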


Note
Flush-to-zero mode is incompatible with the IEEE 754 standard, and must not be used when IEEE 754 compatibility
is a requirement. Flush-to-zero mode must be used with care. Although it can improve performance on some
algorithms, there are significant limitations on its use. These are application dependent:

• On many algorithms, it has no noticeable effect, because the algorithm does not normally use denormalized
numbers.

• On other algorithms, it can cause exceptions to occur or seriously reduce the accuracy of the results of the
algorithm.














A1.5.5 NaN handling and the Default NaN 

The IEEE 754 standard specifies that: 

• An operation that produces an Invalid Operation floating-point exception generates a quiet NaN as its result
if that exception is untrapped.

• An operation involving a quiet NaN operand, but not a signaling NaN operand, returns an input NaN as its
result.

The floating-point processing behavior when Default NaN mode is disabled adheres to this, with the following
additions:

• If an untrapped Invalid Operation floating-point exception is produced, the quiet NaN result is derived from:
— The first signaling NaN operand, if the exception was produced because at least one of the operands
is a signaling NaN.
— Otherwise, the default NaN.

• If an untrapped Invalid Operation floating-point exception is not produced, but at least one of the operands
is a quiet NaN, the result is derived from the first quiet NaN operand.

Depending on the operation, the exact value of a derived quiet NaN result may differ in both sign and number of
fraction bits from its source. For a quiet NaN result derived from a signaling NaN operand, the most-significant
fraction bit is set to 1.

Note

• In these descriptions, first operand relates to the left-to-right ordering of the arguments to the pseudocode
function that describes the operation.

• The IEEE 754 standard specifies that the sign bit of a NaN has no significance.
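
The operand selection rules above, for the case where Default NaN mode is disabled, can be sketched as follows. This is a simplified model under stated assumptions: operands are raw single-precision bit patterns, only two operands are considered, and the derived NaN keeps the sign and fraction of its source, which the text notes is not guaranteed for every operation. The function names are hypothetical.

```python
# Single-precision layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
EXPONENT_MASK = 0x7F800000
FRACTION_MASK = 0x007FFFFF
QUIET_BIT = 0x00400000  # most-significant fraction bit


def is_nan(bits):
    """True if the exponent is all ones and the fraction is nonzero."""
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & FRACTION_MASK) != 0


def is_signaling(bits):
    """True for a NaN whose most-significant fraction bit is 0."""
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def nan_result(op1, op2):
    """Return (result_bits, invalid_operation) for NaN operands.

    Returns (None, False) when neither operand is a NaN, because the
    result then depends on the operation itself.
    """
    for op in (op1, op2):
        if is_signaling(op):
            # Derived from the first signaling NaN, with the quiet bit set.
            return op | QUIET_BIT, True
    for op in (op1, op2):
        if is_nan(op):
            # No signaling NaN: derived from the first quiet NaN operand.
            return op, False
    return None, False


SNAN = 0x7F800001
QNAN = 0x7FC00123
ONE = 0x3F800000
```

A signaling NaN takes priority over a quiet NaN even when it appears later in the operand order, because the Invalid Operation exception it produces determines which rule applies.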
The floating-point and SIMD processing behavior when Default NaN mode is enabled is that the Default NaN is
the result of all floating-point operations that either:
• Generate untrapped Invalid Operation floating-point exceptions.
• Have one or more quiet NaN inputs, but no signaling NaN inputs.

Table A1-4 shows the format of the default NaN for ARM floating-point operations.

Default NaN mode is selected for the floating-point processing by setting the FPCR.DN bit to 1.

Other aspects of the functionality of the Invalid Operation exception are not affected by Default NaN mode. These
are that:

• If untrapped, it causes the FPSR.IOC bit to be set to 1.

• If trapped, it causes a user trap handler to be invoked.

Table A1-4 Default NaN encoding

            Half-precision, IEEE Format      Single-precision                 Double-precision
Sign bit    0                                0                                0
Exponent    0x1F                             0xFF                             0x7FF
Fraction    bit[9] == 1, bits[8:0] == 0      bit[22] == 1, bits[21:0] == 0    bit[51] == 1, bits[50:0] == 0
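
The encodings in Table A1-4 follow a single pattern: a sign bit of 0, an exponent of all ones, and a fraction whose most-significant bit is 1, with the remaining fraction bits taken to be 0. The following Python sketch builds the three bit patterns from the field widths and checks them with the standard `struct` module. The function name `default_nan_bits` is hypothetical.

```python
import math
import struct


def default_nan_bits(exponent_width, fraction_width):
    """Build the default NaN bit pattern for the given field widths."""
    sign = 0
    exponent = (1 << exponent_width) - 1   # all ones
    fraction = 1 << (fraction_width - 1)   # only the top fraction bit set
    return (
        (sign << (exponent_width + fraction_width))
        | (exponent << fraction_width)
        | fraction
    )


HALF = default_nan_bits(5, 10)
SINGLE = default_nan_bits(8, 23)
DOUBLE = default_nan_bits(11, 52)

# Reinterpret the bit patterns as floating-point values.
half_value = struct.unpack("<e", struct.pack("<H", HALF))[0]
single_value = struct.unpack("<f", struct.pack("<I", SINGLE))[0]
double_value = struct.unpack("<d", struct.pack("<Q", DOUBLE))[0]
```

The resulting patterns are 0x7E00, 0x7FC00000, and 0x7FF8000000000000 for half-precision, single-precision, and double-precision respectively.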
A1.6 Cryptographic Extension


The presence of this Extension in an implementation is subject to export license controls. The Cryptographic 
Extension is an extension of the SIMD support and operates on the vector register file. It provides instructions for 
the acceleration of encryption and decryption to support the following: 


• AES
• SHA1
• SHA2-256


The Cryptographic Extension also provides multiply instructions that operate on long polynomials, see PMULL, 
PMULL2 on page C7-1139. 
A1.7 The ARM memory model

The ARM memory model supports:

• Generating an exception on an unaligned memory access.
• Restricting access by applications to specified areas of memory.
• Translating virtual addresses provided by executing instructions into physical addresses.
• Altering the interpretation of multi-byte data between big-endian and little-endian.
• Controlling the order of accesses to memory.
• Controlling caches and address translation structures.
• Synchronizing access to shared memory by multiple PEs.

Virtual address (VA) support depends on the Execution state, as follows: 

AArch64 state
Supports 64-bit virtual addressing, with the Translation Control Register determining the supported
VA range. Execution at EL1 and EL0 supports two independent VA ranges, each with its own
translation controls.

AArch32 state
Supports 32-bit virtual addressing, with the Translation Control Register determining the supported
VA range. For execution at EL1 and EL0, system software can split the VA range into two
subranges, each with its own translation controls.

The supported physical address space is IMPLEMENTATION DEFINED, and can be discovered by system software.

Regardless of the Execution state, the Virtual Memory System Architecture (VMSA) can translate VAs to blocks or
pages of memory anywhere within the supported physical address space.

For more information, see: 

For execution in AArch64 state
• Chapter B2 The AArch64 Application Level Memory Model.
• Chapter D3 The AArch64 System Level Memory Model.
• Chapter D4 The AArch64 Virtual Memory System Architecture.

For execution in AArch32 state
• Chapter E2 The AArch32 Application Level Memory Model.
• Chapter G3 The AArch32 System Level Memory Model.
• Chapter G4 The AArch32 Virtual Memory System Architecture.
Chapter B1
The AArch64 Application Level Programmers’ Model


This chapter gives an application level view of the ARM programmers’ model. It contains the following sections:
• About the Application level programmers’ model on page B1-58.
• Registers in AArch64 Execution state on page B1-59.
• Software control features and EL0 on page B1-64.
B1.1 About the Application level programmers’ model


This chapter contains the programmers’ model information required for application development. 


The information in this chapter is distinct from the system information required to service and support application 
execution under an operating system, or higher level of system software. However, some knowledge of the system 
information is needed to put the Application level programmers' model into context. 


Depending on the implementation choices, the architecture supports multiple levels of execution privilege,
indicated by different Exception levels that number upwards from EL0 to EL3. EL0 corresponds to the lowest
privilege level and is often described as unprivileged. The Application level programmers’ model is the
programmers’ model for software executing at EL0. For more information see Exception levels on page D1-1498.


System software determines the Exception level, and therefore the level of privilege, at which software runs. When
an operating system supports execution at both EL1 and EL0, an application usually runs unprivileged at EL0. This:


• Permits the operating system to allocate system resources to an application in a unique or shared manner.


• Provides a degree of protection from other processes, and so helps protect the operating system from
malfunctioning software.


This chapter indicates where some system level understanding is necessary, and where relevant it gives a reference 
to the system level description. 


Execution at any Exception level above EL0 is often referred to as privileged execution.


For more information on the system level view of the architecture refer to Chapter D1 The AArch64 System Level 
Programmers’ Model. 
B1.2 Registers in AArch64 Execution state


This section describes the registers and process state visible at EL0 when executing in the AArch64 state. It includes
the following:

• Registers in AArch64 state
• Process state, PSTATE on page B1-61
• System registers on page B1-62


B1.2.1 Registers in AArch64 state
In the AArch64 application level view, an ARM processing element has:


R0-R30 31 general-purpose registers, R0 to R30. Each register can be accessed as:
• A 64-bit general-purpose register named X0 to X30.
• A 32-bit general-purpose register named W0 to W30.


See the register name mapping in Figure B1-1.


[Figure: Xn occupies bits [63:0] of the register; Wn occupies bits [31:0].]

Figure B1-1 General-purpose register naming
The X30 general-purpose register is used as the procedure call link register. 


Note

In instruction encodings, the value 0b11111 (31) is used to indicate the ZR (zero register). This
indicates that the argument takes the value zero, but does not indicate that the ZR is implemented
as a physical register.





SP A 64-bit dedicated Stack Pointer register. The least significant 32 bits of the stack-pointer can be 
accessed via the register name WSP. 


The use of SP as an operand in an instruction indicates the use of the current stack pointer.


Note


Stack pointer alignment to a 16-byte boundary is configurable at EL1. For more information see the 
Procedure Call Standard for the ARM 64-bit Architecture. 





PC A 64-bit Program Counter holding the address of the current instruction. 
Software cannot write directly to the PC. It can only be updated on a branch, exception entry or 
exception return. 
Note


Attempting to execute an A64 instruction that is not word-aligned generates an Alignment fault, see 
PC alignment checking on page D1-1515. 





V0-V31 32 SIMD&FP registers, V0 to V31. Each register can be accessed as:
• A 128-bit register named Q0 to Q31.
• A 64-bit register named D0 to D31.
• A 32-bit register named S0 to S31.
• A 16-bit register named H0 to H31.
• An 8-bit register named B0 to B31.
• A 128-bit vector of elements.
• A 64-bit vector of elements.


Where the number of bits described by a register name does not occupy an entire SIMD&FP
register, it refers to the least significant bits. See Figure B1-2.








[Figure: within each 128-bit SIMD&FP register, Bn is bits [7:0], Hn is bits [15:0], Sn is bits [31:0], Dn is bits [63:0], and Qn is bits [127:0].]

Figure B1-2 SIMD and floating-point register naming


For more information about data types and vector formats, see Supported data types on page A1-36. 
FPCR, FPSR Two SIMD and floating-point control and status registers, FPCR and FPSR. 


See Registers for instruction processing and exception handling on page D1-1507 for more information on the 
registers. 
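
The rule that a narrower SIMD&FP register name refers to the least significant bits of the register can be sketched as a simple mask. This is a minimal illustration: a V register is modeled as a 128-bit Python integer, and the names `VIEW_WIDTHS` and `read_view` are hypothetical.

```python
# Width in bits of each SIMD&FP register name, as listed above.
VIEW_WIDTHS = {"B": 8, "H": 16, "S": 32, "D": 64, "Q": 128}


def read_view(v_value, view):
    """Return the least significant bits of a 128-bit register value."""
    return v_value & ((1 << VIEW_WIDTHS[view]) - 1)


# Example 128-bit register contents.
v0 = 0x0123456789ABCDEF_FEDCBA9876543210

b0 = read_view(v0, "B")
h0 = read_view(v0, "H")
s0 = read_view(v0, "S")
d0 = read_view(v0, "D")
q0 = read_view(v0, "Q")
```

Each wider view contains the narrower ones, so B0 is the low byte of H0, which is in turn the low halfword of S0, and so on up to Q0.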


Pseudocode description of registers in AArch64 state 


In the pseudocode functions that access registers:
• The assignment form is used for register writes.
• The non-assignment form is used for register reads.


The uses of the X[] function are:
• Reading or writing X0-X30, using n to index the required register.
• Reading the zero register ZR, accessed as X[31].


Note 


The pseudocode use of X[31] to represent the zero register does not indicate that hardware must implement this 
register. 








The AArch64 SP[] function is used to read or write the current SP.
The AArch64 PC[] function is used to read the PC.


The AArch64 V[] function is used to read or write the Advanced SIMD and floating-point registers V0-V31, using
a parameter n to index the required register.


The AArch64 Vpart[] function is used to read or write a part of one of V0-V31, using a parameter n to index the
required register, and a parameter part to indicate the required part of the register, see the function description for
more information.


The SP[], PC[], V[], and Vpart[] functions are defined in Chapter J1 ARMv8 Pseudocode.
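
The behavior of the X[] accessor can be approximated with a small Python class. This is a toy model, not the architectural pseudocode: the class and method names are hypothetical, and two behaviors are assumptions that this section does not state, namely that a write through index 31 is discarded and that a 32-bit write clears the upper 32 bits of the register.

```python
class GeneralRegisters:
    """Toy model of X[] access: indices 0-30 are registers, 31 reads as zero."""

    def __init__(self):
        self._regs = [0] * 31

    def read(self, n, width=64):
        """Read register n as a width-bit value (64 for Xn, 32 for Wn)."""
        value = 0 if n == 31 else self._regs[n]
        return value & ((1 << width) - 1)

    def write(self, n, value, width=64):
        """Write a width-bit value to register n."""
        if n == 31:
            return  # assumption: a write to the zero register is discarded
        # Assumption: the value is zero-extended to the full 64 bits.
        self._regs[n] = value & ((1 << width) - 1)


regs = GeneralRegisters()
regs.write(0, 0x1122334455667788)
low_word = regs.read(0, width=32)      # W0 view of X0

regs.write(0, 0xDEADBEEF, width=32)    # write through the W0 name
after_w_write = regs.read(0)

regs.write(31, 5)
zero = regs.read(31)
```

No storage is allocated for index 31, which mirrors the point that the zero register need not be implemented as a physical register.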
B1.2.2 Process state, PSTATE


Process state or PSTATE is an abstraction of process state information. All of the instruction sets provide
instructions that operate on elements of PSTATE.


The following PSTATE information is accessible at EL0:


The condition flags

Flag-setting instructions set these. They are:

N Negative condition flag. If the result of the instruction is regarded as a two's
complement signed integer, the PE sets this to:
• 1 if the result is negative.
• 0 if the result is positive or zero.

Z Zero condition flag. Set to:
• 1 if the result of the instruction is zero.
• 0 otherwise.

A result of zero often indicates an equal result from a comparison.

C Carry condition flag. Set to:
• 1 if the instruction results in a carry condition, for example an unsigned overflow
that is the result of an addition.
• 0 otherwise.

V Overflow condition flag. Set to:
• 1 if the instruction results in an overflow condition, for example a signed
overflow that is the result of an addition.
• 0 otherwise.

Conditional instructions test the N, Z, C and V condition flags, combining them with the condition
code for the instruction to determine whether the instruction must be executed. In this way,
execution of the instruction is conditional on the result of a previous operation. For more
information about conditional execution, see Condition flags and related instructions on
page C6-433.
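
The flag definitions above can be checked with a short Python sketch of a 64-bit flag-setting addition. This is an illustrative model only: the function name `add_with_flags` is hypothetical, and operands are treated as 64-bit unsigned bit patterns.

```python
MASK64 = (1 << 64) - 1


def add_with_flags(x, y):
    """Return (result, n, z, c, v) for a 64-bit flag-setting addition."""
    x &= MASK64
    y &= MASK64
    unsigned_sum = x + y
    result = unsigned_sum & MASK64
    n = result >> 63                     # sign bit of the result
    z = int(result == 0)                 # result is zero
    c = int(unsigned_sum > MASK64)       # unsigned overflow (carry out)
    # Signed overflow: operands share a sign that the result does not have.
    v = int((x >> 63) == (y >> 63) and (x >> 63) != n)
    return result, n, z, c, v


# Unsigned overflow: all ones plus one wraps to zero and sets Z and C.
wrap = add_with_flags(MASK64, 1)

# Signed overflow: the largest positive value plus one sets N and V.
overflow = add_with_flags(0x7FFFFFFFFFFFFFFF, 1)
```

The two examples show that C and V are independent: the first addition carries without signed overflow, and the second overflows as a signed operation without a carry.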


The exception masking bits 


D Debug exception mask bit. When EL0 is enabled to modify the mask bits, this bit is
visible and can be modified. However, this bit is architecturally ignored at EL0.

A SError interrupt mask bit. 

I IRQ interrupt mask bit. 

F FIQ interrupt mask bit. 

For each bit, the values are: 

0 Exception not masked. 

1 Exception masked. 


Access at EL0 using AArch64 state depends on SCTLR_EL1.UMA. See Traps to EL1 of EL0
accesses to the PSTATE.{D, A, I, F} interrupt masks on page D1-1566.


See Process state, PSTATE on page D1-1513 for the system level view of PSTATE. 
Accessing PSTATE fields at EL0


At EL0 using AArch64 state, PSTATE fields can be accessed using Special-purpose registers that can be directly
read using the MRS instruction and directly written using the MSR (register) instructions. Table B1-1 shows the
Special-purpose registers that access the PSTATE fields that hold AArch64 state when the PE is at EL0 using
AArch64. All other PSTATE fields do not have direct read and write access at EL0.


Table B1-1 Accessing PSTATE fields at EL0 using MRS and MSR (register)

Special-purpose register    PSTATE fields
NZCV                        N, Z, C, V
DAIF                        D, A, I, F


Software can also use the MSR (immediate) instruction to directly write to PSTATE.{D, A, I, F}. Table B1-2 shows
the MSR (immediate) operands that can directly write to PSTATE.{D, A, I, F} when the PE is at EL0 using AArch64
state.


Table B1-2 Accessing PSTATE.{D, A, I, F} at EL0 using MSR (immediate)

Operand    PSTATE fields    Notes
DAIFSet    D, A, I, F       Directly sets any of the PSTATE.{D, A, I, F} bits to 1
DAIFClr    D, A, I, F       Directly clears any of the PSTATE.{D, A, I, F} bits to 0


However, access to the PSTATE.{D, A, I, F} fields at EL0 using AArch64 state depends on SCTLR_EL1.UMA.
See Traps to EL1 of EL0 accesses to the PSTATE.{D, A, I, F} interrupt masks on page D1-1566.
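
The effect of the NZCV and DAIF accesses can be sketched on plain integers. The bit positions used here are assumptions taken from the Special-purpose register descriptions rather than from this section: N, Z, C, V at bits [31:28] of NZCV, D, A, I, F at bits [9:6] of DAIF, and a 4-bit immediate for DAIFSet and DAIFClr in which bit 3 selects D and bit 0 selects F. The function names are hypothetical.

```python
DAIF_SHIFT = 6  # assumption: D, A, I, F occupy bits [9:6] of the DAIF value


def pack_nzcv(n, z, c, v):
    """Build an NZCV register value, assuming flags at bits [31:28]."""
    return (n << 31) | (z << 30) | (c << 29) | (v << 28)


def daif_set(daif, imm4):
    """Model DAIFSet: set the mask bits selected by the 4-bit immediate."""
    return daif | ((imm4 & 0xF) << DAIF_SHIFT)


def daif_clr(daif, imm4):
    """Model DAIFClr: clear the mask bits selected by the 4-bit immediate."""
    return daif & ~((imm4 & 0xF) << DAIF_SHIFT)


nzcv = pack_nzcv(0, 1, 1, 0)          # Z and C set
masked = daif_set(0, 0b0011)          # mask IRQ and FIQ
unmasked = daif_clr(0x3C0, 0b1000)    # clear only the D mask bit
```

DAIFSet and DAIFClr change only the selected bits, so software does not need a read-modify-write sequence to update a single mask.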


Writes to the PSTATE fields have side-effects on various aspects of the PE operation. All of these side-effects are
guaranteed:


• Not to be visible to earlier instructions in the execution stream.


• To be visible to later instructions in the execution stream.





B1.2.3 System registers
System registers provide support for execution control, status and general system configuration. The majority of the
System registers are not accessible at EL0.
However, some System registers can be configured to allow access from software executing at EL0. Any access
from EL0 to a System register with the access right disabled causes the instruction to behave as UNDEFINED. The
registers that can be accessed from EL0 are:
Cache ID registers The CTR_EL0 and DCZID_EL0 registers provide implementation parameters for EL0
cache management support.
Debug registers A debug communications channel is supported by the MDCCSR_EL0, DBGDTR_EL0,
DBGDTRRX_EL0 and DBGDTRTX_EL0 registers.
Performance Monitors registers
See Performance Monitors support on page B1-63.
Thread ID registers The TPIDR_EL0 and TPIDRRO_EL0 registers are two thread ID registers with different
access rights.
Timer registers In ARMv8 the following operations are performed:
• Read access to the system counter clock frequency using CNTFRQ_EL0.
• Physical and virtual timer count registers, CNTPCT_EL0 and CNTVCT_EL0.
• Physical up-count comparison, down-count value and timer control registers,
CNTP_CVAL_EL0, CNTP_TVAL_EL0, and CNTP_CTL_EL0.
• Virtual up-count comparison, down-count value and timer control registers,
CNTV_CVAL_EL0, CNTV_TVAL_EL0, and CNTV_CTL_EL0.
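
The relationship between the up-count comparison and down-count views of a timer can be sketched as follows. This is a simplified model and every detail in it is an assumption drawn from the Generic Timer description rather than from this section: the compare value is 64 bits, the timer value is a signed 32-bit offset from the current count, and the timer condition is met when the count reaches the compare value. The function names are hypothetical, and counter wrap-around is ignored.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def read_tval(cval, count):
    """Down-count view: the low 32 bits of (compare value - count)."""
    return (cval - count) & MASK32


def write_tval(tval, count):
    """Return the compare value programmed by writing a 32-bit timer value."""
    tval &= MASK32
    signed = tval - (1 << 32) if tval & (1 << 31) else tval
    return (count + signed) & MASK64


def condition_met(cval, count):
    """True once the count has reached the compare value (no wrap handling)."""
    return count >= cval


# Request an event 100 ticks from a current count of 900.
cval = write_tval(100, 900)
remaining = read_tval(cval, 950)
```

In this model a read of the down-count value after the compare value has passed returns a negative offset in two's complement form, which is why the value is treated as signed.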


Performance Monitors support 

The ARMv8 architecture defines optional Performance Monitors.
The basic form of the Performance Monitors is:

• A 64-bit cycle counter.
• Up to a maximum of 32 IMPLEMENTATION DEFINED event counters, where the number is identified by the
PMCR_EL0.N field.
• System register access to the cycle counter and event registers, and related controls for:
— Enabling and resetting counters.
— Flagging overflows.
— Generating interrupts on overflow.


Software can enable the cycle counter independently of the event counters.


Software executing at EL1 or a higher Exception level, for example an operating system, can enable access to the
counters from EL0. This allows an application to monitor its own performance with fine grain control without
requiring operating system support. For example, an application might implement per-function performance
monitoring.


For details on the features, configuration and control of the Performance Monitors, see Chapter D5 The
Performance Monitors Extension.


EL0 access to Performance Monitors


To allow application code to make use of the Performance Monitors, software executing at a higher Exception level
must set the following bits in the PMUSERENR_EL0 System register:


EN When set to 1, access to all Performance Monitors registers is allowed at EL0, except for writes to
PMUSERENR_EL0, and reads/writes of PMINTENSET_EL1 and PMINTENCLR_EL1.


ER When set to 1, read access to event counters is allowed at EL0. This includes read/write access to
PMSELR_EL0, so that the event counter to read through PMXEVCNTR_EL0 can be set.


CR When set to 1, read access to PMCCNTR_EL0 is allowed at EL0.


SW When set to 1, write access to PMSWINC_EL0 is allowed at EL0.


Note
Register PMUSERENR_EL0 is always read-only at EL0.
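
The way the four enable bits combine can be sketched as a set of permission checks. The bit positions (EN at bit 0, SW at bit 1, CR at bit 2, ER at bit 3) are an assumption taken from the PMUSERENR_EL0 register description, and the function names are hypothetical. The sketch relies only on the rule stated above that EN enables the accesses that the other three bits enable individually.

```python
# Assumed bit positions in PMUSERENR_EL0.
EN = 1 << 0
SW = 1 << 1
CR = 1 << 2
ER = 1 << 3


def el0_can_read_cycle_counter(pmuserenr):
    """Reads of PMCCNTR_EL0 are allowed by EN or CR."""
    return bool(pmuserenr & (EN | CR))


def el0_can_read_event_counters(pmuserenr):
    """Reads of the event counters are allowed by EN or ER."""
    return bool(pmuserenr & (EN | ER))


def el0_can_write_swinc(pmuserenr):
    """Writes to PMSWINC_EL0 are allowed by EN or SW."""
    return bool(pmuserenr & (EN | SW))
```

An operating system that wants applications to read only the cycle counter would set CR alone, leaving the event counters and software increment inaccessible at EL0.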
B1.3 Software control features and EL0


The following sections describe the EL0 view of the ARMv8 software control features:

• Exception handling
• Wait for Interrupt and Wait for Event
• The YIELD instruction
• Application level cache management on page B1-65
• Instructions relating to Debug on page B1-65

B1.3.1 Exception handling
In the ARM architecture, an exception causes a change of program flow. Execution of an exception handler starts,
at an Exception level higher than EL0, from a defined vector that relates to the exception taken.
Exceptions include:
• Interrupts.
• Memory system aborts.
• Exceptions generated by attempting to execute an instruction that is UNDEFINED.
• System calls.
• Secure monitor or Hypervisor traps.
• Debug exceptions.
Most details of exception handling are not visible to application level software, and are described in Chapter D1 The 
AArch64 System Level Programmers’ Model. 
The SVC instruction causes a Supervisor Call exception. This provides a mechanism for unprivileged software to 
make a system call to an operating system. 
The BRK instruction generates a Breakpoint Instruction exception. This provides a mechanism for debugging
software using a debugger executing on the same PE, see Breakpoint Instruction exceptions on page D2-1639.

Note 

The BRK instruction is supported only in the A64 instruction set. The equivalent instruction in the T32 and A32 
instruction sets is BKPT. 

B1.3.2 Wait for Interrupt and Wait for Event 
Issuing a WFI instruction indicates that no further execution is required until a WFI wake-up event occurs, see Wait 
For Interrupt on page D1-1602. This permits entry to a low-power state. 
Issuing a WFE instruction indicates that no further execution is required until a WFE wake-up event occurs, see Wait 
for Event mechanism and Send event on page D1-1599. This permits entry to a low-power state. 

B1.3.3 The YIELD instruction 
The YIELD instruction provides a hint that the task performed by a thread is of low importance so that it could yield, 
see YIELD on page C6-765. This mechanism can be used to improve overall performance in a Symmetric 
Multithreading (SMT) or Symmetric Multiprocessing (SMP) system. 
Examples of when the YIELD instruction might be used include a thread that is sitting in a spin-lock, or where the 
arbitration priority of the snoop bit in an SMP system is modified. The YIELD instruction permits binary 
compatibility between SMT and SMP systems. 
The YIELD instruction is a NOP (No Operation) hint instruction. 
The YIELD instruction has no effect in a single-threaded system, but developers of such systems can use the 
instruction to flag its intended use for future migration to a multiprocessor or multithreading system. Operating 
systems can use YIELD in places where a yield hint is wanted, knowing that it will be treated as a NOP if there is no 
implementation benefit. 
B1.3.4 Application level cache management


A small number of cache management instructions can be enabled at EL0 from higher levels of privilege using the
SCTLR_EL1 System register. Any access from EL0 to an operation with the access right disabled causes the
instruction to behave as UNDEFINED.


For information about the available operations, see Application level access to functionality related to caches on page B2-72.


B1.3.5 Instructions relating to Debug


Exception handling on page B1-64 refers to the BRK instruction, which generates a Breakpoint Instruction exception.
In addition, in both AArch64 state and AArch32 state, the HLT instruction causes the PE to halt execution and enter
Debug state. This provides a mechanism for debugging software using a debugger that is external to the PE, see
Chapter H1 About External Debug.


Note 


In AArch32 state, previous versions of the architecture defined the DBG instruction, that could provide a hint to the 
debug system. In ARMv8, this instruction executes as a NOP. ARM deprecates the use of the DBG instruction. 
This chapter gives an application level view of the memory model. It contains the following sections:

• Address space on page B2-68.
• Memory type overview on page B2-69.
• Caches and memory hierarchy on page B2-70.
• Alignment support on page B2-76.
• Endian support on page B2-78.
• Atomicity in the ARM architecture on page B2-81.
• Memory ordering on page B2-84.
• Memory types and attributes on page B2-94.
• Mismatched memory attributes on page B2-105.
• Synchronization and semaphores on page B2-108.





Note 


In this chapter, System register names usually link to the description of the register in Chapter D7 AArch64 System 
Register Descriptions, for example SCTLR_EL1. 
B2.1 Address space


Address calculations are performed using 64-bit registers. However, supervisory software can configure the top
eight address bits for use as a tag, as described in Address tagging in AArch64 state on page D4-1724. If this is done,
address bits[63:56]:


• Are not considered when determining whether the address is valid.


• Are never propagated to the program counter.


Supervisory software determines the valid address range. Attempting to access an address that is not valid generates 
an MMU fault. 


Simple sequential execution of instructions might overflow the valid address range. For more information see 
Instruction address space overflow on page D3-1691. 


Memory accesses use the Mem[] function. This function makes an access of the required type. If supervisory software 
configures the top eight address bits for use as a tag, the top eight address bits are ignored. 
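
When address tagging is in use, the tag and the remaining address bits can be separated with a shift and a mask. The following Python sketch shows this under stated assumptions: it models only the statement that bits[63:56] are ignored, it does not model how the remaining bits are checked for validity or translated, and the function names are hypothetical.

```python
TAG_SHIFT = 56
ADDRESS_MASK = (1 << TAG_SHIFT) - 1  # bits [55:0]


def split_tagged_address(va):
    """Return (tag, untagged_bits) for a 64-bit virtual address."""
    return (va >> TAG_SHIFT) & 0xFF, va & ADDRESS_MASK


def same_location(va1, va2):
    """True if two addresses differ only in the tag bits [63:56]."""
    return (va1 & ADDRESS_MASK) == (va2 & ADDRESS_MASK)


tag, untagged = split_tagged_address(0xAB00000012345678)
```

Two pointers that carry different tags but agree in bits[55:0] refer to the same location in this model, which is what allows software to store metadata in the top byte.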


The AccType{} enumeration defines the different access types. 





Note 


• Chapter D3 The AArch64 System Level Memory Model and Chapter D4 The AArch64 Virtual Memory System
Architecture include descriptions of memory system features that are transparent to the application, including
memory access, address translation, memory maintenance instructions, and alignment checking and the
associated fault handling. These chapters also include pseudocode descriptions of these operations.


• For information on the pseudocode that relates to memory accesses, see Basic memory access on
page D3-1717, Unaligned memory access on page D3-1718, and Aligned memory access on page D3-1717.
B2.2 Memory type overview

ARMv8 provides the following mutually-exclusive memory types:

Normal This is generally used for bulk memory operations, both read/write and read-only operations.

Device The ARM architecture forbids speculative reads of any type of Device memory. This means Device
memory types are suitable attributes for read-sensitive locations.

Locations of the memory map that are assigned to peripherals are usually assigned the Device
memory attribute.


Device memory has additional attributes that have the following effects: 


• They prevent aggregation of reads and writes, maintaining the number and size of the
specified memory accesses. See Gathering on page B2-101.

• They preserve the access order and synchronization requirements, both for accesses to a
single peripheral and where there is a synchronization requirement on the observability of
one or more memory write and read accesses. See Reordering on page B2-102.

• They indicate whether a write can be acknowledged other than at the end point. See Early
Write Acknowledgement on page B2-103.


For more information on Normal memory and Device memory, see Memory types and attributes on page B2-94. 


Note 


Earlier versions of the ARM architecture defined a single Device memory type and a Strongly-ordered memory 
type. A Note in Device memory on page B2-98 describes how these memory types map onto the ARMv8 memory 
types. 
B2.3 Caches and memory hierarchy

The implementation of a memory system depends heavily on the microarchitecture and therefore many details of
the memory system are IMPLEMENTATION DEFINED. ARMv8 defines the application level interface to the memory
system, including a hierarchical memory system with multiple levels of cache. This section describes an application
level view of this system. It contains the subsections:

• Introduction to caches.
• Memory hierarchy on page B2-71.
• Application level access to functionality related to caches on page B2-72.
• Implication of caches for the application programmer on page B2-73.
• Preloading caches on page B2-74.

B2.3.1 Introduction to caches

A cache is a block of high-speed memory that contains a number of entries, each consisting of:

• Main memory address information, commonly known as a tag.
• The associated data.

Caches increase the average speed of a memory access. Caching takes account of two principles of locality:


Spatial locality

An access to one location is likely to be followed by accesses to adjacent locations. Examples of this
principle are:

• Sequential instruction execution.
• Accessing a data structure.

Temporal locality

An access to an area of memory is likely to be repeated in a short time period. An example of this
principle is the execution of a software loop.


To minimize the quantity of control information stored, the spatial locality property groups several locations 
together under the same tag. This logical block is commonly known as a cache line. When data is loaded into a 
cache, access times for subsequent loads and stores are reduced, resulting in overall performance benefits. An access 
to information already in a cache is known as a cache hit, and other accesses are called cache misses. 


Normally, caches are self-managing, with the updates occurring automatically. Whenever the PE accesses a 
cacheable memory location, the cache is checked. If the access is a cache hit, the access occurs in the cache. 
Otherwise, the access is made to memory. Typically, when making this access, a cache location is allocated and the 
cache line loaded from memory. ARMv8 permits different cache topologies and access policies, provided they 
comply with the memory coherency model described in this manual. 
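
The terms tag, cache line, cache hit, and cache miss can be made concrete with a toy model. The following Python sketch assumes a direct-mapped cache with four 64-byte lines; both the topology and the class name DirectMappedCache are illustrative assumptions, because ARMv8 permits many cache topologies and does not mandate this one.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one tag per line, no data storage."""

    def __init__(self, num_lines: int = 4, line_bytes: int = 64):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines   # tag held by each line, or None if empty
        self.hits = 0
        self.misses = 0

    def access(self, addr: int) -> bool:
        """Look up addr, allocating the line on a miss. Returns True on a hit."""
        line_number = addr // self.line_bytes   # all bytes in a line share one tag
        index = line_number % self.num_lines
        tag = line_number // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                  # allocate the line from memory
        self.misses += 1
        return False


cache = DirectMappedCache()
for address in range(0, 128, 8):   # sequential 8-byte accesses (spatial locality)
    cache.access(address)
```

In this model the sixteen sequential accesses touch only two cache lines, so only the first access to each line misses.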


Caches introduce a number of potential problems, mainly because: 





° Memory accesses can occur at times other than when the programmer would expect them. 
° A data item can be held in multiple physical locations. 
B2.3.2 Memory hierarchy


Typically memory close to a PE has very low latency, but is limited in size and expensive to implement. Further 
from the PE it is common to implement larger blocks of memory but these have increased latency. To optimize 
overall performance, an ARMv8 memory system can include multiple levels of cache in a hierarchical memory 
system that exploits this trade-off between size and latency. Figure B2-1 shows an example of such a system in an 
ARMv8-A system that supports virtual addressing. 
Figure B2-1 Multiple levels of cache in a memory hierarchy

Note

In this manual, in a hierarchical memory system, Level 1 refers to the level closest to the processing element, as
shown in Figure B2-1.





Instructions and data can be held in separate caches or in a unified cache. A cache hierarchy can have one or more 
levels of separate instruction and data caches, with one or more unified caches that are located at the levels closest 
to the main memory. Memory coherency for cache topologies can be defined by two conceptual points: 


Point of Unification (PoU) 


The point at which the instruction cache, data cache, and translation table walks of a particular PE
are guaranteed to see the same copy of a memory location. In many cases, the point of unification
is the point in a uniprocessor memory system by which the instruction and data caches and the
translation table walks have merged. The point of unification might coincide with the point of
coherency.


Point of Coherency (PoC) 


The point at which all agents that can access memory are guaranteed to see the same copy of a 
memory location. In many cases this is effectively the main system memory, although the 
architecture does not prohibit the implementation of caches beyond the PoC that have no effect on 
the coherency between memory system agents. 


Note

The presence of system caches can affect the definition of point of coherency as described in System
level caches on page D3-1713.

See also About cache maintenance in ARMv8 on page D3-1699.
The cacheability and shareability memory attributes

Cacheability and shareability are two attributes that describe the memory hierarchy in a multiprocessing system:


Cacheability This attribute defines whether memory locations are allowed to be allocated into a cache or not. 
Cacheability is defined independently for Inner and Outer cacheability locations. 


Shareability This attribute defines whether memory locations are shareable between different agents in a system. 
Marking a memory location as shareable for a particular domain requires hardware to ensure that 
the location is coherent for all agents in that domain. Shareability is defined independently for Inner 
and Outer shareability domains. 


For more information about cacheability and shareability see Memory types and attributes on page B2-94. 


B2.3.3 Application level access to functionality related to caches 


As indicated in About the Application level programmers’ model on page B1-58, the application level corresponds
to execution at EL0. The architecture defines a set of cache maintenance instructions that software can use to
manage cache coherency. Software executing at a higher Exception level can enable use of some of this
functionality from EL0, as follows:

When the value of SCTLR_EL1.UCI is 1

Software executing at EL0 can access:

• The data cache maintenance instructions, DC CVAU, DC CVAC, and DC CIVAC. See Data cache
maintenance instructions (DC*) on page D3-1704.

• The instruction cache maintenance instruction IC IVAU. See Instruction cache maintenance
instructions (IC*) on page D3-1704.

When the value of SCTLR_EL1.UCT is 1

Software executing at EL0 can access the cache type register. See CTR_EL0.

When the value of SCTLR_EL1.DZE is 1

Software executing at EL0 can access the data cache zero instruction DC ZVA. See Data cache zero
instruction on page D3-1711.

The SCTLR_EL1.{UCI, UCT, DZE} control fields are only accessible by software executing at EL1 or higher.

This functionality is UNDEFINED at EL0 when the value of the corresponding SCTLR_EL1 control field is 0, see:

• Traps to EL1 of EL0 execution of cache maintenance instructions on page D1-1564.
• Traps to EL1 of EL0 accesses to the CTR_EL0 on page D1-1565.
• Traps to EL1 of EL0 execution of DC ZVA instructions on page D1-1566.

When the value of SCTLR_EL1.UCI is 1:

• If a DC CVAU, DC CVAC, or DC CIVAC cache maintenance instruction is executed at EL0, and the target address
does not have read access permission at EL0, a Permission fault is generated.

• If the IC IVAU cache maintenance instruction is executed at EL0, and the target address does not have read
access permission at EL0, it is IMPLEMENTATION DEFINED whether a Permission fault is generated.
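
The relationship between the three control fields and the operations they enable can be summarized as a lookup. The following Python sketch is only a reading aid: the operation labels, the dictionary, and the function el0_can_execute are hypothetical names, and the SCTLR_EL1 fields are passed as a plain dictionary rather than decoded from a register value.

```python
# SCTLR_EL1 control field that enables each operation at EL0,
# as listed in this section.
EL0_CONTROL_FIELD = {
    "DC CVAU": "UCI",
    "DC CVAC": "UCI",
    "DC CIVAC": "UCI",
    "IC IVAU": "UCI",
    "CTR_EL0 read": "UCT",
    "DC ZVA": "DZE",
}

def el0_can_execute(operation: str, sctlr_el1: dict) -> bool:
    """Return True if the named operation is enabled for software at EL0.

    sctlr_el1 maps field names ("UCI", "UCT", "DZE") to 0 or 1; a missing
    field is treated as 0 in this sketch.
    """
    field = EL0_CONTROL_FIELD[operation]
    return sctlr_el1.get(field, 0) == 1
```

For example, with only UCI set to 1, the cache maintenance instructions are usable at EL0 in this model, while a read of CTR_EL0 or a DC ZVA is not.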
B2.3.4 Implication of caches for the application programmer

In normal operation, the caches are largely invisible to the application programmer. However they can become
visible when there is a breakdown in the coherency of the caches. Such a breakdown can occur:

• When memory locations are updated by other agents in the system that do not use hardware management of
coherency.

• When memory updates made from the application software must be made visible to other agents in the
system, without the use of hardware management of coherency.

For example:

• In the absence of hardware management of coherency of DMA accesses, in a system with a DMA controller
that reads memory locations that are held in the data cache of a PE, a breakdown of coherency occurs when
the PE has written new data in the data cache, but the DMA controller reads the old data held in memory.

• In a Harvard cache implementation, where there are separate instruction and data caches, a breakdown of
coherency occurs when new instruction data has been written into the data cache, but the instruction cache
still contains the old instruction data.

Data coherency issues

Software can ensure the data coherency of caches in the following ways:

• By not using the caches in situations where coherency issues can arise. This can be achieved by:

— Using Non-cacheable or, in some cases, Write-Through Cacheable memory.
— Not enabling caches in the system.

• By using cache maintenance instructions to manage the coherency issues in software. See Application level
access to functionality related to caches on page B2-72.

• By using hardware coherency mechanisms to ensure the coherency of data accesses to memory for cacheable
locations by observers within the different shareability domains, see Non-shareable Normal memory on
page B2-96 and Shareable, Inner Shareable, and Outer Shareable Normal memory on page B2-95.

Note

The performance of these hardware coherency mechanisms is highly implementation-specific. In some
implementations the mechanism suppresses the ability to cache shareable locations. In other
implementations, cache coherency hardware can hold data in caches while managing coherency between
observers within the shareability domains.

Note

Not all these mechanisms are directly available to software operating at EL0 and might involve interaction with
software operating at a higher Exception level.

Synchronization and coherency issues between data and instruction accesses

How far ahead of the current point of execution instructions are fetched from is IMPLEMENTATION DEFINED. Such
prefetching can be either a fixed or a dynamically varying number of instructions, and can follow any or all possible
future execution paths. For all types of memory:

• The PE might have fetched the instructions from memory at any time since the last Context synchronization
event on that PE.

• Any instructions fetched in this way might be executed multiple times, if this is required by the execution of
the program, without being refetched from memory. In the absence of a Context synchronization event, there
is no limit on the number of times such an instruction might be executed without being refetched from
memory.


The ARM architecture does not require the hardware to ensure coherency between instruction caches and memory, 
even for locations of shared memory. 


If software requires coherency between instruction execution and memory, it must manage this coherency using 
Context synchronization events, DSB memory barriers, and cache maintenance instructions. See Context 
synchronization event. The following code sequence can be used to allow a PE to execute code that the same PE has 
written. 


; Coherency example for data and instruction accesses within the same Inner Shareable domain.
; Enter this code with <Wt> containing a new 32-bit instruction,
; to be held in Cacheable space at a location pointed to by Xn.

    STR Wt, [Xn]
    DC CVAU, Xn     ; Clean data cache by VA to point of unification (PoU)
    DSB ISH         ; Ensure visibility of the data cleaned from cache
    IC IVAU, Xn     ; Invalidate instruction cache by VA to PoU
    DSB ISH         ; Ensure completion of the invalidations
    ISB             ; Synchronize the fetched instruction stream

Note

• For Non-cacheable or Write-Through accesses, the clean data cache by VA instruction is not required.
However, the invalidate instruction cache instruction is required because the ARMv8-A AArch64
architecture allows Non-cacheable accesses to be held in an instruction cache. See Non-cacheable accesses
and instruction caches on page D3-1698.

• This code can be used when the thread of execution modifying the code is the same thread of execution that
is executing the code. The ARMv8 architecture limits the set of instructions that can be executed by one
thread of execution as they are being modified by another thread of execution without requiring explicit
synchronization. See Concurrent modification and execution of instructions on page B2-83.

• The system software controls whether these cache maintenance instructions are available to the application
level by setting SCTLR_EL1.UCI.








Note 


If this sequence is not executed between writing data to a location and executing the instruction at that location, the 
lack of coherency between instruction caches and memory means that the instructions that are executed might be 
the old instruction or the updated instruction, and which is used can arbitrarily vary during execution. It must not 
be assumed by software, before the synchronization sequence is executed, that once the updated instruction has been 
seen, the old instruction will not be seen again. 
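
The effect of omitting the clean and invalidate steps can be shown with a deliberately simplified model. The following Python sketch assumes a single PE with a write-back data cache and a separate instruction cache that both fill from the same memory. The class ToyPE and its methods are hypothetical, barriers and speculation are not modelled, and real implementations are far less predictable than this deterministic model.

```python
class ToyPE:
    """Toy PE with separate, non-coherent instruction and data caches."""

    def __init__(self, memory: dict):
        self.memory = dict(memory)   # contents at the point of unification
        self.dcache = {}             # dirty data written by stores
        self.icache = {}             # instructions already fetched

    def store(self, addr, value):
        self.dcache[addr] = value    # the write stays in the data cache

    def fetch(self, addr):
        if addr not in self.icache:  # instruction cache miss: fill from memory
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def dc_cvau(self, addr):
        if addr in self.dcache:      # clean: make the data visible in memory
            self.memory[addr] = self.dcache[addr]

    def ic_ivau(self, addr):
        self.icache.pop(addr, None)  # invalidate: force a refetch


pe = ToyPE({0x1000: "old instruction"})
before = pe.fetch(0x1000)
pe.store(0x1000, "new instruction")
stale = pe.fetch(0x1000)             # still served from the instruction cache
pe.dc_cvau(0x1000)
pe.ic_ivau(0x1000)
after = pe.fetch(0x1000)
```

In this model the fetch made before the maintenance operations returns the old instruction, and only the fetch made after both the clean and the invalidate returns the new one.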





B2.3.5 Preloading caches


The ARM architecture provides memory system hints PRFM, LDNP, and STNP that software can use to communicate 
the expected use of memory locations to the hardware. The memory system can respond by taking actions that are 
expected to speed up the memory accesses if they occur. The effect of these memory system hints is 
IMPLEMENTATION DEFINED. Typically, implementations use this information to bring the data or instruction 
locations into caches. 


The Preload instructions are hints, and so implementations can treat them as NOPs without affecting the functional 
behavior of the device. The instructions cannot generate synchronous Data Abort exceptions, but the resulting 
memory system operations might, under exceptional circumstances, generate an asynchronous external abort, which 
is taken using an SError interrupt exception. For more information, see Exception from a Data abort on
page D1-1533.


PrefetchHint{} defines the prefetch hint types. 


The Hint_Prefetch() function signals to the memory system that memory accesses of the type hint to or from the 
specified address are likely to occur in the near future. The memory system might take some action to speed up the 
memory accesses when they do occur, such as preloading the specified address into one or more caches as indicated 
by the innermost cache level target and non-temporal hint stream. 
For more information on PRFM and Load/Store instructions that provide hints to the memory system, see Prefetch
memory on page C3-156 and Load/Store SIMD and Floating-point Non-temporal pair on page C3-154.

B2.4 Alignment support

This section describes alignment support. It contains the following subsections:

• Instruction alignment.
• Alignment of data accesses.
• Unaligned data access restrictions on page B2-77.

B2.4.1 Instruction alignment


A64 instructions must be word-aligned. 


Attempting to fetch an instruction from a misaligned location results in a PC alignment fault. See PC alignment 
checking on page D1-1515. 


B2.4.2 Alignment of data accesses

An unaligned access to any type of Device memory causes an Alignment fault.

The alignment requirements for accesses to Normal memory are as follows:

• For all instructions that load or store a single or multiple registers, other than
Load-Exclusive/Store-Exclusive and Load-Acquire/Store-Release, if the address that is accessed is not
aligned to the size of the data element being accessed, then one of the following occurs:

— An Alignment fault is generated.
— An unaligned access is performed.

When the value of SCTLR_ELx.A at the current Exception level is 1, alignment checking is enabled, and
unaligned accesses generate Alignment faults.

Note

— The SCTLR_EL1.A bit applies to software running at EL0 and at EL1, although it can only be
accessed from EL1 and higher.

— Alignment checks are based on the size of the accessed elements, not the overall access size. This
affects SIMD element and structure loads and stores, and also Load/Store pair instructions.

— These alignment checking rules mean the ARMv8 architecture introduces requirements for 64-bit and
128-bit alignment checking.

• For all Load-Exclusive/Store-Exclusive and Load-Acquire/Store-Release memory accesses that access a
single element or a pair of elements, an Alignment fault is generated if the address being accessed is not
aligned to the size of the data structure being accessed.
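
The alignment rules above reduce to a small decision procedure. The following Python sketch is an informal restatement, not the architecture pseudocode. The function name and parameters are hypothetical, and check_bytes stands for the element size for ordinary loads and stores or the whole data structure size for Load-Exclusive/Store-Exclusive and Load-Acquire/Store-Release accesses.

```python
def alignment_outcome(addr: int, check_bytes: int, *, device: bool = False,
                      exclusive_or_acquire_release: bool = False,
                      sctlr_a: int = 0) -> str:
    """Classify a data access according to the rules in this section.

    check_bytes: size the address is checked against (see the lead-in).
    device: True if the location is any type of Device memory.
    sctlr_a: value of SCTLR_ELx.A at the current Exception level.
    """
    if addr % check_bytes == 0:
        return "aligned access"
    if device or exclusive_or_acquire_release or sctlr_a == 1:
        return "Alignment fault"
    return "unaligned access"
```

For example, a Load Pair of two 64-bit registers from an address that is 8-byte aligned but not 16-byte aligned passes the check, because the check uses the 8-byte element size. A Load-Exclusive Pair of two 64-bit quantities from the same address faults in this model, because the check uses the 16-byte data structure size.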


A failed alignment check results in an Alignment fault, which is taken as a Data Abort exception, as follows:

• For an access from Non-secure EL0 or EL1, if the Alignment fault is generated only because the translation
tables identify the address being accessed as Device memory then:

— If the first stage of address translation marks the address as Device memory then the exception is taken
to EL1.

— If only the second stage of address translation marks the address as Device memory then the exception
is taken to EL2.

• Otherwise, the exception is taken to the lowest Exception level that can handle the exception, consistent with
the basic requirement that the Exception level never decreases on taking an exception. Therefore:

— Alignment faults taken from EL0 or EL1 are taken to EL1 unless redirected by HCR_EL2.TGE.
— Alignment faults taken from EL2 are taken to EL2.
— Alignment faults taken from EL3 are taken to EL3.

B2.4.3 Unaligned data access restrictions

The following points apply to unaligned data accesses in ARMv8:

• Accesses are not guaranteed to be single-copy atomic except at the byte access level, see Atomicity in the
ARM architecture on page B2-81.

• Unaligned accesses typically take a number of additional cycles to complete compared to a naturally-aligned
access.

• An operation that performs an unaligned access can abort on any memory access that it makes, and can abort
on more than one access. This means that an unaligned access that occurs across a page boundary can
generate an abort on either side of the boundary.
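
Whether an access can abort on more than one page depends only on its address and size. The following Python sketch lists the pages that an access touches. The function name is hypothetical and the 4KB default page size is an assumption of the example, because the page size is chosen by supervisory software.

```python
def pages_touched(addr: int, size: int, page_bytes: int = 4096) -> list:
    """Return the page numbers covered by the bytes addr .. addr + size - 1."""
    first = addr // page_bytes
    last = (addr + size - 1) // page_bytes
    return list(range(first, last + 1))
```

An 8-byte access at 0xFFC touches pages 0 and 1 in this model, so a fault could be reported for either page, whereas the same access at 0x1000 touches a single page.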
B2.5 Endian support

General description of endianness in the ARM architecture describes the relationship between endianness and
memory addressing in the ARM architecture.

The following subsections then describe the endianness schemes supported by the architecture:

• Instruction endianness on page B2-79.
• Data endianness on page B2-79.

B2.5.1 General description of endianness in the ARM architecture


This section only describes memory addressing and the effects of endianness for data elements up to quadwords of 
128 bits. However, this description can be extended to apply to larger data elements. 


For an address A, Figure B2-2 shows, for big-endian and little-endian memory systems, the relationship between:

• The quadword at address A.
• The doublewords at addresses A and A+8.
• The words at addresses A, A+4, A+8, and A+12.
• The halfwords at addresses A, A+2, A+4, A+6, A+8, A+10, A+12, and A+14.
• The bytes at addresses A, A+1, A+2, A+3, A+4, A+5, A+6, A+7, A+8, A+9, A+10, A+11, A+12, A+13,
A+14, and A+15.


The terms in Figure B2-2 have the following definitions: 


B_A Byte at address A. 
HW_A Halfword at address A. 
MSByte Most-significant byte. 
LSByte Least-significant byte. 


Big-endian memory system (MSByte on the left, LSByte on the right, byte address incrementing from left to right)

Quadword at address A
Doubleword at address A | Doubleword at address A+8
Word at address A | Word at address A+4 | Word at address A+8 | Word at address A+12
HW_A | HW_A+2 | HW_A+4 | HW_A+6 | HW_A+8 | HW_A+10 | HW_A+12 | HW_A+14

Little-endian memory system (MSByte on the left, LSByte on the right, byte address incrementing from right to left)

Quadword at address A
Doubleword at address A+8 | Doubleword at address A
Word at address A+12 | Word at address A+8 | Word at address A+4 | Word at address A
HW_A+14 | HW_A+12 | HW_A+10 | HW_A+8 | HW_A+6 | HW_A+4 | HW_A+2 | HW_A

Figure B2-2 Endianness relationships

The big-endian and little-endian mapping schemes determine the order in which the bytes of a quadword,
doubleword, word or halfword are interpreted. For example, a load of a word from address 0x1000 always results in 
an access to the bytes at memory locations 0x1000, 0x1001, 0x1002, and 0x1003. The endianness mapping scheme 
determines the significance of these four bytes. 
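
The word load from address 0x1000 described above can be modelled directly. The following Python sketch uses a dictionary as a stand-in for byte-addressed memory. The byte values and the function name load_word are illustrative assumptions.

```python
# Hypothetical memory contents at the four byte addresses used in the text.
memory = {0x1000: 0x11, 0x1001: 0x22, 0x1002: 0x33, 0x1003: 0x44}

def load_word(mem: dict, addr: int, big_endian: bool) -> int:
    """Read the four bytes at addr .. addr+3 and assemble a 32-bit value."""
    raw = bytes(mem[addr + i] for i in range(4))   # same bytes in both schemes
    return int.from_bytes(raw, "big" if big_endian else "little")
```

The same four bytes are accessed in both cases. Only their significance changes: the byte at 0x1000 is the least-significant byte under the little-endian scheme and the most-significant byte under the big-endian scheme.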


B2.5.2 Instruction endianness 


In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always little-endian. 


B2.5.3 Data endianness

SCTLR_EL1.E0E, configurable at EL1 or higher, determines the data endianness for execution at EL0.


The data size used for endianness conversions: 


° Is the size of the data value that is loaded or stored for SIMD and floating-point register and general-purpose 
register loads and stores. 


° Is the size of the data element that is loaded or stored for SIMD element and data structure loads and stores. 
For more information see Endianness in SIMD operations on page B2-80. 





Note 


This means the ARMv8 architecture introduces a requirement for 128-bit endian conversions.

Instructions to reverse bytes in a general-purpose register or a SIMD and floating-point register


An application or device driver might have to interface to memory-mapped peripheral registers or shared memory 
structures that are not the same endianness as the internal data structures. Similarly, the endianness of the operating 
system might not match that of the peripheral registers or shared memory. In these cases, the PE requires an efficient 
method to transform explicitly the endianness of the data. 


Table B2-1 shows the instructions that provide this functionality: 


Table B2-1 Byte reversal instructions

Function                                   Instructions   Notes
Reverse bytes in 32-bit word or words (a)  REV32          For use with general-purpose registers
Reverse bytes in whole register            REV            For use with general-purpose registers
Reverse bytes in 16-bit halfwords          REV16          For use with general-purpose registers
Reverse elements in doublewords, vector    REV64          For use with SIMD and floating-point registers
Reverse elements in words, vector          REV32          For use with SIMD and floating-point registers
Reverse elements in halfwords, vector      REV16          For use with SIMD and floating-point registers

a. Can operate on multiple words.
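
The general-purpose register forms in Table B2-1 can be modelled as byte reversal within fixed-size chunks of the register. The following Python sketch is an informal model with hypothetical function names. It assumes a 64-bit register by default and does not cover the vector forms.

```python
def _reverse_chunks(value: int, reg_bytes: int, chunk_bytes: int) -> int:
    """Reverse the byte order inside each chunk_bytes-sized piece of value."""
    raw = value.to_bytes(reg_bytes, "little")
    swapped = b"".join(raw[i:i + chunk_bytes][::-1]
                       for i in range(0, reg_bytes, chunk_bytes))
    return int.from_bytes(swapped, "little")

def rev(value: int, reg_bytes: int = 8) -> int:
    """Model of REV: reverse the bytes in the whole register."""
    return _reverse_chunks(value, reg_bytes, reg_bytes)

def rev32(value: int, reg_bytes: int = 8) -> int:
    """Model of REV32: reverse the bytes in each 32-bit word."""
    return _reverse_chunks(value, reg_bytes, 4)

def rev16(value: int, reg_bytes: int = 8) -> int:
    """Model of REV16: reverse the bytes in each 16-bit halfword."""
    return _reverse_chunks(value, reg_bytes, 2)
```

Applying any of these functions twice returns the original value, which is why the same instruction converts data in either direction between the two endianness schemes.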
Endianness in SIMD operations


SIMD element Load/Store instructions transfer vectors of elements between memory and the SIMD and 
floating-point register file. An instruction specifies both the length of the transfer and the size of the data elements 
being transferred. This information is used to load and store data correctly in both big-endian and little-endian 
systems. 


For example: 
LD1 {V0.4H}, [X1] 


This loads a 64-bit register with four 16-bit values. The four elements appear in the register in array order, with the 
lowest indexed element fetched from the lowest address. The order of bytes in the elements depends on the 
endianness configuration, as shown in Figure B2-3. Therefore, the order of the elements in the registers is the same 
regardless of the endianness configuration. 


[Figure: a 64-bit register containing four 16-bit elements, A to D, each shown as bytes [15:8] and [7:0], loaded by
LD1 {V0.4H}, [X1] from byte addresses 0 to 7 of a memory system with little-endian addressing (LE) and of a
memory system with big-endian addressing (BE).]

Figure B2-3 SIMD byte order example

The BigEndian() pseudocode function determines the current endianness of the data. 
The BigEndianReverse() pseudocode function reverses the endianness of a bitstring. 


The BigEndian() and BigEndianReverse() functions are defined in Chapter J1 ARMv8 Pseudocode. 
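
The LD1 {V0.4H}, [X1] example can be approximated with Python's struct module, which unpacks four 16-bit elements in array order under either byte order. The function name ld1_4h and the element values are illustrative assumptions, and the model returns a list of elements rather than a register value.

```python
import struct

def ld1_4h(memory_bytes: bytes, big_endian: bool) -> list:
    """Model of LD1 {V0.4H}, [X1]: load four 16-bit elements in array order."""
    fmt = ">4H" if big_endian else "<4H"
    return list(struct.unpack(fmt, memory_bytes))

# The same four elements as they would be laid out in memory under each scheme.
le_memory = bytes([0xA0, 0xA1, 0xB0, 0xB1, 0xC0, 0xC1, 0xD0, 0xD1])
be_memory = bytes([0xA1, 0xA0, 0xB1, 0xB0, 0xC1, 0xC0, 0xD1, 0xD0])
```

Both layouts produce the same element list, with the lowest-indexed element taken from the lowest address. Only the order of the two bytes inside each element differs between the two memory images.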
B2.6 Atomicity in the ARM architecture

Atomicity is a feature of memory accesses, described as atomic accesses. The ARM architecture description refers
to two types of atomicity, single-copy atomicity and multi-copy atomicity. In the ARMv8 architecture, the atomicity
requirements for memory accesses depend on the memory type, and whether the access is explicit or implicit. For
more information see:

• Requirements for single-copy atomicity.
• Properties of single-copy atomic accesses on page B2-82.
• Multi-copy atomicity on page B2-82.
• Requirements for multi-copy atomicity on page B2-82.
• Concurrent modification and execution of instructions on page B2-83.

For more information about the memory types, see Memory type overview on page B2-69.

B2.6.1 Requirements for single-copy atomicity


For explicit memory accesses generated from an Exception level the following rules apply:

• A read that is generated by a load instruction that loads a single general-purpose register and is aligned to the
size of the read in the instruction is single-copy atomic.

• A write that is generated by a store instruction that stores a single general-purpose register and is aligned to
the size of the write in the instruction is single-copy atomic.

• Reads that are generated by a Load Pair instruction that loads two general-purpose registers and are aligned
to the size of the load to each register are treated as two single-copy atomic reads, one for each register being
loaded.

• Writes that are generated by a Store Pair instruction that stores two general-purpose registers and are aligned
to the size of the store of each register are treated as two single-copy atomic writes, one for each register being
stored.

• Load-Exclusive Pair instructions of two 32-bit quantities and Store-Exclusive Pair instructions of 32-bit
quantities are single-copy atomic.

• When the Store-Exclusive of a Load-Exclusive/Store-Exclusive pair instruction using two 64-bit quantities
succeeds, it causes a single-copy atomic update of the entire memory location being updated.

Note

To atomically load two 64-bit quantities, perform a Load-Exclusive pair/Store-Exclusive pair sequence of
reading and writing the same value for which the Store-Exclusive pair succeeds, and use the read values from
the Load-Exclusive pair.

• Where translation table walks generate a read of a translation table entry, this read is single-copy atomic.

• For the atomicity of instruction fetches, see Concurrent modification and execution of instructions on
page B2-83.

• Reads to floating-point and SIMD registers of a single 64-bit or smaller quantity that is aligned to the size of
the quantity being loaded are treated as single-copy atomic reads.

• Writes from floating-point and SIMD registers of a single 64-bit or smaller quantity that is aligned to the size
of the quantity being stored are treated as single-copy atomic writes.

• Element or Structure Reads to floating-point and SIMD registers of 64-bit or smaller elements, where each
element is aligned to the size of the element being loaded, have each element treated as a single-copy atomic
read.

• Element or Structure Writes from floating-point and SIMD registers of 64-bit or smaller elements, where
each element is aligned to the size of the element being stored, have each element treated as a single-copy
atomic store.

• Reads to floating-point and SIMD registers of a 128-bit value that is 64-bit aligned in memory are treated as
a pair of single-copy atomic 64-bit reads.

• Writes from floating-point and SIMD registers of a 128-bit value that is 64-bit aligned in memory are treated
as a pair of single-copy atomic 64-bit writes.


All other memory accesses are regarded as streams of accesses to bytes, and no atomicity between accesses to 
different bytes is ensured by the architecture. 


All accesses to any byte are single-copy atomic. 
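
For single-register loads and stores, the rules above determine the units in which an access is single-copy atomic. The following Python sketch returns those units as (address, size) pairs. It is an informal summary with a hypothetical function name, and it does not cover pair, exclusive, or element and structure accesses.

```python
def single_copy_atomic_units(addr: int, size: int) -> list:
    """Split a single-register access into its single-copy atomic units.

    size is the access size in bytes: 1, 2, 4, or 8 for general-purpose or
    SIMD and floating-point registers, or 16 for a 128-bit SIMD register.
    """
    if size in (1, 2, 4, 8) and addr % size == 0:
        return [(addr, size)]                    # one atomic access
    if size == 16 and addr % 8 == 0:
        return [(addr, 8), (addr + 8, 8)]        # pair of 64-bit atomic accesses
    return [(addr + i, 1) for i in range(size)]  # otherwise a stream of bytes
```

An access that is split into several units can be observed partially updated by another observer, which is why only the units themselves, and not the whole access, can be relied on to be atomic.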





Note 


In AArch64 state, no memory accesses from a DC ZVA have single-copy atomicity of any quantity greater than 
individual bytes. 





If, according to these rules, an instruction is executed as a sequence of accesses, exceptions, including interrupts, 
can be taken during that sequence, regardless of the memory type being accessed. If any of these exceptions are 
returned from using their preferred return address, the instruction that generated the sequence of accesses is 
re-executed, and so any access performed before the exception was taken is repeated. See also Taking an interrupt 
or other exception during a multiple-register load or store on page D1-1560. 





Note 


The exception behavior for these multiple access instructions means they are not suitable for use for writes to 
memory for the purpose of software synchronization. 





B2.6.2 Properties of single-copy atomic accesses


A read or write operation that is single-copy atomic has the following properties: 


1. For a single-copy atomic store, if the store overlaps another single-copy atomic store, then all of the writes 
from one of the stores are inserted into the Coherence order of each overlapping byte before any of the writes 
of the other store are inserted into the Coherence orders of the overlapping bytes. 


2. If a single-copy atomic load overlaps a single-copy atomic store and for any of the overlapping bytes the load 
returns the data written by the write inserted into the Coherence order of that byte by the single-copy atomic 
store then the load must return data from a point in the Coherence order no earlier than the writes inserted 
into the Coherence order by the single-copy atomic store of all of the overlapping bytes. 


Multi-copy atomicity 


In a multiprocessing system, writes to a memory location are multi-copy atomic if the following conditions are both 
true: 


• All writes to the same location are serialized, meaning they are observed in the same order by all observers,
although some observers might not observe all of the writes.


• A read of a location does not return the value of a write until all observers observe that write.


Note 


Writes that are not coherent are not multi-copy atomic. 








Requirements for multi-copy atomicity 


In a multiprocessing system, coherent writes to a memory location are multi-copy atomic if the read of a location 
returns the value of a write only when all observers have observed that write. 


For Normal memory, writes are not required to be multi-copy atomic. 


For Device memory with the non-Gathering attribute, writes that are single-copy atomic are also multi-copy atomic.


For Device memory with the Gathering attribute, writes are not required to be multi-copy atomic.


B2.6.5 Concurrent modification and execution of instructions 


The ARMv8 architecture limits the set of instructions that can be executed by one thread of execution as they are
being modified by another thread of execution without requiring explicit synchronization. 


Concurrent modification and execution of instructions can lead to the resulting instruction performing any behavior
that can be achieved by executing any sequence of instructions that can be executed from the same Exception level,
except where both the instruction before modification and the instruction after modification are one of the B, BL, NOP,
BRK, SVC, HVC, or SMC instructions.


For the B, BL, NOP, BRK, SVC, HVC, and SMC instructions the architecture guarantees that, after modification of the 
instruction, behavior is consistent with execution of either: 


• The instruction originally fetched.


• A fetch of the modified instruction.


If one thread of execution changes a conditional branch instruction, such as B or BL, to another conditional instruction 
and the change affects both the condition field and the branch target, execution of the changed instruction by another 
thread of execution before the change is synchronized can lead to either: 


• The old condition being associated with the new target address.


• The new condition being associated with the old target address.


These possibilities apply regardless of whether the condition, either before or after the change to the branch 
instruction, is the always condition. 


For all other instructions, to avoid UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behavior, instruction 
modifications must be explicitly synchronized before they are executed. The required synchronization is as follows: 


1. No PE must be executing an instruction when another PE is modifying that instruction.


2. To ensure that the modified instructions are observable, the PE that modified the instructions must issue the
following sequence of instructions and operations:


; Coherency example for data and instruction accesses within the same Inner Shareable domain.
; Enter this code with <Wt> containing a new 32-bit instruction,
; to be held in Cacheable space at a location pointed to by Xn.

STR Wt, [Xn]    ; Store the new instruction to memory
DC CVAU, Xn     ; Clean data cache by VA to point of unification (PoU)
DSB ISH         ; Ensure visibility of the data cleaned from cache
IC IVAU, Xn     ; Invalidate instruction cache by VA to PoU
DSB ISH         ; Ensure completion of the invalidations

Note 
The DC CVAU operation is not required if the area of memory is either Non-cacheable or Write-through 
Cacheable. 





3. In a multiprocessor system, the IC IVAU is broadcast to all PEs within the Inner Shareable domain of the PE 
running this sequence. However, when the modified instructions are observable, each PE that is executing 
the modified instructions must issue the following instruction to ensure execution of the modified 
instructions: 


ISB ; Synchronize fetched instruction stream 


For more information about the required synchronization operation, see Synchronization and coherency issues 
between data and instruction accesses on page B2-73. 
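

Software written in C usually reaches the sequence in step 2 through a compiler builtin rather than through hand-written assembly. The following is a minimal sketch under stated assumptions, not an architectural requirement: patch_instruction and demo_patch are hypothetical names, and __builtin___clear_cache is a GCC and Clang builtin that is assumed here to perform equivalent cache maintenance when compiling for AArch64. The demo writes into an ordinary data word and never executes it.

```c
#include <assert.h>
#include <stdint.h>

/* A64 encoding of NOP, used here only as sample instruction data. */
#define A64_NOP 0xD503201Fu

/*
 * Hypothetical helper: replace the 32-bit instruction word at `slot`.
 * Assumption: on AArch64 the builtin performs the clean, invalidate and
 * barrier steps of the sequence above; on other targets it may do nothing.
 */
void patch_instruction(uint32_t *slot, uint32_t new_insn)
{
    /* One aligned 32-bit store, playing the role of STR Wt, [Xn]. */
    *(volatile uint32_t *)slot = new_insn;

    /* Make the range [slot, slot + 1) coherent for instruction fetch. */
    __builtin___clear_cache((char *)slot, (char *)(slot + 1));
}

/* Demonstration only: patches a data word and returns what was written. */
uint32_t demo_patch(void)
{
    static uint32_t code_slot = 0;
    patch_instruction(&code_slot, A64_NOP);
    return code_slot;
}
```

As step 3 states, every other PE that goes on to execute the patched location still has to execute its own ISB. Nothing the patching PE does can supply that.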





Note 


For information about memory accesses caused by instruction fetches, see Ordering requirements on page B2-85.


B2.7 Memory ordering


This section describes observation ordering. It contains the following subsections:


• Observability and completion.
• Ordering requirements on page B2-85.
• Memory barriers on page B2-87.
• Summary of the memory ordering rules on page B2-91.


For information on endpoint ordering of memory accesses, see Reordering on page B2-102.


In the ARMv8 memory model, the shareability memory attribute indicates whether hardware must ensure memory 
coherency. 


The ARMv8 memory system architecture defines additional attributes and associated behaviors, defined in the 
system level section of this manual. See: 


• Chapter D3 The AArch64 System Level Memory Model.
• Chapter D4 The AArch64 Virtual Memory System Architecture.


See also Mismatched memory attributes on page B2-105. 


B2.7.1 Observability and completion 


An observer is a master in the system that is capable of observing memory accesses. For a PE, the following 
mechanisms must be treated as independent observers: 


• The mechanism that performs reads or writes to memory.


• A mechanism that causes an instruction cache to be filled from memory or that fetches instructions to be
executed directly from memory. These are treated as reads.


• A mechanism that performs translation table walks. These are treated as reads.


The set of observers that can observe a memory access is defined by the system.


In the definitions in this subsection, subsequent means whichever of the following is appropriate to the context:


• After the point in time where the location is observed by that observer.


• After the point in time where the location is globally observed.


For all memory: 


• A write to a location in memory is said to be observed by an observer when:


— A subsequent read of the location by the same observer returns the value written by the observed write,
or written by a write to that location by any observer that is sequenced in the Coherence order of the
location after the observed write.


— A subsequent write of the location by the same observer is sequenced in the Coherence order of the
location after the observed write.


• A write to a location in memory is said to be globally observed for a shareability domain or set of observers
when: 


— A subsequent read of the location by any observer in that shareability domain returns the value written 
by the globally observed write, or written by a write to that location by any observer that is sequenced 
in the Coherence order of the location after the globally observed write. 


— A subsequent write of the location by any observer in that shareability domain is sequenced in the 
Coherence order of the location after the globally observed write. 


• A read of a location in memory is said to be observed by an observer when a subsequent write to the location
by the same observer has no effect on the value returned by the read.


• A read of a location in memory is said to be globally observed for a shareability domain when a subsequent
write to the location by any observer in that shareability domain has no effect on the value returned by the
read.


Additionally, for Device-nGnRnE memory:


• A read or write of a memory-mapped location in a peripheral that exhibits side-effects is said to be observed,
and globally observed, only when the read or write:


— Meets the general conditions listed.
— Can begin to affect the state of the memory-mapped peripheral.
— Can trigger all associated side-effects, whether they affect other peripheral devices, PEs, or memory.


Note 


This definition is consistent with the memory access having reached the peripheral. 








For all memory, the completion rules are defined as: 


• A read or write is complete for a shareability domain when all of the following are true:


— The read or write is globally observed for that shareability domain.
— Any translation table walks associated with the read or write are complete for that shareability domain.


• A translation table walk is complete for a shareability domain when the memory accesses associated with the
translation table walk are globally observed for that shareability domain, and the TLB is updated.


• A cache or TLB maintenance instruction is complete for a shareability domain when the effects of the
instruction are globally observed for that shareability domain, and any translation table walks that arise from
the instruction are complete for that shareability domain.


The completion of any cache or TLB maintenance instruction includes its completion on all PEs that are 
affected by both the instruction and the DSB operation that is required to guarantee visibility of the 
maintenance instruction. 


Completion of side-effects of accesses to Device memory 


The completion of a memory access to Device memory other than Device-nGnRnE is not guaranteed to be sufficient 
to determine that the side-effects of the memory access are visible to all observers. The mechanism that ensures the 
visibility of side-effects of a memory access is IMPLEMENTATION DEFINED. 











B2.7.2 Ordering requirements


ARMv8 defines restrictions for the permitted ordering of memory accesses. These restrictions depend on the
memory type of the addresses that are accessed, see Memory types and attributes on page B2-94.


Note


See Summary of the memory ordering rules on page B2-91 for the definition of address dependency.


The only stores by an observer that can be observed by another observer are those stores that have been
Architecturally executed. Speculative writes by an observer cannot be observed by another observer. For the
purposes of this requirement, speculative writes are all of:


• Writes generated by store instructions that appear in the Execution stream after a branch that is not
architecturally resolved.


• Writes generated by store instructions that appear in the Execution stream after an instruction where a
synchronous exception condition has not been architecturally resolved.


• Writes generated by conditional store instructions for which the conditions for the instruction have not been
architecturally resolved.


• Writes generated by store instructions for which the data being written comes from a register that has not been
architecturally committed.


The following additional restrictions apply to the order in which accesses to memory are observed:


• Reads and writes to different addresses can be observed in any order provided the following constraints are
met:


— If an address dependency exists between two reads or between a read and a write, then those memory
accesses are observed in program order by all observers within the common shareability domain of the
memory addresses being accessed.


The ARMv8 architecture relaxes this rule for execution where the second read is generated by a Load
Non-temporal Pair instruction. See Load/Store Non-temporal Pair on page C3-149 and Load/Store
SIMD and Floating-point Non-temporal pair on page C3-154.


— Ordering can be achieved by using a DMB or DSB barrier. For more information on DMB and DSB 
instructions, see Memory barriers on page B2-87. 


• Reads and writes to the same address are coherent within the shareability domain of the memory address
being accessed.


• Two reads to the same address by the same observer are observed in program order by all observers within
the shareability domain of the memory address being accessed.


• Writes are not required to be multi-copy atomic. This means that in the absence of barriers, the observation
of a store by one observer does not imply the observation of the store by another observer.


• Instructions that access multiple elements have no defined ordering requirements for the memory accesses
relative to each other.


For Device memory with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral 
is the same as occurs in a Simple sequential execution on page Glossary-5728 of the program. This means the 
accesses arrive in program order. This ordering applies for all accesses using any of the memory types with the 
non-Reordering attribute, which means Device-nGnRE accesses are ordered with respect to Device-nGnRnE 
accesses to the same peripheral. If the memory accesses are not to a peripheral then there are no ordering restrictions 
from the non-Reordering attribute. For the purposes of this definition, a single peripheral is a region of memory of 
an IMPLEMENTATION DEFINED size that is defined by the peripheral. 


Memory accesses caused by instruction fetches are not required to be observed in program order, unless they are 
separated by an ISB or other Context synchronization event. 


Address dependencies and order 


In the ARMv8 architecture, a register data dependency between the value returned by a load instruction and the
address used by a subsequent memory transaction creates order between that load instruction and the subsequent 
memory transaction. 


A register data dependency exists between a first data value and a second data value when either: 


• The register, excluding the zero register (XZR or WZR), used to hold the first data value is used in the
calculation of the second data value, and the calculation between the first data value and the second data value
does not consist of either:


— A conditional branch whose condition is determined by the first data value.


— A conditional selection, move, or computation whose condition is determined by the first data value,
where the input data values for the selection, move, or computation do not have a data dependency on
the first data value.


• There is a register data dependency between the first data value and a third data value, and between the third
data value and the second data value.


Note


A register data dependency can exist even if the value of the first data value is discarded as part of the calculation,
as might be the case if it is ANDed with 0x0 or if arithmetic using the first data value cancels out its contribution.


For example, each of the following code sequences creates order between the memory transactions: 


Sequence 1    LDR X1, [X2]
              AND X1, X1, XZR
              LDR X4, [X3, X1]


Sequence 2    LDR X1, [X2]
              ADD X3, X3, X1
              SUB X3, X3, X1
              STR X4, [X3]
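

The two rules for a register data dependency (direct use of a register, and the transitive rule through a third value) can be modeled as simple taint propagation. The following C sketch is an illustration only. The names reg_op and depends_on_load are hypothetical, register 31 stands in for XZR/WZR, and the model deliberately leaves out the exclusions for conditional branches and conditional selections, so it covers straight-line register arithmetic only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { REG_ZR = 31, NUM_REGS = 32 };  /* index 31 models XZR/WZR */

/* One register computation, dst = f(src1, src2). Use REG_ZR for an unused source. */
struct reg_op { int dst, src1, src2; };

/*
 * Returns true if, after running `ops`, the value in `addr_reg` has a
 * register data dependency on the value a load placed in `loaded_reg`.
 */
bool depends_on_load(int loaded_reg, const struct reg_op *ops, size_t n_ops, int addr_reg)
{
    bool dep[NUM_REGS] = { false };

    assert(loaded_reg >= 0 && loaded_reg < REG_ZR);
    dep[loaded_reg] = true;                 /* the first data value */

    for (size_t i = 0; i < n_ops; i++) {
        /* The zero register never carries a dependency. */
        bool d = (ops[i].src1 != REG_ZR && dep[ops[i].src1]) ||
                 (ops[i].src2 != REG_ZR && dep[ops[i].src2]);
        if (ops[i].dst != REG_ZR)
            dep[ops[i].dst] = d;            /* the result inherits the dependency */
    }
    return addr_reg != REG_ZR && dep[addr_reg];
}

/* Sequence 1: LDR X1,[X2]; AND X1,X1,XZR; LDR X4,[X3,X1] */
bool sequence1_is_ordered(void)
{
    const struct reg_op ops[] = { { 1, 1, REG_ZR } };
    return depends_on_load(1, ops, 1, 1);
}

/* Sequence 2: LDR X1,[X2]; ADD X3,X3,X1; SUB X3,X3,X1; STR X4,[X3] */
bool sequence2_is_ordered(void)
{
    const struct reg_op ops[] = { { 3, 3, 1 }, { 3, 3, 1 } };
    return depends_on_load(1, ops, 2, 3);
}

/* Counter-example: X1 is overwritten with a value not derived from the load. */
bool overwritten_is_ordered(void)
{
    const struct reg_op ops[] = { { 1, REG_ZR, REG_ZR } };
    return depends_on_load(1, ops, 1, 1);
}
```

The model reports a dependency for both sequences even though the loaded value has no numeric effect on the final address. This matches the point of the Note: the dependency follows the registers used, not the arithmetic result.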








Address dependencies of Load Non-temporal Pair instructions 


Where an address dependency exists between two reads, and the second read was generated by a Load 
Non-temporal Pair instruction, then in the absence of any other barrier mechanism to achieve order, those memory 
accesses can be observed in any order by other observers within the shareability domain of the memory addresses 
being accessed. 


This affects the following instruction:


• LDNP on page C6-542.


B2.7.3 Memory barriers 


The ARM architecture is a weakly ordered memory architecture that supports out of order completion. Memory 
barrier is the general term applied to an instruction, or sequence of instructions, that forces synchronization events 
by a PE with respect to retiring Load/Store instructions. The memory barriers defined by the ARMv8 architecture
provide a range of functionality, including: 


• Ordering of Load/Store instructions.
• Completion of Load/Store instructions.
• Context synchronization.


The following subsections describe the ARMv8 memory barrier instructions:


• Instruction Synchronization Barrier (ISB) on page B2-88.
• Data Memory Barrier (DMB) on page B2-88.
• Data Synchronization Barrier (DSB) on page B2-89.
• Shareability and access limitations on the data barrier operations on page B2-90.
• Load-Acquire, Store-Release on page B2-90.


Note 
Depending on the required synchronization, a program might use memory barriers on their own, or it might use them 
in conjunction with cache maintenance and memory management instructions that in general are only available 
when software execution is at EL1 or higher. 








The DMB and DSB memory barriers affect reads and writes to the memory system generated by Load/Store instructions 
and data or unified cache maintenance instructions being executed by the PE. Instruction fetches or accesses caused 
by a hardware translation table access are not explicit accesses.


Instruction Synchronization Barrier (ISB)


An ISB instruction flushes the pipeline in the PE, so that all instructions that come after the ISB instruction in 
program order are fetched from the cache or memory after the ISB instruction has completed. Using an ISB ensures 
that the effects of context-changing operations executed before the ISB are visible to the instructions fetched after 
the ISB instruction. Examples of context-changing operations that require the insertion of an ISB instruction to ensure 
the effects of the operation are visible to instructions fetched after the ISB instruction are: 


• Completed cache and TLB maintenance instructions.


• Changes to System registers.


Any context-changing operations appearing in program order after the ISB instruction only take effect after the ISB 
has been executed. 


The pseudocode function for the operation of an ISB is InstructionSynchronizationBarrier(). 


See also Memory barriers on page D3-1719. 


Data Memory Barrier (DMB) 


The DMB instruction is a data memory barrier. The PE that executes the DMB instruction is referred to as the executing 
PE, PEe. The DMB instruction takes an <option> argument that specifies the shareability domains and access types to 
which the instruction applies, see Shareability and access limitations on the data barrier operations on page B2-90. 


If the required shareability is Full system then the operation applies to all observers within the system.


A DMB creates two groups of memory accesses, Group A and Group B:


Group A Contains:


• All explicit memory accesses of the required access types from observers in the same
required shareability domain as PEe that are observed by PEe before the DMB instruction.
These accesses include any accesses of the required access types performed by PEe.


• All loads of required access types from an observer PEx in the same required shareability
domain as PEe that have been observed by any given different observer, PEy, in the same
required shareability domain as PEe before PEy has performed a memory access that is a
member of Group A.


Group B Contains:


• All explicit memory accesses of the required access types by PEe that occur in program order
after the DMB instruction.


• All explicit memory accesses of the required access types by any given observer PEx in the
same required shareability domain as PEe that can only occur after a load by PEx has returned
the result of a store that is a member of Group B.


Any observer with the same required shareability domain as PEe observes all members of Group A before it 
observes any member of Group B to the extent that those group members are required to be observed, as determined 
by the shareability and cacheability of the memory addresses accessed by the group members. 


If members of Group A and members of Group B access the same memory-mapped peripheral of arbitrary 
system-defined size, then members of Group A that are accessing Device or Normal Non-cacheable memory arrive 
at that peripheral before members of Group B that are accessing Device or Normal Non-cacheable memory. Where 
the members of Group A and Group B that must be ordered are from the same PE, a DMB NSH is sufficient for this 
guarantee.


Note


• A memory access might be in neither Group A nor Group B. The DMB does not affect the order of observation
of such a memory access.


• The second part of the definition of Group A is recursive. Ultimately, membership of Group A derives from
the observation by PEy of a load before PEy performs an access that is a member of Group A as a result of
the first part of the definition of Group A.


• The second part of the definition of Group B is recursive. Ultimately, membership of Group B derives from
the observation by any observer of an access by PEe that is a member of Group B as a result of the first part
of the definition of Group B.





DMB only affects memory accesses and the operation of data cache and unified cache maintenance instructions, see 
Cache maintenance instructions on page D3-1703. It has no effect on the ordering of any other instructions 
executing on the PE. A DMB instruction intended to ensure the completion of cache maintenance instructions must 
have an access type of both loads and stores. 


The pseudocode function for the operation of a DMB is DataMemoryBarrier(). 


See also Memory barriers on page D3-1719. 
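

A common use of DMB is handing data from one PE to another through a flag. The following C11 sketch is a minimal illustration under stated assumptions: the function and variable names are hypothetical, and atomic_thread_fence(memory_order_seq_cst) is assumed to be compiled to a DMB on AArch64, which is typical of current compilers but is not specified by this manual. In the producer, the payload store falls in Group A and the flag store in Group B of that barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint32_t payload;   /* data being handed over */
static _Atomic uint32_t ready;     /* flag that announces the data */

/* Intended to run on the producing PE. */
void dmb_producer(uint32_t value)
{
    atomic_store_explicit(&payload, value, memory_order_relaxed);  /* before the barrier */
    atomic_thread_fence(memory_order_seq_cst);                     /* assumed to become a DMB */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);        /* after the barrier */
}

/* Intended to run on the consuming PE. Returns false if nothing is published yet. */
bool dmb_consumer(uint32_t *out)
{
    if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        return false;
    atomic_thread_fence(memory_order_seq_cst);   /* keeps the payload read after the flag read */
    *out = atomic_load_explicit(&payload, memory_order_relaxed);
    return true;
}

/* Single-threaded walk-through of the protocol, for demonstration only. */
uint32_t demo_dmb_handoff(void)
{
    uint32_t seen = 0;

    atomic_store_explicit(&ready, 0, memory_order_relaxed);
    if (dmb_consumer(&seen))
        return 0;                    /* unexpected: flag set before publishing */
    dmb_producer(42);
    if (!dmb_consumer(&seen))
        return 0;                    /* unexpected: published data not visible */
    return seen;
}
```

In real use dmb_producer and dmb_consumer run on different threads. The demo calls them in sequence on one thread only to show the order of operations.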


Data Synchronization Barrier (DSB)


The DSB instruction is a memory barrier that synchronizes the execution stream with memory accesses.


The DSB instruction takes an <option> argument that specifies the shareability domains and access types to which the 
instruction applies, see Shareability and access limitations on the data barrier operations on page B2-90. 


If the required shareability is Full system then the operation applies to all observers within the system. 


A DSB behaves as a DMB with the same arguments, and also has the additional properties defined in this section. The
PE that executes the DSB instruction is referred to as the executing PE, PEe.


Execution of a DSB: 


• At EL2 ensures that any memory accesses caused by speculative translation table walks from the Non-secure
EL1&0 translation regime have been observed.


• At EL3 ensures that any memory accesses caused by speculative translation table walks from any of the
following translation regimes have been observed:


— The EL2 translation regime.
— The Secure EL1&0 translation regime.
— The Non-secure EL1&0 translation regime.


For more information, see Use of out-of-context translation regimes on page D4-1735.


A DSB completes when all of the following apply:


• All explicit memory accesses that are observed by PEe before the DSB is executed and are of the required
access types, and are from observers in the same required shareability domain as PEe, are complete for the
set of observers in the required shareability domain.


• If the required access types of the DSB is reads and writes, then all cache maintenance instructions and all TLB
maintenance instructions issued by PEe before the DSB are complete for the required shareability domain.


In addition, no instruction that appears in program order after the DSB instruction can execute until the DSB completes.


The pseudocode function for the operation of a DSB is DataSynchronizationBarrier().


See also Memory barriers on page D3-1719.


Shareability and access limitations on the data barrier operations


The DMB and DSB instructions take an argument that specifies:


• The shareability domain over which the instruction must operate. This is one of:


— Full system.
— Outer Shareable.
— Inner Shareable.
— Non-shareable.


• The accesses for which the instruction operates. This is one of:


— Read and write accesses in Group A and Group B.
— Write accesses only in Group A and Group B.
— Read access only in Group A and read and write accesses in Group B.


Note 


This form of a DMB or DSB instruction can be described as a Load-Load/Store barrier. 








Table B2-2 shows how these options are encoded in the <option> field of the instruction: 


Table B2-2 Encoding of the DMB and DSB <option> parameter


Accesses                            Shareability domain
Group A           Group B           Full system  Outer Shareable  Inner Shareable  Non-shareable
Reads and writes  Reads and writes  SY           OSH              ISH              NSH
Writes            Writes            ST           OSHST            ISHST            NSHST
Reads             Reads and writes  LD           OSHLD            ISHLD            NSHLD





See the instruction descriptions for more information:


• DMB on page C6-515.
• DSB on page C6-518.
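

Table B2-2 can be expressed as a small lookup keyed by shareability domain and access type. The following C sketch is an illustration, with hypothetical enum and function names. The mnemonics come from Table B2-2. The 4-bit CRm values returned by barrier_option_crm come from the DMB and DSB instruction descriptions, not from this section, and the enum values are chosen so that CRm is the domain in bits [3:2] and the access type in bits [1:0].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values chosen to match CRm<3:2> (domain) and CRm<1:0> (access type). */
enum barrier_domain { DOM_OSH = 0, DOM_NSH = 1, DOM_ISH = 2, DOM_SY = 3 };
enum barrier_access { ACC_LD = 1, ACC_ST = 2, ACC_ALL = 3 };

/* Returns the <option> mnemonic from Table B2-2. */
const char *barrier_option_name(enum barrier_domain d, enum barrier_access a)
{
    static const char *const names[4][4] = {
        [DOM_OSH] = { [ACC_LD] = "OSHLD", [ACC_ST] = "OSHST", [ACC_ALL] = "OSH" },
        [DOM_NSH] = { [ACC_LD] = "NSHLD", [ACC_ST] = "NSHST", [ACC_ALL] = "NSH" },
        [DOM_ISH] = { [ACC_LD] = "ISHLD", [ACC_ST] = "ISHST", [ACC_ALL] = "ISH" },
        [DOM_SY]  = { [ACC_LD] = "LD",    [ACC_ST] = "ST",    [ACC_ALL] = "SY"  },
    };
    return names[d][a];
}

/* Returns the CRm field value that encodes the same option. */
uint32_t barrier_option_crm(enum barrier_domain d, enum barrier_access a)
{
    return ((uint32_t)d << 2) | (uint32_t)a;
}
```

For example, an Inner Shareable barrier restricted to writes is barrier_option_name(DOM_ISH, ACC_ST), which yields ISHST.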


Note 


ISB also supports an optional limitation argument that can only contain one value that corresponds to full system
operation, see ISB on page C6-532.








Load-Acquire, Store-Release 


ARMv8 provides a set of instructions with Acquire semantics for loads, and Release semantics for stores. See
Load-Acquire/Store-Release on page C3-151. 


For all memory types, these instructions have the following ordering requirements: 


• A Store-Release followed by a Load-Acquire is observed in program order by any observers that are in both:


— The shareability domain of the address accessed by the Store-Release.
— The shareability domain of the address accessed by the Load-Acquire.


• For a Load-Acquire, observers in the shareability domain of the address accessed by the Load-Acquire
observe accesses in the following order:


1. The read caused by the Load-Acquire.


2. Reads and writes caused by loads and stores that appear in program order after the Load-Acquire for
which the shareability of the address accessed by the load or store requires that the observer observes
the access.


There are no additional ordering requirements on loads or stores that appear before the Load-Acquire.


• For a Store-Release, observers in the shareability domain of the address accessed by the Store-Release
observe accesses in the following order:


1. All of the following for which the shareability of the address accessed requires that the observer 
observes the access: 


• Reads and writes caused by loads and stores that appear in program order before the
Store-Release.


• Writes that were observed by the PE executing the Store-Release before it executed the
Store-Release.


2. The write caused by the Store-Release. 


There are no other ordering requirements on loads or stores that appear in program order after the 
Store-Release. 


• A Store-Release instruction is multi-copy atomic when observed with a Load-Acquire instruction.


In addition, for accesses to a memory-mapped peripheral of an arbitrary system-defined size that are defined as any 
type of Device memory accesses, these instructions have the following requirements: 


• A Load-Acquire to an address in the memory-mapped peripheral ensures that all memory accesses using
Device memory types to the same memory-mapped peripheral that are architecturally required to be observed
after the Load-Acquire will arrive at the memory-mapped peripheral after the memory access of the
Load-Acquire.


• A Store-Release to an address in the memory-mapped peripheral ensures that all memory accesses using
Device memory types to the same memory-mapped peripheral that are architecturally required to be observed
before the Store-Release will arrive at the memory-mapped peripheral before the memory access of the
Store-Release.


• If a Load-Acquire to a memory address in the memory-mapped peripheral has observed the value stored to
that address by a Store-Release, then any memory access to the memory-mapped peripheral that is
architecturally required to be ordered before the memory access of the Store-Release will arrive at the
memory-mapped peripheral before any memory access to the same peripheral that is architecturally required
to be ordered after the memory access of the Load-Acquire.


Load-Acquire and Store-Release, other than Load-Acquire Exclusive Pair and Store-Release Exclusive Pair, access
only a single data element. This access is single-copy atomic. The address of the data object must be aligned to the 
size of the data element being accessed, otherwise the access generates an Alignment fault. 


Load-Acquire Exclusive Pair and Store-Release Exclusive Pair access two data elements. The address supplied to 
the instructions must be aligned to twice the size of the element being loaded, otherwise the access generates an 
Alignment fault. 


A Store-Release Exclusive instruction only has the release semantics if the store is successful. 


Note 


• Each Load-Acquire Exclusive and Store-Release Exclusive instruction is essentially a variant of the
equivalent Load-Exclusive or Store-Exclusive instruction. All usage restrictions and single-copy atomicity
properties:


— That apply to the Load-Exclusive instructions also apply to the Load-Acquire Exclusive instructions.
— That apply to the Store-Exclusive instructions also apply to the Store-Release Exclusive instructions.


• The Load-Acquire/Store-Release instructions can remove the requirement to use the explicit DMB memory
barrier instruction.
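

The same flag-based handoff that is often written with an explicit DMB can be expressed with Release and Acquire accesses to the flag. The following C11 sketch is a minimal illustration under stated assumptions: the names are hypothetical, and memory_order_release and memory_order_acquire are assumed to be compiled to Store-Release and Load-Acquire instructions on AArch64, which is typical of current compilers but is not specified by this manual.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t message;                 /* plain data, written before the flag */
static _Atomic uint32_t message_ready;   /* flag accessed with Release/Acquire */

/* Intended to run on the publishing PE. */
void publish(uint32_t value)
{
    message = value;
    /* Release: the earlier write to `message` is observed before this write. */
    atomic_store_explicit(&message_ready, 1, memory_order_release);
}

/* Intended to run on the receiving PE. Returns false if nothing is published yet. */
bool try_receive(uint32_t *out)
{
    /* Acquire: the later read of `message` is observed after this read. */
    if (atomic_load_explicit(&message_ready, memory_order_acquire) == 0)
        return false;
    *out = message;
    return true;
}

/* Single-threaded walk-through of the protocol, for demonstration only. */
uint32_t demo_release_acquire(void)
{
    uint32_t seen = 0;

    atomic_store_explicit(&message_ready, 0, memory_order_relaxed);
    if (try_receive(&seen))
        return 0;                        /* unexpected: flag set before publishing */
    publish(7);
    if (!try_receive(&seen))
        return 0;                        /* unexpected: published data not visible */
    return seen;
}
```

No separate barrier appears in either function. The ordering comes from the Release and Acquire accesses to the flag.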





B2.7.4 Summary of the memory ordering rules 


The following is a concise list of the situations that are required, by the ARM architecture specification, to cause 
externally-visible order of memory. This ordering means that if memory transaction A has externally visible order 
ahead of memory transaction B, then all observers within the shareability domains of A and B will observe A 
before B. See Terms used in the summary of the memory ordering rules for definitions of the terms used.


Note


This list applies to both AArch32 state and AArch64 state, and is consistent with the requirements of ARMv7.








1. DMB and DSB barrier instructions, and load acquire/store release instructions, create externally-visible order as 
defined by those instructions. 


2. A True or False Address dependency from a Load to a Load or from a Load to a Store creates 
externally-visible order. 


3. A True Control dependency from a Load to an ISB instruction creates externally-visible order between the 
load and any memory accesses after the ISB instruction. 


4. A True Register data dependency from a Load to a Store creates externally-visible order.


5. A True Control dependency from a Load to a Store creates externally-visible order.


6. Memory is coherent within the shareability domain of a memory address, which means there is a total order
of all writes to that address that all observers within that shareability domain will agree on.


Note 


A consequence of this is that reads to the same address by the same PE are observed in order. 








7. A Dependency from a Store to a Load through memory between different PEs creates externally-visible order 
but stores are not multi-copy atomic except where explicitly defined to be by the definition of the store. 


Note 


A consequence of the lack of multi-copy atomicity is that a Store to Load dependency through memory on 
the same PE does not create externally-visible order. 








No other effects are required to create externally visible order in the ARM architecture. 


Terms used in the summary of the memory ordering rules


The summary uses the following terms:


Register data dependency 
This is defined in Address dependencies and order on page B2-86. 


False Register data dependency
A False Register data dependency is a Register data dependency where no register in the system
holds a variable for which a change of the first data value causes a change of the second data value.


True Register data dependency
A True Register data dependency is a Register data dependency that is not a False Register data
dependency.


True Address dependency


A True Address dependency between a load and a subsequent memory transaction exists where 
there is a True Register data dependency between the data value returned from the load and the 
address used by the subsequent memory transaction. 


False Address dependency 


A False Address dependency between a load and a subsequent memory transaction exists where 
there is a False Register data dependency between the data value returned from the load and the 
address used by the subsequent memory transaction. 
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True Control dependency 
A True Control dependency between a load and a subsequent instruction exists: 


° Where there is a True Register data dependency between the data value returned from the 
load and data value used in the evaluation of a conditional branch and the subsequent 
instruction is only executed as a result of one of the possible outcomes of that conditional 
branch. 


° Where there is a True Register data dependency between the data value returned from the 
load and the data value used in the evaluation of a subsequent instruction that is a conditional 
selection, move, or computation for which both: 


— The condition is determined by the returned data value. 
— No input data value for the selection, move, or computation has a register data 
dependency on the returned data value. 
Dependency from a Store to a Load through memory
A Dependency from a Store to a Load through memory exists where the Store and Load are to the
same physical address, and the value returned by the Load is the value that was written by the Store,
and could not be the value that was previously held at that memory address.


B2.8 Memory types and attributes

In ARMv8 the ordering of accesses for addresses in memory, referred to as the memory order model, is defined by
the memory attributes. The following sections describe this model:


• Normal memory.
• Device memory on page B2-98.
• Memory access restrictions on page B2-104.











B2.8.1 Normal memory 

The Normal memory type attribute applies to most memory in a system. It indicates that the hardware is permitted
by the architecture to perform speculative data read accesses to these locations, regardless of the access permissions
for these locations.

The Normal memory type has the following properties:

• A write to a memory location with the Normal attribute completes in finite time. This means that it is globally
observed for the shareability domain of the memory location in finite time. For a Non-cacheable location, the
location is observed by all observers in finite time.

• A completed write to a memory location with the Normal attribute is globally observed for the shareability
domain of the memory location in finite time without the need for explicit cache maintenance instructions or
barriers. For a Non-cacheable location, the completed write is globally observed for all observers in finite
time without the need for explicit cache maintenance instructions or barriers.

• Writes to a memory location with the Normal memory attribute that are Non-cacheable must reach the
endpoint for that location in the memory system in finite time.

• Unaligned memory accesses can access Normal memory if the system is configured to generate such
accesses.

• There is no requirement for the memory system beyond the PE to be able to identify the elements accessed
by multi-register Load/Store instructions. See Multi-register loads and stores that access Normal memory on
page B2-98.

Note

• The Normal memory attribute is appropriate for locations of memory that are idempotent, meaning that they
exhibit all of the following properties:

— Read accesses can be repeated with no side-effects.
— Repeated read accesses return the last value written to the resource being read.
— Read accesses can fetch additional memory locations with no side-effects.
— Write accesses can be repeated with no side-effects if the contents of the location accessed are
unchanged between the repeated writes or as the result of an exception, as described in this section.
— Unaligned accesses can be supported.
— Accesses can be merged before accessing the target memory system.

• An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on
page B2-81 might be abandoned as a result of an exception being taken during the sequence of accesses. On
return from the exception the instruction is restarted, and therefore one or more of the memory locations
might be accessed multiple times. This can result in repeated write accesses to a location that has been
changed between the write accesses.

The following sections describe the other attributes for Normal memory: 

• Shareable Normal memory on page B2-95.
• Non-shareable Normal memory on page B2-96.
• Cacheability attributes for Normal memory on page B2-96.

See also:

• Multi-register loads and stores that access Normal memory on page B2-98.
• Atomicity in the ARM architecture on page B2-81.
• Memory barriers on page B2-87. For accesses to Normal memory, a DMB instruction is required to ensure the
required ordering.
• Concurrent modification and execution of instructions on page B2-83.


Shareable Normal memory 


A Normal memory location has a Shareability attribute that is one of: 


• Inner Shareable, meaning it applies across the Inner Shareable shareability domain.

• Outer Shareable, meaning it applies across both the Inner Shareable and the Outer Shareable shareability
domains.

• Non-shareable.


The shareability attributes define the data coherency requirements of the location, that hardware must enforce. They 
do not affect the coherency requirements of instruction fetches, see Synchronization and coherency issues between 
data and instruction accesses on page B2-73. 





Note 


• System designers can use the shareability attribute to specify the locations in Normal memory for which
coherency must be maintained. However, software developers must not assume that specifying a memory
location as Non-shareable permits software to make assumptions about the incoherency of the location
between different PEs in a shared memory system. Such assumptions are not portable between different
multiprocessing implementations that might use the shareability attribute. Any multiprocessing
implementation might implement caches that are shared, inherently, between different processing elements.

• This architecture assumes that all PEs that use the same operating system or hypervisor are in the same Inner
Shareable shareability domain.





Shareable, Inner Shareable, and Outer Shareable Normal memory 
The ARM architecture abstracts the system as a series of Inner and Outer Shareability domains. 


Each Inner Shareability domain contains a set of observers that are data coherent for each member of that set for 
data accesses with the Inner Shareable attribute made by any member of that set. 


Each Outer Shareability domain contains a set of observers that are data coherent for each member of that set for 
data accesses with the Outer Shareable attribute made by any member of that set. 


The following properties also hold: 


• Each observer is only a member of a single Inner Shareability domain.
• Each observer is only a member of a single Outer Shareability domain.
• All observers in an Inner Shareability domain are always members of the same Outer Shareability domain.


This means that an Inner Shareability domain is a subset of an Outer Shareability domain, although it is not
required to be a proper subset.

Note

• Because all data accesses to Non-cacheable locations are data coherent to all observers, Non-cacheable
locations are always treated as Outer Shareable.

• The Inner Shareable domain is expected to be the set of PEs controlled by a single hypervisor or operating
system.





The details of the use of the shareability attributes are system-specific. Example B2-1 shows how they might be 
used. 


Example B2-1 Use of shareability attributes 


In an implementation, a particular subsystem with two clusters of PEs has the requirement that: 


• In each cluster, the data caches or unified caches of the PEs in the cluster are transparent for all data accesses
to memory locations with the Inner Shareable attribute.

• However, between the two clusters, the caches:

— Are not required to be coherent for data accesses that have only the Inner Shareable attribute.
— Are coherent for data accesses that have the Outer Shareable attribute.


In this system, each cluster is in a different shareability domain for the Inner Shareable attribute, but all components 
of the subsystem are in the same shareability domain for the Outer Shareable attribute. 


A system might implement two such subsystems. If the data caches or unified caches of one subsystem are not 
transparent to the accesses from the other subsystem, this system has two Outer Shareable shareability domains. 


Having two levels of shareability means system designers can reduce the performance and power overhead for 
shared memory locations that do not need to be part of the Outer Shareable shareability domain. 
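
The arrangement in Example B2-1 can be expressed as a small model. The following C sketch is illustrative only. The observer structure, the domain identifiers, and the function name are hypothetical, and the sketch assumes cacheable locations and system-wide unique Inner Shareability domain identifiers. It answers only whether hardware is required to keep two observers coherent for a given attribute. It says nothing about whether a location is actually incoherent when coherency is not required.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum shareability { NON_SHAREABLE, INNER_SHAREABLE, OUTER_SHAREABLE };

/* Hypothetical observer descriptor. Each observer belongs to exactly one
 * Inner and one Outer Shareability domain. */
struct observer {
    int id;
    int inner_domain;   /* a cluster, in Example B2-1 */
    int outer_domain;   /* a subsystem, in Example B2-1 */
};

/* Returns true if hardware is required to keep data accesses with the given
 * shareability attribute coherent between observers a and b. */
bool hw_coherency_required(const struct observer *a, const struct observer *b,
                           enum shareability attr)
{
    assert(a != NULL && b != NULL);
    if (a->id == b->id)
        return true;    /* a single observer is coherent with itself */
    switch (attr) {
    case OUTER_SHAREABLE:
        return a->outer_domain == b->outer_domain;
    case INNER_SHAREABLE:
        return a->inner_domain == b->inner_domain;
    default:
        return false;   /* Non-shareable: no hardware requirement */
    }
}
```

With two PEs in one cluster and a third PE in the other cluster of the same subsystem, the model requires coherency between all three for Outer Shareable accesses, but only between the first two for Inner Shareable accesses.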


For shareable Normal memory, the Load-Exclusive and Store-Exclusive synchronization primitives take account of 
the possibility of accesses by more than one observer in the same Shareability domain. 


Non-shareable Normal memory 


For Normal memory locations, the Non-shareable attribute identifies Normal memory that is likely to be accessed 
only by a single PE. 


A location in Normal memory with the Non-shareable attribute does not require the hardware to make data accesses 
by different observers coherent, unless the memory is Non-cacheable. For a Non-shareable location, if other 
observers share the memory system, software must use cache maintenance instructions, if the presence of caches 
might lead to coherency issues when communicating between the observers. This cache maintenance requirement 
is in addition to the barrier operations that are required to ensure memory ordering. 


For Non-shareable Normal memory, it is IMPLEMENTATION DEFINED whether the Load-Exclusive and 
Store-Exclusive synchronization primitives take account of the possibility of accesses by more than one observer. 


Cacheability attributes for Normal memory 


In addition to being Outer Shareable, Inner Shareable or Non-shareable, each region of Normal memory is assigned
a Cacheability attribute that is one of:

• Write-Through Cacheable.
• Write-Back Cacheable.
• Non-cacheable.

Also, for Write-Through Cacheable and Write-Back Cacheable Normal memory regions:

• A region might be assigned cache allocation hints for read and write accesses.

• It is IMPLEMENTATION DEFINED whether the cache allocation hints can have an additional attribute of
Transient or Non-transient.


For more information see Cacheability, cache allocation hints, and cache transient hints on page D3-1695. 
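
As a concrete illustration of how software selects these attributes, the following C sketch builds the attribute byte for a Normal memory region. The 4-bit field values are taken from the MAIR_ELx register description elsewhere in the architecture, not from this section, and the enumerator and function names are hypothetical. The sketch only computes values. It does not write any System register.

```c
#include <assert.h>
#include <stdint.h>

/* 4-bit cacheability fields used in each half of a MAIR_ELx attribute byte
 * for Normal memory. The low two bits of the cacheable encodings are the
 * read-allocate and write-allocate hints. */
enum {
    NORMAL_NC       = 0x4,  /* Non-cacheable */
    NORMAL_WT_RA_WA = 0xB,  /* Write-Through, Non-transient, read and write allocate */
    NORMAL_WB_RA_WA = 0xF   /* Write-Back, Non-transient, read and write allocate */
};

/* Combine the outer field (bits [7:4]) and inner field (bits [3:0]). */
uint8_t mair_normal_attr(uint8_t outer, uint8_t inner)
{
    return (uint8_t)(((outer & 0xFu) << 4) | (inner & 0xFu));
}

/* Return a copy of 'mair' with attribute slot 'index' (0 to 7) replaced. */
uint64_t mair_set_attr(uint64_t mair, unsigned index, uint8_t attr)
{
    assert(index < 8);
    unsigned shift = index * 8u;
    mair &= ~((uint64_t)0xFF << shift);
    return mair | ((uint64_t)attr << shift);
}
```

Two regions whose attribute bytes differ only in the allocation hint bits behave identically apart from performance, which is consistent with the treatment of cache allocation hints in this section.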


A memory location can be marked as having different cacheability attributes, for example when using aliases in a 
virtual to physical address mapping: 


• If the attributes differ only in the cache allocation hint this does not affect the behavior of accesses to that
location.
• For other cases see Mismatched memory attributes on page B2-105.


The cacheability attributes provide a mechanism of coherency control with observers that lie outside the shareability 
domain of a region of memory. In some cases, the use of Write-Through Cacheable or Non-cacheable regions of 
memory might provide a better mechanism for controlling coherency than the use of hardware coherency 
mechanisms or the use of cache maintenance routines. To this end, the architecture requires the following properties 
for Non-cacheable or Write-Through Cacheable memory: 


• A completed write to a memory location that is Non-cacheable or Write-Through Cacheable for a level of
cache made by an observer accessing the memory system inside the level of cache is visible to all observers
accessing the memory system outside the level of cache without the need of explicit cache maintenance.

• A completed write to a memory location that is Non-cacheable for a level of cache made by an observer
accessing the memory system outside the level of cache is visible to all observers accessing the memory
system inside the level of cache without the need of explicit cache maintenance.

• For accesses to Normal memory that is Non-cacheable, a DMB instruction ensures that all members of Group
A reach a single peripheral or block of memory, of IMPLEMENTATION DEFINED size, before any member of
Group B, where:

— The definition of the operation of a DMB instruction defines Group A and Group B, see Data Memory
Barrier (DMB) on page B2-88.

— The IMPLEMENTATION DEFINED size of the single peripheral or block of memory is defined by the
peripheral or block of memory.


This applies for all types of DMB instruction. 





Note 


Implementations can use the cache allocation hints to indicate a probable performance benefit of caching. For 
example, a programmer might know that a piece of memory is not going to be accessed again and would be better 
treated as Non-cacheable. The distinction between memory regions with attributes that differ only in the cache 
allocation hints exists only as a hint for performance. 





For Normal memory, the ARM architecture provides cacheability attributes that are defined independently for each
of two conceptual levels of cache, the inner and the outer cache. The relationship between these conceptual levels
of cache and the implemented physical levels of cache is IMPLEMENTATION DEFINED, and can differ from the
boundaries between the Inner and Outer Shareability domains. However:

• Inner refers to the innermost caches, meaning the caches that are closest to the PE, and always includes the
lowest level of cache.

• No cache that is controlled by the Inner cacheability attributes can lie outside a cache that is controlled by the
Outer cacheability attributes.

• An implementation might not have any outer cache.

Example B2-2 on page B2-98, Example B2-3 on page B2-98, and Example B2-4 on page B2-98 describe the
possible ways of implementing a system with three levels of cache, level 1 (L1) to level 3 (L3).

Note

• L1 cache is the level closest to the PE, see Memory hierarchy on page B2-71.

• When managing coherency, system designs must consider both the inner and outer cacheability attributes, as
well as the shareability attributes. This is because hardware might have to manage the coherency of caches
at one conceptual level, even when another conceptual level has the Non-cacheable attribute.





Example B2-2 Implementation with two inner and one outer cache levels 


Implement the three levels of cache in the system, L1 to L3, with: 
• The Inner cacheability attribute applied to L1 and L2 cache.
• The Outer cacheability attribute applied to L3 cache.


Example B2-3 Implementation with three inner and no outer cache levels 


Implement the three levels of cache in the system, L1 to L3, with the Inner cacheability attribute applied to L1, L2, 
and L3 cache. Do not use the Outer cacheability attribute. 


Example B2-4 Implementation with one inner and two outer cache levels 


Implement the three levels of cache in the system, L1 to L3, with: 
• The Inner cacheability attribute applied to L1 cache.
• The Outer cacheability attribute applied to L2 and L3 cache.
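
The three example implementations all satisfy the constraints on the conceptual inner and outer cache levels. The following C sketch is a minimal check of those constraints for a three-level system. It is illustrative only, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cache_domain { CACHE_INNER, CACHE_OUTER };

#define NUM_LEVELS 3    /* L1 to L3, as in Examples B2-2 to B2-4 */

/* levels[0] describes L1, the level closest to the PE; levels[2] describes L3.
 * Returns true if the assignment respects both constraints:
 *  - Inner always includes the lowest level of cache.
 *  - No inner cache lies outside an outer cache. */
bool cache_map_valid(const enum cache_domain levels[NUM_LEVELS])
{
    assert(levels != NULL);
    if (levels[0] != CACHE_INNER)
        return false;
    for (int i = 1; i < NUM_LEVELS; i++) {
        if (levels[i - 1] == CACHE_OUTER && levels[i] == CACHE_INNER)
            return false;
    }
    return true;
}
```

Under these constraints the only valid assignments for three levels are the three shown in the examples. Any assignment that places an outer level closer to the PE than an inner level is rejected.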


Multi-register loads and stores that access Normal memory 


For all instructions that load or store more than one general-purpose register from an Exception level there is no 
requirement for the memory system beyond the PE to be able to identify the size of the elements accessed by these 
load or store instructions. 


For all instructions that load or store more than one general-purpose register from an Exception level the order in 
which the registers are accessed is not defined by the architecture. 


For all instructions that load or store one or more SIMD and floating-point registers from an Exception level there is
no requirement for the memory system beyond the PE to be able to identify the size of the element accessed by these
load or store instructions.


B2.8.2 Device memory 


The Device memory type attributes define memory locations where an access to the location can cause side-effects, 
or where the value returned for a load can vary depending on the number of loads performed. Typically, the Device 
memory attributes are used for memory-mapped peripherals and similar locations. 


The attributes for ARMv8 Device memory are:

Gathering       Identified as G or nG, see Gathering on page B2-101.

Reordering      Identified as R or nR, see Reordering on page B2-102.

Early Write Acknowledgement hint
                Identified as E or nE, see Early Write Acknowledgement on page B2-103.

The ARMv8 Device memory types are:

Device-nGnRnE   Device non-Gathering, non-Reordering, No Early Write Acknowledgement.
                Equivalent to the Strongly-ordered memory type in earlier versions of the architecture.

Device-nGnRE    Device non-Gathering, non-Reordering, Early Write Acknowledgement.
                Equivalent to the Device memory type in earlier versions of the architecture.

Device-nGRE     Device non-Gathering, Reordering, Early Write Acknowledgement.
                ARMv8 adds this memory type to the translation table formats found in earlier versions of
                the architecture. The use of barriers is required to order accesses to Device-nGRE memory.

Device-GRE      Device Gathering, Reordering, Early Write Acknowledgement.
                ARMv8 adds this memory type to the translation table formats found in earlier versions of
                the architecture. Device-GRE memory has the fewest constraints. It behaves similarly to
                Normal memory, with the restriction that speculative accesses to Device-GRE memory are
                forbidden.


Collectively these are referred to as any Device memory type. Going down the list, the memory types are described
as getting weaker; conversely, going up the list, the memory types are described as getting stronger.


Note

• As the list of types shows, these additional attributes are hierarchical. For example, a memory location that
permits Gathering must also permit Reordering and Early Write Acknowledgement.

• The architecture does not require an implementation to distinguish between each of these memory types and
ARM recognizes that not all implementations will do so. The subsection that describes each of the attributes
describes the implementation rules for the attribute.

• Earlier versions of the ARM architecture defined the following memory types:

— Strongly-ordered memory. This is the equivalent of the Device-nGnRnE memory type.
— Device memory. This is the equivalent of the Device-nGnRE memory type.
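
The hierarchy of the Device attributes can be captured in a few lines of C. The following sketch is illustrative only. The structure and function names are hypothetical, and the attribute byte values are taken from the MAIR_ELx register description elsewhere in the architecture, not from this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a Device memory type. */
struct device_attr {
    bool gathering;   /* G if true, nG if false */
    bool reordering;  /* R if true, nR if false */
    bool early_ack;   /* E if true, nE if false */
};

/* The attributes are hierarchical: G requires R, and R requires E. */
bool device_attr_valid(struct device_attr a)
{
    if (a.gathering && !a.reordering)
        return false;
    if (a.reordering && !a.early_ack)
        return false;
    return true;
}

/* Number of relaxations permitted: 0 for Device-nGnRnE (strongest),
 * 3 for Device-GRE (weakest). */
int device_attr_weakness(struct device_attr a)
{
    return (int)a.gathering + (int)a.reordering + (int)a.early_ack;
}

/* MAIR_ELx attribute byte for a valid Device type: 0b0000dd00. */
uint8_t device_attr_mair(struct device_attr a)
{
    assert(device_attr_valid(a));
    return (uint8_t)(device_attr_weakness(a) << 2);
}
```

Because only four of the eight flag combinations are valid, counting the permitted relaxations is enough to order the types from strongest to weakest.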





All of these memory types have the following properties:

• Speculative data accesses are not permitted to any memory location with any Device memory attribute. This
means that each memory access to any Device memory type must be one that would be generated by a simple
sequential execution of the program.

The following exceptions to this apply:

— Reads generated by the SIMD and floating-point instructions can access bytes that are not explicitly
accessed by the instruction if the bytes accessed are in a 16-byte window, aligned to 16 bytes, that
contains at least one byte that is explicitly accessed by the instruction.

— For Device memory with the Gathering attribute, reads generated by the LDNP instructions are
permitted to access bytes that are not explicitly accessed by the instruction, provided that the bytes
accessed are in a 128-byte window, aligned to 128 bytes, that contains at least one byte that is
explicitly accessed by the instruction.

— Where a load or store instruction performs a sequence of memory accesses, as opposed to one
single-copy atomic access as defined in the rules for single-copy atomicity, these accesses might occur
multiple times as a result of executing the load or store instruction. See Properties of single-copy
atomic accesses on page B2-82.

Note

— An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture
on page B2-81 might be abandoned as a result of an exception being taken during the sequence of
accesses. On return from the exception the instruction is restarted, and therefore one or more of the
memory locations might be accessed multiple times. This can result in repeated accesses to a location
where the program only defines a single access. For this reason, ARM strongly recommends that no
accesses to Device memory are performed from a single instruction that spans the boundary of a
translation granule or that in some other way could lead to some of the accesses being aborted.

— Write speculation that is visible to other observers is prohibited for all memory types.





• A write to a memory location with any Device memory attribute completes in finite time. This means that it
is globally observed for all observers in the system in finite time.

• If a location with any Device memory attribute changes without an explicit write by an observer, this change
must also be globally observed for all observers in the system in finite time. Such a change might occur in a
peripheral location that holds status information.

• A completed write to a memory location with any Device memory attribute is globally observed for all
observers in finite time without the need for explicit maintenance.

• Data accesses to memory locations are coherent for all observers in the system, and correspondingly are
treated as being Outer Shareable.

• A memory location with any Device memory attribute cannot be allocated into a cache.

• Writes to a memory location with any Device memory attribute must reach the endpoint for that address in
the memory system in finite time. Typically, the endpoint is a peripheral or some physical memory.

• For accesses to any Device memory type, a DMB instruction ensures that all members of Group A reach a single
peripheral or block of memory, of IMPLEMENTATION DEFINED size, before any member of Group B, where:

— The definition of the operation of a DMB instruction defines Group A and Group B, see Data Memory
Barrier (DMB) on page B2-88.

— The IMPLEMENTATION DEFINED size of the single peripheral or block of memory is defined by the
peripheral or block of memory.

This applies for all types of DMB instruction.

• All accesses to memory with any Device memory attribute must be aligned. Any unaligned access generates
an Alignment fault at the first stage of translation that defined the location as being Device.


Note 


In the Non-secure EL1&0 translation regime in systems where HCR_EL2.TGE==1 and HCR_EL2.DC==0, 
any Alignment fault that results from the fact that all locations are treated as Device is a fault at the first stage 
of translation. This causes ESR_EL2.ISS[24] to be 0. 








• Hardware does not prevent speculative instruction fetches from a memory location with any of the Device
memory attributes unless the memory location is also marked as Execute-never for all Exception levels.


Note 


This means that to prevent speculative instruction fetches from memory locations with Device memory 
attributes, any location that is assigned any Device memory type must also be marked as Execute-never for 
all Exception levels. Failure to mark a memory location with any Device memory attribute as Execute-never 
for all Exception levels is a programming error. 








See also Memory access restrictions on page B2-104. 


The memory types for Translation table walks cannot be defined as any Device memory type within the TCR_ELx.
For the Non-secure EL1&0 translation regime, the memory accesses made during a stage 1 translation table walk
are subject to a stage 2 translation, and as a result of this second stage of translation, the accesses from the first stage
translation table walk might be made to memory locations with any Device memory type. These accesses might be
made speculatively. When the value of the HCR_EL2.PTW bit is 1, a stage 2 permission fault is generated if a first
stage translation table walk is made to any Device memory type.

Note

In general, making a translation table walk to any Device memory type is the result of a programming error.





For instruction fetches, if branches cause the program counter to point to an area of memory with the Device
attribute that is not marked as Execute-never for the current Exception level, an implementation can either:

• Treat the instruction fetch as if it were to a memory location with the Normal Non-cacheable attribute.
• Take a Permission fault.

Gathering 

In the Device memory attribute: 

G Indicates that the location has the Gathering attribute. 

nG Indicates that the location does not have the Gathering attribute, meaning it is non-Gathering. 


The Gathering attribute determines whether it is permissible for either: 


• Multiple memory accesses of the same type, read or write, to the same memory location to be merged into a
single transaction.

• Multiple memory accesses of the same type, read or write, to different memory locations to be merged into
a single memory transaction on an interconnect.

Note

This also applies to writebacks from the cache, whether caused by a natural eviction or as a result of a cache
maintenance instruction.





For memory types with the Gathering attribute, either of these behaviors is permitted, provided that the ordering and 
coherency rules of the memory location are followed. 


For memory types with the non-Gathering attribute, neither of these behaviors is permitted. As a result: 


• The number of memory accesses that are made corresponds to the number that would be generated by a
simple sequential execution of the program.

• All accesses occur at their programmed size, except that there is no requirement for the memory system
beyond the PE to be able to identify the elements accessed by multi-register Load/Store instructions. See
Multi-register loads and stores that access Device memory on page B2-103.


Gathering between memory accesses separated by a memory barrier that affects those memory accesses is not 
permitted. This applies if one memory access is in Group A and one memory access is in Group B. That is, gathering 
is not permitted between a memory access in Group A and a memory access in Group B if the two accesses are 
separated by a barrier that affects at least one of the accesses. 


Gathering between two memory accesses generated by a Load-Acquire/Store-Release is not permitted. 


A read from a memory location with the non-Gathering attribute cannot come from a cache or a buffer, but must 
come from the endpoint for that address in the memory system. Typically this is a peripheral or physical memory. 





Note

• A read from a memory location with the Gathering attribute can come from intermediate buffering of a
previous write, provided that:

— The accesses are not separated by a DMB or DSB barrier that affects both of the accesses, for example if
one access is in Group A and the other is in Group B.

— The accesses are not separated by other ordering constructions that require that the accesses are in
order. Such a construction might be a combination of Load-Acquire and Store-Release.

— The accesses are not generated by a Store-Release instruction.

• The ARM architecture only defines programmer-visible behavior. Therefore, gathering can be performed if
a programmer cannot tell whether gathering has occurred.


An implementation is permitted to perform an access with the Gathering attribute in a manner consistent with the
requirements specified by the non-Gathering attribute.

An implementation is not permitted to perform an access with the non-Gathering attribute in a manner consistent
with the relaxations allowed by the Gathering attribute.
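
The effect of the Gathering attribute on the number of transactions can be illustrated with a toy model. The following C sketch is not a description of any real interconnect. The types and the function name are hypothetical, it models only the merging of consecutive accesses of the same type to the same location, and it treats every barrier in the sequence as one that affects the accesses on both sides of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum op { OP_READ, OP_WRITE, OP_BARRIER };

struct access {
    enum op op;
    uint64_t addr;   /* ignored for OP_BARRIER */
};

/* Counts the transactions that result if every permitted merge of consecutive
 * same-type accesses to the same location is performed. With gathering set to
 * false, the count equals the number of accesses in the program. */
size_t count_transactions(const struct access *prog, size_t n, bool gathering)
{
    assert(prog != NULL || n == 0);
    size_t count = 0;
    bool have_prev = false;
    struct access prev = { OP_BARRIER, 0 };

    for (size_t i = 0; i < n; i++) {
        if (prog[i].op == OP_BARRIER) {
            have_prev = false;      /* no gathering across the barrier */
            continue;
        }
        bool merge = gathering && have_prev &&
                     prev.op == prog[i].op && prev.addr == prog[i].addr;
        if (!merge)
            count++;
        prev = prog[i];
        have_prev = true;
    }
    return count;
}
```

For a sequence of two writes, a barrier, one write, and two reads, all to the same location, the model yields five transactions for a non-Gathering location and three for a Gathering location. An implementation is always free to perform fewer merges than this model assumes.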


Reordering 
In the Device memory attribute: 


R Indicates that the location has the Reordering attribute. Accesses to the location can be reordered 
within the same rules that apply to accesses to Normal Non-cacheable memory. All memory types 
with the Reordering attribute have the same ordering rules as accesses to Normal Non-cacheable 
memory, see Memory ordering on page B2-84. 


nR Indicates that the location does not have the Reordering attribute, meaning it is non-Reordering. 


Note

Some interconnect fabrics, such as PCIe, perform very limited re-ordering, which is not important
for the software usage. It is outside the scope of the ARM architecture to prohibit the use of a
non-Reordering memory type with these interconnects.





For all memory types with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral 
of IMPLEMENTATION DEFINED size, as defined by the peripheral, must be the same order that occurs in a simple 
sequential execution of the program. That is, the accesses appear in program order. This ordering applies to all 
accesses using any of the memory types with the non-Reordering attribute. As a result, if there is a mixture of 
Device-nGnRE and Device-nGnRnE accesses to the same peripheral, these occur in program order. If the memory 
accesses are not to a peripheral, then this attribute imposes no restrictions. 
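
The ordering requirement can be stated as a check on the order in which accesses arrive. The following C sketch is illustrative only. The function name and the representation of a peripheral as an aligned block of block_size bytes are assumptions made for the example, since the size of a peripheral is IMPLEMENTATION DEFINED. The sketch assumes that every access in the arrays has the non-Reordering attribute and that arrival is a permutation of the indices 0 to n-1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* addr[i] is the address of the i-th access in program order.
 * arrival[k] is the program-order index of the k-th access to arrive.
 * Returns true if, for every pair of accesses to the same peripheral,
 * the arrival order matches program order. */
bool arrival_order_ok(const uint64_t *addr, const size_t *arrival, size_t n,
                      uint64_t block_size)
{
    assert(block_size != 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            size_t first = arrival[i];    /* arrived earlier */
            size_t second = arrival[j];   /* arrived later */
            bool same_peripheral =
                addr[first] / block_size == addr[second] / block_size;
            if (same_peripheral && first > second)
                return false;
        }
    }
    return true;
}
```

Accesses to different peripherals can arrive in any relative order without violating the check, which matches the statement that the attribute imposes no additional ordering between different peripherals.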


Note 


• The IMPLEMENTATION DEFINED size of the single peripheral is the same as applies for the ordering guarantee
provided by the DMB instruction.

• The ARM architecture only defines programmer-visible behavior. Therefore, reordering can be performed if
a programmer cannot tell whether reordering has occurred.





An implementation: 


• Is permitted to perform an access with the Reordering attribute in a manner consistent with the requirements
specified by the non-Reordering attribute.

• Is not permitted to perform an access with the non-Reordering attribute in a manner consistent with the
relaxations allowed by the Reordering attribute.


The non-Reordering attribute does not require any additional ordering, other than that which applies to Normal 
memory, between: 


• Accesses with the non-Reordering attribute and accesses with the Reordering attribute.
• Accesses with the non-Reordering attribute and accesses to Normal memory.
• Accesses with the non-Reordering attribute and accesses to different peripherals of IMPLEMENTATION
DEFINED size.

The non-Reordering attribute has no effect on the ordering of cache maintenance instructions, even if the memory
location specified in the instruction has the non-Reordering attribute.


Early Write Acknowledgement

In the Device memory attribute:

E       Indicates that the location has the Early Write Acknowledgement attribute.

nE      Indicates that the location has the No Early Write Acknowledgement attribute.


Early Write Acknowledgement is a hint to the platform memory system. Assigning the No Early Write 
Acknowledgement attribute to a Device memory location recommends that only the endpoint of the write access 
returns a write acknowledgement of the access, and that no earlier point in the memory system returns a write 
acknowledge. This means that a DSB barrier, executed by the PE that performed the write to the No Early Write 
Acknowledgement location, completes only after the write has reached its endpoint in the memory system. 
Typically, this endpoint is a peripheral or physical memory. 


When the Early Write Acknowledgement attribute is assigned to a Device memory location, there is no such 
recommendation for the handling of accesses to that location. 





Note 


• The Early Write Acknowledgement hint has no effect on the ordering rules. The purpose of signaling No Early 
Write Acknowledgement is to signal to the interconnect that the peripheral requires the ability to signal the 
acknowledgement. The No Early Write Acknowledgement signal also provides an additional semantic that can be 
interpreted by the driver that is accessing the peripheral. 

• This attribute is treated as a hint, as the exact nature of the interconnects accessed by a PE is outside the scope 
of the ARM architecture definition, and not all interconnects provide a mechanism to ensure that a write has 
reached the physical endpoint of the memory system. 

• ARM recommends that writes with the No Early Write Acknowledgement hint are used for PCIe 
configuration writes. However, the mechanisms by which PCIe configuration writes are identified are 
IMPLEMENTATION DEFINED. 

• ARM strongly recommends that the Early Write Acknowledgement hint is not ignored by a PE, but is made 
available for use by the system. 





Because the No Early Write Acknowledgement attribute is a hint: 


• An implementation is permitted to perform an access with the Early Write Acknowledgement attribute in a 
manner consistent with the requirements specified by the No Early Write Acknowledgement attribute. 

• An implementation is permitted to perform an access with the No Early Write Acknowledgement attribute in 
a manner consistent with the relaxations allowed by the Early Write Acknowledgement attribute. 


Multi-register loads and stores that access Device memory 


For all instructions that load or store more than one general-purpose register from an Exception level there is no 
requirement for the memory system beyond the PE to be able to identify the size of the elements accessed by these 
load or store instructions. 


For all instructions that load or store more than one general-purpose register from an Exception level the order in 
which the registers are accessed is not defined by the architecture. This applies even to accesses to any type of 
Device memory. 


For all instructions that load or store one or more floating-point and SIMD registers from an Exception level there is 
no requirement for the memory system beyond the PE to be able to identify the size of the element accessed by these 
load or store instructions, even for access to any type of Device memory. 

B2.8.3 Memory access restrictions 


The following restrictions apply to memory accesses: 


• For accesses to any two bytes, p and q, that are generated by the same instruction: 

— The bytes p and q must have the same memory type and shareability attributes, otherwise the results 
are CONSTRAINED UNPREDICTABLE. For example, an LD1, ST1, or an unaligned load or store that spans 
the boundary between Normal memory and Device memory is CONSTRAINED UNPREDICTABLE. 

— Except for possible differences in the cache allocation hints, ARM deprecates having different 
cacheability attributes for bytes p and q. 

For the permitted CONSTRAINED UNPREDICTABLE behavior, see Crossing a page boundary with different 
memory types or Shareability attributes on page K1-5482. 

• If the accesses of an instruction that causes multiple accesses to any type of Device memory cross an address 
boundary that corresponds to the smallest implemented translation granule then behavior is CONSTRAINED 
UNPREDICTABLE, and Crossing a peripheral boundary with a Device access on page K1-5483 describes the 
permitted behaviors. For this reason, it is important that an access to a volatile memory device is not made 
using a single instruction that crosses an address boundary of the size of the smallest implemented translation 
granule. 
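
The boundary condition described above can be checked with simple address arithmetic. The following is a minimal sketch, not part of the architecture: the function name `crosses_granule_boundary` is hypothetical, and the default granule of 4KB is only an assumption, because the smallest implemented translation granule depends on the implementation.

```python
def crosses_granule_boundary(addr, size, granule=4096):
    """Return True if the byte range [addr, addr + size) spans more than one
    granule-aligned region.

    granule is an assumed value for the smallest implemented translation
    granule, in bytes. It must be a power of two.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if granule <= 0 or granule & (granule - 1):
        raise ValueError("granule must be a power of two")
    first_region = addr // granule               # region holding the first byte
    last_region = (addr + size - 1) // granule   # region holding the last byte
    return first_region != last_region
```

For example, under the 4KB assumption an 8-byte access at 0xFFC touches bytes on both sides of 0x1000 and is reported as crossing, whereas the same access at 0xFF8 is not.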


Note 


— The boundary referred to is between two Device memory regions that are both of the size of the 
smallest implemented translation granule and aligned to the size of the smallest implemented 
translation granule. 

— This restriction means it is important that an access to a volatile memory device is not made using a 
single instruction that crosses an address boundary of the size of the smallest implemented translation 
granule. 

— ARM expects this restriction to constrain the placing of volatile memory devices in the system 
memory map, rather than expecting a compiler to be aware of the alignment of memory accesses. 

Mismatched memory attributes 


Memory attributes are controlled by privileged software. For more information, see Chapter D4 The AArch64 
Virtual Memory System Architecture. 


Physical memory locations are accessed with mismatched attributes if all accesses to the location do not use a 
common definition of all of the following attributes of that location: 


• Memory type, Device or Normal. 
• Shareability. 
• Cacheability, for the same level of the inner or outer cache, but excluding any cache allocation hints. 


Collectively these are referred to as memory attributes. 


Note 


The terms location and memory location refer to any byte within the current coherency granule and are used 
interchangeably. 








When a memory location is accessed with mismatched attributes the only software visible effects are one or more 
of the following: 

• Uniprocessor semantics for reads and writes to that memory location might be lost. This means: 

— A read of the memory location by one agent might not return the value most recently written to that 
memory location by the same agent. 

— Multiple writes to the memory location by one agent with different memory attributes might not be 
ordered in program order. 

• There might be a loss of coherency when multiple agents attempt to access a memory location. 

• There might be a loss of properties derived from the memory type, as described in later bullets in this section. 

• If all Load-Exclusive/Store-Exclusive instructions executed across all threads to access a given memory 
location do not use consistent memory attributes, the exclusive monitor state becomes UNKNOWN. 

• Bytes written without the Write-Back cacheable attribute within the same Write-Back granule as bytes 
written with the Write-Back cacheable attribute might have their values reverted to the old values as a result 
of cache Write-Back. 


The loss of properties associated with mismatched memory type attributes refers only to the following properties of 
Device memory that are additional to the properties of Normal memory: 


• Prohibition of speculative read accesses. 
• Prohibition on Gathering. 
• Prohibition on Reordering. 


For the following situations, when a physical memory location is accessed with mismatched attributes, a more 
restrictive set of behaviors applies. The description of each situation also describes the behaviors that apply: 


1. If the only memory type mismatch associated with a memory location across all users of the memory location 
is between different types of Device memory, then all accesses might take the properties of the weakest 
Device memory type. 


2. Any agent that reads that memory location using the same common definition of the shareability and 
cacheability attributes is guaranteed to access it coherently, to the extent required by that common definition 
of the memory attributes, only if all of the following conditions are met: 


• All aliases to the memory location with write permission both use a common definition of the 
shareability and cacheability attributes for the memory location, and either: 

— Have the inner cacheability attribute the same as the outer cacheability attribute. 
— In the Non-secure EL1&0 translation regime, have HCR_EL2.MIOCNCE set to 0. 

• All aliases to a memory location use a definition of the shareability attributes that encompasses all the 
agents with permission to access the location. 

3. The possible software-visible effects caused by mismatched attributes for a memory location are defined 
more precisely if all of the mismatched attributes define the memory location as one of: 


° Any Device memory type. 


° Normal Inner Non-cacheable, Outer Non-cacheable memory. 


In these cases, the only permitted software-visible effects of the mismatched attributes are one or more of the 
following: 


° Possible loss of properties derived from the memory type when multiple agents attempt to access the 
memory location. 


° Possible reordering of memory transactions to the same memory location with different memory 
attributes, potentially leading to a loss of coherency or uniprocessor semantics. Any possible loss of 
coherency or uniprocessor semantics can be avoided by inserting DMB barrier instructions between 
accesses to the same memory location that might use different attributes. 


Where there is a loss of the uniprocessor semantics, ordering, or coherency, the following approaches can be used: 


1. If the mismatched attributes for a memory location all assign the same shareability attribute to the location, 
any loss of uniprocessor semantics, ordering, or coherency within a shareability domain can be avoided by 
use of software cache management. To do so, software must use the techniques that are required for the 
software management of the ordering or coherency of cacheable locations between agents in different 
shareability domains. This means: 


• Before writing to a location not using the Write-Back attribute, software must invalidate, or clean, a 
location from the caches if any agent might have written to the location with the Write-Back attribute. 
This avoids the possibility of overwriting the location with stale data. 

• After writing to a location with the Write-Back attribute, software must clean the location from the 
caches, to make the write visible to external memory. 

• Before reading the location with a cacheable attribute, software must invalidate the location from the 
caches, to ensure that any value held in the caches reflects the last value made visible in external 
memory. 

• Executing a DMB barrier instruction, with scope that applies to the common shareability of the accesses, 
between any accesses to the same memory location that use different attributes. 


In all cases: 


• Location refers to any byte within the current coherency granule. 

• A clean and invalidate instruction can be used instead of a clean instruction, or instead of an invalidate 
instruction. 

• In the sequences outlined in this section, all cache maintenance instructions and memory transactions 
must be completed, or ordered by the use of barrier operations, if they are not naturally ordered by the 
use of a common address, see Ordering and completion of data and instruction cache instructions on 
page D3-1709. 


Note 


With software management of coherency, race conditions can cause loss of data. A race condition occurs 
when different agents write simultaneously to bytes that are in the same location, and the invalidate, write, 
clean sequence of one agent overlaps with the equivalent sequence of another agent. A race condition also 
occurs if the first operation of either sequence is a clean, rather than an invalidate. 





2. If the mismatched attributes for a location mean that multiple cacheable accesses to the location might be 
made with different shareability attributes, then uniprocessor semantics, ordering, and coherency are 
guaranteed only if: 

• Each PE that accesses the location with a cacheable attribute performs a clean and invalidate of the 
location before and after accessing that location. 

• A DMB barrier with scope that covers the full shareability of the accesses is placed between any accesses 
to the same memory location that use different attributes. 

Note 


The Note in rule 1 of this list, about possible race conditions, also applies to this rule. 





In addition, if multiple agents attempt to use Load-Exclusive or Store-Exclusive instructions to access a location, 
and the accesses from the different agents have different memory attributes associated with the location, the 
exclusive monitor state becomes UNKNOWN. 


ARM strongly recommends that software does not use mismatched attributes for aliases of the same location. An 
implementation might not optimize the performance of a system that uses mismatched aliases. 

B2.10 Synchronization and semaphores 


ARMv8 provides non-blocking synchronization of shared memory, using synchronization primitives. The 
information in this section about memory accesses by synchronization primitives applies to accesses to both Normal 
memory and to any type of Device memory. 


Note 


Use of the ARMv8 synchronization primitives scales for multiprocessing system designs. 








Table B2-3 shows the synchronization primitives and the associated CLREX instruction. 


Table B2-3 Synchronization primitives and associated instruction 

Function                  A64 instruction 

Load-Exclusive 
    Byte                  LDXRB, LDAXRB 
    Halfword              LDXRH, LDAXRH 
    Register (a)          LDXR, LDAXR 
    Pair (a)              LDXP, LDAXP 

Store-Exclusive 
    Byte                  STXRB, STLXRB 
    Halfword              STXRH, STLXRH 
    Register (a)          STXR, STLXR 
    Pair (a)              STXP, STLXP 

Clear-Exclusive           CLREX 

a. The instruction operates on a doubleword if accessing an X register, or on a word if accessing a W register. 


The model for the use of a Load-Exclusive/Store-Exclusive instruction pair accessing a non-aborting memory 
address x is: 


• The Load-Exclusive instruction reads a value from memory address x. 

• The corresponding Store-Exclusive instruction succeeds in writing back to memory address x only if no other 
observer, process, or thread has performed a more recent store to address x. The Store-Exclusive instruction 
returns a status bit that indicates whether the memory write succeeded. 


A Load-Exclusive instruction marks a small block of memory for exclusive access. The size of the marked block is 
IMPLEMENTATION DEFINED, see Marking and the size of the marked memory block on page B2-115. A 
Store-Exclusive instruction to any address in the marked block clears the marking. 
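
The usage model above is normally wrapped in a retry loop: load the value exclusively, compute the new value, and repeat if the Store-Exclusive reports failure. The following is a minimal sketch of that loop over a toy single-word memory. The class `ToyExclusiveMemory`, the function `atomic_increment`, and the `interfere` hook are hypothetical names used only for illustration, and the toy ignores the marked-block size and the IMPLEMENTATION DEFINED behaviors described later in this section.

```python
class ToyExclusiveMemory:
    """One memory word with a reservation per observer (toy model only)."""

    def __init__(self, value=0):
        self.value = value
        self._reserved = set()   # observers that currently hold a marking

    def load_exclusive(self, observer):
        """Read the word and mark it for exclusive access by observer."""
        self._reserved.add(observer)
        return self.value

    def store_exclusive(self, observer, new_value):
        """Return the status value: 0 if the store succeeded, 1 if it failed."""
        if observer not in self._reserved:
            return 1
        self.value = new_value
        self._reserved.clear()   # a successful store clears every marking
        return 0


def atomic_increment(mem, observer, interfere=None):
    """Retry until the Store-Exclusive succeeds. Return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        old = mem.load_exclusive(observer)
        if interfere is not None:
            interfere(mem, attempts)   # hook that lets another observer intervene
        if mem.store_exclusive(observer, old + 1) == 0:
            return attempts
```

If another observer stores to the word between the Load-Exclusive and the Store-Exclusive, the first attempt fails with status 1 and the loop repeats using the newly observed value.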


Note 


In this section, the term PE includes any observer that can generate a Load-Exclusive or a Store-Exclusive 
instruction. 








The following sections give more information: 

• Exclusive access instructions and Non-shareable memory locations on page B2-109. 
• Exclusive access instructions and Shareable memory locations on page B2-111. 
• Marking and the size of the marked memory block on page B2-115. 
• Context switch support on page B2-115. 
• Load-Exclusive and Store-Exclusive instruction usage restrictions on page B2-115. 
• Use of WFE and SEV instructions by spin-locks on page B2-118. 


Exclusive access instructions and Non-shareable memory locations 


For memory locations for which the shareability attribute is Non-shareable, the exclusive access instructions rely 
on a local monitor that marks any address from which the PE executes a Load-Exclusive instruction. Any 
non-aborted attempt by the same PE to use a Store-Exclusive instruction to modify any address is guaranteed to 
clear the marking. 


A Load-Exclusive instruction performs a load from memory, and: 


• The executing PE marks the physical memory address for exclusive access. 

• The local monitor of the executing PE transitions to the Exclusive Access state. 

A Store-Exclusive instruction performs a conditional store to memory that depends on the state of the local monitor: 

If the local monitor is in the Exclusive Access state 

    • If the address of the Store-Exclusive instruction is the same as the address that has been 
      marked in the monitor by an earlier Load-Exclusive instruction, then the store occurs. 
      Otherwise, it is IMPLEMENTATION DEFINED whether the store occurs. 

    • A status value is returned to a register: 

      — If the store took place the status value is 0. 
      — Otherwise, the status value is 1. 

    • The local monitor of the executing PE transitions to the Open Access state. 

If the local monitor is in the Open Access state 

    • No store takes place. 

    • A status value of 1 is returned to a register. 

    • The local monitor remains in the Open Access state. 

The Store-Exclusive instruction defines the register to which the status value is returned. 

When a PE writes using any instruction other than a Store-Exclusive instruction: 


• If the write is to a physical address that is not marked as Exclusive Access by its local monitor and that local 
monitor is in the Exclusive Access state it is IMPLEMENTATION DEFINED whether the write affects the state of 
the local monitor. 

• If the write is to a physical address that is marked as Exclusive Access by its local monitor it is 
IMPLEMENTATION DEFINED whether the write affects the state of the local monitor. 


It is IMPLEMENTATION DEFINED whether a store to a marked physical address causes a mark in the local monitor to 
be cleared if that store is by an observer other than the one that caused the physical address to be marked. 


Figure B2-4 on page B2-110 shows the state machine for the local monitor and the effect of each of the operations 
shown in the figure. 

[Figure B2-4: state machine diagram with two states, Open Access and Exclusive Access. The transitions shown are:] 

Open Access to Exclusive Access:       LoadExcl(x) 
Exclusive Access to Exclusive Access:  LoadExcl(x), Store(Marked_address)*, Store(!Marked_address)* 
Open Access to Open Access:            StoreExcl(x), Store(x), CLREX 
Exclusive Access to Open Access:       Store(Marked_address)*, Store(!Marked_address)*, StoreExcl(Marked_address), 
                                       StoreExcl(!Marked_address), CLREX 

Operations marked * are possible alternative IMPLEMENTATION DEFINED options. 

In the diagram: 
LoadExcl represents any Load-Exclusive instruction. 
StoreExcl represents any Store-Exclusive instruction. 
Store represents any other store instruction. 

Any LoadExcl operation updates the marked address to the most significant bits of the address x used for the operation. 

Figure B2-4 Local monitor state machine diagram 


For more information about marking see Marking and the size of the marked memory block on page B2-115. 
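
The local monitor behavior described in this section can be modeled in a few lines of software. The following is a minimal sketch under stated assumptions, not a definitive implementation: the class and parameter names are hypothetical, the two IMPLEMENTATION DEFINED choices are reduced to boolean options, and a single option `store_clears` covers ordinary stores to both marked and unmarked addresses, although an implementation could treat the two cases differently.

```python
from enum import Enum


class State(Enum):
    OPEN = "Open Access"
    EXCLUSIVE = "Exclusive Access"


class LocalMonitor:
    """Toy model of the local monitor state machine for one PE."""

    def __init__(self, granule_bits=4, store_clears=False,
                 mismatched_store_succeeds=False):
        self.state = State.OPEN
        self.marked = None                 # marked block number, or None
        self.granule_bits = granule_bits   # assumed log2 of marked block size
        # The next two options stand in for IMPLEMENTATION DEFINED choices.
        self.store_clears = store_clears
        self.mismatched_store_succeeds = mismatched_store_succeeds

    def _block(self, addr):
        # Marking ignores the least significant bits of the address.
        return addr >> self.granule_bits

    def load_exclusive(self, addr):
        self.marked = self._block(addr)
        self.state = State.EXCLUSIVE

    def store_exclusive(self, addr):
        """Return the status value: 0 if the store takes place, otherwise 1."""
        if self.state is State.OPEN:
            return 1
        matches = self._block(addr) == self.marked
        stored = matches or self.mismatched_store_succeeds
        self.state = State.OPEN            # always returns to Open Access
        self.marked = None
        return 0 if stored else 1

    def store(self, addr):
        """Any store that is not a Store-Exclusive."""
        if self.state is State.EXCLUSIVE and self.store_clears:
            self.state = State.OPEN
            self.marked = None

    def clrex(self):
        self.state = State.OPEN
        self.marked = None
```

With the default options, a Store-Exclusive to the marked block returns 0, and a Store-Exclusive executed in the Open Access state, or to an unmarked block, returns 1. In every case the monitor ends in the Open Access state.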


Note 


For the local monitor state machine, as shown in Figure B2-4: 





• The IMPLEMENTATION DEFINED options for the local monitor are consistent with the local monitor being 
constructed so that it does not hold any physical address, but instead treats any access as matching the address 
of the previous Load-Exclusive instruction. 

• A local monitor implementation can be unaware of Load-Exclusive and Store-Exclusive instructions from 
other PEs. 

• The architecture does not require a load instruction by another PE, that is not a Load-Exclusive instruction, 
to have any effect on the local monitor. 

• It is IMPLEMENTATION DEFINED whether the transition from Exclusive Access to Open Access state occurs 
when the Store or StoreExcl is from another observer. 





Changes to the local monitor state resulting from speculative execution 


The architecture permits a local monitor to transition to the Open Access state as a result of speculation, or from 
some other cause. This is in addition to the transitions to Open Access state caused by the architectural execution 
of an operation shown in Figure B2-4. 


An implementation must ensure that: 


° The local monitor cannot be seen to transition to the Exclusive Access state except as a result of the 
architectural execution of one of the operations shown in Figure B2-4. 


° Any transition of the local monitor to the Open Access state not caused by the architectural execution of an 
operation shown in Figure B2-4 must not indefinitely delay forward progress of execution. 





B2-110 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




B2.10.2 Exclusive access instructions and Shareable memory locations 


In the context of this section, a shareable memory location is a memory location that has, or is treated as if it has, a 
Shareability attribute of Inner Shareable or Outer Shareable. 


For shareable memory locations, exclusive access instructions rely on: 


• A local monitor for each PE in the system, that marks any address from which the PE executes a 
Load-Exclusive. The local monitor operates as described in Exclusive access instructions and Non-shareable 
memory locations on page B2-109, except that for shareable memory any Store-Exclusive is then subject to 
checking by the global monitor if it is described in that section as doing at least one of the following: 

— Updating memory. 
— Returning a status value of 0. 

The local monitor can ignore accesses from other PEs in the system. 

• A global monitor that marks a physical address as exclusive access for a particular PE. This marking is used 
later to determine whether a Store-Exclusive to that address that has not been failed by the local monitor can 
occur. Any successful write to the marked block by any other observer in the shareability domain of the 
memory location is guaranteed to clear the marking. For each PE in the system, the global monitor: 

— Can hold at least one marked block. 
— Maintains a state machine for each marked block it can hold. 


Note 


For each PE, the architecture only requires global monitor support for a single marked address. Any situation 
that might benefit from the use of multiple marked addresses on a single PE is UNPREDICTABLE or 
CONSTRAINED UNPREDICTABLE, see Load-Exclusive and Store-Exclusive instruction usage restrictions on 
page B2-115. 








Note 


The global monitor can either reside within the PE, or exist as a secondary monitor at the memory interfaces. The 
IMPLEMENTATION DEFINED aspects of the monitors mean that the global monitor and local monitor can be combined 
into a single unit, provided that the unit performs the global monitor and local monitor functions defined in this 
manual. 








For shareable memory locations, in some implementations and for some memory types, the properties of the global 
monitor require functionality outside the PE. Some system implementations might not implement this functionality 
for all locations of memory. In particular, this can apply to: 


• Any type of memory in the system implementation that does not support hardware cache coherency. 

• Non-cacheable memory, or memory treated as Non-cacheable, in an implementation that does support 
hardware cache coherency. 

In such a system, it is defined by the system: 

• Whether the global monitor is implemented. 
• If the global monitor is implemented, which address ranges or memory types it monitors. 

Note 





To support the use of the Load-Exclusive/Store-Exclusive mechanism when address translation is disabled, a system 
might define at least one location of memory, of at least the size of the translation granule, in the system memory 
map to support the global monitor for all ARM PEs within a common Inner Shareable domain. However, this is not 
an architectural requirement. Therefore, architecturally-compliant software that requires mutual exclusion must not 
rely on using the Load-Exclusive/Store-Exclusive mechanism, and must instead use a software algorithm such as 
Lamport’s Bakery algorithm to achieve mutual exclusion. 

Because implementations can choose which memory types are treated as Non-cacheable, the only memory types for 
which it is architecturally guaranteed that a global exclusive monitor is implemented are: 


° Inner Shareable, Inner Write-Back, Outer Write-Back Normal memory with Read allocation hints and Write 
allocation hints and not transient. 


° Outer Shareable, Inner Write-Back, Outer Write-Back Normal memory with Read allocation hints and Write 
allocation hints and not transient. 


The set of memory types that support atomic instructions must include all of the memory types for which a global 
monitor is implemented. 


If the global monitor is not implemented for an address range or memory type, then performing a Load-Exclusive 
or a Store-Exclusive instruction to such a location has one or more of the following effects: 


• The instruction generates an external abort. 

• The instruction generates an IMPLEMENTATION DEFINED MMU fault. This is reported using the Fault Status 
code of ESR_ELx.DFSC = 110101. 

  If the IMPLEMENTATION DEFINED MMU fault is generated for the Non-secure EL1&0 translation regime then: 

  — If the fault is generated because of the memory type defined in the first stage of translation, or if the 
  second stage of translation is disabled, then this is a first stage fault and the exception is taken to EL1. 

  — Otherwise, the fault is a second stage fault and the exception is taken to EL2. 

  The priority of this fault is IMPLEMENTATION DEFINED. 

• The instruction is treated as a NOP. 

• The Load-Exclusive instruction is treated as if it were accessing a Non-shareable location, but the state of the 
local monitor becomes UNKNOWN. 

• The Store-Exclusive instruction is treated as if it were accessing a Non-shareable location, but the state of the 
local monitor becomes UNKNOWN. In this case, if the Store-Exclusive instruction is a Store-Exclusive pair of 
64-bit quantities, then the two quantities being stored might not be stored atomically. 

• The value held in the result register of the Store-Exclusive instruction becomes UNKNOWN. 


In addition, for write transactions generated by non-PE observers that do not implement exclusive accesses or other 
atomic access mechanisms, the effect that writes have on the global and local monitors used by ARM PEs is 
IMPLEMENTATION DEFINED. The writes might not clear the global monitors of other PEs for: 





• Some address ranges. 
• Some memory types. 

Operation of the global monitor 


A Load-Exclusive instruction from shareable memory performs a load from memory, and causes the physical 
address of the access to be marked as exclusive access for the requesting PE. This access can also cause the 
exclusive access mark to be removed from any other physical address that has been marked by the requesting PE. 


Note 


The global monitor only supports a single outstanding exclusive access to shareable memory per PE. 








A Load-Exclusive instruction by one PE has no effect on the global monitor state for any other PE. 

A Store-Exclusive instruction performs a conditional store to memory: 

• The store is guaranteed to succeed only if the physical address accessed is marked as exclusive access for the 
requesting PE and both the local monitor and the global monitor state machines for the requesting PE are in 
the Exclusive Access state. In this case: 

— A status value of 0 is returned to a register to acknowledge the successful store. 

— The final state of the global monitor state machine for the requesting PE is IMPLEMENTATION DEFINED. 

— If the address accessed is marked for exclusive access in the global monitor state machine for any other 
PE then that state machine transitions to Open Access state. 

• If no address is marked as exclusive access for the requesting PE, the store does not succeed: 

— A status value of 1 is returned to a register to indicate that the store failed. 

— The global monitor is not affected and remains in Open Access state for the requesting PE. 

• If a different physical address is marked as exclusive access for the requesting PE, it is IMPLEMENTATION 
DEFINED whether the store succeeds or not: 

— If the store succeeds a status value of 0 is returned to a register, otherwise a value of 1 is returned. 

— If the global monitor state machine for the PE was in the Exclusive Access state before the 
Store-Exclusive instruction it is IMPLEMENTATION DEFINED whether that state machine transitions to 
the Open Access state. 


The Store-Exclusive instruction defines the register to which the status value is returned. 


In a shared memory system, the global monitor implements a separate state machine for each PE in the system. The 
state machine for accesses to shareable memory by PE(n) can respond to all the shareable memory accesses visible 
to it. This means it responds to: 

• Accesses generated by PE(n). 

• Accesses generated by the other observers in the shareability domain of the memory location. These accesses 
are identified as (!n). 
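
The per-PE state machines of the global monitor can be sketched in software as follows. This is a minimal model under stated assumptions, not a definitive implementation: the class `GlobalMonitor` and its method names are hypothetical, the local monitor check is omitted, and each IMPLEMENTATION DEFINED outcome is resolved by one arbitrary choice that is noted in the comments.

```python
class GlobalMonitor:
    """Toy global monitor: one marked block, or None, for each PE."""

    def __init__(self, num_pes, granule_bits=4):
        self.granule_bits = granule_bits   # assumed log2 of marked block size
        self.marked = [None] * num_pes     # None models the Open Access state

    def _block(self, addr):
        return addr >> self.granule_bits

    def _clear_others(self, pe, addr):
        # A write to a marked block by another observer clears that marking.
        block = self._block(addr)
        for other, mark in enumerate(self.marked):
            if other != pe and mark == block:
                self.marked[other] = None

    def load_exclusive(self, pe, addr):
        # Marks the block for this PE only. Other PEs are not affected.
        self.marked[pe] = self._block(addr)

    def store_exclusive(self, pe, addr):
        """Return the status value: 0 if the store succeeds, otherwise 1."""
        if self.marked[pe] is None:
            return 1                       # nothing marked: the store fails
        if self.marked[pe] != self._block(addr):
            # IMPLEMENTATION DEFINED case. This sketch fails the store and
            # returns the state machine for this PE to Open Access.
            self.marked[pe] = None
            return 1
        self._clear_others(pe, addr)
        # The final state for this PE is IMPLEMENTATION DEFINED. This sketch
        # returns to Open Access.
        self.marked[pe] = None
        return 0

    def store(self, pe, addr):
        """Any store that is not a Store-Exclusive."""
        # The effect on the storing PE's own marking is IMPLEMENTATION
        # DEFINED. This sketch leaves it unchanged.
        self._clear_others(pe, addr)
```

For example, if PE(0) and PE(1) both mark the same block, the first Store-Exclusive to succeed clears the marking held by the other PE, so the second Store-Exclusive returns 1.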


In a shared memory system, the global monitor implements a separate state machine for each observer that can 
generate a Load-Exclusive or a Store-Exclusive instruction in the system. 

Clear global monitor event 


Whenever the global monitor state for a PE changes from Exclusive access to Open access, an event is generated 
and held in the Event register for that PE. This register is used by the Wait for Event mechanism, see Mechanisms 
for entering a low-power state on page D1-1599. 


Figure B2-5 shows the state machine for PE(n) in a global monitor. 

[Figure B2-5: state machine diagram with two states, Open Access and Exclusive Access. The transitions shown are:] 

Open Access to Exclusive Access:       LoadExcl(x,n) 
Exclusive Access to Exclusive Access:  LoadExcl(x,n), StoreExcl(Marked_address,!n)†, Store(!Marked_address,n), 
                                       StoreExcl(Marked_address,n)*, StoreExcl(!Marked_address,n)*, 
                                       Store(Marked_address,n)*, CLREX(n)*, StoreExcl(!Marked_address,!n), 
                                       Store(!Marked_address,!n), CLREX(!n) 
Open Access to Open Access:            CLREX(n), CLREX(!n), LoadExcl(x,!n), StoreExcl(x,n), StoreExcl(x,!n), 
                                       Store(x,n), Store(x,!n) 
Exclusive Access to Open Access:       StoreExcl(Marked_address,!n)†, Store(Marked_address,!n), 
                                       StoreExcl(Marked_address,n)*, StoreExcl(!Marked_address,n)*, 
                                       Store(Marked_address,n)*, CLREX(n)* 

†StoreExcl(Marked_address,!n) clears the monitor only if the StoreExcl updates memory. 

Operations marked * are possible alternative IMPLEMENTATION DEFINED options. 

In the diagram: 
LoadExcl represents any Load-Exclusive instruction. 
StoreExcl represents any Store-Exclusive instruction. 
Store represents any other store instruction. 

Any LoadExcl operation updates the marked address to the most significant bits of the address x used for the operation. 

Figure B2-5 Global monitor state machine diagram for PE(n) in a multiprocessor system 


For more information about marking see Marking and the size of the marked memory block on page B2-115. 


Note 





For the global monitor state machine, as shown in Figure B2-5:

• The architecture does not require a load instruction by another PE, that is not a Load-Exclusive instruction,
to have any effect on the global monitor.

• Whether a Store-Exclusive instruction successfully updates memory or not depends on whether the address
accessed matches the marked shareable memory address for the PE issuing the Store-Exclusive instruction,
and whether the local and global monitors are in the exclusive state. For this reason, Figure B2-5 only shows
how the operations by (!n) cause state transitions of the state machine for PE(n).

• A Load-Exclusive instruction can only update the marked shareable memory address for the PE issuing the
Load-Exclusive instruction.

• When the global monitor is in the Exclusive Access state, it is IMPLEMENTATION DEFINED whether a CLREX
instruction causes the global monitor to transition from Exclusive Access to Open Access state.

• It is IMPLEMENTATION DEFINED:

— Whether a modification to a Non-shareable memory location can cause a global monitor to transition
from Exclusive Access to Open Access state.

— Whether a Load-Exclusive instruction to a Non-shareable memory location can cause a global monitor
to transition from Open Access to Exclusive Access state.
Marking and the size of the marked memory block


When a Load-Exclusive instruction is executed, the resulting marked block ignores the least significant bits of the 
64-bit memory address. 


When a Load-Exclusive instruction is executed, a marked block of size 2^a bytes is created by ignoring the least
significant bits of the memory address. A marked address is any address within this marked block. The size of the
marked memory block is called the Exclusives reservation granule. The Exclusives reservation granule is
IMPLEMENTATION DEFINED in the range 4-512 words.





Note 
This definition means that the Exclusives reservation granule is:
• 4 words in an implementation where a is 4.
• 512 words in an implementation where a is 11.


For example, in an implementation where a is 4, a successful LDXRB of address 0x341B4 defines a marked block using 
bits[47:4] of the address. This means that the four words of memory from 0x341B0 to 0x341BF are marked for 
exclusive access. 





In some implementations the CTR identifies the Exclusives reservation granule, see CTR_EL0. Otherwise, software
must assume that the maximum Exclusives reservation granule, 512 words, is implemented.
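
The marking rule can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the architecture. The function names marked_block and granule_words are hypothetical, and a is the exponent described in this section.

```python
def marked_block(address, a):
    """Return (first, last) byte addresses of the marked block containing address.

    a is log2 of the Exclusives reservation granule size in bytes (hypothetical helper).
    """
    if not 4 <= a <= 11:
        raise ValueError("a must be in the range 4 to 11 (4 to 512 words)")
    size = 1 << a                    # granule size in bytes
    first = address & ~(size - 1)    # ignore the a least significant address bits
    return first, first + size - 1


def granule_words(a):
    """Granule size in 32-bit words for a given value of a."""
    return (1 << a) // 4


first, last = marked_block(0x341B4, 4)
print(hex(first), hex(last), granule_words(4))   # 0x341b0 0x341bf 4
```

With a set to 4 this reproduces the LDXRB example in the Note: the four words from 0x341B0 to 0x341BF are marked.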


Context switch support 


An exception return clears the local monitor. As a result, performing a CLREX instruction as part of a context switch 
is not required in most situations. 


Note 


Context switching is not an application level operation. However, this information is included here to complete the 
description of the exclusive operations. 








Load-Exclusive and Store-Exclusive instruction usage restrictions 


The Load-Exclusive and Store-Exclusive instructions are intended to work together as a pair, for example an
LDXP/STXP pair or an LDXR/STXR pair. To support different implementations of these functions, software must follow the
notes and restrictions given here. 


The following notes describe the use of a LoadExcl/StoreExcl pair, to indicate the use of any of the 
Load-Exclusive/Store-Exclusive pairs shown in Table B2-3 on page B2-108. In this context, a LoadExcl/StoreExcl 
pair comprises two instructions in the same thread of execution: 


° The exclusives support a single outstanding exclusive access for each PE thread that is executed. The 
architecture makes use of this by not requiring an address or size check as part of the IsExclusiveLocal() 
function. If the target virtual address of a StoreExcl is different from the virtual address of the preceding
LoadExcl instruction in the same thread of execution, behavior can be CONSTRAINED UNPREDICTABLE with 
the following behavior: 


— The StoreExcl either passes or fails, and the status value returned by the StoreExcl is UNKNOWN. 


Note 


This means the StoreExcl might pass for some instances of a LoadExcl/StoreExcl pair with mismatched
addresses, and fail for other instances of a LoadExcl/StoreExcl pair with mismatched addresses. 








— The data at the address accessed by the LoadExcl, and at the address accessed by the StoreExcl, is
UNKNOWN. 


This means software can rely on a LoadExcl/StoreExcl pair to eventually succeed only if the LoadExcl and the 
StoreExcl are executed with the same virtual address. 
• If two StoreExcl instructions are executed without an intervening LoadExcl instruction the second StoreExcl
instruction returns a status value of 1. This means that:

— ARM recommends that, in a given thread of execution, every StoreExcl instruction has a preceding
LoadExcl instruction associated with it.

— It is not necessary for every LoadExcl instruction to have a subsequent StoreExcl instruction.


° An implementation of the Load-Exclusive and Store-Exclusive instructions can require that, in any thread of 
execution, the transaction size of a StoreExcl instruction is the same as the transaction size of the preceding 
LoadExcl instruction executed in that thread. If the transaction size of a StoreExcl instruction is different from
the preceding LoadExcl instruction in the same thread of execution, behavior can be CONSTRAINED 
UNPREDICTABLE with the following behavior: 


— The StoreExcl either passes or fails, and the status value returned by the StoreExcl is UNKNOWN. 


Note 


This means the StoreExcl might pass for some instances of a LoadExcl/StoreExcl pair with mismatched
transaction sizes, and fail for other instances of a LoadExcl/StoreExcl pair with mismatched transaction
sizes. 








— The block of data of the size of the larger of the transaction sizes used by the LoadExcl/StoreExcl pair 
at the address accessed by the LoadExcl/StoreExcl pair, is UNKNOWN. 


This means software can rely on a LoadExcl/StoreExcl pair to eventually succeed only if the LoadExcl and the 
StoreExcl have the same transaction size. 


° An implementation of the LoadExcl and StoreExcl instructions can require that, in any thread of execution, 
the StoreExcl instruction accesses the same number of registers as the preceding LoadExcl instruction
executed in that thread. If the StoreExcl instruction accesses a different number of registers than the preceding
LoadExcl instruction in the same thread of execution, behavior is CONSTRAINED UNPREDICTABLE. As a result,
software can rely on a LoadExcl/StoreExcl pair to eventually succeed only if they access the same number
of registers. For more information see CONSTRAINED UNPREDICTABLE behavior when 
Load-Exclusive/Store-Exclusive access a different number of registers on page B2-118. 


° LoadExcl/StoreExcl loops are guaranteed to make forward progress only if, for any LoadExcl/StoreExcl loop 
within a single thread of execution, the software meets all of the following conditions: 


1 Between the Load-Exclusive and the Store-Exclusive, there are no explicit memory accesses, 
preloads, direct or indirect System register writes, address translation instructions, cache or TLB 
maintenance instructions, exception generating instructions, exception returns, or indirect 
branches. 


2 Between the Store-Exclusive returning a failing result and the retry of the corresponding 
Load-Exclusive: 


• There are no stores or PLDW instructions to any address within the Exclusives reservation
granule accessed by the Store-Exclusive.

• There are no loads or preloads to any address within the Exclusives reservation granule
accessed by the Store-Exclusive that use a different VA alias to that address.

• There are no direct or indirect System register writes, address translation instructions,
cache or TLB maintenance instructions, exception generating instructions, exception
returns, or indirect branches.

• All loads and stores are to a block of contiguous virtual memory of not more than 512
bytes in size.


The exclusive monitor can be cleared at any time without an application-related cause, provided that such 
clearing is not systematically repeated so as to prevent the forward progress in finite time of at least one of 
the threads that is accessing the exclusive monitor. 


• Implementations can benefit from keeping the LoadExcl and StoreExcl operations close together in a single
thread of execution. This minimizes the likelihood of the exclusive monitor state being cleared between the
LoadExcl instruction and the StoreExcl instruction. Therefore, for best performance, ARM strongly
recommends a limit of 128 bytes between LoadExcl and StoreExcl instructions in a single thread of execution.

• The architecture sets an upper limit of 2048 bytes on the Exclusives reservation granule that can be marked
as exclusive. For performance reasons, ARM recommends that objects that are accessed by exclusive
accesses are separated by the size of the Exclusives reservation granule. This is a performance guideline
rather than a functional requirement.


• After taking a Data Abort exception, the state of the exclusive monitors is UNKNOWN.


• For the memory location accessed by a LoadExcl/StoreExcl pair, if the memory attributes for a StoreExcl
instruction are different from the memory attributes for the preceding LoadExcl instruction in the same thread
of execution, behavior is CONSTRAINED UNPREDICTABLE. Where this occurs because the translation of the
accessed address changes between the LoadExcl instruction and the StoreExcl instruction, the CONSTRAINED
UNPREDICTABLE behavior is as follows: 


— The StoreExcl either passes or fails, and the status value returned by the StoreExcl is UNKNOWN. 


Note 


This means the StoreExcl might pass for some instances of a LoadExcl/StoreExcl pair with changed 
memory attributes, and fail for other instances of a LoadExcl/StoreExcl pair with changed memory
attributes. 








— The data at the address accessed by the StoreExcl is UNKNOWN. 


Note 


Another bullet point in this list covers the case where the memory attributes of a LoadExcl/StoreExcl pair 
differ as a result of using different virtual addresses with different attributes that point to the same physical 
address. 








° The effect of a data or unified cache invalidate, clean, or clean and invalidate instruction on a local or global 
exclusive monitor that is in the Exclusive Access state is CONSTRAINED UNPREDICTABLE, and the instruction 
might clear the monitor, or it might leave it in the Exclusive Access state. For address-based maintenance 
instructions, this also applies to the monitors of other PEs in the same shareability domain as the PE executing 
the cache maintenance instruction, as determined by the shareability domain of the address being maintained. 


Note 


ARM strongly recommends that implementations ensure that the use of such maintenance instructions by a 
PE in the Non-secure state cannot cause a denial of service on a PE in the Secure state. 








• If the mapping of the virtual to physical address is changed between the LoadExcl instruction and the StoreExcl
instruction, and the change is performed using a break-before-make sequence as described in Using
break-before-make when updating translation table entries on page D4-1816, if the StoreExcl is performed
after another write to the same physical address as the StoreExcl, and that other write was performed after the
old translation was properly invalidated and that invalidation was properly synchronized, then the StoreExcl
will not pass its monitor check.





Note 
ARM expects that, in many implementations, either: 
— The TLB invalidation will clear either the local or global monitor. 


— The physical address will be checked between the LoadExcl and StoreExcl.








Note 


In the event of repeatedly-contending LoadExcl/StoreExcl instruction sequences from multiple PEs, an 
implementation must ensure that forward progress is made by at least one PE. 
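
The pairing rules in this section can be illustrated with a toy software model. The following Python sketch is not an architectural definition. The class ToyLocalMonitor and the function atomic_add are hypothetical names, and the model makes one simplifying assumption: a StoreExcl whose address or size does not match the preceding LoadExcl always fails, which is only one of the outcomes that the CONSTRAINED UNPREDICTABLE cases permit.

```python
class ToyLocalMonitor:
    """Toy model of one outstanding exclusive access per thread (illustration only)."""

    def __init__(self):
        self.memory = {}
        self.reservation = None   # (address, size) set by the last LoadExcl, or None

    def load_excl(self, address, size):
        self.reservation = (address, size)
        return self.memory.get(address, 0)

    def store_excl(self, address, size, value):
        """Return the status value: 0 if the store passes, 1 if it fails."""
        passed = self.reservation == (address, size)
        self.reservation = None   # any StoreExcl consumes the reservation
        if passed:
            self.memory[address] = value
            return 0
        return 1

    def clrex(self):
        self.reservation = None


def atomic_add(monitor, address, delta):
    """Retry loop: repeat the LoadExcl/StoreExcl pair until the StoreExcl passes."""
    while True:
        old = monitor.load_excl(address, 4)
        if monitor.store_excl(address, 4, old + delta) == 0:
            return old + delta
```

In this model a second StoreExcl without an intervening LoadExcl returns 1, matching the rule stated in this section, and the retry loop uses the same address and transaction size for both instructions of the pair.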
CONSTRAINED UNPREDICTABLE behavior when Load-Exclusive/Store-Exclusive access a different number of registers


As stated in this section, an implementation can require that the instructions of a Load-Exclusive/Store-Exclusive 
pair access the same number of registers. In such an implementation, this means behavior is CONSTRAINED 
UNPREDICTABLE if, in a single thread of execution, either: 


° An LDXP instruction of two 32-bit quantities is followed by an STXR instruction of one 64-bit quantity at the 
same address. 


° An LDXR instruction of one 64-bit quantity is followed by an STXP instruction of two 32-bit quantities at the 
same address. 


In these cases, the CONSTRAINED UNPREDICTABLE behavior must be one of: 
° The STXP or STXR instruction generates an external Data Abort. 


° The STXP or STXR instruction generates an IMPLEMENTATION DEFINED MMU fault reported using the Fault 
Status code of ESR_ELx.DFSC = 0b110101. 








° The STXP or STXR instruction always fails, returning a status of 1. 
° The STXP or STXR instruction always passes, returning a status of 0. 
• This STXP or STXR instruction has the same pass or fail behavior that it would have had if the instruction had
used the same size and number of registers as the preceding LDXR or LDXP instruction.


B2.10.6 Use of WFE and SEV instructions by spin-locks 


ARMv8 provides Wait For Event, Send Event, and Send Event Local instructions, WFE, SEV, and SEVL, that can assist
with reducing power consumption and bus contention caused by PEs repeatedly attempting to obtain a spin-lock. 
These instructions can be used at the application level, but a complete understanding of what they do depends on a 
system level understanding of exceptions. They are described in Wait for Event mechanism and Send event on 
page D1-1599. However, in ARMv8, when the global monitor for a PE changes from Exclusive Access state to 
Open Access state, an event is generated. 


Note 


This is equivalent to issuing an SEVL instruction on the PE for which the monitor state has changed. It removes the 
need for spinlock code to include an SEV instruction after clearing a spinlock. 
Chapter C1


The A64 Instruction Set 


This chapter describes the A64 instruction set. It contains the following sections: 


• About the A64 instruction set on page C1-122.
• Structure of the A64 assembler language on page C1-123.
• Address generation on page C1-128.
• Instruction aliases on page C1-131.
C1.1 About the A64 instruction set 


The A64 instruction set is the instruction set supported in the AArch64 Execution state. 


All A64 instructions have a width of 32 bits. The A64 encoding structure breaks down into the following functional 
groups: 


• A miscellaneous group of branch instructions, exception generating instructions, and system instructions.

• Data-processing instructions associated with general-purpose registers. These instructions are supported by
two functional groups, depending on whether the operands:

— Are all held in registers.

— Include an operand with a constant immediate value.

• Load and store instructions associated with the general-purpose register file and the SIMD and floating-point
register file.

• SIMD and scalar floating-point data-processing instructions that operate on the SIMD and floating-point
registers.


The encoding hierarchy within a functional group breaks down as follows: 


• A functional group consists of a set of related instruction classes. A64 instruction index by encoding on
page C4-192 provides an overview of the instruction encodings in the form of a list of instruction classes
within their functional groups.

• An instruction class consists of a set of related instruction forms. Instruction forms are documented in one of
two alphabetic lists:

— The load, store, and data-processing instructions associated with the general-purpose registers,
together with those in the other instruction classes. See Chapter C6 A64 Base Instruction Descriptions.

— The load, store, and data-processing instructions associated with the SIMD and floating-point support.
See Chapter C7 A64 Advanced SIMD and Floating-point Instruction Descriptions.

• An instruction form might support a single instruction syntax. Where an instruction supports more than one
syntax, each syntax is an instruction variant. Instruction variants can occur because of differences in:

— The size or format of the operands.

— The register file used for the operands.

— The addressing mode used for load/store memory operands.

Instruction variants might also arise as the result of other factors.

Instruction variants are described in the instruction description for the individual instructions.


A64 instructions have a regular bit encoding structure: 


• 5-bit register operand fields at fixed positions within the instruction. For general-purpose register operands,
the values 0-30 select one of 31 registers. The value 31 is used as a special case that can:

— Indicate use of the current stack pointer, when identifying a load/store base register or in a limited set
of data-processing instructions. See The stack pointer registers on page D1-1507.

— Indicate the value zero when used as a source register operand.

— Indicate discarding the result when used as a destination register operand.

For SIMD and floating-point register access, the value used selects one of 32 registers.

• Immediate bits that provide constant data-processing values or address offsets are placed in contiguous
bitfields. Some computed values in instruction variants use one or more immediate bitfields together with the
secondary encoding bitfields.


All encodings that are not fully defined are described as unallocated. An attempt to execute an unallocated 
instruction is UNDEFINED, unless the behavior is otherwise defined in this Manual. 
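
As an illustration of the fixed-position 5-bit register fields, the following Python sketch extracts three register numbers from a 32-bit instruction word. The field positions used here, bits[4:0], bits[9:5], and bits[20:16], and the names Rd, Rn, and Rm are assumptions taken from the three-register data-processing forms and are not stated in this section. Other instruction classes use some of these bits for different purposes. The function name register_fields is hypothetical.

```python
def register_fields(instruction):
    """Extract the 5-bit register fields of a three-register A64 instruction word.

    Assumed positions: Rd = bits[4:0], Rn = bits[9:5], Rm = bits[20:16].
    """
    if not 0 <= instruction <= 0xFFFFFFFF:
        raise ValueError("A64 instructions are 32 bits wide")
    return {
        "Rd": instruction & 0x1F,
        "Rn": (instruction >> 5) & 0x1F,
        "Rm": (instruction >> 16) & 0x1F,
    }


# 0x8B020020 is assumed here to be the encoding of ADD X0, X1, X2.
print(register_fields(0x8B020020))   # {'Rd': 0, 'Rn': 1, 'Rm': 2}
```

A field value of 31 is not decoded further by this sketch, because its meaning depends on the instruction and the operand position.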
C1.2 Structure of the A64 assembler language


The letter W denotes a general-purpose register holding a 32-bit word, and X denotes a general-purpose register 
holding a 64-bit doubleword. 


An A64 assembler recognizes both upper-case and lower-case variants of the instruction mnemonics and register 
names, but not mixed case variants. An A64 disassembler can output either upper-case or lower-case mnemonics 
and register names. Program and data labels are case-sensitive. 


The A64 assembly language does not require the # character to introduce constant immediate operands, but an 
assembler must allow immediate values introduced with or without the # character. ARM recommends that an A64 
disassembler outputs a # before an immediate operand. 


In Example C1-1 on page C1-124 the sequence // is used as a comment leader and A64 assemblers are encouraged
to accept this syntax. 


C1.2.1 Common syntax terms 
The following syntax terms are used frequently throughout the A64 instruction set description. 


UPPER Text in upper-case letters is fixed. Text in lower-case letters is variable. This means that register 
name Xn indicates that the X is required, followed by a variable register number, for example X29. 


<> Any text enclosed by angle braces, < >, is a value that the user supplies. Subsequent text might 
supply additional information. 


{ } Any item enclosed by curly brackets, { }, is optional. A description of the item and how its presence 
or absence affects the instruction is normally supplied by subsequent text. In some cases curly 
braces are actual symbols in the syntax, for example when they surround a register list. These cases 
are called out in the surrounding text. 


[ ] Any items enclosed by square brackets, [ ], constitute a list of alternative characters. A single one 
of the characters can be used in that position and the subsequent text describes the meaning of the 
alternatives. In some cases the square brackets are part of the syntax itself, such as addressing modes
or vector elements. These cases are called out in the surrounding text. 


a|b Alternative words are separated by a vertical bar, |, and can be surrounded by parentheses to delimit
them. For example, U(ADD|SUB)W represents UADDW or USUBW.


± This indicates an optional + or - sign. If neither is used then + is assumed.


uimmn An n-bit unsigned, positive, immediate value. 

simmn An n-bit two’s complement, signed immediate value, where n includes the sign bit. 
SP See Register names on page C1-124. 

Wn See Register names on page C1-124. 

WSP See Register names on page C1-124. 

WZR See Register names on page C1-124. 

Xn See Register names on page C1-124. 

XZR See Register names on page C1-124.
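
The value ranges implied by the uimmn and simmn terms can be computed directly. The following Python sketch is illustrative only, and the function names uimm_range, simm_range, and fits_immediate are hypothetical.

```python
def uimm_range(n):
    """Inclusive (lowest, highest) value of an n-bit unsigned immediate."""
    return 0, (1 << n) - 1


def simm_range(n):
    """Inclusive (lowest, highest) value of an n-bit two's complement immediate.

    n includes the sign bit.
    """
    return -(1 << (n - 1)), (1 << (n - 1)) - 1


def fits_immediate(value, n, signed):
    """True if value can be represented in an n-bit immediate field."""
    lowest, highest = simm_range(n) if signed else uimm_range(n)
    return lowest <= value <= highest


print(uimm_range(12), simm_range(9))   # (0, 4095) (-256, 255)
```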

C1.2.2 Instruction Mnemonics 


The A64 assembly language overloads instruction mnemonics and distinguishes between the different forms of an 
instruction based on the operand types. For example, the following ADD instructions all have different opcodes. 
However, the programmer needs to remember only one mnemonic, because the assembler automatically chooses the correct
opcode based on the operands. The disassembler follows the same procedure in reverse. 


Example C1-1 ADD instructions with different opcodes 


ADD W0, W1, W2          // add 32-bit register
ADD X0, X1, X2          // add 64-bit register
ADD X0, X1, W2, SXTW    // add 64-bit extended register
ADD X0, X1, #42         // add 64-bit immediate

































































C1.2.3 Condition Code 
The A64 ISA has some instructions that set condition flags or test condition codes or both. For information about 
instructions that set the condition flags or use the condition mnemonics, see Condition flags and related instructions 
on page C6-433. 
Table C1-1 shows the available condition codes. 

Table C1-1 Condition codes 
cond  Mnemonic  Meaning (integer)             Meaning (floating-point) [a]       Condition flags
0000  EQ        Equal                         Equal                              Z == 1
0001  NE        Not equal                     Not equal or unordered             Z == 0
0010  CS or HS  Carry set                     Greater than, equal, or unordered  C == 1
0011  CC or LO  Carry clear                   Less than                          C == 0
0100  MI        Minus, negative               Less than                          N == 1
0101  PL        Plus, positive or zero        Greater than, equal, or unordered  N == 0
0110  VS        Overflow                      Unordered                          V == 1
0111  VC        No overflow                   Ordered                            V == 0
1000  HI        Unsigned higher               Greater than, or unordered         C == 1 && Z == 0
1001  LS        Unsigned lower or same        Less than or equal                 !(C == 1 && Z == 0)
1010  GE        Signed greater than or equal  Greater than or equal              N == V
1011  LT        Signed less than              Less than, or unordered            N != V
1100  GT        Signed greater than           Greater than                       Z == 0 && N == V
1101  LE        Signed less than or equal     Less than, equal, or unordered     !(Z == 0 && N == V)
1110  AL        Always                        Always                             Any
1111  NV [b]    Always                        Always                             Any

a. Unordered means at least one NaN operand.
b. The condition code NV exists only to provide a valid disassembly of the 0b1111 encoding, otherwise its behavior is identical to AL.
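
The condition codes in Table C1-1 follow a regular pattern: the upper three bits of cond select a base test, and the least significant bit inverts it, except for 0b1111, which behaves as AL. The following Python sketch evaluates a condition on that basis. It is illustrative only, and the function name condition_holds and its argument order are hypothetical.

```python
def condition_holds(cond, n, z, c, v):
    """Evaluate a 4-bit condition code against the N, Z, C, V flags (each 0 or 1)."""
    if not 0 <= cond <= 0b1111:
        raise ValueError("cond is a 4-bit value")
    base = cond >> 1
    if base == 0b000:
        result = z == 1                 # EQ or NE
    elif base == 0b001:
        result = c == 1                 # CS or CC
    elif base == 0b010:
        result = n == 1                 # MI or PL
    elif base == 0b011:
        result = v == 1                 # VS or VC
    elif base == 0b100:
        result = c == 1 and z == 0      # HI or LS
    elif base == 0b101:
        result = n == v                 # GE or LT
    elif base == 0b110:
        result = n == v and z == 0      # GT or LE
    else:
        result = True                   # AL
    # Odd encodings invert the base test, except 0b1111, which behaves as AL.
    if cond & 1 and cond != 0b1111:
        result = not result
    return result
```

For example, with Z set and the other flags clear, EQ (0b0000) and LE (0b1101) hold, and NE (0b0001) does not.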
C1.2.4 Register names 
This section describes the AArch64 registers. It contains the following subsections: 
° General-purpose register file and the stack pointer on page C1-125. 
° SIMD and floating-point register file on page C1-125. 
° SIMD and floating-point scalar register names on page C1-126. 
° SIMD vector register names on page C1-126. 
° SIMD vector element names on page C1-126.


General-purpose register file and the stack pointer 


The 31 general-purpose registers in the general-purpose register file are named R0-R30 and encoded in the
instruction register fields with values 0-30. A general-purpose register field that encodes the value 31 represents 
either the current stack pointer or the zero register, depending on the instruction and the operand position. 


When the registers are used in a specific instruction variant, they must be qualified to indicate the operand data size, 
32 bits or 64 bits, and the data size of the instruction. 


When the data size is 32 bits, the lower 32 bits of the register are used and the upper 32 bits are ignored on a read 
and cleared to zero on a write. 


Table C1-2 shows the qualified names for registers, where n is a register number 0-30. 


Table C1-2 General-purpose register names 























Name  Size     Encoding  Description
Wn    32 bits  0-30      General-purpose register 0-30
Xn    64 bits  0-30      General-purpose register 0-30
WZR   32 bits  31        Zero register
XZR   64 bits  31        Zero register
WSP   32 bits  31        Current stack pointer
SP    64 bits  31        Current stack pointer





The following list provides further details relating to Table C1-2. 


• The names Xn and Wn both refer to the same general-purpose register, Rn.

• There is no register named W31 or X31.

• The name SP represents the stack pointer for 64-bit operands where an encoding of the value 31 in the
corresponding register field is interpreted as a read or write of the current stack pointer. When instructions
do not interpret this operand encoding as the stack pointer, use of the name SP is an error.

• The name WSP represents the current stack pointer in a 32-bit context.

• The name XZR represents the zero register for 64-bit operands where an encoding of the value 31 in the
corresponding register field is interpreted as returning zero when read or discarding the result when written.
When instructions do not interpret this operand encoding as the zero register, use of the name XZR is an error.

• The name WZR represents the zero register in a 32-bit context.

• The architecture does not define a specific name for general-purpose register R30 to reflect its role as the link
register on procedure calls. However, an A64 assembler must always use W30 and X30 for this purpose, and
additional software names might be defined as part of the Procedure Call Standard, see Procedure Call
Standard for the ARM 64-bit Architecture.
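
The naming rules in Table C1-2 can be expressed as a small lookup. The following Python sketch is illustrative only. The function name gpr_name and the reg31 argument, which states how the instruction interprets encoding 31 for this operand, are hypothetical.

```python
def gpr_name(encoding, bits, reg31="zr"):
    """Return the qualified register name for a 5-bit general-purpose register field.

    bits is the operand size, 32 or 64. reg31 is "zr" or "sp" and says how the
    instruction interprets the value 31 in this operand position.
    """
    if bits not in (32, 64):
        raise ValueError("operand size must be 32 or 64 bits")
    if not 0 <= encoding <= 31:
        raise ValueError("register fields are 5 bits wide")
    if encoding < 31:
        return ("W" if bits == 32 else "X") + str(encoding)
    if reg31 == "sp":
        return "WSP" if bits == 32 else "SP"
    if reg31 == "zr":
        return "WZR" if bits == 32 else "XZR"
    raise ValueError("reg31 must be 'zr' or 'sp'")


print(gpr_name(29, 64), gpr_name(31, 64, "sp"), gpr_name(31, 32))   # X29 SP WZR
```

The sketch never returns W31 or X31, because no registers with those names exist.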


SIMD and floating-point register file 


The 32 registers in the SIMD and floating-point register file, V0-V31, hold floating-point operands for the scalar
floating-point instructions, and both scalar and vector operands for the SIMD instructions. When they are used in a 
specific instruction form, the names must be further qualified to indicate the data shape, that is the data element size 
and the number of elements or lanes within the register. A similar requirement is placed on the general-purpose 
registers. See General-purpose register file and the stack pointer. 
Note


The data type is described by the instruction mnemonics that operate on the data. The data type is not described by 
the register name. The data type is the interpretation of bits within each register or vector element, whether these 
are integers, floating-point values, polynomials or cryptographic hashes. 








SIMD and floating-point scalar register names 


SIMD and floating-point instructions that operate on scalar data only access the lower bits of a SIMD and 
floating-point register. The unused high bits are ignored on a read and cleared to 0 on a write. 


Table C1-3 shows the qualified names for accessing scalar SIMD and floating-point registers. The letter n denotes 
a register number between 0 and 31. 


Table C1-3 SIMD and floating-point scalar register names 




















Size Name 
8 bits Bn 
16 bits Hn 
32 bits Sn 
64 bits Dn 
128 bits Qn 





SIMD vector register names 


If a register holds multiple data elements on which arithmetic is performed in a parallel, SIMD, manner, then a 
qualifier describes the vector shape. The vector shape is the element size and the number of elements or lanes. If the 
element size in bits multiplied by the number of lanes does not equal 128, then the upper 64 bits of the register are 
ignored on a read and cleared to zero on a write. 


Table C1-4 shows the SIMD vector register names. The letter n denotes a register number between 0 and 31. 


Table C1-4 SIMD vector register names 





























Shape              Name
8 bits x 8 lanes   Vn.8B
8 bits x 16 lanes  Vn.16B
16 bits x 4 lanes  Vn.4H
16 bits x 8 lanes  Vn.8H
32 bits x 2 lanes  Vn.2S
32 bits x 4 lanes  Vn.4S
64 bits x 1 lane   Vn.1D
64 bits x 2 lanes  Vn.2D
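
Every shape in Table C1-4 has a total width of either 64 bits or 128 bits. The following Python sketch parses a vector register name on that basis. It is illustrative only, accepts upper-case names only, and the names ELEMENT_BITS and vector_shape are hypothetical.

```python
ELEMENT_BITS = {"B": 8, "H": 16, "S": 32, "D": 64}


def vector_shape(name):
    """Parse a name such as V3.4S into (element_bits, lanes, total_bits)."""
    register, _, arrangement = name.partition(".")
    number = register[1:]
    if not (register.startswith("V") and number.isdigit() and int(number) <= 31):
        raise ValueError("expected a register V0 to V31")
    lanes_text, letter = arrangement[:-1], arrangement[-1:]
    if letter not in ELEMENT_BITS or not lanes_text.isdigit():
        raise ValueError("expected an arrangement such as 8B or 4S")
    element_bits, lanes = ELEMENT_BITS[letter], int(lanes_text)
    total_bits = element_bits * lanes
    if total_bits not in (64, 128):
        raise ValueError("the shape must fill 64 or 128 bits")
    return element_bits, lanes, total_bits


print(vector_shape("V0.8B"), vector_shape("V31.2D"))   # (8, 8, 64) (64, 2, 128)
```

A 64-bit total corresponds to the case described above in which the upper 64 bits of the register are ignored on a read and cleared to zero on a write.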





SIMD vector element names 


Appending a constant, zero-based element index to the register name inside square brackets indicates that a single 
element from a SIMD and floating-point register is used as a scalar operand. The number of lanes is not represented, 
as it is not encoded in the instruction and can only be inferred from the index value. 
Table C1-5 shows the vector register names and the element index. The letter i denotes the element index.


Table C1-5 Vector register names with element index 

















Size Name 
8 bits Vn.B[i] 
16 bits Vn.H[i] 
32 bits Vn.S[i]
64 bits Vn.D[i] 





An assembler must accept a fully qualified SIMD register name, if the number of lanes is greater than the index 
value. See SIMD vector register names on page C1-126. For example, an assembler must accept all of the following 
forms as the name for the 32-bit element in bits [63:32] of the SIMD and floating-point register V9: 


V9.S[1]     // standard disassembly
V9.2S[1]    // optional number of lanes
V9.4S[1]    // optional number of lanes


Note 


The SIMD and floating-point register element name Vn.S[0] is not equivalent to the scalar SIMD and floating-point
register name Sn. Although they represent the same bits in the register, they select different instruction encoding 
forms, either the vector element or the scalar form. 








SIMD vector register list 


Where an instruction operates on multiple SIMD and floating-point registers, for example vector Load/Store 
structure and table lookup operations, the registers are specified as a list enclosed by curly braces. This list consists 
of either a sequence of registers separated by commas, or a register range separated by a hyphen. The registers must 
be numbered in increasing order, modulo 32, in increments of one. The hyphenated form is preferred for 
disassembly if there are more than two registers in the list and the register numbers are increasing. The following
examples are equivalent representations of a set of four registers V4 to V7, each holding four lanes of 32-bit elements: 


{ V4.4S - V7.4S } //standard disassembly 
{ V4.4S, V5.4S, V6.4S, V7.4S } //alternative representation 
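
The register list rules can be sketched as a formatter that applies the disassembly preference described above. The following Python sketch is illustrative only. The function name register_list is hypothetical, and the limit of four registers in a list is an assumption of this sketch rather than a statement made in this section.

```python
def register_list(first, count, arrangement):
    """Format a SIMD register list that starts at V<first> and holds count registers.

    Register numbers increase by one, modulo 32. The hyphenated form is used when
    there are more than two registers and the numbers do not wrap around.
    """
    if not 0 <= first <= 31:
        raise ValueError("register number must be 0 to 31")
    if not 1 <= count <= 4:
        raise ValueError("this sketch assumes lists of one to four registers")
    numbers = [(first + i) % 32 for i in range(count)]
    names = ["V%d.%s" % (number, arrangement) for number in numbers]
    increasing = numbers == sorted(numbers)
    if count > 2 and increasing:
        return "{ %s - %s }" % (names[0], names[-1])
    return "{ " + ", ".join(names) + " }"


print(register_list(4, 4, "4S"))    # { V4.4S - V7.4S }
print(register_list(31, 2, "16B"))  # { V31.16B, V0.16B }
```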


SIMD vector element list 


Registers in a list can also have a vector element form. For example, the LD4 instruction can load one element into 
each of four registers, and in this case the index is appended to the list as follows: 


{ V4.S - V7.S }[3] //standard disassembly
{ V4.4S, V5.4S, V6.4S, V7.4S }[3] //alternative with optional number of lanes 
C1.3 Address generation


The A64 instruction set supports 64-bit addresses. The valid address range is determined by the following factors: 
° The size of the implemented virtual address space. 

° Memory Management Unit (MMU) configuration settings. 

The top 8 bits of the 64-bit address can be used as a tag, see Address tagging in AArch64 state on page D4-1724. 


For more information on memory management and address translation, see Chapter D4 The AArch64 Virtual 
Memory System Architecture. 


























C1.3.1 Register indexed addressing 
The A64 instruction set allows a 64-bit index register to be added to the 64-bit base register, with optional scaling 
of the index by the access size. Additionally it allows for sign-extension or zero-extension of a 32-bit value within 
an index register, followed by optional scaling. 
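
The address calculation for register indexed addressing can be sketched as follows. This Python model is illustrative only. The function names extend_word and indexed_address, and the argument names, are hypothetical. The result wraps to 64 bits.

```python
MASK64 = (1 << 64) - 1


def extend_word(value, signed):
    """Take the low 32 bits of value and sign-extend or zero-extend them."""
    value &= 0xFFFFFFFF
    if signed and value & 0x80000000:
        value -= 1 << 32
    return value


def indexed_address(base, index, access_bytes=1, scaled=False, extend=None):
    """Compute base plus index, with optional extension and scaling of the index.

    extend is None for a 64-bit index, or "SXTW" or "UXTW" for a 32-bit index.
    When scaled is True the index is multiplied by the access size in bytes.
    """
    if extend is None:
        offset = index & MASK64
    elif extend in ("SXTW", "UXTW"):
        offset = extend_word(index, signed=(extend == "SXTW"))
    else:
        raise ValueError("extend must be None, 'SXTW', or 'UXTW'")
    if scaled:
        offset *= access_bytes
    return (base + offset) & MASK64


print(hex(indexed_address(0x1000, 3, access_bytes=8, scaled=True)))   # 0x1018
print(hex(indexed_address(0x1000, 0xFFFFFFFF, extend="SXTW")))        # 0xfff
```

The second call shows the effect of sign-extension: the 32-bit index 0xFFFFFFFF is treated as -1.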
C1.3.2 PC-relative addressing 
The A64 instruction set has support for position-independent code and data addressing: 
• PC-relative literal loads have an offset range of ±1MB.
• Process state flag and compare based conditional branches have a range of ±1MB. Test bit conditional
branches have a restricted range of ±32KB.
• Unconditional branches, including branch and link, have a range of ±128MB.
PC-relative Load/Store operations, and address generation with a range of ±4GB can be performed using two
instructions.
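
These ranges follow from the width of the signed immediate field and the unit in which the offset is counted. The following Python sketch reproduces them. It is illustrative only, and the function name pc_relative_range is hypothetical. The field widths used in the example are assumptions not stated in this section, apart from the 19-bit word offset of literal loads: 19 bits for conditional branches, 14 bits for test bit branches, 26 bits for unconditional branches, and 21 bits counted in 4KB pages for the two-instruction sequence.

```python
def pc_relative_range(immediate_bits, unit_bytes):
    """Return the (lowest, highest) byte offset of a signed, scaled immediate."""
    lowest = -(1 << (immediate_bits - 1)) * unit_bytes
    highest = ((1 << (immediate_bits - 1)) - 1) * unit_bytes
    return lowest, highest


MB = 1 << 20
print(pc_relative_range(19, 4)[0] // MB)      # -1, literal loads and conditional branches
print(pc_relative_range(26, 4)[0] // MB)      # -128, unconditional branches
print(pc_relative_range(21, 4096)[0] // MB)   # -4096, that is -4GB
```

The highest positive offset is one unit less than the magnitude of the lowest offset, which is why the ranges are quoted as approximate ± values.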
C1.3.3 Load/Store addressing modes 
Load/Store addressing modes in the A64 instruction set require a 64-bit base address from a general-purpose register 
X0-X30 or the current stack pointer, SP, with an optional immediate or register offset. Table C1-6 shows the 
assembler syntax for the complete set of Load/Store addressing modes. 
Table C1-6 A64 Load/Store addressing modes

                                  Offset
Addressing Mode                   Immediate         Register                  Extended Register
Base register only (no offset)    [base{, #0}]      -                         -
Base plus offset                  [base{, #imm}]    [base, Xm{, LSL #imm}]    [base, Wm, (S|U)XTW {#imm}]
Pre-indexed                       [base, #imm]!     -                         -
Post-indexed                      [base], #imm      [base], Xm (a)            -
Literal (PC-relative)             label             -                         -

a. The post-indexed by register offset mode can be used with the SIMD Load/Store structure instructions described in Load/Store
Vector on page C3-154. Otherwise the post-indexed by register offset mode is not available.
Some types of Load/Store instruction support only a subset of the Load/Store addressing modes listed in Table C1-6. 
Details of the supported modes are as follows: 
• Base plus offset addressing means that the address is the value in the 64-bit base register plus an offset.
• Pre-indexed addressing means that the address is the sum of the value in the 64-bit base register and an offset,
and the address is then written back to the base register.
• Post-indexed addressing means that the address is the value in the 64-bit base register, and the sum of the
address and the offset is then written back to the base register.
• Literal addressing means that the address is the value of the 64-bit program counter for this instruction plus
a 19-bit signed word offset. This means that it is a 4 byte aligned address within ±1MB of the address of this
instruction with no offset. Literal addressing can only be used for loads of at least 32 bits and for prefetch
instructions. The PC cannot be referenced using any other addressing modes. The syntax for labels is specific
to individual toolchains.
• An immediate offset can be unsigned or signed, and scaled or unscaled, depending on the type of Load/Store
instruction. When the immediate offset is scaled it is encoded as a multiple of the transfer size, although the
assembly language always uses a byte offset, and the assembler or disassembler performs the necessary
conversion. The usable byte offsets therefore depend on the type of Load/Store instruction and the transfer
size.


Table C1-7 shows the offset and the type of Load/Store instruction. 


Table C1-7 Immediate offsets and the type of Load/Store instruction

Offset bits   Sign       Scaling    Write-Back   Load/Store type
0             -          -          -            Exclusive/acquire/release
7             Signed     Scaled     Optional     Register pair
9             Signed     Unscaled   Optional     Single register
12            Unsigned   Scaled     No           Single register

• A register offset means that the offset is the 64 bits from a general-purpose register, Xm, optionally scaled
by the transfer size, in bytes, if LSL #imm is present and where imm must be equal to log2(transfer_size).
• An extended register offset means that the offset is the bottom 32 bits from a general-purpose register Wm,
sign-extended or zero-extended to 64 bits, and then scaled by the transfer size if so indicated by #imm, where
imm must be equal to log2(transfer_size). An assembler must accept Wm or Xm as an extended register
offset, but Wm is preferred for disassembly.
• Generating an address lower than the value in the base register requires a negative signed immediate offset
or a register offset holding a negative value.
• When stack alignment checking is enabled by system software and the base register is the SP, the current
stack pointer must be initially quadword aligned, that is aligned to 16 bytes. Misalignment generates a Stack
Alignment fault. The offset does not have to be a multiple of 16 bytes unless the specific Load/Store
instruction requires this. SP cannot be used as a register offset.
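
The usable byte offsets implied by Table C1-7 can be derived from the number of offset bits, the signedness, and whether the encoded value is scaled by the transfer size. The following is a minimal sketch of that derivation. The function names are hypothetical, and the transfer sizes used in the examples are illustrative.

```python
def immediate_offset_range(bits, signed, scaled, transfer_size):
    """Return (lowest, highest, step) in bytes for an immediate offset.

    bits          -- number of offset bits in the encoding
    signed        -- True if the encoded value is two's complement
    scaled        -- True if the encoded value is a multiple of transfer_size
    transfer_size -- size of one transferred element, in bytes
    """
    step = transfer_size if scaled else 1
    if signed:
        lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lowest, highest = 0, (1 << bits) - 1
    return lowest * step, highest * step, step


def is_encodable(offset, bits, signed, scaled, transfer_size):
    """True if the byte offset can be encoded directly in the instruction."""
    lowest, highest, step = immediate_offset_range(bits, signed, scaled, transfer_size)
    return lowest <= offset <= highest and offset % step == 0
```

With a 16-byte transfer size, the 12-bit unsigned scaled form reaches byte offset 65520, which is why a quadword load can be said to have a 16-bit byte offset.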


Address calculation 


General-purpose arithmetic instructions can calculate the result of most addressing modes and write the address to 
a general-purpose register or, in most cases, to the current stack pointer. 
Table C1-8 shows the arithmetic instructions that can compute addressing modes.

Table C1-8 Arithmetic instructions to compute addressing modes

                              Offset
Addressing Form               Immediate                     Register                             Extended Register
Base register (no offset)     MOV <Xd|SP>, base             -                                    -
Base plus offset              ADD <Xd|SP>, base, #imm       ADD <Xd|SP>, base, Xm{, LSL #imm}    ADD <Xd|SP>, base, Wm, (S|U)XT(W|H|B|X) {#imm}
                              or
                              SUB <Xd|SP>, base, #imm
Pre-indexed                   -                             -                                    -
Post-indexed                  -                             -                                    -
Literal (PC-relative)         ADR Xd, label                 -                                    -


Note

• To calculate a base plus immediate offset, the ADD instructions defined in Arithmetic (immediate) on
page C3-158 accept an unsigned 12-bit immediate offset, with an optional left shift by 12. This means that a
single ADD instruction cannot support the full range of byte offsets available to a single register Load/Store
with a scaled 12-bit immediate offset. For example, a quadword LDR effectively has a 16-bit byte offset. To
calculate an address with a byte offset that requires more than 12 bits it is necessary to use two ADD
instructions. The following example shows this:

ADD Xd, base, #(imm & 0xFFF)
ADD Xd, Xd, #(imm>>12), LSL #12

• To calculate a base plus extended register offset, the ADD instructions defined in Arithmetic (extended register)
on page C3-164 provide a superset of the addressing mode that also supports sign-extension or
zero-extension of a byte or halfword value with any shift amount between 0 and 4, for example:

ADD Xd, base, Wm, SXTW #3    // Xd = base + (SignExtend(Wm) LSL 3)
ADD Xd, base, Wm, UXTH #4    // Xd = base + (ZeroExtend(Wm<15:0>) LSL 4)

• If the same extended register offset is used by more than one Load/Store instruction, then, depending on the
implementation, it might be more efficient to calculate the extended and scaled intermediate result just once,
and then re-use it as a simple register offset. The extend and scale calculation can be performed using the
SBFIZ and UBFIZ bitfield instructions defined in Bitfield move on page C3-161, for example:

SBFIZ Xd, Xm, #3, #32    // Xd = "Wm, SXTW #3"
UBFIZ Xd, Xm, #4, #16    // Xd = "Wm, UXTH #4"
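
The arithmetic behind the two-instruction ADD sequence and the SBFIZ and UBFIZ equivalences in this Note can be checked with a short model. This is a sketch only: the function names are hypothetical, registers are modelled as unsigned 64-bit integers, and the split assumes an offset that fits in 24 bits, which is the most that two 12-bit immediates can cover.

```python
MASK64 = (1 << 64) - 1


def split_add_immediate(imm):
    """Split a byte offset into the immediates for two ADD instructions:
    (low 12 bits, high 12 bits to be applied with LSL #12)."""
    if not 0 <= imm < (1 << 24):
        raise ValueError("offset does not fit in two 12-bit immediates")
    return imm & 0xFFF, imm >> 12


def sbfiz(xm, lsb, width):
    """Signed bitfield insert in zero: sign-extend the low `width` bits
    of xm, then shift left by lsb."""
    field = xm & ((1 << width) - 1)
    sign = 1 << (width - 1)
    return (((field ^ sign) - sign) << lsb) & MASK64


def ubfiz(xm, lsb, width):
    """Unsigned bitfield insert in zero: zero-extend the low `width` bits
    of xm, then shift left by lsb."""
    return ((xm & ((1 << width) - 1)) << lsb) & MASK64
```

For the largest quadword LDR offset, 65520, the split gives 0xFF0 for the first ADD and 0xF for the shifted ADD.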
C1.4 Instruction aliases


Some instructions have an associated architecture alias that is used for disassembly of the encoding when the 
associated conditions are met. Architecture alias instructions are included in the alphabetic lists of instruction types 
and clearly presented as an alias form in descriptions for the individual instructions. 
Chapter C2
About the A64 Instruction Descriptions 


This chapter describes the instruction descriptions contained in Chapter C6 A64 Base Instruction Descriptions and 
Chapter C7 A64 Advanced SIMD and Floating-point Instruction Descriptions. 


It contains the following sections: 





• Understanding the A64 instruction descriptions on page C2-134.
• General information about the A64 instruction descriptions on page C2-137.
C2.1 Understanding the A64 instruction descriptions


Each instruction description in Chapter C6 and Chapter C7 has the following content: 
1. A title.
2. An introduction to the instruction.
3. The instruction encoding or encodings.
4. Any alias conditions.
5. A list of the assembler symbols for the instruction.
6. Pseudocode describing how the instruction operates.
7. Notes, if applicable.


The following sections describe each of these. 





C2.1.1 The title 
The title of an instruction description includes the base mnemonic for the instruction. 
If different forms of an instruction use the same base mnemonic, each form has its own description. In this case, the 
title is the mnemonic followed by a short description of the instruction form in parentheses. This is most often used 
when an operand is an immediate value in one instruction form, but is a register in another form. 
For example, in Chapter C6 there are the following titles for different forms of the ADD instruction: 
• ADD (extended register) on page C6-437.
• ADD (immediate) on page C6-439.
• ADD (shifted register) on page C6-441.

C2.1.2 An introduction to the instruction 
This briefly describes the function of the instruction. The introduction is not a complete description of the 
instruction, and it is not definitive. If there is any conflict between it and the more detailed information that follows 
it, the more detailed information takes priority. 

C2.1.3 The instruction encoding or encodings 
This shows the instruction encoding diagram, or if the instruction has more than one encoding, shows all of the 
encoding diagrams. Each diagram has a subheading. 
For example, for load and store instructions, the subheadings might be: 
• Post-index.
• Pre-index.
• Unsigned offset.
Each diagram numbers the bits from 31 to 0. The diagram for an instruction at address A shows, from left to right, 
the bytes at addresses A+3, A+2, A+1, and A. 
There might be variants of an encoding, if the assembler syntax prototype differs depending on the value in one or 
more of the encoding fields. In this case, each variant has a subheading that describes the variant and shows the 
distinguishing field value or values in parentheses. For example, in Chapter C6 there are the following subheadings 
for variants of the ADC instruction encoding: 
• 32-bit variant (sf = 0).
• 64-bit variant (sf = 1).
The assembler syntax prototype for an encoding or variant of an encoding shows how to form a complete assembler 
source code instruction that assembles to the encoding. Unless otherwise stated, the prototype is also the preferred 
syntax for a disassembler to disassemble the encoding to. Disassemblers are permitted to omit optional symbols that 
represent the default value of a field or set of fields, to produce more readable disassembled code, provided that the 
output re-assembles to the same encoding. 
Each encoding diagram, and its associated assembler syntax prototypes, is followed by encoding-specific
pseudocode that translates the fields of that encoding into inputs for the encoding-independent pseudocode that
describes the operation of the instruction. See Pseudocode describing how the instruction operates on page C2-136.
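
The relationship between an encoding diagram and the bytes in memory can be illustrated with a short sketch. It assumes that the bytes at addresses A to A+3 form a little-endian 32-bit word, which is what the left-to-right ordering A+3, A+2, A+1, A of the diagram implies. The helper names are hypothetical, and the example word 0xD503201F is used here as the NOP encoding.

```python
import struct


def instruction_word(memory, address):
    """Read the 32-bit instruction at `address` from a bytes-like object,
    treating the byte at the lowest address as bits [7:0]."""
    (word,) = struct.unpack_from("<I", memory, address)
    return word


def field(word, high, low):
    """Extract bits [high:low] of an instruction word, numbered 31 to 0."""
    return (word >> low) & ((1 << (high - low + 1)) - 1)


if __name__ == "__main__":
    memory = bytes([0x1F, 0x20, 0x03, 0xD5])  # bytes at A, A+1, A+2, A+3
    word = instruction_word(memory, 0)
    print(hex(word), hex(field(word, 31, 24)))
```

A field such as sf, which the encoding diagrams for ADC place at bit 31, is read with field(word, 31, 31).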


C2.1.4 Any alias conditions, if applicable 


This is an optional part of an instruction description. If included, it describes the set of conditions for which an 
alternative mnemonic and its associated assembler syntax prototypes are preferred for disassembly by a 
disassembler. It includes a link to the alias instruction description that defines the alternative syntax. The alias 
syntax and the original syntax can be used interchangeably in the assembler source code. 


ARM recommends that if a disassembler outputs the alias syntax, it consistently outputs the alias syntax. 


C2.1.5 A list of the assembler symbols for the instruction 


The Assembler symbols subsection of the instruction description contains a list of the symbols that the assembler 
syntax prototype or prototypes use, if any. 


In assembler syntax prototypes, the following conventions are used: 


<> Angle brackets. Any symbol enclosed by these is a name or a value that the user supplies. For each 
symbol, there is a description of what the symbol represents. The description usually also specifies 
which encoding field or fields encodes the symbol. 


{ } Brace brackets. Any symbols enclosed by these are optional. For each optional symbol, there is a 
description of what the symbol represents and how its presence or absence is encoded. 


In some assembler syntax prototypes, some brace brackets are mandatory, for example if they 
surround a register list. When the use of brace brackets is mandatory, they are separated from other 
syntax items by one or more spaces. 


# This usually precedes a numeric constant. All uses of # are optional in A64 assembler source code. 
ARM recommends that disassemblers output the # where the assembler syntax prototype includes it. 


+/- This indicates an optional + or - sign. If neither is coded, + is assumed. 


Single spaces are used for clarity, to separate syntax items. Where a space is mandatory, the assembler syntax 
prototype shows two or more consecutive spaces. 


Any characters not shown in this conventions list must be coded exactly as shown in the assembler syntax prototype. 
Apart from brace brackets, the characters shown are used as part of a meta-language to define the architectural 
assembler syntax for an instruction encoding or alias, but have no architecturally defined significance in the input 
to an assembler or in the output from a disassembler. 


The following symbol conventions are used: 


<Xn> The 64-bit name of a general-purpose register (X0-X30) or the zero register (XZR).
<Wn> The 32-bit name of a general-purpose register (W0-W30) or the zero register (WZR).
<Xn|SP> The 64-bit name of a general-purpose register (X0-X30) or the current stack pointer (SP).
<Wn|WSP> The 32-bit name of a general-purpose register (W0-W30) or the current stack pointer (WSP).

<Bn>, <Hn>, <Sn>, <Dn>, <Qn>
The 8, 16, 32, 64 or 128-bit name of a SIMD and floating-point register in a scalar context as
described in section Register names on page C1-124.

<Vn> The name of a SIMD and floating-point register in a vector context as described in Register
names on page C1-124.


If the description of a symbol specifies that the symbol is a register, the description might also specify that the range 
of permitted registers is extended or restricted. It also specifies any differences from the default rules for such fields. 
Note


Register names on page C1-124 provides the A64 register names. 








C2.1.6 Pseudocode describing how the instruction operates 
The Operation subsection of the instruction description contains this pseudocode. 


It is encoding-independent pseudocode that provides a precise description of what the instruction does. 





Note 


For a description of ARM pseudocode, see Appendix K11 ARM Pseudocode Definition. This appendix also 
describes the execution model for an instruction. 





C2.1.7 Notes, if applicable 


If applicable, other notes about the instruction appear under additional subheadings. 
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C2.2 General information about the A64 instruction descriptions 


This section provides general information about the A64 instruction descriptions. Some of this information also 
applies to System register descriptions, for example the terms defined in Fixed values in AArch64 instruction and 
System register descriptions apply to the AArch64 descriptions throughout this manual. The following subsections 
provide this information: 


• Fixed values in AArch64 instruction and System register descriptions.
• Modified immediate constants in A64 instructions on page C2-138.
C2.2.1 Fixed values in AArch64 instruction and System register descriptions 


This section summarizes the terms used to describe fixed values in AArch64 register and instruction descriptions. 
The Glossary gives full descriptions of these terms, and each entry in this section includes a link to the 
corresponding Glossary entry. 


Note 


In register descriptions, the meaning of some bits depends on the PE state. This affects the definitions of RES0 and
RES1, as shown in the Glossary.








The following terms are used to describe bits or fields with fixed values: 
RAZ Read-As-Zero. See Read-As-Zero (RAZ). 

In diagrams, a RAZ bit can be shown as 0. 
(0), RES0 Reserved, Should-Be-Zero (SBZ) or RES0.

In instruction encoding diagrams, and sometimes in other descriptions, (0) indicates an SBZ bit. If
the bit is set to 1, behavior is CONSTRAINED UNPREDICTABLE, and must be one of the following:

• The instruction is UNDEFINED.
• The instruction is treated as a NOP.
• The instruction executes as if the value of the bit was 0.
• Any destination registers of the instruction become UNKNOWN.

This notation can be expanded for fields, so a three-bit field can be shown as either (0)(0)(0) or as
(000).

In register diagrams, but not in the A64 encoding and instruction descriptions, bits or fields can be
shown as RES0. See the Glossary definition of RES0 for more information.

Note

Some of the System instruction descriptions in this chapter are based on the field description of the
input value for the instruction. These are register descriptions and therefore can include RES0 fields.

The (0) and RES0 descriptions can be applied to bits or bitfields that are read-only, or are write-only.
The Glossary definitions cover these cases.


RAO Read-As-One. See Read-As-One (RAO). 
In diagrams, a RAO bit can be shown as 1. 
(1), RES1 Reserved, Should-Be-One (SBO) or RES1. 


In instruction encoding diagrams, and sometimes in other descriptions, (1) indicates an SBO bit. If
the bit is set to 0, behavior is CONSTRAINED UNPREDICTABLE, and must be one of the following:

• The instruction is UNDEFINED.
• The instruction is treated as a NOP.
• The instruction executes as if the value of the bit was 1.
• Any destination registers of the instruction become UNKNOWN.

This notation can be expanded for fields, so a three-bit field can be shown as either (1)(1)(1) or as
(111).

In register diagrams, but not in the A64 encoding and instruction descriptions, bits or fields can be
shown as RES1. See the Glossary definition of RES1 for more information.

Note

Some of the System instruction descriptions in this chapter are based on the field description of the
input value for the instruction. These are register descriptions and therefore can include RES1 fields.

The (1) and RES1 descriptions can be applied to bits or bitfields that are read-only, or are write-only.
The Glossary definitions cover these cases.
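
A decoder or disassembler might detect violated (0) and (1) bits by comparing an encoding against two masks. The following is a minimal sketch under that assumption. The function names are hypothetical, and the mask values in the usage example describe an invented encoding layout rather than any real A64 instruction.

```python
def fixed_bit_violations(encoding, sbz_mask, sbo_mask):
    """Return (sbz_bits_set, sbo_bits_clear) for a 32-bit encoding.

    sbz_mask -- bits shown as (0) in the encoding diagram
    sbo_mask -- bits shown as (1) in the encoding diagram
    """
    sbz_bits_set = encoding & sbz_mask
    sbo_bits_clear = ~encoding & sbo_mask
    return sbz_bits_set, sbo_bits_clear


def is_constrained_unpredictable(encoding, sbz_mask, sbo_mask):
    """True if any (0) bit is 1 or any (1) bit is 0."""
    return any(fixed_bit_violations(encoding, sbz_mask, sbo_mask))


if __name__ == "__main__":
    # Hypothetical layout: bits [15:12] are (0), bits [20:16] are (1).
    SBZ_MASK = 0x0000F000
    SBO_MASK = 0x001F0000
    print(is_constrained_unpredictable(0x001F0000, SBZ_MASK, SBO_MASK))
    print(is_constrained_unpredictable(0x001E1000, SBZ_MASK, SBO_MASK))
```

The sketch only detects the condition. Which of the permitted behaviors then occurs is a choice made by the implementation.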


C2.2.2 Modified immediate constants in A64 instructions 
It contains the following subsections: 
• Modified immediate constants in A64 floating-point instructions.
Modified immediate constants in A64 floating-point instructions 
Table C2-1 shows the immediate constants available in FMOV (scalar, immediate) and FMOV (vector, immediate) 
floating-point instructions. 
Table C2-1 A64 Floating-point modified immediate constants

Data type   immediate   Constant (a)
F32         abcdefgh    aBbbbbbc defgh000 00000000 00000000
F64         abcdefgh    aBbbbbbb bbcdefgh 00000000 00000000 00000000 00000000 00000000 00000000

a. In this column, B = NOT(b). The bit pattern represents the floating-point number (-1)^S × 2^exp × mantissa, where
S = UInt(a), exp = UInt(NOT(b):c:d)-3 and mantissa = (16+UInt(e:f:g:h))/16.

The immediate value shown in the table is either:
• The value of the imm8 field for an FMOV (scalar, immediate) instruction, see FMOV (scalar, immediate) on
page C7-978.
• The value obtained by concatenating the a:b:c:d:e:f:g:h fields for an FMOV (vector, immediate)
instruction, see FMOV (vector, immediate) on page C7-973.
Table C2-2 shows the floating-point constant values encoded in the b:c:d:e:f:g:h fields of the FMOV (vector, 
immediate) instruction. 
Table C2-2 Floating-point constant values

        bcd
efgh    000      001     010     011     100          101         110        111
0000    2.0      4.0     8.0     16.0    0.125        0.25        0.5        1.0
0001    2.125    4.25    8.5     17.0    0.1328125    0.265625    0.53125    1.0625
0010    2.25     4.5     9.0     18.0    0.140625     0.28125     0.5625     1.125
0011    2.375    4.75    9.5     19.0    0.1484375    0.296875    0.59375    1.1875
0100    2.5      5.0     10.0    20.0    0.15625      0.3125      0.625      1.25
0101    2.625    5.25    10.5    21.0    0.1640625    0.328125    0.65625    1.3125
0110    2.75     5.5     11.0    22.0    0.171875     0.34375     0.6875     1.375
0111    2.875    5.75    11.5    23.0    0.1796875    0.359375    0.71875    1.4375
1000    3.0      6.0     12.0    24.0    0.1875       0.375       0.75       1.5
1001    3.125    6.25    12.5    25.0    0.1953125    0.390625    0.78125    1.5625
1010    3.25     6.5     13.0    26.0    0.203125     0.40625     0.8125     1.625
1011    3.375    6.75    13.5    27.0    0.2109375    0.421875    0.84375    1.6875
1100    3.5      7.0     14.0    28.0    0.21875      0.4375      0.875      1.75
1101    3.625    7.25    14.5    29.0    0.2265625    0.453125    0.90625    1.8125
1110    3.75     7.5     15.0    30.0    0.234375     0.46875     0.9375     1.875
1111    3.875    7.75    15.5    31.0    0.2421875    0.484375    0.96875    1.9375






Operation of modified immediate constants, floating-point instructions 


For an A64 floating-point instruction that uses a modified immediate constant, the operation described by the 
VFPExpandImm() pseudocode function returns the value of the immediate constant. 
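
The expansion can be sketched in Python from the formula in the footnote to Table C2-1 and the F32 bit pattern in that table. This is an illustration and not the VFPExpandImm() pseudocode itself. The function names are hypothetical, and only the single-precision bit pattern is modelled.

```python
import struct


def fp_imm8_value(imm8):
    """Value of an 8-bit modified immediate a:b:c:d:e:f:g:h, computed as
    (-1)^S * 2^exp * mantissa."""
    a = (imm8 >> 7) & 1
    b = (imm8 >> 6) & 1
    cd = (imm8 >> 4) & 3
    efgh = imm8 & 0xF
    exp = (((b ^ 1) << 2) | cd) - 3      # UInt(NOT(b):c:d) - 3
    mantissa = (16 + efgh) / 16
    return (-1) ** a * 2.0 ** exp * mantissa


def fp_imm8_to_f32_bits(imm8):
    """Expand imm8 to the F32 pattern aBbbbbbc defgh000 00000000 00000000."""
    a = (imm8 >> 7) & 1
    b = (imm8 >> 6) & 1
    replicated_b = 0x1F if b else 0      # five copies of b
    return (a << 31) | ((b ^ 1) << 30) | (replicated_b << 25) | ((imm8 & 0x3F) << 19)


def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Both routes agree for every imm8 value. For example, b:c:d = 111 with e:f:g:h = 0000 gives 1.0, matching the last column of the first row of Table C2-2.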
Chapter C3


A64 Instruction Set Overview 


This chapter provides an overview of the A64 instruction set. It contains the following sections: 


• Branches, Exception generating, and System instructions on page C3-142.
• Loads and stores on page C3-146.
• Data processing - immediate on page C3-158.
• Data processing - register on page C3-163.
• Data processing - SIMD and floating-point on page C3-171.


For a structured breakdown of instruction groups by encoding, see Chapter C4 A64 Instruction Set Encoding. 





ARM DDI 0487A.k_iss10775 


1ID092916 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 
Non-Confidential 



C3.1 Branches, Exception generating, and System instructions 


This section describes the branch, exception generating, and system instructions. It contains the following 
subsections: 


• Conditional branch.
• Unconditional branch (immediate) on page C3-143.
• Unconditional branch (register) on page C3-143.
• Exception generation and return on page C3-143.
• System register instructions on page C3-144.
• System instructions on page C3-144.
• Hint instructions on page C3-145.
• Barriers and CLREX instructions on page C3-145.


For information about the encoding structure of the instructions in this instruction group, see Branches, exception 
generating and system instructions on page C4-197. 





Note 


Software must: 


• Use only BLR or BL to perform a nested subroutine call when that subroutine is expected to return to the
immediately following instruction, that is, the instruction with the address of the BLR or BL instruction
incremented by four.
• Use only RET to perform a subroutine return, when that subroutine is expected to have been entered by a BL
or BLR instruction.
• Use only B, BR, or the instructions listed in Table C3-1 to perform a control transfer that is not a subroutine
call or subroutine return described in this Note.





C3.1.1 Conditional branch 


Conditional branches change the flow of execution depending on the current state of the condition flags or the value
in a general-purpose register. See Table C1-1 on page C1-124 for a list of the condition codes that can be used for
cond.


Table C3-1 shows the Conditional branch instructions. 


Table C3-1 Conditional branch instructions

Mnemonic   Instruction                       Branch offset range from the PC   See
B.cond     Branch conditionally              ±1MB                              B.cond on page C6-462
CBNZ       Compare and branch if nonzero     ±1MB                              CBNZ on page C6-476
CBZ        Compare and branch if zero        ±1MB                              CBZ on page C6-477
TBNZ       Test bit and branch if nonzero    ±32KB                             TBNZ on page C6-744
TBZ        Test bit and branch if zero       ±32KB                             TBZ on page C6-745

C3.1.2 Unconditional branch (immediate)


Unconditional branch (immediate) instructions change the flow of execution unconditionally by adding an
immediate offset with a range of ±128MB to the value of the program counter that fetched the instruction. The BL
instruction also writes the address of the sequentially following instruction to general-purpose register X30.


Table C3-2 shows the Unconditional branch instructions with an immediate branch offset. 


Table C3-2 Unconditional branch instructions (immediate)

Mnemonic   Instruction              Immediate branch offset range from the PC   See
B          Branch unconditionally   ±128MB                                      B on page C6-463
BL         Branch with link         ±128MB                                      BL on page C6-472





C3.1.3 Unconditional branch (register) 


Unconditional branch (register) instructions change the flow of execution unconditionally by setting the program 
counter to the value in a general-purpose register. The BLR instruction also writes the address of the sequentially 
following instruction to general-purpose register X30. The RET instruction behaves identically to BR, but provides an 
additional hint to the PE that this is a return from a subroutine. Table C3-3 shows Unconditional branch instructions 
that jump directly to an address held in a general-purpose register. 


Table C3-3 Unconditional branch instructions (register)

Mnemonic   Instruction                    See
BLR        Branch with link to register   BLR on page C6-473
BR         Branch to register             BR on page C6-474
RET        Return from subroutine         RET on page C6-653
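
The effect of the unconditional branch instructions on the program counter and on X30 can be modelled with a few lines of Python. This is a sketch of the behavior described in this section, not the architectural pseudocode. The class and method names are hypothetical, and the model assumes that RET uses X30 when no register is specified.

```python
MASK64 = (1 << 64) - 1


class BranchModel:
    """Minimal model of the PC and general-purpose registers X0-X30."""

    def __init__(self, pc):
        self.pc = pc
        self.x = [0] * 31

    def b(self, offset):
        """B: add an immediate offset to the PC of this instruction."""
        self.pc = (self.pc + offset) & MASK64

    def bl(self, offset):
        """BL: as B, and write the address of the next instruction to X30."""
        self.x[30] = (self.pc + 4) & MASK64
        self.pc = (self.pc + offset) & MASK64

    def br(self, n):
        """BR: set the PC to the value in Xn."""
        self.pc = self.x[n]

    def blr(self, n):
        """BLR: as BR, and write the address of the next instruction to X30."""
        target = self.x[n]               # read the target before X30 changes
        self.x[30] = (self.pc + 4) & MASK64
        self.pc = target

    def ret(self, n=30):
        """RET: same control transfer as BR, with a return hint to the PE."""
        self.pc = self.x[n]
```

A BL followed by a RET returns to the instruction after the BL, which is the nesting that the Note in this section requires software to preserve.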





C3.1.4 Exception generation and return 


This section describes the following exceptions: 
• Exception generating.
• Exception return on page C3-144.
• Debug state on page C3-144.


Exception generating 


Table C3-4 shows the Exception generating instructions. 


Table C3-4 Exception generating instructions

Mnemonic   Instruction                                      See
BRK        Breakpoint Instruction                           BRK on page C6-475
HLT        Halt Instruction                                 HLT on page C6-529
HVC        Generate exception targeting Exception level 2   HVC on page C6-530
SMC        Generate exception targeting Exception level 3   SMC on page C6-675
SVC        Generate exception targeting Exception level 1   SVC on page C6-738

Exception return


Table C3-5 shows the Exception return instructions. 


Table C3-5 Exception return instructions

Mnemonic   Instruction                                   See
ERET       Exception return using current ELR and SPSR   ERET on page C6-525





Debug state 


Table C3-6 shows the Debug state instructions. 


Table C3-6 Debug state instructions

Mnemonic   Instruction                         See
DCPS1      Debug switch to Exception level 1   DCPS1 on page C6-512
DCPS2      Debug switch to Exception level 2   DCPS2 on page C6-513
DCPS3      Debug switch to Exception level 3   DCPS3 on page C6-514
DRPS       Debug restore PE state              DRPS on page C6-517

C3.1.5 System register instructions 


For detailed information about the System register instructions, see Chapter C5 The A64 System Instruction Class. 
Table C3-7 shows the System register instructions. 


Table C3-7 System register instructions

Mnemonic   Instruction                                         See
MRS        Move System register to general-purpose register    MRS on page C6-622
MSR        Move general-purpose register to System register    MSR (register) on page C6-625
           Move immediate to PE state field                    MSR (immediate) on page C6-623

C3.1.6 System instructions 


For detailed information about the System instructions, see Chapter C5 The A64 System Instruction Class.


Table C3-8 shows the System instructions. 


Table C3-8 System instructions

Mnemonic   Instruction                      See
SYS        System instruction               SYS on page C6-742
SYSL       System instruction with result   SYSL on page C6-743
IC         Instruction cache maintenance    IC on page C6-531 and Table C5-2 on page C5-276
DC         Data cache maintenance           DC on page C6-511 and Table C5-2 on page C5-276
AT         Address translation              AT on page C6-461 and Table C5-3 on page C5-277
TLBI       TLB Invalidate                   TLBI on page C6-746 and Table C5-4 on page C5-278





C3.1.7 Hint instructions


Table C3-9 shows the Hint instructions.


Table C3-9 Hint instructions

Mnemonic   Instruction          See
NOP        No operation         NOP on page C6-637
YIELD      Yield hint           YIELD on page C6-765
WFE        Wait for event       WFE on page C6-763
WFI        Wait for interrupt   WFI on page C6-764
SEV        Send event           SEV on page C6-672
SEVL       Send event local     SEVL on page C6-673
HINT       Unallocated hint     HINT on page C6-528


C3.1.8 Barriers and CLREX instructions


Table C3-10 shows the barrier and CLREX instructions.


Table C3-10 Barriers and CLREX instructions

Mnemonic   Instruction                           See
CLREX      Clear exclusive monitor               CLREX on page C6-484
DSB        Data synchronization barrier          DSB on page C6-518
DMB        Data memory barrier                   DMB on page C6-515
ISB        Instruction synchronization barrier   ISB on page C6-532





For more information about the barriers, see Memory barriers on page B2-87. 


For information about the allocated values for the data barriers, see: 
• DMB on page C6-515.
• DSB on page C6-518.


C3.2 Loads and stores 


This section describes the Load/Store instructions. It contains the following subsections: 
• Load/Store register.
• Load/Store register (unscaled offset) on page C3-147.
• Load/Store Pair on page C3-148.
• Load/Store Non-temporal Pair on page C3-149.
• Load/Store unprivileged on page C3-150.
• Load-Exclusive/Store-Exclusive on page C3-150.
• Load-Acquire/Store-Release on page C3-151.
• Load/Store scalar SIMD and floating-point on page C3-152.
• Load/Store Vector on page C3-154.
• Prefetch memory on page C3-156.


Apart from Load-Exclusive, Store-Exclusive, Load-Acquire, and Store-Release, addresses can have any alignment 
unless strict alignment checking is enabled, that is if SCTLR_ELx.A == 1. 


The additional control bits SCTLR_ELx.SA and SCTLR_EL1.SA0 control whether the stack pointer must be
quadword aligned when used as a base register. See SP alignment checking on page D1-1515. Using a misaligned
stack pointer generates an SP alignment fault exception.
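
The two alignment checks described above reduce to simple modular tests. The following sketch shows them under simplifying assumptions: the control bits are passed in as Boolean parameters, the function names are hypothetical, and the separate alignment requirements of the Load-Exclusive, Store-Exclusive, Load-Acquire, and Store-Release instructions are not modelled.

```python
def is_aligned(address, size):
    """True if address is a multiple of size bytes."""
    return address % size == 0


def alignment_fault(address, access_size, strict_checking):
    """True if an access of access_size bytes would fault when strict
    alignment checking (modelled here as one Boolean) is enabled."""
    return strict_checking and not is_aligned(address, access_size)


def sp_alignment_fault(sp, sp_checking):
    """True if using sp as a base register would fault when SP alignment
    checking is enabled: the stack pointer must be 16-byte aligned."""
    return sp_checking and not is_aligned(sp, 16)
```

The check on the stack pointer applies to the value in SP itself and not to the final address, so an offset that is not a multiple of 16 does not by itself cause the SP check to fail.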


For information about the encoding structure of the instructions in this instruction group, see Loads and stores on 
page C4-202. 


Note 


In some cases, Load/Store instructions can lead to CONSTRAINED UNPREDICTABLE behavior. See AArch64 
CONSTRAINED UNPREDICTABLE behaviors on page K1-5479. 








C3.2.1 Load/Store register 


The Load/Store register instructions support the following addressing modes: 

• Base plus a scaled 12-bit unsigned immediate offset or base plus an unscaled 9-bit signed immediate offset.
• Base plus a 64-bit register offset, optionally scaled.
• Base plus a 32-bit extended register offset, optionally scaled.
• Pre-indexed by an unscaled 9-bit signed immediate offset.
• Post-indexed by an unscaled 9-bit signed immediate offset.
• PC-relative literal for loads of 32 bits or more.


See also Load/Store addressing modes on page C1-128. 


If a Load instruction specifies writeback and the register being loaded is also the base register, then behavior is 
CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur: 


• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs the load using the specified addressing mode and the base register becomes
UNKNOWN. In addition, if an exception occurs during the execution of such an instruction, the base address
might be corrupted so that the instruction cannot be repeated.

If a Store instruction performs a writeback and the register that is stored is also the base register, then behavior is
CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur:

• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs the store to the designated register using the specified addressing mode, but the
value stored is UNKNOWN.


Table C3-11 shows the Load/Store Register instructions. 


Table C3-11 Load/Store register instructions

Mnemonic  Instruction                              See
LDR       Load register (register offset)          LDR (register) on page C6-555
          Load register (immediate offset)         LDR (immediate) on page C6-550
          Load register (PC-relative literal)      LDR (literal) on page C6-553
LDRB      Load byte (register offset)              LDRB (register) on page C6-559
          Load byte (immediate offset)             LDRB (immediate) on page C6-557
LDRSB     Load signed byte (register offset)       LDRSB (register) on page C6-568
          Load signed byte (immediate offset)      LDRSB (immediate) on page C6-565
LDRH      Load halfword (register offset)          LDRH (register) on page C6-563
          Load halfword (immediate offset)         LDRH (immediate) on page C6-561
LDRSH     Load signed halfword (register offset)   LDRSH (register) on page C6-573
          Load signed halfword (immediate offset)  LDRSH (immediate) on page C6-570
LDRSW     Load signed word (register offset)       LDRSW (register) on page C6-578
          Load signed word (immediate offset)      LDRSW (immediate) on page C6-575
          Load signed word (PC-relative literal)   LDRSW (literal) on page C6-577
STR       Store register (register offset)         STR (register) on page C6-700
          Store register (immediate offset)        STR (immediate) on page C6-697
STRB      Store byte (register offset)             STRB (register) on page C6-704
          Store byte (immediate offset)            STRB (immediate) on page C6-702
STRH      Store halfword (register offset)         STRH (register) on page C6-708
          Store halfword (immediate offset)        STRH (immediate) on page C6-706





C3.2.2 Load/Store register (unscaled offset)


The Load/Store register instructions with an unscaled offset support only one addressing mode: 


• Base plus an unscaled 9-bit signed immediate offset.


See Load/Store addressing modes on page C1-128. 

The Load/Store register (unscaled offset) instructions are required to disambiguate this instruction class from the
Load/Store register instruction forms that support an addressing mode of base plus a scaled, unsigned 12-bit
immediate offset, because that can represent some offset values in the same range.


The ambiguous immediate offsets are byte offsets that are both: 


• In the range 0-255, inclusive.
• Naturally aligned to the access size.


Other byte offsets in the range -256 to 255 inclusive are unambiguous. An assembler program translating a 
Load/Store instruction, for example LDR, is required to encode an unambiguous offset using the unscaled 9-bit offset 
form, and to encode an ambiguous offset using the scaled 12-bit offset form. A programmer might force the 
generation of the unscaled 9-bit form by using one of the mnemonics in Table C3-12. ARM recommends that a 
disassembler outputs all unscaled 9-bit offset forms using one of these mnemonics, but unambiguous offsets can be 
output using a Load/Store single register mnemonic, for example, LDR. 
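
The selection rule above can be sketched in a few lines of Python. This is an illustrative model only: the function name `choose_offset_form` and its return strings are hypothetical, and the sketch assumes the access size is given in bytes and that the scaled form holds the offset divided by the access size in an unsigned 12-bit field.

```python
def choose_offset_form(offset: int, access_size: int) -> str:
    """Pick the immediate-offset form an assembler would use for a byte offset.

    Hypothetical helper: 'access_size' is the transfer size in bytes (1, 2, 4, 8).
    """
    # Scaled form: non-negative, naturally aligned, quotient fits in 12 bits.
    fits_scaled = (
        offset >= 0
        and offset % access_size == 0
        and offset // access_size <= 0xFFF
    )
    # Unscaled form: any byte offset in the signed 9-bit range.
    fits_unscaled = -256 <= offset <= 255

    if fits_scaled:
        # Covers the ambiguous offsets too: they take the scaled 12-bit form.
        return "scaled 12-bit"
    if fits_unscaled:
        return "unscaled 9-bit"
    raise ValueError(f"offset {offset} has no immediate-offset encoding")
```

For an 8-byte access, an offset of 8 is ambiguous and is given the scaled form, whereas offsets of 1 or -8 can only use the unscaled form.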


Table C3-12 shows the Load/Store register instructions with an unscaled offset. 


Table C3-12 Load/Store register (unscaled offset) instructions

Mnemonic  Instruction                              See
LDUR      Load register (unscaled offset)          LDUR on page C6-589
LDURB     Load byte (unscaled offset)              LDURB on page C6-591
LDURSB    Load signed byte (unscaled offset)       LDURSB on page C6-593
LDURH     Load halfword (unscaled offset)          LDURH on page C6-592
LDURSH    Load signed halfword (unscaled offset)   LDURSH on page C6-595
LDURSW    Load signed word (unscaled offset)       LDURSW on page C6-597
STUR      Store register (unscaled offset)         STUR on page C6-714
STURB     Store byte (unscaled offset)             STURB on page C6-715
STURH     Store halfword (unscaled offset)         STURH on page C6-716














C3.2.3 Load/Store Pair 
The Load/Store Pair instructions support the following addressing modes: 
• Base plus a scaled 7-bit signed immediate offset.
• Pre-indexed by a scaled 7-bit signed immediate offset.
• Post-indexed by a scaled 7-bit signed immediate offset.
See also Load/Store addressing modes on page C1-128. 
If a Load Pair instruction specifies the same register for the two registers that are being loaded, then behavior is 
CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur: 
• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs all the loads using the specified addressing mode and the register that is loaded takes
  an UNKNOWN value.

If a Load Pair instruction specifies writeback and one of the registers being loaded is also the base register, then
behavior is CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur:

• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs all of the loads using the specified addressing mode, and the base register becomes
  UNKNOWN. In addition, if an exception occurs during the instruction, the base address might be corrupted so
  that the instruction cannot be repeated.

If a Store Pair instruction performs a writeback and one of the registers being stored is also the base register, then
behavior is CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur:

• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs all the stores of the registers indicated by the specified addressing mode, but the
  value stored for the base register is UNKNOWN.
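
As a rough illustration, the register-overlap conditions for Load/Store Pair described above can be written as two predicates that a tool such as an assembler or linter might apply. The function names are hypothetical. The sketch also assumes, based on the per-instruction descriptions rather than this section, that register number 31 as a base means the stack pointer, so it never overlaps a transfer register.

```python
def load_pair_is_constrained_unpredictable(t: int, t2: int, n: int, writeback: bool) -> bool:
    """True if a Load Pair with transfer registers t, t2 and base n hits a flagged case."""
    if t == t2:
        # Same register named for both loaded values.
        return True
    # Writeback with a loaded register that is also the base register.
    # Assumption: n == 31 selects SP, which cannot alias a transfer register.
    return writeback and n != 31 and n in (t, t2)


def store_pair_is_constrained_unpredictable(t: int, t2: int, n: int, writeback: bool) -> bool:
    """True if a Store Pair with transfer registers t, t2 and base n hits a flagged case."""
    # Storing the same register twice is fine; only base overlap with writeback is flagged.
    return writeback and n != 31 and n in (t, t2)
```

For example, a pre-indexed Load Pair of X0 and X1 through base X0 is flagged, while the same register choice with a plain base-plus-offset address is not.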


Table C3-13 shows the Load/Store Pair instructions. 


Table C3-13 Load/Store Pair instructions

Mnemonic  Instruction              See
LDP       Load Pair                LDP on page C6-544
LDPSW     Load Pair signed words   LDPSW on page C6-547
STP       Store Pair               STP on page C6-694

C3.2.4 Load/Store Non-temporal Pair 
The Load/Store Non-temporal Pair instructions support only one addressing mode: 
• Base plus a scaled 7-bit signed immediate offset.
See Load/Store addressing modes on page C1-128. 
The Load/Store Non-temporal Pair instructions provide a hint to the memory system that an access is non-temporal 
or streaming, and unlikely to be repeated in the near future. This means that data caching is not required. However, 
depending on the memory type, the instructions might permit memory reads to be preloaded and memory writes to 
be gathered to accelerate bulk memory transfers. 
In addition there is an exception to the usual memory ordering rules. If an address dependency exists between two 
memory reads, and a Load Non-temporal Pair instruction generated the second read, then in the absence of any other 
barrier mechanism to achieve order, the memory accesses can be observed in any order by the other observers within 
the shareability domain of the memory addresses being accessed. 
If a Load Non-Temporal Pair instruction specifies the same register for the two registers that are being loaded, then 
behavior is CONSTRAINED UNPREDICTABLE and one of the following must occur: 
• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs all the loads using the specified addressing mode and the register that is loaded takes
  an UNKNOWN value.

Table C3-14 shows the Load/Store Non-temporal Pair instructions.

Table C3-14 Load/Store Non-temporal Pair instructions

Mnemonic  Instruction               See
LDNP      Load Non-temporal Pair    LDNP on page C6-542
STNP      Store Non-temporal Pair   STNP on page C6-692


















































C3.2.5 Load/Store unprivileged 
The Load/Store unprivileged instructions support only one addressing mode: 
• Base plus an unscaled 9-bit signed immediate offset.
See Load/Store addressing modes on page C1-128. 
The Load/Store unprivileged instructions can be used when the PE is at EL1 to perform unprivileged memory 
accesses. If the PE is executing in any other Exception level, then the access permissions for that level apply. 
Table C3-15 shows the Load/Store unprivileged instructions. 
Table C3-15 Load/Store unprivileged instructions

Mnemonic  Instruction                        See
LDTR      Load unprivileged register         LDTR on page C6-580
LDTRB     Load unprivileged byte             LDTRB on page C6-582
LDTRSB    Load unprivileged signed byte      LDTRSB on page C6-584
LDTRH     Load unprivileged halfword         LDTRH on page C6-583
LDTRSH    Load unprivileged signed halfword  LDTRSH on page C6-586
LDTRSW    Load unprivileged signed word      LDTRSW on page C6-588
STTR      Store unprivileged register        STTR on page C6-710
STTRB     Store unprivileged byte            STTRB on page C6-712
STTRH     Store unprivileged halfword        STTRH on page C6-713

C3.2.6 Load-Exclusive/Store-Exclusive 
The Load-Exclusive/Store-Exclusive instructions support only one addressing mode: 
• Base register with no offset.
See Load/Store addressing modes on page C1-128. 
The Load-Exclusive instructions mark the physical address being accessed as an exclusive access. This exclusive 
access mark is checked by the Store-Exclusive instruction, permitting the construction of atomic read-modify-write 
operations on shared memory variables, semaphores, mutexes, and spinlocks. See Synchronization and semaphores 
on page B2-108. 
The Load-Exclusive/Store-Exclusive instructions other than Load-Exclusive pair and Store-Exclusive pair require 
natural alignment, and an unaligned address generates an Alignment fault. Memory accesses generated by 
Load-Exclusive pair or Store-Exclusive pair instructions must be aligned to the size of the pair, otherwise the access 
generates an Alignment fault. When a Store-Exclusive pair succeeds, it causes a single-copy atomic update of the 
entire memory location. 
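
The mark-and-check behavior described above can be pictured with a toy, single-threaded Python model. This is a sketch and not a description of any real exclusive monitor: the class `ToyExclusiveMemory`, its methods, and the `interfere` hook are all hypothetical. The status values, 0 for a successful store and 1 for a failed one, follow the Store-Exclusive instruction descriptions rather than this section.

```python
class ToyExclusiveMemory:
    """Toy model: one monitor that remembers the last exclusively loaded address."""

    def __init__(self):
        self.mem = {}
        self.monitor = None  # address marked by the most recent load-exclusive

    def load_exclusive(self, addr):
        self.monitor = addr  # mark the address as an exclusive access
        return self.mem.get(addr, 0)

    def store_exclusive(self, addr, value):
        """Return 0 if the store was performed, 1 if the mark was lost."""
        if self.monitor != addr:
            return 1
        self.mem[addr] = value
        self.monitor = None
        return 0

    def plain_store(self, addr, value):
        """Models a write by another observer, which clears a matching mark."""
        self.mem[addr] = value
        if self.monitor == addr:
            self.monitor = None


def atomic_add(memory, addr, delta, interfere=None):
    """Read-modify-write retry loop; returns (new_value, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        old = memory.load_exclusive(addr)
        if interfere is not None and attempts == 1:
            interfere()  # simulate a competing write between load and store
        if memory.store_exclusive(addr, old + delta) == 0:
            return old + delta, attempts
```

When a competing write lands between the load and the store, the first store attempt reports failure and the loop repeats with the newly observed value, which is the pattern used to build semaphores and spinlocks.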


Table C3-16 Load-Exclusive/Store-Exclusive instructions

Mnemonic  Instruction                See
LDXR      Load Exclusive register    LDXR on page C6-600
LDXRB     Load Exclusive byte        LDXRB on page C6-601
LDXRH     Load Exclusive halfword    LDXRH on page C6-602
LDXP      Load Exclusive pair        LDXP on page C6-598
STXR      Store Exclusive register   STXR on page C6-720
STXRB     Store Exclusive byte       STXRB on page C6-722
STXRH     Store Exclusive halfword   STXRH on page C6-724
STXP      Store Exclusive pair       STXP on page C6-717





C3.2.7 Load-Acquire/Store-Release 


The Load-Acquire/Store-Release instructions support only one addressing mode: 


• Base register with no offset.


See Load/Store addressing modes on page C1-128. 


The Load-Acquire/Store-Release instructions can remove the requirement to use the explicit DMB memory barrier
instruction. For more information about the ordering of Load-Acquire/Store-Release, see Load-Acquire,
Store-Release on page B2-90.

The Load-Acquire/Store-Release instructions other than Load-Acquire pair and Store-Release pair require natural
alignment, and an unaligned address generates an Alignment fault. Memory accesses generated by Load-Acquire
pair or Store-Release pair instructions must be aligned to the size of the pair, otherwise the access generates an
Alignment fault.


A Store-Release Exclusive instruction only has the Release semantics if the store is successful. 


Table C3-17 shows the Non-exclusive Load-Acquire/Store-Release instructions. 


Table C3-17 Non-exclusive Load-Acquire and Store-Release instructions

Mnemonic  Instruction              See
LDAR      Load-Acquire register    LDAR on page C6-533
LDARB     Load-Acquire byte        LDARB on page C6-534
LDARH     Load-Acquire halfword    LDARH on page C6-535
STLR      Store-Release register   STLR on page C6-680
STLRB     Store-Release byte       STLRB on page C6-681
STLRH     Store-Release halfword   STLRH on page C6-682

Table C3-18 shows the Exclusive Load-Acquire/Store-Release instructions.

Table C3-18 Exclusive Load-Acquire and Store-Release instructions

Mnemonic  Instruction                        See
LDAXR     Load-Acquire Exclusive register    LDAXR on page C6-538
LDAXRB    Load-Acquire Exclusive byte        LDAXRB on page C6-540
LDAXRH    Load-Acquire Exclusive halfword    LDAXRH on page C6-541
LDAXP     Load-Acquire Exclusive pair        LDAXP on page C6-536
STLXR     Store-Release Exclusive register   STLXR on page C6-686
STLXRB    Store-Release Exclusive byte       STLXRB on page C6-688
STLXRH    Store-Release Exclusive halfword   STLXRH on page C6-690
STLXP     Store-Release Exclusive pair       STLXP on page C6-683





C3.2.8 Load/Store scalar SIMD and floating-point 


The Load/Store scalar SIMD and floating-point instructions operate on scalar values in the SIMD and floating-point 
register file as described in SIMD and floating-point scalar register names on page C1-126. The memory addressing 
modes available, described in Load/Store addressing modes on page C1-128, are identical to the general-purpose 
register Load/Store instructions, and like those instructions permit arbitrary address alignment unless strict 
alignment checking is enabled. However, unlike the Load/Store instructions that transfer general-purpose registers, 
Load/Store scalar SIMD and floating-point instructions make no guarantee of atomicity, even when the address is 
naturally aligned to the size of the data. 


Load/Store scalar SIMD and floating-point register 


The Load/Store scalar SIMD and floating-point register instructions support the following addressing modes: 


• Base plus a scaled 12-bit unsigned immediate offset or base plus unscaled 9-bit signed immediate offset.
• Base plus 64-bit register offset, optionally scaled.
• Base plus 32-bit extended register offset, optionally scaled.
• Pre-indexed by an unscaled 9-bit signed immediate offset.
• Post-indexed by an unscaled 9-bit signed immediate offset.
• PC-relative literal for loads of 32 bits or more.


For more information on the addressing modes, see Load/Store addressing modes on page C1-128. 





Note 


The unscaled 9-bit signed immediate offset address mode requires its own instruction form, see Load/Store scalar 
SIMD and floating-point register (unscaled offset) on page C3-153. 

Table C3-19 shows the Load/Store instructions for a single SIMD and floating-point register.

Table C3-19 Load/Store single SIMD and floating-point register instructions

Mnemonic  Instruction                                         See
LDR       Load scalar SIMD&FP register (register offset)      LDR (register, SIMD&FP) on page C7-1099
          Load scalar SIMD&FP register (immediate offset)     LDR (immediate, SIMD&FP) on page C7-1093
          Load scalar SIMD&FP register (PC-relative literal)  LDR (literal, SIMD&FP) on page C7-1097
STR       Store scalar SIMD&FP register (register offset)     STR (register, SIMD&FP) on page C7-1359
          Store scalar SIMD&FP register (immediate offset)    STR (immediate, SIMD&FP) on page C7-1355





Load/Store scalar SIMD and floating-point register (unscaled offset) 

The Load/Store scalar SIMD and floating-point register (unscaled offset) instructions support only one addressing mode:

• Base plus an unscaled 9-bit signed immediate offset.

See also Load/Store addressing modes on page C1-128. 


The Load/Store scalar SIMD and floating-point register (unscaled offset) instructions are required to disambiguate 
this instruction class from the Load/Store single SIMD and floating-point instruction forms that support an 
addressing mode of base plus a scaled, unsigned 12-bit immediate offset. This is similar to the Load/Store register 
(unscaled offset) instructions, that disambiguate this instruction class from the Load/Store register instruction, see 
Load/Store register (unscaled offset) on page C3-147. 


Table C3-20 shows the Load/Store SIMD and floating-point register instructions with an unscaled offset. 


Table C3-20 Load/Store SIMD and floating-point register instructions 











Mnemonic Instruction See 
LDUR Load scalar SIMD&FP register (unscaled offset) LDUR (SIMD&FP) on page C7-1102 
STUR Store scalar SIMD&FP register (unscaled offset) STUR (SIMD&FP) on page C7-1362 





Load/Store SIMD and Floating-point register pair 


The Load/Store SIMD and floating-point register pair instructions support the following addressing modes: 


• Base plus a scaled 7-bit signed immediate offset.
• Pre-indexed by a scaled 7-bit signed immediate offset.
• Post-indexed by a scaled 7-bit signed immediate offset.


See also Load/Store addressing modes on page C1-128. 


If a Load pair instruction specifies the same register for the two registers that are being loaded, then behavior is 
CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur: 


• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs all of the loads using the specified addressing mode and the register being loaded
  takes an UNKNOWN value.

Table C3-21 shows the Load/Store SIMD and floating-point register pair instructions.

Table C3-21 Load/Store SIMD and floating-point register pair instructions

Mnemonic  Instruction                             See
LDP       Load pair of scalar SIMD&FP registers   LDP (SIMD&FP) on page C7-1090
STP       Store pair of scalar SIMD&FP registers  STP (SIMD&FP) on page C7-1352





Load/Store SIMD and Floating-point Non-temporal pair 

The Load/Store SIMD and Floating-point Non-temporal pair instructions support only one addressing mode: 
• Base plus a scaled 7-bit signed immediate offset.

See also Load/Store addressing modes on page C1-128. 


The Load/Store Non-temporal pair instructions provide a hint to the memory system that an access is non-temporal 
or streaming, and unlikely to be repeated in the near future. This means that data caching is not required. However, 
depending on the memory type, the instructions might permit memory reads to be preloaded and memory writes to 
be gathered to accelerate bulk memory transfers. 


In addition there is an exception to the usual memory ordering rules. If an address dependency exists between two 
memory reads, and a Load non-temporal pair instruction generated the second read, then in the absence of any other 
barrier mechanism to achieve order, those memory accesses can be observed in any order by the other observers 
within the shareability domain of the memory addresses being accessed. 


If a Load Non-temporal pair instruction specifies the same register for the two registers that are being loaded, then 
behavior is CONSTRAINED UNPREDICTABLE and one of the following behaviors must occur: 


• The instruction is treated as UNDEFINED.
• The instruction is treated as a NOP.
• The instruction performs all the loads using the specified addressing mode and the register that is loaded takes
  an UNKNOWN value.


Table C3-22 shows the Load/Store SIMD and floating-point Non-temporal pair instructions. 


Table C3-22 Load/Store SIMD and floating-point Non-temporal pair instructions

Mnemonic  Instruction                             See
LDNP      Load pair of scalar SIMD&FP registers   LDNP (SIMD&FP) on page C7-1088
STNP      Store pair of scalar SIMD&FP registers  STNP (SIMD&FP) on page C7-1350





C3.2.9 Load/Store Vector

The Vector Load/Store structure instructions support the following addressing modes:

• Base register only.
• Post-indexed by a 64-bit register.
• Post-indexed by an immediate, equal to the number of bytes transferred.


Load/Store vector instructions, like other Load/Store instructions, allow any address alignment, unless strict 
alignment checking is enabled. If strict alignment checking is enabled, then alignment checking to the size of the 
element is performed. However, unlike the Load/Store instructions that transfer general-purpose registers, the 
Load/Store vector instructions do not guarantee atomicity, even when the address is naturally aligned to the size of 
the element. 

Table C3-23 shows the Load/Store structure instructions. A post-increment immediate offset, if present, must be 8,
16, 24, 32, 48, or 64, depending on the number of elements transferred.

Table C3-23 Load/Store multiple structures instructions

Mnemonic  Instruction                                                            See
LD1       Load single 1-element structure to one lane of one register            LD1 (single structure) on page C7-1051
          Load multiple 1-element structures to one register or to two, three    LD1 (multiple structures) on page C7-1047
          or four consecutive registers
LD2       Load single 2-element structure to one lane of two consecutive         LD2 (single structure) on page C7-1061
          registers
          Load multiple 2-element structures to two consecutive registers        LD2 (multiple structures) on page C7-1058
LD3       Load single 3-element structure to one lane of three consecutive       LD3 (single structure) on page C7-1071
          registers
          Load multiple 3-element structures to three consecutive registers      LD3 (multiple structures) on page C7-1068
LD4       Load single 4-element structure to one lane of four consecutive        LD4 (single structure) on page C7-1081
          registers
          Load multiple 4-element structures to four consecutive registers       LD4 (multiple structures) on page C7-1078
ST1       Store single 1-element structure from one lane of one register         ST1 (single structure) on page C7-1325
          Store multiple 1-element structures from one register, or from two,    ST1 (multiple structures) on page C7-1321
          three or four consecutive registers
ST2       Store single 2-element structure from one lane of two consecutive      ST2 (single structure) on page C7-1332
          registers
          Store multiple 2-element structures from two consecutive registers     ST2 (multiple structures) on page C7-1329
ST3       Store single 3-element structure from one lane of three consecutive    ST3 (single structure) on page C7-1339
          registers
          Store multiple 3-element structures from three consecutive registers   ST3 (multiple structures) on page C7-1336
ST4       Store single 4-element structure from one lane of four consecutive     ST4 (single structure) on page C7-1346
          registers
          Store multiple 4-element structures from four consecutive registers    ST4 (multiple structures) on page C7-1343

Load single structure and replicate


Table C3-24 shows the Load single structure and replicate instructions. A post-increment immediate offset, if 
present, must be 1, 2, 3, 4, 6, 8, 12, 16, 24, or 32, depending on the number of elements transferred. 


Table C3-24 Load single structure and replicate instructions

Mnemonic  Instruction                                                                    See
LD1R      Load single 1-element structure and replicate to all lanes of one register     LD1R on page C7-1055
LD2R      Load single 2-element structure and replicate to all lanes of two registers    LD2R on page C7-1065
LD3R      Load single 3-element structure and replicate to all lanes of three registers  LD3R on page C7-1075
LD4R      Load single 4-element structure and replicate to all lanes of four registers   LD4R on page C7-1085
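
The permitted post-increment immediates follow from the rule that the immediate equals the number of bytes transferred. The Python sketch below reproduces both lists under two assumptions that come from the A64 vector register and element sizes rather than from this section: a multiple-structure transfer moves one to four registers of 8 or 16 bytes each, and a replicate load reads one to four elements of 1, 2, 4, or 8 bytes each. The function names are hypothetical.

```python
def multiple_structure_post_increments():
    """Bytes moved by LDn/STn (multiple structures): registers x register size."""
    return sorted({regs * reg_bytes for regs in (1, 2, 3, 4) for reg_bytes in (8, 16)})


def replicate_post_increments():
    """Bytes read by LDnR: elements in the structure x element size."""
    return sorted({elems * elem_bytes for elems in (1, 2, 3, 4) for elem_bytes in (1, 2, 4, 8)})
```

Under these assumptions the first function yields 8, 16, 24, 32, 48, and 64, and the second yields 1, 2, 3, 4, 6, 8, 12, 16, 24, and 32, matching the values listed for the two instruction groups.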





C3.2.10 Prefetch memory 


The Prefetch memory instructions support the following addressing modes: 

• Base plus a scaled 12-bit unsigned immediate offset or base plus an unscaled 9-bit signed immediate offset.
• Base plus a 64-bit register offset. This can be optionally scaled by 8, for example LSL #3.
• Base plus a 32-bit extended register offset. This can be optionally scaled by 8.
• PC-relative literal.


The prefetch memory instructions signal to the memory system that memory accesses from a specified address are
likely to occur in the near future. The memory system can respond by taking actions that are expected to speed up
the memory accesses when they do occur, such as preloading the specified address into one or more caches. Because
these signals are only hints, it is valid for the PE to treat any or all prefetch instructions as a NOP.


Because they are hints to the memory system, the operation of a PRFM instruction cannot cause a synchronous 
exception. However, a memory operation performed as a result of one of these memory system hints might in 
exceptional cases trigger an asynchronous event, and thereby influence the execution of the PE. An example of an 
asynchronous event that might be triggered is an SError interrupt. 


A PRFM instruction can only have an effect on software visible structures, such as caches and translation lookaside 
buffers associated with memory locations that can be accessed by reads, writes, or execution as defined in the 
translation regime of the current Exception level. 


A PRFM instruction is guaranteed not to access Device memory. 


A PRFM instruction using a PLI hint must not result in any access that could not be performed by the PE speculatively 
fetching an instruction. Therefore, if all associated MMUs are disabled, a PLI hint cannot access any memory 
location that cannot be accessed by instruction fetches. 


The PRFM instructions require an additional <prfop> operand to be specified, which must be one of the following:

PLDL1KEEP, PLDL1STRM, PLDL2KEEP, PLDL2STRM, PLDL3KEEP, PLDL3STRM
PSTL1KEEP, PSTL1STRM, PSTL2KEEP, PSTL2STRM, PSTL3KEEP, PSTL3STRM
PLIL1KEEP, PLIL1STRM, PLIL2KEEP, PLIL2STRM, PLIL3KEEP, PLIL3STRM

<prfop> is defined as <type><target><policy>.

Here:

<type>      Is one of:
            PLD   Prefetch for load.
            PST   Prefetch for store.
            PLI   Preload instructions.

<target>    Is one of:
            L1    Level 1 cache.
            L2    Level 2 cache.
            L3    Level 3 cache.

<policy>    Is one of:
            KEEP  Retained or temporal prefetch, allocated in the cache normally.
            STRM  Streaming or non-temporal prefetch, for data that is used only once.
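
As a small illustration of the <type><target><policy> composition, the following Python sketch enumerates the permitted names and splits a name back into its three parts. The helper names `all_prfops` and `parse_prfop` are hypothetical, and the sketch deals only with operand names, not with their binary encoding.

```python
TYPES = ("PLD", "PST", "PLI")   # prefetch for load, prefetch for store, preload instructions
TARGETS = ("L1", "L2", "L3")    # cache level
POLICIES = ("KEEP", "STRM")     # temporal or streaming


def all_prfops():
    """Every <type><target><policy> combination, in the order listed above."""
    return [t + g + p for t in TYPES for g in TARGETS for p in POLICIES]


def parse_prfop(name: str):
    """Split a prefetch operation name into (type, target, policy)."""
    name = name.upper()
    type_, target, policy = name[:3], name[3:5], name[5:]
    if type_ not in TYPES or target not in TARGETS or policy not in POLICIES:
        raise ValueError(f"not a valid prefetch operation: {name!r}")
    return type_, target, policy
```

Three types, three targets, and two policies give the 18 names listed above.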


PRFUM explicitly uses the unscaled 9-bit signed immediate offset addressing mode, as described in Load/Store
register (unscaled offset) on page C3-147.


Table C3-25 shows the Prefetch memory instructions. 


Table C3-25 Prefetch memory instructions

Mnemonic  Instruction                            See
PRFM      Prefetch memory (register offset)      PRFM (register) on page C6-648
          Prefetch memory (immediate offset)     PRFM (immediate) on page C6-644
          Prefetch memory (PC-relative offset)   PRFM (literal) on page C6-646
PRFUM     Prefetch memory (unscaled offset)      PRFM (unscaled offset) on page C6-650

C3.3 Data processing - immediate

This section describes the instruction groups for data processing with immediate operands. It contains the following
subsections:

• Arithmetic (immediate).
• Logical (immediate) on page C3-159.
• Move (wide immediate) on page C3-159.
• Move (immediate) on page C3-160.
• PC-relative address calculation on page C3-160.
• Bitfield move on page C3-161.
• Bitfield insert and extract on page C3-161.
• Extract register on page C3-161.
• Shift (immediate) on page C3-162.
• Sign-extend and Zero-extend on page C3-162.

For information about the encoding structure of the instructions in this instruction group, see Data processing - 
immediate on page C4-193. 
C3.3.1 Arithmetic (immediate) 
The Arithmetic (immediate) instructions accept a 12-bit unsigned immediate value, optionally shifted left by 12 bits. 
The Arithmetic (immediate) instructions that do not set condition flags can read from and write to the current stack 
pointer. The flag setting instructions can read from the stack pointer, but they cannot write to it. 
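
To make the immediate rule concrete, the following Python sketch tests whether a constant fits the Arithmetic (immediate) form. The function name `encode_arith_immediate` and its `(imm12, shift)` return shape are hypothetical and chosen only for illustration.

```python
def encode_arith_immediate(value: int):
    """Return (imm12, shift) if value is a 12-bit immediate, optionally shifted left by 12.

    Returns None when the value cannot be expressed in either form.
    """
    if 0 <= value <= 0xFFF:
        return (value, 0)
    if value > 0 and value & 0xFFF == 0 and (value >> 12) <= 0xFFF:
        return (value >> 12, 12)
    return None
```

For example, 4095 and 4096 are both encodable, as (4095, 0) and (1, 12) respectively, but 4097 is not, because it needs bits from both halves.
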
Table C3-26 shows the Arithmetic instructions with an immediate offset. 
Table C3-26 Arithmetic instructions with an immediate 
Mnemonic Instruction See
ADD Add ADD (immediate) on page C6-439 
ADDS Add and set flags ADDS (immediate) on page C6-445 
SUB Subtract SUB (immediate) on page C6-728 
SUBS Subtract and set flags SUBS (immediate) on page C6-734 
CMP Compare CMP (immediate) on page C6-494 
CMN Compare negative CMN (immediate) on page C6-489 

C3.3.2 Logical (immediate)

The Logical (immediate) instructions accept a bitmask immediate value that is a 32-bit pattern or a 64-bit pattern
viewed as a vector of identical elements of size e = 2, 4, 8, 16, 32, or 64 bits. Each element contains the same
sub-pattern, that is a single run of 1 to (e - 1) nonzero bits from bit 0 followed by zero bits, then rotated by 0 to
(e - 1) bits. This mechanism can generate 5334 unique 64-bit patterns as 2667 pairs of a pattern and its bitwise
inverse.


Note 


Values that consist of only zeros or only ones cannot be described in this way. 
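
The pattern count can be checked by brute force. The Python sketch below builds every bitmask immediate directly from the description: it picks an element size, a run length, and a rotation, then replicates the element across the register. The function name `bitmask_immediates` is hypothetical, and the direction of rotation is an arbitrary choice that does not affect the resulting set.

```python
def bitmask_immediates(width: int = 64):
    """Return the set of all bitmask immediate values for a 32-bit or 64-bit register."""
    patterns = set()
    e = 2
    while e <= width:
        element_mask = (1 << e) - 1
        for ones in range(1, e):            # run of 1 to (e - 1) set bits from bit 0
            run = (1 << ones) - 1
            for rot in range(e):            # rotate by 0 to (e - 1) bits within the element
                element = ((run >> rot) | (run << (e - rot))) & element_mask
                value = 0
                for i in range(width // e):  # replicate the element across the register
                    value |= element << (i * e)
                patterns.add(value)
        e *= 2
    return patterns
```

Counting the set for `width=64` gives 5334 values. All-zeros and all-ones are absent, and the bitwise inverse of every member is also a member, which is where the 2667 pairs come from.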








The Logical (immediate) instructions that do not set the condition flags can write to the current stack pointer, for 
example to align the stack pointer in a function prologue. 


Note 


Apart from ANDS, and its TST alias, Logical (immediate) instructions do not set the condition flags. However, the final 
results of a bitwise operation can be tested by a CBZ, CBNZ, TBZ, or TBNZ conditional branch. 








Table C3-27 shows the Logical immediate instructions. 


Table C3-27 Logical immediate instructions 





Mnemonic Instruction See 





AND Bitwise AND AND (immediate) on page C6-451 





ANDS Bitwise AND and set flags ANDS (immediate) on page C6-454 





EOR Bitwise exclusive OR EOR (immediate) on page C6-522 





ORR Bitwise inclusive OR ORR (immediate) on page C6-640 





TST Test bits TST (immediate) on page C6-748 





C3.3.3 Move (wide immediate)


The Move (wide immediate) instructions insert a 16-bit immediate, or inverted immediate, into a 16-bit aligned 
position in the destination register. The value of the other bits in the destination register depends on the variant used. 
The optional shift amount can be any multiple of 16 that is smaller than the register size. 
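
The effect of each variant on the rest of the destination register can be modeled in Python as follows. This is a sketch for 64-bit registers only; the function names mirror the mnemonics in Table C3-28 for readability, and the behavior of the remaining bits (zeroed by MOVZ, set by MOVN, preserved by MOVK) is taken from the individual instruction descriptions rather than from this section.

```python
MASK64 = (1 << 64) - 1


def _check(imm16: int, shift: int) -> None:
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    if shift not in (0, 16, 32, 48):
        raise ValueError("shift must be a multiple of 16 below the register size")


def movz(imm16: int, shift: int = 0) -> int:
    """Insert imm16 at the shifted position; all other bits are zero."""
    _check(imm16, shift)
    return (imm16 << shift) & MASK64


def movn(imm16: int, shift: int = 0) -> int:
    """Insert the inverted immediate; all other bits are one."""
    _check(imm16, shift)
    return ~(imm16 << shift) & MASK64


def movk(reg: int, imm16: int, shift: int = 0) -> int:
    """Replace one 16-bit field of reg and keep the other bits."""
    _check(imm16, shift)
    return (reg & ~(0xFFFF << shift) & MASK64) | (imm16 << shift)
```

An arbitrary 64-bit constant can then be built with one `movz` followed by up to three `movk` steps, one for each remaining 16-bit field.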


Table C3-28 shows the Move (wide immediate) instructions. 


Table C3-28 Move (wide immediate) instructions 





Mnemonic Instruction See 





MOVZ Move wide with zero MOVZ on page C6-620 





MOVN Move wide with NOT MOVN on page C6-618





MOVK Move wide with keep MOVK on page C6-617 

C3.3.4 Move (immediate)


The Move (immediate) instructions are aliases for a single MOVZ, MOVN, or ORR (immediate with zero register)
instruction to load an immediate value into the destination register. An assembler must permit a signed or unsigned
immediate, as long as its binary representation can be generated using one of these instructions, and an assembler
error results if the immediate cannot be generated in this way. On disassembly it is unspecified whether the
immediate is output as a signed or an unsigned value.

If there is a choice between the MOVZ, MOVN, and ORR instructions to encode the immediate, then an assembler must
prefer MOVZ to MOVN, and MOVZ or MOVN to ORR, to ensure reversibility. A disassembler must output ORR (immediate with
zero register), MOVZ, and MOVN as a MOV mnemonic, except that the underlying instruction must be used when:

• ORR has an immediate that can be generated by a MOVZ or MOVN instruction.
• A MOVN instruction has an immediate that can be encoded by MOVZ.
• MOVZ #0 or MOVN #0 have a shift amount other than LSL #0.
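
The assembler-side preference order can be sketched in Python as below. The names `choose_mov_encoding` and `is_bitmask_immediate` are hypothetical, and the sketch is a simplification: it reports only which underlying instruction would be selected, does not compute encoding fields, and does not model any per-instruction encoding restriction beyond what this section states.

```python
def is_bitmask_immediate(value: int, width: int = 64) -> bool:
    """True if value is a replicated element holding one rotated run of set bits."""
    full = (1 << width) - 1
    if value in (0, full):
        return False
    e = 2
    while e <= width:
        emask = (1 << e) - 1
        element = value & emask
        if all(((value >> i) & emask) == element for i in range(0, width, e)):
            # Count cyclic 1-to-0 boundaries; exactly one means a single run.
            rotated = ((element >> 1) | (element << (e - 1))) & emask
            if bin(element & ~rotated & emask).count("1") == 1:
                return True
        e *= 2
    return False


def choose_mov_encoding(value: int, width: int = 64):
    """Return 'MOVZ', 'MOVN', 'ORR', or None, preferring MOVZ, then MOVN, then ORR."""
    mask = (1 << width) - 1
    value &= mask  # accept signed or unsigned input

    def single_halfword(v: int) -> bool:
        # True if all set bits of v fall within one 16-bit aligned field.
        return any((v & ~(0xFFFF << s) & mask) == 0 for s in range(0, width, 16))

    if single_halfword(value):
        return "MOVZ"
    if single_halfword(~value & mask):
        return "MOVN"
    if is_bitmask_immediate(value, width):
        return "ORR"
    return None
```

A value such as 0xFFFF is both a valid MOVZ immediate and a valid bitmask immediate, so the preference order decides the result, here MOVZ.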


Table C3-29 shows the Move (immediate) instructions. 


Table C3-29 Move (immediate) instructions 





Mnemonic Instruction See





MOV Move (inverted wide immediate) MOV (inverted wide immediate) on page C6-613 





Move (wide immediate) MOV (wide immediate) on page C6-614 





Move (bitmask immediate) MOV (bitmask immediate) on page C6-615 





C3.3.5 PC-relative address calculation


The ADR instruction adds a signed, 21-bit immediate to the value of the program counter that fetched this instruction,
and then writes the result to a general-purpose register. This permits the calculation of any byte address within
±1MB of the current PC.

The ADRP instruction shifts a signed, 21-bit immediate left by 12 bits, adds it to the value of the program counter with
the bottom 12 bits cleared to zero, and then writes the result to a general-purpose register. This permits the
calculation of the address at a 4KB aligned memory region. In conjunction with an ADD (immediate) instruction, or
a Load/Store instruction with a 12-bit immediate offset, this allows for the calculation of, or access to, any address
within ±4GB of the current PC.


Note 


The term page used in the ADRP description is short-hand for the 4KB memory region, and is not related to the virtual 
memory translation granule size. 
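
The two address calculations can be modeled as follows. This is a minimal Python sketch of the arithmetic described above. The helper names and the representation of the immediate as a raw 21-bit field are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def adr(pc, imm21):
    """ADR: PC plus a signed 21-bit byte offset."""
    return (pc + sign_extend(imm21, 21)) & MASK64


def adrp(pc, imm21):
    """ADRP: 4KB-aligned PC plus a signed 21-bit offset counted in 4KB units."""
    base = pc & ~0xFFF
    return (base + (sign_extend(imm21, 21) << 12)) & MASK64


def address_via_adrp_add(pc, target):
    """Form target as ADRP followed by ADD of the low 12 bits."""
    page_delta = (target >> 12) - (pc >> 12)
    return adrp(pc, page_delta & 0x1FFFFF) + (target & 0xFFF)
```

In this model, `address_via_adrp_add` reproduces `target` whenever the page delta fits in the signed 21-bit field.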








Table C3-30 shows the instructions used for PC-relative address calculations.

Table C3-30 PC-relative address calculation instructions

Mnemonic  Instruction                                          See
ADRP      Compute address of 4KB page at a PC-relative offset  ADRP on page C6-450
ADR       Compute address of label at a PC-relative offset     ADR on page C6-449
C3.3.6 Bitfield move

The Bitfield move instructions copy a field of constant width from bit 0 in the source register to a constant bit
position in the destination register, or from a constant bit position in the source register to bit 0 in the destination
register. The remaining bits in the destination register are set as follows:
• For BFM the remaining bits are unchanged.
• For UBFM the lower bits, if any, and upper bits, if any, are set to zero.
• For SBFM the lower bits, if any, are set to zero, and the upper bits, if any, are set to a copy of the
  most-significant bit in the copied field.
Table C3-31 shows the Bitfield move instructions. 
Table C3-31 Bitfield move instructions 
Mnemonic  Instruction                      See
BFM       Bitfield move                    BFM on page C6-465
SBFM      Signed bitfield move             SBFM on page C6-668
UBFM      Unsigned bitfield move (32-bit)  UBFM on page C6-752
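
The treatment of the remaining destination bits can be illustrated with a simplified Python model. It takes the field position and width directly, rather than the encoded rotate and size fields of the real instructions. The function name and parameters are hypothetical.

```python
def bitfield_move(op, dst, src, src_lsb, dst_lsb, width, regsize=64):
    """Copy `width` bits from src[src_lsb...] to dst[dst_lsb...].

    op selects how the other destination bits are set:
    "BFM" keeps them, "UBFM" zeroes them, "SBFM" zeroes the lower bits
    and fills the upper bits with the top bit of the copied field.
    """
    reg_mask = (1 << regsize) - 1
    field = (src >> src_lsb) & ((1 << width) - 1)
    placed = field << dst_lsb
    field_mask = ((1 << width) - 1) << dst_lsb
    if op == "BFM":
        return ((dst & ~field_mask) | placed) & reg_mask
    if op == "UBFM":
        return placed
    if op == "SBFM":
        sign = (field >> (width - 1)) & 1
        upper = reg_mask & ~((1 << (dst_lsb + width)) - 1) if sign else 0
        return placed | upper
    raise ValueError(op)
```

For instance, extracting the 8-bit field at bit 4 of 0xABCD gives 0xBC with the unsigned form and 0xFFFFFFFFFFFFFFBC with the signed form.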
C3.3.7 Bitfield insert and extract 
The Bitfield insert and extract instructions are implemented as aliases of the Bitfield move instructions. Table C3-32 
shows the Bitfield insert and extract aliases. 
Table C3-32 Bitfield insert and extract instructions 
Mnemonic  Instruction                       See
BFI       Bitfield insert                   BFI on page C6-464
BFXIL     Bitfield extract and insert low   BFXIL on page C6-467
SBFIZ     Signed bitfield insert in zero    SBFIZ on page C6-667
SBFX      Signed bitfield extract           SBFX on page C6-670
UBFIZ     Unsigned bitfield insert in zero  UBFIZ on page C6-751
UBFX      Unsigned bitfield extract         UBFX on page C6-754
C3.3.8 Extract register 
Depending on the register width of the operands, the Extract register instruction copies a 32-bit or 64-bit field from 
a constant bit position within a double-width value formed by the concatenation of a pair of source registers to a 
destination register. 
Table C3-33 shows the Extract register instructions.
Table C3-33 Extract register instructions
Mnemonic  Instruction                 See
EXTR      Extract register from pair  EXTR on page C6-526
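
A short Python sketch of the extraction follows. It assumes the first source register supplies the upper half of the double-width value. The rotate helper shows how a rotate right by a constant falls out when both source registers are the same. The function names are illustrative.

```python
def extr(rn, rm, lsb, regsize=64):
    """Extract regsize bits starting at bit `lsb` of the concatenation rn:rm."""
    mask = (1 << regsize) - 1
    concat = ((rn & mask) << regsize) | (rm & mask)
    return (concat >> lsb) & mask


def ror_immediate(rs, shift, regsize=64):
    """Rotate right by a constant, expressed as EXTR with both sources equal."""
    return extr(rs, rs, shift, regsize)
```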
C3.3.9 Shift (immediate)


Shifts and rotates by a constant amount are implemented as aliases of the Bitfield move or Extract register
instructions. The shift or rotate amount must be in the range 0 to one less than the register width of the instruction,
inclusive.


Table C3-34 shows the aliases that can be used as immediate shift and rotate instructions. 


Table C3-34 Aliases for immediate shift and rotate instructions

Mnemonic  Instruction             See
ASR       Arithmetic shift right  ASR (immediate) on page C6-459
LSL       Logical shift left      LSL (immediate) on page C6-604
LSR       Logical shift right     LSR (immediate) on page C6-607
ROR       Rotate right            ROR (immediate) on page C6-660
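
The alias relationship for the three shifts can be sketched in Python. The model below assumes the usual encodings (LSL and LSR as UBFM, ASR as SBFM, each with a rotate field `immr` and a size field `imms`). It is a behavioral illustration with hypothetical function names, not the architectural pseudocode.

```python
def _bitfield(src, immr, imms, regsize, signed):
    mask = (1 << regsize) - 1
    src &= mask
    if imms >= immr:
        # Field src[imms:immr] moves down to bit 0.
        width, dst_lsb = imms - immr + 1, 0
        field = (src >> immr) & ((1 << width) - 1)
    else:
        # Field src[imms:0] moves up to bit (regsize - immr).
        width, dst_lsb = imms + 1, regsize - immr
        field = src & ((1 << width) - 1)
    result = field << dst_lsb
    if signed and field >> (width - 1):
        result |= mask & ~((1 << (dst_lsb + width)) - 1)
    return result


def ubfm(src, immr, imms, regsize=64):
    return _bitfield(src, immr, imms, regsize, signed=False)


def sbfm(src, immr, imms, regsize=64):
    return _bitfield(src, immr, imms, regsize, signed=True)


def lsl(src, n, regsize=64):
    return ubfm(src, (-n) % regsize, regsize - 1 - n, regsize)


def lsr(src, n, regsize=64):
    return ubfm(src, n, regsize - 1, regsize)


def asr(src, n, regsize=64):
    return sbfm(src, n, regsize - 1, regsize)
```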





C3.3.10 Sign-extend and Zero-extend 


The Sign-extend and Zero-extend instructions are implemented as aliases of the Bitfield move instructions. 


Table C3-35 shows the aliases that can be used as zero-extend and sign-extend instructions. 


Table C3-35 Zero-extend and sign-extend instructions

Mnemonic  Instruction               See
SXTB      Sign-extend byte          SXTB on page C6-739
SXTH      Sign-extend halfword      SXTH on page C6-740
SXTW      Sign-extend word          SXTW on page C6-741
UXTB      Unsigned extend byte      UXTB on page C6-761
UXTH      Unsigned extend halfword  UXTH on page C6-762
C3.4 Data processing - register


This section describes the instruction groups for data processing with all register operands. It contains the following
subsections:

• Arithmetic (shifted register).
• Arithmetic (extended register) on page C3-164.
• Arithmetic with carry on page C3-165.
• Logical (shifted register) on page C3-165.
• Move (register) on page C3-166.
• Shift (register) on page C3-166.
• Multiply and divide on page C3-167.
• CRC32 on page C3-168.
• Bit operation on page C3-169.
• Conditional select on page C3-169.
• Conditional comparison on page C3-170.


For information about the encoding structure of the instructions in this instruction group, see Data processing - 
register on page C4-224. 


C3.4.1 Arithmetic (shifted register) 


The Arithmetic (shifted register) instructions apply an optional shift operator to the second source register value 
before performing the arithmetic operation. The register width of the instruction controls whether the new bits are 
fed into the intermediate result on a right shift or rotate at bit[63] or bit[31]. 


The shift operators LSL, ASR and LSR accept an immediate shift amount in the range 0 to one less than the register 
width of the instruction, inclusive. 


Omitting the shift operator implies LSL #0, which means that there is no shift. A disassembler must not output LSL 
#0. However, a disassembler must output all other shifts by zero. 
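
The effect of the register width on a shifted operand can be seen in a small Python model. It is a sketch under the assumption that registers are represented as unsigned integers. The function names are illustrative.

```python
def shift_operand(value, op, amount, regsize=64):
    """Apply LSL, LSR, or ASR to the second source operand."""
    if not 0 <= amount < regsize:
        raise ValueError("shift amount out of range for this register width")
    mask = (1 << regsize) - 1
    value &= mask
    if op == "LSL":
        return (value << amount) & mask
    if op == "LSR":
        return value >> amount          # zeros enter at the top bit
    if op == "ASR":
        # Copies of bit[regsize-1] enter at the top bit.
        signed = value - (1 << regsize) if value >> (regsize - 1) else value
        return (signed >> amount) & mask
    raise ValueError(op)


def add_shifted(rn, rm, op="LSL", amount=0, regsize=64):
    """ADD (shifted register): rn plus the shifted rm, truncated to regsize."""
    mask = (1 << regsize) - 1
    return (rn + shift_operand(rm, op, amount, regsize)) & mask
```

With the same bit pattern 0x80000000, an ASR by 4 gives 0xF8000000 in the 32-bit form, where bit[31] is the sign, but 0x08000000 in the 64-bit form, where bit[63] is zero.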


The current stack pointer, SP or WSP, cannot be used with this class of instructions. See Arithmetic (extended 
register) on page C3-164 for arithmetic instructions that can operate on the current stack pointer. 


Table C3-36 shows the Arithmetic (shifted register) instructions. 


Table C3-36 Arithmetic (shifted register) instructions

Mnemonic  Instruction             See
ADD       Add                     ADD (shifted register) on page C6-441
ADDS      Add and set flags       ADDS (shifted register) on page C6-447
SUB       Subtract                SUB (shifted register) on page C6-730
SUBS      Subtract and set flags  SUBS (shifted register) on page C6-736
CMN       Compare negative        CMN (shifted register) on page C6-490
CMP       Compare                 CMP (shifted register) on page C6-495
NEG       Negate                  NEG (shifted register) on page C6-631
NEGS      Negate and set flags    NEGS on page C6-633
C3.4.2 Arithmetic (extended register)


The extended register instructions provide an optional sign-extension or zero-extension of a portion of the second 
source register value, followed by an optional left shift by a constant amount of 1-4, inclusive. 


The extended shift is described by the mandatory extend operator SXTB, SXTH, SXTW, UXTB, UXTH, or UXTW. This is 
followed by an optional left shift amount. If the shift amount is not specified, the default shift amount is zero. A 
disassembler must not output a shift amount of zero. 


For 64-bit instruction forms the additional operators UXTX and SXTX use all 64 bits of the second source register with 
an optional shift. In that case ARM recommends UXTX as the operator. If and only if at least one register is SP, ARM 
recommends use of the LSL operator name, rather than UXTX, and when the shift amount is also zero then both the 
operator and the shift amount can be omitted. 


For 32-bit instruction forms the operators UXTW and SXTW both use all 32 bits of the second source register with an 
optional shift. In that case ARM recommends UXTW as the operator. If and only if at least one register is WSP, ARM 
recommends use of the LSL operator name, rather than UXTW, and when the shift amount is also zero then both the 
operator and the shift amount can be omitted. 


The non-flag setting variants of the extended register instruction permit the use of the current stack pointer as either
or both of the destination register and the first source register. The flag setting variants only permit the stack
pointer to be used as the first source register.


In the 64-bit form of these instructions the final register operand is written as Wm for all except the UXTX/LSL and SXTX 
extend operators. For example: 


CMP X4, W5, SXTW 
ADD X1, X2, W3, UXTB #2 
SUB SP, SP, X1 // SUB SP, SP, X1, UXTX #0 
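
The extend-then-shift step applied to the second source operand can be modeled as below. This Python sketch is illustrative. The table and function names are hypothetical, and a shift of 0 is accepted as the default described above.

```python
# Extend operator -> (number of source bits used, signed?)
EXTENDS = {
    "UXTB": (8, False), "UXTH": (16, False), "UXTW": (32, False), "UXTX": (64, False),
    "SXTB": (8, True), "SXTH": (16, True), "SXTW": (32, True), "SXTX": (64, True),
}


def extend_operand(rm, extend, shift=0, regsize=64):
    """Extend the low bits of rm, then shift left by 0 to 4."""
    if not 0 <= shift <= 4:
        raise ValueError("left shift amount must be in the range 0 to 4")
    bits, signed = EXTENDS[extend]
    bits = min(bits, regsize)
    value = rm & ((1 << bits) - 1)
    if signed and value >> (bits - 1):
        value -= 1 << bits
    return (value << shift) & ((1 << regsize) - 1)


def add_extended(rn, rm, extend, shift=0, regsize=64):
    """ADD (extended register) on unsigned register images."""
    return (rn + extend_operand(rm, extend, shift, regsize)) & ((1 << regsize) - 1)
```

For example, `add_extended(x2, w3, "UXTB", 2)` mirrors the second assembly example above: only the low byte of the W register takes part, scaled by 4.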


Table C3-37 shows the Arithmetic (extended register) instructions. 


Table C3-37 Arithmetic (extended register) instructions

Mnemonic  Instruction             See
ADD       Add                     ADD (extended register) on page C6-437
ADDS      Add and set flags       ADDS (extended register) on page C6-443
SUB       Subtract                SUB (extended register) on page C6-726
SUBS      Subtract and set flags  SUBS (extended register) on page C6-732
CMN       Compare negative        CMN (extended register) on page C6-487
CMP       Compare                 CMP (extended register) on page C6-492
C3.4.3 Arithmetic with carry


The Arithmetic with carry instructions accept two source registers, with the carry flag as an additional input to the 
calculation. They do not support shifting of the second source register. 
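
A typical use of the carry input is multi-word arithmetic. The Python sketch below models a 128-bit addition built from a flag-setting add followed by an add with carry. It assumes that subtract with carry computes the first operand plus the inverted second operand plus the carry flag. The function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def adds(a, b):
    """ADDS: return (result, carry_out) for a 64-bit addition."""
    total = (a & MASK64) + (b & MASK64)
    return total & MASK64, total >> 64


def adc(a, b, carry):
    """ADC: a + b + C."""
    return (a + b + carry) & MASK64


def sbc(a, b, carry):
    """SBC: a + NOT(b) + C, so C = 1 means no borrow."""
    return (a + (~b & MASK64) + carry) & MASK64


def add128(a_lo, a_hi, b_lo, b_hi):
    """128-bit add: ADDS on the low halves, ADC on the high halves."""
    lo, carry = adds(a_lo, b_lo)
    hi = adc(a_hi, b_hi, carry)
    return lo, hi
```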


Table C3-38 shows the Arithmetic with carry instructions.

Table C3-38 Arithmetic with carry instructions

Mnemonic  Instruction                        See
ADC       Add with carry                     ADC on page C6-435
ADCS      Add with carry and set flags       ADCS on page C6-436
SBC       Subtract with carry                SBC on page C6-663
SBCS      Subtract with carry and set flags  SBCS on page C6-665
NGC       Negate with carry                  NGC on page C6-635
NGCS      Negate with carry and set flags    NGCS on page C6-636





C3.4.4 Logical (shifted register) 


The Logical (shifted register) instructions apply an optional shift operator to the second source register value before 
performing the main operation. The register width of the instruction controls whether the new bits are fed into the 
intermediate result on a right shift or rotate at bit[63] or bit[31]. 


The shift operators LSL, ASR, LSR and ROR accept a constant immediate shift amount in the range 0 to one less than the 
register width of the instruction, inclusive. 


Omitting the shift operator and amount implies LSL #0, which means that there is no shift. A disassembler must not 
output LSL #0. However, a disassembler must output all other shifts by zero. 





Note 


Apart from ANDS, TST and BICS the logical instructions do not set the condition flags, but the final result of a bit 
operation can usually directly control a CBZ, CBNZ, TBZ, or TBNZ conditional branch. 





Table C3-39 shows the Logical (shifted register) instructions. 


Table C3-39 Logical (shifted register) instructions

Mnemonic  Instruction                      See
AND       Bitwise AND                      AND (shifted register) on page C6-452
ANDS      Bitwise AND and set flags        ANDS (shifted register) on page C6-456
BIC       Bitwise bit clear                BIC (shifted register) on page C6-468
BICS      Bitwise bit clear and set flags  BICS (shifted register) on page C6-470
EON       Bitwise exclusive OR NOT         EON (shifted register) on page C6-520
EOR       Bitwise exclusive OR             EOR (shifted register) on page C6-523
ORR       Bitwise inclusive OR             ORR (shifted register) on page C6-642
MVN       Bitwise NOT                      MVN on page C6-629
ORN       Bitwise inclusive OR NOT         ORN (shifted register) on page C6-638
TST       Test bits                        TST (shifted register) on page C6-749





C3.4.5 Move (register) 


The Move (register) instructions are aliases for other data processing instructions. They copy a value from a 
general-purpose register to another general-purpose register or the current stack pointer, or from the current stack 
pointer to a general-purpose register. 


Table C3-40 MOV register instructions

Mnemonic  Instruction                                 See
MOV       Move register                               MOV (register) on page C6-616
          Move register to SP or move SP to register  MOV (to/from SP) on page C6-612





C3.4.6 Shift (register) 


In the Shift (register) instructions, the shift amount is the positive value in the second source register modulo the 
register size. The register width of the instruction controls whether the new bits are fed into the result on a right shift 
or rotate at bit[63] or bit[31]. 
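
The modulo behavior of the shift amount can be sketched in Python as follows. This is an illustrative model on unsigned register images, with a hypothetical function name.

```python
def shift_variable(op, rn, rm, regsize=64):
    """Variable shift: the amount is rm modulo the register size."""
    mask = (1 << regsize) - 1
    rn &= mask
    amount = (rm & mask) % regsize
    if op == "LSLV":
        return (rn << amount) & mask
    if op == "LSRV":
        return rn >> amount
    if op == "ASRV":
        signed = rn - (1 << regsize) if rn >> (regsize - 1) else rn
        return (signed >> amount) & mask
    if op == "RORV":
        return ((rn >> amount) | (rn << (regsize - amount))) & mask
    raise ValueError(op)
```

A shift amount of 65 in the 64-bit form therefore behaves as a shift by 1, and a shift amount of 64 leaves the value unchanged.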


Table C3-41 shows the Shift (register) instructions. 


Table C3-41 Shift (register) instructions

Mnemonic  Instruction                      See
ASRV      Arithmetic shift right variable  ASRV on page C6-460
LSLV      Logical shift left variable      LSLV on page C6-605
LSRV      Logical shift right variable     LSRV on page C6-608
RORV      Rotate right variable            RORV on page C6-662





However, the Shift (register) instructions have a preferred set of aliases that match the shift immediate aliases 
described in Shift (immediate) on page C3-162. 


Table C3-42 shows the aliases for Shift (register) instructions. 


Table C3-42 Aliases for Variable shift instructions

Mnemonic  Instruction             See
ASR       Arithmetic shift right  ASR (register) on page C6-458
LSL       Logical shift left      LSL (register) on page C6-603
LSR       Logical shift right     LSR (register) on page C6-606
ROR       Rotate right            ROR (register) on page C6-661
C3.4.7 Multiply and divide


This section describes the instructions used for integer multiplication and division. It contains the following
subsections:
• Multiply.
• Divide on page C3-168.


Multiply 


The Multiply instructions write to a single 32-bit or 64-bit destination register, and are built around the fundamental
four operand multiply-add and multiply-subtract operation, together with 32-bit to 64-bit widening variants. A
64-bit to 128-bit widening multiply can be constructed with two instructions, using SMULH or UMULH to generate the
upper 64 bits. Table C3-43 shows the Multiply instructions.


Table C3-43 Multiply integer instructions

Mnemonic  Instruction                      See
MADD      Multiply-add                     MADD on page C6-609
MSUB      Multiply-subtract                MSUB on page C6-626
MNEG      Multiply-negate                  MNEG on page C6-611
MUL       Multiply                         MUL on page C6-628
SMADDL    Signed multiply-add long         SMADDL on page C6-674
SMSUBL    Signed multiply-subtract long    SMSUBL on page C6-677
SMNEGL    Signed multiply-negate long      SMNEGL on page C6-676
SMULL     Signed multiply long             SMULL on page C6-679
SMULH     Signed multiply high             SMULH on page C6-678
UMADDL    Unsigned multiply-add long       UMADDL on page C6-756
UMSUBL    Unsigned multiply-subtract long  UMSUBL on page C6-758
UMNEGL    Unsigned multiply-negate long    UMNEGL on page C6-757
UMULL     Unsigned multiply long           UMULL on page C6-760
UMULH     Unsigned multiply high           UMULH on page C6-759
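
The two-instruction 64-bit to 128-bit widening multiply mentioned in this subsection can be sketched in Python. The model assumes that multiply-add computes the addend plus the product, that multiply-subtract computes the addend minus the product, and that MUL is multiply-add with a zero addend. The function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def madd(rn, rm, ra):
    """MADD: ra + rn * rm, truncated to 64 bits."""
    return (ra + rn * rm) & MASK64


def msub(rn, rm, ra):
    """MSUB: ra - rn * rm, truncated to 64 bits."""
    return (ra - rn * rm) & MASK64


def mul(rn, rm):
    """MUL: MADD with a zero addend."""
    return madd(rn, rm, 0)


def umulh(rn, rm):
    """UMULH: upper 64 bits of the unsigned 128-bit product."""
    return ((rn & MASK64) * (rm & MASK64)) >> 64


def umul128(a, b):
    """Unsigned 64x64 -> 128 multiply as a (high, low) pair: UMULH then MUL."""
    return umulh(a, b), mul(a, b)
```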
Divide


The Divide instructions compute the quotient of a division, rounded towards zero. The remainder can then be 
computed as (numerator - (quotient x denominator)), using the MSUB instruction. 


If a signed integer division (INT_MIN / -1) is performed where INT_MIN is the most negative integer value 
representable in the selected register size, then the result overflows the signed integer range. No indication of this 
overflow is produced and the result that is written to the destination register is INT_MIN. 


A division by zero results in a zero being written to the destination register, without any indication that the division 
by zero occurred. 
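
These rules can be collected into a small Python model. It is a sketch on unsigned register images with hypothetical helper names. The remainder helper mirrors the multiply-subtract step described above.

```python
def _signed(value, regsize):
    value &= (1 << regsize) - 1
    return value - (1 << regsize) if value >> (regsize - 1) else value


def udiv(n, d, regsize=64):
    """UDIV: unsigned quotient, or 0 when dividing by zero."""
    mask = (1 << regsize) - 1
    n, d = n & mask, d & mask
    return 0 if d == 0 else n // d


def sdiv(n, d, regsize=64):
    """SDIV: signed quotient rounded toward zero, or 0 when dividing by zero."""
    mask = (1 << regsize) - 1
    n, d = _signed(n, regsize), _signed(d, regsize)
    if d == 0:
        return 0
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    return q & mask            # INT_MIN / -1 wraps back to INT_MIN


def remainder(n, d, q, regsize=64):
    """Remainder as numerator - (quotient x denominator), as MSUB would give."""
    return (n - q * d) & ((1 << regsize) - 1)
```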


Table C3-44 shows the Divide instructions.


Table C3-44 Divide instructions

Mnemonic  Instruction      See
SDIV      Signed divide    SDIV on page C6-671
UDIV      Unsigned divide  UDIV on page C6-755


C3.4.8 CRC32

The optional CRC32 instructions operate on the general-purpose register file to update a 32-bit CRC value from an 
input value comprising 1, 2, 4, or 8 bytes. There are two different classes of CRC instructions, CRC32 and CRC32C, that 
support two commonly used 32-bit polynomials, known as CRC-32 and CRC-32C. 


To fit with common usage, the bit order of the values is reversed as part of the operation. 


When bits[19:16] of ID_AA64ISAR0_EL1 are set to 0b0001 the CRC instructions are implemented.
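
The accumulate step can be modeled in Python as below. This sketch rests on two assumptions: that the instructions behave like the common reflected bitwise CRC update, with the bit reversal folded into the reversed polynomial constants, and that the initial and final inversion seen in software checksums is applied by the surrounding code rather than by the instruction. All names are illustrative.

```python
CRC32_POLY = 0xEDB88320    # bit-reversed form of 0x04C11DB7 (CRC-32)
CRC32C_POLY = 0x82F63B78   # bit-reversed form of 0x1EDC6F41 (CRC-32C)


def crc_update(acc, data, nbytes, poly):
    """Fold nbytes of data, least significant byte first, into a 32-bit CRC."""
    crc = acc & 0xFFFFFFFF
    for i in range(nbytes):
        crc ^= (data >> (8 * i)) & 0xFF
        for _ in range(8):
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc


def crc32b(acc, data):
    return crc_update(acc, data, 1, CRC32_POLY)


def crc32w(acc, data):
    return crc_update(acc, data, 4, CRC32_POLY)


def crc32cb(acc, data):
    return crc_update(acc, data, 1, CRC32C_POLY)


def checksum(message, step):
    """Software convention (an assumption here): invert before and after."""
    acc = 0xFFFFFFFF
    for byte in message:
        acc = step(acc, byte)
    return acc ^ 0xFFFFFFFF
```

Under these assumptions, `checksum(b"123456789", crc32b)` yields the widely published CRC-32 check value 0xCBF43926.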


Table C3-45 shows the CRC instructions. 


Table C3-45 CRC32 instructions

Mnemonic  Instruction                  See
CRC32B    CRC-32 sum from byte         CRC32B, CRC32H, CRC32W, CRC32X on page C6-498
CRC32H    CRC-32 sum from halfword     CRC32B, CRC32H, CRC32W, CRC32X on page C6-498
CRC32W    CRC-32 sum from word         CRC32B, CRC32H, CRC32W, CRC32X on page C6-498
CRC32X    CRC-32 sum from doubleword   CRC32B, CRC32H, CRC32W, CRC32X on page C6-498
CRC32CB   CRC-32C sum from byte        CRC32CB, CRC32CH, CRC32CW, CRC32CX on page C6-500
CRC32CH   CRC-32C sum from halfword    CRC32CB, CRC32CH, CRC32CW, CRC32CX on page C6-500
CRC32CW   CRC-32C sum from word        CRC32CB, CRC32CH, CRC32CW, CRC32CX on page C6-500
CRC32CX   CRC-32C sum from doubleword  CRC32CB, CRC32CH, CRC32CW, CRC32CX on page C6-500
C3.4.9 Bit operation


Table C3-46 shows the Bit operation instructions.


Table C3-46 Bit operation instructions

Mnemonic  Instruction                 See
CLS       Count leading sign bits     CLS on page C6-485
CLZ       Count leading zero bits     CLZ on page C6-486
RBIT      Reverse bit order           RBIT on page C6-652
REV       Reverse bytes in register   REV on page C6-654
REV16     Reverse bytes in halfwords  REV16 on page C6-656
REV32     Reverse bytes in words      REV32 on page C6-658
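
The operations named in Table C3-46 can be modeled in Python as follows. This is an illustrative sketch on unsigned register images. It assumes that the leading sign bit count excludes the sign bit itself, and the function names simply echo the mnemonics.

```python
def clz(x, regsize=64):
    """Count leading zero bits."""
    return regsize - (x & ((1 << regsize) - 1)).bit_length()


def cls(x, regsize=64):
    """Count bits below the top bit that match it (sign bit not counted)."""
    x &= (1 << regsize) - 1
    diff = (x >> 1) ^ (x & ((1 << (regsize - 1)) - 1))
    return clz(diff, regsize - 1)


def rbit(x, regsize=64):
    """Reverse the order of all bits."""
    return int(format(x & ((1 << regsize) - 1), f"0{regsize}b")[::-1], 2)


def _rev_chunks(x, regsize, chunk_bytes):
    data = (x & ((1 << regsize) - 1)).to_bytes(regsize // 8, "little")
    out = b"".join(data[i:i + chunk_bytes][::-1]
                   for i in range(0, len(data), chunk_bytes))
    return int.from_bytes(out, "little")


def rev(x, regsize=64):
    """Reverse bytes in the whole register."""
    return _rev_chunks(x, regsize, regsize // 8)


def rev16(x, regsize=64):
    """Reverse bytes within each halfword."""
    return _rev_chunks(x, regsize, 2)


def rev32(x):
    """Reverse bytes within each word of a 64-bit register."""
    return _rev_chunks(x, 64, 4)
```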





C3.4.10 Conditional select 


The Conditional select instructions select between the first and second source registers, depending on the current state
of the condition flags. When the named condition is true, the first source register is selected and its value is copied
without modification to the destination register. When the condition is false the second source register is selected
and its value might be optionally inverted, negated, or incremented by one, before writing to the destination register.


Other useful conditional set and conditional unary operations are implemented as aliases of the four Conditional
select instructions.
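
The four base operations, and the way the aliases reduce to them, can be sketched in Python. The condition is passed as an already evaluated boolean. The alias expansions assume the usual definitions, in which the alias inverts the condition and uses the zero register or a repeated source register. The function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def csel(cond, rn, rm):
    return rn if cond else rm


def csinc(cond, rn, rm):
    return rn if cond else (rm + 1) & MASK64


def csinv(cond, rn, rm):
    return rn if cond else ~rm & MASK64


def csneg(cond, rn, rm):
    return rn if cond else -rm & MASK64


# Aliases: the condition is inverted and the sources are ZR (0) or rn twice.
def cset(cond):
    return csinc(not cond, 0, 0)       # 1 if cond else 0


def csetm(cond):
    return csinv(not cond, 0, 0)       # all ones if cond else 0


def cinc(cond, rn):
    return csinc(not cond, rn, rn)     # rn + 1 if cond else rn


def cinv(cond, rn):
    return csinv(not cond, rn, rn)     # NOT rn if cond else rn


def cneg(cond, rn):
    return csneg(not cond, rn, rn)     # -rn if cond else rn
```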


Table C3-47 shows the Conditional select instructions. 


Table C3-47 Conditional select instructions

Mnemonic  Instruction                   See
CSEL      Conditional select            CSEL on page C6-502
CSINC     Conditional select increment  CSINC on page C6-505
CSINV     Conditional select inversion  CSINV on page C6-507
CSNEG     Conditional select negation   CSNEG on page C6-509
CSET      Conditional set               CSET on page C6-503
CSETM     Conditional set mask          CSETM on page C6-504
CINC      Conditional increment         CINC on page C6-482
CINV      Conditional invert            CINV on page C6-483
CNEG      Conditional negate            CNEG on page C6-497
C3.4.11 Conditional comparison


The Conditional comparison instructions provide a conditional select for the NZCV condition flags, setting the flags 
to the result of an arithmetic comparison of its two source register values if the named input condition is true, or to 
an immediate value if the input condition is false. There are register and immediate forms. The immediate form 
compares the source register to a small 5-bit unsigned value. 
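
The flag selection can be modeled in Python as below. Flags are packed as a 4-bit NZCV value, the condition is passed as an already evaluated boolean, and the function names are illustrative. The sketch treats compare as a flag-setting subtraction and compare negative as a flag-setting addition.

```python
def _flags(result, carry, overflow, regsize):
    n = result >> (regsize - 1)
    z = int(result == 0)
    return (n << 3) | (z << 2) | (carry << 1) | overflow


def nzcv_sub(a, b, regsize=64):
    """Flags of a - b: C is set when there is no borrow."""
    mask = (1 << regsize) - 1
    a, b = a & mask, b & mask
    result = (a - b) & mask
    overflow = (((a ^ b) & (a ^ result)) >> (regsize - 1)) & 1
    return _flags(result, int(a >= b), overflow, regsize)


def nzcv_add(a, b, regsize=64):
    """Flags of a + b."""
    mask = (1 << regsize) - 1
    a, b = a & mask, b & mask
    total = a + b
    result = total & mask
    overflow = ((~(a ^ b) & (a ^ result)) >> (regsize - 1)) & 1
    return _flags(result, total >> regsize, overflow, regsize)


def ccmp(cond, rn, operand, nzcv_imm, regsize=64):
    """CCMP: compare if cond holds, otherwise load the immediate flags."""
    return nzcv_sub(rn, operand, regsize) if cond else nzcv_imm & 0xF


def ccmn(cond, rn, operand, nzcv_imm, regsize=64):
    """CCMN: compare negative if cond holds, otherwise load the immediate flags."""
    return nzcv_add(rn, operand, regsize) if cond else nzcv_imm & 0xF
```

Chaining a compare with a conditional compare in this way evaluates a compound condition such as `a == b && c == d` without a branch: the second comparison runs only when the first succeeded, and otherwise the flags are forced to a value that fails the final test.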


Table C3-48 shows the Conditional comparison instructions. 


Table C3-48 Conditional comparison instructions

Mnemonic  Instruction                               See
CCMN      Conditional compare negative (register)   CCMN (register) on page C6-479
CCMN      Conditional compare negative (immediate)  CCMN (immediate) on page C6-478
CCMP      Conditional compare (register)            CCMP (register) on page C6-481
CCMP      Conditional compare (immediate)           CCMP (immediate) on page C6-480
C3.5 Data processing - SIMD and floating-point


This section describes the instruction groups for data processing with SIMD and floating-point register operands.


It contains the following subsections that describe the scalar floating-point data processing instructions:

• Floating-point move (register) on page C3-172.
• Floating-point move (immediate) on page C3-172.
• Floating-point conversion on page C3-172.
• Floating-point round to integral on page C3-174.
• Floating-point multiply-add on page C3-175.
• Floating-point arithmetic (one source) on page C3-175.
• Floating-point arithmetic (two sources) on page C3-175.
• Floating-point minimum and maximum on page C3-176.
• Floating-point comparison on page C3-176.
• Floating-point conditional select on page C3-177.


It also contains the following subsections that describe the SIMD data processing instructions:

• SIMD move on page C3-177.
• SIMD arithmetic on page C3-177.
• SIMD compare on page C3-180.
• SIMD widening and narrowing arithmetic on page C3-181.
• SIMD unary arithmetic on page C3-182.
• SIMD by element arithmetic on page C3-184.
• SIMD permute on page C3-185.
• SIMD immediate on page C3-185.
• SIMD shift (immediate) on page C3-185.
• SIMD floating-point and integer conversion on page C3-187.
• SIMD reduce (across vector lanes) on page C3-188.
• SIMD pairwise arithmetic on page C3-188.
• SIMD table lookup on page C3-189.
• The Cryptographic Extension on page C3-189.


For information about the encoding structure of the instructions in this instruction group, see Data processing - 
SIMD and floating point on page C4-233. 


For information about the floating-point exceptions, see Floating-point exception traps on page D1-1552. 


C3.5.1 Common features of SIMD instructions


A number of SIMD instructions come in three forms:

Wide    Indicated by the suffix W. The element width of the destination register and the first source operand
        is double that of the second source operand.

Long    Indicated by the suffix L. The element width of the destination register is double that of both source
        operands.

Narrow  Indicated by the suffix N. The element width of the destination register is half that of both source
        operands.


In addition, each vector form of the instruction is part of a pair, with a second and upper half suffix of 2, to identify
the variant of the instruction:

• Where a SIMD operation widens or lengthens a 64-bit vector to a 128-bit vector, the instruction provides a
  second part operation that can extract the source from the upper 64 bits of the source registers.

• Where a SIMD operation narrows a 128-bit vector to a 64-bit vector, the instruction provides a second-part
  operation that can pack the result of a second operation into the upper part of the same destination register.


Note

This is referred to as a lane set specifier.
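
The relationship between a base form and its second-part form can be sketched in Python, using lists of lane values with lane 0 first. The example uses an unsigned long add and a narrowing move as representative operations. It assumes that the base narrowing form clears the upper half of the destination while the second-part form preserves the lower half. The function names are illustrative.

```python
def uaddl(vn, vm, upper=False):
    """Long add of 8-bit lanes into 16-bit lanes.

    vn, vm: 128-bit registers as lists of 16 byte lanes.
    upper=False models the base form (lower 8 lanes of each source);
    upper=True models the second-part form (upper 8 lanes).
    """
    half = slice(8, 16) if upper else slice(0, 8)
    return [(a + b) & 0xFFFF for a, b in zip(vn[half], vm[half])]


def xtn(vd, vn_wide, upper=False):
    """Narrow eight 16-bit lanes to 8 bits.

    vd: current destination as a list of 16 byte lanes.
    upper=False writes the lower half and clears the upper half;
    upper=True writes the upper half and keeps the lower half.
    """
    narrowed = [lane & 0xFF for lane in vn_wide]
    if upper:
        return vd[:8] + narrowed
    return narrowed + [0] * 8
```

Two narrowing operations can therefore fill one 128-bit destination: the base form produces the lower eight lanes and the second-part form packs a second result above them.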








C3.5.2 Floating-point move (register) 


The Floating-point move (register) instructions copy a scalar floating-point value from one register to another 
register without performing any conversion. 


Some of the Floating-point move (register) instructions overlap with the functionality provided by the Advanced 
SIMD instructions DUP, INS, and UMOV. However, ARM recommends using the FMOV instructions when operating on 
scalar floating-point data to avoid the creation of scalar floating-point code that depends on the availability of the 
Advanced SIMD instruction set. 


Table C3-49 shows the Floating-point move (register) instructions. 


Table C3-49 Floating-point move (register) instructions

Mnemonic  Instruction                                                                 See
FMOV      Floating-point move register without conversion                             FMOV (register) on page C7-974
          Floating-point move to or from general-purpose register without conversion  FMOV (general) on page C7-975





C3.5.3 Floating-point move (immediate) 


The Floating-point move (immediate) instructions convert a small constant immediate floating-point value into a 
single-precision or double-precision scalar floating-point value in a SIMD and floating-point register. 


The floating-point constant can be specified either in decimal notation, such as 12.0 or -1.2e1, or as a string 
beginning with 0x followed by a hexadecimal representation of the IEEE 754 single-precision or double-precision 
encoding. ARM recommends that a disassembler uses the decimal notation, provided that this displays the value 
precisely. 


The floating-point value must be expressible as (±n/16 × 2^r), where n is an integer in the range 16 ≤ n ≤ 31 and r is
an integer in the range -3 ≤ r ≤ 4, that is a normalized binary floating-point encoding with one sign bit, four bits
of fraction, and a 3-bit exponent.
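
Whether a constant fits this form can be checked with a few lines of Python. The sketch below decomposes a value into the sign, n, and r of the expression above. The function name is hypothetical, and the check works on Python floats, so it models the numeric constraint rather than the instruction encoding.

```python
import math


def fmov_immediate_fields(value):
    """Return (negative, n, r) if value == ±n/16 × 2^r is encodable, else None."""
    if not math.isfinite(value) or value == 0.0:
        return None
    # abs(value) == mantissa * 2**exponent with 0.5 <= mantissa < 1,
    # so n/16 == 2 * mantissa and r == exponent - 1.
    mantissa, exponent = math.frexp(abs(value))
    n = mantissa * 32
    r = exponent - 1
    if n != int(n) or not 16 <= n <= 31 or not -3 <= r <= 4:
        return None
    return value < 0, int(n), r
```

The representable magnitudes run from 0.125 (n = 16, r = -3) to 31.0 (n = 31, r = 4). Zero is not representable in this form.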


Table C3-50 shows the Floating-point move (immediate) instruction: 


Table C3-50 Floating-point move (immediate) instruction

Mnemonic  Instruction                    See
FMOV      Floating-point move immediate  FMOV (scalar, immediate) on page C7-978





C3.5.4 Floating-point conversion 


The following subsections describe the conversion of floating-point values: 
• Convert floating-point precision.
• Convert between floating-point and integer or fixed-point on page C3-173.


Convert floating-point precision 


These instructions convert a floating-point scalar with one precision to a floating-point scalar with a different 
precision, using the current rounding mode as specified by FPCR.RMode. 
Table C3-51 shows the Floating-point precision conversion instruction.


Table C3-51 Floating-point precision conversion instruction

Mnemonic  Instruction                              See
FCVT      Floating-point convert precision (scalar)  FCVT on page C7-869





Convert between floating-point and integer or fixed-point 


These instructions convert a floating-point scalar in a SIMD and floating-point register to or from a signed or 
unsigned integer or fixed-point in a general-purpose register. For a fixed-point value, a final immediate operand 
indicates that the general-purpose register holds a fixed-point number and fbits indicates the number of bits after 
the binary point. fbits is in the range 1-32 inclusive for a 32-bit general-purpose register name, and 1-64 inclusive
for a 64-bit general-purpose register name. 


These instructions generate the Invalid Operation exception, in response to a floating-point input of NaN, infinity, 
or a numerical value that cannot be represented within the destination register. An out-of-range integer or 
fixed-point result is saturated to the size of the destination register. A numeric result that differs from the input 
generates an Inexact exception. When flush-to-zero mode is enabled, zero replaces a denormal input and generates 
an Input Denormal exception. 
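
The saturating behavior can be illustrated with a Python model of a round-toward-zero conversion to a signed integer or fixed-point value. This is a sketch: it does not model the exception flags or flush-to-zero, it assumes that a NaN input produces a zero result, and the function name and parameters are hypothetical.

```python
import math
from fractions import Fraction


def fcvtzs_model(x, fbits=0, regsize=64):
    """Convert x to a signed value with fbits fraction bits, rounding toward zero.

    Returns the result as a signed Python integer in the range of the
    destination register.
    """
    lo = -(1 << (regsize - 1))
    hi = (1 << (regsize - 1)) - 1
    if math.isnan(x):
        return 0                        # assumption: NaN converts to zero
    if math.isinf(x):
        return hi if x > 0 else lo      # saturate
    scaled = math.trunc(Fraction(x) * (1 << fbits))   # exact, toward zero
    return max(lo, min(hi, scaled))     # saturate out-of-range results
```

For example, 1.75 converts to 1 as an integer, and to 7 as a fixed-point value with two fraction bits (7/4 = 1.75).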


Table C3-52 shows the Floating-point and fixed-point conversion instructions. 


Table C3-52 Floating-point and integer or fixed-point conversion instructions

Mnemonic  Instruction                                                           See
FCVTAS    Floating-point scalar convert to signed integer, rounding to nearest  FCVTAS (scalar) on page C7-873
          with ties to away (scalar form)
FCVTAU    Floating-point scalar convert to unsigned integer, rounding to        FCVTAU (scalar) on page C7-877
          nearest with ties to away (scalar form)
FCVTMS    Floating-point scalar convert to signed integer, rounding toward      FCVTMS (scalar) on page C7-883
          minus infinity (scalar form)
FCVTMU    Floating-point scalar convert to unsigned integer, rounding toward    FCVTMU (scalar) on page C7-887
          minus infinity (scalar form)
FCVTNS    Floating-point scalar convert to signed integer, rounding to nearest  FCVTNS (scalar) on page C7-893
          with ties to even (scalar form)
FCVTNU    Floating-point scalar convert to unsigned integer, rounding to        FCVTNU (scalar) on page C7-897
          nearest with ties to even (scalar form)
FCVTPS    Floating-point scalar convert to signed integer, rounding toward      FCVTPS (scalar) on page C7-901
          positive infinity (scalar form)
FCVTPU    Floating-point scalar convert to unsigned integer, rounding toward    FCVTPU (scalar) on page C7-905
          positive infinity (scalar form)
FCVTZS    Floating-point scalar convert to signed integer, rounding toward      FCVTZS (scalar, integer) on page C7-916
          zero (scalar form)
          Floating-point convert to signed fixed-point, rounding toward zero    FCVTZS (scalar, fixed-point) on page C7-914
          (scalar form)
FCVTZU    Floating-point scalar convert to unsigned integer, rounding toward    FCVTZU (scalar, integer) on page C7-925
          zero (scalar form)
          Floating-point scalar convert to unsigned fixed-point, rounding       FCVTZU (scalar, fixed-point) on page C7-923
          toward zero (scalar form)
SCVTF     Signed integer scalar convert to floating-point, using the current    SCVTF (vector, integer) on page C7-1175
          rounding mode (scalar form)
          Signed fixed-point convert to floating-point, using the current       SCVTF (scalar, fixed-point) on page C7-1177
          rounding mode (scalar form)
UCVTF     Unsigned integer scalar convert to floating-point, using the current  UCVTF (vector, integer) on page C7-1401
          rounding mode (scalar form)
          Unsigned fixed-point convert to floating-point, using the current     UCVTF (scalar, fixed-point) on page C7-1403
          rounding mode (scalar form)




C3.5.5 Floating-point round to integral 


The Floating-point round to integral instructions round a floating-point value to an integral floating-point value of
the same size.


These instructions generate the Invalid Operation exception in response to a signaling NaN input, or the Input
Denormal exception in response to a denormal input when flush-to-zero mode is enabled. The FRINTX instruction
can also generate the Inexact exception if the result is numeric and does not have the same numerical value as the
input. A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign,
and a NaN is propagated as in normal floating-point arithmetic.
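
The explicitly directed rounding variants can be modeled in Python on double-precision values. The sketch below is keyed by the final letter of the mnemonic (A, M, N, P, Z). It omits the variants that use the current rounding mode, does not model exception flags or signaling NaNs, and uses a hypothetical function name.

```python
import math


def frint(x, mode):
    """Round x to an integral floating-point value.

    mode: "A" nearest, ties away   "N" nearest, ties to even
          "M" toward minus infinity   "P" toward positive infinity
          "Z" toward zero
    """
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x                        # zero and infinity keep their sign
    if mode == "M":
        r = math.floor(x)
    elif mode == "P":
        r = math.ceil(x)
    elif mode == "Z":
        r = math.trunc(x)
    elif mode == "N":
        r = round(x)                    # Python rounds ties to even
    elif mode == "A":
        r = math.trunc(x)
        if abs(x - r) >= 0.5:           # x - r is exact for finite doubles
            r += 1 if x > 0 else -1
    else:
        raise ValueError(mode)
    return math.copysign(float(r), x)   # a zero result keeps the input sign
```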


Table C3-53 shows the Floating-point round to integral instructions. 


Table C3-53 Floating-point round to integral instructions

Mnemonic  Instruction                                                          See
FRINTA    Floating-point round to integral, to nearest with ties to away       FRINTA (scalar) on page C7-1007
FRINTI    Floating-point round to integral, using current rounding mode        FRINTI (scalar) on page C7-1011
FRINTM    Floating-point round to integral, toward minus infinity              FRINTM (scalar) on page C7-1015
FRINTN    Floating-point round to integral, to nearest with ties to even       FRINTN (scalar) on page C7-1019
FRINTP    Floating-point round to integral, toward positive infinity           FRINTP (scalar) on page C7-1023
FRINTX    Floating-point round to integral exact, using current rounding mode  FRINTX (scalar) on page C7-1027
FRINTZ    Floating-point round to integral, toward zero                        FRINTZ (scalar) on page C7-1031
C3.5.6 Floating-point multiply-add


Table C3-54 shows the Floating-point multiply-add instructions that require three source register operands.


Table C3-54 Floating-point multiply-add instructions

Mnemonic  Instruction                                           See
FMADD     Floating-point scalar fused multiply-add              FMADD on page C7-931
FMSUB     Floating-point scalar fused multiply-subtract         FMSUB on page C7-979
FNMADD    Floating-point scalar negated fused multiply-add      FNMADD on page C7-994
FNMSUB    Floating-point scalar negated fused multiply-subtract FNMSUB on page C7-996





Floating-point arithmetic (one source) 


Table C3-55 shows the Floating-point arithmetic instructions that require a single source register operand. 


Table C3-55 Floating-point arithmetic instructions with one source register
Mnemonic Instruction See
FABS Floating-point scalar absolute value FABS (scalar) on page C7-831
FNEG Floating-point scalar negate FNEG (scalar) on page C7-993
FSQRT Floating-point scalar square root FSQRT (scalar) on page C7-1038





Floating-point arithmetic (two sources) 


Table C3-56 shows the Floating-point arithmetic instructions that require two source register operands. 


Table C3-56 Floating-point arithmetic instructions with two source registers
Mnemonic Instruction See
FADD Floating-point scalar add FADD (scalar) on page C7-838
FDIV Floating-point scalar divide FDIV (scalar) on page C7-929
FMUL Floating-point scalar multiply FMUL (scalar) on page C7-985
FNMUL Floating-point scalar multiply-negate FNMUL (scalar) on page C7-998
FSUB Floating-point scalar subtract FSUB (scalar) on page C7-1041
C3.5.9 Floating-point minimum and maximum


The min(x,y) and max(x,y) operations return a quiet NaN when either x or y is NaN. In flush-to-zero mode denormal 
operands are flushed to zero before comparison, and if the result of the comparison is the flushed value, then a zero 
value is returned. Where both x and y are zero, or denormal values flushed to zero, with different signs, then +0.0 
is returned by max() and -0.0 by min(). 


The minNum(x,y) and maxNum(x,y) operations follow the IEEE 754-2008 standard and return the numerical operand 
when one operand is numerical and the other a quiet NaN. Apart from this additional handling of a single quiet NaN 
the result is then identical to min(x,y) and max(x,y). 
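The difference between the two families can be sketched in software. The following Python model is a minimal illustration under stated assumptions: the function names fmax, fmin, fmaxnm, and fminnm are hypothetical, Python NaN values are treated as quiet NaNs, and flush-to-zero mode and exception generation are not modelled.

```python
import math

def fmax(x, y):
    # max(x, y): a NaN in either operand gives a NaN result.
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == 0.0 and y == 0.0:
        # Zeros with different signs: max() returns +0.0.
        either_positive = math.copysign(1.0, x) > 0 or math.copysign(1.0, y) > 0
        return 0.0 if either_positive else -0.0
    return x if x > y else y

def fmin(x, y):
    # min(x, y): a NaN in either operand gives a NaN result.
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == 0.0 and y == 0.0:
        # Zeros with different signs: min() returns -0.0.
        either_negative = math.copysign(1.0, x) < 0 or math.copysign(1.0, y) < 0
        return -0.0 if either_negative else 0.0
    return x if x < y else y

def fmaxnm(x, y):
    # maxNum(x, y): a single quiet NaN is ignored in favour of the numerical operand.
    if math.isnan(x) and not math.isnan(y):
        return y
    if math.isnan(y) and not math.isnan(x):
        return x
    return fmax(x, y)

def fminnm(x, y):
    # minNum(x, y): same NaN handling as maxNum, otherwise identical to min().
    if math.isnan(x) and not math.isnan(y):
        return y
    if math.isnan(y) and not math.isnan(x):
        return x
    return fmin(x, y)
```

With these definitions, fmax(1.0, NaN) is NaN, whereas fmaxnm(1.0, NaN) is 1.0.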


Table C3-57 shows the Floating-point instructions that can perform floating-point minimum and maximum 
operations. 


Table C3-57 Floating-point minimum and maximum instructions
Mnemonic Instruction See
FMAX Floating-point scalar maximum FMAX (scalar) on page C7-935
FMAXNM Floating-point scalar maximum number FMAXNM (scalar) on page C7-939
FMIN Floating-point scalar minimum FMIN (scalar) on page C7-951
FMINNM Floating-point scalar minimum number FMINNM (scalar) on page C7-955





C3.5.10 Floating-point comparison 


These instructions set the NZCV condition flags in PSTATE, based on the result of a comparison of two operands. 
If the floating-point comparisons are unordered, where one or both operands are a form of NaN, the C and V bits 
are set to 1 and the N and Z bits are cleared to 0. 
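The flag settings can be sketched as a small Python function. The unordered case follows the text above. The encodings for the equal, less than, and greater than results are not given in this section and are stated here as an assumption based on the A64 definition of the comparison instructions. The function name fcmp_nzcv is hypothetical.

```python
import math

def fcmp_nzcv(a, b):
    """Return the (N, Z, C, V) flags for a floating-point comparison of a with b."""
    if math.isnan(a) or math.isnan(b):
        return (0, 0, 1, 1)   # unordered: C and V set, N and Z cleared
    if a == b:
        return (0, 1, 1, 0)   # equal (assumed encoding)
    if a < b:
        return (1, 0, 0, 0)   # less than (assumed encoding)
    return (0, 0, 1, 0)       # greater than (assumed encoding)
```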


Note 


The NZCV flags in the FPSR are associated with AArch32 state. The A64 floating-point comparison instructions 
do not change the condition flags in the FPSR. 








For the conditional Floating-point comparison instructions, if the condition is TRUE, the flags are updated to the 
result of the comparison, otherwise the flags are updated to the immediate value that is defined in the instruction 
encoding. 


The quiet compare instructions generate an Invalid Operation exception if either of the source operands is a 
signaling NaN. The signaling compare instructions generate an Invalid Operation exception if either of the source 
operands is any type of NaN. 
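The conditional behavior and the two exception rules can be combined in one sketch. The following Python model works on double-precision bit patterns so that quiet and signaling NaNs can be told apart. It is illustrative only: the function names and the cond_holds parameter are hypothetical, the flag encodings for ordered results are an assumption not stated in this section, and the model assumes that no comparison, and therefore no exception, occurs when the condition is FALSE.

```python
import struct

FRAC_MASK = (1 << 52) - 1

def is_nan(bits):
    # Double precision NaN: exponent all ones and a nonzero fraction.
    return ((bits >> 52) & 0x7FF) == 0x7FF and (bits & FRAC_MASK) != 0

def is_signaling_nan(bits):
    # A signaling NaN has the most significant fraction bit (bit 51) clear.
    return is_nan(bits) and ((bits >> 51) & 1) == 0

def to_float(bits):
    # Reinterpret a 64-bit pattern as a Python float.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def fccmp(cond_holds, a_bits, b_bits, nzcv_imm, signaling_compare=False):
    """Return (nzcv, invalid_operation) as a 4-bit flag value and a bool."""
    if not cond_holds:
        return nzcv_imm, False          # flags come from the immediate
    if signaling_compare:
        invalid = is_nan(a_bits) or is_nan(b_bits)
    else:
        invalid = is_signaling_nan(a_bits) or is_signaling_nan(b_bits)
    if is_nan(a_bits) or is_nan(b_bits):
        return 0b0011, invalid          # unordered: C and V set
    a, b = to_float(a_bits), to_float(b_bits)
    if a == b:
        return 0b0110, invalid          # assumed encoding for equal
    return (0b1000 if a < b else 0b0010), invalid

ONE, TWO = 0x3FF0000000000000, 0x4000000000000000
QNAN, SNAN = 0x7FF8000000000000, 0x7FF0000000000001
```

Passing signaling_compare=True models the signaling forms, for which a quiet NaN operand is enough to report Invalid Operation.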


Table C3-58 shows the Floating-point comparison instructions. 


Table C3-58 Floating-point comparison instructions
Mnemonic Instruction See
FCMP Floating-point quiet compare FCMP on page C7-863
FCMPE Floating-point signaling compare FCMPE on page C7-865
FCCMP Floating-point conditional quiet compare FCCMP on page C7-843
FCCMPE Floating-point conditional signaling compare FCCMPE on page C7-845
C3.5.11 Floating-point conditional select
Table C3-59 shows the Floating-point conditional select instructions. 
Table C3-59 Floating-point conditional select instructions 
Mnemonic Instruction See 
FCSEL Floating-point scalar conditional select FCSEL on page C7-867 
C3.5.12 SIMD move 
The functionality of some data movement instructions overlaps with that provided by the scalar floating-point FMOV 
instructions described in Floating-point move (register) on page C3-172. 
Table C3-60 shows the SIMD move instructions. 
Table C3-60 SIMD move instructions 
Mnemonic Instruction See
DUP Duplicate vector element to vector or scalar DUP (element) on page C7-820
DUP Duplicate general-purpose register to vector DUP (general) on page C7-823
INS(a) Insert vector element from another vector element INS (element) on page C7-1043
Insert vector element from general-purpose register INS (general) on page C7-1045 
MOV Move vector element to vector element MOV (element) on page C7-1114 
Move general-purpose register to vector element MOV (from general) on page C7-1116 
Move vector element to scalar MOV (scalar) on page C7-1112 
Move vector element to general-purpose register MOV (to general) on page C7-1119 
UMOV Unsigned move vector element to general-purpose register UMOV on page C7-1433 
SMOV Signed move vector element to general-purpose register SMOV on page C7-1226 
a. Disassembles as MOV. 
C3.5.13 SIMD arithmetic 
Table C3-61 shows the SIMD arithmetic instructions. 
Table C3-61 SIMD arithmetic instructions 
Mnemonic Instruction See
ADD Add (vector and scalar form) ADD (vector) on page C7-773 
AND Bitwise AND (vector form) AND (vector) on page C7-786 
BIC Bitwise bit clear (register) (vector form) BIC (vector, register) on page C7-789 
BIF Bitwise insert if false (vector form) BIF on page C7-790 
BIT Bitwise insert if true (vector form) BIT on page C7-791 
BSL Bitwise select (vector form) BSL on page C7-792 
EOR Bitwise exclusive OR (vector form) EOR (vector) on page C7-825 
FABD Floating-point absolute difference (vector and scalar form) FABD on page C7-828
FADD Floating-point add (vector form) FADD (scalar) on page C7-838
FDIV Floating-point divide (vector form) FDIV (vector) on page C7-927
FMAX Floating-point maximum (vector form) FMAXP (vector) on page C7-946
FMAXNM Floating-point maximum number (vector form) FMAXNM (vector) on page C7-937
FMIN Floating-point minimum (vector form) FMIN (vector) on page C7-949
FMINNM Floating-point minimum number (vector form) FMINNM (vector) on page C7-953
FMLA Floating-point fused multiply-add (vector form) FMLA (vector) on page C7-967
FMLS Floating-point fused multiply-subtract (vector form) FMLS (vector) on page C7-971
FMUL Floating-point multiply (vector form) FMUL (vector) on page C7-983
FMULX Floating-point multiply extended (vector and scalar form) FMULX on page C7-990
FRECPS Floating-point reciprocal step (vector and scalar form) FRECPS on page C7-1002
FRSQRTS Floating-point reciprocal square root step (vector and scalar form) FRSQRTS on page C7-1035
FSUB Floating-point subtract (vector form) FSUB (vector) on page C7-1039
MLA Multiply-add (vector form) MLA (vector) on page C7-1106
MLS Multiply-subtract (vector form) MLS (vector) on page C7-1110
MUL Multiply (vector form) MUL (vector) on page C7-1125
MOV Move vector register (vector form) MOV (vector) on page C7-1118
ORN Bitwise inclusive OR NOT (vector form) ORN (vector) on page C7-1133
ORR Bitwise inclusive OR (register) (vector form) ORR (vector, register) on page C7-1136
PMUL Polynomial multiply (vector form) PMUL on page C7-1137
SABA Signed absolute difference and accumulate (vector form) SABA on page C7-1154
SABD Signed absolute difference (vector form) SABD on page C7-1158
SHADD Signed halving add (vector form) SHADD on page C7-1191
SHSUB Signed halving subtract (vector form) SHSUB on page C7-1199
SMAX Signed maximum (vector form) SMAX on page C7-1204
SMIN Signed minimum (vector form) SMIN on page C7-1210
SQADD Signed saturating add (vector and scalar form) SQADD on page C7-1234
SQDMULH Signed saturating doubling multiply returning high half (vector and scalar form) SQDMULH (vector) on page C7-1253
SQRSHL Signed saturating rounding shift left (register) (vector and scalar form) SQRSHL on page C7-1268
SQRDMULH Signed saturating rounding doubling multiply returning high half (vector and scalar form) SQRDMULH (vector) on page C7-1266
SQSHL Signed saturating shift left (register) (vector and scalar form) SQSHL (register) on page C7-1279
SQSUB Signed saturating subtract (vector and scalar form) SQSUB on page C7-1290
SRHADD Signed rounding halving add (vector form) SRHADD on page C7-1298
SRSHL Signed rounding shift left (register) (vector and scalar form) SRSHL on page C7-1303
SSHL Signed shift left (register) (vector and scalar form) SSHL on page C7-1309
SUB Subtract (vector and scalar form) SUB (vector) on page C7-1364
UABA Unsigned absolute difference and accumulate (vector form) UABA on page C7-1380
UABD Unsigned absolute difference (vector form) UABD on page C7-1384
UHADD Unsigned halving add (vector form) UHADD on page C7-1407
UHSUB Unsigned halving subtract (vector form) UHSUB on page C7-1409
UMAX Unsigned maximum (vector form) UMAX on page C7-1411
UMIN Unsigned minimum (vector form) UMIN on page C7-1417
UQADD Unsigned saturating add (vector and scalar form) UQADD on page C7-1439
UQRSHL Unsigned saturating rounding shift left (register) (vector and scalar form) UQRSHL on page C7-1441
UQSHL Unsigned saturating shift left (register) (vector and scalar form) UQSHL (register) on page C7-1449
UQSUB Unsigned saturating subtract (vector and scalar form) UQSUB on page C7-1454
URHADD Unsigned rounding halving add (vector form) URHADD on page C7-1460
URSHL Unsigned rounding shift left (register) (vector and scalar form) URSHL on page C7-1462
USHL Unsigned shift left (register) (vector and scalar form) USHL on page C7-1469
C3.5.14 SIMD compare
The SIMD compare instructions compare vector or scalar elements according to the specified condition and set the 
destination vector element to all ones if the condition holds, or to zero if the condition does not hold. 
Note 
Some of the comparisons, such as LS, LE, LO, and LT, can be made by reversing the operands and using the 
opposite comparison, HS, GE, HI, or GT. 
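The per-element mask result, and the operand reversal described in the Note, can be sketched as follows. This Python model is illustrative only. It represents a vector as a list of unsigned element values, and the function names cmhi and lower_than and the esize parameter are hypothetical.

```python
def cmhi(a, b, esize=8):
    """Unsigned 'higher' compare: each result element is all ones or zero."""
    all_ones = (1 << esize) - 1
    return [all_ones if x > y else 0 for x, y in zip(a, b)]

def lower_than(a, b, esize=8):
    # The LO comparison has no instruction of its own in Table C3-62:
    # a is lower than b exactly when b is higher than a, so reverse the operands.
    return cmhi(b, a, esize)
```

For 8-bit elements, cmhi([1, 200, 7], [2, 100, 7]) gives [0, 0xFF, 0], and lower_than on the same operands gives [0xFF, 0, 0].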
Table C3-62 shows the SIMD compare instructions.
Table C3-62 SIMD compare instructions 
Mnemonic Instruction See 
CMEQ Compare bitwise equal (vector and scalar form) CMEQ (register) on page C7-797 
Compare bitwise equal to zero (vector and scalar form) CMEQ (zero) on page C7-799 
CMHS Compare unsigned higher or same (vector and scalar form) CMHS (register) on page C7-811 
CMGE Compare signed greater than or equal (vector and scalar form) CMGE (register) on page C7-801 
Compare signed greater than or equal to zero (vector and scalar form) CMGE (zero) on page C7-803 
CMHI Compare unsigned higher (vector and scalar form) CMHI (register) on page C7-809 
CMGT Compare signed greater than (vector and scalar form) CMGT (register) on page C7-805 
Compare signed greater than zero (vector and scalar form) CMGT (zero) on page C7-807 
CMLE Compare signed less than or equal to zero (vector and scalar form) CMLE (zero) on page C7-813 
CMLT Compare signed less than zero (vector and scalar form) CMLT (zero) on page C7-815 
CMTST Compare bitwise test bits nonzero (vector and scalar form) CMTST on page C7-817 
FCMEQ Floating-point compare equal (vector and scalar form) FCMEQ (register) on page C7-847 
Floating-point compare equal to zero (vector and scalar form) FCMEQ (zero) on page C7-849 
FCMGE Floating-point compare greater than or equal (vector and scalar form) FCMGE (register) on page C7-851 
Floating-point compare greater than or equal to zero (vector and scalar form) FCMGE (zero) on page C7-853 
FCMGT Floating-point compare greater than (vector and scalar form) FCMGT (register) on page C7-855 
Floating-point compare greater than zero (vector and scalar form) FCMGT (zero) on page C7-857 
FCMLE Floating-point compare less than or equal to zero (vector and scalar form) FCMLE (zero) on page C7-859 
FCMLT Floating-point compare less than zero (vector and scalar form) FCMLT (zero) on page C7-861 
FACGE Floating-point absolute compare greater than or equal (vector and scalar form) FACGE on page C7-832
FACGT Floating-point absolute compare greater than (vector and scalar form) FACGT on page C7-834 
C3.5.15 SIMD widening and narrowing arithmetic
For information about the variants of these instructions, see Common features of SIMD instructions on page C3-171. 
Table C3-63 shows the SIMD widening and narrowing arithmetic instructions. 
Table C3-63 SIMD widening and narrowing arithmetic instructions 
Mnemonic Instruction See
ADDHN, ADDHN2 Add returning high, narrow (vector form) ADDHN, ADDHN2 on page C7-775
PMULL, PMULL2 Polynomial multiply long (vector form) PMULL, PMULL2 on page C7-1139. See also The Cryptographic Extension on page C3-189
RADDHN, RADDHN2 Rounding add returning high, narrow (vector form) RADDHN, RADDHN2 on page C7-1141
RSUBHN, RSUBHN2 Rounding subtract returning high, narrow (vector form) RSUBHN, RSUBHN2 on page C7-1152
SABAL, SABAL2 Signed absolute difference and accumulate long (vector form) SABAL, SABAL2 on page C7-1156
SABDL, SABDL2 Signed absolute difference long (vector form) SABDL, SABDL2 on page C7-1160
SADDL, SADDL2 Signed add long (vector form) SADDL, SADDL2 on page C7-1164
SADDW, SADDW2 Signed add wide (vector form) SADDW, SADDW2 on page C7-1170
SMLAL, SMLAL2 Signed multiply-add long (vector form) SMLAL, SMLAL2 (vector) on page C7-1219
SMLSL, SMLSL2 Signed multiply-subtract long (vector form) SMLSL, SMLSL2 (vector) on page C7-1224
SMULL, SMULL2 Signed multiply long (vector form) SMULL, SMULL2 (vector) on page C7-1230
SQDMLAL, SQDMLAL2 Signed saturating doubling multiply-add long (vector and scalar form) SQDMLAL, SQDMLAL2 (vector) on page C7-1240
SQDMLSL, SQDMLSL2 Signed saturating doubling multiply-subtract long (vector and scalar form) SQDMLSL, SQDMLSL2 (vector) on page C7-1247
SQDMULL, SQDMULL2 Signed saturating doubling multiply long (vector and scalar form) SQDMULL, SQDMULL2 (vector) on page C7-1258
SSUBL, SSUBL2 Signed subtract long (vector form) SSUBL, SSUBL2 on page C7-1317
SSUBW, SSUBW2 Signed subtract wide (vector form) SSUBW, SSUBW2 on page C7-1319
SUBHN, SUBHN2 Subtract returning high, narrow (vector form) SUBHN, SUBHN2 on page C7-1366
UABAL, UABAL2 Unsigned absolute difference and accumulate long (vector form) UABAL, UABAL2 on page C7-1382
UABDL, UABDL2 Unsigned absolute difference long (vector form) UABDL, UABDL2 on page C7-1386
UADDL, UADDL2 Unsigned add long (vector form) UADDL, UADDL2 on page C7-1390
UADDW, UADDW2 Unsigned add wide (vector form) UADDW, UADDW2 on page C7-1396
UMLAL, UMLAL2 Unsigned multiply-add long (vector form) UMLAL, UMLAL2 (vector) on page C7-1426
UMLSL, UMLSL2 Unsigned multiply-subtract long (vector form) UMLSL, UMLSL2 (vector) on page C7-1431
UMULL, UMULL2 Unsigned multiply long (vector form) UMULL, UMULL2 (vector) on page C7-1437
USUBL, USUBL2 Unsigned subtract long (vector form) USUBL, USUBL2 on page C7-1479
USUBW, USUBW2 Unsigned subtract wide (vector form) USUBW, USUBW2 on page C7-1481























C3.5.16 SIMD unary arithmetic 
For information about the variants of these instructions, see Common features of SIMD instructions on page C3-171. 
Table C3-64 shows the SIMD unary arithmetic instructions. 
Table C3-64 SIMD unary arithmetic instructions 
Mnemonic Instruction See 
ABS Absolute value (vector and scalar form) ABS on page C7-771 
CLS Count leading sign bits (vector form) CLS (vector) on page C7-793 
CLZ Count leading zero bits (vector form) CLZ (vector) on page C7-795 
CNT Population count per byte (vector form) CNT on page C7-819 
FABS Floating-point absolute (vector form) FABS (vector) on page C7-830 





FCVTL, FCVTL2 Floating-point convert to higher precision long (vector form) FCVTL, FCVTL2 on page C7-879
FCVTN, FCVTN2 Floating-point convert to lower precision narrow (vector form) FCVTN, FCVTN2 on page C7-889
FCVTXN, FCVTXN2 Floating-point convert to lower precision narrow, rounding to odd (vector and scalar form) FCVTXN, FCVTXN2 on page C7-907
FNEG Floating-point negate (vector form) FNEG (vector) on page C7-992
FRECPE Floating-point reciprocal estimate (vector and scalar form) FRECPE on page C7-1000
FRECPX Floating-point reciprocal exponent (scalar form) FRECPX on page C7-1004
FRINTA Floating-point round to integral, to nearest with ties to away (vector form) FRINTA (scalar) on page C7-1007
FRINTI Floating-point round to integral, using current rounding mode (vector form) FRINTI (vector) on page C7-1009
FRINTM Floating-point round to integral, toward minus infinity (vector form) FRINTM (vector) on page C7-1013
FRINTN Floating-point round to integral, to nearest with ties to even (vector form) FRINTN (vector) on page C7-1017
FRINTP Floating-point round to integral, toward positive infinity (vector form) FRINTP (vector) on page C7-1021
FRINTX Floating-point round to integral exact, using current rounding mode (vector form) FRINTX (vector) on page C7-1025
FRINTZ Floating-point round to integral, toward zero (vector form) FRINTZ (vector) on page C7-1029
FRSQRTE Floating-point reciprocal square root estimate (vector and scalar form) FRSQRTE on page C7-1033
FSQRT Floating-point square root (vector form) FSQRT (vector) on page C7-1037
MVN Bitwise NOT (vector form) MVN on page C7-1127
NEG Negate (vector and scalar form) NEG (vector) on page C7-1130
NOT Bitwise NOT (vector form) NOT on page C7-1132
RBIT Bitwise reverse (vector form) RBIT (vector) on page C7-1143
REV16 Reverse elements in 16-bit halfwords (vector form) REV16 (vector) on page C7-1144
REV32 Reverse elements in 32-bit words (vector form) REV32 (vector) on page C7-1146
REV64 Reverse elements in 64-bit doublewords (vector form) REV64 on page C7-1148
SADALP Signed add and accumulate long pairwise (vector form) SADALP on page C7-1162
SADDLP Signed add long pairwise (vector form) SADDLP on page C7-1166
SQABS Signed saturating absolute value (vector and scalar form) SQABS on page C7-1232
SQNEG Signed saturating negate (vector and scalar form) SQNEG on page C7-1261
SQXTN, SQXTN2 Signed saturating extract narrow (vector form) SQXTN, SQXTN2 on page C7-1292
SQXTUN, SQXTUN2 Signed saturating extract unsigned narrow (vector and scalar form) SQXTUN, SQXTUN2 on page C7-1295
SUQADD Signed saturating accumulate of unsigned value (vector and scalar form) SUQADD on page C7-1368
SXTL, SXTL2 Signed extend long SXTL, SXTL2 on page C7-1370
UADALP Unsigned add and accumulate long pairwise (vector form) UADALP on page C7-1388
UADDLP Unsigned add long pairwise (vector form) UADDLP on page C7-1392
UQXTN, UQXTN2 Unsigned saturating extract narrow (vector form) UQXTN, UQXTN2 on page C7-1456
URECPE Unsigned reciprocal estimate (vector form) URECPE on page C7-1459
URSQRTE Unsigned reciprocal square root estimate (vector form) URSQRTE on page C7-1466
USQADD Unsigned saturating accumulate of signed value (vector and scalar form) USQADD on page C7-1475
UXTL, UXTL2 Unsigned extend long UXTL, UXTL2 on page C7-1483
XTN, XTN2 Extract narrow (vector form) XTN, XTN2 on page C7-1489
C3.5.17 SIMD by element arithmetic
For information about the variants of these instructions, see Common features of SIMD instructions on page C3-171. 
Table C3-65 shows the SIMD by element arithmetic instructions. 
Table C3-65 SIMD by element arithmetic instructions 
Mnemonic Instruction See 
FMLA Floating-point fused multiply-add (vector and scalar form) FMLA (by element) on page C7-965 
FMLS Floating-point fused multiply-subtract (vector and scalar form) FMLS (by element) on page C7-969. 
FMUL Floating-point multiply (vector and scalar form) FMUL (by element) on page C7-981 
FMULX Floating-point multiply extended (vector and scalar form) FMULX (by element) on page C7-987 
MLA Multiply-add (vector form) MLA (by element) on page C7-1104 
MLS Multiply-subtract (vector form) MLS (by element) on page C7-1108 
MUL Multiply (vector form) MUL (by element) on page C7-1123 





SMLAL, SMLAL2 Signed multiply-add long (vector form) SMLAL, SMLAL2 (by element) on page C7-1216
SMLSL, SMLSL2 Signed multiply-subtract long (vector form) SMLSL, SMLSL2 (by element) on page C7-1221
SMULL, SMULL2 Signed multiply long (vector form) SMULL, SMULL2 (by element) on page C7-1228
SQDMLAL, SQDMLAL2 Signed saturating doubling multiply-add long (vector and scalar form) SQDMLAL, SQDMLAL2 (by element) on page C7-1236
SQDMLSL, SQDMLSL2 Signed saturating doubling multiply-subtract long (vector form) SQDMLSL, SQDMLSL2 (by element) on page C7-1243
SQDMULH Signed saturating doubling multiply returning high half (vector and scalar form) SQDMULH (by element) on page C7-1250
SQDMULL, SQDMULL2 Signed saturating doubling multiply long (vector and scalar form) SQDMULL, SQDMULL2 (by element) on page C7-1255
SQRDMULH Signed saturating rounding doubling multiply returning high half (vector and scalar form) SQRDMULH (by element) on page C7-1263
UMLAL, UMLAL2 Unsigned multiply-add long (vector form) UMLAL, UMLAL2 (by element) on page C7-1423
UMLSL, UMLSL2 Unsigned multiply-subtract long (vector form) UMLSL, UMLSL2 (by element) on page C7-1428
UMULL, UMULL2 Unsigned multiply long (vector form) UMULL, UMULL2 (by element) on page C7-1435
Table C3-66 shows the SIMD permute instructions.

Table C3-66 SIMD permute instructions
Mnemonic Instruction See
EXT Extract vector from a pair of vectors EXT on page C7-826
TRN1 Transpose vectors (primary) TRN1 on page C7-1376
TRN2 Transpose vectors (secondary) TRN2 on page C7-1378
UZP1 Unzip vectors (primary) UZP1 on page C7-1485
UZP2 Unzip vectors (secondary) UZP2 on page C7-1487
ZIP1 Zip vectors (primary) ZIP1 on page C7-1491
ZIP2 Zip vectors (secondary) ZIP2 on page C7-1493

C3.5.19 SIMD immediate

Table C3-67 shows the SIMD immediate instructions.

Table C3-67 SIMD immediate instructions
Mnemonic Instruction See
BIC Bitwise bit clear immediate BIC (vector, immediate) on page C7-787
FMOV Floating-point move immediate FMOV (vector, immediate) on page C7-973
MOVI Move immediate MOVI on page C7-1120
MVNI Move inverted immediate MVNI on page C7-1128
ORR Bitwise inclusive OR immediate ORR (vector, immediate) on page C7-1134





C3.5.20 SIMD shift (immediate) 


For information about the variants of these instructions, see Common features of SIMD instructions on page C3-171. 


Table C3-68 shows the SIMD shift immediate instructions. 


Table C3-68 SIMD shift (immediate) instructions
Mnemonic Instruction See
RSHRN, RSHRN2 Rounding shift right narrow immediate (vector form) RSHRN, RSHRN2 on page C7-1150
SHL Shift left immediate (vector and scalar form) SHL on page C7-1193
SHLL, SHLL2 Shift left long (by element size) (vector form) SHLL, SHLL2 on page C7-1195
SHRN, SHRN2 Shift right narrow immediate (vector form) SHRN, SHRN2 on page C7-1197
SLI Shift left and insert immediate (vector and scalar form) SLI on page C7-1201
SQRSHRN, SQRSHRN2 Signed saturating rounded shift right narrow immediate (vector and scalar form) SQRSHRN, SQRSHRN2 on page C7-1270
SQRSHRUN, SQRSHRUN2 Signed saturating rounded shift right unsigned narrow immediate (vector and scalar form) SQRSHRUN, SQRSHRUN2 on page C7-1273
SQSHL Signed saturating shift left immediate (vector and scalar form) SQSHL (immediate) on page C7-1276
SQSHLU Signed saturating shift left unsigned immediate (vector and scalar form) SQSHLU on page C7-1281
SQSHRN, SQSHRN2 Signed saturating shift right narrow immediate (vector and scalar form) SQSHRN, SQSHRN2 on page C7-1284
SQSHRUN, SQSHRUN2 Signed saturating shift right unsigned narrow immediate (vector and scalar form) SQSHRUN, SQSHRUN2 on page C7-1287
SRI Shift right and insert immediate (vector and scalar form) SRI on page C7-1300
SRSHR Signed rounding shift right immediate (vector and scalar form) SRSHR on page C7-1305
SRSRA Signed rounding shift right and accumulate immediate (vector and scalar form) SRSRA on page C7-1307
SSHLL, SSHLL2 Signed shift left long immediate (vector form) SSHLL, SSHLL2 on page C7-1311
SSHR Signed shift right immediate (vector and scalar form) SSHR on page C7-1313
SSRA Signed integer shift right and accumulate immediate (vector and scalar form) SSRA on page C7-1315
SXTL, SXTL2 Signed integer extend (vector only) SXTL, SXTL2 on page C7-1370
UQRSHRN, UQRSHRN2 Unsigned saturating rounded shift right narrow immediate (vector and scalar form) UQRSHRN, UQRSHRN2 on page C7-1443
UQSHL Unsigned saturating shift left immediate (vector and scalar form) UQSHL (immediate) on page C7-1446
UQSHRN, UQSHRN2 Unsigned saturating shift right narrow immediate (vector and scalar form) UQSHRN, UQSHRN2 on page C7-1451
URSHR Unsigned rounding shift right immediate (vector and scalar form) URSHR on page C7-1464
URSRA Unsigned integer rounding shift right and accumulate immediate (vector and scalar form) URSRA on page C7-1467
USHLL, USHLL2 Unsigned shift left long immediate (vector form) USHLL, USHLL2 on page C7-1471
USHR Unsigned shift right immediate (vector and scalar form) USHR on page C7-1473
USRA Unsigned shift right and accumulate immediate (vector and scalar form) USRA on page C7-1477
UXTL, UXTL2 Unsigned integer extend (vector only) UXTL, UXTL2 on page C7-1483
C3.5.21 SIMD floating-point and integer conversion
The SIMD floating-point and integer conversion instructions generate the Invalid Operation exception in response 
to a floating-point input of NaN, infinity, or a numerical value that cannot be represented within the destination 
register. An out-of-range integer or a fixed-point result is saturated to the size of the destination register. A numeric 
result that differs from the input raises the Inexact exception. 
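The saturation and exception rules can be sketched for the round toward zero, signed integer case. The following Python model is illustrative only. The function name fcvtzs_model and its bits parameter are hypothetical. Returning zero for a NaN input is an assumption not stated in this section, and the model reports the exceptions as booleans instead of updating any status register.

```python
import math

def fcvtzs_model(x, bits=32):
    """Convert x to a signed integer, rounding toward zero.

    Returns (result, invalid_operation, inexact).
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0, True, False              # assumption: NaN converts to zero
    if math.isinf(x):
        return (hi if x > 0 else lo), True, False
    r = math.trunc(x)
    if r < lo or r > hi:
        # Out-of-range result: saturate to the destination size.
        return max(lo, min(hi, r)), True, False
    # In range: Inexact if the numeric result differs from the input.
    return r, False, r != x
```

For example, fcvtzs_model(2.75) returns (2, False, True), and fcvtzs_model(3e9) saturates to (2147483647, True, False).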
Table C3-69 shows the SIMD floating-point and integer conversion instructions. 
Table C3-69 SIMD floating-point and integer conversion instructions 
Mnemonic Instruction See 
FCVTAS Floating-point convert to signed integer, rounding to nearest with ties to away (vector and scalar form) FCVTAS (vector) on page C7-871
FCVTAU Floating-point convert to unsigned integer, rounding to nearest with ties to away (vector and scalar form) FCVTAU (vector) on page C7-875
FCVTMS Floating-point convert to signed integer, rounding toward minus infinity (vector and scalar form) FCVTMS (vector) on page C7-881
FCVTMU Floating-point convert to unsigned integer, rounding toward minus infinity (vector and scalar form) FCVTMU (vector) on page C7-885
FCVTNS Floating-point convert to signed integer, rounding to nearest with ties to even (vector and scalar form) FCVTNS (vector) on page C7-891
FCVTNU Floating-point convert to unsigned integer, rounding to nearest with ties to even (vector and scalar form) FCVTNU (vector) on page C7-895
FCVTPS Floating-point convert to signed integer, rounding toward positive infinity (vector and scalar form) FCVTPS (vector) on page C7-899
FCVTPU Floating-point convert to unsigned integer, rounding toward positive infinity (vector and scalar form) FCVTPU (vector) on page C7-903
FCVTZS Floating-point convert to signed integer, rounding toward zero (vector and scalar form) FCVTZS (vector, integer) on page C7-912
Floating-point convert to signed fixed-point, rounding toward zero (vector and scalar form) FCVTZS (vector, fixed-point) on page C7-909
FCVTZU Floating-point convert to unsigned integer, rounding toward zero (vector and scalar form) FCVTZU (vector, integer) on page C7-921
Floating-point convert to unsigned fixed-point, rounding toward zero (vector and scalar form) FCVTZU (vector, fixed-point) on page C7-918
SCVTF Signed integer convert to floating-point (vector and scalar form) SCVTF (vector, integer) on page C7-1175
Signed fixed-point convert to floating-point (vector and scalar form) SCVTF (vector, fixed-point) on page C7-1172
UCVTF Unsigned integer convert to floating-point (vector and scalar form) UCVTF (vector, integer) on page C7-1401
Unsigned fixed-point convert to floating-point (vector and scalar form) UCVTF (vector, fixed-point) on page C7-1398
C3.5.22 SIMD reduce (across vector lanes)
The SIMD reduce (across vector lanes) instructions perform arithmetic operations horizontally, that is across all 
lanes of the input vector. They deliver a single scalar result. 
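A horizontal reduction can be sketched as follows. This Python model is illustrative only. It represents a vector as a list of unsigned element values, and the function names and the esize parameter are hypothetical. It assumes that the plain add reduction produces a result of the element size, so that the sum wraps, whereas the long form produces a result of twice the element size.

```python
def addv_model(lanes, esize=8):
    # Sum across all lanes; the scalar result is assumed to wrap to the element size.
    return sum(lanes) & ((1 << esize) - 1)

def uaddlv_model(lanes, esize=8):
    # Long form: the scalar result is assumed to be twice the element size.
    return sum(lanes) & ((1 << (2 * esize)) - 1)

def umaxv_model(lanes):
    # Unsigned maximum across all lanes.
    return max(lanes)
```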
Table C3-70 shows the SIMD reduce (across vector lanes) instructions. 
Table C3-70 SIMD reduce (across vector lanes) instructions 
Mnemonic Instruction See
ADDV Add (across vector) ADDV on page C7-780
FMAXNMV Floating-point maximum number (across vector) FMAXNMV on page C7-944
FMAXV Floating-point maximum (across vector) FMAXV on page C7-948
FMINNMV Floating-point minimum number (across vector) FMINNMV on page C7-960
FMINV Floating-point minimum (across vector) FMINV on page C7-964
SADDLV Signed add long (across vector) SADDLV on page C7-1168
SMAXV Signed maximum (across vector) SMAXV on page C7-1208
SMINV Signed minimum (across vector) SMINV on page C7-1214
UADDLV Unsigned add long (across vector) UADDLV on page C7-1394
UMAXV Unsigned maximum (across vector) UMAXV on page C7-1415 
UMINV Unsigned minimum (across vector) UMINV on page C7-1421 
C3.5.23 SIMD pairwise arithmetic 
The SIMD pairwise arithmetic instructions perform operations on pairs of adjacent elements and deliver a vector 
result. 
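A pairwise addition can be sketched as follows. This Python model is illustrative only. It represents each source vector as a list of unsigned element values, and the function name addp_model and the esize parameter are hypothetical. The lane ordering, in which pairs from the first source fill the low-numbered result elements and pairs from the second source fill the high-numbered ones, is an assumption not stated in this section.

```python
def addp_model(n, m, esize=8):
    """Add adjacent pairs of elements from the two source vectors."""
    mask = (1 << esize) - 1
    # Assumed ordering: the first source supplies the low-numbered elements.
    concat = list(n) + list(m)
    return [(concat[i] + concat[i + 1]) & mask for i in range(0, len(concat), 2)]
```

For example, addp_model([1, 2, 3, 4], [10, 20, 30, 40]) gives [3, 7, 30, 70], a vector with the same number of elements as each source.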
Table C3-71 shows the SIMD pairwise arithmetic instructions. 
Table C3-71 SIMD pairwise arithmetic instructions 
Mnemonic Instruction See
ADDP Add pairwise (vector and scalar form) ADDP (vector) on page C7-778
ADDP (scalar) on page C7-777
FADDP Floating-point add pairwise (vector and scalar form) FADDP (vector) on page C7-841
FADDP (scalar) on page C7-840
FMAXNMP Floating-point maximum number pairwise (vector and scalar form) FMAXNMP (vector) on page C7-942
FMAXNMP (scalar) on page C7-941
FMAXP Floating-point maximum pairwise (vector and scalar form) FMAXP (vector) on page C7-946
FMAXP (scalar) on page C7-945
FMINNMP Floating-point minimum number pairwise (vector and scalar form) FMINNMP (vector) on page C7-958
FMINNMP (scalar) on page C7-957
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Table C3-71 SIMD pairwise arithmetic instructions (continued) 
































Mnemonic __ Instruction See 

FMINP Floating-point minimum pairwise (vector and scalar form) FMINP (vector) on page C7-962 
FMINP (scalar) on page C7-961 

SMAXP Signed maximum pairwise SMAXP on page C7-1206 

SMINP Signed minimum pairwise SMINP on page C7-1212 

UMAXP Unsigned maximum pairwise UMAXP on page C7-1413 

UMINP Unsigned minimum pairwise UMINP on page C7-1419 
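A minimal sketch of the pairwise pattern, using ADDP as the example. The function names and the list-based lane model are assumptions for illustration; the ordering of the concatenated operands (elements of the first source in the low half of the result) reflects the usual description of ADDP (vector) and should be checked against the instruction page.

```python
# Illustrative model only: lanes are plain Python integers, lowest lane first.

def addp_vector(vn, vm, esize):
    """Sketch of ADDP (vector): add adjacent pairs of vn followed by vm."""
    mask = (1 << esize) - 1
    concat = list(vn) + list(vm)
    return [(concat[i] + concat[i + 1]) & mask
            for i in range(0, len(concat), 2)]

def addp_scalar(vn, esize=64):
    """Sketch of ADDP (scalar): add the two lanes of a two-element vector."""
    return (vn[0] + vn[1]) & ((1 << esize) - 1)

print(addp_vector([1, 2, 3, 4], [10, 20, 30, 40], 32))
print(addp_scalar([5, 7]))
```

The vector form returns as many lanes as each input has, which is why the group is described as delivering a vector result; the scalar form reduces a single pair.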

C3.5.24 SIMD table lookup

Table C3-72 shows the SIMD table lookup instructions.

Table C3-72 SIMD table lookup instructions
Mnemonic Instruction See
TBL Table vector lookup TBL on page C7-1372
TBX Table vector lookup extension TBX on page C7-1374

C3.5.25 The Cryptographic Extension

The instructions provided by the optional Cryptographic Extension share the SIMD and floating-point register file.

For more information see:
• Announcing the Advanced Encryption Standard.
• The Galois/Counter Mode of Operation.
• Announcing the Secure Hash Standard.

Table C3-73 shows the Cryptographic Extension instructions.

Table C3-73 Cryptographic Extension instructions
Mnemonic Instruction See
AESD AES single round decryption AESD on page C7-782
AESE AES single round encryption AESE on page C7-783
AESIMC AES inverse mix columns AESIMC on page C7-784
AESMC AES mix columns AESMC on page C7-785
PMULL Polynomial multiply long PMULL, PMULL2 on page C7-1139
SHA1C SHA1 hash update (choose) SHA1C on page C7-1181
SHA1H SHA1 fixed rotate SHA1H on page C7-1182
SHA1M SHA1 hash update (majority) SHA1M on page C7-1183
SHA1P SHA1 hash update (parity) SHA1P on page C7-1184
SHA1SU0 SHA1 schedule update 0 SHA1SU0 on page C7-1185
SHA1SU1 SHA1 schedule update 1 SHA1SU1 on page C7-1186
SHA256H SHA256 hash update (part 1) SHA256H on page C7-1188
SHA256H2 SHA256 hash update (part 2) SHA256H2 on page C7-1187
SHA256SU0 SHA256 schedule update 0 SHA256SU0 on page C7-1189
SHA256SU1 SHA256 schedule update 1 SHA256SU1 on page C7-1190
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Chapter C4 


A64 Instruction Set Encoding

This chapter describes the encoding of the A64 instruction set. It contains the following sections:
• A64 instruction index by encoding on page C4-192.
• Data processing - immediate on page C4-193.
• Branches, exception generating and system instructions on page C4-197.
• Loads and stores on page C4-202.
• Data processing - register on page C4-224.
• Data processing - SIMD and floating point on page C4-233.

In this chapter:
• In the decode tables, an entry of - for a field value means the value of the field does not affect the decoding.
• In the decode diagrams, a shaded field indicates that the bits in that field are not used in that level of decode.
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C4.1 A64 instruction index by encoding 


The encoding of an A64 instruction is:

Field layout: bits [31:29] | op0 [28:25] | bits [24:0]

Table C4-1 Main encoding table for the A64 instruction set
op0 Decode group or instruction page
00xx Unallocated.
100x Data processing - immediate on page C4-193
101x Branches, exception generating and system instructions on page C4-197
x1x0 Loads and stores on page C4-202
x101 Data processing - register on page C4-224
0111 Data processing - SIMD and floating point on page C4-233
1111 Data processing - SIMD and floating point on page C4-233
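The main encoding table can be read as an ordered list of bit patterns over op0, bits [28:25] of the instruction word. The following Python sketch shows one way to apply it. The helper names (`matches`, `decode_group`) and the table representation are assumptions for illustration, not part of the architecture specification.

```python
# Sketch of a first-level A64 decode driven by Table C4-1.

MAIN_TABLE = [
    ("00xx", "Unallocated"),
    ("100x", "Data processing - immediate"),
    ("101x", "Branches, exception generating and system instructions"),
    ("x1x0", "Loads and stores"),
    ("x101", "Data processing - register"),
    ("0111", "Data processing - SIMD and floating point"),
    ("1111", "Data processing - SIMD and floating point"),
]

def matches(pattern, value):
    """True if value matches pattern, where 'x' marks a don't-care bit."""
    bits = format(value, "0{}b".format(len(pattern)))
    return all(p in ("x", b) for p, b in zip(pattern, bits))

def decode_group(insn):
    """Return the decode group for a 32-bit A64 instruction word."""
    op0 = (insn >> 25) & 0xF          # op0 is bits [28:25]
    for pattern, group in MAIN_TABLE:
        if matches(pattern, op0):
            return group
    raise ValueError("no table entry matched")

print(decode_group(0x91000420))   # ADD X0, X1, #1
print(decode_group(0xD503201F))   # NOP
print(decode_group(0xF9400000))   # LDR X0, [X0]
print(decode_group(0x8B020020))   # ADD X0, X1, X2
```

The seven patterns cover all sixteen values of op0 without overlap, so the order of the entries does not affect the result here.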
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C4.2 Data processing - immediate 


This section describes the encoding of the Data processing (immediate) instruction group. This section is decoded 
from A64 instruction index by encoding on page C4-192. For additional information on this functional group of 
instructions, see Data processing - immediate on page C3-158. 


Field layout: bits [31:29] | 100 [28:26] | op0 [25:23] | bits [22:0]

Table C4-2 Encoding table for the Data Processing -- Immediate group
op0 Decode group or instruction page
00x PC-rel. addressing on page C4-196
01x Add/subtract (immediate)
100 Logical (immediate) on page C4-195
101 Move wide (immediate) on page C4-195
110 Bitfield on page C4-194
111 Extract on page C4-194





C4.2.1 Add/subtract (immediate) 


This section describes the encoding of the Add/subtract (immediate) instruction class. This section is decoded from 
Data processing - immediate. 


Field layout: sf [31] | op [30] | S [29] | 10001 [28:24] | shift [23:22] | imm12 [21:10] | Rn [9:5] | Rd [4:0]

sf op S shift Instruction Page
- - - 1x Unallocated.
0 0 0 - ADD (immediate) - 32-bit variant on page C6-439
0 0 1 - ADDS (immediate) - 32-bit variant on page C6-445
0 1 0 - SUB (immediate) - 32-bit variant on page C6-728
0 1 1 - SUBS (immediate) - 32-bit variant on page C6-734
1 0 0 - ADD (immediate) - 64-bit variant on page C6-439
1 0 1 - ADDS (immediate) - 64-bit variant on page C6-445
1 1 0 - SUB (immediate) - 64-bit variant on page C6-728
1 1 1 - SUBS (immediate) - 64-bit variant on page C6-734
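The following Python sketch extracts the decode fields of the Add/subtract (immediate) class from an instruction word and applies the table above. The function name and the returned dictionary are hypothetical. Treating shift == 01 as a left shift of the immediate by 12 bits comes from the ADD (immediate) instruction description rather than from this table.

```python
def decode_addsub_imm(insn):
    """Sketch: decode an Add/subtract (immediate) word, or None if unallocated."""
    if (insn >> 24) & 0x1F != 0b10001:
        raise ValueError("not in the Add/subtract (immediate) class")
    sf = (insn >> 31) & 1
    op = (insn >> 30) & 1
    s = (insn >> 29) & 1
    shift = (insn >> 22) & 0x3
    imm12 = (insn >> 10) & 0xFFF
    if shift & 0b10:                      # shift == 1x is Unallocated
        return None
    mnemonic = ("ADD", "ADDS", "SUB", "SUBS")[(op << 1) | s]
    return {
        "mnemonic": mnemonic,
        "width": 64 if sf else 32,
        "imm": imm12 << (12 if shift == 1 else 0),
        "rn": (insn >> 5) & 0x1F,
        "rd": insn & 0x1F,
    }

print(decode_addsub_imm(0x91000420))   # ADD X0, X1, #1
print(decode_addsub_imm(0x71000C5F))   # SUBS WZR, W2, #3 (the CMP W2, #3 alias)
```

The mnemonic depends only on op and S, and sf selects the 32-bit or 64-bit variant, which mirrors the eight allocated rows of the table.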
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C4.2.2 Bitfield 


This section describes the encoding of the Bitfield instruction class. This section is decoded from Data processing 
- immediate on page C4-193. 


Field layout: sf [31] | opc [30:29] | 100110 [28:23] | N [22] | immr [21:16] | imms [15:10] | Rn [9:5] | Rd [4:0]

sf opc N Instruction Page
- 11 - Unallocated.
0 - 1 Unallocated.
0 00 0 SBFM - 32-bit variant on page C6-668
0 01 0 BFM - 32-bit variant on page C6-465
0 10 0 UBFM - 32-bit variant on page C6-752
1 - 0 Unallocated.
1 00 1 SBFM - 64-bit variant on page C6-668
1 01 1 BFM - 64-bit variant on page C6-465
1 10 1 UBFM - 64-bit variant on page C6-752





C4.2.3 Extract 


This section describes the encoding of the Extract instruction class. This section is decoded from Data processing 
- immediate on page C4-193. 


Field layout: sf [31] | op21 [30:29] | 100111 [28:23] | N [22] | o0 [21] | Rm [20:16] | imms [15:10] | Rn [9:5] | Rd [4:0]

sf op21 N o0 imms Instruction Page
- x1 - - - Unallocated.
- 00 - 1 - Unallocated.
- 1x - - - Unallocated.
0 - - - 1xxxxx Unallocated.
0 - 1 - - Unallocated.
0 00 0 0 0xxxxx EXTR - 32-bit variant on page C6-526
1 - 0 - - Unallocated.
1 00 1 0 - EXTR - 64-bit variant on page C6-526


C4.2.4 Logical (immediate) 
This section describes the encoding of the Logical (immediate) instruction class. This section is decoded from Data processing - immediate on page C4-193.

Field layout: sf [31] | opc [30:29] | 100100 [28:23] | N [22] | immr [21:16] | imms [15:10] | Rn [9:5] | Rd [4:0]

sf opc N Instruction Page
0 - 1 Unallocated.
0 00 0 AND (immediate) - 32-bit variant on page C6-451
0 01 0 ORR (immediate) - 32-bit variant on page C6-640
0 10 0 EOR (immediate) - 32-bit variant on page C6-522
0 11 0 ANDS (immediate) - 32-bit variant on page C6-454
1 00 - AND (immediate) - 64-bit variant on page C6-451
1 01 - ORR (immediate) - 64-bit variant on page C6-640
1 10 - EOR (immediate) - 64-bit variant on page C6-522
1 11 - ANDS (immediate) - 64-bit variant on page C6-454





C4.2.5 Move wide (immediate) 


This section describes the encoding of the Move wide (immediate) instruction class. This section is decoded from 
Data processing - immediate on page C4-193. 


Field layout: sf [31] | opc [30:29] | 100101 [28:23] | hw [22:21] | imm16 [20:5] | Rd [4:0]

sf opc hw Instruction Page
- 01 - Unallocated.
0 - 1x Unallocated.
0 00 - MOVN - 32-bit variant on page C6-618
0 10 - MOVZ - 32-bit variant on page C6-620
0 11 - MOVK - 32-bit variant on page C6-617
1 00 - MOVN - 64-bit variant on page C6-618
1 10 - MOVZ - 64-bit variant on page C6-620
1 11 - MOVK - 64-bit variant on page C6-617
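As a worked illustration of the Move wide (immediate) table, the Python sketch below decodes sf, opc and hw and rejects the two unallocated cases. The function name and tuple layout are hypothetical. Interpreting hw as a left shift of 16 × hw bits comes from the MOVZ, MOVN and MOVK instruction descriptions, not from this decode table.

```python
def decode_move_wide(insn):
    """Sketch: decode a Move wide (immediate) word, or None if unallocated."""
    if (insn >> 23) & 0x3F != 0b100101:
        raise ValueError("not in the Move wide (immediate) class")
    sf = (insn >> 31) & 1
    opc = (insn >> 29) & 0x3
    hw = (insn >> 21) & 0x3
    imm16 = (insn >> 5) & 0xFFFF
    rd = insn & 0x1F
    if opc == 0b01:                 # row "- 01 -"
        return None
    if sf == 0 and hw & 0b10:       # row "0 - 1x"
        return None
    mnemonic = {0b00: "MOVN", 0b10: "MOVZ", 0b11: "MOVK"}[opc]
    return (mnemonic, 64 if sf else 32, rd, imm16, 16 * hw)

print(decode_move_wide(0xD2824680))   # MOVZ X0, #0x1234
print(decode_move_wide(0x52800020))   # MOVZ W0, #1
```

The second unallocated row exists because a 32-bit destination has only two 16-bit positions, so hw values of 2 and 3 have no meaning when sf is 0.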


C4.2.6 PC-rel. addressing 


This section describes the encoding of the PC-rel. addressing instruction class. This section is decoded from Data 
processing - immediate on page C4-193. 


Field layout: op [31] | immlo [30:29] | 10000 [28:24] | immhi [23:5] | Rd [4:0]

op Instruction Page
0 ADR
1 ADRP


C4.3 Branches, exception generating and system instructions 


This section describes the encoding of the Branch, exception generation and system instruction group. This section 
is decoded from A64 instruction index by encoding on page C4-192. For additional information on this functional 
group of instructions, see Branches, Exception generating, and System instructions on page C3-142. 


Field layout: op0 [31:29] | 101 [28:26] | op1 [25:22] | bits [21:0]

Table C4-3 Encoding table for the Branches, Exception Generating and System instructions group
op0 op1 Decode group or instruction page
010 0xxx Conditional branch (immediate) on page C4-198
010 1xxx Unallocated.
110 00xx Exception generation on page C4-198
110 0100 System on page C4-199
110 0101 Unallocated.
110 011x Unallocated.
110 1xxx Unconditional branch (register) on page C4-201
x00 - Unconditional branch (immediate) on page C4-200
x01 0xxx Compare & branch (immediate)
x01 1xxx Test & branch (immediate) on page C4-200
x11 - Unallocated.





C4.3.1 Compare & branch (immediate) 


This section describes the encoding of the Compare & branch (immediate) instruction class. This section is decoded 
from Branches, exception generating and system instructions. 


Field layout: sf [31] | 011010 [30:25] | op [24] | imm19 [23:5] | Rt [4:0]

sf op Instruction Page
0 0 CBZ - 32-bit variant on page C6-477
0 1 CBNZ - 32-bit variant on page C6-476
1 0 CBZ - 64-bit variant on page C6-477
1 1 CBNZ - 64-bit variant on page C6-476


C4.3.2 Conditional branch (immediate) 


This section describes the encoding of the Conditional branch (immediate) instruction class. This section is decoded 
from Branches, exception generating and system instructions on page C4-197. 


Field layout: 0101010 [31:25] | o1 [24] | imm19 [23:5] | o0 [4] | cond [3:0]

o1 o0 Instruction Page
0 0 B.cond
0 1 Unallocated.
1 - Unallocated.





C4.3.3 Exception generation 


This section describes the encoding of the Exception generation instruction class. This section is decoded from 
Branches, exception generating and system instructions on page C4-197. 


Field layout: 11010100 [31:24] | opc [23:21] | imm16 [20:5] | op2 [4:2] | LL [1:0]

opc op2 LL Instruction Page
- xx1 - Unallocated.
- x1x - Unallocated.
- 1xx - Unallocated.
000 000 00 Unallocated.
000 000 01 SVC
000 000 10 HVC
000 000 11 SMC
001 000 x1 Unallocated.
001 000 00 BRK
001 000 1x Unallocated.
010 000 x1 Unallocated.
010 000 00 HLT
010 000 1x Unallocated.
011 000 - Unallocated.
100 000 - Unallocated.
101 000 00 Unallocated.
101 000 01 DCPS1
101 000 10 DCPS2
101 000 11 DCPS3
11x 000 - Unallocated.

C4.3.4 System

This section describes the encoding of the System instruction class. This section is decoded from Branches, exception generating and system instructions on page C4-197.

Field layout: 1101010100 [31:22] | L [21] | op0 [20:19] | op1 [18:16] | CRn [15:12] | CRm [11:8] | op2 [7:5] | Rt [4:0]

L op0 op1 CRn CRm op2 Rt Instruction Page
0 00 - 000x - - - Unallocated.
0 00 - 0100 - - != 11111 Unallocated.
0 00 - 0100 - - 11111 MSR (immediate)
0 00 - 0101 - - - Unallocated.
0 00 - 011x - - - Unallocated.
0 00 - 1xxx - - - Unallocated.
0 00 xx 001x - - - Unallocated.
0 00 x0x 001x - - - Unallocated.
0 00 011 001x - - != 11111 Unallocated.
0 00 011 0010 != 0000 - 11111 HINT - Hints 8 to 127 variant on page C6-528
0 00 011 0010 0000 000 11111 NOP
0 00 011 0010 0000 001 11111 YIELD
0 00 011 0010 0000 010 11111 WFE
0 00 011 0010 0000 011 11111 WFI
0 00 011 0010 0000 100 11111 SEV
0 00 011 0010 0000 101 11111 SEVL
0 00 011 0010 0000 11x 11111 HINT - Hints 6 and 7 variant on page C6-528
0 00 011 0011 - 000 - Unallocated.
0 00 011 0011 - 001 - Unallocated.
0 00 011 0011 - 010 11111 CLREX
0 00 011 0011 - 011 - Unallocated.
0 00 011 0011 - 100 11111 DSB
0 00 011 0011 - 101 11111 DMB
0 00 011 0011 - 110 11111 ISB
0 00 011 0011 - 111 - Unallocated.
0 00 1xx 001x - - - Unallocated.
0 01 - - - - - SYS
0 1x - - - - - MSR (register)
1 00 - - - - - Unallocated.
1 01 - - - - - SYSL
1 1x - - - - - MRS
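The hint rows of the System table (L = 0, op0 = 00, op1 = 011, CRn = 0010, Rt = 11111) can be decoded with a single mask-and-compare followed by a lookup on CRm:op2. The Python sketch below illustrates this. The function name and the "HINT #n" string format are assumptions for the example, and the mask is derived from the field positions of the System class.

```python
# Hint numbers named in the System decode table; other values are plain HINT.
NAMED_HINTS = {0: "NOP", 1: "YIELD", 2: "WFE", 3: "WFI", 4: "SEV", 5: "SEVL"}

def decode_hint(insn):
    """Sketch: return the hint mnemonic for a word, or None if it is not a hint."""
    # Bits [31:12] and Rt [4:0] are fixed for hints; CRm [11:8] and op2 [7:5] vary.
    if insn & 0xFFFFF01F != 0xD503201F:
        return None
    crm = (insn >> 8) & 0xF
    op2 = (insn >> 5) & 0x7
    number = (crm << 3) | op2           # 7-bit hint number, 0 to 127
    return NAMED_HINTS.get(number, "HINT #{}".format(number))

print(decode_hint(0xD503201F))   # NOP
print(decode_hint(0xD503207F))   # WFI
print(decode_hint(0xD503211F))   # CRm = 0001, op2 = 000
```

Combining CRm and op2 into one 7-bit number is what gives the table its "Hints 6 and 7" and "Hints 8 to 127" rows: every value without a dedicated mnemonic falls through to the generic HINT entry.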





C4.3.5 Test & branch (immediate) 


This section describes the encoding of the Test & branch (immediate) instruction class. This section is decoded from 
Branches, exception generating and system instructions on page C4-197. 


Field layout: b5 [31] | 011011 [30:25] | op [24] | b40 [23:19] | imm14 [18:5] | Rt [4:0]

op Instruction Page
0 TBZ
1 TBNZ

C4.3.6 Unconditional branch (immediate)

This section describes the encoding of the Unconditional branch (immediate) instruction class. This section is decoded from Branches, exception generating and system instructions on page C4-197.

Field layout: op [31] | 00101 [30:26] | imm26 [25:0]

op Instruction Page
0 B
1 BL





C4.3.7 Unconditional branch (register) 
This section describes the encoding of the Unconditional branch (register) instruction class. This section is decoded from Branches, exception generating and system instructions on page C4-197.

Field layout: 1101011 [31:25] | opc [24:21] | op2 [20:16] | op3 [15:10] | Rn [9:5] | op4 [4:0]

opc op2 op3 Rn op4 Instruction Page
- - - - != 00000 Unallocated.
- - != 000000 - - Unallocated.
- != 11111 - - - Unallocated.
0000 11111 000000 - 00000 BR
0001 11111 000000 - 00000 BLR
0010 11111 000000 - 00000 RET
0011 - - - - Unallocated.
010x - - != 11111 - Unallocated.
0100 11111 000000 11111 00000 ERET
0101 11111 000000 11111 00000 DRPS
011x - - - - Unallocated.
1xxx - - - - Unallocated.


C4.4 Loads and stores

This section describes the encoding of the Loads and stores instruction group. This section is decoded from A64 instruction index by encoding on page C4-192. For additional information on this functional group of instructions, see Loads and stores on page C3-146.

Field layout: op0 [31] | bit [30] | op1 [29:28] | 1 [27] | op2 [26] | 0 [25] | op3 [24:23] | bit [22] | op4 [21:16] | bits [15:12] | op5 [11:10] | bits [9:0]

Table C4-4 Encoding table for the Loads and Stores group
op0 op1 op2 op3 op4 op5 Decode group or instruction page
0 00 1 00 000000 - Advanced SIMD load/store multiple structures on page C4-203
0 00 1 01 0xxxxx - Advanced SIMD load/store multiple structures (post-indexed) on page C4-204
0 00 1 0x 1xxxxx - Unallocated.
0 00 1 10 x00000 - Advanced SIMD load/store single structure on page C4-205
0 00 1 11 - - Advanced SIMD load/store single structure (post-indexed) on page C4-208
0 00 1 x0 x1xxxx - Unallocated.
0 00 1 x0 xx1xxx - Unallocated.
0 00 1 x0 xxx1xx - Unallocated.
0 00 1 x0 xxxx1x - Unallocated.
0 00 1 x0 xxxxx1 - Unallocated.
1 00 1 - - - Unallocated.
- 00 0 0x - - Load/store exclusive on page C4-212
- 00 0 1x - - Unallocated.
- 01 - 0x - - Load register (literal) on page C4-212
- 01 - 1x - - Unallocated.
- 10 - 00 - - Load/store no-allocate pair (offset) on page C4-213
- 10 - 01 - - Load/store register pair (post-indexed) on page C4-221
- 10 - 10 - - Load/store register pair (offset) on page C4-221
- 10 - 11 - - Load/store register pair (pre-indexed) on page C4-222
- 11 - 0x 0xxxxx 00 Load/store register (unscaled immediate) on page C4-219
- 11 - 0x 0xxxxx 01 Load/store register (immediate post-indexed) on page C4-214
- 11 - 0x 0xxxxx 10 Load/store register (unprivileged) on page C4-218
- 11 - 0x 0xxxxx 11 Load/store register (immediate pre-indexed) on page C4-215
- 11 - 0x 1xxxxx 00 Unallocated.
- 11 - 0x 1xxxxx 01 Unallocated.
- 11 - 0x 1xxxxx 10 Load/store register (register offset) on page C4-216
- 11 - 0x 1xxxxx 11 Unallocated.
- 11 - 1x - - Load/store register (unsigned immediate) on page C4-220
C4.4.1 Advanced SIMD load/store multiple structures 


This section describes the encoding of the Advanced SIMD load/store multiple structures instruction class. This 
section is decoded from Loads and stores on page C4-202. 


Field layout: 0 [31] | Q [30] | 0011000 [29:23] | L [22] | 000000 [21:16] | opcode [15:12] | size [11:10] | Rn [9:5] | Rt [4:0]

L opcode Instruction Page
0 0000 ST4 (multiple structures)
0 0001 Unallocated.
0 0010 ST1 (multiple structures) - Four registers variant on page C7-1321
0 0011 Unallocated.
0 0100 ST3 (multiple structures)
0 0101 Unallocated.
0 0110 ST1 (multiple structures) - Three registers variant on page C7-1321
0 0111 ST1 (multiple structures) - One register variant on page C7-1321
0 1000 ST2 (multiple structures)
0 1001 Unallocated.
0 1010 ST1 (multiple structures) - Two registers variant on page C7-1321
0 1011 Unallocated.
0 11xx Unallocated.
1 0000 LD4 (multiple structures)
1 0001 Unallocated.
1 0010 LD1 (multiple structures) - Four registers variant on page C7-1047
1 0011 Unallocated.
1 0100 LD3 (multiple structures)
1 0101 Unallocated.
1 0110 LD1 (multiple structures) - Three registers variant on page C7-1047
1 0111 LD1 (multiple structures) - One register variant on page C7-1047
1 1000 LD2 (multiple structures)
1 1001 Unallocated.
1 1010 LD1 (multiple structures) - Two registers variant on page C7-1047
1 1011 Unallocated.
1 11xx Unallocated.











C4.4.2 Advanced SIMD load/store multiple structures (post-indexed) 


This section describes the encoding of the Advanced SIMD load/store multiple structures (post-indexed) instruction 
class. This section is decoded from Loads and stores on page C4-202. 


Field layout: 0 [31] | Q [30] | 0011001 [29:23] | L [22] | 0 [21] | Rm [20:16] | opcode [15:12] | size [11:10] | Rn [9:5] | Rt [4:0]

L Rm opcode Instruction Page
0 - 0001 Unallocated.
0 - 0011 Unallocated.
0 - 0101 Unallocated.
0 - 1001 Unallocated.
0 - 1011 Unallocated.
0 - 11xx Unallocated.
0 != 11111 0000 ST4 (multiple structures) - Register offset variant on page C7-1343
0 != 11111 0010 ST1 (multiple structures) - Four registers, register offset variant on page C7-1322
0 != 11111 0100 ST3 (multiple structures) - Register offset variant on page C7-1336
0 != 11111 0110 ST1 (multiple structures) - Three registers, register offset variant on page C7-1322
0 != 11111 0111 ST1 (multiple structures) - One register, register offset variant on page C7-1321
0 != 11111 1000 ST2 (multiple structures) - Register offset variant on page C7-1329
0 != 11111 1010 ST1 (multiple structures) - Two registers, register offset variant on page C7-1322
0 11111 0000 ST4 (multiple structures) - Immediate offset variant on page C7-1343
0 11111 0010 ST1 (multiple structures) - Four registers, immediate offset variant on page C7-1322
0 11111 0100 ST3 (multiple structures) - Immediate offset variant on page C7-1336
0 11111 0110 ST1 (multiple structures) - Three registers, immediate offset variant on page C7-1322
0 11111 0111 ST1 (multiple structures) - One register, immediate offset variant on page C7-1321
0 11111 1000 ST2 (multiple structures) - Immediate offset variant on page C7-1329
0 11111 1010 ST1 (multiple structures) - Two registers, immediate offset variant on page C7-1322
1 - 0001 Unallocated.
1 - 0011 Unallocated.
1 - 0101 Unallocated.
1 - 1001 Unallocated.
1 - 1011 Unallocated.
1 - 11xx Unallocated.
1 != 11111 0000 LD4 (multiple structures) - Register offset variant on page C7-1078
1 != 11111 0010 LD1 (multiple structures) - Four registers, register offset variant on page C7-1048
1 != 11111 0100 LD3 (multiple structures) - Register offset variant on page C7-1068
1 != 11111 0110 LD1 (multiple structures) - Three registers, register offset variant on page C7-1048
1 != 11111 0111 LD1 (multiple structures) - One register, register offset variant on page C7-1047
1 != 11111 1000 LD2 (multiple structures) - Register offset variant on page C7-1058
1 != 11111 1010 LD1 (multiple structures) - Two registers, register offset variant on page C7-1048
1 11111 0000 LD4 (multiple structures) - Immediate offset variant on page C7-1078
1 11111 0010 LD1 (multiple structures) - Four registers, immediate offset variant on page C7-1048
1 11111 0100 LD3 (multiple structures) - Immediate offset variant on page C7-1068
1 11111 0110 LD1 (multiple structures) - Three registers, immediate offset variant on page C7-1048
1 11111 0111 LD1 (multiple structures) - One register, immediate offset variant on page C7-1047
1 11111 1000 LD2 (multiple structures) - Immediate offset variant on page C7-1058
1 11111 1010 LD1 (multiple structures) - Two registers, immediate offset variant on page C7-1048











C4.4.3 Advanced SIMD load/store single structure 


This section describes the encoding of the Advanced SIMD load/store single structure instruction class. This section is decoded from Loads and stores on page C4-202.

Field layout: 0 [31] | Q [30] | 0011010 [29:23] | L [22] | R [21] | 00000 [20:16] | opcode [15:13] | S [12] | size [11:10] | Rn [9:5] | Rt [4:0]

L R opcode S size Instruction Page
0 - 11x - - Unallocated.
0 0 000 - - ST1 (single structure) - 8-bit variant on page C7-1325
0 0 001 - - ST3 (single structure) - 8-bit variant on page C7-1339
0 0 010 - x0 ST1 (single structure) - 16-bit variant on page C7-1325
0 0 010 - x1 Unallocated.
0 0 011 - x0 ST3 (single structure) - 16-bit variant on page C7-1339
0 0 011 - x1 Unallocated.
0 0 100 - 00 ST1 (single structure) - 32-bit variant on page C7-1325
0 0 100 - 1x Unallocated.
0 0 100 0 01 ST1 (single structure) - 64-bit variant on page C7-1325
0 0 100 1 01 Unallocated.
0 0 101 - 00 ST3 (single structure) - 32-bit variant on page C7-1339
0 0 101 - 10 Unallocated.
0 0 101 0 01 ST3 (single structure) - 64-bit variant on page C7-1339
0 0 101 0 11 Unallocated.
0 0 101 1 x1 Unallocated.
0 1 000 - - ST2 (single structure) - 8-bit variant on page C7-1332
0 1 001 - - ST4 (single structure) - 8-bit variant on page C7-1346
0 1 010 - x0 ST2 (single structure) - 16-bit variant on page C7-1332
0 1 010 - x1 Unallocated.
0 1 011 - x0 ST4 (single structure) - 16-bit variant on page C7-1346
0 1 011 - x1 Unallocated.
0 1 100 - 00 ST2 (single structure) - 32-bit variant on page C7-1332
0 1 100 - 10 Unallocated.
0 1 100 0 01 ST2 (single structure) - 64-bit variant on page C7-1332
0 1 100 0 11 Unallocated.
0 1 100 1 x1 Unallocated.
0 1 101 - 00 ST4 (single structure) - 32-bit variant on page C7-1346
0 1 101 - 10 Unallocated.
0 1 101 0 01 ST4 (single structure) - 64-bit variant on page C7-1346
0 1 101 0 11 Unallocated.
0 1 101 1 x1 Unallocated.
1 0 000 - - LD1 (single structure) - 8-bit variant on page C7-1051
1 0 001 - - LD3 (single structure) - 8-bit variant on page C7-1071
1 0 010 - x0 LD1 (single structure) - 16-bit variant on page C7-1051
1 0 010 - x1 Unallocated.
1 0 011 - x0 LD3 (single structure) - 16-bit variant on page C7-1071
1 0 011 - x1 Unallocated.
1 0 100 - 00 LD1 (single structure) - 32-bit variant on page C7-1051
1 0 100 - 1x Unallocated.
1 0 100 0 01 LD1 (single structure) - 64-bit variant on page C7-1051
1 0 100 1 01 Unallocated.
1 0 101 - 00 LD3 (single structure) - 32-bit variant on page C7-1071
1 0 101 - 10 Unallocated.
1 0 101 0 01 LD3 (single structure) - 64-bit variant on page C7-1071
1 0 101 0 11 Unallocated.
1 0 101 1 x1 Unallocated.
1 0 110 0 - LD1R
1 0 110 1 - Unallocated.
1 0 111 0 - LD3R
1 0 111 1 - Unallocated.
1 1 000 - - LD2 (single structure) - 8-bit variant on page C7-1061
1 1 001 - - LD4 (single structure) - 8-bit variant on page C7-1081
1 1 010 - x0 LD2 (single structure) - 16-bit variant on page C7-1061
1 1 010 - x1 Unallocated.
1 1 011 - x0 LD4 (single structure) - 16-bit variant on page C7-1081
1 1 011 - x1 Unallocated.
1 1 100 - 00 LD2 (single structure) - 32-bit variant on page C7-1061
1 1 100 - 10 Unallocated.
1 1 100 0 01 LD2 (single structure) - 64-bit variant on page C7-1061
1 1 100 0 11 Unallocated.
1 1 100 1 x1 Unallocated.
1 1 101 - 00 LD4 (single structure) - 32-bit variant on page C7-1081
1 1 101 - 10 Unallocated.
1 1 101 0 01 LD4 (single structure) - 64-bit variant on page C7-1081
1 1 101 0 11 Unallocated.
1 1 101 1 x1 Unallocated.
1 1 110 0 - LD2R
1 1 110 1 - Unallocated.
1 1 111 0 - LD4R
1 1 111 1 - Unallocated.








C4.4.4 Advanced SIMD load/store single structure (post-indexed) 


This section describes the encoding of the Advanced SIMD load/store single structure (post-indexed) instruction 
class. This section is decoded from Loads and stores on page C4-202. 















































|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 1312/11109 | 5 4| 0| 
fofajo ot 1o14tftiR] Rm fopcode|s{size] Rn | Rt 
Decode fields 
Instruction Page 
L R Rm opcode size 
@ - - 11x - Unallocated. 
0 Oo - 010 x1 Unallocated. 
0 Oo - Q11 x1 Unallocated. 
0 Oo - 100 1x Unallocated. 
0 eo - 100 01 Unallocated. 
0 oOo - 101 10 Unallocated. 
@ Oo - 101 11 Unallocated. 
®@ Oo - 101 x1 Unallocated. 
0 0 '= 11111 000 - STI (single structure) - 8-bit, register offset variant on page C7-1325 
0 0 '= 11111 001 - ST3 (single structure) - 8-bit, register offset variant on page C7-1339 
0 0 != 11111 010 x ST1 (single structure) - 16-bit, register offset variant on page C7-1326 
0 20 {= 11111 011 x0 ST3 (single structure) - 16-bit, register offset variant on page C7-1340 
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C4.4 Loads and stores 





Decode fields 
Instruction Page 




















































































































L R Rm opcode S_ size 

0 20 {= 11111 100 - 00 ST1 (single structure) - 32-bit, register offset variant on page C7-1326 

0  0  != 11111  100  0  01  ST1 (single structure) - 64-bit, register offset variant on page C7-1326
0  0  != 11111  101  -  00  ST3 (single structure) - 32-bit, register offset variant on page C7-1340
0  0  != 11111  101  0  01  ST3 (single structure) - 64-bit, register offset variant on page C7-1340
0  0  11111     000  -  -   ST1 (single structure) - 8-bit, immediate offset variant on page C7-1325
0  0  11111     001  -  -   ST3 (single structure) - 8-bit, immediate offset variant on page C7-1339
0  0  11111     010  -  x0  ST1 (single structure) - 16-bit, immediate offset variant on page C7-1326
0  0  11111     011  -  x0  ST3 (single structure) - 16-bit, immediate offset variant on page C7-1340
0  0  11111     100  -  00  ST1 (single structure) - 32-bit, immediate offset variant on page C7-1326
0  0  11111     100  0  01  ST1 (single structure) - 64-bit, immediate offset variant on page C7-1326
0  0  11111     101  -  00  ST3 (single structure) - 32-bit, immediate offset variant on page C7-1340
0  0  11111     101  0  01  ST3 (single structure) - 64-bit, immediate offset variant on page C7-1340
0  1  -         010  -  x1  Unallocated.
0  1  -         011  -  x1  Unallocated.
0  1  -         100  -  10  Unallocated.
0  1  -         100  0  11  Unallocated.
0  1  -         100  1  x1  Unallocated.
0  1  -         101  -  10  Unallocated.
0  1  -         101  0  11  Unallocated.
0  1  -         101  1  x1  Unallocated.
0  1  != 11111  000  -  -   ST2 (single structure) - 8-bit, register offset variant on page C7-1332
0  1  != 11111  001  -  -   ST4 (single structure) - 8-bit, register offset variant on page C7-1346
0  1  != 11111  010  -  x0  ST2 (single structure) - 16-bit, register offset variant on page C7-1333
0  1  != 11111  011  -  x0  ST4 (single structure) - 16-bit, register offset variant on page C7-1347
0  1  != 11111  100  -  00  ST2 (single structure) - 32-bit, register offset variant on page C7-1333
0  1  != 11111  100  0  01  ST2 (single structure) - 64-bit, register offset variant on page C7-1333
0  1  != 11111  101  -  00  ST4 (single structure) - 32-bit, register offset variant on page C7-1347
0  1  != 11111  101  0  01  ST4 (single structure) - 64-bit, register offset variant on page C7-1347
0  1  11111     000  -  -   ST2 (single structure) - 8-bit, immediate offset variant on page C7-1332
0  1  11111     001  -  -   ST4 (single structure) - 8-bit, immediate offset variant on page C7-1346
0  1  11111     010  -  x0  ST2 (single structure) - 16-bit, immediate offset variant on page C7-1333
0  1  11111     011  -  x0  ST4 (single structure) - 16-bit, immediate offset variant on page C7-1347
0  1  11111     100  -  00  ST2 (single structure) - 32-bit, immediate offset variant on page C7-1333
0  1  11111     100  0  01  ST2 (single structure) - 64-bit, immediate offset variant on page C7-1333
0  1  11111     101  -  00  ST4 (single structure) - 32-bit, immediate offset variant on page C7-1347
0  1  11111     101  0  01  ST4 (single structure) - 64-bit, immediate offset variant on page C7-1347
1  0  -         010  -  x1  Unallocated.
1  0  -         011  -  x1  Unallocated.
1  0  -         100  -  1x  Unallocated.
1  0  -         100  1  01  Unallocated.
1  0  -         101  -  10  Unallocated.
1  0  -         101  0  11  Unallocated.
1  0  -         101  1  x1  Unallocated.
1  0  -         110  1  -   Unallocated.
1  0  -         111  1  -   Unallocated.
1  0  != 11111  000  -  -   LD1 (single structure) - 8-bit, register offset variant on page C7-1051
1  0  != 11111  001  -  -   LD3 (single structure) - 8-bit, register offset variant on page C7-1071
1  0  != 11111  010  -  x0  LD1 (single structure) - 16-bit, register offset variant on page C7-1052
1  0  != 11111  011  -  x0  LD3 (single structure) - 16-bit, register offset variant on page C7-1072
1  0  != 11111  100  -  00  LD1 (single structure) - 32-bit, register offset variant on page C7-1052
1  0  != 11111  100  0  01  LD1 (single structure) - 64-bit, register offset variant on page C7-1052
1  0  != 11111  101  -  00  LD3 (single structure) - 32-bit, register offset variant on page C7-1072
1  0  != 11111  101  0  01  LD3 (single structure) - 64-bit, register offset variant on page C7-1072
1  0  != 11111  110  0  -   LD1R - Register offset variant on page C7-1055
1  0  != 11111  111  0  -   LD3R - Register offset variant on page C7-1075
1  0  11111     000  -  -   LD1 (single structure) - 8-bit, immediate offset variant on page C7-1051
1  0  11111     001  -  -   LD3 (single structure) - 8-bit, immediate offset variant on page C7-1071
1  0  11111     010  -  x0  LD1 (single structure) - 16-bit, immediate offset variant on page C7-1052
1  0  11111     011  -  x0  LD3 (single structure) - 16-bit, immediate offset variant on page C7-1072
1  0  11111     100  -  00  LD1 (single structure) - 32-bit, immediate offset variant on page C7-1052
1  0  11111     100  0  01  LD1 (single structure) - 64-bit, immediate offset variant on page C7-1052
1  0  11111     101  -  00  LD3 (single structure) - 32-bit, immediate offset variant on page C7-1072
1  0  11111     101  0  01  LD3 (single structure) - 64-bit, immediate offset variant on page C7-1072
1  0  11111     110  0  -   LD1R - Immediate offset variant on page C7-1055
1  0  11111     111  0  -   LD3R - Immediate offset variant on page C7-1075
1  1  -         010  -  x1  Unallocated.
1  1  -         011  -  x1  Unallocated.
1  1  -         100  -  10  Unallocated.
1  1  -         100  0  11  Unallocated.
1  1  -         100  1  x1  Unallocated.
1  1  -         101  -  10  Unallocated.
1  1  -         101  0  11  Unallocated.
1  1  -         101  1  x1  Unallocated.
1  1  -         110  1  -   Unallocated.
1  1  -         111  1  -   Unallocated.
1  1  != 11111  000  -  -   LD2 (single structure) - 8-bit, register offset variant on page C7-1061
1  1  != 11111  001  -  -   LD4 (single structure) - 8-bit, register offset variant on page C7-1081
1  1  != 11111  010  -  x0  LD2 (single structure) - 16-bit, register offset variant on page C7-1062
1  1  != 11111  011  -  x0  LD4 (single structure) - 16-bit, register offset variant on page C7-1082
1  1  != 11111  100  -  00  LD2 (single structure) - 32-bit, register offset variant on page C7-1062
1  1  != 11111  100  0  01  LD2 (single structure) - 64-bit, register offset variant on page C7-1062
1  1  != 11111  101  -  00  LD4 (single structure) - 32-bit, register offset variant on page C7-1082
1  1  != 11111  101  0  01  LD4 (single structure) - 64-bit, register offset variant on page C7-1082
1  1  != 11111  110  0  -   LD2R - Register offset variant on page C7-1065
1  1  != 11111  111  0  -   LD4R - Register offset variant on page C7-1085
1  1  11111     000  -  -   LD2 (single structure) - 8-bit, immediate offset variant on page C7-1061
1  1  11111     001  -  -   LD4 (single structure) - 8-bit, immediate offset variant on page C7-1081
1  1  11111     010  -  x0  LD2 (single structure) - 16-bit, immediate offset variant on page C7-1062
1  1  11111     011  -  x0  LD4 (single structure) - 16-bit, immediate offset variant on page C7-1082
1  1  11111     100  -  00  LD2 (single structure) - 32-bit, immediate offset variant on page C7-1062
1  1  11111     100  0  01  LD2 (single structure) - 64-bit, immediate offset variant on page C7-1062
1  1  11111     101  -  00  LD4 (single structure) - 32-bit, immediate offset variant on page C7-1082
1  1  11111     101  0  01  LD4 (single structure) - 64-bit, immediate offset variant on page C7-1082
1  1  11111     110  0  -   LD2R - Immediate offset variant on page C7-1065
1  1  11111     111  0  -   LD4R - Immediate offset variant on page C7-1085


C4.4.5 Load register (literal) 


This section describes the encoding of the Load register (literal) instruction class. This section is decoded from 
Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 | 23 ... 5 | 4 ... 0
  opc  |  0  1  1 | V  |  0  0 |  imm19   |   Rt

Decode fields
opc  V   Instruction Page
00   0   LDR (literal) - 32-bit variant on page C6-553
00   1   LDR (literal, SIMD&FP) - 32-bit variant on page C7-1097
01   0   LDR (literal) - 64-bit variant on page C6-553
01   1   LDR (literal, SIMD&FP) - 64-bit variant on page C7-1097
10   0   LDRSW (literal)
10   1   LDR (literal, SIMD&FP) - 128-bit variant on page C7-1097
11   0   PRFM (literal)
11   1   Unallocated.
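
The table lookup for this class can be sketched in a few lines. The following is a minimal illustration only, not part of the architecture text: the names `bits`, `LITERAL_TABLE`, and `decode_load_literal` are hypothetical, and the sketch assumes the field positions opc[31:30], V[26], imm19[23:5], Rt[4:0] with a byte offset of SignExtend(imm19:'00').

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] as an unsigned integer (hypothetical helper)."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

# (opc, V) -> table entry for the Load register (literal) class.
LITERAL_TABLE = {
    (0b00, 0): "LDR (literal) - 32-bit variant",
    (0b00, 1): "LDR (literal, SIMD&FP) - 32-bit variant",
    (0b01, 0): "LDR (literal) - 64-bit variant",
    (0b01, 1): "LDR (literal, SIMD&FP) - 64-bit variant",
    (0b10, 0): "LDRSW (literal)",
    (0b10, 1): "LDR (literal, SIMD&FP) - 128-bit variant",
    (0b11, 0): "PRFM (literal)",
    (0b11, 1): "Unallocated",
}

def decode_load_literal(word: int):
    """Return (entry, byte offset, Rt), or None if word is outside this class."""
    # Fixed bits of the class: [29:27] = 011 and [25:24] = 00.
    if bits(word, 29, 27) != 0b011 or bits(word, 25, 24) != 0b00:
        return None
    opc, v = bits(word, 31, 30), bits(word, 26, 26)
    imm19 = bits(word, 23, 5)
    if imm19 & (1 << 18):          # sign-extend the 19-bit immediate
        imm19 -= 1 << 19
    return LITERAL_TABLE[(opc, v)], imm19 * 4, bits(word, 4, 0)
```

For example, under these assumptions the word 0x58000040 selects the 64-bit LDR (literal) row with a byte offset of 8 and Rt = 0.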








C4.4.6 Load/store exclusive 
This section describes the encoding of the Load/store exclusive instruction class. This section is decoded from
Loads and stores on page C4-202.


 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20 ... 16 | 15 | 14 ... 10 | 9 ... 5 | 4 ... 0
  size |  0  0  1  0  0  0 | o2 | L  | o1 |    Rs     | o0 |    Rt2    |   Rn    |   Rt

Decode fields
size  o2  L  o1  o0  Instruction Page
-     1   -  0   0   Unallocated.
-     1   -  1   -   Unallocated.
0x    -   -  1   -   Unallocated.
00    0   0  0   0   STXRB
00    0   0  0   1   STLXRB
00    0   1  0   0   LDXRB
00    0   1  0   1   LDAXRB
00    1   0  0   1   STLRB
00    1   1  0   1   LDARB





01    0   0  0   0   STXRH
01    0   0  0   1   STLXRH
01    0   1  0   0   LDXRH
01    0   1  0   1   LDAXRH
01    1   0  0   1   STLRH
01    1   1  0   1   LDARH
10    0   0  0   0   STXR - 32-bit variant on page C6-720
10    0   0  0   1   STLXR - 32-bit variant on page C6-686
10    0   0  1   0   STXP - 32-bit variant on page C6-717
10    0   0  1   1   STLXP - 32-bit variant on page C6-683
10    0   1  0   0   LDXR - 32-bit variant on page C6-600
10    0   1  0   1   LDAXR - 32-bit variant on page C6-538
10    0   1  1   0   LDXP - 32-bit variant on page C6-598
10    0   1  1   1   LDAXP - 32-bit variant on page C6-536
10    1   0  0   1   STLR - 32-bit variant on page C6-680
10    1   1  0   1   LDAR - 32-bit variant on page C6-533
11    0   0  0   0   STXR - 64-bit variant on page C6-720
11    0   0  0   1   STLXR - 64-bit variant on page C6-686
11    0   0  1   0   STXP - 64-bit variant on page C6-717
11    0   0  1   1   STLXP - 64-bit variant on page C6-683
11    0   1  0   0   LDXR - 64-bit variant on page C6-600
11    0   1  0   1   LDAXR - 64-bit variant on page C6-538
11    0   1  1   0   LDXP - 64-bit variant on page C6-598
11    0   1  1   1   LDAXP - 64-bit variant on page C6-536
11    1   0  0   1   STLR - 64-bit variant on page C6-680
11    1   1  0   1   LDAR - 64-bit variant on page C6-533
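
Because the mnemonics in this table follow a regular pattern, the lookup can be expressed compactly. The sketch below is illustrative only: `decode_load_store_exclusive` is a hypothetical name, and it assumes the field positions size[31:30], o2[23], L[22], o1[21], o0[15] with fixed bits [29:24] = 001000.

```python
def decode_load_store_exclusive(word: int):
    """Return (mnemonic, access width in bits), or None if outside this class."""
    if (word >> 24) & 0x3F != 0b001000:      # fixed bits [29:24]
        return None
    size = (word >> 30) & 0b11
    o2 = (word >> 23) & 1
    l = (word >> 22) & 1
    o1 = (word >> 21) & 1
    o0 = (word >> 15) & 1
    # The three Unallocated rows at the top of the table.
    if (o2 == 1 and (o1 == 1 or o0 == 0)) or (o1 == 1 and size < 0b10):
        return ("Unallocated", None)
    if o2 == 1:
        # Non-exclusive load-acquire / store-release.
        name = "LDAR" if l else "STLR"
    else:
        # Exclusive forms: o0 adds acquire (loads) or release (stores),
        # o1 selects the pair form.
        ordering = ("A" if l else "L") if o0 else ""
        name = ("LD" if l else "ST") + ordering + "X" + ("P" if o1 else "R")
    suffix = {0b00: "B", 0b01: "H"}.get(size, "")
    return (name + suffix, 8 << size)
```

For instance, 0xC85F7C20 (an encoding of LDXR X0, [X1]) yields ("LDXR", 64) under these assumptions.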





C4.4.7 Load/store no-allocate pair (offset) 


This section describes the encoding of the Load/store no-allocate pair (offset) instruction class. This section is 
decoded from Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21 ... 15 | 14 ... 10 | 9 ... 5 | 4 ... 0
  opc  |  1  0  1 | V  |  0  0  0 | L  |   imm7    |    Rt2    |   Rn    |   Rt

Decode fields
opc  V  L  Instruction Page
00   0  0  STNP - 32-bit variant on page C6-692
00   0  1  LDNP - 32-bit variant on page C6-542
00   1  0  STNP (SIMD&FP) - 32-bit variant on page C7-1350
00   1  1  LDNP (SIMD&FP) - 32-bit variant on page C7-1088
01   0  -  Unallocated.
01   1  0  STNP (SIMD&FP) - 64-bit variant on page C7-1350
01   1  1  LDNP (SIMD&FP) - 64-bit variant on page C7-1088
10   0  0  STNP - 64-bit variant on page C6-692
10   0  1  LDNP - 64-bit variant on page C6-542
10   1  0  STNP (SIMD&FP) - 128-bit variant on page C7-1350
10   1  1  LDNP (SIMD&FP) - 128-bit variant on page C7-1088
11   -  -  Unallocated.

C4.4.8 Load/store register (immediate post-indexed) 


This section describes the encoding of the Load/store register (immediate post-indexed) instruction class. This 
section is decoded from Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20 ... 12 | 11 10 | 9 ... 5 | 4 ... 0
  size |  1  1  1 | V  |  0  0 |  opc  | 0  |   imm9    |  0  1 |   Rn    |   Rt

Decode fields
size  V  opc  Instruction Page
x1    1  1x   Unallocated.
00    0  00   STRB (immediate)
00    0  01   LDRB (immediate)
00    0  10   LDRSB (immediate) - 64-bit variant on page C6-565
00    0  11   LDRSB (immediate) - 32-bit variant on page C6-565
00    1  00   STR (immediate, SIMD&FP) - 8-bit variant on page C7-1355
00    1  01   LDR (immediate, SIMD&FP) - 8-bit variant on page C7-1093

00    1  10   STR (immediate, SIMD&FP) - 128-bit variant on page C7-1355
00    1  11   LDR (immediate, SIMD&FP) - 128-bit variant on page C7-1093
01    0  00   STRH (immediate)
01    0  01   LDRH (immediate)
01    0  10   LDRSH (immediate) - 64-bit variant on page C6-570
01    0  11   LDRSH (immediate) - 32-bit variant on page C6-570
01    1  00   STR (immediate, SIMD&FP) - 16-bit variant on page C7-1355
01    1  01   LDR (immediate, SIMD&FP) - 16-bit variant on page C7-1093
1x    0  11   Unallocated.
1x    1  1x   Unallocated.
10    0  00   STR (immediate) - 32-bit variant on page C6-697
10    0  01   LDR (immediate) - 32-bit variant on page C6-550
10    0  10   LDRSW (immediate)
10    1  00   STR (immediate, SIMD&FP) - 32-bit variant on page C7-1355
10    1  01   LDR (immediate, SIMD&FP) - 32-bit variant on page C7-1093
11    0  00   STR (immediate) - 64-bit variant on page C6-697
11    0  01   LDR (immediate) - 64-bit variant on page C6-550
11    0  10   Unallocated.
11    1  00   STR (immediate, SIMD&FP) - 64-bit variant on page C7-1355
11    1  01   LDR (immediate, SIMD&FP) - 64-bit variant on page C7-1093

C4.4.9 Load/store register (immediate pre-indexed) 


This section describes the encoding of the Load/store register (immediate pre-indexed) instruction class. This 
section is decoded from Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20 ... 12 | 11 10 | 9 ... 5 | 4 ... 0
  size |  1  1  1 | V  |  0  0 |  opc  | 0  |   imm9    |  1  1 |   Rn    |   Rt

Decode fields
size  V  opc  Instruction Page
x1    1  1x   Unallocated.
00    0  00   STRB (immediate)
00    0  01   LDRB (immediate)





00    0  10   LDRSB (immediate) - 64-bit variant on page C6-565
00    0  11   LDRSB (immediate) - 32-bit variant on page C6-565
00    1  00   STR (immediate, SIMD&FP) - 8-bit variant on page C7-1355
00    1  01   LDR (immediate, SIMD&FP) - 8-bit variant on page C7-1093
00    1  10   STR (immediate, SIMD&FP) - 128-bit variant on page C7-1356
00    1  11   LDR (immediate, SIMD&FP) - 128-bit variant on page C7-1094
01    0  00   STRH (immediate)
01    0  01   LDRH (immediate)
01    0  10   LDRSH (immediate) - 64-bit variant on page C6-570
01    0  11   LDRSH (immediate) - 32-bit variant on page C6-570
01    1  00   STR (immediate, SIMD&FP) - 16-bit variant on page C7-1356
01    1  01   LDR (immediate, SIMD&FP) - 16-bit variant on page C7-1094
1x    0  11   Unallocated.
1x    1  1x   Unallocated.
10    0  00   STR (immediate) - 32-bit variant on page C6-697
10    0  01   LDR (immediate) - 32-bit variant on page C6-550
10    0  10   LDRSW (immediate)
10    1  00   STR (immediate, SIMD&FP) - 32-bit variant on page C7-1356
10    1  01   LDR (immediate, SIMD&FP) - 32-bit variant on page C7-1094
11    0  00   STR (immediate) - 64-bit variant on page C6-697
11    0  01   LDR (immediate) - 64-bit variant on page C6-550
11    0  10   Unallocated.
11    1  00   STR (immediate, SIMD&FP) - 64-bit variant on page C7-1356
11    1  01   LDR (immediate, SIMD&FP) - 64-bit variant on page C7-1094





C4.4.10 Load/store register (register offset) 


This section describes the encoding of the Load/store register (register offset) instruction class. This section is 
decoded from Loads and stores on page C4-202. 
 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20 ... 16 | 15 ... 13 | 12 | 11 10 | 9 ... 5 | 4 ... 0
  size |  1  1  1 | V  |  0  0 |  opc  | 1  |    Rm     |  option   | S  |  1  0 |   Rn    |   Rt

Decode fields
size  V  opc  option  Instruction Page
-     -  -    x0x     Unallocated.
x1    1  1x   -       Unallocated.
00    0  00   != 011  STRB (register) - Extended register variant on page C6-704
00    0  00   011     STRB (register) - Shifted register variant on page C6-704
00    0  01   != 011  LDRB (register) - Extended register variant on page C6-559
00    0  01   011     LDRB (register) - Shifted register variant on page C6-559
00    0  10   != 011  LDRSB (register) - 64-bit with extended register offset variant on page C6-568
00    0  10   011     LDRSB (register) - 64-bit with shifted register offset variant on page C6-568
00    0  11   != 011  LDRSB (register) - 32-bit with extended register offset variant on page C6-568
00    0  11   011     LDRSB (register) - 32-bit with shifted register offset variant on page C6-568
00    1  00   != 011  STR (register, SIMD&FP)
00    1  00   011     STR (register, SIMD&FP)
00    1  01   != 011  LDR (register, SIMD&FP)
00    1  01   011     LDR (register, SIMD&FP)
00    1  10   -       STR (register, SIMD&FP)
00    1  11   -       LDR (register, SIMD&FP)
01    0  00   -       STRH (register)
01    0  01   -       LDRH (register)
01    0  10   -       LDRSH (register) - 64-bit variant on page C6-573
01    0  11   -       LDRSH (register) - 32-bit variant on page C6-573
01    1  00   -       STR (register, SIMD&FP)
01    1  01   -       LDR (register, SIMD&FP)
1x    0  11   -       Unallocated.
1x    1  1x   -       Unallocated.
10    0  00   -       STR (register) - 32-bit variant on page C6-700
10    0  01   -       LDR (register) - 32-bit variant on page C6-555
10    0  10   -       LDRSW (register)
10    1  00   -       STR (register, SIMD&FP)
10    1  01   -       LDR (register, SIMD&FP)
11    0  00   -       STR (register) - 64-bit variant on page C6-700
11    0  01   -       LDR (register) - 64-bit variant on page C6-555
11    0  10   -       PRFM (register)
11    1  00   -       STR (register, SIMD&FP)
11    1  01   -       LDR (register, SIMD&FP)





C4.4.11 Load/store register (unprivileged) 


This section describes the encoding of the Load/store register (unprivileged) instruction class. This section is 
decoded from Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20 ... 12 | 11 10 | 9 ... 5 | 4 ... 0
  size |  1  1  1 | V  |  0  0 |  opc  | 0  |   imm9    |  1  0 |   Rn    |   Rt

Decode fields
size  V  opc  Instruction Page
-     1  -    Unallocated.
00    0  00   STTRB
00    0  01   LDTRB
00    0  10   LDTRSB - 64-bit variant on page C6-584
00    0  11   LDTRSB - 32-bit variant on page C6-584
01    0  00   STTRH
01    0  01   LDTRH
01    0  10   LDTRSH - 64-bit variant on page C6-586
01    0  11   LDTRSH - 32-bit variant on page C6-586
1x    0  11   Unallocated.
10    0  00   STTR - 32-bit variant on page C6-710
10    0  01   LDTR - 32-bit variant on page C6-580
10    0  10   LDTRSW
11    0  00   STTR - 64-bit variant on page C6-710
11    0  01   LDTR - 64-bit variant on page C6-580
11    0  10   Unallocated.
C4.4.12 Load/store register (unscaled immediate)


This section describes the encoding of the Load/store register (unscaled immediate) instruction class. This section 
is decoded from Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20 ... 12 | 11 10 | 9 ... 5 | 4 ... 0
  size |  1  1  1 | V  |  0  0 |  opc  | 0  |   imm9    |  0  0 |   Rn    |   Rt

Decode fields
size  V  opc  Instruction Page
x1    1  1x   Unallocated.
00    0  00   STURB
00    0  01   LDURB
00    0  10   LDURSB - 64-bit variant on page C6-593
00    0  11   LDURSB - 32-bit variant on page C6-593
00    1  00   STUR (SIMD&FP) - 8-bit variant on page C7-1362
00    1  01   LDUR (SIMD&FP) - 8-bit variant on page C7-1102
00    1  10   STUR (SIMD&FP) - 128-bit variant on page C7-1362
00    1  11   LDUR (SIMD&FP) - 128-bit variant on page C7-1102
01    0  00   STURH
01    0  01   LDURH
01    0  10   LDURSH - 64-bit variant on page C6-595
01    0  11   LDURSH - 32-bit variant on page C6-595
01    1  00   STUR (SIMD&FP) - 16-bit variant on page C7-1362
01    1  01   LDUR (SIMD&FP) - 16-bit variant on page C7-1102
1x    0  11   Unallocated.
1x    1  1x   Unallocated.
10    0  00   STUR - 32-bit variant on page C6-714
10    0  01   LDUR - 32-bit variant on page C6-589
10    0  10   LDURSW
10    1  00   STUR (SIMD&FP) - 32-bit variant on page C7-1362
10    1  01   LDUR (SIMD&FP) - 32-bit variant on page C7-1102
11    0  00   STUR - 64-bit variant on page C6-714
11    0  01   LDUR - 64-bit variant on page C6-589





11    0  10   PRFM (unscaled offset)
11    1  00   STUR (SIMD&FP) - 64-bit variant on page C7-1362
11    1  01   LDUR (SIMD&FP) - 64-bit variant on page C7-1102





C4.4.13 Load/store register (unsigned immediate) 


This section describes the encoding of the Load/store register (unsigned immediate) instruction class. This section 
is decoded from Loads and stores on page C4-202. 


 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 ... 10 | 9 ... 5 | 4 ... 0
  size |  1  1  1 | V  |  0  1 |  opc  |   imm12   |   Rn    |   Rt

Decode fields
size  V  opc  Instruction Page
x1    1  1x   Unallocated.
00    0  00   STRB (immediate)
00    0  01   LDRB (immediate)
00    0  10   LDRSB (immediate) - 64-bit variant on page C6-566
00    0  11   LDRSB (immediate) - 32-bit variant on page C6-566
00    1  00   STR (immediate, SIMD&FP) - 8-bit variant on page C7-1356
00    1  01   LDR (immediate, SIMD&FP) - 8-bit variant on page C7-1094
00    1  10   STR (immediate, SIMD&FP) - 128-bit variant on page C7-1357
00    1  11   LDR (immediate, SIMD&FP) - 128-bit variant on page C7-1095
01    0  00   STRH (immediate)
01    0  01   LDRH (immediate)
01    0  10   LDRSH (immediate) - 64-bit variant on page C6-571
01    0  11   LDRSH (immediate) - 32-bit variant on page C6-571
01    1  00   STR (immediate, SIMD&FP) - 16-bit variant on page C7-1356
01    1  01   LDR (immediate, SIMD&FP) - 16-bit variant on page C7-1094
1x    0  11   Unallocated.
1x    1  1x   Unallocated.
10    0  00   STR (immediate) - 32-bit variant on page C6-698
10    0  01   LDR (immediate) - 32-bit variant on page C6-551
10    0  10   LDRSW (immediate)





10    1  00   STR (immediate, SIMD&FP) - 32-bit variant on page C7-1356
10    1  01   LDR (immediate, SIMD&FP) - 32-bit variant on page C7-1094
11    0  00   STR (immediate) - 64-bit variant on page C6-698
11    0  01   LDR (immediate) - 64-bit variant on page C6-551
11    0  10   PRFM (immediate)
11    1  00   STR (immediate, SIMD&FP) - 64-bit variant on page C7-1356
11    1  01   LDR (immediate, SIMD&FP) - 64-bit variant on page C7-1094
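
The six Load/store register classes in C4.4.8 to C4.4.13 share the size, V, and opc fields and differ only in a few fixed bits. The following sketch shows one way to tell them apart. It is illustrative only: `classify_load_store_register` is a hypothetical name, and the sketch assumes fixed bits [29:27] = 111 and [25] = 0, with bit 24, bit 21, and bits [11:10] selecting the class. It considers only the classes listed in this section.

```python
# Bits [11:10] select the class when bit 24 = 0 and bit 21 = 0.
IMM9_FORMS = {
    0b00: "unscaled immediate",
    0b01: "immediate post-indexed",
    0b10: "unprivileged",
    0b11: "immediate pre-indexed",
}

def classify_load_store_register(word: int):
    """Return the addressing-form class name, or None if none of these classes match."""
    if (word >> 27) & 0b111 != 0b111 or (word >> 25) & 1:
        return None
    if (word >> 24) & 1:                 # bits [25:24] = 01
        return "unsigned immediate"
    low = (word >> 10) & 0b11
    if (word >> 21) & 1:                 # bit 21 = 1 requires bits [11:10] = 10
        return "register offset" if low == 0b10 else None
    return IMM9_FORMS[low]
```

Once the class is known, the same (size, V, opc) triple indexes the corresponding table above.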





C4.4.14 Load/store register pair (offset) 


This section describes the encoding of the Load/store register pair (offset) instruction class. This section is decoded
from Loads and stores on page C4-202.

 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21 ... 15 | 14 ... 10 | 9 ... 5 | 4 ... 0
  opc  |  1  0  1 | V  |  0  1  0 | L  |   imm7    |    Rt2    |   Rn    |   Rt

Decode fields
opc  V  L  Instruction Page
00   0  0  STP - 32-bit variant on page C6-695
00   0  1  LDP - 32-bit variant on page C6-545
00   1  0  STP (SIMD&FP) - 32-bit variant on page C7-1353
00   1  1  LDP (SIMD&FP) - 32-bit variant on page C7-1091
01   0  0  Unallocated.
01   0  1  LDPSW
01   1  0  STP (SIMD&FP) - 64-bit variant on page C7-1353
01   1  1  LDP (SIMD&FP) - 64-bit variant on page C7-1091
10   0  0  STP - 64-bit variant on page C6-695
10   0  1  LDP - 64-bit variant on page C6-545
10   1  0  STP (SIMD&FP) - 128-bit variant on page C7-1353
10   1  1  LDP (SIMD&FP) - 128-bit variant on page C7-1091
11   -  -  Unallocated.








C4.4.15 Load/store register pair (post-indexed) 


This section describes the encoding of the Load/store register pair (post-indexed) instruction class. This section is
decoded from Loads and stores on page C4-202.

 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21 ... 15 | 14 ... 10 | 9 ... 5 | 4 ... 0
  opc  |  1  0  1 | V  |  0  0  1 | L  |   imm7    |    Rt2    |   Rn    |   Rt

Decode fields
opc  V  L  Instruction Page
00   0  0  STP - 32-bit variant on page C6-694
00   0  1  LDP - 32-bit variant on page C6-544
00   1  0  STP (SIMD&FP) - 32-bit variant on page C7-1352
00   1  1  LDP (SIMD&FP) - 32-bit variant on page C7-1090
01   0  0  Unallocated.
01   0  1  LDPSW
01   1  0  STP (SIMD&FP) - 64-bit variant on page C7-1352
01   1  1  LDP (SIMD&FP) - 64-bit variant on page C7-1090
10   0  0  STP - 64-bit variant on page C6-694
10   0  1  LDP - 64-bit variant on page C6-544
10   1  0  STP (SIMD&FP) - 128-bit variant on page C7-1352
10   1  1  LDP (SIMD&FP) - 128-bit variant on page C7-1090
11   -  -  Unallocated.
































C4.4.16 Load/store register pair (pre-indexed) 
This section describes the encoding of the Load/store register pair (pre-indexed) instruction class. This section is 
decoded from Loads and stores on page C4-202. 
 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21 ... 15 | 14 ... 10 | 9 ... 5 | 4 ... 0
  opc  |  1  0  1 | V  |  0  1  1 | L  |   imm7    |    Rt2    |   Rn    |   Rt

Decode fields
opc  V  L  Instruction Page
00   0  0  STP - 32-bit variant on page C6-694
00   0  1  LDP - 32-bit variant on page C6-544
00   1  0  STP (SIMD&FP) - 32-bit variant on page C7-1352
00   1  1  LDP (SIMD&FP) - 32-bit variant on page C7-1090
01   0  0  Unallocated.
01   0  1  LDPSW





01   1  0  STP (SIMD&FP) - 64-bit variant on page C7-1352
01   1  1  LDP (SIMD&FP) - 64-bit variant on page C7-1090
10   0  0  STP - 64-bit variant on page C6-694
10   0  1  LDP - 64-bit variant on page C6-544
10   1  0  STP (SIMD&FP) - 128-bit variant on page C7-1352
10   1  1  LDP (SIMD&FP) - 128-bit variant on page C7-1090
11   -  -  Unallocated.
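
The no-allocate pair and register pair classes use the same opc, V, and L decode fields. A decoder for all four can be sketched as follows. This is an illustration under stated assumptions, not the architecture's pseudocode: `decode_load_store_pair` and `PAIR_FORMS` are hypothetical names, bits [25:23] are assumed to select the class, and imm7 is assumed to be a signed offset scaled by the size of one transferred register, as described on the individual instruction pages.

```python
# Bits [25:23] -> pair class.
PAIR_FORMS = {
    0b000: "no-allocate pair (offset)",
    0b001: "pair (post-indexed)",
    0b010: "pair (offset)",
    0b011: "pair (pre-indexed)",
}

def decode_load_store_pair(word: int):
    """Return (mnemonic, class, byte offset), or None if outside these classes."""
    if (word >> 27) & 0b111 != 0b101:
        return None
    form = PAIR_FORMS.get((word >> 23) & 0b111)
    if form is None:
        return None
    opc = (word >> 30) & 0b11
    v = (word >> 26) & 1
    l = (word >> 22) & 1
    no_alloc = form.startswith("no-allocate")
    # opc = 11 is always Unallocated; opc = 01 with V = 0 is LDPSW only
    # for loads in the three register pair classes.
    if opc == 0b11 or (opc == 0b01 and v == 0 and (no_alloc or l == 0)):
        return ("Unallocated", form, None)
    if opc == 0b01 and v == 0:
        name, scale = "LDPSW", 4
    else:
        name = ("LD" if l else "ST") + ("NP" if no_alloc else "P")
        if v:
            name += " (SIMD&FP)"
            scale = 4 << opc               # 32-, 64-, or 128-bit registers
        else:
            scale = 4 if opc == 0b00 else 8
    imm7 = (word >> 15) & 0x7F
    if imm7 & 0x40:                        # sign-extend the 7-bit immediate
        imm7 -= 0x80
    return (name, form, imm7 * scale)
```

For example, 0xA9BF7BFD (an encoding of STP X29, X30, [SP, #-16]!) yields ("STP", "pair (pre-indexed)", -16) under these assumptions.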


C4.5 Data processing - register 


This section describes the encoding of the Data processing (register) instruction group. This section is decoded from 
A64 instruction index by encoding on page C4-192. For additional information on this functional group of 
instructions, see Data processing - register on page C3-163. 


 31 | 30  | 29 | 28  | 27 26 25 | 24 ... 21 | 20 ... 12 | 11  | 10 ... 0
    | op0 |    | op1 |  1  0  1 |    op2    |           | op3 |

Table C4-5 Encoding table for the Data Processing -- Register group
Decode fields
op0  op1  op2   op3  Decode group or instruction page
0    1    0110  -    Data-processing (2 source) on page C4-229
1    1    0110  -    Data-processing (1 source) on page C4-228
-    0    0xxx  -    Logical (shifted register) on page C4-231
-    0    1xx0  -    Add/subtract (shifted register) on page C4-225
-    0    1xx1  -    Add/subtract (extended register)
-    1    0000  -    Add/subtract (with carry) on page C4-225
-    1    0010  0    Conditional compare (register) on page C4-227
-    1    0010  1    Conditional compare (immediate) on page C4-226
-    1    0100  -    Conditional select on page C4-227
-    1    0xx1  -    Unallocated.
-    1    1xxx  -    Data-processing (3 source) on page C4-230
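
Table C4-5 can be applied mechanically by matching each decode field against its pattern, where x matches either bit value and - matches any value. The sketch below illustrates this. The names `matches`, `TABLE_C4_5`, and `decode_dp_register` are hypothetical, and the field positions op0 = bit 30, op1 = bit 28, op2 = bits [24:21], op3 = bit 11 are assumptions taken from the encoding diagram.

```python
def matches(value: int, width: int, pattern: str) -> bool:
    """True if value, written as `width` binary digits, fits the table pattern."""
    if pattern == "-":
        return True
    digits = format(value, f"0{width}b")
    return all(p in ("x", d) for p, d in zip(pattern, digits))

# (op0, op1, op2, op3, decode group), in table order.
TABLE_C4_5 = [
    ("0", "1", "0110", "-", "Data-processing (2 source)"),
    ("1", "1", "0110", "-", "Data-processing (1 source)"),
    ("-", "0", "0xxx", "-", "Logical (shifted register)"),
    ("-", "0", "1xx0", "-", "Add/subtract (shifted register)"),
    ("-", "0", "1xx1", "-", "Add/subtract (extended register)"),
    ("-", "1", "0000", "-", "Add/subtract (with carry)"),
    ("-", "1", "0010", "0", "Conditional compare (register)"),
    ("-", "1", "0010", "1", "Conditional compare (immediate)"),
    ("-", "1", "0100", "-", "Conditional select"),
    ("-", "1", "0xx1", "-", "Unallocated"),
    ("-", "1", "1xxx", "-", "Data-processing (3 source)"),
]

def decode_dp_register(word: int):
    """Return the decode group name, or None if bits [27:25] are not 101."""
    if (word >> 25) & 0b111 != 0b101:
        return None
    fields = [((word >> 30) & 1, 1), ((word >> 28) & 1, 1),
              ((word >> 21) & 0xF, 4), ((word >> 11) & 1, 1)]
    for *patterns, group in TABLE_C4_5:
        if all(matches(v, w, p) for (v, w), p in zip(fields, patterns)):
            return group
    return "Unallocated"
```

The rows do not overlap, so the first match is also the only match; the final fallback covers field combinations the table does not list.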
C4.5.1 Add/subtract (extended register) 


This section describes the encoding of the Add/subtract (extended register) instruction class. This section is decoded 
from Data processing - register. 


 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 ... 16 | 15 ... 13 | 12 ... 10 | 9 ... 5 | 4 ... 0
 sf | op | S  |  0  1  0  1  1 |  opt  | 1  |    Rm     |  option   |   imm3    |   Rn    |   Rd

Decode fields
sf  op  S  opt  imm3  Instruction Page
-   -   -  -    1x1   Unallocated.
-   -   -  -    11x   Unallocated.
-   -   -  x1   -     Unallocated.
-   -   -  1x   -     Unallocated.
































0   0   0  00   -     ADD (extended register) - 32-bit variant on page C6-437
0   0   1  00   -     ADDS (extended register) - 32-bit variant on page C6-443
0   1   0  00   -     SUB (extended register) - 32-bit variant on page C6-726
0   1   1  00   -     SUBS (extended register) - 32-bit variant on page C6-732
1   0   0  00   -     ADD (extended register) - 64-bit variant on page C6-437
1   0   1  00   -     ADDS (extended register) - 64-bit variant on page C6-443
1   1   0  00   -     SUB (extended register) - 64-bit variant on page C6-726
1   1   1  00   -     SUBS (extended register) - 64-bit variant on page C6-732





C4.5.2 Add/subtract (shifted register) 


This section describes the encoding of the Add/subtract (shifted register) instruction class. This section is decoded 
from Data processing - register on page C4-224. 


 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 ... 16 | 15 ... 10 | 9 ... 5 | 4 ... 0
 sf | op | S  |  0  1  0  1  1 | shift | 0  |    Rm     |   imm6    |   Rn    |   Rd

Decode fields
sf  op  S  shift  imm6    Instruction Page
-   -   -  11     -       Unallocated.
0   -   -  -      1xxxxx  Unallocated.
0   0   0  -      -       ADD (shifted register) - 32-bit variant on page C6-441
0   0   1  -      -       ADDS (shifted register) - 32-bit variant on page C6-447
0   1   0  -      -       SUB (shifted register) - 32-bit variant on page C6-730
0   1   1  -      -       SUBS (shifted register) - 32-bit variant on page C6-736
1   0   0  -      -       ADD (shifted register) - 64-bit variant on page C6-441
1   0   1  -      -       ADDS (shifted register) - 64-bit variant on page C6-447
1   1   0  -      -       SUB (shifted register) - 64-bit variant on page C6-730
1   1   1  -      -       SUBS (shifted register) - 64-bit variant on page C6-736





C4.5.3 Add/subtract (with carry) 


This section describes the encoding of the Add/subtract (with carry) instruction class. This section is decoded from 
Data processing - register on page C4-224. 


 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20 ... 16 | 15 ... 10 | 9 ... 5 | 4 ... 0
 sf | op | S  |  1  1  0  1  0  0  0  0 |    Rm     |  opcode2  |   Rn    |   Rd

Decode fields
sf  op  S  opcode2  Instruction Page
-   -   -  xxxxx1   Unallocated.
-   -   -  xxxx1x   Unallocated.
-   -   -  xxx1xx   Unallocated.
-   -   -  xx1xxx   Unallocated.
-   -   -  x1xxxx   Unallocated.
-   -   -  1xxxxx   Unallocated.
0   0   0  000000   ADC - 32-bit variant on page C6-435
0   0   1  000000   ADCS - 32-bit variant on page C6-436
0   1   0  000000   SBC - 32-bit variant on page C6-663
0   1   1  000000   SBCS - 32-bit variant on page C6-665
1   0   0  000000   ADC - 64-bit variant on page C6-435
1   0   1  000000   ADCS - 64-bit variant on page C6-436
1   1   0  000000   SBC - 64-bit variant on page C6-663
1   1   1  000000   SBCS - 64-bit variant on page C6-665








C4.5.4 Conditional compare (immediate) 
This section describes the encoding of the Conditional compare (immediate) instruction class. This section is
decoded from Data processing - register on page C4-224.

 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20 ... 16 | 15 ... 12 | 11 | 10 | 9 ... 5 | 4  | 3 ... 0
 sf | op | S  |  1  1  0  1  0  0  1  0 |   imm5    |   cond    | 1  | o2 |   Rn    | o3 |  nzcv

Decode fields
sf  op  S  o2  o3  Instruction Page
-   -   -  -   1   Unallocated.
-   -   -  1   -   Unallocated.
-   -   0  -   -   Unallocated.
0   0   1  0   0   CCMN (immediate) - 32-bit variant on page C6-478





0   1   1  0   0   CCMP (immediate) - 32-bit variant on page C6-480
1   0   1  0   0   CCMN (immediate) - 64-bit variant on page C6-478
1   1   1  0   0   CCMP (immediate) - 64-bit variant on page C6-480





C4.5.5 Conditional compare (register) 


This section describes the encoding of the Conditional compare (register) instruction class. This section is decoded 
from Data processing - register on page C4-224. 


 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20 ... 16 | 15 ... 12 | 11 | 10 | 9 ... 5 | 4  | 3 ... 0
 sf | op | S  |  1  1  0  1  0  0  1  0 |    Rm     |   cond    | 0  | o2 |   Rn    | o3 |  nzcv

Decode fields
sf  op  S  o2  o3  Instruction Page
-   -   -  -   1   Unallocated.
-   -   -  1   -   Unallocated.
-   -   0  -   -   Unallocated.
0   0   1  0   0   CCMN (register) - 32-bit variant on page C6-479
0   1   1  0   0   CCMP (register) - 32-bit variant on page C6-481
1   0   1  0   0   CCMN (register) - 64-bit variant on page C6-479
1   1   1  0   0   CCMP (register) - 64-bit variant on page C6-481





C4.5.6 Conditional select 


This section describes the encoding of the Conditional select instruction class. This section is decoded from Data 
processing - register on page C4-224. 


 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20 ... 16 | 15 ... 12 | 11 10 | 9 ... 5 | 4 ... 0
 sf | op | S  |  1  1  0  1  0  1  0  0 |    Rm     |   cond    |  op2  |   Rn    |   Rd

Decode fields
sf  op  S  op2  Instruction Page
-   -   -  1x   Unallocated.
-   -   1  -    Unallocated.
0   0   0  00   CSEL - 32-bit variant on page C6-502
0   0   0  01   CSINC - 32-bit variant on page C6-505





0   1   0  00   CSINV - 32-bit variant on page C6-507
0   1   0  01   CSNEG - 32-bit variant on page C6-509
1   0   0  00   CSEL - 64-bit variant on page C6-502
1   0   0  01   CSINC - 64-bit variant on page C6-505
1   1   0  00   CSINV - 64-bit variant on page C6-507
1   1   0  01   CSNEG - 64-bit variant on page C6-509
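
The Conditional select table reduces to a two-by-two choice on op and op2, with sf selecting the register width. A minimal sketch follows, using the hypothetical name `decode_conditional_select` and assuming fixed bits [28:21] = 11010100, S at bit 29, and op2 at bits [11:10].

```python
def decode_conditional_select(word: int):
    """Return the table entry as a string, or None if outside this class."""
    if (word >> 21) & 0xFF != 0b11010100:     # fixed bits [28:21]
        return None
    sf = (word >> 31) & 1
    op = (word >> 30) & 1
    s = (word >> 29) & 1
    op2 = (word >> 10) & 0b11
    # S = 1 and op2 = 1x are the two Unallocated rows.
    if s == 1 or op2 & 0b10:
        return "Unallocated"
    name = [["CSEL", "CSINC"], ["CSINV", "CSNEG"]][op][op2]
    return f"{name} - {64 if sf else 32}-bit variant"
```

For example, 0x9A820020 (an encoding of CSEL X0, X1, X2, EQ) maps to the 64-bit CSEL row under these assumptions.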





C4.5.7 Data-processing (1 source) 


This section describes the encoding of the Data-processing (1 source) instruction class. This section is decoded from 
Data processing - register on page C4-224. 


 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20 ... 16 | 15 ... 10 | 9 ... 5 | 4 ... 0
 sf | 1  | S  |  1  1  0  1  0  1  1  0 |  opcode2  |  opcode   |   Rn    |   Rd

Decode fields
sf  S  opcode2  opcode  Instruction Page
-   -  -        xx1xxx  Unallocated.
-   -  -        x1xxxx  Unallocated.
-   -  -        1xxxxx  Unallocated.
-   -  xxxx1    -       Unallocated.
-   -  xxx1x    -       Unallocated.
-   -  xx1xx    -       Unallocated.
-   -  x1xxx    -       Unallocated.
-   -  1xxxx    -       Unallocated.
-   0  00000    00011x  Unallocated.
-   1  -        -       Unallocated.
0   0  00000    000000  RBIT - 32-bit variant on page C6-652
0   0  00000    000001  REV16 - 32-bit variant on page C6-656
0   0  00000    000010  REV - 32-bit variant on page C6-654
0   0  00000    000011  Unallocated.
0   0  00000    000100  CLZ - 32-bit variant on page C6-486
0   0  00000    000101  CLS - 32-bit variant on page C6-485
1   0  00000    000000  RBIT - 64-bit variant on page C6-652

1 00000 000001 REV 16 - 64-bit variant on page C6-656 
1 00000 000010 REV32 

i 00000 000011 REV - 64-bit variant on page C6-654 

1 00000 000100 CLZ - 64-bit variant on page C6-486 

1 00000 000101 CLS - 64-bit variant on page C6-485 





C4.5.8 Data-processing (2 source) 


This section describes the encoding of the Data-processing (2 source) instruction class. This section is decoded from
Data processing - register on page C4-224.


| 31 | 30 | 29 | 28-21    | 20-16 | 15-10  | 9-5 | 4-0 |
| sf | 0  | S  | 11010110 | Rm    | opcode | Rn  | Rd  |

Decode fields       Instruction Page
sf  S   opcode
-   -   00000x      Unallocated.
-   -   011xxx      Unallocated.
-   -   1xxxxx      Unallocated.
-   0   0001xx      Unallocated.
-   0   0011xx      Unallocated.
-   1   -           Unallocated.
0   0   000010      UDIV - 32-bit variant on page C6-755
0   0   000011      SDIV - 32-bit variant on page C6-671
0   0   001000      LSLV - 32-bit variant on page C6-605
0   0   001001      LSRV - 32-bit variant on page C6-608
0   0   001010      ASRV - 32-bit variant on page C6-460
0   0   001011      RORV - 32-bit variant on page C6-662
0   0   010x11      Unallocated.
0   0   010000      CRC32B, CRC32H, CRC32W, CRC32X - CRC32B variant on page C6-498
0   0   010001      CRC32B, CRC32H, CRC32W, CRC32X - CRC32H variant on page C6-498
0   0   010010      CRC32B, CRC32H, CRC32W, CRC32X - CRC32W variant on page C6-498
0   0   010100      CRC32CB, CRC32CH, CRC32CW, CRC32CX - CRC32CB variant on page C6-500
0   0   010101      CRC32CB, CRC32CH, CRC32CW, CRC32CX - CRC32CH variant on page C6-500
0   0   010110      CRC32CB, CRC32CH, CRC32CW, CRC32CX - CRC32CW variant on page C6-500
1   0   000010      UDIV - 64-bit variant on page C6-755
1   0   000011      SDIV - 64-bit variant on page C6-671
1   0   001000      LSLV - 64-bit variant on page C6-605
1   0   001001      LSRV - 64-bit variant on page C6-608
1   0   001010      ASRV - 64-bit variant on page C6-460
1   0   001011      RORV - 64-bit variant on page C6-662
1   0   010xx0      Unallocated.
1   0   010x0x      Unallocated.
1   0   010011      CRC32B, CRC32H, CRC32W, CRC32X - CRC32X variant on page C6-498
1   0   010111      CRC32CB, CRC32CH, CRC32CW, CRC32CX - CRC32CX variant on page C6-500
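Because every allocated row in this class fixes sf and all six opcode bits, the table can be expressed as a dictionary lookup with "Unallocated" as the default. The following is a minimal sketch under that assumption; the names `_TWO_SOURCE` and `decode_two_source` are hypothetical and the mnemonic strings are shortened for illustration.

```python
# (sf, opcode) -> mnemonic, transcribed from the allocated rows of the table.
_TWO_SOURCE = {
    (0, 0b000010): "UDIV (32-bit)",   (1, 0b000010): "UDIV (64-bit)",
    (0, 0b000011): "SDIV (32-bit)",   (1, 0b000011): "SDIV (64-bit)",
    (0, 0b001000): "LSLV (32-bit)",   (1, 0b001000): "LSLV (64-bit)",
    (0, 0b001001): "LSRV (32-bit)",   (1, 0b001001): "LSRV (64-bit)",
    (0, 0b001010): "ASRV (32-bit)",   (1, 0b001010): "ASRV (64-bit)",
    (0, 0b001011): "RORV (32-bit)",   (1, 0b001011): "RORV (64-bit)",
    (0, 0b010000): "CRC32B",          (0, 0b010001): "CRC32H",
    (0, 0b010010): "CRC32W",          (1, 0b010011): "CRC32X",
    (0, 0b010100): "CRC32CB",         (0, 0b010101): "CRC32CH",
    (0, 0b010110): "CRC32CW",         (1, 0b010111): "CRC32CX",
}


def decode_two_source(word: int) -> str:
    """Classify a word from the Data-processing (2 source) class."""
    # Fixed bits: bit 30 is 0 and bits 28:21 are 11010110.
    if (word >> 30) & 1 != 0 or (word >> 21) & 0xFF != 0b11010110:
        raise ValueError("word is not in the Data-processing (2 source) class")
    sf = (word >> 31) & 1
    s = (word >> 29) & 1
    opcode = (word >> 10) & 0x3F
    if s == 1:  # row "- 1 -" of the table
        return "Unallocated"
    # Any (sf, opcode) pair without an allocated row falls through.
    return _TWO_SOURCE.get((sf, opcode), "Unallocated")
```

Note that the X-register CRC variants exist only with sf set to 1, so the same opcode with sf set to 0 falls through to "Unallocated".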















































C4.5.9 Data-processing (3 source) 
This section describes the encoding of the Data-processing (3 source) instruction class. This section is decoded from 
Data processing - register on page C4-224. 
| 31 | 30-29 | 28-24 | 23-21 | 20-16 | 15 | 14-10 | 9-5 | 4-0 |
| sf | op54  | 11011 | op31  | Rm    | o0 | Ra    | Rn  | Rd  |

Decode fields           Instruction Page
sf  op54  op31  o0
-   00    010   1       Unallocated.
-   00    011   -       Unallocated.
-   00    100   -       Unallocated.
-   00    110   1       Unallocated.
-   00    111   -       Unallocated.
-   01    -     -       Unallocated.
-   1x    -     -       Unallocated.
0   00    000   0       MADD - 32-bit variant on page C6-609
0   00    000   1       MSUB - 32-bit variant on page C6-626
0   00    001   0       Unallocated.
0   00    001   1       Unallocated.
0   00    010   0       Unallocated.
0   00    101   0       Unallocated.
0   00    101   1       Unallocated.
0   00    110   0       Unallocated.
1   00    000   0       MADD - 64-bit variant on page C6-609
1   00    000   1       MSUB - 64-bit variant on page C6-626
1   00    001   0       SMADDL
1   00    001   1       SMSUBL
1   00    010   0       SMULH
1   00    101   0       UMADDL
1   00    101   1       UMSUBL
1   00    110   0       UMULH


C4.5.10 Logical (shifted register)


This section describes the encoding of the Logical (shifted register) instruction class. This section is decoded from
Data processing - register on page C4-224.


| 31 | 30-29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | opc   | 01010 | shift | N  | Rm    | imm6  | Rn  | Rd  |

Decode fields           Instruction Page
sf  opc  N   imm6
0   -    -   1xxxxx     Unallocated.
0   00   0   -          AND (shifted register) - 32-bit variant on page C6-452
0   00   1   -          BIC (shifted register) - 32-bit variant on page C6-468
0   01   0   -          ORR (shifted register) - 32-bit variant on page C6-642
0   01   1   -          ORN (shifted register) - 32-bit variant on page C6-638
0   10   0   -          EOR (shifted register) - 32-bit variant on page C6-523
0   10   1   -          EON (shifted register) - 32-bit variant on page C6-520
0   11   0   -          ANDS (shifted register) - 32-bit variant on page C6-456
0   11   1   -          BICS (shifted register) - 32-bit variant on page C6-470
1   00   0   -          AND (shifted register) - 64-bit variant on page C6-452
1   00   1   -          BIC (shifted register) - 64-bit variant on page C6-468
1   01   0   -          ORR (shifted register) - 64-bit variant on page C6-642
1   01   1   -          ORN (shifted register) - 64-bit variant on page C6-638
1   10   0   -          EOR (shifted register) - 64-bit variant on page C6-523
1   10   1   -          EON (shifted register) - 64-bit variant on page C6-520
1   11   0   -          ANDS (shifted register) - 64-bit variant on page C6-456
1   11   1   -          BICS (shifted register) - 64-bit variant on page C6-470
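The table has a regular structure: opc selects the base operation, N selects the variant that inverts the second operand, and the only unallocated encodings are 32-bit forms with imm6 bit 5 set. A minimal sketch of that decode follows; `decode_logical_shifted` is a hypothetical name, and the field positions are assumed from the encoding diagram sf | opc | 01010 | shift | N | Rm | imm6 | Rn | Rd.

```python
# (opc, N) -> mnemonic, transcribed from the decode table.
_LOGICAL = {
    (0b00, 0): "AND",  (0b00, 1): "BIC",
    (0b01, 0): "ORR",  (0b01, 1): "ORN",
    (0b10, 0): "EOR",  (0b10, 1): "EON",
    (0b11, 0): "ANDS", (0b11, 1): "BICS",
}


def decode_logical_shifted(word: int) -> str:
    """Classify a word from the Logical (shifted register) class."""
    if (word >> 24) & 0x1F != 0b01010:
        raise ValueError("word is not in the Logical (shifted register) class")
    sf = (word >> 31) & 1
    opc = (word >> 29) & 0b11
    n = (word >> 21) & 1
    imm6 = (word >> 10) & 0x3F
    # Row "0 - - 1xxxxx": a shift amount of 32 or more with a 32-bit register.
    if sf == 0 and imm6 & 0b100000:
        return "Unallocated"
    width = "64" if sf else "32"
    return f"{_LOGICAL[(opc, n)]} (shifted register) - {width}-bit variant"
```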
C4.6 Data processing - SIMD and floating point 


This section describes the encoding of the Data processing (SIMD and floating-point) instruction group. This 
section is decoded from A64 instruction index by encoding on page C4-192. For additional information on this 
functional group of instructions, see Data processing - SIMD and floating-point on page C3-171. 


| 31-28 | 27-25 | 24-23 | 22-19 | 18-17 | 16 | 15-10 | 9-0 |
| op0   | 111   | op1   | op2   | op3   |    | op4   |     |


Table C4-6 Encoding table for the Data Processing -- Scalar Floating-Point and Advanced SIMD group

Decode fields                           Decode group or instruction page
op0   op1  op2      op3  op4
0000  0x   x101     00   xxxx10         Unallocated.
0010  0x   x101     00   xxxx10         Unallocated.
0100  0x   x101     00   xxxx10         Cryptographic AES on page C4-257
0101  0x   x0xx     -    0xxx00         Cryptographic three-register SHA on page C4-258
0101  0x   x0xx     -    0xxx10         Unallocated.
0101  0x   x101     00   xxxx10         Cryptographic two-register SHA on page C4-259
0110  0x   x101     00   xxxx10         Unallocated.
0111  0x   x0xx     -    0xxxx0         Unallocated.
0111  0x   x101     00   xxxx10         Unallocated.
01x1  00   00xx     -    0xxxx1         Advanced SIMD scalar copy on page C4-238
01x1  01   x0xx     -    0xxxx1         Unallocated.
01x1  0x   0111     00   xxxx10         Unallocated.
01x1  0x   x100     00   xxxx10         Advanced SIMD scalar two-register miscellaneous on page C4-244
01x1  0x   x110     00   xxxx10         Advanced SIMD scalar pairwise on page C4-238
01x1  0x   x1xx     1x   xxxx10         Unallocated.
01x1  0x   x1xx     x1   xxxx10         Unallocated.
01x1  0x   x1xx     -    xxxx00         Advanced SIMD scalar three different on page C4-241
01x1  0x   x1xx     -    xxxxx1         Advanced SIMD scalar three same on page C4-241
01x1  10   -        -    xxxxx1         Advanced SIMD scalar shift by immediate on page C4-239
01x1  11   -        -    xxxxx1         Unallocated.
01x1  1x   -        -    xxxxx0         Advanced SIMD scalar x indexed element on page C4-246
0x00  0x   x0xx     -    0xxx00         Advanced SIMD table lookup on page C4-248
0x00  0x   x0xx     -    0xxx10         Advanced SIMD permute on page C4-237
0x10  0x   x0xx     -    0xxxx0         Advanced SIMD extract on page C4-236
0xx0  00   00xx     -    0xxxx1         Advanced SIMD copy on page C4-235
0xx0  01   x0xx     -    0xxxx1         Unallocated.
0xx0  0x   0111     00   xxxx10         Unallocated.
0xx0  0x   10xx     -    01xxx1         Unallocated.
0xx0  0x   x0xx     -    1xxxxx         Unallocated.
0xx0  0x   x100     00   xxxx10         Advanced SIMD two-register miscellaneous on page C4-253
0xx0  0x   x110     00   xxxx10         Advanced SIMD across lanes
0xx0  0x   x1xx     1x   xxxx10         Unallocated.
0xx0  0x   x1xx     x1   xxxx10         Unallocated.
0xx0  0x   x1xx     -    xxxx00         Advanced SIMD three different on page C4-249
0xx0  0x   x1xx     -    xxxxx1         Advanced SIMD three same on page C4-250
0xx0  10   0000     -    xxxxx1         Advanced SIMD modified immediate on page C4-237
0xx0  10   != 0000  -    xxxxx1         Advanced SIMD shift by immediate on page C4-247
0xx0  11   -        -    xxxxx1         Unallocated.
0xx0  1x   -        -    xxxxx0         Advanced SIMD vector x indexed element on page C4-256
11x1  -    -        -    -              Unallocated.
1xx0  -    -        -    -              Unallocated.
x0x1  0x   x0xx     -    -              Conversion between floating-point and fixed-point on page C4-265
x0x1  0x   x1xx     -    000000         Conversion between floating-point and integer on page C4-266
x0x1  0x   x1xx     -    100000         Unallocated.
x0x1  0x   x1xx     -    x10000         Floating-point data-processing (1 source) on page C4-261
x0x1  0x   x1xx     -    xx1000         Floating-point compare on page C4-259
x0x1  0x   x1xx     -    xxx100         Floating-point immediate on page C4-264
x0x1  0x   x1xx     -    xxxx01         Floating-point conditional compare on page C4-260
x0x1  0x   x1xx     -    xxxx10         Floating-point data-processing (2 source) on page C4-262
x0x1  0x   x1xx     -    xxxx11         Floating-point conditional select on page C4-260
x0x1  1x   -        -    -              Floating-point data-processing (3 source) on page C4-263
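The table's pattern notation lends itself to a generic matcher: "x" is a don't-care bit, "-" matches any value, and "!= bits" matches every value except the given one. The sketch below shows one way this could be implemented. It is an illustration, not the manual's decode pseudocode: `field_matches`, `classify`, and `ROWS` are hypothetical names, `ROWS` holds only a small subset of Table C4-6, and the field positions (op0 at bits 31:28, op1 at 24:23, op2 at 22:19, op3 at 18:17, op4 at 15:10) are assumptions read off the encoding diagram.

```python
def field_matches(pattern: str, value: int) -> bool:
    """Match one decode field against a table pattern such as 'x0x1'."""
    pattern = pattern.strip()
    if pattern == "-":                      # field is ignored
        return True
    if pattern.startswith("!="):            # negated pattern, e.g. '!= 0000'
        return not field_matches(pattern[2:], value)
    width = len(pattern)
    for i, ch in enumerate(pattern):        # leftmost character is the MSB
        if ch == "x":
            continue
        if int(ch) != (value >> (width - 1 - i)) & 1:
            return False
    return True


# A small subset of Table C4-6: (op0, op1, op2, op3, op4, decode group).
ROWS = [
    ("0xx0", "0x", "x1xx", "-", "xxxx00", "Advanced SIMD three different"),
    ("0xx0", "0x", "x1xx", "-", "xxxxx1", "Advanced SIMD three same"),
    ("0xx0", "10", "0000", "-", "xxxxx1", "Advanced SIMD modified immediate"),
    ("0xx0", "10", "!= 0000", "-", "xxxxx1", "Advanced SIMD shift by immediate"),
    ("x0x1", "0x", "x1xx", "-", "xxxx10", "Floating-point data-processing (2 source)"),
    ("x0x1", "1x", "-", "-", "-", "Floating-point data-processing (3 source)"),
]


def classify(word: int):
    """Return the decode group for a word, or None if no row in ROWS matches."""
    if (word >> 25) & 0b111 != 0b111:
        raise ValueError("word is not in the SIMD and floating-point group")
    fields = (
        (word >> 28) & 0xF,    # op0
        (word >> 23) & 0b11,   # op1
        (word >> 19) & 0xF,    # op2
        (word >> 17) & 0b11,   # op3
        (word >> 10) & 0x3F,   # op4
    )
    for *patterns, name in ROWS:
        if all(field_matches(p, v) for p, v in zip(patterns, fields)):
            return name
    return None
```

A table-driven matcher keeps the rows in the same textual form as the manual, which makes transcription errors easier to spot than hand-written masks would.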








C4.6.1 Advanced SIMD across lanes 


This section describes the encoding of the Advanced SIMD across lanes instruction class. This section is decoded 
from Data processing - SIMD and floating point on page C4-233. 
| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12  | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U  | 01110 | size  | 11000 | opcode | 10    | Rn  | Rd  |

Decode fields           Instruction Page
U   size  opcode
-   -     0000x         Unallocated.
-   -     00010         Unallocated.
-   -     001xx         Unallocated.
-   -     0100x         Unallocated.
-   -     01011         Unallocated.
-   -     01101         Unallocated.
-   -     01110         Unallocated.
-   -     10xxx         Unallocated.
-   -     1100x         Unallocated.
-   -     111xx         Unallocated.
0   -     00011         SADDLV
0   -     01010         SMAXV
0   -     11010         SMINV
0   -     11011         ADDV
1   -     00011         UADDLV
1   -     01010         UMAXV
1   -     11010         UMINV
1   -     11011         Unallocated.
1   0x    01100         FMAXNMV
1   0x    01111         FMAXV
1   1x    01100         FMINNMV
1   1x    01111         FMINV








C4.6.2 Advanced SIMD copy 


This section describes the encoding of the Advanced SIMD copy instruction class. This section is decoded from
Data processing - SIMD and floating point on page C4-233.


| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | op | 01110000 | imm5  | 0  | imm4  | 1  | Rn  | Rd  |

Decode fields           Instruction Page
Q   op  imm5   imm4
-   -   x0000  -        Unallocated.
-   0   -      0000     DUP (element)
-   0   -      0001     DUP (general)
-   0   -      0010     Unallocated.
-   0   -      0100     Unallocated.
-   0   -      0110     Unallocated.
-   0   -      1xxx     Unallocated.
0   0   -      0011     Unallocated.
0   0   -      0101     SMOV
0   0   -      0111     UMOV
0   1   -      -        Unallocated.
1   0   -      0011     INS (general)
1   0   -      0101     SMOV
1   0   x1000  0111     UMOV
1   1   -      -        INS (element)





C4.6.3 Advanced SIMD extract 


This section describes the encoding of the Advanced SIMD extract instruction class. This section is decoded from
Data processing - SIMD and floating point on page C4-233.


| 31 | 30 | 29-24  | 23-22 | 21 | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 101110 | op2   | 0  | Rm    | 0  | imm4  | 0  | Rn  | Rd  |

Decode fields   Instruction Page
op2
x1              Unallocated.
00              EXT
1x              Unallocated.
C4.6.4 Advanced SIMD modified immediate


This section describes the encoding of the Advanced SIMD modified immediate instruction class. This section is
decoded from Data processing - SIMD and floating point on page C4-233.


| 31 | 30 | 29 | 28-19      | 18 | 17 | 16 | 15-12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4-0 |
| 0  | Q  | op | 0111100000 | a  | b  | c  | cmode | 0  | 1  | d | e | f | g | h | Rd  |

Decode fields       Instruction Page
Q   op  cmode
                    Unallocated.
-   0   0xx0        MOVI - 32-bit shifted immediate variant on page C7-1120
-   0   0xx1        ORR (vector, immediate) - 32-bit variant on page C7-1134
-   0   10x0        MOVI - 16-bit shifted immediate variant on page C7-1120
-   0   10x1        ORR (vector, immediate) - 16-bit variant on page C7-1134
-   0   110x        MOVI - 32-bit shifting ones variant on page C7-1120
-   0   1110        MOVI - 8-bit variant on page C7-1120
-   0   1111        FMOV (vector, immediate) - Single-precision variant on page C7-973
-   1   0xx0        MVNI - 32-bit shifted immediate variant on page C7-1128
-   1   0xx1        BIC (vector, immediate) - 32-bit variant on page C7-787
-   1   10x0        MVNI - 16-bit shifted immediate variant on page C7-1128
-   1   10x1        BIC (vector, immediate) - 16-bit variant on page C7-787
-   1   110x        MVNI - 32-bit shifting ones variant on page C7-1128
0   1   1110        MOVI - 64-bit scalar variant on page C7-1120
0   1   1111        Unallocated.
1   1   1110        MOVI - 64-bit vector variant on page C7-1120
1   1   1111        FMOV (vector, immediate) - Double-precision variant on page C7-973
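The cmode patterns in this class nest in a way that is easier to see in code than in the table: 1111 and 1110 are special cases that also depend on Q, 110x is the shifting-ones form, and in the remaining values bit 3 selects 16-bit versus 32-bit while bit 0 selects the logical (ORR/BIC) forms. The sketch below illustrates this under the assumption that Q is bit 30, op is bit 29, and cmode is bits 15:12; `decode_modified_immediate` is a hypothetical name.

```python
def decode_modified_immediate(word: int) -> str:
    """Classify a word from the Advanced SIMD modified immediate class."""
    fixed_ok = (
        (word >> 31) & 1 == 0
        and (word >> 19) & 0x3FF == 0b0111100000
        and (word >> 10) & 0b11 == 0b01
    )
    if not fixed_ok:
        raise ValueError("word is not in the modified immediate class")
    q = (word >> 30) & 1
    op = (word >> 29) & 1
    cmode = (word >> 12) & 0xF

    if cmode == 0b1111:
        if op == 0:
            return "FMOV (vector, immediate) - Single-precision variant"
        return ("FMOV (vector, immediate) - Double-precision variant"
                if q else "Unallocated")
    if cmode == 0b1110:
        if op == 0:
            return "MOVI - 8-bit variant"
        return "MOVI - 64-bit vector variant" if q else "MOVI - 64-bit scalar variant"
    if cmode & 0b1110 == 0b1100:            # 110x
        return ("MVNI" if op else "MOVI") + " - 32-bit shifting ones variant"
    # Remaining values are 0xxx (32-bit) and 10xx (16-bit).
    width = "16-bit" if cmode & 0b1000 else "32-bit"
    if cmode & 1:                           # 0xx1 or 10x1
        return ("BIC" if op else "ORR") + f" (vector, immediate) - {width} variant"
    return ("MVNI" if op else "MOVI") + f" - {width} shifted immediate variant"
```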





C4.6.5 Advanced SIMD permute 


This section describes the encoding of the Advanced SIMD permute instruction class. This section is decoded from 
Data processing - SIMD and floating point on page C4-233. 
| 31 | 30 | 29-24  | 23-22 | 21 | 20-16 | 15 | 14-12  | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 001110 | size  | 0  | Rm    | 0  | opcode | 10    | Rn  | Rd  |

Decode fields   Instruction Page
opcode
000             Unallocated.
001             UZP1
010             TRN1
011             ZIP1
100             Unallocated.
101             UZP2
110             TRN2
111             ZIP2


C4.6.6 Advanced SIMD scalar copy


This section describes the encoding of the Advanced SIMD scalar copy instruction class. This section is decoded
from Data processing - SIMD and floating point on page C4-233.


| 31-30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 01    | op | 11110000 | imm5  | 0  | imm4  | 1  | Rn  | Rd  |

Decode fields       Instruction Page
op  imm5   imm4
0   -      xxx1     Unallocated.
0   -      xx1x     Unallocated.
0   -      x1xx     Unallocated.
0   -      0000     DUP (element)
0   -      1xxx     Unallocated.
0   x0000  0000     Unallocated.
1   -      -        Unallocated.





C4.6.7 Advanced SIMD scalar pairwise 


This section describes the encoding of the Advanced SIMD scalar pairwise instruction class. This section is decoded
from Data processing - SIMD and floating point on page C4-233.


| 31-30 | 29 | 28-24 | 23-22 | 21-17 | 16-12  | 11-10 | 9-5 | 4-0 |
| 01    | U  | 11110 | size  | 11000 | opcode | 10    | Rn  | Rd  |

Decode fields           Instruction Page
U   size  opcode
-   -     00xxx         Unallocated.
-   -     010xx         Unallocated.
-   -     01110         Unallocated.
-   -     10xxx         Unallocated.
-   -     1100x         Unallocated.
-   -     11010         Unallocated.
-   -     111xx         Unallocated.
-   1x    01101         Unallocated.
0   -     11011         ADDP (scalar)
1   -     11011         Unallocated.
1   0x    01100         FMAXNMP (scalar)
1   0x    01101         FADDP (scalar)
1   0x    01111         FMAXP (scalar)
1   1x    01100         FMINNMP (scalar)
1   1x    01111         FMINP (scalar)





C4.6.8 Advanced SIMD scalar shift by immediate 


This section describes the encoding of the Advanced SIMD scalar shift by immediate instruction class. This section 
is decoded from Data processing - SIMD and floating point on page C4-233. 


| 31-30 | 29 | 28-23  | 22-19 | 18-16 | 15-11  | 10 | 9-5 | 4-0 |
| 01    | U  | 111110 | immh  | immb  | opcode | 1  | Rn  | Rd  |

Decode fields               Instruction Page
U   immh     opcode
-   != 0000  00001          Unallocated.
-   != 0000  00011          Unallocated.
-   != 0000  00101          Unallocated.
-   != 0000  00111          Unallocated.
-   != 0000  01001          Unallocated.
-   != 0000  01011          Unallocated.
-   != 0000  01101          Unallocated.
-   != 0000  01111          Unallocated.
-   != 0000  101xx          Unallocated.
-   != 0000  11001          Unallocated.
-   != 0000  11010          Unallocated.
-   != 0000  11101          Unallocated.
-   != 0000  11110          Unallocated.
-   0000     -              Unallocated.
0   != 0000  00000          SSHR
0   != 0000  00010          SSRA
0   != 0000  00100          SRSHR
0   != 0000  00110          SRSRA
0   != 0000  01000          Unallocated.
0   != 0000  01010          SHL
0   != 0000  01100          Unallocated.
0   != 0000  01110          SQSHL (immediate)
0   != 0000  10000          Unallocated.
0   != 0000  10001          Unallocated.
0   != 0000  10010          SQSHRN, SQSHRN2
0   != 0000  10011          SQRSHRN, SQRSHRN2
0   != 0000  11100          SCVTF (vector, fixed-point)
0   != 0000  11111          FCVTZS (vector, fixed-point)
1   != 0000  00000          USHR
1   != 0000  00010          USRA
1   != 0000  00100          URSHR
1   != 0000  00110          URSRA
1   != 0000  01000          SRI
1   != 0000  01010          SLI
1   != 0000  01100          SQSHLU
1   != 0000  01110          UQSHL (immediate)
1   != 0000  10000          SQSHRUN, SQSHRUN2
1   != 0000  10001          SQRSHRUN, SQRSHRUN2
1   != 0000  10010          UQSHRN, UQSHRN2
1   != 0000  10011          UQRSHRN, UQRSHRN2
1   != 0000  11100          UCVTF (vector, fixed-point)
1   != 0000  11111          FCVTZU (vector, fixed-point)





C4.6.9 Advanced SIMD scalar three different 


This section describes the encoding of the Advanced SIMD scalar three different instruction class. This section is 
decoded from Data processing - SIMD and floating point on page C4-233. 


| 31-30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-12  | 11-10 | 9-5 | 4-0 |
| 01    | U  | 11110 | size  | 1  | Rm    | opcode | 00    | Rn  | Rd  |

Decode fields   Instruction Page
U   opcode
-   00xx        Unallocated.
-   01xx        Unallocated.
-   1000        Unallocated.
-   1010        Unallocated.
-   1100        Unallocated.
-   111x        Unallocated.
0   1001        SQDMLAL, SQDMLAL2 (vector)
0   1011        SQDMLSL, SQDMLSL2 (vector)
0   1101        SQDMULL, SQDMULL2 (vector)
1   1001        Unallocated.
1   1011        Unallocated.
1   1101        Unallocated.








C4.6.10 Advanced SIMD scalar three same 


This section describes the encoding of the Advanced SIMD scalar three same instruction class. This section is 
decoded from Data processing - SIMD and floating point on page C4-233. 
| 31-30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-11  | 10 | 9-5 | 4-0 |
| 01    | U  | 11110 | size  | 1  | Rm    | opcode | 1  | Rn  | Rd  |

Decode fields           Instruction Page
U   size  opcode
-   -     00000         Unallocated.
-   -     0001x         Unallocated.
-   -     00100         Unallocated.
-   -     011xx         Unallocated.
-   -     1001x         Unallocated.
-   1x    11011         Unallocated.
0   -     00001         SQADD
0   -     00101         SQSUB
0   -     00110         CMGT (register)
0   -     00111         CMGE (register)
0   -     01000         SSHL
0   -     01001         SQSHL (register)
0   -     01010         SRSHL
0   -     01011         SQRSHL
0   -     10000         ADD (vector)
0   -     10001         CMTST
0   -     10100         Unallocated.
0   -     10101         Unallocated.
0   -     10110         SQDMULH (vector)
0   -     10111         Unallocated.
0   0x    11000         Unallocated.
0   0x    11001         Unallocated.
0   0x    11010         Unallocated.
0   0x    11011         FMULX
0   0x    11100         FCMEQ (register)
0   0x    11101         Unallocated.
0   0x    11110         Unallocated.
0   0x    11111         FRECPS
0   1x    11000         Unallocated.
0   1x    11001         Unallocated.
0   1x    11010         Unallocated.
0   1x    11100         Unallocated.
0   1x    11101         Unallocated.
0   1x    11110         Unallocated.
0   1x    11111         FRSQRTS
1   -     00001         UQADD
1   -     00101         UQSUB
1   -     00110         CMHI (register)
1   -     00111         CMHS (register)
1   -     01000         USHL
1   -     01001         UQSHL (register)
1   -     01010         URSHL
1   -     01011         UQRSHL
1   -     10000         SUB (vector)
1   -     10001         CMEQ (register)
1   -     10100         Unallocated.
1   -     10101         Unallocated.
1   -     10110         SQRDMULH (vector)
1   -     10111         Unallocated.
1   0x    11000         Unallocated.
1   0x    11001         Unallocated.
1   0x    11010         Unallocated.
1   0x    11011         Unallocated.
1   0x    11100         FCMGE (register)
1   0x    11101         FACGE
1   0x    11110         Unallocated.
1   0x    11111         Unallocated.
1   1x    11000         Unallocated.
1   1x    11001         Unallocated.
1   1x    11010         FABD
1   1x    11100         FCMGT (register)
1   1x    11101         FACGT
1   1x    11110         Unallocated.
1   1x    11111         Unallocated.





C4.6.11 Advanced SIMD scalar two-register miscellaneous 


This section describes the encoding of the Advanced SIMD scalar two-register miscellaneous instruction class. This 
section is decoded from Data processing - SIMD and floating point on page C4-233. 


| 31-30 | 29 | 28-24 | 23-22 | 21-17 | 16-12  | 11-10 | 9-5 | 4-0 |
| 01    | U  | 11110 | size  | 10000 | opcode | 10    | Rn  | Rd  |

Decode fields           Instruction Page
U   size  opcode
-   -     0000x         Unallocated.
-   -     00010         Unallocated.
-   -     0010x         Unallocated.
-   -     00110         Unallocated.
-   -     01111         Unallocated.
-   -     1000x         Unallocated.
-   -     10011         Unallocated.
-   -     10101         Unallocated.
-   -     10111         Unallocated.
-   -     1100x         Unallocated.
-   -     11110         Unallocated.
-   0x    011xx         Unallocated.
-   0x    11111         Unallocated.
-   1x    10110         Unallocated.
-   1x    11100         Unallocated.
0   -     00011         SUQADD
0   -     00111         SQABS
0   -     01000         CMGT (zero)
0   -     01001         CMEQ (zero)
0   -     01010         CMLT (zero)
0   -     01011         ABS
0   -     10010         Unallocated.
0   -     10100         SQXTN, SQXTN2
0   0x    10110         Unallocated.
0   0x    11010         FCVTNS (vector)
0   0x    11011         FCVTMS (vector)
0   0x    11100         FCVTAS (vector)
0   0x    11101         SCVTF (vector, integer)
0   1x    01100         FCMGT (zero)
0   1x    01101         FCMEQ (zero)
0   1x    01110         FCMLT (zero)
0   1x    11010         FCVTPS (vector)
0   1x    11011         FCVTZS (vector, integer)
0   1x    11101         FRECPE
0   1x    11111         FRECPX
1   -     00011         USQADD
1   -     00111         SQNEG
1   -     01000         CMGE (zero)
1   -     01001         CMLE (zero)
1   -     01010         Unallocated.
1   -     01011         NEG (vector)
1   -     10010         SQXTUN, SQXTUN2
1   -     10100         UQXTN, UQXTN2
1   0x    10110         FCVTXN, FCVTXN2
1   0x    11010         FCVTNU (vector)
1   0x    11011         FCVTMU (vector)
1   0x    11100         FCVTAU (vector)
1   0x    11101         UCVTF (vector, integer)
1   1x    01100         FCMGE (zero)
1   1x    01101         FCMLE (zero)
1   1x    01110         Unallocated.
1   1x    11010         FCVTPU (vector)
1   1x    11011         FCVTZU (vector, integer)
1   1x    11101         FRSQRTE
1   1x    11111         Unallocated.










































































C4.6.12 Advanced SIMD scalar x indexed element 
This section describes the encoding of the Advanced SIMD scalar x indexed element instruction class. This section 
is decoded from Data processing - SIMD and floating point on page C4-233. 

| 31-30 | 29 | 28-24 | 23-22 | 21 | 20 | 19-16 | 15-12  | 11 | 10 | 9-5 | 4-0 |
| 01    | U  | 11111 | size  | L  | M  | Rm    | opcode | H  | 0  | Rn  | Rd  |

Decode fields           Instruction Page
U   size  opcode
-   -     0000          Unallocated.
-   -     0010          Unallocated.
-   -     0100          Unallocated.
-   -     0110          Unallocated.
-   -     1000          Unallocated.
-   -     1010          Unallocated.
-   -     111x          Unallocated.
0   -     0011          SQDMLAL, SQDMLAL2 (by element)
0   -     0111          SQDMLSL, SQDMLSL2 (by element)
0   -     1011          SQDMULL, SQDMULL2 (by element)
0   -     1100          SQDMULH (by element)
0   -     1101          SQRDMULH (by element)
0   1x    0001          FMLA (by element)
0   1x    0101          FMLS (by element)
0   1x    1001          FMUL (by element)
1   -     0011          Unallocated.
1   -     0111          Unallocated.
1   -     1011          Unallocated.
1   -     110x          Unallocated.
1   1x    0001          Unallocated.
1   1x    0101          Unallocated.
1   1x    1001          FMULX (by element)





C4.6.13 Advanced SIMD shift by immediate 


This section describes the encoding of the Advanced SIMD shift by immediate instruction class. This section is 
decoded from Data processing - SIMD and floating point on page C4-233. 


| 31 | 30 | 29 | 28-23  | 22-19           | 18-16 | 15-11  | 10 | 9-5 | 4-0 |
| 0  | Q  | U  | 011110 | immh (!= 0000)  | immb  | opcode | 1  | Rn  | Rd  |

Decode fields   Instruction Page
U   opcode
-   00001       Unallocated.
-   00011       Unallocated.
-   00101       Unallocated.
-   00111       Unallocated.
-   01001       Unallocated.
-   01011       Unallocated.
-   01101       Unallocated.
-   01111       Unallocated.
-   10101       Unallocated.
-   1011x       Unallocated.
-   11101       Unallocated.
-   11110       Unallocated.
0   00000       SSHR
0   00010       SSRA
0   00100       SRSHR
0   00110       SRSRA
0   01000       Unallocated.
0   01010       SHL
0   01100       Unallocated.
0   01110       SQSHL (immediate)
0   10000       SHRN, SHRN2
0   10001       RSHRN, RSHRN2
0   10010       SQSHRN, SQSHRN2
0   10011       SQRSHRN, SQRSHRN2
0   10100       SSHLL, SSHLL2
0   11100       SCVTF (vector, fixed-point)
0   11111       FCVTZS (vector, fixed-point)
1   00000       USHR
1   00010       USRA
1   00100       URSHR
1   00110       URSRA
1   01000       SRI
1   01010       SLI
1   01100       SQSHLU
1   01110       UQSHL (immediate)
1   10000       SQSHRUN, SQSHRUN2
1   10001       SQRSHRUN, SQRSHRUN2
1   10010       UQSHRN, UQSHRN2
1   10011       UQRSHRN, UQRSHRN2
1   10100       USHLL, USHLL2
1   11100       UCVTF (vector, fixed-point)
1   11111       FCVTZU (vector, fixed-point)


C4.6.14 Advanced SIMD table lookup


This section describes the encoding of the Advanced SIMD table lookup instruction class. This section is decoded
from Data processing - SIMD and floating point on page C4-233.
| 31 | 30 | 29-24  | 23-22 | 21 | 20-16 | 15 | 14-13 | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 001110 | op2   | 0  | Rm    | 0  | len   | op | 00    | Rn  | Rd  |

Decode fields       Instruction Page
op2  len  op
x1   -    -         Unallocated.
00   00   0         TBL - Single register table variant on page C7-1372
00   00   1         TBX - Single register table variant on page C7-1374
00   01   0         TBL - Two register table variant on page C7-1372
00   01   1         TBX - Two register table variant on page C7-1374
00   10   0         TBL - Three register table variant on page C7-1372
00   10   1         TBX - Three register table variant on page C7-1374
00   11   0         TBL - Four register table variant on page C7-1372
00   11   1         TBX - Four register table variant on page C7-1374
1x   -    -         Unallocated.
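In this class the len field directly encodes the number of table registers minus one, and the op bit separates TBL from TBX. A minimal sketch of that relationship follows. The function name `decode_table_lookup` is hypothetical, and the field positions (op2 at bits 23:22, len at bits 14:13, op at bit 12, with op = 0 for TBL and op = 1 for TBX) are assumptions based on the encoding diagram for this class.

```python
def decode_table_lookup(word: int):
    """Return (mnemonic, table_register_count), or None if unallocated."""
    fixed_ok = (
        (word >> 31) & 1 == 0
        and (word >> 24) & 0x3F == 0b001110
        and (word >> 21) & 1 == 0
        and (word >> 15) & 1 == 0
        and (word >> 10) & 0b11 == 0b00
    )
    if not fixed_ok:
        raise ValueError("word is not in the Advanced SIMD table lookup class")
    op2 = (word >> 22) & 0b11
    length = (word >> 13) & 0b11
    op = (word >> 12) & 1
    if op2 != 0b00:            # rows "x1" and "1x" are unallocated
        return None
    # len = 00..11 selects the single to four register table variants.
    return ("TBX" if op else "TBL", length + 1)
```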





C4.6.15 Advanced SIMD three different 


This section describes the encoding of the Advanced SIMD three different instruction class. This section is decoded 
from Data processing - SIMD and floating point on page C4-233. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-12  | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U  | 01110 | size  | 1  | Rm    | opcode | 00    | Rn  | Rd  |

Decode fields   Instruction Page
U   opcode
-   1111        Unallocated.
0   0000        SADDL, SADDL2
0   0001        SADDW, SADDW2
0   0010        SSUBL, SSUBL2
0   0011        SSUBW, SSUBW2
0   0100        ADDHN, ADDHN2
0   0101        SABAL, SABAL2
0   0110        SUBHN, SUBHN2
0   0111        SABDL, SABDL2
0   1000        SMLAL, SMLAL2 (vector)
0   1001        SQDMLAL, SQDMLAL2 (vector)
0   1010        SMLSL, SMLSL2 (vector)
0   1011        SQDMLSL, SQDMLSL2 (vector)
0   1100        SMULL, SMULL2 (vector)
0   1101        SQDMULL, SQDMULL2 (vector)
0   1110        PMULL, PMULL2
1   0000        UADDL, UADDL2
1   0001        UADDW, UADDW2
1   0010        USUBL, USUBL2
1   0011        USUBW, USUBW2
1   0100        RADDHN, RADDHN2
1   0101        UABAL, UABAL2
1   0110        RSUBHN, RSUBHN2
1   0111        UABDL, UABDL2
1   1000        UMLAL, UMLAL2 (vector)
1   1001        Unallocated.
1   1010        UMLSL, UMLSL2 (vector)
1   1011        Unallocated.
1   1100        UMULL, UMULL2 (vector)
1   1101        Unallocated.
1   1110        Unallocated.


C4.6.16 Advanced SIMD three same


This section describes the encoding of the Advanced SIMD three same instruction class. This section is decoded
from Data processing - SIMD and floating point on page C4-233.
| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-11  | 10 | 9-5 | 4-0 |
| 0  | Q  | U  | 01110 | size  | 1  | Rm    | opcode | 1  | Rn  | Rd  |

Decode fields           Instruction Page
U   size  opcode
0   -     00000         SHADD
0   -     00001         SQADD
0   -     00010         SRHADD
0   -     00100         SHSUB
0   -     00101         SQSUB
0   -     00110         CMGT (register)
0   -     00111         CMGE (register)
0   -     01000         SSHL
0   -     01001         SQSHL (register)
0   -     01010         SRSHL
0   -     01011         SQRSHL
0   -     01100         SMAX
0   -     01101         SMIN
0   -     01110         SABD
0   -     01111         SABA
0   -     10000         ADD (vector)
0   -     10001         CMTST
0   -     10010         MLA (vector)
0   -     10011         MUL (vector)
0   -     10100         SMAXP
0   -     10101         SMINP
0   -     10110         SQDMULH (vector)
0   -     10111         ADDP (vector)
0   0x    11000         FMAXNM (vector)
0   0x    11001         FMLA (vector)
0   0x    11010         FADD (vector)
0   0x    11011         FMULX
0   0x    11100         FCMEQ (register)
0   0x    11101         Unallocated.
0   0x    11110         FMAX (vector)
0   0x    11111         FRECPS
0   00    00011         AND (vector)
0   01    00011         BIC (vector, register)
0   1x    11000         FMINNM (vector)
0   1x    11001         FMLS (vector)
0   1x    11010         FSUB (vector)
0   1x    11011         Unallocated.
0   1x    11100         Unallocated.
0   1x    11101         Unallocated.
0   1x    11110         FMIN (vector)
0   1x    11111         FRSQRTS
0   10    00011         ORR (vector, register)
0   11    00011         ORN (vector)
1   -     00000         UHADD
1   -     00001         UQADD
1   -     00010         URHADD
1   -     00100         UHSUB
1   -     00101         UQSUB
1   -     00110         CMHI (register)
1   -     00111         CMHS (register)
1   -     01000         USHL
1   -     01001         UQSHL (register)
1   -     01010         URSHL
1   -     01011         UQRSHL
1   -     01100         UMAX
1   -     01101         UMIN
1   -     01110         UABD
1   -     01111         UABA
1   -     10000         SUB (vector)
1   -     10001         CMEQ (register)
1   -     10010         MLS (vector)
1   -     10011         PMUL
1   -     10100         UMAXP
1   -     10101         UMINP
1   -     10110         SQRDMULH (vector)
1   -     10111         Unallocated.
1   0x    11000         FMAXNMP (vector)
1   0x    11001         Unallocated.
1   0x    11010         FADDP (vector)
1   0x    11011         FMUL (vector)
1   0x    11100         FCMGE (register)
1   0x    11101         FACGE
1   0x    11110         FMAXP (vector)
1   0x    11111         FDIV (vector)
1   00    00011         EOR (vector)
1   01    00011         BSL
1   1x    11000         FMINNMP (vector)
1   1x    11001         Unallocated.
1   1x    11010         FABD
1   1x    11011         Unallocated.
1   1x    11100         FCMGT (register)
1   1x    11101         FACGT
1   1x    11110         FMINP (vector)
1   1x    11111         Unallocated.
1   10    00011         BIT
1   11    00011         BIF








C4.6.17 Advanced SIMD two-register miscellaneous 


This section describes the encoding of the Advanced SIMD two-register miscellaneous instruction class. This
section is decoded from Data processing - SIMD and floating point on page C4-233.


|31|30|29|28     24|23 22|21     17|16  12|11 10|9   5|4   0|
| 0| Q| U|0 1 1 1 0| size|1 0 0 0 0|opcode| 1 0 |  Rn |  Rd |


Decode fields
U  size  opcode  Instruction Page

-  -   1000x  Unallocated.
-  -   10101  Unallocated.
-  -   11110  Unallocated.
-  0x  011xx  Unallocated.
-  0x  11111  Unallocated.
-  1x  10110  Unallocated.
-  1x  10111  Unallocated.
0  -   00000  REV64
0  -   00001  REV16 (vector)
0  -   00010  SADDLP
0  -   00011  SUQADD
0  -   00100  CLS (vector)
0  -   00101  CNT
0  -   00110  SADALP
0  -   00111  SQABS
0  -   01000  CMGT (zero)
0  -   01001  CMEQ (zero)
0  -   01010  CMLT (zero)
0  -   01011  ABS
0  -   10010  XTN, XTN2
0  -   10011  Unallocated.
0  -   10100  SQXTN, SQXTN2
0  0x  10110  FCVTN, FCVTN2
0  0x  10111  FCVTL, FCVTL2
0  0x  11000  FRINTN (vector)
0  0x  11001  FRINTM (vector)
0  0x  11010  FCVTNS (vector)
0  0x  11011  FCVTMS (vector)
0  0x  11100  FCVTAS (vector)
0  0x  11101  SCVTF (vector, integer)
0  1x  01100  FCMGT (zero)
0  1x  01101  FCMEQ (zero)
0  1x  01110  FCMLT (zero)
0  1x  01111  FABS (vector)
0  1x  11000  FRINTP (vector)
0  1x  11001  FRINTZ (vector)
0  1x  11010  FCVTPS (vector)
0  1x  11011  FCVTZS (vector, integer)
0  1x  11100  URECPE
0  1x  11101  FRECPE
0  1x  11111  Unallocated.
1  -   00000  REV32 (vector)
1  -   00001  Unallocated.
1  -   00010  UADDLP
1  -   00011  USQADD
1  -   00100  CLZ (vector)
1  -   00110  UADALP
1  -   00111  SQNEG
1  -   01000  CMGE (zero)
1  -   01001  CMLE (zero)
1  -   01010  Unallocated.
1  -   01011  NEG (vector)
1  -   10010  SQXTUN, SQXTUN2
1  -   10011  SHLL, SHLL2
1  -   10100  UQXTN, UQXTN2
1  0x  10110  FCVTXN, FCVTXN2
1  0x  10111  Unallocated.
1  0x  11000  FRINTA (vector)
1  0x  11001  FRINTX (vector)
1  0x  11010  FCVTNU (vector)
1  0x  11011  FCVTMU (vector)
1  0x  11100  FCVTAU (vector)
1  0x  11101  UCVTF (vector, integer)
1  00  00101  NOT
1  01  00101  RBIT (vector)
1  1x  00101  Unallocated.
1  1x  01100  FCMGE (zero)
1  1x  01101  FCMLE (zero)
1  1x  01110  Unallocated.
1  1x  01111  FNEG (vector)
1  1x  11000  Unallocated.
1  1x  11001  FRINTI (vector)
1  1x  11010  FCVTPU (vector)
1  1x  11011  FCVTZU (vector, integer)
1  1x  11100  URSQRTE
1  1x  11101  FRSQRTE
1  1x  11111  FSQRT (vector)
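
As an illustration of how the U, size, and opcode decode fields of this class might be extracted from a 32-bit instruction word, the following is a minimal sketch. The function and table names are hypothetical, the lookup holds only four rows of the table above, and a result name of None means only that the row is not in this small subset.

```python
# Sketch only: field extraction for the Advanced SIMD two-register
# miscellaneous class. All names here are illustrative assumptions.

FIXED_MASK = 0x9F3E0C00   # bit 31, bits[28:24], bits[21:17], bits[11:10]
FIXED_VALUE = 0x0E200800  # 0, 0b01110, 0b10000, 0b10

# A few (U, size, opcode) rows. A "-" size in the table means any size,
# represented here by None.
ROWS = {
    (0, None, 0b00000): "REV64",
    (0, None, 0b00101): "CNT",
    (1, 0b00, 0b00101): "NOT",
    (1, 0b01, 0b00101): "RBIT (vector)",
}


def decode_two_reg_misc(insn):
    """Return the decode fields, or None if insn is outside this class."""
    if (insn & FIXED_MASK) != FIXED_VALUE:
        return None
    u = (insn >> 29) & 0b1
    size = (insn >> 22) & 0b11
    opcode = (insn >> 12) & 0b11111
    # Try the exact-size row first, then the any-size row.
    name = ROWS.get((u, size, opcode)) or ROWS.get((u, None, opcode))
    return {
        "Q": (insn >> 30) & 0b1,
        "U": u,
        "size": size,
        "opcode": opcode,
        "Rn": (insn >> 5) & 0b11111,
        "Rd": insn & 0b11111,
        "name": name,
    }
```

A complete decoder would also have to apply the Unallocated rows before the allocated ones, because several of them use wildcard patterns that overlap whole opcode ranges.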





C4.6.18 Advanced SIMD vector x indexed element 


This section describes the encoding of the Advanced SIMD vector x indexed element instruction class. This section 
is decoded from Data processing - SIMD and floating point on page C4-233. 


|31|30|29|28     24|23 22|21|20|19  16|15  12|11|10|9   5|4   0|
| 0| Q| U|0 1 1 1 1| size| L| M|  Rm  |opcode| H| 0|  Rn |  Rd |


Decode fields
U  size  opcode  Instruction Page

-  -   111x  Unallocated.
0  -   0000  Unallocated.
0  -   0010  SMLAL, SMLAL2 (by element)
0  -   0011  SQDMLAL, SQDMLAL2 (by element)
0  -   0100  Unallocated.
0  -   0110  SMLSL, SMLSL2 (by element)
0  -   0111  SQDMLSL, SQDMLSL2 (by element)
0  -   1000  MUL (by element)
0  -   1010  SMULL, SMULL2 (by element)
0  -   1011  SQDMULL, SQDMULL2 (by element)
0  -   1100  SQDMULH (by element)
0  -   1101  SQRDMULH (by element)
0  1x  0001  FMLA (by element)
0  1x  0101  FMLS (by element)
0  1x  1001  FMUL (by element)
1  -   0000  MLA (by element)
1  -   0010  UMLAL, UMLAL2 (by element)
1  -   0011  Unallocated.
1  -   0100  MLS (by element)
1  -   0110  UMLSL, UMLSL2 (by element)
1  -   0111  Unallocated.
1  -   1000  Unallocated.
1  -   1010  UMULL, UMULL2 (by element)
1  -   1011  Unallocated.
1  -   110x  Unallocated.
1  1x  0001  Unallocated.
1  1x  0101  Unallocated.
1  1x  1001  FMULX (by element)








C4.6.19 Cryptographic AES 


This section describes the encoding of the Cryptographic AES instruction class. This section is decoded from Data
processing - SIMD and floating point on page C4-233.


|31           24|23 22|21     17|16  12|11 10|9   5|4   0|
|0 1 0 0 1 1 1 0| size|1 0 1 0 0|opcode| 1 0 |  Rn |  Rd |


Decode fields
size  opcode  Instruction Page

-   x1xxx  Unallocated.
-   000xx  Unallocated.
-   1xxxx  Unallocated.
x1  -      Unallocated.
00  00100  AESE
00  00101  AESD
00  00110  AESMC
00  00111  AESIMC
1x  -      Unallocated.
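
The table above could be applied to an instruction word roughly as follows. This is a minimal sketch under the assumption that the word has already been routed to the SIMD and floating-point group. The names are hypothetical.

```python
# Sketch only: decode of the Cryptographic AES class. Names are
# illustrative assumptions, not part of the architecture.

AES_MASK = 0xFF3E0C00   # bits[31:24], bits[21:17], bits[11:10]
AES_VALUE = 0x4E280800  # 0b01001110, 0b10100, 0b10

AES_OPCODES = {
    0b00100: "AESE",
    0b00101: "AESD",
    0b00110: "AESMC",
    0b00111: "AESIMC",
}


def decode_crypto_aes(insn):
    """Return a mnemonic, "Unallocated", or None if outside this class."""
    if (insn & AES_MASK) != AES_VALUE:
        return None
    size = (insn >> 22) & 0b11
    opcode = (insn >> 12) & 0b11111
    if size != 0b00:
        # Rows "x1 -" and "1x -": every non-zero size is Unallocated.
        return "Unallocated"
    # Rows "x1xxx", "000xx", and "1xxxx" leave only 0b00100..0b00111.
    return AES_OPCODES.get(opcode, "Unallocated")
```

Because the three wildcard opcode rows together exclude everything except 0b001xx, a dictionary lookup with an "Unallocated" default is equivalent to testing each wildcard row in turn.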





C4.6.20 Cryptographic three-register SHA 


This section describes the encoding of the Cryptographic three-register SHA instruction class. This section is
decoded from Data processing - SIMD and floating point on page C4-233.


|31           24|23 22|21|20  16|15|14  12|11 10|9   5|4   0|
|0 1 0 1 1 1 1 0| size| 0|  Rm  | 0|opcode| 0 0 |  Rn |  Rd |


Decode fields
size  opcode  Instruction Page

-   111  Unallocated.
x1  -    Unallocated.
00  000  SHA1C
00  001  SHA1P
00  010  SHA1M
00  011  SHA1SU0
00  100  SHA256H
00  101  SHA256H2
00  110  SHA256SU1
1x  -    Unallocated.


C4.6.21 Cryptographic two-register SHA


This section describes the encoding of the Cryptographic two-register SHA instruction class. This section is
decoded from Data processing - SIMD and floating point on page C4-233.


|31           24|23 22|21     17|16  12|11 10|9   5|4   0|
|0 1 0 1 1 1 1 0| size|1 0 1 0 0|opcode| 1 0 |  Rn |  Rd |


Decode fields
size  opcode  Instruction Page

-   xx1xx  Unallocated.
-   x1xxx  Unallocated.
-   1xxxx  Unallocated.
x1  -      Unallocated.
00  00000  SHA1H
00  00001  SHA1SU1
00  00010  SHA256SU0
00  00011  Unallocated.
1x  -      Unallocated.





C4.6.22 Floating-point compare 


This section describes the encoding of the Floating-point compare instruction class. This section is decoded from 
Data processing - SIMD and floating point on page C4-233. 


|31|30|29|28     24|23 22|21|20  16|15 14|13   10|9   5|4     0|
| M| 0| S|1 1 1 1 0| type| 1|  Rm  |  op |1 0 0 0|  Rn |opcode2|


Decode fields
M  S  type  op  opcode2  Instruction Page

-  -  -   -   xxxx1  Unallocated.
-  -  -   -   xxx1x  Unallocated.
-  -  -   -   xx1xx  Unallocated.
-  -  -   x1  -      Unallocated.
-  -  -   1x  -      Unallocated.
-  -  10  -   -      Unallocated.
-  1  -   -   -      Unallocated.
0  0  00  00  00000  FCMP
0  0  00  00  01000  FCMP
0  0  00  00  10000  FCMPE
0  0  00  00  11000  FCMPE
0  0  01  00  00000  FCMP
0  0  01  00  01000  FCMP
0  0  01  00  10000  FCMPE
0  0  01  00  11000  FCMPE
1  -  -   -   -      Unallocated.




C4.6.23 Floating-point conditional compare


This section describes the encoding of the Floating-point conditional compare instruction class. This section is
decoded from Data processing - SIMD and floating point on page C4-233.


|31|30|29|28     24|23 22|21|20  16|15  12|11 10|9   5| 4|3   0|
| M| 0| S|1 1 1 1 0| type| 1|  Rm  | cond | 0 1 |  Rn |op| nzcv|


Decode fields
M  S  type  op  Instruction Page

-  -  10  -  Unallocated.
-  1  -   -  Unallocated.
0  0  00  0  FCCMP - Single-precision variant on page C7-843
0  0  00  1  FCCMPE - Single-precision variant on page C7-845
0  0  01  0  FCCMP - Double-precision variant on page C7-843
0  0  01  1  FCCMPE - Double-precision variant on page C7-845
1  -  -   -  Unallocated.


C4.6.24 Floating-point conditional select


This section describes the encoding of the Floating-point conditional select instruction class. This section is decoded
from Data processing - SIMD and floating point on page C4-233.


|31|30|29|28     24|23 22|21|20  16|15  12|11 10|9   5|4   0|
| M| 0| S|1 1 1 1 0| type| 1|  Rm  | cond | 1 1 |  Rn |  Rd |


Decode fields
M  S  type  Instruction Page

-  -  10  Unallocated.
-  1  -   Unallocated.
0  0  00  FCSEL - Single-precision variant on page C7-867
0  0  01  FCSEL - Double-precision variant on page C7-867
1  -  -   Unallocated.





C4.6.25 Floating-point data-processing (1 source) 


This section describes the encoding of the Floating-point data-processing (1 source) instruction class. This section 
is decoded from Data processing - SIMD and floating point on page C4-233. 


|31|30|29|28     24|23 22|21|20  15|14     10|9   5|4   0|
| M| 0| S|1 1 1 1 0| type| 1|opcode|1 0 0 0 0|  Rn |  Rd |


Decode fields
M  S  type  opcode  Instruction Page

-  -  -   x1xxxx  Unallocated.
-  -  -   1xxxxx  Unallocated.
-  1  -   -       Unallocated.
0  0  00  000000  FMOV (register) - Single-precision variant on page C7-974
0  0  00  000001  FABS (scalar) - Single-precision variant on page C7-831
0  0  00  000010  FNEG (scalar) - Single-precision variant on page C7-993
0  0  00  000011  FSQRT (scalar) - Single-precision variant on page C7-1038
0  0  00  000100  Unallocated.
0  0  00  000101  FCVT - Single-precision to double-precision variant on page C7-869
0  0  00  000110  Unallocated.
0  0  00  000111  FCVT - Single-precision to half-precision variant on page C7-869
0  0  00  001000  FRINTN (scalar) - Single-precision variant on page C7-1019
0  0  00  001001  FRINTP (scalar) - Single-precision variant on page C7-1023
0  0  00  001010  FRINTM (scalar) - Single-precision variant on page C7-1015
0  0  00  001011  FRINTZ (scalar) - Single-precision variant on page C7-1031
0  0  00  001100  FRINTA (scalar) - Single-precision variant on page C7-1007
0  0  00  001101  Unallocated.
0  0  00  001110  FRINTX (scalar) - Single-precision variant on page C7-1027
0  0  00  001111  FRINTI (scalar) - Single-precision variant on page C7-1011
0  0  01  000000  FMOV (register) - Double-precision variant on page C7-974
0  0  01  000001  FABS (scalar) - Double-precision variant on page C7-831
0  0  01  000010  FNEG (scalar) - Double-precision variant on page C7-993
0  0  01  000011  FSQRT (scalar) - Double-precision variant on page C7-1038
0  0  01  000100  FCVT - Double-precision to single-precision variant on page C7-869
0  0  01  000101  Unallocated.
0  0  01  000110  Unallocated.
0  0  01  000111  FCVT - Double-precision to half-precision variant on page C7-869
0  0  01  001000  FRINTN (scalar) - Double-precision variant on page C7-1019
0  0  01  001001  FRINTP (scalar) - Double-precision variant on page C7-1023
0  0  01  001010  FRINTM (scalar) - Double-precision variant on page C7-1015
0  0  01  001011  FRINTZ (scalar) - Double-precision variant on page C7-1031
0  0  01  001100  FRINTA (scalar) - Double-precision variant on page C7-1007
0  0  01  001101  Unallocated.
0  0  01  001110  FRINTX (scalar) - Double-precision variant on page C7-1027
0  0  01  001111  FRINTI (scalar) - Double-precision variant on page C7-1011
0  0  10  00xxxx  Unallocated.
0  0  11  000100  FCVT - Half-precision to single-precision variant on page C7-869
0  0  11  000101  FCVT - Half-precision to double-precision variant on page C7-869
0  0  11  00011x  Unallocated.
0  0  11  001101  Unallocated.
1  -  -   -       Unallocated.
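
The FCVT rows in the table above follow a pattern: the type field names the source precision and the low two opcode bits name the destination, using the same 00, 01, 11 values. The sketch below illustrates that reading for a subset of the table. The function name and the returned strings are hypothetical, and a return value of None means only that the row is outside this subset.

```python
# Sketch only: partial decode of Floating-point data-processing (1 source).
# Names and result strings are illustrative assumptions.

FP1_MASK = 0x5F207C00   # bit 30, bits[28:24], bit 21, bits[14:10]
FP1_VALUE = 0x1E204000  # 0, 0b11110, 1, 0b10000

PRECISION = {
    0b00: "single-precision",
    0b01: "double-precision",
    0b11: "half-precision",
}

SIMPLE_OPS = {
    0b000000: "FMOV (register)",
    0b000001: "FABS (scalar)",
    0b000010: "FNEG (scalar)",
    0b000011: "FSQRT (scalar)",
}


def decode_fp_1src(insn):
    if (insn & FP1_MASK) != FP1_VALUE:
        return None
    m = (insn >> 31) & 0b1
    s = (insn >> 29) & 0b1
    ftype = (insn >> 22) & 0b11
    opcode = (insn >> 15) & 0b111111
    if m == 1 or s == 1:
        return "Unallocated"          # rows "1 - - -" and "- 1 - -"
    if ftype in (0b00, 0b01) and opcode in SIMPLE_OPS:
        return f"{SIMPLE_OPS[opcode]}, {PRECISION[ftype]}"
    if (opcode >> 2) == 0b0001 and ftype in PRECISION:
        dst = opcode & 0b11           # destination precision for FCVT
        if dst in PRECISION and dst != ftype:
            return f"FCVT, {PRECISION[ftype]} to {PRECISION[dst]}"
        return "Unallocated"
    return None                       # row not covered by this subset
```

For example, type 0b00 with opcode 0b000101 reads as a conversion from single-precision to double-precision, which matches the corresponding table row.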





C4.6.26 Floating-point data-processing (2 source) 


This section describes the encoding of the Floating-point data-processing (2 source) instruction class. This section
is decoded from Data processing - SIMD and floating point on page C4-233.


|31|30|29|28     24|23 22|21|20  16|15  12|11 10|9   5|4   0|
| M| 0| S|1 1 1 1 0| type| 1|  Rm  |opcode| 1 0 |  Rn |  Rd |


Decode fields
M  S  type  opcode  Instruction Page

-  -  -   1xx1  Unallocated.
-  -  -   1x1x  Unallocated.
-  -  -   11xx  Unallocated.
-  -  10  -     Unallocated.
-  1  -   -     Unallocated.
0  0  00  0000  FMUL (scalar) - Single-precision variant on page C7-985
0  0  00  0001  FDIV (scalar) - Single-precision variant on page C7-929
0  0  00  0010  FADD (scalar) - Single-precision variant on page C7-838
0  0  00  0011  FSUB (scalar) - Single-precision variant on page C7-1041
0  0  00  0100  FMAX (scalar) - Single-precision variant on page C7-935
0  0  00  0101  FMIN (scalar) - Single-precision variant on page C7-951
0  0  00  0110  FMAXNM (scalar) - Single-precision variant on page C7-939
0  0  00  0111  FMINNM (scalar) - Single-precision variant on page C7-955
0  0  00  1000  FNMUL (scalar) - Single-precision variant on page C7-998
0  0  01  0000  FMUL (scalar) - Double-precision variant on page C7-985
0  0  01  0001  FDIV (scalar) - Double-precision variant on page C7-929
0  0  01  0010  FADD (scalar) - Double-precision variant on page C7-838
0  0  01  0011  FSUB (scalar) - Double-precision variant on page C7-1041
0  0  01  0100  FMAX (scalar) - Double-precision variant on page C7-935
0  0  01  0101  FMIN (scalar) - Double-precision variant on page C7-951
0  0  01  0110  FMAXNM (scalar) - Double-precision variant on page C7-939
0  0  01  0111  FMINNM (scalar) - Double-precision variant on page C7-955
0  0  01  1000  FNMUL (scalar) - Double-precision variant on page C7-998
1  -  -   -     Unallocated.





C4.6.27 Floating-point data-processing (3 source) 


This section describes the encoding of the Floating-point data-processing (3 source) instruction class. This section
is decoded from Data processing - SIMD and floating point on page C4-233.


|31|30|29|28     24|23 22|21|20  16|15|14  10|9   5|4   0|
| M| 0| S|1 1 1 1 1| type|o1|  Rm  |o0|  Ra  |  Rn |  Rd |


Decode fields
M  S  type  o1  o0  Instruction Page

-  -  10  -  -  Unallocated.
-  1  -   -  -  Unallocated.
0  0  00  0  0  FMADD - Single-precision variant on page C7-931
0  0  00  0  1  FMSUB - Single-precision variant on page C7-979
0  0  00  1  0  FNMADD - Single-precision variant on page C7-994
0  0  00  1  1  FNMSUB - Single-precision variant on page C7-996
0  0  01  0  0  FMADD - Double-precision variant on page C7-931
0  0  01  0  1  FMSUB - Double-precision variant on page C7-979
0  0  01  1  0  FNMADD - Double-precision variant on page C7-994
0  0  01  1  1  FNMSUB - Double-precision variant on page C7-996
1  -  -   -  -  Unallocated.



































C4.6.28 Floating-point immediate 
This section describes the encoding of the Floating-point immediate instruction class. This section is decoded from 
Data processing - SIMD and floating point on page C4-233. 
|31|30|29|28     24|23 22|21|20    13|12 10|9   5|4   0|
| M| 0| S|1 1 1 1 0| type| 1|  imm8  |1 0 0| imm5|  Rd |


Decode fields
M  S  type  imm5  Instruction Page

-  -  -   xxxx1  Unallocated.
-  -  -   xxx1x  Unallocated.
-  -  -   xx1xx  Unallocated.
-  -  -   x1xxx  Unallocated.
-  -  -   1xxxx  Unallocated.
-  -  10  -      Unallocated.
-  1  -   -      Unallocated.
0  0  00  00000  FMOV (scalar, immediate) - Single-precision variant on page C7-978
0  0  01  00000  FMOV (scalar, immediate) - Double-precision variant on page C7-978
1  -  -   -      Unallocated.





C4.6.29 Conversion between floating-point and fixed-point


This section describes the encoding of the Conversion between floating-point and fixed-point instruction class. This
section is decoded from Data processing - SIMD and floating point on page C4-233.


|31|30|29|28     24|23 22|21|20 19|18  16|15  10|9   5|4   0|
|sf| 0| S|1 1 1 1 0| type| 0|rmode|opcode| scale|  Rn |  Rd |


Decode fields
sf  S  type  rmode  opcode  scale  Instruction Page

-  -  -   -   1xx  -       Unallocated.
-  -  -   x0  00x  -       Unallocated.
-  -  -   x1  01x  -       Unallocated.
-  -  -   0x  00x  -       Unallocated.
-  -  -   1x  01x  -       Unallocated.
-  -  10  -   -    -       Unallocated.
-  1  -   -   -    -       Unallocated.
0  -  -   -   -    0xxxxx  Unallocated.
0  0  00  00  010  -       SCVTF (scalar, fixed-point) - 32-bit to single-precision variant on page C7-1177
0  0  00  00  011  -       UCVTF (scalar, fixed-point) - 32-bit to single-precision variant on page C7-1403
0  0  00  11  000  -       FCVTZS (scalar, fixed-point) - Single-precision to 32-bit variant on page C7-914
0  0  00  11  001  -       FCVTZU (scalar, fixed-point) - Single-precision to 32-bit variant on page C7-923
0  0  01  00  010  -       SCVTF (scalar, fixed-point) - 32-bit to double-precision variant on page C7-1177
0  0  01  00  011  -       UCVTF (scalar, fixed-point) - 32-bit to double-precision variant on page C7-1403
0  0  01  11  000  -       FCVTZS (scalar, fixed-point) - Double-precision to 32-bit variant on page C7-914
0  0  01  11  001  -       FCVTZU (scalar, fixed-point) - Double-precision to 32-bit variant on page C7-923
1  0  00  00  010  -       SCVTF (scalar, fixed-point) - 64-bit to single-precision variant on page C7-1177
1  0  00  00  011  -       UCVTF (scalar, fixed-point) - 64-bit to single-precision variant on page C7-1403
1  0  00  11  000  -       FCVTZS (scalar, fixed-point) - Single-precision to 64-bit variant on page C7-914
1  0  00  11  001  -       FCVTZU (scalar, fixed-point) - Single-precision to 64-bit variant on page C7-923
1  0  01  00  010  -       SCVTF (scalar, fixed-point) - 64-bit to double-precision variant on page C7-1177
1  0  01  00  011  -       UCVTF (scalar, fixed-point) - 64-bit to double-precision variant on page C7-1403
1  0  01  11  000  -       FCVTZS (scalar, fixed-point) - Double-precision to 64-bit variant on page C7-914
1  0  01  11  001  -       FCVTZU (scalar, fixed-point) - Double-precision to 64-bit variant on page C7-923


C4.6.30 Conversion between floating-point and integer 


This section describes the encoding of the Conversion between floating-point and integer instruction class. This 
section is decoded from Data processing - SIMD and floating point on page C4-233. 


|31|30|29|28     24|23 22|21|20 19|18  16|15       10|9   5|4   0|
|sf| 0| S|1 1 1 1 0| type| 1|rmode|opcode|0 0 0 0 0 0|  Rn |  Rd |


Decode fields
sf  S  type  rmode  opcode  Instruction Page

-  -  -   x1  01x  Unallocated.
-  -  -   x1  10x  Unallocated.
-  -  -   1x  01x  Unallocated.
-  -  -   1x  10x  Unallocated.
-  0  10  -   0xx  Unallocated.
-  0  10  -   10x  Unallocated.
-  1  -   -   -    Unallocated.
0  0  00  x1  11x  Unallocated.
0  0  00  00  000  FCVTNS (scalar) - Single-precision to 32-bit variant on page C7-893
0  0  00  00  001  FCVTNU (scalar) - Single-precision to 32-bit variant on page C7-897
0  0  00  00  010  SCVTF (scalar, integer) - 32-bit to single-precision variant on page C7-1179
0  0  00  00  011  UCVTF (scalar, integer) - 32-bit to single-precision variant on page C7-1405
0  0  00  00  100  FCVTAS (scalar) - Single-precision to 32-bit variant on page C7-873
0  0  00  00  101  FCVTAU (scalar) - Single-precision to 32-bit variant on page C7-877
0  0  00  00  110  FMOV (general) - Single-precision to 32-bit variant on page C7-975
0  0  00  00  111  FMOV (general) - 32-bit to single-precision variant on page C7-975
0  0  00  01  000  FCVTPS (scalar) - Single-precision to 32-bit variant on page C7-901
0  0  00  01  001  FCVTPU (scalar) - Single-precision to 32-bit variant on page C7-905
0  0  00  1x  11x  Unallocated.
0  0  00  10  000  FCVTMS (scalar) - Single-precision to 32-bit variant on page C7-883
0  0  00  10  001  FCVTMU (scalar) - Single-precision to 32-bit variant on page C7-887
0  0  00  11  000  FCVTZS (scalar, integer) - Single-precision to 32-bit variant on page C7-916
0  0  00  11  001  FCVTZU (scalar, integer) - Single-precision to 32-bit variant on page C7-925
0  0  01  -   11x  Unallocated.
0  0  01  00  000  FCVTNS (scalar) - Double-precision to 32-bit variant on page C7-893
0  0  01  00  001  FCVTNU (scalar) - Double-precision to 32-bit variant on page C7-897
0  0  01  00  010  SCVTF (scalar, integer) - 32-bit to double-precision variant on page C7-1179
0  0  01  00  011  UCVTF (scalar, integer) - 32-bit to double-precision variant on page C7-1405
0  0  01  00  100  FCVTAS (scalar) - Double-precision to 32-bit variant on page C7-873
0  0  01  00  101  FCVTAU (scalar) - Double-precision to 32-bit variant on page C7-877
0  0  01  01  000  FCVTPS (scalar) - Double-precision to 32-bit variant on page C7-901
0  0  01  01  001  FCVTPU (scalar) - Double-precision to 32-bit variant on page C7-905
0  0  01  10  000  FCVTMS (scalar) - Double-precision to 32-bit variant on page C7-883
0  0  01  10  001  FCVTMU (scalar) - Double-precision to 32-bit variant on page C7-887
0  0  01  11  000  FCVTZS (scalar, integer) - Double-precision to 32-bit variant on page C7-916
0  0  01  11  001  FCVTZU (scalar, integer) - Double-precision to 32-bit variant on page C7-925
0  0  10  -   11x  Unallocated.
1  0  00  -   11x  Unallocated.
1  0  00  00  000  FCVTNS (scalar) - Single-precision to 64-bit variant on page C7-893
1  0  00  00  001  FCVTNU (scalar) - Single-precision to 64-bit variant on page C7-897
1  0  00  00  010  SCVTF (scalar, integer) - 64-bit to single-precision variant on page C7-1179
1  0  00  00  011  UCVTF (scalar, integer) - 64-bit to single-precision variant on page C7-1405
1  0  00  00  100  FCVTAS (scalar) - Single-precision to 64-bit variant on page C7-873
1  0  00  00  101  FCVTAU (scalar) - Single-precision to 64-bit variant on page C7-877
1  0  00  01  000  FCVTPS (scalar) - Single-precision to 64-bit variant on page C7-901
1  0  00  01  001  FCVTPU (scalar) - Single-precision to 64-bit variant on page C7-905
1  0  00  10  000  FCVTMS (scalar) - Single-precision to 64-bit variant on page C7-883
1  0  00  10  001  FCVTMU (scalar) - Single-precision to 64-bit variant on page C7-887
1  0  00  11  000  FCVTZS (scalar, integer) - Single-precision to 64-bit variant on page C7-916
1  0  00  11  001  FCVTZU (scalar, integer) - Single-precision to 64-bit variant on page C7-925
1  0  01  x1  11x  Unallocated.
1  0  01  00  000  FCVTNS (scalar) - Double-precision to 64-bit variant on page C7-893
1  0  01  00  001  FCVTNU (scalar) - Double-precision to 64-bit variant on page C7-897
1  0  01  00  010  SCVTF (scalar, integer) - 64-bit to double-precision variant on page C7-1179
1  0  01  00  011  UCVTF (scalar, integer) - 64-bit to double-precision variant on page C7-1405
1  0  01  00  100  FCVTAS (scalar) - Double-precision to 64-bit variant on page C7-873
1  0  01  00  101  FCVTAU (scalar) - Double-precision to 64-bit variant on page C7-877
1  0  01  00  110  FMOV (general) - Double-precision to 64-bit variant on page C7-975
1  0  01  00  111  FMOV (general) - 64-bit to double-precision variant on page C7-975
1  0  01  01  000  FCVTPS (scalar) - Double-precision to 64-bit variant on page C7-901
1  0  01  01  001  FCVTPU (scalar) - Double-precision to 64-bit variant on page C7-905
1  0  01  1x  11x  Unallocated.
1  0  01  10  000  FCVTMS (scalar) - Double-precision to 64-bit variant on page C7-883
1  0  01  10  001  FCVTMU (scalar) - Double-precision to 64-bit variant on page C7-887
1  0  01  11  000  FCVTZS (scalar, integer) - Double-precision to 64-bit variant on page C7-916
1  0  01  11  001  FCVTZU (scalar, integer) - Double-precision to 64-bit variant on page C7-925
1  0  10  x0  11x  Unallocated.
1  0  10  01  110  FMOV (general) - Top half of 128-bit to 64-bit variant on page C7-975
1  0  10  01  111  FMOV (general) - 64-bit to top half of 128-bit variant on page C7-975
1  0  10  1x  11x  Unallocated.
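
In the table above, the FCVT rows with opcode 000 or 001 differ only in rmode, which selects the N, P, M, or Z rounding variant, and in sf, which selects the integer width. The following minimal sketch decodes that regular part of the table together with the rmode 00 conversions. The function name and tuple layout are hypothetical, and the FMOV (general) rows are deliberately left out.

```python
# Sketch only: partial decode of Conversion between floating-point and
# integer. Names and the result layout are illustrative assumptions.

FPINT_MASK = 0x5F20FC00   # bit 30, bits[28:24], bit 21, bits[15:10]
FPINT_VALUE = 0x1E200000  # 0, 0b11110, 1, 0b000000

ROUNDING = {0b00: "N", 0b01: "P", 0b10: "M", 0b11: "Z"}
RMODE_00_ONLY = {
    0b010: "SCVTF",
    0b011: "UCVTF",
    0b100: "FCVTAS",
    0b101: "FCVTAU",
}
PRECISION = {0b00: "single-precision", 0b01: "double-precision"}


def decode_fp_int(insn):
    """Return (mnemonic, precision, integer width) or None."""
    if (insn & FPINT_MASK) != FPINT_VALUE:
        return None
    sf = (insn >> 31) & 0b1
    s = (insn >> 29) & 0b1
    ftype = (insn >> 22) & 0b11
    rmode = (insn >> 19) & 0b11
    opcode = (insn >> 16) & 0b111
    if s != 0 or ftype not in PRECISION:
        return None                      # not covered by this sketch
    if opcode in (0b000, 0b001):
        sign = "U" if opcode & 1 else "S"
        name = "FCVT" + ROUNDING[rmode] + sign
    elif rmode == 0b00 and opcode in RMODE_00_ONLY:
        name = RMODE_00_ONLY[opcode]
    else:
        return None                      # FMOV (general) or Unallocated
    return (name, PRECISION[ftype], 64 if sf else 32)
```

A full decoder would add the FMOV (general) rows, including the type 10 top-half forms, and would report the Unallocated rows explicitly instead of returning None.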
Chapter C5
The A64 System Instruction Class


This chapter describes the A64 System instruction class, and the System instruction class encoding space, that is a
subset of the System registers encoding space. It contains the following sections:

• The System instruction class encoding space on page C5-270.
• Special-purpose registers on page C5-293.
• A64 system instructions for cache maintenance on page C5-347.
• A64 system instructions for address translation on page C5-365.
• A64 system instructions for TLB maintenance on page C5-378.

See General information about the A64 instruction descriptions on page C2-137 for information about entries used
in the instruction encoding descriptions.


C5.1 The System instruction class encoding space


Part of the A64 instruction encoding space is assigned to instructions that access the System register encoding space.
These instructions provide:

• Access to System registers, including the debug registers, that provide system control, and system status
  information.
• Access to Special-purpose registers such as SPSR_ELx, ELR_ELx, and the equivalent fields of the Process
  State.
• The cache and TLB maintenance instructions and address translation instructions.
• Barriers and the CLREX instruction.
• Architectural hint instructions.

This section describes the general model for accessing this functionality.


Note

• See Fixed values in AArch64 instruction and System register descriptions on page C2-137 for information
  about abbreviations used in the System instruction descriptions.
• In AArch32 state much of this functionality is provided through the System register interface described in
  The AArch32 System register interface on page G1-3877. In AArch64 state, the parameters used to
  characterize the System register encoding space are {op0, op1, CRn, CRm, op2}. These are based on the
  parameters that characterize the AArch32 System register encoding space, which reflect the original
  implementation of these registers, as described in Background to the System register interface on
  page G1-3879. In ARMv8, there is no particular significance to the naming of these parameters, and no
  functional distinction between the opn parameters and the CRx parameters.


Principles of the System instruction class encoding describes some general properties of these encodings. System
instruction class encoding overview on page C5-271 then describes the top-level encoding of these instructions, and
the following sections then describe the next level of the encoding hierarchy:

• op0==0b00, architectural hints, barriers and CLREX, and PSTATE access on page C5-272.
• op0==0b01, cache maintenance, TLB maintenance, and address translation instructions on page C5-275.
• op0==0b10, Moves to and from debug and trace System registers on page C5-279.
• op0==0b11, Moves to and from non-debug System registers and Special-purpose registers on page C5-281.
• Reserved encodings for IMPLEMENTATION DEFINED registers on page C5-291.





C5.1.1 Principles of the System instruction class encoding 
In ARMv8, an encoding in the System instruction space is identified by a set of arguments, {op0, op1, CRn, CRm, op2}.
These form an encoding hierarchy, where:
op0     Defines the top-level division of the encoding space, see System instruction class encoding overview
        on page C5-271.
op1     Identifies the lowest Exception level at which the encoding is accessible, as follows:
        Accessible at EL0   op1 has the value 3.
        Accessible at EL1   op1 has the value 0, 1, or 2. The value is the same as the op1 value used to
                            access the equivalent AArch32 register.
        Accessible at EL2   op1 has the value 4.
        Accessible at EL3   op1 has the value 6.
ARM strongly recommends that implementers adopt this use of op1 when using the IMPLEMENTATION DEFINED
regions of the encoding space described in Reserved encodings for IMPLEMENTATION DEFINED registers on
page C5-291.
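
The op1 convention listed above can be written as a small lookup. This is a minimal sketch with a hypothetical function name. It returns None for op1 values 5 and 7, because the list above does not assign them a level.

```python
# Sketch only: map op1 to the lowest Exception level at which the
# encoding is accessible, following the list above. The function name
# is an illustrative assumption.

def lowest_exception_level(op1):
    """Return 0, 1, 2, or 3 for EL0..EL3, or None if op1 is not listed."""
    if not 0 <= op1 <= 7:
        raise ValueError("op1 is a 3-bit field")
    if op1 == 3:
        return 0
    if op1 in (0, 1, 2):
        return 1
    if op1 == 4:
        return 2
    if op1 == 6:
        return 3
    return None  # 5 and 7 are not covered by the list above
```

A decoder could use this to reject an access early when the current Exception level is lower than the level the encoding requires.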
System register width


In AArch64 state, each encoding in the System instruction space can provide access to a 64-bit register. An AArch64
System register is described as either a 32-bit register or a 64-bit register. For a 32-bit register, the upper bits,
bits[63:32], are RES0.


C5.1.2 System instruction class encoding overview 


The encoding of the System instruction class describes each instruction as being either:
• A transfer to a System register. This is a System instruction with the semantics of a write.
• A transfer from a System register. This is a System instruction with the semantics of a read.
A System instruction that initiates an operation operates as if it was making a transfer to a register.


In the AArch64 instruction set, the decode structure for the System instruction class is:


|31               22|21|20 19|18 16|15 12|11  8|7  5|4   0|
|1 1 0 1 0 1 0 1 0 0| L| op0 | op1 | CRn | CRm | op2|  Rt |


The value of L indicates the transfer direction:
0       Transfer to System register.
1       Transfer from System register.

The op0 field is the top level encoding of the System instruction type. Its possible values are:


0b00    These encodings provide:
        • Instructions with an immediate field for accessing PSTATE, the current PE state.
        • The architectural hint instructions.
        • Barriers and the CLREX instruction.

        For more information about these encodings, see op0==0b00, architectural hints, barriers and
        CLREX, and PSTATE access on page C5-272.

0b01    These encodings provide the cache maintenance, TLB maintenance, and address translation
        instructions.

        Note
        These are equivalent to operations in the AArch32 (coproc==0b1111) encoding space.

        For more information, see op0==0b01, cache maintenance, TLB maintenance, and address
        translation instructions on page C5-275.

0b10    These encodings provide moves to and from:
        • Legacy AArch32 System registers for execution environments, to provide access to these
          registers from higher Exception levels that are using AArch64.
        • Debug and trace registers.

        Note
        These are equivalent to the registers in the AArch32 (coproc==0b1110) encoding space.

        For more information, see op0==0b10, Moves to and from debug and trace System registers on
        page C5-279.

0b11    These encodings provide:
        • Moves to and from Non-debug System registers. The accessed registers provide system
          control, and system status information.

          Note
          The accessed registers are equivalent to the registers in the AArch32 (coproc==0b1111)
          encoding space.

        • Access to Special-purpose registers.

        For more information, see Instructions for accessing Special-purpose registers on page C5-290 and
        Instructions for accessing non-debug System registers on page C5-281.


UNDEFINED behaviors


In the System register instruction encoding space, the following principles apply:

• All unallocated encodings are treated as UNDEFINED.
• All encodings with L==1 and op0==0b0x are UNDEFINED, except for encodings in the area reserved for
  IMPLEMENTATION DEFINED use, see Reserved encodings for IMPLEMENTATION DEFINED registers on
  page C5-291.

For registers and operations that are accessible from a particular Exception level, any attempt to access those
registers from a lower Exception level is UNDEFINED.


If a particular Exception level:

• Defines a register to be RO, then any attempt to write to that register, at that Exception level, is UNDEFINED.
  This means that any access to that register with L==0 is UNDEFINED.
• Defines a register to be WO, then any attempt to read from that register, at that Exception level, is UNDEFINED.
  This means that any access to that register with L==1 is UNDEFINED.

For IMPLEMENTATION DEFINED encoding spaces, the treatment of the encodings is IMPLEMENTATION DEFINED, but
see the recommendation in Principles of the System instruction class encoding on page C5-270.


C5.1.3 op0==0b00, architectural hints, barriers and CLREX, and PSTATE access


The different groups of System register instructions with op0==0b00: 
° Are identified by the value of CRn. 
° Are always encoded with a value of @b11111 in the Rt field. 


The encoding of these instructions is: 


31 30 29 28 27 26 25 24 23 22 21201918 1615 12 11 8 7 


Pere ror gto of et | om | ome Toe TTT 


op0 


The encoding of the CRn field is as follows: 

0b0010 See Architectural hint instructions. 

0b0011 See Barriers and CLREX on page C5-273. 

0b0100 See Instructions for accessing the PSTATE fields on page C5-274. 
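
As an illustration of the field layout and the CRn decode above, the following sketch splits a 32-bit instruction word into the System instruction class fields. The names `SysFields`, `decode_system_word`, and `op0_00_group` are hypothetical helpers for this example, not part of the architecture.

```python
from typing import NamedTuple, Optional


class SysFields(NamedTuple):
    L: int
    op0: int
    op1: int
    CRn: int
    CRm: int
    op2: int
    Rt: int


def decode_system_word(word: int) -> SysFields:
    """Split a 32-bit A64 word into the System instruction class fields."""
    if (word >> 22) & 0x3FF != 0b1101010100:
        raise ValueError("not in the System instruction class")
    return SysFields(
        L=(word >> 21) & 1,
        op0=(word >> 19) & 3,
        op1=(word >> 16) & 7,
        CRn=(word >> 12) & 0xF,
        CRm=(word >> 8) & 0xF,
        op2=(word >> 5) & 7,
        Rt=word & 0x1F,
    )


def op0_00_group(word: int) -> Optional[str]:
    """Name the op0==0b00 group selected by CRn, or None if not listed."""
    f = decode_system_word(word)
    if f.op0 != 0b00:
        return None
    return {
        0b0010: "architectural hint",
        0b0011: "barriers and CLREX",
        0b0100: "PSTATE access",
    }.get(f.CRn)
```

For example, the word 0xD503201F decodes to op0==0b00, op1==0b011, CRn==0b0010, Rt==0b11111, which places it in the architectural hint group.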


Architectural hint instructions 


Within the op0==0b00 encodings, the architectural hint instructions are identified by CRn having the value 0b0010. The 
encoding of these instructions is: 


 31       22   21   20 19   18 16   15 12   11       5   4   0
 1101010100    0     0 0     011     0010    op<6:0>    11111
               L     op0     op1     CRn    CRm   op2     Rt


The value of op<6:0>, formed by concatenating the CRm and op2 fields, determines the hint instruction as follows: 


0b0000000 NOP instruction. This has no effect on architectural state other than to advance the PC. 
0b0000001 YIELD instruction. 
0b0000010 WFE instruction. 
0b0000011 WFI instruction. 
0b0000100 SEV instruction. 
0b0000101 SEVL instruction. 
0b0000110-0b1111111 
Unallocated values. These encodings behave as NOPs. 





Note 
° Instruction encodings with bits[4:0] not set to 0b11111 are UNDEFINED. 
° The operation of the A64 instructions for architectural hints is identical to that of the corresponding A32 and T32 
instructions. 





For more information about: 
° The WFE, WFI, SEV, and SEVL instructions, see Mechanisms for entering a low-power state on page D1-1599. 
° The YIELD instruction, see Software control features and EL0 on page B1-64. 
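
The hint encoding can be sketched as follows. The helper names `encode_hint` and `hint_name` are assumptions made for this example; the constant 0xD503201F is the fixed part of the hint encoding with op<6:0> set to zero, which is also the NOP encoding.

```python
HINTS = {0: "NOP", 1: "YIELD", 2: "WFE", 3: "WFI", 4: "SEV", 5: "SEVL"}


def encode_hint(op: int) -> int:
    """Build a hint instruction word from the 7-bit op<6:0> value."""
    if not 0 <= op <= 0x7F:
        raise ValueError("op<6:0> is a 7-bit field")
    return 0xD503201F | (op << 5)   # op<6:0> occupies bits[11:5]


def hint_name(word: int):
    """Return the hint mnemonic, or None if word is not a hint encoding."""
    if word & 0xFFFFF01F != 0xD503201F:   # every bit outside bits[11:5]
        return None
    op = (word >> 5) & 0x7F
    # Unallocated op<6:0> values behave as NOPs.
    return HINTS.get(op, "NOP (unallocated hint)")
```

Because the mask includes bits[4:0], a word whose Rt field is not 0b11111 is rejected here, matching the Note above.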


Barriers and CLREX 


Within the op0==0b00 encodings, the barriers and CLREX instructions are identified by CRn having the value 0b0011. 
The encoding of these instructions is: 


 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    0     0 0     011     0011    CRm    op2   11111
               L     op0     op1     CRn                    Rt


The value of op2 determines the instruction, as follows. For the DSB and DMB instructions, CRm controls the 
instruction options. 
0b010 CLREX instruction. The value of CRm is ignored. 
0b100 DSB instruction. The value of CRm sets the option type, see Table C5-1. 
0b101 DMB instruction. The value of CRm sets the option type, see Table C5-1. 
0b110 ISB instruction. The value of CRm is ignored. 
0b000, 0b001, 0b011, 0b111 
UNDEFINED. 
Note 

Instruction encodings with bits[4:0] not set to 0b11111 are UNDEFINED. 





Table C5-1 shows the CRm encodings for the data barrier option types. 


Table C5-1 CRm encoding for DMB and DSB instructions 

CRm value                  Option, for DMB and DSB   Meaning 
0001                       OSHLD                     Outer Shareable, load 
0010                       OSHST                     Outer Shareable, store 
0011                       OSH                       Outer Shareable, all 
0101                       NSHLD                     Non-shareable, load 
0110                       NSHST                     Non-shareable, store 
0111                       NSH                       Non-shareable, all 
1001                       ISHLD                     Inner Shareable, load 
1010                       ISHST                     Inner Shareable, store 
1011                       ISH                       Inner Shareable, all 
1101                       LD                        Full system, load 
1110                       ST                        Full system, store 
0000, 0100, 1000, 1100     #<imm> (a)                Full system, all 
1111                       SY                        Full system, all 





a. #<imm> is a 4-bit unsigned immediate in the range 0-15, encoded in the CRm field. 


Note 


The operation of the A64 instructions for barriers and CLREX is identical to that of the corresponding A32 and T32 
instructions. 








For more information about: 
° The barrier instructions, see Memory barriers on page B2-87. 


° The CLREX instruction, see Synchronization and semaphores on page B2-108. 
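
A minimal sketch of the barrier and CLREX encodings, using the op2 values listed above and the CRm option values from Table C5-1. The function name `encode_barrier` is hypothetical. For CLREX and ISB the text states that CRm is ignored; the sketch fills it with 0b1111, which is an assumption about common assembler output rather than a requirement stated here.

```python
BARRIER_OPTIONS = {
    "OSHLD": 0b0001, "OSHST": 0b0010, "OSH": 0b0011,
    "NSHLD": 0b0101, "NSHST": 0b0110, "NSH": 0b0111,
    "ISHLD": 0b1001, "ISHST": 0b1010, "ISH": 0b1011,
    "LD": 0b1101, "ST": 0b1110, "SY": 0b1111,
}
BARRIER_OP2 = {"CLREX": 0b010, "DSB": 0b100, "DMB": 0b101, "ISB": 0b110}


def encode_barrier(mnemonic: str, option: str = "SY") -> int:
    """Encode CLREX, DSB, DMB or ISB as a 32-bit instruction word."""
    op2 = BARRIER_OP2[mnemonic]
    if mnemonic in ("DSB", "DMB"):
        crm = BARRIER_OPTIONS[option]   # CRm selects the option type
    else:
        crm = 0b1111                    # assumption: CRm is ignored, emit all ones
    # Fixed part: op0==0b00, op1==0b011, CRn==0b0011, Rt==0b11111.
    return 0xD503301F | (crm << 8) | (op2 << 5)
```

For example, `encode_barrier("DMB", "ISH")` places 0b1011 in CRm and 0b101 in op2.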


Instructions for accessing the PSTATE fields 


Within the op0==0b00 encodings, the instructions that can be used to modify PSTATE fields directly are identified 
by CRn having the value 0b0100. The encoding of these instructions is: 


 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    0     0 0     op1     0100    Imm4   op2   11111
               L     op0             CRn     CRm            Rt


These instructions are: 


MSR DAIFSet, #Imm4 ; Used to set any or all of DAIF to 1 
MSR DAIFClr, #Imm4 ; Used to clear any or all of DAIF to 0 
MSR SPSel, #Imm1 ; Used to select the Stack Pointer, between SP_EL0 and SP_ELx 


The value of op2 selects the instruction form, which defines the constraints on the values of the op1 and Imm4 
arguments, as follows: 
op2==0b101 Selects the MSR SPSel instruction. 

op1 must be 0b000. 

This instruction is accessible at EL1 or higher. 


Imm4<0> selects the accessed stack pointer, as follows: 
0 Selects SP_EL0. 

1 Selects SP_ELx on page K12-5664, where x is the number of the current Exception 
level, 1, 2, or 3. 


Imm4<3:1> are SBZ. 


op2==0b110 Selects the MSR DAIFSet instruction, that sets the specified PSTATE.{D, A, I, F} bits to 1. 
op1 must be 0b011. 

This instruction is accessible at EL1 or higher, and when the value of the SCTLR_EL1.UMA bit is 
1 it is also accessible at EL0. 

Imm4 determines which of the PSTATE.{D, A, I, F} bits are set to 1, as follows: 

Imm4<3> If this bit is set to 1 then the D bit is set to 1, otherwise the D bit is not changed. 
Imm4<2> If this bit is set to 1 then the A bit is set to 1, otherwise the A bit is not changed. 
Imm4<1> If this bit is set to 1 then the I bit is set to 1, otherwise the I bit is not changed. 
Imm4<0> If this bit is set to 1 then the F bit is set to 1, otherwise the F bit is not changed. 


op2==0b111 Selects the MSR DAIFClr instruction, that clears the specified PSTATE.{D, A, I, F} bits to 0. 
op1 must be 0b011. 

This instruction is accessible at EL1 or higher, and when the value of the SCTLR_EL1.UMA bit is 
1 it is also accessible at EL0. 

Imm4 determines which of the PSTATE.{D, A, I, F} bits are cleared to 0, as follows: 

Imm4<3> If this bit is set to 1 then the D bit is cleared to 0, otherwise the D bit is not changed. 
Imm4<2> If this bit is set to 1 then the A bit is cleared to 0, otherwise the A bit is not changed. 
Imm4<1> If this bit is set to 1 then the I bit is cleared to 0, otherwise the I bit is not changed. 
Imm4<0> If this bit is set to 1 then the F bit is cleared to 0, otherwise the F bit is not changed. 


All other combinations of op1 and op2 are reserved, and the corresponding instructions are UNDEFINED. 


Note 
For PSTATE updates, instruction encodings with bits[4:0] not set to 0b11111 are UNDEFINED. 








Writes to PSTATE.{D, A, I, F} occur in program order without the need for additional synchronization. Changing 
PSTATE.SPSel to use EL0 synchronizes any updates to SP_EL0 that have been written by an MSR to SP_EL0, 
without the need for additional synchronization. 
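
The three MSR (immediate) forms described in this subsection can be sketched as follows. The names `encode_msr_pstate` and `apply_daif` are hypothetical, and the model of PSTATE.{D, A, I, F} as a 4-bit value with D in bit 3 and F in bit 0 simply mirrors the Imm4 bit assignments given above.

```python
PSTATE_FORMS = {            # field name -> (op1, op2)
    "SPSel":   (0b000, 0b101),
    "DAIFSet": (0b011, 0b110),
    "DAIFClr": (0b011, 0b111),
}


def encode_msr_pstate(field: str, imm4: int) -> int:
    """Encode MSR <field>, #imm4 as a 32-bit instruction word."""
    op1, op2 = PSTATE_FORMS[field]
    if not 0 <= imm4 <= 0xF:
        raise ValueError("Imm4 is a 4-bit field")
    if field == "SPSel" and imm4 > 1:
        raise ValueError("Imm4<3:1> are SBZ for SPSel")
    # Fixed part: op0==0b00, CRn==0b0100, Rt==0b11111; Imm4 goes in CRm.
    return 0xD500401F | (op1 << 16) | (imm4 << 8) | (op2 << 5)


def apply_daif(daif: int, field: str, imm4: int) -> int:
    """Model the effect of DAIFSet/DAIFClr on a 4-bit {D, A, I, F} value."""
    if field == "DAIFSet":
        return (daif | imm4) & 0xF      # bits set in Imm4 become 1
    if field == "DAIFClr":
        return daif & ~imm4 & 0xF       # bits set in Imm4 become 0
    raise ValueError("only DAIFSet and DAIFClr modify DAIF")
```

For example, `MSR DAIFSet, #2` sets only the I bit and leaves D, A, and F unchanged.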


op0==0b01, cache maintenance, TLB maintenance, and address translation instructions 


The System instructions are encoded with op0==0b01. The different groups of System instructions are identified by 
the values of CRn and CRm, except that some of this encoding space is reserved for IMPLEMENTATION DEFINED 
functionality. The encoding of these instructions is: 


 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     0 1     op1     CRn     CRm    op2     Rt
                     op0


The grouping of these instructions depending on the CRn and CRm fields is as follows: 
CRn==7 The instruction group is determined by the value of CRm, as follows: 
CRm=={1, 5} Instruction cache maintenance instructions. 
CRm==4 Data cache zero operation. 
CRm=={6, 10, 11, 14} Data cache maintenance instructions. 
See Cache maintenance instructions, and data cache zero on page C5-276. 
CRm==8 See Address translation instructions on page C5-277. 
CRn==8 See TLB maintenance instructions on page C5-278. 
CRn=={11, 15} See Reserved encoding space for IMPLEMENTATION DEFINED instructions on page C5-279. 
Cache maintenance instructions, and data cache zero 


Table C5-2 lists the Cache maintenance instructions and their encodings. Instructions that take an argument include 
Xt in the instruction syntax. For instructions that do not take an argument, the Xt field is encoded as 0b11111. 


Table C5-2 Cache maintenance instructions 

                 Access instruction encoding 
Instruction      op0  op1  CRn  CRm  op2   Notes 

Instruction cache maintenance instructions 
IC IALLUIS       1    0    7    1    0     Accessible from EL1 or higher. 
IC IALLU         1    0    7    5    0     Accessible from EL1 or higher. 
IC IVAU, Xt      1    3    7    5    1     When SCTLR_EL1.UCI == 1, accessible from EL0 or higher. 
                                           Otherwise, accessible from EL1 or higher. 

Data cache maintenance instructions 
DC IVAC, Xt      1    0    7    6    1     Accessible from EL1 or higher. 
DC ISW, Xt       1    0    7    6    2     Accessible from EL1 or higher. 
DC CSW, Xt       1    0    7    10   2     Accessible from EL1 or higher. 
DC CISW, Xt      1    0    7    14   2     Accessible from EL1 or higher. 
DC CVAC, Xt      1    3    7    10   1     When SCTLR_EL1.UCI == 1, accessible from EL0 or higher. 
                                           Otherwise, accessible from EL1 or higher. 
DC CVAU, Xt      1    3    7    11   1     As for DC CVAC. 
DC CIVAC, Xt     1    3    7    14   1     As for DC CVAC. 

Data cache zero operation 
DC ZVA, Xt       1    3    7    4    1     When SCTLR_EL1.DZE == 1, accessible from EL0 or higher. 
                                           Otherwise, accessible from EL1 or higher. 





For more information about these instructions, see About cache maintenance in ARMv8 on page D3-1699 and 
Cache maintenance instructions on page D3-1703. 
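
As a sketch of how the {op1, CRn, CRm, op2} values for cache maintenance map onto an instruction word, the following helper builds the op0==0b01, L==0 encoding. The name `encode_sys` and the small `CACHE_OPS` excerpt are assumptions for this example; only a few operations are included.

```python
def encode_sys(op1: int, crn: int, crm: int, op2: int, rt: int = 31) -> int:
    """Encode a System instruction with op0==0b01 and L==0.

    rt defaults to 31 (0b11111), the value used when the instruction
    takes no Xt argument.
    """
    for value, limit, name in ((op1, 7, "op1"), (crn, 15, "CRn"),
                               (crm, 15, "CRm"), (op2, 7, "op2"), (rt, 31, "Rt")):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range")
    # 0xD5080000 is bits[31:22]==0b1101010100 with L==0 and op0==0b01.
    return 0xD5080000 | (op1 << 16) | (crn << 12) | (crm << 8) | (op2 << 5) | rt


# A few rows of the cache maintenance table: name -> (op1, CRn, CRm, op2).
CACHE_OPS = {
    "IC IALLU": (0, 7, 5, 0),
    "IC IVAU":  (3, 7, 5, 1),
    "DC CVAC":  (3, 7, 10, 1),
    "DC ZVA":   (3, 7, 4, 1),
}
```

For example, `encode_sys(*CACHE_OPS["DC CVAC"], rt=0)` gives the word for `DC CVAC, X0`.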
C5.1 The System instruction class encoding space 


Address translation instructions 


Table C5-3 lists the Address translation instructions and their encodings. The syntax of the instructions includes Xt, 
that provides the address to be translated. 


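Table C5-3 Address translation instructions 

                 Access instruction encoding 
Instruction      op0  op1  CRn  CRm  op2   Notes 
AT S1E1R, Xt     1    0    7    8    0     Accessible from EL1 or higher. 
AT S1E1W, Xt     1    0    7    8    1     Accessible from EL1 or higher. 
AT S1E0R, Xt     1    0    7    8    2     Accessible from EL1 or higher. 
AT S1E0W, Xt     1    0    7    8    3     Accessible from EL1 or higher. 
AT S1E2R, Xt     1    4    7    8    0     Accessible from EL2 or higher. 
AT S1E2W, Xt     1    4    7    8    1     Accessible from EL2 or higher. 
AT S12E1R, Xt    1    4    7    8    4     Accessible from EL2 or higher. 
AT S12E1W, Xt    1    4    7    8    5     Accessible from EL2 or higher. 
AT S12E0R, Xt    1    4    7    8    6     Accessible from EL2 or higher. 
AT S12E0W, Xt    1    4    7    8    7     Accessible from EL2 or higher. 
AT S1E3R, Xt     1    6    7    8    0     Accessible only from EL3. 
AT S1E3W, Xt     1    6    7    8    1     Accessible only from EL3. 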





For more information about these instructions, see Address translation instructions on page D4-1771. 
TLB maintenance instructions 


Table C5-4 lists the TLB maintenance instructions and their encodings. Instructions that take an argument include 
Xt in the instruction syntax. For instructions that do not take an argument, the Xt field is encoded as 0b11111. 


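Table C5-4 TLB maintenance instructions 

                       Access instruction encoding 
Instruction            op0  op1  CRn  CRm  op2   Notes 
TLBI VMALLE1IS         1    0    8    3    0     Accessible from EL1 or higher. 
TLBI VAE1IS, Xt        1    0    8    3    1     Accessible from EL1 or higher. 
TLBI ASIDE1IS, Xt      1    0    8    3    2     Accessible from EL1 or higher. 
TLBI VAAE1IS, Xt       1    0    8    3    3     Accessible from EL1 or higher. 
TLBI VALE1IS, Xt       1    0    8    3    5     Accessible from EL1 or higher. 
TLBI VAALE1IS, Xt      1    0    8    3    7     Accessible from EL1 or higher. 
TLBI VMALLE1           1    0    8    7    0     Accessible from EL1 or higher. 
TLBI VAE1, Xt          1    0    8    7    1     Accessible from EL1 or higher. 
TLBI ASIDE1, Xt        1    0    8    7    2     Accessible from EL1 or higher. 
TLBI VAAE1, Xt         1    0    8    7    3     Accessible from EL1 or higher. 
TLBI VALE1, Xt         1    0    8    7    5     Accessible from EL1 or higher. 
TLBI VAALE1, Xt        1    0    8    7    7     Accessible from EL1 or higher. 
TLBI IPAS2E1IS, Xt     1    4    8    0    1     Accessible from EL2 or higher. 
TLBI IPAS2LE1IS, Xt    1    4    8    0    5     Accessible from EL2 or higher. 
TLBI ALLE2IS           1    4    8    3    0     Accessible from EL2 or higher. 
TLBI VAE2IS, Xt        1    4    8    3    1     Accessible from EL2 or higher. 
TLBI ALLE1IS           1    4    8    3    4     Accessible from EL2 or higher. 
TLBI VALE2IS, Xt       1    4    8    3    5     Accessible from EL2 or higher. 
TLBI VMALLS12E1IS      1    4    8    3    6     Accessible from EL2 or higher. 
TLBI IPAS2E1, Xt       1    4    8    4    1     Accessible from EL2 or higher. 
TLBI IPAS2LE1, Xt      1    4    8    4    5     Accessible from EL2 or higher. 
TLBI ALLE2             1    4    8    7    0     Accessible from EL2 or higher. 
TLBI VAE2, Xt          1    4    8    7    1     Accessible from EL2 or higher. 
TLBI ALLE1             1    4    8    7    4     Accessible from EL2 or higher. 
TLBI VALE2, Xt         1    4    8    7    5     Accessible from EL2 or higher. 
TLBI VMALLS12E1        1    4    8    7    6     Accessible from EL2 or higher. 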

C5-278 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 


Non-Confidential 


1ID092916 


C5.1.5 


C5 The A64 System Instruction Class 


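Table C5-4 TLB maintenance instructions (continued) 

                       Access instruction encoding 
Instruction            op0  op1  CRn  CRm  op2   Notes 
TLBI ALLE3IS           1    6    8    3    0     Accessible only from EL3. 
TLBI VAE3IS, Xt        1    6    8    3    1     Accessible only from EL3. 
TLBI VALE3IS, Xt       1    6    8    3    5     Accessible only from EL3. 
TLBI ALLE3             1    6    8    7    0     Accessible only from EL3. 
TLBI VAE3, Xt          1    6    8    7    1     Accessible only from EL3. 
TLBI VALE3, Xt         1    6    8    7    5     Accessible only from EL3. 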





For more information about these instructions, see TLB maintenance instructions on page D4-1817. 
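
The TLB maintenance encodings follow a visible pattern: op1 tracks the lowest Exception level that can execute the instruction, and CRm distinguishes the Inner Shareable (IS) forms. The sketch below encodes that observation. It is a reading of the table rows, not a rule stated by the architecture, and the name `tlbi_properties` is hypothetical.

```python
def tlbi_properties(op1: int, crm: int):
    """Return (lowest Exception level, is_inner_shareable) for a TLBI encoding.

    Based only on the pattern in the TLB maintenance table:
    op1 0/4/6 correspond to EL1/EL2/EL3, and CRm 0 or 3 is used by the
    IS forms while CRm 4 or 7 is used by the non-IS forms.
    """
    min_el = {0: 1, 4: 2, 6: 3}
    if op1 not in min_el:
        raise ValueError("op1 value not used by the TLBI table")
    if crm not in (0, 3, 4, 7):
        raise ValueError("CRm value not used by the TLBI table")
    return min_el[op1], crm in (0, 3)
```

For example, TLBI VMALLE1IS uses op1==0 and CRm==3, so the sketch reports EL1 and an Inner Shareable operation.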


Reserved encoding space for IMPLEMENTATION DEFINED instructions 


The A64 instruction set reserves the following encoding space for IMPLEMENTATION DEFINED instructions: 


 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     0 1     op1     1x11    CRm    op2     Rt
                     op0     IMPLEMENTATION  IMPLEMENTATION
                             DEFINED (op1)   DEFINED (CRm, op2)


The value of L defines the use of Rt as follows: 
0 Rt is an argument supplied to the instruction. 


1 Rt is a result returned by the instruction. 


IMPLEMENTATION DEFINED instructions in this encoding space are accessed using the SYS and SYSL instructions, see 
SYS on page C6-742 and SYSL on page C6-743. 


See also Reserved encodings for IMPLEMENTATION DEFINED registers on page C5-291. 
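
A minimal sketch of recognizing this reserved space and the role of Rt, assuming the field positions shown in the encoding diagram. The function name `impdef_sys_rt_role` is hypothetical.

```python
def impdef_sys_rt_role(word: int):
    """Classify a word in the IMPLEMENTATION DEFINED instruction space.

    Returns "argument" (L==0, SYS form) or "result" (L==1, SYSL form),
    or None if the word is outside the reserved space.
    """
    if (word >> 22) & 0x3FF != 0b1101010100:
        return None                       # not the System instruction class
    if (word >> 19) & 0b11 != 0b01:
        return None                       # op0 must be 0b01
    if (word >> 12) & 0xF not in (11, 15):
        return None                       # CRn must match 0b1x11
    return "result" if (word >> 21) & 1 else "argument"
```

For example, a word with op0==0b01, CRn==15 and L==0 passes Rt to the instruction as an argument.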


op0==0b10, Moves to and from debug and trace System registers 


The instructions that move data to and from the debug, Execution environment, and trace System registers are 
encoded with op0==0b10. This means the encoding of these instructions is: 


 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     1 0     op1     CRn     CRm    op2     Rt
                     op0


Note 
These encodings access the registers that are equivalent to the AArch32 System registers in the (coproc==0b1110) 
encoding space. 








The value of op1 provides the next level of decode of these instructions, as follows: 


op1 == {0, 3, 4} 
Debug. See Instructions for accessing debug System registers on page C5-280 
Note 


The standard encoding of debug registers is op0==0b10, op1=={0, 3, 4}. The following sections 
describe registers in the op0==0b11 encoding space that are classified as debug registers, see 
Instructions for accessing non-debug System registers on page C5-281: 


° DLR_EL0, Debug Link Register on page D7-2177. 
° DSPSR_EL0, Debug Saved Program Status Register on page D7-2178. 
° MDCR_EL2, Monitor Debug Configuration Register (EL2) on page D7-2187. 
° MDCR_EL3, Monitor Debug Configuration Register (EL3) on page D7-2191. 
° SDER32_EL3, AArch32 Secure Debug Enable Register on page D7-2213. 





op1 == 1 Trace. See the appropriate trace architecture specification. 


Instructions for accessing debug System registers 
The instructions for accessing debug System registers are: 


MSR <System_register>, Xt ; Write to System register 
MRS Xt, <System_register> ; Read from System register 


Where <System_register> is the register name, for example MDCCSR_EL0. 


This section includes only the System register access encodings for which both: 
° op0 is 0b10. 
° The value of op1 is one of {0, 3, 4}. 





Note 


These encodings access the registers that are equivalent to the AArch32 System registers in the (coproc==0b1110) 
encoding space. 





Table C5-5 shows the mapping of the System register encodings for debug System register access. 


Table C5-5 System instruction encodings for debug System register access 

                      Access instruction encoding 
Register              op0  op1  CRn  CRm        op2   Access 
OSDTRRX_EL1           2    0    0    0          2     RW 
MDCCINT_EL1           2    0    0    2          0     RW 
MDSCR_EL1             2    0    0    2          2     RW 
OSDTRTX_EL1           2    0    0    3          2     RW 
OSECCR_EL1            2    0    0    6          2     RW 
DBGBVR<n>_EL1         2    0    0    0-15 (a)   4     RW 
DBGBCR<n>_EL1         2    0    0    0-15 (a)   5     RW 
DBGWVR<n>_EL1         2    0    0    0-15 (a)   6     RW 
DBGWCR<n>_EL1         2    0    0    0-15 (a)   7     RW 
MDRAR_EL1             2    0    1    0          0     RO 
OSLAR_EL1             2    0    1    0          4     WO 
OSLSR_EL1             2    0    1    1          4     RO 
OSDLR_EL1             2    0    1    3          4     RW 
DBGPRCR_EL1           2    0    1    4          4     RW 
DBGCLAIMSET_EL1       2    0    7    8          6     RW 
DBGCLAIMCLR_EL1       2    0    7    9          6     RW 
DBGAUTHSTATUS_EL1     2    0    7    14         6     RO 
MDCCSR_EL0            2    3    0    1          0     RO 
DBGDTR_EL0            2    3    0    4          0     RW 
DBGDTRRX_EL0          2    3    0    5          0     RO 
DBGDTRTX_EL0          2    3    0    5          0     WO 
DBGVCR32_EL2          2    4    0    7          0     RW 





a. Unimplemented breakpoint and watchpoint register access instructions are unallocated. 
CRm encodes <n>, the breakpoint or watchpoint number. 


For more information see Mapping of the System registers between the Execution states on page D1-1610. 
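
The MRS and MSR encodings for the debug registers can be sketched from the {op0, op1, CRn, CRm, op2} values in Table C5-5. The helper names `encode_sysreg_access` and `encode_debug_bw_access` are hypothetical; the second shows how CRm carries <n> for the breakpoint and watchpoint registers.

```python
def encode_sysreg_access(read: bool, op0: int, op1: int, crn: int,
                         crm: int, op2: int, rt: int) -> int:
    """Encode MRS (read=True, L==1) or MSR (read=False, L==0)."""
    for value, limit, name in ((op0, 3, "op0"), (op1, 7, "op1"),
                               (crn, 15, "CRn"), (crm, 15, "CRm"),
                               (op2, 7, "op2"), (rt, 31, "Rt")):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range")
    return ((0b1101010100 << 22) | (int(read) << 21) | (op0 << 19)
            | (op1 << 16) | (crn << 12) | (crm << 8) | (op2 << 5) | rt)


# op2 values for the breakpoint and watchpoint register groups.
DEBUG_BW_OP2 = {"DBGBVR": 4, "DBGBCR": 5, "DBGWVR": 6, "DBGWCR": 7}


def encode_debug_bw_access(read: bool, reg: str, n: int, rt: int) -> int:
    """Encode an access to <reg><n>_EL1, where CRm encodes <n> (0-15)."""
    return encode_sysreg_access(read, 2, 0, 0, n, DEBUG_BW_OP2[reg], rt)
```

For example, `encode_sysreg_access(True, 2, 0, 0, 2, 2, 0)` corresponds to `MRS X0, MDSCR_EL1`. Whether a given breakpoint or watchpoint register is implemented is not modelled here.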





C5.1.6 op0==0b11, Moves to and from non-debug System registers and Special-purpose registers 
The instructions that move data to and from non-debug System registers are encoded with op0==0b11, except that 
some of this encoding space is reserved for IMPLEMENTATION DEFINED functionality. The encoding of these 
instructions is: 

 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     1 1     op1     CRn     CRm    op2     Rt
                     op0
The value of CRn provides the next level of decode of these instructions, as follows: 
CRn=={0, 1, 2, 3, 5, 6, 7, 9, 10, 12, 13, 14} 
See Instructions for accessing non-debug System registers. 
CRn==4 See Instructions for accessing Special-purpose registers on page C5-290. 
CRn=={11, 15} See Reserved encodings for IMPLEMENTATION DEFINED registers on page C5-291. 
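
The CRn decode for op0==0b11 can be sketched as a small classifier. The function name `classify_op0_11` is hypothetical, and a CRn value that appears in none of the three lists above returns None rather than guessing at its treatment.

```python
NON_DEBUG_CRN = {0, 1, 2, 3, 5, 6, 7, 9, 10, 12, 13, 14}


def classify_op0_11(word: int):
    """Name the op0==0b11 group selected by the CRn field of word."""
    if (word >> 22) & 0x3FF != 0b1101010100 or (word >> 19) & 0b11 != 0b11:
        raise ValueError("not an op0==0b11 System register access")
    crn = (word >> 12) & 0xF
    if crn in NON_DEBUG_CRN:
        return "non-debug System register"
    if crn == 4:
        return "Special-purpose register"
    if crn in (11, 15):
        return "IMPLEMENTATION DEFINED register"
    return None   # CRn value not listed in the decode above
```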
Instructions for accessing non-debug System registers 
The A64 instructions for accessing System registers are: 
MSR <System_register>, Xt ; Write to System register 
MRS Xt, <System_register> ; Read from System register 
Where <System_register> is the register name, for example MIDR_EL1. 
This section includes only the System register access encodings for which both: 
° op0 is 0b11. 
° The value of CRn is one of {0, 1, 2, 3, 5, 6, 7, 9, 10, 12, 13, 14}. 





Note 
° These encodings access the registers that are equivalent to the AArch32 System registers in the 
(coproc==0b1111) encoding space. 

° While this group is described as accessing the non-debug System registers, its correct characterization is by 
the {op0, CRn} values given in this subsection, and the group includes the debug registers described in the 
following sections: 

—  DLR_EL0, Debug Link Register on page D7-2177. 
—  DSPSR_EL0, Debug Saved Program Status Register on page D7-2178. 
—  MDCR_EL2, Monitor Debug Configuration Register (EL2) on page D7-2187. 
—  MDCR_EL3, Monitor Debug Configuration Register (EL3) on page D7-2191. 
—  SDER32_EL3, AArch32 Secure Debug Enable Register on page D7-2213. 

These registers are exceptions to the standard encoding of debug registers, that has op0==0b10, see 
Instructions for accessing debug System registers on page C5-280. 





The instruction encoding for these accesses is: 


 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     1 1     op1     CRn     CRm    op2     Rt
                     op0

See text for permitted values of CRn 

Table C5-6 shows the encodings of the register access instructions. In the Notes column of the table: 
Config-RO Means it is configurable whether read accesses are permitted. Write accesses are UNDEFINED. 
Config-WO Means it is configurable whether write accesses are permitted. Read accesses are UNDEFINED. 


Config-RW Means it is configurable whether accesses are permitted. Either read and write accesses are 
permitted, or read and write accesses are UNDEFINED. 


See the register descriptions for information about the control that determines whether these accesses are permitted. 


Table C5-6 System instruction encodings for non-Debug System register accesses 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
MIDR_EL1              32     3    0    0    0        0       RO. 
MPIDR_EL1             64     3    0    0    0        5       RO. 
REVIDR_EL1            32     3    0    0    0        6       RO. 
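Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
ID_PFR0_EL1           32     3    0    0    1        0       RO, but UNKNOWN if AArch32 is not implemented. 
ID_PFR1_EL1           32     3    0    0    1        1       RO, but UNKNOWN if AArch32 is not implemented. 
ID_DFR0_EL1           32     3    0    0    1        2       RO, but UNKNOWN if AArch32 is not implemented. 
ID_AFR0_EL1           32     3    0    0    1        3       RO, but UNKNOWN if AArch32 is not implemented. 
ID_MMFR0_EL1          32     3    0    0    1        4       RO, but UNKNOWN if AArch32 is not implemented. 
ID_MMFR1_EL1          32     3    0    0    1        5       RO, but UNKNOWN if AArch32 is not implemented. 
ID_MMFR2_EL1          32     3    0    0    1        6       RO, but UNKNOWN if AArch32 is not implemented. 
ID_MMFR3_EL1          32     3    0    0    1        7       RO, but UNKNOWN if AArch32 is not implemented. 
ID_ISAR0_EL1          32     3    0    0    2        0       RO, but UNKNOWN if AArch32 is not implemented. 
ID_ISAR1_EL1          32     3    0    0    2        1       RO, but UNKNOWN if AArch32 is not implemented. 
ID_ISAR2_EL1          32     3    0    0    2        2       RO, but UNKNOWN if AArch32 is not implemented. 
ID_ISAR3_EL1          32     3    0    0    2        3       RO, but UNKNOWN if AArch32 is not implemented. 
ID_ISAR4_EL1          32     3    0    0    2        4       RO, but UNKNOWN if AArch32 is not implemented. 
ID_ISAR5_EL1          32     3    0    0    2        5       RO, but UNKNOWN if AArch32 is not implemented. 
ID_MMFR4_EL1          32     3    0    0    2        6       RO, but UNKNOWN if AArch32 is not implemented. 
Reserved, RAZ         -      3    0    0    2        7       RO. 
MVFR0_EL1             32     3    0    0    3        0       RO, but UNKNOWN if AArch32 is not implemented. 
MVFR1_EL1             32     3    0    0    3        1       RO, but UNKNOWN if AArch32 is not implemented. 
MVFR2_EL1             32     3    0    0    3        2       RO, but UNKNOWN if AArch32 is not implemented. 
Reserved, RAZ         -      3    0    0    3        n       RO, for n=3-7. 
ID_AA64PFR0_EL1       64     3    0    0    4        0       RO. 
ID_AA64PFR1_EL1       64     3    0    0    4        1       RO. 
Reserved, RAZ         -      3    0    0    4        n       RO, for n=2-7. 
ID_AA64DFR0_EL1       64     3    0    0    5        0       RO. 
ID_AA64DFR1_EL1       64     3    0    0    5        1       RO. 
ID_AA64AFR0_EL1       64     3    0    0    5        4       RO. 
ID_AA64AFR1_EL1       64     3    0    0    5        5       RO. 
Reserved, RAZ         -      3    0    0    5        n       RO, for n={2, 3, 6, 7}. 
ID_AA64ISAR0_EL1      64     3    0    0    6        0       RO. 
ID_AA64ISAR1_EL1      64     3    0    0    6        1       RO. 
Reserved, RAZ         -      3    0    0    6        n       RO, for n=2-7. 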
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
ID_AA64MMFR0_EL1      64     3    0    0    7        0       RO. 
ID_AA64MMFR1_EL1      64     3    0    0    7        1       RO. 
Reserved, RAZ         -      3    0    0    7        n       RO, for n=2-7. 
SCTLR_EL1             32     3    0    1    0        0       RW. 
ACTLR_EL1             64     3    0    1    0        1       RW, contents IMPLEMENTATION DEFINED. 
CPACR_EL1             32     3    0    1    0        2       RW. 
TTBR0_EL1             64     3    0    2    0        0       RW. 
TTBR1_EL1             64     3    0    2    0        1       RW. 
TCR_EL1               64     3    0    2    0        2       RW. 
ICC_PMR_EL1           32     3    0    4    6        0       RW.a 
ICV_PMR_EL1 
AFSR0_EL1             32     3    0    5    1        0       RW, contents IMPLEMENTATION DEFINED. 
AFSR1_EL1             32     3    0    5    1        1       RW, contents IMPLEMENTATION DEFINED. 
ESR_EL1               32     3    0    5    2        0       RW. 
FAR_EL1               64     3    0    6    0        0       RW. 
PAR_EL1               64     3    0    7    4        0       RW. 
PMINTENSET_EL1        32     3    0    9    14       1       RW.b 
PMINTENCLR_EL1        32     3    0    9    14       2       RW.b 
MAIR_EL1              64     3    0    10   2        0       RW. 
AMAIR_EL1             64     3    0    10   3        0       RW, contents IMPLEMENTATION DEFINED. 
VBAR_EL1              64     3    0    12   0        0       RW. 
RVBAR_EL1             64     3    0    12   0        1       RO. Implemented only if EL2 and EL3 are 
                                                             not implemented. 
RMR_EL1               64     3    0    12   0        2       RW. Implemented only when both EL1 is the 
                                                             highest implemented Exception level and 
                                                             EL1 can use AArch32 and AArch64. 
                                                             Otherwise, it is IMPLEMENTATION DEFINED 
                                                             whether the register is implemented. 
ISR_EL1               32     3    0    12   1        0       RO. 
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
ICC_IAR0_EL1          32     3    0    12   8        0       RO.a 
ICV_IAR0_EL1 
ICC_EOIR0_EL1         32     3    0    12   8        1       WO.a 
ICV_EOIR0_EL1 
ICC_HPPIR0_EL1        32     3    0    12   8        2       RO.a 
ICV_HPPIR0_EL1 
ICC_BPR0_EL1          32     3    0    12   8        3       RW.a 
ICV_BPR0_EL1 
ICC_AP0R<n>_EL1       32     3    0    12   8        4-7     RW, <n> = op2-4.a 
ICV_AP0R<n>_EL1 
ICC_AP1R<n>_EL1       32     3    0    12   9        0-3     RW, <n> = op2.a 
ICV_AP1R<n>_EL1 
ICC_DIR_EL1           32     3    0    12   11       1       WO.a 
ICV_DIR_EL1 
ICC_RPR_EL1           32     3    0    12   11       3       RO.a 
ICV_RPR_EL1 
ICC_SGI1R_EL1         64     3    0    12   11       5       WO.a 
ICC_ASGI1R_EL1        64     3    0    12   11       6       WO.a 
ICC_SGI0R_EL1         64     3    0    12   11       7       WO.a 
ICC_IAR1_EL1          32     3    0    12   12       0       RO.a 
ICV_IAR1_EL1 
ICC_EOIR1_EL1         32     3    0    12   12       1       WO.a 
ICV_EOIR1_EL1 
ICC_HPPIR1_EL1        32     3    0    12   12       2       RO.a 
ICV_HPPIR1_EL1 
ICC_BPR1_EL1          32     3    0    12   12       3       RW.a 
ICV_BPR1_EL1 
ICC_CTLR_EL1          32     3    0    12   12       4       RW.a 
ICV_CTLR_EL1 
ICC_SRE_EL1           32     3    0    12   12       5       RW.a 
ICC_IGRPEN0_EL1       32     3    0    12   12       6       RW.a 
ICC_IGRPEN1_EL1       32     3    0    12   12       7       RW.a 
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
ICC_IGRPEN0_EL1       32     3    0    12   12       6       RW.a 
ICV_IGRPEN0_EL1 
ICC_IGRPEN1_EL1       32     3    0    12   12       7       RW.a 
ICV_IGRPEN1_EL1 
CONTEXTIDR_EL1        32     3    0    13   0        1       RW. 
TPIDR_EL1             64     3    0    13   0        4       RW. 
CNTKCTL_EL1           32     3    0    14   1        0       RW.c 
CCSIDR_EL1            32     3    1    0    0        0       RO. 
CLIDR_EL1             64     3    1    0    0        1       RO. 
AIDR_EL1              32     3    1    0    0        7       RO. 
CSSELR_EL1            32     3    2    0    0        0       RW. 
CTR_EL0               32     3    3    0    0        1       Config-RO at EL0, otherwise RO. 
DCZID_EL0             32     3    3    0    0        7       RO. 
PMCR_EL0              32     3    3    9    12       0       Config-RW at EL0, otherwise RW.b 
PMCNTENSET_EL0        32     3    3    9    12       1       Config-RW at EL0, otherwise RW.b 
PMCNTENCLR_EL0        32     3    3    9    12       2       Config-RW at EL0, otherwise RW.b 
PMOVSCLR_EL0          32     3    3    9    12       3       Config-RW at EL0, otherwise RW.b 
PMSWINC_EL0           32     3    3    9    12       4       Config-WO at EL0, otherwise WO. 
PMSELR_EL0            32     3    3    9    12       5       Config-RW at EL0, otherwise RW.b 
PMCEID0_EL0           32     3    3    9    12       6       Config-RO at EL0, otherwise RO.b 
PMCEID1_EL0           32     3    3    9    12       7       Config-RO at EL0, otherwise RO.b 
PMCCNTR_EL0           64     3    3    9    13       0       Config-RW at EL0, otherwise RW. 
PMXEVTYPER_EL0        32     3    3    9    13       1       Config-RW at EL0, otherwise RW. 
PMXEVCNTR_EL0         32     3    3    9    13       2       Config-RW at EL0, otherwise RW. 
PMUSERENR_EL0         32     3    3    9    14       0       RO at EL0, RW at other Exception levels.b 
PMOVSSET_EL0          32     3    3    9    14       3       Config-RW at EL0, otherwise RW.b 
TPIDR_EL0             64     3    3    13   0        2       RW. 
TPIDRRO_EL0           64     3    3    13   0        3       RW. 
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
CNTFRQ_EL0            32     3    3    14   0        0       RO at EL1, RW at the highest Exception 
                                                             level implemented. Config-RO at EL0.c 
CNTPCT_EL0            64     3    3    14   0        1       Config-RO at EL0, otherwise RO.c 
CNTVCT_EL0            64     3    3    14   0        2       Config-RO at EL0, otherwise RO.c 
CNTP_TVAL_EL0         32     3    3    14   2        0       Config-RW at EL0 and Non-secure EL1, 
                                                             otherwise RW.c 
CNTP_CTL_EL0          32     3    3    14   2        1       As for CNTP_TVAL_EL0. 
CNTP_CVAL_EL0         64     3    3    14   2        2       As for CNTP_TVAL_EL0. 
CNTV_TVAL_EL0         32     3    3    14   3        0       Config-RW at EL0, otherwise RW.c 
CNTV_CTL_EL0          32     3    3    14   3        1       Config-RW at EL0, otherwise RW.c 
CNTV_CVAL_EL0         64     3    3    14   3        2       Config-RW at EL0, otherwise RW.c 
PMEVCNTR<n>_EL0       32     3    3    14   {8-10}   {0-7}   RW.b CRm and op2 encode <n>, the counter 
                                            11       {0-6}   number: 
PMEVTYPER<n>_EL0      32     3    3    14   {12-14}  {0-7}   ° For CRm=={8, 12}, <n>=op2. 
                                            15       {0-6}   ° For CRm=={9, 13}, <n>=op2+8. 
                                                             ° For CRm=={10, 14}, <n>=op2+16. 
                                                             ° For CRm=={11, 15}, <n>=op2+24. 
                                                             Config-RW at EL0, otherwise RW. 
PMCCFILTR_EL0         32     3    3    14   15       7       Config-RW at EL0, otherwise RW. 
VPIDR_EL2             32     3    4    0    0        0       RW. 
VMPIDR_EL2            64     3    4    0    0        5       RW. 
SCTLR_EL2             32     3    4    1    0        0       RW. 
ACTLR_EL2             64     3    4    1    0        1       RW, contents IMPLEMENTATION DEFINED. 
HCR_EL2               64     3    4    1    1        0       RW. 
MDCR_EL2              32     3    4    1    1        1       RW.d 
CPTR_EL2              32     3    4    1    1        2       RW. 
HSTR_EL2              32     3    4    1    1        3       RW. 
HACR_EL2              32     3    4    1    1        7       RW, contents IMPLEMENTATION DEFINED. 
TTBR0_EL2             64     3    4    2    0        0       RW. 
TCR_EL2               32     3    4    2    0        2       RW. 
VTTBR_EL2             64     3    4    2    1        0       RW. 
VTCR_EL2              32     3    4    2    1        2       RW. 
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
DACR32_EL2            32     3    4    3    0        0       RW. If EL1 cannot use AArch32, the register 
                                                             is UNDEFINED.e 
IFSR32_EL2            32     3    4    5    0        1       RW. If EL1 cannot use AArch32, the register 
                                                             is UNDEFINED.e 
AFSR0_EL2             32     3    4    5    1        0       RW, contents IMPLEMENTATION DEFINED. 
AFSR1_EL2             32     3    4    5    1        1       RW, contents IMPLEMENTATION DEFINED. 
ESR_EL2               32     3    4    5    2        0       RW. 
FPEXC32_EL2           32     3    4    5    3        0       RW. If EL1 cannot use AArch32, the register 
                                                             is UNDEFINED.e 
FAR_EL2               64     3    4    6    0        0       RW. 
HPFAR_EL2             64     3    4    6    0        4       RW. 
MAIR_EL2              64     3    4    10   2        0       RW. 
AMAIR_EL2             64     3    4    10   3        0       RW, contents IMPLEMENTATION DEFINED. 
VBAR_EL2              64     3    4    12   0        0       RW. 
RVBAR_EL2             64     3    4    12   0        1       RO. Implemented only if EL3 is not 
                                                             implemented. 
RMR_EL2               64     3    4    12   0        2       RW. Implemented only when both EL2 is the 
                                                             highest implemented Exception level and 
                                                             EL2 can use AArch32 and AArch64. 
                                                             Otherwise, it is IMPLEMENTATION DEFINED 
                                                             whether the register is implemented. 
ICH_AP0R<n>_EL2       32     3    4    12   8        0-3     RW, <n>=op2.a 
ICH_AP1R<n>_EL2       32     3    4    12   9        0-3     RW, <n>=op2.a 
ICC_SRE_EL2           32     3    4    12   9        5       RW.a 
ICH_HCR_EL2           32     3    4    12   11       0       RW.a 
ICH_VTR_EL2           32     3    4    12   11       1       RO.a 
ICH_MISR_EL2          32     3    4    12   11       2       RO.a 
ICH_EISR_EL2          32     3    4    12   11       3       RO.a 
ICH_ELRSR_EL2         32     3    4    12   11       5       RO.a 
ICH_VMCR_EL2          32     3    4    12   11       7       RW.a 
ICH_LR<n>_EL2         64     3    4    12   12, 13   0-7     RW.a 
                                                             ° For CRm==12, <n>=op2. 
                                                             ° For CRm==13, <n>=op2+8. 
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
TPIDR_EL2             64     3    4    13   0        2       RW. 
CNTVOFF_EL2           64     3    4    14   0        3       RW.c 
CNTHCTL_EL2           32     3    4    14   1        0       RW.c 
CNTHP_TVAL_EL2        32     3    4    14   2        0       RW.c 
CNTHP_CTL_EL2         32     3    4    14   2        1       RW.c 
CNTHP_CVAL_EL2        64     3    4    14   2        2       RW.c 
SCTLR_EL3             32     3    6    1    0        0       RW. 
ACTLR_EL3[63:0]       64     3    6    1    0        1       RW, contents IMPLEMENTATION DEFINED. 
SCR_EL3               32     3    6    1    1        0       RW. 
SDER32_EL3            32     3    6    1    1        1       RW. If EL1 cannot use AArch32, the register 
                                                             is UNDEFINED.d, e 
CPTR_EL3              32     3    6    1    1        2       RW. 
MDCR_EL3              32     3    6    1    3        1       RW.d 
TTBR0_EL3             64     3    6    2    0        0       RW. 
TCR_EL3               32     3    6    2    0        2       RW. 
AFSR0_EL3             32     3    6    5    1        0       RW, contents IMPLEMENTATION DEFINED. 
AFSR1_EL3             32     3    6    5    1        1       RW, contents IMPLEMENTATION DEFINED. 
ESR_EL3               32     3    6    5    2        0       RW. 
FAR_EL3               64     3    6    6    0        0       RW. 
MAIR_EL3              64     3    6    10   2        0       RW. 
AMAIR_EL3             64     3    6    10   3        0       RW, contents IMPLEMENTATION DEFINED. 
VBAR_EL3              64     3    6    12   0        0       RW. 
RVBAR_EL3             64     3    6    12   0        1       RO. 
RMR_EL3               64     3    6    12   0        2       RW. Implemented only if EL3 can use both 
                                                             AArch32 and AArch64. Otherwise, it is 
                                                             IMPLEMENTATION DEFINED whether the 
                                                             register is implemented. 
ICC_CTLR_EL3          32     3    6    12   12       4       RW.a 
ICC_SRE_EL3           32     3    6    12   12       5       RW.a 
ICC_IGRPEN1_EL3       32     3    6    12   12       7       RW.a 
TPIDR_EL3             64     3    6    13   0        2       RW. 
Table C5-6 System instruction encodings for non-Debug System register accesses (continued) 

                             Access instruction encoding 
Register accessed     Width  op0  op1  CRn  CRm      op2     Notes 
CNTPS_TVAL_EL1        32     3    7    14   2        0       RW at EL3, Config-RW at Secure EL1.c 
CNTPS_CTL_EL1         32     3    7    14   2        1       RW at EL3, Config-RW at Secure EL1.c 
CNTPS_CVAL_EL1        64     3    7    14   2        2       RW at EL3, Config-RW at Secure EL1.c 





a. GIC System register, see About the GIC System registers. As that subsection describes, each ICV_* register uses the same encoding as the 
corresponding ICC_* register. 

b. Performance Monitors Extension System register, see Performance Monitors registers on page D7-2215. 

c. Generic Timer System register, see Generic Timer registers on page D7-2255. 

d. Debug register in the op0==3 encoding space, see Debug registers on page D7-2147. 

e. Defined to allow access from AArch64 state to registers that are only used in AArch32 state. 
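

The mapping from {CRm, op2} to the register number <n> for the PMEVCNTR<n>_EL0, PMEVTYPER<n>_EL0, and ICH_LR<n>_EL2 entries in Table C5-6 can be written out as a short sketch. The function names are hypothetical and the arithmetic only restates the bullet lists in the Notes column.

```python
def pmev_register_name(crm: int, op2: int) -> str:
    """Name the PMEVCNTR<n>_EL0 or PMEVTYPER<n>_EL0 register for CRm, op2."""
    if not 8 <= crm <= 15 or not 0 <= op2 <= 7:
        raise ValueError("CRm must be 8-15 and op2 must be 0-7")
    n = (crm & 0b11) * 8 + op2          # +0, +8, +16 or +24 depending on CRm
    if n == 31:
        # CRm 11 and 15 only allow op2 0-6; the table lists PMCCFILTR_EL0
        # at CRm==15, op2==7.
        raise ValueError("op2==7 is not an event counter for CRm 11 or 15")
    kind = "PMEVCNTR" if crm < 12 else "PMEVTYPER"
    return f"{kind}{n}_EL0"


def ich_lr_number(crm: int, op2: int) -> int:
    """Return <n> for ICH_LR<n>_EL2: op2 for CRm==12, op2+8 for CRm==13."""
    if crm not in (12, 13) or not 0 <= op2 <= 7:
        raise ValueError("CRm must be 12 or 13 and op2 must be 0-7")
    return op2 + (8 if crm == 13 else 0)
```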


About the GIC System registers 


From version 3.0 of the GIC architecture specification, the specification defines three groups of System registers, 
identified by the prefix of the register name: 


ICC_ GIC physical CPU interface System registers. 

ICH_ GIC virtual interface control System registers. 

ICV_ GIC Virtual CPU interface System registers. 
Note 








These registers are in addition to the GIC memory-mapped register groups GICC_, GICD_, GICH_, GICR_, 
GICV_, and GITS_. 





When implemented, the GIC System registers form part of an ARM processor implementation, and therefore these 
registers are included in the register summaries. However, the registers are defined only in the GIC Architecture 
Specification. 


As Table C5-6 on page C5-282 shows, the ICV_* registers have the same {op0, op1, CRn, CRm, op2} encodings as the 
corresponding ICC_* registers. For these encodings, GIC register configuration fields determine which register is 
accessed. 


For more information see the ARM® Generic Interrupt Controller Architecture Specification, GIC architecture 
version 3.0 and version 4.0 (ARM IHI 0069). 


Instructions for accessing Special-purpose registers 
The A64 instructions for accessing Special-purpose registers are: 


MSR <Special-purpose register>, Xt ; Write to Special-purpose register 
MRS Xt, <Special-purpose register> ; Read from Special-purpose register 


For these accesses, CRn has the value 4. The encoding for Special-purpose register accesses is: 





 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     1 1     op1     0100    CRm    op2     Rt
                     op0             CRn
Table C5-7 lists the encodings for op1, CRm, and op2 fields for accesses to the Special-purpose registers in AArch64. 


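Table C5-7 Special-purpose register accesses 

              Access instruction encoding 
Register      op0  op1  CRn  CRm  op2   Notes 
SPSR_EL1      3    0    4    0    0     Accessible from EL1 or higher. 
ELR_EL1       3    0    4    0    1     Accessible from EL1 or higher. 
SP_EL0        3    0    4    1    0     Accessible from EL1 or higher. If SP_EL0 is the current stack pointer 
                                        then the access is UNDEFINED. 
SPSel         3    0    4    2    0     Accessible from EL1 or higher. 
CurrentEL     3    0    4    2    2     RO. Accessible from EL1 or higher. 
DAIF          3    3    4    2    1     Configurable whether accesses at EL0 are permitted. 
NZCV          3    3    4    2    0     Accessible from EL0 or higher. 
FPCR          3    3    4    4    0     Accessible from EL0 or higher. 
FPSR          3    3    4    4    1     Accessible from EL0 or higher. 
DSPSR_EL0     3    3    4    5    0     Accessible only in Debug state, from EL0 or higher. 
DLR_EL0       3    3    4    5    1     Accessible only in Debug state, from EL0 or higher. 
SPSR_EL2      3    4    4    0    0     Accessible from EL2 or higher. 
ELR_EL2       3    4    4    0    1     Accessible from EL2 or higher. 
SP_EL1        3    4    4    1    0     Accessible from EL2 or higher. 
SPSR_irq      3    4    4    3    0     Accessible from EL2 or higher. 
SPSR_abt      3    4    4    3    1     Accessible from EL2 or higher. 
SPSR_und      3    4    4    3    2     Accessible from EL2 or higher. 
SPSR_fiq      3    4    4    3    3     Accessible from EL2 or higher. 
SPSR_EL3      3    6    4    0    0     Accessible from EL3 or higher. 
ELR_EL3       3    6    4    0    1     Accessible from EL3 or higher. 
SP_EL2        3    6    4    1    0     Accessible from EL3 or higher. 
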
All direct and indirect reads and writes to Special-purpose registers appear to occur in program order relative to 
other instructions. 
Reserved encodings for IMPLEMENTATION DEFINED registers 
The System register encoding space with op0==0b11 reserves the following encodings for IMPLEMENTATION 
DEFINED registers: 
 31       22   21   20 19   18 16   15 12   11  8   7  5   4   0
 1101010100    L     1 1     op1     1x11    CRm    op2     Rt
                     op0     IMPLEMENTATION  IMPLEMENTATION
                             DEFINED (op1)   DEFINED (CRm, op2)
The value of L defines the access type and the use of Rt as follows: 
0 Write the value in Rt to the IMPLEMENTATION DEFINED register. 
1 Read the value of the IMPLEMENTATION DEFINED register to Rt. 


For more information about these encodings see S3_<op1>_<Cn>_<Cm>_<op2>, IMPLEMENTATION
DEFINED registers on page D7-2089. As that section describes, any IMPLEMENTATION DEFINED registers are
accessed in a similar way to architecturally-defined System registers, using MRS and MSR instructions, see:

• MRS on page C6-622.
• MSR (immediate) on page C6-623.
• MSR (register) on page C6-625.





See also Reserved encoding space for IMPLEMENTATION DEFINED instructions on page C5-279. 
C5.2 Special-purpose registers

The Special-purpose registers are:

• CurrentEL, DAIF, and NZCV, that return PSTATE information.
• The ELRs, DLR_EL0 and ELR_ELx, that hold the return address for the return from Debug state, or for the
  exception return.
• FPCR and FPSR, that provide floating-point status and control.
• The stack pointers, SP_ELx, and stack pointer selector, SPSel.
• The SPSRs, DSPSR_EL0 and SPSR_ELx, that hold the PE state from immediately before entering Debug
  state or taking an exception. This means they hold the state required for the return from Debug state, or for
  the exception return.
• SPSR_abt, SPSR_fiq, SPSR_irq, and SPSR_und, that map to the corresponding AArch32 registers.

Note

The AArch32 SPSRs SPSR_hyp, SPSR_mon, and SPSR_svc are mapped to the AArch64 SPSR_ELx
registers.





The characteristic of a Special-purpose register is that all direct and indirect reads and writes to the register appear 
to occur in program order relative to other instructions, without the need for any explicit synchronization. 


This section describes the following registers: 


• CurrentEL, that software can read to determine the current Exception level.
• DAIF, that specifies the current interrupt mask bits.
• DLR_EL0, that holds the address to return to for a return from Debug state.
• DSPSR_EL0, that holds process state on entry to Debug state.
• ELR_EL1, that holds the address to return to for an exception return from EL1.
• ELR_EL2, that holds the address to return to for an exception return from EL2.
• ELR_EL3, that holds the address to return to for an exception return from EL3.
• FPCR, that provides control of floating-point operation.
• FPSR, that provides floating-point status information.
• NZCV, that holds the condition flags.
• SP_EL0, that holds the stack pointer for EL0.
• SP_EL1, that holds the stack pointer for EL1.
• SP_EL2, that holds the stack pointer for EL2.
• SP_EL3, that holds the stack pointer for EL3.
• SPSel, that at EL1 or higher selects between the SP for the current Exception level and SP_EL0.
• SPSR_abt, that holds process state on taking an exception to AArch32 Abort mode.
• SPSR_EL1, that holds process state on taking an exception to AArch64 EL1.
• SPSR_EL2, that holds process state on taking an exception to AArch64 EL2.
• SPSR_EL3, that holds process state on taking an exception to AArch64 EL3.
• SPSR_fiq, that holds process state on taking an exception to AArch32 FIQ mode.
• SPSR_irq, that holds process state on taking an exception to AArch32 IRQ mode.
• SPSR_und, that holds process state on taking an exception to AArch32 Undefined mode.
C5.2.1 CurrentEL, Current Exception Level

The CurrentEL characteristics are:

Purpose
Holds the current Exception level.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO

A write to the CurrentEL register is UNDEFINED.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
CurrentEL is a 32-bit register.

Field descriptions
The CurrentEL bit assignments are:

31                    4 3  2 1  0
         RES0           EL   RES0

Bits [31:4]
Reserved, RES0.

EL, bits [3:2]
Current Exception level. Possible values of this field are:
00 EL0
01 EL1
10 EL2
11 EL3
This field resets to a value that is architecturally UNKNOWN.

Bits [1:0]
Reserved, RES0.

Accessing the CurrentEL:
To access the CurrentEL:

MRS <Xt>, CurrentEL ; Read CurrentEL into Xt

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0100   0010   010
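
Because the EL field occupies bits [3:2] and not bits [1:0], software that reads CurrentEL must shift the value before comparing it with an Exception level number. A minimal Python sketch of that extraction follows. The function name current_el is hypothetical, and the value argument stands for whatever MRS returned in Xt.

```python
def current_el(value):
    """Return the Exception level number held in CurrentEL.EL, bits [3:2].

    'value' is the raw register value as read by MRS <Xt>, CurrentEL.
    The reserved bits around the field are ignored.
    """
    return (value >> 2) & 0b11
```

A raw value of 0x4 therefore corresponds to EL1, and 0xC to EL3.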
C5.2.2 DAIF, Interrupt Mask Bits

The DAIF characteristics are:

Purpose
Allows access to the interrupt mask bits.

Usage constraints
This register is accessible as follows:

EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-RW   RW        RW       RW        RW              RW

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If SCTLR_EL1.UMA==0, accesses to this register from EL0 are trapped to EL1.

Configurations
Some or all RW fields of this register have defined reset values. These apply only if the PE resets
into an Exception level that is using AArch64. Otherwise, RW fields in this register reset to
architecturally UNKNOWN values.

Attributes
DAIF is a 32-bit register.

Field descriptions
The DAIF bit assignments are:

31            10 9 8 7 6 5       0
      RES0       D A I F   RES0

Bits [31:10]
Reserved, RES0.

D, bit [9]
Process state D mask. The possible values of this bit are:
0 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
  level are not masked.
1 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
  level are masked.
When the target Exception level of the debug exception is higher than the current Exception level,
the exception is not masked by this bit.
When this register has an architecturally-defined reset value, this field resets to 1.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
When this register has an architecturally-defined reset value, this field resets to 1.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
When this register has an architecturally-defined reset value, this field resets to 1.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
When this register has an architecturally-defined reset value, this field resets to 1.

Bits [5:0]
Reserved, RES0.


Accessing the DAIF:
To access the DAIF:

MRS <Xt>, DAIF ; Read DAIF into Xt
MSR DAIF, <Xt> ; Write Xt to DAIF

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   0100   0010   001
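
The four mask bits sit at bits [9:6] of the value transferred by MRS and MSR, so a value read from DAIF can be unpacked and modified with ordinary bit operations. The following Python sketch shows one way to do this. The names DAIF_BITS, decode_daif, and with_irq_masked are hypothetical helpers introduced only for this illustration.

```python
# Bit positions of the DAIF mask bits, as given in the field descriptions.
DAIF_BITS = {"D": 9, "A": 8, "I": 7, "F": 6}


def decode_daif(value):
    """Map each mask bit name to True (masked) or False (not masked)."""
    return {name: bool((value >> bit) & 1) for name, bit in DAIF_BITS.items()}


def with_irq_masked(value):
    """Return 'value' with the I bit set, leaving the other bits unchanged."""
    return value | (1 << DAIF_BITS["I"])
```

A value of 0x3C0 has all four bits set, which matches the architecturally-defined reset value of 1 for each of the D, A, I, and F fields.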
C5.2.3 DLR_EL0, Debug Link Register

The DLR_EL0 characteristics are:

Purpose
In Debug state, holds the address to restart from.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW

Access to this register is from Debug state only. During normal execution this register is
unallocated.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register DLR_EL0[31:0] is architecturally mapped to AArch32 System register
DLR.

Attributes
DLR_EL0 is a 64-bit register.

Field descriptions
DLR_EL0 is a member of multiple register groups and is defined elsewhere. For the full definition, see DLR_EL0.

Accessing the DLR_EL0:
To access the DLR_EL0:

MRS <Xt>, DLR_EL0 ; Read DLR_EL0 into Xt
MSR DLR_EL0, <Xt> ; Write Xt to DLR_EL0

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   0100   0101   001
C5.2.4 DSPSR_EL0, Debug Saved Program Status Register

The DSPSR_EL0 characteristics are:

Purpose
Holds the saved process state on entry to Debug state.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW

Access to this register is from Debug state only. During normal execution this register is
unallocated.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register DSPSR_EL0 is architecturally mapped to AArch32 System register
DSPSR.

Attributes
DSPSR_EL0 is a 32-bit register.

Field descriptions
DSPSR_EL0 is a member of multiple register groups and is defined elsewhere. For the full definition, see
DSPSR_EL0.

Accessing the DSPSR_EL0:
To access the DSPSR_EL0:

MRS <Xt>, DSPSR_EL0 ; Read DSPSR_EL0 into Xt
MSR DSPSR_EL0, <Xt> ; Write Xt to DSPSR_EL0

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   0100   0101   000
C5.2.5 ELR_EL1, Exception Link Register (EL1)

The ELR_EL1 characteristics are:

Purpose
When taking an exception to EL1, holds the address to return to.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW

An exception return from EL1 using AArch64 makes ELR_EL1 become UNKNOWN.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
ELR_EL1 is a 64-bit register.

Field descriptions
The ELR_EL1 bit assignments are:

63                              0
         Return address

Bits [63:0]
Return address.
An exception return from EL1 using AArch64 makes ELR_EL1 become UNKNOWN.

Accessing the ELR_EL1:
To access the ELR_EL1:

MRS <Xt>, ELR_EL1 ; Read ELR_EL1 into Xt
MSR ELR_EL1, <Xt> ; Write Xt to ELR_EL1

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0100   0000   001
C5.2.6 ELR_EL2, Exception Link Register (EL2)

The ELR_EL2 characteristics are:

Purpose
When taking an exception to EL2, holds the address to return to.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        RW        RW              RW

An exception return from EL2 using AArch64 makes ELR_EL2 become UNKNOWN.

When EL2 is in AArch32 Execution state and an exception is taken from EL0, EL1, or EL2 to EL3
and AArch64 execution, the upper 32 bits of ELR_EL2 are either set to 0 or hold the same value
that they did before AArch32 execution. Which option is adopted is determined by an
implementation, and might vary dynamically within an implementation. Correspondingly software
must regard the value as being an UNKNOWN choice between the two values.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register ELR_EL2 is architecturally mapped to AArch32 System register
ELR_hyp.

Attributes
ELR_EL2 is a 64-bit register.

Field descriptions
The ELR_EL2 bit assignments are:

63                              0
         Return address

Bits [63:0]
Return address.
An exception return from EL2 using AArch64 makes ELR_EL2 become UNKNOWN.

When EL2 is in AArch32 Execution state and an exception is taken from EL0, EL1, or EL2 to EL3
and AArch64 execution, the upper 32 bits of ELR_EL2 are either set to 0 or hold the same value
that they did before AArch32 execution. Which option is adopted is determined by an
implementation, and might vary dynamically within an implementation. Correspondingly software
must regard the value as being an UNKNOWN choice between the two values.

Accessing the ELR_EL2:
To access the ELR_EL2:

MRS <Xt>, ELR_EL2 ; Read ELR_EL2 into Xt
MSR ELR_EL2, <Xt> ; Write Xt to ELR_EL2

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    100   0100   0000   001
C5.2.7 ELR_EL3, Exception Link Register (EL3)

The ELR_EL3 characteristics are:

Purpose
When taking an exception to EL3, holds the address to return to.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         RW              RW

An exception return from EL3 using AArch64 makes ELR_EL3 become UNKNOWN.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
ELR_EL3 is a 64-bit register.

Field descriptions
The ELR_EL3 bit assignments are:

63                              0
         Return address

Bits [63:0]
Return address.
An exception return from EL3 using AArch64 makes ELR_EL3 become UNKNOWN.

Accessing the ELR_EL3:
To access the ELR_EL3:

MRS <Xt>, ELR_EL3 ; Read ELR_EL3 into Xt
MSR ELR_EL3, <Xt> ; Write Xt to ELR_EL3

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    110   0100   0000   001
C5.2.8 FPCR, Floating-point Control Register

The FPCR characteristics are:

Purpose
Controls floating-point behavior.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If CPACR_EL1.FPEN==00, accesses to this register from EL0 and EL1 are trapped to EL1.
• If CPACR_EL1.FPEN==01, accesses to this register from EL0 are trapped to EL1.
• If CPACR_EL1.FPEN==10, accesses to this register from EL0 and EL1 are trapped to EL1.
• If CPTR_EL2.TFP==1, Non-secure accesses to this register from EL0, EL1, and EL2 are
  trapped to EL2.
• If CPTR_EL3.TFP==1, accesses to this register from EL0, EL1, EL2, and EL3 are trapped
  to EL3.

Configurations
The named fields in this register map to the equivalent fields in the AArch32 FPSCR.

It is IMPLEMENTATION DEFINED whether the Len and Stride fields can be programmed to non-zero
values, which will cause some AArch32 floating-point instruction encodings to be UNDEFINED, or
whether these fields are RAZ.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
FPCR is a 32-bit register.

Field descriptions
The FPCR bit assignments are:

Bits      Field
[31:27]   RES0
[26]      AHP
[25]      DN
[24]      FZ
[23:22]   RMode
[21:20]   Stride
[19]      RES0
[18:16]   Len
[15]      IDE
[14:13]   RES0
[12]      IXE
[11]      UFE
[10]      OFE
[9]       DZE
[8]       IOE
[7:0]     RES0

Bits [31:27]
Reserved, RES0.

AHP, bit [26]
Alternative half-precision control bit:
0 IEEE half-precision format selected.
1 Alternative half-precision format selected.

DN, bit [25]
Default NaN mode control bit:
0 NaN operands propagate through to the output of a floating-point operation.
1 Any operation involving one or more NaNs returns the Default NaN.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.

FZ, bit [24]
Flush-to-zero mode control bit:
0 Flush-to-zero mode disabled. Behavior of the floating-point system is fully compliant
  with the IEEE 754 standard.
1 Flush-to-zero mode enabled.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.

RMode, bits [23:22]
Rounding Mode control field. The encoding of this field is:
00 Round to Nearest (RN) mode.
01 Round towards Plus Infinity (RP) mode.
10 Round towards Minus Infinity (RM) mode.
11 Round towards Zero (RZ) mode.
The specified rounding mode is used by both scalar and Advanced SIMD floating-point
instructions.

Stride, bits [21:20]
This field has no function in AArch64, and non-zero values are ignored during AArch64 execution.
It is included only for context saving and restoration of AArch32 FPSCR.Stride.

Bit [19]
Reserved, RES0.

Len, bits [18:16]
This field has no function in AArch64, and non-zero values are ignored during AArch64 execution.
It is included only for context saving and restoration of AArch32 FPSCR.Len.

IDE, bit [15]
Input Denormal exception trap enable. Possible values are:
0 Untrapped exception handling selected. If the floating-point exception occurs then the
  FPSR.IDC bit is set to 1.
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
  not update the FPSR.IDC bit. The trap handling software can decide whether to set the
  FPSR.IDC bit to 1.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.
If the implementation does not support this exception, this bit is RES0.

Bits [14:13]
Reserved, RES0.

IXE, bit [12]
Inexact exception trap enable. Possible values are:
0 Untrapped exception handling selected. If the floating-point exception occurs then the
  FPSR.IXC bit is set to 1.
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
  not update the FPSR.IXC bit. The trap handling software can decide whether to set the
  FPSR.IXC bit to 1.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.
If the implementation does not support this exception, this bit is RES0.

UFE, bit [11]
Underflow exception trap enable. Possible values are:
0 Untrapped exception handling selected. If the floating-point exception occurs then the
  FPSR.UFC bit is set to 1.
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
  not update the FPSR.UFC bit. The trap handling software can decide whether to set the
  FPSR.UFC bit to 1.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.
If the implementation does not support this exception, this bit is RES0.

OFE, bit [10]
Overflow exception trap enable. Possible values are:
0 Untrapped exception handling selected. If the floating-point exception occurs then the
  FPSR.OFC bit is set to 1.
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
  not update the FPSR.OFC bit. The trap handling software can decide whether to set the
  FPSR.OFC bit to 1.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.
If the implementation does not support this exception, this bit is RES0.

DZE, bit [9]
Division by Zero exception trap enable. Possible values are:
0 Untrapped exception handling selected. If the floating-point exception occurs then the
  FPSR.DZC bit is set to 1.
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
  not update the FPSR.DZC bit. The trap handling software can decide whether to set the
  FPSR.DZC bit to 1.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.
If the implementation does not support this exception, this bit is RES0.

IOE, bit [8]
Invalid Operation exception trap enable. Possible values are:
0 Untrapped exception handling selected. If the floating-point exception occurs then the
  FPSR.IOC bit is set to 1.
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
  not update the FPSR.IOC bit. The trap handling software can decide whether to set the
  FPSR.IOC bit to 1.
The value of this bit controls both scalar and Advanced SIMD floating-point arithmetic.
If the implementation does not support this exception, this bit is RES0.

Bits [7:0]
Reserved, RES0.

Accessing the FPCR:
To access the FPCR:

MRS <Xt>, FPCR ; Read FPCR into Xt
MSR FPCR, <Xt> ; Write Xt to FPCR

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   0100   0100   000
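
Changing one FPCR field, for example RMode, is normally a read-modify-write sequence: read FPCR with MRS, replace only the field of interest, and write the result back with MSR so that the other control bits are preserved. The following Python sketch models the bit manipulation in the middle of that sequence. The names RMODE_NAMES, fpcr_rounding_mode, and fpcr_with_rounding_mode are hypothetical and exist only for this illustration.

```python
RMODE_SHIFT = 22   # RMode occupies bits [23:22]
RMODE_MASK = 0b11

# Encoding of the RMode field, as listed in the field description.
RMODE_NAMES = {0b00: "RN", 0b01: "RP", 0b10: "RM", 0b11: "RZ"}


def fpcr_rounding_mode(fpcr):
    """Return the rounding mode mnemonic selected by an FPCR value."""
    return RMODE_NAMES[(fpcr >> RMODE_SHIFT) & RMODE_MASK]


def fpcr_with_rounding_mode(fpcr, mode):
    """Return 'fpcr' with RMode replaced by 'mode' and all other bits unchanged."""
    encoding = {name: bits for bits, name in RMODE_NAMES.items()}[mode]
    cleared = fpcr & ~(RMODE_MASK << RMODE_SHIFT)
    return cleared | (encoding << RMODE_SHIFT)
```

Clearing the field before inserting the new encoding matters when the old and new modes differ in both bits, for example when moving from RZ (11) to RM (10).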
C5.2.9 FPSR, Floating-point Status Register

The FPSR characteristics are:

Purpose
Provides floating-point system status information.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If CPACR_EL1.FPEN==00, accesses to this register from EL0 and EL1 are trapped to EL1.
• If CPACR_EL1.FPEN==01, accesses to this register from EL0 are trapped to EL1.
• If CPACR_EL1.FPEN==10, accesses to this register from EL0 and EL1 are trapped to EL1.
• If CPTR_EL2.TFP==1, Non-secure accesses to this register from EL0, EL1, and EL2 are
  trapped to EL2.
• If CPTR_EL3.TFP==1, accesses to this register from EL0, EL1, EL2, and EL3 are trapped
  to EL3.

Configurations
The named fields in this register map to the equivalent fields in the AArch32 FPSCR.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
FPSR is a 32-bit register.

Field descriptions
The FPSR bit assignments are:

Bits     Field
[31]     N
[30]     Z
[29]     C
[28]     V
[27]     QC
[26:8]   RES0
[7]      IDC
[6:5]    RES0
[4]      IXC
[3]      UFC
[2]      OFC
[1]      DZC
[0]      IOC

N, bit [31]
Negative condition flag for AArch32 floating-point comparison operations. AArch64 floating-point
comparisons set the PSTATE.N flag instead.

Z, bit [30]
Zero condition flag for AArch32 floating-point comparison operations. AArch64 floating-point
comparisons set the PSTATE.Z flag instead.

C, bit [29]
Carry condition flag for AArch32 floating-point comparison operations. AArch64 floating-point
comparisons set the PSTATE.C flag instead.

V, bit [28]
Overflow condition flag for AArch32 floating-point comparison operations. AArch64
floating-point comparisons set the PSTATE.V flag instead.

QC, bit [27]
Cumulative saturation bit, Advanced SIMD only. This bit is set to 1 to indicate that an Advanced
SIMD integer operation has saturated since 0 was last written to this bit.

Bits [26:8]
Reserved, RES0.

IDC, bit [7]
Input Denormal cumulative exception bit. This bit is set to 1 to indicate that the Input Denormal
exception has occurred since 0 was last written to this bit.
How scalar and Advanced SIMD floating-point instructions update this bit depends on the value of
the FPCR.IDE bit. This bit is only set to 1 to indicate an exception if FPCR.IDE is 0, or if trapping
software sets it.

Bits [6:5]
Reserved, RES0.

IXC, bit [4]
Inexact cumulative exception bit. This bit is set to 1 to indicate that the Inexact exception has
occurred since 0 was last written to this bit.
How scalar and Advanced SIMD floating-point instructions update this bit depends on the value of
the FPCR.IXE bit. This bit is only set to 1 to indicate an exception if FPCR.IXE is 0, or if trapping
software sets it.

UFC, bit [3]
Underflow cumulative exception bit. This bit is set to 1 to indicate that the Underflow exception has
occurred since 0 was last written to this bit.
How scalar and Advanced SIMD floating-point instructions update this bit depends on the value of
the FPCR.UFE bit. This bit is only set to 1 to indicate an exception if FPCR.UFE is 0, or if trapping
software sets it.

OFC, bit [2]
Overflow cumulative exception bit. This bit is set to 1 to indicate that the Overflow exception has
occurred since 0 was last written to this bit.
How scalar and Advanced SIMD floating-point instructions update this bit depends on the value of
the FPCR.OFE bit. This bit is only set to 1 to indicate an exception if FPCR.OFE is 0, or if trapping
software sets it.

DZC, bit [1]
Division by Zero cumulative exception bit. This bit is set to 1 to indicate that the Division by Zero
exception has occurred since 0 was last written to this bit.
How scalar and Advanced SIMD floating-point instructions update this bit depends on the value of
the FPCR.DZE bit. This bit is only set to 1 to indicate an exception if FPCR.DZE is 0, or if trapping
software sets it.

IOC, bit [0]
Invalid Operation cumulative exception bit. This bit is set to 1 to indicate that the Invalid Operation
exception has occurred since 0 was last written to this bit.
How scalar and Advanced SIMD floating-point instructions update this bit depends on the value of
the FPCR.IOE bit. This bit is only set to 1 to indicate an exception if FPCR.IOE is 0, or if trapping
software sets it.

Accessing the FPSR:
To access the FPSR:

MRS <Xt>, FPSR ; Read FPSR into Xt
MSR FPSR, <Xt> ; Write Xt to FPSR

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   0100   0100   001
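
The cumulative bits are sticky: each one stays at 1 until software writes 0 to it. Code that wants to detect exceptions raised by a particular sequence of operations therefore typically clears the relevant bits, runs the sequence, and then inspects the register. The following Python sketch models the inspection and clearing steps on a raw FPSR value. The names FPSR_CUMULATIVE, fpsr_raised, and fpsr_cleared are hypothetical and are used only for this illustration.

```python
# Bit positions of the cumulative status bits, as given in the field descriptions.
FPSR_CUMULATIVE = {"QC": 27, "IDC": 7, "IXC": 4, "UFC": 3, "OFC": 2, "DZC": 1, "IOC": 0}


def fpsr_raised(fpsr):
    """List the cumulative bits that are set, from the highest bit position down."""
    return [name for name, bit in FPSR_CUMULATIVE.items() if (fpsr >> bit) & 1]


def fpsr_cleared(fpsr, *names):
    """Return 'fpsr' with the named cumulative bits written as 0."""
    for name in names:
        fpsr &= ~(1 << FPSR_CUMULATIVE[name])
    return fpsr
```

The value returned by fpsr_cleared is what software would write back with MSR FPSR, <Xt>. Bits that are not named keep their previous state.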
C5.2.10 NZCV, Condition Flags

The NZCV characteristics are:

Purpose
Allows access to the condition flags.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
NZCV is a 32-bit register.

Field descriptions
The NZCV bit assignments are:

31 30 29 28 27                  0
N  Z  C  V         RES0

N, bit [31]
Negative condition flag. Set to 1 if the result of the last flag-setting instruction was negative.

Z, bit [30]
Zero condition flag. Set to 1 if the result of the last flag-setting instruction was zero, and to 0
otherwise. A result of zero often indicates an equal result from a comparison.

C, bit [29]
Carry condition flag. Set to 1 if the last flag-setting instruction resulted in a carry condition, for
example an unsigned overflow on an addition.

V, bit [28]
Overflow condition flag. Set to 1 if the last flag-setting instruction resulted in an overflow condition,
for example a signed overflow on an addition.

Bits [27:0]
Reserved, RES0.

Accessing the NZCV:
To access the NZCV:

MRS <Xt>, NZCV ; Read NZCV into Xt
MSR NZCV, <Xt> ; Write Xt to NZCV

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   0100   0010   000
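
As an illustration of the four flag definitions, the following Python sketch computes the flags that a 64-bit flag-setting addition would produce and packs them into the NZCV register layout, bits [31:28]. It is a simplified model for the addition case only, and the name adds_nzcv is hypothetical.

```python
MASK64 = (1 << 64) - 1


def adds_nzcv(a, b):
    """Model the NZCV value after a 64-bit flag-setting addition of a and b."""
    a &= MASK64
    b &= MASK64
    unsigned_sum = a + b
    result = unsigned_sum & MASK64

    n = result >> 63                     # result is negative as a signed value
    z = int(result == 0)                 # result is zero
    c = int(unsigned_sum > MASK64)       # unsigned overflow (carry out of bit 63)
    # Signed overflow: both operands have the same sign and the result's sign differs.
    v = ((~(a ^ b) & (a ^ result)) >> 63) & 1

    return (n << 31) | (z << 30) | (c << 29) | (v << 28)
```

Adding 1 to the largest positive signed value sets N and V, because the result wraps to a negative value without any carry out. Adding 1 to the all-ones value sets Z and C.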
C5.2.11 SP_EL0, Stack Pointer (EL0)

The SP_EL0 characteristics are:

Purpose
Holds the stack pointer associated with EL0. At higher Exception levels, this is used as the current
stack pointer when the value of SPSel.SP is 0.

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW

This accessibility information only applies when the value of SPSel.SP is 1, and only for accesses
using the MRS or MSR instructions. In addition, this register is accessible at EL0 as the current
stack pointer.

When the value of SPSel.SP is 0:
• Any access to SP_EL0 using the MRS or MSR instructions is UNDEFINED.
• This register is accessible at all Exception levels as the current stack pointer.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
SP_EL0 is a 64-bit register.

Field descriptions
The SP_EL0 bit assignments are:

63                              0
         Stack pointer

Bits [63:0]
Stack pointer.

Accessing the SP_EL0:
To access the SP_EL0:

MRS <Xt>, SP_EL0 ; Read SP_EL0 into Xt
MSR SP_EL0, <Xt> ; Write Xt to SP_EL0

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0100   0001   000
C5.2.12 SP_EL1, Stack Pointer (EL1)

The SP_EL1 characteristics are:

Purpose
Holds the stack pointer associated with EL1. When executing at EL1, the value of SPSel.SP
determines the current stack pointer:

SPSel.SP   Current stack pointer
0          SP_EL0
1          SP_EL1

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        RW        RW              RW

This accessibility information only applies to accesses using the MRS or MSR instructions.
When the value of SPSel.SP is 1, this register is also accessible at EL1 as the current stack pointer.

Note
When the value of SPSel.SP is 0, SP_EL0 is used as the current stack pointer at all Exception levels.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
SP_EL1 is a 64-bit register.

Field descriptions
The SP_EL1 bit assignments are:

63                              0
         Stack pointer

Bits [63:0]
Stack pointer.

Accessing the SP_EL1:
To access the SP_EL1:

MRS <Xt>, SP_EL1 ; Read SP_EL1 into Xt
MSR SP_EL1, <Xt> ; Write Xt to SP_EL1

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    100   0100   0001   000
C5.2.13 SP_EL2, Stack Pointer (EL2)

The SP_EL2 characteristics are:

Purpose
Holds the stack pointer associated with EL2. When executing at EL2, the value of SPSel.SP
determines the current stack pointer:

SPSel.SP   Current stack pointer
0          SP_EL0
1          SP_EL2

Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         RW              RW

This accessibility information only applies to accesses using the MRS or MSR instructions.
When the value of SPSel.SP is 1, this register is also accessible at EL2 as the current stack pointer.

Note
When the value of SPSel.SP is 0, SP_EL0 is used as the current stack pointer at all Exception levels.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
SP_EL2 is a 64-bit register.

Field descriptions
The SP_EL2 bit assignments are:

63                              0
         Stack pointer

Bits [63:0]
Stack pointer.

Accessing the SP_EL2:
To access the SP_EL2:

MRS <Xt>, SP_EL2 ; Read SP_EL2 into Xt
MSR SP_EL2, <Xt> ; Write Xt to SP_EL2

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    110   0100   0001   000
C5.2.14 SP_EL3, Stack Pointer (EL3)

The SP_EL3 characteristics are:

Purpose
Holds the stack pointer associated with EL3. When executing at EL3, the value of SPSel.SP
determines the current stack pointer:

SPSel.SP   Current stack pointer
0          SP_EL0
1          SP_EL3

Usage constraints
This register is not accessible using MRS and MSR instructions.
When the value of SPSel.SP is 1, this register is accessible at EL3 as the current stack pointer.

Note
When the value of SPSel.SP is 0, SP_EL0 is used as the current stack pointer at all Exception levels.

Traps and Enables
There are no traps or enables affecting this register.

Configurations
There are no configuration notes.

Attributes
SP_EL3 is a 64-bit register.

Field descriptions
The SP_EL3 bit assignments are:

63                              0
         Stack pointer

Bits [63:0]
Stack pointer.
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C5.2.15 SPSel, Stack Pointer Select 
The SPSel characteristics are: 
Purpose 
Allows the Stack Pointer to be selected between SP_ELO and SP_ELx. 
Usage constraints 
This register is accessible as follows: 
ELO EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3 (SCR.NS=0) 
- RW RW RW RW RW 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch64. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 
Attributes 
SPSel is a 32-bit register. 
Field descriptions 
The SPSel bit assignments are: 
Bits   31:1  0
Field  RES0  SP
Bits [31:1]
Reserved, RES0.
SP, bit [0]
Stack pointer to use. Possible values of this bit are:
0 Use SP_EL0 at all Exception levels.
1 Use SP_ELx for Exception level ELx.
When this register has an architecturally-defined reset value, this field resets to 1.
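The selection rule described by the SP field can be sketched as a small function. This is an illustrative model only: the function name and its arguments are hypothetical, and the EL0 case assumes that EL0 always uses SP_EL0, which is consistent with SPSel not being accessible at EL0.

```python
def current_stack_pointer(el: int, spsel_sp: int) -> str:
    """Return the name of the stack pointer selected by SPSel.SP.

    el       -- current Exception level, 0 to 3 (hypothetical parameter)
    spsel_sp -- value of SPSel.SP, bit [0] of SPSel
    """
    if el not in (0, 1, 2, 3):
        raise ValueError("Exception level must be 0, 1, 2, or 3")
    if spsel_sp not in (0, 1):
        raise ValueError("SPSel.SP is a single bit")
    # Assumption: EL0 always uses SP_EL0 (SPSel is not accessible there).
    # SPSel.SP == 0 selects SP_EL0 at every Exception level.
    if el == 0 or spsel_sp == 0:
        return "SP_EL0"
    # SPSel.SP == 1 selects SP_ELx for Exception level ELx.
    return f"SP_EL{el}"
```

For example, `current_stack_pointer(3, 1)` gives `"SP_EL3"`, and `current_stack_pointer(3, 0)` gives `"SP_EL0"`.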
Accessing the SPSel: 
To access the SPSel: 
MRS <Xt>, SPSel ; Read SPSel into Xt 
MSR SPSel, <Xt> ; Write Xt to SPSel 
Register access is encoded as follows:
op0  op1  CRn   CRm   op2
11   000  0100  0010  000

C5.2.16 SPSR_abt, Saved Program Status Register (Abort mode)
The SPSR_abt characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to Abort mode. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SPSR_abt is architecturally mapped to AArch32 System register 
SPSR_abt. 


If EL1 does not support execution in AArch32, this register is RES0.


Attributes 
SPSR_abt is a 32-bit register. 


Field descriptions 


The SPSR_abt bit assignments are: 


Bits   31  30  29  28  27  26:25    24  23:21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field  N   Z   C   V   Q   IT[1:0]  J   RES0   IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Abort mode, and copied to CPSR.N on 
executing an exception return operation in Abort mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Abort mode, and copied to CPSR.Z on 
executing an exception return operation in Abort mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to Abort mode, and copied to CPSR.C on 
executing an exception return operation in Abort mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Abort mode, and copied to CPSR.V on 
executing an exception return operation in Abort mode. 

Q, bit [27]
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some
instructions.

IT[1:0], bits [26:25]
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field.

J, bit [24]
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]
Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
conditionally executed, by the position of the least significant 1 in this field. It also encodes
the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 
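The split IT field can be reassembled from an SPSR value and interpreted as follows. This is a minimal sketch: the function name and the returned dictionary keys are hypothetical, and the mapping from the position of the least significant 1 to an instruction count (4 minus the bit position) is an assumption based on the ITSTATE encoding used by the IT instruction, which this section does not spell out.

```python
def decode_it_state(spsr: int) -> dict:
    """Reassemble IT[7:0] from a 32-bit AArch32-format SPSR value."""
    it_lo = (spsr >> 25) & 0x3    # IT[1:0] is held in SPSR bits [26:25]
    it_hi = (spsr >> 10) & 0x3F   # IT[7:2] is held in SPSR bits [15:10]
    it = (it_hi << 2) | it_lo

    if it == 0:
        # The IT field is 0b00000000 when no IT block is active.
        return {"it": 0, "active": False, "base_cond": None, "remaining": 0}

    base_cond = it >> 5           # IT[7:5]: top 3 bits of the condition code
    mask = it & 0x1F              # IT[4:0]: block size and condition LSBs
    if mask == 0:
        remaining = None          # no 1 bit to locate; not a normal IT state
    else:
        lowest_one = (mask & -mask).bit_length() - 1
        # Assumption: a least significant 1 at bit 0 means 4 instructions
        # remain, at bit 3 means 1 instruction remains.
        remaining = 4 - lowest_one
    return {"it": it, "active": True, "base_cond": base_cond,
            "remaining": remaining}
```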


E, bit [9]
Endianness state bit. Controls the load and store endianness for data accesses:
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.

When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.

If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.

If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.

Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.

M[3:0], bits [3:0]
AArch32 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0110  Monitor
0b0111  Abort
0b1010  Hyp
0b1011  Undefined
0b1111  System

Other values are reserved.
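The field layout above can be exercised with a short decoder. This is a sketch for illustration, not part of the architecture: the function name, the dictionary keys, and the `"reserved"` marker string are hypothetical, while the bit positions and mode encodings are taken from the field descriptions for this register.

```python
# M[3:0] encodings listed in the mode table for this register.
AARCH32_MODES = {
    0b0000: "User",
    0b0001: "FIQ",
    0b0010: "IRQ",
    0b0011: "Supervisor",
    0b0110: "Monitor",
    0b0111: "Abort",
    0b1010: "Hyp",
    0b1011: "Undefined",
    0b1111: "System",
}


def decode_spsr_aarch32(spsr: int) -> dict:
    """Split a 32-bit AArch32-format SPSR value into named fields."""
    def bit(n: int) -> int:
        return (spsr >> n) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "Q": bit(27),
        "IL": bit(20),
        "GE": (spsr >> 16) & 0xF,
        "E": bit(9), "A": bit(8), "I": bit(7), "F": bit(6), "T": bit(5),
        "M4": bit(4),
        # Encodings not in the table are reported with a placeholder string.
        "mode": AARCH32_MODES.get(spsr & 0xF, "reserved"),
    }
```

For instance, the value `0x600001D7` decodes as Z and C set, the A, I, and F masks set, A32 state, and Abort mode.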


Accessing the SPSR_abt: 
To access the SPSR_abt: 


MRS <Xt>, SPSR_abt ; Read SPSR_abt into Xt 
MSR SPSR_abt, <Xt> ; Write Xt to SPSR_abt 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  0100  0011  001
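As an illustration of how the op0, op1, CRn, CRm, and op2 values are used, the following sketch packs them into a 32-bit MRS or MSR instruction word. The function name and parameters are hypothetical. The bit positions are an assumption drawn from the A64 MRS and MSR (register) instruction format, which is defined elsewhere in the manual rather than in this section.

```python
def encode_sysreg_access(op0: int, op1: int, crn: int, crm: int,
                         op2: int, rt: int, read: bool) -> int:
    """Build the instruction word for MRS <Xt>, reg (read=True)
    or MSR reg, <Xt> (read=False)."""
    if op0 not in (2, 3):
        raise ValueError("op0 must be 0b10 or 0b11 for MRS/MSR (register)")
    for name, value, limit in (("op1", op1, 8), ("CRn", crn, 16),
                               ("CRm", crm, 16), ("op2", op2, 8),
                               ("Rt", rt, 32)):
        if not 0 <= value < limit:
            raise ValueError(f"{name} out of range")

    word = 0xD5000000                  # fixed bits [31:22] of the class
    if read:
        word |= 1 << 21                # L bit: 1 for MRS, 0 for MSR
    word |= op0 << 19                  # op0 occupies bits [20:19]
    word |= op1 << 16                  # op1 occupies bits [18:16]
    word |= crn << 12                  # CRn occupies bits [15:12]
    word |= crm << 8                   # CRm occupies bits [11:8]
    word |= op2 << 5                   # op2 occupies bits [7:5]
    word |= rt                         # Rt occupies bits [4:0]
    return word
```

With the SPSR_abt values above (op0=0b11, op1=0b100, CRn=0b0100, CRm=0b0011, op2=0b001) and Xt = X0, a read produces `0xD53C4320` under these assumptions.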
C5.2.17 SPSR_EL1, Saved Program Status Register (EL1)
The SPSR_EL1 characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to EL1. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SPSR_EL1 is architecturally mapped to AArch32 System register
SPSR_svc.


Attributes 
SPSR_EL1 is a 32-bit register. 


Field descriptions 


The SPSR_EL1 bit assignments are: 


When exception taken from AArch32: 


Bits   31  30  29  28  27  26:25    24  23:22  21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field  N   Z   C   V   Q   IT[1:0]  J   RES0   SS  IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]


An exception return from EL1 using AArch64 makes SPSR_EL1 become UNKNOWN. 





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Supervisor mode, and copied to CPSR.N on 
executing an exception return operation in Supervisor mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Supervisor mode, and copied to CPSR.Z on 
executing an exception return operation in Supervisor mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to Supervisor mode, and copied to CPSR.C on 
executing an exception return operation in Supervisor mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Supervisor mode, and copied to CPSR.V on 
executing an exception return operation in Supervisor mode. 

ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C5-323 


ID092916 Non-Confidential 


C5 The A64 System Instruction Class 
C5.2 Special-purpose registers 


Q, bit [27]
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some
instructions.

IT[1:0], bits [26:25]
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field.

J, bit [24]
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:22]
Reserved, RES0.


SS, bit [21] 


Software step. Shows the value of PSTATE.SS immediately before the exception was taken. 


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
conditionally executed, by the position of the least significant 1 in this field. It also encodes
the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9]
Endianness state bit. Controls the load and store endianness for data accesses:
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.

M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0]
AArch32 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0111  Abort
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped
registers and translation table entries on page K1-5492.


When exception taken from AArch64: 


Bits   31  30  29  28  27:22  21  20  19:10  9  8  7  6  5     4     3:0
Field  N   Z   C   V   RES0   SS  IL  RES0   D  A  I  F  RES0  M[4]  M[3:0]

An exception return from EL1 using AArch64 makes SPSR_EL1 become UNKNOWN.


N, bit [31] 
Set to the value of the N condition flag on taking an exception to EL1, and copied to the N condition 
flag on executing an exception return operation in EL1. 

Z, bit [30] 
Set to the value of the Z condition flag on taking an exception to EL1, and copied to the Z condition 
flag on executing an exception return operation in EL1. 

C, bit [29] 
Set to the value of the C condition flag on taking an exception to EL1, and copied to the C condition 
flag on executing an exception return operation in EL1. 

V, bit [28] 


Set to the value of the V condition flag on taking an exception to EL1, and copied to the V condition 
flag on executing an exception return operation in EL1. 


Bits [27:22]
Reserved, RES0.

SS, bit [21]
Software step. Shows the value of PSTATE.SS immediately before the exception was taken.

IL, bit [20]
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was
taken.

Bits [19:10]
Reserved, RES0.





D, bit [9]
Process state D mask. The possible values of this bit are:
0 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
level are not masked.
1 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
level are masked.

When the target Exception level of the debug exception is higher than the current Exception level,
the exception is not masked by this bit.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

Bit [5]
Reserved, RES0.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
0 Exception taken from AArch64.

M[3:0], bits [3:0]
AArch64 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  EL0t
0b0100  EL1t
0b0101  EL1h

Other values are reserved, and returning to an Exception level that is using AArch64 with a reserved
value in this field is treated as an illegal exception return.

The bits in this field are interpreted as follows:
• M[3:2] holds the Exception Level.
• M[1] is unused and is RES0 for all non-reserved values.
• M[0] is used to select the SP:
  - 0 means the SP is always SP0.
  - 1 means the exception SP is determined by the EL.
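The interpretation of M[3:0] and the mask bits in the AArch64 layout can be sketched as a decoder. The function name, the `holder_el` parameter (the Exception level that owns the SPSR, for example 1 for SPSR_EL1), and the returned keys are hypothetical. The sketch treats any encoding that is not listed for that register as reserved and reports it as `None`.

```python
def decode_spsr_aarch64(spsr: int, holder_el: int) -> dict:
    """Decode the mode and D, A, I, F masks of an AArch64-format SPSR.

    holder_el -- Exception level of the SPSR being decoded (1, 2, or 3).
    """
    if (spsr >> 4) & 1:
        raise ValueError("M[4] is 1: the exception was taken from AArch32")

    m = spsr & 0xF
    el = m >> 2                  # M[3:2] holds the Exception level
    uses_sp_elx = m & 1          # M[0]: 0 selects SP0, 1 selects the SP of that level

    reserved = (
        bool(m & 0b10)                       # M[1] must be 0
        or (el == 0 and uses_sp_elx == 1)    # only EL0t is listed for EL0
        or el > holder_el                    # levels above the holder are not listed
    )
    mode = None if reserved else f"EL{el}{'h' if uses_sp_elx else 't'}"

    return {
        "mode": mode,
        "D": (spsr >> 9) & 1,
        "A": (spsr >> 8) & 1,
        "I": (spsr >> 7) & 1,
        "F": (spsr >> 6) & 1,
    }
```

For example, `decode_spsr_aarch64(0x3C5, 1)` reports mode `"EL1h"` with all four masks set, while `decode_spsr_aarch64(0x9, 1)` reports `None` because EL2h is not a listed value for SPSR_EL1.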


Accessing the SPSR_EL1: 
To access the SPSR_EL1:

MRS <Xt>, SPSR_EL1 ; Read SPSR_EL1 into Xt
MSR SPSR_EL1, <Xt> ; Write Xt to SPSR_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0100  0000  000

C5.2.18 SPSR_EL2, Saved Program Status Register (EL2)
The SPSR_EL2 characteristics are: 
Purpose 
Holds the saved process state when an exception is taken to EL2. 
Usage constraints 
This register is accessible as follows: 
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register SPSR_EL2 is architecturally mapped to AArch32 System register 
SPSR_hyp. 
Attributes 
SPSR_EL2 is a 32-bit register. 
Field descriptions 
The SPSR_EL2 bit assignments are: 
When exception taken from AArch32: 
Bits   31  30  29  28  27  26:25    24  23:22  21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field  N   Z   C   V   Q   IT[1:0]  J   RES0   SS  IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]
An exception return from EL2 using AArch64 makes SPSR_EL2 become UNKNOWN. 
N, bit [31] 
Set to the value of CPSR.N on taking an exception to Hyp mode, and copied to CPSR.N on 
executing an exception return operation in Hyp mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Hyp mode, and copied to CPSR.Z on executing 
an exception return operation in Hyp mode. 
C, bit [29] 
Set to the value of CPSR.C on taking an exception to Hyp mode, and copied to CPSR.C on 
executing an exception return operation in Hyp mode. 
V, bit [28]
Set to the value of CPSR.V on taking an exception to Hyp mode, and copied to CPSR.V on
executing an exception return operation in Hyp mode.

Q, bit [27]
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some
instructions.

IT[1:0], bits [26:25]
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field.

J, bit [24]
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:22]
Reserved, RES0.

SS, bit [21]
Software step. Shows the value of PSTATE.SS immediately before the exception was taken.

IL, bit [20]
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was
taken.


GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 


IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
conditionally executed, by the position of the least significant 1 in this field. It also encodes
the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9]
Endianness state bit. Controls the load and store endianness for data accesses:
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.

M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.


M[3:0], bits [3:0]
AArch32 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0111  Abort
0b1010  Hyp
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped
registers and translation table entries on page K1-5492.


When exception taken from AArch64: 


Bits   31  30  29  28  27:22  21  20  19:10  9  8  7  6  5     4     3:0
Field  N   Z   C   V   RES0   SS  IL  RES0   D  A  I  F  RES0  M[4]  M[3:0]

An exception return from EL2 using AArch64 makes SPSR_EL2 become UNKNOWN.


N, bit [31] 
Set to the value of the N condition flag on taking an exception to EL2, and copied to the N condition 
flag on executing an exception return operation in EL2. 

Z, bit [30] 
Set to the value of the Z condition flag on taking an exception to EL2, and copied to the Z condition 
flag on executing an exception return operation in EL2. 

C, bit [29] 
Set to the value of the C condition flag on taking an exception to EL2, and copied to the C condition 
flag on executing an exception return operation in EL2. 

V, bit [28] 


Set to the value of the V condition flag on taking an exception to EL2, and copied to the V condition 
flag on executing an exception return operation in EL2. 


Bits [27:22]
Reserved, RES0.

SS, bit [21]
Software step. Shows the value of PSTATE.SS immediately before the exception was taken.

IL, bit [20]
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was
taken.

Bits [19:10]
Reserved, RES0.





D, bit [9]
Process state D mask. The possible values of this bit are:
0 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
level are not masked.
1 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
level are masked.

When the target Exception level of the debug exception is higher than the current Exception level,
the exception is not masked by this bit.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

Bit [5]
Reserved, RES0.

M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
0 Exception taken from AArch64.

M[3:0], bits [3:0]
AArch64 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  EL0t
0b0100  EL1t
0b0101  EL1h
0b1000  EL2t
0b1001  EL2h

Other values are reserved, and returning to an Exception level that is using AArch64 with a reserved
value in this field is treated as an illegal exception return.

The bits in this field are interpreted as follows:
• M[3:2] holds the Exception Level.
• M[1] is unused and is RES0 for all non-reserved values.
• M[0] is used to select the SP:
  - 0 means the SP is always SP0.
  - 1 means the exception SP is determined by the EL.


Accessing the SPSR_EL2: 
To access the SPSR_EL2: 


MRS <Xt>, SPSR_EL2 ; Read SPSR_EL2 into Xt 
MSR SPSR_EL2, <Xt> ; Write Xt to SPSR_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  0100  0000  000

C5.2.19 SPSR_EL3, Saved Program Status Register (EL3)
The SPSR_EL3 characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to EL3. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       -        RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SPSR_EL3 can be mapped to AArch32 System register SPSR_mon, but 
this is not architecturally mandated. 


Attributes 
SPSR_EL3 is a 32-bit register. 


Field descriptions 


The SPSR_EL3 bit assignments are: 


When exception taken from AArch32: 


Bits   31  30  29  28  27  26:25    24  23:22  21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field  N   Z   C   V   Q   IT[1:0]  J   RES0   SS  IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]


An exception return from EL3 using AArch64 makes SPSR_EL3 become UNKNOWN. 





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Monitor mode, and copied to CPSR.N on 
executing an exception return operation in Monitor mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Monitor mode, and copied to CPSR.Z on 
executing an exception return operation in Monitor mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to Monitor mode, and copied to CPSR.C on 
executing an exception return operation in Monitor mode. 

V, bit [28]
Set to the value of CPSR.V on taking an exception to Monitor mode, and copied to CPSR.V on
executing an exception return operation in Monitor mode.

Q, bit [27]
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some
instructions.

IT[1:0], bits [26:25]
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field.

J, bit [24]
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:22]
Reserved, RES0.


SS, bit [21] 


Software step. Shows the value of PSTATE.SS immediately before the exception was taken. 


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
conditionally executed, by the position of the least significant 1 in this field. It also encodes
the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9]
Endianness state bit. Controls the load and store endianness for data accesses:
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.

M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.


M[3:0], bits [3:0]
AArch32 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0110  Monitor
0b0111  Abort
0b1010  Hyp
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped
registers and translation table entries on page K1-5492.


When exception taken from AArch64: 


Bits   31  30  29  28  27:22  21  20  19:10  9  8  7  6  5     4     3:0
Field  N   Z   C   V   RES0   SS  IL  RES0   D  A  I  F  RES0  M[4]  M[3:0]

An exception return from EL3 using AArch64 makes SPSR_EL3 become UNKNOWN.


N, bit [31] 
Set to the value of the N condition flag on taking an exception to EL3, and copied to the N condition 
flag on executing an exception return operation in EL3. 

Z, bit [30] 
Set to the value of the Z condition flag on taking an exception to EL3, and copied to the Z condition 
flag on executing an exception return operation in EL3. 

C, bit [29] 
Set to the value of the C condition flag on taking an exception to EL3, and copied to the C condition 
flag on executing an exception return operation in EL3. 

V, bit [28] 


Set to the value of the V condition flag on taking an exception to EL3, and copied to the V condition 
flag on executing an exception return operation in EL3. 


Bits [27:22]
Reserved, RES0.

SS, bit [21]
Software step. Shows the value of PSTATE.SS immediately before the exception was taken.

IL, bit [20]
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was
taken.

Bits [19:10]
Reserved, RES0.





D, bit [9]
Process state D mask. The possible values of this bit are:
0 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
level are not masked.
1 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception
level are masked.

When the target Exception level of the debug exception is higher than the current Exception level,
the exception is not masked by this bit.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

Bit [5]
Reserved, RES0.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
0 Exception taken from AArch64.

M[3:0], bits [3:0]
AArch64 mode that an exception was taken from. The possible values are:

M[3:0]  Mode
0b0000  EL0t
0b0100  EL1t
0b0101  EL1h
0b1000  EL2t
0b1001  EL2h
0b1100  EL3t
0b1101  EL3h

Other values are reserved, and returning to an Exception level that is using AArch64 with a reserved
value in this field is treated as an illegal exception return.

The bits in this field are interpreted as follows:
• M[3:2] holds the Exception Level.
• M[1] is unused and is RES0 for all non-reserved values.
• M[0] is used to select the SP:
  - 0 means the SP is always SP0.
  - 1 means the exception SP is determined by the EL.
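Going the other way, software that prepares an exception return to an AArch64 Exception level has to compose an SPSR value from these fields. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, only the D, A, I, F, and M[3:0] fields are populated, and all other bits (including the condition flags, SS, and IL) are left as 0.

```python
def make_spsr_aarch64(target_el: int, use_sp_elx: bool,
                      d: int = 0, a: int = 0, i: int = 0, f: int = 0) -> int:
    """Compose an AArch64-format SPSR value for a return to target_el.

    use_sp_elx -- False selects SP0 (ELxt), True selects the SP of
                  target_el (ELxh).
    d, a, i, f -- mask bits to restore, each 0 or 1.
    """
    if target_el not in (0, 1, 2, 3):
        raise ValueError("target Exception level must be 0 to 3")
    if target_el == 0 and use_sp_elx:
        raise ValueError("only EL0t is a listed encoding for EL0")
    for bit in (d, a, i, f):
        if bit not in (0, 1):
            raise ValueError("mask bits must be 0 or 1")

    # M[4] = 0 (AArch64), M[3:2] = Exception level, M[1] = 0, M[0] = SP select.
    m = (target_el << 2) | (1 if use_sp_elx else 0)
    return (d << 9) | (a << 8) | (i << 7) | (f << 6) | m
```

For example, a return to EL1h with all four exceptions masked gives `0x3C5`, and the same for EL3h gives `0x3CD`.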


Accessing the SPSR_EL3: 
To access the SPSR_EL3: 


MRS <Xt>, SPSR_EL3 ; Read SPSR_EL3 into Xt 
MSR SPSR_EL3, <Xt> ; Write Xt to SPSR_EL3 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   110  0100  0000  000

C5.2.20 SPSR_fiq, Saved Program Status Register (FIQ mode)


The SPSR_fiq characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to FIQ mode. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SPSR_fiq is architecturally mapped to AArch32 System register 
SPSR_fiq. 


If EL1 does not support execution in AArch32, this register is RES0.


Attributes 
SPSR_fiq is a 32-bit register. 


Field descriptions 


The SPSR_fiq bit assignments are: 


Bits   31  30  29  28  27  26:25    24  23:21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field  N   Z   C   V   Q   IT[1:0]  J   RES0   IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to FIQ mode, and copied to CPSR.N on 
executing an exception return operation in FIQ mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to FIQ mode, and copied to CPSR.Z on executing 
an exception return operation in FIQ mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to FIQ mode, and copied to CPSR.C on executing 
an exception return operation in FIQ mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to FIQ mode, and copied to CPSR.V on 
executing an exception return operation in FIQ mode. 

Q, bit [27] 
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some 
instructions. 


IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21] 


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 
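Because the IT field is split across two places in the saved register, software that inspects it has to reassemble the two parts. The following C sketch shows one way to do this. The function names are hypothetical, and the count of remaining instructions assumes the usual ITSTATE convention, in which a least significant 1 at IT[0] means four instructions remain and a least significant 1 at IT[3] means one remains.

```c
#include <assert.h>
#include <stdint.h>

/* Reassemble the 8-bit IT state from an AArch32-format SPSR value. */
static uint8_t spsr_it_state(uint32_t spsr)
{
    uint32_t it_hi = (spsr >> 10) & 0x3Fu;  /* IT[7:2] from bits [15:10] */
    uint32_t it_lo = (spsr >> 25) & 0x3u;   /* IT[1:0] from bits [26:25] */
    return (uint8_t)((it_hi << 2) | it_lo);
}

/* IT[7:5]: the top 3 bits of the base condition code. */
static unsigned it_base_condition(uint8_t it)
{
    return it >> 5;
}

/* Instructions remaining in the IT block; 0 when no IT block is active. */
static unsigned it_block_remaining(uint8_t it)
{
    unsigned size_bits = it & 0x1Fu;        /* IT[4:0] */
    unsigned pos = 0;
    if (size_bits == 0)
        return 0;
    while (((size_bits >> pos) & 1u) == 0)  /* find the least significant 1 */
        pos++;
    assert(pos <= 4);  /* size_bits is nonzero and only 5 bits wide */
    return 4 - pos;
}
```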


E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation. 
Instruction fetches ignore this bit. 


When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also 
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset. 


If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.

If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.

Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.


A, bit [8] 
SError interrupt mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.


F, bit [6] 
FIQ mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked. 
T, bit [5] 
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was 
taken from. Possible values of this bit are: 
0 Taken from A32 state. 
1 Taken from T32 state. 
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0]


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]    Mode
0b0000    User
0b0001    FIQ
0b0010    IRQ
0b0011    Supervisor
0b0110    Monitor
0b0111    Abort
0b1010    Hyp
0b1011    Undefined
0b1111    System





Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


Accessing the SPSR_fiq: 
To access the SPSR_fiq: 


MRS <Xt>, SPSR_fiq ; Read SPSR_fiq into Xt 
MSR SPSR_fiq, <Xt> ; Write Xt to SPSR_fiq


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     100    0100    0011    011


C5.2.21 SPSR_irq, Saved Program Status Register (IRQ mode) 
The SPSR_irq characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to IRQ mode. 


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      -          -         RW         RW               RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SPSR_irq is architecturally mapped to AArch32 System register 
SPSR_irq. 


If EL1 does not support execution in AArch32, this register is RES0.


Attributes 
SPSR_irq is a 32-bit register. 


Field descriptions 


The SPSR_irq bit assignments are: 


Bits     31  30  29  28  27  26:25    24  23:21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field    N   Z   C   V   Q   IT[1:0]  J   RES0   IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to IRQ mode, and copied to CPSR.N on 
executing an exception return operation in IRQ mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to IRQ mode, and copied to CPSR.Z on executing 
an exception return operation in IRQ mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to IRQ mode, and copied to CPSR.C on 
executing an exception return operation in IRQ mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to IRQ mode, and copied to CPSR.V on 
executing an exception return operation in IRQ mode. 

Q, bit [27] 
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some 
instructions. 


IT[1:0], bits [26:25]
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field.

J, bit [24]
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]
Reserved, RES0.

IL, bit [20]
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was
taken.


GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10]
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts.

• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
conditionally executed, by the position of the least significant 1 in this field. It also encodes
the value of the least significant bit of the condition code for each instruction in the block.

The IT field is 0b00000000 when no IT block is active.

E, bit [9]
Endianness state bit. Controls the load and store endianness for data accesses:
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.

When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.

If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.

If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.

Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.


F, bit [6] 
FIQ mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked. 
T, bit [5] 
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was 
taken from. Possible values of this bit are: 
0 Taken from A32 state. 
1 Taken from T32 state. 
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0]


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]    Mode
0b0000    User
0b0001    FIQ
0b0010    IRQ
0b0011    Supervisor
0b0110    Monitor
0b0111    Abort
0b1010    Hyp
0b1011    Undefined
0b1111    System





Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


Accessing the SPSR_irq: 
To access the SPSR_irq: 


MRS <Xt>, SPSR_irq ; Read SPSR_irq into Xt 
MSR SPSR_irq, <Xt> ; Write Xt to SPSR_irq


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     100    0100    0011    000


C5.2.22 SPSR_und, Saved Program Status Register (Undefined mode) 
The SPSR_und characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to Undefined mode. 


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      -          -         RW         RW               RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SPSR_und is architecturally mapped to AArch32 System register 
SPSR_und. 


If EL1 does not support execution in AArch32, this register is RES0.


Attributes 
SPSR_und is a 32-bit register. 


Field descriptions 


The SPSR_und bit assignments are: 


Bits     31  30  29  28  27  26:25    24  23:21  20  19:16  15:10    9  8  7  6  5  4     3:0
Field    N   Z   C   V   Q   IT[1:0]  J   RES0   IL  GE     IT[7:2]  E  A  I  F  T  M[4]  M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Undefined mode, and copied to CPSR.N on 
executing an exception return operation in Undefined mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Undefined mode, and copied to CPSR.Z on 
executing an exception return operation in Undefined mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to Undefined mode, and copied to CPSR.C on 
executing an exception return operation in Undefined mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Undefined mode, and copied to CPSR.V on 
executing an exception return operation in Undefined mode. 

Q, bit [27] 
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some 
instructions. 


IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21] 


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 


E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation. 
Instruction fetches ignore this bit. 


When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also 
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset. 


If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.

If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.

Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.


A, bit [8] 
SError interrupt mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.


F, bit [6] 
FIQ mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked. 
T, bit [5] 
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was 
taken from. Possible values of this bit are: 
0 Taken from A32 state. 
1 Taken from T32 state. 
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0]


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]    Mode
0b0000    User
0b0001    FIQ
0b0010    IRQ
0b0011    Supervisor
0b0110    Monitor
0b0111    Abort
0b1010    Hyp
0b1011    Undefined
0b1111    System





Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


Accessing the SPSR_und: 
To access the SPSR_und: 


MRS <Xt>, SPSR_und ; Read SPSR_und into Xt 
MSR SPSR_und, <Xt> ; Write Xt to SPSR_und 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     100    0100    0011    010
C5.3 A64 system instructions for cache maintenance


The following sections define the cache maintenance system instructions in A64: 


DC CISW, Data or unified Cache line Clean and Invalidate by Set/Way on page C5-348. 
DC CIVAC, Data or unified Cache line Clean and Invalidate by VA to PoC on page C5-350. 
DC CSW, Data or unified Cache line Clean by Set/Way on page C5-351. 

DC CVAC, Data or unified Cache line Clean by VA to PoC on page C5-353. 

DC CVAU, Data or unified Cache line Clean by VA to PoU on page C5-354. 

DC ISW, Data or unified Cache line Invalidate by Set/Way on page C5-355. 

DC IVAC, Data or unified Cache line Invalidate by VA to PoC on page C5-357. 

DC ZVA, Data Cache Zero by VA on page C5-359. 

IC IALLU, Instruction Cache Invalidate All to PoU on page C5-361. 

IC IALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable on page C5-362. 
IC IVAU, Instruction Cache line Invalidate by VA to PoU on page C5-363. 


C5.3.1 DC CISW, Data or unified Cache line Clean and Invalidate by Set/Way 
The DC CISW characteristics are: 


Purpose 


Clean and Invalidate data cache by set/way. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      WO         WO        WO         WO               WO





If this instruction is executed with a set, way or level argument that is larger than the value supported 
by the implementation then the behavior is CONSTRAINED UNPREDICTABLE and one of the following 
occurs: 


• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
— No cache lines.
— A single arbitrary cache line.
— Multiple arbitrary cache lines.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TSW==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


AArch64 System instruction DC CISW performs the same function as AArch32 System instruction 
DCCISW. 


Attributes 
DC CISW is a 64-bit System instruction. 


Field descriptions 


The DC CISW input value bit assignments are: 


Bits     63:32    31:4      3:1      0
Field    RES0     SetWay    Level    RES0


Bits [63:32] 
Reserved, RES0.
SetWay, bits [31:4] 
Contains two fields: 
• Way, bits[31:32-A], the number of the way to operate on.
• Set, bits[B-1:L], the number of the set to operate on.
Bits[L-1:4] are RES0.
A = Log2(ASSOCIATIVITY), L = Log2(LINELEN), B = (L + S), S = Log2(NSETS).


ASSOCIATIVITY, LINELEN (line length, in bytes), and NSETS (number of sets) have their usual 
meanings and are the values for the cache level being operated on. The values of A and S are 
rounded up to the next integer. 


Level, bits [3:1] 


Cache level to operate on, minus 1. For example, this field is 0 for operations on L1 cache, or 1 for 
operations on L2 cache. 


Bit [0] 


Reserved, RES0.
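The field layout above can be illustrated with a short C sketch that builds the <Xt> input value for a set/way operation. The function and parameter names are hypothetical. The sketch assumes that the caller already knows ASSOCIATIVITY, NSETS, and LINELEN for the cache level being operated on, and that level is given as a 1-based cache level.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n with 2^n >= x, that is, Log2(x) rounded up; x must be nonzero. */
static unsigned ceil_log2(uint32_t x)
{
    unsigned n = 0;
    assert(x != 0);
    while (((uint64_t)1 << n) < x)
        n++;
    return n;
}

/* Build the input value: Way in [31:32-A], Set in [B-1:L], Level-1 in [3:1]. */
static uint64_t dc_setway_operand(uint32_t way, uint32_t set, unsigned level,
                                  uint32_t associativity, uint32_t nsets,
                                  uint32_t linelen)
{
    unsigned a = ceil_log2(associativity);  /* A */
    unsigned l = ceil_log2(linelen);        /* L */
    uint64_t value = 0;

    assert(way < associativity && set < nsets);
    assert(level >= 1 && level <= 7);

    if (a != 0)                             /* A == 0: the Way field has no bits */
        value |= (uint64_t)way << (32 - a);
    value |= (uint64_t)set << l;
    value |= (uint64_t)(level - 1) << 1;    /* the field holds the level minus 1 */
    return value;
}
```

For a 4-way cache with 256 sets and 64-byte lines, A is 2 and L is 6, so way 3, set 5 at cache level 1 gives the value 0xC0000140.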


Executing the DC CISW instruction: 
The DC CISW instruction is executed as: 
DC CISW, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     000    0111    1110    010


C5.3.2 DC CIVAC, Data or unified Cache line Clean and Invalidate by VA to PoC 
The DC CIVAC characteristics are: 


Purpose 


Clean and Invalidate data cache by address to Point of Coherency. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0          EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
Config-WO    WO         WO        WO         WO               WO





If EL0 access is enabled, this instruction is available at EL0 when the VA has read access
permission, otherwise it causes a Permission Fault.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TPC==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HCR_EL2.TPC==1, and SCTLR_EL1.UCI==1, Non-secure execution of this instruction
at EL0 is trapped to EL2.
• If SCTLR_EL1.UCI==0, execution of this instruction at EL0 is trapped to EL1.
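The three trap conditions listed above can be modelled as a small decision function. The following C sketch is an illustration only: the names are hypothetical, it assumes that EL2 is implemented, and it considers only the controls named in this list, not any other trap or prioritization rule.

```c
#include <assert.h>
#include <stdbool.h>

enum trap_target { NO_TRAP, TRAP_TO_EL1, TRAP_TO_EL2 };

/* Where a DC CIVAC executed at Exception level 'el' is trapped, if anywhere. */
static enum trap_target dc_civac_trap(unsigned el, bool non_secure,
                                      bool hcr_el2_tpc, bool sctlr_el1_uci)
{
    assert(el <= 3);
    /* EL0 execution with SCTLR_EL1.UCI == 0 is trapped to EL1. */
    if (el == 0 && !sctlr_el1_uci)
        return TRAP_TO_EL1;
    /* Non-secure EL1 execution, or Non-secure EL0 execution with UCI == 1,
     * is trapped to EL2 when HCR_EL2.TPC == 1. */
    if (non_secure && hcr_el2_tpc && el <= 1)
        return TRAP_TO_EL2;
    return NO_TRAP;
}
```

The EL0 check comes first because the list only routes EL0 execution to EL2 when SCTLR_EL1.UCI is 1.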


Configurations 


AArch64 System instruction DC CIVAC performs the same function as AArch32 System 
instruction DCCIMVAC. 


Attributes 
DC CIVAC is a 64-bit System instruction. 


Field descriptions 


The DC CIVAC input value bit assignments are: 


63 0 


Virtual address to use 


Bits [63:0] 


Virtual address to use. 


Executing the DC CIVAC instruction: 
The DC CIVAC instruction is executed as: 
DC CIVAC, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     011    0111    1110    001


C5.3.3 DC CSW, Data or unified Cache line Clean by Set/Way 
The DC CSW characteristics are: 


Purpose 


Clean data cache by set/way. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      WO         WO        WO         WO               WO





If this instruction is executed with a set, way or level argument that is larger than the value supported 
by the implementation then the behavior is CONSTRAINED UNPREDICTABLE and one of the following 
occurs: 


• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
— No cache lines.
— A single arbitrary cache line.
— Multiple arbitrary cache lines.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TSW==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


AArch64 System instruction DC CSW performs the same function as AArch32 System instruction 
DCCSW. 


Attributes 
DC CSW is a 64-bit System instruction. 


Field descriptions 


The DC CSW input value bit assignments are: 


Bits     63:32    31:4      3:1      0
Field    RES0     SetWay    Level    RES0


Bits [63:32] 
Reserved, RES0.
SetWay, bits [31:4] 
Contains two fields: 
• Way, bits[31:32-A], the number of the way to operate on.
• Set, bits[B-1:L], the number of the set to operate on.
Bits[L-1:4] are RES0.
A = Log2(ASSOCIATIVITY), L = Log2(LINELEN), B = (L + S), S = Log2(NSETS).


ASSOCIATIVITY, LINELEN (line length, in bytes), and NSETS (number of sets) have their usual 
meanings and are the values for the cache level being operated on. The values of A and S are 
rounded up to the next integer. 


Level, bits [3:1] 


Cache level to operate on, minus 1. For example, this field is 0 for operations on L1 cache, or 1 for 
operations on L2 cache. 


Bit [0] 


Reserved, RES0.


Executing the DC CSW instruction: 
The DC CSW instruction is executed as: 
DC CSW, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     000    0111    1010    010


C5.3.4 DC CVAC, Data or unified Cache line Clean by VA to PoC 
The DC CVAC characteristics are: 


Purpose 


Clean data cache by address to Point of Coherency. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0          EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
Config-WO    WO         WO        WO         WO               WO





If EL0 access is enabled, this instruction is available at EL0 when the VA has read access
permission, otherwise it causes a Permission Fault.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TPC==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HCR_EL2.TPC==1, and SCTLR_EL1.UCI==1, Non-secure execution of this instruction
at EL0 is trapped to EL2.
• If SCTLR_EL1.UCI==0, execution of this instruction at EL0 is trapped to EL1.


Configurations 


AArch64 System instruction DC CVAC performs the same function as AArch32 System instruction 
DCCMVAC. 


Attributes 
DC CVAC is a 64-bit System instruction. 


Field descriptions 


The DC CVAC input value bit assignments are: 


63 0 


Virtual address to use 


Bits [63:0] 


Virtual address to use. 


Executing the DC CVAC instruction: 
The DC CVAC instruction is executed as: 
DC CVAC, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     011    0111    1010    001


C5.3.5 DC CVAU, Data or unified Cache line Clean by VA to PoU 
The DC CVAU characteristics are: 


Purpose 


Clean data cache by address to Point of Unification. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0          EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
Config-WO    WO         WO        WO         WO               WO





If EL0 access is enabled, this instruction is available at EL0 when the VA has read access
permission, otherwise it causes a Permission Fault.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL0 and EL1 is trapped
to EL2.
• If SCTLR_EL1.UCI==0, execution of this instruction at EL0 is trapped to EL1.


Configurations 


AArch64 System instruction DC CVAU performs the same function as AArch32 System instruction 
DCCMVAU. 


Attributes 
DC CVAU is a 64-bit System instruction. 


Field descriptions 


The DC CVAU input value bit assignments are: 


63 0 


Virtual address to use 


Bits [63:0] 


Virtual address to use. 


Executing the DC CVAU instruction: 
The DC CVAU instruction is executed as: 
DC CVAU, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     011    0111    1011    001


C5.3.6 DC ISW, Data or unified Cache line Invalidate by Set/Way 
The DC ISW characteristics are: 


Purpose 


Invalidate data cache by set/way. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      WO         WO        WO         WO               WO





At EL1, this operation must operate as DC CISW if all of the following apply:
• EL2 is implemented.
• The value of HCR_EL2.SWIO is 1 or the value of HCR_EL2.VM is 1.
• The value of SCR_EL3.NS is 1 or EL3 is not implemented.


If this instruction is executed with a set, way or level argument that is larger than the value supported 
by the implementation then the behavior is CONSTRAINED UNPREDICTABLE and one of the following 
occurs: 


• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
— No cache lines.
— A single arbitrary cache line.
— Multiple arbitrary cache lines.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TSW==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


AArch64 System instruction DC ISW performs the same function as AArch32 System instruction 
DCISW. 


Attributes 
DC ISW is a 64-bit System instruction. 


Field descriptions 


The DC ISW input value bit assignments are: 


Bits     63:32    31:4      3:1      0
Field    RES0     SetWay    Level    RES0


Bits [63:32] 


Reserved, RES0.
SetWay, bits [31:4]
Contains two fields:
• Way, bits[31:32-A], the number of the way to operate on.
• Set, bits[B-1:L], the number of the set to operate on.
Bits[L-1:4] are RES0.
A = Log2(ASSOCIATIVITY), L = Log2(LINELEN), B = (L + S), S = Log2(NSETS).


ASSOCIATIVITY, LINELEN (line length, in bytes), and NSETS (number of sets) have their usual 
meanings and are the values for the cache level being operated on. The values of A and S are 
rounded up to the next integer. 


Level, bits [3:1] 


Cache level to operate on, minus 1. For example, this field is 0 for operations on L1 cache, or 1 for 
operations on L2 cache. 


Bit [0] 
Reserved, RES0.


Executing the DC ISW instruction: 
The DC ISW instruction is executed as: 
DC ISW, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     000    0111    0110    010


C5.3.7 DC IVAC, Data or unified Cache line Invalidate by VA to PoC 
The DC IVAC characteristics are: 


Purpose 


Invalidate data cache by address to Point of Coherency. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      WO         WO        WO         WO               WO





This instruction requires write access permission to the VA, otherwise it causes a Permission Fault. 


When the instruction is executed, it can generate a watchpoint, which is prioritized in the same way 
as other watchpoints. If a watchpoint is generated, the CM bit in the ESR_ELx.ISS field is set to 1. 


At EL1, this instruction must be performed as DC CIVAC if all of the following apply:
• EL2 is implemented.
• HCR_EL2.VM is set to 1.
• SCR_EL3.NS is set to 1 or EL3 is not implemented.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TPC==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


AArch64 System instruction DC IVAC performs the same function as AArch32 System instruction 
DCIMVAC. 


Attributes 
DC IVAC is a 64-bit System instruction. 


Field descriptions 


The DC IVAC input value bit assignments are: 


63 0 


Virtual address to use 


Bits [63:0] 


Virtual address to use. 


Executing the DC IVAC instruction: 
The DC IVAC instruction is executed as: 


DC IVAC, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0    op1    CRn     CRm     op2
01     000    0111    0110    001





When the instruction is executed, it can generate a watchpoint, which is prioritized in the same way as other 
watchpoints. If a watchpoint is generated, the CM bit in the ESR_ELx.ISS field is set to 1. 


C5.3.8 DC ZVA, Data Cache Zero by VA 
The DC ZVA characteristics are: 


Purpose 


Zero data cache by address. Zeroes a naturally aligned block of N bytes, where the size of N is
identified in DCZID_EL0.


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0          EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
Config-WO    WO         WO        WO         WO               WO





When the value of SCTLR_EL1.DZE is 0 this instruction is UNDEFINED at EL0.


When this instruction is executed, it can generate memory faults or watchpoints which are 
prioritized in the same way as other memory-related faults or watchpoints. If a synchronous data 
abort fault or a watchpoint is generated, the CM bit in the ESR_ELx.ISS field is not set. 


If the memory region being zeroed is any type of Device memory, this instruction can give an 
alignment fault which is prioritized in the same way as other alignment faults that are determined 
by the memory type. 


This instruction applies to Normal memory regardless of cacheability attributes. 
This instruction behaves as a set of Stores to each byte within the block being accessed, and so it: 
• Will cause a Permission Fault if the translation system does not permit writes to the locations.
• Requires the same considerations for ordering and the management of coherency as any other
store instructions.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TDZ==1, Non-secure execution of this instruction at EL0 and EL1 is trapped
to EL2.

• If SCTLR_EL1.DZE==0, execution of this instruction at EL0 is trapped to EL1.


Configurations 


There are no configuration notes. 


Attributes 
DC ZVA is a 64-bit System instruction. 


Field descriptions 


The DC ZVA input value bit assignments are: 


63 0 


Virtual address to use 





Bits [63:0] 
Virtual address to use. There is no alignment restriction on the address within the block of N bytes 
that is used. 
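
As a rough illustration of how any address inside the block selects the same block, the following Python sketch computes the naturally aligned range that a DC ZVA to a given VA would zero. It assumes the DCZID_EL0 layout described in the register description elsewhere in this manual (BS in bits [3:0], holding log2 of the block size in words, and DZP in bit [4]). The function names are hypothetical, and the code models only the address arithmetic, not the instruction itself.

```python
def dc_zva_block_size(dczid_el0: int) -> int:
    """Return the DC ZVA block size N in bytes, or 0 if DC ZVA is prohibited.

    Assumed layout: BS = bits [3:0], log2 of the block size in 4-byte words;
    DZP = bit [4], set when use of DC ZVA is prohibited.
    """
    if (dczid_el0 >> 4) & 1:
        return 0
    bs = dczid_el0 & 0xF
    return 4 << bs


def dc_zva_block(va: int, dczid_el0: int) -> range:
    """Return the byte addresses that DC ZVA would zero for the given VA."""
    n = dc_zva_block_size(dczid_el0)
    if n == 0:
        raise ValueError("DC ZVA is prohibited (DCZID_EL0.DZP == 1)")
    base = va & ~(n - 1)  # round down to the naturally aligned block base
    return range(base, base + n)


# Hypothetical example: BS = 4 gives a 64-byte block.
block = dc_zva_block(0x8000_1234, dczid_el0=0x4)
print(hex(block.start), len(block))
```

Software that zeroes a larger buffer would typically step through it in units of N, issuing one DC ZVA per block.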


Executing the DC ZVA instruction: 
The DC ZVA instruction is executed as: 
DC ZVA, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    011   0111   0100   001
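
The field values in an encoding table like this one can be combined into a 32-bit instruction word. The sketch below assumes the A64 SYS instruction layout: fixed bits [31:22] = 0b1101010100, bit [21] = 0, op0 in bits [20:19], op1 in bits [18:16], CRn in bits [15:12], CRm in bits [11:8], op2 in bits [7:5], and Rt in bits [4:0]. The helper name encode_sys is hypothetical, and defaulting Rt to 31 for operations that ignore the register is an assumption about common assembler behavior.

```python
def encode_sys(op0: int, op1: int, crn: int, crm: int, op2: int, rt: int = 31) -> int:
    """Assemble a SYS-class instruction word from its encoding fields."""
    fields = (("op0", op0, 2), ("op1", op1, 3), ("CRn", crn, 4),
              ("CRm", crm, 4), ("op2", op2, 3), ("Rt", rt, 5))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return ((0b1101010100 << 22) | (op0 << 19) | (op1 << 16)
            | (crn << 12) | (crm << 8) | (op2 << 5) | rt)


# DC ZVA, X0, using the field values from the table above.
dc_zva_x0 = encode_sys(0b01, 0b011, 0b0111, 0b0100, 0b001, rt=0)
print(hex(dc_zva_x0))
```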





When this instruction is executed, it can generate memory faults or watchpoints which are prioritized in the same 
way as other memory-related faults or watchpoints. If a synchronous data abort fault or a watchpoint is generated, 
the CM bit in the ESR_ELx.ISS field is not set. 


If the memory region being zeroed is any type of Device memory, this instruction can give an alignment fault which 
is prioritized in the same way as other alignment faults that are determined by the memory type. 


This instruction applies to Normal memory regardless of cacheability attributes. 


This instruction behaves as a set of Stores to each byte within the block being accessed, and so it: 





• Generates a Permission Fault if the translation system does not permit writes to the locations.
• Requires the same considerations for ordering and the management of coherency as any other store
instructions.
C5.3.9 IC IALLU, Instruction Cache Invalidate All to PoU
The IC IALLU characteristics are: 
Purpose 
Invalidate all instruction caches to Point of Unification. 
Usage constraints 
This instruction can be executed at the following exception levels: 
EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
AArch64 System instruction IC IALLU performs the same function as AArch32 System instruction 
ICIALLU. 
Attributes 
IC IALLU is a 64-bit System instruction. 
Field descriptions 
IC IALLU ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 
Executing the IC IALLU instruction: 
The IC IALLU instruction is executed as: 
IC IALLU 
The instruction is encoded in the System instruction encoding space as follows: 
op0   op1   CRn    CRm    op2
01    000   0111   0101   000
C5.3.10 IC IALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable
The IC IALLUIS characteristics are: 


Purpose 


Invalidate all instruction caches in Inner Shareable domain to Point of Unification. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


AArch64 System instruction IC IALLUIS performs the same function as AArch32 System 
instruction ICIALLUIS. 


Attributes 
IC IALLUIS is a 64-bit System instruction. 


Field descriptions 


IC IALLUIS ignores the value in the register specified by the instruction. Software does not have to write a value 
to the register before issuing this instruction. 


Executing the IC IALLUIS instruction: 
The IC IALLUIS instruction is executed as: 


IC IALLUIS 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   0111   0001   000
C5.3.11 IC IVAU, Instruction Cache line Invalidate by VA to PoU


The IC IVAU characteristics are: 


Purpose 
Invalidate instruction cache by address to Point of Unification. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-WO   WO        WO       WO        WO              WO





If EL0 access is enabled, this instruction is available at EL0 when the VA has read access
permission. It is IMPLEMENTATION DEFINED whether executing this instruction at EL0 causes a
Permission Fault if the VA does not have read access permission at EL0.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL0 and EL1 is trapped
to EL2.

• If SCTLR_EL1.UCI==0, execution of this instruction at EL0 is trapped to EL1.


Configurations 


AArch64 System instruction IC IVAU performs the same function as AArch32 System instruction 
ICIMVAU. 


Attributes 
IC IVAU is a 64-bit System instruction. 


Field descriptions 


The IC IVAU input value bit assignments are: 


Virtual address to use 


Bits [63:0] 


Virtual address to use. 


Executing the IC IVAU instruction: 
The IC IVAU instruction is executed as: 
IC IVAU, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    011   0111   0101   001
If EL0 access is enabled, this instruction is available at EL0 when the VA has read access permission. It is
IMPLEMENTATION DEFINED whether executing this instruction at EL0 causes a Permission Fault if the VA does not
have read access permission at EL0.
C5.4 A64 system instructions for address translation


The following sections define the address translation system instructions in A64: 


AT S12E0R, Address Translate Stages 1 and 2 EL0 Read on page C5-366.
AT S12E0W, Address Translate Stages 1 and 2 EL0 Write on page C5-367.
AT S12E1R, Address Translate Stages 1 and 2 EL1 Read on page C5-368.
AT S12E1W, Address Translate Stages 1 and 2 EL1 Write on page C5-369.
AT S1E0R, Address Translate Stage 1 EL0 Read on page C5-370.
AT S1E0W, Address Translate Stage 1 EL0 Write on page C5-371.
AT S1E1R, Address Translate Stage 1 EL1 Read on page C5-372.
AT S1E1W, Address Translate Stage 1 EL1 Write on page C5-373.
AT S1E2R, Address Translate Stage 1 EL2 Read on page C5-374.
AT S1E2W, Address Translate Stage 1 EL2 Write on page C5-375.
AT S1E3R, Address Translate Stage 1 EL3 Read on page C5-376.
AT S1E3W, Address Translate Stage 1 EL3 Write on page C5-377.


C5.4.1 AT S12E0R, Address Translate Stages 1 and 2 EL0 Read
The AT S12E0R characteristics are:


Purpose 


Performs stage 1 and 2 address translations as defined for EL0, with permissions as if reading from
the given virtual address. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If EL2 does not exist, or stage 2 translation is disabled, or the instruction is executed at EL3 when 
the value of SCR_EL3.NS is 0, this operation executes as AT S1E0R.
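
The fallback rule above, which this section states in the same form for each of the stage 1 and 2 operations, can be sketched as a small selection function. This is only a model of the rule as written here: the function name and its boolean parameters are hypothetical, and how software would determine whether stage 2 translation is enabled is outside the sketch.

```python
def effective_at_op(op: str, el2_implemented: bool, stage2_enabled: bool,
                    current_el: int, scr_el3_ns: int) -> str:
    """Return the operation that an AT instruction executes as."""
    stage12_to_stage1 = {
        "S12E0R": "S1E0R", "S12E0W": "S1E0W",
        "S12E1R": "S1E1R", "S12E1W": "S1E1W",
    }
    if op not in stage12_to_stage1:
        return op  # stage 1 only operations are unaffected by this rule
    executed_at_secure_el3 = current_el == 3 and scr_el3_ns == 0
    if not el2_implemented or not stage2_enabled or executed_at_secure_el3:
        return stage12_to_stage1[op]
    return op


print(effective_at_op("S12E0R", el2_implemented=True, stage2_enabled=False,
                      current_el=2, scr_el3_ns=1))
```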


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S12E0R is a 64-bit System instruction.


Field descriptions 


The AT S12E0R input value bit assignments are: 


63 0 


Input address for translation 





Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.
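
A minimal sketch of how software might interpret the value read back from PAR_EL1 after an address translation instruction follows. It assumes the PAR_EL1 layout given in that register's description elsewhere in this manual: F in bit [0], the output address in bits [47:12] when F is 0, and a fault status code in bits [6:1] when F is 1. The function name is hypothetical, and taking the low 12 bits of the result from the input VA is an assumption about typical usage, not something stated in this section.

```python
PA_MASK = 0x0000_FFFF_FFFF_F000  # PAR_EL1 bits [47:12] (assumed layout)


def decode_par_el1(par: int, va: int) -> dict:
    """Interpret a PAR_EL1 value for the VA that was translated."""
    if par & 1:  # F == 1: the translation aborted
        return {"ok": False, "fst": (par >> 1) & 0x3F}
    # F == 0: combine the output address bits with the page offset of the VA.
    return {"ok": True, "pa": (par & PA_MASK) | (va & 0xFFF)}


# Hypothetical register value and input address.
result = decode_par_el1(0xFF00_0000_8004_5000, 0xFFFF_0000_0000_1ABC)
print(result["ok"], hex(result["pa"]))
```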

Executing the AT S12E0R instruction: 

The AT S12E0R instruction is executed as:

AT S12E0R, <Xt>


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   0111   1000   110


C5.4.2 AT S12E0W, Address Translate Stages 1 and 2 EL0 Write
The AT S12E0W characteristics are:


Purpose 


Performs stage 1 and 2 address translations as defined for EL0, with permissions as if writing to the
given virtual address. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If EL2 does not exist, or stage 2 translation is disabled, or the instruction is executed at EL3 when 
the value of SCR_EL3.NS is 0, this operation executes as AT S1E0W.


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S12E0W is a 64-bit System instruction. 


Field descriptions 


The AT S12E0W input value bit assignments are: 


63 0 


Input address for translation 





Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S12E0W instruction: 

The AT S12E0W instruction is executed as: 


AT S12E0W, <Xt>


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   0111   1000   111


C5.4.3 AT S12E1R, Address Translate Stages 1 and 2 EL1 Read 
The AT S12E1R characteristics are: 


Purpose 


Performs stage 1 and 2 address translations as defined for EL1, with permissions as if reading from 
the given virtual address. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If EL2 does not exist, or stage 2 translation is disabled, or the instruction is executed at EL3 when 
the value of SCR_EL3.NS is 0, this operation executes as AT S1E1R.


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S12E1R is a 64-bit System instruction.


Field descriptions 


The AT S12E1R input value bit assignments are: 


63 0 


Input address for translation 





Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S12E1R instruction: 

The AT S12E1R instruction is executed as: 


AT S12E1R, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   0111   1000   100


C5.4.4 AT S12E1W, Address Translate Stages 1 and 2 EL1 Write 
The AT S12E1W characteristics are: 


Purpose 


Performs stage 1 and 2 address translations as defined for EL1, with permissions as if writing to the 
given virtual address. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If EL2 does not exist, or stage 2 translation is disabled, or the instruction is executed at EL3 when 
the value of SCR_EL3.NS is 0, this operation executes as AT S1E1W.


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S12E1W is a 64-bit System instruction. 


Field descriptions 


The AT S12E1W input value bit assignments are:


63 0 


Input address for translation 





Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S12E1W instruction:

The AT S12E1W instruction is executed as: 


AT S12E1W, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   0111   1000   101
C5.4.5 AT S1E0R, Address Translate Stage 1 EL0 Read
The AT S1E0R characteristics are:

Purpose

Performs stage 1 address translation as defined for EL0, with permissions as if reading from the
given virtual address.

Usage constraints

This instruction can be executed at the following exception levels:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO

Traps and Enables

There are no traps or enables affecting this instruction.

Configurations

There are no configuration notes.

Attributes
AT S1E0R is a 64-bit System instruction.

Field descriptions

The AT S1E0R input value bit assignments are:

63 0

Input address for translation

Bits [63:0]
Input address for translation. The resulting address can be read from the PAR_EL1.

If the address translation instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E0R instruction:

The AT S1E0R instruction is executed as:

AT S1E0R, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    000   0111   1000   010
C5.4.6 AT S1E0W, Address Translate Stage 1 EL0 Write
The AT S1E0W characteristics are:


Purpose 
Performs stage 1 address translation as defined for EL0, with permissions as if writing to the given
virtual address. 

Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S1E0W is a 64-bit System instruction.


Field descriptions 


The AT S1E0W input value bit assignments are:


63 0 







Input address for translation 


Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E0W instruction: 

The AT S1E0W instruction is executed as:

AT S1E0W, <Xt>


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   0111   1000   011








ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C5-371 
1ID092916 Non-Confidential 


C5.4.7 AT S1E1R, Address Translate Stage 1 EL1 Read
The AT S1E1R characteristics are:

Purpose

Performs stage 1 address translation as defined for EL1, with permissions as if reading from the
given virtual address.

Usage constraints

This instruction can be executed at the following exception levels:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO

Traps and Enables

There are no traps or enables affecting this instruction.

Configurations

There are no configuration notes.

Attributes
AT S1E1R is a 64-bit System instruction.

Field descriptions

The AT S1E1R input value bit assignments are:

63 0

Input address for translation

Bits [63:0]
Input address for translation. The resulting address can be read from the PAR_EL1.

If the address translation instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E1R instruction:

The AT S1E1R instruction is executed as:

AT S1E1R, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    000   0111   1000   000
C5.4.8 AT S1E1W, Address Translate Stage 1 EL1 Write
The AT S1E1W characteristics are:


Purpose 
Performs stage 1 address translation as defined for EL1, with permissions as if writing to the given 
virtual address. 

Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S1E1W is a 64-bit System instruction. 


Field descriptions 


The AT S1E1W input value bit assignments are: 


63 0 







Input address for translation 


Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E1W instruction: 

The AT S1E1W instruction is executed as: 


AT S1E1W, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   0111   1000   001
C5.4.9 AT S1E2R, Address Translate Stage 1 EL2 Read
The AT S1E2R characteristics are: 


Purpose 


Performs stage 1 address translation as defined for EL2, with permissions as if reading from the 
given virtual address. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S1E2R is a 64-bit System instruction. 


Field descriptions 


The AT S1E2R input value bit assignments are: 


63 0 


Input address for translation 





Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E2R instruction: 

The AT S1E2R instruction is executed as: 

AT S1E2R, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   0111   1000   000
C5.4.10 AT S1E2W, Address Translate Stage 1 EL2 Write
The AT S1E2W characteristics are: 


Purpose 


Performs stage 1 address translation as defined for EL2, with permissions as if writing to the given 
virtual address. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S1E2W is a 64-bit System instruction. 


Field descriptions 


The AT S1E2W input value bit assignments are: 


63 0 


Input address for translation 





Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E2W instruction: 

The AT S1E2W instruction is executed as: 

AT S1E2W, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   0111   1000   001
C5.4.11 AT S1E3R, Address Translate Stage 1 EL3 Read
The AT S1E3R characteristics are:

Purpose

Performs stage 1 address translation as defined for EL3, with permissions as if reading from the
given virtual address.

Usage constraints

This instruction can be executed at the following exception levels:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO

Traps and Enables

There are no traps or enables affecting this instruction.

Configurations

There are no configuration notes.

Attributes
AT S1E3R is a 64-bit System instruction.

Field descriptions

The AT S1E3R input value bit assignments are:

63 0

Input address for translation

Bits [63:0]
Input address for translation. The resulting address can be read from the PAR_EL1.

If the address translation instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E3R instruction:

The AT S1E3R instruction is executed as:

AT S1E3R, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    110   0111   1000   000
C5.4.12 AT S1E3W, Address Translate Stage 1 EL3 Write
The AT S1E3W characteristics are: 


Purpose 
Performs stage 1 address translation as defined for EL3, with permissions as if writing to the given 
virtual address. 

Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
AT S1E3W is a 64-bit System instruction. 


Field descriptions 


The AT S1E3W input value bit assignments are: 


63 0 







Input address for translation 


Bits [63:0] 
Input address for translation. The resulting address can be read from the PAR_EL1. 


If the address translation instructions are targeting a translation regime that is using AArch32, and 
so has a VA of only 32 bits, then VA[63:32] is RES0.

Executing the AT S1E3W instruction: 

The AT S1E3W instruction is executed as: 


AT S1E3W, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    110   0111   1000   001










C5.5 A64 system instructions for TLB maintenance 


The following sections define the TLB maintenance system instructions in A64: 


TLBI ALLE1, TLB Invalidate All, EL1 on page C5-379.
TLBI ALLE1IS, TLB Invalidate All, EL1, Inner Shareable on page C5-380.
TLBI ALLE2, TLB Invalidate All, EL2 on page C5-381.
TLBI ALLE2IS, TLB Invalidate All, EL2, Inner Shareable on page C5-382.
TLBI ALLE3, TLB Invalidate All, EL3 on page C5-383.
TLBI ALLE3IS, TLB Invalidate All, EL3, Inner Shareable on page C5-384.
TLBI ASIDE1, TLB Invalidate by ASID, EL1 on page C5-385.
TLBI ASIDE1IS, TLB Invalidate by ASID, EL1, Inner Shareable on page C5-387.
TLBI IPAS2E1, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1 on page C5-389.
TLBI IPAS2E1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1, Inner Shareable on
page C5-390.
TLBI IPAS2LE1, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1 on page C5-392.
TLBI IPAS2LE1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1, Inner
Shareable on page C5-393.
TLBI VAAE1, TLB Invalidate by VA, All ASID, EL1 on page C5-395.
TLBI VAAE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable on page C5-397.
TLBI VAALE1, TLB Invalidate by VA, All ASID, Last level, EL1 on page C5-399.
TLBI VAALE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable on page C5-401.
TLBI VAE1, TLB Invalidate by VA, EL1 on page C5-403.
TLBI VAE1IS, TLB Invalidate by VA, EL1, Inner Shareable on page C5-405.
TLBI VAE2, TLB Invalidate by VA, EL2 on page C5-407.
TLBI VAE2IS, TLB Invalidate by VA, EL2, Inner Shareable on page C5-409.
TLBI VAE3, TLB Invalidate by VA, EL3 on page C5-411.
TLBI VAE3IS, TLB Invalidate by VA, EL3, Inner Shareable on page C5-413.
TLBI VALE1, TLB Invalidate by VA, Last level, EL1 on page C5-415.
TLBI VALE1IS, TLB Invalidate by VA, Last level, EL1, Inner Shareable on page C5-417.
TLBI VALE2, TLB Invalidate by VA, Last level, EL2 on page C5-419.
TLBI VALE2IS, TLB Invalidate by VA, Last level, EL2, Inner Shareable on page C5-421.
TLBI VALE3, TLB Invalidate by VA, Last level, EL3 on page C5-423.
TLBI VALE3IS, TLB Invalidate by VA, Last level, EL3, Inner Shareable on page C5-425.
TLBI VMALLE1, TLB Invalidate by VMID, All at stage 1, EL1 on page C5-427.
TLBI VMALLE1IS, TLB Invalidate by VMID, All at stage 1, EL1, Inner Shareable on page C5-428.
TLBI VMALLS12E1, TLB Invalidate by VMID, All at Stage 1 and 2, EL1 on page C5-429.
TLBI VMALLS12E1IS, TLB Invalidate by VMID, All at Stage 1 and 2, EL1, Inner Shareable on page C5-430.


For more information about these instructions see TLB maintenance instructions on page D4-1817. In particular, for 
the full description of the scope of each instruction see Scope of the A64 TLB maintenance instructions on 
page D4-1820. 
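
The operation names listed above follow a regular pattern: a scope prefix, an optional L for last level, the target Exception level, and an optional IS suffix for Inner Shareable. The Python sketch below splits a name into those parts. The pattern is inferred only from the names in this list, the function name is hypothetical, and the check is partial: it rejects a last-level form for scopes that have none in the list, but does not otherwise confirm that a given combination is a defined instruction.

```python
import re

# Pattern inferred from the TLBI operation names listed in this section.
_TLBI_NAME = re.compile(r"^(ALL|VMALLS12|VMALL|ASID|IPAS2|VAA|VA)(L?)E([123])(IS)?$")
_LAST_LEVEL_SCOPES = {"VA", "VAA", "IPAS2"}


def parse_tlbi_op(name: str) -> dict:
    """Split a TLBI operation name such as 'VALE1IS' into its components."""
    match = _TLBI_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized TLBI operation: {name}")
    scope, last, el, inner = match.groups()
    if last and scope not in _LAST_LEVEL_SCOPES:
        raise ValueError(f"no last-level form is listed for scope {scope}")
    return {
        "scope": scope,
        "last_level": last == "L",
        "el": int(el),
        "inner_shareable": inner == "IS",
    }


print(parse_tlbi_op("VALE1IS"))
```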
C5.5.1 TLBI ALLE1, TLB Invalidate All, EL1
The TLBI ALLE1 characteristics are:

Purpose

Invalidate all EL1&0 regime stage 1 and 2 TLB entries.

If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates
the translations that are associated with Secure address space, or invalidates the translations
associated with the Non-secure address space.

For details of the scope of this instruction see ALL.

Usage constraints

This instruction can be executed at the following exception levels:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO

Traps and Enables

There are no traps or enables affecting this instruction.

Configurations

There are no configuration notes.

Attributes
TLBI ALLE1 is a 64-bit System instruction.

Field descriptions

TLBI ALLE1 ignores the value in the register specified by the instruction. Software does not have to write a value
to the register before issuing this instruction.

Executing the TLBI ALLE1 instruction:

The TLBI ALLE1 instruction is executed as:

TLBI ALLE1

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    100   1000   0111   100
C5.5.2 TLBI ALLE1IS, TLB Invalidate All, EL1, Inner Shareable
The TLBI ALLE1IS characteristics are:


Purpose 


Invalidate all EL1&0 regime stage 1 and 2 TLB entries on all PEs in the same Inner Shareable 
domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see ALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI ALLE1IS is a 64-bit System instruction.


Field descriptions 

TLBI ALLE1IS ignores the value in the register specified by the instruction. Software does not have to write a value
to the register before issuing this instruction. 

Executing the TLBI ALLE1IS instruction: 

The TLBI ALLE1IS instruction is executed as:

TLBI ALLE1IS 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   1000   0011   100
C5.5.3 TLBI ALLE2, TLB Invalidate All, EL2
The TLBI ALLE2 characteristics are: 


Purpose 
Invalidate all EL2 regime stage 1 TLB entries. 


For details of the scope of this instruction see ALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI ALLE2 is a 64-bit System instruction. 


Field descriptions 


TLBI ALLE2 ignores the value in the register specified by the instruction. Software does not have to write a value
to the register before issuing this instruction.


Executing the TLBI ALLE2 instruction: 
The TLBI ALLE2 instruction is executed as: 
TLBI ALLE2 


The instruction is encoded in the System instruction encoding space as follows: 














op0   op1   CRn    CRm    op2
01    100   1000   0111   000
C5.5.4 TLBI ALLE2IS, TLB Invalidate All, EL2, Inner Shareable
The TLBI ALLE2IS characteristics are:


Purpose 
Invalidate all EL2 regime stage 1 TLB entries on all PEs in the same Inner Shareable domain. 


For details of the scope of this instruction see ALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI ALLE2IS is a 64-bit System instruction.


Field descriptions 

TLBI ALLE2IS ignores the value in the register specified by the instruction. Software does not have to write a value
to the register before issuing this instruction.

Executing the TLBI ALLE2IS instruction:

The TLBI ALLE2IS instruction is executed as:

TLBI ALLE2IS 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   1000   0011   000

C5.5.5 TLBI ALLE3, TLB Invalidate All, EL3
The TLBI ALLE3 characteristics are: 


Purpose 
Invalidate all EL3 regime stage 1 TLB entries. 


For details of the scope of this instruction see ALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI ALLE3 is a 64-bit System instruction.


Field descriptions 


TLBI ALLE3 ignores the value in the register specified by the instruction. Software does not have to write a value 
to the register before issuing this instruction. 


Executing the TLBI ALLE3 instruction: 
The TLBI ALLE3 instruction is executed as: 
TLBI ALLE3 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    110   1000   0111   000










C5.5.6 TLBI ALLE3IS, TLB Invalidate All, EL3, Inner Shareable 
The TLBI ALLE3IS characteristics are: 


Purpose 
Invalidate all EL3 regime stage 1 TLB entries on all PEs in the same Inner Shareable domain. 


For details of the scope of this instruction see ALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI ALLE3IS is a 64-bit System instruction. 


Field descriptions 


TLBI ALLE3IS ignores the value in the register specified by the instruction. Software does not have to write a value 
to the register before issuing this instruction. 


Executing the TLBI ALLE3IS instruction: 
The TLBI ALLE3IS instruction is executed as: 
TLBI ALLE3IS 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    110   1000   0011   000

C5.5.7 TLBI ASIDE1, TLB Invalidate by ASID, EL1
The TLBI ASIDE1 characteristics are:


Purpose 
Invalidate EL1&0 regime stage 1 TLB entries for the given ASID and the current VMID. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see ASID. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
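
The trap condition above can be expressed as a simple predicate. This is a sketch under stated assumptions, useful for example in an emulator or test model: the function name `tlbi_el1_trapped_to_el2` is hypothetical, EL2 is assumed to be implemented, and the position of TTLB at bit 25 of HCR_EL2 is taken from the register description rather than from this section. Exception prioritization is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of HCR_EL2.TTLB (from the HCR_EL2 register description). */
#define HCR_EL2_TTLB (UINT64_C(1) << 25)

/* Hypothetical predicate: true when an EL1 TLBI from this group is trapped
 * to EL2. el is the current Exception level (0..3); non_secure is the
 * current Security state. Assumes EL2 is implemented. */
static bool tlbi_el1_trapped_to_el2(unsigned el, bool non_secure,
                                    uint64_t hcr_el2)
{
    assert(el <= 3);
    return el == 1 && non_secure && (hcr_el2 & HCR_EL2_TTLB) != 0;
}
```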


Configurations 


There are no configuration notes. 


Attributes 
TLBI ASIDE1 is a 64-bit System instruction.

Field descriptions

The TLBI ASIDE1 input value bit assignments are:

63      48 47                 0
ASID       RES0

ASID, bits [63:48]

ASID value to match. Any appropriate TLB entries that match the ASID values will be affected by
this operation.

If the implementation supports 16 bits of ASID, but only 8 bits are being used in the context being
invalidated, the upper bits are considered RES0 and must be written to 0 by software performing the
TLB maintenance.

Bits [47:0]

Reserved, RES0.
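
As an illustration of the bit assignments above, the following C sketch builds the 64-bit value that software would place in <Xt>. The helper name `tlbi_asid_operand` and its `asid_bits` parameter are hypothetical; how software learns the ASID size in use is outside this section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the <Xt> value for a TLBI-by-ASID operation.
 * asid_bits is the ASID size in use for the context, 8 or 16 (an assumed
 * parameter, not something this section defines). */
static uint64_t tlbi_asid_operand(uint16_t asid, unsigned asid_bits)
{
    assert(asid_bits == 8 || asid_bits == 16);
    if (asid_bits == 8)
        asid &= 0x00FFu;            /* upper ASID bits are RES0: write 0 */
    return (uint64_t)asid << 48;    /* ASID in [63:48]; bits [47:0] RES0 */
}
```

Masking inside the helper is one design choice; a stricter variant could instead reject an ASID that does not fit in 8 bits.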


Executing the TLBI ASIDE1 instruction: 
The TLBI ASIDE1 instruction is executed as:

TLBI ASIDE1, <Xt>

The instruction is encoded in the System instruction encoding space as follows:





op0   op1   CRn    CRm    op2
01    000   1000   0111   010

C5.5.8 TLBI ASIDE1IS, TLB Invalidate by ASID, EL1, Inner Shareable
The TLBI ASIDE1IS characteristics are:


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the given ASID and the current VMID on all PEs 
in the same Inner Shareable domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see ASID. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBI ASIDE1IS is a 64-bit System instruction.

Field descriptions

The TLBI ASIDE1IS input value bit assignments are:

63      48 47                 0
ASID       RES0

ASID, bits [63:48]

ASID value to match. Any appropriate TLB entries that match the ASID values will be affected by
this operation.

If the implementation supports 16 bits of ASID, but only 8 bits are being used in the context being
invalidated, the upper bits are considered RES0 and must be written to 0 by software performing the
TLB maintenance.

Bits [47:0]

Reserved, RES0.


Executing the TLBI ASIDE1IS instruction: 
The TLBI ASIDE1IS instruction is executed as:

TLBI ASIDE1IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:





op0   op1   CRn    CRm    op2
01    000   1000   0011   010

C5.5.9 TLBI IPAS2E1, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1
The TLBI IPAS2E1 characteristics are: 


Purpose 
Invalidate EL1&0 regime stage 2 TLB entries for the given IPA and the current VMID. 
For details of the scope of this instruction see IPAS2. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If SCR_EL3.NS==0, or EL2 is not implemented, this instruction is a NOP. 


This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI IPAS2E1 is a 64-bit System instruction. 


Field descriptions 


The TLBI IPAS2E1 input value bit assignments are: 


63      36 35                 0
RES0       IPA[47:12]

Bits [63:36]

Reserved, RES0.

IPA[47:12], bits [35:0]
Bits[47:12] of the intermediate physical address to match.
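
The bit assignments above amount to shifting the IPA right by 12 and keeping 36 bits. A minimal sketch in C, assuming a hypothetical helper name `tlbi_ipas2_operand` and assuming the caller passes an IPA that fits in 48 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the <Xt> value for a TLBI-by-IPA operation.
 * Places IPA[47:12] in bits [35:0] and leaves bits [63:36] as zero (RES0). */
static uint64_t tlbi_ipas2_operand(uint64_t ipa)
{
    assert((ipa >> 48) == 0);   /* assumption: IPA is at most 48 bits wide */
    return (ipa >> 12) & ((UINT64_C(1) << 36) - 1);
}
```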


Executing the TLBI IPAS2E1 instruction: 
The TLBI IPAS2E1 instruction is executed as: 
TLBI IPAS2E1, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   1000   0100   001

C5.5.10 TLBI IPAS2E1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1, Inner Shareable


The TLBI IPAS2E1IS characteristics are: 


Purpose 


Invalidate EL1&0 regime stage 2 TLB entries for the given IPA and the current VMID on all PEs 
in the same Inner Shareable domain. 


For details of the scope of this instruction see IPAS2. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If SCR_EL3.NS==0, or EL2 is not implemented, this instruction is a NOP. 


This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI IPAS2E1IS is a 64-bit System instruction. 


Field descriptions 


The TLBI IPAS2E1IS input value bit assignments are: 


63      36 35                 0
RES0       IPA[47:12]

Bits [63:36]

Reserved, RES0.

IPA[47:12], bits [35:0]
Bits[47:12] of the intermediate physical address to match.


Executing the TLBI IPAS2E1IS instruction: 
The TLBI IPAS2E1IS instruction is executed as:

TLBI IPAS2E1IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:





op0   op1   CRn    CRm    op2
01    100   1000   0000   001

C5.5.11 TLBI IPAS2LE1, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1
The TLBI IPAS2LE1 characteristics are: 


Purpose 


Invalidate EL1&0 regime stage 2 TLB entries for the last level of translation, the given IPA, and the 
current VMID. 


For details of the scope of this instruction see IPAS2L. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If SCR_EL3.NS==0, or EL2 is not implemented, this instruction is a NOP. 


This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI IPAS2LE1 is a 64-bit System instruction. 


Field descriptions 


The TLBI IPAS2LE1 input value bit assignments are: 


63      36 35                 0
RES0       IPA[47:12]

Bits [63:36]

Reserved, RES0.

IPA[47:12], bits [35:0]
Bits[47:12] of the intermediate physical address to match.


Executing the TLBI IPAS2LE1 instruction: 
The TLBI IPAS2LE1 instruction is executed as: 
TLBI IPAS2LE1, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   1000   0100   101

C5.5.12 TLBI IPAS2LE1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1, Inner Shareable


The TLBI IPAS2LE1IS characteristics are: 


Purpose 


Invalidate EL1&0 regime stage 2 TLB entries for the last level of translation, the given IPA, and the 
current VMID, on all PEs in the same Inner Shareable domain. 


For details of the scope of this instruction see IPAS2L. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If SCR_EL3.NS==0, or EL2 is not implemented, this instruction is a NOP. 


This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI IPAS2LE1IS is a 64-bit System instruction.

Field descriptions

The TLBI IPAS2LE1IS input value bit assignments are:

63      36 35                 0
RES0       IPA[47:12]

Bits [63:36]

Reserved, RES0.

IPA[47:12], bits [35:0]
Bits[47:12] of the intermediate physical address to match.


Executing the TLBI IPAS2LE1IS instruction: 
The TLBI IPAS2LE1IS instruction is executed as:

TLBI IPAS2LE1IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:





op0   op1   CRn    CRm    op2
01    100   1000   0000   101

C5.5.13 TLBI VAAE1, TLB Invalidate by VA, All ASID, EL1
The TLBI VAAE1 characteristics are:


Purpose 
Invalidate EL1&0 regime stage 1 TLB entries for the given VA and the current VMID. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VAA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 


There are no configuration notes. 


Attributes 
TLBI VAAE1 is a 64-bit System instruction.


Field descriptions 


The TLBI VAAE1 input value bit assignments are: 


63      44 43                 0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the VA will be
affected by this operation, regardless of the ASID.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.
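
The granule rules above can be sketched in C as follows. This is an illustration, not a required implementation: the `enum granule` type and the helper name `tlbi_va_field` are hypothetical. The hardware ignores the low-order bits that the granule makes RES0, so clearing them in software is a conservative choice rather than something the text mandates beyond the RES0 designation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical granule selector for this sketch. */
enum granule { GRANULE_4K, GRANULE_16K, GRANULE_64K };

/* Hypothetical helper: build the VA[55:12] field (bits [43:0] of <Xt>),
 * writing 0 to the low-order bits that are RES0 for the granule in use. */
static uint64_t tlbi_va_field(uint64_t va, enum granule g)
{
    assert(g == GRANULE_4K || g == GRANULE_16K || g == GRANULE_64K);
    uint64_t field = (va >> 12) & ((UINT64_C(1) << 44) - 1);
    switch (g) {
    case GRANULE_4K:                                /* all bits are used   */
        break;
    case GRANULE_16K:
        field &= ~UINT64_C(0x3);                    /* VA[13:12] not used  */
        break;
    case GRANULE_64K:
        field &= ~UINT64_C(0xF);                    /* VA[15:12] not used  */
        break;
    }
    return field;
}
```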


Executing the TLBI VAAE1 instruction: 


The TLBI VAAE1 instruction is executed as: 


TLBI VAAE1, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0111   011

C5.5.14 TLBI VAAE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable
The TLBI VAAE1IS characteristics are:


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the given VA and the current VMID on all PEs in 
the same Inner Shareable domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VAA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 


There are no configuration notes. 


Attributes 
TLBI VAAE1IS is a 64-bit System instruction.

Field descriptions

The TLBI VAAE1IS input value bit assignments are:

63      44 43                 0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the VA will be
affected by this operation, regardless of the ASID.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.


Executing the TLBI VAAE1IS instruction: 


The TLBI VAAE1IS instruction is executed as:


TLBI VAAE1IS, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0011   011

C5.5.15 TLBI VAALE1, TLB Invalidate by VA, All ASID, Last level, EL1
The TLBI VAALE1 characteristics are: 


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the last level of translation table walk, the given 
VA, and the current VMID. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VAAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 


There are no configuration notes. 


Attributes 
TLBI VAALE1 is a 64-bit System instruction. 


Field descriptions 


The TLBI VAALE1 input value bit assignments are: 


63      44 43                 0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the VA will be
affected by this operation, regardless of the ASID.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.


Executing the TLBI VAALE1 instruction: 


The TLBI VAALE1 instruction is executed as:


TLBI VAALE1, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0111   111

C5.5.16 TLBI VAALE1IS, TLB Invalidate by VA, All ASID, Last level, EL1, Inner Shareable
The TLBI VAALE1IS characteristics are:


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the last level of translation table walk, the given 
VA, and the current VMID, on all PEs in the same Inner Shareable domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VAAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 


There are no configuration notes. 


Attributes 
TLBI VAALE1IS is a 64-bit System instruction.

Field descriptions

The TLBI VAALE1IS input value bit assignments are:

63      44 43                 0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the VA will be
affected by this operation, regardless of the ASID.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.


Executing the TLBI VAALE1IS instruction: 


The TLBI VAALE1IS instruction is executed as:


TLBI VAALE1IS, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0011   111

C5.5.17 TLBI VAE1, TLB Invalidate by VA, EL1
The TLBI VAE1 characteristics are: 


Purpose 
Invalidate EL1&0 regime stage 1 TLB entries for the given VA and ASID and the current VMID. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 


There are no configuration notes. 


Attributes 
TLBI VAE1 is a 64-bit System instruction. 


Field descriptions 


The TLBI VAE1 input value bit assignments are: 


63      48 47      44 43                 0
ASID       RES0       VA[55:12]

ASID, bits [63:48]

ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by
this operation.

Global TLB entries that match the VA value will be affected by this operation, regardless of the
value of the ASID field.

If the implementation supports 16 bits of ASID, but only 8 bits are being used in the context being
invalidated, the upper bits are considered RES0 and must be written to 0 by software performing the
TLB maintenance.

Bits [47:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.
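
Combining the ASID and VA fields described above into one <Xt> value might look like the following C sketch. The helper name `tlbi_vae1_operand` and its `aarch32_regime` flag are hypothetical. For an AArch32 regime the sketch requires the caller to pass a 32-bit VA, which is one way to keep bits [55:32] at zero; granule-dependent low-order bits are left as supplied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build the <Xt> value for TLBI VAE1.
 * ASID goes in bits [63:48], bits [47:44] stay 0 (RES0), and VA[55:12]
 * goes in bits [43:0]. */
static uint64_t tlbi_vae1_operand(uint16_t asid, uint64_t va,
                                  bool aarch32_regime)
{
    /* Assumption of this sketch: an AArch32 regime is given a 32-bit VA,
     * so VA bits [55:32] are already zero. */
    assert(!aarch32_regime || (va >> 32) == 0);
    uint64_t va_field = (va >> 12) & ((UINT64_C(1) << 44) - 1);
    return ((uint64_t)asid << 48) | va_field;
}
```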


Executing the TLBI VAE1 instruction: 
The TLBI VAE1 instruction is executed as: 
TLBI VAE1, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0111   001

C5.5.18 TLBI VAE1IS, TLB Invalidate by VA, EL1, Inner Shareable
The TLBI VAE1IS characteristics are:


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the given VA and ASID, and the current VMID, 
on all PEs in the same Inner Shareable domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBI VAE1IS is a 64-bit System instruction.

Field descriptions

The TLBI VAE1IS input value bit assignments are:

63      48 47      44 43                 0
ASID       RES0       VA[55:12]

ASID, bits [63:48]

ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by
this operation.

Global TLB entries that match the VA value will be affected by this operation, regardless of the
value of the ASID field.

If the implementation supports 16 bits of ASID, but only 8 bits are being used in the context being
invalidated, the upper bits are considered RES0 and must be written to 0 by software performing the
TLB maintenance.

Bits [47:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.


Executing the TLBI VAE1IS instruction: 
The TLBI VAE1IS instruction is executed as:
TLBI VAE1IS, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0011   001

C5.5.19 TLBI VAE2, TLB Invalidate by VA, EL2
The TLBI VAE2 characteristics are: 


Purpose 
Invalidate EL2 regime stage 1 TLB entries for the given VA. 


For details of the scope of this instruction see VA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 


Traps and Enables 


There are no traps or enables affecting this instruction. 
Configurations 


There are no configuration notes. 


Attributes 
TLBI VAE2 is a 64-bit System instruction. 


Field descriptions 


The TLBI VAE2 input value bit assignments are: 


63      44 43                 0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.
• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
when the instruction is executed, because VA[13:12] have no effect on the operation of the
instruction.
• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
when the instruction is executed, because VA[15:12] have no effect on the operation of the
instruction.

Executing the TLBI VAE2 instruction:
The TLBI VAE2 instruction is executed as: 


TLBI VAE2, <Xt> 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   1000   0111   001

C5.5.20 TLBI VAE2IS, TLB Invalidate by VA, EL2, Inner Shareable
The TLBI VAE2IS characteristics are:


Purpose 


Invalidate EL2 regime stage 1 TLB entries for the given VA on all PEs in the same Inner Shareable 
domain. 


For details of the scope of this instruction see VA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 
Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VAE2IS is a 64-bit System instruction.


Field descriptions


The TLBI VAE2IS input value bit assignments are:

63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.

Executing the TLBI VAE2IS instruction:
The TLBI VAE2IS instruction is executed as:

TLBI VAE2IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    100   1000   0011   001

C5.5 A64 system instructions for TLB maintenance


C5.5.21 TLBI VAE3, TLB Invalidate by VA, EL3 
The TLBI VAE3 characteristics are: 


Purpose 
Invalidate EL3 regime stage 1 TLB entries for the given VA. 


For details of the scope of this instruction see VA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VAE3 is a 64-bit System instruction. 


Field descriptions 


The TLBI VAE3 input value bit assignments are: 


63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.
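
The field layout described above can be sketched in C as follows. This is a minimal illustration, not an architecturally defined routine: the helper name tlbi_va_operand and its regime_is_aarch32 parameter are hypothetical, and the sketch assumes a 4KB translation granule so that no low-order bits need to be cleared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build the Xt value for a TLBI by-VA operation whose
 * input value has RES0 in bits [63:44] and VA[55:12] in bits [43:0]. */
static inline uint64_t tlbi_va_operand(uint64_t va, bool regime_is_aarch32)
{
    /* Place VA[55:12] in bits [43:0]; bits [63:44] remain zero. */
    uint64_t field = (va >> 12) & ((UINT64_C(1) << 44) - 1);

    /* For a regime using AArch32, VA[55:32] are treated as RES0, so only
     * VA[31:12] (bits [19:0] of the field) are kept. */
    if (regime_is_aarch32)
        field &= (UINT64_C(1) << 20) - 1;

    return field;
}
```

The returned value is what software would place in the Xt register before executing, for example, TLBI VAE3, <Xt>.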


Executing the TLBI VAE3 instruction: 


The TLBI VAE3 instruction is executed as:

TLBI VAE3, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    110   1000   0111   001

C5.5.22 TLBI VAE3IS, TLB Invalidate by VA, EL3, Inner Shareable
The TLBI VAE3IS characteristics are: 


Purpose 


Invalidate EL3 regime stage 1 TLB entries for the given VA on all PEs in the same Inner Shareable 
domain. 


For details of the scope of this instruction see VA. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VAE3IS is a 64-bit System instruction. 


Field descriptions 


The TLBI VAE3IS input value bit assignments are: 


63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.
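
The granule-dependent treatment of the low-order bits listed above might be handled in software as in the following C sketch. The enum granule type and the function name tlbi_va_field_for_granule are hypothetical names introduced for illustration only. Because the hardware ignores these bits, clearing them is a way of writing the RES0 bits as zero rather than a functional requirement of the invalidation.

```c
#include <stdint.h>

/* Hypothetical granule selector, not an architectural name. */
enum granule { GRANULE_4KB, GRANULE_16KB, GRANULE_64KB };

/* Build the VA[55:12] field and clear the low-order bits that are RES0 for
 * the granule in use: none for 4KB, bits [1:0] for 16KB, bits [3:0] for 64KB. */
static inline uint64_t tlbi_va_field_for_granule(uint64_t va, enum granule g)
{
    uint64_t field = (va >> 12) & ((UINT64_C(1) << 44) - 1);

    switch (g) {
    case GRANULE_16KB: return field & ~UINT64_C(0x3);  /* VA[13:12] unused */
    case GRANULE_64KB: return field & ~UINT64_C(0xF);  /* VA[15:12] unused */
    default:           return field;                   /* 4KB: all bits used */
    }
}
```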


Executing the TLBI VAE3IS instruction: 


The TLBI VAE3IS instruction is executed as:

TLBI VAE3IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    110   1000   0011   001

C5.5.23 TLBI VALE1, TLB Invalidate by VA, Last level, EL1
The TLBI VALE1 characteristics are:


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the last level of translation table walk, the given 
VA and ASID, and the current VMID. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBI VALE1 is a 64-bit System instruction. 


Field descriptions 


The TLBI VALE1 input value bit assignments are:

63    48 47   44 43             0
ASID     RES0    VA[55:12]

ASID, bits [63:48]

ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by
this operation.

Global TLB entries that match the VA value will be affected by this operation, regardless of the
value of the ASID field.

If the implementation supports 16 bits of ASID, but only 8 bits are being used in the context being
invalidated, the upper bits are considered RES0 and must be written to 0 by software performing the
TLB maintenance.

Bits [47:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.
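
For the operations that take both an ASID and a VA, the input value can be packed as in the following C sketch. The helper name tlbi_asid_va_operand is hypothetical. The sketch assumes that the caller passes an ASID whose upper 8 bits are already zero when only 8 bits of ASID are in use, and it does not apply the granule-dependent clearing of low-order bits.

```c
#include <stdint.h>

/* Hypothetical helper: build the Xt value for a TLBI operation whose input
 * value has ASID in bits [63:48], RES0 in bits [47:44], and VA[55:12] in
 * bits [43:0]. */
static inline uint64_t tlbi_asid_va_operand(uint16_t asid, uint64_t va)
{
    uint64_t va_field = (va >> 12) & ((UINT64_C(1) << 44) - 1);

    /* Masking the VA field to 44 bits leaves bits [47:44] as zero. */
    return ((uint64_t)asid << 48) | va_field;
}
```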


Executing the TLBI VALE1 instruction: 
The TLBI VALE1 instruction is executed as:

TLBI VALE1, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    000   1000   0111   101

C5.5.24 TLBI VALE1IS, TLB Invalidate by VA, Last level, EL1, Inner Shareable
The TLBI VALE1IS characteristics are:


Purpose 


Invalidate EL1&0 regime stage 1 TLB entries for the last level of translation table walk, the given 
VA and ASID, and the current VMID, on all PEs in the same Inner Shareable domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBI VALE1IS is a 64-bit System instruction.


Field descriptions


The TLBI VALE1IS input value bit assignments are:

63    48 47   44 43             0
ASID     RES0    VA[55:12]

ASID, bits [63:48]

ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by
this operation.

Global TLB entries that match the VA value will be affected by this operation, regardless of the
value of the ASID field.

If the implementation supports 16 bits of ASID, but only 8 bits are being used in the context being
invalidated, the upper bits are considered RES0 and must be written to 0 by software performing the
TLB maintenance.

Bits [47:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.


Executing the TLBI VALE1IS instruction: 
The TLBI VALE1IS instruction is executed as:

TLBI VALE1IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    000   1000   0011   101

C5.5.25 TLBI VALE2, TLB Invalidate by VA, Last level, EL2
The TLBI VALE2 characteristics are: 


Purpose 


Invalidate EL2 regime stage 1 TLB entries for the last level of translation table walk and the given 
VA. 


For details of the scope of this instruction see VAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 
Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VALE2 is a 64-bit System instruction. 


Field descriptions 


The TLBI VALE2 input value bit assignments are: 


63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.

Executing the TLBI VALE2 instruction:
The TLBI VALE2 instruction is executed as:

TLBI VALE2, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    100   1000   0111   101

C5.5.26 TLBI VALE2IS, TLB Invalidate by VA, Last level, EL2, Inner Shareable
The TLBI VALE2IS characteristics are:


Purpose 


Invalidate EL2 regime stage 1 TLB entries for the last level of translation table walk and the given 
VA on all PEs in the same Inner Shareable domain. 


For details of the scope of this instruction see VAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              -





Performing this operation from EL3 is UNDEFINED if EL2 does not exist. 


Traps and Enables 


There are no traps or enables affecting this instruction. 
Configurations 


There are no configuration notes. 


Attributes 
TLBI VALE2IS is a 64-bit System instruction.


Field descriptions


The TLBI VALE2IS input value bit assignments are:

63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.

Executing the TLBI VALE2IS instruction:
The TLBI VALE2IS instruction is executed as:

TLBI VALE2IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    100   1000   0011   101


C5.5.27 TLBI VALE3, TLB Invalidate by VA, Last level, EL3 
The TLBI VALE3 characteristics are: 


Purpose 


Invalidate EL3 regime stage 1 TLB entries for the last level of translation table walk and the given 
VA. 


For details of the scope of this instruction see VAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VALE3 is a 64-bit System instruction. 


Field descriptions 


The TLBI VALE3 input value bit assignments are: 


63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.


Executing the TLBI VALE3 instruction: 


The TLBI VALE3 instruction is executed as:

TLBI VALE3, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    110   1000   0111   101

C5.5.28 TLBI VALE3IS, TLB Invalidate by VA, Last level, EL3, Inner Shareable
The TLBI VALE3IS characteristics are:


Purpose 


Invalidate EL3 regime stage 1 TLB entries for the last level of translation table walk and the given 
VA on all PEs in the same Inner Shareable domain. 


For details of the scope of this instruction see VAL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         WO              WO





Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VALE3IS is a 64-bit System instruction. 


Field descriptions 


The TLBI VALE3IS input value bit assignments are: 


63      44 43                   0
RES0       VA[55:12]

Bits [63:44]

Reserved, RES0.

VA[55:12], bits [43:0]

Bits[55:12] of the virtual address to match. Any appropriate TLB entries that match the ASID value
(if appropriate) and VA will be affected by this operation.

If the TLB maintenance instructions are targeting a translation regime that is using AArch32, and
so has a VA of only 32 bits, then the software must treat bits[55:32] as RES0.

The treatment of the low-order bits of this field depends on the translation granule size, as follows:

• Where a 4KB translation granule is being used, all bits are valid and used for the invalidation.

• Where a 16KB translation granule is being used, bits [1:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[13:12] have no effect on the operation of the
  instruction.

• Where a 64KB translation granule is being used, bits [3:0] of this field are RES0 and ignored
  when the instruction is executed, because VA[15:12] have no effect on the operation of the
  instruction.


Executing the TLBI VALE3IS instruction: 


The TLBI VALE3IS instruction is executed as:

TLBI VALE3IS, <Xt>

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    110   1000   0011   101

C5.5.29 TLBI VMALLE1, TLB Invalidate by VMID, All at stage 1, EL1
The TLBI VMALLE1 characteristics are:


Purpose 
Invalidate all EL1&0 regime stage 1 TLB entries for the current VMID.


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VMALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBI VMALLE1 is a 64-bit System instruction. 


Field descriptions 

TLBI VMALLE1 ignores the value in the register specified by the instruction. Software does not have to write a 
value to the register before issuing this instruction. 

Executing the TLBI VMALLE1 instruction:

The TLBI VMALLE1 instruction is executed as: 

TLBI VMALLE1 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    000   1000   0111   000

C5.5.30 TLBI VMALLE1IS, TLB Invalidate by VMID, All at stage 1, EL1, Inner Shareable
The TLBI VMALLE1IS characteristics are:


Purpose 


Invalidate all EL1&0 regime stage 1 TLB entries for the current VMID on all PEs in the same Inner 
Shareable domain. 


If EL3 is implemented, the value of SCR_EL3.NS determines whether the instruction invalidates 
the translations that are associated with Secure address space, or invalidates the translations 
associated with the Non-secure address space. 


For details of the scope of this instruction see VMALL. 


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBI VMALLE1IS is a 64-bit System instruction.


Field descriptions


TLBI VMALLE1IS ignores the value in the register specified by the instruction. Software does not have to write a
value to the register before issuing this instruction.


Executing the TLBI VMALLE1IS instruction: 
The TLBI VMALLE1IS instruction is executed as:

TLBI VMALLE1IS

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    000   1000   0011   000

C5 The A64 System Instruction Class
C5.5.31 TLBI VMALLS12E1, TLB Invalidate by VMID, All at Stage 1 and 2, EL1
The TLBI VMALLS12E1 characteristics are: 


Purpose 
Invalidate all EL1&0 regime stage 1 and 2 TLB entries for the current VMID. 
For details of the scope of this instruction see VMALLS12.


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If this instruction is executed at EL3 when the value of SCR_EL3.NS is 0, this instruction executes
as a TLBI VMALLE1.


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VMALLS12E1 is a 64-bit System instruction. 


Field descriptions 


TLBI VMALLS12E1 ignores the value in the register specified by the instruction. Software does not have to write 
a value to the register before issuing this instruction. 


Executing the TLBI VMALLS12E1 instruction: 
The TLBI VMALLS12E1 instruction is executed as: 
TLBI VMALLS12E1 


The instruction is encoded in the System instruction encoding space as follows: 





op0   op1   CRn    CRm    op2
01    100   1000   0111   110

C5.5.32 TLBI VMALLS12E1IS, TLB Invalidate by VMID, All at Stage 1 and 2, EL1, Inner Shareable
The TLBI VMALLS12E1IS characteristics are: 


Purpose 


Invalidate all EL1&0 regime stage 1 and 2 TLB entries for the current VMID on all PEs in the same 
Inner Shareable domain. 


For details of the scope of this instruction see VMALLS12.


Usage constraints 


This instruction can be executed at the following exception levels: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        WO        WO              WO





If this instruction is executed at EL3 when the value of SCR_EL3.NS is 0, this instruction executes
as a TLBI VMALLE1IS.


Traps and Enables 


There are no traps or enables affecting this instruction. 


Configurations 


There are no configuration notes. 


Attributes 
TLBI VMALLS12E1IS is a 64-bit System instruction.


Field descriptions 


TLBI VMALLS12E1IS ignores the value in the register specified by the instruction. Software does not have to write 
a value to the register before issuing this instruction. 


Executing the TLBI VMALLS12E1IS instruction: 
The TLBI VMALLS12E1IS instruction is executed as: 


TLBI VMALLS12E1IS

The instruction is encoded in the System instruction encoding space as follows:

op0   op1   CRn    CRm    op2
01    100   1000   0011   110

Chapter C6
A64 Base Instruction Descriptions 


This chapter describes the A64 base instructions. 


It contains the following sections: 
• About the A64 base instructions on page C6-432.
• Alphabetical list of A64 base instructions on page C6-434.

C6.1 About the A64 base instructions


Alphabetical list of A64 base instructions on page C6-434 gives full descriptions of the A64 instructions that are in 
the following instruction groups: 


• Branch, Exception generation, and system instructions.
• Loads and stores associated with the general-purpose registers.
• Data processing (immediate).
• Data processing (register).


A64 instruction index by encoding on page C4-192 provides an overview of the instruction encodings as well as of 
the instruction classes within their functional groups. 


The rest of this section is a general description of the base instructions. It contains the following subsections:
• Register size.
• Use of the PC.
• Use of the stack pointer on page C6-433.
• Condition flags and related instructions on page C6-433.

C6.1.1 Register size 
Most data processing, comparison, and conversion instructions that use the general-purpose registers as the source 
or destination operand have two instruction variants that operate on either a 32-bit or a 64-bit value. 
Where a 32-bit instruction form is selected, the following holds: 
• The upper 32 bits of the source registers are ignored.
• The upper 32 bits of the destination register are set to zero.
• Right shifts and right rotates inject at bit[31], not at bit[63].
• The condition flags, where set by the instruction, are computed from the lower 32 bits.
This distinction applies even when the results of a 32-bit instruction form are indistinguishable from the lower 32 
bits computed by the equivalent 64-bit instruction form. For example, a 32-bit bitwise ORR could be performed using 
a 64-bit ORR and simply ignoring the top 32 bits of the result. However, the A64 instruction set includes separate 
32-bit and 64-bit forms of the ORR instruction. 
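
The rules for 32-bit instruction forms can be modelled in C as shown below. This is a sketch of the behavior described above, not architectural pseudocode: the function names lsr_w, lsr_x, and orr_w are hypothetical, each register is represented as a 64-bit value, and the shift amount is reduced modulo the register width to keep the C well defined.

```c
#include <stdint.h>

/* Model of a 32-bit logical shift right: the upper 32 bits of the source
 * are ignored, zeros enter at bit[31], and the upper 32 bits of the
 * 64-bit destination register are set to zero. */
static inline uint64_t lsr_w(uint64_t xn, unsigned shift)
{
    uint32_t wn = (uint32_t)xn;
    return (uint64_t)(wn >> (shift & 31));
}

/* Model of the 64-bit form: zeros enter at bit[63]. */
static inline uint64_t lsr_x(uint64_t xn, unsigned shift)
{
    return xn >> (shift & 63);
}

/* Model of a 32-bit bitwise ORR: only the lower 32 bits of each source
 * take part, and the upper 32 bits of the result are zero. */
static inline uint64_t orr_w(uint64_t xn, uint64_t xm)
{
    return (uint64_t)((uint32_t)xn | (uint32_t)xm);
}
```

Comparing lsr_w and lsr_x on a source value with nonzero upper bits shows why the two forms are distinct instructions rather than one form with the upper half ignored.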
As well as distinct sign-extend or zero-extend instructions, the A64 instruction set also provides the ability to extend 
and shift the final source register of an ADD, SUB, ADDS, or SUBS instruction and the index register of a Load/Store 
instruction. This enables array index calculations involving a 64-bit array pointer and a 32-bit array index to be 
implemented efficiently. 
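
The array index calculation mentioned above can be sketched in C as follows. The function names add_sxtw and add_uxtw are hypothetical models of an ADD instruction whose final source register is a 32-bit value that is sign-extended or zero-extended and then shifted left. The sketch assumes a small shift amount, as used for scaling by an element size, and relies on the usual two's complement conversion from uint32_t to int32_t.

```c
#include <stdint.h>

/* Model of an add with a sign-extended 32-bit index: extend the index to
 * 64 bits, shift it left, and add it to the 64-bit base pointer. */
static inline uint64_t add_sxtw(uint64_t base, uint32_t index, unsigned shift)
{
    int64_t extended = (int64_t)(int32_t)index;
    return base + ((uint64_t)extended << shift);
}

/* Model of the same operation with a zero-extended 32-bit index. */
static inline uint64_t add_uxtw(uint64_t base, uint32_t index, unsigned shift)
{
    return base + ((uint64_t)index << shift);
}
```

For an array of 8-byte elements, a shift of 3 scales the index, so add_sxtw(base, i, 3) models the address of element i for a signed 32-bit i.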
The assembly language notation enables the distinct identification of registers holding 32-bit values and registers 
holding 64-bit values. See Register names on page C1-124 and Register indexed addressing on page C1-128. 
C6.1.2 Use of the PC 
A64 instructions have limited access to the PC. The only instructions that can read the PC are those that generate a 
PC relative address: 
• ADR and ADRP.
• The Load register (literal) instruction class.
• Direct branches that use an immediate offset.
• The unconditional branch with link instructions, BL and BLR, that use the PC to create the return link
  address.
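
The ways in which these instructions read the PC can be modelled with the following C sketch. The function names are hypothetical, and the formulas are taken from the descriptions of the individual instructions rather than from this section: ADR adds a signed byte offset to the PC, ADRP adds a signed offset in units of 4KB to the PC with its low 12 bits cleared, and a branch with link records the address of the following instruction.

```c
#include <stdint.h>

/* Model of ADR: PC plus a signed byte offset. */
static inline uint64_t adr_target(uint64_t pc, int64_t offset)
{
    return pc + (uint64_t)offset;
}

/* Model of ADRP: the 4KB-aligned page of the PC plus a signed page offset. */
static inline uint64_t adrp_target(uint64_t pc, int64_t page_offset)
{
    return (pc & ~UINT64_C(0xFFF)) + ((uint64_t)page_offset << 12);
}

/* Model of the return link address written by BL and BLR: the address of
 * the instruction that follows the branch. */
static inline uint64_t link_address(uint64_t pc)
{
    return pc + 4;
}
```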
Only explicit control flow instructions can modify the PC: 
• Conditional and unconditional branch and return instructions.
• Exception generation and exception return instructions.


For more details of instructions that can modify the PC, see Branches, Exception generating, and System 
instructions on page C3-142. 


C6.1.3 Use of the stack pointer 


A64 instructions can use the stack pointer only in a limited number of cases: 


• Load/Store instructions use the current stack pointer as the base address:

  — When stack alignment checking is enabled by system software and the base register is SP, the current
    stack pointer must be initially quadword aligned, that is, it must be aligned to 16 bytes. Misalignment
    generates an SP alignment fault. See SP alignment checking on page D1-1515 for more information.

• Add and subtract data processing instructions in their immediate and extended register forms use the current
  stack pointer as a source register or the destination register or both.

• Logical data processing instructions in their immediate form use the current stack pointer as the destination
  register.
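
The quadword alignment requirement on the stack pointer can be expressed in C as shown below. This is a minimal sketch with hypothetical function names. It models only the alignment test and a rounding operation that software might use before a Load/Store that uses SP as the base register, not the fault generation itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the stack pointer value is aligned to 16 bytes (a quadword). */
static inline bool sp_is_quadword_aligned(uint64_t sp)
{
    return (sp & UINT64_C(0xF)) == 0;
}

/* Round a stack pointer value down to the nearest 16-byte boundary, which
 * is the safe direction for a stack that grows towards lower addresses. */
static inline uint64_t sp_align_down(uint64_t sp)
{
    return sp & ~UINT64_C(0xF);
}
```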


C6.1.4 Condition flags and related instructions 


The A64 base instructions that use the condition flags as an input are: 


• Conditional branch. The conditional branch instruction is B.cond.

• Add or subtract with carry. These instruction types include instructions to perform multi-precision arithmetic
  and calculate checksums. The add or subtract with carry instructions are ADC, ADCS, SBC, and SBCS, or an
  architectural alias for these instructions.

• Conditional select with increment, negate, or invert. This instruction type conditionally selects between one
  source register and a second, incremented, negated, inverted, or unmodified source register. The conditional
  select with increment, negate, or invert instructions are CSINC, CSINV, and CSNEG.

  These instructions also implement:

  - Conditional select or move. The condition flags select one of two source registers as the destination
    register. Short conditional sequences can be replaced by unconditional instructions followed by a
    conditional select, CSEL.

  - Conditional set. Conditionally selects between 0 and 1, or 0 and -1. This can be used to convert the
    condition flags to a Boolean value or mask in a general-purpose register, for example. These
    instructions include CSET and CSETM.

• Conditional compare. This instruction type sets the condition flags to the result of a comparison if the original
  condition is true, otherwise it sets the condition flags to an immediate value. It permits the flattening of nested
  conditional expressions without using conditional branches or performing Boolean arithmetic within the
  general-purpose registers. The conditional compare instructions are CCMP and CCMN.
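The conditional select family described above can be modelled in a few lines. The following Python sketch is illustrative only and is not architectural pseudocode: the function names, the 64-bit register width, and the subset of condition codes shown are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1  # model 64-bit X registers (assumption for this sketch)

def condition_holds(cond, n, z, c, v):
    """Evaluate a subset of the A64 condition codes from the N, Z, C, V flags (each 0 or 1)."""
    table = {
        "EQ": z == 1,
        "NE": z == 0,
        "CS": c == 1,
        "CC": c == 0,
        "MI": n == 1,
        "PL": n == 0,
        "GE": n == v,
        "LT": n != v,
        "GT": z == 0 and n == v,
        "LE": z == 1 or n != v,
    }
    return table[cond]

def csel(holds, rn, rm):
    """Select Rn if the condition holds, otherwise Rm unmodified."""
    return (rn if holds else rm) & MASK64

def csinc(holds, rn, rm):
    """Select Rn if the condition holds, otherwise Rm incremented."""
    return (rn if holds else rm + 1) & MASK64

def csinv(holds, rn, rm):
    """Select Rn if the condition holds, otherwise Rm bitwise inverted."""
    return (rn if holds else ~rm) & MASK64

def csneg(holds, rn, rm):
    """Select Rn if the condition holds, otherwise Rm negated."""
    return (rn if holds else -rm) & MASK64

def cset(holds):
    """Convert a condition to the Boolean value 1 or 0."""
    return 1 if holds else 0

def csetm(holds):
    """Convert a condition to an all-ones or all-zeros mask."""
    return MASK64 if holds else 0
```

For example, cset(condition_holds("EQ", n, z, c, v)) converts the Z flag into a Boolean value in a register-sized integer, which is the use case described for CSET.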


The A64 base instructions that update the condition flags as an output are: 


• Flag-setting data processing instructions, such as ADCS, ADDS, ANDS, BICS, SBCS, and SUBS, and the aliases CMN,
  CMP, and TST.

• Conditional compare instructions such as CCMN, CCMP.

The flags can be read or written directly using the NZCV register, see NZCV, Condition Flags on page C5-311.

The A64 base instructions also include conditional branch instructions that do not use the condition flags as an input:

• Compare and branch if a register is zero or nonzero, CBZ and CBNZ.

• Test a single bit in a register and branch if the bit is zero or nonzero, TBZ and TBNZ.
C6.2 Alphabetical list of A64 base instructions 


This section lists every instruction in the base category of the A64 instruction set. For details of the format used, see 
Understanding the A64 instruction descriptions on page C2-134. 
C6.2.1 ADC 


Add with Carry adds two register values and the Carry flag value, and writes the result to the destination register. 


| 31 | 30 | 29 | 28-21    | 20-16 | 15-10  | 9-5 | 4-0 |
| sf | 0  | 0  | 11010000 | Rm    | 000000 | Rn  | Rd  |
       op   S


32-bit variant 
Applies when sf == 0. 
ADC <Wd>, <Wn>, <Wm> 
64-bit variant 
Applies when sf == 1. 


ADC <Xd>, <Xn>, <Xm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
Operation 


bits(datasize) result; 

bits(datasize) operand1 = X[n];

bits(datasize) operand2 = X[m]; 

(result, -) = AddWithCarry(operand1, operand2, PSTATE.C); 


X[d] = result; 
C6.2.2 ADCS 
Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the 
destination register. It updates the condition flags based on the result. 
| 31 | 30 | 29 | 28-21    | 20-16 | 15-10  | 9-5 | 4-0 |
| sf | 0  | 1  | 11010000 | Rm    | 000000 | Rn  | Rd  |
       op   S
32-bit variant
Applies when sf == 0.
ADCS <Wd>, <Wn>, <Wm>
64-bit variant
Applies when sf == 1.
ADCS <Xd>, <Xn>, <Xm>
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
Operation 
bits(datasize) result; 
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m]; 
bits(4) nzcv; 
(result, nzcv) = AddWithCarry(operand1, operand2, PSTATE.C); 
PSTATE.<N,Z,C,V> = nzcv; 
X[d] = result; 
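
The following Python sketch models the AddWithCarry() call used above and shows how ADDS followed by ADC supports multi-precision arithmetic. It is an illustration written for this description, not the architectural definition: the flag computation follows the usual N, Z, C, V meaning, and the names add_with_carry and add128 are hypothetical.

```python
def add_with_carry(x, y, carry_in, datasize):
    """Return (result, (n, z, c, v)) for a datasize-bit addition with carry-in."""
    mask = (1 << datasize) - 1
    sign = 1 << (datasize - 1)

    def to_signed(value):
        # Interpret a datasize-bit pattern as a two's complement integer.
        return value - (1 << datasize) if value & sign else value

    x &= mask
    y &= mask
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & mask
    n = 1 if result & sign else 0
    z = 1 if result == 0 else 0
    c = 0 if result == unsigned_sum else 1            # unsigned overflow
    v = 0 if to_signed(result) == signed_sum else 1   # signed overflow
    return result, (n, z, c, v)

def add128(a_lo, a_hi, b_lo, b_hi):
    """128-bit addition from two 64-bit limbs, in the style of ADDS then ADC."""
    lo, (_, _, carry, _) = add_with_carry(a_lo, b_lo, 0, 64)    # ADDS: sets C
    hi, _ = add_with_carry(a_hi, b_hi, carry, 64)               # ADC: consumes C
    return lo, hi
```

ADCS differs from ADC only in that the returned flags are written to PSTATE.<N,Z,C,V>, which allows the carry to propagate through more than two limbs.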
C6.2.3 ADD (extended register)

Add (extended register) adds a register value and a sign or zero-extended register value, followed by an optional left
shift amount, and writes the result to the destination register. The argument that is extended from the <Rm> register
can be a byte, halfword, word, or doubleword.

| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-13  | 12-10 | 9-5 | 4-0 |
| sf | 0  | 0  | 01011 | 00    | 1  | Rm    | option | imm3  | Rn  | Rd  |
       op   S

32-bit variant
Applies when sf == 0.

ADD <Wd|WSP>, <Wn|WSP>, <Wm>{, <extend> {#<amount>}}

64-bit variant
Applies when sf == 1.

ADD <Xd|SP>, <Xn|SP>, <R><m>{, <extend> {#<amount>}}


Decode for all variants of this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer datasize = if sf == '1' then 64 else 32;
ExtendType extend_type = DecodeRegExtend(option);
integer shift = UInt(imm3);
if shift > 4 then ReservedValue();


Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Wn|WSP> Is the 32-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field.

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field.

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Xn|SP> Is the 64-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field.

<R> Is a width specifier, encoded in the "option" field. It can have the following values:
W when option = 00x
W when option = 010
X when option = x11
W when option = 10x
W when option = 110

<m> Is the number [0-30] of the second general-purpose source register or the name ZR (31), encoded in
the "Rm" field.
<extend> For the 32-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
LSL|UXTW when option = 010
UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rd" or "Rn" is '11111' (WSP) and "option" is '010' then LSL is preferred, but may be omitted
when "imm3" is '000'. In all other cases <extend> is required and must be UXTW when "option" is '010'.

For the 64-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
UXTW when option = 010
LSL|UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rd" or "Rn" is '11111' (SP) and "option" is '011' then LSL is preferred, but may be omitted when
"imm3" is '000'. In all other cases <extend> is required and must be UXTX when "option" is '011'.


<amount> Is the left shift amount to be applied after extension in the range 0 to 4, defaulting to 0, encoded in 
the "imm3" field. It must be absent when <extend> is absent, is required when <extend> is LSL, 
and is optional when <extend> is present but not LSL. 


Operation 


bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(datasize) operand2 = ExtendReg(m, extend_type, shift);

(result, -) = AddWithCarry(operand1, operand2, '0');

if d == 31 then
    SP[] = result;
else
    X[d] = result;
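
The extend-then-shift step performed on the second operand can be sketched as follows. This Python model is an assumption-laden illustration of the behavior described for <extend> and <amount>, not the architectural ExtendReg() definition: the function name extend_reg and the string encoding of the extend type are hypothetical.

```python
def extend_reg(value, extend, shift, datasize):
    """Extend the low byte, halfword, word, or doubleword of value, then shift left.

    extend is one of UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW, SXTX.
    shift is the left shift amount in the range 0 to 4.
    """
    if not 0 <= shift <= 4:
        raise ValueError("shift amounts greater than 4 are reserved")
    unsigned = extend.startswith("U")
    length = {"B": 8, "H": 16, "W": 32, "X": 64}[extend[-1]]
    # Bits shifted beyond the top of the result are discarded.
    length = min(length, datasize - shift)
    field = (value & ((1 << length) - 1)) << shift
    width = length + shift
    if not unsigned and field & (1 << (width - 1)):
        field -= 1 << width          # sign-extend from the top bit of the shifted field
    return field & ((1 << datasize) - 1)

def add_extended(xn, xm, extend, shift, datasize):
    """ADD (extended register) without flag setting, on plain integers."""
    return (xn + extend_reg(xm, extend, shift, datasize)) & ((1 << datasize) - 1)
```

A typical use is array indexing with a 32-bit index and 8-byte elements, for example add_extended(base, index, "UXTW", 3, 64).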
C6.2.4 ADD (immediate)

Add (immediate) adds a register value and an optionally-shifted immediate value, and writes the result to the
destination register.

This instruction is used by the alias MOV (to/from SP). See Alias conditions for details of when each alias is
preferred.

| 31 | 30 | 29 | 28-24 | 23-22 | 21-10 | 9-5 | 4-0 |
| sf | 0  | 0  | 10001 | shift | imm12 | Rn  | Rd  |
       op   S


32-bit variant 
Applies when sf == 0. 


ADD <Wd|WSP>, <Wn|WSP>, #<imm>{, <shift>} 


64-bit variant 
Applies when sf == 1. 


ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>} 


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 


case shift of
    when '00' imm = ZeroExtend(imm12, datasize);
    when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
    when '1x' ReservedValue();


Alias conditions 





Alias               Is preferred when
MOV (to/from SP)    shift == '00' && imm12 == '000000000000' && (Rd == '11111' || Rn == '11111')

Assembler symbols

<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.
<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.
<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<imm> Is an unsigned immediate, in the range 0 to 4095, encoded in the "imm12" field.
<shift> Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in the "shift"
field. It can have the following values:

LSL #0 when shift = 00
LSL #12 when shift = 01

The encoding shift = 1x is reserved.


Operation 


bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
    SP[] = result;
else
    X[d] = result;
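
The immediate decode and the alias condition for this instruction can be sketched in Python as follows. This is an illustrative model under stated assumptions: the helper names decode_add_imm, encode_add_imm, and prefers_mov_sp_alias are hypothetical, and fields are passed as plain integers rather than bitstrings.

```python
def decode_add_imm(shift, imm12):
    """Return the immediate value selected by the 2-bit shift field and imm12."""
    if shift & 0b10:
        raise ValueError("the encoding shift = 1x is reserved")
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("imm12 must be in the range 0 to 4095")
    return imm12 << 12 if shift == 0b01 else imm12

def encode_add_imm(value):
    """Return (shift, imm12) for value, or None if a single ADD cannot encode it."""
    if 0 <= value <= 0xFFF:
        return (0b00, value)                 # LSL #0
    if value & 0xFFF == 0 and 0 < value <= 0xFFF000:
        return (0b01, value >> 12)           # LSL #12
    return None

def prefers_mov_sp_alias(shift, imm12, rd, rn):
    """Mirror the alias condition for MOV (to/from SP)."""
    return shift == 0b00 and imm12 == 0 and (rd == 31 or rn == 31)
```

A value such as 0x1001 needs both the low and the shifted 12-bit fields, so encode_add_imm returns None and an assembler would have to use more than one instruction.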
C6.2.5 ADD (shifted register) 


Add (shifted register) adds a register value and an optionally-shifted register value, and writes the result to the 
destination register. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | 0  | 0  | 01011 | shift | 0  | Rm    | imm6  | Rn  | Rd  |
       op   S

32-bit variant
Applies when sf == 0.

ADD <Wd>, <Wn>, <Wm>{, <shift> #<amount>}

64-bit variant
Applies when sf == 1.

ADD <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


if shift == '11' then ReservedValue();
if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded
in the "shift" field. It can have the following values:

LSL when shift = 00
LSR when shift = 01
ASR when shift = 10

The encoding shift = 11 is reserved.
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C6-441 
1ID092916 Non-Confidential 




Operation 

bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);

(result, -) = AddWithCarry(operand1, operand2, '0');

X[d] = result;
C6.2.6 ADDS (extended register) 


Add (extended register), setting flags, adds a register value and a sign or zero-extended register value, followed by 
an optional left shift amount, and writes the result to the destination register. The argument that is extended from 
the <Rm> register can be a byte, halfword, word, or doubleword. It updates the condition flags based on the result. 


This instruction is used by the alias CMN (extended register). See Alias conditions for details of when each alias is
preferred.

| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-13  | 12-10 | 9-5 | 4-0 |
| sf | 0  | 1  | 01011 | 00    | 1  | Rm    | option | imm3  | Rn  | Rd  |
       op   S

32-bit variant
Applies when sf == 0.

ADDS <Wd>, <Wn|WSP>, <Wm>{, <extend> {#<amount>}}

64-bit variant
Applies when sf == 1.

ADDS <Xd>, <Xn|SP>, <R><m>{, <extend> {#<amount>}}


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 
ExtendType extend_type = DecodeRegExtend(option) ; 
integer shift = UInt(imm3); 

if shift > 4 then ReservedValue(); 


Alias conditions 





Alias                      Is preferred when
CMN (extended register)    Rd == '11111'





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field.

<Wn|WSP> Is the 32-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field.

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field.

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.

<Xn|SP> Is the 64-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field.

<R> Is a width specifier, encoded in the "option" field. It can have the following values:
W when option = 00x
W when option = 010
X when option = x11
W when option = 10x
W when option = 110

<m> Is the number [0-30] of the second general-purpose source register or the name ZR (31), encoded in
the "Rm" field.


<extend> For the 32-bit variant: is the extension to be applied to the second source operand, encoded in the 
"option" field. It can have the following values: 
UXTB when option = 000 
UXTH when option = 001 
LSL|UXTW when option = 010 
UXTX when option = 011 
SXTB when option = 100 
SXTH when option = 101 
SXTW when option = 110 
SXTX when option = 111 


If "Rn" is '11111' (WSP) and "option" is '010' then LSL is preferred, but may be omitted when 
"imm3" is '000'. In all other cases <extend> is required and must be UXTW when "option" is '010'. 


For the 64-bit variant: is the extension to be applied to the second source operand, encoded in the 
"option" field. It can have the following values: 


UXTB when option = 000 
UXTH when option = 001 
UXTW when option = 010 
LSL|UXTX when option = 011 
SXTB when option = 100 
SXTH when option = 101 
SXTW when option = 110 
SXTX when option = 111 


If "Rn" is '11111' (SP) and "option" is '011' then LSL is preferred, but may be omitted when "imm3"
is '000'. In all other cases <extend> is required and must be UXTX when "option" is '011'.


<amount> Is the left shift amount to be applied after extension in the range 0 to 4, defaulting to 0, encoded in 
the "imm3" field. It must be absent when <extend> is absent, is required when <extend> is LSL, 
and is optional when <extend> is present but not LSL. 


Operation 


bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(datasize) operand2 = ExtendReg(m, extend_type, shift);
bits(4) nzcv;

(result, nzcv) = AddWithCarry(operand1, operand2, '0');
PSTATE.<N,Z,C,V> = nzcv;

X[d] = result;
C6.2.7 ADDS (immediate)

Add (immediate), setting flags, adds a register value and an optionally-shifted immediate value, and writes the result
to the destination register. It updates the condition flags based on the result.

This instruction is used by the alias CMN (immediate). See Alias conditions for details of when each alias is
preferred.

| 31 | 30 | 29 | 28-24 | 23-22 | 21-10 | 9-5 | 4-0 |
| sf | 0  | 1  | 10001 | shift | imm12 | Rn  | Rd  |
       op   S


32-bit variant 
Applies when sf == 0. 


ADDS <Wd>, <Wn|WSP>, #<imm>{, <shift>} 


64-bit variant 
Applies when sf == 1. 


ADDS <Xd>, <Xn|SP>, #<imm>{, <shift>} 


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 


case shift of
    when '00' imm = ZeroExtend(imm12, datasize);
    when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
    when '1x' ReservedValue();


Alias conditions 





Alias              Is preferred when
CMN (immediate)    Rd == '11111'





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<imm> Is an unsigned immediate, in the range 0 to 4095, encoded in the "imm12" field.
<shift> Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in the "shift"
field. It can have the following values:

LSL #0 when shift = 00
LSL #12 when shift = 01

The encoding shift = 1x is reserved.


Operation 

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(4) nzcv;

(result, nzcv) = AddWithCarry(operand1, imm, '0');
PSTATE.<N,Z,C,V> = nzcv;

X[d] = result;







C6.2.8 ADDS (shifted register) 


Add (shifted register), setting flags, adds a register value and an optionally-shifted register value, and writes the 
result to the destination register. It updates the condition flags based on the result. 


This instruction is used by the alias CMN (shifted register). See Alias conditions for details of when each alias is 
preferred. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | 0  | 1  | 01011 | shift | 0  | Rm    | imm6  | Rn  | Rd  |
       op   S

32-bit variant
Applies when sf == 0.

ADDS <Wd>, <Wn>, <Wm>{, <shift> #<amount>}

64-bit variant
Applies when sf == 1.

ADDS <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


if shift == '11' then ReservedValue();
if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Alias conditions 





Alias                     Is preferred when
CMN (shifted register)    Rd == '11111'





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded
in the "shift" field. It can have the following values:

LSL when shift = 00
LSR when shift = 01
ASR when shift = 10

The encoding shift = 11 is reserved.


<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 


Operation 


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);
bits(4) nzcv;

(result, nzcv) = AddWithCarry(operand1, operand2, '0');
PSTATE.<N,Z,C,V> = nzcv;

X[d] = result;
C6.2.9 ADR

Form PC-relative address adds an immediate value to the PC value to form a PC-relative address, and writes the
result to the destination register.

| 31 | 30-29 | 28-24 | 23-5  | 4-0 |
| 0  | immlo | 10000 | immhi | Rd  |
  op

Literal variant 

ADR <Xd>, <label> 


Decode for this encoding 


integer d = UInt(Rd); 
bits(64) imm; 


imm = SignExtend(immhi:immlo, 64); 


Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<label> Is the program label whose address is to be calculated. Its offset from the address of this instruction, 
in the range +/-1MB, is encoded in "immhi:immlo". 
Operation 
bits(64) base = PC[]; 


X[d] = base + imm; 
C6.2.10 ADRP

Form PC-relative address to 4KB page adds an immediate value that is shifted left by 12 bits, to the PC value to
form a PC-relative address, with the bottom 12 bits masked out, and writes the result to the destination register.

| 31 | 30-29 | 28-24 | 23-5  | 4-0 |
| 1  | immlo | 10000 | immhi | Rd  |
  op

Literal variant 

ADRP <Xd>, <label> 

Decode for this encoding 


integer d = UInt(Rd); 
bits(64) imm; 


imm = SignExtend(immhi:immlo:Zeros(12), 64); 


Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<label> Is the program label whose 4KB page address is to be calculated. Its offset from the page address of 
this instruction, in the range +/-4GB, is encoded as "immhi:immlo" times 4096. 
Operation 
bits(64) base = PC[]; 
base<11:0> = Zeros(12); 


X[d] = base + imm; 
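
The address calculations for ADR and ADRP can be compared side by side. The following Python sketch is an illustration only: the function names are hypothetical, immhi is assumed to be the 19-bit field and immlo the 2-bit field of the encoding, and results wrap to 64 bits.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def adr(pc, immhi, immlo):
    """ADR: PC plus a signed 21-bit byte offset (range +/-1MB)."""
    imm = sign_extend((immhi << 2) | immlo, 21)
    return (pc + imm) & MASK64

def adrp(pc, immhi, immlo):
    """ADRP: 4KB page of the PC plus a signed 21-bit page offset (range +/-4GB)."""
    imm = sign_extend(((immhi << 2) | immlo) << 12, 33)
    base = pc & ~0xFFF               # bottom 12 bits of the PC masked out
    return (base + imm) & MASK64
```

ADRP yields only a page address, so it is commonly paired with an ADD (immediate) or a load/store offset that supplies the low 12 bits of the target address.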







C6.2.11 AND (immediate) 


Bitwise AND (immediate) performs a bitwise AND of a register value and an immediate value, and writes the result 
to the destination register. 


| 31 | 30-29 | 28-23  | 22 | 21-16 | 15-10 | 9-5 | 4-0 |
| sf | 00    | 100100 | N  | immr  | imms  | Rn  | Rd  |
       opc

32-bit variant
Applies when sf == 0 && N == 0.

AND <Wd|WSP>, <Wn>, #<imm>

64-bit variant
Applies when sf == 1.

AND <Xd|SP>, <Xn>, #<imm>


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 

if sf == '0' && N != '0' then ReservedValue();
(imm, -) = DecodeBitMasks(N, imms, immr, TRUE);


Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<imm> For the 32-bit variant: is the bitmask immediate, encoded in "imms:immr". 


For the 64-bit variant: is the bitmask immediate, encoded in "N:imms:immr". 


Operation 


bits(datasize) result;
bits(datasize) operand1 = X[n];

result = operand1 AND imm;
if d == 31 then
    SP[] = result;
else
    X[d] = result;
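
The bitmask immediate is a rotated run of ones replicated across the register. The Python sketch below shows one way to expand the "N:imms:immr" fields into a mask. It is written from the general description of the bitmask immediate scheme and is an assumption about what DecodeBitMasks() computes for its first result, not a copy of the architectural function; the name decode_bit_mask is hypothetical.

```python
def decode_bit_mask(n, imms, immr, datasize):
    """Expand the N, imms, and immr fields into a datasize-bit logical immediate."""
    # The element size is set by the highest set bit of N:NOT(imms).
    combined = (n << 6) | (~imms & 0x3F)
    length = combined.bit_length() - 1
    if length < 1:
        raise ValueError("reserved encoding")
    esize = 1 << length
    if esize > datasize:
        raise ValueError("reserved encoding: N must be 0 for the 32-bit variant")
    levels = esize - 1
    if imms & levels == levels:
        raise ValueError("reserved encoding: an element of all ones")
    ones = (imms & levels) + 1       # number of consecutive ones in each element
    rotate = immr & levels           # rotate right amount within the element
    element = (1 << ones) - 1
    element = ((element >> rotate) | (element << (esize - rotate))) & ((1 << esize) - 1)
    mask = 0
    for position in range(0, datasize, esize):
        mask |= element << position  # replicate the element across the register
    return mask
```

Because each element must contain at least one 0 bit and one 1 bit, the values 0 and all ones cannot be expressed as a bitmask immediate.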
C6.2.12 AND (shifted register)

Bitwise AND (shifted register) performs a bitwise AND of a register value and an optionally-shifted register value,
and writes the result to the destination register.

| 31 | 30-29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | 00    | 01010 | shift | 0  | Rm    | imm6  | Rn  | Rd  |
       opc             N

32-bit variant
Applies when sf == 0.

AND <Wd>, <Wn>, <Wm>{, <shift> #<amount>}

64-bit variant
Applies when sf == 1.

AND <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 

if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.

Operation

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);

result = operand1 AND operand2;
X[d] = result; 
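
The four shift types accepted by the logical (shifted register) instructions can be modelled as follows. This Python sketch is illustrative: shift_reg and and_shifted are hypothetical names, the operand is passed as a value rather than a register number, and the shift type is passed as a string.

```python
def shift_reg(value, shift_type, amount, datasize):
    """Apply LSL, LSR, ASR, or ROR by amount to a datasize-bit value."""
    if not 0 <= amount < datasize:
        raise ValueError("shift amount out of range for this data size")
    mask = (1 << datasize) - 1
    value &= mask
    if shift_type == "LSL":
        return (value << amount) & mask
    if shift_type == "LSR":
        return value >> amount
    if shift_type == "ASR":
        # Reinterpret as signed so that Python's >> shifts in copies of the sign bit.
        signed = value - (1 << datasize) if value >> (datasize - 1) else value
        return (signed >> amount) & mask
    if shift_type == "ROR":
        return ((value >> amount) | (value << (datasize - amount))) & mask
    raise ValueError("unknown shift type")

def and_shifted(xn, xm, shift_type, amount, datasize):
    """AND (shifted register) on plain integers."""
    return xn & shift_reg(xm, shift_type, amount, datasize)
```

ROR is available to the logical instructions shown here, whereas the add and subtract (shifted register) forms reserve that shift encoding.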
C6.2.13 ANDS (immediate)
Bitwise AND (immediate), setting flags, performs a bitwise AND of a register value and an immediate value, and
writes the result to the destination register. It updates the condition flags based on the result.
This instruction is used by the alias TST (immediate). See Alias conditions for details of when each alias is
preferred.
| 31 | 30-29 | 28-23  | 22 | 21-16 | 15-10 | 9-5 | 4-0 |
| sf | 11    | 100100 | N  | immr  | imms  | Rn  | Rd  |
       opc
32-bit variant
Applies when sf == 0 && N == 0.
ANDS <Wd>, <Wn>, #<imm>
64-bit variant
Applies when sf == 1.
ANDS <Xd>, <Xn>, #<imm>
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 
if sf == '0' && N != '0' then ReservedValue();
(imm, -) = DecodeBitMasks(N, imms, immr, TRUE); 
Alias conditions 
Alias              Is preferred when
TST (immediate)    Rd == '11111'
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<imm> For the 32-bit variant: is the bitmask immediate, encoded in "imms:immr". 
For the 64-bit variant: is the bitmask immediate, encoded in "N:imms:immr". 
Operation 
bits(datasize) result; 
bits(datasize) operand1 = X[n];

result = operand1 AND imm;
PSTATE.<N,Z,C,V> = result<datasize-1>:IsZeroBit(result):'00';


X[d] = result; 
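
The flag update for the flag-setting logical instructions is simpler than for arithmetic instructions, because C and V are always cleared. A minimal Python sketch, in which the name ands and the tuple layout (n, z, c, v) are assumptions made for illustration:

```python
def ands(operand1, operand2, datasize):
    """Return (result, (n, z, c, v)) for a flag-setting bitwise AND."""
    mask = (1 << datasize) - 1
    result = operand1 & operand2 & mask
    n = result >> (datasize - 1)     # top bit of the result
    z = 1 if result == 0 else 0      # set when the result is zero
    return result, (n, z, 0, 0)      # C and V are written as 0
```

The TST alias corresponds to calling ands and discarding the result, keeping only the flags.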



















C6.2.14 ANDS (shifted register) 

Bitwise AND (shifted register), setting flags, performs a bitwise AND of a register value and an optionally-shifted 
register value, and writes the result to the destination register. It updates the condition flags based on the result. 
This instruction is used by the alias TST (shifted register). See Alias conditions for details of when each alias is 
preferred. 

| 31 | 30-29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | 11    | 01010 | shift | 0  | Rm    | imm6  | Rn  | Rd  |
       opc             N

32-bit variant
Applies when sf == 0.
ANDS <Wd>, <Wn>, <Wm>{, <shift> #<amount>}
64-bit variant
Applies when sf == 1.
ANDS <Xd>, <Xn>, <Xm>{, <shift> #<amount>} 
Decode for all variants of this encoding 

integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 

if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);

integer shift_amount = UInt(imm6);
Alias conditions 

Alias                     Is preferred when
TST (shifted register)    Rd == '11111'
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.
Operation 


bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);

result = operand1 AND operand2;
PSTATE.<N,Z,C,V> = result<datasize-1>:IsZeroBit(result):'00';

X[d] = result;
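
As an illustration of the Operation pseudocode above, the following Python sketch models ANDS (shifted register) on plain integers. The helper names shift_reg and ands_shifted are hypothetical and are not part of the architecture pseudocode; register values are passed directly rather than as register numbers.

```python
def shift_reg(value, shift_type, amount, datasize):
    """Apply LSL, LSR, ASR or ROR to a datasize-bit value (illustrative helper)."""
    mask = (1 << datasize) - 1
    value &= mask
    if shift_type == "LSL":
        return (value << amount) & mask
    if shift_type == "LSR":
        return value >> amount
    if shift_type == "ASR":
        # Reinterpret as signed so that Python's >> shifts in copies of the sign bit.
        signed = value - (1 << datasize) if value >> (datasize - 1) else value
        return (signed >> amount) & mask
    if shift_type == "ROR":
        amount %= datasize
        return ((value >> amount) | (value << (datasize - amount))) & mask
    raise ValueError(shift_type)


def ands_shifted(xn, xm, shift_type="LSL", amount=0, datasize=64):
    """Return (result, (N, Z, C, V)) for ANDS with a shifted second operand."""
    mask = (1 << datasize) - 1
    result = (xn & mask) & shift_reg(xm, shift_type, amount, datasize)
    n = result >> (datasize - 1)      # N is the top bit of the result
    z = 1 if result == 0 else 0       # Z is set when the result is zero
    return result, (n, z, 0, 0)       # C and V are always written as 0
```

The TST alias corresponds to calling ands_shifted and discarding the result, keeping only the flags.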
C6.2.15 ASR (register)


Arithmetic Shift Right (register) shifts a register value right by a variable number of bits, shifting in copies of its 
sign bit, and writes the result to the destination register. The remainder obtained by dividing the second source 
register by the data size defines the number of bits by which the first source register is right-shifted. 


This instruction is an alias of the ASRV instruction. This means that: 
• The encodings in this description are named to match the encodings of ASRV.
• The description of ASRV gives the operational pseudocode for this instruction.


| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20  16 | 15 14 13 12 | 11 10 | 9  5 | 4  0 |
| sf |  0 |  0 |  1  1  0  1  0  1  1  0 |   Rm   |  0  0  1  0 |  1  0 |  Rn  |  Rd  |
  op2: bits 11-10

32-bit variant
Applies when sf == 0.
ASR <Wd>, <Wn>, <Wm>
is equivalent to 

ASRV <Wd>, <Wn>, <Wm> 


and is always the preferred disassembly. 


64-bit variant 
Applies when sf == 1.
ASR <Xd>, <Xn>, <Xm> 
is equivalent to 

ASRV <Xd>, <Xn>, <Xm> 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to
31 in its bottom 5 bits, encoded in the "Rm" field.


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to
63 in its bottom 6 bits, encoded in the "Rm" field.


Operation 


The description of ASRV gives the operational pseudocode for this instruction. 
C6.2.16 ASR (immediate)


Arithmetic Shift Right (immediate) shifts a register value right by an immediate number of bits, shifting in copies 
of the sign bit in the upper bits and zeros in the lower bits, and writes the result to the destination register. 


This instruction is an alias of the SBFM instruction. This means that: 


• The encodings in this description are named to match the encodings of SBFM.
• The description of SBFM gives the operational pseudocode for this instruction.

| 31 | 30 29 | 28 27 26 25 24 23 | 22 | 21  16 | 15       10 | 9  5 | 4  0 |
| sf |  0  0 |  1  0  0  1  1  0 |  N |  immr  | x 1 1 1 1 1 |  Rn  |  Rd  |
  opc: bits 30-29; imms: bits 15-10

32-bit variant

Applies when sf == 0 && N == 0 && imms == 011111.
ASR <Wd>, <Wn>, #<shift>

is equivalent to 

SBFM <Wd>, <Wn>, #<shift>, #31 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1 && N == 1 && imms == 111111. 
ASR <Xd>, <Xn>, #<shift> 

is equivalent to 

SBFM <Xd>, <Xn>, #<shift>, #63 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<shift> For the 32-bit variant: is the shift amount, in the range 0 to 31, encoded in the "immr" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, encoded in the "immr" field. 


Operation 


The description of SBFM gives the operational pseudocode for this instruction. 
C6.2.17 ASRV


Arithmetic Shift Right Variable shifts a register value right by a variable number of bits, shifting in copies of its sign 
bit, and writes the result to the destination register. The remainder obtained by dividing the second source register 
by the data size defines the number of bits by which the first source register is right-shifted. 


This instruction is used by the alias ASR (register). The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20  16 | 15 14 13 12 | 11 10 | 9  5 | 4  0 |
| sf |  0 |  0 |  1  1  0  1  0  1  1  0 |   Rm   |  0  0  1  0 |  1  0 |  Rn  |  Rd  |
  op2: bits 11-10

32-bit variant
Applies when sf == 0.

ASRV <Wd>, <Wn>, <Wm>

64-bit variant
Applies when sf == 1.

ASRV <Xd>, <Xn>, <Xm>


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 
ShiftType shift_type = DecodeShift(op2); 


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to 
31 in its bottom 5 bits, encoded in the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to 
63 in its bottom 6 bits, encoded in the "Rm" field. 

Operation 


bits(datasize) result; 
bits(datasize) operand2 = X[m]; 


result = ShiftReg(n, shift_type, UInt(operand2) MOD datasize); 
X[d] = result; 
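
The reduction of the shift amount modulo the data size can be seen in a short Python sketch. The function name asrv is hypothetical, and register contents are modelled as unsigned Python integers.

```python
def asrv(xn, xm, datasize=64):
    """Arithmetic shift right of xn by (xm MOD datasize); values are unsigned ints."""
    mask = (1 << datasize) - 1
    xn &= mask
    amount = (xm & mask) % datasize   # only the low 5 or 6 bits of xm matter
    # Reinterpret xn as signed so that >> replicates the sign bit.
    signed = xn - (1 << datasize) if xn >> (datasize - 1) else xn
    return (signed >> amount) & mask
```

With datasize 32, shift amounts of 4 and 36 therefore give the same result.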
C6.2.18 AT


Address Translate. For more information, see A64 system instructions for address translation. 


This instruction is an alias of the SYS instruction. This means that: 


• The encodings in this description are named to match the encodings of SYS.
• The description of SYS gives the operational pseudocode for this instruction.

| 31 30 29 28 27 26 25 24 23 22 | 21 | 20 19 | 18  16 | 15 14 13 12 | 11 10 9 8 | 7   5 | 4  0 |
|  1  1  0  1  0  1  0  1  0  0 |  0 |  0  1 |  op1   |  0  1  1  1 |  1  0 0 x |  op2  |  Rt  |
  L: bit 21; CRn: bits 15-12; CRm: bits 11-8


System variant 

AT <at_op>, <Xt> 

is equivalent to 

SYS #<op1>, C7, <Cm>, #<op2>, <Xt>

and is the preferred disassembly when SysOp(op1, '0111', CRm, op2) == Sys_AT.


Assembler symbols

<at_op> Is an AT instruction name, as listed for the AT system instruction group, encoded in the
"op1:CRm<0>:op2" field. It can have the following values:
S1E1R when op1 = 000, CRm<0> = 0, op2 = 000
S1E1W when op1 = 000, CRm<0> = 0, op2 = 001
S1E0R when op1 = 000, CRm<0> = 0, op2 = 010
S1E0W when op1 = 000, CRm<0> = 0, op2 = 011
S1E2R when op1 = 100, CRm<0> = 0, op2 = 000
S1E2W when op1 = 100, CRm<0> = 0, op2 = 001
S12E1R when op1 = 100, CRm<0> = 0, op2 = 100
S12E1W when op1 = 100, CRm<0> = 0, op2 = 101
S12E0R when op1 = 100, CRm<0> = 0, op2 = 110
S12E0W when op1 = 100, CRm<0> = 0, op2 = 111
S1E3R when op1 = 110, CRm<0> = 0, op2 = 000
S1E3W when op1 = 110, CRm<0> = 0, op2 = 001
<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.
<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.
<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field.
<Xt> Is the 64-bit name of the general-purpose source register, encoded in the "Rt" field.


Operation 


The description of SYS gives the operational pseudocode for this instruction. 
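
The mapping from an AT operation name to the operands of the equivalent SYS instruction can be sketched as a table lookup. The names AT_OPS and at_to_sys are hypothetical. The sketch assumes that CRm<3:1> is fixed at '100' for this alias, so that only CRm<0> comes from the table.

```python
# (op1, CRm<0>, op2) for each <at_op>, taken from the list of values above.
AT_OPS = {
    "S1E1R":  (0b000, 0, 0b000),
    "S1E1W":  (0b000, 0, 0b001),
    "S1E0R":  (0b000, 0, 0b010),
    "S1E0W":  (0b000, 0, 0b011),
    "S1E2R":  (0b100, 0, 0b000),
    "S1E2W":  (0b100, 0, 0b001),
    "S12E1R": (0b100, 0, 0b100),
    "S12E1W": (0b100, 0, 0b101),
    "S12E0R": (0b100, 0, 0b110),
    "S12E0W": (0b100, 0, 0b111),
    "S1E3R":  (0b110, 0, 0b000),
    "S1E3W":  (0b110, 0, 0b001),
}


def at_to_sys(at_op):
    """Return (op1, CRn, CRm, op2) for the SYS form of AT <at_op>, <Xt>."""
    op1, crm0, op2 = AT_OPS[at_op.upper()]
    crm = 0b1000 | crm0          # assumption: CRm is '100x', x being CRm<0>
    return op1, 7, crm, op2      # CRn is always C7 for this alias
```

For example, at_to_sys("S1E1R") gives the operands of SYS #0, C7, C8, #0, <Xt>.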
C6.2.19 B.cond


Branch conditionally to a label at a PC-relative offset, with a hint that this is not a subroutine call or return. 


| 31 30 29 28 27 26 25 | 24 | 23    5 | 4 | 3   0 |
|  0  1  0  1  0  1  0 |  0 |  imm19  | 0 | cond  |


19-bit signed PC-relative branch offset variant 


B.<cond> <label> 


Decode for this encoding 
bits(64) offset = SignExtend(imm19:'00', 64);
bits(4) condition = cond; 
Assembler symbols 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 


<label> Is the program label to be conditionally branched to. Its offset from the address of this instruction, 
in the range +/-1MB, is encoded as "imm19" times 4. 


Operation 


if ConditionHolds(condition) then
    BranchTo(PC[] + offset, BranchType_JMP);
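
The offset computation SignExtend(imm19:'00', 64) can be worked through with a small Python sketch. The names sign_extend and bcond_target are hypothetical, and the function returns the address that is branched to when the condition holds.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def bcond_target(pc, imm19):
    """Target of B.cond at address pc: PC + SignExtend(imm19:'00', 64)."""
    offset = sign_extend((imm19 & 0x7FFFF) << 2, 21)   # imm19:'00' is 21 bits wide
    return (pc + offset) & 0xFFFFFFFFFFFFFFFF
```

The 21-bit signed byte offset is what gives the +/-1MB range stated for <label>.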
C6.2.20 B


Branch causes an unconditional branch to a label at a PC-relative offset, with a hint that this is not a subroutine call 
or return. 


| 31 | 30 29 28 27 26 | 25    0 |
|  0 |  0  0  1  0  1 |  imm26  |
  op: bit 31


26-bit signed PC-relative branch offset variant 


B <label> 


Decode for this encoding 


bits(64) offset = SignExtend(imm26:'00', 64); 


Assembler symbols 


<label> Is the program label to be unconditionally branched to. Its offset from the address of this instruction, 
in the range +/-128MB, is encoded as "imm26" times 4. 


Operation 


BranchTo(PC[] + offset, BranchType_JMP);
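
Going the other way, from a byte offset to an instruction word, can be sketched as follows. The function name encode_b is hypothetical. The sketch assumes the fixed bits of this encoding are op = 0 in bit 31 and '00101' in bits 30-26, with imm26 in bits 25-0.

```python
def encode_b(offset):
    """Encode B <label> for a byte offset measured from this instruction."""
    if offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4")
    if not -(1 << 27) <= offset < (1 << 27):
        raise ValueError("offset outside the +/-128MB range")
    imm26 = (offset >> 2) & 0x3FFFFFF     # the offset is stored divided by 4
    return (0b000101 << 26) | imm26       # op:'00101' occupies bits 31-26
```

An offset of 0 (a branch to itself) encodes as 0x14000000 under these assumptions.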
C6.2.21 BFI
Bitfield Insert copies any number of low-order bits from a source register into the same number of adjacent bits at 
any position in the destination register, leaving other bits unchanged. 
This instruction is an alias of the BFM instruction. This means that: 
• The encodings in this description are named to match the encodings of BFM.
• The description of BFM gives the operational pseudocode for this instruction.

| 31 | 30 29 | 28 27 26 25 24 23 | 22 | 21  16 | 15  10 | 9  5 | 4  0 |
| sf |  0  1 |  1  0  0  1  1  0 |  N |  immr  |  imms  |  Rn  |  Rd  |
  opc: bits 30-29

32-bit variant
Applies when sf == 0 && N == 0.
BFI <Wd>, <Wn>, #<lsb>, #<width>
is equivalent to
BFM <Wd>, <Wn>, #(-<lsb> MOD 32), #(<width>-1)
and is the preferred disassembly when UInt(imms) < UInt(immr).
64-bit variant
Applies when sf == 1 && N == 1.
BFI <Xd>, <Xn>, #<lsb>, #<width>
is equivalent to
BFM <Xd>, <Xn>, #(-<lsb> MOD 64), #(<width>-1)
and is the preferred disassembly when UInt(imms) < UInt(immr).
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<lsb> For the 32-bit variant: is the bit number of the lsb of the destination bitfield, in the range 0 to 31.
For the 64-bit variant: is the bit number of the lsb of the destination bitfield, in the range 0 to 63.
<width> For the 32-bit variant: is the width of the bitfield, in the range 1 to 32-<lsb>.
For the 64-bit variant: is the width of the bitfield, in the range 1 to 64-<lsb>.
Operation 
The description of BFM gives the operational pseudocode for this instruction. 
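
The translation from the BFI operands to the BFM operands given above can be written out as a short Python sketch. The function name bfi_to_bfm is hypothetical.

```python
def bfi_to_bfm(lsb, width, datasize=32):
    """Map BFI #lsb, #width to the BFM operands (immr, imms)."""
    if not 0 <= lsb < datasize:
        raise ValueError("lsb out of range")
    if not 1 <= width <= datasize - lsb:
        raise ValueError("width out of range")
    immr = (-lsb) % datasize    # right-rotate amount that moves bit 0 up to bit lsb
    imms = width - 1            # leftmost source bit to be moved
    return immr, imms
```

For example, BFI W0, W1, #8, #4 corresponds to BFM W0, W1, #24, #3. When lsb is 0 the result has imms >= immr, so the same encoding disassembles as BFXIL rather than BFI.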
C6.2.22 BFM


Bitfield Move copies any number of low-order bits from a source register into the same number of adjacent bits at 
any position in the destination register, leaving other bits unchanged. 


This instruction is used by the aliases BFI and BFXIL. See Alias conditions for details of when each alias is 
preferred. 


| 31 | 30 29 | 28 27 26 25 24 23 | 22 | 21  16 | 15  10 | 9  5 | 4  0 |
| sf |  0  1 |  1  0  0  1  1  0 |  N |  immr  |  imms  |  Rn  |  Rd  |
  opc: bits 30-29

32-bit variant
Applies when sf == 0 && N == 0.

BFM <Wd>, <Wn>, #<immr>, #<imms>

64-bit variant
Applies when sf == 1 && N == 1.

BFM <Xd>, <Xn>, #<immr>, #<imms>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 


integer R; 
bits(datasize) wmask; 
bits(datasize) tmask; 


if sf == '1' && N != '1' then ReservedValue(); 
if sf == '0' && (N != '0' || immr<5> != '0' || imms<5> != '0') then ReservedValue();


R = UInt(immr); 
(wmask, tmask) = DecodeBitMasks(N, imms, immr, FALSE); 


Alias conditions

Alias   Is preferred when
BFI     Rn != '11111' && UInt(imms) < UInt(immr)
BFXIL   UInt(imms) >= UInt(immr)





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<immr> For the 32-bit variant: is the right rotate amount, in the range 0 to 31, encoded in the "immr" field.

For the 64-bit variant: is the right rotate amount, in the range 0 to 63, encoded in the "immr" field.


<imms> For the 32-bit variant: is the leftmost bit number to be moved from the source, in the range 0 to 31, 
encoded in the "imms" field. 


For the 64-bit variant: is the leftmost bit number to be moved from the source, in the range 0 to 63, 
encoded in the "imms" field. 


Operation 


bits(datasize) dst = X[d]; 
bits(datasize) src = X[n]; 


// perform bitfield move on low bits 
bits(datasize) bot = (dst AND NOT(wmask)) OR (ROR(src, R) AND wmask); 


// combine extension bits and result bits 
X[d] = (dst AND NOT(tmask)) OR (bot AND tmask); 
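
The Operation pseudocode can be traced with a Python sketch on plain integers. The names ror and bfm are hypothetical. The sketch does not implement DecodeBitMasks in general: it assumes that, for BFM, the element size equals datasize, so that wmask is Ones(imms+1) rotated right by immr and tmask is Ones(((imms - immr) MOD datasize) + 1).

```python
def ror(value, amount, size):
    """Rotate a size-bit value right by amount bits."""
    mask = (1 << size) - 1
    amount %= size
    value &= mask
    return ((value >> amount) | (value << (size - amount))) & mask


def bfm(dst, src, immr, imms, datasize=32):
    """Return the new destination value for BFM dst, src, #immr, #imms."""
    mask = (1 << datasize) - 1
    # Simplified masks, valid under the assumption stated in the lead-in.
    wmask = ror((1 << (imms + 1)) - 1, immr, datasize)
    tmask = (1 << (((imms - immr) % datasize) + 1)) - 1
    # perform bitfield move on low bits
    bot = (dst & ~wmask & mask) | (ror(src, immr, datasize) & wmask)
    # combine extension bits and result bits
    return ((dst & ~tmask) | (bot & tmask)) & mask
```

With immr = 24 and imms = 3 (the BFI form with lsb 8 and width 4) the low four bits of src land in bits 11-8 of dst. With immr = 4 and imms = 11 (the BFXIL form) bits 11-4 of src land in bits 7-0.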





C6-466 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


C6 A64 Base Instruction Descriptions 
C6.2 Alphabetical list of A64 base instructions 


C6.2.23 BFXIL 


Bitfield extract and insert at low end copies any number of low-order bits from a source register into the same 
number of adjacent bits at the low end in the destination register, leaving other bits unchanged. 


This instruction is an alias of the BFM instruction. This means that: 
• The encodings in this description are named to match the encodings of BFM.
• The description of BFM gives the operational pseudocode for this instruction.


| 31 | 30 29 | 28 27 26 25 24 23 | 22 | 21  16 | 15  10 | 9  5 | 4  0 |
| sf |  0  1 |  1  0  0  1  1  0 |  N |  immr  |  imms  |  Rn  |  Rd  |
  opc: bits 30-29

32-bit variant

Applies when sf == 0 && N == 0.

BFXIL <Wd>, <Wn>, #<lsb>, #<width>

is equivalent to

BFM <Wd>, <Wn>, #<lsb>, #(<lsb>+<width>-1)

and is the preferred disassembly when UInt(imms) >= UInt(immr).

64-bit variant

Applies when sf == 1 && N == 1.

BFXIL <Xd>, <Xn>, #<lsb>, #<width>

is equivalent to

BFM <Xd>, <Xn>, #<lsb>, #(<lsb>+<width>-1)


and is the preferred disassembly when UInt(imms) >= UInt(immr). 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<lsb> For the 32-bit variant: is the bit number of the lsb of the source bitfield, in the range 0 to 31.

For the 64-bit variant: is the bit number of the lsb of the source bitfield, in the range 0 to 63.

<width> For the 32-bit variant: is the width of the bitfield, in the range 1 to 32-<lsb>.
For the 64-bit variant: is the width of the bitfield, in the range 1 to 64-<lsb>.


Operation 


The description of BFM gives the operational pseudocode for this instruction. 
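
A disassembler has to reverse these mappings. The following Python sketch chooses between BFXIL and BFI from the BFM operands and recovers lsb and width. The function name bfm_preferred_alias is hypothetical, and the sketch looks only at immr and imms: it ignores the additional Rn condition listed for BFI in the BFM alias conditions.

```python
def bfm_preferred_alias(immr, imms, datasize=32):
    """Return (mnemonic, lsb, width) for the preferred disassembly of BFM."""
    if imms >= immr:
        # BFXIL: immr = lsb and imms = lsb + width - 1
        return "BFXIL", immr, imms - immr + 1
    # BFI: immr = (-lsb) MOD datasize and imms = width - 1
    return "BFI", (-immr) % datasize, imms + 1
```

For example, BFM W0, W1, #4, #11 is shown as BFXIL W0, W1, #4, #8, while BFM W0, W1, #24, #3 is shown as BFI W0, W1, #8, #4.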







C6.2.24 BIC (shifted register) 


Bitwise Bit Clear (shifted register) performs a bitwise AND of a register value and the complement of an 
optionally-shifted register value, and writes the result to the destination register. 


| 31 | 30 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15  10 | 9  5 | 4  0 |
| sf |  0  0 |  0  1  0  1  0 | shift |  1 |   Rm   |  imm6  |  Rn  |  Rd  |
  opc: bits 30-29

32-bit variant
Applies when sf == 0.

BIC <Wd>, <Wn>, <Wm>{, <shift> #<amount>}

64-bit variant
Applies when sf == 1.

BIC <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 

if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.

Operation

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);

operand2 = NOT(operand2);

result = operand1 AND operand2;
X[d] = result;
C6.2.25 BICS (shifted register)


Bitwise Bit Clear (shifted register), setting flags, performs a bitwise AND of a register value and the complement 
of an optionally-shifted register value, and writes the result to the destination register. It updates the condition flags 
based on the result. 


| 31 | 30 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15  10 | 9  5 | 4  0 |
| sf |  1  1 |  0  1  0  1  0 | shift |  1 |   Rm   |  imm6  |  Rn  |  Rd  |
  opc: bits 30-29

32-bit variant
Applies when sf == 0.

BICS <Wd>, <Wn>, <Wm>{, <shift> #<amount>}

64-bit variant
Applies when sf == 1.

BICS <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.

Operation

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);

operand2 = NOT(operand2);

result = operand1 AND operand2;
PSTATE.<N,Z,C,V> = result<datasize-1>:IsZeroBit(result):'00';

X[d] = result;
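
A minimal Python sketch of the bit-clear and flag-setting steps follows. The function name bics is hypothetical, and the second argument is taken to be the second source value after any optional shift has already been applied.

```python
def bics(xn, operand2, datasize=64):
    """Return (result, (N, Z, C, V)) for BICS; operand2 is the shifted Xm value."""
    mask = (1 << datasize) - 1
    result = xn & ~operand2 & mask      # AND with the complement of operand2
    n = result >> (datasize - 1)        # N is the top bit of the result
    z = int(result == 0)                # Z is set when the result is zero
    return result, (n, z, 0, 0)         # C and V are always written as 0
```

BIC (shifted register) computes the same result and leaves the condition flags unchanged.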
C6.2.26 BL


Branch with Link branches to a PC-relative offset, setting the register X30 to PC+4. It provides a hint that this is a 
subroutine call. 


| 31 | 30 29 28 27 26 | 25    0 |
|  1 |  0  0  1  0  1 |  imm26  |
  op: bit 31


26-bit signed PC-relative branch offset variant 


BL <label> 


Decode for this encoding 


bits(64) offset = SignExtend(imm26:'00', 64); 


Assembler symbols 
<label> Is the program label to be unconditionally branched to. Its offset from the address of this instruction, 
in the range +/-128MB, is encoded as "imm26" times 4. 
Operation 
X[30] = PC[] + 4; 


BranchTo(PC[] + offset, BranchType_CALL); 
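
The two effects of BL, writing the return address and computing the target, can be sketched in Python. The function name bl is hypothetical, and addresses are modelled as 64-bit unsigned integers.

```python
def bl(pc, imm26):
    """Return (new_pc, x30) after executing BL at address pc."""
    mask64 = (1 << 64) - 1
    offset = (imm26 & 0x3FFFFFF) << 2     # imm26:'00', a 28-bit value
    if offset >> 27:                      # sign bit of the 28-bit value
        offset -= 1 << 28
    x30 = (pc + 4) & mask64               # address of the following instruction
    return (pc + offset) & mask64, x30
```

A later RET, which by default branches to the address in X30, then resumes execution at the instruction after the BL.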
C6.2.27 BLR


Branch with Link to Register calls a subroutine at an address in a register, setting register X30 to PC+4. 


| 31 30 29 28 27 26 25 | 24 | 23 | 22 21 | 20 19 18 17 16 | 15 14 13 12 11 10 | 9  5 | 4 3 2 1 0 |
|  1  1  0  1  0  1  1 |  0 |  0 |  0  1 |  1  1  1  1  1 |  0  0  0  0  0  0 |  Rn  | 0 0 0 0 0 |
  op: bits 22-21


Integer variant 
BLR <Xn> 


Decode for this encoding 


integer n = UInt(Rn); 


Assembler symbols 
<Xn> Is the 64-bit name of the general-purpose register holding the address to be branched to, encoded in 
the "Rn" field. 
Operation 
bits(64) target = X[n]; 


X[30] = PC[] + 4; 
BranchTo(target, BranchType_CALL);
C6.2.28 BR
Branch to Register branches unconditionally to an address in a register, with a hint that this is not a subroutine return. 
| 31 30 29 28 27 26 25 | 24 | 23 | 22 21 | 20 19 18 17 16 | 15 14 13 12 11 10 | 9  5 | 4 3 2 1 0 |
|  1  1  0  1  0  1  1 |  0 |  0 |  0  0 |  1  1  1  1  1 |  0  0  0  0  0  0 |  Rn  | 0 0 0 0 0 |
  op: bits 22-21
Integer variant 
BR <Xn> 
Decode for this encoding 
integer n = UInt(Rn); 
Assembler symbols 
<Xn> Is the 64-bit name of the general-purpose register holding the address to be branched to, encoded in 
the "Rn" field. 
Operation 
bits(64) target = X[n]; 
BranchTo(target, BranchType_JMP);
C6.2.29 BRK


Breakpoint instruction generates a Breakpoint Instruction exception. The PE records the exception in ESR_ELx, 
using the EC value 0x3c, and captures the value of the immediate argument in ESR_ELx.ISS. 


| 31 30 29 28 27 26 25 24 | 23 22 21 | 20     5 | 4 3 2 | 1 0 |
|  1  1  0  1  0  1  0  0 |  0  0  1 |  imm16   | 0 0 0 | 0 0 |


System variant 


BRK #<imm> 
Decode for this encoding 


bits(16) comment = imm16; 


Assembler symbols 


<imm> Is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 


Operation 


AArch64.SoftwareBreakpoint(comment);
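
As a rough illustration of how the EC value and the immediate end up in the syndrome register, the following Python sketch assembles the value that would be expected in ESR_ELx after a BRK. The function name brk_esr is hypothetical. The field positions are assumptions that are not stated in this section: EC in bits 31-26, IL in bit 25, and the immediate in the low 16 bits of ISS. The sketch also assumes IL is 1 because every A64 instruction is 32 bits long.

```python
def brk_esr(imm16):
    """Expected ESR_ELx value after BRK #imm16 (sketch under stated assumptions)."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must be in the range 0 to 65535")
    ec = 0x3C      # exception class given in the description above
    il = 1         # assumption: instruction length bit set for a 32-bit instruction
    return (ec << 26) | (il << 25) | imm16
```

Under these assumptions a debugger or exception handler can recover the immediate by masking the syndrome value with 0xFFFF.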
C6.2.30 CBNZ


Compare and Branch on Nonzero compares the value in a register with zero, and conditionally branches to a label 
at a PC-relative offset if the comparison is not equal. It provides a hint that this is not a subroutine call or return. 
This instruction does not affect the condition flags. 


| 31 | 30 29 28 27 26 25 | 24 | 23    5 | 4  0 |
| sf |  0  1  1  0  1  0 |  1 |  imm19  |  Rt  |
  op: bit 24


32-bit variant 
Applies when sf == 0. 


CBNZ <Wt>, <label> 


64-bit variant 
Applies when sf == 1. 


CBNZ <Xt>, <label> 


Decode for all variants of this encoding 
integer t = UInt(Rt); 


integer datasize = if sf == '1' then 64 else 32; 
bits(64) offset = SignExtend(imm19:'00', 64);


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be tested, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be tested, encoded in the "Rt" field. 
<label> Is the program label to be conditionally branched to. Its offset from the address of this instruction,
in the range +/-1MB, is encoded as "imm19" times 4.


Operation 
bits(datasize) operand1 = X[t];

if IsZero(operand1) == FALSE then
    BranchTo(PC[] + offset, BranchType_JMP);
C6.2.31 CBZ


Compare and Branch on Zero compares the value in a register with zero, and conditionally branches to a label at a 
PC-relative offset if the comparison is equal. It provides a hint that this is not a subroutine call or return. This 
instruction does not affect condition flags. 


| 31 | 30 29 28 27 26 25 | 24 | 23    5 | 4  0 |
| sf |  0  1  1  0  1  0 |  0 |  imm19  |  Rt  |
  op: bit 24


32-bit variant 
Applies when sf == 0. 


CBZ <Wt>, <label> 


64-bit variant 
Applies when sf == 1. 


CBZ <Xt>, <label> 


Decode for all variants of this encoding 
integer t = UInt(Rt); 


integer datasize = if sf == '1' then 64 else 32; 
bits(64) offset = SignExtend(imm19:'00', 64);


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be tested, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be tested, encoded in the "Rt" field. 
<label> Is the program label to be conditionally branched to. Its offset from the address of this instruction,
in the range +/-1MB, is encoded as "imm19" times 4.


Operation 
bits(datasize) operand1 = X[t];

if IsZero(operand1) == TRUE then
    BranchTo(PC[] + offset, BranchType_JMP);
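
CBZ and CBNZ differ only in the sense of the zero test, which the following Python sketch makes explicit. The function name compare_and_branch and its branch_if_zero parameter are hypothetical. The fall-through address is taken to be PC + 4, since every A64 instruction is 4 bytes long.

```python
def compare_and_branch(pc, reg_value, imm19, datasize=64, branch_if_zero=True):
    """Next PC for CBZ (branch_if_zero=True) or CBNZ (branch_if_zero=False)."""
    operand = reg_value & ((1 << datasize) - 1)   # Wt uses only the low 32 bits
    offset = (imm19 & 0x7FFFF) << 2               # imm19:'00', a 21-bit value
    if offset >> 20:                              # sign bit of the 21-bit value
        offset -= 1 << 21
    taken = (operand == 0) == branch_if_zero
    return (pc + offset if taken else pc + 4) & 0xFFFFFFFFFFFFFFFF
```

Neither instruction reads or writes the condition flags, so the sketch carries no flag state.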
C6.2.32 CCMN (immediate)
Conditional Compare Negative (immediate) sets the value of the condition flags to the result of the comparison of 
a register value and a negated immediate value if the condition is TRUE, and an immediate value otherwise. 
| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20  16 | 15  12 | 11 | 10 | 9  5 | 4 | 3  0 |
| sf |  0 |  1 |  1  1  0  1  0  0  1  0 |  imm5  |  cond  |  1 |  0 |  Rn  | 0 | nzcv |
  op: bit 30
32-bit variant 
Applies when sf == 0. 
CCMN <Wn>, #<imm>, #<nzcv>, <cond> 
64-bit variant 
Applies when sf == 1. 
CCMN <Xn>, #<imm>, #<nzcv>, <cond> 
Decode for all variants of this encoding 
integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 
bits(4) flags = nzcv; 
bits(datasize) imm = ZeroExtend(imm5, datasize); 
Assembler symbols 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<imm> Is a five bit unsigned (positive) immediate encoded in the "imm5" field. 
<nzcv> Is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit 
NZCV condition flags, encoded in the "nzcv" field. 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation 
bits(datasize) operand1 = X[n];
if ConditionHolds(cond) then
    (-, flags) = AddWithCarry(operand1, imm, '0');
PSTATE.<N,Z,C,V> = flags;
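
The Operation pseudocode relies on AddWithCarry to produce the flags. The following Python sketch gives one possible model of that function and of the conditional selection between the computed flags and the <nzcv> immediate. The names add_with_carry and ccmn_imm are hypothetical, and the cond_holds argument stands in for the result of ConditionHolds(cond).

```python
def add_with_carry(x, y, carry_in, datasize):
    """Return (result, nzcv), with nzcv packed as the 4-bit integer N:Z:C:V."""
    mask = (1 << datasize) - 1
    sign = 1 << (datasize - 1)
    x &= mask
    y &= mask
    unsigned_sum = x + y + carry_in
    result = unsigned_sum & mask
    n = int(bool(result & sign))
    z = int(result == 0)
    c = int(unsigned_sum > mask)      # carry out of the top bit
    # Signed overflow: the operands share a sign that differs from the result's.
    v = int(bool(~(x ^ y) & (x ^ result) & sign))
    return result, (n << 3) | (z << 2) | (c << 1) | v


def ccmn_imm(xn, imm5, nzcv, cond_holds, datasize=64):
    """Flags written by CCMN (immediate)."""
    if cond_holds:
        _, flags = add_with_carry(xn, imm5, 0, datasize)
        return flags
    return nzcv                       # condition failed: use the immediate flags
```

Because the comparison is against the negated immediate, the flags come from an addition: comparing 0xFFFFFFFF (that is, -1 as a 32-bit value) with -1 sets Z.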
C6.2.33 CCMN (register)


Conditional Compare Negative (register) sets the value of the condition flags to the result of the comparison of a 
register value and the inverse of another register value if the condition is TRUE, and an immediate value otherwise. 


| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20  16 | 15  12 | 11 | 10 | 9  5 | 4 | 3  0 |
| sf |  0 |  1 |  1  1  0  1  0  0  1  0 |   Rm   |  cond  |  0 |  0 |  Rn  | 0 | nzcv |
  op: bit 30


32-bit variant 
Applies when sf == 0. 


CCMN <Wn>, <Wm>, #<nzcv>, <cond> 


64-bit variant 
Applies when sf == 1. 


CCMN <Xn>, <Xm>, #<nzcv>, <cond> 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 
bits(4) flags = nzcv; 


Assembler symbols 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<nzcv> Is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit
NZCV condition flags, encoded in the "nzcv" field.


<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation 

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];

if ConditionHolds(cond) then
    (-, flags) = AddWithCarry(operand1, operand2, '0');
PSTATE.<N,Z,C,V> = flags;
C6.2.34 CCMP (immediate)


Conditional Compare (immediate) sets the value of the condition flags to the result of the comparison of a register 
value and an immediate value if the condition is TRUE, and an immediate value otherwise. 


| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20  16 | 15  12 | 11 | 10 | 9  5 | 4 | 3  0 |
| sf |  1 |  1 |  1  1  0  1  0  0  1  0 |  imm5  |  cond  |  1 |  0 |  Rn  | 0 | nzcv |
  op: bit 30


32-bit variant 


Applies when sf == 0. 


CCMP <Wn>, #<imm>, #<nzcv>, <cond> 


64-bit variant 


Applies when sf == 1. 


CCMP <Xn>, #<imm>, #<nzcv>, <cond> 


Decode for all variants of this encoding 


integer n = UInt(Rn);


integer datasize = if sf == '1' then 64 else 32; 
bits(4) flags = nzcv; 
bits(datasize) imm = ZeroExtend(imm5, datasize); 


Assembler symbols 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<imm> Is a five bit unsigned (positive) immediate encoded in the "imm5" field. 

<nzcv> Is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit 
NZCV condition flags, encoded in the "nzcv" field. 

<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 

Operation 


bits(datasize) operand1 = X[n];
bits(datasize) operand2;

if ConditionHolds(cond) then
    operand2 = NOT(imm);
    (-, flags) = AddWithCarry(operand1, operand2, '1');
PSTATE.<N,Z,C,V> = flags;
C6.2.35 CCMP (register)


Conditional Compare (register) sets the value of the condition flags to the result of the comparison of two registers 
if the condition is TRUE, and an immediate value otherwise. 


| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20  16 | 15  12 | 11 | 10 | 9  5 | 4 | 3  0 |
| sf |  1 |  1 |  1  1  0  1  0  0  1  0 |   Rm   |  cond  |  0 |  0 |  Rn  | 0 | nzcv |
  op: bit 30


32-bit variant 
Applies when sf == 0. 


CCMP <Wn>, <Wm>, #<nzcv>, <cond> 


64-bit variant 
Applies when sf == 1. 


CCMP <Xn>, <Xm>, #<nzcv>, <cond> 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 
bits(4) flags = nzcv; 


Assembler symbols 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<nzcv> Is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit
NZCV condition flags, encoded in the "nzcv" field.


<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation 

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];

if ConditionHolds(cond) then
    operand2 = NOT(operand2);
    (-, flags) = AddWithCarry(operand1, operand2, '1');
PSTATE.<N,Z,C,V> = flags;
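
The pseudocode performs the comparison as operand1 + NOT(operand2) + 1, which is a two's complement subtraction. The following Python sketch computes the resulting flags for the case where the condition holds. The function name ccmp_flags is hypothetical.

```python
def ccmp_flags(xn, xm, datasize=64):
    """NZCV, packed as the 4-bit integer N:Z:C:V, when the CCMP condition holds."""
    mask = (1 << datasize) - 1
    sign = 1 << (datasize - 1)
    xn &= mask
    xm &= mask
    total = xn + (~xm & mask) + 1     # operand1 + NOT(operand2) + carry-in of 1
    result = total & mask
    n = int(bool(result & sign))
    z = int(result == 0)              # set when the two values are equal
    c = int(total > mask)             # set when xn >= xm as unsigned values
    # Signed overflow: operands differ in sign and the result's sign differs from xn's.
    v = int(bool((xn ^ xm) & (xn ^ result) & sign))
    return (n << 3) | (z << 2) | (c << 1) | v
```

When the condition does not hold, PSTATE.<N,Z,C,V> is instead written with the <nzcv> immediate, so the instruction never leaves the flags unchanged.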
C6.2.36 CINC


Conditional Increment returns, in the destination register, the value of the source register incremented by 1 if the 
condition is TRUE, and otherwise returns the value of the source register. 


This instruction is an alias of the CSINC instruction. This means that: 


• The encodings in this description are named to match the encodings of CSINC.
• The description of CSINC gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28 27 26 25 24 23 22 21 | 20       16 | 15        12 | 11 | 10 | 9         5 | 4  0 |
| sf |  0 |  0 |  1  1  0  1  0  1  0  0 | Rm != 11111 | cond != 111x |  0 |  1 | Rn != 11111 |  Rd  |
  op: bit 30; o2: bit 10


32-bit variant 

Applies when sf == 0. 

CINC <Wd>, <Wn>, <cond> 

is equivalent to 

CSINC <Wd>, <Wn>, <Wn>, invert(<cond>) 


and is the preferred disassembly when Rn == Rm. 


64-bit variant 

Applies when sf == 1. 

CINC <Xd>, <Xn>, <cond> 

is equivalent to 

CSINC <Xd>, <Xn>, <Xn>, invert(<cond>) 


and is the preferred disassembly when Rn == Rm. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<cond> Is one of the standard conditions, excluding AL and NV, encoded in the "cond" field with its least
significant bit inverted.


Operation 


The description of CSINC gives the operational pseudocode for this instruction. 
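
As an illustration of the alias relationship, the sketch below assembles CINC by emitting the CSINC encoding with Rm set equal to Rn and the least significant bit of the condition inverted. The function names and the `COND_CODES` table are hypothetical helpers, and the field layout is the CSINC layout (sf, op = 0, S = 0, 11010100, Rm, cond, 0, o2 = 1, Rn, Rd) as understood here, so treat it as an assumption to check against the CSINC description.

```python
# Standard condition mnemonics and their 4-bit encodings (AL and NV omitted,
# because CINC excludes them).
COND_CODES = {
    "EQ": 0b0000, "NE": 0b0001, "CS": 0b0010, "CC": 0b0011,
    "MI": 0b0100, "PL": 0b0101, "VS": 0b0110, "VC": 0b0111,
    "HI": 0b1000, "LS": 0b1001, "GE": 0b1010, "LT": 0b1011,
    "GT": 0b1100, "LE": 0b1101,
}


def encode_csinc(sf: int, rd: int, rn: int, rm: int, cond: int) -> int:
    """Pack the CSINC fields into a 32-bit instruction word."""
    return ((sf << 31) | (0b11010100 << 21) | (rm << 16) | (cond << 12)
            | (1 << 10) | (rn << 5) | rd)


def encode_cinc(sf: int, rd: int, rn: int, cond_name: str) -> int:
    """CINC Rd, Rn, cond is CSINC Rd, Rn, Rn, invert(cond)."""
    cond = COND_CODES[cond_name]   # raises KeyError for AL and NV
    return encode_csinc(sf, rd, rn, rn, cond ^ 1)


print(hex(encode_cinc(0, 0, 1, "EQ")))   # CINC W0, W1, EQ
```

Inverting only bit 0 works because each pair of adjacent condition encodings (EQ/NE, CS/CC, and so on) is a condition and its logical opposite.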
C6.2.37 CINV


Conditional Invert returns, in the destination register, the bitwise inversion of the value of the source register if the 
condition is TRUE, and otherwise returns the value of the source register. 


This instruction is an alias of the CSINV instruction. This means that: 


° The encodings in this description are named to match the encodings of CSINV. 
° The description of CSINV gives the operational pseudocode for this instruction. 
| 31 | 30 | 29 | 28-21    | 20-16   | 15-12  | 11 | 10 | 9-5     | 4-0 |
| sf | 1  | 0  | 11010100 | !=11111 | !=111x | 0  | 0  | !=11111 | Rd  |
|    | op |    |          | Rm      | cond   |    | o2 | Rn      |     |


32-bit variant 

Applies when sf == 0. 

CINV <Wd>, <Wn>, <cond> 

is equivalent to 

CSINV <Wd>, <Wn>, <Wn>, invert(<cond>) 


and is the preferred disassembly when Rn == Rm. 


64-bit variant 

Applies when sf == 1. 

CINV <Xd>, <Xn>, <cond> 

is equivalent to 

CSINV <Xd>, <Xn>, <Xn>, invert(<cond>) 


and is the preferred disassembly when Rn == Rm. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<cond> Is one of the standard conditions, excluding AL and NV, encoded in the "cond" field with its least
significant bit inverted.


Operation 


The description of CSINV gives the operational pseudocode for this instruction. 
C6.2.38 CLREX


Clear Exclusive clears the local monitor of the executing PE. 


| 31-12                | 11-8 | 7-5 | 4-0   |
| 11010101000000110011 | CRm  | 010 | 11111 |


System variant 


CLREX {#<imm>} 


Decode for this encoding 


// CRm field is ignored 


Assembler symbols 
<imm> Is an optional 4-bit unsigned immediate, in the range 0 to 15, defaulting to 15 and encoded in the
"CRm" field.


Operation 


ClearExclusiveLocal(ProcessorID()); 
C6.2.39 CLS


Count leading sign bits : Rd = CLS(Rn) 


| 31 | 30 | 29 | 28-21    | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| sf | 1  | 0  | 11010110 | 00000 | 00010 | 1  | Rn  | Rd  |
|    |    |    |          |       |       | op |     |     |


32-bit variant 
Applies when sf == 0. 


CLS <Wd>, <Wn> 


64-bit variant 
Applies when sf == 1. 


CLS <Xd>, <Xn> 


Decode for all variants of this encoding 
integer d = UInt(Rd); 


integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


integer result;
bits(datasize) operand1 = X[n];

result = CountLeadingSignBits(operand1);
X[d] = result<datasize-1:0>;
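
A small Python model may help pin down what is counted. It assumes, as is the usual reading of CountLeadingSignBits(), that the count covers the bits after the sign bit that are equal to it, so the sign bit itself is not included. The function name is a hypothetical helper.

```python
def count_leading_sign_bits(value: int, datasize: int = 64) -> int:
    """Count the bits below the top bit that match the top (sign) bit."""
    value &= (1 << datasize) - 1
    sign = value >> (datasize - 1)
    count = 0
    # Walk from the bit just below the sign bit down to bit 0.
    for bit in range(datasize - 2, -1, -1):
        if (value >> bit) & 1 != sign:
            break
        count += 1
    return count


print(count_leading_sign_bits(1, 32))
print(count_leading_sign_bits(0xFFFFFFFF, 32))
```

Under this reading the result is at most datasize - 1, reached when every bit of the operand equals the sign bit.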
C6.2.40 CLZ


Count leading zero bits : Rd = CLZ(Rn) 


| 31 | 30 | 29 | 28-21    | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| sf | 1  | 0  | 11010110 | 00000 | 00010 | 0  | Rn  | Rd  |
|    |    |    |          |       |       | op |     |     |


32-bit variant 
Applies when sf == 0. 


CLZ <Wd>, <Wn> 


64-bit variant 
Applies when sf == 1. 


CLZ <Xd>, <Xn> 


Decode for all variants of this encoding 
integer d = UInt(Rd); 


integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


integer result;
bits(datasize) operand1 = X[n];

result = CountLeadingZeroBits(operand1);
X[d] = result<datasize-1:0>;
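
The same operation can be sketched in Python, where `int.bit_length()` gives the position of the highest set bit. The function name is a hypothetical helper, not part of the architecture pseudocode.

```python
def count_leading_zero_bits(value: int, datasize: int = 64) -> int:
    """Number of zero bits above the highest set bit of a datasize-bit value."""
    value &= (1 << datasize) - 1
    # bit_length() is 0 for a zero operand, so the result is then datasize.
    return datasize - value.bit_length()


print(count_leading_zero_bits(1, 32))
print(count_leading_zero_bits(0, 64))
```

A zero operand yields datasize itself (32 or 64), which still fits in the destination register.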
C6.2.41 CMN (extended register)


Compare Negative (extended register) adds a register value and a sign or zero-extended register value, followed by 
an optional left shift amount. The argument that is extended from the <Rm> register can be a byte, halfword, word, 
or doubleword. It updates the condition flags based on the result, and discards the result. 


This instruction is an alias of the ADDS (extended register) instruction. This means that: 
° The encodings in this description are named to match the encodings of ADDS (extended register). 


° The description of ADDS (extended register) gives the operational pseudocode for this instruction. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-13  | 12-10 | 9-5 | 4-0   |
| sf | 0  | 1  | 01011 | 00    | 1  | Rm    | option | imm3  | Rn  | 11111 |
|    | op | S  |       |       |    |       |        |       |     | Rd    |


32-bit variant 

Applies when sf == 0.

CMN <Wn|WSP>, <Wm>{, <extend> {#<amount>}} 

is equivalent to 

ADDS WZR, <Wn|WSP>, <Wm>{, <extend> {#<amount>}} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1.

CMN <Xn|SP>, <R><m>{, <extend> {#<amount>}} 

is equivalent to 

ADDS XZR, <Xn|SP>, <R><m>{, <extend> {#<amount>}} 


and is always the preferred disassembly. 


Assembler symbols 


<Wn|WSP> Is the 32-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xn|SP> Is the 64-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn" 
field. 
<R> Is a width specifier, encoded in the "option" field. It can have the following values: 
W when option = 00x 
W when option = 010 
X when option = x11 
W when option = 10x 
W when option = 110 


<m> Is the number [0-30] of the second general-purpose source register or the name ZR (31), encoded in 
the "Rm" field. 
<extend> For the 32-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
LSL|UXTW when option = 010
UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111


If "Rn" is '11111' (WSP) and "option" is '010' then LSL is preferred, but may be omitted when 
"imm3" is '000'. In all other cases <extend> is required and must be UXTW when "option" is '010'. 


For the 64-bit variant: is the extension to be applied to the second source operand, encoded in the 
"option" field. It can have the following values: 


UXTB when option = 000
UXTH when option = 001
UXTW when option = 010
LSL|UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rn" is '11111' (SP) and "option" is '011' then LSL is preferred, but may be omitted when "imm3"
is '000'. In all other cases <extend> is required and must be UXTX when "option" is '011'.


<amount> Is the left shift amount to be applied after extension in the range 0 to 4, defaulting to 0, encoded in 
the "imm3" field. It must be absent when <extend> is absent, is required when <extend> is LSL, 
and is optional when <extend> is present but not LSL. 


Operation 


The description of ADDS (extended register) gives the operational pseudocode for this instruction. 
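
To make the <extend> and <amount> operands concrete, the following sketch computes the second operand as it would be fed to the addition: the low byte, halfword, word, or doubleword of the register is sign-extended or zero-extended and then shifted left by 0 to 4. The `EXTENDS` table and the `extend_reg` function are hypothetical names, and the behaviour is a paraphrase of the description above rather than the ADDS pseudocode itself.

```python
# Extension mnemonic -> (source width in bits, is_signed)
EXTENDS = {
    "UXTB": (8, False), "UXTH": (16, False), "UXTW": (32, False), "UXTX": (64, False),
    "SXTB": (8, True), "SXTH": (16, True), "SXTW": (32, True), "SXTX": (64, True),
}


def extend_reg(value: int, extend: str, amount: int, datasize: int) -> int:
    """Extend the low bits of value, shift left by amount, truncate to datasize."""
    if not 0 <= amount <= 4:
        raise ValueError("shift amount must be in the range 0 to 4")
    width, is_signed = EXTENDS[extend]
    width = min(width, datasize)
    field = value & ((1 << width) - 1)
    if is_signed and field >> (width - 1):
        field -= 1 << width            # reinterpret as a negative number
    return (field << amount) & ((1 << datasize) - 1)


print(hex(extend_reg(0xFF, "SXTB", 0, 64)))
print(hex(extend_reg(0xFF, "UXTB", 2, 32)))
```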
C6.2.42 CMN (immediate)


Compare Negative (immediate) adds a register value and an optionally-shifted immediate value. It updates the 
condition flags based on the result, and discards the result. 


This instruction is an alias of the ADDS (immediate) instruction. This means that: 
° The encodings in this description are named to match the encodings of ADDS (immediate). 


° The description of ADDS (immediate) gives the operational pseudocode for this instruction. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21-10 | 9-5 | 4-0   |
| sf | 0  | 1  | 10001 | shift | imm12 | Rn  | 11111 |
|    | op | S  |       |       |       |     | Rd    |


32-bit variant 

Applies when sf == 0. 

CMN <Wn|WSP>, #<imm>{, <shift>} 

is equivalent to 

ADDS WZR, <Wn|WSP>, #<imm> {, <shift>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

CMN <Xn|SP>, #<imm>{, <shift>} 

is equivalent to 

ADDS XZR, <Xn|SP>, #<imm> {, <shift>} 


and is always the preferred disassembly. 


Assembler symbols 


<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field. 
<imm> Is an unsigned immediate, in the range 0 to 4095, encoded in the "imm12" field. 

<shift> Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in the "shift"
field. It can have the following values:
LSL #0 when shift = 00 
LSL #12 when shift = 01 


The encoding shift = 1x is reserved. 


Operation 


The description of ADDS (immediate) gives the operational pseudocode for this instruction. 
C6.2.43 CMN (shifted register)


Compare Negative (shifted register) adds a register value and an optionally-shifted register value. It updates the 
condition flags based on the result, and discards the result. 


This instruction is an alias of the ADDS (shifted register) instruction. This means that: 
° The encodings in this description are named to match the encodings of ADDS (shifted register). 


° The description of ADDS (shifted register) gives the operational pseudocode for this instruction. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0   |
| sf | 0  | 1  | 01011 | shift | 0  | Rm    | imm6  | Rn  | 11111 |
|    | op | S  |       |       |    |       |       |     | Rd    |


32-bit variant 

Applies when sf == 0. 

CMN <Wn>, <Wm>{, <shift> #<amount>} 

is equivalent to 

ADDS WZR, <Wn>, <Wm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

CMN <Xn>, <Xm>{, <shift> #<amount>} 

is equivalent to 

ADDS XZR, <Xn>, <Xm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


Assembler symbols 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded 
in the "shift" field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 


The encoding shift = 11 is reserved.


<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 
Operation


The description of ADDS (shifted register) gives the operational pseudocode for this instruction. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C6-491 
ID092916 Non-Confidential


C6 A64 Base Instruction Descriptions 
C6.2 Alphabetical list of A64 base instructions 


C6.2.44 CMP (extended register) 


Compare (extended register) subtracts a sign or zero-extended register value, followed by an optional left shift 
amount, from a register value. The argument that is extended from the <Rm> register can be a byte, halfword, word, 
or doubleword. It updates the condition flags based on the result, and discards the result. 


This instruction is an alias of the SUBS (extended register) instruction. This means that: 
° The encodings in this description are named to match the encodings of SUBS (extended register).


° The description of SUBS (extended register) gives the operational pseudocode for this instruction. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-13  | 12-10 | 9-5 | 4-0   |
| sf | 1  | 1  | 01011 | 00    | 1  | Rm    | option | imm3  | Rn  | 11111 |
|    | op | S  |       |       |    |       |        |       |     | Rd    |


32-bit variant 

Applies when sf == 0.

CMP <Wn|WSP>, <Wm>{, <extend> {#<amount>}} 

is equivalent to 

SUBS WZR, <Wn|WSP>, <Wm>{, <extend> {#<amount>}} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1.

CMP <Xn|SP>, <R><m>{, <extend> {#<amount>}} 

is equivalent to 

SUBS XZR, <Xn|SP>, <R><m>{, <extend> {#<amount>}} 


and is always the preferred disassembly. 


Assembler symbols 


<Wn|WSP> Is the 32-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xn|SP> Is the 64-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn" 
field. 
<R> Is a width specifier, encoded in the "option" field. It can have the following values: 
W when option = 00x 
W when option = 010 
X when option = x11 
W when option = 10x 
W when option = 110 


<m> Is the number [0-30] of the second general-purpose source register or the name ZR (31), encoded in 
the "Rm" field. 
<extend> For the 32-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
LSL|UXTW when option = 010
UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rn" is '11111' (WSP) and "option" is '010' then LSL is preferred, but may be omitted when
"imm3" is '000'. In all other cases <extend> is required and must be UXTW when "option" is '010'.

For the 64-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
UXTW when option = 010
LSL|UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rn" is '11111' (SP) and "option" is '011' then LSL is preferred, but may be omitted when "imm3"
is '000'. In all other cases <extend> is required and must be UXTX when "option" is '011'.

<amount> Is the left shift amount to be applied after extension in the range 0 to 4, defaulting to 0, encoded in
the "imm3" field. It must be absent when <extend> is absent, is required when <extend> is LSL,
and is optional when <extend> is present but not LSL.

Operation

The description of SUBS (extended register) gives the operational pseudocode for this instruction.










C6.2.45 CMP (immediate) 
Compare (immediate) subtracts an optionally-shifted immediate value from a register value. It updates the condition 
flags based on the result, and discards the result. 
This instruction is an alias of the SUBS (immediate) instruction. This means that: 
° The encodings in this description are named to match the encodings of SUBS (immediate). 
° The description of SUBS (immediate) gives the operational pseudocode for this instruction. 
| 31 | 30 | 29 | 28-24 | 23-22 | 21-10 | 9-5 | 4-0   |
| sf | 1  | 1  | 10001 | shift | imm12 | Rn  | 11111 |
|    | op | S  |       |       |       |     | Rd    |
32-bit variant 
Applies when sf == 0. 
CMP <Wn|WSP>, #<imm>{, <shift>} 
is equivalent to 
SUBS WZR, <Wn|WSP>, #<imm> {, <shift>} 
and is always the preferred disassembly. 
64-bit variant 
Applies when sf == 1. 
CMP <Xn|SP>, #<imm>{, <shift>} 
is equivalent to 
SUBS XZR, <Xn|SP>, #<imm> {, <shift>} 
and is always the preferred disassembly. 
Assembler symbols 
<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field. 
<imm> Is an unsigned immediate, in the range 0 to 4095, encoded in the "imm12" field. 
<shift> Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in the "shift" 
field. It can have the following values: 
LSL #0 when shift = 00 
LSL #12 when shift = 01 
The encoding shift = 1x is reserved. 
Operation 
The description of SUBS (immediate) gives the operational pseudocode for this instruction. 
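
The assembler symbols above map directly onto instruction fields, which the following sketch packs into a 32-bit word. The function name is hypothetical, and the layout is the SUBS (immediate) layout with Rd fixed to 11111 as understood here (sf, op = 1, S = 1, 10001, shift, imm12, Rn, Rd); treat the field positions as an assumption to verify against the encoding diagram.

```python
def encode_cmp_imm(sf: int, rn: int, imm: int, lsl12: bool = False) -> int:
    """Encode CMP <Xn|SP>, #imm{, LSL #12} (or the W form when sf == 0)."""
    if not 0 <= imm <= 4095:
        raise ValueError("immediate must be in the range 0 to 4095")
    if not 0 <= rn <= 31:
        raise ValueError("register number must be in the range 0 to 31")
    shift = 0b01 if lsl12 else 0b00      # shift = 1x is reserved
    return ((sf << 31) | (1 << 30) | (1 << 29) | (0b10001 << 24)
            | (shift << 22) | (imm << 10) | (rn << 5) | 0b11111)


print(hex(encode_cmp_imm(1, 0, 1)))            # CMP X0, #1
print(hex(encode_cmp_imm(0, 1, 4095, True)))   # CMP W1, #4095, LSL #12
```

An immediate larger than 4095 is only encodable when it is a multiple of 4096 that fits after the LSL #12 shift; a real assembler would pick the shift automatically, which this sketch leaves to the caller.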
C6.2.46 CMP (shifted register)


Compare (shifted register) subtracts an optionally-shifted register value from a register value. It updates the 
condition flags based on the result, and discards the result. 


This instruction is an alias of the SUBS (shifted register) instruction. This means that: 
° The encodings in this description are named to match the encodings of SUBS (shifted register). 


° The description of SUBS (shifted register) gives the operational pseudocode for this instruction. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5 | 4-0   |
| sf | 1  | 1  | 01011 | shift | 0  | Rm    | imm6  | Rn  | 11111 |
|    | op | S  |       |       |    |       |       |     | Rd    |


32-bit variant 

Applies when sf == 0. 

CMP <Wn>, <Wm>{, <shift> #<amount>} 

is equivalent to 

SUBS WZR, <Wn>, <Wm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

CMP <Xn>, <Xm>{, <shift> #<amount>} 

is equivalent to 

SUBS XZR, <Xn>, <Xm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


Assembler symbols 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded 
in the "shift" field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 


The encoding shift = 11 is reserved.


<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 







Operation 


The description of SUBS (shifted register) gives the operational pseudocode for this instruction. 
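
For orientation, here is a hedged Python model of what the comparison does with a shifted second operand: apply LSL, LSR, or ASR, subtract, and report the N, Z, C, V flags. The helper names are hypothetical and the flag rules follow the usual two's complement subtraction conventions (C set means no borrow), not a transcription of the SUBS pseudocode.

```python
def shift_reg(value: int, shift: str, amount: int, datasize: int) -> int:
    """Apply LSL, LSR or ASR to a datasize-bit value."""
    mask = (1 << datasize) - 1
    value &= mask
    if not 0 <= amount < datasize:
        raise ValueError("shift amount out of range for this register width")
    if shift == "LSL":
        return (value << amount) & mask
    if shift == "LSR":
        return value >> amount
    if shift == "ASR":
        if value >> (datasize - 1):
            value -= 1 << datasize       # make the sign explicit
        return (value >> amount) & mask
    raise ValueError("shift must be LSL, LSR or ASR")


def cmp_shifted(xn: int, xm: int, shift: str = "LSL", amount: int = 0,
                datasize: int = 64):
    """Return (N, Z, C, V) for CMP Xn, Xm, <shift> #<amount>."""
    mask = (1 << datasize) - 1
    xn &= mask
    operand2 = shift_reg(xm, shift, amount, datasize)
    total = xn + (~operand2 & mask) + 1          # xn - operand2
    result = total & mask
    n = result >> (datasize - 1)
    z = int(result == 0)
    c = int(total > mask)                        # 1 when no borrow occurred
    sign_n = xn >> (datasize - 1)
    sign_m = operand2 >> (datasize - 1)
    v = int(sign_n != sign_m and n != sign_n)    # signed overflow
    return n, z, c, v


print(cmp_shifted(8, 1, "LSL", 3))
```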







C6.2.47 CNEG 


Conditional Negate returns, in the destination register, the negated value of the source register if the condition is 
TRUE, and otherwise returns the value of the source register. 


This instruction is an alias of the CSNEG instruction. This means that: 


° The encodings in this description are named to match the encodings of CSNEG. 
° The description of CSNEG gives the operational pseudocode for this instruction. 
| 31 | 30 | 29 | 28-21    | 20-16 | 15-12  | 11 | 10 | 9-5 | 4-0 |
| sf | 1  | 0  | 11010100 | Rm    | !=111x | 0  | 1  | Rn  | Rd  |
|    | op |    |          |       | cond   |    | o2 |     |     |


32-bit variant 

Applies when sf == 0. 

CNEG <Wd>, <Wn>, <cond> 

is equivalent to 

CSNEG <Wd>, <Wn>, <Wn>, invert(<cond>) 


and is the preferred disassembly when Rn == Rm. 


64-bit variant 

Applies when sf == 1. 

CNEG <Xd>, <Xn>, <cond> 

is equivalent to 

CSNEG <Xd>, <Xn>, <Xn>, invert(<cond>) 


and is the preferred disassembly when Rn == Rm. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<cond> Is one of the standard conditions, excluding AL and NV, encoded in the "cond" field with its least
significant bit inverted.


Operation 


The description of CSNEG gives the operational pseudocode for this instruction. 
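
A short Python model shows how the alias falls out of the underlying instruction. It assumes CSNEG returns the first source when its condition holds and the two's complement negation of the second source otherwise; that behaviour is taken from the CSNEG description, not from this section. The function names are hypothetical, and the absolute-value helper is only an illustrative use.

```python
def csneg(xn: int, xm: int, cond_holds: bool, datasize: int = 64) -> int:
    """Assumed CSNEG behaviour: Xn if the condition holds, else -Xm."""
    mask = (1 << datasize) - 1
    return (xn if cond_holds else -xm) & mask


def cneg(xn: int, cond_holds: bool, datasize: int = 64) -> int:
    """CNEG Xd, Xn, cond is CSNEG Xd, Xn, Xn, invert(cond)."""
    return csneg(xn, xn, not cond_holds, datasize)


def abs_value(x: int, datasize: int = 64) -> int:
    """Illustrative use: negate only when the value is negative (LT after CMP #0)."""
    is_negative = bool((x >> (datasize - 1)) & 1)
    return cneg(x, is_negative, datasize)


print(hex(cneg(5, True)))
print(abs_value((-7) & 0xFFFFFFFF, 32))
```

Note that the most negative value maps to itself, as with any two's complement negation.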
C6.2.48 CRC32B, CRC32H, CRC32W, CRC32X
CRC32 checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose 
register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second 
source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align 
with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x04C11DB7 
is used for the CRC calculation. 
In ARMv8-A, this is an OPTIONAL instruction. 
Note 
ID_AA64ISAR0_EL1.CRC32 indicates whether this instruction is supported.
| 31 | 30 | 29 | 28-21    | 20-16 | 15-13 | 12 | 11-10 | 9-5 | 4-0 |
| sf | 0  | 0  | 11010110 | Rm    | 010   | 0  | sz    | Rn  | Rd  |
|    |    |    |          |       |       | C  |       |     |     |
CRC32B variant
Applies when sf == 0 && sz == 00.
CRC32B <Wd>, <Wn>, <Wm>
CRC32H variant
Applies when sf == 0 && sz == 01.
CRC32H <Wd>, <Wn>, <Wm>
CRC32W variant
Applies when sf == 0 && sz == 10.
CRC32W <Wd>, <Wn>, <Wm>
CRC32X variant
Applies when sf == 1 && sz == 11.
CRC32X <Wd>, <Wn>, <Xm>
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sf == '1' && sz != '11' then UnallocatedEncoding(); 
if sf == '0' && sz == '11' then UnallocatedEncoding();
integer size = 8 << UInt(sz); 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose accumulator output register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose accumulator input register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose data source register, encoded in the "Rm" field. 
<Wm> Is the 32-bit name of the general-purpose data source register, encoded in the "Rm" field. 
Operation

if !HaveCRCExt() then
    UnallocatedEncoding();

bits(32) acc = X[n]; // accumulator
bits(size) val = X[m]; // input value
bits(32) poly = 0x04C11DB7<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));
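
The pseudocode can be transcribed almost line for line into Python. The sketch below is an illustration under assumptions: `bit_reverse` and `poly32_mod2` are hypothetical re-implementations of BitReverse() and Poly32Mod2(), and the final check relies on the common software convention of starting from 0xFFFFFFFF and inverting the result, which is what `zlib.crc32` computes and is not something the instruction does by itself.

```python
import zlib

POLY_CRC32 = 0x04C11DB7


def bit_reverse(value: int, width: int) -> int:
    """Reverse the order of the low width bits of value."""
    result = 0
    for i in range(width):
        if (value >> i) & 1:
            result |= 1 << (width - 1 - i)
    return result


def poly32_mod2(data: int, width: int, poly: int) -> int:
    """Remainder of a width-bit polynomial over {0,1} modulo x^32 + poly."""
    for i in range(width - 1, 31, -1):
        if (data >> i) & 1:
            data ^= poly << (i - 32)   # cancel the term at bit i
    return data & 0xFFFFFFFF


def crc32_insn(acc: int, val: int, size: int, poly: int = POLY_CRC32) -> int:
    """Model of CRC32B/H/W/X for size in {8, 16, 32, 64}."""
    temp_acc = bit_reverse(acc & 0xFFFFFFFF, 32) << size
    temp_val = bit_reverse(val & ((1 << size) - 1), size) << 32
    return bit_reverse(poly32_mod2(temp_acc ^ temp_val, 32 + size, poly), 32)


def crc32_bytes(data: bytes) -> int:
    """Conventional CRC-32 of a byte string, one CRC32B step per byte."""
    acc = 0xFFFFFFFF
    for byte in data:
        acc = crc32_insn(acc, byte, 8)
    return acc ^ 0xFFFFFFFF


print(hex(crc32_bytes(b"123456789")), hex(zlib.crc32(b"123456789")))
```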
C6.2.49 CRC32CB, CRC32CH, CRC32CW, CRC32CX
CRC32 checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose 
register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second 
source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align 
with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x1EDC6F41 
is used for the CRC calculation. 
In ARMv8-A, this is an OPTIONAL instruction. 
Note 
ID_AA64ISAR0_EL1.CRC32 indicates whether this instruction is supported.
| 31 | 30 | 29 | 28-21    | 20-16 | 15-13 | 12 | 11-10 | 9-5 | 4-0 |
| sf | 0  | 0  | 11010110 | Rm    | 010   | 1  | sz    | Rn  | Rd  |
|    |    |    |          |       |       | C  |       |     |     |
CRC32CB variant
Applies when sf == 0 && sz == 00.
CRC32CB <Wd>, <Wn>, <Wm>
CRC32CH variant
Applies when sf == 0 && sz == 01.
CRC32CH <Wd>, <Wn>, <Wm>
CRC32CW variant
Applies when sf == 0 && sz == 10.
CRC32CW <Wd>, <Wn>, <Wm>
CRC32CX variant
Applies when sf == 1 && sz == 11.
CRC32CX <Wd>, <Wn>, <Xm>
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sf == '1' && sz != '11' then UnallocatedEncoding(); 
if sf == '0' && sz == '11' then UnallocatedEncoding();
integer size = 8 << UInt(sz); 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose accumulator output register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose accumulator input register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose data source register, encoded in the "Rm" field. 
<Wm> Is the 32-bit name of the general-purpose data source register, encoded in the "Rm" field. 
Operation

if !HaveCRCExt() then
    UnallocatedEncoding();

bits(32) acc = X[n]; // accumulator
bits(size) val = X[m]; // input value
bits(32) poly = 0x1EDC6F41<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));
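
Because both inputs and the output are bit-reversed, the operation can also be written in the reflected, shift-right form that most software CRC routines use. The sketch below takes that route instead of transcribing the pseudocode: it is an equivalent reformulation as derived here, using 0x82F63B78 (the bit-reversed form of 0x1EDC6F41), and all function names are hypothetical.

```python
REFLECTED_POLY_CRC32C = 0x82F63B78   # bit reversal of 0x1EDC6F41


def crc32c_insn(acc: int, val: int, size: int) -> int:
    """Reflected model of CRC32CB/CH/CW/CX for size in {8, 16, 32, 64}."""
    state = (acc & 0xFFFFFFFF) ^ (val & ((1 << size) - 1))
    for _ in range(size):
        # Consume one input bit per step, least significant bit first.
        low_bit = state & 1
        state >>= 1
        if low_bit:
            state ^= REFLECTED_POLY_CRC32C
    return state


def crc32c_bytes(data: bytes) -> int:
    """Conventional CRC-32C: initial value and final XOR are both all ones."""
    acc = 0xFFFFFFFF
    for byte in data:
        acc = crc32c_insn(acc, byte, 8)
    return acc ^ 0xFFFFFFFF


word = int.from_bytes(b"1234", "little")
print(hex(crc32c_bytes(b"123456789")))
print(hex(crc32c_insn(0xFFFFFFFF, word, 32)))
```

One wider step gives the same accumulator as the equivalent run of byte steps when the data is loaded in little-endian order, which is why software typically uses the 64-bit form for the bulk of a buffer and the narrower forms for the tail.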
C6.2.50 CSEL


Conditional Select returns, in the destination register, the value of the first source register if the condition is TRUE, 
and otherwise returns the value of the second source register. 


| 31 | 30 | 29 | 28-21    | 20-16 | 15-12 | 11 | 10 | 9-5 | 4-0 |
| sf | 0  | 0  | 11010100 | Rm    | cond  | 0  | 0  | Rn  | Rd  |
|    | op |    |          |       |       |    |    |     |     |


32-bit variant 
Applies when sf == 0.


CSEL <Wd>, <Wn>, <Wm>, <cond> 


64-bit variant 
Applies when sf == 1.


CSEL <Xd>, <Xn>, <Xm>, <cond> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation 


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];

if ConditionHolds(cond) then
    result = operand1;
else
    result = operand2;

X[d] = result;
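
The selection hinges on ConditionHolds(), which is defined elsewhere in the manual. The Python sketch below reconstructs it from the standard A64 condition code table as understood here, so the table itself is an assumption rather than a quotation; flags are packed as a 4-bit integer in N, Z, C, V order and the function names are hypothetical.

```python
def condition_holds(cond: int, nzcv: int) -> bool:
    """Evaluate a 4-bit condition code against packed NZCV flags."""
    n, z, c, v = (nzcv >> 3) & 1, (nzcv >> 2) & 1, (nzcv >> 1) & 1, nzcv & 1
    base = [
        z == 1,                  # 000x: EQ / NE
        c == 1,                  # 001x: CS / CC
        n == 1,                  # 010x: MI / PL
        v == 1,                  # 011x: VS / VC
        c == 1 and z == 0,       # 100x: HI / LS
        n == v,                  # 101x: GE / LT
        n == v and z == 0,       # 110x: GT / LE
        True,                    # 111x: AL / NV, both always true
    ][cond >> 1]
    # Bit 0 inverts the test, except for the always-true encodings.
    if cond & 1 and cond != 0b1111:
        return not base
    return base


def csel(xn: int, xm: int, cond: int, nzcv: int) -> int:
    """Return Xn when the condition holds, otherwise Xm."""
    return xn if condition_holds(cond, nzcv) else xm


print(csel(10, 20, 0b0000, 0b0100))   # EQ with Z set
print(csel(10, 20, 0b0001, 0b0100))   # NE with Z set
```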







C6.2.51 CSET 


Conditional Set sets the destination register to 1 if the condition is TRUE, and otherwise sets it to 0. 


This instruction is an alias of the CSINC instruction. This means that: 


° The encodings in this description are named to match the encodings of CSINC. 
° The description of CSINC gives the operational pseudocode for this instruction. 
| 31 | 30 | 29 | 28-21    | 20-16 | 15-12  | 11 | 10 | 9-5   | 4-0 |
| sf | 0  | 0  | 11010100 | 11111 | !=111x | 0  | 1  | 11111 | Rd  |
|    | op |    |          | Rm    | cond   |    | o2 | Rn    |     |


32-bit variant 

Applies when sf == 0. 

CSET <Wd>, <cond> 

is equivalent to 

CSINC <Wd>, WZR, WZR, invert(<cond>) 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

CSET <Xd>, <cond> 

is equivalent to 

CSINC <Xd>, XZR, XZR, invert(<cond>) 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<cond> Is one of the standard conditions, excluding AL and NV, encoded in the "cond" field with its least
significant bit inverted.


Operation 


The description of CSINC gives the operational pseudocode for this instruction. 
C6.2.52 CSETM


Conditional Set Mask sets all bits of the destination register to 1 if the condition is TRUE, and otherwise sets all bits 
to 0. 


This instruction is an alias of the CSINV instruction. This means that: 


° The encodings in this description are named to match the encodings of CSINV. 
° The description of CSINV gives the operational pseudocode for this instruction. 
| 31 | 30 | 29 | 28-21    | 20-16 | 15-12  | 11 | 10 | 9-5   | 4-0 |
| sf | 1  | 0  | 11010100 | 11111 | !=111x | 0  | 0  | 11111 | Rd  |
|    | op |    |          | Rm    | cond   |    | o2 | Rn    |     |


32-bit variant 

Applies when sf == 0. 

CSETM <Wd>, <cond> 

is equivalent to 

CSINV <Wd>, WZR, WZR, invert(<cond>) 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

CSETM <Xd>, <cond> 

is equivalent to 

CSINV <Xd>, XZR, XZR, invert(<cond>) 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<cond> Is one of the standard conditions, excluding AL and NV, encoded in the "cond" field with its least
significant bit inverted.


Operation 


The description of CSINV gives the operational pseudocode for this instruction. 
C6.2.53 CSINC


Conditional Select Increment returns, in the destination register, the value of the first source register if the condition 
is TRUE, and otherwise returns the value of the second source register incremented by 1. 


This instruction is used by the aliases CINC and CSET. See Alias conditions for details of when each alias is 
preferred. 


| 31 | 30 | 29 | 28-21    | 20-16 | 15-12 | 11 | 10 | 9-5 | 4-0 |
| sf | 0  | 0  | 11010100 | Rm    | cond  | 0  | 1  | Rn  | Rd  |
|    | op |    |          |       |       |    |    |     |     |


32-bit variant 
Applies when sf == 0.


CSINC <Wd>, <Wn>, <Wm>, <cond> 


64-bit variant 
Applies when sf == 1.


CSINC <Xd>, <Xn>, <Xm>, <cond> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Alias conditions 





Alias    Is preferred when
CINC     Rm != '11111' && cond != '111x' && Rn != '11111' && Rn == Rm
CSET     Rm == '11111' && cond != '111x' && Rn == '11111'





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];


if ConditionHolds(cond) then
    result = operand1;
else
    result = operand2 + 1;


X[d] = result; 
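
The operation above can be modelled in a few lines of Python. This is an illustrative sketch, not the architectural pseudocode: the function names are hypothetical, the condition is passed in as an already-evaluated boolean, and register values are plain integers truncated to datasize bits. The `cset` helper assumes the usual alias form, in which both sources are the zero register and the condition is inverted.

```python
def csinc(cond_holds: bool, xn: int, xm: int, datasize: int = 64) -> int:
    """Model of CSINC: select xn, or xm + 1 wrapped to datasize bits."""
    mask = (1 << datasize) - 1
    if cond_holds:
        return xn & mask
    return (xm + 1) & mask


def cset(cond_holds: bool, datasize: int = 64) -> int:
    """Model of the CSET alias: CSINC Rd, ZR, ZR, invert(cond)."""
    return csinc(not cond_holds, 0, 0, datasize)
```

Note that the increment wraps: with a 32-bit datasize, an operand of 0xFFFFFFFF increments to 0.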
C6.2.54 CSINV


Conditional Select Invert returns, in the destination register, the value of the first source register if the condition is 
TRUE, and otherwise returns the bitwise inversion value of the second source register. 


This instruction is used by the aliases CINV and CSETM. See Alias conditions for details of when each alias is 
preferred. 


|31|30|29|28 27 26 25 24 23 22 21|20     16|15    12|11|10|9       5|4       0|
|sf| 1| 0| 1  1  0  1  0  1  0  0|   Rm    |  cond  | 0| 0|   Rn    |   Rd    |
    op                                                  o2


32-bit variant
Applies when sf == 0.


CSINV <Wd>, <Wn>, <Wm>, <cond>


64-bit variant
Applies when sf == 1.


CSINV <Xd>, <Xn>, <Xm>, <cond>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Alias conditions 





Alias is preferred when 





CINV Rm != '11111' && cond != '111x' && Rn != '11111' && Rn == Rm 





CSETM Rm == '11111' && cond != '111x' && Rn == '11111'





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];
if ConditionHolds(cond) then
    result = operand1;
else
    result = NOT(operand2);


X[d] = result; 
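
The Alias conditions for CSINV can be read as a small decision procedure for a disassembler. The Python sketch below is one possible rendering of that table; the function name is hypothetical, and the inputs are the raw Rm, cond, and Rn field values from the encoding.

```python
def preferred_csinv_disassembly(rm: int, cond: int, rn: int) -> str:
    """Return the mnemonic preferred for a CSINV encoding (illustrative only)."""
    cond_ok = (cond >> 1) != 0b111               # cond != '111x'
    if cond_ok and rm == 0b11111 and rn == 0b11111:
        return "CSETM"
    if cond_ok and rm != 0b11111 and rn != 0b11111 and rn == rm:
        return "CINV"
    return "CSINV"                               # no alias condition holds
```

An encoding whose condition field is '111x' is always shown as CSINV, even when both source fields name the zero register.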
C6.2.55 CSNEG


Conditional Select Negation returns, in the destination register, the value of the first source register if the condition 
is TRUE, and otherwise returns the negated value of the second source register. 


This instruction is used by the alias CNEG. See Alias conditions for details of when each alias is preferred. 


|31|30|29|28 27 26 25 24 23 22 21|20     16|15    12|11|10|9       5|4       0|
|sf| 1| 0| 1  1  0  1  0  1  0  0|   Rm    |  cond  | 0| 1|   Rn    |   Rd    |
    op                                                  o2


32-bit variant
Applies when sf == 0.


CSNEG <Wd>, <Wn>, <Wm>, <cond>


64-bit variant
Applies when sf == 1.


CSNEG <Xd>, <Xn>, <Xm>, <cond>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Alias conditions 





Alias is preferred when 





CNEG cond != '111x' && Rn == Rm 





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];


if ConditionHolds(cond) then
    result = operand1;
else
    result = NOT(operand2);
    result = result + 1;


X[d] = result; 
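
The two-step negation in the pseudocode (invert, then add 1) is ordinary two's-complement negation. The following Python sketch models it under the assumption that register values are unsigned integers truncated to datasize bits; the function name is hypothetical.

```python
def csneg(cond_holds: bool, xn: int, xm: int, datasize: int = 64) -> int:
    """Model of CSNEG: select xn, or the two's-complement negation of xm."""
    mask = (1 << datasize) - 1
    if cond_holds:
        return xn & mask
    result = ~xm & mask            # NOT(operand2)
    return (result + 1) & mask     # result + 1, wrapped to datasize bits
```

Two edge cases follow from the wrap-around: negating 0 gives 0, and negating the most negative value (0x80000000 for a 32-bit datasize) gives the same value back.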
C6.2.56 DC


C6 A64 Base Instruction Descriptions 
C6.2 Alphabetical list of A64 base instructions 


Data Cache operation. For more information, see A64 system instructions for cache maintenance. 
This instruction is an alias of the SYS instruction. This means that: 
• The encodings in this description are named to match the encodings of SYS.
• The description of SYS gives the operational pseudocode for this instruction.


|31 30 29 28 27 26 25 24 23 22|21|20 19|18  16|15     12|11     8|7    5|4      0|
| 1  1  0  1  0  1  0  1  0  0| 0| 0  1| op1  | 0 1 1 1 |  CRm   | op2  |   Rt   |
                               L                  CRn


System variant

DC <dc_op>, <Xt>

is equivalent to

SYS #<op1>, C7, <Cm>, #<op2>, <Xt>


and is the preferred disassembly when SysOp(op1, '0111',CRm,op2) == Sys_DC. 


Assembler symbols 

<dc_op> Is a DC instruction name, as listed for the DC system instruction group, encoded in the 
"op1:CRm:op2" field. It can have the following values: 
IVAC    when op1 = 000, CRm = 0110, op2 = 001
ISW     when op1 = 000, CRm = 0110, op2 = 010
CSW     when op1 = 000, CRm = 1010, op2 = 010
CISW    when op1 = 000, CRm = 1110, op2 = 010
ZVA     when op1 = 011, CRm = 0100, op2 = 001
CVAC    when op1 = 011, CRm = 1010, op2 = 001
CVAU    when op1 = 011, CRm = 1011, op2 = 001
CIVAC   when op1 = 011, CRm = 1110, op2 = 001


<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.
<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.
<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field.


<Xt> Is the 64-bit name of the general-purpose source register, encoded in the "Rt" field. 


Operation 


The description of SYS gives the operational pseudocode for this instruction. 
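
To illustrate how a DC operation name maps onto the SYS encoding, the sketch below assembles the instruction word from the op1, CRm, and op2 values listed for <dc_op>, with CRn fixed at 0b0111 (C7). The table and function names are hypothetical; the field positions are taken from the encoding diagram for this instruction.

```python
# <dc_op> name -> (op1, CRm, op2), as listed under Assembler symbols
DC_OPS = {
    "IVAC":  (0b000, 0b0110, 0b001),
    "ISW":   (0b000, 0b0110, 0b010),
    "CSW":   (0b000, 0b1010, 0b010),
    "CISW":  (0b000, 0b1110, 0b010),
    "ZVA":   (0b011, 0b0100, 0b001),
    "CVAC":  (0b011, 0b1010, 0b001),
    "CVAU":  (0b011, 0b1011, 0b001),
    "CIVAC": (0b011, 0b1110, 0b001),
}


def encode_dc(dc_op: str, rt: int) -> int:
    """Return the 32-bit word for DC <dc_op>, X<rt> (hypothetical helper)."""
    if not 0 <= rt <= 31:
        raise ValueError("rt must be in the range 0 to 31")
    op1, crm, op2 = DC_OPS[dc_op]
    crn = 0b0111                     # C7
    base = 0xD5080000                # fixed bits 31..19 of the SYS encoding, L = 0
    return base | (op1 << 16) | (crn << 12) | (crm << 8) | (op2 << 5) | rt
```

For example, `encode_dc("ZVA", 0)` yields the word for DC ZVA, X0, which is the same word as SYS #3, C7, C4, #1, X0.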
C6.2.57 DCPS1


Debug Change PE State to EL1, when executed in Debug state: 

• If executed at EL0 changes the current Exception level and SP to EL1 using SP_EL1.
• Otherwise, if executed at ELx, selects SP_ELx.

The target exception level of a DCPS1 instruction is:

• EL1 if the instruction is executed at EL0.
• Otherwise, the Exception level at which the instruction is executed.

When the target Exception level of a DCPS1 instruction is ELx, on executing this instruction:

• ELR_ELx becomes UNKNOWN.
• SPSR_ELx becomes UNKNOWN.
• ESR_ELx becomes UNKNOWN.
• DLR_EL0 and DSPSR_EL0 become UNKNOWN.
• The endianness is set according to SCTLR_ELx.EE.

This instruction is UNDEFINED at EL0 in Non-secure state if EL2 is implemented and HCR_EL2.TGE == 1.
This instruction is always UNDEFINED in Non-debug state. 


For more information on the operation of the DCPSn instructions, see DCPS. 


|31 30 29 28 27 26 25 24|23 22 21|20                    5|4 3 2|1 0|
| 1  1  0  1  0  1  0  0| 1  0  1|         imm16         |0 0 0|0 1|
                                                                LL


System variant 
DCPS1 {#<imm>} 


Decode for this encoding 


if !Halted() then AArch64.UndefinedFault(); 


Assembler symbols 

<imm> Is an optional 16-bit unsigned immediate, in the range 0 to 65535, defaulting to 0 and encoded in 
the "imm16" field. 

Operation 


DCPSInstruction(LL); 
C6.2.58 DCPS2


Debug Change PE State to EL2, when executed in Debug state:

• If executed at EL0 or EL1 changes the current Exception level and SP to EL2 using SP_EL2.
• Otherwise, if executed at ELx, selects SP_ELx.

The target exception level of a DCPS2 instruction is:

• EL2 if the instruction is executed at an exception level that is not EL3.
• EL3 if the instruction is executed at EL3.

When the target Exception level of a DCPS2 instruction is ELx, on executing this instruction:

• ELR_ELx becomes UNKNOWN.
• SPSR_ELx becomes UNKNOWN.
• ESR_ELx becomes UNKNOWN.
• DLR_EL0 and DSPSR_EL0 become UNKNOWN.
• The endianness is set according to SCTLR_ELx.EE.

This instruction is UNDEFINED at the following exception levels:

• All exception levels if EL2 is not implemented.
• At EL0 and EL1 in Secure state if EL2 is implemented.
This instruction is always UNDEFINED in Non-debug state. 


For more information on the operation of the DCPSn instructions, see DCPS. 


|31 30 29 28 27 26 25 24|23 22 21|20                    5|4 3 2|1 0|
| 1  1  0  1  0  1  0  0| 1  0  1|         imm16         |0 0 0|1 0|
                                                                LL


System variant 


DCPS2 {#<imm>} 


Decode for this encoding 


if !Halted() then AArch64.UndefinedFault(); 


Assembler symbols 
<imm> Is an optional 16-bit unsigned immediate, in the range 0 to 65535, defaulting to 0 and encoded in
the "imm16" field.


Operation 


DCPSInstruction(LL); 
C6.2.59 DCPS3


Debug Change PE State to EL3, when executed in Debug state:

• If executed at EL3 selects SP_EL3.
• Otherwise, changes the current Exception level and SP to EL3 using SP_EL3.

The target exception level of a DCPS3 instruction is EL3.

On executing a DCPS3 instruction:

• ELR_EL3 becomes UNKNOWN.
• SPSR_EL3 becomes UNKNOWN.
• ESR_EL3 becomes UNKNOWN.
• DLR_EL0 and DSPSR_EL0 become UNKNOWN.
• The endianness is set according to SCTLR_EL3.EE.

This instruction is UNDEFINED at all exception levels if either:

• EDSCR.SDD == 1.
• EL3 is not implemented.

This instruction is always UNDEFINED in Non-debug state.

For more information on the operation of the DCPSn instructions, see DCPS.


|31 30 29 28 27 26 25 24|23 22 21|20                    5|4 3 2|1 0|
| 1  1  0  1  0  1  0  0| 1  0  1|         imm16         |0 0 0|1 1|
                                                                LL

System variant 
DCPS3 {#<imm>} 
Decode for this encoding 
if !Halted() then AArch64.UndefinedFault(); 
Assembler symbols 
<imm> Is an optional 16-bit unsigned immediate, in the range 0 to 65535, defaulting to 0 and encoded in 
the "imm16" field. 
Operation 
DCPSInstruction(LL); 
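
The target Exception level rules stated for DCPS1, DCPS2, and DCPS3 can be summarized in one small function. The Python sketch below is illustrative only: the function name is hypothetical, Exception levels are represented as the integers 0 to 3, and the cases in which the instruction is UNDEFINED (for example, Non-debug state or an unimplemented Exception level) are deliberately not modelled.

```python
def dcps_target_el(n: int, current_el: int) -> int:
    """Target Exception level of DCPS<n> executed at current_el (sketch)."""
    if current_el not in (0, 1, 2, 3):
        raise ValueError("current_el must be in the range 0 to 3")
    if n == 1:
        # EL1 from EL0, otherwise stay at the current Exception level
        return 1 if current_el == 0 else current_el
    if n == 2:
        # EL2 unless already at EL3
        return 3 if current_el == 3 else 2
    if n == 3:
        return 3
    raise ValueError("n must be 1, 2, or 3")
```

For example, DCPS1 executed at EL2 keeps the PE at EL2 and selects SP_EL2, while DCPS2 executed at EL0 moves to EL2.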
C6.2.60 DMB


Data Memory Barrier is a memory barrier that ensures the ordering of observations of memory accesses, see Data 
Memory Barrier (DMB) on page B2-88. 


|31 30 29 28 27 26 25 24 23 22|21|20 19|18 17 16|15 14 13 12|11     8|7|6 5|4       0|
| 1  1  0  1  0  1  0  1  0  0| 0| 0  0| 0  1  1| 0  0  1  1|  CRm   |1|0 1|1 1 1 1 1|
                                                                        opc


System variant 


DMB <option>|#<imm> 


Decode for this encoding 


MBReqDomain domain;
MBReqTypes types;


case CRm<3:2> of
    when '00' domain = MBReqDomain_OuterShareable;
    when '01' domain = MBReqDomain_Nonshareable;
    when '10' domain = MBReqDomain_InnerShareable;
    when '11' domain = MBReqDomain_FullSystem;


case CRm<1:0> of
    when '01' types = MBReqTypes_Reads;
    when '10' types = MBReqTypes_Writes;
    when '11' types = MBReqTypes_All;
    otherwise
        types = MBReqTypes_All;
        domain = MBReqDomain_FullSystem;


Assembler symbols 


<option> Specifies the limitation on the barrier operation. Values are: 


SY Full system is the required shareability domain, reads and writes are the required access 
types in both Group A and Group B. This option is referred to as the full system DMB. 
Encoded as CRm = 0b1111. 


ST Full system is the required shareability domain, writes are the required access type in 
both Group A and Group B. Encoded as CRm = 0b1110. 


LD Full system is the required shareability domain, reads are the required access type in 
Group A, and reads and writes are the required access types in Group B. Encoded as CRm 
= 0b1101. 

ISH Inner Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as CRm = 0b1011.


ISHST Inner Shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as CRm = 0b1010.

ISHLD Inner Shareable is the required shareability domain, reads are the required access type
in Group A, and reads and writes are the required access types in Group B. Encoded as
CRm = 0b1001.

NSH Non-shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as CRm = 0b0111.


NSHST Non-shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as CRm = 0b0110.

NSHLD Non-shareable is the required shareability domain, reads are the required access type in
Group A, and reads and writes are the required access types in Group B. Encoded as CRm
= 0b0101.

OSH Outer Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as CRm = 0b0011.


OSHST Outer Shareable is the required shareability domain, writes are the required access type 
in both Group A and Group B. Encoded as CRm = 0b0010. 


OSHLD Outer Shareable is the required shareability domain, reads are the required access type 
in Group A, and reads and writes are the required access types in Group B. Encoded as 
CRm = 0b0001. 


All other encodings of CRm that are not listed above are reserved, and can be encoded using the 
#<imm> syntax. It is IMPLEMENTATION DEFINED whether options other than SY are implemented. All 
unsupported and reserved options must execute as a full system barrier operation, but software must 
not rely on this behavior. 


<imm> Is a 4-bit unsigned immediate, in the range 0 to 15, encoded in the "CRm" field. 


Operation 


DataMemoryBarrier(domain, types); 
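
The decode for this encoding splits CRm into a shareability domain (bits 3:2) and an access type (bits 1:0), with any encoding whose low two bits are '00' falling back to a full system barrier for all access types. The Python sketch below mirrors that logic; the function name and the string labels are hypothetical stand-ins for the MBReqDomain and MBReqTypes enumerations.

```python
DOMAINS = ["OuterShareable", "Nonshareable", "InnerShareable", "FullSystem"]


def decode_barrier_option(crm: int) -> tuple:
    """Return (domain, types) for a 4-bit CRm value (illustrative sketch)."""
    if not 0 <= crm <= 15:
        raise ValueError("CRm is a 4-bit field")
    domain = DOMAINS[(crm >> 2) & 0b11]    # CRm<3:2>
    low = crm & 0b11                       # CRm<1:0>
    if low == 0b01:
        types = "Reads"
    elif low == 0b10:
        types = "Writes"
    elif low == 0b11:
        types = "All"
    else:
        # reserved encodings behave as a full system barrier
        types = "All"
        domain = "FullSystem"
    return domain, types
```

For example, CRm = 0b1001 (ISHLD) decodes to the Inner Shareable domain with reads as the required access type, and the reserved value 0b0100 decodes the same way as SY.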
C6.2.61 DRPS


Debug restore process state.


|31 30 29 28 27 26 25|24 23 22 21|20 19 18 17 16|15 14 13 12 11 10|9  8  7  6  5|4  3  2  1  0|
| 1  1  0  1  0  1  1| 0  1  0  1| 1  1  1  1  1| 0  0  0  0  0  0|1  1  1  1  1|0  0  0  0  0|


System variant 


DRPS 


Decode for this encoding 


if !Halted() || PSTATE.EL == EL0 then UnallocatedEncoding();


Operation 


DRPSInstruction(); 
C6.2.62 DSB


Data Synchronization Barrier is a memory barrier that ensures the completion of memory accesses, see Data 
Synchronization Barrier (DSB) on page B2-89. 


|31 30 29 28 27 26 25 24 23 22|21|20 19|18 17 16|15 14 13 12|11     8|7|6 5|4       0|
| 1  1  0  1  0  1  0  1  0  0| 0| 0  0| 0  1  1| 0  0  1  1|  CRm   |1|0 0|1 1 1 1 1|
                                                                        opc


System variant


DSB <option>|#<imm>


Decode for this encoding 


MBReqDomain domain;
MBReqTypes types;


case CRm<3:2> of
    when '00' domain = MBReqDomain_OuterShareable;
    when '01' domain = MBReqDomain_Nonshareable;
    when '10' domain = MBReqDomain_InnerShareable;
    when '11' domain = MBReqDomain_FullSystem;


case CRm<1:0> of
    when '01' types = MBReqTypes_Reads;
    when '10' types = MBReqTypes_Writes;
    when '11' types = MBReqTypes_All;
    otherwise
        types = MBReqTypes_All;
        domain = MBReqDomain_FullSystem;


Assembler symbols 


<option> Specifies the limitation on the barrier operation. Values are: 


SY Full system is the required shareability domain, reads and writes are the required access
types in both Group A and Group B. This option is referred to as the full system DSB.
Encoded as CRm = 0b1111.





ST Full system is the required shareability domain, writes are the required access type in 
both Group A and Group B. Encoded as CRm = 0b1110. 

LD Full system is the required shareability domain, reads are the required access type in 
Group A, and reads and writes are the required access types in Group B. Encoded as CRm 
= 0b1101. 

ISH Inner Shareable is the required shareability domain, reads and writes are the required 
access types in both Group A and Group B. Encoded as CRm = 0b1011. 

ISHST Inner Shareable is the required shareability domain, writes are the required access type 
in both Group A and Group B. Encoded as CRm = 0b1010. 

ISHLD Inner Shareable is the required shareability domain, reads are the required access type 
in Group A, and reads and writes are the required access types in Group B. Encoded as 
CRm = 0b1001. 

NSH Non-shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as CRm = 0b0111.

NSHST Non-shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as CRm = 0b0110.

NSHLD Non-shareable is the required shareability domain, reads are the required access type in
Group A, and reads and writes are the required access types in Group B. Encoded as CRm
= 0b0101.

OSH Outer Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as CRm = 0b0011.


OSHST Outer Shareable is the required shareability domain, writes are the required access type 
in both Group A and Group B. Encoded as CRm = 0b0010. 


OSHLD Outer Shareable is the required shareability domain, reads are the required access type 
in Group A, and reads and writes are the required access types in Group B. Encoded as 
CRm = 0b0001. 


All other encodings of CRm that are not listed above are reserved, and can be encoded using the 
#<imm> syntax. It is IMPLEMENTATION DEFINED whether options other than SY are implemented. All 
unsupported and reserved options must execute as a full system barrier operation, but software must 
not rely on this behavior. 


<imm> Is a 4-bit unsigned immediate, in the range 0 to 15, encoded in the "CRm" field. 


Operation 


DataSynchronizationBarrier(domain, types); 
C6.2.63 EON (shifted register)


Bitwise Exclusive OR NOT (shifted register) performs a bitwise Exclusive OR NOT of a register value and an 
optionally-shifted register value, and writes the result to the destination register. 


|31|30 29|28 27 26 25 24|23 22|21|20     16|15       10|9       5|4       0|
|sf| 1  0| 0  1  0  1  0|shift| 1|   Rm    |   imm6    |   Rn    |   Rd    |
    opc                        N


32-bit variant
Applies when sf == 0.


EON <Wd>, <Wn>, <Wm>{, <shift> #<amount>}


64-bit variant
Applies when sf == 1.


EON <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 

if sf == '0' && imm6<5> == '1' then ReservedValue();


ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.


Operation


bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);


operand2 = NOT(operand2); 
result = operand1 EOR operand2; 


X[d] = result; 
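
As an illustration of the operation above, the Python sketch below applies the optional shift to the second operand, inverts it, and exclusive-ORs it with the first operand. The helper names are hypothetical, register values are treated as unsigned integers of datasize bits, and the shift helper takes the register value directly rather than a register number, so it is only an approximation of how ShiftReg is used in the pseudocode.

```python
def shift_value(value: int, shift_type: str, amount: int, datasize: int) -> int:
    """Apply LSL, LSR, ASR, or ROR to a datasize-bit value (sketch)."""
    mask = (1 << datasize) - 1
    value &= mask
    if shift_type == "LSL":
        return (value << amount) & mask
    if shift_type == "LSR":
        return value >> amount
    if shift_type == "ASR":
        result = value >> amount
        if value >> (datasize - 1):              # sign bit set: fill with ones
            result |= mask & ~(mask >> amount)
        return result
    if shift_type == "ROR":
        amount %= datasize
        return ((value >> amount) | (value << (datasize - amount))) & mask
    raise ValueError("unknown shift type")


def eon(xn: int, xm: int, shift_type: str = "LSL", amount: int = 0,
        datasize: int = 64) -> int:
    """Model of EON: xn EOR NOT(shifted xm), truncated to datasize bits."""
    mask = (1 << datasize) - 1
    operand2 = ~shift_value(xm, shift_type, amount, datasize) & mask
    return (xn & mask) ^ operand2
```

A useful sanity check is that EON of a register with itself, with no shift, always gives all ones.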
C6.2.64 EOR (immediate)


Bitwise Exclusive OR (immediate) performs a bitwise Exclusive OR of a register value and an immediate value, 
and writes the result to the destination register. 


|31|30 29|28 27 26 25 24 23|22|21       16|15       10|9       5|4       0|
|sf| 1  0| 1  0  0  1  0  0| N|   immr    |   imms    |   Rn    |   Rd    |
    opc


32-bit variant
Applies when sf == 0 && N == 0.


EOR <Wd|WSP>, <Wn>, #<imm>


64-bit variant
Applies when sf == 1.


EOR <Xd|SP>, <Xn>, #<imm>


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 

if sf == '0' && N != '0' then ReservedValue();
(imm, -) = DecodeBitMasks(N, imms, immr, TRUE); 


Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<imm> For the 32-bit variant: is the bitmask immediate, encoded in "imms:immr". 


For the 64-bit variant: is the bitmask immediate, encoded in "N:imms:immr". 


Operation 


bits(datasize) result; 
bits(datasize) operand1 = X[n];


result = operand1 EOR imm;


if d == 31 then
    SP[] = result;
else
    X[d] = result;
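
The decode step obtains the immediate from DecodeBitMasks(N, imms, immr, TRUE), which is defined elsewhere in the manual. The Python sketch below is an approximation of how a bitmask immediate expands from the N, imms, and immr fields, written for illustration rather than copied from the architectural pseudocode; the function names are hypothetical. The idea is that the fields select an element size, a run of ones within the element, and a rotation, and the rotated element is then replicated across the register.

```python
def decode_bitmask_immediate(n: int, imms: int, immr: int, datasize: int) -> int:
    """Expand a logical immediate into a datasize-bit mask (illustrative sketch)."""
    combined = (n << 6) | (~imms & 0x3F)
    length = combined.bit_length() - 1        # index of the highest set bit
    if length < 1:
        raise ValueError("reserved encoding")
    esize = 1 << length                       # element size in bits
    if esize > datasize:
        raise ValueError("reserved encoding for this datasize")
    levels = esize - 1
    if (imms & levels) == levels:
        raise ValueError("reserved encoding (element of all ones)")
    s = imms & levels                         # run of s + 1 ones
    r = immr & levels                         # rotate right by r
    welem = (1 << (s + 1)) - 1
    elem = ((welem >> r) | (welem << (esize - r))) & ((1 << esize) - 1)
    mask = 0
    for shift in range(0, datasize, esize):   # replicate across the register
        mask |= elem << shift
    return mask


def eor_immediate(xn: int, n: int, imms: int, immr: int, datasize: int = 64) -> int:
    """Model of EOR (immediate) on an unsigned datasize-bit value."""
    full = (1 << datasize) - 1
    return (xn & full) ^ decode_bitmask_immediate(n, imms, immr, datasize)
```

Under this sketch, N = 0, imms = 0b000111, immr = 0 gives the 32-bit immediate 0xFF, and N = 0, imms = 0b111100, immr = 0 gives the alternating 64-bit pattern 0x5555555555555555.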
C6.2.65 EOR (shifted register)


Bitwise Exclusive OR (shifted register) performs a bitwise Exclusive OR of a register value and an 
optionally-shifted register value, and writes the result to the destination register. 


|31|30 29|28 27 26 25 24|23 22|21|20     16|15       10|9       5|4       0|
|sf| 1  0| 0  1  0  1  0|shift| 0|   Rm    |   imm6    |   Rn    |   Rd    |
    opc                        N


32-bit variant
Applies when sf == 0.


EOR <Wd>, <Wn>, <Wm>{, <shift> #<amount>}


64-bit variant
Applies when sf == 1.


EOR <Xd>, <Xn>, <Xm>{, <shift> #<amount>}


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 

if sf == '0' && imm6<5> == '1' then ReservedValue();


ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.


Operation


bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);


result = operand1 EOR operand2; 


X[d] = result; 
C6.2.66 ERET


Exception Return using the ELR and SPSR for the current Exception level. When executed, the PE restores PSTATE
from the SPSR, and branches to the address held in the ELR.


The PE checks the SPSR for the current Exception level for an illegal return event. See Illegal return events from
AArch64 state on page D1-1537.


ERET is UNDEFINED at EL0.


|31 30 29 28 27 26 25|24 23 22 21|20 19 18 17 16|15 14 13 12 11 10|9  8  7  6  5|4  3  2  1  0|
| 1  1  0  1  0  1  1| 0  1  0  0| 1  1  1  1  1| 0  0  0  0  0  0|1  1  1  1  1|0  0  0  0  0|


System variant


ERET


Decode for this encoding


if PSTATE.EL == EL0 then UnallocatedEncoding();


Operation 


AArch64.ExceptionReturn(ELR[], SPSR[]); 
C6.2.67 EXTR


Extract register extracts a register from a pair of registers.


This instruction is used by the alias ROR (immediate). See Alias conditions for details of when each alias is
preferred.


|31|30 29|28 27 26 25 24 23|22|21|20     16|15       10|9       5|4       0|
|sf| 0  0| 1  0  0  1  1  1| N| 0|   Rm    |   imms    |   Rn    |   Rd    |
    op21                       o0


32-bit variant
Applies when sf == 0 && N == 0 && imms == 0xxxxx.


EXTR <Wd>, <Wn>, <Wm>, #<lsb>


64-bit variant
Applies when sf == 1 && N == 1.


EXTR <Xd>, <Xn>, <Xm>, #<lsb>


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 
integer lsb;


if N != sf then UnallocatedEncoding();


if sf == '0' && imms<5> == '1' then ReservedValue();
lsb = UInt(imms);


Alias conditions 





Alias is preferred when 





ROR (immediate) Rn == Rm 





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 

<lsb> For the 32-bit variant: is the least significant bit position from which to extract, in the range 0 to 31,
encoded in the "imms" field.


For the 64-bit variant: is the least significant bit position from which to extract, in the range 0 to 63,
encoded in the "imms" field.


Operation


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];
bits(2*datasize) concat = operand1:operand2;


result = concat<lsb+datasize-1:lsb>;


X[d] = result; 
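
The extraction can be pictured as sliding a datasize-bit window over the concatenation of the two source registers. The Python sketch below models that directly, and shows why the ROR (immediate) alias applies when both sources are the same register. The function names are hypothetical, and register values are treated as unsigned integers.

```python
def extr(xn: int, xm: int, lsb: int, datasize: int = 64) -> int:
    """Model of EXTR: bits lsb+datasize-1..lsb of the pair xn:xm."""
    if not 0 <= lsb < datasize:
        raise ValueError("lsb out of range for this datasize")
    mask = (1 << datasize) - 1
    concat = ((xn & mask) << datasize) | (xm & mask)   # operand1:operand2
    return (concat >> lsb) & mask


def ror_immediate(xs: int, shift: int, datasize: int = 64) -> int:
    """ROR (immediate) expressed as EXTR with Rn == Rm."""
    return extr(xs, xs, shift, datasize)
```

With an lsb of 0 the result is simply the second source register, and with identical sources the bits shifted out at the bottom reappear at the top, which is a rotate right.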
C6.2.68 HINT


Hint instruction is for the instruction set space that is reserved for architectural hint instructions. 


The encoding variants described here are unallocated in this revision of the architecture, and behave as a NOP. These 
encodings might be allocated to other hint functionality in future revisions of the architecture. 


|31 30 29 28 27 26 25 24 23 22|21|20 19|18 17 16|15 14 13 12|11     8|7   5|4       0|
| 1  1  0  1  0  1  0  1  0  0| 0| 0  0| 0  1  1| 0  0  1  0|  CRm   | op2 |1 1 1 1 1|


Hints 6 and 7 variant 
Applies when CRm == 0000 && op2 == 11x. 


HINT #<imm> 


Hints 8 to 127 variant 
Applies when CRm != 0000. 


HINT #<imm> 


Assembler symbols 
<imm> Is a 7-bit unsigned immediate, in the range 0 to 127, excluding the allocated encodings described 
below, encoded in "CRm:op2". The following encodings of "CRm:op2" are allocated: 
0000000 NOP
0000001 YIELD
0000010 WFE
0000011 WFI
0000100 SEV
0000101 SEVL


Note

For allocated encodings of "CRm:op2":

• A disassembler will disassemble the allocated instruction, rather than the HINT instruction.
• An assembler may support assembly of allocated encodings using HINT with the
corresponding <imm> value, but it is not required to do so. 
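
The disassembly rule in the note can be sketched as a lookup on the 7-bit value formed from "CRm:op2". The Python below is a minimal illustration with hypothetical names; it covers only the six allocated encodings listed above and prints every other value as a generic HINT.

```python
# Allocated "CRm:op2" values and the instruction each one names
ALLOCATED_HINTS = {
    0b0000000: "NOP",
    0b0000001: "YIELD",
    0b0000010: "WFE",
    0b0000011: "WFI",
    0b0000100: "SEV",
    0b0000101: "SEVL",
}


def disassemble_hint(crm: int, op2: int) -> str:
    """Return the preferred disassembly for a hint-space encoding (sketch)."""
    if not (0 <= crm <= 15 and 0 <= op2 <= 7):
        raise ValueError("CRm is 4 bits and op2 is 3 bits")
    imm = (crm << 3) | op2              # <imm> is encoded in "CRm:op2"
    return ALLOCATED_HINTS.get(imm, f"HINT #{imm}")
```

For example, CRm = 0 with op2 = 6 falls in the "Hints 6 and 7" variant and is shown as HINT #6.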
C6.2.69 HLT


Halt instruction generates a Halt Instruction debug event. 


|31 30 29 28 27 26 25 24|23 22 21|20                    5|4 3 2|1 0|
| 1  1  0  1  0  1  0  0| 0  1  0|         imm16         |0 0 0|0 0|


System variant 


HLT #<imm> 


Decode for this encoding 


if EDSCR.HDE == '0' || !HaltingAllowed() then UndefinedFault();


Assembler symbols 


<imm> Is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 


Operation 


Halt(DebugHalt_HaltInstruction);
C6.2.70 HVC


Hypervisor Call causes an exception to EL2. Non-secure software executing at EL1 can use this instruction to call
the hypervisor to request a service.


The HVC instruction is UNDEFINED:
• At EL0, and Secure EL1.
• When SCR_EL3.HCE is set to 0.


On executing an HVC instruction, the PE records the exception as a Hypervisor Call exception in ESR_ELx, using 
the EC value 0x16, and the value of the immediate argument. 


|31 30 29 28 27 26 25 24|23 22 21|20                    5|4 3 2|1 0|
| 1  1  0  1  0  1  0  0| 0  0  0|         imm16         |0 0 0|1 0|


System variant 


HVC #<imm> 


Decode for this encoding 


bits(16) imm = imm16; 


Assembler symbols 


<imm> Is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 


Operation 


if !HaveEL(EL2) || PSTATE.EL == EL0 || (PSTATE.EL == EL1 && IsSecure()) then
    UnallocatedEncoding();


hvc_enable = if HaveEL(EL3) then SCR_EL3.HCE else NOT(HCR_EL2.HCD);
if hvc_enable == '0' then
    AArch64.UndefinedFault();
else
    AArch64.CallHypervisor(imm);
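
The control flow of the operation above can be summarized as a small decision function. The Python sketch below is illustrative only: the function name, its parameters, and the returned strings are hypothetical, and the register bits SCR_EL3.HCE and HCR_EL2.HCD are passed in as plain 0 or 1 values.

```python
def hvc_outcome(have_el2: bool, have_el3: bool, current_el: int, secure: bool,
                scr_el3_hce: int = 0, hcr_el2_hcd: int = 0) -> str:
    """Return which pseudocode path an HVC instruction would take (sketch)."""
    if not have_el2 or current_el == 0 or (current_el == 1 and secure):
        return "UnallocatedEncoding"
    # With EL3 implemented the enable is SCR_EL3.HCE, otherwise NOT(HCR_EL2.HCD)
    hvc_enable = scr_el3_hce if have_el3 else (hcr_el2_hcd ^ 1)
    return "CallHypervisor" if hvc_enable else "UndefinedFault"
```

For example, Non-secure EL1 on a PE with EL2 and EL3 implemented reaches the hypervisor only when SCR_EL3.HCE is 1.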
C6.2.71 IC


Instruction Cache operation. For more information, see A64 system instructions for cache maintenance.
This instruction is an alias of the SYS instruction. This means that:

• The encodings in this description are named to match the encodings of SYS.
• The description of SYS gives the operational pseudocode for this instruction.


|31 30 29 28 27 26 25 24 23 22|21|20 19|18  16|15     12|11     8|7    5|4      0|
| 1  1  0  1  0  1  0  1  0  0| 0| 0  1| op1  | 0 1 1 1 |  CRm   | op2  |   Rt   |
                               L                  CRn


System variant

IC <ic_op>{, <Xt>}

is equivalent to

SYS #<op1>, C7, <Cm>, #<op2>{, <Xt>}


and is the preferred disassembly when SysOp(op1, '0111',CRm,op2) == Sys_IC. 


Assembler symbols 


<ic_op> Is an IC instruction name, as listed for the IC system instruction pages, encoded in the
"op1:CRm:op2" field. It can have the following values:


IALLUIS when op1 = 000, CRm = 0001, op2 = 000
IALLU   when op1 = 000, CRm = 0101, op2 = 000
IVAU    when op1 = 011, CRm = 0101, op2 = 001


<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.

<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.

<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field. 

<Xt> Is the 64-bit name of the optional general-purpose source register, defaulting to '11111', encoded in 
the "Rt" field. 

Operation 


The description of SYS gives the operational pseudocode for this instruction. 
C6.2.72 ISB


Instruction Synchronization Barrier flushes the pipeline in the PE, so that all instructions following the ISB are 
fetched from cache or memory, after the instruction has been completed. It ensures that the effects of context 
changing operations executed before the ISB instruction are visible to the instructions fetched after the ISB. Context 
changing operations include changing the ASID, TLB maintenance instructions, and all changes to the System 
registers. In addition, any branches that appear in program order after the ISB instruction are written into the branch 
prediction logic with the context that is visible after the ISB instruction. This is needed to ensure correct execution 
of the instruction stream. For more information, see Memory barriers on page B2-87. 


| 31 30 29 28 | 27 26 25 24 | 23 22 21 20 | 19 18 17 16 | 15 14 13 12 | 11    8 | 7 | 6  5 | 4  3  2  1  0 |
| 1  1  0  1  | 0  1  0  1  | 0  0  0  0  | 0  0  1  1  | 0  0  1  1  |   CRm   | 1 | 1  0 | 1  1  1  1  1 |
                                                                                      opc


System variant 
ISB {<option>|#<imm>} 
Decode for this encoding 


// Empty. 


Assembler symbols 


<option> Specifies an optional limitation on the barrier operation. Values are: 
SY Full system barrier operation, encoded as CRm = 0b1111. Can be omitted.


All other encodings of CRm are reserved. The corresponding instructions execute as full system 
barrier operations, but must not be relied upon by software. 


<imm> Is an optional 4-bit unsigned immediate, in the range 0 to 15, defaulting to 15 and encoded in the 
"CRm" field. 
Operation 


InstructionSynchronizationBarrier(); 
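
The mapping from the assembler operand to the "CRm" field can be sketched in Python as follows. The helper name encode_isb is hypothetical, and the constant ISB_BASE assumes the fixed bits of the ISB encoding with CRm set to zero.

```python
ISB_BASE = 0xD50330DF  # assumed fixed bits of ISB, with CRm = 0b0000


def encode_isb(operand=None):
    """Encode ISB {<option>|#<imm>}; operand is None, "SY", or an int 0..15."""
    if operand is None or (isinstance(operand, str) and operand.upper() == "SY"):
        crm = 0b1111  # SY, which is also the default when the operand is omitted
    elif isinstance(operand, int) and 0 <= operand <= 15:
        crm = operand  # #<imm> form; values other than 15 are reserved
    else:
        raise ValueError(f"unsupported ISB operand: {operand!r}")
    return ISB_BASE | (crm << 8)
```

ISB, ISB SY, and ISB #15 all produce the same instruction word under this sketch.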
C6.2.73 LDAR


Load-Acquire Register derives an address from a base register value, loads a 32-bit word or 64-bit doubleword from 
memory, and writes it to a register. The instruction also has memory ordering semantics as described in 
Load-Acquire, Store-Release on page B2-90. For information about memory accesses, see Load/Store addressing 
modes on page C1-128. 


| 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14           10 | 9    5 | 4    0 |
| 1  x  | 0  0  1  0  0  0  | 1  | 1  | 0  | (1)(1)(1)(1)(1) | 1  | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                                             Rs                     Rt2


32-bit variant 
Applies when size == 10. 


LDAR <Wt>, [<Xn|SP>{,#0}] 
64-bit variant 

Applies when size == 11. 
LDAR <Xt>, [<Xn|SP>{,#0}] 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


integer elsize = 8 << UInt(size); 
integer regsize = if elsize == 64 then 64 else 32; 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 


bits(64) address; 
bits(elsize) data; 
constant integer dbytes = elsize DIV 8; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

data = Mem[address, dbytes, AccType_ORDERED];
X[t] = ZeroExtend(data, regsize);
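
The decode step above can be sketched in Python as follows. The function name decode_ldar is hypothetical, and the sketch assumes Rn occupies bits [9:5] and Rt bits [4:0] of the instruction word. It does not check the fixed opcode bits.

```python
def decode_ldar(word):
    """Extract the LDAR decode values from a 32-bit instruction word."""
    size = (word >> 30) & 0b11
    if size not in (0b10, 0b11):
        raise ValueError("not the 32-bit or 64-bit LDAR variant")
    n = (word >> 5) & 0x1F   # Rn field
    t = word & 0x1F          # Rt field
    elsize = 8 << size       # 32 for size == 10, 64 for size == 11
    regsize = 64 if elsize == 64 else 32
    return {"n": n, "t": t, "elsize": elsize, "regsize": regsize}
```

For the word 0xC8DFFC20, taken here as LDAR X0, [X1], the sketch yields n = 1, t = 0, and elsize = 64.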
C6.2.74 LDARB
Load-Acquire Register Byte derives an address from a base register value, loads a byte from memory, zero-extends 
it and writes it to a register. The instruction also has memory ordering semantics as described in Load-Acquire, 
Store-Release on page B2-90. For information about memory accesses, see Load/Store addressing modes on 
page C1-128. 
| 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14           10 | 9    5 | 4    0 |
| 0  0  | 0  0  1  0  0  0  | 1  | 1  | 0  | (1)(1)(1)(1)(1) | 1  | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                                             Rs                     Rt2
No offset variant 
LDARB <Wt>, [<Xn|SP>{,#0}] 
Decode for this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 
bits(64) address; 
bits(8) data; 
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];
data = Mem[address, 1, AccType_ORDERED];
X[t] = ZeroExtend(data, 32);
C6 A64 Base Instruction Descriptions
C6.2.75 LDARH


Load-Acquire Register Halfword derives an address from a base register value, loads a halfword from memory, 
zero-extends it, and writes it to a register. The instruction also has memory ordering semantics as described in 
Load-Acquire, Store-Release on page B2-90. For information about memory accesses, see Load/Store addressing 
modes on page C1-128. 


| 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14           10 | 9    5 | 4    0 |
| 0  1  | 0  0  1  0  0  0  | 1  | 1  | 0  | (1)(1)(1)(1)(1) | 1  | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                                             Rs                     Rt2


No offset variant 


LDARH <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


Operation 


bits(64) address; 
bits(16) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

data = Mem[address, 2, AccType_ORDERED];
X[t] = ZeroExtend(data, 32);
C6.2.76 LDAXP

Load-Acquire Exclusive Pair of Registers derives an address from a base register value, loads two 32-bit words or 
two 64-bit doublewords from memory, and writes them to two registers. A 32-bit pair requires the address to be 
doubleword aligned and is single-copy atomic at doubleword granularity. A 64-bit pair requires the address to be 
quadword aligned and is single-copy atomic for each doubleword at doubleword granularity. The PE marks the 
physical address being accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive 
instructions. See Synchronization and semaphores on page B2-108. The instruction also has memory ordering 
semantics as described in Load-Acquire, Store-Release on page B2-90. For information about memory accesses see 
Load/Store addressing modes on page C1-128. 

| 31 | 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14    10 | 9    5 | 4    0 |
| 1  | sz | 0  0  1  0  0  0  | 0  | 1  | 1  | (1)(1)(1)(1)(1) | 1  |   Rt2    |   Rn   |   Rt   |
                                                     Rs

32-bit variant 
Applies when sz == 0.
LDAXP <Wt1>, <Wt2>, [<Xn|SP>{,#0}]
64-bit variant
Applies when sz == 1.
LDAXP <Xt1>, <Xt2>, [<Xn|SP>{,#0}]
Decode for all variants of this encoding 

integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer t2 = UInt(Rt2); 

integer elsize = 32 << UInt(sz); 

integer datasize = elsize * 2;
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDAXP on page K1-5484. 
Assembler symbols 
<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.
<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
C6-536 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k_iss10775
Non-Confidential ID092916


C6.2 Alphabetical list of A64 base instructions 


Operation 


bits(64) address;
bits(datasize) data;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN    rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, dbytes);

if rt_unknown then
    // ConstrainedUNPREDICTABLE case
    X[t] = bits(datasize) UNKNOWN;
elsif elsize == 32 then
    // 32-bit load exclusive pair (atomic)
    data = Mem[address, dbytes, AccType_ORDERED];
    if BigEndian() then
        X[t] = data<datasize-1:elsize>;
        X[t2] = data<elsize-1:0>;
    else
        X[t] = data<elsize-1:0>;
        X[t2] = data<datasize-1:elsize>;
else // elsize == 64
    // 64-bit load exclusive pair (not atomic),
    // but must be 128-bit aligned
    if address != Align(address, dbytes) then
        AArch64.Abort(address, AArch64.AlignmentFault(AccType_ORDERED, FALSE, FALSE));
    X[t] = Mem[address, 8, AccType_ORDERED];
    X[t2] = Mem[address+8, 8, AccType_ORDERED];
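
The way the 32-bit variant distributes a single 64-bit access across the two registers can be sketched in Python as follows. The function name split_pair is hypothetical. The sketch models only the bit-slicing in the elsize == 32 path, not the exclusive monitor or the memory access itself.

```python
def split_pair(data, elsize, big_endian):
    """Split a 2*elsize-bit value into (X[t], X[t2]) as the 32-bit path does."""
    mask = (1 << elsize) - 1
    low = data & mask                 # data<elsize-1:0>
    high = (data >> elsize) & mask    # data<datasize-1:elsize>
    # Big-endian: the first register takes the high half, otherwise the low half.
    return (high, low) if big_endian else (low, high)
```

With data = 0x1111111122222222 and elsize = 32, a little-endian access places 0x22222222 in the first register, and a big-endian access places 0x11111111 there.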
C6.2.77 LDAXR
Load-Acquire Exclusive Register derives an address from a base register value, loads a 32-bit word or 64-bit 
doubleword from memory, and writes it to a register. The memory access is atomic. The PE marks the physical 
address being accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive 
instructions. See Synchronization and semaphores on page B2-108. The instruction also has memory ordering 
semantics as described in Load-Acquire, Store-Release on page B2-90. For information about memory accesses see 
Load/Store addressing modes on page C1-128. 
| 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14           10 | 9    5 | 4    0 |
| 1  x  | 0  0  1  0  0  0  | 0  | 1  | 0  | (1)(1)(1)(1)(1) | 1  | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                                             Rs                     Rt2
32-bit variant 
Applies when size == 10. 
LDAXR <Wt>, [<Xn|SP>{,#0}] 
64-bit variant 
Applies when size == 11. 
LDAXR <Xt>, [<Xn|SP>{,#0}] 
Decode for all variants of this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer elsize = 8 << UInt(size); 
integer regsize = if elsize == 64 then 64 else 32; 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 
bits(64) address; 
bits(elsize) data; 
constant integer dbytes = elsize DIV 8; 
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];
// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, dbytes);

data = Mem[address, dbytes, AccType_ORDERED];
X[t] = ZeroExtend(data, regsize);
C6.2.78 LDAXRB
Load-Acquire Exclusive Register Byte derives an address from a base register value, loads a byte from memory, 
zero-extends it and writes it to a register. The memory access is atomic. The PE marks the physical address being 
accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive instructions. See 
Synchronization and semaphores on page B2-108. The instruction also has memory ordering semantics as described 
in Load-Acquire, Store-Release on page B2-90. For information about memory accesses see Load/Store addressing 
modes on page C1-128. 
| 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14           10 | 9    5 | 4    0 |
| 0  0  | 0  0  1  0  0  0  | 0  | 1  | 0  | (1)(1)(1)(1)(1) | 1  | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                                             Rs                     Rt2
No offset variant 
LDAXRB <Wt>, [<Xn|SP>{,#0}] 
Decode for this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 
bits(64) address; 
bits(8) data; 
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];
// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, 1);
data = Mem[address, 1, AccType_ORDERED];
X[t] = ZeroExtend(data, 32);
C6.2.79 LDAXRH


Load-Acquire Exclusive Register Halfword derives an address from a base register value, loads a halfword from
memory, zero-extends it and writes it to a register. The memory access is atomic. The PE marks the physical address
being accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive instructions. See
Synchronization and semaphores on page B2-108. The instruction also has memory ordering semantics as described
in Load-Acquire, Store-Release on page B2-90. For information about memory accesses see Load/Store addressing
modes on page C1-128.


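| 31 30 | 29 28 27 26 25 24 | 23 | 22 | 21 | 20           16 | 15 | 14           10 | 9    5 | 4    0 |
| 0  1  | 0  0  1  0  0  0  | 0  | 1  | 0  | (1)(1)(1)(1)(1) | 1  | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                                             Rs                     Rt2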


No offset variant 


LDAXRH <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 


bits(64) address; 
bits(16) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, 2);

data = Mem[address, 2, AccType_ORDERED];
X[t] = ZeroExtend(data, 32);





C6.2.80 LDNP 
Load Pair of Registers, with non-temporal hint, calculates an address from a base register value and an immediate 
offset, loads two 32-bit words or two 64-bit doublewords from memory, and writes them to two registers. 
For information about memory accesses, see Load/Store addressing modes on page C1-128. For information about 
Non-temporal pair instructions, see Load/Store Non-temporal Pair on page C3-149. 
| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| x  0  | 1  0  1  | 0  | 0  0  0  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L
32-bit variant 
Applies when opc == 00. 
LDNP <Wt1>, <Wt2>, [<Xn|SP>{, #<imm>}]
64-bit variant
Applies when opc == 10.
LDNP <Xt1>, <Xt2>, [<Xn|SP>{, #<imm>}]
Decode for all variants of this encoding 
// Empty. 
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDNP on page K1-5484. 
Assembler symbols 
<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 
<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 
<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm> For the 32-bit variant: is the optional signed immediate byte offset, a multiple of 4 in the range -256 
to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4. 
For the 64-bit variant: is the optional signed immediate byte offset, a multiple of 8 in the range -512 
to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8. 
Shared decode for all encodings 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer t2 = UInt(Rt2); 
if opc<0> == '1' then UnallocatedEncoding();
integer scale = 2 + UInt(opc<1>);
integer datasize = 8 << scale; 
bits(64) offset = LSL(SignExtend(imm7, 64), scale); 


Operation 


bits(64) address;
bits(datasize) data1;
bits(datasize) data2;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN    rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data1 = Mem[address, dbytes, AccType_STREAM];
data2 = Mem[address+dbytes, dbytes, AccType_STREAM];
if rt_unknown then
    data1 = bits(datasize) UNKNOWN;
    data2 = bits(datasize) UNKNOWN;
X[t] = data1;
X[t2] = data2;
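
The scaling of the "imm7" field described under Assembler symbols can be sketched in Python as follows. The function names are hypothetical. The sketch follows the shared decode, where scale = 2 + UInt(opc<1>) and the offset is the sign-extended field shifted left by scale.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def decode_pair_offset(imm7, opc):
    """Return the byte offset for an imm7 field, given the 2-bit opc field."""
    scale = 2 + ((opc >> 1) & 1)
    return sign_extend(imm7, 7) << scale


def encode_pair_offset(imm, opc):
    """Return the imm7 field for byte offset imm, or raise if not encodable."""
    scale = 2 + ((opc >> 1) & 1)
    if imm % (1 << scale) != 0:
        raise ValueError("offset is not a multiple of the access size")
    scaled = imm >> scale
    if not -64 <= scaled <= 63:
        raise ValueError("offset out of range")
    return scaled & 0x7F
```

The limits of the 7-bit signed field give the documented ranges: -256 to 252 for the 32-bit variant and -512 to 504 for the 64-bit variant.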
C6.2.81 LDP


Load Pair of Registers calculates an address from a base register value and an immediate offset, loads two 32-bit
words or two 64-bit doublewords from memory, and writes them to two registers. For information about memory
accesses, see Load/Store addressing modes on page C1-128.


Post-index 


| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| x  0  | 1  0  1  | 0  | 0  0  1  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L


32-bit variant 

Applies when opc == 00. 

LDP <Wt1>, <Wt2>, [<Xn|SP>], #<imm>

64-bit variant

Applies when opc == 10.

LDP <Xt1>, <Xt2>, [<Xn|SP>], #<imm>
Decode for all variants of this encoding


boolean wback = TRUE;
boolean postindex = TRUE; 


Pre-index 


| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| x  0  | 1  0  1  | 0  | 0  1  1  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L


32-bit variant 

Applies when opc == 00. 

LDP <Wt1>, <Wt2>, [<Xn|SP>, #<imm>]!

64-bit variant

Applies when opc == 10.

LDP <Xt1>, <Xt2>, [<Xn|SP>, #<imm>]!
Decode for all variants of this encoding


boolean wback = TRUE;
boolean postindex = FALSE; 


Signed offset 


| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| x  0  | 1  0  1  | 0  | 0  1  0  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L
32-bit variant
Applies when opc == 00.
LDP <Wt1>, <Wt2>, [<Xn|SP>{, #<imm>}]
64-bit variant
Applies when opc == 10.
LDP <Xt1>, <Xt2>, [<Xn|SP>{, #<imm>}]
Decode for all variants of this encoding
boolean wback = FALSE;
boolean postindex = FALSE; 
Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDP on page K1-5484. 


Assembler symbols 


<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.

<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.

<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.

<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.

<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.

<imm> For the 32-bit post-index and 32-bit pre-index variant: is the signed immediate byte offset, a
multiple of 4 in the range -256 to 252, encoded in the "imm7" field as <imm>/4.


For the 32-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 4 in 
the range -256 to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4. 


For the 64-bit post-index and 64-bit pre-index variant: is the signed immediate byte offset, a 
multiple of 8 in the range -512 to 504, encoded in the "imm7" field as <imm>/8. 


For the 64-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 8 in 
the range -512 to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8. 


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer t2 = UInt(Rt2); 

if L:opc<0> == '01' || opc == '11' then UnallocatedEncoding();
boolean signed = (opc<0> != '0');

integer scale = 2 + UInt(opc<1>); 

integer datasize = 8 << scale; 

bits(64) offset = LSL(SignExtend(imm7, 64), scale); 


Operation for all encodings 


bits(64) address;
bits(datasize) data1;
bits(datasize) data2;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;
boolean wb_unknown = FALSE;

if wback && (t == n || t2 == n) && n != 31 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;        // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;    // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN    rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data1 = Mem[address, dbytes, AccType_NORMAL];
data2 = Mem[address+dbytes, dbytes, AccType_NORMAL];
if rt_unknown then
    data1 = bits(datasize) UNKNOWN;
    data2 = bits(datasize) UNKNOWN;
if signed then
    X[t] = SignExtend(data1, 64);
    X[t2] = SignExtend(data2, 64);
else
    X[t] = data1;
    X[t2] = data2;

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
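
The difference between the three addressing forms can be sketched in Python as follows. The function name ldp_addresses is hypothetical. The sketch models only the address arithmetic controlled by wback and postindex, and ignores the CONSTRAINED UNPREDICTABLE cases.

```python
def ldp_addresses(base, offset, wback, postindex, dbytes):
    """Return (first access, second access, final base) for one LDP."""
    mask = (1 << 64) - 1
    address = base
    if not postindex:
        # Pre-index and signed offset: apply the offset before the access.
        address = (address + offset) & mask
    first, second = address, (address + dbytes) & mask
    if wback:
        if postindex:
            # Post-index: apply the offset after the access.
            address = (address + offset) & mask
        new_base = address
    else:
        new_base = base  # signed offset form leaves the base register unchanged
    return first, second, new_base
```

For a base of 0x1000 and 8-byte elements, the post-index form with offset 16 accesses 0x1000 and 0x1008 and leaves 0x1010 in the base register. The signed offset form with the same offset accesses 0x1010 and 0x1018 and leaves the base at 0x1000.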
C6.2.82 LDPSW


Load Pair of Registers Signed Word calculates an address from a base register value and an immediate offset, loads 
two 32-bit words from memory, sign-extends them, and writes them to two registers. For information about memory 
accesses, see Load/Store addressing modes on page C1-128. 


Post-index 
| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| 0  1  | 1  0  1  | 0  | 0  0  1  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L


Post-index variant 


LDPSW <Xt1>, <Xt2>, [<Xn|SP>], #<imm> 


Decode for this encoding 


boolean wback = TRUE;
boolean postindex = TRUE; 


Pre-index 
| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| 0  1  | 1  0  1  | 0  | 0  1  1  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L


Pre-index variant 


LDPSW <Xt1>, <Xt2>, [<Xn|SP>, #<imm>]!


Decode for this encoding
boolean wback = TRUE;
boolean postindex = FALSE; 


Signed offset 


| 31 30 | 29 28 27 | 26 | 25 24 23 | 22 | 21     15 | 14    10 | 9    5 | 4    0 |
| 0  1  | 1  0  1  | 0  | 0  1  0  | 1  |   imm7    |   Rt2    |   Rn   |   Rt   |
  opc                                L


Signed offset variant 


LDPSW <Xt1>, <Xt2>, [<Xn|SP>{, #<imm>}] 


Decode for this encoding 
boolean wback = FALSE;
boolean postindex = FALSE; 

Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDPSW on page K1-5485. 
Assembler symbols


<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.

<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.

<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.

<imm> For the post-index and pre-index variant: is the signed immediate byte offset, a multiple of 4 in the
range -256 to 252, encoded in the "imm7" field as <imm>/4.


For the signed offset variant: is the optional signed immediate byte offset, a multiple of 4 in the 
range -256 to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4. 


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer t2 = UInt(Rt2); 

bits(64) offset = LSL(SignExtend(imm7, 64), 2); 


Operation for all encodings 


bits(64) address;
bits(32) data1;
bits(32) data2;
boolean rt_unknown = FALSE;
boolean wb_unknown = FALSE;

if wback && (t == n || t2 == n) && n != 31 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;        // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;    // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN    rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data1 = Mem[address, 4, AccType_NORMAL];
data2 = Mem[address+4, 4, AccType_NORMAL];
if rt_unknown then
    data1 = bits(32) UNKNOWN;
    data2 = bits(32) UNKNOWN;
X[t] = SignExtend(data1, 64);
X[t2] = SignExtend(data2, 64);
if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
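
The sign extension applied to each loaded word can be sketched in Python as follows. The function name ldpsw is hypothetical, memory is modelled as a bytes object, and the sketch assumes little-endian data accesses.

```python
def ldpsw(memory, address):
    """Load two little-endian 32-bit words and sign-extend each to 64 bits."""
    mask64 = (1 << 64) - 1
    regs = []
    for offset in (0, 4):
        start = address + offset
        # signed=True performs the sign extension; the mask gives the
        # 64-bit register image of the (possibly negative) result.
        word = int.from_bytes(memory[start:start + 4], "little", signed=True)
        regs.append(word & mask64)
    return tuple(regs)
```

A word with bit 31 set, such as 0xFFFFFFFE, is written to the register as 0xFFFFFFFFFFFFFFFE, whereas LDP with 32-bit registers would zero the upper half.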


C6.2.83 LDR (immediate)


Load Register (immediate) loads a word or doubleword from memory and writes it to a register. The address that is 
used for the load is calculated from a base register and an immediate offset. For information about memory accesses, 
see Load/Store addressing modes on page C1-128. The Unsigned offset variant scales the immediate offset value 
by the size of the value accessed before adding it to the base register value. 


Post-index 
| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
| 1  x  | 1  1  1  | 0  | 0  0  | 0  1  | 0  |    imm9    | 0  1  |   Rn   |   Rt   |
  size                            opc


32-bit variant 
Applies when size == 10. 


LDR <Wt>, [<Xn|SP>], #<simm> 


64-bit variant 
Applies when size == 11. 


LDR <Xt>, [<Xn|SP>], #<simm> 


Decode for all variants of this encoding 


boolean wback = TRUE;

boolean postindex = TRUE; 

integer scale = UInt(size); 

bits(64) offset = SignExtend(imm9, 64); 


Pre-index 
| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
| 1  x  | 1  1  1  | 0  | 0  0  | 0  1  | 0  |    imm9    | 1  1  |   Rn   |   Rt   |
  size                            opc


32-bit variant 
Applies when size == 10. 


LDR <Wt>, [<Xn|SP>, #<simm>]!


64-bit variant
Applies when size == 11.


LDR <Xt>, [<Xn|SP>, #<simm>]!


Decode for all variants of this encoding


boolean wback = TRUE;

boolean postindex = FALSE; 

integer scale = UInt(size); 

bits(64) offset = SignExtend(imm9, 64); 
Unsigned offset


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21          10 | 9    5 | 4    0 |
| 1  x  | 1  1  1  | 0  | 0  1  | 0  1  |     imm12      |   Rn   |   Rt   |
  size                            opc


32-bit variant 
Applies when size == 10. 


LDR <Wt>, [<Xn|SP>{, #<pimm>}] 


64-bit variant 
Applies when size == 11. 


LDR <Xt>, [<Xn|SP>{, #<pimm>}] 


Decode for all variants of this encoding 


boolean wback = FALSE;

boolean postindex = FALSE; 

integer scale = UInt(size); 

bits(64) offset = LSL(ZeroExtend(imm12, 64), scale); 


Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDR (immediate) on page K1-5485. 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> For the 32-bit variant: is the optional positive immediate byte offset, a multiple of 4 in the range 0
to 16380, defaulting to 0 and encoded in the "imm12" field as <pimm>/4.


For the 64-bit variant: is the optional positive immediate byte offset, a multiple of 8 in the range 0 
to 32760, defaulting to 0 and encoded in the "imm12" field as <pimm>/8. 


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer regsize; 


regsize = if size == '11' then 64 else 32; 
integer datasize = 8 << scale; 
Operation for all encodings 

bits(64) address;
bits(datasize) data;
boolean wb_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;        // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;    // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data = Mem[address, datasize DIV 8, AccType_NORMAL];
X[t] = ZeroExtend(data, regsize);

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
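
The two immediate forms differ in how the offset is encoded: the Unsigned offset variant scales a 12-bit field by the access size, while the post-index and pre-index variants use an unscaled signed 9-bit field. This can be sketched in Python as follows, with hypothetical function names.

```python
def encode_unsigned_offset(pimm, size):
    """Return imm12 for the Unsigned offset variant; size is the 2-bit size field."""
    scale = size  # 0b10 -> 4-byte access, 0b11 -> 8-byte access
    if pimm < 0 or pimm % (1 << scale) != 0:
        raise ValueError("offset must be a non-negative multiple of the access size")
    imm12 = pimm >> scale
    if imm12 > 0xFFF:
        raise ValueError("offset out of range")
    return imm12


def encode_signed_offset(simm):
    """Return imm9 for the post-index and pre-index variants (unscaled)."""
    if not -256 <= simm <= 255:
        raise ValueError("offset out of range")
    return simm & 0x1FF
```

The largest encodable unsigned offsets, 16380 for the 32-bit variant and 32760 for the 64-bit variant, both map to imm12 = 0xFFF.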
C6.2.84 LDR (literal)


Load Register (literal) calculates an address from the PC value and an immediate offset, loads a word from memory, 
and writes it to a register. For information about memory accesses, see Load/Store addressing modes on 
page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23           5 | 4    0 |
| 0  x  | 0  1  1  | 0  | 0  0  |     imm19      |   Rt   |
  opc


32-bit variant 
Applies when opc == 00. 


LDR <Wt>, <label>


64-bit variant 
Applies when opc == 01. 


LDR <Xt>, <label> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
MemOp memop = MemOp_LOAD; 
boolean signed = FALSE; 
integer size; 

bits(64) offset; 


case opc of
    when '00'
        size = 4;
    when '01'
        size = 8;
    when '10'
        size = 4;
        signed = TRUE;
    when '11'
        memop = MemOp_PREFETCH;

offset = SignExtend(imm19:'00', 64);


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be loaded, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be loaded, encoded in the "Rt" field. 
<label> Is the program label from which the data is to be loaded. Its offset from the address of this
instruction, in the range +/-1 MB, is encoded as "imm19" times 4.


Operation 


bits(64) address = PC[] + offset;
bits(size*8) data;

case memop of
    when MemOp_LOAD
        data = Mem[address, size, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, 64);
        else
            X[t] = data;

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);
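
The PC-relative address calculation can be sketched in Python as follows. The function name literal_address is hypothetical. It models offset = SignExtend(imm19:'00', 64) followed by the addition to the PC value.

```python
def literal_address(pc, imm19):
    """Return PC + SignExtend(imm19:'00', 64), truncated to 64 bits."""
    field = (imm19 & 0x7FFFF) << 2  # imm19:'00' is a 21-bit value
    # Bit 20 of the 21-bit value is the sign bit.
    offset = field - (1 << 21) if field & (1 << 20) else field
    return (pc + offset) & ((1 << 64) - 1)
```

The extreme field values 0x40000 and 0x3FFFF give offsets of -1048576 and +1048572 bytes, which is the +/-1 MB range stated for <label>.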
C6.2.85 LDR (register)


C6.2.85 LDR (register)


Load Register (register) calculates an address from a base register value and an offset register value, loads a word
from memory, and writes it to a register. The offset register value can optionally be shifted and extended. For
information about memory accesses, see Load/Store addressing modes on page C1-128.


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  1  x |  1  1  1 |  0 |  0  0 |  0  1 |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                            opc


32-bit variant 


Applies when size == 10. 


LDR <Wt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]


64-bit variant 


Applies when size == 11. 


LDR <Xt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}] 


Decode for all variants of this encoding 


integer scale = UInt(size);
if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then scale else 0;


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field.
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted. It is encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> For the 32-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#2 when S = 1
For the 64-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#3 when S = 1


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 
integer regsize; 


regsize = if size == '11' then 64 else 32; 
integer datasize = 8 << scale; 


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift);
bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, datasize DIV 8, AccType_NORMAL];
X[t] = ZeroExtend(data, regsize);
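
To make the register-offset address calculation concrete, the following Python sketch models the extend and shift step for the four <extend> options listed above. ExtendReg() is defined elsewhere in the manual, so the behavior modeled here (truncate the index to the source width, sign-extend or zero-extend it, then shift left) is an assumption about that function. The names EXTEND, extend_reg and ldr_register_address are hypothetical.

```python
MASK64 = (1 << 64) - 1

# (source width in bits, signed?) for each <extend> option in the symbol list.
EXTEND = {
    "UXTW": (32, False),
    "LSL": (64, False),
    "SXTW": (32, True),
    "SXTX": (64, True),
}

def extend_reg(xm: int, extend: str, shift: int) -> int:
    """Assumed model of ExtendReg(m, extend_type, shift), returning a 64-bit value."""
    width, signed = EXTEND[extend]
    width = min(width, 64 - shift)
    value = xm & ((1 << width) - 1)
    if signed and value & (1 << (width - 1)):
        value -= 1 << width          # reinterpret as a negative index
    return (value << shift) & MASK64

def ldr_register_address(base: int, xm: int, extend: str = "LSL", shift: int = 0) -> int:
    """address = base + offset, wrapping at 64 bits."""
    return (base + extend_reg(xm, extend, shift)) & MASK64

if __name__ == "__main__":
    # 64-bit variant indexing an array of doublewords: LDR X0, [X1, X2, LSL #3]
    print(hex(ldr_register_address(0x1000, 3, "LSL", 3)))
    # 32-bit variant with a negative 32-bit index: LDR W0, [X1, W2, SXTW #2]
    print(hex(ldr_register_address(0x1000, 0xFFFFFFFF, "SXTW", 2)))
```

The permitted shift amounts (#2 for the 32-bit variant, #3 for the 64-bit variant) match the access size, so a shifted index addresses consecutive elements of that size.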
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C6.2.86 LDRB (immediate) 


Load Register Byte (immediate) loads a byte from memory, zero-extends it, and writes the result to a register. The 
address that is used for the load is calculated from a base register and an immediate offset. For information about 
memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  1 |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                            opc


Post-index variant


LDRB <Wt>, [<Xn|SP>], #<simm>


Decode for this encoding


boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  1 |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                            opc


Pre-index variant


LDRB <Wt>, [<Xn|SP>, #<simm>]!


Decode for this encoding


boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21          10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  1 |  0  1 |      imm12     |   Rn   |   Rt   |
  size                            opc


Unsigned offset variant


LDRB <Wt>, [<Xn|SP>{, #<pimm>}]


Decode for this encoding


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 0);
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Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDRB (immediate) on page K1-5486. 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, in the range 0 to 4095, defaulting to 0 and encoded
in the "imm12" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation for all encodings 


bits(64) address;
bits(8) data;
boolean wb_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;       // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;   // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data = Mem[address, 1, AccType_NORMAL];
X[t] = ZeroExtend(data, 32);

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
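
The difference between the post-index, pre-index and unsigned offset encodings of LDRB (immediate) lies in when the offset is applied and whether the base register is written back. The following Python sketch models only that part of the operation pseudocode. The function name ldrb_immediate is hypothetical, the offset is passed as an already sign-extended Python integer, and the CONSTRAINED UNPREDICTABLE case where the base and destination registers are the same is deliberately not modeled.

```python
MASK64 = (1 << 64) - 1

def ldrb_immediate(base: int, offset: int, wback: bool, postindex: bool):
    """Return (access_address, new_base) for one LDRB (immediate) encoding.

    base      -- value of Xn or SP before the instruction
    offset    -- byte offset as a signed Python integer
    wback     -- TRUE for the post-index and pre-index encodings
    postindex -- TRUE only for the post-index encoding
    """
    address = base
    if not postindex:
        address = (address + offset) & MASK64   # offset applied before the access
    access_address = address
    if wback:
        if postindex:
            address = (address + offset) & MASK64   # offset applied after the access
        return access_address, address
    return access_address, base

if __name__ == "__main__":
    print(ldrb_immediate(0x2000, 1, wback=True, postindex=True))       # post-index
    print(ldrb_immediate(0x2000, -1, wback=True, postindex=False))     # pre-index
    print(ldrb_immediate(0x2000, 4095, wback=False, postindex=False))  # unsigned offset
```

Post-index is the natural form for walking a buffer one byte at a time, because the access uses the old base and the base then advances.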


C6.2.87 LDRB (register) 


Load Register Byte (register) calculates an address from a base register value and an offset register value, loads a 
byte from memory, zero-extends it, and writes it to a register. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  1 |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                            opc


Extended register variant

Applies when option != 011.

LDRB <Wt>, [<Xn|SP>, (<Wm>|<Xm>), <extend> {<amount>}]

Shifted register variant

Applies when option == 011.

LDRB <Wt>, [<Xn|SP>, <Xm>{, LSL <amount>}]

Decode for all variants of this encoding


if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend specifier, encoded in the "option" field. It can have the following values:
UXTW when option = 010
SXTW when option = 110
SXTX when option = 111
<amount> Is the index shift amount, it must be #0, encoded in "S" as 0 if omitted, or as 1 if present.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 


Operation 


bits(64) offset = ExtendReg(m, extend_type, 0);
bits(64) address;
bits(8) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 1, AccType_NORMAL];
X[t] = ZeroExtend(data, 32);
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C6.2.88 


LDRH (immediate) 


Load Register Halfword (immediate) loads a halfword from memory, zero-extends it, and writes the result to a 
register. The address that is used for the load is calculated from a base register and an immediate offset. For 
information about memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  1 |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                            opc


Post-index variant


LDRH <Wt>, [<Xn|SP>], #<simm>


Decode for this encoding


boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  1 |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                            opc


Pre-index variant


LDRH <Wt>, [<Xn|SP>, #<simm>]!


Decode for this encoding


boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21          10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  1 |  0  1 |      imm12     |   Rn   |   Rt   |
  size                            opc


Unsigned offset variant


LDRH <Wt>, [<Xn|SP>{, #<pimm>}]


Decode for this encoding


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 1);
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Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDRH (immediate) on page K1-5486. 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, a multiple of 2 in the range 0 to 8190, defaulting to 0
and encoded in the "imm12" field as <pimm>/2.
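
The scaled encoding of <pimm> can be checked with a small Python sketch. It assumes the access size in bytes is a power of two (2 for LDRH) and mirrors the decode expression LSL(ZeroExtend(imm12, 64), 1). The function names encode_pimm and decode_pimm are hypothetical helpers, not assembler or architecture names.

```python
def encode_pimm(pimm: int, access_bytes: int) -> int:
    """Return the imm12 field for a byte offset, or raise ValueError if unencodable."""
    if pimm < 0 or pimm % access_bytes != 0:
        raise ValueError(f"offset {pimm} is not a non-negative multiple of {access_bytes}")
    imm12 = pimm // access_bytes
    if imm12 > 0xFFF:
        raise ValueError(f"offset {pimm} does not fit in the 12-bit field")
    return imm12

def decode_pimm(imm12: int, access_bytes: int) -> int:
    """Inverse mapping: the byte offset the decode pseudocode computes from imm12."""
    return (imm12 & 0xFFF) * access_bytes

if __name__ == "__main__":
    print(encode_pimm(8190, 2))   # largest LDRH offset
    print(decode_pimm(0xFFF, 2))  # back to the byte offset
```

An offset that is odd or larger than 8190 cannot use the unsigned offset encoding of LDRH, so encode_pimm raises ValueError for it.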


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation for all encodings 


bits(64) address;
bits(16) data;
boolean wb_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;       // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;   // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data = Mem[address, 2, AccType_NORMAL];
X[t] = ZeroExtend(data, 32);

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
C6.2.89
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LDRH (register) 


Load Register Halfword (register) calculates an address from a base register value and an offset register value, loads 
a halfword from memory, zero-extends it, and writes it to a register. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  1 |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                            opc


32-bit variant


LDRH <Wt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]


Decode for this encoding


if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then 1 else 0;


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted. It is encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> Is the index shift amount, optional only when <extend> is not LSL. Where it is permitted to be
optional, it defaults to #0. It is encoded in the "S" field. It can have the following values:
#0 when S = 0
#1 when S = 1


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift);
bits(64) address;
bits(16) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 2, AccType_NORMAL];
X[t] = ZeroExtend(data, 32);


C6.2.90 LDRSB (immediate) 


Load Register Signed Byte (immediate) loads a byte from memory, sign-extends it to either 32 bits or 64 bits, and 
writes the result to a register. The address that is used for the load is calculated from a base register and an immediate 
offset. For information about memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  1  x |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                            opc


32-bit variant
Applies when opc == 11.


LDRSB <Wt>, [<Xn|SP>], #<simm>


64-bit variant
Applies when opc == 10.


LDRSB <Xt>, [<Xn|SP>], #<simm>


Decode for all variants of this encoding


boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  1  x |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                            opc


32-bit variant
Applies when opc == 11.


LDRSB <Wt>, [<Xn|SP>, #<simm>]!


64-bit variant
Applies when opc == 10.


LDRSB <Xt>, [<Xn|SP>, #<simm>]!


Decode for all variants of this encoding


boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21          10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  1 |  1  x |      imm12     |   Rn   |   Rt   |
  size                            opc


32-bit variant
Applies when opc == 11.


LDRSB <Wt>, [<Xn|SP>{, #<pimm>}]


64-bit variant
Applies when opc == 10.


LDRSB <Xt>, [<Xn|SP>{, #<pimm>}]


Decode for all variants of this encoding


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 0);


Notes for all encodings


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDRSB (immediate) on page K1-5486.


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, in the range 0 to 4095, defaulting to 0 and encoded
in the "imm12" field.


Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation for all encodings 


bits(64) address;
bits(8) data;
boolean wb_unknown = FALSE;
boolean rt_unknown = FALSE;

if memop == MemOp_LOAD && wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;       // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;   // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if memop == MemOp_STORE && wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_NONE       rt_unknown = FALSE;  // value stored is original value
        when Constraint_UNKNOWN    rt_unknown = TRUE;   // value stored is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        if rt_unknown then
            data = bits(8) UNKNOWN;
        else
            data = X[t];
        Mem[address, 1, AccType_NORMAL] = data;

    when MemOp_LOAD
        data = Mem[address, 1, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
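
The effect of the signed flag and regsize on the value written to the destination register can be illustrated in Python. This sketch models only the SignExtend(data, regsize) and ZeroExtend(data, regsize) steps of the load case. The function name load_extend is hypothetical, and the result is returned as an unsigned integer of regsize bits.

```python
def load_extend(data: int, data_bits: int, regsize: int, signed: bool) -> int:
    """Extend a loaded value of `data_bits` bits to `regsize` bits.

    signed=True models SignExtend(data, regsize); signed=False models
    ZeroExtend(data, regsize). The result is the register bit pattern.
    """
    data &= (1 << data_bits) - 1
    if signed and data & (1 << (data_bits - 1)):
        data -= 1 << data_bits            # top bit set: value is negative
    return data & ((1 << regsize) - 1)    # keep regsize bits of the two's complement form

if __name__ == "__main__":
    byte = 0x80  # -128 as a signed byte, 128 as an unsigned byte
    print(hex(load_extend(byte, 8, 32, signed=True)))   # LDRSB Wt
    print(hex(load_extend(byte, 8, 64, signed=True)))   # LDRSB Xt
    print(hex(load_extend(byte, 8, 32, signed=False)))  # LDRB Wt
```

The same helper covers the halfword and word loads in this chapter by passing 16 or 32 as data_bits.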
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C6.2.91 


LDRSB (register) 


Load Register Signed Byte (register) calculates an address from a base register value and an offset register value,
loads a byte from memory, sign-extends it, and writes it to a register. For information about memory accesses, see
Load/Store addressing modes on page C1-128.


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  1  x |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                            opc


32-bit with extended register offset variant

Applies when opc == 11 && option != 011.

LDRSB <Wt>, [<Xn|SP>, (<Wm>|<Xm>), <extend> {<amount>}]

32-bit with shifted register offset variant

Applies when opc == 11 && option == 011.

LDRSB <Wt>, [<Xn|SP>, <Xm>{, LSL <amount>}]

64-bit with extended register offset variant

Applies when opc == 10 && option != 011.

LDRSB <Xt>, [<Xn|SP>, (<Wm>|<Xm>), <extend> {<amount>}]

64-bit with shifted register offset variant

Applies when opc == 10 && option == 011.

LDRSB <Xt>, [<Xn|SP>, <Xm>{, LSL <amount>}]


Decode for all variants of this encoding


if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field.
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend specifier, encoded in the "option" field. It can have the following values:
UXTW when option = 010
SXTW when option = 110
SXTX when option = 111
<amount> Is the index shift amount, it must be #0, encoded in "S" as 0 if omitted, or as 1 if present.
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Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
integer m = UInt(Rm);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;
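
The shared decode above maps the 2-bit opc field to an operation, a destination register size and a signedness flag. A minimal Python sketch of that mapping follows. The function name decode_opc and the string values "LOAD" and "STORE" are hypothetical stand-ins for the MemOp enumeration.

```python
def decode_opc(opc: int):
    """Return (memop, regsize, signed) for a 2-bit opc value, following the shared decode."""
    if (opc >> 1) & 1 == 0:
        # opc<1> == '0': store or zero-extending load
        memop = "LOAD" if opc & 1 else "STORE"
        return memop, 32, False
    # opc<1> == '1': sign-extending load
    regsize = 32 if opc & 1 else 64
    return "LOAD", regsize, True

if __name__ == "__main__":
    for opc in range(4):
        print(format(opc, "02b"), decode_opc(opc))
```

This agrees with the variant conditions given for the signed loads: opc == 11 selects the 32-bit variant and opc == 10 selects the 64-bit variant.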


Operation 


bits(64) offset = ExtendReg(m, extend_type, 0);
bits(64) address;
bits(8) data;

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = X[t];
        Mem[address, 1, AccType_NORMAL] = data;

    when MemOp_LOAD
        data = Mem[address, 1, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);


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C6.2.92 LDRSH (immediate) 


Load Register Signed Halfword (immediate) loads a halfword from memory, sign-extends it to 32 bits or 64 bits, 
and writes the result to a register. The address that is used for the load is calculated from a base register and an 
immediate offset. For information about memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  1  x |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                            opc


32-bit variant
Applies when opc == 11.


LDRSH <Wt>, [<Xn|SP>], #<simm>


64-bit variant
Applies when opc == 10.


LDRSH <Xt>, [<Xn|SP>], #<simm>


Decode for all variants of this encoding


boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  1  x |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                            opc


32-bit variant
Applies when opc == 11.


LDRSH <Wt>, [<Xn|SP>, #<simm>]!


64-bit variant
Applies when opc == 10.


LDRSH <Xt>, [<Xn|SP>, #<simm>]!


Decode for all variants of this encoding


boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21          10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  1 |  1  x |      imm12     |   Rn   |   Rt   |
  size                            opc


32-bit variant
Applies when opc == 11.


LDRSH <Wt>, [<Xn|SP>{, #<pimm>}]


64-bit variant
Applies when opc == 10.


LDRSH <Xt>, [<Xn|SP>{, #<pimm>}]


Decode for all variants of this encoding


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 1);


Notes for all encodings


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDRSH (immediate) on page K1-5487.


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 

<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, a multiple of 2 in the range 0 to 8190, defaulting to 0
and encoded in the "imm12" field as <pimm>/2.


Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation for all encodings 


bits(64) address;
bits(16) data;
boolean wb_unknown = FALSE;
boolean rt_unknown = FALSE;

if memop == MemOp_LOAD && wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;       // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;   // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if memop == MemOp_STORE && wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_NONE       rt_unknown = FALSE;  // value stored is original value
        when Constraint_UNKNOWN    rt_unknown = TRUE;   // value stored is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        if rt_unknown then
            data = bits(16) UNKNOWN;
        else
            data = X[t];
        Mem[address, 2, AccType_NORMAL] = data;

    when MemOp_LOAD
        data = Mem[address, 2, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
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C6.2.93 LDRSH (register) 


Load Register Signed Halfword (register) calculates an address from a base register value and an offset register
value, loads a halfword from memory, sign-extends it, and writes it to a register. For information about memory
accesses, see Load/Store addressing modes on page C1-128.


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  1  x |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                            opc


32-bit variant

Applies when opc == 11.

LDRSH <Wt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

64-bit variant

Applies when opc == 10.

LDRSH <Xt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]


Decode for all variants of this encoding


if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then 1 else 0;


Assembler symbols 





<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted. It is encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> Is the index shift amount, optional only when <extend> is not LSL. Where it is permitted to be
optional, it defaults to #0. It is encoded in the "S" field. It can have the following values:
#0 when S = 0
#1 when S = 1
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Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
integer m = UInt(Rm);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift);
bits(64) address;
bits(16) data;

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = X[t];
        Mem[address, 2, AccType_NORMAL] = data;

    when MemOp_LOAD
        data = Mem[address, 2, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);
C6.2.94
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LDRSW (immediate) 


Load Register Signed Word (immediate) loads a word from memory, sign-extends it to 64 bits, and writes the result 
to a register. The address that is used for the load is calculated from a base register and an immediate offset. For 
information about memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  1  0 |  1  1  1 |  0 |  0  0 |  1  0 |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                            opc


Post-index variant


LDRSW <Xt>, [<Xn|SP>], #<simm>


Decode for this encoding


boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  1  0 |  1  1  1 |  0 |  0  0 |  1  0 |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                            opc


Pre-index variant


LDRSW <Xt>, [<Xn|SP>, #<simm>]!


Decode for this encoding


boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset

| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21          10 | 9    5 | 4    0 |
|  1  0 |  1  1  1 |  0 |  0  1 |  1  0 |      imm12     |   Rn   |   Rt   |
  size                            opc


Unsigned offset variant


LDRSW <Xt>, [<Xn|SP>{, #<pimm>}]


Decode for this encoding


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 2);
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Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDRSW (immediate) on page K1-5487.


Assembler symbols 


<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, a multiple of 4 in the range 0 to 16380, defaulting to
0 and encoded in the "imm12" field as <pimm>/4.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation for all encodings 


bits(64) address;
bits(32) data;
boolean wb_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_WBSUPPRESS, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_WBSUPPRESS wback = FALSE;        // writeback is suppressed
        when Constraint_UNKNOWN    wb_unknown = TRUE;    // writeback is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data = Mem[address, 4, AccType_NORMAL];
X[t] = SignExtend(data, 64);

if wback then
    if wb_unknown then
        address = bits(64) UNKNOWN;
    elsif postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
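
As an informal illustration of the unsigned offset variant, the following Python sketch scales the 12-bit immediate by 4 and sign-extends the loaded word to 64 bits. The helper names (sign_extend, ldrsw_unsigned_offset) and the dictionary used as a byte-addressed memory are assumptions made for this example only; alignment checks and faults are ignored, and a little-endian data layout is assumed.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to to_bits (result is unsigned)."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def ldrsw_unsigned_offset(memory, base, imm12):
    """Model LDRSW <Xt>, [<Xn|SP>, #<pimm>] with pimm = imm12 * 4.

    memory is a hypothetical dict mapping byte addresses to byte values.
    """
    assert 0 <= imm12 < (1 << 12)
    offset = imm12 << 2                     # LSL(ZeroExtend(imm12, 64), 2)
    address = (base + offset) & MASK64
    word = 0
    for i in range(4):                      # little-endian 4-byte read (assumed)
        word |= memory[address + i] << (8 * i)
    return sign_extend(word, 32, 64)        # value written to X[t]
```

In this model a word of 0xFFFFFFFE stored at base + 8 is returned as 0xFFFFFFFFFFFFFFFE when imm12 is 2.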
C6.2.95 LDRSW (literal)
Load Register Signed Word (literal) calculates an address from the PC value and an immediate offset, loads a word 
from memory, and writes it to a register. For information about memory accesses, see Load/Store addressing modes 
on page C1-128. 

|31 30|29 27|26|25 24|23               5|4   0|
| 1 0 |0 1 1| 0| 0 0 |      imm19       | Rt  |
 opc

Literal variant 
LDRSW <Xt>, <label> 
Decode for this encoding 

integer t = UInt(Rt);
bits(64) offset;

offset = SignExtend(imm19:'00', 64);
Assembler symbols 
<Xt> Is the 64-bit name of the general-purpose register to be loaded, encoded in the "Rt" field. 
<label> Is the program label from which the data is to be loaded. Its offset from the address of this
instruction, in the range +/-1 MB, is encoded as "imm19" times 4.

Operation 

bits(64) address = PC[] + offset; 

bits(32) data; 

data = Mem[address, 4, AccType_NORMAL]; 

X[t] = SignExtend(data, 64); 
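
The relationship between a label, the imm19 field, and the load address can be sketched in Python as follows. This is a minimal sketch, not an assembler: the function names are hypothetical, and the PC is taken to be the address of the LDRSW instruction itself, as the description of <label> states.

```python
MASK64 = (1 << 64) - 1


def ldrsw_literal_address(pc, imm19):
    """Return PC + SignExtend(imm19:'00', 64) as an unsigned 64-bit address."""
    assert 0 <= imm19 < (1 << 19)
    offset = imm19 << 2                 # imm19:'00' is a 21-bit value
    if offset & (1 << 20):              # sign bit of that 21-bit value
        offset -= 1 << 21
    return (pc + offset) & MASK64


def encode_imm19(pc, label):
    """Hypothetical helper: encode the distance from pc to label as imm19."""
    delta = label - pc
    if delta % 4 != 0 or not -(1 << 20) <= delta < (1 << 20):
        raise ValueError("label must be word-aligned and within +/-1 MB")
    return (delta >> 2) & ((1 << 19) - 1)
```

Encoding a label and then decoding it with the same PC value returns the original label address, for both forward and backward references.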
C6.2.96 LDRSW (register)


Load Register Signed Word (register) calculates an address from a base register value and an offset register value, 
loads a word from memory, sign-extends it to form a 64-bit value, and writes it to a register. The offset register value 
can be shifted left by 0 or 2 bits. For information about memory accesses, see Load/Store addressing modes on 
page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20  16|15  13|12|11 10|9   5|4   0|
| 1 0 |1 1 1| 0| 0 0 | 1 0 | 1|  Rm  |option| S| 1 0 | Rn  | Rt  |
 size                  opc


64-bit variant 


LDRSW <Xt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}] 


Decode for this encoding 


if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then 2 else 0;


Assembler symbols 


<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted, encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> Is the index shift amount, optional only when <extend> is not LSL. Where it is permitted to be
optional, it defaults to #0. It is encoded in the "S" field. It can have the following values:
#0 when S = 0
#2 when S = 1


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift); 
bits(64) address; 
bits(32) data; 
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 4, AccType_NORMAL];
X[t] = SignExtend(data, 64);
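
The index calculation for the register variant can be illustrated with the Python sketch below. It is an approximation of the ExtendReg() step under stated assumptions: the function names are hypothetical, register values are plain integers, and only the four option values permitted by the decode are accepted.

```python
MASK64 = (1 << 64) - 1


def extend_index(rm_value, option, shift):
    """Extend and shift the index register for LDRSW (register).

    option: 0b010 UXTW, 0b011 LSL, 0b110 SXTW, 0b111 SXTX.
    shift:  0 or 2, as selected by the S field.
    """
    if option not in (0b010, 0b011, 0b110, 0b111):
        raise ValueError("option<1> == '0' is an unallocated encoding")
    if shift not in (0, 2):
        raise ValueError("shift must be 0 or 2")
    width = 64 if option & 1 else 32         # option<0> selects <Xm> or <Wm>
    value = rm_value & ((1 << width) - 1)
    if option & 0b100 and value & (1 << (width - 1)):
        value -= 1 << width                  # SXTW or SXTX: treat as signed
    return (value << shift) & MASK64


def ldrsw_register_address(base, rm_value, option, shift):
    """Return the 64-bit address used for the load."""
    return (base + extend_index(rm_value, option, shift)) & MASK64
```

With a 32-bit index of 0xFFFFFFFF and a shift of 2, UXTW produces an offset of 0x3FFFFFFFC, whereas SXTW produces an offset of -4.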
C6.2.97 LDTR


Load Register (unprivileged) loads a word or doubleword from memory, and writes it to a register. The address that 
is used for the load is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 1 x |1 1 1| 0| 0 0 | 0 1 | 0|  imm9  | 1 0 | Rn  | Rt  |
 size                  opc


32-bit variant 

Applies when size == 10. 

LDTR <Wt>, [<Xn|SP>{, #<simm>}] 

64-bit variant 

Applies when size == 11. 

LDTR <Xt>, [<Xn|SP>{, #<simm>}] 

Decode for all variants of this encoding 
integer scale = UInt(size); 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer regsize; 


regsize = if size == '11' then 64 else 32; 
integer datasize = 8 << scale; 


Operation 


bits(64) address; 
bits(datasize) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, datasize DIV 8, AccType_UNPRIV];
X[t] = ZeroExtend(data, regsize);
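
The decode values for LDTR, and the privilege rule stated in the description, can be sketched in Python as follows. The function names are hypothetical, and unprivileged_access_level() models only the single rule quoted above (EL1 accesses are checked as if made at EL0); it is not a model of the translation system.

```python
def ldtr_decode(size, imm9):
    """Return (datasize, regsize, offset) for the 2-bit size and 9-bit imm9 fields."""
    assert size in (0b10, 0b11)             # 32-bit and 64-bit variants only
    assert 0 <= imm9 < (1 << 9)
    scale = size
    datasize = 8 << scale
    regsize = 64 if size == 0b11 else 32
    # SignExtend(imm9, 64), expressed as a signed Python integer
    offset = imm9 - (1 << 9) if imm9 & (1 << 8) else imm9
    return datasize, regsize, offset


def unprivileged_access_level(current_el):
    """Exception level whose permissions apply to the access (assumed model)."""
    return 0 if current_el == 1 else current_el
```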
C6.2.98 LDTRB


Load Register Byte (unprivileged) loads a byte from memory, zero-extends it, and writes the result to a register. The 
address that is used for the load is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 0 |1 1 1| 0| 0 0 | 0 1 | 0|  imm9  | 1 0 | Rn  | Rt  |
 size                  opc


Unscaled offset variant 


LDTRB <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address; 
bits(8) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 1, AccType_UNPRIV];
X[t] = ZeroExtend(data, 32);
C6.2.99 LDTRH


Load Register Halfword (unprivileged) loads a halfword from memory, zero-extends it, and writes the result to a 
register. The address that is used for the load is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 1 |1 1 1| 0| 0 0 | 0 1 | 0|  imm9  | 1 0 | Rn  | Rt  |
 size                  opc


Unscaled offset variant 


LDTRH <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address; 
bits(16) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 2, AccType_UNPRIV];
X[t] = ZeroExtend(data, 32);
C6.2.100 LDTRSB


Load Register Signed Byte (unprivileged) loads a byte from memory, sign-extends it to 32 bits or 64 bits, and writes 
the result to a register. The address that is used for the load is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see
Load/Store addressing modes on page C1-128.


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 0 |1 1 1| 0| 0 0 | 1 x | 0|  imm9  | 1 0 | Rn  | Rt  |
 size                  opc


32-bit variant 

Applies when opc == 11. 

LDTRSB <Wt>, [<Xn|SP>{, #<simm>}] 

64-bit variant 

Applies when opc == 10. 

LDTRSB <Xt>, [<Xn|SP>{, #<simm>}] 

Decode for all variants of this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation


bits(64) address;
bits(8) data;

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = X[t];
        Mem[address, 1, AccType_UNPRIV] = data;

    when MemOp_LOAD
        data = Mem[address, 1, AccType_UNPRIV];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);
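
The shared decode above maps the 2-bit opc field to an operation, a register size, and a signedness flag. The Python sketch below mirrors that mapping and the sign extension of the loaded byte. The function names and the string labels "LOAD" and "STORE" are assumptions for illustration; for LDTRSB itself only opc values 10 and 11 occur.

```python
def decode_opc(opc):
    """Return (memop, regsize, signed) for a 2-bit opc value."""
    assert 0 <= opc <= 0b11
    if (opc >> 1) & 1 == 0:
        # store or zero-extending load
        memop = "LOAD" if opc & 1 else "STORE"
        return memop, 32, False
    # sign-extending load
    regsize = 32 if opc & 1 else 64
    return "LOAD", regsize, True


def load_signed_byte(byte, regsize):
    """SignExtend(data, regsize) for an 8-bit value, as an unsigned integer."""
    byte &= 0xFF
    if byte & 0x80:
        byte -= 0x100
    return byte & ((1 << regsize) - 1)
```

For example, a loaded byte of 0x80 becomes 0xFFFFFF80 for the 32-bit variant and 0xFFFFFFFFFFFFFF80 for the 64-bit variant.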


C6.2.101 LDTRSH


Load Register Signed Halfword (unprivileged) loads a halfword from memory, sign-extends it to 32 bits or 64 bits,
and writes the result to a register. The address that is used for the load is calculated from a base register and an
immediate offset.


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see
Load/Store addressing modes on page C1-128.


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 1 |1 1 1| 0| 0 0 | 1 x | 0|  imm9  | 1 0 | Rn  | Rt  |
 size                  opc


32-bit variant 

Applies when opc == 11. 

LDTRSH <Wt>, [<Xn|SP>{, #<simm>}] 

64-bit variant 

Applies when opc == 10. 

LDTRSH <Xt>, [<Xn|SP>{, #<simm>}] 

Decode for all variants of this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation


bits(64) address;
bits(16) data;

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = X[t];
        Mem[address, 2, AccType_UNPRIV] = data;

    when MemOp_LOAD
        data = Mem[address, 2, AccType_UNPRIV];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);


C6.2.102 LDTRSW


Load Register Signed Word (unprivileged) loads a word from memory, sign-extends it to 64 bits, and writes the 
result to a register. The address that is used for the load is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 1 0 |1 1 1| 0| 0 0 | 1 0 | 0|  imm9  | 1 0 | Rn  | Rt  |
 size                  opc


Unscaled offset variant 


LDTRSW <Xt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address; 
bits(32) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 4, AccType_UNPRIV];
X[t] = SignExtend(data, 64);
C6.2.103 LDUR


Load Register (unscaled) calculates an address from a base register and an immediate offset, loads a 32-bit word or 
64-bit doubleword from memory, zero-extends it, and writes it to a register. For information about memory accesses, 
see Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 1 x |1 1 1| 0| 0 0 | 0 1 | 0|  imm9  | 0 0 | Rn  | Rt  |
 size                  opc


32-bit variant 

Applies when size == 10. 

LDUR <Wt>, [<Xn|SP>{, #<simm>}] 

64-bit variant 

Applies when size == 11. 

LDUR <Xt>, [<Xn|SP>{, #<simm>}] 

Decode for all variants of this encoding 
integer scale = UInt(size); 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer regsize; 


regsize = if size == '11' then 64 else 32; 
integer datasize = 8 << scale; 


Operation 


bits(64) address; 
bits(datasize) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, datasize DIV 8, AccType_NORMAL];
X[t] = ZeroExtend(data, regsize);
C6.2.104 LDURB


Load Register Byte (unscaled) calculates an address from a base register and an immediate offset, loads a byte from 
memory, zero-extends it, and writes it to a register. For information about memory accesses, see Load/Store 
addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 0 |1 1 1| 0| 0 0 | 0 1 | 0|  imm9  | 0 0 | Rn  | Rt  |
 size                  opc


Unscaled offset variant 


LDURB <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address; 
bits(8) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 1, AccType_NORMAL];
X[t] = ZeroExtend(data, 32);
C6.2.105 LDURH


Load Register Halfword (unscaled) calculates an address from a base register and an immediate offset, loads a 
halfword from memory, zero-extends it, and writes it to a register. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 1 |1 1 1| 0| 0 0 | 0 1 | 0|  imm9  | 0 0 | Rn  | Rt  |
 size                  opc


Unscaled offset variant 


LDURH <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address; 
bits(16) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 2, AccType_NORMAL];
X[t] = ZeroExtend(data, 32);
C6.2.106 LDURSB


Load Register Signed Byte (unscaled) calculates an address from a base register and an immediate offset, loads a 
signed byte from memory, sign-extends it, and writes it to a register. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 0 |1 1 1| 0| 0 0 | 1 x | 0|  imm9  | 0 0 | Rn  | Rt  |
 size                  opc


32-bit variant 

Applies when opc == 11. 

LDURSB <Wt>, [<Xn|SP>{, #<simm>}] 

64-bit variant 

Applies when opc == 10. 

LDURSB <Xt>, [<Xn|SP>{, #<simm>}] 

Decode for all variants of this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation


bits(64) address;
bits(8) data;

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = X[t];
        Mem[address, 1, AccType_NORMAL] = data;

    when MemOp_LOAD
        data = Mem[address, 1, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);
C6.2.107 LDURSH


Load Register Signed Halfword (unscaled) calculates an address from a base register and an immediate offset, loads 
a signed halfword from memory, sign-extends it, and writes it to a register. For information about memory accesses, 
see Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 0 1 |1 1 1| 0| 0 0 | 1 x | 0|  imm9  | 0 0 | Rn  | Rt  |
 size                  opc

32-bit variant 
Applies when opc == 11. 
LDURSH <Wt>, [<Xn|SP>{, #<simm>}] 
64-bit variant 
Applies when opc == 10. 
LDURSH <Xt>, [<Xn|SP>{, #<simm>}] 
Decode for all variants of this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn);
integer t = UInt(Rt);
MemOp memop;
boolean signed;
integer regsize;

if opc<1> == '0' then
    // store or zero-extending load
    memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
    regsize = 32;
    signed = FALSE;
else
    // sign-extending load
    memop = MemOp_LOAD;
    regsize = if opc<0> == '1' then 32 else 64;
    signed = TRUE;


Operation


bits(64) address;
bits(16) data;

if n == 31 then
    if memop != MemOp_PREFETCH then CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = X[t];
        Mem[address, 2, AccType_NORMAL] = data;

    when MemOp_LOAD
        data = Mem[address, 2, AccType_NORMAL];
        if signed then
            X[t] = SignExtend(data, regsize);
        else
            X[t] = ZeroExtend(data, regsize);

    when MemOp_PREFETCH
        Prefetch(address, t<4:0>);
C6.2.108 LDURSW


Load Register Signed Word (unscaled) calculates an address from a base register and an immediate offset, loads a 
signed word from memory, sign-extends it, and writes it to a register. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


|31 30|29 27|26|25 24|23 22|21|20    12|11 10|9   5|4   0|
| 1 0 |1 1 1| 0| 0 0 | 1 0 | 0|  imm9  | 0 0 | Rn  | Rt  |
 size                  opc


Unscaled offset variant 


LDURSW <Xt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address; 
bits(32) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = Mem[address, 4, AccType_NORMAL];
X[t] = SignExtend(data, 64);
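
The scaled unsigned offset of LDRSW (immediate) and the unscaled signed offset of LDURSW cover different ranges of byte offsets. The Python sketch below shows one way a tool might choose between them. The function name and the preference for the scaled form are assumptions for illustration, not behavior required by the architecture.

```python
def select_word_load_form(byte_offset):
    """Return (mnemonic, encoded immediate) for a signed-word load offset.

    Hypothetical policy: prefer the scaled imm12 form when the offset is a
    multiple of 4 in the range 0 to 16380, else fall back to the imm9 form.
    """
    if byte_offset % 4 == 0 and 0 <= byte_offset <= 16380:
        return "LDRSW", byte_offset // 4            # imm12 = <pimm>/4
    if -256 <= byte_offset <= 255:
        return "LDURSW", byte_offset & 0x1FF        # imm9, two's complement
    raise ValueError("offset cannot be encoded as an immediate")
```

Under this policy an offset of 8 is encoded as LDRSW with imm12 = 2, while offsets of -4 and 3 can only be encoded with LDURSW.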
C6.2.109 LDXP

Load Exclusive Pair of Registers derives an address from a base register value, loads two 32-bit words or two 64-bit 
doublewords from memory, and writes them to two registers. A 32-bit pair requires the address to be doubleword 
aligned and is single-copy atomic at doubleword granularity. A 64-bit pair requires the address to be quadword 
aligned and is single-copy atomic for each doubleword at doubleword granularity. The PE marks the physical 
address being accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive 
instructions. See Synchronization and semaphores on page B2-108. For information about memory accesses see 
Load/Store addressing modes on page C1-128. 

|31|30|29       24|23|22|21|20           16|15|14  10|9   5|4   0|
| 1|sz|0 0 1 0 0 0| 0| 1| 1|(1)(1)(1)(1)(1)| 0| Rt2  | Rn  | Rt  |

32-bit variant
Applies when sz == 0.
LDXP <Wt1>, <Wt2>, [<Xn|SP>{,#0}]
64-bit variant
Applies when sz == 1.
LDXP <Xt1>, <Xt2>, [<Xn|SP>{,#0}]
Decode for all variants of this encoding 

integer n = UInt(Rn);
integer t = UInt(Rt);
integer t2 = UInt(Rt2);
integer elsize = 32 << UInt(sz);
integer datasize = elsize * 2;
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDXP on page K1-5487. 
Assembler symbols 
<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.
<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.

<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 

bits(64) address;
bits(datasize) data;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN    rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF      UnallocatedEncoding();
        when Constraint_NOP        EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, dbytes);

if rt_unknown then
    // ConstrainedUNPREDICTABLE case
    X[t] = bits(datasize) UNKNOWN;
elsif elsize == 32 then
    // 32-bit load exclusive pair (atomic)
    data = Mem[address, dbytes, AccType_ATOMIC];
    if BigEndian() then
        X[t] = data<datasize-1:elsize>;
        X[t2] = data<elsize-1:0>;
    else
        X[t] = data<elsize-1:0>;
        X[t2] = data<datasize-1:elsize>;
else // elsize == 64
    // 64-bit load exclusive pair (not atomic),
    // but must be 128-bit aligned
    if address != Align(address, dbytes) then
        AArch64.Abort(address, AArch64.AlignmentFault(AccType_ATOMIC, FALSE, FALSE));
    X[t] = Mem[address, 8, AccType_ATOMIC];
    X[t2] = Mem[address+8, 8, AccType_ATOMIC];
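
For the 32-bit variant, the pair is read as a single 64-bit access and then split according to endianness. The Python sketch below illustrates that split and the alignment requirement on the pair. The function names are hypothetical, and the value passed as data is assumed to be the integer already returned by the memory access.

```python
def split_pair(data, elsize, big_endian):
    """Return the values written to (Xt, Xt2) from a 2*elsize-bit access."""
    mask = (1 << elsize) - 1
    low = data & mask                     # data<elsize-1:0>
    high = (data >> elsize) & mask        # data<datasize-1:elsize>
    return (high, low) if big_endian else (low, high)


def pair_is_aligned(address, elsize):
    """Check that address is aligned to the size of the whole pair (dbytes)."""
    dbytes = (elsize * 2) // 8
    return address % dbytes == 0
```

A 64-bit pair at address 0x1008 fails the check because dbytes is 16, whereas a 32-bit pair at the same address passes because dbytes is 8.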
C6.2.110 LDXR
Load Exclusive Register derives an address from a base register value, loads a 32-bit word or a 64-bit doubleword 
from memory, and writes it to a register. The memory access is atomic. The PE marks the physical address being 
accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive instructions. See 
Synchronization and semaphores on page B2-108. For information about memory accesses see Load/Store 
addressing modes on page C1-128. 
|31 30|29       24|23|22|21|20           16|15|14           10|9   5|4   0|
| 1 x |0 0 1 0 0 0| 0| 1| 0|(1)(1)(1)(1)(1)| 0|(1)(1)(1)(1)(1)| Rn  | Rt  |
 size                                           Rt2
32-bit variant 
Applies when size == 10. 
LDXR <Wt>, [<Xn|SP>{,#0}] 
64-bit variant 
Applies when size == 11. 
LDXR <Xt>, [<Xn|SP>{,#0}] 
Decode for all variants of this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer elsize = 8 << UInt(size); 
integer regsize = if elsize == 64 then 64 else 32; 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 
bits(64) address;
bits(elsize) data;
constant integer dbytes = elsize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, dbytes);

data = Mem[address, dbytes, AccType_ATOMIC];
X[t] = ZeroExtend(data, regsize);
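
The interaction between a Load Exclusive and a later Store Exclusive can be pictured with the toy Python model below. It is a deliberately simplified sketch: the class and method names are invented, there is a single monitor that records one (address, size) pair, and the status convention (0 for success, 1 for failure) is assumed to follow that of the Store Exclusive instructions. Real monitors track implementation-defined address granules and also respond to accesses by other observers.

```python
class ToyExclusiveMonitor:
    """Single-PE toy model of an exclusive access mark (illustrative only)."""

    def __init__(self):
        self.marked = None                      # (address, size) or None

    def load_exclusive(self, memory, address, size):
        """Mark the location and return its little-endian value."""
        self.marked = (address, size)
        return int.from_bytes(memory[address:address + size], "little")

    def store_exclusive(self, memory, address, size, value):
        """Store only if the mark matches; always clear the mark afterwards."""
        matched = self.marked == (address, size)
        self.marked = None
        if not matched:
            return 1                            # store not performed
        memory[address:address + size] = value.to_bytes(size, "little")
        return 0                                # store performed
```

In this model, memory is a bytearray. A store that follows a matching load succeeds, and a second store without a new load fails because the mark has been cleared.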
C6.2.111 LDXRB


Load Exclusive Register Byte derives an address from a base register value, loads a byte from memory, zero-extends
it and writes it to a register. The memory access is atomic. The PE marks the physical address being accessed as an
exclusive access. This exclusive access mark is checked by Store Exclusive instructions. See Synchronization and
semaphores on page B2-108. For information about memory accesses see Load/Store addressing modes on
page C1-128.


|31 30|29       24|23|22|21|20           16|15|14           10|9   5|4   0|
| 0 0 |0 0 1 0 0 0| 0| 1| 0|(1)(1)(1)(1)(1)| 0|(1)(1)(1)(1)(1)| Rn  | Rt  |
 size                                           Rt2


No offset variant 


LDXRB <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 


bits(64) address; 
bits(8) data; 


if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

// Tell the Exclusive Monitors to record a sequence of one or more atomic
// memory reads from virtual address range [address, address+dbytes-1].
// The Exclusive Monitor will only be set if all the reads are from the
// same dbytes-aligned physical address, to allow for the possibility of
// an atomicity break if the translation is changed between reads.
AArch64.SetExclusiveMonitors(address, 1);

data = Mem[address, 1, AccType_ATOMIC];
X[t] = ZeroExtend(data, 32);
C6.2.112 LDXRH
Load Exclusive Register Halfword derives an address from a base register value, loads a halfword from memory, 
zero-extends it and writes it to a register. The memory access is atomic. The PE marks the physical address being 
accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive instructions. See 
Synchronization and semaphores on page B2-108. For information about memory accesses see Load/Store 
addressing modes on page C1-128. 
| 31:30 | 29:24 | 23 | 22 | 21 | 20:16 | 15 | 14:10 | 9:5 | 4:0 |
| size = 01 | 001000 | o2 = 0 | L = 1 | o1 = 0 | Rs = (1)(1)(1)(1)(1) | o0 = 0 | Rt2 = (1)(1)(1)(1)(1) | Rn | Rt |
No offset variant 
LDXRH <Wt>, [<Xn|SP>{,#0}] 
Decode for this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 
bits(64) address; 
bits(16) data; 
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];
// Tell the Exclusive Monitors to record a sequence of one or more atomic 
// memory reads from virtual address range [address, address+dbytes-1]. 
// The Exclusive Monitor will only be set if all the reads are from the 
// same dbytes-aligned physical address, to allow for the possibility of 
// an atomicity break if the translation is changed between reads. 
AArch64.SetExclusiveMonitors(address, 2); 
data = Mem[address, 2, AccType_ATOMIC]; 
X[t] = ZeroExtend(data, 32); 
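
The load-exclusive operation pseudocode can be mirrored in a small software model. The following is a minimal Python sketch, not the architectural behavior: the `ToyMemory` class, its single `exclusive` mark, the `load_exclusive` helper, and the little-endian byte order are all assumptions made for illustration. Address translation, alignment checks, and the real Exclusive Monitors are not modeled.

```python
class ToyMemory:
    """Hypothetical flat memory with a single exclusive-access mark."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.exclusive = None  # (address, nbytes) recorded by the last load-exclusive


def load_exclusive(mem, address, nbytes):
    """Model a byte (nbytes=1) or halfword (nbytes=2) load-exclusive."""
    if nbytes not in (1, 2):
        raise ValueError("this sketch only models byte and halfword loads")
    # Counterpart of AArch64.SetExclusiveMonitors(address, nbytes).
    mem.exclusive = (address, nbytes)
    # Counterpart of Mem[address, nbytes, AccType_ATOMIC]; little-endian is assumed.
    value = int.from_bytes(mem.data[address:address + nbytes], "little")
    # ZeroExtend(data, 32): the upper bits of the 32-bit result are zero.
    return value & 0xFFFFFFFF


mem = ToyMemory(16)
mem.data[4:6] = b"\xff\x80"
byte_value = load_exclusive(mem, 4, 1)  # byte form, as in LDXRB
half_value = load_exclusive(mem, 4, 2)  # halfword form, as in LDXRH
```

In this model the second load replaces the mark left by the first, which is a simplification chosen for the sketch rather than a statement about monitor behavior.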
C6.2.113 LSL (register)


Logical Shift Left (register) shifts a register value left by a variable number of bits, shifting in zeros, and writes the 
result to the destination register. The remainder obtained by dividing the second source register by the data size 
defines the number of bits by which the first source register is left-shifted. 


This instruction is an alias of the LSLV instruction. This means that:
• The encodings in this description are named to match the encodings of LSLV.
• The description of LSLV gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28:21 | 20:16 | 15:12 | 11:10 | 9:5 | 4:0 |
| sf | 0 | 0 | 11010110 | Rm | 0010 | op2 = 00 | Rn | Rd |

32-bit variant
Applies when sf == 0.
LSL <Wd>, <Wn>, <Wm>
is equivalent to
LSLV <Wd>, <Wn>, <Wm>
and is always the preferred disassembly.

64-bit variant
Applies when sf == 1.
LSL <Xd>, <Xn>, <Xm>
is equivalent to
LSLV <Xd>, <Xn>, <Xm>
and is always the preferred disassembly.


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to 
31 in its bottom 5 bits, encoded in the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to 
63 in its bottom 6 bits, encoded in the "Rm" field. 

Operation 


The description of LSLV gives the operational pseudocode for this instruction. 
C6.2.114 LSL (immediate)


Logical Shift Left (immediate) shifts a register value left by an immediate number of bits, shifting in zeros, and 
writes the result to the destination register. 


This instruction is an alias of the UBFM instruction. This means that:
• The encodings in this description are named to match the encodings of UBFM.
• The description of UBFM gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:23 | 22 | 21:16 | 15:10 | 9:5 | 4:0 |
| sf | opc = 10 | 100110 | N | immr | imms != x11111 | Rn | Rd |

32-bit variant
Applies when sf == 0 && N == 0 && imms != 011111.
LSL <Wd>, <Wn>, #<shift>
is equivalent to
UBFM <Wd>, <Wn>, #(-<shift> MOD 32), #(31-<shift>)
and is the preferred disassembly when imms + 1 == immr.

64-bit variant
Applies when sf == 1 && N == 1 && imms != 111111.
LSL <Xd>, <Xn>, #<shift>
is equivalent to
UBFM <Xd>, <Xn>, #(-<shift> MOD 64), #(63-<shift>)
and is the preferred disassembly when imms + 1 == immr.


Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


<shift> For the 32-bit variant: is the shift amount, in the range 0 to 31. 


For the 64-bit variant: is the shift amount, in the range 0 to 63. 


Operation 


The description of UBFM gives the operational pseudocode for this instruction. 
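
The mapping from an LSL shift amount to the two UBFM immediates is plain modular arithmetic. The sketch below works it through in Python; the function name `lsl_to_ubfm` is hypothetical and the code is an illustration of the equivalence stated above, not an assembler implementation.

```python
def lsl_to_ubfm(shift, datasize=32):
    """Return (immr, imms) for the UBFM form of LSL #shift."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not 0 <= shift < datasize:
        raise ValueError("shift out of range for this variant")
    immr = (-shift) % datasize   # -<shift> MOD datasize
    imms = datasize - 1 - shift  # (datasize - 1) - <shift>
    return immr, imms


immr, imms = lsl_to_ubfm(4)      # LSL W0, W1, #4
```

For a shift of 4 in the 32-bit variant this gives immr = 28 and imms = 27, so imms + 1 == immr as the preferred-disassembly condition requires. For a shift of 0 the formula gives imms equal to datasize - 1, which is the value that the "Applies when" condition excludes, so the relation imms + 1 == immr holds only for nonzero shifts.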
C6.2.115 LSLV


Logical Shift Left Variable shifts a register value left by a variable number of bits, shifting in zeros, and writes the 
result to the destination register. The remainder obtained by dividing the second source register by the data size 
defines the number of bits by which the first source register is left-shifted. 


This instruction is used by the alias LSL (register). The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28:21 | 20:16 | 15:12 | 11:10 | 9:5 | 4:0 |
| sf | 0 | 0 | 11010110 | Rm | 0010 | op2 = 00 | Rn | Rd |

32-bit variant
Applies when sf == 0.
LSLV <Wd>, <Wn>, <Wm>

64-bit variant
Applies when sf == 1.
LSLV <Xd>, <Xn>, <Xm>


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 
ShiftType shift_type = DecodeShift(op2); 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field.
<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to
31 in its bottom 5 bits, encoded in the "Rm" field.
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field.
<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to
63 in its bottom 6 bits, encoded in the "Rm" field.


Operation 


bits(datasize) result; 
bits(datasize) operand2 = X[m]; 


result = ShiftReg(n, shift_type, UInt(operand2) MOD datasize); 
X[d] = result; 
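
The effect of taking the shift amount MOD datasize is easiest to see with concrete values. The following Python sketch models only the arithmetic of the operation above; the function name `lslv` and its plain-integer register values are assumptions for illustration.

```python
def lslv(xn, xm, datasize=64):
    """Model LSLV: shift xn left by (xm MOD datasize), shifting in zeros."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    mask = (1 << datasize) - 1
    amount = (xm & mask) % datasize  # UInt(operand2) MOD datasize
    return (xn << amount) & mask     # bits shifted past the top are discarded


wrapped = lslv(1, 65)                # 65 MOD 64 == 1, so the result is 1 << 1
```

A shift amount equal to the data size therefore leaves the value unchanged rather than producing zero.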
C6.2 Alphabetical list of A64 base instructions

C6.2.116 LSR (register)


Logical Shift Right (register) shifts a register value right by a variable number of bits, shifting in zeros, and writes 
the result to the destination register. The remainder obtained by dividing the second source register by the data size 
defines the number of bits by which the first source register is right-shifted. 


This instruction is an alias of the LSRV instruction. This means that:
• The encodings in this description are named to match the encodings of LSRV.
• The description of LSRV gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28:21 | 20:16 | 15:12 | 11:10 | 9:5 | 4:0 |
| sf | 0 | 0 | 11010110 | Rm | 0010 | op2 = 01 | Rn | Rd |

32-bit variant
Applies when sf == 0.
LSR <Wd>, <Wn>, <Wm>
is equivalent to
LSRV <Wd>, <Wn>, <Wm>
and is always the preferred disassembly.

64-bit variant
Applies when sf == 1.
LSR <Xd>, <Xn>, <Xm>
is equivalent to
LSRV <Xd>, <Xn>, <Xm>
and is always the preferred disassembly.


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to 
31 in its bottom 5 bits, encoded in the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to 
63 in its bottom 6 bits, encoded in the "Rm" field. 

Operation 


The description of LSRV gives the operational pseudocode for this instruction. 
C6.2.117 LSR (immediate)


Logical Shift Right (immediate) shifts a register value right by an immediate number of bits, shifting in zeros, and 
writes the result to the destination register. 


This instruction is an alias of the UBFM instruction. This means that:
• The encodings in this description are named to match the encodings of UBFM.
• The description of UBFM gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:23 | 22 | 21:16 | 15:10 | 9:5 | 4:0 |
| sf | opc = 10 | 100110 | N | immr | imms = x11111 | Rn | Rd |

32-bit variant
Applies when sf == 0 && N == 0 && imms == 011111.
LSR <Wd>, <Wn>, #<shift>
is equivalent to
UBFM <Wd>, <Wn>, #<shift>, #31
and is always the preferred disassembly.

64-bit variant
Applies when sf == 1 && N == 1 && imms == 111111.
LSR <Xd>, <Xn>, #<shift>
is equivalent to
UBFM <Xd>, <Xn>, #<shift>, #63
and is always the preferred disassembly.


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<shift> For the 32-bit variant: is the shift amount, in the range 0 to 31, encoded in the "immr" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, encoded in the "immr" field. 


Operation 


The description of UBFM gives the operational pseudocode for this instruction. 
C6.2.118 LSRV


Logical Shift Right Variable shifts a register value right by a variable number of bits, shifting in zeros, and writes 
the result to the destination register. The remainder obtained by dividing the second source register by the data size 
defines the number of bits by which the first source register is right-shifted. 


This instruction is used by the alias LSR (register). The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28:21 | 20:16 | 15:12 | 11:10 | 9:5 | 4:0 |
| sf | 0 | 0 | 11010110 | Rm | 0010 | op2 = 01 | Rn | Rd |

32-bit variant
Applies when sf == 0.
LSRV <Wd>, <Wn>, <Wm>

64-bit variant
Applies when sf == 1.
LSRV <Xd>, <Xn>, <Xm>


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 
ShiftType shift_type = DecodeShift(op2); 


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to 
31 in its bottom 5 bits, encoded in the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to 
63 in its bottom 6 bits, encoded in the "Rm" field. 

Operation 


bits(datasize) result; 
bits(datasize) operand2 = X[m]; 


result = ShiftReg(n, shift_type, UInt(operand2) MOD datasize); 
X[d] = result; 
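
As a worked illustration of the right-shift case, the Python sketch below models the operation for both data sizes. The function name `lsrv` is hypothetical, and register values are represented as plain non-negative integers, which is an assumption of the sketch.

```python
def lsrv(xn, xm, datasize=64):
    """Model LSRV: shift xn right by (xm MOD datasize), shifting in zeros."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    mask = (1 << datasize) - 1
    amount = (xm & mask) % datasize  # UInt(operand2) MOD datasize
    return (xn & mask) >> amount     # vacated high bits are zero


top_bit = lsrv(0x80000000, 31, datasize=32)  # moves bit 31 down to bit 0
```

In the 32-bit variant a shift amount of 32 wraps to 0, so `lsrv(0x80000000, 32, datasize=32)` returns the input unchanged.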
C6.2.119 MADD


Multiply-Add multiplies two register values, adds a third register value, and writes the result to the destination 
register. 


This instruction is used by the alias MUL. See Alias conditions for details of when each alias is preferred. 


| 31 | 30:29 | 28:24 | 23:21 | 20:16 | 15 | 14:10 | 9:5 | 4:0 |
| sf | 00 | 11011 | 000 | Rm | o0 = 0 | Ra | Rn | Rd |

32-bit variant
Applies when sf == 0.
MADD <Wd>, <Wn>, <Wm>, <Wa>

64-bit variant
Applies when sf == 1.
MADD <Xd>, <Xn>, <Xm>, <Xa>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer a = UInt(Ra); 
integer destsize = if sf == '1' then 64 else 32; 


Alias conditions

Alias   is preferred when
MUL     Ra == '11111'

Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 





<Wa> Is the 32-bit name of the third general-purpose source register holding the addend, encoded in the 
"Ra" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register holding the multiplicand, encoded in
the "Rn" field.

<Xm> Is the 64-bit name of the second general-purpose source register holding the multiplier, encoded in
the "Rm" field.

<Xa> Is the 64-bit name of the third general-purpose source register holding the addend, encoded in the
"Ra" field.
Operation 
bits(destsize) operand1 = X[n];
bits(destsize) operand2 = X[m];
bits(destsize) operand3 = X[a];
integer result;

result = UInt(operand3) + (UInt(operand1) * UInt(operand2));

X[d] = result<destsize-1:0>;
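
Because the result is computed as an unbounded integer and then truncated to destsize bits, overflow simply wraps. The Python sketch below shows this; the function name `madd` is hypothetical and register values are modeled as plain integers for illustration.

```python
def madd(xn, xm, xa, destsize=64):
    """Model MADD: (xa + xn * xm) truncated to destsize bits."""
    if destsize not in (32, 64):
        raise ValueError("destsize must be 32 or 64")
    mask = (1 << destsize) - 1
    result = (xa & mask) + (xn & mask) * (xm & mask)  # unbounded integer
    return result & mask                              # result<destsize-1:0>


total = madd(3, 4, 5)                # 5 + 3 * 4
```

Passing 0 as the addend models reading the zero register, which corresponds to the Ra == '11111' case for which the MUL alias is preferred.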
C6.2.120 MNEG


Multiply-Negate multiplies two register values, negates the product, and writes the result to the destination register.

This instruction is an alias of the MSUB instruction. This means that:
• The encodings in this description are named to match the encodings of MSUB.
• The description of MSUB gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:24 | 23:21 | 20:16 | 15 | 14:10 | 9:5 | 4:0 |
| sf | 00 | 11011 | 000 | Rm | o0 = 1 | Ra = 11111 | Rn | Rd |

32-bit variant
Applies when sf == 0.
MNEG <Wd>, <Wn>, <Wm>
is equivalent to
MSUB <Wd>, <Wn>, <Wm>, WZR
and is always the preferred disassembly.

64-bit variant
Applies when sf == 1.
MNEG <Xd>, <Xn>, <Xm>
is equivalent to
MSUB <Xd>, <Xn>, <Xm>, XZR
and is always the preferred disassembly.


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xn> Is the 64-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Xm> Is the 64-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


Operation 


The description of MSUB gives the operational pseudocode for this instruction. 
C6.2.121 MOV (to/from SP)

Move between register and stack pointer: Rd = Rn.


This instruction is an alias of the ADD (immediate) instruction. This means that:
• The encodings in this description are named to match the encodings of ADD (immediate).
• The description of ADD (immediate) gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28:24 | 23:22 | 21:10 | 9:5 | 4:0 |
| sf | op = 0 | S = 0 | 10001 | shift = 00 | imm12 = 000000000000 | Rn | Rd |


32-bit variant 

Applies when sf == 0. 

MOV <Wd|WSP>, <Wn|WSP> 

is equivalent to 

ADD <Wd|WSP>, <Wn|WSP>, #0 


and is the preferred disassembly when (Rd == '11111' || Rn == '11111'). 


64-bit variant 

Applies when sf == 1. 
MOV <Xd|SP>, <Xn|SP> 

is equivalent to 

ADD <Xd|SP>, <Xn|SP>, #0 


and is the preferred disassembly when (Rd == '11111' || Rn == '11111'). 


Assembler symbols 





<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field. 

Operation 


The description of ADD (immediate) gives the operational pseudocode for this instruction. 
C6.2.122 MOV (inverted wide immediate)
Move (inverted wide immediate) moves an inverted 16-bit immediate value to a register. 


This instruction is an alias of the MOVN instruction. This means that:
• The encodings in this description are named to match the encodings of MOVN.
• The description of MOVN gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:23 | 22:21 | 20:5 | 4:0 |
| sf | opc = 00 | 100101 | hw | imm16 | Rd |


32-bit variant 

Applies when sf == 0. 

MOV <Wd>, #<imm> 

is equivalent to 

MOVN <Wd>, #<imm16>, LSL #<shift> 


and is the preferred disassembly when ! (IsZero(imm16) && hw != '00') && ! IsOnes(imm16).


64-bit variant 

Applies when sf == 1. 

MOV <Xd>, #<imm> 

is equivalent to 

MOVN <Xd>, #<imm16>, LSL #<shift> 


and is the preferred disassembly when ! (IsZero(imm16) && hw != '00'). 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<imm> For the 32-bit variant: is a 32-bit immediate, the bitwise inverse of which can be encoded in
"imm16:hw", but excluding 0xffff0000 and 0x0000ffff.
For the 64-bit variant: is a 64-bit immediate, the bitwise inverse of which can be encoded in
"imm16:hw".


<shift> For the 32-bit variant: is the amount by which to shift the immediate left, either 0 (the default) or 
16, encoded in the "hw" field as <shift>/16. 


For the 64-bit variant: is the amount by which to shift the immediate left, either 0 (the default), 16, 
32 or 48, encoded in the "hw" field as <shift>/16. 


Operation 


The description of MOVN gives the operational pseudocode for this instruction. 
C6.2.123 MOV (wide immediate)

Move (wide immediate) moves a 16-bit immediate value to a register.

This instruction is an alias of the MOVZ instruction. This means that:
• The encodings in this description are named to match the encodings of MOVZ.
• The description of MOVZ gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:23 | 22:21 | 20:5 | 4:0 |
| sf | opc = 10 | 100101 | hw | imm16 | Rd |


32-bit variant 

Applies when sf == 0. 

MOV <Wd>, #<imm> 

is equivalent to 

MOVZ <Wd>, #<imm16>, LSL #<shift> 


and is the preferred disassembly when ! (IsZero(imm16) && hw != '00'). 


64-bit variant 

Applies when sf == 1. 

MOV <Xd>, #<imm> 

is equivalent to 

MOVZ <Xd>, #<imm16>, LSL #<shift> 


and is the preferred disassembly when ! (IsZero(imm16) && hw != '00'). 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<imm> For the 32-bit variant: is a 32-bit immediate which can be encoded in "imm16:hw". 


For the 64-bit variant: is a 64-bit immediate which can be encoded in "imm16:hw". 


<shift> For the 32-bit variant: is the amount by which to shift the immediate left, either 0 (the default) or 
16, encoded in the "hw" field as <shift>/16. 


For the 64-bit variant: is the amount by which to shift the immediate left, either 0 (the default), 16, 
32 or 48, encoded in the "hw" field as <shift>/16. 


Operation 


The description of MOVZ gives the operational pseudocode for this instruction. 
C6.2.124 MOV (bitmask immediate)

Move (bitmask immediate) writes a bitmask immediate value to a register.

This instruction is an alias of the ORR (immediate) instruction. This means that:
• The encodings in this description are named to match the encodings of ORR (immediate).
• The description of ORR (immediate) gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:23 | 22 | 21:16 | 15:10 | 9:5 | 4:0 |
| sf | opc = 01 | 100100 | N | immr | imms | Rn = 11111 | Rd |

32-bit variant

Applies when sf == 0 && N == 0.
MOV <Wd|WSP>, #<imm>

is equivalent to 

ORR <Wd|WSP>, WZR, #<imm> 


and is the preferred disassembly when ! MoveWidePreferred(sf, N, imms, immr). 


64-bit variant 

Applies when sf == 1. 
MOV <Xd|SP>, #<imm> 

is equivalent to 

ORR <Xd|SP>, XZR, #<imm> 


and is the preferred disassembly when ! MoveWidePreferred(sf, N, imms, immr). 


Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<imm> For the 32-bit variant: is the bitmask immediate, encoded in "imms:immr", but excluding values
which could be encoded by MOVZ or MOVN.
For the 64-bit variant: is the bitmask immediate, encoded in "N:imms:immr", but excluding values
which could be encoded by MOVZ or MOVN.


Operation 


The description of ORR (immediate) gives the operational pseudocode for this instruction. 
C6.2.125 MOV (register)

Move (register) copies the value in a source register to the destination register.

This instruction is an alias of the ORR (shifted register) instruction. This means that:
• The encodings in this description are named to match the encodings of ORR (shifted register).
• The description of ORR (shifted register) gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:24 | 23:22 | 21 | 20:16 | 15:10 | 9:5 | 4:0 |
| sf | opc = 01 | 01010 | shift = 00 | N = 0 | Rm | imm6 = 000000 | Rn = 11111 | Rd |


32-bit variant 
Applies when sf == 0. 
MOV <Wd>, <Wm> 

is equivalent to 

ORR <Wd>, WZR, <Wm> 


and is always the preferred disassembly. 


64-bit variant 
Applies when sf == 1. 
MOV <Xd>, <Xm> 

is equivalent to 

ORR <Xd>, XZR, <Xm> 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wm> Is the 32-bit name of the general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xm> Is the 64-bit name of the general-purpose source register, encoded in the "Rm" field. 
Operation 


The description of ORR (shifted register) gives the operational pseudocode for this instruction. 
C6.2.126 MOVK

Move wide with keep moves an optionally-shifted 16-bit immediate value into a register, keeping other bits
unchanged.

| 31 | 30:29 | 28:23 | 22:21 | 20:5 | 4:0 |
| sf | opc = 11 | 100101 | hw | imm16 | Rd |


32-bit variant 
Applies when sf == 0. 


MOVK <Wd>, #<imm>{, LSL #<shift>} 


64-bit variant 
Applies when sf == 1. 


MOVK <Xd>, #<imm>{, LSL #<shift>} 


Decode for all variants of this encoding 
integer d = UInt(Rd);
integer datasize = if sf == '1' then 64 else 32;
integer pos;

if sf == '0' && hw<1> == '1' then UnallocatedEncoding();
pos = UInt(hw:'0000');
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<imm> Is the 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 


<shift> For the 32-bit variant: is the amount by which to shift the immediate left, either 0 (the default) or 
16, encoded in the "hw" field as <shift>/16. 


For the 64-bit variant: is the amount by which to shift the immediate left, either 0 (the default), 16, 
32 or 48, encoded in the "hw" field as <shift>/16. 


Operation 
bits(datasize) result; 
result = X[d]; 


result<pos+15:pos> = imm16; 
X[d] = result; 
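
One common use of MOVK is to build a wide constant 16 bits at a time, starting from a MOVZ and patching in the remaining halfwords. The Python sketch below models that sequence under stated assumptions: the helper names `movz`, `movk`, and `_check` are hypothetical, and registers are represented as plain integers.

```python
def _check(imm16, shift, datasize):
    """Validate operands the way the decode restricts hw for each variant."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    if shift % 16 != 0 or not 0 <= shift < datasize:
        raise ValueError("shift must be a multiple of 16 below datasize")


def movz(imm16, shift=0, datasize=64):
    """Model MOVZ: zero the register, then place imm16 at bit position shift."""
    _check(imm16, shift, datasize)
    return imm16 << shift


def movk(old, imm16, shift=0, datasize=64):
    """Model MOVK: replace bits [shift+15:shift] of old, keeping the rest."""
    _check(imm16, shift, datasize)
    mask = (1 << datasize) - 1
    field = 0xFFFF << shift
    return (old & mask & ~field) | (imm16 << shift)


x = movz(0xDEF0)          # MOVZ X0, #0xDEF0
x = movk(x, 0x9ABC, 16)   # MOVK X0, #0x9ABC, LSL #16
x = movk(x, 0x5678, 32)   # MOVK X0, #0x5678, LSL #32
x = movk(x, 0x1234, 48)   # MOVK X0, #0x1234, LSL #48
```

After the four steps `x` holds 0x123456789ABCDEF0. In the 32-bit variant only shifts of 0 and 16 are accepted, mirroring the UnallocatedEncoding() check on hw<1>.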
C6.2.127 MOVN

Move wide with NOT moves the inverse of an optionally-shifted 16-bit immediate value to a register.

This instruction is used by the alias MOV (inverted wide immediate). See Alias conditions for details of when each
alias is preferred.

| 31 | 30:29 | 28:23 | 22:21 | 20:5 | 4:0 |
| sf | opc = 00 | 100101 | hw | imm16 | Rd |

32-bit variant
Applies when sf == 0.


MOVN <Wd>, #<imm>{, LSL #<shift>} 


64-bit variant 
Applies when sf == 1. 


MOVN <Xd>, #<imm>{, LSL #<shift>} 


Decode for all variants of this encoding 

integer d = UInt(Rd);
integer datasize = if sf == '1' then 64 else 32;
integer pos;

if sf == '0' && hw<1> == '1' then UnallocatedEncoding();
pos = UInt(hw:'0000');


Alias conditions

Alias                           of variant   is preferred when
MOV (inverted wide immediate)   64-bit       ! (IsZero(imm16) && hw != '00')
MOV (inverted wide immediate)   32-bit       ! (IsZero(imm16) && hw != '00') && ! IsOnes(imm16)





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<imm> Is the 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 
<shift> For the 32-bit variant: is the amount by which to shift the immediate left, either 0 (the default) or
16, encoded in the "hw" field as <shift>/16.
For the 64-bit variant: is the amount by which to shift the immediate left, either 0 (the default), 16,
32 or 48, encoded in the "hw" field as <shift>/16.

Operation

bits(datasize) result;

result = Zeros();
result<pos+15:pos> = imm16;
result = NOT(result);
X[d] = result;
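
The operation and the alias conditions can be checked together with a short model. The following Python sketch is illustrative only; the names `movn` and `mov_alias_preferred` are hypothetical, and `hw` is passed as an integer equal to the shift divided by 16.

```python
def movn(imm16, shift=0, datasize=64):
    """Model MOVN: place imm16 at bit position shift, then invert all bits."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    if shift % 16 != 0 or not 0 <= shift < datasize:
        raise ValueError("shift must be a multiple of 16 below datasize")
    mask = (1 << datasize) - 1
    return ~(imm16 << shift) & mask  # NOT(result), truncated to datasize bits


def mov_alias_preferred(imm16, hw, datasize=64):
    """Evaluate the MOV (inverted wide immediate) alias condition."""
    preferred = not (imm16 == 0 and hw != 0)  # ! (IsZero(imm16) && hw != '00')
    if datasize == 32:
        preferred = preferred and imm16 != 0xFFFF  # && ! IsOnes(imm16)
    return preferred


all_ones = movn(0)                   # MOVN X0, #0
```

In the 32-bit variant an imm16 of 0xFFFF produces 0xffff0000 or 0x0000ffff depending on the shift, and the extra IsOnes(imm16) term makes the alias not preferred for exactly those encodings.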
C6.2.128 MOVZ

Move wide with zero moves an optionally-shifted 16-bit immediate value to a register.

This instruction is used by the alias MOV (wide immediate). See Alias conditions for details of when each alias is
preferred.

| 31 | 30:29 | 28:23 | 22:21 | 20:5 | 4:0 |
| sf | opc = 10 | 100101 | hw | imm16 | Rd |

32-bit variant
Applies when sf == 0.


MOVZ <Wd>, #<imm>{, LSL #<shift>} 


64-bit variant 
Applies when sf == 1. 


MOVZ <Xd>, #<imm>{, LSL #<shift>} 


Decode for all variants of this encoding 

integer d = UInt(Rd);
integer datasize = if sf == '1' then 64 else 32;
integer pos;

if sf == '0' && hw<1> == '1' then UnallocatedEncoding();
pos = UInt(hw:'0000');


Alias conditions

Alias                  is preferred when
MOV (wide immediate)   ! (IsZero(imm16) && hw != '00')

Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<imm> Is the 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 
<shift> For the 32-bit variant: is the amount by which to shift the immediate left, either 0 (the default) or
16, encoded in the "hw" field as <shift>/16.
For the 64-bit variant: is the amount by which to shift the immediate left, either 0 (the default), 16,
32 or 48, encoded in the "hw" field as <shift>/16.

Operation

bits(datasize) result;

result = Zeros();
result<pos+15:pos> = imm16;
X[d] = result;
C6.2.129 MRS

Move System Register allows the PE to read an AArch64 System register into a general-purpose register.

| 31:22 | 21 | 20 | 19 | 18:16 | 15:12 | 11:8 | 7:5 | 4:0 |
| 1101010100 | L = 1 | 1 | o0 | op1 | CRn | CRm | op2 | Rt |


System variant 

MRS <Xt>, (<systemreg>|S<op0>_<op1>_<Cn>_<Cm>_<op2>) 

Decode for this encoding 

AArch64.CheckSystemAccess('1':o0, op1, CRn, CRm, op2, Rt, L);

integer t = UInt(Rt);

integer sys_op0 = 2 + UInt(o0);
integer sys_op1 = UInt(op1);
integer sys_op2 = UInt(op2);
integer sys_crn = UInt(CRn);
integer sys_crm = UInt(CRm);


Assembler symbols 
<Xt> Is the 64-bit name of the general-purpose destination register, encoded in the "Rt" field. 


<systemreg> Is a System register name, encoded in the "o0:op1:CRn:CRm:op2". The System register names are
defined in Chapter D7 AArch64 System Register Descriptions.

<op0> Is an unsigned immediate, encoded in the "o0" field. It can have the following values:
2 when o0 = 0
3 when o0 = 1

<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.

<Cn> Is a name 'Cn', with 'n' in the range 0 to 15, encoded in the "CRn" field.

<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.

<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field.

Operation 


X[t] = AArch64.SysRegRead(sys_op0, sys_op1, sys_crn, sys_crm, sys_op2);
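
To make the field packing concrete, the Python sketch below assembles an MRS instruction word from the generic S<op0>_<op1>_<Cn>_<Cm>_<op2> form. It is an illustration under stated assumptions: the function name `encode_mrs` and the constant `MRS_FIXED_BITS` are hypothetical, and the fixed-bit value 0xD5300000 (bits [31:20], including L = 1) is taken from the A64 MRS encoding rather than derived in code.

```python
MRS_FIXED_BITS = 0xD5300000  # assumption: bits [31:20] = 1101 0101 0011 (L = 1)


def encode_mrs(rt, op0, op1, crn, crm, op2):
    """Pack MRS <Xt>, S<op0>_<op1>_C<crn>_C<crm>_<op2> into a 32-bit word."""
    if op0 not in (2, 3):
        raise ValueError("op0 must be 2 or 3")
    limits = (("rt", rt, 31), ("op1", op1, 7), ("crn", crn, 15),
              ("crm", crm, 15), ("op2", op2, 7))
    for name, value, limit in limits:
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range")
    o0 = op0 - 2  # inverse of sys_op0 = 2 + UInt(o0)
    return (MRS_FIXED_BITS | (o0 << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)


word = encode_mrs(rt=0, op0=3, op1=3, crn=13, crm=0, op2=2)  # MRS X0, S3_3_C13_C0_2
```

For this example `word` is 0xD53BD040. Only the bit packing is shown; the access checks performed by AArch64.CheckSystemAccess are outside the scope of the sketch.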
C6.2.130 MSR (immediate)

Move immediate value to Special Register moves an immediate value to selected bits of the PSTATE. For more
information, see PSTATE.

| 31:22 | 21 | 20:19 | 18:16 | 15:12 | 11:8 | 7:5 | 4:0 |
| 1101010100 | 0 | 00 | op1 | 0100 | CRm | op2 | 11111 |


System variant 


MSR <pstatefield>, #<imm> 


Decode for this encoding 
AArch64.CheckSystemAccess('00', op1, '0100', CRm, op2, '11111', '0');

bits(4) operand = CRm;
PSTATEField field;

case op1:op2 of
    when '000 101' field = PSTATEField_SP;
    when '011 110' field = PSTATEField_DAIFSet;
    when '011 111' field = PSTATEField_DAIFClr;
    otherwise UnallocatedEncoding();

// Check that an AArch64 MSR/MRS access to the DAIF flags is permitted
if op1 == '011' && PSTATE.EL == EL0 && SCTLR_EL1.UMA == '0' then
    AArch64.SystemRegisterTrap(EL1, '00', op2, op1, '0100', '11111', CRm, '0');


Assembler symbols 


<pstatefield> Is a PSTATE field name, encoded in the "op1:op2" field. It can have the following values:
SPSel when op1 = 000, op2 = 101
DAIFSet when op1 = 011, op2 = 110
DAIFClr when op1 = 011, op2 = 111
The following encodings are reserved:
• op1 = 000, op2 = 0xx.
• op1 = 000, op2 = 100.
• op1 = 000, op2 = 11x.
• op1 = 001, op2 = xxx.
• op1 = 010, op2 = xxx.
• op1 = 011, op2 = 0xx.
• op1 = 011, op2 = 10x.
• op1 = 1xx, op2 = xxx.


<imm> Is a 4-bit unsigned immediate, in the range 0 to 15, encoded in the "CRm" field. 


Operation 


case field of
    when PSTATEField_SP
        PSTATE.SP = operand<0>;
    when PSTATEField_DAIFSet
        PSTATE.D = PSTATE.D OR operand<3>;
        PSTATE.A = PSTATE.A OR operand<2>;
        PSTATE.I = PSTATE.I OR operand<1>;
        PSTATE.F = PSTATE.F OR operand<0>;
    when PSTATEField_DAIFClr
        PSTATE.D = PSTATE.D AND NOT(operand<3>);
        PSTATE.A = PSTATE.A AND NOT(operand<2>);
        PSTATE.I = PSTATE.I AND NOT(operand<1>);
        PSTATE.F = PSTATE.F AND NOT(operand<0>);
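As an informal illustration of the Operation pseudocode for MSR (immediate), the following Python sketch models the affected PSTATE bits as a dictionary of one-bit values. The function name msr_immediate and the dictionary layout are assumptions made for this example and are not part of the architecture. The sketch does not model the access checks or traps performed in the decode stage.

```python
def msr_immediate(pstate, field, operand):
    """Return a copy of pstate after MSR <field>, #operand (illustrative model only)."""
    if not 0 <= operand <= 15:
        raise ValueError("operand must be a 4-bit unsigned immediate")
    new = dict(pstate)
    # operand<3:0> maps onto D, A, I, F in that order.
    bits = {
        "D": (operand >> 3) & 1,
        "A": (operand >> 2) & 1,
        "I": (operand >> 1) & 1,
        "F": operand & 1,
    }
    if field == "SPSel":
        new["SP"] = operand & 1
    elif field == "DAIFSet":
        for name, bit in bits.items():
            new[name] |= bit            # set the flags selected by the immediate
    elif field == "DAIFClr":
        for name, bit in bits.items():
            new[name] &= (~bit) & 1     # clear the flags selected by the immediate
    else:
        raise ValueError("unallocated PSTATE field")
    return new
```

For example, applying DAIFClr with the immediate 2 to a state where D, A, I, and F are all 1 clears only the I bit.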
C6.2.131 MSR (register)


Move general-purpose register to System Register allows the PE to write an AArch64 System register from a 
general-purpose register. 


| 31-22      | 21    | 20 | 19 | 18-16 | 15-12 | 11-8 | 7-5 | 4-0 |
| 1101010100 | 0 (L) | 1  | o0 | op1   | CRn   | CRm  | op2 | Rt  |


System variant 


MSR (<systemreg>|S<op0>_<op1>_<Cn>_<Cm>_<op2>), <Xt> 


Decode for this encoding 
AArch64.CheckSystemAccess('1':o0, op1, CRn, CRm, op2, Rt, L);

integer t = UInt(Rt);

integer sys_op0 = 2 + UInt(o0);
integer sys_op1 = UInt(op1);
integer sys_op2 = UInt(op2);
integer sys_crn = UInt(CRn);
integer sys_crm = UInt(CRm);


Assembler symbols 


<systemreg> Is a System register name, encoded in "o0:op1:CRn:CRm:op2". The System register names are
defined in Chapter D7 AArch64 System Register Descriptions.

<op0> Is an unsigned immediate, encoded in the "o0" field. It can have the following values:
2 when o0 = 0
3 when o0 = 1

<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.

<Cn> Is a name 'Cn', with 'n' in the range 0 to 15, encoded in the "CRn" field.

<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.

<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field. 

<Xt> Is the 64-bit name of the general-purpose source register, encoded in the "Rt" field. 

Operation 


AArch64.SysRegWrite(sys_op0, sys_op1, sys_crn, sys_crm, sys_op2, X[t]);
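The generic S<op0>_<op1>_<Cn>_<Cm>_<op2> form maps directly onto the encoding fields. The following Python sketch shows one way to assemble the 32-bit instruction word for MSR (register) from those operands. The function name encode_msr_register and the constant name are assumptions for this example. The sketch only packs fields, and does not check whether the selected System register exists or is writable.

```python
# Fixed bits of MSR (register): bits 31..22 = 1101010100, L (bit 21) = 0, bit 20 = 1.
MSR_REGISTER_BASE = 0xD5100000

def encode_msr_register(op0, op1, crn, crm, op2, rt):
    """Pack the operands of MSR S<op0>_<op1>_C<crn>_C<crm>_<op2>, X<rt> into a word."""
    if op0 not in (2, 3):
        raise ValueError("op0 must be 2 or 3")
    for name, value, limit in (("op1", op1, 7), ("CRn", crn, 15),
                               ("CRm", crm, 15), ("op2", op2, 7), ("Rt", rt, 31)):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range")
    o0 = op0 - 2                      # <op0> is 2 when o0 = 0, and 3 when o0 = 1
    return (MSR_REGISTER_BASE | (o0 << 19) | (op1 << 16) |
            (crn << 12) | (crm << 8) | (op2 << 5) | rt)
```

For example, encode_msr_register(3, 3, 4, 2, 0, 0) packs MSR S3_3_C4_C2_0, X0.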
C6.2.132 MSUB


Multiply-Subtract multiplies two register values, subtracts the product from a third register value, and writes the 
result to the destination register. 


This instruction is used by the alias MNEG. See Alias conditions for details of when each alias is preferred. 


| 31 | 30-29 | 28-24 | 23-21 | 20-16 | 15 | 14-10 | 9-5 | 4-0 |
| sf | 00    | 11011 | 000   | Rm    | 1  | Ra    | Rn  | Rd  |


32-bit variant 


Applies when sf == 0.


MSUB <Wd>, <Wn>, <Wm>, <Wa> 


64-bit variant 


Applies when sf == 1.


MSUB <Xd>, <Xn>, <Xm>, <Xa> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer a = UInt(Ra); 
integer destsize = if sf == '1' then 64 else 32; 


Alias conditions

Alias | is preferred when
MNEG | Ra == '11111'





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field.

<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in
the "Rn" field.

<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in
the "Rm" field.

<Wa> Is the 32-bit name of the third general-purpose source register holding the minuend, encoded in the
"Ra" field.

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.

<Xn> Is the 64-bit name of the first general-purpose source register holding the multiplicand, encoded in
the "Rn" field.

<Xm> Is the 64-bit name of the second general-purpose source register holding the multiplier, encoded in
the "Rm" field.

<Xa> Is the 64-bit name of the third general-purpose source register holding the minuend, encoded in the
"Ra" field.


Operation 

bits(destsize) operand1 = X[n];
bits(destsize) operand2 = X[m];
bits(destsize) operand3 = X[a];

integer result;

result = UInt(operand3) - (UInt(operand1) * UInt(operand2));
X[d] = result<destsize-1:0>;
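The Operation pseudocode computes the result as an unbounded integer and then keeps only the low destsize bits. The following Python sketch is an informal model of that behavior. The function name msub and the use of plain integers for register values are assumptions for this example.

```python
def msub(rn, rm, ra, destsize=64):
    """Model of MSUB: (ra - rn * rm) truncated to destsize bits."""
    if destsize not in (32, 64):
        raise ValueError("destsize must be 32 or 64")
    mask = (1 << destsize) - 1
    # Operands are treated as unsigned values of destsize bits, as UInt() does.
    result = (ra & mask) - (rn & mask) * (rm & mask)
    return result & mask              # corresponds to result<destsize-1:0>
```

Passing ra = 0 models the MNEG alias, where Ra is the zero register.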
C6.2.133 MUL

Multiply: Rd = Rn * Rm.


This instruction is an alias of the MADD instruction. This means that: 


• The encodings in this description are named to match the encodings of MADD.
• The description of MADD gives the operational pseudocode for this instruction.

| 31 | 30-29 | 28-24 | 23-21 | 20-16 | 15 | 14-10      | 9-5 | 4-0 |
| sf | 00    | 11011 | 000   | Rm    | 0  | 11111 (Ra) | Rn  | Rd  |


32-bit variant 

Applies when sf == 0.

MUL <Wd>, <Wn>, <Wm> 

is equivalent to 

MADD <Wd>, <Wn>, <Wm>, WZR 
and is always the preferred disassembly. 
64-bit variant 

Applies when sf == 1.

MUL <Xd>, <Xn>, <Xm> 

is equivalent to 

MADD <Xd>, <Xn>, <Xm>, XZR 


and is always the preferred disassembly. 


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xn> Is the 64-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Xm> Is the 64-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


Operation 


The description of MADD gives the operational pseudocode for this instruction. 
C6.2.134 MVN

Bitwise NOT writes the bitwise inverse of a register value to the destination register.

This instruction is an alias of the ORN (shifted register) instruction. This means that:

• The encodings in this description are named to match the encodings of ORN (shifted register).
• The description of ORN (shifted register) gives the operational pseudocode for this instruction.

| 31 | 30-29    | 28-24 | 23-22 | 21    | 20-16 | 15-10 | 9-5        | 4-0 |
| sf | 01 (opc) | 01010 | shift | 1 (N) | Rm    | imm6  | 11111 (Rn) | Rd  |


32-bit variant 

Applies when sf == 0. 

MVN <Wd>, <Wm>{, <shift> #<amount>} 

is equivalent to 

ORN <Wd>, WZR, <Wm>{, <shift> #<amount>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

MVN <Xd>, <Xm>{, <shift> #<amount>} 

is equivalent to 

ORN <Xd>, XZR, <Xm>{, <shift> #<amount>} 


and is always the preferred disassembly. 


Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wm> Is the 32-bit name of the general-purpose source register, encoded in the "Rm" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xm> Is the 64-bit name of the general-purpose source register, encoded in the "Rm" field. 

<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 

<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the
"imm6" field.
For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.

Operation


The description of ORN (shifted register) gives the operational pseudocode for this instruction. 
C6.2.135 NEG (shifted register)

Negate (shifted register) negates an optionally-shifted register value, and writes the result to the destination register.

This instruction is an alias of the SUB (shifted register) instruction. This means that:

• The encodings in this description are named to match the encodings of SUB (shifted register).
• The description of SUB (shifted register) gives the operational pseudocode for this instruction.

| 31 | 30     | 29    | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5        | 4-0 |
| sf | 1 (op) | 0 (S) | 01011 | shift | 0  | Rm    | imm6  | 11111 (Rn) | Rd  |


32-bit variant 

Applies when sf == 0. 

NEG <Wd>, <Wm>{, <shift> #<amount>} 

is equivalent to 

SUB <Wd>, WZR, <Wm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

NEG <Xd>, <Xm>{, <shift> #<amount>} 

is equivalent to 

SUB <Xd>, XZR, <Xm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wm> Is the 32-bit name of the general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xm> Is the 64-bit name of the general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded 
in the "shift" field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 


The encoding shift = 11 is reserved.


<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 







Operation 


The description of SUB (shifted register) gives the operational pseudocode for this instruction. 
C6.2.136 NEGS


Negate, setting flags, negates an optionally-shifted register value, and writes the result to the destination register. It 
updates the condition flags based on the result. 


This instruction is an alias of the SUBS (shifted register) instruction. This means that: 
• The encodings in this description are named to match the encodings of SUBS (shifted register).
• The description of SUBS (shifted register) gives the operational pseudocode for this instruction.

| 31 | 30     | 29    | 28-24 | 23-22 | 21 | 20-16 | 15-10 | 9-5        | 4-0 |
| sf | 1 (op) | 1 (S) | 01011 | shift | 0  | Rm    | imm6  | 11111 (Rn) | Rd  |


32-bit variant 

Applies when sf == 0. 

NEGS <Wd>, <Wm>{, <shift> #<amount>} 

is equivalent to 

SUBS <Wd>, WZR, <Wm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

NEGS <Xd>, <Xm>{, <shift> #<amount>} 

is equivalent to 

SUBS <Xd>, XZR, <Xm> {, <shift> #<amount>} 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wm> Is the 32-bit name of the general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xm> Is the 64-bit name of the general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded 
in the "shift" field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 


The encoding shift = 11 is reserved.


<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 
Operation


The description of SUBS (shifted register) gives the operational pseudocode for this instruction. 
C6.2.137 NGC


Negate with Carry negates the sum of a register value and the value of NOT (Carry flag), and writes the result to 
the destination register. 


This instruction is an alias of the SBC instruction. This means that: 


• The encodings in this description are named to match the encodings of SBC.
• The description of SBC gives the operational pseudocode for this instruction.

| 31 | 30     | 29    | 28-21    | 20-16 | 15-10  | 9-5        | 4-0 |
| sf | 1 (op) | 0 (S) | 11010000 | Rm    | 000000 | 11111 (Rn) | Rd  |


32-bit variant 
Applies when sf == 0. 
NGC <Wd>, <Wm> 

is equivalent to 

SBC <Wd>, WZR, <Wm> 


and is always the preferred disassembly. 


64-bit variant 
Applies when sf == 1. 
NGC <Xd>, <Xm> 

is equivalent to 

SBC <Xd>, XZR, <Xm> 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wm> Is the 32-bit name of the general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xm> Is the 64-bit name of the general-purpose source register, encoded in the "Rm" field. 
Operation 


The description of SBC gives the operational pseudocode for this instruction. 
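As an informal illustration of the description above, the following Python sketch negates the sum of a register value and NOT(Carry flag), truncated to the register width. The function name ngc is an assumption for this example. The authoritative behavior is the SBC pseudocode, which is not reproduced in this section.

```python
def ngc(rm, carry, datasize=64):
    """Model of NGC: -(rm + NOT(carry)) truncated to datasize bits."""
    if carry not in (0, 1):
        raise ValueError("carry must be 0 or 1")
    mask = (1 << datasize) - 1
    not_carry = 1 - carry
    return (0 - (rm & mask) - not_carry) & mask
```

The same value can be written as (NOT(rm) + carry) truncated to datasize bits, which is how a subtract-with-carry is commonly expressed as an addition.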
C6.2.138 NGCS

Negate with Carry, setting flags, negates the sum of a register value and the value of NOT (Carry flag), and writes
the result to the destination register. It updates the condition flags based on the result.

This instruction is an alias of the SBCS instruction. This means that:

• The encodings in this description are named to match the encodings of SBCS.
• The description of SBCS gives the operational pseudocode for this instruction.

| 31 | 30     | 29    | 28-21    | 20-16 | 15-10  | 9-5        | 4-0 |
| sf | 1 (op) | 1 (S) | 11010000 | Rm    | 000000 | 11111 (Rn) | Rd  |
32-bit variant 
Applies when sf == 0. 
NGCS <Wd>, <Wm> 
is equivalent to 
SBCS <Wd>, WZR, <Wm> 
and is always the preferred disassembly. 
64-bit variant 
Applies when sf == 1. 
NGCS <Xd>, <Xm> 
is equivalent to 
SBCS <Xd>, XZR, <Xm> 
and is always the preferred disassembly. 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wm> Is the 32-bit name of the general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xm> Is the 64-bit name of the general-purpose source register, encoded in the "Rm" field. 
Operation 
The description of SBCS gives the operational pseudocode for this instruction. 
C6.2.139 NOP


No Operation does nothing, other than advance the value of the program counter by 4. This instruction can be used 
for instruction alignment purposes. 


Note 


The timing effects of including a NOP instruction in a program are not guaranteed. It can increase execution time, 
leave it unchanged, or even reduce it. Therefore, NOP instructions are not suitable for timing loops. 








| 31-22      | 21 | 20-19 | 18-16 | 15-12 | 11-8       | 7-5       | 4-0   |
| 1101010100 | 0  | 00    | 011   | 0010  | 0000 (CRm) | 000 (op2) | 11111 |


System variant 
NOP 
Decode for this encoding 


// Empty. 


Operation 


// do nothing 
C6.2.140 ORN (shifted register)

Bitwise OR NOT (shifted register) performs a bitwise (inclusive) OR of a register value and the complement of an
optionally-shifted register value, and writes the result to the destination register.

This instruction is used by the alias MVN. See Alias conditions for details of when each alias is preferred.

| 31 | 30-29    | 28-24 | 23-22 | 21    | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | 01 (opc) | 01010 | shift | 1 (N) | Rm    | imm6  | Rn  | Rd  |

32-bit variant
Applies when sf == 0.
ORN <Wd>, <Wn>, <Wm>{, <shift> #<amount>}
64-bit variant
Applies when sf == 1.
ORN <Xd>, <Xn>, <Xm>{, <shift> #<amount>}
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 
if sf == '0' && imm6<5> == '1' then ReservedValue();
ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);

Alias conditions

Alias | is preferred when
MVN | Rn == '11111'
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10
ROR when shift = 11 

<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field.
Operation 


bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);


operand2 = NOT(operand2); 


result = operand1 OR operand2; 
X[d] = result; 
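The following Python sketch is an informal model of the ORN operation, including the optional shift of the second operand, and of the MVN alias obtained when the first source is the zero register. The function names shift_reg, orn, and mvn are assumptions for this example, and shift_reg operates on a value rather than on a register index as the ShiftReg pseudocode function does.

```python
def shift_reg(value, shift_type, amount, datasize):
    """Shift or rotate a datasize-bit value by amount (0 <= amount < datasize)."""
    mask = (1 << datasize) - 1
    value &= mask
    if not 0 <= amount < datasize:
        raise ValueError("shift amount out of range")
    if amount == 0:
        return value
    if shift_type == "LSL":
        return (value << amount) & mask
    if shift_type == "LSR":
        return value >> amount
    if shift_type == "ASR":
        # Reinterpret as signed so that Python's >> replicates the sign bit.
        signed = value - (1 << datasize) if value >> (datasize - 1) else value
        return (signed >> amount) & mask
    if shift_type == "ROR":
        return ((value >> amount) | (value << (datasize - amount))) & mask
    raise ValueError("unknown shift type")

def orn(rn, rm, shift_type="LSL", amount=0, datasize=64):
    """Model of ORN: rn OR NOT(shifted rm)."""
    mask = (1 << datasize) - 1
    operand2 = ~shift_reg(rm, shift_type, amount, datasize) & mask
    return (rn & mask) | operand2

def mvn(rm, shift_type="LSL", amount=0, datasize=64):
    """Model of the MVN alias: ORN with the zero register as first source."""
    return orn(0, rm, shift_type, amount, datasize)
```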
C6.2.141 ORR (immediate)


Bitwise OR (immediate) performs a bitwise (inclusive) OR of a register value and an immediate value, and
writes the result to the destination register.


This instruction is used by the alias MOV (bitmask immediate). See Alias conditions for details of when each alias 
is preferred. 


| 31 | 30-29    | 28-23  | 22 | 21-16 | 15-10 | 9-5 | 4-0 |
| sf | 01 (opc) | 100100 | N  | immr  | imms  | Rn  | Rd  |


32-bit variant 
Applies when sf == 0 && N == 0.


ORR <Wd|WSP>, <Wn>, #<imm> 


64-bit variant 
Applies when sf == 1.


ORR <Xd|SP>, <Xn>, #<imm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 

if sf == '0' && N != '0' then ReservedValue();
(imm, -) = DecodeBitMasks(N, imms, immr, TRUE); 


Alias conditions

Alias | is preferred when
MOV (bitmask immediate) | Rn == '11111' && ! MoveWidePreferred(sf, N, imms, immr)





Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" 
field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<imm> For the 32-bit variant: is the bitmask immediate, encoded in "imms:immr". 


For the 64-bit variant: is the bitmask immediate, encoded in "N:imms:immr". 
Operation


bits(datasize) result;
bits(datasize) operand1 = X[n];


result = operand1 OR imm; 
if d == 31 then 

SP[] = result; 
else 

X[d] = result; 
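The decode stage obtains the bitmask immediate from N, imms, and immr by calling DecodeBitMasks, which is not defined in this section. The following Python sketch shows the commonly described expansion: pick an element size from N:NOT(imms), build a run of ones, rotate it right within the element, and replicate the element across the register. The function name decode_bit_masks and the error handling are assumptions for this example, and the sketch returns only the first of the two values that the pseudocode function produces.

```python
def decode_bit_masks(n, imms, immr, datasize):
    """Expand the (N, imms, immr) fields into a datasize-bit bitmask immediate."""
    # The highest set bit of N:NOT(imms) selects the element size 2, 4, ..., 64.
    combined = (n << 6) | (~imms & 0x3F)
    length = combined.bit_length() - 1
    if length < 1:
        raise ValueError("reserved encoding")
    esize = 1 << length
    if esize > datasize:
        raise ValueError("reserved encoding for this register width")
    levels = esize - 1
    s = imms & levels                 # run of ones has s + 1 bits
    r = immr & levels                 # rotate-right amount within the element
    if s == levels:
        raise ValueError("reserved encoding (element would be all ones)")
    welem = (1 << (s + 1)) - 1
    emask = (1 << esize) - 1
    rotated = ((welem >> r) | (welem << (esize - r))) & emask
    # Replicate the element until the register width is filled.
    result = 0
    for position in range(0, datasize, esize):
        result |= rotated << position
    return result
```

For example, N = 0, imms = 0b111100, immr = 0 selects a 2-bit element containing a single one, which replicates to the alternating pattern 0x5555555555555555 in a 64-bit register.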
C6.2.142 ORR (shifted register)

Bitwise OR (shifted register) performs a bitwise (inclusive) OR of a register value and an optionally-shifted register
value, and writes the result to the destination register.

This instruction is used by the alias MOV (register). See Alias conditions for details of when each alias is preferred.

| 31 | 30-29    | 28-24 | 23-22 | 21    | 20-16 | 15-10 | 9-5 | 4-0 |
| sf | 01 (opc) | 01010 | shift | 0 (N) | Rm    | imm6  | Rn  | Rd  |

32-bit variant 

Applies when sf == 0.

ORR <Wd>, <Wn>, <Wm>{, <shift> #<amount>} 

64-bit variant 

Applies when sf == 1.

ORR <Xd>, <Xn>, <Xm>{, <shift> #<amount>} 

Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 
if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);

Alias conditions

Alias | is preferred when
MOV (register) | shift == '00' && imm6 == '000000' && Rn == '11111'

Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 

<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift"
field. It can have the following values:
LSL when shift = 00
LSR when shift = 01
ASR when shift = 10
ROR when shift = 11 

<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field.
Operation 


bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);


result = operand1 OR operand2; 
X[d] = result; 
C6.2.143 PRFM (immediate)


Prefetch Memory (immediate) signals the memory system that data memory accesses from a specified address are 
likely to occur in the near future. The memory system can respond by taking actions that are expected to speed up 
the memory accesses when they do occur, such as preloading the cache line containing the specified address into 
one or more caches. 


The effect of a PRFM instruction is IMPLEMENTATION DEFINED. For more information, see Prefetch memory on
page C3-156. 


For information about memory accesses, see Load/Store addressing modes on page C1-128. 


| 31-30     | 29-27 | 26 | 25-24 | 23-22    | 21-10 | 9-5 | 4-0 |
| 11 (size) | 111   | 0  | 01    | 10 (opc) | imm12 | Rn  | Rt  |


Unsigned offset variant 


PRFM (<prfop>|#<imm5>), [<Xn|SP>{, #<pimm>}]


Decode for this encoding 


bits(64) offset = LSL(ZeroExtend(imm12, 64), 3); 


Assembler symbols 


<prfop> Is the prefetch operation, defined as <type><target><policy>. <type> is one of: 
PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00. 
PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
<target> is one of:
L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
<policy> is one of: 


KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in the "Rt<0>" 
field as 0. 


STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded in the 
"Rt<0>" field as 1. 


For more information on these prefetch operations, see Prefetch memory on page C3-156. For other 
encodings of the "Rt" field, use <imm5>. 


<imm5> Is the prefetch operation encoding as an immediate, in the range 0 to 31, encoded in the "Rt" field. 
This syntax is only for encodings that are not accessible using <prfop>. 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 

<pimm> Is the optional positive immediate byte offset, a multiple of 8 in the range 0 to 32760, defaulting to 
0 and encoded in the "imm12" field as <pimm>/8. 

Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
Operation

bits(64) address;

if n == 31 then
    address = SP[];
else
    address = X[n];

address = address + offset;

Prefetch(address, t<4:0>);
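The <prfop> name is a concatenation of <type>, <target>, and <policy>, each of which selects part of the "Rt" field. The following Python sketch converts between a <prfop> name and the 5-bit Rt value using the encodings listed under Assembler symbols. The function and table names are assumptions for this example. Values with no <prfop> name are rendered in the #<imm5> form.

```python
PRF_TYPES = {"PLD": 0b00, "PLI": 0b01, "PST": 0b10}      # Rt<4:3>
PRF_TARGETS = {"L1": 0b00, "L2": 0b01, "L3": 0b10}       # Rt<2:1>
PRF_POLICIES = {"KEEP": 0, "STRM": 1}                    # Rt<0>

def encode_prfop(name):
    """Return the Rt field value for a name such as 'PLDL1KEEP'."""
    prf_type, target, policy = name[:3], name[3:5], name[5:]
    try:
        return (PRF_TYPES[prf_type] << 3) | (PRF_TARGETS[target] << 1) | PRF_POLICIES[policy]
    except KeyError:
        raise ValueError(f"not a <prfop> name: {name!r}") from None

def decode_prfop(rt):
    """Return the <prfop> name for an Rt value, or '#<imm5>' if it has no name."""
    if not 0 <= rt <= 31:
        raise ValueError("Rt must be a 5-bit value")
    types = {v: k for k, v in PRF_TYPES.items()}
    targets = {v: k for k, v in PRF_TARGETS.items()}
    policies = {v: k for k, v in PRF_POLICIES.items()}
    prf_type, target, policy = rt >> 3, (rt >> 1) & 0b11, rt & 1
    if prf_type not in types or target not in targets:
        return f"#{rt}"
    return types[prf_type] + targets[target] + policies[policy]
```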
C6.2.144 PRFM (literal)

Prefetch Memory (literal) signals the memory system that data memory accesses from a specified address are likely
to occur in the near future. The memory system can respond by taking actions that are expected to speed up the
memory accesses when they do occur, such as preloading the cache line containing the specified address into one
or more caches.

The effect of a PRFM instruction is IMPLEMENTATION DEFINED. For more information, see Prefetch memory on
page C3-156.

For information about memory accesses, see Load/Store addressing modes on page C1-128.

| 31-30    | 29-27 | 26 | 25-24 | 23-5  | 4-0 |
| 11 (opc) | 011   | 0  | 00    | imm19 | Rt  |

Literal variant

PRFM (<prfop>|#<imm5>), <label>

Decode for this encoding

integer t = UInt(Rt);
bits(64) offset;

offset = SignExtend(imm19:'00', 64);
Assembler symbols 
<prfop> Is the prefetch operation, defined as <type><target><policy>. <type> is one of: 
PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00. 
PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
<target> is one of:
L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
<policy> is one of: 
KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in the "Rt<0>" 
field as 0. 
STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded in the 
"Rt<0>" field as 1. 
For more information on these prefetch operations, see Prefetch memory on page C3-156. For other 
encodings of the "Rt" field, use <imm5>. 
<imm5> Is the prefetch operation encoding as an immediate, in the range 0 to 31, encoded in the "Rt" field. 
This syntax is only for encodings that are not accessible using <prfop>. 
<label> Is the program label from which the data is to be loaded. Its offset from the address of this 
instruction, in the range +/-1 MB, is encoded as "imm19" times 4. 
Operation
bits(64) address = PC[] + offset; 


Prefetch(address, t<4:0>); 
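The <label> offset is stored as a signed word count in "imm19" and recovered by SignExtend(imm19:'00', 64). The following Python sketch shows both directions of that conversion. The function names are assumptions for this example, and offsets are modelled as Python integers rather than 64-bit bitstrings.

```python
def encode_imm19(offset):
    """Encode a byte offset from this instruction as the imm19 field."""
    if offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4")
    if not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset must be within +/-1 MB")
    return (offset >> 2) & 0x7FFFF

def decode_offset(imm19):
    """Recover the signed byte offset, as SignExtend(imm19:'00', 64) does."""
    if not 0 <= imm19 <= 0x7FFFF:
        raise ValueError("imm19 must be a 19-bit value")
    value = imm19 << 2                # append two zero bits: a 21-bit quantity
    if value & (1 << 20):             # bit 20 is the sign bit
        value -= 1 << 21
    return value
```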
C6.2.145 PRFM (register)


Prefetch Memory (register) signals the memory system that data memory accesses from a specified address are 
likely to occur in the near future. The memory system can respond by taking actions that are expected to speed up 
the memory accesses when they do occur, such as preloading the cache line containing the specified address into 
one or more caches. 


The effect of a PRFM instruction is IMPLEMENTATION DEFINED. For more information, see Prefetch memory on
page C3-156. 


For information about memory accesses, see Load/Store addressing modes on page C1-128. 


| 31-30     | 29-27 | 26 | 25-24 | 23-22    | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 |
| 11 (size) | 111   | 0  | 00    | 10 (opc) | 1  | Rm    | option | S  | 10    | Rn  | Rt  |


Integer variant 


PRFM (<prfop>|#<imm5>), [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]


Decode for this encoding 


if option<1> == '0' then UnallocatedEncoding(); // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then 3 else 0; 


Assembler symbols 


<prfop> Is the prefetch operation, defined as <type><target><policy>. <type> is one of: 
PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00. 
PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
<target> is one of:
L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
<policy> is one of: 


KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in the "Rt<0>" 
field as 0. 


STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded in the 
"Rt<0>" field as 1. 


For more information on these prefetch operations, see Prefetch memory on page C3-156. For other 
encodings of the "Rt" field, use <imm5>. 


<imm5> Is the prefetch operation encoding as an immediate, in the range 0 to 31, encoded in the "Rt" field. 
This syntax is only for encodings that are not accessible using <prfop>. 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field. 


<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the 
"Rm" field. 
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted. Encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111

<amount> Is the index shift amount, optional only when <extend> is not LSL. Where it is permitted to be
optional, it defaults to #0. It is encoded in the "S" field. It can have the following values:
#0 when S = 0
#3 when S = 1


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift); 
bits(64) address; 


if n == 31 then 
address = SP[]; 
else 
address = X[n]; 


address = address + offset; 


Prefetch(address, t<4:0>); 
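The register-offset address is the base plus the index register after it has been extended and shifted. The following Python sketch is an informal model of that calculation for the four <extend> options listed above. The function names extend_reg and prfm_register_address are assumptions for this example, and extend_reg takes the index value directly rather than a register number as the ExtendReg pseudocode function does.

```python
MASK64 = (1 << 64) - 1

def extend_reg(value, extend_type, shift):
    """Extend the index value to 64 bits and shift it left by shift bits."""
    if extend_type in ("UXTW", "SXTW"):
        width = 32                    # index comes from a W register
    elif extend_type in ("LSL", "SXTX"):
        width = 64                    # index comes from an X register
    else:
        raise ValueError("unknown extend type")
    signed = extend_type in ("SXTW", "SXTX")
    length = min(width, 64 - shift)   # bits that survive the left shift
    field = value & ((1 << length) - 1)
    if signed and (field >> (length - 1)) & 1:
        field -= 1 << length          # sign-extend the extracted field
    return (field << shift) & MASK64

def prfm_register_address(base, index, extend_type="LSL", shift=0):
    """Compute the prefetch address for PRFM (register)."""
    if shift not in (0, 3):
        raise ValueError("shift amount must be 0 or 3")
    return (base + extend_reg(index, extend_type, shift)) & MASK64
```

For example, a base of 0x1000 with an index of 2 and LSL #3 gives the address 0x1010.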







C6.2.146 PRFM (unscaled offset)


Prefetch Memory (unscaled offset) signals the memory system that data memory accesses from a specified address 
are likely to occur in the near future. The memory system can respond by taking actions that are expected to speed 
up the memory accesses when they do occur, such as preloading the cache line containing the specified address into 
one or more caches. 


The effect of a PRFUM instruction is IMPLEMENTATION DEFINED. For more information, see Prefetch memory on
page C3-156. 


For information about memory accesses, see Load/Store addressing modes on page C1-128. 


| 31-30     | 29-27 | 26 | 25-24 | 23-22    | 21 | 20-12 | 11-10 | 9-5 | 4-0 |
| 11 (size) | 111   | 0  | 00    | 10 (opc) | 0  | imm9  | 00    | Rn  | Rt  |


Unscaled offset variant 


PRFUM (<prfop>|#<imm5>), [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<prfop> Is the prefetch operation, defined as <type><target><policy>. <type> is one of: 
PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00. 
PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
<target> is one of:
L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
<policy> is one of: 


KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in the "Rt<0>" 
field as 0. 


STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded in the 
"Rt<0>" field as 1. 


For more information on these prefetch operations, see Prefetch memory on page C3-156. For other 
encodings of the "Rt" field, use <imm5>. 


<imm5> Is the prefetch operation encoding as an immediate, in the range 0 to 31, encoded in the "Rt" field. 
This syntax is only for encodings that are not accessible using <prfop>. 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 

<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded 
in the "imm9" field. 

Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
Operation

bits(64) address;
if n == 31 then
    address = SP[];
else
    address = X[n];

address = address + offset;

Prefetch(address, t<4:0>);
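As an illustration of the <prfop> encoding listed under Assembler symbols, the following Python sketch maps a 5-bit Rt value to its symbolic name. The function name decode_prfop is hypothetical and is not part of the architecture pseudocode. Values that have no <prfop> name are rendered in the #<imm5> form.

```python
def decode_prfop(rt: int) -> str:
    """Return the <prfop> name for a 5-bit Rt value, or '#<imm5>' if unnamed."""
    if not 0 <= rt <= 31:
        raise ValueError("Rt is a 5-bit field")
    types = {0b00: "PLD", 0b01: "PLI", 0b10: "PST"}      # Rt<4:3>
    targets = {0b00: "L1", 0b01: "L2", 0b10: "L3"}       # Rt<2:1>
    policies = {0: "KEEP", 1: "STRM"}                    # Rt<0>
    type_bits = (rt >> 3) & 0b11
    target_bits = (rt >> 1) & 0b11
    policy_bit = rt & 1
    # Encodings with type or target 0b11 have no symbolic name.
    if type_bits not in types or target_bits not in targets:
        return f"#{rt}"
    return types[type_bits] + targets[target_bits] + policies[policy_bit]
```

For example, Rt = 0b10101 decodes as PSTL3STRM, while Rt = 0b11000 can only be written as #24.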
C6.2.147 RBIT


Reverse Bits reverses the bit order in a register. 


| 31 | 30 | 29 | 28:21    | 20:16 | 15:10  | 9:5 | 4:0 |
| sf | 1  | 0  | 11010110 | 00000 | 000000 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0. 
RBIT <Wd>, <Wn> 
64-bit variant 
Applies when sf == 1. 


RBIT <Xd>, <Xn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


bits(datasize) operand = X[n]; 
bits(datasize) result; 


for i = 0 to datasize-1
    result<datasize-1-i> = operand<i>;


X[d] = result; 
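The loop above can be sketched in Python as follows. This is an illustrative model only: the function name rbit is an assumption, and register values are represented as non-negative Python integers.

```python
def rbit(operand: int, datasize: int = 64) -> int:
    """Reverse the bit order of a datasize-bit value."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    operand &= (1 << datasize) - 1
    result = 0
    for i in range(datasize):
        # Bit i of the operand becomes bit (datasize-1-i) of the result.
        if (operand >> i) & 1:
            result |= 1 << (datasize - 1 - i)
    return result
```

Applying the function twice returns the original value, which is a convenient sanity check.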
C6.2.148 RET


Return from subroutine branches unconditionally to an address in a register, with a hint that this is a subroutine 
return. 


| 31:25   | 24 | 23 | 22:21 | 20:16 | 15:10  | 9:5 | 4:0   |
| 1101011 | 0  | 0  | op=10 | 11111 | 000000 | Rn  | 00000 |


Integer variant 
RET {<Xn>} 


Decode for this encoding 


integer n = UInt(Rn); 


Assembler symbols 
<Xn> Is the 64-bit name of the general-purpose register holding the address to be branched to, encoded in 
the "Rn" field. Defaults to X30 if absent. 
Operation 
bits(64) target = X[n]; 


BranchTo(target, BranchType_RET);
C6.2.149 REV


Reverse Bytes reverses the byte order in a register. 


This instruction is used by the pseudo-instruction REV64. The pseudo-instruction is never the preferred
disassembly.


| 31 | 30 | 29 | 28:21    | 20:16 | 15:12 | 11:10  | 9:5 | 4:0 |
| sf | 1  | 0  | 11010110 | 00000 | 0000  | opc=1x | Rn  | Rd  |


32-bit variant 
Applies when sf == 0 && opc == 10.


REV <Wd>, <Wn> 


64-bit variant 
Applies when sf == 1 && opc == 11. 


REV <Xd>, <Xn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize = if sf == '1' then 64 else 32; 


integer container_size;
case opc of
    when '00'
        Unreachable();
    when '01'
        container_size = 16;
    when '10'
        container_size = 32;
    when '11'
        if sf == '0' then UnallocatedEncoding();
        container_size = 64;


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


Operation 


bits(datasize) operand = X[n]; 
bits(datasize) result; 


integer containers = datasize DIV container_size; 
integer elements_per_container = container_size DIV 8; 
integer index = 0;
integer rev_index;
for c = 0 to containers-1
    rev_index = index + ((elements_per_container - 1) * 8);
    for e = 0 to elements_per_container-1
        result<rev_index+7:rev_index> = operand<index+7:index>;
        index = index + 8;
        rev_index = rev_index - 8;


X[d] = result; 
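The container-based byte reversal shared by REV, REV16, and REV32 can be sketched in Python as follows. The function name reverse_bytes and its parameter order are assumptions made for illustration; the loop structure mirrors the pseudocode above.

```python
def reverse_bytes(operand: int, datasize: int, container_size: int) -> int:
    """Reverse the byte order inside each container_size-bit container."""
    if datasize not in (32, 64) or container_size not in (16, 32, 64):
        raise ValueError("unsupported datasize or container_size")
    if container_size > datasize:
        raise ValueError("container does not fit in the register")
    operand &= (1 << datasize) - 1
    result = 0
    index = 0
    elements_per_container = container_size // 8
    for _ in range(datasize // container_size):
        # Start writing at the most significant byte of this container.
        rev_index = index + (elements_per_container - 1) * 8
        for _ in range(elements_per_container):
            byte = (operand >> index) & 0xFF
            result |= byte << rev_index
            index += 8
            rev_index -= 8
    return result
```

With container_size equal to datasize this models REV; container sizes of 16 and 32 model REV16 and REV32 respectively.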
C6.2.150 REV16


Reverse bytes in 16-bit halfwords reverses the byte order in each 16-bit halfword of a register. 


| 31 | 30 | 29 | 28:21    | 20:16 | 15:12 | 11:10  | 9:5 | 4:0 |
| sf | 1  | 0  | 11010110 | 00000 | 0000  | opc=01 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0. 


REV16 <Wd>, <Wn> 


64-bit variant 
Applies when sf == 1. 


REV16 <Xd>, <Xn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize = if sf == '1' then 64 else 32; 


integer container_size;
case opc of
    when '00'
        Unreachable();
    when '01'
        container_size = 16;
    when '10'
        container_size = 32;
    when '11'
        if sf == '0' then UnallocatedEncoding();
        container_size = 64;


Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


bits(datasize) operand = X[n]; 
bits(datasize) result; 


integer containers = datasize DIV container_size; 
integer elements_per_container = container_size DIV 8; 
integer index = 0;
integer rev_index;
for c = 0 to containers-1
    rev_index = index + ((elements_per_container - 1) * 8);
    for e = 0 to elements_per_container-1
        result<rev_index+7:rev_index> = operand<index+7:index>;
        index = index + 8;
        rev_index = rev_index - 8;


X[d] = result; 
C6.2.151 REV32


Reverse bytes in 32-bit words reverses the byte order in each 32-bit word of a register. 


| 31   | 30 | 29 | 28:21    | 20:16 | 15:12 | 11:10  | 9:5 | 4:0 |
| sf=1 | 1  | 0  | 11010110 | 00000 | 0000  | opc=10 | Rn  | Rd  |


64-bit variant 


REV32 <Xd>, <Xn> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize = if sf == '1' then 64 else 32; 


integer container_size;
case opc of
    when '00'
        Unreachable();
    when '01'
        container_size = 16;
    when '10'
        container_size = 32;
    when '11'
        if sf == '0' then UnallocatedEncoding();
        container_size = 64;


Assembler symbols 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


bits(datasize) operand = X[n]; 
bits(datasize) result; 


integer containers = datasize DIV container_size; 
integer elements_per_container = container_size DIV 8; 
integer index = 0;
integer rev_index;
for c = 0 to containers-1
    rev_index = index + ((elements_per_container - 1) * 8);
    for e = 0 to elements_per_container-1
        result<rev_index+7:rev_index> = operand<index+7:index>;
        index = index + 8;
        rev_index = rev_index - 8;


X[d] = result; 
C6.2.152 REV64


Reverse Bytes reverses the byte order in a 64-bit general-purpose register. 

It is OPTIONAL whether an assembler supports this pseudo-instruction. 

This instruction is a pseudo-instruction of the REV instruction. This means that: 

• The encodings in this description are named to match the encodings of REV.

• The assembler syntax is used only for assembly, and is not used on disassembly.

• The description of REV gives the operational pseudocode for this instruction.


| 31   | 30 | 29 | 28:21    | 20:16 | 15:12 | 11:10  | 9:5 | 4:0 |
| sf=1 | 1  | 0  | 11010110 | 00000 | 0000  | opc=11 | Rn  | Rd  |


64-bit variant 
REV64 <Xd>, <Xn> 
is equivalent to 
REV <Xd>, <Xn> 


and is never the preferred disassembly. 


Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


Operation 


The description of REV gives the operational pseudocode for this instruction. 
C6.2.153 ROR (immediate)


Rotate right (immediate) provides the value of the contents of a register rotated by a variable number of bits. The 
bits that are rotated off the right end are inserted into the vacated bit positions on the left. 


This instruction is an alias of the EXTR instruction. This means that: 
• The encodings in this description are named to match the encodings of EXTR.

• The description of EXTR gives the operational pseudocode for this instruction.


| 31 | 30:29 | 28:23  | 22 | 21 | 20:16 | 15:10 | 9:5 | 4:0 |
| sf | 00    | 100111 | N  | 0  | Rm    | imms  | Rn  | Rd  |


32-bit variant 

Applies when sf == 0 && N == 0 && imms == 0xxxxx.
ROR <Wd>, <Ws>, #<shift> 

is equivalent to 

EXTR <Wd>, <Ws>, <Ws>, #<shift> 


and is the preferred disassembly when Rn == Rm. 


64-bit variant 

Applies when sf == 1 && N == 1.
ROR <Xd>, <Xs>, #<shift> 

is equivalent to 

EXTR <Xd>, <Xs>, <Xs>, #<shift> 


and is the preferred disassembly when Rn == Rm. 


Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Ws> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xs> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" and "Rm" fields. 


<shift> For the 32-bit variant: is the amount by which to rotate, in the range 0 to 31, encoded in the "imms" 
field. 


For the 64-bit variant: is the amount by which to rotate, in the range 0 to 63, encoded in the "imms" 
field. 


Operation 


The description of EXTR gives the operational pseudocode for this instruction. 
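The equivalence between ROR (immediate) and EXTR with both source operands equal can be sketched in Python. This model assumes the EXTR behavior described elsewhere in this manual, namely that the result is a datasize-bit field extracted at bit position <shift> from the concatenation of the two sources. The function names extr and ror_imm are hypothetical.

```python
def extr(operand1: int, operand2: int, lsb: int, datasize: int) -> int:
    """Extract datasize bits starting at lsb from operand1:operand2."""
    if datasize not in (32, 64) or not 0 <= lsb < datasize:
        raise ValueError("invalid datasize or lsb")
    mask = (1 << datasize) - 1
    concat = ((operand1 & mask) << datasize) | (operand2 & mask)
    return (concat >> lsb) & mask


def ror_imm(value: int, shift: int, datasize: int) -> int:
    """ROR (immediate) modelled as EXTR with Rn == Rm."""
    return extr(value, value, shift, datasize)
```

Because the same register supplies both halves of the concatenation, the bits shifted out at the right reappear at the left, which is exactly a rotate.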
C6.2.154 ROR (register)


Rotate Right (register) provides the value of the contents of a register rotated by a variable number of bits. The bits 
that are rotated off the right end are inserted into the vacated bit positions on the left. The remainder obtained by 
dividing the second source register by the data size defines the number of bits by which the first source register is 
right-shifted. 


This instruction is an alias of the RORV instruction. This means that: 


• The encodings in this description are named to match the encodings of RORV.
• The description of RORV gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28:21    | 20:16 | 15:12 | 11:10  | 9:5 | 4:0 |
| sf | 0  | 0  | 11010110 | Rm    | 0010  | op2=11 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0. 
ROR <Wd>, <Wn>, <Wm> 
is equivalent to 

RORV <Wd>, <Wn>, <Wm> 


and is always the preferred disassembly. 


64-bit variant 
Applies when sf == 1. 
ROR <Xd>, <Xn>, <Xm> 
is equivalent to 

RORV <Xd>, <Xn>, <Xm> 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to
31 in its bottom 5 bits, encoded in the "Rm" field.


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to
63 in its bottom 6 bits, encoded in the "Rm" field.


Operation 


The description of RORV gives the operational pseudocode for this instruction. 
C6.2.155 RORV


Rotate Right Variable provides the value of the contents of a register rotated by a variable number of bits. The bits 
that are rotated off the right end are inserted into the vacated bit positions on the left. The remainder obtained by 
dividing the second source register by the data size defines the number of bits by which the first source register is 
right-shifted. 


This instruction is used by the alias ROR (register). The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28:21    | 20:16 | 15:12 | 11:10  | 9:5 | 4:0 |
| sf | 0  | 0  | 11010110 | Rm    | 0010  | op2=11 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0. 
RORV <Wd>, <Wn>, <Wm> 
64-bit variant 
Applies when sf == 1. 


RORV <Xd>, <Xn>, <Xm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer datasize = if sf == '1' then 64 else 32; 

ShiftType shift_type = DecodeShift(op2); 

Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding a shift amount from 0 to 
31 in its bottom 5 bits, encoded in the "Rm" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register holding a shift amount from 0 to 
63 in its bottom 6 bits, encoded in the "Rm" field. 

Operation 


bits(datasize) result; 
bits(datasize) operand2 = X[m]; 


result = ShiftReg(n, shift_type, UInt(operand2) MOD datasize); 
X[d] = result; 
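A minimal Python sketch of the rotate performed by RORV follows. The function name rorv is an assumption, and the model inlines the rotate rather than calling ShiftReg. It shows that only the second operand modulo the data size affects the result.

```python
def rorv(operand1: int, operand2: int, datasize: int = 64) -> int:
    """Rotate operand1 right by (operand2 MOD datasize) bits."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    mask = (1 << datasize) - 1
    value = operand1 & mask
    amount = (operand2 & mask) % datasize
    if amount == 0:
        return value
    # Bits shifted out on the right re-enter on the left.
    return ((value >> amount) | (value << (datasize - amount))) & mask
```

A rotate amount of 40 on a 32-bit value therefore behaves like a rotate by 8.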





C6-662 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 


Non-Confidential ID092916 




C6.2.156 SBC 


Subtract with Carry subtracts a register value and the value of NOT (Carry flag) from a register value, and writes 
the result to the destination register. 


This instruction is used by the alias NGC. See Alias conditions for details of when each alias is preferred. 


| 31 | 30   | 29  | 28:21    | 20:16 | 15:10  | 9:5 | 4:0 |
| sf | op=1 | S=0 | 11010000 | Rm    | 000000 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0.


SBC <Wd>, <Wn>, <Wm> 


64-bit variant 
Applies when sf == 1.


SBC <Xd>, <Xn>, <Xm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Alias conditions 





Alias    Is preferred when
NGC      Rn == '11111'





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
Operation 


bits(datasize) result; 
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m]; 


operand2 = NOT(operand2); 







(result, -) = AddWithCarry(operand1, operand2, PSTATE.C); 


X[d] = result; 
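The operation above computes operand1 + NOT(operand2) + C, which equals operand1 - operand2 - NOT(C) in modular arithmetic. A small Python sketch, with the hypothetical function name sbc and the Carry flag passed as an integer 0 or 1, illustrates this:

```python
def sbc(operand1: int, operand2: int, carry: int, datasize: int = 64) -> int:
    """Model of SBC: operand1 + NOT(operand2) + carry, truncated to datasize."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if carry not in (0, 1):
        raise ValueError("carry must be 0 or 1")
    mask = (1 << datasize) - 1
    inverted = ~operand2 & mask          # NOT(operand2)
    return ((operand1 & mask) + inverted + carry) & mask
```

With the Carry flag set the result is a plain subtraction; with it clear, one extra unit is subtracted as the borrow.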
C6.2.157 SBCS


Subtract with Carry, setting flags, subtracts a register value and the value of NOT (Carry flag) from a register value, 
and writes the result to the destination register. It updates the condition flags based on the result. 


This instruction is used by the alias NGCS. See Alias conditions for details of when each alias is preferred. 


| 31 | 30   | 29  | 28:21    | 20:16 | 15:10  | 9:5 | 4:0 |
| sf | op=1 | S=1 | 11010000 | Rm    | 000000 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0.


SBCS <Wd>, <Wn>, <Wm> 


64-bit variant 
Applies when sf == 1.


SBCS <Xd>, <Xn>, <Xm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Alias conditions 





Alias    Is preferred when
NGCS     Rn == '11111'





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
Operation 


bits(datasize) result; 
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];
bits(4) nzcv;

operand2 = NOT(operand2);







(result, nzcv) = AddWithCarry(operand1, operand2, PSTATE.C); 
PSTATE.<N,Z,C,V> = nzcv; 


X[d] = result; 
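The flag-setting behavior can be illustrated with a Python sketch of an AddWithCarry-style helper. The function names add_with_carry and sbcs are assumptions, and the helper is a simplified model written for this example rather than the manual's own definition. Flags are returned as a tuple (N, Z, C, V).

```python
def add_with_carry(x: int, y: int, carry_in: int, datasize: int):
    """Return (result, (n, z, c, v)) for x + y + carry_in on datasize bits."""
    mask = (1 << datasize) - 1
    sign_bit = 1 << (datasize - 1)

    def to_signed(value: int) -> int:
        return value - (1 << datasize) if value & sign_bit else value

    x &= mask
    y &= mask
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & mask
    n = int(bool(result & sign_bit))
    z = int(result == 0)
    c = int(unsigned_sum != result)            # unsigned overflow (carry out)
    v = int(to_signed(result) != signed_sum)   # signed overflow
    return result, (n, z, c, v)


def sbcs(operand1: int, operand2: int, carry: int, datasize: int = 64):
    """Model of SBCS: add operand1, NOT(operand2) and the carry, with flags."""
    mask = (1 << datasize) - 1
    return add_with_carry(operand1, ~operand2 & mask, carry, datasize)
```

Note that for subtraction the C flag acts as an inverted borrow: it is set when no borrow occurs.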
C6.2.158 SBFIZ


Signed Bitfield Insert in Zero zeroes the destination register and copies any number of contiguous bits from a source 
register into any position in the destination register, sign-extending the most significant bit of the transferred value. 


This instruction is an alias of the SBFM instruction. This means that: 


• The encodings in this description are named to match the encodings of SBFM.
• The description of SBFM gives the operational pseudocode for this instruction.

| 31 | 30:29  | 28:23  | 22 | 21:16 | 15:10 | 9:5 | 4:0 |
| sf | opc=00 | 100110 | N  | immr  | imms  | Rn  | Rd  |


32-bit variant 

Applies when sf == 0 && N == 0.

SBFIZ <Wd>, <Wn>, #<lsb>, #<width>

is equivalent to

SBFM <Wd>, <Wn>, #(-<lsb> MOD 32), #(<width>-1)


and is the preferred disassembly when UInt(imms) < UInt(immr). 


64-bit variant 

Applies when sf == 1 && N == 1.

SBFIZ <Xd>, <Xn>, #<lsb>, #<width>

is equivalent to

SBFM <Xd>, <Xn>, #(-<lsb> MOD 64), #(<width>-1)


and is the preferred disassembly when UInt(imms) < UInt(immr). 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<lsb> For the 32-bit variant: is the bit number of the lsb of the destination bitfield, in the range 0 to 31.
For the 64-bit variant: is the bit number of the lsb of the destination bitfield, in the range 0 to 63.

<width> For the 32-bit variant: is the width of the bitfield, in the range 1 to 32-<lsb>.
For the 64-bit variant: is the width of the bitfield, in the range 1 to 64-<lsb>.


Operation 


The description of SBFM gives the operational pseudocode for this instruction. 
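The operand translation from SBFIZ to SBFM given in the equivalences above can be written as a short Python helper. The function name sbfiz_to_sbfm is hypothetical; it returns the pair (immr, imms) an assembler would place in the SBFM encoding.

```python
def sbfiz_to_sbfm(lsb: int, width: int, datasize: int = 64):
    """Translate SBFIZ #<lsb>, #<width> into SBFM (immr, imms)."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not 0 <= lsb < datasize:
        raise ValueError("lsb out of range")
    if not 1 <= width <= datasize - lsb:
        raise ValueError("width out of range")
    immr = (-lsb) % datasize   # -<lsb> MOD datasize
    imms = width - 1
    return immr, imms
```

For instance, SBFIZ W0, W1, #8, #4 corresponds to SBFM W0, W1, #24, #3.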
C6.2.159 SBFM
Signed Bitfield Move copies any number of low-order bits from a source register into the same number of adjacent 
bits at any position in the destination register, shifting in copies of the sign bit in the upper bits and zeros in the lower 
bits. 
This instruction is used by the aliases ASR (immediate), SBFIZ, SBFX, SXTB, SXTH, and SXTW. See Alias 
conditions on page C6-669 for details of when each alias is preferred. 
| 31 | 30:29  | 28:23  | 22 | 21:16 | 15:10 | 9:5 | 4:0 |
| sf | opc=00 | 100110 | N  | immr  | imms  | Rn  | Rd  |
32-bit variant 
Applies when sf == 0 && N == 0.
SBFM <Wd>, <Wn>, #<immr>, #<imms> 
64-bit variant 
Applies when sf == 1 && N == 1.
SBFM <Xd>, <Xn>, #<immr>, #<imms> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 
integer R; 
integer S; 
bits(datasize) wmask; 
bits(datasize) tmask; 
if sf == '1' && N != '1' then ReservedValue();
if sf == '0' && (N != '0' || immr<5> != '0' || imms<5> != '0') then ReservedValue();
R = UInt(immr); 
S = UInt(imms); 
(wmask, tmask) = DecodeBitMasks(N, imms, immr, FALSE); 
Alias conditions

Alias            Of variant   Is preferred when
ASR (immediate)  32-bit       imms == '011111'
ASR (immediate)  64-bit       imms == '111111'
SBFIZ            -            UInt(imms) < UInt(immr)
SBFX             -            BFXPreferred(sf, opc<1>, imms, immr)
SXTB             -            immr == '000000' && imms == '000111'
SXTH             -            immr == '000000' && imms == '001111'
SXTW             -            immr == '000000' && imms == '011111'





Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<immr> For the 32-bit variant: is the right rotate amount, in the range 0 to 31, encoded in the "immr" field. 
For the 64-bit variant: is the right rotate amount, in the range 0 to 63, encoded in the "immr" field. 

<imms> For the 32-bit variant: is the leftmost bit number to be moved from the source, in the range 0 to 31, 
encoded in the "imms" field. 
For the 64-bit variant: is the leftmost bit number to be moved from the source, in the range 0 to 63, 
encoded in the "imms" field. 

Operation 


bits(datasize) src = X[n]; 


// perform bitfield move on low bits 
bits(datasize) bot = ROR(src, R) AND wmask; 


// determine extension bits (sign, zero or dest register) 
bits(datasize) top = Replicate(src<S>); 


// combine extension bits and result bits 
X[d] = (top AND NOT(tmask)) OR (bot AND tmask); 
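The effect of SBFM can be restated in Python without the wmask and tmask values produced by DecodeBitMasks. The sketch below is an equivalent formulation chosen for readability, not the manual's algorithm: when imms >= immr it extracts and sign-extends a field (the SBFX and ASR case), otherwise it inserts the low imms+1 bits at bit position datasize-immr with sign extension above and zeros below (the SBFIZ case). The function names are hypothetical.

```python
def sign_extend(value: int, width: int, datasize: int) -> int:
    """Sign-extend the low `width` bits of value to `datasize` bits."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        value -= 1 << width
    return value & ((1 << datasize) - 1)


def sbfm(src: int, immr: int, imms: int, datasize: int = 64) -> int:
    """Behavioral model of SBFM on a datasize-bit register value."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not (0 <= immr < datasize and 0 <= imms < datasize):
        raise ValueError("immr and imms must be below datasize")
    mask = (1 << datasize) - 1
    src &= mask
    if imms >= immr:
        # Extract bits imms:immr and sign-extend them.
        return sign_extend(src >> immr, imms - immr + 1, datasize)
    # Insert bits imms:0 at position datasize-immr, zeros below.
    field = sign_extend(src, imms + 1, datasize)
    return (field << (datasize - immr)) & mask
```

Calling sbfm(x, 0, 7) models SXTB, and sbfm(x, 4, 31, 32) models a 32-bit ASR by 4.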
C6.2.160 SBFX


Signed Bitfield Extract extracts any number of adjacent bits at any position from a register, sign-extends them to 
the size of the register, and writes the result to the destination register. 


This instruction is an alias of the SBFM instruction. This means that: 
• The encodings in this description are named to match the encodings of SBFM.

• The description of SBFM gives the operational pseudocode for this instruction.


| 31 | 30:29  | 28:23  | 22 | 21:16 | 15:10 | 9:5 | 4:0 |
| sf | opc=00 | 100110 | N  | immr  | imms  | Rn  | Rd  |


32-bit variant 

Applies when sf == 0 && N == 0.

SBFX <Wd>, <Wn>, #<lsb>, #<width>

is equivalent to

SBFM <Wd>, <Wn>, #<lsb>, #(<lsb>+<width>-1)


and is the preferred disassembly when BFXPreferred(sf, opc<1>, imms, immr).


64-bit variant 

Applies when sf == 1 && N == 1.

SBFX <Xd>, <Xn>, #<lsb>, #<width>

is equivalent to

SBFM <Xd>, <Xn>, #<lsb>, #(<lsb>+<width>-1)


and is the preferred disassembly when BFXPreferred(sf, opc<1>, imms, immr).


Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


<lsb> For the 32-bit variant: is the bit number of the lsb of the source bitfield, in the range 0 to 31.
For the 64-bit variant: is the bit number of the lsb of the source bitfield, in the range 0 to 63.


<width> For the 32-bit variant: is the width of the bitfield, in the range 1 to 32-<lsb>.
For the 64-bit variant: is the width of the bitfield, in the range 1 to 64-<lsb>.


Operation 


The description of SBFM gives the operational pseudocode for this instruction. 
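The operand translation from SBFX to SBFM stated in the equivalences above can be expressed as a short Python helper. The name sbfx_to_sbfm is hypothetical; the function returns the (immr, imms) pair for the underlying SBFM encoding.

```python
def sbfx_to_sbfm(lsb: int, width: int, datasize: int = 64):
    """Translate SBFX #<lsb>, #<width> into SBFM (immr, imms)."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not 0 <= lsb < datasize:
        raise ValueError("lsb out of range")
    if not 1 <= width <= datasize - lsb:
        raise ValueError("width out of range")
    immr = lsb
    imms = lsb + width - 1     # bit number of the top bit of the field
    return immr, imms
```

For instance, SBFX W0, W1, #4, #8 corresponds to SBFM W0, W1, #4, #11.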
C6.2.161 SDIV


Signed Divide divides a signed integer register value by another signed integer register value, and writes the result 
to the destination register. The condition flags are not affected. 


| 31 | 30 | 29 | 28:21    | 20:16 | 15:10  | 9:5 | 4:0 |
| sf | 0  | 0  | 11010110 | Rm    | 000011 | Rn  | Rd  |


32-bit variant 
Applies when sf == 0.


SDIV <Wd>, <Wn>, <Wm> 


64-bit variant 
Applies when sf == 1.


SDIV <Xd>, <Xn>, <Xm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
Operation 

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];
integer result;

if IsZero(operand2) then
    result = 0;
else
    result = RoundTowardsZero(Real(Int(operand1, FALSE)) / Real(Int(operand2, FALSE)));


X[d] = result<datasize-1:0>; 
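Two consequences of the operation above are easy to miss: division by zero yields zero rather than an exception, and the quotient is rounded toward zero and then truncated to the register width. The following Python sketch, using the hypothetical function name sdiv, illustrates both. It uses integer arithmetic instead of the Real type so that 64-bit operands stay exact.

```python
def sdiv(operand1: int, operand2: int, datasize: int = 64) -> int:
    """Model of SDIV on datasize-bit two's complement register values."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    mask = (1 << datasize) - 1
    sign_bit = 1 << (datasize - 1)

    def to_signed(value: int) -> int:
        value &= mask
        return value - (1 << datasize) if value & sign_bit else value

    dividend = to_signed(operand1)
    divisor = to_signed(operand2)
    if divisor == 0:
        return 0
    # Python's // rounds toward negative infinity, so divide magnitudes
    # and reapply the sign to obtain round-toward-zero behavior.
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient & mask
```

Dividing the most negative value by -1 overflows the signed range, and the truncation returns the most negative value again.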
C6.2.162 SEV


Send Event is a hint instruction. It causes an event to be signaled to all PEs in the multiprocessor system. For more 
information, see Wait for Event mechanism and Send event on page D1-1599. 


| 31:22      | 21 | 20:19 | 18:16 | 15:12 | 11:8     | 7:5     | 4:0   |
| 1101010100 | 0  | 00    | 011   | 0010  | CRm=0000 | op2=100 | 11111 |


System variant 
SEV 
Decode for this encoding 


// Empty. 


Operation 


SendEvent(); 
C6.2.163 SEVL


Send Event Local is a hint instruction. It causes an event to be signaled locally without the requirement to affect 
other PEs in the multiprocessor system. It can prime a wait-loop which starts with a WFE instruction. 


| 31:22      | 21 | 20:19 | 18:16 | 15:12 | 11:8     | 7:5     | 4:0   |
| 1101010100 | 0  | 00    | 011   | 0010  | CRm=0000 | op2=101 | 11111 |


System variant 
SEVL 
Decode for this encoding 


// Empty. 


Operation 


EventRegisterSet(); 
C6.2.164 SMADDL


Signed Multiply-Add Long multiplies two 32-bit register values, adds a 64-bit register value, and writes the result 
to the 64-bit destination register. 


This instruction is used by the alias SMULL. See Alias conditions for details of when each alias is preferred. 


| 31 | 30:29 | 28:24 | 23  | 22:21 | 20:16 | 15   | 14:10 | 9:5 | 4:0 |
| 1  | 00    | 11011 | U=0 | 01    | Rm    | o0=0 | Ra    | Rn  | Rd  |

64-bit variant


SMADDL <Xd>, <Wn>, <Wm>, <Xa> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer a = UInt(Ra); 


Alias conditions 





Alias    Is preferred when
SMULL    Ra == '11111'





Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 
<Xa> Is the 64-bit name of the third general-purpose source register holding the addend, encoded in the 
"Ra" field. 
Operation 
bits(32) operand1 = X[n];
bits(32) operand2 = X[m];
bits(64) operand3 = X[a];

integer result;
result = Int(operand3, FALSE) + (Int(operand1, FALSE) * Int(operand2, FALSE));


X[d] = result<63:0>; 
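A Python sketch of the widening multiply-add follows. The function name smaddl is an assumption; register values are passed as non-negative integers and interpreted as two's complement.

```python
MASK64 = (1 << 64) - 1


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smaddl(wn: int, wm: int, xa: int) -> int:
    """Model of SMADDL: Xa + (signed Wn * signed Wm), truncated to 64 bits."""
    product = to_signed(wn, 32) * to_signed(wm, 32)
    return (to_signed(xa, 64) + product) & MASK64
```

Passing zero as the addend models the SMULL alias, in which Ra is the zero register.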
C6.2.165 SMC

Secure Monitor Call causes an exception to EL3.

SMC is available only for software executing at EL1 or higher. It is UNDEFINED in EL0.


If the values of HCR_EL2.TSC and SCR_EL3.SMD are both 0, execution of an SMC instruction at EL1 or higher 
generates a Secure Monitor Call exception, recording it in ESR_ELx, using the EC value 0x17, that is taken to EL3. 


If the value of HCR_EL2.TSC is 1, execution of an SMC instruction in a Non-secure EL1 state generates an exception
that is taken to EL2, regardless of the value of SCR_EL3.SMD. For more information, see Traps to EL2 of 
Non-secure EL1 execution of SMC instructions on page D1-1578. 


If the value of HCR_EL2.TSC is 0 and the value of SCR_EL3.SMD is 1, the SMC instruction is UNDEFINED. 


| 31:24    | 23:21 | 20:5  | 4:2 | 1:0 |
| 11010100 | 000   | imm16 | 000 | 11  |

System variant


SMC #<imm> 


Decode for this encoding 


bits(16) imm = imm16; 


Assembler symbols 


<imm> Is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field.


Operation 


if !HaveEL(EL3) || PSTATE.EL == EL0 then
    UnallocatedEncoding();

AArch64.CheckForSMCTrap(imm);

if SCR_EL3.SMD == '1' then
    // SMC disabled
    AArch64.UndefinedFault();
else
    AArch64.CallSecureMonitor(imm);
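The routing rules described above can be summarized in a small Python decision function. This is a simplified sketch under stated assumptions: the body of AArch64.CheckForSMCTrap is not shown in this section, so the trap condition is modelled from the prose (Non-secure EL1 with HCR_EL2.TSC set), the have_el2 parameter is an assumption, and the function name and the returned strings are hypothetical labels.

```python
def smc_outcome(have_el3: bool, el: int, non_secure: bool,
                hcr_el2_tsc: int, scr_el3_smd: int,
                have_el2: bool = True) -> str:
    """Classify what happens when SMC executes at Exception level `el`."""
    # No EL3, or execution at EL0: the instruction is UNDEFINED.
    if not have_el3 or el == 0:
        return "UNDEFINED"
    # Trap to EL2 takes priority over SCR_EL3.SMD (assumed trap condition).
    if have_el2 and non_secure and el == 1 and hcr_el2_tsc == 1:
        return "TRAP_TO_EL2"
    # SMC disabled by SCR_EL3.SMD.
    if scr_el3_smd == 1:
        return "UNDEFINED"
    return "SMC_TO_EL3"
```

The ordering of the checks matters: with both HCR_EL2.TSC and SCR_EL3.SMD set, a Non-secure EL1 caller is trapped to EL2 rather than seeing an UNDEFINED instruction.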
C6.2.166 SMNEGL

Signed Multiply-Negate Long multiplies two 32-bit register values, negates the product, and writes the result to the
64-bit destination register.

This instruction is an alias of the SMSUBL instruction. This means that: 

• The encodings in this description are named to match the encodings of SMSUBL.

• The description of SMSUBL gives the operational pseudocode for this instruction.

| 31 | 30:29 | 28:24 | 23  | 22:21 | 20:16 | 15   | 14:10    | 9:5 | 4:0 |
| 1  | 00    | 11011 | U=0 | 01    | Rm    | o0=1 | Ra=11111 | Rn  | Rd  |

64-bit variant 

SMNEGL <Xd>, <Wn>, <Wm> 

is equivalent to 

SMSUBL <Xd>, <Wn>, <Wm>, XZR 

and is always the preferred disassembly. 

Assembler symbols 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 

Operation 

The description of SMSUBL gives the operational pseudocode for this instruction. 



C6.2.167 SMSUBL 


Signed Multiply-Subtract Long multiplies two 32-bit register values, subtracts the product from a 64-bit register 
value, and writes the result to the 64-bit destination register. 


This instruction is used by the alias SMNEGL. See Alias conditions for details of when each alias is preferred. 


| 31 | 30:29 | 28:24 | 23  | 22:21 | 20:16 | 15   | 14:10 | 9:5 | 4:0 |
| 1  | 00    | 11011 | U=0 | 01    | Rm    | o0=1 | Ra    | Rn  | Rd  |

64-bit variant


SMSUBL <Xd>, <Wn>, <Wm>, <Xa> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer a = UInt(Ra); 


Alias conditions 





Alias     Is preferred when
SMNEGL    Ra == '11111'





Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


<Xa> Is the 64-bit name of the third general-purpose source register holding the minuend, encoded in the 
"Ra" field. 


Operation 


bits(32) operand1 = X[n]; 
bits(32) operand2 = X[m]; 
bits(64) operand3 = X[a]; 

integer result; 

result = Int(operand3, FALSE) - (Int(operand1, FALSE) * Int(operand2, FALSE)); 
X[d] = result<63:0>; 
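
The arithmetic above can be checked with a small model. The following Python sketch is illustrative only and is not part of the architecture pseudocode: the helper names sint, smsubl, smnegl, and preferred_mnemonic are hypothetical, and register values are passed as plain integers. It assumes that Int(x, FALSE) denotes the signed interpretation of a bitstring, and that XZR reads as zero.

```python
MASK64 = (1 << 64) - 1


def sint(value, width):
    """Interpret the low `width` bits of value as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def smsubl(wn, wm, xa):
    """Model of SMSUBL: Xa - (signed Wn * signed Wm), truncated to 64 bits."""
    result = sint(xa, 64) - sint(wn, 32) * sint(wm, 32)
    return result & MASK64  # corresponds to result<63:0>


def smnegl(wn, wm):
    """Model of the SMNEGL alias: SMSUBL with XZR (zero) as the minuend."""
    return smsubl(wn, wm, 0)


def preferred_mnemonic(ra):
    """Alias condition: SMNEGL is preferred when Ra == '11111'."""
    return "SMNEGL" if ra == 0b11111 else "SMSUBL"
```

Because the product of two signed 32-bit values always fits in 64 bits, only the final subtraction can wrap, which the mask models.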
C6.2.168 SMULH 


Signed Multiply High multiplies two 64-bit register values, and writes bits[127:64] of the 128-bit result to the 64-bit 
destination register. 


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |

64-bit variant 


SMULH <Xd>, <Xn>, <Xm> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xn> Is the 64-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Xm> Is the 64-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


Operation 
bits(64) operand1 = X[n]; 
bits(64) operand2 = X[m]; 

integer result; 
result = Int(operand1, FALSE) * Int(operand2, FALSE); 

X[d] = result<127:64>; 
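
As an illustration of the operation above, the following Python sketch computes the upper half of the signed 128-bit product. It is a model under stated assumptions, not the architectural definition: the function names sint and smulh are hypothetical, and Int(x, FALSE) is taken to be the signed interpretation of the operand.

```python
MASK64 = (1 << 64) - 1


def sint(value, width):
    """Interpret the low `width` bits of value as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def smulh(xn, xm):
    """Model of SMULH: bits [127:64] of the signed 128-bit product."""
    product = sint(xn, 64) * sint(xm, 64)
    # Python's >> on a negative integer is an arithmetic shift, so the
    # masked result equals bits [127:64] of the two's-complement product.
    return (product >> 64) & MASK64
```

A typical use is to pair this with an ordinary 64-bit multiply of the same operands, which supplies bits [63:0] of the product.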
C6.2.169 SMULL 

Signed Multiply Long multiplies two 32-bit register values, and writes the result to the 64-bit destination register. 

This instruction is an alias of the SMADDL instruction. This means that: 

• The encodings in this description are named to match the encodings of SMADDL. 

• The description of SMADDL gives the operational pseudocode for this instruction. 

|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |

64-bit variant 

SMULL <Xd>, <Wn>, <Wm> 

is equivalent to 

SMADDL <Xd>, <Wn>, <Wm>, XZR 

and is always the preferred disassembly. 

Assembler symbols 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 

Operation 

The description of SMADDL gives the operational pseudocode for this instruction. 
C6.2.170 STLR 


Store-Release Register stores a 32-bit word or a 64-bit doubleword to a memory location, from a register. The 
instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. For 
information about memory accesses, see Load/Store addressing modes on page C1-128. 


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |
size Rt2 


32-bit variant 

Applies when size == 10. 

STLR <Wt>, [<Xn|SP>{,#0}] 

64-bit variant 

Applies when size == 11. 

STLR <Xt>, [<Xn|SP>{,#0}] 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


integer elsize = 8 << UInt(size); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 


bits(64) address; 
bits(elsize) data; 
constant integer dbytes = elsize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

data = X[t]; 
Mem[address, dbytes, AccType_ORDERED] = data; 
C6.2.171 STLRB 


Store-Release Register Byte stores a byte from a 32-bit register to a memory location. The instruction also has 
memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. For information about 
memory accesses, see Load/Store addressing modes on page C1-128. 


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |
size 


No offset variant 


STLRB <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 


bits(64) address; 
bits(8) data; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

data = X[t]; 
Mem[address, 1, AccType_ORDERED] = data; 
C6.2.172 STLRH 


Store-Release Register Halfword stores a halfword from a 32-bit register to a memory location. The instruction also 
has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. For information about 
memory accesses, see Load/Store addressing modes on page C1-128. 


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |
size Rt2 


No offset variant 


STLRH <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Operation 


bits(64) address; 
bits(16) data; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

data = X[t]; 
Mem[address, 2, AccType_ORDERED] = data; 
C6.2.173 STLXP 


Store-Release Exclusive Pair of registers stores two 32-bit words or two 64-bit doublewords to a memory location 
if the PE has exclusive access to the memory address, from two registers, and returns a status value of 0 if the store 
was successful, or of 1 if no store was performed. See Synchronization and semaphores on page B2-108. A 32-bit 
pair requires the address to be doubleword aligned and is single-copy atomic at doubleword granularity. A 64-bit 
pair requires the address to be quadword aligned and, if the Store-Exclusive succeeds, it causes a single-copy atomic 
update of the 128-bit memory location being updated. The instruction also has memory ordering semantics as 
described in Load-Acquire, Store-Release on page B2-90. For information about memory accesses see Load/Store 
addressing modes on page C1-128. 


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |

32-bit variant 

Applies when sz == 0. 

STLXP <Ws>, <Wt1>, <Wt2>, [<Xn|SP>{,#0}] 

64-bit variant 

Applies when sz == 1. 

STLXP <Ws>, <Xt1>, <Xt2>, [<Xn|SP>{,#0}] 


Decode for all variants of this encoding 

integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer t2 = UInt(Rt2); // ignored by load/store single register 
integer s = UInt(Rs); // ignored by all loads and store-release 
integer elsize = 32 << UInt(sz); 
integer datasize = elsize * 2; 


Notes for all encodings 

For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STLXP on page K1-5488. 
Assembler symbols 


<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 





0 If the operation updates memory. 
1 If the operation fails to update memory. 
<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 
<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 
<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction: 
• Memory is not updated. 

• <Ws> is not updated. 


Accessing an address that is not aligned to the size of the data being accessed causes an Alignment fault Data Abort 
exception to be generated, subject to the following rules: 


• If AArch64.ExclusiveMonitorsPass() returns TRUE, the exception is generated. 
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated. 


If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a 
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 


Operation 


bits(64) address; 
bits(datasize) data; 
constant integer dbytes = datasize DIV 8; 
boolean rt_unknown = FALSE; 
boolean rn_unknown = FALSE; 

if s == t || (s == t2) then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rt_unknown = TRUE; // store UNKNOWN value 
        when Constraint_NONE rt_unknown = FALSE; // store original value 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if s == n && n != 31 then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rn_unknown = TRUE; // address is UNKNOWN 
        when Constraint_NONE rn_unknown = FALSE; // address is original base 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
elsif rn_unknown then 
    address = bits(64) UNKNOWN; 
else 
    address = X[n]; 

if rt_unknown then 
    data = bits(datasize) UNKNOWN; 
else 
    bits(datasize DIV 2) el1 = X[t]; 
    bits(datasize DIV 2) el2 = X[t2]; 
    data = if BigEndian() then el1:el2 else el2:el1; 

bit status = '1'; 
// Check whether the Exclusive Monitors are set to include the 
// physical memory locations corresponding to virtual address 
// range [address, address+dbytes-1]. 







if AArch64.ExclusiveMonitorsPass(address, dbytes) then 
    // This atomic write will be rejected if it does not refer 
    // to the same physical locations after address translation. 
    Mem[address, dbytes, AccType_ORDERED] = data; 
    status = ExclusiveMonitorsStatus(); 

X[s] = ZeroExtend(status, 32); 
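
The decode arithmetic, the pair concatenation, and the alignment requirement described above can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, and it assumes that the bitstring concatenation a:b places a in the more significant bits and that a big-endian Mem[] write places the most significant byte at the lowest address. It does not model the exclusive monitors or the release ordering.

```python
def stlxp_sizes(sz):
    """Return (elsize, datasize) in bits for the one-bit sz field."""
    elsize = 32 << sz
    return elsize, elsize * 2


def pack_pair(t1, t2, elsize, big_endian=False):
    """Return the bytes a successful store writes, lowest address first."""
    mask = (1 << elsize) - 1
    el1, el2 = t1 & mask, t2 & mask
    if big_endian:
        data = (el1 << elsize) | el2      # el1:el2
        order = "big"
    else:
        data = (el2 << elsize) | el1      # el2:el1
        order = "little"
    return data.to_bytes(2 * elsize // 8, order)


def is_pair_aligned(address, sz):
    """True if address is aligned to the size of the whole pair."""
    _, datasize = stlxp_sizes(sz)
    return address % (datasize // 8) == 0
```

Under these assumptions the first register of the pair occupies the lower addresses for either endianness; only the byte order within each element differs.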
C6.2.174 STLXR 
Store-Release Exclusive Register stores a 32-bit word or a 64-bit doubleword to memory if the PE has exclusive 
access to the memory address, from a register, and returns a status value of 0 if the store was successful, or of 1 
if no store was performed. See Synchronization and semaphores on page B2-108. The memory access is atomic. 
The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. 
For information about memory accesses see Load/Store addressing modes on page C1-128. 
|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |
| size | 0 0 1 0 0 0 | 0 | 0 | 0 | Rs | 1 | (1)(1)(1)(1)(1) | Rn | Rt |
size Rt2 
32-bit variant 
Applies when size == 10. 
STLXR <Ws>, <Wt>, [<Xn|SP>{,#0}] 
64-bit variant 
Applies when size == 11. 
STLXR <Ws>, <Xt>, [<Xn|SP>{,#0}] 
Decode for all variants of this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer s = UInt(Rs); // ignored by all loads and store-release 
integer elsize = 8 << UInt(size); 
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STLXR on page K1-5488. 
Assembler symbols 
<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 
0 If the operation updates memory. 
1 If the operation fails to update memory. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Aborts and alignment 
If a synchronous Data Abort exception is generated by the execution of this instruction: 
• Memory is not updated. 
• <Ws> is not updated. 
Accessing an address that is not aligned to the size of the data being accessed causes an Alignment fault Data Abort 
exception to be generated, subject to the following rules: 


• If AArch64.ExclusiveMonitorsPass() returns TRUE, the exception is generated. 
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated. 


If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a 
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 


Operation 


bits(64) address; 
bits(elsize) data; 
constant integer dbytes = elsize DIV 8; 
boolean rt_unknown = FALSE; 
boolean rn_unknown = FALSE; 

if s == t then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rt_unknown = TRUE; // store UNKNOWN value 
        when Constraint_NONE rt_unknown = FALSE; // store original value 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if s == n && n != 31 then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rn_unknown = TRUE; // address is UNKNOWN 
        when Constraint_NONE rn_unknown = FALSE; // address is original base 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
elsif rn_unknown then 
    address = bits(64) UNKNOWN; 
else 
    address = X[n]; 

if rt_unknown then 
    data = bits(elsize) UNKNOWN; 
else 
    data = X[t]; 

bit status = '1'; 
// Check whether the Exclusive Monitors are set to include the 
// physical memory locations corresponding to virtual address 
// range [address, address+dbytes-1]. 
if AArch64.ExclusiveMonitorsPass(address, dbytes) then 
    // This atomic write will be rejected if it does not refer 
    // to the same physical locations after address translation. 
    Mem[address, dbytes, AccType_ORDERED] = data; 
    status = ExclusiveMonitorsStatus(); 
X[s] = ZeroExtend(status, 32); 
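
The status result is normally consumed by a retry loop: software repeats the load-exclusive and store-exclusive pair until the status is 0. The following Python sketch shows that pattern against a deliberately simplified, single-PE toy monitor. The class ToyMonitor and the function atomic_increment are hypothetical names introduced for illustration; the model tracks only one marked address and does not represent the architectural monitor state machine, the release ordering, or the CONSTRAINED UNPREDICTABLE cases.

```python
class ToyMonitor:
    """Toy exclusive monitor: remembers one marked address."""

    def __init__(self):
        self.memory = {}
        self.marked = None  # address marked by the last load-exclusive

    def load_exclusive(self, address):
        self.marked = address
        return self.memory.get(address, 0)

    def other_observer_store(self, address, value):
        # A store by another observer clears a matching reservation.
        self.memory[address] = value
        if self.marked == address:
            self.marked = None

    def store_exclusive(self, address, value):
        """Return the status: 0 if memory was updated, 1 if it was not."""
        if self.marked != address:
            return 1
        self.memory[address] = value
        self.marked = None
        return 0


def atomic_increment(mon, address, interfere_once=False):
    """Retry until the store-exclusive succeeds; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        old = mon.load_exclusive(address)
        if interfere_once and attempts == 1:
            mon.other_observer_store(address, old + 100)
        if mon.store_exclusive(address, old + 1) == 0:
            return attempts
```

With no interference the loop completes in one attempt. When another observer writes the location between the load and the store, the first store-exclusive reports status 1 and the loop reloads the new value before retrying.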
C6.2.175 STLXRB 
Store-Release Exclusive Register Byte stores a byte from a 32-bit register to memory if the PE has exclusive access 
to the memory address, and returns a status value of 0 if the store was successful, or of 1 if no store was performed. 
See Synchronization and semaphores on page B2-108. The memory access is atomic. The instruction also has 
memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. For information about 
memory accesses see Load/Store addressing modes on page C1-128. 
|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |
size Rt2 
No offset variant 
STLXRB <Ws>, <Wt>, [<Xn|SP>{,#0}] 
Decode for this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer s = UInt(Rs); // ignored by all loads and store-release 
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STLXRB on page K1-5489. 
Assembler symbols 
<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 
0 If the operation updates memory. 
1 If the operation fails to update memory. 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Aborts 
If a synchronous Data Abort exception is generated by the execution of this instruction: 
• Memory is not updated. 
• <Ws> is not updated. 
If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a 
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 
Operation 
bits(64) address; 
bits(8) data; 
boolean rt_unknown = FALSE; 
boolean rn_unknown = FALSE; 

if s == t then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rt_unknown = TRUE; // store UNKNOWN value 
        when Constraint_NONE rt_unknown = FALSE; // store original value 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if s == n && n != 31 then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rn_unknown = TRUE; // address is UNKNOWN 
        when Constraint_NONE rn_unknown = FALSE; // address is original base 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
elsif rn_unknown then 
    address = bits(64) UNKNOWN; 
else 
    address = X[n]; 

if rt_unknown then 
    data = bits(8) UNKNOWN; 
else 
    data = X[t]; 

bit status = '1'; 
// Check whether the Exclusive Monitors are set to include the 
// physical memory locations corresponding to virtual address 
// range [address, address+dbytes-1]. 
if AArch64.ExclusiveMonitorsPass(address, 1) then 
    // This atomic write will be rejected if it does not refer 
    // to the same physical locations after address translation. 
    Mem[address, 1, AccType_ORDERED] = data; 
    status = ExclusiveMonitorsStatus(); 
X[s] = ZeroExtend(status, 32); 
C6.2.176 STLXRH 
Store-Release Exclusive Register Halfword stores a halfword from a 32-bit register to memory if the PE has 
exclusive access to the memory address, and returns a status value of 0 if the store was successful, or of 1 if no store 
was performed. See Synchronization and semaphores on page B2-108. The memory access is atomic. The 
instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. For 
information about memory accesses see Load/Store addressing modes on page C1-128. 
|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 | 10 9 | 5 4| 0 |
| size | 0 0 1 0 0 0 | 0 | 0 | 0 | Rs | 1 | (1)(1)(1)(1)(1) | Rn | Rt |
size Rt2 
No offset variant 
STLXRH <Ws>, <Wt>, [<Xn|SP>{,#0}] 
Decode for this encoding 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer s = UInt(Rs); // ignored by all loads and store-release 
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STLXRH on page K1-5489. 
Assembler symbols 
<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 
0 If the operation updates memory. 
1 If the operation fails to update memory. 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
Aborts and alignment 
If a synchronous Data Abort exception is generated by the execution of this instruction: 
• Memory is not updated. 
• <Ws> is not updated. 
A non halfword-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject 
to the following rules: 
• If AArch64.ExclusiveMonitorsPass() returns TRUE, the exception is generated. 
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated. 
If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a 
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 
Operation 

bits(64) address; 
bits(16) data; 
boolean rt_unknown = FALSE; 
boolean rn_unknown = FALSE; 

if s == t then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rt_unknown = TRUE; // store UNKNOWN value 
        when Constraint_NONE rt_unknown = FALSE; // store original value 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if s == n && n != 31 then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_UNKNOWN rn_unknown = TRUE; // address is UNKNOWN 
        when Constraint_NONE rn_unknown = FALSE; // address is original base 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
elsif rn_unknown then 
    address = bits(64) UNKNOWN; 
else 
    address = X[n]; 

if rt_unknown then 
    data = bits(16) UNKNOWN; 
else 
    data = X[t]; 

bit status = '1'; 
// Check whether the Exclusive Monitors are set to include the 
// physical memory locations corresponding to virtual address 
// range [address, address+dbytes-1]. 
if AArch64.ExclusiveMonitorsPass(address, 2) then 
    // This atomic write will be rejected if it does not refer 
    // to the same physical locations after address translation. 
    Mem[address, 2, AccType_ORDERED] = data; 
    status = ExclusiveMonitorsStatus(); 
X[s] = ZeroExtend(status, 32); 
C6.2.177 STNP 
Store Pair of Registers, with non-temporal hint, calculates an address from a base register value and an immediate 
offset, and stores two 32-bit words or two 64-bit doublewords to the calculated address, from two registers. For 
information about memory accesses, see Load/Store addressing modes on page C1-128. For information about 
Non-temporal pair instructions, see Load/Store Non-temporal Pair on page C3-149. 
|31 30 29 28|27 26 25 24|23 22 21 | 15 14 | 10 9 | 5 4| 0 |
| x 0 | 1 0 1 | 0 | 0 0 0 | 0 | imm7 | Rt2 | Rn | Rt |
opc L 
32-bit variant 
Applies when opc == 00. 
STNP <Wt1>, <Wt2>, [<Xn|SP>{, #<imm>}] 
64-bit variant 
Applies when opc == 10. 
STNP <Xt1>, <Xt2>, [<Xn|SP>{, #<imm>}] 
Decode for all variants of this encoding 
// Empty. 
Assembler symbols 
<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 
<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 
<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> For the 32-bit variant: is the optional signed immediate byte offset, a multiple of 4 in the range -256 
to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4. 
For the 64-bit variant: is the optional signed immediate byte offset, a multiple of 8 in the range -512 
to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8. 
Shared decode for all encodings 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer t2 = UInt(Rt2); 
if opc<0> == '1' then UnallocatedEncoding(); 
integer scale = 2 + UInt(opc<1>); 
integer datasize = 8 << scale; 
bits(64) offset = LSL(SignExtend(imm7, 64), scale); 
Operation 

bits(64) address; 
bits(datasize) data1; 
bits(datasize) data2; 
constant integer dbytes = datasize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

address = address + offset; 

data1 = X[t]; 
data2 = X[t2]; 
Mem[address, dbytes, AccType_STREAM] = data1; 
Mem[address+dbytes, dbytes, AccType_STREAM] = data2; 
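
The relationship between the assembler <imm> value and the "imm7" field can be illustrated with a short Python sketch. The function names encode_imm7 and decode_imm7 are hypothetical; the scale is derived as in the shared decode, scale = 2 + UInt(opc<1>), and opc is passed as a 2-bit integer. This is a sketch of the arithmetic only, not an assembler.

```python
def encode_imm7(imm, opc):
    """Return the 7-bit field for a byte offset, or raise ValueError."""
    scale = 2 + ((opc >> 1) & 1)
    step = 1 << scale  # 4 for the 32-bit variant, 8 for the 64-bit variant
    if imm % step != 0:
        raise ValueError("offset is not a multiple of %d" % step)
    scaled = imm // step
    if not -64 <= scaled <= 63:
        raise ValueError("offset is out of range")
    return scaled & 0x7F  # two's-complement 7-bit field


def decode_imm7(imm7, opc):
    """Return the byte offset: LSL(SignExtend(imm7, 64), scale)."""
    scale = 2 + ((opc >> 1) & 1)
    signed = imm7 - 128 if imm7 & 0x40 else imm7
    return signed << scale
```

The limits -64 and 63 on the scaled value correspond to the documented byte ranges of -256 to 252 for the 32-bit variant and -512 to 504 for the 64-bit variant.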


C6.2.178 STP 


Store Pair of Registers calculates an address from a base register value and an immediate offset, and stores two 
32-bit words or two 64-bit doublewords to the calculated address, from two registers. For information about 
memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index 


|31 30 29 28|27 26 25 24|23 22 21 | 15 14 | 10 9 | 5 4| 0 |
| x 0 | 1 0 1 | 0 | 0 0 1 | 0 | imm7 | Rt2 | Rn | Rt |
opc L 


32-bit variant 

Applies when opc == 00. 

STP <Wt1>, <Wt2>, [<Xn|SP>], #<imm> 

64-bit variant 

Applies when opc == 10. 

STP <Xt1>, <Xt2>, [<Xn|SP>], #<imm> 

Decode for all variants of this encoding 

boolean wback = TRUE; 
boolean postindex = TRUE; 


Pre-index 


|31 30 29 28|27 26 25 24|23 22 21 | 15 14 | 10 9 | 5 4| 0 |
| x 0 | 1 0 1 | 0 | 0 1 1 | 0 | imm7 | Rt2 | Rn | Rt |
opc L 


32-bit variant 

Applies when opc == 00. 

STP <Wt1>, <Wt2>, [<Xn|SP>, #<imm>]! 

64-bit variant 

Applies when opc == 10. 

STP <Xt1>, <Xt2>, [<Xn|SP>, #<imm>]! 

Decode for all variants of this encoding 

boolean wback = TRUE; 
boolean postindex = FALSE; 


Signed offset 


|31 30 29 28|27 26 25 24|23 22 21 | 15 14 | 10 9 | 5 4| 0 |
| x 0 | 1 0 1 | 0 | 0 1 0 | 0 | imm7 | Rt2 | Rn | Rt |
opc L 

32-bit variant 
Applies when opc == 00. 
STP <Wt1>, <Wt2>, [<Xn|SP>{, #<imm>}] 
64-bit variant 
Applies when opc == 10. 
STP <Xt1>, <Xt2>, [<Xn|SP>{, #<imm>}] 
Decode for all variants of this encoding 
boolean wback = FALSE; 
boolean postindex = FALSE; 
Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STP on page K1-5488. 


Assembler symbols 


<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 

<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 

<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field. 

<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2" 
field. 

<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 

<imm> For the 32-bit post-index and 32-bit pre-index variant: is the signed immediate byte offset, a 
multiple of 4 in the range -256 to 252, encoded in the "imm7" field as <imm>/4. 


For the 32-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 4 in 
the range -256 to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4. 


For the 64-bit post-index and 64-bit pre-index variant: is the signed immediate byte offset, a 
multiple of 8 in the range -512 to 504, encoded in the "imm7" field as <imm>/8. 


For the 64-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 8 in 
the range -512 to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8. 


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer t2 = UInt(Rt2); 
if L:opc<0> == '01' || opc == '11' then UnallocatedEncoding(); 
integer scale = 2 + UInt(opc<1>); 
integer datasize = 8 << scale; 
bits(64) offset = LSL(SignExtend(imm7, 64), scale); 


Operation for all encodings 


bits(64) address; 
bits(datasize) data1; 
bits(datasize) data2; 
constant integer dbytes = datasize DIV 8; 
boolean rt_unknown = FALSE; 

if wback && (t == n || t2 == n) && n != 31 then 
    Constraint c = ConstrainUnpredictable(); 
    assert c IN {Constraint_NONE, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP}; 
    case c of 
        when Constraint_NONE rt_unknown = FALSE; // value stored is pre-writeback 
        when Constraint_UNKNOWN rt_unknown = TRUE; // value stored is UNKNOWN 
        when Constraint_UNDEF UnallocatedEncoding(); 
        when Constraint_NOP EndOfInstruction(); 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

if !postindex then 
    address = address + offset; 

if rt_unknown && t == n then 
    data1 = bits(datasize) UNKNOWN; 
else 
    data1 = X[t]; 
if rt_unknown && t2 == n then 
    data2 = bits(datasize) UNKNOWN; 
else 
    data2 = X[t2]; 
Mem[address, dbytes, AccType_NORMAL] = data1; 
Mem[address+dbytes, dbytes, AccType_NORMAL] = data2; 

if wback then 
    if postindex then 
        address = address + offset; 
    if n == 31 then 
        SP[] = address; 
    else 
        X[n] = address; 
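
The three addressing forms differ only in the writeback and postindex decode flags. The following Python sketch traces the address arithmetic of the operation above for each form. The function name stp_addresses and its parameter names are hypothetical, and the sketch models only the address computation with 64-bit wraparound, not the memory accesses or the CONSTRAINED UNPREDICTABLE cases.

```python
MASK64 = (1 << 64) - 1


def stp_addresses(base, offset, dbytes, writeback, postindex):
    """Return (first_address, second_address, new_base)."""
    address = base
    if not postindex:
        address = (address + offset) & MASK64
    first = address
    second = (address + dbytes) & MASK64
    new_base = base
    if writeback:
        if postindex:
            address = (address + offset) & MASK64
        new_base = address
    return first, second, new_base
```

For a 64-bit pair (dbytes of 8) with a base of 0x1000 and an offset of -16, the pre-index form stores at 0xFF0 and 0xFF8 and updates the base to 0xFF0, the post-index form stores at 0x1000 and 0x1008 and then updates the base to 0xFF0, and the signed offset form stores at 0xFF0 and 0xFF8 and leaves the base unchanged.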
C6.2.179 STR (immediate) 


Store Register (immediate) stores a word or a doubleword from a register to memory. The address that is used for 
the store is calculated from a base register and an immediate offset. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


Post-index 
|31 30 29 28|27 26 25 24|23 22 21 20| 12|11 10 9 | 5 4| 0 |
| 1 x | 1 1 1 | 0 | 0 0 | 0 0 | 0 | imm9 | 0 1 | Rn | Rt |
size opc 


32-bit variant 
Applies when size == 10. 


STR <Wt>, [<Xn|SP>], #<simm> 


64-bit variant 
Applies when size == 11. 


STR <Xt>, [<Xn|SP>], #<simm> 


Decode for all variants of this encoding 


boolean wback = TRUE; 
boolean postindex = TRUE; 
integer scale = UInt(size); 
bits(64) offset = SignExtend(imm9, 64); 


Pre-index 
|31 30 29 28|27 26 25 24|23 22 21 20| 12|11 10 9 | 5 4| 0 |
| 1 x | 1 1 1 | 0 | 0 0 | 0 0 | 0 | imm9 | 1 1 | Rn | Rt |
size opc 


32-bit variant 
Applies when size == 10. 


STR <Wt>, [<Xn|SP>, #<simm>]! 


64-bit variant 
Applies when size == 11. 


STR <Xt>, [<Xn|SP>, #<simm>]! 


Decode for all variants of this encoding 


boolean wback = TRUE; 
boolean postindex = FALSE; 
integer scale = UInt(size); 
bits(64) offset = SignExtend(imm9, 64); 
Unsigned offset 

|31 30 29 28|27 26 25 24|23 22 21 | 10 9 | 5 4| 0 |
| 1 x | 1 1 1 | 0 | 0 1 | 0 0 | imm12 | Rn | Rt |
size opc 


32-bit variant 
Applies when size == 10. 


STR <Wt>, [<Xn|SP>{, #<pimm>}] 


64-bit variant 
Applies when size == 11. 


STR <Xt>, [<Xn|SP>{, #<pimm>}] 


Decode for all variants of this encoding 


boolean wback = FALSE; 
boolean postindex = FALSE; 
integer scale = UInt(size); 
bits(64) offset = LSL(ZeroExtend(imm12, 64), scale); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> For the 32-bit variant: is the optional positive immediate byte offset, a multiple of 4 in the range 0
to 16380, defaulting to 0 and encoded in the "imm12" field as <pimm>/4.
For the 64-bit variant: is the optional positive immediate byte offset, a multiple of 8 in the range 0
to 32760, defaulting to 0 and encoded in the "imm12" field as <pimm>/8.


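
The scaling of <pimm> into the "imm12" field can be illustrated with a short Python sketch. The function name encode_str_unsigned_offset and its parameter names are hypothetical and exist only for this illustration. The sketch assumes the Unsigned offset field layout: size in bits 31:30, imm12 in bits 21:10, Rn in bits 9:5, and Rt in bits 4:0.

```python
def encode_str_unsigned_offset(rt: int, rn: int, pimm: int = 0, is64: bool = True) -> int:
    """Assemble STR <Wt|Xt>, [<Xn|SP>{, #<pimm>}] into a 32-bit instruction word.

    Illustrative only: rt and rn are register numbers 0..31, where 31 as rn
    selects SP for this instruction.
    """
    if not (0 <= rt <= 31 and 0 <= rn <= 31):
        raise ValueError("register number out of range")
    size = 0b11 if is64 else 0b10      # size == 11 selects the 64-bit variant
    scale = size                       # scale = UInt(size)
    if pimm < 0 or pimm % (1 << scale) != 0:
        raise ValueError("pimm must be a non-negative multiple of the access size")
    imm12 = pimm >> scale              # stored as <pimm>/4 or <pimm>/8
    if imm12 > 0xFFF:
        raise ValueError("pimm out of range for imm12")
    # Fixed bits: 29:27 = 111, 26 = 0, 25:24 = 01, 23:22 (opc) = 00
    return (size << 30) | (0b111 << 27) | (0b01 << 24) | (imm12 << 10) | (rn << 5) | rt
```

For example, under these assumptions STR X0, [SP, #8] assembles to 0xF90007E0, because the byte offset 8 is stored as imm12 = 1.
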
Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


integer datasize = 8 << scale; 


Operation for all encodings 


bits(64) address;
bits(datasize) data;
boolean rt_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_NONE    rt_unknown = FALSE;    // value stored is original value
        when Constraint_UNKNOWN rt_unknown = TRUE;     // value stored is UNKNOWN
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

if rt_unknown then
    data = bits(datasize) UNKNOWN;
else
    data = X[t];
Mem[address, datasize DIV 8, AccType_NORMAL] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
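
The difference between the three addressing forms can be seen in a small Python model of the operation above. This is a simplified sketch, not the architectural definition: the name store_immediate, the dictionary-based register file and memory, and the use of index 31 for SP are assumptions made for illustration. It omits the SP alignment check and the CONSTRAINED UNPREDICTABLE case (n == t with write-back), and it assumes little-endian byte order.

```python
MASK64 = (1 << 64) - 1

def store_immediate(regs, mem, t, n, offset, datasize, wback, postindex):
    """Model the address calculation and write-back of an immediate-offset store.

    regs: dict mapping register number -> 64-bit value (31 stands for SP here)
    mem:  dict mapping byte address -> byte value
    Returns the address that was used for the store.
    """
    address = regs[n]
    if not postindex:
        # Pre-index and unsigned offset: the offset is applied before the access
        address = (address + offset) & MASK64
    store_address = address
    data = regs[t] & ((1 << datasize) - 1)
    for i in range(datasize // 8):
        # Little-endian: least significant byte at the lowest address
        mem[(store_address + i) & MASK64] = (data >> (8 * i)) & 0xFF
    if wback:
        if postindex:
            # Post-index: the offset is applied after the access
            address = (address + offset) & MASK64
        regs[n] = address
    return store_address
```

In this model a post-index store uses the unmodified base and then updates it, a pre-index store uses and writes back base + offset, and the unsigned offset form leaves the base register unchanged.
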
C6.2.180 STR (register)
Store Register (register) calculates an address from a base register value and an offset register value, and stores a 
32-bit word or a 64-bit doubleword to the calculated address, from a register. For information about memory 
accesses, see Load/Store addressing modes on page C1-128. 
The instruction uses an offset addressing mode, that calculates the address used for the memory access from a base 
register value and an offset register value. The offset can be optionally shifted and extended. 
| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  1  x |  1  1  1 |  0 |  0  0 |  0  0 |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                             opc
32-bit variant 
Applies when size == 10. 
STR <Wt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}] 
64-bit variant 
Applies when size == 11. 
STR <Xt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}] 
Decode for all variants of this encoding 
integer scale = UInt(size); 
if option<1> == '0' then UnallocatedEncoding(); // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then scale else 0; 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted. It is encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> For the 32-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#2 when S = 1
For the 64-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#3 when S = 1


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 


integer datasize = 8 << scale; 


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift);
bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, datasize DIV 8, AccType_NORMAL] = data;
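
How the "option" and "S" fields turn the index register into a 64-bit offset can be sketched in Python. This is an illustrative approximation of the extend-then-shift behavior described above, not a transcription of the manual's ExtendReg() function; the name extend_reg and its parameters are hypothetical.

```python
MASK64 = (1 << 64) - 1

def extend_reg(value: int, option: int, shift: int) -> int:
    """Return the 64-bit offset for a register-offset store.

    value:  contents of the index register Rm
    option: 3-bit "option" field (0b010 UXTW, 0b011 LSL, 0b110 SXTW, 0b111 SXTX)
    shift:  shift amount in bits (0, or log2 of the access size when S == 1)
    """
    if (option & 0b010) == 0:
        raise ValueError("option<1> == '0' is an unallocated encoding")
    width = 64 if (option & 0b001) else 32      # option<0> selects Xm or Wm
    value &= (1 << width) - 1                   # keep only the selected register width
    if (option & 0b100) and (value >> (width - 1)):
        value -= 1 << width                     # option<2> selects sign extension
    return (value << shift) & MASK64
```

For instance, with Wm holding 0xFFFFFFFF, SXTW #2 gives an offset of -4 (0xFFFFFFFFFFFFFFFC as a 64-bit value), whereas UXTW #2 gives 0x3FFFFFFFC.
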
C6.2.181 STRB (immediate)


Store Register Byte (immediate) stores the least significant byte of a 32-bit register to memory. The address that is 
used for the store is calculated from a base register and an immediate offset. For information about memory 
accesses, see Load/Store addressing modes on page C1-128. 


Post-index 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                             opc


Post-index variant 


STRB <Wt>, [<Xn|SP>], #<simm> 


Decode for this encoding 
boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                             opc


Pre-index variant 


STRB <Wt>, [<Xn|SP>, #<simm>]!


Decode for this encoding 
boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21         10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  1 |  0  0 |     imm12     |   Rn   |   Rt   |
  size                             opc


Unsigned offset variant 


STRB <Wt>, [<Xn|SP>{, #<pimm>}] 


Decode for this encoding 


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 0);


Notes for all encodings


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STRB (immediate) on page K1-5490. 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, in the range 0 to 4095, defaulting to 0 and encoded
in the "imm12" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation for all encodings 


bits(64) address;
bits(8) data;
boolean rt_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_NONE    rt_unknown = FALSE;    // value stored is original value
        when Constraint_UNKNOWN rt_unknown = TRUE;     // value stored is UNKNOWN
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

if rt_unknown then
    data = bits(8) UNKNOWN;
else
    data = X[t];
Mem[address, 1, AccType_NORMAL] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C6-703 


Non-Confidential 


C6 A64 Base Instruction Descriptions 
C6.2 Alphabetical list of A64 base instructions 





C6.2.182 STRB (register) 
Store Register Byte (register) calculates an address from a base register value and an offset register value, and stores 
a byte from a 32-bit register to the calculated address. For information about memory accesses, see Load/Store 
addressing modes on page C1-128. 
The instruction uses an offset addressing mode, that calculates the address used for the memory access from a base 
register value and an offset register value. The offset can be optionally shifted and extended. 
| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  0 |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                             opc
Extended register variant 
Applies when option != 011.
STRB <Wt>, [<Xn|SP>, (<Wm>|<Xm>), <extend> {<amount>}]
Shifted register variant
Applies when option == 011.
STRB <Wt>, [<Xn|SP>, <Xm>{, LSL <amount>}]
Decode for all variants of this encoding
if option<1> == '0' then UnallocatedEncoding(); // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field. 
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the 
"Rm" field. 
<extend> Is the index extend specifier, encoded in the "option" field. It can have the following values: 
UXTW when option = 010 
SXTW when option = 110 
SXTX when option = 111 
<amount> Is the index shift amount. It must be #0, encoded in "S" as 0 if omitted, or as 1 if present.
Shared decode for all encodings 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 
Operation


bits(64) offset = ExtendReg(m, extend_type, 0);
bits(64) address;
bits(8) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, 1, AccType_NORMAL] = data;
C6.2.183 STRH (immediate)


Store Register Halfword (immediate) stores the least significant halfword of a 32-bit register to memory. The 
address that is used for the store is calculated from a base register and an immediate offset. For information about 
memory accesses, see Load/Store addressing modes on page C1-128. 


Post-index 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  0  1 |   Rn   |   Rt   |
  size                             opc


Post-index variant 


STRH <Wt>, [<Xn|SP>], #<simm> 


Decode for this encoding 
boolean wback = TRUE;
boolean postindex = TRUE;
bits(64) offset = SignExtend(imm9, 64);


Pre-index 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  1  1 |   Rn   |   Rt   |
  size                             opc


Pre-index variant 


STRH <Wt>, [<Xn|SP>, #<simm>]!


Decode for this encoding 
boolean wback = TRUE;
boolean postindex = FALSE;
bits(64) offset = SignExtend(imm9, 64);


Unsigned offset 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21         10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  1 |  0  0 |     imm12     |   Rn   |   Rt   |
  size                             opc


Unsigned offset variant 


STRH <Wt>, [<Xn|SP>{, #<pimm>}] 


Decode for this encoding 


boolean wback = FALSE;
boolean postindex = FALSE;
bits(64) offset = LSL(ZeroExtend(imm12, 64), 1);







Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STRH (immediate) on page K1-5490. 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> Is the optional positive immediate byte offset, a multiple of 2 in the range 0 to 8190, defaulting to 0
and encoded in the "imm12" field as <pimm>/2.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation for all encodings 


bits(64) address;
bits(16) data;
boolean rt_unknown = FALSE;

if wback && n == t && n != 31 then
    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_NONE    rt_unknown = FALSE;    // value stored is original value
        when Constraint_UNKNOWN rt_unknown = TRUE;     // value stored is UNKNOWN
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

if rt_unknown then
    data = bits(16) UNKNOWN;
else
    data = X[t];
Mem[address, 2, AccType_NORMAL] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
C6.2.184 STRH (register)
Store Register Halfword (register) calculates an address from a base register value and an offset register value, and 
stores a halfword from a 32-bit register to the calculated address. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 
The instruction uses an offset addressing mode, that calculates the address used for the memory access from a base 
register value and an offset register value. The offset can be optionally shifted and extended. 
| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20  16 | 15  13 | 12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  0 |  1 |   Rm   | option |  S |  1  0 |   Rn   |   Rt   |
  size                             opc
32-bit variant 
STRH <Wt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}] 
Decode for this encoding 
if option<1> == '0' then UnallocatedEncoding(); // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then 1 else 0; 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> Is the index extend/shift specifier, defaulting to LSL, and which must be omitted for the LSL option
when <amount> is omitted. It is encoded in the "option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> Is the index shift amount, optional only when <extend> is not LSL. Where it is permitted to be
optional, it defaults to #0. It is encoded in the "S" field. It can have the following values:
#0 when S = 0
#1 when S = 1 
Shared decode for all encodings 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer m = UInt(Rm); 
Operation


bits(64) offset = ExtendReg(m, extend_type, shift);
bits(64) address;
bits(16) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, 2, AccType_NORMAL] = data;
C6.2.185 STTR


Store Register (unprivileged) stores a word or doubleword from a register to memory. The address that is used for 
the store is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  1  x |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  1  0 |   Rn   |   Rt   |
  size                             opc


32-bit variant 

Applies when size == 10. 

STTR <Wt>, [<Xn|SP>{, #<simm>}] 

64-bit variant 

Applies when size == 11. 

STTR <Xt>, [<Xn|SP>{, #<simm>}] 

Decode for all variants of this encoding 
integer scale = UInt(size); 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


integer datasize = 8 << scale; 


Operation 


bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, datasize DIV 8, AccType_UNPRIV] = data;
C6.2.186 STTRB


Store Register Byte (unprivileged) stores a byte from a 32-bit register to memory. The address that is used for the 
store is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  1  0 |   Rn   |   Rt   |
  size                             opc


Unscaled offset variant 


STTRB <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address;
bits(8) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, 1, AccType_UNPRIV] = data;







C6.2.187  STTRH 


Store Register Halfword (unprivileged) stores a halfword from a 32-bit register to memory. The address that is used 
for the store is calculated from a base register and an immediate offset. 


When executed at EL1, the memory access is restricted as if execution was at EL0. Otherwise, the access permission
is for the Exception level at which the instruction is executed. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  1  0 |   Rn   |   Rt   |
  size                             opc


Unscaled offset variant 


STTRH <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address;
bits(16) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, 2, AccType_UNPRIV] = data;
C6.2.188 STUR
Store Register (unscaled) calculates an address from a base register value and an immediate offset, and stores a 
32-bit word or a 64-bit doubleword to the calculated address, from a register. For information about memory 
accesses, see Load/Store addressing modes on page C1-128. 
| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  1  x |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  0  0 |   Rn   |   Rt   |
  size                             opc
32-bit variant 
Applies when size == 10. 
STUR <Wt>, [<Xn|SP>{, #<simm>}] 
64-bit variant 
Applies when size == 11. 
STUR <Xt>, [<Xn|SP>{, #<simm>}] 
Decode for all variants of this encoding 
integer scale = UInt(size); 
bits(64) offset = SignExtend(imm9, 64); 
Assembler symbols 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded 
in the "imm9" field. 
Shared decode for all encodings 
integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer datasize = 8 << scale; 
Operation 
bits(64) address;
bits(datasize) data;
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];
address = address + offset;
data = X[t];
Mem[address, datasize DIV 8, AccType_NORMAL] = data;
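
The SignExtend(imm9, 64) step in the decode can be illustrated by extracting the byte offset from an instruction word in Python. The helper names sign_extend and decode_unscaled_offset are hypothetical; the sketch assumes only that imm9 occupies bits 20:12 of the instruction, and it returns the offset as a signed Python integer rather than a 64-bit bit string.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):          # top bit set: the value is negative
        return value - (1 << bits)
    return value

def decode_unscaled_offset(insn: int) -> int:
    """Return the signed byte offset held in the imm9 field (bits 20:12)."""
    imm9 = (insn >> 12) & 0x1FF
    return sign_extend(imm9, 9)
```

Under these assumptions the word 0xF81F8020, which corresponds to STUR X0, [X1, #-8], decodes to an offset of -8. Unlike the unsigned offset form of STR, the offset is not scaled by the access size.
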
C6.2.189 STURB


Store Register Byte (unscaled) calculates an address from a base register value and an immediate offset, and stores 
a byte to the calculated address, from a 32-bit register. For information about memory accesses, see Load/Store 
addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  0 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  0  0 |   Rn   |   Rt   |
  size                             opc


Unscaled offset variant 


STURB <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address;
bits(8) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, 1, AccType_NORMAL] = data;







C6.2.190 STURH 


Store Register Halfword (unscaled) calculates an address from a base register value and an immediate offset, and 
stores a halfword to the calculated address, from a 32-bit register. For information about memory accesses, see 
Load/Store addressing modes on page C1-128. 


| 31 30 | 29 28 27 | 26 | 25 24 | 23 22 | 21 | 20      12 | 11 10 | 9    5 | 4    0 |
|  0  1 |  1  1  1 |  0 |  0  0 |  0  0 |  0 |    imm9    |  0  0 |   Rn   |   Rt   |
  size                             opc


Unscaled offset variant 


STURH <Wt>, [<Xn|SP>{, #<simm>}] 


Decode for this encoding 


bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 
integer t = UInt(Rt); 


Operation 


bits(64) address;
bits(16) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data = X[t];
Mem[address, 2, AccType_NORMAL] = data;
C6.2.191 STXP


Store Exclusive Pair of registers stores two 32-bit words or two 64-bit doublewords from two registers to a memory 
location if the PE has exclusive access to the memory address, and returns a status value of 0 if the store was 
successful, or of 1 if no store was performed. See Synchronization and semaphores on page B2-108. A 32-bit pair
requires the address to be doubleword aligned and is single-copy atomic at doubleword granularity. A 64-bit pair 
requires the address to be quadword aligned and, if the Store-Exclusive succeeds, it causes a single-copy atomic 
update of the 128-bit memory location being updated. For information about memory accesses see Load/Store 
addressing modes on page C1-128. 


| 31 | 30 | 29       24 | 23 | 22 | 21 | 20  16 | 15 | 14  10 | 9    5 | 4    0 |
|  1 | sz | 0 0 1 0 0 0 |  0 |  0 |  1 |   Rs   |  0 |  Rt2   |   Rn   |   Rt   |
                          o2    L   o1            o0


32-bit variant 

Applies when sz == 0.

STXP <Ws>, <Wt1>, <Wt2>, [<Xn|SP>{,#0}]
64-bit variant

Applies when sz == 1.


STXP <Ws>, <Xt1>, <Xt2>, [<Xn|SP>{,#0}]


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer t2 = UInt(Rt2); // ignored by load/store single register 
integer s = UInt(Rs); // ignored by all loads and store-release 


integer elsize = 32 << UInt(sz); 

integer datasize = elsize * 2;
Notes for all encodings 
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STXP on page K1-5490. 
Assembler symbols 


<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive
is written, encoded in the "Rs" field. The value returned is:
0 If the operation updates memory.
1 If the operation fails to update memory.
<Xt1> Is the 64-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Xt2> Is the 64-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.
<Wt1> Is the 32-bit name of the first general-purpose register to be transferred, encoded in the "Rt" field.
<Wt2> Is the 32-bit name of the second general-purpose register to be transferred, encoded in the "Rt2"
field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.


Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction:
• Memory is not updated.
• <Ws> is not updated.


Accessing an address that is not aligned to the size of the data being accessed causes an Alignment fault Data Abort
exception to be generated, subject to the following rules:


• If AArch64.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.


If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.


Operation 


bits(64) address;
bits(datasize) data;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;
boolean rn_unknown = FALSE;

if s == t || (s == t2) then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rt_unknown = TRUE;     // store UNKNOWN value
        when Constraint_NONE    rt_unknown = FALSE;    // store original value
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if s == n && n != 31 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rn_unknown = TRUE;     // address is UNKNOWN
        when Constraint_NONE    rn_unknown = FALSE;    // address is original base
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
elsif rn_unknown then
    address = bits(64) UNKNOWN;
else
    address = X[n];

if rt_unknown then
    data = bits(datasize) UNKNOWN;
else
    bits(datasize DIV 2) el1 = X[t];
    bits(datasize DIV 2) el2 = X[t2];
    data = if BigEndian() then el1:el2 else el2:el1;
bit status = '1';
// Check whether the Exclusive Monitors are set to include the
// physical memory locations corresponding to virtual address
// range [address, address+dbytes-1].
if AArch64.ExclusiveMonitorsPass(address, dbytes) then
    // This atomic write will be rejected if it does not refer
    // to the same physical locations after address translation.
    Mem[address, dbytes, AccType_ATOMIC] = data;
    status = ExclusiveMonitorsStatus();
X[s] = ZeroExtend(status, 32);
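
The endianness-dependent concatenation in the operation above (el1:el2 when big-endian, el2:el1 otherwise) can be explored with a small Python sketch. The function name pack_pair is hypothetical. The sketch assumes that a memory write places the least significant byte of the data at the lowest address in little-endian operation and reverses the byte order in big-endian operation.

```python
def pack_pair(el1: int, el2: int, elsize: int, big_endian: bool) -> bytes:
    """Return the bytes, lowest address first, written by a successful pair store.

    el1 and el2 model the values of the first and second transfer registers;
    elsize is the element size in bits (32 or 64).
    """
    mask = (1 << elsize) - 1
    el1 &= mask
    el2 &= mask
    if big_endian:
        data = (el1 << elsize) | el2     # data = el1:el2
    else:
        data = (el2 << elsize) | el1     # data = el2:el1
    # Assumed memory byte order: big-endian stores the most significant byte first
    return data.to_bytes(2 * elsize // 8, "big" if big_endian else "little")
```

Under this assumption the first register always occupies the lower-addressed element: pack_pair(0x11223344, 0xAABBCCDD, 32, False) yields the bytes 44 33 22 11 DD CC BB AA, and the big-endian case yields 11 22 33 44 AA BB CC DD.
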
C6.2.192 STXR


Store Exclusive Register stores a 32-bit word or a 64-bit doubleword from a register to memory if the PE has 
exclusive access to the memory address, and returns a status value of 0 if the store was successful, or of 1 if no store 
was performed. See Synchronization and semaphores on page B2-108. For information about memory accesses see 
Load/Store addressing modes on page C1-128. 
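
The status value is typically consumed by a retry loop. The following Python sketch models that pattern with a deliberately simplified, single-PE toy monitor. The class ToyExclusiveMonitor, the function atomic_increment, and the interfere hook are all hypothetical names introduced for illustration, and the pairing with a load-exclusive operation is an assumption about how the instruction is normally used rather than something defined in this section. Real exclusive monitors have additional architected behavior that this toy ignores.

```python
class ToyExclusiveMonitor:
    """Very simplified model of a local exclusive monitor for one PE."""

    def __init__(self):
        self.mem = {}
        self.marked = None               # address marked by the last load-exclusive

    def load_exclusive(self, addr):
        self.marked = addr
        return self.mem.get(addr, 0)

    def clear(self):
        self.marked = None               # models loss of exclusive access

    def store_exclusive(self, addr, value):
        """Return 0 if memory was updated, 1 if no store was performed."""
        if self.marked != addr:
            return 1
        self.mem[addr] = value
        self.marked = None
        return 0


def atomic_increment(mon, addr, interfere=None):
    """Increment mem[addr], retrying until the exclusive store succeeds."""
    attempts = 0
    while True:
        attempts += 1
        value = mon.load_exclusive(addr)
        if interfere is not None and attempts == 1:
            interfere()                  # hypothetical hook: disturb the first attempt
        if mon.store_exclusive(addr, value + 1) == 0:
            return attempts
```

In this model an undisturbed increment succeeds on the first attempt, while passing interfere=mon.clear forces the first store to return 1 and the loop to retry once.
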


| 31 30 | 29       24 | 23 | 22 | 21 | 20  16 | 15 | 14           10 | 9    5 | 4    0 |
|  1  x | 0 0 1 0 0 0 |  0 |  0 |  0 |   Rs   |  0 | (1)(1)(1)(1)(1) |   Rn   |   Rt   |
  size                  o2    L   o1            o0         Rt2


32-bit variant 

Applies when size == 10. 

STXR <Ws>, <Wt>, [<Xn|SP>{,#0}] 
64-bit variant 

Applies when size == 11. 


STXR <Ws>, <Xt>, [<Xn|SP>{,#0}] 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer s = UInt(Rs); // ignored by all loads and store-release 


integer elsize = 8 << UInt(size); 


Notes for all encodings 

For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STXR on page K1-5491. 
Assembler symbols 


<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 


0 If the operation updates memory.

1 If the operation fails to update memory. 
<Xt> Is the 64-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


Aborts and alignment 


If a synchronous Data Abort exception is generated by the execution of this instruction: 





• Memory is not updated.
• <Ws> is not updated.

Accessing an address that is not aligned to the size of the data being accessed causes an Alignment fault Data Abort
exception to be generated, subject to the following rules:

• If AArch64.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.


If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 


Operation 


bits(64) address;
bits(elsize) data;
constant integer dbytes = elsize DIV 8;
boolean rt_unknown = FALSE;
boolean rn_unknown = FALSE;

if s == t then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rt_unknown = TRUE;    // store UNKNOWN value
        when Constraint_NONE    rt_unknown = FALSE;   // store original value
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if s == n && n != 31 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rn_unknown = TRUE;    // address is UNKNOWN
        when Constraint_NONE    rn_unknown = FALSE;   // address is original base
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
elsif rn_unknown then
    address = bits(64) UNKNOWN;
else
    address = X[n];

if rt_unknown then
    data = bits(elsize) UNKNOWN;
else
    data = X[t];

bit status = '1';
// Check whether the Exclusive Monitors are set to include the
// physical memory locations corresponding to virtual address
// range [address, address+dbytes-1].
if AArch64.ExclusiveMonitorsPass(address, dbytes) then
    // This atomic write will be rejected if it does not refer
    // to the same physical locations after address translation.
    Mem[address, dbytes, AccType_ATOMIC] = data;
    status = ExclusiveMonitorsStatus();
X[s] = ZeroExtend(status, 32);
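
The status value written to <Ws> is normally tested by software so that a failed store can be retried. The following Python sketch is illustrative only and is not part of the architecture: it models a single PE with a toy exclusive monitor and shows the usual Load-Exclusive/Store-Exclusive retry pattern. The names ToyExclusiveMonitor and atomic_increment are hypothetical, and the monitor here is far simpler than the local and global monitors that the architecture defines.

```python
class ToyExclusiveMonitor:
    """Toy single-PE exclusive monitor (an assumption for illustration)."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.marked = None  # (address, nbytes) tagged by the last load-exclusive

    def load_exclusive(self, address, nbytes):
        # Models a Load-Exclusive: tag the location, then return its value.
        self.marked = (address, nbytes)
        return int.from_bytes(self.memory[address:address + nbytes], "little")

    def clear_exclusive(self):
        # Models losing exclusive access, for example after CLREX.
        self.marked = None

    def store_exclusive(self, address, nbytes, value):
        """Return 0 if memory was updated, 1 if no store was performed."""
        status = 1
        if self.marked == (address, nbytes):
            mask = (1 << (8 * nbytes)) - 1
            self.memory[address:address + nbytes] = (value & mask).to_bytes(nbytes, "little")
            status = 0
        self.marked = None  # in this toy model every store-exclusive clears the tag
        return status


def atomic_increment(monitor, address, nbytes=4):
    """Retry until the store-exclusive reports status 0; return the old value."""
    while True:
        old = monitor.load_exclusive(address, nbytes)
        if monitor.store_exclusive(address, nbytes, old + 1) == 0:
            return old
```

In this sketch a store that is attempted without a preceding load-exclusive to the same location returns 1 and leaves memory unchanged, which mirrors the status values described for <Ws>.
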
C6.2.193 STXRB


Store Exclusive Register Byte stores a byte from a register to memory if the PE has exclusive access to the memory 
address, and returns a status value of 0 if the store was successful, or of 1 if no store was performed. See 
Synchronization and semaphores on page B2-108. The memory access is atomic. 


For information about memory accesses see Load/Store addressing modes on page C1-128. 


Bits:  | 31 30 | 29:24  | 23 | 22 | 21 | 20:16 | 15 | 14:10           | 9:5 | 4:0 |
Value: | 0  0  | 001000 | 0  | 0  | 0  | Rs    | 0  | (1)(1)(1)(1)(1) | Rn  | Rt  |
Field: | size  |        |    |    |    |       |    | Rt2             |     |     |


No offset variant 


STXRB <Ws>, <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer s = UInt(Rs); // ignored by all loads and store-release 


Notes for all encodings 

For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly STXRB on page K1-5491. 
Assembler symbols 


<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 


0 If the operation updates memory.


1 If the operation fails to update memory. 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


Aborts 

If a synchronous Data Abort exception is generated by the execution of this instruction: 
• Memory is not updated.
• <Ws> is not updated.

If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 


Operation 


bits(64) address;
bits(8) data;
boolean rt_unknown = FALSE;
boolean rn_unknown = FALSE;

if s == t then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rt_unknown = TRUE;    // store UNKNOWN value
        when Constraint_NONE    rt_unknown = FALSE;   // store original value
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if s == n && n != 31 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rn_unknown = TRUE;    // address is UNKNOWN
        when Constraint_NONE    rn_unknown = FALSE;   // address is original base
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
elsif rn_unknown then
    address = bits(64) UNKNOWN;
else
    address = X[n];

if rt_unknown then
    data = bits(8) UNKNOWN;
else
    data = X[t];

bit status = '1';
// Check whether the Exclusive Monitors are set to include the
// physical memory locations corresponding to virtual address
// range [address, address+dbytes-1].
if AArch64.ExclusiveMonitorsPass(address, 1) then
    // This atomic write will be rejected if it does not refer
    // to the same physical locations after address translation.
    Mem[address, 1, AccType_ATOMIC] = data;
    status = ExclusiveMonitorsStatus();
X[s] = ZeroExtend(status, 32);
C6.2.194 STXRH


Store Exclusive Register Halfword stores a halfword from a register to memory if the PE has exclusive access to 
the memory address, and returns a status value of 0 if the store was successful, or of 1 if no store was performed. 
See Synchronization and semaphores on page B2-108. The memory access is atomic. 


For information about memory accesses see Load/Store addressing modes on page C1-128. 


Bits:  | 31 30 | 29:24  | 23 | 22 | 21 | 20:16 | 15 | 14:10           | 9:5 | 4:0 |
Value: | 0  1  | 001000 | 0  | 0  | 0  | Rs    | 0  | (1)(1)(1)(1)(1) | Rn  | Rt  |
Field: | size  |        |    |    |    |       |    | Rt2             |     |     |


No offset variant 


STXRH <Ws>, <Wt>, [<Xn|SP>{,#0}] 


Decode for this encoding 


integer n = UInt(Rn); 
integer t = UInt(Rt); 
integer s = UInt(Rs); // ignored by all loads and store-release 


Assembler symbols 


<Ws> Is the 32-bit name of the general-purpose register into which the status result of the store exclusive 
is written, encoded in the "Rs" field. The value returned is: 


0 If the operation updates memory.


1 If the operation fails to update memory. 


<Wt> Is the 32-bit name of the general-purpose register to be transferred, encoded in the "Rt" field. 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction: 
• Memory is not updated.
• <Ws> is not updated.

A non halfword-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject
to the following rules:

• If AArch64.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.

If AArch64.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 


Operation 


bits(64) address;
bits(16) data;
boolean rt_unknown = FALSE;
boolean rn_unknown = FALSE;

if s == t then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rt_unknown = TRUE;    // store UNKNOWN value
        when Constraint_NONE    rt_unknown = FALSE;   // store original value
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if s == n && n != 31 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_NONE, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rn_unknown = TRUE;    // address is UNKNOWN
        when Constraint_NONE    rn_unknown = FALSE;   // address is original base
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
elsif rn_unknown then
    address = bits(64) UNKNOWN;
else
    address = X[n];

if rt_unknown then
    data = bits(16) UNKNOWN;
else
    data = X[t];

bit status = '1';
// Check whether the Exclusive Monitors are set to include the
// physical memory locations corresponding to virtual address
// range [address, address+dbytes-1].
if AArch64.ExclusiveMonitorsPass(address, 2) then
    // This atomic write will be rejected if it does not refer
    // to the same physical locations after address translation.
    Mem[address, 2, AccType_ATOMIC] = data;
    status = ExclusiveMonitorsStatus();
X[s] = ZeroExtend(status, 32);
C6.2.195 SUB (extended register)


Subtract (extended register) subtracts a sign or zero-extended register value, followed by an optional left shift 
amount, from a register value, and writes the result to the destination register. The argument that is extended from 
the <Rm> register can be a byte, halfword, word, or doubleword. 


Bits:  | 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:13  | 12:10 | 9:5 | 4:0 |
Value: | sf | 1  | 0  | 01011 | 00    | 1  | Rm    | option | imm3  | Rn  | Rd  |
Field: |    | op | S  |       |       |    |       |        |       |     |     |


32-bit variant 
Applies when sf == 0.


SUB <Wd|WSP>, <Wn|WSP>, <Wm>{, <extend> {#<amount>}} 


64-bit variant 
Applies when sf == 1.


SUB <Xd|SP>, <Xn|SP>, <R><m>{, <extend> {#<amount>}} 


Decode for all variants of this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer datasize = if sf == '1' then 64 else 32;
ExtendType extend_type = DecodeRegExtend(option);
integer shift = UInt(imm3);
if shift > 4 then ReservedValue();


Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Wn|WSP> Is the 32-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field.

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field.

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Xn|SP> Is the 64-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field.

<R> Is a width specifier, encoded in the "option" field. It can have the following values:
W when option = 00x
W when option = 010
X when option = x11
W when option = 10x
W when option = 110

<m> Is the number [0-30] of the second general-purpose source register or the name ZR (31), encoded in
the "Rm" field.

<extend> For the 32-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
LSL|UXTW when option = 010
UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rd" or "Rn" is '11111' (WSP) and "option" is '010' then LSL is preferred, but may be omitted
when "imm3" is '000'. In all other cases <extend> is required and must be UXTW when "option" is
'010'.

For the 64-bit variant: is the extension to be applied to the second source operand, encoded in the
"option" field. It can have the following values:

UXTB when option = 000
UXTH when option = 001
UXTW when option = 010
LSL|UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

If "Rd" or "Rn" is '11111' (SP) and "option" is '011' then LSL is preferred, but may be omitted when
"imm3" is '000'. In all other cases <extend> is required and must be UXTX when "option" is '011'.

<amount> Is the left shift amount to be applied after extension in the range 0 to 4, defaulting to 0, encoded in
the "imm3" field. It must be absent when <extend> is absent, is required when <extend> is LSL,
and is optional when <extend> is present but not LSL.


Operation

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(datasize) operand2 = ExtendReg(m, extend_type, shift);

operand2 = NOT(operand2);
(result, -) = AddWithCarry(operand1, operand2, '1');

if d == 31 then
    SP[] = result;
else
    X[d] = result;
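
As an informal illustration of the extend-then-shift step, the following Python sketch models the second operand and the subtraction on plain integers. It is a sketch under stated assumptions, not the architectural definition: extend_reg and sub_extended are hypothetical helper names, option is passed as the integer value of the 3-bit "option" field, and register access, SP handling, and ReservedValue() are not modelled.

```python
def extend_reg(value, option, shift, datasize):
    """Extract, sign- or zero-extend, then shift left by 0 to 4 bits.

    option: integer 0..7 (UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW, SXTX).
    """
    if not 0 <= shift <= 4:
        raise ValueError("shift must be in the range 0 to 4")
    unsigned = option < 4                 # option<2> == 0 selects zero-extension
    length = 8 << (option & 3)            # 8, 16, 32 or 64 source bits
    length = min(length, datasize - shift)
    field = value & ((1 << length) - 1)
    if not unsigned and field >> (length - 1):
        field -= 1 << length              # reinterpret the field as negative
    return (field << shift) & ((1 << datasize) - 1)


def sub_extended(operand1, rm_value, option, shift, datasize=64):
    """Return operand1 minus the extended and shifted second operand."""
    mask = (1 << datasize) - 1
    operand2 = extend_reg(rm_value, option, shift, datasize)
    # Same arithmetic as AddWithCarry(operand1, NOT(operand2), '1').
    return (operand1 + (~operand2 & mask) + 1) & mask
```

For example, with option 4 (SXTB) a source byte of 0xFF is treated as -1, so subtracting it from 100 gives 101.
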
C6.2.196 SUB (immediate)


Subtract (immediate) subtracts an optionally-shifted immediate value from a register value, and writes the result to
the destination register.


Bits:  | 31 | 30 | 29 | 28:24 | 23:22 | 21:10 | 9:5 | 4:0 |
Value: | sf | 1  | 0  | 10001 | shift | imm12 | Rn  | Rd  |
Field: |    | op | S  |       |       |       |     |     |


32-bit variant 
Applies when sf == 0. 


SUB <Wd|WSP>, <Wn|WSP>, #<imm>{, <shift>} 


64-bit variant 
Applies when sf == 1. 


SUB <Xd|SP>, <Xn|SP>, #<imm>{, <shift>} 


Decode for all variants of this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
    when '00' imm = ZeroExtend(imm12, datasize);
    when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
    when '1x' ReservedValue();


Assembler symbols 


<Wd|WSP> Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.

<Xd|SP> Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd"
field.

<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.

<imm> Is an unsigned immediate, in the range 0 to 4095, encoded in the "imm12" field.

<shift> Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in the "shift"
field. It can have the following values:

LSL #0 when shift = 00
LSL #12 when shift = 01

The encoding shift = 1x is reserved.


Operation 


bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(datasize) operand2;







operand2 = NOT(imm);
(result, -) = AddWithCarry(operand1, operand2, '1');

if d == 31 then
    SP[] = result;
else
    X[d] = result;
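
The decode and operation above can be sketched informally in Python. This is an illustration under stated assumptions rather than the architectural definition: decode_sub_imm and sub_immediate are hypothetical names, a reserved shift value is reported with a Python exception instead of ReservedValue(), and register and SP access are not modelled.

```python
def decode_sub_imm(shift, imm12):
    """Return the immediate selected by the 2-bit shift field."""
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("imm12 must be in the range 0 to 4095")
    if shift == 0b00:
        return imm12            # LSL #0
    if shift == 0b01:
        return imm12 << 12      # LSL #12
    raise ValueError("shift = 1x is reserved")


def sub_immediate(operand1, imm, datasize=64):
    """Subtract by adding the inverted immediate with a carry-in of 1."""
    mask = (1 << datasize) - 1
    operand2 = ~imm & mask      # NOT(imm)
    return (operand1 + operand2 + 1) & mask
```

Adding NOT(imm) plus one is two's complement subtraction, so the result wraps modulo 2^datasize when the immediate is larger than the register value.
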
C6.2.197 SUB (shifted register)


Subtract (shifted register) subtracts an optionally-shifted register value from a register value, and writes the result 
to the destination register. 


This instruction is used by the alias NEG (shifted register). See Alias conditions for details of when each alias is 
preferred. 


Bits:  | 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:10 | 9:5 | 4:0 |
Value: | sf | 1  | 0  | 01011 | shift | 0  | Rm    | imm6  | Rn  | Rd  |
Field: |    | op | S  |       |       |    |       |       |     |     |


32-bit variant 
Applies when sf == 0.


SUB <Wd>, <Wn>, <Wm>{, <shift> #<amount>} 


64-bit variant 
Applies when sf == 1.


SUB <Xd>, <Xn>, <Xm>{, <shift> #<amount>} 


Decode for all variants of this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer datasize = if sf == '1' then 64 else 32;

if shift == '11' then ReservedValue();
if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Alias conditions

Alias                     is preferred when
NEG (shifted register)    Rn == '11111'





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded
in the "shift" field. It can have the following values:

LSL when shift = 00
LSR when shift = 01
ASR when shift = 10

The encoding shift = 11 is reserved.

<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the
"imm6" field.

For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.


Operation 
bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);

operand2 = NOT(operand2);
(result, -) = AddWithCarry(operand1, operand2, '1');

X[d] = result;
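
The following Python sketch illustrates the shift step and the NEG alias on plain integers. It is an informal model only: shift_reg and sub_shifted are hypothetical names, the shift type is passed as a string, and the reserved encodings checked in the decode pseudocode are not modelled beyond a range check on the shift amount.

```python
def shift_reg(value, shift_type, amount, datasize):
    """Apply LSL, LSR or ASR to a datasize-bit value."""
    if not 0 <= amount < datasize:
        raise ValueError("shift amount out of range for this datasize")
    mask = (1 << datasize) - 1
    value &= mask
    if shift_type == "LSL":
        return (value << amount) & mask
    if shift_type == "LSR":
        return value >> amount
    if shift_type == "ASR":
        if value >> (datasize - 1):
            value -= 1 << datasize      # reinterpret as a negative integer
        return (value >> amount) & mask
    raise ValueError("unsupported shift type")


def sub_shifted(operand1, rm_value, shift_type="LSL", amount=0, datasize=64):
    """Return operand1 minus the shifted second operand."""
    mask = (1 << datasize) - 1
    operand2 = shift_reg(rm_value, shift_type, amount, datasize)
    return (operand1 + (~operand2 & mask) + 1) & mask
```

When Rn is '11111' the first operand reads as zero, so sub_shifted(0, x) models the NEG (shifted register) alias.
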
C6.2.198 SUBS (extended register)
Subtract (extended register), setting flags, subtracts a sign or zero-extended register value, followed by an optional 
left shift amount, from a register value, and writes the result to the destination register. The argument that is extended 
from the <Rm> register can be a byte, halfword, word, or doubleword. It updates the condition flags based on the 
result. 
This instruction is used by the alias CMP (extended register). See Alias conditions for details of when each alias is preferred.
Bits:  | 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:13  | 12:10 | 9:5 | 4:0 |
Value: | sf | 1  | 1  | 01011 | 00    | 1  | Rm    | option | imm3  | Rn  | Rd  |
Field: |    | op | S  |       |       |    |       |        |       |     |     |
32-bit variant 
Applies when sf == 0. 
SUBS <Wd>, <Wn|WSP>, <Wm>{, <extend> {#<amount>}} 
64-bit variant 
Applies when sf == 1. 
SUBS <Xd>, <Xn|SP>, <R><m>{, <extend> {#<amount>}} 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 
ExtendType extend_type = DecodeRegExtend(option);
integer shift = UInt(imm3); 
if shift > 4 then ReservedValue(); 
Alias conditions 
Alias is preferred when 
CMP (extended register) Rd == '11111' 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn|WSP> Is the 32-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn"
field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn|SP> Is the 64-bit name of the first source general-purpose register or stack pointer, encoded in the "Rn" 
field. 
<R> Is a width specifier, encoded in the "option" field. It can have the following values: 
W when option = 00x 
W when option = 010 
X when option = x11
W when option = 10x 
W when option = 110 
<m> Is the number [0-30] of the second general-purpose source register or the name ZR (31), encoded in 
the "Rm" field. 
<extend> For the 32-bit variant: is the extension to be applied to the second source operand, encoded in the 
"option" field. It can have the following values: 
UXTB when option = 000 
UXTH when option = 001 
LSL|UXTW when option = 010 
UXTX when option = 011 
SXTB when option = 100 
SXTH when option = 101 
SXTW when option = 110 
SXTX when option = 111 


If "Rn" is '11111' (WSP) and "option" is '010' then LSL is preferred, but may be omitted when 
"imm3" is '000'. In all other cases <extend> is required and must be UXTW when "option" is '010'. 


For the 64-bit variant: is the extension to be applied to the second source operand, encoded in the 
"option" field. It can have the following values: 


UXTB when option = 000 
UXTH when option = 001 
UXTW when option = 010 
LSL|UXTX when option = 011
SXTB when option = 100 
SXTH when option = 101 
SXTW when option = 110 
SXTX when option = 111 


If "Rn" is '11111' (SP) and "option" is '011' then LSL is preferred, but may be omitted when "imm3" 
is '000'. In all other cases <extend> is required and must be UXTX when "option" is '011'. 


<amount> Is the left shift amount to be applied after extension in the range 0 to 4, defaulting to 0, encoded in 
the "imm3" field. It must be absent when <extend> is absent, is required when <extend> is LSL, 
and is optional when <extend> is present but not LSL. 


Operation 

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(datasize) operand2 = ExtendReg(m, extend_type, shift);
bits(4) nzcv;

operand2 = NOT(operand2);
(result, nzcv) = AddWithCarry(operand1, operand2, '1');

PSTATE.<N,Z,C,V> = nzcv;

X[d] = result;

C6.2.199 SUBS (immediate) 
Subtract (immediate), setting flags, subtracts an optionally-shifted immediate value from a register value, and writes 
the result to the destination register. It updates the condition flags based on the result. 
This instruction is used by the alias CMP (immediate). See Alias conditions for details of when each alias is 
preferred. 
Bits:  | 31 | 30 | 29 | 28:24 | 23:22 | 21:10 | 9:5 | 4:0 |
Value: | sf | 1  | 1  | 10001 | shift | imm12 | Rn  | Rd  |
Field: |    | op | S  |       |       |       |     |     |
32-bit variant 
Applies when sf == 0. 
SUBS <Wd>, <Wn|WSP>, #<imm>{, <shift>} 
64-bit variant 
Applies when sf == 1. 
SUBS <Xd>, <Xn|SP>, #<imm>{, <shift>} 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 
bits(datasize) imm; 
case shift of
    when '00' imm = ZeroExtend(imm12, datasize);
    when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
    when '1x' ReservedValue();
Alias conditions 
Alias is preferred when 
CMP (immediate) Rd == '11111' 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn|WSP> Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn|SP> Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field. 
<imm> Is an unsigned immediate, in the range 0 to 4095, encoded in the "imm12" field. 
<shift> Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in the "shift" 
field. It can have the following values: 
LSL #0 when shift = 00 
LSL #12 when shift = 01 
The encoding shift = 1x is reserved.


Operation 

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
bits(datasize) operand2;
bits(4) nzcv;

operand2 = NOT(imm);
(result, nzcv) = AddWithCarry(operand1, operand2, '1');

PSTATE.<N,Z,C,V> = nzcv;

X[d] = result;
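
To illustrate how the condition flags follow from the subtraction, the Python sketch below models the add-with-carry step on plain integers and returns the result together with an (N, Z, C, V) tuple. It is an informal sketch, not the architectural pseudocode: add_with_carry and subs are hypothetical Python names, and PSTATE and register access are not modelled.

```python
def add_with_carry(x, y, carry_in, datasize):
    """Return (result, (n, z, c, v)) for x + y + carry_in on datasize bits."""
    mask = (1 << datasize) - 1
    sign_bit = 1 << (datasize - 1)

    def to_signed(value):
        return value - (1 << datasize) if value & sign_bit else value

    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & mask
    n = 1 if result & sign_bit else 0
    z = 1 if result == 0 else 0
    c = 0 if result == unsigned_sum else 1          # unsigned carry out
    v = 0 if to_signed(result) == signed_sum else 1  # signed overflow
    return result, (n, z, c, v)


def subs(operand1, operand2, datasize=64):
    """Subtract and set flags: operand1 + NOT(operand2) + 1."""
    mask = (1 << datasize) - 1
    return add_with_carry(operand1 & mask, ~operand2 & mask, 1, datasize)
```

In this model, comparing equal values gives Z = 1 and C = 1, and subtracting a larger unsigned value gives C = 0. The CMP (immediate) alias corresponds to calling subs and discarding the result.
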



C6.2.200 SUBS (shifted register) 


Subtract (shifted register), setting flags, subtracts an optionally-shifted register value from a register value, and 
writes the result to the destination register. It updates the condition flags based on the result. 


This instruction is used by the aliases CMP (shifted register) and NEGS. See Alias conditions for details of when 
each alias is preferred. 


Bits:  | 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:10 | 9:5 | 4:0 |
Value: | sf | 1  | 1  | 01011 | shift | 0  | Rm    | imm6  | Rn  | Rd  |
Field: |    | op | S  |       |       |    |       |       |     |     |


32-bit variant 
Applies when sf == 0.


SUBS <Wd>, <Wn>, <Wm>{, <shift> #<amount>} 


64-bit variant 
Applies when sf == 1.


SUBS <Xd>, <Xn>, <Xm>{, <shift> #<amount>} 


Decode for all variants of this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer datasize = if sf == '1' then 64 else 32;

if shift == '11' then ReservedValue();
if sf == '0' && imm6<5> == '1' then ReservedValue();

ShiftType shift_type = DecodeShift(shift);
integer shift_amount = UInt(imm6);


Alias conditions

Alias                     is preferred when
CMP (shifted register)    Rd == '11111'
NEGS                      Rn == '11111'

Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 

<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 


<shift> Is the optional shift type to be applied to the second source operand, defaulting to LSL and encoded 
in the "shift" field. It can have the following values: 


LSL when shift = 00
LSR when shift = 01
ASR when shift = 10

The encoding shift = 11 is reserved.


<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the 
"imm6" field. 


Operation 


bits(datasize) result;
bits(datasize) operand1 = X[n];
bits(datasize) operand2 = ShiftReg(m, shift_type, shift_amount);
bits(4) nzcv;

operand2 = NOT(operand2);
(result, nzcv) = AddWithCarry(operand1, operand2, '1');

PSTATE.<N,Z,C,V> = nzcv;

X[d] = result;

C6.2.201 SVC 
Supervisor Call causes an exception to be taken to EL1. 


On executing an SVC instruction, the PE records the exception as a Supervisor Call exception in ESR_ELx, using the 
EC value 0x15, and the value of the immediate argument.


Bits:  | 31:24    | 23:21 | 20:5  | 4:2 | 1:0 |
Value: | 11010100 | 000   | imm16 | 000 | 01  |


System variant 


SVC #<imm> 
Decode for this encoding 


bits(16) imm = imm16; 


Assembler symbols 


<imm> Is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the "imm16" field. 


Operation 


AArch64.CallSupervisor(imm);
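
As an informal illustration of the encoding, the Python sketch below packs and unpacks the 16-bit immediate of an SVC instruction word. The function names encode_svc and decode_svc_imm are hypothetical, and the fixed-bit constant is derived from the encoding of this instruction (all bits other than imm16, with bits [1:0] == 01).

```python
SVC_FIXED_BITS = 0xD4000001   # 11010100 000 <imm16> 000 01
SVC_FIXED_MASK = 0xFFE0001F   # every bit except the imm16 field


def encode_svc(imm):
    """Return the 32-bit instruction word for SVC #imm."""
    if not 0 <= imm <= 0xFFFF:
        raise ValueError("imm must be in the range 0 to 65535")
    return SVC_FIXED_BITS | (imm << 5)


def decode_svc_imm(word):
    """Return imm16 if word is an SVC instruction, otherwise None."""
    if word & SVC_FIXED_MASK != SVC_FIXED_BITS:
        return None
    return (word >> 5) & 0xFFFF
```

The immediate has no effect on how the exception is taken; a handler would typically recover it from the syndrome register or, as in this sketch, from the instruction word.
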
C6.2.202 SXTB


Signed Extend Byte extracts an 8-bit value from a register, sign-extends it to the size of the register, and writes the 
result to the destination register. 


This instruction is an alias of the SBFM instruction. This means that: 


° The encodings in this description are named to match the encodings of SBFM. 
° The description of SBFM gives the operational pseudocode for this instruction. 
Bits:  | 31 | 30:29 | 28:23  | 22 | 21:16  | 15:10  | 9:5 | 4:0 |
Value: | sf | 00    | 100110 | N  | 000000 | 000111 | Rn  | Rd  |
Field: |    | opc   |        |    | immr   | imms   |     |     |


32-bit variant 

Applies when sf == 0 && N == 0.
SXTB <Wd>, <Wn> 

is equivalent to 

SBFM <Wd>, <Wn>, #0, #7 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1 && N == 1.
SXTB <Xd>, <Wn> 

is equivalent to 

SBFM <Xd>, <Xn>, #0, #7 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


The description of SBFM gives the operational pseudocode for this instruction. 
C6.2.203 SXTH


Sign Extend Halfword extracts a 16-bit value, sign-extends it to the size of the register, and writes the result to the 
destination register. 


This instruction is an alias of the SBFM instruction. This means that: 
° The encodings in this description are named to match the encodings of SBFM. 


° The description of SBFM gives the operational pseudocode for this instruction. 


Bits:  | 31 | 30:29 | 28:23  | 22 | 21:16  | 15:10  | 9:5 | 4:0 |
Value: | sf | 00    | 100110 | N  | 000000 | 001111 | Rn  | Rd  |
Field: |    | opc   |        |    | immr   | imms   |     |     |


32-bit variant 

Applies when sf == 0 && N == 0.
SXTH <Wd>, <Wn> 

is equivalent to 

SBFM <Wd>, <Wn>, #0, #15 
and is always the preferred disassembly. 
64-bit variant 

Applies when sf == 1 && N == 1.
SXTH <Xd>, <Wn> 

is equivalent to 

SBFM <Xd>, <Xn>, #0, #15 


and is always the preferred disassembly. 


Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 


Operation 


The description of SBFM gives the operational pseudocode for this instruction. 
C6.2.204 SXTW


Sign Extend Word sign-extends a word to the size of the register, and writes the result to the destination register. 
This instruction is an alias of the SBFM instruction. This means that: 
° The encodings in this description are named to match the encodings of SBFM. 


° The description of SBFM gives the operational pseudocode for this instruction. 


Bits:  | 31 | 30:29 | 28:23  | 22 | 21:16  | 15:10  | 9:5 | 4:0 |
Value: | 1  | 00    | 100110 | 1  | 000000 | 011111 | Rn  | Rd  |
Field: | sf | opc   |        | N  | immr   | imms   |     |     |


64-bit variant 
SXTW <Xd>, <Wn> 
is equivalent to 
SBFM <Xd>, <Xn>, #0, #31 


and is always the preferred disassembly. 


Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 


Operation 


The description of SBFM gives the operational pseudocode for this instruction. 
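
The effect of the SXTB, SXTH, and SXTW aliases can be illustrated with a short Python sketch. It models only the case used by these aliases, where the SBFM rotate amount is 0 and the field runs from bit 0 up to bit 7, 15, or 31. The function names are hypothetical and the general SBFM behavior is not modelled.

```python
def sign_extend_low(value, field_bits, datasize):
    """Sign-extend the low field_bits of value to datasize bits."""
    field = value & ((1 << field_bits) - 1)
    if field >> (field_bits - 1):
        field -= 1 << field_bits          # top bit of the field is set
    return field & ((1 << datasize) - 1)


def sxtb(value, datasize=64):
    return sign_extend_low(value, 8, datasize)    # SBFM ..., #0, #7


def sxth(value, datasize=64):
    return sign_extend_low(value, 16, datasize)   # SBFM ..., #0, #15


def sxtw(value):
    return sign_extend_low(value, 32, 64)         # SBFM ..., #0, #31
```

Bits of the source above the extracted field are ignored, so sxtb(0x17F) gives 0x7F.
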
C6.2.205 SYS


System instruction. For more information, see op0==0b01, cache maintenance, TLB maintenance, and address
translation instructions on page C5-275 for the encodings of System instructions.

This instruction is used by the aliases AT, DC, IC, and TLBI. See Alias conditions for details of when each alias is
preferred.


Bits:  | 31:22      | 21 | 20:19 | 18:16 | 15:12 | 11:8 | 7:5 | 4:0 |
Value: | 1101010100 | 0  | 01    | op1   | CRn   | CRm  | op2 | Rt  |
Field: |            | L  |       |       |       |      |     |     |

System variant

SYS #<op1>, <Cn>, <Cm>, #<op2>{, <Xt>}


Decode for this encoding 

AArch64.CheckSystemAccess('01', op1, CRn, CRm, op2, Rt, L);

integer t = UInt(Rt);
integer sys_op1 = UInt(op1);
integer sys_op2 = UInt(op2);
integer sys_crn = UInt(CRn);
integer sys_crm = UInt(CRm);


Alias conditions

Alias    is preferred when
AT       CRn == '0111' && CRm == '100x' && SysOp(op1,'0111',CRm,op2) == Sys_AT
DC       CRn == '0111' && SysOp(op1,'0111',CRm,op2) == Sys_DC
IC       CRn == '0111' && SysOp(op1,'0111',CRm,op2) == Sys_IC
TLBI     CRn == '1000' && SysOp(op1,'1000',CRm,op2) == Sys_TLBI





Assembler symbols 


<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field. 
<Cn> Is a name 'Cn', with 'n' in the range 0 to 15, encoded in the "CRn" field.
<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.
<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field. 
<Xt> Is the 64-bit name of the optional general-purpose source register, defaulting to '11111', encoded in 
the "Rt" field. 
Operation 
AArch64.SysInstr(1, sys_op1, sys_crn, sys_crm, sys_op2, X[t]); 
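
As an illustration of how the assembler symbols map onto the encoding, the following Python sketch packs the SYS fields into a 32-bit instruction word. The function name `encode_sys` is hypothetical, and the field positions are assumed from the bit numbering of the encoding diagram (op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0]).

```python
def encode_sys(op1: int, crn: int, crm: int, op2: int, rt: int = 31) -> int:
    """Assemble a SYS instruction word; Rt defaults to 0b11111 as for an omitted <Xt>."""
    for name, value, width in (("op1", op1, 3), ("CRn", crn, 4), ("CRm", crm, 4),
                               ("op2", op2, 3), ("Rt", rt, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Fixed bits: [31:22] = 1101010100, L (bit 21) = 0, bits [20:19] = 01.
    fixed = (0b1101010100 << 22) | (0 << 21) | (0b01 << 19)
    return fixed | (op1 << 16) | (crn << 12) | (crm << 8) | (op2 << 5) | rt
```

Under these assumptions, `encode_sys(0, 8, 3, 0)` corresponds to SYS #0, C8, C3, #0, which the alias conditions identify as a TLBI operation.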
C6.2.206 SYSL


System instruction with result. For more information, see op0==0b01, cache maintenance, TLB maintenance, and 
address translation instructions on page C5-275 for the encodings of System instructions. 


|31                         22|21|20 19|18  16|15  12|11   8|7   5|4    0|
| 1  1  0  1  0  1  0  1  0  0| 1| 0  1| op1  | CRn  | CRm  | op2 |  Rt  |
                                L

System variant

SYSL <Xt>, #<op1>, <Cn>, <Cm>, #<op2>

Decode for this encoding 

AArch64.CheckSystemAccess('01', op1, CRn, CRm, op2, Rt, L);
integer t = UInt(Rt);
integer sys_op1 = UInt(op1);
integer sys_op2 = UInt(op2);
integer sys_crn = UInt(CRn);
integer sys_crm = UInt(CRm);


Assembler symbols 


<Xt> Is the 64-bit name of the general-purpose destination register, encoded in the "Rt" field.
<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.
<Cn> Is a name 'Cn', with 'n' in the range 0 to 15, encoded in the "CRn" field.
<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.
<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field.
Operation 


X[t] = AArch64.SysInstrWithResult(1, sys_op1, sys_crn, sys_crm, sys_op2);
C6.2.207 TBNZ


Test bit and Branch if Nonzero compares the value of a bit in a general-purpose register with zero, and conditionally 
branches to a label at a PC-relative offset if the comparison is not equal. It provides a hint that this is not a subroutine 
call or return. This instruction does not affect condition flags. 


|31|30            25|24|23   19|18             5|4    0|
|b5| 0  1  1  0  1  1| 1|  b40  |     imm14     |  Rt  |
                      op


14-bit signed PC-relative branch offset variant 


TBNZ <R><t>, #<imm>, <label> 


Decode for this encoding 
integer t = UInt(Rt);
integer datasize = if b5 == '1' then 64 else 32;
integer bit_pos = UInt(b5:b40);
bits(64) offset = SignExtend(imm14:'00', 64);


Assembler symbols 


<R> Is a width specifier, encoded in the "b5" field. It can have the following values: 
W when b5 = 0 
X when b5 = 1 


In assembler source code an 'X' specifier is always permitted, but a 'W' specifier is only permitted 
when the bit number is less than 32. 


<t> Is the number [0-30] of the general-purpose register to be tested or the name ZR (31), encoded in 
the "Rt" field. 


<imm> Is the bit number to be tested, in the range 0 to 63, encoded in "b5:b40". 
<label> Is the program label to be conditionally branched to. Its offset from the address of this instruction, 
in the range +/-32KB, is encoded as "imm14" times 4. 
Operation 
bits(datasize) operand = X[t];

if operand<bit_pos> == op then
    BranchTo(PC[] + offset, BranchType_JMP);


C6.2.208 TBZ 


Test bit and Branch if Zero compares the value of a test bit with zero, and conditionally branches to a label at a 
PC-relative offset if the comparison is equal. It provides a hint that this is not a subroutine call or return. This 
instruction does not affect condition flags. 


|31|30            25|24|23   19|18             5|4    0|
|b5| 0  1  1  0  1  1| 0|  b40  |     imm14     |  Rt  |
                      op


14-bit signed PC-relative branch offset variant 


TBZ <R><t>, #<imm>, <label> 


Decode for this encoding 
integer t = UInt(Rt);
integer datasize = if b5 == '1' then 64 else 32;
integer bit_pos = UInt(b5:b40);
bits(64) offset = SignExtend(imm14:'00', 64);


Assembler symbols 


<R> Is a width specifier, encoded in the "b5" field. It can have the following values: 
W when b5 = 0 
X when b5 = 1 


In assembler source code an 'X' specifier is always permitted, but a 'W' specifier is only permitted 
when the bit number is less than 32. 


<t> Is the number [0-30] of the general-purpose register to be tested or the name ZR (31), encoded in 
the "Rt" field. 


<imm> Is the bit number to be tested, in the range 0 to 63, encoded in "b5:b40". 
<label> Is the program label to be conditionally branched to. Its offset from the address of this instruction, 
in the range +/-32KB, is encoded as "imm14" times 4. 
Operation 
bits(datasize) operand = X[t];

if operand<bit_pos> == op then
    BranchTo(PC[] + offset, BranchType_JMP);
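
The decode steps shared by TBZ and TBNZ can be sketched in Python as follows. The function name `decode_test_branch` and the lower-case register spelling in the result are choices made for this illustration only. The field positions are assumed from the encoding diagram (b5 at bit 31, op at bit 24, b40 at [23:19], imm14 at [18:5], Rt at [4:0]).

```python
def decode_test_branch(insn: int, pc: int):
    """Decode a TBZ/TBNZ word into (mnemonic, register, bit_pos, branch_target)."""
    if (insn >> 25) & 0x3F != 0b011011:
        raise ValueError("not a TBZ/TBNZ encoding")
    b5 = (insn >> 31) & 1
    op = (insn >> 24) & 1
    b40 = (insn >> 19) & 0x1F
    imm14 = (insn >> 5) & 0x3FFF
    rt = insn & 0x1F
    bit_pos = (b5 << 5) | b40                 # UInt(b5:b40)
    offset = imm14 << 2                       # imm14:'00' is a 16-bit value
    if offset & (1 << 15):                    # sign-extend from bit 15
        offset -= 1 << 16
    width = "x" if b5 else "w"                # datasize is 64 when b5 == 1
    reg = f"{width}zr" if rt == 31 else f"{width}{rt}"
    target = (pc + offset) & 0xFFFF_FFFF_FFFF_FFFF
    return ("tbnz" if op else "tbz", reg, bit_pos, target)
```

Because imm14 is scaled by 4 and sign-extended, the reachable offsets span -32768 to +32764 bytes, which matches the +/-32KB range given for <label>.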


C6.2.209 TLBI 
TLB Invalidate operation. For more information, see A64 system instructions for TLB maintenance. 


This instruction is an alias of the SYS instruction. This means that: 


• The encodings in this description are named to match the encodings of SYS.
• The description of SYS gives the operational pseudocode for this instruction.

|31                         22|21|20 19|18  16|15      12|11   8|7   5|4    0|
| 1  1  0  1  0  1  0  1  0  0| 0| 0  1| op1  | 1  0  0  0| CRm  | op2 |  Rt  |
                                L              CRn


System variant 

TLBI <tlbi_op>{, <Xt>} 

is equivalent to 

SYS #<op1>, C8, <Cm>, #<op2>{, <Xt>}


and is the preferred disassembly when SysOp(op1, '1000',CRm,op2) == Sys_TLBI. 


Assembler symbols 


<op1> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op1" field.
<Cm> Is a name 'Cm', with 'm' in the range 0 to 15, encoded in the "CRm" field.
<op2> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "op2" field.
<tlbi_op> Is a TLBI instruction name, as listed for the TLBI system instruction group, encoded in the
"op1:CRm:op2" field. It can have the following values:
VMALLE1IS     when op1 = 000, CRm = 0011, op2 = 000
VAE1IS        when op1 = 000, CRm = 0011, op2 = 001
ASIDE1IS      when op1 = 000, CRm = 0011, op2 = 010
VAAE1IS       when op1 = 000, CRm = 0011, op2 = 011
VALE1IS       when op1 = 000, CRm = 0011, op2 = 101
VAALE1IS      when op1 = 000, CRm = 0011, op2 = 111
VMALLE1       when op1 = 000, CRm = 0111, op2 = 000
VAE1          when op1 = 000, CRm = 0111, op2 = 001
ASIDE1        when op1 = 000, CRm = 0111, op2 = 010
VAAE1         when op1 = 000, CRm = 0111, op2 = 011
VALE1         when op1 = 000, CRm = 0111, op2 = 101
VAALE1        when op1 = 000, CRm = 0111, op2 = 111
IPAS2E1IS     when op1 = 100, CRm = 0000, op2 = 001
IPAS2LE1IS    when op1 = 100, CRm = 0000, op2 = 101
ALLE2IS       when op1 = 100, CRm = 0011, op2 = 000
VAE2IS        when op1 = 100, CRm = 0011, op2 = 001
ALLE1IS       when op1 = 100, CRm = 0011, op2 = 100
VALE2IS       when op1 = 100, CRm = 0011, op2 = 101
VMALLS12E1IS  when op1 = 100, CRm = 0011, op2 = 110
IPAS2E1       when op1 = 100, CRm = 0100, op2 = 001
IPAS2LE1      when op1 = 100, CRm = 0100, op2 = 101
ALLE2         when op1 = 100, CRm = 0111, op2 = 000
VAE2          when op1 = 100, CRm = 0111, op2 = 001
ALLE1         when op1 = 100, CRm = 0111, op2 = 100
VALE2         when op1 = 100, CRm = 0111, op2 = 101
VMALLS12E1    when op1 = 100, CRm = 0111, op2 = 110
ALLE3IS       when op1 = 110, CRm = 0011, op2 = 000
VAE3IS        when op1 = 110, CRm = 0011, op2 = 001
VALE3IS       when op1 = 110, CRm = 0011, op2 = 101
ALLE3         when op1 = 110, CRm = 0111, op2 = 000
VAE3          when op1 = 110, CRm = 0111, op2 = 001
VALE3         when op1 = 110, CRm = 0111, op2 = 101
<Xt> Is the 64-bit name of the optional general-purpose source register, defaulting to '11111', encoded in
the "Rt" field.
Operation 


The description of SYS gives the operational pseudocode for this instruction. 


C6.2.210 TST (immediate) 


Test bits (immediate) performs a bitwise AND of a register value and an immediate value (Rn AND imm), sets the
condition flags based on the result, and discards the result.


This instruction is an alias of the ANDS (immediate) instruction. This means that: 


• The encodings in this description are named to match the encodings of ANDS (immediate).
• The description of ANDS (immediate) gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24 23|22|21      16|15      10|9     5|4            0|
|sf| 1  1| 1  0  0  1  0  0| N|   immr   |   imms   |   Rn  | 1  1  1  1  1|
     opc                                                       Rd

32-bit variant

Applies when sf == 0 && N == 0.
TST <Wn>, #<imm>

is equivalent to 

ANDS WZR, <Wn>, #<imm> 


and is always the preferred disassembly. 


64-bit variant 
Applies when sf == 1. 
TST <Xn>, #<imm> 

is equivalent to 

ANDS XZR, <Xn>, #<imm> 


and is always the preferred disassembly. 


Assembler symbols 


<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<imm> For the 32-bit variant: is the bitmask immediate, encoded in "imms:immr". 


For the 64-bit variant: is the bitmask immediate, encoded in "N:imms:immr". 


Operation 


The description of ANDS (immediate) gives the operational pseudocode for this instruction. 
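
The flag-setting behavior of TST can be modelled in a few lines of Python. This sketch assumes the flag rule from the ANDS description, which is not reproduced in this section: N is the top bit of the result, Z is set when the result is zero, and C and V are cleared. The function name `tst` and the dictionary return type are illustrative only.

```python
def tst(rn: int, operand: int, datasize: int = 64) -> dict:
    """Model TST: AND the operands, return the NZCV flags, discard the result."""
    mask = (1 << datasize) - 1
    result = rn & operand & mask
    return {
        "N": (result >> (datasize - 1)) & 1,  # sign bit of the datasize-bit result
        "Z": int(result == 0),                # set when no tested bit is 1
        "C": 0,                               # assumed cleared, per ANDS
        "V": 0,                               # assumed cleared, per ANDS
    }
```

A following conditional branch on EQ would therefore be taken when none of the bits selected by the second operand are set in the first.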
C6.2.211 TST (shifted register)


Test (shifted register) performs a bitwise AND operation on a register value and an optionally-shifted register value. 
It updates the condition flags based on the result, and discards the result. 


This instruction is an alias of the ANDS (shifted register) instruction. This means that: 
• The encodings in this description are named to match the encodings of ANDS (shifted register).
• The description of ANDS (shifted register) gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24|23 22|21|20  16|15    10|9     5|4            0|
|sf| 1  1| 0  1  0  1  0|shift| 0|  Rm  |  imm6  |   Rn  | 1  1  1  1  1|
     opc                       N                            Rd


32-bit variant 

Applies when sf == 0. 

TST <Wn>, <Wm>{, <shift> #<amount>} 

is equivalent to 

ANDS WZR, <Wn>, <Wm>{, <shift> #<amount>} 


and is always the preferred disassembly. 


64-bit variant 

Applies when sf == 1. 

TST <Xn>, <Xm>{, <shift> #<amount>} 

is equivalent to 

ANDS XZR, <Xn>, <Xm>{, <shift> #<amount>} 


and is always the preferred disassembly. 


Assembler symbols 


<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the optional shift to be applied to the final source, defaulting to LSL and encoded in the "shift" 
field. It can have the following values: 
LSL when shift = 00 
LSR when shift = 01 
ASR when shift = 10 
ROR when shift = 11 
<amount> For the 32-bit variant: is the shift amount, in the range 0 to 31, defaulting to 0 and encoded in the 
"imm6" field. 


For the 64-bit variant: is the shift amount, in the range 0 to 63, defaulting to 0 and encoded in the
"imm6" field.

Operation


The description of ANDS (shifted register) gives the operational pseudocode for this instruction. 
C6.2.212 UBFIZ


Unsigned Bitfield Insert in Zero zeroes the destination register and copies any number of contiguous bits from a 
source register into any position in the destination register. 


This instruction is an alias of the UBFM instruction. This means that: 
• The encodings in this description are named to match the encodings of UBFM.
• The description of UBFM gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24 23|22|21      16|15      10|9     5|4     0|
|sf| 1  0| 1  0  0  1  1  0| N|   immr   |   imms   |   Rn  |   Rd  |
     opc

32-bit variant

Applies when sf == 0 && N == 0.

UBFIZ <Wd>, <Wn>, #<lsb>, #<width>

is equivalent to

UBFM <Wd>, <Wn>, #(-<lsb> MOD 32), #(<width>-1)

and is the preferred disassembly when UInt(imms) < UInt(immr).

64-bit variant

Applies when sf == 1 && N == 1.

UBFIZ <Xd>, <Xn>, #<lsb>, #<width>

is equivalent to

UBFM <Xd>, <Xn>, #(-<lsb> MOD 64), #(<width>-1)

and is the preferred disassembly when UInt(imms) < UInt(immr).


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 

<lsb> For the 32-bit variant: is the bit number of the lsb of the destination bitfield, in the range 0 to 31.
For the 64-bit variant: is the bit number of the lsb of the destination bitfield, in the range 0 to 63.
<width> For the 32-bit variant: is the width of the bitfield, in the range 1 to 32-<lsb>.
For the 64-bit variant: is the width of the bitfield, in the range 1 to 64-<lsb>.


Operation 


The description of UBFM gives the operational pseudocode for this instruction. 
C6.2.213 UBFM

Unsigned Bitfield Move copies any number of low-order bits from a source register into the same number of
adjacent bits at any position in the destination register, with zeros in the upper and lower bits.

This instruction is used by the aliases LSL (immediate), LSR (immediate), UBFIZ, UBFX, UXTB, and UXTH. See
Alias conditions on page C6-753 for details of when each alias is preferred.

32-bit variant
Applies when sf == 0 && N == 0.

UBFM <Wd>, <Wn>, #<immr>, #<imms>

64-bit variant
Applies when sf == 1 && N == 1.

UBFM <Xd>, <Xn>, #<immr>, #<imms>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer datasize = if sf == '1' then 64 else 32; 


integer R; 
bits(datasize) wmask; 
bits(datasize) tmask; 


if sf == '1' && N != '1' then ReservedValue();
if sf == '0' && (N != '0' || immr<5> != '0' || imms<5> != '0') then ReservedValue();


R = UInt(immr); 
(wmask, tmask) = DecodeBitMasks(N, imms, immr, FALSE); 
Alias conditions

Alias             of variant   is preferred when
LSL (immediate)   32-bit       imms != '011111' && imms + 1 == immr
LSL (immediate)   64-bit       imms != '111111' && imms + 1 == immr
LSR (immediate)   32-bit       imms == '011111'
LSR (immediate)   64-bit       imms == '111111'
UBFIZ             -            UInt(imms) < UInt(immr)
UBFX              -            BFXPreferred(sf, opc<1>, imms, immr)
UXTB              -            immr == '000000' && imms == '000111'
UXTH              -            immr == '000000' && imms == '001111'

Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<immr> For the 32-bit variant: is the right rotate amount, in the range 0 to 31, encoded in the "immr" field. 


For the 64-bit variant: is the right rotate amount, in the range 0 to 63, encoded in the "immr" field. 


<imms> For the 32-bit variant: is the leftmost bit number to be moved from the source, in the range 0 to 31, 
encoded in the "imms" field. 


For the 64-bit variant: is the leftmost bit number to be moved from the source, in the range 0 to 63, 
encoded in the "imms" field. 
Operation 
bits(datasize) src = X[n]; 


// perform bitfield move on low bits 
bits(datasize) bot = ROR(src, R) AND wmask; 


// combine extension bits and result bits 
X[d] = bot AND tmask; 
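
The effect of the rotate-and-mask pseudocode can be reproduced without DecodeBitMasks by treating the two cases of UBFM separately. The following Python sketch is an illustrative model, not the architectural definition: the function name `ubfm` is hypothetical, and the case split (extract when imms >= immr, insert-in-zero otherwise) is a restatement of what ROR, wmask, and tmask compute together.

```python
def ubfm(src: int, immr: int, imms: int, datasize: int = 32) -> int:
    """Model UBFM on a datasize-bit register value held as an unsigned integer."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not (0 <= immr < datasize and 0 <= imms < datasize):
        raise ValueError("immr and imms must be in the range 0 to datasize-1")
    mask = (1 << datasize) - 1
    src &= mask
    if imms >= immr:
        # Extract form: bits [imms:immr] of the source land at bit 0.
        width = imms - immr + 1
        return (src >> immr) & ((1 << width) - 1)
    # Insert-in-zero form: bits [imms:0] of the source land at bit (datasize - immr).
    width = imms + 1
    return ((src & ((1 << width) - 1)) << (datasize - immr)) & mask
```

With this model, `ubfm(x, 0, 7)` behaves like UXTB, `ubfm(x, 4, 31)` like a 32-bit LSR by 4, and `ubfm(x, 28, 27)` like a 32-bit LSL by 4, consistent with the alias conditions.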
C6.2.214 UBFX


Unsigned Bitfield Extract extracts any number of adjacent bits at any position from a register, zero-extends them to 
the size of the register, and writes the result to the destination register. 


This instruction is an alias of the UBFM instruction. This means that: 
• The encodings in this description are named to match the encodings of UBFM.
• The description of UBFM gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24 23|22|21      16|15      10|9     5|4     0|
|sf| 1  0| 1  0  0  1  1  0| N|   immr   |   imms   |   Rn  |   Rd  |
     opc

32-bit variant

Applies when sf == 0 && N == 0.

UBFX <Wd>, <Wn>, #<lsb>, #<width>

is equivalent to

UBFM <Wd>, <Wn>, #<lsb>, #(<lsb>+<width>-1)

and is the preferred disassembly when BFXPreferred(sf, opc<1>, imms, immr).

64-bit variant

Applies when sf == 1 && N == 1.

UBFX <Xd>, <Xn>, #<lsb>, #<width>

is equivalent to

UBFM <Xd>, <Xn>, #<lsb>, #(<lsb>+<width>-1)

and is the preferred disassembly when BFXPreferred(sf, opc<1>, imms, immr).


Assembler symbols 

<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 


<lsb> For the 32-bit variant: is the bit number of the lsb of the source bitfield, in the range 0 to 31.
For the 64-bit variant: is the bit number of the lsb of the source bitfield, in the range 0 to 63.
<width> For the 32-bit variant: is the width of the bitfield, in the range 1 to 32-<lsb>.
For the 64-bit variant: is the width of the bitfield, in the range 1 to 64-<lsb>.


Operation 


The description of UBFM gives the operational pseudocode for this instruction. 
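
The operand translation that an assembler performs for the UBFX and UBFIZ aliases can be sketched as follows. The helper names `ubfx_to_ubfm` and `ubfiz_to_ubfm` are hypothetical. Each returns the (immr, imms) pair of the equivalent UBFM instruction, using the equivalences and operand ranges given in the alias descriptions.

```python
def _check(lsb: int, width: int, datasize: int) -> None:
    """Enforce the documented ranges: lsb in 0..datasize-1, width in 1..datasize-lsb."""
    if datasize not in (32, 64):
        raise ValueError("datasize must be 32 or 64")
    if not 0 <= lsb < datasize or not 1 <= width <= datasize - lsb:
        raise ValueError("lsb/width out of range")

def ubfx_to_ubfm(lsb: int, width: int, datasize: int = 32) -> tuple:
    """UBFX Rd, Rn, #lsb, #width  ->  UBFM Rd, Rn, #lsb, #(lsb+width-1)."""
    _check(lsb, width, datasize)
    return lsb, lsb + width - 1

def ubfiz_to_ubfm(lsb: int, width: int, datasize: int = 32) -> tuple:
    """UBFIZ Rd, Rn, #lsb, #width  ->  UBFM Rd, Rn, #(-lsb MOD datasize), #(width-1)."""
    _check(lsb, width, datasize)
    return (-lsb) % datasize, width - 1
```

For a nonzero <lsb>, the UBFIZ form always yields imms < immr, which is the condition under which a disassembler prefers UBFIZ.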
C6.2.215 UDIV


Unsigned Divide divides an unsigned integer register value by another unsigned integer register value, and writes 
the result to the destination register. The condition flags are not affected. 


|31|30|29|28 27 26 25 24 23 22 21|20  16|15 14 13 12 11|10|9     5|4     0|
|sf| 0| 0| 1  1  0  1  0  1  1  0|  Rm  | 0  0  0  0  1| 0|   Rn  |   Rd  |
                                                        o1

32-bit variant
Applies when sf == 0.

UDIV <Wd>, <Wn>, <Wm>

64-bit variant
Applies when sf == 1.

UDIV <Xd>, <Xn>, <Xm>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if sf == '1' then 64 else 32; 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Wn> Is the 32-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Wm> Is the 32-bit name of the second general-purpose source register, encoded in the "Rm" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the first general-purpose source register, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the second general-purpose source register, encoded in the "Rm" field. 
Operation 

bits(datasize) operand1 = X[n];
bits(datasize) operand2 = X[m];
integer result;

if IsZero(operand2) then
    result = 0;
else
    result = RoundTowardsZero(Real(Int(operand1, TRUE)) / Real(Int(operand2, TRUE)));

X[d] = result<datasize-1:0>;
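
A point that the pseudocode makes explicit is that division by zero does not trap and instead yields zero. A minimal Python model of the operation, with a hypothetical function name `udiv` and register values held as unsigned integers, is:

```python
def udiv(xn: int, xm: int, datasize: int = 64) -> int:
    """Model UDIV: unsigned division, with a zero divisor producing zero."""
    mask = (1 << datasize) - 1
    operand1 = xn & mask
    operand2 = xm & mask
    if operand2 == 0:
        return 0                       # IsZero(operand2): result is 0, no exception
    # Both operands are non-negative, so floor division rounds towards zero.
    return (operand1 // operand2) & mask
```

Software that needs to detect a zero divisor must therefore test for it explicitly before or after the division.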
C6.2.216 UMADDL


Unsigned Multiply-Add Long multiplies two 32-bit register values, adds a 64-bit register value, and writes the result 
to the 64-bit destination register. 


This instruction is used by the alias UMULL. See Alias conditions for details of when each alias is preferred. 
64-bit variant


UMADDL <Xd>, <Wn>, <Wm>, <Xa> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer a = UInt(Ra); 


Alias conditions

Alias    is preferred when
UMULL    Ra == '11111'





Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 
<Xa> Is the 64-bit name of the third general-purpose source register holding the addend, encoded in the 
"Ra" field. 
Operation 
bits(32) operand1 = X[n];
bits(32) operand2 = X[m];
bits(64) operand3 = X[a];
integer result;

result = Int(operand3, TRUE) + (Int(operand1, TRUE) * Int(operand2, TRUE));

X[d] = result<63:0>;
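
The operation above can be modelled directly in Python, where integers are unbounded and the final truncation to 64 bits has to be written out. The names `umaddl` and `umull` are used here for illustration. `umull` models the alias by passing a zero addend, which is what reading XZR as Ra provides.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def umaddl(wn: int, wm: int, xa: int) -> int:
    """Model UMADDL: Xd = (Xa + UInt(Wn) * UInt(Wm))<63:0>."""
    product = (wn & MASK32) * (wm & MASK32)   # full 64-bit unsigned product
    return ((xa & MASK64) + product) & MASK64 # sum wraps modulo 2**64

def umull(wn: int, wm: int) -> int:
    """Model the UMULL alias: UMADDL with a zero addend."""
    return umaddl(wn, wm, 0)
```

The product of two 32-bit values always fits in 64 bits, so only the addition can wrap.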
C6.2.217 UMNEGL

Unsigned Multiply-Negate Long multiplies two 32-bit register values, negates the product, and writes the result to
the 64-bit destination register.

This instruction is an alias of the UMSUBL instruction. This means that:

• The encodings in this description are named to match the encodings of UMSUBL.
• The description of UMSUBL gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24|23|22 21|20  16|15|14          10|9     5|4     0|
| 1| 0  0| 1  1  0  1  1| 1| 0  1|  Rm  | 1| 1  1  1  1  1|   Rn  |   Rd  |
                         U               o0 Ra

64-bit variant 

UMNEGL <Xd>, <Wn>, <Wm> 

is equivalent to 

UMSUBL <Xd>, <Wn>, <Wm>, XZR 

and is always the preferred disassembly. 

Assembler symbols 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 

Operation 

The description of UMSUBL gives the operational pseudocode for this instruction. 
C6.2.218 UMSUBL


Unsigned Multiply-Subtract Long multiplies two 32-bit register values, subtracts the product from a 64-bit register 
value, and writes the result to the 64-bit destination register. 


This instruction is used by the alias UMNEGL. See Alias conditions for details of when each alias is preferred. 
64-bit variant


UMSUBL <Xd>, <Wn>, <Wm>, <Xa> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer a = UInt(Ra); 


Alias conditions

Alias     is preferred when
UMNEGL    Ra == '11111'





Assembler symbols 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 


<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 


<Xa> Is the 64-bit name of the third general-purpose source register holding the minuend, encoded in the 
"Ra" field. 


Operation 


bits(32) operand1 = X[n];
bits(32) operand2 = X[m];
bits(64) operand3 = X[a];
integer result;

result = Int(operand3, TRUE) - (Int(operand1, TRUE) * Int(operand2, TRUE));
X[d] = result<63:0>;
C6.2.219 UMULH

Unsigned Multiply High multiplies two 64-bit register values, and writes bits[127:64] of the 128-bit result to the
64-bit destination register.

|31|30 29|28 27 26 25 24|23|22 21|20  16|15|14              10|9     5|4     0|
| 1| 0  0| 1  1  0  1  1| 1| 1  0|  Rm  | 0|(1)(1)(1)(1)(1)   |   Rn  |   Rd  |
                         U               o0 Ra


64-bit variant 


UMULH <Xd>, <Xn>, <Xm> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


Assembler symbols 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xn> Is the 64-bit name of the first general-purpose source register holding the multiplicand, encoded in
the "Rn" field.
<Xm> Is the 64-bit name of the second general-purpose source register holding the multiplier, encoded in
the "Rm" field.


Operation 
bits(64) operand1 = X[n];
bits(64) operand2 = X[m];
integer result;

result = Int(operand1, TRUE) * Int(operand2, TRUE);

X[d] = result<127:64>;
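
Because Python integers are unbounded, the 128-bit intermediate product in the pseudocode can be modelled directly. The function name `umulh` is used here only for illustration.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def umulh(xn: int, xm: int) -> int:
    """Model UMULH: return bits [127:64] of the unsigned 128-bit product."""
    product = (xn & MASK64) * (xm & MASK64)   # up to 128 bits wide
    return (product >> 64) & MASK64
```

Pairing this with an ordinary 64-bit multiply of the same operands gives the low half, so the two results together form the full 128-bit product.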
C6.2.220 UMULL

Unsigned Multiply Long multiplies two 32-bit register values, and writes the result to the 64-bit destination register. 

This instruction is an alias of the UMADDL instruction. This means that: 

• The encodings in this description are named to match the encodings of UMADDL.
• The description of UMADDL gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24|23|22 21|20  16|15|14          10|9     5|4     0|
| 1| 0  0| 1  1  0  1  1| 1| 0  1|  Rm  | 0| 1  1  1  1  1|   Rn  |   Rd  |
                         U               o0 Ra

64-bit variant

UMULL <Xd>, <Wn>, <Wm> 

is equivalent to 

UMADDL <Xd>, <Wn>, <Wm>, XZR 

and is always the preferred disassembly. 

Assembler symbols 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the first general-purpose source register holding the multiplicand, encoded in 
the "Rn" field. 

<Wm> Is the 32-bit name of the second general-purpose source register holding the multiplier, encoded in 
the "Rm" field. 

Operation 

The description of UMADDL gives the operational pseudocode for this instruction. 
C6.2.221 UXTB


Unsigned Extend Byte extracts an 8-bit value from a register, zero-extends it to the size of the register, and writes 
the result to the destination register. 


This instruction is an alias of the UBFM instruction. This means that: 
• The encodings in this description are named to match the encodings of UBFM.
• The description of UBFM gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24 23|22|21              16|15              10|9     5|4     0|
| 0| 1  0| 1  0  0  1  1  0| 0| 0  0  0  0  0  0| 0  0  0  1  1  1|   Rn  |   Rd  |
 sf  opc                     N  immr              imms


32-bit variant 
UXTB <Wd>, <Wn> 
is equivalent to 
UBFM <Wd>, <Wn>, #0, #7 


and is always the preferred disassembly. 


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 


Operation 


The description of UBFM gives the operational pseudocode for this instruction. 
C6.2.222 UXTH


Unsigned Extend Halfword extracts a 16-bit value from a register, zero-extends it to the size of the register, and 
writes the result to the destination register. 


This instruction is an alias of the UBFM instruction. This means that: 
• The encodings in this description are named to match the encodings of UBFM.
• The description of UBFM gives the operational pseudocode for this instruction.

|31|30 29|28 27 26 25 24 23|22|21              16|15              10|9     5|4     0|
| 0| 1  0| 1  0  0  1  1  0| 0| 0  0  0  0  0  0| 0  0  1  1  1  1|   Rn  |   Rd  |
 sf  opc                     N  immr              imms


32-bit variant 
UXTH <Wd>, <Wn> 
is equivalent to 
UBFM <Wd>, <Wn>, #0, #15 


and is always the preferred disassembly. 


Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 


Operation 


The description of UBFM gives the operational pseudocode for this instruction. 
C6.2.223 WFE

Wait For Event is a hint instruction that permits the PE to enter a low-power state until one of a number of events
occurs, including events signaled by executing the SEV instruction on any PE in the multiprocessor system. For more
information, see Wait for Event mechanism and Send event on page D1-1599.

As described in Wait for Event mechanism and Send event on page D1-1599, the execution of a WFE instruction that
would otherwise cause entry to a low-power state can be trapped to a higher Exception level. See:

• Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565.
• Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page D1-1581.
• Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions on page D1-1591.

System variant


WFE 


Decode for this encoding 


// Empty. 


Operation 


if EventRegistered() then
    ClearEventRegister();
else
    if PSTATE.EL == EL0 then
        // Check for traps described by the OS.
        AArch64.CheckForWFxTrap(EL1, TRUE);
    if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0, EL1} then
        // Check for traps described by the Hypervisor.
        AArch64.CheckForWFxTrap(EL2, TRUE);
    if HaveEL(EL3) && PSTATE.EL != EL3 then
        // Check for traps described by the Secure Monitor.
        AArch64.CheckForWFxTrap(EL3, TRUE);
    WaitForEvent();
C6.2.224 WFI

Wait For Interrupt is a hint instruction that permits the PE to enter a low-power state until one of a number of
asynchronous events occurs. For more information, see Wait For Interrupt on page D1-1602.

As described in Wait For Interrupt on page D1-1602, the execution of a WFI instruction that would otherwise cause
entry to a low-power state can be trapped to a higher Exception level. See:

• Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565.
• Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page D1-1581.
• Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions on page D1-1591.

System variant


WFI 


Decode for this encoding 


// Empty. 


Operation 


if !InterruptPending() then
    if PSTATE.EL == EL0 then
        // Check for traps described by the OS.
        AArch64.CheckForWFxTrap(EL1, FALSE);
    if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0, EL1} then
        // Check for traps described by the Hypervisor.
        AArch64.CheckForWFxTrap(EL2, FALSE);
    if HaveEL(EL3) && PSTATE.EL != EL3 then
        // Check for traps described by the Secure Monitor.
        AArch64.CheckForWFxTrap(EL3, FALSE);
    WaitForInterrupt();
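
WFE and WFI share the same sequence of trap checks, differing only in the flag passed to AArch64.CheckForWFxTrap. The order in which target Exception levels are consulted can be sketched in Python as below. The function name `wfx_trap_checks` and its parameters are hypothetical stand-ins for PSTATE.EL, HaveEL(), and IsSecure(). The sketch only lists which checks are made and does not model whether any of them actually traps.

```python
def wfx_trap_checks(el: int, have_el2: bool = True, have_el3: bool = True,
                    secure: bool = False) -> list:
    """Return the target Exception levels checked, in order, for a WFE or WFI at `el`."""
    if el not in (0, 1, 2, 3):
        raise ValueError("el must be 0, 1, 2 or 3")
    checks = []
    if el == 0:
        checks.append(1)                       # traps described by the OS
    if have_el2 and not secure and el in (0, 1):
        checks.append(2)                       # traps described by the Hypervisor
    if have_el3 and el != 3:
        checks.append(3)                       # traps described by the Secure Monitor
    return checks
```

For example, a Non-secure EL0 execution on an implementation with all Exception levels is checked against EL1, then EL2, then EL3, while a Secure EL1 execution is checked against EL3 only.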


C6.2.225 YIELD 


YIELD is a hint instruction. Software with a multithreading capability can use a YIELD instruction to indicate to the 
PE that it is performing a task, for example a spin-lock, that could be swapped out to improve overall system 
performance. The PE can use this hint to suspend and resume multiple software threads if it supports the capability. 


For more information about the recommended use of this instruction, see The YIELD instruction on page B1-64. 


|31                                                        12|11        8|7      5|4            0|
| 1  1  0  1  0  1  0  1  0  0  0  0  0  0  1  1  0  0  1  0| 0  0  0  0| 0  0  1| 1  1  1  1  1|
                                                              CRm         op2

System variant

YIELD

Decode for this encoding


// Empty. 


Operation 


Hint_Yield(); 
Chapter C7
A64 Advanced SIMD and Floating-point Instruction Descriptions


This chapter describes the A64 SIMD and floating-point instructions. 


It contains the following sections: 
• About the A64 SIMD and floating-point instructions on page C7-768.
• Alphabetical list of A64 floating-point and Advanced SIMD instructions on page C7-770.
C7.1 About the A64 SIMD and floating-point instructions


Alphabetical list of A64 floating-point and Advanced SIMD instructions on page C7-770 gives full descriptions of 
the A64 instructions that are in the following instruction groups: 


• Load and store instructions associated with the SIMD and floating-point registers.
• Data processing instructions with SIMD and floating-point registers.


A64 instruction index by encoding on page C4-192 in the A64 Instruction Encodings chapter provides an overview 
of the instruction encodings as part of an instruction class within a functional group. 


The rest of this section is a general description of the SIMD and floating-point instructions. It contains the following 
subsections: 


• Register size.
• Data types.
• Condition flags and related instructions on page C7-769.
• General capabilities on page C7-769.

C7.1.1 Register size 

A64 provides a comprehensive set of packed Single Instruction Multiple Data (SIMD) and scalar operations using
data held in the 32-entry, 128-bit wide SIMD and floating-point register file.

Each SIMD and floating-point register can be used to hold:

• A single scalar value of the floating-point or integer type.
• A 64-bit wide vector containing one or more elements.
• A 128-bit wide vector containing two or more elements.

Where the entire 128-bit wide register is not fully utilized, the vector or scalar quantity is held in the least significant
bits of the register, with the most significant bits being cleared to zero on a write, see Vector formats on page A1-37.

The following instructions can insert data into individual elements within a SIMD and floating-point register
without clearing the remaining bits to zero:

• Insert vector element from another vector element or general-purpose register, INS.
• Load structure into a single lane, for example LD3.
• All second-part narrowing operations, for example SHRN2.
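
The difference between an ordinary write and an element insert can be illustrated with a small model. The following Python sketch is not part of the architecture specification. It assumes that a SIMD and floating-point register is modeled as a 128-bit unsigned integer, and the function names write_low and insert_element are hypothetical.

```python
MASK128 = (1 << 128) - 1  # the register is modeled as a 128-bit unsigned integer


def write_low(value: int, width: int) -> int:
    """Model an ordinary write of `width` bits: the result occupies the
    least significant bits and every higher bit of the register is zero."""
    return value & ((1 << width) - 1)


def insert_element(reg: int, index: int, esize: int, value: int) -> int:
    """Model an INS-style write: only element `index` of `reg` changes."""
    shift = index * esize
    lane_mask = ((1 << esize) - 1) << shift
    return ((reg & ~lane_mask) | ((value << shift) & lane_mask)) & MASK128
```

In this model a 64-bit write to a register that previously held all ones leaves bits [127:64] as zero, whereas insert_element changes only the selected element.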

C7.1.2 Data types 

The A64 instruction set provides support for arithmetic, conversion, and bitwise operations on:

• Half-precision, single-precision, and double-precision floating-point values.
• Signed and unsigned integers.
• Polynomials over {0, 1}.

For all AArch64 floating-point operations, including SIMD operations, the rounding mode and exception trap
handling are controlled by the FPCR.

Note

AArch32 Advanced SIMD operations always use ARM standard floating-point arithmetic independent of the
AArch64 FPCR or AArch32 FPSCR rounding mode. In AArch64 state floating-point multiply-addition operations
are always performed as fused operations, but AArch32 state provides both fused and chained variants.

In addition to operations that consume and produce values of the same width and type, the A64 instruction set
supports SIMD and scalar operations that produce a wider or narrower vector result:

• Where a SIMD operation narrows a 128-bit vector to a 64-bit vector, the A64 instruction set provides a
  second-part operation, for example SHRN2, that can pack the result of a second operation into the upper part
  of the same destination register.
• Where a SIMD operation widens a 64-bit vector to a 128-bit vector, the A64 instruction set provides a
  second-part operation, for example SMLAL2, that can extract the source from the upper 64 bits of the source
  registers.

All SIMD operations that could produce side-effects that are not limited to the destination SIMD and floating-point
register, for example a potential update of FPSR.Q or FPSR.IDC, have a dedicated scalar variant to support the use
of SIMD with loops requiring specialised head or tail handling, or both.


C7.1.3 Condition flags and related instructions

The A64 instruction set provides support for flag setting and conditional operations on the SIMD and floating-point
register file:

• Floating-point FCSEL and FCCMP instructions are equivalent to the integer CSEL and CCMP instructions.
• Floating-point FCMP, FCMPE, FCCMP, and FCCMPE set the PSTATE.{N, Z, C, V} flags based on the result of the
  floating-point comparison.
• Floating-point and integer instructions provide a means of producing either a scalar or a vector mask based
  on a comparison in a SIMD and floating-point register, for example FCMEQ.

Note

FCMP and FCMPE differ from the A32/T32 VCMP and VCMPE instructions, which use the dedicated FPSCR.NZCV field
for the result. A64 instructions store the result of an FCMP or FCMPE operation in the PSTATE.{N, Z, C, V} field.


C7.1.4 General capabilities


A64 SIMD and floating-point instructions provide the following capabilities: 


• General arithmetic on vector and scalar floating-point and integer values.
• Dedicated polynomial multiply over {0, 1}.
• Vector and scalar fused multiply-addition of single-precision and double-precision floating-point values.
• Load and store of single and pairs of SIMD and floating-point registers.
• Load and store of structures and individual lanes of between one and four SIMD and floating-point registers.
• Direct conversion between 64-bit integers and floating-point values, with explicit rounding.
• Double-rounding free conversion between double-precision and half-precision floating-point values.
• Comprehensive SIMD with widening and narrowing support.
• Vector to scalar reduction returning the minimum or maximum value, or the sum.
• Floating-point to nearest integer in floating-point format.
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions


This section lists every instruction in the floating-point and Advanced SIMD categories of the A64 instruction set. For
details of the format used, see Structure of the A64 assembler language on page C1-123.


C7.2.1 ABS 


Absolute value (vector). This instruction calculates the absolute value of each vector element in the source 
SIMD&FP register, puts the result into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31:30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 01    | 0  | 11110 | size  | 10000 | 01011 | 10    | Rn  | Rd  |
|       | U  |       |       |       |       |       |     |     |


Scalar variant 


ABS <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 


integer elements = 1; 
boolean neg = (U == '1'); 


Vector

| 31 | 30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | size  | 10000 | 01011 | 10    | Rn  | Rd  |
|    |    | U  |       |       |       |       |       |     |     |


Vector variant


ABS <Vd>.<T>, <Vn>.<T>


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '110' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean neg = (U == '1');


Assembler symbols


<V>     Is a width specifier, encoded in the "size" field. It can have the following values:
        D       when size = 11
        The following encodings are reserved:
        • size = 0x.
        • size = 10.

<d>     Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<n>     Is the number of the SIMD&FP source register, encoded in the "Rn" field.

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
        8B      when size = 00, Q = 0
        16B     when size = 00, Q = 1
        4H      when size = 01, Q = 0
        8H      when size = 01, Q = 1
        2S      when size = 10, Q = 0
        4S      when size = 10, Q = 1
        2D      when size = 11, Q = 1
        The encoding size = 11, Q = 0 is reserved.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;
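
The per-element behavior of ABS can be modeled in a few lines. The following Python sketch is illustrative only and is not part of the architecture specification. It assumes that vectors are held as unsigned integers with element 0 in the least significant bits, and the helper names sint and abs_vector are hypothetical.

```python
def sint(value: int, width: int) -> int:
    """Interpret the low `width` bits of `value` as a two's complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if (value >> (width - 1)) & 1 else value


def abs_vector(operand: int, esize: int, elements: int) -> int:
    """Element-wise absolute value, truncated to `esize` bits per element."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(elements):
        element = sint(operand >> (e * esize), esize)
        # Truncation means the most negative value maps to itself.
        result |= (abs(element) & mask) << (e * esize)
    return result
```

Because the result is truncated to esize bits, an element holding the most negative value, for example 0x80 for 8-bit elements, is unchanged in this model.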
C7.2.2 ADD (vector)


Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results 
into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar

| 31:30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 01    | 0  | 11110 | size  | 1  | Rm    | 10000 | 1  | Rn  | Rd  |
|       | U  |       |       |    |       |       |    |     |     |


Scalar variant


ADD <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size != '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean sub_op = (U == '1'); 


Vector

| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 10000 | 1  | Rn  | Rd  |
|    |    | U  |       |       |    |       |       |    |     |     |


Vector variant 


ADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean sub_op = (U == '1'); 


Assembler symbols


<V>     Is a width specifier, encoded in the "size" field. It can have the following values:
        D       when size = 11
        The following encodings are reserved:
        • size = 0x.
        • size = 10.

<d>     Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<n>     Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m>     Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
        8B      when size = 00, Q = 0
        16B     when size = 00, Q = 1
        4H      when size = 01, Q = 0
        8H      when size = 01, Q = 1
        2S      when size = 10, Q = 0
        4S      when size = 10, Q = 1
        2D      when size = 11, Q = 1
        The encoding size = 11, Q = 0 is reserved.

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;
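
The element-wise addition wraps within each element and never carries into a neighbouring element. The following Python sketch illustrates this under the assumption that vectors are held as unsigned integers with element 0 in the least significant bits. The function name add_vector is hypothetical, and the sketch is not part of the architecture specification.

```python
def add_vector(operand1: int, operand2: int, esize: int, elements: int,
               sub_op: bool = False) -> int:
    """Add (or subtract, when sub_op is set) corresponding elements modulo 2**esize."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(elements):
        element1 = (operand1 >> (e * esize)) & mask
        element2 = (operand2 >> (e * esize)) & mask
        value = element1 - element2 if sub_op else element1 + element2
        result |= (value & mask) << (e * esize)
    return result
```

For example, with 8-bit elements, adding 0x01 to an element holding 0xFF gives 0x00 in that element and leaves the adjacent element unaffected.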
C7.2.3 ADDHN, ADDHN2


Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the 
corresponding vector element in the second source SIMD&FP register, places the most significant half of the result 
into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. 


The results are truncated. For rounded results, see RADDHN, RADDHN2. 


The ADDHN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the ADDHN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:14 | 13 | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 01    | 0  | 0  | 00    | Rn  | Rd  |
|    |    | U  |       |       |    |       |       | o1 |    |       |     |     |


Three registers, not all the same type variant 


ADDHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean round = (U == '1');


Assembler symbols


2       Is the second and upper half specifier. If present it causes the operation to be performed on the upper
        64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
        the following values:
        [absent]    when Q = 0
        [present]   when Q = 1

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb>    Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
        8B      when size = 00, Q = 0
        16B     when size = 00, Q = 1
        4H      when size = 01, Q = 0
        8H      when size = 01, Q = 1
        2S      when size = 10, Q = 0
        4S      when size = 10, Q = 1
        The encoding size = 11, Q = x is reserved.

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Ta>    Is an arrangement specifier, encoded in the "size" field. It can have the following values:
        8H      when size = 00
        4S      when size = 01
        2D      when size = 10
        The encoding size = 11 is reserved.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;
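
The narrowing step keeps only the most significant half of each double-width sum. The following Python sketch models the 64-bit result vector that is then written to the lower or upper half of the destination. It assumes that vectors are unsigned integers with element 0 in the least significant bits. The function name add_high_narrow and the rounding parameter are hypothetical names for illustration, and the sketch is not part of the architecture specification.

```python
def add_high_narrow(operand1: int, operand2: int, esize: int,
                    rounding: bool = False) -> int:
    """Return the 64-bit vector of high halves of the (2*esize)-bit sums."""
    datasize = 64
    elements = datasize // esize
    wide = 2 * esize
    wide_mask = (1 << wide) - 1
    round_const = (1 << (esize - 1)) if rounding else 0
    result = 0
    for e in range(elements):
        element1 = (operand1 >> (e * wide)) & wide_mask
        element2 = (operand2 >> (e * wide)) & wide_mask
        total = (element1 + element2 + round_const) & wide_mask
        result |= (total >> esize) << (e * esize)  # keep bits <2*esize-1:esize>
    return result
```

With rounding left clear the low half of each sum is discarded, which corresponds to the truncating behavior described for ADDHN. Setting rounding corresponds to the round_const term in the pseudocode.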
C7.2.4 ADDP (scalar)


Add Pair of elements (scalar). This instruction adds two vector elements in the source SIMD&FP register and writes 
the scalar result into the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31:30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 01    | 0  | 11110 | size  | 11000 | 11011 | 10    | Rn  | Rd  |


Advanced SIMD variant 


ADDP <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize * 2;


integer elements = 2; 


ReduceOp op = ReduceOp_ADD; 


Assembler symbols 


<V>     Is the destination width specifier, encoded in the "size" field. It can have the following values:
        D       when size = 11
        The following encodings are reserved:
        • size = 0x.
        • size = 10.

<d>     Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<T>     Is the source arrangement specifier, encoded in the "size" field. It can have the following values:
        2D      when size = 11
        The following encodings are reserved:
        • size = 0x.
        • size = 10.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n]; 
V[d] = Reduce(op, operand, esize); 
C7.2.5 ADDP (vector)


Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source 
SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent 
vector elements from the concatenated vector, adds each pair of values together, places the result into a vector, and 
writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 10111 | 1  | Rn  | Rd  |


Three registers of the same type variant


ADDP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size:Q == '110' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
        8B      when size = 00, Q = 0
        16B     when size = 00, Q = 1
        4H      when size = 01, Q = 0
        8H      when size = 01, Q = 1
        2S      when size = 10, Q = 0
        4S      when size = 10, Q = 1
        2D      when size = 11, Q = 1
        The encoding size = 11, Q = 0 is reserved.

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;


C7-778
Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775
Non-Confidential ID092916
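
The pairwise operation can be modeled by building the concatenated vector and summing adjacent elements. The following Python sketch is illustrative only. It assumes that vectors are unsigned integers with element 0 in the least significant bits, the function name add_pairwise is hypothetical, and the sketch is not part of the architecture specification.

```python
def add_pairwise(operand1: int, operand2: int, esize: int, datasize: int) -> int:
    """Sum adjacent element pairs of operand2:operand1, modulo 2**esize."""
    elements = datasize // esize
    mask = (1 << esize) - 1
    concat = (operand2 << datasize) | operand1  # operand1 supplies the low elements
    result = 0
    for e in range(elements):
        element1 = (concat >> (2 * e * esize)) & mask
        element2 = (concat >> ((2 * e + 1) * esize)) & mask
        result |= ((element1 + element2) & mask) << (e * esize)
    return result
```

In this model the lower half of the result holds the pair sums of the first source and the upper half holds the pair sums of the second source.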
C7.2.6 ADDV


Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes 
the scalar result to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | size  | 11000 | 11011 | 10    | Rn  | Rd  |


Advanced SIMD variant 


ADDV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size:Q == '100' then ReservedValue(); 
if size == '11' then ReservedValue();


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 


integer elements = datasize DIV esize; 


ReduceOp op = ReduceOp_ADD; 


Assembler symbols 


<V>     Is the destination width specifier, encoded in the "size" field. It can have the following values:
        B       when size = 00
        H       when size = 01
        S       when size = 10
        The encoding size = 11 is reserved.

<d>     Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
        8B      when size = 00, Q = 0
        16B     when size = 00, Q = 1
        4H      when size = 01, Q = 0
        8H      when size = 01, Q = 1
        4S      when size = 10, Q = 1
        The following encodings are reserved:
        • size = 10, Q = 0.
        • size = 11, Q = x.
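
The mapping from the size and Q fields to the assembler symbols can be expressed as a small lookup. The following Python sketch is illustrative only. It assumes that size and Q are passed as integers, the function name decode_addv is hypothetical, and the sketch is not part of the architecture specification.

```python
def decode_addv(size: int, q: int):
    """Return (V, T, esize, datasize, elements) for an ADDV encoding.

    Raises ValueError for the reserved encodings size:Q == '100' and size == '11'.
    """
    if size == 3 or (size == 2 and q == 0):
        raise ValueError("reserved encoding")
    esize = 8 << size
    datasize = 128 if q == 1 else 64
    elements = datasize // esize
    width = "BHS"[size]
    return width, f"{elements}{width}", esize, datasize, elements
```

For example, size = 10 with Q = 1 gives the destination width S and the source arrangement 4S.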
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n]; 
V[d] = Reduce(op, operand, esize); 
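
For ReduceOp_ADD, the reduction is equivalent to summing every element modulo 2^esize. The following Python sketch shows that equivalent computation rather than the structure of the Reduce() pseudocode function itself. It assumes that the source vector is an unsigned integer with element 0 in the least significant bits, the function name add_across is hypothetical, and the sketch is not part of the architecture specification.

```python
def add_across(operand: int, esize: int, datasize: int) -> int:
    """Sum all elements of the vector, truncated to `esize` bits."""
    mask = (1 << esize) - 1
    total = 0
    for e in range(datasize // esize):
        total += (operand >> (e * esize)) & mask
    return total & mask
```

The scalar result has the same width as one source element, so a sum that overflows the element width wraps.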
C7.2.7 AESD


AES single round decryption


| 31:24    | 23:22 | 21:17 | 16:13 | 12 | 11:10 | 9:5 | 4:0 |
| 01001110 | 00    | 10100 | 0010  | 1  | 10    | Rn  | Rd  |
|          |       |       |       | D  |       |     |     |


Advanced SIMD variant 


AESD <Vd>.16B, <Vn>.16B 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd>    Is the name of the SIMD&FP source and destination register, encoded in the "Rd" field.
<Vn>    Is the name of the second SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckCryptoEnabled64();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) result;
result = operand1 EOR operand2;
result = AESInvSubBytes(AESInvShiftRows(result));
V[d] = result;
C7.2.8 AESE


AES single round encryption


| 31:24    | 23:22 | 21:17 | 16:13 | 12 | 11:10 | 9:5 | 4:0 |
| 01001110 | 00    | 10100 | 0010  | 0  | 10    | Rn  | Rd  |
|          |       |       |       | D  |       |     |     |


Advanced SIMD variant 


AESE <Vd>.16B, <Vn>.16B 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd>    Is the name of the SIMD&FP source and destination register, encoded in the "Rd" field.
<Vn>    Is the name of the second SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckCryptoEnabled64();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) result;
result = operand1 EOR operand2;
result = AESSubBytes(AESShiftRows(result));
V[d] = result;
C7.2.9 AESIMC


AES inverse mix columns


| 31:24    | 23:22 | 21:17 | 16:13 | 12 | 11:10 | 9:5 | 4:0 |
| 01001110 | 00    | 10100 | 0011  | 1  | 10    | Rn  | Rd  |
|          |       |       |       | D  |       |     |     |


Advanced SIMD variant 


AESIMC <Vd>.16B, <Vn>.16B 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckCryptoEnabled64();

bits(128) operand = V[n];
bits(128) result;
result = AESInvMixColumns(operand);
V[d] = result;
C7.2.10 AESMC


AES mix columns


| 31:24    | 23:22 | 21:17 | 16:13 | 12 | 11:10 | 9:5 | 4:0 |
| 01001110 | 00    | 10100 | 0011  | 0  | 10    | Rn  | Rd  |
|          |       |       |       | D  |       |     |     |


Advanced SIMD variant 


AESMC <Vd>.16B, <Vn>.16B 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckCryptoEnabled64();

bits(128) operand = V[n];
bits(128) result;
result = AESMixColumns(operand);
V[d] = result;
C7.2.11 AND (vector)

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and
writes the result to the destination SIMD&FP register.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.

| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | 00    | 1  | Rm    | 00011 | 1  | Rn  | Rd  |
|    |    |    |       | size  |    |       |       |    |     |     |

Three registers of the same type variant

AND <Vd>.<T>, <Vn>.<T>, <Vm>.<T>

Decode for this encoding

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer datasize = if Q == '1' then 128 else 64;

Assembler symbols

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
        8B      when Q = 0
        16B     when Q = 1
<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
result = operand1 AND operand2;
V[d] = result;
C7.2.12 BIC (vector, immediate)


Bitwise bit Clear (vector, immediate). This instruction reads each vector element from the destination SIMD&FP 
register, performs a bitwise AND between each result and the complement of an immediate constant, places the 
result into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:19      | 18 | 17 | 16 | 15:12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4:0 |
| 0  | Q  | 1  | 0111100000 | a  | b  | c  | xxx1  | 0  | 1  | d | e | f | g | h | Rd  |
|    |    | op |            |    |    |    | cmode |    |    |   |   |   |   |   |     |


16-bit variant 
Applies when cmode == 10x1. 


BIC <Vd>.<T>, #<imm8>{, LSL #<amount>} 


32-bit variant 
Applies when cmode == 0xx1.


BIC <Vd>.<T>, #<imm8>{, LSL #<amount>} 


Decode for all variants of this encoding


integer rd = UInt(Rd);

integer datasize = if Q == '1' then 128 else 64;
bits(datasize) imm;
bits(64) imm64;

ImmediateOp operation;
case cmode:op of
    when '0xx01' operation = ImmediateOp_MVNI;
    when '0xx11' operation = ImmediateOp_BIC;
    when '10x01' operation = ImmediateOp_MVNI;
    when '10x11' operation = ImmediateOp_BIC;
    when '110x1' operation = ImmediateOp_MVNI;
    when '1110x' operation = ImmediateOp_MOVI;
    when '11111'
        // FMOV Dn,#imm is in main FP instruction set
        if Q == '0' then UnallocatedEncoding();
        operation = ImmediateOp_MOVI;

imm64 = AdvSIMDExpandImm(op, cmode, a:b:c:d:e:f:g:h);
imm = Replicate(imm64, datasize DIV 64);


Assembler symbols 





<Vd>        Is the name of the SIMD&FP register, encoded in the "Rd" field.

<T>         For the 16-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
            following values:
            4H      when Q = 0
            8H      when Q = 1
            For the 32-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
            following values:
            2S      when Q = 0
            4S      when Q = 1

<imm8>      Is an 8-bit immediate encoded in "a:b:c:d:e:f:g:h".

<amount>    For the 16-bit variant: is the shift amount encoded in the "cmode<1>" field. It can have the
            following values:
            0       when cmode<1> = 0
            8       when cmode<1> = 1
            defaulting to 0 if LSL is omitted.
            For the 32-bit variant: is the shift amount encoded in the "cmode<2:1>" field. It can have the
            following values:
            0       when cmode<2:1> = 00
            8       when cmode<2:1> = 01
            16      when cmode<2:1> = 10
            24      when cmode<2:1> = 11
            defaulting to 0 if LSL is omitted.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand;
bits(datasize) result;

case operation of
    when ImmediateOp_MOVI
        result = imm;
    when ImmediateOp_MVNI
        result = NOT(imm);
    when ImmediateOp_ORR
        operand = V[rd];
        result = operand OR imm;
    when ImmediateOp_BIC
        operand = V[rd];
        result = operand AND NOT(imm);

V[rd] = result;
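
For the BIC case, the expanded immediate is the shifted 8-bit value replicated into every 16-bit or 32-bit element. The following Python sketch models only that case and does not reproduce the full AdvSIMDExpandImm() function. It assumes that the register is an unsigned integer, the function name bic_immediate is hypothetical, and the sketch is not part of the architecture specification.

```python
def bic_immediate(vd: int, imm8: int, amount: int, esize: int, datasize: int) -> int:
    """Clear, in every element of vd, the bits set in (imm8 << amount)."""
    if esize not in (16, 32) or amount % 8 != 0 or not 0 <= amount < esize:
        raise ValueError("unsupported element size or shift amount")
    element_imm = (imm8 & 0xFF) << amount
    imm = 0
    for e in range(datasize // esize):
        imm |= element_imm << (e * esize)  # replicate across the vector
    return vd & ~imm & ((1 << datasize) - 1)
```

For example, the 32-bit variant with #0xFF, LSL #8 clears bits [15:8] of each 32-bit element.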
C7.2.13 BIC (vector, register)


Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP 
register and the complement of the second source SIMD&FP register, and writes the result to the destination 
SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | 01    | 1  | Rm    | 00011 | 1  | Rn  | Rd  |
|    |    |    |       | size  |    |       |       |    |     |     |


Three registers of the same type variant 


BIC <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
        8B      when Q = 0
        16B     when Q = 1
<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;
C7.2.14 BIF


Bitwise Insert if False. This instruction inserts each bit from the first source SIMD&FP register into the destination 
SIMD&FP register if the corresponding bit of the second source SIMD&FP register is 0, otherwise leaves the bit in 
the destination register unchanged. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0  | Q  | 1  | 01110 | 11    | 1  | Rm    | 00011 | 1  | Rn  | Rd  |
|    |    |    |       | opc2  |    |       |       |    |     |     |


Three registers of the same type variant 


BIF <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
        8B      when Q = 0
        16B     when Q = 1
<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);
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C7.2.15 BIT 
Bitwise Insert if True. This instruction inserts each bit from the first source SIMD&FP register into the SIMD&FP 
destination register if the corresponding bit of the second source SIMD&FP register is 1, otherwise leaves the bit in 
the destination register unchanged. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31 30 29 28|27 26 25 24|23 22 21 20| 16/15 141312|1110 9 | 5 4| 0 | 
olafijo 117 oft opi] am jooo7i] mm [| Ra | 
opc2 
Three registers of the same type variant 
BIT <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values: 
8B when Q = 0 
16B when Q = 1 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 
CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1; 
bits(datasize) operand3; 
bits(datasize) operand4 = V[n]; 
operand1 = V[d]; 
operand3 = V[m]; 
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3); 
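
The EOR/AND/EOR form used in the pseudocode is equivalent to a plain per-bit select. The Python sketch below, which models registers as integers and uses the hypothetical names bit and bit_select_form, shows both forms side by side so that the equivalence can be checked.

```python
def bit(vd: int, vn: int, vm: int, datasize: int = 64) -> int:
    """BIT as written in the pseudocode: operand3 = V[m]."""
    mask = (1 << datasize) - 1
    operand1, operand4, operand3 = vd & mask, vn & mask, vm & mask
    return operand1 ^ ((operand1 ^ operand4) & operand3)


def bit_select_form(vd: int, vn: int, vm: int, datasize: int = 64) -> int:
    """Same result, written as: keep vd where vm is 0, take vn where vm is 1."""
    mask = (1 << datasize) - 1
    return ((vd & ~vm) | (vn & vm)) & mask


print(hex(bit(0xFF00, 0x0F0F, 0x00FF, datasize=16)))  # 0xff0f
```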
C7.2.16 BSL 


Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the 
first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP 
register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12 11|10|9       5|4       0|
| 0| Q| 1| 0  1  1  1  0| 0  1| 1|    Rm    | 0  0  0  1  1| 1|    Rn   |    Rd   |
                         opc2


Three registers of the same type variant 


BSL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values: 
8B when Q = 0 
16B when Q = 1 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1; 
bits(datasize) operand3; 
bits(datasize) operand4 = V[n]; 


operand1 = V[m]; 
operand3 = V[d]; 
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3); 
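
In BSL the destination register supplies the select mask and is then overwritten by the result. A minimal Python sketch of that behavior follows, assuming registers are modelled as integers; the name bsl is illustrative only.

```python
def bsl(vd: int, vn: int, vm: int, datasize: int = 64) -> int:
    """BSL: operand1 = V[m], operand3 = V[d] (the original destination is the mask)."""
    mask = (1 << datasize) - 1
    operand1 = vm & mask
    operand4 = vn & mask
    operand3 = vd & mask
    return operand1 ^ ((operand1 ^ operand4) & operand3)


# The mask 0x00FF selects the low byte from vn and the high byte from vm.
print(hex(bsl(0x00FF, 0x1234, 0xABCD, datasize=16)))  # 0xab34
```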
C7.2.17 CLS (vector) 


Count Leading Sign bits (vector). This instruction counts the number of consecutive bits following the most 
significant bit that are the same as the most significant bit in each vector element in the source SIMD&FP register, 
places the result into a vector, and writes the vector to the destination SIMD&FP register. The count does not include 
the most significant bit itself. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13 12|11 10|9       5|4       0|
| 0| Q| 0| 0  1  1  1  0| size| 1  0  0  0  0| 0  0  1  0  0| 1  0|    Rn   |    Rd   |
        U


Vector variant 


CLS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


CountOp countop = if U == '1' then CountOp_CLZ else CountOp_CLS; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 


The encoding size = 11,Q = x is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 

integer count; 
for e = 0 to elements-1 
    if countop == CountOp_CLS then 
        count = CountLeadingSignBits(Elem[operand, e, esize]); 
    else 
        count = CountLeadingZeroBits(Elem[operand, e, esize]); 
    Elem[result, e, esize] = count<esize-1:0>; 
V[d] = result; 
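
To make the "does not include the most significant bit itself" rule concrete, here is a small Python sketch of the CLS element loop. It assumes registers are modelled as integers with element 0 in the least significant bits; the helper names are hypothetical stand-ins for the pseudocode functions.

```python
def count_leading_sign_bits(x: int, esize: int) -> int:
    """Count bits below the MSB that match the MSB, stopping at the first mismatch."""
    x &= (1 << esize) - 1
    sign = (x >> (esize - 1)) & 1
    count = 0
    for i in range(esize - 2, -1, -1):
        if (x >> i) & 1 != sign:
            break
        count += 1
    return count


def cls_vector(operand: int, esize: int, datasize: int) -> int:
    """Apply count_leading_sign_bits to every esize-bit element of operand."""
    ones = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        elem = (operand >> (e * esize)) & ones
        result |= count_leading_sign_bits(elem, esize) << (e * esize)
    return result


print(hex(cls_vector(0x00FF01C0, esize=8, datasize=64)))  # 0x707070707070601
```

An all-zeros or all-ones byte gives 7, the largest possible count for an 8-bit element.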
C7.2.18 CLZ (vector) 


Count Leading Zero bits (vector). This instruction counts the number of consecutive zeros, starting from the most 
significant bit, in each vector element in the source SIMD&FP register, places the result into a vector, and writes 
the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13 12|11 10|9       5|4       0|
| 0| Q| 1| 0  1  1  1  0| size| 1  0  0  0  0| 0  0  1  0  0| 1  0|    Rn   |    Rd   |
        U


Vector variant 


CLZ <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


CountOp countop = if U == '1' then CountOp_CLZ else CountOp_CLS; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 


The encoding size = 11,Q = x is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 

integer count; 
for e = 0 to elements-1 
    if countop == CountOp_CLS then 
        count = CountLeadingSignBits(Elem[operand, e, esize]); 
    else 
        count = CountLeadingZeroBits(Elem[operand, e, esize]); 
    Elem[result, e, esize] = count<esize-1:0>; 
V[d] = result; 
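
The CLZ path of the same loop can be sketched in Python as follows. Registers are again modelled as integers with element 0 in the least significant bits, and the function names are illustrative assumptions rather than architectural definitions.

```python
def count_leading_zero_bits(x: int, esize: int) -> int:
    """Number of zero bits above the highest set bit of an esize-bit value."""
    x &= (1 << esize) - 1
    return esize - x.bit_length()


def clz_vector(operand: int, esize: int, datasize: int) -> int:
    """Apply count_leading_zero_bits to every esize-bit element of operand."""
    ones = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        elem = (operand >> (e * esize)) & ones
        result |= count_leading_zero_bits(elem, esize) << (e * esize)
    return result


print(hex(clz_vector(0x00010100, esize=16, datasize=64)))  # 0x100010000f0007
```

Unlike CLS, a zero element yields the full element width (16 here), which still fits in the element.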
C7.2.19 CMEQ (register) 


Compare bitwise Equal (vector). This instruction compares each vector element from the first source SIMD&FP 
register with the corresponding vector element from the second source SIMD&FP register, and if the comparison is 
equal sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets 
every bit of the corresponding vector element in the destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12 11|10|9       5|4       0|
| 0| 1| 1| 1  1  1  1  0| size| 1|    Rm    | 1  0  0  0  1| 1|    Rn   |    Rd   |
        U


Scalar variant 


CMEQ <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 
boolean and_test = (U == '0'); 


Vector 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12 11|10|9       5|4       0|
| 0| Q| 1| 0  1  1  1  0| size| 1|    Rm    | 1  0  0  0  1| 1|    Rn   |    Rd   |
        U


Vector variant 


CMEQ <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean and_test = (U == '0'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(esize) element1; 
bits(esize) element2; 
boolean test_passed; 

for e = 0 to elements-1 
    element1 = Elem[operand1, e, esize]; 
    element2 = Elem[operand2, e, esize]; 
    if and_test then 
        test_passed = !IsZero(element1 AND element2); 
    else 
        test_passed = (element1 == element2); 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
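
The element loop above can be modelled in a few lines of Python. This is only a sketch: registers are treated as integers with element 0 in the least significant bits, and the function name cmeq and its and_test parameter mirror the pseudocode variable rather than any real API. Because the CMEQ encodings shown here have U = 1, and_test is false for CMEQ itself; the parameter is kept only so the sketch follows the pseudocode.

```python
def cmeq(vn: int, vm: int, esize: int, datasize: int, and_test: bool = False) -> int:
    """Per-element equality test producing all-ones or all-zeros elements."""
    ones = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        element1 = (vn >> (e * esize)) & ones
        element2 = (vm >> (e * esize)) & ones
        if and_test:
            test_passed = (element1 & element2) != 0
        else:
            test_passed = element1 == element2
        if test_passed:
            result |= ones << (e * esize)  # Ones() for this element
    return result


print(hex(cmeq(0x11223344, 0x11FF3300, esize=8, datasize=64)))  # 0xffffffffff00ff00
```

The four upper bytes compare equal because both inputs are zero there.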
C7.2.20 CMEQ (zero) 


Compare bitwise Equal to zero (vector). This instruction reads each vector element in the source SIMD&FP register 
and if the value is equal to zero sets every bit of the corresponding vector element in the destination SIMD&FP 
register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register 
to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13|12|11 10|9       5|4       0|
| 0| 1| 0| 1  1  1  1  0| size| 1  0  0  0  0| 0  1  0  0| 1| 1  0|    Rn   |    Rd   |
        U                                                 op


Scalar variant 


CMEQ <V><d>, <V><n>, #0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

CompareOp comparison; 
case op:U of 
    when '00' comparison = CompareOp_GT; 
    when '01' comparison = CompareOp_GE; 
    when '10' comparison = CompareOp_EQ; 
    when '11' comparison = CompareOp_LE; 


Vector 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13|12|11 10|9       5|4       0|
| 0| Q| 0| 0  1  1  1  0| size| 1  0  0  0  0| 0  1  0  0| 1| 1  0|    Rn   |    Rd   |
        U                                                 op


Vector variant 


CMEQ <Vd>.<T>, <Vn>.<T>, #0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

CompareOp comparison; 
case op:U of 
    when '00' comparison = CompareOp_GT; 
    when '01' comparison = CompareOp_GE; 
    when '10' comparison = CompareOp_EQ; 
    when '11' comparison = CompareOp_LE; 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
integer element; 
boolean test_passed; 

for e = 0 to elements-1 
    element = SInt(Elem[operand, e, esize]); 
    case comparison of 
        when CompareOp_GT test_passed = element > 0; 
        when CompareOp_GE test_passed = element >= 0; 
        when CompareOp_EQ test_passed = element == 0; 
        when CompareOp_LE test_passed = element <= 0; 
        when CompareOp_LT test_passed = element < 0; 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
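
The decode and operation for the compare-with-zero encodings can be sketched together in Python. The table below follows the op:U case statement in the decode pseudocode; the names COMPARE_OPS, sint and compare_zero are hypothetical, and registers are modelled as integers with element 0 in the least significant bits.

```python
# Maps (op, U) to the comparison against zero, following the decode case statement.
COMPARE_OPS = {
    (0, 0): lambda x: x > 0,   # CompareOp_GT
    (0, 1): lambda x: x >= 0,  # CompareOp_GE
    (1, 0): lambda x: x == 0,  # CompareOp_EQ
    (1, 1): lambda x: x <= 0,  # CompareOp_LE
}


def sint(x: int, esize: int) -> int:
    """Interpret an esize-bit pattern as a two's complement signed integer."""
    x &= (1 << esize) - 1
    return x - (1 << esize) if x >> (esize - 1) else x


def compare_zero(vn: int, op: int, u: int, esize: int, datasize: int) -> int:
    test = COMPARE_OPS[(op, u)]
    ones = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        element = sint((vn >> (e * esize)) & ones, esize)
        if test(element):
            result |= ones << (e * esize)
    return result


# CMEQ (zero) corresponds to op = 1, U = 0.
print(hex(compare_zero(0x00FF7F80, op=1, u=0, esize=8, datasize=64)))  # 0xffffffffff000000
```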
C7.2.21 CMGE (register) 


Compare signed Greater than or Equal (vector). This instruction compares each vector element in the first source 
SIMD&FP register with the corresponding vector element in the second source SIMD&FP register and if the first 
signed integer value is greater than or equal to the second signed integer value sets every bit of the corresponding 
vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector 
element in the destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| 1| 0| 1  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 1| 1|    Rn   |    Rd   |
        U                                                eq


Scalar variant 


CMGE <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size != '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean unsigned = (U == '1'); 
boolean cmp_eq = (eq == '1'); 


Vector 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| Q| 0| 0  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 1| 1|    Rn   |    Rd   |
        U                                                eq


Vector variant 


CMGE <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean cmp_eq = (eq == '1'); 


Assembler symbols 





<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer element1; 
integer element2; 
boolean test_passed; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2; 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
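
For CMGE (register), the decode gives unsigned = FALSE and cmp_eq = TRUE, so each element pair is compared as signed integers with a greater-than-or-equal test. The Python sketch below illustrates this under the assumption that registers are integers with element 0 in the least significant bits; the name cmge is illustrative.

```python
def cmge(vn: int, vm: int, esize: int, datasize: int) -> int:
    """Signed per-element vn >= vm, giving all-ones or all-zeros elements."""
    ones = (1 << esize) - 1

    def sint(x: int) -> int:
        # Two's complement interpretation, as Int(x, unsigned=FALSE) would give.
        return x - (1 << esize) if x >> (esize - 1) else x

    result = 0
    for e in range(datasize // esize):
        element1 = sint((vn >> (e * esize)) & ones)
        element2 = sint((vm >> (e * esize)) & ones)
        if element1 >= element2:
            result |= ones << (e * esize)
    return result


print(hex(cmge(0x807F0005, 0x7F800006, esize=8, datasize=64)))  # 0xffffffff00ffff00
```

With esize = datasize = 64 the same function models the scalar D variant, where the whole register is a single element.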
C7.2.22 CMGE (zero) 


Compare signed Greater than or Equal to zero (vector). This instruction reads each vector element in the source 
SIMD&FP register and if the signed integer value is greater than or equal to zero sets every bit of the corresponding 
vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector 
element in the destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13|12|11 10|9       5|4       0|
| 0| 1| 1| 1  1  1  1  0| size| 1  0  0  0  0| 0  1  0  0| 0| 1  0|    Rn   |    Rd   |
        U                                                 op


Scalar variant 


CMGE <V><d>, <V><n>, #0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

CompareOp comparison; 
case op:U of 
    when '00' comparison = CompareOp_GT; 
    when '01' comparison = CompareOp_GE; 
    when '10' comparison = CompareOp_EQ; 
    when '11' comparison = CompareOp_LE; 


Vector 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13|12|11 10|9       5|4       0|
| 0| Q| 1| 0  1  1  1  0| size| 1  0  0  0  0| 0  1  0  0| 0| 1  0|    Rn   |    Rd   |
        U                                                 op


Vector variant 


CMGE <Vd>.<T>, <Vn>.<T>, #0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

CompareOp comparison; 
case op:U of 
    when '00' comparison = CompareOp_GT; 
    when '01' comparison = CompareOp_GE; 
    when '10' comparison = CompareOp_EQ; 
    when '11' comparison = CompareOp_LE; 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
integer element; 
boolean test_passed; 

for e = 0 to elements-1 
    element = SInt(Elem[operand, e, esize]); 
    case comparison of 
        when CompareOp_GT test_passed = element > 0; 
        when CompareOp_GE test_passed = element >= 0; 
        when CompareOp_EQ test_passed = element == 0; 
        when CompareOp_LE test_passed = element <= 0; 
        when CompareOp_LT test_passed = element < 0; 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
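
Because the elements are interpreted as two's complement values, the test "signed value >= 0" is the same as "sign bit is clear". The following Python sketch uses that observation; it is an illustrative model only, with registers as integers and the hypothetical function name cmge_zero.

```python
def cmge_zero(vn: int, esize: int, datasize: int) -> int:
    """Set an element to all ones when its sign bit is 0, i.e. SInt(element) >= 0."""
    ones = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        element = (vn >> (e * esize)) & ones
        sign_bit = element >> (esize - 1)
        if sign_bit == 0:
            result |= ones << (e * esize)
    return result


print(hex(cmge_zero(0x8000000100007FFF, esize=16, datasize=64)))  # 0xffffffffffff
```

Only the top halfword (0x8000) is negative, so only that element of the result is zero.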
C7.2.23 CMGT (register) 


Compare signed Greater than (vector). This instruction compares each vector element in the first source SIMD&FP 
register with the corresponding vector element in the second source SIMD&FP register and if the first signed integer 
value is greater than the second signed integer value sets every bit of the corresponding vector element in the 
destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the 
destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| 1| 0| 1  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 0| 1|    Rn   |    Rd   |
        U                                                eq


Scalar variant 


CMGT <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size != '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean unsigned = (U == '1'); 
boolean cmp_eq = (eq == '1'); 


Vector 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| Q| 0| 0  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 0| 1|    Rn   |    Rd   |
        U                                                eq


Vector variant 


CMGT <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean cmp_eq = (eq == '1'); 


Assembler symbols 





<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer element1; 
integer element2; 
boolean test_passed; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2; 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
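
The all-ones or all-zeros elements produced by CMGT are suitable as a select mask. As one possible illustration, the Python sketch below models CMGT and then feeds its result to a model of BSL to pick the larger signed element from each pair. The function names, and the idea of pairing the two instructions this way, are assumptions for the example rather than something this section prescribes.

```python
def cmgt(vn: int, vm: int, esize: int, datasize: int) -> int:
    """Signed per-element vn > vm (unsigned = FALSE, cmp_eq = FALSE)."""
    ones = (1 << esize) - 1

    def sint(x: int) -> int:
        return x - (1 << esize) if x >> (esize - 1) else x

    result = 0
    for e in range(datasize // esize):
        a = sint((vn >> (e * esize)) & ones)
        b = sint((vm >> (e * esize)) & ones)
        if a > b:
            result |= ones << (e * esize)
    return result


def bsl(vd: int, vn: int, vm: int) -> int:
    """Bitwise select: take vn where the mask vd is 1, otherwise vm."""
    return vm ^ ((vm ^ vn) & vd)


def signed_max(va: int, vb: int, esize: int, datasize: int) -> int:
    mask = cmgt(va, vb, esize, datasize)
    return bsl(mask, va, vb)


print(hex(signed_max(0x05807F00, 0x067F80FF, esize=8, datasize=64)))  # 0x67f7f00
```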





C7.2.24 CMGT (zero) 


Compare signed Greater than zero (vector). This instruction reads each vector element in the source SIMD&FP 
register and if the signed integer value is greater than zero sets every bit of the corresponding vector element in the 
destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the 
destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13|12|11 10|9       5|4       0|
| 0| 1| 0| 1  1  1  1  0| size| 1  0  0  0  0| 0  1  0  0| 0| 1  0|    Rn   |    Rd   |
        U                                                 op


Scalar variant 


CMGT <V><d>, <V><n>, #0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

CompareOp comparison; 
case op:U of 
    when '00' comparison = CompareOp_GT; 
    when '01' comparison = CompareOp_GE; 
    when '10' comparison = CompareOp_EQ; 
    when '11' comparison = CompareOp_LE; 


Vector 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13|12|11 10|9       5|4       0|
| 0| Q| 0| 0  1  1  1  0| size| 1  0  0  0  0| 0  1  0  0| 0| 1  0|    Rn   |    Rd   |
        U                                                 op


Vector variant 


CMGT <Vd>.<T>, <Vn>.<T>, #0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

CompareOp comparison; 
case op:U of 
    when '00' comparison = CompareOp_GT; 
    when '01' comparison = CompareOp_GE; 
    when '10' comparison = CompareOp_EQ; 
    when '11' comparison = CompareOp_LE; 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
integer element; 
boolean test_passed; 

for e = 0 to elements-1 
    element = SInt(Elem[operand, e, esize]); 
    case comparison of 
        when CompareOp_GT test_passed = element > 0; 
        when CompareOp_GE test_passed = element >= 0; 
        when CompareOp_EQ test_passed = element == 0; 
        when CompareOp_LE test_passed = element <= 0; 
        when CompareOp_LT test_passed = element < 0; 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
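
One way to see what the CMGT (zero) mask is good for: ANDing the source with the mask keeps strictly positive elements and clears the rest. The Python sketch below shows this on integers standing in for registers. The function names are hypothetical, and the clamping use is an illustration chosen for this example, not a usage stated by this section.

```python
def cmgt_zero(vn: int, esize: int, datasize: int) -> int:
    """All-ones elements where the signed element is greater than zero."""
    ones = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        element = (vn >> (e * esize)) & ones
        negative = element >> (esize - 1)
        if element != 0 and not negative:
            result |= ones << (e * esize)
    return result


def clamp_negative_to_zero(vn: int, esize: int, datasize: int) -> int:
    """Keep positive elements, replace negative ones with zero."""
    return vn & cmgt_zero(vn, esize, datasize)


print(hex(clamp_negative_to_zero(0xFF05807F, esize=8, datasize=64)))  # 0x5007f
```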







C7.2.25 CMHI (register) 


Compare unsigned Higher (vector). This instruction compares each vector element in the first source SIMD&FP 
register with the corresponding vector element in the second source SIMD&FP register and if the first unsigned 
integer value is greater than the second unsigned integer value sets every bit of the corresponding vector element in 
the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the 
destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| 1| 1| 1  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 0| 1|    Rn   |    Rd   |
        U                                                eq


Scalar variant 


CMHI <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size != '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean unsigned = (U == '1'); 
boolean cmp_eq = (eq == '1'); 


Vector 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| Q| 1| 0  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 0| 1|    Rn   |    Rd   |
        U                                                eq


Vector variant 


CMHI <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean cmp_eq = (eq == '1'); 


Assembler symbols 





<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
D when size = 11 
The following encodings are reserved: 
• size = 0x. 
• size = 10. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer element1; 
integer element2; 
boolean test_passed; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2; 
    Elem[result, e, esize] = if test_passed then Ones() else Zeros(); 

V[d] = result; 
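
The only difference between the signed and unsigned greater-than comparisons in the shared pseudocode is the unsigned argument passed to Int(). The Python sketch below makes that visible with a single hypothetical function, compare_gt, whose unsigned parameter mirrors the pseudocode variable; registers are modelled as integers.

```python
def compare_gt(vn: int, vm: int, esize: int, datasize: int, unsigned: bool) -> int:
    """Per-element vn > vm, interpreting elements as unsigned or signed."""
    ones = (1 << esize) - 1

    def to_int(x: int) -> int:
        if unsigned or not (x >> (esize - 1)):
            return x
        return x - (1 << esize)

    result = 0
    for e in range(datasize // esize):
        a = to_int((vn >> (e * esize)) & ones)
        b = to_int((vm >> (e * esize)) & ones)
        if a > b:
            result |= ones << (e * esize)
    return result


# 0x80 is 128 when unsigned but -128 when signed, so the two results differ.
print(hex(compare_gt(0x80, 0x7F, 8, 64, unsigned=True)))   # 0xff (CMHI behavior)
print(hex(compare_gt(0x80, 0x7F, 8, 64, unsigned=False)))  # 0x0  (signed behavior)
```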
C7.2.26 CMHS (register) 


Compare unsigned Higher or Same (vector). This instruction compares each vector element in the first source 
SIMD&FP register with the corresponding vector element in the second source SIMD&FP register and if the first 
unsigned integer value is greater than or equal to the second unsigned integer value sets every bit of the 
corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the 
corresponding vector element in the destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| 1| 1| 1  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 1| 1|    Rn   |    Rd   |
        U                                                eq


Scalar variant 


CMHS <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size != '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean unsigned = (U == '1'); 
boolean cmp_eq = (eq == '1'); 


Vector 


|31|30|29|28 27 26 25 24|23 22|21|20      16|15 14 13 12|11|10|9       5|4       0|
| 0| Q| 1| 0  1  1  1  0| size| 1|    Rm    | 0  0  1  1| 1| 1|    Rn   |    Rd   |
        U                                                eq


Vector variant 


CMHS <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean cmp_eq = (eq == '1'); 


Assembler symbols


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11

The following encodings are reserved:
• size = 0x.
• size = 10.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
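As an informal illustration of the Operation pseudocode above, the following Python sketch models the per-element unsigned comparison. It assumes each register is represented as a list of element values. The function name `cmhs` and that list representation are illustrative choices, not part of the architecture.

```python
def cmhs(vn, vm, esize):
    """Model CMHS on lists of elements, each esize bits wide (a sketch)."""
    ones = (1 << esize) - 1  # Ones(): every bit of the element set
    result = []
    for a, b in zip(vn, vm):
        # Int(..., unsigned = TRUE): treat both elements as unsigned integers
        element1 = a & ones
        element2 = b & ones
        # cmp_eq is TRUE for CMHS, so the test is "higher or same"
        result.append(ones if element1 >= element2 else 0)
    return result
```

In this model an 8-bit element holding 0xFF is compared as 255, not as -1, which is the difference between the unsigned and signed compare instructions.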
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CMLE (zero) 


Compare signed Less than or Equal to zero (vector). This instruction reads each vector element in the source 
SIMD&FP register and if the signed integer value is less than or equal to zero sets every bit of the corresponding 
vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector 
element in the destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29  | 28-24 | 23-22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | 1  | U=1 | 11110 | size  | 10000 | 0100  | op=1 | 10    | Rn  | Rd  |


Scalar variant 


CMLE <V><d>, <V><n>, #0 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size != '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;

CompareOp comparison;
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;


Vector 


| 31 | 30 | 29  | 28-24 | 23-22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | size  | 10000 | 0100  | op=1 | 10    | Rn  | Rd  |


Vector variant 


CMLE <Vd>.<T>, <Vn>.<T>, #0 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '110' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

CompareOp comparison;
case op:U of
    when '00' comparison = CompareOp_GT;







    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;


Assembler symbols


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11

The following encodings are reserved:
• size = 0x.
• size = 10.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
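The compare-against-zero operation above can be sketched in Python as follows. This is a minimal model under stated assumptions: registers are lists of element values, the comparison is passed as a short string such as "LE", and the helper names `sint` and `compare_zero` are hypothetical.

```python
import operator

# Maps the CompareOp values used in the pseudocode to Python comparisons.
COMPARE_OPS = {
    "GT": operator.gt,
    "GE": operator.ge,
    "EQ": operator.eq,
    "LE": operator.le,
    "LT": operator.lt,
}

def sint(value, esize):
    """SInt(): interpret the low esize bits of value as two's complement."""
    value &= (1 << esize) - 1
    return value - (1 << esize) if value >> (esize - 1) else value

def compare_zero(lanes, esize, comparison):
    """Return all-ones for each lane whose signed value passes the test."""
    ones = (1 << esize) - 1
    test = COMPARE_OPS[comparison]
    return [ones if test(sint(x, esize), 0) else 0 for x in lanes]
```

With 8-bit lanes, the value 0x80 is read as -128, so it passes both the "LE" test used by CMLE (zero) and the "LT" test used by CMLT (zero), while 0x00 passes only "LE".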
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CMLT (zero) 


Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register
and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination
SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination
SIMD&FP register to zero.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | 1  | 0  | 11110 | size  | 10000 | 01010 | 10    | Rn  | Rd  |


Scalar variant 


CMLT <V><d>, <V><n>, #0 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size != '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;

CompareOp comparison = CompareOp_LT;

Vector 


| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | size  | 10000 | 01010 | 10    | Rn  | Rd  |


Vector variant 


CMLT <Vd>.<T>, <Vn>.<T>, #0 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '110' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

CompareOp comparison = CompareOp_LT;


Assembler symbols


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11

The following encodings are reserved:
• size = 0x.
• size = 10.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
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CMTST 


Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP 
register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if 
the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, 
otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | U=0 | 11110 | size  | 1  | Rm    | 10001 | 1  | Rn  | Rd  |


Scalar variant 


CMTST <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 

integer elements = 1; 


boolean and_test = (U == '0');


Vector 


| 31 | 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | size  | 1  | Rm    | 10001 | 1  | Rn  | Rd  |


Vector variant 


CMTST <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean and_test = (U == '0');


Assembler symbols


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11

The following encodings are reserved:
• size = 0x.
• size = 10.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
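The bit-test path of the operation above (the case where and_test is TRUE) can be sketched in Python. This is only an illustrative model: registers are assumed to be lists of element values, and the name `cmtst` is chosen here for convenience.

```python
def cmtst(vn, vm, esize):
    """Per-lane test: all-ones where (a AND b) is nonzero, else zero."""
    ones = (1 << esize) - 1
    result = []
    for a, b in zip(vn, vm):
        # !IsZero(element1 AND element2) from the pseudocode
        test_passed = (a & b & ones) != 0
        result.append(ones if test_passed else 0)
    return result
```

A lane passes as soon as the two elements share at least one set bit, so the result acts as a per-lane mask for "any of these flag bits present".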
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C7.2.30 CNT 


Population Count per byte. This instruction counts the number of bits that have a value of one in each vector element 
in the source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP 
register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | size  | 10000 | 00101 | 10    | Rn  | Rd  |


Vector variant 


CNT <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size != '00' then ReservedValue();

integer esize = 8; 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV 8; 


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1

The following encodings are reserved:
• size = 01, Q = x.
• size = 1x, Q = x.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;
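A minimal Python sketch of the per-byte population count described above. It assumes the source register is given as a list of byte values (8 entries for the 8B arrangement, 16 for 16B); the function name `cnt` is illustrative.

```python
def cnt(byte_lanes):
    """Count the set bits in each byte lane, one result per lane."""
    result = []
    for lane in byte_lanes:
        # BitCount(): number of bits with value one in the 8-bit element
        count = bin(lane & 0xFF).count("1")
        result.append(count)
    return result
```

Because esize is fixed at 8, each count lies in the range 0 to 8 and always fits in its destination byte.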
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C7.2.31 


DUP (element) 


Duplicate vector element to vector or scalar. This instruction duplicates the vector element at the specified element 
index in the source SIMD&FP register into a scalar or each element in a vector, and writes the result to the 
destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias MOV (scalar). The alias is always the preferred disassembly. 


Scalar 


| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | 0  | 11110000 | imm5  | 0  | 0000  | 1  | Rn  | Rd  |


Scalar variant 


DUP <V><d>, <Vn>.<T>[<index>] 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer size = LowestSetBit(imm5);
if size > 3 then UnallocatedEncoding();

integer index = UInt(imm5<4:size+1>);
integer idxdsize = if imm5<4> == '1' then 128 else 64;

integer esize = 8 << size;
integer datasize = esize;
integer elements = 1;


Vector 


| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110000 | imm5  | 0  | 0000  | 1  | Rn  | Rd  |


Vector variant 


DUP <Vd>.<T>, <Vn>.<Ts>[<index>] 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer size = LowestSetBit(imm5);
if size > 3 then UnallocatedEncoding();

integer index = UInt(imm5<4:size+1>);
integer idxdsize = if imm5<4> == '1' then 128 else 64;

if size == 3 && Q == '0' then ReservedValue();
integer esize = 8 << size;
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
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Assembler symbols 


<T> For the scalar variant: is the element width specifier, encoded in the "imm5" field. It can have the
following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000

The encoding imm5 = x0000 is reserved.

For the vector variant: is an arrangement specifier, encoded in the "imm5:Q" field. It can have the
following values:
8B when imm5 = xxxx1, Q = 0
16B when imm5 = xxxx1, Q = 1
4H when imm5 = xxx10, Q = 0
8H when imm5 = xxx10, Q = 1
2S when imm5 = xx100, Q = 0
4S when imm5 = xx100, Q = 1
2D when imm5 = x1000, Q = 1

The following encodings are reserved:
• imm5 = x0000, Q = x.
• imm5 = x1000, Q = 0.

<Ts> Is an element size specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000

The encoding imm5 = x0000 is reserved.

<V> Is the destination width specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000

The encoding imm5 = x0000 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<index> Is the element index encoded in the "imm5" field. It can have the following values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
imm5<4> when imm5 = x1000

The encoding imm5 = x0000 is reserved.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
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Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;
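The way the "imm5" field carries both the element size and the element index can be easier to follow in executable form. The Python sketch below models the vector variant's decode and operation. It assumes the source register is supplied as a list of elements already split at the decoded element size; the function names and the use of ValueError for unallocated and reserved encodings are illustrative choices.

```python
def lowest_set_bit(value, width=5):
    """LowestSetBit(): index of the lowest set bit, or width if value is 0."""
    for i in range(width):
        if (value >> i) & 1:
            return i
    return width

def decode_dup_element(imm5, q):
    """Return (esize, index, elements) for DUP (element), vector variant."""
    size = lowest_set_bit(imm5)
    if size > 3:
        raise ValueError("unallocated encoding: imm5 = x0000")
    if size == 3 and q == 0:
        raise ValueError("reserved: 64-bit elements need Q = 1")
    index = imm5 >> (size + 1)          # imm5<4:size+1>
    esize = 8 << size
    datasize = 128 if q == 1 else 64
    return esize, index, datasize // esize

def dup_element(vn_lanes, imm5, q):
    """Replicate the selected source element into every destination lane."""
    _, index, elements = decode_dup_element(imm5, q)
    return [vn_lanes[index]] * elements
```

For instance, imm5 = 00110 has its lowest set bit at position 1, which selects 16-bit elements, and the remaining upper bits 001 give index 1.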
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DUP (general) 


Duplicate general-purpose register to vector. This instruction duplicates the contents of the source general-purpose 
register into a scalar or each element in a vector, and writes the result to the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110000 | imm5  | 0  | 0001  | 1  | Rn  | Rd  |


Advanced SIMD variant 


DUP <Vd>.<T>, <R><n> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer size = LowestSetBit(imm5);
if size > 3 then UnallocatedEncoding();

// imm5<4:size+1> is IGNORED

if size == 3 && Q == '0' then ReservedValue();
integer esize = 8 << size;
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "imm5:Q" field. It can have the following values:
8B when imm5 = xxxx1, Q = 0
16B when imm5 = xxxx1, Q = 1
4H when imm5 = xxx10, Q = 0
8H when imm5 = xxx10, Q = 1
2S when imm5 = xx100, Q = 0
4S when imm5 = xx100, Q = 1
2D when imm5 = x1000, Q = 1

The following encodings are reserved:
• imm5 = x0000, Q = x.
• imm5 = x1000, Q = 0.

<R> Is the width specifier for the general-purpose source register, encoded in the "imm5" field. It can
have the following values:
W when imm5 = xxxx1
W when imm5 = xxx10
W when imm5 = xx100
X when imm5 = x1000

The encoding imm5 = x0000 is reserved.

Unspecified bits in "imm5" are ignored but should be set to zero by an assembler.

<n> Is the number [0-30] of the general-purpose source register or ZR (31), encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;
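As a rough sketch of the operation above, the Python function below truncates a general-purpose register value to the decoded element size and replicates it. The function name `dup_general`, the integer representation of X[n], and the use of ValueError for unallocated and reserved encodings are assumptions made for illustration.

```python
def dup_general(xn, imm5, q):
    """Replicate the low esize bits of xn across a 64-bit or 128-bit vector."""
    # LowestSetBit(imm5); yields 5 when imm5 is zero
    size = next((i for i in range(5) if (imm5 >> i) & 1), 5)
    if size > 3:
        raise ValueError("unallocated encoding: imm5 = x0000")
    if size == 3 and q == 0:
        raise ValueError("reserved: 64-bit elements need Q = 1")
    esize = 8 << size
    element = xn & ((1 << esize) - 1)   # bits(esize) element = X[n]
    elements = (128 if q == 1 else 64) // esize
    return [element] * elements
```

Only the low esize bits of the source register contribute, so a W register value wider than the element is silently truncated in this model.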
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C7.2.33 EOR (vector) 


Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source 
SIMD&FP registers, and places the result in the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 





| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | opc2=00 | 1  | Rm    | 00011 | 1  | Rn  | Rd  |
Three registers of the same type variant 
EOR <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
8B when Q = 0 
16B when Q = 1 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1; 
bits(datasize) operand2; 
bits(datasize) operand3; 
bits(datasize) operand4 = V[n]; 
operand1 = V[m]; 
operand2 = Zeros(); 
operand3 = Ones(); 
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3); 
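The final expression above uses a general form, and for EOR it reduces to a plain exclusive OR because operand2 is all zeros and operand3 is all ones. The Python sketch below evaluates the expression literally so the reduction can be checked. It assumes each register is modelled as a single integer; the name `eor_vector` is illustrative.

```python
def eor_vector(vn, vm, datasize):
    """Evaluate operand1 EOR ((operand2 EOR operand4) AND operand3)."""
    ones = (1 << datasize) - 1
    operand1 = vm & ones
    operand2 = 0          # Zeros()
    operand3 = ones       # Ones()
    operand4 = vn & ones
    return operand1 ^ ((operand2 ^ operand4) & operand3)
```

Since x EOR 0 is x and x AND Ones() is x, the result is the same as `vn ^ vm` restricted to datasize bits.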
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C7.2.34 


EXT 


Extract vector from pair of vectors. This instruction extracts the lowest vector elements from the second source 
SIMD&FP register and the highest vector elements from the first source SIMD&FP register, concatenates the 
results into a vector, and writes the vector to the destination SIMD&FP register vector. The index value specifies 
the lowest vector element to extract from the first source register, and consecutive elements are extracted from the 
first, then second, source registers until the destination vector is filled. 


The following figure shows an example of the EXT doubleword operation for Q = 0 and imm4<2:0> = 3.

[Figure: EXT doubleword operation, showing byte lanes 7 to 0 of Vm and Vn and the bytes selected into Vd.]


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 00    | 0  | Rm    | 0  | imm4  | 0  | Rn  | Rd  |


Advanced SIMD variant 


EXT <Vd>.<T>, <Vn>.<T>, <Vm>.<T>, #<index> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if Q == '0' && imm4<3> == '1' then UnallocatedEncoding();

integer datasize = if Q == '1' then 128 else 64;
integer position = UInt(imm4) << 3;


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
8B when Q = 0
16B when Q = 1
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
<index> Is the lowest numbered byte element to be extracted, encoded in the "Q:imm4" field. It can have the
following values:
imm4<2:0> when Q = 0, imm4<3> = 0
imm4 when Q = 1, imm4<3> = x

The encoding Q = 0, imm4<3> = 1 is reserved.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;
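The concatenate-and-slice step above can be sketched in Python. The model assumes each register is a single integer with byte 0 in the least significant bits; the function name `ext` and the use of ValueError for the reserved encoding are illustrative.

```python
def ext(vn, vm, index, q):
    """Extract datasize bits from vm:vn starting at byte number index."""
    datasize = 128 if q == 1 else 64
    if q == 0 and index >= 8:
        raise ValueError("reserved: Q = 0 requires imm4<3> = 0")
    mask = (1 << datasize) - 1
    position = index << 3                        # UInt(imm4) << 3
    concat = ((vm & mask) << datasize) | (vn & mask)   # hi:lo
    return (concat >> position) & mask
```

With Q = 0 and index 3, the low five result bytes come from bytes 3 to 7 of the first source register and the top three bytes come from bytes 0 to 2 of the second source register.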
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C7.2.35 FABD 


Floating-point Absolute Difference (vector). This instruction subtracts the floating-point values in the elements of 
the second source SIMD&FP register, from the corresponding floating-point values in the elements of the first 
source SIMD&FP register, places the absolute value of each result in a vector, and writes the vector to the 
destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 
| 31 | 30 | 29 | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | 1  | 11110 | 1  | sz | 1  | Rm    | 11010 | 1  | Rn  | Rd  |


Scalar variant 


FABD <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 

boolean abs = TRUE; 


Vector 
| 31 | 30 | 29  | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | 1  | sz | 1  | Rm    | 11010 | 1  | Rn  | Rd  |


Vector variant 


FABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean abs = (U == '1'); 


Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;
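A simplified Python sketch of the per-element absolute difference above. It assumes double-precision elements held as Python floats, so Python's own subtraction stands in for FPSub. FPCR rounding modes, flush-to-zero behavior, and floating-point exception flags are not modelled, and the name `fabd` is illustrative.

```python
def fabd(vn, vm):
    """Per-lane |a - b| for two equal-length lists of floats."""
    result = []
    for element1, element2 in zip(vn, vm):
        diff = element1 - element2      # stands in for FPSub(..., FPCR)
        result.append(abs(diff))        # FPAbs(diff), since abs is TRUE
    return result
```

Note that the absolute value is taken after the subtraction, so the result is never negative even when the first operand is the smaller one.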
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C7.2.36 FABS (vector) 


Floating-point Absolute value (vector). This instruction calculates the absolute value of each vector element in the 
source SIMD&FP register, writes the result to a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29  | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | 1  | sz | 10000 | 01111 | 10    | Rn  | Rd  |


Vector single-precision and double-precision variant 


FABS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean neg = (U == '1'); 


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;
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C7.2.37 FABS (scalar) 


Floating-point Absolute value (scalar). This instruction calculates the absolute value in the SIMD&FP source 
register and writes the result to the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22     | 21 | 20-17 | 16-15  | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type (0x) | 1  | 0000  | opc=01 | 10000 | Rn  | Rd  |


Single-precision variant 
Applies when type == 00. 


FABS <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FABS <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(datasize) result;
bits(datasize) operand = V[n];

result = FPAbs(operand);
V[d] = result;
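The following Python sketch illustrates the operation above at the bit level. It assumes that FPAbs amounts to clearing the sign bit of the operand, which also turns negative zero into positive zero and leaves a NaN payload unchanged. The helper names are hypothetical, and the conversions use the standard `struct` module.

```python
import struct

def fp_abs(bits, datasize):
    """Clear the sign bit (the top bit) of a datasize-bit value."""
    return bits & ((1 << (datasize - 1)) - 1)

def f32_to_bits(x):
    """Single-precision bit pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(b):
    """Python float for a single-precision bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]
```

For example, the single-precision pattern 0x80000000 (negative zero) becomes 0x00000000, and the pattern for -1.5 becomes the pattern for 1.5.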
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C7.2.38 FACGE 


Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of 
each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding 
floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the 
second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, 
otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | 1  | U=1 | 11110 | E=0 | sz | 1  | Rm    | 1110  | ac=1 | 1  | Rn  | Rd  |


Scalar variant 


FACGE <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;
CompareOp cmp;
boolean abs;

case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();


Vector 


| 31 | 30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | E=0 | sz | 1  | Rm    | 1110  | ac=1 | 1  | Rn  | Rd  |


Vector variant 


FACGE <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if sz:Q == '10' then ReservedValue();
integer esize = 32 << UInt(sz);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
CompareOp cmp;
boolean abs;

case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();


Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.39 FACGT


Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector 
element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the 
second source SIMD&FP register and if the first value is greater than the second value sets every bit of the 
corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the 
corresponding vector element in the destination SIMD&FP register to zero. 
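The comparison described above can be sketched in Python. This is a minimal illustration, not the architectural pseudocode: the function name `facgt` and the use of Python lists of floats in place of SIMD&FP registers are assumptions made for the example, and floating-point exception signaling is not modeled.

```python
def facgt(first, second, esize=32):
    """Return one mask per element: all ones if |a| > |b|, otherwise zero.

    `first` and `second` are hypothetical stand-ins for the two source
    registers, given as equal-length lists of Python floats.
    """
    ones = (1 << esize) - 1
    result = []
    for a, b in zip(first, second):
        # In Python, any ordered comparison involving a NaN is False,
        # so an unordered pair yields an all-zeros element here.
        passed = abs(a) > abs(b)
        result.append(ones if passed else 0)
    return result
```

For example, `facgt([-3.0, 1.0], [2.0, -1.0])` gives an all-ones mask for the first element, because 3.0 is greater than 2.0 in magnitude, and zero for the second, because the magnitudes are equal.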


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31-30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 01    | U=1 | 11110 | E=1 | sz | 1  | Rm    | 1110  | ac=1 | 1  | Rn  | Rd  |


Scalar variant 


FACGT <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 

CompareOp cmp; 

boolean abs; 


case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();


Vector 


| 31 | 30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | E=1 | sz | 1  | Rm    | 1110  | ac=1 | 1  | Rn  | Rd  |


Vector variant 


FACGT <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64;


integer elements = datasize DIV esize; 
CompareOp cmp;
boolean abs;

case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.40 FADD (vector)
Floating-point Add (vector). This instruction adds corresponding vector elements in the two source SIMD&FP 
registers, writes the result into a vector, and writes the vector to the destination SIMD&FP register. All the values 
in this instruction are floating-point values. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29  | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | 0  | sz | 1  | Rm    | 11010 | 1  | Rn  | Rd  |
Vector single-precision and double-precision variant 
FADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean pair = (U == '1'); 
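As an illustration of the size calculations in this decode, the following Python sketch derives the element size, register width, and element count from the sz and Q fields. The function name `decode_arrangement` and the use of a Python exception for the reserved encoding are assumptions for the example only.

```python
def decode_arrangement(sz, q):
    """Return (esize, datasize, elements) for single-bit fields sz and Q.

    Mirrors the decode pseudocode: esize = 32 << sz, datasize is 128 when
    Q is 1 and 64 otherwise, and sz:Q == '10' is the reserved encoding.
    """
    if (sz, q) == (1, 0):
        # Hypothetical stand-in for ReservedValue().
        raise ValueError("sz:Q == '10' is reserved")
    esize = 32 << sz
    datasize = 128 if q == 1 else 64
    return esize, datasize, datasize // esize
```

With these rules, the 2S, 4S, and 2D arrangements correspond to 2, 4, and 2 elements respectively. The reserved combination would describe a single 64-bit element in a 64-bit vector.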
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;
for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;
C7.2.41 FADD (scalar)
Floating-point Add (scalar). This instruction adds the floating-point values of the two source SIMD&FP registers, 
and writes the result to the destination SIMD&FP register. 
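A rough way to experiment with the single-precision variant on a host machine is sketched below in Python. The helper names `to_f32` and `fadd_s` are hypothetical. The sketch assumes round-to-nearest and ignores FPCR settings, exception flags, and NaN processing, so it only approximates the architectural FPAdd().

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    # struct.pack raises OverflowError for magnitudes outside the float32
    # range, so overflow to infinity is not modeled in this sketch.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fadd_s(a, b):
    """Add two values as single-precision numbers, rounding the sum once."""
    return to_f32(to_f32(a) + to_f32(b))
```

For instance, `fadd_s(0.1, 0.2)` equals `to_f32(0.3)`, whereas the same sum evaluated in double precision does not equal the double-precision value of 0.3.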
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-16 | 15-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type=0x | 1  | Rm    | 001   | op=0 | 10    | Rn  | Rd  |
Single-precision variant 
Applies when type == 00. 
FADD <Sd>, <Sn>, <Sm> 
Double-precision variant 
Applies when type == 01.
FADD <Dd>, <Dn>, <Dm> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize; 
case type of 
when '00' datasize = 32;
when '01' datasize = 64;
when '1x' UnallocatedEncoding(); 
Assembler symbols 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
result = FPAdd(operand1, operand2, FPCR);

V[d] = result;
C7.2.42 FADDP (scalar)


Floating-point Add Pair of elements (scalar). This instruction adds two floating-point vector elements in the source 
SIMD&FP register and writes the scalar result into the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31-30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 01    | 1  | 11110 | 0  | sz | 11000 | 01101 | 10    | Rn  | Rd  |


Single-precision and double-precision variant 


FADDP <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize * 2;
integer elements = 2; 


ReduceOp op = ReduceOp_FADD; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is the source arrangement specifier, encoded in the "sz" field. It can have the following values:
2S when sz = 0
2D when sz = 1
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n]; 
V[d] = Reduce(op, operand, esize); 
C7.2.43 FADDP (vector)


Floating-point Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the 
first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair 
of adjacent vector elements from the concatenated vector, adds each pair of values together, places the result into a 
vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are 
floating-point values. 
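The pairwise operation described above can be sketched in Python as follows. The function name `faddp` and the representation of each register as a list with element 0 first are assumptions for illustration. Python floats are double precision, so single-precision rounding is not modeled.

```python
def faddp(vn, vm):
    """Pairwise add over the concatenation of two equal-length vectors.

    `vn` and `vm` are hypothetical stand-ins for the first and second
    source registers, each a list of floats with element 0 first.
    """
    # The elements of the first source occupy the low half of the
    # concatenated vector and those of the second source the high half.
    concat = list(vn) + list(vm)
    return [concat[2 * e] + concat[2 * e + 1] for e in range(len(vn))]
```

The low half of the result therefore holds the pair sums from the first source register, and the high half holds the pair sums from the second.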


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29  | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | 0  | sz | 1  | Rm    | 11010 | 1  | Rn  | Rd  |


Vector single-precision and double-precision variant 


FADDP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;
for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;
C7.2.44 FCCMP


Floating-point Conditional quiet Compare (scalar). This instruction compares the two SIMD&FP source register
values and writes the result to the PSTATE.{N, Z, C, V} flags. If the condition does not pass then the
PSTATE.{N, Z, C, V} flags are set to the flag bit specifier.


It raises an Invalid Operation exception only if either operand is a signaling NaN. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-16 | 15-12 | 11-10 | 9-5 | 4    | 3-0  |
| 0  | 0  | 0  | 11110 | type=0x | 1  | Rm    | cond  | 01    | Rn  | op=0 | nzcv |


Single-precision variant 
Applies when type == 00. 


FCCMP <Sn>, <Sm>, #<nzcv>, <cond> 
Double-precision variant 
Applies when type == 01.

FCCMP <Dn>, <Dm>, #<nzcv>, <cond> 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
when '00' datasize = 32;
when '01' datasize = 64;
when '1x' UnallocatedEncoding(); 


bits(4) flags = nzcv; 


Assembler symbols 


<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 

<nzcv> Is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit
NZCV condition flags, encoded in the "nzcv" field.





<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
NaNs 
The IEEE 754 standard specifies that the result of a comparison is precisely one of <, ==, > or unordered. If either
or both of the operands are NaNs, they are unordered, and all three of (Operand1 < Operand2), (Operand1 ==
Operand2) and (Operand1 > Operand2) are false. This case results in the FPSCR flags being set to N=0, Z=0, C=1,
and V=1.
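The flag behavior can be illustrated with a small Python sketch. The function names `fp_compare_flags` and `fccmp` are hypothetical, and the condition check is reduced to a boolean argument. The unordered result is the one stated above. The flag values used for the three ordered outcomes are not given in this section and are an assumption based on the usual FPCompare() results. Exception signaling is not modeled.

```python
import math


def fp_compare_flags(a, b):
    """Return the (N, Z, C, V) flags for comparing a with b."""
    if math.isnan(a) or math.isnan(b):
        return (0, 0, 1, 1)  # unordered, as stated in the text
    if a == b:
        return (0, 1, 1, 0)  # assumed encoding for "equal"
    if a < b:
        return (1, 0, 0, 0)  # assumed encoding for "less than"
    return (0, 0, 1, 0)      # assumed encoding for "greater than"


def fccmp(cond_holds, a, b, nzcv):
    """Conditional compare: use the comparison result only if the condition passes.

    `nzcv` is the 4-bit flag bit specifier (0 to 15) used when it fails.
    """
    if cond_holds:
        return fp_compare_flags(a, b)
    return tuple((nzcv >> shift) & 1 for shift in (3, 2, 1, 0))
```

When the condition fails, the operands are not examined at all, so `fccmp(False, 1.0, 1.0, 0b0100)` returns only the Z flag set regardless of the register contents.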


Operation 
CheckFPAdvSIMDEnabled64();

bits(datasize) operand1 = V[n];
bits(datasize) operand2;

operand2 = V[m];
if ConditionHolds(cond) then
    flags = FPCompare(operand1, operand2, FALSE, FPCR);
PSTATE.<N,Z,C,V> = flags;
C7.2.45 FCCMPE


Floating-point Conditional signaling Compare (scalar). This instruction compares the two SIMD&FP source 
register values and writes the result to the PSTATE.{N, Z, C, V} flags. If the condition does not pass then the
PSTATE.{N, Z, C, V} flags are set to the flag bit specifier. 


If either operand is any type of NaN, or if either operand is a signaling NaN, the instruction raises an Invalid 
Operation exception. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-16 | 15-12 | 11-10 | 9-5 | 4    | 3-0  |
| 0  | 0  | 0  | 11110 | type=0x | 1  | Rm    | cond  | 01    | Rn  | op=1 | nzcv |
Single-precision variant 
Applies when type == 00. 
FCCMPE <Sn>, <Sm>, #<nzcv>, <cond> 
Double-precision variant 


Applies when type == 01.


FCCMPE <Dn>, <Dm>, #<nzcv>, <cond> 


Decode for all variants of this encoding 


integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
when '00' datasize = 32;
when '01' datasize = 64;
when '1x' UnallocatedEncoding(); 


bits(4) flags = nzcv; 


Assembler symbols 


<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 

<nzcv> Is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit
NZCV condition flags, encoded in the "nzcv" field.





<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
NaNs 
The IEEE 754 standard specifies that the result of a comparison is precisely one of <, ==, > or unordered. If either
or both of the operands are NaNs, they are unordered, and all three of (Operand1 < Operand2), (Operand1 ==
Operand2) and (Operand1 > Operand2) are false. This case results in the FPSCR flags being set to N=0, Z=0, C=1,
and V=1.

FCCMPE raises an Invalid Operation exception if either operand is any type of NaN, and is suitable for testing for <,
<=, >, >=, and other predicates that raise an exception when the operands are unordered.


Operation 
CheckFPAdvSIMDEnabled64();

bits(datasize) operand1 = V[n];
bits(datasize) operand2;

operand2 = V[m];
if ConditionHolds(cond) then
    flags = FPCompare(operand1, operand2, TRUE, FPCR);
PSTATE.<N,Z,C,V> = flags;
C7.2.46 FCMEQ (register)
Floating-point Compare Equal (vector). This instruction compares each floating-point value from the first source 
SIMD&FP register, with the corresponding floating-point value from the second source SIMD&FP register, and if 
the comparison is equal sets every bit of the corresponding vector element in the destination SIMD&FP register to 
one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31-30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 01    | U=0 | 11110 | E=0 | sz | 1  | Rm    | 1110  | ac=0 | 1  | Rn  | Rd  |
Scalar variant 
FCMEQ <V><d>, <V><n>, <V><m> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 
CompareOp cmp; 
boolean abs; 
case E:U:ac of 
when '000' cmp = CompareOp_EQ; abs = FALSE;
when '010' cmp = CompareOp_GE; abs = FALSE;
when '011' cmp = CompareOp_GE; abs = TRUE;
when '110' cmp = CompareOp_GT; abs = FALSE; 
when '111' cmp = CompareOp_GT; abs = TRUE; 
otherwise UnallocatedEncoding(); 
Vector 
| 31 | 30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | E=0 | sz | 1  | Rm    | 1110  | ac=0 | 1  | Rn  | Rd  |
Vector variant 
FCMEQ <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize;
CompareOp cmp; 
boolean abs; 


case E:U:ac of 

when '000' cmp = CompareOp_EQ; abs = FALSE;
when '010' cmp = CompareOp_GE; abs = FALSE;
when '011' cmp = CompareOp_GE; abs = TRUE;
when '110' cmp = CompareOp_GT; abs = FALSE; 
when '111' cmp = CompareOp_GT; abs = TRUE; 
otherwise UnallocatedEncoding(); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.47 FCMEQ (zero)


Floating-point Compare Equal to zero (vector). This instruction reads each floating-point value in the source 
SIMD&FP register and if the value is equal to zero sets every bit of the corresponding vector element in the 
destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the 
destination SIMD&FP register to zero. 
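The compare-with-zero behavior can be sketched in Python as shown below. The table keyed by the op and U fields follows the decode pseudocode for this instruction. The names `COMPARISONS` and `compare_with_zero`, and the use of a list of floats for the source register, are assumptions made for the example. Exception signaling is not modeled.

```python
import operator

# (op, U) -> comparison against zero, following the decode table:
# '00' is GT, '01' is GE, '10' is EQ, and '11' is LE.
COMPARISONS = {
    (0, 0): operator.gt,
    (0, 1): operator.ge,
    (1, 0): operator.eq,
    (1, 1): operator.le,
}


def compare_with_zero(op, u, elements, esize=32):
    """Return an all-ones or all-zeros mask for each element."""
    test = COMPARISONS[(op, u)]
    ones = (1 << esize) - 1
    # Comparisons with a NaN are False in Python, giving a zero mask.
    return [ones if test(x, 0.0) else 0 for x in elements]
```

With op = 1 and U = 0, which selects the equality comparison, both +0.0 and -0.0 produce an all-ones element, while a NaN produces zero.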


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31-30 | 29  | 28-24 | 23 | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 01    | U=0 | 11110 | 1  | sz | 10000 | 0110  | op=1 | 10    | Rn  | Rd  |


Scalar variant 


FCMEQ <V><d>, <V><n>, #0.0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


CompareOp comparison; 

case op:U of 
when '00' comparison = CompareOp_GT;
when '01' comparison = CompareOp_GE;
when '10' comparison = CompareOp_EQ; 
when '11' comparison = CompareOp_LE; 


Vector 


| 31 | 30 | 29  | 28-24 | 23 | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | 1  | sz | 10000 | 0110  | op=1 | 10    | Rn  | Rd  |


Vector variant 


FCMEQ <Vd>.<T>, <Vn>.<T>, #0.0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


CompareOp comparison; 
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.48 FCMGE (register)
Floating-point Compare Greater than or Equal (vector). This instruction reads each floating-point value in the first 
source SIMD&FP register and if the value is greater than or equal to the corresponding floating-point value in the 
second source SIMD&FP register sets every bit of the corresponding vector element in the destination SIMD&FP 
register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register 
to zero. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31-30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 01    | U=1 | 11110 | E=0 | sz | 1  | Rm    | 1110  | ac=0 | 1  | Rn  | Rd  |
Scalar variant 
FCMGE <V><d>, <V><n>, <V><m> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 
CompareOp cmp; 
boolean abs; 
case E:U:ac of 
when '000' cmp = CompareOp_EQ; abs = FALSE;
when '010' cmp = CompareOp_GE; abs = FALSE;
when '011' cmp = CompareOp_GE; abs = TRUE;
when '110' cmp = CompareOp_GT; abs = FALSE; 
when '111' cmp = CompareOp_GT; abs = TRUE; 
otherwise UnallocatedEncoding(); 
Vector 
| 31 | 30 | 29  | 28-24 | 23  | 22 | 21 | 20-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | E=0 | sz | 1  | Rm    | 1110  | ac=0 | 1  | Rn  | Rd  |
Vector variant 
FCMGE <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize; 
CompareOp cmp; 

boolean abs; 


case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.49 FCMGE (zero)


Floating-point Compare Greater than or Equal to zero (vector). This instruction reads each floating-point value in 
the source SIMD&FP register and if the value is greater than or equal to zero sets every bit of the corresponding 
vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector 
element in the destination SIMD&FP register to zero. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31-30 | 29  | 28-24 | 23 | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 01    | U=1 | 11110 | 1  | sz | 10000 | 0110  | op=0 | 10    | Rn  | Rd  |


Scalar variant 


FCMGE <V><d>, <V><n>, #0.0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


CompareOp comparison; 

case op:U of 
when '00' comparison = CompareOp_GT;
when '01' comparison = CompareOp_GE;
when '10' comparison = CompareOp_EQ; 
when '11' comparison = CompareOp_LE; 


Vector 


| 31 | 30 | 29  | 28-24 | 23 | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | 1  | sz | 10000 | 0110  | op=0 | 10    | Rn  | Rd  |


Vector variant 


FCMGE <Vd>.<T>, <Vn>.<T>, #0.0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


CompareOp comparison; 
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
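The per-element behavior of the compare-against-zero operation can be illustrated with a short Python sketch. This is not the architectural definition: the function name fcm_zero and the use of Python floats as lanes are assumptions made for illustration, and the sketch does not model floating-point exceptions, FPCR settings, or flush-to-zero. It only shows that each lane yields an all-ones or all-zeros mask, and that a NaN lane fails every comparison.

```python
def fcm_zero(lanes, comparison, esize=32):
    """Return one mask per lane: all ones if the test passes, else zero.

    `lanes` is a list of Python floats standing in for vector elements
    (hypothetical model, not the architectural register format).
    """
    ones = (1 << esize) - 1
    tests = {
        "GT": lambda x: x > 0.0,
        "GE": lambda x: x >= 0.0,
        "EQ": lambda x: x == 0.0,
        "LE": lambda x: x <= 0.0,  # the pseudocode writes this as GE(zero, element)
        "LT": lambda x: x < 0.0,   # the pseudocode writes this as GT(zero, element)
    }
    test = tests[comparison]
    # Ordered comparisons involving NaN are False in Python, so a NaN
    # lane always produces the all-zeros mask.
    return [ones if test(x) else 0 for x in lanes]


if __name__ == "__main__":
    print([hex(m) for m in fcm_zero([1.0, -1.0, 0.0, float("nan")], "GE")])
```

Negative zero compares equal to zero, so fcm_zero([-0.0], "GE") gives the all-ones mask.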
C7.2.50 FCMGT (register)
Floating-point Compare Greater than (vector). This instruction reads each floating-point value in the first source 
SIMD&FP register and if the value is greater than the corresponding floating-point value in the second source 
SIMD&FP register sets every bit of the corresponding vector element in the destination SIMD&FP register to one, 
otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 | 30 | 29  | 28:24 | 23  | 22 | 21 | 20:16 | 15:12 | 11   | 10 | 9:5 | 4:0 |
| 0  | 1  | U=1 | 11110 | E=1 | sz | 1  | Rm    | 1110  | ac=0 | 1  | Rn  | Rd  |
Scalar variant 
FCMGT <V><d>, <V><n>, <V><m> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 
CompareOp cmp; 
boolean abs; 
case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();
Vector 
| 31 | 30 | 29  | 28:24 | 23  | 22 | 21 | 20:16 | 15:12 | 11   | 10 | 9:5 | 4:0 |
| 0  | Q  | U=1 | 01110 | E=1 | sz | 1  | Rm    | 1110  | ac=0 | 1  | Rn  | Rd  |
Vector variant 
FCMGT <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 


integer elements = datasize DIV esize; 
CompareOp cmp; 
boolean abs; 


case E:U:ac of
    when '000' cmp = CompareOp_EQ; abs = FALSE;
    when '010' cmp = CompareOp_GE; abs = FALSE;
    when '011' cmp = CompareOp_GE; abs = TRUE;
    when '110' cmp = CompareOp_GT; abs = FALSE;
    when '111' cmp = CompareOp_GT; abs = TRUE;
    otherwise UnallocatedEncoding();


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
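As a rough illustration of the decode and operation above, the following Python sketch maps the E:U:ac bits to a comparison and an absolute-value flag, then applies the comparison lane by lane. The names DECODE and fcm_register are hypothetical, Python floats stand in for vector elements, and exceptions and FPCR effects are not modelled.

```python
# Hypothetical table mirroring the "case E:U:ac of" decode:
# bit string -> (comparison, compare absolute values?)
DECODE = {
    "000": ("EQ", False),
    "010": ("GE", False),
    "011": ("GE", True),
    "110": ("GT", False),
    "111": ("GT", True),
}


def fcm_register(op1, op2, cmp, absolute=False, esize=32):
    """Compare two lists of floats lane by lane and return per-lane masks."""
    ones = (1 << esize) - 1
    result = []
    for a, b in zip(op1, op2):
        if absolute:
            # Corresponds to applying FPAbs() to both elements.
            a, b = abs(a), abs(b)
        if cmp == "EQ":
            passed = a == b
        elif cmp == "GE":
            passed = a >= b
        elif cmp == "GT":
            passed = a > b
        else:
            raise ValueError("unsupported comparison: " + cmp)
        result.append(ones if passed else 0)
    return result


if __name__ == "__main__":
    cmp, absolute = DECODE["110"]  # the FCMGT (register) case
    print(fcm_register([3.0, -3.0], [2.0, 2.0], cmp, absolute))
```

With absolute=True the second lane in this example passes, because |-3.0| is greater than |2.0|.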
C7.2.51 FCMGT (zero)
Floating-point Compare Greater than zero (vector). This instruction reads each floating-point value in the source 
SIMD&FP register and if the value is greater than zero sets every bit of the corresponding vector element in the 
destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the 
destination SIMD&FP register to zero. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 | 30 | 29  | 28:24 | 23 | 22 | 21:17 | 16:13 | 12   | 11:10 | 9:5 | 4:0 |
| 0  | 1  | U=0 | 11110 | 1  | sz | 10000 | 0110  | op=0 | 10    | Rn  | Rd  |
Scalar variant 
FCMGT <V><d>, <V><n>, #0.0 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 
CompareOp comparison; 
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;
Vector 
| 31 | 30 | 29  | 28:24 | 23 | 22 | 21:17 | 16:13 | 12   | 11:10 | 9:5 | 4:0 |
| 0  | Q  | U=0 | 01110 | 1  | sz | 10000 | 0110  | op=0 | 10    | Rn  | Rd  |
Vector variant 
FCMGT <Vd>.<T>, <Vn>.<T>, #0.0 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
CompareOp comparison; 
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.52 FCMLE (zero)
Floating-point Compare Less than or Equal to zero (vector). This instruction reads each floating-point value in the 
source SIMD&FP register and if the value is less than or equal to zero sets every bit of the corresponding vector 
element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element 
in the destination SIMD&FP register to zero. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 | 30 | 29  | 28:24 | 23 | 22 | 21:17 | 16:13 | 12   | 11:10 | 9:5 | 4:0 |
| 0  | 1  | U=1 | 11110 | 1  | sz | 10000 | 0110  | op=1 | 10    | Rn  | Rd  |
Scalar variant 
FCMLE <V><d>, <V><n>, #0.0 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 
CompareOp comparison; 
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;
Vector 
| 31 | 30 | 29  | 28:24 | 23 | 22 | 21:17 | 16:13 | 12   | 11:10 | 9:5 | 4:0 |
| 0  | Q  | U=1 | 01110 | 1  | sz | 10000 | 0110  | op=1 | 10    | Rn  | Rd  |
Vector variant 
FCMLE <Vd>.<T>, <Vn>.<T>, #0.0 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
CompareOp comparison; 
case op:U of
    when '00' comparison = CompareOp_GT;
    when '01' comparison = CompareOp_GE;
    when '10' comparison = CompareOp_EQ;
    when '11' comparison = CompareOp_LE;


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.53 FCMLT (zero)


Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source 
SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the 
destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the 
destination SIMD&FP register to zero. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:24 | 23 | 22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0  | 1  | 0  | 11110 | 1  | sz | 10000 | 01110 | 10    | Rn  | Rd  |


Scalar variant 


FCMLT <V><d>, <V><n>, #0.0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


CompareOp comparison = CompareOp_LT; 
Vector 


| 31 | 30 | 29 | 28:24 | 23 | 22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0  | 01110 | 1  | sz | 10000 | 01110 | 10    | Rn  | Rd  |


Vector variant 


FCMLT <Vd>.<T>, <Vn>.<T>, #0.0 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


CompareOp comparison = CompareOp_LT; 
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;
C7.2.54 FCMP


Floating-point quiet Compare (scalar). This instruction compares the two SIMD&FP source register values, or the 
first SIMD&FP source register value and zero. It writes the result to the PSTATE.{N, Z, C, V} flags. 


It raises an Invalid Operation exception only if either operand is a signaling NaN. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22   | 21 | 20:16 | 15:14 | 13:10 | 9:5 | 4:3    | 2:0 |
| 0  | 0  | 0  | 11110 | type=0x | 1  | Rm    | 00    | 1000  | Rn  | opc=0x | 000 |


Single-precision variant 
Applies when type == 00 && opc == 00. 


FCMP <Sn>, <Sm> 


Single-precision, zero variant 
Applies when type == 00 && Rm == (00000) && opc == 01. 


FCMP <Sn>, #0.0 


Double-precision variant 
Applies when type == 01 && opc == 00.


FCMP <Dn>, <Dm> 


Double-precision, zero variant 
Applies when type == 01 && Rm == (00000) && opc == 01. 


FCMP <Dn>, #0.0 


Decode for all variants of this encoding 


integer n = UInt(Rn);
integer m = UInt(Rm); // ignored when opc<0> == '1'

integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

boolean signal_all_nans = (opc<1> == '1');
boolean cmp_with_zero = (opc<0> == '1');


Assembler symbols 


<Dn> For the double-precision variant: is the 64-bit name of the first SIMD&FP source register, encoded 
in the "Rn" field. 
For the double-precision, zero variant: is the 64-bit name of the SIMD&FP source register, encoded
in the "Rn" field. 


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 


<Sn> For the single-precision variant: is the 32-bit name of the first SIMD&FP source register, encoded 
in the "Rn" field. 


For the single-precision, zero variant: is the 32-bit name of the SIMD&FP source register, encoded 
in the "Rn" field. 


<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
NaNs 


The IEEE 754 standard specifies that the result of a comparison is precisely one of <, ==, > or unordered. If either
or both of the operands are NaNs, they are unordered, and all three of (Operand1 < Operand2), (Operand1 ==
Operand2) and (Operand1 > Operand2) are false. This case results in the PSTATE flags being set to N=0, Z=0, C=1,
and V=1.
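The flag outcome can be pictured with a small Python sketch. The unordered result (N=0, Z=0, C=1, V=1) is the one stated above; the values used here for the equal, less-than, and greater-than cases follow the shared FPCompare() pseudocode function as understood by the editor, and should be checked against that function's definition. The name fp_compare_flags is hypothetical, and the sketch does not model the Invalid Operation exception.

```python
import math


def fp_compare_flags(a, b):
    """Return (N, Z, C, V) for a floating-point compare of a against b."""
    if math.isnan(a) or math.isnan(b):
        return (0, 0, 1, 1)  # unordered
    if a == b:
        return (0, 1, 1, 0)  # equal
    if a < b:
        return (1, 0, 0, 0)  # less than
    return (0, 0, 1, 0)      # greater than


if __name__ == "__main__":
    for pair in [(1.0, 2.0), (2.0, 2.0), (3.0, 2.0), (float("nan"), 2.0)]:
        print(pair, fp_compare_flags(*pair))
```

Comparing against zero, as in the zero variants, corresponds to calling fp_compare_flags(x, 0.0).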


Operation 


CheckFPAdvSIMDEnabled64();

bits(datasize) operand1 = V[n];
bits(datasize) operand2;

operand2 = if cmp_with_zero then FPZero('0') else V[m];

PSTATE.<N,Z,C,V> = FPCompare(operand1, operand2, signal_all_nans, FPCR);
C7.2.55 FCMPE
Floating-point signaling Compare (scalar). This instruction compares the two SIMD&FP source register values, or 
the first SIMD&FP source register value and zero. It writes the result to the PSTATE.{N, Z, C, V} flags. 
If either operand is any type of NaN, the instruction raises an Invalid Operation exception.
A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28:24 | 23:22   | 21 | 20:16 | 15:14 | 13:10 | 9:5 | 4:3    | 2:0 |
| 0  | 0  | 0  | 11110 | type=0x | 1  | Rm    | 00    | 1000  | Rn  | opc=1x | 000 |
Single-precision variant 
Applies when type == 00 && opc == 10. 
FCMPE <Sn>, <Sm> 
Single-precision, zero variant 
Applies when type == 00 && Rm == (00000) && opc == 11. 
FCMPE <Sn>, #0.0 
Double-precision variant 
Applies when type == 01 && opc == 10. 
FCMPE <Dn>, <Dm> 
Double-precision, zero variant 
Applies when type == 01 && Rm == (00000) && opc == 11. 
FCMPE <Dn>, #0.0 
Decode for all variants of this encoding 
integer n = UInt(Rn);
integer m = UInt(Rm); // ignored when opc<0> == '1'
integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();
boolean signal_all_nans = (opc<1> == '1');
boolean cmp_with_zero = (opc<0> == '1');
Assembler symbols 
<Dn> For the double-precision variant: is the 64-bit name of the first SIMD&FP source register, encoded 
in the "Rn" field. 
For the double-precision, zero variant: is the 64-bit name of the SIMD&FP source register, encoded
in the "Rn" field. 


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 


<Sn> For the single-precision variant: is the 32-bit name of the first SIMD&FP source register, encoded 
in the "Rn" field. 


For the single-precision, zero variant: is the 32-bit name of the SIMD&FP source register, encoded 
in the "Rn" field. 


<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
NaNs 


The IEEE 754 standard specifies that the result of a comparison is precisely one of <, ==, > or unordered. If either
or both of the operands are NaNs, they are unordered, and all three of (Operand1 < Operand2), (Operand1 ==
Operand2) and (Operand1 > Operand2) are false. This case results in the PSTATE flags being set to N=0, Z=0, C=1,
and V=1.


FCMPE raises an Invalid Operation exception if either operand is any type of NaN, and is suitable for testing for <, 
<=, >, >=, and other predicates that raise an exception when the operands are unordered. 
Operation 

CheckFPAdvSIMDEnabled64();

bits(datasize) operand1 = V[n];
bits(datasize) operand2;

operand2 = if cmp_with_zero then FPZero('0') else V[m];

PSTATE.<N,Z,C,V> = FPCompare(operand1, operand2, signal_all_nans, FPCR);
C7.2.56 FCSEL


Floating-point Conditional Select (scalar). This instruction allows the SIMD&FP destination register to take the 
value from either one or the other of two SIMD&FP source registers. If the condition passes, the first SIMD&FP 
source register value is taken, otherwise the second SIMD&FP source register value is taken. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22   | 21 | 20:16 | 15:12 | 11:10 | 9:5 | 4:0 |
| 0  | 0  | 0  | 11110 | type=0x | 1  | Rm    | cond  | 11    | Rn  | Rd  |


Single-precision variant 
Applies when type == 00. 

FCSEL <Sd>, <Sn>, <Sm>, <cond> 
Double-precision variant 
Applies when type == 01.


FCSEL <Dd>, <Dn>, <Dm>, <cond> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


bits(4) condition = cond; 


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<cond> Is one of the standard conditions, encoded in the "cond" field in the standard way. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;

result = if ConditionHolds(condition) then V[n] else V[m];

V[d] = result;
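A minimal Python sketch of the selection follows. The condition evaluation is written from the editor's understanding of the standard A64 condition codes and the shared ConditionHolds() function, so it should be treated as an assumption and checked against that function's definition. The names condition_holds and fcsel are hypothetical, and flags are passed as an (N, Z, C, V) tuple.

```python
def condition_holds(cond, n, z, c, v):
    """Evaluate a 4-bit condition code against the N, Z, C, V flags."""
    base = cond >> 1
    if base == 0b000:
        result = z == 1                 # EQ or NE
    elif base == 0b001:
        result = c == 1                 # CS or CC
    elif base == 0b010:
        result = n == 1                 # MI or PL
    elif base == 0b011:
        result = v == 1                 # VS or VC
    elif base == 0b100:
        result = c == 1 and z == 0      # HI or LS
    elif base == 0b101:
        result = n == v                 # GE or LT
    elif base == 0b110:
        result = n == v and z == 0      # GT or LE
    else:
        result = True                   # AL
    # The low bit inverts the test, except for the encoding 1111.
    if (cond & 1) == 1 and cond != 0b1111:
        result = not result
    return result


def fcsel(cond, flags, vn, vm):
    """Return vn if the condition holds for flags, otherwise vm."""
    return vn if condition_holds(cond, *flags) else vm


if __name__ == "__main__":
    GT = 0b1100
    # Flags as left by a compare whose first operand was greater.
    print(fcsel(GT, (0, 0, 1, 0), 1.5, 2.5))
```

After an unordered compare (N=0, Z=0, C=1, V=1) the GT condition does not hold in this model, so the second source value is selected.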
C7.2.57 FCVT


Floating-point Convert precision (scalar). This instruction converts the floating-point value in the SIMD&FP source 
register to the precision for the destination register data type using the rounding mode that is determined by the 
FPCR and writes the result to the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:17 | 16:15 | 14:10 | 9:5 | 4:0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 0001  | opc   | 10000 | Rn  | Rd  |


Half-precision to single-precision variant 
Applies when type == 11 && opc == 00. 


FCVT <Sd>, <Hn> 


Half-precision to double-precision variant 
Applies when type == 11 && opc == 01. 


FCVT <Dd>, <Hn> 


Single-precision to half-precision variant 
Applies when type == 00 && opc == 11. 


FCVT <Hd>, <Sn> 


Single-precision to double-precision variant 
Applies when type == 00 && opc == 01. 


FCVT <Dd>, <Sn> 


Double-precision to half-precision variant 
Applies when type == 01 && opc == 11. 

FCVT <Hd>, <Dn> 

Double-precision to single-precision variant 
Applies when type == 01 && opc == 00. 

FCVT <Sd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if type == opc then UnallocatedEncoding(); 


integer srcsize; 
case type of
    when '00' srcsize = 32;
    when '01' srcsize = 64;
    when '10' UnallocatedEncoding();
    when '11' srcsize = 16;
integer dstsize; 







case opc of
    when '00' dstsize = 32;
    when '01' dstsize = 64;
    when '10' UnallocatedEncoding();
    when '11' dstsize = 16;
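The decode above can be summarized in a few lines of Python. This is a sketch only: the names SIZES and decode_fcvt are hypothetical, the fields are passed as small integers, and an unallocated encoding is represented by raising ValueError.

```python
# Field value -> operand size in bits; the value 0b10 is unallocated.
SIZES = {0b00: 32, 0b01: 64, 0b11: 16}


def decode_fcvt(type_, opc):
    """Return (srcsize, dstsize) for an FCVT encoding."""
    # type == opc would convert a format to itself, which is unallocated.
    if type_ == opc or type_ not in SIZES or opc not in SIZES:
        raise ValueError("unallocated encoding")
    return SIZES[type_], SIZES[opc]


if __name__ == "__main__":
    print(decode_fcvt(0b11, 0b00))  # half-precision to single-precision
```

The six valid (type, opc) pairs correspond to the six variants listed for this instruction.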


Assembler symbols 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Hd> Is the 16-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Hn> Is the 16-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();


bits(dstsize) result; 
bits(srcsize) operand = V[n]; 


result = FPConvert(operand, FPCR); 
V[d] = result; 
C7.2.58 FCVTAS (vector)


Floating-point Convert to Signed integer, rounding to nearest with ties to Away (vector). This instruction converts 
each element in a vector from a floating-point value to a signed integer value using the Round to Nearest with Ties 
to Away rounding mode and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 
| 31 | 30 | 29  | 28:24 | 23 | 22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0  | 1  | U=0 | 11110 | 0  | sz | 10000 | 11100 | 10    | Rn  | Rd  |


Scalar variant 


FCVTAS <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPRounding_TIEAWAY; 


boolean unsigned = (U == '1'); 
Vector 
| 31 | 30 | 29  | 28:24 | 23 | 22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | U=0 | 01110 | 0  | sz | 10000 | 11100 | 10    | Rn  | Rd  |


Vector variant 


FCVTAS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPRounding_TIEAWAY; 
boolean unsigned = (U == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;
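The Round to Nearest with Ties to Away conversion can be sketched in Python as follows. The rounding rule is the one named in the instruction description. The saturation of out-of-range values and the result of zero for a NaN input reflect the editor's understanding of the shared FPToFixed() function, and are assumptions to verify against its definition. The name fp_to_int_ties_away is hypothetical, and floating-point exception signalling is not modelled.

```python
import math
from fractions import Fraction


def fp_to_int_ties_away(x, bits=32, unsigned=False):
    """Convert a float to an integer, rounding halfway cases away from zero."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0                      # assumed NaN result
    if math.isinf(x):
        return hi if x > 0 else lo    # assumed saturation
    # Fraction(x) is exact, so adding one half cannot introduce a
    # rounding error of its own before the floor is taken.
    exact = Fraction(x)
    magnitude = math.floor(abs(exact) + Fraction(1, 2))
    value = magnitude if exact >= 0 else -magnitude
    return max(lo, min(hi, value))    # saturate to the integer range


if __name__ == "__main__":
    for x in [2.5, -2.5, 0.4, 1e20]:
        print(x, fp_to_int_ties_away(x))
```

The exact rational arithmetic matters for inputs just below one half, where adding 0.5 in binary floating point would round up to 1.0.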
C7.2.59 FCVTAS (scalar)


Floating-point Convert to Signed integer, rounding to nearest with ties to Away (scalar). This instruction converts 
the floating-point value in the SIMD&FP source register to a 32-bit or 64-bit signed integer using the Round to 
Nearest with Ties to Away rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28:24 | 23:22   | 21 | 20:19    | 18:16      | 15:10  | 9:5 | 4:0 |
| sf | 0  | 0  | 11110 | type=0x | 1  | rmode=00 | opcode=100 | 000000 | Rn  | Rd  |


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTAS <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTAS <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTAS <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTAS <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, FALSE, FPCR, FPRounding_TIEAWAY);
X[d] = intval; 
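
The rounding and saturation behaviour described for FCVTAS can be sketched in Python. This is an illustrative model only, not the architectural pseudocode. The function name fcvtas is hypothetical, and the sketch assumes that a NaN input converts to 0 and that out-of-range results saturate to the signed range. Floating-point exception flags and traps are not modelled.

```python
import math
from fractions import Fraction


def fcvtas(value: float, intsize: int = 32) -> int:
    """Round to nearest with ties away from zero, then saturate (signed)."""
    lo = -(1 << (intsize - 1))
    hi = (1 << (intsize - 1)) - 1
    if math.isnan(value):
        return 0                      # assumption: NaN converts to zero
    if math.isinf(value):
        return hi if value > 0 else lo
    exact = Fraction(value)           # exact rational value of the float
    magnitude = math.floor(abs(exact) + Fraction(1, 2))
    rounded = magnitude if exact >= 0 else -magnitude
    return max(lo, min(hi, rounded))  # saturate to the destination width
```

With this model, 2.5 converts to 3 and -2.5 converts to -3, which is the difference from ties-to-even rounding.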
C7.2.60 FCVTAU (vector)


Floating-point Convert to Unsigned integer, rounding to nearest with ties to Away (vector). This instruction converts 
each element in a vector from a floating-point value to an unsigned integer value using the Round to Nearest with 
Ties to Away rounding mode and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28     24|23|22|21     17|16     12|11 10|9     5|4     0|
| 0 1 | 1|1 1 1 1 0| 0|sz|1 0 0 0 0|1 1 1 0 0| 1 0 |  Rn   |  Rd   |
        U


Scalar variant 


FCVTAU <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPRounding_TIEAWAY; 


boolean unsigned = (U == '1'); 
Vector 
|31|30|29|28     24|23|22|21     17|16     12|11 10|9     5|4     0|
| 0| Q| 1|0 1 1 1 0| 0|sz|1 0 0 0 0|1 1 1 0 0| 1 0 |  Rn   |  Rd   |
        U


Vector variant 


FCVTAU <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPRounding_TIEAWAY; 
boolean unsigned = (U == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1


The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;


for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);


V[d] = result; 
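
The element loop above can be modelled in Python by treating the register as little-endian bytes. This is a rough sketch under stated assumptions, not the architectural definition. The name fcvtau_vector is hypothetical, a NaN lane is assumed to convert to 0, and negative or oversized results are assumed to saturate to the unsigned range. Exception flags are not modelled.

```python
import math
import struct
from fractions import Fraction


def fcvtau_vector(reg: bytes, esize: int = 32) -> bytes:
    """Convert each esize-bit float lane to an unsigned integer, ties away."""
    float_fmt = {32: "f", 64: "d"}[esize]
    uint_fmt = {32: "I", 64: "Q"}[esize]
    elements = len(reg) * 8 // esize          # datasize DIV esize
    top = (1 << esize) - 1
    lanes = struct.unpack(f"<{elements}{float_fmt}", reg)
    result = []
    for lane in lanes:                        # for e = 0 to elements-1
        if math.isnan(lane):
            converted = 0                     # assumption: NaN gives zero
        elif math.isinf(lane):
            converted = top if lane > 0 else 0
        else:
            magnitude = math.floor(abs(Fraction(lane)) + Fraction(1, 2))
            converted = magnitude if lane > 0 else -magnitude
            converted = max(0, min(top, converted))   # unsigned saturation
        result.append(converted)
    return struct.pack(f"<{elements}{uint_fmt}", *result)
```

Passing 8 bytes models the 64-bit arrangements and 16 bytes models the 128-bit arrangements.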
C7.2.61 FCVTAU (scalar)


Floating-point Convert to Unsigned integer, rounding to nearest with ties to Away (scalar). This instruction converts 
the floating-point value in the SIMD&FP source register to a 32-bit or 64-bit unsigned integer using the Round to 
Nearest with Ties to Away rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23 22|21|20 19|18  16|15       10|9     5|4     0|
|sf| 0| 0|1 1 1 1 0| 0 x | 1| 0 0 |1 0 1 |0 0 0 0 0 0|  Rn   |  Rd   |
                    type     rmode opcode


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTAU <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTAU <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTAU <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTAU <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, TRUE, FPCR, FPRounding_TIEAWAY);
X[d] = intval; 
C7.2.62 FCVTL, FCVTL2


Floating-point Convert to higher precision Long (vector). This instruction reads each element in a vector in the 
SIMD&FP source register, converts each value to double the precision of the source element using the rounding 
mode that is determined by the FPCR, and writes each result to the equivalent element of the vector in the 
SIMD&FP destination register. 


Where the operation lengthens a 64-bit vector to a 128-bit vector, the FCVTL2 variant operates on the elements in the 
top 64 bits of the source register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23|22|21     17|16     12|11 10|9     5|4     0|
| 0| Q| 0|0 1 1 1 0| 0|sz|1 0 0 0 0|1 0 1 1 1| 1 0 |  Rn   |  Rd   |


Vector single-precision and double-precision variant 


FCVTL{2} <Vd>.<Ta>, <Vn>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 16 << UInt(sz); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0


[present] when Q = 1





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "sz" field. It can have the following values: 
4S when sz = 0
2D when sz = 1 

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<Tb> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
4H when sz = 0, Q = 0
8H when sz = 0, Q = 1
2S when sz = 1, Q = 0
4S when sz = 1, Q = 1
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;


for e = 0 to elements-1
    Elem[result, e, 2*esize] = FPConvert(Elem[operand, e, esize], FPCR);


V[d] = result; 
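
As a rough illustration of the lengthening step, the following Python sketch widens the elements in the selected 64-bit half of a 128-bit source value. The name fcvtl and the byte-string register model are assumptions made for this example. It relies on Python's struct module for the IEEE half-precision ("e"), single-precision ("f"), and double-precision ("d") formats, and does not model FPCR controls or exception flags.

```python
import struct


def fcvtl(source: bytes, part: int = 0, esize: int = 16) -> bytes:
    """Widen the esize-bit elements in one 64-bit half of a 128-bit register."""
    narrow = {16: "e", 32: "f"}[esize]
    wide = {16: "f", 32: "d"}[esize]
    half = source[8 * part: 8 * part + 8]      # Vpart[n, part]
    elements = 64 // esize                     # datasize DIV esize
    values = struct.unpack(f"<{elements}{narrow}", half)
    return struct.pack(f"<{elements}{wide}", *values)   # 2*datasize result
```

Here part = 0 corresponds to FCVTL and part = 1 to FCVTL2, which reads the upper 64 bits of the source.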
C7.2.63 FCVTMS (vector)


Floating-point Convert to Signed integer, rounding toward Minus infinity (vector). This instruction converts a scalar 
or each element in a vector from a floating-point value to a signed integer value using the Round towards Minus 
Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
|31 30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0 1 | 0|1 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 1| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Scalar variant 


FCVTMS <V><d>, <V><n>


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPDecodeRounding(o1:o2);


boolean unsigned = (U == '1'); 
Vector 
|31|30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0| Q| 0|0 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 1| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Vector variant 


FCVTMS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 
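
The o1:o2 concatenation passed to FPDecodeRounding can be illustrated with a short Python sketch. The bit positions used here (o2 at bit 23, o1 at bit 12) and the two-bit mapping to rounding modes are taken from this manual's encoding diagrams and FPDecodeRounding as the editor understands them, so treat them as assumptions. The function name decode_rounding is hypothetical.

```python
# Two-bit rounding-mode encoding, as assumed from FPDecodeRounding.
FP_ROUNDING = {
    0b00: "FPRounding_TIEEVEN",
    0b01: "FPRounding_POSINF",
    0b10: "FPRounding_NEGINF",
    0b11: "FPRounding_ZERO",
}


def decode_rounding(insn: int) -> str:
    """Return the rounding mode selected by o1:o2 in an instruction word."""
    o2 = (insn >> 23) & 1          # assumed position of o2
    o1 = (insn >> 12) & 1          # assumed position of o1
    return FP_ROUNDING[(o1 << 1) | o2]
```

For example, the word 0x0E21B800 has o1 = 1 and o2 = 0, which selects rounding toward minus infinity, matching FCVTMS.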
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1


The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;


for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);


V[d] = result; 
C7.2.64 FCVTMS (scalar)


Floating-point Convert to Signed integer, rounding toward Minus infinity (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit signed integer using the Round towards 
Minus Infinity rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23 22|21|20 19|18  16|15       10|9     5|4     0|
|sf| 0| 0|1 1 1 1 0| 0 x | 1| 1 0 |0 0 0 |0 0 0 0 0 0|  Rn   |  Rd   |
                    type     rmode opcode


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTMS <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTMS <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTMS <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTMS <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();


rounding = FPDecodeRounding(rmode);
Assembler symbols


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, FALSE, FPCR, rounding);
X[d] = intval; 
C7.2.65 FCVTMU (vector)


Floating-point Convert to Unsigned integer, rounding toward Minus infinity (vector). This instruction converts a 
scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards 
Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
|31 30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0 1 | 1|1 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 1| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Scalar variant 


FCVTMU <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPDecodeRounding(o1:o2);


boolean unsigned = (U == '1'); 
Vector 
|31|30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0| Q| 1|0 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 1| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Vector variant 


FCVTMU <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1


The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;


for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);


V[d] = result; 
C7.2.66 FCVTMU (scalar)


Floating-point Convert to Unsigned integer, rounding toward Minus infinity (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit unsigned integer using the Round towards 
Minus Infinity rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23 22|21|20 19|18  16|15       10|9     5|4     0|
|sf| 0| 0|1 1 1 1 0| 0 x | 1| 1 0 |0 0 1 |0 0 0 0 0 0|  Rn   |  Rd   |
                    type     rmode opcode


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTMU <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTMU <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTMU <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTMU <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();


rounding = FPDecodeRounding(rmode);
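
The decode step for the scalar encoding can be sketched as plain bit-field extraction. The Python below is illustrative only. The function name decode_fcvtmu_scalar is hypothetical, the field positions are read from the encoding diagram for this instruction, and the sketch prints register number 31 literally rather than applying any special register naming.

```python
def decode_fcvtmu_scalar(insn: int) -> str:
    """Return assembly text for a scalar FCVTMU instruction word."""
    # Fixed bits: 30:29 = 00, 28:24 = 11110, 21 = 1, 15:10 = 000000.
    if insn & 0x7F20FC00 != 0x1E200000:
        raise ValueError("not a floating-point to integer conversion")
    sf = (insn >> 31) & 0x1
    ftype = (insn >> 22) & 0x3
    rmode = (insn >> 19) & 0x3
    opcode = (insn >> 16) & 0x7
    n = (insn >> 5) & 0x1F         # UInt(Rn)
    d = insn & 0x1F                # UInt(Rd)
    if (rmode, opcode) != (0b10, 0b001):
        raise ValueError("not FCVTMU")
    if ftype not in (0b00, 0b01):
        raise ValueError("unallocated encoding")
    dest = "X" if sf else "W"      # intsize is 64 when sf == 1, else 32
    src = "D" if ftype else "S"    # fltsize is 64 when type == 01, else 32
    return f"FCVTMU {dest}{d}, {src}{n}"
```

The two ValueError cases for type correspond to the UnallocatedEncoding() branches in the decode pseudocode.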
Assembler symbols


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, TRUE, FPCR, rounding);
X[d] = intval; 





C7-888 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


C7.2.67 FCVTN, FCVTN2 


Floating-point Convert to lower precision Narrow (vector). This instruction reads each vector element in the
SIMD&FP source register, converts each value to half the precision of the source element, places the results in
a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination
vector elements are half as long as the source vector elements. The rounding mode is determined by the FPCR.


The FCVTN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the FCVTN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23|22|21     17|16     12|11 10|9     5|4     0|
| 0| Q| 0|0 1 1 1 0| 0|sz|1 0 0 0 0|1 0 1 1 0| 1 0 |  Rn   |  Rd   |


Vector single-precision and double-precision variant 


FCVTN{2} <Vd>.<Tb>, <Vn>.<Ta>


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 16 << UInt(sz); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0


[present] when Q = 1





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Tb> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
4H when sz = 0, Q = 0
8H when sz = 0, Q = 1
2S when sz = 1, Q = 0
4S when sz = 1, Q = 1
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<Ta> Is an arrangement specifier, encoded in the "sz" field. It can have the following values: 
4S when sz = 0
2D when sz = 1


Operation 
CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;


for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR);


Vpart[d, part] = result; 
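
The difference between FCVTN and FCVTN2 in how the destination is written can be sketched in Python. The name fcvtn and the byte-string register model are assumptions for this example. The sketch also assumes that the FPCR selects round to nearest, because that is what Python's struct module applies when packing to a narrower format, and it does not model overflow handling or exception flags.

```python
import struct


def fcvtn(dest: bytes, source: bytes, part: int = 0, esize: int = 32) -> bytes:
    """Narrow 2*esize-bit elements and write them to one half of dest."""
    wide = {16: "f", 32: "d"}[esize]
    narrow = {16: "e", 32: "f"}[esize]
    elements = 64 // esize                     # datasize DIV esize
    values = struct.unpack(f"<{elements}{wide}", source)
    result = struct.pack(f"<{elements}{narrow}", *values)
    if part == 0:
        return result + bytes(8)               # FCVTN: upper half cleared
    return dest[:8] + result                   # FCVTN2: lower half preserved
```

Both dest and source are 16-byte values standing in for 128-bit registers.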
C7.2.68 FCVTNS (vector)


Floating-point Convert to Signed integer, rounding to nearest with ties to even (vector). This instruction converts a 
scalar or each element in a vector from a floating-point value to a signed integer value using the Round to Nearest 
rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0 1 | 0|1 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 0| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Scalar variant 


FCVTNS <V><d>, <V><n>


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPDecodeRounding(o1:o2);


boolean unsigned = (U == '1'); 
Vector 
|31|30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0| Q| 0|0 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 0| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Vector variant 


FCVTNS <Vd>.<T>, <Vn>.<T>


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1


The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
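
The arrangement specifier follows directly from the esize and datasize calculations in the vector decode. A minimal Python sketch, with a hypothetical helper name arrangement, shows the mapping:

```python
def arrangement(sz: int, q: int) -> str:
    """Return the arrangement specifier selected by the sz and Q fields."""
    if sz == 1 and q == 0:
        raise ValueError("reserved encoding")   # sz:Q == '10'
    esize = 32 << sz                            # 32 or 64 bits per element
    datasize = 128 if q == 1 else 64
    elements = datasize // esize
    return f"{elements}{'S' if esize == 32 else 'D'}"
```

The reserved case corresponds to a single 64-bit element in a 64-bit vector, which this vector form does not provide.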


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;


for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);


V[d] = result; 
C7.2.69 FCVTNS (scalar)


Floating-point Convert to Signed integer, rounding to nearest with ties to even (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit signed integer using the Round to Nearest 
rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23 22|21|20 19|18  16|15       10|9     5|4     0|
|sf| 0| 0|1 1 1 1 0| 0 x | 1| 0 0 |0 0 0 |0 0 0 0 0 0|  Rn   |  Rd   |
                    type     rmode opcode


Single-precision to 32-bit variant
Applies when sf == 0 && type == 00.


FCVTNS <Wd>, <Sn>


Single-precision to 64-bit variant
Applies when sf == 1 && type == 00.


FCVTNS <Xd>, <Sn>


Double-precision to 32-bit variant
Applies when sf == 0 && type == 01.


FCVTNS <Wd>, <Dn>


Double-precision to 64-bit variant
Applies when sf == 1 && type == 01.


FCVTNS <Xd>, <Dn>


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();


rounding = FPDecodeRounding(rmode);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, FALSE, FPCR, rounding);
X[d] = intval; 
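
Round to nearest with ties to even can be modelled with Python's built-in round(), which uses the same tie-breaking rule for floats. The sketch below is illustrative only. The name fcvtns is hypothetical, a NaN input is assumed to convert to 0, out-of-range results are assumed to saturate, and exception flags are not modelled.

```python
import math


def fcvtns(value: float, intsize: int = 32) -> int:
    """Round to nearest with ties to even, then saturate (signed)."""
    lo = -(1 << (intsize - 1))
    hi = (1 << (intsize - 1)) - 1
    if math.isnan(value):
        return 0                       # assumption: NaN converts to zero
    if math.isinf(value):
        return hi if value > 0 else lo
    return max(lo, min(hi, round(value)))   # round() breaks ties to even
```

With this model, 2.5 converts to 2 and 3.5 converts to 4, whereas the ties-to-away instructions would give 3 and 4.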


C7.2.70 FCVTNU (vector) 


Floating-point Convert to Unsigned integer, rounding to nearest with ties to even (vector). This instruction converts 
a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round to 
Nearest rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
|31 30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0 1 | 1|1 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 0| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Scalar variant 


FCVTNU <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPDecodeRounding(o1:o2);


boolean unsigned = (U == '1'); 
Vector 
|31|30|29|28     24|23|22|21     17|16   13|12|11 10|9     5|4     0|
| 0| Q| 1|0 1 1 1 0| 0|sz|1 0 0 0 0|1 1 0 1| 0| 1 0 |  Rn   |  Rd   |
        U           o2                      o1


Vector variant 


FCVTNU <Vd>.<T>, <Vn>.<T>


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1


The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;


for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);


V[d] = result; 


C7.2.71 FCVTNU (scalar) 


Floating-point Convert to Unsigned integer, rounding to nearest with ties to even (scalar). This instruction converts 
the floating-point value in the SIMD&FP source register to a 32-bit or 64-bit unsigned integer using the Round to 
Nearest rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28     24|23 22|21|20 19|18  16|15       10|9     5|4     0|
|sf| 0| 0|1 1 1 1 0| 0 x | 1| 0 0 |0 0 1 |0 0 0 0 0 0|  Rn   |  Rd   |
                    type     rmode opcode


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTNU <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTNU <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTNU <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTNU <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();


rounding = FPDecodeRounding(rmode);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, TRUE, FPCR, rounding);
X[d] = intval;
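
As a rough illustration of the Operation pseudocode above, the following Python sketch models the integer case of FPToFixed (zero fractional bits, unsigned, Round to Nearest with ties to even). The function name fcvtnu and the returned flag strings are assumptions made for this example and are not part of the architecture pseudocode. The sketch takes a Python float as input, and assumes that NaN inputs and out-of-range results signal Invalid Operation (IOC) and that inexact results signal Inexact (IXC). It ignores trapping and the FPCR controls.

```python
import math

def fcvtnu(value: float, intsize: int = 32):
    """Return (result, flag) for an unsigned, ties-to-even conversion.

    flag is "IOC" (Invalid Operation), "IXC" (Inexact) or None.
    Hypothetical helper: only the rounding and saturation are modelled.
    """
    max_int = (1 << intsize) - 1
    if math.isnan(value):
        return 0, "IOC"
    if math.isinf(value):
        return (max_int if value > 0 else 0), "IOC"
    rounded = round(value)  # round() on a float rounds ties to even
    if rounded < 0:
        return 0, "IOC"          # saturate at the lower bound
    if rounded > max_int:
        return max_int, "IOC"    # saturate at the upper bound
    return rounded, ("IXC" if rounded != value else None)
```

For example, under these assumptions fcvtnu(2.5) and fcvtnu(3.5) round to 2 and 4 respectively, because ties go to the even neighbour.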


C7.2.72 FCVTPS (vector) 


Floating-point Convert to Signed integer, rounding toward Plus infinity (vector). This instruction converts a scalar 
or each element in a vector from a floating-point value to a signed integer value using the Round towards Plus 
Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
| 31-30 | 29  | 28-24 | 23   | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 01    | U=0 | 11110 | o2=1 | sz | 10000 | 1101  | o1=0 | 10    | Rn  | Rd  |


Scalar variant 


FCVTPS <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;

FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1');
Vector 
| 31 | 30 | 29  | 28-24 | 23   | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | o2=1 | sz | 10000 | 1101  | o1=0 | 10    | Rn  | Rd  |


Vector variant 


FCVTPS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;
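
The per-element loop above can be sketched in Python as follows. This is a minimal model, not the architecture pseudocode: the function name fcvtps_vector, the representation of a SIMD&FP register as little-endian bytes, and the omission of all FPSR flag updates are assumptions made for illustration. It also assumes that a NaN element converts to 0 and that out-of-range values saturate. An 8-byte operand corresponds to Q = 0 and a 16-byte operand to Q = 1.

```python
import math
import struct

def fcvtps_vector(operand: bytes, esize: int) -> bytes:
    """Convert each FP lane to a signed integer lane, rounding toward +infinity."""
    fp_fmt, int_fmt = {32: ("f", "i"), 64: ("d", "q")}[esize]
    elements = len(operand) * 8 // esize          # datasize DIV esize
    lo, hi = -(1 << (esize - 1)), (1 << (esize - 1)) - 1
    lanes = struct.unpack(f"<{elements}{fp_fmt}", operand)
    out = []
    for element in lanes:
        if math.isnan(element):
            out.append(0)                          # assumed: NaN converts to zero
        elif math.isinf(element):
            out.append(hi if element > 0 else lo)  # infinities saturate
        else:
            out.append(min(max(math.ceil(element), lo), hi))
    return struct.pack(f"<{elements}{int_fmt}", *out)
```

With a 4S arrangement, the lanes 1.25, -1.25, 3.0 and 1e20 become 2, -1, 3 and the saturated value 2^31 - 1.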


C7.2.73 FCVTPS (scalar) 


Floating-point Convert to Signed integer, rounding toward Plus infinity (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit signed integer using the Round towards 
Plus Infinity rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-19    | 18-16      | 15-10  | 9-5 | 4-0 |
| sf | 0  | 0  | 11110 | type=0x | 1  | rmode=01 | opcode=000 | 000000 | Rn  | Rd  |


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTPS <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTPS <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTPS <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTPS <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();

rounding = FPDecodeRounding(rmode);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, FALSE, FPCR, rounding);
X[d] = intval;


C7.2.74 FCVTPU (vector) 


Floating-point Convert to Unsigned integer, rounding toward Plus infinity (vector). This instruction converts a 
scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards 
Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
| 31-30 | 29  | 28-24 | 23   | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 01    | U=1 | 11110 | o2=1 | sz | 10000 | 1101  | o1=0 | 10    | Rn  | Rd  |


Scalar variant 


FCVTPU <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;

FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1');
Vector 
| 31 | 30 | 29  | 28-24 | 23   | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | o2=1 | sz | 10000 | 1101  | o1=0 | 10    | Rn  | Rd  |


Vector variant 


FCVTPU <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;


C7.2.75 FCVTPU (scalar) 


Floating-point Convert to Unsigned integer, rounding toward Plus infinity (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit unsigned integer using the Round towards 
Plus Infinity rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-19    | 18-16      | 15-10  | 9-5 | 4-0 |
| sf | 0  | 0  | 11110 | type=0x | 1  | rmode=01 | opcode=001 | 000000 | Rn  | Rd  |


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTPU <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTPU <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTPU <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTPU <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();

rounding = FPDecodeRounding(rmode);


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, TRUE, FPCR, rounding);
X[d] = intval;

C7.2.76 FCVTXN, FCVTXN2 
Floating-point Convert to lower precision Narrow, rounding to odd (vector). This instruction reads each vector 
element in the source SIMD&FP register, narrows each value to half the precision of the source element using the 
Round to Odd rounding mode, writes the result to a vector, and writes the vector to the destination SIMD&FP 
register. 
Note 
This instruction uses the Round to Odd rounding mode which is not defined by the IEEE 754-2008 standard. This 
rounding mode ensures that if the result of the conversion is inexact the least significant bit of the mantissa is forced 
to 1. This rounding mode enables a floating-point value to be converted to a lower precision format via an 
intermediate precision format while avoiding double rounding errors. For example, a 64-bit floating-point value can 
be converted to a correctly rounded 16-bit floating-point value by first using this instruction to produce a 32-bit 
value and then using another instruction with the wanted rounding mode to convert the 32-bit value to the final 
16-bit floating-point value. 
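
The double rounding problem described in this Note can be illustrated with a short Python sketch. It is a software model only, under stated assumptions: the helper names f32_nearest, f32_round_to_odd and to_half are invented for this example, the standard struct module stands in for the hardware conversions (its "f" and "e" formats round to nearest with ties to even), and overflow, flag setting and FPCR controls are not modelled.

```python
import struct

def f32_nearest(x: float) -> float:
    """Double -> single, round to nearest even (result held in a Python float)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_round_to_odd(x: float) -> float:
    """Double -> single using round to odd (sketch; overflow is not handled)."""
    nearest = f32_nearest(x)
    if nearest == x:
        return nearest                      # exact: no rounding needed
    bits = struct.unpack("<I", struct.pack("<f", nearest))[0]
    if abs(nearest) > abs(x):
        bits -= 1                           # step back to x truncated toward zero
    bits |= 1                               # inexact: force the mantissa LSB to 1
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def to_half(x: float) -> float:
    """Round to half precision, ties to even (result held in a Python float)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]
```

Take x = 1 + 2**-11 + 2**-40, which lies just above the midpoint between two adjacent half-precision values. Rounding directly gives to_half(x) == 1 + 2**-10. Going through single precision with round to nearest loses the 2**-40 term, lands exactly on the midpoint, and the final step then rounds down to 1.0. Going through f32_round_to_odd keeps a sticky low bit, so the final step still gives 1 + 2**-10.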
The FCVTXN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the FCVTXN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31-30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 01    | 1  | 11110 | 0  | sz | 10000 | 10110 | 10    | Rn  | Rd  |
Scalar variant 
FCVTXN <Vb><d>, <Va><n> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if sz == '0' then ReservedValue();
integer esize = 32;
integer datasize = esize;
integer elements = 1;
integer part = 0;
Vector 
| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 0  | sz | 10000 | 10110 | 10    | Rn  | Rd  |
Vector variant 
FCVTXN{2} <Vd>.<Tb>, <Vn>.<Ta> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz == '0' then ReservedValue();
integer esize = 32; 

integer datasize = 64; 

integer elements = 2; 

integer part = UInt(Q); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Tb> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 1, Q = 0
    4S when sz = 1, Q = 1
    The encoding sz = 0, Q = x is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<Ta> Is an arrangement specifier, encoded in the "sz" field. It can have the following values:
    2D when sz = 1
    The encoding sz = 0 is reserved.
<Vb> Is the destination width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 1
    The encoding sz = 0 is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Va> Is the source width specifier, encoded in the "sz" field. It can have the following values:
    D when sz = 1
    The encoding sz = 0 is reserved.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 
CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR, FPRounding_ODD);

Vpart[d, part] = result;
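
The difference between the FCVTXN and FCVTXN2 forms of the final register write can be sketched as follows for the vector variant. The function name write_vpart and the use of a Python integer as a 128-bit register image are assumptions made for this illustration. The behavior follows the description above: part 0 writes the lower half and clears the upper half, and part 1 writes the upper half and leaves the other bits unchanged.

```python
MASK64 = (1 << 64) - 1

def write_vpart(old_reg: int, part: int, result: int) -> int:
    """Return the new 128-bit register value after writing a 64-bit result."""
    result &= MASK64
    if part == 0:
        # FCVTXN: result goes to bits [63:0]; bits [127:64] are cleared.
        return result
    # FCVTXN2: result goes to bits [127:64]; bits [63:0] are preserved.
    return (result << 64) | (old_reg & MASK64)
```

This is why a full 4S narrowing is typically written as an FCVTXN followed by an FCVTXN2 into the same destination register, in that order.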
C7.2.77 FCVTZS (vector, fixed-point)


Floating-point Convert to Signed fixed-point, rounding toward Zero (vector). This instruction converts a scalar or
each element in a vector from floating-point to fixed-point signed integer using the Round towards Zero rounding
mode, and writes the result to the SIMD&FP destination register.


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped.


Scalar

| 31-30 | 29  | 28-23  | 22-19        | 18-16 | 15-11 | 10 | 9-5 | 4-0 |
| 01    | U=0 | 111110 | immh != 0000 | immb  | 11111 | 1  | Rn  | Rd  |

Scalar variant

FCVTZS <V><d>, <V><n>, #<fbits>

Decode for this encoding

integer d = UInt(Rd);
integer n = UInt(Rn);
if immh == '00xx' then ReservedValue();
integer esize = 32 << UInt(immh<3>);
integer datasize = esize;
integer elements = 1;

integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
FPRounding rounding = FPRounding_ZERO;

Vector

| 31 | 30 | 29  | 28-23  | 22-19        | 18-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 011110 | immh != 0000 | immb  | 11111 | 1  | Rn  | Rd  |

Vector variant

FCVTZS <Vd>.<T>, <Vn>.<T>, #<fbits>

Decode for this encoding

integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh == '00xx' then ReservedValue();
if immh<3>:Q == '10' then ReservedValue();
integer esize = 32 << UInt(immh<3>);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
FPRounding rounding = FPRounding_ZERO;


Assembler symbols 


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
    S when immh = 01xx
    D when immh = 1xxx
    The encoding immh = 00xx is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    2D when immh = 1xxx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The following encodings are reserved:
    ° immh = 0001, Q = x.
    ° immh = 001x, Q = x.
    ° immh = 1xxx, Q = 0.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<fbits> For the scalar variant: is the number of fractional bits, in the range 1 to the operand width, encoded
    in the "immh:immb" field. It can have the following values:
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    The encoding immh = 00xx is reserved.

    For the vector variant: is the number of fractional bits, in the range 1 to the element width, encoded
    in the "immh:immb" field. It can have the following values:
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
    The following encodings are reserved:
    ° immh = 0001.
    ° immh = 001x.

Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;
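
The relationship between immh:immb, the element size, and the number of fractional bits can be checked with a small Python sketch. The function names decode_fixed_point and fp_to_fixed_rz are hypothetical. The second function is only an approximation of FPToFixed for Round towards Zero: it scales by 2^fracbits, truncates, and saturates, and it assumes that a NaN input converts to 0. It does not model exception flags or FPCR controls.

```python
import math
from fractions import Fraction

def decode_fixed_point(immh: int, immb: int):
    """Return (esize, fracbits) from the 4-bit immh and 3-bit immb fields."""
    if immh >> 2 == 0:
        raise ValueError("immh = 00xx is not a valid encoding of this instruction")
    esize = 32 << (immh >> 3)                    # 32 << UInt(immh<3>)
    fracbits = esize * 2 - ((immh << 3) | immb)  # (esize * 2) - UInt(immh:immb)
    return esize, fracbits

def fp_to_fixed_rz(value: float, esize: int, fracbits: int, unsigned: bool = False) -> int:
    """Scale by 2**fracbits, round toward zero, then saturate to esize bits."""
    if math.isnan(value):
        return 0                                 # assumed: NaN converts to zero
    lo = 0 if unsigned else -(1 << (esize - 1))
    hi = (1 << esize) - 1 if unsigned else (1 << (esize - 1)) - 1
    if math.isinf(value):
        return hi if value > 0 else lo
    scaled = math.trunc(Fraction(value) * (1 << fracbits))  # exact arithmetic
    return min(max(scaled, lo), hi)
```

For instance, immh = 0111 and immb = 000 give a 32-bit element with 8 fractional bits, so 1.5 converts to 384 (that is, 1.5 * 256).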
C7.2.78 FCVTZS (vector, integer)


Floating-point Convert to Signed integer, rounding toward Zero (vector). This instruction converts a scalar or each
element in a vector from a floating-point value to a signed integer value using the Round towards Zero rounding
mode, and writes the result to the SIMD&FP destination register.

A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see
Floating-point exception traps on page D1-1552.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped.


Scalar 


| 31-30 | 29  | 28-24 | 23   | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 01    | U=0 | 11110 | o2=1 | sz | 10000 | 1101  | o1=1 | 10    | Rn  | Rd  |


Scalar variant 


FCVTZS <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;

FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1');


Vector 


| 31 | 30 | 29  | 28-24 | 23   | 22 | 21-17 | 16-13 | 12   | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=0 | 01110 | o2=1 | sz | 10000 | 1101  | o1=1 | 10    | Rn  | Rd  |


Vector variant 


FCVTZS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2);
boolean unsigned = (U == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;
C7.2.79 FCVTZS (scalar, fixed-point)
Floating-point Convert to Signed fixed-point, rounding toward Zero (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit fixed-point signed integer using the Round 
towards Zero rounding mode, and writes the result to the general-purpose destination register. 
A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-19    | 18-16      | 15-10 | 9-5 | 4-0 |
| sf | 0  | 0  | 11110 | type=0x | 0  | rmode=11 | opcode=000 | scale | Rn  | Rd  |
Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.
FCVTZS <Wd>, <Sn>, #<fbits> 
Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 
FCVTZS <Xd>, <Sn>, #<fbits> 
Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.
FCVTZS <Wd>, <Dn>, #<fbits> 
Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 
FCVTZS <Xd>, <Dn>, #<fbits> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
case type of
    when '00' fltsize = 32;
    when '01' fltsize = 64;
    when '1x' UnallocatedEncoding();
if sf == '0' && scale<5> == '0' then UnallocatedEncoding();
integer fracbits = 64 - UInt(scale);
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field.
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<fbits> For the double-precision to 32-bit and single-precision to 32-bit variant: is the number of bits after
    the binary point in the fixed-point destination, in the range 1 to 32, encoded as 64 minus "scale".

    For the double-precision to 64-bit and single-precision to 64-bit variant: is the number of bits after
    the binary point in the fixed-point destination, in the range 1 to 64, encoded as 64 minus "scale".


Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, fracbits, FALSE, FPCR, FPRounding_ZERO);
X[d] = intval;
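
The mapping between <fbits> and the "scale" field, including the check that makes a 32-bit destination with more than 32 fractional bits unallocated, can be sketched as below. The function names encode_scale and decode_scale are assumptions for this example; the arithmetic follows the decode pseudocode and the <fbits> description above.

```python
def encode_scale(sf: int, fbits: int) -> int:
    """Return the 6-bit scale field for a given number of fractional bits."""
    limit = 64 if sf == 1 else 32
    if not 1 <= fbits <= limit:
        raise ValueError(f"fbits must be in the range 1 to {limit}")
    return 64 - fbits

def decode_scale(sf: int, scale: int) -> int:
    """Return fracbits for a 6-bit scale field, or raise if unallocated."""
    if not 0 <= scale <= 63:
        raise ValueError("scale is a 6-bit field")
    if sf == 0 and (scale >> 5) & 1 == 0:
        # sf == '0' && scale<5> == '0' is an unallocated encoding
        raise ValueError("unallocated encoding")
    return 64 - scale
```

With sf = 0, the valid scale values are therefore 32 to 63, which correspond to 32 down to 1 fractional bits.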
C7.2.80 FCVTZS (scalar, integer)


Floating-point Convert to Signed integer, rounding toward Zero (scalar). This instruction converts the floating-point
value in the SIMD&FP source register to a 32-bit or 64-bit signed integer using the Round towards Zero rounding
mode, and writes the result to the general-purpose destination register.

A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see
Floating-point exception traps on page D1-1552.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


| 31 | 30 | 29 | 28-24 | 23-22   | 21 | 20-19    | 18-16      | 15-10  | 9-5 | 4-0 |
| sf | 0  | 0  | 11110 | type=0x | 1  | rmode=11 | opcode=000 | 000000 | Rn  | Rd  |


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00.


FCVTZS <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTZS <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01.


FCVTZS <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTZS <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();

rounding = FPDecodeRounding(rmode);
Assembler symbols


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

fltval = V[n];
intval = FPToFixed(fltval, 0, FALSE, FPCR, rounding);
X[d] = intval;
C7.2.81 FCVTZU (vector, fixed-point)


Floating-point Convert to Unsigned fixed-point, rounding toward Zero (vector). This instruction converts a scalar
or each element in a vector from floating-point to fixed-point unsigned integer using the Round towards Zero
rounding mode, and writes the result to the SIMD&FP destination register.

A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see
Floating-point exception traps on page D1-1552.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 


| 31-30 | 29  | 28-23  | 22-19        | 18-16 | 15-11 | 10 | 9-5 | 4-0 |
| 01    | U=1 | 111110 | immh != 0000 | immb  | 11111 | 1  | Rn  | Rd  |


Scalar variant 


FCVTZU <V><d>, <V><n>, #<fbits> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if immh == '00xx' then ReservedValue();


integer esize = 32 << UInt(immh<3>); 
integer datasize = esize; 
integer elements = 1; 


integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1'); 
FPRounding rounding = FPRounding_ZERO; 


Vector 


| 31 | 30 | 29  | 28-23  | 22-19        | 18-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 011110 | immh != 0000 | immb  | 11111 | 1  | Rn  | Rd  |


Vector variant 


FCVTZU <Vd>.<T>, <Vn>.<T>, #<fbits> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh == '00xx' then ReservedValue();

if immh<3>:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(immh<3>); 

integer datasize = if Q == '1' then 128 else 64; 

integer elements = datasize DIV esize; 
integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1'); 
FPRounding rounding = FPRounding_ZERO; 


Assembler symbols 


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
    S when immh = 01xx
    D when immh = 1xxx
    The encoding immh = 00xx is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    2D when immh = 1xxx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The following encodings are reserved:
    ° immh = 0001, Q = x.
    ° immh = 001x, Q = x.
    ° immh = 1xxx, Q = 0.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<fbits> For the scalar variant: is the number of fractional bits, in the range 1 to the operand width, encoded
    in the "immh:immb" field. It can have the following values:
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    The encoding immh = 00xx is reserved.

    For the vector variant: is the number of fractional bits, in the range 1 to the element width, encoded
    in the "immh:immb" field. It can have the following values:
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
    The following encodings are reserved:
    ° immh = 0001.
    ° immh = 001x.


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
bits(esize) element; 


for e = 0 to elements-1 
    element = Elem[operand, e, esize]; 
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding); 


V[d] = result; 
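

As an informal illustration of the vector-variant decode, the following Python sketch derives esize and fracbits from the immh, immb, and Q fields. The function name decode_fbits and the use of ValueError for reserved encodings are assumptions made for this example; they are not part of the architecture pseudocode.

```python
def decode_fbits(immh: int, immb: int, q: int) -> tuple[int, int]:
    """Return (esize, fracbits) for the vector variant (hypothetical helper)."""
    if not (0 <= immh < 16 and 0 <= immb < 8 and q in (0, 1)):
        raise ValueError("field out of range")
    if immh == 0b0000:
        raise ValueError("immh == '0000': see Advanced SIMD modified immediate")
    if immh & 0b1100 == 0:
        raise ValueError("immh == '00xx' is reserved")
    immh3 = immh >> 3                      # immh<3> selects 32-bit or 64-bit elements
    if immh3 == 1 and q == 0:
        raise ValueError("immh<3>:Q == '10' is reserved")
    esize = 32 << immh3
    # fracbits = (esize * 2) - UInt(immh:immb), where immh:immb is a 7-bit value
    fracbits = esize * 2 - ((immh << 3) | immb)
    return esize, fracbits
```

For example, decode_fbits(0b0100, 0b000, 0) yields 32-bit elements with 32 fractional bits, and decode_fbits(0b1111, 0b111, 1) yields 64-bit elements with 1 fractional bit.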
C7.2.82 FCVTZU (vector, integer) 


Floating-point Convert to Unsigned integer, rounding toward Zero (vector). This instruction converts a scalar or 
each element in a vector from a floating-point value to an unsigned integer value using the Round towards Zero 
rounding mode, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
|31 30|29|28      24|23|22|21      17|16    13|12|11 10|9    5|4    0| 
| 0  1| 1| 1 1 1 1 0| 1|sz| 1 0 0 0 0| 1 1 0 1| 1| 1  0|  Rn  |  Rd  | 
U = bit 29, o2 = bit 23, o1 = bit 12 


Scalar variant 


FCVTZU <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


FPRounding rounding = FPDecodeRounding(o1:o2); 
boolean unsigned = (U == '1'); 
Vector 
|31|30|29|28      24|23|22|21      17|16    13|12|11 10|9    5|4    0| 
| 0| Q| 1| 0 1 1 1 0| 1|sz| 1 0 0 0 0| 1 1 0 1| 1| 1  0|  Rn  |  Rd  | 
U = bit 29, o2 = bit 23, o1 = bit 12 


Vector variant 


FCVTZU <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


FPRounding rounding = FPDecodeRounding(o1:o2); 
boolean unsigned = (U == '1'); 
Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 
The encoding sz = 1, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
bits(esize) element; 


for e = 0 to elements-1 
    element = Elem[operand, e, esize]; 
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding); 


V[d] = result; 
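

The per-element conversion can be pictured with the following Python sketch of an unsigned, round-toward-zero conversion with zero fractional bits. It is a simplified model under stated assumptions: exceptions are untrapped, FPCR controls such as flush-to-zero are ignored, and the helper name fcvtzu_element and the returned flag strings are illustrative only. "IOC" and "IXC" stand for the Invalid Operation and Inexact cumulative flags.

```python
import math

def fcvtzu_element(value: float, intsize: int = 32) -> tuple[int, str]:
    """Convert one element to an unsigned integer, rounding toward zero.

    Returns (result, flag), where flag is "", "IXC" or "IOC" (illustrative labels).
    """
    max_val = (1 << intsize) - 1
    if math.isnan(value):
        return 0, "IOC"                    # NaN converts to zero, Invalid Operation
    if math.isinf(value):
        return (max_val if value > 0 else 0), "IOC"
    truncated = math.trunc(value)          # round toward zero
    if truncated < 0:
        return 0, "IOC"                    # saturate at the unsigned minimum
    if truncated > max_val:
        return max_val, "IOC"              # saturate at the unsigned maximum
    return truncated, ("" if truncated == value else "IXC")
```

Note that an input such as -0.5 truncates to zero, which is in range, so this model reports it as inexact rather than invalid.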
C7.2.83 FCVTZU (scalar, fixed-point) 


Floating-point Convert to Unsigned fixed-point, rounding toward Zero (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit fixed-point unsigned integer using the 
Round towards Zero rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23 22|21|20 19|18  16|15       10|9    5|4    0| 
|sf| 0| 0| 1 1 1 1 0| 0  x| 0| 1  1| 0 0 1|   scale   |  Rn  |  Rd  | 
type = bits 23:22, rmode = bits 20:19, opcode = bits 18:16 


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00. 


FCVTZU <Wd>, <Sn>, #<fbits> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTZU <Xd>, <Sn>, #<fbits> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01. 


FCVTZU <Wd>, <Dn>, #<fbits> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTZU <Xd>, <Dn>, #<fbits> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 


case type of 
    when '00' fltsize = 32; 
    when '01' fltsize = 64; 
    when '1x' UnallocatedEncoding(); 


if sf == '0' && scale<5> == '0' then UnallocatedEncoding(); 
integer fracbits = 64 - UInt(scale); 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<fbits> For the double-precision to 32-bit and single-precision to 32-bit variant: is the number of bits after 
the binary point in the fixed-point destination, in the range 1 to 32, encoded as 64 minus "scale". 


For the double-precision to 64-bit and single-precision to 64-bit variant: is the number of bits after 
the binary point in the fixed-point destination, in the range 1 to 64, encoded as 64 minus "scale". 


Operation 


CheckFPAdvSIMDEnabled64(); 


bits(fltsize) fltval; 
bits(intsize) intval; 


fltval = V[n]; 
intval = FPToFixed(fltval, fracbits, TRUE, FPCR, FPRounding_ZERO); 
X[d] = intval; 
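

The relationship between #<fbits>, the "scale" field, and the converted value can be sketched in Python as follows. The helper names encode_scale, decode_fracbits, and to_unsigned_fixed are hypothetical, and the conversion ignores floating-point exception reporting and FPCR controls; it is intended only to show the scaling by 2^fracbits, truncation, and unsigned saturation.

```python
import math
from fractions import Fraction

def encode_scale(fbits: int, sf: int) -> int:
    """Return the 6-bit scale field for #<fbits>, that is scale = 64 - fbits."""
    limit = 64 if sf == 1 else 32
    if not 1 <= fbits <= limit:
        raise ValueError("fbits out of range for this variant")
    return 64 - fbits

def decode_fracbits(scale: int, sf: int) -> int:
    """Mirror of the decode: sf == '0' && scale<5> == '0' is unallocated."""
    if sf == 0 and (scale >> 5) & 1 == 0:
        raise ValueError("unallocated encoding")
    return 64 - scale

def to_unsigned_fixed(value: float, fracbits: int, intsize: int) -> int:
    """Scale by 2**fracbits, round toward zero, saturate to intsize bits."""
    max_val = (1 << intsize) - 1
    if math.isnan(value):
        return 0
    if math.isinf(value):
        return max_val if value > 0 else 0
    # Fraction keeps the scaling exact and avoids floating-point overflow.
    scaled = math.trunc(Fraction(value) * (1 << fracbits))
    return min(max(scaled, 0), max_val)
```

For instance, with 8 fractional bits the value 1.5 converts to 384, which is 1.5 multiplied by 256.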
C7.2.84 FCVTZU (scalar, integer) 


Floating-point Convert to Unsigned integer, rounding toward Zero (scalar). This instruction converts the 
floating-point value in the SIMD&FP source register to a 32-bit or 64-bit unsigned integer using the Round towards 
Zero rounding mode, and writes the result to the general-purpose destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23 22|21|20 19|18  16|15        10|9    5|4    0| 
|sf| 0| 0| 1 1 1 1 0| 0  x| 1| 1  1| 0 0 1| 0 0 0 0 0 0|  Rn  |  Rd  | 
type = bits 23:22, rmode = bits 20:19, opcode = bits 18:16 


Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00. 


FCVTZU <Wd>, <Sn> 


Single-precision to 64-bit variant 
Applies when sf == 1 && type == 00. 


FCVTZU <Xd>, <Sn> 


Double-precision to 32-bit variant 
Applies when sf == 0 && type == 01. 


FCVTZU <Wd>, <Dn> 


Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01. 


FCVTZU <Xd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of 
    when '00' 
        fltsize = 32; 
    when '01' 
        fltsize = 64; 
    when '10' 
        UnallocatedEncoding(); 
    when '11' 
        UnallocatedEncoding(); 


rounding = FPDecodeRounding(rmode); 
Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 


bits(fltsize) fltval; 
bits(intsize) intval; 


fltval = V[n]; 
intval = FPToFixed(fltval, 0, TRUE, FPCR, rounding); 
X[d] = intval; 
C7.2.85 FDIV (vector) 


Floating-point Divide (vector). This instruction divides the floating-point values in the elements in the first source 
SIMD&FP register, by the floating-point values in the corresponding elements in the second source SIMD&FP 
register, places the results in a vector, and writes the vector to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23|22|21|20   16|15      11|10|9    5|4    0| 
| 0| Q| 1| 0 1 1 1 0| 0|sz| 1|   Rm  | 1 1 1 1 1| 1|  Rn  |  Rd  | 


Vector single-precision and double-precision variant 


FDIV <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 
The encoding sz = 1, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(esize) element1; 
bits(esize) element2; 


for e = 0 to elements-1 
    element1 = Elem[operand1, e, esize]; 
    element2 = Elem[operand2, e, esize]; 
    Elem[result, e, esize] = FPDiv(element1, element2, FPCR); 


V[d] = result; 
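

The sz:Q decode shared by the vector forms in this section determines the element size, the number of elements, and the arrangement specifier. A minimal Python sketch of that mapping follows; the function name decode_arrangement and the use of ValueError for the reserved encoding are assumptions made for illustration.

```python
def decode_arrangement(sz: int, q: int) -> tuple[str, int, int]:
    """Return (arrangement, esize, elements) for the sz:Q encoding."""
    if sz not in (0, 1) or q not in (0, 1):
        raise ValueError("sz and Q are single-bit fields")
    if sz == 1 and q == 0:
        raise ValueError("sz:Q == '10' is reserved")
    esize = 32 << sz                       # 32-bit or 64-bit elements
    datasize = 128 if q == 1 else 64       # full or half vector register
    elements = datasize // esize
    suffix = "S" if sz == 0 else "D"
    return f"{elements}{suffix}", esize, elements
```

The reserved case corresponds to a single 64-bit element in a 64-bit vector, which has no arrangement specifier.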
C7.2.86 FDIV (scalar) 


Floating-point Divide (scalar). This instruction divides the floating-point value of the first source SIMD&FP 
register by the floating-point value of the second source SIMD&FP register, and writes the result to the destination 
SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23 22|21|20   16|15    12|11 10|9    5|4    0| 
| 0| 0| 0| 1 1 1 1 0| 0  x| 1|   Rm  | 0 0 0 1| 1  0|  Rn  |  Rd  | 
type = bits 23:22 


Single-precision variant 
Applies when type == 00. 


FDIV <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01. 


FDIV <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
    when '00' datasize = 32; 
    when '01' datasize = 64; 
    when '1x' UnallocatedEncoding(); 


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) result; 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
result = FPDiv(operand1, operand2, FPCR); 


V[d] = result; 
C7.2.87 FMADD 
Floating-point fused Multiply-Add (scalar). This instruction multiplies the values of the first two SIMD&FP source 
registers, adds the product to the value of the third SIMD&FP source register, and writes the result to the SIMD&FP 
destination register. 
A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28      24|23 22|21|20   16|15|14   10|9    5|4    0| 
| 0| 0| 0| 1 1 1 1 1| 0  x| 0|   Rm  | 0|   Ra  |  Rn  |  Rd  | 
type = bits 23:22, o1 = bit 21 
Single-precision variant 
Applies when type == 00. 
FMADD <Sd>, <Sn>, <Sm>, <Sa> 
Double-precision variant 
Applies when type == 01. 
FMADD <Dd>, <Dn>, <Dm>, <Da> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer a = UInt(Ra); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize; 
case type of 
    when '00' datasize = 32; 
    when '01' datasize = 64; 
    when '1x' UnallocatedEncoding(); 
Assembler symbols 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register holding the multiplier, encoded in the 
"Rm" field. 
<Da> Is the 64-bit name of the third SIMD&FP source register holding the addend, encoded in the "Ra" 
field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register holding the multiplier, encoded in the 
"Rm" field. 

<Sa> Is the 32-bit name of the third SIMD&FP source register holding the addend, encoded in the "Ra" 
field. 

Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) result; 
bits(datasize) operanda = V[a]; 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 


result = FPMulAdd(operanda, operand1, operand2, FPCR); 


V[d] = result; 
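

The distinguishing property of a fused multiply-add is that the product is not rounded before the addition. The Python sketch below models this for finite double-precision inputs by computing the exact rational result and rounding once. It is an illustration only: the function name fused_multiply_add is hypothetical, and the model does not cover infinities, NaNs, the sign of a zero result, FPCR rounding modes, or exception flags.

```python
from fractions import Fraction

def fused_multiply_add(addend: float, op1: float, op2: float) -> float:
    """Return addend + op1 * op2 with a single rounding (finite inputs only)."""
    exact = Fraction(addend) + Fraction(op1) * Fraction(op2)
    # Converting the exact rational back to float rounds once, to nearest.
    return float(exact)

# The separate multiply-then-add sequence loses the low-order product bits.
a = 1.0 + 2.0**-30
c = -(1.0 + 2.0**-29)
unfused = a * a + c                      # the product is rounded before the add
fused = fused_multiply_add(c, a, a)      # the product is kept exact
```

In this example the exact product is 1 + 2^-29 + 2^-60, so the unfused sequence produces 0.0 while the fused form retains the 2^-60 term.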
C7.2.88 FMAX (vector) 


Floating-point Maximum (vector). This instruction compares corresponding vector elements in the two source 
SIMD&FP registers, places the larger of each of the two floating-point values into a vector, and writes the vector to 
the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23|22|21|20   16|15      11|10|9    5|4    0| 
| 0| Q| 0| 0 1 1 1 0| 0|sz| 1|   Rm  | 1 1 1 1 0| 1|  Rn  |  Rd  | 
U = bit 29, o1 = bit 23 


Vector single-precision and double-precision variant 


FMAX <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 
The encoding sz = 1, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 


for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMin(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMax(element1, element2, FPCR); 


V[d] = result; 
C7.2.89 FMAX (scalar) 


Floating-point Maximum (scalar). This instruction compares the two source SIMD&FP registers, and writes the 
larger of the two floating-point values to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23 22|21|20   16|15 14|13 12|11 10|9    5|4    0| 
| 0| 0| 0| 1 1 1 1 0| 0  x| 1|   Rm  | 0  1| 0  0| 1  0|  Rn  |  Rd  | 
type = bits 23:22, op = bits 13:12 


Single-precision variant 
Applies when type == 00. 


FMAX <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01. 


FMAX <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
    when '00' datasize = 32; 
    when '01' datasize = 64; 
    when '1x' UnallocatedEncoding(); 


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) result; 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
result = FPMax(operand1, operand2, FPCR); 
V[d] = result; 





C7-936 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




C7.2.90 FMAXNM (vector) 


Floating-point Maximum Number (vector). This instruction compares corresponding vector elements in the two 
source SIMD&FP registers, writes the larger of the two floating-point values into a vector, and writes the vector to 
the destination SIMD&FP register. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result placed in the vector is the numerical value, otherwise the result is identical to FMAX (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23|22|21|20   16|15      11|10|9    5|4    0| 
| 0| Q| 0| 0 1 1 1 0| 0|sz| 1|   Rm  | 1 1 0 0 0| 1|  Rn  |  Rd  | 
U = bit 29, o1 = bit 23 


Vector single-precision and double-precision variant 


FMAXNM <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 
The encoding sz = 1, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 


for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR); 


V[d] = result; 
C7.2.91 FMAXNM (scalar) 


Floating-point Maximum Number (scalar). This instruction compares the first and second source SIMD&FP 
register values, and writes the larger of the two floating-point values to the destination SIMD&FP register. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result that is placed in the vector is the numerical value, otherwise the result is identical to FMAX (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28      24|23 22|21|20   16|15 14|13 12|11 10|9    5|4    0| 
| 0| 0| 0| 1 1 1 1 0| 0  x| 1|   Rm  | 0  1| 1  0| 1  0|  Rn  |  Rd  | 
type = bits 23:22 


Single-precision variant 
Applies when type == 00. 


FMAXNM <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01. 


FMAXNM <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
    when '00' datasize = 32; 
    when '01' datasize = 64; 
    when '1x' UnallocatedEncoding(); 


Assembler symbols 





<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) result; 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 


result = FPMaxNum(operand1, operand2, FPCR); 
V[d] = result; 
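

The difference between the FMAX and FMAXNM treatment of NaN operands can be sketched in Python as follows. This is a simplified model, not the architecture's FPMax or FPMaxNum pseudocode: Python floats cannot distinguish signaling from quiet NaNs, so every NaN is treated as quiet, and exception flags, NaN payload propagation, and FPCR settings are ignored. The names fp_max and fp_max_num are illustrative.

```python
import math

def fp_max(a: float, b: float) -> float:
    """FMAX-style maximum: any NaN operand yields NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == 0.0 and b == 0.0:
        # +0.0 is treated as larger than -0.0.
        return a if math.copysign(1.0, a) > 0 else b
    return a if a > b else b

def fp_max_num(a: float, b: float) -> float:
    """FMAXNM-style maximum: a lone quiet NaN gives way to the numeric operand."""
    if math.isnan(a) and not math.isnan(b):
        return b
    if math.isnan(b) and not math.isnan(a):
        return a
    return fp_max(a, b)
```

With one numeric operand and one NaN, fp_max returns NaN whereas fp_max_num returns the number; with two NaNs or two numbers the two functions agree.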
C7.2.92 FMAXNMP (scalar) 


Floating-point Maximum Number of Pair of elements (scalar). This instruction compares two vector elements in the 
source SIMD&FP register and writes the largest of the floating-point values as a scalar to the destination SIMD&FP 
register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Single-precision and double-precision variant 


FMAXNMP <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize * 2; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is the source arrangement specifier, encoded in the "sz" field. It can have the following values: 
2S when sz = 0 
2D when sz = 1 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMAXNUM, operand, esize); 
C7.2.93 FMAXNMP (vector) 


Floating-point Maximum Number Pairwise (vector). This instruction creates a vector by concatenating the vector 
elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, 
reads each pair of adjacent vector elements in the two source SIMD&FP registers, writes the largest of each pair of 
values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are 
floating-point values. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result is the numerical value, otherwise the result is identical to FMAX (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant 


FMAXNMP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 
The encoding sz = 1, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 


for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR); 


V[d] = result; 
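

The pairwise element selection can be illustrated with a short Python sketch. Each vector is modeled as a list whose index 0 is the least significant element, so the bit concatenation operand2:operand1 becomes operand1 followed by operand2. The function name pairwise_max is hypothetical, and Python's built-in max stands in for FPMaxNum, which is an assumption that holds only for non-NaN inputs.

```python
def pairwise_max(operand1: list[float], operand2: list[float]) -> list[float]:
    """Return the maximum of each adjacent pair across operand2:operand1."""
    if len(operand1) != len(operand2):
        raise ValueError("operands must have the same number of elements")
    # Element 0 of the concatenation is element 0 of operand1.
    concat = operand1 + operand2
    return [max(concat[2 * e], concat[2 * e + 1]) for e in range(len(operand1))]
```

The lower half of the result therefore comes from pairs in the first source register and the upper half from pairs in the second source register.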
C7.2.94 FMAXNMV 


Floating-point Maximum Number across Vector. This instruction compares all the vector elements in the source 
SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the 
values in this instruction are floating-point values. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result of the comparison is the numerical value, otherwise the result is identical to FMAX (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Single-precision and double-precision variant 


FMAXNMV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q != '01' then ReservedValue(); // .4S only 


integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
The encoding sz = 1 is reserved. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values: 
4S when Q = 1, sz = 0 
The following encodings are reserved: 
• Q = 0, sz = x. 
• Q = 1, sz = 1. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMAXNUM, operand, esize); 
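

The across-vector reduction can be pictured as a recursive halving of the operand. The Python sketch below assumes that Reduce combines the result of the lower half with the result of the upper half at each step; that ordering, the function name reduce_tree, and the use of Python's built-in max in place of FPMaxNum are assumptions made for illustration, and the Reduce() definition in the shared pseudocode remains the authoritative description.

```python
from typing import Callable, Sequence

def reduce_tree(op: Callable, elements: Sequence):
    """Reduce a power-of-two number of elements by recursive halving."""
    n = len(elements)
    if n == 0 or n & (n - 1):
        raise ValueError("element count must be a power of two")
    if n == 1:
        return elements[0]
    half = n // 2
    lo = reduce_tree(op, elements[:half])   # lower-numbered elements
    hi = reduce_tree(op, elements[half:])   # higher-numbered elements
    return op(lo, hi)
```

For a 4S operand this performs two first-level comparisons followed by one final comparison, for example reduce_tree(max, [1.0, 7.5, -3.0, 2.0]).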
C7.2.95 FMAXP (scalar) 


Floating-point Maximum of Pair of elements (scalar). This instruction compares two vector elements in the source 
SIMD&FP register and writes the largest of the floating-point values as a scalar to the destination SIMD&FP 
register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Single-precision and double-precision variant 


FMAXP <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize * 2; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is the source arrangement specifier, encoded in the "sz" field. It can have the following values: 
2S when sz = 0 
2D when sz = 1 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMAX, operand, esize); 

C7.2.96 FMAXP (vector) 


Floating-point Maximum Pairwise (vector). This instruction creates a vector by concatenating the vector elements 
of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each 
pair of adjacent vector elements from the concatenated vector, writes the larger of each pair of values into a vector, 
and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point 
values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant 


FMAXP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 


The encoding sz = 1, Q = 0 is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 

for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMin(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMax(element1, element2, FPCR); 


V[d] = result; 
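
As an illustration of the pairwise case (pair is TRUE, minimum is FALSE), the following Python sketch models the lanes as lists of floats. It is a simplified model under stated assumptions: the helper names fp_max and fmaxp_vector are hypothetical, any NaN input simply produces a NaN result, and FPCR, FPSR, and signed zeros are not modelled.

```python
import math


def fp_max(a: float, b: float) -> float:
    """Maximum that propagates NaN (assumption: all NaNs are treated alike)."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return max(a, b)


def fmaxp_vector(vn, vm):
    """Model of FMAXP Vd.T, Vn.T, Vm.T on equal-length lists of lanes.

    Lane 0 is the least significant element. In operand2:operand1 the
    first source (vn) occupies the low elements, so it comes first here.
    """
    if len(vn) != len(vm):
        raise ValueError("both sources must use the same arrangement")
    concat = list(vn) + list(vm)
    return [fp_max(concat[2 * e], concat[2 * e + 1]) for e in range(len(vn))]


print(fmaxp_vector([1.0, 4.0, 2.0, 3.0], [8.0, 7.0, 5.0, 6.0]))
```

The lower half of the result comes from adjacent pairs of the first source register and the upper half from adjacent pairs of the second.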
C7.2.97 FMAXV 


Floating-point Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP 
register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this 
instruction are floating-point values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

Single-precision and double-precision variant 


FMAXV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q != '01' then ReservedValue(); 


integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
The encoding sz = 1 is reserved. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values: 
4S when Q = 1, sz = 0 
The following encodings are reserved: 
• Q = 0, sz = x. 
• Q = 1, sz = 1. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMAX, operand, esize); 

C7.2.98 FMIN (vector) 


Floating-point minimum (vector). This instruction compares corresponding elements in the vectors in the two 
source SIMD&FP registers, places the smaller of each of the two floating-point values into a vector, and writes the 
vector to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant 


FMIN <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 


The encoding sz = 1, Q = 0 is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 

for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMin(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMax(element1, element2, FPCR); 


V[d] = result; 
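
The decode and element access above can be sketched in Python as follows. The sketch models a SIMD&FP register as a Python integer holding the register image and covers only the element-wise case (pair is FALSE, minimum is TRUE). The function names are hypothetical, and the NaN handling is a simplification that does not model FPCR, FPSR, or signed zeros.

```python
import math
import struct

# Arrangement table keyed by (sz, Q); (1, 0) is the reserved encoding.
ARRANGEMENTS = {(0, 0): "2S", (0, 1): "4S", (1, 1): "2D"}


def decode_arrangement(sz: int, q: int):
    """Return (arrangement, esize, datasize, elements) for the sz and Q fields."""
    if (sz, q) not in ARRANGEMENTS:
        raise ValueError("sz = 1, Q = 0 is reserved")
    esize = 32 << sz
    datasize = 128 if q == 1 else 64
    return ARRANGEMENTS[(sz, q)], esize, datasize, datasize // esize


def elem(reg: int, e: int, esize: int) -> float:
    """Read element e (esize bits) of a register image as a float."""
    raw = (reg >> (e * esize)) & ((1 << esize) - 1)
    fmt = "<f" if esize == 32 else "<d"
    return struct.unpack(fmt, raw.to_bytes(esize // 8, "little"))[0]


def pack_lanes(values, esize: int) -> int:
    """Build a register image from a list of floats, lane 0 in the low bits."""
    fmt = "<f" if esize == 32 else "<d"
    reg = 0
    for e, value in enumerate(values):
        reg |= int.from_bytes(struct.pack(fmt, value), "little") << (e * esize)
    return reg


def fp_min(a: float, b: float) -> float:
    """Minimum that propagates NaN (simplified; FPCR is not modelled)."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return min(a, b)


def fmin_vector(vn: int, vm: int, sz: int, q: int) -> int:
    """Element-wise model of FMIN Vd.T, Vn.T, Vm.T on register images."""
    _, esize, _, elements = decode_arrangement(sz, q)
    lanes = [fp_min(elem(vn, e, esize), elem(vm, e, esize)) for e in range(elements)]
    return pack_lanes(lanes, esize)


vn = pack_lanes([1.0, -2.0, 3.0, 4.0], 32)
vm = pack_lanes([0.5, 5.0, 3.5, -4.0], 32)
result = fmin_vector(vn, vm, sz=0, q=1)
print([elem(result, e, 32) for e in range(4)])
```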
C7.2.99 FMIN (scalar) 


Floating-point Minimum (scalar). This instruction compares the first and second source SIMD&FP register values, 
and writes the smaller of the two floating-point values to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Single-precision variant 
Applies when type == 00. 


FMIN <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01. 


FMIN <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
    when '00' datasize = 32; 
    when '01' datasize = 64; 
    when '1x' UnallocatedEncoding(); 


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) result; 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 

result = FPMin(operand1, operand2, FPCR); 
V[d] = result; 

C7.2.100 FMINNM (vector) 


Floating-point Minimum Number (vector). This instruction compares corresponding vector elements in the two 
source SIMD&FP registers, writes the smaller of the two floating-point values into a vector, and writes the vector 
to the destination SIMD&FP register. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result placed in the vector is the numerical value, otherwise the result is identical to FMIN (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant 


FMINNM <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 


The encoding sz = 1, Q = 0 is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 

for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR); 


V[d] = result; 
C7.2.101 FMINNM (scalar) 


Floating-point Minimum Number (scalar). This instruction compares the first and second source SIMD&FP register 
values, and writes the smaller of the two floating-point values to the destination SIMD&FP register. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result that is placed in the vector is the numerical value, otherwise the result is identical to FMIN (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Single-precision variant 
Applies when type == 00. 


FMINNM <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01. 


FMINNM <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of 
    when '00' datasize = 32; 
    when '01' datasize = 64; 
    when '1x' UnallocatedEncoding(); 


Assembler symbols 





<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) result; 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 


result = FPMinNum(operand1, operand2, FPCR); 


V[d] = result; 
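
The difference between the FMIN and FMINNM treatment of NaNs, as described above, can be sketched in Python. This is a simplified model, not the architectural definition of FPMin() or FPMinNum(): the function names are hypothetical, every NaN is treated as a quiet NaN, and FPCR, FPSR, signaling NaNs, and signed zeros are not modelled.

```python
import math


def fp_min(a: float, b: float) -> float:
    """FMIN-style minimum: any NaN operand gives a NaN result."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return min(a, b)


def fp_min_num(a: float, b: float) -> float:
    """FMINNM-style minimum: one quiet NaN and one number gives the number.

    When both operands are NaN, or neither is, the result matches fp_min.
    """
    if math.isnan(a) and not math.isnan(b):
        return b
    if math.isnan(b) and not math.isnan(a):
        return a
    return fp_min(a, b)


print(fp_min(math.nan, 2.0), fp_min_num(math.nan, 2.0))
```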
C7.2.102 FMINNMP (scalar) 


Floating-point Minimum Number of Pair of elements (scalar). This instruction compares two vector elements in the 
source SIMD&FP register and writes the smallest of the floating-point values as a scalar to the destination 
SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Single-precision and double-precision variant 


FMINNMP <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize * 2; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is the source arrangement specifier, encoded in the "sz" field. It can have the following values: 
2S when sz = 0 
2D when sz = 1 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMINNUM, operand, esize); 

C7.2.103 FMINNMP (vector) 


Floating-point Minimum Number Pairwise (vector). This instruction creates a vector by concatenating the vector 
elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, 
reads each pair of adjacent vector elements in the two source SIMD&FP registers, writes the smallest of each pair 
of floating-point values into a vector, and writes the vector to the destination SIMD&FP register. All the values in 
this instruction are floating-point values. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result is the numerical value, otherwise the result is identical to FMIN (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant 


FMINNMP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 


The encoding sz = 1, Q = 0 is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 

for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR); 


V[d] = result; 
C7.2.104 FMINNMV 


Floating-point Minimum Number across Vector. This instruction compares all the vector elements in the source 
SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the 
values in this instruction are floating-point values. 


NaNs are handled according to the IEEE 754-2008 standard. If one vector element is numeric and the other is a quiet 
NaN, the result of the comparison is the numerical value, otherwise the result is identical to FMIN (scalar). 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

Single-precision and double-precision variant 


FMINNMV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q != '01' then ReservedValue(); // .4S only 


integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 


S when sz = 0 


The encoding sz = 1 is reserved. 


<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values: 
4S when Q = 1, sz = 0 


The following encodings are reserved: 
• Q = 0, sz = x. 
• Q = 1, sz = 1. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMINNUM, operand, esize); 

C7.2.105 FMINP (scalar) 


Floating-point Minimum of Pair of elements (scalar). This instruction compares two vector elements in the source 
SIMD&FP register and writes the smallest of the floating-point values as a scalar to the destination SIMD&FP 
register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Single-precision and double-precision variant 


FMINP <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 32 << UInt(sz); 
integer datasize = esize * 2; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is the source arrangement specifier, encoded in the "sz" field. It can have the following values: 
2S when sz = 0 
2D when sz = 1 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMIN, operand, esize); 

C7.2.106 FMINP (vector) 


Floating-point Minimum Pairwise (vector). This instruction creates a vector by concatenating the vector elements 
of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each 
pair of adjacent vector elements from the concatenated vector, writes the smaller of each pair of values into a vector, 
and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point 
values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant 


FMINP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean pair = (U == '1'); 
boolean minimum = (o1 == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values: 
2S when sz = 0, Q = 0 
4S when sz = 0, Q = 1 
2D when sz = 1, Q = 1 


The encoding sz = 1, Q = 0 is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
bits(2*datasize) concat = operand2:operand1; 
bits(esize) element1; 
bits(esize) element2; 

for e = 0 to elements-1 
    if pair then 
        element1 = Elem[concat, 2*e, esize]; 
        element2 = Elem[concat, (2*e)+1, esize]; 
    else 
        element1 = Elem[operand1, e, esize]; 
        element2 = Elem[operand2, e, esize]; 

    if minimum then 
        Elem[result, e, esize] = FPMin(element1, element2, FPCR); 
    else 
        Elem[result, e, esize] = FPMax(element1, element2, FPCR); 


V[d] = result; 
C7.2.107 FMINV 


Floating-point Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP 
register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this 
instruction are floating-point values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

Single-precision and double-precision variant 


FMINV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q != '01' then ReservedValue(); 


integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
The encoding sz = 1 is reserved. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values: 
4S when Q = 1, sz = 0 
The following encodings are reserved: 
• Q = 0, sz = x. 
• Q = 1, sz = 1. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
V[d] = Reduce(ReduceOp_FMIN, operand, esize); 


C7.2.108 FMLA (by element) 


Floating-point fused Multiply-Add to accumulator (by element). This instruction multiplies the vector elements in 
the first source SIMD&FP register by the specified value in the second source SIMD&FP register, and accumulates 
the results in the vector elements of the destination SIMD&FP register. All the values in this instruction are 
floating-point values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 
Scalar variant 


FMLA <V><d>, <V><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of 
    when '0x' index = UInt(H:L); 
    when '10' index = UInt(H); 
    when '11' UnallocatedEncoding(); 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 

boolean sub_op = (o2 == '1'); 


Vector 
Vector variant 


FMLA <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of 
    when '0x' index = UInt(H:L); 
    when '10' index = UInt(H); 
    when '11' UnallocatedEncoding(); 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean sub_op = (o2 == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values: 
2S when Q = 0, sz = 0 
4S when Q = 1, sz = 0 
2D when Q = 1, sz = 1 


The encoding Q = 0, sz = 1 is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields. 
<Ts> Is an element size specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0 
D when sz = 1 
<index> Is the element index, encoded in the "sz:L:H" field. It can have the following values: 
H:L when sz = 0, L = x 
H when sz = 1, L = 0 


The encoding sz = 1, L = 1 is reserved. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(idxdsize) operand2 = V[m]; 
bits(datasize) operand3 = V[d]; 
bits(datasize) result; 
bits(esize) element1; 
bits(esize) element2 = Elem[operand2, index, esize]; 

for e = 0 to elements-1 
    element1 = Elem[operand1, e, esize]; 
    if sub_op then element1 = FPNeg(element1); 
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR); 
V[d] = result; 
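
The index decode and the fused accumulate above can be sketched in Python. The sketch is limited by the following assumptions: it models only double-precision lanes with finite values, it uses exact rational arithmetic followed by a single rounding to stand in for FPMulAdd(), and it does not model FPCR, FPSR, NaNs, infinities, or the sign of an exact zero result. The names decode_index, fused_mul_add, and fmla_by_element are hypothetical.

```python
from fractions import Fraction


def decode_index(sz: int, l: int, h: int) -> int:
    """Element index from the sz, L, and H fields, following the decode above."""
    if sz == 0:
        return (h << 1) | l  # index = UInt(H:L)
    if l == 0:
        return h             # index = UInt(H)
    raise ValueError("sz = 1, L = 1 is unallocated")


def fused_mul_add(acc: float, a: float, b: float) -> float:
    """acc + a*b computed exactly, then rounded once to double precision.

    Assumption: all inputs are finite doubles.
    """
    return float(Fraction(acc) + Fraction(a) * Fraction(b))


def fmla_by_element(vd, vn, vm, index: int):
    """Model of FMLA Vd.2D, Vn.2D, Vm.D[index] on lists of floats."""
    scalar = vm[index]
    return [fused_mul_add(acc, x, scalar) for acc, x in zip(vd, vn)]


print(fmla_by_element([1.0, 2.0], [3.0, 4.0], [10.0, 0.5], decode_index(1, 0, 1)))

# A fused operation rounds once, so it can differ from a separate multiply and add.
a, b = 1.0 + 2.0 ** -30, 1.0 - 2.0 ** -30
print(fused_mul_add(-1.0, a, b), a * b + -1.0)
```

The last line shows why the operation is described as fused: the exact product is slightly below 1, which a separate multiply rounds to 1.0 before the addition, whereas the single rounding preserves the small negative difference.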
C7.2.109 FMLA (vector) 


Floating-point fused Multiply-Add to accumulator (vector). This instruction multiplies corresponding 
floating-point values in the vectors in the two source SIMD&FP registers, adds the product to the corresponding 
vector element of the destination SIMD&FP register, and writes the result to the destination SIMD&FP register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Vector single-precision and double-precision variant


FMLA <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean sub_op = (op == '1'); 
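
As a rough illustration of the decode above, the following Python sketch maps the sz and Q fields to the arrangement specifier, element size, and element count. The function name decode_arrangement is hypothetical, and the fields are assumed to be passed as the integers 0 or 1.

```python
def decode_arrangement(sz, q):
    """Return (arrangement, esize, elements) for the sz and Q fields."""
    if sz == 1 and q == 0:
        # sz:Q == '10' is the reserved combination
        raise ValueError("reserved encoding: sz = 1, Q = 0")
    esize = 32 << sz                    # 32 or 64 bits per element
    datasize = 128 if q == 1 else 64    # full or half register
    elements = datasize // esize
    suffix = "S" if esize == 32 else "D"
    return f"{elements}{suffix}", esize, elements
```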


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;
C7.2.110 FMLS (by element)


Floating-point fused Multiply-Subtract from accumulator (by element). This instruction multiplies the vector 
elements in the first source SIMD&FP register by the specified value in the second source SIMD&FP register, and 
subtracts the results from the vector elements of the destination SIMD&FP register. All the values in this instruction 
are floating-point values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23|22|21|20|19  16|15|14|13 12|11|10|9    5|4    0|
| 0| 1| 0| 1  1  1  1  1| 1|sz| L| M|  Rm  | 0| 1| 0  1| H| 0|  Rn  |  Rd  |
                                               o2


Scalar variant 


FMLS <V><d>, <V><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of
    when '0x' index = UInt(H:L);
    when '10' index = UInt(H);
    when '11' UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 

boolean sub_op = (o2 == '1');
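
The index and register-number decode for the by-element forms can be sketched in Python as follows. This is an illustrative model only; the function name decode_by_element is hypothetical, and each field is assumed to be passed as a small non-negative integer (sz, L, H, M as 0 or 1, Rm as 0 to 15).

```python
def decode_by_element(sz, L, H, M, Rm):
    """Return (index, m, idxdsize) following the decode pseudocode."""
    if sz == 0:
        index = (H << 1) | L          # UInt(H:L), single-precision lanes 0..3
    elif L == 0:
        index = H                     # UInt(H), double-precision lanes 0..1
    else:
        raise ValueError("unallocated encoding: sz = 1, L = 1")
    m = (M << 4) | Rm                 # UInt(Rmhi:Rm), with Rmhi = M
    idxdsize = 128 if H == 1 else 64  # width read from the indexed register
    return index, m, idxdsize
```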


Vector 


|31|30|29|28 27 26 25 24|23|22|21|20|19  16|15|14|13 12|11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  1| 1|sz| L| M|  Rm  | 0| 1| 0  1| H| 0|  Rn  |  Rd  |
                                               o2


Vector variant 


FMLS <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of
    when '0x' index = UInt(H:L);
    when '10' index = UInt(H);
    when '11' UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean sub_op = (o2 == '1');


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values:
    2S when Q = 0, sz = 0
    4S when Q = 1, sz = 0
    2D when Q = 1, sz = 1

The encoding Q = 0, sz = 1 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields.
<Ts> Is an element size specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<index> Is the element index, encoded in the "sz:L:H" field. It can have the following values:
    H:L when sz = 0, L = x
    H when sz = 1, L = 0

The encoding sz = 1, L = 1 is reserved.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;
C7.2.111 FMLS (vector)


Floating-point fused Multiply-Subtract from accumulator (vector). This instruction multiplies corresponding 
floating-point values in the vectors in the two source SIMD&FP registers, negates the product, adds the result to the 
corresponding vector element of the destination SIMD&FP register, and writes the result to the destination 
SIMD&FP register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23|22|21|20  16|15 14 13 12 11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  0| 1|sz| 1|  Rm  | 1  1  0  0  1| 1|  Rn  |  Rd  |
                         op


Vector single-precision and double-precision variant 


FMLS <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean sub_op = (op == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;
C7.2.112 FMOV (vector, immediate)


Floating-point move immediate (vector). This instruction copies an immediate floating-point constant into every 
element of the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24 23 22 21 20 19|18|17|16|15 14 13 12|11|10|9|8|7|6|5|4    0|
| 0| Q|op| 0  1  1  1  1  0  0  0  0  0| a| b| c| 1  1  1  1| 0| 1|d|e|f|g|h|  Rd  |
                                                    cmode


Single-precision variant 

Applies when op == 0. 

FMOV <Vd>.<T>, #<imm> 

Double-precision variant 

Applies when Q == 1 && op == 1. 

FMOV <Vd>.2D, #<imm> 

Decode for all variants of this encoding 

integer rd = UInt(Rd);

integer datasize = if Q == '1' then 128 else 64;
bits(datasize) imm;
bits(64) imm64;

if cmode:op == '11111' then
    // FMOV Dn,#imm is in main FP instruction set
    if Q == '0' then UnallocatedEncoding();

imm64 = AdvSIMDExpandImm(op, cmode, a:b:c:d:e:f:g:h);
imm = Replicate(imm64, datasize DIV 64);


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
    2S when Q = 0
    4S when Q = 1
<imm> Is a signed floating-point constant with 3-bit exponent and normalized 4 bits of precision, encoded
in "a:b:c:d:e:f:g:h". For details of the range of constants available and the encoding of <imm>, see
Modified immediate constants in A64 floating-point instructions on page C2-138.
Operation

CheckFPAdvSIMDEnabled64();
V[rd] = imm;
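
The following Python sketch illustrates how the 8-bit immediate might be expanded and replicated for this instruction. It is a model under stated assumptions, not the architecture pseudocode: it assumes that, for cmode == '1111', AdvSIMDExpandImm builds the bit patterns shown in the comments, and the function name fmov_vector_imm is hypothetical. Registers are modeled as Python integers.

```python
def fmov_vector_imm(imm8, op, q):
    """Return the datasize-bit constant written by FMOV (vector, immediate)."""
    if op == 1 and q == 0:
        # cmode:op == '11111' with Q == '0' is unallocated in this encoding
        raise ValueError("unallocated encoding: op = 1, Q = 0")
    sign = (imm8 >> 7) & 1
    b6 = (imm8 >> 6) & 1
    low6 = imm8 & 0x3F
    if op == 0:
        # Assumed layout: imm8<7> : NOT(imm8<6>) : Replicate(imm8<6>, 5)
        #                 : imm8<5:0> : Zeros(19), then replicated twice
        imm32 = (sign << 31) | ((b6 ^ 1) << 30) | ((0x1F * b6) << 25) | (low6 << 19)
        imm64 = (imm32 << 32) | imm32
    else:
        # Assumed layout: imm8<7> : NOT(imm8<6>) : Replicate(imm8<6>, 8)
        #                 : imm8<5:0> : Zeros(48)
        imm64 = (sign << 63) | ((b6 ^ 1) << 62) | ((0xFF * b6) << 54) | (low6 << 48)
    datasize = 128 if q == 1 else 64
    # Replicate(imm64, datasize DIV 64)
    return imm64 if datasize == 64 else (imm64 << 64) | imm64
```

With imm8 = 0x70, the single-precision form yields lanes holding 0x3F800000, which is the IEEE 754 encoding of 1.0.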
C7.2.113 FMOV (register)


Floating-point Move register without conversion. This instruction copies the floating-point value in the SIMD&FP 
source register to the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20 19 18 17|16 15|14 13 12 11 10|9    5|4    0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1| 0  0  0  0| 0  0| 1  0  0  0  0|  Rn  |  Rd  |
                         type                 opc


Single-precision variant 
Applies when type == 00. 


FMOV <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FMOV <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize; 
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

Assembler symbols 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 


<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n]; 


V[d] = operand; 
C7.2.114 FMOV (general)
Floating-point Move to or from general-purpose register without conversion. This instruction transfers the contents 
of a SIMD&FP register to a general-purpose register, or the contents of a general-purpose register to a SIMD&FP 
register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20 19|18 17 16|15 14 13 12 11 10|9    5|4    0|
|sf| 0| 0| 1  1  1  1  0| type| 1| 0  x| 1  1  x| 0  0  0  0  0  0|  Rn  |  Rd  |
                                  rmode opcode
32-bit to single-precision variant 
Applies when sf == 0 && type == 00 && rmode == 00 && opcode == 111.
FMOV <Sd>, <Wn> 
Single-precision to 32-bit variant 
Applies when sf == 0 && type == 00 && rmode == 00 && opcode == 110. 
FMOV <Wd>, <Sn> 
64-bit to double-precision variant 
Applies when sf == 1 && type == 01 && rmode == 00 && opcode == 111.
FMOV <Dd>, <Xn> 
64-bit to top half of 128-bit variant 
Applies when sf == 1 && type == 10 && rmode == 01 && opcode == 111.
FMOV <Vd>.D[1], <Xn> 
Double-precision to 64-bit variant 
Applies when sf == 1 && type == 01 && rmode == 00 && opcode == 110. 
FMOV <Xd>, <Dn> 
Top half of 128-bit to 64-bit variant 
Applies when sf == 1 && type == 10 && rmode == 01 && opcode == 110. 
FMOV <Xd>, <Vn>.D[1] 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPConvOp op; 
FPRounding rounding; 
boolean unsigned; 
integer part; 
case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        if opcode<2:1>:rmode != '11 01' then UnallocatedEncoding();
        fltsize = 128;
    when '11'
        UnallocatedEncoding();

case opcode<2:1>:rmode of
    when '00 xx'        // FCVT[NPMZ][US]
        rounding = FPDecodeRounding(rmode);
        unsigned = (opcode<0> == '1');
        op = FPConvOp_CVT_FtoI;
    when '01 00'        // [US]CVTF
        rounding = FPRoundingMode(FPCR);
        unsigned = (opcode<0> == '1');
        op = FPConvOp_CVT_ItoF;
    when '10 00'        // FCVTA[US]
        rounding = FPRounding_TIEAWAY;
        unsigned = (opcode<0> == '1');
        op = FPConvOp_CVT_FtoI;
    when '11 00'        // FMOV
        if fltsize != intsize then UnallocatedEncoding();
        op = if opcode<0> == '1' then FPConvOp_MOV_ItoF else FPConvOp_MOV_FtoI;
        part = 0;
    when '11 01'        // FMOV D[1]
        if intsize != 64 || fltsize != 128 then UnallocatedEncoding();
        op = if opcode<0> == '1' then FPConvOp_MOV_ItoF else FPConvOp_MOV_FtoI;
        part = 1;
    otherwise
        UnallocatedEncoding();


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 

<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

case op of
    when FPConvOp_CVT_FtoI
        fltval = V[n];
        intval = FPToFixed(fltval, 0, unsigned, FPCR, rounding);
        X[d] = intval;
    when FPConvOp_CVT_ItoF
        intval = X[n];
        fltval = FixedToFP(intval, 0, unsigned, FPCR, rounding);
        V[d] = fltval;
    when FPConvOp_MOV_FtoI
        fltval = Vpart[n, part];
        intval = ZeroExtend(fltval, intsize);
        X[d] = intval;
    when FPConvOp_MOV_ItoF
        intval = X[n];
        fltval = intval<fltsize-1:0>;
        Vpart[d, part] = fltval;
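
Because FMOV (general) transfers bits without conversion, its effect can be illustrated with bit-pattern reinterpretation. The Python sketch below is a rough model only; all function names are hypothetical, a 128-bit SIMD&FP register is modeled as a Python integer, and the D[1] write is assumed to leave the lower 64 bits of the register unchanged.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def fmov_w_from_s(value):
    """Model FMOV <Wd>, <Sn>: the 32-bit pattern of a single-precision value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def fmov_s_from_w(bits):
    """Model FMOV <Sd>, <Wn>: reinterpret a 32-bit pattern as single precision."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def fmov_x_from_v_d1(vreg):
    """Model FMOV <Xd>, <Vn>.D[1]: bits 127:64 of the 128-bit register."""
    return (vreg >> 64) & MASK64


def fmov_v_d1_from_x(vreg, x):
    """Model FMOV <Vd>.D[1], <Xn>: replace bits 127:64, keep bits 63:0."""
    return ((x & MASK64) << 64) | (vreg & MASK64)
```

No rounding occurs in any of these transfers, which is why the MOV cases in the pseudocode do not use the rounding or unsigned variables.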
C7.2.115 FMOV (scalar, immediate)


Floating-point move immediate (scalar). This instruction copies a floating-point immediate constant into the 
SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20        13|12 11 10| 9  8  7  6  5|4    0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1|    imm8    | 1  0  0| 0  0  0  0  0|  Rd  |
                         type


Single-precision variant 
Applies when type == 00. 


FMOV <Sd>, #<imm> 


Double-precision variant 
Applies when type == 01.


FMOV <Dd>, #<imm> 


Decode for all variants of this encoding 
integer d = UInt(Rd); 


integer datasize; 
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

bits(datasize) imm = VFPExpandImm(imm8);


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<imm> Is a signed floating-point constant with 3-bit exponent and normalized 4 bits of precision, encoded
in the "imm8" field. For details of the range of constants available and the encoding of <imm>, see
Modified immediate constants in A64 floating-point instructions on page C2-138.
Operation 
CheckFPAdvSIMDEnabled64();


V[d] = imm; 
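
To make the 8-bit immediate format concrete, the following Python sketch expands imm8 into a 32-bit or 64-bit floating-point bit pattern. It is a model that assumes VFPExpandImm forms sign : exponent : fraction as described in the comments; the names vfp_expand_imm and imm8_to_float are hypothetical.

```python
import struct


def vfp_expand_imm(imm8, n):
    """Expand imm8 into an n-bit floating-point pattern (n = 32 or 64)."""
    e = 8 if n == 32 else 11          # exponent width
    f = n - e - 1                     # fraction width
    sign = (imm8 >> 7) & 1
    b6 = (imm8 >> 6) & 1
    # Assumed: exp = NOT(imm8<6>) : Replicate(imm8<6>, e-3) : imm8<5:4>
    exp = ((b6 ^ 1) << (e - 1)) | ((((1 << (e - 3)) - 1) * b6) << 2) | ((imm8 >> 4) & 0x3)
    # Assumed: frac = imm8<3:0> : Zeros(f-4)
    frac = (imm8 & 0xF) << (f - 4)
    return (sign << (n - 1)) | (exp << f) | frac


def imm8_to_float(imm8):
    """Decode imm8 to the double-precision value it represents."""
    return struct.unpack("<d", struct.pack("<Q", vfp_expand_imm(imm8, 64)))[0]
```

Under these assumptions, imm8 = 0x70 decodes to 1.0 and imm8 = 0x00 decodes to 2.0, and every representable constant has a magnitude between 0.125 and 31.0.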
C7.2.116 FMSUB


Floating-point Fused Multiply-Subtract (scalar). This instruction multiplies the values of the first two SIMD&FP 
source registers, negates the product, adds that to the value of the third SIMD&FP source register, and writes the 
result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20  16|15|14  10|9    5|4    0|
| 0| 0| 0| 1  1  1  1  1| 0  x| 0|  Rm  | 1|  Ra  |  Rn  |  Rd  |
                         type  o1


Single-precision variant 
Applies when type == 00. 


FMSUB <Sd>, <Sn>, <Sm>, <Sa> 


Double-precision variant 
Applies when type == 01.


FMSUB <Dd>, <Dn>, <Dm>, <Da> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer a = UInt(Ra); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 





<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register holding the multiplier, encoded in the 
"Rm" field. 

<Da> Is the 64-bit name of the third SIMD&FP source register holding the minuend, encoded in the "Ra" 
field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register holding the multiplier, encoded in the
"Rm" field. 

<Sa> Is the 32-bit name of the third SIMD&FP source register holding the minuend, encoded in the "Ra" 
field. 

Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

operand1 = FPNeg(operand1);
result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;
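
The practical effect of fusing the multiply and the subtract is that the product is not rounded before the accumulation. The Python sketch below illustrates this for finite double-precision values only; the function name fmsub is hypothetical, and exact rational arithmetic stands in for the unrounded intermediate product. Signed zeros, infinities, NaNs, and FPCR settings are not modeled.

```python
from fractions import Fraction


def fmsub(n, m, a):
    """Model FMSUB for finite doubles: a - n*m, rounded once."""
    return float(Fraction(a) - Fraction(n) * Fraction(m))


x = 1.0 + 2.0**-30
y = 1.0 - 2.0**-30

# The exact product x*y is 1 - 2**-60, which is not representable as a double.
fused = fmsub(x, y, 1.0)        # subtraction sees the exact product
separate = 1.0 - (x * y)        # product is rounded to 1.0 first
```

Here fused keeps the small residual 2**-60, whereas separate loses it because the intermediate product rounds to 1.0.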





C7-980 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


C7.2.117 FMUL (by element)


Floating-point Multiply (by element). This instruction multiplies the vector elements in the first source SIMD&FP 
register by the specified value in the second source SIMD&FP register, places the results in a vector, and writes the 
vector to the destination SIMD&FP register. All the values in this instruction are floating-point values. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23|22|21|20|19  16|15 14 13 12|11|10|9    5|4    0|
| 0| 1| 0| 1  1  1  1  1| 1|sz| L| M|  Rm  | 1  0  0  1| H| 0|  Rn  |  Rd  |
        U


Scalar variant 


FMUL <V><d>, <V><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of
    when '0x' index = UInt(H:L);
    when '10' index = UInt(H);
    when '11' UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 


integer elements = 1; 

boolean mulx_op = (U == '1'); 
Vector 

|31|30|29|28 27 26 25 24|23|22|21|20|19  16|15 14 13 12|11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  1| 1|sz| L| M|  Rm  | 1  0  0  1| H| 0|  Rn  |  Rd  |
        U


Vector variant 


FMUL <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of
    when '0x' index = UInt(H:L);
    when '10' index = UInt(H);
    when '11' UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean mulx_op = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values:
    2S when Q = 0, sz = 0
    4S when Q = 1, sz = 0
    2D when Q = 1, sz = 1

The encoding Q = 0, sz = 1 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields.
<Ts> Is an element size specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<index> Is the element index, encoded in the "sz:L:H" field. It can have the following values:
    H:L when sz = 0, L = x
    H when sz = 1, L = 0

The encoding sz = 1, L = 1 is reserved.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;
C7.2.118 FMUL (vector)


Floating-point Multiply (vector). This instruction multiplies corresponding floating-point values in the vectors in
the two source SIMD&FP registers, places the result in a vector, and writes the vector to the destination SIMD&FP
register.


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23|22|21|20  16|15 14 13 12 11|10|9    5|4    0|
| 0| Q| 1| 0  1  1  1  0| 0|sz| 1|  Rm  | 1  1  0  1  1| 1|  Rn  |  Rd  |


Vector single-precision and double-precision variant 


FMUL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if sz:Q == '10' then ReservedValue();
integer esize = 32 << UInt(sz);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;
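
The lane-by-lane structure of this operation can be sketched in Python with registers modeled as integers. This is an illustrative model only: the helper names elem, bits_to_float, float_to_bits, and fmul_vector are hypothetical, rounding is assumed to be round-to-nearest, and results that overflow single precision are not handled (struct.pack raises OverflowError in that case). For 32-bit lanes, the product of two single-precision values is exact in double precision, so packing it back to 32 bits rounds only once.

```python
import struct

# (float format, unsigned integer format) for each element size
_FMT = {32: ("<f", "<I"), 64: ("<d", "<Q")}


def elem(vector, e, esize):
    """Read accessor modeled on Elem[vector, e, esize]."""
    return (vector >> (e * esize)) & ((1 << esize) - 1)


def bits_to_float(bits, esize):
    ffmt, ifmt = _FMT[esize]
    return struct.unpack(ffmt, struct.pack(ifmt, bits))[0]


def float_to_bits(value, esize):
    ffmt, ifmt = _FMT[esize]
    return struct.unpack(ifmt, struct.pack(ffmt, value))[0]


def fmul_vector(vn, vm, esize, datasize):
    """Multiply corresponding lanes of vn and vm; return the packed result."""
    result = 0
    for e in range(datasize // esize):
        element1 = bits_to_float(elem(vn, e, esize), esize)
        element2 = bits_to_float(elem(vm, e, esize), esize)
        product = float_to_bits(element1 * element2, esize)
        result |= product << (e * esize)   # Elem[result, e, esize] = product
    return result
```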
C7.2.119 FMUL (scalar)


Floating-point Multiply (scalar). This instruction multiplies the floating-point values of the two source SIMD&FP 
registers, and writes the result to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20  16|15|14 13 12|11 10|9    5|4    0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1|  Rm  | 0| 0  0  0| 1  0|  Rn  |  Rd  |
                         type            op


Single-precision variant 
Applies when type == 00. 


FMUL <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01.


FMUL <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

result = FPMul(operand1, operand2, FPCR);

V[d] = result;
C7.2.120 FMULX (by element)


Floating-point Multiply extended (by element). This instruction multiplies the floating-point values in the vector 
elements in the first source SIMD&FP register by the specified floating-point value in the second source SIMD&FP 
register, places the results in a vector, and writes the vector to the destination SIMD&FP register. 


Before each multiplication, a check is performed for whether one value is infinite and the other is zero. In this case,
if only one of the values is negative, the result is -2.0, otherwise the result is 2.0.


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31|30|29|28 27 26 25 24|23|22|21|20|19  16|15 14 13 12|11|10|9    5|4    0|
| 0| 1| 1| 1  1  1  1  1| 1|sz| L| M|  Rm  | 1  0  0  1| H| 0|  Rn  |  Rd  |
        U


Scalar variant 


FMULX <V><d>, <V><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi = M; 
case sz:L of
    when '0x' index = UInt(H:L);
    when '10' index = UInt(H);
    when '11' UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


boolean mulx_op = (U == '1'); 
Vector 
|31|30|29|28 27 26 25 24|23|22|21|20|19  16|15 14 13 12|11|10|9    5|4    0|
| 0| Q| 1| 0  1  1  1  1| 1|sz| L| M|  Rm  | 1  0  0  1| H| 0|  Rn  |  Rd  |
        U


Vector variant 


FMULX <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi = M;
case sz:L of
    when '0x' index = UInt(H:L);
    when '10' index = UInt(H);
    when '11' UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean mulx_op = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values:
    2S when Q = 0, sz = 0
    4S when Q = 1, sz = 0
    2D when Q = 1, sz = 1

The encoding Q = 0, sz = 1 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields.
<Ts> Is an element size specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<index> Is the element index, encoded in the "sz:L:H" field. It can have the following values:
    H:L when sz = 0, L = x
    H when sz = 1, L = 0

The encoding sz = 1, L = 1 is reserved.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;
C7.2.121 FMULX


Floating-point Multiply extended. This instruction multiplies corresponding floating-point values in the vectors of 
the two source SIMD&FP registers, places the resulting floating-point values in a vector, and writes the vector to 
the destination SIMD&FP register. 


If one value is zero and the other value is infinite, the result is 2.0. In this case, the result is negative if only one of 
the values is negative, otherwise the result is positive. 
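
The special case above can be sketched in Python for a single pair of values. This is an illustrative model only: the function name fmulx is hypothetical, ordinary multiplication stands in for FPMulX in every other case, and FPCR settings, NaN processing, and floating-point exceptions are not modelled.

```python
import math

def fmulx(a: float, b: float) -> float:
    """Sketch of FMULX for one pair of values (hypothetical helper)."""
    zero_times_inf = (a == 0.0 and math.isinf(b)) or (math.isinf(a) and b == 0.0)
    if zero_times_inf:
        # The result is negative only when exactly one operand is negative.
        a_negative = math.copysign(1.0, a) < 0
        b_negative = math.copysign(1.0, b) < 0
        return -2.0 if a_negative != b_negative else 2.0
    # Assumption: every other case behaves like an ordinary multiply.
    return a * b
```

An ordinary multiply would return a NaN for zero times infinity, which is the only case where this sketch differs from a * b.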


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28 27 26 25 24|23|22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0  1| 0| 1  1  1  1  0| 0|sz| 1|   Rm   | 1  1  0  1  1| 1|  Rn   |  Rd   |


Scalar variant 


FMULX <V><d>, <V><n>, <V><m> 


Decode for this encoding 
integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 


Vector 


|31|30|29|28 27 26 25 24|23|22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| 0|sz| 1|   Rm   | 1  1  0  1  1| 1|  Rn   |  Rd   |


Vector variant 


FMULX <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
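
The vector decode above can be mirrored in a few lines of Python. This is a minimal sketch: the function name decode_vector_fp and the tuple it returns are hypothetical, and sz and Q are passed as plain integers rather than as bitstrings.

```python
def decode_vector_fp(sz: int, q: int):
    """Return (arrangement, esize, datasize, elements) for the sz and Q bits."""
    if (sz, q) == (1, 0):
        # Corresponds to: if sz:Q == '10' then ReservedValue();
        raise ValueError("reserved encoding: sz = 1, Q = 0")
    esize = 32 << sz                      # 32-bit or 64-bit elements
    datasize = 128 if q == 1 else 64      # full or half vector register
    elements = datasize // esize
    arrangement = f"{elements}{'S' if sz == 0 else 'D'}"
    return arrangement, esize, datasize, elements
```

For example, decode_vector_fp(0, 1) describes four 32-bit elements in a 128-bit vector, which is the 4S arrangement.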


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.



Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMulX(element1, element2, FPCR);

V[d] = result;

C7.2.122 FNEG (vector)


Floating-point Negate (vector). This instruction negates the value of each vector element in the source SIMD&FP 
register, writes the result to a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| 1|sz| 1  0  0  0  0| 0  1  1  1  1| 1  0|  Rn   |  Rd   |


Vector single-precision and double-precision variant 


FNEG <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean neg = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;
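
The element loop above can be modelled in Python by treating the register as an integer. This is a sketch under assumptions: the name fneg_vector is hypothetical, only the negate path is shown, and FPNeg is assumed to do nothing more than invert the sign bit of each element.

```python
import struct

def fneg_vector(operand: int, datasize: int, esize: int) -> int:
    """Flip the sign bit of every esize-bit element of a datasize-bit value."""
    result = 0
    mask = (1 << esize) - 1
    for e in range(datasize // esize):
        element = (operand >> (e * esize)) & mask   # Elem[operand, e, esize]
        element ^= 1 << (esize - 1)                 # assumed effect of FPNeg
        result |= element << (e * esize)            # Elem[result, e, esize]
    return result

# Usage: negate a 4S vector. Element 0 occupies the least significant bits.
packed = int.from_bytes(struct.pack("<4f", 1.0, -2.0, 0.0, 0.5), "little")
negated = fneg_vector(packed, 128, 32)
values = struct.unpack("<4f", negated.to_bytes(16, "little"))
```

Because only the sign bit changes, a zero element becomes a zero of the opposite sign and no rounding takes place.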

C7.2.123 FNEG (scalar)


Floating-point Negate (scalar). This instruction negates the value in the SIMD&FP source register and writes the 
result to the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20 19 18 17|16 15|14 13 12 11 10|9     5|4     0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1| 0  0  0  0| 1  0| 1  0  0  0  0|  Rn   |  Rd   |
                         type                 opc


Single-precision variant 
Applies when type == 00. 


FNEG <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FNEG <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand = V[n];

result = FPNeg(operand);

V[d] = result;

C7.2.124 FNMADD

Floating-point Negated fused Multiply-Add (scalar). This instruction multiplies the values of the first two 
SIMD&FP source registers, negates the product, subtracts the value of the third SIMD&FP source register, and 
writes the result to the destination SIMD&FP register. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20    16|15|14    10|9     5|4     0|
| 0| 0| 0| 1  1  1  1  1| 0  x| 1|   Rm   | 0|   Ra   |  Rn   |  Rd   |
                         type  o1          o0
Single-precision variant 
Applies when type == 00. 
FNMADD <Sd>, <Sn>, <Sm>, <Sa> 
Double-precision variant 
Applies when type == 01. 
FNMADD <Dd>, <Dn>, <Dm>, <Da> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer a = UInt(Ra); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();
Assembler symbols 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register holding the multiplier, encoded in the 
"Rm" field. 
<Da> Is the 64-bit name of the third SIMD&FP source register holding the addend, encoded in the "Ra" 
field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register holding the multiplier, encoded in the
"Rm" field. 

<Sa> Is the 32-bit name of the third SIMD&FP source register holding the addend, encoded in the "Ra" 
field. 

Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

operanda = FPNeg(operanda);
operand1 = FPNeg(operand1);
result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;
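
The operand handling in the pseudocode above can be followed step by step in Python. This sketch shows only which operands are negated: the name fnmadd is hypothetical, and a separate multiply and add stand in for FPMulAdd, so unlike the real fused operation the result here can be rounded twice.

```python
def fnmadd(n: float, m: float, a: float) -> float:
    """Mirror the operand negation in the pseudocode: (-a) + (-n) * m."""
    neg_a = -a   # operanda = FPNeg(operanda)
    neg_n = -n   # operand1 = FPNeg(operand1)
    # Assumption: unfused arithmetic is used here in place of FPMulAdd.
    return neg_a + neg_n * m
```

With n = 2.0, m = 3.0, and a = 1.0 the product 6.0 is negated and 1.0 is subtracted, giving -7.0.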

C7.2.125 FNMSUB


Floating-point Negated fused Multiply-Subtract (scalar). This instruction multiplies the values of the first two 
SIMD&FP source registers, subtracts the value of the third SIMD&FP source register, and writes the result to the 
destination SIMD&FP register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15|14    10|9     5|4     0|
| 0| 0| 0| 1  1  1  1  1| 0  x| 1|   Rm   | 1|   Ra   |  Rn   |  Rd   |
                         type  o1          o0


Single-precision variant 
Applies when type == 00. 


FNMSUB <Sd>, <Sn>, <Sm>, <Sa> 


Double-precision variant 
Applies when type == 01. 


FNMSUB <Dd>, <Dn>, <Dm>, <Da> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer a = UInt(Ra); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 





<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register holding the multiplier, encoded in the 
"Rm" field. 

<Da> Is the 64-bit name of the third SIMD&FP source register holding the minuend, encoded in the "Ra" 
field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register holding the multiplicand, encoded in the 
"Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register holding the multiplier, encoded in the
"Rm" field. 

<Sa> Is the 32-bit name of the third SIMD&FP source register holding the minuend, encoded in the "Ra" 
field. 

Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

operanda = FPNeg(operanda);
result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;
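
Because the multiply and the subtraction are fused, the result is rounded once rather than twice. The following Python sketch illustrates the difference for double-precision values. It is a model under assumptions: the name fnmsub_fused is hypothetical, exact rational arithmetic stands in for FPMulAdd, only finite inputs are supported, rounding is always to nearest with ties to even, and the sign of an exactly zero result is not modelled.

```python
from fractions import Fraction

def fnmsub_fused(n: float, m: float, a: float) -> float:
    """Compute n * m - a exactly, then round once to double precision."""
    exact = Fraction(n) * Fraction(m) - Fraction(a)
    return float(exact)   # single, correctly rounded conversion

n = 1.0 + 2.0**-30
m = 1.0 - 2.0**-30
unfused = n * m - 1.0              # the product is rounded to 1.0 first
fused = fnmsub_fused(n, m, 1.0)    # keeps the -2**-60 term
```

Here the exact product is 1 - 2^-60, which an unfused multiply rounds to 1.0 before the subtraction, so the small term is lost.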

C7.2.126 FNMUL (scalar)

Floating-point Multiply-Negate (scalar). This instruction multiplies the floating-point values of the two source 
SIMD&FP registers, and writes the negation of the result to the destination SIMD&FP register. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20    16|15|14 13 12|11 10|9     5|4     0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1|   Rm   | 1| 0  0  0| 1  0|  Rn   |  Rd   |
                         type              op
Single-precision variant 
Applies when type == 00. 
FNMUL <Sd>, <Sn>, <Sm> 
Double-precision variant 
Applies when type == 01.
FNMUL <Dd>, <Dn>, <Dm> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();
Assembler symbols 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

result = FPMul(operand1, operand2, FPCR);
result = FPNeg(result);

V[d] = result;

C7.2.127 FRECPE

Floating-point Reciprocal Estimate. This instruction finds an approximate reciprocal estimate for each vector 
element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination 
SIMD&FP register. 
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
|31 30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0  1| 0| 1  1  1  1  0| 1|sz| 1  0  0  0  0| 1  1  1  0  1| 1  0|  Rn   |  Rd   |
Scalar variant 
FRECPE <V><d>, <V><n> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer esize = 32 << UInt(sz); 
integer datasize = esize; 
integer elements = 1; 
Vector 
|31|30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| 1|sz| 1  0  0  0  0| 1  1  1  0  1| 1  0|  Rn   |  Rd   |
Vector variant 
FRECPE <Vd>.<T>, <Vn>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if sz:Q == '10' then ReservedValue(); 
integer esize = 32 << UInt(sz); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
Assembler symbols 
<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

C7.2.128 FRECPS


Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of 
the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point 
values in a vector, and writes the vector to the destination SIMD&FP register. 
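
The value 2.0 - a * b is the correction factor of a Newton-Raphson iteration for a reciprocal: if x approximates 1/d, then x * (2.0 - d * x) is a better approximation. That this is how the instruction is typically used is an assumption of the following Python sketch, not a statement from this section. The function names are hypothetical, the arithmetic is unfused, and special values are not modelled.

```python
def frecps(a: float, b: float) -> float:
    """Return 2.0 - a * b, the value described above for one element."""
    return 2.0 - a * b

def refine_reciprocal(d: float, estimate: float, steps: int) -> float:
    """Refine an estimate of 1/d with repeated reciprocal steps."""
    x = estimate
    for _ in range(steps):
        x = x * frecps(d, x)   # Newton-Raphson update for f(x) = 1/x - d
    return x

# Usage: start from a rough estimate of 1/3 and refine it.
approx = refine_reciprocal(3.0, 0.3, 4)
```

Each step roughly squares the relative error, so a coarse starting estimate converges in a few iterations.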


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28 27 26 25 24|23|22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0  1| 0| 1  1  1  1  0| 0|sz| 1|   Rm   | 1  1  1  1  1| 1|  Rn   |  Rd   |


Scalar variant 


FRECPS <V><d>, <V><n>, <V><m> 


Decode for this encoding 
integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 32 << UInt(sz); 


integer datasize = esize; 
integer elements = 1; 


Vector 


|31|30|29|28 27 26 25 24|23|22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| 0|sz| 1|   Rm   | 1  1  1  1  1| 1|  Rn   |  Rd   |


Vector variant 


FRECPS <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


Assembler symbols 





<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

C7.2.129 FRECPX


Floating-point Reciprocal exponent (scalar). This instruction finds an approximate reciprocal exponent for each 
vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination 
SIMD&FP register.


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0  1| 0| 1  1  1  1  0| 1|sz| 1  0  0  0  0| 1  1  1  1  1| 1  0|  Rn   |  Rd   |


Scalar single-precision and double-precision variant 


FRECPX <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;


Assembler symbols 


<V> Is a width specifier, encoded in the "sz" field. It can have the following values: 
S when sz = 0
D when sz = 1 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecpX(element, FPCR);

V[d] = result;
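
The FPRecpX function is not reproduced in this section. The following Python sketch for single-precision values therefore rests on an assumption about its behavior: that the sign is kept, the fraction is cleared, and the exponent field is bitwise inverted, with zeros and denormals mapped to the largest normal exponent. The name frecpx_single is hypothetical, NaN processing is not modelled, and the input is assumed to be representable in single precision.

```python
import struct

def frecpx_single(x: float) -> float:
    """Assumed model of the reciprocal exponent of a single-precision value."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits & 0x80000000
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x007FFFFF
    if exp == 0xFF and frac != 0:
        return x                  # NaN: real NaN processing is not modelled
    if exp == 0:
        new_exp = 0xFE            # zero or denormal: largest normal exponent
    else:
        new_exp = ~exp & 0xFF     # infinity or normal: inverted exponent
    return struct.unpack("<f", struct.pack("<I", sign | (new_exp << 23)))[0]
```

Under this assumption the result is always a power of two or zero, so multiplying by it only rescales the exponent of another value.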







C7.2.130 FRINTA (vector) 


Floating-point Round to Integral, to nearest with ties to Away (vector). This instruction rounds a vector of 
floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the 
Round to Nearest with Ties to Away rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13|12|11 10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| 0|sz| 1  0  0  0  0| 1  1  0  0| 0| 1  0|  Rn   |  Rd   |
       U                 o2                               o1


Vector single-precision and double-precision variant 


FRINTA <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean exact = FALSE;
FPRounding rounding;
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);
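
The case statement above selects the rounding behavior from three encoding bits. A Python sketch of the same selection follows. The function name decode_frint_rounding and the string labels are hypothetical, and the two-bit mapping used for FPDecodeRounding (00 ties to even, 01 plus infinity, 10 minus infinity, 11 zero) is an assumption, because that function is not defined in this section.

```python
def decode_frint_rounding(u: int, o1: int, o2: int):
    """Return (rounding, exact) for the U:o1:o2 field values."""
    # Assumed FPDecodeRounding mapping, indexed by o1:o2.
    fp_decode_rounding = {0b00: "TIEEVEN", 0b01: "POSINF",
                          0b10: "NEGINF", 0b11: "ZERO"}
    key = (o1 << 1) | o2
    if u == 0:                       # when '0xx'
        return fp_decode_rounding[key], False
    if key == 0b00:                  # when '100'
        return "TIEAWAY", False
    if key == 0b01:                  # when '101'
        raise ValueError("unallocated encoding")
    if key == 0b10:                  # when '110'
        return "FPCR", True          # current rounding mode, exact = TRUE
    return "FPCR", False             # when '111'
```

For FRINTA the selected mode is ties to away, so decode_frint_rounding(1, 0, 0) returns ("TIEAWAY", False).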


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

C7.2.131 FRINTA (scalar)


Floating-point Round to Integral, to nearest with ties to Away (scalar). This instruction rounds a floating-point value 
in the SIMD&FP source register to an integral floating-point value of the same size using the Round to Nearest with 
Ties to Away rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20 19 18|17 16 15|14 13 12 11 10|9     5|4     0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1| 0  0  1| 1  0  0| 1  0  0  0  0|  Rn   |  Rd   |
                         type              rmode


Single-precision variant 
Applies when type == 00. 


FRINTA <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FRINTA <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand = V[n];

result = FPRoundInt(operand, FPCR, FPRounding_TIEAWAY, FALSE);

V[d] = result;
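
Rounding to nearest with ties away from zero can be sketched in Python for double-precision values as follows. The name frinta is hypothetical, and NaN processing and floating-point exceptions are not modelled.

```python
import math

def frinta(x: float) -> float:
    """Round to an integral value, with ties going away from zero."""
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x                       # NaN, infinity, and signed zero pass through
    magnitude = abs(x)
    whole = math.floor(magnitude)
    if magnitude - whole >= 0.5:       # the fractional part is computed exactly
        whole += 1
    return math.copysign(float(whole), x)   # a zero result keeps the input sign
```

Working on the magnitude avoids the double rounding that adding 0.5 before truncating can introduce, and copysign gives a result of -0.0 for inputs such as -0.4.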

C7.2.132 FRINTI (vector)


Floating-point Round to Integral, using current rounding mode (vector). This instruction rounds a vector of 
floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the 
rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13|12|11 10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| 1|sz| 1  0  0  0  0| 1  1  0  0| 1| 1  0|  Rn   |  Rd   |
       U                 o2                               o1


Vector single-precision and double-precision variant 


FRINTI <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean exact = FALSE;
FPRounding rounding;
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

C7.2.133 FRINTI (scalar)


Floating-point Round to Integral, using current rounding mode (scalar). This instruction rounds a floating-point 
value in the SIMD&FP source register to an integral floating-point value of the same size using the rounding mode 
that is determined by the FPCR, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20 19 18|17 16 15|14 13 12 11 10|9     5|4     0|
| 0| 0| 0| 1  1  1  1  0| 0  x| 1| 0  0  1| 1  1  1| 1  0  0  0  0|  Rn   |  Rd   |
                         type              rmode


Single-precision variant 
Applies when type == 00. 


FRINTI <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FRINTI <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

FPRounding rounding;
rounding = FPRoundingMode(FPCR);


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand = V[n];

result = FPRoundInt(operand, FPCR, rounding, FALSE);

V[d] = result;
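
Python cannot read FPCR, so the following sketch takes the rounding mode as an explicit argument in place of FPRoundingMode(FPCR). The name fp_round_int and the mode labels are hypothetical stand-ins for FPRoundInt and the FPRounding values, and NaN processing and floating-point exceptions are not modelled.

```python
import math

def fp_round_int(x: float, mode: str) -> float:
    """Round x to an integral value using an explicitly supplied mode."""
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x                      # passed through with the same sign
    if mode == "TIEEVEN":
        r = round(x)                  # Python rounds ties to even
    elif mode == "POSINF":
        r = math.ceil(x)
    elif mode == "NEGINF":
        r = math.floor(x)
    elif mode == "ZERO":
        r = math.trunc(x)
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return math.copysign(float(r), x)  # a zero result keeps the input sign
```

The same input can therefore give different results depending on the mode: 2.5 becomes 2.0 under ties to even and 3.0 when rounding toward plus infinity.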

C7.2.134 FRINTM (vector)


Floating-point Round to Integral, toward Minus infinity (vector). This instruction rounds a vector of floating-point 
values in the SIMD&FP source register to integral floating-point values of the same size using the Round towards 
Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13|12|11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| 0|sz| 1  0  0  0  0| 1  1  0  0| 1| 1  0|  Rn   |  Rd   |
       U                 o2                               o1


Vector single-precision and double-precision variant 


FRINTM <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if sz:Q == '10' then ReservedValue();
integer esize = 32 << UInt(sz);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean exact = FALSE;
FPRounding rounding;
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1
    The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
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Operation 


CheckFPAdvSIMDEnab1ed64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
bits(esize) element; 


for e = 0 to elements-1 
element = Elem[operand, e, esize]; 
Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact); 


V[d] = result; 
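
Note

The following Python sketch is an informal illustration and is not part of the architecture pseudocode. It mirrors the case statement on U:o1:o2 in the decode for this encoding. The names FPRounding, FROM_FPCR, and decode_frint_selector are assumptions made for this sketch. The two-bit mapping for the '0xx' case is inferred from the FPDecodeRounding() arguments used by the scalar FRINTN, FRINTP, FRINTM, and FRINTZ decodes.

```python
from enum import Enum


class FPRounding(Enum):
    TIEEVEN = 0    # Round to Nearest, ties to even (o1:o2 == '00')
    POSINF = 1     # Round towards Plus Infinity   (o1:o2 == '01')
    NEGINF = 2     # Round towards Minus Infinity  (o1:o2 == '10')
    ZERO = 3       # Round towards Zero            (o1:o2 == '11')
    TIEAWAY = 4    # Round to Nearest, ties away from zero
    FROM_FPCR = 5  # Stand-in for FPRoundingMode(FPCR) in this sketch


def decode_frint_selector(u: int, o1: int, o2: int):
    """Return (rounding, exact) for the 3-bit selector U:o1:o2."""
    selector = (u << 2) | (o1 << 1) | o2
    if selector < 0b100:
        # '0xx': the low two bits select a fixed rounding mode.
        return FPRounding(selector), False
    if selector == 0b100:
        return FPRounding.TIEAWAY, False
    if selector == 0b101:
        raise ValueError("unallocated encoding")
    # '110' uses the FPCR mode with exact = TRUE, '111' with exact = FALSE.
    return FPRounding.FROM_FPCR, selector == 0b110
```

For FRINTM (vector), U:o1:o2 is '010', so the sketch returns the Round towards Minus Infinity mode with exact set to FALSE.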
C7.2.135 FRINTM (scalar)


Floating-point Round to Integral, toward Minus infinity (scalar). This instruction rounds a floating-point value in 
the SIMD&FP source register to an integral floating-point value of the same size using the Round towards Minus 
Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-18 | 17-15 | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 001   | 010   | 10000 | Rn  | Rd  |
|    |    |    |       |       |    |       | rmode |       |     |     |


Single-precision variant 
Applies when type == 00. 


FRINTM <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FRINTM <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

FPRounding rounding;
rounding = FPDecodeRounding('10');


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(datasize) result;
bits(datasize) operand = V[n];
result = FPRoundInt(operand, FPCR, rounding, FALSE);

V[d] = result;
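
Note

As an informal illustration of the numeric behavior described for this instruction, the following Python sketch rounds a value towards minus infinity while keeping the sign of zero and infinite inputs. The function name frintm is an assumption for this sketch. Python floats are double-precision, so this corresponds most closely to the double-precision variant. NaN quieting and the FPSR exception flags are not modelled.

```python
import math


def frintm(x: float) -> float:
    """Round x to an integral value towards minus infinity."""
    # Zeros, infinities and NaNs pass through with their sign unchanged.
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    # math.floor returns an int; converting back is exact for integral values.
    return float(math.floor(x))
```

For example, frintm(-2.5) gives -3.0, and frintm(-0.0) keeps the negative sign of its input.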
C7.2.136 FRINTN (vector)


Floating-point Round to Integral, to nearest with ties to even (vector). This instruction rounds a vector of 
floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the 
Round to Nearest rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-13 | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 0  | sz | 10000 | 1100  | 0  | 10    | Rn  | Rd  |
|    |    | U  |       | o2 |    |       |       | o1 |       |     |     |


Vector single-precision and double-precision variant 


FRINTN <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean exact = FALSE; 
FPRounding rounding; 
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;
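
Note

The following Python sketch is an informal illustration of how the fields of the vector Round to Integral encodings combine into a 32-bit instruction word. The function name encode_frint_vector and its parameter names are assumptions for this sketch. The fixed bit values are taken from the encoding diagram for this instruction, with U, o2, and o1 left as parameters because the related FRINT* (vector) instructions differ only in those bits.

```python
def encode_frint_vector(u: int, o2: int, o1: int, sz: int, q: int,
                        rn: int, rd: int) -> int:
    """Assemble a vector FRINT* instruction word from its fields."""
    if any(bit not in (0, 1) for bit in (u, o2, o1, sz, q)):
        raise ValueError("u, o2, o1, sz and q are single-bit fields")
    if not (0 <= rn < 32 and 0 <= rd < 32):
        raise ValueError("register numbers are 5-bit fields")
    if (sz, q) == (1, 0):
        raise ValueError("sz:Q == '10' is reserved")
    return (
        (q << 30) | (u << 29) | (0b01110 << 24) | (o2 << 23) | (sz << 22)
        | (0b10000 << 17) | (0b1100 << 13) | (o1 << 12) | (0b10 << 10)
        | (rn << 5) | rd
    )
```

With U, o2, and o1 all 0, this sketch yields the FRINTN (vector) encoding. For example, the 2S form with Rn and Rd both 0 gives 0x0E218800.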
C7.2.137 FRINTN (scalar)


Floating-point Round to Integral, to nearest with ties to even (scalar). This instruction rounds a floating-point value 
in the SIMD&FP source register to an integral floating-point value of the same size using the Round to Nearest 
rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-18 | 17-15 | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 001   | 000   | 10000 | Rn  | Rd  |
|    |    |    |       |       |    |       | rmode |       |     |     |


Single-precision variant 
Applies when type == 00. 


FRINTN <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FRINTN <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

FPRounding rounding;
rounding = FPDecodeRounding('00');


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(datasize) result;
bits(datasize) operand = V[n];
result = FPRoundInt(operand, FPCR, rounding, FALSE);

V[d] = result;
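
Note

As an informal illustration of Round to Nearest with ties to even, the following Python sketch relies on the built-in round(), which rounds halfway cases to the even integer when called with one argument. The function name frintn is an assumption for this sketch. NaN quieting and the FPSR exception flags are not modelled.

```python
import math


def frintn(x: float) -> float:
    """Round x to the nearest integral value, ties to even."""
    if math.isinf(x) or math.isnan(x):
        return x
    # copysign restores the sign when the rounded result is zero,
    # so that an input such as -0.5 gives -0.0.
    return math.copysign(float(round(x)), x)
```

Halfway cases go to the even neighbour: 1.5 and 2.5 both round to 2.0, and 3.5 rounds to 4.0.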
C7.2.138 FRINTP (vector)


Floating-point Round to Integral, toward Plus infinity (vector). This instruction rounds a vector of floating-point 
values in the SIMD&FP source register to integral floating-point values of the same size using the Round towards 
Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-13 | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 1  | sz | 10000 | 1100  | 0  | 10    | Rn  | Rd  |
|    |    | U  |       | o2 |    |       |       | o1 |       |     |     |


Vector single-precision and double-precision variant 


FRINTP <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean exact = FALSE; 
FPRounding rounding; 
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;







C7.2.139 FRINTP (scalar) 


Floating-point Round to Integral, toward Plus infinity (scalar). This instruction rounds a floating-point value in the 
SIMD&FP source register to an integral floating-point value of the same size using the Round towards Plus Infinity 
rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-18 | 17-15 | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 001   | 001   | 10000 | Rn  | Rd  |
|    |    |    |       |       |    |       | rmode |       |     |     |


Single-precision variant 
Applies when type == 00. 


FRINTP <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01. 


FRINTP <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

FPRounding rounding;
rounding = FPDecodeRounding('01');


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();

bits(datasize) result;
bits(datasize) operand = V[n];
result = FPRoundInt(operand, FPCR, rounding, FALSE);

V[d] = result;
C7.2.140 FRINTX (vector)


Floating-point Round to Integral exact, using current rounding mode (vector). This instruction rounds a vector of 
floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the 
rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register. 


An Inexact exception is raised when the result value is not numerically equal to the input value. A zero input gives 
a zero result with the same sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated 
as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-13 | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 0  | sz | 10000 | 1100  | 1  | 10    | Rn  | Rd  |
|    |    | U  |       | o2 |    |       |       | o1 |       |     |     |


Vector single-precision and double-precision variant 


FRINTX <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean exact = FALSE; 
FPRounding rounding; 
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;
C7.2.141 FRINTX (scalar)


Floating-point Round to Integral exact, using current rounding mode (scalar). This instruction rounds a 
floating-point value in the SIMD&FP source register to an integral floating-point value of the same size using the 
rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register. 


An Inexact exception is raised when the result value is not numerically equal to the input value. A zero input gives 
a zero result with the same sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated 
as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-18 | 17-15 | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 001   | 110   | 10000 | Rn  | Rd  |
|    |    |    |       |       |    |       | rmode |       |     |     |


Single-precision variant 
Applies when type == 00. 


FRINTX <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FRINTX <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

FPRounding rounding;
rounding = FPRoundingMode(FPCR);


Assembler symbols 





<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation


CheckFPAdvSIMDEnabled64();

bits(datasize) result;
bits(datasize) operand = V[n];

result = FPRoundInt(operand, FPCR, rounding, TRUE);

V[d] = result;
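
Note

The exact argument distinguishes FRINTX from the other Round to Integral instructions: the Inexact condition is signalled when the rounded result differs numerically from the input. The following Python sketch is an informal illustration of that rule. It assumes that the FPCR selects Round to Nearest, and the function name frintx_nearest and its returned tuple are assumptions for this sketch. NaN handling and exception trapping are not modelled.

```python
import math


def frintx_nearest(x: float):
    """Return (result, inexact), assuming FPCR selects Round to Nearest."""
    if math.isinf(x) or math.isnan(x):
        return x, False
    # round() with one argument rounds halfway cases to the even integer;
    # copysign keeps the sign of the input when the result is zero.
    result = math.copysign(float(round(x)), x)
    # Inexact is reported only when rounding changed the value.
    return result, result != x
```

An input that is already integral, such as 4.0, is returned unchanged with the inexact indication clear.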
C7.2.142 FRINTZ (vector)


Floating-point Round to Integral, toward Zero (vector). This instruction rounds a vector of floating-point values in 
the SIMD&FP source register to integral floating-point values of the same size using the Round towards Zero 
rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-13 | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 1  | sz | 10000 | 1100  | 1  | 10    | Rn  | Rd  |
|    |    | U  |       | o2 |    |       |       | o1 |       |     |     |


Vector single-precision and double-precision variant 


FRINTZ <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean exact = FALSE; 
FPRounding rounding; 
case U:o1:o2 of
    when '0xx' rounding = FPDecodeRounding(o1:o2);
    when '100' rounding = FPRounding_TIEAWAY;
    when '101' UnallocatedEncoding();
    when '110' rounding = FPRoundingMode(FPCR); exact = TRUE;
    when '111' rounding = FPRoundingMode(FPCR);


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;
C7.2.143 FRINTZ (scalar)


Floating-point Round to Integral, toward Zero (scalar). This instruction rounds a floating-point value in the 
SIMD&FP source register to an integral floating-point value of the same size using the Round towards Zero 
rounding mode, and writes the result to the SIMD&FP destination register. 


A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal arithmetic. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-18 | 17-15 | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 001   | 011   | 10000 | Rn  | Rd  |
|    |    |    |       |       |    |       | rmode |       |     |     |


Single-precision variant 
Applies when type == 00. 


FRINTZ <Sd>, <Sn> 


Double-precision variant 
Applies when type == 01.


FRINTZ <Dd>, <Dn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();

FPRounding rounding;
rounding = FPDecodeRounding('11');


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64();


bits(datasize) result; 







bits(datasize) operand = V[n]; 
result = FPRoundInt(operand, FPCR, rounding, FALSE); 


V[d] = result; 
C7.2.144 FRSQRTE


Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for
each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the
destination SIMD&FP register.


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see
Floating-point exception traps on page D1-1552.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


Scalar


| 31-30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 01    | 1  | 11110 | 1  | sz | 10000 | 11101 | 10    | Rn  | Rd  |


Scalar variant


FRSQRTE <V><d>, <V><n>


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;


Vector


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 1  | sz | 10000 | 11101 | 10    | Rn  | Rd  |


Vector variant


FRSQRTE <Vd>.<T>, <Vn>.<T>


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

if sz:Q == '10' then ReservedValue();
integer esize = 32 << UInt(sz);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;


Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;
C7.2.145 FRSQRTS


Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the 
vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, 
places the results into a vector, and writes the vector to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31-30 | 29 | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 01    | 0  | 11110 | 1  | sz | 1  | Rm    | 11111 | 1  | Rn  | Rd  |


Scalar variant 


FRSQRTS <V><d>, <V><n>, <V><m> 


Decode for this encoding 
integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 32 << UInt(sz); 


integer datasize = esize; 
integer elements = 1; 


Vector 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 1  | sz | 1  | Rm    | 11111 | 1  | Rn  | Rd  |


Vector variant 


FRSQRTS <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
    S when sz = 0
    D when sz = 1

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;
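
Note

The per-element result described for this instruction, (3.0 - a * b) / 2.0, is the correction factor of a Newton-Raphson iteration for the reciprocal square root. The following Python sketch is an informal illustration of one common way to use it, refining an initial estimate x of 1/sqrt(d) by computing x * step(d * x, x). The function names frsqrts and refine_rsqrt are assumptions for this sketch. The fused arithmetic of FPRSqrtStepFused(), its special-case results, and the exception flags are not modelled.

```python
def frsqrts(a: float, b: float) -> float:
    """Per-element step: multiply, subtract from 3.0, divide by 2.0."""
    return (3.0 - a * b) / 2.0


def refine_rsqrt(d: float, estimate: float, steps: int = 2) -> float:
    """Refine an estimate of 1/sqrt(d) with Newton-Raphson iterations."""
    x = estimate
    for _ in range(steps):
        # Equivalent to x * (3 - d * x * x) / 2.
        x = x * frsqrts(d * x, x)
    return x
```

Each iteration roughly squares the relative error, so a coarse starting estimate converges in a small number of steps.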
C7.2.146 FSQRT (vector)


Floating-point Square Root (vector). This instruction calculates the square root for each vector element in the source 
SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 1  | sz | 10000 | 11111 | 10    | Rn  | Rd  |


Vector single-precision and double-precision variant 


FSQRT <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    2D when sz = 1, Q = 1

    The encoding sz = 1, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPSqrt(element, FPCR);

V[d] = result;
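
Note

The vector decodes in this section derive esize, datasize, and elements from the sz and Q fields in the same way. The following Python sketch is an informal illustration of that derivation and of the matching arrangement specifier. The function name decode_arrangement and the returned tuple layout are assumptions for this sketch.

```python
def decode_arrangement(sz: int, q: int):
    """Return (esize, datasize, elements, arrangement) for sz:Q."""
    if (sz, q) == (1, 0):
        raise ValueError("sz:Q == '10' is reserved")
    esize = 32 << sz                     # 32-bit or 64-bit elements
    datasize = 128 if q == 1 else 64     # full or half vector register
    elements = datasize // esize
    suffix = "S" if sz == 0 else "D"
    return esize, datasize, elements, f"{elements}{suffix}"
```

The reserved combination corresponds to a single 64-bit element in a 64-bit vector, which has no arrangement specifier in the table above.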










C7.2.147 FSQRT (scalar) 
Floating-point Square Root (scalar). This instruction calculates the square root of the value in the SIMD&FP source 
register and writes the result to the SIMD&FP destination register. 
A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-17 | 16-15 | 14-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | 0000  | 11    | 10000 | Rn  | Rd  |
|    |    |    |       |       |    |       | opc   |       |     |     |
Single-precision variant 
Applies when type == 00. 
FSQRT <Sd>, <Sn> 
Double-precision variant 
Applies when type == 01.
FSQRT <Dd>, <Dn> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer datasize;
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();
Assembler symbols 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) result; 
bits(datasize) operand = V[n]; 
result = FPSqrt(operand, FPCR); 
V[d] = result; 
C7.2.148 FSUB (vector)


Floating-point Subtract (vector). This instruction subtracts the elements in the vector in the second source 
SIMD&FP register, from the corresponding elements in the vector in the first source SIMD&FP register, places each 
result into elements of a vector, and writes the vector to the destination SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 1  | sz | 1  | Rm    | 11010 | 1  | Rn  | Rd  |
|    |    | U  |       |    |    |    |       |       |    |     |     |


Vector single-precision and double-precision variant 


FSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if sz:Q == '10' then ReservedValue(); 

integer esize = 32 << UInt(sz); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean abs = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1

The encoding sz = 1, Q = 0 is reserved.


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;
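
As an informal illustration of the FSUB (vector) decode and operation, the following Python sketch maps the sz and Q fields to a lane layout and subtracts lane by lane. The function names are hypothetical, register contents are modelled as little-endian byte strings, and the sketch assumes round-to-nearest with no trapping, flush-to-zero, or NaN processing, so it does not model FPCR or FPSR behavior.

```python
import struct

def decode_fsub_vector(sz: int, q: int):
    """Return (esize, datasize, elements) for the sz and Q fields."""
    if (sz, q) == (1, 0):
        raise ValueError("reserved encoding: sz = 1, Q = 0")
    esize = 32 << sz                  # 32-bit or 64-bit lanes
    datasize = 128 if q == 1 else 64  # full or half vector
    return esize, datasize, datasize // esize

def fsub_vector(vn: bytes, vm: bytes, sz: int, q: int) -> bytes:
    """Lane-wise vn - vm on little-endian register images (illustrative only)."""
    esize, datasize, elements = decode_fsub_vector(sz, q)
    fmt = "<%d%s" % (elements, "f" if esize == 32 else "d")
    nbytes = datasize // 8
    xs = struct.unpack(fmt, vn[:nbytes])
    ys = struct.unpack(fmt, vm[:nbytes])
    # Each lane is computed independently, as in the per-element loop.
    return struct.pack(fmt, *(x - y for x, y in zip(xs, ys)))
```

For example, with the 4S arrangement (sz = 0, Q = 1) the sketch processes four single-precision lanes of a 128-bit vector.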
C7.2.149 FSUB (scalar)


Floating-point Subtract (scalar). This instruction subtracts the floating-point value of the second source SIMD&FP 
register from the floating-point value of the first source SIMD&FP register, and writes the result to the destination 
SIMD&FP register. 


This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results 
in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-12 | 11-10 | 9-5 | 4-0 |
| 0  | 0  | 0  | 11110 | type  | 1  | Rm    | 0011  | 10    | Rn  | Rd  |


Single-precision variant 
Applies when type == 00. 


FSUB <Sd>, <Sn>, <Sm> 


Double-precision variant 
Applies when type == 01.


FSUB <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize; 
case type of
    when '00' datasize = 32;
    when '01' datasize = 64;
    when '1x' UnallocatedEncoding();


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
result = FPSub(operand1, operand2, FPCR);
V[d] = result;
C7.2.150 INS (element)


Insert vector element from another vector element. This instruction copies the vector element of the source 
SIMD&FP register to the specified vector element of the destination SIMD&FP register. 


This instruction can insert data into individual elements within a SIMD&FP register without clearing the remaining 
bits to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias MOV (element). The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | 1  | 01110000 | imm5  | 0  | imm4  | 1  | Rn  | Rd  |


Advanced SIMD variant 


INS <Vd>.<Ts>[<index1>], <Vn>.<Ts>[<index2>]


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer size = LowestSetBit(imm5);
if size > 3 then UnallocatedEncoding();

integer dst_index = UInt(imm5<4:size+1>);
integer src_index = UInt(imm4<3:size>);
integer idxdsize = if imm4<3> == '1' then 128 else 64;
// imm4<size-1:0> is IGNORED

integer esize = 8 << size;


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ts> Is an element size specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000

The encoding imm5 = x0000 is reserved.

<index1> Is the destination element index encoded in the "imm5" field. It can have the following values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
imm5<4> when imm5 = x1000

The encoding imm5 = x0000 is reserved.


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<index2> Is the source element index encoded in the "imm5:imm4" field. It can have the following values:
imm4<3:0> when imm5 = xxxx1
imm4<3:1> when imm5 = xxx10
imm4<3:2> when imm5 = xx100
imm4<3> when imm5 = x1000

The encoding imm5 = x0000 is reserved.


Unspecified bits in "imm4" are ignored but should be set to zero by an assembler. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n]; 
bits(128) result; 


result = V[d]; 
Elem[result, dst_index, esize] = Elem[operand, src_index, esize]; 
V[d] = result; 
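
The INS (element) decode and operation can be sketched informally in Python as follows. This is an illustration only: the function names are hypothetical, and each 128-bit SIMD&FP register is modelled as a non-negative Python integer whose element 0 occupies the least significant bits.

```python
def lowest_set_bit(x: int, width: int) -> int:
    """Index of the lowest set bit, or width if x is zero."""
    for i in range(width):
        if (x >> i) & 1:
            return i
    return width

def decode_ins_element(imm5: int, imm4: int):
    """Return (esize, dst_index, src_index) from the imm5 and imm4 fields."""
    size = lowest_set_bit(imm5, 5)
    if size > 3:
        raise ValueError("unallocated encoding: imm5 = x0000")
    dst_index = imm5 >> (size + 1)   # imm5<4:size+1>
    src_index = imm4 >> size         # imm4<3:size>, lower bits ignored
    return 8 << size, dst_index, src_index

def ins_element(vd: int, vn: int, imm5: int, imm4: int) -> int:
    """Copy one element of vn into vd, leaving the other bits of vd unchanged."""
    esize, dst, src = decode_ins_element(imm5, imm4)
    mask = (1 << esize) - 1
    element = (vn >> (src * esize)) & mask
    return (vd & ~(mask << (dst * esize))) | (element << (dst * esize))
```

For instance, INS V0.S[1], V1.S[3] corresponds to imm5 = 0b01100 and imm4 = 0b1100 in this model.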
C7.2.151 INS (general)


Insert vector element from general-purpose register. This instruction copies the contents of the source 
general-purpose register to the specified vector element in the destination SIMD&FP register. 


This instruction can insert data into individual elements within a SIMD&FP register without clearing the remaining 
bits to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias MOV (from general). The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | 0  | 01110000 | imm5  | 0  | 0011  | 1  | Rn  | Rd  |


Advanced SIMD variant 


INS <Vd>.<Ts>[<index>], <R><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer size = LowestSetBit(imm5);

if size > 3 then UnallocatedEncoding();
integer index = UInt(imm5<4:size+1>);

integer esize = 8 << size;
integer datasize = 128;


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ts> Is an element size specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000

The encoding imm5 = x0000 is reserved.

<index> Is the element index encoded in the "imm5" field. It can have the following values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
imm5<4> when imm5 = x1000

The encoding imm5 = x0000 is reserved.

<R> Is the width specifier for the general-purpose source register, encoded in the "imm5" field. It can
have the following values:
W when imm5 = xxxx1
W when imm5 = xxx10
W when imm5 = xx100
X when imm5 = x1000

The encoding imm5 = x0000 is reserved.


<n> Is the number [0-30] of the general-purpose source register or ZR (31), encoded in the "Rn" field. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n]; 
bits(datasize) result; 


result = V[d]; 
Elem[result, index, esize] = element; 
V[d] = result; 
C7.2.152 LD1 (multiple structures)


Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple 
single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


No offset 


| 31 | 30 | 29-23   | 22    | 21-16  | 15-12         | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011000 | 1 (L) | 000000 | xx1x (opcode) | size  | Rn  | Rt  |


One register variant 
Applies when opcode == 0111. 


LD1 { <Vt>.<T> }, [<Xn|SP>] 


Two registers variant 
Applies when opcode == 1010. 


LD1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>] 


Three registers variant 
Applies when opcode == 0110. 


LD1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>] 


Four registers variant 
Applies when opcode == 0010. 


LD1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>] 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE;


Post-index 


| 31 | 30 | 29-23   | 22    | 21 | 20-16 | 15-12         | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011001 | 1 (L) | 0  | Rm    | xx1x (opcode) | size  | Rn  | Rt  |


One register, immediate offset variant 
Applies when Rm == 11111 && opcode == 0111.


LD1 { <Vt>.<T> }, [<Xn|SP>], <imm>


One register, register offset variant
Applies when Rm != 11111 && opcode == 0111.


LD1 { <Vt>.<T> }, [<Xn|SP>], <Xm>


Two registers, immediate offset variant
Applies when Rm == 11111 && opcode == 1010. 


LD1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <imm> 


Two registers, register offset variant 
Applies when Rm != 11111 && opcode == 1010. 


LD1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <Xm> 


Three registers, immediate offset variant 
Applies when Rm == 11111 && opcode == 0110. 


LD1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <imm> 


Three registers, register offset variant 
Applies when Rm != 11111 && opcode == 0110. 


LD1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <Xm> 


Four registers, immediate offset variant 
Applies when Rm == 11111 && opcode == 0010. 


LD1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <imm> 


Four registers, register offset variant 
Applies when Rm != 11111 && opcode == 0010. 


LD1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;


Assembler symbols 





<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
1D when size = 11, Q = 0
2D when size = 11, Q = 1
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm> For the one register, immediate offset variant: is the post-index immediate offset, encoded in the "Q" 
field. It can have the following values: 
#8 when Q = 0
#16 when Q = 1 


For the two registers, immediate offset variant: is the post-index immediate offset, encoded in the 
"Q" field. It can have the following values: 


#16 when Q = 0
#32 when Q = 1 


For the three registers, immediate offset variant: is the post-index immediate offset, encoded in the 
"Q" field. It can have the following values: 


#24 when Q = 0 
#48 when Q = 1 


For the four registers, immediate offset variant: is the post-index immediate offset, encoded in the 
"Q" field. It can have the following values: 


#32 when Q = 0 
#64 when Q = 1 

<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 

integer esize = 8 << UInt(size); 

integer elements = datasize DIV esize; 


integer rpt; // number of iterations 
integer selem; // structure elements 


case opcode of
    when '0000' rpt = 1; selem = 4; // LD/ST4 (4 registers)
    when '0010' rpt = 4; selem = 1; // LD/ST1 (4 registers)
    when '0100' rpt = 1; selem = 3; // LD/ST3 (3 registers)
    when '0110' rpt = 3; selem = 1; // LD/ST1 (3 registers)
    when '0111' rpt = 1; selem = 1; // LD/ST1 (1 register)
    when '1000' rpt = 1; selem = 2; // LD/ST2 (2 registers)
    when '1010' rpt = 2; selem = 1; // LD/ST1 (2 registers)
    otherwise UnallocatedEncoding();

// .1D format only permitted with LD1 & ST1
if size:Q == '110' && selem != 1 then ReservedValue();

Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(datasize) rval;
integer e, r, s, tt;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
for r = 0 to rpt-1
    for e = 0 to elements-1
        tt = (t + r) MOD 32;
        for s = 0 to selem-1
            rval = V[tt];
            if memop == MemOp_LOAD then
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC];
                V[tt] = rval;
            else // memop == MemOp_STORE
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize];
            offs = offs + ebytes;
            tt = (tt + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
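
The load path of the shared decode and operation can be sketched informally in Python as follows. This is an illustration under stated assumptions, not a definitive model: the names OPCODES and load_multiple are hypothetical, memory is a little-endian byte string, registers are integers that start as zero, and alignment checks, access faults, and register writeback are not modelled.

```python
# opcode -> (rpt, selem), following the case statement of the shared decode
OPCODES = {
    0b0000: (1, 4), 0b0010: (4, 1), 0b0100: (1, 3), 0b0110: (3, 1),
    0b0111: (1, 1), 0b1000: (1, 2), 0b1010: (2, 1),
}

def load_multiple(mem: bytes, address: int, t: int, opcode: int, size: int, q: int):
    """Return (registers, bytes_transferred) for an LD1-LD4 style load."""
    if opcode not in OPCODES:
        raise ValueError("unallocated encoding")
    rpt, selem = OPCODES[opcode]
    if size == 0b11 and q == 0 and selem != 1:
        raise ValueError("reserved: .1D is only permitted with LD1 and ST1")
    datasize = 128 if q == 1 else 64
    esize = 8 << size
    ebytes = esize // 8
    elements = datasize // esize
    regs = {}
    offs = 0
    for r in range(rpt):
        for e in range(elements):
            tt = (t + r) % 32
            for _ in range(selem):
                chunk = mem[address + offs: address + offs + ebytes]
                value = int.from_bytes(chunk, "little")
                mask = ((1 << esize) - 1) << (e * esize)
                # Insert the loaded value into element e of register tt.
                regs[tt] = (regs.get(tt, 0) & ~mask) | (value << (e * esize))
                offs += ebytes
                tt = (tt + 1) % 32
    return regs, offs
```

In this sketch the returned byte count equals the post-index immediate of the corresponding immediate offset variant, for example 16 for the two registers variant when Q = 0.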
C7.2.153 LD1 (single structure)


Load one single-element structure to one lane of one register. This instruction loads a single-element structure from 
memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the 
register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 

| 31 | 30 | 29-23   | 22    | 21    | 20-16 | 15-13        | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011010 | 1 (L) | 0 (R) | 00000 | xx0 (opcode) | S  | size  | Rn  | Rt  |


8-bit variant 
Applies when opcode == 000. 


LD1 { <Vt>.B }[<index>], [<Xn|SP>] 


16-bit variant 
Applies when opcode == 010 && size == x0.


LD1 { <Vt>.H }[<index>], [<Xn|SP>] 


32-bit variant 
Applies when opcode == 100 && size == 00. 


LD1 { <Vt>.S }[<index>], [<Xn|SP>] 


64-bit variant 
Applies when opcode == 100 && S == 0 && size == 01.


LD1 { <Vt>.D }[<index>], [<Xn|SP>] 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE;





Post-index 
| 31 | 30 | 29-23   | 22    | 21    | 20-16 | 15-13        | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011011 | 1 (L) | 0 (R) | Rm    | xx0 (opcode) | S  | size  | Rn  | Rt  |
8-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 000. 
LD1 { <Vt>.B }[<index>], [<Xn|SP>], #1 
8-bit, register offset variant 
Applies when Rm != 11111 && opcode == 000. 
LD1 { <Vt>.B }[<index>], [<Xn|SP>], <Xm>


16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 010 && size == x0. 


LD1 { <Vt>.H }[<index>], [<Xn|SP>], #2 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 010 && size == x0. 


LD1 { <Vt>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && size == 00. 


LD1 { <Vt>.S }[<index>], [<Xn|SP>], #4 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && size == 00. 


LD1 { <Vt>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && S == 0 && size == 01.


LD1 { <Vt>.D }[<index>], [<Xn|SP>], #8 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && S == 0 && size == 01.


LD1 { <Vt>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;


Assembler symbols 
<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 


<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 
For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 
For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size); // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>); // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S); // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q); // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;


Operation for all encodings 


CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
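
The shared decode for the single structure forms, which derives the element size, the lane index, and the number of structure elements from the opcode, R, S, size, and Q fields, can be sketched informally in Python. The function name and the dictionary result are hypothetical conveniences for illustration; fields are passed as small integers.

```python
def decode_single_structure(opcode: int, r: int, s: int, size: int, q: int, l: int = 1):
    """Return esize, selem, index and replicate for a single structure access."""
    scale = (opcode >> 1) & 0b11             # opcode<2:1>
    selem = (((opcode & 1) << 1) | r) + 1    # UInt(opcode<0>:R) + 1
    replicate = False
    index = None
    if scale == 3:
        # load and replicate
        if l == 0 or s == 1:
            raise ValueError("unallocated encoding")
        scale = size
        replicate = True
    elif scale == 0:
        index = (q << 3) | (s << 2) | size           # B[0-15]
    elif scale == 1:
        if size & 0b01:
            raise ValueError("unallocated encoding")
        index = (q << 2) | (s << 1) | (size >> 1)    # H[0-7]
    else:
        if size & 0b10:
            raise ValueError("unallocated encoding")
        if (size & 0b01) == 0:
            index = (q << 1) | s                     # S[0-3]
        else:
            if s == 1:
                raise ValueError("unallocated encoding")
            index = q                                # D[0-1]
            scale = 3
    return {"esize": 8 << scale, "selem": selem, "index": index, "replicate": replicate}
```

For example, the 16-bit variant with Q:S:size<1> = 101 selects halfword lane 5 of one register in this model.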
C7.2.154 LD1R


Load one single-element structure and Replicate to all lanes (of one register). This instruction loads a single-element 
structure from memory and replicates the structure to all the lanes of the SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 


| 31 | 30 | 29-23   | 22    | 21    | 20-16 | 15-13        | 12    | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011010 | 1 (L) | 0 (R) | 00000 | 110 (opcode) | 0 (S) | size  | Rn  | Rt  |


No offset variant 


LD1R { <Vt>.<T> }, [<Xn|SP>]


Decode for this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE;


Post-index 


| 31 | 30 | 29-23   | 22    | 21    | 20-16 | 15-13        | 12    | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011011 | 1 (L) | 0 (R) | Rm    | 110 (opcode) | 0 (S) | size  | Rn  | Rt  |


Immediate offset variant 
Applies when Rm == 11111. 


LD1R { <Vt>.<T> }, [<Xn|SP>], <imm>


Register offset variant 
Applies when Rm != 11111. 


LD1R { <Vt>.<T> }, [<Xn|SP>], <Xm>


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 


<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
1D when size = 11, Q = 0
2D when size = 11, Q = 1
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "size" field. It can have the following values: 
#1 when size = 00 
#2 when size = 01
#4 when size = 10 
#8 when size = 11 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size); // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>); // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S); // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q); // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
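
The replicate path taken by LD1R can be illustrated with a short Python sketch. It is a simplified model only: the function name ld1r is hypothetical, memory and the register image are little-endian byte strings, and faults and register writeback are not modelled.

```python
def ld1r(mem: bytes, address: int, size: int, q: int):
    """Return (register_image, bytes_transferred) for a load-and-replicate."""
    ebytes = 1 << size                   # 1, 2, 4 or 8 bytes per element
    datasize = 128 if q == 1 else 64
    element = mem[address: address + ebytes]
    lanes = datasize // (8 * ebytes)     # datasize DIV esize
    # The single element loaded from memory fills every lane of the register.
    return element * lanes, ebytes
```

The second returned value matches the post-index immediate of the immediate offset variant, for example 2 when size = 01.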
C7.2.155 LD2 (multiple structures)
Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from 
memory and writes the result to the two SIMD&FP registers, with de-interleaving. 
For an example of de-interleaving, see LD3 (multiple structures). 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
No offset 
| 31 | 30 | 29-23   | 22    | 21-16  | 15-12         | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011000 | 1 (L) | 000000 | 1000 (opcode) | size  | Rn  | Rt  |
No offset variant 
LD2 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>] 
Decode for this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE;
Post-index 
| 31 | 30 | 29-23   | 22    | 21 | 20-16 | 15-12         | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011001 | 1 (L) | 0  | Rm    | 1000 (opcode) | size  | Rn  | Rt  |
Immediate offset variant 
Applies when Rm == 11111. 
LD2 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <imm> 
Register offset variant 
Applies when Rm != 11111. 
LD2 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <Xm> 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;
Assembler symbols 
<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.


<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "Q" field. It can have the following values: 
#16 when Q = 0
#32 when Q = 1
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 

integer esize = 8 << UInt(size); 

integer elements = datasize DIV esize; 


integer rpt; // number of iterations 
integer selem; // structure elements 


case opcode of
    when '0000' rpt = 1; selem = 4; // LD/ST4 (4 registers)
    when '0010' rpt = 4; selem = 1; // LD/ST1 (4 registers)
    when '0100' rpt = 1; selem = 3; // LD/ST3 (3 registers)
    when '0110' rpt = 3; selem = 1; // LD/ST1 (3 registers)
    when '0111' rpt = 1; selem = 1; // LD/ST1 (1 register)
    when '1000' rpt = 1; selem = 2; // LD/ST2 (2 registers)
    when '1010' rpt = 2; selem = 1; // LD/ST1 (2 registers)
    otherwise UnallocatedEncoding();

// .1D format only permitted with LD1 & ST1
if size:Q == '110' && selem != 1 then ReservedValue();


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(datasize) rval;
integer e, r, s, tt;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
for r = 0 to rpt-1
    for e = 0 to elements-1
        tt = (t + r) MOD 32;
        for s = 0 to selem-1
            rval = V[tt];
            if memop == MemOp_LOAD then
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC];
                V[tt] = rval;
            else // memop == MemOp_STORE
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize];
            offs = offs + ebytes;
            tt = (tt + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
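
The de-interleaving performed by LD2 (multiple structures) can be pictured with the following Python sketch, which splits consecutive 2-element structures in memory into two register images. The function name is hypothetical, the data is treated as little-endian byte strings, and alignment checks, faults, and writeback are not modelled.

```python
def ld2_deinterleave(mem: bytes, address: int, size: int, q: int):
    """Return (vt, vt2, bytes_transferred) for an LD2 multiple structures load."""
    if size == 0b11 and q == 0:
        raise ValueError("reserved encoding: size = 11, Q = 0")
    ebytes = 1 << size
    reg_bytes = 16 if q == 1 else 8
    block = mem[address: address + 2 * reg_bytes]
    # Cut the block into elements; even-numbered elements belong to Vt,
    # odd-numbered elements to Vt2.
    parts = [block[i: i + ebytes] for i in range(0, len(block), ebytes)]
    vt = b"".join(parts[0::2])
    vt2 = b"".join(parts[1::2])
    return vt, vt2, 2 * reg_bytes
```

With the 8B arrangement, bytes 0, 2, 4, and so on of the block go to the first register and bytes 1, 3, 5, and so on go to the second.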
C7.2.156 LD2 (single structure)


Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from 
memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the 
other bits of the registers. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 

| 31 | 30 | 29-23   | 22    | 21    | 20-16 | 15-13        | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011010 | 1 (L) | 1 (R) | 00000 | xx0 (opcode) | S  | size  | Rn  | Rt  |


8-bit variant 

Applies when opcode == 000. 

LD2 { <Vt>.B, <Vt2>.B }[<index>], [<Xn|SP>] 
16-bit variant 

Applies when opcode == 010 && size == x0.
LD2 { <Vt>.H, <Vt2>.H }[<index>], [<Xn|SP>] 
32-bit variant 

Applies when opcode == 100 && size == 00. 
LD2 { <Vt>.S, <Vt2>.S }[<index>], [<Xn|SP>] 
64-bit variant 

Applies when opcode == 100 && S == 0 && size == 01.


LD2 { <Vt>.D, <Vt2>.D }[<index>], [<Xn|SP>] 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE;





Post-index 
| 31 | 30 | 29-23   | 22    | 21    | 20-16 | 15-13        | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0011011 | 1 (L) | 1 (R) | Rm    | xx0 (opcode) | S  | size  | Rn  | Rt  |
8-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 000. 
LD2 { <Vt>.B, <Vt2>.B }[<index>], [<Xn|SP>], #2 
8-bit, register offset variant 
Applies when Rm != 11111 && opcode == 000. 
LD2 { <Vt>.B, <Vt2>.B }[<index>], [<Xn|SP>], <Xm>


16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 010 && size == x0. 


LD2 { <Vt>.H, <Vt2>.H }[<index>], [<Xn|SP>], #4 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 010 && size == x0. 


LD2 { <Vt>.H, <Vt2>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && size == 00. 


LD2 { <Vt>.S, <Vt2>.S }[<index>], [<Xn|SP>], #8 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && size == 00. 


LD2 { <Vt>.S, <Vt2>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && S == 0 && size == 01.


LD2 { <Vt>.D, <Vt2>.D }[<index>], [<Xn|SP>], #16 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && S == 0 && size == 01.


LD2 { <Vt>.D, <Vt2>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 


For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 
For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 





<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 
Shared decode for all encodings


integer scale = UInt(opcode<2:1>); 
integer selem = UInt(opcode<Q>:R) + 1; 
boolean replicate = FALSE; 

integer index; 


case scale of 
when 3 
// \oad and replicate 
if L == '@' || S == '1' then UnallocatedEncoding(); 
scale = UInt(size); 
replicate = TRUE; 


when 0 
index = UInt(Q:S:size); // B[@-15] 
when 1 
if size<@> == '1' then UnallocatedEncoding(); 
index = UInt(Q:S:size<1>); // H[0-7] 
when 2 
if size<l> == '1' then UnallocatedEncoding(); 
if size<@> == 'Q' then 
index = UInt(Q:S); // S[0-3] 
else 


if S == '1' then UnallocatedEncoding(); 
index = UInt(Q); // D[@-1] 
scale = 3; 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 
integer esize = 8 << scale; 
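
The shared decode above can be mirrored in a short Python sketch. This is an illustration only: the function name `decode_single_structure`, the use of plain integers for the instruction fields, and the `ValueError` used to model UnallocatedEncoding() are assumptions of this example, not part of the architecture pseudocode.

```python
def decode_single_structure(opcode, R, L, Q, S, size):
    """Model the shared decode; every field is passed as a small unsigned integer."""
    scale = (opcode >> 1) & 0b11              # opcode<2:1>
    selem = (((opcode & 1) << 1) | R) + 1     # UInt(opcode<0>:R) + 1
    replicate = False
    index = None

    if scale == 3:
        # load and replicate
        if L == 0 or S == 1:
            raise ValueError("unallocated encoding")
        scale = size
        replicate = True
    elif scale == 0:
        index = (Q << 3) | (S << 2) | size    # B[0-15]
    elif scale == 1:
        if size & 0b01:
            raise ValueError("unallocated encoding")
        index = (Q << 2) | (S << 1) | (size >> 1)   # H[0-7]
    else:  # scale == 2
        if size & 0b10:
            raise ValueError("unallocated encoding")
        if (size & 0b01) == 0:
            index = (Q << 1) | S              # S[0-3]
        else:
            if S == 1:
                raise ValueError("unallocated encoding")
            index = Q                         # D[0-1]
            scale = 3

    return {
        "selem": selem,
        "replicate": replicate,
        "index": index,
        "datasize": 128 if Q == 1 else 64,
        "esize": 8 << scale,
    }
```

For example, the field values opcode = 100, R = 1, Q = 1, S = 1, size = 00 describe a two-register, 32-bit access to element index 3.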


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(128) rval; 
bits(esize) element; 
integer s; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
if replicate then 
    // load and replicate to all elements 
    for s = 0 to selem-1 
        element = Mem[address+offs, ebytes, AccType_VEC]; 
        // replicate to fill 128- or 64-bit register 
        V[t] = Replicate(element, datasize DIV esize); 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 
else 
    // load/store one element per register 
    for s = 0 to selem-1 
        rval = V[t]; 
        if memop == MemOp_LOAD then 
            // insert into one lane of 128-bit register 
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
            V[t] = rval; 
        else // memop == MemOp_STORE 
            // extract from one lane of 128-bit register 
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize]; 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
C7.2.157 LD2R 


Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element 
structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 





No offset 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 0  | 1  | 1  | 00000 | 110    | 0  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode | S  |       |     | 
No offset variant 
LD2R { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>] 
Decode for this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 1  | 1  | 1  | Rm    | 110    | 0  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode | S  |       |     | 
Immediate offset variant 
Applies when Rm == 11111. 
LD2R { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <imm> 
Register offset variant 
Applies when Rm != 11111. 
LD2R { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <Xm> 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 
Assembler symbols 
<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
1D when size = 11, Q = 0 
2D when size = 11, Q = 1 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "size" field. It can have the following values: 
#2 when size = 00 
#4 when size = 01 
#8 when size = 10 
#16 when size = 11 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>); 
integer selem = UInt(opcode<0>:R) + 1; 
boolean replicate = FALSE; 
integer index; 

case scale of 
    when 3 
        // load and replicate 
        if L == '0' || S == '1' then UnallocatedEncoding(); 
        scale = UInt(size); 
        replicate = TRUE; 
    when 0 
        index = UInt(Q:S:size);         // B[0-15] 
    when 1 
        if size<0> == '1' then UnallocatedEncoding(); 
        index = UInt(Q:S:size<1>);      // H[0-7] 
    when 2 
        if size<1> == '1' then UnallocatedEncoding(); 
        if size<0> == '0' then 
            index = UInt(Q:S);          // S[0-3] 
        else 
            if S == '1' then UnallocatedEncoding(); 
            index = UInt(Q);            // D[0-1] 
            scale = 3; 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 
integer esize = 8 << scale; 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(128) rval; 
bits(esize) element; 
integer s; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
if replicate then 
    // load and replicate to all elements 
    for s = 0 to selem-1 
        element = Mem[address+offs, ebytes, AccType_VEC]; 
        // replicate to fill 128- or 64-bit register 
        V[t] = Replicate(element, datasize DIV esize); 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 
else 
    // load/store one element per register 
    for s = 0 to selem-1 
        rval = V[t]; 
        if memop == MemOp_LOAD then 
            // insert into one lane of 128-bit register 
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
            V[t] = rval; 
        else // memop == MemOp_STORE 
            // extract from one lane of 128-bit register 
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize]; 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
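
The replicate path of the operation above, as taken by LD2R, can be sketched in Python as follows. The function name `ld2r`, the modelling of memory as a `bytes` object, the modelling of register values as integers, and the assumption of little-endian data accesses are all choices made for this illustration.

```python
def ld2r(memory, address, esize, datasize):
    """Return the values written to the two registers, as integers.

    Assumes little-endian element loads; memory is a bytes-like object.
    """
    ebytes = esize // 8
    lanes = datasize // esize
    regs = []
    for s in range(2):                       # selem == 2 for LD2R
        start = address + s * ebytes
        element = int.from_bytes(memory[start:start + ebytes], "little")
        value = 0
        for lane in range(lanes):            # Replicate(element, lanes)
            value |= element << (lane * esize)
        regs.append(value)
    return regs


# Two 16-bit elements replicated across 64-bit (4H) registers.
first, second = ld2r(bytes([0x11, 0x22, 0x33, 0x44]), 0, 16, 64)
print(hex(first), hex(second))
```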
C7.2.158 LD3 (multiple structures) 
Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from 
memory and writes the result to the three SIMD&FP registers, with de-interleaving. 
The following figure shows an example of the operation of de-interleaving of a LD3.16 (multiple 3-element 
structures) instruction: 
Memory (ascending addresses): A[0].x, A[0].y, A[0].z, A[1].x, A[1].y, A[1].z, A[2].x, A[2].y, A[2].z, A[3].x, A[3].y, A[3].z 
A is a packed array of 3-element structures. Each element is a 16-bit halfword. 
Registers after the load (highest-numbered lane on the left): 
D0 = X3 | X2 | X1 | X0 
D1 = Y3 | Y2 | Y1 | Y0 
D2 = Z3 | Z2 | Z1 | Z0 
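
The de-interleaving performed by this instruction can be illustrated with a minimal Python sketch. The helper name `ld3_deinterleave` and the representation of memory as a flat list of elements are assumptions made for this example; lane 0 of each register is the first item of each returned list.

```python
def ld3_deinterleave(elements):
    """Split [x0, y0, z0, x1, y1, z1, ...] into the x, y and z lane lists."""
    return elements[0::3], elements[1::3], elements[2::3]


# Packed array of four 3-element structures, in ascending address order.
memory = [f"A[{i}].{field}" for i in range(4) for field in "xyz"]
d0, d1, d2 = ld3_deinterleave(memory)
print(d0)
print(d1)
print(d2)
```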
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
No offset 
Bits  | 31 | 30 | 29-23   | 22 | 21-16  | 15-12  | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 0011000 | 1  | 000000 | 0100   | size  | Rn  | Rt 
Field |    |    |         | L  |        | opcode |       |     | 
No offset variant 
LD3 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>] 
Decode for this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
Bits  | 31 | 30 | 29-23   | 22 | 21 | 20-16 | 15-12  | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 0011001 | 1  | 0  | Rm    | 0100   | size  | Rn  | Rt 
Field |    |    |         | L  |    |       | opcode |       |     | 
Immediate offset variant 
Applies when Rm == 11111. 
LD3 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <imm> 
Register offset variant 
Applies when Rm != 11111. 
LD3 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 

8B when size = 00,Q = 0 

16B when size = 00,Q=1 

4H when size = 01,Q = 0 

8H when size = 01,Q=1 

2S when size = 10,Q = 0 

4S when size = 10,Q=1 

2D when size = 11,Q=1 


The encoding size = 11,Q = 0 is reserved. 


<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "Q" field. It can have the following values: 
#24 when Q = 0 
#48 when Q = 1 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 

integer esize = 8 << UInt(size); 

integer elements = datasize DIV esize; 


integer rpt; // number of iterations 
integer selem; // structure elements 


case opcode of 
    when '0000' rpt = 1; selem = 4;     // LD/ST4 (4 registers) 
    when '0010' rpt = 4; selem = 1;     // LD/ST1 (4 registers) 
    when '0100' rpt = 1; selem = 3;     // LD/ST3 (3 registers) 
    when '0110' rpt = 3; selem = 1;     // LD/ST1 (3 registers) 
    when '0111' rpt = 1; selem = 1;     // LD/ST1 (1 register) 
    when '1000' rpt = 1; selem = 2;     // LD/ST2 (2 registers) 
    when '1010' rpt = 2; selem = 1;     // LD/ST1 (2 registers) 
    otherwise UnallocatedEncoding(); 

// .1D format only permitted with LD1 & ST1 
if size:Q == '110' && selem != 1 then ReservedValue(); 
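
As an illustration, the opcode table above can be written as a Python lookup. The names `OPCODE_TABLE` and `decode_multiple`, and the use of `ValueError` to model UnallocatedEncoding() and ReservedValue(), are assumptions of this sketch. The last returned value is the number of bytes transferred, which for LD3 corresponds to the post-index immediate of #24 or #48.

```python
# opcode -> (rpt, selem), taken from the case statement above
OPCODE_TABLE = {
    0b0000: (1, 4),   # LD/ST4 (4 registers)
    0b0010: (4, 1),   # LD/ST1 (4 registers)
    0b0100: (1, 3),   # LD/ST3 (3 registers)
    0b0110: (3, 1),   # LD/ST1 (3 registers)
    0b0111: (1, 1),   # LD/ST1 (1 register)
    0b1000: (1, 2),   # LD/ST2 (2 registers)
    0b1010: (2, 1),   # LD/ST1 (2 registers)
}


def decode_multiple(opcode, size, Q):
    """Return (rpt, selem, elements, bytes_transferred) for the given fields."""
    if opcode not in OPCODE_TABLE:
        raise ValueError("unallocated encoding")
    rpt, selem = OPCODE_TABLE[opcode]
    # .1D format only permitted with LD1 & ST1
    if size == 0b11 and Q == 0 and selem != 1:
        raise ValueError("reserved value")
    datasize = 128 if Q == 1 else 64
    esize = 8 << size
    elements = datasize // esize
    bytes_transferred = rpt * selem * (datasize // 8)
    return rpt, selem, elements, bytes_transferred
```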
Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(datasize) rval; 
integer e, r, s, tt; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
for r = 0 to rpt-1 
    for e = 0 to elements-1 
        tt = (t + r) MOD 32; 
        for s = 0 to selem-1 
            rval = V[tt]; 
            if memop == MemOp_LOAD then 
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
                V[tt] = rval; 
            else // memop == MemOp_STORE 
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize]; 
            offs = offs + ebytes; 
            tt = (tt + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
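
The final base-register update in the operation above can be sketched as a small Python function. The name `post_index_base`, the list `x` standing in for the general-purpose registers, and the explicit 64-bit wrap are assumptions of this illustration. When Rm is 31 the immediate form applies and the offset is the number of bytes transferred; otherwise the offset is the value of the register selected by Rm.

```python
MASK64 = (1 << 64) - 1


def post_index_base(address, bytes_transferred, wback, m, x):
    """Return the new base register value, or None when there is no writeback."""
    if not wback:
        return None
    offs = bytes_transferred          # offs as accumulated by the transfer loop
    if m != 31:
        offs = x[m]                   # register offset variant
    return (address + offs) & MASK64  # bits(64) arithmetic wraps


registers = [0] * 31
registers[2] = 0x100
print(hex(post_index_base(0x1000, 48, True, 31, registers)))  # immediate form
print(hex(post_index_base(0x1000, 48, True, 2, registers)))   # register form
```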
C7.2.159 LD3 (single structure) 
Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from 
memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the 
other bits of the registers. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
No offset 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 0  | 1  | 0  | 00000 | xx1    | S  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode |    |       |     | 
8-bit variant 
Applies when opcode == 001. 
LD3 { <Vt>.B, <Vt2>.B, <Vt3>.B }[<index>], [<Xn|SP>] 
16-bit variant 
Applies when opcode == 011 && size == x0. 
LD3 { <Vt>.H, <Vt2>.H, <Vt3>.H }[<index>], [<Xn|SP>] 
32-bit variant 
Applies when opcode == 101 && size == 00. 
LD3 { <Vt>.S, <Vt2>.S, <Vt3>.S }[<index>], [<Xn|SP>] 
64-bit variant 
Applies when opcode == 101 && S == 0 && size == 01. 
LD3 { <Vt>.D, <Vt2>.D, <Vt3>.D }[<index>], [<Xn|SP>] 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 1  | 1  | 0  | Rm    | xx1    | S  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode |    |       |     | 
8-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 001. 
LD3 { <Vt>.B, <Vt2>.B, <Vt3>.B }[<index>], [<Xn|SP>], #3 
8-bit, register offset variant 
Applies when Rm != 11111 && opcode == 001. 
LD3 { <Vt>.B, <Vt2>.B, <Vt3>.B }[<index>], [<Xn|SP>], <Xm> 


16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 011 && size == x0. 


LD3 { <Vt>.H, <Vt2>.H, <Vt3>.H }[<index>], [<Xn|SP>], #6 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 011 && size == x0. 


LD3 { <Vt>.H, <Vt2>.H, <Vt3>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && size == 00. 


LD3 { <Vt>.S, <Vt2>.S, <Vt3>.S }[<index>], [<Xn|SP>], #12 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && size == 00. 


LD3 { <Vt>.S, <Vt2>.S, <Vt3>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && S == 0 && size == 01. 


LD3 { <Vt>.D, <Vt2>.D, <Vt3>.D }[<index>], [<Xn|SP>], #24 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && S == 0 && size == 01. 


LD3 { <Vt>.D, <Vt2>.D, <Vt3>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 


For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 
For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 





<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 
Shared decode for all encodings 


integer scale = UInt(opcode<2:1>); 
integer selem = UInt(opcode<0>:R) + 1; 
boolean replicate = FALSE; 
integer index; 

case scale of 
    when 3 
        // load and replicate 
        if L == '0' || S == '1' then UnallocatedEncoding(); 
        scale = UInt(size); 
        replicate = TRUE; 
    when 0 
        index = UInt(Q:S:size);         // B[0-15] 
    when 1 
        if size<0> == '1' then UnallocatedEncoding(); 
        index = UInt(Q:S:size<1>);      // H[0-7] 
    when 2 
        if size<1> == '1' then UnallocatedEncoding(); 
        if size<0> == '0' then 
            index = UInt(Q:S);          // S[0-3] 
        else 
            if S == '1' then UnallocatedEncoding(); 
            index = UInt(Q);            // D[0-1] 
            scale = 3; 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 
integer esize = 8 << scale; 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(128) rval; 
bits(esize) element; 
integer s; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
if replicate then 
    // load and replicate to all elements 
    for s = 0 to selem-1 
        element = Mem[address+offs, ebytes, AccType_VEC]; 
        // replicate to fill 128- or 64-bit register 
        V[t] = Replicate(element, datasize DIV esize); 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 
else 
    // load/store one element per register 
    for s = 0 to selem-1 
        rval = V[t]; 
        if memop == MemOp_LOAD then 
            // insert into one lane of 128-bit register 
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
            V[t] = rval; 
        else // memop == MemOp_STORE 
            // extract from one lane of 128-bit register 
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize]; 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
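
The single-lane load path of the operation above, in which one element is inserted into each register while the remaining bits are preserved, can be sketched in Python. The function name `ld3_lane`, the use of integers for 128-bit register values, the `bytes` model of memory, and the little-endian element loads are assumptions of this example.

```python
def ld3_lane(regs, memory, address, index, esize):
    """Insert one element into lane `index` of each of three register values.

    regs holds three 128-bit register values as integers; the updated
    values are returned and all other lanes are left unchanged.
    """
    ebytes = esize // 8
    shift = index * esize
    lane_mask = ((1 << esize) - 1) << shift
    updated = []
    for s, rval in enumerate(regs):
        start = address + s * ebytes
        element = int.from_bytes(memory[start:start + ebytes], "little")
        # Elem[rval, index, esize] = element
        updated.append((rval & ~lane_mask) | (element << shift))
    return updated


all_ones = (1 << 128) - 1
result = ld3_lane([all_ones] * 3, bytes([0x01, 0x02, 0x03]), 0, 2, 8)
print([hex(value) for value in result])
```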
C7.2.160 LD3R 


Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element 
structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 0  | 1  | 0  | 00000 | 111    | 0  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode | S  |       |     | 


No offset variant 


LD3R { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>] 


Decode for this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 


Post-index 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 1  | 1  | 0  | Rm    | 111    | 0  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode | S  |       |     | 


Immediate offset variant 
Applies when Rm == 11111. 


LD3R { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <imm> 


Register offset variant 
Applies when Rm != 11111. 


LD3R { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 





<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
1D when size = 11, Q = 0 
2D when size = 11, Q = 1 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "size" field. It can have the following values: 

#3 when size = 00 

#6 when size = 01 

#12 when size = 10 

#24 when size = 11 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 

field. 
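
The <imm> values listed above follow from the number of structure elements multiplied by the element size in bytes. A one-line Python check, with the hypothetical helper name `replicate_post_index_imm`, illustrates the relationship for the load-and-replicate forms:

```python
def replicate_post_index_imm(selem, size):
    """Post-index immediate: selem elements of (1 << size) bytes each."""
    return selem * (1 << size)


# LD3R: three elements per structure, size field values 00, 01, 10, 11.
print([replicate_post_index_imm(3, size) for size in range(4)])
```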


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>); 
integer selem = UInt(opcode<0>:R) + 1; 
boolean replicate = FALSE; 
integer index; 

case scale of 
    when 3 
        // load and replicate 
        if L == '0' || S == '1' then UnallocatedEncoding(); 
        scale = UInt(size); 
        replicate = TRUE; 
    when 0 
        index = UInt(Q:S:size);         // B[0-15] 
    when 1 
        if size<0> == '1' then UnallocatedEncoding(); 
        index = UInt(Q:S:size<1>);      // H[0-7] 
    when 2 
        if size<1> == '1' then UnallocatedEncoding(); 
        if size<0> == '0' then 
            index = UInt(Q:S);          // S[0-3] 
        else 
            if S == '1' then UnallocatedEncoding(); 
            index = UInt(Q);            // D[0-1] 
            scale = 3; 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 
integer esize = 8 << scale; 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(128) rval; 
bits(esize) element; 
integer s; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
if replicate then 
    // load and replicate to all elements 
    for s = 0 to selem-1 
        element = Mem[address+offs, ebytes, AccType_VEC]; 
        // replicate to fill 128- or 64-bit register 
        V[t] = Replicate(element, datasize DIV esize); 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 
else 
    // load/store one element per register 
    for s = 0 to selem-1 
        rval = V[t]; 
        if memop == MemOp_LOAD then 
            // insert into one lane of 128-bit register 
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
            V[t] = rval; 
        else // memop == MemOp_STORE 
            // extract from one lane of 128-bit register 
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize]; 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
C7.2.161 LD4 (multiple structures) 
Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from 
memory and writes the result to the four SIMD&FP registers, with de-interleaving. 
For an example of de-interleaving, see LD3 (multiple structures). 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
No offset 
Bits  | 31 | 30 | 29-23   | 22 | 21-16  | 15-12  | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 0011000 | 1  | 000000 | 0000   | size  | Rn  | Rt 
Field |    |    |         | L  |        | opcode |       |     | 
No offset variant 
LD4 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>] 
Decode for this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
Bits  | 31 | 30 | 29-23   | 22 | 21 | 20-16 | 15-12  | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 0011001 | 1  | 0  | Rm    | 0000   | size  | Rn  | Rt 
Field |    |    |         | L  |    |       | opcode |       |     | 
Immediate offset variant 
Applies when Rm == 11111. 
LD4 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <imm> 
Register offset variant 
Applies when Rm != 11111. 
LD4 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <Xm> 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 
Assembler symbols 
<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 


The encoding size = 11, Q = 0 is reserved. 


<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "Q" field. It can have the following values: 

#32 when Q = 0 

#64 when Q = 1 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 

field. 


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 

integer esize = 8 << UInt(size); 

integer elements = datasize DIV esize; 


integer rpt; // number of iterations 
integer selem; // structure elements 


case opcode of 
    when '0000' rpt = 1; selem = 4;     // LD/ST4 (4 registers) 
    when '0010' rpt = 4; selem = 1;     // LD/ST1 (4 registers) 
    when '0100' rpt = 1; selem = 3;     // LD/ST3 (3 registers) 
    when '0110' rpt = 3; selem = 1;     // LD/ST1 (3 registers) 
    when '0111' rpt = 1; selem = 1;     // LD/ST1 (1 register) 
    when '1000' rpt = 1; selem = 2;     // LD/ST2 (2 registers) 
    when '1010' rpt = 2; selem = 1;     // LD/ST1 (2 registers) 
    otherwise UnallocatedEncoding(); 

// .1D format only permitted with LD1 & ST1 
if size:Q == '110' && selem != 1 then ReservedValue(); 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(datasize) rval; 
integer e, r, s, tt; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
for r = 0 to rpt-1 
    for e = 0 to elements-1 
        tt = (t + r) MOD 32; 
        for s = 0 to selem-1 
            rval = V[tt]; 
            if memop == MemOp_LOAD then 
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
                V[tt] = rval; 
            else // memop == MemOp_STORE 
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize]; 
            offs = offs + ebytes; 
            tt = (tt + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
C7.2.162 LD4 (single structure) 
Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from 
memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the 
other bits of the registers. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
No offset 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 0  | 1  | 1  | 00000 | xx1    | S  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode |    |       |     | 
8-bit variant 
Applies when opcode == 001. 
LD4 { <Vt>.B, <Vt2>.B, <Vt3>.B, <Vt4>.B }[<index>], [<Xn|SP>] 
16-bit variant 
Applies when opcode == 011 && size == x0. 
LD4 { <Vt>.H, <Vt2>.H, <Vt3>.H, <Vt4>.H }[<index>], [<Xn|SP>] 
32-bit variant 
Applies when opcode == 101 && size == 00. 
LD4 { <Vt>.S, <Vt2>.S, <Vt3>.S, <Vt4>.S }[<index>], [<Xn|SP>] 
64-bit variant 
Applies when opcode == 101 && S == 0 && size == 01. 
LD4 { <Vt>.D, <Vt2>.D, <Vt3>.D, <Vt4>.D }[<index>], [<Xn|SP>] 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 1  | 1  | 1  | Rm    | xx1    | S  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode |    |       |     | 
8-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 001. 
LD4 { <Vt>.B, <Vt2>.B, <Vt3>.B, <Vt4>.B }[<index>], [<Xn|SP>], #4 
8-bit, register offset variant 
Applies when Rm != 11111 && opcode == 001. 
LD4 { <Vt>.B, <Vt2>.B, <Vt3>.B, <Vt4>.B }[<index>], [<Xn|SP>], <Xm> 


16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 011 && size == x0. 


LD4 { <Vt>.H, <Vt2>.H, <Vt3>.H, <Vt4>.H }[<index>], [<Xn|SP>], #8 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 011 && size == x0. 


LD4 { <Vt>.H, <Vt2>.H, <Vt3>.H, <Vt4>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && size == 00. 


LD4 { <Vt>.S, <Vt2>.S, <Vt3>.S, <Vt4>.S }[<index>], [<Xn|SP>], #16 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && size == 00. 


LD4 { <Vt>.S, <Vt2>.S, <Vt3>.S, <Vt4>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && S == 0 && size == 01. 


LD4 { <Vt>.D, <Vt2>.D, <Vt3>.D, <Vt4>.D }[<index>], [<Xn|SP>], #32 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && S == 0 && size == 01. 


LD4 { <Vt>.D, <Vt2>.D, <Vt3>.D, <Vt4>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32. 


<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 


For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 


For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 





<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 
Shared decode for all encodings 


integer scale = UInt(opcode<2:1>); 
integer selem = UInt(opcode<0>:R) + 1; 
boolean replicate = FALSE; 
integer index; 

case scale of 
    when 3 
        // load and replicate 
        if L == '0' || S == '1' then UnallocatedEncoding(); 
        scale = UInt(size); 
        replicate = TRUE; 
    when 0 
        index = UInt(Q:S:size);         // B[0-15] 
    when 1 
        if size<0> == '1' then UnallocatedEncoding(); 
        index = UInt(Q:S:size<1>);      // H[0-7] 
    when 2 
        if size<1> == '1' then UnallocatedEncoding(); 
        if size<0> == '0' then 
            index = UInt(Q:S);          // S[0-3] 
        else 
            if S == '1' then UnallocatedEncoding(); 
            index = UInt(Q);            // D[0-1] 
            scale = 3; 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE; 
integer datasize = if Q == '1' then 128 else 64; 
integer esize = 8 << scale; 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 

bits(64) address; 
bits(64) offs; 
bits(128) rval; 
bits(esize) element; 
integer s; 
constant integer ebytes = esize DIV 8; 

if n == 31 then 
    CheckSPAlignment(); 
    address = SP[]; 
else 
    address = X[n]; 

offs = Zeros(); 
if replicate then 
    // load and replicate to all elements 
    for s = 0 to selem-1 
        element = Mem[address+offs, ebytes, AccType_VEC]; 
        // replicate to fill 128- or 64-bit register 
        V[t] = Replicate(element, datasize DIV esize); 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 
else 
    // load/store one element per register 
    for s = 0 to selem-1 
        rval = V[t]; 
        if memop == MemOp_LOAD then 
            // insert into one lane of 128-bit register 
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC]; 
            V[t] = rval; 
        else // memop == MemOp_STORE 
            // extract from one lane of 128-bit register 
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize]; 
        offs = offs + ebytes; 
        t = (t + 1) MOD 32; 

if wback then 
    if m != 31 then 
        offs = X[m]; 
    if n == 31 then 
        SP[] = address + offs; 
    else 
        X[n] = address + offs; 
C7.2.163 LD4R 


Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element 
structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 0  | 1  | 1  | 00000 | 111    | 0  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode | S  |       |     | 


No offset variant 


LD4R { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>] 


Decode for this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 


Post-index 
Bits  | 31 | 30 | 29-24  | 23 | 22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 
Value | 0  | Q  | 001101 | 1  | 1  | 1  | Rm    | 111    | 0  | size  | Rn  | Rt 
Field |    |    |        |    | L  | R  |       | opcode | S  |       |     | 


Immediate offset variant 
Applies when Rm == 11111. 


LD4R { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <imm> 


Register offset variant 
Applies when Rm != 11111. 


LD4R { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 





<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
1D when size = 11, Q = 0 
2D when size = 11, Q = 1 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "size" field. It can have the following values: 

#4 when size = 00 

#8 when size = 01 

#16 when size = 10 

#32 when size = 11 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 

field. 


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size);      // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>);   // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S);       // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q);         // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
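
The load-and-replicate path of the operation pseudocode can be sketched in Python. This is an illustration only, not the architectural reference model: the function name ld4r, the flat bytes object standing in for memory, the dictionary standing in for the SIMD&FP register file, and the little-endian byte order are all assumptions made for this example.

```python
def ld4r(memory, address, esize, datasize, t):
    """Sketch of LD4R: load 4 consecutive elements, replicate each across one register.

    memory   -- bytes-like object (hypothetical flat memory, little-endian assumed)
    esize    -- element size in bits (8, 16, 32 or 64)
    datasize -- register width in bits (64 or 128)
    t        -- index of the first destination register
    Returns a dict mapping register index to its list of lane values.
    """
    ebytes = esize // 8
    lanes = datasize // esize
    regs = {}
    offs = 0
    for _ in range(4):  # selem is 4 for LD4R
        raw = memory[address + offs: address + offs + ebytes]
        element = int.from_bytes(raw, "little")
        regs[t] = [element] * lanes   # Replicate(element, datasize DIV esize)
        offs += ebytes
        t = (t + 1) % 32              # register numbers wrap modulo 32
    return regs
```

Note how a first register of V30 wraps around to V31, V0 and V1, matching the "plus n modulo 32" encoding of <Vt2>, <Vt3> and <Vt4>.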
C7.2.164 LDNP (SIMD&FP)


Load Pair of SIMD&FP registers, with Non-temporal hint. This instruction loads a pair of SIMD&FP registers from 
memory, issuing a hint to the memory system that the access is non-temporal. The address that is used for the load 
is calculated from a base register value and an optional immediate offset. 


For information about non-temporal pair instructions, see Load/Store SIMD and Floating-point Non-temporal pair 
on page C3-154. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31-30 | 29-27 | 26 | 25-23 | 22 | 21-15 | 14-10 | 9-5 | 4-0 |
| opc   | 101   | 1  | 000   | 1  | imm7  | Rt2   | Rn  | Rt  |
|       |       |    |       | L  |       |       |     |     |


32-bit variant 

Applies when opc == 00. 

LDNP <St1>, <St2>, [<Xn|SP>{, #<imm>}] 
64-bit variant 

Applies when opc == 01. 

LDNP <Dt1>, <Dt2>, [<Xn|SP>{, #<imm>}] 
128-bit variant 

Applies when opc == 10. 

LDNP <Qt1>, <Qt2>, [<Xn|SP>{, #<imm>}] 
Decode for all variants of this encoding 


// Empty. 


Notes for all encodings 


For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly LDNP (SIMD&FP) on page K1-5484. 


Assembler symbols 








<Dt1> Is the 64-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt2> Is the 64-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<Qt1> Is the 128-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt2> Is the 128-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<St1> Is the 32-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<St2> Is the 32-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm> For the 32-bit variant: is the optional signed immediate byte offset, a multiple of 4 in the range -256
to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4.

For the 64-bit variant: is the optional signed immediate byte offset, a multiple of 8 in the range -512
to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8.

For the 128-bit variant: is the optional signed immediate byte offset, a multiple of 16 in the range
-1024 to 1008, defaulting to 0 and encoded in the "imm7" field as <imm>/16.


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer t2 = UInt(Rt2); 

if opc == '11' then UnallocatedEncoding(); 

integer scale = 2 + UInt(opc); 

integer datasize = 8 << scale; 

bits(64) offset = LSL(SignExtend(imm7, 64), scale); 
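
The scaled immediate computed by the shared decode can be checked against the <imm> ranges with a short Python sketch. The helper names sign_extend, ldnp_offset and encode_imm7 are hypothetical and exist only for this illustration; opc and imm7 are passed as plain integers.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value & (sign - 1)) - (value & sign)


def ldnp_offset(opc, imm7):
    """Byte offset for a given opc and imm7, mirroring LSL(SignExtend(imm7, 64), scale)."""
    if opc == 0b11:
        raise ValueError("unallocated encoding")
    scale = 2 + opc                      # 2, 3 or 4 for the 32-, 64- and 128-bit variants
    return sign_extend(imm7, 7) << scale


def encode_imm7(opc, offset):
    """Inverse mapping: encode a byte offset as <imm>/4, <imm>/8 or <imm>/16."""
    scale = 2 + opc
    if offset % (1 << scale) != 0:
        raise ValueError("offset is not a multiple of the access size")
    scaled = offset >> scale
    if not -64 <= scaled <= 63:
        raise ValueError("offset out of range")
    return scaled & 0x7F
```

For example, the most negative imm7 value (0b1000000) gives -256, -512 and -1024 bytes for the three variants, which are the lower bounds listed for <imm>.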


Operation 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data1;
bits(datasize) data2;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data1 = Mem[address, dbytes, AccType_VECSTREAM];
data2 = Mem[address+dbytes, dbytes, AccType_VECSTREAM];
if rt_unknown then
    data1 = bits(datasize) UNKNOWN;
    data2 = bits(datasize) UNKNOWN;
V[t] = data1;
V[t2] = data2;
C7.2.165 LDP (SIMD&FP)


Load Pair of SIMD&FP registers. This instruction loads a pair of SIMD&FP registers from memory. The address 
that is used for the load is calculated from a base register value and an optional immediate offset. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Post-index 


| 31-30 | 29-27 | 26 | 25-23 | 22 | 21-15 | 14-10 | 9-5 | 4-0 |
| opc   | 101   | 1  | 001   | 1  | imm7  | Rt2   | Rn  | Rt  |
|       |       |    |       | L  |       |       |     |     |


32-bit variant 

Applies when opc == 00. 

LDP <St1>, <St2>, [<Xn|SP>], #<imm>

64-bit variant

Applies when opc == 01.

LDP <Dt1>, <Dt2>, [<Xn|SP>], #<imm>

128-bit variant

Applies when opc == 10.

LDP <Qt1>, <Qt2>, [<Xn|SP>], #<imm>

Decode for all variants of this encoding

boolean wback = TRUE;
boolean postindex = TRUE;


Pre-index 


| 31-30 | 29-27 | 26 | 25-23 | 22 | 21-15 | 14-10 | 9-5 | 4-0 |
| opc   | 101   | 1  | 011   | 1  | imm7  | Rt2   | Rn  | Rt  |
|       |       |    |       | L  |       |       |     |     |


32-bit variant 

Applies when opc == 00. 

LDP <St1>, <St2>, [<Xn|SP>, #<imm>]!

64-bit variant

Applies when opc == 01.

LDP <Dt1>, <Dt2>, [<Xn|SP>, #<imm>]!

128-bit variant

Applies when opc == 10.

LDP <Qt1>, <Qt2>, [<Xn|SP>, #<imm>]!

Decode for all variants of this encoding

boolean wback = TRUE;
boolean postindex = FALSE;


Signed offset 


| 31-30 | 29-27 | 26 | 25-23 | 22 | 21-15 | 14-10 | 9-5 | 4-0 |
| opc   | 101   | 1  | 010   | 1  | imm7  | Rt2   | Rn  | Rt  |
|       |       |    |       | L  |       |       |     |     |


32-bit variant

Applies when opc == 00.

LDP <St1>, <St2>, [<Xn|SP>{, #<imm>}]

64-bit variant

Applies when opc == 01.

LDP <Dt1>, <Dt2>, [<Xn|SP>{, #<imm>}]

128-bit variant

Applies when opc == 10.

LDP <Qt1>, <Qt2>, [<Xn|SP>{, #<imm>}]

Decode for all variants of this encoding

boolean wback = FALSE;
boolean postindex = FALSE;

Notes for all encodings

For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see LDP (SIMD&FP) on
page K1-5485, and particularly LDNP (SIMD&FP) on page K1-5484.


Assembler symbols 








<Dt1> Is the 64-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt2> Is the 64-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<Qt1> Is the 128-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt2> Is the 128-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<St1> Is the 32-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<St2> Is the 32-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm> For the 32-bit post-index and 32-bit pre-index variant: is the signed immediate byte offset, a
multiple of 4 in the range -256 to 252, encoded in the "imm7" field as <imm>/4.

For the 32-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 4 in
the range -256 to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4.

For the 64-bit post-index and 64-bit pre-index variant: is the signed immediate byte offset, a
multiple of 8 in the range -512 to 504, encoded in the "imm7" field as <imm>/8.

For the 64-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 8 in
the range -512 to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8.

For the 128-bit post-index and 128-bit pre-index variant: is the signed immediate byte offset, a
multiple of 16 in the range -1024 to 1008, encoded in the "imm7" field as <imm>/16.

For the 128-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 16
in the range -1024 to 1008, defaulting to 0 and encoded in the "imm7" field as <imm>/16.


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer t2 = UInt(Rt2); 

if opc == '11' then UnallocatedEncoding(); 
integer scale = 2 + UInt(opc); 

integer datasize = 8 << scale; 


bits(64) offset = LSL(SignExtend(imm7, 64), scale); 


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data1;
bits(datasize) data2;
constant integer dbytes = datasize DIV 8;
boolean rt_unknown = FALSE;

if t == t2 then
    Constraint c = ConstrainUnpredictable();
    assert c IN {Constraint_UNKNOWN, Constraint_UNDEF, Constraint_NOP};
    case c of
        when Constraint_UNKNOWN rt_unknown = TRUE;    // result is UNKNOWN
        when Constraint_UNDEF   UnallocatedEncoding();
        when Constraint_NOP     EndOfInstruction();

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data1 = Mem[address, dbytes, AccType_VEC];
data2 = Mem[address+dbytes, dbytes, AccType_VEC];
if rt_unknown then
    data1 = bits(datasize) UNKNOWN;
    data2 = bits(datasize) UNKNOWN;
V[t] = data1;
V[t2] = data2;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
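
The three addressing forms differ only in when the offset is applied and whether the base register is written back. The following Python sketch models just that address arithmetic; the function name ldp_addresses and its tuple return value are assumptions for illustration, and the memory accesses themselves are not modelled.

```python
MASK64 = (1 << 64) - 1


def ldp_addresses(base, offset, dbytes, wback, postindex):
    """Return (first access address, second access address, final base value).

    base   -- initial value of Xn or SP
    offset -- signed byte offset (already scaled)
    dbytes -- size of each transferred register in bytes (4, 8 or 16)
    """
    address = base
    if not postindex:
        address = (address + offset) & MASK64   # pre-index and signed offset forms
    first = address
    second = (address + dbytes) & MASK64
    new_base = base
    if wback:
        if postindex:
            address = (address + offset) & MASK64
        new_base = address                      # written back to Xn or SP
    return first, second, new_base
```

With a base of 0x1000, an offset of 32 and 128-bit registers, the post-index form accesses 0x1000 and 0x1010 and leaves 0x1020 in the base register; the pre-index form accesses 0x1020 and 0x1030 and also leaves 0x1020; the signed offset form accesses the same addresses as pre-index but leaves the base unchanged.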
C7.2.166 LDR (immediate, SIMD&FP)
Load SIMD&FP Register (immediate offset). This instruction loads an element from memory, and writes the result 
as a scalar to the SIMD&FP register. The address that is used for the load is calculated from a base register value, 
a signed immediate offset, and an optional offset that is a multiple of the element size. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Post-index 
| 31-30 | 29-27 | 26 | 25-24 | 23-22 | 21 | 20-12 | 11-10 | 9-5 | 4-0 |
| size  | 111   | 1  | 00    | x1    | 0  | imm9  | 01    | Rn  | Rt  |
|       |       |    |       | opc   |    |       |       |     |     |
8-bit variant 
Applies when size == 00 && opc == 01.
LDR <Bt>, [<Xn|SP>], #<simm>
16-bit variant
Applies when size == 01 && opc == 01.
LDR <Ht>, [<Xn|SP>], #<simm>
32-bit variant
Applies when size == 10 && opc == 01.
LDR <St>, [<Xn|SP>], #<simm>
64-bit variant
Applies when size == 11 && opc == 01.
LDR <Dt>, [<Xn|SP>], #<simm>
128-bit variant
Applies when size == 00 && opc == 11.
LDR <Qt>, [<Xn|SP>], #<simm>
Decode for all variants of this encoding
boolean wback = TRUE;
boolean postindex = TRUE;
integer scale = UInt(opc<1>:size);
if scale > 4 then UnallocatedEncoding();
bits(64) offset = SignExtend(imm9, 64);
Pre-index 
| 31-30 | 29-27 | 26 | 25-24 | 23-22 | 21 | 20-12 | 11-10 | 9-5 | 4-0 |
| size  | 111   | 1  | 00    | x1    | 0  | imm9  | 11    | Rn  | Rt  |
|       |       |    |       | opc   |    |       |       |     |     |
8-bit variant 
Applies when size == 00 && opc == 01. 
LDR <Bt>, [<Xn|SP>, #<simm>]!


16-bit variant 
Applies when size == 01 && opc == 01. 


LDR <Ht>, [<Xn|SP>, #<simm>]! 


32-bit variant 
Applies when size == 10 && opc == 01. 


LDR <St>, [<Xn|SP>, #<simm>]! 


64-bit variant 
Applies when size == 11 && opc == 01. 


LDR <Dt>, [<Xn|SP>, #<simm>]! 


128-bit variant 
Applies when size == 00 && opc == 11. 


LDR <Qt>, [<Xn|SP>, #<simm>]! 


Decode for all variants of this encoding 


boolean wback = TRUE;

boolean postindex = FALSE; 

integer scale = UInt(opc<1>:size); 

if scale > 4 then UnallocatedEncoding(); 
bits(64) offset = SignExtend(imm9, 64); 


Unsigned offset 


| 31-30 | 29-27 | 26 | 25-24 | 23-22 | 21-10 | 9-5 | 4-0 |
| size  | 111   | 1  | 01    | x1    | imm12 | Rn  | Rt  |
|       |       |    |       | opc   |       |     |     |


8-bit variant 

Applies when size == 00 && opc == 01. 
LDR <Bt>, [<Xn|SP>{, #<pimm>}] 
16-bit variant 

Applies when size == 01 && opc == 01. 
LDR <Ht>, [<Xn|SP>{, #<pimm>}] 
32-bit variant 

Applies when size == 10 && opc == 01.
LDR <St>, [<Xn|SP>{, #<pimm>}]
64-bit variant

Applies when size == 11 && opc == 01.
LDR <Dt>, [<Xn|SP>{, #<pimm>}]
128-bit variant

Applies when size == 00 && opc == 11.
LDR <Qt>, [<Xn|SP>{, #<pimm>}]

Decode for all variants of this encoding

boolean wback = FALSE;
boolean postindex = FALSE;
integer scale = UInt(opc<1>:size);
if scale > 4 then UnallocatedEncoding();
bits(64) offset = LSL(ZeroExtend(imm12, 64), scale);


Assembler symbols 


<Bt> Is the 8-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 

<Dt> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 

<Ht> Is the 16-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 

<Qt> Is the 128-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 

<St> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 

<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field. 

<pimm> For the 8-bit variant: is the optional positive immediate byte offset, in the range 0 to 4095, defaulting
to 0 and encoded in the "imm12" field.


For the 16-bit variant: is the optional positive immediate byte offset, a multiple of 2 in the range 0 
to 8190, defaulting to 0 and encoded in the "imm12" field as <pimm>/2. 


For the 32-bit variant: is the optional positive immediate byte offset, a multiple of 4 in the range 0 
to 16380, defaulting to 0 and encoded in the "imm12" field as <pimm>/4. 


For the 64-bit variant: is the optional positive immediate byte offset, a multiple of 8 in the range 0 
to 32760, defaulting to 0 and encoded in the "imm12" field as <pimm>/8. 


For the 128-bit variant: is the optional positive immediate byte offset, a multiple of 16 in the range 
0 to 65520, defaulting to 0 and encoded in the "imm12" field as <pimm>/16. 
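
The relationship between the "size" and "opc" fields, the access size, and the <pimm> ranges can be reproduced with a short Python sketch. The function names decode_scale, unsigned_offset and pimm_max are hypothetical helpers for this illustration; the fields are passed as plain integers.

```python
def decode_scale(opc, size):
    """Mirror of scale = UInt(opc<1>:size); scale > 4 is an unallocated encoding."""
    scale = (((opc >> 1) & 1) << 2) | (size & 0b11)
    if scale > 4:
        raise ValueError("unallocated encoding")
    return scale


def unsigned_offset(imm12, scale):
    """Mirror of LSL(ZeroExtend(imm12, 64), scale) for the unsigned offset form."""
    return (imm12 & 0xFFF) << scale


def pimm_max(scale):
    """Largest <pimm> value for a given scale (imm12 all ones)."""
    return unsigned_offset(0xFFF, scale)
```

For scales 0 to 4 (the 8-bit to 128-bit variants), pimm_max gives 4095, 8190, 16380, 32760 and 65520, matching the upper bounds listed for <pimm>.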


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

AccType acctype = AccType_VEC; 

MemOp memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = 8 << scale; 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, acctype] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, acctype];
        V[t] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
C7.2.167 LDR (literal, SIMD&FP)


Load SIMD&FP Register (PC-relative literal). This instruction loads a SIMD&FP register from memory. The 
address that is used for the load is calculated from the PC value and an immediate offset. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31-30 | 29-27 | 26 | 25-24 | 23-5  | 4-0 |
| opc   | 011   | 1  | 00    | imm19 | Rt  |


32-bit variant 
Applies when opc == 00. 


LDR <St>, <label> 


64-bit variant 
Applies when opc == 01. 


LDR <Dt>, <label> 


128-bit variant 
Applies when opc == 10. 


LDR <Qt>, <label> 


Decode for all variants of this encoding 


integer t = UInt(Rt);
integer size;
bits(64) offset;

case opc of
    when '00'
        size = 4;
    when '01'
        size = 8;
    when '10'
        size = 16;
    when '11'
        UnallocatedEncoding();

offset = SignExtend(imm19:'00', 64);


Assembler symbols 

<Dt> Is the 64-bit name of the SIMD&FP register to be loaded, encoded in the "Rt" field. 
<Qt> Is the 128-bit name of the SIMD&FP register to be loaded, encoded in the "Rt" field. 
<St> Is the 32-bit name of the SIMD&FP register to be loaded, encoded in the "Rt" field. 


<label> Is the program label from which the data is to be loaded. Its offset from the address of this 
instruction, in the range +/-1 MB, is encoded as "imm19" times 4. 
Operation

bits(64) address = PC[] + offset;
bits(size*8) data;

CheckFPAdvSIMDEnabled64();

data = Mem[address, size, AccType_VEC];
V[t] = data;
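
The PC-relative offset computed in the decode, and the stated +/-1 MB range of <label>, can be checked with a short Python sketch. The function names literal_offset and literal_address are hypothetical, and the 64-bit wraparound of the address is modelled with a simple mask.

```python
MASK64 = (1 << 64) - 1


def literal_offset(imm19):
    """Mirror of SignExtend(imm19:'00', 64), returned as a signed Python integer."""
    value = (imm19 & 0x7FFFF) << 2       # append two zero bits: a 21-bit quantity
    if value & (1 << 20):                # bit 20 is the sign bit of imm19:'00'
        value -= 1 << 21
    return value


def literal_address(pc, imm19):
    """Address of the literal: address of this instruction plus the signed offset."""
    return (pc + literal_offset(imm19)) & MASK64
```

The extreme encodings give offsets of -1048576 and +1048572 bytes, which is the +/-1 MB range in word-sized steps.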
C7.2.168 LDR (register, SIMD&FP)


Load SIMD&FP Register (register offset). This instruction loads a SIMD&FP register from memory. The address 
that is used for the load is calculated from a base register value and an offset register value. The offset can be 
optionally shifted and extended. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31-30 | 29-27 | 26 | 25-24 | 23-22 | 21 | 20-16 | 15-13  | 12 | 11-10 | 9-5 | 4-0 |
| size  | 111   | 1  | 00    | x1    | 1  | Rm    | option | S  | 10    | Rn  | Rt  |
|       |       |    |       | opc   |    |       |        |    |       |     |     |


8-bit variant
Applies when size == 00 && opc == 01 && option != 011.

LDR <Bt>, [<Xn|SP>, (<Wm>|<Xm>), <extend> {<amount>}]

8-bit variant
Applies when size == 00 && opc == 01 && option == 011.

LDR <Bt>, [<Xn|SP>, <Xm>{, LSL <amount>}]

16-bit variant
Applies when size == 01 && opc == 01.

LDR <Ht>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

32-bit variant
Applies when size == 10 && opc == 01.

LDR <St>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

64-bit variant
Applies when size == 11 && opc == 01.

LDR <Dt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

128-bit variant
Applies when size == 00 && opc == 11.

LDR <Qt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]


Decode for all variants of this encoding 


boolean wback = FALSE;
boolean postindex = FALSE;
integer scale = UInt(opc<1>:size);
if scale > 4 then UnallocatedEncoding();

if option<1> == '0' then UnallocatedEncoding();    // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then scale else 0;


Assembler symbols 


<Bt> Is the 8-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Ht> Is the 16-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt> Is the 128-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<St> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> For the 8-bit variant: is the index extend specifier, encoded in the "option" field. It can have the
following values:
    UXTW when option = 010
    SXTW when option = 110
    SXTX when option = 111

For the 128-bit, 16-bit, 32-bit and 64-bit variant: is the index extend/shift specifier, defaulting to
LSL, and which must be omitted for the LSL option when <amount> is omitted, encoded in the
"option" field. It can have the following values:
    UXTW when option = 010
    LSL when option = 011
    SXTW when option = 110
    SXTX when option = 111
<amount> For the 8-bit variant: is the index shift amount, it must be #0, encoded in "S" as 0 if omitted, or as 1
if present.

For the 16-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
    #0 when S = 0
    #1 when S = 1

For the 32-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
    #0 when S = 0
    #2 when S = 1

For the 64-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
    #0 when S = 0
    #3 when S = 1

For the 128-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where
it is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
    #0 when S = 0
    #4 when S = 1


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer m = UInt(Rm); 

AccType acctype = AccType_VEC; 

MemOp memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = 8 << scale; 


Operation 


bits(64) offset = ExtendReg(m, extend_type, shift);
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, acctype] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, acctype];
        V[t] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
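
The effect of the <extend> and <amount> operands on the index register can be sketched in Python. This is a simplified stand-in for the ExtendReg() step, covering only the four "option" values this instruction allocates; the function name extend_index and its integer arguments are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1


def extend_index(reg_value, option, shift):
    """Extend and shift the index register value, returning a 64-bit offset.

    option -- 0b010 UXTW, 0b011 LSL, 0b110 SXTW, 0b111 SXTX
    shift  -- 0, or the access scale when the S bit is set
    """
    if option not in (0b010, 0b011, 0b110, 0b111):
        raise ValueError("unallocated encoding (option<1> == '0')")
    if option & 1 == 0:
        # option<0> == 0: the index is a 32-bit Wm register
        value = reg_value & 0xFFFFFFFF
        if option & 0b100 and value & 0x80000000:
            value -= 1 << 32             # SXTW: sign-extend from bit 31
    else:
        # option<0> == 1: the index is a 64-bit Xm register, used as-is
        value = reg_value & MASK64
    return (value << shift) & MASK64
```

For example, an index register holding 0xFFFFFFFF with a shift of 2 yields 0x3FFFFFFFC under UXTW but 0xFFFFFFFFFFFFFFFC (that is, -4) under SXTW.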







C7.2.169 LDUR (SIMD&FP)


Load SIMD&FP Register (unscaled offset). This instruction loads a SIMD&FP register from memory. The address 
that is used for the load is calculated from a base register value and an optional immediate offset. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31-30 | 29-27 | 26 | 25-24 | 23-22 | 21 | 20-12 | 11-10 | 9-5 | 4-0 |
| size  | 111   | 1  | 00    | x1    | 0  | imm9  | 00    | Rn  | Rt  |
|       |       |    |       | opc   |    |       |       |     |     |


8-bit variant

Applies when size == 00 && opc == 01.

LDUR <Bt>, [<Xn|SP>{, #<simm>}]

16-bit variant

Applies when size == 01 && opc == 01.

LDUR <Ht>, [<Xn|SP>{, #<simm>}]

32-bit variant

Applies when size == 10 && opc == 01.

LDUR <St>, [<Xn|SP>{, #<simm>}]

64-bit variant

Applies when size == 11 && opc == 01.

LDUR <Dt>, [<Xn|SP>{, #<simm>}]

128-bit variant

Applies when size == 00 && opc == 11.

LDUR <Qt>, [<Xn|SP>{, #<simm>}]

Decode for all variants of this encoding

boolean wback = FALSE;
boolean postindex = FALSE;
integer scale = UInt(opc<1>:size);
if scale > 4 then UnallocatedEncoding();
bits(64) offset = SignExtend(imm9, 64);


Assembler symbols 


<Bt> Is the 8-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 


<Dt> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 


<Ht> Is the 16-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 


<Qt> Is the 128-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 


<St> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field. 







<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 


<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded 
in the "imm9" field. 


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

AccType acctype = AccType_VEC; 

MemOp memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = 8 << scale; 


Operation 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, acctype] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, acctype];
        V[t] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;
C7.2.170 MLA (by element)


Multiply-Add to accumulator (vector, by element). This instruction multiplies the vector elements in the first source
SIMD&FP register by the specified value in the second source SIMD&FP register, and accumulates the results with
the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer
values.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20 | 19-16 | 15 | 14 | 13-12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01111 | size  | L  | M  | Rm    | 0  | 0  | 00    | H  | 0  | Rn  | Rd  |
|    |    |    |       |       |    |    |       |    | o2 |       |    |    |     |     |


Vector variant 


MLA <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean sub_op = (o2 == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The following encodings are reserved:
    • size = 00, Q = x.
    • size = 11, Q = x.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10
The following encodings are reserved:
    • size = 00.
    • size = 11.
Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10
The following encodings are reserved:
    • size = 00.
    • size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10
The following encodings are reserved:
    • size = 00.
    • size = 11.

Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) product;

element2 = UInt(Elem[operand2, index, esize]);
for e = 0 to elements-1
    element1 = UInt(Elem[operand1, e, esize]);
    product = (element1*element2)<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;
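
The by-element multiply-accumulate loop can be sketched in Python, with each vector represented as a list of unsigned lane values. The function name mla_by_element and the list representation are assumptions for illustration; truncation to esize bits is modelled with a mask.

```python
def mla_by_element(acc, vn, vm, index, esize, sub_op=False):
    """Sketch of MLA/MLS (by element) on lists of unsigned lane values.

    acc   -- destination lanes before the operation (operand3)
    vn    -- first source lanes (operand1)
    vm    -- second source lanes; only vm[index] is used (operand2)
    esize -- element size in bits (16 or 32)
    """
    mask = (1 << esize) - 1
    element2 = vm[index] & mask
    result = []
    for a, x in zip(acc, vn):
        product = ((x & mask) * element2) & mask   # keep the low esize bits
        if sub_op:
            result.append((a - product) & mask)    # MLS behaviour
        else:
            result.append((a + product) & mask)    # MLA behaviour
    return result
```

Every lane of the first source is multiplied by the same selected element, and both the product and the accumulated sum wrap modulo 2^esize.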
C7.2.171 MLA (vector)
Multiply-Add to accumulator (vector). This instruction multiplies corresponding elements in the vectors of the two 
source SIMD&FP registers, and accumulates the results with the vector elements of the destination SIMD&FP 
register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 10010 | 1  | Rn  | Rd  |
|    |    | U  |       |       |    |       |       |    |     |     |
Three registers of the same type variant 
MLA <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean sub_op = (U == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 
The encoding size = 11,Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;
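
The per-element behavior of the Operation pseudocode can be modeled in a few lines of Python. This is an illustrative sketch only: registers are represented as plain integers with element 0 in the least significant bits, and the helper names `elems`, `pack`, and `mla_vector` are hypothetical, not part of the architecture pseudocode.

```python
def elems(value, esize, count):
    """Split an integer register image into `count` elements of `esize` bits (element 0 lowest)."""
    mask = (1 << esize) - 1
    return [(value >> (i * esize)) & mask for i in range(count)]


def pack(lanes, esize):
    """Inverse of elems(): pack elements back into one integer."""
    mask = (1 << esize) - 1
    out = 0
    for i, lane in enumerate(lanes):
        out |= (lane & mask) << (i * esize)
    return out


def mla_vector(vd, vn, vm, esize, datasize, sub_op=False):
    """Multiply corresponding elements, then add to (or subtract from) the
    accumulator, keeping only the low esize bits of each result."""
    count = datasize // esize
    mask = (1 << esize) - 1
    result = []
    for acc, a, b in zip(elems(vd, esize, count),
                         elems(vn, esize, count),
                         elems(vm, esize, count)):
        product = (a * b) & mask
        lane = acc - product if sub_op else acc + product
        result.append(lane & mask)  # wrap modulo 2**esize
    return pack(result, esize)
```

With a 4H arrangement (`esize=16`, `datasize=64`), an accumulator element of 0xFFFF plus a product of 2 wraps to 1, which matches the `<esize-1:0>` truncation in the pseudocode.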
C7.2.172 MLS (by element)


Multiply-Subtract from accumulator (vector, by element). This instruction multiplies the vector elements in the first
source SIMD&FP register by the specified value in the second source SIMD&FP register, and subtracts the results
from the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer
values.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20 | 19-16 | 15 | 14 | 13-12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01111 | size  | L  | M  | Rm    | 0  | 1  | 00    | H  | 0  | Rn  | Rd  |
Bit 14 is the o2 field.


Vector variant 


MLS <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean sub_op = (o2 == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 


<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
0:Rm when size = 01
M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.

<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
H when size = 01
S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.

<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
H:L:M when size = 01
H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) product;

element2 = UInt(Elem[operand2, index, esize]);
for e = 0 to elements-1
    element1 = UInt(Elem[operand1, e, esize]);
    product = (element1*element2)<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;
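
How the by-element forms derive the element index and the second source register number from the size, H, L, M, and Rm fields can be sketched as follows. The function name `decode_by_element` is hypothetical, and Rm is assumed to be the 4-bit field from bits 19-16.

```python
def decode_by_element(size, H, L, M, Rm):
    """Return (esize, index, m) for a by-element encoding, or None when size is reserved."""
    if size == 0b01:
        # H elements: 3-bit index H:L:M, Rmhi = 0, so only V0-V15 are reachable.
        index = (H << 2) | (L << 1) | M
        m = Rm
    elif size == 0b10:
        # S elements: 2-bit index H:L, and M supplies the top bit of the register number.
        index = (H << 1) | L
        m = (M << 4) | Rm
    else:
        return None  # size = 00 and size = 11 are reserved
    return 8 << size, index, m
```

The sketch makes the V0-V15 restriction visible: with H elements the M bit is consumed by the index, so it cannot extend the register number.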
C7.2.173 MLS (vector)
Multiply-Subtract from accumulator (vector). This instruction multiplies corresponding elements in the vectors of 
the two source SIMD&FP registers, and subtracts the results from the vector elements of the destination SIMD&FP 
register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | size  | 1  | Rm    | 10010 | 1  | Rn  | Rd  |
Bit 29 is the U field.
Three registers of the same type variant 
MLS <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean sub_op = (U == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;
C7.2.174 MOV (scalar)


Move vector element to scalar. This instruction duplicates the specified vector element in the SIMD&FP source 
register into a scalar, and writes the result to the SIMD&FP destination register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the DUP (element) instruction. This means that: 


• The encodings in this description are named to match the encodings of DUP (element).
• The description of DUP (element) gives the operational pseudocode for this instruction.

| 31-30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 01    | 0  | 11110000 | imm5  | 0  | 0000  | 1  | Rn  | Rd  |


Scalar variant 

MOV <V><d>, <Vn>.<T>[<index>] 
is equivalent to 

DUP <V><d>, <Vn>.<T>[<index>] 


and is always the preferred disassembly. 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is the element width specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<index> Is the element index encoded in the "imm5" field. It can have the following values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
imm5<4> when imm5 = x1000
The encoding imm5 = x0000 is reserved.
Operation


The description of DUP (element) gives the operational pseudocode for this instruction. 
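
The imm5 tables above all follow one rule: the position of the lowest set bit selects the element width, and the bits above it form the index. A minimal sketch of that rule, with the hypothetical helper name `decode_imm5`:

```python
def decode_imm5(imm5):
    """Return (width_letter, index) for a 5-bit imm5 value, or None for the reserved x0000 pattern."""
    for shift, letter in enumerate("BHSD"):
        if imm5 & (1 << shift):
            # Bits above the lowest set bit are the element index.
            return letter, imm5 >> (shift + 1)
    return None  # imm5 = x0000 is reserved
```

For example, imm5 = 10110 has its lowest set bit at position 1, so the width is H and the index is imm5<4:2> = 101.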
C7.2.175 MOV (element)


Move vector element to another vector element. This instruction copies the vector element of the source SIMD&FP 
register to the specified vector element of the destination SIMD&FP register. 


This instruction can insert data into individual elements within a SIMD&FP register without clearing the remaining 
bits to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the INS (element) instruction. This means that: 


• The encodings in this description are named to match the encodings of INS (element).
• The description of INS (element) gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | 1  | 01110000 | imm5  | 0  | imm4  | 1  | Rn  | Rd  |


Advanced SIMD variant 

MOV <Vd>.<Ts>[<index1>], <Vn>.<Ts>[<index2>]
is equivalent to

INS <Vd>.<Ts>[<index1>], <Vn>.<Ts>[<index2>]


and is always the preferred disassembly. 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ts> Is an element size specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<index1> Is the destination element index encoded in the "imm5" field. It can have the following values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
imm5<4> when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<index2> Is the source element index encoded in the "imm5:imm4" field. It can have the following values:
imm4<3:0> when imm5 = xxxx1
imm4<3:1> when imm5 = xxx10
imm4<3:2> when imm5 = xx100
imm4<3> when imm5 = x1000
The encoding imm5 = x0000 is reserved.
Unspecified bits in "imm4" are ignored but should be set to zero by an assembler.


Operation 


The description of INS (element) gives the operational pseudocode for this instruction. 
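
The statement that the remaining bits are not cleared can be illustrated with a small model. This is a sketch under the assumption that registers are plain integers with element 0 in the least significant bits; `ins_element` is a hypothetical name.

```python
def ins_element(vd, vn, esize, index1, index2):
    """Copy element index2 of vn into element index1 of vd, leaving all other elements of vd intact."""
    mask = (1 << esize) - 1
    lane = (vn >> (index2 * esize)) & mask
    cleared = vd & ~(mask << (index1 * esize))  # clear only the destination element
    return cleared | (lane << (index1 * esize))
```

Only the selected destination element changes, which is the difference from the scalar form, where the result is written as a scalar to the destination register.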
C7.2.176 MOV (from general)


Move general-purpose register to a vector element. This instruction copies the contents of the source 
general-purpose register to the specified vector element in the destination SIMD&FP register. 


This instruction can insert data into individual elements within a SIMD&FP register without clearing the remaining 
bits to zero. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the INS (general) instruction. This means that: 


• The encodings in this description are named to match the encodings of INS (general).
• The description of INS (general) gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | 1  | 0  | 01110000 | imm5  | 0  | 0011  | 1  | Rn  | Rd  |


Advanced SIMD variant 

MOV <Vd>.<Ts>[<index>], <R><n> 
is equivalent to 

INS <Vd>.<Ts>[<index>], <R><n> 


and is always the preferred disassembly. 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ts> Is an element size specifier, encoded in the "imm5" field. It can have the following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
D when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<index> Is the element index encoded in the "imm5" field. It can have the following values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
imm5<4> when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<R> Is the width specifier for the general-purpose source register, encoded in the "imm5" field. It can
have the following values:
W when imm5 = xxxx1
W when imm5 = xxx10
W when imm5 = xx100
X when imm5 = x1000
The encoding imm5 = x0000 is reserved.

<n> Is the number [0-30] of the general-purpose source register or ZR (31), encoded in the "Rn" field.


Operation 


The description of INS (general) gives the operational pseudocode for this instruction. 
C7.2.177 MOV (vector)


Move vector. This instruction copies the vector in the source SIMD&FP register into the destination SIMD&FP 
register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the ORR (vector, register) instruction. This means that: 
• The encodings in this description are named to match the encodings of ORR (vector, register).
• The description of ORR (vector, register) gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 10    | 1  | Rm    | 00011 | 1  | Rn  | Rd  |
Bits 23-22 are the size field.


Three registers of the same type variant 
MOV <Vd>.<T>, <Vn>.<T> 

is equivalent to 

ORR <Vd>.<T>, <Vn>.<T>, <Vn>.<T> 


and is the preferred disassembly when Rm == Rn. 


Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 


<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
8B when Q = 0
16B when Q = 1


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 


Operation 


The description of ORR (vector, register) gives the operational pseudocode for this instruction. 
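
The alias rule "preferred disassembly when Rm == Rn" can be sketched as a disassembler decision. The function name and the output formatting are illustrative assumptions, not taken from any real toolchain.

```python
def disassemble_orr_vector(Q, Rm, Rn, Rd):
    """Return assembly text for an ORR (vector, register) encoding, using the MOV alias when Rm == Rn."""
    t = "16B" if Q else "8B"
    if Rm == Rn:
        # OR of a register with itself is a plain copy, so MOV is preferred.
        return f"MOV V{Rd}.{t}, V{Rn}.{t}"
    return f"ORR V{Rd}.{t}, V{Rn}.{t}, V{Rm}.{t}"
```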
C7.2.178 MOV (to general)


Move vector element to general-purpose register. This instruction reads the unsigned integer from the source 
SIMD&FP register, zero-extends it to form a 32-bit or 64-bit value, and writes the result to the destination 
general-purpose register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the UMOV instruction. This means that: 


• The encodings in this description are named to match the encodings of UMOV.
• The description of UMOV gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28-21    | 20-16 | 15 | 14-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110000 | xxx00 | 0  | 0111  | 1  | Rn  | Rd  |
Bits 20-16 are the imm5 field.


32-bit variant 

Applies when Q == 0 && imm5 == xx100. 
MOV <Wd>, <Vn>.S[<index>] 

is equivalent to 

UMOV <Wd>, <Vn>.S[<index>] 


and is always the preferred disassembly. 


64-bit variant 

Applies when Q == 1 && imm5 == x1000. 
MOV <Xd>, <Vn>.D[<index>] 

is equivalent to 

UMOV <Xd>, <Vn>.D[<index>] 


and is always the preferred disassembly. 


Assembler symbols 


<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<index> For the 32-bit variant: is the element index encoded in "imm5<4:3>". 


For the 64-bit variant: is the element index encoded in "imm5<4>". 


Operation 


The description of UMOV gives the operational pseudocode for this instruction. 
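
The read-and-zero-extend behavior described above reduces to a shift and a mask. A minimal sketch, assuming the source register is modeled as a 128-bit integer with element 0 in the least significant bits (`umov` is a hypothetical helper name):

```python
def umov(vn, esize, index):
    """Read element `index` of width `esize` and return it zero-extended."""
    # Python integers are unbounded, so masking alone yields the zero-extended value.
    return (vn >> (index * esize)) & ((1 << esize) - 1)
```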
C7.2.179 MOVI


Move Immediate (vector). This instruction places an immediate constant into every vector element of the
destination SIMD&FP register.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


| 31 | 30 | 29 | 28-19      | 18 | 17 | 16 | 15-12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4-0 |
| 0  | Q  | op | 0111100000 | a  | b  | c  | cmode | 0  | 1  | d | e | f | g | h | Rd  |


8-bit variant
Applies when op == 0 && cmode == 1110.

MOVI <Vd>.<T>, #<imm8>{, LSL #0}


16-bit shifted immediate variant
Applies when op == 0 && cmode == 10x0.

MOVI <Vd>.<T>, #<imm8>{, LSL #<amount>}


32-bit shifted immediate variant
Applies when op == 0 && cmode == 0xx0.

MOVI <Vd>.<T>, #<imm8>{, LSL #<amount>}


32-bit shifting ones variant
Applies when op == 0 && cmode == 110x.

MOVI <Vd>.<T>, #<imm8>, MSL #<amount>


64-bit scalar variant
Applies when Q == 0 && op == 1 && cmode == 1110.

MOVI <Dd>, #<imm>


64-bit vector variant
Applies when Q == 1 && op == 1 && cmode == 1110.

MOVI <Vd>.2D, #<imm>


Decode for all variants of this encoding

integer rd = UInt(Rd);

integer datasize = if Q == '1' then 128 else 64;
bits(datasize) imm;
bits(64) imm64;

ImmediateOp operation;
case cmode:op of
    when '0xx00' operation = ImmediateOp_MOVI;
    when '0xx01' operation = ImmediateOp_MVNI;
    when '0xx10' operation = ImmediateOp_ORR;
    when '0xx11' operation = ImmediateOp_BIC;
    when '10x00' operation = ImmediateOp_MOVI;
    when '10x01' operation = ImmediateOp_MVNI;
    when '10x10' operation = ImmediateOp_ORR;
    when '10x11' operation = ImmediateOp_BIC;
    when '110x0' operation = ImmediateOp_MOVI;
    when '110x1' operation = ImmediateOp_MVNI;
    when '1110x' operation = ImmediateOp_MOVI;
    when '11110' operation = ImmediateOp_MOVI;
    when '11111'
        // FMOV Dn,#imm is in main FP instruction set
        if Q == '0' then UnallocatedEncoding();
        operation = ImmediateOp_MOVI;

imm64 = AdvSIMDExpandImm(op, cmode, a:b:c:d:e:f:g:h);
imm = Replicate(imm64, datasize DIV 64);


Assembler symbols

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<imm> Is a 64-bit immediate 'aaaaaaaabbbbbbbbccccccccddddddddeeeeeeeeffffffffgggggggghhhhhhhh',
encoded in "a:b:c:d:e:f:g:h".

<T> For the 8-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the following
values:
8B when Q = 0
16B when Q = 1
For the 16-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
following values:
4H when Q = 0
8H when Q = 1
For the 32-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
following values:
2S when Q = 0
4S when Q = 1

<imm8> Is an 8-bit immediate encoded in "a:b:c:d:e:f:g:h".

<amount> For the 16-bit shifted immediate variant: is the shift amount encoded in the "cmode<1>" field. It can
have the following values:
0 when cmode<1> = 0
8 when cmode<1> = 1
defaulting to 0 if LSL is omitted.
For the 32-bit shifted immediate variant: is the shift amount encoded in the "cmode<2:1>" field. It
can have the following values:
0 when cmode<2:1> = 00
8 when cmode<2:1> = 01
16 when cmode<2:1> = 10
24 when cmode<2:1> = 11
defaulting to 0 if LSL is omitted.
For the 32-bit shifting ones variant: is the shift amount encoded in the "cmode<0>" field. It can have
the following values:
8 when cmode<0> = 0
16 when cmode<0> = 1
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand;
bits(datasize) result;

case operation of
    when ImmediateOp_MOVI
        result = imm;
    when ImmediateOp_MVNI
        result = NOT(imm);
    when ImmediateOp_ORR
        operand = V[rd];
        result = operand OR imm;
    when ImmediateOp_BIC
        operand = V[rd];
        result = operand AND NOT(imm);

V[rd] = result;
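
The decode calls AdvSIMDExpandImm(), which this section does not define. The sketch below is not that function. It is a reconstruction of the 64-bit pattern implied by the variant and `<amount>` descriptions above, restricted to the cmode values those variants use. The name `expand_imm`, its `None` return for other cmode values, and the reading of "shifting ones" as ones filling the vacated low bits are assumptions.

```python
def expand_imm(op, cmode, imm8):
    """Return the 64-bit pattern for the MOVI variants listed above, or None for other cmode values."""
    def rep(value, width):
        # Replicate a width-bit value across 64 bits.
        out = 0
        for i in range(64 // width):
            out |= value << (i * width)
        return out

    if cmode & 0b1001 == 0b0000:      # 0xx0: 32-bit elements, LSL #0/8/16/24
        return rep(imm8 << (8 * ((cmode >> 1) & 3)), 32)
    if cmode & 0b1101 == 0b1000:      # 10x0: 16-bit elements, LSL #0/8
        return rep(imm8 << (8 * ((cmode >> 1) & 1)), 16)
    if cmode & 0b1110 == 0b1100:      # 110x: 32-bit elements, MSL #8/16 (ones shifted in)
        amount = 16 if cmode & 1 else 8
        return rep((imm8 << amount) | ((1 << amount) - 1), 32)
    if cmode == 0b1110 and op == 0:   # 8-bit variant: imm8 in every byte
        return rep(imm8, 8)
    if cmode == 0b1110 and op == 1:   # 64-bit variants: each imm8 bit expands to a byte of ones
        out = 0
        for bit in range(8):
            if imm8 & (1 << bit):
                out |= 0xFF << (8 * bit)
        return out
    return None
```

For the 128-bit forms the decode then replicates this 64-bit value twice, as in `Replicate(imm64, datasize DIV 64)`.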
C7.2.180 MUL (by element)


Multiply (vector, by element). This instruction multiplies the vector elements in the first source SIMD&FP register
by the specified value in the second source SIMD&FP register, places the results in a vector, and writes the vector
to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20 | 19-16 | 15-12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01111 | size  | L  | M  | Rm    | 1000  | H  | 0  | Rn  | Rd  |


Vector variant 


MUL <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding

integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;


Assembler symbols

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
0:Rm when size = 01
M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.

<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
H when size = 01
S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.

<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
H:L:M when size = 01
H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) product;

element2 = UInt(Elem[operand2, index, esize]);
for e = 0 to elements-1
    element1 = UInt(Elem[operand1, e, esize]);
    product = (element1*element2)<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;
C7.2.181 MUL (vector)


Multiply (vector). This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP 
registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 10011 | 1  | Rn  | Rd  |
Bit 29 is the U field.


Three registers of the same type variant 


MUL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if U == '1' && size != '00' then ReservedValue();
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean poly = (U == '1'); 


Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;
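
The two branches of the loop differ only in how the full-width product is formed before truncation. PolynomialMult() is not defined in this section. The sketch below assumes it is a carry-less multiplication over GF(2), where partial products are combined with XOR instead of addition. The Python names are hypothetical.

```python
def polynomial_mult(a, b, esize):
    """Carry-less product of two esize-bit values (partial products combined with XOR)."""
    result = 0
    for i in range(esize):
        if (a >> i) & 1:
            result ^= b << i
    return result


def mul_lane(a, b, esize, poly=False):
    """One element of the MUL (vector) operation: form the product, keep the low esize bits."""
    mask = (1 << esize) - 1
    full = polynomial_mult(a, b, esize) if poly else a * b
    return full & mask
```

For 8-bit elements, 3 times 3 gives 9 as an integer product but 5 as a polynomial product, because (x + 1)(x + 1) = x² + 1 when coefficients are reduced modulo 2.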
C7.2.182 MVN


Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the
inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the NOT instruction. This means that: 
• The encodings in this description are named to match the encodings of NOT.
• The description of NOT gives the operational pseudocode for this instruction.

| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 00    | 10000 | 00101 | 10    | Rn  | Rd  |


Vector variant 

MVN <Vd>.<T>, <Vn>.<T> 
is equivalent to 

NOT <Vd>.<T>, <Vn>.<T> 


and is always the preferred disassembly. 


Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 


<T> Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
8B when Q = 0 
16B when Q = 1 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation 


The description of NOT gives the operational pseudocode for this instruction. 
C7.2.183 MVNI


Move inverted Immediate (vector). This instruction places the inverse of an immediate constant into every vector
element of the destination SIMD&FP register.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


| 31 | 30 | 29 | 28-19      | 18 | 17 | 16 | 15-12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4-0 |
| 0  | Q  | 1  | 0111100000 | a  | b  | c  | cmode | 0  | 1  | d | e | f | g | h | Rd  |
Bit 29 is the op field.


16-bit shifted immediate variant
Applies when cmode == 10x0.

MVNI <Vd>.<T>, #<imm8>{, LSL #<amount>}


32-bit shifted immediate variant
Applies when cmode == 0xx0.

MVNI <Vd>.<T>, #<imm8>{, LSL #<amount>}


32-bit shifting ones variant
Applies when cmode == 110x.

MVNI <Vd>.<T>, #<imm8>, MSL #<amount>


Decode for all variants of this encoding 
integer rd = UInt(Rd); 


integer datasize = if Q == '1' then 128 else 64; 
bits(datasize) imm; 
bits(64) imm64; 


ImmediateOp operation; 
case cmode:op of
    when '0xx01' operation = ImmediateOp_MVNI;
    when '0xx11' operation = ImmediateOp_BIC;
    when '10x01' operation = ImmediateOp_MVNI;
    when '10x11' operation = ImmediateOp_BIC;
    when '110x1' operation = ImmediateOp_MVNI;
    when '1110x' operation = ImmediateOp_MOVI;
    when '11111'
        // FMOV Dn,#imm is in main FP instruction set
        if Q == '0' then UnallocatedEncoding();
        operation = ImmediateOp_MOVI;


imm64 = AdvSIMDExpandImm(op, cmode, a:b:c:d:e:f:g:h); 
imm = Replicate(imm64, datasize DIV 64); 


Assembler symbols

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> For the 16-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
following values:
4H when Q = 0
8H when Q = 1
For the 32-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
following values:
2S when Q = 0
4S when Q = 1

<imm8> Is an 8-bit immediate encoded in "a:b:c:d:e:f:g:h".

<amount> For the 16-bit shifted immediate variant: is the shift amount encoded in the "cmode<1>" field. It can
have the following values:
0 when cmode<1> = 0
8 when cmode<1> = 1
defaulting to 0 if LSL is omitted.
For the 32-bit shifted immediate variant: is the shift amount encoded in the "cmode<2:1>" field. It
can have the following values:
0 when cmode<2:1> = 00
8 when cmode<2:1> = 01
16 when cmode<2:1> = 10
24 when cmode<2:1> = 11
defaulting to 0 if LSL is omitted.
For the 32-bit shifting ones variant: is the shift amount encoded in the "cmode<0>" field. It can have
the following values:
8 when cmode<0> = 0
16 when cmode<0> = 1


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand;
bits(datasize) result;

case operation of
    when ImmediateOp_MOVI
        result = imm;
    when ImmediateOp_MVNI
        result = NOT(imm);
    when ImmediateOp_ORR
        operand = V[rd];
        result = operand OR imm;
    when ImmediateOp_BIC
        operand = V[rd];
        result = operand AND NOT(imm);

V[rd] = result;
C7.2.184 NEG (vector)


Negate (vector). This instruction reads each vector element from the source SIMD&FP register, negates each value, 
puts the result into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31-30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 01    | 1  | 11110 | size  | 10000 | 01011 | 10    | Rn  | Rd  |
Bit 29 is the U field.


Scalar variant 


NEG <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 


integer elements = 1; 
boolean neg = (U == '1'); 


Vector 
| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | size  | 10000 | 01011 | 10    | Rn  | Rd  |
Bit 29 is the U field.


Vector variant 


NEG <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '110' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean neg = (U == '1');


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11
The following encodings are reserved:
• size = 0x.
• size = 10.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;


V[d] = result; 
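
The per-element behavior of the negate path can be sketched in Python. This is an illustrative model only, not part of the architecture pseudocode: the function name neg_lanes and the representation of a vector as a list of unsigned lane values are assumptions made for the example. It shows that the result is truncated to esize bits, so the most negative value negates to itself.

```python
def neg_lanes(lanes, esize):
    """Negate each lane, keeping only the low esize bits (two's complement).

    lanes: list of unsigned lane values, each in the range 0 .. 2**esize - 1.
    (Hypothetical helper for illustration; not an architectural function.)
    """
    mask = (1 << esize) - 1
    # Negating the unsigned bit pattern modulo 2**esize gives the same bits
    # as SInt(), negate, then truncate to <esize-1:0>.
    return [(-x) & mask for x in lanes]


if __name__ == "__main__":
    # 0x80 is -128 as a signed byte; +128 does not fit, so it wraps to 0x80.
    print([hex(v) for v in neg_lanes([0x01, 0x00, 0x80, 0xFF], 8)])
```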
C7.2.185 NOT


Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the 
inverse of each value into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias MVN. The alias is always the preferred disassembly. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 00    | 10000 | 00101 | 10    | Rn  | Rd  |


Vector variant 


NOT <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer esize = 8; 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV 8; 


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
            8B      when Q = 0
            16B     when Q = 1

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.

Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = NOT(element);


V[d] = result; 
C7.2.186 ORN (vector)


Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source 
SIMD&FP registers, and writes the result to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22     | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 11 (size) | 1  | Rm    | 00011 | 1  | Rn  | Rd  |


Three registers of the same type variant 


ORN <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
            8B      when Q = 0
            16B     when Q = 1

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);
result = operand1 OR operand2;


V[d] = result; 
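
As a rough illustration of the operation above, the following Python sketch models the registers as unsigned integers of datasize bits. The function name orn and the integer representation are assumptions for the example; the explicit mask stands in for the fixed width of bits(datasize).

```python
def orn(operand1, operand2, datasize=128):
    """Bitwise OR of operand1 with the complement of operand2.

    Both operands are unsigned integers that fit in datasize bits.
    (Hypothetical helper for illustration only.)
    """
    mask = (1 << datasize) - 1
    # Python integers are unbounded, so the complement is masked to datasize.
    return (operand1 | (~operand2 & mask)) & mask


if __name__ == "__main__":
    # Only the low four bits of the second operand are clear, so only
    # those bits are set in the result.
    print(hex(orn(0x0, 0xFFFFFFFFFFFFFFF0, datasize=64)))
```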
C7.2.187 ORR (vector, immediate)


Bitwise inclusive OR (vector, immediate). This instruction reads each vector element from the destination 
SIMD&FP register, performs a bitwise OR between each result and an immediate constant, places the result into a 
vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29     | 28-19      | 18 | 17 | 16 | 15-12        | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4-0 |
| 0  | Q  | 0 (op) | 0111100000 | a  | b  | c  | xxx1 (cmode) | 0  | 1  | d | e | f | g | h | Rd  |


16-bit variant


Applies when cmode == 10x1.


ORR <Vd>.<T>, #<imm8>{, LSL #<amount>}


32-bit variant


Applies when cmode == 0xx1.


ORR <Vd>.<T>, #<imm8>{, LSL #<amount>}


Decode for all variants of this encoding 


integer rd = UInt(Rd);

integer datasize = if Q == '1' then 128 else 64;
bits(datasize) imm;
bits(64) imm64;

ImmediateOp operation;
case cmode:op of
    when '0xx00' operation = ImmediateOp_MOVI;
    when '0xx10' operation = ImmediateOp_ORR;
    when '10x00' operation = ImmediateOp_MOVI;
    when '10x10' operation = ImmediateOp_ORR;
    when '110x0' operation = ImmediateOp_MOVI;
    when '1110x' operation = ImmediateOp_MOVI;
    when '11110' operation = ImmediateOp_MOVI;

imm64 = AdvSIMDExpandImm(op, cmode, a:b:c:d:e:f:g:h);
imm = Replicate(imm64, datasize DIV 64);


Assembler symbols


<Vd>    Is the name of the SIMD&FP register, encoded in the "Rd" field.

<T>     For the 16-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
        following values:
            4H      when Q = 0
            8H      when Q = 1
        For the 32-bit variant: is an arrangement specifier, encoded in the "Q" field. It can have the
        following values:
            2S      when Q = 0
            4S      when Q = 1

<imm8>  Is an 8-bit immediate encoded in "a:b:c:d:e:f:g:h".

<amount>
        For the 16-bit variant: is the shift amount encoded in the "cmode<1>" field. It can have the
        following values:
            0       when cmode<1> = 0
            8       when cmode<1> = 1
        defaulting to 0 if LSL is omitted.
        For the 32-bit variant: is the shift amount encoded in the "cmode<2:1>" field. It can have the
        following values:
            0       when cmode<2:1> = 00
            8       when cmode<2:1> = 01
            16      when cmode<2:1> = 10
            24      when cmode<2:1> = 11
        defaulting to 0 if LSL is omitted.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand;
bits(datasize) result;

case operation of
    when ImmediateOp_MOVI
        result = imm;
    when ImmediateOp_MVNI
        result = NOT(imm);
    when ImmediateOp_ORR
        operand = V[rd];
        result = operand OR imm;
    when ImmediateOp_BIC
        operand = V[rd];
        result = operand AND NOT(imm);

V[rd] = result;
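
The effect of the 16-bit and 32-bit variants can be approximated with the Python sketch below. It is a simplified model, not the architecture definition: it takes the already decoded imm8, shift amount and element size directly instead of going through cmode and AdvSIMDExpandImm, and the name orr_imm and the integer register representation are assumptions for the example.

```python
def orr_imm(vd, imm8, amount, esize, datasize):
    """OR (imm8 << amount) into every esize-bit lane of a datasize-bit value.

    vd:     current register contents as an unsigned integer
    imm8:   8-bit immediate (0..255)
    amount: left shift, 0 or 8 for esize 16; 0, 8, 16 or 24 for esize 32
    (Hypothetical helper; cmode decoding is deliberately skipped.)
    """
    lane_imm = (imm8 << amount) & ((1 << esize) - 1)
    imm = 0
    # Replicate the shifted immediate across all lanes.
    for e in range(datasize // esize):
        imm |= lane_imm << (e * esize)
    return vd | imm


if __name__ == "__main__":
    # Models ORR V0.4H, #0x12, LSL #8 applied to an all-zero register.
    print(hex(orr_imm(0, 0x12, 8, 16, 64)))
```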
C7.2.188 ORR (vector, register)

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP
registers, and writes the result to the destination SIMD&FP register.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.

This instruction is used by the alias MOV (vector). See Alias conditions for details of when each alias is preferred.

| 31 | 30 | 29 | 28-24 | 23-22     | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 10 (size) | 1  | Rm    | 00011 | 1  | Rn  | Rd  |

Three registers of the same type variant 

ORR <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 

Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 

Alias conditions 

Alias is preferred when 
MOV (vector) Rm == Rn 

Assembler symbols

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
            8B      when Q = 0
            16B     when Q = 1

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.

Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
result = operand1 OR operand2;

V[d] = result; 
C7.2.189 PMUL


Polynomial Multiply. This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP 
registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. 


For information about multiplying polynomials see Polynomial arithmetic over {0, 1} on page A1-45. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29    | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1 (U) | 01110 | size  | 1  | Rm    | 10011 | 1  | Rn  | Rd  |


Three registers of the same type variant 


PMUL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if U == '1' && size != '00' then ReservedValue();
if size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean poly = (U == '1');


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
            8B      when size = 00, Q = 0
            16B     when size = 00, Q = 1
        The following encodings are reserved:
        •   size = 01, Q = x.
        •   size = 1x, Q = x.

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1) * UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;


V[d] = result; 
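
The polynomial path of the operation can be illustrated with a short Python sketch of carry-less multiplication over {0, 1}. The helper names polynomial_mult and pmul, and the use of lists of lane values, are assumptions for this example; polynomial_mult is a plain shift-and-XOR model and is not claimed to match the manual's PolynomialMult() definition line for line.

```python
def polynomial_mult(a, b, esize):
    """Carry-less product of two esize-bit values (up to 2*esize bits wide).

    Partial products are combined with XOR instead of addition, so no
    carries propagate between bit positions.
    """
    result = 0
    for i in range(esize):
        if (a >> i) & 1:
            result ^= b << i
    return result


def pmul(vn_lanes, vm_lanes, esize=8):
    """Per-lane polynomial multiply, keeping the low esize bits of each product."""
    mask = (1 << esize) - 1
    return [polynomial_mult(a, b, esize) & mask
            for a, b in zip(vn_lanes, vm_lanes)]


if __name__ == "__main__":
    # (x + 1) * (x + 1) = x^2 + 1 over {0, 1}: 0b11 * 0b11 = 0b101.
    print(pmul([0b11, 0x80], [0b11, 0x02]))
```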
C7.2.190 PMULL, PMULL2


Polynomial Multiply Long. This instruction multiplies corresponding elements in the lower or upper half of the
vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination 
SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. 


For information about multiplying polynomials see Polynomial arithmetic over {0, 1} on page A1-45. 


The PMULL instruction extracts each source vector from the lower half of each source register, while the PMULL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 1110  | 00    | Rn  | Rd  |


Three registers, not all the same type variant 


PMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '01' || size == '10' then ReservedValue();
if size == '11' && !HaveCryptoExt() then UnallocatedEncoding();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;


Assembler symbols


2       Is the second and upper half specifier. If present it causes the operation to be performed on the upper
        64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
        the following values:
            [absent]    when Q = 0
            [present]   when Q = 1

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Ta>    Is an arrangement specifier, encoded in the "size" field. It can have the following values:
            8H      when size = 00
            1Q      when size = 11
        The following encodings are reserved:
        •   size = 01.
        •   size = 10.
        The '1Q' arrangement is only allocated in an implementation that includes the Cryptographic
        Extension, and is otherwise RESERVED.

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Tb>    Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
            8B      when size = 00, Q = 0
            16B     when size = 00, Q = 1
            1D      when size = 11, Q = 0
            2D      when size = 11, Q = 1
        The following encodings are reserved:
        •   size = 01, Q = x.
        •   size = 10, Q = x.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, 2*esize] = PolynomialMult(element1, element2);


V[d] = result; 
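
A possible Python model of the widening behavior is sketched below. The registers are represented as 128-bit unsigned integers, and the name pmull and its upper flag (standing in for the "2" suffix and the part value) are assumptions for the example. Each source element yields a product twice as wide, so no bits are discarded.

```python
def pmull(vn, vm, esize=8, upper=False):
    """Polynomial multiply long on 128-bit register values held as integers.

    upper=False models PMULL (lower 64 bits of each source);
    upper=True models PMULL2 (upper 64 bits of each source).
    (Hypothetical helper for illustration only.)
    """
    datasize = 64
    shift = datasize if upper else 0
    operand1 = (vn >> shift) & ((1 << datasize) - 1)
    operand2 = (vm >> shift) & ((1 << datasize) - 1)
    lane_mask = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        element1 = (operand1 >> (e * esize)) & lane_mask
        element2 = (operand2 >> (e * esize)) & lane_mask
        # Carry-less multiply: XOR together the shifted partial products.
        product = 0
        for i in range(esize):
            if (element1 >> i) & 1:
                product ^= element2 << i
        # Destination elements are 2*esize bits wide.
        result |= product << (e * 2 * esize)
    return result


if __name__ == "__main__":
    print(hex(pmull(0xFF, 0xFF)))  # one 8-bit lane pair, widened to 16 bits
```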
C7.2.191 RADDHN, RADDHN2


Rounding Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP 
register to the corresponding vector element in the second source SIMD&FP register, places the most significant 
half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. 


The results are rounded. For truncated results, see ADDHN, ADDHN2. 


The RADDHN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the RADDHN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29    | 28-24 | 23-22 | 21 | 20-16 | 15-14 | 13     | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1 (U) | 01110 | size  | 1  | Rm    | 01    | 0 (o1) | 0  | 00    | Rn  | Rd  |


Three registers, not all the same type variant 


RADDHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean round = (U == '1');


Assembler symbols


2       Is the second and upper half specifier. If present it causes the operation to be performed on the upper
        64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
        the following values:
            [absent]    when Q = 0
            [present]   when Q = 1

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb>    Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
            8B      when size = 00, Q = 0
            16B     when size = 00, Q = 1
            4H      when size = 01, Q = 0
            8H      when size = 01, Q = 1
            2S      when size = 10, Q = 0
            4S      when size = 10, Q = 1
        The encoding size = 11, Q = x is reserved.

<Vn>    Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Ta>    Is an arrangement specifier, encoded in the "size" field. It can have the following values:
            8H      when size = 00
            4S      when size = 01
            2D      when size = 10
        The encoding size = 11 is reserved.

<Vm>    Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;


Vpart[d, part] = result; 
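
The rounding step can be made concrete with a small Python sketch. It models the source vectors as lists of unsigned 2*esize-bit values; the function name round_high_narrow and its subtract flag (which stands in for sub_op, so the same helper also covers the subtracting form) are assumptions for the example. Adding 1 << (esize - 1) before taking the upper half rounds the result to nearest instead of truncating.

```python
def round_high_narrow(vn_lanes, vm_lanes, esize, subtract=False):
    """Add (or subtract) wide lanes and return the rounded upper halves.

    vn_lanes, vm_lanes: unsigned values, each 2*esize bits wide.
    Returns a list of esize-bit results.
    (Hypothetical helper for illustration only.)
    """
    wide_mask = (1 << (2 * esize)) - 1
    round_const = 1 << (esize - 1)
    out = []
    for a, b in zip(vn_lanes, vm_lanes):
        total = a - b if subtract else a + b
        # The sum wraps at 2*esize bits, as bits(2*esize) arithmetic does.
        total = (total + round_const) & wide_mask
        out.append(total >> esize)
    return out


if __name__ == "__main__":
    # 0x1234 + 0x0080 = 0x12B4; truncation would give 0x12, rounding gives 0x13.
    print([hex(v) for v in round_high_narrow([0x1234, 0xFFFF], [0x0080, 0x0001], 8)])
```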
C7.2.192 RBIT (vector)


Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses
the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 01110 | 01    | 10000 | 00101 | 10    | Rn  | Rd  |


Vector variant 


RBIT <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 8;
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV 8;


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "Q" field. It can have the following values:
            8B      when Q = 0
            16B     when Q = 1

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.

Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;


V[d] = result; 
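
The inner loop above maps bit i of each byte to bit 7-i. A minimal Python sketch follows, assuming the vector is given as a list of byte values; the names rbit_byte and rbit are hypothetical and chosen only for this example.

```python
def rbit_byte(value):
    """Reverse the bit order of one 8-bit value."""
    rev = 0
    for i in range(8):
        if (value >> i) & 1:
            rev |= 1 << (7 - i)
    return rev


def rbit(byte_lanes):
    """Apply the bit reversal independently to every byte lane."""
    return [rbit_byte(b) for b in byte_lanes]


if __name__ == "__main__":
    print([hex(v) for v in rbit([0x01, 0x80, 0xF0, 0xA5])])
```

Note that the reversal is confined to each byte: the order of the bytes within the vector is unchanged.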
C7.2.193 REV16 (vector)


Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword 
of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination 
SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29    | 28-24 | 23-22 | 21-17 | 16-13 | 12     | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0 (U) | 01110 | size  | 10000 | 0000  | 1 (o0) | 10    | Rn  | Rd  |


Vector variant 


REV16 <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


// size=esize:   B(0),  H(1),  S(1), D(S)
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;

// op=REVx: 64(0), 32(1), 16(2)
bits(2) op = o0:U;

// => op+size:
//    64+B = 0, 64+H = 1, 64+S = 2, 64+D = X
//    32+B = 1, 32+H = 2, 32+S = X, 32+D = X
//    16+B = 2, 16+H = X, 16+S = X, 16+D = X
//    8+B  = X, 8+H  = X, 8+S  = X, 8+D  = X
// => 3-(op+size) (index bits in group)
//    64+B = 3, 64+H = 2, 64+S = 1, 64+D = X
//    32+B = 2, 32+H = 1, 32+S = X, 32+D = X
//    16+B = 1, 16+H = X, 16+S = X, 16+D = X
//    8+B  = X, 8+H  = X, 8+S  = X, 8+D  = X

// index bits within group: 1, 2, 3
if UInt(op) + UInt(size) >= 3 then UnallocatedEncoding();

integer container_size;
case op of
    when '10' container_size = 16;
    when '01' container_size = 32;
    when '00' container_size = 64;

integer containers = datasize DIV container_size;
integer elements_per_container = container_size DIV esize;


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
            8B      when size = 00, Q = 0
            16B     when size = 00, Q = 1
        The following encodings are reserved:
        •   size = 01, Q = x.
        •   size = 1x, Q = x.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;


V[d] = result; 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C7-1145 
1ID092916 Non-Confidential 




C7.2.194 REV32 (vector) 


Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each 
word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the 
destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29    | 28-24 | 23-22 | 21-17 | 16-13 | 12     | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1 (U) | 01110 | size  | 10000 | 0000  | 0 (o0) | 10    | Rn  | Rd  |


Vector variant 


REV32 <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


// size=esize:   B(0),  H(1),  S(1), D(S)
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;

// op=REVx: 64(0), 32(1), 16(2)
bits(2) op = o0:U;

// => op+size:
//    64+B = 0, 64+H = 1, 64+S = 2, 64+D = X
//    32+B = 1, 32+H = 2, 32+S = X, 32+D = X
//    16+B = 2, 16+H = X, 16+S = X, 16+D = X
//    8+B  = X, 8+H  = X, 8+S  = X, 8+D  = X
// => 3-(op+size) (index bits in group)
//    64+B = 3, 64+H = 2, 64+S = 1, 64+D = X
//    32+B = 2, 32+H = 1, 32+S = X, 32+D = X
//    16+B = 1, 16+H = X, 16+S = X, 16+D = X
//    8+B  = X, 8+H  = X, 8+S  = X, 8+D  = X

// index bits within group: 1, 2, 3
if UInt(op) + UInt(size) >= 3 then UnallocatedEncoding();

integer container_size;
case op of
    when '10' container_size = 16;
    when '01' container_size = 32;
    when '00' container_size = 64;

integer containers = datasize DIV container_size;
integer elements_per_container = container_size DIV esize;


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
            8B      when size = 00, Q = 0
            16B     when size = 00, Q = 1
            4H      when size = 01, Q = 0
            8H      when size = 01, Q = 1
        The encoding size = 1x, Q = x is reserved.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;


V[d] = result; 
C7.2.195 REV64


Reverse elements in 64-bit doublewords (vector). This instruction reverses the order of 8-bit, 16-bit, or 32-bit 
elements in each doubleword of the vector in the source SIMD&FP register, places the results into a vector, and 
writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29    | 28-24 | 23-22 | 21-17 | 16-13 | 12     | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0 (U) | 01110 | size  | 10000 | 0000  | 0 (o0) | 10    | Rn  | Rd  |


Vector variant 


REV64 <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


// size=esize:   B(0),  H(1),  S(1), D(S)
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;

// op=REVx: 64(0), 32(1), 16(2)
bits(2) op = o0:U;

// => op+size:
//    64+B = 0, 64+H = 1, 64+S = 2, 64+D = X
//    32+B = 1, 32+H = 2, 32+S = X, 32+D = X
//    16+B = 2, 16+H = X, 16+S = X, 16+D = X
//    8+B  = X, 8+H  = X, 8+S  = X, 8+D  = X
// => 3-(op+size) (index bits in group)
//    64+B = 3, 64+H = 2, 64+S = 1, 64+D = X
//    32+B = 2, 32+H = 1, 32+S = X, 32+D = X
//    16+B = 1, 16+H = X, 16+S = X, 16+D = X
//    8+B  = X, 8+H  = X, 8+S  = X, 8+D  = X

// index bits within group: 1, 2, 3
if UInt(op) + UInt(size) >= 3 then UnallocatedEncoding();

integer container_size;
case op of
    when '10' container_size = 16;
    when '01' container_size = 32;
    when '00' container_size = 64;

integer containers = datasize DIV container_size;
integer elements_per_container = container_size DIV esize;


Assembler symbols


<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T>     Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
            8B      when size = 00, Q = 0
            16B     when size = 00, Q = 1
            4H      when size = 01, Q = 0
            8H      when size = 01, Q = 1
            2S      when size = 10, Q = 0
            4S      when size = 10, Q = 1
        The encoding size = 11, Q = x is reserved.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;


V[d] = result; 
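
The container-based reversal shared by REV16, REV32 and REV64 can be sketched in Python as follows. The vector is modeled as a list of elements in ascending index order; the function name rev_elements and its parameters are assumptions for the example, with container_size set to 16, 32 or 64 to select the instruction being modeled.

```python
def rev_elements(lanes, esize, container_size):
    """Reverse the order of esize-bit elements within each container.

    lanes:          elements in ascending index order (element 0 first)
    esize:          element size in bits
    container_size: 16, 32 or 64, and a multiple of esize
    (Hypothetical helper for illustration only.)
    """
    per_container = container_size // esize
    result = []
    for start in range(0, len(lanes), per_container):
        # Elements never move between containers, only within one.
        result.extend(reversed(lanes[start:start + per_container]))
    return result


if __name__ == "__main__":
    lanes = list(range(8))               # eight byte elements, a 64-bit vector
    print(rev_elements(lanes, 8, 16))    # REV16-style reversal
    print(rev_elements(lanes, 8, 32))    # REV32-style reversal
    print(rev_elements(lanes, 8, 64))    # REV64-style reversal
```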
C7.2.196 RSHRN, RSHRN2


Rounding Shift Right Narrow (immediate). This instruction reads each unsigned integer value from the vector in
the source SIMD&FP register, right shifts each result by an immediate value, writes the final result to a vector, and
writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements
are half as long as the source vector elements. The results are rounded. For truncated results, see SHRN, SHRN2.


The RSHRN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the RSHRN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-23  | 22-19          | 18-16 | 15-12 | 11     | 10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 011110 | immh (!= 0000) | immb  | 1000  | 1 (op) | 1  | Rn  | Rd  |


Vector variant 


RSHRN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');


Assembler symbols


2       Is the second and upper half specifier. If present it causes the operation to be performed on the upper
        64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
        the following values:
            [absent]    when Q = 0
            [present]   when Q = 1

<Vd>    Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb>    Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
            8B      when immh = 0001, Q = 0
            16B     when immh = 0001, Q = 1
            4H      when immh = 001x, Q = 0
            8H      when immh = 001x, Q = 1
            2S      when immh = 01xx, Q = 0
            4S      when immh = 01xx, Q = 1
        See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
        The encoding immh = 1xxx, Q = x is reserved.

<Vn>    Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta>    Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
            8H      when immh = 0001
            4S      when immh = 001x
            2D      when immh = 01xx
        See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
        The encoding immh = 1xxx is reserved.

<shift> Is the right shift amount, in the range 1 to the destination element width in bits, encoded in the
        "immh:immb" field. It can have the following values:
            (16-UInt(immh:immb))    when immh = 0001
            (32-UInt(immh:immb))    when immh = 001x
            (64-UInt(immh:immb))    when immh = 01xx
        See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
        The encoding immh = 1xxx is reserved.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;
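
A small Python sketch may help to show how the rounding constant and the final truncation interact. The source vector is modeled as a list of unsigned 2*esize-bit values, and the name rshrn_lanes is an assumption for the example. Because only the low esize bits of the shifted value are kept, a value that rounds up past the top of the narrow range wraps rather than saturates.

```python
def rshrn_lanes(lanes, shift, esize):
    """Rounding shift right narrow on a list of unsigned wide lane values.

    lanes: unsigned source elements, each 2*esize bits wide
    shift: right shift amount, in the range 1..esize
    esize: destination element size in bits
    (Hypothetical helper for illustration only.)
    """
    round_const = 1 << (shift - 1)
    mask = (1 << esize) - 1
    # Add half of the discarded range, shift, then keep the low esize bits.
    return [((x + round_const) >> shift) & mask for x in lanes]


if __name__ == "__main__":
    # 0x0180 >> 8 rounds up to 2, 0x017F >> 8 rounds down to 1,
    # and 0xFFFF rounds up to 0x100, which truncates to 0.
    print(rshrn_lanes([0x0180, 0x017F, 0xFFFF], 8, 8))
```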
C7.2.197 RSUBHN, RSUBHN2


Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source
SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most
significant half of the result into a vector, and writes the vector to the lower or upper half of the destination
SIMD&FP register.


The results are rounded. For truncated results, see SUBHN, SUBHN2.


The RSUBHN instruction writes the vector to the lower half of the destination register and clears the upper half, while
the RSUBHN2 instruction writes the vector to the upper half of the destination register without affecting the other bits
of the register.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29    | 28-24 | 23-22 | 21 | 20-16 | 15-14 | 13     | 12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 1 (U) | 01110 | size  | 1  | Rm    | 01    | 1 (o1) | 0  | 00    | Rn  | Rd  |


Three registers, not all the same type variant 


RSUBHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 


integer datasize = 64; 
integer part = UInt(Q); 


integer elements = datasize DIV esize; 


boolean sub_op = (o1 == '1');
boolean round = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
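As an informal illustration of how the "size" and "size:Q" fields select the arrangement specifiers listed above, the following Python sketch maps field values to the narrow (<Tb>) and wide (<Ta>) specifiers. The function names are hypothetical and are not part of the architecture pseudocode.

```python
def narrow_arrangement(size, q):
    """Return the <Tb> specifier for a 2-bit size field and the Q bit."""
    if size == 0b11:
        raise ValueError("size = 11 is reserved")
    esize = 8 << size                    # narrow element width in bits
    lanes = (128 if q else 64) // esize  # Q selects a 64-bit or 128-bit vector
    return f"{lanes}{'BHS'[size]}"


def wide_arrangement(size):
    """Return the <Ta> specifier; wide elements are twice the narrow width."""
    if size == 0b11:
        raise ValueError("size = 11 is reserved")
    esize = 16 << size                   # wide element width in bits
    return f"{128 // esize}{'HSD'[size]}"
```

For example, size = 01 with Q = 1 selects the 8H narrow arrangement and the 4S wide arrangement.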


Operation 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;
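The per-element arithmetic of the pseudocode above can be sketched in Python as follows. This is a minimal model under stated assumptions: vectors are plain lists of unsigned element values, the function name rsubhn is hypothetical, and the placement of the result in the lower or upper half of the destination register (Vpart) is not modelled.

```python
def rsubhn(vn, vm, esize):
    """Rounding subtract, keeping the most significant half of each result.

    vn, vm: lists of unsigned 2*esize-bit element values.
    Returns a list of esize-bit results.
    """
    wide_mask = (1 << (2 * esize)) - 1
    round_const = 1 << (esize - 1)       # half of the discarded range
    out = []
    for a, b in zip(vn, vm):
        # Wrap to 2*esize bits, as the bitvector arithmetic does
        total = (a - b + round_const) & wide_mask
        out.append(total >> esize)       # sum<2*esize-1:esize>
    return out
```

With esize = 8, the difference 0x1280 rounds up to 0x13, whereas a truncating subtract (SUBHN) would return 0x12.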
C7.2.198 SABA
Signed Absolute difference and Accumulate. This instruction subtracts the elements of the vector of the second 
source SIMD&FP register from the corresponding elements of the first source SIMD&FP register, and accumulates 
the absolute values of the results into the elements of the vector of the destination SIMD&FP register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-12 | 11 | 10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 0111  | 1  | 1  | Rn  | Rd
    |    | U  |       |       |    |       |       | ac |    |     |
Three registers of the same type variant 
SABA <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean accumulate = (ac == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 
The encoding size = 11, Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;
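The following Python sketch models the element loop above for the signed case. It assumes vectors are lists of unsigned esize-bit element values; the names to_signed and saba are hypothetical helpers introduced only for illustration.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saba(vd, vn, vm, esize, accumulate=True):
    """Signed absolute difference, optionally accumulated into vd."""
    mask = (1 << esize) - 1
    out = []
    for d, n, m in zip(vd, vn, vm):
        absdiff = abs(to_signed(n, esize) - to_signed(m, esize)) & mask
        base = d if accumulate else 0
        out.append((base + absdiff) & mask)  # accumulation wraps at esize bits
    return out
```

Note that the absolute difference of two signed 8-bit values can be as large as 255, so it is held as an unsigned esize-bit quantity before being added to the accumulator.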
C7.2.199 SABAL, SABAL2
Signed Absolute difference and Accumulate Long. This instruction subtracts the vector elements in the lower or 
upper half of the second source SIMD&FP register from the corresponding vector elements of the first source 
SIMD&FP register, and accumulates the absolute values of the results into the vector elements of the destination 
SIMD&FP register. The destination vector elements are twice as long as the source vector elements. 
The SABAL instruction extracts each source vector from the lower half of each source register, while the SABAL2 
instruction extracts each source vector from the upper half of each source register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-14 | 13 | 12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 01    | 0  | 1  | 00    | Rn  | Rd
    |    | U  |       |       |    |       |       | op |    |       |     |
Three registers, not all the same type variant 
SABAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 
integer part = UInt(Q); 
integer elements = datasize DIV esize; 
boolean accumulate = (op == '0');
boolean unsigned = (U == '1');
Assembler symbols 
2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;
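The way the "part" value selects the lower or upper half of each source register, and the way results widen to 2*esize bits, can be sketched in Python as below. This is an illustrative model only: source registers are assumed to be lists of 128/esize unsigned elements, and the names to_signed and sabal are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sabal(vd, vn, vm, esize, upper=False):
    """Signed absolute difference and accumulate long.

    vd: list of 64/esize accumulators, each 2*esize bits wide.
    vn, vm: full 128-bit source registers as lists of esize-bit elements.
    upper=True models SABAL2, which reads the upper 64 bits of each source.
    """
    count = 64 // esize
    start = count if upper else 0        # Vpart[n, part] element offset
    mask = (1 << (2 * esize)) - 1
    out = []
    for e in range(count):
        diff = abs(to_signed(vn[start + e], esize) - to_signed(vm[start + e], esize))
        out.append((vd[e] + diff) & mask)
    return out
```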
C7.2.200 SABD
Signed Absolute Difference. This instruction subtracts the elements of the vector of the second source SIMD&FP 
register from the corresponding elements of the first source SIMD&FP register, places the absolute values of the 
results into a vector, and writes the vector to the destination SIMD&FP register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-12 | 11 | 10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 0111  | 0  | 1  | Rn  | Rd
    |    | U  |       |       |    |       |       | ac |    |     |
Three registers of the same type variant 
SABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean accumulate = (ac == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 
The encoding size = 11, Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;
C7.2.201 SABDL, SABDL2


Signed Absolute Difference Long. This instruction subtracts the vector elements of the second source SIMD&FP 
register from the corresponding vector elements of the first source SIMD&FP register, places the absolute value of 
the results into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The 
destination vector elements are twice as long as the source vector elements. 


The SABDL instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SABDL2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-14 | 13 | 12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 01    | 1  | 1  | 00    | Rn  | Rd
    |    | U  |       |       |    |       |       | op |    |       |     |


Three registers, not all the same type variant 


SABDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean accumulate = (op == '0');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;
C7.2.202 SADALP


Signed Add and Accumulate Long Pairwise. This instruction adds pairs of adjacent signed integer values from the 
vector in the source SIMD&FP register and accumulates the results into the vector elements of the destination 
SIMD&FP register. The destination vector elements are twice as long as the source vector elements. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-15 | 14 | 13-12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 10000 | 00    | 1  | 10    | 10    | Rn  | Rd
    |    | U  |       |       |       |       | op |       |       |     |


Vector variant 


SADALP <Vd>.<Ta>, <Vn>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
if size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV (2*esize);

boolean acc = (op == '1');
boolean unsigned = (U == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
4H when size = 00, Q = 0
8H when size = 00, Q = 1
2S when size = 01, Q = 0
4S when size = 01, Q = 1
1D when size = 10, Q = 0
2D when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;
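The pairwise addressing in the loop above (elements 2*e and 2*e+1 feeding destination element e) can be sketched in Python as follows. The sketch assumes the signed case and list-based vectors; the names to_signed and sadalp are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sadalp(vd, vn, esize):
    """Add adjacent signed pairs of vn and accumulate into the wider vd.

    vd: list of 2*esize-bit accumulators (half as many elements as vn).
    vn: list of esize-bit source elements.
    """
    mask = (1 << (2 * esize)) - 1
    out = []
    for e in range(len(vd)):
        pair = to_signed(vn[2 * e], esize) + to_signed(vn[2 * e + 1], esize)
        out.append((vd[e] + pair) & mask)
    return out
```

Starting from zeroed accumulators instead of the existing destination contents gives the non-accumulating behaviour selected when acc is FALSE.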
C7.2.203 SADDL, SADDL2


Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source 
SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results 
into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice 
as long as the source vector elements. All the values in this instruction are signed integer values. 


The SADDL instruction extracts each source vector from the lower half of each source register, while the SADDL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-14 | 13 | 12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 00    | 0  | 0  | 00    | Rn  | Rd
    |    | U  |       |       |    |       |       | o1 |    |       |     |


Three registers, not all the same type variant 


SADDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;
C7.2.204 SADDLP


Signed Add Long Pairwise. This instruction adds pairs of adjacent signed integer values from the vector in the 
source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. 
The destination vector elements are twice as long as the source vector elements. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-15 | 14 | 13-12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 10000 | 00    | 0  | 10    | 10    | Rn  | Rd
    |    | U  |       |       |       |       | op |       |       |     |


Vector variant 


SADDLP <Vd>.<Ta>, <Vn>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
if size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV (2*esize);

boolean acc = (op == '1');
boolean unsigned = (U == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
4H when size = 00, Q = 0
8H when size = 00, Q = 1
2S when size = 01, Q = 0
4S when size = 01, Q = 1
1D when size = 10, Q = 0
2D when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;
C7.2.205 SADDLV


Signed Add Long across Vector. This instruction adds every vector element in the source SIMD&FP register
together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as
the source vector elements. All the values in this instruction are signed integer values.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


 31 | 30 | 29 | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 11000 | 00011 | 10    | Rn  | Rd
    |    | U  |       |       |       |       |       |     |


Advanced SIMD variant 


SADDLV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '100' then ReservedValue();
if size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');


Assembler symbols 


<V> Is the destination width specifier, encoded in the "size" field. It can have the following values:
H when size = 00
S when size = 01
D when size = 10
The encoding size = 11 is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 10, Q = 0.
• size = 11, Q = x.

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;
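The reduction above can be sketched in Python as a signed sum followed by truncation to 2*esize bits. This is a minimal model for the signed case; the names to_signed and saddlv are hypothetical and the vector is assumed to be a list of unsigned esize-bit element values.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saddlv(vn, esize):
    """Sum all signed esize-bit elements into one 2*esize-bit scalar."""
    total = sum(to_signed(x, esize) for x in vn)
    return total & ((1 << (2 * esize)) - 1)   # sum<2*esize-1:0>
```

Because at most 16 elements are summed and the result is twice the element width, the true sum always fits in the destination scalar for the arrangements listed above.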
C7.2.206 SADDW, SADDW2


Signed Add Wide. This instruction adds vector elements of the first source SIMD&FP register to the corresponding 
vector elements in the lower or upper half of the second source SIMD&FP register, places the results in a vector, 
and writes the vector to the SIMD&FP destination register. 


The SADDW instruction extracts the second source vector from the lower half of the second source register, while the 
SADDW2 instruction extracts the second source vector from the upper half of the second source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20-16 | 15-14 | 13 | 12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | size  | 1  | Rm    | 00    | 0  | 1  | 00    | Rn  | Rd
    |    | U  |       |       |    |       |       | o1 |    |       |     |


Three registers, not all the same type variant 


SADDW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.


Operation 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;
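The mixed-width addition above, where wide elements of the first source are added to sign-extended narrow elements from one half of the second source, can be sketched in Python as below. The sketch assumes list-based vectors and the signed case; the names to_signed and saddw are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saddw(vn, vm, esize, upper=False):
    """Signed add wide.

    vn: list of 64/esize wide elements, each 2*esize bits.
    vm: full 128-bit register as a list of 128/esize narrow elements.
    upper=True models SADDW2, which reads the upper 64 bits of vm.
    """
    count = 64 // esize
    start = count if upper else 0        # Vpart[m, part] element offset
    mask = (1 << (2 * esize)) - 1
    return [
        (to_signed(vn[e], 2 * esize) + to_signed(vm[start + e], esize)) & mask
        for e in range(count)
    ]
```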
C7.2.207 SCVTF (vector, fixed-point)


Signed fixed-point Convert to Floating-point (vector). This instruction converts each element in a vector from
fixed-point to floating-point using the rounding mode that is specified by the FPCR, and writes the result to the
SIMD&FP destination register.

A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see
Floating-point exception traps on page D1-1552.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
 31-30 | 29 | 28-23  | 22-19   | 18-16 | 15-11 | 10 | 9-5 | 4-0
 01    | 0  | 111110 | != 0000 | immb  | 11100 | 1  | Rn  | Rd
       | U  |        | immh    |       |       |    |     |
Scalar variant 
SCVTF <V><d>, <V><n>, #<fbits> 
Decode for this encoding 
integer d = UInt(Rd);
integer n = UInt(Rn);
if immh == '00xx' then ReservedValue();

integer esize = 32 << UInt(immh<3>);
integer datasize = esize;
integer elements = 1;

integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
FPRounding rounding = FPRoundingMode(FPCR);


Vector 


 31 | 30 | 29 | 28-23  | 22-19   | 18-16 | 15-11 | 10 | 9-5 | 4-0
 0  | Q  | 0  | 011110 | != 0000 | immb  | 11100 | 1  | Rn  | Rd
    |    | U  |        | immh    |       |       |    |     |


Vector variant 


SCVTF <Vd>.<T>, <Vn>.<T>, #<fbits> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh == '00xx' then ReservedValue();
if immh<3>:Q == '10' then ReservedValue();
integer esize = 32 << UInt(immh<3>);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
FPRounding rounding = FPRoundingMode(FPCR);


Assembler symbols 


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
S when immh = 01xx
D when immh = 1xxx
The encoding immh = 00xx is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The following encodings are reserved:
• immh = 0001, Q = x.
• immh = 001x, Q = x.
• immh = 1xxx, Q = 0.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<fbits> For the scalar variant: is the number of fractional bits, in the range 1 to the operand width, encoded
in the "immh:immb" field. It can have the following values:
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
The encoding immh = 00xx is reserved.
For the vector variant: is the number of fractional bits, in the range 1 to the element width, encoded
in the "immh:immb" field. It can have the following values:
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The following encodings are reserved:
• immh = 0001.
• immh = 001x.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;
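The relationship between <fbits> and the "immh:immb" field, and the value that a signed fixed-point element represents, can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, Python floats are double-precision, and the division below always rounds to nearest, so it does not model the FPCR rounding modes or single-precision results that FixedToFP() handles.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def encode_fbits(esize, fbits):
    """Return the 7-bit immh:immb value for an element size and fbits."""
    if esize not in (32, 64):
        raise ValueError("esize must be 32 or 64")
    if not 1 <= fbits <= esize:
        raise ValueError("fbits out of range")
    return 2 * esize - fbits             # inverse of fracbits = esize*2 - UInt(immh:immb)


def fixed_to_float(raw, esize, fbits):
    """Value of a signed fixed-point element with fbits fractional bits."""
    return to_signed(raw, esize) / (1 << fbits)
```

For 32-bit elements the encoded value lies in the range 32 to 63, which is why immh always matches 01xx; for 64-bit elements it lies in the range 64 to 127, so immh matches 1xxx.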
C7.2.208 SCVTF (vector, integer)


Signed integer Convert to Floating-point (vector). This instruction converts each element in a vector from signed 
integer to floating-point using the rounding mode that is specified by the FPCR, and writes the result to the 
SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar 
 31-30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0
 01    | 0  | 11110 | 0  | sz | 10000 | 11101 | 10    | Rn  | Rd
       | U  |       |    |    |       |       |       |     |


Scalar variant 


SCVTF <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;

boolean unsigned = (U == '1');

Vector 
 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0
 0  | Q  | 0  | 01110 | 0  | sz | 10000 | 11101 | 10    | Rn  | Rd
    |    | U  |       |    |    |       |       |       |     |


Vector variant 


SCVTF <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if sz:Q == '10' then ReservedValue();
integer esize = 32 << UInt(sz);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');


Assembler symbols 





<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;
C7.2.209 SCVTF (scalar, fixed-point)


Signed fixed-point Convert to Floating-point (scalar). This instruction converts the signed value in the 32-bit or 
64-bit general-purpose source register to a floating-point value using the rounding mode that is specified by the 
FPCR, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


| 31 | 30 29 | 28 27 26 25 24 | 23 22 | 21 | 20 19 | 18 17 16 | 15   10 | 9  5 | 4  0 | 
| sf | 0  0  | 1  1  1  1  0  | 0  x  | 0  | 0  0  | 0  1  0  |  scale  |  Rn  |  Rd  | 
                                type         rmode   opcode 


32-bit to single-precision variant 
Applies when sf == 0 && type == 00. 


SCVTF <Sd>, <Wn>, #<fbits> 


32-bit to double-precision variant 
Applies when sf == 0 && type == 01. 


SCVTF <Dd>, <Wn>, #<fbits> 


64-bit to single-precision variant 
Applies when sf == 1 && type == 00. 


SCVTF <Sd>, <Xn>, #<fbits> 


64-bit to double-precision variant 
Applies when sf == 1 && type == 01. 


SCVTF <Dd>, <Xn>, #<fbits> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of 
    when '00' fltsize = 32; 
    when '01' fltsize = 64; 
    when '1x' UnallocatedEncoding(); 

if sf == '0' && scale<5> == '0' then UnallocatedEncoding(); 
integer fracbits = 64 - UInt(scale); 

rounding = FPRoundingMode(FPCR); 
Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
<fbits> For the 32-bit to double-precision and 32-bit to single-precision variant: is the number of bits after 
the binary point in the fixed-point source, in the range 1 to 32, encoded as 64 minus "scale". 

For the 64-bit to double-precision and 64-bit to single-precision variant: is the number of bits after 
the binary point in the fixed-point source, in the range 1 to 64, encoded as 64 minus "scale". 

Operation 


CheckFPAdvSIMDEnabled64(); 


bits(fltsize) fltval; 
bits(intsize) intval; 


intval = X[n]; 
fltval = FixedToFP(intval, fracbits, FALSE, FPCR, rounding); 
V[d] = fltval; 
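
As an informal illustration of the conversion above, the following Python sketch treats the source register as a signed two's-complement value and divides it by 2^fbits. It is a minimal model under stated assumptions, not the architectural definition: the names scvtf_fixed and encode_scale are hypothetical, the destination is assumed to be double-precision, and only round-to-nearest is modelled (FPCR rounding modes and floating-point exceptions are ignored).

```python
def encode_scale(fbits: int) -> int:
    """Return the value of the "scale" field for a given <fbits> (64 minus fbits)."""
    return 64 - fbits


def scvtf_fixed(intval: int, fbits: int, intsize: int = 32) -> float:
    """Approximate SCVTF (scalar, fixed-point) with a double-precision result.

    intval is the raw bit pattern of the W or X source register.
    """
    if not 1 <= fbits <= intsize:
        raise ValueError("fbits out of range for this variant")
    intval &= (1 << intsize) - 1
    # Reinterpret the bit pattern as a signed two's-complement integer.
    if intval >> (intsize - 1):
        intval -= 1 << intsize
    # Assumption: Python int / int division rounds to the nearest double,
    # which stands in for FixedToFP with round-to-nearest.
    return intval / (1 << fbits)
```

For example, scvtf_fixed(0x180, 8) models SCVTF Dd, Wn, #8 with Wn holding 0x180.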
C7.2.210 SCVTF (scalar, integer) 


Signed integer Convert to Floating-point (scalar). This instruction converts the signed integer value in the 
general-purpose source register to a floating-point value using the rounding mode that is specified by the FPCR, and 
writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 29 | 28 27 26 25 24 | 23 22 | 21 | 20 19 | 18 17 16 | 15 14 13 12 11 10 | 9  5 | 4  0 | 
| sf | 0  0  | 1  1  1  1  0  | 0  x  | 1  | 0  0  | 0  1  0  | 0  0  0  0  0  0  |  Rn  |  Rd  | 
                                type         rmode   opcode 


32-bit to single-precision variant 
Applies when sf == 0 && type == 00. 


SCVTF <Sd>, <Wn> 


32-bit to double-precision variant 
Applies when sf == 0 && type == 01. 


SCVTF <Dd>, <Wn> 


64-bit to single-precision variant 
Applies when sf == 1 && type == 00. 


SCVTF <Sd>, <Xn> 


64-bit to double-precision variant 
Applies when sf == 1 && type == 01. 


SCVTF <Dd>, <Xn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of 
    when '00' 
        fltsize = 32; 
    when '01' 
        fltsize = 64; 
    when '10' 
        UnallocatedEncoding(); 
    when '11' 
        UnallocatedEncoding(); 

rounding = FPRoundingMode(FPCR); 
Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 

bits(fltsize) fltval; 
bits(intsize) intval; 

intval = X[n]; 
fltval = FixedToFP(intval, 0, FALSE, FPCR, rounding); 
V[d] = fltval; 
C7.2.211 SHA1C 


SHA1 hash update (choose) 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 0  0  0  | 0  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA1C <Qd>, <Sn>, <Vm>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 
<Qd> Is the 128-bit name of the SIMD&FP source and destination, encoded in the "Rd" field. 


<Sn> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rn" field. 


<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 


Operation 
CheckCryptoEnabled64(); 

bits(128) X = V[d]; 
bits(32) Y = V[n]; // Note: 32 not 128 bits wide 
bits(128) W = V[m]; 
bits(32) t; 

for e = 0 to 3 
    t = SHAchoose(X<63:32>, X<95:64>, X<127:96>); 
    Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32]; 
    X<63:32> = ROL(X<63:32>, 30); 
    <Y, X> = ROL(Y:X, 32); 

V[d] = X; 
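
The loop above performs four SHA-1 rounds with the choose function, holding the working variables a to d in X (a in bits 31:0) and e in Y. The Python sketch below mirrors it step by step. It is illustrative only: the function names are hypothetical, registers are modelled as plain Python integers, and SHAchoose is assumed to compute ((y EOR z) AND x) EOR z, which is not defined in this section.

```python
MASK32 = 0xFFFFFFFF
MASK128 = (1 << 128) - 1


def rol32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha_choose(x: int, y: int, z: int) -> int:
    """Assumed definition of SHAchoose."""
    return ((y ^ z) & x) ^ z


def sha1c(x: int, y: int, w: int) -> int:
    """Model SHA1C: x is V[d] (128 bits), y is V[n] (32 bits), w is V[m] (128 bits)."""
    for e in range(4):
        b = (x >> 32) & MASK32
        t = sha_choose(b, (x >> 64) & MASK32, (x >> 96) & MASK32)
        y = (y + rol32(x & MASK32, 5) + t + ((w >> (32 * e)) & MASK32)) & MASK32
        # X<63:32> = ROL(X<63:32>, 30)
        x = (x & ~(MASK32 << 32)) | (rol32(b, 30) << 32)
        # <Y, X> = ROL(Y:X, 32): rotate the 160-bit value Y:X left by 32 bits.
        yx = (y << 128) | x
        yx = ((yx << 32) | (yx >> 128)) & ((1 << 160) - 1)
        y, x = yx >> 128, yx & MASK128
    return x
```

In this model each element of w is expected to already contain the message schedule word added to the round constant, because the loop itself adds no constant.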
C7.2.212 SHA1H 


SHA1 fixed rotate 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 1  0  1  0  0  | 0  0  0  0  0  | 1  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA1H <Sd>, <Sn> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckCryptoEnabled64(); 


bits(32) operand = V[n];  // read element [0] only, [1-3] zeroed 
V[d] = ROL(operand, 30); 
C7.2.213 SHA1M 


SHA1 hash update (majority) 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 0  1  0  | 0  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA1M <Qd>, <Sn>, <Vm>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 
<Qd> Is the 128-bit name of the SIMD&FP source and destination, encoded in the "Rd" field. 


<Sn> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rn" field. 


<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 


Operation 
CheckCryptoEnabled64(); 

bits(128) X = V[d]; 
bits(32) Y = V[n]; // Note: 32 not 128 bits wide 
bits(128) W = V[m]; 
bits(32) t; 

for e = 0 to 3 
    t = SHAmajority(X<63:32>, X<95:64>, X<127:96>); 
    Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32]; 
    X<63:32> = ROL(X<63:32>, 30); 
    <Y, X> = ROL(Y:X, 32); 

V[d] = X; 
C7.2.214 SHA1P 


SHA1 hash update (parity) 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 0  0  1  | 0  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA1P <Qd>, <Sn>, <Vm>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP source and destination, encoded in the "Rd" field. 
<Sn> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckCryptoEnabled64(); 

bits(128) X = V[d]; 
bits(32) Y = V[n]; // Note: 32 not 128 bits wide 
bits(128) W = V[m]; 
bits(32) t; 

for e = 0 to 3 
    t = SHAparity(X<63:32>, X<95:64>, X<127:96>); 
    Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32]; 
    X<63:32> = ROL(X<63:32>, 30); 
    <Y, X> = ROL(Y:X, 32); 

V[d] = X; 
C7.2.215 SHA1SU0 


SHA1 schedule update 0 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 0  1  1  | 0  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA1SU0 <Vd>.4S, <Vn>.4S, <Vm>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP source and destination register, encoded in the "Rd" field. 


<Vn> Is the name of the second SIMD&FP source register, encoded in the "Rn" field. 


<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 


Operation 
CheckCryptoEnabled64(); 

bits(128) operand1 = V[d]; 
bits(128) operand2 = V[n]; 
bits(128) operand3 = V[m]; 
bits(128) result; 

result = operand2<63:0>:operand1<127:64>; 
result = result EOR operand1 EOR operand3; 
V[d] = result; 
C7.2.216 SHA1SU1 


SHA1 schedule update 1 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 1  0  1  0  0  | 0  0  0  0  1  | 1  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA1SU1 <Vd>.4S, <Vn>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP source and destination register, encoded in the "Rd" field. 
<Vn> Is the name of the second SIMD&FP source register, encoded in the "Rn" field. 
Operation 


CheckCryptoEnabled64(); 


bits(128) operand1 = V[d]; 
bits(128) operand2 = V[n]; 
bits(128) result; 
bits(128) T = operand1 EOR LSR(operand2, 32); 
result<31:0> = ROL(T<31:0>, 1); 
result<63:32> = ROL(T<63:32>, 1); 
result<95:64> = ROL(T<95:64>, 1); 
result<127:96> = ROL(T<127:96>, 1) EOR ROL(T<31:0>, 2); 
V[d] = result; 
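
SHA1SU0 followed by SHA1SU1 can be read as producing four new SHA-1 message schedule words from the previous sixteen. The Python sketch below models both operations on lists of four 32-bit words, with index 0 holding bits 31:0 of the register. The function names and the list representation are assumptions made for illustration and are not part of the architecture.

```python
MASK32 = 0xFFFFFFFF


def rol32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1su0(d, n, m):
    """Model SHA1SU0: d, n, m are V[d], V[n], V[m] as four 32-bit words."""
    # operand2<63:0>:operand1<127:64> as words, lowest word first.
    t = [d[2], d[3], n[0], n[1]]
    return [t[i] ^ d[i] ^ m[i] for i in range(4)]


def sha1su1(d, n):
    """Model SHA1SU1: d, n are V[d], V[n] as four 32-bit words."""
    # LSR(operand2, 32) moves each word down one position and clears the top word.
    shifted = [n[1], n[2], n[3], 0]
    t = [d[i] ^ shifted[i] for i in range(4)]
    return [
        rol32(t[0], 1),
        rol32(t[1], 1),
        rol32(t[2], 1),
        rol32(t[3], 1) ^ rol32(t[0], 2),
    ]
```

With w holding schedule words W0 to W15, sha1su1(sha1su0(w[0:4], w[4:8], w[8:12]), w[12:16]) yields W16 to W19 in this model.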
C7.2.217 SHA256H2 
SHA256 hash update (part 2) 
| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 | 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 1  0  | 1  | 0  0  |  Rn  |  Rd  | 
                                                                P 
Advanced SIMD variant 
SHA256H2 <Qd>, <Qn>, <Vm>.4S 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 
Assembler symbols 
<Qd> Is the 128-bit name of the SIMD&FP source and destination, encoded in the "Rd" field. 
<Qn> Is the 128-bit name of the second SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 
Operation 
CheckCryptoEnabled64(); 
bits(128) result; 
result = SHA256hash(V[n], V[d], V[m], FALSE); 
V[d] = result; 
C7.2.218 SHA256H 


SHA256 hash update (part 1) 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 | 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 1  0  | 0  | 0  0  |  Rn  |  Rd  | 
                                                                P 


Advanced SIMD variant 


SHA256H <Qd>, <Qn>, <Vm>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP source and destination, encoded in the "Rd" field. 
<Qn> Is the 128-bit name of the second SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckCryptoEnabled64(); 


bits(128) result; 
result = SHA256hash(V[d], V[n], V[m], TRUE); 
V[d] = result; 
C7.2.219 SHA256SU0 


SHA256 schedule update 0 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 1  0  1  0  0  | 0  0  0  1  0  | 1  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA256SU0 <Vd>.4S, <Vn>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP source and destination register, encoded in the "Rd" field. 
<Vn> Is the name of the second SIMD&FP source register, encoded in the "Rn" field. 


Operation 


CheckCryptoEnabled64(); 


bits(128) operand1 = V[d]; 
bits(128) operand2 = V[n]; 
bits(128) result; 
bits(128) T = operand2<31:0>:operand1<127:32>; 
bits(32) elt; 

for e = 0 to 3 
    elt = Elem[T, e, 32]; 
    elt = ROR(elt, 7) EOR ROR(elt, 18) EOR LSR(elt, 3); 
    Elem[result, e, 32] = elt + Elem[operand1, e, 32]; 

V[d] = result; 
C7.2.220 SHA256SU1 


SHA256 schedule update 1 


| 31 30 29 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 | 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  1  0  1  1  1  1  0  | 0  0  | 0  |   Rm   | 0  | 1  1  0  | 0  0  |  Rn  |  Rd  | 


Advanced SIMD variant 


SHA256SU1 <Vd>.4S, <Vn>.4S, <Vm>.4S 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if !HaveCryptoExt() then UnallocatedEncoding(); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP source and destination register, encoded in the "Rd" field. 
<Vn> Is the name of the second SIMD&FP source register, encoded in the "Rn" field. 

<Vm> Is the name of the third SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckCryptoEnabled64(); 

bits(128) operand1 = V[d]; 
bits(128) operand2 = V[n]; 
bits(128) operand3 = V[m]; 
bits(128) result; 
bits(128) T0 = operand3<31:0>:operand2<127:32>; 
bits(64) T1; 
bits(32) elt; 

T1 = operand3<127:64>; 
for e = 0 to 1 
    elt = Elem[T1, e, 32]; 
    elt = ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10); 
    elt = elt + Elem[operand1, e, 32] + Elem[T0, e, 32]; 
    Elem[result, e, 32] = elt; 

T1 = result<63:0>; 
for e = 2 to 3 
    elt = Elem[T1, e-2, 32]; 
    elt = ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10); 
    elt = elt + Elem[operand1, e, 32] + Elem[T0, e, 32]; 
    Elem[result, e, 32] = elt; 

V[d] = result; 
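
Taken together with SHA256SU0, the operation above can be read as computing four new SHA-256 message schedule words from the previous sixteen. The Python sketch below models both instructions on lists of four 32-bit words, index 0 being bits 31:0 of the register. The names sha256su0, sha256su1, sigma0, and sigma1 and the list representation are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def ror32(x: int, n: int) -> int:
    """Rotate a 32-bit value right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def sigma0(x: int) -> int:
    return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3)


def sigma1(x: int) -> int:
    return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10)


def sha256su0(d, n):
    """Model SHA256SU0: d, n are V[d], V[n] as four 32-bit words."""
    t = [d[1], d[2], d[3], n[0]]  # operand2<31:0>:operand1<127:32>
    return [(sigma0(t[e]) + d[e]) & MASK32 for e in range(4)]


def sha256su1(d, n, m):
    """Model SHA256SU1: d, n, m are V[d], V[n], V[m] as four 32-bit words."""
    t0 = [n[1], n[2], n[3], m[0]]  # operand3<31:0>:operand2<127:32>
    result = [0, 0, 0, 0]
    t1 = [m[2], m[3]]              # operand3<127:64>
    for e in range(2):
        result[e] = (sigma1(t1[e]) + d[e] + t0[e]) & MASK32
    t1 = result[0:2]               # result<63:0>
    for e in range(2, 4):
        result[e] = (sigma1(t1[e - 2]) + d[e] + t0[e]) & MASK32
    return result
```

With w holding schedule words W0 to W15, sha256su1(sha256su0(w[0:4], w[4:8]), w[8:12], w[12:16]) yields W16 to W19 in this model.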
C7.2.221 SHADD 


Signed Halving Add. This instruction adds corresponding signed integer values from the two source SIMD&FP 
registers, shifts each result right one bit, places the results into a vector, and writes the vector to the destination 
SIMD&FP register. 


The results are truncated. For rounded results, see SRHADD. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 | 
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 1  |   Rm   | 0  0  0  0  0  | 1  |  Rn  |  Rd  | 
            U 


Three registers of the same type variant 


SHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The encoding size = 11, Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer element1; 
integer element2; 
integer sum; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    sum = element1 + element2; 
    Elem[result, e, esize] = sum<esize:1>; 

V[d] = result; 
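
The halving add above forms the full-width sum before discarding bit 0, so the intermediate sum cannot overflow. A minimal Python sketch of the signed case follows. The function name shadd and the representation of a vector as a list of raw element bit patterns are assumptions made for illustration.

```python
def shadd(op1, op2, esize=8):
    """Model SHADD on lists of raw esize-bit element values."""
    mask = (1 << esize) - 1

    def to_signed(v):
        # Interpret an esize-bit pattern as a two's-complement integer.
        v &= mask
        return v - (1 << esize) if v >> (esize - 1) else v

    # sum<esize:1> drops bit 0 of the unbounded sum and keeps esize bits,
    # which matches an arithmetic shift right by one in Python.
    return [((to_signed(a) + to_signed(b)) >> 1) & mask for a, b in zip(op1, op2)]
```

For example, two 8-bit elements that both hold 0x7F produce 0x7F rather than wrapping.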
C7.2.222 SHL 


Shift Left (immediate). This instruction reads each value from a vector, left shifts each value by an immediate 
value, writes the final result to a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22   19 | 18  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 | 
| 0  | 1  | 0  | 1  1  1  1  1  0  | != 0000 |  immb  | 0  1  0  1  0  | 1  |  Rn  |  Rd  | 
                                      immh 


Scalar variant 


SHL <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 

integer datasize = esize; 

integer elements = 1; 


integer shift = UInt(immh:immb) - esize; 
Vector 


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22   19 | 18  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 | 
| 0  | Q  | 0  | 0  1  1  1  1  0  | != 0000 |  immb  | 0  1  0  1  0  | 1  |  Rn  |  Rd  | 
                                      immh 


Vector variant 


SHL <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate"; 
if immh<3>:Q == '10' then ReservedValue(); 

integer esize = 8 << HighestSetBit(immh); 

integer datasize = if Q == '1' then 128 else 64; 

integer elements = datasize DIV esize; 


integer shift = UInt(immh:immb) - esize; 


Assembler symbols 


<V> Is a width specifier, encoded in the "immh" field. It can have the following values: 
D when immh = 1xxx 

The encoding immh = 0xxx is reserved. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C7-1193 


1ID092916 


Non-Confidential 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


<d> Is the number of the SIMD&FP destination register, in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values: 
8B when immh = 0001, Q = 0 
16B when immh = 0001, Q = 1 
4H when immh = 001x, Q = 0 
8H when immh = 001x, Q = 1 
2S when immh = 01xx, Q = 0 
4S when immh = 01xx, Q = 1 
2D when immh = 1xxx, Q = 1 

See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x. 
The encoding immh = 1xxx, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<shift> For the scalar variant: is the left shift amount, in the range 0 to 63, encoded in the "immh:immb" 
field. It can have the following values: 
(UInt(immh:immb)-64) when immh = 1xxx 
The encoding immh = 0xxx is reserved. 


For the vector variant: is the left shift amount, in the range 0 to the element width in bits minus 1, 
encoded in the "immh:immb" field. It can have the following values: 


(UInt(immh:immb)-8) when immh = 0001 
(UInt(immh:immb)-16) when immh = 001x 
(UInt(immh:immb)-32) when immh = 01xx 
(UInt(immh:immb)-64) when immh = 1xxx 
See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 

for e = 0 to elements-1 
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift); 

V[d] = result; 
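
The vector decode and the operation above can be sketched in Python as follows, to show how "immh:immb" yields both the element size and the shift amount. The function names are hypothetical, fields are passed as plain integers, and the reserved and redirected encodings are modelled as exceptions purely for illustration.

```python
def decode_shl_vector(immh: int, immb: int, q: int):
    """Return (esize, elements, shift) for the SHL vector variant."""
    if immh == 0:
        raise ValueError("see Advanced SIMD modified immediate")
    if (immh >> 3) == 1 and q == 0:
        raise ValueError("reserved encoding")
    esize = 8 << (immh.bit_length() - 1)  # 8 << HighestSetBit(immh)
    datasize = 128 if q else 64
    shift = ((immh << 3) | immb) - esize  # UInt(immh:immb) - esize
    return esize, datasize // esize, shift


def shl(elements, esize: int, shift: int):
    """Left shift each esize-bit element, discarding bits shifted out."""
    mask = (1 << esize) - 1
    return [(v << shift) & mask for v in elements]
```

For example, immh = 0001 and immb = 011 select 8-bit elements and a shift of 3.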
C7.2.223 SHLL, SHLL2 


Shift Left Long (by element size). This instruction reads each vector element in the lower or upper half of the source 
SIMD&FP register, left shifts each result by the element size, writes the final result to a vector, and writes the vector 
to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector 
elements. 


The SHLL instruction extracts vector elements from the lower half of the source register, while the SHLL2 instruction 
extracts vector elements from the upper half of the source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9  5 | 4  0 | 
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  0  0  0  0  | 1  0  0  1  1  | 1  0  |  Rn  |  Rd  | 


Vector variant 


SHLL{2} <Vd>.<Ta>, <Vn>.<Tb>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


integer shift = esize; 
boolean unsigned = FALSE; // Or TRUE without change of functionality 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0 


[present] when Q = 1 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
8H when size = 00 
4S when size = 01 
2D when size = 10 


The encoding size = 11 is reserved. 





<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 


The encoding size = 11, Q = x is reserved. 


<shift> Is the left shift amount, which must be equal to the source element width in bits, encoded in the 
"size" field. It can have the following values: 


8 when size = 00 
16 when size = 01 
32 when size = 10 


The encoding size = 11 is reserved. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = Vpart[n, part]; 
bits(2*datasize) result; 
integer element; 

for e = 0 to elements-1 
    element = Int(Elem[operand, e, esize], unsigned) << shift; 
    Elem[result, e, 2*esize] = element<2*esize-1:0>; 

V[d] = result; 
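
As a rough illustration of the widening shift above, the Python sketch below selects the lower or upper half of a source register and shifts each element left by its own width. The function name shll and the representation of the 128-bit source register as a list of esize-bit values (lowest element first) are assumptions for this example.

```python
def shll(src, esize: int, part: int):
    """Model SHLL (part = 0) or SHLL2 (part = 1).

    src holds every esize-bit element of the 128-bit source register.
    """
    half = len(src) // 2
    chosen = src[part * half:(part + 1) * half]  # Vpart[n, part]
    wide_mask = (1 << (2 * esize)) - 1
    in_mask = (1 << esize) - 1
    # Each result element is 2*esize bits wide; the shift equals esize.
    return [((v & in_mask) << esize) & wide_mask for v in chosen]
```

Because the shift amount equals the source element width, each source value ends up in the upper half of its destination element, whichever setting of "unsigned" is used.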
C7.2.224 SHRN, SHRN2 


Shift Right Narrow (immediate). This instruction reads each unsigned integer value from the source SIMD&FP 
register, right shifts each result by an immediate value, puts the final result into a vector, and writes the vector to the 
lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the 
source vector elements. The results are truncated. For rounded results, see RSHRN, RSHRN2. 


The SHRN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SHRN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22   19 | 18  16 | 15 14 13 12 | 11 | 10 | 9  5 | 4  0 | 
| 0  | Q  | 0  | 0  1  1  1  1  0  | != 0000 |  immb  | 1  0  0  0  | 0  | 1  |  Rn  |  Rd  | 
                                      immh                             op 


Vector variant 


SHRN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate"; 
if immh<3> == '1' then ReservedValue(); 

integer esize = 8 << HighestSetBit(immh); 

integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


integer shift = (2 * esize) - UInt(immh:immb) ; 
boolean round = (op == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0 


[present] when Q = 1 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values: 
8B when immh = 0001, Q = 0 
16B when immh = 0001, Q = 1 
4H when immh = 001x, Q = 0 
8H when immh = 001x, Q = 1 
2S when immh = 01xx, Q = 0 
4S when immh = 01xx, Q = 1 


See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x. 
The encoding immh = 1xxx, Q = x is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values: 
8H when immh = 0001 
4S when immh = 001x 
2D when immh = 01xx 


See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 
The encoding immh = 1xxx is reserved. 
<shift> Is the right shift amount, in the range 1 to the destination element width in bits, encoded in the 
"immh:immb" field. It can have the following values: 
(16-UInt(immh:immb)) when immh = 0001 
(32-UInt(immh:immb)) when immh = 001x 
(64-UInt(immh:immb)) when immh = 01xx 
See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 


The encoding immh = 1xxx is reserved. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize*2) operand = V[n]; 
bits(datasize) result; 
integer round_const = if round then (1 << (shift - 1)) else 0; 
integer element; 

for e = 0 to elements-1 
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift; 
    Elem[result, e, esize] = element<esize-1:0>; 

Vpart[d, part] = result; 
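
The narrowing shift above can be sketched in Python as follows. The sketch covers only the truncating SHRN case (round is FALSE) and leaves out the Vpart write to the lower or upper half of the destination. The function names and the list representation of vector elements are assumptions made for illustration.

```python
def decode_shrn(immh: int, immb: int):
    """Return (esize, shift), where esize is the destination element width."""
    if immh == 0:
        raise ValueError("see Advanced SIMD modified immediate")
    if immh >> 3:
        raise ValueError("reserved encoding")
    esize = 8 << (immh.bit_length() - 1)       # 8 << HighestSetBit(immh)
    shift = (2 * esize) - ((immh << 3) | immb)  # (2 * esize) - UInt(immh:immb)
    return esize, shift


def shrn(src, esize: int, shift: int):
    """Shift each 2*esize-bit source element right and keep the low esize bits."""
    if not 1 <= shift <= esize:
        raise ValueError("shift out of range")
    src_mask = (1 << (2 * esize)) - 1
    dst_mask = (1 << esize) - 1
    return [((v & src_mask) >> shift) & dst_mask for v in src]
```

For example, immh = 0001 and immb = 100 give 8-bit destination elements and a right shift of 4.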
C7.2.225 SHSUB 


Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP 
register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right 
one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 | 
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 1  |   Rm   | 0  0  1  0  0  | 1  |  Rn  |  Rd  | 
            U 


Three registers of the same type variant 


SHSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The encoding size = 11, Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer element1; 
integer element2; 
integer diff; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    diff = element1 - element2; 
    Elem[result, e, esize] = diff<esize:1>; 

V[d] = result; 
C7.2.226 SLI 


Shift Left and Insert (immediate). This instruction reads each vector element in the source SIMD&FP register, left 
shifts each vector element by an immediate value, and inserts the result into the corresponding vector element in the 
destination SIMD&FP register such that the new zero bits created by the shift are not inserted but retain their 
existing value. Bits shifted out of the left of each vector element in the source register are lost. 


The following figure shows an example of the operation of shift left by 3 for an 8-bit vector element. 


[Figure: shift left by 3 and insert for an 8-bit element occupying bits 63:56, showing the source element, the destination element before the operation, and the destination element after the operation.] 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22   19 | 18  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 | 
| 0  | 1  | 1  | 1  1  1  1  1  0  | != 0000 |  immb  | 0  1  0  1  0  | 1  |  Rn  |  Rd  | 
                                      immh 





Scalar variant 


SLI <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 

integer datasize = esize; 

integer elements = 1; 


integer shift = UInt(immh:immb) - esize; 
Vector 

| 31 | 30 | 29 | 28 27 26 25 24 23 | 22   19 | 18  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 | 
| 0  | Q  | 1  | 0  1  1  1  1  0  | != 0000 |  immb  | 0  1  0  1  0  | 1  |  Rn  |  Rd  | 
                                      immh 


Vector variant 


SLI <Vd>.<T>, <Vn>.<T>, #<shift> 
Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if immh == '0000' then SEE "Advanced SIMD modified immediate"; 
if immh<3>:Q == '10' then ReservedValue(); 
integer esize = 8 << HighestSetBit(immh); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

integer shift = UInt(immh:immb) - esize; 


Assembler symbols 


<V> Is a width specifier, encoded in the "immh" field. It can have the following values: 
D when immh = 1xxx 
The encoding immh = 0xxx is reserved. 
<d> Is the number of the SIMD&FP destination register, in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values: 
8B when immh = 0001, Q = 0 
16B when immh = 0001, Q = 1 
4H when immh = 001x, Q = 0 
8H when immh = 001x, Q = 1 
2S when immh = 01xx, Q = 0 
4S when immh = 01xx, Q = 1 
2D when immh = 1xxx, Q = 1 
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x. 
The encoding immh = 1xxx, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<shift> For the scalar variant: is the left shift amount, in the range 0 to 63, encoded in the "immh:immb" 
field. It can have the following values: 
(UInt(immh:immb)-64) when immh = 1xxx 
The encoding immh = 0xxx is reserved. 

For the vector variant: is the left shift amount, in the range 0 to the element width in bits minus 1, 
encoded in the "immh:immb" field. It can have the following values: 
(UInt(immh:immb)-8) when immh = 0001 
(UInt(immh:immb)-16) when immh = 001x 
(UInt(immh:immb)-32) when immh = 01xx 
(UInt(immh:immb)-64) when immh = 1xxx 
See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;
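The operation pseudocode above can be read as a per-element bit insert: the low `shift` bits of each destination element survive, and the shifted source element fills the rest. The following Python sketch is illustrative only; the function name and the use of plain integers to hold register values are assumptions made for this example, not part of the architecture.

```python
def shift_left_insert(dest: int, src: int, shift: int, esize: int, datasize: int) -> int:
    """Model the per-element shift-and-insert loop on integers (hypothetical helper)."""
    elem_mask = (1 << esize) - 1
    # mask = LSL(Ones(esize), shift), truncated to esize bits
    mask = (elem_mask << shift) & elem_mask
    result = 0
    for e in range(datasize // esize):
        s = (src >> (e * esize)) & elem_mask       # Elem[operand, e, esize]
        d = (dest >> (e * esize)) & elem_mask      # Elem[operand2, e, esize]
        shifted = (s << shift) & elem_mask         # LSL(..., shift)
        merged = (d & ~mask & elem_mask) | shifted # keep low bits of dest, insert source
        result |= merged << (e * esize)
    return result
```

With a shift of 0 the mask covers the whole element, so the result is simply the source value.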
C7.2.227 SMAX
Signed Maximum (vector). This instruction compares corresponding elements in the vectors in the two source 
SIMD&FP registers, places the larger of each pair of signed integer values into a vector, and writes the vector to the 
destination SIMD&FP register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm   | 0  1  1  0| 0| 1|   Rn  |   Rd  |
        U                                             o1
Three registers of the same type variant 
SMAX <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean minimum = (o1 == '1');
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;
V[d] = result;
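As a rough illustration of the loop above, the following Python sketch applies the signed comparison lane by lane. The function names and the integer representation of the vectors are assumptions for this example; the `minimum` argument mirrors the `minimum` flag produced by the decode.

```python
def to_signed(value: int, esize: int) -> int:
    """Interpret an esize-bit pattern as a two's complement integer."""
    sign_bit = 1 << (esize - 1)
    return (value ^ sign_bit) - sign_bit

def signed_maxmin(op1: int, op2: int, esize: int, datasize: int, minimum: bool = False) -> int:
    """Element-wise signed max (or min) of two packed vectors (hypothetical helper)."""
    mask = (1 << esize) - 1
    pick = min if minimum else max
    result = 0
    for e in range(datasize // esize):
        a = to_signed((op1 >> (e * esize)) & mask, esize)
        b = to_signed((op2 >> (e * esize)) & mask, esize)
        # maxmin<esize-1:0>: keep only the low esize bits of the chosen value
        result |= (pick(a, b) & mask) << (e * esize)
    return result
```

For example, with two 8-bit lanes, comparing 0x80 (-128) with 0xFF (-1) selects 0xFF for the maximum, which an unsigned comparison would also pick, while comparing 0x7F (127) with 0x01 selects 0x7F.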
C7.2.228 SMAXP


Signed Maximum Pairwise. This instruction creates a vector by concatenating the vector elements of the first source
SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent
vector elements in the two source SIMD&FP registers, writes the largest of each pair of signed integer values into
a vector, and writes the vector to the destination SIMD&FP register.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm   | 1  0  1  0| 0| 1|   Rn  |   Rd  |
        U                                             o1


Three registers of the same type variant 


SMAXP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 
boolean minimum = (o1 == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1


The encoding size = 11, Q = x is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;
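The pairwise form differs from the element-wise form only in how operands are selected: adjacent elements of the concatenation `operand2:operand1` are compared, so the low half of the result comes from the first source and the high half from the second. A minimal Python sketch, assuming integers stand in for registers and using hypothetical helper names:

```python
def pack(values, esize):
    """Pack a list of signed lane values into one integer, lane 0 in the low bits."""
    mask = (1 << esize) - 1
    return sum((v & mask) << (i * esize) for i, v in enumerate(values))

def to_signed(value, esize):
    sign_bit = 1 << (esize - 1)
    return (value ^ sign_bit) - sign_bit

def pairwise_signed_max(op1, op2, esize, datasize):
    """Signed maximum of each adjacent pair in operand2:operand1."""
    mask = (1 << esize) - 1
    concat = (op2 << datasize) | op1           # operand1 occupies the low bits
    result = 0
    for e in range(datasize // esize):
        a = to_signed((concat >> ((2 * e) * esize)) & mask, esize)
        b = to_signed((concat >> ((2 * e + 1) * esize)) & mask, esize)
        result |= (max(a, b) & mask) << (e * esize)
    return result
```

With the 8B arrangement, lanes `[1, -1, -128, 127, 0, 0, -5, -6]` and `[5, 3, -2, -3, 10, -10, 7, 7]` reduce to `[1, 127, 0, -5, 5, -2, 10, 7]`.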
C7.2.229 SMAXV


Signed Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, 
and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction 
are signed integer values. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16|15 14 13 12|11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1  1  0  0  0| 0| 1  0  1  0| 1  0|   Rn  |   Rd  |
        U                                     op


Advanced SIMD variant 


SMAXV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
if size:Q == '100' then ReservedValue();
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');
boolean min = (op == '1');


Assembler symbols 


<V> Is the destination width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
The encoding size = 11 is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 10, Q = 0.
• size = 11, Q = x.
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;
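The across-vector form is a reduction: a running value starts at element 0 and is updated with every remaining element. The Python sketch below models that with integers; the helper names are assumptions for illustration, and the `minimum` argument corresponds to the `min` flag in the decode.

```python
def pack(values, esize):
    """Pack signed lane values into one integer, lane 0 in the low bits."""
    mask = (1 << esize) - 1
    return sum((v & mask) << (i * esize) for i, v in enumerate(values))

def to_signed(value, esize):
    sign_bit = 1 << (esize - 1)
    return (value ^ sign_bit) - sign_bit

def signed_reduce(operand, esize, datasize, minimum=False):
    """Return the esize-bit pattern of the signed max (or min) over all lanes."""
    mask = (1 << esize) - 1
    pick = min if minimum else max
    maxmin = to_signed(operand & mask, esize)            # element 0 seeds the reduction
    for e in range(1, datasize // esize):
        element = to_signed((operand >> (e * esize)) & mask, esize)
        maxmin = pick(maxmin, element)
    return maxmin & mask                                 # maxmin<esize-1:0>
```

For the 8B lanes `[3, -7, 100, -128, 0, 5, 6, 7]` the maximum is 100 (0x64) and the minimum is -128, returned as the bit pattern 0x80.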
C7.2.230 SMIN
Signed Minimum (vector). This instruction compares corresponding elements in the vectors in the two source 
SIMD&FP registers, places the smaller of each of the two signed integer values into a vector, and writes the vector 
to the destination SIMD&FP register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm   | 0  1  1  0| 1| 1|   Rn  |   Rd  |
        U                                             o1
Three registers of the same type variant 
SMIN <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean minimum = (o1 == '1');
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;
V[d] = result;
C7.2.231 SMINP


Signed Minimum Pairwise. This instruction creates a vector by concatenating the vector elements of the first source 
SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent 
vector elements in the two source SIMD&FP registers, writes the smallest of each pair of signed integer values into 
a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm   | 1  0  1  0| 1| 1|   Rn  |   Rd  |
        U                                             o1


Three registers of the same type variant 


SMINP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 
boolean minimum = (o1 == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1


The encoding size = 11, Q = x is reserved. 


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
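The arrangement table for the three-registers-of-the-same-type encodings follows directly from the decode: `size` selects the element width and `Q` selects a 64-bit or 128-bit vector. A small Python sketch of that mapping follows; the function name is hypothetical and the reserved case is modelled as an exception purely for illustration.

```python
def arrangement(size: int, q: int) -> str:
    """Map the size and Q fields to an arrangement specifier such as '8B' or '4S'."""
    if size == 0b11:
        # The encoding size = 11, Q = x is reserved for these instructions.
        raise ValueError("reserved encoding")
    esize = 8 << size                  # 8, 16, or 32 bits per element
    datasize = 128 if q else 64        # Q selects the vector width
    return f"{datasize // esize}{'BHS'[size]}"
```

For instance, `arrangement(0b01, 1)` yields `8H`: eight 16-bit elements in a 128-bit vector.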
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;
C7.2.232 SMINV


Signed Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, 
and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this 
instruction are signed integer values. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16|15 14 13 12|11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1  1  0  0  0| 1| 1  0  1  0| 1  0|   Rn  |   Rd  |
        U                                     op


Advanced SIMD variant 


SMINV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
if size:Q == '100' then ReservedValue();
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');
boolean min = (op == '1');


Assembler symbols 


<V> Is the destination width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
The encoding size = 11 is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 10, Q = 0.
• size = 11, Q = x.
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;
C7.2.233 SMLAL, SMLAL2 (by element)


Signed Multiply-Add Long (vector, by element). This instruction multiplies each vector element in the lower or 
upper half of the first source SIMD&FP register by the specified vector element in the second source SIMD&FP 
register, and accumulates the results with the vector elements of the destination SIMD&FP register. The destination 
vector elements are twice as long as the elements that are multiplied. All the values in this instruction are signed 
integer values. 


The SMLAL instruction extracts vector elements from the lower half of the first source register, while the SMLAL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20|19    16|15|14|13 12|11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  1| size| L| M|   Rm   | 0| 0| 1  0| H| 0|   Rn  |   Rd  |
        U                                       o2


Vector variant 


SMLAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');
boolean sub_op = (o2 == '1');


Assembler symbols 





2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
4S when size = 01
2D when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
0:Rm when size = 01
M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
H when size = 01
S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
H:L:M when size = 01
H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;

element2 = Int(Elem[operand2, index, esize], unsigned);
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] - product;
    else
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] + product;

V[d] = result;
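The widening multiply-accumulate above can be sketched in Python as follows. This is an illustrative model only: registers are plain integers, the helper names are invented for the example, and `part` and `subtract` stand in for the decoded `part` and `sub_op` values.

```python
def pack(values, esize):
    """Pack signed lane values into one integer, lane 0 in the low bits."""
    mask = (1 << esize) - 1
    return sum((v & mask) << (i * esize) for i, v in enumerate(values))

def to_signed(value, esize):
    sign_bit = 1 << (esize - 1)
    return (value ^ sign_bit) - sign_bit

def mul_acc_long_by_element(acc, op1, op2, index, esize, part=0, subtract=False):
    """Widening multiply of one half of op1 by a single element of op2, then accumulate."""
    datasize = 64
    mask = (1 << esize) - 1
    wide_mask = (1 << (2 * esize)) - 1
    half = (op1 >> (part * datasize)) & ((1 << datasize) - 1)     # Vpart[n, part]
    element2 = to_signed((op2 >> (index * esize)) & mask, esize)  # the selected element
    result = 0
    for e in range(datasize // esize):
        element1 = to_signed((half >> (e * esize)) & mask, esize)
        product = element1 * element2
        old = (acc >> (e * 2 * esize)) & wide_mask
        new = (old - product if subtract else old + product) & wide_mask
        result |= new << (e * 2 * esize)                          # 2*esize-bit result lane
    return result
```

Each result lane is twice as wide as the source lanes, so a 4H source half produces a 4S result. Passing `part=1` selects the upper 64 bits of the first source, as the `2` suffix does.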
C7.2.234 SMLAL, SMLAL2 (vector)


Signed Multiply-Add Long (vector). This instruction multiplies corresponding signed integer values in the lower or
upper half of the vectors of the two source SIMD&FP registers, and accumulates the results with the vector elements
of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are
multiplied.


The SMLAL instruction extracts each source vector from the lower half of each source register, while the SMLAL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14|13|12 11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm   | 1  0| 0| 0  0  0|   Rn  |   Rd  |
        U                                       o1


Three registers, not all the same type variant 


SMLAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;
boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;
C7.2.235 SMLSL, SMLSL2 (by element)


Signed Multiply-Subtract Long (vector, by element). This instruction multiplies each vector element in the lower or
upper half of the first source SIMD&FP register by the specified vector element of the second source SIMD&FP
register and subtracts the results from the vector elements of the destination SIMD&FP register. The destination
vector elements are twice as long as the elements that are multiplied.


The SMLSL instruction extracts vector elements from the lower half of the first source register, while the SMLSL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 10 9 | 5 4| 0 |

Vector variant


SMLSL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');
boolean sub_op = (o2 == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
4S when size = 01
2D when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
0:Rm when size = 01
M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
H when size = 01
S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
H:L:M when size = 01
H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;

element2 = Int(Elem[operand2, index, esize], unsigned);
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] - product;
    else
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] + product;

V[d] = result;
C7.2.236 SMLSL, SMLSL2 (vector)
Signed Multiply-Subtract Long (vector). This instruction multiplies corresponding signed integer values in the 
lower or upper half of the vectors of the two source SIMD&FP registers, and subtracts the results from the vector 
elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements 
that are multiplied. 
The SMLSL instruction extracts each source vector from the lower half of each source register, while the SMLSL2 
instruction extracts each source vector from the upper half of each source register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14|13|12 11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm   | 1  0| 1| 0  0  0|   Rn  |   Rd  |
        U                                       o1
Three registers, not all the same type variant 
SMLSL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 
integer part = UInt(Q); 
integer elements = datasize DIV esize; 
boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1'); 
Assembler symbols 
2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1


The encoding size = 11, Q = x is reserved. 


<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;
C7.2.237 SMOV


Signed Move vector element to general-purpose register. This instruction reads the signed integer from the source
SIMD&FP register, sign-extends it to form a 32-bit or 64-bit value, and writes the result to the destination
general-purpose register.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 13 12|11 10 9 | 5 4| 0 |

32-bit variant

Applies when Q == 0. 

SMOV <Wd>, <Vn>.<Ts>[<index>] 
64-bit variant 

Applies when Q == 1. 


SMOV <Xd>, <Vn>.<Ts>[<index>] 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer size; 

case Q:imm5 of 
when 'xxxxx1' size = 0; // SMOV [WX]d, Vn.B
when 'xxxx10' size = 1; // SMOV [WX]d, Vn.H 
when '1xx100' size = 2; // SMOV Xd, Vn.S 
otherwise UnallocatedEncoding(); 


integer idxdsize = if imm5<4> == '1' then 128 else 64; 
integer index = UInt(imm5<4:size+1>);
integer esize = 8 << size; 
integer datasize = if Q == '1' then 64 else 32; 
Assembler symbols 
<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


<Ts> For the 32-bit variant: is an element size specifier, encoded in the "imm5" field. It can have the
following values:
B when imm5 = xxxx1
H when imm5 = xxx10
The encoding imm5 = xxx00 is reserved.
For the 64-bit variant: is an element size specifier, encoded in the "imm5" field. It can have the
following values:
B when imm5 = xxxx1
H when imm5 = xxx10
S when imm5 = xx100
The encoding imm5 = xx000 is reserved.
<index> For the 32-bit variant: is the element index encoded in the "imm5" field. It can have the following
values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
The encoding imm5 = xxx00 is reserved.
For the 64-bit variant: is the element index encoded in the "imm5" field. It can have the following
values:
imm5<4:1> when imm5 = xxxx1
imm5<4:2> when imm5 = xxx10
imm5<4:3> when imm5 = xx100
The encoding imm5 = xx000 is reserved.


Operation 


CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);
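The decode and operation above can be combined into a short Python sketch. The position of the lowest set bit of `imm5` gives the element size, the bits above it give the index, and the selected element is sign-extended to 32 or 64 bits. The function names are hypothetical, and an unallocated encoding is modelled as an exception purely for illustration.

```python
def smov_decode(q: int, imm5: int):
    """Return (size, index) from the Q and imm5 fields, following the case statement."""
    if imm5 & 0b1:
        size = 0                           # 'xxxxx1': byte element
    elif imm5 & 0b11 == 0b10:
        size = 1                           # 'xxxx10': halfword element
    elif q == 1 and imm5 & 0b111 == 0b100:
        size = 2                           # '1xx100': word element, 64-bit variant only
    else:
        raise ValueError("unallocated encoding")
    index = imm5 >> (size + 1)             # UInt(imm5<4:size+1>)
    return size, index

def smov(vn: int, q: int, imm5: int) -> int:
    """Sign-extend the selected element of vn to a 32-bit or 64-bit pattern."""
    size, index = smov_decode(q, imm5)
    esize = 8 << size
    datasize = 64 if q else 32
    elem = (vn >> (index * esize)) & ((1 << esize) - 1)
    sign_bit = 1 << (esize - 1)
    signed = (elem ^ sign_bit) - sign_bit
    return signed & ((1 << datasize) - 1)  # SignExtend(..., datasize)
```

For byte index 3 the field value is `imm5 = 0b00111`. If that byte holds 0x80, the 32-bit variant returns 0xFFFFFF80 and the 64-bit variant returns 0xFFFFFFFFFFFFFF80.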
C7.2.238 SMULL, SMULL2 (by element)


Signed Multiply Long (vector, by element). This instruction multiplies each vector element in the lower or upper 
half of the first source SIMD&FP register by the specified vector element of the second source SIMD&FP register, 
places the result in a vector, and writes the vector to the destination SIMD&FP register. The destination vector 
elements are twice as long as the elements that are multiplied. 


The SMULL instruction extracts vector elements from the lower half of the first source register, while the SMULL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 10 9 | 5 4| 0 |

Vector variant


SMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 

integer index; 

bit Rmhi; 

case size of 
when '01' index = UInt(H:L:M); Rmhi = '0';
when '10' index = UInt(H:L); Rmhi = M; 
otherwise UnallocatedEncoding(); 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


    [absent] when Q = 0
    [present] when Q = 1


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
    4S when size = 01
    2D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1

The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.

Restricted to V0-V15 when element size <Ts> is H.

<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;

element2 = Int(Elem[operand2, index, esize], unsigned);
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = product;

V[d] = result;
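
The operation above can be modelled in Python as a rough sketch. Registers are represented as 128-bit integers, and the names `elems`, `to_int`, and `mull_by_element` are hypothetical helpers introduced only for this illustration.

```python
def elems(value, esize, count):
    """Split an integer into `count` elements of `esize` bits, lowest first."""
    mask = (1 << esize) - 1
    return [(value >> (i * esize)) & mask for i in range(count)]


def to_int(bits, width, unsigned):
    """Model Int(bits, unsigned): interpret a bit pattern as an integer."""
    if not unsigned and bits >> (width - 1):
        return bits - (1 << width)
    return bits


def mull_by_element(vn, vm, esize, part, index, unsigned=False):
    """Widening multiply of one half of vn by a single element of vm."""
    datasize = 64
    # Vpart[n, part]: part 0 is the lower half, part 1 the upper half.
    operand1 = (vn >> (part * datasize)) & ((1 << datasize) - 1)
    element2 = to_int((vm >> (index * esize)) & ((1 << esize) - 1), esize, unsigned)
    result = 0
    for e, raw in enumerate(elems(operand1, esize, datasize // esize)):
        product = to_int(raw, esize, unsigned) * element2
        # Keep the low 2*esize bits and place them in destination element e.
        result |= (product & ((1 << (2 * esize)) - 1)) << (e * 2 * esize)
    return result
```

Calling it with `part=0` corresponds to SMULL and `part=1` to SMULL2. Because each product of two esize-bit values fits in 2*esize bits, the truncation in the loop never discards significant bits.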

C7.2.239 SMULL, SMULL2 (vector)


Signed Multiply Long (vector). This instruction multiplies corresponding signed integer values in the lower or upper 
half of the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the 
destination SIMD&FP register. 


The destination vector elements are twice as long as the elements that are multiplied. 


The SMULL instruction extracts each source vector from the lower half of each source register, while the SMULL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20| 16|15 14 13 12|11 10 9 | 5 4 | 0 |

Three registers, not all the same type variant


SMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


    [absent] when Q = 0
    [present] when Q = 1


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
    8H when size = 00
    4S when size = 01
    2D when size = 10


The encoding size = 11 is reserved. 





<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1

The encoding size = 11, Q = x is reserved.


<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

C7.2.240 SQABS


Signed saturating Absolute value. This instruction reads each vector element from the source SIMD&FP register, 
puts the absolute value of the result into a vector, and writes the vector to the destination SIMD&FP register. All the 
values in this instruction are signed integer values. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar

Scalar variant


SQABS <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean neg = (U == '1'); 


Vector

Vector variant


SQABS <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean neg = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
    B when size = 00
    H when size = 01
    S when size = 10
    D when size = 11
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;
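
As a rough Python model of the operation above, the sketch below shows why saturation is needed: the most negative value has no positive counterpart of the same width. The helper names are hypothetical, the register is a plain integer, and the returned flag stands in for FPSR.QC.

```python
def signed_sat_q(value, n):
    """Clamp value to the signed n-bit range; return (bit pattern, saturated)."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped & ((1 << n) - 1), clamped != value


def sqabs(vn, esize, datasize, neg=False):
    """Return (result, qc) for a saturating absolute value (or negate)."""
    mask = (1 << esize) - 1
    result, qc = 0, False
    for e in range(datasize // esize):
        raw = (vn >> (e * esize)) & mask
        # SInt(): reinterpret the element as a signed integer.
        element = raw - (1 << esize) if raw >> (esize - 1) else raw
        element = -element if neg else abs(element)
        bits, sat = signed_sat_q(element, esize)
        result |= bits << (e * esize)
        qc = qc or sat
    return result, qc
```

With 8-bit elements, an input element of 0x80 (that is, -128) produces 0x7F and sets the saturation flag, while 0xFF (that is, -1) produces 0x01 without saturating.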

C7.2.241 SQADD


Signed saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP 
registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 |
| 0  | 1  | 0  | 1  1  1  1  0  | size  | 1  |   Rm   | 0  0  0  0  1  | 1  |  Rn  |  Rd  |
            U


Scalar variant 


SQADD <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 


integer elements = 1; 

boolean unsigned = (U == '1'); 
Vector 

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9  5 | 4  0 |
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 1  |   Rm   | 0  0  0  0  1  | 1  |  Rn  |  Rd  |
            U


Vector variant 


SQADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
    B when size = 00
    H when size = 01
    S when size = 10
    D when size = 11
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
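
The saturating addition above can be sketched in Python. The names `sat_q` and `qadd` are hypothetical, registers are plain integers, and the second return value stands in for the FPSR.QC bit. The `unsigned` parameter mirrors the pseudocode variable of the same name.

```python
def sat_q(value, n, unsigned):
    """Clamp value to the n-bit signed or unsigned range."""
    if unsigned:
        lo, hi = 0, (1 << n) - 1
    else:
        lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped & ((1 << n) - 1), clamped != value


def qadd(vn, vm, esize, datasize, unsigned=False):
    """Element-wise saturating add; returns (result, qc)."""
    mask = (1 << esize) - 1

    def to_int(bits):
        if not unsigned and bits >> (esize - 1):
            return bits - (1 << esize)
        return bits

    result, qc = 0, False
    for e in range(datasize // esize):
        element1 = to_int((vn >> (e * esize)) & mask)
        element2 = to_int((vm >> (e * esize)) & mask)
        bits, sat = sat_q(element1 + element2, esize, unsigned)
        result |= bits << (e * esize)
        qc = qc or sat
    return result, qc
```

Note that the flag is cumulative: a single saturating element is enough to set it, and elements that do not saturate never clear it.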

C7.2.242 SQDMLAL, SQDMLAL2 (by element)


Signed saturating Doubling Multiply-Add Long (by element). This instruction multiplies each vector element in the 
lower or upper half of the first source SIMD&FP register by the specified vector element of the second source 
SIMD&FP register, doubles the results, and accumulates the final results with the vector elements of the destination 
SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


The SQDMLAL instruction extracts vector elements from the lower half of the first source register, while the SQDMLAL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9  5 | 4  0 |
| 0  | 1  | 0  | 1  1  1  1  1  | size  | L  | M  |   Rm   | 0  0  1  1  | H  | 0  |  Rn  |  Rd  |
                                                                o2


Scalar variant 


SQDMLAL <Va><d>, <Vb><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 

integer index; 

bit Rmhi; 

case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

integer part = 0; 


boolean sub_op = (o2 == '1');

Vector

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9  5 | 4  0 |
| 0  | Q  | 0  | 0  1  1  1  1  | size  | L  | M  |   Rm   | 0  0  1  1  | H  | 0  |  Rn  |  Rd  |
                                                                o2


Vector variant 


SQDMLAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 
Decode for this encoding

integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o2 == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1

The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Va> Is the destination width specifier, encoded in the "size" field. It can have the following values:
    S when size = 01
    D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vb> Is the source width specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.

Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

element2 = SInt(Elem[operand2, index, esize]);
for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;
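
The two saturation points in the operation above (one on the doubled product, one on the accumulation) can be illustrated with a small Python sketch of a single element. All function names here are hypothetical; the returned flag stands in for FPSR.QC, and `sub_op` mirrors the pseudocode variable.

```python
def sint(bits, n):
    """Interpret an n-bit pattern as a signed integer."""
    return bits - (1 << n) if bits >> (n - 1) else bits


def signed_sat(value, n):
    """Clamp value to the signed n-bit range; return (integer, saturated)."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def sqdmlal_lane(acc_bits, a_bits, b_bits, esize, sub_op=False):
    """One element: saturate 2*a*b, then saturate the accumulate."""
    product, sat1 = signed_sat(2 * sint(a_bits, esize) * sint(b_bits, esize), 2 * esize)
    acc = sint(acc_bits, 2 * esize)
    accum = acc - product if sub_op else acc + product
    result, sat2 = signed_sat(accum, 2 * esize)
    return result & ((1 << (2 * esize)) - 1), sat1 or sat2


def sqdmlal_by_element_scalar(vd, vn, vm, esize, index, sub_op=False):
    """Scalar form: the second operand is element `index` of vm."""
    mask = (1 << esize) - 1
    element2_bits = (vm >> (index * esize)) & mask
    acc_bits = vd & ((1 << (2 * esize)) - 1)
    return sqdmlal_lane(acc_bits, vn & mask, element2_bits, esize, sub_op)
```

The only product that can saturate is the one where both elements hold the most negative value, because doubling it gives a result one greater than the largest representable value of the doubled width.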

C7.2.243 SQDMLAL, SQDMLAL2 (vector)
Signed saturating Doubling Multiply-Add Long. This instruction multiplies corresponding signed integer values in 
the lower or upper half of the vectors of the two source SIMD&FP registers, doubles the results, and accumulates 
the final results with the vector elements of the destination SIMD&FP register. The destination vector elements are 
twice as long as the elements that are multiplied. 
If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 
The SQDMLAL instruction extracts each source vector from the lower half of each source register, while the SQDMLAL2 
instruction extracts each source vector from the upper half of each source register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 | 11 10 | 9  5 | 4  0 |
| 0  | 1  | 0  | 1  1  1  1  0  | size  | 1  |   Rm   | 1  0  0  1  | 0  0  |  Rn  |  Rd  |
                                                              o1
Scalar variant 
SQDMLAL <Va><d>, <Vb><n>, <Vb><m> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '00' || size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;
integer part = 0;
boolean sub_op = (o1 == '1');

Vector

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 | 11 10 | 9  5 | 4  0 |
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 1  |   Rm   | 1  0  0  1  | 0  0  |  Rn  |  Rd  |
Vector variant 
SQDMLAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '00' || size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1

The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
<Va> Is the destination width specifier, encoded in the "size" field. It can have the following values:
    S when size = 01
    D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vb> Is the source width specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

Operation for all encodings

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;
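
As a worked illustration of the saturation behavior, consider one element with esize = 16, both source elements equal to the most negative 16-bit value, and an accumulator element of zero. The symbols e_1 and e_2 below stand for element1 and element2 and are used only for this example.

```latex
\begin{aligned}
e_1 = e_2 &= -2^{15} \\
2 \cdot e_1 \cdot e_2 &= 2^{31} > 2^{31} - 1
  \;\Rightarrow\; \text{product} = 2^{31} - 1,\quad \text{sat1} = \text{TRUE} \\
\text{accum} &= 0 + (2^{31} - 1) = 2^{31} - 1
  \;\Rightarrow\; \text{result} = 2^{31} - 1,\quad \text{sat2} = \text{FALSE}
\end{aligned}
```

Because sat1 is TRUE, FPSR.QC is set even though the accumulation step itself does not saturate.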

C7.2.244 SQDMLSL, SQDMLSL2 (by element)


Signed saturating Doubling Multiply-Subtract Long (by element). This instruction multiplies each vector element 
in the lower or upper half of the first source SIMD&FP register by the specified vector element of the second source 
SIMD&FP register, doubles the results, and subtracts the final results from the vector elements of the destination 
SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. All the 
values in this instruction are signed integer values. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


The SQDMLSL instruction extracts vector elements from the lower half of the first source register, while the SQDMLSL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9  5 | 4  0 |
| 0  | 1  | 0  | 1  1  1  1  1  | size  | L  | M  |   Rm   | 0  1  1  1  | H  | 0  |  Rn  |  Rd  |
                                                                o2


Scalar variant 


SQDMLSL <Va><d>, <Vb><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 

integer index; 

bit Rmhi; 

case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

integer part = 0; 


boolean sub_op = (o2 == '1');

Vector

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9  5 | 4  0 |
| 0  | Q  | 0  | 0  1  1  1  1  | size  | L  | M  |   Rm   | 0  1  1  1  | H  | 0  |  Rn  |  Rd  |


Vector variant 


SQDMLSL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 
Decode for this encoding

integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o2 == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1

The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Va> Is the destination width specifier, encoded in the "size" field. It can have the following values:
    S when size = 01
    D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vb> Is the source width specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.

Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

element2 = SInt(Elem[operand2, index, esize]);
for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

C7.2.245 SQDMLSL, SQDMLSL2 (vector)


Signed saturating Doubling Multiply-Subtract Long. This instruction multiplies corresponding signed integer 
values in the lower or upper half of the vectors of the two source SIMD&FP registers, doubles the results, and 
subtracts the final results from the vector elements of the destination SIMD&FP register. The destination vector 
elements are twice as long as the elements that are multiplied. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


The SQDMLSL instruction extracts each source vector from the lower half of each source register, while the SQDMLSL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 | 11 10 | 9  5 | 4  0 |
| 0  | 1  | 0  | 1  1  1  1  0  | size  | 1  |   Rm   | 1  0  1  1  | 0  0  |  Rn  |  Rd  |
                                                              o1


Scalar variant 


SQDMLSL <Va><d>, <Vb><n>, <Vb><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '00' || size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;
integer part = 0;

boolean sub_op = (o1 == '1');

Vector

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 | 11 10 | 9  5 | 4  0 |
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 1  |   Rm   | 1  0  1  1  | 0  0  |  Rn  |  Rd  |


Vector variant 


SQDMLSL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '00' || size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1

The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
<Va> Is the destination width specifier, encoded in the "size" field. It can have the following values:
    S when size = 01
    D when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vb> Is the source width specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

Operation for all encodings

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

C7.2.246 SQDMULH (by element)


Signed saturating Doubling Multiply returning High half (by element). This instruction multiplies each vector 
element in the first source SIMD&FP register by the specified vector element of the second source SIMD&FP 
register, doubles the results, places the most significant half of the final results into a vector, and writes the vector 
to the destination SIMD&FP register. 


The results are truncated. For rounded results, see SQRDMULH (by element). 
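
The difference between truncated and rounded results can be sketched in Python for a single element. This is an assumption-laden illustration: it supposes that the doubled product is shifted right by the element size, with an optional rounding constant of half a unit added first for the rounding form, and that the shifted value is then saturated. The function name and the `rounding` parameter are hypothetical.

```python
def sint(bits, n):
    """Interpret an n-bit pattern as a signed integer."""
    return bits - (1 << n) if bits >> (n - 1) else bits


def sqdmulh_lane(a_bits, b_bits, esize, rounding=False):
    """Return (result bits, saturated) for one doubling-multiply-high element."""
    # Assumed rounding constant: half of the discarded low part.
    round_const = (1 << (esize - 1)) if rounding else 0
    product = 2 * sint(a_bits, esize) * sint(b_bits, esize) + round_const
    high = product >> esize  # arithmetic shift keeps the most significant half
    lo, hi = -(1 << (esize - 1)), (1 << (esize - 1)) - 1
    clamped = max(lo, min(hi, high))
    return clamped & ((1 << esize) - 1), clamped != high
```

Treating 16-bit elements as Q15 fixed-point fractions, `sqdmulh_lane(0x4000, 0x4000, 16)` multiplies 0.5 by 0.5 and gives 0x2000, which is 0.25. The input pair 0x8000 and 0x8000 is the one case that saturates.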


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9  5 | 4  0 |
| 0  | 1  | 0  | 1  1  1  1  1  | size  | L  | M  |   Rm   | 1  1  0  0  | H  | 0  |  Rn  |  Rd  |
                                                                      op


Scalar variant 


SQDMULH <V><d>, <V><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi; 
case size of 
    when '01' index = UInt(H:L:M); Rmhi = '0'; 
    when '10' index = UInt(H:L); Rmhi = M; 
    otherwise UnallocatedEncoding(); 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean round = (op == '1'); 
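
The case statement in the decode above packs the element index and the top bit of the register number into the H, L and M fields differently for each element size. A minimal Python sketch of that mapping follows; the function name decode_by_element and the use of plain integers for the bit fields are assumptions made for illustration only.

```python
def decode_by_element(size, h, l, m, rm):
    """Return (index, m_reg, esize) for a by-element encoding (hypothetical helper).

    size is the 2-bit size field, h/l/m are single bits, rm is the 4-bit Rm field.
    """
    if size == 0b01:
        # 16-bit elements: index is H:L:M, register number is 0:Rm (V0-V15 only)
        index = (h << 2) | (l << 1) | m
        rmhi = 0
    elif size == 0b10:
        # 32-bit elements: index is H:L, register number is M:Rm
        index = (h << 1) | l
        rmhi = m
    else:
        raise ValueError("unallocated encoding")
    m_reg = (rmhi << 4) | rm
    esize = 8 << size
    return index, m_reg, esize
```

For 16-bit elements the M bit is consumed by the index, which is why the second source register is restricted to V0-V15 in that case.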
Vector 


| 31 | 30 | 29 | 28 | 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 | 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  Q |  0 |  0 |  1  1  1  1 | size  |  L |  M |   Rm   |  1  1  0 |  0 |  H |  0 |  Rn   |  Rd   | 
op: bit 12 


Vector variant 


SQDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi; 
case size of 
    when '01' index = UInt(H:L:M); Rmhi = '0'; 
    when '10' index = UInt(H:L); Rmhi = M; 
    otherwise UnallocatedEncoding(); 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 

integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean round = (op == '1'); 


Assembler symbols 

<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The following encodings are reserved: 
• size = 00, Q = x. 
• size = 11, Q = x. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have 
the following values: 
0:Rm when size = 01 
M:Rm when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
Restricted to V0-V15 when element size <Ts> is H. 
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values: 
H:L:M when size = 01 
H:L when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(idxdsize) operand2 = V[m]; 
bits(datasize) result; 
integer round_const = if round then 1 << (esize - 1) else 0; 
integer element1; 
integer element2; 
integer product; 
boolean sat; 

element2 = SInt(Elem[operand2, index, esize]); 
for e = 0 to elements-1 
    element1 = SInt(Elem[operand1, e, esize]); 
    product = (2 * element1 * element2) + round_const; 
    // The following only saturates if element1 and element2 equal -(2^(esize-1)) 
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize); 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
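
The per-element step of this operation can be sketched in Python as below, with the rounding flag selecting between the truncating (SQDMULH) and rounding (SQRDMULH) behavior. This is an illustrative model under stated assumptions rather than a definitive implementation: signed_sat_q and sq_dmulh_lane are hypothetical names, the inputs are assumed to already be signed esize-bit values, and the returned flag stands in for setting FPSR.QC.

```python
def signed_sat_q(value, nbits):
    """Saturate an unbounded integer to the signed nbits range; return (value, saturated)."""
    lo = -(1 << (nbits - 1))
    hi = (1 << (nbits - 1)) - 1
    if value > hi:
        return hi, True
    if value < lo:
        return lo, True
    return value, False


def sq_dmulh_lane(a, b, esize, rounding=False):
    """Doubling multiply returning the high half, for one lane (hypothetical helper)."""
    round_const = (1 << (esize - 1)) if rounding else 0
    product = 2 * a * b + round_const
    # Python's >> on int is an arithmetic (floor) shift, matching the integer shift used here
    return signed_sat_q(product >> esize, esize)
```

In this model the truncating form saturates only when both inputs are -(2^(esize-1)), which agrees with the comment in the pseudocode.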
C7.2.247 SQDMULH (vector) 
Signed saturating Doubling Multiply returning High half. This instruction multiplies the values of corresponding 
elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final 
results into a vector, and writes the vector to the destination SIMD&FP register. 
The results are truncated. For rounded results, see SQRDMULH (vector). 
If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9   5 | 4   0 | 
|  0 |  1 |  0 |  1  1  1  1  0 | size  |  1 |   Rm   |  1  0  1  1  0 |  1 |  Rn   |  Rd   | 
U: bit 29 
Scalar variant 
SQDMULH <V><d>, <V><n>, <V><m> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' || size == '00' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 
boolean rounding = (U == '1'); 
Vector 
| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9   5 | 4   0 | 
|  0 |  Q |  0 |  0  1  1  1  0 | size  |  1 |   Rm   |  1  0  1  1  0 |  1 |  Rn   |  Rd   | 
U: bit 29 
Vector variant 
SQDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' || size == '00' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean rounding = (U == '1'); 
Assembler symbols 
<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The following encodings are reserved: 
• size = 00, Q = x. 
• size = 11, Q = x. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer round_const = if rounding then 1 << (esize - 1) else 0; 
integer element1; 
integer element2; 
integer product; 
boolean sat; 

for e = 0 to elements-1 
    element1 = SInt(Elem[operand1, e, esize]); 
    element2 = SInt(Elem[operand2, e, esize]); 
    product = (2 * element1 * element2) + round_const; 
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize); 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
C7.2.248 SQDMULL, SQDMULL2 (by element) 


Signed saturating Doubling Multiply Long (by element). This instruction multiplies each vector element in the 
lower or upper half of the first source SIMD&FP register by the specified vector element of the second source 
SIMD&FP register, doubles the results, places the final results in a vector, and writes the vector to the destination 
SIMD&FP register. All the values in this instruction are signed integer values. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


The SQDMULL instruction extracts the first source vector from the lower half of the first source register, while the 
SQDMULL2 instruction extracts the first source vector from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 | 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  1 |  0 |  1 |  1  1  1  1 | size  |  L |  M |   Rm   |  1  0  1  1 |  H |  0 |  Rn   |  Rd   | 


Scalar variant 


SQDMULL <Va><d>, <Vb><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi; 
case size of 
    when '01' index = UInt(H:L:M); Rmhi = '0'; 
    when '10' index = UInt(H:L); Rmhi = M; 
    otherwise UnallocatedEncoding(); 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 
integer part = 0; 


Vector 


| 31 | 30 | 29 | 28 | 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  Q |  0 |  0 |  1  1  1  1 | size  |  L |  M |   Rm   |  1  0  1  1 |  H |  0 |  Rn   |  Rd   | 


Vector variant 


SQDMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 

bit Rmhi; 

case size of 
    when '01' index = UInt(H:L:M); Rmhi = '0'; 
    when '10' index = UInt(H:L); Rmhi = M; 
    otherwise UnallocatedEncoding(); 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 
[absent] when Q = 0 
[present] when Q = 1 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
4S when size = 01 
2D when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The following encodings are reserved: 
• size = 00, Q = x. 
• size = 11, Q = x. 
<Va> Is the destination width specifier, encoded in the "size" field. It can have the following values: 
S when size = 01 
D when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vb> Is the source width specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have 
the following values: 
0:Rm when size = 01 
M:Rm when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
Restricted to V0-V15 when element size <Ts> is H. 
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values: 
H:L:M when size = 01 
H:L when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 


Operation for all encodings 
CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = Vpart[n, part]; 
bits(idxdsize) operand2 = V[m]; 
bits(2*datasize) result; 
integer element1; 
integer element2; 
bits(2*esize) product; 
boolean sat; 

element2 = SInt(Elem[operand2, index, esize]); 
for e = 0 to elements-1 
    element1 = SInt(Elem[operand1, e, esize]); 
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize); 
    Elem[result, e, 2*esize] = product; 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
C7.2.249 SQDMULL, SQDMULL2 (vector) 


Signed saturating Doubling Multiply Long. This instruction multiplies corresponding vector elements in the lower 
or upper half of the two source SIMD&FP registers, doubles the results, places the final results in a vector, and 
writes the vector to the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


The SQDMULL instruction extracts each source vector from the lower half of each source register, while the SQDMULL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 | 11 10 | 9   5 | 4   0 | 
|  0 |  1 |  0 |  1  1  1  1  0 | size  |  1 |   Rm   |  1  1  0  1 |  0  0 |  Rn   |  Rd   | 


Scalar variant 


SQDMULL <Va><d>, <Vb><n>, <Vb><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size == '00' || size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = esize; 

integer elements = 1; 

integer part = 0; 


Vector 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 | 11 10 | 9   5 | 4   0 | 
|  0 |  Q |  0 |  0  1  1  1  0 | size  |  1 |   Rm   |  1  1  0  1 |  0  0 |  Rn   |  Rd   | 


Vector variant 


SQDMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size == '00' || size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 
Assembler symbols 

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 
[absent] when Q = 0 
[present] when Q = 1 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
4S when size = 01 
2D when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The following encodings are reserved: 
• size = 00, Q = x. 
• size = 11, Q = x. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
<Va> Is the destination width specifier, encoded in the "size" field. It can have the following values: 
S when size = 01 
D when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<Vb> Is the source width specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 





ARM DDI 0487A.k_iss10775 
1ID092916 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C7-1259 


Non-Confidential 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = Vpart[n, part]; 
bits(datasize) operand2 = Vpart[m, part]; 
bits(2*datasize) result; 
integer element1; 
integer element2; 
bits(2*esize) product; 
boolean sat; 

for e = 0 to elements-1 
    element1 = SInt(Elem[operand1, e, esize]); 
    element2 = SInt(Elem[operand2, e, esize]); 
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize); 
    Elem[result, e, 2*esize] = product; 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
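
To show how the part selection and the widening of each element fit together, here is a small Python sketch of the whole vector operation. It is only a model under stated assumptions: registers are represented as plain 128-bit Python integers, the helper names sint, signed_sat_q and sqdmull are hypothetical, and the returned boolean stands in for FPSR.QC.

```python
def sint(value, nbits):
    """Interpret the low nbits of value as a two's-complement signed integer."""
    value &= (1 << nbits) - 1
    return value - (1 << nbits) if value >> (nbits - 1) else value


def signed_sat_q(value, nbits):
    """Saturate an unbounded integer to the signed nbits range; return (value, saturated)."""
    lo = -(1 << (nbits - 1))
    hi = (1 << (nbits - 1)) - 1
    if value > hi:
        return hi, True
    if value < lo:
        return lo, True
    return value, False


def sqdmull(vn, vm, esize, part):
    """Model SQDMULL (part=0) or SQDMULL2 (part=1) on 128-bit register values."""
    datasize = 64
    mask = (1 << datasize) - 1
    operand1 = (vn >> (part * datasize)) & mask  # lower or upper half of Vn
    operand2 = (vm >> (part * datasize)) & mask  # lower or upper half of Vm
    result, qc = 0, False
    for e in range(datasize // esize):
        element1 = sint(operand1 >> (e * esize), esize)
        element2 = sint(operand2 >> (e * esize), esize)
        product, sat = signed_sat_q(2 * element1 * element2, 2 * esize)
        # Each result element is twice as wide as its source elements
        result |= (product & ((1 << (2 * esize)) - 1)) << (e * 2 * esize)
        qc = qc or sat
    return result, qc
```

With 16-bit elements, for example, four source lanes from the selected half produce four 32-bit result lanes that fill the full 128-bit destination.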
C7.2.250 SQNEG 


Signed saturating Negate. This instruction reads each vector element from the source SIMD&FP register, negates 
each value, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values 
in this instruction are signed integer values. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9   5 | 4   0 | 
|  0 |  1 |  1 |  1  1  1  1  0 | size  |  1  0  0  0  0 |  0  0  1  1  1 |  1  0 |  Rn   |  Rd   | 
U: bit 29 


Scalar variant 


SQNEG <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 
boolean neg = (U == '1'); 

Vector 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9   5 | 4   0 | 
|  0 |  Q |  1 |  0  1  1  1  0 | size  |  1  0  0  0  0 |  0  0  1  1  1 |  1  0 |  Rn   |  Rd   | 
U: bit 29 


Vector variant 


SQNEG <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean neg = (U == '1'); 


Assembler symbols 





<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
B when size = 00 
H when size = 01 
S when size = 10 
D when size = 11 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
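
The arrangement specifier table for "size:Q" follows a regular pattern: size selects the element width and Q selects a 64-bit or 128-bit vector. A short Python sketch of that mapping, using the hypothetical function name arrangement and integer arguments for the two fields, might look like this:

```python
def arrangement(size, q):
    """Map the size and Q fields to an arrangement string such as '4H' (hypothetical helper)."""
    if size == 0b11 and q == 0:
        raise ValueError("reserved encoding: size = 11, Q = 0")
    esize = 8 << size                  # element width in bits: 8, 16, 32 or 64
    datasize = 128 if q else 64        # vector width in bits
    return f"{datasize // esize}{'BHSD'[size]}"
```

The reserved combination corresponds to a single 64-bit element in a 64-bit vector, which the table does not list as an arrangement.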


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand = V[n]; 
bits(datasize) result; 
integer element; 
boolean sat; 

for e = 0 to elements-1 
    element = SInt(Elem[operand, e, esize]); 
    if neg then 
        element = -element; 
    else 
        element = Abs(element); 
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize); 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
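
The only input that can saturate in this operation is the most negative value, because its negation (or absolute value) does not fit in esize bits. The Python sketch below models one element to make that visible; signed_sat_q and sq_neg_abs_lane are hypothetical names, and the returned flag stands in for setting FPSR.QC.

```python
def signed_sat_q(value, nbits):
    """Saturate an unbounded integer to the signed nbits range; return (value, saturated)."""
    lo = -(1 << (nbits - 1))
    hi = (1 << (nbits - 1)) - 1
    if value > hi:
        return hi, True
    if value < lo:
        return lo, True
    return value, False


def sq_neg_abs_lane(value, esize, neg=True):
    """Negate (neg=True) or take the absolute value (neg=False) of one signed element."""
    element = -value if neg else abs(value)
    return signed_sat_q(element, esize)
```

The neg flag mirrors the decode variable of the same name, which selects between negation and the absolute-value branch of the shared pseudocode.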
C7.2.251 SQRDMULH (by element) 

Signed saturating Rounding Doubling Multiply returning High half (by element). This instruction multiplies each 
vector element in the first source SIMD&FP register by the specified vector element of the second source 
SIMD&FP register, doubles the results, places the most significant half of the final results into a vector, and writes 
the vector to the destination SIMD&FP register. 

The results are rounded. For truncated results, see SQDMULH (by element). 

If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 | 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 | 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  1 |  0 |  1 |  1  1  1  1 | size  |  L |  M |   Rm   |  1  1  0 |  1 |  H |  0 |  Rn   |  Rd   | 
op: bit 12 


Scalar variant 


SQRDMULH <V><d>, <V><n>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi; 
case size of 
    when '01' index = UInt(H:L:M); Rmhi = '0'; 
    when '10' index = UInt(H:L); Rmhi = M; 
    otherwise UnallocatedEncoding(); 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 

boolean round = (op == '1'); 


Vector 


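| 31 | 30 | 29 | 28 | 27 26 25 24 | 23 22 | 21 | 20 | 19  16 | 15 14 13 | 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  Q |  0 |  0 |  1  1  1  1 | size  |  L |  M |   Rm   |  1  1  0 |  1 |  H |  0 |  Rn   |  Rd   | 
op: bit 12 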


Vector variant 


SQRDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi; 
case size of 
    when '01' index = UInt(H:L:M); Rmhi = '0'; 
    when '10' index = UInt(H:L); Rmhi = M; 
    otherwise UnallocatedEncoding(); 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 

integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean round = (op == '1'); 


Assembler symbols 

<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The following encodings are reserved: 
• size = 00, Q = x. 
• size = 11, Q = x. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have 
the following values: 
0:Rm when size = 01 
M:Rm when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
Restricted to V0-V15 when element size <Ts> is H. 
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values: 
H:L:M when size = 01 
H:L when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(idxdsize) operand2 = V[m]; 
bits(datasize) result; 
integer round_const = if round then 1 << (esize - 1) else 0; 
integer element1; 
integer element2; 
integer product; 
boolean sat; 

element2 = SInt(Elem[operand2, index, esize]); 
for e = 0 to elements-1 
    element1 = SInt(Elem[operand1, e, esize]); 
    product = (2 * element1 * element2) + round_const; 
    // The following only saturates if element1 and element2 equal -(2^(esize-1)) 
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize); 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
C7.2.252 SQRDMULH (vector) 

Signed saturating Rounding Doubling Multiply returning High half. This instruction multiplies the values of 
corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half 
of the final results into a vector, and writes the vector to the destination SIMD&FP register. 

The results are rounded. For truncated results, see SQDMULH (vector). 

If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9   5 | 4   0 | 
|  0 |  1 |  1 |  1  1  1  1  0 | size  |  1 |   Rm   |  1  0  1  1  0 |  1 |  Rn   |  Rd   | 
U: bit 29 


Scalar variant 


SQRDMULH <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size == '11' || size == '00' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = esize; 

integer elements = 1; 
boolean rounding = (U == '1'); 


Vector 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 12 11 | 10 | 9   5 | 4   0 | 
|  0 |  Q |  1 |  0  1  1  1  0 | size  |  1 |   Rm   |  1  0  1  1  0 |  1 |  Rn   |  Rd   | 
U: bit 29 


Vector variant 


SQRDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size == '11' || size == '00' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean rounding = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
H when size = 01 
S when size = 10 
The following encodings are reserved: 
• size = 00. 
• size = 11. 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The following encodings are reserved: 
• size = 00, Q = x. 
• size = 11, Q = x. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer round_const = if rounding then 1 << (esize - 1) else 0; 
integer element1; 
integer element2; 
integer product; 
boolean sat; 

for e = 0 to elements-1 
    element1 = SInt(Elem[operand1, e, esize]); 
    element2 = SInt(Elem[operand2, e, esize]); 
    product = (2 * element1 * element2) + round_const; 
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize); 
    if sat then FPSR.QC = '1'; 

V[d] = result; 
C7.2.253 SQRSHL 


Signed saturating Rounding Shift Left (register). This instruction takes each vector element in the first source 
SIMD&FP register, shifts it by a value from the least significant byte of the corresponding vector element of the 
second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP 
register. 


If the shift value is positive, the operation is a left shift. Otherwise, it is a right shift. The results are rounded. For 
truncated results, see SQSHL (register). 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 | 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  1 |  0 |  1  1  1  1  0 | size  |  1 |   Rm   |  0  1  0 |  1 |  1 |  1 |  Rn   |  Rd   | 
U: bit 29, R: bit 12, S: bit 11 


Scalar variant 


SQRSHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 


integer elements = 1; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 

if S == '0' && size != '11' then ReservedValue(); 
Vector 

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20  16 | 15 14 13 | 12 | 11 | 10 | 9   5 | 4   0 | 
|  0 |  Q |  0 |  0  1  1  1  0 | size  |  1 |   Rm   |  0  1  0 |  1 |  1 |  1 |  Rn   |  Rd   | 
U: bit 29, R: bit 12, S: bit 11 


Vector variant 


SQRSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 





boolean unsigned = (U == '1'); 
boolean rounding = (R == '1'); 
boolean saturating = (S == '1'); 
Assembler symbols 

<V> Is a width specifier, encoded in the "size" field. It can have the following values: 
B when size = 00 
H when size = 01 
S when size = 10 
D when size = 11 
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field. 
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field. 
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field. 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 
The encoding size = 11, Q = 0 is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 
integer round_const = 0; 
integer shift; 
integer element; 
boolean sat; 

for e = 0 to elements-1 
    shift = SInt(Elem[operand2, e, esize]<7:0>); 
    if rounding then 
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift; 
    if saturating then 
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned); 
        if sat then FPSR.QC = '1'; 
    else 
        Elem[result, e, esize] = element<esize-1:0>; 

V[d] = result; 
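
The shared pseudocode relies on a shift by a negative amount acting as a right shift, which most programming languages do not allow directly. The Python sketch below spells out the two directions for the signed, saturating case used by SQRSHL. It is an illustrative model only: signed_sat_q and sqrshl_lane are hypothetical names, the rounding constant is taken to be 0 for left shifts as the pseudocode comment states, and the returned flag stands in for FPSR.QC.

```python
def signed_sat_q(value, nbits):
    """Saturate an unbounded integer to the signed nbits range; return (value, saturated)."""
    lo = -(1 << (nbits - 1))
    hi = (1 << (nbits - 1)) - 1
    if value > hi:
        return hi, True
    if value < lo:
        return lo, True
    return value, False


def sqrshl_lane(value, shift_operand, esize, rounding=True):
    """Shift one signed element by the signed low byte of shift_operand (hypothetical helper)."""
    low_byte = shift_operand & 0xFF
    shift = low_byte - 256 if low_byte & 0x80 else low_byte
    if shift >= 0:
        # Left shift: no rounding constant is needed
        element = value << shift
    else:
        # Right shift by -shift, adding 2^(-shift-1) first when rounding
        round_const = (1 << (-shift - 1)) if rounding else 0
        element = (value + round_const) >> (-shift)
    return signed_sat_q(element, esize)
```

Passing rounding=False gives the truncating behavior described for SQSHL (register), under the same assumptions.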
C7.2.254 SQRSHRN, SQRSHRN2 


Signed saturating Rounded Shift Right Narrow (immediate). This instruction reads each vector element in the 
source SIMD&FP register, right shifts each result by an immediate value, saturates each shifted result to a value that 
is half the original width, puts the final result into a vector, and writes the vector to the lower or upper half of the 
destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector 
elements are half as long as the source vector elements. The results are rounded. For truncated results, see SQSHRN, 
SQSHRN2. 


The SQRSHRN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SQRSHRN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


If saturation occurs, the cumulative saturation bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0  1| 0| 1 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 1 | 1| 1|  Rn  |  Rd  |
       U                   immh                         op


Scalar variant 


SQRSHRN <Vb><d>, <Va><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then ReservedValue();
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer part = 0;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');

Vector 

|31|30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0| Q| 0| 0 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 1 | 1| 1|  Rn  |  Rd  |
       U                   immh                         op


Vector variant 


SQRSHRN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.

<Vb> Is the destination width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<Va> Is the source width specifier, encoded in the "immh" field. It can have the following values:
H when immh = 0001
S when immh = 001x
D when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to the destination operand width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

For the vector variant: is the right shift amount, in the range 1 to the destination element width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.
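
The immh:immb arithmetic used by the narrowing right shifts can be checked with a small model. The following Python sketch only restates the <Tb>, <Ta>, and <shift> tables for the vector variant; the function name decode_narrow_shift and its return tuple are illustrative assumptions and are not part of the architecture.

```python
def decode_narrow_shift(immh: int, immb: int, q: int):
    """Return (Tb, Ta, shift) for a narrowing right shift by immediate.

    immh is the 4-bit field, immb the 3-bit field, q the Q bit.
    Hypothetical helper: it only mirrors the tables in this section.
    """
    if immh == 0b0000:
        raise ValueError("immh == 0000: see Advanced SIMD modified immediate")
    if immh & 0b1000:
        raise ValueError("immh == 1xxx is reserved")
    esize = 8 << (immh.bit_length() - 1)      # 8 << HighestSetBit(immh)
    shift = 2 * esize - ((immh << 3) | immb)  # (2 * esize) - UInt(immh:immb)
    dest_letter = {8: "B", 16: "H", 32: "S"}[esize]
    src_letter = {8: "H", 16: "S", 32: "D"}[esize]
    lanes = 64 // esize                       # datasize is 64 in the vector decode
    tb = f"{lanes * 2 if q else lanes}{dest_letter}"
    ta = f"{lanes}{src_letter}"
    return tb, ta, shift
```

For example, immh = 0001, immb = 101, Q = 1 gives UInt(immh:immb) = 13, so the call returns the arrangement pair 16B and 8H with a shift of 16 - 13 = 3.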


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
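
The per-element step of the operation above can be modelled in a few lines. This is a minimal Python sketch, not the architectural definition: sat_q and shift_right_narrow are hypothetical names, elements are held as Python integers rather than bitstrings, and Python's >> on a negative integer is assumed to match the pseudocode's flooring integer shift.

```python
def sat_q(value: int, bits: int, unsigned: bool):
    """Clamp value to the range of a bits-wide integer; return (result, saturated)."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = min(max(value, lo), hi)
    return clamped, clamped != value


def shift_right_narrow(src: int, esize: int, shift: int,
                       rounding: bool = True, unsigned: bool = False):
    """Model one lane: src is a 2*esize-bit source value, the result is esize bits wide."""
    if not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1 to esize")
    round_const = (1 << (shift - 1)) if rounding else 0
    return sat_q((src + round_const) >> shift, esize, unsigned)
```

With esize = 8 and shift = 8, the source value 0x7FFF rounds up to 128 and therefore saturates to 127, which is the case in which FPSR.QC would be set. Passing rounding=False gives the truncating behavior instead.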
C7.2.255 SQRSHRUN, SQRSHRUN2


Signed saturating Rounded Shift Right Unsigned Narrow (immediate). This instruction reads each signed integer 
value in the vector of the source SIMD&FP register, right shifts each value by an immediate value, saturates the 
result to an unsigned integer value that is half the original width, places the final result into a vector, and writes the 
vector to the destination SIMD&FP register. The results are rounded. For truncated results, see SQSHRUN, 
SQSHRUN2. 


The SQRSHRUN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SQRSHRUN2 instruction writes the vector to the upper half of the destination register without affecting the other 
bits of the register. 


If saturation occurs, the cumulative saturation bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0  1| 1| 1 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 0 | 1| 1|  Rn  |  Rd  |
                           immh                         op


Scalar variant 


SQRSHRUN <Vb><d>, <Va><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then ReservedValue();
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer part = 0;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');


Vector 


|31|30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0| Q| 1| 0 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 0 | 1| 1|  Rn  |  Rd  |
                           immh                         op


Vector variant 


SQRSHRUN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.

<Vb> Is the destination width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<Va> Is the source width specifier, encoded in the "immh" field. It can have the following values:
H when immh = 0001
S when immh = 001x
D when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to the destination operand width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

For the vector variant: is the right shift amount, in the range 1 to the destination element width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
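
To show how the loop and the final Vpart write fit together, the following Python sketch models the whole vector operation on integers. It is a simplified illustration under stated assumptions: sqrshrun and unsigned_sat_q are hypothetical names, the destination register is represented as a 128-bit Python integer, the source lanes are passed as already sign-extended integers, and the returned flag stands in for FPSR.QC being set.

```python
def unsigned_sat_q(value: int, bits: int):
    """Clamp value to 0 .. 2**bits - 1; return (result, saturated)."""
    clamped = min(max(value, 0), (1 << bits) - 1)
    return clamped, clamped != value


def sqrshrun(vd: int, src_lanes, esize: int, shift: int, upper: bool):
    """Return (new 128-bit Vd value, qc) for a rounded, unsigned-saturating narrow.

    src_lanes holds the signed 2*esize-bit source elements, lowest lane first.
    upper=False models SQRSHRUN (part 0), upper=True models SQRSHRUN2 (part 1).
    """
    round_const = 1 << (shift - 1)
    packed, qc = 0, False
    for e, src in enumerate(src_lanes):
        elem, sat = unsigned_sat_q((src + round_const) >> shift, esize)
        packed |= elem << (e * esize)
        qc = qc or sat
    if upper:
        # Upper-half write: the low 64 bits of the destination are preserved.
        return (vd & ((1 << 64) - 1)) | (packed << 64), qc
    # Lower-half write: the upper half of the destination is cleared.
    return packed, qc
```

In this model a negative source lane always saturates to 0 unless rounding brings it up to zero, which reflects the signed-to-unsigned narrowing described at the start of this section.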
C7.2.256 SQSHL (immediate)
Signed saturating Shift Left (immediate). This instruction reads each vector element in the source SIMD&FP 
register, shifts each result by an immediate value, places the final result in a vector, and writes the vector to the 
destination SIMD&FP register. The results are truncated. For rounded results, see UQRSHL. 
If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
|31 30|29|28       24|23|22         19|18  16|15   13|12|11|10|9    5|4    0|
| 0  1| 0| 1 1 1 1 1 | 0|   !=0000    | immb | 0 1 1 | 1| 0| 1|  Rn  |  Rd  |
       U                   immh                       op
Scalar variant 
SQSHL <V><d>, <V><n>, #<shift> 
Decode for this encoding 
integer d = UInt(Rd);
integer n = UInt(Rn);
if immh == '0000' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer shift = UInt(immh:immb) - esize;
boolean src_unsigned;
boolean dst_unsigned;
case op:U of
    when '00' UnallocatedEncoding();
    when '01' src_unsigned = FALSE; dst_unsigned = TRUE;
    when '10' src_unsigned = FALSE; dst_unsigned = FALSE;
    when '11' src_unsigned = TRUE; dst_unsigned = TRUE;
Vector 
|31|30|29|28       24|23|22         19|18  16|15   13|12|11|10|9    5|4    0|
| 0| Q| 0| 0 1 1 1 1 | 0|   !=0000    | immb | 0 1 1 | 1| 0| 1|  Rn  |  Rd  |
       U                   immh                       op
Vector variant 
SQSHL <Vd>.<T>, <Vn>.<T>, #<shift> 
Decode for this encoding 
integer d = UInt(Rd);
integer n = UInt(Rn);
if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = UInt(immh:immb) - esize;

boolean src_unsigned;
boolean dst_unsigned;
case op:U of
    when '00' UnallocatedEncoding();
    when '01' src_unsigned = FALSE; dst_unsigned = TRUE;
    when '10' src_unsigned = FALSE; dst_unsigned = FALSE;
    when '11' src_unsigned = TRUE; dst_unsigned = TRUE;


Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
D when immh = 1xxx
The encoding immh = 0000 is reserved.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the left shift amount, in the range 0 to the operand width in bits minus 1,
encoded in the "immh:immb" field. It can have the following values:
(UInt(immh:immb)-8) when immh = 0001
(UInt(immh:immb)-16) when immh = 001x
(UInt(immh:immb)-32) when immh = 01xx
(UInt(immh:immb)-64) when immh = 1xxx
The encoding immh = 0000 is reserved.

For the vector variant: is the left shift amount, in the range 0 to the element width in bits minus 1,
encoded in the "immh:immb" field. It can have the following values:
(UInt(immh:immb)-8) when immh = 0001
(UInt(immh:immb)-16) when immh = 001x
(UInt(immh:immb)-32) when immh = 01xx
(UInt(immh:immb)-64) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
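
The left shift tables for <V>, <T>, and <shift> reduce to one rule: the position of the highest set bit of immh selects the element size, and the shift is UInt(immh:immb) minus that size. The Python sketch below restates that rule; decode_left_shift is a hypothetical helper name, and passing q=None to select the scalar variant is a convention of this example only.

```python
def decode_left_shift(immh: int, immb: int, q=None):
    """Return (specifier, shift) for a left shift by immediate.

    q=None models the scalar variant and returns the <V> width letter;
    q=0 or q=1 models the vector variant and returns the <T> arrangement.
    """
    if immh == 0b0000:
        raise ValueError("immh == 0000 is not a shift encoding")
    index = immh.bit_length() - 1          # HighestSetBit(immh)
    esize = 8 << index
    shift = ((immh << 3) | immb) - esize   # UInt(immh:immb) - esize
    letter = "BHSD"[index]
    if q is None:
        return letter, shift
    if esize == 64 and q == 0:
        raise ValueError("immh == 1xxx, Q == 0 is reserved")
    lanes = (128 if q else 64) // esize
    return f"{lanes}{letter}", shift
```

Because UInt(immh:immb) lies between esize and 2*esize - 1 for a given element size, the decoded shift always falls in the documented range 0 to esize - 1.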


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
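
The decode for this instruction derives two separate signedness flags from op:U, one for reading the source and one for saturating the result. The following Python sketch models a single element under that scheme. The names SIGNEDNESS and shl_sat_element are assumptions of this example, and the mnemonics in the comments are the instructions that, to the best of our reading, share this decode.

```python
# (op, U) -> (src_unsigned, dst_unsigned), following the case statement in the decode.
SIGNEDNESS = {
    (0, 1): (False, True),   # signed source, unsigned result (SQSHLU)
    (1, 0): (False, False),  # signed source, signed result (SQSHL)
    (1, 1): (True, True),    # unsigned source, unsigned result (UQSHL)
}


def shl_sat_element(raw: int, esize: int, shift: int, op: int, u: int):
    """Shift one esize-bit element left and saturate; return (value, saturated)."""
    if (op, u) not in SIGNEDNESS:
        raise ValueError("op:U == 00 is unallocated")
    src_unsigned, dst_unsigned = SIGNEDNESS[(op, u)]
    value = raw & ((1 << esize) - 1)
    if not src_unsigned and value >> (esize - 1):
        value -= 1 << esize              # reinterpret the bit pattern as signed
    shifted = value << shift
    if dst_unsigned:
        lo, hi = 0, (1 << esize) - 1
    else:
        lo, hi = -(1 << (esize - 1)), (1 << (esize - 1)) - 1
    clamped = min(max(shifted, lo), hi)
    return clamped, clamped != shifted
```

The same bit pattern can therefore give different results: 0xFF shifted left by 1 yields -2 without saturation when both flags are signed, but saturates to 0 when the result is unsigned.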
C7.2.257 SQSHL (register)


Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP 
register, shifts each element by a value from the least significant byte of the corresponding element of the second 
source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. 


If the shift value is positive, the operation is a left shift. Otherwise, it is a right shift. The results are truncated. For 
rounded results, see SQRSHL. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28       24|23  22|21|20  16|15   13|12|11|10|9    5|4    0|
| 0  1| 0| 1 1 1 1 0 | size | 1|  Rm  | 0 1 0 | 0| 1| 1|  Rn  |  Rd  |
       U                                       R  S


Scalar variant 


SQSHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 


boolean unsigned = (U == '1'); 
boolean rounding = (R == '1'); 
boolean saturating = (S == '1'); 


if S == '0' && size != '11' then ReservedValue();

Vector 
|31|30|29|28       24|23  22|21|20  16|15   13|12|11|10|9    5|4    0|
| 0| Q| 0| 0 1 1 1 0 | size | 1|  Rm  | 0 1 0 | 0| 1| 1|  Rn  |  Rd  |
       U                                       R  S


Vector variant 


SQSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 





boolean unsigned = (U == '1'); 
boolean rounding = (R == '1'); 
boolean saturating = (S == '1'); 
Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
D when size = 11

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;
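
The register form takes a signed shift count from the low byte of each element of the second source, so one operation covers both left and right shifts. The Python sketch below models a single element. It assumes that the pseudocode's integer << with a negative count behaves as a flooring right shift, which is how the "0 for left shift" comment reads; shl and shift_by_register are hypothetical names, and elements are passed and returned as raw bit patterns.

```python
def shl(x: int, n: int) -> int:
    """Integer shift that floors for negative counts (assumed pseudocode behavior)."""
    return x << n if n >= 0 else x >> -n


def shift_by_register(raw: int, shift_byte: int, esize: int,
                      unsigned: bool = False, rounding: bool = False,
                      saturating: bool = True):
    """Return (result bit pattern, saturated) for one esize-bit element."""
    mask = (1 << esize) - 1
    value = raw & mask
    if not unsigned and value >> (esize - 1):
        value -= 1 << esize                 # Int(..., unsigned) for a signed element
    shift = shift_byte & 0xFF
    if shift & 0x80:
        shift -= 0x100                      # SInt of the least significant byte
    round_const = shl(1, -shift - 1) if rounding else 0
    element = shl(value + round_const, shift)
    if not saturating:
        return element & mask, False        # element<esize-1:0>
    lo, hi = (0, mask) if unsigned else (-(1 << (esize - 1)), mask >> 1)
    clamped = min(max(element, lo), hi)
    return clamped & mask, clamped != element
```

With the defaults shown (signed, truncating, saturating) the function corresponds to the flag settings this section decodes for SQSHL; the other flag combinations are included only because the pseudocode is shared.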
C7.2.258 SQSHLU


Signed saturating Shift Left Unsigned (immediate). This instruction reads each signed integer value in the vector of
the source SIMD&FP register, shifts each value by an immediate value, saturates the shifted result to an unsigned 
integer value, places the result in a vector, and writes the vector to the destination SIMD&FP register. The results 
are truncated. For rounded results, see UQRSHL. 


If saturation occurs, the cumulative saturation bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28       24|23|22         19|18  16|15   13|12|11|10|9    5|4    0|
| 0  1| 1| 1 1 1 1 1 | 0|   !=0000    | immb | 0 1 1 | 0| 0| 1|  Rn  |  Rd  |
       U                   immh                       op


Scalar variant 


SQSHLU <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;

integer shift = UInt(immh:immb) - esize;

boolean src_unsigned;
boolean dst_unsigned;
case op:U of
    when '00' UnallocatedEncoding();
    when '01' src_unsigned = FALSE; dst_unsigned = TRUE;
    when '10' src_unsigned = FALSE; dst_unsigned = FALSE;
    when '11' src_unsigned = TRUE; dst_unsigned = TRUE;


Vector


|31|30|29|28       24|23|22         19|18  16|15   13|12|11|10|9    5|4    0|
| 0| Q| 1| 0 1 1 1 1 | 0|   !=0000    | immb | 0 1 1 | 0| 0| 1|  Rn  |  Rd  |
       U                   immh                       op


Vector variant 


SQSHLU <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = UInt(immh:immb) - esize;

boolean src_unsigned;
boolean dst_unsigned;
case op:U of
    when '00' UnallocatedEncoding();
    when '01' src_unsigned = FALSE; dst_unsigned = TRUE;
    when '10' src_unsigned = FALSE; dst_unsigned = FALSE;
    when '11' src_unsigned = TRUE; dst_unsigned = TRUE;


Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
D when immh = 1xxx
The encoding immh = 0000 is reserved.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the left shift amount, in the range 0 to the operand width in bits minus 1,
encoded in the "immh:immb" field. It can have the following values:
(UInt(immh:immb)-8) when immh = 0001
(UInt(immh:immb)-16) when immh = 001x
(UInt(immh:immb)-32) when immh = 01xx
(UInt(immh:immb)-64) when immh = 1xxx
The encoding immh = 0000 is reserved.

For the vector variant: is the left shift amount, in the range 0 to the element width in bits minus 1,
encoded in the "immh:immb" field. It can have the following values:
(UInt(immh:immb)-8) when immh = 0001
(UInt(immh:immb)-16) when immh = 001x
(UInt(immh:immb)-32) when immh = 01xx
(UInt(immh:immb)-64) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
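
The Elem[] accesses in the loop above slice fixed-width lanes out of the register value. As a rough illustration, the Python sketch below applies the signed-source, unsigned-result case (src_unsigned FALSE, dst_unsigned TRUE) to a register held as a plain integer. The names elem_get and sqshlu are assumptions of this example, and the returned flag stands in for FPSR.QC being set.

```python
def elem_get(reg: int, e: int, esize: int) -> int:
    """Return element e of reg as an unsigned esize-bit pattern (models Elem[reg, e, esize])."""
    return (reg >> (e * esize)) & ((1 << esize) - 1)


def sqshlu(vn: int, datasize: int, esize: int, shift: int):
    """Return (result register value, qc) for a signed-to-unsigned saturating left shift."""
    result, qc = 0, False
    for e in range(datasize // esize):
        raw = elem_get(vn, e, esize)
        # Interpret the lane as a signed integer before shifting.
        signed = raw - (1 << esize) if raw >> (esize - 1) else raw
        shifted = signed << shift
        clamped = min(max(shifted, 0), (1 << esize) - 1)
        qc = qc or (clamped != shifted)
        result |= clamped << (e * esize)
    return result, qc
```

In this model every negative lane saturates to 0, including with a shift of 0, because the result range is unsigned.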
C7.2.259 SQSHRN, SQSHRN2


Signed saturating Shift Right Narrow (immediate). This instruction reads each vector element in the source 
SIMD&FP register, right shifts and truncates each result by an immediate value, saturates each shifted result to a 
value that is half the original width, puts the final result into a vector, and writes the vector to the lower or upper 
half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination 
vector elements are half as long as the source vector elements. For rounded results, see SQRSHRN, SQRSHRN2. 


The SQSHRN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SQSHRN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


If saturation occurs, the cumulative saturation bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0  1| 0| 1 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 1 | 0| 1|  Rn  |  Rd  |
       U                   immh                         op


Scalar variant 


SQSHRN <Vb><d>, <Va><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then ReservedValue();
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer part = 0;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');

Vector 

|31|30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0| Q| 0| 0 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 1 | 0| 1|  Rn  |  Rd  |
       U                   immh                         op


Vector variant 


SQSHRN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.

<Vb> Is the destination width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<Va> Is the source width specifier, encoded in the "immh" field. It can have the following values:
H when immh = 0001
S when immh = 001x
D when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to the destination operand width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

For the vector variant: is the right shift amount, in the range 1 to the destination element width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.
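
Going in the other direction, the assembler symbols can be packed back into an instruction word. The Python sketch below assembles the vector variant; it is an illustration only. The function name is hypothetical, the field positions are taken from the Vector encoding layout of this section (Q at bit 30, immh:immb at bits 22 to 16, op at bit 11, Rn at bits 9 to 5, Rd at bits 4 to 0), and the base constant 0x0F009400 was derived by hand from the fixed bits, so it should be checked against an assembler before being relied on.

```python
def encode_sqshrn_vector(rd: int, rn: int, dest_esize: int, shift: int,
                         upper: bool = False, rounding: bool = False) -> int:
    """Assemble SQSHRN{2} (or, with rounding=True, SQRSHRN{2}) into a 32-bit word."""
    if dest_esize not in (8, 16, 32):
        raise ValueError("destination element size must be 8, 16 or 32")
    if not 1 <= shift <= dest_esize:
        raise ValueError("shift must be in the range 1 to the destination element width")
    if not (0 <= rd <= 31 and 0 <= rn <= 31):
        raise ValueError("register numbers must be in the range 0 to 31")
    immh_immb = 2 * dest_esize - shift     # inverse of shift = (2 * esize) - UInt(immh:immb)
    word = 0x0F009400                      # fixed bits with Q, immh:immb, op, Rn, Rd all zero
    word |= (1 if upper else 0) << 30      # Q selects the upper-half ("2") form
    word |= immh_immb << 16                # immh in bits 22:19, immb in bits 18:16
    word |= (1 if rounding else 0) << 11   # op: 0 truncates, 1 rounds
    word |= rn << 5
    word |= rd
    return word
```

Under these assumptions, SQSHRN2 V0.16B, V1.8H, #3 has immh:immb = 16 - 3 = 13 and assembles to 0x4F0D9420.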


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
C7.2.260 SQSHRUN, SQSHRUN2


Signed saturating Shift Right Unsigned Narrow (immediate). This instruction reads each signed integer value in the 
vector of the source SIMD&FP register, right shifts each value by an immediate value, saturates the result to an 
unsigned integer value that is half the original width, places the final result into a vector, and writes the vector to 
the destination SIMD&FP register. The results are truncated. For rounded results, see SQRSHRUN, SQRSHRUN2. 


The SQSHRUN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SQSHRUN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


If saturation occurs, the cumulative saturation bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0  1| 1| 1 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 0 | 0| 1|  Rn  |  Rd  |
                           immh                         op


Scalar variant 


SQSHRUN <Vb><d>, <Va><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then ReservedValue();
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer part = 0;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');


Vector 


|31|30|29|28       24|23|22         19|18  16|15     12|11|10|9    5|4    0|
| 0| Q| 1| 0 1 1 1 1 | 0|   !=0000    | immb | 1 0 0 0 | 0| 1|  Rn  |  Rd  |
                           immh                         op


Vector variant 


SQSHRUN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.

<Vb> Is the destination width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<Va> Is the source width specifier, encoded in the "immh" field. It can have the following values:
H when immh = 0001
S when immh = 001x
D when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to the destination operand width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
The following encodings are reserved:
• immh = 0000.
• immh = 1xxx.

For the vector variant: is the right shift amount, in the range 1 to the destination element width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
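
The operation pseudocode above can be approximated in ordinary Python. This is a minimal sketch for illustration only, not part of the architecture definition: the function names `unsigned_sat_q` and `sqshrun` are hypothetical, elements are modeled as plain Python integers in a list, and the returned boolean stands in for the FPSR.QC update.

```python
def unsigned_sat_q(value, bits):
    """Clamp value to [0, 2**bits - 1]; return (result, saturated)."""
    hi = (1 << bits) - 1
    clamped = max(0, min(hi, value))
    return clamped, clamped != value


def sqshrun(src, esize, shift, rounding=False):
    """Model the per-element loop of the operation pseudocode.

    src holds signed source elements, each 2*esize bits wide.
    rounding mirrors the 'round' variable from the decode pseudocode.
    Returns (narrowed unsigned elements, QC flag).
    """
    if not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1..esize")
    round_const = (1 << (shift - 1)) if rounding else 0
    out, qc = [], False
    for x in src:
        # Python's >> on a negative int rounds toward minus infinity,
        # which behaves like an arithmetic right shift.
        val, sat = unsigned_sat_q((x + round_const) >> shift, esize)
        out.append(val)
        qc = qc or sat
    return out, qc
```

For example, narrowing 16-bit elements to 8 bits with a shift of 2 turns a negative input into 0 and reports saturation, because the result is unsigned.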
C7.2.261 SQSUB


Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register 
from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and 
writes the vector to the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0 | 1 | U = 0 | 11110 | size | 1 | Rm | 00101 | 1 | Rn | Rd |


Scalar variant 


SQSUB <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 


boolean unsigned = (U == '1'); 


Vector


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0 | Q | U = 0 | 01110 | size | 1 | Rm | 00101 | 1 | Rn | Rd |


Vector variant 


SQSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 


Assembler symbols 


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
D when size = 11

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
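
As an illustration of the saturating subtract described above, the following Python sketch models the per-element loop. It is an approximation under stated assumptions: `sat_q` and `sqsub` are hypothetical names, elements are plain signed integers in lists, and the returned boolean stands in for FPSR.QC.

```python
def sat_q(value, bits, unsigned):
    """Clamp value to the signed or unsigned range of a bits-wide element."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def sqsub(a, b, esize):
    """Element-wise signed saturating subtract; returns (results, QC flag)."""
    out, qc = [], False
    for x, y in zip(a, b):
        val, sat = sat_q(x - y, esize, unsigned=False)
        out.append(val)
        qc = qc or sat
    return out, qc
```

With 8-bit elements, `-128 - 1` saturates to `-128` and `100 - (-100)` saturates to `127`, while `5 - 3` passes through unchanged.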
C7.2.262 SQXTN, SQXTN2


Signed saturating extract Narrow. This instruction reads each vector element from the source SIMD&FP register, 
saturates the value to half the original width, places the result into a vector, and writes the vector to the lower or 
upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source 
vector elements. All the values in this instruction are signed integer values. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


The SQXTN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SQXTN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0 | 1 | U = 0 | 11110 | size | 10000 | 10100 | 10 | Rn | Rd |


Scalar variant 


SQXTN <Vb><d>, <Va><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = esize;
integer part = 0;
integer elements = 1;

boolean unsigned = (U == '1');


Vector


| 31 | 30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0 | Q | U = 0 | 01110 | size | 10000 | 10100 | 10 | Rn | Rd |


Vector variant 


SQXTN{2} <Vd>.<Tb>, <Vn>.<Ta>


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:

[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1

The encoding size = 11, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10

The encoding size = 11 is reserved.

<Vb> Is the destination width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10

The encoding size = 11 is reserved.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<Va> Is the source width specifier, encoded in the "size" field. It can have the following values:
H when size = 00
S when size = 01
D when size = 10

The encoding size = 11 is reserved.

<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
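
The narrowing step and the lower/upper half write described for SQXTN and SQXTN2 can be sketched in Python as follows. This is an illustrative model only: `sqxtn` and `write_half` are hypothetical names, registers are modeled as lists of integers with index 0 as the least significant element, and the boolean result stands in for FPSR.QC.

```python
def sqxtn(src, esize):
    """Saturate each signed source element to a signed esize-bit value."""
    lo = -(1 << (esize - 1))
    hi = (1 << (esize - 1)) - 1
    out = [max(lo, min(hi, x)) for x in src]
    return out, out != list(src)


def write_half(dest, result, part):
    """Model the half-register write.

    dest holds 2*n narrow elements. part == 0 writes the lower half and
    clears the upper half; part == 1 writes the upper half and leaves
    the lower half unchanged.
    """
    n = len(result)
    if part == 0:
        return list(result) + [0] * n
    return list(dest[:n]) + list(result)
```

Calling `write_half` with `part=0` corresponds to the SQXTN behavior in the text, and `part=1` to the SQXTN2 behavior.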
C7.2.263 SQXTUN, SQXTUN2


Signed saturating extract Unsigned Narrow. This instruction reads each signed integer value in the vector of the 
source SIMD&FP register, saturates the value to an unsigned integer value that is half the original width, places the 
result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The 
destination vector elements are half as long as the source vector elements. 


If saturation occurs, the cumulative saturation bit FPSR.QC is set. 


The SQXTUN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the SQXTUN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0 | 1 | 1 | 11110 | size | 10000 | 10010 | 10 | Rn | Rd |


Scalar variant 


SQXTUN <Vb><d>, <Va><n> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 

integer part = 0;

integer elements = 1; 


Vector 


| 31 | 30 | 29 | 28:24 | 23:22 | 21:17 | 16:12 | 11:10 | 9:5 | 4:0 |
| 0 | Q | 1 | 01110 | size | 10000 | 10010 | 10 | Rn | Rd |


Vector variant 


SQXTUN{2} <Vd>.<Tb>, <Vn>.<Ta> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 
Assembler symbols


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:

[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1

The encoding size = 11, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10

The encoding size = 11 is reserved.

<Vb> Is the destination width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10

The encoding size = 11 is reserved.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<Va> Is the source width specifier, encoded in the "size" field. It can have the following values:
H when size = 00
S when size = 01
D when size = 10

The encoding size = 11 is reserved.

<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
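
The signed-to-unsigned saturation in the loop above can be illustrated with a short Python sketch. The name `sqxtun` is hypothetical, elements are modeled as plain signed integers, and the boolean result stands in for FPSR.QC. The point of interest is that every negative input clamps to zero.

```python
def sqxtun(src, esize):
    """Saturate signed source elements to unsigned esize-bit values."""
    hi = (1 << esize) - 1
    out = [max(0, min(hi, x)) for x in src]
    # Any element that changed was saturated.
    return out, out != list(src)
```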
C7.2.264 SRHADD
Signed Rounding Halving Add. This instruction adds corresponding signed integer values from the two source 
SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the 
destination SIMD&FP register. 
The results are rounded. For truncated results, see SHADD. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0 | Q | U = 0 | 01110 | size | 1 | Rm | 00010 | 1 | Rn | Rd |
Three registers of the same type variant 
SRHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 
The encoding size = 11, Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;
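
The rounding halving add can be modeled in a few lines of Python. This is a sketch under stated assumptions: `srhadd` is a hypothetical name and elements are plain signed integers. Because the sum is formed at unbounded precision before bits <esize:1> are extracted, the intermediate addition cannot overflow.

```python
def srhadd(a, b, esize):
    """Element-wise (x + y + 1) >> 1 on signed esize-bit elements."""
    mask = (1 << esize) - 1
    out = []
    for x, y in zip(a, b):
        r = ((x + y + 1) >> 1) & mask      # bits <esize:1> of the sum
        if r >= 1 << (esize - 1):          # reinterpret as a signed value
            r -= 1 << esize
        out.append(r)
    return out
```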
C7.2.265 SRI


Shift Right and Insert (immediate). This instruction reads each vector element in the source SIMD&FP register, 
right shifts each vector element by an immediate value, and inserts the result into the corresponding vector element 
in the destination SIMD&FP register such that the new zero bits created by the shift are not inserted but retain their 
existing value. Bits shifted out of the right of each vector element of the source register are lost. 


The following figure shows an example of the operation of shift right by 3 for an 8-bit vector element. 


[Figure: bits 63:56 of Vn (Vn.B[7]), Vd.B[7] before the operation, and Vd.B[7] after the operation.]


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:23 | 22:19 | 18:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0 | 1 | 1 | 111110 | immh (!= 0000) | immb | 01000 | 1 | Rn | Rd |





Scalar variant 


SRI <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 

integer datasize = esize; 

integer elements = 1; 


integer shift = (esize * 2) - UInt(immh:immb);


Vector


| 31 | 30 | 29 | 28:23 | 22:19 | 18:16 | 15:11 | 10 | 9:5 | 4:0 |
| 0 | Q | 1 | 011110 | immh (!= 0000) | immb | 01000 | 1 | Rn | Rd |


Vector variant 


SRI <Vd>.<T>, <Vn>.<T>, #<shift> 
Decode for this encoding


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);


Assembler symbols 


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
D when immh = 1xxx

The encoding immh = 0xxx is reserved.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.

The encoding immh = 1xxx, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
field. It can have the following values:
(128-UInt(immh:immb)) when immh = 1xxx

The encoding immh = 0xxx is reserved.

For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
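
The relationship between "immh:immb", the element size, and the right shift amount can be worked through in Python. This is a sketch for illustration: `decode_right_shift` and `encode_right_shift` are hypothetical helper names, and the fields are passed as small integers rather than bit strings.

```python
def decode_right_shift(immh, immb):
    """Return (esize, shift) for a shift-right-by-immediate encoding."""
    if not (0 <= immh <= 0b1111 and 0 <= immb <= 0b111):
        raise ValueError("immh is 4 bits and immb is 3 bits")
    if immh == 0:
        raise ValueError("immh == 0000 is not a shift-by-immediate encoding")
    esize = 8 << (immh.bit_length() - 1)        # 8 << HighestSetBit(immh)
    shift = 2 * esize - ((immh << 3) | immb)    # (esize * 2) - UInt(immh:immb)
    return esize, shift


def encode_right_shift(esize, shift):
    """Inverse mapping: return (immh, immb) for an element size and shift."""
    if esize not in (8, 16, 32, 64) or not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1..esize")
    value = 2 * esize - shift
    return value >> 3, value & 0b111
```

For instance, a shift of 3 on 8-bit elements encodes as immh = 0001, immb = 101, since 16 - 13 = 3.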


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;

V[d] = result;
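
The mask-and-merge step of the operation above can be sketched in Python. This is an illustrative model with assumptions: `sri` is a hypothetical name and elements are passed as unsigned bit patterns in lists. The top `shift` bits of each destination element are kept, and the remaining bits come from the shifted source.

```python
def sri(dst, src, esize, shift):
    """Shift each src element right and insert it into the matching dst element."""
    if not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1..esize")
    ones = (1 << esize) - 1
    mask = ones >> shift                  # LSR(Ones(esize), shift)
    out = []
    for d, s in zip(dst, src):
        shifted = (s & ones) >> shift     # LSR of the source element
        out.append((d & ~mask & ones) | shifted)
    return out
```

With a shift equal to the element width the mask is zero, so the destination element is left unchanged.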
C7.2.266 SRSHL


Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first 
source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the 
second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP 
register. 


If the shift value is positive, the operation is a left shift. If the shift value is negative, it is a rounding right shift. For 
a truncating shift, see SSHL. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | 1 | U = 0 | 11110 | size | 1 | Rm | 010 | R = 1 | S = 0 | 1 | Rn | Rd |


Scalar variant 


SRSHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 


boolean unsigned = (U == '1'); 
boolean rounding = (R == '1'); 
boolean saturating = (S == '1'); 
if S == '0' && size != '11' then ReservedValue();


Vector


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | Q | U = 0 | 01110 | size | 1 | Rm | 010 | R = 1 | S = 0 | 1 | Rn | Rd |


Vector variant 


SRSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 





boolean unsigned = (U == '1'); 
boolean rounding = (R == '1'); 
boolean saturating = (S == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11

The following encodings are reserved:

• size = 0x.
• size = 10.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;
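
The register-controlled rounding shift can be modeled in Python for the non-saturating, signed case that SRSHL selects. This is a sketch under stated assumptions: `srshl` is a hypothetical name, elements are plain signed integers, and the left and right shift cases are written out separately rather than relying on a shift by a negative amount.

```python
def srshl(values, shifts, esize):
    """Signed rounding shift left by a per-element signed byte."""
    mask = (1 << esize) - 1
    out = []
    for x, s in zip(values, shifts):
        sh = s & 0xFF                      # least significant byte of the element
        if sh >= 0x80:
            sh -= 0x100                    # interpret the byte as signed
        if sh >= 0:
            r = x << sh                    # left shift, no rounding constant
        else:
            r = (x + (1 << (-sh - 1))) >> -sh   # rounding right shift
        r &= mask                          # keep bits <esize-1:0>
        if r >= 1 << (esize - 1):
            r -= 1 << esize                # reinterpret as signed
        out.append(r)
    return out
```

Note that the left shift is not saturated, so `100 << 2` on an 8-bit element wraps rather than clamping.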
C7.2.267 SRSHR


Signed Rounding Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP 
register, right shifts each result by an immediate value, places the final result into a vector, and writes the vector to 
the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are 
rounded. For truncated results, see SSHR. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:23 | 22:19 | 18:16 | 15:14 | 13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | 1 | U = 0 | 111110 | immh (!= 0000) | immb | 00 | o1 = 1 | o0 = 0 | 0 | 1 | Rn | Rd |


Scalar variant 


SRSHR <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 

integer datasize = esize; 

integer elements = 1; 


integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Vector 


| 31 | 30 | 29 | 28:23 | 22:19 | 18:16 | 15:14 | 13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | Q | U = 0 | 011110 | immh (!= 0000) | immb | 00 | o1 = 1 | o0 = 0 | 0 | 1 | Rn | Rd |


Vector variant 


SRSHR <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');
Assembler symbols


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
D when immh = 1xxx

The encoding immh = 0xxx is reserved.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.

The encoding immh = 1xxx, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
field. It can have the following values:
(128-UInt(immh:immb)) when immh = 1xxx

The encoding immh = 0xxx is reserved.

For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();

for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;
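
The effect of the rounding constant is easiest to see next to a plain truncating shift. The Python sketch below is illustrative only: `srshr` and `truncating_shr` are hypothetical names, and elements are plain signed integers that are assumed to already fit in the element width.

```python
def srshr(src, shift):
    """Signed rounding shift right: add 2**(shift-1) before shifting."""
    if shift < 1:
        raise ValueError("shift must be at least 1")
    round_const = 1 << (shift - 1)
    return [(x + round_const) >> shift for x in src]


def truncating_shr(src, shift):
    """Plain arithmetic shift right, shown only for comparison."""
    return [x >> shift for x in src]
```

For a shift of 2, the input 7 gives 2 with rounding and 1 with truncation.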
C7.2.268 SRSRA


Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source
SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector
elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The
results are rounded. For truncated results, see SSRA.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


Scalar 


| 31 | 30 | 29 | 28:23 | 22:19 | 18:16 | 15:14 | 13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | 1 | U = 0 | 111110 | immh (!= 0000) | immb | 00 | o1 = 1 | o0 = 1 | 0 | 1 | Rn | Rd |


Scalar variant 


SRSRA <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 

integer datasize = esize; 

integer elements = 1; 


integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Vector 


| 31 | 30 | 29 | 28:23 | 22:19 | 18:16 | 15:14 | 13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | Q | U = 0 | 011110 | immh (!= 0000) | immb | 00 | o1 = 1 | o0 = 1 | 0 | 1 | Rn | Rd |


Vector variant 


SRSRA <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');
Assembler symbols


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
D when immh = 1xxx

The encoding immh = 0xxx is reserved.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.

The encoding immh = 1xxx, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
field. It can have the following values:
(128-UInt(immh:immb)) when immh = 1xxx

The encoding immh = 0xxx is reserved.

For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();

for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;
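
The accumulate step differs from the plain rounding shift in that the sum is truncated to the element width, so it wraps rather than saturates. A Python sketch of this, assuming a hypothetical function name `srsra` and plain signed integers as elements:

```python
def srsra(acc, src, esize, shift):
    """Rounding shift right of src, accumulated into acc with wraparound."""
    if not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1..esize")
    mask = (1 << esize) - 1
    round_const = 1 << (shift - 1)
    out = []
    for a, x in zip(acc, src):
        r = (a + ((x + round_const) >> shift)) & mask   # esize-bit addition
        if r >= 1 << (esize - 1):
            r -= 1 << esize                             # reinterpret as signed
        out.append(r)
    return out
```

In the second element of the usage below, `127 + 1` wraps to `-128` because no saturation is applied.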
C7.2.269 SSHL


Signed Shift Left (register). This instruction takes each signed integer value in the vector of the first source 
SIMD&FP register, shifts each value by a value from the least significant byte of the corresponding element of the 
second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP 
register. 


If the shift value is positive, the operation is a left shift. If the shift value is negative, it is a truncating right shift. For 
a rounding shift, see SRSHL. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15:13 | 12 | 11 | 10 | 9:5 | 4:0 |
| 0 | 1 | U = 0 | 11110 | size | 1 | Rm | 010 | R = 0 | S = 0 | 1 | Rn | Rd |


Scalar variant 


SSHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 


boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 

if S == '0' && size != '11' then ReservedValue();

Vector 

|31|30|29|28 27 26 25 24|23 22|21|20   16|15 14 13|12|11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm  | 0  1  0| 0| 0| 1|  Rn  |  Rd  |
        U                                           R  S


Vector variant 


SSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 





boolean unsigned = (U == '1'); 
boolean rounding = (R == '1'); 
boolean saturating = (S == '1'); 
Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11
The following encodings are reserved:
• size = 0x.
• size = 10.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;
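
The per-element behavior of SSHL can be sketched in Python. This is an illustrative model only, not architecture pseudocode: the helper names to_signed and sshl_element are hypothetical, and the sketch assumes the plain SSHL case (rounding and saturating both FALSE).

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sshl_element(element, shift_elem, esize):
    """Model one SSHL lane (hypothetical helper).

    element    : source lane from the first register, as an unsigned bit pattern
    shift_elem : matching lane of the second register; only its low byte is used
    esize      : lane width in bits (8, 16, 32 or 64)
    """
    shift = to_signed(shift_elem, 8)      # SInt(Elem[operand2, e, esize]<7:0>)
    value = to_signed(element, esize)     # signed interpretation of the lane
    if shift >= 0:
        shifted = value << shift          # positive shift value: left shift
    else:
        shifted = value >> -shift         # negative shift value: truncating right shift
    return shifted & ((1 << esize) - 1)   # keep element<esize-1:0>
```

A shift byte of 0xFE is read as -2, so an 8-bit lane holding 0xF0 (-16) becomes 0xFC (-4).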







C7.2.270 SSHLL, SSHLL2 


Signed Shift Left Long (immediate). This instruction reads each vector element from the source SIMD&FP register, 
left shifts each vector element by the specified shift amount, places the result into a vector, and writes the vector to 
the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. 
All the values in this instruction are signed integer values. 


The SSHLL instruction extracts vector elements from the lower half of the source register, while the SSHLL2 instruction 
extracts vector elements from the upper half of the source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias SXTL, SXTL2. See Alias conditions for details of when each alias is preferred. 


|31|30|29|28 27 26 25 24 23|22    19|18  16|15 14 13 12 11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  1  0| != 0000| immb | 1  0  1  0  0| 1|  Rn  |  Rd  |
        U                     immh


Vector variant 


SSHLL{2} <Vd>.<Ta>, <Vn>.<Tb>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = UInt(immh:immb) - esize;
boolean unsigned = (U == '1');


Alias conditions

Alias          Is preferred when
SXTL, SXTL2    immb == '000' && BitCount(immh) == 1





Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0
[present] when Q = 1





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values: 
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 


The encoding immh = 1xxx is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values: 
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1


See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x. 
The encoding immh = 1xxx, Q = x is reserved. 
<shift> Is the left shift amount, in the range 0 to the source element width in bits minus 1, encoded in the 
"immh:immb" field. It can have the following values: 
(UInt(immh:immb)-8) when immh = 0001 
(UInt(immh:immb)-16) when immh = 001x 
(UInt(immh:immb)-32) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 


The encoding immh = 1xxx is reserved. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;
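
As a rough illustration of the decode and operation above, the following Python sketch derives the source element size and left shift amount from immh:immb, then widens each signed element. The function names decode_sshll and sshll are hypothetical, and the source half selected by Q is assumed to have been extracted already into a list of lane values.

```python
def decode_sshll(immh, immb):
    """Return (esize, shift) for SSHLL from the 4-bit immh and 3-bit immb fields."""
    if immh == 0:
        raise ValueError("immh == 0000: see Advanced SIMD modified immediate")
    if immh & 0b1000:
        raise ValueError("immh == 1xxx is reserved")
    esize = 8 << (immh.bit_length() - 1)      # 8 << HighestSetBit(immh)
    shift = ((immh << 3) | immb) - esize      # UInt(immh:immb) - esize
    return esize, shift


def sshll(src_lanes, esize, shift):
    """Sign-extend each esize-bit lane, shift left, and keep 2*esize result bits."""
    out = []
    for lane in src_lanes:
        value = lane & ((1 << esize) - 1)
        if value >> (esize - 1):
            value -= 1 << esize               # signed interpretation
        out.append((value << shift) & ((1 << (2 * esize)) - 1))
    return out
```

With immh = 0001 and immb = 000 the shift is 0, which is the case the SXTL alias covers.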
C7.2.271 SSHR


Signed Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP register, right
shifts each result by an immediate value, places the final result into a vector, and writes the vector to the destination
SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For
rounded results, see SRSHR.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28 27 26 25 24 23|22    19|18  16|15 14|13|12|11|10|9    5|4    0|
| 0  1| 0| 1  1  1  1  1  0| != 0000| immb | 0  0| 0| 0| 0| 1|  Rn  |  Rd  |
        U                     immh                o1 o0


Scalar variant 


SSHR <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh<3> != '1' then ReservedValue();
integer esize = 8 << 3;
integer datasize = esize;
integer elements = 1;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Vector


|31|30|29|28 27 26 25 24 23|22    19|18  16|15 14|13|12|11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  1  0| != 0000| immb | 0  0| 0| 0| 0| 1|  Rn  |  Rd  |
        U                     immh                o1 o0


Vector variant 


SSHR <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');
Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
D when immh = 1xxx
The encoding immh = 0xxx is reserved.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
field. It can have the following values:
(128-UInt(immh:immb)) when immh = 1xxx
The encoding immh = 0xxx is reserved.

For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;
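
The relationship between immh:immb and the right shift amount can be checked with a short Python sketch. It is a model written for illustration, with hypothetical names decode_right_shift and sshr_element, and it assumes the SSHR case of the shared pseudocode (round and accumulate both FALSE).

```python
def decode_right_shift(immh, immb):
    """Return (esize, shift) for the vector variant from immh (4 bits) and immb (3 bits)."""
    if immh == 0:
        raise ValueError("immh == 0000: see Advanced SIMD modified immediate")
    esize = 8 << (immh.bit_length() - 1)       # 8 << HighestSetBit(immh)
    shift = 2 * esize - ((immh << 3) | immb)   # (esize * 2) - UInt(immh:immb)
    return esize, shift


def sshr_element(element, shift, esize):
    """Truncating arithmetic right shift of one esize-bit lane."""
    value = element & ((1 << esize) - 1)
    if value >> (esize - 1):
        value -= 1 << esize                    # signed interpretation
    return (value >> shift) & ((1 << esize) - 1)
```

The decode confirms the stated range: for 8-bit elements immh:immb runs from 0001:000 to 0001:111, giving shifts from 8 down to 1.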







C7.2.272 SSRA


Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source 
SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector 
elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The 
results are truncated. For rounded results, see SRSRA. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28 27 26 25 24 23|22    19|18  16|15 14|13|12|11|10|9    5|4    0|
| 0  1| 0| 1  1  1  1  1  0| != 0000| immb | 0  0| 0| 1| 0| 1|  Rn  |  Rd  |
        U                     immh                o1 o0


Scalar variant 


SSRA <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 

integer datasize = esize; 

integer elements = 1; 


integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Vector 


|31|30|29|28 27 26 25 24 23|22    19|18  16|15 14|13|12|11|10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  1  0| != 0000| immb | 0  0| 0| 1| 0| 1|  Rn  |  Rd  |
        U                     immh                o1 o0


Vector variant 


SSRA <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');
Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
D when immh = 1xxx
The encoding immh = 0xxx is reserved.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
field. It can have the following values:
(128-UInt(immh:immb)) when immh = 1xxx
The encoding immh = 0xxx is reserved.

For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;
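
The accumulate step can be modeled per element as follows. This Python sketch is illustrative only: ssra_element is a hypothetical name, and it assumes the SSRA case of the shared pseudocode (accumulate TRUE, round FALSE). The addition wraps modulo 2^esize because only the low esize bits are written back.

```python
def ssra_element(acc, src, shift, esize):
    """Model one SSRA lane: acc + (signed src >> shift), truncated to esize bits.

    acc and src are unsigned bit patterns of the destination and source lanes.
    """
    mask = (1 << esize) - 1
    value = src & mask
    if value >> (esize - 1):
        value -= 1 << esize            # signed interpretation of the source lane
    return (acc + (value >> shift)) & mask
```

For example, accumulating 0xF0 (-16) shifted right by 2 into a lane holding 0x10 gives 0x0C.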
C7.2.273 SSUBL, SSUBL2


Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source
SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results 
into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed 
integer values. The destination vector elements are twice as long as the source vector elements. 


The SSUBL instruction extracts each source vector from the lower half of each source register, while the SSUBL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20   16|15 14|13|12|11 10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm  | 0  0| 1| 0| 0  0|  Rn  |  Rd  |
        U                                       o1


Three registers, not all the same type variant 


SSUBL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;
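
A minimal Python sketch of the long subtraction follows, assuming each 128-bit source register is represented as a list of 128/esize unsigned lane values. The names sign_extend and ssubl are hypothetical; part plays the role of UInt(Q) in the decode.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def ssubl(vn, vm, esize, part):
    """Model SSUBL (part == 0) and SSUBL2 (part == 1) on lists of lane values."""
    lanes = 64 // esize                               # elements = datasize DIV esize
    a = vn[part * lanes:(part + 1) * lanes]           # Vpart[n, part]
    b = vm[part * lanes:(part + 1) * lanes]           # Vpart[m, part]
    mask = (1 << (2 * esize)) - 1
    return [(sign_extend(x, esize) - sign_extend(y, esize)) & mask
            for x, y in zip(a, b)]
```

Because the result lanes are twice as wide as the source lanes, the difference of two esize-bit signed values always fits without overflow.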
C7.2.274 SSUBW, SSUBW2


Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source 
SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result 
in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed 
integer values. 


The SSUBW instruction extracts the second source vector from the lower half of the second source register, while the 
SSUBW2 instruction extracts the second source vector from the upper half of the second source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20   16|15 14|13|12|11 10|9    5|4    0|
| 0| Q| 0| 0  1  1  1  0| size| 1|   Rm  | 0  0| 1| 1| 0  0|  Rn  |  Rd  |
        U                                       o1


Three registers, not all the same type variant 


SSUBW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.


Operation 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;
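
The wide form differs from the long form in that the first operand is already 2*esize bits per element. The Python sketch below illustrates this under the assumption that the first source is a list of wide lanes and the second a list of narrow lanes; sign_extend and ssubw are hypothetical names.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def ssubw(vn_wide, vm, esize, part):
    """Model SSUBW (part == 0) and SSUBW2 (part == 1).

    vn_wide : 64 // esize lanes, each 2*esize bits wide (the whole first source)
    vm      : 128 // esize lanes, each esize bits wide; one half is selected
    """
    lanes = 64 // esize
    narrow = vm[part * lanes:(part + 1) * lanes]      # Vpart[m, part]
    mask = (1 << (2 * esize)) - 1
    return [(sign_extend(x, 2 * esize) - sign_extend(y, esize)) & mask
            for x, y in zip(vn_wide, narrow)]
```

Unlike SSUBL, the result here can wrap, because the first operand already occupies the full result width.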
C7.2.275 ST1 (multiple structures)


Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to
memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is
stored.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 


|31|30|29 28 27 26 25 24 23|22|21 20 19 18 17 16|15    12|11 10|9    5|4    0|
| 0| Q| 0  0  1  1  0  0  0| 0| 0  0  0  0  0  0| x x 1 x| size|  Rn  |  Rt  |
                             L                    opcode


One register variant
Applies when opcode == 0111.

ST1 { <Vt>.<T> }, [<Xn|SP>]

Two registers variant
Applies when opcode == 1010.

ST1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>]

Three registers variant
Applies when opcode == 0110.

ST1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>]

Four registers variant
Applies when opcode == 0010.

ST1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>]

Decode for all variants of this encoding

integer t = UInt(Rt);
integer n = UInt(Rn);
integer m = integer UNKNOWN;
boolean wback = FALSE;


Post-index 


|31|30|29 28 27 26 25 24 23|22|21|20   16|15    12|11 10|9    5|4    0|
| 0| Q| 0  0  1  1  0  0  1| 0| 0|   Rm  | x x 1 x| size|  Rn  |  Rt  |
                             L             opcode


One register, immediate offset variant
Applies when Rm == 11111 && opcode == 0111.

ST1 { <Vt>.<T> }, [<Xn|SP>], <imm>

One register, register offset variant
Applies when Rm != 11111 && opcode == 0111.

ST1 { <Vt>.<T> }, [<Xn|SP>], <Xm>


Two registers, immediate offset variant 
Applies when Rm == 11111 && opcode == 1010. 


ST1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <imm> 


Two registers, register offset variant 
Applies when Rm != 11111 && opcode == 1010. 


ST1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <Xm> 


Three registers, immediate offset variant 
Applies when Rm == 11111 && opcode == 0110. 


ST1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <imm> 


Three registers, register offset variant 
Applies when Rm != 11111 && opcode == 0110. 


ST1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <Xm> 


Four registers, immediate offset variant 
Applies when Rm == 11111 && opcode == 0010. 


ST1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <imm> 


Four registers, register offset variant 
Applies when Rm != 11111 && opcode == 0010. 


ST1 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;


Assembler symbols 





<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
1D when size = 11, Q = 0
2D when size = 11, Q = 1
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32.
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32.
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm> For the one register, immediate offset variant: is the post-index immediate offset, encoded in the "Q"
field. It can have the following values:
#8 when Q = 0
#16 when Q = 1

For the two registers, immediate offset variant: is the post-index immediate offset, encoded in the
"Q" field. It can have the following values:
#16 when Q = 0
#32 when Q = 1

For the three registers, immediate offset variant: is the post-index immediate offset, encoded in the
"Q" field. It can have the following values:
#24 when Q = 0
#48 when Q = 1

For the four registers, immediate offset variant: is the post-index immediate offset, encoded in the
"Q" field. It can have the following values:
#32 when Q = 0
#64 when Q = 1
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm"
field.


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << UInt(size);
integer elements = datasize DIV esize;

integer rpt;    // number of iterations
integer selem;  // structure elements

case opcode of
    when '0000' rpt = 1; selem = 4; // LD/ST4 (4 registers)
    when '0010' rpt = 4; selem = 1; // LD/ST1 (4 registers)
    when '0100' rpt = 1; selem = 3; // LD/ST3 (3 registers)
    when '0110' rpt = 3; selem = 1; // LD/ST1 (3 registers)
    when '0111' rpt = 1; selem = 1; // LD/ST1 (1 register)
    when '1000' rpt = 1; selem = 2; // LD/ST2 (2 registers)
    when '1010' rpt = 2; selem = 1; // LD/ST1 (2 registers)
    otherwise UnallocatedEncoding();

// .1D format only permitted with LD1 & ST1
if size:Q == '110' && selem != 1 then ReservedValue();
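
The opcode table above, and the post-index immediates listed under <imm>, can be cross-checked with a small Python sketch. The names MULTI_OPCODES, decode_multi, and post_index_imm are hypothetical. The immediate is computed here as the number of registers transferred times the bytes per register, which is an observation derived from the tables in this section rather than a rule stated by the manual.

```python
# opcode -> (rpt, selem), transcribed from the shared decode
MULTI_OPCODES = {
    0b0000: (1, 4),  # LD/ST4 (4 registers)
    0b0010: (4, 1),  # LD/ST1 (4 registers)
    0b0100: (1, 3),  # LD/ST3 (3 registers)
    0b0110: (3, 1),  # LD/ST1 (3 registers)
    0b0111: (1, 1),  # LD/ST1 (1 register)
    0b1000: (1, 2),  # LD/ST2 (2 registers)
    0b1010: (2, 1),  # LD/ST1 (2 registers)
}


def decode_multi(opcode, size, q):
    """Return (rpt, selem, elements, esize) or raise for unallocated/reserved encodings."""
    if opcode not in MULTI_OPCODES:
        raise ValueError("unallocated encoding")
    rpt, selem = MULTI_OPCODES[opcode]
    if size == 0b11 and q == 0 and selem != 1:
        raise ValueError("reserved: .1D is only permitted with LD1 and ST1")
    datasize = 128 if q else 64
    esize = 8 << size
    return rpt, selem, datasize // esize, esize


def post_index_imm(opcode, q):
    """Bytes transferred in total, which matches the listed <imm> values."""
    rpt, selem = MULTI_OPCODES[opcode]
    return rpt * selem * (128 if q else 64) // 8
```

For instance, the three-register ST1 form (opcode 0110) with Q = 1 yields 48, matching the #48 entry above.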


Operation for all encodings 


CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(datasize) rval;
integer e, r, s, tt;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
for r = 0 to rpt-1
    for e = 0 to elements-1
        tt = (t + r) MOD 32;
        for s = 0 to selem-1
            rval = V[tt];
            if memop == MemOp_LOAD then
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC];
                V[tt] = rval;
            else // memop == MemOp_STORE
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize];
            offs = offs + ebytes;
            tt = (tt + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
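
To see how the rpt and selem loop bounds produce non-interleaved (ST1) versus interleaved (ST2) memory layouts, the loop nest above can be traced in Python. The function name store_order is hypothetical; it records, for each byte offset, which register and element would be written, and does not model memory itself.

```python
def store_order(t, rpt, selem, elements, ebytes):
    """Return a list of (offset, register, element) in the order the loop visits them."""
    order = []
    offs = 0
    for r in range(rpt):
        for e in range(elements):
            tt = (t + r) % 32
            for _ in range(selem):
                order.append((offs, tt, e))
                offs += ebytes
                tt = (tt + 1) % 32      # register numbers wrap from 31 to 0
    return order
```

With t = 31 and two 64-bit elements per register, the two-register ST1 form (rpt = 2, selem = 1) writes all of register 31 and then all of register 0, while ST2 (rpt = 1, selem = 2) alternates between the two registers.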
C7.2.276 ST1 (single structure)


Store a single-element structure from one lane of one register. This instruction stores the specified element of a
SIMD&FP register to memory.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


No offset 


|31|30|29 28 27 26 25 24 23|22|21|20 19 18 17 16|15  13|12|11 10|9    5|4    0|
| 0| Q| 0  0  1  1  0  1  0| 0| 0| 0  0  0  0  0| x x 0| S| size|  Rn  |  Rt  |
                             L  R                opcode


8-bit variant 
Applies when opcode == 000. 


ST1 { <Vt>.B }[<index>], [<Xn|SP>] 


16-bit variant 
Applies when opcode == 010 && size == x0.


ST1 { <Vt>.H }[<index>], [<Xn|SP>] 


32-bit variant 
Applies when opcode == 100 && size == 00. 


ST1 { <Vt>.S }[<index>], [<Xn|SP>] 


64-bit variant 
Applies when opcode == 100 && S == 0 && size == 01.


ST1 { <Vt>.D }[<index>], [<Xn|SP>] 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE;


Post-index 


|31|30|29 28 27 26 25 24 23|22|21|20   16|15  13|12|11 10|9    5|4    0|
| 0| Q| 0  0  1  1  0  1  1| 0| 0|   Rm  | x x 0| S| size|  Rn  |  Rt  |
                             L  R         opcode

8-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 000. 
ST1 { <Vt>.B }[<index>], [<Xn|SP>], #1 
8-bit, register offset variant 
Applies when Rm != 11111 && opcode == 000. 
ST1 { <Vt>.B }[<index>], [<Xn|SP>], <Xm> 
16-bit, immediate offset variant
Applies when Rm == 11111 && opcode == 010 && size == x0.


ST1 { <Vt>.H }[<index>], [<Xn|SP>], #2 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 010 && size == x0.


ST1 { <Vt>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && size == 00. 


ST1 { <Vt>.S }[<index>], [<Xn|SP>], #4 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && size == 00. 


ST1 { <Vt>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && S == 0 && size == 01.


ST1 { <Vt>.D }[<index>], [<Xn|SP>], #8 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && S == 0 && size == 01.


ST1 { <Vt>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE;


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 


<index> For the 8-bit variant: is the element index, encoded in "Q:S:size".
For the 16-bit variant: is the element index, encoded in "Q:S:size<1>".
For the 32-bit variant: is the element index, encoded in "Q:S".
For the 64-bit variant: is the element index, encoded in "Q".


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size); // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>); // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S); // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q); // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;
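
The lane index decode can be exercised with the following Python sketch, which mirrors the case statement for the store forms (scale 0 to 2). The function name decode_lane is hypothetical, and the load-and-replicate case (scale 3) is deliberately rejected because it does not apply to ST1.

```python
def decode_lane(opcode, s, size, q):
    """Return (esize, index) for a single-structure store.

    opcode is the 3-bit field, s and q are single bits, size is the 2-bit field.
    """
    scale = (opcode >> 1) & 0b11                  # UInt(opcode<2:1>)
    if scale == 3:
        raise ValueError("load and replicate form, not a store")
    if scale == 0:
        index = (q << 3) | (s << 2) | size        # B[0-15]
    elif scale == 1:
        if size & 1:
            raise ValueError("unallocated encoding")
        index = (q << 2) | (s << 1) | (size >> 1) # H[0-7]
    else:
        if size & 0b10:
            raise ValueError("unallocated encoding")
        if (size & 1) == 0:
            index = (q << 1) | s                  # S[0-3]
        else:
            if s:
                raise ValueError("unallocated encoding")
            index = q                             # D[0-1]
            scale = 3
    return 8 << scale, index
```

The 64-bit form reuses opcode 100 and is distinguished only by size = 01, which is why scale is bumped to 3 in that branch.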


Operation for all encodings 


CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;





C7-1328 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


C7.2.277  ST2 (multiple structures) 


Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from 
two SIMD&FP registers to memory, with interleaving. Every element of each register is stored. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 





No offset 
| 31 | 30 | 29:24  | 23 | 22 | 21:16  | 15:12  | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 001100 | 0  | 0  | 000000 | 1000   | size  | Rn  | Rt  |
|    |    |        |    | L  |        | opcode |       |     |     |
No offset variant 
ST2 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>] 
Decode for this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
| 31 | 30 | 29:23   | 22 | 21 | 20:16 | 15:12  | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0011001 | 0  | 0  | Rm    | 1000   | size  | Rn  | Rt  |
|    |    |         | L  |    |       | opcode |       |     |     |
Immediate offset variant 
Applies when Rm == 11111. 
ST2 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <imm> 
Register offset variant 
Applies when Rm != 11111. 
ST2 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>], <Xm> 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 
Assembler symbols 
<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 


<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "Q" field. It can have the following values: 
#16 when Q = 0 
#32 when Q = 1 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 
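
The symbol tables above can be read as two small lookups. The following Python sketch is illustrative only: the names `ARRANGEMENTS`, `arrangement`, and `post_index_imm` are hypothetical helpers, not part of the architecture pseudocode. It assumes the post-index immediate equals the total number of bytes transferred, which matches the #16 and #32 values listed for ST2.

```python
# Hypothetical helper names; the mapping itself is taken from the <T> table.
ARRANGEMENTS = {
    (0b00, 0): "8B", (0b00, 1): "16B",
    (0b01, 0): "4H", (0b01, 1): "8H",
    (0b10, 0): "2S", (0b10, 1): "4S",
    (0b11, 1): "2D",
}


def arrangement(size: int, q: int) -> str:
    """Return the <T> arrangement for a size:Q pair, or raise if it is reserved."""
    try:
        return ARRANGEMENTS[(size, q)]
    except KeyError:
        raise ValueError(f"reserved or invalid encoding: size={size:02b}, Q={q}") from None


def post_index_imm(q: int, registers: int = 2) -> int:
    """Post-index immediate in bytes: one full 64-bit or 128-bit register per structure member."""
    register_bytes = 16 if q == 1 else 8
    return registers * register_bytes
```

With `registers=2` this yields 16 for Q = 0 and 32 for Q = 1, the two `<imm>` values in the table.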


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << UInt(size);
integer elements = datasize DIV esize;

integer rpt;    // number of iterations
integer selem;  // structure elements

case opcode of
    when '0000' rpt = 1; selem = 4;     // LD/ST4 (4 registers)
    when '0010' rpt = 4; selem = 1;     // LD/ST1 (4 registers)
    when '0100' rpt = 1; selem = 3;     // LD/ST3 (3 registers)
    when '0110' rpt = 3; selem = 1;     // LD/ST1 (3 registers)
    when '0111' rpt = 1; selem = 1;     // LD/ST1 (1 register)
    when '1000' rpt = 1; selem = 2;     // LD/ST2 (2 registers)
    when '1010' rpt = 2; selem = 1;     // LD/ST1 (2 registers)
    otherwise UnallocatedEncoding();

// .1D format only permitted with LD1 & ST1
if size:Q == '110' && selem != 1 then ReservedValue();


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(datasize) rval;
integer e, r, s, tt;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
for r = 0 to rpt-1
    for e = 0 to elements-1
        tt = (t + r) MOD 32;
        for s = 0 to selem-1
            rval = V[tt];
            if memop == MemOp_LOAD then
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC];
                V[tt] = rval;
            else // memop == MemOp_STORE
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize];
            offs = offs + ebytes;
            tt = (tt + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
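
The interleaving performed by the store loop can be hard to picture from the pseudocode alone. The Python sketch below models only the ST2 store path (rpt = 1, selem = 2). It is a simplification under stated assumptions: registers are plain Python integers, memory is a little-endian `bytearray`, and enable checks, alignment checks, and writeback are omitted. The function name `st2_multiple` is hypothetical.

```python
def st2_multiple(regs, t, esize, datasize, memory, address):
    """Store interleaved 2-element structures from V[t] and V[(t + 1) MOD 32].

    regs    -- list of 32 integers modelling the SIMD&FP registers (assumption)
    memory  -- bytearray modelling little-endian memory (assumption)
    Returns the number of bytes written.
    """
    selem = 2
    ebytes = esize // 8
    elements = datasize // esize
    mask = (1 << esize) - 1
    offs = 0
    for e in range(elements):
        tt = t
        for _ in range(selem):
            # Elem[rval, e, esize]: extract element e from register tt
            element = (regs[tt] >> (e * esize)) & mask
            start = address + offs
            memory[start:start + ebytes] = element.to_bytes(ebytes, "little")
            offs += ebytes
            tt = (tt + 1) % 32
    return offs
```

Storing two 8B registers this way places element 0 of the first register, then element 0 of the second, then element 1 of the first, and so on, which is the interleaved layout the instruction description refers to.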
C7.2.278 ST2 (single structure) 

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to 
memory from corresponding elements of two SIMD&FP registers. 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

No offset 

| 31 | 30 | 29:24  | 23 | 22 | 21 | 20:16 | 15:13  | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 001101 | 0  | 0  | 1  | 00000 | xx0    | S  | size  | Rn  | Rt  |
|    |    |        |    | L  | R  |       | opcode |    |       |     |     |

8-bit variant 

Applies when opcode == 000. 

ST2 { <Vt>.B, <Vt2>.B }[<index>], [<Xn|SP>] 

16-bit variant 

Applies when opcode == 010 && size == x0. 

ST2 { <Vt>.H, <Vt2>.H }[<index>], [<Xn|SP>] 

32-bit variant 

Applies when opcode == 100 && size == 00. 

ST2 { <Vt>.S, <Vt2>.S }[<index>], [<Xn|SP>] 

64-bit variant 

Applies when opcode == 100 && S == 0 && size == 01. 

ST2 { <Vt>.D, <Vt2>.D }[<index>], [<Xn|SP>] 

Decode for all variants of this encoding 

integer t = UInt(Rt); 

integer n = UInt(Rn); 

integer m = integer UNKNOWN; 

boolean wback = FALSE; 

Post-index 

| 31 | 30 | 29:23   | 22 | 21 | 20:16 | 15:13  | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0011011 | 0  | 1  | Rm    | xx0    | S  | size  | Rn  | Rt  |
|    |    |         | L  | R  |       | opcode |    |       |     |     |

8-bit, immediate offset variant 

Applies when Rm == 11111 && opcode == 000. 

ST2 { <Vt>.B, <Vt2>.B }[<index>], [<Xn|SP>], #2 

8-bit, register offset variant 

Applies when Rm != 11111 && opcode == 000. 

ST2 { <Vt>.B, <Vt2>.B }[<index>], [<Xn|SP>], <Xm> 
16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 010 && size == x0. 


ST2 { <Vt>.H, <Vt2>.H }[<index>], [<Xn|SP>], #4 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 010 && size == x0. 


ST2 { <Vt>.H, <Vt2>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && size == 00. 


ST2 { <Vt>.S, <Vt2>.S }[<index>], [<Xn|SP>], #8 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && size == 00. 


ST2 { <Vt>.S, <Vt2>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 100 && S == 0 && size == 01. 


ST2 { <Vt>.D, <Vt2>.D }[<index>], [<Xn|SP>], #16 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 100 && S == 0 && size == 01. 


ST2 { <Vt>.D, <Vt2>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 


For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 
For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 


<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size);      // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>);   // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S);       // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q);         // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;
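
As a cross-check of the lane-index rules in the shared decode, the following Python sketch re-expresses the same case analysis on plain integers. The function name `decode_single` and its return tuple are illustrative choices, and `ValueError` stands in for `UnallocatedEncoding()`; none of this is defined by the architecture.

```python
def decode_single(opcode, r, q, s, size, l=0):
    """Return (selem, esize, index, replicate) for a single-structure encoding.

    Field values are passed as integers: opcode is 3 bits, size is 2 bits,
    and r, q, s, l are single bits. Raises ValueError where the pseudocode
    calls UnallocatedEncoding().
    """
    scale = (opcode >> 1) & 0b11
    selem = (((opcode & 1) << 1) | r) + 1
    replicate = False
    index = None
    if scale == 3:
        # load and replicate
        if l == 0 or s == 1:
            raise ValueError("unallocated encoding")
        scale = size
        replicate = True
    elif scale == 0:
        index = (q << 3) | (s << 2) | size          # B[0-15]
    elif scale == 1:
        if size & 0b01:
            raise ValueError("unallocated encoding")
        index = (q << 2) | (s << 1) | (size >> 1)   # H[0-7]
    else:
        if size & 0b10:
            raise ValueError("unallocated encoding")
        if (size & 0b01) == 0:
            index = (q << 1) | s                    # S[0-3]
        else:
            if s == 1:
                raise ValueError("unallocated encoding")
            index = q                               # D[0-1]
            scale = 3
    return selem, 8 << scale, index, replicate
```

For example, the ST2 32-bit variant with Q = 1 and S = 1 (opcode 100, R = 1, size 00) decodes to two structure elements of 32 bits at lane index 3.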


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
C7.2.279 ST3 (multiple structures) 
Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to 
memory from three SIMD&FP registers, with interleaving. Every element of each register is stored. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
No offset 
| 31 | 30 | 29:24  | 23 | 22 | 21:16  | 15:12  | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 001100 | 0  | 0  | 000000 | 0100   | size  | Rn  | Rt  |
|    |    |        |    | L  |        | opcode |       |     |     |
No offset variant 
ST3 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>] 
Decode for this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 
Post-index 
| 31 | 30 | 29:23   | 22 | 21 | 20:16 | 15:12  | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0011001 | 0  | 0  | Rm    | 0100   | size  | Rn  | Rt  |
|    |    |         | L  |    |       | opcode |       |     |     |
Immediate offset variant 
Applies when Rm == 11111. 
ST3 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <imm> 
Register offset variant 
Applies when Rm != 11111. 
ST3 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T> }, [<Xn|SP>], <Xm> 
Decode for all variants of this encoding 
integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 
Assembler symbols 
<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 


<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "Q" field. It can have the following values: 
#24 when Q = 0 
#48 when Q = 1 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << UInt(size);
integer elements = datasize DIV esize;

integer rpt;    // number of iterations
integer selem;  // structure elements

case opcode of
    when '0000' rpt = 1; selem = 4;     // LD/ST4 (4 registers)
    when '0010' rpt = 4; selem = 1;     // LD/ST1 (4 registers)
    when '0100' rpt = 1; selem = 3;     // LD/ST3 (3 registers)
    when '0110' rpt = 3; selem = 1;     // LD/ST1 (3 registers)
    when '0111' rpt = 1; selem = 1;     // LD/ST1 (1 register)
    when '1000' rpt = 1; selem = 2;     // LD/ST2 (2 registers)
    when '1010' rpt = 2; selem = 1;     // LD/ST1 (2 registers)
    otherwise UnallocatedEncoding();

// .1D format only permitted with LD1 & ST1
if size:Q == '110' && selem != 1 then ReservedValue();


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(datasize) rval;
integer e, r, s, tt;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
for r = 0 to rpt-1
    for e = 0 to elements-1
        tt = (t + r) MOD 32;
        for s = 0 to selem-1
            rval = V[tt];
            if memop == MemOp_LOAD then
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC];
                V[tt] = rval;
            else // memop == MemOp_STORE
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize];
            offs = offs + ebytes;
            tt = (tt + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
C7.2.280 ST3 (single structure) 

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to 
memory from corresponding elements of three SIMD&FP registers. 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

No offset 

| 31 | 30 | 29:24  | 23 | 22 | 21 | 20:16 | 15:13  | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 001101 | 0  | 0  | 0  | 00000 | xx1    | S  | size  | Rn  | Rt  |
|    |    |        |    | L  | R  |       | opcode |    |       |     |     |


8-bit variant 
Applies when opcode == 001. 


ST3 { <Vt>.B, <Vt2>.B, <Vt3>.B }[<index>], [<Xn|SP>] 


16-bit variant 
Applies when opcode == 011 && size == x0. 


ST3 { <Vt>.H, <Vt2>.H, <Vt3>.H }[<index>], [<Xn|SP>] 


32-bit variant 
Applies when opcode == 101 && size == 00. 


ST3 { <Vt>.S, <Vt2>.S, <Vt3>.S }[<index>], [<Xn|SP>] 


64-bit variant 
Applies when opcode == 101 && S == 0 && size == 01. 


ST3 { <Vt>.D, <Vt2>.D, <Vt3>.D }[<index>], [<Xn|SP>] 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 


Post-index 


| 31 | 30 | 29:23   | 22 | 21 | 20:16 | 15:13  | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0011011 | 0  | 0  | Rm    | xx1    | S  | size  | Rn  | Rt  |
|    |    |         | L  | R  |       | opcode |    |       |     |     |


8-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 001. 


ST3 { <Vt>.B, <Vt2>.B, <Vt3>.B }[<index>], [<Xn|SP>], #3 


8-bit, register offset variant 
Applies when Rm != 11111 && opcode == 001. 


ST3 { <Vt>.B, <Vt2>.B, <Vt3>.B }[<index>], [<Xn|SP>], <Xm> 
16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 011 && size == x0. 


ST3 { <Vt>.H, <Vt2>.H, <Vt3>.H }[<index>], [<Xn|SP>], #6 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 011 && size == x0. 


ST3 { <Vt>.H, <Vt2>.H, <Vt3>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && size == 00. 


ST3 { <Vt>.S, <Vt2>.S, <Vt3>.S }[<index>], [<Xn|SP>], #12 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && size == 00. 


ST3 { <Vt>.S, <Vt2>.S, <Vt3>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && S == 0 && size == 01. 


ST3 { <Vt>.D, <Vt2>.D, <Vt3>.D }[<index>], [<Xn|SP>], #24 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && S == 0 && size == 01. 


ST3 { <Vt>.D, <Vt2>.D, <Vt3>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 


For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 
For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 





<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 
Shared decode for all encodings 

integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size);      // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>);   // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S);       // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q);         // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
C7.2.281 ST4 (multiple structures) 


Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to 
memory from four SIMD&FP registers, with interleaving. Every element of each register is stored. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


No offset 
| 31 | 30 | 29:24  | 23 | 22 | 21:16  | 15:12  | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 001100 | 0  | 0  | 000000 | 0000   | size  | Rn  | Rt  |
|    |    |        |    | L  |        | opcode |       |     |     |


No offset variant 


ST4 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>] 


Decode for this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = integer UNKNOWN; 
boolean wback = FALSE; 


Post-index 
| 31 | 30 | 29:23   | 22 | 21 | 20:16 | 15:12  | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0011001 | 0  | 0  | Rm    | 0000   | size  | Rn  | Rt  |
|    |    |         | L  |    |       | opcode |       |     |     |


Immediate offset variant 
Applies when Rm == 11111. 


ST4 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <imm> 


Register offset variant 
Applies when Rm != 11111. 


ST4 { <Vt>.<T>, <Vt2>.<T>, <Vt3>.<T>, <Vt4>.<T> }, [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 





<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 


<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> Is the post-index immediate offset, encoded in the "Q" field. It can have the following values: 
#32 when Q = 0 
#64 when Q = 1 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 


Shared decode for all encodings 


MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << UInt(size);
integer elements = datasize DIV esize;

integer rpt;    // number of iterations
integer selem;  // structure elements

case opcode of
    when '0000' rpt = 1; selem = 4;     // LD/ST4 (4 registers)
    when '0010' rpt = 4; selem = 1;     // LD/ST1 (4 registers)
    when '0100' rpt = 1; selem = 3;     // LD/ST3 (3 registers)
    when '0110' rpt = 3; selem = 1;     // LD/ST1 (3 registers)
    when '0111' rpt = 1; selem = 1;     // LD/ST1 (1 register)
    when '1000' rpt = 1; selem = 2;     // LD/ST2 (2 registers)
    when '1010' rpt = 2; selem = 1;     // LD/ST1 (2 registers)
    otherwise UnallocatedEncoding();

// .1D format only permitted with LD1 & ST1
if size:Q == '110' && selem != 1 then ReservedValue();
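
The opcode case table in the shared decode can also be viewed as a lookup from opcode to the (rpt, selem) pair. The Python sketch below is one possible rendering, assuming integer field values; `LDST_MULTI` and `decode_multi` are hypothetical names, and `ValueError` stands in for both `UnallocatedEncoding()` and `ReservedValue()`. The third returned value, the total bytes transferred, is derived here as rpt * selem * datasize / 8 and is not a variable in the pseudocode.

```python
# opcode -> (rpt, selem), copied from the case table in the shared decode
LDST_MULTI = {
    0b0000: (1, 4),  # LD/ST4 (4 registers)
    0b0010: (4, 1),  # LD/ST1 (4 registers)
    0b0100: (1, 3),  # LD/ST3 (3 registers)
    0b0110: (3, 1),  # LD/ST1 (3 registers)
    0b0111: (1, 1),  # LD/ST1 (1 register)
    0b1000: (1, 2),  # LD/ST2 (2 registers)
    0b1010: (2, 1),  # LD/ST1 (2 registers)
}


def decode_multi(opcode, size, q):
    """Return (rpt, selem, total_bytes) for a multiple-structure encoding."""
    if opcode not in LDST_MULTI:
        raise ValueError("unallocated encoding")
    rpt, selem = LDST_MULTI[opcode]
    # .1D format (size = 11, Q = 0) is only permitted with LD1 and ST1
    if size == 0b11 and q == 0 and selem != 1:
        raise ValueError("reserved value")
    datasize = 128 if q == 1 else 64
    return rpt, selem, rpt * selem * datasize // 8
```

For ST4 the derived byte count is 32 when Q = 0 and 64 when Q = 1, which agrees with the `<imm>` values listed under the assembler symbols.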


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(datasize) rval;
integer e, r, s, tt;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
for r = 0 to rpt-1
    for e = 0 to elements-1
        tt = (t + r) MOD 32;
        for s = 0 to selem-1
            rval = V[tt];
            if memop == MemOp_LOAD then
                Elem[rval, e, esize] = Mem[address+offs, ebytes, AccType_VEC];
                V[tt] = rval;
            else // memop == MemOp_STORE
                Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, e, esize];
            offs = offs + ebytes;
            tt = (tt + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
C7.2.282 ST4 (single structure) 

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to 
memory from corresponding elements of four SIMD&FP registers. 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

No offset 

| 31 | 30 | 29:24  | 23 | 22 | 21 | 20:16 | 15:13  | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 001101 | 0  | 0  | 1  | 00000 | xx1    | S  | size  | Rn  | Rt  |
|    |    |        |    | L  | R  |       | opcode |    |       |     |     |

8-bit variant 

Applies when opcode == 001. 

ST4 { <Vt>.B, <Vt2>.B, <Vt3>.B, <Vt4>.B }[<index>], [<Xn|SP>] 

16-bit variant 

Applies when opcode == 011 && size == x0. 

ST4 { <Vt>.H, <Vt2>.H, <Vt3>.H, <Vt4>.H }[<index>], [<Xn|SP>] 

32-bit variant 

Applies when opcode == 101 && size == 00. 

ST4 { <Vt>.S, <Vt2>.S, <Vt3>.S, <Vt4>.S }[<index>], [<Xn|SP>] 

64-bit variant 

Applies when opcode == 101 && S == 0 && size == 01. 

ST4 { <Vt>.D, <Vt2>.D, <Vt3>.D, <Vt4>.D }[<index>], [<Xn|SP>] 

Decode for all variants of this encoding 

integer t = UInt(Rt); 

integer n = UInt(Rn); 

integer m = integer UNKNOWN; 

boolean wback = FALSE; 

Post-index 

| 31 | 30 | 29:23   | 22 | 21 | 20:16 | 15:13  | 12 | 11:10 | 9:5 | 4:0 |
| 0  | Q  | 0011011 | 0  | 1  | Rm    | xx1    | S  | size  | Rn  | Rt  |
|    |    |         | L  | R  |       | opcode |    |       |     |     |

8-bit, immediate offset variant 

Applies when Rm == 11111 && opcode == 001. 

ST4 { <Vt>.B, <Vt2>.B, <Vt3>.B, <Vt4>.B }[<index>], [<Xn|SP>], #4 

8-bit, register offset variant 

Applies when Rm != 11111 && opcode == 001. 

ST4 { <Vt>.B, <Vt2>.B, <Vt3>.B, <Vt4>.B }[<index>], [<Xn|SP>], <Xm> 
16-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 011 && size == x0. 


ST4 { <Vt>.H, <Vt2>.H, <Vt3>.H, <Vt4>.H }[<index>], [<Xn|SP>], #8 


16-bit, register offset variant 
Applies when Rm != 11111 && opcode == 011 && size == x0. 


ST4 { <Vt>.H, <Vt2>.H, <Vt3>.H, <Vt4>.H }[<index>], [<Xn|SP>], <Xm> 


32-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && size == 00. 


ST4 { <Vt>.S, <Vt2>.S, <Vt3>.S, <Vt4>.S }[<index>], [<Xn|SP>], #16 


32-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && size == 00. 


ST4 { <Vt>.S, <Vt2>.S, <Vt3>.S, <Vt4>.S }[<index>], [<Xn|SP>], <Xm> 


64-bit, immediate offset variant 
Applies when Rm == 11111 && opcode == 101 && S == 0 && size == 01. 


ST4 { <Vt>.D, <Vt2>.D, <Vt3>.D, <Vt4>.D }[<index>], [<Xn|SP>], #32 


64-bit, register offset variant 
Applies when Rm != 11111 && opcode == 101 && S == 0 && size == 01. 


ST4 { <Vt>.D, <Vt2>.D, <Vt3>.D, <Vt4>.D }[<index>], [<Xn|SP>], <Xm> 


Decode for all variants of this encoding 


integer t = UInt(Rt); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
boolean wback = TRUE; 


Assembler symbols 


<Vt> Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Vt2> Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32. 
<Vt3> Is the name of the third SIMD&FP register to be transferred, encoded as "Rt" plus 2 modulo 32. 
<Vt4> Is the name of the fourth SIMD&FP register to be transferred, encoded as "Rt" plus 3 modulo 32. 
<index> For the 8-bit variant: is the element index, encoded in "Q:S:size". 


For the 16-bit variant: is the element index, encoded in "Q:S:size<1>". 
For the 32-bit variant: is the element index, encoded in "Q:S". 


For the 64-bit variant: is the element index, encoded in "Q". 





<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<Xm> Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" 
field. 
Shared decode for all encodings 

integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;

case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UnallocatedEncoding();
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size);      // B[0-15]
    when 1
        if size<0> == '1' then UnallocatedEncoding();
        index = UInt(Q:S:size<1>);   // H[0-7]
    when 2
        if size<1> == '1' then UnallocatedEncoding();
        if size<0> == '0' then
            index = UInt(Q:S);       // S[0-3]
        else
            if S == '1' then UnallocatedEncoding();
            index = UInt(Q);         // D[0-1]
            scale = 3;

MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;


Operation for all encodings 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
integer s;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
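
The single-lane store path and the post-index writeback can be modelled in a few lines. The Python sketch below covers only the store case. It is a simplified model under stated assumptions: `regs` and `x` are lists of Python integers, entry 31 of `x` stands in for SP when used as the base, memory is a little-endian `bytearray`, and the enable and SP alignment checks are left out. The name `st_single` is hypothetical.

```python
def st_single(regs, x, t, n, m, selem, esize, index, memory, wback):
    """Store lane `index` of `selem` consecutive registers, then write back.

    regs  -- 32 integers modelling V0..V31 (assumption)
    x     -- 32 integers modelling X0..X30, with x[31] standing in for SP
    m     -- post-index register number; 31 selects the immediate form
    """
    ebytes = esize // 8
    mask = (1 << esize) - 1
    address = x[n]
    offs = 0
    for _ in range(selem):
        # Elem[rval, index, esize]: extract one lane of register t
        element = (regs[t] >> (index * esize)) & mask
        start = address + offs
        memory[start:start + ebytes] = element.to_bytes(ebytes, "little")
        offs += ebytes
        t = (t + 1) % 32
    if wback:
        if m != 31:
            offs = x[m]          # register offset replaces the byte count
        x[n] = (address + offs) & (2**64 - 1)
```

In the immediate form the base register advances by the number of bytes stored, which is why the 32-bit ST4 variant uses #16; in the register form it advances by the value of Xm instead.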
C7.2.283 STNP (SIMD&FP) 

Store Pair of SIMD&FP registers, with Non-temporal hint. This instruction stores a pair of SIMD&FP registers to 
memory, issuing a hint to the memory system that the access is non-temporal. The address used for the store is 
calculated from a base register value and an immediate offset. For information about non-temporal 
pair instructions, see Load/Store SIMD and Floating-point Non-temporal pair on page C3-154. 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 

| 31:30 | 29:27 | 26 | 25:23 | 22 | 21:15 | 14:10 | 9:5 | 4:0 |
| opc   | 101   | 1  | 000   | 0  | imm7  | Rt2   | Rn  | Rt  |
|       |       |    |       | L  |       |       |     |     |


32-bit variant 

Applies when opc == 00. 

STNP <St1>, <St2>, [<Xn|SP>{, #<imm>}] 
64-bit variant 

Applies when opc == 01. 

STNP <Dt1>, <Dt2>, [<Xn|SP>{, #<imm>}] 
128-bit variant 

Applies when opc == 10. 

STNP <Qt1>, <Qt2>, [<Xn|SP>{, #<imm>}] 
Decode for all variants of this encoding 


// Empty. 


Assembler symbols 








<Dt1> Is the 64-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Dt2> Is the 64-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field. 
<Qt1> Is the 128-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field. 
<Qt2> Is the 128-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field. 
<St1> Is the 32-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field. 
<St2> Is the 32-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field. 
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field. 
<imm> For the 32-bit variant: is the optional signed immediate byte offset, a multiple of 4 in the range -256 
to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4. 

For the 64-bit variant: is the optional signed immediate byte offset, a multiple of 8 in the range -512 
to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8. 

For the 128-bit variant: is the optional signed immediate byte offset, a multiple of 16 in the range 
-1024 to 1008, defaulting to 0 and encoded in the "imm7" field as <imm>/16. 
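
The three `<imm>` ranges follow from a 7-bit signed field scaled by the access size. As a rough illustration, the Python sketch below encodes and decodes the field, assuming `opc` is passed as an integer 0, 1, or 2 for the 32-bit, 64-bit, and 128-bit variants. The function names `encode_imm7` and `decode_imm7` are hypothetical.

```python
def encode_imm7(imm: int, opc: int) -> int:
    """Encode a signed byte offset into the 7-bit imm7 field as <imm>/(4 << opc)."""
    if opc not in (0, 1, 2):
        raise ValueError("opc == 11 is unallocated")
    scale = 2 + opc
    if imm % (1 << scale) != 0:
        raise ValueError(f"offset must be a multiple of {1 << scale}")
    scaled = imm >> scale
    if not -64 <= scaled <= 63:
        raise ValueError("offset out of range for a 7-bit signed field")
    return scaled & 0x7F


def decode_imm7(imm7: int, opc: int) -> int:
    """Inverse of encode_imm7: sign-extend imm7, then shift left by the scale."""
    scale = 2 + opc
    signed = imm7 - 128 if imm7 & 0x40 else imm7
    return signed << scale
```

The limits -64 and 63 multiplied by 4, 8, and 16 give exactly the ranges quoted above: -256 to 252, -512 to 504, and -1024 to 1008.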
Shared decode for all encodings 

integer n = UInt(Rn);
integer t = UInt(Rt);
integer t2 = UInt(Rt2);
if opc == '11' then UnallocatedEncoding();
integer scale = 2 + UInt(opc);
integer datasize = 8 << scale;
bits(64) offset = LSL(SignExtend(imm7, 64), scale);


Operation 
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data1;
bits(datasize) data2;
constant integer dbytes = datasize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

data1 = V[t];
data2 = V[t2];
Mem[address, dbytes, AccType_VECSTREAM] = data1;
Mem[address+dbytes, dbytes, AccType_VECSTREAM] = data2;
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C7.2.284 STP (SIMD&FP) 


Store Pair of SIMD&FP registers. This instruction stores a pair of SIMD&FP registers to memory. The address used 
for the store is calculated from a base register value and an immediate offset. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Post-index 


|31 30|29 28 27|26|25 24 23|22|21     15|14    10|9     5|4     0|
| opc | 1  0  1| 1| 0  0  1| 0|  imm7   |  Rt2   |  Rn   |  Rt   |
                             L


32-bit variant

Applies when opc == 00.

STP <St1>, <St2>, [<Xn|SP>], #<imm>

64-bit variant

Applies when opc == 01.

STP <Dt1>, <Dt2>, [<Xn|SP>], #<imm>

128-bit variant

Applies when opc == 10.

STP <Qt1>, <Qt2>, [<Xn|SP>], #<imm>

Decode for all variants of this encoding

boolean wback = TRUE;
boolean postindex = TRUE;


Pre-index 


|31 30|29 28 27|26|25 24 23|22|21     15|14    10|9     5|4     0|
| opc | 1  0  1| 1| 0  1  1| 0|  imm7   |  Rt2   |  Rn   |  Rt   |
                             L


32-bit variant

Applies when opc == 00.

STP <St1>, <St2>, [<Xn|SP>, #<imm>]!

64-bit variant

Applies when opc == 01.

STP <Dt1>, <Dt2>, [<Xn|SP>, #<imm>]!

128-bit variant

Applies when opc == 10.

STP <Qt1>, <Qt2>, [<Xn|SP>, #<imm>]!

Decode for all variants of this encoding

boolean wback = TRUE;
boolean postindex = FALSE;


Signed offset 


|31 30|29 28 27|26|25 24 23|22|21     15|14    10|9     5|4     0|
| opc | 1  0  1| 1| 0  1  0| 0|  imm7   |  Rt2   |  Rn   |  Rt   |
                             L


32-bit variant

Applies when opc == 00.

STP <St1>, <St2>, [<Xn|SP>{, #<imm>}]

64-bit variant

Applies when opc == 01.

STP <Dt1>, <Dt2>, [<Xn|SP>{, #<imm>}]

128-bit variant

Applies when opc == 10.

STP <Qt1>, <Qt2>, [<Xn|SP>{, #<imm>}]

Decode for all variants of this encoding

boolean wback = FALSE;
boolean postindex = FALSE; 
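
The three encodings differ only in the write-back and post-index flags set by their decode. As a minimal sketch of how those two flags shape the access, the following Python function returns the two store addresses and the final base register value. The function name pair_addresses and its parameter names are assumptions of this sketch, not architecture pseudocode.

```python
def pair_addresses(base: int, offset: int, dbytes: int, wback: bool, postindex: bool):
    """Return (addr_first, addr_second, new_base) for one STP access.

    base is the value of Xn or SP, offset the decoded signed byte offset,
    and dbytes the size of each register in bytes. Addresses wrap at 64 bits.
    """
    mask = (1 << 64) - 1
    # Post-index accesses memory at the unmodified base address.
    address = base if postindex else (base + offset) & mask
    first, second = address, (address + dbytes) & mask
    if wback:
        # Post-index applies the offset only after the access.
        new_base = (address + offset) & mask if postindex else address
    else:
        new_base = base
    return first, second, new_base
```

With base 0x1000, offset 32 and 16-byte registers, this sketch gives stores at 0x1000 and 0x1010 with a final base of 0x1020 for post-index, stores at 0x1020 and 0x1030 with a final base of 0x1020 for pre-index, and the same store addresses with the base left at 0x1000 for signed offset.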


Assembler symbols

<Dt1> Is the 64-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt2> Is the 64-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<Qt1> Is the 128-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt2> Is the 128-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<St1> Is the 32-bit name of the first SIMD&FP register to be transferred, encoded in the "Rt" field.
<St2> Is the 32-bit name of the second SIMD&FP register to be transferred, encoded in the "Rt2" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm> For the 32-bit post-index and 32-bit pre-index variant: is the signed immediate byte offset, a
multiple of 4 in the range -256 to 252, encoded in the "imm7" field as <imm>/4.
For the 32-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 4 in
the range -256 to 252, defaulting to 0 and encoded in the "imm7" field as <imm>/4.
For the 64-bit post-index and 64-bit pre-index variant: is the signed immediate byte offset, a
multiple of 8 in the range -512 to 504, encoded in the "imm7" field as <imm>/8.
For the 64-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 8 in
the range -512 to 504, defaulting to 0 and encoded in the "imm7" field as <imm>/8.
For the 128-bit post-index and 128-bit pre-index variant: is the signed immediate byte offset, a
multiple of 16 in the range -1024 to 1008, encoded in the "imm7" field as <imm>/16.
For the 128-bit signed offset variant: is the optional signed immediate byte offset, a multiple of 16
in the range -1024 to 1008, defaulting to 0 and encoded in the "imm7" field as <imm>/16.


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer t2 = UInt(Rt2); 

if opc == '11' then UnallocatedEncoding(); 

integer scale = 2 + UInt(opc); 

integer datasize = 8 << scale; 

bits(64) offset = LSL(SignExtend(imm7, 64), scale); 


Operation for all encodings
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data1;
bits(datasize) data2;
constant integer dbytes = datasize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

data1 = V[t];
data2 = V[t2];
Mem[address, dbytes, AccType_VEC] = data1;
Mem[address+dbytes, dbytes, AccType_VEC] = data2;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;

C7.2.285 STR (immediate, SIMD&FP)
Store SIMD&FP register (immediate offset). This instruction stores a single SIMD&FP register to memory. The 
address that is used for the store is calculated from a base register value and an immediate offset. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Post-index 
|31 30|29 28 27|26|25 24|23 22|21|20     12|11 10|9     5|4     0|
|size | 1  1  1| 1| 0  0| x  0| 0|  imm9   | 0  1|  Rn   |  Rt   |
                         opc
8-bit variant 
Applies when size == 00 && opc == 00. 
STR <Bt>, [<Xn|SP>], #<simm> 
16-bit variant 
Applies when size == 01 && opc == 00. 
STR <Ht>, [<Xn|SP>], #<simm> 
32-bit variant 
Applies when size == 10 && opc == 00. 
STR <St>, [<Xn|SP>], #<simm> 
64-bit variant 
Applies when size == 11 && opc == 00. 
STR <Dt>, [<Xn|SP>], #<simm> 
128-bit variant 
Applies when size == 00 && opc == 10. 
STR <Qt>, [<Xn|SP>], #<simm> 
Decode for all variants of this encoding 
boolean wback = TRUE;
boolean postindex = TRUE; 
integer scale = UInt(opc<1>:size); 
if scale > 4 then UnallocatedEncoding(); 
bits(64) offset = SignExtend(imm9, 64); 
Pre-index 
|31 30|29 28 27|26|25 24|23 22|21|20     12|11 10|9     5|4     0|
|size | 1  1  1| 1| 0  0| x  0| 0|  imm9   | 1  1|  Rn   |  Rt   |
                         opc
8-bit variant
Applies when size == 00 && opc == 00.
STR <Bt>, [<Xn|SP>, #<simm>]!
16-bit variant
Applies when size == 01 && opc == 00.
STR <Ht>, [<Xn|SP>, #<simm>]!
32-bit variant
Applies when size == 10 && opc == 00.
STR <St>, [<Xn|SP>, #<simm>]!
64-bit variant
Applies when size == 11 && opc == 00.
STR <Dt>, [<Xn|SP>, #<simm>]!
128-bit variant
Applies when size == 00 && opc == 10.
STR <Qt>, [<Xn|SP>, #<simm>]!

Decode for all variants of this encoding

boolean wback = TRUE;
boolean postindex = FALSE; 


integer scale = UInt(opc<1>:size); 
if scale > 4 then UnallocatedEncoding(); 
bits(64) offset = SignExtend(imm9, 64); 


Unsigned offset 


|31 30|29 28 27|26|25 24|23 22|21         10|9     5|4     0|
|size | 1  1  1| 1| 0  1| x  0|    imm12    |  Rn   |  Rt   |
                         opc


8-bit variant
Applies when size == 00 && opc == 00.
STR <Bt>, [<Xn|SP>{, #<pimm>}]
16-bit variant
Applies when size == 01 && opc == 00.
STR <Ht>, [<Xn|SP>{, #<pimm>}]
32-bit variant
Applies when size == 10 && opc == 00.
STR <St>, [<Xn|SP>{, #<pimm>}]
64-bit variant
Applies when size == 11 && opc == 00.
STR <Dt>, [<Xn|SP>{, #<pimm>}]
128-bit variant
Applies when size == 00 && opc == 10.
STR <Qt>, [<Xn|SP>{, #<pimm>}]

Decode for all variants of this encoding

boolean wback = FALSE;

boolean postindex = FALSE; 

integer scale = UInt(opc<1>:size); 

if scale > 4 then UnallocatedEncoding(); 

bits(64) offset = LSL(ZeroExtend(imm12, 64), scale); 


Assembler symbols

<Bt> Is the 8-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Ht> Is the 16-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt> Is the 128-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<St> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<simm> Is the signed immediate byte offset, in the range -256 to 255, encoded in the "imm9" field.
<pimm> For the 8-bit variant: is the optional positive immediate byte offset, in the range 0 to 4095, defaulting
to 0 and encoded in the "imm12" field.


For the 16-bit variant: is the optional positive immediate byte offset, a multiple of 2 in the range 0 
to 8190, defaulting to 0 and encoded in the "imm12" field as <pimm>/2. 


For the 32-bit variant: is the optional positive immediate byte offset, a multiple of 4 in the range 0 
to 16380, defaulting to 0 and encoded in the "imm12" field as <pimm>/4. 


For the 64-bit variant: is the optional positive immediate byte offset, a multiple of 8 in the range 0 
to 32760, defaulting to 0 and encoded in the "imm12" field as <pimm>/8. 


For the 128-bit variant: is the optional positive immediate byte offset, a multiple of 16 in the range 
0 to 65520, defaulting to 0 and encoded in the "imm12" field as <pimm>/16. 
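
The <pimm> rules above amount to a scaled 12-bit unsigned field. As a minimal sketch, the following Python helper checks an offset against those rules and returns the imm12 value. The name encode_pimm and its error handling are assumptions of this sketch, not behavior defined by the architecture or by any particular assembler.

```python
def encode_pimm(pimm: int, size_bits: int) -> int:
    """Return the imm12 field for an unsigned-offset STR of the given register width."""
    scale = {8: 0, 16: 1, 32: 2, 64: 3, 128: 4}[size_bits]
    nbytes = 1 << scale
    if pimm < 0 or pimm % nbytes != 0:
        raise ValueError(f"offset must be a non-negative multiple of {nbytes}")
    imm12 = pimm >> scale              # <pimm>/1, /2, /4, /8 or /16
    if imm12 > 0xFFF:
        raise ValueError(f"offset exceeds maximum of {0xFFF << scale}")
    return imm12
```

The maximum offsets in the list (4095, 8190, 16380, 32760, 65520) are all 4095 multiplied by the register size in bytes, which is what the final range check expresses.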


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

AccType acctype = AccType_VEC; 

MemOp memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = 8 << scale; 


Operation for all encodings

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, acctype] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, acctype];
        V[t] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;

C7.2.286 STR (register, SIMD&FP)


Store SIMD&FP register (register offset). This instruction stores a single SIMD&FP register to memory. The 
address that is used for the store is calculated from a base register value and an offset register value. The offset can 
be optionally shifted and extended. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30|29 28 27|26|25 24|23 22|21|20    16|15  13|12|11 10|9     5|4     0|
|size | 1  1  1| 1| 0  0| x  0| 1|   Rm   |option| S| 1  0|  Rn   |  Rt   |
                         opc


8-bit variant
Applies when size == 00 && opc == 00 && option != 011.

STR <Bt>, [<Xn|SP>, (<Wm>|<Xm>), <extend> {<amount>}]

8-bit variant
Applies when size == 00 && opc == 00 && option == 011.

STR <Bt>, [<Xn|SP>, <Xm>{, LSL <amount>}]

16-bit variant
Applies when size == 01 && opc == 00.

STR <Ht>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

32-bit variant
Applies when size == 10 && opc == 00.

STR <St>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

64-bit variant
Applies when size == 11 && opc == 00.

STR <Dt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]

128-bit variant
Applies when size == 00 && opc == 10.

STR <Qt>, [<Xn|SP>, (<Wm>|<Xm>){, <extend> {<amount>}}]


Decode for all variants of this encoding 


boolean wback = FALSE;
boolean postindex = FALSE;
integer scale = UInt(opc<1>:size);
if scale > 4 then UnallocatedEncoding();
if option<1> == '0' then UnallocatedEncoding(); // sub-word index
ExtendType extend_type = DecodeRegExtend(option);
integer shift = if S == '1' then scale else 0; 
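
As a minimal sketch of how the option and S fields turn the index register into a byte offset, the following Python function models the four permitted extend types. The name register_offset, the integer representation of fields, and the option-to-extend mapping (010 UXTW, 011 LSL, 110 SXTW, 111 SXTX, as listed for <extend> in this description) are the basis of this sketch. It is not the ExtendReg() pseudocode itself.

```python
MASK64 = (1 << 64) - 1

def register_offset(rm_value: int, option: int, s_bit: int, scale: int) -> int:
    """Return the 64-bit offset for the register-offset form.

    rm_value is the 64-bit content of the index register, option the 3-bit
    field, s_bit the S field, and scale the log2 of the access size in bytes.
    """
    if option & 0b010 == 0:
        raise ValueError("unallocated encoding: sub-word index")
    shift = scale if s_bit == 1 else 0
    if option & 0b001:                 # LSL (UXTX) or SXTX: use all 64 bits
        value = rm_value & MASK64
    else:                              # UXTW or SXTW: use the low 32 bits
        value = rm_value & 0xFFFFFFFF
        if option & 0b100 and value & 0x80000000:
            value -= 1 << 32           # sign-extend from bit 31
    return (value << shift) & MASK64
```

For instance, with Wm holding 0xFFFFFFFF, SXTW without a shift yields an offset of -1 (0xFFFFFFFFFFFFFFFF as a 64-bit value), while UXTW with S = 1 on a 32-bit access yields 0x3FFFFFFFC.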


Assembler symbols

<Bt> Is the 8-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Ht> Is the 16-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt> Is the 128-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<St> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Wm> When option<0> is set to 0, is the 32-bit name of the general-purpose index register, encoded in the
"Rm" field.
<Xm> When option<0> is set to 1, is the 64-bit name of the general-purpose index register, encoded in the
"Rm" field.
<extend> For the 8-bit variant: is the index extend specifier, encoded in the "option" field. It can have the
following values:
UXTW when option = 010
SXTW when option = 110
SXTX when option = 111
For the 128-bit, 16-bit, 32-bit and 64-bit variant: is the index extend/shift specifier, defaulting to
LSL, and which must be omitted for the LSL option when <amount> is omitted, encoded in the
"option" field. It can have the following values:
UXTW when option = 010
LSL when option = 011
SXTW when option = 110
SXTX when option = 111
<amount> For the 8-bit variant: is the index shift amount, it must be #0, encoded in "S" as 0 if omitted, or as 1
if present.
For the 16-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#1 when S = 1
For the 32-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#2 when S = 1
For the 64-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where it
is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#3 when S = 1
For the 128-bit variant: is the index shift amount, optional only when <extend> is not LSL. Where
it is permitted to be optional, it defaults to #0. It is encoded in the "S" field. It can have the following
values:
#0 when S = 0
#4 when S = 1

Shared decode for all encodings


integer n = UInt(Rn); 

integer t = UInt(Rt); 

integer m = UInt(Rm); 

AccType acctype = AccType_VEC; 

MemOp memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = 8 << scale; 


Operation

bits(64) offset = ExtendReg(m, extend_type, shift);
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, acctype] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, acctype];
        V[t] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;

C7.2.287 STUR (SIMD&FP)


Store SIMD&FP register (unscaled offset). This instruction stores a single SIMD&FP register to memory. The 
address that is used for the store is calculated from a base register value and an optional immediate offset. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30|29 28 27|26|25 24|23 22|21|20     12|11 10|9     5|4     0|
|size | 1  1  1| 1| 0  0| x  0| 0|  imm9   | 0  0|  Rn   |  Rt   |
                         opc


8-bit variant
Applies when size == 00 && opc == 00.
STUR <Bt>, [<Xn|SP>{, #<simm>}]
16-bit variant
Applies when size == 01 && opc == 00.
STUR <Ht>, [<Xn|SP>{, #<simm>}]
32-bit variant
Applies when size == 10 && opc == 00.
STUR <St>, [<Xn|SP>{, #<simm>}]
64-bit variant
Applies when size == 11 && opc == 00.
STUR <Dt>, [<Xn|SP>{, #<simm>}]
128-bit variant
Applies when size == 00 && opc == 10.
STUR <Qt>, [<Xn|SP>{, #<simm>}]

Decode for all variants of this encoding

boolean wback = FALSE;
boolean postindex = FALSE; 


integer scale = UInt(opc<1>:size); 
if scale > 4 then UnallocatedEncoding(); 
bits(64) offset = SignExtend(imm9, 64); 


Assembler symbols

<Bt> Is the 8-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Dt> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Ht> Is the 16-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Qt> Is the 128-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<St> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Rt" field.
<Xn|SP> Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<simm> Is the optional signed immediate byte offset, in the range -256 to 255, defaulting to 0 and encoded
in the "imm9" field.


Shared decode for all encodings 


integer n = UInt(Rn); 

integer t = UInt(Rt); 

AccType acctype = AccType_VEC; 

MemOp memop = if opc<0> == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = 8 << scale; 


Operation
CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

if !postindex then
    address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, acctype] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, acctype];
        V[t] = data;

if wback then
    if postindex then
        address = address + offset;
    if n == 31 then
        SP[] = address;
    else
        X[n] = address;

C7.2.288 SUB (vector)


Subtract (vector). This instruction subtracts each vector element in the second source SIMD&FP register from the
corresponding vector element in the first source SIMD&FP register, places the result into a vector, and writes the
vector to the destination SIMD&FP register.

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


Scalar 


|31 30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0  1| 1| 1  1  1  1  0|size | 1|   Rm   | 1  0  0  0  0| 1|  Rn   |  Rd   |
        U


Scalar variant 


SUB <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size != '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 

integer elements = 1; 

boolean sub_op = (U == '1'); 


Vector 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0|size | 1|   Rm   | 1  0  0  0  0| 1|  Rn   |  Rd   |
        U


Vector variant 


SUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean sub_op = (U == '1'); 


Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
D when size = 11
The following encodings are reserved:
• size = 0x.
• size = 10.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;


V[d] = result; 
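
The element loop above can be illustrated with a minimal Python sketch that treats each vector as an integer and subtracts lane by lane. The function name vector_sub and the integer representation of vectors (element 0 in the least significant bits) are assumptions of this sketch.

```python
def vector_sub(op1: int, op2: int, esize: int, datasize: int) -> int:
    """Lane-wise subtract of two datasize-bit vectors held as Python ints."""
    lane_mask = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        a = (op1 >> (e * esize)) & lane_mask
        b = (op2 >> (e * esize)) & lane_mask
        # Each lane wraps modulo 2**esize; no borrow crosses into the next lane.
        result |= ((a - b) & lane_mask) << (e * esize)
    return result
```

The point of the sketch is lane isolation: with 16-bit lanes, subtracting 1 from a lane holding 0 gives 0xFFFF in that lane and leaves the neighbouring lane untouched, unlike a plain 64-bit integer subtraction.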
C7.2.289 SUBHN, SUBHN2


Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP 
register from the corresponding vector element in the first source SIMD&FP register, places the most significant 
half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. 
All the values in this instruction are signed integer values. 


The results are truncated. For rounded results, see RSUBHN, RSUBHN2. 


The SUBHN instruction writes the vector to the lower half of the destination register and clears the upper half, while
the SUBHN2 instruction writes the vector to the upper half of the destination register without affecting the other bits
of the register.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14|13|12|11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0|size | 1|   Rm   | 0  1| 1| 0| 0  0|  Rn   |  Rd   |
        U                                        o1


Three registers, not all the same type variant 


SUBHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 


integer datasize = 64; 
integer part = UInt(Q); 


integer elements = datasize DIV esize; 


boolean sub_op = (o1 == '1');
boolean round = (U == '1'); 


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;


Vpart[d, part] = result; 
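
As a minimal sketch of the narrowing step, the following Python function subtracts wide source elements and keeps the most significant half of each difference, without rounding. The name subhn and the integer representation of vectors are assumptions of this sketch. It returns only the narrowed 64-bit vector and does not model the placement into the lower or upper half of the destination register.

```python
def subhn(op1: int, op2: int, esize: int, datasize: int = 64) -> int:
    """Return the datasize-bit narrowed result of a subtract-high-narrow.

    op1 and op2 are 2*datasize-bit source vectors; esize is the width of each
    destination element, so source elements are 2*esize bits wide.
    """
    wide = 2 * esize
    wide_mask = (1 << wide) - 1
    result = 0
    for e in range(datasize // esize):
        a = (op1 >> (e * wide)) & wide_mask
        b = (op2 >> (e * wide)) & wide_mask
        diff = (a - b) & wide_mask
        high = diff >> esize           # bits <2*esize-1:esize>, truncated (no rounding)
        result |= high << (e * esize)
    return result
```

With 16-bit source elements, 0x01FF minus 0 narrows to 0x01 here. A rounding variant would add 0x80 first and produce 0x02, which is the difference the text draws between this instruction and RSUBHN.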





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved.


C7.2.290 SUQADD 


Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector 
elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the 
destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0  1| 0| 1  1  1  1  0|size | 1  0  0  0  0| 0  0  0  1  1| 1  0|  Rn   |  Rd   |
        U


Scalar variant 


SUQADD <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


integer esize = 8 << UInt(size); 
integer datasize = esize; 


integer elements = 1; 

boolean unsigned = (U == '1'); 
Vector 

|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  0|size | 1  0  0  0  0| 0  0  0  1  1| 1  0|  Rn   |  Rd   |
        U


Vector variant 


SUQADD <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 


Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
D when size = 11
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result; 
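
The mixed signedness in the loop above (unsigned source elements added to signed destination elements, with signed saturation) can be illustrated with a minimal Python sketch. The function name suqadd, the integer representation of vectors, and the returned qc flag standing in for FPSR.QC are assumptions of this sketch.

```python
def suqadd(dest: int, src: int, esize: int, datasize: int):
    """Return (result_vector, qc) for a signed-saturating accumulate of unsigned values.

    dest lanes are read as signed, src lanes as unsigned; qc is True if any
    lane saturated (the pseudocode would then set FPSR.QC).
    """
    lane_mask = (1 << esize) - 1
    smax, smin = (1 << (esize - 1)) - 1, -(1 << (esize - 1))
    result, qc = 0, False
    for e in range(datasize // esize):
        u = (src >> (e * esize)) & lane_mask            # unsigned addend
        s = (dest >> (e * esize)) & lane_mask
        if s > smax:
            s -= 1 << esize                             # reinterpret as signed
        total = s + u
        clamped = max(smin, min(smax, total))           # signed saturation
        qc = qc or clamped != total
        result |= (clamped & lane_mask) << (e * esize)
    return result, qc
```

With 8-bit lanes, a destination lane of 0x7F plus a source lane of 1 saturates at 0x7F and reports saturation, whereas a destination lane of 0x80 (-128) plus a source lane of 0xFF (255) gives 127 without saturating.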
C7.2.291 SXTL, SXTL2


Signed extend Long. This instruction duplicates each vector element in the lower or upper half of the source 
SIMD&FP register into a vector, and writes the vector to the destination SIMD&FP register. The destination vector 
elements are twice as long as the source vector elements. All the values in this instruction are signed integer values. 


The SXTL instruction extracts the source vector from the lower half of the source register, while the SXTL2 instruction 
extracts the source vector from the upper half of the source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the SSHLL, SSHLL2 instruction. This means that:
• The encodings in this description are named to match the encodings of SSHLL, SSHLL2.
• The description of SSHLL, SSHLL2 gives the operational pseudocode for this instruction.


|31|30|29|28 27 26 25 24 23|22    19|18  16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 0| 0  1  1  1  1  0|!= 0000 | 0 0 0| 1  0  1  0  0| 1|  Rn   |  Rd   |
        U                     immh    immb


Vector variant 

SXTL{2} <Vd>.<Ta>, <Vn>.<Tb> 

is equivalent to 

SSHLL{2} <Vd>.<Ta>, <Vn>.<Tb>, #0 


and is the preferred disassembly when BitCount(immh) == 1.


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The encoding immh = 1xxx is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The encoding immh = 1xxx, Q = x is reserved.


Operation 
The description of SSHLL, SSHLL2 gives the operational pseudocode for this instruction. 
C7.2.292 TBL
Table vector Lookup. This instruction reads each value from the vector elements in the index source SIMD&FP 
register, uses each result as an index to perform a lookup in a table of bytes that is described by one to four source 
table SIMD&FP registers, places the lookup result in a vector, and writes the vector to the destination SIMD&FP 
register. If an index is out of range for the table, the result for that lookup is 0. If more than one source register is 
used to describe the table, the first source register describes the lowest bytes of the table. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29 28 27 26 25 24|23 22|21|20    16|15|14 13|12|11 10|9     5|4     0|
| 0| Q| 0  0  1  1  1  0| 0  0| 0|   Rm   | 0| len | 0| 0  0|  Rn   |  Rd   |
                                                    op
Two register table variant 
Applies when len == 01. 
TBL <Vd>.<Ta>, { <Vn>.16B, <Vn+1>.16B }, <Vm>.<Ta> 
Three register table variant 
Applies when len == 10. 
TBL <Vd>.<Ta>, { <Vn>.16B, <Vn+1>.16B, <Vn+2>.16B }, <Vm>.<Ta> 
Four register table variant 
Applies when len == 11. 
TBL <Vd>.<Ta>, { <Vn>.16B, <Vn+1>.16B, <Vn+2>.16B, <Vn+3>.16B }, <Vm>.<Ta> 
Single register table variant 
Applies when len == 00. 
TBL <Vd>.<Ta>, { <Vn>.16B }, <Vm>.<Ta> 
Decode for all variants of this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV 8; 
integer regs = UInt(len) + 1; 
boolean is_tbl = (op == '@'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "Q" field. It can have the following values: 
8B when Q = 0 
16B when Q = 1 
<Vn> For the four register table, three register table and two register table variant: is the name of the first 
SIMD&FP table register, encoded in the "Rn" field. 
For the single register table variant: is the name of the SIMD&FP table register, encoded in the "Rn"
field.
<Vn+1> Is the name of the second SIMD&FP table register, encoded as "Rn" plus 1 modulo 32. 
<Vn+2> Is the name of the third SIMD&FP table register, encoded as "Rn" plus 2 modulo 32. 
<Vn+3> Is the name of the fourth SIMD&FP table register, encoded as "Rn" plus 3 modulo 32. 
<Vm> Is the name of the SIMD&FP index register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;
integer i;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;
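
The TBL lookup can be modelled in a few lines of Python. This is an illustrative sketch only and is not part of the architecture definition: the function name tbl, the representation of registers as lists of byte values, and the sample register contents are all assumptions.

```python
def tbl(table_regs, indices):
    """Model of TBL. table_regs: 1 to 4 lists of 16 byte values; indices: 8 or 16 byte values."""
    # Concatenate the table registers; the first register holds the lowest bytes.
    table = [byte for reg in table_regs for byte in reg]
    # An index outside the table yields 0 for that result byte.
    return [table[i] if i < len(table) else 0 for i in indices]


v0 = list(range(100, 116))   # hypothetical contents of the first table register
v1 = list(range(200, 216))   # hypothetical contents of the second table register
print(tbl([v0], [0, 15, 16, 255, 3, 3, 1, 2]))        # single register table, 8B
print(tbl([v0, v1], [16, 31, 32, 0, 0, 0, 0, 0]))     # two register table, 8B
```

With a single table register, indices 16 and 255 are out of range and give 0. With two table registers, index 16 selects the first byte of the second register.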
C7.2.293 TBX


Table vector lookup extension. This instruction reads each value from the vector elements in the index source 
SIMD&FP register, uses each result as an index to perform a lookup in a table of bytes that is described by one to 
four source table SIMD&FP registers, places the lookup result in a vector, and writes the vector to the destination 
SIMD&FP register. If an index is out of range for the table, the existing value in the vector element of the destination 
register is left unchanged. If more than one source register is used to describe the table, the first source register 
describes the lowest bytes of the table. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 | 14 13 | 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 0  0  1  1  1  0  | 0  0  | 0  |    Rm    | 0  |  len  | 1  | 0  0  |   Rn   |   Rd   |
                                                                     op


Two register table variant 

Applies when len == 01. 

TBX <Vd>.<Ta>, { <Vn>.16B, <Vn+1>.16B }, <Vm>.<Ta> 

Three register table variant 

Applies when len == 10. 

TBX <Vd>.<Ta>, { <Vn>.16B, <Vn+1>.16B, <Vn+2>.16B }, <Vm>.<Ta> 
Four register table variant 

Applies when len == 11. 

TBX <Vd>.<Ta>, { <Vn>.16B, <Vn+1>.16B, <Vn+2>.16B, <Vn+3>.16B }, <Vm>.<Ta> 
Single register table variant 

Applies when len == 00. 


TBX <Vd>.<Ta>, { <Vn>.16B }, <Vm>.<Ta> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV 8; 

integer regs = UInt(len) + 1; 

boolean is_tbl = (op == '0');


Assembler symbols 





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "Q" field. It can have the following values: 
8B when Q = 0 
16B when Q = 1 
<Vn> For the four register table, three register table and two register table variant: is the name of the first
SIMD&FP table register, encoded in the "Rn" field. 
For the single register table variant: is the name of the SIMD&FP table register, encoded in the "Rn" 
field. 

<Vn+1> Is the name of the second SIMD&FP table register, encoded as "Rn" plus 1 modulo 32. 

<Vn+2> Is the name of the third SIMD&FP table register, encoded as "Rn" plus 2 modulo 32. 

<Vn+3> Is the name of the fourth SIMD&FP table register, encoded as "Rn" plus 3 modulo 32. 

<Vm> Is the name of the SIMD&FP index register, encoded in the "Rm" field.

Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;
integer i;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;
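
The difference from TBL is only in how out-of-range indices are handled. The following Python sketch is illustrative and not part of the architecture definition: the function name tbx and the list-of-bytes register representation are assumptions.

```python
def tbx(dest, table_regs, indices):
    """Model of TBX. dest and indices have 8 or 16 byte values; table_regs has 1 to 4 lists of 16."""
    table = [byte for reg in table_regs for byte in reg]
    # An out-of-range index leaves the existing destination byte unchanged.
    return [table[i] if i < len(table) else old for old, i in zip(dest, indices)]


table = list(range(16))   # hypothetical table register contents: byte k holds k
dest = [9] * 8            # hypothetical existing destination contents
print(tbx(dest, [table], [1, 2, 16, 200, 0, 15, 31, 3]))
```

The bytes whose indices are 16, 200, and 31 keep the previous destination value 9.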
C7.2.294 TRN1


Transpose vectors (primary). This instruction reads corresponding even-numbered vector elements from the two 
source SIMD&FP registers, starting at zero, places each result into consecutive elements of a vector, and writes the 
vector to the destination SIMD&FP register. Vector elements from the first source register are placed into 
even-numbered elements of the destination vector, starting at zero, while vector elements from the second source 
register are placed into odd-numbered elements of the destination vector. 


Note 


By using this instruction with TRN2, a 2 x 2 matrix can be transposed. 








The following figure shows an example of the operation of TRN1 and TRN2 halfword operations where Q = 0. 


[Figure: TRN1.16 and TRN2.16, showing elements 3 to 0 of Vn and Vm and their placement in Vd]





Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20|   16|15 14 13 12|11 10 9 |   5 4|   0 |

Advanced SIMD variant


TRN1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

integer part = UInt(op); 

integer pairs = elements DIV 2; 


Assembler symbols 





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer p;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;
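
The element movement for TRN1 (part = 0) can be sketched in Python as follows. This is an illustration under stated assumptions, not the architectural definition: the function name trn1 and the list representation, with element 0 first, are hypothetical.

```python
def trn1(vn, vm):
    """Model of TRN1 on two equal-length element lists (element 0 first)."""
    result = [0] * len(vn)
    for p in range(len(vn) // 2):
        result[2 * p] = vn[2 * p]        # even element of the first source
        result[2 * p + 1] = vm[2 * p]    # even element of the second source
    return result


vn = [10, 11, 12, 13]   # hypothetical 4H contents
vm = [20, 21, 22, 23]
print(trn1(vn, vm))
```

Only the even-numbered source elements (10, 12 and 20, 22) appear in the result.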
C7.2.295 TRN2 


Transpose vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two 
source SIMD&FP registers, places each result into consecutive elements of a vector, and writes the vector to the 
destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered 
elements of the destination vector, starting at zero, while vector elements from the second source register are placed 
into odd-numbered elements of the destination vector. 


Note 


By using this instruction with TRN1, a 2 x 2 matrix can be transposed. 








The following figure shows an example of the operation of TRN1 and TRN2 halfword operations where Q = 0. 


[Figure: TRN1.16 and TRN2.16, showing elements 3 to 0 of Vn and Vm and their placement in Vd]





Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20|   16|15 14 13 12|11 10 9 |   5 4|   0 |

Advanced SIMD variant


TRN2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

integer part = UInt(op); 

integer pairs = elements DIV 2; 


Assembler symbols 





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.


<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer p;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;
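
The note about transposing a 2 x 2 matrix can be checked with a small Python model. This sketch is illustrative only: the function name trn, its part parameter (0 for TRN1, 1 for TRN2, mirroring the pseudocode variable), and the list representation are assumptions.

```python
def trn(vn, vm, part):
    """Model of TRN1 (part = 0) and TRN2 (part = 1) on element lists, element 0 first."""
    result = []
    for p in range(len(vn) // 2):
        result += [vn[2 * p + part], vm[2 * p + part]]
    return result


row0 = [1, 2]   # hypothetical matrix [[1, 2], [3, 4]] held one row per register
row1 = [3, 4]
col0 = trn(row0, row1, 0)   # TRN1 gathers the first column
col1 = trn(row0, row1, 1)   # TRN2 gathers the second column
print(col0, col1)
```

The two results are the rows of the transposed matrix.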
C7.2.296 UABA 
Unsigned Absolute difference and Accumulate. This instruction subtracts the elements of the vector of the second 
source SIMD&FP register from the corresponding elements of the first source SIMD&FP register, and accumulates 
the absolute values of the results into the elements of the vector of the destination SIMD&FP register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 14 13 12 | 11 | 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  1  1  1  | 1  | 1  |   Rn   |   Rd   |
            U                                                           ac
Three registers of the same type variant 
UABA <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean accumulate = (ac == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;
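
Because the accumulator has the same width as the source elements, the accumulation wraps modulo 2^esize. The following Python sketch illustrates this; it is not the architectural definition, and the function name uaba and the list representation of vectors are assumptions.

```python
def uaba(vd, vn, vm, esize=8):
    """Model of UABA on lists of unsigned element values of width esize bits."""
    mask = (1 << esize) - 1
    # Accumulate |n - m| into each destination element, keeping only esize bits.
    return [(d + abs(n - m)) & mask for d, n, m in zip(vd, vn, vm)]


print(uaba([250, 1], [3, 200], [10, 100]))
```

In the first element, 250 + |3 - 10| = 257 does not fit in 8 bits, so the stored value is 1.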
C7.2.297 UABAL, UABAL2


Unsigned Absolute difference and Accumulate Long. This instruction subtracts the vector elements in the lower or 
upper half of the second source SIMD&FP register from the corresponding vector elements of the first source 
SIMD&FP register, and accumulates the absolute values of the results into the vector elements of the destination 
SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values 
in this instruction are unsigned integer values. 


The UABAL instruction extracts each source vector from the lower half of each source register, while the UABAL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 14 | 13 | 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  1  | 0  | 1  | 0  0  |   Rn   |   Rd   |
            U                                                     op


Three registers, not all the same type variant 


UABAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean accumulate = (op == '0');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;
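
The selection of the lower or upper source half and the wider accumulator can be sketched in Python. This is an illustrative model, not the architectural definition: the function name uabal, its upper flag (standing in for the "2" suffix), and the list representation are assumptions.

```python
def uabal(vd, vn, vm, esize=8, upper=False):
    """Model of UABAL (upper=False) and UABAL2 (upper=True).

    vn and vm are full 128-bit registers as lists of esize-bit elements;
    vd is a list of half as many elements, each 2*esize bits wide.
    """
    start = len(vn) // 2 if upper else 0
    mask = (1 << (2 * esize)) - 1
    return [(d + abs(vn[start + e] - vm[start + e])) & mask
            for e, d in enumerate(vd)]


vd = [1000] * 8            # hypothetical 8H accumulator
vn = list(range(16))       # hypothetical 16B source: byte k holds k
vm = [255] * 16
print(uabal(vd, vn, vm))               # UABAL uses bytes 0 to 7
print(uabal(vd, vn, vm, upper=True))   # UABAL2 uses bytes 8 to 15
```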
C7.2.298 UABD
Unsigned Absolute Difference (vector). This instruction subtracts the elements of the vector of the second source 
SIMD&FP register from the corresponding elements of the first source SIMD&FP register, places the absolute 
values of the results into a vector, and writes the vector to the destination SIMD&FP register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 14 13 12 | 11 | 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  1  1  1  | 0  | 1  |   Rn   |   Rd   |
            U                                                           ac
Three registers of the same type variant 
UABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean accumulate = (ac == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;
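
The pseudocode interprets each element through Int(x, unsigned), so the same bit pattern can give a different result when the unsigned flag differs. The Python sketch below illustrates this. It is an assumption-laden model: the names to_int and abd are hypothetical, and elements are passed as raw bit patterns.

```python
def to_int(bits, esize, unsigned):
    """Interpret an esize-bit pattern as an unsigned or two's complement integer."""
    if unsigned or bits < (1 << (esize - 1)):
        return bits
    return bits - (1 << esize)


def abd(vn, vm, esize=8, unsigned=True):
    """Element-wise absolute difference, truncated to esize bits."""
    mask = (1 << esize) - 1
    return [abs(to_int(n, esize, unsigned) - to_int(m, esize, unsigned)) & mask
            for n, m in zip(vn, vm)]


print(abd([0xFF, 0x10], [0x01, 0x30]))                   # unsigned, as for UABD
print(abd([0xFF, 0x10], [0x01, 0x30], unsigned=False))   # signed interpretation
```

With unsigned interpretation 0xFF is 255, so the first difference is 254. Read as a signed value it is -1, and the difference from 1 is 2.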
C7.2.299 UABDL, UABDL2 


Unsigned Absolute Difference Long. This instruction subtracts the vector elements in the lower or upper half of the 
second source SIMD&FP register from the corresponding vector elements of the first source SIMD&FP register, 
places the absolute value of the result into a vector, and writes the vector to the destination SIMD&FP register. The 
destination vector elements are twice as long as the source vector elements. All the values in this instruction are 
unsigned integer values. 


The UABDL instruction extracts each source vector from the lower half of each source register, while the UABDL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 14 | 13 | 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  1  | 1  | 1  | 0  0  |   Rn   |   Rd   |
            U                                                     op


Three registers, not all the same type variant 


UABDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean accumulate = (op == '0');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;
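
The Vpart and Elem accesses used by the pseudocode can be modelled with integer shifts and masks. The following Python sketch is illustrative only: registers are assumed to be Python integers holding 128 bits, and the names elem and uabdl are hypothetical.

```python
def elem(value, e, esize):
    """Return element e, of width esize bits, from an integer-encoded vector."""
    return (value >> (e * esize)) & ((1 << esize) - 1)


def uabdl(vn, vm, esize=8, part=0):
    """Model of UABDL (part = 0) and UABDL2 (part = 1) on 128-bit integers."""
    datasize = 64
    half_mask = (1 << datasize) - 1
    operand1 = (vn >> (part * datasize)) & half_mask   # models Vpart[n, part]
    operand2 = (vm >> (part * datasize)) & half_mask   # models Vpart[m, part]
    result = 0
    for e in range(datasize // esize):
        absdiff = abs(elem(operand1, e, esize) - elem(operand2, e, esize))
        result |= absdiff << (e * 2 * esize)           # destination elements are twice as wide
    return result


vn = int.from_bytes(bytes(range(16)), "little")    # hypothetical: byte k holds k
vm = int.from_bytes(bytes([16] * 16), "little")
low = [elem(uabdl(vn, vm, part=0), e, 16) for e in range(8)]
high = [elem(uabdl(vn, vm, part=1), e, 16) for e in range(8)]
print(low, high)
```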
C7.2.300 UADALP 


Unsigned Add and Accumulate Long Pairwise. This instruction adds pairs of adjacent unsigned integer values from 
the vector in the source SIMD&FP register and accumulates the results with the vector elements of the destination 
SIMD&FP register. The destination vector elements are twice as long as the source vector elements. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 | 14 | 13 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  0  0  0  0  | 0  0  | 1  | 1  0  | 1  0  |   Rn   |   Rd   |
            U                                                      op


Vector variant 


UADALP <Vd>.<Ta>, <Vn>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV (2 * esize);


boolean acc = (op == '1'); 
boolean unsigned = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 00,Q = 0 
8H when size = 00,Q=1 
2S when size = 01,Q = 0 
4S when size = 01,Q=1 
1D when size = 10,Q = 0 
2D when size = 10,Q=1 


The encoding size = 11,Q = x is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 


The encoding size = 11,Q = x is reserved. 
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;
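
A short Python sketch of the pairwise accumulate follows. It is illustrative and not the architectural definition: the function name uadalp and the list representation are assumptions.

```python
def uadalp(vd, vn, esize=8):
    """Model of UADALP. vn holds esize-bit elements; vd holds half as many 2*esize-bit elements."""
    mask = (1 << (2 * esize)) - 1
    # Each destination element accumulates the sum of one adjacent source pair.
    return [(d + vn[2 * e] + vn[2 * e + 1]) & mask for e, d in enumerate(vd)]


print(uadalp([100, 65535], [1, 2, 255, 255]))
```

The pair sum itself always fits in 2*esize bits, but the accumulation can wrap: 65535 + 510 is stored as 509 in a 16-bit element.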
C7.2.301 UADDL, UADDL2 


Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source 
SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result 
into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice 
as long as the source vector elements. All the values in this instruction are unsigned integer values. 


The UADDL instruction extracts each source vector from the lower half of each source register, while the UADDL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 14 | 13 | 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  0  | 0  | 0  | 0  0  |   Rn   |   Rd   |
            U                                                     o1


Three registers, not all the same type variant 


UADDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;
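
Because the destination elements are twice as wide as the sources, the sum of two unsigned source elements cannot overflow. The Python sketch below illustrates this. It is not the architectural definition: the function name uaddl, the upper flag (standing in for the "2" suffix), and the list representation are assumptions.

```python
def uaddl(vn, vm, upper=False):
    """Model of UADDL (upper=False) and UADDL2 (upper=True) on full-register element lists."""
    half = len(vn) // 2
    start = half if upper else 0
    # Python integers do not wrap, which matches the widened destination elements.
    return [vn[start + e] + vm[start + e] for e in range(half)]


vn = [255] * 8 + [1] * 8   # hypothetical 16B source registers
vm = [255] * 8 + [2] * 8
print(uaddl(vn, vm))               # lower halves
print(uaddl(vn, vm, upper=True))   # upper halves
```

The largest possible byte sum, 255 + 255 = 510, fits in a 16-bit destination element.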
C7.2.302 UADDLP


Unsigned Add Long Pairwise. This instruction adds pairs of adjacent unsigned integer values from the vector in the 
source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. 
The destination vector elements are twice as long as the source vector elements. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 | 14 | 13 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  0  0  0  0  | 0  0  | 0  | 1  0  | 1  0  |   Rn   |   Rd   |
            U                                                      op


Vector variant 


UADDLP <Vd>.<Ta>, <Vn>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV (2 * esize);


boolean acc = (op == '1'); 
boolean unsigned = (U == '1'); 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
4H when size = 00,Q = 0 
8H when size = 00,Q=1 
2S when size = 01,Q = 0 
4S when size = 01,Q=1 
1D when size = 10,Q = 0 
2D when size = 10,Q=1 


The encoding size = 11,Q = x is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 


The encoding size = 11,Q = x is reserved. 
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;
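
Since each application halves the element count and doubles the element width, repeated pairwise long additions reduce a vector to a single sum without overflow. The following Python sketch shows this usage pattern. It is an illustration only: the function name uaddlp, the list representation, and the sample data are assumptions.

```python
def uaddlp(vn):
    """Model of UADDLP: add adjacent pairs, producing half as many (wider) elements."""
    return [vn[2 * e] + vn[2 * e + 1] for e in range(len(vn) // 2)]


data = [200] * 8           # hypothetical 8B source
halfwords = uaddlp(data)   # 8B -> 4H
words = uaddlp(halfwords)  # 4H -> 2S
total = uaddlp(words)      # 2S -> 1D
print(halfwords, words, total)
```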
C7.2.303 UADDLV


Unsigned sum Long across Vector. This instruction adds every vector element in the source SIMD&FP register
together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as
the source vector elements. All the values in this instruction are unsigned integer values.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 |   5 4|   0 |

Advanced SIMD variant


UADDLV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '100' then ReservedValue();


if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 


integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 


Assembler symbols 


<V> Is the destination width specifier, encoded in the "size" field. It can have the following values:
H when size = 00
S when size = 01
D when size = 10

The encoding size = 11 is reserved.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
4S when size = 10, Q = 1

The following encodings are reserved:
• size = 10, Q = 0.
• size = 11, Q = x.
Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;
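
The across-vector sum can be sketched in Python as follows. This is an illustrative model, not the architectural definition: the function name uaddlv and the list representation are assumptions.

```python
def uaddlv(vn, esize=8):
    """Model of UADDLV: sum all esize-bit elements into a scalar of 2*esize bits."""
    total = sum(vn)
    return total & ((1 << (2 * esize)) - 1)


print(uaddlv([255] * 16))              # 16B source, H destination
print(uaddlv([65535] * 8, esize=16))   # 8H source, S destination
```

For the permitted arrangements the truncation never discards bits: for example, sixteen bytes of 255 sum to 4080, which fits in 16 bits.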
C7.2.304 UADDW, UADDW2


Unsigned Add Wide. This instruction adds the vector elements of the first source SIMD&FP register to the 
corresponding vector elements in the lower or upper half of the second source SIMD&FP register, places the result 
in a vector, and writes the vector to the SIMD&FP destination register. The vector elements of the destination 
register and the first source register are twice as long as the vector elements of the second source register. All the 
values in this instruction are unsigned integer values. 


The UADDW instruction extracts vector elements from the lower half of the second source register, while the UADDW2 
instruction extracts vector elements from the upper half of the second source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20 .. 16 | 15 14 | 13 | 12 | 11 10 | 9 .. 5 | 4 .. 0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  0  | 0  | 1  | 0  0  |   Rn   |   Rd   |
            U                                                     o1


Three registers, not all the same type variant 


UADDW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.


Operation


CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;
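

A minimal Python sketch of the widening add described above, for illustration only. The function name `uaddw_model`, the list-of-lanes register model, and the `part` argument (0 for UADDW, 1 for UADDW2, mirroring `part = UInt(Q)`) are assumptions of this example, not architectural definitions.

```python
def uaddw_model(wide, narrow, esize, part=0):
    """Add each narrow lane to the corresponding wide lane.

    wide   -- lanes of the first source, each 2*esize bits wide
    narrow -- all lanes of the 128-bit second source, each esize bits wide
    part   -- 0 selects the lower 64 bits of `narrow`, 1 the upper 64 bits
    """
    elements = 64 // esize                      # datasize DIV esize, datasize = 64
    half = narrow[part * elements:(part + 1) * elements]  # Vpart[m, part]
    wide_mask = (1 << (2 * esize)) - 1
    narrow_mask = (1 << esize) - 1
    # The sum wraps to 2*esize bits, as in sum<2*esize-1:0>.
    return [((w & wide_mask) + (n & narrow_mask)) & wide_mask
            for w, n in zip(wide, half)]


print(uaddw_model([0xFFFF, 1, 2, 3, 4, 5, 6, 7], list(range(16)), 8, part=1))
```

The result wraps modulo 2^(2*esize) in this model; no saturation is applied.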
C7.2.305 UCVTF (vector, fixed-point)


Unsigned fixed-point Convert to Floating-point (vector). This instruction converts each element in a vector from
fixed-point to floating-point using the rounding mode that is specified by the FPCR, and writes the result to the
SIMD&FP destination register.


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see
Floating-point exception traps on page D1-1552.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped.


Scalar

|31 30|29|28 27 26 25 24 23|22    19|18  16|15 14 13 12 11|10|9     5|4     0|
| 0  1| 1| 1  1  1  1  1  0| != 0000| immb | 1  1  1  0  0| 1|   Rn  |   Rd  |
       U                     immh

Scalar variant 

UCVTF <V><d>, <V><n>, #<fbits> 

Decode for this encoding

integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '00xx' then ReservedValue();
integer esize = 32 << UInt(immh<3>);
integer datasize = esize;
integer elements = 1;

integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
FPRounding rounding = FPRoundingMode(FPCR);


Vector


|31|30|29|28 27 26 25 24 23|22    19|18  16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  1  0| != 0000| immb | 1  1  1  0  0| 1|   Rn  |   Rd  |
       U                     immh


Vector variant 


UCVTF <Vd>.<T>, <Vn>.<T>, #<fbits> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh == '00xx' then ReservedValue();
if immh<3>:Q == '10' then ReservedValue();
integer esize = 32 << UInt(immh<3>);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer fracbits = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
FPRounding rounding = FPRoundingMode(FPCR);


Assembler symbols


<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
S when immh = 01xx
D when immh = 1xxx
The encoding immh = 00xx is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
The following encodings are reserved:
• immh = 0001, Q = x.
• immh = 001x, Q = x.
• immh = 1xxx, Q = 0.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<fbits> For the scalar variant: is the number of fractional bits, in the range 1 to the operand width, encoded
in the "immh:immb" field. It can have the following values:
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
The encoding immh = 00xx is reserved.
For the vector variant: is the number of fractional bits, in the range 1 to the element width, encoded
in the "immh:immb" field. It can have the following values:
(64-UInt(immh:immb)) when immh = 01xx
(128-UInt(immh:immb)) when immh = 1xxx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
The following encodings are reserved:
• immh = 0001.
• immh = 001x.


Operation for all encodings


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;
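

The relationship between the "immh:immb" field and the number of fractional bits can be sketched as below. This is an illustrative Python model of the decode shown in this section only; the function name `decode_fbits` and the use of `ValueError` to stand in for the SEE and ReservedValue() cases are assumptions of the example.

```python
def decode_fbits(immh, immb):
    """Return (esize, fracbits) from the 4-bit immh and 3-bit immb fields."""
    if immh == 0b0000:
        # if immh == '0000' then SEE "Advanced SIMD modified immediate"
        raise ValueError("immh == 0000 selects a different encoding")
    if (immh >> 2) == 0b00:
        # if immh == '00xx' then ReservedValue()
        raise ValueError("reserved encoding")
    esize = 32 << (immh >> 3)                     # 32 << UInt(immh<3>)
    fracbits = esize * 2 - ((immh << 3) | immb)   # (esize * 2) - UInt(immh:immb)
    return esize, fracbits


print(decode_fbits(0b0100, 0b000))
```

For 32-bit elements this yields fractional-bit counts from 1 to 32, and for 64-bit elements from 1 to 64, matching the ranges given for <fbits>.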
C7.2.306 UCVTF (vector, integer)


Unsigned integer Convert to Floating-point (vector). This instruction converts each element in a vector from an 
unsigned integer value to a floating-point value using the rounding mode that is specified by the FPCR, and writes 
the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


Scalar

|31 30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0  1| 1| 1  1  1  1  0| 0|sz| 1  0  0  0  0| 1  1  1  0  1| 1  0|   Rn  |   Rd  |
       U


Scalar variant 


UCVTF <V><d>, <V><n> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 32 << UInt(sz);
integer datasize = esize;
integer elements = 1;

boolean unsigned = (U == '1');

Vector 
|31|30|29|28 27 26 25 24|23|22|21 20 19 18 17|16 15 14 13 12|11 10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| 0|sz| 1  0  0  0  0| 1  1  1  0  1| 1  0|   Rn  |   Rd  |
       U


Vector variant 


UCVTF <Vd>.<T>, <Vn>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

if sz:Q == '10' then ReservedValue();
integer esize = 32 << UInt(sz);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');


Assembler symbols


<V> Is a width specifier, encoded in the "sz" field. It can have the following values:
S when sz = 0
D when sz = 1
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0, Q = 0
4S when sz = 0, Q = 1
2D when sz = 1, Q = 1
The encoding sz = 1, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;
C7.2.307 UCVTF (scalar, fixed-point)


Unsigned fixed-point Convert to Floating-point (scalar). This instruction converts the unsigned value in the 32-bit 
or 64-bit general-purpose source register to a floating-point value using the rounding mode that is specified by the 
FPCR, and writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the Security state and 
Exception level in which the instruction is executed, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20 19|18 17 16|15    10|9     5|4     0|
|sf| 0| 0| 1  1  1  1  0| 0  x| 0| 0  0| 0  1  1|  scale |   Rn  |   Rd  |
                         type     rmode opcode


32-bit to single-precision variant 
Applies when sf == 0 && type == 00.


UCVTF <Sd>, <Wn>, #<fbits> 


32-bit to double-precision variant 
Applies when sf == 0 && type == 01.


UCVTF <Dd>, <Wn>, #<fbits> 


64-bit to single-precision variant 
Applies when sf == 1 && type == 00. 


UCVTF <Sd>, <Xn>, #<fbits> 


64-bit to double-precision variant 
Applies when sf == 1 && type == 01. 


UCVTF <Dd>, <Xn>, #<fbits> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00' fltsize = 32;
    when '01' fltsize = 64;
    when '1x' UnallocatedEncoding();

if sf == '0' && scale<5> == '0' then UnallocatedEncoding();
integer fracbits = 64 - UInt(scale); 


rounding = FPRoundingMode(FPCR) ; 
Assembler symbols


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field.
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field.
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field.
<fbits> For the 32-bit to double-precision and 32-bit to single-precision variant: is the number of bits after
the binary point in the fixed-point source, in the range 1 to 32, encoded as 64 minus "scale".
For the 64-bit to double-precision and 64-bit to single-precision variant: is the number of bits after
the binary point in the fixed-point source, in the range 1 to 64, encoded as 64 minus "scale".
Operation


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

intval = X[n];
fltval = FixedToFP(intval, fracbits, TRUE, FPCR, rounding);
V[d] = fltval;
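

The conversion performed by FixedToFP for an unsigned source can be illustrated with the Python sketch below. It is a simplified model under stated assumptions: it implements only round-to-nearest, ties-to-even (the instruction itself uses whichever rounding mode FPCR selects), it does not model floating-point exception flags, and the function name `ucvtf_fixed_model` is hypothetical.

```python
import math


def ucvtf_fixed_model(intval, intsize, fracbits, fltsize):
    """Convert an unsigned fixed-point value to the nearest float.

    The result is returned as a Python float whose value is exactly
    representable in the requested format (fltsize 32 or 64).
    """
    value = intval & ((1 << intsize) - 1)   # treat the source as unsigned
    precision = 24 if fltsize == 32 else 53  # significand bits incl. hidden bit
    shift = max(value.bit_length() - precision, 0)
    if shift:
        # Drop `shift` low bits, rounding to nearest with ties to even.
        half = 1 << (shift - 1)
        low = value & ((1 << shift) - 1)
        value >>= shift
        if low > half or (low == half and (value & 1)):
            value += 1
    # Scale by 2^shift for the dropped bits and 2^-fracbits for the binary point.
    return math.ldexp(value, shift - fracbits)


print(ucvtf_fixed_model(3, 32, 1, 64))
```

Rounding is done on integers before scaling so that a 64-bit source converted to single precision is rounded once, not twice.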
C7.2.308 UCVTF (scalar, integer)


Unsigned integer Convert to Floating-point (scalar). This instruction converts the unsigned integer value in the 
general-purpose source register to a floating-point value using the rounding mode that is specified by the FPCR, and 
writes the result to the SIMD&FP destination register. 


A floating-point exception can be generated by this instruction. Depending on the settings in FPCR, the exception 
results in either a flag being set in FPSR, or a synchronous exception being generated. For more information, see 
Floating-point exception traps on page D1-1552. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20 19|18 17 16|15 14 13 12 11 10|9     5|4     0|
|sf| 0| 0| 1  1  1  1  0| 0  x| 1| 0  0| 0  1  1| 0  0  0  0  0  0|   Rn  |   Rd  |
                         type     rmode opcode


32-bit to single-precision variant 
Applies when sf == 0 && type == 00.


UCVTF <Sd>, <Wn> 


32-bit to double-precision variant 
Applies when sf == 0 && type == 01.


UCVTF <Dd>, <Wn> 


64-bit to single-precision variant 
Applies when sf == 1 && type == 00. 


UCVTF <Sd>, <Xn> 


64-bit to double-precision variant 
Applies when sf == 1 && type == 01. 


UCVTF <Dd>, <Xn> 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer intsize = if sf == '1' then 64 else 32; 
integer fltsize; 
FPRounding rounding; 


case type of
    when '00'
        fltsize = 32;
    when '01'
        fltsize = 64;
    when '10'
        UnallocatedEncoding();
    when '11'
        UnallocatedEncoding();

rounding = FPRoundingMode(FPCR);





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C7-1405 
1ID092916 Non-Confidential 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Xn> Is the 64-bit name of the general-purpose source register, encoded in the "Rn" field. 
<Wn> Is the 32-bit name of the general-purpose source register, encoded in the "Rn" field. 
Operation


CheckFPAdvSIMDEnabled64();

bits(fltsize) fltval;
bits(intsize) intval;

intval = X[n];
fltval = FixedToFP(intval, 0, TRUE, FPCR, rounding);
V[d] = fltval;
C7.2.309 UHADD


Unsigned Halving Add. This instruction adds corresponding unsigned integer values from the two source 
SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the 
destination SIMD&FP register. 


The results are truncated. For rounded results, see URHADD. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 0  0  0  0  0| 1|   Rn  |   Rd  |
       U


Three registers of the same type variant 


UHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;
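

The halving add can be sketched in Python as follows, as an informal illustration of the pseudocode above. The name `uhadd_model` and the list-of-lanes register model are assumptions of this example.

```python
def uhadd_model(op1, op2, esize):
    """Per-lane unsigned (a + b) >> 1, truncated, without losing the carry."""
    mask = (1 << esize) - 1
    result = []
    for a, b in zip(op1, op2):
        total = (a & mask) + (b & mask)   # unbounded integer sum, esize+1 bits
        result.append((total >> 1) & mask)  # sum<esize:1>
    return result


print(uhadd_model([255, 1, 4], [255, 2, 4], 8))
```

Selecting bits esize:1 of the full-precision sum keeps the carry out of the top bit, so 255 + 255 halves to 255 and does not wrap.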
C7.2.310 UHSUB


Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register 
from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places 
each result into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12 11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 0  0  1  0  0| 1|   Rn  |   Rd  |
       U


Three registers of the same type variant 


UHSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;
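

The halving subtract can be modelled in the same way. The sketch below is illustrative only; `uhsub_model` is a hypothetical name. It relies on Python integers behaving as unbounded two's complement values under `>>` and `&`, which matches taking bits esize:1 of the unbounded difference.

```python
def uhsub_model(op1, op2, esize):
    """Per-lane unsigned (a - b), halved, keeping bits esize:1 of the difference."""
    mask = (1 << esize) - 1
    result = []
    for a, b in zip(op1, op2):
        diff = (a & mask) - (b & mask)     # may be negative
        # Arithmetic shift then mask extracts diff<esize:1> in two's complement.
        result.append((diff >> 1) & mask)
    return result


print(uhsub_model([1, 10, 0], [2, 4, 255], 8))
```

When the second operand is larger than the first, the lane holds the low esize bits of the halved negative difference, for example 1 - 2 gives 0xFF for 8-bit lanes in this model.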
C7.2.311 UMAX 


Unsigned Maximum (vector). This instruction compares corresponding elements in the vectors in the two source 
SIMD&FP registers, places the larger of each pair of unsigned integer values into a vector, and writes the vector to 
the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 0  1  1  0| 0| 1|   Rn  |   Rd  |
       U                                               o1


Three registers of the same type variant 


UMAX <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 
boolean minimum = (o1 == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;
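

As an informal illustration, the element-wise maximum or minimum can be modelled in Python as below. The name `umaxmin_model` and its `minimum` flag (standing in for the `minimum` variable derived from the o1 field in the decode) are assumptions of this sketch.

```python
def umaxmin_model(op1, op2, esize, minimum=False):
    """Per-lane unsigned maximum (or minimum when `minimum` is True)."""
    mask = (1 << esize) - 1
    pick = min if minimum else max
    # Masking models Int(Elem[...], unsigned): every lane is non-negative.
    return [pick(a & mask, b & mask) for a, b in zip(op1, op2)]


print(umaxmin_model([1, 200, 0x80], [2, 100, 0x7F], 8))
```

Because the lanes are compared as unsigned values, 0x80 is larger than 0x7F here, unlike in a signed comparison.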
C7.2.312 UMAXP


Unsigned Maximum Pairwise. This instruction creates a vector by concatenating the vector elements of the first 
source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of 
adjacent vector elements in the two source SIMD&FP registers, writes the largest of each pair of unsigned integer 
values into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 1  0  1  0| 0| 1|   Rn  |   Rd  |
       U                                               o1


Three registers of the same type variant 


UMAXP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');
boolean minimum = (o1 == '1');


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;
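

The pairwise behaviour is easier to see with concrete lanes. The Python sketch below is an illustration under assumptions: registers are modelled as lists with lane 0 first, and `umaxminp_model` is a hypothetical name. In `operand2:operand1` the first source occupies the low-order lanes, so its lanes come first in the list.

```python
def umaxminp_model(op1, op2, esize, minimum=False):
    """Pairwise unsigned max (or min) over adjacent lanes of op2:op1."""
    mask = (1 << esize) - 1
    # concat = operand2:operand1, so operand1 supplies the low lanes.
    concat = [v & mask for v in op1] + [v & mask for v in op2]
    pick = min if minimum else max
    return [pick(concat[2 * e], concat[2 * e + 1]) for e in range(len(op1))]


print(umaxminp_model([1, 5, 3, 2], [9, 8, 0, 7], 16))
```

The lower half of the result therefore comes from pairs within the first source, and the upper half from pairs within the second source.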
C7.2.313 UMAXV


Unsigned Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP 
register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this 
instruction are unsigned integer values. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16|15 14 13 12|11 10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1  1  0  0  0| 0| 1  0  1  0| 1  0|   Rn  |   Rd  |
       U                                      op


Advanced SIMD variant 


UMAXV <V><d>, <Vn>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '100' then ReservedValue();
if size == '11' then ReservedValue();

integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');
boolean min = (op == '1');


Assembler symbols


<V> Is the destination width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
The encoding size = 11 is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
4S when size = 10, Q = 1
The following encodings are reserved:
• size = 10, Q = 0.
• size = 11, Q = x.
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;
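

The across-vector reduction above amounts to a fold over the lanes. A minimal Python sketch, with the hypothetical name `umaxminv_model` and a list-of-lanes register model assumed for illustration:

```python
from functools import reduce


def umaxminv_model(lanes, esize, minimum=False):
    """Reduce all unsigned lanes to a single maximum (or minimum) value."""
    mask = (1 << esize) - 1
    pick = min if minimum else max
    # Start from lane 0 and fold in lanes 1..elements-1, as in the loop above.
    return reduce(pick, (v & mask for v in lanes))


print(umaxminv_model([3, 250, 17, 0], 8))
```

The result always fits in esize bits, so the final `maxmin<esize-1:0>` selection does not discard information in this model.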
C7.2.314 UMIN 


Unsigned Minimum (vector). This instruction compares corresponding vector elements in the two source 
SIMD&FP registers, places the smaller of each of the two unsigned integer values into a vector, and writes the 
vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 0  1  1  0| 1| 1|   Rn  |   Rd  |
       U                                               o1


Three registers of the same type variant 


UMIN <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 
boolean minimum = (o1 == '1');


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;
C7.2.315 UMINP


Unsigned Minimum Pairwise. This instruction creates a vector by concatenating the vector elements of the first 
source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of 
adjacent vector elements in the two source SIMD&FP registers, writes the smallest of each pair of unsigned integer 
values into a vector, and writes the vector to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11|10|9     5|4     0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 1  0  1  0| 1| 1|   Rn  |   Rd  |
       U                                               o1


Three registers of the same type variant 


UMINP <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');
boolean minimum = (o1 == '1');


Assembler symbols


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;
C7.2.316 UMINV


Unsigned Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP 
register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this 
instruction are unsigned integer values. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21 20 19 18 17|16|15 14 13 12|11 10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  0| size| 1  1  0  0  0| 1| 1  0  1  0| 1  0|   Rn   |   Rd   |
        U                                     op


Advanced SIMD variant 


UMINV <V><d>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if size:Q == '100' then ReservedValue();
if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');
boolean min = (op == '1');


Assembler symbols 


<V> Is the destination width specifier, encoded in the "size" field. It can have the following values:
    B when size = 00
    H when size = 01
    S when size = 10
The encoding size = 11 is reserved.
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    4S when size = 10, Q = 1
The following encodings are reserved:
• size = 10, Q = 0.
• size = 11, Q = x.
Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;
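
As a rough illustration of the across-vector reduction, the following Python sketch mirrors the loop above. The name uminv and the list-of-integers view of the source register are assumptions for the example. Masking each element to esize bits models the unsigned interpretation.

```python
def uminv(elements, esize):
    """Model of UMINV: smallest unsigned element of the source vector.

    elements is a list of raw element values; esize is the element width
    in bits. (Hypothetical helper, not an architectural interface.)
    """
    mask = (1 << esize) - 1
    # Start from element 0, then fold in elements 1..elements-1.
    maxmin = elements[0] & mask
    for element in elements[1:]:
        maxmin = min(maxmin, element & mask)
    return maxmin


smallest = uminv([200, 17, 255, 18], 8)
```

Because all values are treated as unsigned, a byte such as 0xFF compares as 255 and is never selected as the minimum over a smaller positive value.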
C7.2.317 UMLAL, UMLAL2 (by element)


Unsigned Multiply-Add Long (vector, by element). This instruction multiplies each vector element in the lower or 
upper half of the first source SIMD&FP register by the specified vector element of the second source SIMD&FP 
register and accumulates the results with the vector elements of the destination SIMD&FP register. The destination 
vector elements are twice as long as the elements that are multiplied. 


The UMLAL instruction extracts vector elements from the lower half of the first source register, while the UMLAL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 10 9 | 5 4| 0 |


Vector variant


UMLAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();

integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);

integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

boolean unsigned = (U == '1');
boolean sub_op = (o2 == '1');


Assembler symbols


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10
The following encodings are reserved:
• size = 00.





ARM DDI 0487A.k_iss10775 
1ID092916 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 




• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;

element2 = Int(Elem[operand2, index, esize], unsigned);
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] - product;
    else
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] + product;

V[d] = result;
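
A minimal Python sketch of the by-element multiply-accumulate follows, assuming registers are modelled as lists of unsigned element values. The function name umlal_by_element and its parameters are hypothetical. The accumulation wraps modulo 2^(2*esize), matching the bits(2*esize) arithmetic in the pseudocode.

```python
def umlal_by_element(vd, vn_half, vm, index, esize):
    """Model of UMLAL (by element) for one 64-bit half of Vn.

    vd      -- destination elements, each 2*esize bits wide
    vn_half -- the elements selected by Vpart[n, part]
    vm      -- elements of the second source register
    index   -- which element of vm multiplies every element of vn_half
    (All names are assumptions made for this illustration.)
    """
    wide_mask = (1 << (2 * esize)) - 1
    element2 = vm[index]  # read once, outside the loop
    return [(acc + element1 * element2) & wide_mask
            for acc, element1 in zip(vd, vn_half)]


out = umlal_by_element([10, 0xFFFFFFFF], [3, 2], [5, 7, 0, 0, 0, 0, 0, 0], 1, 16)
```

The second lane in this call shows the wrap-around: adding 14 to an accumulator of 0xFFFFFFFF leaves 13 in the 32-bit destination element.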
C7.2.318 UMLAL, UMLAL2 (vector)


Unsigned Multiply-Add Long (vector). This instruction multiplies the vector elements in the lower or upper half of 
the first source SIMD&FP register by the corresponding vector elements of the second source SIMD&FP register, 
and accumulates the results with the vector elements of the destination SIMD&FP register. The destination vector 
elements are twice as long as the elements that are multiplied. 


The UMLAL instruction extracts vector elements from the lower half of the first source register, while the UMLAL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11 10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 1  0  0  0| 0  0|   Rn   |   Rd   |


Three registers, not all the same type variant 


UMLAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);

if size == '11' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;
boolean sub_op = (o1 == '1');
boolean unsigned = (U == '1');


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    8H when size = 00
    4S when size = 01
    2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;
C7.2.319 UMLSL, UMLSL2 (by element)
Unsigned Multiply-Subtract Long (vector, by element). This instruction multiplies each vector element in the lower 
or upper half of the first source SIMD&FP register by the specified vector element of the second source SIMD&FP 
register and subtracts the results from the vector elements of the destination SIMD&FP register. The destination 
vector elements are twice as long as the elements that are multiplied. 
The UMLSL instruction extracts vector elements from the lower half of the first source register, while the UMLSL2 
instruction extracts vector elements from the upper half of the first source register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
|31|30|29|28 27 26 25 24|23 22|21|20|19    16|15 14 13 12|11|10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  1| size| L| M|   Rm   | 0  1  1  0| H| 0|   Rn   |   Rd   |
Vector variant 
UMLSL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 
Decode for this encoding 
integer idxdsize = if H == '1' then 128 else 64; 
integer index; 
bit Rmhi; 
case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 
integer part = UInt(Q); 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
boolean sub_op = (o2 == '1');
Assembler symbols 
2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.


Operation


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;

element2 = Int(Elem[operand2, index, esize], unsigned);
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] - product;
    else
        Elem[result, e, 2*esize] = Elem[operand3, e, 2*esize] + product;

V[d] = result;
C7.2.320 UMLSL, UMLSL2 (vector)


Unsigned Multiply-Subtract Long (vector). This instruction multiplies corresponding vector elements in the lower 
or upper half of the two source SIMD&FP registers, and subtracts the results from the vector elements of the 
destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. 
All the values in this instruction are unsigned integer values. 


The UMLSL instruction extracts each source vector from the lower half of each source register, while the UMLSL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11 10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 1  0  1  0| 0  0|   Rn   |   Rd   |


Three registers, not all the same type variant 


UMLSL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 
boolean sub_op = (o1 == '1');

boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    8H when size = 00
    4S when size = 01
    2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;
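
The difference between UMLSL and UMLSL2 is only which half of the source registers is read. The Python sketch below illustrates that selection under the assumption that each 128-bit source register is a list of element values, lowest first. The name umlsl and the part argument (0 for UMLSL, 1 for UMLSL2, as in part = UInt(Q)) are illustrative choices.

```python
def umlsl(vd, vn, vm, esize, part):
    """Model of UMLSL/UMLSL2 (vector) on list-based registers.

    vd     -- destination elements, each 2*esize bits wide
    vn, vm -- full 128-bit source registers as lists of esize-bit values
    part   -- 0 selects the lower 64 bits, 1 the upper 64 bits
    (Hypothetical helper; not an architectural interface.)
    """
    elements = 64 // esize            # datasize is fixed at 64
    lo = part * elements              # first element of the selected half
    wide_mask = (1 << (2 * esize)) - 1
    return [(vd[e] - vn[lo + e] * vm[lo + e]) & wide_mask
            for e in range(elements)]


lower = umlsl([100, 5], [3, 2, 10, 1], [4, 3, 10, 9], 32, 0)
upper = umlsl([100, 5], [3, 2, 10, 1], [4, 3, 10, 9], 32, 1)
```

When a product exceeds the accumulator, the subtraction wraps modulo 2^(2*esize) rather than saturating.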
C7.2.321 UMOV


Unsigned Move vector element to general-purpose register. This instruction reads the unsigned integer from the 
source SIMD&FP register, zero-extends it to form a 32-bit or 64-bit value, and writes the result to the destination 
general-purpose register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias MOV (to general). See Alias conditions for details of when each alias is 
preferred. 


|31|30|29|28 27 26 25 24|23 22 21|20    16|15|14 13 12 11|10|9      5|4      0|
| 0| Q| 0| 0  1  1  1  0| 0  0  0|  imm5  | 0| 0  1  1  1| 1|   Rn   |   Rd   |


32-bit variant 
Applies when Q == 0. 


UMOV <Wd>, <Vn>.<Ts>[<index>] 


64-bit variant 
Applies when Q == 1 && imm5 == x1000. 


UMOV <Xd>, <Vn>.<Ts>[<index>] 


Decode for all variants of this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


integer size; 
case Q:imm5 of
    when '0xxxx1' size = 0; // UMOV Wd, Vn.B
    when '0xxx10' size = 1; // UMOV Wd, Vn.H
    when '0xx100' size = 2; // UMOV Wd, Vn.S
    when '1x1000' size = 3; // UMOV Xd, Vn.D
    otherwise UnallocatedEncoding();
integer idxdsize = if imm5<4> == '1' then 128 else 64;

integer index = UInt(imm5<4:size+1>);
integer esize = 8 << size; 
integer datasize = if Q == '1' then 64 else 32; 


Alias conditions 





Alias               is preferred when
MOV (to general)    imm5 == 'x1000'
MOV (to general)    imm5 == 'xx100'





Assembler symbols 





<Wd> Is the 32-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Xd> Is the 64-bit name of the general-purpose destination register, encoded in the "Rd" field.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<Ts> For the 32-bit variant: is an element size specifier, encoded in the "imm5" field. It can have the
following values:
    B when imm5 = xxxx1
    H when imm5 = xxx10
    S when imm5 = xx100
The encoding imm5 = xx000 is reserved.
For the 64-bit variant: is an element size specifier, encoded in the "imm5" field. It can have the
following values:
    D when imm5 = x1000
The following encodings are reserved:
• imm5 = x0000.
• imm5 = xxxx1.
• imm5 = xxx10.
• imm5 = xx100.
<index> For the 32-bit variant: is the element index encoded in the "imm5" field. It can have the following
values:
    imm5<4:1> when imm5 = xxxx1
    imm5<4:2> when imm5 = xxx10
    imm5<4:3> when imm5 = xx100
The encoding imm5 = xx000 is reserved.
For the 64-bit variant: is the element index encoded in "imm5<4>".


Operation 


CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);
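
The Q:imm5 decode and the element extraction can be modelled as follows. This is a sketch under stated assumptions: the source register is a single Python integer holding all 128 bits, and the names umov_decode and umov are invented for the example.

```python
def umov_decode(q, imm5):
    """Return (esize, index, datasize), or None for an unallocated encoding.

    Follows the case Q:imm5 patterns: the position of the lowest set bit
    of imm5 selects the element size, and the bits above it are the index.
    """
    if q == 0 and (imm5 & 0b1) == 0b1:
        size = 0                      # '0xxxx1', byte
    elif q == 0 and (imm5 & 0b11) == 0b10:
        size = 1                      # '0xxx10', halfword
    elif q == 0 and (imm5 & 0b111) == 0b100:
        size = 2                      # '0xx100', word
    elif q == 1 and (imm5 & 0b1111) == 0b1000:
        size = 3                      # '1x1000', doubleword
    else:
        return None
    index = imm5 >> (size + 1)        # UInt(imm5<4:size+1>)
    return 8 << size, index, (64 if q == 1 else 32)


def umov(vn, q, imm5):
    """Extract the selected element of the 128-bit value vn, zero-extended."""
    decoded = umov_decode(q, imm5)
    if decoded is None:
        raise ValueError("unallocated encoding")
    esize, index, _datasize = decoded
    return (vn >> (index * esize)) & ((1 << esize) - 1)


VN = 0x0F0E0D0C0B0A09080706050403020100
byte3 = umov(VN, 0, 0b00111)          # Vn.B[3]
```

Zero extension needs no extra step in this model, because the masked Python integer is already non-negative.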
C7.2.322 UMULL, UMULL2 (by element)


Unsigned Multiply Long (vector, by element). This instruction multiplies each vector element in the lower or upper 
half of the first source SIMD&FP register by the specified vector element of the second source SIMD&FP register, 
places the results in a vector, and writes the vector to the destination SIMD&FP register. The destination vector 
elements are twice as long as the elements that are multiplied. 


The UMULL instruction extracts vector elements from the lower half of the first source register, while the UMULL2 
instruction extracts vector elements from the upper half of the first source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20|19    16|15 14 13 12|11|10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  1| size| L| M|   Rm   | 1  0  1  0| H| 0|   Rn   |   Rd   |


Vector variant 


UMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Ts>[<index>] 


Decode for this encoding 


integer idxdsize = if H == '1' then 128 else 64; 

integer index; 

bit Rmhi; 

case size of
    when '01' index = UInt(H:L:M); Rmhi = '0';
    when '10' index = UInt(H:L); Rmhi = M;
    otherwise UnallocatedEncoding();


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rmhi:Rm); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    4S when size = 01
    2D when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The following encodings are reserved:
• size = 00, Q = x.
• size = 11, Q = x.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "size:M:Rm" field. It can have
the following values:
    0:Rm when size = 01
    M:Rm when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
Restricted to V0-V15 when element size <Ts> is H.
<Ts> Is an element size specifier, encoded in the "size" field. It can have the following values:
    H when size = 01
    S when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.
<index> Is the element index, encoded in the "size:L:H:M" field. It can have the following values:
    H:L:M when size = 01
    H:L when size = 10
The following encodings are reserved:
• size = 00.
• size = 11.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;

element2 = Int(Elem[operand2, index, esize], unsigned);
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = product;

V[d] = result;
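
The by-element forms share the same index and register-number decode. The Python sketch below illustrates it. The function name decode_by_element and the use of plain integers for the H, L, M, and Rm fields are assumptions made for this example.

```python
def decode_by_element(size, h, l, m, rm):
    """Return (index, m_register) for a by-element encoding, or None.

    size is the 2-bit size field; h, l and m are single bits; rm is the
    4-bit Rm field. Mirrors the 'case size of' decode. (Hypothetical
    helper, not an architectural interface.)
    """
    if size == 0b01:
        # 16-bit elements: three index bits, register limited to V0-V15.
        index = (h << 2) | (l << 1) | m
        rmhi = 0
    elif size == 0b10:
        # 32-bit elements: two index bits, M extends the register number.
        index = (h << 1) | l
        rmhi = m
    else:
        return None  # UnallocatedEncoding()
    return index, (rmhi << 4) | rm


halfword_case = decode_by_element(0b01, 1, 0, 1, 0b1111)
```

This makes the V0-V15 restriction for H elements visible: with size = 01 the M bit is consumed by the index, so only the four Rm bits remain for the register number.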
C7.2.323 UMULL, UMULL2 (vector)


Unsigned Multiply Long (vector). This instruction multiplies corresponding vector elements in the lower or upper 
half of the two source SIMD&FP registers, places the result in a vector, and writes the vector to the destination 
SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. All the 
values in this instruction are unsigned integer values. 


The UMULL instruction extracts each source vector from the lower half of each source register, while the UMULL2 
instruction extracts each source vector from the upper half of each source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12|11 10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 1  1  0  0| 0  0|   Rn   |   Rd   |


Three registers, not all the same type variant 


UMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
    8H when size = 00
    4S when size = 01
    2D when size = 10
The encoding size = 11 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
The encoding size = 11, Q = x is reserved.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;
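
Because the destination elements are twice as wide as the sources, the unsigned product always fits and the <2*esize-1:0> slice never discards set bits. A short Python check of that property follows. The helper name umull and the list representation are assumptions for the example.

```python
def umull(vn_half, vm_half, esize):
    """Model of UMULL (vector): widening unsigned multiply per lane."""
    wide_mask = (1 << (2 * esize)) - 1
    return [(a * b) & wide_mask for a, b in zip(vn_half, vm_half)]


# Largest possible product of two esize-bit unsigned values, for each
# element size that the encoding allows (8, 16 and 32 bits).
largest_fits = all(((1 << n) - 1) ** 2 < (1 << (2 * n)) for n in (8, 16, 32))
products = umull([255, 2], [255, 3], 8)
```

The bound follows from (2^n - 1)^2 = 2^(2n) - 2^(n+1) + 1, which is strictly less than 2^(2n).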
C7.2.324 UQADD


Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP 
registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


|31 30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12 11|10|9      5|4      0|
| 0  1| 1| 1  1  1  1  0| size| 1|   Rm   | 0  0  0  0  1| 1|   Rn   |   Rd   |
        U


Scalar variant 


UQADD <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;
boolean unsigned = (U == '1');


Vector


|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13 12 11|10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 0  0  0  0  1| 1|   Rn   |   Rd   |
        U


Vector variant 


UQADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rm);
if size:Q == '110' then ReservedValue();
integer esize = 8 << UInt(size);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
boolean unsigned = (U == '1');


Assembler symbols 





<V> Is a width specifier, encoded in the "size" field. It can have the following values:
    B when size = 00
    H when size = 01
    S when size = 10
    D when size = 11
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
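
The saturating behaviour can be illustrated with a small Python model. The helper sat_q_unsigned stands in for the unsigned case of SatQ, and the returned qc flag models the sticky FPSR.QC bit for a single instruction. Both names, and the list-based register view, are assumptions of this sketch.

```python
def sat_q_unsigned(value, esize):
    """Clamp value to [0, 2^esize - 1]; return (result, saturated)."""
    max_value = (1 << esize) - 1
    if value > max_value:
        return max_value, True
    if value < 0:
        return 0, True
    return value, False


def uqadd(vn, vm, esize):
    """Model of UQADD: per-lane unsigned add with saturation.

    Returns (result_elements, qc), where qc is True if any lane saturated.
    """
    result = []
    qc = False
    for element1, element2 in zip(vn, vm):
        value, sat = sat_q_unsigned(element1 + element2, esize)
        result.append(value)
        qc = qc or sat
    return result, qc


lanes, qc = uqadd([200, 1], [100, 2], 8)
```

In hardware FPSR.QC is cumulative across instructions until software clears it; this model only reports whether the one modelled instruction would set it.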
C7.2.325 UQRSHL

Unsigned saturating Rounding Shift Left (register). This instruction takes each vector element of the first source 
SIMD&FP register, shifts the vector element by a value from the least significant byte of the corresponding vector 
element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the 
destination SIMD&FP register. 
If the shift value is positive, the operation is a left shift. Otherwise, it is a right shift. The results are rounded. For 
truncated results, see UQSHL (immediate). 
If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 

|31 30|29|28 27 26 25 24|23 22|21|20    16|15 14 13|12|11|10|9      5|4      0|
| 0  1| 1| 1  1  1  1  0| size| 1|   Rm   | 0  1  0| 1| 1| 1|   Rn   |   Rd   |
        U                                            R  S

Scalar variant 
UQRSHL <V><d>, <V><n>, <V><m> 
Decode for this encoding 

integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 

integer datasize = esize; 

integer elements = 1; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 

if S == '0' && size != '11' then ReservedValue();
Vector 

|31|30|29|28 27 26 25 24|23 22|21|20    16|15 14 13|12|11|10|9      5|4      0|
| 0| Q| 1| 0  1  1  1  0| size| 1|   Rm   | 0  1  0| 1| 1| 1|   Rn   |   Rd   |
        U                                           R  S

Vector variant 
UQRSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 

integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 

integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 
Assembler symbols


<V> Is a width specifier, encoded in the "size" field. It can have the following values:
    B when size = 00
    H when size = 01
    S when size = 10
    D when size = 11
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1
The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;
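
The per-element behavior of the UQRSHL operation can be illustrated with a short Python sketch. This is an informal model and not part of the architecture definition: the names sat_q_unsigned and uqrshl_element are hypothetical, the model covers only the unsigned, rounding, saturating case, and it returns the saturation flag instead of updating FPSR.QC.

```python
from typing import Tuple


def sat_q_unsigned(value: int, esize: int) -> Tuple[int, bool]:
    """Clamp an unbounded integer to the range 0 .. 2**esize - 1.

    Returns the clamped value and a flag that is True if clamping occurred.
    """
    max_value = (1 << esize) - 1
    if value > max_value:
        return max_value, True
    if value < 0:
        return 0, True
    return value, False


def uqrshl_element(operand1: int, shift_byte: int, esize: int) -> Tuple[int, bool]:
    """Model one element of UQRSHL (unsigned, rounding, saturating).

    shift_byte is the least significant byte (0..255) of the matching
    element of the second source register.
    """
    # Interpret the byte as a signed value, as SInt() does in the pseudocode.
    shift = shift_byte - 256 if shift_byte & 0x80 else shift_byte
    if shift >= 0:
        # Left shift: the rounding constant is 0.
        element = operand1 << shift
    else:
        # Right shift by -shift with rounding constant 2^(-shift - 1).
        round_const = 1 << (-shift - 1)
        element = (operand1 + round_const) >> -shift
    return sat_q_unsigned(element, esize)
```

For example, with 8-bit elements a source value of 0x80 shifted left by 1 saturates to 0xFF, while a source value of 5 with a shift byte of 0xFF (that is, -1) gives the rounded result 3.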
C7.2.326 UQRSHRN, UQRSHRN2


Unsigned saturating Rounded Shift Right Narrow (immediate). This instruction reads each vector element in the
source SIMD&FP register, right shifts each element by an immediate value, saturates each shifted result to a value
that is half the original width, puts the final result into a vector, and writes the vector to the lower or upper half of
the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are
rounded. For truncated results, see UQSHRN, UQSHRN2.


The UQRSHRN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the UQRSHRN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29  | 28-23  | 22-19          | 18-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0 1   | U=1 | 111110 | immh (!= 0000) | immb  | 1001  | op=1 | 1  | Rn  | Rd  |


Scalar variant 


UQRSHRN <Vb><d>, <Va><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then ReservedValue();
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer part = 0;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');
Vector 

| 31 | 30 | 29  | 28-23  | 22-19          | 18-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 011110 | immh (!= 0000) | immb  | 1001  | op=1 | 1  | Rn  | Rd  |


Vector variant 


UQRSHRN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.

The encoding immh = 1xxx, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.

The encoding immh = 1xxx is reserved.

<Vb> Is the destination width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
The following encodings are reserved:
° immh = 0000.
° immh = 1xxx.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<Va> Is the source width specifier, encoded in the "immh" field. It can have the following values:
H when immh = 0001
S when immh = 001x
D when immh = 01xx
The following encodings are reserved:
° immh = 0000.
° immh = 1xxx.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to the destination operand width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
The following encodings are reserved:
° immh = 0000.
° immh = 1xxx.

For the vector variant: is the right shift amount, in the range 1 to the destination element width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.

The encoding immh = 1xxx is reserved.

Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
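
As an informal illustration of the decode and operation above, the following Python sketch derives the destination element size and shift amount from immh:immb, then applies the rounding, narrowing, and saturation to one source element. The function names are hypothetical, and the saturation flag is returned to the caller where the pseudocode would set FPSR.QC.

```python
from typing import Tuple


def decode_narrow_shift(immh: int, immb: int) -> Tuple[int, int]:
    """Return (esize, shift) for the narrowing right-shift encodings."""
    if immh == 0 or immh & 0b1000:
        raise ValueError("immh = 0000 and immh = 1xxx are not valid here")
    esize = 8 << (immh.bit_length() - 1)      # 8 << HighestSetBit(immh)
    shift = 2 * esize - ((immh << 3) | immb)  # (2 * esize) - UInt(immh:immb)
    return esize, shift


def uqrshrn_element(source: int, shift: int, esize: int) -> Tuple[int, bool]:
    """Model one element: source is an unsigned value of 2*esize bits."""
    round_const = 1 << (shift - 1)
    element = (source + round_const) >> shift
    max_value = (1 << esize) - 1
    if element > max_value:
        return max_value, True   # saturated to the destination width
    return element, False
```

With immh = 0001 and immb = 000 the sketch yields 8-bit destination elements and a shift of 8, the largest shift permitted for that element size.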
C7.2.327 UQSHL (immediate)
Unsigned saturating Shift Left (immediate). This instruction takes each vector element in the source SIMD&FP 
register, shifts it by an immediate value, places the results in a vector, and writes the vector to the destination 
SIMD&FP register. The results are truncated. For rounded results, see UQRSHL. 
If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 30 | 29  | 28-23  | 22-19          | 18-16 | 15-13 | 12   | 11 | 10 | 9-5 | 4-0 |
| 0 1   | U=1 | 111110 | immh (!= 0000) | immb  | 011   | op=1 | 0  | 1  | Rn  | Rd  |
Scalar variant 
UQSHL <V><d>, <V><n>, #<shift> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if immh == '0000' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer shift = UInt(immh:immb) - esize;
boolean src_unsigned;
boolean dst_unsigned;
case op:U of
    when '00' UnallocatedEncoding();
    when '01' src_unsigned = FALSE; dst_unsigned = TRUE;
    when '10' src_unsigned = FALSE; dst_unsigned = FALSE;
    when '11' src_unsigned = TRUE; dst_unsigned = TRUE;
Vector 
| 31 | 30 | 29  | 28-23  | 22-19          | 18-16 | 15-13 | 12   | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 011110 | immh (!= 0000) | immb  | 011   | op=1 | 0  | 1  | Rn  | Rd  |
Vector variant 
UQSHL <Vd>.<T>, <Vn>.<T>, #<shift> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = UInt(immh:immb) - esize;

boolean src_unsigned;
boolean dst_unsigned;
case op:U of
    when '00' UnallocatedEncoding();
    when '01' src_unsigned = FALSE; dst_unsigned = TRUE;
    when '10' src_unsigned = FALSE; dst_unsigned = FALSE;
    when '11' src_unsigned = TRUE; dst_unsigned = TRUE;


Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
D when immh = 1xxx

The encoding immh = 0000 is reserved.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
2D when immh = 1xxx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.

The encoding immh = 1xxx, Q = 0 is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the left shift amount, in the range 0 to the operand width in bits minus 1,
encoded in the "immh:immb" field. It can have the following values:
(UInt(immh:immb)-8) when immh = 0001
(UInt(immh:immb)-16) when immh = 001x
(UInt(immh:immb)-32) when immh = 01xx
(UInt(immh:immb)-64) when immh = 1xxx

The encoding immh = 0000 is reserved.

For the vector variant: is the left shift amount, in the range 0 to the element width in bits minus 1,
encoded in the "immh:immb" field. It can have the following values:
(UInt(immh:immb)-8) when immh = 0001
(UInt(immh:immb)-16) when immh = 001x
(UInt(immh:immb)-32) when immh = 01xx
(UInt(immh:immb)-64) when immh = 1xxx





ARM DDI 0487A.k_iss10775 
1ID092916 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. C7-1447 


Non-Confidential 


C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 


See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
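
The relationship between the immh:immb field and the left shift amount can be checked with a small Python sketch. It is an informal model under stated assumptions: the function names are hypothetical, only the UQSHL case (unsigned source and destination) is modeled, and the saturation flag is returned where the pseudocode would set FPSR.QC.

```python
from typing import Tuple


def decode_left_shift(immh: int, immb: int) -> Tuple[int, int]:
    """Return (esize, shift) for the shift-left-by-immediate encodings."""
    if immh == 0:
        raise ValueError("immh = 0000 does not encode a shift")
    esize = 8 << (immh.bit_length() - 1)  # 8 << HighestSetBit(immh)
    shift = ((immh << 3) | immb) - esize  # UInt(immh:immb) - esize
    return esize, shift


def uqshl_imm_element(source: int, shift: int, esize: int) -> Tuple[int, bool]:
    """Model one UQSHL (immediate) element; source is unsigned, esize bits wide."""
    element = source << shift
    max_value = (1 << esize) - 1
    if element > max_value:
        return max_value, True   # saturated
    return element, False
```

For 8-bit elements, immh:immb = 0001:011 decodes to a shift of 3. Shifting 0x1F by 3 gives 0xF8 without saturation, while shifting 0x20 by 3 saturates to 0xFF.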
C7.2.328 UQSHL (register)


Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source 
SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the 
second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP 
register. 


If the shift value is positive, the operation is a left shift. Otherwise, it is a right shift. The results are truncated. For 
rounded results, see UQRSHL. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-13 | 12  | 11  | 10 | 9-5 | 4-0 |
| 0 1   | U=1 | 11110 | size  | 1  | Rm    | 010   | R=0 | S=1 | 1  | Rn  | Rd  |


Scalar variant 


UQSHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 

integer datasize = esize; 

integer elements = 1; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 

if S == '0' && size != '11' then ReservedValue();


Vector 


| 31 | 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-13 | 12  | 11  | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | size  | 1  | Rm    | 010   | R=0 | S=1 | 1  | Rn  | Rd  |


Vector variant 


UQSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 
Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
D when size = 11

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;
C7.2.329 UQSHRN, UQSHRN2


Unsigned saturating Shift Right Narrow (immediate). This instruction reads each vector element in the source 
SIMD&FP register, right shifts each result by an immediate value, saturates each shifted result to a value that is half 
the original width, puts the final result into a vector, and writes the vector to the lower or upper half of the destination 
SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For 
rounded results, see UQRSHRN, UQRSHRN2. 


The UQSHRN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the UQSHRN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29  | 28-23  | 22-19          | 18-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0 1   | U=1 | 111110 | immh (!= 0000) | immb  | 1001  | op=0 | 1  | Rn  | Rd  |


Scalar variant 


UQSHRN <Vb><d>, <Va><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then ReservedValue();
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = esize;
integer elements = 1;
integer part = 0;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');

Vector 

| 31 | 30 | 29  | 28-23  | 22-19          | 18-16 | 15-12 | 11   | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 011110 | immh (!= 0000) | immb  | 1001  | op=0 | 1  | Rn  | Rd  |


Vector variant 


UQSHRN{2} <Vd>.<Tb>, <Vn>.<Ta>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;

integer shift = (2 * esize) - UInt(immh:immb);
boolean round = (op == '1');
boolean unsigned = (U == '1');


Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
8B when immh = 0001, Q = 0
16B when immh = 0001, Q = 1
4H when immh = 001x, Q = 0
8H when immh = 001x, Q = 1
2S when immh = 01xx, Q = 0
4S when immh = 01xx, Q = 1
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.

The encoding immh = 1xxx, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
8H when immh = 0001
4S when immh = 001x
2D when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.

The encoding immh = 1xxx is reserved.

<Vb> Is the destination width specifier, encoded in the "immh" field. It can have the following values:
B when immh = 0001
H when immh = 001x
S when immh = 01xx
The following encodings are reserved:
° immh = 0000.
° immh = 1xxx.

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<Va> Is the source width specifier, encoded in the "immh" field. It can have the following values:
H when immh = 0001
S when immh = 001x
D when immh = 01xx
The following encodings are reserved:
° immh = 0000.
° immh = 1xxx.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<shift> For the scalar variant: is the right shift amount, in the range 1 to the destination operand width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
The following encodings are reserved:
° immh = 0000.
° immh = 1xxx.

For the vector variant: is the right shift amount, in the range 1 to the destination element width in
bits, encoded in the "immh:immb" field. It can have the following values:
(16-UInt(immh:immb)) when immh = 0001
(32-UInt(immh:immb)) when immh = 001x
(64-UInt(immh:immb)) when immh = 01xx
See Advanced SIMD modified immediate on page C4-237 when immh = 0000.

The encoding immh = 1xxx is reserved.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
C7.2.330 UQSUB


Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register 
from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and 
writes the vector to the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0 1   | U=1 | 11110 | size  | 1  | Rm    | 00101 | 1  | Rn  | Rd  |


Scalar variant 


UQSUB <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer elements = 1; 


boolean unsigned = (U == '1'); 
Vector 
| 31 | 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | size  | 1  | Rm    | 00101 | 1  | Rn  | Rd  |


Vector variant 


UQSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 


Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10
D when size = 11

<d> Is the number of the SIMD&FP destination register, in the "Rd" field.

<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.

<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1
2D when size = 11, Q = 1

The encoding size = 11, Q = 0 is reserved.

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;
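
The element-wise subtraction with unsigned saturation can be modeled informally in Python. In this sketch the register contents are held in plain integers, the name uqsub is hypothetical, and the second return value stands in for the FPSR.QC update.

```python
from typing import Tuple


def uqsub(vn: int, vm: int, esize: int, datasize: int) -> Tuple[int, bool]:
    """Subtract each esize-bit lane of vm from the matching lane of vn.

    A negative difference saturates to 0. Returns the packed result and a
    flag that is True if any lane saturated.
    """
    mask = (1 << esize) - 1
    result = 0
    saturated = False
    for e in range(datasize // esize):
        element1 = (vn >> (e * esize)) & mask
        element2 = (vm >> (e * esize)) & mask
        diff = element1 - element2
        if diff < 0:
            diff = 0            # unsigned results cannot go below zero
            saturated = True
        result |= diff << (e * esize)
    return result, saturated
```

With two 8-bit lanes, subtracting 0x0106 from 0x1005 gives 0x0F00: the upper lane holds 0x10 - 0x01, and the lower lane saturates to 0 because 0x05 is less than 0x06.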
C7.2.331 UQXTN, UQXTN2
Unsigned saturating extract Narrow. This instruction reads each vector element from the source SIMD&FP register, 
saturates each value to half the original width, places the result into a vector, and writes the vector to the destination 
SIMD&FP register. All the values in this instruction are unsigned integer values. 
If saturation occurs, the cumulative saturation bit FPSR.QC is set. 
The UQXTN instruction writes the vector to the lower half of the destination register and clears the upper half, while 
the UQXTN2 instruction writes the vector to the upper half of the destination register without affecting the other bits 
of the register. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 30 | 29  | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0 1   | U=1 | 11110 | size  | 10000 | 10100 | 10    | Rn  | Rd  |
Scalar variant 
UQXTN <Vb><d>, <Va><n> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = esize; 
integer part = 0;
integer elements = 1; 
boolean unsigned = (U == '1'); 
Vector 
| 31 | 30 | 29  | 28-24 | 23-22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | size  | 10000 | 10100 | 10    | Rn  | Rd  |
Vector variant 
UQXTN{2} <Vd>.<Tb>, <Vn>.<Ta> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 
integer part = UInt(Q); 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
Assembler symbols

2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
the following values:
[absent] when Q = 0
[present] when Q = 1

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00, Q = 0
16B when size = 00, Q = 1
4H when size = 01, Q = 0
8H when size = 01, Q = 1
2S when size = 10, Q = 0
4S when size = 10, Q = 1

The encoding size = 11, Q = x is reserved.

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values:
8H when size = 00
4S when size = 01
2D when size = 10

The encoding size = 11 is reserved.

<Vb> Is the destination width specifier, encoded in the "size" field. It can have the following values:
B when size = 00
H when size = 01
S when size = 10

The encoding size = 11 is reserved.

<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.

<Va> Is the source width specifier, encoded in the "size" field. It can have the following values:
H when size = 00
S when size = 01
D when size = 10

The encoding size = 11 is reserved.

<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;
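
The difference between UQXTN and UQXTN2 lies only in which half of the destination register receives the narrowed vector. The following Python sketch models the vector variant informally, holding 128-bit register values in plain integers. The name uqxtn and its parameters are hypothetical, and the returned flag stands in for the FPSR.QC update.

```python
from typing import Tuple

LOW_64_MASK = (1 << 64) - 1


def uqxtn(vd: int, vn: int, esize: int, part: int) -> Tuple[int, bool]:
    """Narrow each 2*esize-bit lane of vn to esize bits with unsigned saturation.

    part = 0 models UQXTN: the result goes to the lower half and the upper
    half of the destination is cleared. part = 1 models UQXTN2: the result
    goes to the upper half and the lower half of vd is preserved.
    """
    source_mask = (1 << (2 * esize)) - 1
    max_value = (1 << esize) - 1
    result = 0
    saturated = False
    for e in range(64 // esize):
        element = (vn >> (e * 2 * esize)) & source_mask
        if element > max_value:
            element = max_value
            saturated = True
        result |= element << (e * esize)
    if part == 0:
        return result, saturated
    return (vd & LOW_64_MASK) | (result << 64), saturated
```

For a 2D source holding 7 and 2^32, narrowing to 32-bit elements gives 7 and the saturated value 0xFFFFFFFF. With part = 1 the same 64-bit result is placed above the existing lower half of the destination.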
C7.2.332 URECPE


Unsigned Reciprocal Estimate. This instruction reads each vector element from the source SIMD&FP register, 
calculates an approximate inverse for the unsigned integer value, places the result into a vector, and writes the vector 
to the destination SIMD&FP register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28-24 | 23 | 22 | 21-17 | 16-12 | 11-10 | 9-5 | 4-0 |
| 0  | Q  | 0  | 01110 | 1  | sz | 10000 | 11100 | 10    | Rn  | Rd  |


Vector variant 


URECPE <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz == '1' then ReservedValue(); 

integer esize = 32; 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
2S when sz = 0,Q = 0 
4S when sz = 0,Q=1 


The encoding sz = 1, Q = x is reserved. 


<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(32) element;

for e = 0 to elements-1
    element = Elem[operand, e, 32];
    Elem[result, e, 32] = UnsignedRecipEstimate(element);

V[d] = result;
C7.2.333 URHADD
Unsigned Rounding Halving Add. This instruction adds corresponding unsigned integer values from the two source 
SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the 
destination SIMD&FP register. 
The results are rounded. For truncated results, see UHADD. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-11 | 10 | 9-5 | 4-0 |
| 0  | Q  | U=1 | 01110 | size  | 1  | Rm    | 00010 | 1  | Rn  | Rd  |
Three registers of the same type variant 
URHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
boolean unsigned = (U == '1'); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
8B when size = 00,Q = 0 
16B when size = 00,Q=1 
4H when size = 01,Q = 0 
8H when size = 01,Q=1 
2S when size = 10,Q = 0 
4S when size = 10,Q=1 
The encoding size = 11, Q = x is reserved. 
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.
Operation 
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;
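
The rounding halving add takes bits <esize:1> of the unbounded sum, so the carry out of the top bit is kept and the result always fits in the element. A minimal Python sketch of this behavior follows, with register values held in plain integers. The name urhadd is hypothetical.

```python
def urhadd(vn: int, vm: int, esize: int, datasize: int) -> int:
    """Compute (a + b + 1) >> 1 for each pair of esize-bit unsigned lanes."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(datasize // esize):
        element1 = (vn >> (e * esize)) & mask
        element2 = (vm >> (e * esize)) & mask
        # Python integers are unbounded, so the sum keeps its carry bit,
        # matching the integer arithmetic used in the pseudocode.
        rounded_half = (element1 + element2 + 1) >> 1
        result |= (rounded_half & mask) << (e * esize)
    return result
```

With two 8-bit lanes, adding 0xFF01 and 0xFF02 gives 0xFF02: the upper lane is (255 + 255 + 1) >> 1 = 255, and the lower lane is (1 + 2 + 1) >> 1 = 2.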
C7.2.334 URSHL


Unsigned Rounding Shift Left (register). This instruction takes each element in the vector of the first source 
SIMD&FP register, shifts the vector element by a value from the least significant byte of the corresponding element 
of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination 
SIMD&FP register. 


If the shift value is positive, the operation is a left shift. If the shift value is negative, it is a rounding right shift. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29  | 28-24 | 23-22 | 21 | 20-16 | 15-13 | 12  | 11  | 10 | 9-5 | 4-0 |
| 0 1   | U=1 | 11110 | size  | 1  | Rm    | 010   | R=1 | S=0 | 1  | Rn  | Rd  |


Scalar variant 


URSHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 

integer datasize = esize; 

integer elements = 1; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 

if S == '0' && size != '11' then ReservedValue();


Vector 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20-16 | 15 14 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  | Rm    | 0  1  0  | 1  | 0  | 1  | Rn  | Rd  |
|    |    | U  |                |       |    |       |          | R  | S  |    |     |     |


Vector variant 


URSHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1');

Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
    D when size = 11
    The following encodings are reserved:
    • size = 0x.
    • size = 10.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1
    The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

C7.2.335 URSHR


Unsigned Rounding Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP
register, right shifts each result by an immediate value, writes the final result to a vector, and writes the vector to the
destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are
rounded. For truncated results, see USHR.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state
and Exception level, an attempt to execute the instruction might be trapped.


Scalar 


| 31 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  1  | 1  | 1  1  1  1  1  0  | !=0000 | immb  | 0  0  | 1  | 0  | 0  | 1  | Rn  | Rd  |
|       | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |


Scalar variant 


URSHR <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh<3> != '1' then ReservedValue();
integer esize = 8 << 3;
integer datasize = esize;
integer elements = 1;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Vector 


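| 31 | 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  1  0  | !=0000 | immb  | 0  0  | 1  | 0  | 0  | 1  | Rn  | Rd  |
|    |    | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |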


Vector variant 


URSHR <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
    D when immh = 1xxx
    The encoding immh = 0xxx is reserved.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    8B when immh = 0001, Q = 0
    16B when immh = 0001, Q = 1
    4H when immh = 001x, Q = 0
    8H when immh = 001x, Q = 1
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    2D when immh = 1xxx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The encoding immh = 1xxx, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
    field. It can have the following values:
    (128-UInt(immh:immb)) when immh = 1xxx
    The encoding immh = 0xxx is reserved.

    For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
    in the "immh:immb" field. It can have the following values:
    (16-UInt(immh:immb)) when immh = 0001
    (32-UInt(immh:immb)) when immh = 001x
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

C7.2.336 URSQRTE


Unsigned Reciprocal Square Root Estimate. This instruction reads each vector element from the source SIMD&FP 
register, calculates an approximate inverse square root for each value, places the result into a vector, and writes the 
vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 | 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  0  | 1  | sz | 1  0  0  0  0  | 1  1  1  0  0  | 1  0  | Rn  | Rd  |


Vector variant 


URSQRTE <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if sz == '1' then ReservedValue(); 

integer esize = 32; 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


Assembler symbols

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:
    2S when sz = 0, Q = 0
    4S when sz = 0, Q = 1
    The encoding sz = 1, Q = x is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(32) element;

for e = 0 to elements-1
    element = Elem[operand, e, 32];
    Elem[result, e, 32] = UnsignedRSqrtEstimate(element);

V[d] = result;

C7.2.337 URSRA


Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the 
source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the 
vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. 
The results are rounded. For truncated results, see USRA. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
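
The rounding and accumulation described above can be sketched informally in Python. This is an illustration under stated assumptions, not part of the architecture specification: ursra_element is a hypothetical name, the arguments are plain non-negative integers, and the accumulation is taken to wrap modulo 2^esize, as in a fixed-width addition.

```python
def ursra_element(acc: int, value: int, shift: int, esize: int) -> int:
    """Model one URSRA element (hypothetical helper).

    acc   -- current unsigned element of the destination register
    value -- unsigned element of the source register
    shift -- immediate right shift amount, 1 to esize
    esize -- element size in bits
    """
    if not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1 to esize")
    mask = (1 << esize) - 1
    # Adding half of the last discarded bit position rounds to nearest.
    round_const = 1 << (shift - 1)
    rounded = ((value & mask) + round_const) >> shift
    # Accumulate with wraparound at the element width.
    return ((acc & mask) + rounded) & mask
```

With acc = 10, value = 7 and shift = 2, the sketch computes (7 + 2) >> 2 = 2 and returns 12. A truncating shift, as in USRA, would add 1 instead.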


Scalar 


| 31 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  1  | 1  | 1  1  1  1  1  0  | !=0000 | immb  | 0  0  | 1  | 1  | 0  | 1  | Rn  | Rd  |
|       | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |


Scalar variant 


URSRA <V><d>, <V><n>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh<3> != '1' then ReservedValue();
integer esize = 8 << 3;
integer datasize = esize;
integer elements = 1;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');


Vector 


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  1  0  | !=0000 | immb  | 0  0  | 1  | 1  | 0  | 1  | Rn  | Rd  |
|    |    | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |


Vector variant 


URSRA <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');

Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
    D when immh = 1xxx
    The encoding immh = 0xxx is reserved.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    8B when immh = 0001, Q = 0
    16B when immh = 0001, Q = 1
    4H when immh = 001x, Q = 0
    8H when immh = 001x, Q = 1
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    2D when immh = 1xxx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The encoding immh = 1xxx, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
    field. It can have the following values:
    (128-UInt(immh:immb)) when immh = 1xxx
    The encoding immh = 0xxx is reserved.

    For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
    in the "immh:immb" field. It can have the following values:
    (16-UInt(immh:immb)) when immh = 0001
    (32-UInt(immh:immb)) when immh = 001x
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

C7.2.338 USHL


Unsigned Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP 
register, shifts each element by a value from the least significant byte of the corresponding element of the second 
source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. 


If the shift value is positive, the operation is a left shift. If the shift value is negative, it is a truncating right shift. For 
a rounding shift, see URSHL. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20-16 | 15 14 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  1  | 1  | 1  1  1  1  0  | size  | 1  | Rm    | 0  1  0  | 0  | 0  | 1  | Rn  | Rd  |
|       | U  |                |       |    |       |          | R  | S  |    |     |     |


Scalar variant 


USHL <V><d>, <V><n>, <V><m> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

integer esize = 8 << UInt(size); 

integer datasize = esize; 

integer elements = 1; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1'); 

if S == '0' && size != '11' then ReservedValue();


Vector 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20-16 | 15 14 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  | Rm    | 0  1  0  | 0  | 0  | 1  | Rn  | Rd  |
|    |    | U  |                |       |    |       |          | R  | S  |    |     |     |


Vector variant 


USHL <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 

integer n = UInt(Rn); 

integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 

integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

boolean unsigned = (U == '1'); 

boolean rounding = (R == '1'); 

boolean saturating = (S == '1');

Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
    D when size = 11
    The following encodings are reserved:
    • size = 0x.
    • size = 10.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<m> Is the number of the second SIMD&FP source register, encoded in the "Rm" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1
    The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1); // 0 for left shift, 2^(n-1) for right shift
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

C7.2.339 USHLL, USHLL2


Unsigned Shift Left Long (immediate). This instruction reads each vector element in the lower or upper half of the 
source SIMD&FP register, shifts the unsigned integer value left by the specified number of bits, places the result 
into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice 
as long as the source vector elements. 


The USHLL instruction extracts vector elements from the lower half of the source register, while the USHLL2 instruction 
extracts vector elements from the upper half of the source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is used by the alias UXTL, UXTL2. See Alias conditions for details of when each alias is preferred. 
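
As an informal sketch of the widening behavior described above, and not part of the architecture specification, the following Python function models USHLL and USHLL2 on a register represented as a list of elements. The name ushll and the list representation (element 0 least significant, 128 bits in total) are assumptions made for illustration.

```python
from typing import List


def ushll(src: List[int], esize: int, shift: int, upper: bool = False) -> List[int]:
    """Model USHLL (upper=False) or USHLL2 (upper=True); hypothetical helper.

    src   -- the 128-bit source register as 128 // esize unsigned elements
    esize -- source element size in bits (8, 16 or 32)
    shift -- left shift amount, 0 to esize - 1
    Returns the destination elements, each 2 * esize bits wide.
    """
    lanes = 128 // esize
    if len(src) != lanes:
        raise ValueError("src must hold 128 // esize elements")
    if not 0 <= shift < esize:
        raise ValueError("shift must be in the range 0 to esize - 1")
    half = lanes // 2
    # USHLL reads the lower 64 bits, USHLL2 the upper 64 bits.
    part = src[half:] if upper else src[:half]
    narrow_mask = (1 << esize) - 1
    wide_mask = (1 << (2 * esize)) - 1
    return [((x & narrow_mask) << shift) & wide_mask for x in part]
```

Calling the sketch with shift = 0 simply zero-extends each element, which corresponds to the UXTL and UXTL2 alias.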


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 13 12 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  1  0  | !=0000 | immb  | 1  0  1  0  0  | 1  | Rn  | Rd  |
|    |    | U  |                   | immh   |       |                |    |     |     |


Vector variant 


USHLL{2} <Vd>.<Ta>, <Vn>.<Tb>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 


if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3> == '1' then ReservedValue(); 

integer esize = 8 << HighestSetBit(immh) ; 

integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


integer shift = UInt(immh:immb) - esize; 
boolean unsigned = (U == '1'); 


Alias conditions

Alias | is preferred when
UXTL, UXTL2 | immb == '000' && BitCount(immh) == 1





Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper
    64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have
    the following values:
    [absent] when Q = 0
    [present] when Q = 1
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values:
    8H when immh = 0001
    4S when immh = 001x
    2D when immh = 01xx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
    The encoding immh = 1xxx is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    8B when immh = 0001, Q = 0
    16B when immh = 0001, Q = 1
    4H when immh = 001x, Q = 0
    8H when immh = 001x, Q = 1
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The encoding immh = 1xxx, Q = x is reserved.
<shift> Is the left shift amount, in the range 0 to the source element width in bits minus 1, encoded in the
    "immh:immb" field. It can have the following values:
    (UInt(immh:immb)-8) when immh = 0001
    (UInt(immh:immb)-16) when immh = 001x
    (UInt(immh:immb)-32) when immh = 01xx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.
    The encoding immh = 1xxx is reserved.


Operation 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

C7.2.340 USHR


Unsigned Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP register, right
shifts each result by an immediate value, writes the final result to a vector, and writes the vector to the destination
SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For
rounded results, see URSHR.


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
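
The immediate shift amount of this instruction is carried in the immh:immb field together with the element size. As an informal illustration, and not part of the architecture specification, the following Python sketch decodes and encodes that field for the vector variant. The function names decode_shift_right and encode_shift_right are hypothetical, and the sketch does not check the Q bit.

```python
from typing import Tuple


def decode_shift_right(immh: int, immb: int) -> Tuple[int, int]:
    """Return (esize, shift) for a vector shift-right-by-immediate encoding."""
    if not (0 <= immh <= 0b1111 and 0 <= immb <= 0b111):
        raise ValueError("immh is 4 bits and immb is 3 bits")
    if immh == 0:
        raise ValueError("immh == 0000 selects Advanced SIMD modified immediate")
    # The position of the highest set bit of immh selects the element size.
    esize = 8 << (immh.bit_length() - 1)
    shift = 2 * esize - ((immh << 3) | immb)
    return esize, shift


def encode_shift_right(esize: int, shift: int) -> Tuple[int, int]:
    """Return (immh, immb) for an element size and right shift amount."""
    if esize not in (8, 16, 32, 64):
        raise ValueError("esize must be 8, 16, 32 or 64")
    if not 1 <= shift <= esize:
        raise ValueError("shift must be in the range 1 to esize")
    value = 2 * esize - shift
    return value >> 3, value & 0b111
```

Under this sketch, a 32-bit element with a shift of 29 encodes as immh = 0100 and immb = 011, because 64 - 29 = 35 = 0b0100011.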


Scalar 


| 31 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  1  | 1  | 1  1  1  1  1  0  | !=0000 | immb  | 0  0  | 0  | 0  | 0  | 1  | Rn  | Rd  |
|       | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |

Scalar variant

USHR <V><d>, <V><n>, #<shift>

Decode for this encoding

integer d = UInt(Rd);
integer n = UInt(Rn);

if immh<3> != '1' then ReservedValue();
integer esize = 8 << 3;
integer datasize = esize;
integer elements = 1;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');

Vector

| 31 | 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  1  0  | !=0000 | immb  | 0  0  | 0  | 0  | 0  | 1  | Rn  | Rd  |
|    |    | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |


Vector variant 


USHR <Vd>.<T>, <Vn>.<T>, #<shift> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;

integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');





ARM DDI 0487A.k_iss10775
ID092916
Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved.
Non-Confidential
C7-1473


Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
    D when immh = 1xxx
    The encoding immh = 0xxx is reserved.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    8B when immh = 0001, Q = 0
    16B when immh = 0001, Q = 1
    4H when immh = 001x, Q = 0
    8H when immh = 001x, Q = 1
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    2D when immh = 1xxx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The encoding immh = 1xxx, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
    field. It can have the following values:
    (128-UInt(immh:immb)) when immh = 1xxx
    The encoding immh = 0xxx is reserved.

    For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
    in the "immh:immb" field. It can have the following values:
    (16-UInt(immh:immb)) when immh = 0001
    (32-UInt(immh:immb)) when immh = 001x
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

C7.2.341 USQADD


Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector 
elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the 
destination SIMD&FP register, and accumulates the resulting unsigned integer values with the vector elements of 
the destination SIMD&FP register. 


If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation 
bit FPSR.QC is set. 
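
The saturating behavior described above can be sketched informally in Python. This is an illustration and not part of the architecture specification: usqadd_element is a hypothetical name, and the returned flag stands in for the effect on FPSR.QC.

```python
from typing import Tuple


def usqadd_element(acc: int, addend: int, esize: int) -> Tuple[int, bool]:
    """Model one USQADD element (hypothetical helper).

    acc    -- unsigned element of the destination register
    addend -- element of the source register, given as raw bits and
              interpreted here as a signed two's-complement value
    Returns (result, saturated).
    """
    mask = (1 << esize) - 1
    op2 = acc & mask
    raw = addend & mask
    # Sign-extend the source element.
    op1 = raw - (1 << esize) if raw >> (esize - 1) else raw
    total = op1 + op2
    if total < 0:
        return 0, True          # saturate at the unsigned minimum
    if total > mask:
        return mask, True       # saturate at the unsigned maximum
    return total, False
```

For 8-bit elements, adding the signed value -10 (raw bits 0xF6) to an accumulator of 5 saturates to 0 and reports saturation, while adding it to 100 gives 90 without saturation.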


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


Scalar 


| 31 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9-5 | 4-0 |
| 0  1  | 1  | 1  1  1  1  0  | size  | 1  0  0  0  0  | 0  0  0  1  1  | 1  0  | Rn  | Rd  |
|       | U  |                |       |                |                |       |     |     |


Scalar variant 


USQADD <V><d>, <V><n> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);

integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;

boolean unsigned = (U == '1');

Vector

| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 20 19 18 17 | 16 15 14 13 12 | 11 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  0  0  0  0  | 0  0  0  1  1  | 1  0  | Rn  | Rd  |
|    |    | U  |                |       |                |                |       |     |     |


Vector variant 


USQADD <Vd>.<T>, <Vn>.<T> 


Decode for this encoding 


integer d = UInt(Rd);
integer n = UInt(Rn);


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 


boolean unsigned = (U == '1'); 


Assembler symbols

<V> Is a width specifier, encoded in the "size" field. It can have the following values:
    B when size = 00
    H when size = 01
    S when size = 10
    D when size = 11
<d> Is the number of the SIMD&FP destination register, encoded in the "Rd" field.
<n> Is the number of the SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values:
    8B when size = 00, Q = 0
    16B when size = 00, Q = 1
    4H when size = 01, Q = 0
    8H when size = 01, Q = 1
    2S when size = 10, Q = 0
    4S when size = 10, Q = 1
    2D when size = 11, Q = 1
    The encoding size = 11, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;










C7.2.342 USRA 
Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source 
SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector 
elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The 
results are truncated. For rounded results, see URSRA. 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Scalar 
| 31 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  1  | 1  | 1  1  1  1  1  0  | !=0000 | immb  | 0  0  | 0  | 1  | 0  | 1  | Rn  | Rd  |
|       | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |
Scalar variant 
USRA <V><d>, <V><n>, #<shift> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if immh<3> != '1' then ReservedValue(); 
integer esize = 8 << 3; 
integer datasize = esize; 
integer elements = 1; 
integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');
Vector
| 31 | 30 | 29 | 28 27 26 25 24 23 | 22-19  | 18-16 | 15 14 | 13 | 12 | 11 | 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  1  0  | !=0000 | immb  | 0  0  | 0  | 1  | 0  | 1  | Rn  | Rd  |
|    |    | U  |                   | immh   |       |       | o1 | o0 |    |    |     |     |
Vector variant 
USRA <Vd>.<T>, <Vn>.<T>, #<shift> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
if immh == '0000' then SEE "Advanced SIMD modified immediate";
if immh<3>:Q == '10' then ReservedValue();
integer esize = 8 << HighestSetBit(immh);
integer datasize = if Q == '1' then 128 else 64;
integer elements = datasize DIV esize;
integer shift = (esize * 2) - UInt(immh:immb);
boolean unsigned = (U == '1');
boolean round = (o1 == '1');
boolean accumulate = (o0 == '1');

Assembler symbols

<V> Is a width specifier, encoded in the "immh" field. It can have the following values:
    D when immh = 1xxx
    The encoding immh = 0xxx is reserved.
<d> Is the number of the SIMD&FP destination register, in the "Rd" field.
<n> Is the number of the first SIMD&FP source register, encoded in the "Rn" field.
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.
<T> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values:
    8B when immh = 0001, Q = 0
    16B when immh = 0001, Q = 1
    4H when immh = 001x, Q = 0
    8H when immh = 001x, Q = 1
    2S when immh = 01xx, Q = 0
    4S when immh = 01xx, Q = 1
    2D when immh = 1xxx, Q = 1
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x.
    The encoding immh = 1xxx, Q = 0 is reserved.
<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field.
<shift> For the scalar variant: is the right shift amount, in the range 1 to 64, encoded in the "immh:immb"
    field. It can have the following values:
    (128-UInt(immh:immb)) when immh = 1xxx
    The encoding immh = 0xxx is reserved.

    For the vector variant: is the right shift amount, in the range 1 to the element width in bits, encoded
    in the "immh:immb" field. It can have the following values:
    (16-UInt(immh:immb)) when immh = 0001
    (32-UInt(immh:immb)) when immh = 001x
    (64-UInt(immh:immb)) when immh = 01xx
    (128-UInt(immh:immb)) when immh = 1xxx
    See Advanced SIMD modified immediate on page C4-237 when immh = 0000.


Operation for all encodings 


CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

C7.2.343 USUBL, USUBL2

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second
source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the 
result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are 
unsigned integer values. The destination vector elements are twice as long as the source vector elements. 


The USUBL instruction extracts each source vector from the lower half of each source register, while the USUBL2 
instruction extracts each source vector from the upper half of each source register. 
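
As an informal sketch of the widening subtraction described above, and not part of the architecture specification, the following Python function models USUBL and USUBL2 on registers represented as lists of elements. The name usubl and the list representation (element 0 least significant, 128 bits in total) are assumptions. A negative difference is kept as its two's-complement bit pattern at the doubled element width.

```python
from typing import List


def usubl(n: List[int], m: List[int], esize: int, upper: bool = False) -> List[int]:
    """Model USUBL (upper=False) or USUBL2 (upper=True); hypothetical helper.

    n, m  -- 128-bit source registers as 128 // esize unsigned elements
    esize -- source element size in bits (8, 16 or 32)
    Returns the destination elements, each 2 * esize bits wide.
    """
    lanes = 128 // esize
    if len(n) != lanes or len(m) != lanes:
        raise ValueError("each source must hold 128 // esize elements")
    half = lanes // 2
    # USUBL reads the lower half of each source, USUBL2 the upper half.
    sel = slice(half, lanes) if upper else slice(0, half)
    narrow_mask = (1 << esize) - 1
    wide_mask = (1 << (2 * esize)) - 1
    return [((a & narrow_mask) - (b & narrow_mask)) & wide_mask
            for a, b in zip(n[sel], m[sel])]
```

For 8-bit source elements, 10 - 20 yields 0xFFF6 in the corresponding 16-bit destination element under this model.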


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20-16 | 15 14 | 13 | 12 | 11 10 | 9-5 | 4-0 |
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  | Rm    | 0  0  | 1  | 0  | 0  0  | Rn  | Rd  |
|    |    | U  |                |       |    |       |       | o1 |    |       |     |     |


Three registers, not all the same type variant 


USUBL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 

if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 
integer part = UInt(Q); 
integer elements = datasize DIV esize; 

boolean sub_op = (o1 == '1'); 
boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 

[absent] when Q = 0 
[present] when Q = 1 

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
8H when size = 00 
4S when size = 01 
2D when size = 10 

The encoding size = 11 is reserved. 

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 

<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 

The encoding size = 11, Q = x is reserved. 

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = Vpart[n, part]; 
bits(datasize) operand2 = Vpart[m, part]; 
bits(2*datasize) result; 
integer element1; 
integer element2; 
integer sum; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    if sub_op then 
        sum = element1 - element2; 
    else 
        sum = element1 + element2; 
    Elem[result, e, 2*esize] = sum<2*esize-1:0>; 

V[d] = result; 
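
As an informal illustration of the widening subtract above, the following Python sketch models 128-bit registers as plain integers. The function name usubl and its parameters are hypothetical; the logic follows the pseudocode with sub_op and unsigned both TRUE.

```python
def usubl(vn: int, vm: int, esize: int, part: int = 0) -> int:
    """Sketch of USUBL (part=0) and USUBL2 (part=1) on 128-bit values held as ints."""
    datasize = 64
    elements = datasize // esize
    half_mask = (1 << datasize) - 1
    elem_mask = (1 << esize) - 1
    wide_mask = (1 << (2 * esize)) - 1

    # Vpart[n, part]: lower (part=0) or upper (part=1) 64 bits of each source.
    operand1 = (vn >> (part * datasize)) & half_mask
    operand2 = (vm >> (part * datasize)) & half_mask

    result = 0
    for e in range(elements):
        element1 = (operand1 >> (e * esize)) & elem_mask
        element2 = (operand2 >> (e * esize)) & elem_mask
        # The difference is truncated to 2*esize bits, so a negative value wraps.
        diff = (element1 - element2) & wide_mask
        result |= diff << (e * 2 * esize)
    return result
```

Each result element occupies twice the width of its source elements, which is why a 64-bit half of each source fills the whole 128-bit destination.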
C7.2.344 USUBW, USUBW2 


Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second 
source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the 
result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are 
unsigned integer values. 


The vector elements of the destination register and the first source register are twice as long as the vector elements 
of the second source register. 


The USUBW instruction extracts vector elements from the lower half of the second source register, while the USUBW2 
instruction extracts vector elements from the upper half of the second source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20    16 | 15 14 | 13 | 12 | 11 10 | 9     5 | 4     0 | 
| 0  | Q  | 1  | 0  1  1  1  0  | size  | 1  |    Rm    | 0  0  | 1  | 1  | 0  0  |   Rn    |   Rd    | 
            U                                                     o1 


Three registers, not all the same type variant 


USUBW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size == '11' then ReservedValue(); 


integer esize = 8 << UInt(size); 
integer datasize = 64; 

integer part = UInt(Q); 

integer elements = datasize DIV esize; 


boolean sub_op = (o1 == '1'); 
boolean unsigned = (U == '1'); 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0 
[present] when Q = 1 

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
8H when size = 00 
4S when size = 01 
2D when size = 10 

The encoding size = 11 is reserved. 

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 

<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 

<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 


The encoding size = 11, Q = x is reserved. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(2*datasize) operand1 = V[n]; 
bits(datasize) operand2 = Vpart[m, part]; 
bits(2*datasize) result; 
integer element1; 
integer element2; 
integer sum; 

for e = 0 to elements-1 
    element1 = Int(Elem[operand1, e, 2*esize], unsigned); 
    element2 = Int(Elem[operand2, e, esize], unsigned); 
    if sub_op then 
        sum = element1 - element2; 
    else 
        sum = element1 + element2; 
    Elem[result, e, 2*esize] = sum<2*esize-1:0>; 

V[d] = result; 
C7.2.345 UXTL, UXTL2 


Unsigned extend Long. This instruction copies each vector element from the lower or upper half of the source 
SIMD&FP register into a vector, and writes the vector to the destination SIMD&FP register. The destination vector 
elements are twice as long as the source vector elements. 


The UXTL instruction extracts vector elements from the lower half of the source register, while the UXTL2 instruction 
extracts vector elements from the upper half of the source register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


This instruction is an alias of the USHLL, USHLL2 instruction. This means that: 
• The encodings in this description are named to match the encodings of USHLL, USHLL2. 
• The description of USHLL, USHLL2 gives the operational pseudocode for this instruction. 


| 31 | 30 | 29 | 28 27 26 25 24 23 | 22    19 | 18 16 | 15 14 13 12 11 | 10 | 9     5 | 4     0 | 
| 0  | Q  | 1  | 0  1  1  1  1  0  | != 0000  | 0 0 0 | 1  0  1  0  0  | 1  |   Rn    |   Rd    | 
            U                          immh     immb 


Vector variant 

UXTL{2} <Vd>.<Ta>, <Vn>.<Tb> 

is equivalent to 

USHLL{2} <Vd>.<Ta>, <Vn>.<Tb>, #0 


and is the preferred disassembly when BitCount(immh) == 


Assembler symbols 


2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 


[absent] when Q = 0 
[present] when Q = 1 

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Ta> Is an arrangement specifier, encoded in the "immh" field. It can have the following values: 
8H when immh = 0001 
4S when immh = 001x 
2D when immh = 01xx 
See Advanced SIMD modified immediate on page C4-237 when immh = 0000. 

The encoding immh = 1xxx is reserved. 

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<Tb> Is an arrangement specifier, encoded in the "immh:Q" field. It can have the following values: 
8B when immh = 0001, Q = 0 
16B when immh = 0001, Q = 1 
4H when immh = 001x, Q = 0 
8H when immh = 001x, Q = 1 
2S when immh = 01xx, Q = 0 
4S when immh = 01xx, Q = 1 
See Advanced SIMD modified immediate on page C4-237 when immh = 0000, Q = x. 

The encoding immh = 1xxx, Q = x is reserved. 


Operation 


The description of USHLL, USHLL2 gives the operational pseudocode for this instruction. 
C7.2.346 UZP1 
Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source 
SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the 
lower half of a vector, and the result from the second source register into consecutive elements in the upper half of 
a vector, and writes the vector to the destination SIMD&FP register. 
Note 
This instruction can be used with UZP2 to de-interleave two vectors. 
The following figure shows an example of the operation of UZP1 and UZP2 with the arrangement specifier 8B. 
Vn: A7 A6 A5 A4 A3 A2 A1 A0 
Vm: B7 B6 B5 B4 B3 B2 B1 B0 
UZP1.8, doubleword: Vd = B6 B4 B2 B0 A6 A4 A2 A0 
UZP2.8, doubleword: Vd = B7 B5 B3 B1 A7 A5 A3 A1 
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20    16 | 15 | 14 | 13 12 | 11 10 | 9     5 | 4     0 | 
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 0  |    Rm    | 0  | 0  | 0  1  | 1  0  |   Rn    |   Rd    | 
                                                               op 
Advanced SIMD variant 
UZP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 
Decode for this encoding 
integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 
if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
integer part = UInt(op); 
Assembler symbols 
<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operandl = V[n]; 
bits(datasize) operandh = V[m]; 
bits(datasize) result; 
integer e; 

bits(datasize*2) zipped = operandh:operandl; 
for e = 0 to elements-1 
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize]; 

V[d] = result; 
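
The unzip operation is easier to see on lists than on bit vectors. The following Python sketch is illustrative only: the function name uzp is hypothetical, and registers are modelled as lists with index 0 holding the least significant element.

```python
def uzp(vn, vm, part):
    """Sketch of UZP1 (part=0) and UZP2 (part=1) on element lists.

    vn and vm are equal-length lists; index 0 is the lowest element.
    """
    if len(vn) != len(vm):
        raise ValueError("source vectors must have the same number of elements")
    # zipped = operandh:operandl, so the elements of Vn come first (low end).
    zipped = list(vn) + list(vm)
    return [zipped[2 * e + part] for e in range(len(vn))]
```

For the 8B arrangement this reproduces the figure: part 0 selects the even-numbered elements of Vn followed by those of Vm, and part 1 selects the odd-numbered elements.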
C7.2.347 UZP2 


Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two 
source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower 
half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, 
and writes the vector to the destination SIMD&FP register. 


Note 

This instruction can be used with UZP1 to de-interleave two vectors. 

The following figure shows an example of the operation of UZP1 and UZP2 with the arrangement specifier 8B. 

Vn: A7 A6 A5 A4 A3 A2 A1 A0 
Vm: B7 B6 B5 B4 B3 B2 B1 B0 
UZP1.8, doubleword: Vd = B6 B4 B2 B0 A6 A4 A2 A0 
UZP2.8, doubleword: Vd = B7 B5 B3 B1 A7 A5 A3 A1 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 | 30 | 29 | 28 27 26 25 24 | 23 22 | 21 | 20    16 | 15 | 14 | 13 12 | 11 10 | 9     5 | 4     0 | 
| 0  | Q  | 0  | 0  1  1  1  0  | size  | 0  |    Rm    | 0  | 1  | 0  1  | 1  0  |   Rn    |   Rd    | 
                                                               op 





Advanced SIMD variant 


UZP2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 


if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 

integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 

integer part = UInt(op); 


Assembler symbols 





<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operandl = V[n]; 
bits(datasize) operandh = V[m]; 
bits(datasize) result; 
integer e; 

bits(datasize*2) zipped = operandh:operandl; 
for e = 0 to elements-1 
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize]; 

V[d] = result; 
C7 A64 Advanced SIMD and Floating-point Instruction Descriptions 
C7.2 Alphabetical list of A64 floating-point and Advanced SIMD instructions 

C7.2.348 XTN, XTN2 


Extract Narrow. This instruction reads each vector element from the source SIMD&FP register, narrows each value 
to half the original width, places the result into a vector, and writes the vector to the lower or upper half of the 
destination SIMD&FP register. The destination vector elements are half as long as the source vector elements. 


The XTN instruction writes the vector to the lower half of the destination register and clears the upper half, while the 
XTN2 instruction writes the vector to the upper half of the destination register without affecting the other bits of the 
register. 


Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


| 31 30 29 28 | 27 26 25 24 | 23 22 21 20 | 19 18 17 16 | 15 14 13 12 | 11 10 9 | 5 4 | 0 | 

Vector variant 


XTN{2} <Vd>.<Tb>, <Vn>.<Ta> 


Decode for this encoding 


integer d = UInt(Rd); 
integer n = UInt(Rn); 

if size == '11' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = 64; 
integer part = UInt(Q); 
integer elements = datasize DIV esize; 


Assembler symbols 





2 Is the second and upper half specifier. If present it causes the operation to be performed on the upper 
64 bits of the registers holding the narrower elements, and is encoded in the "Q" field. It can have 
the following values: 

[absent] when Q = 0 
[present] when Q = 1 

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<Tb> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
The encoding size = 11, Q = x is reserved. 

<Vn> Is the name of the SIMD&FP source register, encoded in the "Rn" field. 

<Ta> Is an arrangement specifier, encoded in the "size" field. It can have the following values: 
8H when size = 00 
4S when size = 01 
2D when size = 10 

The encoding size = 11 is reserved. 


Operation 


CheckFPAdvSIMDEnabled64(); 
bits(2*datasize) operand = V[n]; 
bits(datasize) result; 
bits(2*esize) element; 

for e = 0 to elements-1 
    element = Elem[operand, e, 2*esize]; 
    Elem[result, e, esize] = element<esize-1:0>; 
Vpart[d, part] = result; 
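
A rough Python model of the narrowing step and of the two write-back behaviors described above is given below. The function name xtn and the integer register representation are assumptions made for illustration.

```python
def xtn(vd: int, vn: int, esize: int, part: int = 0) -> int:
    """Sketch of XTN (part=0) and XTN2 (part=1) on 128-bit values held as ints.

    vd is the previous destination value; it matters only for XTN2.
    """
    datasize = 64
    elements = datasize // esize
    wide_mask = (1 << (2 * esize)) - 1
    narrow_mask = (1 << esize) - 1

    result = 0
    for e in range(elements):
        element = (vn >> (e * 2 * esize)) & wide_mask
        # Keep only the low esize bits of each source element.
        result |= (element & narrow_mask) << (e * esize)

    if part == 0:
        # XTN: write the lower half and clear the upper half.
        return result
    # XTN2: write the upper half and leave the lower half unchanged.
    return (result << datasize) | (vd & ((1 << datasize) - 1))
```

Narrowing two 16-bit elements 0x1234 and 0xABCD to 8 bits keeps 0x34 and 0xCD.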
C7.2.349 ZIP1 

Zip vectors (primary). This instruction reads adjacent vector elements from the lower half of two source SIMD&FP 
registers as pairs, interleaves the pairs and places them into a vector, and writes the vector to the destination 
SIMD&FP register. The first pair from the first source register is placed into the two lowest vector elements, with 
subsequent pairs taken alternately from each source register. 


Note 

This instruction can be used with ZIP2 to interleave two vectors. 

The following figure shows an example of the operation of ZIP1 and ZIP2 with the arrangement specifier 8B. 

Vn: A7 A6 A5 A4 A3 A2 A1 A0 
Vm: B7 B6 B5 B4 B3 B2 B1 B0 
ZIP1.8, doubleword: Vd = B3 A3 B2 A2 B1 A1 B0 A0 
ZIP2.8, doubleword: Vd = B7 A7 B6 A6 B5 A5 B4 A4 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 
Advanced SIMD variant 

ZIP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 

Decode for this encoding 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
integer part = UInt(op); 
integer pairs = elements DIV 2; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 

integer base = part * pairs; 
integer p; 

for p = 0 to pairs-1 
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize]; 
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize]; 

V[d] = result; 
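
The interleave performed by the loop above can be sketched on Python lists. This is an informal model: the function name zip_vectors is hypothetical, and index 0 of each list is the least significant element.

```python
def zip_vectors(vn, vm, part):
    """Sketch of ZIP1 (part=0) and ZIP2 (part=1) on element lists."""
    if len(vn) != len(vm):
        raise ValueError("source vectors must have the same number of elements")
    pairs = len(vn) // 2
    # base selects the lower half (part=0) or the upper half (part=1).
    base = part * pairs
    result = []
    for p in range(pairs):
        result.append(vn[base + p])  # result element 2*p+0
        result.append(vm[base + p])  # result element 2*p+1
    return result
```

With part equal to 0 the sources' lower halves are interleaved, and with part equal to 1 the upper halves are, so the two results together contain every source element exactly once.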
C7.2.350 ZIP2 

Zip vectors (secondary). This instruction reads adjacent vector elements from the upper half of two source 
SIMD&FP registers as pairs, interleaves the pairs and places them into a vector, and writes the vector to the 
destination SIMD&FP register. The first pair from the first source register is placed into the two lowest vector 
elements, with subsequent pairs taken alternately from each source register. 


Note 

This instruction can be used with ZIP1 to interleave two vectors. 

The following figure shows an example of the operation of ZIP1 and ZIP2 with the arrangement specifier 8B. 

Vn: A7 A6 A5 A4 A3 A2 A1 A0 
Vm: B7 B6 B5 B4 B3 B2 B1 B0 
ZIP1.8, doubleword: Vd = B3 A3 B2 A2 B1 A1 B0 A0 
ZIP2.8, doubleword: Vd = B7 A7 B6 A6 B5 A5 B4 A4 

Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state 
and Exception level, an attempt to execute the instruction might be trapped. 


|31 30 29 28|27 26 25 24|23 22 21 20| 
Advanced SIMD variant 

ZIP2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T> 

Decode for this encoding 

integer d = UInt(Rd); 
integer n = UInt(Rn); 
integer m = UInt(Rm); 

if size:Q == '110' then ReservedValue(); 
integer esize = 8 << UInt(size); 
integer datasize = if Q == '1' then 128 else 64; 
integer elements = datasize DIV esize; 
integer part = UInt(op); 
integer pairs = elements DIV 2; 


Assembler symbols 


<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 

<T> Is an arrangement specifier, encoded in the "size:Q" field. It can have the following values: 
8B when size = 00, Q = 0 
16B when size = 00, Q = 1 
4H when size = 01, Q = 0 
8H when size = 01, Q = 1 
2S when size = 10, Q = 0 
4S when size = 10, Q = 1 
2D when size = 11, Q = 1 

The encoding size = 11, Q = 0 is reserved. 

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Vm> Is the name of the second SIMD&FP source register, encoded in the "Rm" field. 
Operation 


CheckFPAdvSIMDEnabled64(); 
bits(datasize) operand1 = V[n]; 
bits(datasize) operand2 = V[m]; 
bits(datasize) result; 

integer base = part * pairs; 
integer p; 

for p = 0 to pairs-1 
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize]; 
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize]; 

V[d] = result; 
Chapter D1 
The AArch64 System Level Programmers’ Model 

This chapter describes the AArch64 system level programmers’ model. It contains the following sections: 
• Exception levels on page D1-1498. 
• Exception terminology on page D1-1499. 
• Execution state on page D1-1501. 
• Security state on page D1-1502. 
• Virtualization on page D1-1504. 
• Registers for instruction processing and exception handling on page D1-1507. 
• Process state, PSTATE on page D1-1513. 
• Program counter and stack pointer alignment on page D1-1515. 
• Reset on page D1-1517. 
• Exception entry on page D1-1521. 
• Exception return on page D1-1536. 
• The Exception level hierarchy on page D1-1540. 
• Synchronous exception types, routing and priorities on page D1-1547. 
• Asynchronous exception types, routing, masking and priorities on page D1-1555. 
• Configurable instruction enables and disables, and trap controls on page D1-1562. 
• System calls on page D1-1598. 
• Mechanisms for entering a low-power state on page D1-1599. 
• Self-hosted debug on page D1-1604. 
• The Performance Monitors Extension on page D1-1606. 
• Interprocessing on page D1-1607. 
• The effect of implementation choices on the programmers’ model on page D1-1619. 

D1.1 Exception levels 

The ARMv8-A architecture defines a set of Exception levels, EL0 to EL3, where: 
• If ELn is the Exception level, increased values of n indicate increased software execution privilege. 
• Execution at EL0 is called unprivileged execution. 
• EL2 provides support for virtualization of Non-secure operation. 
• EL3 provides support for switching between two Security states, Secure state and Non-secure state. 

An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1. 
EL2 and EL3 are optional. 

Note 

A PE is not required to implement a contiguous set of Exception levels. For example, it is permissible for an 
implementation to include only EL0, EL1, and EL3. 
The effect of implementation choices on the programmers’ model on page D1-1619 shows some example 
implementations. 

When executing in AArch64 state, execution can move between Exception levels only on taking an exception or on 
returning from an exception: 
• On taking an exception, the Exception level can only increase or remain the same. 
• On returning from an exception, the Exception level can only decrease or remain the same. 

The Exception level that execution changes to or remains in on taking an exception is called the target Exception 
level of the exception. 

Each exception type has a target Exception level that is either: 
• Implicit in the nature of the exception. 
• Defined by configuration bits in the System registers. 

An exception cannot target EL0. 
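
The rules stated so far can be collected into a small Python sketch. This is an illustration only, not an architectural definition: the function names and the representation of Exception levels as the integers 0 to 3 are assumptions, and the checks cover only the constraints stated in this section.

```python
MANDATORY_ELS = frozenset({0, 1})
ALL_ELS = frozenset({0, 1, 2, 3})


def valid_implementation(els):
    """EL0 and EL1 are required; EL2 and EL3 are optional and need not be contiguous."""
    els = frozenset(els)
    return MANDATORY_ELS <= els <= ALL_ELS


def may_take_exception(current_el, target_el, implemented=ALL_ELS):
    """On taking an exception the level can only increase or stay the same, and never targets EL0."""
    return (current_el in implemented and target_el in implemented
            and target_el != 0 and target_el >= current_el)


def may_return_from_exception(current_el, target_el, implemented=ALL_ELS):
    """On returning from an exception the level can only decrease or stay the same."""
    return (current_el in implemented and target_el in implemented
            and target_el <= current_el)
```

For example, valid_implementation({0, 1, 3}) holds, matching the Note above, while an exception taken from EL1 cannot have EL0 as its target.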

Exception levels exist within a particular Security state. The ARMv8-A security model on page D1-1502 describes 
this. When executing at an Exception level, the PE can access both of the following: 
• The resources that are available for the combination of the current Exception level and the current Security 
state. 
• The resources that are available at all lower Exception levels, provided that those resources are available to 
the current Security state. 

This means that if the implementation includes EL3, then when execution is at EL3, the PE can access all resources 
available at all Exception levels, for both Security states. 

Each Exception level other than EL0 has its own translation regime and associated control registers. For information 
on the translation regimes, see Chapter D4 The AArch64 Virtual Memory System Architecture. 

D1.1.1 Typical Exception level usage model 
The architecture does not specify what software uses which Exception level. Such choices are outside the scope of 
the architecture. However, the following is a common usage model for the Exception levels: 
EL0 Applications. 
EL1 OS kernel and associated functions that are typically described as privileged. 
EL2 Hypervisor. 
EL3 Secure monitor. 

D1.2 Exception terminology 
The following subsections define the terms used when describing exceptions: 
• Terminology for taking an exception. 
• Terminology for returning from an exception. 
• Exception levels. 
• Definition of a precise exception. 
• Definitions of synchronous and asynchronous exceptions on page D1-1500. 
D1.2.1 Terminology for taking an exception 
An exception is generated when the PE first responds to an exceptional condition. The PE state at this time is the 
state the exception is taken from. The PE state immediately after taking the exception is the state the exception is 
taken to. 
D1.2.2 Terminology for returning from an exception 
To return from an exception, the PE must execute an exception return instruction. The PE state when an exception 
return instruction is committed for execution is the state the exception returns from. The PE state immediately after 
the execution of that instruction is the state the exception returns to. 
D1.2.3 Exception levels 
An Exception level, ELn, with a larger value of n than another Exception level, is described as being a higher 
Exception level than the other Exception level. For example, EL3 is a higher Exception level than EL1. 
An Exception level with a smaller value of n than another Exception level is described as being a lower Exception 
level than the other Exception level. For example, EL0 is a lower Exception level than EL1. 
An Exception level is described as: 
• Using AArch64 when execution in that Exception level is in the AArch64 Execution state. 
• Using AArch32 when execution in that Exception level is in the AArch32 Execution state. 
D1.2.4 Definition of a precise exception 
An exception is described as precise when the exception handler receives the PE state and memory system state that 
is consistent with the PE having executed all of the instructions up to but not including the point in the instruction 
stream where the exception was taken, and none afterwards. 
Other than the SError interrupt, all exceptions taken to AArch64 state are required to be precise. For each 
occurrence of an SError interrupt, whether the interrupt is precise or imprecise is IMPLEMENTATION DEFINED. 
Where a synchronous exception that is taken to AArch64 state is generated as part of an instruction that performs 
more than one single-copy atomic memory access, the definition of precise permits that the values in registers or 
memory affected by the instructions can be UNKNOWN, provided that: 
• The accesses affecting those registers or memory locations do not, themselves, generate exceptions. 
• The registers are not involved in the calculation of the memory address used by the instruction. 

Also, for Data Aborts from load or store instructions executed in AArch64 state, where the Data Abort is taken 
synchronously: 
• If the load or store instruction specifies writeback of a new base address, the base address is restored to the 
original value on taking the exception. 
• If the instruction was a load to either the base address register or the offset register, that register is restored 
to the original value. Any other destination registers become UNKNOWN. 
• If the instruction was a load that does not load the base address register or the offset register, then the 
destination registers become UNKNOWN. 

Examples of instructions that perform more than one single-copy atomic memory access are the AArch32 LDM and 
STM instructions and the AArch64 LDP and STP instructions. 


Note 

• For the definition of a single-copy atomic access, see Properties of single-copy atomic accesses on 
page B2-82. 

D1.2.5 Definitions of synchronous and asynchronous exceptions 


An exception is described as synchronous if all of the following apply: 

• The exception is generated as a result of direct execution or attempted execution of an instruction. 
• The return address presented to the exception handler is guaranteed to indicate the instruction that caused the 
exception. 
• The exception is precise. 

For more information about synchronous exceptions, see Synchronous exception types, routing and priorities on 
page D1-1547. 

An exception is described as asynchronous if any of the following apply: 

• The exception is not generated as a result of direct execution or attempted execution of the instruction stream. 
• The return address presented to the exception handler is not guaranteed to indicate the instruction that caused 
the exception. 
• The exception is imprecise. 

For more information about asynchronous exceptions, see Asynchronous exception types, routing, masking and 
priorities on page D1-1555. 
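
The two definitions above are complementary: an exception is asynchronous when at least one of the three synchronous conditions fails. A minimal Python sketch of that reading follows; the function and parameter names are hypothetical and each condition is reduced to a boolean.

```python
def is_synchronous(from_instruction: bool, return_address_exact: bool, precise: bool) -> bool:
    """All three conditions must hold for a synchronous exception."""
    return from_instruction and return_address_exact and precise


def is_asynchronous(from_instruction: bool, return_address_exact: bool, precise: bool) -> bool:
    """Any failed condition makes the exception asynchronous."""
    return (not from_instruction) or (not return_address_exact) or (not precise)
```

Under this simplification, exactly one of the two functions returns True for any combination of inputs.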

D1.3 Execution state 

The Execution states are: 

AArch64 The 64-bit Execution state. 

AArch32 The 32-bit Execution state. Operation in this state is compatible with ARMv7-A operation. 

Execution state on page A1-33 gives more information about them. 

Exception levels use Execution states. For example, EL0, EL1 and EL2 might all be using AArch32, under EL3 
using AArch64. 

This means that: 

• Different software layers, such as an application, an operating system kernel, and a hypervisor, executing at 
different Exception levels, can execute in different Execution states. 

• The PE can change Execution states only either: 
— At reset. 
— On a change of Exception level. 

Note 

• Typical Exception level usage model on page D1-1498 shows which Exception levels different software 
layers might typically use. 

• The effect of implementation choices on the programmers’ model on page D1-1619 gives information on 
supported configurations of Exception levels and Execution states. 

The interaction between the AArch64 and AArch32 Execution states is called interprocessing. Interprocessing on 
page D1-1607 describes this. 

D1.4 Security state 

The ARMv8-A architecture provides two Security states, each with an associated physical memory address space, 
as follows: 

Secure state When in this state, the PE can access both the Secure physical address space and the 
Non-secure physical address space. 

Non-secure state When in this state, the PE: 
• Can access only the Non-secure physical address space. 
• Cannot access the Secure system control resources. 

For information on how virtual addresses translate onto Secure physical and Non-secure addresses, see About the 
Virtual Memory System Architecture (VMSA) on page D4-1722. 


D1.4.1 The ARMv8-A security model 
The general principles of the ARMv8-A security model are: 


• If the implementation includes EL3 then it has two Security states, Secure and Non-secure, and:
—  EL3 exists only in Secure state.
—  A change from Non-secure state to Secure state can only occur on taking an exception to EL3.
—  A change from Secure state to Non-secure state can only occur on an exception return from EL3.
—  If EL2 is implemented, it exists only in Non-secure state.

• If the implementation does not include EL3 it has one Security state, that is:
—  IMPLEMENTATION DEFINED, if the implementation does not include EL2.
—  Non-secure state if the implementation includes EL2.
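
The principles above can be restated as a small decision function. The following Python sketch is illustrative only: the function names and the event labels are invented for this example and are not architectural names.

```python
def security_states(has_el2, has_el3):
    """Return the Security states of an implementation, per the principles above."""
    if has_el3:
        # Two Security states; EL3 itself exists only in Secure state.
        return ["Secure", "Non-secure"]
    if has_el2:
        # No EL3 but EL2 is present: the single Security state is Non-secure.
        return ["Non-secure"]
    # Neither EL2 nor EL3: the single Security state is IMPLEMENTATION DEFINED.
    return ["IMPLEMENTATION DEFINED"]


def security_state_change_allowed(current, target, event):
    """event is a hypothetical label: 'exception_to_el3' or 'exception_return_from_el3'."""
    if current == "Non-secure" and target == "Secure":
        return event == "exception_to_el3"
    if current == "Secure" and target == "Non-secure":
        return event == "exception_return_from_el3"
    return current == target  # no change of Security state requested
```

For example, `security_states(has_el2=True, has_el3=False)` returns `["Non-secure"]`.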


Security model when EL3 is using AArch64 


Figure D1-1 on page D1-1503 shows the security model when EL3 is using AArch64. The figure shows how
instances of EL0 and EL1 are present in both Security states. It also shows the expected software usage of the
different Exception levels.

[Figure D1-1 is a block diagram. Its content is summarized in the following table.]

Exception level    Non-secure state                              Secure state
EL0                App1, App2 under each Guest OS                Secure App1, Secure App2
                   (AArch32 or AArch64†)                         (AArch32 or AArch64†)
EL1                Guest OS1, Guest OS2 (AArch32 or AArch64‡)    Secure OS (AArch32 or AArch64)
EL2                Hypervisor (AArch32 or AArch64)               -
EL3                -                                             Secure monitor (AArch64)

† AArch64 permitted only if EL1 is using AArch64
‡ AArch64 permitted only if EL2 is using AArch64

Figure D1-1 ARMv8-A security model when EL3 is using AArch64


For an overview of the Security model when EL3 is using AArch32, see Figure G1-1 on page G1-3790. 

D1.5 Virtualization


The support for virtualization described in this section applies only to an implementation that includes EL2. 


EL2 provides a set of features that support virtualizing the Non-secure state of an ARMv8-A implementation. The 
basic model of a virtualized system involves: 


• A hypervisor, running in EL2, that is responsible for switching between virtual machines. A virtual machine
comprises Non-secure EL1 and Non-secure EL0.

• A number of Guest operating systems. A Guest OS runs on a virtual machine in Non-secure EL1.

• For each Guest operating system, applications, that run on the virtual machine of that Guest OS, usually in
Non-secure EL0.





Note 
In some systems, a Guest OS is unaware that it is running on a virtual machine, and is unaware of any other Guest 
OS. In other systems, a hypervisor makes the Guest OS aware of these facts. The ARMv8-A architecture supports 
both of these models. 





The hypervisor assigns a virtual machine identifier (VMID) to each virtual machine. 
EL2 is implemented only in Non-secure state, to support Guest OS management. EL2 provides controls to: 


• Provide virtual values for the contents of a small number of identification registers. A read of one of these
registers by a Guest OS or the applications for a Guest OS returns the virtual value.

• Trap various operations, including memory management operations and accesses to many other registers. A
trapped operation generates an exception that is taken to EL2. See Configurable instruction enables and
disables, and trap controls on page D1-1562.

• Route interrupts to the appropriate one of:
— The current Guest OS.
— A Guest OS that is not currently running.
— The hypervisor.

In Non-secure state:
° The implementation provides an independent translation regime for memory accesses from EL2. 


° For the EL1&0 translation regime, address translation occurs in two stages: 


— Stage 1 maps the virtual address (VA) to an intermediate physical address (IPA). This is managed at 
EL1, usually by a Guest OS. The Guest OS believes that the IPA is the physical address (PA). 


— Stage 2 maps the IPA to the PA. This is managed at EL2. The Guest OS might be completely unaware 
of this stage. 
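
The two translation stages compose: the output of stage 1 is the input of stage 2. The following Python sketch models each stage as a flat page map held in a dictionary. This is a deliberate simplification: real translation tables are multi-level structures, and the 4KB page size, the table contents, and all function names here are assumptions made for illustration.

```python
PAGE_SHIFT = 12  # assumed 4KB pages, for illustration only


def translate(addr, page_map):
    """Look up one stage. page_map maps input page numbers to output page numbers."""
    page = addr >> PAGE_SHIFT
    offset = addr & ((1 << PAGE_SHIFT) - 1)
    if page not in page_map:
        raise LookupError("translation fault at %#x" % addr)
    return (page_map[page] << PAGE_SHIFT) | offset


def va_to_pa(va, stage1, stage2):
    ipa = translate(va, stage1)    # stage 1: VA to IPA, managed at EL1 by the Guest OS
    return translate(ipa, stage2)  # stage 2: IPA to PA, managed at EL2 by the hypervisor


# Hypothetical mappings: guest page 0x400 -> IPA page 0x80 -> PA page 0x9F000.
guest_stage1 = {0x400: 0x80}
hyp_stage2 = {0x80: 0x9F000}
print(hex(va_to_pa(0x400123, guest_stage1, hyp_stage2)))  # 0x9f000123
```

The Guest OS only ever sees `guest_stage1` and the IPA it produces, which matches the statement that the Guest OS might be completely unaware of stage 2.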


For more information on the translation regimes, see Chapter D4 The AArch64 Virtual Memory System Architecture. 


D1.5.1 The effect of implementing EL2 on the Exception model 


An implementation that includes EL2 implements the following exceptions:
• Hypervisor Call (HVC) exception.
• Traps to EL2. EL2 configurable controls on page D1-1571 describes these.
• All of the virtual interrupts:
— Virtual SError.
— Virtual IRQ.
— Virtual FIQ.


Hypervisor call exceptions taken from EL3 are taken to EL3. Otherwise, Hypervisor call exceptions are taken from 
Non-secure state to EL2. 

All virtual interrupts are always taken to Non-secure EL1, and can only be taken from Non-secure EL1 or EL0.
Each of the virtual interrupts can be independently enabled using controls at EL2. 
Each of the virtual interrupts has a corresponding physical interrupt. See Virtual interrupts. 


When a virtual interrupt is enabled, in Non-secure state its corresponding physical exception is taken to EL2, unless 
EL3 has configured that physical exception to be taken to EL3. 


For more information, see Asynchronous exception types, routing, masking and priorities on page D1-1555. 
An implementation that includes EL2 also: 


° Provides controls that can be used to route some synchronous exceptions, taken from Non-secure state, to 
EL2. For more information see: 


— Routing exceptions to EL2 on page D1-1548. 
— Routing debug exceptions on page D2-1631. 


° Provides mechanisms to trap Non-secure PE operations to EL2. See EL2 configurable controls on 
page D1-1571. 


When an operation is trapped to EL2, the hypervisor typically either:
— Emulates the required operation. The application running in the Guest OS is unaware of the trap.
— Returns an error to the Guest OS.


Virtual interrupts 


The virtual interrupts have names that correspond to the physical interrupts, as shown in Table D1-1. 


Table D1-1 The virtual interrupts

Physical interrupt    Corresponding virtual interrupt
SError                Virtual SError
IRQ                   Virtual IRQ
FIQ                   Virtual FIQ





Software executing in EL2 can use virtual interrupts to signal physical interrupts to Non-secure EL1 and Non-secure
EL0. Example D1-1 shows a usage model for virtual interrupts.


Example D1-1 Virtual interrupt usage model 


A virtual interrupt usage model is as follows: 


1. Software executing at EL2 routes a physical interrupt to EL2. 

2. When a physical interrupt of that type occurs, the exception handler executing in EL2 determines whether
the interrupt can be handled in EL2 or requires routing to a Guest OS in EL1. If the interrupt requires routing
to a Guest OS:

• If the Guest OS is currently running, the hypervisor uses the appropriate virtual interrupt type to signal
the physical interrupt to the Guest OS.

• If the Guest OS is not currently running, the physical interrupt is marked as pending for the Guest OS.
When the hypervisor next switches to the virtual machine that is running that Guest OS, the hypervisor
uses the appropriate virtual interrupt type to signal the physical interrupt to the Guest OS.

A hypervisor can prevent Non-secure EL1 and Non-secure EL0 from distinguishing a virtual interrupt from a
physical interrupt.
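
The usage model in Example D1-1 can be sketched as a small bookkeeping structure in the hypervisor. The Python below is a toy model, not hypervisor code: the class, its method names, and the string return values are all invented for illustration, and signalling a virtual interrupt is reduced to appending to a log.

```python
from collections import defaultdict


class HypervisorModel:
    """Toy model of the virtual interrupt usage model (hypothetical names throughout)."""

    def __init__(self):
        self.current_guest = None
        self.pending = defaultdict(list)  # Guest OS -> physical interrupt types awaiting delivery
        self.signalled = []               # log of (Guest OS, virtual interrupt) pairs

    def _signal(self, guest, kind):
        # Use the virtual interrupt type that corresponds to the physical one.
        self.signalled.append((guest, "Virtual " + kind))

    def on_physical_interrupt(self, kind, target_guest=None):
        """Called by the EL2 handler. kind is 'SError', 'IRQ' or 'FIQ'."""
        if target_guest is None:
            return "handled at EL2"
        if target_guest == self.current_guest:
            self._signal(target_guest, kind)
            return "signalled"
        self.pending[target_guest].append(kind)  # Guest OS not running: mark as pending
        return "pending"

    def switch_to(self, guest):
        """Switch virtual machine, then deliver anything marked pending for that Guest OS."""
        self.current_guest = guest
        for kind in self.pending.pop(guest, []):
            self._signal(guest, kind)
```

A short usage sequence: after `hv.switch_to("os1")`, an IRQ for `"os1"` is signalled at once, while a FIQ for `"os2"` stays pending until `hv.switch_to("os2")`.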

For more information see:

• Asynchronous exception types, routing, masking and priorities on page D1-1555.
• Virtual interrupts on page D1-1558.

D1.6 Registers for instruction processing and exception handling

In the ARM architecture, registers fall into two main categories:

• Registers that provide system control or status reporting. These are described in Chapter D7 AArch64 System
Register Descriptions.

• Registers that are used in instruction processing, for example to accumulate a result, and in handling
exceptions. This section introduces these registers, for execution in AArch64 state.

This section contains the following subsections:

• The general purpose registers, R0-R30.
• The stack pointer registers.
• The SIMD and floating-point registers, V0-V31 on page D1-1508.
• Saved Program Status Registers (SPSRs) on page D1-1508.
• Exception Link Registers (ELRs) on page D1-1511.


D1.6.1 The general purpose registers, R0-R30

The general purpose register bank is used when processing instructions in the base instruction set. It comprises 31
general purpose registers, R0-R30.

These registers can be accessed as 31 64-bit registers, X0-X30, or 31 32-bit registers, W0-W30. See Register size
on page C6-432.


For information on the format of these registers, see Registers in AArch64 state on page B1-59. 


D1.6.2 The stack pointer registers 


In AArch64 state, in addition to the general purpose registers, a dedicated stack pointer register is implemented for 
each implemented Exception level. The stack pointer registers are: 


• SP_EL0 and SP_EL1.
• If the implementation includes EL2, SP_EL2.
• If the implementation includes EL3, SP_EL3.


Note 


The four stack pointer register names define an architecture state requirement for four registers. For information on 
how to access these registers, and access restrictions, see Special-purpose registers on page C5-293. 








For information on stack pointer alignment restrictions, see SP alignment checking on page D1-1515. 


Stack pointer register selection

When executing at EL0, the PE uses the EL0 stack pointer, SP_EL0.

When executing at any other Exception level, the PE can be configured to use either SP_EL0 or the stack pointer
for that Exception level, SP_ELx.

By default, taking an exception selects the stack pointer for the target Exception level, SP_ELx. For example, taking
an exception to EL1 selects SP_EL1. Software executing at the target Exception level can then choose to change
the stack pointer to SP_EL0 by updating PSTATE.SP.

This applies even if taking the exception does not change the Exception level. For example, if the PE is executing
at EL1 and the PE is using the SP_EL0 stack pointer, then on taking an exception that targets EL1, the stack pointer
changes to SP_EL1.
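
The selection rules above can be captured in a few lines. The following Python sketch is a toy model: the class and method names are invented for illustration, and the initial value of PSTATE.SP for a PE created above EL0 is an assumption of the model.

```python
class StackPointerModel:
    """Toy model of AArch64 stack pointer selection (illustrative names only)."""

    def __init__(self, el):
        self.el = el
        # PSTATE.SP: 0 selects SP_EL0, 1 selects SP_ELx. Assumed to be 1 above EL0 initially.
        self.pstate_sp = 0 if el == 0 else 1

    def selected_sp(self):
        if self.el == 0 or self.pstate_sp == 0:
            return "SP_EL0"
        return "SP_EL%d" % self.el

    def write_pstate_sp(self, value):
        if self.el == 0:
            raise PermissionError("the stack pointer selection cannot be changed at EL0")
        self.pstate_sp = value & 1

    def take_exception(self, target_el):
        if target_el < max(self.el, 1):
            raise ValueError("exceptions target the same or a higher level, and never EL0")
        self.el = target_el
        self.pstate_sp = 1  # taking an exception always selects SP_ELx for the target level


pe = StackPointerModel(el=1)
pe.write_pstate_sp(0)
print(pe.selected_sp())  # SP_EL0
pe.take_exception(1)     # Exception level unchanged, stack pointer still switches
print(pe.selected_sp())  # SP_EL1
```

The last two calls reproduce the EL1 example in the text: the exception targets the current Exception level, yet the selected stack pointer changes from SP_EL0 to SP_EL1.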


The selected stack pointer can be indicated by a suffix to the Exception level:

t    Indicates use of the SP_EL0 stack pointer.
h    Indicates use of the SP_ELx stack pointer.

Note
The t and h suffixes are based on the terminology of thread and handler, introduced in ARMv7-M.

Table D1-2 shows the set of stack pointer options.

Table D1-2 AArch64 stack pointer options

Exception level    AArch64 stack pointer options
EL0                EL0t
EL1                EL1t, EL1h
EL2                EL2t, EL2h
EL3                EL3t, EL3h

D1.6.3 The SIMD and floating-point registers, V0-V31

The SIMD and floating-point instructions share a common bank of registers for floating-point, vector, and other
SIMD-related scalar operations.

The SIMD and floating-point register bank comprises 32 quadword (128-bit) registers, V0-V31.

These registers can be accessed as:

• 32 doubleword (64-bit) registers, D0-D31.
• 32 word (32-bit) registers, S0-S31.
• 32 halfword (16-bit) registers, H0-H31.
• 32 byte (8-bit) registers, B0-B31.

For information on the format of these registers, see Registers in AArch64 state on page B1-59.

D1.6.4 Saved Program Status Registers (SPSRs)


The Saved Program Status Registers (SPSRs) are used to save PE state on taking exceptions. 


In AArch64 state, there is an SPSR at each Exception level exceptions can be taken to, as follows: 
° SPSR_EL1, for exceptions taken to EL1 using AArch64. 

° If EL2 is implemented, SPSR_EL2, for exceptions taken to EL2 using AArch64. 

° If EL3 is implemented, SPSR_EL3, for exceptions taken to EL3 using AArch64. 





Note 


Exceptions cannot be taken to EL0.





When the PE takes an exception, the PE state is saved from PSTATE in the SPSR at the Exception level the 
exception is taken to. For example, if the PE takes an exception to EL1, the PE state is saved in SPSR_EL1. For 
more information on PSTATE, see Process state, PSTATE on page D1-1513. 


Saving the PE state means the exception handler can: 


° On return from the exception, restore the PE state to the state stored in the SPSR at the Exception level the 
exception is returning from. For example, on returning from EL1, the PE state is restored to the state stored 
in SPSR_EL1. 


° Examine the value that PSTATE had when the exception was taken, for example to determine the Execution 
state and Exception level in which the instruction that caused an exception was executed. 

Note


° All PSTATE fields are saved, including those which have no direct read and write access, and those that are 
meaningful only in AArch32 state. 





° Those PSTATE fields that are meaningful only in AArch32 state are saved when an exception is taken from 
AArch32 state to AArch64 state. 





The SPSRs are UNKNOWN on reset. 


SPSR format for exceptions taken to AArch64 state 


Exceptions can be taken to AArch64 state from AArch64 state or AArch32 state: 





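• For an exception taken to AArch64 state from AArch64 state, the SPSR bit assignments are:

Bits     31  30  29  28  27:22  21  20  19:10  9  8  7  6  5     4     3:0
Field    N   Z   C   V   RES0   SS  IL  RES0   D  A  I  F  RES0  M[4]  M[3:0]

N, Z, C, V are the condition flags. D, A, I, F are the mask bits. M[4] indicates the Execution state.

• For an exception taken to AArch64 state from AArch32 state, the SPSR bit assignments are:

Bits     31  30  29  28  27  26:25    24  23:22  21  20  19:16    15:10    9  8  7  6  5  4     3:0
Field    N   Z   C   V   Q   IT[1:0]  J   RES0   SS  IL  GE[3:0]  IT[7:2]  E  A  I  F  T  M[4]  M[3:0]

N, Z, C, V are the condition flags. A, I, F are the mask bits. M[4] indicates the Execution state, and M[4:0] is
the Mode field.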


The following list describes the bit assignments: 


N, Z, C, V, bits[31:28]
Shows the values of the PSTATE.{N, Z, C, V} condition flags immediately before the exception was
taken.

Bits[27:22]
Reserved, RES0, for exceptions taken from AArch64 state.

For exceptions taken from AArch32 state:
Q, bit[27]
Shows the value of PSTATE.Q immediately before the exception was taken.
IT[1:0], bits[26:25]
See Bits[19:10] on page D1-1510.
J, bit[24]
Shows the value of PSTATE.J immediately before the exception was taken. This bit is
RES0.

For the definitions of the Q, IT, and J fields, see Process state, PSTATE on page G1-3805.


SS, bit[21] The Software Step bit. 


SPSR_ELx.SS is used by a debugger to initiate a Software Step exception. The SS bit also indicates 
which software step state machine state the PE was in. See Software Step exceptions on 
page D2-1673. 


IL, bit[20] Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. See Illegal return events from AArch64 state on page D1-1537. 

Bits[19:10]
Reserved, RES0, for exceptions taken from AArch64 state.

For exceptions taken from AArch32 state:
GE[3:0], bits[19:16]
Shows the value of PSTATE.GE immediately before the exception was taken.
IT[7:2], bits[15:10]
In conjunction with IT[1:0], shows the value of PSTATE.IT before the exception was
taken.

For definitions of the GE and IT fields, see Process state, PSTATE on page G1-3805.

Bit[9]
D, the debug exception mask bit, for exceptions taken from AArch64 state. Shows the value of
PSTATE.D immediately before the exception was taken. See The PSTATE debug mask bit, D on
page D1-1604.

E, for exceptions taken from AArch32 state. Shows the value of PSTATE.E immediately before the
exception was taken. For the definition of the E bit, see Process state, PSTATE on page G1-3805.

A, I, F, bits[8:6]
Shows the values of the PSTATE.{A, I, F} exception mask bits immediately before the exception
was taken:

A, bit[8]    SError interrupt mask bit.
I, bit[7]    IRQ mask bit.
F, bit[6]    FIQ mask bit.

See Asynchronous exception masking on page D1-1557.

Bit[5]
Reserved, RES0, for exceptions taken from AArch64 state.

T, for exceptions taken from AArch32 state. Shows the value of PSTATE.T immediately before the
exception was taken.


M[4:0], bits[4:0]
Mode field.

Note
The name of this field is inherited from ARMv7, where the M field specified the PE mode.

For exceptions taken from AArch64 state:


M[4] The value of this is 0. M[4] encodes the value of PSTATE.nRW, that indicates the 
Execution state from which the exception was taken. 


M[3:0] Encodes the Exception level and the stack pointer register selection, as shown in 
Table D1-3. 


Table D1-3 M[3:0] encodings, for exceptions taken from AArch64 state

M[3:0] (a)    Exception level and stack pointer
0b1101        EL3h
0b1100        EL3t
0b1001        EL2h
0b1000        EL2t
0b0101        EL1h
0b0100        EL1t
0b0000        EL0t

a. All M[3:0] encodings not shown in the table are reserved.

The M[3:0] encoding comprises:

M[3:2]    Encodes the Exception level, 0-3.
M[1]      Reserved, RES0. If set to 1 at the time of an exception return, then that
          exception return is treated as an Illegal Execution state exception return.
M[0]      Selects the SP:
          0    SP_EL0. Indicated by a t suffix on the Exception level.
          1    SP_ELx, where x is the value of M[3:2]. Indicated by an h suffix
               on the Exception level.

See Stack pointer register selection on page D1-1507.

For exceptions taken from AArch32 state:


M[4] The value of this is 1. M[4] encodes the value of PSTATE.nRW, that indicates the 
Execution state from which the exception was taken. 


M[3:0] Encodes the AArch32 mode that the PE was in immediately before the exception was 
taken, as shown in Table D1-4. 


Table D1-4 M[3:0] encodings, for exceptions taken from AArch32 state

M[3:0]    AArch32 PE mode
0b0000    User
0b0001    FIQ
0b0010    IRQ
0b0011    Supervisor
0b0111    Abort
0b1010    Hyp
0b1011    Undefined
0b1111    System





Bits [27:22] and [19:10] of an SPSR are ignored on an exception return to AArch64 state. Bits [23:22] of an SPSR 
are ignored on an exception return to AArch32 state. 
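
As an illustration of how the Mode field is read, the following Python sketch decodes SPSR bits[4:0] using the encodings in Table D1-3 and Table D1-4. The function name and the returned tuple shape are choices made for this example, and returning the string "reserved" for an unlisted encoding is simply how this sketch reports it.

```python
# Table D1-3: M[3:0] for exceptions taken from AArch64 state.
AARCH64_ENCODINGS = {
    0b1101: "EL3h", 0b1100: "EL3t",
    0b1001: "EL2h", 0b1000: "EL2t",
    0b0101: "EL1h", 0b0100: "EL1t",
    0b0000: "EL0t",
}

# Table D1-4: M[3:0] for exceptions taken from AArch32 state.
AARCH32_ENCODINGS = {
    0b0000: "User", 0b0001: "FIQ", 0b0010: "IRQ", 0b0011: "Supervisor",
    0b0111: "Abort", 0b1010: "Hyp", 0b1011: "Undefined", 0b1111: "System",
}


def decode_spsr_mode(spsr):
    """Return (Execution state, mode name) for the M[4:0] field of an SPSR value."""
    m = spsr & 0x1F
    if m & 0x10:  # M[4] == 1: exception was taken from AArch32 state
        return ("AArch32", AARCH32_ENCODINGS.get(m & 0xF, "reserved"))
    # M[4] == 0: exception was taken from AArch64 state
    return ("AArch64", AARCH64_ENCODINGS.get(m & 0xF, "reserved"))


print(decode_spsr_mode(0b00101))  # ('AArch64', 'EL1h')
print(decode_spsr_mode(0b10011))  # ('AArch32', 'Supervisor')
```

For the AArch64 case the same result can be derived arithmetically: M[3:2] gives the Exception level and M[0] selects the t or h suffix, provided M[1] is 0.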


Pseudocode description of SPSR operations 
The SPSR[] pseudocode function accesses the current SPSR, and is common to AArch32 and AArch64 operations. 


The SetPSTATEFromPSR() pseudocode function updates PSTATE from an SPSR. 


D1.6.5 Exception Link Registers (ELRs) 
Exception Link Registers hold preferred exception return addresses. 


Whenever the PE takes an exception, the preferred return address is saved in the ELR at the Exception level the 
exception is taken to. For example, whenever the PE takes an exception to EL1, the preferred return address is saved 
in ELR_EL1. 


On an exception return, the PC is restored to the address stored in the ELR. For example, on returning from EL1, 
the PC is restored to the address stored in ELR_EL1. 


AArch64 state provides an ELR for each Exception level exceptions can be taken to. The ELRs that AArch64 state 
provides are: 


• ELR_EL1, for exceptions taken to EL1.
• If EL2 is implemented, ELR_EL2, for exceptions taken to EL2.
• If EL3 is implemented, ELR_EL3, for exceptions taken to EL3.

On taking an exception from AArch32 state to AArch64 state, bits[63:32] of the ELR are set to zero.


The preferred return address depends on the nature of the exception. For more information, see Preferred exception 
return address on page D1-1521. 
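
The roles of the SPSR and ELR on exception entry and return can be sketched together. The Python below is a toy model under several assumptions: PSTATE is reduced to three fields held in a dictionary, the handler address `vector` is a hypothetical parameter, and the class and method names are invented for this example. It is not the architectural pseudocode.

```python
class ExceptionModel:
    """Toy model of SPSR_ELx and ELR_ELx use when exceptions are taken to AArch64 state."""

    def __init__(self):
        self.pc = 0
        self.pstate = {"EL": 0, "SP": 0, "nRW": 0}  # simplified PSTATE
        self.spsr = {}  # target Exception level -> saved PSTATE
        self.elr = {}   # target Exception level -> preferred return address

    def take_exception(self, target_el, preferred_return, vector):
        # PSTATE is saved in the SPSR of the Exception level the exception is taken to.
        self.spsr[target_el] = dict(self.pstate)
        # From AArch32 state, bits[63:32] of the ELR are set to zero.
        from_aarch32 = self.pstate["nRW"] == 1
        mask = 0xFFFFFFFF if from_aarch32 else 0xFFFFFFFFFFFFFFFF
        self.elr[target_el] = preferred_return & mask
        # The target level uses AArch64 and selects SP_ELx.
        self.pstate.update(EL=target_el, SP=1, nRW=0)
        self.pc = vector  # hypothetical handler address

    def exception_return(self):
        el = self.pstate["EL"]
        self.pstate = dict(self.spsr[el])  # restore PE state from SPSR_ELx
        self.pc = self.elr[el]             # restore the PC from ELR_ELx
```

Taking an exception from EL0 to EL1 and then returning leaves `pstate["EL"]` at 0 and `pc` at the preferred return address that was passed in.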

D1.7 Process state, PSTATE

In the ARMv8-A architecture, Process state or PSTATE is an abstraction of process state information. All of the
instruction sets provide instructions that operate on elements of PSTATE.

PSTATE includes all of the following:
• Fields that are meaningful only in AArch32 state.
• Fields that are meaningful only in AArch64 state.
• Fields that are meaningful in both Execution states.

PSTATE is defined in pseudocode as the PSTATE structure, of type ProcState. ProcState is defined in Chapter J1
ARMv8 Pseudocode.

The PSTATE fields that are meaningful in AArch64 state are:

The condition flags
N        Negative Condition flag.
Z        Zero Condition flag.
C        Carry Condition flag.
V        Overflow Condition flag.
Process state, PSTATE on page B1-61 gives more information about these flags.

The Execution state controls
SS       Software Step bit, see Software Step exceptions on page D2-1673. On a reset or taking
         an exception to AArch64 state, this bit is set to 0.
IL       Illegal Execution state bit, see The Illegal Execution state exception on page D1-1539.
         On a reset or taking an exception to AArch64 state, this bit is set to 0.
nRW      Current Execution state, see Execution state on page D1-1501. This bit is 0 when the
         current Execution state is AArch64. This bit is set to 0:
         • On reset into an Exception level that is using AArch64.
         • On taking an exception to an Exception level that is using AArch64.
EL       Current Exception level, see Exception levels on page D1-1498. On a reset to AArch64
         state, this field holds the encoding for the highest implemented Exception level.
         Note
         The ARM architecture requires that a PE resets into the highest implemented Exception
         level.
SP       Stack pointer register selection bit, see Stack pointer register selection on
         page D1-1507. On a reset or taking an exception to AArch64 state, this bit is set to 1,
         meaning that SP_ELx is selected.

The exception mask bits
D        Debug exception mask bit, see The PSTATE debug mask bit, D on page D1-1604. On a
         reset or taking an exception to AArch64 state, this bit is set to 1.
A, I, F  Asynchronous exception mask bits:
         A    SError interrupt mask bit.
         I    IRQ interrupt mask bit.
         F    FIQ interrupt mask bit.
         See Asynchronous exception types, routing, masking and priorities on page D1-1555.
         On a reset or taking an exception to AArch64 state, each of these bits is set to 1.

D1.7.1 Accessing PSTATE fields

In AArch64 state, PSTATE fields can be accessed using Special-purpose registers that can be directly read using the
MRS instruction, and directly written using the MSR (register) instructions. Table D1-5 shows the Special-purpose
registers that access the PSTATE fields that hold AArch64 state, when the PE is in AArch64 state. All other PSTATE
fields do not have direct read and write access.


Table D1-5 Accessing PSTATE fields using MRS and MSR (register)

Special-purpose register    PSTATE fields
NZCV                        N, Z, C, V
DAIF                        D, A, I, F
CurrentEL                   EL
SPSel                       SP





Software can also use the MSR (immediate) instruction to directly write to PSTATE.{D, A, I, F, SP}. Table D1-6 
shows the MSR (immediate) operands that can directly write to these PSTATE fields when the PE is in AArch64 
state. 


Table D1-6 Accessing PSTATE.{D, A, I, F, SP} using MSR (immediate)

Operand    PSTATE fields    Notes
DAIFSet    D, A, I, F       Directly sets any of the PSTATE.{D, A, I, F} bits to 1
DAIFClr    D, A, I, F       Directly clears any of the PSTATE.{D, A, I, F} bits to 0
SPSel      SP               Directly sets PSTATE.SP to either 1 or 0
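
The effect of the DAIFSet and DAIFClr operands can be modelled as bitwise operations on a 4-bit mask. The following Python sketch relies on two details that come from the A64 instruction and register descriptions rather than from this section: the 4-bit immediate carries D, A, I, F in bits[3:0] in that order, and the DAIF Special-purpose register holds them in bits[9:6]. The function names are invented for this example.

```python
# Immediate bit positions for MSR DAIFSet / MSR DAIFClr (from the A64 instruction encoding).
D, A, I, F = 0b1000, 0b0100, 0b0010, 0b0001


def daif_set(mask, imm4):
    """Model of MSR DAIFSet, #imm4: set the selected mask bits to 1."""
    return mask | (imm4 & 0xF)


def daif_clr(mask, imm4):
    """Model of MSR DAIFClr, #imm4: clear the selected mask bits to 0."""
    return mask & ~(imm4 & 0xF) & 0xF


def daif_register_value(mask):
    """Value MRS would read from DAIF: D, A, I, F occupy bits[9:6]."""
    return mask << 6


mask = 0b1111                # all four exceptions masked, as after a reset
mask = daif_clr(mask, I)     # unmask IRQ only
print(bin(mask))                       # 0b1101
print(hex(daif_register_value(mask)))  # 0x340
```

Because each operand names exactly the bits to change, software can mask or unmask one exception type without a read-modify-write of the DAIF register.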





PSTATE.{N, Z, C, V} can be accessed at EL0. Access to PSTATE.{D, A, I, F} at EL0 using AArch64 depends on
SCTLR_EL1.UMA, see Traps to EL1 of EL0 accesses to the PSTATE.{D, A, I, F} interrupt masks on page D1-1566.
All other PSTATE access instructions can be executed at EL1 or higher and are UNDEFINED at EL0.

Writes to the PSTATE fields have side-effects on various aspects of the PE operation. All of these side-effects are
guaranteed:
• Not to be visible to earlier instructions in the execution stream.
• To be visible to later instructions in the execution stream.


D1.7.2 The Saved Program Status Registers (SPSRs) 


On taking an exception, PSTATE is preserved in the SPSR of the Exception level the exception is taken to. The 
SPSRs are described in Saved Program Status Registers (SPSRs) on page D1-1508. 

D1.8 Program counter and stack pointer alignment


This section contains the following: 
° PC alignment checking. 
° SP alignment checking. 


PC alignment checking 


PC alignment checking generates an exception associated with instruction fetch if, in AArch64 state, there is an
attempt to architecturally execute an instruction that was fetched with a misaligned PC. A misaligned PC is when
bits[1:0] of the PC are not 0b00.


Note 


As with Instruction Aborts, speculative fetching of an instruction does not generate an exception. An exception 
occurs only on an attempt to architecturally execute the instruction. 








If an exception is generated as a result of an instruction fetch at EL0, it is taken to EL1, unless the exception occurs
in Non-secure state and HCR_EL2.TGE bit is 1, when it is taken to EL2 instead. If an exception is generated as a
result of an instruction fetch at any other Exception level, the Exception level is unchanged.


A PC misalignment sets the EC field in the Exception Syndrome Register (ESR) to 0x22, for the ESR associated 
with the target Exception level. 
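
The check and the routing rule described above can be sketched as two small functions. This Python sketch is illustrative only and is not the AArch64.CheckPCAlignment() pseudocode: the function names and the boolean parameters are invented for this example.

```python
EC_PC_ALIGNMENT = 0x22  # EC value reported in the ESR for a PC alignment fault


def pc_is_misaligned(pc):
    """A PC is misaligned when bits[1:0] are not 0b00."""
    return (pc & 0b11) != 0


def pc_alignment_fault_target_el(current_el, non_secure, hcr_el2_tge):
    """Exception level that a PC alignment fault is taken to."""
    if current_el == 0:
        # From EL0 the fault goes to EL1, or to EL2 in Non-secure state with HCR_EL2.TGE == 1.
        return 2 if (non_secure and hcr_el2_tge) else 1
    return current_el  # at any other Exception level, the level is unchanged


print(pc_is_misaligned(0x80002))                         # True
print(pc_alignment_fault_target_el(0, True, True))       # 2
```

In this model the caller would record `EC_PC_ALIGNMENT` in the ESR of the level returned by `pc_alignment_fault_target_el`.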


When the exception is taken to an Exception level using AArch64, the associated Exception Link Register holds the 
entire PC in its misaligned form, as does the FAR_ELx for the Exception level that the exception is taken to. 


Exception return and PC alignment on page D1-1537 gives more information on PC alignment checking associated 
with exception returns. 





Note 


A misalignment of the PC is a common indication of a serious error, for example software corruption of an address. 





The pseudocode function AArch64.CheckPCAlignment() performs PC alignment checking in AArch64 state. When
necessary it calls AArch64.PCAlignmentFault() to generate an exception.


SP alignment checking 


A misaligned stack pointer is where bits[3:0] of the stack pointer are not 0b0000, when the stack pointer is used as 
the base address of the calculation, regardless of any offset applied by the instruction. 


The PE can be configured so that if a load or store instruction uses a misaligned stack pointer, the PE generates an
exception on the attempt to execute the instruction. In this configuration, CheckSPAlignment() performs the stack
pointer check, and calls AArch64.SPAlignmentFault() if a misaligned stack pointer is found.





Note
• As with Data Aborts, a speculative data access to memory using the stack pointer does not generate the
exception. The exception occurs only on an attempt to architecturally execute the instruction.

• Prefetch memory abort instructions do not cause synchronous exceptions. See Prefetch memory on
page C3-156.





Stack pointer alignment checking is only performed in AArch64 state, and can be enabled for each Exception
level as follows:

• SCTLR_EL1.{SA0, SA} controls EL0 and EL1, respectively.
• SCTLR_EL2.SA controls EL2.
• SCTLR_EL3.SA controls EL3.

If an exception is generated as a result of a load or store at EL0, it is taken as an exception to EL1 unless the
HCR_EL2.TGE bit is set in the Non-secure state, when it is taken to EL2. If an exception is generated as a result of
a load or store at any other Exception level, the Exception level is unchanged.


A stack pointer misalignment sets the EC field to 0x26, in the ESR associated with the target Exception level. If 
memory alignment checking and stack pointer alignment checking are enabled, then an SP alignment fault has 
priority in setting the value of the EC field, in the ESR associated with the target Exception level. 


The pseudocode function CheckSPAlignment() performs the stack pointer alignment check. When necessary it calls 
AArch64.SPAlignmentFault() to generate an exception. 
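
A minimal sketch of the stack pointer check and its per-level enables follows. It is written in Python for illustration and is not the CheckSPAlignment() pseudocode: the function names are invented, and the dictionary keys used to represent the SCTLR_ELx enable bits are a convention of this example only.

```python
EC_SP_ALIGNMENT = 0x26  # EC value reported in the ESR for an SP alignment fault


def sp_is_misaligned(sp):
    """The base SP is misaligned when bits[3:0] are not 0b0000, regardless of any offset."""
    return (sp & 0xF) != 0


def sp_check_enabled(el, sctlr):
    """sctlr uses hypothetical keys such as 'EL1.SA0', 'EL1.SA', 'EL2.SA', 'EL3.SA'."""
    key = "EL1.SA0" if el == 0 else "EL%d.SA" % el
    return bool(sctlr.get(key, 0))


def sp_alignment_fault(sp, el, sctlr):
    """True if a load or store using sp as its base address faults at Exception level el."""
    return sp_check_enabled(el, sctlr) and sp_is_misaligned(sp)


sctlr = {"EL1.SA0": 1, "EL1.SA": 0}
print(sp_alignment_fault(0x7FF8, 0, sctlr))  # True: EL0 checking enabled, SP is 8-byte aligned only
print(sp_alignment_fault(0x7FF8, 1, sctlr))  # False: checking is disabled for EL1
```

Only the base register value is tested, so an instruction such as a load from SP plus 8 does not fault when SP itself is 16-byte aligned.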

D1.9 Reset

The ARMv8-A architecture supports the following resets:

Cold reset    Resets all of the logic on which the PE executes, including the integrated debug
              functionality.
              In some contexts, this logic is described as belonging to the Cold reset domain.

Warm reset    Resets the logic on which the PE executes, but does not reset the integrated debug
              functionality.
              In some contexts, this logic is described as belonging to the Warm reset domain.


Note 


The ARMv8-A architecture also supports an external debug reset. See External debug register resets on 
page H8-4981. 








The difference between a Cold reset and a Warm reset is relevant only to the debug functionality and the RMR_ELx 
register, if an RMR_ELx register is implemented: 


° A Warm reset permits debugging across a reset of the PE logic. 
° Writing 1 to RMR_ELx.RR requests a Warm reset. 


The mechanisms, other than RMR_ELx.RR, to assert these resets are IMPLEMENTATION DEFINED. It is 
IMPLEMENTATION DEFINED whether: 


• It is possible to independently assert an External Debug reset and a Cold reset.
• It is possible to assert a Warm reset, as opposed to asserting a Cold reset, other than by the use of
RMR_ELx.RR.

Note

ARM recommends that:


° If separate Core and Debug power domains are implemented, as described in Reset and debug on 
page H6-4955, then a Cold reset can be asserted independently of External Debug reset. 


° A Warm reset can be asserted to permit debugging across a reset of the PE logic. 





This means that an implementation can define other resets according to the requirements the implementation or 
system must fulfil. These other resets are outside the scope of the ARMv8-A architecture. However, they can be 
mapped onto the resets described here. 


In the description that follows, the term reset is used in contexts where there is no difference between the effect of 
a Cold reset and the effect of a Warm reset. 


On a reset, the PE enters the highest implemented Exception level. 
If the highest implemented Exception level can use either Execution state, then: 


• The implementation must include a Reset Management Register (RMR). Only one RMR is implemented. The
RMR implemented is the RMR associated with the highest implemented Exception level.


° On a Cold reset, the Execution state entered is determined by a configuration input signal. 
° On a Warm reset, the Execution state entered is determined by RMR_ELx.AA64. 

If the highest implemented Exception level is configured to use AArch64 state, then on reset: 

° The stack pointer for the highest implemented Exception level, SP_ELx, is selected. 


° Execution starts at an IMPLEMENTATION DEFINED address, anywhere in the physical address range. The 
RVBAR associated with the highest implemented Exception level, RVBAR_EL1, RVBAR_EL2, or 
RVBAR_EL3, holds this address. 
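
The rules above for choosing the Execution state and start address on reset can be summarized in a short function. The Python below is a sketch under stated assumptions: the function and parameter names are invented, the configuration input and RMR_ELx.AA64 are modelled as booleans, and `fixed_aarch64` stands for an implementation whose highest Exception level supports only one Execution state.

```python
def state_after_reset(highest_el, reset_kind, both_states_supported,
                      config_input_aarch64=True, rmr_aa64=True, fixed_aarch64=True):
    """Return a description of where the PE starts after a 'cold' or 'warm' reset."""
    if reset_kind not in ("cold", "warm"):
        raise ValueError("reset_kind must be 'cold' or 'warm'")
    if both_states_supported:
        # Cold reset: a configuration input decides. Warm reset: RMR_ELx.AA64 decides.
        aarch64 = config_input_aarch64 if reset_kind == "cold" else rmr_aa64
    else:
        aarch64 = fixed_aarch64
    result = {"EL": highest_el, "execution_state": "AArch64" if aarch64 else "AArch32"}
    if aarch64:
        result["selected_sp"] = "SP_EL%d" % highest_el           # SP_ELx is selected
        result["start_address_in"] = "RVBAR_EL%d" % highest_el   # holds the reset address
    return result


print(state_after_reset(3, "warm", True, rmr_aa64=True))
```

The AArch32 case returns only the Exception level and Execution state, because the reset behavior of AArch32 state is outside this section.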
The remainder of this section contains the following: 

° PE state on reset to AArch64 state. 

° Code sequence to use RMR_ELx.RR to request a Warm reset on page D1-1519. 

For more information about reset see: 

° Behavior of caches at reset on page D3-1698. 

° TLB behavior at reset on page D4-1813. 

° Reset and debug on page H6-4955. 

D1.9.1 PE state on reset to AArch64 state 
Note 

See the ARM® Generic Interrupt Controller Architecture Specification, GIC architecture version 3.0 and version 4.0 

for the reset requirements for GIC System registers. 

Immediately after a reset, much of the PE state is UNKNOWN. However, some of the PE state is defined. If the PE 

resets to AArch64 state using either a Cold or a Warm reset, the PE state that is defined is as follows: 

° Each of the PSTATE.{D, A, I, F} interrupt masks is set to 1. 

° The Software step control bit, PSTATE.SS, is set to 0. 

° The IL process state bit, PSTATE.IL, is set to 0. 

° All general-purpose, and SIMD and floating-point registers are UNKNOWN. 

° The ELR and SPSR for each Exception level are UNKNOWN. 

. The stack pointer register for each Exception level is UNKNOWN. 

° Unless explicitly defined in this subsection, each System register at each Exception level is in an 
architecturally UNKNOWN state. 

. The TLBs and caches are in an IMPLEMENTATION DEFINED state. This means that the TLBs, the caches, or 
both, might require invalidation using IMPLEMENTATION DEFINED invalidation sequences before the memory 
management system is enabled or Normal memory accesses are permitted to be Cacheable. 

Note 

— On reset, System register Cacheability control fields force all Normal memory accesses to be treated
as Non-cacheable. This applies only for the translation regime used by the Exception level and 
Security state entered on reset. For information about these controls see Enabling and disabling the 
caching of memory accesses on page D3-1696. 

— The implementation might include IMPLEMENTATION DEFINED resets. If it does, each of these resets 
might treat the cache and TLB state differently. The ARMv8-A architecture permits this. 

— Different IMPLEMENTATION DEFINED invalidation sequences might be required for different 
IMPLEMENTATION DEFINED resets. 

— In some implementations, the IMPLEMENTATION DEFINED invalidation sequence might be a NOP. 

° In the SCTLR_ELx for the highest implemented Exception level: 

— Each of the {M, C, I} bits is set to 0 

— The EE bit is set to an IMPLEMENTATION DEFINED value, typically defined by a configuration input. 

° If an RMR is implemented, RMR_ELx.RR is set to 0. ELx in this context is the highest implemented 
Exception level. 

° The enables for the counter event stream are set to 0. This means that the following bits are set to 0:

—  CNTKCTL_EL1.EVNTEN. 
— If the implementation includes EL2, CNTHCTL_EL2.EVNTEN. 
° PMCR_EL0.E is set to 0.

Note

This means the Performance Monitors cannot assert interrupts at reset.

° OSDLR_EL1.DLK bit is set to 0.

° Each of MDCCINT_EL1.{TX, RX} is set to 0.

° EDPRCR.CWRR is set to 0.

° EDPRSR.SR is set to 1.

° If the implementation includes EL3, then each of MDCR_EL3.{EPMAD, EDAD, SPME} is set to 0.

° If the implementation includes EL2, then MDCR_EL2.HPMN is set to the value of PMCR_EL0.N.

° EDESR.OSUC is set to 0, and EDESR.{SS, RC} are set to the values of EDECR.{SS, RCE}.

Note

On an External debug reset, EDECR.{SS, RCE} are set to 0.

Additionally, for a Cold reset into AArch64 state:

° If an RMR is implemented, RMR_ELx.AA64 is set to 1. ELx in this context is the highest implemented
Exception level.

° Each of MDCCSR_EL0.{TXfull, RXfull} is set to 0.

° The DBGPRCR_EL1.CORENPRDRQ is set to the value of EDPRCR.COREPURQ.

An External Debug reset sets EDPRCR.COREPURQ to 0, see External debug register resets on
page H8-4981. If an External Debug reset and a Cold reset coincide, both DBGPRCR_EL1.CORENPRDRQ
and EDPRCR.COREPURQ are reset to 0.

° The debug CLAIM bits are reset to 0.

Note

These are the bits that are set to 1 by writing to DBGCLAIMSET_EL1.CLAIM, and cleared to 0 by writing
to DBGCLAIMCLR_EL1.CLAIM.

° Each of EDSCR.{RXO, TXU, INTdis, TDA, MA, HDE, ERR, RXfull, TXfull} is set to 0.

Note

MDCCSR_EL0.{RXfull, TXfull} reflect the values in EDSCR.{RXfull, TXfull}.

° Each of EDECCR.{NSE, SE} is set to 0.

° OSLSR_EL1.OSLK is set to 1.

° In the EDPRSR:
— The SPMAD, SDAD fields are set to 0.
— The SPD field is set to 1.


D1.9.2 Code sequence to use RMR_ELx.RR to request a Warm reset 


The following assembler sequence uses RMR_ELx.RR to request a Warm reset: 


; in addition, interrupts and debug requests for this core should be disabled
; in the system before running this sequence to ensure the WFI suspends execution

        MOV Wy, #3          ; for AArch64, #2 for AArch32; y is any register
        DSB                 ; ensure all stores etc are complete
        MSR RMR_ELx, Wy     ; request the reset
        ISB                 ; synchronise change to the RMR
Loop
        WFI                 ; enter a quiescent state
        B Loop
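
As an illustration of where the immediates #3 and #2 in the sequence come from, the following Python sketch composes the value written to RMR_ELx from its two fields. It assumes that AA64 is bit[0] and RR is bit[1] of the register, which is consistent with the values used in the sequence. The function name is hypothetical and is not part of the architecture.

```python
# Sketch only: field positions are an assumption consistent with the
# #3 (AArch64) and #2 (AArch32) immediates in the assembler sequence.
RMR_AA64 = 1 << 0   # RMR_ELx.AA64: Execution state entered on the Warm reset
RMR_RR = 1 << 1     # RMR_ELx.RR: writing 1 requests a Warm reset


def rmr_warm_reset_value(to_aarch64: bool) -> int:
    """Return the value to write to RMR_ELx to request a Warm reset."""
    value = RMR_RR
    if to_aarch64:
        value |= RMR_AA64
    return value


print(rmr_warm_reset_value(True))   # value for a Warm reset into AArch64
print(rmr_warm_reset_value(False))  # value for a Warm reset into AArch32
```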


D1.9.3 Pseudocode description of reset 
The AArch64.TakeReset() pseudocode function performs a reset into AArch64 state. 


AArch64.TakeReset() calls the functions AArch64.ResetGeneralRegisters(), AArch64.ResetSIMDFPRegisters(), 
AArch64.ResetSpecialRegisters(), AArch64.ResetSystemRegisters(), and ResetExternalDebugRegisters(). 


AArch64.ResetSystemRegisters() resets all System registers to their reset state as defined in the register descriptions 
in PE state on reset to AArch64 state on page D1-1518 and Chapter D7 AArch64 System Register Descriptions. 


Note 
The AArch64.ResetSystemRegisters() function only resets the System registers. 








ResetExternalDebugRegisters() resets all external debug registers to their reset state as defined in the register 
descriptions in Chapter H9 External Debug Register Descriptions. 
D1.10 Exception entry


Exceptions are targeted at particular Exception levels. The Exception level that an exception targets is either 
programmed by software, or is determined by the nature of the exception. 


Under no circumstances do exceptions cause execution to move to a lower Exception level. 


If an asynchronous exception targets a lower Exception level, the exception is not taken and remains pending. See 
Asynchronous exception routing on page D1-1556 and Asynchronous exception masking on page D1-1557. 


Note 


The construction of the architecture means that usually, it is impossible for an exception to target a lower Exception 
level. 








The Security state can only change on taking an exception if the exception is taken from Non-secure state to EL3. 


Note 
Taking an exception to EL3 from any Exception level has no effect on the value of the SCR_EL3.NS bit. 








On taking an exception to AArch64 state: 


° The PE state is saved in the SPSR_ELx at the Exception level the exception is taken to. See Saved Program 
Status Registers (SPSRs) on page D1-1508. 


° The preferred return address is saved in the ELR_ELx at the Exception level the exception is taken to. See 
Exception Link Registers (ELRs) on page D1-1511. 


° All of PSTATE.{D, A, I, F} are set to 1. See Process state, PSTATE on page D1-1513.


° If the exception is a synchronous exception or an SError interrupt, information characterizing the reason for 
the exception is saved in the ESR_ELx at the Exception level the exception is taken to. See Use of the 
ESR_EL1, ESR_EL2, and ESR_EL3 on page D1-1523. 


° Execution moves to the target Exception level, and starts at the address defined by the exception vector.
Which exception vector is used is also an indicator of whether the exception came from a lower Exception 
level or the current Exception level. See Exception vectors on page D1-1522. 


° The stack pointer register selected is the dedicated stack pointer register for the target Exception level. See 
The stack pointer registers on page D1-1507. 
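
The steps listed above can be sketched as a small Python model. This is an illustration only, not the architectural pseudocode (see AArch64.TakeException() for that). The dictionary layout, the function name, and the use of a PSTATE "SP" value of 1 to represent selection of the dedicated stack pointer are assumptions made for the example.

```python
def take_exception_to_aarch64(pe, target_el, return_address, vector_offset,
                              syndrome=None):
    """Apply the exception-entry state changes to a toy PE model (a dict)."""
    if target_el < pe["pstate"]["EL"]:
        raise ValueError("an exception never moves execution to a lower Exception level")
    pe["spsr"][target_el] = dict(pe["pstate"])  # PE state saved in SPSR_ELx
    pe["elr"][target_el] = return_address       # preferred return address in ELR_ELx
    pe["pstate"].update(D=1, A=1, I=1, F=1)     # all four interrupt masks set
    if syndrome is not None:                    # synchronous exception or SError only
        pe["esr"][target_el] = syndrome
    pe["pstate"]["EL"] = target_el
    pe["pstate"]["SP"] = 1                      # dedicated SP of the target level (assumed encoding)
    pe["pc"] = pe["vbar"][target_el] + vector_offset
    return pe


# Hypothetical example: an exception taken from EL0 to EL1.
pe = {
    "pc": 0x1000,
    "pstate": {"EL": 0, "SP": 0, "D": 0, "A": 0, "I": 0, "F": 0},
    "spsr": {}, "elr": {}, "esr": {},
    "vbar": {1: 0x80000},
}
take_exception_to_aarch64(pe, 1, 0x1004, 0x400, syndrome=0x56000000)
print(hex(pe["pc"]))
```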


The remainder of this section contains the following:

° Preferred exception return address.

° Exception vectors on page D1-1522.

° Pseudocode description of exception entry to AArch64 state on page D1-1523.

° Exception classes and the ESR_ELx syndrome registers on page D1-1523.

° Summary of register updates on faults taken to an Exception level that is using AArch64 on page D1-1535.

D1.10.1 Preferred exception return address 

For an exception taken to an Exception level using AArch64, the Exception Link Register for that Exception level,
ELR_ELx, holds the preferred exception return address. The preferred exception return address depends on the
nature of the exception, as follows:

° For asynchronous exceptions, it is the address of the instruction following the instruction boundary at which
the interrupt occurs. Therefore, it is the address of the first instruction that did not execute, or did not
complete execution, as a result of taking the interrupt.

° For synchronous exceptions other than system calls, it is the address of the instruction that generates the
exception.

° For exception generating instructions, it is the address of the instruction that follows the exception generating
instruction.


Note 
If an exception generating instruction is trapped, disabled, or is UNDEFINED because the Exception level has 
insufficient privilege to execute the instruction, the preferred exception return address is the address of the exception 
generating instruction. 








When an exception is taken from an Exception level using AArch32 to an Exception level using AArch64, the top 
32 bits of the modified ELR_ELx are 0. 
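
The rules for the preferred exception return address can be summarized in a short Python sketch. The enumeration, the function name, and the parameters are hypothetical. The sketch assumes the caller passes the address of the instruction that generates the exception or, for an asynchronous exception, the address of the first instruction that did not execute.

```python
from enum import Enum


class Kind(Enum):
    ASYNCHRONOUS = 1
    SYNCHRONOUS = 2            # synchronous, other than a system call
    EXCEPTION_GENERATING = 3   # exception generating instruction, not trapped or disabled


def preferred_return_address(kind, instr_addr, instr_len=4, from_aarch32=False):
    """Return the value a toy model would place in ELR_ELx."""
    if kind is Kind.EXCEPTION_GENERATING:
        addr = instr_addr + instr_len   # the instruction that follows
    else:
        addr = instr_addr               # the instruction itself
    if from_aarch32:
        addr &= 0xFFFF_FFFF             # top 32 bits of ELR_ELx are 0
    return addr & 0xFFFF_FFFF_FFFF_FFFF


print(hex(preferred_return_address(Kind.EXCEPTION_GENERATING, 0x1000)))
```

A trapped, disabled, or UNDEFINED exception generating instruction would be passed as Kind.SYNCHRONOUS in this model, because its preferred return address is the address of the instruction itself.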


D1.10.2 Exception vectors


When the PE takes an exception to an Exception level that is using AArch64, execution is forced to an address that 
is the exception vector for the exception. The exception vector exists in a vector table at the Exception level the 
exception is taken to. 


A vector table occupies a number of consecutive word-aligned addresses in memory, starting at the vector base 
address. 


Each Exception level has an associated Vector Base Address Register (VBAR), that defines the exception base 
address for the table at that Exception level. 


For exceptions taken to AArch64 state, the vector table provides the following information: 


° Whether the exception is one of the following: 
— Synchronous exception.
— SError.
— IRQ.
— FIQ.


° Information about the Exception level that the exception came from, combined with information about the 
stack pointer in use, and the state of the register file. 


Table D1-7 shows this. 


Table D1-7 Vector offsets from vector table base address

                                                       Offset for exception type
Exception taken from                                   Synchronous  IRQ or vIRQ  FIQ or vFIQ  SError or vSError
Current Exception level with SP_EL0.                   0x000        0x080        0x100        0x180
Current Exception level with SP_ELx, x>0.              0x200        0x280        0x300        0x380
Lower Exception level, where the implemented level     0x400        0x480        0x500        0x580
immediately lower than the target level is using
AArch64. [a]
Lower Exception level, where the implemented level     0x600        0x680        0x700        0x780
immediately lower than the target level is using
AArch32. [a]

a. For exceptions taken to EL3, if EL2 is implemented, the level immediately lower than the target level is EL2 if the exception was taken from
Non-secure state, but EL1 if the exception was taken from Secure EL1 or EL0.
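
As a worked example of Table D1-7, the following Python sketch computes a vector address as the vector base address plus an offset for the source of the exception plus an offset for the exception type. The dictionary keys and the function name are hypothetical labels chosen for the example.

```python
# Offsets taken from Table D1-7; the string keys are illustrative labels.
TYPE_OFFSET = {"synchronous": 0x000, "irq": 0x080, "fiq": 0x100, "serror": 0x180}
SOURCE_OFFSET = {
    "current_sp_el0": 0x000,   # current Exception level with SP_EL0
    "current_sp_elx": 0x200,   # current Exception level with SP_ELx, x>0
    "lower_aarch64": 0x400,    # lower Exception level using AArch64
    "lower_aarch32": 0x600,    # lower Exception level using AArch32
}


def vector_address(vbar: int, source: str, exc_type: str) -> int:
    """Return the exception vector address for a given vector base address."""
    return vbar + SOURCE_OFFSET[source] + TYPE_OFFSET[exc_type]


print(hex(vector_address(0x80000, "lower_aarch64", "fiq")))
```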


Reset is treated as a special vector for the highest implemented Exception level. This special vector uses an 
IMPLEMENTATION DEFINED address that is typically set either by a hardwired configuration of the PE or by 
configuration input signals. The RVBAR_ELx register contains this reset vector address, where x is the number of 
the highest implemented Exception level. 
D1.10.3 Pseudocode description of exception entry to AArch64 state


The AArch64.TakeException() pseudocode function describes the behavior when the PE takes an exception to an 
Exception level that is using AArch64. The AArch64.ExceptionClass() function determines the EC (Exception class) 
and IL (Instruction length) values required to report the exception, and AArch64.ReportException() reports the 
exception. 


The pseudocode functions AArch64.TakeException(), AArch64.ExceptionClass(), and AArch64.ReportException() are 
described in Chapter J1 ARMv8 Pseudocode. 


D1.10.4 Exception classes and the ESR_ELx syndrome registers 


If the exception is a synchronous exception or an SError interrupt, information characterizing the reason for the
exception is saved in the ESR_ELx at the Exception level the exception is taken to. The information saved is
determined at the time the exception is taken, and is not changed as a result of the explicit synchronization that takes
place at the start of taking the exception. See Synchronization requirements for AArch64 System registers on
page D7-1889. The following sections give more information:

° Use of the ESR_EL1, ESR_EL2, and ESR_EL3.
° Reporting the EC encoding when an exception is routed to EL2 on page D1-1535.


Use of the ESR_EL1, ESR_EL2, and ESR_EL3 


An ESR_ELx holds the syndrome information for an exception that is taken to AArch64 state. 


Note 


This use of a syndrome is also the reporting model used for exceptions taken to Hyp mode when they are taken to 
EL2 using AArch32. 








Figure D1-2 shows the general format of the ESR_ELx registers. 





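 31        26  25  24                               0
+------------+----+----------------------------------+
|     EC     | IL |               ISS                |
+------------+----+----------------------------------+

Figure D1-2 Overall format of the ESR_ELx registers

The ESR_ELx fields are: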


EC, bits[31:26] The Exception class field, that indicates the cause of the exception. 


IL, bit[25] The Instruction length bit, for synchronous exceptions, that indicates whether a trapped 
instruction was a 16-bit or a 32-bit instruction. 


ISS, bits[24:0] The Instruction specific syndrome field. Architecturally, this field can be defined 
independently for each defined Exception class. However, in practice, some ISS encodings 
are used for more than one Exception class. 
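
The field boundaries in Figure D1-2 translate directly into shifts and masks. The following Python sketch splits a 32-bit ESR_ELx value into its EC, IL, and ISS fields. The function name and the returned dictionary are hypothetical.

```python
def decode_esr(esr: int) -> dict:
    """Split a 32-bit ESR_ELx value into its three fields."""
    return {
        "EC": (esr >> 26) & 0x3F,    # bits[31:26], Exception class
        "IL": (esr >> 25) & 0x1,     # bit[25], Instruction length
        "ISS": esr & 0x1FF_FFFF,     # bits[24:0], Instruction specific syndrome
    }


fields = decode_esr(0x56000001)
print(fields)
```

In this example the EC field decodes to 0b010101, the value that Table D1-8 assigns to SVC instruction execution in AArch64 state.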


ESR_ELx, Exception Syndrome Register (ELx) on page D7-1933 describes the register in full, including: 
° Listing the valid EC field values. 


° Describing the ISS for each Exception class. 
° Giving a full description of the use of the IL field. 
Table D1-8 shows the encoding of the ESR_ELx.EC field, the Exception class field. For each EC value, the table
references a subsection of the ESR_ELx register definition that describes the ISS format, with links to descriptions 
of possible causes of the exception, for example the configuration required to enable a trap. 


Table D1-8 ESR_ELx.EC field encoding

                                              From state        To Exception level
EC      Exception class                       AArch32  AArch64  EL1  EL2     EL3     ISS encoding description
000000  Unknown reason                        Yes      Yes      Yes  Yes     Yes     ISS encoding for exceptions with an unknown
                                                                                     reason on page D7-1938
000001  Trapped WFI or WFE instruction        Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from a WFI or
        execution [a]                                                                WFE instruction on page D7-1939
000011  Trapped MCR or MRC access with        Yes      No       Yes  Yes     Yes[b]  ISS encoding for an exception from an MCR or
        (coproc==0b1111) [a] that is not                                             MRC access on page D7-1940
        reported using EC 0b000000
000100  Trapped MCRR or MRRC access with      Yes      No       Yes  Yes     Yes[c]  ISS encoding for an exception from an MCRR or
        (coproc==0b1111) [a] that is not                                             MRRC access on page D7-1943
        reported using EC 0b000000
000101  Trapped MCR or MRC access with        Yes      No       Yes  Yes     Yes     ISS encoding for an exception from an MCR or
        (coproc==0b1110) [a]                                                         MRC access on page D7-1940
000110  Trapped LDC or STC access             Yes      No       Yes  Yes     Yes     ISS encoding for an exception from an LDC or
                                                                                     STC instruction on page D7-1945
000111  Access to Advanced SIMD or            Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from an access
        floating-point functionality trapped                                         to an Advanced SIMD or floating-point
        by CPACR_EL1.FPEN or CPTR_ELx.TFP                                            register, resulting from CPACR_EL1.FPEN or
        control [d]                                                                  CPTR_ELx.TFP on page D7-1947
001000  Trapped VMRS access, from ID group    Yes      No       No   Yes     No      ISS encoding for an exception from an MCR or
        traps, that is not reported using                                            MRC access on page D7-1940
        EC 0b000111 [e]
001100  Trapped MRRC access with              Yes      No       Yes  Yes     Yes     ISS encoding for an exception from an MCRR or
        (coproc==0b1110)                                                             MRRC access on page D7-1943
001110  Illegal Execution state               Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from an Illegal
                                                                                     Execution state, or a PC or SP alignment
                                                                                     fault on page D7-1948
010001  SVC instruction execution in          Yes      No       Yes  Yes[f]  No      ISS encoding for an exception from HVC or
        AArch32 state                                                                SVC instruction execution on page D7-1948
010010  HVC instruction execution in          Yes      No       No   Yes     No      ISS encoding for an exception from HVC or
        AArch32 state, when HVC is not                                               SVC instruction execution on page D7-1948
        disabled
010011  SMC instruction execution in          Yes      No       No   Yes[g]  Yes     ISS encoding for an exception from SMC
        AArch32 state, when SMC is not                                               instruction execution in AArch32 state on
        disabled                                                                     page D7-1949
Table D1-8 ESR_ELx.EC field encoding (continued)

                                              From state        To Exception level
EC      Exception class                       AArch32  AArch64  EL1  EL2     EL3     ISS encoding description
010101  SVC instruction execution in          No       Yes      Yes  Yes     Yes     ISS encoding for an exception from HVC or
        AArch64 state                                                                SVC instruction execution on page D7-1948
010110  HVC instruction execution in          No       Yes      No   Yes     Yes     ISS encoding for an exception from HVC or
        AArch64 state, when HVC is not                                               SVC instruction execution on page D7-1948
        disabled
010111  SMC instruction execution in          No       Yes      No   Yes[g]  Yes     ISS encoding for an exception from SMC
        AArch64 state, when SMC is not                                               instruction execution in AArch64 state on
        disabled                                                                     page D7-1951
011000  Trapped MSR, MRS, or System           No       Yes      Yes  Yes     Yes     ISS encoding for an exception from MSR, MRS,
        instruction execution, that is not                                           or System instruction execution in AArch64
        reported using EC 0x00, 0x01, or 0x07                                        state on page D7-1951
011111  IMPLEMENTATION DEFINED exception      Yes      Yes      No   No      Yes     ISS encoding for an IMPLEMENTATION DEFINED
        taken to EL3                                                                 exception to EL3 on page D7-1953
100000  Instruction Abort from a lower        Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from an
        Exception level [h]                                                          Instruction Abort on page D7-1953
100001  Instruction Abort taken without a     Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from an
        change in Exception level [h]                                                Instruction Abort on page D7-1953
100010  PC alignment fault                    Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from an Illegal
                                                                                     Execution state, or a PC or SP alignment
                                                                                     fault on page D7-1948
100100  Data Abort from a lower Exception     Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from a Data
        level [i]                                                                    Abort on page D7-1955
100101  Data Abort taken without a change     Yes      Yes      Yes  Yes     Yes     ISS encoding for an exception from a Data
        in Exception level [i]                                                       Abort on page D7-1955
100110  SP alignment fault                    No       Yes      Yes  Yes     Yes     ISS encoding for an exception from an Illegal
                                                                                     Execution state, or a PC or SP alignment
                                                                                     fault on page D7-1948
101000  Trapped floating-point exception      Yes      No       Yes  Yes     No      ISS encoding for an exception from a trapped
        taken from AArch32 state                                                     floating-point exception on page D7-1959
101100  Trapped floating-point exception      No       Yes      Yes  Yes     Yes     ISS encoding for an exception from a trapped
        taken from AArch64 state                                                     floating-point exception on page D7-1959
101111  SError interrupt                      Yes      Yes      Yes  Yes     Yes     ISS encoding for an SError interrupt on
                                                                                     page D7-1961
Table D1-8 ESR_ELx.EC field encoding (continued)

                                              From state        To Exception level
EC      Exception class                       AArch32  AArch64  EL1  EL2     EL3     ISS encoding description
110000  Breakpoint exception from a lower     Yes      Yes      Yes  Yes[j]  No      ISS encoding for an exception from a
        Exception level                                                              Breakpoint or Vector Catch debug exception
                                                                                     on page D7-1961
110001  Breakpoint exception taken without    Yes      Yes      Yes  Yes[j]  No      ISS encoding for an exception from a
        a change in Exception level                                                  Breakpoint or Vector Catch debug exception
                                                                                     on page D7-1961
110010  Software Step exception from a        Yes      Yes      Yes  Yes[j]  No      ISS encoding for an exception from a
        lower Exception level                                                        Software Step exception on page D7-1962
110011  Software Step exception taken         Yes      Yes      Yes  Yes[j]  No      ISS encoding for an exception from a
        without a change in Exception                                                Software Step exception on page D7-1962
        level
110100  Watchpoint exception from a lower     Yes      Yes      Yes  Yes[j]  No      ISS encoding for an exception from a
        Exception level                                                              Watchpoint exception on page D7-1963
110101  Watchpoint exception taken without    Yes      Yes      Yes  Yes[j]  No      ISS encoding for an exception from a
        a change in Exception level                                                  Watchpoint exception on page D7-1963
111000  BKPT instruction execution in         Yes      No       Yes  Yes[j]  No      ISS encoding for an exception from execution
        AArch32 state                                                                of a Breakpoint instruction on page D7-1963
111010  Vector Catch exception from           Yes      No       No   Yes[j]  No      ISS encoding for an exception from a
        AArch32 state                                                                Breakpoint or Vector Catch debug exception
                                                                                     on page D7-1961
111100  BRK instruction execution in          No       Yes      Yes  Yes[j]  Yes[k]  ISS encoding for an exception from execution
        AArch64 state                                                                of a Breakpoint instruction on page D7-1963

a. Exceptions caused by configurable traps, enables, or disables.
b. See Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590.
c. Only for MCRR or MRRC accesses to the PMCCNTR_EL0 or PMCCNTR.
d. Excludes exceptions that are generated because the value of HCR_EL2.TGE is 1.
e. Applies only to traps of accesses to MVFR0, MVFR1, MVFR2, or FPSID. Includes traps of VMRS accesses. Because the MVFRn registers
are read-only and a VMSR access to the FPSID is ignored and not trapped, there are no MCR or VMSR accesses that can be trapped with this EC
value.
f. Only as a result of HCR_EL2.TGE.
g. Only as a result of HCR_EL2.TSC.
h. Used for MMU faults generated by instruction accesses, and for synchronous external aborts, including synchronous parity or ECC errors.
Not used for debug-related exceptions.
i. Used for MMU faults generated by data accesses, Alignment faults other than SP alignment faults and PC alignment faults, and for
synchronous external aborts, including synchronous parity or ECC errors. Not used for debug-related exceptions.
j. Only as a result of HCR_EL2.TGE == 1 or MDCR_EL2.TDE == 1.
k. Only if the BRK instruction is executed in EL3. This is the only debug exception that can be taken to EL3 when EL3 is using AArch64.

Reserved EC values
For EC values not shown in Table D1-8 on page D1-1524: 


° Unused EC values in the range 0b000000-0b101100 (0x00-0x2C) are reserved by ARM for future use for 
synchronous exceptions. 


° Unused EC values in the range 0b101101-0b111111 (0x2D-0x3F) are reserved by ARM for future use, and might
be used for synchronous or asynchronous exceptions. 
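
The two reserved ranges can be expressed as a simple classification. The following Python sketch is an illustration only. It assumes that the set of allocated values is exactly the set of EC values listed in Table D1-8, and the function name and returned strings are hypothetical.

```python
# EC values that appear in Table D1-8 (assumed complete for this sketch).
ALLOCATED_EC = {
    0b000000, 0b000001, 0b000011, 0b000100, 0b000101, 0b000110,
    0b000111, 0b001000, 0b001100, 0b001110, 0b010001, 0b010010,
    0b010011, 0b010101, 0b010110, 0b010111, 0b011000, 0b011111,
    0b100000, 0b100001, 0b100010, 0b100100, 0b100101, 0b100110,
    0b101000, 0b101100, 0b101111, 0b110000, 0b110001, 0b110010,
    0b110011, 0b110100, 0b110101, 0b111000, 0b111010, 0b111100,
}


def classify_ec(ec: int) -> str:
    """Classify a 6-bit EC value as allocated or as one of the reserved kinds."""
    if not 0 <= ec <= 0x3F:
        raise ValueError("EC is a 6-bit field")
    if ec in ALLOCATED_EC:
        return "allocated"
    if ec <= 0x2C:
        return "reserved, synchronous"
    return "reserved, synchronous or asynchronous"


print(classify_ec(0b000010))
```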


Exceptions with an unknown reason 


These are the exceptions reported with an EC value of 0b000000.

This encoding reports an exception with an unknown reason.

When ESR_ELx.EC returns a value of 0x00, all other fields of ESR_ELx are invalid, and defined as follows:
° IL is set to 1.
° ISS[24:0] is RES0.
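
As a worked example, the ESR_ELx value for an exception with an unknown reason can be composed from these field values. The following Python sketch assumes that the RES0 ISS field reads as zero, and the function name is hypothetical.

```python
def unknown_reason_esr() -> int:
    """Compose ESR_ELx for EC 0b000000: IL is 1 and ISS is taken as zero."""
    ec, il, iss = 0b000000, 1, 0
    return (ec << 26) | (il << 25) | iss


print(hex(unknown_reason_esr()))
```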


An exception with an unknown reason occurs for the following reasons: 
° The attempted execution of an instruction bit pattern that has no allocated instruction at the current Exception
level and Security state, including:

— A read access using a System register pattern that is not allocated for reads at the current Exception
level and Security state.

— A write access using a System register pattern that is not allocated for writes at the current Exception
level and Security state.

— Instruction encodings for instructions that are not implemented.

° In Debug state, the attempted execution of an instruction bit pattern that is unallocated in Debug state.

° In Non-debug state, the attempted execution of an instruction bit pattern that is unallocated in Non-debug 
state. 

° In AArch32 state, attempted execution of a short vector floating-point instruction. 


° An exception generated by any of the SCTLR_EL1.{ITD, SED, CP15BEN} control bits. 


° Attempted execution of:
— An HVC instruction when disabled by HCR_EL2.HCD or SCR_EL3.HCE.
— An SMC instruction when disabled by SCR_EL3.SMD.
— An HLT instruction when disabled by EDSCR.HDE.

° Attempted execution of an MSR or MRS to SP_EL0 when the value of SPSel.SP is 0.

° Attempted execution, in Debug state, of:
— A DCPS1 instruction in Non-secure state from EL0 when the value of HCR_EL2.TGE is 1.

— A DCPS2 instruction from EL1 or EL0 when the value of SCR_EL3.NS is 0, or when EL2 is not
implemented.

— A DCPS3 instruction when the value of EDSCR.SDD is 1, or when EL3 is not implemented.


° When EL3 is using AArch64, attempted execution of an SRS instruction using R13_mon from Secure EL1. 
See Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590. 


° In Debug state when the value of EDSCR.SDD is 1, the attempted execution at EL2, EL1, or EL0 of an
instruction that is configured to trap to EL3. 


° In AArch32 state, the attempted execution of an MRS (Banked register) or an MSR (Banked register) instruction 
to SPSR_mon, SP_mon, or LR_mon. 
° An exception that is taken to EL2 because the value of HCR_EL2.TGE is 1 that, if the value of
HCR_EL2.TGE was 0, would have been reported with an ESR_ELx.EC value of 0x07.


Exception from a WFI or WFE instruction, from AArch32 or AArch64 state 





This is the exception syndrome with EC value 0b000001. 


This reports exceptions from WFI or WFE instructions executed in either Execution state that result from configurable 
traps, enables, or disables. 


The returned syndrome indicates whether the trapped instruction was a WFI or a WFE. ISS encoding for an exception
from a WFI or WFE instruction on page D7-1939 describes the format of this syndrome.

The following sections describe configuration settings for generating these exceptions:

° Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565.

° Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page D1-1581.

° Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590.


Exception from an MCR or MRC access from AArch32 state 


These are the exception syndromes with the following EC values: 

° 0b000011, MRC or MCR access to a System register in the (coproc==0b1111) encoding space. 
° 0b000101, MRC or MCR access to a System register in the (coproc==0b1110) encoding space. 
° 0b001000, VMRS System register access. 


These report exceptions from MRC, MCR, or VMRS instructions executed in AArch32 state that result from configurable 
traps, enables, or disables and are not reported using the EC code of 0b000000. 


The returned syndrome indicates whether the instruction was an MRC or an MCR, and the instruction arguments. ISS
encoding for an exception from an MCR or MRC access on page D7-1940 describes the format of this syndrome.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000011: 


° Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569.
° Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570.
° Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers on page D1-1573.
° Traps to EL2 of Non-secure EL1 execution of TLB maintenance instructions on page D1-1574.
° Traps to EL2 of Non-secure EL0 and EL1 execution of cache maintenance instructions on page D1-1575.
° Traps to EL2 of Non-secure EL1 accesses to the Auxiliary Control Register on page D1-1576.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on
page D1-1577.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578.
° Trapping to EL2 of Non-secure EL1 accesses to the CPACR_EL1 or CPACR on page D1-1582.
° General trapping to EL2 of Non-secure EL0 and EL1 accesses to System registers, from AArch32 state only
on page D1-1584.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers on page D1-1587.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers on page D1-1588.
° Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590.
° Trapping to EL3 of EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the
CPACR_EL1 or CPACR on page D1-1593.
° Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on page D1-1597.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000101: 


° Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567.
° Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578, for trapped accesses
to the JIDR.
° Traps to EL2 of Non-secure System register accesses to the trace registers on page D1-1583.
° Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
° Trapping System register accesses to powerdown debug registers to EL2 on page D1-1586.
° Trapping general System register accesses to debug registers to EL2 on page D1-1586.
° Traps to EL3 of all System register accesses to the trace registers on page D1-1594.
° Trapping System register accesses to powerdown debug registers to EL3 on page D1-1595.
° Trapping general System register accesses to debug registers to EL3 on page D1-1596.

Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578 describes configuration
settings for generating exceptions that are reported using EC value 0b001000.


Exception from an MCRR or MRRC access from AArch32 state 


These are the exception syndromes with the following EC values: 
° 0b000100, MRRC or MCRR access to a System register in the (coproc==0b1111) encoding space.
° 0b001100, MRRC access to a System register in the (coproc==0b1110) encoding space.

These report exceptions from MCRR or MRRC instructions executed in AArch32 state that result from configurable
traps, enables, or disables and are not reported using the EC code of 0x00.

The returned syndrome indicates whether the instruction was an MCRR or an MRRC, and the instruction arguments. ISS
encoding for an exception from an MCRR or MRRC access on page D7-1943 describes the format of this syndrome.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000100: 


° Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569.
° Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570.
° Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers on page D1-1573.
° General trapping to EL2 of Non-secure EL0 and EL1 accesses to System registers, from AArch32 state only
on page D1-1584.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers on page D1-1587.
° Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers on page D1-1588.
° Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on page D1-1597.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b001100: 


° Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567.
° Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
° Traps to EL2 of Non-secure System register accesses to the trace registers on page D1-1583.
° Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
° Traps to EL3 of all System register accesses to the trace registers on page D1-1594.
° Trapping System register accesses to powerdown debug registers to EL3 on page D1-1595.


Exception from an LDC or STC access to a System register

This is the exception syndrome with EC value 0b000110.


This reports exceptions from LDC or STC instructions executed in AArch32 state that result from configurable traps, 
enables, or disables. 


The returned syndrome indicates whether the instruction was an LDC or an STC, and the instruction arguments. ISS
encoding for an exception from an LDC or STC instruction on page D7-1945 describes the format of this syndrome.


Note 
The only architected uses of these instructions are: 
° An STC to write to memory from DBGDTRRXint. 
° An LDC to read from memory to DBGDTRTXint. 








The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000110: 


° Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
° Trapping general System register accesses to debug registers to EL2 on page D1-1586.
° Trapping general System register accesses to debug registers to EL3 on page D1-1596.


Exception from an access to an Advanced SIMD or floating-point register, from either Execution
state, resulting from CPACR_EL1.FPEN or CPTR_ELx.TFP


This is the exception syndrome with EC value 0b000111. 


Note 


If HCR_EL2.TGE is 1, these exceptions are reported with EC value 0b000000 instead of 0b000111. Reporting the EC 
encoding when an exception is routed to EL2 on page D1-1535 describes this. 








It reports exceptions from accesses to the Advanced SIMD and floating-point register bank, or to SIMD and 
floating-point System registers, when CPACR_EL1.FPEN != 0b11 or CPTR_ELx.TFP == 1.


These are taken from either Execution state. 


ISS encoding for an exception from an access to an Advanced SIMD or floating-point register, resulting from 
CPACR_EL1.FPEN or CPTR_ELx.TFP on page D7-1947 describes the format of the returned syndrome.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000111: 


° Traps to EL1 of EL0 and EL1 accesses to SIMD and floating-point functionality on page D1-1568.
° General trapping to EL2 of Non-secure accesses to the SIMD and floating-point registers on page D1-1583.
° Traps to EL3 of all System register accesses to the trace registers on page D1-1594.


Exception from an Illegal Execution state, PC alignment fault, or SP alignment fault 


These are the exception syndromes with the following EC values: 
° 0b001110, Illegal Execution state. 

° 0b100010, PC alignment fault. 

° 0b100110, SP alignment fault.





When EC returns one of these values, the ISS field does not return any syndrome information and the ISS
field is RES0.







There are no configuration settings for generating Illegal Execution state exceptions and PC alignment fault 
exceptions. See The Illegal Execution state exception on page D1-1539 and PC alignment checking on
page D1-1515. SP alignment checking on page D1-1515 describes the configuration settings for generating SP
alignment fault exceptions.


Exception from HVC or SVC instruction execution 


These are the exception syndromes with the following EC values: 
° 0b010001, SVC instruction executed in AArch32 state. 


° 0b010010, HVC instruction executed, when not disabled, in AArch32 state. 
° 0b010101, SVC instruction executed in AArch64 state. 
° 0b010110, HVC instruction executed, when not disabled, in AArch64 state. 


The returned syndrome indicates the immediate value given as an instruction argument. ISS encoding for an
exception from HVC or SVC instruction execution on page D7-1948 describes the format of this syndrome. 


Note 


In AArch32 state, the HVC instruction is unconditional, and a conditional SVC instruction generates an exception only 
if it passes its condition code check. Therefore, the syndrome information for these exceptions does not include 
conditionality information. 








See System calls on page D1-1598. 
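
As an illustrative sketch only, the following C fragment shows how handler software might recover the immediate from a syndrome value reported with one of the four EC values above. The field positions used here (EC in ESR_ELx[31:26], ISS in ESR_ELx[24:0], and the immediate in ISS[15:0]) are assumptions taken from the ESR_ELx and ISS encoding descriptions rather than from this section, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed ESR_ELx layout: EC in bits [31:26], ISS in bits [24:0]. */
static inline uint32_t esr_ec(uint32_t esr)  { return (esr >> 26) & 0x3Fu; }
static inline uint32_t esr_iss(uint32_t esr) { return esr & 0x01FFFFFFu; }

/* Returns the immediate for an SVC or HVC syndrome, or -1 for any other EC. */
static inline int32_t svc_hvc_imm16(uint32_t esr)
{
    switch (esr_ec(esr)) {
    case 0x11: /* 0b010001, SVC executed in AArch32 state */
    case 0x12: /* 0b010010, HVC executed in AArch32 state */
    case 0x15: /* 0b010101, SVC executed in AArch64 state */
    case 0x16: /* 0b010110, HVC executed in AArch64 state */
        return (int32_t)(esr_iss(esr) & 0xFFFFu); /* assumed imm16 in ISS[15:0] */
    default:
        return -1;
    }
}
```

A handler would typically read ESR_ELx at the Exception level the exception is taken to and pass the low 32 bits to `svc_hvc_imm16()`.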


Exception from SMC instruction execution in AArch32 state 
This is the exception syndrome with EC value 0b010011. 
This reports the exception from an SMC that is not disabled and is executed in AArch32 state. 


ISS encoding for an exception from SMC instruction execution in AArch32 state on page D7-1949 describes the 
format of this syndrome. 


Note 


ISS encoding for an exception from SMC instruction execution in AArch32 state on page D7-1949 describes 
ISS[24:0] for each of the following cases: 


° When an SMC instruction completes normally and generates an exception that is taken to EL3. 
° When an SMC instruction is trapped to EL2 from Non-secure EL1 because HCR_EL2.TSC is set to 1. 








Traps to EL2 of Non-secure EL1 execution of SMC instructions on page D1-1578 describes the configuration
settings for trapping SMC instructions to EL2. 


Exception from SMC instruction execution in AArch64 state 
This is the exception syndrome with EC value 0b010111. 
This reports the exception from an SMC that is not disabled and is executed in AArch64 state. 


The returned syndrome indicates the immediate value given as an instruction argument. ISS encoding for an
exception from SMC instruction execution in AArch64 state on page D7-1951 describes the format of this syndrome. 











Note 
The value of ISS[24:0] described here is used both: 
° When an SMC instruction is trapped from Non-secure EL1. 
° When an SMC instruction is not trapped, so completes normally and generates an exception that is taken to
EL3.

Traps to EL2 of Non-secure EL1 execution of SMC instructions on page D1-1578 describes the configuration
settings for trapping SMC instructions to EL2. 


Exception from MSR, MRS, or System instruction execution in AArch64 state 
This is the exception syndrome with the EC value 0b011000. 


These report exceptions from MSR, MRS, or System instructions executed in AArch64 state that result from 
configurable traps, enables, or disables and are not reported using the EC codes of 0b000000, 0b000001, or 0b000111. 


Note 


The System instruction class encoding space on page C5-270 identifies the System instructions referred to in this 
description. 








The returned syndrome indicates whether the instruction was an MSR or an MRS, and the instruction arguments. ISS
encoding for an exception from MSR, MRS, or System instruction execution in AArch64 state on page D7-1951 
describes the format of this syndrome. 


For exceptions caused by System instructions, see System on page C4-199 for the instruction arguments returned in 
the syndrome. 
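
The following C fragment is a minimal sketch of how software might unpack the instruction arguments from an ISS value reported with EC value 0b011000. The bit positions (Op0 in ISS[21:20], Op2 in ISS[19:17], Op1 in ISS[16:14], CRn in ISS[13:10], Rt in ISS[9:5], CRm in ISS[4:1], and the direction in ISS[0]) are assumptions taken from the ISS encoding description, not from this section. The structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded instruction arguments. */
struct sysreg_access {
    uint8_t op0, op1, op2, crn, crm;
    uint8_t rt;       /* general-purpose register number used for the transfer */
    bool    is_read;  /* true for a read (MRS), false for a write (MSR) */
};

/* Unpacks an ISS value, using the assumed field positions noted above. */
static inline struct sysreg_access decode_sysreg_iss(uint32_t iss)
{
    struct sysreg_access a;
    a.op0     = (uint8_t)((iss >> 20) & 0x3u);
    a.op2     = (uint8_t)((iss >> 17) & 0x7u);
    a.op1     = (uint8_t)((iss >> 14) & 0x7u);
    a.crn     = (uint8_t)((iss >> 10) & 0xFu);
    a.rt      = (uint8_t)((iss >> 5)  & 0x1Fu);
    a.crm     = (uint8_t)((iss >> 1)  & 0xFu);
    a.is_read = (iss & 0x1u) != 0;
    return a;
}
```

A handler can compare the decoded {op0, op1, CRn, CRm, op2} tuple against the encoding of the register or System instruction that it emulates.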


The following sections describe configuration settings for generating the exception that is reported using EC value 
0b011000: 
° In EL1 configurable controls on page D1-1563:
— Traps to EL1 of EL0 execution of cache maintenance instructions on page D1-1564.
— Traps to EL1 of EL0 accesses to the CTR_EL0 on page D1-1565.
— Traps to EL1 of EL0 execution of DC ZVA instructions on page D1-1566.
— Traps to EL1 of EL0 accesses to the PSTATE.{D, A, I, F} interrupt masks on page D1-1566.
— Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567.
— Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
— Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569.
— Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570.

° In EL2 configurable controls on page D1-1571:
— Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers on page D1-1573.
— Traps to EL2 of Non-secure EL0 and EL1 execution of DC ZVA instructions on page D1-1574.
— Traps to EL2 of Non-secure EL1 execution of TLB maintenance instructions on page D1-1574.
— Traps to EL2 of Non-secure EL0 and EL1 execution of cache maintenance instructions on page D1-1575.
— Traps to EL2 of Non-secure EL1 accesses to the Auxiliary Control Register on page D1-1576.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on page D1-1577.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578.
— Trapping to EL2 of Non-secure EL1 accesses to the CPACR_EL1 or CPACR on page D1-1582.
— Traps to EL2 of Non-secure System register accesses to the trace registers on page D1-1583.
— Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
— Trapping System register accesses to powerdown debug registers to EL2 on page D1-1586.
— Trapping general System register accesses to debug registers to EL2 on page D1-1586.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers on page D1-1587.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers on page D1-1588.

° In EL3 configurable controls on page D1-1589:
— Traps to EL3 of Secure EL1 accesses to the Counter-timer Physical Secure timer registers on page D1-1592.
— Trapping to EL3 of EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the CPACR_EL1 or CPACR on page D1-1593.
— Traps to EL3 of all System register accesses to the trace registers on page D1-1594.
— Trapping System register accesses to powerdown debug registers to EL3 on page D1-1595.
— Trapping general System register accesses to debug registers to EL3 on page D1-1596.
— Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on page D1-1597.


Exception from an Instruction abort 


These are the exception syndromes with the following EC values: 


° 0b100000, for an Instruction abort exception taken from a lower Exception level, that could be using AArch64
or AArch32.
° 0b100001, for an Instruction abort exception taken without a change in Exception level, meaning it is taken
from an Exception level that is using AArch64.


These EC values are used for MMU faults and synchronous external aborts, including synchronous parity or ECC 
errors, that are generated by instruction accesses. They are not used for Debug exceptions. 


The returned syndrome provides more information about the exception, including a fault code that indicates the 
cause of the exception. ISS encoding for an exception from an Instruction Abort on page D7-1953 describes the
format of this syndrome. 


Exception from a Data abort 


These are the exception syndromes with the following EC values: 


° 0b100100, for a Data abort exception taken from a lower Exception level, that could be using AArch64 or
AArch32.
° 0b100101, for a Data abort exception taken without a change in Exception level, meaning it is taken from an
Exception level that is using AArch64.


These EC values are used for the following exceptions if the exception is generated by a data access: 
° MMU faults. 
° Alignment faults other than SP alignment faults and PC alignment faults. 


° Synchronous external aborts, including synchronous parity or ECC errors.

They are not used for Debug exceptions.


The returned syndrome provides more information about the exception, including a fault code that indicates the 
cause of the exception. ISS encoding for an exception from a Data Abort on page D7-1955 describes the format of
this syndrome. 
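
The four abort EC values given in this subsection and in Exception from an Instruction abort can be tested as in the following sketch. The helper names are hypothetical, and the fragment only classifies the EC value. It does not decode the fault code in the ISS.

```c
#include <stdbool.h>
#include <stdint.h>

enum abort_ec {
    EC_IABT_LOWER_EL = 0x20, /* 0b100000 */
    EC_IABT_SAME_EL  = 0x21, /* 0b100001 */
    EC_DABT_LOWER_EL = 0x24, /* 0b100100 */
    EC_DABT_SAME_EL  = 0x25  /* 0b100101 */
};

/* True if the EC value reports an Instruction abort or a Data abort. */
static inline bool ec_is_abort(uint8_t ec)
{
    return ec == EC_IABT_LOWER_EL || ec == EC_IABT_SAME_EL ||
           ec == EC_DABT_LOWER_EL || ec == EC_DABT_SAME_EL;
}

/* True if the abort was generated by a data access. */
static inline bool ec_is_data_abort(uint8_t ec)
{
    return ec == EC_DABT_LOWER_EL || ec == EC_DABT_SAME_EL;
}

/* True if the abort was taken from a lower Exception level, which might
 * have been using AArch64 or AArch32. */
static inline bool ec_abort_from_lower_el(uint8_t ec)
{
    return ec == EC_IABT_LOWER_EL || ec == EC_DABT_LOWER_EL;
}
```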


Floating-point exceptions 


These are the exception syndromes with the following EC values: 
° 0b101000, trapped floating-point exception from AArch32. 
° 0b101100, trapped floating-point exception from AArch64.


These Exception classes are supported only when the SIMD and floating-point implementation supports the 
trapping of floating-point exceptions. Otherwise, the 0x28 and 0x2C EC values are reserved. That is, these EC values 
are used to report the floating-point exceptions defined by IEEE 754, and input denormal. 


The returned syndrome identifies the trapped floating-point exception or exceptions. ISS encoding for an exception
from a trapped floating-point exception on page D7-1959 describes the format of this syndrome.

In an implementation where the SIMD and floating-point implementation supports the trapping of floating-point
exceptions: 


° From an Exception level using AArch64, the FPCR.{IDE, IXE, UFE, OFE, DZE, IOE} bits enable each of 
the floating-point exception traps. 


° From an Exception level using AArch32, the FPSCR.{IDE, IXE, UFE, OFE, DZE, IOE} bits enable each of
the floating-point exception traps. 
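
As a sketch of how the trap enable bits listed above might be manipulated, the following C fragment defines masks for them and a helper that sets a chosen subset in an FPCR or FPSCR value. The bit positions (IOE bit[8], DZE bit[9], OFE bit[10], UFE bit[11], IXE bit[12], IDE bit[15]) are assumptions taken from the FPCR and FPSCR register descriptions, not from this section, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed positions of the trap enable bits in FPCR and FPSCR. */
#define FP_TRAP_IOE (1u << 8)   /* Invalid Operation */
#define FP_TRAP_DZE (1u << 9)   /* Divide by Zero */
#define FP_TRAP_OFE (1u << 10)  /* Overflow */
#define FP_TRAP_UFE (1u << 11)  /* Underflow */
#define FP_TRAP_IXE (1u << 12)  /* Inexact */
#define FP_TRAP_IDE (1u << 15)  /* Input Denormal */

#define FP_TRAP_ALL (FP_TRAP_IOE | FP_TRAP_DZE | FP_TRAP_OFE | \
                     FP_TRAP_UFE | FP_TRAP_IXE | FP_TRAP_IDE)

/* Returns the register value with the requested trap enables set.
 * Bits outside the six trap enables are never modified. */
static inline uint32_t fp_enable_traps(uint32_t fpcr, uint32_t traps)
{
    return fpcr | (traps & FP_TRAP_ALL);
}
```

On an implementation that does not support trapping of floating-point exceptions, software should not expect these bits to take effect.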


SError interrupt 
This is the exception syndrome with EC value 0b101111. 
It is used to report the exception caused by an SError interrupt. 


The returned syndrome is implementation specific. ISS encoding for an SError interrupt on page D7-1961 describes
the format of this syndrome. See also Asynchronous exception types, routing, masking and priorities on 
page D1-1555. 


Breakpoint exception or Vector Catch exception 


These are the exception syndromes with the following EC values: 

° 0b110000, Breakpoint exception taken from a lower Exception level. 

° 0b110001, Breakpoint exception taken without a change of Exception level. 
° 0b111010, AArch32 Vector Catch exception. 


The returned syndrome provides a fault code that indicates the cause of the exception. ISS encoding for an exception
from a Breakpoint or Vector Catch debug exception on page D7-1961 describes the format of this syndrome.


For more information about generating these exceptions, see: 
° Breakpoint exceptions on page D2-1641. 
° Vector Catch exceptions on page D2-1672. 


Watchpoint exception 


These are the exception syndromes with the following EC values: 
° 0b110100, Watchpoint exception taken from a lower Exception level. 
° 0b110101, Watchpoint exception taken without a change of Exception level. 


The returned syndrome provides more information about the watchpoint, including a fault code that indicates the 
cause of the exception. ISS encoding for an exception from a Watchpoint exception on page D7-1963 describes the
format of this syndrome. 


For more information about generating these exceptions, see Watchpoint exceptions on page D2-1657. 


Software Step exception 


These are the exception syndromes with the following EC values: 
° 0b110010, Software Step exception taken from a lower Exception level. 


° 0b110011, Software Step exception taken without a change of Exception level. 


The returned syndrome provides more information about the exception, including a fault code that indicates the
cause of the exception. ISS encoding for an exception from a Software Step exception on page D7-1962 describes
the format of this syndrome. 


For more information about generating these exceptions, see Software Step exceptions on page D2-1673. 


Breakpoint Instruction exception 


These are the exception syndromes with the following EC values: 
° 0b111000, BKPT instruction executed in AArch32 state. 
° 0b111100, BRK instruction executed in AArch64 state.

The returned syndrome provides the comment that was provided as an argument to the Breakpoint Instruction. ISS
encoding for an exception from execution of a Breakpoint instruction on page D7-1963 describes the format of this 
syndrome. 


For more information about generating these exceptions, see Breakpoint Instruction exceptions on page D2-1639. 
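
For reference, the EC values described in the preceding subsections can be collected into a lookup function, as in the following sketch. The function name and the description strings are hypothetical, and the table covers only the EC values that appear in this part of the section, so any other value returns a placeholder string.

```c
#include <stdint.h>

/* Maps an EC value to a short description. Only the EC values described
 * in this part of the section are included. */
static const char *ec_name(uint8_t ec)
{
    switch (ec) {
    case 0x06: return "LDC or STC access to a System register";
    case 0x07: return "Advanced SIMD or floating-point access";
    case 0x0E: return "Illegal Execution state";
    case 0x11: return "SVC instruction (AArch32)";
    case 0x12: return "HVC instruction (AArch32)";
    case 0x13: return "SMC instruction (AArch32)";
    case 0x15: return "SVC instruction (AArch64)";
    case 0x16: return "HVC instruction (AArch64)";
    case 0x17: return "SMC instruction (AArch64)";
    case 0x18: return "MSR, MRS, or System instruction (AArch64)";
    case 0x20: return "Instruction abort from a lower Exception level";
    case 0x21: return "Instruction abort without a change in Exception level";
    case 0x22: return "PC alignment fault";
    case 0x24: return "Data abort from a lower Exception level";
    case 0x25: return "Data abort without a change in Exception level";
    case 0x26: return "SP alignment fault";
    case 0x28: return "Trapped floating-point exception (AArch32)";
    case 0x2C: return "Trapped floating-point exception (AArch64)";
    case 0x2F: return "SError interrupt";
    case 0x30: return "Breakpoint from a lower Exception level";
    case 0x31: return "Breakpoint without a change of Exception level";
    case 0x32: return "Software Step from a lower Exception level";
    case 0x33: return "Software Step without a change of Exception level";
    case 0x34: return "Watchpoint from a lower Exception level";
    case 0x35: return "Watchpoint without a change of Exception level";
    case 0x38: return "BKPT instruction (AArch32)";
    case 0x3A: return "Vector Catch (AArch32)";
    case 0x3C: return "BRK instruction (AArch64)";
    default:   return "not covered by this sketch";
    }
}
```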


Reporting the EC encoding when an exception is routed to EL2 


When an exception is taken to EL2 because the exception routing control HCR_EL2.TGE is enabled, the EC 
encoding that would have been used if the exception had been taken to EL1 is recorded in ESR_EL2.EC instead, 
unless the encoding is 0x07.


Exceptions that use 0x07 when the HCR_EL2.TGE routing control is disabled use 0x00 when the HCR_EL2.TGE 
routing control is enabled. 
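
The rule above can be expressed as a small function. This is a sketch only, with hypothetical names. It models only the choice of EC value and not the routing decision itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define EC_REPORTED_AS_0X00 0x00u /* 0b000000 */
#define EC_SIMD_FP_ACCESS   0x07u /* 0b000111 */

/* Returns the value recorded in ESR_EL2.EC for an exception that would have
 * used ec_for_el1 if it had been taken to EL1. */
static inline uint8_t ec_reported_at_el2(uint8_t ec_for_el1, bool hcr_el2_tge)
{
    if (hcr_el2_tge && ec_for_el1 == EC_SIMD_FP_ACCESS)
        return EC_REPORTED_AS_0X00;
    return ec_for_el1;
}
```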


D1.10.5 Summary of register updates on faults taken to an Exception level that is using AArch64 


For all exceptions taken to an Exception level using AArch64 that are not listed in Validity of FAR_ELx, the 
FAR_ELx for the Exception level the exception is taken to is UNKNOWN. 


For all exceptions taken to EL2 using AArch64 that are not listed in Validity of HPFAR_EL2, the HPFAR_EL2 is 
UNKNOWN. 


Validity of FAR_ELx 


The faulting virtual address is saved in FAR_ELx for the Exception level the exception is taken to if an exception
is one of:

° An Instruction Abort exception. 
° A Data Abort exception. 

° A PC alignment fault exception. 
° A Watchpoint exception. 


The architecture permits that the FAR_ELx is UNKNOWN for Synchronous External Aborts other than Synchronous 
External Aborts on Translation Table Walks. In this case, the ISS.FnV bit returned in ESR_ELx indicates whether 
FAR_ELx is valid.


If an exception is taken from an Exception level using AArch32 into an Exception level using AArch64, and that 
exception writes the FAR_ELx at the Exception level the exception is taken to, the most significant 32 bits of the 
FAR_ELx are all zero, unless both:


° The faulting address was generated by a load or store that sequentially incremented from address 0xFFFFFFFF.
Such a load or store instruction is CONSTRAINED UNPREDICTABLE, see Out of range virtual address on 
page K1-5464. 


° The implementation treats such incrementing as setting bit[32] of the virtual address.


The FAR_ELx for an Exception level is made UNKNOWN as a result of an exception return from that Exception level. 
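
The validity rules in this subsection can be sketched as a predicate on the syndrome value. The field positions used here (EC in ESR_ELx[31:26], the fault status code in ISS[5:0], and FnV in ISS[10]) and the fault code 0b010000 for a Synchronous External Abort that is not on a translation table walk are assumptions taken from the ISS encoding descriptions, not from this section. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if, under the assumptions stated above, FAR_ELx holds the
 * faulting virtual address for the exception described by esr. */
static inline bool far_elx_holds_fault_address(uint32_t esr)
{
    uint32_t ec  = (esr >> 26) & 0x3Fu;  /* assumed EC position */
    uint32_t fsc = esr & 0x3Fu;          /* assumed fault status code, ISS[5:0] */
    bool     fnv = ((esr >> 10) & 1u) != 0; /* assumed FnV position, ISS[10] */

    switch (ec) {
    case 0x20: case 0x21:   /* Instruction Abort */
    case 0x24: case 0x25:   /* Data Abort */
        /* FnV is consulted only for a Synchronous External Abort that is
         * not on a translation table walk. */
        return !(fsc == 0x10u && fnv);
    case 0x22:              /* PC alignment fault */
    case 0x34: case 0x35:   /* Watchpoint */
        return true;
    default:
        return false;       /* FAR_ELx is UNKNOWN */
    }
}
```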


Validity of HPFAR_EL2 


The faulting IPA is saved in HPFAR_EL2 if the exception is an Instruction Abort or Data Abort taken to EL2 and 
the fault is one of: 


° A Translation or Access Flag fault on a stage 2 translation. 
° A stage 2 Address Size fault. 


° A fault on the stage 2 translation of an address accessed in a stage 1 translation table walk.


HPFAR_EL2 is made UNKNOWN as a result of an exception return from EL2. 
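
As a hedged illustration of how a hypervisor might use the saved value, the following sketch rebuilds a faulting IPA from HPFAR_EL2 and FAR_EL2. It assumes that HPFAR_EL2[39:4] holds bits[47:12] of the faulting IPA, as given in the HPFAR_EL2 register description, and that the low 12 bits of the IPA equal the low 12 bits of the faulting virtual address because they lie within the smallest translation granule. Neither assumption is stated in this section, and the function name is hypothetical. The result is meaningful only for the fault types listed above.

```c
#include <stdint.h>

/* Combines the assumed FIPA field of HPFAR_EL2 with the page offset taken
 * from FAR_EL2 to form the faulting intermediate physical address. */
static inline uint64_t faulting_ipa(uint64_t hpfar_el2, uint64_t far_el2)
{
    uint64_t fipa = (hpfar_el2 >> 4) & ((1ULL << 36) - 1); /* IPA[47:12] */
    return (fipa << 12) | (far_el2 & 0xFFFULL);
}
```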

D1.11 Exception return


In the ARMv8-A architecture, an exception return is always to the same Exception level or a lower Exception level. 
An exception return is used for: 


° A return to a previously executing thread. 

° Entry to a new execution thread. For example: 
— The initialization of a hypervisor by a Secure monitor.
— The initialization of an operating system by a hypervisor. 


— Application entry from an operating system or hypervisor. 


An exception return requires the simultaneous restoration of the PC and PSTATE to values that are consistent with 
the desired state of execution on returning from the exception. 


In AArch64 state, an ERET instruction causes an exception return. On an ERET instruction: 
° The PC is restored with the value held in the ELR_ELx. 
° PSTATE is restored by using the contents of the SPSR_ELx. 


The ELR_ELx and SPSR_ELx are the ELR_ELx and SPSR_ELx at the Exception level the exception is returning 
from. The exception return makes this ELR_ELx and SPSR_ELx UNKNOWN. 


See Address tagging in AArch64 state on page D4-1724 for details of how tagged addresses are handled in an 
Exception return from an Exception level using AArch64 to an Exception level using AArch64. 


Note 


When returning from an Exception level using AArch64 to an Exception level using AArch32, the top 32 bits of the 
ELR_ELx are ignored. 








An ERET instruction also: 


° Sets the Event Register for the PE executing the ERET instruction. See Mechanisms for entering a low-power 
state on page D1-1599. 


° Resets the local exclusive monitor for the PE executing the ERET instruction. This removes the risk of errors 
that might be caused when a path to an exception return fails to include a CLREX instruction. 





Note 


This behavior prevents self-hosted debug from software stepping through an LDREX/STREX pair. However, 
when self-hosted debug is using software step, it is highly probable that the exclusive monitor state would be 
lost anyway, for other reasons. Stepping code that uses exclusive monitors on page D2-1685 describes this. 





It is IMPLEMENTATION DEFINED whether the resetting of the local exclusive monitor also resets the global 
exclusive monitor. 


The ERET instruction is UNDEFINED in EL0.


When returning from an Exception level using AArch64 to an Exception level using AArch32, the AArch32 context 
is restored. The ARMv8-A architecture defines the relationship between AArch64 state and AArch32 state, for: 


° General-purpose registers. 
° Special-purpose registers. 
° System registers. 


In an implementation that includes EL3, the Security state can only change on returning from an exception if the
return is from EL3 to a lower Exception level.

The following sections give more information:


° Exception return and PC alignment. 

° Illegal return events from AArch64 state. 

° Legal returns that set PSTATE.IL to 1 on page D1-1539. 

° The Illegal Execution state exception on page D1-1539.

° Pseudocode description of exception return on page D1-1539.

D1.11.1 Exception return and PC alignment


When SPSR_ELx.M[4] == 0, indicating an Exception return to AArch64 state, the value of ELR_ELx is transferred 
to the PC. If this value is misaligned, subsequent execution results in a PC alignment fault exception. 


When SPSR_ELx.M[4] == 1, indicating an Exception return to AArch32 state, the value of ELR_ELx is transferred 
to the PC except that, for a legal exception return: 

° If SPSR_ELx.T is 0, ELR_ELx[1:0] are treated as being 0 for restoring the PC. 

° If SPSR_ELx.T is 1, ELR_ELx[0] is treated as being 0 for restoring the PC. 


This means that a PC alignment fault exception cannot occur following a legal exception return from AArch64 state
to AArch32 state. However, where the Exception return with SPSR_ELx.M[4] == 1 is an illegal exception return
then it is IMPLEMENTATION DEFINED whether a misaligned value in ELR_ELx is aligned when it is restored to the
PC. 
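
The alignment rules for a legal exception return can be modeled as in the following sketch. It assumes that M[4] is SPSR_ELx bit[4] and that T is SPSR_ELx bit[5], as given in the SPSR_ELx format, and it applies the rule stated in D1.11 that the top 32 bits of ELR_ELx are ignored on a return to AArch32 state. The names are hypothetical, and the illegal exception return case is deliberately not modeled because its alignment behavior is IMPLEMENTATION DEFINED.

```c
#include <stdint.h>

#define SPSR_M4 (1u << 4)  /* assumed position of M[4]: 1 means AArch32 */
#define SPSR_T  (1u << 5)  /* assumed position of T: 1 means T32 */

/* Returns the PC value after a legal exception return. */
static inline uint64_t pc_after_legal_return(uint64_t elr, uint32_t spsr)
{
    if (!(spsr & SPSR_M4))
        return elr;              /* AArch64: transferred unchanged; a misaligned
                                    value faults on subsequent execution */

    uint32_t pc = (uint32_t)elr; /* AArch32: top 32 bits of ELR_ELx ignored */
    if (spsr & SPSR_T)
        pc &= ~UINT32_C(1);      /* T == 1: ELR_ELx[0] treated as 0 */
    else
        pc &= ~UINT32_C(3);      /* T == 0: ELR_ELx[1:0] treated as 0 */
    return pc;
}
```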


Note 
In an implementation that forces the alignment of the PC value restored from SPSR_ELx on an illegal exception 
return with SPSR_ELx.M[4] == 1, if SPSR_ELx.T == 1 the restored PC value might give rise to a PC alignment
fault exception, because the PE remains in AArch64 state and only ELR_ELx[0] is treated as being 0 for restoring
the PC. 








For more information about the illegal exception return cases see Illegal return events from AArch64 state.


D1.11.2 Illegal return events from AArch64 state 


In this section: 

Return
In AArch64 state, refers to any of:
° Execution of an ERET instruction.
° Execution of a DRPS instruction in Debug state.
° Exit from Debug state.

Saved process state value
In AArch64 state, refers to any of:
° The value held in the SPSR_ELx for an ERET instruction.
° The value held in the SPSR_ELx for a DRPS instruction executed in Debug state.
° The value held in the DSPSR_EL0 for a Debug state exit.

Link address
In AArch64 state, refers to any of:
° The address held in ELR_ELx for an ERET instruction.
° The address held in DLR_EL0 for a Debug state exit.

Configured from reset
Indicates the state determined on powerup or reset by a configuration input signal, or by another
IMPLEMENTATION DEFINED mechanism.


The ARMv8 architecture has a generic mechanism for handling returns to a mode or state that is illegal. In AArch64
state, this can occur as the result of any of the following situations: 


° A return where the Exception level being returned to is higher than the current Exception level. 


° A return where the Exception level being returned to is not implemented. For example a return to EL2 when 
EL2 is not implemented. 







° A return to EL2 when EL3 is implemented and the value of the SCR_EL3.NS bit is 0.

° A return to Non-secure EL1 when EL2 is implemented and the value of the HCR_EL2.TGE bit is 1.

° A return where the value of the saved process state M[4] bit is 0, indicating a return to AArch64 state, and
one of the following is true:
— The M[1] bit is 1.
— The M[3:0] bits are 0b0001.
— The Exception level being returned to is using AArch32 state, as programmed by the SCR_EL3.RW
or HCR_EL2.RW bits, or as configured from reset.

° A return where the value of the saved process state M[4] bit is 1, indicating a return to AArch32 state, and
one of the following is true:


— The M field value is not a valid AArch32 state PE mode. Table D1-4 on page D1-1511 shows the valid 
M[3:0] values for AArch32 state PE modes. This includes the case where M[3:0] is 0b0000, indicating 
User mode, and EL0 does not support AArch32 state.


— The Exception level being returned to is using AArch64 state as determined by the SCR_EL3.RW or 
HCR_EL2.RW field or the configuration from reset. This includes the case where the Exception level 
being returned to does not support AArch32 state. 


Note 


This means that, in an implementation that supports only AArch64 state, any attempt to return to AArch32 
state is an illegal exception return. 








° A Debug state exit from EL0 using AArch64 state, to EL0 using AArch32 state.


In these cases: 
° PSTATE.IL is set to 1, to indicate an illegal return.


° PSTATE.{EL, nRW, SP} are unchanged. This means the Exception level, Execution state, and stack pointer 
selection do not change as a result of the return. 


° The following PSTATE bits are restored from the saved process state value: 
— The N, Z, C, V Condition flags. 
— The D, A, I, F exception mask bits. 


° If the illegal return is an illegal exception return, the PSTATE.SS bit is handled as normal for a return. That 
is, the SS bit is handled in the same way as an exception return that is not an illegal exception return. See 
Software Step exceptions on page D2-1673. 


In all these cases the PSTATE.SS bit is handled as it would be for a normal return, as described in Entering 
the active-not-pending state on page D2-1675 and Exiting Debug state on page H2-4880. DRPS never sets 
the SS bit. This is indicated in Entering the active-not-pending state on page D2-1675. 


° If the illegal return is not a DRPS instruction executed in Debug state, the PC is restored from the link 
address. However, if the value of the M[4] bit of the saved process state is 1, indicating a return to AArch32 
state, then: 

— It is IMPLEMENTATION DEFINED whether the PC value is aligned by setting the bottom 1 or 2 bits of its 
value to 0, as determined by the T bit of the saved process state. See Exception return and PC 
alignment on page D1-1537. 


— It is CONSTRAINED UNPREDICTABLE whether bits[63:32] of the PC are all set to zero, or are set to the
value of the corresponding bits of the link address. 


The implementation determines the choice of these two options, and the choice might vary dynamically. 
Therefore, software must tolerate both of these options. 
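
A subset of the illegal return conditions listed earlier in this section, for a saved process state with M[4] == 0, can be modeled as in the following sketch. It is not the architectural pseudocode. The structure and function names are hypothetical, and the sketch assumes that M[3:2] of the saved process state encodes the target Exception level, as given in the SPSR_ELx format. It also assumes that an implementation with EL2 but without EL3 is in Non-secure state. The Execution state checks that depend on SCR_EL3.RW, HCR_EL2.RW, or the configuration from reset are not modeled, and neither are returns to AArch32 state or Debug state exits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the PE state that the checks depend on. */
struct pe_model {
    bool     have_el2, have_el3; /* which optional Exception levels exist */
    bool     scr_el3_ns;         /* SCR_EL3.NS, meaningful if have_el3 */
    bool     hcr_el2_tge;        /* HCR_EL2.TGE, meaningful if have_el2 */
    unsigned current_el;         /* Exception level executing the return */
};

/* m is M[3:0] of the saved process state. The caller has already checked
 * that M[4] == 0, indicating a return to AArch64 state. */
static bool illegal_return_to_aarch64(const struct pe_model *pe, unsigned m)
{
    unsigned target_el  = (m >> 2) & 0x3u;                 /* assumed M[3:2] */
    bool     non_secure = !pe->have_el3 || pe->scr_el3_ns; /* assumption */

    if (m & 0x2u)  return true;                        /* M[1] is 1 */
    if (m == 0x1u) return true;                        /* M[3:0] is 0b0001 */
    if (target_el > pe->current_el) return true;       /* to a higher level */
    if (target_el == 3 && !pe->have_el3) return true;  /* not implemented */
    if (target_el == 2 && !pe->have_el2) return true;  /* not implemented */
    if (target_el == 2 && pe->have_el3 && !pe->scr_el3_ns)
        return true;                                   /* EL2 with NS == 0 */
    if (target_el == 1 && pe->have_el2 && pe->hcr_el2_tge && non_secure)
        return true;                                   /* NS EL1 with TGE == 1 */
    return false;
}
```

When the function returns true, the effects listed under "In these cases" apply, starting with PSTATE.IL being set to 1.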


Relaxation of the tagged address handling requirements on an Illegal exception return on page D4-1725 describes 
how tagged addresses are handled in an Illegal Exception return from an Exception level using AArch64 to an 
Exception level using AArch64.

When the value of the PSTATE.IL bit is 1, any attempt to execute any instruction results in an Illegal Execution state
exception. See The Illegal Execution state exception. 


All aspects of the illegal return, other than the effects described in this section, occur as they do for a legal return. 


D1.11.3 Legal returns that set PSTATE.IL to 1 


In this section, return, saved process state value, and link address have the same meaning as defined in Illegal return
events from AArch64 state on page D1-1537. 


If the value of the IL bit in the saved process state is 1, then it is copied to PSTATE by a return, meaning that 
PSTATE.IL is set to 1. In this case, if the return is not an illegal return, and targets AArch32 state, then the 
PSTATE.{IT, T} bits are either:


° Set to 0. 


° Copied from the saved process state value. 


The choice between these two options is determined by an implementation, and might vary dynamically within the 
implementation. Correspondingly software must regard the value as being an UNKNOWN choice between the two 
values. 


The PSTATE.{IT, T} bits are only valid in AArch32 state, see Process state, PSTATE on page G1-3805.

When the PSTATE.IL bit is 1, any attempt to execute any instruction results in an Illegal Execution state exception.
See The Illegal Execution state exception.

D1.11.4 The Illegal Execution state exception 


When the value of the PSTATE.IL bit is 1, any attempt to execute any instruction results in an Illegal Execution state
exception. In AArch64 state, the PSTATE.IL bit can be set to 1 by any of:

° An illegal return, as described in Illegal return events from AArch64 state on page D1-1537.


° A legal return that sets PSTATE.IL to 1, as described in Legal returns that set PSTATE.IL to 1. 





An Illegal Execution state exception sets ESR_ELx.EC for the target Exception level to the value of 0x0E.





On taking any exception to an Exception level that is using AArch64 state: 


1. The value of the PSTATE.IL bit is copied into the SPSR_ELx.IL bit for the Exception level to which the 
exception is taken. 


2. The PSTATE.IL bit is cleared to 0. 





Note 
This means that it is not possible for software to observe the value of PSTATE.IL. 
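
The two steps above can be modeled as in the following sketch. It assumes that the IL bit is SPSR_ELx bit[20], as given in the SPSR_ELx format, and it models PSTATE.IL as a single boolean. The names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPSR_IL (1u << 20)  /* assumed position of the IL bit in SPSR_ELx */

/* Hypothetical model holding only the PSTATE field of interest. */
struct pstate_model {
    bool il;
};

/* Models exception entry to an Exception level using AArch64: copies
 * PSTATE.IL into the SPSR value being saved, then clears PSTATE.IL.
 * Returns the updated SPSR value. */
static inline uint32_t save_and_clear_il(struct pstate_model *ps, uint32_t spsr)
{
    spsr = ps->il ? (spsr | SPSR_IL) : (spsr & ~SPSR_IL);
    ps->il = false;
    return spsr;
}
```

Because the bit is cleared on entry, handler software sees the previous value only through SPSR_ELx.IL.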





For the priority of this exception class, see Synchronous exception prioritization for exceptions taken to AArch64 
on page D1-1548. 


D1.11.5 Pseudocode description of exception return 


The AArch64.ExceptionReturn() pseudocode function transfers the return address to the PC, and restores PSTATE 
to its saved value by calling SetPSTATEFromPSR(). 


The IllegalExceptionReturn() function checks for an Illegal Execution state exception.

D1.12 The Exception level hierarchy

The System registers provide controls that control PE behavior through the Exception level hierarchy.

If EL3 and EL2 are implemented, System registers at EL3 and EL2 provide controls that control the Execution state
of lower Exception levels.

Table D1-9 shows the principal System registers.

Table D1-9 Principal System registers

EL3        EL2        EL1        Notes
SCTLR_EL3  SCTLR_EL2  SCTLR_EL1  Controls execution for its own Exception level.
SCR_EL3    HCR_EL2    -          Controls execution at lower Exception levels.
-          HSTR_EL2   -          Used only if at least one of EL1 and EL0 is using AArch32.

The prioritization of exceptions generated as a result of controls in these registers is described in Synchronous 
exception prioritization for exceptions taken to AArch64 on page D1-1548. 

The following sections describe the Exception level hierarchy: 
° The hierarchy of configuration and routing control.
° Control of SIMD, floating-point and trace functionality on page D1-1546.
° Control of IMPLEMENTATION DEFINED features on page D1-1546.
° Routing exceptions to EL2 on page D1-1548.

D1.12.1 The hierarchy of configuration and routing control
The following subsections give a summary of the controls available at each Exception level for controlling 
execution at that Exception level and all lower Exception levels: 
° Controls provided at EL3. 
° Controls provided at EL2 on page D1-1542. 
° Controls provided at EL1 on page D1-1544.
For information on how the controls summarized in these subsections affect PE behavior, see the definitions of the 
control bits in the register descriptions. 
Controls provided at EL3 
See: 
° Controls provided by the SCR_EL3. 
° Controls provided by the SCTLR_EL3 on page D1-1541. 
° Controls provided by the MDCR_EL3 on page D1-1541. 
Controls provided by the SCR_EL3 
SCR_EL3.NS Determines the Security state of execution at EL1 and EL0.
SCR_EL3.RW Determines the Execution state of the next-lower Exception level.
SCR_EL3.{EA, FIQ, IRQ}
Route:
EA Physical SError interrupts and synchronous External Aborts to EL3.
FIQ Physical FIQ interrupts to EL3.
IRQ Physical IRQ interrupts to EL3.
SCR_EL3.SMD Disables the Secure Monitor Call exception.
SCR_EL3.HCE Enables the Hypervisor Call exception.
SCR_EL3.ST Enables Secure EL1 access to the Secure timer.
SCR_EL3.SIF Secure Instruction Fetch. When in Secure state, disables instruction fetches from
Non-secure memory.
SCR_EL3.TWI Trap Wait-For-Interrupt. 


SCR_EL3.TWE Trap Wait-For-Event. 
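
As a sketch of how EL3 software might interpret these controls, the following C fragment defines masks for the SCR_EL3 fields listed above and decodes a register value into named flags. The bit positions are assumptions taken from the SCR_EL3 register description, not from this section, and the structure and function names are hypothetical. The fragment only interprets a value. It does not address which bits of the register are reserved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, from the SCR_EL3 register description. */
#define SCR_EL3_NS  (1u << 0)
#define SCR_EL3_IRQ (1u << 1)
#define SCR_EL3_FIQ (1u << 2)
#define SCR_EL3_EA  (1u << 3)
#define SCR_EL3_SMD (1u << 7)
#define SCR_EL3_HCE (1u << 8)
#define SCR_EL3_SIF (1u << 9)
#define SCR_EL3_RW  (1u << 10)
#define SCR_EL3_ST  (1u << 11)
#define SCR_EL3_TWI (1u << 12)
#define SCR_EL3_TWE (1u << 13)

/* Hypothetical decoded view of a subset of the controls. */
struct scr_el3_view {
    bool lower_non_secure;    /* NS: EL1 and EL0 are in Non-secure state */
    bool next_lower_aarch64;  /* RW: next-lower Exception level uses AArch64 */
    bool ea_to_el3, fiq_to_el3, irq_to_el3;
    bool smc_enabled;         /* SMD == 0 */
    bool hvc_enabled;         /* HCE == 1 */
};

static inline struct scr_el3_view scr_el3_decode(uint32_t scr)
{
    struct scr_el3_view v;
    v.lower_non_secure   = (scr & SCR_EL3_NS)  != 0;
    v.next_lower_aarch64 = (scr & SCR_EL3_RW)  != 0;
    v.ea_to_el3          = (scr & SCR_EL3_EA)  != 0;
    v.fiq_to_el3         = (scr & SCR_EL3_FIQ) != 0;
    v.irq_to_el3         = (scr & SCR_EL3_IRQ) != 0;
    v.smc_enabled        = (scr & SCR_EL3_SMD) == 0;
    v.hvc_enabled        = (scr & SCR_EL3_HCE) != 0;
    return v;
}
```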


Controls provided by the SCTLR_EL3 


SCTLR_EL3.{A, SA} Enable alignment checking: 
A On data accesses from EL3. 
SA On the SP, when executing at EL3. 


SCTLR_EL3.{M, C, I, WXN} 

Memory system control bits: 

M Enables EL3 stage 1 address translation.

C When set to 0, makes data accesses to Normal memory from EL3, and Normal
memory accesses to the EL3 translation tables, Non-cacheable.

I When set to 0, makes instruction accesses to Normal memory from EL3 
Non-cacheable. 

WXN For accesses from EL3, enables treating all writable memory regions as XN, 
execute never. 


SCTLR_EL3.EE Defines the endianness of data accesses from EL3, including stage 1 translation table walks 
at EL3. 





Note 


Instruction fetches are always little-endian. 





Controls provided by the MDCR_EL3 


MDCR_EL3.{EPMAD, EDAD} 


Enable external debugger accesses to: 
EPMAD Performance Monitors registers. 
EDAD Breakpoint and Watchpoint registers. 


MDCR_EL3.{SPME, SDD, SPD32} 
Secure debug controls: 
SPME Secure Performance Monitors enable. Enables event counting in Secure state. 


SDD Disables all debug exceptions taken from Secure state, if the debug target 
Exception level, ELD, is using AArch64. 


SPD32 Enables debug exceptions from Secure EL1 using AArch32. 


MDCR_EL3.{TDOSA, TDA, TPM} 


These are trap controls, that control traps to EL3 of EL2, EL1, and EL0 accesses to the 
following: 

TDOSA The OS-related debug registers. 

TDA Those debug registers not included in the MDCR_EL3.TDOSA trap. 

TPM The Performance Monitors registers. 


For EL1 and EL0, these traps apply to accesses from both Security states. 


Controls provided at EL2 


EL2 is implemented only in Non-secure state, and its controls apply only to execution in Non-secure EL2, 
Non-secure EL1, and Non-secure EL0. See: 


° Controls provided by SCTLR_EL2. 

° Controls provided by HCR_EL2. 

° Controls provided by the HSTR_EL2 on page D1-1544. 
° Controls provided by the MDCR_EL2 on page D1-1544. 


Controls provided by SCTLR_EL2 


SCTLR_EL2.{A, SA} Enable alignment checking: 
A On data accesses from EL2. 
SA On the SP, when executing at EL2. 


SCTLR_EL2.{M, C, I, WXN} 
Memory system control bits: 
M Enables EL2 stage 1 address translation. 


C When set to 0, makes data accesses to Normal memory from EL2, and Normal 
memory accesses to the EL2 translation tables, Non-cacheable. 


I When set to 0, makes instruction accesses to Normal memory from EL2 
Non-cacheable. 


WXN For accesses from EL2, enables treating all writable memory regions as XN, 
execute never. 


SCTLR_EL2.EE Defines the endianness of data accesses from EL2, including stage 1 translation table walks 
at EL2. 


Also defines the endianness of stage 2 translation table walks at Non-secure EL1 and EL0. 


Note 


Instruction fetches are always little-endian. 








Controls provided by HCR_EL2 
HCR_EL2.RW Determines the Execution state of the next-lower Exception level. 


HCR_EL2.{AMO, IMO, FMO} 
Route physical interrupts to EL2, and if the value of HCR_EL2.TGE is 0, enable virtual 
interrupts: 
AMO Route physical SError interrupts to EL2 and enable virtual SError interrupts. 
IMO Route physical IRQ interrupts to EL2 and enable virtual IRQ interrupts. 
FMO Route physical FIQ interrupts to EL2 and enable virtual FIQ interrupts. 





Note 


• If a physical interrupt is routed to both EL3 and EL2, routing to EL3 takes precedence 
over routing to EL2. 


• When the value of HCR_EL2.TGE is 0, the virtual interrupt enables apply regardless 
of whether the corresponding physical interrupt is routed to EL2 or to EL3. 


• When the value of HCR_EL2.TGE is 1, the virtual interrupts are disabled. 


HCR_EL2.{VSE, VI, VF} 


Cause a virtual interrupt to be pending: 


VSE Virtual SError interrupt. 
VI Virtual IRQ interrupt. 
VF Virtual FIQ interrupt. 
HCR_EL2.VM Enable bit for Non-secure EL1&0 stage 2 address translations. 


HCR_EL2.{SWIO, PTW, FB, BSU, DC, CD, ID} 
Controls for memory system behavior for accesses made from Non-secure EL1 and EL0: 
SWIO Set/Way Invalidate Override. 
PTW Protect Table Walk. 


FB Force broadcast of TLB and instruction cache maintenance instructions. 
BSU Barrier Shareability Upgrade. 

DC Default Cacheability control for the EL1&0 translation regime. 

CD For the Non-secure EL1&0 translation regime, when set to 1, forces stage 2 
translation table walks to Normal memory, and stage 2 translations for data 
accesses to Normal memory, to be Non-cacheable. 


ID For the Non-secure EL1&0 translation regime, when set to 1, forces stage 2 
translations for instruction accesses to Normal memory to be Non-cacheable. 


HCR_EL2.HCD Hypervisor Call Disable. 


Note 
If an implementation includes EL3, this bit is RES0. 








HCR_EL2.{TRVM, TDZ, TVM, TTLB, TPU, TPC, TSW, TACR, TIDCP, TSC, TID0, TID1, TID2, TID3, TWE, 
TWI} 

Trap operations performed at Non-secure EL1 or EL0 to EL2, as follows: 

TRVM Trap Read of Virtual Memory controls. 

TDZ Trap Data Cache Zero. 

TVM Trap Virtual Memory controls. 


TTLB Trap TLB maintenance instructions. 


TPU Trap cache maintenance to the Point of Unification instructions. 
TPC Trap data cache maintenance to the Point of Coherency instructions. 
TSW Trap data cache maintenance by Set/Way instructions. 


TACR Trap Auxiliary Control Register accesses. 
TIDCP Trap Implementation-Dependent functionality. 
TSC Trap Secure Monitor Call. 

TID0 Trap ID Group 0 register accesses. 

TID1 Trap ID Group 1 register accesses. 

TID2 Trap ID Group 2 register accesses. 

TID3 Trap ID Group 3 register accesses. 

TWI Trap Wait-For-Interrupt. 

TWE Trap Wait-For-Event. 


Note 


There are no AArch64 System registers in ID Group 0, therefore the TID0 trap is only 
relevant when Non-secure EL1 is using AArch32. 








HCR_EL2.TGE Trap General Exceptions. 


Controls provided by the HSTR_EL2 


When EL2 is using AArch64 and at least one of Non-secure EL1 or EL0 is using AArch32, HSTR_EL2 provides 
the following trap of Non-secure AArch32 operation to EL2: 
HSTR_EL2.Tn, for values of n in the set {0-3, 5-13, 15} 


Trap accesses to System registers in the AArch32 (coproc==0b1111) encoding space, by the 
value of n, where n is: 


° The numeric value of the CRn argument used when accessing the register using an MCR 
or MRC instruction. 


° The numeric value of the CRm argument used when accessing the register using an 
MCRR or MRRC instruction. 
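
The selection of n described above can be sketched as follows. This is a minimal model, not an architectural definition: the function name is hypothetical, and it assumes that HSTR_EL2.Tn occupies bit n of the register, which should be confirmed against the HSTR_EL2 register description. It does not model exception prioritization or whether the access is made from Non-secure AArch32 state.

```python
# Values of n for which an HSTR_EL2.Tn control exists: {0-3, 5-13, 15}.
VALID_TN = set(range(0, 4)) | set(range(5, 14)) | {15}


def hstr_el2_traps_access(hstr_el2: int, instr: str, crn: int = 0, crm: int = 0) -> bool:
    """Return True if the modeled HSTR_EL2 value traps this (coproc==0b1111) access.

    ASSUMPTION: Tn is held in bit n of HSTR_EL2.
    """
    if instr in ("MCR", "MRC"):
        n = crn          # n is the CRn argument for MCR and MRC
    elif instr in ("MCRR", "MRRC"):
        n = crm          # n is the CRm argument for MCRR and MRRC
    else:
        raise ValueError(f"not a System register access instruction: {instr}")
    if n not in VALID_TN:
        return False     # no Tn control exists for this value of n
    return bool((hstr_el2 >> n) & 1)
```

With only T7 set, an MRC access with CRn equal to 7 and an MRRC access with CRm equal to 7 are both reported as trapped by this model.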


Controls provided by the MDCR_EL2 


MDCR_EL2.{TDRA, TDOSA, TDA} 


Trap controls, that control traps to EL2 of Non-secure EL1 and EL0 System register 
accesses to the following: 


TDRA The Debug ROM registers. 
TDOSA The OS-related debug registers. 


TDA Those debug registers not included in either of the MDCR_EL2.TDRA or 
MDCR_EL2.TDOSA traps. 


MDCR_EL2.TDE Routes all debug exceptions taken from Non-secure EL1 and EL0 to EL2. 


MDCR_EL2.{TPM, TPMCR} 
These are trap controls, that control traps to EL2 of Non-secure EL1 and EL0 accesses to 
the following registers: 
TPM All Performance Monitors registers. 
TPMCR The Performance Monitors Control Registers. 


MDCR_EL2.HPMN Defines the number of Performance Monitors counters that are accessible from Non-secure 
EL1 and EL0. 


Controls provided at EL1 


See: 
• Controls provided by the SCTLR_EL1. 
• Controls provided by the MDSCR_EL1 on page D1-1545. 


Controls provided by the SCTLR_EL1 


SCTLR_EL1.{A, SA} Enable alignment checking: 
A On data accesses from EL1 and EL0. 
SA On the SP, when executing at EL1. 


SCTLR_EL1.SA0 Enable alignment checking on the SP when executing at EL0. 


SCTLR_EL1.{M, C, I, WXN} 
Memory system control bits: 
M Enables EL1&0 stage 1 address translation. 


C When set to 0, makes data accesses to Normal memory from EL0 and EL1, and 
Normal memory accesses to the EL1&0 stage 1 translation tables, 
Non-cacheable. 


I When set to 0, makes instruction accesses to Normal memory from EL1 and 
EL0 Non-cacheable. 


WXN For accesses from EL1 and EL0, enables treating all writable memory regions 
as XN, execute never. 


SCTLR_EL1.EE Defines the endianness of data accesses from EL1, including stage 1 translation table walks 
at EL1 and EL0. 





Note 


Instruction fetches are always little-endian. 





SCTLR_EL1.E0E EL0 Endianness. Defines the endianness used for explicit data accesses made from EL0. 


SCTLR_EL1.{UCI, UCT, DZE, nTWI, nTWE} 


Trap enables: 


UCI Unprivileged Cache maintenance Instruction enable. 
UCT Unprivileged Cache Type access enable. 
DZE Data cache Zero Enable. 


nTWI Not Trap Wait-For-Interrupt. 
nTWE Not Trap Wait-For-Event. 


SCTLR_EL1.UMA Unprivileged Mask Access. 


SCTLR_EL1.{SED, ITD, CP15BEN} 
These bits control the use, at EL0, of AArch32 functionality that is deprecated, or OPTIONAL 
and deprecated: 


SED Disables use of the SETEND instruction. 


ITD If supported, disables some uses of the IT instruction. The register field 
description identifies the uses that are disabled. 


CP15BEN If supported, enables use of the CP15DMB, CP15DSB, and CP15ISB memory 
barrier instructions. 


The deprecated uses of the IT instruction, and use of the CP15DMB, CP15DSB, and 
CP15ISB instructions, are deprecated for performance reasons. Implementation of the ITD 
and CP15BEN controls is optional, and if a control is not implemented then the associated 
AArch32 functionality cannot be disabled. 


Controls provided by the MDSCR_EL1 


MDSCR_EL1.{MDE, SS} 


Enable controls for the debug exceptions: 


MDE Enables Breakpoint exceptions, Watchpoint exceptions, and Vector Catch 
exceptions. 
SS Enables Software Step exceptions. 


There is no enable control for Breakpoint Instruction exceptions. Breakpoint Instruction 
exceptions are always enabled. 


MDSCR_EL1.KDE Enables debug exceptions from ELD when ELD is using AArch64. 


MDSCR_EL1.TDCC Enables a trap to EL1 of EL0 accesses to the Debug Communications Channel registers. 


D1.12.2 Control of SIMD, floating-point and trace functionality 


In addition to the controls described in The hierarchy of configuration and routing control on page D1-1540, the 
following registers provide a hierarchy of control of access to SIMD and floating-point functionality, and to trace 
functionality that is accessible using the System registers: 


CPTR_EL3 Traps operation at lower Exception levels to EL3, if the operation is not trapped to EL2 by 
CPTR_EL2 and is not trapped to EL1 by CPACR_EL1. 


CPTR_EL2 Traps operation in Non-secure EL1 or EL0 to EL2, if the operation is not trapped to EL1 by 
CPACR_EL1. CPTR_EL2.{TTA, TFP} also trap operation in EL2. 


The trap bits in the CPTR_EL3 and CPTR_EL2 are as follows: 


TCPAC Traps accesses to the registers that control access to SIMD, floating-point, and trace 
functionality. 


TTA Traps any System register access to trace functionality, unless that access is otherwise 
trapped to a lower Exception level. 


TFP Traps any execution of an instruction that uses the SIMD and floating-point register 
bank, unless that access is otherwise trapped to a lower Exception level. 


CPACR_EL1 Traps operation from EL1 or EL0 to EL1. Traps set in the CPACR_EL1 take precedence over any 
traps set in the CPTR_EL2 or CPTR_EL3. The trap fields are as follows: 


TTA Traps to EL1 any System register access from EL0 or EL1 to trace functionality. 


FPEN Traps to EL1 execution of instructions that use the SIMD and floating-point register 
bank. 
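
The precedence between CPACR_EL1, CPTR_EL2, and CPTR_EL3 described above can be sketched for the SIMD and floating-point trap as shown below. This is a simplified model under stated assumptions: the function name and its boolean parameters are hypothetical, the caller is assumed to have already decoded whether CPACR_EL1.FPEN traps the access at the current Exception level, and the effect of HCR_EL2.TGE on traps that would be taken to EL1 is not modeled.

```python
from typing import Optional


def simd_fp_trap_target(from_el: int, non_secure: bool, cpacr_el1_traps: bool,
                        cptr_el2_tfp: bool, cptr_el3_tfp: bool) -> Optional[str]:
    """Return the Exception level that takes the trap, or None if not trapped.

    Sketch only: covers execution at EL0, EL1 and EL2.
    """
    if from_el not in (0, 1, 2):
        raise ValueError("this sketch covers EL0, EL1 and EL2 only")
    if from_el == 2 and not non_secure:
        raise ValueError("EL2 is implemented only in Non-secure state")
    # CPACR_EL1 applies to EL1 and EL0 and takes precedence over CPTR_EL2/EL3.
    if from_el <= 1 and cpacr_el1_traps:
        return "EL1"
    # CPTR_EL2 applies to Non-secure EL1 and EL0, and TFP also applies at EL2.
    if non_secure and cptr_el2_tfp:
        return "EL2"
    # CPTR_EL3 applies if the operation is not trapped to EL2 or EL1.
    if cptr_el3_tfp:
        return "EL3"
    return None
```

For example, with all three controls set, an access from Non-secure EL0 is reported as trapped to EL1, while the same access from EL2 is reported as trapped to EL2 because CPACR_EL1 does not apply at EL2.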


D1.12.3 Control of IMPLEMENTATION DEFINED features 


The hierarchy of configuration and routing control on page D1-1540 and Control of SIMD, floating-point and trace 
functionality describe the controls of the trapping of architecturally-defined functionality. However, the architecture 
also defines registers that can be used to provide IMPLEMENTATION DEFINED traps of IMPLEMENTATION DEFINED 
functionality to the different Exception levels. Table D1-10 shows these control registers, for AArch64 state 
controls. 


Table D1-10 Control of traps of IMPLEMENTATION DEFINED functionality 

Traps to EL3   Traps to EL2   Traps to EL1   Notes 
ACTLR_EL3      ACTLR_EL2      ACTLR_EL1      Registers also provide IMPLEMENTATION DEFINED configuration controls for the appropriate Exception level. 
-              HACR_EL2       -              Provides traps of IMPLEMENTATION DEFINED Non-secure EL1 and EL0 functionality to EL2. 


D1.13 Synchronous exception types, routing and priorities 


Synchronous exceptions are: 


Any exception generated by attempting to execute an instruction that is UNDEFINED, including: 
— Attempts to execute instructions at an inappropriate Exception level. 
— Attempts to execute instructions when they are disabled. 


— Attempts to execute instruction bit patterns that have not been allocated. 


Illegal Execution state exceptions. These are caused by attempts to execute an instruction when the value of 
PSTATE.IL is 1, see Illegal return events from AArch64 state on page D1-1537. 


Exceptions caused by the use of a misaligned SP. 
Exceptions caused by attempting to execute an instruction with a misaligned PC. 
Exceptions caused by the exception-generating instructions SVC, HVC, or SMC. 


Traps on attempts to execute instructions that the System registers define as instructions that are trapped to a 
higher Exception level. See Configurable instruction enables and disables, and trap controls on 
page D1-1562. 


Instruction Aborts generated by the memory address translation system that are associated with attempts to 
execute instructions from areas of memory that generate faults. 


Data Aborts generated by the memory address translation system that are associated with attempts to read or 
write memory that generate faults. 


Data Aborts caused by a misaligned address. 


All of the debug exceptions: 

— Breakpoint Instruction exceptions. 
— Breakpoint exceptions. 
— Watchpoint exceptions. 
— Vector Catch exceptions. 
— Software Step exceptions. 


In an implementation that supports the trapping of floating-point exceptions, exceptions caused by trapped 
IEEE floating-point exceptions, see Floating-point exception traps on page D1-1552. 


In some implementations, External aborts. External aborts are failed memory accesses, and include accesses 
to those parts of the memory system that occur during the address translation. The ARMv8 architecture 
permits, but does not require, implementations to treat such exceptions synchronously. See External aborts 
on page D3-1714. 


The remainder of this section contains the following: 


Routing exceptions to EL2 on page D1-1548. 

Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548. 
Effect of Data Aborts on page D1-1551. 

Floating-point exception traps on page D1-1552. 


D1.13.1 Routing exceptions from Non-secure EL0 to EL2 


When HCR_EL2.TGE is 1, any exception taken from Non-secure EL0 that would be taken to Non-secure EL1 is, 
instead, routed to EL2. This means that an application can execute at Non-secure EL0 without using any 
functionality at Non-secure EL1. 





Note 
Implementations typically use the following Exception level and software hierarchy in Non-secure state: 
EL2 Hypervisor. 
EL1 Operating system. 
EL0 Application. 


In such an implementation, setting HCR_EL2.TGE to 1 means that an application can run at Non-secure EL0 under 
the direct control of a hypervisor executing at EL2, with no operating system involvement. 





D1.13.2 Synchronous exception prioritization for exceptions taken to AArch64 


In principle, any single instruction can generate a number of different synchronous exceptions, between the fetching 
of the instruction, its decode, and eventual execution. For exceptions taken to an Exception level that is using 
AArch64, these are prioritized as follows, where 1 is the highest priority. 


Note 


The priority numbering in this list only shows the relative priorities of exceptions taken to an Exception level that 
is using AArch64. This numbering has no global significance and, for example, does not correlate with the 
equivalent AArch32 list in Synchronous exception prioritization for exceptions taken to AArch32 state on 
page G1-3816. 








1. Software Step exceptions. See Software Step exceptions on page D2-1673. 
2. PC alignment fault exceptions. See PC alignment checking on page D1-1515. 


3. Instruction Abort exceptions. See Exception from an Instruction abort on page D1-1533 and AArch64 state 
prioritization of synchronous aborts from a single stage of address translation on page D4-1807. 


4. Breakpoint exceptions or Address Matching Vector Catch exceptions. See: 
• Breakpoint exceptions on page D2-1641. 
• Vector Catch exceptions on page D2-1672. 


Vector Catch exceptions are only taken from AArch32 state. 


Note 


An Exception Trapping Vector Catch exception is generated on exception entry for an exception that has been 
prioritized as described in Synchronous exception prioritization for exceptions taken to AArch32 state on 
page G1-3816. This means that it is outside the scope of the description of this section. 








5. Illegal Execution state exceptions. See Illegal return events from AArch64 state on page D1-1537. 


6. Exceptions taken from EL1 to EL2 because of one of the following configuration settings: 
• HSTR_EL2.Tn. 
• HCR_EL2.TIDCP. 


Note 


These are the controls for exceptions taken to AArch64 state. For exceptions taken to AArch32 state the 
equivalent controls are HSTR.Tn and HCR.TIDCP. 


7. Exceptions that occur as a result of attempting to execute an instruction that is UNDEFINED for one or more 
of the following reasons: 


Attempting to execute an unallocated instruction encoding, including an encoding for an instruction 
that is not implemented in the PE implementation. 


Attempting to execute an instruction that is defined never to be accessible at the current Exception 
level regardless of any enables or traps. 


Debug state execution of an instruction encoding that is unallocated in Debug state. 
Non-debug state execution of an instruction encoding that is unallocated in Non-debug state. 


Execution of an HVC instruction, when HVC instructions are disabled by SCR_EL3.HCE or 
HCR_EL2.HCD. 


Execution of an MSR or MRS instruction to SP_EL0 when the value of SPSel is 0. 
Execution of an HLT instruction when HLT instructions are disabled by EDSCR.HDE. 
In Debug state: 

— Execution of a DCPS1 instruction in Non-secure EL0 when HCR_EL2.TGE is 1. 


— Execution of a DCPS2 instruction in EL1 or EL0 when SCR_EL3.NS is 0 or when EL2 is not 
implemented. 


— Execution of a DCPS3 instruction when EDSCR.SDD is 1 or when EL3 is not implemented. 


— When the value of EDSCR.SDD is 1, execution in EL2, EL1, or EL0 of an instruction that is 
trapped to EL3. 


When executing in AArch32 state, execution of an instruction that is UNDEFINED as a result of any of: 
— Being in an IT block when SCTLR_EL1.ITD is 1. 
— Executing a SETEND instruction when SCTLR_EL1.SED is 1. 


— Executing a CP15DMB, CP15DSB, or CP15ISB barrier instruction when 
SCTLR_EL1.CP15BEN is 0. 


Note 


These are the controls for exceptions taken to AArch64 state. For exceptions taken to AArch32 state 
the equivalent controls are SCTLR.{ITD, SED, CP15BEN}, with additional controls 
HSCTLR.{ITD, SED, CP15BEN}. 








See Disabling or enabling EL0 use of AArch32 deprecated functionality on page D1-1567. 


When executing in AArch32 state, execution of an instruction that is UNDEFINED because at least one 
of FPCR.{Stride, Len} is nonzero, when programming these bits to nonzero values is supported. See 
Floating-point exception traps on page G1-3883. 


Note 


— This case applies only when EL0 is using AArch32 and EL1 is using AArch64. The exception 
generated by the attempted execution at EL0 of the UNDEFINED instruction is taken to EL1 
using AArch64. 


— When EL1 is using AArch32, the corresponding controls are FPSCR.{Stride, Len}, and any 
exception generated by the attempted execution at EL0 or EL1 of an instruction that is 
UNDEFINED because of a nonzero {Stride, Len} value is taken to EL1 using AArch32. 








8. Exceptions taken to EL1, or taken to EL2 because the value of HCR_EL2.TGE is 1, that are generated 
because of configurable access to instructions, and that are not covered by any of priorities 1-7. 


Note 





When EL2 is using AArch32, the equivalent control for routing exceptions to EL2 is HCR.TGE. 


9. Exceptions taken from EL0 to EL2 because of one of the following configuration settings: 
• HSTR_EL2.Tn. 
• HCR_EL2.TIDCP. 


Note 


These are the controls for exceptions taken to AArch64 state. For exceptions taken to AArch32 state the 
equivalent controls are HSTR.Tn and HCR.TIDCP. 


10. Exceptions taken to EL2 because of configuration settings in the CPTR_EL2. 


Note 


These are the controls for exceptions taken to AArch64 state. For exceptions taken to AArch32 state, the 
equivalent controls are in the HCPTR. 


11. Exceptions taken to EL2 because of one of the following configuration settings: 
• Any setting in HCR_EL2, other than the TIDCP bit. 
• Any setting in CNTHCTL_EL2. 
• Any setting in MDCR_EL2. 


Note 


These are the controls for exceptions taken to AArch64 state. For exceptions taken to AArch32 state, the 
equivalent controls are: 
• Any setting in HCR, other than the TIDCP bit. 
• Any setting in CNTHCTL or HDCR. 


12. Exceptions taken to EL2 because of configurable access to instructions, and that are not covered by any of 
priorities 1-11. 


13. Exceptions caused by the SMC instruction being UNDEFINED because the value of SCR_EL3.SMD is 1. 


14. Exceptions caused by the execution of an Exception generating instruction: 
• For exceptions taken from AArch64 state, Branches, Exception generating, and System instructions 
on page C3-142 defines these instructions. 
• When executing in AArch32 state, the exception-generating instructions are SVC, HVC, SMC, and BKPT. 


15. Exceptions taken to EL3 because of configuration settings in the CPTR_EL3. 


16. Exceptions taken to EL3 from Secure EL1 using AArch32, because of execution of the instructions listed in 
Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590. 


17. Exceptions taken to EL3 from EL0, EL1, or EL2 because of configuration settings in the MDCR_EL3. 


18. Exceptions taken to EL3 because of configurable access to instructions, and that are not covered by any of 
priorities 1-17. 


19. Trapped floating-point exceptions, if supported. See Floating-point exception traps on page D1-1552. 


20. SP alignment faults. See SP alignment checking on page D1-1515. 


21. Data Abort exceptions other than a Data Abort exception generated by a Synchronous external abort that was 
not generated by a translation table walk. That is, any Data Abort exception that is not covered by item 23. 
See Exception from a Data abort on page D1-1533 and AArch64 state prioritization of synchronous aborts 
from a single stage of address translation on page D4-1807. It is IMPLEMENTATION DEFINED whether 
Synchronous external aborts are prioritized here or as item 23. 


22. Watchpoint exceptions. See Watchpoint exceptions on page D2-1657. 


23. Data Abort exception generated by a Synchronous external abort that was not generated by a translation table 
walk, see External aborts on page D3-1714. It is IMPLEMENTATION DEFINED whether Synchronous external 
aborts are prioritized here or as item 21. 


For items 21-23, if an instruction results in more than one single-copy atomic memory access, the prioritization 
between synchronous exceptions generated on each of those different memory accesses is not defined by the 
architecture. 


Note 
Exceptions generated by a translation table walk are reported and prioritized as either an Instruction Abort 
exception, priority 3 in this list, or a Data Abort exception, priority 21 in this list. See also AArch64 state 
prioritization of synchronous aborts from a single stage of address translation on page D4-1807. 
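
The prioritization in this list can be sketched as a lookup. The labels below are hypothetical shorthand for priorities 1 to 23, introduced only for illustration, and the sketch does not model the IMPLEMENTATION DEFINED choice between items 21 and 23 for Synchronous external aborts.

```python
# Hypothetical shorthand labels for priorities 1-23, highest priority first.
AARCH64_SYNC_PRIORITY = [
    "software_step",                     # 1
    "pc_alignment_fault",                # 2
    "instruction_abort",                 # 3
    "breakpoint_or_vector_catch",        # 4
    "illegal_execution_state",           # 5
    "el1_to_el2_hstr_or_tidcp_trap",     # 6
    "undefined_instruction",             # 7
    "el1_configurable_trap",             # 8
    "el0_to_el2_hstr_or_tidcp_trap",     # 9
    "cptr_el2_trap",                     # 10
    "hcr_cnthctl_mdcr_el2_trap",         # 11
    "other_el2_configurable_trap",       # 12
    "smc_undefined_by_smd",              # 13
    "exception_generating_instruction",  # 14
    "cptr_el3_trap",                     # 15
    "secure_el1_aarch32_monitor_trap",   # 16
    "mdcr_el3_trap",                     # 17
    "other_el3_configurable_trap",       # 18
    "trapped_fp_exception",              # 19
    "sp_alignment_fault",                # 20
    "data_abort",                        # 21
    "watchpoint",                        # 22
    "sync_external_data_abort",          # 23
]
_RANK = {name: i + 1 for i, name in enumerate(AARCH64_SYNC_PRIORITY)}


def priority_of(name: str) -> int:
    """Return the priority number, where 1 is the highest priority."""
    return _RANK[name]


def taken_exception(candidates):
    """Return the candidate with the highest priority, or None if there is none."""
    if not candidates:
        return None
    return min(candidates, key=priority_of)
```

For example, if one instruction could generate both an SP alignment fault and a Watchpoint exception, `taken_exception` selects the SP alignment fault, because priority 20 is higher than priority 22.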








D1.13.3 Effect of Data Aborts 


If an instruction that stores to memory generates a Data Abort, the value of each memory location that instruction 
stores to is either: 


• Unchanged, if one of the following applies: 
— An MMU fault is generated. 
— A Watchpoint exception is generated. 
— An external abort is generated, if that external abort is taken synchronously. 


Note 
If an external abort is taken asynchronously, using the SError interrupt, it is outside the scope of the 
architecture to define the effect of the store on the memory location, because it depends on the 
system-specific nature of the external abort. However, in general, ARM recommends that such 
memory locations are not updated. 


• UNKNOWN for any location for which no exception and no debug event is generated. 


For external aborts and Watchpoint exceptions, the size of a memory location is defined as being the size for which 
a memory access is single-copy atomic. 


Note 


For the definition of a single-copy atomic access, see Properties of single-copy atomic accesses on page B2-82. 








For Data Aborts from load or store instructions executed in AArch64 state, if the: 


Data Abort is taken synchronously 


° If the load or store instruction specifies writeback of a new base address, the base address is 
restored to the original value on taking the exception. 


° If the instruction was a load to either the base address register or the offset register, that 
register is restored to the original value. Any other destination registers become UNKNOWN. 


° If the instruction was a load that does not load the base address register or the offset register, 
then the destination registers become UNKNOWN. 
Data Abort is taken asynchronously, using the SError interrupt 


If the instruction was a load, the destination registers of the load take an UNKNOWN value if the 
SError interrupt is taken at a point in the instruction stream after the load. 
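
The register rules for a load that takes a synchronous Data Abort can be sketched as follows. This is an illustrative model only: the function name, the register names, and the UNKNOWN sentinel are hypothetical, and the sketch covers only the synchronous case for load instructions.

```python
UNKNOWN = object()  # sentinel standing in for an architecturally UNKNOWN value


def regs_after_sync_data_abort(regs_before: dict, dest_regs, base_reg: str,
                               offset_reg: str = None) -> dict:
    """Model the register state seen by the handler after a load aborts synchronously.

    regs_before holds the values from before the instruction executed, so copying
    it models restoring the base address to its original value, with or without
    writeback.
    """
    after = dict(regs_before)
    for reg in dest_regs:
        if reg == base_reg or reg == offset_reg:
            # A load to the base or offset register: the register is restored.
            after[reg] = regs_before[reg]
        else:
            # Any other destination register becomes UNKNOWN.
            after[reg] = UNKNOWN
    return after
```

For a load pair that writes both a scratch register and its own base register, the model restores the base register and marks the other destination as UNKNOWN.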





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. D1-1551 
ID092916 Non-Confidential 


D1.13.4 Floating-point exception traps 
Execution of a floating-point instruction, or an Advanced SIMD instruction that performs floating-point operations, 
can generate an exceptional condition, called a floating-point exception. 
Note 

In AArch64 state, a floating-point instruction performs only a single floating-point operation. However, an 
Advanced SIMD instruction that operates on floating-point values can perform multiple floating-point operations. 
Therefore, this section describes the handling of a floating-point exception on an operation, rather than on an 
instruction. 

The ARMv8-A architecture supports synchronous exception generation in the event of any or all of the following 
floating-point exceptions: 

• Input Denormal. 
• Inexact. 
• Underflow. 
• Overflow. 
• Divide by Zero. 
• Invalid Operation. 

Whether an implementation includes synchronous exception generation for these floating-point exceptions is 
IMPLEMENTATION DEFINED: 

° For an implementation that does provide this capability, FPCR.{IDE, IXE, UFE, OFE, DZE, IOE} are the 
control bits that enable synchronous exception generation for each of the different floating-point exceptions. 

° For an implementation that does not provide this capability, the FPCR.{IDE, IXE, UFE, OFE, DZE, IOE} 
bits are RAZ/WI. 
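
The two cases above can be sketched as a model of writes to the trap enable bits. The helper names are hypothetical, and the bit positions are assumptions that are not given in this section and should be checked against the FPCR register description. Other FPCR fields are passed through unchanged, which ignores any reserved bits.

```python
# ASSUMPTION: bit positions of the trap enable bits; verify against the FPCR
# register description.
FPCR_TRAP_ENABLES = {"IOE": 8, "DZE": 9, "OFE": 10, "UFE": 11, "IXE": 12, "IDE": 15}
TRAP_ENABLE_MASK = sum(1 << bit for bit in FPCR_TRAP_ENABLES.values())


def fpcr_after_write(value: int, traps_supported: bool) -> int:
    """Model a write to FPCR, treating the trap enables as RAZ/WI if unsupported."""
    if not traps_supported:
        value &= ~TRAP_ENABLE_MASK  # writes to the trap enable bits are ignored
    return value


def enabled_traps(fpcr: int) -> set:
    """Return the names of the trap enable bits that are set."""
    return {name for name, bit in FPCR_TRAP_ENABLES.items() if (fpcr >> bit) & 1}
```

On an implementation modeled with `traps_supported=False`, writing all six enables leaves `enabled_traps` empty, which matches the RAZ/WI behavior described above.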

Note 

The ARMv8-A architecture does not support asynchronous reporting of floating-point exceptions. 

When generating synchronous exceptions for one or more floating-point exceptions is enabled, the synchronous 
exceptions generated by the floating-point exception traps are taken to the lowest Exception level that can handle 
such an exception, while adhering to the rule that an exception can never be taken to a lower Exception level. This 
means that trapped floating-point exceptions taken: 

• From EL0 are taken to EL1, unless they are taken from Non-secure state when the value of HCR_EL2.TGE 
is 1, when they are taken to EL2 instead. 

• From EL1 are taken to EL1. 

• From EL2 are taken to EL2. 

• From EL3 are taken to EL3. 
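
The routing rule in this list can be written as a small function. The function name and parameters are hypothetical and serve only to restate the four cases.

```python
def trapped_fp_exception_target(from_el: int, non_secure: bool, hcr_el2_tge: bool) -> int:
    """Return the Exception level number that takes a trapped floating-point exception."""
    if from_el not in (0, 1, 2, 3):
        raise ValueError("Exception level must be 0, 1, 2 or 3")
    if from_el == 0:
        # Taken to EL1, unless taken from Non-secure state with HCR_EL2.TGE set to 1.
        return 2 if (non_secure and hcr_el2_tge) else 1
    # From EL1, EL2 or EL3 the exception is taken to the same Exception level.
    return from_el
```

The result is never lower than `from_el`, which reflects the rule that an exception is never taken to a lower Exception level.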

The exception is reported in the ESR_ELx for the Exception level to which it is taken. 

In an implementation that includes synchronous exception generation for floating-point exceptions in AArch64 
state: 

• The registers that are presented to the exception handler are consistent with the state of the PE immediately 
before the instruction that caused the exception. An implementation is permitted not to restore the cumulative 
exception bits in the event of such an exception. For more information see Combinations of floating-point 
exceptions on page D1-1553. 

• When the execution of separate operations in separate SIMD elements causes multiple floating-point 
exceptions, the ESR_ELx reports one exception associated with one element that the instruction uses. The 
architecture does not specify which element is reported, however ESR_ELx identifies the reported element. 


The AArch64.FPTrappedException() and FPProcessException() pseudocode functions describe the handling of 
trapped floating-point exceptions generated in AArch64 state. 


Combinations of floating-point exceptions 


Many pseudocode functions perform floating-point operations, including FixedToFP(), FPAdd(), FPCompare(), 
FPCompareEQ(), FPCompareGE(), FPCompareGT(), FPDiv(), FPMax(), FPMin(), FPMul(), FPMulAdd(), FPRecipEstimate(), 
FPRecipStepFused(), FPRSqrtEstimate(), FPRSqrtStepFused(), FPSqrt(), FPSub(), and FPToFixed(). All of these 
operations can generate floating-point exceptions. 





Note 
FPAbs() and FPNeg() are not classified as floating-point operations because: 
° They cannot generate floating-point exceptions. 
° The floating-point operation behavior described in the following sections does not apply to them: 


—  Flush-to-zero on page A1-49. 
— NaN handling and the Default NaN on page A1-50. 





More than one floating-point exception can occur on the same operation. The only combinations of exceptions that 
can occur are: 


° Overflow with Inexact. 
° Underflow with Inexact. 
° Input Denormal with other exceptions. 


The priority order of these floating-point exceptions is that the Inexact exception is treated as lowest priority, and 
the Input Denormal exception is treated as highest priority. 
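
The permitted combinations and their priority order can be sketched as follows. The exception names and function names are hypothetical labels, and the sketch assumes that "Input Denormal with other exceptions" means Input Denormal together with any single other exception or with either of the two Inexact pairs.

```python
def is_possible_combination(excs) -> bool:
    """Check a set of exceptions on one operation against the listed combinations."""
    rest = set(excs) - {"InputDenormal"}
    if len(rest) <= 1:
        return True  # a single exception, alone or together with Input Denormal
    return rest in ({"Overflow", "Inexact"}, {"Underflow", "Inexact"})


def highest_priority(excs):
    """Return the highest priority exception of a permitted combination."""
    excs = set(excs)
    if not is_possible_combination(excs):
        raise ValueError("combination cannot occur on a single operation")
    if not excs:
        return None
    if "InputDenormal" in excs:
        return "InputDenormal"  # treated as highest priority
    others = excs - {"Inexact"}  # Inexact is treated as lowest priority
    return next(iter(others)) if others else "Inexact"
```

For an operation that generates both Overflow and Inexact, this model reports Overflow as the higher priority exception.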


Some floating-point instructions specify more than one floating-point operation, as indicated by the pseudocode 
descriptions of the instruction. In such cases, an exception on one operation is treated as higher priority than an 
exception on another operation if the occurrence of the second exception depends on the result of the first operation. 
Otherwise, it is CONSTRAINED UNPREDICTABLE which exception is treated as higher priority, where the exception 
prioritized might differ between different instances of the same two floating-point exceptions being generated on 
the same operation during execution of the instruction. 


When none of the floating-point exceptions caused by an operation is trapped, any floating-point exception that 
occurs causes the associated cumulative bit in the FPSR to be set to 1. 


When a floating-point exception is trapped: 


• It is IMPLEMENTATION DEFINED whether the FPSR is restored when the trapped exception is taken. If the 
FPSR is not restored, then it is CONSTRAINED UNPREDICTABLE which untrapped floating-point 
exceptions, if any, are indicated by the corresponding FPSR cumulative exception bits having the value 1. 





— The value of the floating-point exception trapped bit for any other untrapped floating-point exception 
generated by the same operation must be 0. This applies to both higher priority and lower priority 
untrapped floating-point exceptions. 


— The value of the floating-point exception trapped bit for any lower priority trapped floating-point 
exception generated by the same operation might be 1, but the architecture does not require this. 


For trapped floating-point exceptions from Advanced SIMD instructions, the architecture does not define the 
exception prioritization between different elements of the instruction. The architectural requirements for 
floating-point exception prioritization apply only to multiple floating-point exceptions generated on the same 
element of an Advanced SIMD operation. 
Note 


An implementation might provide information about lower priority or untrapped floating-point exceptions in an 
IMPLEMENTATION DEFINED way, for example using an IMPLEMENTATION DEFINED register. 
D1.14 Asynchronous exception types, routing, masking and priorities 


In the ARMv8-A architecture, asynchronous exceptions that are taken to AArch64 state are also known as 
interrupts. 


There are two types of interrupts: 


Physical interrupts Are signals sent to the PE from outside the PE. They are: 

° SError. System Error. 
° IRQ. 
° FIQ. 

Virtual interrupts Are interrupts that software executing at EL2 can enable and make pending. A virtual 
interrupt is taken from Non-secure EL0 or Non-secure EL1 to Non-secure EL1. 

Virtual interrupts have names that correspond to the physical interrupts: 

° vSError. 
° vIRQ. 
° vFIQ. 

Note 
° For information about how virtual interrupts might be used see Virtual interrupt usage model on 
page D1-1505. 
° The SError interrupt replaces the ARMv7 asynchronous abort. The new name better describes the nature of 
the exception, and means that it is categorized as a unique exception class, with EC encoding 0x2F. 





An external abort generated by the memory system might be taken asynchronously using the SError interrupt. The 
effect of a failed memory access is described in Effect of Data Aborts on page D1-1551. 


Each physical interrupt type can be assigned a target Exception level of EL1, EL2 or EL3, as shown in Asynchronous 
exception routing on page D1-1556. 


When an interrupt occurs: 


° On taking an SError or a vSError interrupt to an Exception level using AArch64, the Exception Syndrome 
register for that Exception level is updated with the encoding for an SError interrupt. See Exception classes 
and the ESR_ELx syndrome registers on page D1-1523. 


° On taking an IRQ, vIRQ, FIQ or vFIQ interrupt to an Exception level using AArch64, the Exception 
Syndrome register for that Exception level is not updated. 


The remainder of this section contains the following: 





° Asynchronous exception routing on page D1-1556. 
° Asynchronous exception masking on page D1-1557. 
° Virtual interrupts on page D1-1558. 
° Prioritization and recognition of interrupts on page D1-1560. 
° Taking an interrupt or other exception during a multiple-register load or store on page D1-1560. 
D1.14.1 Asynchronous exception routing 
The following tables show the routing of physical interrupts when the highest implemented Exception level is using 
AArch64. 
In the tables, C indicates that the interrupt is not taken, regardless of the Process state interrupt mask. 

Table D1-11 Routing when both EL3 and EL2 are implemented 

                                                        Target Exception level when executing in: 
SCR_EL3.{EA, IRQ, FIQ}  SCR_EL3.RW  {AMO, IMO, FMO}(a)  Non-secure          Secure 
                                                        EL0   EL1   EL2     EL0   EL1   EL3 
0                       0           0                   EL1   EL1   EL2     EL1   EL1   C 
0                       x           1                   EL2   EL2   EL2     EL1   EL1   C 
0                       1           0                   EL1   EL1   C       EL1   EL1   C 
1                       x           x                   EL3   EL3   EL3     EL3   EL3   EL3 

a. If EL2 is using AArch64, these are the HCR_EL2.{AMO, IMO, FMO} control bits. If EL2 is using AArch32, these are the 
HCR.{AMO, IMO, FMO} control bits. If HCR_EL2.TGE or HCR.TGE is 1, these bits are treated as being 1 other than for 
a direct read. 

Table D1-12 Routing when EL3 is implemented and EL2 is not implemented 

                        Target Exception level when executing in: 
SCR_EL3.{EA, IRQ, FIQ}  Non-secure    Secure 
                        EL0   EL1     EL0   EL1   EL3 
0                       EL1   EL1     EL1   EL1   C 
1                       EL3   EL3     EL3   EL3   EL3 

Table D1-13 Routing when EL3 is not implemented and EL2 is implemented 

                             Target Exception level when executing in: 
HCR_EL2.{AMO, IMO, FMO}(a)   Non-secure 
                             EL0   EL1   EL2 
1                            EL2   EL2   EL2 
0                            EL1   EL1   C 

a. If HCR_EL2.TGE is 1, these bits are treated as being 1 other than for 
a direct read. 
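
The routing in Table D1-11 can be expressed as a small decision function. The following Python sketch is an illustration only: the function name and its parameters are hypothetical, `scr_bit` stands for the relevant SCR_EL3.{EA, IRQ, FIQ} bit, and `hcr_bit` stands for the effective value of the matching {AMO, IMO, FMO} bit, that is, the value after any HCR_EL2.TGE override has been applied.

```python
def route_physical_interrupt(scr_bit, scr_rw, hcr_bit, current_el, secure):
    """Return 'EL1', 'EL2', 'EL3', or 'C' (not taken), following Table D1-11.

    Models an implementation that includes both EL2 and EL3.
    """
    if current_el not in (0, 1, 2, 3):
        raise ValueError("current_el must be 0, 1, 2, or 3")
    if (secure and current_el == 2) or (not secure and current_el == 3):
        raise ValueError("EL2 is Non-secure only and EL3 is Secure only in this table")

    if scr_bit == 1:
        return "EL3"                      # last row: always routed to EL3
    if secure:
        return "C" if current_el == 3 else "EL1"
    if current_el == 2:
        # From EL2 the interrupt is taken to EL2 when routed there, or when
        # SCR_EL3.RW is 0; otherwise it targets EL1 and is not taken.
        return "EL2" if (hcr_bit == 1 or scr_rw == 0) else "C"
    return "EL2" if hcr_bit == 1 else "EL1"
```

A caller would evaluate this once per interrupt type, passing the EA and AMO bits for SError, IRQ and IMO for IRQ, and FIQ and FMO for FIQ.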
D1.14.2 Asynchronous exception masking 
When an interrupt is masked, it means that it cannot be taken. Instead, it remains pending. 
When executing in AArch64 state, interrupts are masked implicitly when the target Exception level of the interrupt 
is lower than the current Exception level. 
In addition, interrupts can be masked when the target Exception level is the current Exception level. The controls 
for this are: 
SError PSTATE.A 
IRQ PSTATE.I 
FIQ PSTATE.F 
When the target Exception level is higher than the current Exception level: 
° If the target Exception level is EL2 or EL3, the interrupt cannot be masked by the PSTATE.{A, I, F} bits. 
° If the target Exception level is EL1, the interrupt can be masked by the PSTATE.{A, I, F} bits. 
Note 
° The ability to execute in EL0 with interrupts to EL1 masked is required by some user level driver code. 
° The PSTATE.{A, I, F} bits can mask both physical interrupts and virtual interrupts. 
° The ARMv8-A architecture does not support Non-maskable FIQ (NMFI) operations. This means that it does 
not provide a configuration option to override the masking of FIQs by PSTATE.F. 
On taking any exception to an Exception level using AArch64, all of PSTATE.{A, I, F} are set to 1, masking all 
interrupts that target that Exception level. 
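
The masking rules above reduce to a comparison between the current and target Exception levels. The following Python sketch shows one way to model them. The function names and the result strings `"always"`, `"pstate"`, and `"never"` are hypothetical labels introduced for this example, and `target_el` is assumed to be the Exception level the interrupt is actually routed to.

```python
def mask_effect(current_el, target_el):
    """Classify how PSTATE.{A, I, F} affects an interrupt.

    Returns "never"  if the interrupt is implicitly masked,
            "pstate" if the relevant PSTATE bit decides,
            "always" if the interrupt cannot be masked by PSTATE.
    """
    if target_el < current_el:
        return "never"        # target lower than current: implicitly masked
    if target_el == current_el:
        return "pstate"       # same level: PSTATE mask bit applies
    # Target is higher than current: only a target of EL1 remains maskable.
    return "pstate" if target_el == 1 else "always"


def is_taken(current_el, target_el, pstate_mask_bit):
    """Return True if a pending interrupt can be taken."""
    effect = mask_effect(current_el, target_el)
    if effect == "always":
        return True
    if effect == "never":
        return False
    return pstate_mask_bit == 0
```

For example, `is_taken(0, 1, 1)` is False, which corresponds to EL0 code running with interrupts to EL1 masked.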
The following tables show the masking of physical interrupts when the highest implemented Exception level is 
using AArch64: 
° For implementations that include both EL2 and EL3, see Table D1-14. 
° For implementations that include EL3 but not EL2, see Table D1-15 on page D1-1558. 
° For implementations that include EL2 but not EL3, see Table D1-16 on page D1-1558. 
For the masking of virtual interrupts, see Virtual interrupts on page D1-1558. 
In the tables: 
A When the interrupt is asserted it is taken regardless of the value of the Process state interrupt mask. 
B When the interrupt is asserted it is subject to the corresponding Process state mask. If the value of 
the mask is 1 then the interrupt is not taken. If the value of the mask is 0 the interrupt is taken. 
C When the interrupt is asserted it is not taken, regardless of the value of the Process state interrupt 
mask. 

Table D1-14 Physical interrupt masking when both EL3 and EL2 are implemented 

                                                        Target      Effect of the interrupt mask when executing in: 
SCR_EL3.{EA, IRQ, FIQ}  SCR_EL3.RW  {AMO, IMO, FMO}(a)  Exception   Non-secure        Secure 
                                                        level       EL0  EL1  EL2     EL0  EL1  EL3 
0                       0           0                   EL1         B    B    B       B    B    C 
0                       0           1                   EL2         A    A    B       B    B    C 
0                       1           0                   EL1         B    B    C       B    B    C 
0                       1           1                   EL2         A    A    B       B    B    C 

Table D1-14 Physical interrupt masking when both EL3 and EL2 are implemented (continued) 

                                                        Target      Effect of the interrupt mask when executing in: 
SCR_EL3.{EA, IRQ, FIQ}  SCR_EL3.RW  {AMO, IMO, FMO}(a)  Exception   Non-secure        Secure 
                                                        level       EL0  EL1  EL2     EL0  EL1  EL3 
1                       x           x                   EL3         A    A    A       A    A    B 

a. If EL2 is using AArch64, these are the HCR_EL2.{AMO, IMO, FMO} control bits. If EL2 is using AArch32, these are the 
HCR.{AMO, IMO, FMO} control bits. If HCR_EL2.TGE or HCR.TGE is 1, these bits are treated as being 1 other than for a 
direct read. 


Table D1-15 Physical interrupt masking when EL3 is implemented and EL2 is not implemented 

                        Target      Effect of the interrupt mask when executing in: 
SCR_EL3.{EA, IRQ, FIQ}  Exception   Non-secure   Secure 
                        level       EL0  EL1     EL0  EL1  EL3 
0                       EL1         B    B       B    B    C 
1                       EL3         A    A       A    A    B 














Table D1-16 Physical interrupt masking when EL3 is not implemented and EL2 is implemented 

                             Target      Effect of the interrupt mask when executing in: 
HCR_EL2.{AMO, IMO, FMO}(a)   Exception   Non-secure 
                             level       EL0  EL1  EL2 
0                            EL1         B    B    C 
1                            EL2         A    A    B 

a. If HCR_EL2.TGE is 1, these bits are treated as being 1 other than for a direct read. 











D1.14.3 Virtual interrupts 
When the value of HCR_EL2.TGE is 0, setting an HCR_EL2.{FMO, IMO, AMO} routing control bit to 1 enables 
the corresponding virtual interrupt. When the value of HCR_EL2.TGE is 1 all virtual interrupts are disabled. 
Virtual interrupts can only be taken from Non-secure EL0 or Non-secure EL1 to Non-secure EL1. When a virtual 
interrupt type is enabled, that type of interrupt can be generated by: 
° Software setting the corresponding virtual interrupt pending bit, HCR_EL2.{VSE, VI, VF}, to 1. 
° For a vIRQ or a vFIQ, an IMPLEMENTATION DEFINED mechanism. This might be a signal from an interrupt 
controller. See, for example, the ARM Generic Interrupt Controller Architecture Specification. 
Note 
For a usage model for virtual interrupts, see Virtual interrupt usage model on page D1-1505. 
When a virtual interrupt is disabled: 
° It cannot be taken. 
° It cannot be seen in the ISR_EL1. 


Each virtual interrupt type can be masked when execution is in Non-secure EL1 or EL0, by using the same Process 
State mask bits that mask the physical interrupts, PSTATE.{A, I, F}. 


When execution is in Secure state, or at EL2, all types of virtual interrupt are always masked. 


Table D1-17 summarizes the bits that enable virtual interrupts and the bits that cause virtual interrupts to be pending. 


Table D1-17 HCR_EL2 interrupt control bits 

Virtual interrupt type   Enable control(a)   Cause a virtual interrupt to be pending 
vSError                  HCR_EL2.AMO         HCR_EL2.VSE 
vIRQ                     HCR_EL2.IMO         HCR_EL2.VI 
vFIQ                     HCR_EL2.FMO         HCR_EL2.VF 

a. Applies only when the value of HCR_EL2.TGE is 0, otherwise the virtual interrupts are disabled. 
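
The relationship between the enable bits, the pending bits, and HCR_EL2.TGE in Table D1-17 can be sketched as follows. This is an illustrative Python model rather than a definitive implementation: the helper names are hypothetical, and the bit positions used for the HCR_EL2 fields are assumptions that should be checked against the HCR_EL2 register description.

```python
# Assumed HCR_EL2 bit positions (verify against the register description).
HCR_FMO = 1 << 3
HCR_IMO = 1 << 4
HCR_AMO = 1 << 5
HCR_VF = 1 << 6
HCR_VI = 1 << 7
HCR_VSE = 1 << 8
HCR_TGE = 1 << 27

# Virtual interrupt type -> (enable control bit, software pending bit).
_VIRTUAL = {
    "vSError": (HCR_AMO, HCR_VSE),
    "vIRQ": (HCR_IMO, HCR_VI),
    "vFIQ": (HCR_FMO, HCR_VF),
}


def virtual_interrupt_enabled(hcr_el2, kind):
    """Enabled only when TGE is 0 and the matching routing bit is 1."""
    enable_bit, _ = _VIRTUAL[kind]
    return not (hcr_el2 & HCR_TGE) and bool(hcr_el2 & enable_bit)


def software_pending(hcr_el2, kind):
    """True if software has made an enabled virtual interrupt pending."""
    _, pending_bit = _VIRTUAL[kind]
    return virtual_interrupt_enabled(hcr_el2, kind) and bool(hcr_el2 & pending_bit)
```

The sketch covers only the software pending bits. A vIRQ or vFIQ raised by an IMPLEMENTATION DEFINED mechanism, such as an interrupt controller signal, is outside its scope.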


On taking a vIRQ or a vFIQ interrupt, the corresponding virtual interrupt pending bit in the HCR_EL2 retains its 
state. 


On taking a vSError interrupt, HCR_EL2.VSE is cleared to 0. 


Note 


This means that if the virtual interrupt pending bits are used, the vIRQ or vFIQ exception handler must cause 
software executing at EL2 or EL3 to set the corresponding virtual interrupt pending bit to 0. 








As with physical interrupts: 


° Taking a vSError interrupt to an Exception level using AArch64 updates ESR_EL1 with the dedicated 
encoding for an SError interrupt. For the encoding, see Exception classes and the ESR_ELx syndrome 
registers on page D1-1523. 


° Taking a vIRQ or a vFIQ interrupt to an Exception level using AArch64 does not update the ESR_EL1. 


The following table shows the masking of virtual interrupts when the highest implemented Exception level is using 
AArch64. In the table: 


B When the interrupt is asserted it is subject to the corresponding Process state mask. If the value of 
the mask is 1 then the interrupt is not taken. If the value of the mask is 0 the interrupt is taken. 


C When the interrupt is asserted it is not taken, regardless of the value of the Process state mask. 


Table D1-18 Virtual interrupt masking 

                                                       Effect of the interrupt mask when executing in: 
SCR_EL3.{EA, IRQ, FIQ}  {AMO, IMO, FMO}(a)  TGE(a)     Non-secure        Secure 
                                                       EL0  EL1  EL2     EL0  EL1  EL3 
                        0                   x          C    C    C       C    C    C 
                        1                   0          B    B    C       C    C    C 
                        1                   1          C    C    C       C    C    C 

a. If EL2 is using AArch64, these are the HCR_EL2.{TGE, AMO, IMO, FMO} control bits. If EL2 
is using AArch32, these are the HCR.{TGE, AMO, IMO, FMO} control bits. 




















D1.14.4 Prioritization and recognition of interrupts 
The ARMv8-A architecture does not define when interrupts are taken. The prioritization of interrupts, including 
virtual interrupts, is IMPLEMENTATION DEFINED. 
Note 

As indicated at the start of Asynchronous exception types, routing, masking and priorities on page D1-1555, in 
AArch64 state all possible asynchronous exceptions are defined as interrupts. 

Any interrupt that is pending before one of the following context synchronizing events is taken before the first 
instruction after the context synchronizing event, provided that the pending interrupt is not masked: 

° Execution of an ISB instruction. 
° Exception entry. 
° Exception return. 
° Exit from Debug state. 

Note 

° If the first instruction after the context synchronizing event generates a synchronous exception, then the 
architecture does not define whether the PE takes the interrupt or the synchronous exception first. 
° The ISR_EL1 identifies any pending interrupts. 
° Interrupts are masked when the PE is in Debug state, and therefore this list of context synchronizing events 
does not include the DCPS and DRPS instructions. 

In the absence of a specific requirement to take an interrupt, the architecture only requires that unmasked pending 
interrupts are taken in finite time. 

If an unmasked interrupt was pending but is changed to not pending before it is taken, then the architecture permits 
the interrupt to be taken, but does not require this to happen. If the interrupt is taken then it must be taken before the 
first Context synchronization event after the interrupt was changed to not pending. 

D1.14.5 Taking an interrupt or other exception during a multiple-register load or store 

In AArch64 state, interrupts can be taken during a sequence of memory accesses caused by a single load or store 
instruction. This is true regardless of the memory type being accessed. 

If an interrupt, or another exception, is taken from AArch64 during the execution of an instruction that performs a 
sequence of memory accesses, rather than a single single-copy atomic access, then: 

° For a load, any register being loaded by the instruction other than ones used in the generation of the address 
by the instruction, can contain an UNKNOWN value. Registers used in the generation of the address are 
restored to their initial value. 
° For a store, any data location being stored to by the instruction can contain an UNKNOWN value. 
° For either a load or a store, if the instruction specifies writeback of the base address, then that register is 
restored to its initial value. 

Note 

° This interrupt behavior is in contrast to behavior in AArch32 state, when interrupts cannot be taken during a 
sequence of memory accesses caused by a single load or store instruction. 
° In both Execution states, synchronous data abort exceptions can be taken during the execution of an 
instruction that performs a sequence of memory accesses. 
° Software must avoid using multiple-register load and store instructions for accesses to Device memory, 
particularly to Device memory with the non-Gathering attribute, because an exception taken during the load 
or store can result in repeated accesses. 

D1.15 Configurable instruction enables and disables, and trap controls 


This section describes the controls provided by AArch64 state for enabling, disabling, and trapping particular 
instructions. Each control is categorized as an instruction enable, an instruction disable, or a trap control: 

Instruction enables and instruction disables 

    Enable or disable the use of one or more particular instructions at a particular Exception level and 
    Security state. 

    When an instruction is disabled as a result of an instruction enable or disable, it is UNDEFINED. 

Trap controls 

    A trap control determines whether one or more particular instructions, whenever executed at a 
    particular Exception level, are trapped. 

    A trapped instruction generates a Trap exception. 

    For trap controls provided by: 

    EL1 Trap exceptions are taken to EL1, unless routed from Non-secure EL0 to EL2 because 
        HCR_EL2.TGE is 1 as described in Routing exceptions to EL2 on page D1-1548. 

        For descriptions of these controls see EL1 configurable controls on page D1-1563. 

    EL2 Trap exceptions are taken to EL2. 

        For descriptions of these controls see EL2 configurable controls on page D1-1571. 

    EL3 Trap exceptions are taken to EL3. 

        For descriptions of these controls see EL3 configurable controls on page D1-1589. 





Note 


The definitions of traps and enables and disables overlap, and the classification of some controls is historical. In 
AArch64 state, the most significant characteristic of an exception report is the ESR_ELx.EC value with which it is 
reported. Describing a register control field as an instruction enable, an instruction disable, or a trap control, gives 
no indication of how an exception that is generated as a consequence of the value of that field is handled or reported. 





An exception generated as a result of an instruction enable or disable, or a trap control, is only taken if both of the 
following apply: 


° The instruction generating the exception does not also generate a higher priority exception. Synchronous 
exception prioritization for exceptions taken to AArch64 on page D1-1548 defines the prioritization of 
different exceptions on the same instruction. 


° The instruction is not UNPREDICTABLE or CONSTRAINED UNPREDICTABLE in the PE state it is executed in. 
UNPREDICTABLE and CONSTRAINED UNPREDICTABLE instructions can generate exceptions as a result of these 
controls, but the architecture does not require them to do so. 


Exceptions generated as a result of these controls are synchronous exceptions. 
Exceptions are reported in the ESR_ELx, with an EC value that indicates the Exception class, and: 
° Many cases, including all traps, are reported with a non-zero EC value and an associated syndrome. 


° Some cases where an instruction is UNDEFINED are reported with an EC value 0x00, the value for an exception 
for an unknown or uncategorized reason, and in these cases no syndrome is provided. ISS encoding for 
exceptions with an unknown reason on page D7-1938 identifies the cases that are reported with EC value 
0x00. 
Table D1-8 on page D1-1524 lists the EC values that are used for exceptions that result from traps, enables, and 
disables. 
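
A handler typically starts by extracting the EC value from the syndrome register. The following Python sketch shows the idea for the EC values named in this section. It is illustrative only: `decode_esr` and `EC_NAMES` are hypothetical names, the dictionary is deliberately incomplete, and the field positions (EC in bits [31:26], ISS in bits [24:0]) are assumptions to be checked against the ESR_ELx register description.

```python
# EC values mentioned in this section; not a complete list.
EC_NAMES = {
    0x00: "Exception for an unknown reason",
    0x01: "Trapped WFI or WFE instruction",
    0x05: "Trapped MCR or MRC access (coproc==0b1110)",
    0x07: "Trapped access to a SIMD or floating-point register",
    0x0C: "Trapped MCRR or MRRC access (coproc==0b1110)",
    0x18: "Trapped AArch64 MSR, MRS, or system instruction",
    0x2F: "SError interrupt",
}


def decode_esr(esr):
    """Split an ESR_ELx value into (EC, description, ISS)."""
    ec = (esr >> 26) & 0x3F          # assumed field position: bits [31:26]
    iss = esr & 0x1FFFFFF            # assumed field position: bits [24:0]
    return ec, EC_NAMES.get(ec, "not covered by this sketch"), iss
```

For an EC value of 0x00 no syndrome is provided, so a caller should not interpret the returned ISS in that case.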


Note 


° A particular control might have a mnemonic that suggests it is a different type of control from the control type it 
is categorized as. For example, SCTLR_EL1.DZE is a trap control even though DZE means DC ZVA Enable. 





° In addition to the controls described in this section, a routing control, HCR_EL2.TGE, can be used to route 
exceptions from ELO to EL2. See Routing exceptions to EL2 on page D1-1548. 


° An implementation might provide additional controls, in IMPLEMENTATION DEFINED registers, to provide 
control of trapping of IMPLEMENTATION DEFINED features. 





This section is organized as follows: 

° Register access instructions. 
° EL1 configurable controls. 
° EL2 configurable controls on page D1-1571. 
° EL3 configurable controls on page D1-1589. 


D1.15.1 Register access instructions 


When an instruction is disabled or trapped, the exception is taken before execution of the instruction. This means 
that if the instruction is a register access instruction: 


° No access is made before the exception is taken. 


° Side-effects that are normally associated with the access do not occur before the exception is taken. 


D1.15.2 EL1 configurable controls 


These controls are in _EL1 System registers. The resulting exceptions might be taken from either Execution state. 
SPSR_EL1.M[4] indicates which Execution state the exception was taken from. 
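
As a minimal sketch of how a handler might use SPSR_EL1.M[4], the following Python function tests that bit. The function name is hypothetical. The sketch assumes that M[4] is bit 4 of the register and that a value of 1 indicates AArch32, which should be confirmed against the SPSR_EL1 register description.

```python
def taken_from_aarch32(spsr_el1):
    """Return True if SPSR_EL1.M[4] indicates the exception came from AArch32."""
    return bool((spsr_el1 >> 4) & 1)
```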


Table D1-19 shows the _EL1 System registers that contain these controls. 


Table D1-19 _EL1 registers that contain instruction enables and disables, and trap controls 

Register name    Register description 
SCTLR_EL1        System Control Register, EL1 
CPACR_EL1        Architectural Feature Access Control Register 
MDSCR_EL1        Monitor System Debug Control Register 
PMUSERENR_EL0    Performance Monitors User Enable Register 





Table D1-20 summarizes the controls. 


Table D1-20 Instruction enables and disables, and trap controls, provided by EL1 

Control                                                 Control   Description 
                                                        type(a) 
SCTLR_EL1.UCI                                           T         Traps to EL1 of EL0 execution of cache maintenance instructions on page D1-1564 
SCTLR_EL1.UCT                                           T         Traps to EL1 of EL0 accesses to the CTR_EL0 on page D1-1565 
SCTLR_EL1.{nTWE, nTWI}                                  T         Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565 
SCTLR_EL1.DZE                                           T         Traps to EL1 of EL0 execution of DC ZVA instructions on page D1-1566 
SCTLR_EL1.UMA                                           T         Traps to EL1 of EL0 accesses to the PSTATE.{D, A, I, F} interrupt masks on page D1-1566 
SCTLR_EL1.{SED, ITD}                                    D         Disabling or enabling EL0 use of AArch32 deprecated functionality on page D1-1567 
SCTLR_EL1.CP15BEN                                       E         Disabling or enabling EL0 use of AArch32 deprecated functionality on page D1-1567 
CPACR_EL1.TTA                                           T         Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567 
CPACR_EL1.FPEN                                          T         Traps to EL1 of EL0 and EL1 accesses to SIMD and floating-point functionality on page D1-1568 
MDSCR_EL1.TDCC                                          T         Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568 
CNTKCTL_EL1.{EL0PTEN, EL0VTEN, EL0PCTEN, EL0VCTEN}      T         Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569 
PMUSERENR_EL0.{ER, CR, SW, EN}                          T         Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570 

a. See Table D1-21. 


Table D1-21 Control types, for exceptions taken to EL1 

Abbreviation   Type      See 
D              Disable   Instruction enables and instruction disables on page D1-1562 
E              Enable    Instruction enables and instruction disables on page D1-1562 
T              Trap      Trap controls on page D1-1562 
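
Several of the SCTLR_EL1 trap controls summarized above share one polarity: as the following subsections describe, the EL0 operation is trapped when the control bit is 0. The Python sketch below lists which of those traps are active for a given register value. It is an illustration under stated assumptions: the function name is hypothetical, and the bit positions in `SCTLR_EL1_TRAP_BITS` are assumptions that should be checked against the SCTLR_EL1 register description.

```python
# Assumed SCTLR_EL1 bit positions for the trap controls that trap when 0.
SCTLR_EL1_TRAP_BITS = {
    "UCI": 26,    # EL0 cache maintenance instructions
    "nTWE": 18,   # EL0 WFE
    "nTWI": 16,   # EL0 WFI
    "UCT": 15,    # EL0 accesses to CTR_EL0
    "DZE": 14,    # EL0 DC ZVA
    "UMA": 9,     # EL0 accesses to the PSTATE.{D, A, I, F} masks
}


def active_el0_traps(sctlr_el1):
    """Return the sorted names of controls whose bit is 0, meaning the trap is active."""
    return sorted(
        name
        for name, bit in SCTLR_EL1_TRAP_BITS.items()
        if not (sctlr_el1 >> bit) & 1
    )
```

An operating system that wants EL0 code to use cache maintenance and CTR_EL0 directly would set UCI and UCT to 1, leaving the remaining traps active.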





Traps to EL1 of EL0 execution of cache maintenance instructions 


SCTLR_EL1.UCI traps EL0 execution of cache maintenance instructions to EL1: 


1 EL0 execution of cache maintenance instructions is not trapped to EL1. 


0 Any attempt to execute a cache maintenance instruction at EL0 is trapped to EL1. 


Table D1-22 shows the instructions that are trapped to EL1, and how the exceptions are reported in ESR_EL1. 


Table D1-22 Instructions trapped to EL1 when SCTLR_EL1.UCI is 0 

Traps from      Trapped instructions                  Syndrome reporting in ESR_EL1 
AArch64 state   DC CVAU, DC CIVAC, DC CVAC, IC IVAU   Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18 (a) 
AArch32 state   n/a                                   n/a 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using the 
same EC values as shown in the table. 
Traps to EL1 of EL0 accesses to the CTR_EL0 


SCTLR_EL1.UCT traps EL0 accesses to the CTR_EL0 to EL1: 
1 EL0 accesses to the CTR_EL0 are not trapped to EL1. 
0 EL0 accesses to the CTR_EL0 are trapped to EL1. 


Table D1-23 shows how the exceptions are reported in ESR_EL1. 


Table D1-23 Register accesses trapped to EL1 when SCTLR_EL1.UCT is 0 

Traps from   Register   Syndrome reporting in ESR_EL1 
AArch64      CTR_EL0    Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18 (a) 
AArch32      n/a        n/a 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported 
in ESR_EL2 using the same EC values as shown in the table. 


Traps to EL1 of EL0 execution of WFE and WFI instructions 
SCTLR_EL1.{nTWE, nTWI} trap EL0 execution of WFE and WFI instructions to EL1: 
SCTLR_EL1.nTWE 


1 EL0 execution of WFE instructions is not trapped to EL1. 


0 Any attempt to execute a WFE instruction at EL0 is trapped to EL1, if the instruction 
would otherwise have caused the PE to enter a low-power state. 


SCTLR_EL1.nTWI 
1 EL0 execution of WFI instructions is not trapped to EL1. 


0 Any attempt to execute a WFI instruction at EL0 is trapped to EL1, if the instruction would 
otherwise have caused the PE to enter a low-power state. 


Table D1-24 shows how the exceptions are reported in ESR_EL1. 


Table D1-24 Instructions trapped to EL1 when SCTLR_EL1.{nTWE, nTWI} are 0 

Trap control     Traps from              Trapped instructions   Syndrome reporting in ESR_EL1 
SCTLR_EL1.nTWE   Both Execution states   WFE                    Trapped WFI or WFE instruction, using EC value 0x01 (a) 
SCTLR_EL1.nTWI   Both Execution states   WFI                    Trapped WFI or WFE instruction, using EC value 0x01 (a) 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported 
in ESR_EL2 using the same EC values as shown in the table. 


In AArch32 state, the attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction 
passes its condition code check. 


Note 


Because a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed 
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the 
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken. 








For more information about these instructions, and when they can cause the PE to enter a low-power state, see: 
° Wait for Event mechanism and Send event on page D1-1599. 
° Wait For Interrupt on page D1-1602. 
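
The conditions for a WFE or WFI trap described above can be collected into one predicate. The following Python sketch is an idealized model only. The function name and parameters are hypothetical, the SCTLR_EL1 bit positions (nTWE at bit 18, nTWI at bit 16) are assumptions to be checked against the register description, and the model ignores the fact that an implementation can complete a WFE or WFI at any time without taking the trap.

```python
def wfx_trap(sctlr_el1, instruction, would_enter_low_power,
             condition_passed=True, hcr_tge=0, non_secure=False):
    """Return (target, EC) if an EL0 WFE or WFI is trapped, else None.

    instruction is "WFE" or "WFI". condition_passed models the AArch32
    condition code check; it is always True for AArch64 instructions.
    """
    bit = {"WFE": 18, "WFI": 16}[instruction]   # assumed bit positions
    not_trapped = (sctlr_el1 >> bit) & 1        # control value 1: no trap
    if not_trapped or not would_enter_low_power or not condition_passed:
        return None
    # HCR_EL2.TGE routes the trap from Non-secure EL0 to EL2, same EC value.
    target = "EL2" if (hcr_tge and non_secure) else "EL1"
    return target, 0x01
```

For example, with both controls at 0, a WFI that would enter a low-power state yields `("EL1", 0x01)`.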





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. D1-1565 
1ID092916 Non-Confidential 


D1 The AArch64 System Level Programmers’ Model 
D1.15 Configurable instruction enables and disables, and trap controls 


Traps to EL1 of EL0 execution of DC ZVA instructions 
SCTLR_EL1.DZE traps EL0 execution of DC ZVA instructions to EL1: 
1 EL0 execution of DC ZVA instructions is not trapped to EL1. 


0 Any attempt to execute a DC ZVA instruction at EL0 is trapped to EL1. Reading the DCZID_EL0 
returns a value that indicates that DC ZVA instructions are not implemented. 


Table D1-25 shows how the exceptions are reported in ESR_EL1. 


Table D1-25 Instruction trapped to EL1 when SCTLR_EL1.DZE is 0 

Traps from      Trapped instruction   Syndrome reporting in ESR_EL1 
AArch64 state   DC ZVA                Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18 (a) 
AArch32 state   n/a                   n/a 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in 
ESR_EL2 using the same EC values as shown in the table. 


Traps to EL1 of EL0 accesses to the PSTATE.{D, A, I, F} interrupt masks 


SCTLR_EL1.UMA traps EL0 execution of MSR and MRS instructions that access the PSTATE.{D, A, I, F} masks to 
EL1: 
1 EL0 execution of MSR or MRS instructions that access the DAIF is not trapped to EL1. 
0 Any attempt at EL0 to execute an MSR or an MRS instruction that accesses the DAIF is trapped to EL1. 


Table D1-26 shows how the exceptions are reported in ESR_EL1. 


Table D1-26 Instructions trapped to EL1 when SCTLR_EL1.UMA is 0 

Taken from      Disabled instructions                                     Syndrome reporting in ESR_EL1 
AArch64 state   MRS, MSR (register), MSR (immediate), that access the DAIF   Trapped AArch64 MSR, MRS, or system instruction, 
                                                                          using EC value 0x18 (a) 
AArch32 state   n/a                                                       n/a 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using the 
same EC values as shown in the table. 
Disabling or enabling EL0 use of AArch32 deprecated functionality 
Table D1-27 shows the deprecated AArch32 functionality that might have disable controls in the SCTLR_EL1: 
° The SED control is always implemented. 


° Whether each of the ITD and CP15BEN controls is implemented is IMPLEMENTATION DEFINED. If a control is 
not implemented then the associated functionality cannot be disabled. 


These SCTLR_EL1 controls apply only to execution at EL0 using AArch32. When an instruction is disabled by one 
of these controls, it is UNDEFINED at EL0 using AArch32. The table shows how the exceptions are reported in 
ESR_EL1. 


Table D1-27 EL1 controls for disabling and enabling EL0 use of AArch32 deprecated functionality 

Deprecated AArch32               Instruction enable or      Disabled instructions               Syndrome reporting 
functionality                    disable in the SCTLR_EL1                                       in ESR_EL1 (a) 
SETEND instructions              SED (b)                    SETEND instructions                 Exception for an unknown 
                                                                                                reason, using EC value 0x00 
Some uses of IT instructions     ITD (c)                    See the SCTLR_EL1.ITD description   Exception for an unknown 
                                                                                                reason, using EC value 0x00 
Accesses to the CP15DMB,         CP15BEN (d)                MCR accesses to the CP15DMB,        Exception for an unknown 
CP15DSB, and CP15ISB barrier                                CP15DSB, and CP15ISB instructions   reason, using EC value 0x00 
instructions 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, the exception is routed to EL2 and reported in ESR_EL2 using the EC value shown 
in the table. 

b. SETEND instruction disable. SETEND instructions are disabled when the value of this field is 1. 

c. IT instruction disable. If this control is implemented, some uses of IT instructions are disabled when the value of this field is 1. 

d. System register (coproc==0b1111) memory barrier enable. If this control is implemented, the specified register accesses are disabled when 
the value of CP15BEN is 0. 


Note 


° The uses of the IT instruction, and use of the CP15DMB, CP15DSB, and CP15ISB barrier instructions, are 
deprecated for performance reasons. 





° The SCTLR provides similar controls that apply when EL1 is using AArch32, and the HSCTLR provides 
similar controls that apply when EL2 is using AArch32. 





Traps to EL1 of EL0 and EL1 System register accesses to the trace registers 


CPACR_EL1.TTA traps EL0 and EL1 System register accesses to the trace registers to EL1: 

1 EL0 and EL1 System register accesses to the trace registers are trapped to EL1. 
0 EL0 and EL1 System register accesses to the trace registers are not trapped to EL1. 

Note 
° The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A architecture is 
implemented with an ETMv4 implementation, EL0 accesses to the trace registers are UNDEFINED, and the 
resulting exception is higher priority than a CPACR_EL1.TTA Trap exception. 


° The ARMv8-A architecture does not provide traps on trace register accesses through the optional 
Memory-mapped interface. 





System register accesses to the trace registers can have side-effects. When a System register access is trapped, no 
side-effects occur before the exception is taken, see Register access instructions on page D1-1563. 







Table D1-28 shows the registers for which accesses are trapped to EL1 when CPACR_EL1.TTA is 1, and how the 
exceptions are reported in ESR_EL1. 


Table D1-28 Register accesses trapped to EL1 when CPACR_EL1.TTA is 1 

Traps from      Registers                         Syndrome reporting in ESR_EL1 
AArch64 state   All implemented trace registers   Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18. (a) 
AArch32 state   All implemented trace registers   For accesses using: 
                                                  ° MCR or MRC instructions, trapped MCR or MRC access (coproc==0b1110), using 
                                                    EC value 0x05. (a) 
                                                  ° MCRR or MRRC instructions, trapped MCRR or MRRC access (coproc==0b1110), 
                                                    using EC value 0x0C. (a) 

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using the 
same EC values as shown in the table. 


Traps to EL1 of EL0 and EL1 accesses to SIMD and floating-point functionality

CPACR_EL1.FPEN traps EL0 and EL1 accesses to the SIMD and floating-point registers to EL1:

00 Causes any instructions in EL0 or EL1 that use the registers that are associated with Advanced
   SIMD and floating-point execution to be trapped.

01 Causes any instructions in EL0 that use the registers that are associated with Advanced SIMD and
   floating-point execution to be trapped, but does not cause any instruction in EL1 to be trapped.

10 Causes any instructions in EL0 or EL1 that use the registers that are associated with Advanced
   SIMD and floating-point execution to be trapped.

11 Does not cause any instruction to be trapped.
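The four CPACR_EL1.FPEN encodings listed above can be summarized as a small predicate. The following is a minimal sketch, not architectural pseudocode: the function name `fpen_traps` and its arguments are hypothetical, and it models only the FPEN field, ignoring every other control that can affect SIMD and floating-point accesses.

```python
def fpen_traps(fpen: int, el: int) -> bool:
    """Return True if CPACR_EL1.FPEN alone causes a trap to EL1.

    fpen: the 2-bit FPEN field value (0b00 to 0b11).
    el:   the Exception level making the access (0 or 1).
    Hypothetical helper; it models only the encodings listed above.
    """
    if fpen not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("FPEN is a 2-bit field")
    if el not in (0, 1):
        raise ValueError("FPEN applies to EL0 and EL1 accesses only")
    if fpen == 0b11:
        return False          # no instruction is trapped
    if fpen == 0b01:
        return el == 0        # only EL0 accesses are trapped
    return True               # 0b00 and 0b10 trap both EL0 and EL1
```

Note that 0b00 and 0b10 behave identically in this model, which matches the list above.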


Table D1-29 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL1. 


Table D1-29 Register accesses trapped to EL1 by CPACR_EL1.FPEN

Traps from              Registers                                               Syndrome reporting in ESR_EL1
EL0 and EL1 using       FPCR, FPSR, and any of the SIMD and floating-point      Trapped access to a SIMD or
AArch64, or EL0 using   registers V0-V31, including their views as D0-D31       floating-point register, resulting from
AArch64 only (a)        registers or S0-S31 registers. See The SIMD and         CPACR_EL1.FPEN or CPTR_ELx.TFP,
                        floating-point registers, V0-V31 on page D1-1508.       using EC value 0x07 (b)
EL0 using AArch32       FPSCR, and any of the SIMD and floating-point           Trapped access to a SIMD or
                        registers Q0-Q15, including their views as D0-D31       floating-point register, resulting from
                        registers or S0-S31 registers. See Advanced SIMD and    CPACR_EL1.FPEN or CPTR_ELx.TFP,
                        floating-point System registers on page G1-3882.        using EC value 0x07 (b)

a. Depending on the value of CPACR_EL1.FPEN. See the register description for details.
b. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using EC
   value 0x00.


Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers

MDSCR_EL1.TDCC traps EL0 accesses to the DCC registers to EL1:

1 EL0 accesses to the DCC registers are trapped to EL1.
0 EL0 accesses to the DCC registers are not trapped to EL1.


Traps of AArch32 accesses to DBGDTRRXint and DBGDTRTXint are ignored in Debug state. 







Traps of AArch64 accesses to DBGDTR_EL0, DBGDTRRX_EL0, and DBGDTRTX_EL0 are ignored in Debug
state.


Table D1-30 shows the accesses that are trapped, and how the exceptions are reported in ESR_EL1. 


Table D1-30 Accesses trapped to EL1 when MDSCR_EL1.TDCC is 1

Traps from      Trapped accesses                                          Syndrome reporting in ESR_EL1
AArch64 state   Accesses to the MDCCSR_EL0, DBGDTR_EL0, DBGDTRTX_EL0,     Trapped AArch64 MSR, MRS, or system
                and DBGDTRRX_EL0                                          instruction, using EC value 0x18
AArch32 state   • MRC of DBGDSCRint, DBGDTRRXint, and, if implemented,    Trapped MCR or MRC access
                  DBGDIDR, DBGDSAR and DBGDRAR.                           (coproc==0b1110), using EC value 0x05 (a)
                • MCR to DBGDTRTXint.
                • LDC access to DBGDTRTXint.                              Trapped LDC or STC access, using EC value
                • STC access to DBGDTRRXint.                              0x06 (a)
                If implemented, MRRC of DBGDSAR and DBGDRAR.              Trapped MCRR or MRRC access
                                                                          (coproc==0b1110), using EC value 0x0C (a)

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using the
   same EC values as shown in the table.


Traps to EL1 of EL0 accesses to the Generic Timer registers

CNTKCTL_EL1.{EL0PTEN, EL0VTEN, EL0PCTEN, EL0VCTEN} trap EL0 accesses to the Generic Timer
registers to EL1, as follows:

• CNTKCTL_EL1.EL0PTEN traps EL0 accesses to the physical timer registers.
• CNTKCTL_EL1.EL0VTEN traps EL0 accesses to the virtual timer registers.
• CNTKCTL_EL1.EL0PCTEN traps EL0 accesses to the frequency register and physical counter register.
• CNTKCTL_EL1.EL0VCTEN traps EL0 accesses to the frequency register and virtual counter register.

For all of these controls:
1 EL0 accesses are not trapped to EL1.
0 EL0 accesses are trapped to EL1.

Accesses to the frequency register, CNTFRQ_EL0 or CNTFRQ, are only trapped if CNTKCTL_EL1.EL0PCTEN
and CNTKCTL_EL1.EL0VCTEN are both 0.
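Because these four controls are enables (0 traps, 1 permits) and the frequency register is covered by two of them, the combined rule is easy to get backwards. The sketch below illustrates it for the AArch64 register names only. The function name and its keyword parameters are hypothetical, and it models nothing beyond the CNTKCTL_EL1 rules stated above.

```python
PHYSICAL_TIMER = {"CNTP_CTL_EL0", "CNTP_CVAL_EL0", "CNTP_TVAL_EL0"}
VIRTUAL_TIMER = {"CNTV_CTL_EL0", "CNTV_CVAL_EL0", "CNTV_TVAL_EL0"}


def el0_timer_access_trapped(reg, *, el0pten, el0vten, el0pcten, el0vcten):
    """Return True if an EL0 access to reg is trapped to EL1.

    Each control is an enable: 0 means trapped, 1 means not trapped.
    Hypothetical helper covering only the AArch64 register names.
    """
    if reg in PHYSICAL_TIMER:
        return el0pten == 0
    if reg in VIRTUAL_TIMER:
        return el0vten == 0
    if reg == "CNTPCT_EL0":
        return el0pcten == 0
    if reg == "CNTVCT_EL0":
        return el0vcten == 0
    if reg == "CNTFRQ_EL0":
        # Trapped only when both counter enables are 0.
        return el0pcten == 0 and el0vcten == 0
    raise KeyError(f"{reg} is not covered by this sketch")
```

For example, with EL0VCTEN set to 1 and the other controls 0, CNTVCT_EL0 and CNTFRQ_EL0 are readable from EL0 while CNTPCT_EL0 is still trapped.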


Table D1-31 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL1. 


Table D1-31 Register accesses trapped from EL0 to EL1 by CNTKCTL_EL1 trap controls

Traps from      Trap control   Registers                        Syndrome reporting in ESR_EL1
AArch64 state   EL0PTEN        CNTP_CTL_EL0, CNTP_CVAL_EL0,     Trapped AArch64 MSR, MRS, or system instruction,
                               CNTP_TVAL_EL0                    using EC value 0x18
                EL0VTEN        CNTV_CTL_EL0, CNTV_CVAL_EL0,
                               CNTV_TVAL_EL0
                EL0PCTEN       CNTFRQ_EL0, CNTPCT_EL0
                EL0VCTEN       CNTFRQ_EL0, CNTVCT_EL0
AArch32 state   EL0PTEN        CNTP_CTL, CNTP_CVAL,             For accesses using:
                               CNTP_TVAL                        • MCR or MRC instructions, trapped MCR or MRC
                EL0VTEN        CNTV_CTL, CNTV_CVAL,               access (coproc==0b1111), using EC value 0x03 (a)
                               CNTV_TVAL                        • MCRR or MRRC instructions, trapped MCRR or MRRC
                EL0PCTEN       CNTFRQ, CNTPCT                     access (coproc==0b1111), using EC value 0x04
                EL0VCTEN       CNTFRQ, CNTVCT

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using the
   same EC values as shown in the table.


Traps to EL1 of EL0 accesses to Performance Monitors registers

PMUSERENR_EL0.{ER, CR, SW, EN} trap EL0 accesses to the Performance Monitors registers to EL1. For each
of these controls:

1 EL0 accesses are not trapped to EL1.
0 EL0 accesses are trapped to EL1.

For those Performance Monitors registers that more than one PMUSERENR_EL0.{ER, CR, SW, EN} control
applies to, accesses are only trapped if all controls that apply are set to 0.
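The "all controls that apply" rule can be illustrated with a short model. The sketch below is an illustration under stated assumptions: the `APPLIES` mapping covers only a handful of AArch64 registers taken from Table D1-32, and the names `APPLIES` and `el0_pmu_access_trapped` are hypothetical.

```python
# Control -> {register: access types the control applies to}.
# A deliberately small subset of the registers listed in Table D1-32.
APPLIES = {
    "ER": {"PMXEVCNTR_EL0": "R", "PMSELR_EL0": "RW"},
    "CR": {"PMCCNTR_EL0": "R"},
    "SW": {"PMSWINC_EL0": "W"},
    "EN": {
        "PMXEVCNTR_EL0": "RW",
        "PMSELR_EL0": "RW",
        "PMCCNTR_EL0": "RW",
        "PMSWINC_EL0": "RW",
        "PMCR_EL0": "RW",
    },
}


def el0_pmu_access_trapped(reg, is_write, enables):
    """Return True if an EL0 access is trapped to EL1.

    enables maps control names ("ER", "CR", "SW", "EN") to 0 or 1;
    a missing control is treated as 0. The access is trapped only if
    every control that applies to this register and access type is 0.
    """
    kind = "W" if is_write else "R"
    applicable = [c for c, regs in APPLIES.items() if kind in regs.get(reg, "")]
    if not applicable:
        raise KeyError(f"{reg} is not covered by this sketch")
    return all(enables.get(c, 0) == 0 for c in applicable)
```

With CR set to 1 and EN set to 0, a read of PMCCNTR_EL0 is permitted because CR applies to reads, but a write is still trapped because only EN applies to writes.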


The accesses that these trap controls trap might be reads, writes, or both. 


Table D1-32 shows: 


• The registers for which EL0 accesses are trapped. For each register, the table shows the type of access
  trapped.
• How the exceptions are reported in ESR_EL1.


Table D1-32 Register accesses trapped to EL1 when disabled from EL0

Traps from      Trap      Registers                                            Access   Syndrome reporting in
                control                                                        type     ESR_EL1
AArch64 state   ER        PMXEVCNTR_EL0, PMEVCNTR<n>_EL0                       R        Trapped AArch64 MSR, MRS,
                          PMSELR_EL0                                           RW       or system instruction,
                CR        PMCCNTR_EL0                                          R        using EC value 0x18 (a)
                SW        PMSWINC_EL0                                          W
                EN        PMCNTENSET_EL0, PMCNTENCLR_EL0, PMCR_EL0,            RW
                          PMOVSCLR_EL0, PMSWINC_EL0, PMSELR_EL0,
                          PMCEID0_EL0, PMCEID1_EL0, PMCCNTR_EL0,
                          PMXEVTYPER_EL0, PMXEVCNTR_EL0,
                          PMOVSSET_EL0, PMEVCNTR<n>_EL0,
                          PMEVTYPER<n>_EL0, PMCCFILTR_EL0.
AArch32 state   ER        PMXEVCNTR, PMEVCNTR<n>                               R        Trapped MCR or MRC access
                          PMSELR                                               RW       (coproc==0b1111), using
                CR        PMCCNTR, accessed using an MRC                       R        EC value 0x03 (a)
                CR        PMCCNTR, accessed using an MRRC                      R        Trapped MCRR or MRRC
                                                                                        access (coproc==0b1111),
                                                                                        using EC value 0x04 (a)
                SW        PMSWINC                                              W        Trapped MCR or MRC access
                EN        PMCNTENSET, PMCNTENCLR, PMCR, PMOVSR,                RW       (coproc==0b1111), using
                          PMSWINC, PMSELR, PMCEID0, PMCEID1, PMCCNTR,                   EC value 0x03 (a)
                          PMXEVTYPER, PMXEVCNTR, PMOVSSET,
                          PMEVCNTR<n>, PMEVTYPER<n>, PMCCFILTR, accessed
                          using an MCR or MRC
                EN        PMCCNTR, accessed using an MCRR or MRRC              RW       Trapped MCRR or MRRC
                                                                                        access (coproc==0b1111),
                                                                                        using EC value 0x04

a. If HCR_EL2.TGE is 1 and the PE is in Non-secure state, these Trap exceptions are routed to EL2 and are reported in ESR_EL2 using the
   same EC values as shown in the table.
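The tables for the EL1 controls all report the cause of a trap through the EC value in ESR_EL1. As an illustration only, the sketch below collects the EC values quoted in those tables and decodes them from a raw syndrome value. It assumes that EC occupies bits [31:26] of ESR_ELx, which comes from the ESR_ELx register description rather than from this section. The names `EC_MEANINGS`, `esr_ec`, and `describe_trap` are hypothetical.

```python
# EC values quoted in the tables of this section. Other EC values exist
# in the architecture but are outside the scope of this sketch.
EC_MEANINGS = {
    0x00: "Exception for an unknown reason",
    0x03: "Trapped MCR or MRC access (coproc==0b1111)",
    0x04: "Trapped MCRR or MRRC access (coproc==0b1111)",
    0x05: "Trapped MCR or MRC access (coproc==0b1110)",
    0x06: "Trapped LDC or STC access",
    0x07: "Trapped access to a SIMD or floating-point register",
    0x0C: "Trapped MCRR or MRRC access (coproc==0b1110)",
    0x18: "Trapped AArch64 MSR, MRS, or system instruction",
}


def esr_ec(esr: int) -> int:
    """Extract the EC field, assumed here to be ESR_ELx[31:26]."""
    return (esr >> 26) & 0x3F


def describe_trap(esr: int) -> str:
    """Map a raw ESR_ELx value to the table wording for its EC value."""
    return EC_MEANINGS.get(esr_ec(esr), "EC value not covered by this sketch")
```

A handler written along these lines would still need the ISS field to identify which register or instruction was trapped; that decoding is not shown here.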


D1.15.3 EL2 configurable controls 


These controls are in _EL2 System registers. The resulting exceptions might be taken from either Execution state.
SPSR_EL2.M[4] indicates which Execution state the exception was taken from.


These controls are ignored in Secure state. 


Table D1-33 shows the _EL2 System registers that contain these controls. 


Table D1-33 _EL2 registers that contain instruction disables and trap controls

Register name   Register description
HCR_EL2         Hypervisor Configuration Register
HSTR_EL2        Hypervisor System Trap Register
CPTR_EL2        Architectural Feature Trap Register, EL2
MDCR_EL2        Monitor Debug Configuration Register, EL2





Table D1-34 on page D1-1572 summarizes the controls. 





Note
• There are no instruction enables at EL2.
• For completeness, Table D1-34 on page D1-1572 includes the routing control described in Routing
  exceptions to EL2 on page D1-1548.

Table D1-34 Instruction disables and trap controls provided by EL2

Control                             Control    Description
                                    type (a)
HCR_EL2.{TRVM, TVM}                 T          Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers
                                               on page D1-1573
HCR_EL2.HCD                         D          Disabling Non-secure state execution of HVC instructions on page D1-1574
HCR_EL2.TDZ                         T          Traps to EL2 of Non-secure EL0 and EL1 execution of DC ZVA instructions on
                                               page D1-1574
HCR_EL2.TGE                         R          Routing exceptions to EL2 on page D1-1548
HCR_EL2.TTLB                        T          Traps to EL2 of Non-secure EL1 execution of TLB maintenance instructions on
                                               page D1-1574
HCR_EL2.{TSW, TPC, TPU}             T          Traps to EL2 of Non-secure EL0 and EL1 execution of cache maintenance
                                               instructions on page D1-1575
HCR_EL2.TACR                        T          Traps to EL2 of Non-secure EL1 accesses to the Auxiliary Control Register on
                                               page D1-1576
HCR_EL2.TIDCP                       T          Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and
                                               TCM operations on page D1-1577
HCR_EL2.TSC                         T          Traps to EL2 of Non-secure EL1 execution of SMC instructions on
                                               page D1-1578
HCR_EL2.{TID0, TID1, TID2, TID3}    T          Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on
                                               page D1-1578
HCR_EL2.{TWI, TWE}                  T          Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI
                                               instructions on page D1-1581
CPTR_EL2.TCPAC                      T          Trapping to EL2 of Non-secure EL1 accesses to the CPACR_EL1 or CPACR on
                                               page D1-1582
CPTR_EL2.TFP                        T          General trapping to EL2 of Non-secure accesses to the SIMD and
                                               floating-point registers on page D1-1583
CPTR_EL2.TTA                        T          Traps to EL2 of Non-secure System register accesses to the trace registers on
                                               page D1-1583
HSTR_EL2.{T0-T3, T5-T13, T15}       T          General trapping to EL2 of Non-secure EL0 and EL1 accesses to System
                                               registers, from AArch32 state only on page D1-1584
MDCR_EL2.{TDRA, TDOSA, TDA}         T          Traps to EL2 of Non-secure EL0 and EL1 System register accesses to debug
                                               registers on page D1-1585
CNTHCTL_EL2.{EL1PCEN, EL1PCTEN}     T          Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer
                                               registers on page D1-1587
MDCR_EL2.{TPM, TPMCR}               T          Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors
                                               registers on page D1-1588

a. See Table D1-35 on page D1-1573.

Table D1-35 Control types, for exceptions taken to EL2

Abbreviation   Type              See
D              Disable           Instruction enables and instruction disables on page D1-1562
R              Routing control   Routing exceptions to EL2 on page D1-1548
T              Trap              Trap controls on page D1-1562





Also see the following for more general information about traps to EL2: 
° Register access instructions on page D1-1563. 
. For traps from an Exception level using AArch32: 
— __ Instructions that fail their condition code check on page G1-3896. 
— Trapping to EL2 of instructions that are UNPREDICTABLE on page G1-3896. 


Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers 
HCR_EL2.{TRVM, TVM} trap Non-secure EL1 accesses to the virtual memory control registers to EL2: 


HCR_EL2.TRVM, for read accesses: 
1 Non-secure EL1 reads of the virtual memory control registers are trapped to EL2.
0 Non-secure EL1 reads of the virtual memory control registers are not trapped to EL2.

HCR_EL2.TVM, for write accesses:
1 Non-secure EL1 writes to the virtual memory control registers are trapped to EL2.
0 Non-secure EL1 writes to the virtual memory control registers are not trapped to EL2.
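HCR_EL2.TRVM and HCR_EL2.TVM act independently on reads and writes. The following sketch shows the selection, under the assumption that the access is made at EL1 to one of the virtual memory control registers. The function name and parameters are hypothetical, and the `non_secure` check reflects the statement earlier in this section that the EL2 controls are ignored in Secure state.

```python
def vm_control_access_trapped_to_el2(is_write, *, trvm, tvm, non_secure=True):
    """Return True if an EL1 access to a virtual memory control register
    is trapped to EL2 by HCR_EL2.{TRVM, TVM}.

    is_write selects the control: TVM for writes, TRVM for reads.
    Hypothetical helper; no other trap or enable is modeled.
    """
    if not non_secure:
        return False  # EL2 controls are ignored in Secure state
    return bool(tvm) if is_write else bool(trvm)
```

A hypervisor that only needs to observe guest page table changes might set TVM and leave TRVM clear, so that guest reads of these registers proceed without a trap.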


Table D1-36 shows the registers for which: 
° Reads are trapped to EL2 when HCR_EL2.TRVM is 1. 
° Writes are trapped to EL2 when HCR_EL2.TVM is 1. 


Table D1-36 also shows how the exceptions are reported in ESR_EL2. 


Table D1-36 Register read and write accesses trapped when HCR_EL2.{TRVM, TVM} are 1

Traps from      Registers                                             Syndrome reporting in ESR_EL2
AArch64 state   SCTLR_EL1, TTBR0_EL1, TTBR1_EL1, TCR_EL1,             Trapped AArch64 MSR, MRS, or system instruction,
                ESR_EL1, FAR_EL1, AFSR0_EL1, AFSR1_EL1,               using EC value 0x18.
                MAIR_EL1, AMAIR_EL1, CONTEXTIDR_EL1.
AArch32 state   SCTLR, TTBR0, TTBR1, TTBCR, DACR, DFSR, IFSR,         Trapped MCR or MRC access (coproc==0b1111), using EC
                DFAR, IFAR, ADFSR, AIFSR, PRRR, NMRR, MAIR0,          value 0x03.
                MAIR1, AMAIR0, AMAIR1, CONTEXTIDR.                    Trapped MCRR or MRRC access (coproc==0b1111), using
                                                                      EC value 0x04.





Note 


EL2 provides a second stage of address translation, that a hypervisor can use to remap the address map defined by 
a Guest OS. In addition, a hypervisor can trap attempts by a Guest OS to write to the registers that control the 
Non-secure memory system. A hypervisor might use this trap as part of its virtualization of memory management.

Disabling Non-secure state execution of HVC instructions
HCR_EL2.HCD disables Non-secure state execution of HVC instructions: 


1 HVC instructions are UNDEFINED at EL2 and Non-secure EL1, and any resulting exception is taken 
from the current Exception level to the current Exception level. 


0 HVC instruction execution is enabled at EL2 and Non-secure EL1. 





Note
HVC instructions are always UNDEFINED at EL0.

HCR_EL2.HCD is only implemented if EL3 is not implemented. Otherwise, it is RES0.

Table D1-37 shows how the exceptions are reported in ESR_ELx.


Table D1-37 Instruction disabled when HCR_EL2.HCD is 1

Taken from      Disabled instruction   Syndrome reporting in ESR_ELx
AArch64 state   HVC                    Exception for an unknown reason, using EC value 0x00
AArch32 state   HVC
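As an illustration of the HCR_EL2.HCD rule and the EL0 rule stated above, the following sketch reports whether an HVC instruction is UNDEFINED because of those two rules alone. The function name and parameters are hypothetical, and the sketch assumes an implementation without EL3, since HCR_EL2.HCD is otherwise RES0. Controls that belong to EL3 are not modeled.

```python
def hvc_undefined_by_hcd(el, non_secure, hcd):
    """Return True if HVC is UNDEFINED under the rules described above.

    el:         Exception level executing the HVC (0, 1, or 2).
    non_secure: True if the PE is in Non-secure state.
    hcd:        value of HCR_EL2.HCD (assumes EL3 is not implemented).
    Hypothetical helper; a False result only means that neither rule
    modeled here makes the instruction UNDEFINED.
    """
    if el not in (0, 1, 2):
        raise ValueError("this sketch covers EL0, EL1, and EL2 only")
    if el == 0:
        return True  # HVC is always UNDEFINED at EL0
    if hcd and (el == 2 or (el == 1 and non_secure)):
        return True  # disabled at EL2 and Non-secure EL1
    return False
```

When the result is True because of HCD, the resulting exception is taken to the Exception level that executed the instruction and is reported with EC value 0x00, as Table D1-37 shows.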





Traps to EL2 of Non-secure EL0 and EL1 execution of DC ZVA instructions
HCR_EL2.TDZ traps Non-secure EL0 and EL1 execution of DC ZVA instructions to EL2:

1 Any attempt to execute a DC ZVA instruction at Non-secure EL0 or EL1 is trapped to EL2. Reading
  the DCZID_EL0 returns a value that indicates that DC ZVA instructions are not implemented.

0 Non-secure EL0 and EL1 execution of DC ZVA instructions is not trapped to EL2.

Table D1-38 shows how the exceptions are reported in ESR_EL2.

Table D1-38 Instruction trapped to EL2 when HCR_EL2.TDZ is 1

Traps from      Trapped instruction   Syndrome reporting in ESR_EL2
AArch64 state   DC ZVA                Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   n/a                   n/a





Traps to EL2 of Non-secure EL1 execution of TLB maintenance instructions 
In the ARMv8-A architecture, the system instruction encoding space includes TLB maintenance instructions. 


HCR_EL2.TTLB traps Non-secure EL1 execution of TLB maintenance instructions to EL2: 





1 Any attempt to execute a TLBI instruction at Non-secure EL1 is trapped to EL2.
0 Non-secure EL1 execution of TLBI instructions is not trapped to EL2.

Table D1-39 shows the instructions that are trapped, and how the exceptions are reported in ESR_EL2.

Table D1-39 Instructions trapped to EL2 when HCR_EL2.TTLB is 1

Traps from      Trapped instructions                                                   Syndrome reporting in ESR_EL2
AArch64 state   TLBI VMALLE1IS, TLBI VAE1IS, TLBI ASIDE1IS, TLBI VAAE1IS, TLBI         Trapped AArch64 MSR, MRS,
                VALE1IS, TLBI VAALE1IS, TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI     or system instruction, using
                VAAE1, TLBI VALE1, TLBI VAALE1                                         EC value 0x18
AArch32 state   TLBIALLIS, TLBIMVAIS, TLBIASIDIS, TLBIMVAAIS, TLBIMVALIS,              Trapped MCR or MRC access
                TLBIMVAALIS, ITLBIALL, ITLBIMVA, ITLBIASID, DTLBIALL, DTLBIMVA,        (coproc==0b1111), using EC
                DTLBIASID, TLBIALL, TLBIMVA, TLBIASID, TLBIMVAA, TLBIMVAL,             value 0x03
                TLBIMVAAL.

Note
These instructions are always UNDEFINED at EL0.

For more information about these instructions, see:
• TLB maintenance instructions on page D4-1817, for the AArch64 state instructions.
• The scope of TLB maintenance instructions on page G4-4101, for the AArch32 state instructions.


Traps to EL2 of Non-secure EL0 and EL1 execution of cache maintenance instructions

HCR_EL2.{TSW, TPC, TPU} trap cache maintenance instructions to EL2 as follows:

0 The control has no effect on the execution of cache maintenance instructions.
1 Any attempt to execute a corresponding cache maintenance instruction at Non-secure EL1, or at
  Non-secure EL0 if permitted by SCTLR_EL1.UCI, is trapped to EL2.

Table D1-40 Controls for trapping cache maintenance instructions to EL2

Trap control   Trapped instructions
HCR_EL2.TSW    Data or unified cache maintenance by set/way
HCR_EL2.TPC    Data or unified cache maintenance to point of coherency
HCR_EL2.TPU    Cache maintenance to point of unification

For:

HCR_EL2.TSW == 1, Table D1-41 on page D1-1576 shows the instructions that are trapped, and how the
exceptions are reported in ESR_EL2.

HCR_EL2.TPC == 1, Table D1-42 on page D1-1576 shows the instructions that are trapped, and how the
exceptions are reported in ESR_EL2.

HCR_EL2.TPU == 1, Table D1-43 on page D1-1576 shows the instructions that are trapped, and how the
exceptions are reported in ESR_EL2.

Table D1-41 Instructions trapped to EL2 when HCR_EL2.TSW is 1

Traps from      Trapped instructions      Syndrome reporting in ESR_EL2
AArch64 state   DC ISW, DC CSW, DC CISW   Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   DCISW, DCCSW, DCCISW      Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03

Note
These instructions are always UNDEFINED at EL0.








Table D1-42 Instructions trapped to EL2 when HCR_EL2.TPC is 1

Traps from      Trapped instructions          Syndrome reporting in ESR_EL2
AArch64 state   DC IVAC, DC CVAC, DC CIVAC    Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   DCIMVAC, DCCIMVAC, DCCMVAC    Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03

Note
DC IVAC is always UNDEFINED at EL0 using AArch64.

DCIMVAC, DCCIMVAC, and DCCMVAC are always UNDEFINED at EL0 using AArch32.





Table D1-43 Instructions trapped to EL2 when HCR_EL2.TPU is 1

Traps from      Trapped instructions                      Syndrome reporting in ESR_EL2
AArch64 state   IC IVAU, IC IALLU, IC IALLUIS, DC CVAU    Trapped AArch64 MSR, MRS, or system instruction, using EC
                                                          value 0x18
AArch32 state   ICIMVAU, ICIALLU, ICIALLUIS, DCCMVAU      Trapped MCR or MRC access (coproc==0b1111), using EC value
                                                          0x03

Note
IC IALLUIS and IC IALLU are always UNDEFINED at EL0 using AArch64.

ICIMVAU, ICIALLU, ICIALLUIS, and DCCMVAU are always UNDEFINED at EL0 using AArch32.





For more information about these instructions, see: 
° Cache maintenance instructions, and data cache zero on page C5-276 for the AArch64 instructions. 


° Cache maintenance instructions, functional group on page G4-4201 for the AArch32 instructions. 


Traps to EL2 of Non-secure EL1 accesses to the Auxiliary Control Register 


HCR_EL2.TACR traps Non-secure EL1 accesses to the Auxiliary Control Registers to EL2: 
1 Non-secure EL1 accesses to the Auxiliary Control Registers are trapped to EL2. 
0 Non-secure EL1 accesses to the Auxiliary Control Registers are not trapped to EL2.

Table D1-44 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL2.

Table D1-44 Register accesses trapped to EL2 when HCR_EL2.TACR is 1

Traps from      Registers                            Syndrome reporting in ESR_EL2
AArch64 state   ACTLR_EL1                            Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   ACTLR and, if implemented, ACTLR2.   Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03








Note
• The ACTLR_EL1, ACTLR, and ACTLR2 are not accessible at EL0.
• The Auxiliary Control Registers are IMPLEMENTATION DEFINED registers that might implement global control
  bits for the PE.





Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations


The lockdown, DMA, and TCM features of the ARMv8-A architecture are IMPLEMENTATION DEFINED. The 
architecture reserves the encodings of a number of System registers for control of these features. 


HCR_EL2.TIDCP traps the execution of System register access instructions that access these registers, as follows:

1 At Non-secure EL1, any attempt to execute a System register access instruction with a reserved
  register encoding is trapped to EL2.

  At Non-secure EL0, it is IMPLEMENTATION DEFINED whether attempts to execute System register
  access instructions with reserved register encodings are:
  • Trapped to EL2.
  • UNDEFINED, and any resulting exception is taken to EL1.

0 Non-secure EL0 and EL1 System register access instructions with reserved register encodings are
  not trapped to EL2.

Table D1-45 shows the register encodings for which accesses are trapped, and how the exceptions are reported in
ESR_EL2.

Table D1-45 Encodings trapped to EL2 when HCR_EL2.TIDCP is 1

Traps from      Register encodings                                                  Syndrome reporting in ESR_EL2
AArch64 state   Any access to any of the encodings described in Reserved            Trapped AArch64 MSR, MRS, or system
                encodings for IMPLEMENTATION DEFINED registers on page C5-291.      instruction, using EC value 0x18
AArch32 state   An access to any of the following encodings:                        Trapped MCR or MRC access
                • CRn==c9, opc1=={0-7}, CRm=={c0-c2, c5-c8}, opc2=={0-7}.           (coproc==0b1111), using EC value 0x03
                • CRn==c10, opc1=={0-7}, CRm=={c0, c1, c4, c8}, opc2=={0-7}.
                • CRn==c11, opc1=={0-7}, CRm=={c0-c8, c15}, opc2=={0-7}.
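The AArch32 encoding ranges in Table D1-45 can be expressed as a membership test. The sketch below is illustrative only: the names `RESERVED_CRM` and `is_tidcp_aarch32_encoding` are hypothetical, and the function checks only whether an encoding in the (coproc==0b1111) space falls in the listed ranges, not whether the access is actually trapped.

```python
# CRn -> set of CRm values reserved for lockdown, DMA, and TCM operations,
# transcribed from the AArch32 row of Table D1-45.
RESERVED_CRM = {
    9: {0, 1, 2, 5, 6, 7, 8},
    10: {0, 1, 4, 8},
    11: {0, 1, 2, 3, 4, 5, 6, 7, 8, 15},
}


def is_tidcp_aarch32_encoding(crn, opc1, crm, opc2):
    """Return True if (CRn, opc1, CRm, opc2) is one of the listed encodings.

    All values of opc1 and opc2 in the range 0-7 are included.
    """
    if not (0 <= opc1 <= 7 and 0 <= opc2 <= 7):
        return False
    return crm in RESERVED_CRM.get(crn, set())
```

For example, CRn==c9 with CRm==c3 is not in the listed set, so an access with that encoding is outside the scope of this trap.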





An implementation can also include IMPLEMENTATION DEFINED registers that provide additional controls, to give 
finer-grained control of the trapping of IMPLEMENTATION DEFINED features. 


Note
• ARM expects the trapping of Non-secure EL0 accesses to these functions to EL2 to be unusual, and used only
  when the hypervisor is virtualizing EL0 operation. ARM strongly recommends that unless the hypervisor
  must virtualize EL0 operation, a Non-secure EL0 access to any of these functions is UNDEFINED, as it would
  be if the implementation did not include EL2. The PE then takes any resulting exception to Non-secure EL1.
• The trapping of accesses to these registers from Non-secure EL1 is higher priority than an exception resulting
  from the register access being UNDEFINED.





Traps to EL2 of Non-secure EL1 execution of SMC instructions 
HCR_EL2.TSC traps Non-secure EL1 execution of SMC instructions to EL2: 


1 Any attempt to execute an SMC instruction at Non-secure EL1 is trapped to EL2, regardless of the 
value of SCR_EL3.SMD. 


0 Non-secure EL1 execution of SMC instructions is not trapped to EL2. 

If EL3 is not implemented, HCR_EL2.TSC is RES0.


Table D1-46 shows how the exceptions are reported in ESR_EL2: 


Table D1-46 SMC Instruction trapped to EL2 when HCR_EL2.TSC is 1

Traps from      Trapped instruction    Syndrome reporting in ESR_EL2
AArch64 state   SMC                    Trapped SMC instruction execution in AArch64 state, using EC value 0x17
AArch32 state   SMC on page F5-2983    Trapped SMC instruction execution in AArch32 state, using EC value 0x13





In AArch32 state, the ARMv8-A architecture permits, but does not require, this trap to apply to conditional SMC 
instructions that fail their condition code check, in the same way as with traps on other conditional instructions. 


For more information about SMC instructions, see SMC on page C6-675. 





Note
• This trap is implemented only if the implementation includes EL3.
• SMC instructions are UNDEFINED at EL0.
• HCR_EL2.TSC traps execution of the SMC instruction. It is not a routing control for the SMC exception. Trap
  exceptions and SMC exceptions have different preferred return addresses.





Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers

Other than the MIDR_EL1, MPIDR_EL1, and PMCR_EL0.N, the ID registers are divided into groups, with a trap
control in the HCR_EL2 for each group.


Table D1-47 ID register groups

Trap control    Register group
HCR_EL2.TID0    ID group 0, Primary device identification registers on page D1-1579
HCR_EL2.TID1    ID group 1, Implementation identification registers on page D1-1580
HCR_EL2.TID2    ID group 2, Cache identification registers on page D1-1580
HCR_EL2.TID3    ID group 3, Detailed feature identification registers on page D1-1580





These controls trap register accesses to EL2, as follows:

HCR_EL2.TID0
    0 This control has no effect on Non-secure EL1 reads of the ID group 0 registers.
    1 Any attempt at Non-secure EL0 or EL1 to read any register in ID group 0 is trapped to
      EL2.

HCR_EL2.TID1
    0 This control has no effect on Non-secure EL1 reads of the ID group 1 registers.
    1 Any attempt at Non-secure EL1 to read any register in ID group 1 is trapped to EL2.

HCR_EL2.TID2
    0 This control has no effect on Non-secure EL1 and EL0 accesses to the ID group 2
      registers.
    1 Any attempt at Non-secure EL0 or EL1 to read any register in ID group 2, and any
      attempt at Non-secure EL0 or EL1 to write to the CSSELR or CSSELR_EL1, is trapped
      to EL2.

HCR_EL2.TID3
    0 This control has no effect on Non-secure EL1 reads of the ID group 3 registers.
    1 Any attempt at Non-secure EL1 to read any register in ID group 3 is trapped to EL2.

For the MIDR_EL1 and MPIDR_EL1, and for PMCR_EL0.N, the architecture provides read/write aliases. The
original register becomes accessible only from EL2 or Secure state, and a Non-secure EL0 or EL1 read of the
original register returns the value of the read/write alias. This substitution is invisible to the EL0 or EL1 software
reading the register.


Table D1-48 ID register substitution

Register                                Original      Alias, EL2 using AArch64
Main ID                                 MIDR_EL1      VPIDR_EL2
Multiprocessor Affinity                 MPIDR_EL1     VMPIDR_EL2
Performance Monitors Control Register   PMCR_EL0.N    MDCR_EL2.HPMN





Reads of the MIDR_EL1, MPIDR_EL1 or PMCR_EL0.N from EL2 or Secure state are unchanged by the
implementation of EL2, and access the physical registers.
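The substitution described above amounts to selecting between two values according to where the read comes from. The following minimal sketch models that selection for any of the three register and alias pairs. The function name and parameters are hypothetical, and the sketch does not model whether a given register is accessible at the Exception level concerned.

```python
def id_register_read(el, non_secure, physical, alias):
    """Return the value seen by a read of MIDR_EL1, MPIDR_EL1, or PMCR_EL0.N.

    physical: value of the original register.
    alias:    value of the corresponding EL2 read/write alias
              (VPIDR_EL2, VMPIDR_EL2, or MDCR_EL2.HPMN).
    Non-secure EL0 and EL1 reads return the alias; reads from EL2 or from
    Secure state return the physical value. Hypothetical helper.
    """
    if el not in (0, 1, 2, 3):
        raise ValueError("el must be 0, 1, 2, or 3")
    if non_secure and el in (0, 1):
        return alias
    return physical
```

A hypervisor can therefore present a different Main ID value to a guest by writing VPIDR_EL2, while its own reads of MIDR_EL1 at EL2 continue to return the physical value.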


Note
• If the optional Performance Monitors Extension is not implemented, MDCR_EL2.HPMN is RES0 and
  PMCR_EL0 is reserved.
• MDCR_EL2.HPMN also affects whether a Performance Monitors counter can be accessed from Non-secure
  EL0 or EL1. See the register description of MDCR_EL2 for more information.
• PMCR_EL0 contains other fields that identify the implementation. For more information about trapping
  accesses to the PMCR_EL0, see Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors
  registers on page D1-1588.





ID group 0, Primary device identification registers 





In:
• AArch64 state, there are no ID group 0 registers.
• AArch32 state, these registers identify some top-level implementation choices.

Table D1-49 shows the registers that are in ID group 0 for traps to EL2, and how the exceptions are reported in
ESR_EL2.

Table D1-49 ID group 0 registers

Traps from      Group 0 registers   Syndrome reporting in ESR_EL2
AArch64 state   n/a                 n/a
AArch32 state   FPSID               Trapped VMRS System register access, using EC value 0x08
                JIDR                Trapped MRC System register access (coproc==0b1110), using EC value 0x05

Note
The FPSID is not accessible from EL0 using AArch32.

When the FPSID is accessible, a T32 or A32 VMSR FPSID, <Rt> instruction is permitted but is ignored. The execution
of this VMSR instruction is not trapped by the ID group 0 trap.


ID group 1, Implementation identification registers 
These registers often provide coarse-grained identification mechanisms for implementation-specific features. 


Table D1-50 shows the registers that are in ID group 1 for traps to EL2, and how the exceptions are reported in
ESR_EL2.

Table D1-50 ID group 1 registers

Traps from      Group 1 registers             Syndrome reporting in ESR_EL2
AArch64 state   REVIDR_EL1, AIDR_EL1          Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   TCMTR, TLBTR, REVIDR, AIDR    Trapped MCR or MRC System register access (coproc==0b1111), using EC value
                                              0x03





ID group 2, Cache identification registers 
These registers describe and control the cache implementation. 


Table D1-51 shows the registers that are in ID group 2 for traps to EL2, and how the exceptions are reported in
ESR_EL2.

Table D1-51 ID group 2 registers

Traps from      Group 2 registers                   Syndrome reporting in ESR_EL2
AArch64 state   CTR_EL0, CCSIDR_EL1, CLIDR_EL1,     Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
                CSSELR_EL1
AArch32 state   CTR, CCSIDR, CLIDR, CSSELR          Trapped MCR or MRC System register access (coproc==0b1111), using EC
                                                    value 0x03





ID group 3, Detailed feature identification registers 


These registers provide detailed information about the features of the implementation.

Note
In AArch32 state, these registers are called the CPUID registers. There is no requirement for this trap to apply to
those registers that the CPUID Identification Scheme defines as reserved. See The CPUID identification scheme on
page G4-4195.








Table D1-52 shows the registers that are in ID group 3 for traps to EL2, and how the exceptions are reported in 











ESR_EL2. 
Table D1-52 ID group 3 registers

Traps from      Group 3 registers                                               Syndrome reporting in ESR_EL2

AArch64 state   ID_PFR0_EL1, ID_PFR1_EL1, ID_DFR0_EL1, ID_AFR0_EL1.             Trapped AArch64 MSR, MRS, or system
                ID_MMFR0_EL1, ID_MMFR1_EL1, ID_MMFR2_EL1, ID_MMFR3_EL1,         instruction, using EC value 0x18
                and ID_MMFR4_EL1, except that if ID_MMFR4_EL1 is
                implemented as RAZ/WI then it is IMPLEMENTATION DEFINED
                whether reads of the register are trapped.
                ID_ISAR0_EL1, ID_ISAR1_EL1, ID_ISAR2_EL1, ID_ISAR3_EL1,
                ID_ISAR4_EL1, ID_ISAR5_EL1.
                MVFR0_EL1, MVFR1_EL1, MVFR2_EL1.
                ID_AA64PFR0_EL1, ID_AA64PFR1_EL1.
                ID_AA64DFR0_EL1, ID_AA64DFR1_EL1.
                ID_AA64ISAR0_EL1, ID_AA64ISAR1_EL1.
                ID_AA64MMFR0_EL1, ID_AA64MMFR1_EL1.
                ID_AA64AFR0_EL1, ID_AA64AFR1_EL1.
                It is IMPLEMENTATION DEFINED whether HCR_EL2.TID3 traps MRS
                accesses to registers in the following range that are not
                already mentioned in this table:
                •   Op0 == 3, CRn == c0, op1 == 0, CRm == {c2-c7},
                    op2 == {0-7}.

AArch32 state   MVFR0, MVFR1, MVFR2.                                            Trapped VMRS System register access,
                                                                                using EC value 0x08

                ID_PFR0, ID_PFR1, ID_DFR0, ID_AFR0.                             Trapped MCR or MRC System register
                ID_MMFR0, ID_MMFR1, ID_MMFR2, ID_MMFR3, and ID_MMFR4,           access (coproc==0b1111), using EC
                except that if ID_MMFR4 is implemented as RAZ/WI then it is     value 0x03
                IMPLEMENTATION DEFINED whether reads of the register are
                trapped.
                ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, ID_ISAR4, ID_ISAR5.
                Any MRC access to any of the following encodings in the
                (coproc==0b1111) encoding space:
                •   CRn == c0, opc1 == 0, CRm == {c3-c7}, opc2 == {0, 1}.
                •   CRn == c0, opc1 == 0, CRm == c3, opc2 == 2.
                •   CRn == c0, opc1 == 0, CRm == c5, opc2 == {4, 5}.
                It is IMPLEMENTATION DEFINED whether HCR_EL2.TID3 traps MRC
                accesses to registers in the (coproc==0b1111) encoding space
                in the following range that are not already mentioned in
                this table:
                •   CRn == c0, opc1 == 0, CRm == {c2-c7}, opc2 == {0-7}.

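
The AArch64 encoding range in Table D1-52 for which trapping by HCR_EL2.TID3 is IMPLEMENTATION DEFINED can be
expressed as a simple predicate. The following Python sketch is illustrative only. The function name and its integer
parameters are assumptions made for this example and are not defined by the architecture. It identifies membership
of the range only. It does not decide whether a register is already listed in the table, nor what a particular
implementation does for the unlisted encodings.

```python
def in_tid3_impdef_range(op0: int, op1: int, crn: int, crm: int, op2: int) -> bool:
    """Return True if an AArch64 System register encoding lies in the range
    Op0 == 3, CRn == c0, op1 == 0, CRm == {c2-c7}, op2 == {0-7}.

    Hypothetical helper: the register fields are passed as plain integers,
    so CRn == c0 is written as crn == 0, and so on.
    """
    return (
        op0 == 3
        and op1 == 0
        and crn == 0
        and 2 <= crm <= 7
        and 0 <= op2 <= 7
    )


if __name__ == "__main__":
    # CRm == c4, op2 == 0 lies inside the range.
    print(in_tid3_impdef_range(3, 0, 0, 4, 0))
    # CRm == c1 lies outside the range.
    print(in_tid3_impdef_range(3, 0, 0, 1, 0))
```
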





Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions

HCR_EL2.{TWE, TWI} trap Non-secure EL0 and EL1 execution of WFE and WFI instructions to EL2:

HCR_EL2.TWE
        1       Any attempt to execute a WFE instruction at Non-secure EL0 or EL1 is trapped to EL2,
                if the instruction would otherwise have caused the PE to enter a low-power state.
        0       Non-secure EL0 or EL1 execution of WFE instructions is not trapped to EL2.

HCR_EL2.TWI
        1       Any attempt to execute a WFI instruction at Non-secure EL0 or EL1 is trapped to EL2,
                if the instruction would otherwise have caused the PE to enter a low-power state.
        0       Non-secure EL0 or EL1 execution of WFI instructions is not trapped to EL2.


Table D1-53 shows how the exceptions are reported in ESR_EL2. 


Table D1-53 Instructions trapped to EL2 when HCR_EL2.{TWE, TWI} are 1

Trap control    Traps from              Trapped instructions    Syndrome reporting in ESR_EL2

HCR_EL2.TWE     Both Execution states   WFE                     Trapped WFI or WFE instruction, using EC value 0x01
HCR_EL2.TWI     Both Execution states   WFI                     Trapped WFI or WFE instruction, using EC value 0x01





In AArch32 state, the attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction 
passes its condition code check. 





Note 


Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken.





For more information about these instructions, and when they can cause the PE to enter a low-power state, see: 
° Wait for Event mechanism and Send event on page D1-1599. 
° Wait For Interrupt on page D1-1602. 
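
The conditions described above for the HCR_EL2.{TWE, TWI} traps can be summarized as a predicate. The following
Python sketch is a simplified model under stated assumptions. The function name and all of its parameters are
hypothetical. In particular, would_enter_low_power stands for the condition "the instruction would otherwise have
caused the PE to enter a low-power state", which the caller must supply. The sketch does not model the AArch32
condition code check, and a True result means only that the trap condition is met, because the architecture does
not guarantee that the trap is taken if the instruction completes for another reason.

```python
def wfx_trap_condition_el2(instr: str, el: int, non_secure: bool,
                           twe: int, twi: int,
                           would_enter_low_power: bool) -> bool:
    """Model of when HCR_EL2.{TWE, TWI} apply to a WFE or WFI instruction.

    instr is "WFE" or "WFI". el is the Exception level executing it.
    twe and twi are the values of HCR_EL2.TWE and HCR_EL2.TWI.
    """
    if instr not in ("WFE", "WFI"):
        raise ValueError("instr must be 'WFE' or 'WFI'")
    # The controls apply only to Non-secure EL0 and EL1.
    if not non_secure or el not in (0, 1):
        return False
    # No trap if the instruction would not have entered a low-power state.
    if not would_enter_low_power:
        return False
    control = twe if instr == "WFE" else twi
    return control == 1


if __name__ == "__main__":
    print(wfx_trap_condition_el2("WFI", 1, True, twe=0, twi=1,
                                 would_enter_low_power=True))
```
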


Trapping to EL2 of Non-secure EL1 accesses to the CPACR_EL1 or CPACR 


CPTR_EL2.TCPAC traps Non-secure EL1 accesses to the CPACR_EL1 or CPACR to EL2: 
1 Non-secure EL1 accesses to the CPACR_EL1 or CPACR are trapped to EL2. 
0 Non-secure EL1 accesses to the CPACR_EL1 or CPACR are not trapped to EL2. 


Table D1-54 shows how the exceptions are reported in ESR_EL2. 


Table D1-54 Register accesses trapped to EL2 when CPTR_EL2.TCPAC is 1

Traps from      Registers       Syndrome reporting in ESR_EL2

AArch64 state   CPACR_EL1       Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   CPACR           Trapped MCR or MRC System register access (coproc==0b1111), using EC
                                value 0x03





Note
•   The CPACR_EL1 or CPACR is not accessible at EL0.
•   In ARMv7 and earlier versions of the ARM architecture, one function of the CPACR is as an ID register that
    identifies what coprocessor or conceptual coprocessor functionality is implemented. Legacy software might
    use this identification mechanism, and a hypervisor can use this trap to emulate this mechanism. For more
    information about this coprocessor model see Background to the System register interface on page G1-3879.
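
The syndrome columns of the tables in this section identify each trapped access by an Exception Class (EC) value.
As an illustration of how software at EL2 might classify such an exception, the following Python sketch extracts the
EC field from a syndrome value and looks up the EC values that this section mentions. It assumes that the EC field
occupies bits [31:26] of ESR_ELx. The names esr_ec, describe_ec, and EC_DESCRIPTIONS are hypothetical, and the table
is limited to the EC values named in this section, so it is not a complete list of Exception Classes.

```python
# EC values taken from the syndrome columns of the tables in this section.
EC_DESCRIPTIONS = {
    0x00: "Exception for an unknown reason",
    0x01: "Trapped WFI or WFE instruction",
    0x03: "Trapped MCR or MRC access (coproc==0b1111)",
    0x04: "Trapped MCRR or MRRC access (coproc==0b1111)",
    0x05: "Trapped MCR or MRC access (coproc==0b1110)",
    0x06: "Trapped LDC or STC access",
    0x07: "Trapped access to a SIMD or floating-point register",
    0x08: "Trapped VMRS System register access",
    0x0C: "Trapped MCRR or MRRC access (coproc==0b1110)",
    0x18: "Trapped AArch64 MSR, MRS, or system instruction",
}


def esr_ec(esr: int) -> int:
    """Extract the EC field, assumed to be bits [31:26] of the syndrome."""
    return (esr >> 26) & 0x3F


def describe_ec(esr: int) -> str:
    """Return a description of the EC value, if this section names it."""
    ec = esr_ec(esr)
    return EC_DESCRIPTIONS.get(ec, "EC value not listed in this sketch: 0x%02X" % ec)


if __name__ == "__main__":
    # A hypothetical syndrome value whose EC field is 0x18.
    print(describe_ec((0x18 << 26) | 0x1234))
```
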
General trapping to EL2 of Non-secure accesses to the SIMD and floating-point
registers

CPTR_EL2.TFP traps Non-secure accesses to SIMD and floating-point registers to EL2:

1       Any attempt at EL2, or Non-secure EL0 or EL1, to execute an instruction that accesses the SIMD
        or floating-point registers is trapped to EL2.
0       Non-secure execution of instructions that access the SIMD or floating-point registers is not trapped
        to EL2.


Table D1-55 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL2. 


Table D1-55 Register accesses trapped to EL2 when CPTR_EL2.TFP is 1

Traps from      Registers                                                       Syndrome reporting in ESR_EL2

AArch64 state   FPCR, FPSR, FPEXC32_EL2, and any of the SIMD and                Trapped access to a SIMD or
                floating-point registers V0-V31, including their views as       floating-point register, resulting from
                D0-D31 registers or S0-S31 registers. See The SIMD and          CPACR_EL1.FPEN or CPTR_ELx.TFP,
                floating-point registers, V0-V31 on page D1-1508.               using EC value 0x07

AArch32 state   FPSID, MVFR0, MVFR1, MVFR2, FPSCR, FPEXC, and any of the        Trapped access to a SIMD or
                SIMD and floating-point registers Q0-Q15, including their       floating-point register, resulting from
                views as D0-D31 registers or S0-S31 registers. See Advanced     CPACR_EL1.FPEN or CPTR_ELx.TFP,
                SIMD and floating-point System registers on page G1-3882.       using EC value 0x07 (a)

a. Permitted VMSR accesses to the FPSID are ignored, but for the purposes of this trap the architecture defines a
   VMSR access to the FPSID from EL1 or higher as an access to a SIMD and floating-point register.


Traps to EL2 of Non-secure System register accesses to the trace registers 


CPTR_EL2.TTA traps System register accesses to the trace registers to EL2. These traps are from Non-secure state, 
so are from all of: 


•   EL2.
•   Non-secure EL0 and EL1.

When CPTR_EL2.TTA is:

1       Non-secure System register accesses to the trace registers are trapped to EL2.
0       Non-secure System register accesses to the trace registers are not trapped to EL2.

Note
•   The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A architecture is
    implemented with an ETMv4 implementation, EL0 accesses to the trace registers are UNDEFINED, and any
    resulting exception is higher priority than a CPTR_EL2.TTA Trap exception.
•   EL2 does not provide traps on trace register accesses through the Memory-mapped interface.





System register accesses to the trace registers can have side-effects. When a System register access is trapped, no 
side-effects occur before the exception is taken, see Register access instructions on page D1-1563. 
Table D1-56 shows the registers for which accesses are trapped to EL2 when CPTR_EL2.TTA is 1, and how the
exceptions are reported in ESR_EL2.

Table D1-56 Register accesses trapped to EL2 when CPTR_EL2.TTA is 1

Traps from      Registers                           Syndrome reporting in ESR_EL2

AArch64 state   All implemented trace registers     Trapped AArch64 MSR, MRS, or system instruction, using EC
                                                    value 0x18.

AArch32 state   All implemented trace registers     •   Trapped MCR or MRC System register access
                                                        (coproc==0b1110), using EC value 0x05.
                                                    •   Trapped MCRR or MRRC System register access
                                                        (coproc==0b1110), using EC value 0x0C.





General trapping to EL2 of Non-secure EL0 and EL1 accesses to System registers, from
AArch32 state only

HSTR_EL2.{T0-T3, T5-T13, T15} trap accesses to the AArch32 System registers in the coproc==0b1111 encoding
space, by the register number, {c0-c3, c5-c13, c15} used for:
•   The CRn argument used when accessing the register using an MCR or MRC instruction.
•   The CRm argument used when accessing the register using an MCRR or MRRC instruction.

These traps are from AArch32 state only. They are from both:
•   Non-secure EL1 using AArch32.
•   Non-secure EL0 using AArch32.

When an HSTR_EL2.Tx trap control is:

1       Any AArch32 state Non-secure EL1 or EL0 access to the corresponding register is trapped to EL2.
0       AArch32 state Non-secure EL1 or EL0 accesses to the corresponding register are not trapped to
        EL2.


Table D1-57 shows the accesses that are trapped, and how the exceptions are reported in ESR_EL2. 


Table D1-57 Accesses trapped to EL2 when an HSTR_EL2.Tx trap is enabled

Traps from      Trapped accesses                                                Syndrome reporting in ESR_EL2

AArch64 state   n/a                                                             n/a

AArch32 state   MCR and MRC instructions, where CRn in the instruction          Trapped MCR or MRC access
                identifies the trapped encodings in the (coproc==0b1111)        (coproc==0b1111), using EC value 0x03
                encoding space

                MCRR and MRRC instructions, where CRm in the instruction        Trapped MCRR or MRRC access
                identifies the trapped encodings in the (coproc==0b1111)        (coproc==0b1111), using EC value 0x04
                encoding space

Note

HSTR_EL2[4, 14] are reserved, RES0. Although the Generic Timer AArch32 System registers are implemented in
the coproc==0b1111 encoding space and accessed using a CRn or CRm value of c14, EL2 does not provide a trap on
accesses to the Generic Timer System registers.
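
The selection of an HSTR_EL2.Tx control by register number can be sketched as follows. This Python example is
illustrative only. It assumes that the Tx control for register number x is bit [x] of HSTR_EL2, which is consistent
with HSTR_EL2[4, 14] being the reserved bits, and it assumes that the access is already known to come from
Non-secure EL0 or EL1 using AArch32. The function names and the set HSTR_RES0_BITS are hypothetical.

```python
# Bit positions that the Note above describes as reserved, RES0.
HSTR_RES0_BITS = {4, 14}


def hstr_access_trapped(hstr: int, instr: str, crn: int, crm: int) -> bool:
    """Return True if an HSTR_EL2.Tx control traps the access.

    instr is one of "MCR", "MRC", "MCRR", "MRRC". For MCR and MRC the
    register number is CRn. For MCRR and MRRC it is CRm.
    """
    if instr in ("MCR", "MRC"):
        number = crn
    elif instr in ("MCRR", "MRRC"):
        number = crm
    else:
        raise ValueError("unsupported instruction: " + instr)
    if not 0 <= number <= 15:
        raise ValueError("register number out of range")
    if number in HSTR_RES0_BITS:
        # No trap control exists for c4 or c14.
        return False
    return bool((hstr >> number) & 1)


def hstr_trap_ec(instr: str) -> int:
    """EC value reported in ESR_EL2 for a trapped access, per Table D1-57."""
    return 0x03 if instr in ("MCR", "MRC") else 0x04


if __name__ == "__main__":
    hstr = 1 << 9  # hypothetical value with only T9 set
    print(hstr_access_trapped(hstr, "MRC", crn=9, crm=0))
    print(hex(hstr_trap_ec("MRRC")))
```
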
System registers in the (coproc==0b1111) encoding space with IMPLEMENTATION DEFINED
access permission from EL0

For an AArch32 System register in the (coproc==0b1111) encoding space, that is accessed using a CRn or CRm
value that can be trapped by a HSTR_EL2.Tn control, if an access to the register from EL0 is UNDEFINED when the
value of the corresponding HSTR_EL2.Tn trap control is 0, then when that HSTR_EL2.Tn trap control is 1, it is
IMPLEMENTATION DEFINED whether an access from Non-secure EL0 using AArch32:
•   Generates a Trap exception that is taken to EL2.
•   Is UNDEFINED and generates an exception that is taken to Non-secure EL1.

If the instruction is treated as UNDEFINED and generates an exception that is taken to Non-secure EL1, and
Non-secure EL1 is using AArch64, the exception is reported in ESR_EL1 as an exception for an unknown reason,
using EC value 0x00.





Note

ARM expects that trapping to EL2 of Non-secure EL0 accesses to AArch32 System registers in the
(coproc==0b1111) encoding space will be unusual, and used only when the hypervisor must virtualize EL0
operation. ARM recommends that, whenever possible, Non-secure EL0 accesses to the System registers behave as
they would if the implementation did not include EL2. This means that, if the architecture does not support the
Non-secure EL0 access, then the register access instruction is treated as UNDEFINED and generates an exception that
is taken to Non-secure EL1.





Traps to EL2 of Non-secure EL0 and EL1 System register accesses to debug registers

MDCR_EL2.{TDRA, TDOSA, TDA} trap Non-secure System register accesses to the debug registers to EL2, as
follows:
•   MDCR_EL2.{TDRA, TDA} trap Non-secure EL0 and EL1 accesses.
•   MDCR_EL2.TDOSA traps Non-secure EL1 accesses.


Note 


EL2 does not provide traps on debug register accesses through the optional memory-mapped external debug 
interfaces. 








System register accesses to the debug registers can have side-effects. When a System register access is trapped to 
EL2, no side-effects occur before the exception is taken to EL2. See Register access instructions on page D1-1563. 


Table D1-58 shows the subsections that list the accesses trapped. The subsections describe how the traps are 
reported in ESR_EL2. 


Table D1-58 Traps of Non-secure EL0 and EL1 accesses to debug registers

Trap control        Subsection

MDCR_EL2.TDRA       Trapping System register accesses to Debug ROM registers to EL2
MDCR_EL2.TDOSA      Trapping System register accesses to powerdown debug registers to EL2 on page D1-1586
MDCR_EL2.TDA        Trapping general System register accesses to debug registers to EL2 on page D1-1586





Trapping System register accesses to Debug ROM registers to EL2

MDCR_EL2.TDRA traps Non-secure EL0 and EL1 System register accesses to the Debug ROM registers to EL2:

1       Non-secure EL0 and EL1 System register accesses to the Debug ROM registers are trapped to EL2.
0       Non-secure EL0 and EL1 System register accesses to the Debug ROM registers are not trapped to
        EL2.

This trap applies to Non-secure EL0 only if it is using AArch32.
Table D1-59 shows the register accesses that are trapped, and how the exceptions are reported in ESR_EL2.

Table D1-59 Register accesses trapped to EL2 when MDCR_EL2.TDRA is 1

Traps from      Registers           Syndrome reporting in ESR_EL2

AArch64 state   MDRAR_EL1           Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18.

AArch32 state   DBGDRAR, DBGDSAR    For accesses using:
                                    •   MCR or MRC instructions, trapped MCR or MRC access (coproc==0b1110),
                                        using EC value 0x05.
                                    •   MCRR or MRRC instructions, trapped MCRR or MRRC access
                                        (coproc==0b1110), using EC value 0x0C.

If MDCR_EL2.TDE or HCR_EL2.TGE is 1, behavior is as if MDCR_EL2.TDRA is 1 other than for the purpose
of a direct read.

Trapping System register accesses to powerdown debug registers to EL2

MDCR_EL2.TDOSA traps Non-secure EL1 System register accesses to the powerdown debug registers to EL2:

1       Non-secure EL1 System register accesses to the powerdown debug registers are trapped to EL2.
0       Non-secure EL1 System register accesses to the powerdown debug registers are not trapped to EL2.

Table D1-60 shows the register accesses that are trapped, and how the exceptions are reported in ESR_EL2.

Table D1-60 Register accesses trapped to EL2 when MDCR_EL2.TDOSA is 1

Traps from      Registers                                                       Syndrome reporting in ESR_EL2

AArch64 state   OSLAR_EL1, OSLSR_EL1, OSDLR_EL1, DBGPRCR_EL1.                   Trapped AArch64 MSR, MRS, or system
                Any IMPLEMENTATION DEFINED integration registers.               instruction, using EC value 0x18.
                Any IMPLEMENTATION DEFINED register with similar
                functionality that the implementation specifies as trapped
                by MDCR_EL2.TDOSA.

AArch32 state   DBGOSLSR, DBGOSLAR, DBGOSDLR, DBGPRCR.                          Trapped MCR or MRC access
                Any IMPLEMENTATION DEFINED integration registers.               (coproc==0b1110), using EC value 0x05.
                Any IMPLEMENTATION DEFINED register with similar
                functionality that the implementation specifies as trapped
                by HDCR.TDOSA.

Note

These registers are not accessible at EL0.

If MDCR_EL2.TDE or HCR_EL2.TGE is 1, behavior is as if MDCR_EL2.TDOSA is 1 other than for the purpose
of a direct read.

Trapping general System register accesses to debug registers to EL2

MDCR_EL2.TDA traps Non-secure EL0 and EL1 System register accesses to those debug System registers that are
not mentioned in either of the following:
•   Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
•   Trapping System register accesses to powerdown debug registers to EL2.

This means that MDCR_EL2.TDA traps Non-secure EL0 and EL1 System register accesses to all debug System
registers to EL2, except the following:
•   Any access from:
    -   AArch64 state to the MDRAR_EL1.
    -   AArch32 state to the DBGDRAR or DBGDSAR.
    MDCR_EL2.TDRA traps these accesses.
•   Any access from:
    -   AArch64 state to the OSLAR_EL1, OSLSR_EL1, OSDLR_EL1 or DBGPRCR_EL1.
    -   AArch32 state to the DBGOSLSR, DBGOSLAR, DBGOSDLR or DBGPRCR.
    MDCR_EL2.TDOSA traps these accesses.

When MDCR_EL2.TDA is:

1       Non-secure EL0 or EL1 System register accesses to any of the registers shown in Table D1-61 are
        trapped to EL2.
0       Non-secure EL0 or EL1 System register accesses to the registers shown in Table D1-61 are not
        trapped to EL2.


Table D1-61 shows how the exceptions are reported in ESR_EL2. 


Table D1-61 Accesses trapped to EL2 when MDCR_EL2.TDA is 1

Traps from      Trapped accesses                                                Syndrome reporting in ESR_EL2

AArch64 state   Accesses to the MDCCSR_EL0, MDCCINT_EL1, DBGDTR_EL0,            Trapped AArch64 MSR, MRS, or system
                DBGDTRRX_EL0, DBGDTRTX_EL0, OSDTRRX_EL1, MDSCR_EL1,             instruction, using EC value 0x18
                OSDTRTX_EL1, OSECCR_EL1, DBGBVR<n>_EL1, DBGBCR<n>_EL1,
                DBGWVR<n>_EL1, DBGWCR<n>_EL1, DBGCLAIMSET_EL1,
                DBGCLAIMCLR_EL1, and DBGAUTHSTATUS_EL1.

AArch32 state   Accesses to the DBGDIDR, DBGDSCRint, DBGDCCINT,                 For accesses using MCR or MRC
                DBGDTRRXint, DBGDTRTXint, DBGWFAR, DBGVCR, DBGDSCRext,          instructions, trapped MCR or MRC access
                DBGDTRTXext, DBGDTRRXext, DBGBVR<n>, DBGBCR<n>,                 (coproc==0b1110), using EC value 0x05
                DBGBXVR<n>, DBGWCR<n>, DBGWVR<n>, DBGCLAIMSET,
                DBGCLAIMCLR, DBGAUTHSTATUS, DBGDEVID, DBGDEVID1,
                DBGDEVID2, and DBGOSECCR.

                STC accesses to DBGDTRRXint.                                    Trapped LDC or STC access, using EC
                LDC accesses to DBGDTRTXint.                                    value 0x06





If MDCR_EL2.TDE or HCR_EL2.TGE is 1, behavior is as if MDCR_EL2.TDA is 1 other than for the purpose of 
a direct read. 
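
The division of the debug System registers between the three MDCR_EL2 controls can be sketched as a lookup. The
following Python example is a simplified model. The names debug_trap_control and debug_access_trapped are
hypothetical, and the caller is assumed to pass the name of a debug System register. The sketch omits the
IMPLEMENTATION DEFINED registers that TDOSA can trap, and it does not model the Exception level restrictions, for
example that MDCR_EL2.TDOSA applies to Non-secure EL1 accesses only.

```python
# Registers named in the Debug ROM and powerdown subsections above.
TDRA_REGISTERS = {"MDRAR_EL1", "DBGDRAR", "DBGDSAR"}
TDOSA_REGISTERS = {
    "OSLAR_EL1", "OSLSR_EL1", "OSDLR_EL1", "DBGPRCR_EL1",
    "DBGOSLSR", "DBGOSLAR", "DBGOSDLR", "DBGPRCR",
}


def debug_trap_control(register: str) -> str:
    """Name the MDCR_EL2 control that covers a debug System register.

    Any debug System register that is not covered by TDRA or TDOSA is
    covered by TDA.
    """
    if register in TDRA_REGISTERS:
        return "TDRA"
    if register in TDOSA_REGISTERS:
        return "TDOSA"
    return "TDA"


def debug_access_trapped(register: str, tdra: int, tdosa: int, tda: int,
                         tde: int = 0, tge: int = 0) -> bool:
    """Return True if the covering control is 1, or behaves as if it is 1.

    If MDCR_EL2.TDE or HCR_EL2.TGE is 1, behavior is as if TDRA, TDOSA,
    and TDA are 1, other than for the purpose of a direct read.
    """
    if tde == 1 or tge == 1:
        return True
    controls = {"TDRA": tdra, "TDOSA": tdosa, "TDA": tda}
    return controls[debug_trap_control(register)] == 1


if __name__ == "__main__":
    print(debug_trap_control("MDSCR_EL1"))
    print(debug_access_trapped("DBGDRAR", tdra=1, tdosa=0, tda=0))
```
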


Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers

CNTHCTL_EL2.{EL1PCEN, EL1PCTEN} trap Non-secure EL0 and EL1 accesses to the Generic Timer registers
to EL2, as follows:
•   CNTHCTL_EL2.EL1PCEN traps Non-secure EL0 and EL1 accesses to the physical timer registers.
•   CNTHCTL_EL2.EL1PCTEN traps Non-secure EL0 and EL1 accesses to the physical counter register.

For each of these controls:

1       Non-secure EL0 and EL1 accesses are not trapped to EL2.
0       Non-secure EL0 and EL1 accesses are trapped to EL2.
Table D1-62 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL2.

Table D1-62 Register accesses trapped to EL2 by CNTHCTL_EL2 trap controls

Traps from      Trap control    Registers                           Syndrome reporting in ESR_EL2

AArch64 state   EL1PCEN         CNTP_CTL_EL0, CNTP_CVAL_EL0,        Trapped AArch64 MSR, MRS, or system instruction,
                                CNTP_TVAL_EL0                       using EC value 0x18
                EL1PCTEN        CNTPCT_EL0

AArch32 state   EL1PCEN         CNTP_CTL, CNTP_CVAL, CNTP_TVAL      For accesses using:
                                                                    •   MCR or MRC instructions, trapped MCR or MRC
                                                                        access (coproc==0b1111), using EC value 0x03
                                                                    •   MCRR or MRRC instructions, trapped MCRR or
                                                                        MRRC access (coproc==0b1111), using EC
                                                                        value 0x04
                EL1PCTEN        CNTPCT                              Trapped MCRR or MRRC access (coproc==0b1111),
                                                                    using EC value 0x04
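
Unlike most of the trap controls in this section, the CNTHCTL_EL2 controls trap when their value is 0. The following
Python sketch illustrates that polarity using the register names from Table D1-62. The function name
timer_access_trapped and the two register sets are hypothetical names chosen for this example, and the access is
assumed to come from Non-secure EL0 or EL1.

```python
# Register names taken from Table D1-62, for both Execution states.
EL1PCEN_REGISTERS = {
    "CNTP_CTL_EL0", "CNTP_CVAL_EL0", "CNTP_TVAL_EL0",
    "CNTP_CTL", "CNTP_CVAL", "CNTP_TVAL",
}
EL1PCTEN_REGISTERS = {"CNTPCT_EL0", "CNTPCT"}


def timer_access_trapped(register: str, el1pcen: int, el1pcten: int) -> bool:
    """Return True if a Non-secure EL0 or EL1 access is trapped to EL2.

    The controls are enables: a value of 0 traps the access and a value
    of 1 does not.
    """
    if register in EL1PCEN_REGISTERS:
        return el1pcen == 0
    if register in EL1PCTEN_REGISTERS:
        return el1pcten == 0
    # Registers outside Table D1-62 are not affected by these controls.
    return False


if __name__ == "__main__":
    print(timer_access_trapped("CNTPCT_EL0", el1pcen=1, el1pcten=0))
```
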





Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers

MDCR_EL2.{TPM, TPMCR} trap Non-secure EL0 and EL1 accesses to the Performance Monitors registers to
EL2:

MDCR_EL2.TPM
        1       Non-secure EL0 and EL1 accesses to the Performance Monitors registers are trapped to
                EL2.
        0       Non-secure EL0 and EL1 accesses to the Performance Monitors registers are not
                trapped to EL2.

MDCR_EL2.TPMCR
        1       Non-secure EL0 and EL1 accesses to the Performance Monitors Control Registers are
                trapped to EL2.
        0       Non-secure EL0 and EL1 accesses to the Performance Monitors Control Registers are
                not trapped to EL2.

Note

EL2 does not provide traps on Performance Monitor register accesses through the optional memory-mapped
external debug interface.

For:
•   MDCR_EL2.TPM == 1, Table D1-63 on page D1-1589 shows the registers for which accesses are trapped,
    and how the exceptions are reported in ESR_EL2.
•   MDCR_EL2.TPMCR == 1, Table D1-64 shows the registers for which accesses are trapped, and how the
    exceptions are reported in ESR_EL2.


Table D1-63 Register accesses trapped to EL2 when MDCR_EL2.TPM is 1

Traps from      Registers                                               Syndrome reporting in ESR_EL2

AArch64 state   PMCR_EL0, PMCNTENSET_EL0, PMCNTENCLR_EL0,               Trapped AArch64 MSR, MRS, or system instruction,
                PMOVSCLR_EL0, PMSWINC_EL0, PMSELR_EL0,                  using EC value 0x18
                PMCEID0_EL0, PMCEID1_EL0, PMCCNTR_EL0,
                PMXEVTYPER_EL0, PMXEVCNTR_EL0,
                PMUSERENR_EL0, PMINTENSET_EL1,
                PMINTENCLR_EL1, PMOVSSET_EL0,
                PMEVCNTR<n>_EL0, PMEVTYPER<n>_EL0,
                PMCCFILTR_EL0.

AArch32 state   PMCR, PMCNTENSET, PMCNTENCLR, PMOVSR,                   For accesses using:
                PMSWINC, PMSELR, PMCEID0, PMCEID1, PMCCNTR,             •   MCR or MRC instructions, trapped MCR or MRC
                PMXEVTYPER, PMXEVCNTR, PMUSERENR,                           access (coproc==0b1111), using EC value
                PMINTENSET, PMINTENCLR, PMOVSSET,                           0x03
                PMEVCNTR<n>, PMEVTYPER<n>, PMCCFILTR.                   •   MCRR or MRRC instructions, trapped MCRR or
                                                                            MRRC access (coproc==0b1111), using
                                                                            EC value 0x04





Table D1-64 Register accesses trapped to EL2 when MDCR_EL2.TPMCR is 1

Traps from      Registers       Syndrome reporting in ESR_EL2

AArch64 state   PMCR_EL0        Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
AArch32 state   PMCR            Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03

Note

MDCR_EL2.HPMN affects whether a counter can be accessed from Non-secure EL0 or EL1. See the register
description of MDCR_EL2 for more information.
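
Because PMCR_EL0 and PMCR appear in both Table D1-63 and Table D1-64, an access to the control register is trapped
when either control is 1. The following Python sketch illustrates this overlap. It is a simplified model: the
function name pm_access_trapped is hypothetical, the caller is assumed to pass the name of a Performance Monitors
register listed in Table D1-63, and the effect of MDCR_EL2.HPMN on counter accessibility is not modeled.

```python
# The only registers listed in Table D1-64.
PMCR_NAMES = {"PMCR_EL0", "PMCR"}


def pm_access_trapped(register: str, tpm: int, tpmcr: int) -> bool:
    """Return True if a Non-secure EL0 or EL1 access is trapped to EL2.

    register is assumed to be one of the Performance Monitors registers
    listed in Table D1-63.
    """
    if tpm == 1:
        # TPM covers every register in Table D1-63, including PMCR.
        return True
    # TPMCR covers only the control register.
    return tpmcr == 1 and register in PMCR_NAMES


if __name__ == "__main__":
    print(pm_access_trapped("PMCR_EL0", tpm=0, tpmcr=1))
    print(pm_access_trapped("PMCCNTR_EL0", tpm=0, tpmcr=1))
```
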








D1.15.4 EL3 configurable controls

These controls are in _EL3 System registers. The resulting exceptions might be taken from either Execution state.
SPSR_EL3.M[4] indicates which Execution state the exception was taken from.

Table D1-65 shows the _EL3 System registers that contain these controls.

Table D1-65 _EL3 registers that contain instruction enables and disables, and trap controls

Register description                            Register name

Secure Configuration Register                   SCR_EL3
Architectural Feature Trap Register, EL3        CPTR_EL3
Monitor Debug Configuration Register, EL3       MDCR_EL3
Table D1-66 summarizes the controls.

Table D1-66 Instruction enables and disables, and trap controls, provided by EL3

Control                     Control     Description
                            type (a)

SCR_EL3.{TWE, TWI}          T           Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions on
                                        page D1-1591
SCR_EL3.ST                  T           Traps to EL3 of Secure EL1 accesses to the Counter-timer Physical Secure timer
                                        registers on page D1-1592
SCR_EL3.HCE                 E           Enabling EL3, EL2, and Non-secure EL1 execution of HVC instructions on
                                        page D1-1592
SCR_EL3.SMD                 D           Disabling EL3, EL2, and EL1 execution of SMC instructions on page D1-1593
CPTR_EL3.TCPAC              T           Trapping to EL3 of EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1
                                        accesses to the CPACR_EL1 or CPACR on page D1-1593
CPTR_EL3.TTA                T           Traps to EL3 of all System register accesses to the trace registers on
                                        page D1-1594
CPTR_EL3.TFP                T           Traps to EL3 of all accesses to the SIMD and floating-point registers on
                                        page D1-1594
MDCR_EL3.{TDOSA, TDA}       T           Traps to EL3 of EL2, EL1, and EL0 System register accesses to debug registers on
                                        page D1-1595
MDCR_EL3.TPM                T           Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on
                                        page D1-1597

a. See Table D1-67.


Table D1-67 Control types, for exceptions taken to EL1

Abbreviation    Type        See

D               Disable     Instruction enables and instruction disables on page D1-1562
E               Enable      Instruction enables and instruction disables on page D1-1562
T               Trap        Trap controls on page D1-1562





Also see the following for more general information about traps to EL3:
•   Register access instructions on page D1-1563.
•   Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32.

Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32

If EL1 is using AArch32, all of the following are trapped to EL3:
•   Secure EL1 reads and writes to any of the SCR, NSACR, MVBAR or SDCR.
•   Any attempt at Secure EL1 to execute any of the following:
    -   ATS12NSOxx instructions.
    -   SRS instructions that use the R13_mon banked register.
    -   MRS or MSR instructions that access any of the SPSR_mon, R13_mon or R14_mon banked registers.

In addition, if EL1 is using AArch32:
•   Secure EL1 write accesses to the CNTFRQ register are UNDEFINED. They are not trapped to EL3.
•   Any attempt at Secure EL1 to change the mode to Monitor mode, by using a CPS or an MSR instruction, or by
    performing an exception return, is treated as an illegal change of the CPSR.M field. See Illegal changes to
    PSTATE.M on page G1-3809.


Table D1-68 shows the accesses that are trapped to EL3, and how the exceptions are reported in ESR_EL3. 


Table D1-68 Accesses trapped to EL3 from Secure EL1 using AArch32

Taken from          Trapped instructions, or trapped accesses                   Syndrome reporting in ESR_EL3

Secure EL1 using    Reads and writes to any of the SCR, NSACR, MVBAR or SDCR    Trapped MCR or MRC access
AArch32             ATS12NSOxx instructions                                     (coproc==0b1111), using EC value 0x03

                    SRS instructions that use the R13_mon banked register       Exception for an unknown reason,
                    MRS or MSR instructions that access any of the SPSR_mon,    using EC value 0x00
                    R13_mon or R14_mon banked registers








Note
•   Reads of the NSACR from either Non-secure EL1 using AArch32 or Non-secure EL2 using AArch32 return
    the value 0x00000C00. See Restricted access System registers on page G4-4156.
•   These operations are not available at EL0.





Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions

SCR_EL3.{TWE, TWI} trap EL2, EL1, and EL0 execution of WFE and WFI instructions to EL3:

SCR_EL3.TWE
        1       Any attempt to execute a WFE instruction at any Exception level lower than EL3 is
                trapped to EL3, if the instruction would otherwise have caused the PE to enter a
                low-power state.
        0       EL2, EL1, and EL0 execution of WFE instructions is not trapped to EL3.

SCR_EL3.TWI
        1       Any attempt to execute a WFI instruction at any Exception level lower than EL3 is
                trapped to EL3, if the instruction would otherwise have caused the PE to enter a
                low-power state.
        0       EL2, EL1, and EL0 execution of WFI instructions is not trapped to EL3.

For EL0 and EL1, these traps apply to WFE and WFI execution in both Security states.


Table D1-69 shows how the exceptions are reported in ESR_EL3. 


Table D1-69 Instructions trapped to EL3 when SCR_EL3.{TWE, TWI} are 1

Trap control    Traps from              Trapped instructions    Syndrome reporting in ESR_EL3

SCR_EL3.TWE     Both Execution states   WFE                     Trapped WFI or WFE instruction, using EC value 0x01
SCR_EL3.TWI     Both Execution states   WFI                     Trapped WFI or WFE instruction, using EC value 0x01





In AArch32 state, the attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction 
passes its condition code check. 

Note

Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken.








For more information about these instructions, and when they can cause the PE to enter a low-power state, see: 
° Wait for Event mechanism and Send event on page D1-1599. 
° Wait For Interrupt on page D1-1602. 


Traps to EL3 of Secure EL1 accesses to the Counter-timer Physical Secure timer 
registers 


SCR_EL3.ST traps Secure EL1 accesses to the Counter-timer Physical Secure timer registers to EL3: 
1 Secure EL1 accesses to the Counter-timer Physical Secure timer registers are not trapped to EL3. 


0 Secure EL1 accesses to the Counter-timer Physical Secure timer registers are trapped to EL3. 


Note 


Accesses to the Counter-timer Physical Secure timer registers are always enabled at EL3. 








Table D1-70 shows the registers for which accesses are trapped to EL3, and how the exceptions are reported in 
ESR_EL3. 


Table D1-70 Register accesses trapped to EL3 when SCR_EL3.ST is 0

Traps from      Registers           Syndrome reporting in ESR_EL3

AArch64 state   CNTPS_TVAL_EL1      Trapped AArch64 MSR, MRS, or system instruction, using EC value
                CNTPS_CTL_EL1       0x18
                CNTPS_CVAL_EL1
AArch32 state   n/a                 n/a

Note

These registers are not accessible at EL0.








Enabling EL3, EL2, and Non-secure EL1 execution of HVC instructions

SCR_EL3.HCE enables HVC instruction execution at EL1 and above:

1       HVC instruction execution is enabled at EL1 and above.
0       HVC instructions are UNDEFINED at EL1, EL2, and EL3, and any resulting exception is taken from the
        current Exception level to the current Exception level.

For EL1, this enable control applies to HVC instructions in Non-secure state only.

If EL2 is not implemented, this bit is RES0.

Note

HVC instructions are always UNDEFINED at EL0.

Table D1-71 shows how the exceptions are reported in ESR_ELx.

Table D1-71 Instruction disabled when SCR_EL3.HCE is 0

Taken from      Disabled instruction    Syndrome reporting in ESR_ELx

AArch64 state   HVC                     Exception for an unknown reason, using EC value 0x00
AArch32 state   HVC                     Exception for an unknown reason, using EC value 0x00





Disabling EL3, EL2, and EL1 execution of SMC instructions

SCR_EL3.SMD disables SMC instruction execution at EL1 and above:

1       SMC instructions are UNDEFINED at EL1 and above, and any resulting exception is taken from the
        current Exception level to the current Exception level.
0       SMC instruction execution is enabled at EL1 and above.

For EL1, this disable control applies to SMC instructions in both Security states.

Note

SMC instructions are always UNDEFINED at EL0.

Table D1-72 shows how the exceptions are reported in ESR_ELx.

Table D1-72 Exceptions generated when SCR_EL3.SMD is 1

Taken from      Disabled instruction    Syndrome reporting in ESR_ELx

AArch64 state   SMC                     Exception for an unknown reason, using EC value 0x00
AArch32 state   SMC                     Exception for an unknown reason, using EC value 0x00

Note

If HCR_EL2.TSC or HCR.TSC traps attempted EL1 execution of SMC instructions to EL2, that trap has priority over
this disable.
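
The interaction between the SCR_EL3.SMD disable and the EL2 trap described in the Note can be sketched as a small
decision function. This Python example is illustrative only. The function name smc_outcome, its parameters, and its
result strings are hypothetical. The parameter trapped_to_el2_by_tsc stands for the case where HCR_EL2.TSC or
HCR.TSC traps the attempted EL1 execution to EL2, which the caller must determine.

```python
def smc_outcome(el: int, smd: int, trapped_to_el2_by_tsc: bool = False) -> str:
    """Return the outcome of executing an SMC instruction at Exception level el.

    smd is the value of SCR_EL3.SMD.
    """
    if el not in (0, 1, 2, 3):
        raise ValueError("el must be 0, 1, 2, or 3")
    if el == 0:
        # SMC instructions are always UNDEFINED at EL0.
        return "UNDEFINED"
    if el == 1 and trapped_to_el2_by_tsc:
        # The EL2 trap has priority over the SCR_EL3.SMD disable.
        return "trapped to EL2"
    if smd == 1:
        # Reported as an exception for an unknown reason, EC value 0x00.
        return "UNDEFINED"
    return "enabled"


if __name__ == "__main__":
    print(smc_outcome(1, smd=1, trapped_to_el2_by_tsc=True))
    print(smc_outcome(2, smd=1))
```
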








Trapping to EL3 of EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1
accesses to the CPACR_EL1 or CPACR

CPTR_EL3.TCPAC traps all of the following to EL3:
•   EL2 accesses to the CPTR_EL2 or HCPTR.
•   EL2 and EL1 accesses to the CPACR_EL1 or CPACR.

When CPTR_EL3.TCPAC is:

1       EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the CPACR_EL1 or
        CPACR, are trapped to EL3.
0       EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the CPACR_EL1 or
        CPACR, are not trapped to EL3.

For EL1, this trap control applies to accesses from both Security states.

Table D1-73 shows how the exceptions are reported in ESR_EL3.

Table D1-73 Register accesses trapped to EL3 when CPTR_EL3.TCPAC is 1

Traps from      Registers       Syndrome reporting in ESR_EL3

AArch64 state   CPTR_EL2        Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18
                CPACR_EL1
AArch32 state   HCPTR           Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03
                CPACR





Traps to EL3 of all System register accesses to the trace registers 


CPTR_EL3.TTA traps System register accesses to the trace registers, from all Exception levels, to EL3: 





1       All System register accesses to the trace registers are trapped to EL3.
0       System register accesses to the trace registers are not trapped to EL3.

Note
•   The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A architecture is
    implemented with an ETMv4 implementation, EL0 accesses to the trace registers are UNDEFINED, and any
    resulting exception is higher priority than a CPTR_EL3.TTA Trap exception.
•   EL3 does not provide traps on trace register accesses through the Memory-mapped interface.

System register accesses to the trace registers can have side-effects. When a System register access is trapped, no
side-effects occur before the exception is taken, see Register access instructions on page D1-1563.

For EL0 and EL1, this trap control applies to accesses from both Security states.


Table D1-74 shows the registers for which accesses are trapped to EL3, and how the exceptions are reported in 
ESR_EL3. 


Table D1-74 Register accesses trapped to EL3 when CPTR_EL3.TTA is 1 

Traps from       Registers                          Syndrome reporting in ESR_EL3 

AArch64 state    All implemented trace registers    Trapped AArch64 MSR, MRS, or system instruction, using EC value 0x18 

AArch32 state    All implemented trace registers    For accesses using: 
                                                    ° MCR or MRC instructions, trapped MCR or MRC access (coproc==0b1110), using 
                                                      EC value 0x05. 
                                                    ° MCRR or MRRC instructions, trapped MCRR or MRRC access (coproc==0b1110), 
                                                      using EC value 0x0C. 
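
As an illustration of how software handling one of these traps at EL3 might classify it, the following is a minimal sketch that extracts the Exception Class from an ESR_EL3 value and looks it up. It assumes that the EC field occupies bits [31:26] of the syndrome register, and the dictionary covers only the EC values quoted in the trap tables of this section. The names EC_NAMES, exception_class, and describe_trap are hypothetical.

```python
# EC values quoted in the trap tables of this section. Illustrative only:
# other EC values exist and are not covered by this sketch.
EC_NAMES = {
    0x03: "Trapped MCR or MRC access (coproc==0b1111)",
    0x04: "Trapped MCRR or MRRC access (coproc==0b1111)",
    0x05: "Trapped MCR or MRC access (coproc==0b1110)",
    0x06: "Trapped LDC or STC access",
    0x07: "Trapped access to a SIMD or floating-point register",
    0x0C: "Trapped MCRR or MRRC access (coproc==0b1110)",
    0x18: "Trapped AArch64 MSR, MRS, or system instruction",
}


def exception_class(esr: int) -> int:
    """Return the EC field, assumed here to be bits [31:26] of ESR_ELx."""
    return (esr >> 26) & 0x3F


def describe_trap(esr: int) -> str:
    """Map a syndrome value to the description used in the tables."""
    ec = exception_class(esr)
    return EC_NAMES.get(ec, f"EC 0x{ec:02X} (not covered by this sketch)")
```

For example, a syndrome value whose EC field is 0x18 is classified as a trapped AArch64 MSR, MRS, or system instruction.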





Traps to EL3 of all accesses to the SIMD and floating-point registers 
CPTR_EL3.TFP traps all accesses to SIMD and floating-point registers, from all Exception levels, to EL3: 


1 Any attempt at any Exception level to execute an instruction that accesses the SIMD or 
floating-point registers is trapped to EL3. 


0 Execution of instructions that access the SIMD or floating-point registers is not trapped to EL3. 
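
The three CPTR_EL3 trap controls described in this section can be summarized with a small decoder. This is a sketch only: the bit positions (TFP at bit 10, TTA at bit 20, TCPAC at bit 31) are not stated in this section and are assumed from the CPTR_EL3 register description, so check them against that description before relying on them. The function name cptr_el3_traps is hypothetical.

```python
# Bit positions are assumptions taken from the CPTR_EL3 register description,
# not from this section.
CPTR_EL3_TFP = 1 << 10
CPTR_EL3_TTA = 1 << 20
CPTR_EL3_TCPAC = 1 << 31


def cptr_el3_traps(cptr_el3: int) -> dict:
    """Report which of the EL3 trap controls are enabled in a CPTR_EL3 value."""
    return {
        "TCPAC": bool(cptr_el3 & CPTR_EL3_TCPAC),  # CPTR_EL2/HCPTR, CPACR_EL1/CPACR
        "TTA": bool(cptr_el3 & CPTR_EL3_TTA),      # trace registers
        "TFP": bool(cptr_el3 & CPTR_EL3_TFP),      # SIMD and floating-point registers
    }
```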


For EL0 and EL1, this trap control applies to accesses from both Security states. 

Table D1-75 shows the registers for which accesses are trapped to EL3, and how the exceptions are reported in 
ESR_EL3. 

Table D1-75 Register accesses trapped to EL3 when CPTR_EL3.TFP is 1 

Traps from       Registers                                                    Syndrome reporting in ESR_EL3 

AArch64 state    FPCR, FPSR, FPEXC32_EL2, and any of the SIMD and             Trapped access to a SIMD or floating-point 
                 floating-point registers V0-V31, including their views as    register, resulting from CPACR_EL1.FPEN 
                 D0-D31 registers or S0-S31 registers. See The SIMD and       or CPTR_ELx.TFP, using EC value 0x07 
                 floating-point registers, V0-V31 on page D1-1508. 

AArch32 state    FPSID, MVFR0, MVFR1, MVFR2, FPSCR, FPEXC, and any of         Trapped access to a SIMD or floating-point 
                 the SIMD and floating-point registers Q0-Q15, including      register, resulting from CPACR_EL1.FPEN 
                 their views as D0-D31 registers or S0-S31 registers. See     or CPTR_ELx.TFP, using EC value 0x07 [a] 
                 Advanced SIMD and floating-point System registers on 
                 page G1-3882. 

a. Permitted VMSR accesses to the FPSID are ignored, but for the purposes of this trap the architecture defines a VMSR access to the FPSID from 
EL1 or higher as an access to a SIMD and floating-point register. 





Note 
° FPEXC32_EL2 is not accessible from EL0 using AArch64. 
° FPSID, MVFR0, MVFR1, and FPEXC are not accessible from EL0 using AArch32. 





Traps to EL3 of EL2, EL1, and EL0 System register accesses to debug registers 


MDCR_EL3.{TDOSA, TDA} trap EL2, EL1, and EL0 System register accesses to the debug registers to EL3, from 
both Security states. 


Note 


EL3 does not provide traps on debug register accesses through the Memory-mapped or External debug interfaces. 








System register accesses to the debug registers can have side-effects. When a System register access is trapped to 
EL3, no side-effects occur before the exception is taken to EL3. See Register access instructions on page D1-1563. 


Table D1-76 shows the subsections that list the accesses trapped. 


Table D1-76 Traps of EL2, EL1, and EL0 accesses to debug registers 

Trap control       Subsection 

MDCR_EL3.TDOSA     Trapping System register accesses to powerdown debug registers to EL3 

MDCR_EL3.TDA       Trapping general System register accesses to debug registers to EL3 on page D1-1596 





Trapping System register accesses to powerdown debug registers to EL3 


MDCR_EL3.TDOSA traps EL2 and EL1 accesses to the powerdown debug registers to EL3: 
1 EL2 and EL1 System register accesses to the powerdown debug registers are trapped to EL3. 
0 EL2 and EL1 System register accesses to the powerdown debug registers are not trapped to EL3. 


For EL1, this trap control applies to accesses from both Security states. 

Table D1-77 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL3. 


Table D1-77 Register accesses trapped to EL3 when MDCR_EL3.TDOSA is 1 

Traps from       Registers                                             Syndrome reporting in ESR_EL3 

AArch64 state    OSLAR_EL1, OSLSR_EL1, OSDLR_EL1, DBGPRCR_EL1.         Trapped AArch64 MSR, MRS, or system instruction, using 
                 Any IMPLEMENTATION DEFINED integration registers.     EC value 0x18. 
                 Any IMPLEMENTATION DEFINED register with similar 
                 functionality that the implementation specifies as 
                 trapped by MDCR_EL3.TDOSA. 

AArch32 state    DBGOSLSR, DBGOSLAR, DBGOSDLR, DBGPRCR.                For accesses using: 
                                                                       ° MCR or MRC instructions, trapped MCR or MRC access 
                                                                         (coproc==0b1110), using EC value 0x05. 
                                                                       ° MCRR or MRRC instructions, trapped MCRR or MRRC 
                                                                         access (coproc==0b1110), using EC value 0x0C. 





Note 


These registers are not accessible at EL0. 








Trapping general System register accesses to debug registers to EL3 


MDCR_EL3.TDA traps EL2, EL1, and EL0 System register accesses to the debug System registers that are not 
mentioned in Trapping System register accesses to powerdown debug registers to EL3 on page D1-1599. 


This means that MDCR_EL3.TDA traps EL2, EL1, and EL0 System register accesses to all debug System registers, 
except the following: 


° Accesses from AArch64 state to the OSLAR_EL1, OSLSR_EL1, OSDLR_EL1 or DBGPRCR_EL1. 
° Accesses from AArch32 state to the DBGOSLSR, DBGOSLAR, DBGOSDLR or DBGPRCR. 


When MDCR_EL3.TDA is: 


1 EL2, EL1, and EL0 System register accesses to any of the registers shown in Table D1-78 on 
page D1-1601 are trapped to EL3. 


0 EL2, EL1, and EL0 System register accesses to the registers shown in Table D1-78 on 
page D1-1601 are not trapped to EL3. 


For EL0 and EL1, this trap control applies to accesses from both Security states. 


When the PE is in Debug state, MDCR_EL3.TDA does not trap any access from: 
° AArch32 state to DBGDTRRXint and DBGDTRTXint. 
° AArch64 state to DBGDTR_EL0, DBGDTRRX_EL0, and DBGDTRTX_EL0. 

Table D1-78 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL3. 


Table D1-78 Accesses trapped to EL3 when MDCR_EL3.TDA is 1 

Traps from       Trapped accesses                                          Syndrome reporting in ESR_EL3 

AArch64 state    Accesses to the MDCR_EL2, MDRAR_EL1, MDCCSR_EL0,          Trapped AArch64 MSR, MRS, or system 
                 MDCCINT_EL1, DBGDTR_EL0, DBGDTRRX_EL0,                    instruction, using EC value 0x18 
                 DBGDTRTX_EL0, OSDTRRX_EL1, MDSCR_EL1, 
                 OSDTRTX_EL1, OSECCR_EL1, DBGBVR<n>_EL1, 
                 DBGBCR<n>_EL1, DBGWVR<n>_EL1, DBGWCR<n>_EL1, 
                 DBGCLAIMSET_EL1, DBGCLAIMCLR_EL1, and 
                 DBGAUTHSTATUS_EL1. 

AArch32 state    Accesses to the HDCR, DBGDRAR, DBGDSAR, DBGDIDR,          For accesses using: 
                 DBGDSCRint, DBGDCCINT, DBGDTRRXint,                       ° MCR or MRC instructions, trapped MCR or MRC 
                 DBGDTRTXint, DBGWFAR, DBGVCR, DBGDSCRext,                   access (coproc==0b1110), using EC 
                 DBGDTRTXext, DBGDTRRXext, DBGBVR<n>,                        value 0x05. 
                 DBGBCR<n>, DBGBXVR<n>, DBGWCR<n>, DBGWVR<n>,              ° MCRR or MRRC instructions, trapped MCRR or MRRC 
                 DBGCLAIMSET, DBGCLAIMCLR, DBGAUTHSTATUS,                    access (coproc==0b1110), using EC 
                 DBGDEVID, DBGDEVID1, DBGDEVID2 and DBGOSECCR.               value 0x0C. 

                 STC accesses to DBGDTRRXint.                              LDC or STC, trapped LDC or STC access, using EC 
                 LDC accesses to DBGDTRTXint.                              value 0x06 
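
The rules that decide whether MDCR_EL3.TDA traps a given access can be restated as a small predicate. This is a sketch of the conditions described in this subsection only: it assumes that the register named is a debug System register that is accessible at the given Exception level, and it does not model exception prioritization or any other trap control. The names POWERDOWN_REGS, DEBUG_STATE_EXEMPT, and tda_traps are hypothetical.

```python
# Registers controlled by MDCR_EL3.TDOSA rather than MDCR_EL3.TDA.
POWERDOWN_REGS = {
    "OSLAR_EL1", "OSLSR_EL1", "OSDLR_EL1", "DBGPRCR_EL1",  # AArch64 state
    "DBGOSLSR", "DBGOSLAR", "DBGOSDLR", "DBGPRCR",         # AArch32 state
}

# Registers whose accesses are not trapped while the PE is in Debug state.
DEBUG_STATE_EXEMPT = {
    "DBGDTRRXint", "DBGDTRTXint",                  # AArch32 state
    "DBGDTR_EL0", "DBGDTRRX_EL0", "DBGDTRTX_EL0",  # AArch64 state
}


def tda_traps(tda: int, el: int, reg: str, in_debug_state: bool = False) -> bool:
    """Return True if MDCR_EL3.TDA traps this System register access to EL3.

    `reg` is assumed to name a debug System register.
    """
    if tda != 1 or el not in (0, 1, 2):
        return False  # control disabled, or access made from EL3
    if reg in POWERDOWN_REGS:
        return False  # powerdown debug registers are handled by TDOSA
    if in_debug_state and reg in DEBUG_STATE_EXEMPT:
        return False  # DTR accesses in Debug state are not trapped
    return True
```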





Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers 


MDCR_EL3.TPM traps EL2, EL1, and EL0 accesses to the Performance Monitors registers to EL3: 


1 EL2, EL1, and EL0 System register accesses to all Performance Monitors registers are trapped to 
EL3. 

0 EL2, EL1, and EL0 System register accesses to Performance Monitors registers are not trapped to 
EL3. 


For EL0 and EL1, this trap control applies to accesses from both Security states. 


Table D1-79 shows the registers for which accesses are trapped, and how the exceptions are reported in ESR_EL3. 


Table D1-79 Register accesses trapped to EL3 when MDCR_EL3.TPM is 1 

Traps from       Registers                                            Syndrome reporting in ESR_EL3 

AArch64 state    PMCR_EL0, PMCNTENSET_EL0, PMCNTENCLR_EL0,            Trapped AArch64 MSR, MRS, or system 
                 PMOVSCLR_EL0, PMSWINC_EL0, PMSELR_EL0,               instruction, using EC value 0x18 
                 PMCEID0_EL0, PMCEID1_EL0, PMCCNTR_EL0, 
                 PMXEVTYPER_EL0, PMXEVCNTR_EL0, PMUSERENR_EL0, 
                 PMINTENSET_EL1, PMINTENCLR_EL1, PMOVSSET_EL0, 
                 PMEVCNTR<n>_EL0, PMEVTYPER<n>_EL0, 
                 PMCCFILTR_EL0. 

AArch32 state    PMCR, PMCNTENSET, PMCNTENCLR, PMOVSR, PMSWINC,       For accesses using: 
                 PMSELR, PMCEID0, PMCEID1, PMCCNTR, PMXEVTYPER,       ° MCR or MRC instructions, trapped MCR or 
                 PMXEVCNTR, PMUSERENR, PMINTENSET, PMINTENCLR,          MRC access (coproc==0b1111), using 
                 PMOVSSET, PMEVCNTR<n>, PMEVTYPER<n>, PMCCFILTR.        EC value 0x03 
                                                                      ° MCRR or MRRC instructions, trapped MCRR 
                                                                        or MRRC access (coproc==0b1111), 
                                                                        using EC value 0x04 

D1.16 System calls 
A system call is generated by the execution of an SVC, HVC, or SMC instruction: 


° By default, the execution of an SVC instruction generates a Supervisor Call, a synchronous exception that 
targets EL1. This provides a mechanism for software executing at EL0 to make a call to an operating system 
or other software executing at EL1. 


° In an implementation that includes EL2, the execution of an HVC instruction generates a Hypervisor Call, a 
synchronous exception that targets EL2 by default. 


The HVC instruction is UNDEFINED: 
— At EL0. 
— At EL1 in Secure state. 


Note 


Software executing at EL0 cannot directly generate a Hypervisor Call. 








° In an implementation that includes EL3, by default the execution of an SMC instruction generates a Secure 
Monitor Call, a synchronous exception that targets EL3. 


The SMC instruction is UNDEFINED at EL0, meaning software executing at EL0 cannot directly generate a 
Secure Monitor Call. 


The default behavior applies when the instruction is not UNDEFINED and both of the following are true: 
° The instruction is executed at an Exception level that is the same as or lower than the target Exception level. 
° The instruction is not trapped to a different Exception level. 


If an SVC or HVC instruction is executed at an Exception level that is higher than the target Exception level, then it 
generates a synchronous exception that is taken to the current Exception level. 
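
The default routing described above can be restated as a short function. This is a sketch under several assumptions: EL2 and EL3 are both implemented, no trap or disable control described elsewhere in this chapter is in effect, and Exception levels are represented as the integers 0 to 3. The function name system_call_target is hypothetical and is not one of the pseudocode functions of the architecture.

```python
def system_call_target(instr: str, el: int, secure: bool = False):
    """Return the Exception level a system call is taken to, or "UNDEFINED".

    Models only the default behavior: no traps, no disable controls,
    and an implementation that includes both EL2 and EL3.
    """
    if instr == "SVC":
        default_target = 1
    elif instr == "HVC":
        if el == 0 or (el == 1 and secure):
            return "UNDEFINED"
        default_target = 2
    elif instr == "SMC":
        if el == 0:
            return "UNDEFINED"
        default_target = 3
    else:
        raise ValueError(f"not a system call instruction: {instr}")
    # Executed above the default target: taken to the current Exception level.
    return max(el, default_target)
```

For example, an SVC instruction executed at EL2 is taken to EL2, not to EL1.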


EL2 and EL3 can disable Hypervisor Call exceptions, see: 
° Disabling Non-secure state execution of HVC instructions on page D1-1574. 
° Enabling EL3, EL2, and Non-secure EL1 execution of HVC instructions on page D1-1592. 


EL2 can trap use of the SMC instruction, see Traps to EL2 of Non-secure EL1 execution of SMC instructions on 
page D1-1578. 


EL3 can disable Secure Monitor Call exceptions, see Disabling EL3, EL2, and EL1 execution of SMC instructions 
on page D1-1593. 


D1.16.1 Pseudocode description of system calls 
The AArch64.CallSupervisor() pseudocode function performs an SVC call in AArch64 state. 
The AArch64.CallHypervisor() pseudocode function performs an HVC call in AArch64 state. 
The AArch64.CallSecureMonitor() pseudocode function performs an SMC call in AArch64 state. 


The AArch64.CallSupervisor(), AArch64.CallHypervisor(), and AArch64.CallSecureMonitor() functions are 
described in Chapter J1 ARMv8 Pseudocode. 

D1.17 Mechanisms for entering a low-power state 


The ARM architecture provides mechanisms that software can use to indicate that the PE can enter a low-power 
state, if it supports that state. The following sections describe those mechanisms: 


° Wait for Event mechanism and Send event. 
° Wait For Interrupt on page D1-1602. 


D1.17.1 Wait for Event mechanism and Send event 


A PE can use the Wait for Event (WFE) mechanism to enter a low-power state, depending on the value of an Event 
Register for that PE. To enter the low-power state, the PE executes a Wait For Event instruction, WFE, and if the Event 
Register is clear, the PE can enter the low-power state. 


If the PE does enter the low-power state, it remains in that low-power state until it receives a WFE wake-up event. 


The architecture does not define the exact nature of the low-power state, except that the execution of a WFE 
instruction must not cause a loss of memory coherency. 


WFE mechanism behavior depends on the interaction of all of the following, that are described in the subsections 
that follow: 


° The Event Register for the PE. See subsection The Event Register on page D1-1600. 
° The Wait For Event instruction, WFE. See subsection The Wait For Event instruction on page D1-1600. 
° WFE wake-up events. See subsection WFE wake-up events in AArch64 state on page D1-1601. 
° The Send Event instructions, SEV and SEVL, that can cause WFE wake-up events. See subsection The Send 
Event instructions on page D1-1601. 


Note 


Because the Wait for Event mechanism is associated with suspending execution on a PE for the purpose of power 
saving, ARM recommends that the Event Register is set only infrequently. However, software must only use the 
setting of the Event Register as a hint, and must not assume that any particular message is sent as a result of the 
setting of the Event Register. 








Example D1-2 describes how a spinlock implementation might use the WFE mechanism to save energy. 


Example D1-2 Spinlock as an example of using Wait For Event and Send Event 


A multiprocessor operating system requires locking mechanisms to protect data structures from being accessed 
simultaneously by multiple PEs. These mechanisms prevent the data structures becoming inconsistent or corrupted 
if different PEs try to make conflicting changes. If a lock is busy, because a data structure is being used by one PE, 
it might not be practical for another PE to do anything except wait for the lock to be released. For example, if a PE 
is handling an interrupt from a device, it might need to add data received from the device to a queue. If another PE 
is removing data from the same queue, it will have locked the memory area that holds the queue. The first PE cannot 
add the new data until the queue is in a consistent state and the second PE has released the lock. The first PE cannot 
return from the interrupt handler until the data has been added to the queue, so it must wait. 


Typically, a spin-lock mechanism is used in these circumstances: 


° A PE requiring access to the protected data attempts to obtain the lock using single-copy atomic 
synchronization primitives such as the Load-Exclusive and Store-Exclusive operations described in 
Synchronization and semaphores on page B2-108. 


° If the PE obtains the lock it performs its memory operation and then releases the lock. 


° If the PE cannot obtain the lock, it reads the lock value repeatedly in a tight loop until the lock becomes 
available. When the lock becomes available, the PE again attempts to obtain it. 

A spin-lock mechanism is not ideal for all situations: 


° In a low-power system the tight read loop is undesirable because it uses energy to no effect. 
° In a multiprocessor system the execution of spin-locks by multiple waiting PEs can degrade overall 
performance. 


Using the Wait For Event and Send Event mechanism can improve the energy efficiency of a spinlock: 


° A PE that fails to obtain a lock executes a WFE instruction to request entry to a low-power state, at the time 
when the exclusive monitor is set holding the address of the location holding the lock. 


° When a PE releases a lock, the write to the lock location causes the exclusive monitor of any PE monitoring 
the lock location to be cleared. This clearing of the exclusive monitors generates a WFE wake-up event for 
each of those PEs. Then, these PEs can attempt to obtain the lock again. 


For large systems, more advanced locking systems, such as ticket locks, can avoid unfairness caused by having 
multiple PEs simultaneously reading the lock. In such systems, the WFE mechanism can be used in a similar way 
to monitor the next ticket value. 


The Event Register 


The Event Register is a single bit register for each PE. When set, an Event Register indicates that an event has 
occurred since the register was last cleared, that might require some action by the PE. Therefore, when the Event 
Register is set, the PE must not suspend operation on executing a WFE instruction. 


The reset value of the Event Register is UNKNOWN. 


The Event Register for a PE is set by any of the following: 


° A Send Event instruction, SEV, executed by any PE in the system. 
° A Send Event Local instruction, SEVL, executed by the PE. 
° An exception return. 
° The clearing of the global monitor for the PE. 
° An event from a Generic Timer event stream, see Event streams on page D6-1882. 
° An event sent by some IMPLEMENTATION DEFINED mechanism. 


The Event Register is cleared only by a Wait For Event instruction. 





Note 


Software cannot read or write the value of the Event Register directly. 





The Wait For Event instruction 
The action of the Wait For Event instruction, WFE, depends on the state of the Event Register: 
° If the Event Register is set, the instruction clears the register and completes immediately. 


° If the Event Register is clear the PE can suspend execution and enter a low-power state. It remains in that 
state until the PE detects a WFE wake-up event, or earlier if the implementation chooses, or until a reset. 
When the PE detects a WFE wake-up event, or earlier if chosen, the WFE instruction completes. If the 
wake-up event sets the Event Register, it is IMPLEMENTATION DEFINED whether on restarting execution, the 
Event Register is cleared. 
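
The interaction between the Event Register and the WFE instruction can be illustrated with a toy model. This is a sketch, not the architectural pseudocode: the class and method names are hypothetical, the initial value is chosen by the caller because the reset value is UNKNOWN, and the model does not represent the low-power state itself or the IMPLEMENTATION DEFINED clearing on wake-up.

```python
class EventRegisterModel:
    """Toy model of the single-bit Event Register of one PE."""

    def __init__(self, initially_set: bool = False):
        # The architectural reset value is UNKNOWN; the caller picks one here.
        self.event = initially_set

    def set_event(self) -> None:
        """Model any cause that sets the register: SEV, SEVL, exception return, and so on."""
        self.event = True

    def wfe(self) -> str:
        """Model the Wait For Event instruction."""
        if self.event:
            self.event = False  # WFE clears the register and completes
            return "completes immediately"
        return "may suspend"    # the PE can enter a low-power state
```

In this model, a WFE that follows an event completes immediately, and a second WFE with no intervening event is the one that can suspend execution.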


The WFE instruction is available at all Exception levels. Attempts to enter a low-power state made by software 
executing at EL0, EL1, or EL2 might be trapped to a higher Exception level. See: 


° Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565. 
° Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page D1-1581. 
° Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions on page D1-1591. 

Note 


Software using the Wait For Event mechanism must tolerate spurious wake-up events, including multiple wake-ups. 








WFE wake-up events in AArch64 state 

The following are WFE wake-up events: 

° The execution of an SEV instruction on any PE in the multiprocessor system. 

° Any physical SError interrupt, IRQ interrupt, or FIQ interrupt received by the PE, that is not disabled by 
EDSCR.INTdis and: 


— Is marked as A in the tables in Asynchronous exception masking on page D1-1557, regardless of the 
value of the corresponding PSTATE.{A, I, F} mask bit. 


— Is marked as B in the tables in Asynchronous exception masking on page D1-1557, if the value of the 
corresponding PSTATE.{A, I, F} mask bit is 0. 


° In Non-secure EL1 or EL0, any virtual SError interrupt, IRQ interrupt, or FIQ interrupt received by the PE, 
that is not disabled by EDSCR.INTdis and is marked as B in Table D1-18 on page D1-1559 in Virtual 
interrupts on page D1-1558, if the value of the corresponding PSTATE.{A, I, F} mask bit is 0. 


° An asynchronous External Debug Request debug event, if halting is allowed. For the definition of halting is 
allowed see Halting allowed and halting prohibited on page H2-4845. 


See also External Debug Request debug event on page H3-4900. 


° An event sent by the timer event stream for the PE. See Event streams on page D6-1882. 
° An event caused by the clearing of the global monitor for the PE. 
° An event sent by some IMPLEMENTATION DEFINED mechanism. 


Not all of these wake-up events set the Event Register. 





Note 


The disabling of interrupts, and WFE wake-up events, by EDSCR.INTdis is possible only when external debug is 
enabled. 





The Send Event instructions 
The Send Event instructions are: 

SEV, Send Event 
This causes an event to be signaled to all PEs in the multiprocessor system. 

SEVL, Send Event Local 
This must set the local Event Register. 

Note 

It might signal an event to other PEs by some IMPLEMENTATION DEFINED mechanism, but 
is not required to do so. 








The mechanism that signals an event to other PEs is IMPLEMENTATION DEFINED. The PE is not required to guarantee 
the ordering of this event with respect to the completion of memory accesses by instructions before the SEV 
instruction. Therefore, ARM recommends that software includes a DSB instruction before any SEV instruction. 





Note 


A DSB instruction ensures that no instructions, including any SEV instructions, that appear in program order after the 
DSB instruction, can execute until the DSB instruction has completed. See Data Synchronization Barrier (DSB) on 
page B2-89. 





The SEVL instruction appears to execute in program order relative to any subsequent WFE instruction executed on the 
same PE, without the need for any explicit insertion of barrier instructions. 


The receipt of a signaled SEV or SEVL event by a PE sets the Event Register on that PE. 


The SEV and SEVL instructions are available at all Exception levels. 


Pseudocode description of the Wait For Event mechanism 
This section identifies pseudocode functions that describe the behavior of the Wait For Event mechanism. 
The ClearEventRegister() pseudocode function clears the Event Register of the current PE. 


The EventRegistered() pseudocode function returns TRUE if the Event Register of the current PE is set and FALSE 
if it is clear. 


The WaitForEvent() pseudocode function optionally suspends execution until a WFE wake-up event or reset occurs, 
or until some earlier time if the implementation chooses. It is IMPLEMENTATION DEFINED whether restarting 
execution after the period of suspension causes ClearEventRegister() to be called. 


The SendEvent() pseudocode function sets the Event Register of every PE in the multiprocessor system. 


The EventRegisterSet() pseudocode function sets the event register for the local PE. 


D1.17.2 Wait For Interrupt 


Software can use the Wait for Interrupt (WFI) instruction to cause the PE to enter a low-power state. The PE then 
remains in that low-power state until it receives a WFI wake-up event, or until some other IMPLEMENTATION 
DEFINED reason causes it to leave the low-power state. The architecture permits a PE to leave the low-power state 
for any reason, but requires that it must leave the low-power state on receipt of any architected WFI wake-up event. 





Note 


Because the architecture permits a PE to leave the low-power state for any reason, it is permissible for a PE to treat 
WFI as a NOP, but this is not recommended for lowest power operation. 





When the PE leaves a low-power state that was entered as a result of a WFI instruction, that WFI instruction completes. 


The architecture does not define the exact nature of the low-power state, except that the execution of a WFI 
instruction must not cause a loss of memory coherency. 


Attempts to enter a low-power state made by software executing at EL0, EL1, or EL2 might be trapped to a higher 
Exception level. See: 


° Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565. 
° Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page D1-1581. 
° Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions on page D1-1591. 


WFI wake-up events 
The following are WFI wake-up events: 


° Any physical SError interrupt, IRQ interrupt, or FIQ interrupt received by the PE, that is marked as A or B 
in the tables in Asynchronous exception masking on page D1-1557, regardless of the value of the 
corresponding PSTATE.{A, I, F} mask bit. 

° In Non-secure EL1 or EL0, any virtual SError interrupt, IRQ interrupt, or FIQ interrupt received by the PE, 
that is marked as B in Table D1-18 on page D1-1559 in Virtual interrupts on page D1-1558, regardless of the 
value of the corresponding PSTATE.{A, I, F} mask bit. 


° An asynchronous External Debug Request debug event, if halting is allowed. For the definition of halting is 
allowed see Halting allowed and halting prohibited on page H2-4845. 


See also External Debug Request debug event on page H3-4900. 


° An event sent by some IMPLEMENTATION DEFINED mechanism. 


Note 


° WFI wake-up events are never disabled by EDSCR.INTdis, and are never masked by the PSTATE.{A, I, F} 
mask bits. If wake-up is invoked by an interrupt that is disabled or masked the interrupt is not taken. 





° Because debug events are WFI wake-up events, ARM recommends that Wait For Interrupt is used as part of 
an idle loop rather than waiting for a single specific interrupt event to occur and then moving forward. This 
ensures that the intervention of debug while waiting does not significantly change the function of the program 
being debugged. 


° Some implementations of the WFI mechanism drain down any pending memory activity before suspending 
execution. This increases power saving, by increasing the area over which clocks can be stopped. The 
architecture does not require this operation, therefore software must not rely on the WFI mechanism 
operating in this way. 
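
The difference between the WFE and WFI wake-up rules for a physical interrupt can be sketched as two predicates. This is an illustration of the conditions listed in the WFE and WFI wake-up event sections only. The `marking` argument stands for the A or B marking of the interrupt in the tables in Asynchronous exception masking, any other marking is treated as not being a wake-up event, and the function and parameter names are hypothetical.

```python
def wfe_wakes_on_physical_interrupt(marking: str, mask_bit: int,
                                    disabled_by_intdis: bool = False) -> bool:
    """WFE: the interrupt must not be disabled by EDSCR.INTdis, and a
    B-marked interrupt is a wake-up event only if its PSTATE mask bit is 0."""
    if disabled_by_intdis:
        return False
    if marking == "A":
        return True
    if marking == "B":
        return mask_bit == 0
    return False


def wfi_wakes_on_physical_interrupt(marking: str) -> bool:
    """WFI: A-marked and B-marked interrupts are wake-up events regardless of
    the PSTATE mask bit, and are never disabled by EDSCR.INTdis."""
    return marking in ("A", "B")
```

A B-marked interrupt whose PSTATE mask bit is 1 therefore wakes a PE from WFI but not from WFE. In the WFI case the masked interrupt is not taken after the wake-up.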





Using WFI to indicate an idle state on bus interfaces 


Software can use the WFI mechanism to force quiescence on a PE, and, combined with preventing any possible WFI 
wakeup events, this can be used to complete an entry into a powerdown state. 


Because mechanisms for entering powerdown states are inherently IMPLEMENTATION DEFINED, whether an 
implementation uses the WFI mechanism is IMPLEMENTATION DEFINED. If it does, the WFI instruction forces the 
suspension of execution, and of all associated bus activity. 


The control logic that does this also tracks the activity on the bus interfaces of the PE, so that when the PE has 
completed all current operations and any associated bus activity has completed, it can signal to an external power 
controller that there is no ongoing bus activity. 


However, the PE must continue to process memory-mapped and external debug interface accesses to debug registers 
when in the WFI state. The indication of idle state to the system normally only applies to the non-debug functional 
interfaces used by the PE, not the debug interfaces. 


When the OS Double Lock control, OSDLR_EL1.DLK, is 1, the PE must not signal this idle state to the control 
logic unless it can also guarantee that the debug interface is idle. For more information about the OS Double Lock, 
see Debug behavior when the OS Double Lock is locked on page H6-4953. 





Note 


In a PE that implements separate core and debug power domains, the debug interface referred to in this section is 
the interface between the core and debug power domains, since the signal to the power controller indicates that the 
core power domain is idle. For more information about the power domains see Power domains and debug on 
page H6-4945. 





The exact nature of this interface is IMPLEMENTATION DEFINED, but the use of Wait For Interrupt as the only 
architecturally-defined mechanism that completely suspends execution makes it very suitable as the preferred 
powerdown entry mechanism. 


Pseudocode description of Wait For Interrupt 


The WaitForInterrupt() pseudocode function optionally suspends execution until a WFI wake-up event or reset 
occurs, or until some earlier time if the implementation chooses. 

D1.18 Self-hosted debug 


The ARMv8-A architecture supports both of the following: 


Self-hosted debug 
The PE itself hosts a debugger. The debugger programs the PE to generate debug exceptions. Debug 
exceptions are accommodated in the ARMv8-A Exception model. 

External debug 


The PE is controlled by an external debugger. The debugger programs the PE to generate Debug 
events, that cause the PE to enter Debug state. In Debug state, the PE is halted. 


This section describes self-hosted debug. It includes: 
° Debug exceptions. 
° The PSTATE debug mask bit, D. 


For external debug, see Part H. 











D1.18.1 Debug exceptions 
Debug exceptions occur during normal program flow, if a debugger has programmed the PE to generate them. 

For example, a software developer might use a debugger contained in an operating system to debug an application. 
To do this, the debugger might enable one or more debug exceptions. 

The possible debug exceptions are: 
° Breakpoint Instruction exceptions. 
° Breakpoint exceptions. 
° Watchpoint exceptions. 
° Vector Catch exceptions. 
° Software Step exceptions. 

Chapter D2 AArch64 Self-hosted Debug describes these in detail for AArch64. 

For the PE to generate a debug exception requires that: 
° The debug exception is enabled. The debug exception enable controls on page D2-1630 gives the controls for 
the different debug exceptions. 
° Debug exceptions are enabled from the current Exception level and Security state. See Enabling debug 
exceptions from the current Exception level and Security state on page D2-1633. 

Debug exceptions are synchronous exceptions, and are accommodated in the ARMv8 Exception model. 

Note 

Breakpoints and Watchpoints can cause entry to Debug state instead of causing debug exceptions. See Chapter H1 
About External Debug. 

D1.18.2 The PSTATE debug mask bit, D 

As with all other exceptions, when a debug exception is taken, software must take care to avoid generating another 
instance of an exception within the exception handler, to avoid recursive entry into the exception handler and loss 
of return state. 

To help avoid this, the ARMv8 architecture provides a debug exception mask bit, PSTATE.D, that can mask 
Watchpoint, Breakpoint, and Software Step exceptions when the target Exception level is the current Exception 
level. 

PSTATE.D is set to 1 on taking an exception. This means that while handling an exception in AArch64 state, 
Watchpoint, Breakpoint, and Software Step exceptions are masked. This prevents recursive entry at the Exception 
level that debug exceptions are targeted to. 


When execution is in AArch64 state, debug exceptions are also masked implicitly when the target Exception level 
is lower than the current Exception level. 


When the target Exception level is higher than the current Exception level, debug exceptions cannot be masked by 
PSTATE.D. 


Because debug exceptions are synchronous, the architecture requires that debug exceptions are not generated when 
PSTATE.D is 1. By preventing debug exception generation, debug exceptions cannot be taken at a subsequent time 
when the Process state D mask bit is cleared to 0. 
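
The masking rules in this section, for execution in AArch64 state, can be restated as a single predicate. This is a sketch of the three cases described above only. It applies to Watchpoint, Breakpoint, and Software Step exceptions, it ignores the enable controls described in Chapter D2, and the function name debug_exception_masked is hypothetical.

```python
def debug_exception_masked(current_el: int, target_el: int, pstate_d: int) -> bool:
    """Return True if a Watchpoint, Breakpoint, or Software Step exception is masked.

    Models only the PSTATE.D and implicit masking rules of this section.
    """
    if target_el < current_el:
        return True             # masked implicitly
    if target_el == current_el:
        return pstate_d == 1    # masked only when PSTATE.D is 1
    return False                # target is higher: PSTATE.D cannot mask it
```

Because a masked debug exception is not generated at all, there is nothing left pending in this model when PSTATE.D is later cleared to 0.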





Note 


This differs from the behavior for interrupts, where the PSTATE.{A, I, F} mask prevents the 
interrupt from being taken, but the interrupt remains pending. 

D1.19 The Performance Monitors Extension 


The System registers provide access to a Performance Monitors Unit (PMU), defined as the OPTIONAL Performance 
Monitors Extension to the architecture, a non-invasive debug resource that provides information about the operation 
of the PE. The PMU provides: 


° A 64-bit cycle counter. 


° An IMPLEMENTATION DEFINED number of 32-bit event counters. Each event counter can be configured to 
count occurrences of a specified event. The events that can be counted are: 


— Architectural and microarchitectural events that are likely to be consistent across many 
microarchitectures. The PMU architecture uses event numbers to identify an event, and the PMU 
specification defines which event number must be used for each of these architectural and 
microarchitectural events. 


— _ Implementation-specific events. The PMU specification reserves event numbers for 
implementation-specific events. See Appendix K3 Recommendations for Performance Monitors 
Event Numbers for IMPLEMENTATION DEFINED Events. 


For more information, see Chapter DS The Performance Monitors Extension. 
D1.20 Interprocessing 
Interprocessing is the term used to describe moving between the AArch64 and AArch32 Execution states. 


The Execution state can change only on a change of Exception level. This means that the Execution state can change 
only on taking an exception to a higher Exception level, or returning from an exception to a lower Exception level. 


On taking an exception to a higher Exception level, the Execution state either: 
° Remains unchanged. 
° Changes from AArch32 state to AArch64 state. 


On returning from an exception to a lower Exception level, the Execution state either: 
° Remains unchanged. 
° Changes from AArch64 state to AArch32 state. 


Note 


If, on taking or returning from an exception, the Exception level remains the same, the Execution state cannot 
change. 
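
The rules above can be summarized as a small check. The following is a minimal sketch, not architectural pseudocode: the function name permitted_states and the string constants are hypothetical, and the sketch ignores which Execution states each Exception level actually supports or is configured to use.

```python
AARCH32 = "AArch32"
AARCH64 = "AArch64"


def permitted_states(current_state, from_el, to_el):
    """Return the Execution states permitted after moving from from_el to to_el.

    Hypothetical helper: it encodes only the interprocessing rules stated
    in this section, not the configuration of any particular Exception level.
    """
    if to_el == from_el:
        # No change of Exception level: the Execution state cannot change.
        return {current_state}
    if to_el > from_el:
        # Exception entry to a higher level: unchanged, or AArch32 -> AArch64.
        return {current_state, AARCH64}
    # Exception return to a lower level: unchanged, or AArch64 -> AArch32.
    return {current_state, AARCH32}
```

For example, under these rules an exception taken from AArch64 state at EL0 to EL1 can only leave the PE in AArch64 state, whereas an exception return from AArch64 state at EL1 to EL0 can leave the PE in either state.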








For the description of: 


° Exception entry to an Exception level using AArch64, see Exception entry on page D1-1521. 
° Exception return from an Exception level using AArch64 state, see Exception return on page D1-1536. 
° Exception return to AArch32 state, see Exception return to an Exception level using AArch32 on 
page G1-3834. 





Note 


The description in Handling exceptions that are taken to an Exception level using AArch32 on page G1-3812 
is outside the scope of interprocessing, because such exceptions must have been taken from an Exception 
level that is using AArch32, and therefore there is no change of Execution state. 





The following sections describe the behavior associated with interprocessing. 

° Register mappings between AArch32 state and AArch64 state. 
° State of the general-purpose registers on taking an exception to AArch64 state on page D1-1616. 
° SPSR, ELR, and AArch64 SP relationships on changing Execution state on page D1-1617. 


D1.20.1 Register mappings between AArch32 state and AArch64 state 


This section defines the architectural mappings between AArch32 state registers and AArch64 state registers. 


The mappings describe: 
° For exceptions taken from AArch32 state to AArch64 state, where the AArch32 register content is found. 
° For exception returns from AArch64 state to AArch32 state, how the AArch32 register content is derived. 


The general model is: 





° The AArch32 register contents are situated in the bottom 32 bits of the AArch64 registers. 
° In AArch32 state, the upper 32 bits of AArch64 registers are inaccessible and are ignored. 
Note 


System software that executes in AArch64 state, such as an OS or Hypervisor, can use these mappings for context 
save and restore, or to interpret and modify the AArch32 registers of an application or virtual machine. 
For more information see the following subsections: 

° Mapping of the general-purpose registers between the Execution states. 
° Mapping of the SIMD and floating-point registers between the Execution states on page D1-1609. 
° Mapping of the System registers between the Execution states on page D1-1610. 


Mapping of the general-purpose registers between the Execution states 


Table D1-80 shows how each of the AArch32 general-purpose registers, R0-R12, SP, and LR, including the banked 
copies of these registers, maps to an AArch64 general-purpose register. A register in the AArch64 register column 
of the table provides the AArch64 view of the corresponding register in the AArch32 register column. 





Note 


For some exceptions, the exception syndrome given in the ESR_ELx identifies one or more register numbers from 
the issued instruction that generated the exception. Where the exception is taken from an Exception level using 
AArch32 these register numbers give the AArch64 view of the register. For example, if an exception is taken from 
AArch32 Abort mode, and the faulting instruction specified R14, the ESR_ELx.ISS field would report this using 
the register number 0b10100, because register X20 provides the AArch64 view of LR_abt, which is the copy of R14 
used in Abort mode. 





Table D1-80 General-purpose register mapping between AArch32 state and AArch64 state 

AArch32 register    AArch64 register 
R0                  X0 
R1                  X1 
R2                  X2 
R3                  X3 
R4                  X4 
R5                  X5 
R6                  X6 
R7                  X7 
R8_usr              X8 
R9_usr              X9 
R10_usr             X10 
R11_usr             X11 
R12_usr             X12 
SP_usr              X13 
LR_usr              X14 
SP_hyp              X15 
LR_irq              X16 
SP_irq              X17 
LR_svc              X18 
SP_svc              X19 
LR_abt              X20 
SP_abt              X21 
LR_und              X22 
SP_und              X23 
R8_fiq              X24 
R9_fiq              X25 
R10_fiq             X26 
R11_fiq             X27 
R12_fiq             X28 
SP_fiq              X29 
LR_fiq              X30 





Note 





For a description of the banking of AArch32 general-purpose registers R8-R12, SP, and LR, see AArch32 
general-purpose registers, the PC, and the Special-purpose registers on page G1-3801. 
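
As an illustration of how system software might use Table D1-80, the following sketch builds the mapping as a lookup table and reads the AArch32 view of a register from a saved AArch64 context. The names build_gpr_map, GPR_MAP, and aarch32_gpr are hypothetical, and the saved context is assumed to be a simple list of 31 integers holding X0-X30.

```python
MASK32 = 0xFFFFFFFF


def build_gpr_map():
    """Return a dict from AArch32 register name to AArch64 X register number."""
    # R0-R7 are not banked and map directly to X0-X7.
    mapping = {f"R{n}": n for n in range(8)}
    # The remaining registers follow the order of Table D1-80, X8 upwards.
    banked = [
        "R8_usr", "R9_usr", "R10_usr", "R11_usr", "R12_usr",
        "SP_usr", "LR_usr", "SP_hyp", "LR_irq", "SP_irq",
        "LR_svc", "SP_svc", "LR_abt", "SP_abt", "LR_und", "SP_und",
        "R8_fiq", "R9_fiq", "R10_fiq", "R11_fiq", "R12_fiq",
        "SP_fiq", "LR_fiq",
    ]
    for offset, name in enumerate(banked):
        mapping[name] = 8 + offset
    return mapping


GPR_MAP = build_gpr_map()


def aarch32_gpr(x_regs, name):
    """Return the 32-bit AArch32 view of a register from saved X0-X30 values.

    Only the bottom 32 bits hold AArch32 register content, so the upper
    32 bits of the saved value are masked off.
    """
    return x_regs[GPR_MAP[name]] & MASK32
```

With this table, GPR_MAP["LR_abt"] is 20, which is the register number 0b10100 used in the syndrome example earlier in this subsection.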





Mapping of the SIMD and floating-point registers between the Execution states 


Table D1-81 shows the mapping between the AArch64 V registers and the AArch32 Q registers. 


Table D1-81 SIMD and floating-point register mapping between AArch64 state and AArch32 state 

AArch64 register    AArch32 register 
V0                  Q0 
V1                  Q1 
V2                  Q2 
...                 ... 
V15                 Q15 





The AArch64 registers V16-V31 are not accessible from AArch32 state. 


The mapping between the V, D, and S registers in AArch64 state is not the same as the mapping between the Q, D, 
and S registers in AArch32 state: 

° In AArch64 state, there are: 
— 32 128-bit V registers, V0-V31. 
— 32 64-bit D registers, D0-D31. 
— 32 32-bit S registers, S0-S31. 

A smaller register occupies the least-significant bytes of the corresponding larger register. For example, S5 
is the least-significant word of D5 and V5. Figure D1-3 shows this mapping. 


[Figure: Vn spans the full 128-bit register; Dn occupies bits [63:0] of Vn and Sn occupies bits [31:0] of Vn.] 

Figure D1-3 AArch64 state SIMD and floating-point register mappings 

° In AArch32 state, there are: 
— 16 128-bit Q registers, Q0-Q15. 
— 32 64-bit D registers, D0-D31. 
— 32 32-bit S registers, S0-S31. 


Smaller registers are packed into larger registers. Figure D1-4 shows this mapping. 


[Figure: Qn holds D(2n+1) in its upper 64 bits and D(2n) in bits [63:0]. D(2n+1) holds S(4n+3) and S(4n+2); 
D(2n) holds S(4n+1) and S(4n).] 

Figure D1-4 AArch32 state SIMD and floating-point register mappings 


In AArch32 state: 
° There are no S registers that correspond to Q8-Q15. 
° D16-D31 pack into Q8-Q15. For example, D16 and D17 pack into Q8. 


Note 


A consequence of this mapping is that if software executing in AArch64 state interprets D or S registers from 
AArch32 state, it must unpack the D or S registers from the V registers before it uses them. 
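
The unpacking that this Note describes can be sketched as follows. This is an illustrative model only: the function names aarch32_d and aarch32_s are hypothetical, and the V registers are assumed to be held as a list of 32 Python integers, each a 128-bit value, with Qn taken to be the AArch32 view of Vn as in Table D1-81.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def aarch32_d(v_regs, d):
    """Return AArch32 D<d> (0-31) from saved V register values.

    D(2n) is the low 64 bits of Qn and D(2n+1) is the high 64 bits.
    """
    q, half = divmod(d, 2)
    return (v_regs[q] >> (64 * half)) & MASK64


def aarch32_s(v_regs, s):
    """Return AArch32 S<s> (0-31) from saved V register values.

    S(4n)..S(4n+3) are the four 32-bit words of Qn, lowest word first,
    so S0-S31 occupy Q0-Q7 only.
    """
    q, word = divmod(s, 4)
    return (v_regs[q] >> (32 * word)) & MASK32
```

Note the contrast with AArch64 state, where S5 is the least-significant word of V5: in this model the AArch32 register S5 is instead read from bits [63:32] of V1.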








Mapping of the System registers between the Execution states 


ARMv8 architecturally defines the relationship between the AArch64 System registers and the AArch32 System 
registers, to allow supervisory code such as a hypervisor, that is executing in AArch64 state, to save, restore, and 
interpret the System registers belonging to a lower Exception level that is using AArch32. 


Any modifications made to AArch32 System registers affect only those parts of those AArch64 registers that are 
mapped to the AArch32 System registers. Bits[63:32] of AArch64 registers, where they are not mapped to AArch32 
registers, are unchanged by AArch32 state execution. 


Note 
This model is different to the model for the general-purpose registers described in Mapping of the general-purpose 
registers between the Execution states on page D1-1608. In this model, there are several cases where two AArch32 
System registers are packed into a single AArch64 System register. 
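
As a sketch of the packed case, the following model splits a 64-bit AArch64 System register value into the two AArch32 registers that Table D1-82 maps onto its halves, and models a write to one AArch32 register as changing only the half it is mapped to. The names PACKED, unpack, and write_half are hypothetical, and only a few of the packed pairs from the table are listed.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

# AArch64 register -> (AArch32 register in bits [31:0], in bits [63:32]).
# A small, illustrative subset of the packed pairs in Table D1-82.
PACKED = {
    "FAR_EL1": ("DFAR", "IFAR"),
    "FAR_EL2": ("HDFAR", "HIFAR"),
    "HCR_EL2": ("HCR", "HCR2"),
    "AMAIR_EL1": ("AMAIR0", "AMAIR1"),
}


def unpack(name, value64):
    """Split a saved 64-bit value into its two mapped AArch32 registers."""
    low, high = PACKED[name]
    return {low: value64 & MASK32, high: (value64 >> 32) & MASK32}


def write_half(value64, new32, high):
    """Model an AArch32 write: only the mapped 32-bit half changes."""
    shift = 32 if high else 0
    cleared = value64 & ~(MASK32 << shift) & MASK64
    return cleared | ((new32 & MASK32) << shift)
```

For example, a hypervisor interpreting a saved FAR_EL1 value for an AArch32 guest would take DFAR from bits [31:0] and IFAR from bits [63:32].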








When EL3 is implemented and is using AArch32, some System registers are banked between the two Security 
states. When a register is banked in this way, there is an instance of the register in Secure state, and another instance 
of the register in Non-secure state. In Table D1-82 on page D1-1611 these banked registers are identified by 
footnote a. This banking is not supported when EL3 is using AArch64 or if EL3 is not implemented. This means that 
when EL3 is implemented and is using AArch64, exactly the same registers are accessed in the following states: 

° Secure EL1 with EL1 using AArch32. 
° Non-secure EL1 with EL1 using AArch32. 

This means that, architecturally, it is not possible to determine whether an AArch64 register is mapped onto the 
Secure instance of the corresponding AArch32 register, or onto the Non-secure instance of that register. When EL3 
is using AArch64, the interrupt asserted by the AArch64 CNTP_* timer is the same interrupt as is asserted by the 
Non-secure AArch32 CNTP_* timer when EL3 is using AArch32. 





Note 


Although the architecture does not require this, because it is not architecturally visible, ARM expects that 
implementations will map many of the AArch64 registers for use by EL3 to the Secure instances of the banked 
AArch32 registers, and will map many of the AArch64 registers for use by EL1 to the Non-secure instances of the 
banked AArch32 registers. However, if EL2 and EL3 are implemented and both support use of AArch32, this is not 
possible for the following registers: 


IFAR This is because when EL3 is using AArch32, HIFAR is an alias of the Secure IFAR. 
DFAR This is because when EL3 is using AArch32, HDFAR is an alias of the Secure DFAR. 





Table D1-82 shows the mappings between the writable AArch64 System registers and the AArch32 System 
registers. 


Table D1-82 Mapping of writable AArch64 System registers to the AArch32 System registers 

AArch64 register          AArch32 register 
ACTLR_EL1                 ACTLRᵃ and, if implemented, ACTLR2ᵃ 
AFSR0_EL1                 ADFSRᵃ 
AFSR1_EL1                 AIFSRᵃ 
AMAIR_EL1[31:0]           AMAIR0ᵃ 
AMAIR_EL1[63:32]          AMAIR1ᵃ 
CONTEXTIDR_EL1            CONTEXTIDRᵃ 
CPACR_EL1                 CPACR 
CSSELR_EL1                CSSELRᵃ 
DACR32_EL2                DACRᵃ 
FAR_EL1[31:0]             DFARᵃ 
ESR_EL1                   DFSRᵃ 
HACR_EL2                  HACR 
ACTLR_EL2                 HACTLR and, if implemented, HACTLR2 
AFSR0_EL2                 HADFSR 
AFSR1_EL2                 HAIFSR 
AMAIR_EL2[31:0]           HAMAIR0 
AMAIR_EL2[63:32]          HAMAIR1 
CPTR_EL2                  HCPTR 
HCR_EL2[31:0]             HCR 
HCR_EL2[63:32]            HCR2 
MDCR_EL2                  HDCR 
FAR_EL2[31:0]             HDFAR 
FAR_EL2[63:32]            HIFAR 
MAIR_EL2[31:0]            HMAIR0 
MAIR_EL2[63:32]           HMAIR1 
HPFAR_EL2[31:0]           HPFAR 
SCTLR_EL2                 HSCTLR 
ESR_EL2                   HSR 
HSTR_EL2                  HSTR 
TCR_EL2                   HTCR 
TPIDR_EL2[31:0]           HTPIDR 
TTBR0_EL2                 HTTBR 
VBAR_EL2[31:0]            HVBAR 
FAR_EL1[63:32]            IFARᵃ 
IFSR32_EL2                IFSRᵃ 
MAIR_EL1[63:32]           NMRR or MAIR1ᵃ 
PAR_EL1                   PARᵃ 
MAIR_EL1[31:0]            PRRR or MAIR0ᵃ 
RMR_EL1                   RMR (at EL1) 
RMR_EL2                   HRMR 
RMR_EL3                   RMR (at EL3) 
SCTLR_EL1                 SCTLRᵃ 
SDER32_EL3                SDER 
TPIDR_EL1[31:0]           TPIDRPRWᵃ 
TPIDRRO_EL0[31:0]         TPIDRUROᵃ 
TPIDR_EL0[31:0]           TPIDRURWᵃ 
TCR_EL1[31:0]             TTBCRᵃ 
TTBR0_EL1                 TTBR0ᵃ 
TTBR1_EL1                 TTBR1ᵃ 
VBAR_EL1[31:0]            VBARᵃ 
VMPIDR_EL2[31:0]          VMPIDR 
VPIDR_EL2                 VPIDR 
VTCR_EL2                  VTCR 
VTTBR_EL2                 VTTBR 

Timer registers 
CNTFRQ_EL0                CNTFRQ 
CNTHCTL_EL2               CNTHCTL 
CNTHP_CTL_EL2             CNTHP_CTL 
CNTHP_CVAL_EL2[63:0]      CNTHP_CVAL 
CNTHP_TVAL_EL2            CNTHP_TVAL 
CNTKCTL_EL1               CNTKCTL 
CNTP_CTL_EL0              CNTP_CTLᵃ 
CNTP_CVAL_EL0[63:0]       CNTP_CVALᵃ 
CNTP_TVAL_EL0             CNTP_TVALᵃ 
CNTPCT_EL0[63:0]          CNTPCT 
CNTV_CTL_EL0              CNTV_CTL 
CNTV_CVAL_EL0[63:0]       CNTV_CVAL 
CNTV_TVAL_EL0             CNTV_TVAL 
CNTVCT_EL0[63:0]          CNTVCT 
CNTVOFF_EL2[63:0]         CNTVOFF 

Debug System registers 
DBGAUTHSTATUS_EL1         DBGAUTHSTATUS 
DBGBCR<n>_EL1             DBGBCR<n> 
DBGBVR<n>_EL1[31:0]       DBGBVR<n> 
DBGBVR<n>_EL1[63:32]      DBGBXVR<n> 
DBGCLAIMCLR_EL1           DBGCLAIMCLR 
DBGCLAIMSET_EL1           DBGCLAIMSET 
DBGDTR_EL0                DBGDTRRXint or DBGDTRTXint 
DBGDTRRX_EL0              DBGDTRRXint 
DBGDTRTX_EL0              DBGDTRTXint 
DBGPRCR_EL1               DBGPRCR 
DBGVCR32_EL2              DBGVCR 
DBGWCR<n>_EL1             DBGWCR<n> 
DBGWVR<n>_EL1[31:0]       DBGWVR<n> 
ID_DFR0_EL1               ID_DFR0 
MDCCSR_EL0ᵇ               DBGDSCRintᵇ 
MDCR_EL2                  HDCR 
MDRAR_EL1                 DBGDRAR 
MDSCR_EL1ᵇ                DBGDSCRextᵇ 
OSDLR_EL1                 DBGOSDLR 
OSDTRRX_EL1ᵇ              DBGDTRRXext 
OSDTRTX_EL1ᵇ              DBGDTRTXextᵇ 
OSECCR_EL1                DBGOSECCR 
OSLAR_EL1                 DBGOSLAR 
OSLSR_EL1                 DBGOSLSR 
SDER32_EL3                SDER 

Performance Monitors System registers 
PMCCNTR_EL0[31:0]         PMCCNTR (MRC/MCR) 
PMCEID0_EL0               PMCEID0 
PMCEID1_EL0               PMCEID1 
PMCNTENCLR_EL0            PMCNTENCLR 
PMCNTENSET_EL0            PMCNTENSET 
PMCR_EL0                  PMCR 
PMEVCNTR<n>_EL0           PMEVCNTR<n> 
PMEVTYPER<n>_EL0          PMEVTYPER<n> 
PMINTENCLR_EL1            PMINTENCLR 
PMINTENSET_EL1            PMINTENSET 
PMOVSCLR_EL0              PMOVSR 
PMOVSSET_EL0              PMOVSSET 
PMSELR_EL0                PMSELR 
PMSWINC_EL0               PMSWINC 
PMUSERENR_EL0             PMUSERENR 
PMXEVCNTR_EL0             PMXEVCNTR 
PMXEVTYPER_EL0            PMXEVTYPER 

a. AArch32 registers that are banked if EL3 is using AArch32. 
b. These registers have overlapping register content. One or more bits of one register 
appear in the other register. 
There are a small number of AArch32 System registers that are not mapped to any AArch64 System registers. The 
AArch64 registers listed in Table D1-83 can be used to access these from a higher Exception level that is using 
AArch64. The registers shown in the table are UNDEFINED if EL1 cannot use AArch32. 


Table D1-83 AArch64 registers for accessing registers that are only used in AArch32 state 

AArch32     AArch64 register for accessing 
register    the AArch32 register              Short description 
DACR        DACR32_EL2                        Domain Access Control Register 
DBGVCR      DBGVCR32_EL2                      Debug Vector Catch Register 
FPEXC       FPEXC32_EL2                       Floating-Point Exception Control Register 
IFSR        IFSR32_EL2                        Instruction Fault Status Register 
SDER        SDER32_EL3                        AArch32 Secure Debug Enable Register 





Table D1-84 shows the AArch64 System registers that allow access from AArch64 state to the AArch32 
ID registers. These AArch64 registers are UNKNOWN if no Exception level can use AArch32. 


Table D1-84 AArch64 registers that access the AArch32 ID registers 

AArch32     AArch64 register for accessing 
register    the AArch32 register              Short description 
ID_AFR0     ID_AFR0_EL1                       AArch32 Auxiliary Feature Register 0 
ID_DFR0     ID_DFR0_EL1                       AArch32 Debug Feature Register 0 
ID_ISAR0    ID_ISAR0_EL1                      AArch32 Instruction Set Attribute Register 0 
ID_ISAR1    ID_ISAR1_EL1                      AArch32 Instruction Set Attribute Register 1 
ID_ISAR2    ID_ISAR2_EL1                      AArch32 Instruction Set Attribute Register 2 
ID_ISAR3    ID_ISAR3_EL1                      AArch32 Instruction Set Attribute Register 3 
ID_ISAR4    ID_ISAR4_EL1                      AArch32 Instruction Set Attribute Register 4 
ID_ISAR5    ID_ISAR5_EL1                      AArch32 Instruction Set Attribute Register 5 
ID_MMFR0    ID_MMFR0_EL1                      AArch32 Memory Model Feature Register 0 
ID_MMFR1    ID_MMFR1_EL1                      AArch32 Memory Model Feature Register 1 
ID_MMFR2    ID_MMFR2_EL1                      AArch32 Memory Model Feature Register 2 
ID_MMFR3    ID_MMFR3_EL1                      AArch32 Memory Model Feature Register 3 
ID_MMFR4    ID_MMFR4_EL1                      AArch32 Memory Model Feature Register 4 
ID_PFR0     ID_PFR0_EL1                       AArch32 PE Feature Register 0 
ID_PFR1     ID_PFR1_EL1                       AArch32 PE Feature Register 1 
D1.20.2 State of the general-purpose registers on taking an exception to AArch64 state 


When an exception is taken from AArch32 state to AArch64 state, the state of a general-purpose register depends 
on whether, immediately before the exception, the register was accessible from AArch32 state, as follows: 


If the general-purpose register was accessible from AArch32 state 


The upper 32 bits either become zero, or hold the value that the same architectural register held 
before any AArch32 execution. The choice between these two options is IMPLEMENTATION 
DEFINED, and might vary dynamically within an implementation. Correspondingly, software must 
regard the value as being a CONSTRAINED UNPREDICTABLE choice between these two values. 


This behavior applies regardless of whether any execution occurred at the Exception level that was 
using AArch32. That is, this behavior applies even if AArch32 state was entered by an exception 
return from AArch64 state, and another exception was immediately taken to AArch64 state without 
any instruction execution in AArch32 state. 

Which general-purpose registers have their upper 32 bits affected in this way depends on both: 
° The AArch64 state target Exception level. 
° The values of both: 
— SCR_EL3.RW. 
— HCR_EL2.RW or HCR.RW, where HCR.RW is a notional bit that is RES0. 


Table D1-85 shows which general-purpose registers can have their upper 32 bits set to zero. 


Table D1-85 General-purpose registers that can have their upper 32 bits set to zero on taking an 
exception to AArch64 state from AArch32 state 

                                       Registers when the target Exception level is: 
SCR_EL3.RW   HCR_EL2.RW or HCR.RWᵃ    EL3                EL2                EL1 
0            0                         X0-X30             -ᵇ                 -ᵇ 
0            1                         -ᶜ                 -ᶜ                 -ᶜ 
1            0                         X0-X14, X16-X30    X0-X14, X16-X30    -ᵇ 
1            1                         X0-X14             X0-X14             X0-X14 

a. HCR.RW is a notional bit that is RES0. 
b. The RW bit values are not valid for the targeted Exception level. 
c. Not valid because the RW bit values would imply that EL2 is AArch32 and EL1 is AArch64. 


Note 


If EL2 is not implemented, or the SCR_EL3.NS or SCR.NS bit prevents its use, then as described 
in The effects of supporting fewer than four Exception levels on page D1-1621, the behavior is 
consistent with HCR_EL2.RW taking the value of SCR_EL3.RW. 





If the general-purpose register was not accessible from AArch32 state 


The general rule is that the register retains the state it had before any AArch32 execution. 


There is one exception to this rule, that is when taking an exception to EL3 using AArch64 when 
either EL2 is not implemented or EL1 is in Secure state. In these cases, the X15 register must be 
treated as if it is accessible when the value of SCR_EL3.RW is 0, and therefore the upper bits of 
X15 might either be set to zero or retain their previous value. 

Which general-purpose registers retain their state depends on both: 
° The AArch64 state target Exception level. 
° The values of both: 
— SCR_EL3.RW. 
— HCR_EL2.RW or HCR.RW, where HCR.RW is a notional bit that is RES0. 


Table D1-86 shows which general-purpose registers can retain their state. 


Table D1-86 General-purpose registers that can retain their state on taking an exception to 
AArch64 from AArch32 

                                       Registers when the target Exception level is: 
SCR_EL3.RW   HCR_EL2.RW or HCR.RWᵃ    EL3        EL2        EL1 
0            0                         None       -ᵇ         -ᵇ 
0            1                         -ᶜ         -ᶜ         -ᶜ 
1            0                         X15        X15        -ᵇ 
1            1                         X15-X30    X15-X30    X15-X30 

a. HCR.RW is a notional bit that is RES0. 
b. The RW bit values are not valid for the targeted Exception level. 
c. Not valid because the RW bit values would imply that EL2 is AArch32 and EL1 is AArch64. 


Note 


If EL2 is not implemented, or the SCR_EL3.NS bit prevents its use, then as described in The effects 
of supporting fewer than four Exception levels on page D1-1621, the behavior is consistent with 
HCR_EL2.RW taking the value of SCR_EL3.RW. 
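
Tables D1-85 and D1-86 can be read as a single lookup keyed by the two RW values and the target Exception level. The following is a minimal sketch of that lookup, not architectural pseudocode. The names may_be_zeroed, retained, and aarch32_view are hypothetical, registers are represented by their X register numbers, and the special case for X15 on exceptions to EL3 described above is not modeled.

```python
ALL_X = frozenset(range(31))          # X0-X30
LOW_15 = frozenset(range(15))         # X0-X14

# (SCR_EL3.RW, HCR_EL2.RW or HCR.RW) -> {target EL: registers from Table D1-85}
_ZEROED = {
    (0, 0): {3: ALL_X},
    (1, 0): {3: ALL_X - {15}, 2: ALL_X - {15}},
    (1, 1): {3: LOW_15, 2: LOW_15, 1: LOW_15},
}


def may_be_zeroed(scr_rw, hcr_rw, target_el):
    """X registers whose upper 32 bits can be set to zero (Table D1-85)."""
    try:
        return _ZEROED[(scr_rw, hcr_rw)][target_el]
    except KeyError:
        # Footnotes b and c of the tables: combination is not valid.
        raise ValueError("RW values not valid for this target Exception level") from None


def retained(scr_rw, hcr_rw, target_el):
    """X registers that retain their state (Table D1-86)."""
    return ALL_X - may_be_zeroed(scr_rw, hcr_rw, target_el)


def aarch32_view(x_value):
    """Software must not rely on the upper 32 bits, so mask them off."""
    return x_value & 0xFFFFFFFF
```

In this sketch the two tables are complements of each other over X0-X30, which matches every valid row shown above. Software that needs the AArch32 value of a register from the first table would use only the low 32 bits, as aarch32_view does.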





D1.20.3 SPSR, ELR, and AArch64 SP relationships on changing Execution state 


Table D1-87 shows the SPSR and ELR registers that are architecturally mapped between AArch32 state and 
AArch64 state. 


Table D1-87 SPSR and ELR mappings between AArch32 state and AArch64 state 

AArch32 register    AArch64 register 
SPSR_svc            SPSR_EL1 
SPSR_hyp            SPSR_EL2 
ELR_hyp             ELR_EL2 





On exception entry to EL3 using AArch64 state from an Exception level using AArch32 state, when EL2 has been 
using AArch32 state, the upper 32 bits of ELR_EL2 are either set to zero or they retain the value before the 
AArch32 state execution. The implementation determines the choice between these two options, and the choice 
might vary dynamically within an implementation. Therefore, software must regard the upper 32 bits as being 
UNKNOWN. 


On exception entry to an Exception level using AArch64 state from an Exception level using AArch32 state, the 
AArch64 Stack Pointers and Exception Link Registers associated with an Exception level that are not accessible 
during execution in AArch32 state at that Exception level, retain the state that they had before the execution in 
AArch32 state. 


The following AArch32 registers are used only during execution in AArch32 state. However, they retain their state 
when there is execution at EL1 with EL1 using AArch64 state: 





° SPSR_abt. 
° SPSR_und. 
° SPSR_irq. 
° SPSR_fiq. 

Note 

° These registers are accessible during execution in AArch64 state at Exception levels higher than EL1, for 
context switching. 
° If EL1 does not support execution in AArch32 state then these registers are RES0. 





On exception entry to an Exception level using AArch64 from an Exception level using AArch32, the AArch64 
Stack Pointers and Exception Link Registers associated with an Exception level that are not accessible during 
AArch32 execution at that Exception level retain the state that they had before AArch32 execution. This applies to 
the following registers: 





° SP_EL0. 
° SP_EL1. 
° SP_EL2. 
° ELR_EL1. 
D1.21 The effect of implementation choices on the programmers’ model 
Three of the implementation choices in ARMv8 are: 
° The number of Exception levels implemented. 
° Which Exception levels support AArch32 and which Exception levels support AArch64. 
° Whether SIMD and floating-point support is implemented. 
The following subsections give more information about how these choices affect the programmers’ model: 
° Implication of Exception levels implemented. 
° Support for Exception levels and Execution states on page D1-1620. 
° Implementations not including Advanced SIMD and floating-point instructions on page D1-1621. 
° The effects of supporting fewer than four Exception levels on page D1-1621. 
D1.21.1 Implication of Exception levels implemented 
All implementations must include EL0 and EL1. 
EL2 and EL3 are optional. The architecture permits all combinations of EL2 and EL3. 
See also Implementations not including Advanced SIMD and floating-point instructions on page D1-1621 and The 
effects of supporting fewer than four Exception levels on page D1-1621. 
For an implementation that includes all of the Exception levels Figure D1-5 shows the implemented Exception 
levels and the possible Execution states at lower Exception levels when EL3 is using AArch64. Figure D1-5 applies 
regardless of whether EL3 also supports use of AArch32. 
[Figure: 
Level   Non-secure state                                      Secure state 
EL0     App1, App2 (per Guest OS): AArch32 or AArch64†       Secure App1, Secure App2: AArch32 or AArch64† 
EL1     Guest OS1, Guest OS2: AArch32 or AArch64‡            Secure OS: AArch32 or AArch64 
EL2     Hypervisor: AArch32 or AArch64 
EL3                                                           Secure monitor: AArch64 

† AArch64 permitted only if EL1 is using AArch64 
‡ AArch64 permitted only if EL2 is using AArch64 
] 

Figure D1-5 ARMv8-A security model when EL3 is using AArch64 
The possible combinations of Exception levels are as follows: 


° EL0, EL1, and EL2. The implementation supports only Non-secure state. 

° EL0, EL1, and EL3. The implementation does not support Virtualization. The Exception levels and 
Execution states depend on whether EL3 is using AArch64 state or AArch32 state, as follows: 

— If EL3 is using AArch64, the Exception levels and Execution states are as shown in Figure D1-5 on 
page D1-1619 with EL2 removed and no Non-secure state virtualization of EL1 and EL0. 

— If EL3 is using AArch32, the Exception levels and Execution states are as shown in Figure G1-1 on 
page G1-3790 with EL2 removed and no Non-secure state virtualization of EL1 and EL0. 

° EL0 and EL1 only. The implementation supports only a single Security state. This might be either Secure 
state or Non-secure state, see Behavior when only EL1 and EL0 are implemented on page D1-1622. 

° EL0, EL1, EL2, and EL3, as described in this section. 


For more information, see The effects of supporting fewer than four Exception levels on page D1-1621. 


D1.21.2 Support for Exception levels and Execution states 


Subject to the interprocessing rules defined in Interprocessing on page D1-1607, an implementation of the ARM 
architecture could support: 


° AArch64 state only. 
° AArch64 and AArch32 states. 
° AArch32 state only. 


This means the ARMv8-A architecture can, potentially, support implementations with a very large number of 
combinations of Execution state and Exception level. ARM intends to license only a subset of the possible 
combinations. Table D1-88 shows the combinations of Exception levels and Execution states that are currently 
licensed. 


Table D1-88 Supported combinations of Exception levels and Execution state 

Number of          Supported          Exception levels, AArch64 state    Exception levels, AArch32 state 
Exception levels   Security states    EL3    EL2    EL1    EL0           EL3    EL2    EL1    EL0 
Four               Both               Yes    Yes    Yes    Yes           Yes    Yes    Yes    Yes 
                                      Yes    Yes    Yes    Yes           No     No     Yes    Yes 
                                      Yes    Yes    Yes    Yes           No     No     No     Yes 
                                      Yes    Yes    Yes    Yes           No     No     No     No 
                                      No     No     No     No            Yes    Yes    Yes    Yes 
Three              Both               Yes    No     Yes    Yes           No     No     Yes    Yes 
                                      Yes    No     Yes    Yes           No     No     No     Yes 
                                      Yes    No     Yes    Yes           No     No     No     No 
                                      No     No     No     No            Yes    No     Yes    Yes 
                   Non-secure only    No     Yes    Yes    Yes           No     No     Yes    Yes 
                                      No     Yes    Yes    Yes           No     No     No     No 
Two                Either             No     No     Yes    Yes           No     No     No     Yes 
                                      No     No     Yes    Yes           No     No     No     No 
                                      No     No     No     No            No     No     Yes    Yes 
D1.21.3 Implementations not including Advanced SIMD and floating-point instructions 


In general, ARMv8-A requires the inclusion of the Advanced SIMD and floating-point instructions in all instruction 
sets. Exceptionally, for implementations targeting specialized markets that do not require support for floating-point 
or use of Advanced SIMD, ARM might produce or license an ARMv8-A implementation that does not provide any 
support for Advanced SIMD and floating-point instructions. In such an implementation: 


In AArch64 state 
° The CPACR_EL1.FPEN field is RES0. 
° The CPTR_EL2.TFP bit is RES1. 
° The CPTR_EL3.TFP bit is RES1. 
° Each of the ID_AA64PFR0_EL1.{AdvSIMD, FP} fields is 0b1111. 

° The FPEXC32_EL2, FPCR, and FPSR registers are not implemented, and their encodings 
are UNDEFINED. 

° Attempted accesses to Advanced SIMD and floating-point functionality are UNDEFINED. This 
means: 

— All Advanced SIMD and floating-point instructions are UNDEFINED. 

— Attempts to access the Advanced SIMD and floating-point System registers are 
UNDEFINED. 

° If at least one Exception level supports execution in AArch32 state, the MVFR0_EL1, 
MVFR1_EL1, and MVFR2_EL1 registers are RAZ. When no Exception level supports 
execution in AArch32 state these registers are UNKNOWN. 


In AArch32 state 


See AArch32 implications of not including support for Advanced SIMD and floating-point on 
page G1-3880. 
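
Software can detect such an implementation from the ID_AA64PFR0_EL1 fields named above. The following sketch decodes a value that has already been read from ID_AA64PFR0_EL1. The helper names are hypothetical, and the field positions, FP in bits [19:16] and AdvSIMD in bits [23:20], are taken from the register description rather than from this section.

```python
NOT_IMPLEMENTED = 0b1111

# Field positions from the ID_AA64PFR0_EL1 register description.
FP_LSB = 16
ADVSIMD_LSB = 20


def id_field(value, lsb, width=4):
    """Extract an unsigned ID register field of the given width."""
    return (value >> lsb) & ((1 << width) - 1)


def has_fp(id_aa64pfr0):
    """True unless the FP field reports that floating-point is not implemented."""
    return id_field(id_aa64pfr0, FP_LSB) != NOT_IMPLEMENTED


def has_advsimd(id_aa64pfr0):
    """True unless the AdvSIMD field reports that Advanced SIMD is not implemented."""
    return id_field(id_aa64pfr0, ADVSIMD_LSB) != NOT_IMPLEMENTED
```

On an implementation of the kind described in this section, both helpers return False, and software would avoid issuing any Advanced SIMD or floating-point instruction.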


D1.21.4 The effects of supporting fewer than four Exception levels 


The effect of implementation choices on the programmers’ model on page D1-1619 defines the permitted 
combinations of Exception levels in an ARMv8-A implementation. 


In every implementation that supports the highest Exception level using either AArch64 state or AArch32 state, an 
IMPLEMENTATION DEFINED mechanism determines whether the highest implemented Exception level uses AArch64 
state or AArch32 state from a Cold reset. Typically, this mechanism is a configuration input. When the highest level 
is configured to be AArch64 state, then after a Cold reset execution starts at the reset vector in that Exception level. 


The unimplemented Exception levels have no effect on execution: 


° No interrupts are routed to these Exception levels. 
° No traps that target these Exception levels are active. 
° All system calls to unimplemented Exception levels from lower Exception levels are treated as UNDEFINED. 
° There is no support for address translation from these Exception levels. 
° Any exception return that targets an unimplemented Exception level is treated as an illegal exception return 
as described in Illegal return events from AArch64 state on page D1-1537. 
° Every accessible register associated with an unimplemented Exception level is RES0 unless the register is 
associated with the Exception level only to provide the ability to transfer execution to a lower Exception 
level. 


Note 


If, for example, EL3 is not implemented and EL2 is the highest implemented Exception level, then because 
none of the EL3 registers are accessible from EL2, the content of those registers is not architecturally visible. 








The following subsections give more information about each of the permitted combinations of Exception levels that 
do not include all Exception levels. 
Behavior when EL2 is not implemented 
If EL2 is not implemented and EL3 is implemented: 


° If EL1 can use AArch32 then the following registers are not RES0: 
— DACR32_EL2. 
— IFSR32_EL2. 
— FPEXC32_EL2. 
— DBGVCR32_EL2. 


° The VMPIDR_EL2 and VPIDR_EL2 behave as follows: 
— Reads of VMPIDR_EL2 return the value of MPIDR_EL1, writes to VMPIDR_EL2 are ignored. 
— Reads of VPIDR_EL2 return the value of MIDR_EL1, writes to VPIDR_EL2 are ignored. 


° Behavior is consistent with the HCR_EL2.RW bit taking the value of the SCR_EL3.RW bit for all purposes 
other than reading the HCR_EL2. 





° Virtual interrupts are disabled. 
° The following address translation and TLB invalidation instructions are UNDEFINED: 
— AT S1E2R and AT S1E2W. 
— TLBI VAE2, TLBI VALE2, TLBI VAE2IS, TLBI VALE2IS, TLBI ALLE2, TLBI ALLE2IS. 

Note 

No other TLB or address translation instructions become UNDEFINED with this combination of 
Exception levels. 

° The SCR_EL3.HCE bit is RES0. 
If EL2 is not implemented, regardless of whether EL3 is implemented: 


° The CNTHCTL_EL2[1:0] bits are treated as if they have the value 0b11 for all purposes other than reading 
the CNTHCTL_EL2 register. 


° The MDCR_EL2.HPMN field is treated as if it has the value of the PMCR_EL0.N field for all purposes other 
than reading the value of MDCR_EL2.HPMN. 
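
The "behaves as if" rules in this subsection can be modeled as effective values that replace the register contents when EL2 is not implemented. The following is a sketch for a software model or simulator, not architectural pseudocode. All function and parameter names are hypothetical, and the HCR_EL2.RW rule is the one stated above for the case where EL3 is implemented.

```python
def effective_hcr_el2_rw(el2_implemented, hcr_el2_rw, scr_el3_rw):
    """HCR_EL2.RW as used by the PE; takes SCR_EL3.RW when EL2 is absent."""
    return hcr_el2_rw if el2_implemented else scr_el3_rw


def effective_cnthctl_low_bits(el2_implemented, cnthctl_el2):
    """CNTHCTL_EL2[1:0] as used by the PE; treated as 0b11 when EL2 is absent."""
    return (cnthctl_el2 & 0b11) if el2_implemented else 0b11


def effective_hpmn(el2_implemented, mdcr_el2_hpmn, pmcr_el0_n):
    """MDCR_EL2.HPMN as used by the PE; takes PMCR_EL0.N when EL2 is absent."""
    return mdcr_el2_hpmn if el2_implemented else pmcr_el0_n
```

These helpers describe only the value the PE acts on. They say nothing about what a direct read of the corresponding register returns, which the rules above explicitly exclude.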


Behavior when EL3 is not implemented and EL2 is implemented 


If EL3 is not implemented and EL2 is implemented, then: 
° All memory transactions can only access a single physical memory address space. 
° The PE behaves as if the value of the SCR_EL3.NS bit is 1, even though the SCR_EL3 is not accessible. 


This means that if the PE is part of a system that supports two Security states, it behaves as if it is in Non-secure 
state, and can only access Non-secure memory. 


Behavior when only EL1 and ELO are implemented 


If EL3 and EL2 are not implemented, it is IMPLEMENTATION DEFINED whether the PE behaves as if the value of the 
SCR_EL3.NS bit is 1 or the PE behaves as if the value of the SCR_EL3.NS bit is 0. 


This means that if the PE is part of a system that supports two Security states: 


° If it behaves as if the value of the SCR_EL3.NS bit is 1, it can only access Non-secure memory. 
° If it behaves as if the value of the SCR_EL3.NS bit is 0, it can access both Secure memory and Non-secure 
memory. 


If the PE behaves as if the value of the SCR_EL3.NS bit is 0, then: 


° The MDCR_EL3.{EPMAD, EDAD, SPME} bits behave as if they have the value 1. 
° The MDCR_EL3.SPD32 field behaves as if it has the value 0b11.


Note 





The behavior described in this subsection still applies if EL1 is configured to use AArch32. 


The implementation can provide a configuration input that determines, from reset, whether it behaves as
if the value of the SCR_EL3.NS bit is 1, or as if the value of the SCR_EL3.NS bit is 0.
AArch64 Self-hosted Debug


When the PE is using self-hosted debug, it generates debug exceptions. This chapter describes the AArch64 
self-hosted debug exception model. It is organized as follows: 


Introductory information 


About self-hosted debug on page D2-1626. 
The debug exception enable controls on page D2-1630. 


The debug Exception model 


Routing debug exceptions on page D2-1631. 


Enabling debug exceptions from the current Exception level and Security state on 
page D2-1633. 


The effect of powerdown on debug exceptions on page D2-1635. 
Summary of the routing and enabling of debug exceptions on page D2-1636. 
Pseudocode description of debug exceptions on page D2-1638. 


The debug exceptions 


Breakpoint Instruction exceptions on page D2-1639. 
Breakpoint exceptions on page D2-1641. 
Watchpoint exceptions on page D2-1657. 

Vector Catch exceptions on page D2-1672. 

Software Step exceptions on page D2-1673. 


Synchronization requirements 


The behavior of self-hosted debug after changes to System registers, or after changes to the
authentication interface, but before a Context synchronization event guarantees the effects of the
changes:


Synchronization and debug exceptions on page D2-1687. 
D2.1 About self-hosted debug


Self-hosted debug supports debugging through the generation and handling of debug exceptions, that are taken using 
the exception model described in Chapter D1 The AArch64 System Level Programmers’ Model. This section 
introduces some terms used in describing self-hosted debug, and then introduces the debug exceptions. See: 


° Definition of a debugger in the context of self-hosted debug. 
° Context ID and Process ID. 
° About debug exceptions.
D2.1.1 Definition of a debugger in the context of self-hosted debug 


Within this chapter, debugger means that part of an operating system, or higher level of system software, that 
handles debug exceptions and programs the debug System registers. An operating system with rich application 
environments might provide debug services that support a debugger user interface executing at EL0. From the
architectural perspective, the debug services are the debugger. 


D2.1.2 Context ID and Process ID 


A CONTEXTIDR_ELx identifies the current Context ID, that is used by: 
° The debug logic, for breakpoint and watchpoint matching. 


° Implemented trace logic, to identify the current process. 


In AArch64 state, the CONTEXTIDR_ELx has a single field, PROCID, that is defined as the Process Identifier 
(Process ID). Therefore, in AArch64 state, the Context ID and Process ID are identical. 


D2.1.3 About debug exceptions 


Debug exceptions occur during normal program flow if a debugger has programmed the PE to generate them. For 
example, a software developer might use a debugger contained in an operating system to debug an application. To 
do this, the debugger enables one or more debug exceptions. The debug exceptions that can be generated in an 
AArch64 stage 1 translation regime are: 


° Breakpoint Instruction exceptions on page D2-1627. 
° Breakpoint exceptions on page D2-1627, generated by hardware breakpoints. 
° Watchpoint exceptions on page D2-1627, generated by hardware watchpoints. 


° Software Step exceptions on page D2-1628. 


In addition, debug exceptions generated in an AArch32 translation regime might be routed to EL2 using AArch64. 
See Routing debug exceptions on page D2-1631. Chapter G2 describes the debug exceptions that can be generated 
in an AArch32 translation regime. 


Vector Catch exceptions are exceptions that are never generated in an AArch64 translation regime but can be 
generated in stage 1 of an AArch32 translation regime and routed to EL2 using AArch64. Vector Catch exceptions 
on page D2-1672 describes the behavior for this case. 


The PE can only generate a particular debug exception when both: 


1. Debug exceptions are enabled from the current Exception level and Security state. 
See Enabling debug exceptions from the current Exception level and Security state on page D2-1633. 
Breakpoint Instruction exceptions are always enabled from the current Exception level and Security state. 
2. A debugger has enabled that particular debug exception.


All of the debug exceptions except for Breakpoint Instruction exceptions have an enable control contained in 
the MDSCR_EL1. See The debug exception enable controls on page D2-1630. 
Note


If halting is allowed and EDSCR.HDE is 1, hardware breakpoints and watchpoints cause entry to Debug state 
instead of causing debug exceptions. In Debug state, the PE is halted. 





For the definition of halting is allowed, see Halting allowed and halting prohibited on page H2-4845. 





The following list summarizes each of the debug exceptions: 


Breakpoint Instruction exceptions 


Breakpoint instructions generate these. Breakpoint instructions are instructions that software 
developers can use to cause exceptions at particular points in the program flow. 


The breakpoint instruction in the A64 instruction set is BRK #<immediate>. Whenever one of these is 
committed for execution, the PE takes a Breakpoint Instruction exception. 


PE behavior 


Breakpoint Instruction exceptions cannot be masked. The PE takes Breakpoint 
Instruction exceptions regardless of both of the following: 


° The current Exception level. 


° The current Security state. 


For more information, see Breakpoint Instruction exceptions on page D2-1639. 


Breakpoint exceptions 


The ARMv8-A architecture provides 2-16 hardware breakpoints. These can be programmed to 
generate Breakpoint exceptions based on particular instruction addresses, or based on particular PE 
contexts, or both. 


For example, a software developer might program a hardware breakpoint to generate a Breakpoint 
exception whenever the instruction with address 0x1000 is committed for execution. 


The ARMv8-A architecture supports the following types of hardware breakpoint for use in stage 1
of an AArch64 translation regime:
° Address.
— Comparisons are made with the virtual address of each instruction in the program flow. 
° Context: 
— Context ID Match. Matches with the Context ID held in the CONTEXTIDR_EL1. 
— VMID Match. Matches with the VMID value held in the VTTBR_EL2. 
— Context ID and VMID Match. Matches with both the Context ID and the VMID value. 


An Address breakpoint can link to a Context breakpoint, so that the Address breakpoint only 
generates a Breakpoint exception if the PE is in a particular context when the address match occurs. 


A breakpoint generates a Breakpoint exception whenever an instruction that causes a match is 
committed for execution. 


PE behavior 


If halting is allowed and EDSCR.HDE is 1, hardware breakpoints cause entry to Debug 
state. That is, they halt the PE. See Chapter H2 Debug State. 


Otherwise: 

° If debug exceptions are enabled, hardware breakpoints cause Breakpoint 
exceptions. 

° If debug exceptions are disabled, hardware breakpoints are ignored. 


For more information, see Breakpoint exceptions on page D2-1641. 
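The decision described under PE behavior above can be sketched as a small function. This is an illustrative model only, not architectural pseudocode: the names `Outcome` and `hardware_breakpoint_outcome` are hypothetical, and the three inputs stand in for "halting is allowed", the value of EDSCR.HDE, and whether debug exceptions are enabled.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of a hardware breakpoint match (hypothetical names)."""
    DEBUG_STATE = "enter Debug state"
    BREAKPOINT_EXCEPTION = "take Breakpoint exception"
    IGNORED = "ignored"


def hardware_breakpoint_outcome(halting_allowed, edscr_hde, debug_exceptions_enabled):
    """Model what happens when a hardware breakpoint matches.

    halting_allowed          -- True if halting is allowed (see Chapter H2)
    edscr_hde                -- value of EDSCR.HDE, 0 or 1
    debug_exceptions_enabled -- True if debug exceptions are enabled
    """
    # Halting takes priority: the PE enters Debug state instead of
    # taking a debug exception.
    if halting_allowed and edscr_hde == 1:
        return Outcome.DEBUG_STATE
    # Otherwise the breakpoint either generates an exception or is ignored.
    if debug_exceptions_enabled:
        return Outcome.BREAKPOINT_EXCEPTION
    return Outcome.IGNORED
```

The same three-way decision applies to hardware watchpoints, with a Watchpoint exception in place of the Breakpoint exception.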


Watchpoint exceptions 


The ARMv8-A architecture provides 2-16 hardware watchpoints. These can be programmed to 
generate Watchpoint exceptions based on accesses to particular data addresses, or based on accesses 
to any address in a data address range. 
For example, a software developer might program a hardware watchpoint to generate a Watchpoint
exception on an access to any address in the data address range 0x1000 - 0x101F. 


A hardware watchpoint can link to a hardware breakpoint if the hardware breakpoint is a Linked 
Context type. In this case, the watchpoint only generates a Watchpoint exception if the PE is in a
particular context when the data address match occurs. 


The smallest data address size that a watchpoint can be programmed to match on is a byte. A single 
watchpoint can be programmed to match on one or more bytes. 
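As a rough illustration of byte-granular matching, the following sketch checks whether any byte of an access falls inside a watched address range such as 0x1000 - 0x101F. It is a deliberate simplification and an assumption of this example: it models the watched region as a plain inclusive interval and does not model how a watchpoint is actually programmed. The function name `watchpoint_range_matches` is hypothetical.

```python
def watchpoint_range_matches(access_addr, access_size, watch_start, watch_end):
    """Return True if any accessed byte lies in [watch_start, watch_end].

    access_addr -- address of the first byte accessed
    access_size -- number of bytes accessed, must be at least 1
    watch_start -- first watched byte address (inclusive)
    watch_end   -- last watched byte address (inclusive)
    """
    if access_size < 1:
        raise ValueError("access_size must be at least 1")
    access_last = access_addr + access_size - 1
    # Two inclusive intervals overlap when each starts no later than
    # the other ends.
    return access_addr <= watch_end and access_last >= watch_start
```

With the range from the example above, a 4-byte access at 0x101C matches, and a 4-byte access at 0x1020 does not.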


A watchpoint generates a Watchpoint exception whenever an instruction that initiates an access that 
causes a match is committed for execution. 
PE behavior 


If halting is allowed and EDSCR.HDE is 1, hardware watchpoints cause entry to Debug 
state. That is, they halt the PE. See Chapter H2 Debug State. 


Otherwise: 

° If debug exceptions are enabled, hardware watchpoints cause Watchpoint 
exceptions. 

° If debug exceptions are disabled, hardware watchpoints are ignored. 


For more information, see Watchpoint exceptions on page D2-1657. 


Vector Catch exceptions 
These are not generated in an AArch64 translation regime. They can only be generated in an 
AArch32 translation regime. See Vector Catch exceptions on page D2-1672. 

Software Step exceptions 
Software step is a resource that a debugger can use to make the PE single-step instructions. 


For example, by using software step, debugger software executing at a higher Exception level can 
debug software executing at a lower Exception level, by making it single-step instructions. 


After the software being debugged has single-stepped an instruction, the PE takes a Software Step 
exception. 
PE behavior 


Software step can only be used by a debugger executing in an Exception level that is 
using AArch64. However, the instruction stepped might be executed in either Execution 
state, and therefore Software Step exceptions can be taken from either Execution state. 


If debug exceptions are enabled, Software Step exceptions can be generated. 


If debug exceptions are disabled, software step is inactive. 


For more information, see Software Step exceptions on page D2-1673. 


Table D2-1 summarizes PE behavior and shows the location of the pseudocode for each of the debug exceptions. 


Table D2-1 PE behavior and pseudocode for each of the debug exceptions

Debug exception                     PE behavior if debug exceptions are:         Pseudocode
                                    Enabled                 Disabled
Breakpoint Instruction exceptions   Takes the exception     Takes the exception  page D2-1640
Breakpoint exceptions               Takes the exception(a)  Ignored              page D2-1655
Watchpoint exceptions               Takes the exception(a)  Ignored              page D2-1670
Vector Catch exceptions             Takes the exception     Ignored              page G2-3981
Software Step exceptions            Takes the exception     Not applicable(b)    page D2-1686

a. If halting is allowed and EDSCR.HDE is 1, hardware breakpoints and watchpoints cause the PE to enter Debug
state instead of causing debug exceptions. See Chapter H2 Debug State.
b. Software Step is inactive if debug exceptions are disabled. No Software Step exceptions can be generated.
D2.2 The debug exception enable controls
The enable controls for each debug exception are as follows: 


Breakpoint Instruction exceptions 


None. Breakpoint Instruction exceptions are always enabled. 


Breakpoint exceptions 
MDSCR_EL1.MDE, plus an enable control for each breakpoint, DBGBCR<n>_EL1.E. 


Watchpoint exceptions 
MDSCR_EL1.MDE, plus an enable control for each watchpoint, DBGWCR<n>_EL1.E. 


Vector Catch exceptions 
MDSCR_EL1.MDE. 


Software Step exceptions 
MDSCR_EL1.SS.


In addition, for all debug exceptions other than Breakpoint Instruction exceptions, software must configure the 
controls that enable debug exceptions from the current Exception level and Security state. See Enabling debug 
exceptions from the current Exception level and Security state on page D2-1633. 


The PE cannot take a debug exception if debug exceptions are disabled from either the current Exception level or 
the current Security state. 


Breakpoint Instruction exceptions are always enabled from the current Exception level and Security state. 
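The per-exception enable controls listed in this section can be summarized in a short sketch. It is illustrative only: `DebugException` and `debugger_has_enabled` are hypothetical names, and the arguments stand for the values of MDSCR_EL1.MDE, MDSCR_EL1.SS, and the per-resource DBGBCR<n>_EL1.E or DBGWCR<n>_EL1.E bit. The sketch does not cover the separate requirement that debug exceptions are enabled from the current Exception level and Security state.

```python
from enum import Enum


class DebugException(Enum):
    BREAKPOINT_INSTRUCTION = 1
    BREAKPOINT = 2
    WATCHPOINT = 3
    VECTOR_CATCH = 4
    SOFTWARE_STEP = 5


def debugger_has_enabled(exc, mde=0, ss=0, resource_e=0):
    """Return True if the enable controls for this debug exception are set.

    mde        -- value of MDSCR_EL1.MDE
    ss         -- value of MDSCR_EL1.SS
    resource_e -- value of DBGBCR<n>_EL1.E or DBGWCR<n>_EL1.E for the
                  breakpoint or watchpoint being considered
    """
    if exc is DebugException.BREAKPOINT_INSTRUCTION:
        return True  # no enable control: always enabled
    if exc in (DebugException.BREAKPOINT, DebugException.WATCHPOINT):
        return mde == 1 and resource_e == 1
    if exc is DebugException.VECTOR_CATCH:
        return mde == 1
    if exc is DebugException.SOFTWARE_STEP:
        return ss == 1
    raise ValueError("unknown debug exception")
```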
D2.3 Routing debug exceptions


Debug exceptions are usually routed to EL1. However: 


° If EL2 is implemented, the routing of debug exceptions taken from Non-secure state depends on
MDCR_EL2.TDE:
1 Debug exceptions taken from Non-secure state are routed to EL2.
0 Debug exceptions behave as follows:
° Debug exceptions taken from Non-secure EL1 and EL0 are routed to Non-secure EL1.
° Breakpoint Instruction exceptions taken from EL2 are routed to EL2.
° All other debug exceptions are disabled from EL2 using AArch64.





Note 
If HCR_EL2.TGE is 1, MDCR_EL2.TDE is treated as being 1 except for a direct read of MDCR_EL2. 





Table D2-2 shows this. 


Table D2-2 The effect of the TGE and TDE controls on debug exception routing

HCR_EL2.TGE   MDCR_EL2.TDE   Debug exceptions taken from Non-secure state are routed to:
0             0              Non-secure EL1(a)
0             1              EL2
1             x              EL2

a. Breakpoint Instruction exceptions taken from EL2 are routed to EL2.





Note 
If EL2 is not implemented, the PE behaves as if both HCR_EL2.TGE and MDCR_EL2.TDE are 0. 





° If EL3 is implemented, Breakpoint Instruction exceptions taken from EL3 are routed to EL3. 


All other debug exceptions are disabled from EL3 using AArch64. 


Either EL1 or EL2 is the debug target exception level, ELp. That is, ELp is EL1 unless EL2 is implemented and 
MDCR_EL2.TDE is 1 and the debug exception is taken from Non-secure state, when ELp is EL2. 


The following tables show the routing of debug exceptions: 


Table D2-3 Routing when both EL3 and EL2 are implemented

                  ELp when executing in:
MDCR_EL2.TDE(a)   Non-secure:               Secure:
                  EL0    EL1    EL2         EL0    EL1    EL3
0                 EL1    EL1    EL1(b)      EL1    EL1    EL1(b)
1                 EL2    EL2    EL2         EL1    EL1    EL1(b)

a. If HCR_EL2.TGE is 1, this bit is treated as being 1 other than for a direct read of MDCR_EL2.
b. ELp is EL1. However, all debug exceptions other than Breakpoint Instruction exceptions are disabled,
and Breakpoint Instruction exceptions taken from EL2 are routed to EL2.
Table D2-4 Routing when EL3 is implemented and EL2 is not implemented

ELp when executing in:
Non-secure: Secure:
EL0 EL1 EL1 EL3
EL1 EL1 EL1 EL1(a)

a. ELp is EL1. However, all debug exceptions other than Breakpoint Instruction
exceptions are disabled, and Breakpoint Instruction exceptions taken from
EL2 are routed to EL2.


Table D2-5 Routing when EL3 is not implemented and EL2 is implemented

MDCR_EL2.TDE(a)   ELp when executing in Non-secure:
                  EL0    EL1    EL2
0                 EL1    EL1    EL1(b)
1                 EL2    EL2    EL2

a. If HCR_EL2.TGE is 1, this bit is treated as being 1 other than for a direct read of MDCR_EL2.
b. ELp is EL1. However, all debug exceptions other than Breakpoint Instruction exceptions are disabled, and Breakpoint
Instruction exceptions taken from EL2 are routed to EL2.


D2.3.1 Pseudocode description of routing debug exceptions 
DebugTarget() returns the current debug target Exception level. 
DebugTargetFrom() returns the debug target Exception level for the specified Security state. 


These functions are described in Chapter J1 ARMv8 Pseudocode. 
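For illustration only, the routing rules in this section can be modelled in a few lines. This is a sketch written for this example, not the DebugTarget() or DebugTargetFrom() pseudocode: the names `debug_target_el` and `breakpoint_instruction_target` are hypothetical, and Exception levels are represented as the integers 0 to 3. The second function assumes, as the routing tables in this chapter indicate, that a Breakpoint Instruction exception is taken to the higher of the current Exception level and ELp.

```python
def debug_target_el(non_secure, el2_implemented, mdcr_el2_tde, hcr_el2_tge):
    """Return the debug target Exception level, ELp, as 1 or 2."""
    if not el2_implemented:
        # The PE behaves as if both HCR_EL2.TGE and MDCR_EL2.TDE are 0.
        return 1
    # HCR_EL2.TGE == 1 makes MDCR_EL2.TDE be treated as 1.
    effective_tde = 1 if (mdcr_el2_tde == 1 or hcr_el2_tge == 1) else 0
    if non_secure and effective_tde == 1:
        return 2
    return 1


def breakpoint_instruction_target(current_el, non_secure, el2_implemented,
                                  mdcr_el2_tde, hcr_el2_tge):
    """Return the Exception level a Breakpoint Instruction exception is taken to.

    Breakpoint Instruction exceptions taken from EL2 or EL3 are routed to
    EL2 or EL3 respectively; otherwise they are routed to ELp.
    """
    elp = debug_target_el(non_secure, el2_implemented, mdcr_el2_tde, hcr_el2_tge)
    return max(current_el, elp)
```

For example, with EL2 implemented and MDCR_EL2.TDE set to 1, a breakpoint instruction executed at Non-secure EL0 is routed to EL2, whereas the same instruction executed in Secure state is routed to EL1.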
D2.4 Enabling debug exceptions from the current Exception level and Security state


A debug exception can only be taken if all of the following are true:

° The OS lock is unlocked.

° DoubleLockStatus() == FALSE.

° The debug exception is enabled from the current Exception level.

° The debug exception is enabled from the current Security state.


Table D2-6 shows when debug exceptions are enabled from the current Exception level. In the table, ELp is the 
Exception level that Table D2-3 on page D2-1631 defines. 


Table D2-6 Whether debug exceptions are enabled from the current Exception level

Current Exception level       Breakpoint Instruction   All other debug exceptions
                              exceptions
Any Exception level that is   Enabled                  Disabled
higher than ELp(a)
ELp                           Enabled                  Disabled if either of the following is true:
                                                       ° The Local (kernel) Debug Enable bit, MDSCR_EL1.KDE, is 0.
                                                       ° The Debug exception mask bit, PSTATE.D, is 1.
                                                       Otherwise enabled.
                                                       This means that a debugger must explicitly enable these debug
                                                       exceptions from ELp by setting MDSCR_EL1.KDE to 1 and
                                                       PSTATE.D to 0.
Any Exception level that is   Enabled                  Enabled
lower than ELp

a. This includes EL3. EL3 is always higher than ELp.


Note 
PSTATE.D is set to 1 at reset and on exception entry. 








Table D2-7 shows when debug exceptions are enabled from the current Security state. 


Table D2-7 Whether debug exceptions are enabled from the current Security state

Current Security state   Breakpoint Instruction   All other debug exceptions
                         exceptions
Non-secure               Enabled                  Enabled
Secure                   Enabled                  Disabled if MDCR_EL3.SDD is 1. See Disabling debug
                                                  exceptions from Secure state on page D2-1634.
                                                  Otherwise enabled.
D2.4.1 Disabling debug exceptions from Secure state


If EL3 is implemented, software executing at EL3 can set the Secure Debug Disable bit, MDCR_EL3.SDD, to 1 to 
disable all debug exceptions taken from AArch64 Secure state other than Breakpoint Instruction exceptions. 


The ARMv8-A architecture does not support disabling debug in Non-secure state. 


Note 


° If the boot software executed when reset is deasserted sets MDCR_EL3.SDD to 1, software operating at EL3 
never has to switch the debug registers between Secure state and Non-secure state. 





° The PE cannot take a debug exception unless it is enabled from the current Exception level. See Table D2-6
on page D2-1633. 


° If either the OS lock or the OS double-lock is locked, debug exceptions other than Breakpoint Instruction 
exceptions are disabled. 


° If EL3 and EL2 are not implemented, and the implementation is a Secure state only implementation, the PE 
behaves as if MDCR_EL3.SDD is 0. 
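Taken together, the conditions that determine whether debug exceptions other than Breakpoint Instruction exceptions are enabled can be sketched as follows. This is an illustrative model, not the AArch64.GenerateDebugExceptions() pseudocode: the function name and parameter names are hypothetical, Exception levels are represented as integers, and the caller is assumed to pass 0 for `mdcr_el3_sdd` when EL3 is not implemented.

```python
def other_debug_exceptions_enabled(current_el, elp, secure,
                                   os_lock_locked, double_lock_status,
                                   mdcr_el3_sdd, mdscr_el1_kde, pstate_d):
    """Return True if Breakpoint, Watchpoint, Vector Catch, and Software
    Step exceptions are enabled from the current Exception level and
    Security state. Breakpoint Instruction exceptions are always enabled
    and are not modelled here.
    """
    # Either lock disables these debug exceptions.
    if os_lock_locked or double_lock_status:
        return False
    # Security state: MDCR_EL3.SDD == 1 disables them in Secure state.
    if secure and mdcr_el3_sdd == 1:
        return False
    # Exception level, relative to the debug target Exception level ELp.
    if current_el > elp:
        return False
    if current_el == elp:
        return mdscr_el1_kde == 1 and pstate_d == 0
    return True
```

For example, with ELp equal to EL1 in Non-secure state and both locks clear, these exceptions are enabled from EL0, but from EL1 they are enabled only when MDSCR_EL1.KDE is 1 and PSTATE.D is 0.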





D2.4.2 Pseudocode description of enabling debug exceptions 


AArch64.GenerateDebugExceptions() determines whether debug exceptions other than Breakpoint Instruction 
exceptions are enabled from the current Exception level and Security state. 


AArch64.GenerateDebugExceptionsFrom() determines whether debug exceptions other than Breakpoint Instruction 
exceptions are enabled from the specified Exception level and Security state. 


These functions are described in Chapter J1 ARMv8 Pseudocode.
D2.5 The effect of powerdown on debug exceptions


Debug OS Save and Restore sequences on page H6-4951 describes the powerdown save routine and the restore 
routine. 


When executing either routine, software must use the OS Lock to disable generation of all of the following: 


° Breakpoint exceptions.
° Watchpoint exceptions.
° Vector Catch exceptions.
° Software Step exceptions.


This is because the generation of these exceptions depends on the state of the debug registers, and the state of the 
debug registers might be lost over these routines. 


Debug exceptions other than Breakpoint Instruction exceptions are enabled only if both the OS Lock is unlocked 
and DoubleLockStatus() == FALSE. 


Breakpoint Instruction exceptions are enabled regardless of the state of the OS Lock and the OS Double Lock. 
D2.6 Summary of the routing and enabling of debug exceptions
Behavior is as follows: 
Breakpoint Instruction exceptions 
These are always enabled, regardless of the current Exception level and Security state. Table D2-8 
shows the routing of these. In the table, n/a means not applicable. 
Table D2-8 Routing of Breakpoint Instruction exceptions

Current                             ELp when enabled from:
Security state   MDCR_EL2.TDE(a)    EL0              EL1              EL2   EL3
Secure           X                  Secure EL1       Secure EL1       n/a   EL3
Non-secure       0                  Non-secure EL1   Non-secure EL1   EL2   n/a
Non-secure       1                  EL2              EL2              EL2   n/a

a. If EL2 is not implemented, behavior is as if this is 0. Otherwise, if the value of HCR_EL2.TGE is 1,
MDCR_EL2.TDE is treated as being 1 other than for a direct read of MDCR_EL2.
All other debug exceptions 
Table D2-9 shows the valid combinations of MDCR_EL3.SDD, MDCR_EL2.TDE, 
MDSCR_EL1.KDE, and PSTATE.D, and for each combination shows where these exceptions are 
enabled from and where they are taken to. 
In the table, n/a means not applicable and a dash, -, means that debug exceptions are disabled from 
that Exception level. 
Table D2-9 Routing of Breakpoint, Watchpoint, Software Step, and Vector Catch exceptions

Debug                                                           ELp when enabled from:
state   Lock(a)   Security state   SDD(b)   TDE(c)   KDE   D    EL0              EL1              EL2   EL3
Yes     X         X                X        X        X     X    -                -                -     -
No      TRUE      X                X        X        X     X    -                -                -     -
No      FALSE     Secure           1        X        X     X    -                -                n/a   -
No      FALSE     Secure           0        X        0     X    Secure EL1       -                n/a   -
No      FALSE     Secure           0        X        1     1    Secure EL1       -                n/a   -
No      FALSE     Secure           0        X        1     0    Secure EL1       Secure EL1       n/a   -
No      FALSE     Non-secure       X        0        0     X    Non-secure EL1   -                -     n/a
No      FALSE     Non-secure       X        0        1     1    Non-secure EL1   -                -     n/a
No      FALSE     Non-secure       X        0        1     0    Non-secure EL1   Non-secure EL1   -     n/a
No      FALSE     Non-secure       X        1        0     X    EL2              EL2              -     n/a
No      FALSE     Non-secure       X        1        1     1    EL2              EL2              -     n/a
No      FALSE     Non-secure       X        1        1     0    EL2              EL2              EL2   n/a

a. The value of (OSLSR_EL1.OSLK == '1' || DoubleLockStatus()).
b. If EL3 is not implemented, behavior is as if this is 0.
c. If HCR_EL2.TGE is 1, this bit is treated as being 1 other than for a direct read of MDCR_EL2. If EL2 is not implemented, behavior is
as if TDE is 0.
D2.7 Pseudocode description of debug exceptions


AArch64.DebugFault() returns a FaultRecord object that indicates that a memory access has generated a debug
exception.


The AArch64.Abort() function processes FaultRecord objects, as described in Abort exceptions on page D3-1719, 
and generates a debug exception. 


AArch64.Abort() calls one of the following: 
° AArch64.BreakpointException().

° AArch64.WatchpointException().

° AArch64.VectorCatchException().

° AArch64.SoftwareStepException().


These functions are defined in Chapter J1 ARMv8 Pseudocode. 
D2.8 Breakpoint Instruction exceptions
This section describes Breakpoint Instruction exceptions in an AArch64 translation regime.


The PE is using an AArch64 translation regime when it is executing either:
° In an Exception level that is using AArch64.
° At EL0 using AArch32 when EL1 is using AArch64.


For software executing in an Exception level that is using AArch64, a Breakpoint Instruction exception results from
the execution of an A64 BRK instruction. However, within the AArch64 EL1&0 translation regime, executing a T32
or A32 BKPT instruction at EL0 using AArch32 generates a Breakpoint Instruction exception.


For more information about the T32 and A32 BKPT instructions see: 
° Breakpoint instruction in the A32 and T32 instruction sets on page G2-3935.
° BKPT instructions as the first instruction in an IT block on page G2-3936. 


The following subsections describe Breakpoint Instruction exceptions in an AArch64 translation regime: 


° About Breakpoint Instruction exceptions. 

° Breakpoint instructions. 

° Exception syndrome information and preferred return address on page D2-1640.

° Pseudocode description of Breakpoint Instruction exceptions on page D2-1640.
D2.8.1 About Breakpoint Instruction exceptions 


A breakpoint is an event that results from the execution of an instruction, which is based on either: 
° The instruction address, the PE context, or both. This type of breakpoint is called a hardware breakpoint.


° The instruction itself. That is, the instruction is a breakpoint instruction. These can be included in the
program that the PE executes. This type of breakpoint is called a software breakpoint. 


Breakpoint Instruction exceptions, that this section describes, are software breakpoints. Breakpoint exceptions on 
page D2-1641 describes hardware breakpoints. 


There is no enable control for Breakpoint Instruction exceptions. They are always enabled, and cannot be masked. 


A Breakpoint Instruction exception is generated whenever a breakpoint instruction is committed for execution, 
regardless of all of the following: 


° The current Exception level. 
° The current Security state. 
° Whether the debug target Exception level, ELp, is using AArch64 or AArch32. 


Note 


° The debug target exception level, ELp, is the Exception level that debug exceptions are targeting. Routing 
debug exceptions on page D2-1631 describes how ELp is derived. 





° Debuggers using breakpoint instructions must be aware of the ARMv8 rules for concurrent modification and 
execution of instructions. See Concurrent modification and execution of instructions on page B2-83. 





D2.8.2 Breakpoint instructions 
The breakpoint instruction in the A64 instruction set is BRK #<immediate>. It is unconditional. 
For details of the instruction encoding, see BRK on page C6-475. 
The breakpoint instruction in the A32 and T32 instruction sets is BKPT #<immediate>. 


For more information about the A32 and T32 breakpoint instruction, see Breakpoint instruction in the A32 and T32 
instruction sets on page G2-3935. 
D2.8.3 Exception syndrome information and preferred return address


See the following: 
° Exception syndrome information. 


° Preferred return address. 


Exception syndrome information 


On taking a Breakpoint Instruction exception, the PE records information about the exception in the Exception 
Syndrome Register (ESR) at the Exception level the exception is taken to. The ESR used is one of: 


° ESR_EL1. 


° ESR_EL2.
° ESR_EL3.
Note 





Breakpoint Instruction exceptions are the only debug exception that can be taken to EL3 using AArch64. 





Table D2-10 shows the information that the PE records. 


Table D2-10 Information recorded in the ESR_ELx

ESR_ELx field            Information recorded in ESR_EL1, ESR_EL2, or ESR_EL3
Exception Class, EC      Whether the breakpoint instruction was executed in AArch64 state or AArch32 state.
                         The PE sets this to:
                         ° 0x3C for an A64 BRK instruction.
                         ° 0x38 for an A32 or T32 BKPT instruction.
Instruction Length, IL   The PE sets this to:
                         ° 0 for a 16-bit T32 BKPT instruction.
                         ° 1 for an A64 BRK instruction, or an A32 BKPT instruction.
Instruction Specific     ISS[24:16]   RES0.
Syndrome, ISS            ISS[15:0]    The PE copies the instruction Comment field value into here, zero
                                      extended as necessary.
Note 
° If debug exceptions are routed to EL2, it is the exception that is routed, not the instruction that is trapped. 


Therefore, if a Breakpoint Instruction exception is routed to EL2, ESR_EL2.EC is set to the same value as if 
the exception was taken to EL1. 


° For information about how debug exceptions can be routed to EL2, see Routing debug exceptions on
page D2-1631. 





Preferred return address 


The preferred return address is the address of the breakpoint instruction, not the next instruction. This is different 
to the behavior of other exception-generating instructions, like SVC. 
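As an illustration of how an exception handler might interpret the syndrome for a Breakpoint Instruction exception, the following sketch decodes the EC, IL, and ISS[15:0] fields from a 32-bit ESR_ELx value. It is a minimal sketch, not architectural pseudocode. The function name is hypothetical, and the field positions used here (EC in bits [31:26], IL in bit [25]) are taken from the ESR_ELx register description rather than from this section.

```python
def decode_breakpoint_instruction_esr(esr):
    """Decode a 32-bit ESR_ELx value for a Breakpoint Instruction exception.

    Returns a dict with the instruction kind, the IL bit, and the 16-bit
    comment field. Raises ValueError for any other Exception Class.
    """
    ec = (esr >> 26) & 0x3F   # Exception Class, bits [31:26]
    il = (esr >> 25) & 0x1    # Instruction Length, bit [25]
    comment = esr & 0xFFFF    # ISS[15:0], the instruction comment field

    kinds = {
        0x3C: "A64 BRK",
        0x38: "A32 or T32 BKPT",
    }
    if ec not in kinds:
        raise ValueError("not a Breakpoint Instruction exception: EC=0x%02X" % ec)
    return {"instruction": kinds[ec], "il": il, "comment": comment}
```

For example, the value 0xF2001234 decodes as an A64 BRK instruction with IL set to 1 and a comment field of 0x1234.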


D2.8.4 Pseudocode description of Breakpoint Instruction exceptions 


AArch64.SoftwareBreakpoint() generates a Breakpoint Instruction exception that is taken to AArch64 state. 


This function is defined in Chapter J1 ARMv8 Pseudocode. 
D2.9 Breakpoint exceptions
This section describes Breakpoint exceptions in stage 1 of an AArch64 translation regime.


The PE is using an AArch64 translation regime when it is executing either:
° In an Exception level that is using AArch64.
° At EL0 using AArch32 when EL1 is using AArch64.


This section contains the following subsections: 
° About Breakpoint exceptions. 


° Breakpoint types and linking of breakpoints on page D2-1642.
° Execution conditions for which a breakpoint generates Breakpoint exceptions on page D2-1648.
° Breakpoint instruction address comparisons on page D2-1650.
° Breakpoint context comparisons on page D2-1651.
° Breakpoint usage constraints on page D2-1651.
° Exception syndrome information and preferred return address on page D2-1655.

° Pseudocode description of Breakpoint exceptions taken from AArch64 state on page D2-1655. 
D2.9.1 About Breakpoint exceptions 


A breakpoint is an event that results from the execution of an instruction, which is based on either: 
° The instruction address, the PE context, or both. This type of breakpoint is called a hardware breakpoint. 


° The instruction itself. That is, the instruction is a breakpoint instruction. These can be included in the 
program that the PE executes. This type of breakpoint is called a software breakpoint. 


Breakpoint exceptions are generated by Breakpoint debug events. Breakpoint debug events are generated by 
hardware breakpoints. Software breakpoints are described in Breakpoint Instruction exceptions on page D2-1639. 


An implementation can include between 2 and 16 hardware breakpoints. ID_AA64DFR0_EL1.BRPs shows how many
are implemented.


To use an implemented hardware breakpoint, a debugger programs the following registers for the breakpoint: 


° The Breakpoint Control Register, DBGBCR<n>_EL1. This contains controls for the breakpoint, for example
an enable control. 


° The Breakpoint Value Register, DBGBVR<n>_EL1. This holds the value used for breakpoint matching, that 
is one of: 


— An instruction virtual address. 
— A Context ID.
— A VMID value. 


— A concatenation of both a Context ID value and a VMID value. 


These registers are numbered, so that: 
° DBGBCR1_EL1 and DBGBVR1_EL1 are for breakpoint number one.
° DBGBCR2_EL1 and DBGBVR2_EL1 are for breakpoint number two.


° DBGBCR<n>_EL1 and DBGBVR<n>_EL1 are for breakpoint number n.


A debugger can link a breakpoint that is programmed with an address and a breakpoint that is programmed with 
anything other than an address together, so that a Breakpoint debug event is only generated if both breakpoints 
match. 
For each instruction in the program flow, all of the breakpoints are tested. When a breakpoint is tested, it generates
a Breakpoint debug event if all of the following are true:


° The breakpoint is enabled. That is, the breakpoint enable control for it, DBGBCR<n>_EL1.E, is 1.
° The conditions specified in the DBGBCR<n>_EL1 are met.
° The comparison with the value held in the DBGBVR<n>_EL1 is successful.
° If the breakpoint is linked to another breakpoint, the comparisons made by that other breakpoint are also
successful.
° The instruction is committed for execution.


If all of these conditions are met, the breakpoint generates the Breakpoint debug event regardless of the following:
° Whether the instruction passes its condition code check.
° The instruction type.


If halting is allowed and EDSCR.HDE is 1, Breakpoint debug events cause entry to Debug state.


Otherwise, if debug exceptions are:


° Enabled, Breakpoint debug events generate Breakpoint exceptions.
° Disabled, Breakpoint debug events are ignored.


Note


The remainder of this Breakpoint exceptions section, including all subsections, describes breakpoints as generating
Breakpoint exceptions.


However, the behavior described also applies if breakpoints are causing entry to Debug state.


The debug exception enable controls on page D2-1630 describes the enable controls for Breakpoint debug events.


D2.9.2 Breakpoint types and linking of breakpoints


Each implemented breakpoint is one of the following: 


• A context-aware breakpoint. This is a breakpoint that can be programmed to generate a Breakpoint exception
on any one of the following:

— An instruction address match.
— A Context ID match, with the value held in the CONTEXTIDR_EL1.
— A VMID match, with the VMID value held in the VTTBR_EL2.
— Both a Context ID match and a VMID match.

• A breakpoint that is not context-aware. These can only be programmed to generate a Breakpoint exception
on an instruction address match.


ID_AA64DFR0_EL1.CTX_CMPs shows how many of the implemented breakpoints are context-aware
breakpoints. At least one implemented breakpoint must be context-aware. The context-aware breakpoints are the 
highest numbered breakpoints. 
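
As a rough illustration of how a debugger might work out which breakpoints are context-aware, the sketch below decodes the two ID fields named above. The bit positions used for BRPs ([15:12]) and CTX_CMPs ([31:28]), the convention that each field holds the count minus one, and zero-based breakpoint numbering are assumptions taken from the ID_AA64DFR0_EL1 register description rather than from this section. The helper name breakpoint_layout is hypothetical.

```python
def breakpoint_layout(id_aa64dfr0):
    """Return (number of breakpoints, indices of the context-aware ones).

    Assumes BRPs is bits [15:12] and CTX_CMPs is bits [31:28], each
    holding the implemented count minus one.
    """
    num_brps = ((id_aa64dfr0 >> 12) & 0xF) + 1
    num_ctx = ((id_aa64dfr0 >> 28) & 0xF) + 1
    if num_ctx > num_brps:
        raise ValueError("more context-aware breakpoints than breakpoints")
    # The context-aware breakpoints are the highest numbered ones.
    first_ctx = num_brps - num_ctx
    return num_brps, list(range(first_ctx, num_brps))


# Example: BRPs = 5 (six breakpoints), CTX_CMPs = 1 (two context-aware).
example = breakpoint_layout((5 << 12) | (1 << 28))
```

With these example values, breakpoints 4 and 5 are the context-aware ones, so only they could be programmed as Context breakpoints.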


Any breakpoint that is programmed to generate a Breakpoint exception on an instruction address match is 
categorized as an Address breakpoint. Breakpoints that are programmed to match on anything else are categorized 
as Context breakpoints. 


When a debugger programs a breakpoint to be an Address or a Context breakpoint, it must also program that 
breakpoint so that it is either: 


• Used in isolation. In this case, the breakpoint is called an Unlinked breakpoint.
• Enabled for linking to another breakpoint. In this case, the breakpoint is called a Linked breakpoint.

By linking an Address breakpoint and a Context breakpoint together, the debugger can create a breakpoint pair that
only generates a Breakpoint exception if the PE is in a particular context when an instruction address match occurs. 
For example, a debugger might: 


1. Program breakpoint number one to be a Linked Address Match breakpoint. 
2. Program breakpoint number five to be a Linked Context ID Match breakpoint. 


3. Link these two breakpoints together. A Breakpoint exception is only generated if both the instruction address
matches and the Context ID matches. 


The Breakpoint Type field for a breakpoint, DBGBCR<n>_EL1.BT, controls the breakpoint type and whether the 
breakpoint is enabled for linking. If BT[0] is 1, the breakpoint is enabled for linking. 
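
The linked-pair example above can be sketched in Python as follows. This is only an illustration: the field positions (E at bit 0, PMC at [2:1], BAS at [8:5], HMC at bit 13, SSC at [15:14], LBN at [19:16], BT at [23:20]) are assumed from the DBGBCR<n>_EL1 register description, not from this section, and make_dbgbcr and is_enabled_for_linking are hypothetical helper names.

```python
def make_dbgbcr(bt, lbn=0, ssc=0, hmc=0, bas=0b1111, pmc=0b11, enable=True):
    """Pack breakpoint control fields into a DBGBCR<n>_EL1 value.

    Field positions are an assumption taken from the register description.
    """
    fields = (("bt", bt, 4), ("lbn", lbn, 4), ("ssc", ssc, 2),
              ("hmc", hmc, 1), ("bas", bas, 4), ("pmc", pmc, 2))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return ((bt << 20) | (lbn << 16) | (ssc << 14) | (hmc << 13)
            | (bas << 5) | (pmc << 1) | int(enable))


def is_enabled_for_linking(bcr):
    """BT[0] == 1 means the breakpoint is enabled for linking."""
    return bool((bcr >> 20) & 1)


# Breakpoint 1: Linked Address Match (BT = 0b0001), linking to breakpoint 5.
address_bcr = make_dbgbcr(bt=0b0001, lbn=5)
# Breakpoint 5: Linked Context ID Match (BT = 0b0011).
context_bcr = make_dbgbcr(bt=0b0011)
```

The LBN field is only set on the Address breakpoint, because Linked Context breakpoints can only be linked to.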


Figure D2-1 shows all of the possible breakpoint types that stage 1 of an AArch64 translation regime supports, and 
their associated BT field values. 

Figure D2-1 Breakpoint types and their associated BT field values


If AArch32 state is implemented, Address breakpoints can be programmed to generate Breakpoint exceptions on 
addresses that are halfword-aligned but not word-aligned. This makes it possible to breakpoint on T32 instructions. 
See Specifying the halfword-aligned address that an Address breakpoint matches on on page D2-1650. 





Note

Stage 1 of an AArch32 translation regime supports two additional breakpoint types, Unlinked and Linked Address
Mismatch breakpoints, BT == 0b0100 and BT == 0b0101. For information about these, see Chapter G2 AArch32
Self-hosted Debug. These types are reserved in stage 1 of an AArch64 translation regime. See Reserved BT values
on page D2-1652.

Rules for linking breakpoints

The rules for breakpoint linking are as follows: 
• Only Linked breakpoint types can be linked.

• Any type of Linked Address breakpoint can link to any type of Linked Context breakpoint. The Linked
Breakpoint Number field, DBGBCR<n>_EL1.LBN, for the Linked Address breakpoint specifies the
particular Linked Context breakpoint that the Linked Address breakpoint links to, and:

— DBGBCR<n>_EL1.{SSC, HMC, PMC} for the Linked Address breakpoint define the execution
conditions that the breakpoint pair generates Breakpoint exceptions for. See Execution conditions for
which a breakpoint generates Breakpoint exceptions on page D2-1648.

— DBGBCR<n>_EL1.{SSC, HMC, PMC} for the Linked Context breakpoint are ignored.

• Linked Context breakpoint types can only be linked to. The LBN field for Context breakpoints is therefore
ignored.

• Linked Address breakpoints cannot link to watchpoints. The LBN field can therefore only specify another
breakpoint.

• If a Linked Address breakpoint links to a breakpoint that is not context-aware, the behavior of the Linked
Address breakpoint is CONSTRAINED UNPREDICTABLE. See Other usage constraints for Address breakpoints
on page D2-1654.

• If a Linked Address breakpoint links to an Unlinked Context breakpoint, the Linked Address breakpoint
never generates any Breakpoint exceptions.

• Multiple Linked Address breakpoints can link to a single Linked Context breakpoint.





Note 


Multiple Linked watchpoints can also link to a single Linked Context breakpoint. Watchpoint exceptions on 
page D2-1657 describes watchpoints. 





These rules mean that a single Linked Context breakpoint might be linked to by all, or any combination of, the 
following: 


° Multiple Linked Address Match breakpoints. 
° Multiple Linked watchpoints. 


It is also possible that a Linked Context breakpoint might have no breakpoints or watchpoints linked to it. 


Figure D2-2 on page D2-1645 shows an example of permitted breakpoint and watchpoint linking. 

Figure D2-2 The role of linking in Breakpoint and Watchpoint exception generation


In Figure D2-2, each Linked Address breakpoint can only generate a Breakpoint exception if the comparisons made
by both it, and the Linked Context breakpoint that it links to, are successful. Similarly, each Linked watchpoint can
only generate a Watchpoint exception if the comparisons made by both it, and the Linked Context breakpoint that
it links to, are successful.


Breakpoint types defined by DBGBCR<n>_EL1.BT


The following list provides more detail about each breakpoint type: 


0b0000, Unlinked Address Match breakpoint 


Generation of a Breakpoint exception depends on both: 


• DBGBCR<n>_EL1.{SSC, HMC, PMC}. These define the execution conditions for which
the breakpoint generates Breakpoint exceptions. See Execution conditions for which a
breakpoint generates Breakpoint exceptions on page D2-1648.

• A successful address match, as described in Breakpoint instruction address comparisons on
page D2-1650.


DBGBCR<n>_EL1.LBN for this breakpoint is ignored. 

0b0001, Linked Address Match breakpoint
Generation of a Breakpoint exception depends on all of the following: 


° DBGBCR<n>_EL1.{SSC, HMC, PMC} for this breakpoint. These define the execution 
conditions that the breakpoint generates Breakpoint exceptions for. See Execution conditions 
for which a breakpoint generates Breakpoint exceptions on page D2-1648. 


° A successful address match defined by this breakpoint, as described in Breakpoint instruction 
address comparisons on page D2-1650. 

° A successful context match defined by the Linked Context breakpoint that this breakpoint 
links to. 


DBGBCR<n>_EL1.LBN for this breakpoint selects the Linked Context breakpoint that this 
breakpoint links to. 
0b0010, Unlinked Context ID Match breakpoint 
BT == 0b0010 is a reserved value if the breakpoint is not a context-aware breakpoint. 
For context-aware breakpoints, generation of a Breakpoint exception depends on both: 


• DBGBCR<n>_EL1.{SSC, HMC, PMC}. These define the execution conditions for which
the breakpoint generates Breakpoint exceptions. See Execution conditions for which a
breakpoint generates Breakpoint exceptions on page D2-1648.

• A successful Context ID match, as described in Breakpoint context comparisons on
page D2-1651.

DBGBCR<n>_EL1.{LBN, BAS} for this breakpoint are ignored.


0b0011, Linked Context ID Match breakpoint 
BT == 0b0011 is a reserved value if the breakpoint is not a context-aware breakpoint. 
For context-aware breakpoints, one of the following applies: 


• If no Linked breakpoints or Linked watchpoints link to this breakpoint, then the breakpoint
does not generate any Breakpoint exceptions.

• Generation of a Breakpoint exception depends on both:

— A successful instruction address match, defined by a Linked Address breakpoint that
links to this breakpoint, see Breakpoint instruction address comparisons on
page D2-1650.

— A successful Context ID match defined by this breakpoint, as described in Breakpoint
context comparisons on page D2-1651.

• Generation of a Watchpoint exception depends on both:

— A successful data address match, defined by a Linked watchpoint that links to this
breakpoint, see Watchpoint data address comparisons on page D2-1662.

— A successful Context ID match defined by this breakpoint, as described in Breakpoint
context comparisons on page D2-1651.


DBGBCR<n>_EL1.{LBN, SSC, HMC, BAS, PMC} for this breakpoint are ignored. 


0b0100, Unlinked Address Mismatch breakpoint 


BT == 0b0100 is a reserved value in stage 1 of an AArch64 translation regime. See Reserved BT 
values on page D2-1652. 


0b0100, Unlinked Address Mismatch breakpoint on page G2-3943 describes the behavior of 
Address Mismatch breakpoints in stage 1 of an AArch32 translation regime. 

0b0101, Linked Address Mismatch breakpoint

BT == 0b0101 is a reserved value in stage 1 of an AArch64 translation regime. See Reserved BT
values on page D2-1652.

0b0101, Linked Address Mismatch breakpoint on page G2-3944 describes the behavior of Address
Mismatch breakpoints in stage 1 of an AArch32 translation regime.

0b1000, Unlinked VMID Match breakpoint

BT == 0b1000 is a reserved value if either:
• The breakpoint is not a context-aware breakpoint.
• EL2 is not implemented.


For context-aware breakpoints, generation of a Breakpoint exception depends on both: 


° DBGBCR<n>_EL1.{SSC, HMC, PMC}. These define the execution conditions for which 
the breakpoint generates Breakpoint exceptions. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page D2-1648. 


° A successful VMID match, as described in Breakpoint context comparisons on 
page D2-1651. 


DBGBCR<n>_EL1.{LBN, BAS} for this breakpoint are ignored. 


0b1001, Linked VMID Match breakpoint 


BT == 0b1001 is a reserved value if either:

• The breakpoint is not a context-aware breakpoint.
• EL2 is not implemented.

For context-aware breakpoints, one of the following applies: 


• If no Linked breakpoints or Linked watchpoints link to this breakpoint, then the breakpoint
does not generate any Breakpoint exceptions.

• Generation of a Breakpoint exception depends on both:

— A successful instruction address match, defined by a Linked Address Match
breakpoint that links to this breakpoint. See Breakpoint instruction address
comparisons on page D2-1650.

— A successful VMID match defined by this breakpoint, as described in Breakpoint
context comparisons on page D2-1651.

• Generation of a Watchpoint exception depends on both:

— A successful data address match, defined by a Linked watchpoint that links to this
breakpoint, see Watchpoint data address comparisons on page D2-1662.

— A successful VMID match defined by this breakpoint, as described in Breakpoint
context comparisons on page D2-1651.


DBGBCR<n>_EL1.{LBN, SSC, HMC, BAS, PMC} for this breakpoint are ignored. 


0b1010, Unlinked Context ID and VMID Match breakpoint 
BT == 0b1010 is a reserved value if either: 
° The breakpoint is not a context-aware breakpoint. 
° EL2 is not implemented. 


For context-aware breakpoints, generation of a Breakpoint exception depends on all of the 
following: 


° DBGBCR<n>_EL1.{SSC, HMC, PMC}. These define the execution conditions that the 
breakpoint generates a Breakpoint exception for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page D2-1648. 


° A successful Context ID match, as described in Breakpoint context comparisons on 
page D2-1651. 


° A successful VMID match. 


Breakpoint context comparisons on page D2-1651 describes the requirements for a successful 
Context ID match and a successful VMID match. 


DBGBCR<n>_EL1.{LBN, BAS} for this breakpoint are ignored. 

0b1011, Linked Context ID and VMID Match breakpoint

BT == 0b1011 is a reserved value if either:
• The breakpoint is not a context-aware breakpoint.
• EL2 is not implemented.

For context-aware breakpoints, one of the following applies:

• If no Linked breakpoints or Linked watchpoints link to this breakpoint, then the breakpoint
does not generate any Breakpoint exceptions.

• Generation of a Breakpoint exception depends on all of the following:

— A successful instruction address match, defined by a Linked Address breakpoint that
links to this breakpoint, see Breakpoint instruction address comparisons on
page D2-1650.

— A successful Context ID match defined by this breakpoint, as described in Breakpoint
context comparisons on page D2-1651.

— A successful VMID match defined by this breakpoint.

• Generation of a Watchpoint exception depends on all of the following:

— A successful data address match, defined by a Linked watchpoint that links to this
breakpoint, see Watchpoint data address comparisons on page D2-1662.

— A successful Context ID match defined by this breakpoint, as described in Breakpoint
context comparisons on page D2-1651.

— A successful VMID match defined by this breakpoint.


Breakpoint context comparisons on page D2-1651 describes the requirements for a successful 
Context ID match and a successful VMID match by this breakpoint. 


DBGBCR<n>_EL1.{LBN, SSC, HMC, BAS, PMC} for this breakpoint are ignored. 


Note 


See Reserved DBGBCR<n>_EL1.BT values on page D2-1652 for the behavior of breakpoints programmed with 
reserved BT values. 








D2.9.3 Execution conditions for which a breakpoint generates Breakpoint exceptions 


Each breakpoint can be programmed so that it only generates Breakpoint exceptions for certain execution 
conditions. For example, a breakpoint might be programmed to generate Breakpoint exceptions only when the PE 
is executing at EL0 in Secure state.


DBGBCR<n>_EL1.{SSC, HMC, PMC} define the execution conditions the breakpoint generates Breakpoint
exceptions for, as follows:

Security State Control, SSC
Controls whether the breakpoint generates Breakpoint exceptions only in Secure state, only in
Non-secure state, or in both Security states.

Note


This is determined by the Security state of the PE, not from the NS attribute returned by the 
translation of the virtual address on which the breakpoint is set. 





Higher Mode Control, HMC, and Privileged Mode Control, PMC 


HMC and PMC together control which Exception levels the breakpoint generates Breakpoint 
exceptions in. 


Table D2-11 on page D2-1649 shows the valid combinations of the values of HMC, SSC, and PMC, and for each 
combination shows which Exception levels breakpoints generate Breakpoint exceptions in. 





D2-1648 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


In the table:

Y or -    Means that a breakpoint programmed with the values of HMC, SSC, and PMC shown in that row:
          Y    Can generate Breakpoint exceptions in that Exception level.
          -    Cannot generate Breakpoint exceptions in that Exception level.
Res       Means that the combination of HMC, SSC, and PMC is reserved. See Reserved
          DBGBCR<n>_EL1.{SSC, HMC, PMC} values on page D2-1652.

Table D2-11 Summary of breakpoint HMC, SSC, and PMC encodings

HMC  SSC  PMC  Security state the      EL3(a)  EL2  EL1  EL0  No EL3  No EL3 and
               breakpoint is                                          no EL2
               programmed to match in
0    00   01   Both                    -       -    Y    -    -       -
0    00   10   Both                    -       -    -    Y    -       -
0    00   11   Both                    -       -    Y    Y    -       -
0    01   01   Non-secure              -       -    Y    -    Res     Res
0    01   10   Non-secure              -       -    -    Y    Res     Res
0    01   11   Non-secure              -       -    Y    Y    Res     Res
0    10   01   Secure                  -       -    Y    -    Res     Res
0    10   10   Secure                  -       -    -    Y    Res     Res
0    10   11   Secure                  -       -    Y    Y    Res     Res
1    00   01   Both                    Y       Y    Y    -    -       Res
1    00   11   Both                    Y       Y    Y    Y    -       Res
1    01   01   Non-secure              -       Y    Y    -    Res     Res
1    01   11   Non-secure              -       Y    Y    Y    Res     Res
1    10   00   Secure                  Y       -    -    -    Res     Res
1    10   01   Secure                  Y       -    Y    -    Res     Res
1    10   11   Secure                  Y       -    Y    Y    Res     Res
1    11   00   Non-secure              -       Y    -    -    Res if no EL2(b)











a. Debug exceptions are not generated at EL3 using AArch64. This means that these combinations of HMC, SSC, and PMC are only relevant 
if breakpoints cause entry to Debug state. Self-hosted debuggers must avoid combinations of HMC, SSC, and PMC that generate Breakpoint 
exceptions at EL3 using AArch64. 


b. This encoding is only reserved when EL2 is not implemented, regardless of whether EL3 is implemented. 


All combinations of HMC, SSC, and PMC that this table does not show are reserved. See Reserved 
DBGBCR<n>_EL1.{SSC, HMC, PMC} values on page D2-1652. 
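
One way a debugger might use Table D2-11 is as a lookup keyed on (HMC, SSC, PMC). The sketch below covers an implementation that includes both EL2 and EL3. Its entries are transcribed from the table as understood here and should be checked against the table before being relied on. TABLE_D2_11 and breakpoint_levels are hypothetical names.

```python
# (HMC, SSC, PMC) -> (Security state, Exception levels in which the
# breakpoint can generate Breakpoint exceptions). Assumes EL2 and EL3
# are both implemented; entries are a transcription, not authoritative.
TABLE_D2_11 = {
    (0, 0b00, 0b01): ("Both", {"EL1"}),
    (0, 0b00, 0b10): ("Both", {"EL0"}),
    (0, 0b00, 0b11): ("Both", {"EL1", "EL0"}),
    (0, 0b01, 0b01): ("Non-secure", {"EL1"}),
    (0, 0b01, 0b10): ("Non-secure", {"EL0"}),
    (0, 0b01, 0b11): ("Non-secure", {"EL1", "EL0"}),
    (0, 0b10, 0b01): ("Secure", {"EL1"}),
    (0, 0b10, 0b10): ("Secure", {"EL0"}),
    (0, 0b10, 0b11): ("Secure", {"EL1", "EL0"}),
    (1, 0b00, 0b01): ("Both", {"EL3", "EL2", "EL1"}),
    (1, 0b00, 0b11): ("Both", {"EL3", "EL2", "EL1", "EL0"}),
    (1, 0b01, 0b01): ("Non-secure", {"EL2", "EL1"}),
    (1, 0b01, 0b11): ("Non-secure", {"EL2", "EL1", "EL0"}),
    (1, 0b10, 0b00): ("Secure", {"EL3"}),
    (1, 0b10, 0b01): ("Secure", {"EL3", "EL1"}),
    (1, 0b10, 0b11): ("Secure", {"EL3", "EL1", "EL0"}),
    (1, 0b11, 0b00): ("Non-secure", {"EL2"}),
}


def breakpoint_levels(hmc, ssc, pmc):
    """Return (security state, levels), or None for a reserved combination."""
    return TABLE_D2_11.get((hmc, ssc, pmc))
```

A combination that is missing from the dictionary is reserved, which mirrors the rule that all combinations the table does not show are reserved.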

D2.9.4 Breakpoint instruction address comparisons


An address comparison is successful if bits [48:2] of the current instruction virtual address are equal to 
DBGBVR<n>_EL1[48:2]. 
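
The comparison can be sketched as a masked equality test. This is a minimal illustration of the rule above, with address_match as a hypothetical helper name. It does not model the BAS field or linking.

```python
# Bits [48:2] of a 64-bit value: bits 0 and 1 and everything above bit 48
# are excluded from the comparison.
ADDRESS_MASK = ((1 << 49) - 1) & ~0b11


def address_match(instruction_va, dbgbvr):
    """Compare bits [48:2] of the instruction address with DBGBVR<n>_EL1."""
    return (instruction_va & ADDRESS_MASK) == (dbgbvr & ADDRESS_MASK)
```

For example, address_match(0x1006, 0x1004) is true because bits [1:0] take no part in the comparison, while address_match(0x1008, 0x1004) is false.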


Note 


DBGBVR<n>_EL1 is a 64-bit register. The most significant bits of this register are sign-extension bits.
DBGBVR<n>_EL1[1:0] are RES0 and are ignored.








If EL1 is using AArch64 and EL0 is using AArch32, A32 and T32 instructions can be executed in stage 1 of an
AArch64 translation regime. In this case, the instruction addresses are zero-extended before comparison with the 
breakpoint. 


Specifying the halfword-aligned address that an Address breakpoint matches on 


For Address Match breakpoints, if the implementation supports AArch32 state, a debugger must program the Byte 
Address Selection field, DBGBCR<n>_EL1.BAS. 


Table D2-12 Programmable BAS values

BAS      Match instruction at   Constraint for debuggers
0b0011   DBGBVR<n>_EL1          Use for T32 instructions.
0b1100   DBGBVR<n>_EL1+2        Use for T32 instructions.
0b1111   DBGBVR<n>_EL1          Use for A64 and A32 instructions.





If the implementation is an AArch64-only implementation, all instructions are word-aligned and 
DBGBCR<n>_EL1.BAS is RES1. 
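
Following Table D2-12, a debugger might derive the value-register address and the BAS field from the instruction address along these lines. This is a sketch only. address_breakpoint_fields is a hypothetical helper, and the instruction set is assumed to be passed in as the string "A64", "A32" or "T32".

```python
def address_breakpoint_fields(instr_addr, instr_set):
    """Return (word-aligned address for DBGBVR<n>_EL1, BAS value)."""
    if instr_set in ("A64", "A32"):
        if instr_addr % 4:
            raise ValueError("A64 and A32 instructions are word-aligned")
        return instr_addr, 0b1111
    if instr_set == "T32":
        if instr_addr % 2:
            raise ValueError("T32 instructions are halfword-aligned")
        base = instr_addr & ~0b11
        # 0b0011 matches at the word-aligned address, 0b1100 at address + 2.
        return base, (0b0011 if instr_addr == base else 0b1100)
    raise ValueError(f"unknown instruction set {instr_set!r}")
```

For a T32 instruction at 0x8002 this yields the address 0x8000 with BAS 0b1100.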


Figure D2-3 on page D2-1651 shows a summary of when Address Match breakpoints programmed with particular 
BAS values generate Breakpoint exceptions. The figure contains four parts: 


• A column showing the row number, on the left.
• An instruction set and instruction size table.
• A location of instruction figure.
• A BAS field values table, on the right.

To use the figure, read across the rows. For example, row 7 shows that a breakpoint with DBGBCR<n>_EL1.BAS
programmed as either 0b0011 or 0b1111 generates Breakpoint exceptions for A64 instructions. A64 instructions are
always at word-aligned addresses.


Note 


To breakpoint on an A64 instruction, ARM recommends that the debugger programs DBGBCR<n>_EL1.BAS as 
0b1111. 








In the figure: 
Yes Means that the breakpoint generates a Breakpoint exception. 
No Means that the breakpoint does not generate a Breakpoint exception. 


UNP Means that it is CONSTRAINED UNPREDICTABLE whether the breakpoint generates a Breakpoint
exception. See Other usage constraints for Address breakpoints on page D2-1654. 

Row   Instruction set   Size     BAS[3:0]
                                 0b0011   0b1100   0b1111
1     T32               16-bit   Yes      No       Yes
2     T32               16-bit   No       Yes      UNP
3     T32               32-bit   UNP      No       UNP
4     T32               32-bit   Yes      UNP      Yes
5     T32               32-bit   No       Yes      UNP
6     A32               32-bit   Yes      UNP      Yes
7     A64               32-bit   Yes      UNP      Yes

[Location of instruction (a): the graphic showing where each instruction lies across byte offsets -2 to +5 is not reproduced.]

a. 0 means the word-aligned address held in the DBGBVR<n>_EL1[48:2]:00. The other locations are as follows:
-2 means ((DBGBVR<n>_EL1[48:2]:00) - 2).
-1 means ((DBGBVR<n>_EL1[48:2]:00) - 1).
...
+5 means ((DBGBVR<n>_EL1[48:2]:00) + 5).
The solid areas show the location of the instruction.

Figure D2-3 Summary of BAS field meanings for Address Match breakpoints

D2.9.5 Breakpoint context comparisons


The breakpoint type defined by DBGBCR<n>_EL1.BT determines what context comparison is required, if any.
Table D2-13 shows the BT values that require a comparison, and the match required for the comparison to be
successful.

Table D2-13 Breakpoint context comparison tests

DBGBCR<n>.BT   Test required for successful context comparison
0b001x         CONTEXTIDR_EL1 value matches DBGBVR<n>_EL1.ContextID value
0b100x         VTTBR_EL2.VMID value matches DBGBVR<n>_EL1.VMID value
0b101x         CONTEXTIDR_EL1 value matches DBGBVR<n>_EL1.ContextID value and
               VTTBR_EL2.VMID value matches DBGBVR<n>_EL1.VMID value





No context comparison is required for other valid DBGBCR<n>.BT values. 
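
The tests in Table D2-13 can be sketched as follows. The function and parameter names are hypothetical, and the current CONTEXTIDR_EL1 and VTTBR_EL2.VMID values are assumed to be passed in by the caller. The sketch ignores the Exception level and Security state restrictions on Context breakpoints.

```python
def context_match(bt, contextidr_el1, vttbr_vmid, bvr_context_id, bvr_vmid):
    """Return True/False for a context comparison, or None if BT needs none."""
    kind = bt >> 1  # drop BT[0], which only selects linked or unlinked
    if kind == 0b001:   # 0b001x, Context ID Match
        return contextidr_el1 == bvr_context_id
    if kind == 0b100:   # 0b100x, VMID Match
        return vttbr_vmid == bvr_vmid
    if kind == 0b101:   # 0b101x, Context ID and VMID Match
        return contextidr_el1 == bvr_context_id and vttbr_vmid == bvr_vmid
    return None         # no context comparison for this BT value
```

Returning None for the remaining BT values keeps "no comparison required" distinct from a failed comparison.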


Context breakpoints do not generate Breakpoint exceptions when execution is in EL2 using either Execution state, 
or when execution is in EL3 using AArch64. 


The following Context breakpoint types do not generate Breakpoint exceptions in Secure state: 
° VMID Match breakpoints. 
° VMID and Context ID Match breakpoints. 





Note

• For all Context breakpoints, DBGBCR<n>_EL1.BAS is RES1 and is ignored.
• For Linked Context breakpoints, DBGBCR<n>_EL1.{LBN, SSC, HMC, PMC} are RES0 and are ignored.





D2.9.6 Breakpoint usage constraints 


See the following sections: 
• Reserved DBGBCR<n>_EL1.BT values on page D2-1652.
• Reserved DBGBCR<n>_EL1.{SSC, HMC, PMC} values on page D2-1652.
• Reserved DBGBCR<n>_EL1.BAS values on page D2-1653.
• Reserved DBGBCR<n>_EL1.LBN values on page D2-1654.
• Other usage constraints for Address breakpoints on page D2-1654.
• Other usage constraints for Context breakpoints on page D2-1654.


Reserved DBGBCR<n>_EL1.BT values 


Table D2-14 shows when particular DBGBCR<n>_EL1.BT values are reserved. 


Table D2-14 Reserved BT values

BT value   Breakpoint type             Reserved
0b001x     Context ID Match            For non context-aware breakpoints
0b010x     Address Mismatch            In stage 1 of an AArch64 translation regime, or if EDSCR.HDE
                                       is 1 and halting is allowed
0b011x     -                           Always
0b100x     VMID Match                  For non context-aware breakpoints, or if EL2 is not implemented
0b101x     Context ID and VMID Match   For non context-aware breakpoints, or if EL2 is not implemented
0b11xx     -                           Always





If a breakpoint is programmed with one of these reserved BT values: 


• The breakpoint must behave as if it is either:

— Disabled.
— Programmed with a BT value that is not reserved, other than for a direct or external read of
DBGBCR<n>_EL1.

• For a direct or external read of DBGBCR<n>_EL1, if the reserved BT value:

— Has no function for any execution conditions, the value read back is UNKNOWN.
— Has a function for execution conditions other than the current execution conditions, the value read
back is the value written. This permits software to save and restore the BT value so that the breakpoint
functions for the other execution conditions.


The behavior of breakpoints with reserved BT values might change in future revisions of the architecture. For this 
reason, software must not rely on the behavior described here. 
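
A debugger that wants to avoid reserved BT values could check them before programming a breakpoint. The sketch below restates Table D2-14. bt_is_reserved and its parameters are hypothetical, and halting stands for "EDSCR.HDE is 1 and halting is allowed".

```python
def bt_is_reserved(bt, context_aware, el2_implemented,
                   aarch64_stage1=True, halting=False):
    """Return True if the 4-bit BT value is reserved per Table D2-14."""
    if not 0 <= bt <= 0b1111:
        raise ValueError("BT is a 4-bit field")
    if bt >> 2 == 0b11 or bt >> 1 == 0b011:   # 0b11xx and 0b011x
        return True
    if bt >> 1 == 0b001:                      # Context ID Match
        return not context_aware
    if bt >> 1 == 0b010:                      # Address Mismatch
        return aarch64_stage1 or halting
    if bt >> 1 in (0b100, 0b101):             # VMID, Context ID and VMID
        return not context_aware or not el2_implemented
    return False                              # 0b000x, Address Match
```

Because the architecture says software must not rely on reserved-value behavior, a check like this is best used to reject a configuration rather than to predict what the hardware does.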

Reserved DBGBCR<n>_EL1.{SSC, HMC, PMC} values


Table D2-15 shows when particular combinations of DBGBCR<n>_EL1.{SSC, HMC, PMC} are reserved in 
stage 1 of an AArch64 translation regime. 


Table D2-15 Reserved HMC, SSC, and PMC combinations

HMC, SSC, and PMC combination                                  Reserved
All combinations with SSC set to 0b01 or 0b10.                 When EL3 is not implemented and EL2 is implemented.
Any combination where HMC or SSC is nonzero, except for        When both of EL2 and EL3 are not implemented.
the combination with HMC set to 1, SSC set to 0b11, and PMC
set to 0b00.
The combination with HMC set to 1, SSC set to 0b11, and PMC    When EL2 is not implemented.
set to 0b00.
Combinations not included in Table D2-11 on page D2-1649.      Always.





For all breakpoints except Linked Context breakpoints, if a breakpoint is programmed with one of these reserved
combinations:

• If the reserved combination has a function for other execution conditions:

— The breakpoint must behave as if it is disabled.
— A direct or external read of DBGBCR<n>_EL1.{SSC, HMC, PMC} returns the values written. This
means that software can save and restore the combination so that the breakpoint can function for the
other execution conditions.

• If the reserved combination does not have a function for other execution conditions:

— It must behave either as if it is programmed with a combination that is not reserved or as if it is
disabled.
— A direct or external read of DBGBCR<n>_EL1.{SSC, HMC, PMC} returns UNKNOWN values.

If the breakpoint is a Linked Context breakpoint, then:

• The values of HMC, SSC, and PMC are ignored.
• A direct or external read of DBGBCR<n>_EL1.{SSC, HMC, PMC} returns UNKNOWN values.

The behavior of breakpoints with reserved combinations of HMC, SSC, and PMC might change in future revisions
of the architecture. For this reason, software must not rely on the behavior described here.


Reserved DBGBCR<n>_EL1.BAS values 


In an AArch64-only implementation, DBGBCR<n>_EL1.BAS for all breakpoints is RES1. 


Otherwise: 


For all Context breakpoints 


DBGBCR<n>_EL1.BAS is RES1 and is ignored. 


For all Address breakpoints 


Table D2-12 on page D2-1650 gives the valid values of the DBGBCR<n>_EL1.BAS field. 


If a breakpoint is programmed with a reserved BAS value: 


• The breakpoint must behave as if it is either:

— Disabled.
— Programmed with a BAS value that is not reserved, other than for a direct or external read of
DBGBCR<n>_EL1.

• A direct or external read of DBGBCR<n>_EL1.BAS returns an UNKNOWN value.

Software must not rely on these properties as the behavior of reserved values might change in a future revision of
the architecture.

Reserved DBGBCR<n>_EL1.LBN values


For all Context breakpoints 


DBGBCR<n>_EL1.LBN reads UNKNOWN and its value is ignored. 


For Linked Address breakpoints 


A Linked Address breakpoint must link to a context-aware breakpoint. For a Linked Address 
breakpoint, any DBGBCR<n>_EL1.LBN value that is not for a context-aware breakpoint is 
reserved. 


If a Linked Address breakpoint links to a breakpoint that is not implemented, or that is not 
context-aware, then reads of DBGBCR<n>_EL1.LBN return an UNKNOWN value and behavior is 
CONSTRAINED UNPREDICTABLE. The Linked Address breakpoint behaves as if it is either: 


° Disabled. 
° Linked to an UNKNOWN context-aware breakpoint. 


If a Linked Address breakpoint links to a breakpoint that is implemented and that is context-aware, 
but that is either not enabled or not programmed as a Linked Context breakpoint, it behaves as if it 
is disabled. 

For Unlinked Address breakpoints 
DBGBCR<n>_EL1.LBN reads UNKNOWN and its value is ignored. 
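
The LBN rules for a Linked Address breakpoint can be summarized in a small classifier. This is an illustrative sketch with hypothetical names. The set of context-aware breakpoint numbers and the state of the target breakpoint are assumed to be supplied by the caller.

```python
LINKED_CONTEXT_TYPES = {0b0011, 0b1001, 0b1011}  # Linked Context BT values


def linked_address_behavior(lbn, num_brps, context_aware,
                            target_enabled, target_bt):
    """Classify how a Linked Address breakpoint behaves for a given LBN."""
    if lbn >= num_brps or lbn not in context_aware:
        # Target is not implemented or not context-aware: the breakpoint
        # behaves as disabled or as linked to an UNKNOWN context-aware one.
        return "CONSTRAINED UNPREDICTABLE"
    if not target_enabled or target_bt not in LINKED_CONTEXT_TYPES:
        return "disabled"
    return "linked"
```

For example, with six breakpoints of which 4 and 5 are context-aware, an LBN of 2 gives "CONSTRAINED UNPREDICTABLE", and an LBN of 5 gives "linked" only if breakpoint 5 is enabled and programmed as a Linked Context breakpoint.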


Other usage constraints for Address breakpoints 


For all Address breakpoints 
• DBGBVR<n>_EL1[1:0] are RES0 and are ignored.
• If the implementation supports AArch32 state:


— For 32-bit instructions, if a breakpoint matches on the address of the second halfword 
but not the address of the first halfword, it is CONSTRAINED UNPREDICTABLE whether 
the breakpoint generates a Breakpoint exception. 


— If DBGBCR<n>.BAS is 0b1111, it is CONSTRAINED UNPREDICTABLE whether the 
breakpoint generates a Breakpoint exception for a T32 instruction starting at address 
((DBGBVR<n>[48:2]:00) + 2). For T32 instructions, ARM recommends that the 
debugger programs the BAS field with either 0b0011 or 0b1100. 


Other usage constraints for Context breakpoints 


For all Context breakpoints 
Any bits of DBGBVR<n>_EL1 that are not used to specify Context ID or VMID are RES0 and are
ignored.

For Linked Context breakpoints 


If no Linked Address breakpoints or Linked watchpoints link to a Linked Context breakpoint, the 
Linked Context breakpoint does not generate any Breakpoint exceptions. 

D2.9.7 Exception syndrome information and preferred return address


See the following: 
° Exception syndrome information. 


° Preferred return address. 


Exception syndrome information 


On taking a Breakpoint exception, the PE records information about the exception in the Exception Syndrome 
Register (ESR) at the Exception level the exception is taken to. The ESR used is one of: 


° ESR_EL1. 
° ESR_EL2. 





Note 
Breakpoint exceptions cannot be taken to EL3 using AArch64. 





Table D2-16 shows the information that the PE records. 


Table D2-16 Information recorded in the ESR_ELx

ESR_ELx field                        Information recorded in ESR_EL1 or ESR_EL2
Exception Class, EC                  The PE sets this to:
                                     • 0x30, if the exception was taken from a lower Exception level.
                                     • 0x31, if the exception was taken without a change of Exception level.
Instruction Length, IL               The PE sets this to 1.
Instruction Specific Syndrome, ISS   ISS[24:6]  RES0.
                                     ISS[5:0]   Instruction Fault Status Code (IFSC). The PE sets this to the
                                                code for a debug exception, 0b100010.





Preferred return address 


The preferred return address of a Breakpoint exception is the address of the instruction that was not executed 
because the PE took the Breakpoint exception instead. 


This means that the preferred return address is the address of the instruction that caused the exception. 
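
Returning to the syndrome fields in Table D2-16, an exception handler could recognize a Breakpoint exception roughly as follows. The positions of EC (bits [31:26]), IL (bit 25) and ISS (bits [24:0]) within ESR_ELx are assumed from the register description rather than stated in this section. decode_breakpoint_esr is a hypothetical helper.

```python
def decode_breakpoint_esr(esr):
    """Decode ESR_ELx for a Breakpoint exception, or return None otherwise."""
    ec = (esr >> 26) & 0x3F          # Exception Class, assumed bits [31:26]
    if ec not in (0x30, 0x31):
        return None                  # not a Breakpoint exception
    return {
        "from_lower_el": ec == 0x30,  # 0x31 means no change of Exception level
        "il": (esr >> 25) & 1,        # Instruction Length, assumed bit 25
        "ifsc": esr & 0x3F,           # ISS[5:0]
    }
```

For a Breakpoint exception taken from a lower Exception level, the decoded ifsc field would be expected to hold 0b100010.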


D2.9.8 Pseudocode description of Breakpoint exceptions taken from AArch64 state 


AArch64.BreakpointValueMatch() tests the value in DBGBVR<n>_EL1.


AArch64.StateMatch() tests the values in DBGBCR<n>_EL1.{SSC, HMC, PMC} and, if the breakpoint links to a 
Linked Context breakpoint, also tests the Linked Context breakpoint. 


For a watchpoint, AArch64.StateMatch() tests the values in DBGWCR<n>_EL1.{SSC, HMC, PAC} and, if the 
watchpoint links to a Linked Context breakpoint, also tests the Linked Context breakpoint. 


AArch64.BreakpointMatch() tests a committed instruction against all breakpoints.

AArch64.CheckBreakpoint() generates a Breakpoint exception if all of the following are true:

• MDSCR_EL1.MDE is 1.
• Debug exceptions are enabled from the current Exception level and Security state. See Enabling debug
exceptions from the current Exception level and Security state on page D2-1633.
• All of the conditions required for Breakpoint exception generation are met. See About Breakpoint exceptions
on page D2-1641.

Note

AArch64.CheckBreakpoint() might halt the PE and cause it to enter Debug state. External debug uses Debug state.

AArch64.BreakpointException() is called to generate a Breakpoint exception.


These functions are defined in Chapter J1 ARMv8 Pseudocode. 
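
For orientation only, the overall decision described in this section can be restated in Python. This is not the architectural pseudocode from Chapter J1. Every name below is hypothetical, and each input is a boolean that the caller is assumed to have already evaluated.

```python
def breakpoint_outcome(enabled, conditions_met, value_match, linked_match,
                       committed, halting_allowed, hde,
                       mde, debug_exceptions_enabled):
    """Return what happens when one breakpoint is tested for one instruction."""
    # A Breakpoint debug event needs every one of these to hold.
    # linked_match should be True when the breakpoint is not linked.
    if not (enabled and conditions_met and value_match
            and linked_match and committed):
        return "no debug event"
    if halting_allowed and hde:       # EDSCR.HDE is 1 and halting is allowed
        return "enter Debug state"
    if mde and debug_exceptions_enabled:
        return "Breakpoint exception"
    return "ignored"
```

Note that the condition code check and the instruction type do not appear as inputs, because the debug event is generated regardless of them.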

D2.10 Watchpoint exceptions


This section describes Watchpoint exceptions in stage 1 of an AArch64 translation regime. 


The PE is using an AArch64 translation regime when it is executing either: 
• In an Exception level that is using AArch64.
• At EL0 using AArch32 when EL1 is using AArch64.


This section contains the following subsections:

• About Watchpoint exceptions.
• Watchpoint types and linking of watchpoints on page D2-1659.
• Execution conditions for which a watchpoint generates Watchpoint exceptions on page D2-1660.
• Watchpoint data address comparisons on page D2-1662.
• Determining the memory location that caused a Watchpoint exception on page D2-1665.
• Watchpoint behavior on other instructions on page D2-1666.
• Watchpoint usage constraints on page D2-1667.
• Exception syndrome information and preferred return address on page D2-1669.
• Pseudocode description of Watchpoint exceptions taken from AArch64 state on page D2-1670.

D2.10.1 About Watchpoint exceptions

A watchpoint is an event that results from the execution of an instruction, based on a data address. Watchpoints are
also known as data breakpoints.

A watchpoint operates as follows: 

1. A debugger programs the watchpoint with a data address, or a data address range. 

2. The watchpoint generates a Watchpoint debug event on an access to the address, or any address in the address 
range. 

A watchpoint never generates a Watchpoint debug event on an instruction fetch. 

An implementation can include between 2 and 16 watchpoints. In an implementation, ID_AA64DFR0_EL1.WRPs
shows how many are implemented.

To use an implemented watchpoint, a debugger programs the following registers for the watchpoint: 

• The Watchpoint Control Register, DBGWCR<n>_EL1. This contains controls for the watchpoint, for
example an enable control.

• The Watchpoint Value Register, DBGWVR<n>_EL1. This holds the data virtual address used for watchpoint
matching.

These registers are numbered, so that:

• DBGWCR1_EL1 and DBGWVR1_EL1 are for watchpoint number one.

• DBGWCR2_EL1 and DBGWVR2_EL1 are for watchpoint number two.

• DBGWCR<n>_EL1 and DBGWVR<n>_EL1 are for watchpoint number n.

A watchpoint can:

• Be programmed to generate Watchpoint debug events on read accesses only, on write accesses only, or on
both types of access.

• Link to a Linked Context breakpoint, so that a Watchpoint debug event is only generated if the PE is in a
particular context when the address match occurs.
A single watchpoint can be programmed to match on one or more address bytes. A watchpoint generates a
Watchpoint debug event on an access to any byte that it is watching. The number of bytes a watchpoint is watching
is either:

• One to eight bytes, provided that these bytes are contiguous and that they are all in the same naturally-aligned
doubleword. A debugger uses the Byte Address Select field, DBGWCR<n>_EL1.BAS, to select the bytes.
See Programming a watchpoint with eight bytes or fewer on page D2-1663.

• Eight bytes to 2GB, provided that both of the following are true:
  - The number of bytes is a power-of-two.
  - The range starts at an address that is aligned to the range size.

  A debugger uses the MASK field, DBGWCR<n>_EL1.MASK, to program a watchpoint with eight bytes to
  2GB. See Programming a watchpoint with eight or more bytes on page D2-1664.

A debugger must use either the BAS field or the MASK field. If it uses both, whether the watchpoint generates
Watchpoint debug events is CONSTRAINED UNPREDICTABLE. See Programming dependencies of the BAS and MASK
fields on page D2-1668.

For each memory access, all of the watchpoints are tested. When a watchpoint is tested, it generates a Watchpoint
debug event if all of the following are true:

• The watchpoint is enabled. That is, the watchpoint enable control for it, DBGWCR<n>_EL1.E, is 1.

• The conditions specified in the DBGWCR<n>_EL1 are met.

• The comparison with the address held in the DBGWVR<n>_EL1 is successful.

• If the watchpoint links to a Linked Context breakpoint, the comparison or comparisons made by the Linked
Context breakpoint also are successful. See Figure D2-2 on page D2-1645. See also Breakpoint context
comparisons on page D2-1651.

• The instruction that initiates the memory access is committed for execution.

• The instruction that initiates the memory access passes its condition code check.

If halting is allowed and EDSCR.HDE is 1, Watchpoint debug events cause entry to Debug state.

Otherwise, if debug exceptions are:

• Enabled, Watchpoint debug events generate Watchpoint exceptions.
• Disabled, Watchpoint debug events are ignored.
Note 





The remainder of this Watchpoint Exceptions section, including all subsections, describes watchpoints as generating 
Watchpoint exceptions. 


However, the behavior described also applies if watchpoints are causing entry to Debug state. 





The debug exception enable controls on page D2-1630 describes the enable controls for Watchpoint debug events. 
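
The handling of a Watchpoint debug event described in this subsection can be summarized in a short sketch. This is illustrative only and is not the architectural pseudocode. The function name, the Outcome type, and the three inputs are hypothetical; in particular, halting_allowed and debug_exceptions_enabled stand in for conditions that are defined elsewhere in this manual.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of a Watchpoint debug event (names are illustrative)."""
    DEBUG_STATE = "entry to Debug state"
    WATCHPOINT_EXCEPTION = "Watchpoint exception"
    IGNORED = "ignored"


def watchpoint_event_outcome(halting_allowed, edscr_hde, debug_exceptions_enabled):
    """Decide what a Watchpoint debug event does, following the prose above."""
    # Halting takes priority: if halting is allowed and EDSCR.HDE is 1,
    # the event causes entry to Debug state.
    if halting_allowed and edscr_hde == 1:
        return Outcome.DEBUG_STATE
    # Otherwise the event becomes a Watchpoint exception only if debug
    # exceptions are enabled; if they are disabled, the event is ignored.
    if debug_exceptions_enabled:
        return Outcome.WATCHPOINT_EXCEPTION
    return Outcome.IGNORED
```

For example, watchpoint_event_outcome(False, 1, True) yields Outcome.WATCHPOINT_EXCEPTION, because halting is not allowed even though EDSCR.HDE is 1.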
D2.10.2 Watchpoint types and linking of watchpoints


When a debugger programs a watchpoint, it must program that watchpoint so that it is either:

• Used in isolation. In this case, the watchpoint is called an Unlinked watchpoint.

• Enabled for linking to a Linked Context breakpoint. In this case, the watchpoint is called a Linked watchpoint.


When a Linked watchpoint links to a Linked Context breakpoint, the Linked watchpoint only generates a 
Watchpoint exception if the PE is in a particular context when the data address match occurs. For example, a 
debugger might: 


1. Program watchpoint number one with a data address. 
2. Program breakpoint number five to be a Linked VMID Match breakpoint. 


3. Link the watchpoint and the breakpoint together. A Watchpoint exception is only generated if both the data 
address matches and the VMID matches. 


The Watchpoint Type field for a watchpoint, DBGWCR<n>_EL1.WT, controls whether the watchpoint is enabled 
for linking. If DBGWCR<n>_EL1.WT is 1, the watchpoint is enabled for linking. 
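
As an illustration of the linking example above, the following sketch packs a DBGWCR<n>_EL1 value for a Linked watchpoint that links to breakpoint number five. The bit positions used here (E at bit 0, PAC at [2:1], LSC at [4:3], BAS at [12:5], HMC at bit 13, SSC at [15:14], LBN at [19:16], WT at bit 20, MASK at [28:24]) are taken from the DBGWCR<n>_EL1 register description and are not stated in this section, so treat them as an assumption to check against that description. The function name and the chosen field values are hypothetical.

```python
# (name, least significant bit, width) for each DBGWCR<n>_EL1 field.
# Assumed layout; verify against the DBGWCR<n>_EL1 register description.
DBGWCR_FIELDS = (
    ("E", 0, 1),
    ("PAC", 1, 2),
    ("LSC", 3, 2),
    ("BAS", 5, 8),
    ("HMC", 13, 1),
    ("SSC", 14, 2),
    ("LBN", 16, 4),
    ("WT", 20, 1),
    ("MASK", 24, 5),
)


def pack_dbgwcr(**fields):
    """Pack named field values into a 32-bit control value; unnamed fields are 0."""
    value = 0
    for name, lsb, width in DBGWCR_FIELDS:
        field = fields.pop(name, 0)
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        value |= field << lsb
    if fields:
        raise ValueError(f"unknown fields: {sorted(fields)}")
    return value


# Linked watchpoint: enabled, loads and stores, all eight bytes selected,
# WT = 1 to enable linking, LBN = 5 to name breakpoint number five.
linked = pack_dbgwcr(E=1, PAC=0b11, LSC=0b11, BAS=0xFF, LBN=5, WT=1)
```

The breakpoint named by LBN must itself be programmed as a Linked Context breakpoint, as the rules that follow describe.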


Rules for linking watchpoints

The rules for watchpoint linking are as follows:

• Only Linked watchpoints can be linked.

• A Linked watchpoint can link to any type of Linked Context breakpoint. The Linked Breakpoint Number
field, DBGWCR<n>_EL1.LBN, for the Linked watchpoint specifies the particular Linked Context
breakpoint that the Linked watchpoint links to, and:
  - DBGWCR<n>_EL1.{SSC, HMC, PAC} for the Linked watchpoint defines the execution
    conditions that the watchpoint generates Watchpoint exceptions for. See Execution conditions for
    which a watchpoint generates Watchpoint exceptions on page D2-1660.
  - DBGBCR<n>_EL1.{SSC, HMC, PMC} for the Linked Context breakpoint are ignored.

• A Linked watchpoint cannot link to another watchpoint. The LBN field can therefore only specify a
breakpoint.

• If a Linked watchpoint links to a breakpoint that is not context-aware, the behavior of the Linked watchpoint
is CONSTRAINED UNPREDICTABLE. See Watchpoint usage constraints on page D2-1667.

• If a Linked watchpoint links to an Unlinked Context breakpoint, the Linked watchpoint never generates any
Watchpoint exceptions.

• Multiple Linked watchpoints can link to a single Linked Context breakpoint.

Note

Multiple Address breakpoints can also link to a single Linked Context breakpoint. Breakpoint exceptions on
page D2-1641 describes breakpoints.

Figure D2-2 on page D2-1645 shows an example of permitted watchpoint linking.







D2.10.3 Execution conditions for which a watchpoint generates Watchpoint exceptions 


Each watchpoint can be programmed so that it only generates Watchpoint exceptions for certain execution 
conditions. For example, a watchpoint might be programmed to generate Watchpoint exceptions only when the PE 
is executing at EL2 in Non-secure state. 


DBGWCR<n>_EL1.{SSC, HMC, PAC} define the execution conditions a watchpoint generates Watchpoint
exceptions for, as follows:

Security State Control, SSC

Controls whether the watchpoint generates Watchpoint exceptions only in Secure state, only in
Non-secure state, or in both Security states.

Note

This is determined by the Security state of the PE, not from the NS attribute returned by the
translation of the virtual address on which the watchpoint is set.

Higher Mode Control, HMC, and Privileged Access Control, PAC

HMC and PAC together control which Exception levels the watchpoint generates Watchpoint
exceptions in.

The PAC control relates to the privilege of the memory access, not to the Exception level at which
the access was made.

Note

This means that, if the PE executes a Load unprivileged or Store unprivileged instruction at EL1,
the resulting data access triggers a watchpoint only if both:

• PAC is programmed to a value that generates watchpoints on EL0 accesses.
• All other conditions for generating the watchpoint are met.

Example A64 Load unprivileged and Store unprivileged instructions are LDTR and STTR.

Table D2-17 on page D2-1661 shows the valid combinations of HMC, SSC, and PAC, and for each combination
shows which Exception levels watchpoints generate Watchpoint exceptions in.

In the table:

Y or -    Means that a watchpoint programmed with the values of HMC, SSC, and PAC shown in that row:
          Y    Can generate Watchpoint exceptions in that Exception level.
          -    Cannot generate Watchpoint exceptions in that Exception level.

Res       Means that the combination of HMC, SSC, and PAC is reserved. See Reserved
          DBGWCR<n>_EL1.{SSC, HMC, PAC} values on page D2-1667.







Table D2-17 Summary of watchpoint HMC, SSC, and PAC encodings

               Security state the                                 Implementation
               watchpoint is programmed
HMC  SSC  PAC  to match in               EL3(a)  EL2  EL1  EL0    No EL3  No EL3 and no EL2
0    00   01   Both                      -       -    Y    -      -       -
0    00   10                             -       -    -    Y      -       -
0    00   11                             -       -    Y    Y      -       -
0    01   01   Non-secure                -       -    Y    -      Res     Res
0    01   10                             -       -    -    Y      Res     Res
0    01   11                             -       -    Y    Y      Res     Res
0    10   01   Secure                    -       -    Y    -      Res     Res
0    10   10                             -       -    -    Y      Res     Res
0    10   11                             -       -    Y    Y      Res     Res
1    00   01   Both                      Y       Y    Y    -      -       Res
1    00   11                             Y       Y    Y    Y      -       Res
1    01   01   Non-secure                -       Y    Y    -      Res     Res
1    01   11                             -       Y    Y    Y      Res     Res
1    10   00   Secure                    Y       -    -    -      Res     Res
1    10   01                             Y       -    Y    -      Res     Res
1    10   11                             Y       -    Y    Y      Res     Res
1    11   00   Non-secure                -       Y    -    -      -       Res if no EL2(b)

a. Debug exceptions are not generated at EL3 using AArch64. This means that these combinations of HMC, SSC, and PAC are only relevant
if watchpoints cause entry to Debug state. Self-hosted debuggers must avoid combinations of HMC, SSC, and PAC that generate Watchpoint
exceptions at EL3 using AArch64.

b. This encoding is only reserved when EL2 is not implemented, regardless of whether EL3 is implemented.

All combinations of HMC, SSC, and PAC that this table does not show are reserved. See Reserved
DBGWCR<n>_EL1.{SSC, HMC, PAC} values on page D2-1667.
D2.10.4 Watchpoint data address comparisons

An address comparison is successful if bits [48:2] of the current data virtual address are equal to
DBGWVR<n>_EL1[48:2], taking into account all of the following:

• The size of the access. See Size of the data access.
  If EL1 is using AArch64 and EL0 is using AArch32, AArch32 instructions can be executed in stage 1 of an
  AArch64 translation regime. In this case, data addresses are zero-extended before comparison with the
  watchpoint.

• The bytes selected by DBGWCR<n>_EL1.BAS. See Programming a watchpoint with eight bytes or fewer
on page D2-1663.

• Any address ranges indicated by DBGWCR<n>_EL1.MASK. See Programming a watchpoint with eight or
more bytes on page D2-1664.

Note

• DBGWVR<n>_EL1 is a 64-bit register. The most significant bits of this register are sign-extension bits.

• DBGWVR<n>_EL1[1:0] are RES0 and are ignored.

Size of the data access

Because watchpoints can be programmed to generate Watchpoint exceptions on individual bytes, the size of each
data access must be taken into account. See Example D2-1.

Example D2-1 

1. A debugger programs a watchpoint to generate Watchpoint exceptions only when the byte at address 0x1009 
is accessed. 

2. The PE accesses the unaligned doubleword starting at address 0x1003. 

In this scenario, the watchpoint must generate a Watchpoint exception. 
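
The reasoning in Example D2-1 amounts to checking whether the set of bytes touched by the access overlaps the set of watched bytes. A minimal sketch of that check follows. The function name and its parameters are hypothetical, and the sketch ignores every other match condition (enable, execution conditions, linking, load or store type).

```python
def access_hits_watched_bytes(access_addr, access_size, watched_bytes):
    """Return True if an access of access_size bytes starting at access_addr
    touches at least one address in watched_bytes."""
    accessed = range(access_addr, access_addr + access_size)
    return any(addr in accessed for addr in watched_bytes)


# Example D2-1: the watched byte is 0x1009 and the PE accesses the unaligned
# doubleword at 0x1003, which covers addresses 0x1003 to 0x100A inclusive.
hit = access_hits_watched_bytes(0x1003, 8, {0x1009})
```

A word access at the same base address covers only 0x1003 to 0x1006, so it would not touch the watched byte.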

The size of data accesses initiated by DC ZVA instructions is the DC ZVA block size that DCZID_EL0.BS defines.

The size of data accesses initiated by DC IVAC instructions is an IMPLEMENTATION DEFINED size that is both:

• From the inclusive range between:
  - The size that CTR_EL0.DminLine defines.
  - 2KB.

• A power-of-two.

For both of these instructions:

• The lowest address accessed by the instruction is the address supplied to the instruction, rounded down to the
nearest multiple of the access size initiated by that instruction.

• The highest address accessed is (size - 1) bytes above the lowest address accessed.
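
The two rules above can be expressed as a short calculation. This is a sketch only: the function name is hypothetical, and the access size is passed in directly as a byte count rather than being derived from DCZID_EL0.BS or CTR_EL0.DminLine.

```python
def dc_access_range(supplied_addr, size):
    """Return (lowest, highest) addresses accessed by a DC ZVA or DC IVAC
    instruction, given the address supplied to it and its access size in bytes."""
    if size <= 0 or size & (size - 1):
        raise ValueError("access size must be a power of two")
    # Round the supplied address down to the nearest multiple of the size.
    lowest = supplied_addr & ~(size - 1)
    # The highest address is (size - 1) bytes above the lowest address.
    return lowest, lowest + size - 1
```

With a 64-byte access size, an instruction supplied with address 0x1234 accesses 0x1200 to 0x123F inclusive, so a watchpoint on any byte in that range is a candidate match.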

See also Watchpoint behavior on accesses by the DC IVAC instruction and the DC ZVA instruction on
page D2-1666.
Programming a watchpoint with eight bytes or fewer

The Byte Address Select field, DBGWCR<n>_EL1.BAS, selects which bytes in the doubleword starting at the
address contained in the DBGWVR<n>_EL1 the watchpoint generates Watchpoint exceptions for.

If the address programmed into the DBGWVR<n>_EL1 is:

• Doubleword-aligned:
  - All eight bits of DBGWCR<n>_EL1.BAS are used, and the descriptions given in Table D2-18 apply.

• Word-aligned but not doubleword-aligned:
  - Only DBGWCR<n>_EL1.BAS[3:0] are used, and the descriptions given in Table D2-19 apply. In this
    case, DBGWCR<n>_EL1.BAS[7:4] are RES0.


Table D2-18 Supported BAS values when the DBGWVR<n>_EL1 address alignment is doubleword

BAS value      Description
0b00000000     Watchpoint never generates a Watchpoint exception.
BAS[0] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:000 is accessed.
BAS[1] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:001 is accessed.
BAS[2] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:010 is accessed.
BAS[3] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:011 is accessed.
BAS[4] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:100 is accessed.
BAS[5] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:101 is accessed.
BAS[6] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:110 is accessed.
BAS[7] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:3]:111 is accessed.





Table D2-19 Supported BAS values when the DBGWVR<n>_EL1 address alignment is word

BAS value(a)   Description
0b00000000     Watchpoint never generates a Watchpoint exception.
BAS[0] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:2]:00 is accessed.
BAS[1] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:2]:01 is accessed.
BAS[2] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:2]:10 is accessed.
BAS[3] == 1    Generates a Watchpoint exception if the byte at address DBGWVR<n>_EL1[48:2]:11 is accessed.

a. DBGWCR<n>_EL1.BAS[7:4] are RES0.


If the BAS field is programmed with more than one byte, the bytes that it is programmed with must be contiguous. 
For watchpoint behavior when its BAS field is programmed with non-contiguous bytes, see Other usage constraints 
on page D2-1669. 


When programming the BAS field with anything other than 0b11111111, a debugger must program
DBGWCR<n>_EL1.MASK to be 0b00000. See Programming dependencies of the BAS and MASK fields on
page D2-1668.
A watchpoint generates a Watchpoint exception whenever a watched byte is accessed, even if:

• The access size is smaller or larger than the address region being watched.

• The access is misaligned, and the base address of the access is not in the doubleword or word of memory
addressed by the DBGWVR<n>_EL1[48:3]. See Example D2-1 on page D2-1662.

The following are some example configurations of the BAS field:

• To program a watchpoint to generate a Watchpoint exception on the byte at address 0x1003, program:
  - DBGWVR<n>_EL1 with 0x1000.
  - DBGWCR<n>_EL1.BAS to be 0b00001000.

• To program a watchpoint to generate a Watchpoint exception on the bytes at addresses 0x2003, 0x2004 and
0x2005, program:
  - DBGWVR<n>_EL1 with 0x2000.
  - DBGWCR<n>_EL1.BAS to be 0b00111000.

• If the address programmed into the DBGWVR<n>_EL1 is doubleword-aligned:
  - To generate a Watchpoint exception when any byte in the word starting at the doubleword-aligned
    address is accessed, program DBGWCR<n>_EL1.BAS to be 0b00001111.
  - To generate a Watchpoint exception when any byte in the word starting at address
    DBGWVR<n>_EL1[48:3]:100 is accessed, program DBGWCR<n>_EL1.BAS to be 0b11110000.
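
The example configurations above follow a simple pattern that a debugger might compute as sketched here. The function name is hypothetical, and the sketch assumes that the DBGWVR<n>_EL1 is always programmed with a doubleword-aligned address, which is the non-deprecated usage.

```python
def bas_for_bytes(first_byte_addr, count):
    """Return (dbgwvr, bas) that watch `count` contiguous bytes starting at
    first_byte_addr, all within one naturally-aligned doubleword."""
    dbgwvr = first_byte_addr & ~0x7          # doubleword-aligned base address
    offset = first_byte_addr - dbgwvr        # position of the first watched byte
    if count < 1 or offset + count > 8:
        raise ValueError("bytes must be contiguous and in one doubleword")
    bas = ((1 << count) - 1) << offset       # `count` ones, shifted to the offset
    return dbgwvr, bas
```

For the byte at 0x1003 this gives a base of 0x1000 and a BAS value of 0b00001000, matching the first example. A range that crosses a doubleword boundary is rejected, because a single BAS value cannot describe it.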





Note 


ARM deprecates programming a DBGWVR<n>_EL1 with an address that is not doubleword-aligned. 





Programming a watchpoint with eight or more bytes 


A debugger can use the MASK field, DBGWCR<n>_EL1.MASK, to program a single watchpoint with a data 
address range. The range must meet all of the following criteria: 


• It is a size that is:
  - A power-of-two.
  - A minimum of eight bytes.
  - A maximum of 2GB.

• It starts at an address that is aligned to the size.

The MASK field specifies the number of least significant data address bits that must be masked. Up to 31 least
significant bits can be masked:

MASK    0b00000    No bits are masked.
        0b00001    Reserved.
        0b00010    Reserved.
        0b00011    Three least significant bits are masked.
        0b00100    Four least significant bits are masked.
        0b00101    Five least significant bits are masked.
        0b11111    31 least significant bits are masked.
If n least significant address bits are masked, the watchpoint generates a Watchpoint exception on all of the
following:

• Address DBGWVR<n>_EL1[48:n]:000...
• Address DBGWVR<n>_EL1[48:n]:111...
• Any address between these two addresses.

For example, if the four least significant address bits are masked, Watchpoint exceptions are generated for all
addresses between DBGWVR<n>_EL1[48:4]:0000 and DBGWVR<n>_EL1[48:4]:1111, including these
addresses.
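
The relationship between the MASK value, the range size, and the watched addresses can be sketched as follows. Both function names are hypothetical. The sketch rejects a DBGWVR<n>_EL1 value with nonzero masked bits, because the behavior in that case is CONSTRAINED UNPREDICTABLE rather than something a sketch can model.

```python
def mask_for_range_size(size):
    """Return the MASK value for a range of `size` bytes (8 bytes to 2GB)."""
    if size < 8 or size > 2**31 or size & (size - 1):
        raise ValueError("size must be a power of two from 8 bytes to 2GB")
    return size.bit_length() - 1             # number of address bits to mask


def masked_range(dbgwvr, mask):
    """Return (lowest, highest) watched addresses for a given MASK value."""
    if not 3 <= mask <= 31:
        raise ValueError("MASK must be 0b00011 to 0b11111 to define a range")
    low_bits = (1 << mask) - 1
    if dbgwvr & low_bits:
        raise ValueError("masked address bits of the DBGWVR must be 0")
    return dbgwvr, dbgwvr | low_bits
```

A 16-byte range therefore uses MASK 0b00100, and with a base of 0x8000 the watched addresses are 0x8000 to 0x800F inclusive.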





Note

• The 17 most significant bits cannot be masked. This means that the full address cannot be masked.

• For watchpoint behavior when its MASK field is programmed with a reserved value, see Reserved
DBGWCR<n>_EL1.MASK values on page D2-1668.

When masking address bits, a debugger must both:

• Program DBGWCR<n>_EL1.BAS to be 0b11111111. See Programming dependencies of the BAS and MASK
fields on page D2-1668.

• In the DBGWVR<n>_EL1, set the masked address bits to 0. For watchpoint behavior when any of the
masked address bits are not 0, see Other usage constraints on page D2-1669.





D2.10.5 Determining the memory location that caused a Watchpoint exception

On taking a Watchpoint exception, the PE records an address in a Fault Address Register that the debugger can use
to determine the memory location that triggered the watchpoint.

The Fault Address Register (FAR) used is either:

• FAR_EL1, if the exception is taken to EL1.
• FAR_EL2, if the exception is taken to EL2.

In cases where one instruction triggers multiple watchpoints, only one address is recorded.

On entering Debug state on a Watchpoint debug event, the PE records the address in the EDWAR.

For more information, see the subsections that follow. These are:

• Address recorded for Watchpoint exceptions generated by instructions other than Data Cache instructions.
• Address recorded for Watchpoint exceptions generated by Data Cache instructions on page D2-1666.

Address recorded for Watchpoint exceptions generated by instructions other than Data Cache instructions

The address recorded must be both:

• From the inclusive range between:
  - The lowest address accessed by the memory access that triggered the watchpoint.
  - The highest watchpointed address accessed by the memory access. A watchpointed address is an
    address that the watchpoint is watching.

• Within a naturally-aligned block of memory that is all of the following:
  - A power-of-two size.
  - No larger than the DC ZVA block size.
  - Contains a watchpointed address accessed by the memory access.

The size of the block is IMPLEMENTATION DEFINED. There is no architectural means of discovering the size.
Example D2-2 Address recorded for a watchpoint programmed on 0x8019

A debugger programs a watchpoint to generate a Watchpoint exception on any access to the byte 0x8019.

An A32 load multiple instruction then loads nine registers starting from address 0x8004 upwards. This triggers the
watchpoint.

If the DC ZVA block size is:

• 32 bytes, the address that the PE records must be between 0x8004 and 0x8019 inclusive.
• 16 bytes, the address that the PE records must be between 0x8010 and 0x8019 inclusive.
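
The bounds in Example D2-2 can be reproduced with a short calculation. This sketch makes two simplifying assumptions that are not part of the architecture: the access touches exactly one watchpointed address, and the IMPLEMENTATION DEFINED block size equals the size passed in (the example uses the DC ZVA block size, which is the largest permitted value). The function name is hypothetical.

```python
def recorded_address_bounds(lowest_accessed, watchpointed_addr, block_size):
    """Return the inclusive (low, high) range that the recorded address must
    fall in, for a single watchpointed address and a given block size."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a power of two")
    # Naturally-aligned block that contains the watchpointed address.
    block_base = watchpointed_addr & ~(block_size - 1)
    # Lower bound: the lowest address accessed, clipped to the block.
    low = max(lowest_accessed, block_base)
    # Upper bound: the highest watchpointed address accessed, which is
    # always inside the block by construction.
    return low, watchpointed_addr
```

The load multiple starts at 0x8004, so a 32-byte block gives 0x8004 to 0x8019 and a 16-byte block gives 0x8010 to 0x8019, as the example states.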


Address recorded for Watchpoint exceptions generated by Data Cache instructions 


The address recorded is the address passed to the instruction. This means that the address recorded might be higher 
than the address of the location that triggered the watchpoint. 











D2.10.6 Watchpoint behavior on other instructions

Under normal operating conditions, the following do not generate Watchpoint exceptions:

• Instruction cache maintenance instructions.
• Address translation instructions.
• TLB maintenance instructions.
• Prefetch memory instructions.
• All data cache maintenance instructions except DC IVAC.

However, the debug architecture allows for IMPLEMENTATION DEFINED controls, such as those in ACTLR registers,
to enable watchpoints on an IMPLEMENTATION DEFINED subset of these instructions. Whether a watchpoint treats
the instruction as a load or a store, and the access size of instruction cache, address translation, and TLB operations
are IMPLEMENTATION DEFINED.

The access size of the IMPLEMENTATION DEFINED instruction cache, address translation, and TLB operations that
generate Watchpoint exceptions is IMPLEMENTATION DEFINED.

Note

The DC ZVA instruction is not a data cache maintenance instruction.

See also the following subsections:

• Watchpoint behavior on accesses by Store-Exclusive instructions.
• Watchpoint behavior on accesses by the DC IVAC instruction and the DC ZVA instruction.

Watchpoint behavior on accesses by Store-Exclusive instructions

If a watchpoint matches on a data access caused by a Store-Exclusive instruction, then:

• If the store fails because an exclusive monitor does not permit it, it is IMPLEMENTATION DEFINED whether the
watchpoint generates a Watchpoint exception.

• Otherwise, the watchpoint generates a Watchpoint exception.

Watchpoint behavior on accesses by the DC IVAC instruction and the DC ZVA instruction

DC IVAC and DC ZVA operations are treated as data stores. This means that for a watchpoint to match on an access
caused by one of these instructions, the debugger must program DBGWCR<n>_EL1.LSC to be one of the
following:

10    Match on data stores.
11    Match on data stores and data loads.


Note 
For the size of data accesses performed by the DC IVAC instruction and the DC ZVA instruction, see Watchpoint data 
address comparisons on page D2-1662. The size of all data accesses must be considered because watchpoints can 
be programmed to match on individual bytes. 








D2.10.7 Watchpoint usage constraints 


See the following:

• Reserved DBGWCR<n>_EL1.{SSC, HMC, PAC} values.
• Reserved DBGWCR<n>_EL1.LBN values.
• Programming dependencies of the BAS and MASK fields on page D2-1668.
• Reserved DBGWCR<n>_EL1.BAS values on page D2-1668.
• Reserved DBGWCR<n>_EL1.MASK values on page D2-1668.
• Other usage constraints on page D2-1669.


Reserved DBGWCR<n>_EL1.{SSC, HMC, PAC} values 


Table D2-20 shows when particular combinations of DBGWCR<n>_EL1.{SSC, HMC, PAC} are reserved. 


Table D2-20 Reserved SSC, HMC, and PAC combinations

HMC, SSC, and PAC combination                                   Reserved
All combinations with SSC set to 0b01 or 0b10.                  When EL3 is not implemented and EL2 is implemented.
All combinations where HMC or SSC is nonzero, except for the    When both of EL2 and EL3 are not implemented.
combination with HMC set to 1, SSC set to 0b11, and PAC set
to 0b00.
The combination with HMC set to 1, SSC set to 0b11, and PAC     When EL2 is not implemented.
set to 0b00.
Combinations not included in Table D2-17 on page D2-1661.       Always.

If a watchpoint is programmed with one of these reserved combinations:

• The watchpoint must behave as if it is either:
  - Disabled.
  - Programmed with a combination that is not reserved, other than for a direct or external read of
    DBGWCR<n>_EL1.

• For a direct or external read of DBGWCR<n>_EL1, if the reserved combination:
  - Has no function for any execution conditions, the value read back for each of SSC, HMC, and PAC
    is UNKNOWN.
  - Has a function for execution conditions other than the current execution conditions, the value read
    back is the value written. This permits software to save and restore the combination so that the
    watchpoint functions for the other execution conditions.

The behavior of watchpoints with reserved combinations of SSC, HMC, and PAC might change in future revisions
of the architecture. For this reason, software must not rely on the behavior described here.


Reserved DBGWCR<n>_EL1.LBN values 


For Linked watchpoints

A Linked watchpoint must link to a context-aware breakpoint. For a Linked watchpoint, any
DBGWCR<n>_EL1.LBN value that is not for a context-aware breakpoint is reserved.

If a Linked watchpoint links to a breakpoint that is not implemented, or that is not context-aware,
then reads of DBGWCR<n>_EL1.LBN return an UNKNOWN value and the behavior is
CONSTRAINED UNPREDICTABLE. The Linked watchpoint behaves as if it is either:

• Disabled.
• Linked to an UNKNOWN context-aware breakpoint.

If a Linked watchpoint links to a breakpoint that is implemented and is context-aware, but that is
either not enabled or not programmed as a Linked Context breakpoint, it behaves as if it is disabled.

For Unlinked watchpoints

For Unlinked watchpoints, DBGWCR<n>_EL1.LBN reads UNKNOWN and its value is ignored.


Programming dependencies of the BAS and MASK fields

When programming a watchpoint, a debugger must use either:

• The MASK field, to program the watchpoint with an address range that can be eight bytes to 2GB.

• The BAS field, to select which bytes in the doubleword or word starting at the address contained in the
DBGWVR<n>_EL1 the watchpoint must generate Watchpoint exceptions for.

If the debugger uses the:

• MASK field, it must program BAS to be 0b11111111, so that all bytes in the doubleword or word are selected.

• BAS field, it must program MASK to be 0b00000, so that the MASK field does not indicate any address
ranges.

If an enabled watchpoint has a MASK field that is non-zero and a BAS field that is not set to 0b11111111, then for
each byte in the address range, it is CONSTRAINED UNPREDICTABLE whether or not a Watchpoint exception
is generated.


Reserved DBGWCR<n>_EL1.BAS values 


The BAS field must be programmed with a value Zeros(8-n-m):Ones(n):Zeros(m), where:

• n is a non-zero positive integer less than or equal to 8.
• m is a positive integer less than 8.
• n+m is less than or equal to 8.


All other values are reserved. 
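
A debugger can check a candidate BAS value against the Zeros:Ones:Zeros pattern before writing it. The following is a sketch under two assumptions. First, m is allowed to be zero, because values such as 0b00001111 that appear elsewhere in this section have no trailing zeros. Second, the all-zeros value is rejected, because it has no run of ones and so does not fit the pattern, even though the BAS tables describe it as a value that never generates a Watchpoint exception. The function name is hypothetical.

```python
def bas_fits_pattern(bas):
    """Return True if bas is a single contiguous run of ones within 8 bits,
    that is, it has the form Zeros(8-n-m):Ones(n):Zeros(m) with n >= 1."""
    if not 0 < bas <= 0xFF:
        return False
    # Divide by the lowest set bit to shift out the m trailing zeros.
    run = bas // (bas & -bas)
    # A run of n ones is 2**n - 1, so adding 1 leaves a single bit set.
    return (run & (run + 1)) == 0
```

For example, 0b00111000 fits the pattern with n = 3 and m = 3, whereas 0b00000101 does not because its ones are not contiguous.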


Note 


If x is zero, then Zeros(x) is an empty bitstring. 








If DBGWVR<n>_EL1[2] is 1, DBGWCR<n>_EL1.BAS[7:4] are RES0 and are ignored.

If a watchpoint is programmed with a reserved BAS value:

• It is CONSTRAINED UNPREDICTABLE whether the watchpoint generates a Watchpoint exception for each byte
in the doubleword or word of memory addressed by the DBGWVR<n>_EL1.

• A direct or external read of DBGWCR<n>_EL1.BAS returns an UNKNOWN value.

Software must not rely on these properties as the behavior of reserved values might change in a future revision of
the architecture.

Reserved DBGWCR<n>_EL1.MASK values 

If a watchpoint is programmed with a reserved MASK value:

• The watchpoint must behave as if it is either:
  - Disabled.
  - Programmed with an UNKNOWN value that is not reserved, that might be 0b00000, other than for a direct
    or external read of DBGWCR<n>_EL1.

• A direct or external read of DBGWCR<n>_EL1.MASK returns an UNKNOWN value.


Other usage constraints 


For all watchpoints:

• DBGWVR<n>_EL1[1:0] are RES0 and are ignored.

• If DBGWCR<n>_EL1.MASK is nonzero, and any masked bits of DBGWVR<n>_EL1 are
not 0, it is CONSTRAINED UNPREDICTABLE whether the watchpoint generates a Watchpoint
exception when the unmasked bits match.

• A watchpoint never generates any Watchpoint exceptions if DBGWCR<n>_EL1.LSC is 0b00.

D2.10.8 Exception syndrome information and preferred return address

See the following:

• Exception syndrome information.
• Preferred return address on page D2-1670.

Exception syndrome information

On taking a Watchpoint exception, the PE records all of the following:

• Information about the exception in the Exception Syndrome Register (ESR) at the Exception level the
exception is taken to.

• An address that the debugger can use to determine the memory location that caused the exception. The PE
records this in a Fault Address Register (FAR).

The ESR and FAR used are either:

• ESR_EL1 and FAR_EL1, if the exception is taken to EL1.
• ESR_EL2 and FAR_EL2, if the exception is taken to EL2.

Note

Watchpoint exceptions cannot be taken to EL3 using AArch64.

Table D2-21 shows the recorded information.

Table D2-21 Information recorded in the ESR_ELx

ESR_ELx field            Information recorded in ESR_EL1 or ESR_EL2

Exception Class, EC      This is set to:
                         • 0x34, if the exception was taken from a lower Exception level.
                         • 0x35, if the exception was taken without a change of Exception level.

Instruction Length, IL   This is set to 1.

Instruction Specific     ISS[24]    Instruction Syndrome Valid (ISV). This is 0, because Watchpoint exceptions
Syndrome, ISS                       are not stage 2 aborts.
                         ISS[23:9]  RES0.
                         ISS[8]     Cache Maintenance (CM). This indicates whether a cache maintenance
                                    instruction generated the exception:
                                    0    Not generated by a cache maintenance instruction.
                                    1    Generated by a cache maintenance instruction.
                                    If a DC ZVA instruction generated the exception, CM is 0.
                         ISS[7]     RES0.
                         ISS[6]     Write-not-Read (WnR). This indicates whether the access was by a read
                                    instruction or a write instruction:
                                    0    Read instruction.
                                    1    Write instruction.
                         ISS[5:0]   Data Fault Status Code (DFSC). The PE sets this to the code for a debug
                                    exception, 0b100010.
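
A self-hosted debugger's exception handler might decode the syndrome fields in Table D2-21 as sketched here. The positions of EC at bits [31:26] and IL at bit [25] come from the ESR_ELx register description and are not stated in this section, so treat them as an assumption to check against that description. The function name and the returned dictionary keys are hypothetical.

```python
def decode_watchpoint_esr(esr):
    """Decode the Table D2-21 fields from an ESR_EL1 or ESR_EL2 value."""
    ec = (esr >> 26) & 0x3F      # Exception Class, assumed at bits [31:26]
    il = (esr >> 25) & 0x1       # Instruction Length, assumed at bit [25]
    iss = esr & 0x1FFFFFF        # Instruction Specific Syndrome, bits [24:0]
    if ec not in (0x34, 0x35):
        raise ValueError("not a Watchpoint exception syndrome")
    return {
        "from_lower_el": ec == 0x34,         # 0x35 means no change of level
        "il": il,
        "cache_maintenance": bool((iss >> 8) & 1),   # CM, ISS[8]
        "write": bool((iss >> 6) & 1),               # WnR, ISS[6]
        "dfsc": iss & 0x3F,                          # DFSC, ISS[5:0]
    }


# A store that hit a watchpoint, taken without a change of Exception level.
example = decode_watchpoint_esr((0x35 << 26) | (1 << 25) | (1 << 6) | 0b100010)
```

The handler would then read the corresponding FAR_ELx to find the recorded address.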

Preferred return address

The preferred return address of a Watchpoint exception is the address of the instruction that was not executed
because the PE took the Watchpoint exception instead.

This means that the preferred return address is the address of the instruction that caused the exception.

D2.10.9 Pseudocode description of Watchpoint exceptions taken from AArch64 state


AArch64.WatchpointByteMatch() tests an individual byte accessed by an operation. 


AArch64.StateMatch() tests the values in DBGWCR<n>_EL1.{HMC, SSC, PAC}, and if the watchpoint is Linked, 
also tests the Linked Context breakpoint that the watchpoint links to. 


AArch64.WatchpointMatch() tests the value in DBGWVR<n>_EL1.


AArch64.CheckWatchpoint() generates a FaultRecord, for which AArch64.Abort() raises a Watchpoint exception, if all of
the following are true:


• MDSCR_EL1.MDE is 1.


• Debug exceptions are enabled from the current Exception level and Security state. See Enabling debug
exceptions from the current Exception level and Security state on page D2-1633.


• All of the conditions required for Watchpoint exception generation are met. See About Watchpoint exceptions
on page D2-1657.


Note 
AArch64.CheckWatchpoint() might halt the PE and cause it to enter Debug state. External debug uses Debug state. 
AArch64.WatchpointException() is called to generate a Watchpoint exception.


These functions are defined in Chapter J1 ARMv8 Pseudocode. 
D2.11 Vector Catch exceptions


Vector Catch exceptions are not generated in AArch64 translation regimes. 


Note 


This means that they are never taken to EL1 using AArch64 and are only supported if at least EL1 using AArch32 
is supported. 








A debugger that is executing in EL2 using AArch64 can route Vector Catch exceptions to EL2 using AArch64. See 
Routing debug exceptions on page D2-1631. 


AArch64.VectorCatchException() is called to generate a Vector Catch exception. 


Vector Catch exceptions on page G2-3975 describes Vector Catch exceptions. 
D2.12 Software Step exceptions


The following subsections describe Software Step exceptions:


• About Software Step exceptions.
• Rules for setting MDSCR_EL1.SS to 1.
• The software step state machine on page D2-1674.
• Entering the active-not-pending state on page D2-1675.
• Behavior in the active-not-pending state on page D2-1679.
• Entering the active-pending state on page D2-1681.
• Behavior in the active-pending state on page D2-1681.
• Stepping T32 IT instructions on page D2-1682.
• Exception syndrome information and preferred return address on page D2-1682.
• Additional considerations on page D2-1684.
• Pseudocode description of Software Step exceptions on page D2-1686.


D2.12.1 About Software Step exceptions 
Software step is an ARMv8-A resource that a debugger can use to make the PE single-step instructions. 


For example, by using software step, debugger software executing at a higher Exception level can single-step 
instructions at a lower Exception level. 


Operation is as follows: 


1. A debugger: 


a. Enables software step by setting MDSCR_EL1.SS to 1. See The debug exception enable controls on 
page D2-1630. 


b. Executes an exception return instruction, ERET, to branch to the instruction to be single-stepped in the 
software being debugged. 


2. The PE then: 
a. Executes the instruction to be single-stepped. 


b. Takes a Software Step exception on the next instruction, returning control to the debugger. 


However, another exception might be generated while the instruction is being stepped. This exception is either:
• A synchronous exception that is generated by the instruction being stepped.


• An asynchronous exception that is taken before or after the instruction being stepped.


The PE can only take a Software Step exception if debug exceptions are enabled from the current Exception level 
and Security state. See Enabling debug exceptions from the current Exception level and Security state on 
page D2-1633. 


A state machine describes the behavior of software step, shown in The software step state machine on 
page D2-1674. 


Throughout this Software Step exceptions section, including in all subsections, ELp means the Exception level that 
Software Step exceptions are targeting. Routing debug exceptions on page D2-1631 defines ELp as the debug target 
Exception level. 


D2.12.2 Rules for setting MDSCR_EL1.SS to 1 


Debugger software must be executing in an Exception level and Security state that debug exceptions are disabled 
from when it sets MDSCR_EL1.SS to 1. 


The Exception level that hosts the debugger software must be using AArch64. 
D2.12.3 The software step state machine


In Figure D2-4:
• The OS Lock is unlocked and DoubleLockStatus() == FALSE.


• The PE is not in Secure state with MDCR_EL3.SDD set to 1. That is, the PE is in Non-secure state, or is in
Secure state with MDCR_EL3.SDD set to 0, or the implementation does not include EL3.


[Figure D2-4 is a state diagram. The states and transitions it shows are listed here.]

States:
• Inactive, PSTATE.SS=0. Execution is at either an Exception level that is higher than ELp, or ELp with
(PSTATE.D == 1 || MDSCR_EL1.KDE == 0). This is termed execution in a debugger or above.
• Active-not-pending, PSTATE.SS=1. Execution is in the software being debugged, at either an Exception level
that is lower than ELp, or ELp with (PSTATE.D == 0 && MDSCR_EL1.KDE == 1).
• Active-pending, PSTATE.SS=0. Execution is in the software being debugged, at either an Exception level
that is lower than ELp, or ELp with (PSTATE.D == 0 && MDSCR_EL1.KDE == 1). A Software Step exception is pending.

Transitions:
• Inactive to active-not-pending: by ERET setting PSTATE.SS to 1. To make the PE single-step an instruction, the
debugger:
1. Sets SPSR_ELx.SS to 1.
2. Programs the ELR_ELx to point to the instruction to be stepped.
3. Executes an ERET instruction.
• Active-not-pending to inactive: step completed (a).
• Active-not-pending to active-pending: step completed (b).
• Inactive to active-pending: by ERET setting PSTATE.SS to 0 (c).
• Active-pending to inactive: by an asynchronous exception taken to an Exception level that debug exceptions are
disabled from.
• Active-pending to inactive: by a Software Step exception. Execution has returned to the debugger.

a. The step is the PE either:
• Taking an exception to an Exception level that debug exceptions are disabled from.
• If execution is at ELp with MDSCR_EL1.KDE == 1, executing an instruction that sets PSTATE.D to 1.
Software step is inactive when debug exceptions are disabled from the current Exception level, and debug
exceptions are disabled from ELp when PSTATE.D is 1.
b. The step is the PE either:
• Executing the instruction to be stepped without taking an exception.
• Taking an exception to an Exception level that debug exceptions are enabled from. The Exception level might be
using AArch64 or AArch32.
c. Or, if execution is at ELp with MDSCR_EL1.KDE == 1, by software setting PSTATE.D to 0.


Figure D2-4 Software step state machine
For a description of when debug exceptions are enabled or disabled from an Exception level, see Enabling debug
exceptions from the current Exception level and Security state on page D2-1633.


For more information about how a step is completed, see Behavior in the active-not-pending state on page D2-1679.
The software step states are:


Inactive Software step is inactive. It cannot generate any Software Step exceptions or affect PE execution.
Software step is inactive whenever any of the following are true:
• MDSCR_EL1.SS is 0.
• ELp is using AArch32.
• Debug exceptions are disabled from the current Exception level or Security state.
Active-not-pending
None of the conditions mentioned in Inactive are true, therefore software step is active.
The current instruction is the instruction to be stepped.
Active-pending
None of the conditions mentioned in Inactive are true, therefore software step is active.
A Software Step exception is pending on the current instruction.


Whenever software step is active, whether the state machine is in the active-not-pending state or the active-pending
state depends on PSTATE.SS. Table D2-22 shows this.


Table D2-22 State machine states

ELp using   Debug exceptions from the    MDSCR_EL1.SS   PSTATE.SS   State machine state
            current Exception level
AArch32     x                            x              x           Inactive
AArch64     Disabled                     x              x           Inactive
AArch64     Enabled                      0              x           Inactive
AArch64     Enabled                      1              1           Active-not-pending
AArch64     Enabled                      1              0           Active-pending
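The rows of Table D2-22 can be read as a small decision function. The following is a sketch only. The names StepState and software_step_state are hypothetical, and the Boolean inputs stand in for conditions that the PE evaluates from register state.

```python
from enum import Enum


class StepState(Enum):
    INACTIVE = "Inactive"
    ACTIVE_NOT_PENDING = "Active-not-pending"
    ACTIVE_PENDING = "Active-pending"


def software_step_state(elp_is_aarch64: bool, debug_enabled: bool,
                        mdscr_ss: int, pstate_ss: int) -> StepState:
    """Return the state machine state for one row of Table D2-22."""
    # Any one of these conditions makes software step inactive.
    if not elp_is_aarch64 or not debug_enabled or mdscr_ss == 0:
        return StepState.INACTIVE
    # Software step is active, so PSTATE.SS selects between the active states.
    if pstate_ss == 1:
        return StepState.ACTIVE_NOT_PENDING
    return StepState.ACTIVE_PENDING
```

For example, software_step_state(True, True, 1, 0) gives StepState.ACTIVE_PENDING, which matches the last row of the table.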





D2.12.4 Entering the active-not-pending state 


Software step can only enter the active-not-pending state from the inactive state. 
Software step: 


• Enters the active-not-pending state when an ERET instruction writes 1 to PSTATE.SS, by copying from
SPSR_ELx.SS when it restores PSTATE.


• Might enter the active-not-pending state on exiting Debug state when DSPSR_EL0.SS or DSPSR.SS is 1.
See Exiting Debug state on page H2-4880.


An ERET instruction only copies 1 from SPSR_ELx.SS to PSTATE.SS if all of the following are true:
• MDSCR_EL1.SS is 1.
• ELp is using AArch64.
• Debug exceptions are disabled from the current Exception level.
• Debug exceptions are enabled from the Exception level that the ERET instruction targets.


Otherwise, ERET instructions set PSTATE.SS to 0, regardless of the value of SPSR_ELx.SS. 
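The four conditions above reduce to a single predicate. The following sketch expresses that rule. The function name eret_pstate_ss and its parameters are hypothetical, and the enable status at each Exception level is passed in rather than derived from register state.

```python
def eret_pstate_ss(spsr_ss: int, mdscr_ss: int, elp_is_aarch64: bool,
                   debug_enabled_at_current_el: bool,
                   debug_enabled_at_target_el: bool) -> int:
    """Return the value an ERET writes to PSTATE.SS under the rule above."""
    may_copy = (
        mdscr_ss == 1
        and elp_is_aarch64
        and not debug_enabled_at_current_el   # the debugger side is disabled
        and debug_enabled_at_target_el        # the debugged side is enabled
    )
    # SPSR_ELx.SS is copied only when every condition holds, otherwise 0.
    return (spsr_ss & 1) if may_copy else 0
```

Under this model an ERET executed where debug exceptions are already enabled always writes 0, which corresponds to the case where the ERET is itself being stepped.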
Table D2-23 shows this. In the table:

Lock Means the value of (OSLSR_EL1.OSLK == '1' || DoubleLockStatus()).
NS Is SCR_EL3.NS.
SDD Is MDCR_EL3.SDD. See Disabling debug exceptions from Secure state on page D2-1634.
TDE Is MDCR_EL2.TDE. See Routing debug exceptions on page D2-1631.

Table D2-23 Value an ERET writes to PSTATE.SS

MDSCR_EL1.SS  Lock   NS  SDD  TDE  EL1 is using  EL2 is using  Value an ERET writes to PSTATE.SS
0             x      x   x    x    x             x             0
1             TRUE   x   x    x    x             x             0
1             FALSE  0   1    x    x             n/a           0
1             FALSE  0   0    x    AArch32       n/a           0
1             FALSE  0   0    x    AArch64       n/a           See Table D2-24 on page D2-1677
1             FALSE  1   x    0    AArch32       x             0
1             FALSE  1   x    0    AArch64       AArch64       See Table D2-24 on page D2-1677
1             FALSE  1   x    1    AArch32       AArch32       0
1             FALSE  1   x    1    x             AArch64       See Table D2-25 on page D2-1678

For:
• SCR_EL3.NS == 0 or MDCR_EL2.TDE == 0, and EL1 using AArch64, so that ELp is EL1 using AArch64,
Table D2-24 on page D2-1677 shows the value an ERET writes to PSTATE.SS.
• SCR_EL3.NS == 1 and MDCR_EL2.TDE == 1 and EL2 using AArch64, so that ELp is EL2 using AArch64,
Table D2-25 on page D2-1678 shows the value an ERET writes to PSTATE.SS.
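The selection that Table D2-23 makes can be sketched as a lookup function. This is an illustration only. The function name table_d2_23 and the string results are hypothetical, and the sketch assumes that EL2 is implemented whenever NS is 1, because the table gives no row for other cases.

```python
def table_d2_23(mdscr_ss: int, lock: bool, ns: int, sdd: int, tde: int,
                el1_is_aarch64: bool, el2_is_aarch64: bool) -> str:
    """Return "0", or the name of the table that gives the value written."""
    if mdscr_ss == 0 or lock:
        return "0"
    if ns == 0:
        # Secure state: EL2 is n/a, and SDD == 1 disables debug exceptions.
        if sdd == 1:
            return "0"
        return "Table D2-24" if el1_is_aarch64 else "0"
    if tde == 0:
        # Non-secure, not routed to EL2: ELp is EL1.
        return "Table D2-24" if el1_is_aarch64 else "0"
    # Non-secure, routed to EL2: ELp is EL2.
    return "Table D2-25" if el2_is_aarch64 else "0"
```

Returning the table name instead of a value keeps the sketch faithful to the source, where the final value still depends on the From EL, Target EL, KDE, and D bit columns of the later tables.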
In both tables:
From EL Means the Exception level at which the PE executes the ERET instruction.
Target EL Is the target Exception level of the ERET.
Note
If the ERET is an illegal exception return, the target Exception level of the ERET is the current
Exception level. See Illegal return events from AArch64 state on page D1-1537.
KDE Is MDSCR_EL1.KDE. See Enabling debug exceptions from the current Exception level and
Security state on page D2-1633.
Table D2-24 Value an ERET writes to PSTATE.SS if ELp is EL1 using AArch64

                                                Software step enable status at:
From EL  Target EL  KDE  PSTATE.D  SPSR_ELx.D   From EL      Target EL   Value an ERET writes to PSTATE.SS
EL3      EL3        x    x         x            Disabled     Disabled    0
EL3      EL2        x    x         x            Disabled     Disabled    0
EL3      EL1        0    x         x            Disabled     Disabled    0
EL3      EL1        1    x         1            Disabled     Disabled    0
EL3      EL1        1    x         0            Disabled     Enabled     SPSR_EL3.SS
EL3      EL0        x    x         x            Disabled     Enabled     SPSR_EL3.SS
EL2      EL2        x    x         x            Disabled     Disabled    0
EL2      EL1        0    x         x            Disabled     Disabled    0
EL2      EL1        1    x         1            Disabled     Disabled    0
EL2      EL1        1    x         0            Disabled     Enabled     SPSR_EL2.SS
EL2      EL0        x    x         x            Disabled     Enabled     SPSR_EL2.SS
EL1      EL1        0    x         x            Disabled     Disabled    0
EL1      EL1        1    0         x            Enabled (a)  - (b)       0
EL1      EL1        1    1         1            Disabled     Disabled    0
EL1      EL1        1    1         0            Disabled     Enabled     SPSR_EL1.SS
EL1      EL0        0    x         x            Disabled     Enabled     SPSR_EL1.SS
EL1      EL0        1    0         x            Enabled (a)  Enabled     0
EL1      EL0        1    1         x            Disabled     Enabled     SPSR_EL1.SS

a. Because MDSCR_EL1.SS == 1, it means that the ERET is itself being stepped.
b. Depends on SPSR_EL1.D.
Table D2-25 Value an ERET writes to PSTATE.SS if ELp is EL2 using AArch64

                                                Software step enable status at:
From EL  Target EL  KDE  PSTATE.D  SPSR_ELx.D   From EL      Target EL   Value an ERET writes to PSTATE.SS
EL3      EL3        x    x         x            Disabled     Disabled    0
EL3      EL2        0    x         x            Disabled     Disabled    0
EL3      EL2        1    x         1            Disabled     Disabled    0
EL3      EL2        1    x         0            Disabled     Enabled     SPSR_EL3.SS
EL3      EL1        x    x         x            Disabled     Enabled     SPSR_EL3.SS
EL3      EL0        x    x         x            Disabled     Enabled     SPSR_EL3.SS
EL2      EL2        0    x         x            Disabled     Disabled    0
EL2      EL2        1    0         x            Enabled (a)  - (b)       0
EL2      EL2        1    1         1            Disabled     Disabled    0
EL2      EL2        1    1         0            Disabled     Enabled     SPSR_EL2.SS
EL2      EL1        0    x         x            Disabled     Enabled     SPSR_EL2.SS
EL2      EL1        1    0         x            Enabled (a)  Enabled     0
EL2      EL1        1    1         x            Disabled     Enabled     SPSR_EL2.SS
EL2      EL0        0    x         x            Disabled     Enabled     SPSR_EL2.SS
EL2      EL0        1    0         x            Enabled (a)  Enabled     0
EL2      EL0        1    1         x            Disabled     Enabled     SPSR_EL2.SS
EL1      EL1        x    x         x            Enabled (a)  Enabled     0
EL1      EL0        x    x         x            Enabled (a)  Enabled     0

a. Because MDSCR_EL1.SS == 1, it means that the ERET is itself being stepped.
b. Depends on SPSR_EL2.D.


Note
No AArch32 instruction can set PSTATE.SS to 1.










D2.12.5 Behavior in the active-not-pending state 


In this state, the PE does one of the following: 


• Executes the instruction to be stepped and either:


— Completes it without taking a synchronous exception.


— Takes a synchronous exception if the instruction generates one.


• Takes an asynchronous exception without executing any instructions.


• Enters Debug state because of a Halting debug event.


If the PE executes the instruction to be stepped without taking any exceptions, then either of the following occurs
after the instruction has been executed:


• The instruction has disabled debug by setting PSTATE.D to 1 and software step advances to the inactive state.


• The PE sets PSTATE.SS to 0 and software step advances to the active-pending state. See Behavior in the
active-pending state on page D2-1681.


If the PE takes either a synchronous or an asynchronous exception, behavior is as described in one of the following:


• If the PE takes an exception to an Exception level that is using AArch64.


• If the PE takes an exception to an Exception level that is using AArch32 on page D2-1680.


If the PE enters Debug state because of a Halting debug event, behavior is as described in Entering Debug state and 
Software Step on page H2-4854. 


If the PE takes an exception to an Exception level that is using AArch64 


As part of exception entry, the PE does all of the following: 


• Sets SPSR_ELx.SS to 0 or 1, depending on the exception. See Table D2-26.


• Sets PSTATE.SS to 0. This causes software step to enter either the active-pending state or the inactive state,
depending on whether debug exceptions are enabled or disabled from the Exception level that the exception
is taken to:


Enabled Software step enters the active-pending state.


Disabled Software step enters the inactive state.


In either case, on taking the exception, a step is complete.


• Sets PSTATE.D to 1.


Table D2-26 Categorization of exceptions, for setting SPSR_ELx.SS to 0 or 1

Exception description                        Exceptions                                 SPSR_ELx.SS
Exceptions whose preferred return address    Supervisor Call (SVC) exceptions.          0
is for the instruction that follows the      Hypervisor Call (HVC) exceptions.
instruction to be stepped.                   Secure Monitor Call (SMC) exceptions.
Exceptions whose preferred return address    All other synchronous exceptions, and      1
is the address of the instruction to be      asynchronous exceptions that are taken
stepped.                                     before the instruction to be stepped.


Note
If an SMC instruction executed at Non-secure EL1 is trapped to EL2 because HCR_EL2.TSC is 1, the exception is a
Trap exception, not a Secure Monitor Call exception, and so SPSR_ELx.SS is set to 1, not 0.
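The categorization in Table D2-26, together with the effect of exception entry on the state machine, can be sketched as two small functions. The names spsr_ss_on_exception_entry and next_state_after_exception, and the string labels for exception kinds, are hypothetical and are used only for illustration.

```python
# Exceptions whose preferred return address is the instruction after the
# stepped instruction, so the step is already complete on return.
CALL_EXCEPTIONS = {"SVC", "HVC", "SMC"}


def spsr_ss_on_exception_entry(exception_kind: str) -> int:
    """Return the SPSR_ELx.SS value for an exception taken to AArch64."""
    return 0 if exception_kind in CALL_EXCEPTIONS else 1


def next_state_after_exception(debug_enabled_at_target: bool) -> str:
    """PSTATE.SS is set to 0, so the next state depends on the target level."""
    return "active-pending" if debug_enabled_at_target else "inactive"
```

An SMC that is trapped to EL2 would be passed with a different label, for example "TRAP", and so yields 1, in line with the Note above.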
If the PE takes an exception to an Exception level that is using AArch32
This can only happen when all of the following are true:


• EL2 is implemented and is using AArch64, the PE is executing in Non-secure state, and MDCR_EL2.TDE
is 1. Because MDCR_EL2.TDE is 1, ELp is EL2.


• The exception is taken to Non-secure EL1 using AArch32.
As part of exception entry, the PE sets PSTATE.SS to 0. This causes software step to enter the active-pending state.


Note
• Software step always enters the active-pending state because the exception is taken to an Exception level that
debug exceptions are enabled from, EL1. Debug exceptions are enabled from EL1 because ELp is EL2, and
debug exceptions are always enabled from Exception levels that are lower than ELp.
• AArch32 SPSRs have no SS bit. Where an SPSR_ELx register architecturally maps to an AArch32
SPSR_<mode> register, SPSR_ELx.SS maps to SPSR_<mode>[21].
SPSR_<mode>[21] is always RES0. The PE always sets SPSR_<mode>[21] to 0 on taking an exception to
an Exception level that is using AArch32.





Summary of behavior in the active-not-pending state 


Table D2-27 summarizes behavior in the active-not-pending state. 


Table D2-27 Summary of behavior in the active-not-pending state

Event         Value written  Target Exception  Description (a)             Value written   Next state
              to PSTATE.SS   level is using                                to SPSR_ELx.SS
No exception  0              n/a               Disables Software step      n/a             Inactive
No exception  0              n/a               Otherwise                   n/a             Active-pending
Exception     0              AArch64           Supervisor Call (SVC)       0               Active-pending or inactive (b)
                                               Hypervisor Call (HVC)
                                               Secure Monitor Call (SMC)
Exception     0              AArch64           Other                       1               Active-pending or inactive (b)
Exception     0              AArch32           All                         0 (c)           Active-pending

a. For the No exception rows, this column shows the effect of the event.
For the Exception rows, this column shows the exception taken.
b. Which state software step enters depends on whether debug exceptions are enabled or disabled from the target Exception level. See
Figure D2-4 on page D2-1674.
c. SPSR_<mode>[21] is RES0.
D2.12.6 Entering the active-pending state
Software step enters the active-pending state after any of the following operations, provided that both:
• MDSCR_EL1.SS is 1.


• Debug exceptions are enabled from the Exception level and Security state that execution is in after the
operation.


The operations are:


While software step is in the active-not-pending state


The PE either:
• Executing the instruction to be stepped without taking any exceptions.
• Taking an exception.

Note
If entry to the active-pending state is because of the PE taking an exception, it means that the
exception is one that is taken to Non-secure EL1 when MDCR_EL2.TDE is 1. Otherwise, debug
exceptions are masked by PSTATE.D, therefore they would be disabled from the target Exception
level of the exception.


While software step is in the inactive state
The PE either:
• Executing an ERET instruction when SPSR_ELx.SS is 0.


• If MDSCR_EL1.KDE is 1, executing an MSR DAIF or MSR DAIFClr instruction that clears
PSTATE.D to 0.


In addition, software step might enter the active-pending state either:


• After a direct write to a System register, for example a write to MDSCR_EL1.KDE or MDSCR_EL1.SS.
These writes require explicit synchronization to guarantee their effect. See Synchronization and the software
step state machine on page D2-1685.


• On exiting Debug state when DSPSR_EL0.SS or DSPSR.SS is 0. See Exiting Debug state on page H2-4880.


D2.12.7 Behavior in the active-pending state 
In this state, a Software Step exception is pending, and the PE takes it on the current instruction. 


Software Step exceptions have priority over all other exceptions except asynchronous exceptions taken to an 
Exception level or Security state that debug exceptions are disabled from. 


This means that there are some asynchronous exceptions that Software Step exceptions have priority over. 





Note
• This is the only case where a synchronous exception explicitly has a higher priority than asynchronous
exceptions.
• For a description of when debug exceptions are enabled or disabled from an Exception level or Security state,
see Enabling debug exceptions from the current Exception level and Security state on page D2-1633.





In cases where both a Software Step exception is pending and an asynchronous exception taken to an Exception 
level or Security state that debug exceptions are disabled from is pending, the architecture does not define which 
exception is taken first. 
D2.12.8 Stepping T32 IT instructions


The ARMv8-A architecture permits a combination of an IT instruction and another 16-bit T32 instruction to 
comprise one 32-bit instruction. 


For the purpose of stepping an item, it is IMPLEMENTATION DEFINED whether: 
° The PE considers this combination to be one instruction. 


° The PE considers this combination to be two instructions. 


In an implementation that supports the ITD control, that can disable some uses of the IT instruction, it is then 
IMPLEMENTATION DEFINED whether this behavior depends on the value of the applicable ITD field. For example: 


• The PE might consider this combination to be one instruction, regardless of the state of the applicable ITD
field.

• The PE might consider this combination to be two instructions, regardless of the state of the applicable ITD
field.

• The PE might consider this combination to be one instruction when the applicable ITD field is 1, and two
instructions when it is 0.
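The three example behaviors above can be modeled as a policy that a debugger might record for a given implementation. This is a sketch only. The policy names and the function it_pair_step_count are hypothetical, and an implementation is free to behave in other IMPLEMENTATION DEFINED ways that this model does not cover.

```python
def it_pair_step_count(policy: str, itd: int) -> int:
    """Return how many steps an IT plus 16-bit T32 pair takes under a policy.

    policy is a hypothetical label for one of the three example behaviors,
    and itd is the value of the applicable ITD field.
    """
    if policy == "always_one":
        return 1
    if policy == "always_two":
        return 2
    if policy == "itd_dependent":
        # One instruction when ITD is 1, two instructions when it is 0.
        return 1 if itd == 1 else 2
    raise ValueError(f"unknown policy: {policy}")
```

A debugger that does not know the policy cannot assume either count, so it would need to inspect the preferred return address after each Software Step exception instead.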


The applicable ITD field is one of:

• SCTLR_EL1.ITD if execution is at EL0 using AArch32 when EL1 is using AArch64.
• SCTLR.ITD if execution is at Non-secure EL0 or EL1 using AArch32.
• HSCTLR.ITD if execution is at Non-secure EL2 using AArch32.


D2.12.9 Exception syndrome information and preferred return address 


See the following: 
° Exception syndrome information. 


° Preferred return address on page D2-1684. 


Exception syndrome information 


On taking a Software Step exception, the PE records information about the exception in the Exception Syndrome 
Register (ESR) at the Exception level the exception is taken to. The ESR used is one of: 


• ESR_EL1.
• ESR_EL2.
Note 





Software Step exceptions cannot be taken to EL3. 





Table D2-28 on page D2-1683 shows the information that the PE records. 
Table D2-28 Information recorded in the ESR_ELx

ESR_ELx field            Information recorded in ESR_EL1 or ESR_EL2
Exception Class, EC      The PE sets this to:
                         • 0x32, if the exception was taken from a lower Exception level.
                         • 0x33, if the exception was taken from the current Exception level.
Instruction Length, IL   The PE sets this to 1.
Instruction Specific     ISS[24]    Instruction Syndrome Valid (ISV). This indicates whether the EX bit,
Syndrome, ISS                       ISS[6], is valid. The PE sets this to:
                                    0  Not valid.
                                    1  Valid.
                         ISS[23:7]  RES0.
                         ISS[6]     Exclusive operation (EX). The PE sets this to indicate whether the
                                    instruction stepped was a Load-Exclusive instruction:
                                    0  The stepped instruction was not a Load-Exclusive instruction.
                                    1  The stepped instruction was a Load-Exclusive instruction.
                                    A debugger can use this information when stepping code that uses
                                    exclusive monitors. See Stepping code that uses exclusive monitors
                                    on page D2-1685.
                         ISS[5:0]   Instruction Fault Status Code (IFSC). The PE sets this to the code
                                    for a debug exception, 0b100010.
When an instruction has been stepped, the PE sets:
• The value of ISS.ISV to 1, to indicate that the EX bit is valid.
• The value of EX to indicate whether the instruction stepped was a Load-Exclusive instruction, using the
ISS.EX values shown in Table D2-28.
If no instruction was stepped because software step entered the active-pending state from the inactive state without
passing through the active-not-pending state, the PE sets both ESR_ELx.{ISV, EX} to 0.
Note
An implementation that always sets ISV to 0 and never sets EX is not compliant.
ESR_ELx.ISV is UNKNOWN if, in the active-not-pending state, either:
• The instruction stepped was an ERET or an ISB. In these cases, ESR_ELx.EX is set to 0.
• MDCR_EL2.TDE was set to 1 and either:
— The instruction to be stepped generated a synchronous exception that was taken to Non-secure EL1.
In this case, the instruction to be stepped never completed.
— The PE took an asynchronous exception to Non-secure EL1 before it could execute the instruction to
be stepped. In this case, the instruction to be stepped was never executed.
In both of these cases, ESR_ELx.EX is set to the correct value for the instruction.
Table D2-29 shows the permitted scenarios.


Table D2-29 Values that the PE can record in ESR_ELx.{ISV, EX}

Description                                                            ESR_ELx.ISV   ESR_ELx.EX
Syndrome data is not available because no instruction was stepped.     0             0
Syndrome data is available because an instruction was stepped. The     1             0
instruction stepped was an instruction other than a Load-Exclusive
instruction.
Syndrome data is available because an instruction was stepped. The     1             1
instruction stepped was a Load-Exclusive instruction.
The instruction stepped was an ERET or an ISB.                         UNKNOWN       0
The instruction to be stepped generated a synchronous exception        UNKNOWN       Set to the correct value
that was taken to Non-secure EL1.                                                    for the instruction.
The PE took an asynchronous exception before it could execute the      UNKNOWN       Set to the correct value
instruction to be stepped.                                                           for the instruction.





ESR_ELx.EX is UNKNOWN if the stepped instruction was a conditional Load-Exclusive instruction that failed its 
condition code test. 
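To show how a debugger might apply these syndrome values, the following is a minimal sketch of a decoder for the Software Step syndrome. The function name decode_software_step_esr is hypothetical. The position of EC at ESR_ELx[31:26] is assumed from the general ESR_ELx layout, because this section does not restate it.

```python
def decode_software_step_esr(esr: int) -> dict:
    """Split an ESR_ELx value recorded for a Software Step exception."""
    ec = (esr >> 26) & 0x3F                # Exception Class (assumed position)
    if ec not in (0x32, 0x33):
        raise ValueError(f"EC {ec:#x} is not a Software Step exception")
    isv = (esr >> 24) & 1                  # ISS[24], whether EX is valid
    ex = (esr >> 6) & 1                    # ISS[6], Load-Exclusive stepped
    return {
        "from_lower_el": ec == 0x32,
        "isv": isv,
        "ex": ex,
        "ifsc": esr & 0x3F,                # 0b100010 for a debug exception
        # Treat EX as reliable only when ISV reports it as valid.
        "load_exclusive_stepped": isv == 1 and ex == 1,
    }
```

Because ISV is UNKNOWN in some of the scenarios above, a cautious debugger might treat isv == 0 as "no information" instead of as proof that no instruction was stepped.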





Note
A Load-Exclusive instruction is any one of the following:
• In the A64 instruction set, any instruction that has a mnemonic starting with either LDX or LDAX.
• In the A32 and T32 instruction sets, any instruction that has a mnemonic starting with either LDREX or LDAEX.





Preferred return address 


The preferred return address of a Software Step exception is the address of the instruction that was not executed
because the PE took the Software Step exception instead.





D2.12.10 Additional considerations 
This section contains the following: 
• Behavior when an ERET instruction is an illegal exception return.
• Behavior when the instruction stepped writes a misaligned PC value on page D2-1685.
• Stepping code that uses exclusive monitors on page D2-1685.
• Synchronization and the software step state machine on page D2-1685.
Behavior when an ERET instruction is an illegal exception return 
If the conditions for entering the active-not-pending state in Entering the active-not-pending state on page D2-1675 
are met, but the PE executes an ERET instruction that is an illegal exception return, the exception return must be taken 
to the same Exception level that it was taken from. In this scenario, even though the Exception level remains the 
same before and after the ERET, software step can advance from the inactive state to one of the active states. Consider 
the following case: 
1. MDSCR_EL1.SS is 1 and software step is inactive. The current Exception level is EL1 using AArch64, the 
OS Lock and OS Double Lock are unlocked, and MDCR_EL2.TDE is 0, MDSCR_EL1.KDE is 1, and 
PSTATE.D is 1. 
PSTATE.D == 1 is the reason why software step is inactive, because PSTATE.D == 1 means that debug 
exceptions are disabled from the current Exception level. 
2. The PE executes an ERET instruction. 
3. The intended target of the ERET is EL2. This means that the ERET is an illegal exception return because the
intended target is higher than the Exception level the ERET is executed at. In this case, the ERET must target
EL1 instead of EL2.


If SPSR_EL1.D is 0, then on the ERET PSTATE.D becomes 0 and debug exceptions become enabled from the 
current Exception level. Software step therefore advances from the inactive state to one of the active states. 


Which active state software step advances to depends on whether SPSR_ELx.SS is 1 or 0: 


° If SPSR_ELx.SS is 1, software step advances to the active-not-pending state. 
In this case, an Illegal Execution state exception is pending on the instruction to be stepped, and the PE takes 
the Illegal Execution state exception instead of executing the instruction to be stepped. 

° If SPSR_ELx.SS is 0, software step advances to the active-pending state. 


In this case, a Software Step exception and an Illegal Execution state exception are both pending. The 
Software Step exception has higher priority. On taking the Software Step exception, the PE sets 
SPSR_ELx.IL to 1. 





Note 


Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 shows the relative priorities 
of synchronous exceptions. 





Behavior when the instruction stepped writes a misaligned PC value 


An indirect branch that writes a misaligned PC value might generate a PC alignment fault exception at the target of 
the branch. However, if the indirect branch is stepped using software step, the PE takes a Software Step exception 
instead, because the Software Step exception has higher priority. Behavior on returning from the Software Step 
exception depends on which Execution state the Exception level being returned to is using: 


AArch64 A PC alignment fault exception is generated. 


AArch32 The return from the Software Step exception forces the PC to the correct alignment, and no PC 
alignment fault exception is generated. 


Debugger software must therefore take care when using software step to single-step an indirect branch instruction 
executed in AArch32 state, that it does not hide a PC alignment fault exception. 


Stepping code that uses exclusive monitors 


The ARMv8-A architecture provides no mechanism for preserving the state of the exclusive monitors when a 
Load-Exclusive or a Store-Exclusive instruction is stepped. 


However, for certain progressions through the software step state machine, on taking a Software Step exception, the 
PE provides an indication of whether the instruction stepped was a Load-Exclusive instruction. 


Debugger software can use this to detect the state of the exclusive monitors. For example, if the PE reports that the 
instruction stepped was a Load-Exclusive instruction, the debugger is aware that the next Store-Exclusive operation 
will fail, because all exclusive monitors are cleared on returning from the Software Step exception. The debugger
must then take action to ensure that the code being stepped makes forward progress.


For more information on how the PE reports whether the instruction stepped was a Load-Exclusive instruction, see 
Exception syndrome information and preferred return address on page D2-1682. 
Synchronization and the software step state machine 


Any of the following can cause transitions between software step states: 





• A direct write to a System register.
• A direct write to a Special-purpose register.
• A write to an external debug register that affects the routing of debug exceptions.
Because the software step state machine indirectly reads these registers, it is not guaranteed to observe any new
values until after a Context synchronization event has occurred.


In the time between a write to one of these registers and the next Context synchronization event, it is CONSTRAINED 
UNPREDICTABLE whether software step uses the state of the PE before the write, or the state of the PE after the write. 


After a Context synchronization event, the state machine must use the state of the PE after the write. 


Example D2-3 


1. Software changes MDSCR_EL1.SS from 0 to 1 when debug exceptions are enabled. 
2. The PE executes some instructions. 
3) A Context synchronization event occurs. 


During step 2, it is CONSTRAINED UNPREDICTABLE whether software step remains in the inactive state, as if 
MDSCR_EL1.SS is 0, or enters the active-pending state because MDSCR_EL1.SS is 1. If it is in the: 


° Inactive state, then after the Context synchronization event, it must enter the active-pending state. 
° Active-pending state, the PE might take a Software Step exception before the Context synchronization event. 
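
The window described in Example D2-3 can be modelled with a small host-side sketch. It is illustrative only: the function and constant names are hypothetical and are not part of the architecture or of the pseudocode in Chapter J1. The function returns the set of MDSCR_EL1.SS values that software step is permitted to act on, before and after the Context synchronization event.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bitmask of the MDSCR_EL1.SS values that software step may use. */
enum {
    MAY_USE_SS_0 = 1 << 0,
    MAY_USE_SS_1 = 1 << 1
};

/*
 * old_ss:       value of MDSCR_EL1.SS before the write.
 * new_ss:       value written by software.
 * synchronized: true once a Context synchronization event has occurred.
 */
static unsigned permitted_ss_values(bool old_ss, bool new_ss, bool synchronized)
{
    unsigned before = old_ss ? MAY_USE_SS_1 : MAY_USE_SS_0;
    unsigned after = new_ss ? MAY_USE_SS_1 : MAY_USE_SS_0;

    /* After synchronization only the written value can be used. Before it,
     * the choice between old and new state is CONSTRAINED UNPREDICTABLE. */
    return synchronized ? after : (before | after);
}
```

For step 2 of the example, `permitted_ss_values(false, true, false)` reports both values, matching the two outcomes listed above. After step 3 only the value 1 remains.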
Note 





A direct write to a Special-purpose register does not require explicit synchronization. 





D2.12.11 Pseudocode description of Software Step exceptions 


SSAdvance() advances software step from the active-not-pending state to the active-pending state, by setting 
PSTATE.SS to 0. It is called on completing execution of each instruction. 


CheckSoftwareStep() checks whether software step is in the active-pending state, and if it is, generates a Software 
Step exception. It is called before each instruction executed, regardless of Execution state, before checking for any 
other synchronous exceptions. 


DebugExceptionReturnSS() returns the value to write to PSTATE.SS on an exception return or an exit from Debug 
state. See Entering the active-not-pending state on page D2-1675. 


These functions are defined in Chapter J1 ARMv8 Pseudocode. 
D2.13 Synchronization and debug exceptions 


The behavior of debug depends on all of the following: 

• The state of the external debug authentication interface. 
• Indirect reads of: 
— External debug registers. 
— System registers, including system debug registers. 
— Special-purpose registers. 

If a change is made to any of these, the effect of that change on debug exception generation cannot be relied on until 
after a Context synchronization event has occurred. Similarly, the effect of the change on the software step state 
machine cannot be relied on until after a Context synchronization event has occurred. 


For any instructions executed between the time when the change is made and the time when the next Context 
synchronization event occurs, it is CONSTRAINED UNPREDICTABLE whether debug uses the state of the PE before the 
change, or the state of the PE after the change. 


Example D2-4 


1. Software changes MDSCR_EL1.MDE from 0 to 1. 


2. An instruction is executed, that would cause a Breakpoint exception if self-hosted debug uses the state of the 
PE after the change. 


3. A Context synchronization event occurs. 


In this case, it is CONSTRAINED UNPREDICTABLE whether the instruction generates a Breakpoint exception. 


Example D2-5 


1. Software unlocks the OS lock. 
2. The PE executes some instructions. 


3. A Context synchronization event occurs. 


During the time when the PE is executing some instructions, step 2, it is CONSTRAINED UNPREDICTABLE whether 
debug exceptions other than Breakpoint Instruction exceptions can be generated. 


Note 


• Some register updates are self-synchronizing. Others require an explicit Context synchronization event. For 
more information, see: 

— Accessing PSTATE fields on page D1-1514. 
— Synchronization requirements for AArch64 System registers on page D7-1889. 
— Synchronization of changes to the external debug registers on page H8-4964. 

• See Context synchronization event for the definition of this term. 
Chapter D3 
The AArch64 System Level Memory Model 


This chapter provides a system level view of the general features of the memory system. It contains the following 
sections: 


About the memory system architecture on page D3-1690. 
Address space on page D3-1691. 

Mixed-endian support on page D3-1692. 

Cache support on page D3-1693. 

External aborts on page D3-1714. 


Memory barrier instructions on page D3-1716. 


Pseudocode description of general memory system instructions on page D3-1717. 
D3.1 About the memory system architecture 


The ARM architecture supports different implementation choices for the memory system microarchitecture and 
memory hierarchy, depending on the requirements of the system being implemented. In this respect, the memory 
system architecture describes a design space in which an implementation is made. The architecture does not 
prescribe a particular form for the memory systems. Key concepts are abstracted in a way that permits 
implementation choices to be made while enabling the development of common software routines that do not have 
to be specific to a particular microarchitectural form of the memory system. For more information about the concept 
of a hierarchical memory system see Memory hierarchy on page B2-71. 


D3.1.1 Form of the memory system architecture 

The ARMv8 A-profile architecture includes a Virtual Memory System Architecture (VMSA), described in 
Chapter D4 The AArch64 Virtual Memory System Architecture. 


D3.1.2 Memory attributes 


Memory types and attributes on page B2-94 describes the memory attributes, including how different memory types 
have different attributes. Each location in memory has a set of memory attributes, and the translation tables define 
the virtual memory locations, and the attributes for each location. 


Table D3-1 shows the memory attributes that are visible at the system level. 


Table D3-1 Memory attribute summary 

Memory type   Shareability           Cacheability
Device (a)    Outer Shareable        Non-cacheable.
Normal        One of:                One of (b):
              • Non-shareable.       • Non-cacheable.
              • Inner Shareable.     • Write-Through Cacheable.
              • Outer Shareable.     • Write-Back Cacheable.

a. Takes additional attributes, see Device memory on page B2-98. 
b. See also Cacheability, cache allocation hints, and cache transient hints on page D3-1695. 


For more information on cacheability and shareability see Shareable Normal memory on page B2-95, 
Non-shareable Normal memory on page B2-96, and Caches and memory hierarchy on page B2-70. 
D3.2 Address space 


The ARMv8 architecture is designed to support a wide range of applications with different memory requirements. 
It supports a range of physical address (PA) sizes, and provides associated control and identification mechanisms. 
For more information, see Address size configuration on page D4-1731. 


D3.2.1 Instruction address space overflow 
When a PE performs a Simple sequential execution of instructions, it calculates: 
(address_of_current_instruction) + (size_of_executed_instruction) 
This calculation is performed after each instruction to determine which instruction to execute next. 


If the address calculation performed after executing an instruction overflows 0xFFFF_FFFF_FFFF_FFFF, the program 
counter becomes UNKNOWN. 





Note 


Address tags are not propagated to the program counter, so the tag does not affect the address calculation. 





Where an instruction accesses a sequential set of bytes that crosses the 0xFFFF_FFFF_FFFF_FFFF boundary when 
tagged addresses are not used, or the 0xXXFF_FFFF_FFFF_FFFF boundary when tagged addresses are used, then the 
virtual address accessed for the bytes above this boundary is UNKNOWN. When tagged addresses are used, the value 
of the tag associated with the address also becomes UNKNOWN. 
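
The two overflow cases in this section can be checked with a host-side sketch such as the following. It is not architectural pseudocode: the function names are hypothetical, and the sketch assumes that the address tag occupies bits [63:56] of the virtual address, which is what the 0xXXFF_FFFF_FFFF_FFFF boundary above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if (address_of_current_instruction + size_of_executed_instruction)
 * overflows 0xFFFF_FFFF_FFFF_FFFF, in which case the program counter becomes
 * UNKNOWN. Otherwise stores the address of the next instruction in *next. */
static bool next_instruction_overflows(uint64_t pc, uint64_t size, uint64_t *next)
{
    assert(next != NULL);
    if (size > UINT64_MAX - pc) {
        return true;
    }
    *next = pc + size;
    return false;
}

/* Returns true if an access of nbytes sequential bytes starting at va reaches
 * bytes above the boundary, so that their virtual address is UNKNOWN.
 * Assumption: with tagged addresses, bits [63:56] are the tag and only
 * bits [55:0] take part in the boundary check. */
static bool access_crosses_boundary(uint64_t va, uint64_t nbytes, bool tagged)
{
    assert(nbytes > 0);
    uint64_t limit = tagged ? UINT64_C(0x00FFFFFFFFFFFFFF) : UINT64_MAX;
    uint64_t offset = tagged ? (va & limit) : va;
    return (nbytes - 1) > (limit - offset);
}
```

Both checks compare against the remaining distance to the boundary rather than adding first, so the sketch itself never relies on wrapped arithmetic.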
D3.3 Mixed-endian support 


A control bit, SCTLR_EL1.E0E, is provided to allow the endianness of explicit data accesses made while executing 
at EL0 to be controlled independently of those made while executing at EL1. Table D3-2 shows the endianness of 
explicit data accesses and translation table walks. 


Table D3-2 Endianness support 

Exception level   Explicit data accesses   Stage 1 translation table walks   Stage 2 translation table walks
EL0               SCTLR_EL1.E0E            SCTLR_EL1.EE                      SCTLR_EL2.EE
EL1               SCTLR_EL1.EE             SCTLR_EL1.EE                      SCTLR_EL2.EE
EL2               SCTLR_EL2.EE             SCTLR_EL2.EE                      N/A
EL3               SCTLR_EL3.EE             SCTLR_EL3.EE                      N/A

Note 





SCTLR_EL1.E0E has no effect on the endianness of the LDTR, LDTRH, LDTRSH, and LDTRSW instructions, or 
on the endianness of the STTR and STTRH instructions, when these are executed at EL1. 





ARMv8 provides the following options for endianness support: 

• All Exception levels support mixed-endianness: 
— SCTLR_ELx.EE is RW and SCTLR_EL1.E0E is RW. 

• Only EL0 supports mixed-endianness and EL1, EL2, and EL3 support only little-endianness: 
— SCTLR_ELx.EE is RES0 and SCTLR_EL1.E0E is RW. 

• Only EL0 supports mixed-endianness and EL1, EL2, and EL3 support only big-endianness: 
— SCTLR_ELx.EE is RES1 and SCTLR_EL1.E0E is RW. 

• All Exception levels support only little-endianness: 
— SCTLR_ELx.EE is RES0 and SCTLR_EL1.E0E is RES0. 

• All Exception levels support only big-endianness: 
— SCTLR_ELx.EE is RES1 and SCTLR_EL1.E0E is RES1. 


If mixed-endian support is implemented for an Exception level using AArch32, endianness is controlled by 
PSTATE.E. For exception returns to AArch32 state, PSTATE.E is copied from SPSR_ELx.E. If the target Exception 
level supports only little-endian accesses, SPSR_ELx.E is RES0. If the target Exception level supports only 
big-endian accesses, SPSR_ELx.E is RES1. PSTATE.E is ignored in AArch64 state. 


The BigEndian() function determines whether the current Exception level and Execution state are using big-endian 
data. This function is defined in Chapter J1 ARMv8 Pseudocode. 
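
As an illustration of the "Explicit data accesses" column of Table D3-2, the following sketch selects the controlling bit for each Exception level in AArch64 state. It is not the BigEndian() pseudocode from Chapter J1. The structure and function names are hypothetical, and the sketch assumes the usual register encoding in which a value of 1 in an EE or E0E bit selects big-endian. It does not model the LDTR and STTR exception described in the Note above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the endianness control bits named in Table D3-2. */
typedef struct {
    bool sctlr_el1_e0e;
    bool sctlr_el1_ee;
    bool sctlr_el2_ee;
    bool sctlr_el3_ee;
} endian_controls;

/* Returns true if explicit data accesses made at Exception level el are
 * big-endian, assuming a control bit value of 1 selects big-endian. */
static bool data_access_is_big_endian(unsigned el, const endian_controls *c)
{
    assert(el <= 3 && c != NULL);
    switch (el) {
    case 0:  return c->sctlr_el1_e0e;
    case 1:  return c->sctlr_el1_ee;
    case 2:  return c->sctlr_el2_ee;
    default: return c->sctlr_el3_ee;
    }
}
```

With SCTLR_EL1.E0E set and SCTLR_EL1.EE clear, the sketch reports big-endian data accesses at EL0 and little-endian data accesses at EL1, which is the mixed-endian case that the E0E bit exists to support.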





D3-1692 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


D3.4 Cache support 


This section describes the ARMv8 cache identification and control mechanisms, and the cache maintenance 
instructions, in the following sections: 


• General behavior of the caches. 
• Cache identification on page D3-1694. 
• Cacheability, cache allocation hints, and cache transient hints on page D3-1695. 
• Enabling and disabling the caching of memory accesses on page D3-1696. 
• Behavior of caches at reset on page D3-1698. 
• Non-cacheable accesses and instruction caches on page D3-1698. 
• About cache maintenance in ARMv8 on page D3-1699. 
• Cache maintenance instructions on page D3-1703. 
• Data cache zero instruction on page D3-1711. 
• Cache lockdown on page D3-1712. 
• System level caches on page D3-1713. 
• Branch prediction on page D3-1713. 


See also Caches in a VMSAv8-64 implementation on page D4-1829. 











D3.4.1 General behavior of the caches 

When a memory location has a Normal Cacheable memory attribute, determining whether a copy of the memory 
location is held in a cache still depends on many aspects of the implementation. The following non-exhaustive list 
of factors might be involved: 

• The size, line length, and associativity of the cache. 
• The cache allocation algorithm. 
• Activity by other elements of the system that can access the memory. 
• Speculative instruction fetching algorithms. 
• Speculative data fetching algorithms. 
• Interrupt behaviors. 

Given this range of factors, and the large variety of cache systems that might be implemented, the architecture 
cannot guarantee whether: 

• A memory location present in the cache remains in the cache. 
• A memory location not present in the cache is brought into the cache. 

Instead, the following principles apply to the behavior of caches: 

• The architecture has a concept of an entry locked down in the cache. How lockdown is achieved is 
IMPLEMENTATION DEFINED, and lockdown might not be supported by: 

— A particular implementation. 
— Some memory attributes. 

• An unlocked entry in a cache might not remain in that cache. The architecture does not guarantee that an 
unlocked cache entry remains in the cache or remains incoherent with the rest of memory. Software must not 
assume that an unlocked item that remains in the cache remains dirty. 

• A locked entry in a cache is guaranteed to remain in that cache. The architecture does not guarantee that a 
locked cache entry remains incoherent with the rest of memory, that is, it might not remain dirty. 

Note 
For more information, see The interaction of cache lockdown with cache maintenance instructions on 
page D3-1712. 

• Any memory location that has a Normal Cacheable attribute at either the current Exception level or at a 
higher Exception level can be allocated to a cache at any time. 

• It is guaranteed that no memory location that does not have a Normal Cacheable attribute is allocated into the 
cache. 

• It is guaranteed that no memory location is allocated to the cache if it has a Normal Non-cacheable attribute 
or any type of Device memory attribute in both: 

— The translation regime at the current Exception level. 
— The translation regime at any higher Exception level. 

• For data accesses, any memory location with a Normal Inner Shareable or Normal Outer Shareable attribute 
is guaranteed to be coherent with all masters in its shareability domain. 

• Any memory location is not guaranteed to remain incoherent with the rest of memory. 

• The eviction of a cache entry from a cache level can overwrite memory that has been written by another 
observer only if the entry contains a memory location that has been written to by an observer in the 
shareability domain of that memory location. The maximum size of the memory that can be overwritten is 
called the Cache Write-back Granule. In some implementations the CTR_EL0 identifies the Cache 
Write-back Granule. 

• The allocation of a memory location into a cache cannot cause the most recent value of that memory location 
to become invisible to an observer if it was previously visible to that observer. 


Note 


The Cacheability attribute of an address is determined by the applicable translation table entry for that address, as 
modified by any applicable System register Cacheability controls, such as the SCTLR_EL1.{I, C} controls. 








For the purpose of these principles, a cache entry covers at least 16 bytes and no more than 2KB of contiguous 
address space, aligned to the size of the cache entry. 
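
Where CTR_EL0 identifies the Cache Write-back Granule, software can derive its size in bytes as in the sketch below. The field position and encoding are taken from the CTR_EL0 register description rather than from this section, and are stated here as assumptions: CWG is bits [27:24] and holds log2 of the granule size in words, a value of 0 means that the register provides no information, and values above 0b1001 are reserved. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the Cache Write-back Granule in bytes for a CTR_EL0 value.
 * Assumed encoding: CWG = CTR_EL0[27:24] = log2(words), 4 bytes per word.
 * CWG == 0 gives no information, so the 2KB architectural maximum is assumed.
 * Returns 0 for reserved encodings. */
static unsigned cwg_bytes(uint64_t ctr_el0)
{
    unsigned cwg = (unsigned)((ctr_el0 >> 24) & 0xFu);

    if (cwg == 0u) {
        return 2048u;
    }
    if (cwg > 9u) {
        return 0u;
    }
    return 4u << cwg;
}
```

A buffer that another observer might write must not share a granule of this size with data the PE holds dirty, which is why the conservative 2KB value is used when the register gives no information.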


D3.4.2 Cache identification 


The ARMv8 cache identification registers describe the implemented caches that are affected by cache maintenance 
instructions executed on the PE. This includes the cache maintenance instructions that: 

• Affect the entire cache, for example IC IALLU. 
• Operate by address, for example IC IVAU. 
• Operate by set/way, for example DC ISW. 

The cache identification registers are: 

• The Cache Type Register, CTR_EL0, that defines: 

— The minimum line length of any of the instruction caches affected by the instruction cache 
maintenance instructions. 

— The minimum line length of any of the data or unified caches, affected by the data cache maintenance 
instructions. 

— The cache indexing and tagging policy of the Level 1 instruction cache. 





Note 


It is IMPLEMENTATION DEFINED whether caches beyond the PoC will be reported by this mechanism, and 
because of the possible existence of system caches some caches before the PoC might not be reported. For 
more information about system caches see System level caches on page D3-1713. 





• A single Cache Level ID Register, CLIDR_EL1, that defines: 

— The type of cache that is implemented and can be maintained using the architected cache maintenance 
instructions that operate by set/way or operate on the entire cache at each cache level, up to the 
maximum of seven levels. 

— The Level of Coherence (LoC) for the caches. See Terms used in describing the maintenance 
instructions on page D3-1699 for the definition of LoC. 

— The Level of Unification Uniprocessor (LoUU) for the caches. See Terms used in describing the 
maintenance instructions on page D3-1699 for the definition of LoUU. 

— An optional ICB field to indicate the boundary between the caches used for caching Inner Cacheable 
memory regions and those used only for caching Outer Cacheable regions. 

• A single Cache Size Selection Register, CSSELR_EL1, that selects the cache level and cache type of the 
current Cache Size Identification Register. 

• For each implemented cache that is identifiable by this mechanism, across all the levels of caching, a Cache 
Size Identification Register, CCSIDR_EL1, that defines: 

— Whether the cache supports Write-Through, Write-Back, Read-Allocate and Write-Allocate. 

— The number of sets, associativity and line length of the cache. See Terms used in describing the 
maintenance instructions on page D3-1699 for a definition of these terms. 


To determine the cache topology associated with a PE: 


1. Read the Cache Type Register to find the indexing and tagging policy used for the Level 1 instruction cache. 
This register also provides the size of the smallest cache lines used for the instruction caches, and for the data 
and unified caches. These values are used in cache maintenance instructions. 


2. Read the Cache Level ID Register to find what caches are implemented. The register includes seven Cache 
type fields, for cache levels 1 to 7. Scanning these fields, starting from Level 1, identifies the instruction, data 
or unified caches implemented at each level. This scan ends when it reaches a level at which no caches are 
defined. The Cache Level ID Register also specifies the Level of Unification (LoU) and the Level of 
Coherence (LoC) for the cache implementation. 


3. For each cache identified at stage 2: 

• Write to the Cache Size Selection Register to select the required cache. A cache is identified by its 
level, and whether it is: 

— An instruction cache. 
— A data or unified cache. 

• Read the Cache Size ID Register to find details of the cache. 
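
The three steps above can be sketched as field decoding on register values that have already been read. This is a host-side model, not code that accesses the System registers. The helper names are hypothetical, and the field positions are assumptions taken from the CLIDR_EL1, CSSELR_EL1 and CCSIDR_EL1 register descriptions for this edition of the architecture: Ctype<n> at CLIDR_EL1[3(n-1)+2:3(n-1)], LoC at CLIDR_EL1[26:24], CSSELR_EL1.{Level, InD} at bits [3:1] and [0], and CCSIDR_EL1.{NumSets, Associativity, LineSize} at bits [27:13], [12:3] and [2:0].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encodings of the CLIDR_EL1.Ctype<n> fields. */
enum {
    CTYPE_NONE = 0,
    CTYPE_INSTRUCTION = 1,
    CTYPE_DATA = 2,
    CTYPE_SEPARATE = 3,   /* separate instruction and data caches */
    CTYPE_UNIFIED = 4
};

/* Cache type at the given level, 1 to 7. */
static unsigned clidr_ctype(uint64_t clidr, unsigned level)
{
    assert(level >= 1 && level <= 7);
    return (unsigned)((clidr >> (3u * (level - 1u))) & 0x7u);
}

/* Scan from Level 1 and stop at the first level with no cache. */
static unsigned clidr_cache_levels(uint64_t clidr)
{
    unsigned level = 1;
    while (level <= 7 && clidr_ctype(clidr, level) != CTYPE_NONE) {
        level++;
    }
    return level - 1;
}

/* Level of Coherence. */
static unsigned clidr_loc(uint64_t clidr)
{
    return (unsigned)((clidr >> 24) & 0x7u);
}

/* Value to write to CSSELR_EL1 to select a cache. */
static uint64_t csselr_value(unsigned level, bool instruction_cache)
{
    assert(level >= 1 && level <= 7);
    return ((uint64_t)(level - 1u) << 1) | (instruction_cache ? 1u : 0u);
}

typedef struct {
    unsigned line_bytes;
    unsigned ways;
    unsigned sets;
} cache_geometry;

/* Decode the geometry fields of a CCSIDR_EL1 value. */
static cache_geometry ccsidr_decode(uint64_t ccsidr)
{
    cache_geometry g;
    g.line_bytes = 1u << ((unsigned)(ccsidr & 0x7u) + 4u);
    g.ways = (unsigned)((ccsidr >> 3) & 0x3FFu) + 1u;
    g.sets = (unsigned)((ccsidr >> 13) & 0x7FFFu) + 1u;
    return g;
}
```

On a target, each CCSIDR_EL1 read must follow a write of `csselr_value()` to CSSELR_EL1 and a Context synchronization event. The cache size in bytes is the product of the three decoded geometry values.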


D3.4.3 Cacheability, cache allocation hints, and cache transient hints 


Cacheability only applies to Normal memory, and can be defined independently for Inner and Outer cache locations. 
All types of Device memory are always treated as Non-cacheable. 


As described in Memory types and attributes on page B2-94, the memory attributes include a cacheability attribute 
that is one of: 


° Non-cacheable. 
° Write-Through cacheable. 
° Write-Back cacheable. 


In ARMv8, Cacheability attributes other than Non-cacheable can be complemented by a cache allocation hint. This 
is an indication to the memory system of whether allocating a value to a cache is likely to improve performance. In 
addition, it is IMPLEMENTATION DEFINED whether a cache transient hint is supported, see Transient cacheability hint 
on page D3-1696. 


The cache allocation hints are assigned independently for read and write accesses, and therefore when the Transient 
hint is supported the following cache allocation hints can be assigned: 


For read accesses: Read-Allocate, Transient Read-Allocate, or No Read-Allocate. 


For write accesses: Write-Allocate, Transient Write-Allocate, or No Write-Allocate. 

Note 

• A Cacheable location with both No Read-Allocate and No Write-Allocate hints is not the same as a 
Non-cacheable location. A Non-cacheable location has coherency guarantees for all observers within the 
system that do not apply for a location that is Cacheable, No Read-Allocate, No Write-Allocate. 

• Implementations can use the cache allocation hints to limit cache pollution to a part of a cache, such as to a 
subset of ways. 

• For VMSAv8-64 translation table walks, the TCR_ELx.{IRGNn, ORGNn} fields define the memory 
attributes of the translation tables, including the cacheability. However, this assignment supports only a 
subset of the cacheability attributes described in this section. 





The architecture does not require an implementation to make any use of cache allocation hints. This means an 
implementation might not make any distinction between memory locations with attributes that differ only in their 
cache allocation hint. 


Transient cacheability hint 


In ARMv8, it is IMPLEMENTATION DEFINED whether a Transient hint is supported. In an implementation that 
supports the Transient hint, the Transient hint is a qualifier of the cache allocation hints, and indicates that the benefit 
of caching is for a relatively short period. It indicates that it might be better to restrict allocation of transient entries, 
to avoid possibly casting-out other, less transient, entries. 





Note 


The architecture does not specify what is meant by a relatively short period. 





The description of the AArch64 MAIR_EL1, MAIR_EL2, and MAIR_EL3 registers, and the AArch32 MAIR0, 
MAIR1, HMAIR0, and HMAIR1 registers, includes the assignment of the Transient hint in an implementation that 
supports this option. In this assignment: 

• The Transient hint is defined independently for Inner Cacheable and Outer Cacheable memory regions. 
• A single Transient hint applies to both read and write accesses to a memory region. 
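
The following sketch shows how the Cacheability attribute, the allocation hints and the Transient hint can be recovered from one 4-bit Inner or Outer field of a MAIR_ELx attribute. The encoding is an assumption taken from the MAIR_ELx register descriptions, not from this section: 0b00RW is Write-Through Transient, 0b0100 is Non-cacheable, 0b01RW is Write-Back Transient, 0b10RW is Write-Through Non-transient and 0b11RW is Write-Back Non-transient, where R and W are the Read-Allocate and Write-Allocate hints. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    ATTR_NOT_NORMAL,      /* 0b0000: not a Normal memory encoding */
    ATTR_NON_CACHEABLE,
    ATTR_WRITE_THROUGH,
    ATTR_WRITE_BACK
} attr_cacheability;

typedef struct {
    attr_cacheability cacheability;
    bool transient;
    bool read_allocate;
    bool write_allocate;
} normal_attr;

/* Decode one 4-bit Inner or Outer attribute field of a MAIR_ELx attribute. */
static normal_attr decode_normal_attr(unsigned nibble)
{
    assert(nibble <= 0xFu);
    normal_attr a = { ATTR_NOT_NORMAL, false, false, false };
    unsigned policy = nibble >> 2;    /* bits [3:2]: cacheability and transience */
    unsigned rw = nibble & 0x3u;      /* bits [1:0]: R and W allocation hints */

    if (nibble == 0x0u) {
        return a;
    }
    if (nibble == 0x4u) {
        a.cacheability = ATTR_NON_CACHEABLE;
        return a;
    }
    a.cacheability = (policy & 1u) ? ATTR_WRITE_BACK : ATTR_WRITE_THROUGH;
    a.transient = (policy & 2u) == 0u;
    a.read_allocate = (rw & 2u) != 0u;
    a.write_allocate = (rw & 1u) != 0u;
    return a;
}
```

The single `transient` result per field reflects the point made above: one Transient hint qualifies both the read and the write allocation hint, and it is decoded separately for the Inner and the Outer field.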
D3.4.4 Enabling and disabling the caching of memory accesses 


In ARMv8, Cacheability control fields can force all memory locations with the Normal memory type to be treated 
as Non-cacheable, regardless of their assigned Cacheability attribute. Independent controls are provided for each 
stage of address translation, with separate controls for: 

• Data accesses. These controls also apply to accesses to the translation tables. 
• Instruction accesses. 


Note 


These Cacheability controls replace the cache enable controls provided in previous versions of the ARM 
architecture. 








The Cacheability control fields and their effects are as follows: 


For the EL1&0 translation regime 

• When the value of SCTLR_EL1.C is 0: 
— All stage 1 translations for data accesses to Normal memory are Non-cacheable. 
— All accesses to the EL1&0 stage 1 translation tables are Non-cacheable. 

• When the value of SCTLR_EL1.I is 0: 
— All stage 1 translations for instruction accesses to Normal memory are Non-cacheable. 

• When the value of HCR_EL2.CD is 1: 
— All stage 2 translations for data accesses to Normal memory are Non-cacheable. 
— All accesses to the EL1&0 stage 2 translation tables are Non-cacheable. 

• When the value of HCR_EL2.ID is 1: 
— All stage 2 translations for instruction accesses to Normal memory are Non-cacheable. 

• When the value of HCR_EL2.DC is 1, all Non-secure stage 1 translations and all accesses to 
the Non-secure EL1&0 stage 1 translation tables, are treated as accesses to Normal 
Non-shareable Inner Write-Back Cacheable Read-Allocate Write-Allocate, Outer 
Write-Back Cacheable Read-Allocate Write-Allocate memory, regardless of the value of 
SCTLR_EL1.{I, C}. This applies to translations for both data and instruction accesses. 


Note 

• In Non-secure state, the stage 1 and stage 2 cacheability attributes are combined as described 
in Combining the stage 1 and stage 2 cacheability attributes for Normal memory on 
page D4-1799. 

• The SCTLR_EL1.{C, I} and HCR_EL2.DC fields have no effect on the EL2 and EL3 
translation regimes. 

• The HCR_EL2.{ID, CD} fields affect only stage 2 of the Non-secure EL1&0 translation 
regime. 

• In Non-secure state, when EL2 is using AArch64 and EL1 is using AArch32, the 
HCR_EL2.{ID, CD, DC} controls apply as described here, but the EL1 controls are 
SCTLR.{C, I}. 





For the EL2 translation regime 

• When the value of SCTLR_EL2.C is 0: 
— All data accesses to Normal memory using the EL2 translation regime are 
Non-cacheable. 
— All accesses to the EL2 translation tables are Non-cacheable. 

• When the value of SCTLR_EL2.I is 0: 
— All instruction accesses to Normal memory using the EL2 translation regime are 
Non-cacheable. 

Note 
The SCTLR_EL2.{I, C} fields have no effect on the EL1&0 and EL3 translation regimes. 





For the EL3 translation regime 

• When the value of SCTLR_EL3.C is 0: 
— All data accesses to Normal memory using the EL3 translation regime are 
Non-cacheable. 
— All accesses to the EL3 translation tables are Non-cacheable. 

• When the value of SCTLR_EL3.I is 0: 
— All instruction accesses to Normal memory using the EL3 translation regime are 
Non-cacheable. 

Note 
The SCTLR_EL3.{I, C} fields have no effect on the EL1&0 and EL2 translation regimes. 





In addition: 

• For translation regimes other than the Non-secure EL1&0 translation regime, if the value of SCTLR_ELx.M 
is 0, indicating that stage 1 translations are disabled for that translation regime, then: 

— If the value of SCTLR_ELx.I is 0, instruction accesses to Normal memory from stage 1 of the 
translation regime are Outer Shareable, Inner Non-cacheable, Outer Non-cacheable. 

— If the value of SCTLR_ELx.I is 1, instruction accesses to Normal memory from stage 1 of the 
translation regime are Outer Shareable, Inner Write-Through Cacheable, Outer Write-Through 
Cacheable. 

• For the Non-secure EL1&0 translation regime, if the value of SCTLR_EL1.M is 0, indicating that stage 1 
translations are disabled for that translation regime, and the value of HCR_EL2.DC is 0: 

— If the value of SCTLR_EL1.I is 0, instruction accesses to Normal memory from stage 1 of the 
translation regime are Outer Shareable, Inner Non-cacheable, Outer Non-cacheable. 

— If the value of SCTLR_EL1.I is 1, instruction accesses to Normal memory from stage 1 of the 
translation regime are Outer Shareable, Inner Write-Through Cacheable, Outer Write-Through 
Cacheable. 


The effect of SCTLR_ELx.C, HCR_EL2.DC and HCR_EL2.CD is reflected in the result of the address translation 
instructions in the PAR when these bits have an effect on the stages of translation being reported in the PAR. 
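
The effect of the stage 1 controls for the Non-secure EL1&0 translation regime can be summarized by the sketch below. It is a simplified model under stated assumptions: the access is to Normal memory, stage 1 translation is enabled unless HCR_EL2.DC is 1, and the combination with the stage 2 attributes and the HCR_EL2.{CD, ID} controls is not modelled. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { NON_CACHEABLE, WRITE_THROUGH, WRITE_BACK } cacheability;

/* Hypothetical snapshot of the stage 1 Cacheability controls. */
typedef struct {
    bool sctlr_el1_c;   /* 0 forces data accesses Non-cacheable */
    bool sctlr_el1_i;   /* 0 forces instruction accesses Non-cacheable */
    bool hcr_el2_dc;    /* 1 forces Write-Back Cacheable, Non-secure state only */
} el10_stage1_controls;

/* Returns the stage 1 Cacheability of an access to Normal memory, given the
 * attribute assigned by the translation tables. */
static cacheability stage1_cacheability(cacheability assigned, bool is_instruction,
                                        const el10_stage1_controls *c)
{
    assert(c != NULL);
    if (c->hcr_el2_dc) {
        /* Applies regardless of the value of SCTLR_EL1.{I, C}. */
        return WRITE_BACK;
    }
    if (is_instruction ? !c->sctlr_el1_i : !c->sctlr_el1_c) {
        return NON_CACHEABLE;
    }
    return assigned;
}
```

The check on HCR_EL2.DC comes first because the text above gives it priority over SCTLR_EL1.{I, C}. The separate I and C tests show how an instruction access and a data access to the same location can end up with different attributes.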


Note 


• In conjunction with the requirements in Non-cacheable accesses and instruction caches, the requirements in 
this section mean the architecturally required effect of SCTLR_ELx.I is limited to its effect on caching 
instruction accesses in unified caches. 

• This specification can give rise to different cacheability attributes between instruction and data accesses to 
the same location. Where this occurs, the measures for mismatched memory attributes described in Mismatched 
memory attributes on page B2-105 must be followed to manage the corresponding loss of coherency. 





D3.4.5 Behavior of caches at reset 

In ARMv8: 

• All caches reset to IMPLEMENTATION DEFINED states that might be UNKNOWN. 


° The Cacheability control fields described in Enabling and disabling the caching of memory accesses on 
page D3-1696 reset to values that force all memory locations to be treated as Non-cacheable. 
Note 


This applies only to the controls that apply to the Translation regime that is used by the Exception level and 
Security state entered on reset. 








° An implementation can require the use of a specific cache initialization routine to invalidate its storage array 
before caching is enabled. The exact form of any required initialization routine is IMPLEMENTATION DEFINED, 
and the routine must be documented clearly as part of the documentation of the device. 


• If an implementation permits cache hits when the Cacheability control fields force all memory locations to 
be treated as Non-cacheable then the cache initialization routine must: 

— Provide a mechanism to ensure the correct initialization of the caches. 
— Be documented clearly as part of the documentation of the device. 


In particular, if an implementation permits cache hits when the Cacheability controls force all memory 
locations to be treated as Non-cacheable, and the cache contents are not invalidated at reset, the initialization 
routine must avoid any possibility of running from an uninitialized cache. It is acceptable for an initialization 
routine to require a fixed instruction sequence to be placed in a restricted range of memory. 


° ARM recommends that whenever an invalidation routine is required, it is based on the ARMv8 cache 
maintenance instructions. 


See also TLB behavior at reset on page D4-1813. 


D3.4.6 Non-cacheable accesses and instruction caches 


In AArch64 state, instruction accesses to Non-cacheable Normal memory can be held in instruction caches. 
Correspondingly, the sequence for ensuring that modifications to instructions are available for execution must 
include invalidation of the modified locations from the instruction cache, even if the instructions are held in Normal 
Non-cacheable memory. This includes cases where System register Cacheability control fields force instruction 
accesses to memory to be Non-cacheable. 


Therefore when using self-modifying code in Non-cacheable space in a uniprocessor system, the following sequence 
is required: 


; Enter this code with Wt containing the new 32-bit instruction 
; to be held at a location pointed to by Xn in Normal Non-cacheable memory. 
STR Wt, [Xn]     ; Write the new instruction 
DSB ISH          ; Ensure visibility of the data stored 
IC IVAU, Xn      ; Invalidate instruction cache by VA to PoU 
DSB ISH          ; Ensure completion of the invalidations 
ISB              ; Ensure that subsequent instructions are fetched again 


In a multiprocessor system, the IC IVAU is broadcast to all PEs within the Inner Shareable domain of the PE running 
this sequence, but additional software steps might be required to synchronize the threads with other PEs. This might 
be necessary so that the PEs executing the modified instructions can execute an ISB after completing the 
invalidation, and to avoid issues associated with concurrent modification and execution of instruction sequences. 


Larger blocks of instructions can be modified using the IC IALLU instruction for a uniprocessor system, or an IC 
IALLUIS for a multiprocessor system. 


Note 


This section applies even when the Cacheability control fields force instruction accesses to memory in AArch64 
state to be Non-cacheable, as described in Enabling and disabling the caching of memory accesses on 
page D3-1696. 








D3.4.7 About cache maintenance in ARMv8 


The following sections give general information about cache maintenance: 

• Terms used in describing the maintenance instructions. 
• The ARMv8 abstraction of the cache hierarchy on page D3-1702. 

The following sections describe the cache maintenance instructions: 

• Instruction cache maintenance instructions (IC*) on page D3-1704. 
• Data cache maintenance instructions (DC*) on page D3-1704. 

Note 


Some descriptions of the cache maintenance instructions refer to the cacheability of the address on which the 
instruction operates. The Cacheability of an address is determined by the applicable translation table entry for that 
address, as modified by any applicable System register Cacheability controls, such as the SCTLR_EL1.{I, C} 
controls. 





Terms used in describing the maintenance instructions 


Cache maintenance instructions are defined to act on particular memory locations. Instructions can be defined: 
° By the address of the memory location to be maintained, referred to as operating by VA. 


° By a mechanism that describes the location in the hardware of the cache, referred to as operating by set/way. 
In addition, for instruction caches, there are instructions that invalidate all entries. 


The following subsections define the terms used in the descriptions of the cache maintenance instructions: 





° Terminology for cache maintenance instruction operating by virtual address, VA on page D3-1700. 
° Terminology for cache maintenance instructions operating by set/way on page D3-1700. 
° Terminology for Clean, Invalidate, and Clean and Invalidate instructions on page D3-1701. 

Terminology for cache maintenance instruction operating by virtual address, VA 


The addresses used by the PE are VAs. When all applicable stages of translation are disabled, the virtual address is 
identical to the physical address. 


Note 


For more information about memory system behavior when address translation is disabled, see The effects of 
disabling a stage of address translation on page D4-1767. 








For the cache maintenance instructions, any instruction described as operating by VA includes as part of any required 
VA to PA translation: 

° For an instruction executed at EL1, the current system Address Space IDentifier, ASID. 

° The current Security state. 

° Whether the instruction was performed from Hyp mode, or from Non-secure EL1 state. 

° For an instruction executed from a Non-secure EL1 state, the Virtual Machine IDentifier, VMID. 


For a data or unified cache maintenance instruction by VA, the operation cannot generate a Data Abort exception 
for a Permission fault, except for the Permission fault cases described in: 


° Data cache maintenance instructions (DC*) on page D3-1704. 
° Stage 2 fault on a stage 1 translation table walk on page D4-1806. 


For an instruction cache maintenance instruction by VA: 


° It is IMPLEMENTATION DEFINED whether the operation can generate a Data Abort exception for a Translation 
fault or an Access flag fault. 


° The operation cannot generate a Data Abort exception for a Permission fault, except for the Permission fault 
case described in Stage 2 fault on a stage 1 translation table walk on page D4-1806. 


For more information about these faults, see MMU faults on page D4-1800. 


Terminology for cache maintenance instructions operating by set/way 


Cache maintenance instructions that operate by set/way refer to the particular structures in a cache. Three parameters 
describe the location in a cache hierarchy that an instruction works on. These parameters are: 


Level   The cache level of the hierarchy. The number of levels of cache is IMPLEMENTATION DEFINED. The 
        cache levels that can be managed using the architected cache maintenance instructions that operate 
        by set/way can be determined from the CLIDR_EL1. 

        In the ARM architecture, the lower numbered cache levels are those closest to the PE. See Memory 
        hierarchy on page B2-71. 

Set     Each level of a cache is split up into a number of sets. Each set is a set of locations in a cache level 
        to which an address can be assigned. Usually, the set number is an IMPLEMENTATION DEFINED 
        function of an address. 

        In the ARM architecture, sets are numbered from 0. 

Way     The associativity of a cache is the number of locations in a set to which a specific address can be 
        assigned. The way number specifies one of these locations. 

        In the ARM architecture, ways are numbered from 0. 


Note 


Because the allocation of a memory address to a cache location is entirely IMPLEMENTATION DEFINED, ARM expects 
that most portable software will use only the cache maintenance instructions by set/way as single steps in a routine 
to perform maintenance on the entire cache. 

Terminology for Clean, Invalidate, and Clean and Invalidate instructions 


Caches introduce coherency problems in two possible directions: 


1. An update to a memory location by a PE that accesses a cache might not be visible to other observers that 
can access memory. This can occur because new updates are still in the cache and are not visible yet to the 
other observers that do not access that cache. 


2. Updates to memory locations by other observers that can access memory might not be visible to a PE that 
accesses a cache. This can occur when the cache contains an old, or stale, copy of the memory location that 
has been updated. 


The Clean and Invalidate instructions address these two issues. The definitions of these instructions are: 


Clean 

A cache clean instruction ensures that updates made by an observer that controls the cache are made 
visible to other observers that can access memory at the point to which the instruction is performed. 
Once the Clean has completed, the new memory values are guaranteed to be visible to the point to 
which the instruction is performed, for example to the Point of Unification. 

The cleaning of a cache entry from a cache can overwrite memory that has been written by another 
observer only if the entry contains a location that has been written to by an observer in the 
shareability domain of that memory location. 

Invalidate 

A cache invalidate instruction ensures that updates made visible by observers that access memory 
at the point to which the invalidate is defined, are made visible to an observer that controls the cache. 
This might result in the loss of updates to the locations affected by the invalidate instruction that 
have been written by observers that access the cache, if those updates have not been cleaned from 
the cache since they were made. 

If the address of an entry on which the invalidate instruction operates is Normal, Non-cacheable or 
any type of Device memory then an invalidate instruction also ensures that this address is not 
present in the cache. 

Note 

Entries for addresses that are Normal Cacheable can be allocated to the cache at any time, and so 
the cache invalidate instruction cannot ensure that the address is not present in a cache. 





Clean and Invalidate 


A cache clean and invalidate instruction behaves as the execution of a clean instruction followed 
immediately by an invalidate instruction. Both instructions are performed to the same location. 


The points to which a cache maintenance instruction can be defined differ depending on whether the instruction 
operates by VA or by set/way: 


° For instructions operating by set/way, the point is defined to be to the next level of caching. For the All 
operations, the point is defined as the Point of Unification for each location held in the cache. 


° For instructions operating by VA, two conceptual points are defined: 


Point of Coherency (PoC) 


For a particular VA, the PoC is the point at which all agents that can access memory are 
guaranteed to see the same copy of a memory location. In many cases, this is effectively the main 
system memory, although the architecture does not prohibit the implementation of caches beyond 
the PoC that have no effect on the coherence between memory system agents. 


Note 


The presence of system caches can affect the definition of point of coherency as described in 
System level caches on page D3-1713. 

Point of Unification (PoU) 


The PoU for a PE is the point by which the instruction and data caches and the translation table 
walks of that PE are guaranteed to see the same copy of a memory location. In many cases, the 
Point of Unification is the point in a uniprocessor memory system by which the instruction and 
data caches and the translation table walks have merged. 


The PoU for an Inner Shareable shareability domain is the point by which the instruction and data 
caches and the translation table walks of all the PEs in that Inner Shareable shareability domain 
are guaranteed to see the same copy of a memory location. Defining this point permits 
self-modifying software to ensure future instruction fetches are associated with the modified 
version of the software by using the standard correctness policy of: 


1. Clean data cache entry by address. 


2. Invalidate instruction cache entry by address. 

The following fields in the CLIDR_EL1 relate to these conceptual points: 


LoC, Level of Coherence 


This field defines the last level of cache that must be cleaned or invalidated when cleaning or 
invalidating to the Point of Coherency. The LoC value is a cache level, so, for example, if LoC 
contains the value 3: 


° A clean to the Point of Coherency operation requires the level 1, level 2 and level 3 caches 
to be cleaned. 
° Level 4 cache is the first level that does not have to be maintained. 


If the LoC field value is 0x0, this means that no levels of cache need to be cleaned or invalidated 
when cleaning or invalidating to the Point of Coherency. 

If the LoC field value is a nonzero value that corresponds to a level that is not implemented, this 
indicates that all implemented caches are before the Point of Coherency. 

LoUU, Level of Unification, uniprocessor 


This field defines the last level of cache that must be cleaned or invalidated when cleaning or 
invalidating to the Point of Unification for the PE. As with LoC, the LoUU value is a cache level. 


If the LoUU field value is 0x0, this means that no levels of cache need to be cleaned or invalidated 
when cleaning or invalidating to the Point of Unification. 

If the LoUU field value is a nonzero value that corresponds to a level that is not implemented, 
this indicates that all implemented caches are before the Point of Unification. 

LoUIS, Level of Unification, Inner Shareable 

In any implementation: 

° This field defines the last level of cache that must be cleaned or invalidated when cleaning 
or invalidating to the Point of Unification for the Inner Shareable shareability domain. As 
with LoC, the LoUIS value is a cache level. 

° If the LoUIS field value is 0x0, this means that no levels of cache need to be cleaned or 
invalidated when cleaning or invalidating to the Point of Unification for the Inner 
Shareable shareability domain. 

° If the LoUIS field value is a nonzero value that corresponds to a level that is not 
implemented, this indicates that all implemented caches are before the Point of 
Unification. 


For more information, see CLIDR_EL1, Cache Level ID Register on page D7-1912. 
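
As an illustration of how software might interpret the LoC, LoUU, and LoUIS fields, the following C sketch extracts them from a CLIDR_EL1 value that has already been read into a variable. The bit positions used (LoUIS in bits [23:21], LoC in bits [26:24], LoUU in bits [29:27]) are taken from the CLIDR_EL1 register description rather than from this section, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed field positions, from the CLIDR_EL1 register description:
 * LoUIS = bits [23:21], LoC = bits [26:24], LoUU = bits [29:27]. */
static inline unsigned clidr_louis(uint64_t clidr) { return (unsigned)((clidr >> 21) & 0x7u); }
static inline unsigned clidr_loc(uint64_t clidr)   { return (unsigned)((clidr >> 24) & 0x7u); }
static inline unsigned clidr_louu(uint64_t clidr)  { return (unsigned)((clidr >> 27) & 0x7u); }

/* Returns 1 if cache level `level` (numbered from 1) must be maintained when
 * cleaning or invalidating to the point described by `last_level`, where
 * `last_level` is a LoC, LoUU, or LoUIS field value. A field value of 0
 * means that no level needs maintenance. */
static inline int level_needs_maintenance(unsigned level, unsigned last_level)
{
    return level >= 1 && level <= last_level;
}
```

With a LoC value of 3, this reports that levels 1 to 3 need maintenance and that level 4 does not, matching the example given for LoC.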


The ARMv8 abstraction of the cache hierarchy 


The following subsections describe the ARMv8 abstraction of the cache hierarchy: 





° Cache maintenance instructions that operate by address on page D3-1703. 
° Cache maintenance instructions that operate by set/way on page D3-1703. 

Cache maintenance instructions that operate by address 

The address-based cache maintenance instructions are described as operating by VA. Each of these instructions is 
always qualified as being either: 

° Performed to the Point of Coherency. 

° Performed to the Point of Unification. 


See Terms used in describing the maintenance instructions on page D3-1699 for definitions of Point of Coherency 
and Point of Unification, and more information about possible meanings of VA. 


Cache maintenance instructions lists the address-based maintenance instructions. 


The CTR_EL0 holds minimum line length values for: 

° The instruction caches. 

° The data and unified caches. 


These values support efficient invalidation of a range of addresses, because this value is the most efficient address 
stride to use to apply a sequence of address-based maintenance instructions to a range of addresses. 


For the Invalidate data or unified cache line by VA instruction, the Cache Write-back Granule field of the CTR_EL0 
defines the maximum granule that a single invalidate instruction can invalidate. This meaning of the Cache 
Write-back Granule is in addition to its defining the maximum size that can be written back. 
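
The following C sketch shows one way to turn the CTR_EL0 fields mentioned above into byte counts. It assumes the field layout given in the CTR_EL0 register description (IminLine in bits [3:0], DminLine in bits [19:16], CWG in bits [27:24], each holding log2 of a number of 4-byte words) and operates on a value that the caller has already read from the register. The function names are hypothetical.

```c
#include <stdint.h>

/* Minimum instruction cache line length in bytes (assumed: bits [3:0],
 * log2 of the number of 4-byte words). */
static inline uint32_t ctr_icache_min_line_bytes(uint64_t ctr)
{
    return 4u << (unsigned)(ctr & 0xFu);
}

/* Minimum data or unified cache line length in bytes (assumed: bits [19:16],
 * same encoding). */
static inline uint32_t ctr_dcache_min_line_bytes(uint64_t ctr)
{
    return 4u << (unsigned)((ctr >> 16) & 0xFu);
}

/* Cache Write-back Granule in bytes (assumed: bits [27:24], same encoding).
 * Returns 0 when the field is 0, which is assumed here to mean that the
 * register provides no granule information. */
static inline uint32_t ctr_cwg_bytes(uint64_t ctr)
{
    unsigned cwg = (unsigned)((ctr >> 24) & 0xFu);
    return cwg ? (4u << cwg) : 0u;
}
```

The minimum line values are the address strides to use for a sequence of by-VA maintenance instructions. The granule bounds how much a single invalidate by VA can affect, so a buffer that is invalidated should not share a granule with unrelated dirty data.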


Cache maintenance instructions that operate by set/way 


Cache maintenance instructions lists the set/way-based maintenance instructions. Some encodings of these 
instructions include a required field that specifies the cache level for the instruction: 


° A clean instruction cleans from the level of cache specified through to at least the next level of cache, moving 
further from the PE. 
° An invalidate instruction invalidates only at the level specified. 

D3.4.8 Cache maintenance instructions 


The A64 cache maintenance instructions are part of the A64 system instruction class in the register encoding space. 
For encoding details and other general information on these system instructions, see System instructions on 
page C3-144, SYS on page C6-742 and Cache maintenance instructions, and data cache zero on page C5-276. 


The instruction and data cache maintenance instructions have the same functionality in AArch32 state and in 
AArch64 state. Table D3-3 shows these system instructions. Instructions that take an argument include Xt in the 
instruction description. 











Note 

In Table D3-3 the Point of Unification is the Point of Unification of the PE executing the cache maintenance 
instruction. 

Table D3-3 System instructions for cache maintenance 

System instruction  Instruction                                                    Notes 

Instruction cache maintenance instructions, see System instructions on page C3-144 

IC IALLUIS          Invalidate all to Point of Unification, Inner Shareable        EL1 or higher access. 
IC IALLU            Invalidate all to Point of Unification                         EL1 or higher access. 
IC IVAU, Xt         Invalidate by virtual address to Point of Unification          When SCTLR_EL1.UCI == 1, EL0 access. 
                                                                                   Otherwise, EL1 or higher access. 

Data cache maintenance instructions, see System instructions on page C3-144 

DC IVAC, Xt         Invalidate by virtual address to Point of Coherency            EL1 or higher access. 
DC ISW, Xt          Invalidate by set/way                                          EL1 or higher access. 
DC CVAC, Xt         Clean by virtual address to Point of Coherency                 When SCTLR_EL1.UCI == 1, EL0 access. 
                                                                                   Otherwise EL1 or higher access. 
DC CSW, Xt          Clean by set/way                                               EL1 or higher access. 
DC CVAU, Xt         Clean by virtual address to Point of Unification               When SCTLR_EL1.UCI == 1, EL0 access. 
                                                                                   Otherwise EL1 or higher access. 
DC CIVAC, Xt        Clean and invalidate by virtual address to Point of Coherency  When SCTLR_EL1.UCI == 1, EL0 access. 
                                                                                   Otherwise EL1 or higher access. 
DC CISW, Xt         Clean and invalidate by set/way                                EL1 or higher access. 





A DSB or DMB instruction intended to ensure the completion of cache or branch predictor maintenance instructions 
must have an access type of both loads and stores. 


The following subsections give more information about these instructions: 

° Instruction cache maintenance instructions (IC*). 
° Data cache maintenance instructions (DC*). 
° EL0 accessibility to cache maintenance instructions on page D3-1705. 
° General requirements for the scope of maintenance instructions on page D3-1705. 
° Effects of instructions that operate by VA to the Point of Coherency on page D3-1706. 
° Effects of instructions that operate by VA but not to the Point of Coherency on page D3-1706. 
° Effects of All and set/way maintenance instructions on page D3-1707. 
° Effects of virtualization and security on the cache maintenance instructions on page D3-1707. 
° Boundary conditions for cache maintenance instructions on page D3-1708. 
° Ordering and completion of data and instruction cache instructions on page D3-1709. 
° Performing cache maintenance instructions on page D3-1710. 


Instruction cache maintenance instructions (IC*) 
The A64 assembly syntax for these instructions is described in System instructions on page C3-144. 


Where an address argument for these instructions is required, it takes the form of a 64-bit register that holds the 
virtual address argument. No restrictions apply for this address. 


See also EL0 accessibility to cache maintenance instructions on page D3-1705 and Ordering and completion of 
data and instruction cache instructions on page D3-1709. 


Data cache maintenance instructions (DC*) 
The A64 assembly syntax for these instructions is described in System instructions on page C3-144. 


Where an address argument for these instructions is required, it takes the form of a 64-bit register that holds the 
virtual address argument. No alignment restrictions apply for this address. 


Data cache maintenance instructions that take a set/way/level argument take a 64-bit register, the upper 32 bits of 
which are RES0. 
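
A minimal sketch of how the set/way/level argument might be assembled is given here. The field positions are not defined in this section. The sketch assumes the layout given in the DC ISW, DC CSW, and DC CISW instruction descriptions: Level minus 1 in bits [3:1], the set number starting at bit L, and the way number left-aligned in bits [31:32-A], where L is log2 of the line length in bytes and A is log2 of the associativity, rounded up. The function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Smallest n such that (1 << n) >= x, for x >= 1. */
static unsigned ceil_log2(uint32_t x)
{
    unsigned n = 0;
    while (((uint64_t)1 << n) < x)
        n++;
    return n;
}

/* Builds the Xt operand for a data cache maintenance instruction by set/way.
 * level      cache level, numbered from 1 as in the text (must be >= 1)
 * set, way   numbered from 0
 * line_bytes line length of that cache level in bytes (a power of two)
 * num_ways   associativity of that cache level */
static uint64_t dc_setway_operand(unsigned level, uint32_t set, uint32_t way,
                                  uint32_t line_bytes, uint32_t num_ways)
{
    unsigned l = ceil_log2(line_bytes);
    unsigned a = ceil_log2(num_ways);
    uint64_t v = ((uint64_t)(level - 1) & 0x7u) << 1;   /* Level field */

    v |= (uint64_t)set << l;                            /* Set field */
    if (a != 0)                                         /* direct-mapped: no Way bits */
        v |= (uint64_t)way << (32 - a);                 /* Way field */

    return v & 0xFFFFFFFFu;                             /* upper 32 bits are RES0 */
}
```

For example, way 3 of set 5 in a 4-way level 2 cache with 64-byte lines gives 0xC0000142 under these assumptions. The line length and associativity of each level would normally come from the cache size identification registers, which this sketch takes as inputs.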


DC IVAC requires write permission or else a Permission fault is generated. 

A DC IVAC or DC ISW executed at Non-secure EL1 is performed by the PE as clean and invalidate, that is as a DC 
CIVAC or DC CISW, if both of the following apply: 

° EL2 is implemented. 
° HCR_EL2.VM is set to 1, meaning the second stage of address translation is enabled. 

Note 

This also applies to the AArch32 cache maintenance instructions DCIMVAC and DCISW, see AArch32 data cache 
maintenance instructions (DC*) on page G3-4001. 





If a data cache maintenance by set/way instruction specifies a set, way, or level argument that is larger than the value 
supported by the implementation then the instruction is CONSTRAINED UNPREDICTABLE, see Out of range values of 
the Set/Way/Index fields in cache maintenance instructions on page K1-5492 or the instruction description. 


If a memory fault that sets the FAR for the translation regime applicable for the cache maintenance instruction is 
generated from a data cache maintenance instruction, the FAR holds the address specified in the register argument 
of the instruction. 


Note 


Despite its mnemonic, DC ZVA is not a cache maintenance instruction. For more information, see DC ZVA, Data 
Cache Zero by VA on page C5-359. 








See also EL0 accessibility to cache maintenance instructions and Ordering and completion of data and instruction 
cache instructions on page D3-1709. 


EL0 accessibility to cache maintenance instructions 

The SCTLR_EL1.UCI bit enables EL0 access for the DC CVAU, DC CVAC, DC CIVAC, and IC IVAU 
instructions. When EL0 use of these instructions is disabled because SCTLR_EL1.UCI == 0, executing one of these 
instructions at EL0 generates a trap to EL1, that is reported using EC = 0x18. 

For these instructions read access permission is required. When the value of SCTLR_EL1.UCI is 1: 

° For the DC CVAU, DC CVAC, and DC CIVAC instructions, if the instruction is executed at EL0 and the 
address specified in the argument cannot be read at EL0, a Permission fault is generated. 

° For the IC IVAU instruction, if the instruction is executed at EL0 and the address specified in the argument 
cannot be read at EL0, it is IMPLEMENTATION DEFINED whether a Permission fault is generated. 

Software can read the CTR_EL0 to discover the stride needed for cache maintenance instructions. The 
SCTLR_EL1.UCT bit enables EL0 access to the CTR_EL0. When EL0 access to the Cache Type register is 
disabled, a register access instruction executed at EL0 is trapped to EL1 using EC = 0x18. 
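
The EL0 access rules above can be summarized as a small decision function. The following C sketch is a model for reasoning about the rules, not code that runs on the PE. The enumeration and function names are hypothetical, and the sketch does not model what happens when an instruction that is never available at EL0 is executed there.

```c
#include <stdbool.h>

typedef enum {
    CMO_DC_CVAU, CMO_DC_CVAC, CMO_DC_CIVAC, CMO_IC_IVAU,   /* controlled by UCI */
    CMO_DC_IVAC, CMO_DC_ISW, CMO_DC_CSW, CMO_DC_CISW,
    CMO_IC_IALLU, CMO_IC_IALLUIS
} cmo_t;

typedef enum {
    EL0_EXECUTES,          /* executes at EL0, subject to the permission rules */
    EL0_TRAP_EC_0X18,      /* trapped to EL1, reported using EC = 0x18 */
    EL0_NOT_ACCESSIBLE     /* EL1 or higher access only */
} el0_result_t;

/* Outcome of executing a cache maintenance instruction at EL0 for a given
 * value of SCTLR_EL1.UCI. */
static el0_result_t el0_cmo_access(cmo_t op, bool sctlr_uci)
{
    switch (op) {
    case CMO_DC_CVAU:
    case CMO_DC_CVAC:
    case CMO_DC_CIVAC:
    case CMO_IC_IVAU:
        return sctlr_uci ? EL0_EXECUTES : EL0_TRAP_EC_0X18;
    default:
        return EL0_NOT_ACCESSIBLE;
    }
}

/* True if a read of CTR_EL0 at EL0 is trapped to EL1 for a given value of
 * SCTLR_EL1.UCT. */
static bool el0_ctr_read_traps(bool sctlr_uct)
{
    return !sctlr_uct;
}
```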


General requirements for the scope of maintenance instructions 


The ARMv8 specification of the cache maintenance instructions describes what each instruction is guaranteed to do 
in a system. It does not limit other behaviors that might occur, provided they are consistent with the requirements 
described in General behavior of the caches on page D3-1693, Behavior of caches at reset on page D3-1698, and 
Preloading caches on page B2-74. 


This means that as a side-effect of a cache maintenance instruction: 


° Any location in the cache might be cleaned. 
° Any unlocked location in the cache might be cleaned and invalidated. 

Note 





ARM recommends that, for best performance, such side-effects are kept to a minimum. ARM strongly recommends 
that the side-effects of operations performed in Non-secure state do not have a significant performance impact on 
execution in Secure state. 

Effects of instructions that operate by VA to the Point of Coherency 


For Normal memory that is not Inner Non-cacheable, Outer Non-cacheable, these instructions must affect the caches 
of other PEs in the shareability domain described by the shareability attributes of the VA supplied with the 
instruction. 

For Device memory and Normal memory that is Inner Non-cacheable, Outer Non-cacheable, these instructions must 
affect the caches of all PEs in the Outer Shareable shareability domain of the PE on which the instruction is 
operating. 

In all cases, for any affected PE, these instructions affect all data and unified caches to the Point of Coherency. 


Table D3-4 shows the scope of the Data and unified cache maintenance instructions. 


Table D3-4 PEs affected by cache maintenance instructions to the Point of Coherency 

Shareability      PEs affected                                        Effective to 
Non-shareable     The PE performing the operation                     The Point of Coherency of the entire system 
Inner Shareable   All PEs in the same Inner Shareable shareability    The Point of Coherency of the entire system 
                  domain as the PE performing the operation 
Outer Shareable   All PEs in the same Outer Shareable shareability    The Point of Coherency of the entire system 
                  domain as the PE performing the operation 





Effects of instructions that operate by VA but not to the Point of Coherency 


For these instructions, Table D3-5 shows how, for a VA in a Normal or Device memory location, the shareability 
attribute of the VA determines the minimum set of PEs affected, and the point to which the instruction must be 
effective. 

Table D3-5 PEs affected by cache maintenance instructions to the Point of Unification 

Shareability        PEs affected                        Effective to 
Non-shareable       The PE executing the instruction    The Point of Unification of instruction cache fills, data cache fills and 
                                                        write-backs, and translation table walks, on the PE executing the instruction 
Inner Shareable or  All PEs in the same Inner Shareable The Point of Unification of instruction cache fills, data cache fills and 
Outer Shareable     shareability domain as the PE       write-backs, and translation table walks, of all PEs in the same Inner 
                    executing the instruction           Shareable shareability domain as the PE executing the instruction 








Note 
The set of PEs guaranteed to be affected is never greater than the PEs in the Inner Shareable shareability domain 
containing the PE executing the instruction. 

Effects of All and set/way maintenance instructions 

The IC IALLU and DC set/way instructions apply only to the caches of the PE that performs the instruction. 


The IC IALLUIS instruction can affect the caches of all PEs in the same Inner Shareable shareability domain as the 
PE that performs the instruction. This instruction has an effect to the Point of Unification of instruction cache fills, 
data cache fills and write-backs, and translation table walks, of all PEs in the same Inner Shareable shareability 
domain. 


Note 


The possible presence of system caches, as described in System level caches on page D3-1713, means that the 
architecture does not guarantee that all levels of the cache can be maintained using set/way instructions. 








Effects of virtualization and security on the cache maintenance instructions 
Each Security state has its own physical address (PA) space, therefore cache entries are associated with PA space. 


Table D3-6 shows the effects of virtualization and security on the cache maintenance instructions. In the table, the 
Specified entries are entries that the architecture requires the instruction to affect. The rules described in General 
behavior of the caches on page D3-1693 mean that an instruction might also affect other entries. 


Table D3-6 Effects of virtualization and security on the maintenance instructions 

Cache maintenance instructions  Security state  Specified entries 

Data or unified cache maintenance instructions 

Invalidate, Clean, or Clean     Both            All lines that hold the PA that, in the current Security state, is mapped to by 
and Invalidate by VA:                           the combination of all of: 
IVAC, CVAC, CVAU, CIVAC                         ° The specified VA. 
                                                ° For an instruction executed at EL1 or EL0, the current ASID if the location 
                                                  is mapped to by a non-global page. 
                                                ° For an instruction executed at Non-secure EL1 or Non-secure EL0, the 
                                                  current VMID[a]. 

Invalidate, Clean, or Clean     Non-secure      Line specified by set/way provided that the entry comes from the Non-secure 
and Invalidate by set/way:                      PA space. 
ISW, CSW, CISW 
                                Secure          Line specified by set/way regardless of the PA space that the entry has come 
                                                from. 

Instruction cache maintenance instructions 

Invalidate by VA: IVAU          Both            All lines corresponding to the specified VA[b] in the current translation 
                                                regime and: 
                                                ° For an instruction executed at EL1 or EL0, the current ASID. 
                                                ° For an instruction executed at Non-secure EL1 or Non-secure EL0, the 
                                                  current VMID[a]. 

Invalidate All: IALLU,          Both            For an instruction executed at: 
IALLUIS                                         ° Non-secure EL1, all instruction cache lines containing entries associated 
                                                  with the current VMID[a]. 
                                                ° EL2, all instruction cache lines containing Non-secure entries. 
                                                ° EL3 or Secure EL1, all instruction cache lines. 





a. Dependencies on the VMID apply even when HCR_EL2.VM is set to 0. The architecture does not define a reset value for 
VTTBR_EL2.VMID, and therefore, in any implementation that includes EL2, the boot software executed when reset is deasserted must 
initialize VTTBR_EL2.VMID. 

b. The type of instruction cache used affects the interpretation of the specified entries in this table such that: 
° For a PIPT instruction cache, the cache maintenance applies to all entries whose physical address corresponds to the specified address. 
° For a VIPT instruction cache, the cache maintenance applies to entries whose virtual index and physical tag correspond to the specified 
address. 
° For an ASID and VMID tagged VIVT instruction cache, the cache maintenance applies to entries whose virtual address corresponds to 
the specified address. 

For information on types of instruction cache see Instruction caches on page D4-1829. 


For locked entries and entries that might be locked, the behavior of cache maintenance instructions described in The 
interaction of cache lockdown with cache maintenance instructions on page D3-1712 applies. 


With an implementation that generates aborts if entries are locked or might be locked in the cache, when the use of 
lockdown aborts is enabled, these aborts can occur on any cache maintenance instructions. 


In an implementation that includes EL2: 


° The architecture does not require cache cleaning when switching between virtual machines. Cache 
invalidation by set/way must not present an opportunity for one virtual machine to corrupt state associated 
with a second virtual machine. To ensure this requirement is met, Non-secure clean by set/way operations 
can be upgraded to clean and invalidate by set/way. 


° The AArch64 Data cache invalidate instructions, DC IVAC and DC ISW, at EL1 and EL0, and the AArch32 
Data cache invalidate instructions DCIMVAC and DCISW, at EL1, perform a cache clean as well as a cache 
invalidation if both of the following apply: 

— The value of HCR.VM is 1. 

— The instruction is executed in Non-secure state, or EL3 is not implemented. 

This means that, in Non-secure state: 

— At EL1 using AArch64 or EL0 using AArch64, a DC IVAC instruction operates as DC CIVAC, and a 
DC ISW instruction operates as DC CISW. 

— At EL1 using AArch32, a DCIMVAC instruction operates as DCCIMVAC, and a DCISW instruction 
operates as DCCISW. 

° The AArch64 Data cache invalidate by set/way instruction DC ISW, at EL1 and EL0, and the AArch32 Data 
cache invalidate by set/way instruction DCISW, at EL1, perform a cache clean as well as a cache invalidation 
if both of the following apply: 

— The value of HCR_EL2.SWIO is 1. 

— The instruction is executed in Non-secure state, or EL3 is not implemented. 

This means that, in Non-secure state: 

— At EL1 using AArch64 or EL0 using AArch64, a DC ISW instruction operates as DC CISW. 

— At EL1 using AArch32, a DCISW instruction operates as DCCISW. 

° When the value of HCR_EL2.FB is 1, TLB and instruction cache invalidate instructions executed in the 
Non-secure EL1 Exception level are broadcast across the Inner Shareable domain. When Non-secure EL1 is 
using AArch64, this applies to the TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1, TLBI 
VALE1, TLBI VAALE1, and IC IALLU instructions. This means the instruction is upgraded to the 
corresponding Inner Shareable instruction, for example IC IALLU is upgraded to IC IALLUIS. 
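
The upgrade rules in this list can be expressed as a small model. The following C sketch covers only the AArch64 instructions and assumes an implementation that includes EL2, with the instruction executed at EL1 (or EL0 where the text above says so). It is an aid to reading the rules, with hypothetical type and function names, and is not a description of any particular implementation.

```c
#include <stdbool.h>

typedef enum { DC_IVAC, DC_ISW, DC_CIVAC, DC_CISW } dc_op;
typedef enum { IC_IALLU, IC_IALLUIS } ic_op;

/* Operation actually performed for a data cache invalidate.
 * hcr_vm, hcr_swio    values of the VM and SWIO control bits
 * nonsecure_or_no_el3 true if executed in Non-secure state, or if EL3 is
 *                     not implemented */
static dc_op effective_dc_op(dc_op op, bool hcr_vm, bool hcr_swio,
                             bool nonsecure_or_no_el3)
{
    if (!nonsecure_or_no_el3)
        return op;
    if (op == DC_IVAC && hcr_vm)
        return DC_CIVAC;                  /* invalidate becomes clean and invalidate */
    if (op == DC_ISW && (hcr_vm || hcr_swio))
        return DC_CISW;
    return op;
}

/* Operation actually performed for IC IALLU when HCR_EL2.FB forces
 * broadcast across the Inner Shareable domain. */
static ic_op effective_ic_op(ic_op op, bool hcr_fb, bool at_nonsecure_el1)
{
    return (op == IC_IALLU && hcr_fb && at_nonsecure_el1) ? IC_IALLUIS : op;
}
```

Note that SWIO affects only the set/way form, whereas VM affects both DC IVAC and DC ISW.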


For more information about the cache maintenance instructions, see About cache maintenance in ARMv8 on 
page D3-1699, Cache maintenance instructions on page D3-1703, and Chapter D4 The AArch64 Virtual Memory 
System Architecture. 


Boundary conditions for cache maintenance instructions 


Cache maintenance instructions operate on the caches regardless of whether the System register Cacheability 
controls force all memory accesses to be Non-cacheable. 

For address-based cache maintenance instructions, the instructions operate on the caches regardless of the memory 
type and cacheability attributes marked for the memory address in the VMSA translation table entries. This means 
that the effects of the cache maintenance instructions can apply regardless of: 


° Whether the address accessed: 
— Is Normal memory or Device memory. 
— Has the Cacheable attribute or the Non-cacheable attribute. 


° Any applicable domain control of the address accessed. 

° The access permissions for the address accessed, other than the effect of the stage two write permission on 
data or unified cache invalidation instructions. 

Ordering and completion of data and instruction cache instructions 

All data cache instructions, other than DC ZVA, that specify an address: 


° Execute in program order relative to loads or stores that access an address in Normal memory with either 
Inner Write Through or Inner Write Back attributes within the same cache line of minimum size, as indicated 
by CTR_EL0.DMinLine. 

° Can execute in any order relative to loads or stores that access any address with the Device memory attribute, 
or with Normal memory with Inner Non-cacheable attribute unless a DMB or DSB is executed between the 
instructions. 

° Execute in program order relative to other data cache maintenance instructions, other than DC ZVA, that specify 
an address within the same cache line of minimum size, as indicated by CTR_EL0.DMinLine. 

° Can execute in any order relative to loads or stores that access an address in a different cache line of minimum 
size, as indicated by CTR_EL0.DMinLine, unless a DMB or DSB is executed between the instructions. 

° Can execute in any order relative to other data cache maintenance instructions, other than DC ZVA, that specify 
an address in a different cache line of minimum size, as indicated by CTR_EL0.DMinLine, unless a DMB or 
DSB is executed between the instructions. 

° Can execute in any order relative to data cache maintenance instructions that do not specify an address unless 
a DMB or DSB is executed between the instructions. 

° Can execute in any order relative to instruction cache maintenance instructions unless a DSB is executed 
between the instructions. 
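
Several of the rules above depend on whether two addresses fall within the same cache line of minimum size. The following C sketch shows that test, together with the first rule expressed as a predicate. It assumes that the DMinLine value encodes log2 of the line length in 4-byte words, and that both addresses are of the same kind (for example, both are physical addresses). The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addresses a and b lie in the same cache line of minimum size.
 * dminline is the CTR_EL0.DMinLine field value (assumed encoding: log2 of
 * the number of 4-byte words in the line). */
static bool same_min_line(uint64_t a, uint64_t b, unsigned dminline)
{
    uint64_t line_bytes = (uint64_t)4 << dminline;
    return (a / line_bytes) == (b / line_bytes);
}

/* True if a data cache maintenance instruction that specifies dc_addr is
 * guaranteed to execute in program order relative to a load or store to
 * ls_addr without an intervening DMB or DSB. The guarantee applies only to
 * Normal memory with Inner Write Through or Inner Write Back attributes. */
static bool dc_ordered_with_access(uint64_t dc_addr, uint64_t ls_addr,
                                   unsigned dminline,
                                   bool ls_is_normal_inner_wt_or_wb)
{
    return ls_is_normal_inner_wt_or_wb && same_min_line(dc_addr, ls_addr, dminline);
}
```

When the predicate is false, software that needs a particular order has to place a DMB or DSB between the two instructions.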


Note 


° Data cache ordering rules by address are consistent with physically indexed physically tagged caches. See 
Data and unified caches on page D4-1829. 





° Data cache zero instruction on page D3-1711 describes the ordering and completion rules for Data Cache 
Zero. 





All data cache maintenance instructions that do not specify an address: 


• Can execute in any order relative to data cache maintenance instructions that do not specify an address unless
a DMB or DSB is executed between the instructions.

• Can execute in any order relative to data cache maintenance instructions that specify an address, other than
Data Cache Zero, unless a DMB or DSB is executed between the instructions.

• Can execute in any order relative to loads or stores unless a DMB or DSB is executed between the instructions.

• Can execute in any order relative to instruction cache maintenance instructions unless a DSB is executed
between the instructions.


All instruction cache instructions can execute in any order relative to other instruction cache instructions, data cache 
instructions, loads, and stores unless a DSB is executed between the instructions. 


A cache maintenance instruction can complete at any time after it is executed, but is only guaranteed to be complete,
and its effects visible to other observers, following a DSB instruction executed by the PE that executed the cache 
maintenance instruction. 


In all cases, where the text in this section refers to a DMB or a DSB, this means a DMB or DSB whose required access type 
is both loads and stores. 





Note

These ordering requirements are extended from the requirements in AArch32 state given in:
• Ordering of cache and branch predictor maintenance instructions on page G3-4007.
• AArch32 instruction cache maintenance instructions (IC*) on page G3-4001.





Performing cache maintenance instructions 


To ensure all cache lines in a block of address space are maintained through all levels of cache ARM strongly 
recommends that software: 


• For data or unified cache maintenance, uses the CTR_EL0.DMinLine value to determine the loop increment
size for a loop of data cache maintenance by VA instructions.

• For instruction cache maintenance, uses the CTR_EL0.IMinLine value to determine the loop increment size
for a loop of instruction cache maintenance by VA instructions.
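

The recommendation above can be sketched as a small Python model of the address arithmetic. This is an illustration only, not code from the architecture: the function names are hypothetical, and the sketch assumes that the minimum data cache line size field occupies CTR_EL0 bits [19:16] and encodes log2 of the line size in 4-byte words. Real software would issue a DC instruction for each address the generator yields, followed by a DSB.

```python
def min_data_line_bytes(ctr_el0: int) -> int:
    """Smallest data cache line size in bytes.

    Assumption: the field is CTR_EL0[19:16] and holds log2(words per line).
    """
    log2_words = (ctr_el0 >> 16) & 0xF
    return 4 << log2_words


def maintenance_addresses(start: int, length: int, line_bytes: int):
    """Yield one address per minimum-size cache line covering [start, start+length)."""
    if length <= 0:
        return
    addr = start & ~(line_bytes - 1)   # round down to the start of the first line
    end = start + length
    while addr < end:
        yield addr                     # software would execute, for example, DC CVAC here
        addr += line_bytes             # loop increment is the minimum line size


if __name__ == "__main__":
    line = min_data_line_bytes(4 << 16)            # hypothetical CTR_EL0 value
    print(line, [hex(a) for a in maintenance_addresses(0x1010, 0x80, line)])
```

Stepping by the minimum line size means that no cache line in the range is skipped, at any level of cache, even when some levels use larger lines.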


Example code for cache maintenance instructions 


The cache maintenance instructions by set/way can clean or invalidate, or both, the entirety of one or more levels 
of cache attached to a PE. However, unless all PEs attached to the caches regard all memory locations as 
Non-cacheable, it is not possible to prevent locations being allocated into the cache during such a sequence of the 
cache maintenance instructions. 


Note 


Since the set/way instructions are performed only locally, there is no guarantee of the atomicity of cache 
maintenance between different PEs, even if those different PEs are each executing the same cache maintenance 
instructions at the same time. Since any cacheable line can be allocated into the cache at any time, it is possible for 
a cache line to migrate from an entry in the cache of one PE to the cache of a different PE in a way that means the 
line is not affected by set/way based cache maintenance. Therefore, ARM strongly discourages the use of set/way 
instructions to manage coherency in coherent systems. The expected use of the cache maintenance instructions that 
operate by set/way is limited to the cache maintenance associated with the powerdown and powerup of caches, if 
this is required by the implementation. 





The limitations of cache maintenance by set/way mean maintenance by set/way does not happen on multiple PEs, 
and cannot be made to happen atomically for each address on each PE. Therefore in multiprocessor or multithreaded 
systems, the use of cache maintenance by set/way to clean, or clean and invalidate, the entire cache for coherency 
management with very large buffers or with buffers with unknown address can fail to provide the expected 
coherency results because of speculation by other PEs, or possibly by other threads. The only way that these 
instructions can be used in this way is to first ensure that all PEs that might cause speculative accesses to caches that 
need to be maintained are not capable of generating speculative accesses. This can be achieved by ensuring that 
those PEs have no memory locations with a Normal Cacheable attribute. Such an approach can have very large 
system performance effects, and ARM advises implementers to use hardware coherency mechanisms in systems 
where this will be an issue. 


System level caches on page D3-1713 refers to other limitations of cache maintenance by set/way. 





The following example code for cleaning a data or unified cache to the Point of Coherency illustrates a generic 
mechanism for cleaning the entire data or unified cache to the Point of Coherency. 


        MRS     X0, CLIDR_EL1
        AND     W3, W0, #0x07000000     // Get 2 x Level of Coherence
        LSR     W3, W3, #23
        CBZ     W3, Finished
        MOV     W10, #0                 // W10 = 2 x cache level
        MOV     W8, #1                  // W8 = constant 0b1
Loop1:  ADD     W2, W10, W10, LSR #1    // Calculate 3 x cache level
        LSR     W1, W0, W2              // Extract 3-bit cache type for this level
        AND     W1, W1, #0x7
        CMP     W1, #2
        B.LT    Skip                    // No data or unified cache at this level
        MSR     CSSELR_EL1, X10         // Select this cache level
        ISB                             // Synchronize change of CSSELR
        MRS     X1, CCSIDR_EL1          // Read CCSIDR
        AND     W2, W1, #7              // W2 = log2(linelen)-4
        ADD     W2, W2, #4              // W2 = log2(linelen)
        UBFX    W4, W1, #3, #10         // W4 = max way number, right aligned
        CLZ     W5, W4                  // W5 = 32-log2(ways), bit position of way in DC operand
        LSL     W9, W4, W5              // W9 = max way number, aligned to position in DC operand
        LSL     W16, W8, W5             // W16 = amount to decrement way number per iteration
Loop2:  UBFX    W7, W1, #13, #15        // W7 = max set number, right aligned
        LSL     W7, W7, W2              // W7 = max set number, aligned to position in DC operand
        LSL     W17, W8, W2             // W17 = amount to decrement set number per iteration
Loop3:  ORR     W11, W10, W9            // W11 = combine way number and cache number ...
        ORR     W11, W11, W7            // ... and set number for DC operand
        DC      CSW, X11                // Do data cache clean by set and way
        SUBS    W7, W7, W17             // Decrement set number
        B.GE    Loop3
        SUBS    X9, X9, X16             // Decrement way number
        B.GE    Loop2
Skip:   ADD     W10, W10, #2            // Increment 2 x cache level
        CMP     W3, W10
        DSB     SY                      // Ensure completion of previous cache maintenance instruction
        B.GT    Loop1
Finished:


Similar approaches can be used for all cache maintenance instructions. 
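

As an aid to reading the assembly above, the following Python sketch models how the set/way operand is assembled for one cache level. It is a hedged illustration, not part of the architecture: the function names are hypothetical, and it assumes the same 32-bit CCSIDR_EL1 layout that the assembly decodes (line size in bits [2:0], maximum way number in bits [12:3], maximum set number in bits [27:13]).

```python
def clz32(value: int) -> int:
    """Count leading zeros of a 32-bit value, as the CLZ Wd, Wn instruction does."""
    return 32 - value.bit_length()


def set_way_operands(ccsidr: int, level: int):
    """Yield the DC CSW operands for one cache level, in the order the assembly issues them.

    level is the zero-based cache level; the assembly holds 2 x level in W10.
    """
    line_shift = (ccsidr & 0x7) + 4        # log2(line length in bytes)
    max_way = (ccsidr >> 3) & 0x3FF        # maximum way number
    max_set = (ccsidr >> 13) & 0x7FFF      # maximum set number
    way_shift = clz32(max_way) % 32        # a register LSL uses the shift amount modulo 32
    for way in range(max_way, -1, -1):     # count down, as Loop2 does
        for set_index in range(max_set, -1, -1):   # count down, as Loop3 does
            yield (way << way_shift) | (set_index << line_shift) | (level << 1)


if __name__ == "__main__":
    # Hypothetical cache: 4 ways, 256 sets, 64-byte lines.
    ccsidr = (255 << 13) | (3 << 3) | 2
    operands = list(set_way_operands(ccsidr, level=0))
    print(len(operands), hex(operands[0]), hex(operands[-1]))
```

The way number is left-aligned at bit 31 and the set number sits immediately above the line-offset bits, which is why the assembly derives the way shift from CLZ and the set shift from the line length.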


Data cache zero instruction 


The Data Cache Zero by Address instruction, DC ZVA, writes zero to each byte of a block of N bytes, aligned in memory
to N bytes in size, where the block in memory is identified by the address passed. There are no alignment restrictions
on the address supplied. The DCZID_EL0 register indicates the block size that is written with byte values of zero.
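

The relationship between the supplied address and the zeroed block can be sketched in Python. This is an illustrative model under stated assumptions, not a definition: the function name is hypothetical, and it assumes that DCZID_EL0 bits [3:0] hold log2 of the block size in 4-byte words and that bit [4] indicates that the instruction is prohibited.

```python
def dc_zva_block(dczid_el0: int, address: int):
    """Return (base, size_in_bytes) of the block DC ZVA would zero, or None if prohibited.

    Assumptions: DCZID_EL0[3:0] = log2(block size in words), DCZID_EL0[4] = prohibited.
    """
    if dczid_el0 & (1 << 4):
        return None
    size = 4 << (dczid_el0 & 0xF)          # block size N in bytes
    base = address & ~(size - 1)           # the address need not be aligned
    return base, size


if __name__ == "__main__":
    base, size = dc_zva_block(0x4, 0x1234)  # hypothetical register value and address
    print(hex(base), size)
```

Any address inside the block selects the same block, which is why the instruction places no alignment restriction on its operand.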


Software can restrict access to this operation. See Configurable instruction enables and disables, and trap controls 
on page D1-1562. 


If disabled, the operation at EL0 is trapped to EL1.


The DC ZVA instruction behaves as a set of stores to the location being accessed, and: 


• Generates a Permission fault if the translation regime being used when the instruction is executed does not
permit writes to the locations.

• Requires the same considerations for ordering and the management of coherency as any other store
instruction.


In addition:


• When the instruction is executed, it can generate memory faults or watchpoints that are prioritized in the same
way as other memory related faults or watchpoints. Where a synchronous Data Abort fault or a watchpoint
is generated, the CM bit in the syndrome field is not set to 1, which would be the case for all other cache
maintenance instructions. See Exception from a Data abort on page D1-1533 for more information about the
encoding of ESR_ELx and the associated ISS field.

• If the memory region being zeroed is any type of Device memory, then DC ZVA generates an Alignment fault
which is prioritized in the same way as other alignment faults that are determined by the memory type.










Note 


The architecture makes no statements about whether or not a DC ZVA instruction causes allocation to any particular 
level of the cache, for addresses that have a cacheable attribute for those levels of cache. 








D3.4.10 Cache lockdown


The concept of an entry locked in a cache is allowed, but not architecturally defined. How lockdown is achieved is
IMPLEMENTATION DEFINED and might not be supported by:
• An implementation.
• Some memory attributes.


An unlocked entry in a cache might not remain in that cache. The architecture does not guarantee that an unlocked
cache entry remains in the cache or remains incoherent with the rest of memory. Software must not assume that an
unlocked item that remains in the cache remains dirty.


A locked entry in a cache is guaranteed to remain in that cache. The architecture does not guarantee that a locked
cache entry remains incoherent with the rest of memory, that is, it might not remain dirty.


The interaction of cache lockdown with cache maintenance instructions


The interaction of cache lockdown and cache maintenance instructions is IMPLEMENTATION DEFINED. However, an
architecturally-defined cache maintenance instruction on a locked cache line must comply with the following
general rules:


• The effect of the following instructions on locked cache entries is IMPLEMENTATION DEFINED:
— Cache clean by set/way, DC CSW.
— Cache invalidate by set/way, DC ISW.
— Cache clean and invalidate by set/way, DC CISW.
— Instruction cache invalidate all, IC IALLU and IC IALLUIS.

However, one of the following approaches must be adopted in all these cases:
1. If the instruction specified an invalidation, a locked entry is not invalidated from the cache.
2. If the instruction specified a clean it is IMPLEMENTATION DEFINED whether locked entries are cleaned.
3. If an entry is locked down, or could be locked down, an IMPLEMENTATION DEFINED Data Abort
exception is generated, using the fault status code defined for this purpose. See Exception from a Data
abort on page D1-1533.

This permits a usage model for cache invalidate routines to operate on a large range of addresses by
performing the required operation on the entire cache, without having to consider whether any cache entries
are locked.

• The effect of the following instructions is IMPLEMENTATION DEFINED:
— Cache clean by virtual address, DC CVAC and DC CVAU.
— Cache invalidate by virtual address, DC IVAC.
— Cache clean and invalidate by virtual address, DC CIVAC.

However, one of the following approaches must be adopted in all these cases:
1. If the instruction specified an invalidation, a locked entry is invalidated from the cache. For the clean and
invalidate instructions, the entry must be cleaned before it is invalidated.
2. If the instruction specified an invalidation, a locked entry is not invalidated from the cache. If the instruction
specified a clean it is IMPLEMENTATION DEFINED whether locked entries are cleaned.
3. If an entry is locked down, or could be locked down, an IMPLEMENTATION DEFINED Data Abort exception is
generated, using the fault status code defined for this purpose. See ESR_ELx on page K12-5663.


In an implementation that includes EL2, if HCR_EL2.TIDCP is set to 1, any exception relating to lockdown of an
entry associated with Non-secure memory is routed to EL2.


Note


An implementation that uses an abort mechanism for entries that can be locked down but are not actually locked
down must:


• Document the IMPLEMENTATION DEFINED instruction sequences that perform the required operations on
entries that are not locked down.

• Implement one of the other permitted alternatives for the locked entries.


ARM recommends that, when possible, such IMPLEMENTATION DEFINED instruction sequences use 
architecturally-defined instructions. This minimizes the number of customized instructions required. 


In addition, an implementation that uses an abort to handle cache maintenance instructions for entries that might be 
locked must provide a mechanism that ensures that no entries are locked in the cache. 


The reset setting of the cache must be that no cache entries are locked. 





Additional cache functions for the implementation of lockdown 


An implementation can add additional cache maintenance functions for the handling of lockdown in the 
IMPLEMENTATION DEFINED spaces reserved for Cache Lockdown, see Reserved encodings for IMPLEMENTATION 
DEFINED registers on page C5-291. 


D3.4.11 System level caches 


The system level architecture might define further aspects of the software view of caches and the memory model 
that are not defined by the ARMv8 architecture. These aspects of the system level architecture can affect the 
requirements for software management of caches and coherency. For example, a system design might introduce 
additional levels of caching that cannot be managed using the architecturally-defined maintenance instructions. 
Such caches are referred to as system caches. 


Conceptually, three classes of system cache can be envisaged: 


1. System caches which lie before the point of coherency and cannot be managed by any cache maintenance 
instructions. Such systems fundamentally undermine the concept of cache maintenance instructions 
operating to the point of coherency, as they imply the use of non-architecture mechanisms to manage 
coherency. The use of such systems in the ARM architecture is explicitly prohibited. 


2. System caches which lie before the point of coherency and can be managed by cache maintenance by address 
instructions that apply to the point of coherency, but cannot be managed by cache maintenance by set/way 
instructions. Where maintenance of the entirety of such a cache must be performed, as in the case for power 
management, it must be performed using non-architectural mechanisms. 


3. System caches which lie beyond the point of coherency and so are invisible to the software. The management
of such caches is outside the scope of the architecture.


D3.4.12 Branch prediction


ARMv8 does not define any branch predictor maintenance instructions for AArch64 state.


If branch prediction is architecturally visible, cache maintenance must also apply to branch prediction. 


D3.5 External aborts


The ARM architecture defines external aborts as errors that occur in the memory system, other than those that are 
detected by the MMU or debug logic. External aborts include parity or ECC errors detected by the caches or other 
parts of the memory system. For example, an uncorrectable parity or ECC failure on a Level 2 Memory structure 
might generate an external abort. 


An external abort is one of the following: 


• Synchronous.
• Precise asynchronous.
• Imprecise asynchronous.


For more information, see Exception terminology on page D1-1499. 


The ARM architecture does not provide any method to distinguish between precise asynchronous and imprecise 
asynchronous external aborts. 


VMSAv8-64 permits external aborts on data accesses, translation table walks, and instruction fetches to be either 
synchronous or asynchronous. 


It is IMPLEMENTATION DEFINED which external aborts, if any, are supported. Asynchronous aborts are taken as 
SError interrupt exceptions. See Asynchronous exception types, routing, masking and priorities on page D1-1555. 


Synchronous external aborts are reported using the Instruction Abort and Data Abort exceptions. See Synchronous 
exception types, routing and priorities on page D1-1547. 


Normally, external aborts are rare. An imprecise asynchronous external abort is likely to be fatal to the process that
is running. ARM recommends that implementations make external aborts precise wherever possible.


The following subsections give more information about possible external aborts: 


• External abort on an instruction fetch.
• External abort on data read or write.
• Provision for the classification of external aborts.
• Parity or ECC error reporting on page D3-1715.


D3.5.1 External abort on an instruction fetch 


An external abort on an instruction fetch can be either synchronous or asynchronous. 
A synchronous external abort on an instruction fetch is taken precisely using the Instruction Abort exception. 


An implementation can report the external abort asynchronously from the instruction that it applies to. In such an 
implementation the abort is taken using the SError interrupt exception. 


D3.5.2 External abort on data read or write 


Externally-generated errors that occur during a data read or write can be either synchronous or asynchronous. 
A synchronous external abort on a data read or write is taken precisely using the Data Abort exception. 


An implementation can report the external abort asynchronously from the instruction that generated the access. In 
such an implementation the abort is taken using the SError interrupt exception. 


D3.5.3 Provision for the classification of external aborts 


In AArch64 state, an implementation can use ESR_ELx.EA, ISS[9], to provide more information about 
synchronous external aborts. For more information, see Exception from an Instruction abort on page D1-1533 and 
Exception from a Data abort on page D1-1533. 


For all aborts other than synchronous external aborts reported using the EC values 0x20, 0x21, 0x24, and 0x25,
ESR_ELx.EA, ISS[9], returns a value of 0.
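

A handler that wants to inspect this bit might extract it as in the following Python sketch. The function name and the returned tuple are hypothetical, chosen for illustration. The sketch assumes that the EC field occupies ESR_ELx bits [31:26] and that ISS[9] is bit 9 of the register, and it treats the EA value as meaningful only for the four EC values listed above.

```python
# EC values that report Instruction Aborts and Data Aborts.
ABORT_EC_VALUES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}


def decode_abort_ea(esr_elx: int):
    """Return (ec, description, ea) for an ESR_ELx value.

    Assumption: EC is ESR_ELx[31:26] and EA is ESR_ELx[9].
    For any other EC value the description is None and ea is reported as 0.
    """
    ec = (esr_elx >> 26) & 0x3F
    description = ABORT_EC_VALUES.get(ec)
    ea = (esr_elx >> 9) & 1 if description is not None else 0
    return ec, description, ea


if __name__ == "__main__":
    esr = (0x25 << 26) | (1 << 9)          # hypothetical syndrome value
    print(decode_abort_ea(esr))
```

The meaning of a nonzero EA value is IMPLEMENTATION DEFINED, so the sketch only extracts the bit and does not interpret it.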


D3.5.4 Parity or ECC error reporting


The ARM architecture supports the reporting of both synchronous and asynchronous parity or ECC errors from the 
cache system. It is IMPLEMENTATION DEFINED what parity or ECC errors in the cache systems, if any, result in 
synchronous or asynchronous parity or ECC errors. 


A fault code is defined for reporting parity or ECC errors, see Use of the ESR_EL1, ESR_EL2, and ESR_EL3 on
page D1-1523. However, when parity or ECC error reporting is implemented, it is IMPLEMENTATION DEFINED 
whether a parity or ECC error is reported using the assigned fault code or using another appropriate encoding. 


For all purposes other than the fault status encoding, parity or ECC errors are treated as external aborts. 


D3.6 Memory barrier instructions


Memory barriers on page B2-87 describes the memory barrier instructions. This section describes the system level
controls of those instructions.


D3.6.1 EL2 control of the shareability of data barrier instructions executed at Non-secure EL0 or EL1


In an implementation that includes EL2 and supports shareability limitations on the data barrier instructions, the
HCR_EL2.BSU field can upgrade the required shareability of an instruction that is executed at EL0 or EL1 in
Non-secure state. Table D3-7 shows the encoding of this field.


Table D3-7 EL2 control of shareability of barrier instructions executed at Non-secure EL0 or EL1

HCR_EL2.BSU    Minimum shareability of barrier instructions
00             No effect, shareability is as specified by the instruction
01             Inner Shareable
10             Outer Shareable
11             Full system


For an instruction executed at EL0 or EL1 in Non-secure state, Table D3-8 shows how the HCR_EL2.BSU is
combined with the shareability specified by the argument of the DMB or DSB instruction to give the scope of the
instruction.


Table D3-8 Effect of HCR_EL2.BSU on barrier instructions executed at Non-secure EL1 or EL0

Shareability specified by the DMB or DSB argument    HCR_EL2.BSU            Resultant shareability
Full system                                          Any                    Full system
Outer Shareable                                      00, 01, or 10          Outer Shareable
                                                     11, Full system        Full system
Inner Shareable                                      00 or 01               Inner Shareable
                                                     10, Outer Shareable    Outer Shareable
                                                     11, Full system        Full system
Non-shareable                                        00, No effect          Non-shareable
                                                     01, Inner Shareable    Inner Shareable
                                                     10, Outer Shareable    Outer Shareable
                                                     11, Full system        Full system
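

Every row of Table D3-8 is consistent with taking the wider of the two scopes. The following Python sketch models that reading. The enumeration and function names are hypothetical, and the numeric ordering of the enumeration is an assumption chosen so that it matches the HCR_EL2.BSU encoding in Table D3-7.

```python
from enum import IntEnum


class Shareability(IntEnum):
    """Barrier scopes, ordered from narrowest to widest (same values as HCR_EL2.BSU)."""
    NON_SHAREABLE = 0b00
    INNER_SHAREABLE = 0b01
    OUTER_SHAREABLE = 0b10
    FULL_SYSTEM = 0b11


def resultant_shareability(instruction: Shareability, bsu: int) -> Shareability:
    """Scope of a DMB or DSB executed at Non-secure EL0 or EL1, per Table D3-8."""
    return Shareability(max(int(instruction), bsu & 0b11))


if __name__ == "__main__":
    for scope in Shareability:
        print(scope.name, [resultant_shareability(scope, bsu).name for bsu in range(4)])
```

Because the result is a maximum, HCR_EL2.BSU can only widen the scope that the instruction requested and never narrows it.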


D3.7 Pseudocode description of general memory system instructions


This section lists the pseudocode describing general memory operations:


• Memory data type definitions.
• Basic memory access.
• Aligned memory access.
• Unaligned memory access on page D3-1718.
• Exclusive monitors operations on page D3-1718.
• Access permission checking on page D3-1719.
• Abort exceptions on page D3-1719.
• Memory barriers on page D3-1719.


D3.7.1 Memory data type definitions 
This section lists the memory data types. 


The memory data types are: 
• Address descriptor, defined by the AddressDescriptor type.
• Full address, defined by the FullAddress type.
• Memory attributes, defined by the MemoryAttributes type.
• Memory type, defined by the MemType enumeration.
• Device memory type, defined by the DeviceType enumeration.
• Normal memory attributes, defined by the MemAttrHints type.
• Cacheability attributes, defined by the MemAttr_NC, MemAttr_WT, and MemAttr_WB constants.
• Allocation hints, defined by the MemHint_No, MemHint_WA, MemHint_RA, and MemHint_RWA constants.
• Access permissions, defined by the Permissions type.


These types are defined in Chapter J1 ARMv8 Pseudocode.


D3.7.2 Basic memory access 


The two _Mem[] accessors, Non-assignment (memory read) _Mem[] and Assignment (memory write) _Mem[], are the 
operations that perform single-copy atomic, aligned, little-endian memory accesses of size bytes to or from the 
underlying physical memory array of bytes. 


The functions address the array using desc.paddress which supplies: 
• A 48-bit physical address.
• A single NS bit to select between Secure and Non-secure parts of the array.


The AccType parameter describes the access type, such as normal, exclusive, ordered, and streaming. For a definition 
of AccType, see Address space on page B2-68. 


The actual implemented array of memory might be smaller than the 2^48 bytes implied. In this case the scheme for
aliasing is IMPLEMENTATION DEFINED, or some parts of the address space might give rise to external aborts or a 
System Error. 


The attributes in memaddrdesc.memattrs are used by the memory system to determine caching and ordering behaviors 
as described in Memory types and attributes on page B2-94, Memory ordering on page B2-84, and Atomicity in the 
ARM architecture on page B2-81. 


PAMax() returns the IMPLEMENTATION DEFINED size of the physical address. This function is defined in Chapter J1
ARMv8 Pseudocode.


D3.7.3 Aligned memory access 


The two MemSingle[] accessors, Non-assignment (memory read) AArch64.MemSingle[] and Assignment (memory 
write) AArch64.MemSingle[], make atomic, little-endian accesses of size bytes. These functions are defined in 
Chapter J1 ARMv8 Pseudocode.


D3.7.4 Unaligned memory access


The two Mem[] accessors, Non-assignment (memory read) Mem[] and Assignment (memory write) Mem[], make 
accesses of the required type. If an access is not architecturally defined to be atomic, Mem[] synthesizes accesses 
from multiple calls to AArch64.MemSingle[]. It also reverses the byte order if the access is big-endian. 
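

The synthesis described above can be illustrated with a simplified Python model of a read. This is a sketch under stated assumptions, not the architectural pseudocode: the function names are hypothetical, memory is modelled as a flat bytearray, an unaligned access is assumed to be split into single-byte accesses, and alignment checking and fault handling are omitted.

```python
def mem_single_read(memory: bytearray, address: int, size: int) -> bytes:
    """Model of one aligned, little-endian access of size bytes."""
    assert address % size == 0, "single accesses in this model must be aligned"
    return bytes(memory[address:address + size])


def mem_read(memory: bytearray, address: int, size: int, big_endian: bool = False) -> int:
    """Model of an access that might be unaligned or big-endian."""
    if address % size == 0:
        data = mem_single_read(memory, address, size)        # one aligned access
    else:
        # Assumption: an unaligned access is built from byte accesses.
        data = b"".join(mem_single_read(memory, address + i, 1) for i in range(size))
    if big_endian:
        data = data[::-1]                                    # reverse the byte order
    return int.from_bytes(data, "little")


if __name__ == "__main__":
    memory = bytearray(range(16))
    print(hex(mem_read(memory, 4, 4)), hex(mem_read(memory, 1, 4, big_endian=True)))
```

Keeping the endianness reversal outside the single-access helper mirrors the split in the text: the low-level accessor is always little-endian, and the wrapper adjusts the byte order.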


The AArch64.CheckAlignment() function checks the alignment of memory accesses. 


D3.7.5 Exclusive monitors operations 


The AArch64.SetExclusiveMonitors() function sets the exclusive monitors for a block of bytes, the size of which is 
determined by size, at the virtual address defined by address. 


The AArch64.ExclusiveMonitorsPass() function checks whether the exclusive monitors are set to include the location 
of a number of bytes specified by size, at the virtual address defined by address. The atomic write that follows after 
the exclusive monitors have been set must be to the same physical address. It is permitted, but not required, for this 
function to return FALSE if the virtual address is not the same as that used in the previous call to 
AArch64.SetExclusiveMonitors(). 


The ExclusiveMonitorsStatus() function returns 0 if the previous atomic write was to the same physical memory 
locations selected by AArch64.ExclusiveMonitorsPass() and therefore succeeded. Otherwise the function returns 1, 
indicating that the address translation delivered a different physical address. 


The MarkExclusiveGlobal() procedure takes as arguments a FullAddress paddress, the PE identifier processorid and 
the size of the transfer. The procedure records that the PE processorid has requested exclusive access covering at 
least size bytes from address paddress. The size of the location marked as exclusive is IMPLEMENTATION DEFINED, 
up to a limit of 2KB and no smaller than two words, and aligned in the address space to the size of the location. It 
is CONSTRAINED UNPREDICTABLE whether this causes any previous request for exclusive access to any other address 
by the same PE to be cleared. 


The MarkExclusiveLocal() procedure takes as arguments a FullAddress paddress, the PE identifier processorid and 
the size of the transfer. The procedure records in a local record that PE processorid has requested exclusive access 
to an address covering at least size bytes from address paddress. The size of the location marked as exclusive is 
IMPLEMENTATION DEFINED, and can at its largest cover the whole of memory but is no smaller than two words, and 
is aligned in the address space to the size of the location. It is IMPLEMENTATION DEFINED whether this procedure 
also performs a MarkExclusiveGlobal() using the same parameters. 


The IsExclusiveGlobal() function takes as arguments a FullAddress paddress, the PE identifier processorid and the 
size of the transfer. The function returns TRUE if the PE processorid has marked in a global record an address range 
as exclusive access requested that covers at least size bytes from address paddress. It is IMPLEMENTATION DEFINED 
whether it returns TRUE or FALSE if a global record has marked a different address as exclusive access requested. 
If no address is marked in a global record as exclusive access, IsExclusiveGlobal() returns FALSE. 


The IsExclusiveLocal() function takes as arguments a FullAddress paddress, the PE identifier processorid and the 
size of the transfer. The function returns TRUE if the PE processorid has marked an address range as exclusive 
access requested that covers at least the size bytes from address paddress. It is IMPLEMENTATION DEFINED whether 
this function returns TRUE or FALSE if the address marked as exclusive access requested does not cover all of size 
bytes from address paddress. If no address is marked as exclusive access requested, then this function returns 
FALSE. It is IMPLEMENTATION DEFINED whether this result is ANDed with the result of IsExclusiveGlobal() with 
the same parameters. 


The ClearExclusiveByAddress() procedure takes as arguments a FullAddress paddress, the PE identifier processorid 
and the size of the transfer. The procedure clears the global records of all PEs, other than processorid, for which an 
address region including any of size bytes starting from paddress has had a request for an exclusive access. It is 
IMPLEMENTATION DEFINED whether the equivalent global record of the PE processorid is also cleared if any of size 
bytes starting from paddress has had a request for an exclusive access, or if any other address has had a request for 
an exclusive access. 


The ClearExclusiveLocal() procedure takes as arguments the PE identifier processorid. The procedure clears the 
local record of PE processorid for which an address has had a request for an exclusive access. It is IMPLEMENTATION 
DEFINED whether this operation also clears the global record of PE processorid that an address has had a request for 
an exclusive access. 


These functions are defined in Chapter J1 ARMv8 Pseudocode.


D3.7.6 Access permission checking


The function AArch64.CheckPermission() is used by the architecture to perform access permission checking based
on attributes derived from the translation tables or location descriptors. It returns the result of the call to
AArch64.NoFault().


These functions are defined in Chapter J1 ARMv8 Pseudocode.


The interpretation of access permission is shown in Memory access control on page D4-1783.


D3.7.7 Abort exceptions


The function AArch64.Abort() generates either a Data Abort or an Instruction Abort exception by calling 
AArch64.DataAbort() or AArch64.InstructionAbort(). It also can generate a debug exception for debug related faults, 
see Chapter D2 AArch64 Self-hosted Debug. 


The function AArch64.DataAbort() generates a Data Abort exception, routes the exception to EL2 or EL3, and 
records the information required for the Exception Syndrome registers, ESR_ELx. See Exception from a Data abort 
on page D1-1533. A second stage abort might also record the intermediate physical address, IPA, but this depends 
on the type of the abort. 


For a synchronous abort, AArch64.DataAbort() also sets the FAR to the VA of the abort. 


The function AArch64.InstructionAbort() generates an Instruction Abort exception, routes the exception to EL2 or 
EL3, and records the information required for the Exception Syndrome registers, ESR_ELx. See Exception from an 
Instruction abort on page D1-1533. A second stage abort might also record the intermediate physical address, IPA, 
but this depends on the type of the abort. 


For a synchronous abort, AArch64.InstructionAbort() also sets the FAR to the VA of the abort. 


The FaultRecord type describes a fault. Functions that check for faults return a record of this type appropriate to the 
type of fault. Pseudocode description of the MMU faults on page D4-1809 provides a number of wrappers to 
generate FaultRecords. 


The function AArch64.NoFault() returns a null record that indicates no fault. The IsFault() function tests whether a
FaultRecord contains a fault.


D3.7.8 Memory barriers


The definition for the memory barrier functions is given by the enumerations MBReqDomain and MBReqTypes. 


These enumerations define the required shareability domains and required access types used as arguments for DMB 
and DSB instructions. 


The procedures DataMemoryBarrier, DataSynchronizationBarrier, and InstructionSynchronizationBarrier perform 
the memory barriers. 


Chapter D4
The AArch64 Virtual Memory System Architecture 


This chapter provides a system level view of the AArch64 Virtual Memory System Architecture (VMSAv8-64), the 
memory system architecture of an ARMv8 implementation that is executing in AArch64 state. It contains the 
following sections: 


• About the Virtual Memory System Architecture (VMSA) on page D4-1722.
• The VMSAv8-64 address translation system on page D4-1726.
• VMSAv8-64 translation table format descriptors on page D4-1774.
• Memory access control on page D4-1783.
• Memory region attributes on page D4-1792.
• MMU faults on page D4-1800.
• Translation Lookaside Buffers (TLBs) on page D4-1810.
• TLB maintenance requirements and the TLB maintenance instructions on page D4-1815.
• Caches in a VMSAv8-64 implementation on page D4-1829.


D4.1 About the Virtual Memory System Architecture (VMSA)


This chapter describes the ARMv8 Virtual Memory System Architecture (VMSA), and in particular how it applies 
to a PE that is executing in AArch64 state. In this state the PE is using VMSAv8-64, as defined in ARMv8 VMSA 
naming. See The ARMv8 VMSA when some Exception levels are using AArch32 for information about the VMSA 
in other contexts. 


A VMSA provides a Memory Management Unit (MMU), that controls address translation, access permissions, and 
memory attribute determination and checking, for memory accesses made by the PE. The process of address 
translation maps the virtual addresses (VAs) used by the PE onto the physical addresses (PAs) of the physical 
memory system. The mapping of a VA to a PA requires either a single stage of translation, or two sequential stages 
of translation. 


The translations are defined independently for different Exception levels and Security states, as described in The 
VMSAv8-64 address translation system on page D4-1726. 


VMSAv8-64 supports tagging of VAs, as described in Address tagging in AArch64 state on page D4-1724. As that 
section describes, this address tagging has no effect on the address translation process. 


The remainder of this chapter gives a full description of VMSAv8-64 for an implementation that includes all of the 
Exception levels. The implemented Exception levels and the resulting translation stages and regimes on 
page D4-1769 describes the differences in the VMSA if some Exception levels are not implemented. 


D4.1.1 ARMv8 VMSA naming 
The ARMv8 VMSA naming model reflects the possible stages of address translation, as follows: 
VMSAv8 The overall translation scheme, within which an address translation has one or two stages. 


VMSAv8-32 The translation scheme for a single stage of address translation that is managed from an Exception 
level that is using AArch32. 


VMSAv8-32 is sometimes used to refer to the two stages of translation used to map a VA to a PA, 
where each stage is managed from an Exception level that is using AArch32. 


VMSAv8-64 The translation scheme for a single stage of address translation that is managed from an Exception 
level that is using AArch64. 


VMSAv8-64 is sometimes used to refer to the two stages of translation used to map a VA to a PA, 
where each stage is managed from an Exception level that is using AArch64. 


D4.1.2 The ARMv8 VMSA when some Exception levels are using AArch32 


As stated at the start of the chapter, this chapter describes VMSAv8-64, the ARMv8 VMSA that applies to an 
Exception level that is using AArch64. However, when a higher Exception level is using AArch64, and therefore 
using VMSAv8-64, lower Exception levels can be using AArch32. Chapter G4 The AArch32 Virtual Memory 
System Architecture describes VMSAv8-32, meaning it describes: 

• The translation stages and translation regimes when EL3 is using AArch32. 

• Any stages of address translation that are using VMSAv8-32 when EL3 is using AArch64. 

However, a PE can be executing at EL0 using AArch32 when EL1 is using AArch64. In this case the PE is using 
the VMSAv8-64 EL1&0 translation regime as described in Constraints on accesses from EL0 when EL0 is using 
AArch32 on page D4-1728. 

D4.1.3 VMSA address types and address spaces 


A description of the VMSA refers to the following address types. 

Note 
These descriptions relate to the VMSAv8 description and therefore give more detail than the generic definitions 
given in the glossary. 

Virtual address (VA) 


An address used in an instruction, as a data or instruction address, is a Virtual Address (VA). 


Note 
This means that an address held in the PC, LR, SP, or an ELR, is a VA. 





In AArch64 state, a VA address space has a maximum address width of 48 bits. As About address 
translation and supported input address ranges on page D4-1728 describes, a stage of address 
translation can support one or two VA ranges: 


Translation stage with a single VA range 


For a translation stage that supports a single VA range the 48-bit VA width gives a 
maximum VA space of 256TB, with VA range of 0x0000000000000000 to 
0x0000FFFFFFFFFFFF. 


Translation stage with two VA ranges 
A translation stage that supports two VA ranges has two VA subranges, one at the bottom of the full 
64-bit address range of the PC, and one at the top, as follows: 

• The bottom VA subrange runs up from address 0x0000000000000000. With the 
maximum address width of 48 bits this gives a VA range of 0x0000000000000000 
to 0x0000FFFFFFFFFFFF. 

• The top VA subrange runs up to address 0xFFFFFFFFFFFFFFFF. With the maximum 
address width of 48 bits this gives a VA range of 0xFFFF000000000000 to 
0xFFFFFFFFFFFFFFFF. Reducing the address width for this subrange increases the 
bottom address of the range. 


This means that there are two VA subranges, each of up to 256TB. 


Each translation regime that takes a VA as an input address can be configured to support fewer than 
48 bits of VA space, see Address size configuration on page D4-1731. 
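
As an illustration of the VA ranges described above, the following Python sketch computes the bounds of the VA range or ranges for a given address width. It is not part of the architecture pseudocode. The function name va_ranges and its parameters are assumptions made for this example, and the sketch does not model the minimum address width that an implementation supports.

```python
def va_ranges(width_bits=48, two_ranges=True):
    """Return a list of (lowest, highest) VA bounds for a translation stage.

    width_bits is the configured VA width (at most 48 in this description).
    two_ranges selects a stage that supports two VA ranges.
    """
    if not 1 <= width_bits <= 48:
        raise ValueError("this sketch only models widths of 1 to 48 bits")
    size = 1 << width_bits
    # The bottom range always runs up from address zero.
    bottom = (0, size - 1)
    if not two_ranges:
        return [bottom]
    # The top range always runs up to the top of the 64-bit address space,
    # so a smaller width raises its bottom address.
    top = ((1 << 64) - size, (1 << 64) - 1)
    return [bottom, top]


if __name__ == "__main__":
    for low, high in va_ranges(48):
        print(f"0x{low:016X} to 0x{high:016X}")
```

With the maximum width of 48 bits each range covers 2^48 bytes, which is the 256TB figure quoted above.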


Intermediate physical address (IPA) 


In a translation regime that provides two stages of address translation, the IPA is: 
• The OA from the stage 1 translation. 
• The IA for the stage 2 translation. 


In a translation regime that provides only one stage of address translation, the IPA is identical to the 
PA. Alternatively, the translation regime can be considered as having no concept of IPAs. 


The IPA address space has a maximum address width of 48 bits, see Address size configuration on 
page D4-1731. 


Physical address (PA) 


The address of a location in a physical memory map. That is, an output address from the PE to the 
memory system. 


The EL3 and Secure EL1 Exception levels provide independent definitions of the PA spaces for 
Secure and Non-secure operation. This means they provide two independent address spaces, where: 





• A VA accessed in Secure state can be translated to either the Secure or the Non-secure PA 
space. 
• When in Non-secure state, a VA is always mapped to the Non-secure PA space. 

Each PA address space has a maximum address width of 48 bits, but an implementation can 
implement fewer than 48 bits of PA. See Address size configuration on page D4-1731. 


D4.1.4 Address tagging in AArch64 state 


In AArch64 state, the ARMv8 architecture supports the tagging of addresses. In these cases the top eight bits of the 
VA are ignored when determining: 

• If the translation system is enabled, whether the address is out of range and therefore causes a Translation 
fault. 

• If the translation system is not enabled, whether the address is out of range and therefore causes an Address 
size fault. 

• Whether the address requires invalidation when performing a TLB invalidation instruction by address. 


The use of address tags is controlled as follows: 


For addresses using a stage 1 translation that supports two VA ranges 


The value of bit[55] of the VA determines the register bit that controls the use of address tags, as 
follows: 


VA[55]==0 TCR_ELx.TBI0 determines whether address tags are used. If stage 1 
translation is enabled, TTBR0_ELx holds the base address of the translation 
tables used to translate the address. 

VA[55]==1 TCR_ELx.TBI1 determines whether address tags are used. If stage 1 
translation is enabled, TTBR1_ELx holds the base address of the translation 
tables used to translate the address. 

For addresses using a stage 1 translation that supports a single VA range 

TCR_ELx.TBI determines whether address tags are used. If stage 1 translation is enabled, 
TTBR0_ELx holds the base address of the translation tables used to translate the address. 


Note 
The TCR_ELx.TBIn or TCR_ELx.TBI bit determines whether address tags are used regardless of whether the 
corresponding translation regime is enabled. 
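
The selection of the controlling bit by VA[55], and the effect of a tag on the out-of-range check, can be sketched in Python as follows for a stage 1 translation that supports two VA ranges. This is an illustrative model only, not the AddrTop() pseudocode. The function name va_in_range and the parameters width_bits, tbi0 and tbi1 are assumptions made for this example.

```python
def va_in_range(va, width_bits=48, tbi0=False, tbi1=False):
    """Return True if a 64-bit VA falls inside the lower or upper VA range.

    tbi0 and tbi1 stand for the TCR_ELx.TBI0 and TCR_ELx.TBI1 values.
    """
    # VA[55] selects the upper range (TBI1) or the lower range (TBI0).
    select_upper = (va >> 55) & 1
    tagged = tbi1 if select_upper else tbi0
    # With tagging in use, VA[63:56] is ignored by the range check.
    top = 55 if tagged else 63
    nbits = top - width_bits + 1
    field = (va >> width_bits) & ((1 << nbits) - 1)
    # Upper-range addresses need all ones above the width, lower-range all zeros.
    expected = (1 << nbits) - 1 if select_upper else 0
    return field == expected
```

For example, 0xAB00123456789ABC is out of range when TBI0 is 0, but is treated as an address in the lower range when TBI0 is 1.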








An address tag enable bit also has an effect on the PC value in the following cases: 
• Any branch or procedure return within the controlled Exception level. 

• On taking an exception to the controlled Exception level, regardless of whether this is also the Exception 
level from which the exception was taken. 

• On performing an exception return to the controlled Exception level, regardless of whether this is also the 
Exception level from which the exception return was performed. 

• Exiting from debug state to the controlled Exception level. 

Note 
As an example of what is meant by the controlled Exception level, TCR_EL3.TBI controls this effect for: 
• A branch or procedure return within EL3. 
• Taking an exception to EL3. 
• Performing an exception return or a debug state exit to EL3. 

The effect of the controlling TBI{n} bit is: 


For a translation regime where the stage 1 translation supports two VA ranges 


If the controlling TBIn bit for the address being loaded into the PC is set to 1, then 
bits[63:56] of the PC are forced to be a sign-extension of bit[55] of that address. 


For a translation regime where the stage 1 translation supports a single VA range 


If the controlling TBI bit for the address being loaded into the PC is set to 1, then bits[63:56] 
of the PC are forced to be 0x00. 
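
The two cases above can be sketched in Python as follows. This is an illustrative model only. The function name branch_target_pc and its parameters are assumptions made for this example, and the caller is assumed to pass the TBI{n} value that applies to the address being loaded.

```python
MASK64 = (1 << 64) - 1


def branch_target_pc(addr, tbi, two_va_ranges):
    """Return the value loaded into the PC for a 64-bit target address.

    tbi is the controlling TBI{n} bit for this address.
    two_va_ranges is True if the stage 1 translation supports two VA ranges.
    """
    addr &= MASK64
    if not tbi:
        # Tagging is not in use, so the address is loaded unchanged.
        return addr
    low56 = addr & ((1 << 56) - 1)
    if two_va_ranges and (addr >> 55) & 1:
        # Bits[63:56] become a sign-extension of bit[55], which is 1 here.
        return low56 | (0xFF << 56)
    # Either bit[55] is 0, or there is a single VA range: bits[63:56] are 0x00.
    return low56
```

In both cases the tag byte is discarded, so a tagged address is not propagated to the PC.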


The AddrTop() pseudocode function shows the algorithm determining the most significant bit of the VA, and 
therefore whether the VA is using tagging. For a translation regime where the stage 1 translation supports two VA 
ranges, this pseudocode includes the selection between TTBR0_ELx and TTBR1_ELx described in Selection 
between TTBR0 and TTBR1 when two VA ranges are supported on page D4-1759. See also Relaxation of the tagged 
address handling requirements on an Illegal exception return. 


Note 
The required behavior prevents a tagged address being propagated to the program counter. 








When address tagging is enabled for an address that causes a Data Abort or a Watchpoint, the address tag is included 
in the VA returned in the FAR. 


Relaxation of the tagged address handling requirements on an Illegal exception return 


The AddrTop() pseudocode function does not cover a relaxation to the requirements for tagged address handling that 
applies to an Illegal exception return. In the case of an Illegal exception return, it is IMPLEMENTATION DEFINED 
whether the exception return targets: 


• The Exception level indicated by the current SPSR at the time of the exception return. 

• The Exception level at which the exception return instruction was executed. 

The AArch64.ExceptionReturn() pseudocode function includes this IMPLEMENTATION DEFINED choice. 


Note 


• The TCR_ELx.TBI{n} fields have the effect shown in the AArch64.ExceptionReturn() pseudocode regardless 
of whether the corresponding translation regime is enabled. 

• In the case of an Illegal exception return, the tag bits of the address can be propagated to the PC if all of the 
following apply: 

- The implementation treats the target_exception_level as being the Exception level that was described 
in the SPSR at the time of the exception return. 

- For the Exception level that was described in the SPSR at the time of the exception return, the value 
of the applicable TCR_ELx.TBI{n} field is 0. 

- In the Exception level that the exception was taken from, the value of the applicable 
TCR_ELx.TBI{n} field is 1. 


In all other cases, the tag bits cannot be propagated to the PC. 

D4.2 The VMSAv8-64 address translation system 

The following subsections describe the VMSAv8-64 address translation system, that maps VAs to PAs: 

• About the VMSAv8-64 address translation system. 
• Controlling address translation stages on page D4-1729. 
• Memory translation granule size on page D4-1736. 
• Translation tables and the translation process on page D4-1742. 
• Overview of the VMSAv8-64 address translation stages on page D4-1745. 
• The VMSAv8-64 translation table format on page D4-1756. 
• The algorithm for finding the translation table descriptors on page D4-1763. 
• The effects of disabling a stage of address translation on page D4-1767. 
• The implemented Exception levels and the resulting translation stages and regimes on page D4-1769. 
• Pseudocode description of VMSAv8-64 address translation on page D4-1769. 
• Address translation instructions on page D4-1771. 

Related to this: 

• VMSAv8-64 translation table format descriptors on page D4-1774 describes the translation table entries. 

• Memory region attributes on page D4-1792 describes the attributes that are held in the translation table 
entries, including how different attributes can interact. 

• Translation Lookaside Buffers (TLBs) on page D4-1810 describes the caching of translation table lookups in 
TLBs, and the architected instructions for maintaining TLBs. 

• AArch64 Address translation examples on page K7-5552 gives detailed descriptions of typical examples of 
translating a VA to a final PA, and obtaining the memory attributes of that PA. 

D4.2.1 About the VMSAv8-64 address translation system 

The Memory Management Unit (MMU) controls address translation, memory access permissions, and memory 
attribute determination and checking, for memory accesses made by the PE. 

The general model of MMU operation is that the MMU takes information about a required memory access, 
including an input address (IA), and either: 

• Returns an associated output address (OA), and the memory attributes for that address. 

• Is unable to perform the translation for one of a number of reasons, and therefore causes an exception to be 
generated. This exception is called an MMU fault. System registers are used to report any MMU faults that 
occur. 

The process of mapping an IA to an OA is an address translation, or more precisely a single stage of address 
translation. 

When using a VMSA, a translation regime maps a VA to a PA using one or two stages of translation, and: 

• The AArch64 translation regimes on page D4-1727 defines the translation regimes. 

• VMSA address types and address spaces on page D4-1723 gives more information about VAs and PAs. 

The translation granule specifies the granularity of the mapping from IA to OA. That is, it defines both: 

• The page size for a stage of address translation, where a page is the smallest block of memory for which an 
IA to OA mapping can be specified. 

• The size of a complete translation table for that stage of address translation. 

The MMU is controlled by System registers, that provide independent control of each address translation stage, 
including a control to disable the stage of address translation. The effects of disabling a stage of address translation 
on page D4-1767 defines how the MMU handles an access for which a required address translation stage is disabled. 

This section describes the address translation system for an implementation that includes all of the Exception levels, 
and gives a complete description of translations that are controlled by an Exception level that is using AArch64. In 
addition: 


• The ARMv8 VMSA when some Exception levels are using AArch32 on page D4-1722 gives information about 
the VMSA when some Exception levels are using AArch32. 

• The implemented Exception levels and the resulting translation stages and regimes on page D4-1769 
describes the effect on the address translation model when some Exception levels are not implemented. 


Each enabled stage of address translation uses a set of address translations and associated memory properties held 
in memory mapped tables called translation tables. A single translation table lookup can resolve only a limited 
number of bits of the IA, and therefore a single address translation can require multiple lookups. These are described 
as different levels of lookup. 


Translation table entries can be cached in a Translation Lookaside Buffer (TLB). 


As well as defining the OA that corresponds to the IA, the translation table entries define the following properties: 


• For accesses made from Secure state, whether the access is to the Secure or Non-secure address map. 
• Memory access permissions. 
• Memory region attributes. 

For more information, see Memory attribute fields in the VMSAv8-64 translation table format descriptors on 
page D4-1778. 

The following subsections give more information: 

• The AArch64 translation regimes. 
• About address translation and supported input address ranges on page D4-1728. 
• The VMSAv8-64 translation table format on page D4-1729. 


The AArch64 translation regimes 


The architecture defines a number of translation regimes, where a translation regime comprises either: 

• A single stage of address translation. 
This maps an input VA to an output PA. 

• Two, sequential, stages of address translation, where: 
- Stage 1 maps an input VA to an output IPA. 
- Stage 2 maps an input IPA to an output PA. 


Figure D4-1 shows these translation stages and translation regimes when EL3 is using AArch64. 


Translation regimes, when EL3 is using AArch64 

Secure EL3:        VA -> Secure EL3 stage 1 (controlled from Secure EL3) -> PA, Secure or Non-secure 
Secure EL1&0:      VA -> Secure EL1&0 stage 1 (controlled from Secure EL1†) -> PA, Secure or Non-secure 
Non-secure EL2:    VA -> Non-secure EL2 stage 1 (controlled from EL2†) -> PA, Non-secure only 
Non-secure EL1&0:  VA -> Non-secure EL1&0 stage 1 (controlled from EL1†) -> IPA 
                      -> Non-secure EL1&0 stage 2 (controlled from EL2†) -> PA, Non-secure only 

† Typically controlled from this Exception level, but also accessible from higher Exception levels 


Figure D4-1 VMSAv8 AArch64 translation regimes, translation stages, and associated controls 

This means that in VMSAv8-64 the set of translation regimes is: 


The Secure EL3 translation regime 


This has a single stage of translation, stage 1, that maps VAs to PAs and supports a single VA range. 


The Non-secure EL2 translation regime 


This has a single stage of translation, stage 1, that maps VAs to PAs and supports a single VA range. 


The Secure EL1&0 translation regime 
This has a single stage of translation, stage 1, that maps VAs to PAs and supports two VA ranges and 
the use of ASIDs. 

The Non-secure EL1&0 translation regime 


If cached in a TLB, a translation table lookup for this regime is associated with the VMID that 
identifies the current virtual machine. This regime has two stages of lookup: 


Stage 1 Maps VAs to IPAs. This stage supports two VA ranges and the use of ASIDs. 
Stage 2 Maps IPAs to PAs. This stage supports a single IPA range. 
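
The set of translation regimes listed above can be summarized as a small data structure. The following Python sketch is illustrative only. The names Stage, REGIMES and address_path are assumptions made for this example and do not appear in the architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    number: int             # 1 or 2
    input_type: str         # "VA" or "IPA"
    output_type: str        # "IPA" or "PA"
    input_ranges: int       # number of supported input address ranges
    uses_asid: bool = False


# One entry per VMSAv8-64 translation regime, as described in the text.
REGIMES = {
    "Secure EL3": [Stage(1, "VA", "PA", 1)],
    "Non-secure EL2": [Stage(1, "VA", "PA", 1)],
    "Secure EL1&0": [Stage(1, "VA", "PA", 2, uses_asid=True)],
    "Non-secure EL1&0": [
        Stage(1, "VA", "IPA", 2, uses_asid=True),
        Stage(2, "IPA", "PA", 1),
    ],
}


def address_path(regime):
    """Return the sequence of address types a translation passes through."""
    stages = REGIMES[regime]
    return [stages[0].input_type] + [stage.output_type for stage in stages]
```

For example, address_path("Non-secure EL1&0") gives the VA, IPA, PA sequence of the only regime that has a stage 2 translation.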


An MMU fault might be generated by a particular stage of translation. An MMU fault is described as either a stage 
1 MMU fault or a stage 2 MMU fault. 





Note 

• In the ARM architecture, a software agent, such as an operating system, that uses or defines stage 1 memory 
translations, might be unaware of the second stage of translation, and of the distinction between IPA and PA. 

• A more generalized description of the translation regimes is that a regime always comprises two sequential 
stages of translation, but in some regimes the stage 2 translation both: 

- Returns an OA that equals the IA. This is called a flat mapping of the IA to the OA. 

- Does not change the memory attributes returned by the stage 1 address translation. 





Constraints on accesses from EL0 when EL0 is using AArch32 


ARMv8 permits execution with EL1 using AArch64 and EL0 using AArch32. In this case, accesses from EL1 and 
from EL0 are using the VMSAv8-64 EL1&0 translation regime, but when the PE is executing at EL0 using 
AArch32 it is using the AArch32 memory model. In particular, this means it is limited to a 32-bit VA range. 


About address translation and supported input address ranges 


For a single stage of address translation, a Translation table base register (TTBR) indicates the start of the first 
translation table required for a mapping from input address (IA) to output address (OA). For a stage of address 
translation that supports two VA ranges each VA range is an independent mapping from IA to OA. This means that 
each implemented translation stage shown in VMSAv8 AArch64 translation regimes, translation stages, and 
associated controls on page D4-1727 requires: 





• Two associated sets of translation tables if it supports two IA ranges. 
• One associated set of translation tables if it supports a single IA range. 

Note 

• Stage 2 translations never support two IA ranges. This means that, for the translation stages that support two 
IA ranges the IA is always a VA. 

• Example use of the split VA range, and the TTBR0_ELx and TTBR1_ELx controls on page D4-1760 shows 
how two supported VA ranges might be used. 

Controlling address translation stages on page D4-1729 summarizes the System registers that control address 
translation by the MMU, and Selection between TTBR0 and TTBR1 when two VA ranges are supported on 
page D4-1759 gives more information about the address translation stages that support two VA ranges. 

A full translation table lookup is called a translation table walk. It is performed automatically by hardware, and can 
have a significant cost in execution time. To support fine granularity of the VA to PA mapping, a single IA to OA 
translation can require multiple accesses to the translation tables, with each access giving finer granularity. Each 
access is described as a level of address lookup. The final level of the lookup defines: 

• The high bits of the required output address. 

• The attributes and access permissions of the addressed memory. 


Translation table entries can be cached in a Translation Lookaside Buffer, see Translation Lookaside Buffers (TLBs) 
on page D4-1810. 


The VMSAv8-64 translation table format 


Stages of address translation that are controlled by an Exception level that is using AArch64 use the VMSAv8-64 
translation table format. This format uses 64-bit descriptor entries in the translation tables. 


Note 
This format is an extension of the VMSAv8-32 Long-descriptor translation table format originally defined by the 
ARMv7 Large Physical Address Extension, and extended slightly by ARMv8. VMSAv8-32 also supports a 
Short-descriptor translation table format. Chapter G4 The AArch32 Virtual Memory System Architecture describes 
both of these formats. 

The VMSAv8-64 translation table format provides: 


• Up to four levels of address lookup. 
• Input addresses of up to 48 bits. 
• Output addresses of up to 48 bits. 
• A translation granule size of 4KB, 16KB, or 64KB. 


D4.2.2 Controlling address translation stages 


The implemented Exception levels and the resulting translation stages and regimes on page D4-1769 defines the 
translation regimes and stages. For each supported address translation stage: 


• A System register bit enables the stage of address translation. 

• A System register bit determines the endianness of the translation table lookups. 

• A Translation Control Register (TCR_ELx) controls the stage of address translation. 

• If a stage of address translation supports two VA ranges then that stage of translation provides a TTBR for 
each VA range, and the stage of address translation has: 
- A single TCR. 
- A TTBR for each VA range. TTBR0 points to the translation tables for the address range that starts at 
0x0000000000000000, and TTBR1 points to the translation tables for the address range that ends at 
0xFFFFFFFFFFFFFFFF. 

Otherwise, a single TTBR holds the address of the translation table that must be used for the first lookup for 
the stage of address translation. 

For address translation stages controlled from AArch64: 

• Table D4-1 shows the endianness bit (EE) and the enable bit (M or VM) for each stage of address translation. 
Each register entry in the table gives the endianness bit followed by the enable bit. 

Table D4-1 Enable and endianness bits for the AArch64 translation stages 

Translation stage           Controlled from    Controlling register 
Secure EL3 stage 1          EL3                SCTLR_EL3.{EE, M} 
Secure EL1&0 stage 1        Secure EL1         SCTLR_EL1.{EE, M} 
Non-secure EL2 stage 1      EL2                SCTLR_EL2.{EE, M} 
Non-secure EL1&0 stage 2    EL2                SCTLR_EL2.EE, HCR_EL2.VM 
Non-secure EL1&0 stage 1    Non-secure EL1     SCTLR_EL1.{EE, M} 








Note 


If the PA of the software that enables or disables a particular stage of address translation differs from its VA, 
speculative instruction fetching can cause complications. ARM strongly recommends that the PA and VA of 
any software that enables or disables a stage of address translation are identical if that stage of translation 
controls translations that apply to the software currently being executed. 





• Table D4-2 shows the TCR and TTBR, or TTBRs, for each stage of address translation. In the table, each 
Controlling registers entry gives the TCR followed by the TTBR or TTBRs. 

Table D4-2 TCRs and TTBRs for the AArch64 translation stages 

Translation stage           Controlled from    Controlling registers 
Secure EL3 stage 1          EL3                TCR_EL3, TTBR0_EL3 
Secure EL1&0 stage 1        Secure EL1         TCR_EL1, TTBR0_EL1, TTBR1_EL1 
Non-secure EL2 stage 1      EL2                TCR_EL2, TTBR0_EL2 
Non-secure EL1&0 stage 2    EL2                VTCR_EL2, VTTBR_EL2 
Non-secure EL1&0 stage 1    Non-secure EL1     TCR_EL1, TTBR0_EL1, TTBR1_EL1 





The following subsections give more information about controlling address translation: 
• System registers relevant to MMU operation. 
• Address size configuration on page D4-1731. 
• Atomicity of register changes on changing virtual machine on page D4-1735. 
• Use of out-of-context translation regimes on page D4-1735. 


System registers relevant to MMU operation 


In AArch64 state, System registers have a suffix, that indicates the lowest Exception level from which they can be 
accessed. In some general descriptions of MMU control and address translation, this chapter uses a common 
abbreviation for each of the System registers that affects MMU operation, as Table D4-3 on 
page D4-1731 shows. The common abbreviation is used when describing features that apply to multiple translation 
regimes or stages. 


Note 


The only translation regime that supports a stage 2 translation is the Non-secure EL1&0 translation regime. 

Table D4-3 Abbreviations for System registers used in this chapter 

Common          Translation    Exception level 
abbreviation    stage          EL1                     EL2          EL3 
HCR             -              -                       HCR_EL2      - 
SCTLR           -              SCTLR_EL1               SCTLR_EL2    SCTLR_EL3 
TCR             Stage 1        TCR_EL1                 TCR_EL2      TCR_EL3 
                Stage 2        -                       VTCR_EL2     - 
TTBR            Stage 1        TTBR0_EL1, TTBR1_EL1    TTBR0_EL2    TTBR0_EL3 
                Stage 2        -                       VTTBR_EL2    - 





Address size configuration 


The following subsubsections specify the configuration of the PA size and of the input and output address sizes for 
each of the stages of address translation: 


• Physical address size. 
• Output address size on page D4-1732. 
• Input address size on page D4-1733. 
• Supported IPA size on page D4-1734. 


Physical address size 


The ID_AA64MMFR0_EL1.PARange field indicates the implemented PA size, as Table D4-4 shows. 

Table D4-4 Physical address size implementation options 

ID_AA64MMFR0_EL1.PARange    Total PA size    PA address size 
0000                        4 GB             32 bits, PA[31:0] 
0001                        64 GB            36 bits, PA[35:0] 
0010                        1 TB             40 bits, PA[39:0] 
0011                        4 TB             42 bits, PA[41:0] 
0100                        16 TB            44 bits, PA[43:0] 
0101                        256 TB           48 bits, PA[47:0] 

All other PARange values are reserved. 
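
As an illustration of Table D4-4, the following Python sketch decodes a PARange value into the PA width and the total PA size. It is not part of the architecture pseudocode. The names PARANGE_TO_BITS and pa_size are assumptions made for this example, and the sketch treats every encoding outside the table as reserved.

```python
# PARange encoding to implemented PA width in bits, from Table D4-4.
PARANGE_TO_BITS = {
    0b0000: 32,
    0b0001: 36,
    0b0010: 40,
    0b0011: 42,
    0b0100: 44,
    0b0101: 48,
}


def pa_size(parange):
    """Return (PA width in bits, total PA size in bytes) for a PARange value."""
    try:
        bits = PARANGE_TO_BITS[parange]
    except KeyError:
        raise ValueError(f"reserved PARange encoding: {parange:#06b}") from None
    return bits, 1 << bits
```

The total size in the table follows directly from the width, for example 2^36 bytes is 64 GB and 2^48 bytes is 256 TB.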

Output address size 


For each enabled stage of address translation, TCR.{I}PS must be programmed to the maximum output address size for 
that stage of translation, using the encodings shown in Table D4-5. 

Table D4-5 Output address size implementation options 

TCR.{I}PS    Total output size    Output address size 
000          4 GB                 32 bits, PA[31:0] 
001          64 GB                36 bits, PA[35:0] 
010          1 TB                 40 bits, PA[39:0] 
011          4 TB                 42 bits, PA[41:0] 
100          16 TB                44 bits, PA[43:0] 
101          256 TB               48 bits, PA[47:0] 





Note 

• This field is called IPS in the TCR_EL1, and PS in the other TCRs. 
• The {I}PS fields are 3-bit fields, corresponding to the least-significant PARange bits shown in Table D4-4 
on page D4-1731. 





If {I}PS is programmed to a value larger than the implemented PA size, then the PE behaves as if programmed with 
the implemented PA size, but software must not rely on this behavior. That is, the output address size is never larger 
than the implemented PA size. Table D4-4 on page D4-1731 shows the implemented PA size. 


The PE checks that the TTBR, translation table entries, and the output address for the stage of address translation 
have the address bits above the output address size set to zero. If this is not the case, an Address size fault is 
generated for the level and stage of translation that caused the fault. An Address size fault from the TTBR is always 
reported as a level 0 fault. 
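
The output address size rules above can be sketched in Python as follows. This is an illustrative model only. The names PS_TO_BITS, effective_output_bits and address_size_fault are assumptions made for this example. The sketch models the described behavior for an oversized {I}PS value with min(), although software must not rely on that behavior.

```python
# TCR.{I}PS encoding to output address width in bits, from Table D4-5.
PS_TO_BITS = {0b000: 32, 0b001: 36, 0b010: 40, 0b011: 42, 0b100: 44, 0b101: 48}


def effective_output_bits(ps, implemented_pa_bits):
    """Return the output address width, never larger than the implemented PA size."""
    return min(PS_TO_BITS[ps], implemented_pa_bits)


def address_size_fault(address, output_bits):
    """Return True if any address bit above the output address size is nonzero."""
    return (address >> output_bits) != 0
```

The same check applies to the TTBR value, to the addresses held in translation table entries, and to the final output address of the stage.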


If stage 1 translation is disabled and the input address is larger than the implemented PA size, then a stage 1 level 0 
Address size fault is generated. 

Note 
These faults are reported as level 0 faults even if they occur in a translation stage that does not perform level 0 
lookups. 





When using two stages of translation: 


• If stage 2 translation is disabled and the output address from the stage 1 translation is larger than the 
implemented PA size, then a stage 1 Address size fault is generated for the level of the stage 1 translation that 
generated the output address. 

• If stage 2 translation is enabled and the output address from the stage 1 translation does not generate a stage 1 
Address size fault, but is larger than the input address size specified for the stage 2 translation, then a stage 2 
Translation fault is generated. 

Input address size 

For each enabled stage of address translation, the TCR.TxSZ fields specify the input address size: 

For a stage of translation that supports two VA ranges 
The TCR has two TxSZ fields, corresponding to the two VA ranges: 
• TCR.T0SZ specifies the size for the lower VA range, translated using TTBR0. 
• TCR.T1SZ specifies the size for the upper VA range, translated using TTBR1. 

For a stage of translation that supports a single input address (IA) range 
The TCR has a single T0SZ field, and IAs are translated using TTBR0. 


Attempting to translate an address that is larger than the configured input address size generates a Translation fault. 
This means: 


• For a TCR with a single T0SZ field, Figure D4-2 shows the input address map: 

Input address (IA) 
0xFFFF_FFFF_FFFF_FFFF    Top of the input address map. 
                         Accesses in this region generate Translation faults. 
0x0000_FFFF_FFFF_FFFF    Boundary, when TCR.T0SZ==16. 
                         Effect of increasing TCR.T0SZ: the boundary moves down. 
                         TTBR0 region. 
0x0000_0000_0000_0000    Bottom of the input address map. 
Figure D4-2 AArch64 input address map when using a single TTBR 


• For a TCR with two TxSZ fields, the input address is always a VA, and Selection between TTBR0 and TTBR1 
when two VA ranges are supported on page D4-1759 describes the VA address map. 
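
For the single T0SZ case, the input address size check can be sketched in Python as follows. This sketch assumes the architectural definition that the input address region covers 2^(64-T0SZ) bytes, which is consistent with the boundary shown for TCR.T0SZ==16. The function names are assumptions made for this example, and address tagging is not modeled.

```python
def input_address_bits(t0sz):
    """Return the input address width, in bits, selected by a T0SZ value."""
    return 64 - t0sz


def translation_fault_on_ia(ia, t0sz):
    """Return True if a 64-bit IA lies above the region translated by TTBR0."""
    return (ia >> input_address_bits(t0sz)) != 0
```

With T0SZ set to 16 the highest translatable address is 0x0000_FFFF_FFFF_FFFF, and a larger T0SZ value moves that boundary down.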


For the Non-secure EL1&0 translation regime, when both stages of translation are enabled, if the output address 
from the stage 1 translation does not generate a stage 1 Address size fault, and is larger than the input address 
size specified by VTCR_EL2.T0SZ, then the input address size check for the stage 2 translation generates a Translation 
fault. 


Although software can configure the input address size to be smaller than 48 bits, all implemented AArch64 TTBRs 
must support address sizes of up to 48 bits. 


Overview of the VMSAv8-64 address translation stages on page D4-1745 gives more information about the 
relationship between the required input address size, the value of TxSZ, and the required initial lookup level, and 
how these are affected by the translation granule size. However: 


For all translation stages 


The maximum TxSZ value is 39. If TxSZ is programmed to a value larger than 39 then it is 
IMPLEMENTATION DEFINED whether: 





• The implementation behaves as if the field is programmed to 39 for all purposes other than 
reading back the value of the field. 
• Any use of the TxSZ value generates a Level 0 Translation fault for the stage of translation 
at which TxSZ is used. 


For a stage 1 translation 


The minimum TxSZ value is 16. If TxSZ is programmed to a value smaller than 16 then it is 
IMPLEMENTATION DEFINED whether: 


° The implementation behaves as if the field were programmed to 16 for all purposes other than
reading back the value of the field. 


° Any use of the TxSZ value generates a stage 1 Level 0 Translation fault. 


For a stage 2 translation 


Supported IPA size defines the effective minimum value of T0SZ, that depends on the supported PA
size, and also describes the possible effects of programming T0SZ to a value that is smaller than this
effective minimum value. 


Supported IPA size 


For the Non-secure EL1&0 translation regime, the maximum IPA size is the maximum input address size for the
second stage of translation, that must be specified by VTCR_EL2.T0SZ, see Input address size on page D4-1733.
This value is constrained by the implemented PA size that is specified by ID_AA64MMFR0_EL1.PARange, see
Physical address size on page D4-1731. This implemented PA size also constrains the maximum value of
VTCR_EL2.SL0, that specifies the level of the initial lookup. SL0 also depends on the translation granule, as
described in Overview of the VMSAv8-64 address translation stages on page D4-1745.


Table D4-6 PA size implications for the VTCR_EL2.{T0SZ, SL0} fields

                                                 Maximum SL0 value
PA size   Effective minimum T0SZ value   4KB granule   16KB granule   64KB granule
32 bits   32 if EL1 is using AArch64     1             1              1
          24 if EL1 is using AArch32
36 bits   28 if EL1 is using AArch64     1             1              1
          24 if EL1 is using AArch32
40 bits   24                             1             1              1
42 bits   22                             1             2              1
44 bits   20                             2             2              2
48 bits   16                             2             2              2





If VTCR_EL2.SL0 is programmed to a value larger than the maximum value shown in Table D4-6, or is
programmed to a reserved value, then any memory access that uses the second stage of translation generates a 
stage 2 level 0 Translation fault. 


If VTCR_EL2.T0SZ is programmed to a value smaller than the effective minimum value shown in Table D4-6 then
the implementation consistently does one of the following:

° Treat the VTCR_EL2.T0SZ field as being programmed to the effective minimum value for all purposes other
than reading back the value of the field.

° Treat the VTCR_EL2.T0SZ field as being programmed to the effective minimum value for all purposes other
than:

— Reading back the value of the field.
— Checking whether the value of VTCR_EL2.T0SZ is consistent with the value of VTCR_EL2.SL0.

° Generate a stage 2 level 0 Translation fault on any memory access that uses the second stage of translation.
Note


Programming VTCR_EL2.T0SZ to a value smaller than the effective minimum value shown in Table D4-6 on
page D4-1734 can never provide support for a larger address range than the range given by the effective minimum
value, because the stage 1 output address will give an Address size fault if it is larger than either:

° The PA size, for a VMSAv8-64 stage 1 translation.

° 40 bits, for a VMSAv8-32 stage 1 translation.
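
The constraints in Table D4-6 can be expressed as a small lookup. The sketch below transcribes the table and checks a proposed VTCR_EL2.{T0SZ, SL0} pair against it. The names `TABLE_D4_6`, `effective_min_t0sz`, `max_sl0`, and `vtcr_fields_in_range` are hypothetical, and the check deliberately ignores reserved SL0 encodings and the maximum T0SZ value, which are covered elsewhere.

```python
# Table D4-6, transcribed. Key: PA size in bits. Value: (minimum T0SZ when EL1
# uses AArch64, minimum T0SZ when EL1 uses AArch32, maximum SL0 for the
# 4KB, 16KB, and 64KB granules).
TABLE_D4_6 = {
    32: (32, 24, (1, 1, 1)),
    36: (28, 24, (1, 1, 1)),
    40: (24, 24, (1, 1, 1)),
    42: (22, 22, (1, 2, 1)),
    44: (20, 20, (2, 2, 2)),
    48: (16, 16, (2, 2, 2)),
}
GRANULES = ("4KB", "16KB", "64KB")


def effective_min_t0sz(pa_bits: int, el1_is_aarch32: bool = False) -> int:
    """Effective minimum VTCR_EL2.T0SZ for an implemented PA size."""
    aarch64_min, aarch32_min, _ = TABLE_D4_6[pa_bits]
    return aarch32_min if el1_is_aarch32 else aarch64_min


def max_sl0(pa_bits: int, granule: str) -> int:
    """Maximum VTCR_EL2.SL0 for an implemented PA size and granule."""
    return TABLE_D4_6[pa_bits][2][GRANULES.index(granule)]


def vtcr_fields_in_range(pa_bits, granule, t0sz, sl0, el1_is_aarch32=False):
    """True if T0SZ is not below the effective minimum and SL0 is not above
    the maximum. Does not model reserved SL0 values."""
    return (t0sz >= effective_min_t0sz(pa_bits, el1_is_aarch32)
            and sl0 <= max_sl0(pa_bits, granule))
```

When EL1 is using AArch64, every row of the table satisfies effective minimum T0SZ = 64 - PA size, which is consistent with the Note above: a smaller T0SZ would describe an IPA range larger than the PA size.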








Atomicity of register changes on changing virtual machine 


From the viewpoint of software executing at Non-secure EL1 or EL0, when there is a switch from one virtual
machine to another, the registers that control or affect address translation must be changed atomically. This applies 
to the registers for the Non-secure EL1&0 translation regime. This means that all of the following registers must 
change atomically: 


° The registers associated with the stage 1 translations: 
—  MAIR_EL1 and AMAIR_EL1.
—  TTBR0_EL1, TTBR1_EL1, TCR_EL1, and CONTEXTIDR_EL1.
—  SCTLR_EL1. 


° The registers associated with the stage 2 translations: 
—  VTTBR_EL2 and VTCR_EL2.
—  SCTLR_EL2. 


Note 


Only some bits of SCTLR_EL1 affect the stage 1 translation, and only some bits of SCTLR_EL2 affect the stage 2
translation. However, in each case, changing these bits requires a write to the register, and that write must be atomic 
with the other register updates. 








These registers apply to execution using the Non-secure EL1&0 translation regime. However, when updated as part 
of a switch of virtual machines they are updated by software executing at EL2. This means the registers are out of 
context when they are updated, and no synchronization precautions are required. 


Use of out-of-context translation regimes 
The architecture requires that: 


° When executing at EL3, EL2, or Secure EL1, the PE must not use the registers associated with the 
Non-secure EL1&0 translation regime for speculative memory accesses. 


° When executing at EL3 or Secure EL1, the PE must not use the registers associated with the EL2 translation 
regime for speculative memory accesses. 


° When executing at EL3, EL2, or Non-secure EL1, the PE must not use the registers associated with the 
Secure EL1 translation regime for speculative memory accesses.


When entering an Exception level, on completion of a DSB instruction, no new memory accesses using any 
translation table entries from a translation regime of an Exception level lower than the Exception level that has been 
entered will be observed by any observers, to the extent that those accesses are required to be observed as 
determined by the shareability and cacheability of those translation table entries. 
Note


° This does not require that speculative memory accesses cannot be performed using those entries if it is 
impossible to tell that those memory accesses have been observed by the observers. 





° This requirement does not imply that, on taking an exception to a higher Exception level, any translation table
walks started before the exception was taken will be completed by the time the higher Exception level is 
entered, and therefore memory accesses required for such a translation table walk might, in effect, be 
performed speculatively. However, the execution of a DSB on entry to the higher Exception level ensures that 
these accesses are complete. 





D4.2.3 Memory translation granule size 


The memory translation granule size defines both: 
° The maximum size of a single translation table. 


° The memory page size. That is, the granularity of a translation table lookup. 


VMSAv8-64 supports translation granule sizes of 4KB, 16KB, and 64KB. Support for each granule size is optional, 
and is indicated as shown in Table D4-7: 


Table D4-7 Identifying supported granule sizes

               Support indicated by:
Granule size   Field                       Values
4KB            ID_AA64MMFR0_EL1.TGRAN4     0b0000  4KB granule size supported.
                                           0b1111  4KB granule size not supported.
16KB           ID_AA64MMFR0_EL1.TGRAN16    0b0000  16KB granule size not supported.
                                           0b0001  16KB granule size supported.
64KB           ID_AA64MMFR0_EL1.TGRAN64    0b0000  64KB granule size supported.
                                           0b1111  64KB granule size not supported.





In VMSAv8-64, each address translation stage is configured, independently, to use one of the supported granule 
sizes. 
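
Note that the encodings in Table D4-7 are not symmetric: for the 4KB and 64KB granules 0b0000 means supported, but for the 16KB granule 0b0000 means not supported. The sketch below decodes a raw ID_AA64MMFR0_EL1 value accordingly. The function name is hypothetical, the field positions (TGRAN16 at bits [23:20], TGRAN64 at [27:24], TGRAN4 at [31:28]) are taken from the register description rather than from this section and should be checked against it, and treating any value not listed in the table as not supported is an assumption of this sketch.

```python
def supported_granules(id_aa64mmfr0_el1: int) -> set:
    """Decode the granule support fields of a raw ID_AA64MMFR0_EL1 value.

    Field positions are an assumption taken from the register description.
    Values not listed in Table D4-7 are treated as "not supported".
    """
    tgran16 = (id_aa64mmfr0_el1 >> 20) & 0xF
    tgran64 = (id_aa64mmfr0_el1 >> 24) & 0xF
    tgran4 = (id_aa64mmfr0_el1 >> 28) & 0xF

    granules = set()
    if tgran4 == 0b0000:
        granules.add("4KB")
    if tgran16 == 0b0001:   # 0b0000 means NOT supported for this field
        granules.add("16KB")
    if tgran64 == 0b0000:
        granules.add("64KB")
    return granules
```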


Note 


° Using a larger granule size can reduce the maximum required number of levels of address lookup because: 





— The increased translation table size means the translation table holds more entries. This means a single 
lookup can resolve more bits of the input address. 


— The increased page size means more of the least-significant address bits are required to address a page. 
These address bits are flat mapped from the input address to the output address, and therefore do not 
require translation. 


° ARM recommends that memory-mapped peripherals are separated by an integer multiple of the largest 
granule size supported by the operating system or hypervisor, to allow each peripheral to be managed 
independently. 
Table D4-8 summarizes the effects of the different granule sizes.

Table D4-8 Effect of granule size on a stage of address translation

Property                                             4KB granule   16KB granule   64KB granule   Notes
Maximum number of entries in a translation table     512           2048 (2K)      8192 (8K)      -
Address bits resolved in one level of lookup         9             11             13             2^9=512, 2^11=2K, 2^13=8K
Page size                                            4KB           16KB           64KB           -
Page address range                                   VA[11:0]=     VA[13:0]=      VA[15:0]=      2^12=4K, 2^14=16K, 2^16=64K
                                                     PA[11:0]      PA[13:0]       PA[15:0]





How the granule size affects the address translation process 


As Table D4-8 shows, the translation granule determines the number of address bits: 
° Required to address a memory page. 


° That can be resolved in a single translation table lookup.


This means the translation granule determines how the input address (IA) is resolved to an output address (OA) by 
the translation process. 


Because a single translation table lookup can resolve only a limited number of address bits, the IA to OA resolution 
requires multiple levels of lookup.


Considering the resolution of the maximum IA range of 48 bits, with a translation granule size of 2^n bytes:
° The least-significant n bits of the IA address the memory page. This means OA[(n-1):0]=IA[(n-1):0].
° The remaining (48-n) bits of the IA, IA[47:n], must be resolved by the address translation.

° A translation table descriptor is 8 bytes. Therefore:
— A complete translation table holds 2^(n-3) descriptors.

— A single level of translation can resolve a maximum of (n-3) bits of address.


Consider the translation process, working back from the final level of lookup, that resolves the least 
significant of the address bits that require translation. Because the translation needs to resolve IA[47:n] and 
a level of lookup can resolve (n-3) bits of address: 


— The final level of lookup resolves IA[(2n-4):n]. 
— The previous level of lookup resolves IA[(3n-7):(2n-3)].


However, the level of lookup that resolves the most significant bits of the IA might not require a full-sized 
translation table. Therefore, in general, for a 48-bit IA the address bits resolved in a level of lookup are: 


IA[Min(47, ((3-x)(n-3)+2n-4)):(n+(3-x)(n-3))], where:
Min(a, b) Is a function that returns the minimum of a and b.


x Indicates the level of lookup. This is defined so that the level that resolves the least significant 
bit of the translated IA bits is level 3. 
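
The general expression can be checked against the per-level ranges quoted above. The sketch below is illustrative only and the function name is hypothetical. It takes the multiplier as (3 - x), so that x = 3 gives the final level, IA[(2n-4):n], and x = 2 gives the previous level, IA[(3n-7):(2n-3)]. It returns nothing for a level whose least significant bit would lie above bit 47, which is how it represents a level that cannot be used with that granule.

```python
def resolved_ia_bits(n: int, x: int):
    """IA bits resolved at lookup level x for a granule of 2**n bytes.

    Returns (msb, lsb) for a 48-bit IA, or None if the level lies entirely
    above IA[47] and so is not used with this granule.
    """
    lsb = n + (3 - x) * (n - 3)
    if lsb > 47:
        return None
    msb = min(47, (3 - x) * (n - 3) + 2 * n - 4)
    return msb, lsb


# n = 12, 14, 16 correspond to the 4KB, 16KB, and 64KB granules.
for n in (12, 14, 16):
    print(n, [resolved_ia_bits(n, x) for x in range(4)])
```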


The following diagrams show this model, for each of the permitted granule sizes. 


Figure D4-3 on page D4-1738 shows how a 48-bit IA is resolved when using the 4KB translation granule. 
Using the 4KB translation granule

Input address (IA) fields: IA[47:39], IA[38:30], IA[29:21], IA[20:12], IA[11:0]

IA[47:39]  Index the level 0 translation table
IA[38:30]  Index the level 1 translation table†
IA[29:21]  Index the level 2 translation table†, or map to OA[29:21]*
IA[20:12]  Index the level 3 translation table†, or map to OA[20:12]*
IA[11:0]   OA[11:0]

OA Output address
† Table entry at previous lookup level
* Block entry at previous lookup level


Figure D4-3 How a 48-bit IA is resolved when using the 4KB translation granule 
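
The field boundaries in Figure D4-3 translate directly into shifts and masks. The following sketch splits a 48-bit IA into the four 9-bit table indexes and the 12-bit page offset. The function name is hypothetical and the code only extracts indexes, it does not model descriptors or Block entries.

```python
def split_ia_4kb(ia: int):
    """Split a 48-bit IA into (level 0 index, level 1 index, level 2 index,
    level 3 index, page offset) for the 4KB translation granule."""
    return (
        (ia >> 39) & 0x1FF,  # IA[47:39]
        (ia >> 30) & 0x1FF,  # IA[38:30]
        (ia >> 21) & 0x1FF,  # IA[29:21]
        (ia >> 12) & 0x1FF,  # IA[20:12]
        ia & 0xFFF,          # IA[11:0], passed through to OA[11:0]
    )
```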


Figure D4-4 shows how a 48-bit IA is resolved when using the 16KB translation granule. 


Using the 16KB translation granule

Input address (IA) fields: IA[47], IA[46:36], IA[35:25], IA[24:14], IA[13:0]

IA[47]     Indexes the level 0 translation table
IA[46:36]  Index the level 1 translation table
IA[35:25]  Index the level 2 translation table†, or map to OA[35:25]*
IA[24:14]  Index the level 3 translation table†, or map to OA[24:14]*
IA[13:0]   OA[13:0]

OA Output address
† Table entry at previous lookup level
* Block entry at previous lookup level


Figure D4-4 How a 48-bit IA is resolved when using the 16KB translation granule 
Figure D4-5 shows how a 48-bit IA is resolved when using the 64KB translation granule.

Using the 64KB translation granule

Input address (IA) fields: IA[47:42], IA[41:29], IA[28:16], IA[15:0]

IA[47:42]  Index the level 1 translation table
IA[41:29]  Index the level 2 translation table†, or map to OA[41:29]*
IA[28:16]  Index the level 3 translation table†, or map to OA[28:16]*
IA[15:0]   OA[15:0]

OA Output address
† Table entry at previous lookup level
* Block entry at previous lookup level


Figure D4-5 How a 48-bit IA is resolved when using the 64KB translation granule 


Later sections of this chapter give more information about the translation process, and explain the terminology used
in these figures.


Effect of granule size on translation table addressing and indexing 


Table D4-9 shows the effect of the translation granule size on the addressing and indexing of the TTBR, and on the 
input address range that must be resolved. 


Table D4-9 The effect of translation granule size on the translation tables

               Translation table
Granule size   Addressed by(a)   Indexed by(b)     Notes
4KB            TTBR[47:12]       IA[(x + 8):x]     One level of lookup resolves up to(c) 9 bits of IA
16KB           TTBR[47:14]       IA[(x + 10):x]    One level of lookup resolves up to(c) 11 bits of IA
64KB           TTBR[47:16]       IA[(x + 12):x]    One level of lookup resolves up to(c) 13 bits of IA

a. When translating a maximum-sized input address of 48 bits, and accessing a page of memory.

b. Where the value of x depends on the lookup level, see Table D4-10.

c. Depending on the IA size, the initial lookup might resolve fewer bits of the IA.


Table D4-10 shows the IA bits resolved at each level of lookup, and how these correspond to the possible values of
x in Table D4-9.


Table D4-10 IA bits resolved at different levels of lookup

Lookup level   4KB granule size       16KB granule size      64KB granule size
Zero           IA[47:39], x = 39      IA[47](a), x = 47      -(b)
First          IA[38:30], x = 30      IA[46:36], x = 36      IA[47:42](a), x = 42
Second         IA[29:21], x = 21      IA[35:25], x = 25      IA[41:29], x = 29
Third          IA[20:12], x = 12      IA[24:14], x = 14      IA[28:16], x = 16

a. Smaller value than indicated in Table D4-9 on page D4-1739, as explained in this section.
b. Level 0 lookup not possible with 64KB granule size.


Table D4-9 on page D4-1739 refers to accessing a complete translation table, of 4KB, 16KB, or 64KB. However, 
the ARMv8 translation system supports the following possible variations from the information in Table D4-9 on 
page D4-1739: 


Reduced IA width 


Depending on the configuration and implementation choices, the required input address width for 
the initial level of lookup might be smaller than the number of address bits that can be resolved at 
that level. This means that, for this initial level of lookup: 


° The translation table size is reduced. For each 1 bit reduction in the input address size the size 
of the translation table is halved. 


Note

— This has no effect on the translation table size for subsequent levels of lookup, for
which the lookups always use full-sized translation tables.

— For a stage 2 translation, it might be possible to start the translation at a lower level,
see Concatenated translation tables on page D4-1741.





° More low-order TTBR bits are needed to hold the translation table base address. 


Example D4-1 shows how this applies to translating a 35-bit input address range using the 4KB 
granule. 


Example D4-1 Effect of an IA width of 35 bits when using the 4KB granule size 


With a 4KB granule size, a single level of lookup can resolve up to 9 bits of IA. If an implementation has a 35-bit 
input address range, IA[34:0], Table D4-10 on page D4-1739 shows that lookup must start at level 1, and that the 
initial lookup must resolve IA[34:30], meaning it resolves 5 bits of address. This 4-bit reduction in the required
resolution means:

° The translation table size is divided by 2^4, giving a size of 256B.


° The TTBR requires 4 more bits for the translation table base address, which becomes TTBR[47:8]. 


When using the 64KB translation granule to translate the maximum IA size of 48 bits, Table D4-10 
on page D4-1739 shows that a level 1 lookup must resolve only IA[47:42]. This is 6 bits of address, 
compared to the 13 bits that can be resolved at a single level of lookup. This 7-bit reduction in the 
required resolution means: 





° The translation table size is divided by 2^7, giving a size of 512B.
° The TTBR requires 7 more bits for the translation table base address, which becomes 
TTBR[47:9]. 
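
The arithmetic in Example D4-1 and in the 64KB case above follows one pattern, sketched here for a translation with no concatenation of tables. The function name and its return layout are hypothetical. It assumes 8-byte descriptors and a final lookup at level 3, as described earlier in this section.

```python
def initial_lookup(ia_bits: int, n: int):
    """Initial lookup for an ia_bits-wide input address and a 2**n-byte granule,
    with no concatenation of translation tables.

    Returns (initial level, IA bits resolved at that level,
             initial table size in bytes, lowest TTBR bit of the base address).
    """
    bits_per_level = n - 3                      # full table: 2**(n-3) descriptors
    to_resolve = ia_bits - n                    # IA[(ia_bits-1):n]
    levels = -(-to_resolve // bits_per_level)   # ceiling division
    initial_level = 4 - levels                  # the final lookup is level 3
    initial_bits = to_resolve - (levels - 1) * bits_per_level
    table_bytes = 8 << initial_bits             # 8 bytes per descriptor
    ttbr_low_bit = initial_bits + 3             # table is aligned to its own size
    return initial_level, initial_bits, table_bytes, ttbr_low_bit
```

For a 35-bit IA with the 4KB granule this gives level 1, 5 bits, 256 bytes, and a base address in TTBR[47:8]. For a 48-bit IA with the 64KB granule it gives level 1, 6 bits, 512 bytes, and TTBR[47:9].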
Concatenated translation tables


For stage 2 address translations, for the initial lookup, up to 16 translation tables can be 
concatenated. This means additional IA bits can be resolved at that lookup level. The block of
concatenated translation tables must be aligned to the size of the block of translation tables. 


This means that each additional IA bit resolved: 


° Doubles the number of translation tables required. Resolving an additional n bits requires 2^n
concatenated translation tables at the initial lookup level.


° Reduces by 1 bit the width of the translation table base address held in the TTBR. 


This means that, for the initial lookup of a stage 2 translation table, the IA ranges shown in 
Table D4-10 on page D4-1739 can be extended by up to 4 bits. Example D4-2 shows how 
concatenation can be used to resolve a 40-bit IA when using the 4KB translation granule.


Example D4-2 Concatenating translation tables to resolve a 40-bit IA range, with the 4K granule 


Table D4-10 on page D4-1739 shows that, when using the 4KB translation granule, a level 1 lookup can resolve a 
39-bit IA, with the first lookup resolving IA[38:30]. For a stage 2 translation, to extend the IA width to 40 bits and 
resolve IA[39:30] with the first lookup: 


° Two translation tables are concatenated, giving a total size of 8KB.
° The TTBR requires 1 fewer bit for the translation table base address, which becomes TTBR[47:13]. 


For more information, see Use of concatenated translation tables for the initial stage 2 lookup on 
page D4-1761. 
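
Example D4-2 generalizes as follows. This is a sketch with hypothetical names, assuming the initial lookup level has already been chosen and that the full-sized table at that level resolves (n-3) bits starting at bit n + (3 - level)(n-3). It rejects anything outside the 1 to 16 concatenated tables that a stage 2 initial lookup permits.

```python
def concatenation(ia_bits: int, n: int, initial_level: int):
    """Stage 2 initial lookup using concatenated tables.

    Returns (number of concatenated tables, total size in bytes,
             lowest TTBR bit of the base address).
    """
    lsb = n + (3 - initial_level) * (n - 3)   # lowest IA bit resolved at this level
    extra = ia_bits - lsb - (n - 3)           # bits beyond one full-sized table
    if not 0 <= extra <= 4:
        raise ValueError("needs between 1 and 16 concatenated tables")
    tables = 1 << extra                       # each extra bit doubles the tables
    block_bytes = tables << n                 # each table is one granule in size
    ttbr_low_bit = n + extra                  # block is aligned to its own size
    return tables, block_bytes, ttbr_low_bit
```

A 40-bit IA with the 4KB granule and an initial lookup at level 1 gives two tables, 8KB in total, with the base address in TTBR[47:13].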


In all cases, the translation table, or block of concatenated translation tables, must be aligned to the actual size of 
the table or block of concatenated tables. 


The translation table base address held in the TTBR is defined in the OA map for that stage of address translation. 
The information given in this section assumes this stage of translation has an OA size of 48 bits, meaning the 
translation table base address is: 


° TTBR[47:12] if using the 4KB translation granule.
° TTBR[47:14] if using the 16KB translation granule. 
° TTBR[47:16] if using the 64KB translation granule. 


If the OA address is smaller than 48 bits then the upper bits of this field must be written as zero. For example, for a 
40-bit OA range: 
° If using the 4KB translation granule: 

—  TTBR[47:40] must be set to zero. 

—  TTBR[39:12] holds the translation table base address. 
° If using the 16KB translation granule: 

—  TTBR[47:40] must be set to zero. 

—  TTBR[39:14] holds the translation table base address. 
° If using the 64KB translation granule: 

— TTBR[47:40] must be set to zero. 

—  TTBR[39:16] holds the translation table base address. 


In all cases, if TTBR[47:40] is not zero, any attempt to access the translation table generates an Address size fault. 
D4.2.4 Translation tables and the translation process


The following subsections describe general properties of the translation tables and translation table walks, that are 
largely independent of the translation table format: 


° Translation table walks. 

° Ordering of memory accesses from translation table walks on page D4-1744. 
° Security state of translation table lookups on page D4-1744. 

° Control of translation table walks on page D4-1744. 


See also Selection between TTBR0 and TTBR1 when two VA ranges are supported on page D4-1759.


Translation table walks 


A translation table walk comprises one or more translation table lookups. The translation table walk is the set of 
lookups that are required to translate the VA to the PA. For the Non-secure EL1&0 translation regime, this set 
includes lookups for both the stage 1 translation and the stage 2 translation, but translation table walk can also be 
used to refer to either: 


° The set of lookups required for the stage 1 translation, that translates the VA to the IPA. This is the stage 1 
translation table walk. 


° The set of lookups required for the stage 2 translation, that translates the IPA to the PA. This is the stage 2
translation table walk. 


The information returned by a successful translation table walk is: 


° The required PA. If the access is from Secure state this includes identifying whether the access is to the Secure 
PA space or the Non-secure PA space, see Security state of translation table lookups on page D4-1744. 


° The memory attributes for the target memory region, as described in Memory types and attributes on 
page B2-94. For more information about how the translation table descriptors specify these attributes see 
Memory region attributes on page D4-1792. 


° The access permissions for the target memory regions. For more information about how the translation table 
descriptors specify these permissions see Memory access control on page D4-1783. 


The translation table walk starts with a read of the translation table for the initial lookup. The TTBR for the stage 
of translation holds the base address of this table. Each translation table lookup returns a descriptor, that indicates 
one of the following: 


° The entry is the final entry of the walk. In this case, the entry contains the OA, and the permissions and 
attributes for the access. 


° An additional level of lookup is required. In this case, the entry contains the translation table base address for
that lookup. In addition: 


— The descriptor provides hierarchical attributes that are applied to the final translation, see Hierarchical 
control of Secure or Non-secure memory accesses on page D4-1782 and Hierarchical control of data 
access permissions on page D4-1785. 


— If the translation is in a Secure translation regime, the descriptor indicates whether that base address 
is in the Secure or Non-secure address space, unless a hierarchical control at a previous level of lookup 
has indicated that it must be in the Non-secure address space. 


° The descriptor is invalid. In this case, the memory access generates a Translation fault.
Figure D4-6 gives a generalized view of a single stage of address translation, where three levels of lookup are
required.

Figure D4-6 Generalized view of a stage of address translation


A translation table lookup from VMSAv8-64 performs a single-copy atomic 64-bit access to the translation table 
entry. This means the translation table entry is treated as a 64-bit object for the purpose of endianness. SCTLR.EE 
determines the endianness of the translation table lookups. 


Note 





Dynamically changing translation table endianness 


Because any change to an SCTLR.EE bit requires synchronization before it is visible to subsequent
operations, ARM strongly recommends that any EE bit is changed only when either: 


° Executing at an Exception level that does not use the translation tables affected by the EE bit 
being changed. 

° Executing with address translation disabled for any stage of translation affected by the EE bit 
being changed. 


Address translation stages are disabled by setting an SCTLR.M bit or the HCR_EL2.VM bit to 0. 
See the appropriate register description for more information. 





The appropriate TTBR holds the output address of the base of the translation table used for the initial lookup, and: 


° For all address translation stages other than Non-secure EL1&0 stage 1 translations, the output address held
in the TTBR, and any translation table base address returned by a translation table descriptor, is the PA of the
base of the translation table.

° For Non-secure EL1&0 stage 1 translations, the output address held in the TTBR, and any translation table
base address returned by a translation table descriptor, is the IPA of the base of the translation table. This
means that if stage 2 address translation is enabled, each of these OAs is subject to second stage translation.


Note

TLB caching can be used to minimise the number of translation table lookups that must be performed. For
the Non-secure EL1&0 translation regime, because each stage 1 OA generated during a translation table walk
is subject to a stage 2 translation, if the caching of translation table entries is ineffective, a VA to PA address
translation with two stages of translation can give rise to multiple translation table lookups. The number of
lookups required is given by the following equation:

(S1+1)*(S2+1) - 1

Where, for this translation regime, S1 is the number of levels of lookup required for a stage 1 translation, and
S2 is the number of levels of lookup required for a stage 2 translation.
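
One way to see where the expression comes from, sketched below with hypothetical function names: each of the S1 stage 1 lookups reads a descriptor at an IPA, and the final stage 1 output is also an IPA, so there are S1+1 addresses that each need a full stage 2 walk of S2 lookups, in addition to the S1 stage 1 lookups themselves. The code counts lookups that way and compares the result with the closed form.

```python
def lookups_closed_form(s1: int, s2: int) -> int:
    """Number of lookups given by (S1+1)*(S2+1) - 1."""
    return (s1 + 1) * (s2 + 1) - 1


def lookups_counted(s1: int, s2: int) -> int:
    """Count the lookups step by step, with no caching of table entries."""
    total = 0
    for _ in range(s1):
        total += s2   # stage 2 walk to translate the IPA of the stage 1 table
        total += 1    # the stage 1 lookup itself
    total += s2       # stage 2 walk for the final stage 1 output address
    return total


# Four levels of lookup at each stage.
print(lookups_closed_form(4, 4))  # 24
```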
The TTBR also determines the memory cacheability and shareability attributes that apply, for the corresponding
stage of translation, to all translation table lookups generated by that stage of translation. 


The Normal memory type is the memory type defined for a translation table lookup for a stage of translation. 


Note 


° In a two stage translation regime, a translation table lookup from stage 1, that has the Normal memory type
defined at stage 1 by this rule, can still be given the Device memory type as part of the stage 2 translation of 
that address. ARM strongly recommends against such a remapping of the memory type, and the architecture 
includes a trap of this behavior to EL2. For more information, see Stage 2 fault on a stage 1 translation table 
walk on page D4-1806. 





° The rules about mismatched attributes given in Mismatched memory attributes on page B2-105 apply to the 
relationship between translation table walks and explicit memory accesses to the translation tables in the 
same way that they apply to the relationship between different explicit memory accesses to the same location. 
For this reason, ARM strongly recommends that the attributes that the TCR applies to the translation tables 
are the same as the attributes that are applied for explicit accesses to the memory that holds the translation 
tables. 





For more information see Overview of the VMSAv8-64 address translation stages on page D4-1745. 


See also Selection between TTBR0 and TTBR1 when two VA ranges are supported on page D4-1759.


Ordering of memory accesses from translation table walks 
A translation table walk is considered to be a separate observer, and: 


° A write to the translation tables can be observed by that separate observer at any time after the execution of 
the instruction that performed that write, but is only guaranteed to be observable after the execution of a DSB 
instruction by the PE that executed the instruction that performed that write to the translation tables. 


° Any writes to the translation tables are not seen by any explicit memory access generated by a load or store
that occurs in program order before the instruction that performs the write to the translation tables. 


Security state of translation table lookups 
For a Non-secure translation regime, all translation table lookups are performed to Non-secure output addresses. 
For a Secure translation regime, the initial translation table lookup is performed to a Secure output address.


If the translation table descriptor returned as a result of that initial lookup points to a second translation table, then 
the NSTable bit in that descriptor determines whether that translation table lookup is made to Secure or to 
Non-secure output addresses. 


This applies for all subsequent translation table lookups as part of that translation table walk, with the additional 
rule that any translation table descriptor that is returned from Non-secure memory is treated as if the NSTable bit in 
that descriptor indicates that the subsequent translation table lookup is to Non-secure memory. 
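
The NSTable rule for a Secure translation regime is effectively sticky: once one lookup has been directed to Non-secure memory, every later lookup in the walk is Non-secure, whatever the later NSTable bits say. A minimal model of that rule, with a hypothetical function name and with 1 taken to mean "Non-secure" as an assumption about the bit encoding:

```python
def lookup_security_states(nstable_bits):
    """Security state of each lookup in a Secure-regime translation table walk.

    nstable_bits[i] is the NSTable bit in the table descriptor returned by
    lookup i (1 is assumed to select Non-secure). The result has one more
    entry than the input: the initial lookup, then one lookup per descriptor.
    """
    states = ["Secure"]          # the initial lookup is always Secure
    non_secure = False
    for bit in nstable_bits:
        # A descriptor fetched from Non-secure memory is treated as if its
        # NSTable bit selected Non-secure, so the state never reverts.
        non_secure = non_secure or bool(bit)
        states.append("Non-secure" if non_secure else "Secure")
    return states
```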


Control of translation table walks 


For a stage 1 translation that supports two VA ranges the TCR_ELx.{EPD0, EPD1} bits determine whether the
translation tables for the stage are valid. EPD0 indicates whether the table that TTBR0_ELx points to is valid, and
EPD1 indicates whether the table that TTBR1_ELx points to is valid. The effect of these bits is:

EPDn == 0 The translation table is valid, and can be used for a translation table lookup.

EPDn == 1 If a TLB miss occurs based on TTBRn, a Translation fault is returned, and no translation table walk
is performed. The fault is reported as a level 0 fault.
D4.2.5 Overview of the VMSAv8-64 address translation stages


As shown in Memory translation granule size on page D4-1736, the granule size determines significant aspects of 
the address translation process. Effect of granule size on translation table addressing and indexing on page D4-1739 
shows, for each granule size: 


° How the required input address range determines the required initial lookup levels. 
° For stage 2 translations, the possible effect described in Concatenated translation tables on page D4-1741. 
° The TTBR addressing and indexing for the initial lookup. 


The following subsections summarize the multiple levels of lookup that can be required for a single stage of address 
translation that might require the maximum number of lookups: 


° Overview of VMSAv8-64 address translation using the 4KB translation granule. 
° Overview of VMSAv8-64 address translation using the 16KB translation granule on page D4-1749. 
° Overview of VMSAv8-64 address translation using the 64KB translation granule on page D4-1753.


Overview of VMSAv8-64 address translation using the 4KB translation granule 


The requirements for the level of the initial lookup are different for stage 1 and stage 2 translations. 


Overview of stage 1 translations, 4KB granule 


For a stage 1 translation, the required initial lookup level is determined only by the required input address range 
specified by the corresponding TCR.TnSZ field. When using the 4KB translation granule, Table D4-11 shows this 
requirement. 


Table D4-11 TCR.TnSZ values and IA ranges, 4K granule with no concatenation of tables

                       TnSZ values and input address ranges(a) for starting at this level
Initial lookup level   TnSZmin   IAmax       TnSZmax   IAmin
0                      16        IA[47:12]   24        IA[39:12]
1                      25        IA[38:12]   33        IA[30:12]
2                      34        IA[29:12]   39        IA[24:12]

a. The IAs show the address bits to be resolved when addressing a page of memory, see the Note that follows.


These configuration options are also permitted for stage 2 translations. 





Note 
° When using the 4KB translation granule, the initial lookup cannot be at level 3. 
° Some bits of the IA do not require resolution by the translation table lookup, because they always map
directly to the OA. When using the 4KB translation granule, IA[11:0] = OA[11:0] for all translations.
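
The level boundaries in Table D4-11 follow from the 9 bits resolved per level with the 4KB granule. The sketch below derives the initial lookup level from TnSZ. The function name is hypothetical, and it assumes an input address size of 64 - TnSZ bits, which matches the table (TnSZ of 16 corresponds to IA[47:12]).

```python
def stage1_initial_level_4kb(tnsz: int) -> int:
    """Initial lookup level for a stage 1 translation using the 4KB granule."""
    if not 16 <= tnsz <= 39:
        raise ValueError("TnSZ outside the 16 to 39 range covered by Table D4-11")
    ia_bits = 64 - tnsz                 # assumed input address size
    levels = -(-(ia_bits - 12) // 9)    # ceiling of (bits to resolve) / 9
    return 4 - levels                   # the final lookup is level 3
```

The result is never 3, consistent with the Note above: the largest TnSZ value, 39, still leaves IA[24:12] to resolve, which needs two levels.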
Figure D4-7 shows the stage 1 address translation, for an address translation using the 4KB granule with an input
address size greater than 39 bits. 
D_Block is a Block descriptor
D_Page is a Page descriptor 
a Indexed by IA[n:39], where IA width is (n+1) bits 
b Indexed by IA[38:30] 
c Indexed by IA[29:21] 
d Indexed by IA[20:12] 


Figure D4-7 General view of VMSAv8-64 stage 1 address translation, 4KB granule 


Overview of stage 2 translations, 4KB granule 


For a stage 2 translation, up to 16 translation tables can be concatenated at the initial lookup level. For certain input 
address sizes, concatenating tables in this way means that the lookup starts at a lower level than would otherwise be 
the case. For more information see Use of concatenated translation tables for the initial stage 2 lookup on
page D4-1761.


When using the 4KB translation granule, Table D4-12 shows all possibilities for the initial lookup for a stage 2 
translation. 


Table D4-12 VTCR_EL2.T0SZ values and IA ranges, 4K granule with possible concatenation of translation tables 














Tables (a)       1                             2                 4                 8                 16
Initial          T0SZ values and input address ranges (b) for starting at this level
lookup level     T0SZ    IA                    T0SZ  IA          T0SZ  IA          T0SZ  IA          T0SZ  IA
0                16-24   IA[47:12]-IA[39:12]   -     -           -     -           -     -           -     -
1                25-33   IA[38:12]-IA[30:12]   24    IA[39:12]   23    IA[40:12]   22    IA[41:12]   21    IA[42:12]
2                34-39   IA[29:12]-IA[24:12]   33    IA[30:12]   32    IA[31:12]   31    IA[32:12]   30    IA[33:12]

a. Number of concatenated translation tables at the initial lookup level. 1 table corresponds to no concatenation, also shown in Table D4-11
on page D4-1745.

b. The IAs shown in the table indicate the address bits to be resolved by an address translation addressing a page of memory, see the Note
that follows.

Note
• When using the 4KB translation granule, the initial lookup cannot be at level 3.
• Because concatenating translation tables reduces the number of levels of lookup required, when using the
  4KB translation granule, tables cannot be concatenated at level 0.
• Some bits of the IA do not require resolution by the translation table lookup, because they always map
  directly to the OA. When using the 4KB translation granule, IA[11:0] = OA[11:0] for all translations.





In addition, VTCR_EL2.SL0 indicates the required initial lookup level, as Table D4-13 shows.


Table D4-13 VTCR_EL2.SL0 values, 4KB granule 





Initial lookup level    VTCR_EL2.SL0
0                       0b10
1                       0b01
2                       0b00





The VTCR_EL2.SL0 value 0b11 is reserved.


Because the maximum number of concatenated translation tables is 16, there is a relationship between the permitted
VTCR_EL2.{T0SZ, SL0} values. Table D4-12 on page D4-1746 shows the permitted T0SZ values for each initial
lookup level.


If, when a translation table walk is started, the T0SZ value is not consistent with the SL0 value, or VTCR_EL2.SL0
is programmed to a reserved value, a stage 2 level 0 Translation fault is generated.
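
The consistency rule can be illustrated with a short Python sketch that combines the SL0 encoding of Table D4-13 with the permitted T0SZ values of Table D4-12. The function and table names are hypothetical, and the sketch models only the 4KB granule; it is not a description of how any implementation performs the check.

```python
# Table D4-13: VTCR_EL2.SL0 encoding -> initial lookup level (4KB granule).
# The encoding 0b11 is reserved, so it is deliberately absent.
SL0_TO_LEVEL_4K = {0b10: 0, 0b01: 1, 0b00: 2}

# Table D4-12: T0SZ values permitted for each initial lookup level,
# including the values reachable with 2, 4, 8 or 16 concatenated tables.
T0SZ_PERMITTED_4K = {
    0: range(16, 25),   # 16-24, no concatenation at level 0
    1: range(21, 34),   # 25-33 with one table, 24..21 with concatenation
    2: range(30, 40),   # 34-39 with one table, 33..30 with concatenation
}


def vtcr_4k_is_consistent(t0sz, sl0):
    """Return True if the {T0SZ, SL0} pair is permitted by Tables D4-12 and
    D4-13. False corresponds to the case that generates a stage 2 level 0
    Translation fault when a translation table walk is started."""
    level = SL0_TO_LEVEL_4K.get(sl0)
    if level is None:          # reserved SL0 encoding
        return False
    return t0sz in T0SZ_PERMITTED_4K[level]
```

Note that a T0SZ value of 24 is accepted with either SL0 encoding 0b10 (one level 0 table) or 0b01 (two concatenated level 1 tables), which reflects the overlap between rows of Table D4-12.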
Figure D4-8 shows the stage 2 address translation, for an input address size of between 40 and 43 bits. For an input
address size in this range, the lookup can start at either level 0 or level 1.

Figure D4-8 General view of VMSAv8-64 stage 2 address translation, 4KB granule


Overview of VMSAv8-64 address translation using the 16KB translation granule


The requirements for the level of the initial lookup are different for stage 1 and stage 2 translations. 


Overview of stage 1 translations, 16KB granule 


For a stage 1 translation, the required initial lookup level is determined only by the required input address range 
specified by the corresponding TCR.TnSZ field. When using the 16KB translation granule, Table D4-14 shows this 
requirement. 


Table D4-14 TCR.TnSZ values and IA ranges, 16K granule with no concatenation of tables 





                        TnSZ values and input address ranges (a) for starting at this level
Initial lookup level    TnSZmin    IAmax        TnSZmax    IAmin
0                       16         IA[47:14]    -          -
1                       17         IA[46:14]    27         IA[36:14]
2                       28         IA[35:14]    38         IA[25:14]
3                       39         IA[24:14]    -          -








a. The IAs show the address bits to be resolved when addressing a page of memory, see the Note that follows. 


The configuration options for an initial lookup at level 1, level 2, or level 3 are also permitted for stage 2 translations, 
but stage 2 translation does not permit an initial lookup at level 0. 





Note 
• When using the 16KB translation granule, a maximum of 1 bit of IA is resolved by a level 0 lookup.
• Some bits of the IA do not require resolution by the translation table lookup, because they always map
  directly to the OA. When using the 16KB translation granule, IA[13:0] = OA[13:0] for all translations.
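
The rows of Table D4-14 follow from which level's index field contains the most significant IA bit. The sketch below derives the initial lookup level that way rather than by table lookup. It assumes the per-level index ranges for the 16KB granule that Table D4-22 lists (IA[47], IA[46:36], IA[35:25], IA[24:14]); `LEVEL_LSB_16K` and `stage1_initial_level_16k` are hypothetical names.

```python
# Lowest IA bit indexed at each lookup level for the 16KB granule,
# taken from the address ranges in Table D4-22.
LEVEL_LSB_16K = {0: 47, 1: 36, 2: 25, 3: 14}


def stage1_initial_level_16k(tnsz):
    """Return the initial lookup level for a stage 1 translation using the
    16KB granule, or None if TnSZ is outside the 16..39 range of Table D4-14."""
    if not 16 <= tnsz <= 39:
        return None
    msb = 63 - tnsz  # the IA is (64 - TnSZ) bits wide
    # The walk starts at the lowest-numbered level whose index field
    # reaches down to, or below, the top IA bit.
    return min(level for level, lsb in LEVEL_LSB_16K.items() if lsb <= msb)
```

With TnSZ set to 16 only IA[47] is left for level 0, which is the single bit the Note refers to.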





Figure D4-9 shows the stage 1 address translation, for an address translation using the 16KB granule with an input
address size of 48 bits.

Figure D4-9 General view of VMSAv8-64 stage 1 address translation, 16KB granule


Overview of stage 2 translations, 16KB granule


For a stage 2 translation, up to 16 translation tables can be concatenated at the initial lookup level. For certain input 
address sizes, concatenating tables in this way means that the lookup starts at a lower level than would otherwise be 
the case. For more information see Use of concatenated translation tables for the initial stage 2 lookup on
page D4-1761.


When using the 16KB granule, for a stage 2 translation with an input address size of 48 bits, the initial lookup must
be at level 1, with two concatenated translation tables at this level.


When using the 16KB translation granule, Table D4-15 shows all possibilities for the initial lookup for a stage 2 
translation. 


Table D4-15 VTCR_EL2.T0SZ values and IA ranges, 16K granule with possible concatenation of translation tables 














Tables (a)       1                             2                 4                 8                 16
Initial          T0SZ values and input address ranges (b) for starting at this level
lookup level     T0SZ    IA                    T0SZ  IA          T0SZ  IA          T0SZ  IA          T0SZ  IA
1                17-27   IA[46:14]-IA[36:14]   16    IA[47:14]   -     -           -     -           -     -
2                28-38   IA[35:14]-IA[25:14]   27    IA[36:14]   26    IA[37:14]   25    IA[38:14]   24    IA[39:14]
3                39      IA[24:14]             38    IA[25:14]   37    IA[26:14]   36    IA[27:14]   35    IA[28:14]

a. Number of concatenated translation tables at the initial lookup level. 1 table corresponds to no concatenation, also shown in Table D4-14
on page D4-1749.

b. The IAs shown in the table indicate the address bits to be resolved by an address translation addressing a page of memory, see the Note
that follows.





Note 


• When using the 16KB translation granule for a stage 2 translation, the initial lookup cannot be at level 0.
  When a 48-bit input address is required, translation must start with a level 1 lookup using two concatenated
  translation tables.
• Some bits of the IA do not require resolution by the translation table lookup, because they always map
  directly to the OA. When using the 16KB translation granule, IA[13:0] = OA[13:0] for all translations.





In addition, VTCR_EL2.SL0 indicates the required initial lookup level, as Table D4-16 shows.


Table D4-16 VTCR_EL2.SL0 values, 16KB granule 





Initial lookup level    VTCR_EL2.SL0
1                       0b10
2                       0b01
3                       0b00





The VTCR_EL2.SL0 value 0b11 is reserved.


Because the maximum number of concatenated translation tables is 16, there is a relationship between the permitted
VTCR_EL2.{T0SZ, SL0} values. Table D4-15 shows the permitted values of T0SZ for each initial lookup level.


If, when a translation table walk is started, the T0SZ value is not consistent with the SL0 value, or VTCR_EL2.SL0
is programmed to a reserved value, a stage 2 level 0 Translation fault is generated.
When stage 2 translation supports a 48-bit input address range, translation must start with a level 1 lookup using
two concatenated translation tables. Figure D4-10 shows the translation for this case.

Key for Figure D4-10:
D_Table is a Table descriptor
D_Block is a Block descriptor
D_Page is a Page descriptor
b  Indexed by IA[47:36]
d  Indexed by IA[24:14]


Figure D4-10 VMSAv8-64 stage 2 address translation, 16KB granule, 48 bit input address 
However, for an input address size of between 37 and 40 bits, Table D4-15 on page D4-1750 shows that translation
can start with either a level 1 lookup or a level 2 lookup, and Figure D4-11 shows these options.

Figure D4-11 General view of VMSAv8-64 stage 2 address translation, 16KB granule


Overview of VMSAv8-64 address translation using the 64KB translation granule


The requirements for the level of the initial lookup are different for stage 1 and stage 2 translations. 


Overview of stage 1 translations, 64KB granule 


For a stage 1 translation, the required initial lookup level is determined only by the required input address range 
specified by the corresponding TCR.TxSZ field. When using the 64KB translation granule, Table D4-17 shows this 
requirement. 


Table D4-17 TCR.TnSZ values and IA ranges, 64K granule with no concatenation of tables 





                TnSZ values and input address ranges (a) for starting at this level
Lookup level    TnSZmin    IAmax        TnSZmax    IAmin
1               16         IA[47:16]    21         IA[42:16]
2               22         IA[41:16]    34         IA[29:16]
3               35         IA[28:16]    39         IA[24:16]





a. The IAs show the address bits to be resolved when addressing a page of memory, see the Note that 
follows. 


These configuration options are also permitted for stage 2 translations. 





Note 
• When using the 64KB translation granule, there are no level 0 lookups.
• Some bits of the IA do not require resolution by the translation table lookup, because they always map
  directly to the OA. When using the 64KB translation granule, IA[15:0] = OA[15:0] for all translations.
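
To make the bit ranges concrete, the following sketch splits a 48-bit input address into the fields used with the 64KB granule: the level 1 index IA[47:42], the level 2 index IA[41:29], the level 3 index IA[28:16], and the offset IA[15:0] that passes straight through to the OA. The function name `split_ia_64k` and the dictionary keys are hypothetical, and the sketch assumes a full three-level walk to a page.

```python
def split_ia_64k(ia):
    """Split a 48-bit input address into the table indexes and page offset
    used by a three-level walk with the 64KB translation granule."""
    return {
        "level1": (ia >> 42) & 0x3F,     # IA[47:42], 6 bits
        "level2": (ia >> 29) & 0x1FFF,   # IA[41:29], 13 bits
        "level3": (ia >> 16) & 0x1FFF,   # IA[28:16], 13 bits
        "offset": ia & 0xFFFF,           # IA[15:0] = OA[15:0]
    }


# Build an address from known field values and split it again.
example = (5 << 42) | (7 << 29) | (9 << 16) | 0x1234
fields = split_ia_64k(example)
```

Splitting `example` recovers the four values that were packed into it, which is a quick way to check that the shifts and masks agree with the ranges in the table.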





Figure D4-12 shows the stage 1 address translation, for an address translation using the 64KB granule with an
input address size greater than 42 bits.

Figure D4-12 General view of VMSAv8-64 stage 1 address translation, 64KB granule


Overview of stage 2 translations, 64KB granule


For a stage 2 translation, up to 16 translation tables can be concatenated at the initial lookup level. For certain input 
address sizes, concatenating tables in this way means that the lookup starts at a lower level than would otherwise be 
the case. For more information see Use of concatenated translation tables for the initial stage 2 lookup on
page D4-1761.


When using the 64KB translation granule, Table D4-18 shows all possibilities for the initial lookup for a stage 2
translation.


Table D4-18 VTCR_EL2.T0SZ values and IA ranges, 64K granule with possible concatenation of translation tables 














Tables (a)       1                             2                 4                 8                 16
Initial          T0SZ values and input address ranges (b) for starting at this level
lookup level     T0SZ    IA                    T0SZ  IA          T0SZ  IA          T0SZ  IA          T0SZ  IA
1                16-21   IA[47:16]-IA[42:16]   -     -           -     -           -     -           -     -
2                22-34   IA[41:16]-IA[29:16]   21    IA[42:16]   20    IA[43:16]   19    IA[44:16]   18    IA[45:16]
3                35-39   IA[28:16]-IA[24:16]   34    IA[29:16]   33    IA[30:16]   32    IA[31:16]   31    IA[32:16]

a. Number of concatenated translation tables at the initial lookup level. 1 table corresponds to no concatenation, also shown in
Table D4-17 on page D4-1753.

b. The IAs shown in the table indicate the address bits to be resolved by an address translation addressing a page of memory, see the
Note that follows.





Note 
• When using the 64KB translation granule, there are no level 0 lookups.
• Because concatenating translation tables reduces the number of levels of lookup required, when using the
  64KB translation granule, tables cannot be concatenated at level 1.
• Some bits of the IA do not require resolution by the translation table lookup, because they always map
  directly to the OA. When using the 64KB translation granule, IA[15:0] = OA[15:0] for all translations.





VTCR_EL2.SL0 indicates the required initial lookup level, as Table D4-19 shows. 


Table D4-19 VTCR_EL2.SL0 values, 64K granule 





Initial lookup level    VTCR_EL2.SL0
1                       0b10
2                       0b01
3                       0b00





The VTCR_EL2.SL0 value 0b11 is reserved.


Because the maximum number of concatenated translation tables is 16, there is a relationship between the permitted
VTCR_EL2.{T0SZ, SL0} values. Table D4-18 shows the permitted values of T0SZ for each initial lookup level.


If, when a translation table walk is started, the T0SZ value is not consistent with the SL0 value, or VTCR_EL2.SL0
is programmed to a reserved value, a stage 2 level 0 Translation fault is generated.
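
The relationship between T0SZ and the permitted starting points can be sketched by enumerating, for a given T0SZ, every (initial lookup level, number of concatenated tables) pair that Table D4-18 allows. The names below are hypothetical. The rule that each doubling of the table count lowers the usable T0SZ by one is read directly from the columns of the table.

```python
# T0SZ range served by a single table at each level (column "1" of Table D4-18).
SINGLE_TABLE_T0SZ_64K = {1: (16, 21), 2: (22, 34), 3: (35, 39)}


def stage2_options_64k(t0sz):
    """Return a list of (initial_level, concatenated_tables) pairs that
    Table D4-18 permits for this T0SZ value, in ascending level order."""
    options = []
    for level, (t0sz_min, t0sz_max) in SINGLE_TABLE_T0SZ_64K.items():
        extra_bits = t0sz_min - t0sz
        if t0sz_min <= t0sz <= t0sz_max:
            options.append((level, 1))
        elif level > 1 and 1 <= extra_bits <= 4:
            # Tables cannot be concatenated at level 1; at most 16 tables.
            options.append((level, 2 ** extra_bits))
    return options
```

For a T0SZ of 18, that is a 46-bit input address, the sketch reports a level 1 start with one table or a level 2 start with 16 concatenated tables. An empty list corresponds to a T0SZ value with no permitted configuration in the table.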
Figure D4-13 shows the stage 2 address translation, for an input address size of between 43 and 46 bits. This means
the lookup can start at either level 1 or level 2.

[Figure D4-13 contains two diagrams, "Starting at level 1" and "Starting at level 2". VTCR_EL2.SL0 defines the start
level. Each walk starts from VTTBR_EL2 and ends at a 64KB page, and up to 16 tables can be concatenated at the
initial level.]

Key for both diagrams
D_Table is a Table descriptor
D_Block is a Block descriptor
D_Page is a Page descriptor
a   Indexed by IA[n:42], where IA width is (n+1) bits
b1  Indexed by IA[41:29]
b2  Indexed by IA[n:29], where IA width is (n+1) bits
c   Indexed by IA[28:16]


Figure D4-13 General view of VMSAv8-64 stage 2 address translation, 64KB granule 


D4.2.6 The VMSAv8-64 translation table format


This section provides the full description of the VMSAv8-64 translation table format, and its use for address
translations that are controlled by an Exception level using AArch64. For these translation regimes:

For a stage 1 translation that supports two VA ranges
        • For the lower VA range, that uses TTBR0_ELx:
          — The TCR_ELx.{SH0, ORGN0, IRGN0} fields define memory region attributes for the
            translation table walks.
          — The TCR_ELx.TG0 field defines the Translation granule size.
        • For the upper VA range, that uses TTBR1_ELx:
          — The TCR_ELx.{SH1, ORGN1, IRGN1} fields define memory region attributes for the
            translation table walks.
          — The TCR_ELx.TG1 field defines the Translation granule size.
        • Each of TTBR0_ELx and TTBR1_ELx contains an ASID field, and the TCR_ELx.A1 field
          selects which of these specifies the ASID to use.

For a stage 1 translation that supports one VA range
        The translation table walks use TTBR0_ELx, and:
        • The TCR_ELx.{SH0, ORGN0, IRGN0} fields define memory region attributes for the
          translation table walks.
        • The TCR_ELx.TG0 field defines the Translation granule size.

For a stage 2 translation
        The translation table walks use VTTBR_EL2, and:
        • The VTCR_EL2.{SH0, ORGN0, IRGN0} fields define memory region attributes for the
          translation table walks.
        • The VTCR_EL2.TG0 field defines the Translation granule size.


For the VMSAv8-64 translation table format, Overview of the VMSAv8-64 address translation stages on
page D4-1745 summarizes the lookup levels, and Descriptor encodings, ARMv8 level 0, level 1, and level 2 formats
on page D4-1775 describes the translation table entries.


The following subsections describe the use of this translation table format: 





• Translation granule size and associate block and page sizes on page D4-1757.
• Selection between TTBR0 and TTBR1 when two VA ranges are supported on page D4-1759.
• Use of concatenated translation tables for the initial stage 2 lookup on page D4-1761.
• Possible translation table registers programming errors on page D4-1762.


Translation granule size and associate block and page sizes


Table D4-20 shows the supported granule sizes, block sizes and page sizes, for the different granule sizes. For 
completeness, this table includes information for AArch32 state. In the table, the OA bit ranges are the OA bits that 
the translation table descriptor specifies to address the block or page of memory, in an implementation that supports 
a 48-bit OA range. 


Table D4-20 Translation table granule sizes, with block and page sizes, and output address ranges

Granule size   Table level   Block size and OA bit range   Page size and OA bit range
4KB            Zero          -                             -
               One           1GB, OA[47:30]                -
               Two           2MB, OA[47:21]                -
               Three         -                             4KB, OA[47:12]
16KB           Zero          -                             -
               One           -                             -
               Two           32MB, OA[47:25]               -
               Three         -                             16KB, OA[47:14]
64KB           One           -                             -
               Two           512MB, OA[47:29]              -
               Three         -                             64KB, OA[47:16]





Bit[1] of a translation table descriptor identifies whether the descriptor is a block descriptor, and: 
• The 4KB granule size supports block descriptors only in level 1 and level 2 translation tables.
• The 16KB and 64KB granule sizes support block descriptors only in level 2 translation tables.


If bit[1] of a descriptor is 0 in a translation table that does not support block descriptors then a translation table walk 
that accesses that descriptor generates a Translation fault. 
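
The block-size information in Table D4-20 can be expressed as a small lookup, which also shows how the lowest OA bit supplied by a descriptor follows from the block size. This is an illustrative sketch only. `BLOCK_SIZE`, `block_size` and `oa_lsb` are hypothetical names, and a return value of None stands for a granule and level combination that does not support block descriptors.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

# (granule size in KB, table level) -> block size in bytes, from Table D4-20.
BLOCK_SIZE = {
    (4, 1): 1 * GB,
    (4, 2): 2 * MB,
    (16, 2): 32 * MB,
    (64, 2): 512 * MB,
}


def block_size(granule_kb, level):
    """Return the block size for a block descriptor at this level, or None
    if block descriptors are not supported there."""
    return BLOCK_SIZE.get((granule_kb, level))


def oa_lsb(region_size):
    """Return the lowest OA bit that a descriptor for a region of this size
    specifies. The size must be a power of two."""
    return region_size.bit_length() - 1
```

For example, a 32MB block gives `oa_lsb` of 25, which matches the OA[47:25] range listed for a level 2 block with the 16KB granule.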


For translations managed from AArch64 state, the following tables expand the information for each granule size, 
showing for an access to a single translation table at each lookup level: 


• The maximum IA size, and the address bits that are resolved for that maximum size.
• The maximum OA range resolved by the translation table descriptors at this level, and the corresponding
  memory region size.
• The maximum size of the translation table. This is the size required for the maximum IA size.

Table D4-21 on page D4-1758 shows this information for the 4KB translation granule size, Table D4-22 on
page D4-1758 shows this information for the 16KB translation granule size, and Table D4-23 on page D4-1758
shows this information for the 64KB translation granule size.


Table D4-21 Properties of the address lookup levels, 4KB granule size

         Maximum input address      Maximum output address
Level    Size     Address range     Address range     Size of addressed region (a)   Number of entries   Block entries supported?
Zero     256TB    Address[47:39]    Address[47:39]    512GB                          Up to 512           No
One      512GB    Address[38:30]    Address[47:30]    1GB                            Up to 512           Yes
Two      1GB      Address[29:21]    Address[47:21]    2MB                            Up to 512           Yes
Three    2MB      Address[20:12]    Address[47:12]    4KB                            512                 Page only





a. That is, the size of the region either addressed by descriptors at this level or to be resolved at this and the subsequent levels of lookup. 


Table D4-22 Properties of the address lookup levels, 16KB granule size 





         Maximum input address      Maximum output address
Level    Size     Address range     Address range     Size of addressed region (a)   Number of entries   Block entries supported?
Zero     256TB    Address[47]       Address[47]       128TB                          2 (b)               No
One      128TB    Address[46:36]    Address[47:36]    64GB                           Up to 2048          No
Two      64GB     Address[35:25]    Address[47:25]    32MB                           Up to 2048          Yes
Three    32MB     Address[24:14]    Address[47:14]    16KB                           2048                Page only





a. That is, the size of the region either addressed by descriptors at this level or to be resolved at this and the subsequent levels of lookup. 


b. The translation table size is less than the maximum for this granule size, and therefore the number of entries is reduced. 


Table D4-23 Properties of the address lookup levels, 64KB granule size 





         Maximum input address      Maximum output address
Level    Size     Address range     Address range     Size of addressed region (a)   Number of entries   Block entries supported?
One      256TB    Address[47:42]    Address[47:42]    4TB                            Up to 64 (b)        No
Two      4TB      Address[41:29]    Address[47:29]    512MB                          Up to 8192          Yes
Three    512MB    Address[28:16]    Address[47:16]    64KB                           8192                Page only

a. That is, the size of the region either addressed by descriptors at this level or to be resolved at this and the subsequent levels of lookup.

b. The translation table size is less than the maximum for this granule size, and therefore the number of entries is reduced.
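
The input address ranges and entry counts in Tables D4-21 to D4-23 follow a regular pattern, which the sketch below reproduces. It assumes that each translation table descriptor occupies 8 bytes, so that a full table of one granule holds granule/8 entries. That assumption is consistent with the entry counts in the tables (512, 2048 and 8192) but is stated here rather than in this subsection. The function names are hypothetical.

```python
def level_index_bits(granule_bytes, level):
    """Return (msb, lsb) of the input address bits that index a single table
    at this lookup level, limited to a 48-bit input address, or None if the
    granule has no such level."""
    page_shift = granule_bytes.bit_length() - 1   # 12, 14 or 16
    bits_per_level = page_shift - 3               # log2(granule / 8 bytes)
    lsb = page_shift + (3 - level) * bits_per_level
    if lsb > 47:
        return None                               # e.g. level 0 with 64KB
    msb = min(47, lsb + bits_per_level - 1)
    return (msb, lsb)


def max_entries(granule_bytes, level):
    """Return the maximum number of entries in a single table at this level,
    or None if the level does not exist for this granule."""
    bits = level_index_bits(granule_bytes, level)
    if bits is None:
        return None
    msb, lsb = bits
    return 1 << (msb - lsb + 1)
```

Under this assumption the reduced tables in footnote b fall out naturally: with the 16KB granule only Address[47] remains for level 0, giving 2 entries, and with the 64KB granule only Address[47:42] remains for level 1, giving 64 entries.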


For the initial lookup level:

• If the IA range specified by the TCR.TxSZ field is smaller than the maximum size shown in these tables then
  this reduces the number of addresses in the table and therefore reduces the table size. The smaller translation
  table is aligned to its table size.
• For stage 2 translations, multiple translation tables can be concatenated to extend the maximum IA size
  beyond that shown in these tables. For more information see the stage 2 translation overviews in Overview
  of the VMSAv8-64 address translation stages on page D4-1745 and Use of concatenated translation tables
  for the initial stage 2 lookup on page D4-1761.


If a supplied input address is larger than the configured input address size, a Translation fault is generated. 


Note 


Larger translation granule sizes typically require fewer levels of translation tables to translate a particular size of
VA.








For the TCR programming requirements for the initial lookup, see Overview of the VMSAv8-64 address translation 
stages on page D4-1745. 


Selection between TTBR0 and TTBR1 when two VA ranges are supported


Every translation table walk starts by accessing the translation table addressed by the TTBR for the stage 1 
translation for the required translation regime. 


For a stage 1 translation that supports two VA ranges, Figure D4-14 shows this VA range split, and: 


• TTBR0_ELx points to the initial translation table for the lower VA range, that starts at address
  0x0000_0000_0000_0000.
• TTBR1_ELx points to the initial translation table for the upper VA range, that runs up to address
  0xFFFF_FFFF_FFFF_FFFF.

VA
0xFFFF_FFFF_FFFF_FFFF
        TTBR1_ELx region                                  Effect of increasing TCR_ELx.T1SZ
0xFFFF_0000_0000_0000                                     Boundary, when TCR_ELx.T1SZ==16
        Access generates a Translation fault, see text
0x0000_FFFF_FFFF_FFFF                                     Boundary, when TCR_ELx.T0SZ==16
        TTBR0_ELx region                                  Effect of increasing TCR_ELx.T0SZ
0x0000_0000_0000_0000





Figure D4-14 AArch64 TTBRn boundaries and VA ranges 


Which TTBR is used depends only on the VA presented for translation: 
• If the top bits of the VA are zero, then TTBR0_ELx is used.
• If the top bits of the VA are one, then TTBR1_ELx is used.


It is configurable whether this determination depends on the values of VA[63:56] or on the values of VA[55:48], see 
Address tagging in AArch64 state on page D4-1724. 
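
A minimal sketch of this selection, assuming that address tagging is not in use so that all of the top bits take part in the decision, is shown below. The function name `select_ttbr` is hypothetical, and the sketch returns None for a VA in the region that Figure D4-14 marks as generating a Translation fault. It does not model the Contiguous bit effects described in the Note that follows.

```python
def select_ttbr(va, t0sz=16, t1sz=16):
    """Return the name of the TTBR used to translate a 64-bit VA, or None
    if the VA lies in neither range. Assumes address tagging is disabled."""
    va &= (1 << 64) - 1
    # Lower range: every bit above the (64 - T0SZ)-bit input address is zero.
    if va >> (64 - t0sz) == 0:
        return "TTBR0_ELx"
    # Upper range: every bit above the (64 - T1SZ)-bit input address is one.
    if va >> (64 - t1sz) == (1 << t1sz) - 1:
        return "TTBR1_ELx"
    return None
```

With the default TnSZ values of 16 this reproduces the two boundaries shown in Figure D4-14.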


Note


The handling of the Contiguous bit can mean that the boundary between the translation regions defined by the 
TCR_ELx.TnSZ values and the region for which an access generates a Translation fault is wider than shown in 
Figure D4-14 on page D4-1759. That is, if the descriptor for an access to the region shown as generating a fault has 
the Contiguous bit set to 1, the access might not generate a fault. Possible translation table registers programming 
errors on page D4-1762 describes this possibility. 








Example D4-3 shows a typical application of this VA split. 


Example D4-3 Example use of the split VA range, and the TTBR0_ELx and TTBR1_ELx controls


An example of using the split VA range is:

TTBR0_ELx    Used for process-specific addresses.
             Each process maintains a separate level 1 translation table. On a context switch:
             • TTBR0_ELx is updated to point to the level 1 translation table for the new context.
             • TCR_ELx is updated if this change changes the size of the translation table.
             • CONTEXTIDR_ELx is updated.

TTBR1_ELx    Used for operating system and I/O addresses, that do not change on a context switch.


For each VA subrange, the input address size is 2^(64-TnSZ), where TnSZ is one of TCR_EL1.{T0SZ, T1SZ}.
This means the two VA subranges are:

Lower VA subrange    0x0000_0000_0000_0000 to (2^(64-T0SZ) - 1).

Upper VA subrange    (2^64 - 2^(64-T1SZ)) to 0xFFFF_FFFF_FFFF_FFFF.

The minimum TnSZ value is 16, corresponding to the maximum input address range of 48 bits. Example D4-4
shows the two VA subranges when T0SZ and T1SZ are both set to this minimum value.


Example D4-4 Maximum VA ranges when a stage of translation supports two ranges 


The maximum VA subranges correspond to T0SZ and T1SZ each having the minimum value of 16. In this case the
subranges are:


Lower VA subrange    0x0000_0000_0000_0000 to 0x0000_FFFF_FFFF_FFFF.


Upper VA subrange    0xFFFF_0000_0000_0000 to 0xFFFF_FFFF_FFFF_FFFF.


Figure D4-14 on page D4-1759 indicates the effect of varying the TnSZ values. 
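
The subrange bounds can be computed directly from the TnSZ values. The following sketch evaluates the lower subrange as 0 to 2^(64-T0SZ) - 1 and the upper subrange as 2^64 - 2^(64-T1SZ) to 2^64 - 1. The function name `va_subranges` is hypothetical.

```python
def va_subranges(t0sz, t1sz):
    """Return ((lower_start, lower_end), (upper_start, upper_end)) for the
    two VA subranges selected by TTBR0_ELx and TTBR1_ELx."""
    lower = (0, (1 << (64 - t0sz)) - 1)
    upper = ((1 << 64) - (1 << (64 - t1sz)), (1 << 64) - 1)
    return lower, upper


# The maximum subranges of Example D4-4, with both fields set to 16.
lower_max, upper_max = va_subranges(16, 16)
```

Increasing either TnSZ value shrinks the corresponding subrange. For instance, a T0SZ of 25 ends the lower subrange at 0x0000_007F_FFFF_FFFF.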


As described in Overview of the VMSAv8-64 address translation stages on page D4-1745, the TnSZ values also 
determine the initial lookup level for the translation. 


Use of concatenated translation tables for the initial stage 2 lookup


Overview of the VMSAv8-64 address translation stages on page D4-1745 introduced the ability to concatenate 
translation tables for the initial stage 2 translation lookup. This section gives more information about that 
concatenation. 


If a stage 2 translation would require 16 entries or fewer in its top-level translation table, that stage of translation 
can, instead, be configured so that: 


° It requires the corresponding number of concatenated translation tables at the next translation level, aligned 
to the size of the block of concatenated translation tables. 


° The stage 2 translation starts at that next translation level. 


When using the 16KB translation granule, if a 48-bit input address size is required for the stage 2 translations, 
lookup must start with two concatenated translation tables at level 1. 


The use of concatenated translation tables requires the software that is defining the translation to: 


• Define the concatenated translation tables with the required overall alignment.
• Program VTTBR_EL2 to hold the address of the first of the concatenated translation tables.
• Program VTCR_EL2 to indicate the required input address range and initial lookup level.

Note





The use of concatenated translation tables avoids the overhead of an additional level of translation. 





Concatenating additional translation tables at the initial level of lookup resolves additional address bits at that level.
To resolve n additional address bits requires 2^n concatenated translation tables. Example D4-5 shows how, for
level 1 lookups using the 4KB translation granule, translation tables can be concatenated to resolve three additional
address bits.


Example D4-5 Adding three bits of address resolution at level 1 lookup, using the 4KB granule 


When using the 4KB translation granule, a level 1 lookup with a single translation table resolves address bits[38:30].
To add three more address bits requires 2^3 translation tables, that is, eight translation tables. This means:
• The total size of the concatenated translation tables is 8 × 4KB = 32KB.
• This block of concatenated translation tables must be aligned to 32KB.
• The address range resolved at this lookup level is A[41:30], of which:
  — Bits A[41:39] select the 4KB translation table.
  — Bits A[38:30] index a descriptor within that translation table.
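
The address split in Example D4-5 can be sketched as follows. The code assumes 8-byte descriptors, which is consistent with 512 entries in a 4KB table, and uses the hypothetical name `concat_level1_lookup`. Because the eight tables are contiguous, selecting a table with A[41:39] and then an entry with A[38:30] gives the same byte offset as indexing one 32KB table with A[41:30].

```python
TABLE_SIZE = 4096        # one 4KB translation table
DESCRIPTOR_SIZE = 8      # assumption: 8-byte descriptors, 512 per table


def concat_level1_lookup(address):
    """Return (table_number, descriptor_index, byte_offset) for a level 1
    lookup into eight concatenated 4KB tables resolving A[41:30]."""
    table_number = (address >> 39) & 0x7        # A[41:39] selects the table
    descriptor_index = (address >> 30) & 0x1FF  # A[38:30] indexes within it
    byte_offset = table_number * TABLE_SIZE + descriptor_index * DESCRIPTOR_SIZE
    return table_number, descriptor_index, byte_offset


# An address in the sixth table (number 5), fourth descriptor (index 3).
result = concat_level1_lookup((5 << 39) | (3 << 30))
```

The byte offset is relative to the 32KB-aligned base address of the block of concatenated tables.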


As an example of the concatenation of translation tables at the initial lookup level, when using the 4KB translation 
granule, Table D4-24 shows the possible uses of concatenated translation tables to permit lookup to start at level 1 
rather than at level 0. For completeness, the table starts with the case where the required IPA range means lookup 
starts at level 1 with a single translation table at that level. 


Table D4-24 Possible uses of concatenated translation tables for level 1 lookup, 4KB granule

Configured stage 2 IA size    Lookup starts at level 0    Lookup starts at level 1
IPA range    Size          Required level 0 entries    Number of concatenated tables    Required alignment (a)
IPA[38:0]    2^39 bytes    -                           1                                4KB
IPA[39:0]    2^40 bytes    2                           2                                8KB
IPA[40:0]    2^41 bytes    4                           4                                16KB
IPA[41:0]    2^42 bytes    8                           8                                32KB
IPA[42:0]    2^43 bytes    16                          16                               64KB

a. Required alignment of the set of concatenated level 1 tables.


Note 


Because concatenation is permitted only for a stage 2 translation, the input addresses in the table are IPAs. 








Overview of the VMSAv8-64 address translation stages on page D4-1745 identifies all of the possible uses of 
concatenation. In all cases, the block of concatenated translation tables must be aligned to the block size. 
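
As a sketch of the arithmetic behind Table D4-24, the following function returns the number of concatenated level 1 tables and the required alignment for a given IPA width with the 4KB granule. The name `concat_for_level1_4k` is hypothetical, and the function covers only the widths listed in the table, 39 to 43 bits.

```python
def concat_for_level1_4k(ipa_bits):
    """Return (number_of_tables, alignment_in_bytes) for a stage 2 lookup
    that starts at level 1 with the 4KB granule, for IPA widths 39..43."""
    if not 39 <= ipa_bits <= 43:
        raise ValueError("Table D4-24 covers IPA widths of 39 to 43 bits")
    # A single level 1 table resolves IPA[38:30]; each additional IPA bit
    # doubles the number of concatenated tables.
    tables = 1 << (ipa_bits - 39)
    # The block of concatenated tables is aligned to its own size.
    return tables, tables * 4096
```

For a 42-bit IPA this gives 8 tables aligned to 32KB, matching both the table row for IPA[41:0] and Example D4-5.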


Possible translation table registers programming errors 


This subsection describes possible errors in programming the translation table registers. 


Misprogramming the VTCR_EL2.{T0SZ, SL0} fields


For a stage 2 translation, the programming of the VTCR_EL2.{T0SZ, SL0} fields must be consistent. If these fields
are not consistent, or if SL0 is programmed to a reserved value, any translation table walk that uses stage 2
translation generates a stage 2 level 0 Translation fault. For more information see Overview of the VMSAv8-64
address translation stages on page D4-1745.


Misprogramming of the Contiguous bit 


For more information about the Contiguous bit, and the range of translation table entries that must have the bit set 
to 1 to mark the entries as contiguous, see The Contiguous bit on page D4-1796. 


If one or more of the following errors is made in programming the translation tables, the TLB might contain 
overlapping entries: 


• One or more of the contiguous translation table entries does not have the Contiguous bit set to 1.
• One or more of the contiguous translation table entries holds an output address that is not consistent with all
  of the entries pointing to the same aligned contiguous address range.
• The attributes and permissions of the contiguous entries are not all the same.


Such misprogramming of the translation tables means the output address, memory permissions, or attributes for a 
lookup might be corrupted, and might be equal to values that are not consistent with any of the programmed 
translation table values. 


In some implementations, such misprogramming might also give rise to a TLB Conflict abort. 


The architecture guarantees that misprogramming of the Contiguous bit cannot provide a mechanism for any of the 
following to occur: 


• Software executing at EL1 or EL0 accessing regions of physical memory that are not accessible by
  programming the translation tables, from EL1, with arbitrarily chosen values that do not misprogram the
  Contiguous bit.
• Software executing at EL1 or EL0 accessing regions of physical memory with attributes or permissions that
  are not possible by programming the translation tables, from EL1, with arbitrarily chosen values that do not
  misprogram the Contiguous bit.
• Software executing in Non-secure state accessing Secure physical memory.


Note 


Hardware implementations must ensure that use of the Contiguous bit cannot provide a mechanism for avoiding 
output address range checking. This might occur if a Contiguous bit block size of 0.5GB or 1GB is used in a system 
with the output address size configured to 4GB. The architecture permits the implemented mechanism for 
preventing any avoidance of output address range checking to suppress the use of the Contiguous bit for such entries 
in such a system. 








Where the Contiguous bit is used to mark a set of blocks as contiguous, if the address range translated by a set of 
blocks marked as contiguous is larger than the size of the input address supported at a stage of translation used to 
translate that address at that stage of translation, as defined by the TCR.TxSZ field, then this is a programming error. 
An implementation is permitted, but not required, to: 


• Treat such a block within a contiguous set of blocks as causing a Translation fault, even though the block is 
valid, and the address accessed within that block is within the size of the input address supported at a stage 
of translation, as defined by the TCR.TxSZ field. 


• Treat such a block within a contiguous set of blocks as not causing a Translation fault, even though the 
address accessed within that block is outside the size of the input address supported at a stage of translation, 
as defined by the TCR.TxSZ field, provided that both of the following apply: 


— The block is valid. 


— At least one address within the block, or contiguous set of blocks, is within the size of the input address 
supported at a stage of translation. 


D4.2.7 The algorithm for finding the translation table descriptors 


This subsection gives the algorithms for finding the translation table descriptor that corresponds to a given IA, for 
each required level of lookup. The algorithms encode the descriptions of address translation given earlier in this 
section. The algorithm details depend on the translation granule size for the stage of address translation, see: 


• Finding the translation table descriptor when using the 4KB translation granule on page D4-1764. 
• Finding the translation table descriptor when using the 16KB translation granule on page D4-1765. 
• Finding the translation table descriptor when using the 64KB translation granule on page D4-1766. 


Each subsection uses the following terms: 


BaseAddr    The base address for the level of lookup, as defined by: 
            • For the initial lookup level, the value of the appropriate TTBR.BADDR field. 
            • Otherwise, the translation table address returned by the previous level of lookup. 
PAMax       The supported PA width, in bits. 
IA          The supplied IA for this stage of translation. 
TnSZ        The translation table size for this stage of translation: 
            For EL1&0 stage 1    TCR_EL1.T0SZ or TCR_EL1.T1SZ, as appropriate. 
            For EL1&0 stage 2    VTCR_EL2.T0SZ. 
            For EL2 stage 1      TCR_EL2.T0SZ. 
            For EL3 stage 1      TCR_EL3.T0SZ. 
SL0         VTCR_EL2.SL0. Applies to the Non-secure EL1&0 stage 2 translation only. 


These subsections show only architecturally-valid programming of the TCR. See also Possible translation table 
registers programming errors on page D4-1762. 
Finding the translation table descriptor when using the 4KB translation granule 


Table D4-25 shows the translation table descriptor address, for each level of lookup, when using the 4KB translation 
granule. See the start of The algorithm for finding the translation table descriptors on page D4-1763 for more 
information about terms used in the table. 


Table D4-25 Translation table entry addresses when using the 4KB translation granule 

Lookup Entry address and conditions                                                              General 
level  Stage 1 translation                         Stage 2 translation                           conditions 

Zero   BaseAddr[PAMax-1:x]:IA[y:39]:0b000          BaseAddr[PAMax-1:x]:IA[y:39]:0b000            y = (x + 35) 
       if(a) 16 ≤ TnSZ ≤ 24 then x = (28 - TnSZ)   if SL0(b) == 2 then 
                                                     if(a) 16 ≤ T0SZ ≤ 24 then x = (28 - T0SZ) 

One    BaseAddr[PAMax-1:x]:IA[y:30]:0b000          BaseAddr[PAMax-1:x]:IA[y:30]:0b000            y = (x + 26) 
       if(a) 25 ≤ TnSZ ≤ 33 then x = (37 - TnSZ)   if SL0(b) == 1 then 
       else(c) x = 12                                if(a) 21 ≤ T0SZ ≤ 33 then x = (37 - T0SZ) 
                                                   elsif SL0(b,c) == 2 then x = 12 

Two    BaseAddr[PAMax-1:x]:IA[y:21]:0b000          BaseAddr[PAMax-1:x]:IA[y:21]:0b000            y = (x + 17) 
       if(a) 34 ≤ TnSZ ≤ 39 then x = (46 - TnSZ)   if SL0(b) == 0 then 
       else(c) x = 12                                if(a) 30 ≤ T0SZ ≤ 39 then x = (46 - T0SZ) 
                                                   elsif SL0(b,c) > 0 then x = 12 

Three  BaseAddr[PAMax-1:12]:IA[20:12]:0b000        BaseAddr[PAMax-1:12]:IA[20:12]:0b000          - 






a. This line indicates the range of permitted values for TnSZ, for a lookup that starts at this level, see Overview of VMSAv8-64 address 
translation using the 4KB translation granule on page D4-1745. 


b. SL0 == 0 if the initial lookup is level 2, SL0 == 1 if the initial lookup is level 1, and SL0 == 2 if the initial lookup level is level 0. 


c. This is the case where this level of lookup is not the initial level of lookup. 


Table D4-7 on page D4-1736 shows how software can determine whether an implementation supports the 4KB 
granule size. 
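As an illustration of how the stage 1 column of Table D4-25 can be read, the following sketch computes the descriptor address for one level of lookup. It is an informal rendering and not the architectural pseudocode: the function and table names are hypothetical, and the default `pa_max` of 48 is an assumption made only for the example.

```python
# Per lookup level, from the stage 1 column of Table D4-25 (4KB granule):
# (lowest IA bit used, offset of y from x, first TnSZ, last TnSZ, constant in x).
LEVELS_4KB_S1 = {
    0: (39, 35, 16, 24, 28),
    1: (30, 26, 25, 33, 37),
    2: (21, 17, 34, 39, 46),
    3: (12, 8, None, None, None),   # always BaseAddr[PAMax-1:12]:IA[20:12]
}


def bits(value, high, low):
    """Return value[high:low] as an integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def descriptor_address_4kb_s1(base_addr, ia, tnsz, level, pa_max=48):
    """Return BaseAddr[PAMax-1:x]:IA[y:low]:0b000 for a stage 1 lookup."""
    low, y_offset, first, last, const = LEVELS_4KB_S1[level]
    if first is not None and tnsz > last:
        raise ValueError("the lookup starts at a later level than this one")
    if first is not None and tnsz >= first:
        x = const - tnsz          # this is the initial level of lookup
    else:
        x = 12                    # not the initial level of lookup
    y = x + y_offset
    # Concatenate: upper BaseAddr bits, then the IA index, then three zero bits.
    return (bits(base_addr, pa_max - 1, x) << x) | (bits(ia, y, low) << 3)
```

For example, with TnSZ equal to 16 the lookup starts at level 0 with x = 12 and y = 47, so IA[47:39] selects one of 512 eight-byte descriptors in the table at BaseAddr.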
Finding the translation table descriptor when using the 16KB translation granule 


Table D4-26 shows the translation table descriptor address, for each level of lookup, when using the 16KB 
translation granule. See the start of The algorithm for finding the translation table descriptors on page D4-1763 for 
more information about terms used in the table. 


Table D4-26 Translation table entry addresses when using the 16KB translation granule 

Lookup Entry address and conditions                                                              General 
level  Stage 1 translation                         Stage 2 translation                           conditions 

Zero   BaseAddr[PAMax-1:4]:IA[47]:0b000            -                                             Only applies to stage 1 
       (a) TnSZ == 16 

One    BaseAddr[PAMax-1:x]:IA[y:36]:0b000          BaseAddr[PAMax-1:x]:IA[y:36]:0b000            y = (x + 32) 
       if(a) 17 ≤ TnSZ ≤ 27 then x = (31 - TnSZ)   if SL0(b) == 2 then 
       else(c) x = 14                                if(a) 16 ≤ T0SZ ≤ 27 then x = (31 - T0SZ) 

Two    BaseAddr[PAMax-1:x]:IA[y:25]:0b000          BaseAddr[PAMax-1:x]:IA[y:25]:0b000            y = (x + 21) 
       if(a) 28 ≤ TnSZ ≤ 38 then x = (42 - TnSZ)   if SL0(b) == 1 then 
       else(c) x = 14                                if(a) 24 ≤ T0SZ ≤ 38 then x = (42 - T0SZ) 
                                                   elsif SL0(b,c) == 2 then x = 14 

Three  BaseAddr[PAMax-1:14]:IA[24:14]:0b000        BaseAddr[PAMax-1:x]:IA[y:14]:0b000            y = (x + 10) 
                                                   if SL0(b) == 0 then 
                                                     if(a) 35 ≤ T0SZ ≤ 39 then x = (53 - T0SZ) 
                                                   elsif SL0(b,c) > 0 then x = 14 






a. This line indicates the range of permitted values for TnSZ, for a lookup that starts at this level, see Overview of VMSAv8-64 address 
translation using the 16KB translation granule on page D4-1749. 


b. SL0 == 0 if the initial lookup is level 3, SL0 == 1 if the initial lookup is level 2, and SL0 == 2 if the initial lookup level is level 1. 


c. This is the case where this level of lookup is not the initial level of lookup. 


Table D4-7 on page D4-1736 shows how software can determine whether an implementation supports the 16KB 
granule size. 
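The stage 2 column of Table D4-26 depends on both SL0 and T0SZ. The sketch below shows one way to derive x and y for a stage 2 lookup with the 16KB granule. The function name and the choice to raise `ValueError` for combinations that the table does not list are assumptions of the example.

```python
# Per lookup level, from the stage 2 column of Table D4-26 (16KB granule):
# (first T0SZ, last T0SZ, constant in x, offset of y from x).
ROWS_16KB_S2 = {
    1: (16, 27, 31, 32),
    2: (24, 38, 42, 21),
    3: (35, 39, 53, 10),
}


def stage2_xy_16kb(level, sl0, t0sz):
    """Return (x, y) for a stage 2 lookup at `level` with the 16KB granule."""
    if sl0 not in (0, 1, 2):
        raise ValueError("SL0 value not covered by the table")
    initial_level = 3 - sl0       # footnote b: SL0 of 0, 1, 2 means level 3, 2, 1
    if not initial_level <= level <= 3:
        raise ValueError("this level is not used by the lookup")
    first, last, const, y_offset = ROWS_16KB_S2[level]
    if level == initial_level:
        if not first <= t0sz <= last:
            raise ValueError("T0SZ is outside the range permitted for this initial level")
        x = const - t0sz
    else:
        x = 14                    # not the initial level of lookup
    return x, x + y_offset
```

For instance, `stage2_xy_16kb(2, 1, 24)` gives x = 18 and y = 39, while any level that is not the initial level gives x = 14.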
Finding the translation table descriptor when using the 64KB translation granule 


Table D4-27 shows the translation table descriptor address, for each level of lookup, when using the 64KB 
translation granule. See the start of The algorithm for finding the translation table descriptors on page D4-1763 for 
more information about terms used in the table. 


Table D4-27 Translation table entry addresses when using the 64KB translation granule 

Lookup Entry address and conditions                                                              General 
level  Stage 1 translation                         Stage 2 translation                           conditions 

One    BaseAddr[PAMax-1:x]:IA[y:42]:0b000          BaseAddr[PAMax-1:x]:IA[y:42]:0b000            y = (x + 38) 
       if(a) 16 ≤ TnSZ ≤ 21 then x = (25 - TnSZ)   if SL0(b) == 2 then 
                                                     if(a) 16 ≤ T0SZ ≤ 21 then x = (25 - T0SZ) 

Two    BaseAddr[PAMax-1:x]:IA[y:29]:0b000          BaseAddr[PAMax-1:x]:IA[y:29]:0b000            y = (x + 25) 
       if(a) 22 ≤ TnSZ ≤ 34 then x = (38 - TnSZ)   if SL0(b) == 1 then 
       else(c) x = 16                                if(a) 18 ≤ T0SZ ≤ 34 then x = (38 - T0SZ) 
                                                   elsif SL0(b,c) == 2 then x = 16 

Three  BaseAddr[PAMax-1:x]:IA[y:16]:0b000          BaseAddr[PAMax-1:x]:IA[y:16]:0b000            y = (x + 12) 
       if(a) 35 ≤ TnSZ ≤ 39 then x = (51 - TnSZ)   if SL0(b) == 0 then 
       else(c) x = 16                                if(a) 31 ≤ T0SZ ≤ 39 then x = (51 - T0SZ) 
                                                   elsif SL0(b,c) > 0 then x = 16 






a. This line indicates the range of permitted values for TnSZ, for a lookup that starts at this level, see Overview of VMSAv8-64 address 
translation using the 64KB translation granule on page D4-1753. 


b. SL0 == 0 if the initial lookup is level 3, SL0 == 1 if the initial lookup is level 2, and SL0 == 2 if the initial lookup level is level 1. 


c. This is the case where this level of lookup is not the initial level of lookup. 


Table D4-7 on page D4-1736 shows how software can determine whether an implementation supports the 64KB 
granule size. 
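One way to sanity-check the constants in Table D4-27 is to note that, at the initial level of lookup, substituting x into y always gives y = 63 - TnSZ. That is the top bit of an input address of 64 - TnSZ bits. The sketch below, which uses hypothetical names, works this out for every stage 1 TnSZ value listed in the table.

```python
# (level, first TnSZ, last TnSZ, constant in x, offset of y from x),
# from the stage 1 column of Table D4-27 (64KB granule).
ROWS_64KB_S1 = [
    (1, 16, 21, 25, 38),
    (2, 22, 34, 38, 25),
    (3, 35, 39, 51, 12),
]


def initial_lookup_64kb_s1(tnsz):
    """Return (initial level, x, y) for a stage 1 lookup with the 64KB granule."""
    for level, first, last, const, y_offset in ROWS_64KB_S1:
        if first <= tnsz <= last:
            x = const - tnsz
            return level, x, x + y_offset
    raise ValueError("TnSZ is outside the ranges listed in the table")


def check_top_bit():
    """Confirm that y is the most significant IA bit for every listed TnSZ."""
    return all(initial_lookup_64kb_s1(t)[2] == 63 - t for t in range(16, 40))
```

The same identity holds for the initial-level rows of the 4KB and 16KB tables, for example 28 + 35 = 63 and 31 + 32 = 63.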
D4.2.8 The effects of disabling a stage of address translation 


The following sections describe the effect on MMU behavior of disabling each stage of translation: 


• Behavior when stage 1 address translation is disabled. 
• Behavior when stage 2 address translation is disabled on page D4-1768. 
• Behavior of instruction fetches when all associated stages of translation are disabled on page D4-1768. 


Behavior when stage 1 address translation is disabled 


When a stage 1 address translation is disabled, memory accesses that would otherwise be translated by that stage of 
translation are treated as follows: 

Non-secure EL1 and EL0 accesses if the HCR_EL2.DC bit is set to 1 
    For the Non-secure EL1&0 translation regime, when the value of HCR_EL2.DC is 1, the stage 1 
    translation assigns the Normal Non-shareable, Inner Write-Back Read-Allocate Write-Allocate, 
    Outer Write-Back Read-Allocate Write-Allocate memory attributes. 

    Note 
    This applies for both instruction and data accesses. 

All other accesses 
    For all other accesses, when stage 1 address translation is disabled, the assigned attributes depend 
    on whether the access is a data access or an instruction access, as follows: 

    Data access 
        The stage 1 translation assigns the Device-nGnRnE memory type. 

    Instruction access 
        The stage 1 translation assigns the Normal memory attribute, with the cacheability and 
        shareability attributes determined by the value of the SCTLR.I bit for the translation 
        regime, as follows: 

        When the value of I is 0 
            The stage 1 translation assigns the Non-cacheable and Outer Shareable 
            attributes. 

        When the value of I is 1 
            The stage 1 translation assigns the Cacheable, Inner Write-Through 
            Read-Allocate No Write-Allocate, Outer Write-Through Read-Allocate No 
            Write-Allocate Outer Shareable attribute. 


For this stage of translation, no memory access permission checks are performed. Therefore no MMU faults can be 
generated for this stage of address translation. 





Note 


Alignment checking is performed, and therefore Alignment faults can occur. 





For every access, the input address of the stage 1 translation is flat-mapped to the output address. 
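The attribute assignment described above can be summarized as a small decision function. This is a sketch for illustration only: the function name, its parameters, and the dictionary representation of the attributes are hypothetical, and the stage 2 modifications described next are not modeled.

```python
WB_RA_WA = "Write-Back Read-Allocate Write-Allocate"
WT_RA_NWA = "Write-Through Read-Allocate No Write-Allocate"


def stage1_disabled(input_address, is_instruction_fetch, sctlr_i=0,
                    nonsecure_el10=False, hcr_dc=0):
    """Return (output_address, attributes) when stage 1 translation is disabled."""
    if nonsecure_el10 and hcr_dc == 1:
        # Applies to both instruction and data accesses.
        attrs = {"type": "Normal", "shareability": "Non-shareable",
                 "inner": WB_RA_WA, "outer": WB_RA_WA}
    elif not is_instruction_fetch:
        attrs = {"type": "Device-nGnRnE"}
    elif sctlr_i == 0:
        attrs = {"type": "Normal", "cacheability": "Non-cacheable",
                 "shareability": "Outer Shareable"}
    else:
        attrs = {"type": "Normal", "shareability": "Outer Shareable",
                 "inner": WT_RA_NWA, "outer": WT_RA_NWA}
    # The input address is flat-mapped to the output address.
    return input_address, attrs
```

Note that no permission information is returned, matching the statement that no memory access permission checks are performed for this stage.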


For a Non-secure EL1 or EL0 access, if EL1&0 stage 2 address translation is enabled, the stage 1 memory attribute 
assignments and output address can be modified by the stage 2 translation. 


When the value of HCR_EL2.DC is 1, in Non-secure state: 


• The SCTLR_EL1.M bit behaves as if it is 0, for all purposes other than reading the value of the bit. This 
means Non-secure EL1&0 stage 1 address translation is disabled. 


• The HCR_EL2.VM bit behaves as if it is 1, for all purposes other than reading the value of the bit. This means 
that Non-secure EL1&0 stage 2 address translation is enabled. 


See also Behavior of instruction fetches when all associated stages of translation are disabled on page D4-1768. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. D4-1767 
ID092916 Non-Confidential 


D4 The AArch64 Virtual Memory System Architecture 
D4.2 The VMSAv8-64 address translation system 


Effect of disabling address translation on maintenance and address translation instructions 


Cache maintenance instructions act on the target cache regardless of whether any stages of address translation are 
disabled, and regardless of the values of the memory attributes. However, if a stage of address translation is disabled, 
they use the flat address mapping for that translation stage. 


TLB invalidate operations act on the target TLB regardless of whether any stage of address translation is disabled. 


The value of HCR_EL2.DC affects some address translation instructions, see Address translation instructions, AT* 
on page D4-1771. 


Behavior when stage 2 address translation is disabled 


When stage 2 address translation is disabled: 
• The IPA output from the stage 1 translation maps flat to the PA. 


• The memory attributes and permissions from the stage 1 translation apply to the PA. 

When both stages of address translation are disabled, see also Behavior of instruction fetches when all associated 
stages of translation are disabled. 

Behavior of instruction fetches when all associated stages of translation are disabled 


When EL3 is using AArch64, this section applies to: 





• The Secure EL1&0 translation regime, when Secure EL1&0 stage 1 address translation is disabled. 

• The Secure EL3 translation regime, when Secure EL3 stage 1 address translation is disabled. 

• The Non-secure EL2 translation regime, when Non-secure EL2 stage 1 address translation is disabled. 

• The Non-secure EL1&0 translation regime, when both stages of address translation are disabled. 

Note 

• The behaviors in Non-secure state apply regardless of the Execution state that EL3 is using. 


• When the value of HCR_EL2.DC is 1, then the behavior of the Non-secure EL1&0 translation regime is as 
if stage 1 translation is disabled and stage 2 translation is enabled, as described in Behavior when stage 1 
address translation is disabled on page D4-1767. 





In these cases, when execution is in AArch64 state, a memory location might be accessed as a result of an instruction 
fetch if either: 


• The memory location is in the same block of memory as, or in the next contiguous block of memory to, an 
instruction that a simple sequential execution of the program either requires to be fetched now or has required 
to be fetched since the last reset. 


• The memory location is the target of a direct branch that a simple sequential execution of the program would 
have taken since the most recent of: 
— The last reset. 
— The last synchronization of instruction cache maintenance targeting the address of the branch 
instruction. 


In this description, the blocks of memory referred to are of the size of the minimum implemented translation granule 
and are aligned to that size. 


These accesses can be caused by speculative instruction fetches, regardless of whether the prefetched instruction is 
committed for execution. 
Note 


To ensure architectural compliance, software must ensure that both of the following apply: 





• Instructions that will be executed when all associated stages of address translation are disabled are located in 
blocks of the address space, of the translation granule size, that contain only memory that is tolerant to 
speculative accesses. 


• Each block of the address space, of the translation granule size, that immediately follows a similar block that 
holds instructions that will be executed when all associated stages of address translation are disabled, contains 
only memory that is tolerant to speculative accesses. 
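The two software requirements above amount to a simple address calculation. As a minimal sketch, assuming a hypothetical helper name and that `granule_size` is the translation granule size in bytes, the set of blocks that must contain only speculation-tolerant memory can be listed as follows:

```python
def blocks_needing_speculation_tolerance(instruction_addresses, granule_size):
    """Return the sorted base addresses of the blocks that must tolerate speculation.

    For each instruction executed with all associated stages of address
    translation disabled, this is the granule-sized block that holds the
    instruction and the block that immediately follows it.
    """
    blocks = set()
    for address in instruction_addresses:
        base = address & ~(granule_size - 1)   # block holding the instruction
        blocks.add(base)
        blocks.add(base + granule_size)        # block that immediately follows
    return sorted(blocks)
```

For boot code that occupies 0x80000 to 0x80FFF with a 4KB granule, this yields the blocks at 0x80000 and 0x81000, so a device mapped at 0x81000 would not meet the requirement.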





D4.2.9 The implemented Exception levels and the resulting translation stages and regimes 


Elsewhere, this chapter describes an implementation that includes all Exception levels, and describes the control of 
address translation by Exception levels that are using AArch64. This subsection describes how the address 
translation scheme changes if an implementation does not include all of the Exception levels. 


If an implementation does not include EL3, it has only a single Security state, with MMU controls equivalent to the 
Secure state MMU controls. 


If an implementation does not include EL2 then: 
• If it also does not include EL3, the MMU provides only a single EL1&0 stage 1 translation regime. 
• If it includes EL3, the MMU provides an EL1&0 stage 1 translation regime in each Security state. 


Figure D4-1 on page D4-1727 shows the set of translation regimes for an implementation that implements all of the 
Exception levels. Table D4-28 shows how the supported translation stages depend on the implemented Exception 
levels, and in some cases on the Execution state being used by the highest implemented Exception level. 


Table D4-28 The relation between the implemented translation stages and Exception levels for AArch64 

Translation stage           Requires 

Secure EL3 stage 1          EL3 implemented and using AArch64. 

Secure EL1&0 stage 1        Either: 
                            • EL3 implemented and using AArch64. 
                            • Only EL1 and EL0 implemented, all operation is in Secure state, and EL1 is using 
                              AArch64. 

Non-secure EL2 stage 1      EL2 implemented. 

Non-secure EL1&0 stage 2    EL2 implemented. 

Non-secure EL1&0 stage 1    Any implementation except: 
                            • Only EL1 and EL0 implemented, with all operation in the Secure state. 
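
The rows of Table D4-28 can also be read as a set of conditions on the implemented Exception levels. The following sketch encodes them directly. The function name and its parameters are hypothetical, and `all_secure` is only meaningful for an implementation that includes neither EL2 nor EL3.

```python
def supported_stages(has_el2, has_el3, el3_uses_aarch64=True,
                     all_secure=False, el1_uses_aarch64=True):
    """Return the translation stages supported, following Table D4-28."""
    only_el1_el0 = not has_el2 and not has_el3
    stages = []
    if has_el3 and el3_uses_aarch64:
        stages.append("Secure EL3 stage 1")
    if (has_el3 and el3_uses_aarch64) or (
            only_el1_el0 and all_secure and el1_uses_aarch64):
        stages.append("Secure EL1&0 stage 1")
    if has_el2:
        stages.append("Non-secure EL2 stage 1")
        stages.append("Non-secure EL1&0 stage 2")
    if not (only_el1_el0 and all_secure):
        stages.append("Non-secure EL1&0 stage 1")
    return stages
```

For example, an implementation with EL3 but without EL2 supports the Secure EL3 stage 1, Secure EL1&0 stage 1, and Non-secure EL1&0 stage 1 translation stages.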

D4.2.10 Pseudocode description of VMSAv8-64 address translation 

The following subsections outline a pseudocode description of the translation table walk: 

• Definitions required for address translation on page D4-1770. 

• Performing the full address translation on page D4-1770. 

• Stage 1 translation on page D4-1770. 

• Stage 2 translation on page D4-1770. 

• Translation table walk on page D4-1770. 

• Support functions on page D4-1770. 
Definitions required for address translation 

In pseudocode, the result of a translation table lookup, in either Execution state, is returned in a TLBRecord structure. 


Memory data type definitions on page D3-1717 includes definitions of the Permissions and AddressDescriptor 
parameters. 


Performing the full address translation 


The function AArch64.FullTranslate() performs a full translation table walk. For any translation regime it performs 
a stage 1 translation for the supplied VA, and for the Non-secure EL1&0 translation regime it then performs a stage 
2 translation of the returned address. 


Stage 1 translation 


The function AArch64.FirstStageTranslate() performs a stage 1 translation, calling the function 
AArch64.TranslationTableWalk(), described in Translation table walk, to perform the required translation table 
walk. However, if stage 1 translation is disabled, it calls the function AArch64.TranslateAddressS1Off() to set the 
memory attributes. 


Stage 2 translation 


In the Non-secure EL1&0 translation regime, a descriptor address returned by stage 1 lookup is in the IPA address 
space, and must be mapped to a PA by a stage 2 translation. Function AArch64.SecondStageWalk() performs this 
translation, by calling the AArch64.SecondStageTranslate() function. When called from AArch64.SecondStageWalk(), 
the AArch64.SecondStageTranslate() function performs a second stage translation, from IPA to PA, of the supplied 
address, including checking that the access has read permission at the second stage. If the access does not have 
second stage read permission it generates a second stage Permission fault on the first stage translation table walk. 
The second stage translation might hit in a TLB, or might involve a translation table walk, which will use the 
algorithm described in this section. 


Translation table walk 


The function AArch64.TranslationTableWalk() returns the result, in the form of a TLBRecord, of a translation table 
walk made for a memory access from an Exception level that is using AArch64. 


Support functions 


In the translation table walk functions, the WalkAttrDecode() function determines the attributes for a translation table 
lookup. 


The function AArch64.S1AttrDecode() decodes the attributes from a stage 1 translation table lookup. 


The function AArch64.CheckPermission() checks the access permissions returned by a stage 1 translation table 
lookup, see Access permission checking on page D3-1719. 


The function AArch64.CheckS2Permission() checks the access permissions returned by a stage 2 translation table 
lookup. 


The function AddrTop() returns the bit number of the most significant valid bit of a VA in the current translation 
regime. If EL1 is using AArch64 and EL0 is using AArch32 then an address from EL0 is zero-extended to 64 bits. 
D4.2.11 Address translation instructions 


Each of the ARMv8 instruction sets provides instructions that return the result of translating an input address, 
supplied as an argument to the instruction, using a specified translation stage or regime. 


The available instructions only perform translations that are accessible from the Security state and Exception level 
at which the instruction is executed. That is: 


• No instruction executed in Non-secure state can return the result of a Secure address translation stage. 


• No instruction can return the result of an address translation stage that is controlled by an Exception level 
that is higher than the Exception level at which the instruction is executed. 


Address translation instructions, AT* summarizes the A64 address translation instructions. 


See also A64 system instructions for address translation on page C5-365. 


Address translation instructions, AT* 
The A64 assembly language syntax for address translation instructions is: 
AT <operation>, <Xt> 

Where: 
<operation>    Is one of S1E1R, S1E1W, S1E0R, S1E0W, S12E1R, S12E1W, S12E0R, S12E0W, S1E2R, S1E2W, S1E3R, or S1E3W. 
               <operation> has a structure of <stages><level><read|write>, where: 

               <stages>        Is one of: 
                               S1      Stage 1 translation. 
                               S12     Stage 1 translation followed by stage 2 translation. 

               <level>         Describes the Exception Level that the translation applies to. Is one of: 
                               E0      EL0. 
                               E1      EL1. 
                               E2      EL2. 
                               E3      EL3. 
                               If <level> is higher than the current Exception Level the instruction is UNDEFINED. 

               <read|write>    Is one of: 
                               R       Read. 
                               W       Write. 

<Xt>           The address to be translated. No alignment restrictions apply for the address. 


If EL2 is not implemented, the AT S1E2R and AT S1E2W instructions are UNDEFINED. 


Note 


If EL2 is not implemented but EL3 is implemented, the AT S12E* instructions are not UNDEFINED, but behave the 
same way as the equivalent AT S1E* instructions. This is consistent with the behavior if EL2 is implemented but 
stage 2 translation is disabled. 
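
The <stages><level><read|write> structure of <operation> lends itself to a small decoder. The sketch below is illustrative only: the function name and the returned dictionary are hypothetical, and the `level_above_current_el` flag models just the single rule stated above for <level>. It does not model the other conditions under which these instructions are UNDEFINED.

```python
import re

AT_OPERATIONS = {
    "S1E1R", "S1E1W", "S1E0R", "S1E0W", "S12E1R", "S12E1W",
    "S12E0R", "S12E0W", "S1E2R", "S1E2W", "S1E3R", "S1E3W",
}
_PATTERN = re.compile(r"(S12|S1)E([0-3])([RW])")


def decode_at_operation(operation, current_el):
    """Split an AT <operation> name into its stages, level, and access type."""
    name = operation.upper()
    if name not in AT_OPERATIONS:
        raise ValueError("not an A64 address translation operation: " + operation)
    stages, level, access = _PATTERN.fullmatch(name).groups()
    level = int(level)
    return {
        "stages": (1, 2) if stages == "S12" else (1,),
        "level": level,
        "access": "read" if access == "R" else "write",
        # True when <level> is higher than the current Exception level.
        "level_above_current_el": level > current_el,
    }
```

For example, `decode_at_operation("S12E0W", 2)` reports a stage 1 followed by stage 2 translation, for EL0, checked as a write.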








In each case, the address being translated is held in the 64-bit address argument register, Xt. If the address translation 
instruction uses a translation regime that is using AArch32, meaning it requires a VA of only 32 bits, then VA[63:32] 
is RES0. 


If the address translation is successful, the resulting PA is returned in PAR_EL1.PA, and PAR_EL1.F is set to 0 to 
indicate that the translation was successful. Otherwise, see Synchronous faults generated by address translation 
instructions on page D4-1772. 
Note 
The architecture provides a single PAR, PAR_EL1, that is used regardless of: 
• The Exception level at which the instruction was executed. 
• The Exception level that controls the stage or stages of translation used by the instruction. 
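
Software that reads PAR_EL1 after an address translation instruction typically tests PAR_EL1.F before using PAR_EL1.PA. The following sketch shows one way to do that on a raw 64-bit register value. The bit positions used here, F in bit[0], PA in bits[47:12], and FST in bits[6:1] when F is 1, are assumptions taken from the PAR_EL1 register description rather than from this section, as is the choice to take the low 12 bits of the result from the supplied VA. All names are hypothetical.

```python
PA_MASK = 0x0000_FFFF_FFFF_F000    # assumed position of PAR_EL1.PA, bits[47:12]


def decode_par_el1(par):
    """Decode a raw PAR_EL1 value into a success or fault record."""
    if par & 1:                               # PAR_EL1.F == 1, translation failed
        return {"ok": False, "fst": (par >> 1) & 0x3F}
    return {"ok": True, "pa_base": par & PA_MASK}


def physical_address(par, va):
    """Combine PAR_EL1.PA with the low 12 bits of the VA, or return None on a fault."""
    result = decode_par_el1(par)
    if not result["ok"]:
        return None
    return result["pa_base"] | (va & 0xFFF)
```

The fault record keeps only the fault status code, because PAR_EL1.PA does not hold an output address when the translation fails.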





For all of these instructions, the current context information determines which entries in TLB caching structures are 
used, and how the translation table walk is performed. However, it is IMPLEMENTATION DEFINED whether the 
address translation instructions return the values held in a TLB or the result of a translation table walk. Therefore, 
ARM recommends that these instructions are not used at a time when the TLB entries might be different from the 
underlying translation tables held in memory. 


When Non-secure EL1&0 stage 1 address translation is disabled, any AT S1E0*, AT S1E1*, AT S12E0*, or 
AT S12E1* address translation instruction that accesses the Non-secure state translation reflects the effect of the 
HCR_EL2.DC bit as described in Behavior when stage 1 address translation is disabled on page D4-1767. 


Executing AT S1E2R or AT S1E2W at EL3 with SCR_EL3.NS==0 is UNDEFINED. 


Note 


AT S12E* instructions at EL3 with SCR_EL3.NS==0 are not UNDEFINED but behave the same way as the equivalent 
AT S1E* instructions. 








Synchronous faults generated by address translation instructions 


The address translation instructions use the translation mechanism, and that mechanism can generate the following 
synchronous faults: 


• Translation fault. 

• Access flag fault. 

• Permission fault. 

• Domain fault, when translating using the AArch32 translation systems. 

• Address size fault. 

• TLB conflict fault. 

• Synchronous external aborts during a translation table walk. 


In addition: 


• If the address translation instruction requires two stages of translation then these faults could arise from either 
stage 1 or stage 2. 


• For a stage 1 translation for the Non-secure EL1&0 translation regime, the fault might be generated on the 
stage 2 translation of an address accessed as part of the stage 1 translation table walk, see Stage 2 fault on a 
stage 1 translation table walk on page D4-1806. 


Except as described in this section, these faults are not taken as an exception for the address translation instructions, 
but instead the PAR_EL1.FST field holds the fault status information. In these cases the PAR_EL1.PA field does 
not hold the output address of the translation. 


The exceptions to reporting the fault in PAR_EL1 are: 


• Synchronous external aborts during a translation table walk are taken as a Data Abort exception. 

  For an address translation instruction executed at a particular Exception level, if the synchronous external 
  abort is generated on a stage 1 translation table walk, the Data Abort exception is taken to the Exception level 
  to which a synchronous external abort on a stage 1 translation table walk for a memory access from that 
  Exception level would be taken. 

  If the synchronous external abort is generated on a stage 2 translation table walk then: 

      If the address translation instruction was executed at EL3, the synchronous Data Abort exception is 
      taken to EL3. 

      If the address translation instruction was executed at EL2 or EL1, the Data Abort exception is taken 
      to the Exception level to which a synchronous external abort on a stage 2 translation table walk for a 
      memory access from that Exception level would be taken. 

  In any case where the address translation instruction causes a synchronous Data Abort exception to be taken: 

      PAR_EL1 is UNKNOWN. 

      The ESR_ELx of the target Exception Level of the exception indicates that the fault was due to a 
      translation table walk for a cache maintenance instruction. 

      The FAR_ELx of the target Exception Level holds the VA for the translation request. 


• For the AT S1E0* and AT S1E1* instructions executed from the Non-secure EL1 Exception level, if there is a 
  synchronous stage 2 fault on a memory access made as part of the translation table walk, then if the value of 
  SCR_EL3.EA is 1, a synchronous external abort on a stage 2 translation table walk is taken to EL3. In 
  all other cases of a synchronous stage 2 fault on a memory access made as part of the translation table walk, 
  the fault is taken as an exception to EL2, and: 

      PAR_EL1 is UNKNOWN. 

      ESR_EL2 indicates that the fault occurred on a translation table walk, and that the operation that 
      faulted was a cache maintenance instruction. 

      HPFAR_EL2 holds the IPA that faulted. 

      FAR_EL2 holds the VA that the executing software supplied to the address translation instruction. 

  This fault can occur for any of the following reasons: 

      Stage 2 Translation fault. 
      Stage 2 Access flag fault. 
      Stage 2 Permission fault. 
      Stage 2 Address size fault. 
      Synchronous external abort on a stage 2 translation table walk. 


Synchronization requirements of the address translation instructions 


Where an instruction results in an update to a System register, as is the case with the AT * address translation 
instructions, explicit synchronization must be performed before the result is guaranteed to be visible to subsequent 
direct reads of the PAR_EL1. 


Note 





This is consistent with the AArch32 requirement, where the VA to PA translation instructions are executed as writes 
to the (coproc==0b1111) System register encoding space, and the effect of those writes to other registers require 
explicit synchronization before the result is guaranteed to be visible to subsequent instructions. 
D4.3 VMSAv8-64 translation table format descriptors 


In general, a descriptor is one of: 


• An invalid or fault entry. 

• A table entry, that points to the next-level translation table. 

• A block entry, that defines the memory properties for the access. 

• A reserved format. 


Bit[1] of the descriptor indicates the descriptor type, and bit[0] indicates whether the descriptor is valid. 


The following sections describe the ARMv8 translation table descriptor formats: 
• VMSAv8-64 translation table level 0, level 1, and level 2 descriptor formats. 
• ARMv8 translation table level 3 descriptor formats on page D4-1777. 


Memory attribute fields in the VMSAv8-64 translation table format descriptors on page D4-1778 then gives more 
information about the descriptor attribute fields, and Control of Secure or Non-secure memory access on 
page D4-1782 describes how the NS and NSTable bits together control whether a memory access from Secure state 
accesses the Secure memory map or the Non-secure memory map. 


D4.3.1 VMSAv8-64 translation table level 0, level 1, and level 2 descriptor formats 


In the VMSAv8-64 translation table format, the difference in the formats of the level 0, level 1 and level 2 
descriptors is: 


• Whether a block entry is permitted. 
• If a block entry is permitted, the size of the memory region described by that entry. 


These differences depend on the translation granule, as follows: 


4KB granule     A level 0 descriptor does not support block translation. 

                A block entry: 
                • In a level 1 table describes the mapping of the associated 1GB input address range. 
                • In a level 2 table describes the mapping of the associated 2MB input address range. 

16KB granule    Level 0 and level 1 descriptors do not support block translation. 

                A block entry in a level 2 table describes the mapping of the associated 32MB input address range. 

64KB granule    Level 0 lookup is not supported. 

                A level 1 descriptor does not support block translation. 

                A block entry in a level 2 table describes the mapping of the associated 512MB input address range. 
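
The block sizes listed above follow from the granule size and the number of descriptors that fit in one translation table. The following sketch reproduces them, assuming that each descriptor is 8 bytes and that each table occupies exactly one granule. Neither assumption is stated in this subsection, and the names used are hypothetical.

```python
# Levels at which a block entry is permitted, per granule size in KB.
PERMITTED_BLOCK_LEVELS = {4: (1, 2), 16: (2,), 64: (2,)}


def block_size(granule_kb, level):
    """Return the size in bytes of the input address range mapped by a block entry."""
    if level not in PERMITTED_BLOCK_LEVELS[granule_kb]:
        raise ValueError("block entries are not supported at this level")
    granule = granule_kb * 1024
    entries_per_table = granule // 8      # assumes 8-byte descriptors
    # Each level above level 3 multiplies the mapped range by the table size.
    return granule * entries_per_table ** (3 - level)
```

For example, with the 16KB granule a level 2 block maps 16KB x 2048 = 32MB.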


Figure D4-15 on page D4-1775 shows the ARMv8 level 0, level 1, and level 2 descriptor formats: 
Invalid 
    Bits[63:1]      IGNORED 
    Bit[0]          0 

Block 
    Bits[63:52]     Upper block attributes 
    Bits[51:48]     RES0 
    Bits[47:n]      Output address[47:n] 
    Bits[(n-1):12]  RES0 
    Bits[11:2]      Lower block attributes 
    Bit[1]          0 
    Bit[0]          1 

    With the 4KB granule size, for the level 1 descriptor n is 30, and for the level 2 descriptor, n is 21. 
    With the 16KB granule size, for the level 2 descriptor, n is 25. 
    With the 64KB granule size, for the level 2 descriptor, n is 29. 

Table 
    Bit[63]         NSTable     Stage 1 only, RES0 at stage 2 
    Bits[62:61]     APTable     Stage 1 only, RES0 at stage 2 
    Bit[60]         XNTable     Stage 1 only, RES0 at stage 2 
    Bit[59]         PXNTable    Stage 1 only, RES0 at stage 2 
    Bits[58:52]     IGNORED 
    Bits[51:48]     RES0 
    Bits[47:m]      Next-level table address[47:m]‡ 
    Bits[11:2]      IGNORED 
    Bit[1]          1 
    Bit[0]          1 

    With the 4KB granule size m is 12, with the 16KB granule size m is 14‡, and with the 64KB granule size, m is 16‡. 

    A level 0 Table descriptor returns the address of the level 1 table. 
    A level 1 Table descriptor returns the address of the level 2 table. 
    A level 2 Table descriptor returns the address of the level 3 table. 

    ‡ When m > 12, bits[(m-1):12] are RES0. 


Figure D4-15 VMSAv8-64 level 0, level 1, and level 2 descriptor formats 


Descriptor encodings, ARMv8 level 0, level 1, and level 2 formats


Descriptor bit[0] identifies whether the descriptor is valid, and is 1 for a valid descriptor. If a lookup returns an 
invalid descriptor, the associated input address is unmapped, and any attempt to access it generates a Translation 
fault. 


Descriptor bit[1] identifies the descriptor type, and is encoded as: 


0, Block The descriptor gives the base address of a block of memory, and the attributes for that memory
region.
1, Table The descriptor gives the address of the next level of translation table, and for a stage 1 translation,
some attributes for that translation.
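The bit[0] and bit[1] encodings above can be sketched as a small classifier for level 0, level 1, and level 2 descriptors. This is an illustration only: `classify_descriptor` is a hypothetical name, and the function looks only at bits[1:0]. It does not check whether a block entry is permitted at the lookup level in question.

```python
# Illustrative sketch only; the function name is hypothetical.
def classify_descriptor(descriptor):
    """Classify a 64-bit level 0, 1, or 2 descriptor from bits[1:0]."""
    if (descriptor & 0b1) == 0:
        # bit[0] == 0: invalid, whatever the value of bit[1]
        return "invalid"
    # bit[0] == 1: bit[1] selects between Block (0) and Table (1)
    return "table" if (descriptor >> 1) & 0b1 else "block"
```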
The other fields in the valid descriptors are: 


Block descriptor 


Gives the base address and attributes of a block of memory, as follows: 
4KB translation granule 


° For a level 1 Block descriptor, bits[47:30] are bits[47:30] of the output address.
This output address specifies a 1GB block of memory.

° For a level 2 Block descriptor, bits[47:21] are bits[47:21] of the output address.
This output address specifies a 2MB block of memory.

16KB translation granule
For a level 2 Block descriptor, bits[47:25] are bits[47:25] of the output address. This
output address specifies a 32MB block of memory.

64KB translation granule
For a level 2 Block descriptor, bits[47:29] are bits[47:29] of the output address. This
output address specifies a 512MB block of memory.


Bits[63:52, 11:2] provide attributes for the target memory block, see Memory attribute fields in the
VMSAv8-64 translation table format descriptors on page D4-1778. The position and contents of
these bits are identical in the level 2 block descriptor and in the level 3 page descriptor.

Table descriptor
Gives the translation table address for the next-level lookup, as follows:

4KB translation granule
° Bits[47:12] are bits[47:12] of the address of the required next-level table. A level 0 Table
descriptor gives the address of a level 1 table, a level 1 Table descriptor gives the address of a
level 2 table, and a level 2 Table descriptor gives the address of a level 3 table.
° Bits[11:0] of the table address are zero.

16KB translation granule
° Bits[47:14] are bits[47:14] of the address of the required next-level table. A level 0 Table
descriptor gives the address of a level 1 table, a level 1 Table descriptor gives the address of a
level 2 table, and a level 2 Table descriptor gives the address of a level 3 table.
° Bits[13:0] of the table address are zero.

64KB translation granule
° Bits[47:16] are bits[47:16] of the address of the required next-level table. A level 1 Table
descriptor gives the address of a level 2 table, and a level 2 Table descriptor gives the address of a
level 3 table.
° Bits[15:0] of the table address are zero.

For a stage 1 translation only, bits[63:59] provide attributes for the next-level lookup, see Memory
attribute fields in the VMSAv8-64 translation table format descriptors on page D4-1778.
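As a minimal sketch of the address fields described above, the following helpers extract the output address from a Block descriptor and the next-level table address from a Table descriptor. The helper names are hypothetical. The caller supplies the block address low bit (30 or 21 for the 4KB granule, 25 for 16KB, 29 for 64KB), and the sketch assumes the descriptor type has already been established from bits[1:0].

```python
# Illustrative sketch only; helper names are hypothetical.
def bits(value, high, low):
    """Return value[high:low] as an unsigned integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


# Lowest address bit held in a Table descriptor, per translation granule.
TABLE_ADDRESS_LOW_BIT = {"4KB": 12, "16KB": 14, "64KB": 16}


def next_level_table_address(descriptor, granule):
    """Address of the next-level table; the low bits of the address are zero."""
    low = TABLE_ADDRESS_LOW_BIT[granule]
    return bits(descriptor, 47, low) << low


def block_output_address(descriptor, low):
    """Base address of the block, where low is 30, 21, 25, or 29."""
    return bits(descriptor, 47, low) << low
```

Masking to bits[47:low] drops the upper and lower attribute fields, so the result is the aligned base address only.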


If the translation table defines the Non-secure EL1&0 stage 1 translations, then the output address in the descriptor
is the IPA of the target block or table. Otherwise, it is the PA of the target block or table.

D4.3.2 ARMv8 translation table level 3 descriptor formats


For the 4KB granule size, each entry in a level 3 table describes the mapping of the associated 4KB input address 
range. 


For the 16KB granule size, each entry in a level 3 table describes the mapping of the associated 16KB input address 
range. 


For the 64KB granule size, each entry in a level 3 table describes the mapping of the associated 64KB input address 
range. 


Figure D4-16 shows the ARMv8 level 3 descriptor formats.

Invalid
    bits[63:1]     IGNORED
    bit[0]         0

Reserved
    bits[63:2]     RES0
    bit[1]         0
    bit[0]         1

Page, 4KB
    bits[63:52]    Upper attributes †
    bits[51:48]    RES0
    bits[47:12]    Output address[47:12]
    bits[11:2]     Lower attributes †
    bit[1]         1
    bit[0]         1

Page, 16KB
    bits[63:52]    Upper attributes †
    bits[51:48]    RES0
    bits[47:14]    Output address[47:14]
    bits[13:12]    RES0
    bits[11:2]     Lower attributes †
    bit[1]         1
    bit[0]         1

Page, 64KB
    bits[63:52]    Upper attributes †
    bits[51:48]    RES0
    bits[47:16]    Output address[47:16]
    bits[15:12]    RES0
    bits[11:2]     Lower attributes †
    bit[1]         1
    bit[0]         1

† Upper page attributes and Lower page attributes

Figure D4-16 VMSAv8-64 level 3 descriptor format


Descriptor bit[0] identifies whether the descriptor is valid, and is 1 for a valid descriptor. If a lookup returns an 
invalid descriptor, the associated input address is unmapped, and any attempt to access it generates a Translation 
fault. 


Descriptor bit[1] identifies the descriptor type, and is encoded as: 


0, Reserved, invalid 
Behaves identically to encodings with bit[0] set to 0. 


This encoding must not be used in level 3 translation tables. 
1, Page Gives the address and attributes of a 4KB, 16KB, or 64KB page of memory. 
At this level, the only valid format is the Page descriptor. The other fields in the Page descriptor are: 


Page descriptor 
Gives the output address of a page of memory, as follows: 
4KB translation granule 
Bits[47:12] are bits[47:12] of the output address for a page of memory. 
16KB translation granule 
Bits[47:14] are bits[47:14] of the output address for a page of memory. 
64KB translation granule 
Bits[47:16] are bits[47:16] of the output address for a page of memory. 


Bits[63:52, 11:2] provide attributes for the target memory page, see Memory attribute fields in the
VMSAv8-64 translation table format descriptors on page D4-1778.

Note

The position and contents of bits[63:52, 11:2] are identical to bits[63:52, 11:2] in the level 0, level
1, and level 2 block descriptors.





For the Non-secure EL1&0 stage 1 translations, the output address in the descriptor is the IPA of the target page. 
Otherwise, it is the PA of the target page. 
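The level 3 encoding can be sketched in the same way. In the following illustration, `decode_level3` is a hypothetical helper that returns the output address of the page, or `None` when the descriptor is invalid or uses the reserved encoding, which behaves as invalid.

```python
# Illustrative sketch only; names are hypothetical.
# Lowest output-address bit held in a Page descriptor, per granule.
PAGE_ADDRESS_LOW_BIT = {"4KB": 12, "16KB": 14, "64KB": 16}


def decode_level3(descriptor, granule):
    """Return the page output address, or None if the entry is not a Page."""
    if (descriptor & 0b11) != 0b11:
        # bit[0] == 0 is invalid; bits[1:0] == 0b01 is reserved and
        # behaves identically to an invalid descriptor.
        return None
    low = PAGE_ADDRESS_LOW_BIT[granule]
    mask = ((1 << 48) - 1) & ~((1 << low) - 1)  # keep bits[47:low]
    return descriptor & mask
```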


D4.3.3 Memory attribute fields in the VMSAv8-64 translation table format descriptors 


Memory region attributes on page D4-1792 describes the region attribute fields. The following subsections 
summarize the descriptor attributes as follows: 


Table descriptor 


Table descriptors for stage 2 translations do not include any attribute field. For a summary of the 
attribute fields in a stage 1 table descriptor, that define the attributes for the next lookup level, see 
Next-level attributes in stage 1 VMSAv8-64 Table descriptors. 


Block and page descriptors 


These descriptors define memory attributes for the target block or page of memory. Stage 1 and 
stage 2 translations have some differences in these attributes, see: 


° Attribute fields in stage 1 VMSAv8-64 Block and Page descriptors on page D4-1779 
° Attribute fields in stage 2 VMSAv8-64 Block and Page descriptors on page D4-1781. 


Next-level attributes in stage 1 VMSAv8-64 Table descriptors 


In a Table descriptor for a stage 1 translation, bits[63:59] of the descriptor define the attributes for the next-level 
translation table access, and bits[58:52] are IGNORED: 


Next-level descriptor attributes, stage 1 only

    bit[63]        NSTable
    bits[62:61]    APTable
    bit[60]        UXNTable or XNTable †
    bit[59]        PXNTable ‡
    bits[58:52]    IGNORED

† UXNTable for a translation regime that supports two VA ranges, XNTable for other regimes.
‡ Regimes that support two VA ranges only, RES0 in the other regimes.


These attributes are: 


NSTable, bit[63] 


For memory accesses from Secure state, specifies the Security state for subsequent levels of lookup, 
see Hierarchical control of Secure or Non-secure memory accesses on page D4-1782. 


For memory accesses from Non-secure state, including all accesses in the EL2 translation regime,
this bit is RES0 and is ignored by the PE.


APTable, bits[62:61] 


Access permissions limit for subsequent levels of lookup, see Hierarchical control of data access 
permissions on page D4-1785. 


APTable[0] is RES0:

° In the EL2 translation regime.
° In the EL3 translation regime.

UXNTable or XNTable, bit[60]


XN limit for subsequent levels of lookup, see Hierarchical control of instruction fetching on 
page D4-1789. 


The naming of this field depends on whether the stage 1 translation supports two VA ranges: 


Two VA ranges supported 


This field is UXNTable, and determines whether execution at EL0 of instructions
fetched from the region identified at a lower level of lookup is permitted.


— Note 


PXNTable is the equivalent control of execution at a higher Exception level. 





One VA range supported 
This field is XNTable. 
PXNTable, bit[59] 


PXN limit for subsequent levels of lookup, see Hierarchical control of instruction fetching on 
page D4-1789. 


This field is valid only for a stage 1 translation that supports two VA ranges. It is RES0 for stage 1
translations that support only one VA range.


The definition of IGNORED means the architecture guarantees that the PE makes no use of the field, see IGNORED
on page Glossary-5718. For more information about these fields see Other fields in the VMSAv8-64 translation table
format descriptors on page D4-1795.
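The next-level attribute fields of a stage 1 Table descriptor can be unpacked as sketched below. The function name and the `two_va_ranges` argument are hypothetical. The sketch only names bit[60] according to the regime and omits PXNTable where it is RES0. It does not model the regimes in which APTable[0] is RES0.

```python
# Illustrative sketch only; names are hypothetical.
def decode_table_attributes(descriptor, two_va_ranges):
    """Unpack bits[63:59] of a stage 1 Table descriptor."""
    attrs = {
        "NSTable": (descriptor >> 63) & 0b1,
        "APTable": (descriptor >> 61) & 0b11,
    }
    if two_va_ranges:
        attrs["UXNTable"] = (descriptor >> 60) & 0b1
        attrs["PXNTable"] = (descriptor >> 59) & 0b1
    else:
        # bit[59] is RES0 when only one VA range is supported
        attrs["XNTable"] = (descriptor >> 60) & 0b1
    return attrs
```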


Attribute fields in stage 1 VMSAv8-64 Block and Page descriptors 


In Block and Page descriptors, the memory attributes are split into an upper block and a lower block, as shown for
a stage 1 translation:


Attribute fields for VMSAv8-64 stage 1 Block and Page descriptors

Upper attributes
    bits[63:59]    IGNORED
    bits[58:55]    Reserved for software use
    bit[54]        UXN or XN †
    bit[53]        PXN ‡
    bit[52]        Contiguous

Lower attributes
    bit[11]        nG ‡
    bit[10]        AF
    bits[9:8]      SH[1:0]
    bits[7:6]      AP[2:1]
    bit[5]         NS
    bits[4:2]      AttrIndx[2:0]

† UXN for a translation regime that supports two VA ranges, XN for the other regimes.
‡ Regimes that support two VA ranges only, RES0 in the other regimes.


For a stage 1 descriptor, the attributes are: 


UXN or XN, bit[54] 


The Execute-never bit. Determines whether the region is executable, see Access permissions for 
instruction execution on page D4-1786. 


The naming of this field depends on whether the stage 1 translation supports two VA ranges: 


Two VA ranges supported 


This field is UXN (Unprivileged execute never), and determines whether execution at
EL0 of instructions fetched from the region is permitted.

Note

PXN is the equivalent control of execution at a higher Exception level.

One VA range supported
This field is XN (Execute never).


PXN, bit[53] The Privileged execute-never bit. Determines whether the region is executable at EL1, see Access 
permissions for instruction execution on page D4-1786. 


This field is valid only for a stage 1 translation that supports two VA ranges. It is RES0 for stage 1
translations that support only one VA range.


Contiguous, bit[52] 
A hint bit indicating that the translation table entry is one of a contiguous set of entries, that might
be cached in a single TLB entry, see The Contiguous bit on page D4-1796.


nG, bit[11] The not global bit. If a lookup using this descriptor is cached in a TLB, determines whether the TLB 
entry applies to all ASID values, or only to the current ASID value. See Global and process-specific 
translation table entries on page D4-1812. 


This field is valid only for a stage 1 translation that supports two VA ranges. It is RES0 for stage 1
translations that support only one VA range.


AF, bit[10] The Access flag, see The Access flag on page D4-1791.
SH, bits[9:8] Shareability field, see Memory region attributes on page D4-1792.
AP[2:1], bits[7:6]
Data Access Permissions bits, see Memory access control on page D4-1783.


— Note 


The ARMv8 translation table descriptor format defines AP[2:1] as the Access Permissions bits, and
does not define an AP[0] bit.

AP[1] is valid only for a stage 1 translation that supports two VA ranges. It is RES0 for stage 1
translations that support only one VA range.


NS, bit[5] Non-secure bit. For memory accesses from Secure state, specifies whether the output address is in 
the Secure or Non-secure address map, see Control of Secure or Non-secure memory access on 
page D4-1782. 


For memory accesses from Non-secure state, including all accesses in the EL2 translation regime,
this bit is RES0 and is ignored by the PE.


AttrIndx[2:0], bits[4:2] 
Stage 1 memory attributes index field, for the MAIR_ELx, see Stage 1 memory region type and
Cacheability attributes on page D4-1792.
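The stage 1 Block and Page attribute fields listed above can be unpacked from a descriptor as sketched below. The function and key names are hypothetical. Bit[54] is reported under a single neutral key because its name, UXN or XN, depends on the translation regime, and the sketch does not model which fields are RES0 in regimes that support only one VA range.

```python
# Illustrative sketch only; names are hypothetical.
def decode_stage1_attributes(descriptor):
    """Unpack the stage 1 upper and lower attribute fields."""
    return {
        "UXN_or_XN": (descriptor >> 54) & 0b1,
        "PXN": (descriptor >> 53) & 0b1,
        "Contiguous": (descriptor >> 52) & 0b1,
        "nG": (descriptor >> 11) & 0b1,
        "AF": (descriptor >> 10) & 0b1,
        "SH": (descriptor >> 8) & 0b11,
        "AP": (descriptor >> 6) & 0b11,       # AP[2:1]
        "NS": (descriptor >> 5) & 0b1,
        "AttrIndx": (descriptor >> 2) & 0b111,
    }
```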


The definition of IGNORED means the architecture guarantees that the PE makes no use of the field, see IGNORED
on page Glossary-5718. For more information about these fields see Other fields in the VMSAv8-64 translation table
format descriptors on page D4-1795.

Attribute fields in stage 2 VMSAv8-64 Block and Page descriptors

In Block and Page descriptors, the memory attributes are split into an upper block and a lower block, as shown for
a stage 2 translation:


Attribute fields for VMSAv8-64 stage 2 Block and Page descriptors

Upper attributes
    bits[63:60]    Reserved for use by a System MMU
    bit[59]        IGNORED
    bits[58:55]    Reserved for software use
    bit[54]        XN
    bit[53]        RES0
    bit[52]        Contiguous

Lower attributes
    bit[10]        AF
    bits[9:8]      SH[1:0]
    bits[7:6]      S2AP[1:0]
    bits[5:2]      MemAttr[3:0]

For a stage 2 descriptor, the attributes are: 


XN, bit[54] The Execute-never bit. Determines whether the region is executable, see Access permissions for
instruction execution on page D4-1786.


Contiguous, bit[52] 
A hint bit indicating that the translation table entry is one of a contiguous set of entries, that might
be cached in a single TLB entry, see The Contiguous bit on page D4-1796.


AF, bit[10] The Access flag, see The Access flag on page D4-1791.


SH, bits[9:8] Shareability field, see The stage 2 memory region attributes, EL1&0 translation regime on 
page D4-1794. 


S2AP, bits[7:6] 
Stage 2 data Access Permissions bits, see The S2AP data access permissions, Non-secure EL1&0
translation regime on page D4-1785.


—— Note 

In the original VMSAv7-32 Long-descriptor attribute definition, this field was called HAP[2:1], for
consistency with the AP[2:1] field in the stage 1 descriptors and despite there being no HAP[0] bit.
ARMv8 renames the field for greater clarity.





MemAttr, bits[5:2] 
Stage 2 memory attributes, see The stage 2 memory region attributes, EL1&0 translation regime on
page D4-1794.


The definition of IGNORED means the architecture guarantees that the PE makes no use of the field, see IGNORED
on page Glossary-5718. For more information about these fields see Other fields in the VMSAv8-64 translation table
format descriptors on page D4-1795.

D4.3.4 Control of Secure or Non-secure memory access


As this section describes, the NS bit in the translation table entries: 


° For accesses from Secure state, if the translation table entry was held in secure memory, determines whether 
the access is to Secure or Non-secure memory. 


° Is ignored by: 
— Accesses from Non-secure state. 


— Accesses from Secure state if the translation table entry was held in Non-secure memory. 
In the VMSAv8-64 translation table format: 
° The NS bit relates only to the memory block or page at the output address defined by the descriptor. 


° The descriptors also include an NSTable bit, that affects accesses at lower levels of lookup, see Hierarchical 
control of Secure or Non-secure memory accesses. 


The NS and NSTable bits are valid only for memory accesses from Secure state described by translation table 
descriptors that are fetched from Secure memory, and: 
° In the translation table descriptors in a Non-secure translation table, the NS and NSTable bits are SBZ. 


° Memory accesses from Non-secure state, including all accesses from EL2, ignore the values of these bits.


In the Secure translation regimes, for translation table descriptors that are fetched from Secure memory, the NS bit 
in a descriptor indicates whether the descriptor refers to the Secure or the Non-secure address map, as follows: 


NS == 0 Access the Secure PA space.
NS == 1 Access the Non-secure PA space.


For Non-secure translation regimes, and for translation table descriptors fetched from Non-secure memory, the
corresponding bit is RES0 and is ignored by the PE. The access is made to Non-secure memory, regardless of the
value of the bit.


Hierarchical control of Secure or Non-secure memory accesses 


For VMSAv8-64 table descriptors for stage 1 translations, the descriptor includes an NSTable bit, that indicates 
whether the table identified in the descriptor is in Secure or Non-secure memory. For accesses from Secure state, 
the meaning of the NSTable bit is: 


NSTable == 0 The defined table address is in the Secure PA space. In the descriptors in that translation table, NS 
bits and NSTable bits have their defined meanings. 


NSTable == 1 The defined table address is in the Non-secure PA space. Because this table is fetched from the 
Non-secure address space, the NS and NSTable bits in the descriptors in this table must be ignored. 
This means that, for this table: 


° The value of the NS bit in any block or page descriptor is ignored. The block or page address 
refers to Non-secure memory. 


° The value of the NSTable bit in any table descriptor is ignored, and the table address refers
to Non-secure memory. When this table is accessed, the NS bit in any block or page
descriptor is ignored, and all descriptors in the table refer to Non-secure memory.


In addition, an entry fetched in Secure state is treated as non-global if it is read from Non-secure memory. That is, 
these entries must be treated as if nG==1, regardless of the value of the nG bit. For more information about the nG 
bit, see Global and process-specific translation table entries on page D4-1812. 
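The hierarchical behavior of NSTable and NS during a stage 1 walk can be sketched as follows. This is an illustration under stated assumptions, not a definitive model: the function name and its arguments are hypothetical, and the sketch assumes that the first table of the walk is fetched from Secure memory. `nstable_values` holds the NSTable bit of each Table descriptor encountered, in walk order, and `ns` is the NS bit of the final block or page descriptor.

```python
# Illustrative sketch only; names are hypothetical.
# Assumes the first table of the walk is held in Secure memory.
def resolve_security(secure_access, nstable_values, ns):
    """Return (output_is_non_secure, treat_as_non_global) for one walk."""
    if not secure_access:
        # Accesses from Non-secure state ignore NS and NSTable.
        return (True, False)
    # Once any Table descriptor has NSTable == 1, every later table is
    # fetched from Non-secure memory and its NS and NSTable bits are ignored.
    from_non_secure_table = any(value == 1 for value in nstable_values)
    output_is_non_secure = from_non_secure_table or ns == 1
    # An entry read from Non-secure memory in Secure state is treated as nG == 1.
    return (output_is_non_secure, from_non_secure_table)
```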


The effect of NSTable applies to later entries in the translation table walk, and so its effects can be held in one or
more TLB entries. Therefore, a change to NSTable requires coarse-grained invalidation of the TLB to ensure that
the effect of the change is visible to subsequent memory transactions.

D4.4 Memory access control


The access control fields in the translation table descriptors determine whether the PE, in its current state, is 
permitted to perform the required access to the output address given in the translation table descriptor. If a 
translation stage does not permit the access then an MMU fault is generated for that translation stage, and no 
memory access is performed. 


The following sections describe the memory access controls: 


° About access permissions. 
° The data access permission controls on page D4-1784. 
° Access permissions for instruction execution on page D4-1786. 


° The Access flag on page D4-1791.


Note 


This section describes the access controls for each of the translation regimes, and for each stage of translation in the 
Non-secure EL1&0 translation regime. 





A translation applies to memory accesses from either: 
° Only a single Exception level, for example the EL3 translation regime. 


° EL0 and one higher Exception level, for example the EL1&0 translation regime.





In addition to an output address, a translation table entry that refers to a page or region of memory includes fields 
that define properties of the target memory region. These fields can be classified as address map control, access 
control, and region attribute fields. Control of Secure or Non-secure memory access on page D4-1782 describes the 
address map control, and Memory region attributes on page D4-1792 describes the other fields. 


D4.4.1 About access permissions 


Note 


This section gives a general description of memory access permissions. In an implementation that includes EL2, 
software executing at EL1 in Non-secure state can see only the access permissions defined by the Non-secure 
EL1&0 stage 1 translations. However, software executing at EL2 can modify these permissions. This modification 
is invisible to the Non-secure software executing at EL1 or EL0.








The access permission bits control access to the corresponding memory region. The VMSAv8-64 translation table 
format: 


° In stage 1 translations, uses AP[2:1] to define the data access permissions, see The AP[2:1] data access
permissions, for stage 1 translations on page D4-1784.


Note 


The description of the access permission field as AP[2:1] is for consistency with the VMSAv8-32 
Short-descriptor translation table format, see The VMSAv8-32 Short-descriptor translation table format on 
page G4-4040. The VMSAv8-64 translation table format does not define an AP[0] bit. 








° In stage 2 translations, uses S2AP[1:0] to define the data access permissions, see The S2AP data access 
permissions, Non-secure EL1&0 translation regime on page D4-1785. 


° Uses the UXN, XN and PXN bits to define access controls for instruction fetches, see Access permissions for 
instruction execution on page D4-1786. 


An attempt to perform a memory access that the translation table access permission bits do not permit generates a
Permission fault, for the corresponding stage of translation.

Note

In an implementation that includes EL2, each stage of the translation of a memory access made using the Non-secure
EL1&0 translation regime has its own, independent, permission check.

D4.4.2 The data access permission controls

The following subsections describe the data access permission controls:

° The AP[2:1] data access permissions, for stage 1 translations.
° The S2AP data access permissions, Non-secure EL1&0 translation regime on page D4-1785.
° Hierarchical control of data access permissions on page D4-1785.

The AP[2:1] data access permissions, for stage 1 translations

In VMSAv8-64, for a translation regime that applies to both EL0 and a higher Exception level, the AP[2:1] bits
control the stage 1 data access permissions, and:

AP[2] Selects between read-only and read/write access.
AP[1] Selects between Application level (EL0) control and the higher Exception level control.

This provides four permission settings for data accesses:

° Read-only at all levels.
° Read/write at all levels.
° Read-only at the higher Exception level, no access by software executing at EL0.
° Read/write at the higher Exception level, no access by software executing at EL0.

Note

In ARMv8.0, the only translation regime that applies to EL0 and a higher Exception level is the EL1&0 translation
regime.

For translation regimes that apply only to accesses from a single Exception level, AP[2] determines the stage 1 data
access permissions, and AP[1] is RES1, meaning it is ignored by hardware and is treated as if it is 1.

Table D4-29 shows the meaning of the AP[2:1] field for stage 1 of a translation regime that applies to both EL0 and
a higher Exception level. In this table, an entry of None indicates that any access from that Exception level faults.

Table D4-29 Data access permissions for stage 1 translations that apply to EL0 and a higher
Exception level

AP[2:1]   Access from higher Exception level   Access from EL0
00        Read/write                           None
01        Read/write                           Read/write
10        Read-only                            None
11        Read-only                            Read-only
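
The two-bit encoding in Table D4-29 can be sketched as a small function that returns the read and write permissions for an access. The function name and arguments are hypothetical, and the sketch covers only stage 1 of a translation regime that applies to EL0 and a higher Exception level.

```python
# Illustrative sketch only; names are hypothetical.
def stage1_data_permissions(ap, from_el0):
    """Return (read, write) for the two-bit AP[2:1] value ap."""
    ap2 = (ap >> 1) & 0b1  # AP[2]: 1 selects read-only
    ap1 = ap & 0b1         # AP[1]: 1 permits access from EL0
    if from_el0 and ap1 == 0:
        return (False, False)  # "None" in Table D4-29
    return (True, ap2 == 0)
```
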
For the Non-secure EL1&0 translation regime:

° The stage 2 translation also defines data access permissions, see The S2AP data access permissions,
Non-secure EL1&0 translation regime on page D4-1785.
° When both stages of translation are enabled, Combining the stage 1 and stage 2 data access permissions on
page D4-1797 describes how these permissions are combined.

Table D4-30 shows the effect of the AP[2] field for stage 1 of a translation regime that applies to only a single
Exception level.


Table D4-30 Data access permissions for stage 1 translations that apply to only a single
Exception level

AP[2]   Access permission
0       Read/write
1       Read-only





The S2AP data access permissions, Non-secure EL1&0 translation regime 


In the Non-secure EL1&0 translation regime, when stage 2 address translation is enabled, the S2AP field in the
stage 2 translation table descriptors defines the data access permissions as Table D4-31 shows. In this table, an entry
of None indicates that any access generates a Permission fault.


Table D4-31 Data access permissions for stage 2 of the Non-secure EL1&0 translation regime

S2AP   Access from Non-secure EL1 or Non-secure EL0
00     None
01     Read-only
10     Write-only
11     Read/write
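
Read this way, Table D4-31 treats S2AP[0] as a read enable and S2AP[1] as a write enable. A minimal sketch of that reading follows, with a hypothetical function name:

```python
# Illustrative sketch only; the function name is hypothetical.
def stage2_data_permissions(s2ap):
    """Return (read, write) for the two-bit S2AP value."""
    return (bool(s2ap & 0b01), bool(s2ap & 0b10))
```
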





The S2AP access permissions make no distinction between Non-secure accesses from EL1 and Non-secure accesses
from EL0. However, when both stages of address translation are enabled, these permissions are combined with the
stage 1 access permissions defined by AP[2:1], see Combining the stage 1 and stage 2 data access permissions on
page D4-1797.


Combining the stage 1 and stage 2 attributes, Non-secure EL1&0 translation regime on page D4-1797 gives more
information about the use of the stage 1 and stage 2 access permissions in an implementation of virtualization.


Hierarchical control of data access permissions 


The VMSAv8-64 translation table format includes mechanisms by which entries at one level of translation table 
lookup can set limits on the permitted entries at subsequent levels of lookup. This subsection describes how these 
controls apply to the data access permissions. 


Note 


Similar hierarchical controls apply to instruction fetching, see Hierarchical control of instruction fetching on 
page D4-1789. 








The restrictions apply only to subsequent levels of lookup for the same stage of translation. The APTable[1:0] field 
restricts the access permissions, as Table D4-32 shows. 


As stated in the table footnote, for a translation regime that applies to only a single Exception level, APTable[0] is
RES0, meaning it is ignored by the hardware.


Table D4-32 Effect of APTable[1:0] on subsequent levels of lookup

APTable[1:0]   Effect
00             No effect on permissions in subsequent levels of lookup.
01 (a)         Access at EL0 not permitted, regardless of permissions in subsequent levels of lookup.
10             Write access not permitted, at any Exception level, regardless of permissions in subsequent
               levels of lookup.
11 (a)         Regardless of permissions in subsequent levels of lookup:
               ° Write access not permitted, at any Exception level.
               ° Read access not permitted at EL0.

a. Not valid for any translation regime that applies to only a single Exception level. In the translation tables for
such a regime, APTable[0] is RES0.





Note 


The APTable[1:0] settings are combined with the access permissions in the translation table
descriptors accessed in subsequent levels of lookup. They do not restrict or change the values entered in those
descriptors.





VMSAv8-64 provides APTable[1:0] control only for stage 1 translations. The corresponding bits are RES0 in
the stage 2 translation table descriptors.


The effect of APTable applies to later entries in the translation table walk, and so its effects can be held in one or 
more TLB entries. Therefore, a change to APTable requires coarse-grained invalidation of the TLB to ensure that 
the effect of the change is visible to subsequent memory transactions. 
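
Table D4-32 can be read as two independent limits: APTable[1] removes write permission at every Exception level, and APTable[0] removes all access from EL0. The following sketch applies one APTable value to permissions already derived from a later descriptor. The function name and the `(read, write)` tuple are hypothetical, and the sketch assumes a translation regime that applies to EL0 and a higher Exception level.

```python
# Illustrative sketch only; names are hypothetical.
def apply_aptable(aptable, permissions, from_el0):
    """Limit (read, write) permissions by a two-bit APTable[1:0] value."""
    read, write = permissions
    if aptable & 0b10:
        write = False           # no write access at any Exception level
    if (aptable & 0b01) and from_el0:
        read = write = False    # no access at EL0
    return (read, write)
```

Applying the function once for each Table descriptor in the walk accumulates the limits, because each step can only remove permissions.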


D4.4.3 Access permissions for instruction execution 
Execute-never controls determine whether instructions can be executed from a memory region. These controls are: 


UXN, Unprivileged execute never
Descriptor bit[54], defined as UXN only for stage 1 of any translation regime that applies to EL0
and a higher Exception level.

PXN, Privileged execute never
Descriptor bit[53], used only for stage 1 of any translation regime that applies to EL0 and a
higher Exception level:

° For stage 1 of a translation regime that applies to only a single Exception level the descriptors
define a PXN bit that is RES0, meaning it is ignored by hardware.

° For stage 2 of the Non-secure EL1&0 translation regime, descriptor bit[53] is RES0, meaning
it is ignored by hardware.

XN, Execute never
Descriptor bit[54], defined as XN for:

° Stage 1 of any translation regime that applies to only a single Exception level.
° Stage 2 of the EL1&0 translation regime.

Note

In ARMv8.0, the only translation regime that applies to EL0 and a higher Exception level is the EL1&0 translation
regime.

Each of these fields is set to 1 to indicate that instructions cannot be executed from the target memory region. In
addition:

° For a translation regime that applies to EL0 and a higher Exception level, if the value of the AP[2:1] bits is
0b01, permitting write access from EL0, then the PXN bit is treated as if it has the value 1, regardless of its
actual value.

° In the Non-secure EL1&0 translation regime, a region is execute-never if the value of the applicable
execute-never field is 1 in one or both of the stage 1 translation table descriptor and the stage 2
translation table descriptor.

° For each translation regime, if the value of the corresponding SCTLR_ELx.WXN bit is 1 then any memory
region that is writable is treated as XN, regardless of the value of the corresponding UXN, XN, or PXN bit.
For more information see Preventing execution from writable locations on page D4-1790.

° The SCR_EL3.SIF bit prevents execution in Secure state of any instruction fetched from Non-secure
memory, see Restriction on Secure instruction fetch on page D4-1791.


The execute-never controls apply to speculative instruction fetching, meaning speculative instruction fetch from a 
memory region that is execute-never at the current Exception level is prohibited. 
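
For stage 1 of a translation regime that applies to EL0 and a higher Exception level, the rules above can be sketched as a single function. This is an illustration under stated assumptions: the function name and arguments are hypothetical, stage 2 and SCR_EL3.SIF are not modeled, and the WXN rule is read as applying at the Exception level at which the region is writable, which is how Table D4-33 presents it.

```python
# Illustrative sketch only; names are hypothetical.
def stage1_executable(uxn, pxn, ap, wxn, from_el0):
    """Return True if execution is permitted; ap is the two-bit AP[2:1] value."""
    writable_at_elx = (ap & 0b10) == 0  # AP[2] == 0: read/write at the higher level
    writable_at_el0 = ap == 0b01        # only AP[2:1] == 0b01 lets EL0 write
    if from_el0:
        if uxn:
            return False
        return not (wxn and writable_at_el0)
    # At the higher Exception level, AP[2:1] == 0b01 forces PXN to be treated as 1.
    if pxn or ap == 0b01:
        return False
    return not (wxn and writable_at_elx)
```

For example, with UXN and PXN both 0, AP[2:1] set to 0b00, and SCTLR_ELx.WXN set to 1, the region is not executable at the higher Exception level but remains executable at EL0.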


Note

° Although the execute-never controls apply to speculative fetching, on a speculative instruction fetch from an
execute-never location, no Permission fault is generated unless the PE attempts to execute the instruction that
would have been fetched from that location. This means that, if a speculative fetch from an execute-never
location is attempted, but there is no attempt to execute the corresponding instruction, a Permission fault is
not generated.

° The software that defines a translation table must mark any region of memory that is read-sensitive as
execute-never, to avoid the possibility of a speculative fetch accessing the memory region. This means it must
mark any memory region that corresponds to a read-sensitive peripheral as execute-never. Hardware does not
prevent speculative accesses to a region of any Device memory type unless that region is also marked as
execute-never for all Exception levels from which it can be accessed.

° When no stage of address translation for the translation regime is enabled, memory regions cannot have
UXN, XN, or PXN attributes assigned. Behavior of instruction fetches when all associated stages of
translation are disabled on page D4-1768 describes how disabling all stages of address translation affects
instruction fetching.





The following subsections describe the instruction execution permission controls:

° Stage 1 instruction access and execution permissions.
° Stage 2 instruction execution permissions on page D4-1789.
° Hierarchical control of instruction fetching on page D4-1789.
° Preventing execution from writable locations on page D4-1790.
° Restriction on Secure instruction fetch on page D4-1791.


Stage 1 instruction access and execution permissions 


Table D4-33 on page D4-1788 and Table D4-34 on page D4-1789 include the AP[2:1] read and write permissions 
shown in Table D4-29 on page D4-1784 and Table D4-30 on page D4-1785. These permissions are shown as: 


R       Indicates Read permission granted.

W       Indicates Write permission granted.

Table D4-33 shows the stage 1 access permissions for instruction execution when using a translation regime that 
applies to EL0 and a higher Exception level. 


Table D4-33 Stage 1 access permissions for instruction execution for a translation regime that applies to EL0 and a
higher Exception level

UXN  PXN  AP[2:1]  SCTLR_ELx.WXN(a)  Access from higher Exception level  Access from EL0
0    0    00       0                 R, W, Executable                    Executable
0    0    00       1                 R, W, Not executable(b)             Executable
0    0    01       0                 R, W, Not executable(c)             R, W, Executable
0    0    01       1                 R, W, Not executable(c)             R, W, Not executable(d)
0    0    10       x                 R, Executable                       Executable
0    0    11       x                 R, Executable                       R, Executable
0    1    00       x                 R, W, Not executable                Executable
0    1    01       0                 R, W, Not executable                R, W, Executable
0    1    01       1                 R, W, Not executable                R, W, Not executable(d)
0    1    10       x                 R, Not executable                   Executable
0    1    11       x                 R, Not executable                   R, Executable
1    0    00       0                 R, W, Executable                    Not executable
1    0    00       1                 R, W, Not executable(b)             Not executable
1    0    01       x                 R, W, Not executable(c)             R, W, Not executable
1    0    10       x                 R, Executable                       Not executable
1    0    11       x                 R, Executable                       R, Not executable
1    1    00       x                 R, W, Not executable                Not executable
1    1    01       x                 R, W, Not executable                R, W, Not executable
1    1    10       x                 R, Not executable                   Not executable
1    1    11       x                 R, Not executable                   R, Not executable

a. Where ELx is the higher Exception level to which the translation regime applies.
b. Not executable because of SCTLR_ELx.WXN control, because region is writable at ELx.
c. Not executable, because AArch64 execution treats all regions writable at EL0 as being PXN.
d. Not executable because of SCTLR_ELx.WXN control, because region is writable at EL0.
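
The rows of Table D4-33 can be cross-checked with a small predicate. The following Python sketch is illustrative only. The function name and argument names are assumptions introduced here, not architecture pseudocode, and the sketch models only the execute permission, not the R and W permissions.

```python
def stage1_exec_el0_regime(uxn, pxn, ap, wxn):
    """Return (executable_at_ELx, executable_at_EL0) for one Table D4-33 row.

    ap is the 2-bit AP[2:1] value; uxn, pxn, and wxn are each 0 or 1.
    """
    writable_elx = (ap & 0b10) == 0   # AP[2] == 0: region is writable at ELx
    writable_el0 = ap == 0b01         # AP[2:1] == 0b01: also writable at EL0

    # EL0: blocked by UXN, or by WXN when the region is writable at EL0.
    exec_el0 = not uxn and not (wxn and writable_el0)

    # ELx: blocked by PXN, by any region writable at EL0 (treated as PXN),
    # or by WXN when the region is writable at ELx.
    exec_elx = not pxn and not writable_el0 and not (wxn and writable_elx)
    return bool(exec_elx), bool(exec_el0)
```

For example, `stage1_exec_el0_regime(0, 0, 0b01, 0)` returns `(False, True)`, matching the row in which the region is writable at EL0 and is therefore not executable at the higher Exception level.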

Table D4-34 shows the stage 1 access permissions for instruction execution when using a translation regime that 
applies to only a single Exception level. 


Table D4-34 Access permissions for instruction execution for a translation regime that applies to
only a single Exception level

XN  AP[2]  SCTLR_ELx.WXN(a)  Access permission
0   0      0                 R, W, Executable
0   0      1                 R, W, Not executable(b)
0   1      x                 R, Executable
1   0      x                 R, W, Not executable
1   1      x                 R, Not executable

a. Where ELx is the higher Exception level to which the translation regime applies.
b. Not executable because of the SCTLR_ELx.WXN control, because region is writable at ELx.


Note 


The Access permissions for an AArch64 translation regime that applies to only a single Exception level are 
consistent with the following fields in the translation table entries being treated as shown: 





• AP[1] treated as RES1.
• APTable[0] treated as RES0.
• PXN treated as RES0.
• PXNTable treated as RES0.





Stage 2 instruction execution permissions 


For the Non-secure EL1&0 stage 2 translation, the XN bit in the stage 2 translation table descriptors controls the 
execution permission, and this control is completely independent of the S2AP access permissions. 


Table D4-35 Access permissions for instruction execution for stage 2 of the Non-secure EL1&0
translation regime

XN  Access from Non-secure EL1 or Non-secure EL0
0   Executable
1   Not executable





The stage 2 XN access permissions make no distinction between Non-secure accesses from EL1 and Non-secure 
accesses from EL0. However, when both stages of address translation are enabled, these permissions are combined 
with the access permissions defined at stage 1 of the translation, see Combining the stage 1 and stage 2 
instruction execution permissions on page D4-1798. 


Hierarchical control of instruction fetching 


The VMSAv8-64 translation table format includes mechanisms by which entries at one level of translation table 
lookup can set limits on the permitted entries at subsequent levels of lookup. This subsection describes how these 
controls apply to the instruction fetching controls. 

Note 


Similar hierarchical controls apply to data accesses, see Hierarchical control of data access permissions on 
page D4-1785. 








The restrictions apply only to subsequent levels of lookup at the same stage of translation, and: 


° UXNTable or XNTable restricts the execute-never control: 


— When the value of the XNTable bit is 1, the XN bit is treated as 1 in all subsequent levels of lookup, 
regardless of its actual value. 

— When the value of the UXNTable bit is 1, the UXN bit is treated as 1 in all subsequent levels of lookup, 
regardless of its actual value. 


— When the value of a UXNTable or XNTable bit is 0 the bit has no effect. 


• For a translation regime that applies to EL0 and a higher Exception level, PXNTable restricts the PXN 
control: 


— When the value of PXNTable is 1, the PXN bit is treated as 1 in all subsequent levels of lookup, 
regardless of the actual value of the bit. 


— When the value of PXNTable is 0 it has no effect. 


Note 


The UXNTable, XNTable, and PXNTable settings are combined with the UXN, XN, and PXN bits in the translation 
table descriptors accessed at subsequent levels of lookup. They do not restrict or change the values entered in those 
descriptors. 








The UXNTable, XNTable, and PXNTable controls are provided only for stage 1 translations. The corresponding bits 
are RES0 in the stage 2 translation table descriptors. 


The effect of UXNTable, XNTable, or PXNTable applies to later entries in the translation table walk, and so its 
effects can be held in one or more TLB entries. Therefore, a change to UXNTable, XNTable, or PXNTable requires 
coarse-grained invalidation of the TLB to ensure that the effect of the change is visible to subsequent memory 
transactions. 
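
The way the Table bits combine with the leaf descriptor bits can be sketched as an OR accumulated along the stage 1 walk. This Python fragment is a minimal illustration under that reading. The function name and the tuple representation of descriptors are hypothetical and are not taken from the architecture pseudocode.

```python
def effective_execute_never(table_bits, leaf_bits):
    """Combine UXNTable/PXNTable bits from table descriptors with leaf UXN/PXN.

    table_bits: iterable of (uxn_table, pxn_table) pairs, one per table
                descriptor visited during the stage 1 walk, in lookup order.
    leaf_bits:  (uxn, pxn) from the final block or page descriptor.
    """
    uxn, pxn = leaf_bits
    for uxn_table, pxn_table in table_bits:
        # A Table bit of 1 forces the control to 1; a value of 0 has no effect.
        uxn |= uxn_table
        pxn |= pxn_table
    return uxn, pxn
```

The leaf descriptor itself is never modified. Only the interpretation of its bits changes, which is why stale TLB entries can still reflect an earlier Table bit value.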


Preventing execution from writable locations 


ARMv8 provides control bits that, when the corresponding stage 1 address translation is enabled, force writable 
memory to be treated as UXN, PXN, or XN, regardless of the value of the UXN, PXN, or XN bit: 

• For a translation regime that applies to EL0 and a higher Exception level, when the value of the applicable 
SCTLR_ELx.WXN field is 1: 

— All regions that are writable from EL0 at stage 1 of the address translation are treated as UXN. 

— All regions that are writable from EL1 at stage 1 of the address translation are treated as PXN. 

• For a translation regime that applies to only a single Exception level, when the value of the applicable 
SCTLR_ELx.WXN field is 1, all regions that are writable at stage 1 of the address translation are treated as 
XN. 

Note 

• The SCTLR_ELx.WXN controls are intended to be used in systems with very high security requirements. 
• Setting a WXN bit to 1 changes the interpretation of the translation table entry, overriding a zero value of a 
UXN, XN, or PXN field. It does not cause any change to the translation table entry. 

For any given virtual machine, ARM expects WXN to remain static in normal operation. In particular, it is 
IMPLEMENTATION DEFINED whether TLB entries associated with a particular VMID reflect the effect of the values 
of these fields. This means that any change of these fields without a corresponding change of VMID might require 
synchronization and TLB invalidation, as described in TLB maintenance requirements and the TLB maintenance 
instructions on page D4-1815. 


Restriction on Secure instruction fetch 


EL3 provides a Secure instruction fetch bit, SCR_EL3.SIF. When the value of this bit is 1, and execution is using 
the EL3 translation regime or the Secure EL1 translation regime, any attempt to execute an instruction fetched from 
Non-secure physical memory causes a Permission fault. TLB entries might reflect the value of this bit, and therefore 
any change to the value of this bit requires synchronization and TLB invalidation, as described in TLB maintenance 
requirements and the TLB maintenance instructions on page D4-1815. 


The Access flag 


The Access flag indicates when a page or section of memory is accessed for the first time since the Access flag in 
the corresponding translation table descriptor was set to 0. 


The AF bit in the translation table descriptors is the Access flag. 


In ARMv8.0, the Access flag is managed by software as described in Software management of the Access flag. 


Software management of the Access flag 


ARMv8.0 requires that software manages the Access flag. This means an Access flag fault is generated whenever 
an attempt is made to read into the TLB a translation table descriptor entry for which the value of Access flag is 0. 


The Access flag mechanism expects that, when an Access flag fault occurs, software resets the Access flag to 1 in 
the translation table entry that caused the fault. This prevents the fault occurring the next time that memory location 
is accessed. Entries with the Access flag set to 0 are never held in the TLB, meaning software does not have to flush 
the entry from the TLB after setting the flag. 





Note 


If a system incorporates components that can autonomously update translation table entries that are shared with the 
ARM PE, then the software must be aware of the possibility that such components can update the Access flag 
autonomously. 


In such a system, system software should perform any changes of translation table entries with an Access flag of 0, 
other than changes to the Access flag value, by using a Load-Exclusive/Store-Exclusive loop, to allow for the 
possibility of simultaneous updates. 

D4.5 Memory region attributes 


The memory region attribute fields control the memory type, accesses to the caches, and whether the memory region 
is Shareable and therefore is coherent. This section also describes some additional translation table fields, that this 
manual groups with the memory region attributes. 


In the EL1&0 translation regime, each enabled stage of address translation assigns memory region attributes, as 
described in this section. When both stages of translation are enabled, Combining the stage 1 and stage 2 attributes, 
Non-secure EL1&0 translation regime on page D4-1797 describes how the assignments from the two stages are 
combined. 


Note 


In a virtualization implementation, a hypervisor, executing at EL2, might usefully: 





• Reduce the permitted cacheability of a region. 

• Increase the required shareability of a region. 


The combining of attributes from stage 1 and stage 2 translations supports both of these options. 





The following sections describe these attributes: 

° The stage 1 memory region attributes. 

° The stage 2 memory region attributes, EL1&0 translation regime on page D4-1794. 

° Other fields in the VMSAv8-64 translation table format descriptors on page D4-1795. 





Note 


This section describes the memory region attributes for each of the translation regimes, and for each stage of 
translation in the Non-secure EL1&0 translation regime. 


A translation regime applies to memory accesses from either: 

• Only a single Exception level, for example the EL3 translation regime. 

• EL0 and one higher Exception level, for example the EL1&0 translation regime. 


In general, attribute assignment is simpler in a regime that applies to only a single Exception level, and in these 
regimes behavior is consistent with fields in the translation tables being treated as follows: 

• AP[1] is RES1, meaning the PE ignores the value of the bit and behaves as if it is 1. 

• APTable[0] is RES0, meaning the PE ignores the value of the bit and behaves as if it is 0. 

• The PXN field is RES0, meaning the PE ignores the value of the bit and behaves as if it is 0. 

• The PXNTable bit is RES0, meaning the PE ignores the value of the bit and behaves as if it is 0. 





D4.5.1 The stage 1 memory region attributes 


The description of the memory region attributes in a translation descriptor divides into: 


Memory type and Cacheability 
        These are described indirectly, by registers referenced by bits in the table descriptor. This is 
        described as remapping the memory type and attribute description. Stage 1 memory region type and 
        Cacheability attributes describes this encoding. 

Shareability 
        The SH[1:0] field in the translation table descriptor encodes shareability information. Stage 1 
        Shareability attribute, for Normal memory on page D4-1793 describes this encoding. 


Stage 1 memory region type and Cacheability attributes 


In the VMSAv8-64 translation table format, the AttrIndx[2:0] field in a block or page translation table descriptor 
for a stage 1 translation indicates the 8-bit field in the MAIR_ELx that specifies the attributes for the corresponding 
memory region. The required field is Attrn, where n = AttrIndx[2:0]. For more information about AttrIndx[2:0] see 
Attribute fields in stage 1 VMSAv8-64 Block and Page descriptors on page D4-1779. 

Note 


Each MAIR_ELx is a 64-bit register that is architecturally mapped to a pair of AArch32 registers. See the 
MAIR_ELx register descriptions for more information. 








Each MAIR_ELx.Attrn field defines, for the corresponding memory region: 

• The memory type, Device or Normal. 

• For Device memory, the Device memory type, one of: 
— Device-nGnRnE. 
— Device-nGnRE. 
— Device-nGRE. 
— Device-GRE. 

• For Normal memory: 
— The inner and outer cacheability, Non-cacheable, Write-Through, or Write-Back. 

— For Write-Through Cacheable and Write-Back Cacheable regions, the Read-Allocate and 
Write-Allocate policy hints, each of which is Allocate or No Allocate, and the Transient allocation 
hints, if supported. 


For more information about the memory type and attributes, see Memory types and attributes on page B2-94 and 
Cacheability, cache allocation hints, and cache transient hints on page D3-1695. 
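
Selecting the Attrn field from a MAIR_ELx value, as described above, amounts to extracting one byte of the 64-bit register. The following Python sketch assumes that Attr0 occupies bits[7:0] and that each successive field occupies the next byte. The function name is hypothetical.

```python
def mair_attr(mair_value, attr_indx):
    """Return the 8-bit Attr<n> field of a 64-bit MAIR_ELx value, n = AttrIndx[2:0]."""
    if not 0 <= attr_indx <= 7:
        raise ValueError("AttrIndx[2:0] is a 3-bit field")
    # Assumption: Attr<n> occupies bits [8n+7:8n] of the register.
    return (mair_value >> (8 * attr_indx)) & 0xFF
```

With a register value of `0x00FF4400`, AttrIndx values 0, 1, and 2 select `0x00`, `0x44`, and `0xFF` respectively. The meaning of each 8-bit encoding is defined in the MAIR_ELx register descriptions.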


Stage 1 Shareability attribute, for Normal memory 


When using the VMSAv8-64 translation table format, the SH[1:0] field in a block or page translation table 
descriptor specifies the Shareability attributes of the corresponding memory region. Table D4-36 shows the 
encoding of this field. 


Table D4-36 SH[1:0] field encoding for Normal memory, VMSAv8-64 translation table format 

SH[1:0]  Normal memory 
00       Non-shareable 
01       Reserved, CONSTRAINED UNPREDICTABLE(a) 
10       Outer Shareable 
11       Inner Shareable 

a. See Reserved values in System and memory-mapped registers and translation table 
entries on page K1-5492 for the permitted CONSTRAINED UNPREDICTABLE behavior. 





Note 
The shareability field is only relevant if the memory is a Normal Cacheable memory type. All Device and Normal 
Non-cacheable memory regions are always treated as Outer Shareable, regardless of the translation table 
shareability attributes. 





See Combining the stage 1 and stage 2 shareability attributes for Normal memory on page D4-1799 for constraints 
on the Shareability attributes of a Normal memory region that is Inner Non-cacheable, Outer Non-cacheable. 

D4.5.2 The stage 2 memory region attributes, EL1&0 translation regime 

In the stage 2 translation table descriptors for memory regions and pages, the MemAttr[3:0] and SH[1:0] fields 
describe the stage 2 memory region attributes: 

• Stage 2 memory region type and Cacheability attributes describes how the MemAttr[3:0] field defines these 
attributes. 

• The SH[1:0] field in the translation table descriptor encodes shareability information. Stage 2 Shareability 
attribute, for Normal memory on page D4-1795 describes this encoding. 

The following sections describe how, when both stages of address translation are enabled, the memory region 
attributes assigned at stage 2 of the translation are combined with those assigned at stage 1: 

• Combining the stage 1 and stage 2 memory type attributes on page D4-1798. 
• Combining the stage 1 and stage 2 cacheability attributes for Normal memory on page D4-1799. 
• Combining the stage 1 and stage 2 shareability attributes for Normal memory on page D4-1799. 

Stage 2 memory region type and Cacheability attributes 

Table D4-37 shows how MemAttr[3:2] gives a top-level definition of the memory type, and of the Outer 
cacheability of a Normal memory region. 

Table D4-37 VMSAv8-64 MemAttr[3:2] encoding, stage 2 translation 

MemAttr[3:2]  Memory type                                            Outer cacheability 
00            Device. MemAttr[1:0] encodes the Device memory type.   Not applicable 
01            Normal. MemAttr[1:0] encodes the Inner Cacheability.   Outer Non-cacheable 
10            Normal. MemAttr[1:0] encodes the Inner Cacheability.   Outer Write-Through Cacheable 
11            Normal. MemAttr[1:0] encodes the Inner Cacheability.   Outer Write-Back Cacheable 

The encoding of MemAttr[1:0] depends on the Memory type indicated by MemAttr[3:2]: 

• When MemAttr[3:2] == 0b00, indicating Device memory, Table D4-38 shows the encoding of MemAttr[1:0]. 

Table D4-38 MemAttr[1:0] encoding for Device memory 

MemAttr[1:0]  Meaning when MemAttr[3:2] == 0b00 
00            Region is Device-nGnRnE memory 
01            Region is Device-nGnRE memory 
10            Region is Device-nGRE memory 
11            Region is Device-GRE memory 

• When MemAttr[3:2] != 0b00, indicating Normal memory, Table D4-39 shows the encoding of MemAttr[1:0]. 

Table D4-39 MemAttr[1:0] encoding for Normal memory 

MemAttr[1:0]  Meaning when MemAttr[3:2] != 0b00 
00            Reserved, CONSTRAINED UNPREDICTABLE(a) 
01            Inner Non-cacheable 
10            Inner Write-Through Cacheable 
11            Inner Write-Back Cacheable 

a. See Reserved values in System and memory-mapped registers and translation table 
entries on page K1-5492 for the permitted CONSTRAINED UNPREDICTABLE behavior. 


Note 


• The stage 2 translation does not assign any allocation hints. 

• The following stage 2 translation table attribute settings leave the stage 1 settings unchanged: 
— MemAttr[3:2] == 0b11, Normal memory, Outer Write-Back Cacheable. 
— MemAttr[1:0] == 0b11, Inner Write-Back Cacheable. 
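
The MemAttr[3:0] encodings in Tables D4-37 to D4-39 can be summarized as a small decoder. The Python sketch below is illustrative. The function name, the lookup lists, and the tuple return format are assumptions made for this example, and the reserved encoding is only labeled, not modeled.

```python
DEVICE_TYPES = ["Device-nGnRnE", "Device-nGnRE", "Device-nGRE", "Device-GRE"]
CACHEABILITY = [None, "Non-cacheable", "Write-Through Cacheable", "Write-Back Cacheable"]

def decode_stage2_memattr(memattr):
    """Decode a 4-bit stage 2 MemAttr[3:0] value into a descriptive tuple."""
    hi, lo = (memattr >> 2) & 0b11, memattr & 0b11
    if hi == 0b00:
        # MemAttr[3:2] == 0b00: Device memory, MemAttr[1:0] gives the type.
        return ("Device", DEVICE_TYPES[lo])
    if lo == 0b00:
        # Reserved encoding: behavior is CONSTRAINED UNPREDICTABLE.
        return ("Normal", "Reserved")
    return ("Normal", "Outer " + CACHEABILITY[hi], "Inner " + CACHEABILITY[lo])
```

For example, `decode_stage2_memattr(0b1111)` yields Normal memory that is Outer and Inner Write-Back Cacheable, the setting that leaves the stage 1 attributes unchanged.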





Stage 2 Shareability attribute, for Normal memory 


When using the VMSAv8-64 translation table format, the SH[1:0] field in a block or page translation table 
descriptor specifies the Shareability attributes of the corresponding memory region. Table D4-40 shows the 
encoding of this field. 


Table D4-40 SH[1:0] field encoding for Normal memory, VMSAv8-64 translation table format 

SH[1:0]  Normal memory 
00       Non-shareable 
01       Reserved, CONSTRAINED UNPREDICTABLE(a) 
10       Outer Shareable 
11       Inner Shareable 

a. See Reserved values in System and memory-mapped registers and translation table 
entries on page K1-5492 for the permitted CONSTRAINED UNPREDICTABLE behavior. 


Note 


° This encoding is the same as the shareability encoding described in Stage 1 Shareability attribute, for Normal 
memory on page D4-1793. 





° The shareability field is only relevant if the memory is a Normal Cacheable memory type. All Device and 
Normal Non-cacheable memory regions are always treated as Outer Shareable, regardless of the translation 
table shareability attributes. 





See Combining the stage 1 and stage 2 shareability attributes for Normal memory on page D4-1799 for constraints 
on the Shareability attributes of a Normal memory region that is Inner Non-cacheable, Outer Non-cacheable. 


D4.5.3 Other fields in the VMSAv8-64 translation table format descriptors 


The following subsections describe the other fields in the translation table block and page descriptors: 
° The Contiguous bit on page D4-1796. 

° IGNORED fields on page D4-1796. 

° Field reserved for software use on page D4-1797. 

The Contiguous bit 


When the value of the Contiguous bit is 1, it indicates that the entry is one of a number of adjacent translation table 
entries that point to a contiguous output address range. The required number of adjacent entries depends on the 
current translation granule size, as follows: 


4KB granule 
        16 adjacent translation table entries point to a contiguous output address range that has the same 
        permissions and attributes. These 16 entries must be aligned in the translation table. If accessing a 
        full-sized 4KB translation table, this means that the top 5 of the 9 input address bits that index the 
        descriptor positions in the translation table are the same for all of the entries. 

        The contiguous output address range must be aligned to the size of 16 translation table entries at the 
        same translation table level. 

16KB granule 
        This bit indicates that adjacent translation table entries point to a contiguous output address range that 
        has the same permissions and attributes. With the 16KB granule, the number of contiguous entries 
        indicated by setting this bit to 1 depends on the lookup level of the translation table: 

        Level 2 lookup 
                The bit indicates 32 contiguous entries, giving a 1GB block of memory. 
                These entries must be aligned in the translation table. When accessing a 
                full-sized 16KB translation table, this means the top 6 of the 11 input 
                address bits that index the descriptor positions in the translation table are 
                the same for all of the entries. 

                The contiguous output address range must be aligned to the size of 32 
                translation table entries at the same translation table level. 

        Level 3 lookup 
                The bit indicates 128 contiguous entries, giving a 2MB block of memory. 
                These entries must be aligned in the translation table. When accessing a 
                full-sized 16KB translation table, this means the top 4 of the 11 input 
                address bits that index the descriptor positions in the translation table are 
                the same for all of the entries. 

                The contiguous output address range must be aligned to the size of 128 
                translation table entries at the same translation table level. 

64KB granule 
        32 adjacent translation table entries point to a contiguous output address range that has the same 
        permissions and attributes. These 32 entries must be aligned in the translation table. If accessing a 
        full-sized 64KB translation table, this means that the top 8 of the 13 input address bits that index 
        the descriptor positions in the translation table are the same for all of the entries. 

        The contiguous output address range must be aligned to the size of 32 translation table entries at the 
        same translation table level. 


Setting this bit to 1 means that the TLB can cache a single entry to cover the contiguous translation table entries. 


This section defines the requirements for programming the Contiguous bit. Possible translation table registers 
programming errors on page D4-1762 describes the effect of not meeting these requirements. 


The architecture does not require a PE to cache TLB entries in this way. To avoid TLB coherency issues, any TLB 
maintenance by address must not assume any optimization of the TLB tables that might result from use of the 
Contiguous bit. 


TLB maintenance must be performed based on the size of the underlying translation table entries, to avoid TLB 
coherency issues. 
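
The alignment requirement on contiguous entries can be illustrated by computing, for a given descriptor index, the aligned group of entries that the Contiguous bit would cover. This Python sketch uses only the entry counts stated in this subsection. The table and function names are hypothetical, and the sketch does not model any TLB behavior.

```python
# Number of adjacent entries covered by the Contiguous bit, per the text above.
# Keys are (granule size in KB, lookup level); None means "any level".
CONTIGUOUS_ENTRIES = {
    (4, None): 16,
    (16, 2): 32,
    (16, 3): 128,
    (64, None): 32,
}

def contiguous_group(granule_kb, level, index):
    """Return (first_index, count) of the aligned group containing `index`."""
    key = (granule_kb, level)
    if key not in CONTIGUOUS_ENTRIES:
        key = (granule_kb, None)
    if key not in CONTIGUOUS_ENTRIES:
        raise ValueError("no contiguous entry count given for this granule and level")
    count = CONTIGUOUS_ENTRIES[key]
    # Clearing the low bits of the index gives the first entry of the group:
    # the remaining top index bits are the same for every entry in the group.
    first = index & ~(count - 1)
    return first, count
```

For a 4KB granule, entry 37 belongs to the group that starts at entry 32 and spans 16 entries. TLB maintenance by address must still be issued for each underlying entry, because a PE is not required to cache the group as a single TLB entry.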


IGNORED fields 


In the VMSAv8-64 translation table descriptors, the following fields are identified as IGNORED, meaning the 
architecture guarantees that a PE makes no use of these fields: 


° In the stage 1 table descriptors, bits[58:52]. 
° In the stage 1 and stage 2 block and page descriptors, bits[63:59] and bits[58:55]. 

Of these fields: 


° In the stage 1 and stage 2 block and page descriptors, bits[58:55] are reserved for software use, see Field 
reserved for software use. 


° In the stage 2 block and page descriptors, bits[63:60] are reserved for use by a System MMU control. 


Field reserved for software use 


The architecture reserves a 4-bit IGNORED field in the Block and Page table descriptors, bits[58:55], for software 
use. The definition of IGNORED means the architecture guarantees that hardware makes no use of this field. 





Note 


This means there is no need to invalidate the TLB if these bits are changed. 














D4.5.4 Combining the stage 1 and stage 2 attributes, Non-secure EL1&0 translation regime 

The Non-secure EL1&0 translation regime comprises two stages of translation, each of which can be enabled 
independently: 

• Stage 1 translation is configured and controlled from EL1. When enabled, stage 1 translation can define 
access permissions independently for accesses from EL0 and for accesses from EL1. 

Stage 1 MMU faults are taken to EL1. 

• When stage 2 translation is enabled, the stage 2 access controls defined at EL2: 
— Affect only the Non-secure stage 1 access permissions settings. 
— Take no account of whether the accesses are at EL1 or EL0. 
— Permit software executing at EL2 to assign a write-only attribute to a memory region. 

Stage 2 MMU faults are taken to EL2. 

Note 

In an implementation of virtualization, the attributes defined in the stage 2 translation tables mean a hypervisor can 
define additional access restrictions to those defined by a Guest OS in the stage 1 translation tables. For a particular 
access, the actual access permission is the more restrictive of the permissions defined by: 

• The Guest OS, in the stage 1 translation tables. 
• The hypervisor, in the stage 2 translation tables. 

The effects of the combination of attributes defined by the hypervisor are functionally transparent to the Guest OS. 

Combining the stage 1 and stage 2 data access permissions 

When both stages of translation are enabled, the following access permissions are combined: 

• The stage 1 permissions described in The AP[2:1] data access permissions, for stage 1 translations on 
page D4-1784. 

• The stage 2 permissions described in The S2AP data access permissions, Non-secure EL1&0 translation 
regime on page D4-1785. 

The stage 1 and stage 2 permissions are combined as follows: 

1. If an access is not permitted by the stage 1 permissions, then it generates a Stage 1 Permission fault, 
regardless of the stage 2 permissions. 

2. If an access is permitted by the stage 1 permissions, but is not permitted by the stage 2 permissions, then it 
generates a Stage 2 Permission fault. 

3. If an access is permitted by both the stage 1 permissions and the stage 2 permissions, then it does not generate 
a Permission fault. 

Combining the stage 1 and stage 2 instruction execution permissions 

When both stages of translation are enabled, the following access permissions are combined: 

• The stage 1 permissions described in Stage 1 instruction access and execution permissions on page D4-1787. 

° The stage 2 permissions described in Stage 2 instruction execution permissions on page D4-1789. 

The stage 1 and stage 2 permissions are combined as follows: 


1. If an instruction fetch is not permitted by the stage 1 permissions, then it generates a Stage 1 Permission fault, 
regardless of the stage 2 permissions. 


2. If an instruction fetch is permitted by the stage 1 permissions, but is not permitted by the stage 2 Permissions, 
then it generates a Stage 2 Permission fault. 


3. If an instruction fetch is permitted by both the stage 1 permissions and the stage 2 permissions, then it does 
not generate a Permission fault. 
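
The three combination rules are the same for data accesses and for instruction fetches, and they fix the order in which the two stages are checked. A minimal Python sketch, with a hypothetical function name and string return values chosen only for illustration:

```python
def permission_check(stage1_permits, stage2_permits):
    """Return the fault generated by a two-stage permission check, or None."""
    if not stage1_permits:
        # Rule 1: reported as a stage 1 fault regardless of the stage 2 result.
        return "Stage 1 Permission fault"
    if not stage2_permits:
        # Rule 2: stage 1 permits the access but stage 2 does not.
        return "Stage 2 Permission fault"
    # Rule 3: both stages permit the access.
    return None
```

Because stage 1 is evaluated first, an access that both stages would deny is reported as a Stage 1 Permission fault.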


Combining the stage 1 and stage 2 memory type attributes 


Table D4-41 shows the rules for combining the stage 1 and stage 2 memory type assignments. 


Table D4-41 Combining the stage 1 and stage 2 memory type assignments 

Rule                                             If either stage of translation    The resultant memory type is: 
                                                 assigns: 
Device has precedence over Normal                Any Device memory type            A Device memory type 
Non-Gathering has precedence over Gathering      A Device-nGxx memory type         A Device-nGxx memory type 
Non-Reordering has precedence over Reordering    A Device-nGnRx memory type        A Device-nGnRx memory type 
No Early write acknowledge has precedence        The Device-nGnRnE memory type     The Device-nGnRnE memory type 
over Early write acknowledge 
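
One way to read the four rules in Table D4-41 is that the more restrictive of the two assigned memory types wins. The Python sketch below encodes that reading as an ordering. The list and function names are assumptions for illustration, and the ordering is an interpretation of the table rather than text from it.

```python
# Ordered from most restrictive to least restrictive; Normal is least restrictive.
MEMORY_TYPES = ["Device-nGnRnE", "Device-nGnRE", "Device-nGRE", "Device-GRE", "Normal"]

def combine_memory_type(stage1, stage2):
    """Return the resultant memory type for the two stage assignments."""
    # The type that appears earlier in MEMORY_TYPES takes precedence.
    return min(stage1, stage2, key=MEMORY_TYPES.index)
```

For example, a stage 1 assignment of Normal combined with a stage 2 assignment of Device-GRE gives Device-GRE, and Device-GRE combined with Device-nGnRE gives Device-nGnRE.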





Regardless of any shareability attribute obtained as described in Combining the stage 1 and stage 2 shareability 
attributes for Normal memory on page D4-1799: 


• Any location for which the resultant memory type is any type of Device memory is always treated as Outer 
Shareable. 

• Any location for which the resultant memory type is Normal Inner Non-cacheable, Outer Non-cacheable is 
always treated as Outer Shareable. 


For information about how the cacheability attribute is obtained from the attributes assigned at each stage of 
translation see Combining the stage 1 and stage 2 cacheability attributes for Normal memory on 
page D4-1799. 


The combining of the memory type attributes from the two stages of translation means a translation table walk for 
stage 1 translation can be made to a type of Device memory. If this occurs then: 


° If the value of HCR_EL2.PTW is 0, then the translation table walk occurs as if it is to Normal Non-cacheable 
memory. This means it can be done speculatively. 


° If the value of HCR_EL2.PTW is 1, then the memory access generates a stage 2 Permission fault. 

Combining the stage 1 and stage 2 cacheability attributes for Normal memory 


For a Normal memory region, Table D4-42 shows how the stage 1 and stage 2 cacheability assignments are 
combined. This combination applies, independently, for the Inner cacheability and Outer cacheability attributes. 


Table D4-42 Combining the stage 1 and stage 2 cacheability assignments for Normal memory 

Assignment in stage 1                    Assignment in stage 2                    Resultant cacheability 
Non-cacheable                            Any                                      Non-cacheable 
Any                                      Non-cacheable                            Non-cacheable 
Write-Through Cacheable                  Write-Through or Write-Back Cacheable    Write-Through Cacheable 
Write-Through or Write-Back Cacheable    Write-Through Cacheable                  Write-Through Cacheable 
Write-Back Cacheable                     Write-Back Cacheable                     Write-Back Cacheable 
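
Table D4-42 is equivalent to taking the weaker of the two cacheability assignments, applied separately to the Inner and Outer attributes. A short Python sketch of that equivalence follows. The names are hypothetical, and the ordering is an interpretation of the table.

```python
# Ordered from weakest to strongest cacheability.
CACHEABILITY_ORDER = ["Non-cacheable", "Write-Through Cacheable", "Write-Back Cacheable"]

def combine_cacheability(stage1, stage2):
    """Return the resultant cacheability for one of the Inner or Outer attributes."""
    return min(stage1, stage2, key=CACHEABILITY_ORDER.index)
```

Calling the function once with the Inner assignments and once with the Outer assignments gives the two resultant attributes. Write-Back Cacheable results only when both stages assign it.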





Combining the stage 1 and stage 2 shareability attributes for Normal memory 


A memory region is treated as Outer Shareable, regardless of any shareability assignments at either stage of 
translation, if either: 


• The resultant memory type attribute, described in Combining the stage 1 and stage 2 memory type attributes 
on page D4-1798, is any type of Device memory. 

• The resultant memory type attribute, described in Combining the stage 1 and stage 2 memory type attributes 
on page D4-1798, is Normal memory, and the resultant cacheability, described in Combining the stage 1 and 
stage 2 cacheability attributes for Normal memory, is Inner Non-cacheable, Outer Non-cacheable. 


For a memory region with a resultant memory type attribute of Normal, that is not Inner Non-cacheable, Outer 
Non-cacheable, Table D4-43 shows how the stage 1 and stage 2 shareability assignments are combined. 


Table D4-43 Combining the stage 1 and stage 2 Shareability assignments for Normal memory (a)

Assignment in stage 1    Assignment in stage 2    Resultant shareability
Outer Shareable          Any                      Outer Shareable
Inner Shareable          Outer Shareable          Outer Shareable
Inner Shareable          Inner Shareable          Inner Shareable
Inner Shareable          Non-shareable            Inner Shareable
Non-shareable            Outer Shareable          Outer Shareable
Non-shareable            Inner Shareable          Inner Shareable
Non-shareable            Non-shareable            Non-shareable

a. Applies only if the Normal memory is not Inner Non-cacheable, Outer Non-cacheable, see text. 
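
As an informal illustration, which is not part of the architecture pseudocode, the shareability rules above can be sketched in Python. The sketch assumes the ordering Non-shareable < Inner Shareable < Outer Shareable, under which every row of Table D4-43 yields the stronger of the two assignments. The names Shareability and combine_shareability, and the two boolean parameters, are hypothetical.

```python
from enum import IntEnum


class Shareability(IntEnum):
    """Shareability assignments, ordered from weakest to strongest (assumed ordering)."""
    NON_SHAREABLE = 0
    INNER_SHAREABLE = 1
    OUTER_SHAREABLE = 2


def combine_shareability(stage1, stage2, is_device=False, is_non_cacheable=False):
    """Return the resultant shareability of a memory region.

    is_device: the resultant memory type is any type of Device memory.
    is_non_cacheable: the resultant Normal memory is Inner Non-cacheable,
    Outer Non-cacheable.
    In either of those cases the region is treated as Outer Shareable,
    regardless of the stage 1 and stage 2 assignments.
    """
    if is_device or is_non_cacheable:
        return Shareability.OUTER_SHAREABLE
    return max(stage1, stage2)
```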
D4.6 MMU faults 


In a VMSAv8-64 implementation, the following mechanisms cause a PE to take an exception on a failed memory 
access: 


Debug exception    An exception caused by the debug configuration, see Chapter D2 AArch64 Self-hosted
                   Debug.

Alignment fault    An Alignment fault is generated if the address used for a memory access does not have the
                   required alignment for the operation. For more information see Alignment support on
                   page B2-76.

MMU fault          An MMU fault is a fault generated by the fault checking sequence for the current translation
                   regime. The remainder of this section describes MMU faults.

External abort     Any memory system fault other than a Debug exception, an Alignment fault, or an MMU
                   fault.


Collectively, these mechanisms are called aborts. 


In AArch64 state MMU faults are synchronous exceptions that are reported as either: 





° Data Aborts. 
° Instruction Aborts. 

Note 


Instruction Aborts report any synchronous memory abort on an instruction fetch. 





External aborts can be reported synchronously or asynchronously. Asynchronous external aborts are reported using 
the SError interrupt. For more information, see External aborts on page D3-1714. 


An access that causes an abort is said to be aborted, and uses the Fault Address Registers (FARs) and Exception 
Syndrome Registers (ESRs) to record context information. 


For more information, see Synchronous exception types, routing and priorities on page D1-1547. 


The Exception level that the MMU fault is taken to depends on the translation regime that generated the fault. The 
fault context saved in the appropriate ESR_ELx, where ELx is the Exception level that the fault is taken to, is 
dependent on whether: 


° The MMU fault is due to an Instruction or Data Abort. 


° The exception is taken from the same or a lower Exception level. 


Software stepping, which is a debug feature, and a PC alignment fault exception are the only exceptions that are 
higher priority than an Instruction Abort. Only watchpoints are at a lower priority than Data Aborts in the exception 
priority hierarchy. For more information see Synchronous exception prioritization for exceptions taken to AArch64 
on page D1-1548. 


The following sections describe the abort mechanisms: 

° Types of MMU faults on page D4-1801. 
° The MMU fault-checking sequence on page D4-1803. 
° AArch64 state prioritization of synchronous aborts from a single stage of address translation on 
page D4-1807. 
° Pseudocode description of the MMU faults on page D4-1809. 
D4.6.1 Types of MMU faults 


This section describes the faults that might be detected during one of the fault-checking sequences described in The 
MMU fault-checking sequence on page D4-1803. The following list includes all the types of exceptions that can 
occur: 

° Alignment fault on a data access, see Alignment support on page B2-76. 
° Permission fault, see Permission fault. 

° Translation fault, see Translation fault on page D4-1802. 


° Address size fault, see Address size fault on page D4-1802. 


° Synchronous external abort on a translation table walk, see External abort on a translation table walk on 
page D4-1802. 


° Access flag fault, see Access flag fault on page D4-1803. 
° TLB conflict abort, see TLB conflict aborts on page D4-1814. 


When an MMU fault generates an abort for a region of memory, no memory access is made if that region is or could 
be marked as Device. 


The following subsections describe the MMU faults that are not described elsewhere in this Manual. 


Permission fault 


A Permission fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 
See About access permissions on page D4-1783 for information about conditions that cause a Permission fault. 


A TLB might hold a translation table entry that causes a Permission fault. Therefore, if the handling of a Permission 
fault results in an update to the associated translation tables, the software that updates the translation tables must 
invalidate the appropriate TLB entry, to prevent the stale information in the TLB being used on a subsequent 
memory access. 


This maintenance requirement applies to Permission faults in both stage 1 and stage 2 translations. 


Cache maintenance instructions cannot generate Permission faults, except that: 


° A stage 1 translation table walk performed as part of a cache maintenance instruction can generate a stage 2 
Permission fault as described in Stage 2 fault on a stage 1 translation table walk on page D4-1806. 


° When the value of SCTLR_EL1.UCI is 1, enabling EL0 execution of the DC CVAU, DC CVAC, DC CIVAC, 
and IC IVAU instructions: 


— Executing a DC CVAU, DC CVAC, or DC CIVAC instruction at EL0 to a location that does not have 
read permission at EL0 generates a Permission fault. 


— It is IMPLEMENTATION DEFINED whether executing an IC IVAU instruction at EL0 to a location that does 
not have read permission at EL0 generates a Permission fault. 


° A DC IVAC instruction requires write permission to the address it invalidates, otherwise it generates a 
Permission fault. 


Note 


— Execution of the DCIMVAC instruction in AArch32 state does not have this write permission 
requirement. 





— When Non-secure EL1&0 stage 2 address translation is enabled, a DC IVAC instruction executed in 
Non-secure state operates as a DC CIVAC instruction, as described in Effects of virtualization and 
security on the cache maintenance instructions on page D3-1707. 





The Data Cache Zero instruction, DC ZVA, operates as a set of stores to each byte within the block being accessed, 
and therefore it generates a Permission fault if the translation system does not permit writes to these locations. 







Translation fault 


A Translation fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 
A Translation fault is generated if bits[1:0] of a translation table descriptor identify the descriptor as either a Fault 
encoding or a reserved encoding. For more information see VMSAv8-64 translation table format descriptors on 
page D4-1774. 


In addition, a Translation fault is generated if the input address for a translation either does not map onto an address 
range of a TTBR, or the TTBR range that it maps onto is disabled. In these cases the fault is reported as a level 0 
Translation fault on the translation stage at which the mapping to a region described by a TTBR failed. 


A data or unified cache maintenance by VA instruction can generate a Translation fault. It is IMPLEMENTATION 
DEFINED whether an instruction cache invalidate by VA operation can generate a Translation fault. 


The architecture guarantees that any translation table entry that causes a Translation fault is not cached, meaning 
the TLB never holds such an entry. Therefore, when a Translation fault occurs, the fault handler does not have to 
perform any TLB maintenance instructions to remove the faulting entry. 
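
As an informal illustration, which is not part of the architecture pseudocode, the check on bits[1:0] of a descriptor can be sketched in Python. The encodings used here are an assumption based on the VMSAv8-64 translation table descriptor formats, which this section references but does not reproduce: bit[0] == 0 is taken as a Fault encoding, 0b11 as a Table descriptor at levels 0 to 2 or a Page descriptor at level 3, and 0b01 as a Block descriptor where the lookup level permits one. The function names and the block_allowed parameter are hypothetical.

```python
def classify_descriptor(descriptor, level, block_allowed=True):
    """Classify a translation table descriptor from bits[1:0] (assumed encodings).

    block_allowed states whether a Block descriptor is permitted at this
    lookup level for the translation granule in use. It is a hypothetical
    parameter, because that rule depends on configuration not modelled here.
    """
    low_bits = descriptor & 0b11
    if (low_bits & 0b01) == 0:
        return "fault"          # 0b00 or 0b10: Fault encoding
    if low_bits == 0b11:
        return "page" if level == 3 else "table"
    # low_bits == 0b01
    if level != 3 and block_allowed:
        return "block"
    return "reserved"           # 0b01 where a Block descriptor is not permitted


def generates_translation_fault(descriptor, level, block_allowed=True):
    """A Fault encoding or a reserved encoding generates a Translation fault."""
    return classify_descriptor(descriptor, level, block_allowed) in ("fault", "reserved")
```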


Address size fault 
An Address size fault can be generated at any level of lookup. 


An Address size fault is generated if one of the following has nonzero address bits above the output address size, 
for the current stage of translation: 


° The TTBR used for the translation. 
° A translation table entry. 


° The output address of the translation. 


For an Address size fault generated because the TTBR used for the translation has nonzero address bits above the 
output address size, the reported fault code indicates a fault at level 0. Otherwise, the reported fault code indicates 
the lookup level at which the fault occurred. 


A data or unified cache maintenance by VA instruction can generate an Address size fault. It is IMPLEMENTATION 
DEFINED whether an instruction cache invalidate by VA instruction can generate an Address size fault. 


The architecture guarantees that any translation table entry that causes an Address size fault is not cached, meaning 
the TLB never holds such an entry. Therefore, when an Address size fault occurs, the fault handler does not have to 
perform any TLB maintenance instructions to remove the faulting entry. 


For more information on Address size faults, see Output address size on page D4-1732. 
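
As an informal illustration, which is not part of the architecture pseudocode, the Address size fault check and the reported lookup level can be sketched in Python. The sketch assumes that the caller already knows the output address size, in bits, for the current stage of translation. The function names and the walk_addresses parameter are hypothetical.

```python
def exceeds_output_size(address, output_size_bits):
    """Return True if the address has nonzero bits above the output address size."""
    return (address >> output_size_bits) != 0


def address_size_fault_level(ttbr_base, walk_addresses, output_size_bits):
    """Return the lookup level reported for an Address size fault, or None.

    ttbr_base: the base address held in the TTBR used for the translation.
    walk_addresses: (level, address) pairs, in walk order, for the addresses
    taken from translation table entries, ending with the output address
    paired with the final lookup level.
    A fault caused by the TTBR is reported at level 0. Otherwise the level
    of the lookup at which the fault occurred is reported.
    """
    if exceeds_output_size(ttbr_base, output_size_bits):
        return 0
    for level, address in walk_addresses:
        if exceeds_output_size(address, output_size_bits):
            return level
    return None
```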


External abort on a translation table walk 


An external abort on a translation table walk can be either synchronous or asynchronous. An external abort on a 
translation table walk is reported: 


° If the external abort is synchronous, using: 

— A synchronous Instruction Abort exception if the translation table walk is for an instruction fetch. 

— A synchronous Data Abort exception if the translation table walk is for a data access. 


° If the external abort is asynchronous, using the SError interrupt exception. 


Behavior of external aborts on a translation table walk caused by address translation instructions 


The address translation instructions summarized in Address translation instructions, functional group on 
page G4-4204 require translation table walks. An external abort can occur in the translation table walk. This is 
reported as follows: 


° If the external abort is synchronous, using a synchronous Data Abort exception. 


° If the external abort is asynchronous, using the SError interrupt exception. 


For more information, see Synchronous faults generated by address translation instructions on page D4-1772. 
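
As an informal illustration, which is not part of the architecture pseudocode, the reporting rules for an external abort on a translation table walk can be summarized in Python. The function name and the string values used for the access parameter are hypothetical.

```python
def report_walk_external_abort(synchronous, access):
    """Return how an external abort on a translation table walk is reported.

    access is one of "instruction_fetch", "data_access", or
    "address_translation_instruction" (hypothetical labels for the cause
    of the translation table walk).
    """
    if not synchronous:
        # Asynchronous external aborts are always reported using SError.
        return "SError interrupt"
    if access == "instruction_fetch":
        return "synchronous Instruction Abort"
    if access in ("data_access", "address_translation_instruction"):
        return "synchronous Data Abort"
    raise ValueError("unknown access type: " + access)
```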
Access flag fault 


An Access flag fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 
An Access flag fault is generated only if a translation table descriptor with the Access flag bit set to 0 is used. 


For more information about the Access flag bit, see VMSAv8-64 translation table format descriptors on 
page D4-1774. 


The architecture guarantees that any translation table entry that causes an Access flag fault is not cached, meaning 
the TLB never holds such an entry. Therefore, when an Access flag fault occurs, the fault handler does not have to 
execute any TLB maintenance instructions to remove the faulting entry. 


Whether any cache maintenance by VA instructions can generate Access flag faults is IMPLEMENTATION DEFINED. 


For more information, see The Access flag on page D4-1791. 


D4.6.2 The MMU fault-checking sequence 


This section describes the MMU checks made for the memory accesses required for instruction fetches and for 
explicit memory accesses: 


° If an instruction fetch faults it generates an Instruction Abort. 

° If a data memory access faults it generates a Data Abort. 

MMU fault checking is performed for each stage of address translation. 


The fault-checking sequence shows a translation from an Input address to an Output address. For more information 
about this terminology, see About address translation and supported input address ranges on page D4-1728. 


Note 


The descriptions in this section do not include the possibility that the attempted address translation generates a TLB 
conflict abort, as described in TLB conflict aborts on page D4-1814. 








Types of MMU faults on page D4-1801 describes the faults that an MMU fault-checking sequence can report. 


Figure D4-17 on page D4-1804 shows the process of fetching a descriptor from the translation table. For the 
top-level fetch for any translation, the descriptor is fetched only if the input address passes any required alignment 
check. As the figure shows, if the translation is stage 1 of the Non-secure EL1&0 translation regime, then the 
descriptor address is in the IPA address space, and is subject to a stage 2 translation to obtain the required PA. This 
stage 2 translation requires a recursive entry to the fault checking sequence. 
Figure D4-17 Fetching the descriptor in a VMSAv8-64 translation table walk 


Figure D4-18 on page D4-1805 shows the full VMSA fault checking sequence, including the alignment check on 
the initial access. 
Figure D4-18 VMSAv8-64 fault checking sequence 

Stage 2 fault on a stage 1 translation table walk 


On performing a translation table walk for the stage 1 translations, the descriptor addresses must be translated from 
IPA to PA, using a stage 2 translation. This means that a memory access made as part of a stage 1 translation table 
lookup might generate, on a stage 2 translation: 


° A Translation fault, Access flag fault, or Permission fault. 


° A synchronous external abort on the memory access. 


If SCR_EL3.EA is set to 1, a synchronous external abort is taken to EL3. Otherwise, these faults are reported as 
stage 2 memory aborts. ESR_EL2.ISS[7] is set to 1, to indicate a stage 2 fault during a stage 1 translation table walk, 
and the part of the ISS field that might contain details of the instruction is invalid. For more information see Use of 
the ESR_EL1, ESR_EL2, and ESR_EL3 on page D1-1523. 


Alternatively, a memory access made as part of a stage 1 translation table lookup might target an area of memory 
with the Device attribute assigned on the stage 2 translation of the address accessed. When the HCR_EL2.PTW bit 
is set to 1, such an access generates a stage 2 Permission fault. 





Note 


On most systems, such a mapping to Device memory on the stage 2 translation is likely to indicate a Guest OS error, 
where the stage 1 translation table is corrupted. Therefore, it is appropriate to trap this access to the hypervisor. 





A TLB might hold entries that depend on the effect of HCR_EL2.PTW. Therefore, if HCR_EL2.PTW is changed 
without changing the current VMID, the TLBs must be invalidated before executing in Non-secure EL1 or 
Non-secure EL0 state. For more information see Changing HCR_EL2.PTW on page D4-1828. 


A cache maintenance instruction executed at Non-secure EL1 can cause a stage 1 translation table walk that might 
generate a stage 2 Permission fault, as described in this section. This is an exception to the general rule that a cache 
maintenance instruction cannot generate a Permission fault. 
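
As an informal illustration, which is not part of the architecture pseudocode, the following Python sketch shows two of the checks described in this section: testing ESR_EL2.ISS[7] in a syndrome value, and the effect of HCR_EL2.PTW on a stage 1 translation table access that stage 2 maps as Device memory. The function names and the constant name are hypothetical, and the ISS value is assumed to have been extracted from ESR_EL2 already.

```python
S1PTW_BIT = 7  # ISS[7]: stage 2 fault during a stage 1 translation table walk


def is_stage2_fault_on_s1_walk(iss):
    """Return True if ISS[7] is set in the given ISS field value."""
    return ((iss >> S1PTW_BIT) & 1) == 1


def device_walk_generates_permission_fault(stage2_is_device, hcr_el2_ptw):
    """Return True if a stage 1 table walk access generates a stage 2 Permission fault.

    stage2_is_device: the stage 2 translation of the descriptor address
    assigns the Device attribute.
    hcr_el2_ptw: the value of the HCR_EL2.PTW bit, 0 or 1.
    """
    return stage2_is_device and hcr_el2_ptw == 1
```

A hypervisor fault handler might use the first check to decide whether the instruction-specific part of the ISS field can be relied on, because that part is invalid when ISS[7] is 1.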


The level associated with MMU faults 


For MMU faults, Table D4-44 shows how the LL bits in the ESR_ELx.STATUS fields encode the lookup level 
associated with the fault. 


Table D4-44 Use of LL bits to encode the lookup level at which the fault occurred 

LL bits    Meaning
00         Level 0 of translation or translation table base register. 
01         Level 1. 
10         Level 2. 
11         Level 3. When xFSR.STATUS indicates a Domain fault, this value is reserved. 





The lookup level associated with a fault is: 

° For a fault generated on a translation table walk, the lookup level of the walk being performed. 

° For a Translation fault, the lookup level of the translation table that gave the fault. If a fault occurs because 
a stage of address translation is disabled, or because the input address is outside the range specified by the 
appropriate base address register or registers, the fault is reported as a level 1 fault. 

° For an Access flag fault, the lookup level of the translation table that gave the fault. 

° For a Permission fault, including a Permission fault caused by hierarchical permissions, the lookup level of 
the final level of translation table accessed for the translation. That is, the lookup level of the translation table 
that returned a Block or Page descriptor. 

Also see Synchronous external aborts from address translation caching structures on page D4-1808. 
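
As an informal illustration, which is not part of the architecture pseudocode, the LL encoding in Table D4-44 can be decoded as follows. The sketch assumes that the LL bits occupy the two least significant bits of the fault status code. That placement is not stated in this section and should be checked against the ESR_ELx field descriptions. The names lookup_level and LEVEL_MEANING are hypothetical.

```python
# Meanings taken from Table D4-44.
LEVEL_MEANING = {
    0b00: "Level 0 of translation or translation table base register",
    0b01: "Level 1",
    0b10: "Level 2",
    0b11: "Level 3",
}


def lookup_level(status):
    """Return the LL value from a fault status code.

    Assumption: LL is held in bits[1:0] of the status code.
    """
    return status & 0b11
```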
D4.6.3 AArch64 state prioritization of synchronous aborts from a single stage of address translation 

Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 describes the prioritization 
of exceptions taken to an Exception level that is using AArch64. This section gives additional information about the 
prioritization of MMU faults from VMSAv8-64 translation regimes. 

Note 

The priority numbering in this list only shows the relative priorities of aborts from a single stage of address 
translation in a VMSAv8-64 translation regime. This numbering has no global significance and, for example, does 
not correlate with the equivalent AArch32 list in AArch32 state prioritization of synchronous aborts from a single 
stage of address translation on page G4-4120. 

For a single stage of translation in a VMSAv8-64 translation regime, the following numbered list shows the priority 
of the possible memory management faults on a memory access. In this list: 

° For memory accesses that undergo two stages of translation, the italic entries show where the faults from the 
stage 2 translation can occur. A stage 2 fault within a stage 1 translation table walk follows the same 
prioritization of faults. 

° For synchronous external aborts from translation table walks see also Synchronous external aborts from 
address translation caching structures on page D4-1808. 

The priority order, from highest priority to lowest priority, is: 

1. Alignment fault not caused by memory type. This is possible for a stage 1 translation only. 

2. Translation fault due to the input address being out of the address range to be translated or requiring a TTBR 
    that is disabled. This includes VTCR_EL2.SL0 being inconsistent with VTCR_EL2.T0SZ or programmed 
    to a reserved value. 

3. Address size fault on a TTBR caused by either: 
    ° The check on TCR_EL1.IPS, TCR_EL2.PS, TCR_EL3.PS, or VTCR_EL2.PS. 
    ° The PA being out of the range implemented. 

4. Second stage abort on a level 0 lookup of a stage 1 table walk. When stage 2 address translation is enabled 
    this includes an Address size fault caused by the PA being out of the range implemented. This is a second stage 
    abort during a first stage translation table walk. 

5. Synchronous parity or ECC error on a level 0 lookup of a translation table walk. 

6. Synchronous external abort on a level 0 lookup of a translation table walk. 

7. Translation fault on a level 0 translation table entry. 

8. Address size fault on a level 0 lookup translation table entry caused by either: 
    ° The check on TCR_EL1.IPS, TCR_EL2.PS, TCR_EL3.PS, or VTCR_EL2.PS. 
    ° The output address being out of the range implemented. 

9. Second stage abort on a level 1 lookup of a stage 1 table walk. When stage 2 address translation is enabled 
    this includes an Address size fault caused by the PA being out of the range implemented. This is a second stage 
    abort during a first stage translation table walk. 

10. Synchronous parity or ECC error on a level 1 lookup of a translation table walk. 

11. Synchronous external abort on a level 1 lookup of a translation table walk. 

12. Translation fault on a level 1 translation table entry. 

13. Address size fault on a level 1 lookup translation table entry caused by either: 
    ° The check on TCR_EL1.IPS, TCR_EL2.PS, TCR_EL3.PS, or VTCR_EL2.PS. 
    ° The output address being out of the range implemented. 

14. Second stage abort on a level 2 lookup of a stage 1 table walk. When stage 2 address translation is enabled 
    this includes an Address size fault caused by the PA being out of the range implemented. This is a second stage 
    abort during a first stage translation table walk. 

15. Synchronous parity or ECC error on a level 2 lookup of a translation table walk. 

16. Synchronous external abort on a level 2 lookup of a translation table walk. 

17. Translation fault on a level 2 translation table entry. 

18. Address size fault on a level 2 lookup translation table entry caused by either: 
    ° The check on TCR_EL1.IPS, TCR_EL2.PS, TCR_EL3.PS, or VTCR_EL2.PS. 
    ° The output address being out of the range implemented. 

19. Second stage abort on a level 3 lookup of a stage 1 table walk. When stage 2 address translation is enabled 
    this includes an Address size fault caused by the PA being out of the range implemented. This is a second stage 
    abort during a first stage translation table walk. 

20. Synchronous parity or ECC error on a level 3 lookup of a translation table walk. 

21. Synchronous external abort on a level 3 lookup of a translation table walk. 

22. Translation fault on a level 3 translation table entry. 

23. Address size fault on a level 3 lookup translation table entry caused by either: 
    ° The check on TCR_EL1.IPS, TCR_EL2.PS, TCR_EL3.PS, or VTCR_EL2.PS. 
    ° The output address being out of the range implemented. 

24. Access flag fault. 

25. Alignment fault caused by the memory type. 

26. Permission fault. 

27. A fault from the stage 2 translation of the memory access. When stage 2 address translation is enabled this 
    includes an Address size fault caused by the PA being out of the range implemented. 

28. Synchronous parity or ECC error on the memory access. 

29. Synchronous external abort on the memory access. 


Note 





The prioritization of TLB Conflict aborts is IMPLEMENTATION DEFINED, as the exact cause of these aborts 
depends on the form of TLBs implemented. However, the TLB conflict abort must have higher priority than 
any abort that depends on a value held in the TLB. 


The prioritization of IMPLEMENTATION DEFINED MMU faults for a Load-Exclusive or Store-Exclusive to an 
unsupported memory type is IMPLEMENTATION DEFINED. 
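
As an informal illustration, which is not part of the architecture pseudocode, the numbered priority list can be held as an ordered table from which the highest priority pending fault is selected. The fault labels below are hypothetical short names for the 29 list entries. TLB conflict aborts and IMPLEMENTATION DEFINED faults are deliberately left out, because their prioritization is IMPLEMENTATION DEFINED.

```python
def build_priority_order():
    """Build the 29-entry list, highest priority first, using hypothetical labels."""
    order = [
        "alignment_not_memory_type",     # 1
        "translation_input_range",       # 2
        "address_size_ttbr",             # 3
    ]
    for level in range(4):               # entries 4 to 23, five per lookup level
        order += [
            f"stage2_abort_walk_l{level}",
            f"parity_ecc_walk_l{level}",
            f"external_abort_walk_l{level}",
            f"translation_l{level}",
            f"address_size_l{level}",
        ]
    order += [
        "access_flag",                   # 24
        "alignment_memory_type",         # 25
        "permission",                    # 26
        "stage2_fault_access",           # 27
        "parity_ecc_access",             # 28
        "external_abort_access",         # 29
    ]
    return order


PRIORITY_ORDER = build_priority_order()


def highest_priority(faults):
    """Return the pending fault that appears earliest in the priority list."""
    return min(faults, key=PRIORITY_ORDER.index)
```

Calling highest_priority with an empty collection raises ValueError, which a caller would treat as no fault being pending.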





Synchronous external aborts from address translation caching structures 


A caching structure used for caching translation table walks might support: 

° An arbitrary number of levels of translation table lookup. 

° One or more stages of translation, that might not correspond to the stages of an address translation lookup. 

This might mean that, on a synchronous external abort arising from the caching structure, including parity or ECC 
errors, the PE cannot precisely determine one or both of the translation stage and level of lookup at which the error 
occurred. In this case: 

° If the PE cannot determine precisely the translation stage at which the error occurred, it is reported and 
prioritized as a stage 1 error. 

° If the PE cannot determine precisely the lookup level at which the error occurred, the level is reported and 
prioritized as either: 

— The lowest-numbered level that could have given rise to the error. 

— Level 0 if the PE cannot determine any information about the level. 
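
As an informal illustration, which is not part of the architecture pseudocode, the reporting rules for an imprecisely located error can be sketched in Python. The function name and both parameters are hypothetical. A stage of None models a PE that cannot determine the translation stage, and candidate_levels models the set of lookup levels that could have given rise to the error.

```python
def reported_stage_and_level(stage=None, candidate_levels=None):
    """Return the (stage, level) pair used for reporting and prioritization.

    stage: 1 or 2 if known precisely, otherwise None.
    candidate_levels: iterable of possible lookup levels. It holds a single
    value if the level is known precisely, and is empty or None if the PE
    has no information about the level.
    """
    reported_stage = stage if stage is not None else 1
    levels = list(candidate_levels) if candidate_levels else []
    reported_level = min(levels) if levels else 0
    return reported_stage, reported_level
```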


D4.6.4 Pseudocode description of the MMU faults 


The following functions generate fault records that describe MMU faults: 


AArch64.AccessFlagFault(). 
AArch64.AddressSizeFault(). 
AArch64.PermissionFault(). 
AArch64.TranslationFault(). 


Abort exceptions on page D3-1719 describes how fault records are used. 

D4.7 Translation Lookaside Buffers (TLBs) 


Translation Lookaside Buffers (TLBs) reduce the average cost of a memory access by caching the results of 
translation table walks. TLBs behave as caches of the translation table information, and the VMSA provides TLB 
maintenance instructions for the management of TLB contents. 


Note 


The ARM architecture permits TLBs to hold any translation table entry that does not directly cause a Translation 
fault, an Address size fault, or an Access flag fault. 








The following sections describe the architectural requirements for Translation Lookaside Buffers (TLBs) and their 
maintenance: 

° Use of ASIDs and VMIDs to reduce TLB maintenance requirements. 
° About ARMv8 Translation Lookaside Buffers (TLBs) on page D4-1811. 
° TLB maintenance requirements and the TLB maintenance instructions on page D4-1815. 


In these descriptions, TLB entries for a translation regime for a particular Exception level are out of context when 
executing at a higher Exception level. 


D4.7.1 Use of ASIDs and VMIDs to reduce TLB maintenance requirements 


To reduce the need for TLB maintenance on context switches, the lookups from some translation regimes can be 
associated with an ASID, or with an ASID and a VMID, as follows: 


ASIDs For stage 1 translations that support two VA ranges the VMSA can distinguish between Global 
pages and Process-specific pages. The Address Space Identifier (ASID) identifies pages associated 
with a specific process and provides a mechanism for changing process-specific tables without 
having to maintain the TLB structures. 


For these stage 1 translations, each of TTBR0_ELx and TTBR1_ELx has a valid ASID field, and 
TCR_ELx.A1 determines which of these holds the current ASID. 


Note 


The selected ASID applies regardless of which set of translation tables are used. For example, when 
the value of TCR_ELx.A1 is 0, any translation table lookup using this stage of translation is 
associated with the ASID from TTBR0_ELx.ASID, regardless of whether the translation lookup 
uses TTBR0_ELx or TTBR1_ELx. 





See also ASID size on page D4-1811 and Global and process-specific translation table entries on 
page D4-1812. 


For a symmetric multiprocessor cluster where a single operating system is running on the set of 
processing elements, the ARM architecture requires all ASID values to be assigned uniquely within 
any single Inner Shareable domain. In other words, each ASID value must have the same meaning 
to all processing elements in the system. 


VMIDs For the Non-secure EL1&0 translation regime, the virtual machine identifier (VMID) identifies the 
current virtual machine, with its own independent ASID space. The TLB entries include this VMID 
information, meaning TLBs do not require explicit invalidation when changing from one virtual 
machine to another, if the virtual machines have different VMIDs. 


VTTBR_EL2.VMID holds the current VMID. 
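
As an informal illustration, which is not part of the architecture pseudocode, the selection of the current ASID and its pairing with the current VMID can be sketched in Python. The function names and parameters are hypothetical, and the register field values are assumed to have been read already.

```python
def current_asid(tcr_a1, ttbr0_asid, ttbr1_asid):
    """Return the current ASID as selected by TCR_ELx.A1.

    The selected ASID applies to lookups through either TTBR0_ELx or
    TTBR1_ELx, so the table actually used for a lookup is not a parameter.
    """
    return ttbr1_asid if tcr_a1 == 1 else ttbr0_asid


def translation_context(vttbr_vmid, tcr_a1, ttbr0_asid, ttbr1_asid):
    """Return the (VMID, ASID) pair for the Non-secure EL1&0 translation regime."""
    return vttbr_vmid, current_asid(tcr_a1, ttbr0_asid, ttbr1_asid)
```

Two virtual machines that use the same ASID value still produce different pairs when their VMIDs differ, which is why changing from one virtual machine to another does not require explicit TLB invalidation.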

ASID size 


In VMSAv8-64, the ASID size is an IMPLEMENTATION DEFINED choice of 8 bits or 16 bits, and 
ID_AA64MMFR0_EL1.ASIDBits identifies the supported size. When an implementation supports a 16-bit ASID, 
TCR_ELx.AS selects whether the top 8 bits of the ASID are used. When the value of TCR_ELx.AS is 0, 
ASID[15:8]: 


° Are ignored by hardware for every purpose other than direct reads of TTBR0_ELx.ASID and 
TTBR1_ELx.ASID. 


° Are treated as if they are all zeros when used for allocating and matching entries in the TLB. 


Note 


VMSAv8-32 uses an 8-bit ASID. For backwards compatibility, when executing using translations controlled from 
an Exception level that is using AArch32, the ASID size remains at 8 bits. If the implementation supports 16-bit 
ASIDs, the 8-bit ASID used is zero-extended to 16 bits. 
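
As an informal illustration, which is not part of the architecture pseudocode, the ASID value used for allocating and matching TLB entries can be sketched in Python. The function name and parameters are hypothetical. The sketch treats an implementation without 16-bit ASID support the same as one where the value of TCR_ELx.AS is 0, which is an assumption about how the field behaves there.

```python
def effective_asid(asid_field, tcr_as, supports_16bit_asid):
    """Return the ASID value used for TLB allocation and matching.

    asid_field: the value of the selected TTBRn_ELx.ASID field.
    tcr_as: the value of TCR_ELx.AS, 0 or 1.
    supports_16bit_asid: whether the implementation supports a 16-bit ASID,
    as identified by ID_AA64MMFR0_EL1.ASIDBits.
    """
    if supports_16bit_asid and tcr_as == 1:
        return asid_field & 0xFFFF
    # ASID[15:8] is treated as all zeros for allocation and matching.
    return asid_field & 0x00FF
```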








D4.7.2 About ARMv8 Translation Lookaside Buffers (TLBs) 


Translation Lookaside Buffers (TLBs) are an implementation technique that caches translations or translation table 
entries. TLBs avoid the requirement for every memory access to perform a translation table walk in memory. The 
ARM architecture does not specify the exact form of the TLB structures for any design. In a similar way to the 
requirements for caches, the architecture only defines certain principles for TLBs: 


° The architecture has a concept of an entry locked down in the TLB. The method by which lockdown is 
achieved is IMPLEMENTATION DEFINED, and an implementation might not support lockdown. 


° The architecture does not guarantee that an unlocked TLB entry remains in the TLB. 


° The architecture guarantees that a locked TLB entry remains in the TLB. However, a locked TLB entry might 
be updated by subsequent updates to the translation tables. Therefore, when a change is made to the 
translation tables, the architecture does not guarantee that a locked TLB entry remains incoherent with an 
entry in the translation table. 


° The architecture guarantees that a translation table entry that generates a Translation fault, an Address size 
fault, or an Access flag fault is not held in the TLB. However a translation table entry that generates a 
Permission fault might be held in the TLB. 


° When address translation is enabled, any translation table entry that does not generate a Translation fault, an 
Address size fault, or an Access flag fault and is not from a translation regime for an Exception level that is 
lower than the current Exception level can be allocated to a TLB at any time. The only translation table entries 
guaranteed not to be held in a TLB are those that generate a Translation fault, an Address size fault, or an 
Access flag fault. 


Note 


A TLB can hold a translation table entry that does not itself generate a Translation fault but that points to a 
subsequent table in the translation table walk. This is referred to as intermediate caching of TLB entries. 








° Software can rely on the fact that between disabling and re-enabling a stage of address translation, entries in 
the TLB relating to that stage of translation have not been corrupted to give incorrect translations. 


The following sections give more information about TLB implementation: 

° Global and process-specific translation table entries on page D4-1812. 
° TLB matching on page D4-1812. 
° TLB behavior at reset on page D4-1813. 
° TLB lockdown on page D4-1813. 
° TLB conflict aborts on page D4-1814. 


See also TLB maintenance requirements and the TLB maintenance instructions on page D4-1815. 

Global and process-specific translation table entries 


In a VMSA implementation, system software can divide the virtual memory map used by a stage of translation that 
supports two VA ranges into global and non-global regions, indicated by the nG bit in the translation table 
descriptors: 

nG == 0    The translation is global, meaning the region is available for all processes. 

nG == 1    The translation is non-global, or process-specific, meaning it relates to the current ASID, as defined 
           by: 
           ° TTBR0_ELx.ASID, if the value of TCR_ELx.A1 is 0. 
           ° TTBR1_ELx.ASID, if the value of TCR_ELx.A1 is 1. 


As indicated by the nG field definitions, each non-global region has an associated ASID. These identifiers mean 
different translation table mappings can co-exist in a caching structure such as a TLB. This means that software can 
create a new mapping of a non-global memory region without removing previous mappings. 


Note 


° The selected ASID applies to the translation of any address for which the value of the nG bit is 1, regardless 
of whether the address is translated based on TTBR0_EL1 or on TTBR1_EL1. 





° In ARMv8.0, the only stage of translation that supports two VA ranges is stage 1 of the EL1&0 translation 
regime. 





ASIDs are supported only by stage 1 translations that support two VA ranges. Stage 2 translations, and stage 1 
translations that support only a single VA range do not support ASIDs, and all descriptors in these regimes are 
treated as global. 


In a translation regime that supports global and non-global translations, translation table entries from lookup levels 
other than the final level of lookup are treated as being non-global, regardless of the value of the nG bit. 


When a PE is using the VMSAv8-64 translation table format, and is in Secure state, a translation must be treated as 
non-global, regardless of the value of the nG bit, if NSTable is set to 1 at any level of the translation table walk. 


For more information see Control of Secure or Non-secure memory access on page D4-1782. 
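
The rules in this subsection can be summarized as a small decision function. The following is only an illustrative sketch: the `WalkResult` record and both function names are hypothetical and are not defined by the architecture. It assumes the descriptor comes from a stage 1 walk, and that the caller already knows whether the regime supports two VA ranges.

```python
from dataclasses import dataclass


@dataclass
class WalkResult:
    """Hypothetical summary of one descriptor fetched during a stage 1 walk."""
    ng_bit: int          # nG bit as written in the descriptor
    final_level: bool    # True if the descriptor is from the final lookup level
    secure: bool         # True if the PE is in Secure state
    nstable_seen: bool   # True if NSTable was 1 at any level of the walk
    two_va_ranges: bool  # True if the stage 1 regime supports two VA ranges


def current_asid(tcr_a1: int, ttbr0_asid: int, ttbr1_asid: int) -> int:
    """TCR_ELx.A1 selects TTBR0_ELx.ASID (A1 == 0) or TTBR1_ELx.ASID (A1 == 1)."""
    return ttbr1_asid if tcr_a1 else ttbr0_asid


def treated_as_non_global(w: WalkResult) -> bool:
    """Return True if the entry is treated as non-global."""
    if not w.two_va_ranges:
        return False  # no ASID support: every descriptor is treated as global
    if not w.final_level:
        return True   # non-final levels are non-global regardless of nG
    if w.secure and w.nstable_seen:
        return True   # Secure walk that saw NSTable == 1 at some level
    return w.ng_bit == 1
```

For example, a final-level descriptor with nG == 0 in a two-range regime is global, while the same nG value in a table descriptor from an earlier level is still treated as non-global.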


TLB matching 


A TLB is a hardware caching structure for translation table information. Like other hardware caching structures, it 
is mostly invisible to software. However, there are some situations where it can become visible. These are associated 
with coherency problems caused by an update to the translation table that has not been reflected in the TLB. Use of 
the TLB maintenance instructions described in TLB maintenance requirements and the TLB maintenance 
instructions on page D4-1815 can prevent any TLB incoherency becoming a problem. 


A particular case where the presence of the TLB can become visible is if the translation table entries that are in use 
under a particular ASID and VMID are changed without suitable invalidation of the TLB. This can occur only if the 
architecturally-required break-before-make sequence described in Using break-before-make when updating 
translation table entries on page D4-1816 is not used. If the break-before-make sequence is not used, the TLB can 
hold two mappings for the same address, and this: 

• Might generate an exception that is reported using the TLB conflict fault code, see TLB conflict aborts on 
page D4-1814. 

• Might lead to CONSTRAINED UNPREDICTABLE behavior. In this case, behavior will be consistent with one of 
the mappings held in the TLB, or with some amalgamation of the values held in the TLB, but cannot give 
access to regions of memory with permissions or attributes that could not be assigned by valid translation 
table entries in the translation regime being used for the access. For more information see CONSTRAINED 
UNPREDICTABLE behaviors due to caching of control or data values on page K1-5480. 
TLB behavior at reset 


The ARM architecture does not require a reset to invalidate the TLBs. The architecture recognizes that an 
implementation might require caches, including TLBs, to maintain their contents over a system reset. Possible 
reasons for doing so include power management and debug requirements. 


Therefore, for ARMv8: 
• All TLBs reset to an IMPLEMENTATION DEFINED state that might be UNKNOWN. 

• All TLBs are disabled from reset. All stages of address translation are disabled from reset, and the contents 
of the TLBs have no effect on address translation. For more information see Controlling address translation 
stages on page D4-1729. 

• An implementation can require the use of a specific TLB invalidation routine, to invalidate the TLB arrays 
before they are enabled after a reset. The exact form of this routine is IMPLEMENTATION DEFINED, but if an 
invalidation routine is required it must be documented clearly as part of the documentation of the device. 


ARM recommends that if an invalidation routine is required for this purpose, the routine is based on the TLB 
maintenance instructions described in TLB maintenance instructions on page D4-1817. 


Similar rules apply to cache behavior, see Behavior of caches at reset on page D3-1698. 


TLB lockdown 


The ARM architecture recognizes that any TLB lockdown scheme is heavily dependent on the microarchitecture, 
making it inappropriate to define a common mechanism across all implementations. This means that: 


• VMSAv8-64 does not require TLB lockdown support. 

• If TLB lockdown support is implemented, the lockdown mechanism is IMPLEMENTATION DEFINED. However, 
key properties of the interaction of lockdown with the architecture must be documented as part of the 
implementation documentation. 


This means that a region of the system instruction encoding space is reserved for IMPLEMENTATION DEFINED 
functions, see Reserved encodings for IMPLEMENTATION DEFINED registers on page C5-291. An 
implementation might use some of these encodings to implement TLB lockdown functions. These functions might 
include: 


• Unlock all locked TLB entries. 
• Preload into a specific level of TLB. This is beyond the scope of the PLI and PLD hint instructions. 


In an implementation that includes EL2, exceptions generated as a result of TLB lockdown when executing in 
Non-secure EL1 or Non-secure EL0 state can be routed to either: 

• Non-secure EL1, as a Data Abort exception. 

• Non-secure EL2, as a Hyp Trap exception. 

For more information, see Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM 
operations on page D1-1577. 
TLB conflict aborts 


If an address matches multiple entries in the TLB, it is IMPLEMENTATION DEFINED whether a TLB conflict abort is 
generated. 


Note 


An address can hit multiple entries in the TLB if the TLB has been invalidated inappropriately, for example if TLB 
invalidation required by the architecture has not been performed. 








An implementation can generate TLB conflict aborts on either or both instruction fetches and data accesses. A TLB 
conflict abort: 


• On an instruction fetch is reported as an Instruction abort, see ISS encoding for an exception from an 
Instruction Abort on page D7-1953. 

• On a data access is reported as a Data abort, see ISS encoding for an exception from a Data Abort on 
page D7-1955. 

ARMv8 defines the fault status encoding of 0b110000 for TLB conflict aborts. On a TLB conflict abort, the returned 
syndrome includes the address that generated the fault. That is, it includes the address that was being looked up in 
the TLB. 


It is IMPLEMENTATION DEFINED whether a TLB conflict abort is a stage 1 abort or a stage 2 abort. 


Note 
A stage 2 abort cannot be generated if stage 2 of the Non-secure EL1&0 translation regime is disabled. 








The priority of the TLB conflict abort is IMPLEMENTATION DEFINED, because it depends on the form of a TLB that 
can generate the abort. However, the TLB conflict abort must have higher priority than any abort that depends on a 
value held in the TLB. 


If an address matches multiple entries in the TLB and no TLB conflict abort is generated, the resulting behavior is 
CONSTRAINED UNPREDICTABLE, see CONSTRAINED UNPREDICTABLE behaviors due to caching of control or 
data values on page K1-5480. The CONSTRAINED UNPREDICTABLE behavior must not permit access to regions of 
memory with permissions or attributes that mean they cannot be accessed in the current Security state at the current 
Exception level. 
TLB maintenance requirements and the TLB maintenance instructions 


Translation Lookaside Buffers (TLBs) are an implementation mechanism that caches translations or translation 
table entries. The ARM architecture does not specify the form of any TLB structures, but defines the mechanisms 
by which TLBs can be maintained. The following sections describe the VMSA TLB maintenance instructions: 


° General TLB maintenance requirements. 

° TLB maintenance instructions on page D4-1817. 

See also: 

° Maintenance requirements on changing System register values on page D4-1828. 
° Atomicity of register changes on changing virtual machine on page D4-1735. 

















D4.8.1 General TLB maintenance requirements 
TLB maintenance instructions provide a mechanism for invalidating entries from TLB caching structures to ensure 
that changes to the translation tables are reflected correctly in those TLB caching structures. 
The architecture permits the caching of any translation table entry that has been returned from memory without a 
fault, provided that the entry does not, itself, cause a Translation fault, an Address size fault, or an Access Flag fault. 
This means that the entries that can be cached include: 
• Entries in translation tables that point to subsequent tables to be used in that stage of translation. 
• Stage 2 translation table entries used as part of a stage 1 translation table walk. 
• Stage 2 translation table entries used to translate the output address of the stage 1 translation. 
Such entries might be held in intermediate TLB caching structures that are used during a translation table walk and 
that are distinct from the data caches in that they are not required to be invalidated as the result of writes of the data. 
The architecture makes no restriction of the form of these intermediate TLB caching structures. 
The architecture does not intend to restrict the form of TLB caching structures used for holding translation table 
entries, and in particular for translation regimes that involve two stages of translation, it is recognized that such 
caching structures might contain: 
• Entries containing information from stage 1 translation table entries, at any level of the translation table walk. 
• Entries containing information from stage 2 translation table entries, at any level of the translation table walk. 
• Entries that combine information from stage 1 and stage 2 translation table entries, at any level of the 
translation table walk. 
Note 
For the purpose of TLB maintenance, the term TLB entry denotes any structure, including temporary working 
registers in translation table walk hardware, that holds a translation table entry. 
Where a TLB maintenance instruction is: 
• Required to apply to stage 1 entries, then it must apply to any cached entries in caching structures that include 
any stage 1 information that are used to translate the address being invalidated. 
Note 

— Where stage 1 information has been cached in multiple TLB entries, as could occur from splintering 
a page when caching in the TLB, then the invalidation must apply to each cached entry containing 
stage 1 information from the page that is used to translate the address being invalidated, regardless of 
whether or not that cached entry would be used to translate the address being invalidated. 

— As stated in Global and process-specific translation table entries on page D4-1812, translation table 
entries from levels of translation other than the final level are treated as being non-global. ARM 
expects that, in at least some implementations, cached copies of levels of the translation table walk 
other than the last level are tagged with their ASID, regardless of whether the final level is global. This 
means that TLB invalidations that involve the ASID require the ASID to match such entries to perform 
the required invalidation. 
• Required to apply to stage 2 entries only, then: 

— It is not required to apply to caching structures that combine stage 1 and stage 2 translation table 
entries. 

— It must apply to caching structures that contain information only from stage 2 translation table entries. 

• Required to apply to both stage 1 and stage 2 entries, then it must apply to any entry in the caching structures 
that includes information from either a stage 1 translation table entry or a stage 2 translation table entry, 
including any entry that combines information from both stage 1 and stage 2 translation table entries. 
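
As an informal illustration of the three cases above, the stages an instruction is required to apply to and the stages whose information a cached entry holds can both be modeled as sets. The function name and the set representation are assumptions made for this sketch only. The function reports where invalidation is required. An implementation is free to invalidate more.

```python
def must_invalidate(instr_stages: frozenset, entry_stages: frozenset) -> bool:
    """Return True if the instruction is required to apply to the cached entry.

    instr_stages: stages the instruction is required to apply to:
                  {1}, {2} (stage 2 only), or {1, 2}.
    entry_stages: stages whose information the cached entry contains:
                  {1}, {2}, or {1, 2} for a combined entry.
    """
    if instr_stages == frozenset({1}):
        # Any entry that includes stage 1 information, combined or not.
        return 1 in entry_stages
    if instr_stages == frozenset({2}):
        # Required only for entries holding stage 2 information alone.
        return entry_stages == frozenset({2})
    if instr_stages == frozenset({1, 2}):
        # Stage 1 only, stage 2 only, and combined entries.
        return bool(entry_stages & instr_stages)
    raise ValueError("instr_stages must be {1}, {2} or {1, 2}")
```

Note that `must_invalidate(frozenset({2}), frozenset({1, 2}))` is False: a stage 2 only instruction is not required to reach combined entries, which is why the IPAS2 instructions are normally paired with a stage 1 invalidation.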


Whenever translation table entries associated with a particular VMID or ASID are changed, the corresponding 
entries must be invalidated from the TLB to ensure that these changes are visible to subsequent execution, including 
speculative execution, that uses the changed translation table entries. 


Some System register field descriptions state that the effect of the field is permitted to be cached in a TLB. This 
means that all TLB entries that might be affected by a change of the field must be invalidated whenever that field 
is changed, to ensure that the effect of the change of that control field is visible to subsequent execution, including 
speculative execution, that uses that control field. This invalidation is required in addition to, and after, the normal 
synchronization of the System registers described in Synchronization requirements for AArch64 System registers on 
page D7-1889, and applies to any stage of address translation that is implemented for the translation regime, and 
VMID if appropriate, that is affected by that control field. A control field that is permitted to be cached in a TLB 
requires this maintenance even when all stages of address translation are disabled. 


In addition to any TLB maintenance requirement, when changing the cacheability attributes of an area of memory, 
software must ensure that any cached copies of affected locations are removed from the caches. For more 
information see Cache maintenance requirement created by changing translation table attributes on page D4-1832. 


Because a TLB never holds any translation table entry that generates a Translation fault, an Address size fault, or 
an Access Flag fault, a change from a translation table entry that causes a Translation, Address size, or Access flag 
fault to one that does not fault, does not require any TLB invalidation. 


Special considerations apply to translation table updates that change the memory type, cacheability, or output 
address of an entry, see Using break-before-make when updating translation table entries. 


Using break-before-make when updating translation table entries 


To avoid possibly creating multiple TLB entries for the same address, and to avoid the effects of TLB caching 
possibly breaking coherency, ordering guarantees or uniprocessor semantics, or possibly failing to clear the 
exclusive monitors, the architecture requires the use of a break-before-make sequence when changing translation 
table entries whenever multiple threads of execution can use the same translation tables and the change to the 
translation table entries involves any of: 


• A change of the memory type. 
• A change of the cacheability attributes. 

• A change of the output address (OA), if the OA of at least one of the old translation table entry and the new 
translation table entry is writable. 
• A change to the size of block used by the translation system. This applies both: 

— When changing from a smaller size to a larger size, for example by replacing a table mapping with a 
block mapping in a stage 2 translation table. 

— When changing from a larger size to a smaller size, for example by replacing a block mapping with a 
table mapping in a stage 2 translation table. 

• Creating a global entry when there might be non-global entries in a TLB that overlap with that global entry. 
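
The conditions in this list can be expressed as a predicate over the old and new mappings. This is a minimal sketch, not an architectural definition: the `Mapping` record, its field names, and the `nonglobal_overlap_possible` flag are hypothetical, and the sketch assumes that multiple threads of execution can use the same translation tables.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mapping:
    """Hypothetical view of the properties of one translation table entry."""
    memory_type: str      # for example "Normal" or "Device"
    cacheability: str     # for example "WB" or "NC"
    output_address: int
    writable: bool
    block_size: int       # size in bytes of the region mapped by the entry
    is_global: bool


def needs_break_before_make(old: Mapping, new: Mapping,
                            nonglobal_overlap_possible: bool = False) -> bool:
    """Return True if the change matches any condition in the list above."""
    if old.memory_type != new.memory_type:
        return True
    if old.cacheability != new.cacheability:
        return True
    if old.output_address != new.output_address and (old.writable or new.writable):
        return True
    if old.block_size != new.block_size:
        return True
    if new.is_global and nonglobal_overlap_possible:
        return True
    return False


# Example: moving a read-only page to a new OA does not match any condition,
# but the same move for a writable page does.
ro = Mapping("Normal", "WB", 0x8000_0000, False, 4096, False)
rw = replace(ro, writable=True)
moved_ro = replace(ro, output_address=0x9000_0000)
moved_rw = replace(rw, output_address=0x9000_0000)
```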
A break-before-make sequence on changing from an old translation table entry to a new translation table entry 
requires the following steps: 


1. Replace the old translation table entry with an invalid entry, and execute a DSB instruction. 


2. Invalidate the translation table entry with a broadcast TLB invalidation instruction, and execute a DSB 
instruction to ensure the completion of that invalidation. 


3. Write the new translation table entry, and execute a DSB instruction to ensure that the new entry is visible. 


This sequence ensures that at no time are both the old and new entries simultaneously visible to different threads of 
execution, and therefore the problems described at the start of this subsection cannot arise. 
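
The three steps can be illustrated with a toy software model. This is only a sketch under strong simplifying assumptions: the `ToyMMU` class is hypothetical, it models a single TLB as a dictionary, and its `dsb` and `tlbi_broadcast` methods merely record that the corresponding barrier or invalidation would be issued. It is not a model of real hardware ordering.

```python
class ToyMMU:
    """Toy model: one translation table and one TLB, both plain dicts."""

    def __init__(self):
        self.table = {}  # VA page -> output address; absent means invalid
        self.tlb = {}    # cached copies of translation table entries
        self.log = []    # ordered record of the steps performed

    def lookup(self, va):
        """Translate va, filling the TLB from the table on a miss."""
        if va not in self.tlb and va in self.table:
            self.tlb[va] = self.table[va]
        return self.tlb.get(va)

    def dsb(self):
        self.log.append("DSB")

    def tlbi_broadcast(self, va):
        self.tlb.pop(va, None)
        self.log.append("TLBI")

    def break_before_make(self, va, new_oa):
        # Step 1: replace the old entry with an invalid entry, then DSB.
        self.table.pop(va, None)
        self.log.append("WRITE_INVALID")
        self.dsb()
        # Step 2: broadcast TLB invalidation, then DSB to ensure completion.
        self.tlbi_broadcast(va)
        self.dsb()
        # Step 3: write the new entry, then DSB so that it is visible.
        self.table[va] = new_oa
        self.log.append("WRITE_NEW")
        self.dsb()
```

In this model, overwriting `table[va]` directly while a copy is cached leaves `lookup(va)` returning the old output address, which is the stale-mapping situation the sequence is designed to avoid.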





D4.8.2 TLB maintenance instructions 
The architecture defines TLB maintenance instructions, that provide the following: 
• Invalidate all entries in the TLB. 
• Invalidate a single TLB entry by ASID for a non-global entry. 
• Invalidate all TLB entries that match a specified ASID. 
• Invalidate all TLB entries that match a specified VA, regardless of the ASID. 
Each instruction can be specified as applying only to the PE that executes the instruction, or as applying to all PEs 
in the same Inner Shareable shareability domain as the PE that executes the instruction. 
The following subsubsections describe these instructions: 
• TLB maintenance instruction syntax. 
• Operation of the TLB maintenance instructions on page D4-1819. 
• Scope of the A64 TLB maintenance instructions on page D4-1820. 
• Invalidation of TLB entries from stage 2 translations on page D4-1823. 
• Broadcast TLB maintenance between AArch32 and AArch64 on page D4-1824. 
• Broadcast TLB maintenance with different translation granule sizes on page D4-1825. 
• Ordering and completion of TLB maintenance instructions on page D4-1826. 
• TLB maintenance in the event of TLB conflict on page D4-1826. 
• The interaction of TLB lockdown with TLB maintenance instructions on page D4-1827. 
TLB maintenance instructions on page C5-278 describes the encoding of the TLB maintenance instructions. 
TLB maintenance instruction syntax 
The A64 syntax for TLB maintenance instructions is: 
TLBI <operation>{, <Xt>} 
Where: 
<operation> Is one of ALLE1, ALLE2, ALLE3, ALLE1IS, ALLE2IS, ALLE3IS, VMALLE1, VMALLE1IS, VMALLS12E1, 
VMALLS12E1IS, ASIDE1, ASIDE1IS, VA{L}E1, VA{L}E2, VA{L}E3, VA{L}E1IS, VA{L}E2IS, VA{L}E3IS, 
VAA{L}E1, VAA{L}E1IS, IPAS2{L}E1, or IPAS2{L}E1IS. 
<operation> has a structure of <type><level>{IS} where: 
<type> Is one of: 

ALL All translations used at <level>. 
For the scope of ALL instructions see ALL on page D4-1820. 
The ALL instructions are valid for all values of <level>. 

VMALL All stage 1 translations used at <level> with the current VMID, if 
appropriate. 
For the scope of the VMALL instructions see VMALL on page D4-1820. 
The VMALL instructions are valid only when level == E1. 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. D4-1817 
ID092916 Non-Confidential 


D4 The AArch64 Virtual Memory System Architecture 
D4.8 TLB maintenance requirements and the TLB maintenance instructions 


VMALLS12 All stage 1 and stage 2 translations used at EL1 with the current VMID, if 
appropriate. 
For the scope of the VMALLS12 instructions see VMALLS12 on page D4-1821. 
The VMALLS12 instructions are valid only when level == E1. 

ASID All translations used at EL1 with the supplied ASID. 
For the scope of the ASID instructions see ASID on page D4-1821. 
The ASID instructions are valid only when level == E1. 

VA{L} Translations used at <level> for the specified address and, if appropriate, the 
specified ASID. 
For the scope of the VA instructions see VA on page D4-1821. For the scope 
of the VAL instructions see VAL on page D4-1821. 
The VA{L} instructions are valid for all values of <level>. 

VAA{L} Translations used at <level> for the specified address, for all ASID values, 
if appropriate. 
For the scope of the VAA instructions see VAA on page D4-1822. For the 
scope of the VAAL instructions see VAAL on page D4-1822. 
The VAA{L} instructions are valid only when level == E1. 

IPAS2{L} Translations used at <level> for the specified IPA that are held in stage 2 
only caching structures. 
For the scope of the IPAS2 instructions see IPAS2 on page D4-1822. For the 
scope of the IPAS2L instructions see IPAS2L on page D4-1822. 
The IPAS2{L} instructions are valid only when level == E1. 

In the VA{L}, VAA{L}, and IPAS2{L} types: 

L An optional parameter that indicates that the invalidation only applies to 
caching of entries returned from the final lookup level of the translation 
table walk. 

<level> Defines the Exception level of the translation regime that the invalidation applies to. Is 
one of: 

E1 EL1. 
E2 EL2. 
E3 EL3. 

An instruction that applies to the translation regime of an Exception level higher than 
the Exception level at which the instruction is executed is UNDEFINED. 

TLBI ALLE1{IS}, TLBI IPAS2{L}E1{IS} and TLBI VMALLS12E1{IS} are UNDEFINED at EL1. 

Note 
All TLB maintenance instructions are UNDEFINED at EL0. 

IS When present, it indicates that the function applies to all TLBs in the Inner Shareable 
shareability domain. 


<Xt> Passes one or both of an address and an ASID as an argument, where required. <Xt> is required for 
the TLB ASID, TLB VA{L}, TLB VAA{L}, and TLB IPAS2{L} instructions. 

If EL2 is not implemented, the TLBI VA{L}E2, TLBI VA{L}E2IS, TLBI ALLE2, and TLBI ALLE2IS instructions are 
UNDEFINED. 
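
The <type><level>{IS} structure and the validity rules above can be checked mechanically. The following sketch is illustrative only: the table, the regular expression, and the function names are assumptions introduced here, and `is_undefined` covers only the rules stated in this subsection (it does not model SCR_EL3.NS or any trap controls).

```python
import re

# Exception levels for which each <type> is valid, as listed above.
VALID_LEVELS = {
    "ALL": (1, 2, 3),
    "VMALL": (1,),
    "VMALLS12": (1,),
    "ASID": (1,),
    "VA": (1, 2, 3),
    "VAL": (1, 2, 3),
    "VAA": (1,),
    "VAAL": (1,),
    "IPAS2": (1,),
    "IPAS2L": (1,),
}

_OP_RE = re.compile(
    r"^(VMALLS12|VMALL|ALL|ASID|VAAL|VAA|VAL|VA|IPAS2L|IPAS2)E([123])(IS)?$"
)


def parse_tlbi_operation(op: str):
    """Split an <operation> name into (type, level, inner_shareable)."""
    m = _OP_RE.match(op)
    if m is None:
        raise ValueError(f"not a TLBI operation name: {op}")
    op_type, level, inner = m.group(1), int(m.group(2)), m.group(3) is not None
    if level not in VALID_LEVELS[op_type]:
        raise ValueError(f"{op_type} is not valid for E{level}")
    return op_type, level, inner


def all_operations():
    """Enumerate every valid <operation> name."""
    return [f"{t}E{lvl}{s}"
            for t, levels in VALID_LEVELS.items()
            for lvl in levels
            for s in ("", "IS")]


def is_undefined(op: str, current_el: int, el2_implemented: bool = True) -> bool:
    """Apply the UNDEFINED rules stated in this subsection."""
    op_type, level, _ = parse_tlbi_operation(op)
    if current_el == 0:
        return True                      # all TLBI instructions at EL0
    if level > current_el:
        return True                      # targets a higher Exception level
    if level == 2 and not el2_implemented:
        return True                      # E2 operations without EL2
    if current_el == 1 and op_type in ("ALL", "IPAS2", "IPAS2L", "VMALLS12"):
        return True                      # ALLE1, IPAS2{L}E1, VMALLS12E1 at EL1
    return False
```

Enumerating the table yields the same set of names as the <operation> list given in the syntax description.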


VMSAv8-64 TLB maintenance instructions that take a register argument that holds a VA, an ASID, or both, use the 
following register argument format: 


Bits[63:48] ASID. These bits are RES0 if the instruction does not require an ASID argument. 

Bits[47:44] RES0. 

Bits[43:0] VA[55:12]. For an instruction that requires a VA argument, the treatment of the low-order bits of 
this field depends on the translation granule size, as follows: 

4KB granule size All bits are valid and used for the invalidation. 

16KB granule size Bits[1:0] are RES0 and ignored when the instruction is executed, because 
VA[13:12] have no effect on the operation of the instruction. 

64KB granule size Bits[3:0] are RES0 and ignored when the instruction is executed, because 
VA[15:12] have no effect on the operation of the instruction. 

These bits are RES0 if the instruction does not require a VA argument. 


For TLB maintenance instructions that take an address argument, hardware interprets VA[63:56] as each having the 
same value as VA[55]. 


If a TLB maintenance instruction targets a translation regime that is using AArch32, meaning the VA is only 32-bit, 
then software must treat VA[55:32] as RES0, and these bits are ignored when the instruction is executed. 


If the implementation supports 16 bits of ASID then the upper 8 bits of the ASID are RES0 when the context being 
invalidated only uses 8 bits. 
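
The register argument format for a VA, an ASID, or both can be assembled as follows. This is a sketch under stated assumptions: the function name, the granule strings, and the choice to write the RES0 low-order bits as zero are conventions of this example only.

```python
# Number of low-order bits of the VA[55:12] field that are RES0 per granule.
RES0_LOW_BITS = {"4KB": 0, "16KB": 2, "64KB": 4}


def tlbi_va_operand(va: int, asid: int = 0, granule: str = "4KB") -> int:
    """Build the 64-bit <Xt> value for a TLBI instruction that takes a VA.

    Bits[63:48] hold the ASID, bits[47:44] are RES0, and bits[43:0] hold
    VA[55:12]. Pass asid=0 for instructions that do not take an ASID.
    """
    if not 0 <= asid < (1 << 16):
        raise ValueError("ASID must fit in 16 bits")
    field = (va >> 12) & ((1 << 44) - 1)              # VA[55:12]
    field &= ~((1 << RES0_LOW_BITS[granule]) - 1)     # clear RES0 low bits
    return (asid << 48) | field
```

For example, `tlbi_va_operand(0x12345000, asid=0x42)` gives `0x0042000000012345`. VA[63:56] are discarded by the mask, which is consistent with hardware interpreting those bits as copies of VA[55].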


VMSAv8-64 TLB maintenance instructions that take a register argument that holds an IPA, use the following 
register argument format: 


Bits[63:36] RES0. 

Bits[35:0] IPA[47:12]. For an instruction that requires an IPA argument, the treatment of the low-order bits of 
this field depends on the translation granule size, as follows: 

4KB granule size All bits are valid and used for the invalidation. 

16KB granule size Bits[1:0] are RES0 and ignored when the instruction is executed, because 
IPA[13:12] have no effect on the operation of the instruction. 

64KB granule size Bits[3:0] are RES0 and ignored when the instruction is executed, because 
IPA[15:12] have no effect on the operation of the instruction. 
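
The IPA argument format can be built in the same way. As before, this is only a sketch: the function name and the granule strings are assumptions of the example, and RES0 bits are written as zero.

```python
# Number of low-order bits of the IPA[47:12] field that are RES0 per granule.
IPA_RES0_LOW_BITS = {"4KB": 0, "16KB": 2, "64KB": 4}


def tlbi_ipa_operand(ipa: int, granule: str = "4KB") -> int:
    """Build the 64-bit <Xt> value for a TLBI IPAS2{L}E1{IS} instruction.

    Bits[63:36] are RES0 and bits[35:0] hold IPA[47:12].
    """
    field = (ipa >> 12) & ((1 << 36) - 1)                 # IPA[47:12]
    field &= ~((1 << IPA_RES0_LOW_BITS[granule]) - 1)     # clear RES0 low bits
    return field
```

For example, `tlbi_ipa_operand(0x80000000)` gives `0x80000`, and any IPA bits at or above bit 48 are dropped by the mask.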


Operation of the TLB maintenance instructions 


Any TLB maintenance instruction can affect any TLB entries that are not locked down. 


The TLB maintenance instructions specify the Exception level of the translation regime to which they apply. 


Note 


Because there is no guarantee that an unlocked TLB entry remains in the cache, architecturally it is not possible to 
tell whether a TLB maintenance instruction has affected any TLB entries that were not specified by the instruction. 








If a TLB maintenance instruction specifies a VA, and a data or instruction access to that VA would generate an MMU 
abort, the TLB maintenance instruction does not generate an abort. VAs for which a TLB maintenance instruction 
does not generate an abort include VAs that are not in the range of VAs that can be translated. 


When EL3 is implemented: 


• The TLB maintenance instructions that apply to the EL1&0 translation regime take account of the current 
Security state, as part of the address translation required for the TLB operation. 
• SCR_EL3.NS modifies the effect of the TLB maintenance instructions as follows: 

— For instructions that apply to the EL1&0 translation regime, the SCR_EL3.NS bit identifies whether 
the maintenance instructions apply to the Secure or Non-secure EL1&0 translation regime. 

Note 
If EL3 is not implemented, then there is only a single EL1&0 translation regime. 

— For instructions that apply to the EL2 translation regime, the SCR_EL3.NS bit must be 1 or the 
instruction is UNDEFINED. 

— For instructions that apply to the EL3 translation regime, the SCR_EL3.NS bit has no effect. 
Note 

• An address-based TLB maintenance instruction that applies to the Inner Shareable domain does so regardless 
of the Shareability attributes of the address supplied as an argument to the instruction. 

• Previous versions of the ARM architecture included TLB maintenance instructions that operated only on 
instruction TLBs, or only on data TLBs. From the introduction of ARMv7, ARM deprecated any use of these 
instructions. In ARMv8: 

— AArch64 state does not include any of these instructions. 

— AArch32 state includes some of these instructions, but ARM deprecates their use. 





The ARM architecture does not dictate the form in which the TLB stores translation table entries. However, when 
a TLB maintenance instruction is executed, the minimum size of the table entry that is invalidated from the TLB 
must be at least the size that appears in the translation table entry. 


Note 


The Contiguous bit does not affect the minimum size of entry that must be invalidated from the TLB. 








Scope of the A64 TLB maintenance instructions 
The TLB invalidation instruction <type> affects the different possible cached entries in the TLB as follows: 


ALL The invalidation applies to all cached copies of the stage 1 and stage 2 translation table entries from 
any level of the translation table walk required to translate any address at the specified Exception 
level, that would be used with the state specified by SCR_EL3.NS. 


For entries from the Non-secure EL1&0 translation regime, ALL applies to entries with any VMID. 


For entries from the EL1&0 translation regimes, the invalidation applies to: 


• All entries above the final level of lookup. 
• All entries at the final level of lookup. 

Note 
This means the invalidation applies to both: 
— Global entries. 
— Non-global entries with any ASID. 





VMALL The invalidation applies to all cached copies of the stage 1 translation table entries, from any level 
of the translation table walk required to translate any address at the specified Exception level, that 
would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies to: 

• All entries above the final level of lookup. 
• All entries at the final level of lookup. 

Note 
This means the invalidation applies to both: 
— Global entries. 
— Non-global entries with any ASID. 





VMALL is valid only for EL1. 
VMALLS12 The invalidation applies to all cached copies of the stage 1 and stage 2 translation table entries from 
any level of the translation table walk required to translate any address at the specified Exception 
level, that would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies to: 

• All entries above the final level of lookup. 
• All entries at the final level of lookup. 

Note 
This means the invalidation applies to both: 
— Global entries. 
— Non-global entries with any ASID. 





VMALLS12 is valid only for EL1. 


If EL2 is not implemented, or if the TLBI VMALLS12 instruction is executed when the value of 
SCR_EL3.NS is 0, the instruction is not UNDEFINED but it has the same effect as TLBI VMALL. This is 
because there are no stage 2 translations to invalidate. 


ASID The invalidation applies to all cached copies of the stage 1 translation table entries from any level 
of the translation table walk required to translate any address at the specified Exception level, that 
would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies only if either: 

• The entry is from a level of lookup above the final level and matches the specified ASID. 
• The entry is a non-global entry from the final level of lookup and matches the specified 
ASID. 


ASID is valid only for EL1. 


VA The invalidation applies to all cached copies of the stage 1 translation table entries from any level 
of the translation table walk required to translate the address specified in the invalidation instruction 
at the specified Exception level that would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies only if one of the following applies: 

• The entry is from a level of lookup above the final level and matches the specified ASID. 
• The entry is a global entry from the final level of lookup. 
• The entry is a non-global entry from the final level of lookup that matches the specified 
ASID. 

VAL The invalidation applies to all cached copies of the stage 1 translation table entry from the final level 
of the translation table walk required to translate the address specified in the invalidation instruction 
at the specified Exception level, that would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies only if either: 

• The entry is a global entry from the final level of lookup. 
• The entry is a non-global entry from the final level of lookup that matches the specified 
ASID. 

VAA The invalidation applies to all cached copies of the stage 1 translation table entries from any level 
of the translation table walk required to translate the address specified in the invalidation instruction 
at the specified Exception level that would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies to all of: 

• All entries above the final level of lookup. 
• All entries at the final level of lookup. 

Note 
This means the invalidation applies to both: 
— Global entries. 
— Non-global entries with any ASID. 





VAAL The invalidation applies to all cached copies of the stage 1 translation table entry from the final level 
of the translation table walk required to translate the address specified in the invalidation instruction 
at the specified Exception level that would be used with all of: 

• The Security state specified by SCR_EL3.NS. 
• For the Non-secure EL1&0 translation regime, the current VMID. 

For entries from the EL1&0 translation regimes that meet the other specified conditions, the 
invalidation applies to all entries at the final level of lookup. 

Note 
This means the invalidation applies to both: 
— Global entries. 
— Non-global entries with any ASID. 





IPAS2 The invalidation applies to all cached copies of the stage 2 translation table entries from any level 
of the translation table walk required to translate the specified IPA, that both: 

• Are held in TLB caching structures holding stage 2 only entries. 
• Would be used with the current VMID. 

It is not required that this instruction invalidates TLB caching structures holding entries that 
combine stage 1 and stage 2 of the translation. 

The only translation regime to which this instruction can apply is the Non-secure EL1&0 translation 
regime. 

When executed with the SCR_EL3.NS==0, or in an implementation that does not implement EL2, 
this instruction is a NOP. 

For more information about the architectural requirements for the IPAS2 instruction see Invalidation 
of TLB entries from stage 2 translations on page D4-1823. 


IPAS2L The invalidation applies to cached copies of the stage 2 translation table entry from the final level 
of the stage 2 translation table walk required to translate the specified IPA, that both: 


• Are held in TLB caching structures holding stage 2 only entries.
• Would be used with the current VMID.


It is not required that this instruction invalidates TLB caching structures holding entries that 
combine stage 1 and stage 2 of the translation. 







The only translation regime to which this instruction can apply is the Non-secure EL1&0 translation 
regime. 


When executed with SCR_EL3.NS==0, or in an implementation that does not implement EL2,
this instruction is a NOP.


For more information about the architectural requirements for the IPAS2L instruction see 
Invalidation of TLB entries from stage 2 translations. 


The entries that the invalidations apply to are not affected by the state of any other control bits involved in the 
translation process. Therefore, the following is a non-exhaustive list of control bits that do not affect how a TLB 
maintenance instruction updates the TLB entries: 


In AArch64 state 


SCTLR_EL1.M, SCTLR_EL2.M, SCTLR_EL3.{M, RW}, HCR_EL2.{VM, RW},
TCR_EL1.{TG1, EPD1, T1SZ, TG0, EPD0, T0SZ, AS, A1}, TCR_EL2.{TG0, T0SZ},
TCR_EL3.{TG0, T0SZ}, VTCR_EL2.{SL0, T0SZ}, TTBR0_EL1.ASID, TTBR1_EL1.ASID.


In AArch32 state 


SCTLR.M, HCR.VM, TTBCR.{EAE, PD1, PD0, N, EPD1, T1SZ, EPD0, T0SZ, A1},
HTCR.T0SZ, VTCR.{SL0, T0SZ}, TTBR0.ASID, TTBR1.ASID, CONTEXTIDR.ASID.


Note 


• ARM expects most TLB maintenance performed by an operating system to occur to the last level entries of
the stage 1 translation table walks. The purpose of the address-based TLB invalidation instructions where
the invalidation need only apply to caching of entries returned from the last level of translation table walk of
stage 1 translation is to avoid unnecessary loss of the intermediate caching of the translation table entries.
Similarly, for stage 2 translations ARM expects that most TLB maintenance performed by a hypervisor for a
given Guest operating system will affect only the last level entries of the stage 2 translations. Therefore,
similar capability is provided for instructions that invalidate single stage 2 entries.

• The architecture permits the invalidation of entries in TLB caching structures at any time, so for each of these
instructions the definition is in terms of the minimum set of entries that must be invalidated from TLB
caching structures, and an implementation might choose to invalidate more entries. In general, for best
performance, ARM recommends not invalidating entries that are not required to be invalidated.

• Dependencies on the VMID for the Non-secure EL1&0 translation regime apply even when HCR_EL2.VM
is set to 0. Because the architecture does not require the VTTBR_EL2.VMID field to be reset in hardware,
the reset routine of each active PE must initialize VTTBR_EL2.VMID[7:0] to a common value such as 0,
even if stage 2 translation is not in use.





Invalidation of TLB entries from stage 2 translations 
The architectural requirements of the IPAS2 instruction are that:

1. The following code is sufficient to invalidate all cached copies of the stage 2 translation of the IPA held in Xt
for the current VMID, with the corresponding requirement for the broadcast versions of the instructions:

    TLBI IPAS2E1, Xt
    DSB
    TLBI VMALLE1

2. The following code is sufficient to invalidate all cached copies of the stage 2 translations of the IPA held in
Xt used to translate the VA (and the specified ASID when executing TLBI VAE1) held in Xt2, with the
corresponding requirement for the broadcast versions of the instructions:

    TLBI IPAS2E1, Xt
    DSB
    TLBI VAE1, Xt2 ; or TLBI VAAE1, Xt2

3. The following code is sufficient to invalidate all cached copies of the stage 2 translations of the IPA held in
Xt used to translate the IPA produced by the last level of stage 1 translation table lookup for the VA (and ASID
when executing TLBI VALE1) held in Xt2, with the corresponding requirement for the broadcast versions of the
instructions:

    TLBI IPAS2E1, Xt
    DSB
    TLBI VALE1, Xt2 ; or TLBI VAALE1, Xt2

Note

Depending on the invalidation required, software must use the entire sequence 1, 2, or 3, even when Non-secure
EL1&0 stage 1 translation is disabled.





Equivalent architectural requirements apply to the IPAS2L instruction, except that the only TLB entries that must be 
invalidated by an IPAS2L instruction are those that come from the final level of the translation table lookup. 


Broadcast TLB maintenance between AArch32 and AArch64 


In most cases, a TLB maintenance instruction affecting the Inner Shareable shareability domain executed by a PE
in an Exception level that is using AArch64 also affects any other PE in the same Inner Shareable domain that is
executing at the same Exception level and is using AArch32, provided that the address, ASID, and VMID
matching requirements of the original instruction are met, as specified in Scope of the A64 TLB
maintenance instructions on page D4-1820.


Note 
The requirement to match means that the invalidation only occurs on the PE that is using AArch32 if, for the PE
that executed the TLB maintenance instruction at an Exception level that is using AArch64, both of the following
apply:
• If VA matching is required, the VA is 0x0000FFFFFFFF or lower.
• If ASID matching is required and the PE is using a 16-bit ASID, then the top 8 bits of the ASID are zero.
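
The two conditions in this Note can be read as a simple predicate. The following Python sketch is illustrative only: the function name and parameters are hypothetical, and it models just the precondition described here, not the full address, ASID, and VMID matching that the recipient PE performs.

```python
# Highest VA that a PE using AArch32 can match (32-bit address space).
AARCH32_VA_LIMIT = 0xFFFF_FFFF


def can_match_on_aarch32_pe(va=None, asid=None, asid_is_16_bit=False):
    """Return True if a broadcast A64 TLBI can match on a PE using AArch32.

    va / asid: pass None when that kind of matching is not required.
    asid_is_16_bit: whether the issuing PE is using a 16-bit ASID.
    """
    # VA matching: the VA must lie in the range addressable from AArch32.
    if va is not None and va > AARCH32_VA_LIMIT:
        return False
    # ASID matching: with a 16-bit ASID, the top 8 bits must be zero.
    if asid is not None and asid_is_16_bit and (asid >> 8) != 0:
        return False
    return True
```

For example, under these assumptions a VA of 0x1_0000_0000 can never match on the AArch32 PE, whatever the ASID.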








Except for the cases identified here, a TLB maintenance instruction affecting the Inner Shareable shareability
domain executed by a PE in an Exception level that is using AArch32 also affects any other PE in the same Inner
Shareable domain that is executing at the same Exception level and is using AArch64, provided that the address,
ASID, and VMID matching requirements of the original instruction are met, as specified in Scope of the A64 TLB
maintenance instructions on page D4-1820. In addition, for the instruction executed in AArch32 state:

• For a TLBIMVAAIS, TLBIMVAALIS, TLBIMVAHIS, TLBIMVAIS, TLBIMVALHIS, or TLBIMVALIS instruction, the VA supplied
as an argument is zero-extended.

• For a TLBIIPAS2IS or TLBIIPAS2LIS instruction, the IPA supplied as an argument is zero-extended.

• For a TLBIASIDIS, TLBIMVAIS, or TLBIMVALIS instruction, the ASID supplied as an argument is zero-extended if
the PE executing in AArch64 state is using a 16-bit ASID.


The VA from the instruction executed in AArch32 state is zero-extended, and the ASID is zero-extended if the PE 
executing in AArch64 state is using a 16-bit ASID. 







The exceptions to these general rules are as follows: 


1. An ARMv7 PE in the same Inner Shareable domain is treated in the same way as an ARMv8 PE for which
EL3 is using AArch32, except that if an ARMv8 PE issues a broadcast instruction that is not defined in
ARMv7, then that instruction is not required to have an effect on the TLBs of the ARMv7 PE. The
instructions that do not exist in ARMv7 include the following TLB maintenance instructions that ARMv8
adds to the T32 and A32 instruction sets:

• The following instructions that operate on TLB entries for the final level of translation table walk for
stage 1 translations:
TLBIMVALIS, TLBIMVAALIS, TLBIMVALHIS, TLBIMVAL, TLBIMVAAL, and TLBIMVALH.

• The following instructions that operate by IPA on TLB entries for stage 2 translations:
TLBIIPAS2IS, TLBIIPAS2LIS, TLBIIPAS2, and TLBIIPAS2L.

2. The number of Exception levels in Secure state depends on whether EL3 is using AArch32 or EL3 is using
AArch64. This means that, within the Inner Shareable domain, there might be PEs with different numbers of
Exception levels in Secure state. Therefore, the following exceptions are made to the general rules:

• If a PE with EL3 using AArch32 issues a broadcast AArch32 TLB maintenance instruction affecting
Secure entries, and the Inner Shareable domain also contains PEs with EL3 using AArch64, then the
architecture does not require that the broadcast AArch32 TLB maintenance instruction has any effect
on either:

— The EL3 translation regime of the PEs with EL3 using AArch64.

— The Secure EL1 translation regime of the PEs with EL3 using AArch64, regardless of whether
the Secure EL1 translation regime is using AArch64 or AArch32.

• If a PE with EL3 using AArch64 issues a broadcast AArch64 TLB maintenance instruction affecting
EL3 entries, and the Inner Shareable domain also contains PEs with EL3 using AArch32, then the
architecture does not require that the broadcast AArch64 TLB maintenance instruction has any effect
on the EL3 translation regime of the PEs with EL3 using AArch32.

• If a PE with EL3 using AArch64 issues a broadcast AArch64 TLB maintenance instruction affecting
Secure EL1 entries, and the Inner Shareable domain also contains PEs with EL3 using AArch32, then
the architecture does not require that the broadcast AArch64 TLB maintenance instruction has any
effect on the EL3 translation regime of the PEs with EL3 using AArch32.


Note 





While the exceptions to the general rule mean the architecture does not require the specified TLB invalidations, the 
architecture also does not require that entries in the TLB remain in the TLB at any time, and so it is permissible that 
such broadcast instructions affect these translation regimes. 





Broadcast TLB maintenance with different translation granule sizes 


In the following cases, a broadcast TLB maintenance instruction is not required to perform any invalidation on the 
recipient PE: 


• The TLB maintenance instruction specifying a VA and affecting the EL2 translation regime or the EL3
translation regime is broadcast from a PE using one translation granule size for that translation regime to a
PE using a different translation granule size for that same translation regime.

• The TLB maintenance instruction specifying a VA and affecting the EL1 translation regime is broadcast from
a PE using one stage 1 translation granule size for that translation regime for a particular ASID (if applicable),
VMID (if applicable), and Security state, to a PE where EL1 for the same ASID (if applicable), VMID (if
applicable), and Security state, is using a different stage 1 translation granule size.

• The TLB maintenance instruction specifying a VA and affecting the Non-secure EL1 translation regime is
broadcast from a PE using one stage 2 translation granule size for a particular ASID (if applicable) and
VMID, to a PE where EL1 for the same ASID (if applicable) and VMID is using a different stage 2 translation
granule size.

• The TLB maintenance instruction specifying an IPA and affecting the Non-secure EL1 translation regime is
broadcast from a PE using one stage 2 translation granule size for a particular VMID to a PE where EL1 for
the same VMID is using a different stage 2 translation granule size.

Ordering and completion of TLB maintenance instructions 


For AArch64 execution, a TLB maintenance instruction can be executed in any order relative to: 


• Any load or store instruction, unless a DSB is executed between the load or store and the TLB maintenance
instruction. 





Note 


In the ARM architecture, a translation table walk is considered to be a separate observer, and a store to 
translation tables can be observed by that separate observer at any time after the instruction has been 
executed, but is only guaranteed to be observable after the execution of a DSB instruction by the PE that 
executed the store to the translation tables. 





• Another TLB maintenance instruction, unless a DSB is executed between the instructions.
• A data or instruction cache maintenance instruction, unless a DSB is executed between the instructions.

For AArch64 execution, the completion rules are:


• A TLB invalidate instruction is complete when all memory accesses using the TLB entries that have been
invalidated have been observed by all observers, to the extent that those accesses must be observed. The 
shareability and cacheability of the accessed memory locations determine the extent to which the accesses 
must be observed. 


Note 


For TLB maintenance instructions that affect other PEs, the memory accesses from those PEs that used the 
TLB entries that have been invalidated are included in the set of memory accesses that must have been 
observed when the TLB maintenance instruction is complete. 








After the TLB invalidate instruction is complete, no new memory accesses using the invalidated TLB entries 
will be observed by those observers. 


Note 


This requirement does not mean that speculative memory accesses cannot be performed using those entries 
if it is impossible to tell that those memory accesses have been observed by the observers. 








• A TLB maintenance instruction can complete at any time after it is issued, but is only guaranteed to be
complete after the execution of DSB by the PE that executed the TLB maintenance instruction. 


• The effects of a completed TLB maintenance instruction are only guaranteed to be visible on the PE that
executed the instruction after the execution of a Context synchronization event by the PE that executed the 
TLB maintenance instruction. 


In all cases in this section where a DMB or DSB is referred to, it refers to a DMB or DSB whose required access type is 
both loads and stores. A DSB NSH is sufficient to ensure completion of TLB maintenance instructions that apply to a 
single PE. A DSB ISH is sufficient to ensure completion of TLB maintenance instructions that apply to PEs in the 
same Inner Shareable domain. 


TLB maintenance in the event of TLB conflict 


In the event that multiple entries in the TLB are being used to translate a given address (which implies that an 
attempt to access the given address might give rise to a TLB Conflict abort), it is IMPLEMENTATION DEFINED as to 
the form of TLB maintenance operation that the software must perform in order to be guaranteed that all TLB entries 
associated with the given address and translation regime have been invalidated. In all cases, an ALL or VMALL form of
TLB maintenance operation that targets the given translation regime is guaranteed to remove all entries within that 
regime, even if there are multiple, conflicting TLB entries for any given address within that regime. 


The interaction of TLB lockdown with TLB maintenance instructions 


The precise interaction of TLB lockdown with the TLB maintenance instructions is IMPLEMENTATION DEFINED. 
However, the architecturally-defined TLB maintenance instructions must comply with these rules: 


• The effect on a locked TLB entry of a TLB invalidate all operation that would invalidate that entry if the entry
was not locked is IMPLEMENTATION DEFINED. However, the instruction operation must be implemented as
one of the following options:

— The operation has no effect on entries that are locked down.

— The operation generates an IMPLEMENTATION DEFINED Data Abort exception if an entry is locked
down, or might be locked down.

Any such exceptions taken from Non-secure EL1 can be trapped to EL2, see Traps to EL2 of
Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on page D1-1577.

Note

These options permit a usage model for TLB invalidate routines, where the routine invalidates a large range
of addresses, without considering whether any entries are locked in the TLB.

• The effect on a locked TLB entry of a TLB invalidate by VA or invalidate by ASID match operation that
would invalidate that entry if the entry was not locked is IMPLEMENTATION DEFINED. However, the operation
must implement one of the following options:

— The locked entry is invalidated in the TLB.

— The operation has no effect on any locked entry in the TLB. In the case of an invalidate single entry
by VA, this means the PE treats the operation as a NOP.

— The operation generates an IMPLEMENTATION DEFINED Data Abort exception if it operates on an entry
that is locked down, or might be locked down.

The exception syndrome definitions include a fault code for cache and TLB lockdown faults, see ESR_EL1,
Exception Syndrome Register (EL1) on page D7-1930.


Note 


Any implementation that uses an abort mechanism for entries that can be locked down but are not actually locked 
down must: 





• Document the IMPLEMENTATION DEFINED instruction sequences that perform the required operations on
entries that are not locked down.

• Implement one of the other specified alternatives for the locked entries.


ARM recommends that, when possible, such IMPLEMENTATION DEFINED instruction sequences use the 
architecturally-defined operations. This minimizes the number of customized operations required. 


In addition, an implementation that uses an abort mechanism for handling the effect of TLB maintenance 
instructions on entries that can be locked down but are not actually locked down must provide an IMPLEMENTATION 
DEFINED mechanism that ensures that no TLB entries are locked. 


Similar rules apply to cache lockdown, see The interaction of cache lockdown with cache maintenance instructions 
on page D3-1712. 





The architecture does not guarantee that any unlocked entry in the TLB remains in the TLB. This means that, as a 
side-effect of any TLB maintenance instruction, any unlocked entry in the TLB might be invalidated. 
D4.8.3 Maintenance requirements on changing System register values


The TLB contents can be influenced by control bits in a number of System registers. This means the TLB must be 
invalidated after any changes to these bits, unless the changes are accompanied by a change to the VMID or ASID 
that defines the context to which the bits apply. The general form of the required invalidation sequence is as follows: 


; Change control bits in System registers 
ISB ; Synchronize changes to the control bits 
; Perform TLB invalidation of all entries that might be affected by the changed control bits 


The System register changes to which this maintenance requirement applies are:
• Any change to the MAIR_EL1, MAIR_EL2, or MAIR_EL3 registers.
• Any change to the AMAIR_EL1, AMAIR_EL2, or AMAIR_EL3 registers.
• Any change to SCTLR_EL1.EE, SCTLR_EL2.EE, or SCTLR_EL3.EE.
• Any change to SCTLR_EL1.WXN, SCTLR_EL2.WXN, or SCTLR_EL3.WXN.
• Any change to any of the SCR_EL3.{RW, SIF} bits.
• Any change to any of the HCR_EL2.{RW, DC, PTW, VM} bits. See also Changing HCR_EL2.PTW.
• Any changes to the registers that control address translation:
— Any change to any of the TCR_EL1, TCR_EL2, TCR_EL3, or VTCR_EL2 registers.
— Any change to the TTBR0_EL1, TTBR1_EL1, TTBR0_EL2, TTBR0_EL3, or VTTBR_EL2
registers.


Changing HCR_EL2.PTW 


When the value of the Protected table walk bit, HCR_EL2.PTW, is 1, a stage 1 translation table access in the 
Non-secure EL1&0 translation regime, to an address that is mapped to any type of Device memory by its stage 2 
translation, generates a stage 2 Permission fault. A TLB associated with a particular VMID might hold entries that 
depend on the effect of HCR_EL2.PTW. Therefore, if the value of HCR_EL2.PTW is changed without a change to 
the VMID value, all TLB entries associated with the current VMID must be invalidated before executing software 
at Non-secure EL1 or EL0. If this is not done, behavior is CONSTRAINED UNPREDICTABLE, see CONSTRAINED
UNPREDICTABLE behaviors due to caching of control or data values on page K1-5480. 
D4.9 Caches in a VMSAv8-64 implementation


The ARM architecture describes the required behavior of an implementation of the architecture. As far as possible 
it does not restrict the implemented microarchitecture, or the implementation techniques that might achieve the 
required behavior. 


In particular, maintaining this level of abstraction is difficult when describing the relationship between memory 
address translation and caches, especially regarding the indexing and tagging policy of caches. This section: 


• Summarizes the architectural requirements for the interaction between caches and address translation.
• Gives some information about the likely implementation impact of the required behavior.

The following sections give this information:
• Data and unified caches.
• Instruction caches.


In addition, Cache maintenance requirement created by changing translation table attributes on page D4-1832 
describes the cache maintenance required after updating the translation tables to change the attributes of an area of 
memory. 


For more information about cache maintenance see Cache maintenance instructions on page D3-1703, that 
describes the cache maintenance instructions in the A64 instruction set. 


D4.9.1 Data and unified caches 


For data and unified caches, the use of address translation is entirely transparent to any data access other than as 
described in Mismatched memory attributes on page B2-105. 


This means that the behavior of accesses from the same observer to different VAs, that are translated to the same PA 
with the same memory attributes, is fully coherent. This means these accesses behave as follows, regardless of 
which VA is accessed: 


• Two writes to the same PA occur in program order.
• A read of a PA returns the value of the last successful write to that PA.
• A write to a PA that occurs, in program order, after a read of that PA, has no effect on the value returned by
that read.


The memory system behaves in this way without any requirement to use barrier or cache maintenance instructions. 


In addition, if cache maintenance is performed on a memory location, the effect of that cache maintenance is visible 
to all aliases of that physical memory location. 


These properties are consistent with implementing all caches that can handle data accesses as Physically-indexed, 
physically-tagged (PIPT) caches. 


D4.9.2 Instruction caches 


In the ARM architecture, an instruction cache is a cache that is accessed only as a result of an instruction fetch. 
Therefore, an instruction cache is never written to by any load or store instruction executed by the PE. 


The ARM architecture supports three different behaviors for instruction caches. For ease of reference and 
description these are identified by descriptions of the associated expected implementation, as follows: 


• PIPT instruction caches.
• Virtually-indexed, physically-tagged (VIPT) instruction caches.
• ASID and VMID tagged Virtually-indexed, virtually-tagged (VIVT) instruction caches.


The CTR_EL0.L1Ip field identifies the form of the instruction caches.
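
As an illustration of how software might interpret this field, the following Python sketch decodes L1Ip from a raw register value. The field position (bits [15:14]) and the encodings in the table are assumptions that do not come from this section, so check them against the CTR_EL0 register description before relying on them. The function name is hypothetical.

```python
# Assumed encodings of CTR_EL0.L1Ip, bits [15:14]. These values are an
# assumption; confirm them against the CTR_EL0 register description.
L1IP_POLICIES = {
    0b01: "ASID and VMID tagged VIVT",
    0b10: "VIPT",
    0b11: "PIPT",
}


def decode_l1ip(ctr_el0: int) -> str:
    """Return the instruction cache form named by the L1Ip field."""
    l1ip = (ctr_el0 >> 14) & 0b11  # extract the 2-bit field
    return L1IP_POLICIES.get(l1ip, "reserved")
```

A value read from the register on real hardware would be passed in as `ctr_el0`. The sketch treats any encoding not in the table as reserved.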
The following subsections describe the behavior associated with these cache types, including any occasions where
explicit cache maintenance is required to make the use of address translation transparent to the instruction cache:

• PIPT instruction caches.
• VIPT instruction caches.
• ASID and VMID tagged VIVT instruction caches on page D4-1831.
• The IVIPT Extension on page D4-1831.


Note 


For software to be portable between implementations that might use any of PIPT instruction caches, VIPT 
instruction caches, or ASID and VMID tagged VIVT instruction caches, the software must invalidate the instruction 
cache whenever any condition occurs that would require instruction cache maintenance for at least one of the 
instruction cache types. 








PIPT instruction caches 


For PIPT instruction caches, the use of memory address translation is entirely transparent to all instruction fetches 
other than as described in Mismatched memory attributes on page B2-105. 


If cache maintenance is performed on a memory location, the effect of that cache maintenance is visible to all aliases 
of that physical memory location. 


An implementation that provides PIPT instruction caches implements the IVIPT Extension, see The IVIPT
Extension on page D4-1831.


VIPT instruction caches 


For VIPT instruction caches, the use of memory address translation is transparent to all instruction fetches other 
than for the effect of memory address translation on instruction cache invalidate by address operations or as 
described in Mismatched memory attributes on page B2-105. 


Note 


Cache invalidation is the only cache maintenance that can be performed on an instruction cache. 








If instruction cache invalidation by address is performed on a memory location, the effect of that invalidation is 
visible only to the VA supplied with the operation. The effect of the invalidation might not be visible to any other 
aliases of that physical memory location. 


The only architecturally-guaranteed way to invalidate all aliases of a PA from a VIPT instruction cache is to 
invalidate the entire instruction cache. 


An implementation that provides VIPT instruction caches implements the IVIPT Extension, see The IVIPT
Extension on page D4-1831.
ASID and VMID tagged VIVT instruction caches


For ASID and VMID tagged VIVT instruction caches, if the instructions at any VA change, for a given translation 
regime and a given ASID and VMID, as appropriate, then instruction cache maintenance is required to ensure that 
the change is visible to subsequent execution. This maintenance is required when writing new values to instruction 
locations. It can also be required as a result of any of the following situations that change the translation of a VA to 
a PA, if, as a result of the change to the translation, the instructions at the VAs change: 


• For any translation regime other than the Non-secure EL1&0 translation regime, enabling or disabling
stage 1 translations.

• For the Non-secure EL1&0 translation regime:

— When stage 2 translations are enabled, enabling or disabling stage 1 translations unless accompanied
by a change of VMID.

— When stage 2 translations are disabled, enabling or disabling stage 1 translations.

— Enabling or disabling stage 2 translations.

• Writing new mappings to the translation tables.

• Any change to the TCR or TTBR for the current translation regime, unless:

— For a change to the Secure EL1&0 translation regime, the change is accompanied by a change to the
ASID.

— For a change to the stage 1 translations of the Non-secure EL1&0 translation regime, the change is
accompanied by a change to the ASID or VMID.

— For a change to the stage 2 translations of the Non-secure EL1&0 translation regime, the change is
accompanied by a change to the VMID.





Note 


For ASID and VMID tagged VIVT instruction caches only, for a given translation regime and a given ASID and 
VMID, as appropriate, invalidation is not required if a change to the translations is such that the instructions 
associated with the non-faulting translations of a VA remain unchanged through the change to the translations, even 
if the physical locations being mapped to by the changed translation have been written as part of changing the 
translation. 


Examples of situations where this might occur include: 


• Copy-on-Write.
• Demand Paging of memory locations to/from disk.


This does not apply for VIPT or PIPT instruction caches, because those caches hold copies of PAs, and therefore 
must be invalidated when the contents are written to, to avoid the use of stale entries. 





If instruction cache invalidation by address is performed on a memory location, the effect of that invalidation is 
visible only to the VA supplied with the operation. The effect of the invalidation might not be visible to any other 
aliases of that physical memory location. 


The only architecturally-guaranteed way to invalidate all aliases of a PA from an ASID and VMID tagged VIVT 
instruction cache is to invalidate the entire instruction cache. 


The IVIPT Extension 


An implementation in which the instruction cache exhibits the behaviors described in PIPT instruction caches on
page D4-1830, or those described in VIPT instruction caches on page D4-1830, is said to implement the IVIPT
Extension to the ARM architecture.


The formal definition of the IVIPT Extension to the ARM architecture is that it reduces the instruction cache 
maintenance requirement to the following condition: 


• Instruction cache maintenance is required only after writing new data to a PA that holds an instruction.
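
The difference between the cache types can be summarized as a lookup from an event to the cache types that require invalidation for it. The following Python sketch is a simplified model, not a complete statement of the rules: the event labels and function names are hypothetical, and the table records only the two broad cases described in this section.

```python
CACHE_TYPES = ("PIPT", "VIPT", "VIVT")

# Hypothetical event labels mapped to the cache types that need invalidation.
REQUIRES_INVALIDATE = {
    # New data written to a PA that holds an instruction (the IVIPT condition).
    "write_to_instruction_pa": {"PIPT", "VIPT"},
    # The instructions seen at a VA change for a given ASID and VMID.
    "instructions_at_va_change": {"VIVT"},
}


def needs_invalidate(cache_type, events):
    """True if any of the events requires invalidation for this cache type."""
    return any(cache_type in REQUIRES_INVALIDATE[e] for e in events)


def portable_needs_invalidate(events):
    """Portable software invalidates if at least one cache type requires it."""
    return any(needs_invalidate(t, events) for t in CACHE_TYPES)
```

Under this model, writing new code to memory usually raises both events, so every cache type requires invalidation. Remapping a VA to different instructions without writing them raises only the second event, which a portable routine must still handle.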
D4.9.3 Cache maintenance requirement created by changing translation table attributes


Any change to the translation tables to change the attributes of an area of memory can require maintenance of the 
translation tables, as described in General TLB maintenance requirements on page D4-1815. If the change affects 
the cacheability attributes of the area of memory, including any change between Write-Through and Write-Back 
attributes, software must ensure that any cached copies of affected locations are removed from the caches, typically 
by cleaning and invalidating the locations from the levels of cache that might hold copies of the locations affected 
by the attribute change. Any of the following changes to the inner cacheability or outer cacheability attribute creates 
this maintenance requirement: 

• Write-Back to Write-Through.
• Write-Back to Non-cacheable.
• Write-Through to Non-cacheable.
• Write-Through to Write-Back.

The cache clean and invalidate avoids any possible coherency errors caused by mismatched memory attributes.
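
The four transitions listed above can be captured as a small lookup. The following Python sketch is illustrative only, with hypothetical names, and covers only the cacheability transitions named in this section.

```python
# Cacheability attribute changes (old, new) that create the requirement
# to clean and invalidate the affected locations.
MAINTENANCE_TRANSITIONS = {
    ("Write-Back", "Write-Through"),
    ("Write-Back", "Non-cacheable"),
    ("Write-Through", "Non-cacheable"),
    ("Write-Through", "Write-Back"),
}


def needs_clean_and_invalidate(old_attr, new_attr):
    """True if changing old_attr to new_attr is one of the listed transitions."""
    return (old_attr, new_attr) in MAINTENANCE_TRANSITIONS
```

The check applies separately to the inner and outer cacheability attributes, so a caller would evaluate it once for each.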


Similarly, to avoid possible coherency errors caused by mismatched memory attributes, the following sequence 
must be followed when changing the shareability attributes of a cacheable memory location: 





1. Make the memory location Non-cacheable, Outer Shareable. 
2. Clean and invalidate the location from the cache.
3. Change the shareability attributes to the required new values. 
Chapter D5
The Performance Monitors Extension 


This chapter describes the ARMv8 implementation of the ARM Performance Monitors, that are an optional 
non-invasive debug component. It describes version 3 of the Performance Monitor Unit (PMU) architecture, 
PMUv3. It contains the following sections:

• About the Performance Monitors on page D5-1834.
• Accuracy of the Performance Monitors on page D5-1836.
• Behavior on overflow on page D5-1838.
• Attributability on page D5-1840.
• Effect of EL3 and EL2 on page D5-1841.
• Event filtering on page D5-1843.
• Performance Monitors and Debug state on page D5-1845.
• Counter enables on page D5-1846.
• Counter access on page D5-1847.
• Events, event numbers, and mnemonics on page D5-1848.
• Performance Monitors Extension registers on page D5-1871.

Note


Table K12-1 on page K12-5660 disambiguates the general register references used in this chapter. 
D5.1 About the Performance Monitors
In ARMv8-A, the Performance Monitors Extension is an OPTIONAL feature of an implementation, but ARM 
strongly recommends that ARMv8-A implementations include version 3 of the Performance Monitors Extension, 
PMUv3.
Note
No previous versions of the Performance Monitors Extension can be implemented in ARMv8.
The basic form of the Performance Monitors is: 
° A 64-bit cycle counter, see Time as measured by the Performance Monitors cycle counter on page D5-1835. 
° A number of 32-bit event counters. The event counted by each counter is programmable. ARMv8 provides 
space for up to 31 counters. The actual number of counters is IMPLEMENTATION DEFINED, and the 
specification includes an identification mechanism.
Note
ARM recommends that at least two counters are implemented, and that hypervisors provide at least this many 
counters to guest operating systems.
° When EL2 is implemented, the required controls to partition the implemented counters into the following 
sets:
— A set which is available for use by the guest operating system.
— A set which is available for use by the hypervisor.
° Controls for: 
— Enabling and resetting counters. 
— Flagging overflows. 
— Enabling interrupts on overflow. 
Monitoring software can enable the cycle counter independently of the event counters. 
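
As an illustration of the identification mechanism mentioned in the list above, the sketch below extracts the number of implemented event counters from a raw PMCR value. It assumes that the count is reported in the PMCR.N field at bits [15:11], which comes from the PMCR register description and not from this section. The function name and the sample values are hypothetical.

```python
PMCR_N_SHIFT = 11   # Assumed position of PMCR.N, bits [15:11].
PMCR_N_MASK = 0x1F  # Five bits, so the field can hold 0 to 31.


def implemented_event_counters(pmcr_value):
    """Return the number of event counters reported by a raw PMCR value."""
    return (pmcr_value >> PMCR_N_SHIFT) & PMCR_N_MASK
```

A five-bit field is consistent with the architectural limit of 31 event counters.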
The PMU architecture uses event numbers to identify an event. It: 
° Defines event numbers for common events, for use across many architectures and microarchitectures.
Note
Implementations that include PMUv3 must, as a minimum requirement, implement a subset of the common 
events. See Common event numbers on page D5-1852.
° Reserves a large event number space for IMPLEMENTATION DEFINED events.
The full set of events for an implementation is IMPLEMENTATION DEFINED. ARM recommends that implementations 
include all of the events that are appropriate to the architecture profile and microarchitecture of the implementation.
When an implementation includes the Performance Monitors Extension, ARMv8 defines the following possible 
interfaces to the Performance Monitors Extension registers:
° A System register interface. This interface is mandatory. 
Note 
In AArch32 state, the interface is in the (coproc==0b1111) encoding space. 
° An external debug interface which optionally supports memory-mapped accesses. Implementation of this 
interface is OPTIONAL. See Chapter I2 Recommended External Interface to the Performance Monitors. 
An operating system can use the System registers to access the counters. 
Also, if required, the operating system can enable application software to access the counters. This enables an 
application to monitor its own performance with fine-grain control without requiring operating system support. For 
example, an application might implement per-function performance monitoring. 


To enable interaction with external monitoring, an implementation might consider additional enhancements, such 
as providing: 


° A set of events, from which a selection can be exported onto a bus for use as external events. 


° The ability to count external events. This enhancement requires the implementation to include a set of 
external event input signals. 


The Performance Monitors Extension is common to AArch64 operation and AArch32 operation. This means the 
ARMv8 architecture defines both AArch64 and AArch32 System registers to access the Performance Monitors. For 
example, the Performance Monitors Cycle Count Register is accessible as: 


° When executing in AArch64 state, PMCCNTR_EL0, see PMCCNTR_EL0, Performance Monitors Cycle 
Count Register on page D7-2218.


° When executing in AArch32 state, PMCCNTR, see PMCCNTR, Performance Monitors Cycle Count 
Register on page G6-4762. 


D5.1.1 Time as measured by the Performance Monitors cycle counter 


The Performance Monitors cycle counter, accessed through PMCCNTR_EL0 or PMCCNTR, increments from the 
hardware processor clock, not PE clock cycles.

The relationship between the count recorded by the Performance Monitors cycle counter and the passage of real 
time is IMPLEMENTATION DEFINED.

Note

° This means that, in an implementation where PEs are multithreaded, the counter continues to increment 
across all PEs, rather than only counting cycles for which the current PE is active.

° Although the architecture requires that direct reads of PMCCNTR_EL0 or PMCCNTR occur in program 
order, there is no requirement that the count increments between two such reads. Even when the counter is 
incrementing on every clock cycle, software might need to check that the difference between two reads of the 
counter is nonzero.

The architecture requires that an indirect write to PMCCNTR_EL0 or PMCCNTR is observable to direct 
reads of the register in finite time. The counter increments from the hardware processor clock are indirect 
writes to these registers.
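
The check described in the Note can be sketched as follows. This is an illustrative model only: the two reads are passed in as plain integers, and the helper names elapsed_cycles and cycles_per_item are hypothetical.

```python
MASK64 = (1 << 64) - 1  # The cycle counter is a 64-bit wrapping counter.


def elapsed_cycles(first_read, second_read):
    """Difference between two counter reads, allowing for one wrap."""
    return (second_read - first_read) & MASK64


def cycles_per_item(first_read, second_read, items):
    """Average cycles per item, or None if the counter did not advance."""
    delta = elapsed_cycles(first_read, second_read)
    if delta == 0 or items == 0:
        return None  # Two reads can return the same count, so guard the rate.
    return delta / items
```

Masking the difference to 64 bits keeps the result correct when the counter wraps once between the two reads.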





D5.1.2 Interaction with trace 


It is IMPLEMENTATION DEFINED whether the implementation exports counter events to a Trace macrocell, or other 
external monitoring agent, to provide triggering information. The form of any exporting is also IMPLEMENTATION 
DEFINED. If implemented, this exporting might be enabled as part of the performance monitoring control 
functionality. 


ARM recommends system designers include a mechanism for importing a set of external events to be counted, but 
such a feature is IMPLEMENTATION DEFINED. When implemented, this feature enables the Trace macrocell to pass 
in events to be counted. 





D5.1.3 Interaction with power saving operations 
All counters are subject to any changes in clock frequency, including clock stopping caused by the WFI and WFE 
instructions. 
D5.2 Accuracy of the Performance Monitors

The Performance Monitors:

° Are a non-invasive debug component. See Non-invasive behavior.

° Must provide broadly accurate and statistically useful count information.

However, the Performance Monitors allow for:

° A reasonable degree of inaccuracy in the counts to keep the implementation and validation cost low. See A 
reasonable degree of inaccuracy on page D5-1837.

° IMPLEMENTATION DEFINED controls, such as those in ACTLR registers, to put the PE in an operating state 
that might do one or both of the following:

— Change the level of non-invasiveness of the Performance Monitors so that enabling an event counter 
can impact the performance or behavior of the PE.

— Allow inaccurate counts. This includes, but is not limited to, cycle counts.


D5.2.1 Non-invasive behavior


The Performance Monitors are a non-invasive debug feature. A non-invasive debug feature permits the observation 
of data and program flow. Performance Monitors, PC Sample-based Profiling and Trace are non-invasive debug 
features. 


Non-invasive debug components are not guaranteed to leave the behavior or performance of the processor 
unchanged. However, any changes that do occur must not be severe, because severe changes reduce the 
usefulness of event counters for performance measurement and profiling. This does not include any change to 
program behavior that results from the same program being instrumented to use the Performance Monitors, or from 
some other performance monitoring process being run concurrently with the process being profiled in a multitasking 
operating system. As such, a reasonable variation in performance is permissible.


Note 


Power consumption is one measure of performance. Therefore, a reasonable variation in power consumption is 
permissible. 








ARM does not define a reasonable variation in performance, but recommends that such a variation is kept 
within 5% of normal operating performance, when averaged across a suite of code that is representative of the 
application workload. 
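
One way to evaluate the 5% recommendation is sketched below. The architecture does not define how the average is taken, so the arithmetic mean of per-benchmark relative differences used here is an assumption, and the function names and timing inputs are hypothetical.

```python
def mean_relative_variation(baseline_times, monitored_times):
    """Mean of |monitored - baseline| / baseline over a benchmark suite."""
    if len(baseline_times) != len(monitored_times) or not baseline_times:
        raise ValueError("need two equal-length, non-empty lists")
    ratios = [abs(m - b) / b for b, m in zip(baseline_times, monitored_times)]
    return sum(ratios) / len(ratios)


def within_recommended_variation(baseline_times, monitored_times, limit=0.05):
    """True if the averaged variation is within the recommended limit."""
    return mean_relative_variation(baseline_times, monitored_times) <= limit
```

For example, run times of 100 and 200 units without monitoring and 102 and 206 units with monitoring give relative differences of 2% and 3%, which average to 2.5%.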


Note 


For profiles other than A-profile, there is the potential for stronger requirements. Ultimately, performance 
requirements are determined by end-users, and not set by the architecture. 








For some common architectural events, this requirement to be non-invasive can conflict with the requirement to 
present an accurate value of the count under normal operating conditions. Should an implementation require more 
performance-invasive techniques to accurately count an event, there are the following options: 


° If the event is optional, define an alternative IMPLEMENTATION DEFINED event that accurately counts the 
event and document the impact on performance of enabling the event. 


° Provide an IMPLEMENTATION DEFINED control that disables accurate counting of the event to restore broadly 
accurate performance, and document the impact on performance of accurate counting.
A reasonable degree of inaccuracy


The Performance Monitors provide broadly accurate and statistically useful count information. To keep the 
implementation and validation cost low, a reasonable degree of inaccuracy in the counts is acceptable. ARM does 
not define a reasonable degree of inaccuracy but recommends the following guidelines: 


° Under normal operating conditions, the counters must present an accurate value of the count. 


° In exceptional circumstances, such as a change in Security state or other boundary condition, it is acceptable 
for the count to be inaccurate. 


° Under very unusual, non-repeating pathological cases, the counts can be inaccurate. These cases are likely to 
occur as a result of asynchronous exceptions, such as interrupts, where a systematic error in the count is very 
unlikely.





Note 


An implementation must not introduce inaccuracies that can be triggered systematically by the execution of normal 
pieces of software. For example, it is not reasonable for the count of branches to be inaccurate because a loop 
structure systematically causes branch counts to be dropped.


However, dropping a single branch count as the result of a rare interaction with an interrupt is acceptable. 





The permitted inaccuracy limits the possible uses of the Performance Monitors. In particular, the architecture does 
not define the point in a pipeline where the event counter is incremented, relative to the point where a read of the 
event counters is made. This means that pipelining effects can cause some imprecision. 


A change of Security state can also affect the accuracy of the Performance Monitors, see Interaction with EL3 on 
page D5-1841.


In addition to this, entry to and exit from Debug state can disturb the normal running of the PE, causing further 
inaccuracy in the Performance Monitors. Disabling the counters while in Debug state limits the extent of this 
inaccuracy. An implementation can employ methods to limit this inaccuracy, for example by promptly disabling the 
counters during the Debug state entry sequence. 


An implementation must document any particular scenarios where significant inaccuracies are expected. 
D5.3 Behavior on overflow


All events are counted in 32-bit wrapping counters, that overflow when they wrap. The cycle counter, PMCCNTR, 
is a 64-bit wrapping counter, that is configured by PMCR.LC to either: 


° Signal an overflow when bit PMCCNTR[63] overflows. 
° Signal an overflow when bit PMCCNTR[31] overflows into bit PMCCNTR[32]. 


On a Performance Monitors counter overflow: 
° An overflow status bit is set to 1. See PMOVSCLR. 


° An interrupt request is generated if the PE is configured to generate counter overflow interrupts. For more 
information, see Generating overflow interrupt requests. 


° The counter continues counting events.
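
The two overflow configurations of the cycle counter can be modeled as below. This is a sketch under stated assumptions: it assumes that PMCR.LC == 1 selects the 64-bit overflow and PMCR.LC == 0 selects the overflow out of bit 31, as given in the PMCR register description, and that each increment is smaller than 2^32. The function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def cycle_counter_overflowed(before, increment, lc):
    """Return True if adding `increment` to the cycle counter signals overflow.

    lc is the assumed meaning of PMCR.LC: True selects overflow out of
    bit 63, False selects overflow from bit 31 into bit 32.
    """
    after = before + increment
    if lc:
        return after > MASK64
    # A carry out of the low 32 bits changes the upper part of the value.
    return (after >> 32) != (before >> 32)
```

In both configurations the counter itself remains a 64-bit wrapping counter. Only the point at which the overflow is signaled differs.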

















D5.3.1 Generating overflow interrupt requests 
Software can program the Performance Monitors so that an overflow interrupt request is generated when a counter 
overflows. See PMINTENSET and PMINTENCLR. 
Note
° The mechanism by which an interrupt request from the Performance Monitors generates an FIQ or IRQ 
exception is IMPLEMENTATION DEFINED.
° ARM recommends that the overflow interrupt requests:

— Translate into a PMUIRQ signal, so that they are observable to external devices.

— Connect to inputs on an IMPLEMENTATION DEFINED generic interrupt controller as a Private Peripheral 
Interrupt (PPI) for the originating processor. See the ARM Generic Interrupt Controller Architecture 
Specification for information about PPIs.

— Connect to a Cross Trigger Interface (CTI), see Chapter H5 The Embedded Cross-Trigger Interface.

° ARM strongly discourages implementations from connecting overflow interrupt requests from multiple PEs 
to the same System Peripheral Interrupt (SPI) identifier.
° From GICv3, the ARM® Generic Interrupt Controller Architecture Specification recommends that the Private 
Peripheral Interrupt (PPI) with ID 23 is used for overflow interrupt requests.
Counter overflow when counting one or more events generates an unsigned carry out. Software can write to the 
counters to control the frequency at which interrupt requests occur. For counters other than the cycle counter, the 
counter is always a 32-bit unsigned wrapping value. For example, software might set a counter to 0xFFFF0000, to 
generate another counter overflow after 65536 increments, and reset it to this value every time an overflow interrupt 
occurs.
Note 

If an event can occur multiple times in a single clock cycle, then counter overflow can occur without the counter 
registering a value of zero. 
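
The preload arithmetic, and the case where a counter overflows without ever holding zero, can be worked through with a short model. This is an illustrative sketch only, and the names preload_for_period and step are hypothetical.

```python
COUNTER_MOD = 1 << 32  # Event counters are 32-bit unsigned wrapping values.


def preload_for_period(period):
    """Value to write so that the counter overflows after `period` increments."""
    if not 1 <= period <= COUNTER_MOD:
        raise ValueError("period must be between 1 and 2**32")
    return (COUNTER_MOD - period) % COUNTER_MOD


def step(value, events):
    """Advance a counter by `events` in one clock cycle.

    Returns the new value and whether an unsigned carry out occurred.
    """
    total = value + events
    return total % COUNTER_MOD, total >= COUNTER_MOD
```

With two events in the same cycle, a counter holding 0xFFFFFFFF moves directly to 1 and still signals the overflow.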
The overflow interrupt request is a level-sensitive request. The PE signals a request for:
° Any given PMNx counter, when the value of PMOVSSET[x] is 1, the value of PMINTENSET[x] is 1, and 
one of the following is true:

— EL2 is not implemented and the value of PMCR.E is 1.

— EL2 is implemented, x is less than the value of HDCR.HPMN, and the value of PMCR.E is 1.

— EL2 is implemented, x is greater than or equal to the value of HDCR.HPMN, and the value of 
HDCR.HPME is 1.

° The cycle counter, when the values of PMOVSSET[31], PMINTENSET[31], and PMCR.E are all 1.
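
The conditions above can be expressed as a single predicate. The following is a minimal sketch, not the architectural pseudocode: the register values are passed in as plain integers, and the function name and parameter names are hypothetical.

```python
def overflow_irq_requested(pmovsset, pmintenset, pmcr_e,
                           el2_implemented=False, hpmn=0, hpme=0,
                           num_counters=31):
    """Return True if the level-sensitive overflow request is asserted.

    pmovsset and pmintenset are 32-bit masks. Bit x belongs to PMNx and
    bit 31 belongs to the cycle counter.
    """
    for x in range(num_counters):
        bit = 1 << x
        if not (pmovsset & bit and pmintenset & bit):
            continue
        if not el2_implemented or x < hpmn:
            enabled = pmcr_e == 1   # Counter is controlled by PMCR.E.
        else:
            enabled = hpme == 1     # Counter is controlled by HDCR.HPME.
        if enabled:
            return True
    cycle_bit = 1 << 31
    return bool(pmovsset & cycle_bit and pmintenset & cycle_bit and pmcr_e == 1)
```

Because the request is level-sensitive, the predicate stays true until the corresponding overflow status bit is cleared.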
The overflow interrupt request is active in both Secure and Non-secure states. In particular, if EL3 and EL2 are both 
implemented, overflow events from PMNx where x is greater than or equal to the value of HDCR.HPMN can be 
signaled from all modes and states but only if the value of HDCR.HPME is 1. 


The interrupt handler for the counter overflow request must cancel the interrupt request, by writing to 
PMOVSCLR[x] to clear the overflow bit to 0. 


Pseudocode description of overflow interrupt requests 


See Chapter J1 ARMv8 Pseudocode for a pseudocode description of overflow interrupt requests. The 
AArch64.CheckForPMUOverflow() and AArch32.CheckForPMUOverflow() pseudocode functions signal PMU overflow 
interrupt requests to an interrupt controller and PMU overflow trigger events to the cross-trigger interface. 
D5.4 Attributability


An event caused by the PE counting the event is Attributable. If an agent other than the PE that is counting the events 
causes an event, these events are Unattributable. 


An event is defined as being either Attributable or Unattributable. If the event is Attributable, it is further defined 
whether it is Attributable to: 


° The current Security state of the PE.

° The current Exception level of the PE.

° When the PE is in Debug state, operations issued to the PE by the debugger through the external debug 
interface.


In a multithreaded implementation, an event might be Attributable either to the current Exception level alone, or to 
both the Exception level and the Security state of another PE with the same values for affinity level 1 and higher. 


Note 


An implementation is described as multithreaded when the lowest level of affinity consists of logical PEs that are 
implemented using a multithreading type approach. In this section, when referring to a multithreaded 
implementation, thread is used to mean processing elements with different affinity level 0 values and the same 
values for affinity level 1 and higher. 








An event can be defined as the combination of multiple subevents, which can be either Attributable or 
Unattributable. 


All architecturally defined events are Attributable, unless otherwise stated. 


Unattributable events might be counted when Attributable events are not counted. See: 
° Interaction with EL3 on page D5-1841.

° Event filtering on page D5-1843.

° Performance Monitors and Debug state on page D5-1845.


These sections are summarized by Table D5-1 for events Attributable to the processor, and Unattributable events. 


Table D5-1 Counting events

                                                  Event type
Counter and  State      Allowed or  Filtered      If Attributable to:                Then          Else
PMU enable              prohibited
Yes          Non-debug  Allowed     Not filtered  x                                  Count         Count
Yes          Non-debug  Allowed     Filtered      Current Exception level            Do not count  IMPLEMENTATION DEFINED
Yes          Non-debug  Prohibited  x             Current Security state             Do not count  IMPLEMENTATION DEFINED
Yes          Debug      x           x             Debugger operations or raw cycles  Do not count  IMPLEMENTATION DEFINED
No           x          x           x             x                                  Do not count  Do not count
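
Table D5-1 can be read as a decision procedure. The sketch below is one possible encoding of the table, with hypothetical function and parameter names. The attributable argument states whether the event is Attributable to the item named in the matching row.

```python
def counting_decision(enabled, debug_state, prohibited, filtered, attributable):
    """Return "count", "do not count", or "IMPLEMENTATION DEFINED".

    enabled:      the counter and the PMU are both enabled.
    debug_state:  the PE is in Debug state.
    prohibited:   counting is prohibited in the current Security state.
    filtered:     the current Exception level is filtered out.
    attributable: the event is Attributable to the item in the matching row.
    """
    if not enabled:
        return "do not count"
    if not debug_state and not prohibited and not filtered:
        return "count"
    return "do not count" if attributable else "IMPLEMENTATION DEFINED"
```

The three rows that restrict counting share the same outcome, so the model only needs to know whether any restriction applies and whether the event is Attributable.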
D5.5 Effect of EL3 and EL2
This section describes the effects of implementing EL3 and EL2 on the Performance Monitors. It contains the 
following subsections: 
° Interaction with EL3. 
° Interaction with EL2.
D5.5.1 Interaction with EL3 
While counting events is never prohibited in Non-secure state, there are some restrictions on counting events in 
Secure state. From reset, counting events Attributable to Secure state is prohibited in Secure state. When executing 
in AArch32 state, software can set SDCR.SPME to 1 to permit event counting in Secure state. In AArch64 state, 
software can set MDCR_EL3.SPME to 1 to permit event counting in Secure state. 
Note 
This enables a Secure Monitor to permit profiling within Secure state without having to configure an 
IMPLEMENTATION DEFINED debug authentication interface. 
The system can use the external authentication interface to override SPME. 
If EL3 is not implemented, the behavior is as if the value of SDCR.SPME or MDCR_EL3.SPME is 1, as 
appropriate. 
Counting Attributable events in Secure state is prohibited unless any one of the following is true: 
° EL3 is not implemented. 
° EL3 is implemented, is using AArch64, and the value of MDCR_EL3.SPME is 1. 
° EL3 is implemented, is using AArch32, and the value of SDCR.SPME is 1. 
° EL3 is implemented, EL3 or EL1 is using AArch32, the PE is executing at EL0, and the value of 
SDER32_EL3.SUNIDEN is 1.
° EL3 is implemented, and counting is permitted by an IMPLEMENTATION DEFINED authentication interface, 
ExternalSecureNoninvasiveDebugEnabled() == TRUE.
Note 
Software can read the Authentication Status register, DBGAUTHSTATUS, to determine the state of an 
IMPLEMENTATION DEFINED authentication interface. 
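
The list of conditions above can be sketched as a predicate. This is not the architectural pseudocode. All parameter names are hypothetical, and the state of the IMPLEMENTATION DEFINED authentication interface is reduced to a single boolean.

```python
def secure_counting_permitted(el3_implemented, el3_aarch64=True,
                              mdcr_el3_spme=0, sdcr_spme=0,
                              el3_or_el1_aarch32=False, at_el0=False,
                              suniden=0, external_secure_niden=False):
    """Return True if counting Attributable events in Secure state is permitted."""
    if not el3_implemented:
        return True
    if el3_aarch64 and mdcr_el3_spme == 1:
        return True
    if not el3_aarch64 and sdcr_spme == 1:
        return True
    if el3_or_el1_aarch32 and at_el0 and suniden == 1:
        return True
    # Otherwise only the external authentication interface can permit counting.
    return external_secure_niden
```

With every control at its default value and EL3 implemented, the predicate returns False, which matches the statement that counting is prohibited from reset.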
Software executing at EL3 can trap attempts by lower Exception levels to access the PMU. This means that the 
Secure monitor can identify any software which is using the PMU and switch contexts, if required. 
The cycle counter, PMCCNTR, counts even when event counting is prohibited, unless PMCR.DP is set to 1 or the 
PE is in Debug state. 
The Performance Monitors registers are always accessible regardless of the values of the authentication signals and 
the SDER.SUNIDEN bit. Authentication controls whether the counters count events. It does not control access to 
the Performance Monitors registers.
For each Unattributable event, it is IMPLEMENTATION DEFINED whether it is counted when counting Attributable 
events is prohibited. 
Note 
° Additional controls in PMCR, HDCR, PMCNTENSET, and PMCNTENCLR can also disable the event 
counters and the cycle counter. 
° Controls in PMEVTYPER<n> and PMCCFILTR can also disable counting based on Exception level and 
Security state. 
See AArch64.CountEvents() and AArch32.CountEvents() in Chapter J1 ARMv8 Pseudocode for more information. 
The CountEvents() functions return TRUE if PMNx counts events or the cycle counter counts cycles at the current 
Exception level and state. However, these functions do not completely describe the behavior for Unattributable 
events. 


In AArch32 state, the Performance Monitors registers are Common registers, see Classification of System registers 
on page G4-4154. 


The Performance Monitors are intended to be broadly accurate and statistically useful, see Accuracy of the 
Performance Monitors on page D5-1836. Some inaccuracy is permitted at the point of changing Security state, 
however. To avoid the leaking of information from the Secure state, the permitted inaccuracy is that transactions 
that are not prohibited can be uncounted. Where possible, prohibited transactions must not be counted, but if they 
are counted, then that counting must not degrade security. 


Multithreaded implementations 


If an implementation is multithreaded and the value of PMEVTYPER<n>.MT == 1, then the PE does not count an 
event that is Attributable to Secure state on another thread if counting events Attributable to Secure state is 
prohibited in Secure state on the PE that is counting the events.


Example D5-1 The effect of having PMEVTYPER<n>.MT == 1


If the value of MDCR_EL3.SPME is 0 on one thread, then it does not count events Attributable to Secure state on 
another thread, even if one or both of the following applies: 


° This thread is in Non-secure state. 
° MDCR_EL3.SPME==1 on the other thread. 


Otherwise: 


° When the current configuration prohibits counting of events Attributable to Secure state in Secure state, it is 
IMPLEMENTATION DEFINED whether:


— Counting events Attributable to Secure state on this PE in Non-secure state is permitted. 


— Counting Unattributable events related to other secure operations in the system is permitted. 


° Otherwise, counting events in Non-secure state is permitted. 


Interaction with EL2 


In an implementation that includes EL2, Non-secure software executing at EL2 can: 


° Trap any attempt by the Guest OS to access the PMU. This means the hypervisor can identify which Guest 
OSs are using the PMU and intelligently employ switching of the PMU state. 


° Trap accesses to the PMCR, so that it can fully virtualize the PMU identity registers, PMCR.IMP and 
PMCR.IDCODE. 


° Reserve the highest-numbered counters for its own use by overriding the value of PMCR.N seen by the Guest 
OS. The PE does not permit a Guest OS to access the reserved counters. 


HDCR controls Performance Monitors virtualization. See:
° Counter enables on page D5-1846.
° Counter access on page D5-1847.
D5.6 Event filtering


The PMU can filter events by various combinations of Exception level and Security state. This gives software the 
flexibility to count events across multiple processes. 


D5.6.1 Filtering by Exception level and PE state 
In AArch64 state:

° For each event counter, PMEVTYPER<n>_EL0 specifies the Exception levels in which the counter counts 
events Attributable to Exception levels.

° PMCCFILTR_EL0 specifies the Exception levels in which the cycle counter counts.

In an implementation that supports multithreading:

° When the value of PMEVTYPER<n>_EL0.MT is 1, if an event is Attributable to another thread, then the 
specified filtering applies to the current Exception level and PE state of the thread to which the event is 
Attributable, regardless of the Exception level and state of the counting thread.

° When the value of PMEVTYPER<n>_EL0.MT is 0, the counter only counts events that are Attributable to the 
counting thread, and the filtering applies to the Exception level and PE state of the counting thread, see 
Example D5-2.


Example D5-2 Example of the effect of the PMEVTYPER<n>_EL0.MT control


If the value of PMEVTYPER<n>_EL0.U is 0 on the current thread, then it does not count events Attributable to 
EL0 on the other thread, even if this thread is not executing at EL0.


Otherwise, for each Unattributable event, it is IMPLEMENTATION DEFINED whether the filtering applies.

In AArch32 state, the filtering controls are provided by the PMEVTYPER<n> and PMCCFILTR registers.


For more information, see the individual register descriptions. 


Reserved combinations of the filtering controls 


The filtering controls are provided by the {P, U, NSK, NSU, NSH, M} fields of the controlling register, 
PMEVTYPER<n>_EL0, PMCCFILTR_EL0, PMEVTYPER<n>, or PMCCFILTR.


Some combinations of these fields select filtering options that do not represent a meaningful use case, and therefore 
these cases are reserved. Table D5-2 shows these reserved filtering encodings. 


Table D5-2 Reserved filter encodings

P  U  NSK  NSU  NSH  M   Exception level and state the filter applies to
0  1  1    1    x    x   Secure EL1 and Non-secure EL0 (a)
1  0  0    0    Not 0b00 EL0 and at least one of EL2 and EL3
1  0  0    1    Not 0b00 Secure EL0 and at least one of EL2 and EL3
1  0  1    1    x    x   Non-secure EL1 and Secure EL0
1  1  0    0    0    0   None
1  1  0    1    Not 0b00 Non-secure EL0 and at least one of EL2 and EL3

a. Software must not program the counter to count at EL0 in one state and at EL1 in the other state.
The reserved filtering encodings shown in Table D5-2 on page D5-1843 are not UNPREDICTABLE and are not 
CONSTRAINED UNPREDICTABLE. Implementations must implement the filtering controls as described in the 
appropriate register descriptions. 
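
Because the reserved encodings still behave as the register descriptions specify, their effect can be derived from the individual fields. The sketch below assumes the field meanings given in the PMEVTYPER<n>_EL0 register description, which are not restated in this section: P and U disable counting at Secure EL1 and Secure EL0 when set to 1, NSK and NSU enable the Non-secure equivalents when they equal P and U respectively, NSH enables EL2 when set to 1, and M enables EL3 when it equals P. The function name is hypothetical.

```python
def counted_levels(p, u, nsk, nsu, nsh, m):
    """Return the set of Exception levels and states that a filter counts.

    The field semantics are assumptions taken from the register
    description, not from Table D5-2 itself.
    """
    levels = set()
    if p == 0:
        levels.add("Secure EL1")
    if nsk == p:
        levels.add("Non-secure EL1")
    if u == 0:
        levels.add("Secure EL0")
    if nsu == u:
        levels.add("Non-secure EL0")
    if nsh == 1:
        levels.add("EL2")
    if m == p:
        levels.add("EL3")
    return levels
```

Under these assumptions the model reproduces the rows of Table D5-2. For example, the encoding {1, 1, 0, 0, 0, 0} yields the empty set, which corresponds to the row marked None.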


D5.6.2 Accuracy of event filtering 
The PMU architecture does not require event filtering to be accurate. 


For most events, it is acceptable that, during a transition between states, events generated by instructions executed 
in one state are counted in the other state. The following sections describe the cases where events must not be 
counted in the wrong state:


° Exception-related events.


° Software increment events.


Exception-related events 


The PMU must filter events related to exceptions and exception handling according to the Exception level in which 
the event occurred. These events are: 


° EXC_TAKEN on page D5-1855, Exception taken. 


° EXC_RETURN on page D5-1855, Instruction architecturally executed, condition code check pass, exception 
return. 

° CID_WRITE_RETIRED on page D5-1855, Instruction architecturally executed, condition code check pass, 
write to CONTEXTIDR. 


° TTBR_WRITE_RETIRED on page D5-1857, Instruction architecturally executed, condition code check pass, 
write to translation table base. 


The PMU must not count an exception after it has been taken because this could systematically report a result of 
zero exceptions at ELO. Similarly, it is not acceptable for the PMU to count exception returns or writes to 
CONTEXTIDR after the return from the exception. 


Software increment events 


The PMU must filter software increment events according to the Exception level in which the software increment 
occurred. Software increment counting must also be precise, meaning the PMU must count every architecturally 
executed software increment event, and must not count any speculatively executed software increment. 


Software increment events must also be counted without the need for explicit synchronization. For example, two 
software increments executed without an intervening Context synchronization event must increment the event 
counter twice. 


For more information, see SW_INCR on page D5-1855, Instruction architecturally executed, condition code check 
pass, software increment. 


Pseudocode description of event filtering 


See AArch64.CountEvents() and AArch32.CountEvents() in Chapter J1 ARMv8 Pseudocode for a pseudocode 
description of event filtering. However, this function does not completely describe the behavior for Unattributable 
events. 
D5.7 Performance Monitors and Debug state


Events that count cycles are not counted in Debug state. 


Events Attributable to the operations issued by the debugger through the external debug interface are not counted 
in Debug state. 


In an implementation that supports multithreading, when the value of PMEVTYPER<n>_EL0.MT is 1, if an event 
is Attributable to an operation issued by the debugger through the external debug interface to another thread that is 
in Debug state, then the event is not counted, and it is IMPLEMENTATION DEFINED whether the event is counted when 
the counting thread is in Debug state. 


For each Unattributable event, it is IMPLEMENTATION DEFINED whether it is counted when the counting PE is in 
Debug state. If the event might be counted, then the rules in Filtering by Exception level and PE state on 
page D5-1843 apply for the current Security state in Debug state. 
D5.8 Counter enables

Table D5-3 shows the event counter enables for an implementation that does not include EL2, where the PMCR.E 
bit is a global counter enable bit and PMCNTENSET provides an enable bit for each counter.

Table D5-3 Event counter enables when an implementation does not include EL2

PMCR.E  PMCNTENSET[x] == 0  PMCNTENSET[x] == 1
0       PMNx disabled       PMNx disabled
1       PMNx disabled       PMNx enabled

If the implementation includes EL2, then in addition to the PMCR.E and PMCNTENSET enable bits: 

° HDCR.HPME overrides the value of PMCR.E for counters configured for access in Hyp mode. 

° HDCR.HPMN specifies the number of performance counters that the Guest OS can access. The minimum 
permitted value of HDCR.HPMN is 1, meaning there must be at least one counter that the Guest OS can 
access. 

Table D5-4 shows the combined effect of all the counter enable controls. 

Table D5-4 Event counter enables when an implementation includes EL2

                                       PMCNTENSET[x] == 1
HDCR.HPME  PMCR.E  PMCNTENSET[x] == 0  x < HDCR.HPMN  x ≥ HDCR.HPMN
0          0       PMNx disabled       PMNx disabled  PMNx disabled
0          1       PMNx disabled       PMNx enabled   PMNx disabled
1          0       PMNx disabled       PMNx disabled  PMNx enabled
1          1       PMNx disabled       PMNx enabled   PMNx enabled

Note 

The effect of HDCR.{HPME, HPMN} on the counter enables applies in both Security states. However, in Secure 

state the value returned for PMCR.N is not affected by HDCR.HPMN. 

EL2 does not affect the enabling of PMCCNTR. Table D5-5 shows the PMCCNTR enables, for all 

implementations. 

Table D5-5 Cycle counter enables

PMCR.E  PMCNTENSET[31] == 0  PMCNTENSET[31] == 1
0       PMCCNTR disabled     PMCCNTR disabled
1       PMCCNTR disabled     PMCCNTR enabled
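
The enable tables in this section can be summarized in two small functions. This is an illustrative sketch with hypothetical names, taking the register fields as plain integers.

```python
def event_counter_enabled(x, pmcntenset, pmcr_e, el2_implemented=False,
                          hpmn=0, hpme=0):
    """Return True if event counter PMNx is enabled."""
    if not (pmcntenset >> x) & 1:
        return False              # The per-counter enable bit is clear.
    if el2_implemented and x >= hpmn:
        return hpme == 1          # HDCR.HPME overrides PMCR.E for this counter.
    return pmcr_e == 1


def cycle_counter_enabled(pmcntenset, pmcr_e):
    """Return True if PMCCNTR is enabled. EL2 has no effect on this."""
    return bool((pmcntenset >> 31) & 1) and pmcr_e == 1
```

When EL2 is not implemented, event_counter_enabled reduces to the logical AND of PMCR.E and PMCNTENSET[x].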
D5.9 Counter access


All implemented counters are accessible in EL3, Secure EL1 and EL2. If EL2 is implemented, the hypervisor uses 
HDCR.HPMN to reserve an event counter, with the effect that software cannot access that counter and its associated 
state from Non-secure EL1 or from Non-secure EL0.


Note
This section describes a counter as being accessible from a particular Exception level and state. However, access to 
the registers is subject to the access permissions described in Access permissions on page D5-1871. In particular, 
accesses from EL0 might be UNDEFINED and accesses might be trapped to EL1 or EL2.








D5.9.1 PMNx event counters


For an implementation that includes EL2 and EL3, Table D5-6 shows how the values of the HDCR.HPMN field 
control the behavior of accesses to the PMNx event counter registers. 


Table D5-6 Result of PMNx event counter accesses 











Condition        Secure state                    Non-secure state
                 EL3       EL1       EL0         EL2       EL1        EL0
x < HDCR.HPMN    Succeeds  Succeeds  Succeeds    Succeeds  Succeeds   Succeeds
x ≥ HDCR.HPMN    Succeeds  Succeeds  Succeeds    Succeeds  No access  No access
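
The access rule in Table D5-6 can be sketched as a single function. This Python fragment is a hedged illustration: the function name and its arguments are hypothetical, and it models only the effect of HDCR.HPMN, not the separate access permissions that can make an access UNDEFINED or trapped.

```python
def pmn_access_succeeds(x: int, hpmn: int, secure: bool, el: int) -> bool:
    """Model Table D5-6 for an implementation that includes EL2 and EL3.

    x      -- index of the event counter PMNx being accessed
    hpmn   -- value of HDCR.HPMN
    secure -- True if the access is made in Secure state
    el     -- Exception level of the access (0 to 3)
    """
    if secure or el >= 2:
        # Secure state, EL2, and EL3 can access all implemented counters.
        return True
    # Non-secure EL1 and EL0 can access only counters below HDCR.HPMN.
    return x < hpmn
```

With HDCR.HPMN set to 4, a Non-secure EL1 access to PMN5 is reported as no access, while the same access from EL2 succeeds.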





Where Table D5-6 shows no access: 


° If PMSELR.SEL is x then: 
— A direct read of PMXEVTYPER or PMXEVCNTR is CONSTRAINED UNPREDICTABLE. 
— A direct write to PMXEVTYPER or PMXEVCNTR is CONSTRAINED UNPREDICTABLE. 


° A direct read of PMEVTYPER<n> or PMEVCNTR<n> is CONSTRAINED UNPREDICTABLE. 
° A direct write of PMEVTYPER<n> or PMEVCNTR<n> is CONSTRAINED UNPREDICTABLE. 


° For direct reads and direct writes, PMOVSCLR[x], PMOVSSET[x], PMCNTENSET[x], 
PMCNTENCLR[x], PMINTENSET[x], and PMINTENCLR[x] are RAZ/WI. 


° Direct writes to PMSWINC[x] are ignored. 
° A direct write of 1 to PMCR.P does not reset PMNx. 


For more information on the CONSTRAINED UNPREDICTABLE behavior of the Performance Monitor Extension, see: 
° For AArch32, The Performance Monitors Extension on page K1-5462. 
° For AArch64, The Performance Monitors Extension on page K1-5480. 


Note 


In Secure state, and in the Non-secure EL2 mode, the value of HDCR.HPMN does not affect the value returned for 
PMCR.N. 








D5.9.2 Cycle counter 


The PMU does not provide any control that a hypervisor can use to reserve the cycle counter for its own use. The 
only control over the cycle counter is an access permission control for EL0. See Access permissions on 
page D5-1871. 
D5.10 Events, event numbers, and mnemonics 


The following sections describe the events that can be counted and their associated event numbers, and the 
mnemonics for the events: 


° Definitions. 

° The event number space on page D5-1852. 

° Common event numbers on page D5-1852. 

° Common architectural event numbers on page D5-1854. 

° Common microarchitectural event numbers on page D5-1859. 

° Meaningful ratios between common microarchitectural events on page D5-1869. 


° Required events on page D5-1869. 

° IMPLEMENTATION DEFINED event numbers on page D5-1870. 


D5.10.1 Definitions 


The following subsections give more information about terms used in the event definitions: 
° Definition of terms. 

° Levels of caches and TLBs on page D5-1851. 

° Shared caches and buses on page D5-1851. 


Definition of terms 


Speculatively executed 


Many events relate to speculatively executed operations. Here, speculatively executed means the PE 
did some work associated with one or more instructions but the instructions were not necessarily 
architecturally executed. 


An instruction might create one or more microarchitectural operations (μ-ops) at any point in the 
execution pipeline. For the purpose of event counting, the μ-ops are counted. The definition of a 
μ-op is implementation specific. An architecture instruction might create more than one μ-op for 
each instruction. μ-ops might also be removed or merged in the execution stream, so an architecture 
instruction might create no μ-ops for an instruction. Any arbitrary translation of instructions to an 
equivalent sequence of μ-ops is permitted. 


This means there is no architecturally guaranteed relationship between a speculatively executed 
μ-op and an architecturally executed instruction. The results of such an operation can also be 
discarded, if it transpires that the operation was not required, such as a mispredicted branch. 
Therefore, ARMv8-A defines these events as operation speculatively executed, where appropriate. 


— Note 


The definition of speculatively executed does not mean only those operations that are executed 
speculatively and later abandoned, for example due to a branch misprediction or fault. That is, 
speculatively executed operations must count operations on both false and correct execution paths. 





The counting of operations can indicate the workload on the PE. However, there is no requirement 
for operations to represent similar amounts of work, and direct comparisons between different 
microarchitectures are not meaningful. 


For example, an implementation might split an A32 or T32 LDM instruction of six registers into six 
μ-ops, one for each load, and a seventh address-generation operation to determine the base address 
or writeback address. Also, for doubleword alignment, the six load μ-ops might combine into four 
operations, that is, a word load, two doubleword loads, and a second word load. This single 
instruction can then be counted as five, or possibly six, events: 


° Four (Operation speculatively executed - Load) events. 
° One (Operation speculatively executed - Integer data processing) event. 
° One (Operation speculatively executed - Software change of the PC) event if the PC was one 
of the six registers in the LDM instruction. 
Different groups of events can have different IMPLEMENTATION DEFINED definitions of 
speculatively executed. Such groups share a common base type, which the event name denotes. 
Each of the events in the previous example is of the base type, operation speculatively executed. 


For groups of events with a common base type, speculatively executed operations are all counted 
on the same basis, which normally means at the same point in the pipeline. It is possible to compare 
the counts and make meaningful observations about the program being profiled. 


Within these groups, events are commonly defined with reference to a particular architecture 
instruction or group of instructions. In the case of speculatively executed operations this means 
operations with semantics that map to that type of instruction. 


Instruction memory access 


A PE acquires instructions for execution through instruction fetches. Instruction fetches might be 
due to: 

° Fetching instructions that are architecturally executed. 

° The result of the execution of an instruction preload instruction, PLI. 

° Speculation that a particular instruction might be executed in the future. 


The relationship between the fetch of an individual instruction and an instruction memory access is 
IMPLEMENTATION DEFINED. For example, an implementation might fetch many instructions 
including a non-integer number of instructions in a single instruction memory access. 


Memory-read operations 


A PE accesses memory through memory-read and memory-write operations. A memory-read 
operation might be due to: 


° The result of an architecturally executed memory-reading instruction. 
° The result of a speculatively executed memory-reading instruction. 
° A translation table walk. 


For levels of cache hierarchy beyond the Level 1 caches, memory-read operations also include 
accesses made as part of a refill of another cache closer to the PE. Such refills might be due to: 


° Memory-read operations or memory-write operations that miss in the cache. 
° The execution of a data preload instruction. 
° The execution of an instruction preload instruction on a unified cache. 
° The execution of a cache maintenance instruction. 
— Note 


A preload instruction or cache maintenance instruction is not, in itself, an access to that 
cache. However, it might generate cache refills which are then treated as memory-read 
operations beyond that cache. 





° Speculation that a future instruction might access the memory location. 
This list is not exhaustive. 


The relationship between memory-read instructions and memory-read operations is 
IMPLEMENTATION DEFINED. For example, for some implementations an LDP instruction that reads 
two 64-bit registers might generate one memory-read operation if the address is quadword-aligned, 
but for other addresses it generates two or more memory-read operations. 
Memory-write operations 


Memory-write operations might be due to: 


° The result of an architecturally executed memory-writing instruction. 
° The result of a speculatively executed memory-writing instruction. 
— Note 


Speculatively executed memory-writing instructions that do not become architecturally executed 
must not alter the architecturally defined view of memory. They can, however, generate a 
memory-write operation that is later undone in some implementation specific way. 





For levels of cache hierarchy beyond the Level 1 caches, memory-write operations also include 
accesses made as part of a write-back from another cache closer to the PE. Such write-backs might 
be due to: 


° Evicting a dirty line from the cache, to allocate a cache line for a cache refill, see 
memory-read operations. 


° The execution of a cache maintenance instruction. 


— Note 


A cache maintenance instruction is not in itself an access to that cache. However, it might 
generate write-backs which are then treated as memory-write operations beyond that cache. 





° The result of a coherency request from another PE. 
This list is not exhaustive. 


The relationship between memory-writing instructions and memory-write operations is 
IMPLEMENTATION DEFINED. For example, for some implementations an STP instruction that writes 
two 64-bit registers might generate one memory-write operation if the address is quadword-aligned, 
but for other addresses it generates two or more memory-write operations. In some 
implementations, the result of two STR instructions that write to adjacent memory might be merged 
into a single memory-write operation. 


— Note 


The data written back from a cache that is shared with other PEs might not be data that was written 
by the PE that performs the operation that leads to the write-back. Nevertheless, the event is counted 
as a write-back event for that PE. 





Instruction architecturally executed 


Instruction architecturally executed is a class of event that counts for each instruction of the 
specified type. Architecturally executed means that the program flow is such that the counted 
instruction would be executed in a Simple sequential execution of the program. Therefore an 
instruction that has been executed and retired is defined to be architecturally executed. When a PE 
can perform speculative execution, an instruction is not architecturally executed if the PE discards 
the results of the speculative execution. 


If an instruction that would be executed in a Simple sequential execution of the program generates 
a synchronous exception, it is IMPLEMENTATION DEFINED whether the instruction is counted. 


Each architecturally executed instruction is counted once, even if the implementation splits the 
instruction into multiple operations. Instructions that have no visible effect on the architectural state 
of the PE are architecturally executed if they form part of the architecturally executed program flow. 
The point where such instructions are retired is IMPLEMENTATION DEFINED. 


Examples of instructions that have no visible effect are: 


. A NOP. 

° A conditional instruction that fails its condition code check. 

° A Compare and Branch on Zero, CBZ, instruction that does not branch. 

° A Compare and Branch on Nonzero, CBNZ, instruction that does not branch. 


The point at which an event causes an event counter to be updated is not defined. 
Unless otherwise stated, all instructions of the specified type are counted even if they have no visible 
effect on the architectural state of the PE. This includes a conditional instruction that fails its 
condition code check. 


For events that count only the execution of instructions that update context state, such as writes to 
the CONTEXTIDR, if such an instruction is executed twice without an intervening Context 
synchronization event, it is CONSTRAINED UNPREDICTABLE whether the first instruction is counted. 


Instruction architecturally executed, condition code check pass 


Instruction architecturally executed, condition code check pass is a class of events that explicitly do 
not occur for: 


° A conditional instruction that fails its condition code check. 

° A Compare and Branch on Zero, CBZ, instruction that does not branch. 

° A Compare and Branch on Nonzero, CBNZ, instruction that does not branch. 
° A Test and Branch on Zero, TBZ, instruction that does not branch. 

° A Test and Branch on Nonzero, TBNZ, instruction that does not branch. 

. A Store-Exclusive instruction that does not write to memory. 


Otherwise, the definition of architecturally executed is the same as for Instruction architecturally 
executed. 


Levels of caches and TLBs 


The mapping of different levels of cache or TLB to the PMU events is determined by the implementation. Although 
the CLIDR_EL1, or the AArch32 CLIDR, defines the implemented levels of cache, the architecture does not provide 
any way of determining implemented levels of TLB. Also, many implementations include structures that provide 
some caching at a higher level than the level 1 caches or TLBs. Typically, these structures, that might be called 
Level 0 caches, or mini caches, or microcaches, are invisible to software. The implementation-specific nature of 
cache and TLB implementations means that, in general, PMU event counts cannot be used reliably to make direct 
comparisons between different implementations. 


Shared caches and buses 


There is no architectural concept of a shared component. However, when a cache, a bus, or any other system 
component that might generate countable events is implemented, and: 


° The extent of the first-order effects due to an event from that component is only applicable to a single PE, 
then the event is not shared. 


° Otherwise, the event is shared. 


Second-order effects are not considered when determining if an event is shared. 


Example D5-3 First and second order effects of a cache miss in a multiple-PE implementation 


In an implementation that consists of two PEs, each with its own L1 cache, a cache miss by one of the PEs is a 
first-order effect of an access to its cache. Any snoop that is performed on the L1 cache of the other PE in the 
implementation as a result of that cache miss is a second order effect. 


Note 


Shared events are inherently linked to microarchitectures and so the implementer must make an informed decision 
about how such events are implemented. 
D5.10.2 The event number space 


The event number space is 10 bits, and is allocated as follows: 


0x000-0x03F Common architectural and microarchitectural event numbers. 
ARM defines these event numbers. Supported events in this range can be discovered using: 
° The PMCEID0_EL0 and PMCEID1_EL0 registers in AArch64 state. 
° The PMCEID0 and PMCEID1 registers in AArch32 state. 
For more information see: 
° Common event numbers. 
° Common architectural event numbers on page D5-1854. 
° Common microarchitectural event numbers on page D5-1859. 

0x040-0x0BF ARM-recommended common architectural and microarchitectural events. 
These events are IMPLEMENTATION DEFINED. For more information see: 
° IMPLEMENTATION DEFINED event numbers on page D5-1870. 
° Appendix K3 Recommendations for Performance Monitors Event Numbers for 
IMPLEMENTATION DEFINED Events. 

0x0C0-0x3FF IMPLEMENTATION DEFINED events. 
For more information see IMPLEMENTATION DEFINED event numbers on page D5-1870. 


D5.10.3 Common event numbers 


The event numbers of the common events are reserved for the specified events. Each of these event numbers must 
either: 

° Be used for its assigned event. 

° Not be used. 


When an implementation supports monitoring of an event that is assigned a common event number, ARM strongly 
recommends that it uses that number for the event. However, software might encounter implementations where an 
event assigned a number in this range is monitored using an event number from the IMPLEMENTATION DEFINED 
range. 
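
The allocation of event numbers described in The event number space can be summarized in code. The following Python sketch is illustrative only: the function name and the returned labels are hypothetical, and the ranges are those listed in that section.

```python
def classify_event_number(event: int) -> str:
    """Return the allocation range that an event number falls into.

    The labels are informal names for the three ranges, not
    architectural terms.
    """
    if not 0x000 <= event <= 0x3FF:
        raise ValueError(f"event number {event:#x} is outside 0x000-0x3FF")
    if event <= 0x03F:
        return "common"                 # ARM-defined common events
    if event <= 0x0BF:
        return "arm-recommended"        # IMPLEMENTATION DEFINED, ARM-recommended
    return "implementation-defined"     # IMPLEMENTATION DEFINED events
```

Because an implementation might monitor a common event using a number from the IMPLEMENTATION DEFINED range, software cannot rely on a classification like this to identify what an event counts. It indicates only how the number is allocated.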


Note 
ARM might define other common event numbers. This is one reason why software must not assume that an event 
with an assigned common event number is never monitored using an event number from the IMPLEMENTATION 
DEFINED range. 








Table D5-7 lists the PMU architectural and microarchitectural event numbers in event number order. The entries in 
the Event mnemonic column link to the event description in Common architectural event numbers on page D5-1854 
or Common microarchitectural event numbers on page D5-1859. 


Table D5-7 PMU event numbers 





Event
number  Event type          Event mnemonic           Description

0x000   Architectural       SW_INCR                  Instruction architecturally executed, condition code check pass, software increment
0x001   Microarchitectural  L1I_CACHE_REFILL (a)     Attributable Level 1 instruction cache refill
0x002   Microarchitectural  L1I_TLB_REFILL (a)       Attributable Level 1 instruction TLB refill
0x003   Microarchitectural  L1D_CACHE_REFILL (a)     Attributable Level 1 data cache refill
0x004   Microarchitectural  L1D_CACHE                Attributable Level 1 data cache access
0x005   Microarchitectural  L1D_TLB_REFILL (a)       Attributable Level 1 data TLB refill
0x006   Architectural       LD_RETIRED               Instruction architecturally executed, condition code check pass, load
0x007   Architectural       ST_RETIRED               Instruction architecturally executed, condition code check pass, store
0x008   Architectural       INST_RETIRED             Instruction architecturally executed
0x009   Architectural       EXC_TAKEN                Exception taken
0x00A   Architectural       EXC_RETURN               Instruction architecturally executed, condition code check pass, exception return
0x00B   Architectural       CID_WRITE_RETIRED        Instruction architecturally executed, condition code check pass, write to CONTEXTIDR
0x00C   Architectural       PC_WRITE_RETIRED         Instruction architecturally executed, condition code check pass, software change of the PC
0x00D   Architectural       BR_IMMED_RETIRED         Instruction architecturally executed, immediate branch
0x00E   Architectural       BR_RETURN_RETIRED        Instruction architecturally executed, condition code check pass, procedure return
0x00F   Architectural       UNALIGNED_LDST_RETIRED   Instruction architecturally executed, condition code check pass, unaligned load or store
0x010   Microarchitectural  BR_MIS_PRED              Mispredicted or not predicted branch speculatively executed
0x011   Microarchitectural  CPU_CYCLES               Cycle
0x012   Microarchitectural  BR_PRED                  Predictable branch speculatively executed
0x013   Microarchitectural  MEM_ACCESS               Data memory access
0x014   Microarchitectural  L1I_CACHE                Attributable Level 1 instruction cache access
0x015   Microarchitectural  L1D_CACHE_WB             Attributable Level 1 data cache write-back
0x016   Microarchitectural  L2D_CACHE                Attributable Level 2 data cache access
0x017   Microarchitectural  L2D_CACHE_REFILL (a)     Attributable Level 2 data cache refill
0x018   Microarchitectural  L2D_CACHE_WB             Attributable Level 2 data cache write-back
0x019   Microarchitectural  BUS_ACCESS               Bus access
0x01A   Microarchitectural  MEMORY_ERROR             Local memory error
0x01B   Microarchitectural  INST_SPEC                Operation speculatively executed
0x01C   Architectural       TTBR_WRITE_RETIRED       Instruction architecturally executed, condition code check pass, write to TTBR
0x01D   Microarchitectural  BUS_CYCLES               Bus cycle
0x01E   Architectural       CHAIN                    For odd-numbered counters, increments the count by one for each overflow of the preceding even-numbered counter. For even-numbered counters, there is no increment.
0x01F   Microarchitectural  L1D_CACHE_ALLOCATE       Attributable Level 1 data cache allocation without refill
0x020   Microarchitectural  L2D_CACHE_ALLOCATE       Attributable Level 2 data cache allocation without refill
0x021   Architectural       BR_RETIRED               Instruction architecturally executed, branch
0x022   Microarchitectural  BR_MIS_PRED_RETIRED      Instruction architecturally executed, mispredicted branch
0x023   Microarchitectural  STALL_FRONTEND           No operation issued due to the frontend
0x024   Microarchitectural  STALL_BACKEND            No operation issued due to backend
0x025   Microarchitectural  L1D_TLB                  Attributable Level 1 data or unified TLB access
0x026   Microarchitectural  L1I_TLB                  Attributable Level 1 instruction TLB access
0x027   Microarchitectural  L2I_CACHE                Attributable Level 2 instruction cache access
0x028   Microarchitectural  L2I_CACHE_REFILL (a)     Attributable Level 2 instruction cache refill
0x029   Microarchitectural  L3D_CACHE_ALLOCATE       Attributable Level 3 data or unified cache allocation without refill
0x02A   Microarchitectural  L3D_CACHE_REFILL (a)     Attributable Level 3 data or unified cache refill
0x02B   Microarchitectural  L3D_CACHE                Attributable Level 3 data or unified cache access
0x02C   Microarchitectural  L3D_CACHE_WB             Attributable Level 3 data or unified cache write-back
0x02D   Microarchitectural  L2D_TLB_REFILL (a)       Attributable Level 2 data or unified TLB refill
0x02E   Microarchitectural  L2I_TLB_REFILL (a)       Attributable Level 2 instruction TLB refill
0x02F   Microarchitectural  L2D_TLB                  Attributable Level 2 data or unified TLB access
0x030   Microarchitectural  L2I_TLB                  Attributable Level 2 instruction TLB access








a. For more information, see Meaningful ratios between common microarchitectural events on page D5-1869. 


The PMCEID0_EL0 and PMCEID1_EL0 registers identify the events that an implementation supports. 
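
As an illustration of how software might use the PMCEID0_EL0 and PMCEID1_EL0 values, the following Python sketch tests whether a common event is reported as supported. It assumes, based on the register descriptions rather than on this section, that bit n of PMCEID0_EL0 corresponds to event number n and that bit n of PMCEID1_EL0 corresponds to event number 0x20 + n. The function name is hypothetical and the register values are passed in as plain integers.

```python
def common_event_supported(event: int, pmceid0: int, pmceid1: int) -> bool:
    """Return True if a common event (0x000-0x03F) is flagged as supported.

    Assumption: PMCEID0 bits [31:0] cover events 0x00-0x1F and
    PMCEID1 bits [31:0] cover events 0x20-0x3F.
    """
    if not 0x00 <= event <= 0x3F:
        raise ValueError("only common event numbers 0x000-0x03F are covered")
    register = pmceid0 if event < 0x20 else pmceid1
    return bool((register >> (event % 32)) & 1)
```

A caller would typically read both registers once and then query each event of interest against the cached values.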


Future revisions of this manual, or of the architecture, might define additional common event numbers. Events that 
do not require additional features in the PMU can be implemented retrospectively, meaning such events can be 
supported as part of a PMUv3 implementation. 





Note 


This means that an ARMv7 PMUv2 implementation can include support for any of the event numbers defined in 
Table D5-7 on page D5-1852. 








D5.10.4 Common architectural event numbers 
This section describes the defined common architectural event numbers. 
For the common features, normally the counters must increment only once for each event. The event descriptions 
include any exceptions to this rule. 
In these definitions, the term architecturally executed means that the instruction flow is such that the counted 
instruction would have been executed in a Simple sequential execution model. 


The common architectural event numbers are: 


0x000, SW_INCR, Instruction architecturally executed, condition code check pass, software increment 
The counter increments on writes to the PMSWINC register. 


If the PE performs two architecturally executed writes to the PMSWINC register without an 
intervening Context synchronization event, then the counter is incremented twice. 


If PMEVTYPER<n>_EL0.evtCount is set to 0x000, then in AArch64 state, the counter counts MSR writes to 
PMSWINC_EL0 with bit [n] set to 1. 


If the value of PMEVTYPER<n>_EL0.MT is 1 then, in a multithreaded implementation, this counts 
writes by all PEs that have the same affinity at level 1 and above. 
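
The per-counter behavior of the software increment event can be modeled in a few lines. The following Python sketch is a simplified illustration: the function and its arguments are hypothetical, it treats each counter as a 32-bit value, and it ignores the counter enables and the MT bit.

```python
SW_INCR = 0x000  # event number of the software increment event

def apply_pmswinc_write(value: int, event_types: list, counters: list) -> list:
    """Model one write of `value` to PMSWINC.

    event_types[n] is the event number programmed for counter n.
    Counter n increments only if it counts SW_INCR and bit n of the
    written value is 1. The counters list is updated in place.
    """
    for n, event in enumerate(event_types):
        if event == SW_INCR and (value >> n) & 1:
            counters[n] = (counters[n] + 1) & 0xFFFFFFFF
    return counters
```

In this model, two writes with the same bit set increment the counter twice, matching the rule for two architecturally executed writes.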

0x006, LD_RETIRED, Instruction architecturally executed, condition code check pass, load 
The counter increments for every executed memory-reading instruction. 


— Note 


Event 0x006 does not count the return status value of a Store-Exclusive instruction. 





Whether the preload instructions PRFM, PLD, PLDW, PLI, count as memory-reading instructions is 
IMPLEMENTATION DEFINED. ARM recommends that if the instruction is not implemented as a NOP 
then it is counted as a memory-reading instruction. 

0x007, ST_RETIRED, Instruction architecturally executed, condition code check pass, store 
The counter increments for every executed memory-writing instruction. 
DC ZVA is counted as a store. 


The counter does not increment for a Store-Exclusive instruction that fails. 


0x008, INST_RETIRED, Instruction architecturally executed 


The counter increments for every architecturally executed instruction. 


0x009, EXC_TAKEN, Exception taken 
The counter increments for each exception taken. See Exception-related events on page D5-1844. 
—— Note 
The counter counts the PE exceptions described in: 


° For exceptions taken to an Exception level using AArch64, Exception entry on 
page D1-1521. 


° For exceptions taken to an Exception level using AArch32, AArch32 state exception 
descriptions on page G1-3849. 





0x00A, EXC_RETURN, Instruction architecturally executed, condition code check pass, exception return 


The counter increments for each executed exception return instruction. See also Exception-related 
events on page D5-1844. The following sections define the counted instructions: 


° For an exception return from an Exception level using AArch64, Exception return on 
page D1-1536. 


° For an exception return from an Exception level using AArch32, Exception return 
instructions on page G1-3834. 
0x00B, CID_WRITE_RETIRED, Instruction architecturally executed, condition code check pass, write to 
CONTEXTIDR 


The counter increments for every write to CONTEXTIDR. See Exception-related events on 
page D5-1844. 
If the PE performs two architecturally-executed writes to CONTEXTIDR without an intervening 
Context synchronization event, it is CONSTRAINED UNPREDICTABLE whether the first write is 
counted. 


0x00C, PC_WRITE_RETIRED, Instruction architecturally executed, condition code check pass, software 
change of the PC 
The counter increments for every software change of the PC. This includes all: 
° Branch instructions. 
° Memory-reading instructions that explicitly write to the PC. 
° Data-processing instructions that explicitly write to the PC. 
° Exception return instructions, ERET and RET. 


It is IMPLEMENTATION DEFINED whether the counter increments for any or all of: 


. BRK and BKPT instructions. 
° An exception generated because an instruction is UNDEFINED. 
. The exception-generating instructions, SVC, HVC, and SMC. 


It is IMPLEMENTATION DEFINED whether an ISB is counted as a software change of the PC. 
The counter does not increment for exceptions other than those explicitly identified in these lists. 


— Note 


Conditional branches are only counted if the branch is taken. 





0x00D, BR_IMMED_RETIRED, Instruction architecturally executed, immediate branch 
The counter counts all immediate branch instructions that are architecturally executed. 
In AArch32 state, the counter increments each time the PE executes one of the following 
instructions: 
° B{<c>} <label>. 
° BL{<c>} <label>. 
° BLX{<c>} <label>. 
° CBZ <Rn>, <label>. 
° CBNZ <Rn>, <label>. 


In AArch64 state, the counter increments each time the PE executes one of the following immediate 
branch instructions: 
° B <label>. 
° B.cond <label>. 
° BL <label>. 
° CBZ <Rn>, <label>. 
° CBNZ <Rn>, <label>. 
° TBZ <Rn>, <label>. 
° TBNZ <Rn>, <label>. 


— Note 


Conditional branches are always counted, regardless of whether the branch is taken. 





If an ISB is counted as a software change of the PC instruction, then it is IMPLEMENTATION DEFINED 
whether an ISB is counted as an immediate branch instruction. 


0x00E, BR_RETURN_RETIRED, Instruction architecturally executed, condition code check pass, procedure 
return 

In AArch32 state, the counter counts the following procedure return instructions: 

° BX R14. 

° MOV PC, LR. 

° POP {.., PC}. 
° LDR PC, [SP], #offset. 


— Note 


The counter counts only the listed instructions as procedure returns. For example, it does not count 
the following as procedure return instructions: 


° BX R0, because Rm != R14. 

° MOV PC, R0, because Rm != R14. 

° LDM SP, {.., PC}, because writeback is not specified. 

° LDR PC, [SP, #offset], because this specifies the wrong addressing mode. 





In AArch64 state, the counter counts all architecturally executed RET instructions. 
0x00F, UNALIGNED_LDST_RETIRED, Instruction architecturally executed, condition code check pass, 
unaligned load or store 


The counter counts each memory-reading instruction or memory-writing instruction access that 
would generate an Alignment fault when Alignment fault checking is enabled. 


This event does not count accesses that would generate an SP alignment fault exception if the 
applicable stack pointer alignment check is enabled, unless that access would also generate an 
Alignment fault Data Abort exception if Alignment fault checking is enabled. 


It is IMPLEMENTATION DEFINED whether this event counts accesses that generate an exception, 
including accesses that do generate Alignment fault Data Abort exceptions. 


See SP alignment checking on page D1-1515 for more information. 

See Unaligned data access on page E2-2323 for more information. 
0x01C, TTBR_WRITE_RETIRED, Instruction architecturally executed, condition code check pass, write to 
TTBR 


The counter counts writes to TTBR0_EL1 and TTBR1_EL1 in AArch64 state and TTBR0 and 
TTBR1 in AArch32 state. When EL3 is implemented and using AArch32, this includes counting 
writes to both banked copies of TTBR0 and TTBR1. See Exception-related events on 
page D5-1844. 


If the PE executes two writes to the same TTBR, without an intervening Context synchronization 
event, it is CONSTRAINED UNPREDICTABLE whether the first write to the TTBR is counted. 


If EL3 is implemented and using AArch64, the counter does not count writes to TTBR0_EL3. 


If EL2 is implemented and using AArch64, the counter does not count writes to TTBR0_EL2 and 
to VTTBR_EL2. 


If EL2 is implemented and using AArch32, the counter does not count writes to HTTBR and to 
VTTBR. 
0x01E, CHAIN 


For an odd-numbered counter, increments when an overflow occurs on the preceding 
even-numbered counter on the same PE. Even-numbered counters never increment as a result of this 
event. This means the CHAIN event links the odd-numbered counter with the preceding 
even-numbered counter to provide a 64-bit counter. 
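
The way a CHAIN pair behaves as a single 64-bit counter can be modeled arithmetically. The following Python sketch is illustrative only: the function names are hypothetical, and the model updates both halves in one step, whereas the architecture does not define the latency between the low counter overflowing and the high counter incrementing.

```python
MASK32 = 0xFFFFFFFF

def count_events(low: int, high: int, events: int) -> tuple:
    """Add `events` to the even-numbered (low) counter of a CHAIN pair.

    Each overflow of the 32-bit low counter produces one CHAIN event,
    which increments the odd-numbered (high) counter by one.
    """
    total = low + events
    overflows = total >> 32          # number of times the low counter wrapped
    return total & MASK32, (high + overflows) & MASK32

def combined_value(low: int, high: int) -> int:
    """Return the 64-bit value represented by the counter pair."""
    return (high << 32) | low
```

For example, one event counted while the low counter holds 0xFFFFFFFF leaves the low counter at 0 and the high counter incremented by one.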


— Note 


° The CHAIN event means a system can provide N 32-bit counters, N/2 64-bit counters, or a 
mixture of 32-bit counters and 64-bit counters. 


° The CHAIN event only counts overflows from the preceding even-numbered counter on the 
same PE. This means it is unaffected by the value of PMEVTYPER<n>_EL0.MT. 





There is no atomic access to a pair of counters, so if software reads a counter-pair that is enabled, it 
must use a high-low-high read sequence, or employ reasonable heuristics, to avoid tearing. 
Similarly, if using CHAIN events, when disabling the counters software must take care that the 
result is not torn by the low counter overflowing at the same time as the counters are disabled. 
Example D5-4 on page D5-1858 shows suitable sequences for disabling and enabling CHAIN 
counters. 


Example D5-4 Usage examples for 64-bit counters 


An example high-low-high read sequence for a 64-bit counter is: 


retry: 
    MRS X2, PMEVCNTR1_EL0   ;; read high counter, must be odd-numbered 
    ISB                     ;; force ordering 
    MRS X0, PMEVCNTR0_EL0   ;; read low counter 
                            ;; must return the previous counter to PMEVCNTR1_EL0 
    ISB                     ;; force ordering 
    MRS X1, PMEVCNTR1_EL0   ;; read high counter 
    CMP X1, X2 
    BNE retry               ;; if the high counter has changed, then retry 
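
The retry logic of the high-low-high sequence can also be expressed in a higher-level form. The following Python sketch is a model for illustration, not driver code: read_high and read_low are hypothetical callables standing in for the two MRS reads, and FakePair is an invented stand-in for hardware in which the counter advances by one after each read of the low half.

```python
def read_counter_pair(read_high, read_low) -> int:
    """Return an untorn 64-bit value from a CHAIN counter pair.

    Reads high, then low, then high again, and retries if the high
    counter changed between the two high reads.
    """
    while True:
        high_before = read_high()
        low = read_low()
        high_after = read_high()
        if high_before == high_after:
            return (high_after << 32) | low

class FakePair:
    """Invented stand-in for a running counter pair (illustration only)."""
    def __init__(self, value: int):
        self.value = value

    def read_high(self) -> int:
        return self.value >> 32

    def read_low(self) -> int:
        low = self.value & 0xFFFFFFFF
        self.value += 1              # the counter keeps counting after the read
        return low
```

Starting FakePair at 0xFFFFFFFF forces one retry, because the low counter overflows between the two reads of the high counter. The value returned is then consistent rather than a mix of old and new halves.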


When disabling a pair of counters that are paired by a CHAIN event, software must: 


1. Disable the low counter, by setting PMCNTENCLR_EL0[n] to 1. 
   Typically, software uses a read-modify-write sequence to update PMCNTENCLR_EL0. 


2. Execute an ISB instruction, or perform another Context synchronization event. 


3. Disable the high counter, by setting PMCNTENCLR_EL0[n+1] to 1, or setting PMCR_EL0.E to 0. 


When enabling a pair of counters that are paired by a CHAIN event, software must: 


1. Enable the high counter, by setting PMCNTENCLR_EL0[n+1] to 0 and, if necessary, setting PMCR_EL0.E
   to 1.

2. Execute an ISB instruction, or perform another Context synchronization event.

3. Enable the low counter by setting PMCNTENCLR_EL0[n] to 0.
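The ordering of the two sequences can be made explicit in a small C sketch. It does not touch hardware: the `pmu_trace` structure and the `record` helper are hypothetical and only log the order of steps, so the ordering can be inspected or unit-tested. A real driver would replace each `record` call with the corresponding System register write or ISB.

```c
#include <stddef.h>

/* Hypothetical step log, used here instead of real register writes. */
enum pmu_step_kind { STEP_DISABLE_COUNTER, STEP_ENABLE_COUNTER, STEP_ISB };

struct pmu_step {
    enum pmu_step_kind kind;
    unsigned counter;          /* counter index; unused for STEP_ISB */
};

#define PMU_TRACE_MAX 8

struct pmu_trace {
    struct pmu_step steps[PMU_TRACE_MAX];
    size_t count;
};

static void record(struct pmu_trace *t, enum pmu_step_kind kind, unsigned counter)
{
    if (t->count < PMU_TRACE_MAX) {
        t->steps[t->count].kind = kind;
        t->steps[t->count].counter = counter;
        t->count++;
    }
}

/* n is the even-numbered (low) counter of the CHAIN pair.
 * Disable order: low counter, synchronize, then high counter. */
void chain_pair_disable(struct pmu_trace *t, unsigned n)
{
    record(t, STEP_DISABLE_COUNTER, n);
    record(t, STEP_ISB, 0);
    record(t, STEP_DISABLE_COUNTER, n + 1);
}

/* Enable order is the mirror image: high counter, synchronize, then low. */
void chain_pair_enable(struct pmu_trace *t, unsigned n)
{
    record(t, STEP_ENABLE_COUNTER, n + 1);
    record(t, STEP_ISB, 0);
    record(t, STEP_ENABLE_COUNTER, n);
}
```

In both directions the low counter is the one that is stopped first and started last, so the high counter is always enabled while the low counter can still overflow.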


When using 64-bit counters, the architecture does not define the latency between the first counter 
overflowing and the second counter incrementing the CHAIN event. There is no requirement for 
updates to occur synchronously, but software reading or enabling the counter pair using a 
low-ISB-high sequence, as shown in Example D5-4, must not observe the low counter incrementing 
and overflowing for the event and the high counter not incrementing for the resulting CHAIN event. 
This means that the ISB executed after reading the low counter must ensure the completion of the 
update of the high counter by the CHAIN event. 


— Note 


The way the CHAIN event operates means that, to filter the Exception levels and Security states in 
which the event is counted, software must: 


° Program PMEVTYPER<n>_EL0 to count the event in the required conditions. 
° Program PMEVTYPER<n+1>_EL0 to count the CHAIN event in all Exception levels and 
states. 





0x021, BR_RETIRED, Instruction architecturally executed, branch. 


The counter counts all branches on the architecturally executed path that would incur cost if
mispredicted.

° Counts all branch instructions, memory-reading and data-processing instructions that
explicitly write to the PC, at retirement.

° Counts both taken and not-taken branches.

° It is IMPLEMENTATION DEFINED whether this includes each of:

— Unconditional direct branch instructions.
— Exception-generating instructions.
— Exception return instructions.
— Context synchronization instructions.


D5.10.5 Common microarchitectural event numbers

This section describes the defined common microarchitectural event numbers. 


The common microarchitectural events are features that are likely to be implemented across a wide range of 
implementations. Unlike the common architectural events, there can be some IMPLEMENTATION DEFINED variation 
between definitions on different implementations. 


Unless otherwise stated, the common microarchitectural features relate only to events resulting from the operation 
of the PE counting the events. Events resulting from the operation of other PEs that might share a resource must not 
be counted. Where a resource can be subject to events that do not result from the operation of any of the PEs that 
share it, ARM recommends that the resource implements its own event counters. An example of a resource that 
might require its own event counters is a shared Level 2 cache that is subject to accesses from a system coherency 
port on that cache. 


The event definitions relating to Level 2 caches generally assume the Level 2 cache is shared. The event definitions 
relating to Level 1 caches generally assume the Level 1 cache is not shared. 


The common microarchitectural event numbers are: 


0x001, L1I_CACHE_REFILL, Attributable Level 1 instruction cache refill


The counter counts Attributable instruction memory accesses that cause a refill of at least the 
Level 1 instruction or unified cache. This includes each instruction memory access that causes a 
refill from outside the cache. It excludes accesses that do not cause a new cache refill but are 
satisfied from refilling data of a previous miss. 


A refill includes any access that causes data to be fetched from outside the cache, even if the data is 
ultimately not allocated into the cache. For example, data might be fetched into a buffer but then 
discarded, rather than being allocated into a cache. These buffers are treated as part of the cache. 


The counter does not count cache maintenance instructions. 

See also:

° Attributability on page D5-1840.

° Meaningful ratios between common microarchitectural events on page D5-1869.


0x002, L1I_TLB_REFILL, Attributable Level 1 instruction TLB refill


The counter counts Attributable instruction memory accesses that cause a TLB refill of at least the
Level 1 instruction TLB. This includes each instruction memory access that causes an access to a
level of memory system due to a translation table walk or an access to another level of TLB caching.
It is IMPLEMENTATION DEFINED whether the count increments when: 

° A refill results in a Translation fault. 


° A refill is not allocated in the TLB. 


The counter does not count: 


° A TLB miss that does not cause a refill but does generate a translation table walk. 
° TLB maintenance instructions. 
See also: 


° Attributability on page D5-1840. 


° Meaningful ratios between common microarchitectural events on page D5-1869. 


0x003, L1D_CACHE_REFILL, Attributable Level 1 data cache refill


The counter counts each Attributable memory-read operation or Attributable memory-write 
operation that causes a refill of at least the Level 1 data or unified cache from outside the Level 1 
cache. Each access to a cache line that causes a new linefill is counted, including those from 
instructions that generate multiple accesses, such as load or store multiples, and PUSH and POP 
instructions. In particular, the counter counts accesses to the Level 1 cache that cause a refill that is 
satisfied by another Level 1 data or unified cache, or a Level 2 cache, or memory.


A refill includes any access that causes data to be fetched from outside the cache, even if the data is 
ultimately not allocated into the cache. For example, data might be fetched into a buffer but then 
discarded, rather than being allocated into a cache. These buffers are treated as part of the cache. 

The counter does not count:

° Accesses that do not cause a new Level 1 cache refill but are satisfied from refilling data of
a previous miss.

° Accesses to a cache line that generate a memory access but not a new linefill, such as
write-through writes that hit in the cache.

° Cache maintenance instructions.

° A write that writes an entire line to the cache and does not fetch any data from outside the
Level 1 cache, for example:
— A write of a full cache line from a coalescing buffer.
— A DC ZVA operation.

° A write that misses in the cache, and writes through the cache without allocating a line.

See also: 

° Attributability on page D5-1840. 


° Meaningful ratios between common microarchitectural events on page D5-1869. 


0x004, L1D_CACHE, Attributable Level 1 data cache access


The counter counts each Attributable memory-read operation or Attributable memory-write
operation that causes a cache access to at least the Level 1 data or unified cache. Each access to a
cache line is counted including the multiple accesses of instructions, such as LDM or STM. Each access 
to other Level 1 data or unified memory structures, for example refill buffers, write buffers, and 
write-back buffers, is also counted. 


The counter does not count cache maintenance instructions. 


See also Attributability on page D5-1840. 
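When L1D_CACHE_REFILL (0x003) and L1D_CACHE (0x004) are sampled over the same interval, dividing the first by the second gives a Level 1 data cache refill ratio. The following is a minimal sketch under that assumption. The function names are hypothetical, and the zero-denominator convention is a choice made for this example, not something the architecture defines.

```c
#include <stdint.h>

/* Ratio of two event counts sampled over the same interval.
 * Returns 0.0 when the denominator is zero (a convention chosen here). */
double pmu_event_ratio(uint64_t numerator, uint64_t denominator)
{
    if (denominator == 0) {
        return 0.0;
    }
    return (double)numerator / (double)denominator;
}

/* Level 1 data cache refill ratio: refills per Level 1 data cache access. */
double l1d_refill_ratio(uint64_t l1d_cache_refill, uint64_t l1d_cache)
{
    return pmu_event_ratio(l1d_cache_refill, l1d_cache);
}
```

Because both events have some IMPLEMENTATION DEFINED variation, such a ratio is best compared between runs on the same implementation rather than across implementations.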


0x005, L1D_TLB_REFILL, Attributable Level 1 data TLB refill


The counter counts each Attributable memory-read operation or Attributable memory-write 
operation that causes a TLB refill of at least the Level 1 data or unified TLB. It counts each read or 
write that causes a refill, in the form of a translation table walk or an access to another level of TLB 
caching. It is IMPLEMENTATION DEFINED whether the count increments when: 

° A refill results in a Translation fault. 


° A refill is not allocated in the TLB. 


The counter does not count: 


° A TLB miss that does not cause a refill but does generate a translation table walk. 
° TLB maintenance instructions. 

See also: 

° Attributability on page D5-1840.

° Meaningful ratios between common microarchitectural events on page D5-1869. 


0x010, BR_MIS_PRED, Mispredicted or not predicted branch speculatively executed 


The counter counts each correction to the predicted program flow that occurs because of a 
misprediction from, or no prediction from, the branch prediction resources and that relates to 
instructions that the branch prediction resources are capable of predicting. 


If no program-flow prediction resources are implemented, ARM recommends that the counter 
counts all branches that are not taken. 

0x011, CPU_CYCLES, Cycle


The counter increments on every cycle.


All counters are subject to changes in clock frequency, including when a WFI or WFE instruction stops
the clock. This means that it is CONSTRAINED UNPREDICTABLE whether or not CPU_CYCLES
continues to increment when the clocks are stopped by WFI and WFE instructions.


— Note

Unlike PMCCNTR, this count is not affected by PMCR.DP, PMCR.D, or PMCR.C:

° The counter is not incremented in prohibited regions, so is not affected by PMCR.DP.
° The counter increments on every cycle, regardless of the setting of PMCR.D.
° The counter is reset when event counters are reset by PMCR.P, never by PMCR.C.





In a multithreaded implementation, CPU_CYCLES counts each cycle for the processor for which 
this PE thread was active and could issue an instruction. For more information, see Cycle event 
counting on multithreaded implementations on page D5-1868. 


0x012, BR_PRED, Predictable branch speculatively executed


The counter counts every branch or other change in the program flow that the branch prediction 
resources are capable of predicting. 


If all branches are subject to prediction, for example a BTB or BTAC, then all branches are 
predictable branches. 


If branches are decoded before the predictor, so that the branch prediction logic dynamically 
predicts only some branches, for example conditional and indirect branches, then it is 
IMPLEMENTATION DEFINED whether other branches are counted as predictable branches. ARM 
recommends that all branches are counted. 


An implementation might include other structures that predict branches, such as a loop buffer that 
predicts short backwards direct branches as taken. Each execution of such a branch is a predictable 
branch. Terminating the loop might generate a misprediction event that is counted by 
BR_MIS_PRED. 


If no program-flow prediction resources are implemented, this event is optional, but ARM 
recommends that BR_PRED counts all branches. 


0x013, MEM_ACCESS, Data memory access 


The counter counts memory-read or memory-write operations that the PE made. The counter 
increments whether the access results in an access to a Level 1 data or unified cache, a Level 2 data 
or unified cache, or neither of these. 


The counter does not increment as a result of: 


° Instruction memory accesses, see Definition of terms on page D5-1848.
° Translation table walks.
° Cache maintenance instructions.
° Write-back from any cache.
° Refilling of any cache.


0x014, L1I_CACHE, Attributable Level 1 instruction cache access


The counter counts Attributable instruction memory accesses that access at least the Level 1
instruction or unified cache. Each access to other Level 1 instruction memory structures, such as
refill buffers, is also counted.


See also Attributability on page D5-1840. 


0x015, L1D_CACHE_WB, Attributable Level 1 data cache write-back


The counter counts every write-back of data from the Level 1 data or unified cache. The counter
counts each write-back that causes data to be written from the Level 1 cache to outside of the
Level 1 cache. For example, the counter counts the following cases:


° A write-back that causes data to be written to a Level 2 cache or memory.
° A write-back of a recently fetched cache line that has not been allocated to the Level 1 cache.
° Transfer of data from the Level 1 cache to outside of this cache made as a result of a
coherency request. The conditions determining which of these are counted for transfers to
other Level 1 caches within the same multiprocessor cluster are IMPLEMENTATION DEFINED.


Each write-back is counted once, even if multiple accesses are required to complete the write-back. 

Whether write-backs made as a result of cache maintenance instructions are counted is
IMPLEMENTATION DEFINED.

The counter does not count:

° The invalidation of a cache line without any write-back to a Level 2 cache or memory.

° Writes from the PE that write through the Level 1 cache to outside of the Level 1 cache.

An Unattributable write-back event occurs when a requestor outside the PE makes a coherency
request that results in write-back. If the cache is shared, then an Unattributable write-back event is
not counted. If the cache is not shared, then the event is counted. See Attributability on
page D5-1840.


It is IMPLEMENTATION DEFINED whether a write of a whole cache line that is not the result of the 
eviction of a line from the cache, is counted. For example, this applies when the PE determines 
streaming writes to memory and does not allocate lines to the cache, or by a DC ZVA operation. 


See also Attributability on page D5-1840. 


0x016, L2D_CACHE, Attributable Level 2 data cache access


The counter counts Attributable memory-read or Attributable memory-write operations, that the PE
made, that access at least the Level 2 data or unified cache. Each access to a cache line is counted
including refills of and write-backs from the Level 1 data, instruction, or unified caches. Each
access to other Level 2 data or unified memory structures, such as refill buffers, write buffers, and 
write-back buffers, is also counted. 

The counter does not count: 

° Operations made by other PEs that share this cache. 

° Cache maintenance instructions. 


See also Attributability on page D5-1840. 


0x017, L2D_CACHE_REFILL, Attributable Level 2 data cache refill


The counter counts Attributable memory-read or Attributable memory-write operations, that the PE
made, that access at least the Level 2 data or unified cache and cause a refill of a Level 1 data,
instruction, or unified cache or of the Level 2 data or unified cache. Each read from or write to the
cache that causes a refill from outside the Level 1 and Level 2 caches is counted. 


A refill includes any access that causes data to be fetched from outside the cache, even if the data is 
ultimately not allocated into the cache. For example, data might be fetched into a buffer but then 
discarded, rather than being allocated into a cache. These buffers are treated as part of the cache. 


For example, the counter counts: 


° Accesses to the Level 2 cache that cause a refill that is satisfied by another Level 2 cache, a 
Level 3 cache, or memory. 


° Refills of and write-backs from any Level 1 data, instruction or unified cache that cause a
refill from outside the Level 1 and Level 2 caches.


° Accesses to the Level 2 cache that cause a refill of a Level 1 cache from outside of the
Level 1 and Level 2 caches, even if there is no refill of the Level 2 cache.


The counter does not count, as events on this PE: 


° Accesses that do not cause a new cache refill but are satisfied from refilling data of a previous
miss.

° Accesses to the Level 2 cache that generate a memory access but not a new linefill, such as
write-through writes that hit in the Level 2 cache.

° Accesses to the Level 2 cache that are part of a Level 1 cache refill or write-back that hit in
the Level 2 cache so do not cause a refill from outside of the Level 1 and Level 2 caches.

° Operations made by other PEs that share this cache.
° Cache maintenance instructions.
° A write that writes an entire line to the cache and does not fetch any data from outside the
Level 1 and Level 2 caches, for example:
— A write-back from a Level 1 cache to a Level 2 cache.
— A write from a coalescing buffer of a full cache line.
— A DC ZVA operation.

° A write that misses in the cache, and writes through the cache without allocating a line.

See also:

° Attributability on page D5-1840. 

° Meaningful ratios between common microarchitectural events on page D5-1869. 


0x018, L2D_CACHE_WB, Attributable Level 2 data cache write-back


The counter counts every write-back of data from the Level 2 data or unified cache that occurs as a 
result of an operation by this PE. It counts each write-back that causes data to be written from the 
Level 2 cache to outside the Level 1 and Level 2 caches. For example, the counter counts: 


° A write-back that causes data to be written to a Level 3 cache or memory.

° A write-back of a recently fetched cache line that has not been allocated to the Level 2 cache.

Each write-back is counted once, even if it requires multiple accesses to complete the write-back.

It is IMPLEMENTATION DEFINED whether the counter counts:

° A transfer of data from the Level 2 cache to outside the Level 1 and Level 2 caches made as a
result of a coherency request.

° Write-backs made as a result of cache maintenance instructions.

The counter does not count:

° The invalidation of a cache line without any write-back to a Level 3 cache or memory.

° Writes from the PE or Level 1 data or unified cache that write through the Level 2 cache to
outside the Level 1 and Level 2 caches.

° Transfers of data from the Level 2 cache to a Level 1 cache, to satisfy a Level 1 cache refill.


An Unattributable write-back event occurs when a requestor outside the PE makes a coherency 
request that results in write-back. If the cache is shared, then an Unattributable write-back event is 
not counted. If the cache is not shared, then the event is counted. 


It is IMPLEMENTATION DEFINED whether a write of a whole cache line that is not the result of the 
eviction of a line from the cache, is counted. For example, this applies when the PE determines 
streaming writes to memory and does not allocate lines to the cache, or by a DC ZVA operation. 


See also Attributability on page D5-1840. 


0x019, BUS_ACCESS, Attributable Bus access 


The counter counts memory-read or memory-write operations that access outside of the boundary 
of the PE and its closely-coupled caches. Where this boundary lies with respect to any implemented 
caches is IMPLEMENTATION DEFINED. 


The definition of a bus access is IMPLEMENTATION DEFINED but physically is a single beat rather 
than a burst. That is, for each bus cycle for which the bus is active. 


Bus accesses include refills of and write-backs from data, instruction, and unified caches. Whether 
bus accesses include operations that do use the bus but not explicitly transfer data is 
IMPLEMENTATION DEFINED. 


An Unattributable bus access occurs when a requestor outside the PE makes a request that results in 
a bus access, for example, a coherency request. If the bus is shared, then an Unattributable bus 
access is not counted. If the bus is not shared, then the event is counted. 


If the bus is shared, then only Attributable bus accesses are counted. If the bus is not shared, then 
all bus accesses are counted. 


Where an implementation has multiple buses at this boundary, this event counts the sum of accesses 
across all buses. 


If a bus supports multiple accesses per cycle, for example through multiple channels, the counter 
increments once for each channel that is active on a cycle, and so it might increment by more than 
one in any given cycle. 


The maximum increment in any given cycle is IMPLEMENTATION DEFINED. 






See also Attributability on page D5-1840. 
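The rule that BUS_ACCESS increments once for each channel that is active on a cycle can be illustrated with a small C model. This is a sketch only: it assumes a hypothetical per-cycle trace in which each byte is a bitmask of active channels, which is not a format the architecture defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in an 8-bit channel mask. */
static unsigned popcount8(uint8_t mask)
{
    unsigned n = 0;
    while (mask != 0) {
        n += mask & 1u;
        mask >>= 1;
    }
    return n;
}

/* Model of the BUS_ACCESS count for a trace of `cycles` bus cycles.
 * active_channels[i] is a hypothetical bitmask: bit k set means channel k
 * carried a beat on cycle i. Idle cycles (mask 0) add nothing. */
uint64_t bus_access_count(const uint8_t *active_channels, size_t cycles)
{
    uint64_t total = 0;
    for (size_t i = 0; i < cycles; i++) {
        total += popcount8(active_channels[i]);
    }
    return total;
}
```

With this model the count can exceed the number of bus cycles whenever more than one channel is active on the same cycle, which matches the statement that the counter might increment by more than one in a given cycle.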


0x01A, MEMORY_ERROR, Local memory error 


The counter counts every occurrence of a memory error signaled by a memory closely coupled to 
this PE. The definition of local memories is IMPLEMENTATION DEFINED but includes caches, 
tightly-coupled memories, and TLB arrays. 


Memory error refers to a physical error detected by the hardware, such as a parity or ECC error. It 
includes errors that are correctable and those that are not. It does not include errors as defined in the 
architecture, such as MMU faults. 

0x01B, INST_SPEC, Operation speculatively executed 


The counter counts instructions that are speculatively executed by the PE. This includes instructions 
that are subsequently not architecturally executed. As a result, this event counts a larger number of 
instructions than the number of instructions architecturally executed. The definition of speculatively 
executed is IMPLEMENTATION DEFINED. 

0x01D, BUS_CYCLES, Bus cycle 


The counter increments on every cycle of the external memory interface of the PE. 


— Note 


If the implementation clocks the external memory interface at the same rate as the processor 
hardware, the counter counts every cycle. 





0x01F, L1D_CACHE_ALLOCATE, Attributable Level 1 data cache allocation without refill


The counter increments on every Attributable write that writes an entire line into the Level 1 cache 
without fetching from outside the Level 1 cache, for example: 


° A write from a coalescing buffer of a full cache line. 
° A DC ZVA operation. 


See also Attributability on page D5-1840. 


0x020, L2D_CACHE_ALLOCATE, Attributable Level 2 data cache allocation without refill


The counter increments on every Attributable write that writes an entire line into the Level 2 cache
without fetching from outside the Level 1 or Level 2 caches, for example:


° A write-back from a Level 1 to Level 2 cache.
° A write from a coalescing buffer of a full cache line.
° A DC ZVA operation.


See also Attributability on page D5-1840. 


0x022, BR_MIS_PRED_RETIRED, Instruction architecturally executed, mispredicted branch 
The counter counts all instructions counted by BR_RETIRED that were not correctly predicted. 
If no program-flow prediction resources are implemented, this event counts all retired not-taken 
branches. 


0x023, STALL_FRONTEND, No operation issued due to the frontend 


The counter counts every cycle counted by the CPU_CYCLES event on which no operation was 
issued because there are no operations available to issue for this PE from the frontend. 


The division between frontend and backend is IMPLEMENTATION DEFINED. Frontend and backend 
events must count at the same point in the pipeline. 





— Note

° For a simplified pipeline model of Fetch > Decode > Issue > Execute > Retire,
ARM recommends that the events are counted when instructions are dispatched from Decode
to Issue.

° On a given cycle, both events might be counted if the backend is unable to accept any
operations and there are no operations available to issue from the frontend.


For more information, see Cycle event counting on multithreaded implementations on
page D5-1868.


0x024, STALL_BACKEND, No operation issued due to the backend


The counter counts every cycle counted by the CPU_CYCLES event on which no operation was
issued because either:

° The backend is unable to accept any of the operations available for issue for this PE.
° The backend is unable to accept any operations.

For example, the backend might be unable to accept operations because of a resource conflict or
non-availability.

The division between frontend and backend is IMPLEMENTATION DEFINED. Frontend and backend
events must count at the same point in the pipeline. See STALL_FRONTEND for more information.

For more information, see Cycle event counting on multithreaded implementations on
page D5-1868.
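Because a single cycle can be counted by both STALL_FRONTEND and STALL_BACKEND, the two counts cannot simply be added to obtain the number of stalled cycles. The sketch below, with hypothetical names, computes the range that the true number of stalled cycles must fall in. It assumes all three counts were sampled over the same interval, so that neither stall count exceeds the CPU_CYCLES count.

```c
#include <stdint.h>

/* Range for the number of cycles on which no operation was issued. */
struct stall_bounds {
    uint64_t min_stalled;   /* every stall cycle overlaps where possible */
    uint64_t max_stalled;   /* no frontend and backend stalls overlap */
};

struct stall_bounds stalled_cycle_bounds(uint64_t cpu_cycles,
                                         uint64_t stall_frontend,
                                         uint64_t stall_backend)
{
    struct stall_bounds b;
    uint64_t sum = stall_frontend + stall_backend;

    /* Lower bound: the smaller count lies entirely inside the larger one. */
    b.min_stalled = stall_frontend > stall_backend ? stall_frontend
                                                   : stall_backend;

    /* Upper bound: disjoint stalls, clamped to the cycle count. The first
     * test catches unsigned wrap-around of the addition. */
    if (sum < stall_frontend || sum > cpu_cycles) {
        sum = cpu_cycles;
    }
    b.max_stalled = sum;
    return b;
}
```

For example, 250 frontend stall cycles and 500 backend stall cycles in 1000 cycles mean that between 500 and 750 cycles issued no operation.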

0x025, L1D_TLB, Attributable Level 1 data or unified TLB access


The counter counts each Attributable memory-read operation or Attributable memory-write
operation that causes a TLB access to at least the Level 1 data or unified TLB. Each access to a TLB
record is counted including the multiple accesses of instructions, such as LDM or STM.


The counter does not count TLB maintenance instructions. 


See also Attributability on page D5-1840. 


0x026, L1I_TLB, Attributable Level 1 instruction TLB access


The counter counts each Attributable instruction memory access that causes a TLB access to at least 
the Level 1 instruction or unified TLB. 


The counter does not count TLB maintenance instructions. 


See also Attributability on page D5-1840. 


0x027, L2I_CACHE, Attributable Level 2 instruction cache access


The counter counts Attributable instruction memory accesses that access at least the Level 2 
instruction or unified cache. Each Attributable access to other Level 2 instruction memory 
structures, such as refill buffers, is also counted. 


See also Attributability on page D5-1840. 


0x028, L2I_CACHE_REFILL, Attributable Level 2 instruction cache refill


The counter counts Attributable instruction memory accesses that cause a refill of at least the Level 
2 instruction or unified cache. This includes each Attributable memory access that causes a refill 
from outside the cache. It excludes accesses that do not cause a new cache refill but are satisfied 
from refilling data of a previous miss. 


A refill includes any Attributable access that causes data to be fetched from outside the cache, even 
if the data is ultimately not allocated into the cache. For example, data might be fetched into a buffer 
but then discarded, rather than being allocated into a cache. These buffers are treated as part of the 
cache. 


The counter does not count cache maintenance instructions. 


See also:

° Attributability on page D5-1840.

° Meaningful ratios between common microarchitectural events on page D5-1869.


0x029, L3D_CACHE_ALLOCATE, Attributable Level 3 data cache allocation without refill


The counter increments on every Attributable write that writes an entire line into the Level 3 cache
without fetching from outside the Level 1, Level 2, or Level 3 cache, for example:

° A write-back from a Level 2 to Level 3 cache.
° A write from a coalescing buffer of a full cache line.
° A DC ZVA operation.


See also Attributability on page D5-1840. 


0x02A, L3D_CACHE_REFILL, Attributable Level 3 data cache refill 


The counter counts each Attributable memory-read operation or Attributable memory-write 
operation that the PE makes that accesses at least the Level 3 data or unified cache and causes a refill 
of a Level 1 or Level 2 instruction, data, or unified cache or of the Level 3 data or unified cache.
Each read from or write to the cache that causes a refill from outside the Level 1, Level 2, and Level 
3 caches is counted. 


A refill includes any access that causes data to be fetched from outside the cache, even if the data is 
ultimately not allocated into the cache. For example, data might be fetched into a buffer but then 
discarded, rather than being allocated into a cache. These buffers are treated as part of the cache. 


The counter does not count as events on this PE: 


° Accesses that do not cause a new cache refill but are satisfied from refilling data of a previous 
miss. 
° Accesses to the Level 3 cache that generate a memory access but not a new linefill, such as
write-through writes that hit in the Level 3 cache.

° Accesses to the Level 3 cache that are part of a Level 1 or Level 2 cache refill or write-back
that hit in the Level 3 cache so do not cause a refill from outside of the Level 1, Level 2, and
Level 3 caches.

° Operations made by other PEs that share this cache.
° Cache maintenance instructions.
° A write that writes an entire line to the cache and does not fetch any data from outside the
Level 1, Level 2, and Level 3 caches, for example:
— A write-back from a Level 2 to Level 3 cache.
— A write of a full cache line from a coalescing buffer.
— A DC ZVA operation.
° A write that misses in the cache, and writes through the cache without allocating a line.

See also:

° Attributability on page D5-1840.
° Meaningful ratios between common microarchitectural events on page D5-1869.


0x02B, L3D_CACHE, Attributable Level 3 data cache access


The counter counts Attributable memory-read or Attributable memory-write operations, that the PE 
made, that access at least the Level 3 data or unified cache. Each access to a cache line is counted 
including refills of and write-backs from the Level 1 or Level 2 data, instruction, or unified caches. 
Each access to other Level 3 data or unified memory structures, such as refill buffers, write buffers, 
and write-back buffers, is also counted. 

The counter does not count: 

° Operations made by other PEs that share the Level 3 data or unified cache. 


° Cache maintenance instructions. 


See also Attributability on page D5-1840. 


0x02C, L3D_CACHE_WB, Attributable Level 3 data cache write-back


The counter counts every write-back of data from the Level 3 data or unified cache that occurs as a 
result of an operation by this PE. It counts each write-back that causes data to be written from the 
Level 3 cache to outside of the Level 1, Level 2, and Level 3 caches. For example, the counter 
counts the following cases: 


° A write-back that causes data to be written to a Level 4 cache, or to memory.

° A write-back of a recently fetched cache line that has not been allocated to the Level 3 cache.

Each write-back is counted once, even if multiple accesses are required to complete the write-back.

It is IMPLEMENTATION DEFINED whether the counter counts:


° A transfer of data from the Level 3 cache to outside the Level 1, Level 2, and Level 3 caches 
made as a result of a coherency request. 


° A write-back made as a result of a Cache maintenance instruction. 


The counter does not count: 


° The invalidation of a cache line without any write-back to a Level 4 cache or memory. 

° Writes from the PE, Level 1, or Level 2 data or unified cache, that write through the Level 3 
cache to outside of the Level 3 cache. 

° Transfers of data from the Level 3 cache to a Level 1 or Level 2 cache, to satisfy a Level 1 or 
Level 2 cache refill. 


An Unattributable write-back event occurs when a requestor outside the PE makes a coherency 
request that results in write-back. If the cache is shared, then Unattributable write-back events are 
not counted. If the cache is not shared, then Unattributable write-back events are counted. 


It is IMPLEMENTATION DEFINED whether a write of a whole cache line that is not the result of the 
eviction of a line from the cache, is counted. For example, this applies when the PE determines 
streaming writes to memory and does not allocate lines to the cache, or by a DC ZVA operation. 


See also Attributability on page D5-1840. 


0x02D, L2D_TLB_REFILL, Attributable Level 2 data TLB refill 


The counter counts each Attributable memory-read operation or Attributable memory-write 
operation that causes a TLB refill of at least the Level 2 data or unified TLB. It counts each 
Attributable read or Attributable write that causes a refill, in the form of a translation table walk or 
an access to another level of TLB caching. It is IMPLEMENTATION DEFINED whether the count 
increments when: 

° A refill results in a Translation fault. 


° A refill is not allocated in the TLB. 


The counter does not count: 


° A TLB miss that does not cause a refill but does generate a translation table walk. 
° TLB maintenance instructions.

See also: 

° Attributability on page D5-1840. 

° Meaningful ratios between common microarchitectural events on page D5-1869. 


0x02E, L2I_TLB_REFILL, Attributable Level 2 instruction TLB refill 


The counter counts Attributable instruction memory accesses that cause a TLB refill of at least the 
Level 2 instruction TLB. This includes each Attributable instruction memory access that causes an 
access to a level of memory system due to a translation table walk or an access to another level of 
TLB caching. It is IMPLEMENTATION DEFINED whether the count increments when: 

° A refill results in a Translation fault. 


° A refill is not allocated in the TLB. 


The counter does not count: 





° A TLB miss that does not cause a refill but does generate a translation table walk. 
° TLB maintenance instructions. 

See also: 

° Attributability on page D5-1840. 


° Meaningful ratios between common microarchitectural events on page D5-1869. 


0x02F, L2D_TLB, Attributable Level 2 data or unified TLB access 


The counter counts each Attributable memory read operation or Attributable memory write 
operation that causes a TLB access to at least the Level 2 data or unified TLB. Each access to a TLB 
record is counted, including the multiple accesses of instructions such as LDM or STM. 


The counter does not count TLB maintenance instructions. 


See also Attributability on page D5-1840. 


0x030, L2I_TLB, Attributable Level 2 instruction TLB access 


The counter counts each Attributable memory read operation or Attributable memory write 
operation that causes a TLB access to at least the Level 2 instruction or unified TLB. 


The counter does not count TLB maintenance instructions. 


See also Attributability on page D5-1840. 











D5.10.6 Cycle event counting on multithreaded implementations 

For most events, the event is only counted when it is attributable to the counting PE or thread, see Attributability on 
page D5-1840. 

Multithreaded implementations can have various forms, some examples of these are: 

° Simultaneous Multithreading (SMT), where every PE thread is active on every cycle. 

° Fine-grained Multithreading (FGMT), also known as a Barrel processor, where one PE thread is active on 
each cycle, and this changes regularly. 

° Switch on Event Multithreading (SoEMT), also known as Coarse-grained Multithreading (CGMT), where 
high latency events cause the processor to switch the active PE thread. 

In the above examples, active means that the PE might execute the instructions. A PE can be active but not executing 
instructions when no instruction is available or because of limited execution resources. 

When the PMEVTYPER<n>_EL0.MT bit is set to 0, the CPU_CYCLES event only counts cycles on which the 
thread was active. For the example multithreaded implementations, this means that: 

° For an SMT implementation, the CPU_CYCLES event counts every cycle. 

° For a particular FGMT implementation, that alternates between two threads on each cycle, the 
CPU_CYCLES event counts every other cycle. 

° For a particular SoEMT implementation, that is waiting for a long latency operation, the CPU_CYCLES 
event does not count cycles, as the PE thread is not active. 

If the PMEVTYPER<n>_EL0.MT bit is set to 1, the processor counts each cycle, and can only count a maximum 
of one cycle each cycle. 
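
The CPU_CYCLES behavior for the example multithreaded implementations can be illustrated with a small model. The
following Python sketch is illustrative only. The function names, and the representation of a schedule as a list
holding the set of active thread identifiers for each cycle, are assumptions made for this example and are not
defined by the architecture.

```python
def cpu_cycles_mt0(schedule, thread):
    """Cycles counted for `thread` when MT == 0: only cycles on which it was active."""
    return sum(1 for active in schedule if thread in active)


def cpu_cycles_mt1(schedule):
    """Cycles counted when MT == 1: every cycle, with at most one count per cycle."""
    return len(schedule)


# Hypothetical eight-cycle schedules for a PE with two threads, numbered 0 and 1.
SMT = [{0, 1}] * 8                         # both threads active on every cycle
FGMT = [{cycle % 2} for cycle in range(8)]  # the active thread alternates each cycle
SOEMT_WAITING = [{1}] * 8                  # thread 0 waits on a long latency operation
```

With these schedules, thread 0 counts every cycle for SMT, every other cycle for FGMT, and no cycles for the waiting
SoEMT case, while the MT == 1 count is the same for all three schedules.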

In addition, the STALL_FRONTEND and STALL_BACKEND events only count cycles that are counted by the 
CPU_CYCLES event, and so have the same limitation. For example, in an SMT implementation, if a PE thread 
cannot issue an instruction because of contention with other PE threads, these are counted as STALL_BACKEND 
cycles. 

If the PMEVTYPER<n>_EL0.MT bit is set to 1, the PE only counts cycles on which no operation is issued from 
any thread. 

Note 
The PMCCNTR register counts every processor cycle. 
Meaningful ratios between common microarchitectural events 


The architecture highlights some meaningful ratios that can be derived from the common microarchitectural events. 
Table D5-8 lists the highlighted ratios. 


Table D5-8 REFILL events and associated access events 

Numerator                 Denominator         Ratio 

0x001 L1I_CACHE_REFILL    0x014 L1I_CACHE     Attributable Level 1 instruction cache refill rate 
0x002 L1I_TLB_REFILL      0x026 L1I_TLB       Attributable Level 1 instruction TLB refill rate 
0x003 L1D_CACHE_REFILL    0x004 L1D_CACHE     Attributable Level 1 data or unified cache refill rate 
0x005 L1D_TLB_REFILL      0x025 L1D_TLB       Attributable Level 1 data or unified TLB refill rate 
0x017 L2D_CACHE_REFILL    0x016 L2D_CACHE     Attributable Level 2 data or unified cache refill rate 
0x028 L2I_CACHE_REFILL    0x027 L2I_CACHE     Attributable Level 2 instruction cache refill rate 
0x02A L3D_CACHE_REFILL    0x02B L3D_CACHE     Attributable Level 3 data or unified cache refill rate 
0x02D L2D_TLB_REFILL      0x02F L2D_TLB       Attributable Level 2 data or unified TLB refill rate 
0x02E L2I_TLB_REFILL      0x030 L2I_TLB       Attributable Level 2 instruction TLB refill rate 
0x019 BUS_ACCESS          0x01D BUS_CYCLES    Bus accesses per cycle 
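
As an illustration of how software might derive these ratios, the following Python sketch divides a numerator event
count by its denominator event count. The event numbers are transcribed from Table D5-8. The function name, the
dictionary of sampled counts, and the choice to return None when the denominator is zero or was not sampled are
assumptions made for this example.

```python
# (numerator event number, denominator event number), transcribed from Table D5-8.
RATIOS = {
    "L1I cache refill rate": (0x001, 0x014),
    "L1I TLB refill rate": (0x002, 0x026),
    "L1D cache refill rate": (0x003, 0x004),
    "L1D TLB refill rate": (0x005, 0x025),
    "L2D cache refill rate": (0x017, 0x016),
    "L2I cache refill rate": (0x028, 0x027),
    "L3D cache refill rate": (0x02A, 0x02B),
    "L2D TLB refill rate": (0x02D, 0x02F),
    "L2I TLB refill rate": (0x02E, 0x030),
    "Bus accesses per cycle": (0x019, 0x01D),
}


def event_ratio(counts, name):
    """Return the named ratio from sampled counts, or None if it cannot be computed.

    `counts` maps an event number to the value read from the counter for that event.
    """
    numerator, denominator = RATIOS[name]
    if numerator not in counts or not counts.get(denominator):
        return None
    return counts[numerator] / counts[denominator]
```

For example, a hypothetical sample of 50 L1D_CACHE_REFILL events against 1000 L1D_CACHE events gives a refill rate
of 0.05.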





Required events 


PMUv3 requires that an implementation includes the following common events: 

° 0x000, SW_INCR, Instruction architecturally executed, condition code check pass, software increment. 

° 0x003, L1D_CACHE_REFILL, Attributable Level 1 data cache refill. 

Note 
Event 0x003 is only required if the implementation includes a Level 1 data or unified cache. 

° 0x004, L1D_CACHE, Attributable Level 1 data cache access. 

Note 
Event 0x004 is only required if the implementation includes a Level 1 data or unified cache. 

° 0x010, BR_MIS_PRED, Mispredicted or not predicted branch speculatively executed. 

Note 
Event 0x010 is only required if the implementation includes program-flow prediction. However, ARM 
recommends that the event is implemented as described in Common microarchitectural event numbers on 
page D5-1859. 

° 0x011, CPU_CYCLES, Cycle. 

° 0x012, BR_PRED, Predictable branch speculatively executed. 

Note 
Event 0x012 is only required if the implementation includes program-flow prediction. However, ARM 
recommends that the event is implemented as described in Common microarchitectural event numbers on 
page D5-1859. 

° At least one of: 
— 0x008, INST_RETIRED, Instruction architecturally executed. 
— 0x01B, INST_SPEC, Operation speculatively executed. 

Note 
ARM strongly recommends that event 0x008 is implemented. 





D5.10.9 IMPLEMENTATION DEFINED event numbers 


For IMPLEMENTATION DEFINED event numbers, each counter is defined, independently, to either: 
° Increment only once for each event. 


° Count the duration for which an event occurs. 


ARM recommends that implementers establish a standardized numbering scheme for their IMPLEMENTATION 
DEFINED events, with common definitions, and common count numbers, applied to all of their implementations. In 
general, the recommended approach is for standardization across implementations with common features. However, 
ARM recognizes that attempting to standardize the encoding of microarchitectural features across too wide a range 
of implementations is not productive. 


ARM strongly recommends that at least the following classes of event are identified in the IMPLEMENTATION 
DEFINED events: 


° Cumulative duration of stalls resulting from the holes in the instruction availability, separating out counts for 
key buffering points that might exist. 

° Cumulative duration of data-dependent stalls, separating out counts for key dependency classes that might 
exist. 

° Cumulative duration of stalls due to unavailability of execution resources, including, for example, write 
buffers, separating out counts for key resources that might exist. 

° Missed superscalar issue opportunities, if relevant, separating out counts for key classes of issue that might 
exist. 

° Miss rates for different levels of caches and TLBs. 

° Any external events passed to the PE through an IMPLEMENTATION DEFINED mechanism. 

° Cumulative duration of a PSTATE.{A, I, F} interrupt mask set to 1. 

° Cumulative occupancy for resource queues, such as data access queues, and entry/exit counts, so that 
average latencies can be determined, separating out counts for key resources that might exist. 
An implementation might also provide registers in the IMPLEMENTATION DEFINED space to further 
extend such counts, for example by specifying a minimum latency for an event to be counted. 

° Any other microarchitectural features that the implementer considers are valuable to count. 
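
One way that software might use a cumulative occupancy count together with an entry count is to estimate the
average latency of a resource queue. The following Python sketch is a minimal illustration, assuming that the
occupancy counter increments on each cycle by the number of entries currently in the queue. The function and
parameter names are hypothetical, and the events themselves are IMPLEMENTATION DEFINED.

```python
def average_latency(cumulative_occupancy, entries):
    """Estimate the mean number of cycles that an entry spends in a queue.

    cumulative_occupancy: sum over all cycles of the number of entries in the queue.
    entries: number of entries that were added to the queue in the same interval.
    Returns None when no entries were recorded.
    """
    if entries == 0:
        return None
    return cumulative_occupancy / entries
```

For example, a cumulative occupancy of 1200 entry-cycles over 300 entries suggests an average latency of 4 cycles.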


The IMPLEMENTATION DEFINED event numbers are 0x040 to 0x3FF. Appendix K3 Recommendations for Performance 
Monitors Event Numbers for IMPLEMENTATION DEFINED Events lists the ARM recommended standardized 
numbering scheme for these events. 
D5.11 Performance Monitors Extension registers 


The following section describes the Performance Monitors Extension registers. 


The following subsections give general information about the Performance Monitors Extension registers, that apply 
for both Execution states: 


° Relationship between AArch32 and AArch64 Performance Monitors registers. 


° Access permissions. 


Performance Monitors Extension registers, functional group on page G4-4205 summarizes the Performance 
Monitors Extension registers in AArch32 state. 


Instructions for accessing non-debug System registers on page C5-281 summarizes the Performance Monitors 
Extension registers in AArch64 state. 


D5.11.1 Relationship between AArch32 and AArch64 Performance Monitors registers 


Table K12-2 on page K12-5662 lists the Performance Monitors register names for AArch32 and AArch64 states. 


D5.11.2 Access permissions 


Each Exception level is able to control Performance Monitors System register accesses at lower Exception levels. 
The access control flow is: 

1. At EL0: 
° Writes to PMUSERENR are UNDEFINED. 
° Reads and writes of PMINTENSET and PMINTENCLR are UNDEFINED. 
° If PMUSERENR.EN == 0: 
— If PMUSERENR.SW == 0 then writes to PMSWINC are trapped to EL1. 
— If PMUSERENR.CR == 0 then reads of PMCCNTR are trapped to EL1. 
— If PMUSERENR.ER == 0 then reads of PMEVCNTR<n> and PMXEVCNTR, and reads and 
writes of PMSELR, are trapped to EL1. 
— Otherwise, for all other Performance Monitors registers, other than reads of PMUSERENR, 
reads and writes are trapped to EL1. 

Note 
If HCR.TGE==1, then all exceptions that would be taken to EL1 are instead taken to EL2. 

2. Otherwise, at EL1 and EL0 in Non-secure state, if EL2 is implemented: 
° If HDCR.TPMCR == 1 then accesses to PMCR are trapped to EL2. 
° If HDCR.TPM == 1 then accesses to all Performance Monitors registers, including PMCR, are trapped 
to EL2. 

3. Otherwise, at EL2, EL1 and EL0, if EL3 is implemented and using AArch64, and if MDCR_EL3.TPM == 1 
then accesses to all Performance Monitors registers are trapped to EL3. 

Note 
These traps are not possible if EL3 is using AArch32. 

4. Otherwise, the access is permitted. 


Note 
These traps and enables only apply to System register accesses using System register access instructions. For 
accesses through the optional memory-mapped or external debug interfaces, see Access permissions for external 
views of the Performance Monitors on page I2-5137. 
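
The access control flow for System register accesses can be sketched as a decision function. The following Python
code is one possible reading of the numbered steps, not a definitive model. The function name, the parameter names,
and the returned strings are assumptions made for this example. The sketch ignores the effect of HCR.TGE, and it
treats an EL0 access made when PMUSERENR.EN is 0 as trapped to EL1 unless one of the SW, CR, or ER enables covers
that access.

```python
def pmu_sysreg_access(reg, is_write, el, *, non_secure=True,
                      en=0, sw=0, cr=0, er=0,
                      el2_implemented=True, tpmcr=0, tpm=0,
                      el3_aarch64=True, el3_tpm=0):
    """Return 'UNDEFINED', 'trap to EL1', 'trap to EL2', 'trap to EL3', or 'permitted'.

    el3_aarch64 means that EL3 is implemented and is using AArch64.
    """
    # Step 1: checks that apply only at EL0.
    if el == 0:
        if reg == "PMUSERENR" and is_write:
            return "UNDEFINED"
        if reg in ("PMINTENSET", "PMINTENCLR"):
            return "UNDEFINED"
        is_pmuserenr_read = reg == "PMUSERENR" and not is_write
        if en == 0 and not is_pmuserenr_read:
            covered = (
                (reg == "PMSWINC" and is_write and sw == 1)
                or (reg == "PMCCNTR" and not is_write and cr == 1)
                or (reg in ("PMEVCNTR<n>", "PMXEVCNTR") and not is_write and er == 1)
                or (reg == "PMSELR" and er == 1)
            )
            if not covered:
                return "trap to EL1"
    # Step 2: traps to EL2 from Non-secure EL1 and EL0.
    if el in (0, 1) and non_secure and el2_implemented:
        if tpm == 1 or (tpmcr == 1 and reg == "PMCR"):
            return "trap to EL2"
    # Step 3: traps to EL3 from EL2, EL1 and EL0.
    if el in (0, 1, 2) and el3_aarch64 and el3_tpm == 1:
        return "trap to EL3"
    # Step 4: no trap applies.
    return "permitted"
```

For example, with PMUSERENR.EN set to 0 and PMUSERENR.CR set to 1, this sketch permits an EL0 read of PMCCNTR but
traps an EL0 write of PMCCNTR to EL1.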
For details of the headings used in Table D5-9, see Configurable instruction enables and disables, and trap controls 
on page D1-1562. In addition, the following terms are used: 

Instruction This shows the access instruction, read (MRS), write (MSR), or both (-). In AArch32 state, the 
equivalent instructions are MRC and MCR. 

Default access 
If the Default access is - then the access is trapped from EL0 to EL1 unless the PMUSERENR 
enables are set to 1. 

Resultant access permission 
This indicates the resulting access permission provided the enables at EL0 are enabled and the traps 
to EL2 or EL3 are disabled. 

Table D5-9 shows the access permissions for System register accesses to the Performance Monitors registers. 


Table D5-9 Access permissions for the Performance Monitors System registers 

                            At EL0:                  Traps from below to:     Resultant 
                            Default    PMUSERENR                              access 
Register       Instruction  access     enables       EL2            EL3(a)    permission 

PMCR           -            -          EN            TPMCR or TPM   TPM       RW 
PMCNTENSET     -            -          EN            TPM            TPM       RW 
PMCNTENCLR     -            -          EN            TPM            TPM       RW 
PMOVSCLR       -            -          EN            TPM            TPM       RW 
PMSWINC        -            -          EN or SW      TPM            TPM       WO 
PMSELR         -            -          EN or ER      TPM            TPM       RW 
PMCEID0        -            -          EN            TPM            TPM       RO 
PMCEID1        -            -          EN            TPM            TPM       RO 
PMCCNTR        Read         -          EN or CR      TPM            TPM       RW 
               Write        -          EN 
PMXEVTYPER     -            -          EN            TPM            TPM       RW 
PMXEVCNTR      Read         -          EN or ER      TPM            TPM       RW 
               Write        -          EN 
PMUSERENR      Read         RO         -             TPM            TPM       RW 
               Write        UND 
PMINTENSET     -            UND        -             TPM            TPM       RW 
PMINTENCLR     -            UND        -             TPM            TPM       RW 
PMOVSSET       -            -          EN            TPM            TPM       RW 
PMEVCNTR<n>    Read         -          EN or ER      TPM            TPM       RW 
               Write        -          EN 
PMEVTYPER<n>   -            -          EN            TPM            TPM       RW 
PMCCFILTR      -            -          EN            TPM            TPM       RW 

a. Only if EL3 is using AArch64. 
Chapter D6 
The Generic Timer in AArch64 state 


This chapter describes the implementation of the ARM Generic Timer. It includes an overview of the AArch64 
System register interface to an ARM Generic Timer. 


It contains the following sections: 
° About the Generic Timer on page D6-1876. 
° The AArch64 view of the Generic Timer on page D6-1880. 


Chapter G5 The Generic Timer in AArch32 state describes the AArch32 view of the Generic Timer, and Chapter I1 
System Level Implementation of the Generic Timer describes the system level implementation of the Generic Timer. 
D6.1 About the Generic Timer 


Figure D6-1 shows an example system-on-chip that uses the Generic Timer as a system timer. In this figure: 






















































° This manual defines the architecture of the individual PEs in the multiprocessor blocks. 

° The ARM Generic Interrupt Controller Architecture Specification defines a possible architecture for the 
interrupt controllers. 

° Generic Timer functionality is distributed across multiple components. 
Figure D6-1 Generic Timer example 

The Generic Timer: 


° Provides a system counter, that measures the passing of time in real-time. 


Note 


The Generic Timer can also provide other components at a system level, but Figure D6-1 does not show any 
such components. 








° Supports virtual counters that measure the passing of virtual-time. That is, a virtual counter can measure the 
passing of time on a particular virtual machine. 


° Provides timers, that can trigger events after a period of time has passed. The timers: 
— Can be used as count-up or as count-down timers. 


— Can operate in real-time or in virtual-time. 


This chapter describes an instance of the Generic Timer component that Figure D6-1 shows as Timer_0 or Timer_1 
within the Multiprocessor A or Multiprocessor B block. This component can be accessed from AArch64 state or 
AArch32 state, and this chapter describes access from AArch64 state. Chapter G5 The Generic Timer in AArch32 
state describes access to this component from AArch32 state. 


A Generic Timer implementation must also include a memory-mapped system component. This component: 
° Must provide the System counter shown in Figure D6-1. 


° Optionally, can provide timer components for use at a system level. 


Chapter I1 System Level Implementation of the Generic Timer describes this memory-mapped component. 
D6.1.1 The full set of Generic Timer components 

Within a system that might include multiple PEs, a full set of Generic Timer components is as follows: 


The system counter 


This provides a uniform view of system time, see The system counter on page D6-1878. Because 
this must be implemented at the system level, it is accessed through The system level 
memory-mapped implementation of the Generic Timer. However, during initialization, a status 
register in each implemented timer in the system must be programmed with the frequency of the 
system counter, so that software can read this frequency. 


PE implementations of the Generic Timer 
Each PE implementation of the Generic Timer provides the following components: 
° A physical counter, that gives access to the count value of the system counter. 


° A virtual counter, that gives access to virtual time. In AArch64 state, the CNTVOFF_EL2 
register defines the offset between physical time, as defined by the value of the system 
counter, and virtual time. 


° A number of timers. In an implementation where all Exception levels are implemented and 
can use AArch64 state, the timers that are accessible from AArch64 state are: 
— An EL1 physical timer. 
— An EL2 physical timer. 
— An EL3 physical timer. 
— A virtual timer. 


The AArch64 view of the Generic Timer on page D6-1880 describes these components. 


The system level memory-mapped implementation of the Generic Timer 


The memory-mapped registers that control the components of the system level implementation of 
the Generic Timer are grouped into frames. The Generic Timer architecture defines the offset of 
each register within its frame, but the base address of each frame is IMPLEMENTATION DEFINED, and 
defined by the system. 


Each system level component has one or two register frames. The possible system level components 
are: 
The memory-mapped counter module, required 

This module controls the system counter. It has two frames: 

° A control frame, CNTControlBase. 

° A status frame, CNTReadBase. 


The memory-mapped timer control module, required 


The system level implementation of the Generic Timer can provide up to eight timers, 
and the memory-mapped timer control module identifies: 


° Which timers are implemented. 

° The features of each implemented timer. 

This module has a single frame, CNTCTLBase. 

Memory-mapped timers, optional 

An implemented memory-mapped timer: 

° Must provide a privileged view of the timer, in the CNTBaseN frame. 

° Optionally, provides an unprivileged view of the timer in the CNTEL0BaseN 
frame. 

N is the timer number, and the corresponding frame number, in the range 0-7. 


Chapter I1 System Level Implementation of the Generic Timer describes these components. 
D6.1.2 The system counter 


The Generic Timer provides a system counter with the following specification: 
Width At least 56 bits wide. 

The value returned by any 64-bit read of the counter is zero-extended to 64 bits. 
Frequency Increments at a fixed frequency, typically in the range 1-50MHz. 


Can support one or more alternative operating modes in which it increments by larger amounts at a 
lower frequency, typically for power-saving. 


Roll-over Roll-over time of not less than 40 years. 


Accuracy ARM does not specify a required accuracy, but recommends that the counter does not gain or lose 
more than ten seconds in a 24-hour period. 


Use of lower-frequency modes must not affect the implemented accuracy. 


Start-up Starts operating from zero. 


The system counter, once configured and running, must provide a uniform view of system time. More precisely, it 
must be impossible for the following sequence of events to show system time going backwards: 


1. Device A reads the time from the system counter. 
2. Device A communicates with another agent in the system, Device B. 
3. After recognizing the communication from Device A, Device B reads the time from the system counter. 

The system counter must be implemented in an always-on power domain. 


To support lower-power operating modes, the counter can increment by larger amounts at a lower frequency. For 
example, a 10MHz system counter might either increment: 

° By 1 at 10MHz. 


° By 500 at 20kHz, when the system lowers the clock frequency, to reduce power consumption. 


In this case, the counter must support transitions between high-frequency, high-precision operation, and 
lower-frequency, lower-precision operation, without any impact on the required accuracy of the counter. 
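
The arithmetic behind these figures can be checked with a short calculation. The following Python sketch is
illustrative only, and the function names are assumptions made for this example. It shows that incrementing by 1 at
10MHz and by 500 at 20kHz advance the count at the same effective rate, and it estimates the roll-over time of a
56-bit counter at the upper end of the typical frequency range.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def effective_rate_hz(increment, update_frequency_hz):
    """Counter units added per second for a given increment size and update frequency."""
    return increment * update_frequency_hz


def rollover_years(width_bits, effective_rate):
    """Approximate time, in years, for a counter of the given width to wrap from zero."""
    return (2 ** width_bits) / effective_rate / SECONDS_PER_YEAR


high_precision = effective_rate_hz(1, 10_000_000)   # by 1 at 10MHz
low_power = effective_rate_hz(500, 20_000)          # by 500 at 20kHz
```

Both modes give 10,000,000 counter units per second. A 56-bit counter advancing at 50MHz takes between 45 and 46
years to roll over, which satisfies the requirement of not less than 40 years.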


The CNTFRQ_EL0 register is intended to hold a copy of the current clock frequency to allow fast reference to this 
frequency by software running on the PE. For more information see Initializing and reading the system counter 
frequency. 


The mechanism by which the count from the system counter is distributed to system components is 
IMPLEMENTATION DEFINED, but each PE with a System register interface to the system counter must have a counter 
input that can capture each increment of the counter. 


Note 


So that the system counter can be clocked independently from the PE hardware, the count value might be distributed 
using a Gray code sequence. Gray-count scheme for timer distribution scheme on page I1-5134 gives more 
information about this possibility. 
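
As background, the following Python sketch shows the standard reflected binary Gray code, in which consecutive
values differ in exactly one bit. This property is what makes a Gray code suitable for sampling a count across
clock domains. The sketch is a general illustration with hypothetical function names, and it does not claim to
reproduce the details of the distribution scheme that the referenced section describes.

```python
def binary_to_gray(value):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Convert a reflected binary Gray code back to the integer it encodes."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value
```

For any count n, binary_to_gray(n) and binary_to_gray(n + 1) differ in a single bit position, so a receiver that
samples the value during a transition sees either the old count or the new count.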








Initializing and reading the system counter frequency 


The CNTFRQ_EL0 register must be programmed to the clock frequency of the system counter. Typically, this is 
done only during the system boot process, by using the System register interface to write the system counter 
frequency to the CNTFRQ_EL0 register. Only software executing at the highest implemented Exception level can 
write to CNTFRQ_EL0. 





Note 


The CNTFRQ_EL0 register is UNKNOWN at reset, and therefore the counter frequency must be set as part of the 
system boot process. 





Software can read the CNTFRQ_EL0 register, to determine the current system counter frequency, in the following 
states: 


° EL2. 







° Secure and Non-secure EL1. 

° When CNTKCTL_EL1.EL0PCTEN is set to 1, Secure and Non-secure EL0. 


Memory-mapped controls of the system counter 


Some system counter controls are accessible only through the memory-mapped interface to the system counter. 
These controls are: 


° Enabling and disabling the counter. 

° Setting the counter value. 

° Changing the operating mode, to change the update frequency and increment value. 

° Enabling Halt-on-debug, that a debugger can then use to suspend counting. 


For descriptions of these controls, see Chapter I1 System Level Implementation of the Generic Timer. 
D6.2 The AArch64 view of the Generic Timer 


The following sections describe the components and features of a PE implementation of the Generic Timer, as seen 
from AArch64 state: 


° The physical counter. 

° The virtual counter on page D6-1881. 
° Event streams on page D6-1882. 

° Timers on page D6-1883. 


D6.2.1 The physical counter 


The PE includes a physical counter that contains the count value of the system counter. The CNTPCT_EL0 register 
holds the current physical counter value. 


Accessing the physical counter 

Software with sufficient privilege can read CNTPCT_EL0 using a 64-bit System register read. 
CNTPCT_EL0: 

° Is always accessible from EL3, Non-secure EL2 and Secure EL1 states. 

° Is accessible from Non-secure EL1 only when CNTHCTL_EL2.EL1PCTEN is set to 1. When the value of 
CNTHCTL_EL2.EL1PCTEN is 0, any attempt to access CNTPCT_EL0 from Non-secure EL1 is trapped to 
EL2. 

° Is accessible from Secure EL0 state when the value of CNTKCTL_EL1.EL0PCTEN is 1. When the value of 
CNTKCTL_EL1.EL0PCTEN is 0, any attempt to access CNTPCT_EL0 is UNDEFINED. 

° Is accessible from Non-secure EL0 when the value of CNTHCTL_EL2.EL1PCTEN is 1 and the value of 
CNTKCTL_EL1.EL0PCTEN is 1. Otherwise: 

— When the value of CNTKCTL_EL1.EL0PCTEN is 0, any attempt to access CNTPCT_EL0 from 
Non-secure EL0 generates an UNDEFINED exception. 

— When the value of CNTKCTL_EL1.EL0PCTEN is 1 and the value of CNTHCTL_EL2.EL1PCTEN 
is 0, any attempt to access CNTPCT_EL0 from Non-secure EL0 is trapped to EL2. 
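
The access rules for the physical counter can be summarized as a decision function. The following Python sketch is
a minimal illustration that assumes an implementation that includes EL2 and EL3. The function name, the parameter
names, and the returned strings are assumptions made for this example.

```python
def cntpct_access(el, secure, el1pcten, el0pcten):
    """Return the outcome of a read of the physical counter.

    el:        current Exception level, 0 to 3.
    secure:    True when executing in Secure state.
    el1pcten:  value of CNTHCTL_EL2.EL1PCTEN.
    el0pcten:  value of CNTKCTL_EL1.EL0PCTEN.
    """
    if el == 3 or (el == 2 and not secure) or (el == 1 and secure):
        return "accessible"
    if el == 1:
        # Non-secure EL1: EL2 controls the access.
        return "accessible" if el1pcten == 1 else "trap to EL2"
    if el == 0:
        # The EL1 control is checked first, in both Security states.
        if el0pcten == 0:
            return "UNDEFINED"
        if secure:
            return "accessible"
        return "accessible" if el1pcten == 1 else "trap to EL2"
    raise ValueError("unsupported Exception level and Security state")
```

For example, an access from Non-secure EL0 with EL0PCTEN set to 1 and EL1PCTEN set to 0 is trapped to EL2, while the
same access with EL0PCTEN set to 0 is UNDEFINED regardless of the value of EL1PCTEN.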


Reads of CNTPCT_EL0 can occur speculatively and out of order relative to other instructions executed on the same 
PE. 

For example, if a read from memory is used to obtain a signal from another agent that indicates that CNTPCT_EL0 
must be read, an ISB is used to ensure that the read of CNTPCT_EL0 occurs after the signal has been read from 
memory, as shown in the following code sequence: 


loop                  ; polling for some communication to indicate a requirement to read the timer 
    LDR X1, [X2] 
    CMP X1, #1 
    B.NE loop 
    ISB               ; without this, CNTPCT_EL0 could be read before the memory location in [X2] 
                      ; has had the value 1 written to it 
    MRS X1, CNTPCT_EL0 
D6.2.2 The virtual counter 

An implementation of the Generic Timer always includes a virtual counter, that indicates virtual time. 

The virtual counter contains the value of the physical counter minus a 64-bit virtual offset. When executing at 
Non-secure EL1 or EL0, the virtual offset value relates to the current virtual machine. 

The CNTVOFF_EL2 register contains the virtual offset. CNTVOFF_EL2 is only accessible from EL2 and EL3. 
For more information see Status of the CNTVOFF register. 

The CNTVCT_EL0 register holds the current virtual counter value. 
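
The relationship between the physical count, the virtual offset, and the virtual count can be written as a single
subtraction. The following Python sketch is illustrative only. The function name is hypothetical, and the
assumption that the subtraction wraps modulo 2^64 is made for this example so that the result always fits in a
64-bit register.

```python
MASK64 = (1 << 64) - 1


def virtual_count(physical_count, virtual_offset):
    """Return the virtual count as the physical count minus the virtual offset, as a 64-bit value."""
    return (physical_count - virtual_offset) & MASK64
```

For example, a physical count of 1000 with a virtual offset of 400 gives a virtual count of 600, and a virtual
offset of zero makes the virtual count equal to the physical count.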

Accessing the virtual counter 

Software with sufficient privilege can read CNTVCT_EL0 using a 64-bit System register read. 

CNTVCT_EL0 is always accessible from Secure EL3, from Secure EL1 when EL3 is using AArch64, and from 
Non-secure EL1 and EL2. 

In addition, when CNTKCTL_EL1.EL0VCTEN is set to 1, CNTVCT_EL0 is accessible from EL0. 
When CNTKCTL_EL1.EL0VCTEN is set to 0, any attempt to access CNTVCT_EL0 from EL0 is UNDEFINED. 

Reads of CNTVCT_EL0 can occur speculatively and out of order relative to other instructions executed on the same 
PE. 

For example, if a read from memory is used to obtain a signal from another agent that indicates that CNTVCT_EL0 
must be read, an ISB is used to ensure that the read of CNTVCT_EL0 occurs after the signal has been read from 
memory, as shown in the following code sequence: 

loop                  ; polling for some communication to indicate a requirement to read the timer 
    LDR X1, [X2] 
    CMP X1, #1 
    B.NE loop 
    ISB               ; without this, CNTVCT_EL0 could be read before the memory location in [X2] 
                      ; has had the value 1 written to it 
    MRS X1, CNTVCT_EL0 

Status of the CNTVOFF register 

All implementations of the Generic Timer include the virtual counter. Therefore, conceptually, all implementations 
include the CNTVOFF_EL2 register that defines the virtual offset between the physical count and the virtual count. 
CNTVOFF_EL2 is only accessible at EL2 or above. If EL2 is not implemented, the virtual counter uses a fixed 
virtual offset of zero. 
D6.2.3 Event streams 


An implementation that includes the Generic Timer can use the system counter to generate one or more event 
streams, to generate periodic wake-up events as part of the mechanism described in Wait for Event mechanism and 
Send event on page D1-1599. 





Note 
An event stream might be used: 
° To impose a time-out on a Wait For Event polling loop. 
° To safeguard against any programming error that means an expected event is not generated. 





An event stream is configured by: 


° Selecting which bit, from the bottom 16 bits of a counter, triggers the event. This determines the frequency 
of the events in the stream. 


° Selecting whether the event is generated on each 0 to 1 transition, or each 1 to 0 transition, of the selected 
counter bit. 


The CNTKCTL_EL1.{EVNTEN, EVNTDIR, EVNTI} fields define an event stream that is generated from the 
virtual counter. 


In all implementations the CNTHCTL_EL2.{EVNTEN, EVNTDIR, EVNTI} fields define an event stream that is 
generated from the physical counter. 


The operation of an event stream is as follows: 


° The pseudocode variables PreviousCNTVCT and PreviousCNTPCT are initialized as: 


// Variables used for generation of the timer event stream. 
bits(64) PreviousCNTVCT = bits(64) UNKNOWN; 
bits(64) PreviousCNTPCT = bits(64) UNKNOWN; 


° The pseudocode functions TestEventCNTV() and TestEventCNTP() are called on each cycle of the PE clock. 

° The TestEventCNTx() pseudocode template defines the functions TestEventCNTV() and TestEventCNTP(): 

// TestEventCNTx() 
// =============== 
// Template for the TestEventCNTV() and TestEventCNTP() functions 
// Describes operation when all Exception Levels are using AArch64: 
//   CNTxCT_EL0 is CNTVCT_EL0 or CNTPCT_EL0 64-bit count value 
//   CNTx_CTL_EL0 is CNTV_CTL_EL0 or CNTP_CTL_EL0 Control register 
//   PreviousCNTxCT_EL0 is PreviousCNTVCT_EL0 or PreviousCNTPCT_EL0 

TestEventCNTx() 
    if CNTx_CTL_EL0.EVNTEN == '1' then 
        n = UInt(CNTx_CTL_EL0.EVNTI); 
        SampleBit = CNTxCT_EL0<n>; 
        PreviousBit = PreviousCNTxCT_EL0<n>; 

        if CNTx_CTL_EL0.EVNTDIR == '0' then 
            if PreviousBit == '0' && SampleBit == '1' then EventRegisterSet(); 
        else 
            if PreviousBit == '1' && SampleBit == '0' then EventRegisterSet(); 
    PreviousCNTxCT_EL0 = CNTxCT_EL0; 

    return; 
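
The effect of the EVNTI and EVNTDIR fields on the event rate can be seen by running the same test over a sequence of
counter samples. The following Python sketch is a simplified model of the template, not a replacement for the
pseudocode. The function name and parameters are assumptions made for this example, and the sketch skips the first
sample because the previous count is initially UNKNOWN.

```python
def event_stream(samples, evnti, evntdir, evnten=1):
    """Return the sampled counter values at which the event register would be set.

    samples: iterable of counter values, one for each call of the test function.
    evnti:   index of the counter bit that triggers the event.
    evntdir: 0 selects the 0 to 1 transition, 1 selects the 1 to 0 transition.
    """
    events = []
    previous = None  # stands in for the UNKNOWN initial value
    for count in samples:
        if previous is not None and evnten == 1:
            sample_bit = (count >> evnti) & 1
            previous_bit = (previous >> evnti) & 1
            if evntdir == 0 and previous_bit == 0 and sample_bit == 1:
                events.append(count)
            elif evntdir == 1 and previous_bit == 1 and sample_bit == 0:
                events.append(count)
        previous = count
    return events
```

For a counter that increments by 1 from 0 to 31, selecting bit 2 with the 0 to 1 transition gives events at counts
4, 12, 20, and 28. This is one event every 2^(n+1) counts, where n is the selected bit.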
D6.2.4 Timers 


In an implementation of the Generic Timer that includes EL3, if EL3 can use AArch64, the following timers are 
implemented: 


° An EL1 physical timer, that: 

— In Secure state, can be accessed from EL1. 

— In Non-secure state, can be accessed from EL1 unless those accesses are trapped to EL2. 

When this timer can be accessed from EL1, an EL1 control determines whether it can be accessed from EL0. 

° A Non-secure EL2 physical timer. 

° A Secure EL3 physical timer. An EL3 control determines whether this timer is accessible from Secure EL1. 

° A virtual timer. 

The output of each implemented timer: 

° Provides an output signal to the system. 


° If the PE interfaces to a Generic Interrupt Controller (GIC), signals a Private Peripheral Interrupt (PPI) to 
that GIC. In a multiprocessor implementation, each PE must use the same interrupt number for each timer. 


Each timer is implemented as three registers: 


° A 64-bit CompareValue register, that provides a 64-bit unsigned upcounter. 
° A 32-bit TimerValue register, that provides a 32-bit signed downcounter. 
° A 32-bit Control register. 


Table D6-1 Timer registers summary for the Generic Timer 

Timer register          EL1 physical timer   EL2 physical timer   EL3 physical timer   Virtual timer 

CompareValue register   CNTP_CVAL_EL0        CNTHP_CVAL_EL2       CNTPS_CVAL_EL1       CNTV_CVAL_EL0 
TimerValue register     CNTP_TVAL_EL0        CNTHP_TVAL_EL2       CNTPS_TVAL_EL1       CNTV_TVAL_EL0 
Control register        CNTP_CTL_EL0         CNTHP_CTL_EL2        CNTPS_CTL_EL1        CNTV_CTL_EL0 





The following sections describe: 


° Accessing the timer registers. 
° Operation of the CompareValue views of the timers on page D6-1884. 
° Operation of the TimerValue views of the timers on page D6-1885. 


Accessing the timer registers 
For each timer, all timer registers have the same access permissions, as follows: 
EL1 physical timer Accessible from EL1, except that Non-secure software executing at EL2 controls access
from Non-secure EL1.

When access from EL1 is permitted, CNTKCTL_EL1.EL0PTEN determines whether the
registers are accessible from EL0. If an access is not permitted because
CNTKCTL_EL1.EL0PTEN is set to 0, an attempted access from EL0 is UNDEFINED.

In all implementations:
• The registers are accessible from EL3.
• In Non-secure state, the registers are accessible from EL2.
• CNTHCTL_EL2.EL1PCEN determines whether the Non-secure registers are
accessible from Non-secure EL1. If this bit is set to 1, to enable access from
Non-secure EL1, CNTKCTL_EL1.EL0PTEN determines whether the registers are
accessible from Non-secure EL0.

If an access is not permitted because CNTHCTL_EL2.EL1PCEN is set to 0, an
attempted access from Non-secure EL1 or EL0 is trapped to EL2. However, if
CNTKCTL_EL1.EL0PTEN is set to 0, this control takes priority, and an attempted
access from EL0 is UNDEFINED, and generates an exception that is taken to EL1.
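
The priority between the two controls on the EL1 physical timer can be sketched as a small decision function. This is only an illustrative model, assuming an implementation that includes EL2 and EL3; the enum, the function name, and the parameters are hypothetical and are not architectural pseudocode.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for an access to the EL1 physical timer registers. */
enum access_result {
    ACCESS_OK,          /* the access is permitted */
    ACCESS_UNDEFINED,   /* UNDEFINED, exception taken to EL1 */
    ACCESS_TRAP_TO_EL2  /* trapped to EL2 */
};

/* el:         Exception level making the access (0 to 3).
 * non_secure: true when the access is made in Non-secure state.
 * el1pcen:    value of CNTHCTL_EL2.EL1PCEN.
 * el0pten:    value of CNTKCTL_EL1.EL0PTEN. */
static enum access_result el1_phys_timer_access(int el, bool non_secure,
                                                bool el1pcen, bool el0pten)
{
    /* The EL0 control takes priority over the EL2 trap. */
    if (el == 0 && !el0pten)
        return ACCESS_UNDEFINED;

    /* Non-secure EL1 and EL0 accesses are trapped when EL1PCEN is 0. */
    if (el <= 1 && non_secure && !el1pcen)
        return ACCESS_TRAP_TO_EL2;

    /* EL2, EL3, Secure EL1, and any remaining permitted EL0 access. */
    return ACCESS_OK;
}
```

Checking the EL0 control first is what gives an EL0 access with both controls clear the UNDEFINED outcome rather than the trap to EL2.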


EL2 physical timer Accessible from EL2 and EL3. 


Note 


In AArch32 state, the EL2 physical timer is accessible from EL3 only when the value of 
SCR.NS is 1. In AArch64 state, the EL2 physical timer is accessible from EL3 regardless 
of the value of SCR_EL3.NS. 








EL3 physical timer Accessible from EL3. 


SCR_EL3.ST determines whether the timer is also accessible from Secure EL1. The effect of the ST bit is:
0 The EL3 physical timer is only accessible from EL3. Attempts to access this
timer from Secure EL1 generate an exception that is taken to EL3.
1 The EL3 physical timer is accessible from EL3 and from Secure EL1.
Note 





The EL3 physical timer registers are always UNDEFINED in Non-secure EL1, and any 
attempt to access them from Non-secure EL1 generates an exception that is taken to EL1. 





Virtual timer Accessible from Secure and Non-secure EL1, from EL2, and from EL3. 


CNTKCTL_EL1.EL0VTEN determines whether the registers are accessible from EL0. If
an access is not permitted because CNTKCTL_EL1.EL0VTEN is set to 0, an attempted
access from EL0 is UNDEFINED.


Operation of the CompareValue views of the timers 


The CompareValue view of a timer operates as a 64-bit upcounter. The timer condition is met when the appropriate
counter reaches the value programmed into its CompareValue register. When the timer condition is met, an interrupt
is generated if the interrupt is not masked in the corresponding timer control register, CNTP_CTL_EL0,
CNTHP_CTL_EL2, CNTPS_CTL_EL1, or CNTV_CTL_EL0. For CNTP_CTL_EL0, the asserted interrupt is the
same as the interrupt asserted by the Non-secure instance of the AArch32 register CNTP_CTL.


The operation of this view of a timer is: 


TimerConditionMet = (((Counter[63:0] - Offset[63:0])[63:0] - CompareValue[63:0]) >= 0)








Where: 
TimerConditionMet Is TRUE if the timer condition for this counter is met, and FALSE otherwise. 
Counter The physical counter value, that can be read from the CNTPCT_EL0 register.
Note
The virtual counter value, that can be read from the CNTVCT_EL0 register, is the value:
(Counter - Offset)
Offset For a physical timer it is zero, and for the virtual timer it is the virtual offset, held in the
CNTVOFF_EL2 register.
CompareValue The value of the appropriate CompareValue register, CNTP_CVAL_EL0,
CNTHP_CVAL_EL2, CNTPS_CVAL_EL1, or CNTV_CVAL_EL0.


In this view of a timer, Counter, Offset, and CompareValue are all 64-bit unsigned values.

Note

This means that a timer with a CompareValue of, or close to, 0xFFFF_FFFF_FFFF_FFFF might never meet its timer
condition. However, there is no practical requirement to use values close to the counter wrap value.
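
The CompareValue test can be modelled in C as follows. This is a sketch under one stated assumption: the final `>= 0` comparison is read as a comparison on unbounded integers, which for two unsigned 64-bit operands is an unsigned compare. That reading is the one the Note about CompareValue values close to the wrap value implies. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an architectural function: evaluates the
 * CompareValue view for one timer. For a physical timer pass offset = 0;
 * for the virtual timer pass the value held in CNTVOFF_EL2. */
static bool timer_condition_met(uint64_t counter, uint64_t offset,
                                uint64_t compare_value)
{
    /* (Counter - Offset)[63:0]: unsigned subtraction wraps modulo 2^64. */
    uint64_t count = counter - offset;

    /* Assumption: the difference is tested as an unbounded integer, so the
     * condition holds exactly when count is not below compare_value. */
    return count >= compare_value;
}
```

Under this reading, a CompareValue of 0xFFFF_FFFF_FFFF_FFFF is only matched when the count itself reaches that value, which illustrates why such values are impractical.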








Operation of the TimerValue views of the timers 


The TimerValue view of a timer operates as a signed 32-bit downcounter. A TimerValue register is programmed
with a count value. This value decrements on each increment of the appropriate counter, and the timer condition is
met when the value reaches zero. When the timer condition is met, an interrupt is generated if the interrupt is not
masked in the corresponding timer control register, CNTP_CTL_EL0, CNTHP_CTL_EL2, CNTPS_CTL_EL1, or
CNTV_CTL_EL0.


This view of a timer depends on the following behavior of accesses to TimerValue registers: 
Reads   TimerValue = (CompareValue - (Counter - Offset))[31:0]
Writes  CompareValue = ((Counter - Offset)[63:0] + SignExtend(TimerValue))[63:0]


Where the arguments have the definitions used in Operation of the CompareValue views of the timers on
page D6-1884, and in addition:


TimerValue The value of a TimerValue register, CNTP_TVAL_EL0, CNTHP_TVAL_EL2,
CNTPS_TVAL_EL1, or CNTV_TVAL_EL0.


In this view of a timer, all values are signed, in standard two’s complement form. 


A read of a TimerValue register after the timer condition has been met indicates the time since the timer condition 
was met. 
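
The read and write behavior of a TimerValue register can be sketched as a small software model. The structure and function names below are hypothetical; the arithmetic follows the Reads and Writes definitions given above.

```c
#include <stdint.h>

/* Hypothetical software model of one timer; field names are illustrative. */
struct timer_model {
    uint64_t offset;        /* 0 for a physical timer, CNTVOFF_EL2 for the virtual timer */
    uint64_t compare_value; /* contents of the CompareValue register */
};

/* Reinterprets a 32-bit pattern as a two's complement signed value without
 * relying on implementation-defined conversions. */
static int32_t to_signed32(uint32_t bits)
{
    if (bits & 0x80000000u)
        return (int32_t)(bits - 0x80000000u) + INT32_MIN;
    return (int32_t)bits;
}

/* Reads: TimerValue = (CompareValue - (Counter - Offset))[31:0] */
static int32_t tval_read(const struct timer_model *t, uint64_t counter)
{
    uint64_t diff = t->compare_value - (counter - t->offset);
    return to_signed32((uint32_t)diff);
}

/* Writes: CompareValue = ((Counter - Offset)[63:0] + SignExtend(TimerValue))[63:0] */
static void tval_write(struct timer_model *t, uint64_t counter, int32_t tval)
{
    /* int32_t to int64_t sign-extends; int64_t to uint64_t is modulo 2^64. */
    t->compare_value = (counter - t->offset) + (uint64_t)(int64_t)tval;
}
```

In this model a read taken after the timer condition has been met returns a negative value whose magnitude is the number of counts since the condition was met. A write of a negative value whose magnitude exceeds the current count wraps CompareValue to a very large unsigned value.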


Note 


• Operation of the CompareValue views of the timers on page D6-1884 gives a strict definition of
TimerConditionMet. However, provided that the TimerValue is not expected to wrap as a 32-bit signed value
when decremented from 0x80000000, the TimerValue view can be used as giving an effect equivalent to:

TimerConditionMet = (TimerValue <= 0)

• Programming TimerValue to a negative number with magnitude greater than (Counter - Offset) can lead to
an arithmetic overflow that causes the CompareValue to be an extremely large positive value. This potentially
delays meeting the timer condition for an extremely long period of time.

Chapter D7


AArch64 System Register Descriptions 


This chapter defines the AArch64 System registers. It contains the following sections: 


About the AArch64 System registers on page D7-1888. 
General system control registers on page D7-1895. 
Debug registers on page D7-2147. 

Performance Monitors registers on page D7-2215. 


Generic Timer registers on page D7-2255.

D7.1 About the AArch64 System registers


The following sections describe common features of the AArch64 registers: 
° Fixed values in the System register descriptions. 
° General behavior of accesses to the AArch64 System registers. 


. Principles of the ID scheme for fields in ID registers on page D7-1893. 


D7.1.1 Fixed values in the System register descriptions 


See Fixed values in AArch64 instruction and System register descriptions on page C2-137. This section defines how
the glossary terms RAZ, RES0, RAO, and RES1 can be represented in the System register descriptions.


System register width 


The register descriptions in this chapter describe each register as either a 32-bit register or a 64-bit register. For
registers described as 32-bit registers, the upper bits, bits[63:32], are RES0.


D7.1.2 General behavior of accesses to the AArch64 System registers 


The following subsections give general information about the behavior of accesses to the System registers: 
° Reset behavior of AArch64 System registers. 
° Synchronization requirements for AArch64 System registers on page D7-1889. 


Reset behavior of AArch64 System registers 
Reset values apply only to RW registers and fields, however: 


° Some RO registers or fields, including feature ID registers and some status registers or register fields, always 
return a known value. 


° Some RW and RO registers or register fields return status information about the PE. Unless the register 
description indicates that the value is UNKNOWN on reset, a read of the register immediately after a reset 
returns valid information. 


° Some RW and RO registers and fields are aliases of other registers or fields. In these cases, the reset behavior 
of the aliased register or field determines the value returned by a read of the register immediately after a reset. 


° WO registers that only have an effect on writes do not have meaningful reset values. However, an access to 
a WO register might affect underlying state, and that state might have a defined reset value. 


° IMPLEMENTATION DEFINED registers have IMPLEMENTATION DEFINED reset behavior. 


After a reset, only a limited subset of the PE state is guaranteed to be set to defined values. Also, for debug and trace 
System registers, reset requirements must take account of different levels of reset. For more information about the 
reset behavior of System registers when the PE resets into an Exception Level that is using AArch64, see: 


° PE state on reset to AArch64 state on page D1-1518. 


° The appropriate Trace architecture specification, for the Trace System registers. 


For a PE reset into an Exception level that is using AArch64, the architecture defines which AArch64 System 
registers have a defined reset value, and when that defined reset value applies. The register descriptions include this 
information, and PE state on reset to AArch64 state on page D1-1518 summarizes these architectural requirements. 
Otherwise, RW registers that have a meaningful reset value reset to an architecturally unknown value. 





Note 


When the PE resets into an Exception Level that is using AArch32, no PE state that relates to execution in AArch64 
state is accessible until another reset causes the Execution state to change to AArch64. Therefore, on a reset into 
AArch32 state, PE state that relates only to execution in AArch64 state cannot have a meaningful reset value.

Pseudocode description of resetting System registers


The AArch64.ResetSystemRegisters() pseudocode function resets all System registers, and register fields, that have 
defined reset values, as described in this section and PE state on reset to AArch64 state on page D1-1518. 


Note 


For debug and trace System registers this function resets registers as defined for the appropriate level of reset. 








Synchronization requirements for AArch64 System registers 


Reads of the System registers can occur out of order with respect to earlier instructions executed on the same PE, 
provided that any data dependencies between the instructions are respected. 


Note 


In particular, System registers that hold self-incrementing counts such as the performance counters or the Generic 
Timer counter or timers, can be read early. For example, where a memory access is used to communicate a read of 
such counters, an ISB must be inserted between the read of the memory location and the read of the Generic Timer 
counter, where it is necessary that the Generic Timer counter returns a count value after the memory communication. 
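
The pattern described in this Note can be sketched with C11 atomics and GCC or Clang inline assembly. The sketch assumes that EL0 access to CNTVCT_EL0 is enabled by the operating system. The function names are hypothetical, and on hosts other than AArch64 the counter read is replaced by a placeholder so that the example still compiles.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Reads the virtual count after an ISB, so that the read of the counter
 * cannot be performed ahead of the instructions that precede the ISB. */
static inline uint64_t read_virtual_count_after_isb(void)
{
#if defined(__aarch64__)
    uint64_t value;
    __asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(value) : : "memory");
    return value;
#else
    return 0; /* placeholder: no Generic Timer counter on this host */
#endif
}

/* Waits until another agent sets *flag in memory, then returns a count that
 * is sampled after the flag was observed. */
static uint64_t count_after_flag(atomic_int *flag)
{
    while (atomic_load_explicit(flag, memory_order_acquire) == 0) {
        /* spin until the memory communication arrives */
    }
    /* The acquire load orders memory accesses only; the ISB is what orders
     * the System register read after the load. */
    return read_virtual_count_after_isb();
}
```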








Direct writes using the instructions in Table C5-6 on page C5-282 require synchronization before software can rely 
on the effects of changes to the System registers to affect instructions appearing in program order after the direct 
write to the System register. Direct writes to these registers are not allowed to affect any instructions appearing in 
program order before the direct write. The only exceptions are: 


• All direct writes to the same register, that use the same encoding for that register, are guaranteed to occur in
program order relative to each other.


• All direct writes to a register occur in program order with respect to all direct reads to the same register using
the same encoding.


• Any System register access that an ARM Architecture Specification or equivalent specification defines as not
requiring synchronization.


Explicit synchronization occurs as a result of a Context synchronization event, which is one of the following events: 


• Execution of an ISB instruction.
• Exception entry.
• Exception return.
• Execution of a DCPS instruction in Debug state.
• Execution of a DRPS instruction in Debug state.
• Exit from Debug state.

Note 





The ISB and exception entry events are applicable both in Debug state and in Non-debug state. 





Conceptually, explicit synchronization occurs as the first step of each of these events, so that if the event uses state 
that has previously been changed but was not synchronized by the time of the event, the event is guaranteed to use 
the state as if it had been synchronized. 


Note 


This explicit synchronization applies as the first step of the execution of the events, and does not apply to any effect 
of System registers that apply to the fetch and decode of the instructions that cause these events, such as breakpoints 
or changes to the translation table. 








In addition, any system instructions that cause a write to a System register must be synchronized before the result
is guaranteed to be visible to subsequent direct reads of that System register.

Direct reads to any one of the following registers, using the same encoding, occur in program order relative to each
other:

• ISR_EL1.

• The Generic Timer registers, that is, CNTPCT_EL0 and CNTVCT_EL0, and the Counter registers
CNTP_TVAL_EL0, CNTV_TVAL_EL0, CNTHP_TVAL_EL2, and CNTPS_TVAL_EL1.

• DBGCLAIMCLR_EL1.

• The PMU Counters, that is, PMCCNTR_EL0, PMEVCNTR<n>_EL0, PMXEVCNTR_EL0,
PMOVSCLR_EL0, and PMOVSSET_EL0.

• The Debug Communications Channel registers, that is, DBGDTRRX_EL0, DBGDTR_EL0, and
MDCCSR_EL0.


All other direct reads of System registers can occur in any order if synchronization has not been performed. 


Table D7-1 describes the synchronization requirements between two successive read or write accesses to the same
register, where the ordering of the read or write accesses is:

1. Program order, in the event that both the reads or writes are caused by an instruction executed on this PE,
other than one caused by a memory access by this PE.

2. The order of arrival of asynchronous reads and writes at the PE relative to the execution of instructions that
cause reads or writes.

3. The order of arrival of asynchronous reads and writes at the PE relative to each other.


Table D7-1 Synchronization requirements

First read-write  Second read-write  Synchronization requirement
Direct read       Direct read        None
                  Direct write       None
                  Indirect read      None
                  Indirect write     None, see Notes on page D7-1891
Direct write      Direct read        None
                  Direct write       None
                  Indirect read      Required
                  Indirect write     None, see Notes on page D7-1891
Indirect read     Direct read        None
                  Direct write       None
                  Indirect read      None
                  Indirect write     None
Indirect write    Direct read        Required, see Notes on page D7-1891
                  Direct write       None, see Notes on page D7-1891
                  Indirect read      Required, see Notes on page D7-1891
                  Indirect write     None, see Notes on page D7-1891
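
For tooling that reasons about access sequences, Table D7-1 can be held as a small lookup table. The sketch below is illustrative only: the type and function names are hypothetical, and the table records only the None or Required column, not the qualifications given in the Notes.

```c
#include <stdbool.h>

/* The four kinds of access distinguished by Table D7-1. */
enum access_kind { DIRECT_READ, DIRECT_WRITE, INDIRECT_READ, INDIRECT_WRITE };

/* Rows are the first access, columns are the second access, in the order of
 * enum access_kind. true means that synchronization is "Required". */
static const bool sync_required[4][4] = {
    /* first: DIRECT_READ    */ { false, false, false, false },
    /* first: DIRECT_WRITE   */ { false, false, true,  false },
    /* first: INDIRECT_READ  */ { false, false, false, false },
    /* first: INDIRECT_WRITE */ { true,  false, true,  false },
};

/* Returns true when a Context synchronization event is required between two
 * successive accesses to the same register. */
static bool needs_context_sync(enum access_kind first, enum access_kind second)
{
    return sync_required[first][second];
}
```

For example, `needs_context_sync(DIRECT_WRITE, INDIRECT_READ)` is true, which corresponds to the common case of an MSR followed by an instruction that depends on the new register value.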
Notes


The terms Direct read, Direct write, Indirect read, and Indirect write, as used in Table D7-1 on page D7-1890, are 
defined as follows: 


Direct read Where software uses a System register access instruction to read the register, see both: 
° Instructions for accessing non-debug System registers on page C5-281. 


° Instructions for accessing debug System registers on page C5-280. 


Where a direct read of a register has a side-effect that changes the contents of a register, the effect 
of a direct read on that register is defined to be an indirect write. In this case, the indirect write is 
only guaranteed to have occurred, and be visible to subsequent direct or indirect reads or writes, if 
synchronization is performed after the direct read. 


Direct write Where software uses a System register access instruction to write to the register, see both: 
° Instructions for accessing non-debug System registers on page C5-281. 


° Instructions for accessing debug System registers on page C5-280. 


Where a direct write to a register has an effect on the register that means that the value in the register 
is not always the last value that is written (as is the case with set and clear registers), the effect of a 
direct write on that register is defined to be an indirect write. In this case, the indirect write is only 
guaranteed to be visible to subsequent direct or indirect reads or writes if synchronization is 
performed after the direct write and before the subsequent direct or indirect reads or writes. 


Indirect read Where an instruction uses a System register to establish operating conditions for the instruction, for 
example, the TTBR address or whether memory accesses are forced to be Non-cacheable. This 
includes situations where the contents of one System register selects what value is read or written 
using a different register. Indirect reads also include reads of the System register by external agents 
such as debuggers. Where an indirect read of a register has a side-effect that changes the contents 
of that register, that is defined to be an indirect write. 


Indirect write Where a System register is written as the consequence of some other instruction, exception, 
operation, or by the asynchronous operation of an external agent, including the passage of time as 
seen in counters, timers, or performance counters, the assertion of interrupts, or writes from an 
external debugger. 


Note


Since an exception is context synchronizing, registers such as the Exception Syndrome registers that 
are indirectly written as part of exception entry do not require additional synchronization. 





Where a direct read or write to a register is followed by an indirect write caused by an external agent, autonomous 
asynchronous event, or as a result of memory mapped write, synchronization is required to guarantee the order of 
those two accesses. 


Where an indirect write caused by a direct write is followed by an indirect write caused by an external agent, 
autonomous asynchronous event, or as a result of memory mapped write, synchronization is required to guarantee 
the order of those two indirect accesses. 


Where a direct read to one register causes a bit or field in a different register (or the same register using a different 
encoding) to be updated, the change to the different register (or same register using a different encoding) is defined 
to be an indirect write. In this case, the indirect write is only guaranteed to be visible to subsequent direct or indirect 
reads or writes if synchronization is performed after the direct read and before the subsequent direct or indirect reads 
or writes. 


Where a direct write to one register causes a bit or field in a different register (or the same register using a different 
encoding) to be updated as a side-effect of that direct write (as opposed to simply being a direct write to the different 
encoding), the change to the different register (or same register using a different encoding) is defined to be an 
indirect write. In this case, the indirect write is only guaranteed to be visible to subsequent direct or indirect reads 
or writes if synchronization is performed after the direct write and before the subsequent direct or indirect reads or 
writes.

Where indirect writes are caused by the actions of external agents such as debuggers, or by memory-mapped reads
or writes by the PE, then an indirect write by that agent and mechanism to a register, followed by an indirect read 
by that agent and mechanism to the same register using the same address, does not require synchronization. 


Indirect writes caused by external agents, autonomous asynchronous events, or as a result of memory-mapped 
writes, to the registers shown in Table D7-2, are required to be observable to: 


• Direct reads in finite time without explicit synchronization.

• Subsequent indirect reads without explicit synchronization.


Table D7-2 Registers with a guarantee of observability, VMSAv8-64

Registers                                              Notes
ISR_EL1                                                Interrupt Status Register
DBGCLAIMCLR_EL1, DBGCLAIMSET_EL1                       Debug CLAIM registers
CNTPCT_EL0, CNTVCT_EL0, CNTP_TVAL_EL0, CNTV_TVAL_EL0,  Generic Timer registers
CNTHP_TVAL_EL2, CNTPS_TVAL_EL1
PMCCNTR_EL0, PMEVCNTR<n>_EL0, PMXEVCNTR_EL0,           PMU Counters
PMOVSCLR_EL0, PMOVSSET_EL0
DBGDTRTX_EL0, DBGDTRRX_EL0, DBGDTR_EL0, and the DCC    Debug Communication Channel registers
flags in MDCCSR_EL0 and EDSCR
EDSCR.PipeAdv                                          External Debug Status and Control Register PipeAdv field





In addition to the requirements shown in Table D7-2: 


• Indirect writes to the following registers as a result of memory-mapped writes, including accesses by external
agents, are required to be observable to the indirect read made in determining the response to a subsequent
memory-mapped access without explicit synchronization:


—  OSLAR_EL1. OSLAR_EL1 is indirectly read to determine whether the subsequent access is 
permitted. 


—  EDLAR, if implemented. EDLAR is indirectly read to determine whether a subsequent write or 
side-effect of an access is ignored. 





Note 


This requirement is stricter than the general requirement for the observability of indirect writes. 





• When the PE is in Debug state, there are synchronization requirements for the Debug Communication
Channel and Instruction Transfer registers. See DCC and ITR access in Debug state on page H4-4923.


Note 





• Explicit synchronization requirements for System registers are provided so that direct accesses to these
registers can be implemented in a small number of cycles, and so that updates to multiple registers can be
performed quickly, with the synchronization penalty being paid only when the updates have occurred.


• Since toolkits might use registers such as the thread-local storage registers within compiled code, it is
recommended that access to these registers is implemented to take a small number of cycles.


• While no synchronization is required between a direct write and a direct read, or between a direct read and
an indirect write, this does not imply that a direct read causes synchronization of a previous direct write. That
is, the sequence direct write → direct read → indirect read, with no intervening context synchronization,
does not guarantee that the indirect read observes the result of the direct write.

D7.1.3 Principles of the ID scheme for fields in ID registers


The ARM architecture specifies a number of ID registers that are characterized as comprising a set of 4-bit ID fields.
Each ID field identifies the presence, and possibly the level of support for, a particular feature in an implementation
of the architecture. These fields follow an architectural model that aids their use by software and provides future
compatibility. This section describes that model. ID registers to which this scheme applies on page D7-1894
identifies the set of ID registers.


Note 


These fields are distinct from register fields that enumerate the number of resources, such as the number of 
breakpoints, watchpoints, or performance monitors, or the amount of memory. 








To provide forward compatibility, software can rely on the features of these fields that are described in this section. 


The ID fields, which are either signed or unsigned, use increasing numerical values to indicate increases in 
functionality. Therefore, if a value of 0x1 indicates the presence of some instructions, then the value 0x2 will indicate 
the presence of those instructions plus some additional instructions or functionality. This means software can be 
written in the form: 


if (value >= number) {
    // do something that relies on the value of the feature
}


For ID fields where the value 0x0 defines that a feature is not present, the field holds an unsigned value. This covers
the vast majority of such fields.


In a few cases, the architecture has been changed to permit implementations to exclude a feature that has previously 
been required and for which no ID field has been defined. In these cases, a new ID field is defined and: 


• The field holds a signed value.
• The field value 0xF indicates that the feature is not implemented.
• The field value 0x0 indicates that the feature is implemented.
• Software that depends on the feature can use the test:

if (value >= 0) {
    // Software features that depend on the presence of the hardware feature
}
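
The difference between the unsigned and signed interpretations of a 4-bit ID field can be sketched as follows. The helper names and the `shift` parameter are hypothetical; the position of each field comes from the description of the relevant ID register.

```c
#include <stdint.h>

/* Returns the 4-bit ID field whose least significant bit is at bit position
 * shift, as an unsigned value in the range 0 to 15. */
static unsigned id_field_unsigned(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xFu);
}

/* Returns the same field as a signed 4-bit value in the range -8 to 7, so
 * that the encoding 0xF reads as -1 and fails a (value >= 0) test. */
static int id_field_signed(uint64_t reg, unsigned shift)
{
    int value = (int)((reg >> shift) & 0xFu);
    return (value & 0x8) ? value - 16 : value;
}
```

With the signed helper, a field that holds 0xF fails the test `value >= 0`, while 0x0 and any later positive values pass it.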


In some cases, it has been decided retrospectively that the increase in functionality between two consecutive 
numerical values is too great, and it is desirable to permit an intermediate degree of functionality, and the means to 
discover this. This is done by the introduction of a fractional field that both: 


• Is referred to in the definition of the original field.


• Applies only when the original field is at the lower value of the step.


In principle a fractional field can be used for two different fractional steps, with different meanings associated with 
each of these steps. For this reason, a fractional field must be interpreted in the context of the field to which it relates 
and the value of that field. Example D7-1 shows the use of such a field. 


Example D7-1 Example of the use of a fractional field 


For a field describing some class of functionality: 
° The value 0x1 was defined as indicating that item A is present. 


° The value 0x2 was defined as indicating that items B and C are present, in addition to item A. 


Subsequently, it might be necessary to introduce a second ID field to indicate that A and B only are present. This 
new field is a fractional field, and might be defined as having the value 0x1 when A and B only are present. This 
fractional field is valid only when the original ID field has the value 0x1. 


This approach means that: 


• Software that depends on the test if (value >= 0x2) can rely on features A, B, and C being present.
• Software that depends on the test if (value >= 0x1) can rely on feature A being present.
• If new software needs to check only that features A and B are present then it can test:

if (value >= 0x2 || (value == 0x1 && fractional_value >= 0x1)) {
    // Software features that depend on A and B only
}


A fractional field uses the same approach of increasing numerical values indicating increasing functionality, and the 
fractional approach can also be applied recursively to fractional fields. 


Unused ID fields, and fractional fields that are not applicable, are RES0 to allow their future use when features, or
fractional implementation options, are added.
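
The three tests from Example D7-1 can be collected into helper functions, as a sketch. The function names are hypothetical, and `value` and `fractional_value` stand for the original ID field and its fractional field, already extracted as unsigned values.

```c
#include <stdbool.h>

/* Feature A is present from the value 0x1 upwards. */
static bool has_a(unsigned value)
{
    return value >= 0x1;
}

/* Features A, B, and C are all present from the value 0x2 upwards. */
static bool has_a_b_c(unsigned value)
{
    return value >= 0x2;
}

/* Features A and B are present either at the full step, or at the lower
 * value of the step when the fractional field reports the intermediate
 * level. The fractional field is consulted only when value is 0x1. */
static bool has_a_and_b(unsigned value, unsigned fractional_value)
{
    return value >= 0x2 || (value == 0x1 && fractional_value >= 0x1);
}
```

Because the fractional field is only examined when the original field holds 0x1, a stale or reserved fractional value cannot affect the result at any other step.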


ID registers to which this scheme applies 


This scheme applies to the following registers: 


AArch64 System registers

• The AArch64 views of the AArch32 feature ID registers given by:

— The AArch32 Auxiliary Feature register ID_AFR0_EL1.

— The AArch32 Processor Feature registers ID_PFR0_EL1 and ID_PFR1_EL1.

— The AArch32 Debug Feature register ID_DFR0_EL1.

— The AArch32 Memory Model Feature registers ID_MMFR0_EL1,
ID_MMFR1_EL1, ID_MMFR2_EL1, ID_MMFR3_EL1, and ID_MMFR4_EL1.

— The AArch32 Instruction Set Attribute registers ID_ISAR0_EL1, ID_ISAR1_EL1,
ID_ISAR2_EL1, ID_ISAR3_EL1, ID_ISAR4_EL1, and ID_ISAR5_EL1.

— The AArch32 Media and VFP Feature registers MVFR0_EL1, MVFR1_EL1, and
MVFR2_EL1.

• The AArch64 Auxiliary Feature registers ID_AA64AFR0_EL1 and ID_AA64AFR1_EL1.
• The AArch64 Processor Feature registers ID_AA64PFR0_EL1 and ID_AA64PFR1_EL1.
• The AArch64 Debug Feature registers ID_AA64DFR0_EL1 and ID_AA64DFR1_EL1.
• The AArch64 Memory Model Feature registers ID_AA64MMFR0_EL1 and
ID_AA64MMFR1_EL1.
• The AArch64 Instruction Set Attribute registers ID_AA64ISAR0_EL1 and
ID_AA64ISAR1_EL1.











AArch32 System registers

• The AArch32 Auxiliary Feature register ID_AFR0.
• The AArch32 Processor Feature registers ID_PFR0 and ID_PFR1.
• The AArch32 Debug Feature register ID_DFR0.
• The AArch32 Memory Model Feature registers ID_MMFR0, ID_MMFR1, ID_MMFR2,
ID_MMFR3, and ID_MMFR4.
• The AArch32 Instruction Set Attribute registers ID_ISAR0, ID_ISAR1, ID_ISAR2,
ID_ISAR3, ID_ISAR4, and ID_ISAR5.
• The AArch32 Media and VFP Feature registers MVFR0, MVFR1, and MVFR2.


Memory-mapped registers 


• The External Debug Processor Feature register EDPFR.
• The External Debug Feature register EDDFR.

D7.2 General system control registers

This section lists the System registers in AArch64 that are not part of one of the other listed groups.

D7.2.1 ACTLR_EL1, Auxiliary Control Register (EL1)


The ACTLR_EL1 characteristics are: 


Purpose 
Provides IMPLEMENTATION DEFINED configuration and control options for execution at EL1 and
EL0.

Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TACR==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch64 System register ACTLR_EL1[31:0] is architecturally mapped to AArch32 System 
register ACTLR. 


AArch64 System register ACTLR_EL1[63:32] is architecturally mapped to AArch32 System 
register ACTLR2. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
ACTLR_EL1 is a 64-bit register. 


Field descriptions 


The ACTLR_EL1 bit assignments are: 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the ACTLR_EL1: 
To access the ACTLR_EL1: 


MRS <Xt>, ACTLR_EL1 ; Read ACTLR_EL1 into Xt 
MSR ACTLR_EL1, <Xt> ; Write Xt to ACTLR_EL1 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   000  0001  0000  001
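
As an illustration of how these five fields are used, the sketch below packs them into the 32-bit instruction words for MRS and MSR (register). The bit positions are taken from the A64 encoding of those instructions, which is not reproduced in this section, and the structure and function names are hypothetical.

```c
#include <stdint.h>

/* The five encoding fields listed for a System register. */
struct sysreg_encoding {
    unsigned op0; /* 2 bits: 0b10 or 0b11 for System register accesses */
    unsigned op1; /* 3 bits */
    unsigned crn; /* 4 bits */
    unsigned crm; /* 4 bits */
    unsigned op2; /* 3 bits */
};

/* Packs the fields that MRS and MSR (register) share. Bit 20 is already set
 * in both base opcodes, so only the low bit of op0 is placed, in bit 19. */
static uint32_t sysreg_fields(struct sysreg_encoding r, unsigned rt)
{
    return ((r.op0 & 0x1u) << 19)
         | ((r.op1 & 0x7u) << 16)
         | ((r.crn & 0xFu) << 12)
         | ((r.crm & 0xFu) << 8)
         | ((r.op2 & 0x7u) << 5)
         | (rt & 0x1Fu);
}

/* MRS <Xt>, <System register> */
static uint32_t encode_mrs(struct sysreg_encoding r, unsigned rt)
{
    return 0xD5300000u | sysreg_fields(r, rt);
}

/* MSR <System register>, <Xt> */
static uint32_t encode_msr(struct sysreg_encoding r, unsigned rt)
{
    return 0xD5100000u | sysreg_fields(r, rt);
}
```

With the ACTLR_EL1 values from the table above, that is `{3, 0, 1, 0, 1}`, `encode_mrs` with Xt as X0 gives 0xD5381020.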
D7.2.2 ACTLR_EL2, Auxiliary Control Register (EL2)
The ACTLR_EL2 characteristics are: 


Purpose 


Provides IMPLEMENTATION DEFINED configuration and control options for EL2. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register ACTLR_EL2[31:0] is architecturally mapped to AArch32 System
register HACTLR.


AArch64 System register ACTLR_EL2[63:32] is architecturally mapped to AArch32 System
register HACTLR2.


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
ACTLR_EL2 is a 64-bit register. 


Field descriptions 


The ACTLR_EL2 bit assignments are: 


63 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the ACTLR_EL2: 
To access the ACTLR_EL2:


MRS <Xt>, ACTLR_EL2 ; Read ACTLR_EL2 into Xt 
MSR ACTLR_EL2, <Xt> ; Write Xt to ACTLR_EL2 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0001  0000  001

D7.2.3 ACTLR_EL3, Auxiliary Control Register (EL3)


The ACTLR_EL3 characteristics are: 


Purpose 


Provides IMPLEMENTATION DEFINED configuration and control options for EL3. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       -        RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
ACTLR_EL3 is a 64-bit register. 


Field descriptions 


The ACTLR_EL3 bit assignments are: 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the ACTLR_EL3: 
To access the ACTLR_EL3: 


MRS <Xt>, ACTLR_EL3 ; Read ACTLR_EL3 into Xt 
MSR ACTLR_EL3, <Xt> ; Write Xt to ACTLR_EL3 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 110 0001 0000 001 
D7.2.4 AFSR0_EL1, Auxiliary Fault Status Register 0 (EL1)
The AFSR0_EL1 characteristics are:
Purpose
Provides additional IMPLEMENTATION DEFINED fault status information for exceptions taken to EL1.
Usage constraints
This register is accessible as follows:
EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- RW RW RW RW RW
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register AFSR0_EL1 is architecturally mapped to AArch32 System register
ADFSR.
RW fields in this register reset to architecturally UNKNOWN values.
Attributes
AFSR0_EL1 is a 32-bit register.
Field descriptions
The AFSR0_EL1 bit assignments are:
31 0
IMPLEMENTATION DEFINED
IMPLEMENTATION DEFINED, bits [31:0]
IMPLEMENTATION DEFINED.
Accessing the AFSR0_EL1:
To access the AFSR0_EL1:
MRS <Xt>, AFSR0_EL1 ; Read AFSR0_EL1 into Xt
MSR AFSR0_EL1, <Xt> ; Write Xt to AFSR0_EL1
Register access is encoded as follows:
op0 op1 CRn CRm op2
11 000 0101 0001 000
D7.2.5 AFSR0_EL2, Auxiliary Fault Status Register 0 (EL2)
The AFSR0_EL2 characteristics are:
Purpose
Provides additional IMPLEMENTATION DEFINED fault status information for exceptions taken to EL2.
Usage constraints
This register is accessible as follows:
EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - RW RW RW
Traps and Enables
There are no traps or enables affecting this register.
Configurations
AArch64 System register AFSR0_EL2 is architecturally mapped to AArch32 System register
HADFSR.
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values.
Attributes
AFSR0_EL2 is a 32-bit register.
Field descriptions
The AFSR0_EL2 bit assignments are:
31 0
IMPLEMENTATION DEFINED
IMPLEMENTATION DEFINED, bits [31:0]
IMPLEMENTATION DEFINED.
Accessing the AFSR0_EL2:
To access the AFSR0_EL2:
MRS <Xt>, AFSR0_EL2 ; Read AFSR0_EL2 into Xt
MSR AFSR0_EL2, <Xt> ; Write Xt to AFSR0_EL2
Register access is encoded as follows:
op0 op1 CRn CRm op2
11 100 0101 0001 000
D7.2.6 AFSR0_EL3, Auxiliary Fault Status Register 0 (EL3)


The AFSR0_EL3 characteristics are:


Purpose 
Provides additional IMPLEMENTATION DEFINED fault status information for exceptions taken to EL3. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - - RW RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AFSR0_EL3 is a 32-bit register.


Field descriptions 


The AFSR0_EL3 bit assignments are:


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the AFSR0_EL3:
To access the AFSR0_EL3:


MRS <Xt>, AFSR0_EL3 ; Read AFSR0_EL3 into Xt
MSR AFSR0_EL3, <Xt> ; Write Xt to AFSR0_EL3


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 110 0101 0001 000 
D7.2.7 AFSR1_EL1, Auxiliary Fault Status Register 1 (EL1)
The AFSR1_EL1 characteristics are: 
Purpose 
Provides additional IMPLEMENTATION DEFINED fault status information for exceptions taken to EL1. 
Usage constraints 
This register is accessible as follows: 
EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- RW RW RW RW RW 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.
Configurations 
AArch64 System register AFSR1_EL1 is architecturally mapped to AArch32 System register 
AIFSR. 
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
AFSR1_EL1 is a 32-bit register.
Field descriptions 
The AFSR1_EL1 bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the AFSR1_EL1: 
To access the AFSR1_EL1: 
MRS <Xt>, AFSR1_EL1 ; Read AFSR1_EL1 into Xt 
MSR AFSR1_EL1, <Xt> ; Write Xt to AFSR1_EL1 
Register access is encoded as follows: 
op0 op1 CRn CRm op2
11 000 0101 0001 001 
D7.2.8 AFSR1_EL2, Auxiliary Fault Status Register 1 (EL2)
The AFSR1_EL2 characteristics are: 
Purpose 
Provides additional IMPLEMENTATION DEFINED fault status information for exceptions taken to EL2. 
Usage constraints 
This register is accessible as follows: 
EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - RW RW RW 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register AFSR1_EL2 is architecturally mapped to AArch32 System register 
HAIFSR. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
AFSR1_EL2 is a 32-bit register. 
Field descriptions 
The AFSR1_EL2 bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the AFSR1_EL2: 
To access the AFSR1_EL2: 
MRS <Xt>, AFSR1_EL2 ; Read AFSR1_EL2 into Xt 
MSR AFSR1_EL2, <Xt> ; Write Xt to AFSR1_EL2 
Register access is encoded as follows: 
op0 op1 CRn CRm op2
11 100 0101 0001 001 
D7.2.9 AFSR1_EL3, Auxiliary Fault Status Register 1 (EL3)


The AFSR1_EL3 characteristics are: 


Purpose 
Provides additional IMPLEMENTATION DEFINED fault status information for exceptions taken to EL3. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- - - - RW RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AFSR1_EL3 is a 32-bit register. 


Field descriptions 


The AFSR1_EL3 bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the AFSR1_EL3: 
To access the AFSR1_EL3: 


MRS <Xt>, AFSR1_EL3 ; Read AFSR1_EL3 into Xt 
MSR AFSR1_EL3, <Xt> ; Write Xt to AFSR1_EL3 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 110 0101 0001 001 
D7.2.10 AIDR_EL1, Auxiliary ID Register


The AIDR_EL1 characteristics are:


Purpose 
Provides IMPLEMENTATION DEFINED identification information. 


The value of this register must be interpreted in conjunction with the value of MIDR_EL1. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- RO RO RO RO RO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID1==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 
AArch64 System register AIDR_EL1 is architecturally mapped to AArch32 System register AIDR. 


Attributes 
AIDR_EL1 is a 32-bit register. 


Field descriptions 


The AIDR_EL1 bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the AIDR_EL1: 
To access the AIDR_EL1:
MRS <Xt>, AIDR_EL1 ; Read AIDR_EL1 into Xt 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 001 0000 0000 111 
D7.2.11 AMAIR_EL1, Auxiliary Memory Attribute Indirection Register (EL1)
The AMAIR_EL1 characteristics are:


Purpose
Provides IMPLEMENTATION DEFINED memory attributes for the memory regions specified by
MAIR_EL1.

Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- RW RW RW RW RW 





AMAIR_EL1 is permitted to be cached in a TLB.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.


Configurations 


AArch64 System register AMAIR_EL1[31:0] is architecturally mapped to AArch32 System
register AMAIR0.


AArch64 System register AMAIR_EL1[63:32] is architecturally mapped to AArch32 System 
register AMAIR1. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AMAIR_EL1 is a 64-bit register. 


Field descriptions 


The AMAIR_EL1 bit assignments are: 


63 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the AMAIR_EL1: 
To access the AMAIR_EL1: 


MRS <Xt>, AMAIR_EL1 ; Read AMAIR_EL1 into Xt 
MSR AMAIR_EL1, <Xt> ; Write Xt to AMAIR_EL1 
Register access is encoded as follows:





op0 op1 CRn CRm op2





11 000 1010 0011 000 
D7.2.12 AMAIR_EL2, Auxiliary Memory Attribute Indirection Register (EL2)
The AMAIR_EL2 characteristics are: 


Purpose 


Provides IMPLEMENTATION DEFINED memory attributes for the memory regions specified by 
MAIR_EL2. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- - - RW RW RW 





AMAIR_EL2 is permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register AMAIR_EL2[31:0] is architecturally mapped to AArch32 System
register HAMAIR0.


AArch64 System register AMAIR_EL2[63:32] is architecturally mapped to AArch32 System
register HAMAIR1.


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AMAIR_EL2 is a 64-bit register. 


Field descriptions 


The AMAIR_EL2 bit assignments are: 


63 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the AMAIR_EL2: 
To access the AMAIR_EL2: 


MRS <Xt>, AMAIR_EL2 ; Read AMAIR_EL2 into Xt 
MSR AMAIR_EL2, <Xt> ; Write Xt to AMAIR_EL2 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 100 1010 0011 000 
D7.2.13 AMAIR_EL3, Auxiliary Memory Attribute Indirection Register (EL3)
The AMAIR_EL3 characteristics are: 


Purpose 


Provides IMPLEMENTATION DEFINED memory attributes for the memory regions specified by 
MAIR_EL3. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- - - - RW RW 





AMAIR_EL3 is permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AMAIR_EL3 is a 64-bit register.


Field descriptions 


The AMAIR_EL3 bit assignments are: 


63 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the AMAIR_EL3: 
To access the AMAIR_EL3: 


MRS <Xt>, AMAIR_EL3 ; Read AMAIR_EL3 into Xt 
MSR AMAIR_EL3, <Xt> ; Write Xt to AMAIR_EL3 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 110 1010 0011 000 
D7.2.14 CCSIDR_EL1, Current Cache Size ID Register
The CCSIDR_EL1 characteristics are: 
Purpose 
Provides information about the architecture of the currently selected cache. 
Usage constraints 
This register is accessible as follows: 
EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- RO RO RO RO RO
If CSSELR_EL1.Level indicates a cache that is not implemented, then on a read of the
CCSIDR_EL1 the behavior is CONSTRAINED UNPREDICTABLE, and can be one of the following:
• The CCSIDR_EL1 read is treated as NOP.
• The CCSIDR_EL1 read is UNDEFINED.
• The CCSIDR_EL1 read returns an UNKNOWN value.
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If HCR_EL2.TID2==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations 
AArch64 System register CCSIDR_EL1 is architecturally mapped to AArch32 System register 
CCSIDR. 
The implementation includes one CCSIDR_EL1 for each cache that it can access. CSSELR_EL1 
selects which Cache Size ID Register is accessible. 
Attributes 
CCSIDR_EL1 is a 32-bit register. 
Field descriptions 
The CCSIDR_EL1 bit assignments are: 
31 28 27 13 12 3 2 0 
UNKNOWN, bits [31:28] 
Reserved, UNKNOWN. 
NumSets, bits [27:13] 
(Number of sets in cache) - 1, therefore a value of 0 indicates 1 set in the cache. The number of sets 
does not have to be a power of 2. 
Associativity, bits [12:3] 
(Associativity of cache) - 1, therefore a value of 0 indicates an associativity of 1. The associativity 
does not have to be a power of 2. 
LineSize, bits [2:0]
(Log2(Number of bytes in cache line)) - 4. For example:
For a line length of 16 bytes: Log2(16) = 4, LineSize entry = 0. This is the minimum line length.
For a line length of 32 bytes: Log2(32) = 5, LineSize entry = 1.





Note 


The parameters NumSets, Associativity, and LineSize in these registers define the architecturally visible parameters 
that are required for the cache maintenance by Set/Way instructions. They are not guaranteed to represent the actual 
microarchitectural features of a design. You cannot make any inference about the actual sizes of caches based on 
these parameters. 
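
The field encodings above can be unpacked as in the following minimal sketch. The struct and function names are hypothetical, and the value passed in is assumed to have been read with `MRS <Xt>, CCSIDR_EL1` after selecting a cache through CSSELR_EL1. As the Note says, the results are the set/way maintenance parameters and not a statement about the physical cache.

```c
#include <stdint.h>

/* Architecturally visible parameters for set/way maintenance.
 * Hypothetical container type for illustration. */
struct ccsidr_params {
    uint32_t num_sets;      /* NumSets field + 1 */
    uint32_t associativity; /* Associativity field + 1 */
    uint32_t line_bytes;    /* 2^(LineSize field + 4) */
};

static inline struct ccsidr_params ccsidr_decode(uint32_t ccsidr)
{
    struct ccsidr_params p;
    p.num_sets      = ((ccsidr >> 13) & 0x7FFFu) + 1u; /* bits [27:13] */
    p.associativity = ((ccsidr >> 3) & 0x3FFu) + 1u;   /* bits [12:3]  */
    p.line_bytes    = 1u << ((ccsidr & 0x7u) + 4u);    /* bits [2:0]   */
    return p;
}
```

For instance, a LineSize entry of 0 decodes to the 16-byte minimum line length, and an entry of 1 decodes to 32 bytes, matching the examples in the field description.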





Accessing the CCSIDR_EL1: 
To access the CCSIDR_EL1: 
MRS <Xt>, CCSIDR_EL1 ; Read CCSIDR_EL1 into Xt 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 001 0000 0000 000
D7.2.15 CLIDR_EL1, Cache Level ID Register
The CLIDR_EL1 characteristics are: 


Purpose 


Identifies the type of cache, or caches, that are implemented at each level and can be managed using 
the architected cache maintenance instructions that operate by set/way, up to a maximum of seven 
levels. Also identifies the Level of Coherence (LoC) and Level of Unification (LoU) for the cache 
hierarchy. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- RO RO RO RO RO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID2==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 


AArch64 System register CLIDR_EL1 is architecturally mapped to AArch32 System register 
CLIDR. 


Attributes 
CLIDR_EL1 is a 64-bit register. 


Field descriptions 


The CLIDR_EL1 bit assignments are: 


63 33 32 30 29 27 26 24 23 21 20 18 17 15 14 12 11 9 8 6 5 3 2 0
RES0 ICB LoUU LoC LoUIS Ctype7 Ctype6 Ctype5 Ctype4 Ctype3 Ctype2 Ctype1


Bits [63:33] 


Reserved, RES0.


ICB, bits [32:30] 


Inner cache boundary. This field indicates the boundary for caching Inner Cacheable memory 
regions. 


The possible values are: 





000 Not disclosed by this mechanism.
001 L1 cache is the highest Inner Cacheable level.
010 L2 cache is the highest Inner Cacheable level.
011 L3 cache is the highest Inner Cacheable level.
100 L4 cache is the highest Inner Cacheable level.
101 L5 cache is the highest Inner Cacheable level.
110 L6 cache is the highest Inner Cacheable level.
111 L7 cache is the highest Inner Cacheable level.


LoUU, bits [29:27]


Level of Unification Uniprocessor for the cache hierarchy.


LoC, bits [26:24] 


Level of Coherence for the cache hierarchy. 


LoUIS, bits [23:21] 


Level of Unification Inner Shareable for the cache hierarchy. 


Ctype<n>, bits [3(n-1)+2:3(n-1)], for n = 1 to 7 


Cache Type fields. Indicate the type of cache that is implemented and can be managed using the 
architected cache maintenance instructions that operate by set/way at each level, from Level 1 up to 
a maximum of seven levels of cache hierarchy. Possible values of each field are: 


000 No cache.
001 Instruction cache only.
010 Data cache only.
011 Separate instruction and data caches.
100 Unified cache.


All other values are reserved. 


If software reads the Cache Type fields from Ctype1 upwards, once it has seen a value of 000, no
caches that can be managed using the architected cache maintenance instructions that operate by 
set/way exist at further-out levels of the hierarchy. So, for example, if Ctype3 is the first Cache Type 
field with a value of 000, the values of Ctype4 to Ctype7 must be ignored. 
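
The rule for reading the Cache Type fields can be sketched as follows. This is an illustrative helper set with hypothetical names, assuming the value comes from `MRS <Xt>, CLIDR_EL1`; it applies the Ctype<n> bit positions given above and stops at the first field that reads 000.

```c
#include <stdint.h>

/* Ctype<n> occupies bits [3(n-1)+2:3(n-1)], for n = 1 to 7. */
static inline unsigned clidr_ctype(uint64_t clidr, unsigned n)
{
    return (unsigned)((clidr >> (3u * (n - 1u))) & 0x7u);
}

/* Count the levels manageable by set/way, reading from Ctype1 upwards
 * and ignoring everything after the first field with value 000. */
static inline unsigned clidr_setway_levels(uint64_t clidr)
{
    unsigned n;
    for (n = 1; n <= 7; n++) {
        if (clidr_ctype(clidr, n) == 0)
            break;
    }
    return n - 1;
}

/* LoC, bits [26:24]. */
static inline unsigned clidr_loc(uint64_t clidr)
{
    return (unsigned)((clidr >> 24) & 0x7u);
}

/* LoUIS, bits [23:21]. */
static inline unsigned clidr_louis(uint64_t clidr)
{
    return (unsigned)((clidr >> 21) & 0x7u);
}
```

With Ctype1 = 011, Ctype2 = 100, and Ctype3 = 000, `clidr_setway_levels` reports two levels regardless of what Ctype4 to Ctype7 contain.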


Accessing the CLIDR_EL1: 


To access the CLIDR_EL1: 


MRS <Xt>, CLIDR_EL1 ; Read CLIDR_EL1 into Xt 


Register access is encoded as follows: 





op0 op1 CRn CRm op2
11 001 0000 0000 001
D7.2.16 CONTEXTIDR_EL1, Context ID Register (EL1)


The CONTEXTIDR_EL1 characteristics are: 


Purpose 
Identifies the current Process Identifier. 
The value of the whole of this register is called the Context ID and is used by: 
• The debug logic, for Linked and Unlinked Context ID matching.
• The trace logic, to identify the current process.


The significance of this register is for debug and trace use only. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- RW RW RW RW RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.
Configurations 


AArch64 System register CONTEXTIDR_EL1 is architecturally mapped to AArch32 System 
register CONTEXTIDR. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CONTEXTIDR_EL1 is a 32-bit register. 


Field descriptions 


The CONTEXTIDR_EL1 bit assignments are: 


31 0 


PROCID 


PROCID, bits [31:0] 
Process Identifier. This field must be programmed with a unique value that identifies the current
process.


Note
In AArch32 state, when TTBCR.EAE is set to 0, CONTEXTIDR.ASID holds the ASID.


In AArch64 state, CONTEXTIDR_EL1 is independent of the ASID, and for the EL1&0 translation
regime either TTBR0_EL1 or TTBR1_EL1 holds the ASID.
Accessing the CONTEXTIDR_EL1:
To access the CONTEXTIDR_EL1: 


MRS <Xt>, CONTEXTIDR_EL1 ; Read CONTEXTIDR_EL1 into Xt 
MSR CONTEXTIDR_EL1, <Xt> ; Write Xt to CONTEXTIDR_EL1 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 000 1101 0000 001 
D7.2.17 CPACR_EL1, Architectural Feature Access Control Register
The CPACR_EL1 characteristics are: 


Purpose 


Controls access to trace, and to Advanced SIMD and floating-point functionality. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)





- RW RW RW RW RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If CPTR_EL2.TCPAC==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If CPTR_EL3.TCPAC==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register CPACR_EL1 is architecturally mapped to AArch32 System register 
CPACR. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CPACR_EL1 is a 32-bit register. 


Field descriptions 


The CPACR_EL1 bit assignments are: 


31 29 28 27 22 21 20 19 0
RES0 TTA RES0 FPEN RES0


Bits [31:29] 


Reserved, RES0.





TTA, bit [28] 
Traps EL0 and EL1 System register accesses to all implemented trace registers to EL1, from both
Execution states.
0 EL0 and EL1 System register accesses to all implemented trace registers are not trapped
to EL1.
1 EL0 and EL1 System register accesses to all implemented trace registers are trapped to
EL1.
Note


• The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A
architecture is implemented with an ETMv4 implementation, EL0 accesses to the trace
registers are UNDEFINED, and any resulting exception is higher priority than an exception that
would be generated because the value of CPACR_EL1.TTA is 1.


• The ARMv8-A architecture does not provide traps on trace register accesses through the
optional memory-mapped interface.





System register accesses to the trace registers can have side-effects. When a System register access 
is trapped, any side-effects that are normally associated with the access do not occur before the 
exception is taken. 


If System register access to the trace functionality is not implemented, this bit is RES0.
Bits [27:22]
Reserved, RES0.
FPEN, bits [21:20]
Traps EL0 and EL1 accesses to the SIMD and floating-point registers to EL1, from both Execution
states.

00 Causes any instructions in EL0 or EL1 that use the registers associated with
floating-point and Advanced SIMD execution to be trapped.

01 Causes any instructions in EL0 that use the registers associated with floating-point and
Advanced SIMD execution to be trapped, but does not cause any instruction in EL1 to
be trapped.

10 Causes any instructions in EL0 or EL1 that use the registers associated with
floating-point and Advanced SIMD execution to be trapped.

11 Does not cause any instruction to be trapped. 


Writes to MVFR0, MVFR1 and MVFR2 from EL1 or higher are CONSTRAINED UNPREDICTABLE
and whether these accesses can be trapped by this control depends on implemented CONSTRAINED
UNPREDICTABLE behavior.


Note
• Attempts to write to the FPSID do count as use of the registers for accesses from EL1 or
higher.


• Accesses from EL0 to FPSID, MVFR0, MVFR1, MVFR2 and FPEXC are UNDEFINED, and
any resulting exception is higher priority than an exception that would be generated because
the value of CPACR_EL1.FPEN is not 11.





Bits [19:0] 


Reserved, RES0.
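
The four FPEN encodings reduce to a small decision per Exception level. The following is a minimal sketch under that reading, with hypothetical macro and function names; `el` is assumed to be 0 or 1, since this control only affects EL0 and EL1.

```c
#include <stdint.h>
#include <stdbool.h>

#define CPACR_EL1_FPEN_SHIFT 20
#define CPACR_EL1_FPEN_MASK  (UINT32_C(3) << CPACR_EL1_FPEN_SHIFT)

/* Returns true if CPACR_EL1.FPEN causes floating-point and Advanced
 * SIMD register use at the given Exception level (0 or 1) to trap. */
static inline bool cpacr_fp_trapped(uint32_t cpacr, unsigned el)
{
    unsigned fpen = (cpacr & CPACR_EL1_FPEN_MASK) >> CPACR_EL1_FPEN_SHIFT;

    if (fpen == 3)
        return false;   /* 0b11: no instruction is trapped */
    if (fpen == 1)
        return el == 0; /* 0b01: EL0 trapped, EL1 not trapped */
    return el <= 1;     /* 0b00 and 0b10: EL0 and EL1 trapped */
}

/* Returns the CPACR_EL1 value with FPEN set to 0b11, leaving the
 * other bits as they were read. */
static inline uint32_t cpacr_enable_fp(uint32_t cpacr)
{
    return cpacr | CPACR_EL1_FPEN_MASK;
}
```

A typical use would be a read-modify-write: read CPACR_EL1 with MRS, pass the value through `cpacr_enable_fp`, and write the result back with MSR.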


Accessing the CPACR_EL1: 
To access the CPACR_EL1: 


MRS <Xt>, CPACR_EL1 ; Read CPACR_EL1 into Xt 
MSR CPACR_EL1, <Xt> ; Write Xt to CPACR_EL1 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 000 0001 0000 010 
D7.2.18 CPTR_EL2, Architectural Feature Trap Register (EL2)
The CPTR_EL2 characteristics are: 
Purpose 
Controls trapping to EL2 of access to CPACR, CPACR_EL1, trace functionality and registers 
associated with Advanced SIMD and floating-point execution. Also controls EL2 access to this 
functionality. 
Usage constraints 
This register is accessible as follows: 
EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - RW RW RW 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If CPTR_EL3.TCPAC==1, accesses to this register from EL2 are trapped to EL3.
Configurations 
AArch64 System register CPTR_EL2 is architecturally mapped to AArch32 System register 
HCPTR. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
CPTR_EL2 is a 32-bit register. 
Field descriptions 
The CPTR_EL2 bit assignments are: 
31 30 21 20 19 14 13 12 11 10 9 0
TCPAC RES0 TTA RES0 RES1 RES0 TFP RES1
TCPAC, bit [31] 
Traps Non-secure EL1 accesses to the CPACR_EL1 or CPACR to EL2, from both Execution states. 
0 This control has no effect on Non-secure EL1 accesses to the CPACR_EL1 or CPACR. 
1 Non-secure EL1 accesses to the CPACR_EL1 or CPACR are trapped to EL2.
Note
The CPACR_EL1 or CPACR is not accessible at EL0.
Bits [30:21]
Reserved, RES0.
TTA, bit [20]
Traps Non-secure System register accesses to the trace registers to EL2, from both Execution states.
0 Non-secure System register accesses to the trace registers are not trapped to EL2.
1 Any attempt at EL2, or Non-secure EL0 or EL1, to execute a System register access to
a trace register is trapped to EL2, subject to the exception prioritization rules.
Note


• The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A
architecture is implemented with an ETMv4 implementation, EL0 accesses to the trace
registers are UNDEFINED, and any resulting exception is higher priority than an exception that
would be generated because the value of CPTR_EL2.TTA is 1.


• EL2 does not provide traps on trace register accesses through the optional memory-mapped
interface.





System register accesses to the trace registers can have side-effects. When a System register access 
is trapped, any side-effects that are normally associated with the access do not occur before the 
exception is taken. 


If System register access to the trace functionality is not supported, this bit is RES0.
Bits [19:14]

Reserved, RES0.
Bits [13:12]


Reserved, RES1.


Bit [11]
Reserved, RES0.
TFP, bit [10]
Traps Non-secure accesses to SIMD and floating-point functionality to EL2, from both Execution
states.
0 Does not cause any instruction to be trapped.
1 Any attempt at EL2, or Non-secure EL0 or EL1, to execute an instruction that uses the
registers associated with floating-point and Advanced SIMD execution is trapped to
EL2, subject to the exception prioritization rules.
Bits [9:0] 


Reserved, RES1. 
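
Because bits [13:12] and bits [9:0] are RES1, a value written to CPTR_EL2 is normally built on top of those bits. The following is a minimal sketch of composing such a value from the three trap controls described above; the macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define CPTR_EL2_RES1  UINT32_C(0x000033FF) /* bits [13:12] and bits [9:0] */
#define CPTR_EL2_TFP   (UINT32_C(1) << 10)
#define CPTR_EL2_TTA   (UINT32_C(1) << 20)
#define CPTR_EL2_TCPAC (UINT32_C(1) << 31)

/* Build a CPTR_EL2 value: RES1 bits set, RES0 bits clear, and each
 * trap control set only when requested. */
static inline uint32_t cptr_el2_value(bool tcpac, bool tta, bool tfp)
{
    uint32_t v = CPTR_EL2_RES1;

    if (tcpac)
        v |= CPTR_EL2_TCPAC;
    if (tta)
        v |= CPTR_EL2_TTA;
    if (tfp)
        v |= CPTR_EL2_TFP;
    return v;
}
```

With all three controls clear the result is 0x000033FF, which leaves trace and floating-point accesses untrapped at EL2.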


Accessing the CPTR_EL2: 
To access the CPTR_EL2: 


MRS <Xt>, CPTR_EL2 ; Read CPTR_EL2 into Xt 
MSR CPTR_EL2, <Xt> ; Write Xt to CPTR_EL2 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 100 0001 0001 010 
D7.2.19 CPTR_EL3, Architectural Feature Trap Register (EL3)
The CPTR_EL3 characteristics are: 


Purpose 
Controls trapping to EL3 of access to CPACR_EL1, CPTR_EL2, trace functionality and registers 
associated with Advanced SIMD and floating-point execution. Also controls EL3 access to trace 
functionality and registers associated with Advanced SIMD and floating-point execution. 

Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - - RW RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CPTR_EL3 is a 32-bit register. 


Field descriptions 


The CPTR_EL3 bit assignments are: 


31 30 21 20 19 11 10 9 0
TCPAC RES0 TTA RES0 TFP RES0


TCPAC, bit [31] 
Traps all of the following to EL3, from both Security states and both Execution states. 
• EL2 accesses to the CPTR_EL2 or HCPTR.
• EL2 and EL1 accesses to the CPACR_EL1 or CPACR.
When CPTR_EL3.TCPAC is:


0 This control does not cause any accesses to CPACR_EL1, CPTR_EL2, CPACR, or
HCPTR to trap to EL3.
1 EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the
CPACR_EL1 or CPACR, are trapped to EL3 if they are not trapped to EL2 by the
CPTR_EL2.TCPAC control.


Bits [30:21]
Reserved, RES0.
TTA, bit [20]
Traps System register accesses to the trace registers, from all Exception levels, both Security states,
and both Execution states, to EL3.
0 Does not cause any instruction to be trapped.
1 Any System register access to the trace registers is trapped to EL3, subject to the
exception prioritization rules.
If System register access to trace functionality is not supported, this bit is RES0.
Bits [19:11]
Reserved, RES0.
TFP, bit [10]
Traps all accesses to Advanced SIMD and floating-point functionality, from all Exception levels,
both Security states, and both Execution states, to EL3.
0 Does not cause any instruction to be trapped.
1 Any attempt at any Exception level to execute an instruction that uses the registers
associated with Advanced SIMD and floating-point is trapped to EL3, subject to the
exception prioritization rules.
Bits [9:0]
Reserved, RES0.


Accessing the CPTR_EL3: 


To access the CPTR_EL3: 


MRS <Xt>, CPTR_EL3 ; Read CPTR_EL3 into Xt 
MSR CPTR_EL3, <Xt> ; Write Xt to CPTR_EL3 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 110 0001 0001 010 
D7.2.20 CSSELR_EL1, Cache Size Selection Register


The CSSELR_EL1 characteristics are: 


Purpose 


Selects the current Cache Size ID Register, CCSIDR_EL1, by specifying the required cache level 
and the cache type (either instruction or data cache). 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- RW RW RW RW RW





If CSSELR_EL1.Level is programmed to a cache level that is not implemented, then a read of 
CSSELR_EL1 is CONSTRAINED UNPREDICTABLE, and returns UNKNOWN values for 
CSSELR_EL1.{Level, InD}. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID2==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch64 System register CSSELR_EL1 is architecturally mapped to AArch32 System register
CSSELR.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 


CSSELR_EL1 is a 32-bit register.


Field descriptions 


The CSSELR_EL1 bit assignments are: 


31 


0 


Bits [31:4] 


Reserved, RESO. 


Level, bits [3:1] 


Cache level of required cache. Permitted values are: 


Es InD 





000 Level 1 cache 
001 Level 2 cache 
010 Level 3 cache 
011 Level 4 cache 
100 Level 5 cache 
101 Level 6 cache 
110 Level 7 cache 
All other values are reserved. 


If CSSELR_EL1.Level is programmed to a cache level that is not implemented, then the value for 
this field on a read of CSSELR is UNKNOWN. 


InD, bit [0] 


Instruction not Data bit. Permitted values are: 
0 Data or unified cache. 
1 Instruction cache. 


If CSSELR_EL1.Level is programmed to a cache level that is not implemented, then the value for 
this field on a read of CSSELR is UNKNOWN. 
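

The Level and InD field layout described above can be illustrated with a short sketch. This is not part of the architecture text: the function names are hypothetical, and the code only packs and unpacks the value that software would write with MSR or read with MRS.

```python
def csselr_value(level, instruction_cache=False):
    """Return a CSSELR_EL1 value selecting cache `level` (1 to 7)."""
    if not 1 <= level <= 7:
        raise ValueError("cache level must be in the range 1 to 7")
    # Level, bits [3:1], holds (level - 1); InD, bit [0], is 1 for an
    # instruction cache and 0 for a data or unified cache.
    return ((level - 1) << 1) | (1 if instruction_cache else 0)


def csselr_fields(value):
    """Split a CSSELR_EL1 value into (cache level, instruction_cache)."""
    level_field = (value >> 1) & 0b111
    if level_field == 0b111:
        # Only 0b000 to 0b110 are listed as permitted Level values.
        raise ValueError("Level value 0b111 is reserved")
    return level_field + 1, bool(value & 1)
```

For example, selecting the Level 2 instruction cache gives Level == 0b001 and InD == 1. Whether the selected cache level is implemented must still be checked by software, because a read after selecting an unimplemented level returns UNKNOWN values.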


Accessing the CSSELR_EL1: 


To access the CSSELR_EL1: 


MRS <Xt>, CSSELR_EL1 ; Read CSSELR_EL1 into Xt 
MSR CSSELR_EL1, <Xt> ; Write Xt to CSSELR_EL1 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   010  0000  0000  000

D7.2.21 CTR_EL0, Cache Type Register 
The CTR_EL0 characteristics are: 


Purpose 


Provides information about the architecture of the caches. 


Usage constraints 


This register is accessible as follows: 





EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID2==1, Non-secure read accesses to this register from EL0 and EL1 are 
trapped to EL2. 


° If SCTLR_EL1.UCT==0, read accesses to this register from EL0 are trapped to EL1. 


Configurations 
AArch64 System register CTR_EL0 is architecturally mapped to AArch32 System register CTR. 


Attributes 
CTR_EL0 is a 32-bit register. 


Field descriptions 


The CTR_EL0 bit assignments are: 


31    30:28  27:24  23:20  19:16     15:14  13:4  3:0
RES1  RES0   CWG    ERG    DminLine  L1Ip   RES0  IminLine


Bit [31] 


Reserved, RES1. 


Bits [30:28] 


Reserved, RES0. 


CWG, bits [27:24] 


Cache Writeback Granule. Log2 of the number of words of the maximum size of memory that can 
be overwritten as a result of the eviction of a cache entry that has had a memory location in it 
modified. 


A value of 0b0000 indicates that this register does not provide Cache Writeback Granule information 
and either: 
° The architectural maximum of 512 words (2KB) must be assumed. 
° The Cache Writeback Granule can be determined from maximum cache line size encoded in 
the Cache Size ID Registers. 


Values greater than 0b1001 are reserved. 


ERG, bits [23:20] 


Exclusives Reservation Granule. Log2 of the number of words of the maximum size of the 
reservation granule that has been implemented for the Load-Exclusive and Store-Exclusive 
instructions. 


A value of 0b0000 indicates that this register does not provide Exclusives Reservation Granule 
information and the architectural maximum of 512 words (2KB) must be assumed. 


Values greater than 0b1001 are reserved. 


DminLine, bits [19:16] 


Log2 of the number of words in the smallest cache line of all the data caches and unified caches that 
are controlled by the PE. 


L1Ip, bits [15:14] 


Level 1 instruction cache policy. Indicates the indexing and tagging policy for the L1 instruction 
cache. Possible values of this field are: 


01 ASID-tagged Virtual Index, Virtual Tag (AIVIVT) 
10 Virtual Index, Physical Tag (VIPT) 
11 Physical Index, Physical Tag (PIPT) 


Other values are reserved. 


Bits [13:4] 


Reserved, RES0. 


IminLine, bits [3:0] 


Log2 of the number of words in the smallest cache line of all the instruction caches that are 
controlled by the PE. 
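

The CTR_EL0 fields above are Log2 word counts, so converting them to byte sizes is a shift of the 4-byte word size. The following is a minimal sketch of that decoding, not an architected algorithm. The function name, the dictionary keys, and the sample value used in the usage note are assumptions made for illustration.

```python
WORD_BYTES = 4  # the field encodings count 32-bit (4-byte) words

L1IP_POLICY = {0b01: "AIVIVT", 0b10: "VIPT", 0b11: "PIPT"}


def decode_ctr_el0(value):
    """Decode the fields of a 32-bit CTR_EL0 value into a dictionary."""
    cwg = (value >> 24) & 0xF       # CWG, bits [27:24]
    erg = (value >> 20) & 0xF       # ERG, bits [23:20]
    dminline = (value >> 16) & 0xF  # DminLine, bits [19:16]
    l1ip = (value >> 14) & 0x3      # L1Ip, bits [15:14]
    iminline = value & 0xF          # IminLine, bits [3:0]
    return {
        # CWG == 0 means the register gives no information: the caller must
        # assume 2KB or consult the Cache Size ID Registers, so return None.
        "cwg_bytes": None if cwg == 0 else WORD_BYTES << cwg,
        # ERG == 0 means the architectural maximum of 512 words (2KB).
        "erg_bytes": 2048 if erg == 0 else WORD_BYTES << erg,
        "dminline_bytes": WORD_BYTES << dminline,
        "l1ip": L1IP_POLICY.get(l1ip, "reserved"),
        "iminline_bytes": WORD_BYTES << iminline,
    }
```

With the hypothetical value 0x8444C004, every granule and minimum line size decodes to 64 bytes and the L1 instruction cache policy decodes to PIPT. The sketch does not reject CWG or ERG values greater than 0b1001, which the text marks as reserved.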


Accessing the CTR_EL0: 


To access the CTR_EL0: 


MRS <Xt>, CTR_EL0 ; Read CTR_EL0 into Xt 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   011  0000  0000  001
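

The op0, op1, CRn, CRm, and op2 values in these encoding tables select the System register within the MRS and MSR instruction encodings. As a rough illustration, the sketch below packs them into a 32-bit MRS instruction word. It assumes the A64 System register move layout described in the instruction set chapters of this manual, with base opcode 0xD5300000, the low bit of op0 at bit [19], op1 at bits [18:16], CRn at bits [15:12], CRm at bits [11:8], op2 at bits [7:5], and Rt at bits [4:0]. The function name is hypothetical.

```python
def encode_mrs(rt, op0, op1, crn, crm, op2):
    """Return the 32-bit A64 encoding of MRS X<rt>, <system register>."""
    if op0 not in (0b10, 0b11):
        raise ValueError("op0 must be 0b10 or 0b11 for a System register")
    for name, value, limit in (("rt", rt, 31), ("op1", op1, 7),
                               ("crn", crn, 15), ("crm", crm, 15),
                               ("op2", op2, 7)):
        if not 0 <= value <= limit:
            raise ValueError(name + " is out of range")
    # Bit [20] is fixed to 1 in the base opcode; only op0<0> is encoded.
    return (0xD5300000 | ((op0 & 1) << 19) | (op1 << 16) |
            (crn << 12) | (crm << 8) | (op2 << 5) | rt)
```

Using the CTR_EL0 row above (op0 == 0b11, op1 == 0b011, CRn == 0b0000, CRm == 0b0000, op2 == 0b001) with Rt == 0 gives 0xD53B0020.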
D7.2.22 DACR32_EL2, Domain Access Control Register 
The DACR32_EL2 characteristics are: 


Purpose 
Allows access to the AArch32 DACR register from AArch64 state only. Its value has no effect on 
execution in AArch64 state. 

Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register DACR32_EL2 is architecturally mapped to AArch32 System register 
DACR. 


If EL1 does not support AArch32, this register is UNDEFINED. 


If EL2 is not implemented but EL3 is implemented, and EL1 is capable of using AArch32, then this 
register is not RES0. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
DACR32_EL2 is a 32-bit register. 


Field descriptions 


The DACR32_EL2 bit assignments are: 


31:30  29:28  27:26  25:24  23:22  21:20  19:18  17:16  15:14  13:12  11:10  9:8  7:6  5:4  3:2  1:0
D15    D14    D13    D12    D11    D10    D9     D8     D7     D6     D5     D4   D3   D2   D1   D0


D<n>, bits [2n+1:2n], for n = 0 to 15 


Domain n access permission, where n = 0 to 15. Permitted values are: 


00 No access. Any access to the domain generates a Domain fault. 
01 Client. Accesses are checked against the permission bits in the translation tables. 
11 Manager. Accesses are not checked against the permission bits in the translation tables. 


The value 10 is reserved. 
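

Because each domain occupies the two bits [2n+1:2n], extracting the permission for domain n is a shift by 2n followed by a 2-bit mask. A minimal sketch, with a hypothetical function name and table:

```python
DOMAIN_ACCESS = {
    0b00: "No access",
    0b01: "Client",
    0b10: "Reserved",
    0b11: "Manager",
}


def domain_access(dacr, n):
    """Return the access permission name for domain n (0 to 15)."""
    if not 0 <= n <= 15:
        raise ValueError("domain number must be in the range 0 to 15")
    # D<n> occupies bits [2n+1:2n] of the 32-bit register value.
    return DOMAIN_ACCESS[(dacr >> (2 * n)) & 0b11]
```

For example, a register value of 0x55555555 places every domain in Client mode.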


Accessing the DACR32_EL2: 
To access the DACR32_EL2: 


MRS <Xt>, DACR32_EL2 ; Read DACR32_EL2 into Xt 
MSR DACR32_EL2, <Xt> ; Write Xt to DACR32_EL2 





D7-1926 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


D7 AArch64 System Register Descriptions 
D7.2 General system control registers 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0011  0000  000

D7.2.23 DCZID_EL0, Data Cache Zero ID register 
The DCZID_EL0 characteristics are: 
Purpose 
Indicates the block size that is written with byte values of 0 by the DC ZVA (Data Cache Zero by 
Address) system instruction. 
Usage constraints 
This register is accessible as follows: 
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RO   RO       RO      RO       RO             RO
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
There are no configuration notes. 
Attributes 
DCZID_EL0 is a 32-bit register. 
Field descriptions 
The DCZID_EL0 bit assignments are: 
31:5  4    3:0
RES0  DZP  BS
Bits [31:5] 
Reserved, RES0. 
DZP, bit [4] 
Data Zero prohibited. Permitted values are: 
0 DC ZVA instruction is permitted. 
1 DC ZVA instruction is prohibited. 
The value read from this field is governed by the access state and the values of the HCR_EL2.TDZ 
and SCTLR_EL1.DZE bits. 
BS, bits [3:0] 
Log2 of the block size in words. The maximum size supported is 2KB (value == 9). 
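
As a worked illustration of the BS encoding, the block size in bytes is 4 << BS, so BS == 9 gives 4 << 9 = 2048 bytes, matching the 2KB maximum stated above. The sketch below decodes both fields. The function name and result keys are hypothetical.

```python
def decode_dczid_el0(value):
    """Decode DZP and BS from a 32-bit DCZID_EL0 value."""
    dzp = (value >> 4) & 1  # DZP, bit [4]: 1 means DC ZVA is prohibited
    bs = value & 0xF        # BS, bits [3:0]: Log2 of the block size in words
    return {
        "zva_permitted": dzp == 0,
        "block_bytes": 4 << bs,  # 4-byte words, so bytes = 4 * 2**BS
    }
```

Software that zeroes memory with DC ZVA would typically check zva_permitted before relying on block_bytes, because the DZP value read depends on the access state and on HCR_EL2.TDZ and SCTLR_EL1.DZE.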
Accessing the DCZID_EL0: 
To access the DCZID_EL0: 
MRS <Xt>, DCZID_EL0 ; Read DCZID_EL0 into Xt 
Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   011  0000  0000  111

D7.2.24 ESR_EL1, Exception Syndrome Register (EL1) 
The ESR_EL1 characteristics are: 


Purpose 


Holds syndrome information for an exception taken to EL1. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


° If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to 
EL2. 


Configurations 
AArch64 System register ESR_EL1 is architecturally mapped to AArch32 System register DFSR. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
ESR_EL1 is a 32-bit register. 


Field descriptions 


See ESR_ELx. 


Accessing the ESR_EL1: 
To access the ESR_EL1: 


MRS <Xt>, ESR_EL1 ; Read ESR_EL1 into Xt 
MSR ESR_EL1, <Xt> ; Write Xt to ESR_EL1 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   000  0101  0010  000

D7.2.25 ESR_EL2, Exception Syndrome Register (EL2) 
The ESR_EL2 characteristics are: 
Purpose 
Holds syndrome information for an exception taken to EL2. 
Usage constraints 
This register is accessible as follows: 
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register ESR_EL2 is architecturally mapped to AArch32 System register HSR. 
If EL2 is not implemented, this register is RES0 from EL3. 
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
ESR_EL2 is a 32-bit register. 
Field descriptions 
See ESR_ELx. 
Accessing the ESR_EL2: 
To access the ESR_EL2: 
MRS <Xt>, ESR_EL2 ; Read ESR_EL2 into Xt 
MSR ESR_EL2, <Xt> ; Write Xt to ESR_EL2 
Register access is encoded as follows: 
op0  op1  CRn   CRm   op2
11   100  0101  0010  000

D7.2.26 ESR_EL3, Exception Syndrome Register (EL3) 


The ESR_EL3 characteristics are: 


Purpose 


Holds syndrome information for an exception taken to EL3. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       -        RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
ESR_EL3 is a 32-bit register. 


Field descriptions 


See ESR_ELx. 


Accessing the ESR_EL3: 
To access the ESR_EL3: 


MRS <Xt>, ESR_EL3 ; Read ESR_EL3 into Xt 
MSR ESR_EL3, <Xt> ; Write Xt to ESR_EL3 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   110  0101  0010  000

D7.2.27 ESR_ELx, Exception Syndrome Register (ELx) 
This page describes ESR_EL1, ESR_EL2, and ESR_EL3. 
The ESR_ELx characteristics are: 
Purpose 
Holds syndrome information for an exception taken to ELx. 
Usage constraints 
ESR_ELx is made UNKNOWN as a result of an exception return from ELx. 
When an UNPREDICTABLE instruction is treated as UNDEFINED, and the exception is taken to ELx, 
the value of ESR_ELx is UNKNOWN. The value written to ESR_ELx must be consistent with a value 
that could be created as a result of an exception from the same Exception level that generated the 
exception as a result of a situation that is not UNPREDICTABLE at that Exception level, in order to 
avoid the possibility of a privilege violation. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
If EL2 is not implemented, ESR_EL2 is RES0 from EL3. 
Attributes 
The ESR_ELx registers are 32-bit registers. 
Field descriptions 
The ESR_ELx bit assignments are: 
31:26  25  24:0
EC     IL  ISS
EC, bits [31:26] 
Exception Class. Indicates the reason for the exception that this register holds information about. 
For each EC value, the table references a subsection that gives information about: 
° The cause of the exception, for example the configuration required to enable the trap. 
° The encoding of the associated ISS. 
Possible values of the EC field are: 
EC == 000000 
Unknown reason. 
This value is valid for all described registers. 
See ISS encoding for exceptions with an unknown reason. 
EC == 000001 
Trapped WFI or WFE instruction execution. 
Conditional WFE and WFI instructions that fail their condition code check do not cause 
an exception. 
This value is valid for all described registers. 
See ISS encoding for an exception from a WFI or WFE instruction. 
EC == 000011 
Trapped MCR or MRC access with (coproc==1111) that is not reported using EC 
0b000000. 
This value is valid for all described registers. 


See ISS encoding for an exception from an MCR or MRC access. 


EC == 000100 


Trapped MCRR or MRRC access with (coproc==1111) that is not reported using EC 
0b000000. 


This value is valid for all described registers. 

See ISS encoding for an exception from an MCRR or MRRC access. 
EC == 000101 

Trapped MCR or MRC access with (coproc==1110). 

This value is valid for all described registers. 

See ISS encoding for an exception from an MCR or MRC access. 
EC == 000110 

Trapped LDC or STC access. 


The only architected uses of these instructions are: 
° An STC to write data to memory from DBGDTRRXint. 
° An LDC to read data from memory to DBGDTRTXint. 


This value is valid for all described registers. 


See ISS encoding for an exception from an LDC or STC instruction. 


EC == 000111 


Access to an Advanced SIMD or floating-point functionality trapped by 
CPACR_EL1.FPEN, CPTR_EL2.TFP, or CPTR_EL3.TFP control. 


Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is 
1, or because Advanced SIMD and floating-point are not implemented. These are 
reported with EC value 0b000000 as described in Reporting the EC encoding when an 
exception is routed to EL2 on page D1-1535. 


This value is valid for all described registers. 
See ISS encoding for an exception from an access to an Advanced SIMD or 
floating-point register, resulting from CPACR_EL1.FPEN or CPTR_ELx.TFP. 

EC == 001000 
Trapped VMRS access, from ID group trap, that is not reported using EC 0b000111. 
This value is valid for ESR_EL2. 


See ISS encoding for an exception from an MCR or MRC access. 


EC == 001100 
Trapped MRRC access with (coproc==1110). 
This value is valid for all described registers. 


See ISS encoding for an exception from an MCRR or MRRC access. 


EC == 001110 
Illegal Execution state. 
This value is valid for all described registers. 
See ISS encoding for an exception from an Illegal Execution state, or a PC or SP 
alignment fault. 
EC == 010001 
SVC instruction execution in AArch32 state. 


This is reported in ESR_EL2 only when the exception is generated because the value of 
HCR_EL2.TGE is 1. 


This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from HVC or SVC instruction execution. 
EC == 010010 

HVC instruction execution in AArch32 state, when HVC is not disabled. 

This value is valid for ESR_EL2. 
See ISS encoding for an exception from HVC or SVC instruction execution. 
EC == 010011 
SMC instruction execution in AArch32 state, when SMC is not disabled. 


This is reported in ESR_EL2 only when the exception is generated because the value of 
HCR_EL2.TSC is 1. 


This value is valid for ESR_EL2 and ESR_EL3. 


See ISS encoding for an exception from SMC instruction execution in AArch32 state. 


EC == 010101 
SVC instruction execution in AArch64 state. 
This value is valid for all described registers. 


See ISS encoding for an exception from HVC or SVC instruction execution. 


EC == 010110 
HVC instruction execution in AArch64 state, when HVC is not disabled. 
This value is valid for ESR_EL2 and ESR_EL3. 


See ISS encoding for an exception from HVC or SVC instruction execution. 


EC == 010111 
SMC instruction execution in AArch64 state, when SMC is not disabled. 


This is reported in ESR_EL2 only when the exception is generated because the value of 
HCR_EL2.TSC is 1. 


This value is valid for ESR_EL2 and ESR_EL3. 


See ISS encoding for an exception from SMC instruction execution in AArch64 state. 


EC == 011000 
Trapped MSR, MRS or System instruction execution in AArch64 state, that is not 
reported using EC 0b000000, 0b000001 or 0b000111. 


This includes all instructions that cause exceptions that are part of the encoding space 
defined in System instruction class encoding overview on page C5-271, except for those 
exceptions reported using EC values 0b000000, 0b000001, or 0b000111. 


This value is valid for all described registers. 
See ISS encoding for an exception from MSR, MRS, or System instruction execution in 
AArch64 state. 
EC == 011111 
IMPLEMENTATION DEFINED exception to EL3. 
This value is valid for ESR_EL3. 
See ISS encoding for an IMPLEMENTATION DEFINED exception to EL3. 


EC == 100000 
Instruction Abort from a lower Exception level. 


Used for MMU faults generated by instruction accesses and Synchronous external 
aborts, including synchronous parity or ECC errors. Not used for debug related 
exceptions. 


This value is valid for all described registers. 


See ISS encoding for an exception from an Instruction Abort. 


EC == 100001 
Instruction Abort taken without a change in Exception level. 


Used for MMU faults generated by instruction accesses and Synchronous external 
aborts, including synchronous parity or ECC errors. Not used for debug related 
exceptions. 


This value is valid for all described registers. 

See ISS encoding for an exception from an Instruction Abort. 
EC == 100010 

PC alignment fault exception. 


This value is valid for all described registers. 
See ISS encoding for an exception from an Illegal Execution state, or a PC or SP 
alignment fault. 


EC == 100100 
Data Abort from a lower Exception level. 


Used for MMU faults generated by data accesses, alignment faults other than those 
caused by the Stack Pointer misalignment, and Synchronous external aborts, including 
synchronous parity or ECC errors. Not used for debug related exceptions. 


This value is valid for all described registers. 


See ISS encoding for an exception from a Data Abort. 


EC == 100101 
Data Abort taken without a change in Exception level. 


Used for MMU faults generated by data accesses, alignment faults other than those 
caused by the Stack Pointer misalignment, and Synchronous external aborts, including 
synchronous parity or ECC errors. Not used for debug related exceptions. 


This value is valid for all described registers. 


See ISS encoding for an exception from a Data Abort. 


EC == 100110 
SP alignment fault exception. 
This value is valid for all described registers. 
See ISS encoding for an exception from an Illegal Execution state, or a PC or SP 
alignment fault. 
EC == 101000 
Trapped floating-point exception taken from AArch32 state. 
Whether this Exception class is supported is IMPLEMENTATION DEFINED. 
This value is valid for ESR_EL1 and ESR_EL2. 
See ISS encoding for an exception from a trapped floating-point exception. 
EC == 101100 
Trapped floating-point exception taken from AArch64 state. 
Whether this Exception class is supported is IMPLEMENTATION DEFINED. 
This value is valid for all described registers. 
See ISS encoding for an exception from a trapped floating-point exception. 
EC == 101111 
SError interrupt. 
This value is valid for all described registers. 


See ISS encoding for an SError interrupt. 


EC == 110000 

Breakpoint exception from a lower Exception level. 

This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from a Breakpoint or Vector Catch debug exception. 
EC == 110001 

Breakpoint exception taken without a change in Exception level. 

This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from a Breakpoint or Vector Catch debug exception. 
EC == 110010 

Software Step exception from a lower Exception level. 

This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from a Software Step exception. 
EC == 110011 

Software Step exception taken without a change in Exception level. 

This value is valid for ESR_EL1 and ESR_EL2. 
See ISS encoding for an exception from a Software Step exception. 
EC == 110100 

Watchpoint exception from a lower Exception level. 

This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from a Watchpoint exception. 
EC == 110101 

Watchpoint exception taken without a change in Exception level. 

This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from a Watchpoint exception. 
EC == 111000 

BKPT instruction execution in AArch32 state. 

This value is valid for ESR_EL1 and ESR_EL2. 

See ISS encoding for an exception from execution of a Breakpoint instruction. 
EC == 111010 

Vector Catch exception from AArch32 state. 


The only case where a Vector Catch exception is taken to an Exception level that is using 
AArch64 is when the exception is routed to EL2 and EL2 is using AArch64. 


This value is valid for ESR_EL2. 

See ISS encoding for an exception from a Breakpoint or Vector Catch debug exception. 
EC == 111100 

BRK instruction execution in AArch64 state. 

This is reported in ESR_EL3 only if a BRK instruction is executed at EL3. 

This value is valid for all described registers. 


See ISS encoding for an exception from execution of a Breakpoint instruction. 
All other EC values are reserved by ARM, and: 


° Unused values in the range 0b000000 - 0b101100 (0x00 - 0x2C) are reserved for future use for 
synchronous exceptions. 


° Unused values in the range 0b101101 - 0b111111 (0x2D - 0x3F) are reserved for future use, and 
might be used for synchronous or asynchronous exceptions. 


The effect of programming this field to a reserved value is that behavior is CONSTRAINED 
UNPREDICTABLE, as described in Reserved values in System and memory-mapped registers and 
translation table entries on page K1-5477. 





IL, bit [25] 
Instruction Length for synchronous exceptions. Possible values of this bit are: 
0 16-bit instruction trapped. 
1 32-bit instruction trapped. This value is also used when the exception is one of the 
following: 
° An SError interrupt. 
° An Instruction Abort exception. 
° A PC alignment fault exception. 
° An SP alignment fault exception. 
° A Data Abort exception for which the value of the ISV bit is 0. 
° An Illegal Execution state exception. 
° Any debug exception except for Breakpoint instruction exceptions. For 
Breakpoint instruction exceptions, this bit has its standard meaning: 
0 16-bit T32 BKPT instruction. 
1 32-bit A32 BKPT instruction or A64 BRK instruction. 
° An exception reported using EC value 0b000000. 
ISS, bits [24:0] 


Instruction Specific Syndrome. Architecturally, this field can be defined independently for each 
defined Exception class. However, in practice, some ISS encodings are used for more than one 
Exception class. 


Typically, an ISS encoding has a number of subfields. When an ISS subfield holds a register number, 
the value returned in that field is the AArch64 view of the register number. For an exception taken 
from AArch32 state, Mapping of the general-purpose registers between the Execution states on 
page D1-1608 defines this view of the specified AArch32 register. If the AArch32 register 
descriptor is 0b1111, then: 


° If the instruction that generated the exception was not UNPREDICTABLE, the field takes the 
value 0b11111. 


. If the instruction that generated the exception was UNPREDICTABLE, the field takes an 
UNKNOWN value that must be either: 


— The AArch64 view of the register number of a register that might have been used at 
the Exception level from which the exception was taken. 


— The value 0b11111. 


When the EC field is 0b000000, indicating an exception with an unknown reason, the ISS field is not 
valid, RES0. 
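

The three top-level ESR_ELx fields can be separated with shifts and masks before any ISS-specific decoding is attempted. The sketch below shows one way to do this. The function name is hypothetical, and the EC_NAMES table is an illustrative subset of the EC values listed above, not a complete list.

```python
EC_NAMES = {
    0b000000: "Unknown reason",
    0b000001: "Trapped WFI or WFE instruction execution",
    0b010101: "SVC instruction execution in AArch64 state",
    0b100100: "Data Abort from a lower Exception level",
    0b100101: "Data Abort taken without a change in Exception level",
    0b101111: "SError interrupt",
    0b111100: "BRK instruction execution in AArch64 state",
}


def split_esr(esr):
    """Return (EC, IL, ISS) extracted from a 32-bit ESR_ELx value."""
    ec = (esr >> 26) & 0x3F  # EC, bits [31:26]
    il = (esr >> 25) & 0x1   # IL, bit [25]
    iss = esr & 0x1FFFFFF    # ISS, bits [24:0]
    return ec, il, iss


def describe_ec(ec):
    """Return a name for an EC value, if it is in the illustrative subset."""
    return EC_NAMES.get(ec, "EC value not in this illustrative subset")
```

For example, the value 0x56000001 splits into EC == 0b010101, IL == 1, and ISS == 0x1. The ISS must then be interpreted using the format that the EC value selects.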


The following subsections describe each ISS format. 


ISS encoding for exceptions with an unknown reason 
This encoding is used by: 
° Unknown reason. 


The ISS encoding for these exceptions is: 


Bits [24:0] 


Reserved, RES0. 


This EC code is used for all exceptions that are not covered by any other EC value. This includes exceptions that 
are generated in the following situations: 


. The attempted execution of an instruction bit pattern that has no allocated instruction at the current Exception 
level and Security state, including: 


— A read access using a System register pattern that is not allocated for reads at the current Exception 
level and Security state. 


— A write access using a System register pattern that is not allocated for writes at the current Exception 
level and Security state. 


— Instruction encodings for instructions not implemented in the implementation. 





. In Debug state, the attempted execution of an instruction bit pattern that is unallocated in Debug state. 
. In Non-debug state, the attempted execution of an instruction bit pattern that is unallocated in Non-debug 
state. 
° In AArch32 state, attempted execution of a short vector floating-point instruction. 
° In an implementation that does not include Advanced SIMD and floating-point functionality, an attempted 
access to Advanced SIMD or floating-point functionality under conditions where that access would be 
permitted if that functionality was present. This includes the attempted execution of an Advanced SIMD or 
floating-point instruction, and attempted accesses to Advanced SIMD and floating-point System registers. 


° An exception generated because of the value of one of the SCTLR_EL1.{ITD, SED, CP15BEN} control bits. 


° Attempted execution of: 
— An HVC instruction when disabled by HCR_EL2.HCD or SCR_EL3.HCE. 
— An SMC instruction when disabled by SCR_EL3.SMD. 
— An HLT instruction when disabled by EDSCR.HDE. 


° Attempted execution of an MSR or MRS instruction to access SP_EL0 when the value of SPSel.SP is 0. 


° Attempted execution, in Debug state, of: 
— A DCPS1 instruction in Non-secure state from EL0 when the value of HCR_EL2.TGE is 1. 


— A DCPS2 instruction from EL1 or EL0 when the value of SCR_EL3.NS is 0, or when EL2 is not 
implemented. 


— A DCPS3 instruction when the value of EDSCR.SDD is 1, or when EL3 is not implemented. 


° When EL3 is using AArch64, attempted execution from Secure EL1 of an SRS instruction using R13_mon. 
See Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590. 


° In Debug state when the value of EDSCR.SDD is 1, the attempted execution at EL2, EL1, or EL0 of an 
instruction that is configured to trap to EL3. 


° In AArch32 state, the attempted execution of an MRS (Banked register) or an MSR (Banked register) 
instruction to SPSR_mon, SP_mon, or LR_mon. 


° An exception that is taken to EL2 because the value of HCR_EL2.TGE is 1 that, if the value of 
HCR_EL2.TGE was 0 would have been reported with an ESR_ELx.EC value of 0b000111. 


ISS encoding for an exception from a WFI or WFE instruction 
This encoding is used by: 
. Trapped WFI or WFE instruction execution. 
Conditional WFE and WFI instructions that fail their condition code check do not cause an exception. 


The ISS encoding for these exceptions is: 


24  23:20  19:1  0
CV  COND   RES0  TI


CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid. 
1 The COND field is valid. 
For exceptions taken from AArch64, CV is set to 1. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1. 
° When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or 
set to 0. See the description of the COND field for more information. 
COND, bits [23:20] 


The condition code for the trapped instruction. This field is valid only for exceptions taken from 
AArch32, and only when the value of CV is 1. 


For exceptions taken from AArch64, this field is set to 0b1110. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1 and: 


— If the instruction is conditional, COND is set to the condition code field value from the 
instruction. 


— If the instruction is unconditional, COND is set to 0b1110. 


° A conditional A32 instruction that is known to pass its condition code check can be presented 
either: 


— With COND set to 0b1110, the value for unconditional. 
— With the COND value held in the instruction. 
° When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether: 


— CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the 
SPSR.IT field to determine the condition, if any, of the T32 instruction. 


— CV is set to 1 and COND is set to the condition code for the condition that applied to 
the instruction. 


° For an implementation that, for both A32 and T32 instructions, takes an exception on a 
trapped conditional instruction only if the instruction passes its condition code check, these 
definitions mean that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND 
field is set to 0b1110, or to the value of any condition that applied to the instruction. 


Bits [19:1] 
Reserved, RES0. 
TI, bit [0] 
Trapped instruction. Possible values of this bit are: 
0 WFI trapped. 
1 WFE trapped. 


The following sections describe configuration settings for generating this exception: 
° Traps to EL1 of EL0 execution of WFE and WFI instructions on page D1-1565. 
° Traps to EL2 of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page D1-1581. 
° Traps to EL3 of EL2, EL1, and EL0 execution of WFE and WFI instructions on page D1-1591. 
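

The CV, COND, and TI fields of the WFI or WFE ISS encoding described above can be decoded as in the following sketch. The function name and result keys are hypothetical. The sketch reports COND as None when CV is 0, because in that case COND is UNKNOWN and software must examine the SPSR.IT field instead.

```python
def decode_wfx_iss(iss):
    """Decode the ISS of a trapped WFI or WFE instruction (EC 0b000001)."""
    cv = (iss >> 24) & 0x1    # CV, bit [24]
    cond = (iss >> 20) & 0xF  # COND, bits [23:20]
    ti = iss & 0x1            # TI, bit [0]
    return {
        "cond_valid": cv == 1,
        # COND is only meaningful when CV is 1.
        "cond": cond if cv == 1 else None,
        "instruction": "WFE" if ti == 1 else "WFI",
    }
```

For an exception taken from AArch64 state, CV is 1 and COND is 0b1110, so a trapped WFI has the ISS value 0x1E00000 and a trapped WFE has the ISS value 0x1E00001.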


ISS encoding for an exception from an MCR or MRC access 

This encoding is used by: 

° Trapped MCR or MRC access with (coproc==1111) that is not reported using EC 0b000000. 
° Trapped MCR or MRC access with (coproc==1110). 

° Trapped VMRS access, from ID group trap, that is not reported using EC 0b000111. 


The ISS encoding for these exceptions is: 
24  23:20  19:17  16:14  13:10  9:5  4:1  0
CV  COND   Opc2   Opc1   CRn    Rt   CRm  Direction


CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid. 
1 The COND field is valid. 


For exceptions taken from AArch64, CV is set to 1. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1. 
. When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or 
set to 0. See the description of the COND field for more information. 
COND, bits [23:20] 


The condition code for the trapped instruction. This field is valid only for exceptions taken from 
AArch32, and only when the value of CV is 1. 


For exceptions taken from AArch64, this field is set to 0b1110. 
For exceptions taken from AArch32: 
. When an A32 instruction is trapped, CV is set to 1 and: 


— If the instruction is conditional, COND is set to the condition code field value from the 
instruction. 


— If the instruction is unconditional, COND is set to 0b1110. 


° A conditional A32 instruction that is known to pass its condition code check can be presented 
either: 


— With COND set to 0b1110, the value for unconditional. 
— With the COND value held in the instruction. 
° When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether: 


— CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the 
SPSR.IT field to determine the condition, if any, of the T32 instruction. 


— CV is set to 1 and COND is set to the condition code for the condition that applied to 
the instruction. 


° For an implementation that, for both A32 and T32 instructions, takes an exception on a 
trapped conditional instruction only if the instruction passes its condition code check, these 
definitions mean that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND 
field is set to 0b1110, or to the value of any condition that applied to the instruction. 

Opc2, bits [19:17] 
The Opc2 value from the issued instruction. 


For a trapped VMRS access, holds the value 0b000. 


Opc1, bits [16:14]
The Opc1 value from the issued instruction.
For a trapped VMRS access, holds the value 0b111.

CRn, bits [13:10]
The CRn value from the issued instruction.
For a trapped VMRS access, holds the reg field from the VMRS instruction encoding.


Rt, bits [9:5] 


The Rt value from the issued instruction, the general-purpose register used for the transfer. The 
reported value gives the AArch64 view of the register. See Mapping of the general-purpose 
registers between the Execution states on page D1-1608. 


CRm, bits [4:1]
The CRm value from the issued instruction.
For a trapped VMRS access, holds the value 0b0000.

Direction, bit [0]
Indicates the direction of the trapped instruction. The possible values of this bit are:
0 Write to System register space. MCR instruction.
1 Read from System register space. MRC or VMRS instruction.
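
The field positions listed above can be unpacked with shifts and masks. The following C fragment is a minimal sketch, not part of the architecture: the structure and function names are hypothetical, and it assumes the caller has already extracted ISS[24:0] from ESR_ELx.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields of an MCR/MRC ISS value. */
struct mcr_mrc_iss {
    bool    cv;      /* bit [24]: COND field is valid            */
    uint8_t cond;    /* bits [23:20]                             */
    uint8_t opc2;    /* bits [19:17]                             */
    uint8_t opc1;    /* bits [16:14]                             */
    uint8_t crn;     /* bits [13:10]                             */
    uint8_t rt;      /* bits [9:5], AArch64 view of the register */
    uint8_t crm;     /* bits [4:1]                               */
    bool    is_read; /* bit [0]: true for MRC or VMRS            */
};

/* Split ISS[24:0] into its fields. */
static inline struct mcr_mrc_iss decode_mcr_mrc_iss(uint32_t iss)
{
    struct mcr_mrc_iss f;

    f.cv      = ((iss >> 24) & 0x1u) != 0;
    f.cond    = (uint8_t)((iss >> 20) & 0xFu);
    f.opc2    = (uint8_t)((iss >> 17) & 0x7u);
    f.opc1    = (uint8_t)((iss >> 14) & 0x7u);
    f.crn     = (uint8_t)((iss >> 10) & 0xFu);
    f.rt      = (uint8_t)((iss >> 5) & 0x1Fu);
    f.crm     = (uint8_t)((iss >> 1) & 0xFu);
    f.is_read = (iss & 0x1u) != 0;
    return f;
}
```

For a trapped VMRS access, a decoder built this way would be expected to see opc2 equal to 0b000, opc1 equal to 0b111, crm equal to 0b0000, and the VMRS reg field in crn.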


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000011: 


• Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569.
• Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570.
• Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers on page D1-1573.
• Traps to EL2 of Non-secure EL1 execution of TLB maintenance instructions on page D1-1574.
• Traps to EL2 of Non-secure EL0 and EL1 execution of cache maintenance instructions on page D1-1575.
• Traps to EL2 of Non-secure EL1 accesses to the Auxiliary Control Register on page D1-1576.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on
page D1-1577.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578.
• Trapping to EL2 of Non-secure EL1 accesses to the CPACR_EL1 or CPACR on page D1-1582.
• General trapping to EL2 of Non-secure EL0 and EL1 accesses to System registers, from AArch32 state only
on page D1-1584.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers on page D1-1587.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers on page D1-1588.
• Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590.
• Trapping to EL3 of EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the
CPACR_EL1 or CPACR on page D1-1593.
• Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on page D1-1597.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000101: 


• Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567.
• Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578, for trapped accesses
to the JIDR.
• Traps to EL2 of Non-secure System register accesses to the trace registers on page D1-1583.
• Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
• Trapping System register accesses to powerdown debug registers to EL2 on page D1-1586.
• Trapping general System register accesses to debug registers to EL2 on page D1-1586.
• Traps to EL3 of all System register accesses to the trace registers on page D1-1594.
• Trapping System register accesses to powerdown debug registers to EL3 on page D1-1595.
• Trapping general System register accesses to debug registers to EL3 on page D1-1596.


Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578 describes configuration
settings for generating exceptions that are reported using EC value 0b001000.


ISS encoding for an exception from an MCRR or MRRC access 


This encoding is used by: 


• Trapped MCRR or MRRC access with (coproc==1111) that is not reported using EC 0b000000.
• Trapped MRRC access with (coproc==1110).

The ISS encoding for these exceptions is:

    Bits:   24 | 23:20 | 19:16 | 15   | 14:10 | 9:5 | 4:1 | 0
    Field:  CV | COND  | Opc1  | RES0 | Rt2   | Rt  | CRm | Direction

CV, bit [24]
Condition code valid. Possible values of this bit are:

0 The COND field is not valid. 

1 The COND field is valid. 

For exceptions taken from AArch64, CV is set to 1. 

For exceptions taken from AArch32: 

° When an A32 instruction is trapped, CV is set to 1. 


• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or
set to 0. See the description of the COND field for more information.


COND, bits [23:20] 


The condition code for the trapped instruction. This field is valid only for exceptions taken from 
AArch32, and only when the value of CV is 1. 


For exceptions taken from AArch64, this field is set to 0b1110. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1 and: 


— If the instruction is conditional, COND is set to the condition code field value from the 
instruction. 


— If the instruction is unconditional, COND is set to 0b1110. 


• A conditional A32 instruction that is known to pass its condition code check can be presented
either: 


— With COND set to 0b1110, the value for unconditional. 
— With the COND value held in the instruction. 







• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:
— CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the
SPSR.IT field to determine the condition, if any, of the T32 instruction.
— CV is set to 1 and COND is set to the condition code for the condition that applied to
the instruction.
• For an implementation that, for both A32 and T32 instructions, takes an exception on a
trapped conditional instruction only if the instruction passes its condition code check, these
definitions mean that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND
field is set to 0b1110, or to the value of any condition that applied to the instruction.


Opc1, bits [19:16]
The Opc1 value from the issued instruction.

Bit [15]
Reserved, RES0.


Rt2, bits [14:10] 


The Rt2 value from the issued instruction, the second general-purpose register used for the transfer. 
The reported value gives the AArch64 view of the register. See Mapping of the general-purpose 
registers between the Execution states on page D1-1608. 


Rt, bits [9:5] 


The Rt value from the issued instruction, the first general-purpose register used for the transfer. The 
reported value gives the AArch64 view of the register. See Mapping of the general-purpose 
registers between the Execution states on page D1-1608. 


CRm, bits [4:1]
The CRm value from the issued instruction.

Direction, bit [0]
Indicates the direction of the trapped instruction. The possible values of this bit are:
0 Write to System register space. MCRR instruction.
1 Read from System register space. MRRC instruction.
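
As an illustration of the MCRR/MRRC layout, the sketch below extracts each field by its [hi:lo] bit range, so the code reads the same way as the field headings. The helper `iss_bits`, the structure, and the decoder name are hypothetical, chosen only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return ISS bits [hi:lo].
 * Assumes hi >= lo and a field narrower than 32 bits. */
static inline uint32_t iss_bits(uint32_t iss, unsigned hi, unsigned lo)
{
    return (iss >> lo) & ((1u << (hi - lo + 1u)) - 1u);
}

/* Hypothetical container for the register-transfer fields. */
struct mcrr_mrrc_iss {
    uint8_t opc1;    /* bits [19:16]                             */
    uint8_t rt2;     /* bits [14:10], second transfer register   */
    uint8_t rt;      /* bits [9:5], first transfer register      */
    uint8_t crm;     /* bits [4:1]                               */
    bool    is_read; /* bit [0]: true for MRRC, false for MCRR   */
};

static inline struct mcrr_mrrc_iss decode_mcrr_mrrc_iss(uint32_t iss)
{
    struct mcrr_mrrc_iss f;

    f.opc1    = (uint8_t)iss_bits(iss, 19, 16);
    /* Bit [15] is RES0 and is deliberately ignored. */
    f.rt2     = (uint8_t)iss_bits(iss, 14, 10);
    f.rt      = (uint8_t)iss_bits(iss, 9, 5);
    f.crm     = (uint8_t)iss_bits(iss, 4, 1);
    f.is_read = iss_bits(iss, 0, 0) != 0;
    return f;
}
```

CV and COND occupy the same positions as in the MCR/MRC encoding, so a decoder for those two fields can be shared between the encodings.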


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b000100: 


• Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569.
• Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570.
• Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers on page D1-1573.
• General trapping to EL2 of Non-secure EL0 and EL1 accesses to System registers, from AArch32 state only
on page D1-1584.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers on page D1-1587.
• Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers on page D1-1588.
• Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on page D1-1597.


The following sections describe configuration settings for generating exceptions that are reported using EC value 
0b001100: 


• Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567.
• Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
• Traps to EL2 of Non-secure System register accesses to the trace registers on page D1-1583.
• Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
• Traps to EL3 of all System register accesses to the trace registers on page D1-1594.
• Trapping System register accesses to powerdown debug registers to EL2 on page D1-1586.


ISS encoding for an exception from an LDC or STC instruction 
This encoding is used by: 


° Trapped LDC or STC access. 
The only architected uses of these instructions are:
— An STC to write data to memory from DBGDTRRXint.
— An LDC to read data from memory to DBGDTRTXint.

The ISS encoding for these exceptions is:

    Bits:   24 | 23:20 | 19:12 | 11:10 | 9:5 | 4      | 3:1 | 0
    Field:  CV | COND  | imm8  | RES0  | Rn  | Offset | AM  | Direction

CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid. 
1 The COND field is valid. 
For exceptions taken from AArch64, CV is set to 1. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1. 
• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or
set to 0. See the description of the COND field for more information.


COND, bits [23:20] 


The condition code for the trapped instruction. This field is valid only for exceptions taken from 
AArch32, and only when the value of CV is 1. 


For exceptions taken from AArch64, this field is set to 0b1110. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1 and: 


— If the instruction is conditional, COND is set to the condition code field value from the 
instruction. 


— If the instruction is unconditional, COND is set to 0b1110.
• A conditional A32 instruction that is known to pass its condition code check can be presented
either:
— With COND set to 0b1110, the value for unconditional.
— With the COND value held in the instruction.
• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:
— CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the
SPSR.IT field to determine the condition, if any, of the T32 instruction.
— CV is set to 1 and COND is set to the condition code for the condition that applied to
the instruction.
• For an implementation that, for both A32 and T32 instructions, takes an exception on a
trapped conditional instruction only if the instruction passes its condition code check, these
definitions mean that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND
field is set to 0b1110, or to the value of any condition that applied to the instruction.


imm8, bits [19:12]
The immediate value from the issued instruction.

Bits [11:10]
Reserved, RES0.

Rn, bits [9:5]
The Rn value from the issued instruction, the general-purpose register used for the transfer. The
reported value gives the AArch64 view of the register. See Mapping of the general-purpose
registers between the Execution states on page D1-1608.
This field is valid only when AM[2] is 0, indicating an immediate form of the LDC or STC
instruction. When AM[2] is 1, indicating a literal form of the LDC or STC instruction, this field is
UNKNOWN.

Offset, bit [4]
Indicates whether the offset is added or subtracted:
0 Subtract offset.
1 Add offset.
This bit corresponds to the U bit in the instruction encoding.

AM, bits [3:1]
Addressing mode. The permitted values of this field are:
000 Immediate unindexed.
001 Immediate post-indexed.
010 Immediate offset.
011 Immediate pre-indexed.
100 Literal unindexed.
    LDC instruction in A32 instruction set only.
    For a trapped STC instruction or a trapped T32 LDC instruction this encoding is
    reserved.
110 Literal offset.
    LDC instruction only.
    For a trapped STC instruction, this encoding is reserved.
The values 0b101 and 0b111 are reserved. The effect of programming this field to a reserved value is
that behavior is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and
memory-mapped registers and translation table entries on page K1-5492.
Bit [2] in this subfield indicates the instruction form, immediate or literal.
Bits [1:0] in this subfield correspond to the bits {P, W} in the instruction encoding.


Direction, bit [0] 


Indicates the direction of the trapped instruction. The possible values of this bit are: 





0 Write to memory. STC instruction.
1 Read from memory. LDC instruction.

The following sections describe the configuration settings for the traps that are reported using EC value 0b000110:
• Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on page D1-1568.
• Trapping general System register accesses to debug registers to EL2 on page D1-1586.
• Trapping general System register accesses to debug registers to EL3 on page D1-1596.
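
The interaction between the AM, Rn, and Direction fields of the LDC or STC encoding is easier to follow in code. The C fragment below is a minimal sketch with hypothetical names. It flags the reserved AM values 0b101 and 0b111, and also the literal forms when the trapped instruction is an STC. It cannot detect the reserved case of a T32 LDC with AM equal to 0b100, because the instruction set is not part of the fields decoded here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields of an LDC/STC ISS value. */
struct ldc_stc_iss {
    uint8_t imm8;        /* bits [19:12]                                 */
    uint8_t rn;          /* bits [9:5], meaningful only if rn_valid      */
    bool    rn_valid;    /* AM[2] == 0, immediate form                   */
    bool    add_offset;  /* bit [4], the U bit of the instruction        */
    uint8_t am;          /* bits [3:1]                                   */
    bool    literal;     /* AM[2]                                        */
    bool    p;           /* AM[1], the P bit of the instruction          */
    bool    w;           /* AM[0], the W bit of the instruction          */
    bool    is_load;     /* bit [0]: true for LDC, false for STC         */
    bool    am_reserved; /* AM holds a reserved encoding                 */
};

static inline struct ldc_stc_iss decode_ldc_stc_iss(uint32_t iss)
{
    struct ldc_stc_iss f;

    f.imm8       = (uint8_t)((iss >> 12) & 0xFFu);
    f.rn         = (uint8_t)((iss >> 5) & 0x1Fu);
    f.add_offset = ((iss >> 4) & 0x1u) != 0;
    f.am         = (uint8_t)((iss >> 1) & 0x7u);
    f.literal    = (f.am & 0x4u) != 0;
    f.p          = (f.am & 0x2u) != 0;
    f.w          = (f.am & 0x1u) != 0;
    f.is_load    = (iss & 0x1u) != 0;

    /* Rn is UNKNOWN for the literal forms. */
    f.rn_valid = !f.literal;

    /* 0b101 and 0b111 are always reserved; the literal forms 0b100 and
     * 0b110 are reserved for a trapped STC. */
    f.am_reserved = (f.am == 0x5u) || (f.am == 0x7u) ||
                    (f.literal && !f.is_load);
    return f;
}
```

A handler using a decoder like this would typically check am_reserved first and treat rn as meaningful only when rn_valid is true.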


ISS encoding for an exception from an access to an Advanced SIMD or floating-point register, 
resulting from CPACR_EL1.FPEN or CPTR_ELx.TFP 


This encoding is used by: 


° Access to an Advanced SIMD or floating-point functionality trapped by CPACR_EL1.FPEN, 
CPTR_EL2.TFP, or CPTR_EL3.TFP control. 


Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is 1, or because 
Advanced SIMD and floating-point are not implemented. These are reported with EC value 0b000000 as 
described in Reporting the EC encoding when an exception is routed to EL2 on page D1-1535. 


The ISS encoding for these exceptions is:

    Bits:   24 | 23:20 | 19:0
    Field:  CV | COND  | RES0



The accesses covered by this trap include: 
° Execution of Advanced SIMD and floating-point instructions. 
° Accesses to the Advanced SIMD and floating-point System registers. 


For an implementation that does not include floating-point or Advanced SIMD, the exception is reported using the 
EC value 0b000000. 


CV, bit [24] 
Condition code valid. Possible values of this bit are:
0 The COND field is not valid.
1 The COND field is valid.

For exceptions taken from AArch64, CV is set to 1.
For exceptions taken from AArch32:
• When an A32 instruction is trapped, CV is set to 1.
• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or
set to 0. See the description of the COND field for more information.

COND, bits [23:20] 


The condition code for the trapped instruction. This field is valid only for exceptions taken from 
AArch32, and only when the value of CV is 1. 


For exceptions taken from AArch64, this field is set to 0b1110. 
For exceptions taken from AArch32: 
• When an A32 instruction is trapped, CV is set to 1 and:
— If the instruction is conditional, COND is set to the condition code field value from the
instruction.
— If the instruction is unconditional, COND is set to 0b1110.
• A conditional A32 instruction that is known to pass its condition code check can be presented
either:
— With COND set to 0b1110, the value for unconditional.
— With the COND value held in the instruction.
• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:
— CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the
SPSR.IT field to determine the condition, if any, of the T32 instruction.
— CV is set to 1 and COND is set to the condition code for the condition that applied to
the instruction.
• For an implementation that, for both A32 and T32 instructions, takes an exception on a
trapped conditional instruction only if the instruction passes its condition code check, these
definitions mean that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND
field is set to 0b1110, or to the value of any condition that applied to the instruction.


Bits [19:0] 


Reserved, RES0.
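
The CV and COND rules are the same in every encoding that carries these two fields, so software that emulates a trapped AArch32 instruction can recover the condition in one place. The following C fragment is a minimal sketch, not part of the architecture. The function name is hypothetical, and the sketch assumes the SPSR_ELx format for exceptions taken from AArch32, with IT[7:2] in bits [15:10] and IT[1:0] in bits [26:25].

```c
#include <stdint.h>

/* Hypothetical helper: return the 4-bit condition that applied to a
 * trapped AArch32 instruction, or 0xE (always) if it was unconditional.
 *   iss  - ISS[24:0] from ESR_ELx
 *   spsr - SPSR_ELx saved on exception entry (assumed AArch32 format) */
static inline uint32_t trapped_cond(uint32_t iss, uint32_t spsr)
{
    uint32_t cv = (iss >> 24) & 0x1u;

    if (cv) {
        /* CV == 1: COND, bits [23:20], is valid. */
        return (iss >> 20) & 0xFu;
    }

    /* CV == 0 is possible only for T32: COND is UNKNOWN, so rebuild
     * ITSTATE from SPSR.IT. IT[7:2] is SPSR[15:10], IT[1:0] is SPSR[26:25]. */
    uint32_t it = ((spsr >> 8) & 0xFCu) | ((spsr >> 25) & 0x3u);

    if ((it & 0xFu) == 0) {
        /* Not inside an IT block: the instruction is unconditional. */
        return 0xEu;
    }

    /* Inside an IT block, the current condition is IT[7:4]. */
    return it >> 4;
}
```

A caller would then evaluate the returned condition against the N, Z, C, and V flags held in the same saved SPSR before emulating the instruction.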

The following sections describe the configuration settings for the traps that are reported using EC value 0b000111:
• Traps to EL1 of EL0 and EL1 accesses to SIMD and floating-point functionality on page D1-1568.
• General trapping to EL2 of Non-secure accesses to the SIMD and floating-point registers on page D1-1583.
• Traps to EL3 of all System register accesses to the trace registers on page D1-1594.


ISS encoding for an exception from an Illegal Execution state, or a PC or SP alignment fault 


This encoding is used by: 

• Illegal Execution state.
• PC alignment fault exception.
• SP alignment fault exception.

The ISS encoding for these exceptions is:

    Bits:   24:0
    Field:  RES0

Bits [24:0]
Reserved, RES0.


There are no configuration settings for generating Illegal Execution state exceptions and PC alignment fault 
exceptions. SP alignment checking on page D1-1515 describes the configuration settings for generating SP 
alignment fault exceptions. 


ISS encoding for an exception from HVC or SVC instruction execution 
This encoding is used by: 


° SVC instruction execution in AArch32 state. 


This is reported in ESR_EL2 only when the exception is generated because the value of HCR_EL2.TGE is 1. 





• HVC instruction execution in AArch32 state, when HVC is not disabled.
• SVC instruction execution in AArch64 state.
• HVC instruction execution in AArch64 state, when HVC is not disabled.

The ISS encoding for these exceptions is:

    Bits:   24:16 | 15:0
    Field:  RES0  | imm16

Bits [24:16]
Reserved, RES0.

imm16, bits [15:0]
The value of the immediate field from the HVC or SVC instruction.


For an HVC instruction, and for an A64 SVC instruction, this is the value of the imm16 field of the 
issued instruction. 


For an A32 or T32 SVC instruction: 
• If the instruction is unconditional, then:


— For the T32 instruction, this field is zero-extended from the imm8 field of the 
instruction. 


— For the A32 instruction, this field is the bottom 16 bits of the imm24 field of the 
instruction. 


° If the instruction is conditional, this field is UNKNOWN. 
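
The two AArch32 cases for an unconditional SVC can be written out as follows. This is an illustrative sketch only; the function names are hypothetical, and the code models the value reported in imm16 rather than any hardware behavior.

```c
#include <stdint.h>

/* imm16 reported for an unconditional T32 SVC: the 8-bit immediate,
 * zero-extended to 16 bits. */
static inline uint16_t svc_iss_imm16_t32(uint8_t imm8)
{
    return (uint16_t)imm8;
}

/* imm16 reported for an unconditional A32 SVC: the bottom 16 bits of
 * the 24-bit immediate. The top 8 bits of imm24 are not reported. */
static inline uint16_t svc_iss_imm16_a32(uint32_t imm24)
{
    return (uint16_t)(imm24 & 0xFFFFu);
}
```

For example, an A32 SVC with an imm24 of 0x123456 would be reported with imm16 equal to 0x3456, so software that relies on the upper 8 bits must read them from the instruction itself.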


In AArch32 state, the HVC instruction is unconditional, and a conditional SVC instruction generates an exception 
only if it passes its condition code check. Therefore, the syndrome information for these exceptions does not require 
conditionality information. 


For T32 and A32 instructions, see SVC, and HVC. 


For A64 instructions, see SVC and HVC. 


ISS encoding for an exception from SMC instruction execution in AArch32 state 


This encoding is used by: 


° SMC instruction execution in AArch32 state, when SMC is not disabled. 
This is reported in ESR_EL2 only when the exception is generated because the value of HCR_EL2.TSC is 1. 


The ISS encoding for these exceptions is:

    Bits:   24 | 23:20 | 19          | 18:0
    Field:  CV | COND  | CCKNOWNPASS | RES0

For an SMC instruction that completes normally and generates an exception that is taken to EL3, the ISS encoding
is RES0.
For an SMC instruction that is trapped to EL2 from Non-secure EL1 because HCR_EL2.TSC is 1, the ISS encoding
is as shown in the layout above.


CV, bit [24] 
Condition code valid. Possible values of this bit are:
0 The COND field is not valid.
1 The COND field is valid.

For exceptions taken from AArch64, CV is set to 1.
For exceptions taken from AArch32:
• When an A32 instruction is trapped, CV is set to 1.
• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or
set to 0. See the description of the COND field for more information.

This field is only valid if CCKNOWNPASS is 1, otherwise it is RES0.


COND, bits [23:20] 


The condition code for the trapped instruction. This field is valid only for exceptions taken from 
AArch32, and only when the value of CV is 1. 


For exceptions taken from AArch64, this field is set to 0b1110. 
For exceptions taken from AArch32: 
° When an A32 instruction is trapped, CV is set to 1 and: 


— If the instruction is conditional, COND is set to the condition code field value from the 
instruction. 


— If the instruction is unconditional, COND is set to 0b1110. 


° A conditional A32 instruction that is known to pass its condition code check can be presented 
either: 


— With COND set to 0b1110, the value for unconditional. 
— With the COND value held in the instruction. 
• When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:
— CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the
SPSR.IT field to determine the condition, if any, of the T32 instruction.
— CV is set to 1 and COND is set to the condition code for the condition that applied to
the instruction.
• For an implementation that, for both A32 and T32 instructions, takes an exception on a
trapped conditional instruction only if the instruction passes its condition code check, these
definitions mean that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND
field is set to 0b1110, or to the value of any condition that applied to the instruction.

This field is only valid if CCKNOWNPASS is 1, otherwise it is RES0.

CCKNOWNPASS, bit [19]
Indicates whether the instruction might have failed its condition code check.
0 The instruction was unconditional, or was conditional and passed its condition code
check.
1 The instruction was conditional, and might have failed its condition code check.

Note
In an implementation in which an SMC instruction that fails its condition code check is not trapped, this field
can always return the value 0.
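
Because CV and COND are meaningful only when CCKNOWNPASS is 1, a handler for a trapped AArch32 SMC has three cases to distinguish. The C fragment below is a minimal sketch of that decision; the enumeration and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical classification of where the condition must come from. */
enum smc32_cond_source {
    SMC32_KNOWN_PASS,  /* CCKNOWNPASS == 0: no condition check needed    */
    SMC32_USE_COND,    /* CCKNOWNPASS == 1, CV == 1: use COND            */
    SMC32_USE_SPSR_IT  /* CCKNOWNPASS == 1, CV == 0: examine SPSR.IT     */
};

/* Classify ISS[24:0] of a trapped AArch32 SMC. *cond is written only
 * when SMC32_USE_COND is returned. */
static inline enum smc32_cond_source smc32_classify(uint32_t iss,
                                                    uint8_t *cond)
{
    if (((iss >> 19) & 0x1u) == 0) {
        /* Unconditional, or conditional and passed. CV and COND are RES0. */
        return SMC32_KNOWN_PASS;
    }
    if ((iss >> 24) & 0x1u) {
        *cond = (uint8_t)((iss >> 20) & 0xFu);
        return SMC32_USE_COND;
    }
    return SMC32_USE_SPSR_IT;
}
```

In the two cases other than SMC32_KNOWN_PASS, the handler still has to evaluate the condition against the saved flags, because the instruction might have failed its condition code check.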





Bits [18:0] 


Reserved, RES0.

Traps to EL2 of Non-secure EL1 execution of SMC instructions on page D1-1578 describes the configuration
settings for trapping SMC instructions from Non-secure EL1 modes, and System calls on page D1-1598 describes
the case where these exceptions are trapped to EL3.


ISS encoding for an exception from SMC instruction execution in AArch64 state 
This encoding is used by: 


° SMC instruction execution in AArch64 state, when SMC is not disabled. 


This is reported in ESR_EL2 only when the exception is generated because the value of HCR_EL2.TSC is 1. 


The ISS encoding for these exceptions is:

    Bits:   24:16 | 15:0
    Field:  RES0  | imm16

Bits [24:16]
Reserved, RES0.

imm16, bits [15:0]
The value of the immediate field from the issued SMC instruction.


The value of ISS[24:0] described here is used both: 


° When an SMC instruction is trapped from Non-secure EL1 modes. 
° When an SMC instruction is not trapped, so completes normally and generates an exception that is taken to 
EL3. 


Traps to EL2 of Non-secure EL1 execution of SMC instructions on page D1-1578 describes the configuration
settings for trapping SMC instructions from Non-secure EL1 modes, and System calls on page D1-1598 describes 
the case where these exceptions are trapped to EL3. 


ISS encoding for an exception from MSR, MRS, or System instruction execution in AArch64 state 
This encoding is used by: 


° Trapped MSR, MRS or System instruction execution in AArch64 state, that is not reported using EC 
0b000000, 0b000001 or 0b000111. 


This includes all instructions that cause exceptions that are part of the encoding space defined in System
instruction class encoding overview on page C5-271, except for those exceptions reported using EC values
0b000000, 0b000001, or 0b000111.


The ISS encoding for these exceptions is:

    Bits:   24:22 | 21:20 | 19:17 | 16:14 | 13:10 | 9:5 | 4:1 | 0
    Field:  RES0  | Op0   | Op2   | Op1   | CRn   | Rt  | CRm | Direction

Bits [24:22]
Reserved, RES0.

Op0, bits [21:20]
The Op0 value from the issued instruction.

Op2, bits [19:17]
The Op2 value from the issued instruction.

Op1, bits [16:14]
The Op1 value from the issued instruction.

CRn, bits [13:10]
The CRn value from the issued instruction.

Rt, bits [9:5]
The Rt value from the issued instruction, the general-purpose register used for the transfer.

CRm, bits [4:1]
The CRm value from the issued instruction.


Direction, bit [0] 
Indicates the direction of the trapped instruction. The possible values of this bit are: 
0 Write access, including MSR instructions.


1 Read access, including MRS instructions. 
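
A common way to use this encoding is to pack the five register-identifying fields into a single key and compare it with the key of a known register. The sketch below shows one possible packing. The macro `SYSREG_KEY`, its bit layout, and the function names are hypothetical choices for this example and are not defined by the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packing of (Op0, Op1, CRn, CRm, Op2) into one value. */
#define SYSREG_KEY(op0, op1, crn, crm, op2)                  \
    (((uint32_t)(op0) << 14) | ((uint32_t)(op1) << 11) |     \
     ((uint32_t)(crn) << 7)  | ((uint32_t)(crm) << 3)  |     \
     (uint32_t)(op2))

/* Build the key from ISS[24:0] of a trapped MSR, MRS, or System
 * instruction. Note that Op2 sits above Op1 in the ISS. */
static inline uint32_t iss_sysreg_key(uint32_t iss)
{
    uint32_t op0 = (iss >> 20) & 0x3u;
    uint32_t op2 = (iss >> 17) & 0x7u;
    uint32_t op1 = (iss >> 14) & 0x7u;
    uint32_t crn = (iss >> 10) & 0xFu;
    uint32_t crm = (iss >> 1) & 0xFu;

    return SYSREG_KEY(op0, op1, crn, crm, op2);
}

/* Rt, bits [9:5]: the general-purpose register used for the transfer. */
static inline unsigned iss_sysreg_rt(uint32_t iss)
{
    return (iss >> 5) & 0x1Fu;
}

/* Direction, bit [0]: true for a read access such as MRS. */
static inline bool iss_sysreg_is_read(uint32_t iss)
{
    return (iss & 0x1u) != 0;
}
```

With this packing, a trapped read of SCTLR_EL1 (Op0 3, Op1 0, CRn 1, CRm 0, Op2 0) would compare equal to `SYSREG_KEY(3, 0, 1, 0, 0)`, which lets a handler dispatch with a switch statement on the key.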


For exceptions caused by System instructions, see System on page C4-199 for the encoding values returned by an 
instruction. 


The following sections describe configuration settings for generating the exception that is reported using EC value 
0b011000: 
• In EL1 configurable controls on page D1-1563.
— Traps to EL1 of EL0 execution of cache maintenance instructions on page D1-1564.
— Traps to EL1 of EL0 accesses to the CTR_EL0 on page D1-1565.
— Traps to EL1 of EL0 execution of DC ZVA instructions on page D1-1566.
— Traps to EL1 of EL0 accesses to the PSTATE.{D, A, I, F} interrupt masks on page D1-1566.
— Traps to EL1 of EL0 and EL1 System register accesses to the trace registers on page D1-1567.
— Traps to EL1 of EL0 accesses to the Debug Communications Channel (DCC) registers on
page D1-1568.
— Traps to EL1 of EL0 accesses to the Generic Timer registers on page D1-1569.
— Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570.
• In EL2 configurable controls on page D1-1571.
— Traps to EL2 of Non-secure EL1 accesses to virtual memory control registers on page D1-1573.
— Traps to EL2 of Non-secure EL0 and EL1 execution of DC ZVA instructions on page D1-1574.
— Traps to EL2 of Non-secure EL1 execution of TLB maintenance instructions on page D1-1574.
— Traps to EL2 of Non-secure EL0 and EL1 execution of cache maintenance instructions on
page D1-1575.
— Traps to EL2 of Non-secure EL1 accesses to the Auxiliary Control Register on page D1-1576.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on
page D1-1577.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to the ID registers on page D1-1578.
— Trapping to EL2 of Non-secure EL1 accesses to the CPACR_EL1 or CPACR on page D1-1582.
— Traps to EL2 of Non-secure System register accesses to the trace registers on page D1-1583.
— Trapping System register accesses to Debug ROM registers to EL2 on page D1-1585.
— Trapping System register accesses to powerdown debug registers to EL2 on page D1-1586.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to the Generic Timer registers on page D1-1587.
— Trapping general System register accesses to debug registers to EL2 on page D1-1586.
— Traps to EL2 of Non-secure EL0 and EL1 accesses to Performance Monitors registers on
page D1-1588.
• In EL3 configurable controls on page D1-1589.
— Traps to EL3 of Secure EL1 accesses to the Counter-timer Physical Secure timer registers on
page D1-1592.
— Trapping to EL3 of EL2 accesses to the CPTR_EL2 or HCPTR, and EL2 and EL1 accesses to the
CPACR_EL1 or CPACR on page D1-1593.
— Traps to EL3 of all System register accesses to the trace registers on page D1-1594.
— Trapping System register accesses to powerdown debug registers to EL3 on page D1-1595.
— Trapping general System register accesses to debug registers to EL3 on page D1-1596.
— Traps to EL3 of EL2, EL1, and EL0 accesses to Performance Monitors registers on page D1-1597.

ISS encoding for an IMPLEMENTATION DEFINED exception to EL3
This encoding is used by:
• IMPLEMENTATION DEFINED exception to EL3.


The ISS encoding for these exceptions is: 


24 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [24:0] 

IMPLEMENTATION DEFINED. 
ISS encoding for an exception from an Instruction Abort 
This encoding is used by: 


• Instruction Abort from a lower Exception level.
Used for MMU faults generated by instruction accesses and Synchronous external aborts, including 
synchronous parity or ECC errors. Not used for debug related exceptions. 

° Instruction Abort taken without a change in Exception level. 


Used for MMU faults generated by instruction accesses and Synchronous external aborts, including 
synchronous parity or ECC errors. Not used for debug related exceptions. 


The ISS encoding for these exceptions is:

    Bits:   24:11 | 10  | 9  | 8    | 7     | 6    | 5:0
    Field:  RES0  | FnV | EA | RES0 | S1PTW | RES0 | IFSC


Bits [24:11] 


Reserved, RES0.

FnV, bit [10]
FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a
translation table walk.
0 FAR is valid.
1 FAR is not valid, and holds an UNKNOWN value.
This field is only valid if the IFSC code is 010000. It is RES0 for all other aborts.

EA, bit [9]
External abort type. This bit can provide an IMPLEMENTATION DEFINED classification of external
aborts.
For any abort other than an External abort this bit returns a value of 0.

Bit [8]
Reserved, RES0.

S1PTW, bit [7]
For a stage 2 fault, indicates whether the fault was a stage 2 fault on an access made for a stage 1
translation table walk:
0 Fault not on a stage 2 translation for a stage 1 translation table walk.
1 Fault on the stage 2 translation of an access for a stage 1 translation table walk.
For any abort other than a stage 2 fault this bit is RES0.

Bit [6]
Reserved, RES0.


IFSC, bits [5:0] 
Instruction Fault Status Code. Possible values of this field are: 
000000 Address size fault, level 0 of translation or translation table base register
000001 Address size fault, level 1
000010 Address size fault, level 2
000011 Address size fault, level 3
000100 Translation fault, level 0
000101 Translation fault, level 1
000110 Translation fault, level 2
000111 Translation fault, level 3
001001 Access flag fault, level 1
001010 Access flag fault, level 2
001011 Access flag fault, level 3
001101 Permission fault, level 1
001110 Permission fault, level 2
001111 Permission fault, level 3
010000 Synchronous external abort, not on translation table walk
011000 Synchronous parity or ECC error on memory access, not on translation table walk
010100 Synchronous external abort, on translation table walk, level 0
010101 Synchronous external abort, on translation table walk, level 1
010110 Synchronous external abort, on translation table walk, level 2
010111 Synchronous external abort, on translation table walk, level 3
011100 Synchronous parity or ECC error on memory access on translation table walk, level 0
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3
110000 TLB conflict abort


All other values are reserved. The effect of programming this field to a reserved value is that 
behavior is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and 
memory-mapped registers and translation table entries on page K1-5492. 


For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on page D4-1806. 


Note


Because Access flag faults and Permission faults can only result from a Block or Page translation 
table descriptor, they cannot occur at level 0. 
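
As an illustration of how software might consume the status code table above, the following C sketch splits a 6-bit IFSC value into a fault class and, where one is encoded, a lookup level. The names fsc_class, fsc_info, and decode_ifsc are hypothetical and are not defined by the architecture. Any value that the table does not list is reported as FSC_OTHER.

```c
#include <stdint.h>

/* Hypothetical fault classes, one per group of rows in the table above. */
enum fsc_class {
    FSC_ADDRESS_SIZE,
    FSC_TRANSLATION,
    FSC_ACCESS_FLAG,
    FSC_PERMISSION,
    FSC_SYNC_EXTERNAL,       /* not on translation table walk */
    FSC_SYNC_PARITY,         /* not on translation table walk */
    FSC_SYNC_EXTERNAL_WALK,
    FSC_SYNC_PARITY_WALK,
    FSC_TLB_CONFLICT,
    FSC_OTHER                /* reserved, or not listed in the table */
};

struct fsc_info {
    enum fsc_class cls;
    int level;               /* 0..3, or -1 when no level is encoded */
};

/* Decode ISS[5:0]. The upper four bits select the class, the lower two the level. */
static struct fsc_info decode_ifsc(uint32_t iss)
{
    uint32_t fsc = iss & 0x3fu;
    int level = (int)(fsc & 0x3u);
    struct fsc_info info = { FSC_OTHER, -1 };

    switch (fsc >> 2) {
    case 0x0: info.cls = FSC_ADDRESS_SIZE; info.level = level; break;        /* 0000xx */
    case 0x1: info.cls = FSC_TRANSLATION; info.level = level; break;         /* 0001xx */
    case 0x2:                                                                /* 0010xx, levels 1-3 only */
        if (level != 0) { info.cls = FSC_ACCESS_FLAG; info.level = level; }
        break;
    case 0x3:                                                                /* 0011xx, levels 1-3 only */
        if (level != 0) { info.cls = FSC_PERMISSION; info.level = level; }
        break;
    case 0x4: if (level == 0) info.cls = FSC_SYNC_EXTERNAL; break;           /* 010000 */
    case 0x5: info.cls = FSC_SYNC_EXTERNAL_WALK; info.level = level; break;  /* 0101xx */
    case 0x6: if (level == 0) info.cls = FSC_SYNC_PARITY; break;             /* 011000 */
    case 0x7: info.cls = FSC_SYNC_PARITY_WALK; info.level = level; break;    /* 0111xx */
    case 0xc: if (level == 0) info.cls = FSC_TLB_CONFLICT; break;            /* 110000 */
    default: break;
    }
    return info;
}
```

The level 0 exclusion for Access flag faults and Permission faults follows the Note above: those encodings are simply absent from the table, so the sketch treats them as unlisted.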





ISS encoding for an exception from a Data Abort 


This encoding is used by: 


• Data Abort from a lower Exception level. 

Used for MMU faults generated by data accesses, alignment faults other than those caused by the Stack 
Pointer misalignment, and Synchronous external aborts, including synchronous parity or ECC errors. Not 
used for debug related exceptions. 

• Data Abort taken without a change in Exception level. 

Used for MMU faults generated by data accesses, alignment faults other than those caused by the Stack 
Pointer misalignment, and Synchronous external aborts, including synchronous parity or ECC errors. Not 
used for debug related exceptions. 


The ISS encoding for these exceptions is: 

ISV [24], SAS [23:22], SSE [21], SRT [20:16], SF [15], AR [14], RES0 [13:11], FnV [10], EA [9], CM [8], 
S1PTW [7], WnR [6], DFSC [5:0] 


ISV, bit [24] 


Instruction syndrome valid. Indicates whether the syndrome information in ISS[23:14] is valid. 
0 No valid instruction syndrome. ISS[23:14] are RES0. 
1 ISS[23:14] hold a valid instruction syndrome. 

This bit is 0 for all faults reported in ESR_EL2 except the following stage 2 aborts: 


• AArch64 loads and stores of a single general-purpose register (including the register 
specified with 0b11111), including those with Acquire/Release semantics, but excluding Load 
Exclusive or Store Exclusive and excluding those with writeback. 

• AArch32 instructions where the instruction: 

— Is an LDR, LDA, LDRT, LDRSH, LDRSHT, LDRH, LDAH, LDRHT, LDRSB, 
LDRSBT, LDRB, LDAB, LDRBT, STR, STL, STRT, STRH, STLH, STRHT, STRB, 
STLB, or STRBT instruction. 
— Is not performing register writeback. 
— Is not using R15 as a source or destination register. 


For these cases, ISV is UNKNOWN if the exception was generated in Debug state in memory access 
mode, and otherwise indicates whether ISS[23:14] hold a valid syndrome. 


ISV is 0 for all faults reported in ESR_EL1 or ESR_EL3. 


For ISS reporting, a stage 2 abort on a stage 1 translation table does not return a valid instruction 
syndrome, and therefore ISV is 0 for these aborts. 


The value of ISV on a Synchronous external abort on a stage 2 translation table walk is 
IMPLEMENTATION DEFINED. 


SAS, bits [23:22] 

Syndrome Access Size. When ISV is 1, indicates the size of the access attempted by the faulting 
operation. 

00 Byte 
01 Halfword 
10 Word 
11 Doubleword 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

SSE, bit [21] 

Syndrome Sign Extend. When ISV is 1, for a byte, halfword, or word load operation, indicates 
whether the data item must be sign extended. For these cases, the possible values of this bit are: 

0 Sign-extension not required. 
1 Data item must be sign-extended. 

For all other operations this bit is 0. 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

SRT, bits [20:16] 

Syndrome Register transfer. When ISV is 1, the register number of the Rt operand of the faulting 
instruction. If the exception was taken from an Exception level that is using AArch32 then this is 
the AArch64 view of the register. See Mapping of the general-purpose registers between the 
Execution states on page D1-1608. 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

SF, bit [15] 

Width of the register accessed by the instruction is Sixty-Four. When ISV is 1, the possible values 
of this bit are: 

0 Instruction loads/stores a 32-bit wide register. 
1 Instruction loads/stores a 64-bit wide register. 

— Note 

This field specifies the register width identified by the instruction, not the Execution state. 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

AR, bit [14] 

Acquire/Release. When ISV is 1, the possible values of this bit are: 

0 Instruction did not have acquire/release semantics. 
1 Instruction did have acquire/release semantics. 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

Bits [13:11] 

Reserved, RES0. 

FnV, bit [10] 

FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a 
translation table walk. 

0 FAR is valid. 
1 FAR is not valid, and holds an UNKNOWN value. 

This field is valid only if the DFSC code is 0b010000. It is RES0 for all other aborts. 





EA, bit [9] 
External abort type. This bit can provide an IMPLEMENTATION DEFINED classification of external 
aborts. 
For any abort other than an External abort this bit returns a value of 0. 

CM, bit [8] 
Cache maintenance. Indicates whether the Data Abort came from a cache maintenance or address 
translation instruction: 
0 The Data Abort was not generated by the execution of one of the system instructions 
identified in the description of value 1. 
1 The Data Abort was generated by either the execution of a cache maintenance 
instruction or by a synchronous fault on the execution of an address translation 
instruction. The DC ZVA instruction is not classified as a cache maintenance 
instruction, and therefore its execution cannot cause this field to be set to 1. 

S1PTW, bit [7] 


For a stage 2 fault, indicates whether the fault was a stage 2 fault on an access made for a stage 1 
translation table walk: 


0 Fault not on a stage 2 translation for a stage 1 translation table walk. 
1 Fault on the stage 2 translation of an access for a stage 1 translation table walk. 


For any abort other than a stage 2 fault this bit is RES0. 


WnR, bit [6] 


Write not Read. Indicates whether a synchronous abort was caused by a write instruction or a read 
instruction. The possible values of this bit are: 


0 Abort caused by a read instruction. 
1 Abort caused by a write instruction. 
For faults on cache maintenance and address translation instructions, this bit always returns a value 
of 1. 
DFSC, bits [5:0] 
Data Fault Status Code. Possible values of this field are: 
000000 Address size fault, level 0 of translation or translation table base register 
000001 Address size fault, level 1 
000010 Address size fault, level 2 
000011 Address size fault, level 3 
000100 Translation fault, level 0 
000101 Translation fault, level 1 
000110 Translation fault, level 2 
000111 Translation fault, level 3 
001001 Access flag fault, level 1 
001010 Access flag fault, level 2 
001011 Access flag fault, level 3 
001101 Permission fault, level 1 
001110 Permission fault, level 2 
001111 Permission fault, level 3 


010000 Synchronous external abort, not on translation table walk 

011000 Synchronous parity or ECC error on memory access, not on translation table walk 
010100 Synchronous external abort, on translation table walk, level 0 

010101 Synchronous external abort, on translation table walk, level 1 

010110 Synchronous external abort, on translation table walk, level 2 

010111 Synchronous external abort, on translation table walk, level 3 

011100 Synchronous parity or ECC error on memory access on translation table walk, level 0 
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1 
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2 
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3 
100001 Alignment fault 

110000 TLB conflict abort 
110100 IMPLEMENTATION DEFINED fault (Lockdown fault) 
110101 IMPLEMENTATION DEFINED fault (Unsupported Exclusive access fault) 
111101 Section Domain Fault, used only for faults reported in the PAR_EL1 
111110 Page Domain Fault, used only for faults reported in the PAR_EL1 


All other values are reserved. The effect of programming this field to a reserved value is that 
behavior is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and 
memory-mapped registers and translation table entries on page K1-5492. 


For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on page D4-1806. 


— Note 


Because Access flag faults and Permission faults can only result from a Block or Page translation 
table descriptor, they cannot occur at level 0. 
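
The field descriptions for the Data Abort encoding can be summarized as a set of shifts and masks. The following C sketch unpacks a 25-bit ISS value into its fields. The structure data_abort_iss and the functions decode_data_abort_iss and sas_access_bytes are hypothetical names used only for illustration. The sketch does not interpret the fields: SAS, SSE, SRT, SF, and AR are meaningful only when ISV is 1, and FnV only when DFSC is 0b010000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the Data Abort ISS fields described above. */
struct data_abort_iss {
    bool isv;        /* bit [24] */
    unsigned sas;    /* bits [23:22] */
    bool sse;        /* bit [21] */
    unsigned srt;    /* bits [20:16] */
    bool sf;         /* bit [15] */
    bool ar;         /* bit [14] */
    bool fnv;        /* bit [10] */
    bool ea;         /* bit [9] */
    bool cm;         /* bit [8] */
    bool s1ptw;      /* bit [7] */
    bool wnr;        /* bit [6] */
    unsigned dfsc;   /* bits [5:0] */
};

/* Extract every field. Bits [13:11] are RES0 and are ignored. */
static struct data_abort_iss decode_data_abort_iss(uint32_t iss)
{
    struct data_abort_iss d;

    d.isv   = (iss >> 24) & 1u;
    d.sas   = (iss >> 22) & 3u;
    d.sse   = (iss >> 21) & 1u;
    d.srt   = (iss >> 16) & 0x1fu;
    d.sf    = (iss >> 15) & 1u;
    d.ar    = (iss >> 14) & 1u;
    d.fnv   = (iss >> 10) & 1u;
    d.ea    = (iss >> 9) & 1u;
    d.cm    = (iss >> 8) & 1u;
    d.s1ptw = (iss >> 7) & 1u;
    d.wnr   = (iss >> 6) & 1u;
    d.dfsc  = iss & 0x3fu;
    return d;
}

/* Access size in bytes for SAS: 00 byte, 01 halfword, 10 word, 11 doubleword. */
static unsigned sas_access_bytes(unsigned sas)
{
    return 1u << (sas & 3u);
}
```

A hypervisor emulating a trapped device access would typically check isv first, and fall back to decoding the faulting instruction itself when isv is 0.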





ISS encoding for an exception from a trapped floating-point exception 
This encoding is used by: 


° Trapped floating-point exception taken from AArch32 state. 
Whether this Exception class is supported is IMPLEMENTATION DEFINED. 


° Trapped floating-point exception taken from AArch64 state. 
Whether this Exception class is supported is IMPLEMENTATION DEFINED. 


The ISS encoding for these exceptions is: 

RES0 [24], TFV [23], RES0 [22:11], VECITR [10:8], IDF [7], RES0 [6:5], IXF [4], UFF [3], OFF [2], 
DZF [1], IOF [0] 


Bit [24] 

Reserved, RES0. 

TFV, bit [23] 

Trapped Fault Valid bit. Indicates whether the IDF, IXF, UFF, OFF, DZF, and IOF bits hold valid 
information about trapped floating-point exceptions. The possible values of this bit are: 

0 The IDF, IXF, UFF, OFF, DZF, and IOF bits do not hold valid information about trapped 
floating-point exceptions and are UNKNOWN. 

1 One or more floating-point exceptions occurred during an operation performed while 
executing the reported instruction. The IDF, IXF, UFF, OFF, DZF, and IOF bits indicate 
trapped floating-point exceptions that occurred. For more information see 
Floating-point exception traps on page D1-1552. 

It is IMPLEMENTATION DEFINED whether this field is set to 0 on an exception generated by a trapped 
floating point exception from a vector instruction. 

— Note 

This is not a requirement. Implementations can set this field to 1 on a trapped floating-point 
exception from a vector instruction and return valid information in the {IDF, IXF, UFF, OFF, DZF, 
IOF} fields. 

Bits [22:11] 

Reserved, RES0. 

VECITR, bits [10:8] 

For a trapped floating-point exception from an instruction executed in AArch32 state this field is 
RES1. 

For a trapped floating-point exception from an instruction executed in AArch64 state this field is 
UNKNOWN. 





IDF, bit [7] 
Input Denormal floating-point exception trapped bit. If the TFV field is 0, this bit is UNKNOWN. 
Otherwise, the possible values of this bit are: 
0 Input denormal floating-point exception has not occurred. 
1 Input denormal floating-point exception occurred during execution of the reported 
instruction. 
Bits [6:5] 
Reserved, RES0. 
IXF, bit [4] 
Inexact floating-point exception trapped bit. If the TFV field is 0, this bit is UNKNOWN. Otherwise, 
the possible values of this bit are: 
0 Inexact floating-point exception has not occurred. 
1 Inexact floating-point exception occurred during execution of the reported instruction. 
UFF, bit [3] 
Underflow floating-point exception trapped bit. If the TFV field is 0, this bit is UNKNOWN. 
Otherwise, the possible values of this bit are: 
0 Underflow floating-point exception has not occurred. 
1 Underflow floating-point exception occurred during execution of the reported 
instruction. 
OFF, bit [2] 
Overflow floating-point exception trapped bit. If the TFV field is 0, this bit is UNKNOWN. Otherwise, 
the possible values of this bit are: 
0 Overflow floating-point exception has not occurred. 
1 Overflow floating-point exception occurred during execution of the reported 
instruction. 
DZF, bit [1] 
Divide-by-zero floating-point exception trapped bit. If the TFV field is 0, this bit is UNKNOWN. 
Otherwise, the possible values of this bit are: 
0 Divide-by-zero floating-point exception has not occurred. 
1 Divide-by-zero floating-point exception occurred during execution of the reported 
instruction. 
IOF, bit [0] 
Invalid Operation floating-point exception trapped bit. If the TFV field is 0, this bit is UNKNOWN. 
Otherwise, the possible values of this bit are: 
0 Invalid Operation floating-point exception has not occurred. 
1 Invalid Operation floating-point exception occurred during execution of the reported 
instruction. 


In an implementation where the SIMD and floating-point implementation supports the trapping of floating-point 
exceptions: 


• From an Exception level using AArch64, the FPCR.{IDE, IXE, UFE, OFE, DZE, IOE} bits enable each of 
the floating-point exception traps. 


• From an Exception level using AArch32, the FPSCR.{IDE, IXE, UFE, OFE, DZE, IOE} bits enable each of 
the floating-point exception traps. 
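
The trapped floating-point exception encoding lends itself to a simple validity check followed by a mask. The following C sketch is one possible way to read it. The FPTRAP_* names and the function fp_trap_flags are hypothetical. Because the flag bits are UNKNOWN when TFV is 0, the sketch reports no flags in that case and does not trust the raw bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the flag bits of the trapped floating-point ISS. */
#define FPTRAP_IOF (1u << 0)   /* Invalid Operation */
#define FPTRAP_DZF (1u << 1)   /* Divide-by-zero */
#define FPTRAP_OFF (1u << 2)   /* Overflow */
#define FPTRAP_UFF (1u << 3)   /* Underflow */
#define FPTRAP_IXF (1u << 4)   /* Inexact */
#define FPTRAP_IDF (1u << 7)   /* Input Denormal */
#define FPTRAP_TFV (1u << 23)  /* Trapped Fault Valid */

#define FPTRAP_FLAG_MASK \
    (FPTRAP_IOF | FPTRAP_DZF | FPTRAP_OFF | FPTRAP_UFF | FPTRAP_IXF | FPTRAP_IDF)

/*
 * Returns true and stores the trapped exception flags when TFV is 1.
 * Returns false and stores 0 when TFV is 0, because the flag bits are
 * then UNKNOWN.
 */
static bool fp_trap_flags(uint32_t iss, uint32_t *flags)
{
    if ((iss & FPTRAP_TFV) == 0) {
        *flags = 0;
        return false;
    }
    *flags = iss & FPTRAP_FLAG_MASK;
    return true;
}
```

A handler that receives false must identify the exception by other means, for example by examining the reported instruction.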


ISS encoding for an SError interrupt 
This encoding is used by: 


° SError interrupt. 


The ISS encoding for these exceptions is: 


ISV, bit [24] 

Instruction syndrome valid. Indicates whether the rest of the syndrome information in this register 
is valid. 

0 No valid instruction syndrome. ISS[23:0] are RES0. 
1 ISS[23:0] hold a valid instruction syndrome. 

IS, bits [23:0] 

IMPLEMENTATION DEFINED syndrome information that can be used to provide additional 
information about the SError interrupt. Only valid if bit[24] of this register is 1. If bit[24] is 0, this 
field is RES0. 


ISS encoding for an exception from a Breakpoint or Vector Catch debug exception 


This encoding is used by: 

• Breakpoint exception from a lower Exception level. 
• Breakpoint exception taken without a change in Exception level. 
• Vector Catch exception from AArch32 state. 

The only case where a Vector Catch exception is taken to an Exception level that is using AArch64 is when 
the exception is routed to EL2 and EL2 is using AArch64. 

The ISS encoding for these exceptions is: 

RES0 [24:6], IFSC [5:0] 

Bits [24:6] 

Reserved, RES0. 

IFSC, bits [5:0] 

Instruction Fault Status Code. This field is set to 0b100010, to indicate a Debug exception. 

For more information about generating these exceptions: 

• For exceptions from AArch64, see Breakpoint exceptions on page D2-1641. 
• For exceptions from AArch32, see Breakpoint exceptions on page G2-3938 and Vector Catch exceptions on 
page G2-3975. 


ISS encoding for an exception from a Software Step exception 


This encoding is used by: 
• Software Step exception from a lower Exception level. 
• Software Step exception taken without a change in Exception level. 

The ISS encoding for these exceptions is: 

ISV [24], RES0 [23:7], EX [6], IFSC [5:0] 

ISV, bit [24] 

Instruction syndrome valid. Indicates whether the EX bit, ISS[6], is valid, as follows: 

0 EX bit is RES0. 
1 EX bit is valid. 

See the EX bit description for more information. 

Bits [23:7] 

Reserved, RES0. 

EX, bit [6] 

Exclusive operation. If the ISV bit is set to 1, this bit indicates whether a Load-Exclusive instruction 
was stepped. 

0 An instruction other than a Load-Exclusive instruction was stepped. 
1 A Load-Exclusive instruction was stepped. 

If the ISV bit is set to 0, this bit is RES0, indicating no syndrome data is available. 

IFSC, bits [5:0] 

Instruction Fault Status Code. This field is set to 0b100010, to indicate a Debug exception. 


For more information about generating these exceptions, see Software Step exceptions on page D2-1673. 
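
Because EX carries information only when ISV is 1, a debugger has three cases to distinguish. The following C sketch shows one way to express that. The enumeration step_kind and the function classify_software_step are hypothetical names, not part of the architecture.

```c
#include <stdint.h>

/* Hypothetical three-way result for a Software Step syndrome. */
enum step_kind {
    STEP_NO_SYNDROME,         /* ISV == 0: EX is RES0, nothing is known */
    STEP_NOT_LOAD_EXCLUSIVE,  /* ISV == 1, EX == 0 */
    STEP_LOAD_EXCLUSIVE       /* ISV == 1, EX == 1 */
};

static enum step_kind classify_software_step(uint32_t iss)
{
    if (((iss >> 24) & 1u) == 0)
        return STEP_NO_SYNDROME;
    return ((iss >> 6) & 1u) ? STEP_LOAD_EXCLUSIVE : STEP_NOT_LOAD_EXCLUSIVE;
}
```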
ISS encoding for an exception from a Watchpoint exception 


This encoding is used by: 
• Watchpoint exception from a lower Exception level. 
• Watchpoint exception taken without a change in Exception level. 


The ISS encoding for these exceptions is: 

RES0 [24:9], CM [8], RES0 [7], WnR [6], DFSC [5:0] 

Bits [24:9] 
Reserved, RES0. 
CM, bit [8] 
Cache maintenance. Indicates whether the Data Abort came from a cache maintenance or address 
translation instruction: 
0 The Data Abort was not generated by the execution of one of the system instructions 
identified in the description of value 1. 
1 The Data Abort was generated by either the execution of a cache maintenance 
instruction or by a synchronous fault on the execution of an address translation 
instruction. The DC ZVA instruction is not classified as a cache maintenance 
instruction, and therefore its execution cannot cause this field to be set to 1. 
Bit [7] 
Reserved, RES0. 
WnR, bit [6] 
Write not Read. Indicates whether the abort was caused by a write instruction or a read instruction. 
The possible values of this bit are: 
0 Abort caused by a read instruction. 
1 Abort caused by a write instruction. 
For faults on cache maintenance and address translation instructions, this bit always returns a value 
of 1. 
DFSC, bits [5:0] 
Data Fault Status Code. This field is set to 0b100010, to indicate a Debug exception. 


For more information about generating these exceptions, see Watchpoint exceptions on page D2-1657. 


ISS encoding for an exception from execution of a Breakpoint instruction 


This encoding is used by: 
° BKPT instruction execution in AArch32 state. 
° BRK instruction execution in AArch64 state. 


This is reported in ESR_EL3 only if a BRK instruction is executed at EL3. 


The ISS encoding for these exceptions is: 

RES0 [24:16], Comment [15:0] 

Bits [24:16] 

Reserved, RES0. 

Comment, bits [15:0] 


Set to the instruction comment field value, zero extended as necessary. For the AArch32 BKPT 
instructions, the comment field is described as the immediate field. 


For more information about generating these exceptions, see Breakpoint Instruction exceptions on page D2-1639. 
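
Recovering the comment field is a single mask, shown here as a C sketch for completeness. The function name brk_comment is hypothetical.

```c
#include <stdint.h>

/* Return the BRK or BKPT comment held in ISS[15:0]; bits [24:16] are RES0. */
static uint16_t brk_comment(uint32_t iss)
{
    return (uint16_t)(iss & 0xffffu);
}
```

Software often uses distinct comment values to tell apart different uses of the instruction, such as a debugger breakpoint and an assertion failure. Any such assignment of values is a software convention and is not defined by the architecture.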
D7.2.28 FAR_EL1, Fault Address Register (EL1) 
The FAR_EL1 characteristics are: 


Purpose 


Holds the faulting Virtual Address for all synchronous Instruction or Data Abort, PC alignment fault 
and Watchpoint exceptions that are taken to EL1. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- RW RW RW RW RW 

Execution at EL0 makes FAR_EL1 become UNKNOWN. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


° If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to 
EL2. 


Configurations 


AArch64 System register FAR_EL1[31:0] is architecturally mapped to AArch32 System register 
DFAR (NS). 


AArch64 System register FAR_EL1[63:32] is architecturally mapped to AArch32 System register 
IFAR (NS). 


RW fields in this register reset to architecturally UNKNOWN values. 
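
The architectural mapping described above means that the two AArch32 registers are the two halves of the 64-bit register. The following C sketch illustrates the relationship on plain integer values. The function names are hypothetical, and the sketch does not access any System register.

```c
#include <stdint.h>

/* View of FAR_EL1[31:0], which is mapped to DFAR (NS). */
static uint32_t far_el1_dfar_view(uint64_t far_el1)
{
    return (uint32_t)(far_el1 & 0xffffffffu);
}

/* View of FAR_EL1[63:32], which is mapped to IFAR (NS). */
static uint32_t far_el1_ifar_view(uint64_t far_el1)
{
    return (uint32_t)(far_el1 >> 32);
}

/* Recombine the two 32-bit views into the 64-bit register value. */
static uint64_t far_el1_from_views(uint32_t ifar, uint32_t dfar)
{
    return ((uint64_t)ifar << 32) | dfar;
}
```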


Attributes 
FAR_EL1 is a 64-bit register. 


Field descriptions 


The FAR_EL1 bit assignments are: 


63 0 


Faulting Virtual Address for synchronous exceptions taken to EL1 





Bits [63:0] 


Faulting Virtual Address for synchronous exceptions taken to EL1. Exceptions that set the 
FAR_EL1 are Instruction Aborts (EC 0x20 or 0x21), Data Aborts (EC 0x24 or 0x25), PC alignment 
fault exceptions (EC 0x22), and Watchpoint exceptions (EC 0x34 or 0x35). ESR_EL1.EC holds the 
EC value for the exception. 


For a synchronous external abort, if the virtual address that generated the abort was from an address 
range for which TCR_ELx.TBI{0,1}==1 for the translation regime in use when the abort was 
generated, then the top eight bits of the FAR_EL1 are UNKNOWN. 

For a synchronous external abort other than a synchronous external abort on a translation table walk, 
this field is valid only if ESR_EL1.FnV is 0, and the FAR_EL1 is UNKNOWN if ESR_EL1.FnV is 1. 


For all other exceptions taken to EL1, the FAR_EL1 is UNKNOWN. 
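
The rules above can be combined into a single check that an exception handler might make before using the fault address. The following C sketch is an illustration only and rests on two assumptions that are not stated in this section: that the EC field occupies ESR_ELx[31:26], and that the FnV test applies only to Data Aborts, because FnV is described here as part of the Data Abort ISS encoding. The function name far_is_valid is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the exception described by esr is one that sets the
 * fault address register and that value is not flagged as UNKNOWN by FnV.
 */
static bool far_is_valid(uint32_t esr)
{
    uint32_t ec  = (esr >> 26) & 0x3fu;   /* assumption: EC is ESR_ELx[31:26] */
    uint32_t fsc = esr & 0x3fu;           /* IFSC or DFSC, ISS[5:0] */
    bool fnv     = (esr >> 10) & 1u;      /* Data Abort ISS bit [10] */

    switch (ec) {
    case 0x24: case 0x25:                 /* Data Abort */
        /* FnV is defined only for a Synchronous external abort, 0b010000. */
        return !(fsc == 0x10u && fnv);
    case 0x20: case 0x21:                 /* Instruction Abort */
    case 0x22:                            /* PC alignment fault */
    case 0x34: case 0x35:                 /* Watchpoint */
        return true;
    default:                              /* FAR is UNKNOWN for all other exceptions */
        return false;
    }
}
```

The sketch does not model the case where only the top eight bits are UNKNOWN because of TCR_ELx.TBI{0,1}.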


If a memory fault that sets FAR_EL1 is generated from a data cache maintenance or DC ZVA 
instruction, this field holds the address specified in the register argument of the instruction. 


If the exception that updates FAR_EL1 is taken from an Exception level that is using AArch32, the 
top 32 bits are all zero, unless the faulting address is generated by a load or store instruction that 
sequentially increments from address 0xFFFFFFFF. This is an UNPREDICTABLE condition, and in this 
case the upper 32 bits are set to 0x00000001. 


For a Data Abort or Watchpoint exception, if address tagging is enabled for the address accessed by 
the data access that caused the exception, then this field includes the tag. For more information 
about address tagging, see Address tagging in AArch64 state on page D4-1724. 
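
Software that compares the reported address against its own address map usually wants the address without the tag. The following C sketch shows one common approach, under the assumptions that the tag occupies bits [63:56] and that bit [55] selects which half of the address space the address belongs to. Neither detail is stated in this section, and the function name far_strip_tag is hypothetical.

```c
#include <stdint.h>

/*
 * Replace bits [63:56] with copies of bit [55], which yields the
 * untagged form of the address under the assumptions stated above.
 */
static uint64_t far_strip_tag(uint64_t far)
{
    uint64_t low = far & 0x00ffffffffffffffULL;

    if (far & (1ULL << 55))
        return low | 0xff00000000000000ULL;
    return low;
}
```

This transformation is meaningful only when address tagging is enabled for the address range concerned. An untagged address passes through unchanged.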


— Note 


The address held in this field is an address accessed by the instruction fetch or data access that 
caused the exception that gave rise to the instruction or data abort. It is the lower address that gave 
rise to the fault. Where different faults from different addresses arise from the same instruction, such 
as for an instruction that loads or stores a mis-aligned address that crosses a page boundary, the 
architecture does not prioritize between those different faults. 





FAR_EL1 is made UNKNOWN on an exception return from EL1. 


Accessing the FAR_EL1: 
To access the FAR_EL1: 


MRS <Xt>, FAR_EL1 ; Read FAR_EL1 into Xt 
MSR FAR_EL1, <Xt> ; Write Xt to FAR_EL1 


Register access is encoded as follows: 





op0 op1 CRn CRm op2 





11 000 0110 0000 000 
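
The five encoding fields above determine the MRS and MSR instruction words. The following C sketch assembles them, assuming the A64 System register access layout in which op0 occupies bits [20:19], op1 bits [18:16], CRn bits [15:12], CRm bits [11:8], op2 bits [7:5], and Rt bits [4:0], with bit [21] distinguishing a read from a write. That layout is defined in the instruction set description, not in this section. The names sysreg_enc, encode_mrs, and encode_msr are hypothetical.

```c
#include <stdint.h>

/* Hypothetical holder for one row of the register access encoding table. */
struct sysreg_enc {
    unsigned op0, op1, crn, crm, op2;
};

/* FAR_EL1: op0=0b11, op1=0b000, CRn=0b0110, CRm=0b0000, op2=0b000. */
static const struct sysreg_enc FAR_EL1_ENC = { 3, 0, 6, 0, 0 };

/* Place the five fields at their bit positions in the instruction word. */
static uint32_t sysreg_field(struct sysreg_enc r)
{
    return ((r.op0 & 3u) << 19) | ((r.op1 & 7u) << 16) |
           ((r.crn & 15u) << 12) | ((r.crm & 15u) << 8) |
           ((r.op2 & 7u) << 5);
}

/* MRS <Xt>, <sysreg>. Meaningful only for op0 values 2 and 3. */
static uint32_t encode_mrs(struct sysreg_enc r, unsigned rt)
{
    return 0xD5200000u | sysreg_field(r) | (rt & 31u);
}

/* MSR <sysreg>, <Xt>. Meaningful only for op0 values 2 and 3. */
static uint32_t encode_msr(struct sysreg_enc r, unsigned rt)
{
    return 0xD5000000u | sysreg_field(r) | (rt & 31u);
}
```

Under these assumptions, encode_mrs(FAR_EL1_ENC, 0) gives the instruction word for MRS X0, FAR_EL1.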
D7.2.29 FAR_EL2, Fault Address Register (EL2) 
The FAR_EL2 characteristics are: 


Purpose 


Holds the faulting Virtual Address for all synchronous Instruction or Data Abort, PC alignment fault 
and Watchpoint exceptions that are taken to EL2. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - - RW RW RW 

Execution at EL1 or EL0 makes FAR_EL2 become UNKNOWN. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register FAR_EL2[31:0] is architecturally mapped to AArch32 System register 
HDFAR. 


AArch64 System register FAR_EL2[63:32] is architecturally mapped to AArch32 System register 
HIFAR. 


AArch64 System register FAR_EL2[31:0] is architecturally mapped to AArch32 System register 
DFAR (S) when EL2 is implemented. 


AArch64 System register FAR_EL2[63:32] is architecturally mapped to AArch32 System register 
IFAR (S) when EL2 is implemented. 


If EL2 is not implemented, this register is RES0 from EL3. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
FAR_EL2 is a 64-bit register. 


Field descriptions 


The FAR_EL2 bit assignments are: 


63 0 


Faulting Virtual Address for synchronous exceptions taken to EL2 





Bits [63:0] 


Faulting Virtual Address for synchronous exceptions taken to EL2. Exceptions that set the 
FAR_EL2 are Instruction Aborts (EC 0x20 or 0x21), Data Aborts (EC 0x24 or 0x25), PC alignment 
fault exceptions (EC 0x22), and Watchpoint exceptions (EC 0x34 or 0x35). ESR_EL2.EC holds the 
EC value for the exception. 


For a synchronous external abort, if the virtual address that generated the abort was from an address 
range for which TCR_ELx.TBI{0,1}==1 for the translation regime in use when the abort was 
generated, then the top eight bits of the FAR_EL2 are UNKNOWN. 


For a synchronous external abort other than a synchronous external abort on a translation table walk, 
this field is valid only if ESR_EL2.FnV is 0, and the FAR_EL2 is UNKNOWN if ESR_EL2.FnV is 1. 

For all other exceptions taken to EL2, the FAR_EL2 is UNKNOWN. 


If a memory fault that sets FAR_EL2 is generated from a data cache maintenance or DC ZVA 
instruction, this field holds the address specified in the register argument of the instruction. 


If the exception that updates FAR_EL2 is taken from an Exception level that is using AArch32, the 
top 32 bits are all zero, unless the faulting address is generated by a load or store instruction that 
sequentially increments from address 0xFFFFFFFF. This is an UNPREDICTABLE condition, and in this 
case the upper 32 bits are set to 0x00000001. 


For a Data Abort or Watchpoint exception, if address tagging is enabled for the address accessed by 
the data access that caused the exception, then this field includes the tag. For more information 
about address tagging, see Address tagging in AArch64 state on page D4-1724. 


— Note 


The address held in this field is an address accessed by the instruction fetch or data access that 
caused the exception that gave rise to the instruction or data abort. It is the lower address that gave 
rise to the fault. Where different faults from different addresses arise from the same instruction, such 
as for an instruction that loads or stores a mis-aligned address that crosses a page boundary, the 
architecture does not prioritize between those different faults. 





FAR_EL2 is made UNKNOWN on an exception return from EL2. 


Accessing the FAR_EL2: 
To access the FAR_EL2: 


MRS <Xt>, FAR_EL2 ; Read FAR_EL2 into Xt 
MSR FAR_EL2, <Xt> ; Write Xt to FAR_EL2 


Register access is encoded as follows: 





op0 op1 CRn CRm op2 





11 100 0110 0000 000 
D7.2.30 FAR_EL3, Fault Address Register (EL3) 
The FAR_EL3 characteristics are: 


Purpose 


Holds the faulting Virtual Address for all synchronous Instruction or Data Abort and PC alignment 
fault exceptions that are taken to EL3. 


Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - - - RW RW 

Execution at EL2, EL1 or EL0 makes FAR_EL3 become UNKNOWN. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
FAR_EL3 is a 64-bit register. 


Field descriptions 


The FAR_EL3 bit assignments are: 


63 0 


Faulting Virtual Address for synchronous exceptions taken to EL3 





Bits [63:0] 


Faulting Virtual Address for synchronous exceptions taken to EL3. Exceptions that set the 
FAR_EL3 are Instruction Aborts (EC 0x20 or 0x21), Data Aborts (EC 0x24 or 0x25), and PC 
alignment fault exceptions (EC 0x22). ESR_EL3.EC holds the EC value for the exception. 


For a synchronous external abort, if the virtual address that generated the abort was from an address 
range for which TCR_ELx.TBI{0,1}==1 for the translation regime in use when the abort was 
generated, then the top eight bits of the FAR_EL3 are UNKNOWN. 


For a synchronous external abort other than a synchronous external abort on a translation table walk, 
this field is valid only if ESR_EL3.FnV is 0, and the FAR_EL3 is UNKNOWN if ESR_EL3.FnV is 1. 


For all other exceptions taken to EL3, the FAR_EL3 is UNKNOWN. 


If a memory fault that sets FAR_EL3 is generated from a data cache maintenance or DC ZVA 
instruction, this field holds the address specified in the register argument of the instruction. 


If the exception that updates FAR_EL3 is taken from an Exception level that is using AArch32, the 
top 32 bits are all zero, unless the faulting address is generated by a load or store instruction that 
sequentially increments from address 0xFFFFFFFF. This is an UNPREDICTABLE condition, and in this 
case the upper 32 bits are set to 0x00000001. 


For a Data Abort or Watchpoint exception, if address tagging is enabled for the address accessed by 
the data access that caused the exception, then this field includes the tag. For more information 
about address tagging, see Address tagging in AArch64 state on page D4-1724. 

— Note 


The address held in this register is an address accessed by the instruction fetch or data access that 
caused the exception that actually gave rise to the instruction or data abort. It is the lowest address 
that gave rise to the fault. Where different faults from different addresses arise from the same 
instruction, such as for an instruction that loads or stores a mis-aligned address that crosses a page 
boundary, the architecture does not prioritize between those different faults. 





FAR_EL3 is made UNKNOWN on an exception return from EL3. 


Accessing the FAR_EL3: 
To access the FAR_EL3: 


MRS <Xt>, FAR_EL3 ; Read FAR_EL3 into Xt 
MSR FAR_EL3, <Xt> ; Write Xt to FAR_EL3 


Register access is encoded as follows: 





op0 op1 CRn CRm op2 





11 110 0110 0000 000 
D7.2.31 FPEXC32_EL2, Floating-Point Exception Control register 
The FPEXC32_EL2 characteristics are: 


Purpose 
Allows access to the AArch32 register FPEXC from AArch64 state only. Its value has no effect on 
execution in AArch64 state. 

Usage constraints 


This register is accessible as follows: 





EL0 EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - - RW RW RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If CPTR_EL2.TFP==1, Non-secure accesses to this register from EL2 are trapped to EL2. 
• If CPTR_EL3.TFP==1, accesses to this register from EL2 and EL3 are trapped to EL3. 


Configurations 


AArch64 System register FPEXC32_EL2 is architecturally mapped to AArch32 System register 
FPEXC. 


If EL1 cannot use AArch32, this register is UNDEFINED. 


If EL2 is not implemented but EL3 is implemented, and EL1 is capable of using AArch32, then this 
register is not RES0. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
FPEXC32_EL2 is a 32-bit register. 


Field descriptions 


The FPEXC32_EL2 bit assignments are: 


EX [31], EN [30], DEX [29], FP2V [28], VV [27], TFV [26], RES0 [25:11], VECITR [10:8], IDF [7], 
RES0 [6:5], IXF [4], UFF [3], OFF [2], DZF [1], IOF [0] 

EX, bit [31] 


Exception bit. In ARMv8, this bit is RAZ/WI. 

EN, bit [30] 

Enables access to the Advanced SIMD and floating-point functionality from all Exception levels, 
except that setting this field to 0 does not disable the following: 

• VMSR accesses to the FPEXC or FPSID. 
• VMRS accesses from the FPEXC, FPSID, MVFR0, MVFR1, or MVFR2. 

0 Accesses to the FPSCR, and any of the SIMD and floating-point registers Q0-Q15, 
including their views as D0-D31 registers or S0-S31 registers, are UNDEFINED at all 
Exception levels. 

1 This control permits access to the Advanced SIMD and floating-point functionality at 
all Exception levels. 

Execution of floating-point and Advanced SIMD instructions in AArch32 state can be disabled or 
trapped by the following controls: 

• CPACR.cp10, or, if executing at EL0, CPACR_EL1.FPEN. 
• FPEXC.EN. 
• If executing in Non-secure state: 
— HCPTR.TCP10, or if EL2 is using AArch64, CPTR_EL2.TFP. 
— NSACR.cp10, or if EL3 is using AArch64, CPTR_EL3.TFP. 
• For Advanced SIMD instructions only: 
— CPACR.ASEDIS. 
— If executing in Non-secure state, HCPTR.TASE and NSACR.NSTRCDIS. 

See the descriptions of the controls for more information. 

— Note 

When executing at EL0 using AArch32 with EL1 using AArch64, the PE behaves as if the value of 
FPEXC.EN bit is 1. 

DEX, bit [29] 

Defined synchronous exception on floating-point execution. 

This field identifies whether a synchronous exception generated by the attempted execution of an 
instruction was generated by an unallocated encoding. The instruction must be in the encoding space 
that is identified by the pseudocode function ExecutingCP10or11Instr() returning TRUE. This field 
also indicates whether the FPEXC32_EL2.TFV is valid. 

The meaning of this bit is: 

0 The exception was generated by the attempted execution of an unallocated instruction 
in the encoding space that is identified by the pseudocode function 
ExecutingCP10or11Instr(). If FPEXC32_EL2.TFV is RW then it is invalid and 
UNKNOWN. If FPEXC32_EL2.{IDF, IXF, UFF, OFF, DZF, IOF} are RW then they are 
invalid and UNKNOWN. 

1 The exception was generated during the execution of an allocated encoding. 
FPEXC32_EL2.TFV is valid and indicates the cause of the exception. 

On an exception that sets this bit to 1 the exception-handling routine must clear this bit to 0. 

On an implementation that both does not support trapping of floating-point exceptions and 
implements the AArch32 FPSCR.{Stride, Len} fields as RAZ, this bit is RES0. 

FP2V, bit [28] 

FPINST2 instruction valid bit. In ARMv8, this bit is RES0. 

VV, bit [27] 

VECITR valid bit. In ARMv8, this bit is RES0. 
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TEV, bit [26] 


Trapped Fault Valid bit. Valid only when the value of FPEXC.DEX is 1. When valid, it indicates the 
cause of the exception and therefore whether the FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF} bits are 
valid. 


0 The exception was caused by the execution of a floating-point VABS, VADD, VDIV,
VFMA, VFMS, VFNMA, VFNMS, VMLA, VMLS, VMOV, VMUL, VNEG,
VNMLA, VNMLS, VNMUL, VSQRT, or VSUB instruction when one or both of
FPSCR.{Stride, Len} was non-zero. If the FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF}
bits are RW then they are invalid and UNKNOWN.

1 FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF} indicate the presence of trapped
floating-point exceptions that had occurred at the time of the exception. Bits are set for
all trapped exceptions that had occurred at the time of the exception.

This bit returns a status value and ignores writes.

When the value of FPEXC.DEX is 0 and this bit is RW, this bit is invalid and UNKNOWN.

On an implementation that does not support the trapping of floating-point exceptions this bit is
RAZ/WI.

On an implementation that supports the trapping of floating-point exceptions and implements
FPSCR.{Stride, Len} as RAZ, this bit is RAO/WI.
Bits [25:11]

Reserved, RES0.

VECITR, bits [10:8]

Vector iteration count. In ARMv8, this field is RES1.


IDF, bit [7] 


Input Denormal trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, 
it indicates whether an Input Denormal exception occurred while FPSCR.IDE was 1: 


0 Input Denormal exception has not occurred.
1 Input Denormal exception has occurred.

Input Denormal exceptions can occur only when FPSCR.FZ is 1.

This bit must be cleared to 0 by the exception-handling routine.

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN.

On an implementation that does not support the trapping of floating-point exceptions this bit is
RAZ/WI.

Bits [6:5]

Reserved, RES0.


IXF, bit [4] 


Inexact trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, it 
indicates whether an Inexact exception occurred while FPSCR.IXE was 1: 


0 Inexact exception has not occurred.
1 Inexact exception has occurred.

This bit must be cleared to 0 by the exception-handling routine.

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN.


On an implementation that does not support the trapping of floating-point exceptions this bit is
RAZ/WI.

UFF, bit [3]

Underflow trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, it
indicates whether an Underflow exception occurred while FPSCR.UFE was 1:

0 Underflow exception has not occurred.
1 Underflow exception has occurred.

Underflow trapped exceptions can occur only when FPSCR.FZ is 0.

This bit must be cleared to 0 by the exception-handling routine.

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN.

On an implementation that does not support the trapping of floating-point exceptions this bit is
RAZ/WI.

OFF, bit [2]

Overflow trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, it
indicates whether an Overflow exception occurred while FPSCR.OFE was 1:

0 Overflow exception has not occurred.
1 Overflow exception has occurred.

This bit must be cleared to 0 by the exception-handling routine.

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN.

On an implementation that does not support the trapping of floating-point exceptions this bit is
RAZ/WI.

DZF, bit [1]

Divide-by-zero trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid,
it indicates whether a Divide-by-zero exception occurred while FPSCR.DZE was 1:

0 Divide-by-zero exception has not occurred.
1 Divide-by-zero exception has occurred.

This bit must be cleared to 0 by the exception-handling routine.

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN.

On an implementation that does not support the trapping of floating-point exceptions this bit is
RAZ/WI.

IOF, bit [0]

Invalid Operation trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid,
it indicates whether an Invalid Operation exception occurred while FPSCR.IOE was 1:

0 Invalid Operation exception has not occurred.
1 Invalid Operation exception has occurred.

This bit must be cleared to 0 by the exception-handling routine.

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN.


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 
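
The validity rules in the field descriptions chain together: DEX must be 1 for TFV to be valid, and TFV must be 1 for the trapped-exception bits to be valid. The following C fragment is a minimal sketch of that chain, assuming the register value has already been read into a 32-bit variable. The macro and function names are illustrative and are not defined by the architecture.

```c
#include <stdint.h>

/* Bit positions taken from the FPEXC32_EL2 field descriptions. */
#define FPEXC_EN   (UINT32_C(1) << 30)
#define FPEXC_DEX  (UINT32_C(1) << 29)
#define FPEXC_TFV  (UINT32_C(1) << 26)
#define FPEXC_IDF  (UINT32_C(1) << 7)
#define FPEXC_IXF  (UINT32_C(1) << 4)
#define FPEXC_UFF  (UINT32_C(1) << 3)
#define FPEXC_OFF  (UINT32_C(1) << 2)
#define FPEXC_DZF  (UINT32_C(1) << 1)
#define FPEXC_IOF  (UINT32_C(1) << 0)
#define FPEXC_TRAPPED_MASK \
    (FPEXC_IDF | FPEXC_IXF | FPEXC_UFF | FPEXC_OFF | FPEXC_DZF | FPEXC_IOF)

/*
 * Hypothetical helper: return the trapped-exception bits that can be
 * trusted, or 0 when they are not valid. The bits are valid only when
 * DEX is 1 (so TFV is valid) and TFV is 1.
 */
static uint32_t fpexc_valid_trapped_flags(uint32_t fpexc)
{
    if (!(fpexc & FPEXC_DEX))
        return 0;   /* TFV and the trapped bits are invalid and UNKNOWN */
    if (!(fpexc & FPEXC_TFV))
        return 0;   /* the trapped bits are invalid and UNKNOWN */
    return fpexc & FPEXC_TRAPPED_MASK;
}

/*
 * Hypothetical helper: compute the value an exception-handling routine
 * would write back, with DEX and every trapped-exception bit cleared to 0.
 */
static uint32_t fpexc_after_handling(uint32_t fpexc)
{
    return fpexc & ~(FPEXC_DEX | FPEXC_TRAPPED_MASK);
}
```

A handler would typically call fpexc_valid_trapped_flags() first, act on the returned bits, and then write back the result of fpexc_after_handling(). The sketch does not model implementations on which these bits are RAZ/WI or RES0.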


Accessing the FPEXC32_EL2: 


To access the FPEXC32_EL2:

MRS <Xt>, FPEXC32_EL2 ; Read FPEXC32_EL2 into Xt
MSR FPEXC32_EL2, <Xt> ; Write Xt to FPEXC32_EL2

Register access is encoded as follows:

op0 op1 CRn CRm op2
11 100 0101 0011 000 
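
The encoding fields in the table can be packed into a 32-bit instruction word. The following C fragment is a sketch of that packing, assuming the field positions of the A64 MRS and MSR (register) instruction encodings, which are described elsewhere in the manual and not in this section. The function name is illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch: build the A64 opcode for "MRS Xt, <sysreg>" (is_read true) or
 * "MSR <sysreg>, Xt" (is_read false) from the encoding fields.
 * Assumed layout: bit 21 selects read or write, op0 is in bits [20:19],
 * op1 in [18:16], CRn in [15:12], CRm in [11:8], op2 in [7:5], Rt in [4:0].
 */
static uint32_t sysreg_insn(bool is_read, unsigned op0, unsigned op1,
                            unsigned crn, unsigned crm, unsigned op2,
                            unsigned rt)
{
    return UINT32_C(0xD5000000)
         | ((uint32_t)(is_read ? 1u : 0u) << 21)
         | ((uint32_t)(op0 & 0x3u) << 19)
         | ((uint32_t)(op1 & 0x7u) << 16)
         | ((uint32_t)(crn & 0xFu) << 12)
         | ((uint32_t)(crm & 0xFu) << 8)
         | ((uint32_t)(op2 & 0x7u) << 5)
         | (uint32_t)(rt & 0x1Fu);
}
```

With the FPEXC32_EL2 values from the table (op0 = 0b11, op1 = 0b100, CRn = 0b0101, CRm = 0b0011, op2 = 0b000), a read into X0 would be requested as sysreg_insn(true, 3, 4, 5, 3, 0, 0).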









D7.2.32 HACR_EL2, Hypervisor Auxiliary Control Register 
The HACR_EL2 characteristics are: 
Purpose 
Controls trapping to EL2 of IMPLEMENTATION DEFINED aspects of Non-secure EL1 or EL0
operation.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register HACR_EL2 is architecturally mapped to AArch32 System register 
HACR. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HACR_EL2 is a 32-bit register. 
Field descriptions 
The HACR_EL2 bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HACR_EL2: 
To access the HACR_EL2: 
MRS <Xt>, HACR_EL2 ; Read HACR_EL2 into Xt 
MSR HACR_EL2, <Xt> ; Write Xt to HACR_EL2 
Register access is encoded as follows: 
op0  op1  CRn   CRm   op2
11   100  0001  0001  111

D7.2.33 HCR_EL2, Hypervisor Configuration Register
The HCR_EL2 characteristics are: 


Purpose 
Provides configuration controls for virtualization, including defining whether various Non-secure 
operations are trapped to EL2. 

Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register HCR_EL2[31:0] is architecturally mapped to AArch32 System register 
HCR. 


AArch64 System register HCR_EL2[63:32] is architecturally mapped to AArch32 System register 
HCR2. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
HCR_EL2 is a 64-bit register.


Field descriptions 


The HCR_EL2 bit assignments are: 


Bits      Field
[63:39]   RES0
[38]      MIOCNCE
[37:34]   RES0
[33]      ID
[32]      CD
[31]      RW
[30]      TRVM
[29]      HCD
[28]      TDZ
[27]      TGE
[26]      TVM
[25]      TTLB
[24]      TPU
[23]      TPC
[22]      TSW
[21]      TACR
[20]      TIDCP
[19]      TSC
[18]      TID3
[17]      TID2
[16]      TID1
[15]      TID0
[14]      TWE
[13]      TWI
[12]      DC
[11:10]   BSU
[9]       FB
[8]       VSE
[7]       VI
[6]       VF
[5]       AMO
[4]       IMO
[3]       FMO
[2]       PTW
[1]       SWIO
[0]       VM

Bits [63:39]
Reserved, RES0.


MIOCNCE, bit [38] 

Mismatched Inner/Outer Cacheable Non-Coherency Enable, for the Non-secure EL1&0 translation
regime.

0 For the Non-secure EL1&0 translation regime, for permitted accesses to a memory
location that use a common definition of the Shareability and Cacheability of the
location, there must be no loss of coherency if the Inner Cacheability attribute for those
accesses differs from the Outer Cacheability attribute.

1 For the Non-secure EL1&0 translation regime, for permitted accesses to a memory
location that use a common definition of the Shareability and Cacheability of the
location, there might be a loss of coherency if the Inner Cacheability attribute for those
accesses differs from the Outer Cacheability attribute.


For more information see Mismatched memory attributes on page B2-105. 
This field can be implemented as RAZ/WI. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
Bits [37:34]

Reserved, RES0.


ID, bit [33] 


Stage 2 Instruction access cacheability disable. For the Non-secure EL1&0 translation regime, when 
HCR_EL2.VM==1, this control forces all stage 2 translations for instruction accesses to Normal 
memory to be Non-cacheable. 


0 This control has no effect on stage 2 of the Non-secure EL1&0 translation regime. 


1 For the Non-secure EL1&0 translation regime, forces all stage 2 translations for
instruction accesses to Normal memory to be Non-cacheable.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


This bit has no effect on the EL2 or EL3 translation regimes. 


CD, bit [32] 


Stage 2 Data access cacheability disable. For the Non-secure EL1&0 translation regime, when 
HCR_EL2.VM==1, this control forces all stage 2 translations for data accesses and translation table 
walks to Normal memory to be Non-cacheable. 


0 This control has no effect on stage 2 of the Non-secure EL1&0 translation regime for 
data accesses and translation table walks. 


1 For the Non-secure EL1&0 translation regime, forces all stage 2 translations for data 
accesses and translation table walks to Normal memory to be Non-cacheable. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


This bit has no effect on the EL2 or EL3 translation regimes. 


RW, bit [31] 
Execution state control for lower Exception levels:

0 Lower levels are all AArch32.

1 The Execution state for EL1 is AArch64. The Execution state for EL0 is determined by
the current value of PSTATE.nRW when executing at EL0.

If all lower Exception levels cannot use AArch32 then this bit is RAO/WI.

In an implementation that includes EL3, when SCR_EL3.NS==0, the PE behaves as if this bit has
the same value as the SCR_EL3.RW bit for all purposes other than a direct read or write access of
HCR_EL2.


The RW bit is permitted to be cached in a TLB. 
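
The "behaves as if" rule for the RW bit can be modelled as a small selection function. The following C fragment is a sketch only: it assumes the caller already knows whether EL3 is implemented and the current values of SCR_EL3.NS and SCR_EL3.RW, and it does not model the RAO/WI case. The names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_EL2_RW (UINT64_C(1) << 31)   /* RW, bit [31] */

/*
 * Hypothetical model of the effective HCR_EL2.RW value. In an
 * implementation that includes EL3, when SCR_EL3.NS is 0 the PE behaves
 * as if the bit has the value of SCR_EL3.RW. Otherwise the value held in
 * the register applies. A direct read of HCR_EL2 is not affected.
 */
static bool hcr_effective_rw(uint64_t hcr_el2, bool el3_implemented,
                             bool scr_ns, bool scr_rw)
{
    if (el3_implemented && !scr_ns)
        return scr_rw;
    return (hcr_el2 & HCR_EL2_RW) != 0;
}
```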


TRVM, bit [30] 
Trap Reads of Virtual Memory controls. Traps Non-secure EL1 reads of the virtual memory control
registers to EL2, from both Execution states. The registers for which read accesses are trapped are
as follows:

Non-secure EL1 using AArch64: SCTLR_EL1, TTBR0_EL1, TTBR1_EL1, TCR_EL1, ESR_EL1,
FAR_EL1, AFSR0_EL1, AFSR1_EL1, MAIR_EL1, AMAIR_EL1, CONTEXTIDR_EL1.

Non-secure EL1 using AArch32: SCTLR, TTBR0, TTBR1, TTBCR, DACR, DFSR, IFSR, DFAR,
IFAR, ADFSR, AIFSR, PRRR, NMRR, MAIR0, MAIR1, AMAIR0, AMAIR1, CONTEXTIDR.

0 This control has no effect on Non-secure EL1 read accesses to Virtual Memory controls.
1 Non-secure EL1 read accesses to the specified Virtual Memory controls are trapped to
EL2.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
HCD, bit [29] 


HVC instruction disable. Disables Non-secure state execution of HVC instructions, from both 
Execution states. 


0 HVC instruction execution is enabled at EL2 and Non-secure EL1. 

1 HVC instructions are UNDEFINED at EL2 and Non-secure EL1. Any resulting exception 
is taken to the Exception level at which the HVC instruction is executed. 

Note

HVC instructions are always UNDEFINED at EL0.

This bit is only implemented if EL3 is not implemented. Otherwise, it is RES0.





TDZ, bit [28] 
Trap DC ZVA instructions. Traps Non-secure EL0 and EL1 execution of DC ZVA instructions to
EL2, from AArch64 state only.
0 This control has no effect on the Non-secure EL0 and EL1 execution of DC ZVA
instructions.
1 In AArch64 state, any attempt to execute a DC ZVA instruction at Non-secure EL1, or
at Non-secure EL0 when the instruction is not UNDEFINED at EL0, is trapped to EL2.
Reading the DCZID_EL0 returns a value that indicates that DC ZVA instructions are
not supported.
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
TGE, bit [27] 
Trap General Exceptions, from Non-secure EL0.
0 This control has no effect on execution at EL0.
1 When the value of SCR_EL3.NS is 0, this control has no effect on execution at EL0.

When the value of SCR_EL3.NS is 1, then:

• All exceptions that would be routed to EL1 are routed to EL2.
• The SCTLR_EL1.M field, or the SCTLR.M field if EL1 is using AArch32, is
treated as being 0 for all purposes other than returning the result of a direct read
of SCTLR_EL1 or SCTLR.
• The HCR_EL2.{FMO, IMO, AMO} fields are treated as being 1 for all purposes
other than a direct read or write access of HCR_EL2.
• All virtual interrupts are disabled.
• Any IMPLEMENTATION DEFINED mechanisms for signaling virtual interrupts are
disabled.
• An exception return to EL1 is treated as an illegal exception return.
• The MDCR_EL2.{TDRA, TDOSA, TDA, TDE} fields are treated as being 1 for
all purposes other than returning the result of a direct read of MDCR_EL2.


HCR_EL2.TGE must not be cached in a TLB. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
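
Several of the TGE effects can be expressed as derived values of HCR_EL2. The following C fragment is a sketch under stated assumptions: the FMO, IMO, and AMO bit positions are taken from the HCR_EL2 bit assignments, an implementation without EL3 is treated as Non-secure, and the second helper models Non-secure state only. The names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_EL2_FMO (UINT64_C(1) << 3)
#define HCR_EL2_IMO (UINT64_C(1) << 4)
#define HCR_EL2_AMO (UINT64_C(1) << 5)
#define HCR_EL2_TGE (UINT64_C(1) << 27)
#define HCR_EL2_ROUTING_MASK (HCR_EL2_FMO | HCR_EL2_IMO | HCR_EL2_AMO)

/*
 * Hypothetical helper: true when TGE takes effect. With EL3 implemented
 * and SCR_EL3.NS == 0, the PE behaves as if TGE is 0.
 */
static bool hcr_tge_active(uint64_t hcr_el2, bool el3_implemented, bool scr_ns)
{
    if (el3_implemented && !scr_ns)
        return false;
    return (hcr_el2 & HCR_EL2_TGE) != 0;
}

/*
 * Hypothetical helper, Non-secure state only: the {FMO, IMO, AMO} values
 * the PE acts on. They are treated as 1 while TGE is 1. A direct read of
 * HCR_EL2 still returns the stored bits.
 */
static uint64_t hcr_effective_routing_ns(uint64_t hcr_el2)
{
    if (hcr_el2 & HCR_EL2_TGE)
        return HCR_EL2_ROUTING_MASK;
    return hcr_el2 & HCR_EL2_ROUTING_MASK;
}
```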


TVM, bit [26] 


Trap Virtual Memory controls. Traps Non-secure EL1 writes to the virtual memory control registers 
to EL2, from both Execution states. The registers for which write accesses are trapped are as 
follows: 


Non-secure EL1 using AArch64: SCTLR_EL1, TTBR0_EL1, TTBR1_EL1, TCR_EL1, ESR_EL1,
FAR_EL1, AFSR0_EL1, AFSR1_EL1, MAIR_EL1, AMAIR_EL1, CONTEXTIDR_EL1.

Non-secure EL1 using AArch32: SCTLR, TTBR0, TTBR1, TTBCR, DACR, DFSR, IFSR, DFAR,
IFAR, ADFSR, AIFSR, PRRR, NMRR, MAIR0, MAIR1, AMAIR0, AMAIR1, CONTEXTIDR.

0 This control has no effect on Non-secure EL1 write accesses to EL1 virtual memory
control registers.


1 Non-secure EL1 write accesses to the specified EL1 virtual memory control registers 
are trapped to EL2. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


TTLB, bit [25] 


Trap TLB maintenance instructions. Traps Non-secure EL1 execution of TLB maintenance 
instructions to EL2, from both Execution states. This applies to the following instructions: 


Non-secure EL1 using AArch64: TLBI VMALLE1IS, TLBI VAE1IS, TLBI ASIDE1IS, TLBI
VAAE1IS, TLBI VALE1IS, TLBI VAALE1IS, TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1,
TLBI VAAE1, TLBI VALE1, TLBI VAALE1.

Non-secure EL1 using AArch32: TLBIALLIS, TLBIMVAIS, TLBIASIDIS, TLBIMVAAIS,
TLBIMVALIS, TLBIMVAALIS, ITLBIALL, ITLBIMVA, ITLBIASID, DTLBIALL,
DTLBIMVA, DTLBIASID, TLBIALL, TLBIMVA, TLBIASID, TLBIMVAA, TLBIMVAL,
TLBIMVAAL.

0 This control has no effect on Non-secure EL1 execution of TLB maintenance
instructions.

1 Non-secure EL1 execution of the specified TLB maintenance instructions is trapped
to EL2.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


TPU, bit [24] 


Trap cache maintenance instructions that operate to the Point of Unification. Traps execution of
those cache maintenance instructions at Non-secure EL1 or EL0 using AArch64, and at Non-secure
EL1 using AArch32, to EL2. This applies to the following instructions:

Non-secure EL0 using AArch64: IC IVAU, DC CVAU. However, if the value of SCTLR_EL1.UCI
is 0 these instructions are UNDEFINED at EL0 and any resulting exception is higher priority than this
trap to EL2.

Non-secure EL1 using AArch64: IC IVAU, IC IALLU, IC IALLUIS, DC CVAU.

Non-secure EL1 using AArch32: ICIMVAU, ICIALLU, ICIALLUIS, DCCMVAU.

Note

An exception generated because an instruction is UNDEFINED at EL0 is higher priority than this trap
to EL2. In addition:

• IC IALLUIS and IC IALLU are always UNDEFINED at EL0 using AArch64.
• ICIMVAU, ICIALLU, ICIALLUIS, and DCCMVAU are always UNDEFINED at EL0 using
AArch32.

0 This control has no effect on the execution of cache maintenance instructions.
1 Non-secure execution of the specified instructions is trapped to EL2.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
TPC, bit [23] 


Trap data or unified cache maintenance instructions that operate to the Point of Coherency. Traps
execution of those cache maintenance instructions at Non-secure EL1 or EL0 using AArch64, and
at Non-secure EL1 using AArch32, to EL2. This applies to the following instructions:

Non-secure EL0 using AArch64: DC CIVAC, DC CVAC. However, if the value of
SCTLR_EL1.UCI is 0 these instructions are UNDEFINED at EL0 and any resulting exception is
higher priority than this trap to EL2.

Non-secure EL1 using AArch64: DC IVAC, DC CIVAC, DC CVAC.

Non-secure EL1 using AArch32: DCIMVAC, DCCIMVAC, DCCMVAC.

Note

An exception generated because an instruction is UNDEFINED at EL0 is higher priority than this trap
to EL2. In addition:

• DC IVAC is always UNDEFINED at EL0 using AArch64.
• DCIMVAC, DCCIMVAC, and DCCMVAC are always UNDEFINED at EL0 using AArch32.

0 This control has no effect on the execution of cache maintenance instructions.
1 Non-secure execution of the specified instructions is trapped to EL2.
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
TSW, bit [22] 


Trap data or unified cache maintenance instructions that operate by Set/Way. Traps execution of 
those cache maintenance instructions at Non-secure EL1 using AArch64, and at Non-secure EL1 
using AArch32, to EL2. This applies to the following instructions: 


Non-secure EL1 using AArch64: DC ISW, DC CSW, DC CISW. 
Non-secure EL1 using AArch32: DCISW, DCCSW, DCCISW. 


Note

An exception generated because an instruction is UNDEFINED at EL0 is higher priority than this trap
to EL2, and these instructions are always UNDEFINED at EL0.

0 This control has no effect on the execution of cache maintenance instructions.
1 Non-secure execution of the specified instructions is trapped to EL2.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if
this field is 0 for all purposes other than a direct read or write access of HCR_EL2.

TACR, bit [21]

Trap Auxiliary Control Registers. Traps Non-secure EL1 accesses to the Auxiliary Control
Registers to EL2, from both Execution states. This applies to the following register accesses:

• Non-secure EL1 using AArch64: ACTLR_EL1.
• Non-secure EL1 using AArch32: ACTLR and, if implemented, ACTLR2.

0 This control has no effect on Non-secure EL1 accesses to the Auxiliary Control
Registers.
1 Non-secure EL1 accesses to the specified registers are trapped to EL2.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
TIDCP, bit [20] 


Trap IMPLEMENTATION DEFINED functionality. Traps Non-secure EL1 accesses to the encodings 
reserved for IMPLEMENTATION DEFINED functionality to EL2. This applies to the following register 
accesses: 


AArch64: The following reserved encoding spaces:

• IMPLEMENTATION DEFINED system instructions, which are accessed using SYS and SYSL,
with CRn == {11, 15}.
• IMPLEMENTATION DEFINED System registers, which are accessed using MRS and MSR with
the S3_<op1>_<Cn>_<Cm>_<op2> register name.

AArch32: MCR and MRC instructions accessing the following encodings:

• All coproc==p15, CRn==c9, opc1 == {0-7}, CRm == {c0-c2, c5-c8}, opc2 == {0-7}.
• All coproc==p15, CRn==c10, opc1 == {0-7}, CRm == {c0, c1, c4, c8}, opc2 == {0-7}.
• All coproc==p15, CRn==c11, opc1 == {0-7}, CRm == {c0-c8, c15}, opc2 == {0-7}.

When the value of HCR_EL2.TIDCP is 1, it is IMPLEMENTATION DEFINED whether any of this
functionality accessed from Non-secure EL0 is trapped to EL2. If it is not, then it is UNDEFINED, and
any attempt to access it from Non-secure EL0 generates an exception that is taken to EL1.

0 This control has no effect on the encodings reserved for IMPLEMENTATION DEFINED
functionality.
1 Non-secure EL1 accesses to or execution of the specified encodings reserved for
IMPLEMENTATION DEFINED functionality are trapped to EL2.
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
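
The AArch32 encoding ranges covered by TIDCP depend only on CRn and CRm, because opc1 and opc2 can each take any value from 0 to 7. The following C fragment is a sketch of a membership test for those ranges, assuming coproc==p15 has already been checked by the caller. The function names are illustrative.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true when an AArch32 MCR or MRC with coproc==p15
 * uses one of the encodings reserved for IMPLEMENTATION DEFINED
 * functionality that HCR_EL2.TIDCP covers. crn and crm are the register
 * numbers, for example 9 for c9.
 */
static bool tidcp_aarch32_encoding(unsigned crn, unsigned crm)
{
    switch (crn) {
    case 9:     /* CRm == {c0-c2, c5-c8} */
        return crm <= 2 || (crm >= 5 && crm <= 8);
    case 10:    /* CRm == {c0, c1, c4, c8} */
        return crm == 0 || crm == 1 || crm == 4 || crm == 8;
    case 11:    /* CRm == {c0-c8, c15} */
        return crm <= 8 || crm == 15;
    default:
        return false;
    }
}

/*
 * Hypothetical helper: true when an AArch64 SYS or SYSL instruction uses
 * a CRn value in the IMPLEMENTATION DEFINED space, CRn == {11, 15}.
 */
static bool tidcp_aarch64_sys_crn(unsigned crn)
{
    return crn == 11 || crn == 15;
}
```

Whether a matching access is actually trapped also depends on the Exception level, the Security state, and the value of HCR_EL2.TIDCP, none of which this sketch models.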
TSC, bit [19] 


Trap SMC instructions. Traps Non-secure EL1 execution of SMC instructions to EL2, from both 
Execution states. 


0 This control has no effect on the execution of SMC instructions.

1 Any attempt to execute an SMC instruction at Non-secure EL1 using AArch64 or
Non-secure EL1 using AArch32 is trapped to EL2, regardless of the value of
SCR_EL3.SMD.

In AArch32 state, the ARMv8-A architecture permits, but does not require, this trap to apply to
conditional SMC instructions that fail their condition code check, in the same way as with traps on
other conditional instructions.

If EL3 is not implemented, this bit is RES0.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


TID3, bit [18]
Trap ID group 3. Traps Non-secure EL1 reads of the following registers to EL2:

AArch64: ID_PFR0_EL1, ID_PFR1_EL1, ID_DFR0_EL1, ID_AFR0_EL1, ID_MMFR0_EL1,
ID_MMFR1_EL1, ID_MMFR2_EL1, ID_MMFR3_EL1, ID_ISAR0_EL1, ID_ISAR1_EL1,
ID_ISAR2_EL1, ID_ISAR3_EL1, ID_ISAR4_EL1, ID_ISAR5_EL1, MVFR0_EL1,
MVFR1_EL1, MVFR2_EL1, ID_AA64PFR0_EL1, ID_AA64PFR1_EL1, ID_AA64DFR0_EL1,
ID_AA64DFR1_EL1, ID_AA64ISAR0_EL1, ID_AA64ISAR1_EL1, ID_AA64MMFR0_EL1,
ID_AA64MMFR1_EL1, ID_AA64AFR0_EL1, ID_AA64AFR1_EL1, and ID_MMFR4_EL1,
except that if ID_MMFR4_EL1 is implemented as RAZ/WI then it is IMPLEMENTATION DEFINED
whether accesses to ID_MMFR4_EL1 are trapped.

It is IMPLEMENTATION DEFINED whether this field traps MRS accesses to encodings in the following
range that are not already mentioned in this field description:

• op0 == 3, op1 == 0, CRn == c0, CRm == {c2-c7}, op2 == {0-7}.

AArch32: ID_PFR0, ID_PFR1, ID_DFR0, ID_AFR0, ID_MMFR0, ID_MMFR1, ID_MMFR2,
ID_MMFR3, ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, ID_ISAR4, ID_ISAR5, MVFR0,
MVFR1, MVFR2, and ID_MMFR4, except that if ID_MMFR4 is implemented as RAZ/WI then it
is IMPLEMENTATION DEFINED whether accesses to ID_MMFR4 are trapped.

MRC accesses to any of the following encodings are also trapped:
• coproc==p15, opc1 == 0, CRn == c0, CRm == {c3-c7}, opc2 == {0,1}.
• coproc==p15, opc1 == 0, CRn == c0, CRm == c3, opc2 == 2.
• coproc==p15, opc1 == 0, CRn == c0, CRm == c5, opc2 == {4,5}.
It is IMPLEMENTATION DEFINED whether this bit traps MRC accesses to the following encodings:
• coproc==p15, opc1 == 0, CRn == c0, CRm == c2, opc2 == 7.
• coproc==p15, opc1 == 0, CRn == c0, CRm == c3, opc2 == {3-7}.
• coproc==p15, opc1 == 0, CRn == c0, CRm == {c4, c6, c7}, opc2 == {2-7}.
• coproc==p15, opc1 == 0, CRn == c0, CRm == c5, opc2 == {2, 3, 6, 7}.
0 This control has no effect on Non-secure EL1 reads of the ID group 3 registers.
1 The specified Non-secure EL1 read accesses to ID group 3 registers are trapped to EL2.
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
TID2, bit [17] 
Trap ID group 2. Traps the following register accesses to EL2:
AArch64:
• Non-secure EL1 reads of CTR_EL0, CCSIDR_EL1, CLIDR_EL1, and CSSELR_EL1.
• Non-secure EL0 reads of CTR_EL0, except that if the value of SCTLR_EL1.UCT is 0 then
EL0 reads of CTR_EL0 are UNDEFINED and any resulting exception takes precedence over
this trap.
• Non-secure EL1 writes to CSSELR_EL1.
AArch32:
• Non-secure EL1 reads of the CTR, CCSIDR, CLIDR, and CSSELR.
• Non-secure EL1 writes to the CSSELR.

0 This control has no effect on Non-secure EL1 and EL0 accesses to the ID group 2
registers.

1 The specified Non-secure EL1 and EL0 accesses to ID group 2 registers are trapped to
EL2.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


TID1, bit [16]
Trap ID group 1. Traps Non-secure EL1 reads of the following registers to EL2:
AArch64: REVIDR_EL1, AIDR_EL1.
AArch32: TCMTR, TLBTR, REVIDR, AIDR.
0 This control has no effect on Non-secure EL1 reads of the ID group 1 registers.
1 The specified Non-secure EL1 read accesses to ID group 1 registers are trapped to EL2.

In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if
this field is 0 for all purposes other than a direct read or write access of HCR_EL2.

TID0, bit [15]

Trap ID group 0. Traps the following register accesses to EL2:

AArch64: None.

AArch32:

• Non-secure EL1 reads of the JIDR.
• If the JIDR is RAZ from Non-secure EL0, Non-secure EL0 reads of the JIDR.
• Non-secure EL1 reads of the FPSID.

Note

• It is IMPLEMENTATION DEFINED whether the JIDR is RAZ or UNDEFINED at EL0. If it is
UNDEFINED at EL0 then any resulting exception takes precedence over this trap.
• The FPSID is not accessible at EL0 using AArch32.
• Writes to the FPSID are ignored, and not trapped by this control.

0 This control has no effect on Non-secure EL1 reads of the ID group 0 registers.
1 The specified Non-secure EL1 read accesses to ID group 0 registers are trapped to EL2.

In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if
this field is 0 for all purposes other than a direct read or write access of HCR_EL2.

TWE, bit [14]

Traps Non-secure EL0 and EL1 execution of WFE instructions to EL2, from both Execution states.

0 This control has no effect on the execution of WFE instructions at Non-secure EL0 or
Non-secure EL1.

1 Any attempt to execute a WFE instruction at Non-secure EL0 or EL1 is trapped to EL2,
if the instruction would otherwise have caused the PE to enter a low-power state, except
that when the value of SCTLR_EL1.nTWE is 0, the trap of EL0 execution to EL1 takes
precedence over this trap.

In AArch32 state, the attempted execution of a conditional WFE instruction is only trapped if the
instruction passes its condition code check.

In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if
this field is 0 for all purposes other than a direct read or write access of HCR_EL2.

Note

Since a WFE can complete at any time, even without a Wakeup event, the traps on WFE are not
guaranteed to be taken, even if the WFE is executed when there is no Wakeup event. The only
guarantee is that if the instruction does not complete in finite time in the absence of a Wakeup event,
the trap will be taken.

TWI, bit [13]

Traps Non-secure EL0 and EL1 execution of WFI instructions to EL2, from both Execution states.

0 This control has no effect on the execution of WFI instructions at Non-secure EL1 or
Non-secure EL0.

1 Any attempt to execute a WFI instruction at Non-secure EL0 or EL1 is trapped to EL2,
if the instruction would otherwise have caused the PE to enter a low-power state, except
that when the value of SCTLR_EL1.nTWI is 0, the trap of EL0 execution to EL1 takes
precedence over this trap.

In AArch32 state, the attempted execution of a conditional WFI instruction is only trapped if the
instruction passes its condition code check.

In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if
this field is 0 for all purposes other than a direct read or write access of HCR_EL2.

Note

Since a WFI can complete at any time, even without a Wakeup event, the traps on WFI are not
guaranteed to be taken, even if the WFI is executed when there is no Wakeup event. The only
guarantee is that if the instruction does not complete in finite time in the absence of a Wakeup event,
the trap will be taken.

DC, bit [12]

Default Cacheability.
0 This control has no effect on the Non-secure EL1&0 translation regime.
1 In Non-secure state:

• When EL1 is using AArch64, the PE behaves as if the value of the
SCTLR_EL1.M field is 0 for all purposes other than returning the value of a
direct read of SCTLR_EL1.
• When EL1 is using AArch32, the PE behaves as if the value of the SCTLR.M
field is 0 for all purposes other than returning the value of a direct read of SCTLR.
• The PE behaves as if the value of the HCR_EL2.VM field is 1 for all purposes
other than returning the value of a direct read of HCR_EL2.
• The memory type produced by stage 1 of the EL1&0 translation regime is
Normal Non-Shareable, Inner Write-Back Read-Allocate Write-Allocate, Outer
Write-Back Read-Allocate Write-Allocate.


This field has no effect on the EL2 and EL3 translation regimes. 
This field is permitted to be cached in a TLB. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


BSU, bits [11:10] 


Barrier Shareability upgrade. This field determines the minimum shareability domain that is applied
to any barrier instruction executed from Non-secure EL1 or Non-secure EL0:

00 No effect
01 Inner Shareable
10 Outer Shareable
11 Full system


This value is combined with the specified level of the barrier held in its instruction, using the same 
principles as combining the shareability attributes from two stages of address translation. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
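
As a rough illustration of the combining rule, the following Python sketch picks the wider of the two shareability domains. The integer encoding used for the barrier's own domain (0 for Non-shareable through 3 for Full system) and the function name are assumptions made for this example, not part of the register definition.

```python
# HCR_EL2.BSU encodings, bits [11:10], as listed in the field description.
BSU_NAMES = {
    0b00: "No effect",
    0b01: "Inner Shareable",
    0b10: "Outer Shareable",
    0b11: "Full system",
}


def effective_barrier_domain(bsu: int, barrier_domain: int) -> int:
    """Return the domain applied to a barrier executed at Non-secure EL1 or EL0.

    barrier_domain uses a hypothetical ordering for this sketch:
    0 = Non-shareable, 1 = Inner Shareable, 2 = Outer Shareable, 3 = Full system.
    BSU acts as a minimum, so the wider of the two domains is used.
    """
    if bsu not in BSU_NAMES or not 0 <= barrier_domain <= 3:
        raise ValueError("domain values must be in the range 0..3")
    return max(bsu, barrier_domain)
```

With BSU set to 0b10, an Inner Shareable barrier is upgraded to Outer Shareable, while a Full system barrier is left unchanged.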





FB, bit [9] 
Force broadcast. Causes the following instructions to be broadcast within the Inner Shareable 
domain when executed from Non-secure EL1: 
AArch32: BPIALL, TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID, 
ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, ICIALLU, TLBIMVAL, TLBIMVAAL. 
AArch64: TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1, TLBI VALE1, TLBI
VAALE1, IC IALLU. 


0 This field has no effect on the operation of the specified instructions. 


1 When one of the specified instructions is executed at Non-secure EL1, the instruction is
broadcast within the Inner Shareable shareability domain.


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
VSE, bit [8] 
Virtual SError interrupt. 
0 This mechanism is not making a virtual SError interrupt pending. 
1 A virtual SError interrupt is pending because of this mechanism. 
The virtual SError interrupt is only enabled when the value of HCR_EL2.{TGE, AMO} is {0, 1}. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
VI, bit [7] 
Virtual IRQ Interrupt. 
0 This mechanism is not making a virtual IRQ pending.
1 A virtual IRQ is pending because of this mechanism. 
The virtual IRQ is only enabled when the value of HCR_EL2.{TGE, IMO} is {0, 1}. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
VF, bit [6] 
Virtual FIQ Interrupt. 
0 This mechanism is not making a virtual FIQ pending. 
1 A virtual FIQ is pending because of this mechanism. 
The virtual FIQ is only enabled when the value of HCR_EL2.{TGE, FMO} is {0, 1}. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
AMO, bit [5] 
Physical SError Interrupt routing. 


0 When executing at Non-secure Exception levels below EL2, physical SError Interrupts 
are not taken to EL2. 


When executing at EL2 using AArch64, physical SError Interrupts are not taken unless 
they are routed to EL3 by the SCR_EL3.EA bit. 


Virtual SError interrupts are disabled. 
1 When executing at any Exception level in Non-secure state: 
° Physical SError interrupts are taken to EL2 unless they are routed to EL3. 


° If HCR_EL2.TGE==0 then Virtual SError interrupts are enabled in the 
Non-secure state. 


If the value of HCR_EL2.TGE is 1, then in Non-secure state the HCR_EL2.AMO bit behaves as 1
for all purposes other than a direct read of the value of the bit.


For more information, see Asynchronous exception routing on page D1-1556. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
IMO, bit [4]
Physical IRQ Routing. 


0 When executing at Non-secure Exception levels below EL2, physical IRQ interrupts are 
not taken to EL2. 


When executing at EL2 using AArch64, physical IRQ interrupts are not taken unless 
they are routed to EL3 by the SCR_EL3.IRQ bit. 


Virtual IRQ interrupts are disabled. 
1 When executing at any Exception level in Non-secure state: 
° Physical IRQ interrupts are taken to EL2 unless they are routed to EL3. 


° If HCR_EL2.TGE==0 then Virtual IRQ interrupts are enabled in Non-secure 
state. 


If the value of HCR_EL2.TGE is 1, then in Non-secure state the HCR_EL2.IMO bit behaves as 1 
for all purposes other than a direct read of the value of the bit. 


For more information, see Asynchronous exception routing on page D1-1556. 


In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 


FMO, bit [3] 
Physical FIQ Routing. 
0 When executing at Non-secure Exception levels below EL2, physical FIQ interrupts are
not taken to EL2. 
When executing at EL2 using AArch64, physical FIQ interrupts are not taken unless 
they are routed to EL3 by the SCR_EL3.FIQ bit. 
Virtual FIQ interrupts are disabled. 
1 When executing at any Exception level in Non-secure state: 
° Physical FIQ interrupts are taken to EL2 unless they are routed to EL3. 
° If HCR_EL2.TGE==0 then Virtual FIQ interrupts are enabled in Non-secure 
state. 
If the value of HCR_EL2.TGE is 1, then in Non-secure state the HCR_EL2.FMO bit behaves as 1 
for all purposes other than a direct read of the value of the bit. 
For more information, see Asynchronous exception routing on page D1-1556. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
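
The enable conditions for the three virtual interrupts follow the same pattern, which the sketch below tries to capture. It assumes Non-secure state, takes the TGE value as a separate argument because its bit position is not given in this part of the description, and uses a hypothetical function name.

```python
def enabled_virtual_interrupts(hcr_el2: int, tge: int) -> dict:
    """Report which virtual interrupts are both pending and enabled.

    Pending bits: VSE bit [8], VI bit [7], VF bit [6].
    Routing bits: AMO bit [5], IMO bit [4], FMO bit [3].
    A virtual interrupt is enabled only when {TGE, xMO} is {0, 1}.
    """
    def bit(n: int) -> int:
        return (hcr_el2 >> n) & 1

    tge_clear = tge == 0
    return {
        "SError": bool(bit(8) and bit(5) and tge_clear),
        "IRQ": bool(bit(7) and bit(4) and tge_clear),
        "FIQ": bool(bit(6) and bit(3) and tge_clear),
    }
```

For example, a value with only VI and IMO set reports a virtual IRQ when TGE is 0, and nothing when TGE is 1.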
PTW, bit [2] 


Protected Table Walk. In the Non-secure EL1&0 translation regime, a translation table access made 
as part of a stage 1 translation table walk is subject to a stage 2 translation. The combining of the 
memory type attributes from the two stages of translation means the access might be made to a type 
of Device memory. If this occurs then the value of this bit determines the behavior: 


0 The translation table walk occurs as if it is to Normal Non-cacheable memory. This 
means it can be made speculatively. 


1 The memory access generates a stage 2 Permission fault. 
This field is permitted to be cached in a TLB. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
SWIO, bit [1] 


Set/Way Invalidation Override. Causes Non-secure EL1 execution of the data cache invalidate by 
set/way instructions to be treated as data cache clean and invalidate by set/way: 





0 This control has no effect on the operation of data cache invalidate by set/way
instructions.
1 Data cache invalidate by set/way instructions operate as data cache clean and invalidate
by set/way.


When the value of this bit is 1: 
AArch32: DCISW is executed as DCCISW. 
AArch64: DC ISW is executed as DC CISW. 
This bit can be implemented as RES1. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if 
this field is 0 for all purposes other than a direct read or write access of HCR_EL2. 
VM, bit [0] 


Virtualization enable. Enables stage 2 address translation for the Non-secure EL1&0 translation 
regime. Possible values of this bit are: 


0 Non-secure EL1&0 stage 2 address translation disabled.
1 Non-secure EL1&0 stage 2 address translation enabled. 


When the value of this bit is 1, data cache invalidate instructions executed at Non-secure EL1 
operate as data cache clean and invalidate instructions. For the invalidate by set/way instruction this 
behavior applies regardless of the value of the HCR_EL2.SWIO bit. 


This bit is permitted to be cached in a TLB. 
In an implementation that includes EL3, when the value of SCR_EL3.NS is 0 the PE behaves as if
this field is 0 for all purposes other than a direct read or write access of HCR_EL2.
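
The field positions described for bits [12:0] can be collected into a small decoder. This is only a sketch: it covers the fields from DC down to VM, ignores the higher bits of the register, and the table and function names are made up for the example.

```python
# (msb, lsb) positions for the HCR_EL2 fields in bits [12:0].
HCR_EL2_LOW_FIELDS = {
    "DC": (12, 12),
    "BSU": (11, 10),
    "FB": (9, 9),
    "VSE": (8, 8),
    "VI": (7, 7),
    "VF": (6, 6),
    "AMO": (5, 5),
    "IMO": (4, 4),
    "FMO": (3, 3),
    "PTW": (2, 2),
    "SWIO": (1, 1),
    "VM": (0, 0),
}


def decode_hcr_el2_low(value: int) -> dict:
    """Split the low 13 bits of an HCR_EL2 value into named fields."""
    fields = {}
    for name, (msb, lsb) in HCR_EL2_LOW_FIELDS.items():
        width = msb - lsb + 1
        fields[name] = (value >> lsb) & ((1 << width) - 1)
    return fields
```

A value read with MRS could be passed straight to `decode_hcr_el2_low` to check, for instance, whether VM and IMO are both set before relying on virtual IRQ delivery.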
Accessing the HCR_EL2: 
To access the HCR_EL2: 


MRS <Xt>, HCR_EL2 ; Read HCR_EL2 into Xt 
MSR HCR_EL2, <Xt> ; Write Xt to HCR_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  0001  0001  000
D7.2.34 HPFAR_EL2, Hypervisor IPA Fault Address Register
The HPFAR_EL2 characteristics are: 


Purpose 


Holds the faulting IPA for some aborts on a stage 2 translation taken to EL2. 


Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register HPFAR_EL2[31:0] is architecturally mapped to AArch32 System register 
HPFAR. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
HPFAR_EL2 is a 64-bit register. 


Field descriptions 


The HPFAR_EL2 bit assignments are:

63      40 39           4 3     0
RES0       FIPA[47:12]    RES0

Execution at EL1 or EL0 makes HPFAR_EL2 become UNKNOWN.

Bits [63:40]
Reserved, RES0.
FIPA[47:12], bits [39:4]

Bits [47:12] of the faulting intermediate physical address. For implementations with fewer than 48
physical address bits, the corresponding upper bits in this field are RES0.


The HPFAR_EL2 is written for: 


• Translation or Access faults in the second stage of translation.

• An abort in the second stage of translation performed during the translation table walk of a
first stage translation, caused by a Translation fault, an Access flag fault, or a Permission
fault.

• A stage 2 Address size fault.

Note


The address held in this register is an address accessed by the instruction fetch or data access that 
caused the exception that gave rise to the instruction or data abort. It is the lowest address that gave 
rise to the fault. Where different faults from different addresses arise from the same instruction, such 
as for an instruction that loads or stores a mis-aligned address that crosses a page boundary, the 
architecture does not prioritize between those different faults. 





For all other exceptions taken to EL2, this register is UNKNOWN. 


Bits [3:0] 
Reserved, RES0.
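
Because FIPA holds bits [47:12] of the address in register bits [39:4], recovering the page-aligned IPA takes a shift in each direction. The sketch below shows that arithmetic; the function name is hypothetical, and the low 12 address bits are not held in this register, so the result is only page-aligned.

```python
def hpfar_to_ipa_page(hpfar_el2: int) -> int:
    """Return the 4KB-aligned faulting IPA held in an HPFAR_EL2 value."""
    fipa = (hpfar_el2 >> 4) & ((1 << 36) - 1)  # FIPA[47:12] sits in bits [39:4]
    return fipa << 12                          # put it back at address bits [47:12]
```

The RES0 bits [3:0] and [63:40] are masked off, so stray values there do not affect the result.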


Accessing the HPFAR_EL2: 
To access the HPFAR_EL2: 


MRS <Xt>, HPFAR_EL2 ; Read HPFAR_EL2 into Xt 
MSR HPFAR_EL2, <Xt> ; Write Xt to HPFAR_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  0110  0000  100
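
To show how the op0, op1, CRn, CRm, and op2 values relate to an instruction word, the sketch below packs them into an MRS or MSR (register) encoding. The bit layout used here is the A64 System register move encoding as the editor understands it and is not stated in this section, so treat it as an assumption; the function name is also hypothetical.

```python
def sysreg_insn(op0: int, op1: int, crn: int, crm: int, op2: int,
                rt: int, read: bool) -> int:
    """Build the 32-bit encoding of MRS (read=True) or MSR (read=False).

    Assumed layout: fixed bits 0xD5100000, L at bit 21 (1 for MRS),
    op0<0> at bit 19, op1 at [18:16], CRn at [15:12], CRm at [11:8],
    op2 at [7:5], Rt at [4:0]. Only op0 values 0b10 and 0b11 are valid here.
    """
    if op0 not in (0b10, 0b11):
        raise ValueError("System register accesses use op0 == 0b10 or 0b11")
    word = 0xD5100000
    if read:
        word |= 1 << 21
    word |= (op0 & 1) << 19
    word |= (op1 & 0x7) << 16
    word |= (crn & 0xF) << 12
    word |= (crm & 0xF) << 8
    word |= (op2 & 0x7) << 5
    word |= rt & 0x1F
    return word
```

Passing the HPFAR_EL2 values from the table with Rt set to 0 gives the word for `MRS X0, HPFAR_EL2`.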
D7.2.35 HSTR_EL2, Hypervisor System Trap Register
The HSTR_EL2 characteristics are: 
Purpose 
Controls trapping to Hyp mode of Non-secure accesses, at EL1 or lower in AArch32, to the System 
register in the coproc == 1111 encoding space, by the CRn value used to access the register using 
MCR or MRC instruction. When the register is accessible using an MCRR or MRRC instruction, 
this is the CRm value used to access the register. 
Usage constraints 
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register HSTR_EL2 is architecturally mapped to AArch32 System register HSTR. 
If EL2 is not implemented, this register is RES0 from EL3.
If no Exception level can use AArch32, then this register is RES0.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HSTR_EL2 is a 32-bit register. 
Field descriptions 
The HSTR_EL2 bit assignments are:

31    16 15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
RES0     T15 T14 T13 T12 T11 T10 T9  T8  T7  T6  T5  T4  T3  T2  T1  T0

Bits [31:16]
Reserved, RES0.
T<n>, bit [n], for n = 0 to 15
Fields T14 and T4 are RES0.
The remaining fields control whether Non-secure EL0 and EL1 accesses, using MCR, MRC,
MCRR, and MRRC instructions, to the System registers in the coproc == 1111 encoding space are
trapped to Hyp mode:
0 This control has no effect on Non-secure EL0 or EL1 accesses to System registers.
1 Any Non-secure EL1 MCR, MRC access with coproc == 1111 and CRn == <n> is
trapped to Hyp mode if the access is not UNDEFINED when the value of this field is 0. 


Any Non-secure EL1 MCRR, MRRC access with coproc == 1111 and CRm == <n> is 
trapped to Hyp mode if the access is not UNDEFINED when the value of this field is 0. 


For example, when HSTR_EL2.T7 is 1: 


° Any 32-bit access from a Non-secure EL1 mode, using an MCR or MRC instruction with 
coproc set to 1111 and <CRn> set to c7, and that is not UNDEFINED when HSTR_EL2.T7 is 0,
is trapped to Hyp mode. 


° Any 64-bit access from a Non-secure EL1 mode, using an MCRR or MRRC instruction with 
coproc set to 1111 and <CRm> set to c7, and that is not UNDEFINED when HSTR_EL2.T7 is 0, 
is trapped to Hyp mode. 
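
The per-field rule can be expressed as a simple bit test. The sketch below assumes the access is one that would not be UNDEFINED with the field at 0, treats the RES0 fields T4 and T14 as never trapping, and uses a hypothetical function name.

```python
HSTR_RES0_FIELDS = {4, 14}  # T4 and T14 are RES0


def hstr_traps(hstr_el2: int, n: int) -> bool:
    """Return True if HSTR_EL2.T<n> requests a trap to Hyp mode.

    n is the CRn value for MCR/MRC accesses, or the CRm value for
    MCRR/MRRC accesses, in the coproc == 1111 encoding space.
    """
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0..15")
    if n in HSTR_RES0_FIELDS:
        return False
    return bool((hstr_el2 >> n) & 1)
```

With only T7 set, an MCR access using c7 as CRn is reported as trapped, and accesses using any other register number are not.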
Accessing the HSTR_EL2: 
To access the HSTR_EL2: 


MRS <Xt>, HSTR_EL2 ; Read HSTR_EL2 into Xt 
MSR HSTR_EL2, <Xt> ; Write Xt to HSTR_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  0001  0001  011
D7.2.36 ID_AA64AFR0_EL1, AArch64 Auxiliary Feature Register 0

The ID_AA64AFR0_EL1 characteristics are:


Purpose 
Provides information about the IMPLEMENTATION DEFINED features of the PE in AArch64. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


Configurations 


There are no configuration notes. 


Attributes 
ID_AA64AFR0_EL1 is a 64-bit register.

Field descriptions

The ID_AA64AFR0_EL1 bit assignments are:

63    32 31    28 27    24 23    20 19    16 15    12 11     8 7      4 3      0
RES0     IMP DEF  IMP DEF  IMP DEF  IMP DEF  IMP DEF  IMP DEF  IMP DEF  IMP DEF

Bits [63:32]
Reserved, RES0.


IMPLEMENTATION DEFINED, bits [31:28] 
IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [27:24] 
IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [23:20] 
IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [19:16] 
IMPLEMENTATION DEFINED. 

















IMPLEMENTATION DEFINED, bits [15:12]
IMPLEMENTATION DEFINED.

IMPLEMENTATION DEFINED, bits [11:8]


IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [7:4] 


IMPLEMENTATION DEFINED. 

















IMPLEMENTATION DEFINED, bits [3:0] 


IMPLEMENTATION DEFINED. 


Accessing the ID_AA64AFR0_EL1:
To access the ID_AA64AFR0_EL1:
MRS <Xt>, ID_AA64AFR0_EL1 ; Read ID_AA64AFR0_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0101  100
D7.2.37 ID_AA64AFR1_EL1, AArch64 Auxiliary Feature Register 1

The ID_AA64AFR1_EL1 characteristics are:

Purpose

Reserved for future expansion of information about the IMPLEMENTATION DEFINED features of the
PE in AArch64.


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 
Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 


There are no configuration notes. 


Attributes 
ID_AA64AFR1_EL1 is a 64-bit register.

Field descriptions

The ID_AA64AFR1_EL1 bit assignments are:

63                                0
RES0

Bits [63:0]
Reserved, RES0.


Accessing the ID_AA64AFR1_EL1: 
To access the ID_AA64AFR1_EL1: 
MRS <Xt>, ID_AA64AFR1_EL1 ; Read ID_AA64AFR1_EL1 into Xt 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0101  101
D7.2.38 ID_AA64DFR0_EL1, AArch64 Debug Feature Register 0

The ID_AA64DFR0_EL1 characteristics are:


Purpose 
Provides top level information about the debug system in AArch64. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


Configurations 


The external register EDDFR gives information from this register. 


Attributes 
ID_AA64DFR0_EL1 is a 64-bit register.

Field descriptions

The ID_AA64DFR0_EL1 bit assignments are:

63    32 31      28 27  24 23  20 19  16 15  12 11     8 7        4 3        0
RES0     CTX_CMPs   RES0   WRPs   RES0   BRPs   PMUVer   TraceVer   DebugVer

Bits [63:32]
Reserved, RES0.


CTX_CMPs, bits [31:28] 


Number of breakpoints that are context-aware, minus 1. These are the highest numbered 
breakpoints. 


Bits [27:24] 
Reserved, RES0.
WRPs, bits [23:20]
Number of watchpoints, minus 1. The value of 0b0000 is reserved.
Bits [19:16]
Reserved, RES0.
BRPs, bits [15:12]

Number of breakpoints, minus 1. The value of 0b0000 is reserved.

PMUVer, bits [11:8]


Performance Monitors Extension version. Indicates whether System register interface to 
Performance Monitors extension is implemented. Defined values are: 


0000 Performance Monitors Extension System registers not implemented. 

0001 Performance Monitors Extension System registers implemented, PMUv3. 

1111 IMPLEMENTATION DEFINED form of performance monitors supported, PMUv3 not 
supported. 


All other values are reserved. 


TraceVer, bits [7:4] 


Trace support. Indicates whether System register interface to a trace macrocell is implemented. 
Defined values are: 


0000 Trace macrocell System registers not implemented. 
0001 Trace macrocell System registers implemented. 
All other values are reserved. 


A value of 0b0000 only indicates that no System register interface to a trace macrocell is 
implemented. A trace macrocell might nevertheless be implemented without a System register 
interface. 


DebugVer, bits [3:0] 


Debug architecture version. Indicates presence of ARMv8 debug architecture. 
0110 ARMv8 debug architecture. 


All other values are reserved. 
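
Several of these fields hold a count minus 1, which is easy to get wrong when reading the register by hand. The following sketch decodes a raw ID_AA64DFR0_EL1 value into counts and version numbers; the function and key names are invented for the example.

```python
def decode_id_aa64dfr0(value: int) -> dict:
    """Decode the defined fields of an ID_AA64DFR0_EL1 value."""
    def field(msb: int, lsb: int) -> int:
        return (value >> lsb) & ((1 << (msb - lsb + 1)) - 1)

    return {
        # The three count fields hold the number of resources minus 1.
        "context_aware_breakpoints": field(31, 28) + 1,
        "watchpoints": field(23, 20) + 1,
        "breakpoints": field(15, 12) + 1,
        "pmu_ver": field(11, 8),
        "trace_ver": field(7, 4),
        "debug_ver": field(3, 0),
    }
```

For an illustrative value of 0x10305106 this reports 6 breakpoints, 2 of them context-aware, 4 watchpoints, PMUv3 System registers, no trace System registers, and the ARMv8 debug architecture (0b0110).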


Accessing the ID_AA64DFR0_EL1:

To access the ID_AA64DFR0_EL1:

MRS <Xt>, ID_AA64DFR0_EL1 ; Read ID_AA64DFR0_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0101  000
D7.2.39 ID_AA64DFR1_EL1, AArch64 Debug Feature Register 1

The ID_AA64DFR1_EL1 characteristics are:


Purpose 
Reserved for future expansion of top level information about the debug system in AArch64. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


Configurations 


There are no configuration notes. 


Attributes 
ID_AA64DFR1_EL1 is a 64-bit register.

Field descriptions

The ID_AA64DFR1_EL1 bit assignments are:

63                                0
RES0

Bits [63:0]
Reserved, RES0.


Accessing the ID_AA64DFR1_EL1: 
To access the ID_AA64DFR1_EL1: 
MRS <Xt>, ID_AA64DFR1_EL1 ; Read ID_AA64DFR1_EL1 into Xt 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0101  001
D7.2.40 ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0
The ID_AA64ISAR0_EL1 characteristics are:


Purpose 
Provides information about the instructions implemented in AArch64 state. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 
Configurations 


There are no configuration notes. 


Attributes 
ID_AA64ISAR0_EL1 is a 64-bit register.

Field descriptions

The ID_AA64ISAR0_EL1 bit assignments are:

63    20 19   16 15  12 11   8 7    4 3    0
RES0     CRC32   SHA2   SHA1   AES    RES0

Bits [63:20]

Reserved, RES0.


CRC32, bits [19:16] 
CRC32 instructions in AArch64. Defined values are: 
0000 No CRC32 instructions implemented. 


0001 CRC32B, CRC32H, CRC32W, CRC32X, CRC32CB, CRC32CH, CRC32CW, and
CRC32CX instructions implemented.


All other values are reserved. 


SHA2, bits [15:12] 
SHA2 instructions in AArch64. Defined values are: 
0000 No SHA2 instructions implemented. 
0001 SHA256H, SHA256H2, SHA256SU0, and SHA256SU1 instructions implemented. 


All other values are reserved. 
SHA1, bits [11:8]
SHA1 instructions in AArch64. Defined values are:

0000 No SHA1 instructions implemented.
0001 SHA1C, SHA1P, SHA1M, SHA1H, SHA1SU0, and SHA1SU1 instructions
implemented.


All other values are reserved. 


AES, bits [7:4] 


AES instructions in AArch64. Defined values are: 


0000 No AES instructions implemented. 
0001 AESE, AESD, AESMC, and AESIMC instructions implemented. 
0010 As for 0001, plus PMULL/PMULL2 instructions operating on 64-bit data quantities. 


All other values are reserved. 


Bits [3:0] 


Reserved, RES0.
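
A typical use of this register is to decide which instruction groups software may use. The sketch below maps a raw ID_AA64ISAR0_EL1 value onto feature flags using only the defined values listed above; reserved values are reported as not implemented, which is a conservative choice made for this example, and the names are hypothetical.

```python
def decode_id_aa64isar0(value: int) -> dict:
    """Map ID_AA64ISAR0_EL1 fields to instruction-group availability flags."""
    aes = (value >> 4) & 0xF
    sha1 = (value >> 8) & 0xF
    sha2 = (value >> 12) & 0xF
    crc32 = (value >> 16) & 0xF
    return {
        "aes": aes in (0b0001, 0b0010),   # AESE, AESD, AESMC, AESIMC
        "pmull_64": aes == 0b0010,        # PMULL/PMULL2 on 64-bit data
        "sha1": sha1 == 0b0001,
        "sha256": sha2 == 0b0001,
        "crc32": crc32 == 0b0001,
    }
```

An illustrative value of 0x00011120 reports every group as implemented, including the 64-bit PMULL forms.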


Accessing the ID_AA64ISAR0_EL1:
To access the ID_AA64ISAR0_EL1:
MRS <Xt>, ID_AA64ISAR0_EL1 ; Read ID_AA64ISAR0_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0110  000
D7.2.41 ID_AA64ISAR1_EL1, AArch64 Instruction Set Attribute Register 1
The ID_AA64ISAR1_EL1 characteristics are:

Purpose

Reserved for future expansion of the information about the instructions implemented in AArch64
state.

For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.


Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 


There are no configuration notes. 


Attributes 
ID_AA64ISAR1_EL1 is a 64-bit register. 


Field descriptions 


The ID_AA64ISAR1_EL1 bit assignments are:

63                                0
RES0

Bits [63:0]
Reserved, RES0.

Accessing the ID_AA64ISAR1_EL1: 

To access the ID_AA64ISAR1_EL1:

MRS <Xt>, ID_AA64ISAR1_EL1 ; Read ID_AA64ISAR1_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0110  001
D7.2.42 ID_AA64MMFR0_EL1, AArch64 Memory Model Feature Register 0
The ID_AA64MMFR0_EL1 characteristics are:

Purpose

Provides information about the implemented memory model and memory management support in
AArch64.


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 
Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 
Configurations 


There are no configuration notes. 


Attributes 
ID_AA64MMFR0_EL1 is a 64-bit register.

Field descriptions

The ID_AA64MMFR0_EL1 bit assignments are:

63    32 31    28 27     24 23     20 19       16 15    12 11     8 7        4 3       0
RES0     TGran4   TGran64   TGran16   BigEndEL0   SNSMem   BigEnd   ASIDBits   PARange

Bits [63:32]
Reserved, RES0.


TGran4, bits [31:28] 
Support for 4KB memory translation granule size. Defined values are: 
0000 4KB granule supported. 
1111 4KB granule not supported. 


All other values are reserved. 


TGran64, bits [27:24] 
Support for 64KB memory translation granule size. Defined values are: 
0000 64KB granule supported. 
1111 64KB granule not supported. 


All other values are reserved. 
TGran16, bits [23:20]
Support for 16KB memory translation granule size. Defined values are: 
0000 16KB granule not supported. 
0001 16KB granule supported. 


All other values are reserved. 


BigEndEL0, bits [19:16]
Mixed-endian support at EL0 only. Defined values are:
0000 No mixed-endian support at EL0. The SCTLR_EL1.E0E bit has a fixed value.
0001 Mixed-endian support at EL0. The SCTLR_EL1.E0E bit can be configured.
All other values are reserved.

This field is invalid and is RES0 if the BigEnd field, bits [11:8], is not 0000.

SNSMem, bits [15:12]
Secure versus Non-secure Memory distinction. Defined values are: 
0000 Does not support a distinction between Secure and Non-secure Memory. 
0001 Does support a distinction between Secure and Non-secure Memory. 


All other values are reserved. 


BigEnd, bits [11:8] 


Mixed-endian configuration support. Defined values are: 


0000 No mixed-endian support. The SCTLR_ELx.EE bits have a fixed value. See the
BigEndEL0 field, bits [19:16], for whether EL0 supports mixed-endian.

0001 Mixed-endian support. The SCTLR_ELx.EE and SCTLR_EL1.E0E bits can be
configured.


All other values are reserved. 
ASIDBits, bits [7:4] 
Number of ASID bits. Defined values are: 
0000 8 bits. 
0010 16 bits. 
All other values are reserved. 
PARange, bits [3:0] 


Physical Address range supported. Defined values are: 


0000 32 bits, 4GB. 
0001 36 bits, 64GB. 
0010 40 bits, 1TB. 
0011 42 bits, 4TB. 
0100 44 bits, 16TB. 
0101 48 bits, 256TB. 


All other values are reserved. 
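
Note that the granule fields do not share one convention: TGran4 and TGran64 use 0000 for "supported", whereas TGran16 uses 0001. The sketch below decodes the translation-related fields with that difference made explicit. Reserved encodings are returned as None or False, and the function and key names are assumptions for the example.

```python
PA_RANGE_BITS = {0b0000: 32, 0b0001: 36, 0b0010: 40,
                 0b0011: 42, 0b0100: 44, 0b0101: 48}
ASID_BITS = {0b0000: 8, 0b0010: 16}


def decode_id_aa64mmfr0(value: int) -> dict:
    """Decode PARange, ASIDBits, and granule support from ID_AA64MMFR0_EL1."""
    def field(lsb: int) -> int:
        return (value >> lsb) & 0xF

    return {
        "pa_bits": PA_RANGE_BITS.get(field(0)),    # None if reserved
        "asid_bits": ASID_BITS.get(field(4)),      # None if reserved
        "granule_4k": field(28) == 0b0000,         # 0000 means supported
        "granule_64k": field(24) == 0b0000,        # 0000 means supported
        "granule_16k": field(20) == 0b0001,        # 0001 means supported
    }
```

For an illustrative value of 0x00001122 this gives a 40-bit physical address range, 16-bit ASIDs, and 4KB and 64KB granules without 16KB support.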


Accessing the ID_AA64MMFR0_EL1:
To access the ID_AA64MMFR0_EL1:

MRS <Xt>, ID_AA64MMFR0_EL1 ; Read ID_AA64MMFR0_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0111  000
D7.2.43 ID_AA64MMFR1_EL1, AArch64 Memory Model Feature Register 1
The ID_AA64MMFR1_EL1 characteristics are:

Purpose

Reserved for future expansion of the information about the implemented memory model and
memory management support in AArch64.

For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.


Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 


There are no configuration notes. 


Attributes 
ID_AA64MMFR1_EL1 is a 64-bit register.

Field descriptions

The ID_AA64MMFR1_EL1 bit assignments are:

63                                0
RES0

Bits [63:0]
Reserved, RES0.

Accessing the ID_AA64MMFR1_EL1: 

To access the ID_AA64MMFR1_EL1:

MRS <Xt>, ID_AA64MMFR1_EL1 ; Read ID_AA64MMFR1_EL1 into Xt 

Register access is encoded as follows: 
op0 op1 CRn CRm op2 
11 000 0000 0111 001 
D7.2.44 ID_AA64PFR0_EL1, AArch64 Processor Feature Register 0
The ID_AA64PFR0_EL1 characteristics are:


Purpose 
Provides additional information about implemented PE features in AArch64. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 
Configurations 


The external register EDPFR gives information from this register. 


Attributes 
ID_AA64PFR0_EL1 is a 64-bit register.

Field descriptions

The ID_AA64PFR0_EL1 bit assignments are:

63    28 27  24 23      20 19  16 15  12 11   8 7    4 3    0
RES0     GIC    AdvSIMD    FP     EL3    EL2    EL1    EL0

Bits [63:28]

Reserved, RES0.


GIC, bits [27:24] 
System register GIC interface support. Defined values are: 
0000 No System register interface to the GIC is supported. 
0001 System register interface to versions 3.0 and 4.0 of the GIC CPU interface is supported. 


All other values are reserved. 


AdvSIMD, bits [23:20] 
Advanced SIMD. Defined values are: 
0000 Advanced SIMD is implemented. 
1111 Advanced SIMD is not implemented. 
All other values are reserved. 


This field must have the same value as the FP field. 
FP, bits [19:16]
Floating-point. Defined values are: 
0000 Floating-point is implemented. 
1111 Floating-point is not implemented. 
All other values are reserved. 


This field must have the same value as the AdvSIMD field. 


EL3, bits [15:12] 
EL3 Exception level handling. Defined values are: 
0000 EL3 is not implemented. 
0001 EL3 can be executed in AArch64 state only. 
0010 EL3 can be executed in either AArch64 or AArch32 state. 


All other values are reserved. 


EL2, bits [11:8] 


EL2 Exception level handling. Defined values are: 


0000 EL2 is not implemented. 
0001 EL2 can be executed in AArch64 state only. 
0010 EL2 can be executed in either AArch64 or AArch32 state. 


All other values are reserved. 


EL1, bits [7:4] 
EL1 Exception level handling. Defined values are: 
0001 EL1 can be executed in AArch64 state only. 
0010 EL1 can be executed in either AArch64 or AArch32 state. 
All other values are reserved. 


EL0, bits [3:0]
EL0 Exception level handling. Defined values are:
0001 EL0 can be executed in AArch64 state only.
0010 EL0 can be executed in either AArch64 or AArch32 state.


All other values are reserved. 
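The field layout above lends itself to a table-driven decoder. The following Python sketch assumes that the raw 64-bit register value has already been obtained by some other means (for example, read by firmware and logged); the dictionary and function names are illustrative and are not defined by the architecture.

```python
# Field layout of ID_AA64PFR0_EL1 as described above: name -> (msb, lsb).
ID_AA64PFR0_FIELDS = {
    "GIC": (27, 24),
    "AdvSIMD": (23, 20),
    "FP": (19, 16),
    "EL3": (15, 12),
    "EL2": (11, 8),
    "EL1": (7, 4),
    "EL0": (3, 0),
}


def decode_id_aa64pfr0(value):
    """Split a raw ID_AA64PFR0_EL1 value into its 4-bit fields."""
    fields = {}
    for name, (msb, lsb) in ID_AA64PFR0_FIELDS.items():
        width = msb - lsb + 1
        fields[name] = (value >> lsb) & ((1 << width) - 1)
    return fields


def aarch32_supported(fields, level):
    """True when the given level ('EL0' to 'EL3') reports 0b0010,
    meaning it can be executed in either AArch64 or AArch32 state."""
    return fields[level] == 0b0010


def simd_fp_consistent(fields):
    """The AdvSIMD and FP fields must hold the same value."""
    return fields["AdvSIMD"] == fields["FP"]
```

A hypothetical value of 0x01002222 decodes to GIC == 0001, AdvSIMD == FP == 0000, and 0010 for all four Exception level fields.
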


Accessing the ID_AA64PFR0_EL1:
To access the ID_AA64PFR0_EL1:
MRS <Xt>, ID_AA64PFR0_EL1 ; Read ID_AA64PFR0_EL1 into Xt


Register access is encoded as follows:

op0    op1    CRn     CRm     op2
11     000    0000    0100    000
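The op0, op1, CRn, CRm, and op2 values above select the register within an MRS instruction word. The following Python sketch assembles that word. The position of each field in the instruction (fixed bits 0xD5200000, op0 at bits [20:19], op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0]) is taken from the A64 System instruction encoding and is not restated in this section, so treat it as an assumption of the example. The function name is illustrative.

```python
def mrs_encoding(op0, op1, crn, crm, op2, rt):
    """Return the 32-bit A64 instruction word for MRS X<rt>, <system register>."""
    if op0 not in (0b10, 0b11):
        raise ValueError("System register accesses use op0 of 0b10 or 0b11")
    limits = {"op1": (op1, 7), "CRn": (crn, 15), "CRm": (crm, 15),
              "op2": (op2, 7), "Rt": (rt, 31)}
    for name, (val, maximum) in limits.items():
        if not 0 <= val <= maximum:
            raise ValueError("{} out of range".format(name))
    # 0xD5200000 holds the fixed opcode bits with L == 1 (a read, that is MRS).
    return (0xD5200000 | (op0 << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)
```

With the encoding above, `mrs_encoding(0b11, 0b000, 0b0000, 0b0100, 0b000, 0)` gives the word for `MRS X0, ID_AA64PFR0_EL1`.
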


D7.2.45 ID_AA64PFR1_EL1, AArch64 Processor Feature Register 1
The ID_AA64PFR1_EL1 characteristics are: 


Purpose 
Reserved for future expansion of information about implemented PE features in AArch64. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


There are no configuration notes. 


Attributes 
ID_AA64PFR1_EL1 is a 64-bit register. 


Field descriptions 


The ID_AA64PFR1_EL1 bit assignments are: 


[63:0]
RES0


Bits [63:0]


Reserved, RES0.


Accessing the ID_AA64PFR1_EL1: 
To access the ID_AA64PFR1_EL1: 
MRS <Xt>, ID_AA64PFR1_EL1 ; Read ID_AA64PFR1_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0100    001


D7.2.46 ID_AFR0_EL1, AArch32 Auxiliary Feature Register 0
The ID_AFR0_EL1 characteristics are:


Purpose
Provides information about the IMPLEMENTATION DEFINED features of the PE in AArch32.
Must be interpreted with the Main ID Register, MIDR_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.


Usage constraints


This register is accessible as follows:

EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO


Traps and Enables


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations


AArch64 System register ID_AFR0_EL1 is architecturally mapped to AArch32 System register
ID_AFR0.


In an AArch64-only implementation, this register is UNKNOWN.


Attributes
ID_AFR0_EL1 is a 32-bit register.


Field descriptions


The ID_AFR0_EL1 bit assignments are:


[31:16]  [15:12]  [11:8]   [7:4]    [3:0]
RES0     IMP DEF  IMP DEF  IMP DEF  IMP DEF


Bits [31:16]
Reserved, RES0.


IMPLEMENTATION DEFINED, bits [15:12]
IMPLEMENTATION DEFINED.


IMPLEMENTATION DEFINED, bits [11:8]
IMPLEMENTATION DEFINED.


IMPLEMENTATION DEFINED, bits [7:4]


IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [3:0] 


IMPLEMENTATION DEFINED. 


Accessing the ID_AFR0_EL1:
To access the ID_AFR0_EL1:
MRS <Xt>, ID_AFR0_EL1 ; Read ID_AFR0_EL1 into Xt


Register access is encoded as follows:

op0    op1    CRn     CRm     op2
11     000    0000    0001    011










D7.2.47 ID_DFR0_EL1, AArch32 Debug Feature Register 0
The ID_DFR0_EL1 characteristics are:


Purpose 
Provides top level information about the debug system in AArch32. 
Must be interpreted with the Main ID Register, MIDR_EL1. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 
Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


AArch64 System register ID_DFR0_EL1 is architecturally mapped to AArch32 System register
ID_DFR0.


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
ID_DFR0_EL1 is a 32-bit register.


Field descriptions


The ID_DFR0_EL1 bit assignments are:


[31:28]  [27:24]  [23:20]   [19:16]  [15:12]  [11:8]   [7:4]    [3:0]
RES0     PerfMon  MProfDbg  MMapTrc  CopTrc   MMapDbg  CopSDbg  CopDbg


Bits [31:28]
Reserved, RES0.


PerfMon, bits [27:24] 


Performance Monitors. Support for System registers-based ARM Performance Monitors Extension,
using registers in the coproc == 1111 encoding space, for A and R profile processors. Defined values
are:
0000 Performance Monitors Extension System registers not implemented.
0001 Support for Performance Monitors Extension version 1 (PMUv1) System registers.
0010 Support for Performance Monitors Extension version 2 (PMUv2) System registers.
0011 Support for Performance Monitors Extension version 3 (PMUv3) System registers.
1111 IMPLEMENTATION DEFINED form of Performance Monitors System registers supported.
     PMUv3 not supported.
All other values are reserved.
In ARMv8-A the permitted values are 0000, 0011, and 1111.


In ARMv7, the value 0000 can mean that PMUv1 is implemented. PMUv1 is not permitted in an
ARMv8 implementation.


MProfDbg, bits [23:20] 
M Profile Debug. Support for memory-mapped debug model for M profile processors. Defined
values are:
0000 Not supported. 
0001 Support for M profile Debug architecture, with memory-mapped access. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


MMapTrc, bits [19:16]
Memory Mapped Trace. Support for memory-mapped trace model. Defined values are: 
0000 Not supported. 
0001 Support for ARM trace architecture, with memory-mapped access. 
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 


In the Trace registers, the ETMIDR gives more information about the implementation. 


CopTrc, bits [15:12]


Support for System registers-based trace model, using registers in the coproc == 1110 encoding 
space. Defined values are: 


0000 Not supported. 

0001 Support for ARM trace architecture, with System registers access. 
All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0001. 


In the Trace registers, the ETMIDR gives more information about the implementation. 


MMapDbg, bits [11:8] 


Memory Mapped Debug. Support for v7 memory-mapped debug model, for A and R profile 
processors. 


In ARMv8-A this field is RES0.
The optional memory map defined by ARMv8 is not compatible with ARMv7.


CopSDbg, bits [7:4] 


Support for a System registers-based Secure debug model, using registers in the coproc == 1110
encoding space, for an A profile processor that includes EL3.


If EL3 is not implemented and the implemented Security state is Non-secure state, this field is RES0.
Otherwise, this field reads the same as bits [3:0].


CopDbg, bits [3:0] 


Support for System registers-based debug model, using registers in the coproc == 1110 encoding
space, for A and R profile processors. Defined values are:
0000 Not supported.
0010 Support for ARMv6, v6 Debug architecture, with System registers access.
0011 Support for ARMv6, v6.1 Debug architecture, with System registers access.
0100 Support for ARMv7, v7 Debug architecture, with System registers access.
0101 Support for ARMv7, v7.1 Debug architecture, with System registers access.
0110 Support for ARMv8 debug architecture, with System registers access.


All other values are reserved.


In ARMv8-A the permitted values are 0000 and 0110.
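The "permitted values" statements in the field descriptions above can be checked mechanically against a raw register value. The following Python sketch does so for ID_DFR0_EL1. The table and function names are illustrative, and CopSDbg is deliberately left out because its permitted value depends on whether EL3 is implemented.

```python
# Field name -> (lsb, values permitted in ARMv8-A), from the descriptions above.
ID_DFR0_ARMV8A = {
    "PerfMon": (24, {0b0000, 0b0011, 0b1111}),
    "MProfDbg": (20, {0b0000}),
    "MMapTrc": (16, {0b0000, 0b0001}),
    "CopTrc": (12, {0b0000, 0b0001}),
    "MMapDbg": (8, {0b0000}),  # RES0 in ARMv8-A
    "CopDbg": (0, {0b0000, 0b0110}),
}


def check_id_dfr0(value):
    """Return (field, value) pairs that fall outside the ARMv8-A permitted sets."""
    problems = []
    for name, (lsb, permitted) in ID_DFR0_ARMV8A.items():
        field = (value >> lsb) & 0xF
        if field not in permitted:
            problems.append((name, field))
    return problems
```

An empty list means that every checked field holds a value that ARMv8-A permits. A value such as 0x02000005, which reports PMUv2 and the v7.1 Debug architecture, is flagged on both fields.
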


Accessing the ID_DFR0_EL1:
To access the ID_DFR0_EL1:


MRS <Xt>, ID_DFR0_EL1 ; Read ID_DFR0_EL1 into Xt


Register access is encoded as follows:

op0    op1    CRn     CRm     op2
11     000    0000    0001    010


D7.2.48 ID_ISAR0_EL1, AArch32 Instruction Set Attribute Register 0
The ID_ISAR0_EL1 characteristics are:


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 


Must be interpreted with ID_ISAR1_EL1, ID_ISAR2_EL1, ID_ISAR3_EL1, ID_ISAR4_EL1, and 
ID_ISAR5_EL1. 





For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


AArch64 System register ID_ISAR0_EL1 is architecturally mapped to AArch32 System register
ID_ISAR0.


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
ID_ISAR0_EL1 is a 32-bit register.


Field descriptions


The ID_ISAR0_EL1 bit assignments are:


[31:28]  [27:24]  [23:20]  [19:16]  [15:12]    [11:8]    [7:4]     [3:0]
RES0     Divide   Debug    Coproc   CmpBranch  BitField  BitCount  Swap


Bits [31:28]
Reserved, RES0.


Divide, bits [27:24] 


Indicates the implemented Divide instructions. Defined values are: 


0000 None implemented. 
0001 Adds SDIV and UDIV in the T32 instruction set. 
0010 As for 0001, and adds SDIV and UDIV in the A32 instruction set. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010.


Debug, bits [23:20]


Indicates the implemented Debug instructions. Defined values are: 
0000 None implemented. 

0001 Adds BKPT. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Coproc, bits [19:16] 


Indicates the implemented System register access instructions. Defined values are: 


0000 None implemented, except for instructions separately attributed by the architecture to 
provide access to AArch32 System registers and System instructions. 

0001 Adds generic CDP, LDC, MCR, MRC, and STC. 

0010 As for 0001, and adds generic CDP2, LDC2, MCR2, MRC2, and STC2. 

0011 As for 0010, and adds generic MCRR and MRRC. 

0100 As for 0011, and adds generic MCRR2 and MRRC2. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


CmpBranch, bits [15:12] 


Indicates the implemented combined Compare and Branch instructions in the T32 instruction set. 
Defined values are: 


0000 None implemented. 
0001 Adds CBNZ and CBZ. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


BitField, bits [11:8] 


Indicates the implemented BitField instructions. Defined values are: 
0000 None implemented. 

0001 Adds BFC, BFI, SBFX, and UBFX. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


BitCount, bits [7:4] 


Indicates the implemented Bit Counting instructions. Defined values are: 
0000 None implemented. 

0001 Adds CLZ. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Swap, bits [3:0] 


Indicates the implemented Swap instructions in the A32 instruction set. Defined values are: 
0000 None implemented. 

0001 Adds SWP and SWPB. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 
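Because every field of ID_ISAR0_EL1 has exactly one permitted value in ARMv8-A, the register value of an implementation that includes AArch32 can be derived from the descriptions above. The following Python sketch composes that value and reports fields that differ from it. The names are illustrative, and the sketch does not apply to an AArch64-only implementation, where the register is UNKNOWN.

```python
# (field, lsb, only permitted ARMv8-A value), from the field descriptions above.
ID_ISAR0_ARMV8A = [
    ("Divide", 24, 0b0010),
    ("Debug", 20, 0b0001),
    ("Coproc", 16, 0b0000),
    ("CmpBranch", 12, 0b0001),
    ("BitField", 8, 0b0001),
    ("BitCount", 4, 0b0001),
    ("Swap", 0, 0b0000),
]


def expected_id_isar0():
    """Compose the value in which every field holds its only permitted value."""
    value = 0
    for _name, lsb, permitted in ID_ISAR0_ARMV8A:
        value |= permitted << lsb
    return value


def id_isar0_mismatches(value):
    """Return (field, value) pairs that differ from the permitted ARMv8-A value."""
    return [(name, (value >> lsb) & 0xF)
            for name, lsb, permitted in ID_ISAR0_ARMV8A
            if (value >> lsb) & 0xF != permitted]
```

Bits [31:28] are RES0, so they contribute nothing to the composed value.
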


Accessing the ID_ISAR0_EL1:
To access the ID_ISAR0_EL1:
MRS <Xt>, ID_ISAR0_EL1 ; Read ID_ISAR0_EL1 into Xt


Register access is encoded as follows:

op0    op1    CRn     CRm     op2
11     000    0000    0010    000


D7.2.49 ID_ISAR1_EL1, AArch32 Instruction Set Attribute Register 1
The ID_ISAR1_EL1 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 


Must be interpreted with ID_ISAR0_EL1, ID_ISAR2_EL1, ID_ISAR3_EL1, ID_ISAR4_EL1, and
ID_ISAR5_EL1. 


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 





Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


AArch64 System register ID_ISAR1_EL1 is architecturally mapped to AArch32 System register 
ID_ISAR1. 


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
ID_ISAR1_EL1 is a 32-bit register. 


Field descriptions 


The ID_ISAR1_EL1 bit assignments are: 


[31:28]  [27:24]    [23:20]    [19:16]  [15:12]  [11:8]     [7:4]   [3:0]
Jazelle  Interwork  Immediate  IfThen   Extend   Except_AR  Except  Endian


Jazelle, bits [31:28] 
Indicates the implemented Jazelle extension instructions. Defined values are: 
0000 No support for Jazelle. 


0001 Adds the BXJ instruction, and the J bit in the PSR. This setting might indicate a trivial 
implementation of the Jazelle extension. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Interwork, bits [27:24] 


Indicates the implemented Interworking instructions. Defined values are: 





0000 None implemented. 
0001 Adds the BX instruction, and the T bit in the PSR.
0010 As for 0001, and adds the BLX instruction. PC loads have BX-like behavior.


0011 As for 0010, and guarantees that data-processing instructions in the A32 instruction set 
with the PC as the destination and the S bit clear have BX-like behavior. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0011. 


Immediate, bits [23:20] 
Indicates the implemented data-processing instructions with long immediates. Defined values are: 
0000 None implemented. 
0001 Adds:
     • The MOVT instruction.
     • The MOV instruction encodings with zero-extended 16-bit immediates.
     • The T32 ADD and SUB instruction encodings with zero-extended 12-bit
       immediates, and the other ADD, ADR, and SUB encodings cross-referenced by
       the pseudocode for those encodings.


All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 
IfThen, bits [19:16] 
Indicates the implemented If-Then instructions in the T32 instruction set. Defined values are: 
0000 None implemented. 
0001 Adds the IT instructions, and the IT bits in the PSRs. 
All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 
Extend, bits [15:12] 


Indicates the implemented Extend instructions. Defined values are: 


0000 No scalar sign-extend or zero-extend instructions are implemented, where scalar 
instructions means non-Advanced SIMD instructions. 

0001 Adds the SXTB, SXTH, UXTB, and UXTH instructions. 

0010 As for 0001, and adds the SXTB16, SXTAB, SXTAB16, SXTAH, UXTB16, UXTAB,
UXTAB16, and UXTAH instructions.
All other values are reserved. 
In ARMv8-A the only permitted value is 0010. 
Except_AR, bits [11:8] 


Indicates the implemented A and R profile exception-handling instructions. Defined values are: 


0000 None implemented. 
0001 Adds the SRS and RFE instructions, and the A and R profile forms of the CPS 
instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Except, bits [7:4] 
Indicates the implemented exception-handling instructions in the ARM instruction set. Defined
values are:


0000 Not implemented. This indicates that the User bank and Exception return forms of the 
LDM and STM instructions are not implemented. 


0001 Adds the LDM (exception return), LDM (user registers), and STM (user registers)
instruction versions.


All other values are reserved.


In ARMv8-A the only permitted value is 0001. 


Endian, bits [3:0] 
Indicates the implemented Endian instructions. Defined values are: 
0000 None implemented. 
0001 Adds the SETEND instruction, and the E bit in the PSRs. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


Accessing the ID_ISAR1_EL1: 
To access the ID_ISAR1_EL1: 
MRS <Xt>, ID_ISAR1_EL1 ; Read ID_ISAR1_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0010    001


D7.2.50 ID_ISAR2_EL1, AArch32 Instruction Set Attribute Register 2
The ID_ISAR2_EL1 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 


Must be interpreted with ID_ISAR0_EL1, ID_ISAR1_EL1, ID_ISAR3_EL1, ID_ISAR4_EL1, and
ID_ISAR5_EL1. 


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 





Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


AArch64 System register ID_ISAR2_EL1 is architecturally mapped to AArch32 System register 
ID_ISAR2. 


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
ID_ISAR2_EL1 is a 32-bit register. 


Field descriptions 


The ID_ISAR2_EL1 bit assignments are: 


[31:28]   [27:24]  [23:20]  [19:16]  [15:12]  [11:8]          [7:4]    [3:0]
Reversal  PSR_AR   MultU    MultS    Mult     MultiAccessInt  MemHint  LoadStore


Reversal, bits [31:28] 


Indicates the implemented Reversal instructions. Defined values are: 


0000 None implemented. 
0001 Adds the REV, REV16, and REVSH instructions. 
0010 As for 0001, and adds the RBIT instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010.


PSR_AR, bits [27:24]


Indicates the implemented A and R profile instructions to manipulate the PSR. Defined values are: 


0000 None implemented. 
0001 Adds the MRS and MSR instructions, and the exception return forms of data-processing 
instructions. 


All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 
The exception return forms of the data-processing instructions are: 


• In the A32 instruction set, data-processing instructions with the PC as the destination and the
  S bit set. These instructions might be affected by the WithShifts attribute.


• In the T32 instruction set, the SUBS PC,LR,#N instruction.


MultU, bits [23:20] 


Indicates the implemented advanced unsigned Multiply instructions. Defined values are: 


0000 None implemented. 
0001 Adds the UMULL and UMLAL instructions. 
0010 As for 0001, and adds the UMAAL instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


MultS, bits [19:16] 


Indicates the implemented advanced signed Multiply instructions. Defined values are: 


0000 None implemented. 
0001 Adds the SMULL and SMLAL instructions. 
0010 As for 0001, and adds the SMLABB, SMLABT, SMLALBB, SMLALBT, SMLALTB,
SMLALTT, SMLATB, SMLATT, SMLAWB, SMLAWT, SMULBB, SMULBT,
SMULTB, SMULTT, SMULWB, and SMULWT instructions. Also adds the Q bit in the
PSRs.


0011 As for 0010, and adds the SMLAD, SMLADX, SMLALD, SMLALDX, SMLSD, 
SMLSDX, SMLSLD, SMLSLDX, SMMLA, SMMLAR, SMMLS, SMMLSR, 
SMMUL, SMMULR, SMUAD, SMUADX, SMUSD, and SMUSDX instructions. 


All other values are reserved. 
In ARMv8-A the only permitted value is 0011. 
Mult, bits [15:12] 


Indicates the implemented additional Multiply instructions. Defined values are: 


0000 No additional instructions implemented. This means only MUL is implemented. 
0001 Adds the MLA instruction. 
0010 As for 0001, and adds the MLS instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


MultiAccessInt, bits [11:8] 


Indicates the support for interruptible multi-access instructions. Defined values are: 


0000 No support. This means the LDM and STM instructions are not interruptible. 
0001 LDM and STM instructions are restartable. 
0010 LDM and STM instructions are continuable. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000.


MemHint, bits [7:4]


Indicates the implemented Memory Hint instructions. Defined values are:


0000 None implemented.
0001 Adds the PLD instruction.
0010 Adds the PLD instruction. (0001 and 0010 have identical effects.)
0011 As for 0001 (or 0010), and adds the PLI instruction.
0100 As for 0011, and adds the PLDW instruction.


All other values are reserved. 


In ARMv8-A the only permitted value is 0100. 


LoadStore, bits [3:0] 


Indicates the implemented additional load/store instructions. Defined values are: 


0000 No additional load/store instructions implemented.
0001 Adds the LDRD and STRD instructions.
0010 As for 0001, and adds the Load Acquire (LDAB, LDAH, LDA, LDAEXB, LDAEXH,
LDAEX, LDAEXD) and Store Release (STLB, STLH, STL, STLEXB, STLEXH,
STLEX, STLEXD) instructions.


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 
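Several fields in this register are cumulative: each defined value adds instructions to those of the previous value. The following Python sketch expands the MemHint field into the full set of Memory Hint instructions it implies, using only the values defined above. The table and function names are illustrative.

```python
# MemHint value -> Memory Hint instructions implemented, expanded from the
# cumulative "As for ..." wording in the field description above.
MEMHINT_INSTRUCTIONS = {
    0b0000: (),
    0b0001: ("PLD",),
    0b0010: ("PLD",),  # identical effect to 0b0001
    0b0011: ("PLD", "PLI"),
    0b0100: ("PLD", "PLI", "PLDW"),
}


def memhint_instructions(id_isar2):
    """Return the Memory Hint instructions reported by ID_ISAR2_EL1.MemHint."""
    field = (id_isar2 >> 4) & 0xF
    if field not in MEMHINT_INSTRUCTIONS:
        raise ValueError("reserved MemHint value {:04b}".format(field))
    return MEMHINT_INSTRUCTIONS[field]
```

A register value in which every field holds its only permitted ARMv8-A value, 0x21232042, reports all three instructions.
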


Accessing the ID_ISAR2_EL1: 


To access the ID_ISAR2_EL1: 


MRS <Xt>, ID_ISAR2_EL1 ; Read ID_ISAR2_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0010    010


D7.2.51 ID_ISAR3_EL1, AArch32 Instruction Set Attribute Register 3
The ID_ISAR3_EL1 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 


Must be interpreted with ID_ISAR0_EL1, ID_ISAR1_EL1, ID_ISAR2_EL1, ID_ISAR4_EL1, and
ID_ISAR5_EL1. 


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 





Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


AArch64 System register ID_ISAR3_EL1 is architecturally mapped to AArch32 System register 
ID_ISAR3. 


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
ID_ISAR3_EL1 is a 32-bit register. 


Field descriptions 


The ID_ISAR3_EL1 bit assignments are: 


[31:28]  [27:24]  [23:20]  [19:16]    [15:12]    [11:8]  [7:4]  [3:0]
T32EE    TrueNOP  T32Copy  TabBranch  SynchPrim  SVC     SIMD   Saturate


T32EE, bits [31:28]
Indicates the implemented T32EE instructions. Defined values are: 
0000 None implemented. 


0001 Adds the ENTERX and LEAVEX instructions, and modifies the load behavior to 
include null checking. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


TrueNOP, bits [27:24] 
Indicates the implemented true NOP instructions. Defined values are: 


0000 None implemented. This means there are no NOP instructions that do not have any
register dependencies.
0001 Adds true NOP instructions in both the T32 and A32 instruction sets. This also permits
additional NOP-compatible hints.


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


T32Copy, bits [23:20] 
Indicates the support for T32 non flag-setting MOV instructions. Defined values are: 


0000 Not supported. This means that in the T32 instruction set, encoding T1 of the MOV 
(register) instruction does not support a copy from a low register to a low register. 


0001 Adds support for T32 instruction set encoding T1 of the MOV (register) instruction, 
copying from a low register to a low register. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


TabBranch, bits [19:16] 
Indicates the implemented Table Branch instructions in the T32 instruction set. Defined values are: 
0000 None implemented. 
0001 Adds the TBB and TBH instructions. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


SynchPrim, bits [15:12] 


Used in conjunction with ID_ISAR4.SynchPrim_frac to indicate the implemented Synchronization
Primitive instructions. Defined values are:


0000 If SynchPrim_frac == 0000, no Synchronization Primitives implemented.

0001 If SynchPrim_frac == 0000, adds the LDREX and STREX instructions.
If SynchPrim_frac == 0011, also adds the CLREX, LDREXB, LDREXH, STREXB, and STREXH
instructions.

0010 If SynchPrim_frac == 0000, as for [0001, 0011] and also adds the LDREXD and
STREXD instructions.


All other combinations of SynchPrim and SynchPrim_frac are reserved. 
In ARMv8-A the only permitted value is 0010. 
SVC, bits [11:8] 
Indicates the implemented SVC instructions. Defined values are: 
0000 Not implemented. 
0001 Adds the SVC instruction. 
All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 
SIMD, bits [7:4] 


Indicates the implemented SIMD instructions. Defined values are: 


0000 None implemented. 
0001 Adds the SSAT and USAT instructions, and the Q bit in the PSRs. 
0011 As for 0001, and adds the PKHBT, PKHTB, QADD16, QADD8, QASX, QSUB16,
QSUB8, QSAX, SADD16, SADD8, SASX, SEL, SHADD16, SHADD8, SHASX,
SHSUB16, SHSUB8, SHSAX, SSAT16, SSUB16, SSUB8, SSAX, SXTAB16,
SXTB16, UADD16, UADD8, UASX, UHADD16, UHADD8, UHASX, UHSUB16,
UHSUB8, UHSAX, UQADD16, UQADD8, UQASX, UQSUB16, UQSUB8, UQSAX,
USAD8, USADA8, USAT16, USUB16, USUB8, USAX, UXTAB16, and UXTB16
instructions. Also adds support for the GE[3:0] bits in the PSRs.
All other values are reserved.
In ARMv8-A the only permitted value is 0011.


The SIMD field relates only to implemented instructions that perform SIMD operations on the 
general-purpose registers. In an implementation that supports floating-point and Advanced SIMD 
instructions, MVFR0, MVFR1, and MVFR2 give information about the implemented Advanced
SIMD instructions. 


Saturate, bits [3:0] 


Indicates the implemented Saturate instructions. Defined values are: 


0000 None implemented. This means no non-Advanced SIMD saturate instructions are 
implemented. 
0001 Adds the QADD, QDADD, QDSUB, and QSUB instructions, and the Q bit in the PSRs. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Accessing the ID_ISAR3_EL1: 


To access the ID_ISAR3_EL1: 


MRS <Xt>, ID_ISAR3_EL1 ; Read ID_ISAR3_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0010    011


D7.2.52 ID_ISAR4_EL1, AArch32 Instruction Set Attribute Register 4
The ID_ISAR4_EL1 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 


Must be interpreted with ID_ISAR0_EL1, ID_ISAR1_EL1, ID_ISAR2_EL1, ID_ISAR3_EL1, and
ID_ISAR5_EL1. 





For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.


Configurations 


AArch64 System register ID_ISAR4_EL1 is architecturally mapped to AArch32 System register 
ID_ISAR4. 


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
ID_ISAR4_EL1 is a 32-bit register. 


Field descriptions 


The ID_ISAR4_EL1 bit assignments are: 


[31:28]   [27:24]  [23:20]         [19:16]  [15:12]  [11:8]     [7:4]       [3:0]
SWP_frac  PSR_M    SynchPrim_frac  Barrier  SMC      Writeback  WithShifts  Unpriv


SWP_frac, bits [31:28] 
Indicates support for the memory system locking the bus for SWP or SWPB instructions. Defined
values are:
0000 SWP or SWPB instructions not implemented.
0001 SWP or SWPB implemented but only in a uniprocessor context. SWP and SWPB do not
guarantee whether memory accesses from other masters can come between the load
memory access and the store memory access of the SWP or SWPB.


All other values are reserved. This field is valid only if the ID_ISAR0.Swap_instrs field is 0000.
In ARMv8-A the only permitted value is 0000.


PSR_M, bits [27:24]
Indicates the implemented M profile instructions to modify the PSRs. Defined values are: 
0000 None implemented. 
0001 Adds the M profile forms of the CPS, MRS, and MSR instructions. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


SynchPrim_frac, bits [23:20] 


Used in conjunction with ID_ISAR3.SynchPrim to indicate the implemented Synchronization 
Primitive instructions. Possible values are: 


0000 If SynchPrim == 0000, no Synchronization Primitives implemented. If SynchPrim == 
0001, adds the LDREX and STREX instructions. If SynchPrim == 0010, also adds the 
CLREX, LDREXB, LDREXH, STREXB, STREXH, LDREXD, and STREXD 
instructions. 


0011 If SynchPrim == 0001, adds the LDREX, STREX, CLREX, LDREXB, LDREXH, 
STREXB, and STREXH instructions. 


All other combinations of SynchPrim and SynchPrim_frac are reserved. 
In ARMv8-A the only permitted value is 0000. 
Barrier, bits [19:16] 


Indicates the implemented Barrier instructions in the A32 and T32 instruction sets. Defined values
are:

0000 None implemented. Barrier operations are provided only as System instructions in the 
(coproc==1111) encoding space. 

0001 Adds the DMB, DSB, and ISB barrier instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


SMC, bits [15:12] 
Indicates the implemented SMC instructions. Defined values are: 
0000 None implemented. 
0001 Adds the SMC instruction. 
All other values are reserved. 
In ARMv8-A the permitted values are 0001 and 0000. 
If EL1 cannot use AArch32 then this field has the value 0000. 


Writeback, bits [11:8] 
Indicates the support for Writeback addressing modes. Defined values are: 


0000 Basic support. Only the LDM, STM, PUSH, POP, SRS, and RFE instructions support 
writeback addressing modes. These instructions support all of their writeback 
addressing modes. 


0001 Adds support for all of the writeback addressing modes. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


WithShifts, bits [7:4] 


Indicates the support for instructions with shifts. Defined values are: 





0000 Nonzero shifts supported only in MOV and shift instructions. 
0001 Adds support for shifts of loads and stores over the range LSL 0-3. 
0011 As for 0001, and adds support for other constant shift options, both on load/store and
other instructions. 


0100 As for 0011, and adds support for register-controlled shift options. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0100. 
Unpriv, bits [3:0] 


Indicates the implemented unprivileged instructions. Defined values are: 


0000 None implemented. No T variant instructions are implemented. 
0001 Adds the LDRBT, LDRT, STRBT, and STRT instructions. 
0010 As for 0001, and adds the LDRHT, LDRSBT, LDRSHT, and STRHT instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


Accessing the ID_ISAR4_EL1:
To access the ID_ISAR4_EL1:
MRS <Xt>, ID_ISAR4_EL1 ; Read ID_ISAR4_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0010  100
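
The five encoding fields select the register within the A64 System register encoding space. As a sketch, the following Python function packs them into the 32-bit MRS instruction word. The function name `mrs_encoding` is hypothetical, and the bit positions used (op0 at [20:19] with bit 20 always set, op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0]) are assumed from the A64 MRS instruction encoding rather than stated in this section.

```python
def mrs_encoding(op0, op1, crn, crm, op2, rt):
    """Pack System register selector fields into an A64 MRS instruction word."""
    if op0 not in (0b10, 0b11):
        raise ValueError("MRS requires op0 to be 0b10 or 0b11")
    # Range-check the remaining fields against their encoded widths.
    for name, value, width in (
        ("op1", op1, 3),
        ("CRn", crn, 4),
        ("CRm", crm, 4),
        ("op2", op2, 3),
        ("Rt", rt, 5),
    ):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # 0xD5200000 is the fixed part of the word with the L bit (bit 21) set for a read.
    return (0xD5200000 | (op0 << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)


# ID_ISAR4_EL1 read into X0, using the field values from the table above.
print(hex(mrs_encoding(0b11, 0b000, 0b0000, 0b0010, 0b100, rt=0)))  # 0xd5380280
```

The same function applies to the other ID registers in this section by substituting their CRm and op2 values.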
D7.2.53 ID_ISAR5_EL1, AArch32 Instruction Set Attribute Register 5
The ID_ISAR5_EL1 characteristics are:
Purpose
Provides information about the instruction sets implemented by the PE in AArch32.
Must be interpreted with ID_ISAR0_EL1, ID_ISAR1_EL1, ID_ISAR2_EL1, ID_ISAR3_EL1, and
ID_ISAR4_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register ID_ISAR5_EL1 is architecturally mapped to AArch32 System register
ID_ISAR5.
In an AArch64-only implementation, this register is UNKNOWN.
Attributes
ID_ISAR5_EL1 is a 32-bit register.
Field descriptions
The ID_ISAR5_EL1 bit assignments are:
31             20 19     16 15     12 11      8 7       4 3       0
RES0              CRC32     SHA2      SHA1      AES       SEVL
Bits [31:20]
Reserved, RES0.
CRC32, bits [19:16] 
Indicates whether CRC32 instructions are implemented in AArch32. 
0000 No CRC32 instructions implemented. 
0001 CRC32B, CRC32H, CRC32W, CRC32CB, CRC32CH, and CRC32CW instructions 
implemented. 
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
SHA2, bits [15:12]
Indicates whether SHA2 instructions are implemented in AArch32. 
0000 No SHA2 instructions implemented. 


0001 SHA256H, SHA256H2, SHA256SU0, and SHA256SU1 implemented. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
SHA1, bits [11:8]
Indicates whether SHA1 instructions are implemented in AArch32.

0000 No SHA1 instructions implemented.
0001 SHA1C, SHA1P, SHA1M, SHA1H, SHA1SU0, and SHA1SU1 implemented.


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


AES, bits [7:4] 


Indicates whether AES instructions are implemented in AArch32. 


0000 No AES instructions implemented. 
0001 AESE, AESD, AESMC, and AESIMC implemented. 
0010 As for 0001, plus PMULL/PMULL2 instructions operating on 64-bit data quantities. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0010. 


SEVL, bits [3:0] 
Indicates whether the SEVL instruction is implemented in AArch32. 
0000 SEVL is implemented as a NOP. 
0001 SEVL is implemented as Send Event Local. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Accessing the ID_ISAR5_EL1:
To access the ID_ISAR5_EL1:
MRS <Xt>, ID_ISAR5_EL1 ; Read ID_ISAR5_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0010  101
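
As an illustration of how software might interpret the ID_ISAR5_EL1 fields described above, the following Python sketch splits a register value into its named fields. The names `ID_ISAR5_FIELDS`, `decode_id_isar5`, and `has_pmull64` are hypothetical, and the sample value is an assumed example, not a value required by the architecture.

```python
# Field positions as given in the ID_ISAR5_EL1 field descriptions: name -> (msb, lsb).
ID_ISAR5_FIELDS = {
    "CRC32": (19, 16),
    "SHA2": (15, 12),
    "SHA1": (11, 8),
    "AES": (7, 4),
    "SEVL": (3, 0),
}


def decode_id_isar5(value):
    """Split a 32-bit ID_ISAR5_EL1 value into its named 4-bit fields."""
    return {
        name: (value >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, (msb, lsb) in ID_ISAR5_FIELDS.items()
    }


def has_pmull64(value):
    """True when the AES field reports 0010 (AES plus 64-bit polynomial multiply)."""
    return decode_id_isar5(value)["AES"] == 0b0010


sample = 0x00011121  # assumed example value, for illustration only
print(decode_id_isar5(sample))
```

Bits [31:20] are RES0 and are deliberately ignored by the decoder.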
D7.2.54 ID_MMFR0_EL1, AArch32 Memory Model Feature Register 0
The ID_MMFR0_EL1 characteristics are:
Purpose
Provides information about the implemented memory model and memory management support in
AArch32.
Must be interpreted with ID_MMFR1_EL1, ID_MMFR2_EL1, ID_MMFR3_EL1, and
ID_MMFR4_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register ID_MMFR0_EL1 is architecturally mapped to AArch32 System register
ID_MMFR0.
In an AArch64-only implementation, this register is UNKNOWN.
Attributes
ID_MMFR0_EL1 is a 32-bit register.
Field descriptions
The ID_MMFR0_EL1 bit assignments are:
31     28 27     24 23     20 19     16 15     12 11      8 7       4 3       0
InnerShr  FCSE      AuxReg    TCM       ShareLvl  OuterShr  PMSA      VMSA


InnerShr, bits [31:28] 


Innermost Shareability. Indicates the innermost shareability domain implemented. Defined values
are:
0000 Implemented as Non-cacheable.
0001 Implemented with hardware coherency support.
1111 Shareability ignored.
All other values are reserved.
In ARMv8 the permitted values are 0000, 0001, and 1111.
This field is valid only if the implementation supports two levels of shareability, as indicated by
ID_MMFR0_EL1.ShareLvl having the value 0001.
When ID_MMFR0_EL1.ShareLvl is zero, this field is UNK.
FCSE, bits [27:24] 
Indicates whether the implementation includes the FCSE. Defined values are: 
0000 Not supported. 
0001 Support for FCSE. 
All other values are reserved. 


In ARMv8 the only permitted value is 0000.


AuxReg, bits [23:20] 


Auxiliary Registers. Indicates support for Auxiliary registers. Defined values are: 


0000 None supported. 
0001 Support for Auxiliary Control Register only. 
0010 Support for Auxiliary Fault Status Registers (AIFSR and ADFSR) and Auxiliary
Control Register.
All other values are reserved.
In ARMv8 the only permitted value is 0010.
Note
Accesses to unimplemented Auxiliary registers are UNDEFINED.





TCM, bits [19:16] 
Indicates support for TCMs and associated DMAs. Defined values are: 


0000 Not supported. 

0001 Support is IMPLEMENTATION DEFINED. ARMv7 requires this setting. 
0010 Support for TCM only, ARMv6 implementation. 

0011 Support for TCM and DMA, ARMv6 implementation. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


ShareLvl, bits [15:12] 
Shareability Levels. Indicates the number of shareability levels implemented. Defined values are: 
0000 One level of shareability implemented. 
0001 Two levels of shareability implemented. 
All other values are reserved. 


In ARMv8 the only permitted value is 0001.


OuterShr, bits [11:8] 


Outermost Shareability. Indicates the outermost shareability domain implemented. Defined values
are:
0000 Implemented as Non-cacheable.
0001 Implemented with hardware coherency support.
1111 Shareability ignored.
All other values are reserved.
In ARMv8 the permitted values are 0000, 0001, and 1111.


PMSA, bits [7:4] 
Indicates support for a PMSA. Defined values are: 
0000 Not supported. 
0001 Support for IMPLEMENTATION DEFINED PMSA.
0010 Support for PMSAv6, with a Cache Type Register implemented.
0011 Support for PMSAv7, with support for memory subsections. ARMv7-R profile.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


VMSA, bits [3:0] 


Indicates support for a VMSA. Defined values are: 


0000 Not supported.
0001 Support for IMPLEMENTATION DEFINED VMSA.
0010 Support for VMSAv6, with Cache and TLB Type Registers implemented.
0011 Support for VMSAv7, with support for remapping and the Access flag. ARMv7-A
profile.
0100 As for 0011, and adds support for the PXN bit in the Short-descriptor translation table
format descriptors.
0101 As for 0100, and adds support for the Long-descriptor translation table format.


All other values are reserved. 


In ARMv8-A the only permitted value is 0101. 


Accessing the ID_MMFR0_EL1:
To access the ID_MMFR0_EL1:
MRS <Xt>, ID_MMFR0_EL1 ; Read ID_MMFR0_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0001  100
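
The dependency between InnerShr and ShareLvl described above is easy to overlook when decoding ID_MMFR0_EL1. The following Python sketch shows one way to apply it. The names `bits` and `shareability_summary` are hypothetical, and the sample values are assumed examples chosen only to exercise both cases.

```python
def bits(value, msb, lsb):
    """Extract the inclusive bit range [msb:lsb] from an integer."""
    return (value >> lsb) & ((1 << (msb - lsb + 1)) - 1)


# Encodings shared by the InnerShr and OuterShr fields.
SHAREABILITY = {
    0b0000: "Non-cacheable",
    0b0001: "hardware coherency",
    0b1111: "shareability ignored",
}


def shareability_summary(id_mmfr0):
    """Summarize the shareability fields of an ID_MMFR0_EL1 value."""
    share_lvl = bits(id_mmfr0, 15, 12)
    outer = bits(id_mmfr0, 11, 8)
    inner = bits(id_mmfr0, 31, 28)
    return {
        # ShareLvl 0000 means one level, 0001 means two; other values are reserved.
        "levels": {0b0000: 1, 0b0001: 2}.get(share_lvl),
        "outer": SHAREABILITY.get(outer, "reserved"),
        # InnerShr is valid only when ShareLvl is 0001; otherwise it is ignored here.
        "inner": SHAREABILITY.get(inner, "reserved") if share_lvl == 0b0001 else None,
    }


print(shareability_summary(0x10201105))  # assumed example with two levels
print(shareability_summary(0x10200105))  # same value with ShareLvl cleared
```

Returning None for the inner domain, rather than decoding the raw field, keeps a caller from acting on a value that the field description says is not valid.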
D7.2.55 ID_MMFR1_EL1, AArch32 Memory Model Feature Register 1
The ID_MMFR1_EL1 characteristics are:
Purpose
Provides information about the implemented memory model and memory management support in
AArch32.
Must be interpreted with ID_MMFR0_EL1, ID_MMFR2_EL1, ID_MMFR3_EL1, and
ID_MMFR4_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register ID_MMFR1_EL1 is architecturally mapped to AArch32 System register
ID_MMFR1.
In an AArch64-only implementation, this register is UNKNOWN.
Attributes
ID_MMFR1_EL1 is a 32-bit register.
Field descriptions
The ID_MMFR1_EL1 bit assignments are:
31     28 27     24 23     20 19     16 15     12 11      8 7       4 3       0
BPred     L1TstCln  L1Uni     L1Hvd     L1UniSW   L1HvdSW   L1UniVA   L1HvdVA


BPred, bits [31:28] 


Branch Predictor. Indicates branch predictor management requirements. Defined values are: 


0000 No branch predictor, or no MMU present. Implies a fixed MPU configuration.
0001 Branch predictor requires flushing on:
• Enabling or disabling a stage of address translation.
• Writing new data to instruction locations.
• Writing new mappings to the translation tables.
• Changes to the TTBR0, TTBR1, or TTBCR registers.
• Changes to the ContextID or ASID, or to the FCSE ProcessID if this is supported.
0010 Branch predictor requires flushing on:
• Enabling or disabling a stage of address translation.
• Writing new data to instruction locations.
• Writing new mappings to the translation tables.
• Any change to the TTBR0, TTBR1, or TTBCR registers without a change to the
corresponding ContextID or ASID, or FCSE ProcessID if this is supported.
0011 Branch predictor requires flushing only on writing new data to instruction locations.
0100 For execution correctness, branch predictor requires no flushing at any time.
All other values are reserved.

In ARMv8-A the permitted values are 0010, 0011, or 0100. For values other than 0000 and 0100 the
ARM Architecture Reference Manual, or the product documentation, might give more information
about the required maintenance.


L1TstCln, bits [27:24] 


Level 1 cache Test and Clean. Indicates the supported Level 1 data cache test and clean operations, 
for Harvard or unified cache implementations. Defined values are: 


0000 None supported. 

0001 Supported Level 1 data cache test and clean operations are:
• Test and clean data cache.
0010 As for 0001, and adds:
• Test, clean, and invalidate data cache.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1Uni, bits [23:20] 


Level 1 Unified cache. Indicates the supported entire Level 1 cache maintenance operations for a 
unified cache implementation. Defined values are: 


0000 None supported. 
0001 Supported entire Level 1 cache operations are:
• Invalidate cache, including branch predictor if appropriate.
• Invalidate branch predictor, if appropriate.
0010 As for 0001, and adds:
• Clean cache, using a recursive model that uses the cache dirty status bit.
• Clean and invalidate cache, using a recursive model that uses the cache dirty
status bit.


All other values are reserved. 
In ARMv8-A the only permitted value is 0000. 
L1Hvd, bits [19:16] 


Level 1 Harvard cache. Indicates the supported entire Level 1 cache maintenance operations for a 
Harvard cache implementation. Defined values are: 





0000 None supported.
0001 Supported entire Level 1 cache operations are:
• Invalidate instruction cache, including branch predictor if appropriate.
• Invalidate branch predictor, if appropriate.
0010 As for 0001, and adds:
• Invalidate data cache.
• Invalidate data cache and instruction cache, including branch predictor if
appropriate.
0011 As for 0010, and adds:
• Clean data cache, using a recursive model that uses the cache dirty status bit.
• Clean and invalidate data cache, using a recursive model that uses the cache dirty
status bit.


All other values are reserved. 
In ARMv8-A the only permitted value is 0000. 
L1UniSW, bits [15:12] 


Level 1 Unified cache by Set/Way. Indicates the supported Level 1 cache line maintenance 
operations by set/way, for a unified cache implementation. Defined values are: 


0000 None supported. 

0001 Supported Level 1 unified cache line maintenance operations by set/way are:
• Clean cache line by set/way.
0010 As for 0001, and adds:
• Clean and invalidate cache line by set/way.
0011 As for 0010, and adds:
• Invalidate cache line by set/way.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1HvdSW, bits [11:8]

Level 1 Harvard cache by Set/Way. Indicates the supported Level 1 cache line maintenance
operations by set/way, for a Harvard cache implementation. Defined values are:

0000 None supported.
0001 Supported Level 1 Harvard cache line maintenance operations by set/way are:
• Clean data cache line by set/way.
• Clean and invalidate data cache line by set/way.
0010 As for 0001, and adds:
• Invalidate data cache line by set/way.
0011 As for 0010, and adds:
• Invalidate instruction cache line by set/way.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1UniVA, bits [7:4] 


Level 1 Unified cache by Virtual Address. Indicates the supported Level 1 cache line maintenance 
operations by VA, for a unified cache implementation. Defined values are: 


0000 None supported. 

0001 Supported Level 1 unified cache line maintenance operations by VA are:
• Clean cache line by VA.
• Invalidate cache line by VA.
• Clean and invalidate cache line by VA.
0010 As for 0001, and adds:
• Invalidate branch predictor by VA, if branch predictor is implemented.

All other values are reserved.
In ARMv8-A the only permitted value is 0000.


L1HvdVA, bits [3:0] 


Level 1 Harvard cache by Virtual Address. Indicates the supported Level 1 cache line maintenance 
operations by VA, for a Harvard cache implementation. Defined values are: 


0000 None supported. 
0001 Supported Level 1 Harvard cache line maintenance operations by VA are:
• Clean data cache line by VA.
• Invalidate data cache line by VA.
• Clean and invalidate data cache line by VA.
• Clean instruction cache line by VA.
0010 As for 0001, and adds:
• Invalidate branch predictor by VA, if branch predictor is implemented.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


Accessing the ID_MMFR1_EL1:
To access the ID_MMFR1_EL1:
MRS <Xt>, ID_MMFR1_EL1 ; Read ID_MMFR1_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0001  101
D7.2.56 ID_MMFR2_EL1, AArch32 Memory Model Feature Register 2
The ID_MMFR2_EL1 characteristics are:
Purpose
Provides information about the implemented memory model and memory management support in
AArch32.
Must be interpreted with ID_MMFR0_EL1, ID_MMFR1_EL1, ID_MMFR3_EL1, and
ID_MMFR4_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register ID_MMFR2_EL1 is architecturally mapped to AArch32 System register
ID_MMFR2.
In an AArch64-only implementation, this register is UNKNOWN.
Attributes
ID_MMFR2_EL1 is a 32-bit register.
Field descriptions
The ID_MMFR2_EL1 bit assignments are:
31     28 27     24 23     20 19     16 15     12 11      8 7       4 3       0
HWAccFlg  WFIStall  MemBarr   UniTLB    HvdTLB    L1HvdRng  L1HvdBG   L1HvdFG

HWAccFlg, bits [31:28]


Hardware Access Flag. In earlier versions of the ARM Architecture, this field indicates support for 
a Hardware Access flag, as part of the VMSAv7 implementation. Defined values are: 


0000 Not supported. 
0001 Support for VMSAv7 Access flag, updated in hardware. 
All other values are reserved. 


In ARMv8 the only permitted value is 0000.







WFIStall, bits [27:24] 
Wait For Interrupt Stall. Indicates the support for Wait For Interrupt (WFI) stalling. Defined values
are:
0000 Not supported.
0001 Support for WFI stalling.
All other values are reserved.
In ARMv8 the permitted values are 0000 and 0001.
MemBarr, bits [23:20] 


Memory Barrier. Indicates the supported memory barrier System instructions in the (coproc==1111) 
encoding space: 


0000 None supported. 

0001 Supported memory barrier System instructions are:
• Data Synchronization Barrier (DSB).
0010 As for 0001, and adds:
• Instruction Synchronization Barrier (ISB).
• Data Memory Barrier (DMB).
All other values are reserved.
In ARMv8 the only permitted value is 0010.


ARM deprecates the use of these operations. ID_ISAR4.Barrier_instrs indicates the level of support 
for the preferred barrier instructions. 


UniTLB, bits [19:16] 


Unified TLB. Indicates the supported TLB maintenance operations, for a unified TLB 
implementation. Defined values are: 


0000 Not supported. 

0001 Supported unified TLB maintenance operations are:
• Invalidate all entries in the TLB.
• Invalidate TLB entry by VA.
0010 As for 0001, and adds:
• Invalidate TLB entries by ASID match.
0011 As for 0010, and adds:
• Invalidate instruction TLB and data TLB entries by VA All ASID. This is a
shared unified TLB operation.
0100 As for 0011, and adds:
• Invalidate Hyp mode unified TLB entry by VA.
• Invalidate entire Non-secure PL1&0 unified TLB.
• Invalidate entire Hyp mode unified TLB.
0101 As for 0100, and adds the following operations: TLBIMVALIS, TLBIMVAALIS,
TLBIMVALHIS, TLBIMVAL, TLBIMVAAL, TLBIMVALH.
0110 As for 0101, and adds the following operations: TLBIIPAS2IS, TLBIIPAS2LIS,
TLBIIPAS2, TLBIIPAS2L.
All other values are reserved. 


In ARMv8-A the only permitted value is 0110. 


HvdTLB, bits [15:12] 


If the Unified TLB field (UniTLB, bits [19:16]) is not 0000, then the meaning of this field is 
IMPLEMENTATION DEFINED. ARM deprecates the use of this field by software. 
L1HvdRng, bits [11:8]


Level 1 Harvard cache Range. Indicates the supported Level 1 cache maintenance range operations, 
for a Harvard cache implementation. Defined values are: 


0000 Not supported. 

0001 Supported Level 1 Harvard cache maintenance range operations are:
• Invalidate data cache range by VA.
• Invalidate instruction cache range by VA.
• Clean data cache range by VA.
• Clean and invalidate data cache range by VA.
All other values are reserved.

In ARMv8 the only permitted value is 0000.


L1HvdBG, bits [7:4] 


Level 1 Harvard cache Background fetch. Indicates the supported Level 1 cache background fetch 
operations, for a Harvard cache implementation. When supported, background fetch operations are 
non-blocking operations. Defined values are: 


0000 Not supported. 
0001 Supported Level 1 Harvard cache background fetch operations are:
• Fetch instruction cache range by VA.
• Fetch data cache range by VA.
All other values are reserved.

In ARMv8 the only permitted value is 0000.


L1HvdFG, bits [3:0]

Level 1 Harvard cache Foreground fetch. Indicates the supported Level 1 cache foreground fetch
operations, for a Harvard cache implementation. When supported, foreground fetch operations are
blocking operations. Defined values are:

0000 Not supported.
0001 Supported Level 1 Harvard cache foreground fetch operations are:
• Fetch instruction cache range by VA.
• Fetch data cache range by VA.
All other values are reserved.

In ARMv8 the only permitted value is 0000.


Accessing the ID_MMFR2_EL1:
To access the ID_MMFR2_EL1:
MRS <Xt>, ID_MMFR2_EL1 ; Read ID_MMFR2_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0001  110
D7.2.57 ID_MMFR3_EL1, AArch32 Memory Model Feature Register 3
The ID_MMFR3_EL1 characteristics are:
Purpose
Provides information about the implemented memory model and memory management support in
AArch32.
Must be interpreted with ID_MMFR0_EL1, ID_MMFR1_EL1, ID_MMFR2_EL1, and
ID_MMFR4_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register ID_MMFR3_EL1 is architecturally mapped to AArch32 System register
ID_MMFR3.
In an AArch64-only implementation, this register is UNKNOWN.
Attributes
ID_MMFR3_EL1 is a 32-bit register.
Field descriptions
The ID_MMFR3_EL1 bit assignments are:
31     28 27     24 23     20 19     16 15     12 11      8 7       4 3       0
Supersec  CMemSz    CohWalk   RES0      MaintBcst BPMaint   CMaintSW  CMaintVA





Supersec, bits [31:28] 


Supersections. On a VMSA implementation, indicates whether Supersections are supported. 
Defined values are: 


0000 Supersections supported. 
1111 Supersections not supported. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 1111. 
CMemSz, bits [27:24]

Cached Memory Size. Indicates the physical memory size supported by the caches. Defined values
are:
0000 4GB, corresponding to a 32-bit physical address range. 

0001 64GB, corresponding to a 36-bit physical address range. 

0010 1TB or more, corresponding to a 40-bit or larger physical address range. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000, 0001, and 0010. 
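
The sizes quoted for this field follow directly from the physical address widths. As a minimal sketch, the following Python function maps the field at bits [27:24] to the minimum address width and the corresponding size in bytes; the names `CMEMSZ_MIN_PA_BITS` and `cached_memory_size` are hypothetical.

```python
# Minimum physical address width implied by each defined encoding of bits [27:24].
CMEMSZ_MIN_PA_BITS = {0b0000: 32, 0b0001: 36, 0b0010: 40}


def cached_memory_size(id_mmfr3):
    """Return (min_pa_bits, min_size_in_bytes), or None for a reserved encoding."""
    field = (id_mmfr3 >> 24) & 0xF
    pa_bits = CMEMSZ_MIN_PA_BITS.get(field)
    if pa_bits is None:
        return None
    # For encoding 0010 this is a lower bound: the field means 40 bits or larger.
    return pa_bits, 1 << pa_bits


GIB = 1024 ** 3
print(cached_memory_size(0x00000000)[1] // GIB)  # 4
print(cached_memory_size(0x01000000)[1] // GIB)  # 64
print(cached_memory_size(0x02000000)[1] // GIB)  # 1024
```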


CohWalk, bits [23:20] 


Coherent Walk. Indicates whether Translation table updates require a clean to the point of 
unification. Defined values are: 


0000 Updates to the translation tables require a clean to the point of unification to ensure 
visibility by subsequent translation table walks. 


0001 Updates to the translation tables do not require a clean to the point of unification to 
ensure visibility by subsequent translation table walks. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Bits [19:16] 


Reserved, RES0.


MaintBcst, bits [15:12] 


Maintenance Broadcast. Indicates whether Cache, TLB, and branch predictor operations are 
broadcast. Defined values are: 


0000 Cache, TLB, and branch predictor operations only affect local structures. 


0001 Cache and branch predictor operations affect structures according to shareability and 
defined behavior of instructions. TLB operations only affect local structures. 


0010 Cache, TLB, and branch predictor operations affect structures according to shareability 
and defined behavior of instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


BPMaint, bits [11:8] 


Branch Predictor Maintenance. Indicates the supported branch predictor maintenance operations in 
an implementation with hierarchical cache maintenance operations. Defined values are: 


0000 None supported. 

0001 Supported branch predictor maintenance operations are:
• Invalidate all branch predictors.
0010 As for 0001, and adds:
• Invalidate branch predictors by VA.
All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


CMaintSW, bits [7:4] 


Cache Maintenance by Set/Way. Indicates the supported cache maintenance operations by set/way, 
in an implementation with hierarchical caches. Defined values are: 





0000 None supported. 
0001 Supported hierarchical cache maintenance instructions by set/way are:
• Invalidate data cache by set/way.
• Clean data cache by set/way.
• Clean and invalidate data cache by set/way.
All other values are reserved.

In ARMv8-A the only permitted value is 0001.

In a unified cache implementation, the data cache maintenance operations apply to the unified
caches.


CMaintVA, bits [3:0] 


Cache Maintenance by Virtual Address. Indicates the supported cache maintenance operations by 
VA, in an implementation with hierarchical caches. Defined values are: 


0000 None supported. 


0001 Supported hierarchical cache maintenance operations by VA are:
• Invalidate data cache by VA.
• Clean data cache by VA.
• Clean and invalidate data cache by VA.
• Invalidate instruction cache by VA.
• Invalidate all instruction cache entries.


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


In a unified cache implementation, data cache maintenance operations apply to the unified caches, 
and the instruction cache maintenance instructions are not implemented. 


Accessing the ID_MMFR3_EL1:
To access the ID_MMFR3_EL1:
MRS <Xt>, ID_MMFR3_EL1 ; Read ID_MMFR3_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0001  111
D7.2.58 ID_MMFR4_EL1, AArch32 Memory Model Feature Register 4
The ID_MMFR4_EL1 characteristics are:
Purpose
Provides information about the implemented memory model and memory management support in
AArch32.
Must be interpreted with ID_MMFR0_EL1, ID_MMFR1_EL1, ID_MMFR2_EL1, and
ID_MMFR3_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
If HCR_EL2.TID3==1, then:
• If the register is not RAZ/WI then Non-secure EL1 read accesses to this register are trapped
to EL2.
• Otherwise, it is IMPLEMENTATION DEFINED whether Non-secure EL1 read accesses to this
register are trapped to EL2.
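
The two-case trap rule above can be expressed as a small decision function. The following Python sketch is illustrative only: `TrapOutcome` and `id_mmfr4_read_trap` are hypothetical names, the inputs are assumed to be already known to the caller, and the sketch models only this HCR_EL2.TID3 rule, not the exception prioritization it is subject to.

```python
from enum import Enum


class TrapOutcome(Enum):
    NOT_TRAPPED = "not trapped by HCR_EL2.TID3"
    TRAPPED_TO_EL2 = "trapped to EL2"
    IMPLEMENTATION_DEFINED = "IMPLEMENTATION DEFINED whether trapped to EL2"


def id_mmfr4_read_trap(tid3, non_secure, el, register_is_raz_wi):
    """Classify a read of ID_MMFR4_EL1 under the HCR_EL2.TID3 rule only."""
    # The rule concerns Non-secure EL1 reads while HCR_EL2.TID3 is 1.
    if not (tid3 == 1 and non_secure and el == 1):
        return TrapOutcome.NOT_TRAPPED
    if register_is_raz_wi:
        return TrapOutcome.IMPLEMENTATION_DEFINED
    return TrapOutcome.TRAPPED_TO_EL2


print(id_mmfr4_read_trap(tid3=1, non_secure=True, el=1, register_is_raz_wi=False).value)
```

Modelling the IMPLEMENTATION DEFINED case as its own outcome, rather than picking one behavior, keeps the sketch from asserting more than the text does.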
Configurations 
AArch64 System register ID_MMFR4_EL1 is architecturally mapped to AArch32 System register 
ID_MMFR4. 
In an implementation that does not include ACTLR2 and HACTLR2 this register is RAZ. 
In an AArch64-only implementation, this register is UNKNOWN. 
Attributes 
ID_MMFR4_EL1 is a 32-bit register. 
Field descriptions 
The ID_MMFR4_EL1 bit assignments are:
31                     8 7       4 3       0
RAZ                      AC2       RAZ
Bits [31:8] 
Reserved, RAZ. 
AC2, bits [7:4] 
Indicates the extension of the ACTLR and HACTLR registers using ACTLR2 and HACTLR2. 
Defined values are: 
0000 ACTLR2 and HACTLR2 are not implemented. 
0001 ACTLR2 and HACTLR2 are implemented.
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


Bits [3:0] 
Reserved, RAZ. 


Accessing the ID_MMFR4_EL1:
To access the ID_MMFR4_EL1:
MRS <Xt>, ID_MMFR4_EL1 ; Read ID_MMFR4_EL1 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  0000  0010  110
D7.2.59 ID_PFR0_EL1, AArch32 Processor Feature Register 0
The ID_PFR0_EL1 characteristics are:
Purpose
Gives top-level information about the instruction sets supported by the PE in AArch32.
Must be interpreted with ID_PFR1_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme
for fields in ID registers on page D7-1893.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations
AArch64 System register ID_PFR0_EL1 is architecturally mapped to AArch32 System register
ID_PFR0.
In an AArch64-only implementation, this register is UNKNOWN.
Attributes
ID_PFR0_EL1 is a 32-bit register.
Field descriptions
The ID_PFR0_EL1 bit assignments are:
31             16 15     12 11      8 7       4 3       0
RES0              State3    State2    State1    State0
Bits [31:16]
Reserved, RES0.


State3, bits [15:12] 
T32EE instruction set support. Defined values are: 
0000 Not implemented. 
0001 T32EE instruction set implemented. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


State2, bits [11:8] 
Jazelle extension support. Defined values are: 


0000 Not implemented. 
0001 Jazelle extension implemented, without clearing of JOSCR.CV on exception entry.
0010 Jazelle extension implemented, with clearing of JOSCR.CV on exception entry.


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


State1, bits [7:4]

T32 instruction set support. Defined values are:

0000 T32 instruction set not implemented.
0001 T32 encodings before the introduction of Thumb-2 technology implemented:
     • All instructions are 16-bit.
     • A BL or BLX is a pair of 16-bit instructions.
     • 32-bit instructions other than BL and BLX cannot be encoded.
0011 T32 encodings after the introduction of Thumb-2 technology implemented, for all
     16-bit and 32-bit T32 basic instructions.

All other values are reserved.

In ARMv8-A the only permitted value is 0011.


State0, bits [3:0]

A32 instruction set support. Defined values are:

0000 A32 instruction set not implemented.
0001 A32 instruction set implemented.

All other values are reserved.

In ARMv8-A the only permitted value is 0001.
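
The field layout above can be decoded with plain shifts and masks. The following C sketch is illustrative only: the helper names (`id_field4`, `id_pfr0_state0` to `id_pfr0_state3`, `id_pfr0_matches_armv8a`) are hypothetical, and it assumes the caller has already read the register value, for example with MRS at EL1 or higher. Because the register is UNKNOWN in an AArch64-only implementation, the result is only meaningful where AArch32 is supported.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 4-bit ID field whose least significant bit is at position lsb. */
static inline unsigned id_field4(uint32_t reg, unsigned lsb)
{
    assert(lsb <= 28);
    return (reg >> lsb) & 0xFu;
}

/* Hypothetical accessors; bit positions follow the field descriptions above. */
static inline unsigned id_pfr0_state0(uint32_t v) { return id_field4(v, 0);  } /* A32     */
static inline unsigned id_pfr0_state1(uint32_t v) { return id_field4(v, 4);  } /* T32     */
static inline unsigned id_pfr0_state2(uint32_t v) { return id_field4(v, 8);  } /* Jazelle */
static inline unsigned id_pfr0_state3(uint32_t v) { return id_field4(v, 12); } /* T32EE   */

/* Returns 1 when every field holds the only value permitted in ARMv8-A
 * and bits [31:16] read as zero. */
static inline int id_pfr0_matches_armv8a(uint32_t v)
{
    return (v >> 16) == 0u
        && id_pfr0_state3(v) == 0x0u
        && id_pfr0_state2(v) == 0x1u
        && id_pfr0_state1(v) == 0x3u
        && id_pfr0_state0(v) == 0x1u;
}
```

With the permitted values listed above, the low 16 bits of the register would read as 0x0131.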


Accessing the ID_PFR0_EL1:

To access the ID_PFR0_EL1:

MRS <Xt>, ID_PFR0_EL1 ; Read ID_PFR0_EL1 into Xt

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0000   0001   000
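
As an illustration of how these five encoding fields are used, the following C sketch packs them into the 32-bit A64 MRS instruction word. The helper name `a64_mrs_encoding` is hypothetical, and the bit positions (op0 at bits [20:19], op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0]) are taken from the A64 System register move encoding rather than from this section, so treat it as a sketch under that assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the instruction word for
 * MRS X<rt>, S<op0>_<op1>_C<crn>_C<crm>_<op2>. */
static inline uint32_t a64_mrs_encoding(unsigned op0, unsigned op1, unsigned crn,
                                        unsigned crm, unsigned op2, unsigned rt)
{
    /* MRS can only encode op0 values 0b10 and 0b11. */
    assert(op0 == 2u || op0 == 3u);
    /* Every other field must fit its width. */
    assert(op1 < 8u && crn < 16u && crm < 16u && op2 < 8u && rt < 32u);

    return 0xD5200000u      /* fixed bits of the System register read form */
         | (op0 << 19)
         | (op1 << 16)
         | (crn << 12)
         | (crm << 8)
         | (op2 << 5)
         | rt;
}
```

For the ID_PFR0_EL1 encoding (op0=0b11, op1=0b000, CRn=0b0000, CRm=0b0001, op2=0b000) and Rt=0, the helper returns 0xD5380100.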
D7.2.60 ID_PFR1_EL1, AArch32 Processor Feature Register 1
The ID_PFR1_EL1 characteristics are:
Purpose 
Gives information about the AArch32 programmers' model. 
Must be interpreted with ID_PFR0_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 
Usage constraints 
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations 
AArch64 System register ID_PFR1_EL1 is architecturally mapped to AArch32 System register
ID_PFR1.
In an AArch64-only implementation, this register is UNKNOWN. 
Attributes 
ID_PFR1_EL1 is a 32-bit register. 
Field descriptions 
The ID_PFR1_EL1 bit assignments are:

Bits    [31:28]  [27:24]    [23:20]   [19:16]   [15:12]         [11:8]    [7:4]     [3:0]
Field   GIC      Virt_frac  Sec_frac  GenTimer  Virtualization  MProgMod  Security  ProgMod

GIC, bits [31:28] 
System register GIC CPU interface. Defined values are: 
0000 No System register interface to the GIC CPU interface is supported. 
0001 System register interface to versions 3.0 and 4.0 of the GIC CPU interface is supported. 
All other values are reserved. 
Virt_frac, bits [27:24] 
Virtualization fractional field. When the Virtualization field is 0000, determines the support for 
features from the ARMv7 Virtualization Extensions. Defined values are: 
0000 No features from the ARMv7 Virtualization Extensions are implemented.
0001 The following features of the ARMv7 Virtualization Extensions are implemented:
     • The SCR.SIF bit, if EL3 is implemented.
     • The modifications to the SCR.AW and SCR.FW bits described in the
       Virtualization Extensions, if EL3 is implemented.
     • The MSR (Banked register) and MRS (Banked register) instructions.
     • The ERET instruction.
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
This field is only valid when the value of ID_PFR1_EL1.Virtualization is 0, otherwise it holds the 
value 0000. 


Note


The ID_ISAR registers do not identify whether the instructions added by the ARMv7 Virtualization 
Extensions are implemented. 





Sec_frac, bits [23:20] 


Security fractional field. When the Security field is 0000, determines the support for features from 
the ARMv7 Security Extensions. Defined values are: 


0000 No features from the ARMv7 Security Extensions are implemented. 

0001 The following features from the ARMv7 Security Extensions are implemented:
     • The VBAR register.
     • The TTBCR.PD0 and TTBCR.PD1 bits.


0010 As for 0001, plus the ability to access Secure or Non-secure physical memory is 
supported. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000, 0001, and 0010. 
This field is only valid when the value of ID_PFR1_EL1.Security is 0, otherwise it holds the value 
0000. 
GenTimer, bits [19:16] 
Generic Timer support. Defined values are: 
0000 Not implemented. 
0001 Generic Timer implemented. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Virtualization, bits [15:12] 


Virtualization support. Defined values are: 


0000 EL2, Hyp mode, and the HVC instruction not implemented. 
0001 EL2, Hyp mode, the HVC instruction, and all the features described by Virt_frac == 
0001 implemented. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 


In an implementation that includes EL2, if EL2 cannot use AArch32 but EL1 can use AArch32 then 
this field has the value 0001. 


If EL1 cannot use AArch32 then this field has the value 0000.

Note
The ID_ISARs do not identify whether the HVC instruction is implemented. 





MProgMod, bits [11:8] 
M profile programmers' model support. Defined values are: 
0000 Not supported. 
0010 Support for two-stack programmers' model. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


Security, bits [7:4] 


Security support. Defined values are: 


0000 EL3, Monitor mode, and the SMC instruction not implemented. 

0001 EL3, Monitor mode, the SMC instruction, and all the features described by Sec_frac == 
0001 implemented. 

0010 As for 0001, and adds the ability to set the NSACR.RFR bit. Not permitted in ARMv8
as the NSACR.RFR bit is RES0.
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 


In an implementation that includes EL3, if EL3 cannot use AArch32 but EL1 can use AArch32 then 
this field has the value 0001. 


If EL1 cannot use AArch32 then this field has the value 0000. 


ProgMod, bits [3:0] 


Support for the standard programmers' model for ARMv4 and later. Model must support User, FIQ, 
IRQ, Supervisor, Abort, Undefined, and System modes. Defined values are: 


0000 Not supported. 

0001 Supported. 

All other values are reserved. 

In ARMv8-A the permitted values are 0001 and 0000. 

If EL1 cannot use AArch32 then this field has the value 0000. 
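
The fractional fields are only meaningful when the corresponding main field is 0000, so software has to consult both. The C sketch below shows one way to combine them; the function names are hypothetical and the logic is only a reading of the field descriptions above, not a definitive implementation. Reserved encodings are treated conservatively as "not implemented".

```c
#include <stdint.h>

/* Extract a 4-bit ID_PFR1 field starting at bit position lsb. */
static inline unsigned id_pfr1_field(uint32_t v, unsigned lsb)
{
    return (v >> lsb) & 0xFu;
}

/* Nonzero when the features listed under Virt_frac == 0001 (for example the
 * ERET instruction and MSR/MRS (Banked register)) are implemented, either
 * because Virtualization is 0001 or because Virt_frac reports them. */
static inline int id_pfr1_has_virt_frac_features(uint32_t v)
{
    unsigned virt      = id_pfr1_field(v, 12);
    unsigned virt_frac = id_pfr1_field(v, 24);

    if (virt == 0x1u)
        return 1;
    if (virt == 0x0u)
        return virt_frac == 0x1u;
    return 0; /* reserved Virtualization value: make no claim */
}

/* Nonzero when the features listed under Sec_frac == 0001 (VBAR and the
 * TTBCR.PD0 and TTBCR.PD1 bits) are implemented. */
static inline int id_pfr1_has_sec_frac_features(uint32_t v)
{
    unsigned sec      = id_pfr1_field(v, 4);
    unsigned sec_frac = id_pfr1_field(v, 20);

    if (sec == 0x1u || sec == 0x2u)
        return 1;
    if (sec == 0x0u)
        return sec_frac == 0x1u || sec_frac == 0x2u;
    return 0; /* reserved Security value: make no claim */
}
```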


Accessing the ID_PFR1_EL1: 
To access the ID_PFR1_EL1: 
MRS <Xt>, ID_PFR1_EL1 ; Read ID_PFR1_EL1 into Xt 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0000   0001   001

D7.2.61 IFSR32_EL2, Instruction Fault Status Register (EL2)
The IFSR32_EL2 characteristics are:


Purpose 
Allows access to the AArch32 IFSR register from AArch64 state only. Its value has no effect on 
execution in AArch64 state. 

Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        RW        RW              RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register IFSR32_EL2 is architecturally mapped to AArch32 System register 
IFSR. 


If EL1 is AArch64 only, this register is UNDEFINED. 


If EL2 is not implemented but EL3 is implemented, and EL1 is capable of using AArch32, then this
register is not RES0.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
IFSR32_EL2 is a 32-bit register. 


Field descriptions 


The IFSR32_EL2 bit assignments are:

When TTBCR.EAE==0:

Bits    [31:17]  [16]  [15:13]  [12]  [11]  [10]   [9]   [8:4]  [3:0]
Field   RES0     FnV   RES0     ExT   RES0  FS[4]  LPAE  RES0   FS[3:0]

Bits [31:17]
Reserved, RES0.


FnV, bit [16] 


FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a 
translation table walk. 





0 IFAR is valid.
1 IFAR is not valid, and holds an UNKNOWN value.

This field is only valid for a Synchronous external abort other than a Synchronous external abort on
a translation table walk. It is RES0 for all other Prefetch Abort exceptions.

Bits [15:13]

Reserved, RES0.

ExT, bit [12]

External abort type. This bit can be used to provide an IMPLEMENTATION DEFINED classification of
external aborts.

In an implementation that does not provide any classification of external aborts, this bit is RES0.

For aborts other than external aborts this bit always returns 0.

Bit [11]

Reserved, RES0.


FS[4], bit [10] 
See FS[3:0], bits [3:0] for description of the FS field. 


LPAE, bit [9] 
On taking a Data Abort exception, this bit is set as follows: 
0 Using the Short-descriptor translation table formats.
1 Using the Long-descriptor translation table formats. 


Hardware does not interpret this bit to determine the behavior of the memory system, and therefore 
software can set this bit to 0 or 1 without affecting operation. 








Bits [8:4]
Reserved, RES0.

FS[3:0], bits [3:0] 
Fault status bits. Interpreted with bit [10]. Possible values of FS[4:0] are: 
00001 PC alignment fault 
00010 Debug exception 
00011 Access flag fault, level 1 
00101 Translation fault, level 1 
00110 Access flag fault, level 2 
00111 Translation fault, level 2 
01000 Synchronous external abort, not on translation table walk 
01001 Domain fault, level 1 
01011 Domain fault, level 2 
01100 Synchronous external abort, on translation table walk, level 1 
01101 Permission fault, level 1 
01110 Synchronous external abort, on translation table walk, level 2 
01111 Permission fault, level 2
10000 TLB conflict abort 
10100 IMPLEMENTATION DEFINED fault (Lockdown fault) 
11001 Synchronous parity or ECC error on memory access, not on translation table walk 
11100 Synchronous parity or ECC error on translation table walk, level 1 
11110 Synchronous parity or ECC error on translation table walk, level 2 
All other values are reserved. 
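
Because the five-bit fault status is split across bit [10] and bits [3:0] in this format, software has to reassemble it before comparing against the table above. A minimal C sketch follows; the function names are hypothetical and the value passed in is assumed to be a copy of the register already read by software.

```c
#include <stdint.h>

/* Reassemble FS[4:0] from FS[4] (bit 10) and FS[3:0] (bits 3:0).
 * Only meaningful when the LPAE bit (bit 9) is 0. */
static inline unsigned ifsr_short_fs(uint32_t ifsr)
{
    unsigned fs4  = (ifsr >> 10) & 0x1u;
    unsigned fs30 = ifsr & 0xFu;
    return (fs4 << 4) | fs30;
}

/* Returns 1 when the LPAE bit reports the Long-descriptor format. */
static inline int ifsr_uses_long_descriptor(uint32_t ifsr)
{
    return (int)((ifsr >> 9) & 0x1u);
}
```

For example, a register value with only bit [10] set yields FS[4:0] = 0b10000, which the table lists as a TLB conflict abort.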
When TTBCR.EAE==1:

Bits    [31:17]  [16]  [15:13]  [12]  [11:10]  [9]   [8:6]  [5:0]
Field   RES0     FnV   RES0     ExT   RES0     LPAE  RES0   STATUS

Bits [31:17]

Reserved, RES0.

FnV, bit [16]

FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a
translation table walk.

0 IFAR is valid.
1 IFAR is not valid, and holds an UNKNOWN value.

This field is only valid for a Synchronous external abort other than a Synchronous external abort on
a translation table walk. It is RES0 for all other Prefetch Abort exceptions.

Bits [15:13]

Reserved, RES0.

ExT, bit [12]

External abort type. This bit can be used to provide an IMPLEMENTATION DEFINED classification of
external aborts.

In an implementation that does not provide any classification of external aborts, this bit is RES0.

For aborts other than external aborts this bit always returns 0.

Bits [11:10]

Reserved, RES0.

LPAE, bit [9]

On taking a Data Abort exception, this bit is set as follows:
0 Using the Short-descriptor translation table formats.
1 Using the Long-descriptor translation table formats.

Hardware does not interpret this bit to determine the behavior of the memory system, and therefore
software can set this bit to 0 or 1 without affecting operation.

Bits [8:6]

Reserved, RES0.


STATUS, bits [5:0] 


Fault status bits. All encodings not shown below are reserved:

000000 Address size fault in TTBR0 or TTBR1
000001 Address size fault, level 1
000010 Address size fault, level 2
000011 Address size fault, level 3
000101 Translation fault, level 1
000110 Translation fault, level 2
000111 Translation fault, level 3
001001 Access flag fault, level 1
001010 Access flag fault, level 2
001011 Access flag fault, level 3
001101 Permission fault, level 1
001110 Permission fault, level 2
001111 Permission fault, level 3
010000 Synchronous external abort, not on translation table walk
010101 Synchronous external abort, on translation table walk, level 1
010110 Synchronous external abort, on translation table walk, level 2
010111 Synchronous external abort, on translation table walk, level 3
011000 Synchronous parity or ECC error on memory access, not on translation table walk
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3
100001 PC alignment fault
100010 Debug exception
110000 TLB conflict abort

All other values are reserved.


The lookup level associated with a fault is: 


For a fault generated on a translation table walk, the lookup level of the walk being 
performed. 


For a Translation fault, the lookup level of the translation table that gave the fault. If a fault 
occurs because a stage of address translation is disabled, or because the input address is 
outside the range specified by the appropriate base address register or registers, the fault is 
reported as a fault at level 1. 


For an Access flag fault, the lookup level of the translation table that gave the fault. 


For a Permission fault, including a Permission fault caused by hierarchical permissions, the 
lookup level of the final level of translation table accessed for the translation. That is, the 
lookup level of the translation table that returned a Block or Page descriptor. 
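
In the STATUS encodings listed above, the address size, translation, access flag, and permission faults all carry the lookup level in STATUS[1:0], with STATUS[5:2] selecting the fault type. The C sketch below exploits that regularity; it is an observation drawn from the table rather than a rule stated by the architecture text, and the enum and function names are hypothetical.

```c
#include <stdint.h>

enum lpae_fault_class {
    LPAE_ADDR_SIZE,    /* level 0 here means "in TTBR0 or TTBR1" */
    LPAE_TRANSLATION,
    LPAE_ACCESS_FLAG,
    LPAE_PERMISSION,
    LPAE_OTHER         /* any other listed or reserved encoding */
};

/* STATUS occupies bits [5:0] when TTBCR.EAE==1. */
static inline unsigned ifsr_long_status(uint32_t ifsr)
{
    return ifsr & 0x3Fu;
}

/* Classify a STATUS value and report STATUS[1:0] through *level. */
static inline enum lpae_fault_class lpae_classify(unsigned status, unsigned *level)
{
    unsigned group = (status >> 2) & 0xFu;  /* STATUS[5:2] */
    unsigned ll    = status & 0x3u;         /* STATUS[1:0] */

    *level = ll;
    switch (group) {
    case 0x0: return LPAE_ADDR_SIZE;
    case 0x1: return ll ? LPAE_TRANSLATION : LPAE_OTHER; /* 000100 is reserved */
    case 0x2: return ll ? LPAE_ACCESS_FLAG : LPAE_OTHER; /* 001000 is reserved */
    case 0x3: return ll ? LPAE_PERMISSION  : LPAE_OTHER; /* 001100 is reserved */
    default:  return LPAE_OTHER;
    }
}
```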


Accessing the IFSR32_EL2:

To access the IFSR32_EL2:

MRS <Xt>, IFSR32_EL2 ; Read IFSR32_EL2 into Xt
MSR IFSR32_EL2, <Xt> ; Write Xt to IFSR32_EL2

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    100   0101   0000   001

D7.2.62 ISR_EL1, Interrupt Status Register
The ISR_EL1 characteristics are: 
Purpose 
Shows whether any IRQ, FIQ, or SError interrupt is pending. In an implementation that includes 
EL2, when the register is accessed from Non-secure EL1, a pending interrupt or external abort might 
be physical or virtual, and the architecture does not provide any mechanism that software executing 
at Non-secure EL1 can use to determine whether a pending interrupt or external abort is physical or 
virtual. For all other accesses, any indicated interrupt or external abort must be physical. 
Usage constraints 
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register ISR_EL1 is architecturally mapped to AArch32 System register ISR. 
Attributes 
ISR_EL1 is a 32-bit register. 
Field descriptions 
The ISR_EL1 bit assignments are:

Bits    [31:9]  [8]  [7]  [6]  [5:0]
Field   RES0    A    I    F    RES0

Bits [31:9]
Reserved, RES0.
A, bit [8] 
SError interrupt pending bit: 
0 No pending SError. 
1 An SError interrupt is pending. 
I, bit [7] 
IRQ pending bit. Indicates whether an IRQ interrupt is pending: 
0 No pending IRQ.
1 An IRQ interrupt is pending.
F, bit [6]
FIQ pending bit. Indicates whether an FIQ interrupt is pending:
0 No pending FIQ.
1 An FIQ interrupt is pending.
Bits [5:0]
Reserved, RES0.
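
A short C sketch of testing the three pending bits follows. The macro and function names are hypothetical, and the value is assumed to have been read from ISR_EL1 by the caller. As noted in the Purpose text, a bit observed from Non-secure EL1 might reflect a virtual rather than a physical interrupt, and this code cannot distinguish the two.

```c
#include <stdint.h>

#define ISR_A_BIT (1u << 8)  /* SError interrupt pending */
#define ISR_I_BIT (1u << 7)  /* IRQ pending */
#define ISR_F_BIT (1u << 6)  /* FIQ pending */

static inline int isr_serror_pending(uint32_t isr) { return (isr & ISR_A_BIT) != 0u; }
static inline int isr_irq_pending(uint32_t isr)    { return (isr & ISR_I_BIT) != 0u; }
static inline int isr_fiq_pending(uint32_t isr)    { return (isr & ISR_F_BIT) != 0u; }

/* Nonzero when any of the three interrupt types is pending. */
static inline int isr_any_pending(uint32_t isr)
{
    return (isr & (ISR_A_BIT | ISR_I_BIT | ISR_F_BIT)) != 0u;
}
```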


Accessing the ISR_EL1: 
To access the ISR_EL1: 
MRS <Xt>, ISR_EL1 ; Read ISR_EL1 into Xt 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   1100   0001   000

D7.2.63 MAIR_EL1, Memory Attribute Indirection Register (EL1)
The MAIR_EL1 characteristics are: 


Purpose 
Provides the memory attribute encodings corresponding to the possible AttrIndx values in a 
Long-descriptor format translation table entry for stage 1 translations at EL1. 

Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.
Configurations 


AArch64 System register MAIR_EL1[31:0] is architecturally mapped to AArch32 System register 
PRRR when TTBCR.EAE==0. 


AArch64 System register MAIR_EL1[31:0] is architecturally mapped to AArch32 System register 
MAIR0 when TTBCR.EAE==1.


AArch64 System register MAIR_EL1[63:32] is architecturally mapped to AArch32 System register 
NMRR when TTBCR.EAE==0. 


AArch64 System register MAIR_EL1[63:32] is architecturally mapped to AArch32 System register 
MAIR1 when TTBCR.EAE==1.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
MAIR_EL1 is a 64-bit register.


Field descriptions 


The MAIR_EL1 bit assignments are:

Bits    [63:56]  [55:48]  [47:40]  [39:32]  [31:24]  [23:16]  [15:8]  [7:0]
Field   Attr7    Attr6    Attr5    Attr4    Attr3    Attr2    Attr1   Attr0


MAIR_EL1 is permitted to be cached in a TLB. 


Attr<n>, bits [8n+7:8n], for n = 0 to 7 


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation
table entry, where AttrIndx[2:0] gives the value of <n> in Attr<n>.

Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.


The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.


The R and W bits in some Attr<n> fields have the following meanings:

R or W   Meaning
0        No Allocate
1        Allocate
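
To make the Attr<n> layout concrete, the C sketch below builds a MAIR value from per-index attribute bytes. The macro and function names are hypothetical; the attribute byte values are worked out from the two encoding tables above (for example, 0xFF is 11RW with R=1 and W=1 in both halves). The sketch only computes the 64-bit value and does not write the register.

```c
#include <assert.h>
#include <stdint.h>

/* Example attribute bytes derived from the encoding tables above. */
#define MAIR_ATTR_DEVICE_nGnRnE 0x00u /* [7:4]=0000, [3:0]=0000 */
#define MAIR_ATTR_DEVICE_nGnRE  0x04u /* [7:4]=0000, [3:0]=0100 */
#define MAIR_ATTR_NORMAL_NC     0x44u /* Outer and Inner Non-cacheable */
#define MAIR_ATTR_NORMAL_WB     0xFFu /* Outer and Inner Write-Back non-transient, R=1, W=1 */

/* Return mair with Attr<n> (bits [8n+7:8n]) replaced by attr. */
static inline uint64_t mair_set_attr(uint64_t mair, unsigned n, uint8_t attr)
{
    assert(n < 8u);
    unsigned shift = 8u * n;
    mair &= ~((uint64_t)0xFFu << shift);
    return mair | ((uint64_t)attr << shift);
}

/* Return the Attr<n> byte selected by an AttrIndx[2:0] value of n. */
static inline uint8_t mair_get_attr(uint64_t mair, unsigned n)
{
    assert(n < 8u);
    return (uint8_t)(mair >> (8u * n));
}
```

A translation table entry with AttrIndx[2:0] = 1 would then select whatever byte was stored with `mair_set_attr(mair, 1, ...)`.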





Accessing the MAIR_EL1:

To access the MAIR_EL1:

MRS <Xt>, MAIR_EL1 ; Read MAIR_EL1 into Xt
MSR MAIR_EL1, <Xt> ; Write Xt to MAIR_EL1

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   1010   0010   000

D7.2.64 MAIR_EL2, Memory Attribute Indirection Register (EL2)
The MAIR_EL2 characteristics are: 


Purpose 


Provides the memory attribute encodings corresponding to the possible AttrIndx values in a 
Long-descriptor format translation table entry for stage 1 translations at EL2. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        RW        RW              RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register MAIR_EL2[31:0] is architecturally mapped to AArch32 System register 
HMAIR0.


AArch64 System register MAIR_EL2[63:32] is architecturally mapped to AArch32 System register 
HMAIR1.


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
MAIR_EL2 is a 64-bit register. 


Field descriptions 


The MAIR_EL2 bit assignments are:

Bits    [63:56]  [55:48]  [47:40]  [39:32]  [31:24]  [23:16]  [15:8]  [7:0]
Field   Attr7    Attr6    Attr5    Attr4    Attr3    Attr2    Attr1   Attr0


MAIR_EL2 is permitted to be cached in a TLB. 


Attr<n>, bits [8n+7:8n], for n = 0 to 7 


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation 
table entry, where AttrIndx[2:0] gives the value of <n> in Attr<n>. 


Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.
The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.

The R and W bits in some Attr<n> fields have the following meanings:

R or W   Meaning
0        No Allocate
1        Allocate

Accessing the MAIR_EL2: 
To access the MAIR_EL2: 
MRS <Xt>, MAIR_EL2 ; Read MAIR_EL2 into Xt 
MSR MAIR_EL2, <Xt> ; Write Xt to MAIR_EL2 

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    100   1010   0010   000

D7.2.65 MAIR_EL3, Memory Attribute Indirection Register (EL3)
The MAIR_EL3 characteristics are: 


Purpose 


Provides the memory attribute encodings corresponding to the possible AttrIndx values in a 
Long-descriptor format translation table entry for stage 1 translations at EL3. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         RW              RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
MAIR_EL3 is a 64-bit register. 


Field descriptions 


The MAIR_EL3 bit assignments are:

Bits    [63:56]  [55:48]  [47:40]  [39:32]  [31:24]  [23:16]  [15:8]  [7:0]
Field   Attr7    Attr6    Attr5    Attr4    Attr3    Attr2    Attr1   Attr0


MAIR_EL3 is permitted to be cached in a TLB. 


Attr<n>, bits [8n+7:8n], for n = 0 to 7 


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation 
table entry, where AttrIndx[2:0] gives the value of <n> in Attr<n>. 


Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.







The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.

The R and W bits in some Attr<n> fields have the following meanings:

R or W   Meaning
0        No Allocate
1        Allocate





Accessing the MAIR_EL3: 


To access the MAIR_EL3: 


MRS <Xt>, MAIR_EL3 ; Read MAIR_EL3 into Xt 
MSR MAIR_EL3, <Xt> ; Write Xt to MAIR_EL3 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    110   1010   0010   000

D7.2.66 MIDR_EL1, Main ID Register
The MIDR_EL1 characteristics are: 


Purpose 


Provides identification information for the PE, including an implementer code for the device and a 
device ID number. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register MIDR_EL1 is architecturally mapped to AArch32 System register 
MIDR. 


AArch64 System register MIDR_EL1 is architecturally mapped to External register MIDR_EL1. 


Attributes 
MIDR_EL1 is a 32-bit register. 


Field descriptions 


The MIDR_EL1 bit assignments are:

Bits    [31:24]      [23:20]  [19:16]       [15:4]   [3:0]
Field   Implementer  Variant  Architecture  PartNum  Revision


Implementer, bits [31:24] 


The Implementer code. This field must hold an implementer code that has been assigned by ARM. 
Assigned codes include the following: 





Hex representation   ASCII representation   Implementer
0x41                 A                      ARM Limited
0x42                 B                      Broadcom Corporation
0x43                 C                      Cavium Inc.
0x44                 D                      Digital Equipment Corporation
0x49                 I                      Infineon Technologies AG
0x4D                 M                      Motorola or Freescale Semiconductor Inc.
0x4E                 N                      NVIDIA Corporation
0x50                 P                      Applied Micro Circuits Corporation
0x51                 Q                      Qualcomm Inc.
0x56                 V                      Marvell International Ltd.
0x69                 i                      Intel Corporation





ARM can assign codes that are not published in this manual. All values not assigned by ARM are 
reserved and must not be used. 


Variant, bits [23:20] 


An IMPLEMENTATION DEFINED variant number. Typically, this field is used to distinguish between 
different product variants, or major revisions of a product. 


Architecture, bits [19:16] 


The permitted values of this field are:

0001 ARMv4
0010 ARMv4T
0011 ARMv5 (obsolete)
0100 ARMv5T
0101 ARMv5TE
0110 ARMv5TEJ
0111 ARMv6
1111 Architectural features are individually identified in the ID_* registers, see Identification
     registers, functional group on page G4-4194.

All other values are reserved.


PartNum, bits [15:4] 


An IMPLEMENTATION DEFINED primary part number for the device. 


On processors implemented by ARM, if the top four bits of the primary part number are 0x0 or 0x7,
the variant and architecture are encoded differently.


Revision, bits [3:0] 


An IMPLEMENTATION DEFINED revision number for the device. 
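
The five fields can be separated with shifts and masks, as in the following C sketch. The struct and function names are hypothetical, the input is assumed to be a value already read from MIDR_EL1, and the sample value mentioned afterwards is purely illustrative.

```c
#include <stdint.h>

/* Decoded view of a MIDR_EL1 value (hypothetical type). */
struct midr_fields {
    unsigned implementer;  /* bits [31:24] */
    unsigned variant;      /* bits [23:20] */
    unsigned architecture; /* bits [19:16] */
    unsigned partnum;      /* bits [15:4]  */
    unsigned revision;     /* bits [3:0]   */
};

static inline struct midr_fields midr_decode(uint32_t midr)
{
    struct midr_fields f;
    f.implementer  = (midr >> 24) & 0xFFu;
    f.variant      = (midr >> 20) & 0xFu;
    f.architecture = (midr >> 16) & 0xFu;
    f.partnum      = (midr >> 4)  & 0xFFFu;
    f.revision     = midr & 0xFu;
    return f;
}
```

For the sample value 0x410FD034 this yields Implementer 0x41 (ARM Limited in the table above), Variant 0x0, Architecture 0xF, PartNum 0xD03, and Revision 0x4.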


Accessing the MIDR_EL1: 


To access the MIDR_EL1: 


MRS <Xt>, MIDR_EL1 ; Read MIDR_EL1 into Xt 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0000   0000   000

D7.2.67 MPIDR_EL1, Multiprocessor Affinity Register
The MPIDR_EL1 characteristics are: 


Purpose 


In a multiprocessor system, provides an additional PE identification mechanism for scheduling 
purposes. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO





Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 


AArch64 System register MPIDR_EL1 is architecturally mapped to AArch32 System register 
MPIDR. 


The assigned value of the MPIDR.{Aff2, Aff1, Aff0} or MPIDR_EL1.{Aff3, Aff2, Aff1, Aff0} set
of fields of each PE must be unique within the system as a whole.


In a uniprocessor system ARM recommends that each Aff<n> field of this register returns a value 
of 0. 


Attributes 
MPIDR_EL1 is a 64-bit register. 


Field descriptions 


The MPIDR_EL1 bit assignments are:

Bits    [63:40]  [39:32]  [31]  [30]  [29:25]  [24]  [23:16]  [15:8]  [7:0]
Field   RES0     Aff3     RES1  U     RES0     MT    Aff2     Aff1    Aff0

Bits [63:40]
Reserved, RES0.
Aff3, bits [39:32] 
Affinity level 3. Highest level affinity field. 





Bit [31] 
Reserved, RES1. 

U, bit [30] 
Indicates a Uniprocessor system, as distinct from PE 0 in a multiprocessor system. The possible 
values of this bit are: 
0 Processor is part of a multiprocessor system. 
1 Processor is part of a uniprocessor system.
Bits [29:25]
Reserved, RES0.
MT, bit [24] 


Indicates whether the lowest level of affinity consists of logical PEs that are implemented using a 
multithreading type approach. The possible values of this bit are: 


0 Performance of PEs at the lowest affinity level is largely independent. 


1 Performance of PEs at the lowest affinity level is very interdependent. 


Aff2, bits [23:16] 
Affinity level 2. Second highest level affinity field. 


Aff1, bits [15:8] 
Affinity level 1. Third highest level affinity field. 


Aff0, bits [7:0] 
Affinity level 0. Lowest level affinity field. 
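
Because Aff3 sits above bit 31 while Aff2, Aff1, and Aff0 occupy the low 24 bits, software that wants a single comparable identifier has to gather the four fields. The C sketch below packs them into one 32-bit value; the function names and the packed layout (Aff3 in the top byte) are assumptions made for this illustration, not an architectural format.

```c
#include <stdint.h>

/* Pack Aff3:Aff2:Aff1:Aff0 into a 32-bit value, Aff3 in bits [31:24]. */
static inline uint32_t mpidr_pack_affinity(uint64_t mpidr)
{
    uint32_t aff3 = (uint32_t)((mpidr >> 32) & 0xFFu);
    uint32_t low  = (uint32_t)(mpidr & 0xFFFFFFu);   /* Aff2, Aff1, Aff0 */
    return (aff3 << 24) | low;
}

/* U, bit [30]: 1 when the PE is part of a uniprocessor system. */
static inline int mpidr_is_uniprocessor(uint64_t mpidr)
{
    return (int)((mpidr >> 30) & 0x1u);
}

/* MT, bit [24]: 1 when PEs at the lowest affinity level are very interdependent. */
static inline int mpidr_is_multithreaded(uint64_t mpidr)
{
    return (int)((mpidr >> 24) & 0x1u);
}
```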


Accessing the MPIDR_EL1: 
To access the MPIDR_EL1:
MRS <Xt>, MPIDR_EL1 ; Read MPIDR_EL1 into Xt 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    000   0000   0000   101

D7.2.68 MVFR0_EL1, AArch32 Media and VFP Feature Register 0
The MVFR0_EL1 characteristics are:
Purpose 
Describes the features provided by the AArch32 Advanced SIMD and Floating-point 
implementation. 
Must be interpreted with MVFR1_EL1 and MVFR2_EL1. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 
Usage constraints 
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations 
AArch64 System register MVFR0_EL1 is architecturally mapped to AArch32 System register
MVFR0.
In an implementation where at least one Exception level supports execution in AArch32 state, but 
there is no support for Advanced SIMD and floating-point operation, this register is RAZ. 
In an AArch64-only implementation, this register is UNKNOWN. 
Attributes 
MVFR0_EL1 is a 32-bit register.
Field descriptions
The MVFR0_EL1 bit assignments are:

Bits    [31:28]  [27:24]  [23:20]  [19:16]   [15:12]  [11:8]  [7:4]  [3:0]
Field   FPRound  FPShVec  FPSqrt   FPDivide  FPTrap   FPDP    FPSP   SIMDReg

FPRound, bits [31:28] 
Floating-Point Rounding modes. Indicates whether the floating-point implementation provides 
support for rounding modes. Defined values are: 
0000 Not implemented, or only Round to Nearest mode supported, except that Round towards 
Zero mode is supported for VCVT instructions that always use that rounding mode 
regardless of the FPSCR setting. 
0001 All rounding modes supported. 
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
FPShVec, bits [27:24]


Short Vectors. Indicates whether the floating-point implementation provides support for the use of 
short vectors. Defined values are: 


0000 Short vectors not supported. 
0001 Short vector operation supported. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


FPSqrt, bits [23:20]


Square Root. Indicates whether the floating-point implementation provides support for the ARMv6 
VFP square root operations. Defined values are: 


0000 Not supported in hardware. 
0001 Supported. 
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
The VSQRT.F32 instruction also requires the single-precision floating-point attribute, bits [7:4], 
and the VSQRT.F64 instruction also requires the double-precision floating-point attribute, bits 
[11:8]. 
FPDivide, bits [19:16] 


Indicates whether the floating-point implementation provides support for VFP divide operations. 
Defined values are: 


0000 Not supported in hardware. 

0001 Supported. 

All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0001. 

The VDIV.F32 instruction also requires the single-precision floating-point attribute, bits [7:4], and 

the VDIV.F64 instruction also requires the double-precision floating-point attribute, bits [11:8]. 
FPTrap, bits [15:12] 


Floating Point Exception Trapping. Indicates whether the floating-point implementation provides 
support for exception trapping. Defined values are: 


0000 Not supported. 
0001 Supported. 
All other values are reserved. 
A value of 0001 indicates that, when the corresponding trap is enabled, a floating-point exception 
generates an exception. 
FPDP, bits [11:8] 


Double Precision. Indicates whether the floating-point implementation provides support for 
double-precision operations. Defined values are: 


0000 Not supported in hardware. 
0001 Supported, VFPv2. 
0010 Supported, VFPv3, VFPv4, or ARMv8. VFPv3 and ARMv8 add an instruction to load
a double-precision floating-point constant, and conversions between double-precision
and fixed-point values.


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0010. 
A value of 0b0001 or 0b0010 indicates support for all VFP double-precision instructions in the
supported version of VFP, except that, in addition to this field being nonzero:

• VSQRT.F64 is only available if the Square root field is 0001.
• VDIV.F64 is only available if the Divide field is 0001.
• Conversion between double-precision and single-precision is only available if the
single-precision field is nonzero.


FPSP, bits [7:4] 


Single Precision. Indicates whether the floating-point implementation provides support for 
single-precision operations. Defined values are: 


0000 Not supported in hardware. 

0001 Supported, VFPv2. 

0010 Supported, VFPv3 or VFPv4. VFPv3 adds an instruction to load a single-precision 
floating-point constant, and conversions between single-precision and fixed-point 
values. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0010. 


A value of 0b0001 or 0b0010 indicates support for all VFP single-precision instructions in the 
supported version of VFP, except that, in addition to this field being nonzero: 


• VSQRT.F32 is only available if the Square root field is 0001.
• VDIV.F32 is only available if the Divide field is 0001.
• Conversion between double-precision and single-precision is only available if the
double-precision field is nonzero.


SIMDReg, bits [3:0] 


Advanced SIMD registers. Indicates whether the Advanced SIMD and floating-point
implementation provides support for the Advanced SIMD and floating-point register bank. Defined
values are:

0000 The implementation has no Advanced SIMD and floating-point support.
0001 The implementation includes floating-point support with 16 x 64-bit registers.
0010 The implementation includes Advanced SIMD and floating-point support with 32 x
64-bit registers.
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0010. 
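Several MVFR0_EL1 fields only have meaning in combination, for example VSQRT.F64 needs both the FPSqrt and FPDP fields. The following C sketch shows one way software might decode those combinations from a value already read with MRS. The function names are hypothetical, and reserved field values are treated here as "not supported", which is an assumption of the sketch rather than an architectural rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the 4-bit ID field whose least significant bit is at 'lsb'. */
static inline unsigned id_field(uint32_t reg, unsigned lsb)
{
    assert(lsb <= 28 && lsb % 4 == 0);
    return (reg >> lsb) & 0xFu;
}

/* VSQRT.F64 needs FPSqrt (bits [23:20]) == 0001 and a nonzero
 * FPDP field (bits [11:8]). */
static inline bool mvfr0_has_vsqrt_f64(uint32_t mvfr0)
{
    return id_field(mvfr0, 20) == 1 && id_field(mvfr0, 8) != 0;
}

/* VDIV.F32 needs FPDivide (bits [19:16]) == 0001 and a nonzero
 * FPSP field (bits [7:4]). */
static inline bool mvfr0_has_vdiv_f32(uint32_t mvfr0)
{
    return id_field(mvfr0, 16) == 1 && id_field(mvfr0, 4) != 0;
}

/* Number of 64-bit registers implied by SIMDReg (bits [3:0]).
 * Returns 0 for "no support" and, by assumption, for reserved values. */
static inline unsigned mvfr0_num_dregs(uint32_t mvfr0)
{
    switch (id_field(mvfr0, 0)) {
    case 1:  return 16;
    case 2:  return 32;
    default: return 0;
    }
}
```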


Accessing the MVFR0_EL1:
To access the MVFR0_EL1:
MRS <Xt>, MVFR0_EL1 ; Read MVFR0_EL1 into Xt


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0011    000
D7.2.69 MVFR1_EL1, AArch32 Media and VFP Feature Register 1
The MVFR1_EL1 characteristics are: 


Purpose 


Describes the features provided by the AArch32 Advanced SIMD and Floating-point 
implementation. 


Must be interpreted with MVFR0_EL1 and MVFR2_EL1.


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 


AArch64 System register MVFR1_EL1 is architecturally mapped to AArch32 System register
MVFR1.


In an implementation where at least one Exception level supports execution in AArch32 state, but 
there is no support for Advanced SIMD and floating-point operation, this register is RAZ. 


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
MVFR1_EL1 is a 32-bit register. 


Field descriptions 


The MVFR1_EL1 bit assignments are: 


31      28 27  24 23    20 19    16 15     12 11    8 7      4 3     0
SIMDFMAC   FPHP   SIMDHP   SIMDSP   SIMDInt   SIMDLS   FPDNaN   FPFtZ


SIMDFMAC, bits [31:28]


Advanced SIMD Fused Multiply-Accumulate. Indicates whether the Advanced SIMD 
implementation provides fused multiply accumulate instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 
The Advanced SIMD and floating-point implementations must provide the same level of support
for these instructions. 


FPHP, bits [27:24] 


Floating Point Half Precision. Indicates whether the floating-point implementation provides 
half-precision floating-point conversion instructions. Defined values are: 


0000 Not implemented. 
0001 Instructions to convert between half-precision and single-precision implemented. 
0010 As for 0b0001, and also instructions to convert between half-precision and
double-precision implemented.
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0010. 


SIMDHP, bits [23:20]


Advanced SIMD Half Precision. Indicates whether the Advanced SIMD and floating-point 
implementation provides half-precision floating-point conversion instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. This value is permitted only if the SIMDSP field is 0001. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SIMDSP, bits [19:16] 


Advanced SIMD Single Precision. Indicates whether the Advanced SIMD and floating-point 
implementation provides single-precision floating-point instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. This value is permitted only if the SIMDInt field is 0001. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SIMDInt, bits [15:12] 


Advanced SIMD Integer. Indicates whether the Advanced SIMD and floating-point implementation 
provides integer instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SIMDLS, bits [11:8] 


Advanced SIMD Load/Store. Indicates whether the Advanced SIMD and floating-point 
implementation provides load/store instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


FPDNaN, bits [7:4] 


Default NaN mode. Indicates whether the floating-point implementation provides support only for 
the Default NaN mode. Defined values are: 





0000 Not implemented, or hardware supports only the Default NaN mode. 
0001 Hardware supports propagation of NaN values. 
All other values are reserved.
In ARMv8-A the permitted values are 0000 and 0001. 
FPFtZ, bits [3:0]


Flush to Zero mode. Indicates whether the floating-point implementation provides support only for 
the Flush-to-Zero mode of operation. Defined values are: 


0000 Not implemented, or hardware supports only the Flush-to-Zero mode of operation. 
0001 Hardware supports full denormalized number arithmetic. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 
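The SIMDHP and SIMDSP descriptions each state a dependency on another field: SIMDHP may be 0001 only if SIMDSP is 0001, and SIMDSP may be 0001 only if SIMDInt is 0001. A model or test bench could check a candidate MVFR1_EL1 value against those two rules roughly as follows. This is an illustrative sketch in C with hypothetical function names; it checks only the two dependencies quoted above and nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the 4-bit MVFR1_EL1 field whose lowest bit is at 'lsb'. */
static inline unsigned mvfr1_field(uint32_t mvfr1, unsigned lsb)
{
    assert(lsb <= 28 && lsb % 4 == 0);
    return (mvfr1 >> lsb) & 0xFu;
}

/* Check the two dependencies stated in the field descriptions. */
static inline bool mvfr1_simd_fields_consistent(uint32_t mvfr1)
{
    unsigned simdhp  = mvfr1_field(mvfr1, 20);  /* bits [23:20] */
    unsigned simdsp  = mvfr1_field(mvfr1, 16);  /* bits [19:16] */
    unsigned simdint = mvfr1_field(mvfr1, 12);  /* bits [15:12] */

    if (simdhp == 1 && simdsp != 1)
        return false;   /* SIMDHP == 0001 requires SIMDSP == 0001 */
    if (simdsp == 1 && simdint != 1)
        return false;   /* SIMDSP == 0001 requires SIMDInt == 0001 */
    return true;
}
```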


Accessing the MVFR1_EL1: 
To access the MVFR1_EL1: 
MRS <Xt>, MVFR1_EL1 ; Read MVFR1_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0011    001
D7.2.70 MVFR2_EL1, AArch32 Media and VFP Feature Register 2
The MVFR2_EL1 characteristics are: 


Purpose 


Describes the features provided by the AArch32 Advanced SIMD and Floating-point 
implementation. 


Must be interpreted with MVFR0_EL1 and MVFR1_EL1.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 

Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations 


AArch64 System register MVFR2_EL1 is architecturally mapped to AArch32 System register 
MVFR2. 


In an implementation where at least one Exception level supports execution in AArch32 state, but 
there is no support for Advanced SIMD and floating-point operation, this register is RAZ. 


In an AArch64-only implementation, this register is UNKNOWN. 


Attributes 
MVFR2_EL1 is a 32-bit register. 


Field descriptions 


The MVFR2_EL1 bit assignments are: 


31                    8 7      4 3        0
RES0                    FPMisc   SIMDMisc
Bits [31:8]


Reserved, RES0.


FPMisc, bits [7:4] 


Indicates whether the floating-point implementation provides support for miscellaneous VFP
features.
0000 Not implemented, or no support for miscellaneous features. 
0001 Support for Floating-point selection. 
0010 As 0001, and Floating-point Conversion to Integer with Directed Rounding modes. 
0011 As 0010, and Floating-point Round to Integral Floating-point. 
0100 As 0011, and Floating-point MaxNum and MinNum.

All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0100. 
SIMDMisc, bits [3:0]


Indicates whether the Advanced SIMD implementation provides support for miscellaneous
Advanced SIMD features.

0000 Not implemented, or no support for miscellaneous features. 

0001 Floating-point Conversion to Integer with Directed Rounding modes. 
0010 As 0001, and Floating-point Round to Integral Floating-point. 

0011 As 0010, and Floating-point MaxNum and MinNum. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0011. 
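The FPMisc encodings are cumulative: each defined value includes everything reported by the lower values. A feature test can therefore be written as a comparison against the first value that introduces the feature. The sketch below is one possible C rendering of that idea; the enum and function names are hypothetical, and treating reserved values (above 0100) as "not reported" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First FPMisc value at which each floating-point feature is reported. */
enum fpmisc_level {
    FPMISC_SELECT        = 1,  /* Floating-point selection */
    FPMISC_CVT_DIRECTED  = 2,  /* Conversion to Integer, Directed Rounding */
    FPMISC_ROUND_INT     = 3,  /* Round to Integral Floating-point */
    FPMISC_MAXNUM_MINNUM = 4   /* MaxNum and MinNum */
};

/* True if the FPMisc field (bits [7:4]) reports the given feature. */
static inline bool mvfr2_fp_has(uint32_t mvfr2, enum fpmisc_level feature)
{
    assert(feature >= FPMISC_SELECT && feature <= FPMISC_MAXNUM_MINNUM);
    unsigned fpmisc = (mvfr2 >> 4) & 0xFu;

    if (fpmisc > (unsigned)FPMISC_MAXNUM_MINNUM)
        return false;   /* reserved encoding: assume nothing */
    return fpmisc >= (unsigned)feature;
}
```

The same pattern applies to SIMDMisc, bits [3:0], with its own three levels.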


Accessing the MVFR2_EL1: 
To access the MVFR2_EL1: 
MRS <Xt>, MVFR2_EL1 ; Read MVFR2_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0011    010
D7.2.71 PAR_EL1, Physical Address Register
The PAR_EL1 characteristics are:


Purpose 
Returns the output address (OA) from an address translation instruction that executed successfully, 
or fault information if the instruction did not execute successfully. 

Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RW         RW        RW         RW               RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
AArch64 System register PAR_EL1 is architecturally mapped to AArch32 System register PAR. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
PAR_EL1 is a 64-bit register.


Field descriptions 


The PAR_EL1 bit assignments are: 


For all register layouts: 


F, bit [0] 
Indicates whether the instruction performed a successful address translation. 
0 Address translation completed successfully. 
1 Address translation aborted. 


When PAR_EL1.F==0: 


63   56 55   48 47  12 11     10        9    8  7 6    1 0
ATTR    RES0    PA     RES1   IMP DEF   NS   SH   RES0   F


This section describes the register value returned by the successful execution of an Address translation instruction. 
Software might subsequently write a different value to the register, and that write does not affect the operation of 
the PE. 


On a successful conversion, the PAR_EL1 can return a value that indicates the resulting attributes, rather than the 
values that appear in the translation table descriptors. More precisely: 


• The ATTR and SH fields are permitted to report the resulting attributes, as determined by any permitted
implementation choices and any applicable configuration bits, instead of reporting the values that appear in
the translation table descriptors.
• See the NS bit description for constraints on the value it returns.


ATTR, bits [63:56] 


Memory attributes for the returned output address. This field uses the same encoding as the Attr<n> 
fields in MAIR_EL1, MAIR_EL2, and MAIR_EL3. 


The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 


Bits [55:48] 


Reserved, RES0.


PA, bits [47:12] 


Output address. The output address (OA) corresponding to the supplied input address. This field 
returns address bits[47:12]. 


For implementations that implement fewer than 48 bits of physical address, the upper bits of this 
field, corresponding to address bits that are not implemented, are RES0.


Bit [11] 
Reserved, RES1. 
IMP DEF, bit [10] 
IMPLEMENTATION DEFINED. 
NS, bit [9] 
Non-secure. The NS attribute for a translation table entry from a Secure translation regime. 


For a result from a Secure translation regime, this bit reflects the Security state of the physical 
address space of the translation. This means it reflects the effect of the NSTable bits of earlier levels 
of the translation table walk if those NSTable bits have an effect on the translation. 


For a result from a Non-secure translation regime, this bit is UNKNOWN. 
SH, bits [8:7] 


Shareability attribute, for the returned output address. Permitted values are: 


00 Non-shareable. 
10 Outer Shareable. 
11 Inner Shareable. 


The value 01 is reserved.


Note

This field returns the value 10 for:

• Any type of Device memory.

• Normal memory with both Inner Non-cacheable and Outer Non-cacheable attributes.





The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 





Bits [6:1] 
Reserved, RES0.
F, bit [0] 
Indicates whether the instruction performed a successful address translation. 
0 Address translation completed successfully. 
When PAR_EL1.F==1:


63     56 55     52 51     48 47  12 11     10     9   8     7      6   1 0
IMP DEF   IMP DEF   IMP DEF   RES0   RES1   RES0   S   PTW   RES0   FST   F


This section describes the register value returned by a fault on the execution of an Address translation instruction. 
Software might subsequently write a different value to the register, and that write does not affect the operation of 
the PE. 


IMP DEF, bits [63:56] 
IMPLEMENTATION DEFINED. 


IMP DEF, bits [55:52] 
IMPLEMENTATION DEFINED. 





IMP DEF, bits [51:48] 

IMPLEMENTATION DEFINED. 
Bits [47:12] 
Reserved, RES0.


Bit [11]
Reserved, RES1.
Bit [10]
Reserved, RES0.
S, bit [9]
Indicates the translation stage at which the translation aborted:
0 Translation aborted because of a fault in the stage 1 translation.
1 Translation aborted because of a fault in the stage 2 translation.
PTW, bit [8] 
If this bit is set to 1, it indicates the translation aborted because of a stage 2 fault during a stage 1 
translation table walk. 
Bit [7] 


Reserved, RES0.
FST, bits [6:1] 
Fault status code, as shown in the Data Abort ESR encoding. 


F, bit [0]
Indicates whether the instruction performed a successful address translation. 


1 Address translation aborted. 
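Because PAR_EL1 has two layouts selected by the F bit, software normally tests F first and only then interprets the remaining bits. The following C sketch decodes both layouts as described above. The struct and function names are hypothetical. Forming a full physical address by taking bits [11:0] from the input address is an assumption of the sketch: it relies on the low 12 bits passing through translation unchanged, which holds when the smallest translation granule is 4KB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 64-bit value. */
static inline uint64_t bits(uint64_t x, unsigned hi, unsigned lo)
{
    assert(lo <= hi && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~0ull : ((1ull << width) - 1);
    return (x >> lo) & mask;
}

struct par_result {
    bool     fault;   /* F, bit [0] */
    uint64_t pa;      /* F==0: output address plus page offset */
    uint8_t  attr;    /* F==0: ATTR, bits [63:56] */
    uint8_t  sh;      /* F==0: SH, bits [8:7] */
    bool     stage2;  /* F==1: S, bit [9] */
    bool     ptw;     /* F==1: PTW, bit [8] */
    uint8_t  fst;     /* F==1: FST, bits [6:1] */
};

/* Decode a PAR_EL1 value; 'va' is the input address that was translated. */
static inline struct par_result par_decode(uint64_t par, uint64_t va)
{
    struct par_result r = {0};

    r.fault = bits(par, 0, 0) != 0;
    if (!r.fault) {
        /* PA field holds address bits [47:12]; the page offset is
         * taken from the input address (assumption, see lead-in). */
        r.pa   = (bits(par, 47, 12) << 12) | (va & 0xFFFull);
        r.attr = (uint8_t)bits(par, 63, 56);
        r.sh   = (uint8_t)bits(par, 8, 7);
    } else {
        r.stage2 = bits(par, 9, 9) != 0;
        r.ptw    = bits(par, 8, 8) != 0;
        r.fst    = (uint8_t)bits(par, 6, 1);
    }
    return r;
}
```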


Accessing the PAR_EL1: 


To access the PAR_EL1:
MRS <Xt>, PAR_EL1 ; Read PAR_EL1 into Xt
MSR PAR_EL1, <Xt> ; Write Xt to PAR_EL1


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0111    0100    000
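The op0, op1, CRn, CRm, and op2 values in these tables are the fields that an assembler places into the MRS or MSR instruction word. As an illustration, the C sketch below packs them into a 32-bit A64 instruction. The bit positions used (L at bit [21], op0 at [20:19], op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0], over a base of 0xD5000000) come from the general A64 System instruction encoding rather than from this section, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build the A64 instruction word for MRS (is_read != 0) or
 * MSR (register) (is_read == 0) from the encoding-table fields. */
static inline uint32_t sysreg_insn(int is_read, unsigned op0, unsigned op1,
                                   unsigned crn, unsigned crm, unsigned op2,
                                   unsigned rt)
{
    assert(op0 == 2 || op0 == 3);   /* System register space: 0b10 or 0b11 */
    assert(op1 <= 7 && crn <= 15 && crm <= 15 && op2 <= 7 && rt <= 31);

    return 0xD5000000u
         | ((uint32_t)(is_read ? 1u : 0u) << 21)   /* L: 1 for MRS */
         | ((uint32_t)op0 << 19)
         | ((uint32_t)op1 << 16)
         | ((uint32_t)crn << 12)
         | ((uint32_t)crm << 8)
         | ((uint32_t)op2 << 5)
         | (uint32_t)rt;
}
```

With the PAR_EL1 values from the table above, sysreg_insn(1, 3, 0, 7, 4, 0, 0) gives 0xD5387400, the word for MRS X0, PAR_EL1.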
D7.2.72 REVIDR_EL1, Revision ID Register


The REVIDR_EL1 characteristics are: 


Purpose 


Provides implementation-specific minor revision information. 


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      RO         RO        RO         RO               RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TID1==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


Configurations 


AArch64 System register REVIDR_EL1 is architecturally mapped to AArch32 System register 
REVIDR. 


If REVIDR_EL1 has the same value as MIDR_EL1, then its contents have no significance. 


Attributes 
REVIDR_EL1 is a 32-bit register.


Field descriptions 


The REVIDR_EL1 bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the REVIDR_EL1: 
To access the REVIDR_EL1: 
MRS <Xt>, REVIDR_EL1 ; Read REVIDR_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    0000    0000    110
D7.2.73 RMR_EL1, Reset Management Register (if EL2 and EL3 not implemented)
The RMR_EL1 characteristics are:
Purpose
When this register is implemented:
• A write to the register can request a Warm reset.
• If EL1 can use AArch32 and AArch64, this register specifies the Execution state that the PE
boots into on a Warm reset.
Usage constraints
This register is accessible as follows:
EL0    EL1
-      RW
However, see Configurations for information about whether the register is implemented. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register RMR_EL1 is architecturally mapped to AArch32 System register RMR 
(at EL1). 
Only implemented if EL1 is the highest implemented Exception level. In this case: 
• If EL1 can use AArch32 and AArch64 then this register must be implemented.
• If EL1 cannot use AArch32 then it is IMPLEMENTATION DEFINED whether the register is
implemented.
When this register is not implemented its encoding is UNDEFINED. 
See the field descriptions for the reset values. These apply whenever the register is implemented. 
Attributes 
RMR_EL1 is a 32-bit register. 
Field descriptions 
The RMR_EL1 bit assignments are: 
31                    2 1    0
RES0                    RR   AA64
Bits [31:2]
Reserved, RES0.
RR, bit [1]
Reset Request. Setting this bit to 1 requests a Warm reset.
This field resets to 0 on a Warm or Cold reset.
AA64, bit [0]
When EL1 can use AArch32, determines which Execution state the PE boots into after a Warm
reset: 


0 AArch32. 
1 AArch64. 


On coming out of the Warm reset, execution starts at the IMPLEMENTATION DEFINED reset vector 
address of the specified Execution state. 


If EL1 cannot use AArch32 this bit is RAO/WI. 


When implemented as an RW field, this field resets to 1 on a Cold reset. It is not affected by a Warm 
reset. 
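Putting the RR and AA64 fields together, the value written to request a Warm reset has RR set to 1 and AA64 set according to the Execution state wanted after the reset. The following C sketch composes that value. The macro and function names are hypothetical, and the sketch only builds the value; performing the write requires MSR at the Exception level that owns the register.

```c
#include <assert.h>
#include <stdint.h>

#define RMR_AA64 (1u << 0)   /* AA64, bit [0]: 1 selects AArch64 */
#define RMR_RR   (1u << 1)   /* RR, bit [1]: 1 requests a Warm reset */

/* Value to write to RMR_ELx to request a Warm reset into AArch64
 * (to_aarch64 == 1) or AArch32 (to_aarch64 == 0). Bits [31:2] are
 * RES0 and are left as zero. */
static inline uint32_t rmr_warm_reset_value(int to_aarch64)
{
    assert(to_aarch64 == 0 || to_aarch64 == 1);
    return RMR_RR | (to_aarch64 ? RMR_AA64 : 0u);
}
```

The same layout is used by RMR_EL2 and RMR_EL3, so the helper applies to whichever of the three registers is implemented.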


Accessing the RMR_EL1: 


To access the RMR_EL1:


MRS <Xt>, RMR_EL1 ; Read RMR_EL1 into Xt 
MSR RMR_EL1, <Xt> ; Write Xt to RMR_EL1 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    1100    0000    010
D7.2.74 RMR_EL2, Reset Management Register (if EL2 implemented and EL3 not implemented)
The RMR_EL2 characteristics are:
Purpose
When this register is implemented:
• A write to the register can request a Warm reset.
• If EL2 can use AArch32 and AArch64, this register specifies the Execution state that the PE
boots into on a Warm reset.
Usage constraints
This register is accessible as follows:
EL0    EL1    EL2(NS)
-      -      RW
However, see Configurations for information about whether the register is implemented. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register RMR_EL2 is architecturally mapped to AArch32 System register HRMR. 
Only implemented if EL2 is the highest implemented Exception level. In this case: 
• If EL2 can use AArch32 and AArch64 then this register must be implemented.
• If EL2 cannot use AArch32 then it is IMPLEMENTATION DEFINED whether the register is
implemented.
When this register is not implemented its encoding is UNDEFINED. 
See the field descriptions for the reset values. These apply whenever the register is implemented. 
Attributes 
RMR_EL2 is a 32-bit register.
Field descriptions
The RMR_EL2 bit assignments are:
31                    2 1    0
RES0                    RR   AA64
Bits [31:2]
Reserved, RES0.
RR, bit [1]
Reset Request. Setting this bit to 1 requests a Warm reset.
This field resets to 0 on a Warm or Cold reset.
AA64, bit [0]
When EL2 can use AArch32, determines which Execution state the PE boots into after a Warm
reset: 


0 AArch32. 
1 AArch64. 


On coming out of the Warm reset, execution starts at the IMPLEMENTATION DEFINED reset vector 
address of the specified Execution state. 


If EL2 cannot use AArch32 this bit is RAO/WI. 


When implemented as an RW field, this field resets to 1 on a Cold reset. It is not affected by a Warm 
reset. 


Accessing the RMR_EL2: 


To access the RMR_EL2: 


MRS <Xt>, RMR_EL2 ; Read RMR_EL2 into Xt 
MSR RMR_EL2, <Xt> ; Write Xt to RMR_EL2 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     100    1100    0000    010
D7.2.75 RMR_EL3, Reset Management Register (if EL3 implemented)
The RMR_EL3 characteristics are:
Purpose
When this register is implemented:
• A write to the register can request a Warm reset.
• If EL3 can use AArch32 and AArch64, the register specifies the Execution state that the PE
boots into on a Warm reset.
Usage constraints
This register is accessible as follows:
EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      -          -         -          RW               RW
However, see Configurations for information about whether the register is implemented. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register RMR_EL3 is architecturally mapped to AArch32 System register RMR 
(at EL3). 
When EL3 is implemented: 
• If EL3 can use AArch32 and AArch64 then this register must be implemented.
• If EL3 cannot use AArch32 then it is IMPLEMENTATION DEFINED whether the register is
implemented.
When this register is not implemented its encoding is UNDEFINED. 
See the field descriptions for the reset values. These apply whenever the register is implemented. 
Attributes 
RMR_EL3 is a 32-bit register. 
Field descriptions 
The RMR_EL3 bit assignments are: 
31                    2 1    0
RES0                    RR   AA64
Bits [31:2]
Reserved, RES0.
RR, bit [1]
Reset Request. Setting this bit to 1 requests a Warm reset.
This field resets to 0 on a Warm or Cold reset.
AA64, bit [0]
When EL3 can use AArch32, determines which Execution state the PE boots into after a Warm
reset: 


0 AArch32. 
1 AArch64. 


On coming out of the Warm reset, execution starts at the IMPLEMENTATION DEFINED reset vector 
address of the specified Execution state. 


If EL3 cannot use AArch32 this bit is RAO/WI. 


When implemented as an RW field, this field resets to 1 on a Cold reset. It is not affected by a Warm 
reset. 


Accessing the RMR_EL3: 


To access the RMR_EL3: 


MRS <Xt>, RMR_EL3 ; Read RMR_EL3 into Xt 
MSR RMR_EL3, <Xt> ; Write Xt to RMR_EL3 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     110    1100    0000    010
D7.2.76 RVBAR_EL1, Reset Vector Base Address Register (if EL2 and EL3 not implemented)
The RVBAR_EL1 characteristics are:


Purpose 


If EL1 is the highest Exception level implemented, contains the IMPLEMENTATION DEFINED address
that execution starts from after reset when executing in AArch64 state. 


Usage constraints 


This register is accessible as follows: 


EL0    EL1
-      RO


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


Only implemented if the highest Exception level implemented is EL1. 


Attributes 
RVBAR_EL1 is a 64-bit register. 


Field descriptions 


The RVBAR_EL1 bit assignments are:


63 0 


Reset Address 


Bits [63:0] 


Reset Address. The IMPLEMENTATION DEFINED address that execution starts from after reset when 
executing in 64-bit state. Bits[1:0] of this register are 00, as this address must be aligned, and the 
address must be within the physical address size supported by the PE. 
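The two constraints on the reset address, that bits [1:0] are 00 and that the address lies within the supported physical address size, can be expressed as a simple predicate. The C sketch below is illustrative only; the function name is hypothetical, and the physical address size in bits is passed in as a parameter because this section does not say how software discovers it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if 'addr' is a plausible RVBAR_ELx value for a PE that
 * supports 'pa_bits' bits of physical address. */
static inline bool reset_address_valid(uint64_t addr, unsigned pa_bits)
{
    assert(pa_bits >= 1 && pa_bits <= 64);

    if ((addr & 0x3u) != 0)
        return false;                       /* bits [1:0] must be 00 */
    if (pa_bits < 64 && (addr >> pa_bits) != 0)
        return false;                       /* beyond the PA size */
    return true;
}
```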


Accessing the RVBAR_EL1: 
To access the RVBAR_EL1:


MRS <Xt>, RVBAR_EL1 ; Read RVBAR_EL1 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     000    1100    0000    001
D7.2.77 RVBAR_EL2, Reset Vector Base Address Register (if EL3 not implemented)


The RVBAR_EL2 characteristics are: 


Purpose 
If EL2 is the highest Exception level implemented, contains the IMPLEMENTATION DEFINED address 
that execution starts from after reset when executing in AArch64 state. 

Usage constraints 


This register is accessible as follows: 





EL0    EL1    EL2(NS)
-      -      RO





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


Only implemented if the highest Exception level implemented is EL2. 


Attributes 
RVBAR_EL2 is a 64-bit register. 


Field descriptions 


The RVBAR_EL2 bit assignments are: 


63 0


Reset Address


Bits [63:0] 


Reset Address. The IMPLEMENTATION DEFINED address that execution starts from after reset when 
executing in 64-bit state. Bits[1:0] of this register are 00, as this address must be aligned, and the 
address must be within the physical address size supported by the PE. 


Accessing the RVBAR_EL2: 
To access the RVBAR_EL2: 


MRS <Xt>, RVBAR_EL2 ; Read RVBAR_EL2 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     100    1100    0000    001
D7.2.78 RVBAR_EL3, Reset Vector Base Address Register (if EL3 implemented)
The RVBAR_EL3 characteristics are: 


Purpose 
If EL3 is the highest Exception level implemented, contains the IMPLEMENTATION DEFINED address 
that execution starts from after reset when executing in AArch64 state. 

Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      -          -         -          RO               RO





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


Only implemented if the highest Exception level implemented is EL3. 


Attributes 
RVBAR_EL3 is a 64-bit register.


Field descriptions 


The RVBAR_EL3 bit assignments are: 


63 0 


Reset Address 


Bits [63:0] 


Reset Address. The IMPLEMENTATION DEFINED address that execution starts from after reset when 
executing in 64-bit state. Bits[1:0] of this register are 00, as this address must be aligned, and the 
address must be within the physical address size supported by the PE. 


Accessing the RVBAR_EL3: 
To access the RVBAR_EL3: 


MRS <Xt>, RVBAR_EL3 ; Read RVBAR_EL3 into Xt 


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     110    1100    0000    001
D7.2.79 S3_<op1>_<Cn>_<Cm>_<op2>, IMPLEMENTATION DEFINED registers


The S3_<op1>_<Cn>_<Cm>_<op2> characteristics are:





Purpose 


This area of the instruction set space is reserved for IMPLEMENTATION DEFINED registers. 


Usage constraints 


This register is accessible as follows: 





EL0        EL1(NS)    EL1(S)     EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
IMP DEF    IMP DEF    IMP DEF    IMP DEF    IMP DEF          IMP DEF





The numbers in these register names are encoded in decimal without leading zeroes, and the Cn and 
Cm fields require a literal C before the number. For example, S3_4_C11_C9_7.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TIDCP==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 


S3_<op1>_<Cn>_<Cm>_<op2> is a 64-bit register.





Field descriptions 


The S3_<op1>_<Cn>_<Cm>_<op2> bit assignments are:





63 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [63:0] 


IMPLEMENTATION DEFINED. 


Accessing the S3_<op1>_<Cn>_<Cm>_<op2>: 


To access the S3_<op1>_<Cn>_<Cm>_<op2>:


MRS <Xt>, S3_<op1>_<Cn>_<Cm>_<op2> ; Read S3_<op1>_<Cn>_<Cm>_<op2> into Xt
MSR S3_<op1>_<Cn>_<Cm>_<op2>, <Xt> ; Write Xt to S3_<op1>_<Cn>_<Cm>_<op2>


Register access is encoded as follows: 





op0    op1    CRn     CRm     op2
11     xxx    1x11    xxxx    xxx





The value of <Cn> must be either 11 or 15. Other values may refer to architecturally-defined registers. 
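The naming rule for these registers (decimal numbers without leading zeroes, a literal C before the Cn and Cm numbers, and Cn restricted to 11 or 15) can be captured in a small formatting routine. The following C sketch is one way a tool might generate such names; the function name is hypothetical and the routine is not part of any existing assembler.

```c
#include <assert.h>
#include <stdio.h>

/* Write the generic name S3_<op1>_<Cn>_<Cm>_<op2> into 'buf'.
 * Returns the name length as reported by snprintf, or -1 if Cn is
 * not 11 or 15, because other values may refer to architecturally
 * defined registers. */
static inline int impdef_sysreg_name(char *buf, size_t len,
                                     unsigned op1, unsigned cn,
                                     unsigned cm, unsigned op2)
{
    assert(op1 <= 7 && cm <= 15 && op2 <= 7);

    if (cn != 11 && cn != 15)
        return -1;
    /* %u prints decimal without leading zeroes, as the rule requires. */
    return snprintf(buf, len, "S3_%u_C%u_C%u_%u", op1, cn, cm, op2);
}
```

Calling the routine with op1 = 4, Cn = 11, Cm = 9, and op2 = 7 produces the example name given above, S3_4_C11_C9_7.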
D7.2.80 SCR_EL3, Secure Configuration Register
The SCR_EL3 characteristics are: 


Purpose 
Defines the configuration of the current Security state. It specifies: 
• The Security state of EL0 and EL1, either Secure or Non-secure.
• The Execution state at lower Exception levels.
• Whether IRQ, FIQ, and External Abort interrupts are taken to EL3.


Usage constraints 


This register is accessible as follows: 





EL0    EL1(NS)    EL1(S)    EL2(NS)    EL3(SCR.NS=1)    EL3(SCR.NS=0)
-      -          -         -          RW               RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register SCR_EL3 can be mapped to AArch32 System register SCR, but this is 
not architecturally mandated. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
SCR_EL3 is a 32-bit register. 


Field descriptions 


The SCR_EL3 bit assignments are: 


Bits      Field
[31:14]   RES0
[13]      TWE
[12]      TWI
[11]      ST
[10]      RW
[9]       SIF
[8]       HCE
[7]       SMD
[6]       RES0
[5:4]     RES1
[3]       EA
[2]       FIQ
[1]       IRQ
[0]       NS

Bits [31:14]
Reserved, RES0.
TWE, bit [13]
Traps EL2, EL1, and EL0 execution of WFE instructions to EL3, from both Security states and both
Execution states.
0 EL2, EL1, and EL0 execution of WFE instructions is not trapped to EL3.


1 Any attempt to execute a WFE instruction at any Exception level lower than EL3 is 
trapped to EL3, if the instruction would otherwise have caused the PE to enter a 
low-power state. 


In AArch32 state, the attempted execution of a conditional WFE instruction is only trapped if the 
instruction passes its condition code check. 


Note

Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.











TWI, bit [12]
Traps EL2, EL1, and EL0 execution of WFI instructions to EL3, from both Security states and both
Execution states.
0 EL2, EL1, and EL0 execution of WFI instructions is not trapped to EL3.
1 Any attempt to execute a WFI instruction at any Exception level lower than EL3 is
trapped to EL3, if the instruction would otherwise have caused the PE to enter a
low-power state.
In AArch32 state, the attempted execution of a conditional WFI instruction is only trapped if the
instruction passes its condition code check.
Note
Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.
ST, bit [11]
Traps Secure EL1 accesses to the Counter-timer Physical Secure timer registers to EL3, from
AArch64 state only.
0 Secure EL1 using AArch64 accesses to the CNTPS_TVAL_EL1, CNTPS_CTL_EL1,
and CNTPS_CVAL_EL1 are trapped to EL3.
1 Secure EL1 using AArch64 accesses to the CNTPS_TVAL_EL1, CNTPS_CTL_EL1,
and CNTPS_CVAL_EL1 are not trapped to EL3.
RW, bit [10]
Execution state control for lower Exception levels.
0 Lower levels are all AArch32.
1 The next lower level is AArch64.
If EL2 is present:
• EL2 is AArch64.
• EL2 controls EL1 and EL0 behaviors.
If EL2 is not present:
• EL1 is AArch64.
• EL0 is determined by the Execution state described in the current process state
when executing at EL0.
If no lower Exception level can use AArch32, this bit is RAO/WI.
This bit is permitted to be cached in a TLB.











SIF, bit [9]
Secure instruction fetch. When the PE is in Secure state, this bit disables instruction fetch from
Non-secure memory. The possible values for this bit are:
0 Secure state instruction fetches from Non-secure memory are permitted.
1 Secure state instruction fetches from Non-secure memory are not permitted.
This bit is permitted to be cached in a TLB.
HCE, bit [8]
Hypervisor Call instruction enable. Enables HVC instructions at EL3, EL2, and Non-secure EL1,
in both Execution states.
0 HVC instructions are UNDEFINED at EL3, EL2, and Non-secure EL1, and any resulting
exception is taken from the current Exception level to the current Exception level.
1 HVC instructions are enabled at EL1 and above.
Note
HVC instructions are always UNDEFINED at EL0.
If EL2 is not implemented, this bit is RES0.
SMD, bit [7]
Secure Monitor Call disable. Disables SMC instructions at EL1 and above, from both Security states
and both Execution states.
0 SMC instructions are enabled at EL1 and above.
1 SMC instructions are UNDEFINED at EL1 and above.
Note
SMC instructions are always UNDEFINED at EL0.
Bit [6]
Reserved, RES0.
Bits [5:4]
Reserved, RES1.
EA, bit [3]
External Abort and SError Interrupt Routing.
0 When executing at Exception levels below EL3, External Aborts and SError Interrupts
are not taken to EL3.
In addition, when executing at EL3:
• SError Interrupts are not taken.
• External Aborts are taken to EL3.
1 When executing at any Exception level, External Aborts and SError Interrupts are taken
to EL3.
For more information, see Asynchronous exception routing on page D1-1556.
FIQ, bit [2]
Physical FIQ Routing.
0 When executing at Exception levels below EL3, physical FIQ interrupts are not taken
to EL3.
When executing at EL3, physical FIQ interrupts are not taken.
1 When executing at any Exception level, physical FIQ interrupts are taken to EL3.


For more information, see Asynchronous exception routing on page D1-1556. 


IRQ, bit [1] 
Physical IRQ Routing. 
0 When executing at Exception levels below EL3, physical IRQ interrupts are not taken 
to EL3. 
When executing at EL3, physical IRQ interrupts are not taken. 
1 When executing at any Exception level, physical IRQ interrupts are taken to EL3. 
For more information, see Asynchronous exception routing on page D1-1556. 
NS, bit [0] 
Non-secure bit. 
0 Indicates that EL0 and EL1 are in Secure state, and so memory accesses from those
Exception levels can access Secure memory.
When executing at EL3:

• The AT S1E2R, AT S1E2W, TLBI VAE2, TLBI VALE2, TLBI VAE2IS, TLBI
VALE2IS, TLBI ALLE2, and TLBI ALLE2IS System instructions are
UNDEFINED.

• Each AT S12E* System instruction executes as the corresponding AT S1E*
instruction. For example, AT S12E0R executes as AT S1E0R.

• Each of the TLBI IPAS2E1, TLBI IPAS2E1IS, TLBI IPAS2LE1, and TLBI
IPAS2LE1IS System instructions executes as a NOP.

• A TLBI VMALLS12E1 System instruction executes as TLBI VMALLE1, and a
TLBI VMALLS12E1IS System instruction executes as TLBI VMALLE1IS.

1 Indicates that EL0 and EL1 are in Non-secure state, and so memory accesses from those
Exception levels cannot access Secure memory.

Note

EL2 is not supported in the Secure state. When SCR_EL3.NS==0, it is not possible to enter EL2,
and the EL2 state has no effect on execution. See Virtualization on page D1-1504.





Accessing the SCR_EL3: 
To access the SCR_EL3: 


MRS <Xt>, SCR_EL3 ; Read SCR_EL3 into Xt 
MSR SCR_EL3, <Xt> ; Write Xt to SCR_EL3 
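
The field positions described above can be collected into C masks before a value is written with MSR. The following is a minimal sketch under stated assumptions: the macro and function names are hypothetical, and the example policy (Non-secure lower Exception levels, next lower level in AArch64, HVC enabled) is chosen only for illustration.

```c
#include <stdint.h>

/* Bit positions taken from the SCR_EL3 field descriptions. */
#define SCR_EL3_NS    (UINT32_C(1) << 0)
#define SCR_EL3_IRQ   (UINT32_C(1) << 1)
#define SCR_EL3_FIQ   (UINT32_C(1) << 2)
#define SCR_EL3_EA    (UINT32_C(1) << 3)
#define SCR_EL3_RES1  (UINT32_C(3) << 4)     /* bits [5:4] */
#define SCR_EL3_SMD   (UINT32_C(1) << 7)
#define SCR_EL3_HCE   (UINT32_C(1) << 8)
#define SCR_EL3_SIF   (UINT32_C(1) << 9)
#define SCR_EL3_RW    (UINT32_C(1) << 10)
#define SCR_EL3_ST    (UINT32_C(1) << 11)
#define SCR_EL3_TWI   (UINT32_C(1) << 12)
#define SCR_EL3_TWE   (UINT32_C(1) << 13)
#define SCR_EL3_RES0  (~UINT32_C(0x3FBF))    /* bits [31:14] and bit [6] */

/* Hypothetical helper: clears RES0 bits and sets RES1 bits. */
static uint32_t scr_el3_sanitize(uint32_t value)
{
    return (value & ~SCR_EL3_RES0) | SCR_EL3_RES1;
}

/* Example policy (an assumption, not an architectural requirement):
 * EL0 and EL1 Non-secure, next lower level AArch64, HVC enabled. */
static uint32_t scr_el3_example_value(void)
{
    return scr_el3_sanitize(SCR_EL3_NS | SCR_EL3_RW | SCR_EL3_HCE);
}
```

Keeping the reserved-bit handling in one place means every value written to the register has bits [5:4] set and the RES0 bits clear, whatever policy bits the caller selects.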


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 110 0001 0001 000 
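
The encoding table can be turned into instruction words. The sketch below assumes the A64 MRS and MSR (register) instruction layout, which is defined in the instruction set chapters rather than in this section: op0 in bits [20:19], op1 in [18:16], CRn in [15:12], CRm in [11:8], op2 in [7:5], and Rt in [4:0]. The function names are hypothetical.

```c
#include <stdint.h>

/* Packs the System register fields into bits [20:5] of the instruction. */
static uint32_t sysreg_fields(unsigned op0, unsigned op1, unsigned crn,
                              unsigned crm, unsigned op2)
{
    return ((uint32_t)(op0 & 3u) << 19) | ((uint32_t)(op1 & 7u) << 16) |
           ((uint32_t)(crn & 15u) << 12) | ((uint32_t)(crm & 15u) << 8) |
           ((uint32_t)(op2 & 7u) << 5);
}

/* MRS <Xt>, <sysreg>: read direction, bit [21] set. */
static uint32_t encode_mrs(unsigned rt, unsigned op0, unsigned op1,
                           unsigned crn, unsigned crm, unsigned op2)
{
    return UINT32_C(0xD5200000) | sysreg_fields(op0, op1, crn, crm, op2) |
           (rt & 31u);
}

/* MSR <sysreg>, <Xt>: write direction, bit [21] clear. */
static uint32_t encode_msr(unsigned rt, unsigned op0, unsigned op1,
                           unsigned crn, unsigned crm, unsigned op2)
{
    return UINT32_C(0xD5000000) | sysreg_fields(op0, op1, crn, crm, op2) |
           (rt & 31u);
}
```

For SCR_EL3 the table gives op0=0b11, op1=0b110, CRn=0b0001, CRm=0b0001, op2=0b000, so `encode_mrs(0, 3, 6, 1, 1, 0)` yields the word for MRS X0, SCR_EL3.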
D7.2.81 SCTLR_EL1, System Control Register (EL1)
The SCTLR_EL1 characteristics are: 


Purpose 


Provides top-level control of the system, including its memory system, at EL1 and EL0.


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.


Configurations 


AArch64 System register SCTLR_EL1 is architecturally mapped to AArch32 System register
SCTLR.


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL1 using AArch64. Otherwise, RW fields in this register reset to architecturally UNKNOWN 
values. 


Attributes 
SCTLR_EL1 is a 32-bit register. 


Field descriptions 


The SCTLR_EL1 bit assignments are: 


Bits      Field
[31:30]   RES0
[29:28]   RES1
[27]      RES0
[26]      UCI
[25]      EE
[24]      E0E
[23:22]   RES1
[21]      RES0
[20]      RES1
[19]      WXN
[18]      nTWE
[17]      RES0
[16]      nTWI
[15]      UCT
[14]      DZE
[13]      RES0
[12]      I
[11]      RES1
[10]      RES0
[9]       UMA
[8]       SED
[7]       ITD
[6]       RES0
[5]       CP15BEN
[4]       SA0
[3]       SA
[2]       C
[1]       A
[0]       M

Bits [31:30]
Reserved, RES0.
Bits [29:28]
Reserved, RES1.
Bit [27]
Reserved, RES0.
UCI, bit [26]
Traps EL0 execution of cache maintenance instructions to EL1, from AArch64 state only.
0 Any attempt to execute a DC CVAU, DC CIVAC, DC CVAC, or IC IVAU instruction
at EL0 using AArch64 is trapped to EL1.
1 Does not cause any instruction to be trapped.
This field resets to a value that is architecturally UNKNOWN.
EE, bit [25] 


Endianness of data accesses at EL1, and stage 1 translation table walks in the EL1&0 translation 
regime. 


The possible values of this bit are: 


0 Explicit data accesses at EL1, and stage 1 translation table walks in the EL1&0
translation regime are little-endian.

1 Explicit data accesses at EL1, and stage 1 translation table walks in the EL1&0
translation regime are big-endian.

If an implementation does not provide Big-endian support at Exception Levels higher than EL0, this
bit is RES0.

If an implementation does not provide Little-endian support at Exception Levels higher than EL0,
this bit is RES1.

The EE bit is permitted to be cached in a TLB.

When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to an IMPLEMENTATION DEFINED value.

E0E, bit [24]
Endianness of data accesses at EL0.
The possible values of this bit are:
0 Explicit data accesses at EL0 are little-endian.
1 Explicit data accesses at EL0 are big-endian.

If an implementation only supports Little-endian accesses at EL0 then this bit is RES0. This option
is not permitted when SCTLR_EL1.EE is RES1.

If an implementation only supports Big-endian accesses at EL0 then this bit is RES1. This option is
not permitted when SCTLR_EL1.EE is RES0.

This bit has no effect on the endianness of LDTR, LDTRH, LDTRSH, LDTRSW, STTR and
STTRH instructions executed at EL1.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.

Bits [23:22]
Reserved, RES1.

Bit [21]
Reserved, RES0.
Bit [20]
Reserved, RES1.

WXN, bit [19]
Write permission implies XN (Execute-never). For the EL1&0 translation regime, this bit can force
all memory regions that are writable to be treated as XN. The possible values of this bit are:
0 This control has no effect on memory access permissions.
1 Any region that is writable in the EL1&0 translation regime is forced to XN for accesses
from software executing at EL1 or EL0.
The WXN bit is permitted to be cached in a TLB.
This field resets to a value that is architecturally UNKNOWN.

nTWE, bit [18]
Traps EL0 execution of WFE instructions to EL1, from both Execution states.
0 Any attempt to execute a WFE instruction at EL0 is trapped to EL1, if the instruction
would otherwise have caused the PE to enter a low-power state.
1 EL0 execution of WFE instructions is not trapped to EL1.
In AArch32 state, the attempted execution of a conditional WFE instruction is only trapped if the
instruction passes its condition code check.
Note
Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.
This field resets to a value that is architecturally UNKNOWN.

Bit [17]
Reserved, RES0.

nTWI, bit [16]
Traps EL0 execution of WFI instructions to EL1, from both Execution states.
0 Any attempt to execute a WFI instruction at EL0 is trapped to EL1, if the instruction would
otherwise have caused the PE to enter a low-power state.
1 EL0 execution of WFI instructions is not trapped to EL1.
In AArch32 state, the attempted execution of a conditional WFI instruction is only trapped if the
instruction passes its condition code check.
Note
Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.
This field resets to a value that is architecturally UNKNOWN.

UCT, bit [15]
Traps EL0 accesses to the CTR_EL0 to EL1, from AArch64 state only.
0 Accesses to the CTR_EL0 from EL0 using AArch64 are trapped to EL1.
1 Accesses to the CTR_EL0 from EL0 using AArch64 are not trapped to EL1.
This field resets to a value that is architecturally UNKNOWN.
DZE, bit [14]
Traps EL0 execution of DC ZVA instructions to EL1, from AArch64 state only.

0 Any attempt to execute a DC ZVA instruction at EL0 using AArch64 is trapped to EL1.
Reading DCZID_EL0.DZP from EL0 returns 1, indicating that DC ZVA instructions
are not supported.

1 EL0 execution of DC ZVA instructions is not trapped to EL1.


This field resets to a value that is architecturally UNKNOWN. 





Bit [13]
Reserved, RES0.
I, bit [12]
Instruction access Cacheability control, for accesses at EL0 and EL1:
0 All instruction accesses to Normal memory from EL0 and EL1 are Non-cacheable for all
levels of instruction and unified cache.
If the value of SCTLR_EL1.M is 0, instruction accesses from stage 1 of the EL1&0
translation regime are to Normal, Outer Shareable, Inner Non-cacheable, Outer
Non-cacheable memory.
1 This control has no effect on the Cacheability of instruction accesses to Normal memory
from EL0 and EL1.
If the value of SCTLR_EL1.M is 0, instruction accesses from stage 1 of the EL1&0
translation regime are to Normal, Outer Shareable, Inner Write-Through, Outer
Write-Through memory.
When the value of the HCR_EL2.DC bit is 1, instruction accesses to Normal memory from EL0
and EL1 are Cacheable regardless of the value of the SCTLR_EL1.I bit.
When this register has an architecturally-defined reset value, this field resets to 0.
Bit [11]
Reserved, RES1.
Bit [10]
Reserved, RES0.
UMA, bit [9]
User Mask Access. Traps EL0 execution of MSR and MRS instructions that access the PSTATE.{D,
A, I, F} masks to EL1, from AArch64 state only.
0 Any attempt at EL0 using AArch64 to execute an MRS, MSR (register), or MSR (immediate)
instruction that accesses the DAIF is trapped to EL1.
1 EL0 execution of MRS, MSR (register), or MSR (immediate) instructions that access the
DAIF is not trapped to EL1.
This field resets to a value that is architecturally UNKNOWN.
SED, bit [8]
SETEND instruction disable. Disables SETEND instructions at EL0 using AArch32.
0 SETEND instruction execution is enabled at EL0 using AArch32.
1 SETEND instructions are UNDEFINED at EL0 using AArch32.
If the implementation does not support mixed-endian operation at any Exception level, this bit is
RES1.
If EL0 cannot use AArch32, this bit is RES1.
If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 
ITD, bit [7]
IT Disable. Disables some uses of IT instructions at EL0 using AArch32.
0 All IT instruction functionality is enabled at EL0 using AArch32.
1 Any attempt at EL0 using AArch32 to execute any of the following is UNDEFINED:
• All encodings of the IT instruction with hw1[3:0]!=1000.
• All encodings of the subsequent instruction with the following values for hw1:
11xxxxxxxxxxxxxx
All 32-bit instructions, and the 16-bit instructions B, UDF, SVC,
LDM, and STM.
1011xxxxxxxxxxxx
All instructions in Miscellaneous 16-bit instructions on
page F3-2442.
10100xxxxxxxxxxx
ADD Rd, PC, #imm
01001xxxxxxxxxxx
LDR Rd, [PC, #imm]
0100x1xxx1111xxx
ADD Rdn, PC; CMP Rn, PC; MOV Rd, PC; BX PC; BLX PC.
010001xx1xxxx111
ADD PC, Rm; CMP PC, Rm; MOV PC, Rm. This pattern also covers
UNPREDICTABLE cases with BLX Rn.
These instructions are always UNDEFINED, regardless of whether they would pass or fail
the condition code check that applies to them as a result of being in an IT block.
It is IMPLEMENTATION DEFINED whether the IT instruction is treated as:
• A 16-bit instruction, that can only be followed by another 16-bit instruction.
• The first half of a 32-bit instruction.
This means that, for the situations that are UNDEFINED, either the second 16-bit
instruction or the 32-bit instruction is UNDEFINED.
An implementation might vary dynamically as to whether IT is treated as a 16-bit
instruction or the first half of a 32-bit instruction.
If an instruction in an active IT block that would be disabled by this field sets this field to 1 then
behavior is CONSTRAINED UNPREDICTABLE. For more information see Changes to an ITD control by
an instruction in an IT block on page E1-2298.
If EL0 cannot use AArch32, this bit is RES1.
ITD is optional, but if it is implemented in the SCTLR then it must also be implemented in the
SCTLR_EL1. If it is not implemented then this bit is RAZ/WI.
If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 
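
The hw1 patterns listed for ITD==1 can be checked with mask and value pairs. The sketch below is illustrative only: the type and function names are hypothetical, and the test for "is an IT instruction" assumes the T32 IT encoding (0xBFxx with a nonzero mask field), which is defined in the instruction descriptions rather than here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* hw1 matches a pattern when (hw1 & mask) == value. */
struct hw1_pattern { uint16_t mask; uint16_t value; };

static const struct hw1_pattern itd_patterns[] = {
    { 0xC000, 0xC000 },   /* 11xxxxxxxxxxxxxx */
    { 0xF000, 0xB000 },   /* 1011xxxxxxxxxxxx */
    { 0xF800, 0xA000 },   /* 10100xxxxxxxxxxx */
    { 0xF800, 0x4800 },   /* 01001xxxxxxxxxxx */
    { 0xF478, 0x4478 },   /* 0100x1xxx1111xxx */
    { 0xFC87, 0x4487 },   /* 010001xx1xxxx111 */
};

/* True if hw1 is an IT instruction that ITD==1 makes UNDEFINED,
 * that is, an IT encoding with hw1[3:0] != 1000. */
static bool itd_it_is_undefined(uint16_t hw1)
{
    bool is_it = (hw1 & 0xFF00u) == 0xBF00u && (hw1 & 0x000Fu) != 0u;
    return is_it && (hw1 & 0x000Fu) != 0x8u;
}

/* True if the instruction following an IT instruction has an hw1 value
 * that ITD==1 makes UNDEFINED. */
static bool itd_following_is_undefined(uint16_t hw1)
{
    for (size_t i = 0; i < sizeof itd_patterns / sizeof itd_patterns[0]; i++) {
        if ((hw1 & itd_patterns[i].mask) == itd_patterns[i].value)
            return true;
    }
    return false;
}
```

A table of mask and value pairs keeps each entry visually comparable with the bit pattern it encodes, so a transcription error in one pattern is easy to spot.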
Bit [6]
Reserved, RES0.
CP15BEN, bit [5]
System instruction memory barrier enable. Enables accesses to the DMB, DSB, and ISB System
instructions in the (coproc==1111) encoding space from EL0:
0 EL0 using AArch32: EL0 execution of the CP15DMB, CP15DSB, and CP15ISB
instructions is UNDEFINED.
1 EL0 using AArch32: EL0 execution of the CP15DMB, CP15DSB, and CP15ISB
instructions is enabled.
If EL0 cannot use AArch32, this bit is RES0.
CP15BEN is optional, but if it is implemented in the SCTLR then it must also be implemented in
the SCTLR_EL1. If it is not implemented then this bit is RAO/WI.


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


SA0, bit [4]


SP Alignment check enable for EL0. When set to 1, if a load or store instruction executed at EL0
uses the SP as the base address and the SP is not aligned to a 16-byte boundary, then an SP alignment
fault exception is generated. For more information, see SP alignment checking on page D1-1515.


This field resets to a value that is architecturally UNKNOWN. 


SA, bit [3] 


SP Alignment check enable. When set to 1, if a load or store instruction executed at EL1 uses the 
SP as the base address and the SP is not aligned to a 16-byte boundary, then a SP alignment fault 
exception is generated. For more information, see SP alignment checking on page D1-1515. 


This field resets to a value that is architecturally UNKNOWN. 
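
The SA and SA0 checks reduce to a 16-byte alignment test on the stack pointer. A minimal sketch, assuming a hypothetical helper in which `check_enabled` stands for SCTLR_EL1.SA at EL1 or SCTLR_EL1.SA0 at EL0:

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a load or store would generate an SP alignment fault.
 * base_is_sp: the instruction uses the SP as its base address. */
static bool sp_alignment_fault(uint64_t sp, bool check_enabled, bool base_is_sp)
{
    return check_enabled && base_is_sp && (sp & 0xFu) != 0u;
}
```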


C, bit [2] 
Cacheability control, for data accesses. 


0 All data accesses to Normal memory from EL0 and EL1, and all Normal memory accesses
to the EL1&0 stage 1 translation tables, are Non-cacheable for all levels of data and
unified cache.


1 This control has no effect on the Cacheability of:
• Data accesses to Normal memory from EL0 and EL1.
• Normal memory accesses to the EL1&0 stage 1 translation tables.


When the value of the HCR_EL2.DC bit is 1, the PE ignores SCTLR_EL1.C. This means that Non-secure
EL0 and Non-secure EL1 data accesses to Normal memory are Cacheable.


When this register has an architecturally-defined reset value, this field resets to 0. 


A, bit [1] 
Alignment check enable. This is the enable bit for Alignment fault checking at EL1 and EL0:
0 Alignment fault checking disabled when executing at EL1 or EL0.


Instructions that load or store one or more registers, other than load/store exclusive and
load-acquire/store-release, do not check that the address being accessed is aligned to the
size of the data element(s) being accessed.


1 Alignment fault checking enabled when executing at EL1 or EL0.


All instructions that load or store one or more registers have an alignment check that the 
address being accessed is aligned to the size of the data element(s) being accessed. If 
this check fails it causes an Alignment fault, which is taken as a Data Abort exception. 


Load/store exclusive and load-acquire/store-release instructions have an alignment check regardless 
of the value of the A bit. 


This field resets to a value that is architecturally UNKNOWN. 
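
The effect of the A bit can be summarized as a predicate. This is a sketch under stated assumptions: the function name is hypothetical, `element_size` is taken to be a nonzero power of two, and other sources of Alignment faults (for example, accesses to Device memory) are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the access fails the alignment check described for the
 * A bit. exclusive_or_acqrel marks load/store exclusive and
 * load-acquire/store-release instructions, which are always checked. */
static bool alignment_check_fails(uint64_t address, uint64_t element_size,
                                  bool a_bit, bool exclusive_or_acqrel)
{
    bool checked = a_bit || exclusive_or_acqrel;
    bool misaligned = (address & (element_size - 1u)) != 0u;
    return checked && misaligned;
}
```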


M, bit [0] 
MMU enable for EL1 and EL0 stage 1 address translation. Possible values of this bit are:
0 EL1 and EL0 stage 1 address translation disabled.
See the SCTLR_EL1.I field for the behavior of instruction accesses to Normal memory.
1 EL1 and EL0 stage 1 address translation enabled.


If the value of HCR_EL2.{DC, TGE} is not {0, 0} then in Non-secure state the PE behaves as if the 
value of the SCTLR_EL1.M field is 0 for all purposes other than returning the value of a direct read 
of the field. 
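
The interaction between SCTLR_EL1.M and HCR_EL2.{DC, TGE} can be written as a small predicate. The function and parameter names below are hypothetical; the sketch models only the rule stated above, and a direct read of the field still returns the stored value.

```c
#include <stdbool.h>

/* Returns the value of M that the PE behaves as having for the EL1&0
 * stage 1 translation. In Non-secure state, HCR_EL2.{DC, TGE} != {0, 0}
 * makes the PE behave as if M were 0. */
static bool effective_sctlr_el1_m(bool m, bool hcr_dc, bool hcr_tge,
                                  bool non_secure)
{
    if (non_secure && (hcr_dc || hcr_tge))
        return false;
    return m;
}
```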


When this register has an architecturally-defined reset value, this field resets to 0.







Accessing the SCTLR_EL1: 
To access the SCTLR_EL1: 


MRS <Xt>, SCTLR_EL1 ; Read SCTLR_EL1 into Xt 
MSR SCTLR_EL1, <Xt> ; Write Xt to SCTLR_EL1 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
11    000   0001   0000   000
D7.2.82 SCTLR_EL2, System Control Register (EL2)
The SCTLR_EL2 characteristics are: 
Purpose 
Provides top level control of the system, including its memory system, at EL2. 
Usage constraints 
This register is accessible as follows: 
EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        RW        RW              RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register SCTLR_EL2 is architecturally mapped to AArch32 System register 
HSCTLR. 
If EL2 is not implemented, this register is RES0 from EL3.
Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 using AArch64. Otherwise, RW fields in this register reset to architecturally UNKNOWN 
values. 
Attributes 
SCTLR_EL2 is a 32-bit register. 
Field descriptions 
The SCTLR_EL2 bit assignments are: 
Bits      Field
[31:30]   RES0
[29:28]   RES1
[27:26]   RES0
[25]      EE
[24]      RES0
[23:22]   RES1
[21:20]   RES0
[19]      WXN
[18]      RES1
[17]      RES0
[16]      RES1
[15:13]   RES0
[12]      I
[11]      RES1
[10:6]    RES0
[5:4]     RES1
[3]       SA
[2]       C
[1]       A
[0]       M

Bits [31:30]
Reserved, RES0.
Bits [29:28]
Reserved, RES1.
Bits [27:26]
Reserved, RES0.


EE, bit [25] 


Endianness of data accesses at EL2, stage 1 translation table walks in the EL2 translation regime, 
and stage 2 translation table walks in the EL1&0 translation regime. 


The possible values of this bit are: 


0 Explicit data accesses at EL2, stage 1 translation table walks in the EL2 translation 
regime, and stage 2 translation table walks in the EL1&0 translation regime are 
little-endian. 


1 Explicit data accesses at EL2, stage 1 translation table walks in the EL2 translation 
regime, and stage 2 translation table walks in the EL1&0 translation regime are 
big-endian. 


If an implementation does not provide Big-endian support at Exception Levels higher than EL0, this
bit is RES0.


If an implementation does not provide Little-endian support at Exception Levels higher than EL0,
this bit is RES1.


The EE bit is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to an IMPLEMENTATION DEFINED value. 


Bit [24]
Reserved, RES0.

Bits [23:22]
Reserved, RES1.

Bits [21:20]
Reserved, RES0.


WXN, bit [19] 


Write permission implies XN (Execute-never). For the EL2 translation regime, this bit can force all 
memory regions that are writable to be treated as XN. The possible values of this bit are: 


0 This control has no effect on memory access permissions. 


1 Any region that is writable in the EL2 translation regime is forced to XN for accesses 
from software executing at EL2. 


The WXN bit is permitted to be cached in a TLB. 


This field resets to a value that is architecturally UNKNOWN. 
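
The WXN rule for the EL2 translation regime can be expressed as a predicate over a region's permissions. This is a sketch only: the function name is hypothetical, and the `xn` parameter stands for the Execute-never attribute from the translation tables, which is described elsewhere in the manual.

```c
#include <stdbool.h>

/* Returns true if instructions can be executed from a region, given its
 * write permission, its translation table XN attribute, and SCTLR_EL2.WXN. */
static bool el2_region_executable(bool writable, bool xn, bool wxn)
{
    if (xn)
        return false;
    if (wxn && writable)
        return false;   /* writable regions are forced to XN */
    return true;
}
```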


Bit [18]
Reserved, RES1.

Bit [17]
Reserved, RES0.

Bit [16]
Reserved, RES1.

Bits [15:13]
Reserved, RES0.

I, bit [12]
Instruction access Cacheability control, for accesses at EL2:


0 All instruction access to Normal memory from EL2 are Non-cacheable for all levels of 
instruction and unified cache. 


If the value of SCTLR_EL2.M is 0, instruction accesses from stage 1 of the EL2 
translation regime are to Normal, Outer Shareable, Inner Non-cacheable, Outer 
Non-cacheable memory. 


1 This control has no effect on the Cacheability of instruction access to Normal memory 
from EL2. 


If the value of SCTLR_EL2.M is 0, instruction accesses from stage 1 of the EL2 
translation regime are to Normal, Outer Shareable, Inner Write-Through, Outer 
Write-Through memory. 


This bit has no effect on the EL1&0 or EL3 translation regimes. 


When this register has an architecturally-defined reset value, this field resets to 0.

Bit [11]
Reserved, RES1.

Bits [10:6]
Reserved, RES0.

Bits [5:4]
Reserved, RES1.

SA, bit [3]
SP Alignment check enable. When set to 1, if a load or store instruction executed at EL2 uses the
SP as the base address and the SP is not aligned to a 16-byte boundary, then an SP alignment fault
exception is generated. For more information, see SP alignment checking on page D1-1515.

This field resets to a value that is architecturally UNKNOWN.

C, bit [2]
Cacheability control, for data accesses.

0 All data accesses to Normal memory from EL2, and all Normal memory accesses to the
EL2 translation tables, are Non-cacheable for all levels of data and unified cache.

1 This control has no effect on the Cacheability of:
• Data accesses to Normal memory from EL2.
• Normal memory accesses to the EL2 translation tables.

This bit has no effect on the EL1&0 or EL3 translation regimes.

When this register has an architecturally-defined reset value, this field resets to 0.

A, bit [1]
Alignment check enable. This is the enable bit for Alignment fault checking at EL2:

0 Alignment fault checking disabled when executing at EL2.
Instructions that load or store one or more registers, other than load/store exclusive and 
load-acquire/store-release, do not check that the address being accessed is aligned to the 
size of the data element(s) being accessed. 


1 Alignment fault checking enabled when executing at EL2. 


All instructions that load or store one or more registers have an alignment check that the 
address being accessed is aligned to the size of the data element(s) being accessed. If 
this check fails it causes an Alignment fault, which is taken as a Data Abort exception. 


Load/store exclusive and load-acquire/store-release instructions have an alignment check regardless 
of the value of the A bit. 
This field resets to a value that is architecturally UNKNOWN.
M, bit [0]
MMU enable for EL2 stage 1 address translation. Possible values of this bit are:
0 EL2 stage 1 address translation disabled.
See the SCTLR_EL2.I field for the behavior of instruction accesses to Normal memory.
1 EL2 stage 1 address translation enabled.


When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the SCTLR_EL2: 
To access the SCTLR_EL2: 


MRS <Xt>, SCTLR_EL2 ; Read SCTLR_EL2 into Xt 
MSR SCTLR_EL2, <Xt> ; Write Xt to SCTLR_EL2 


Register access is encoded as follows: 





op0 op1 CRn CRm op2





11 100 0001 0000 000 

























D7.2.83 SCTLR_EL3, System Control Register (EL3) 
The SCTLR_EL3 characteristics are: 
Purpose 
Provides top level control of the system, including its memory system, at EL3. 
Usage constraints 
This register is accessible as follows: 
EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         RW              RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL3 using AArch64. Otherwise, RW fields in this register reset to architecturally UNKNOWN 
values. 
Attributes 
SCTLR_EL3 is a 32-bit register. 
Field descriptions 
The SCTLR_EL3 bit assignments are: 
Bits      Field
[31:30]   RES0
[29:28]   RES1
[27:26]   RES0
[25]      EE
[24]      RES0
[23:22]   RES1
[21:20]   RES0
[19]      WXN
[18]      RES1
[17]      RES0
[16]      RES1
[15:13]   RES0
[12]      I
[11]      RES1
[10:6]    RES0
[5:4]     RES1
[3]       SA
[2]       C
[1]       A
[0]       M

Bits [31:30]
Reserved, RES0.
Bits [29:28]
Reserved, RES1.
Bits [27:26]
Reserved, RES0.


EE, bit [25]
Endianness of data accesses at EL3, and stage 1 translation table walks in the EL3 translation
regime.

The possible values of this bit are:

0 Explicit data accesses at EL3, and stage 1 translation table walks in the EL3 translation
regime are little-endian.

1 Explicit data accesses at EL3, and stage 1 translation table walks in the EL3 translation
regime are big-endian.

If an implementation does not provide Big-endian support at Exception Levels higher than EL0, this
bit is RES0.

If an implementation does not provide Little-endian support at Exception Levels higher than EL0,
this bit is RES1.

The EE bit is permitted to be cached in a TLB.

When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to an IMPLEMENTATION DEFINED value.

Bit [24]
Reserved, RES0.

Bits [23:22]
Reserved, RES1.

Bits [21:20]
Reserved, RES0.

WXN, bit [19]
Write permission implies XN (Execute-never). For the EL3 translation regime, this bit can force all
memory regions that are writable to be treated as XN. The possible values of this bit are:

0 This control has no effect on memory access permissions.

1 Any region that is writable in the EL3 translation regime is forced to XN for accesses
from software executing at EL3.

The WXN bit is permitted to be cached in a TLB.

This field resets to a value that is architecturally UNKNOWN.

Bit [18]
Reserved, RES1.

Bit [17]
Reserved, RES0.

Bit [16]
Reserved, RES1.

Bits [15:13]
Reserved, RES0.





I, bit [12] 

Instruction access Cacheability control, for accesses at EL3: 

0 All instruction access to Normal memory from EL3 are Non-cacheable for all levels of 
instruction and unified cache. 
If the value of SCTLR_EL3.M is 0, instruction accesses from stage 1 of the EL3 
translation regime are to Normal, Outer Shareable, Inner Non-cacheable, Outer 
Non-cacheable memory. 
1 This control has no effect on the Cacheability of instruction access to Normal memory
from EL3. 


If the value of SCTLR_EL3.M is 0, instruction accesses from stage 1 of the EL3 
translation regime are to Normal, Outer Shareable, Inner Write-Through, Outer 
Write-Through memory. 


This bit has no effect on the EL1&0 or EL2 translation regimes. 


When this register has an architecturally-defined reset value, this field resets to 0. 





Bit [11] 
Reserved, RES1. 
Bits [10:6] 
Reserved, RES0. 
Bits [5:4] 
Reserved, RES1. 
SA, bit [3] 
SP Alignment check enable. When set to 1, if a load or store instruction executed at EL3 uses the 
SP as the base address and the SP is not aligned to a 16-byte boundary, then a SP alignment fault 
exception is generated. For more information, see SP alignment checking on page D1-1515. 
This field resets to a value that is architecturally UNKNOWN. 
C, bit [2] 
Cacheability control, for data accesses. 
0 All data access to Normal memory from EL3, and all Normal memory accesses to the 
EL3 translation tables, are Non-cacheable for all levels of data and unified cache. 
1 This control has no effect on the Cacheability of: 
° Data access to Normal memory from EL3. 
° Normal memory accesses to the EL3 translation tables. 
This bit has no effect on the EL1&0 or EL2 translation regimes. 
When this register has an architecturally-defined reset value, this field resets to 0. 
A, bit [1] 
Alignment check enable. This is the enable bit for Alignment fault checking at EL3: 
0 Alignment fault checking disabled when executing at EL3. 
Instructions that load or store one or more registers, other than load/store exclusive and 
load-acquire/store-release, do not check that the address being accessed is aligned to the 
size of the data element(s) being accessed. 
1 Alignment fault checking enabled when executing at EL3. 
All instructions that load or store one or more registers have an alignment check that the 
address being accessed is aligned to the size of the data element(s) being accessed. If 
this check fails it causes an Alignment fault, which is taken as a Data Abort exception. 
Load/store exclusive and load-acquire/store-release instructions have an alignment check regardless 
of the value of the A bit. 
This field resets to a value that is architecturally UNKNOWN. 
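The alignment rule for the A bit can be illustrated with a small model. This is only a sketch: the function name, the `is_exclusive_or_acqrel` flag, and the reduction of an access to an address plus an element size are assumptions made for the example, and the architectural pseudocode remains the authoritative definition.

```python
def alignment_fault(address: int, element_size: int, a_bit: int,
                    is_exclusive_or_acqrel: bool = False) -> bool:
    """Return True if this simplified model reports an Alignment fault.

    element_size is the size, in bytes, of the data element being accessed.
    """
    # Load/store exclusive and load-acquire/store-release are always checked;
    # other loads and stores are checked only when SCTLR_EL3.A is 1.
    checked = is_exclusive_or_acqrel or a_bit == 1
    return checked and address % element_size != 0
```

With A set to 0, a 4-byte access to an address such as 0x1002 is not checked in this model unless it is an exclusive or acquire/release access.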
M, bit [0] 
MMU enable for EL3 stage 1 address translation. Possible values of this bit are: 
0 EL3 stage 1 address translation disabled. 
See the SCTLR_EL3.I field for the behavior of instruction accesses to Normal memory. 
1 EL3 stage 1 address translation enabled. 
When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the SCTLR_EL3: 
To access the SCTLR_EL3: 


MRS <Xt>, SCTLR_EL3 ; Read SCTLR_EL3 into Xt 
MSR SCTLR_EL3, <Xt> ; Write Xt to SCTLR_EL3 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    110   0001   0000   000 
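The encoding fields above map directly onto the 32-bit MRS and MSR (register) instruction words. The following sketch shows one way to compute them. The function name is hypothetical, and the bit positions it uses come from the A64 system instruction class encoding described in the instruction set chapters, not from this register description.

```python
def sysreg_insn(op0: int, op1: int, crn: int, crm: int, op2: int,
                rt: int, read: bool) -> int:
    """Encode an A64 MRS (read=True) or MSR register (read=False) instruction."""
    assert op0 in (2, 3) and 0 <= op1 < 8 and 0 <= crn < 16
    assert 0 <= crm < 16 and 0 <= op2 < 8 and 0 <= rt < 32
    word = 0xD5000000          # fixed bits of the system instruction class
    word |= int(read) << 21    # L bit: 1 for MRS, 0 for MSR
    word |= op0 << 19          # op0 occupies bits [20:19]
    word |= op1 << 16          # op1 occupies bits [18:16]
    word |= crn << 12          # CRn occupies bits [15:12]
    word |= crm << 8           # CRm occupies bits [11:8]
    word |= op2 << 5           # op2 occupies bits [7:5]
    word |= rt                 # Xt register number in bits [4:0]
    return word


# SCTLR_EL3: op0=0b11, op1=0b110, CRn=0b0001, CRm=0b0000, op2=0b000
mrs_x0_sctlr_el3 = sysreg_insn(0b11, 0b110, 0b0001, 0b0000, 0b000, rt=0, read=True)
print(hex(mrs_x0_sctlr_el3))
```

The same helper can be reused for any other System register in this chapter by substituting the values from its encoding table.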
D7.2.84 TCR_EL1, Translation Control Register (EL1) 
The TCR_EL1 characteristics are: 


Purpose 
Determines which of the Translation Table Base Registers defines the base address for a translation 
table walk required for the stage 1 translation of a memory access from EL0 or EL1. Also controls 
the translation table format and holds cacheability and shareability information. 

Usage constraints 


This register is accessible as follows: 





EL0   EL1 (NS)   EL1 (S)   EL2 (NS)   EL3 (SCR.NS=1)   EL3 (SCR.NS=0) 
-     RW         RW        RW         RW               RW 





Any of the bits in TCR_EL1 are permitted to be cached in a TLB. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


° If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to 
EL2. 


Configurations 


AArch64 System register TCR_EL1[31:0] is architecturally mapped to AArch32 System register 
TTBCR. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TCR_EL1 is a 64-bit register. 


Field descriptions 


The TCR_EL1 bit assignments are: 


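[Bit-assignment diagram. Fields, from bit 63 down to bit 0: RES0 [63:39], TBI1 [38], TBI0 [37], 
AS [36], RES0 [35], IPS [34:32], TG1 [31:30], SH1 [29:28], ORGN1 [27:26], IRGN1 [25:24], 
EPD1 [23], A1 [22], T1SZ [21:16], TG0 [15:14], SH0 [13:12], ORGN0 [11:10], IRGN0 [9:8], 
EPD0 [7], RES0 [6], T0SZ [5:0].] 
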
Bits [63:39] 
Reserved, RES0. 

TBI1, bit [38] 

Top Byte ignored - indicates whether the top byte of an address is used for address match for the 
TTBR1_EL1 region, or ignored and used for tagged addresses. Defined values are: 

0 Top Byte used in the address calculation. 
1 Top Byte ignored in the address calculation. 

This affects addresses generated in EL0 and EL1 using AArch64 where the address would be 
translated by tables pointed to by TTBR1_EL1. It has an effect whether the EL1&0 translation 
regime is enabled or not. 

Additionally, this affects changes to the program counter, when TBI1 is 1 and bit [55] of the target 
address is 1, caused by: 

° A branch or procedure return within EL0 or EL1. 
° An exception taken to EL1. 
° An exception return to EL0 or EL1. 

In these cases bits [63:56] of the address are also set to 1 before it is stored in the PC. 

TBI0, bit [37] 

Top Byte ignored - indicates whether the top byte of an address is used for address match for the 
TTBR0_EL1 region, or ignored and used for tagged addresses. Defined values are: 

0 Top Byte used in the address calculation. 
1 Top Byte ignored in the address calculation. 

This affects addresses generated in EL0 and EL1 using AArch64 where the address would be 
translated by tables pointed to by TTBR0_EL1. It has an effect whether the EL1&0 translation 
regime is enabled or not. 

Additionally, this affects changes to the program counter, when TBI0 is 1 and bit [55] of the target 
address is 0, caused by: 

° A branch or procedure return within EL0 or EL1. 
° An exception taken to EL1. 
° An exception return to EL0 or EL1. 

In these cases bits [63:56] of the address are also set to 0 before it is stored in the PC. 

AS, bit [36] 

ASID Size. Defined values are: 

0 8 bit - the upper 8 bits of TTBR0_EL1 and TTBR1_EL1 are ignored by hardware for 
every purpose except reading back the register, and are treated as if they are all zeros 
when used for allocation and matching entries in the TLB. 

1 16 bit - the upper 16 bits of TTBR0_EL1 and TTBR1_EL1 are used for allocation and 
matching in the TLB. 

If the implementation has only 8 bits of ASID, this field is RES0. 

Bit [35] 

Reserved, RES0. 
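The effect of the two Top Byte ignored controls on a value written to the program counter can be sketched as follows. The function name and the representation of the PC as a Python integer are assumptions for illustration; the sketch models only the bit [55] selection and the forcing of bits [63:56] described for these two fields.

```python
MASK64 = (1 << 64) - 1
TOP_BYTE = 0xFF << 56


def pc_after_tbi(target: int, tbi0: int, tbi1: int) -> int:
    """Model the top-byte adjustment applied to a branch target at EL0 or EL1."""
    target &= MASK64
    bit55 = (target >> 55) & 1
    if bit55 == 1 and tbi1 == 1:
        return target | TOP_BYTE              # bits [63:56] are set to 1
    if bit55 == 0 and tbi0 == 1:
        return target & ~TOP_BYTE & MASK64    # bits [63:56] are set to 0
    return target                             # top byte is left unchanged
```

For example, a tagged target of 0xAB00000000001000 with the TBI0 control set to 1 is stored as 0x0000000000001000 in this model, because bit [55] is 0.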


IPS, bits [34:32] 


Intermediate Physical Address Size. 





000 32 bits, 4GB. 
001 36 bits, 64GB. 
010 40 bits, 1TB. 
011 42 bits, 4TB. 
100 44 bits, 16TB. 
101 48 bits, 256TB. 
Other values are reserved. 


The reserved values behave in the same way as the 101 encoding, but software must not rely on this 
property as the behavior of the RESERVED values might change in a future revision of the 
architecture. 
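A decoder for the IPS encodings listed above might look like the following sketch. The names are hypothetical. Reserved encodings are reported as `None` rather than being treated like 0b101, because software must not rely on that behavior.

```python
# IPS encoding -> intermediate physical address size in bits
IPS_BITS = {0b000: 32, 0b001: 36, 0b010: 40, 0b011: 42, 0b100: 44, 0b101: 48}


def ipa_size_bits(tcr_el1: int):
    """Return the IPA size in bits selected by TCR_EL1.IPS, or None if reserved."""
    ips = (tcr_el1 >> 32) & 0b111   # IPS is bits [34:32]
    return IPS_BITS.get(ips)
```

The addressable range in bytes is then `1 << ipa_size_bits(value)` for any non-reserved encoding.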


TG1, bits [31:30] 
Granule size for the TTBR1_EL1. 


01 16KB 
10 4KB 
11 64KB 


Other values are reserved. 


If the value is programmed to either a reserved value, or a size that has not been implemented, then 
the hardware will treat the field as if it has been programmed to an IMPLEMENTATION DEFINED 
choice of the sizes that has been implemented for all purposes other than the value read back from 
this register. 


It is IMPLEMENTATION DEFINED whether the value read back is the value programmed or the value 
that corresponds to the size chosen. 
SH1, bits [29:28] 


Shareability attribute for memory associated with translation table walks using TTBR1_EL1. 
Defined values are: 


00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 


Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


ORGN1, bits [27:26] 


Outer cacheability attribute for memory associated with translation table walks using TTBR1_EL1. 


00 Normal memory, Outer Non-cacheable 

01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 


IRGN1, bits [25:24] 


Inner cacheability attribute for memory associated with translation table walks using TTBR1_EL1. 


00 Normal memory, Inner Non-cacheable 

01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 


EPD1, bit [23] 


Translation table walk disable for translations using TTBR1_EL1. This bit controls whether a 
translation table walk is performed on a TLB miss, for an address that is translated using 
TTBR1_EL1. The encoding of this bit is: 


0 Perform translation table walks using TTBR1_EL1. 


1 A TLB miss on an address that is translated using TTBR1_EL1 generates a Translation 
fault. No translation table walk is performed. 
A1, bit [22] 
Selects whether TTBR0_EL1 or TTBR1_EL1 defines the ASID. The encoding of this bit is: 
0 TTBR0_EL1.ASID defines the ASID. 
1 TTBR1_EL1.ASID defines the ASID. 


T1SZ, bits [21:16] 
The size offset of the memory region addressed by TTBR1_EL1. The region size is 2^(64-T1SZ) bytes. 


The maximum and minimum possible values for T1SZ depend on the level of translation table and 
the memory translation granule size, as described in the AArch64 Virtual Memory System 
Architecture chapter. 


TG0, bits [15:14] 
Granule size for the TTBR0_EL1. 


00 4KB 
01 64KB 
10 16KB 


Other values are reserved. 


If the value is programmed to either a reserved value, or a size that has not been implemented, then 
the hardware will treat the field as if it has been programmed to an IMPLEMENTATION DEFINED 
choice of the sizes that has been implemented for all purposes other than the value read back from 
this register. 


It is IMPLEMENTATION DEFINED whether the value read back is the value programmed or the value 
that corresponds to the size chosen. 
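The two granule fields of this register use different encodings for the same sizes, which is easy to get wrong in software. The sketch below decodes both from a register value. The table and function names are illustrative only, and reserved encodings are returned as `None` because their effective size is IMPLEMENTATION DEFINED.

```python
KB = 1024

# Encodings for the granule field at bits [15:14] (used with translation table base register 0).
TG0_BYTES = {0b00: 4 * KB, 0b01: 64 * KB, 0b10: 16 * KB}
# Encodings for the granule field at bits [31:30] (used with TTBR1_EL1).
TG1_BYTES = {0b01: 16 * KB, 0b10: 4 * KB, 0b11: 64 * KB}


def granule_sizes(tcr_el1: int):
    """Return (lower-region granule, upper-region granule) in bytes; None if reserved."""
    tg0 = (tcr_el1 >> 14) & 0b11
    tg1 = (tcr_el1 >> 30) & 0b11
    return TG0_BYTES.get(tg0), TG1_BYTES.get(tg1)
```

Note that a register value of zero selects a 4KB granule for the lower region but a reserved encoding for the upper region.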


SH0, bits [13:12] 


Shareability attribute for memory associated with translation table walks using TTBR0_EL1. 


00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 


Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


ORGN0, bits [11:10] 


Outer cacheability attribute for memory associated with translation table walks using TTBR0_EL1. 


00 Normal memory, Outer Non-cacheable 

01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 


IRGN0, bits [9:8] 


Inner cacheability attribute for memory associated with translation table walks using TTBR0_EL1. 





00 Normal memory, Inner Non-cacheable 
01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 
EPD0, bit [7] 


Translation table walk disable for translations using TTBR0_EL1. This bit controls whether a 
translation table walk is performed on a TLB miss, for an address that is translated using 
TTBR0_EL1. The encoding of this bit is: 


0 Perform translation table walks using TTBR0_EL1. 


1 A TLB miss on an address that is translated using TTBR0_EL1 generates a Translation 
fault. No translation table walk is performed. 


Bit [6] 
Reserved, RES0. 
T0SZ, bits [5:0] 
The size offset of the memory region addressed by TTBR0_EL1. The region size is 2^(64-T0SZ) bytes. 


The maximum and minimum possible values for T0SZ depend on the level of translation table and 
the memory translation granule size, as described in the AArch64 Virtual Memory System 
Architecture chapter. 
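The region size formula for the two size offset fields can be worked through numerically. In the sketch below the helper names are hypothetical. The placement of the lower region at the bottom of the 64-bit address space and the upper region at the top is taken from the AArch64 Virtual Memory System Architecture chapter rather than from this register description.

```python
def region_size(tnsz: int) -> int:
    """Size in bytes of a region with size offset tnsz: 2^(64 - tnsz)."""
    return 1 << (64 - tnsz)


def lower_region(t0sz: int):
    """Inclusive (first, last) addresses of the lower region, starting at zero."""
    return 0, region_size(t0sz) - 1


def upper_region(t1sz: int):
    """Inclusive (first, last) addresses of the upper region, ending at 2^64 - 1."""
    top = 1 << 64
    return top - region_size(t1sz), top - 1
```

With a size offset of 16 in both fields, each region spans 2^48 bytes (256TB).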


Accessing the TCR_EL1: 


To access the TCR_EL1: 


MRS <Xt>, TCR_EL1 ; Read TCR_EL1 into Xt 
MSR TCR_EL1, <Xt> ; Write Xt to TCR_EL1 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    000   0010   0000   010 
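Software that programs TCR_EL1 usually builds the 64-bit value from individual fields before issuing the MSR. The following sketch packs and unpacks the fields using the bit positions given in the field descriptions for this register. The table, the function names, and the range check are choices made for the example and are not part of the architecture.

```python
# name -> (least significant bit, width in bits)
TCR_EL1_FIELDS = {
    "T0SZ": (0, 6), "EPD0": (7, 1), "IRGN0": (8, 2), "ORGN0": (10, 2),
    "SH0": (12, 2), "TG0": (14, 2), "T1SZ": (16, 6), "A1": (22, 1),
    "EPD1": (23, 1), "IRGN1": (24, 2), "ORGN1": (26, 2), "SH1": (28, 2),
    "TG1": (30, 2), "IPS": (32, 3), "AS": (36, 1), "TBI0": (37, 1),
    "TBI1": (38, 1),
}


def pack_tcr_el1(**fields: int) -> int:
    """Combine named field values into a TCR_EL1 value; reserved bits stay 0."""
    value = 0
    for name, field_value in fields.items():
        lsb, width = TCR_EL1_FIELDS[name]   # KeyError for an unknown field name
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name}={field_value} does not fit in {width} bit(s)")
        value |= field_value << lsb
    return value


def unpack_tcr_el1(value: int) -> dict:
    """Split a TCR_EL1 value into its named fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in TCR_EL1_FIELDS.items()}
```

A call such as `pack_tcr_el1(T0SZ=16, T1SZ=16, TG1=0b10, IPS=0b101)` yields a value that could then be written with the MSR form shown above.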
D7.2.85 TCR_EL2, Translation Control Register (EL2) 
The TCR_EL2 characteristics are: 
Purpose 
Controls translation table walks required for the stage 1 translation of memory accesses from EL2, 
and holds cacheability and shareability information for the accesses. 
Usage constraints 
This register is accessible as follows: 
EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     -         -        RW        RW              RW 
Any of the bits in TCR_EL2 are permitted to be cached in a TLB. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch64 System register TCR_EL2 is architecturally mapped to AArch32 System register HTCR. 
If EL2 is not implemented, this register is RES0 from EL3. 
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
TCR_EL2 is a 32-bit register. 
Field descriptions 
The TCR_EL2 bit assignments are: 
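[Bit-assignment diagram. Fields, from bit 31 down to bit 0: RES1 [31], RES0 [30:24], RES1 [23], 
RES0 [22:21], TBI [20], RES0 [19], PS [18:16], TG0 [15:14], SH0 [13:12], ORGN0 [11:10], 
IRGN0 [9:8], RES0 [7:6], T0SZ [5:0].] 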
Bit [31] 
Reserved, RES1. 

Bits [30:24] 
Reserved, RES0. 

Bit [23] 
Reserved, RES1. 

Bits [22:21] 
Reserved, RES0. 

TBI, bit [20] 

Top Byte ignored - indicates whether the top byte of an address is used for address match for the 
TTBR0_EL2 region, or ignored and used for tagged addresses. 

0 Top Byte used in the address calculation. 
1 Top Byte ignored in the address calculation. 

This affects addresses generated in EL2 using AArch64 where the address would be translated by 
tables pointed to by TTBR0_EL2. It has an effect whether the EL2 translation regime is enabled or 
not. 

Additionally, this affects changes to the program counter, when TBI is 1, caused by: 

° A branch or procedure return within EL2. 
° An exception taken to EL2. 
° An exception return to EL2. 

In these cases bits [63:56] of the address are set to 0 before it is stored in the PC. 

Bit [19] 

Reserved, RES0. 


PS, bits [18:16] 


Physical Address Size. 


000 32 bits, 4GB. 
001 36 bits, 64GB. 
010 40 bits, 1TB. 
011 42 bits, 4TB. 
100 44 bits, 16TB. 
101 48 bits, 256TB. 


Other values are reserved. 


The reserved values behave in the same way as the 101 encoding, but software must not rely on this 
property as the behavior of the RESERVED values might change in a future revision of the 
architecture. 


TG0, bits [15:14] 


Granule size for the TTBR0_EL2. 


00 4KB 
01 64KB 
10 16KB 


Other values are reserved. 


If the value is programmed to either a reserved value, or a size that has not been implemented, then 
the hardware will treat the field as if it has been programmed to an IMPLEMENTATION DEFINED 
choice of the sizes that has been implemented for all purposes other than the value read back from 
this register. 


It is IMPLEMENTATION DEFINED whether the value read back is the value programmed or the value 
that corresponds to the size chosen. 


SH0, bits [13:12] 

Shareability attribute for memory associated with translation table walks using TTBR0_EL2. 

00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


ORGN0, bits [11:10] 

Outer cacheability attribute for memory associated with translation table walks using TTBR0_EL2. 

00 Normal memory, Outer Non-cacheable 
01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 

IRGN0, bits [9:8] 

Inner cacheability attribute for memory associated with translation table walks using TTBR0_EL2. 

00 Normal memory, Inner Non-cacheable 
01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 

Bits [7:6] 

Reserved, RES0. 


T0SZ, bits [5:0] 

The size offset of the memory region addressed by TTBR0_EL2. The region size is 2^(64-T0SZ) bytes. 

The maximum and minimum possible values for T0SZ depend on the level of translation table and 
the memory translation granule size, as described in the AArch64 Virtual Memory System 
Architecture chapter. 


Accessing the TCR_EL2: 


To access the TCR_EL2: 


MRS <Xt>, TCR_EL2 ; Read TCR_EL2 into Xt 
MSR TCR_EL2, <Xt> ; Write Xt to TCR_EL2 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    100   0010   0000   010 
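Because TCR_EL2 mixes RES1 and RES0 bits among its fields, it can be useful to check a candidate value before writing it. The sketch below derives the two masks from the reserved bit positions listed for this register. The names are hypothetical, and the check assumes the usual expectation that software writes RES1 bits as 1 and RES0 bits as 0.

```python
# Bits [31] and [23] are RES1.
TCR_EL2_RES1 = (1 << 31) | (1 << 23)
# Bits [30:24], [22:21], [19] and [7:6] are RES0.
TCR_EL2_RES0 = (0x7F << 24) | (0b11 << 21) | (1 << 19) | (0b11 << 6)


def tcr_el2_reserved_ok(value: int) -> bool:
    """True if every RES1 bit is 1 and every RES0 bit is 0 in a 32-bit value."""
    value &= 0xFFFFFFFF
    return (value & TCR_EL2_RES1) == TCR_EL2_RES1 and (value & TCR_EL2_RES0) == 0
```

A value built by OR-ing field settings onto `TCR_EL2_RES1` passes this check as long as no reserved position is touched.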
D7.2.86 TCR_EL3, Translation Control Register (EL3) 
The TCR_EL3 characteristics are: 


Purpose 
Controls translation table walks required for the stage 1 translation of memory accesses from EL3, 
and holds cacheability and shareability information for the accesses. 

Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     -         -        -         RW              RW 





Any of the bits in TCR_EL3 are permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TCR_EL3 is a 32-bit register. 


Field descriptions 


The TCR_EL3 bit assignments are: 


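[Bit-assignment diagram. Fields, from bit 31 down to bit 0: RES1 [31], RES0 [30:24], RES1 [23], 
RES0 [22:21], TBI [20], RES0 [19], PS [18:16], TG0 [15:14], SH0 [13:12], ORGN0 [11:10], 
IRGN0 [9:8], RES0 [7:6], T0SZ [5:0].] 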


Bit [31] 
Reserved, RES1. 

Bits [30:24] 
Reserved, RES0. 

Bit [23] 
Reserved, RES1. 

Bits [22:21] 
Reserved, RES0. 






TBI, bit [20] 

Top Byte ignored - indicates whether the top byte of an address is used for address match for the 
TTBR0_EL3 region, or ignored and used for tagged addresses. 

0 Top Byte used in the address calculation. 
1 Top Byte ignored in the address calculation. 

This affects addresses generated in EL3 using AArch64 where the address would be translated by 
tables pointed to by TTBR0_EL3. It has an effect whether the EL3 translation regime is enabled or 
not. 

Additionally, this affects changes to the program counter, when TBI is 1, caused by: 

° A branch or procedure return within EL3. 
° An exception taken to EL3. 
° An exception return to EL3. 

In these cases bits [63:56] of the address are set to 0 before it is stored in the PC. 

Bit [19] 

Reserved, RES0. 


PS, bits [18:16] 


Physical Address Size. 


000 32 bits, 4GB. 
001 36 bits, 64GB. 
010 40 bits, 1TB. 
011 42 bits, 4TB. 
100 44 bits, 16TB. 
101 48 bits, 256TB. 


Other values are reserved. 


The reserved values behave in the same way as the 101 encoding, but software must not rely on this 
property as the behavior of the RESERVED values might change in a future revision of the 
architecture. 


TG0, bits [15:14] 


Granule size for the TTBR0_EL3. 


00 4KB 
01 64KB 
10 16KB 


Other values are reserved. 


If the value is programmed to either a reserved value, or a size that has not been implemented, then 
the hardware will treat the field as if it has been programmed to an IMPLEMENTATION DEFINED 
choice of the sizes that has been implemented for all purposes other than the value read back from 
this register. 


It is IMPLEMENTATION DEFINED whether the value read back is the value programmed or the value 
that corresponds to the size chosen. 


SH0, bits [13:12] 

Shareability attribute for memory associated with translation table walks using TTBR0_EL3. 

00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


ORGN0, bits [11:10] 

Outer cacheability attribute for memory associated with translation table walks using TTBR0_EL3. 

00 Normal memory, Outer Non-cacheable 
01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 

IRGN0, bits [9:8] 

Inner cacheability attribute for memory associated with translation table walks using TTBR0_EL3. 

00 Normal memory, Inner Non-cacheable 
01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 

Bits [7:6] 

Reserved, RES0. 


T0SZ, bits [5:0] 

The size offset of the memory region addressed by TTBR0_EL3. The region size is 2^(64-T0SZ) bytes. 

The maximum and minimum possible values for T0SZ depend on the level of translation table and 
the memory translation granule size, as described in the AArch64 Virtual Memory System 
Architecture chapter. 


Accessing the TCR_EL3: 


To access the TCR_EL3: 


MRS <Xt>, TCR_EL3 ; Read TCR_EL3 into Xt 
MSR TCR_EL3, <Xt> ; Write Xt to TCR_EL3 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    110   0010   0000   010 
D7.2.87 TPIDR_EL0, EL0 Read/Write Software Thread ID Register 

The TPIDR_EL0 characteristics are: 

Purpose 

Provides a location where software executing at EL0 can store thread identifying information, for 
OS management purposes. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
RW    RW        RW       RW        RW              RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register TPIDR_EL0[31:0] is architecturally mapped to AArch32 System register 
TPIDRURW. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TPIDR_EL0 is a 64-bit register. 


Field descriptions 


The TPIDR_EL0 bit assignments are: 


Thread ID 


Bits [63:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDR_EL0: 
To access the TPIDR_EL0: 


MRS <Xt>, TPIDR_EL0 ; Read TPIDR_EL0 into Xt 
MSR TPIDR_EL0, <Xt> ; Write Xt to TPIDR_EL0 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    011   1101   0000   010 
D7.2.88 TPIDR_EL1, EL1 Software Thread ID Register 


The TPIDR_EL1 characteristics are: 


Purpose 


Provides a location where software executing at EL1 can store thread identifying information, for 
OS management purposes. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     RW        RW       RW        RW              RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register TPIDR_EL1[31:0] is architecturally mapped to AArch32 System register 
TPIDRPRW. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TPIDR_EL1 is a 64-bit register. 


Field descriptions 


The TPIDR_EL1 bit assignments are: 


Thread ID 


Bits [63:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDR_EL1: 
To access the TPIDR_EL1: 


MRS <Xt>, TPIDR_EL1 ; Read TPIDR_EL1 into Xt 
MSR TPIDR_EL1, <Xt> ; Write Xt to TPIDR_EL1 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    000   1101   0000   100 
D7.2.89 TPIDR_EL2, EL2 Software Thread ID Register 
The TPIDR_EL2 characteristics are: 


Purpose 


Provides a location where software executing at EL2 can store thread identifying information, for 
OS management purposes. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     -         -        RW        RW              RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register TPIDR_EL2[31:0] is architecturally mapped to AArch32 System register 
HTPIDR. 


If EL2 is not implemented, this register is RES0 from EL3. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TPIDR_EL2 is a 64-bit register. 


Field descriptions 


The TPIDR_EL2 bit assignments are: 


63 0 


Thread ID 


Bits [63:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDR_EL2: 
To access the TPIDR_EL2: 


MRS <Xt>, TPIDR_EL2 ; Read TPIDR_EL2 into Xt 
MSR TPIDR_EL2, <Xt> ; Write Xt to TPIDR_EL2 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    100   1101   0000   010 
D7.2.90 TPIDR_EL3, EL3 Software Thread ID Register 


The TPIDR_EL3 characteristics are: 


Purpose 


Provides a location where software executing at EL3 can store thread identifying information, for 
OS management purposes. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     -         -        -         RW              RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TPIDR_EL3 is a 64-bit register. 


Field descriptions 


The TPIDR_EL3 bit assignments are: 


Thread ID 


Bits [63:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDR_EL3: 
To access the TPIDR_EL3: 


MRS <Xt>, TPIDR_EL3 ; Read TPIDR_EL3 into Xt 
MSR TPIDR_EL3, <Xt> ; Write Xt to TPIDR_EL3 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    110   1101   0000   010 
D7.2.91 TPIDRRO_EL0, EL0 Read-Only Software Thread ID Register 
The TPIDRRO_EL0 characteristics are: 


Purpose 


Provides a location where software executing at EL1 or higher can store thread identifying 
information that is visible to software executing at EL0, for OS management purposes. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
RO    RW        RW       RW        RW              RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register TPIDRRO_EL0[31:0] is architecturally mapped to AArch32 System 
register TPIDRURO. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TPIDRRO_EL0 is a 64-bit register. 


Field descriptions 


The TPIDRRO_EL0 bit assignments are: 


63 0 


Thread ID 


Bits [63:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDRRO_EL0: 
To access the TPIDRRO_EL0: 


MRS <Xt>, TPIDRRO_EL0 ; Read TPIDRRO_EL0 into Xt 
MSR TPIDRRO_EL0, <Xt> ; Write Xt to TPIDRRO_EL0 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    011   1101   0000   011 
D7.2.92 TTBR0_EL1, Translation Table Base Register 0 (EL1) 
The TTBR0_EL1 characteristics are: 


Purpose 


Holds the base address of translation table 0, and information about the memory it occupies. This is 
one of the translation tables for the stage 1 translation of memory accesses at EL0 and EL1. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     RW        RW       RW        RW              RW 





Any of the fields in this register are permitted to be cached in a TLB. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


° If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


° If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to 
EL2. 


Configurations 


AArch64 System register TTBR0_EL1 is architecturally mapped to AArch32 System register 
TTBR0. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TTBR0_EL1 is a 64-bit register. 


Field descriptions 


The TTBR0_EL1 bit assignments are: 


63 48 47 0 


ASID BADDR 


ASID, bits [63:48] 


An ASID for the translation table base address. The TCR_EL1.A1 field selects either 
TTBR0_EL1.ASID or TTBR1_EL1.ASID. 


If the implementation has only 8 bits of ASID, then the upper 8 bits of this field are RES0. 


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that 
if they are not all zero, this is a misaligned translation table base address, with effects that are 
CONSTRAINED UNPREDICTABLE, and must be one of the following: 

° Bits [x-1:0] are treated as if all the bits are zero. The value read back from those bits is either 
the value written or zero. 

° The result of the calculation of an address for a translation table walk using this register can 
be corrupted in those bits that are nonzero. 

The AArch64 Virtual Memory System Architecture chapter describes how x is calculated based on 
the value of TCR_EL1.T0SZ, the stage of translation, and the translation granule size. 
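The split between the ASID and BADDR fields, and the alignment requirement on the low bits of BADDR, can be sketched as follows. The function names are hypothetical. The value of x is passed in as a parameter because its calculation is defined in the AArch64 Virtual Memory System Architecture chapter, and the `as_bit` argument stands for the TCR_EL1.AS setting.

```python
def split_ttbr(value: int, as_bit: int):
    """Return (asid, baddr) from a 64-bit translation table base register value."""
    asid = (value >> 48) & 0xFFFF          # ASID is bits [63:48]
    if as_bit == 0:
        asid &= 0xFF                       # with 8-bit ASIDs the upper 8 bits are ignored
    baddr = value & ((1 << 48) - 1)        # BADDR is bits [47:0]
    return asid, baddr


def baddr_misaligned(baddr: int, x: int) -> bool:
    """True if any of bits [x-1:0] of the base address is nonzero."""
    return baddr & ((1 << x) - 1) != 0
```

A value for which `baddr_misaligned` returns True is one that the text above classifies as a misaligned translation table base address.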
Accessing the TTBR0_EL1: 
To access the TTBR0_EL1: 


MRS <Xt>, TTBR0_EL1 ; Read TTBR0_EL1 into Xt 
MSR TTBR0_EL1, <Xt> ; Write Xt to TTBR0_EL1 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2 
11    000   0010   0000   000 
D7.2.93 TTBR0_EL2, Translation Table Base Register 0 (EL2) 
The TTBR0_EL2 characteristics are: 


Purpose 


Holds the base address of the translation table for the stage 1 translation of memory accesses from 
EL2. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0) 
-     -         -        RW        RW              RW 





Any of the fields in this register are permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register TTBR0_EL2 is architecturally mapped to AArch32 System register 
HTTBR. 


If EL2 is not implemented, this register is RES0 from EL3. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TTBR0_EL2 is a 64-bit register. 


Field descriptions 


The TTBR0_EL2 bit assignments are: 


63 48 47 0 


RESO BADDR 


Bits [63:48] 


Reserved, RESO. 


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that
if they are not all zero, this is a misaligned translation table base address, with effects that are
CONSTRAINED UNPREDICTABLE, and must be one of the following:

• Bits [x-1:0] are treated as if all the bits are zero. The value read back from those bits is either
the value written or zero.

• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.

The AArch64 Virtual Memory System Architecture chapter describes how x is calculated based on
the value of TCR_EL2.T0SZ, the stage of translation, and the translation granule size.

Accessing the TTBR0_EL2:

To access the TTBR0_EL2:

MRS <Xt>, TTBR0_EL2 ; Read TTBR0_EL2 into Xt
MSR TTBR0_EL2, <Xt> ; Write Xt to TTBR0_EL2


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0010  0000  000

D7.2.94 TTBR0_EL3, Translation Table Base Register 0 (EL3)

The TTBR0_EL3 characteristics are:


Purpose 


Holds the base address of the translation table for the stage 1 translation of memory accesses from 
EL3. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       -        RW             RW





Any of the fields in this register are permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TTBR0_EL3 is a 64-bit register.

Field descriptions

The TTBR0_EL3 bit assignments are:

63    48 47       0
RES0     BADDR

Bits [63:48]

Reserved, RES0.


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that
if they are not all zero, this is a misaligned translation table base address, with effects that are
CONSTRAINED UNPREDICTABLE, and must be one of the following:

• Bits [x-1:0] are treated as if all the bits are zero. The value read back from those bits is either
the value written or zero.

• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.

The AArch64 Virtual Memory System Architecture chapter describes how x is calculated based on
the value of TCR_EL3.T0SZ, the stage of translation, and the translation granule size.
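
The BADDR alignment rule can be checked in software before a value is written to the register. The following Python sketch is illustrative only. The function name check_baddr is hypothetical, and x is passed in by the caller, because its derivation from TCR_EL3.T0SZ and the translation granule size is described in the VMSA chapter and is not reproduced here.

```python
BADDR_MASK = (1 << 48) - 1  # BADDR occupies register bits [47:0]


def check_baddr(reg_value, x):
    """Split BADDR into the aligned base, bits [47:x], and the low bits [x-1:0].

    x is an input to this sketch; it is not computed here.
    A nonzero second element means the base address is misaligned.
    """
    if not 0 <= x <= 48:
        raise ValueError("x must be in the range 0 to 48")
    baddr = reg_value & BADDR_MASK
    low_mask = (1 << x) - 1
    return baddr & ~low_mask, baddr & low_mask


# Example with an assumed x of 12 (a 4KB-aligned table).
base, low_bits = check_baddr(0x80001010, 12)
if low_bits:
    print(f"misaligned: low bits {low_bits:#x}, aligned base {base:#x}")
```

Software that wants to avoid the CONSTRAINED UNPREDICTABLE behavior would reject or correct any value for which the second element is nonzero.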


Accessing the TTBR0_EL3:

To access the TTBR0_EL3:

MRS <Xt>, TTBR0_EL3 ; Read TTBR0_EL3 into Xt
MSR TTBR0_EL3, <Xt> ; Write Xt to TTBR0_EL3

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   110  0010  0000  000

D7.2.95 TTBR1_EL1, Translation Table Base Register 1 (EL1)

The TTBR1_EL1 characteristics are:


Purpose 


Holds the base address of translation table 1, and information about the memory it occupies. This is
one of the translation tables for the stage 1 translation of memory accesses at EL0 and EL1.


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Any of the fields in this register are permitted to be cached in a TLB. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.


Configurations 


AArch64 System register TTBR1_EL1 is architecturally mapped to AArch32 System register
TTBR1.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
TTBR1_EL1 is a 64-bit register.

Field descriptions

The TTBR1_EL1 bit assignments are:

63    48 47       0
ASID     BADDR

ASID, bits [63:48]

An ASID for the translation table base address. The TCR_EL1.A1 field selects either
TTBR0_EL1.ASID or TTBR1_EL1.ASID.

If the implementation has only 8 bits of ASID, then the upper 8 bits of this field are RES0.
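
As a rough illustration of the layout above, the following Python sketch splits a 64-bit TTBR0_EL1 or TTBR1_EL1 value into its ASID and BADDR fields and picks the ASID selected by TCR_EL1.A1. The function names and the asid_bits parameter are hypothetical. The sketch assumes that A1 == 0 selects TTBR0_EL1.ASID and A1 == 1 selects TTBR1_EL1.ASID, as given in the TCR_EL1 description rather than in this section.

```python
def split_ttbr_el1(value, asid_bits=16):
    """Return (asid, baddr) from a 64-bit TTBR0_EL1 or TTBR1_EL1 value.

    asid_bits is 16 or 8. With 8, the upper 8 bits of the ASID field are
    RES0 and are masked off here.
    """
    if asid_bits not in (8, 16):
        raise ValueError("asid_bits must be 8 or 16")
    asid = (value >> 48) & ((1 << asid_bits) - 1)
    baddr = value & ((1 << 48) - 1)
    return asid, baddr


def active_asid(ttbr0, ttbr1, a1, asid_bits=16):
    """Return the ASID chosen by TCR_EL1.A1 (assumed: 0 -> TTBR0_EL1, 1 -> TTBR1_EL1)."""
    source = ttbr1 if a1 else ttbr0
    return split_ttbr_el1(source, asid_bits)[0]
```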


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that
if they are not all zero, this is a misaligned translation table base address, with effects that are
CONSTRAINED UNPREDICTABLE, and must be one of the following:

• Bits [x-1:0] are treated as if all the bits are zero. The value read back from those bits is either
the value written or zero.

• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.

The AArch64 Virtual Memory System Architecture chapter describes how x is calculated based on
the value of TCR_EL1.T1SZ, the stage of translation, and the translation granule size.

Accessing the TTBR1_EL1:

To access the TTBR1_EL1:


MRS <Xt>, TTBR1_EL1 ; Read TTBR1_EL1 into Xt 
MSR TTBR1_EL1, <Xt> ; Write Xt to TTBR1_EL1 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   000  0010  0000  001

D7.2.96 VBAR_EL1, Vector Base Address Register (EL1)


The VBAR_EL1 characteristics are: 


Purpose 


Holds the vector base address for any exception that is taken to EL1. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register VBAR_EL1[31:0] is architecturally mapped to AArch32 System register 
VBAR. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VBAR_EL1 is a 64-bit register. 


Field descriptions 


The VBAR_EL1 bit assignments are:

63                   11 10    0
Vector Base Address     RES0


Bits [63:11] 
Vector Base Address. Base address of the exception vectors for exceptions taken to EL1. 


If tagged addresses are being used, bits [55:48] of VBAR_EL1 must be the same or else the use of 
the vector address will result in a recursive exception. 


If tagged addresses are not being used, bits [63:48] of VBAR_EL1 must be the same or else the use 
of the vector address will result in a recursive exception. 


Bits [10:0] 


Reserved, RES0.

Accessing the VBAR_EL1:

To access the VBAR_EL1:

MRS <Xt>, VBAR_EL1 ; Read VBAR_EL1 into Xt
MSR VBAR_EL1, <Xt> ; Write Xt to VBAR_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  1100  0000  000

D7.2.97 VBAR_EL2, Vector Base Address Register (EL2)

The VBAR_EL2 characteristics are:
Purpose

Holds the vector base address for any exception that is taken to EL2.

Usage constraints

This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW

Traps and Enables

There are no traps or enables affecting this register.

Configurations

AArch64 System register VBAR_EL2[31:0] is architecturally mapped to AArch32 System register
HVBAR.

If EL2 is not implemented, this register is RES0 from EL3.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes

VBAR_EL2 is a 64-bit register.

Field descriptions

The VBAR_EL2 bit assignments are:

63                   11 10    0
Vector Base Address     RES0

Bits [63:11]

Vector Base Address. Base address of the exception vectors for exceptions taken to EL2.

If tagged addresses are being used, bits [55:48] of VBAR_EL2 must be 0 or else the use of the vector
address will result in a recursive exception.

If tagged addresses are not being used, bits [63:48] of VBAR_EL2 must be 0 or else the use of the
vector address will result in a recursive exception.

Bits [10:0]

Reserved, RES0.

Accessing the VBAR_EL2:

To access the VBAR_EL2:

MRS <Xt>, VBAR_EL2 ; Read VBAR_EL2 into Xt
MSR VBAR_EL2, <Xt> ; Write Xt to VBAR_EL2

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  1100  0000  000

D7.2.98 VBAR_EL3, Vector Base Address Register (EL3)

The VBAR_EL3 characteristics are:


Purpose 


Holds the vector base address for any exception that is taken to EL3. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       -        RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VBAR_EL3 is a 64-bit register.


Field descriptions 


The VBAR_EL3 bit assignments are: 


63 11 10 0 


Vector Base Address RES0


Bits [63:11] 
Vector Base Address. Base address of the exception vectors for exceptions taken to EL3. 


If tagged addresses are being used, bits [55:48] of VBAR_EL3 must be 0 or else the use of the vector 
address will result in a recursive exception. 


If tagged addresses are not being used, bits [63:48] of VBAR_EL3 must be 0 or else the use of the 
vector address will result in a recursive exception. 


Bits [10:0] 


Reserved, RES0.
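
The VBAR_EL3 field rules can be expressed as a simple pre-write check. This Python sketch is an illustration under stated assumptions, not a definitive implementation: the function name vbar_el3_problems is hypothetical, and whether tagged addresses are in use is passed in by the caller rather than read from any control register.

```python
def vbar_el3_problems(value, tagged):
    """List the field rules that a candidate VBAR_EL3 value would break.

    tagged: True if tagged addresses are in use (caller-supplied assumption).
    An empty list means the value satisfies the rules checked here.
    """
    problems = []
    if value & 0x7FF:
        problems.append("bits [10:0] are RES0 but nonzero")
    if tagged:
        upper, span = (value >> 48) & 0xFF, "[55:48]"
    else:
        upper, span = (value >> 48) & 0xFFFF, "[63:48]"
    if upper:
        problems.append(f"bits {span} must be 0")
    return problems
```

The check for bits [10:0] is equivalent to requiring that the vector table is aligned to 2KB.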


Accessing the VBAR_EL3: 
To access the VBAR_EL3: 


MRS <Xt>, VBAR_EL3 ; Read VBAR_EL3 into Xt 
MSR VBAR_EL3, <Xt> ; Write Xt to VBAR_EL3 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   110  1100  0000  000
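
To show how an encoding table of this kind relates to an instruction word, the following Python sketch packs the five fields and a transfer register number into a 32-bit MRS or MSR opcode. The function name sysreg_insn is hypothetical. The bit positions it uses (the low bit of op0 at bit 19, op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0], and bit 21 distinguishing reads from writes) come from the A64 MRS and MSR (register) instruction descriptions, not from this section, and should be checked there.

```python
def sysreg_insn(op0, op1, crn, crm, op2, rt, read):
    """Assemble the 32-bit A64 word for MRS (read=True) or MSR (read=False)."""
    if op0 not in (0b10, 0b11):
        raise ValueError("this sketch only handles op0 values 0b10 and 0b11")
    fields = (("op1", op1, 3), ("CRn", crn, 4), ("CRm", crm, 4),
              ("op2", op2, 3), ("Rt", rt, 5))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    word = 0xD5100000            # MSR <sysreg>, Xt with all fields zero
    if read:
        word |= 1 << 21          # set for MRS, clear for MSR
    word |= (op0 & 1) << 19      # only the low bit of op0 is encoded here
    word |= (op1 << 16) | (crn << 12) | (crm << 8) | (op2 << 5) | rt
    return word


# VBAR_EL3 uses op0=11, op1=110, CRn=1100, CRm=0000, op2=000.
print(hex(sysreg_insn(0b11, 0b110, 0b1100, 0b0000, 0b000, 0, read=True)))
```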
D7.2.99 VMPIDR_EL2, Virtualization Multiprocessor ID Register

The VMPIDR_EL2 characteristics are:


Purpose 


Holds the value of the Virtualization Multiprocessor ID. This is the value returned by Non-secure
EL1 reads of MPIDR_EL1.


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 
There are no traps or enables affecting this register. 


Configurations 


AArch64 System register VMPIDR_EL2[31:0] is architecturally mapped to AArch32 System 
register VMPIDR. 


If EL2 is not implemented, reads of this register return the value of the MPIDR_EL1, and writes to 
the register are ignored. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VMPIDR_EL2 is a 64-bit register.

Field descriptions

The VMPIDR_EL2 bit assignments are:

63    40 39  32 31   30 29  25 24 23  16 15   8 7    0
RES0     Aff3   RES1 U  RES0   MT Aff2   Aff1   Aff0

Bits [63:40]

Reserved, RES0.

Aff3, bits [39:32]

Affinity level 3. Highest level affinity field.

Bit [31]

Reserved, RES1.

U, bit [30]

Indicates a Uniprocessor system, as distinct from PE 0 in a multiprocessor system. The possible
values of this bit are:

0 Processor is part of a multiprocessor system.

1 Processor is part of a uniprocessor system.

Bits [29:25]

Reserved, RES0.
MT, bit [24] 


Indicates whether the lowest level of affinity consists of logical PEs that are implemented using a 
multithreading type approach. The possible values of this bit are: 


0 Performance of PEs at the lowest affinity level is largely independent. 


1 Performance of PEs at the lowest affinity level is very interdependent. 


Aff2, bits [23:16] 
Affinity level 2. Second highest level affinity field. 


Aff1, bits [15:8] 
Affinity level 1. Third highest level affinity field. 


Aff0, bits [7:0] 
Affinity level 0. Lowest level affinity field. 
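
The field positions listed above can be applied directly with shifts and masks. The following Python sketch decodes a 64-bit VMPIDR_EL2 value into its named fields. The function name decode_vmpidr_el2 and the sample value are illustrative assumptions only.

```python
def decode_vmpidr_el2(value):
    """Extract the named VMPIDR_EL2 fields from a 64-bit register value."""
    return {
        "Aff3": (value >> 32) & 0xFF,  # bits [39:32]
        "U": (value >> 30) & 0x1,      # bit [30]
        "MT": (value >> 24) & 0x1,     # bit [24]
        "Aff2": (value >> 16) & 0xFF,  # bits [23:16]
        "Aff1": (value >> 8) & 0xFF,   # bits [15:8]
        "Aff0": value & 0xFF,          # bits [7:0]
    }


# Made-up value: Aff3=1, bit 31 set (RES1), MT=1, Aff1=2, Aff0=3.
fields = decode_vmpidr_el2(0x0000000181000203)
```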


Accessing the VMPIDR_EL2: 
To access the VMPIDR_EL2: 


MRS <Xt>, VMPIDR_EL2 ; Read VMPIDR_EL2 into Xt 
MSR VMPIDR_EL2, <Xt> ; Write Xt to VMPIDR_EL2 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0000  0000  101

D7.2.100 VPIDR_EL2, Virtualization Processor ID Register


The VPIDR_EL2 characteristics are: 


Purpose 


Holds the value of the Virtualization Processor ID. This is the value returned by Non-secure EL1
reads of MIDR_EL1.


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register VPIDR_EL2 is architecturally mapped to AArch32 System register 
VPIDR. 


If EL2 is not implemented, reads of this register return the value of the MIDR_EL1, and writes to 
the register are ignored. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VPIDR_EL2 is a 32-bit register. 


Field descriptions 


The VPIDR_EL2 bit assignments are: 


31        24 23   20 19          16 15     4 3       0
Implementer  Variant Architecture   PartNum  Revision


Implementer, bits [31:24] 


The Implementer code. This field must hold an implementer code that has been assigned by ARM. 
Assigned codes include the following: 





Hex representation  ASCII representation  Implementer
0x41                A                     ARM Limited
0x42                B                     Broadcom Corporation
0x43                C                     Cavium Inc.
0x44                D                     Digital Equipment Corporation
0x49                I                     Infineon Technologies AG
0x4D                M                     Motorola or Freescale Semiconductor Inc.
0x4E                N                     NVIDIA Corporation
0x50                P                     Applied Micro Circuits Corporation
0x51                Q                     Qualcomm Inc.
0x56                V                     Marvell International Ltd.
0x69                i                     Intel Corporation





ARM can assign codes that are not published in this manual. All values not assigned by ARM are 
reserved and must not be used. 


Variant, bits [23:20] 


An IMPLEMENTATION DEFINED variant number. Typically, this field is used to distinguish between 
different product variants, or major revisions of a product. 


Architecture, bits [19:16] 


The permitted values of this field are: 


0001  ARMv4
0010  ARMv4T
0011  ARMv5 (obsolete)
0100  ARMv5T
0101  ARMv5TE
0110  ARMv5TEJ
0111  ARMv6
1111  Architectural features are individually identified in the ID_* registers, see Identification
      registers, functional group on page G4-4194.





All other values are reserved. 


PartNum, bits [15:4] 


An IMPLEMENTATION DEFINED primary part number for the device. 


On processors implemented by ARM, if the top four bits of the primary part number are 0x0 or 0x7,
the variant and architecture are encoded differently.


Revision, bits [3:0] 


An IMPLEMENTATION DEFINED revision number for the device. 
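
The five VPIDR_EL2 fields can be unpacked as shown in the following Python sketch. It is illustrative only: the function name decode_vpidr_el2 is hypothetical, the sample value is an arbitrary example, and the lookup table contains only the implementer codes published in the table above.

```python
# Implementer codes taken from the table in this section.
IMPLEMENTERS = {
    0x41: "ARM Limited",
    0x42: "Broadcom Corporation",
    0x43: "Cavium Inc.",
    0x44: "Digital Equipment Corporation",
    0x49: "Infineon Technologies AG",
    0x4D: "Motorola or Freescale Semiconductor Inc.",
    0x4E: "NVIDIA Corporation",
    0x50: "Applied Micro Circuits Corporation",
    0x51: "Qualcomm Inc.",
    0x56: "Marvell International Ltd.",
    0x69: "Intel Corporation",
}


def decode_vpidr_el2(value):
    """Split a 32-bit VPIDR_EL2 value into its fields."""
    implementer = (value >> 24) & 0xFF
    return {
        "implementer": implementer,
        "implementer_name": IMPLEMENTERS.get(implementer, "not listed"),
        "variant": (value >> 20) & 0xF,
        "architecture": (value >> 16) & 0xF,
        "partnum": (value >> 4) & 0xFFF,
        "revision": value & 0xF,
    }


info = decode_vpidr_el2(0x410FD034)  # example value only
```

An architecture field of 0b1111 in the result indicates that the features are identified through the ID_* registers rather than by this field.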


Accessing the VPIDR_EL2: 


To access the VPIDR_EL2: 


MRS <Xt>, VPIDR_EL2 ; Read VPIDR_EL2 into Xt 
MSR VPIDR_EL2, <Xt> ; Write Xt to VPIDR_EL2 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0000  0000  000

D7.2.101 VTCR_EL2, Virtualization Translation Control Register


The VTCR_EL2 characteristics are: 


Purpose 
Controls the translation table walks required for the stage 2 translation of memory accesses from
Non-secure EL0 and EL1, and holds cacheability and shareability information for the accesses.
Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Any of the bits in VTCR_EL2 are permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register VTCR_EL2 is architecturally mapped to AArch32 System register 
VTCR. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VTCR_EL2 is a 32-bit register. 


Field descriptions 


The VTCR_EL2 bit assignments are: 


31   30    19 18 16 15 14 13 12 11  10 9   8 7  6 5    0
RES1 RES0     PS    TG0   SH0   ORGN0  IRGN0 SL0  T0SZ

Bit [31]

Reserved, RES1.

Bits [30:19]

Reserved, RES0.

PS, bits [18:16]

Physical Address Size.

000 32 bits, 4GB.
001 36 bits, 64GB.
010 40 bits, 1TB.
011 42 bits, 4TB.
100 44 bits, 16TB.
101 48 bits, 256TB.

Other values are reserved.


The reserved values behave in the same way as the 101 encoding, but software must not rely on this 
property as the behavior of the RESERVED values might change in a future revision of the 
architecture. 


TG0, bits [15:14]
Granule size for the VTTBR_EL2. 


00 4KB 
01 64KB 
10 16KB 


Other values are reserved. 


If the value is programmed to either a reserved value, or a size that has not been implemented, then 
the hardware will treat the field as if it has been programmed to an IMPLEMENTATION DEFINED 
choice of the sizes that has been implemented for all purposes other than the value read back from 
this register. 


It is IMPLEMENTATION DEFINED whether the value read back is the value programmed or the value 
that corresponds to the size chosen. 


SH0, bits [13:12]


Shareability attribute for memory associated with translation table walks using VTTBR_EL2. 


00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 


Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


ORGN0, bits [11:10]

Outer cacheability attribute for memory associated with translation table walks using VTTBR_EL2.

00 Normal memory, Outer Non-cacheable
01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable
10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable


IRGN0, bits [9:8]


Inner cacheability attribute for memory associated with translation table walks using VTTBR_EL2. 


00 Normal memory, Inner Non-cacheable 

01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 


SL0, bits [7:6]

Starting level of the VTCR_EL2 addressed region. The meaning of this field depends on the value
of VTCR_EL2.TG0 (the granule size).

00 If TG0 is 00 (4KB granule), start at level 2. If TG0 is 10 (16KB granule) or 01 (64KB
granule), start at level 3.

01 If TG0 is 00 (4KB granule), start at level 1. If TG0 is 10 (16KB granule) or 01 (64KB
granule), start at level 2.

10 If TG0 is 00 (4KB granule), start at level 0. If TG0 is 10 (16KB granule) or 01 (64KB
granule), start at level 1.

All other values are reserved. If this field is programmed to a reserved value, or to a value that is not
consistent with the programming of T0SZ, then a stage 2 level 0 Translation fault is generated.

T0SZ, bits [5:0]

The size offset of the memory region addressed by VTTBR_EL2. The region size is 2^(64-T0SZ) bytes.

The maximum and minimum possible values for T0SZ depend on the level of translation table and
the memory translation granule size, as described in the AArch64 Virtual Memory System
Architecture chapter.

If this field is programmed to a value that is not consistent with the programming of SL0 then a stage
2 level 0 Translation fault is generated.
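
The VTCR_EL2 field tables can be combined into a small decoder. The following Python sketch is a minimal illustration: the function and table names are hypothetical, reserved encodings are reported as None rather than modelled, and no check is made that SL0 and T0SZ are consistent with each other.

```python
PS_BITS = {0b000: 32, 0b001: 36, 0b010: 40, 0b011: 42, 0b100: 44, 0b101: 48}
TG0_GRANULE = {0b00: "4KB", 0b01: "64KB", 0b10: "16KB"}
# SL0 to starting level, as listed in the SL0 field description.
SL0_LEVEL_4KB = {0b00: 2, 0b01: 1, 0b10: 0}
SL0_LEVEL_16KB_64KB = {0b00: 3, 0b01: 2, 0b10: 1}


def decode_vtcr_el2(value):
    """Decode a 32-bit VTCR_EL2 value. Reserved encodings decode to None."""
    tg0 = (value >> 14) & 0x3
    sl0 = (value >> 6) & 0x3
    t0sz = value & 0x3F
    granule = TG0_GRANULE.get(tg0)
    levels = SL0_LEVEL_4KB if tg0 == 0b00 else SL0_LEVEL_16KB_64KB
    return {
        "pa_bits": PS_BITS.get((value >> 16) & 0x7),
        "granule": granule,
        "sh0": (value >> 12) & 0x3,
        "orgn0": (value >> 10) & 0x3,
        "irgn0": (value >> 8) & 0x3,
        "start_level": levels.get(sl0) if granule else None,
        "t0sz": t0sz,
        "region_bytes": 1 << (64 - t0sz),  # region size is 2^(64-T0SZ) bytes
    }


# Example value: PS=010, TG0=00, SH0=11, ORGN0=01, IRGN0=01, SL0=01, T0SZ=24.
cfg = decode_vtcr_el2(0x80023558)
```

With T0SZ = 24, for example, the region size evaluates to 2^40 bytes.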

Accessing the VTCR_EL2:

To access the VTCR_EL2: 


MRS <Xt>, VTCR_EL2 ; Read VTCR_EL2 into Xt 
MSR VTCR_EL2, <Xt> ; Write Xt to VTCR_EL2 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0010  0001  010

D7.2.102 VTTBR_EL2, Virtualization Translation Table Base Register

The VTTBR_EL2 characteristics are:


Purpose 


Holds the base address of the translation table for the stage 2 translation of memory accesses from
Non-secure EL0 and EL1.


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Any of the fields in this register are permitted to be cached in a TLB. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register VTTBR_EL2 is architecturally mapped to AArch32 System register 
VTTBR. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VTTBR_EL2 is a 64-bit register. 


Field descriptions 


The VTTBR_EL2 bit assignments are: 


63    56 55   48 47      0
RES0     VMID    BADDR

Bits [63:56]

Reserved, RES0.

VMID, bits [55:48]

The VMID for the translation table.

BADDR, bits [47:0]

Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that
if they are not all zero, this is a misaligned translation table base address, with effects that are
CONSTRAINED UNPREDICTABLE, and must be one of the following:

• Bits [x-1:0] are treated as if all the bits are zero. The value read back from those bits is either
the value written or zero.

• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.

The AArch64 Virtual Memory System Architecture chapter describes how x is calculated based on
the value of VTCR_EL2.T0SZ, the stage of translation, and the translation granule size.

Accessing the VTTBR_EL2:

To access the VTTBR_EL2:

MRS <Xt>, VTTBR_EL2 ; Read VTTBR_EL2 into Xt
MSR VTTBR_EL2, <Xt> ; Write Xt to VTTBR_EL2


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
11   100  0010  0001  000

D7.3 Debug registers

This section lists the Debug System registers in AArch64 state, in alphabetic order:


• The principal encoding space for debug registers is op0==0b10, op1=={0, 3, 4}. Instructions for accessing
debug System registers on page C5-280 summarizes the registers in this encoding space and lists them in
order of their encodings.

• In addition, the following registers in the op0==0b11 encoding space are classified as Debug registers:
- DLR_EL0.
- DSPSR_EL0.
- MDCR_EL2.
- MDCR_EL3.
- SDER32_EL3.

D7.3.1 DBGAUTHSTATUS_EL1, Debug Authentication Status register

The DBGAUTHSTATUS_EL1 characteristics are:


Purpose 
Provides information about the state of the IMPLEMENTATION DEFINED authentication interface for 
debug. 

Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If MDCR_EL3.TDA==1, read accesses to this register from EL1 and EL2 are trapped to
EL3.


Configurations 


AArch64 System register DBGAUTHSTATUS_EL1 is architecturally mapped to AArch32 System 
register DBGAUTHSTATUS. 


AArch64 System register DBGAUTHSTATUS_EL1 is architecturally mapped to External register 
DBGAUTHSTATUS_EL1. 


Attributes 
DBGAUTHSTATUS_EL1 is a 32-bit register. 


Field descriptions 


The DBGAUTHSTATUS_EL1 bit assignments are:

31     8 7  6 5  4 3   2 1  0
RES0     SNID SID  NSNID NSID

Bits [31:8]

Reserved, RES0.


SNID, bits [7:6] 


Secure non-invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is 
Non-secure state. 

10 Implemented and disabled. ExternalSecureNoninvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalSecureNoninvasiveDebugEnabled() == TRUE. 


Other values are reserved.

SID, bits [5:4]

Secure invasive debug. Possible values of this field are:

00 Not implemented. EL3 is not implemented and the implemented Security state is
Non-secure state.

10 Implemented and disabled. ExternalSecureInvasiveDebugEnabled() == FALSE.

11 Implemented and enabled. ExternalSecureInvasiveDebugEnabled() == TRUE.

Other values are reserved.

NSNID, bits [3:2]

Non-secure non-invasive debug. Possible values of this field are:

00 Not implemented. EL3 is not implemented and the implemented Security state is Secure
state.

10 Implemented and disabled. ExternalNoninvasiveDebugEnabled() == FALSE.

11 Implemented and enabled. ExternalNoninvasiveDebugEnabled() == TRUE.

Other values are reserved.

NSID, bits [1:0]

Non-secure invasive debug. Possible values of this field are:

00 Not implemented. EL3 is not implemented and the implemented Security state is Secure
state.

10 Implemented and disabled. ExternalInvasiveDebugEnabled() == FALSE.

11 Implemented and enabled. ExternalInvasiveDebugEnabled() == TRUE.


Other values are reserved. 
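
All four DBGAUTHSTATUS_EL1 fields share the same 2-bit encoding, so a single lookup can decode them. The following Python sketch is illustrative; the function name decode_dbgauthstatus and the status strings are assumptions chosen for readability.

```python
STATES = {
    0b00: "not implemented",
    0b10: "implemented and disabled",
    0b11: "implemented and enabled",
}

# Field name to position of its least significant bit.
FIELD_SHIFTS = {"SNID": 6, "SID": 4, "NSNID": 2, "NSID": 0}


def decode_dbgauthstatus(value):
    """Map each 2-bit DBGAUTHSTATUS_EL1 field to a status string.

    The encoding 0b01 is reserved and is reported as "reserved".
    """
    return {
        name: STATES.get((value >> shift) & 0b11, "reserved")
        for name, shift in FIELD_SHIFTS.items()
    }


status = decode_dbgauthstatus(0xAF)  # SNID=10, SID=10, NSNID=11, NSID=11
```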


Accessing the DBGAUTHSTATUS_EL1: 
To access the DBGAUTHSTATUS_EL1: 


MRS <Xt>, DBGAUTHSTATUS_EL1 ; Read DBGAUTHSTATUS_EL1 into Xt 


Register access is encoded as follows: 





op0  op1  CRn   CRm   op2
10   000  0111  1110  110

D7.3.2 DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15

The DBGBCR<n>_EL1 characteristics are:
Purpose

Holds control information for a breakpoint. Forms breakpoint n together with value register
DBGBVR<n>_EL1.

Usage constraints

This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548 and Software Access debug event
on page H3-4903. Subject to the prioritization rules:

• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.

• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.

• If EDSCR.TDA==1, OSLSR_EL1.OSLK==0, and halting is allowed, EL1, EL2, and EL3
accesses to this register generate a Software Access Debug event.

Configurations

AArch64 System register DBGBCR<n>_EL1 is architecturally mapped to AArch32 System
register DBGBCR<n>.

AArch64 System register DBGBCR<n>_EL1 is architecturally mapped to External register
DBGBCR<n>_EL1.

If breakpoint n is not implemented then this register is unallocated.

This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to
architecturally UNKNOWN values. The register is not affected by a Warm reset.

Attributes

DBGBCR<n>_EL1 is a 32-bit register.

Field descriptions

The DBGBCR<n>_EL1 bit assignments are:

31    24 23  20 19   16 15 14 13  12   9 8    5 4    3 2   1 0
RES0     BT     LBN     SSC   HMC RES0   BAS    RES0   PMC   E

Bits [31:24]

Reserved, RES0.

BT, bits [23:20]

Breakpoint Type. Possible values are:

0000 Unlinked instruction address match.
0001 Linked instruction address match.
0010 Unlinked context ID match.
0011 Linked context ID match.
1000 Unlinked VMID match.
1001 Linked VMID match.
1010 Unlinked VMID and context ID match.
1011 Linked VMID and context ID match.

The field breaks down as follows:

• BT[3:1]: Base type.

000 Match address. DBGBVR<n>_EL1 is the address of an instruction.

001 Match context ID. DBGBVR<n>_EL1.ContextID is a context ID.

100 Match VMID. DBGBVR<n>_EL1.VMID is a VMID.

101 Match VMID and context ID. DBGBVR<n>_EL1.ContextID is a context ID,
and DBGBVR<n>_EL1.VMID is a VMID.

• BT[0]: Enable linking.


All other values are reserved. Constraints on breakpoint programming mean other values are 
reserved under some conditions. For more information, including the effect of programming this 
field to a reserved value, see Reserved DBGBCR<n>_EL1.BT values on page D2-1652. 


This field resets to a value that is architecturally UNKNOWN. 
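
The split of BT into a base type and a linking bit can be sketched as follows. This Python fragment is illustrative only: the function name decode_bt and the returned strings are hypothetical, and conditions under which otherwise valid encodings become reserved are not modelled.

```python
# BT[3:1] base types listed in the BT field description.
BASE_TYPES = {
    0b000: "address",
    0b001: "context ID",
    0b100: "VMID",
    0b101: "VMID and context ID",
}


def decode_bt(dbgbcr):
    """Return (base_type, linked) for the BT field, bits [23:20], of DBGBCR<n>_EL1.

    Returns ("reserved", None) if BT[3:1] is not one of the listed base types.
    """
    bt = (dbgbcr >> 20) & 0xF
    base = BASE_TYPES.get(bt >> 1)   # BT[3:1]
    if base is None:
        return ("reserved", None)
    return (base, bool(bt & 1))      # BT[0] enables linking


kind, linked = decode_bt(0b1011 << 20)  # Linked VMID and context ID match
```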


LBN, bits [19:16] 


Linked breakpoint number. For Linked address matching breakpoints, this specifies the index of the 
Context-matching breakpoint linked to. 


For all other breakpoint types this field is ignored and reads of the register return an UNKNOWN 
value. 


This field is ignored when the value of DBGBCR<n>_EL1.E is 0. 


This field resets to a value that is architecturally UNKNOWN. 


SSC, bits [15:14]

Security state control. Determines the Security states under which a Breakpoint debug event for
breakpoint n is generated. This field must be interpreted along with the HMC and PMC fields, and
there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more
information, including the effect of programming the fields to a reserved set of values, see Reserved
DBGBCR<n>_EL1.{SSC, HMC, PMC} values on page D2-1652.

This field resets to a value that is architecturally UNKNOWN.

HMC, bit [13]

Higher mode control. Determines the debug perspective for deciding when a Breakpoint debug
event for breakpoint n is generated. This field must be interpreted along with the SSC and PMC
fields, and there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more
information see the SSC, bits [15:14] description.

This field resets to a value that is architecturally UNKNOWN.

Bits [12:9]

Reserved, RES0.

BAS, bits [8:5]

Byte address select. Defines which half-words an address-matching breakpoint matches, regardless
of the instruction set and Execution state. In an AArch64-only implementation, this field is reserved,
RES1.

The permitted values depend on the breakpoint type.

For Address match breakpoints, the permitted values are:

BAS   Match instruction at  Constraint for debuggers
0011  DBGBVR<n>_EL1         Use for T32 instructions.
1100  DBGBVR<n>_EL1+2       Use for T32 instructions.
1111  DBGBVR<n>_EL1         Use for A64 and A32 instructions.

All other values are reserved. For more information, see Reserved DBGBCR<n>_EL1.BAS values
on page D2-1653.

For more information on using the BAS field in address match breakpoints, see Using the BAS field
in Address Match breakpoints on page G2-3949.

For Context matching breakpoints, this field is RES1 and ignored.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.

Bits [4:3]

Reserved, RES0.

PMC, bits [2:1]

Privilege mode control. Determines the Exception level or levels at which a Breakpoint debug event
for breakpoint n is generated. This field must be interpreted along with the SSC and HMC fields,
and there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more
information see the DBGBCR<n>_EL1.SSC description.

This field resets to a value that is architecturally UNKNOWN.

E, bit [0]

Enable breakpoint DBGBVR<n>_EL1. Possible values are:

0 Breakpoint disabled.
1 Breakpoint enabled.

This field resets to a value that is architecturally UNKNOWN.


Accessing the DBGBCR<n>_EL1: 


To access the DBGBCR<n>_EL1:

MRS <Xt>, DBGBCR<n>_EL1 ; Read DBGBCR<n>_EL1 into Xt, where n is in the range 0 to 15
MSR DBGBCR<n>_EL1, <Xt> ; Write Xt to DBGBCR<n>_EL1, where n is in the range 0 to 15

Register access is encoded as follows:

op0  op1  CRn   CRm     op2
10   000  0000  n<3:0>  101
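
As an illustration of how an encoding row such as the one above can be used, the following Python sketch turns the five fields into the generic System register name and into an MRS instruction word. The helper names are hypothetical, and the generic name form S<op0>_<op1>_C<CRn>_C<CRm>_<op2> and the MRS opcode layout are assumptions taken from the general A64 System instruction encoding, not from this section.

```python
def dbgbcr_fields(n):
    """Return (op0, op1, CRn, CRm, op2) for DBGBCR<n>_EL1, with n in the range 0 to 15."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0 to 15")
    return (0b10, 0b000, 0b0000, n, 0b101)


def sysreg_name(op0, op1, crn, crm, op2):
    """Generic register name, assumed form S<op0>_<op1>_C<CRn>_C<CRm>_<op2>."""
    return f"S{op0}_{op1}_C{crn}_C{crm}_{op2}"


def mrs_opcode(rt, op0, op1, crn, crm, op2):
    """Assumed encoding of MRS X<rt>, <register> as a 32-bit instruction word."""
    if op0 not in (2, 3) or not 0 <= rt <= 31:
        raise ValueError("unsupported op0 or Rt")
    return (0xD5200000 | (op0 << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)
```

For example, `sysreg_name(*dbgbcr_fields(3))` gives the name an assembler without built-in knowledge of DBGBCR3_EL1 could be given instead.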
D7.3.3 DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15
The DBGBVR<n>_EL1 characteristics are:


Purpose 


Holds a virtual address, or a VMID and/or a context ID, for use in breakpoint matching. Forms 
breakpoint n together with control register DBGBCR<n>_EL1. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548 and Software Access debug event 
on page H3-4903. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.
• If EDSCR.TDA==1, OSLSR_EL1.OSLK==0, and halting is allowed, EL1, EL2, and EL3
accesses to this register generate a Software Access Debug event.
Configurations 


AArch64 System register DBGBVR<n>_EL1[31:0] is architecturally mapped to AArch32 System 
register DBGBVR<n>. 


If the breakpoint is context-aware and EL2 is implemented then AArch64 System register 
DBGBVR<n>_EL1[63:32] is architecturally mapped to AArch32 System register DBGBXVR<n>. 
Otherwise there is no System register access to DBGBVR<n>_EL1[63:32] from AArch32 state. 


AArch64 System register DBGBVR<n>_EL1 is architecturally mapped to External register
DBGBVR<n>_EL1.


If breakpoint n is not implemented then this register is unallocated. 
This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 
Attributes 
How this register is interpreted depends on the value of DBGBCR<n>_EL1.BT. 
• When DBGBCR<n>_EL1.BT is 0b000x, this register holds a virtual address.
• When DBGBCR<n>_EL1.BT is 0b001x, this register holds a Context ID.
• When DBGBCR<n>_EL1.BT is 0b100x, this register holds a VMID.
• When DBGBCR<n>_EL1.BT is 0b101x, this register holds a VMID and a Context ID.
For other values of DBGBCR<n>_EL1.BT, this register is RES0.
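
The BT-dependent interpretation listed above can be sketched as a small lookup. This is only an illustration in Python. The function name is hypothetical, and it assumes BT is passed as a 4-bit integer whose low bit is the "x" in the patterns.

```python
def dbgbvr_contents(bt):
    """Return what DBGBVR<n>_EL1 holds for a 4-bit DBGBCR<n>_EL1.BT value."""
    kinds = {
        0b000: "virtual address",      # BT == 0b000x
        0b001: "Context ID",           # BT == 0b001x
        0b100: "VMID",                 # BT == 0b100x
        0b101: "VMID and Context ID",  # BT == 0b101x
    }
    # The low bit of BT is the 'x' in the patterns, so it is dropped before the lookup.
    return kinds.get((bt >> 1) & 0b111, "RES0")
```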


Field descriptions 


The DBGBVR<n>_EL1 bit assignments are: 
When DBGBCR<n>_EL1.BT==0b000x:

63      49 48                               2 1  0
RESS       VA                                 RES0

RESS, bits [63:49]

Reserved, Sign extended. Software must treat this field as RES0 if bit[48] is 0 or RES0, and as RES1
if bit[48] is 1.

Hardware always ignores the value of these bits and it is IMPLEMENTATION DEFINED whether:

• The bits are hardwired to a copy of bit [48], meaning writes to these bits are ignored, and
reads to the bits always return the hardwired value.

• The value in those bits can be written, and reads will return the last value written. The value
held in those bits is ignored by hardware.

This field resets to a value that is architecturally UNKNOWN.

VA, bits [48:2]
Bits[48:2] of the address value for comparison.

This field resets to a value that is architecturally UNKNOWN.

Bits [1:0]

Reserved, RES0.
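
A minimal Python sketch of how software might form a value for this layout follows. It keeps VA, bits [48:2], writes bits [1:0] as zero, and makes RESS a copy of bit [48], which is one way to satisfy the RES0/RES1 rule above. The function name is hypothetical.

```python
def dbgbvr_address_value(va):
    """Form a 64-bit DBGBVR<n>_EL1 value (BT == 0b000x) from a virtual address."""
    value = va & 0x0001_FFFF_FFFF_FFFC      # keep VA, bits [48:2]; bits [1:0] are RES0
    if value & (1 << 48):
        value |= 0xFFFE_0000_0000_0000      # RESS, bits [63:49], treated as RES1
    return value
```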


When DBGBCR<n>_EL1.BT==0b001x:

63                32 31                 0
RES0                 ContextID

Bits [63:32]

Reserved, RES0.

ContextID, bits [31:0]
Context ID value for comparison.

This field resets to a value that is architecturally UNKNOWN.

When DBGBCR<n>_EL1.BT==0b100x and EL2 implemented:

63        40 39   32 31                 0
RES0         VMID    RES0

Bits [63:40]
Reserved, RES0.

VMID, bits [39:32]
VMID value for comparison.

This field resets to a value that is architecturally UNKNOWN.

Bits [31:0]
Reserved, RES0.

When DBGBCR<n>_EL1.BT==0b101x and EL2 implemented:

63        40 39   32 31                 0
RES0         VMID    ContextID

Bits [63:40]
Reserved, RES0.

VMID, bits [39:32]
VMID value for comparison.

This field resets to a value that is architecturally UNKNOWN.

ContextID, bits [31:0]
Context ID value for comparison.

This field resets to a value that is architecturally UNKNOWN.

Accessing the DBGBVR<n>_EL1:
To access the DBGBVR<n>_EL1:

MRS <Xt>, DBGBVR<n>_EL1 ; Read DBGBVR<n>_EL1 into Xt, where n is in the range 0 to 15
MSR DBGBVR<n>_EL1, <Xt> ; Write Xt to DBGBVR<n>_EL1, where n is in the range 0 to 15

Register access is encoded as follows:

op0  op1  CRn   CRm     op2
10   000  0000  n<3:0>  100
D7.3.4 DBGCLAIMCLR_EL1, Debug Claim Tag Clear register


The DBGCLAIMCLR_EL1 characteristics are: 

Purpose 
Used by software to read the values of the CLAIM tag bits, and to clear these bits to 0. 
The architecture does not define any functionality for the CLAIM tag bits. 


Note

CLAIM tags are typically used for communication between the debugger and target software.

Used in conjunction with the DBGCLAIMSET_EL1 register.


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register DBGCLAIMCLR_EL1 is architecturally mapped to AArch32 System 
register DBGCLAIMCLR. 


AArch64 System register DBGCLAIMCLR_EL1 is architecturally mapped to External register 
DBGCLAIMCLR_EL1. 


An implementation must include 8 CLAIM tag bits. 


This register is in the Cold reset domain. See the CLAIM field description for the effect of a Cold 
reset on the value returned by this register. This register is not affected by a Warm reset. 


Attributes 
DBGCLAIMCLR_EL1 is a 32-bit register. 


Field descriptions 


The DBGCLAIMCLR_EL1 bit assignments are: 


31                    8 7       0
RAZ/SBZ                 CLAIM

Bits [31:8] 


Reserved, RAZ/SBZ. Software can rely on these bits reading as zero, and must use a should-be-zero 
policy on writes. Implementations must ignore writes. 


CLAIM, bits [7:0] 
Read or clear CLAIM tag bits. Reading this field returns the current value of the CLAIM tag bits.

Writing a 1 to one of these bits clears the corresponding CLAIM tag bit to 0. This is an indirect write
to the CLAIM tag bits. A single write operation can clear multiple CLAIM tag bits to 0.


Writing 0 to one of these bits has no effect. 


A cold reset clears the CLAIM tag bits to 0. 


Accessing the DBGCLAIMCLR_EL1: 
To access the DBGCLAIMCLR_EL1: 


MRS <Xt>, DBGCLAIMCLR_EL1 ; Read DBGCLAIMCLR_EL1 into Xt 
MSR DBGCLAIMCLR_EL1, <Xt> ; Write Xt to DBGCLAIMCLR_EL1 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   000  0111  1001  110
D7.3.5 DBGCLAIMSET_EL1, Debug Claim Tag Set register
The DBGCLAIMSET_EL1 characteristics are: 
Purpose 


Used by software to set the CLAIM tag bits to 1. 
The architecture does not define any functionality for the CLAIM tag bits. 


Note

CLAIM tags are typically used for communication between the debugger and target software.

Used in conjunction with the DBGCLAIMCLR_EL1 register.


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register DBGCLAIMSET_EL1 is architecturally mapped to AArch32 System
register DBGCLAIMSET.


AArch64 System register DBGCLAIMSET_EL1 is architecturally mapped to External register
DBGCLAIMSET_EL1.


An implementation must include 8 CLAIM tag bits. 


Attributes 
DBGCLAIMSET_EL1 is a 32-bit register. 


Field descriptions 


The DBGCLAIMSET_EL1 bit assignments are: 


31                    8 7       0
RAZ/SBZ                 CLAIM

Bits [31:8] 


Reserved, RAZ/SBZ. Software can rely on these bits reading as zero, and must use a should-be-zero 
policy on writes. Implementations must ignore writes. 

CLAIM, bits [7:0] 
Set CLAIM tag bits. RAO. 


Writing a 1 to one of these bits sets the corresponding CLAIM tag bit to 1. This is an indirect write
to the CLAIM tag bits. A single write operation can set multiple CLAIM tag bits to 1.

Writing 0 to one of these bits has no effect.


A cold reset clears the CLAIM tag bits to 0. 
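
The set and clear behavior of the two CLAIM registers can be modeled in a few lines. The Python class below is a toy model for illustration only. Its name and method names are hypothetical, and it models nothing beyond the read and write effects described in the two CLAIM field descriptions.

```python
class ClaimTags:
    """Toy model of the 8 CLAIM tag bits behind DBGCLAIMSET_EL1 and DBGCLAIMCLR_EL1."""

    def __init__(self):
        self.tags = 0                      # a Cold reset clears the CLAIM tag bits to 0

    def write_claimset(self, value):
        self.tags |= value & 0xFF          # writing 1 sets a tag, writing 0 has no effect

    def write_claimclr(self, value):
        self.tags &= (~value) & 0xFF       # writing 1 clears a tag, writing 0 has no effect

    def read_claimset(self):
        return 0xFF                        # DBGCLAIMSET_EL1.CLAIM is RAO

    def read_claimclr(self):
        return self.tags                   # current value of the CLAIM tag bits
```

A debugger and target software could, for example, agree that one side sets a tag through DBGCLAIMSET_EL1 and the other side polls and clears it through DBGCLAIMCLR_EL1.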


Accessing the DBGCLAIMSET_EL1: 
To access the DBGCLAIMSET_EL1: 


MRS <Xt>, DBGCLAIMSET_EL1 ; Read DBGCLAIMSET_EL1 into Xt 
MSR DBGCLAIMSET_EL1, <Xt> ; Write Xt to DBGCLAIMSET_EL1 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   000  0111  1000  110
D7.3.6 DBGDTR_EL0, Debug Data Transfer Register, half-duplex
The DBGDTR_EL0 characteristics are:


Purpose 
Transfers 64 bits of data between the PE and an external debugger. Can transfer both ways using 
only a single register. 

Usage constraints 


This register is accessible as follows: 





EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules, 
in Non-debug state: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.

• If MDCR_EL3.TDA==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.

• If MDSCR_EL1.TDCC==1, accesses to this register from EL0 are trapped to EL1.

Note

Read and write accesses to this register are not trapped in Debug state.





Configurations 


There are no configuration notes. 


Attributes 
DBGDTR_EL0 is a 64-bit register.


Field descriptions


The DBGDTR_EL0 bit assignments are:


63 32 31 0 


HighWord LowWord 


HighWord, bits [63:32] 
Writes to this register set DTRRX to the value in this field and do not change RXfull. 
Reads from this register return the value of DTRTX and do not change TXfull. 


LowWord, bits [31:0] 
Writes to this register set DTRTX to the value in this field and set TXfull to 1. 
Reads from this register return the value of DTRRX and clear RXfull to 0.
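
The half-duplex behavior of the two words can be summarized with a toy model. The Python sketch below is illustrative only. The class and method names are hypothetical, and it models just the DTRRX, DTRTX, RXfull and TXfull effects stated in the HighWord and LowWord descriptions.

```python
class Dcc:
    """Toy model of DTRRX/DTRTX and the RXfull/TXfull flags as seen through DBGDTR_EL0."""

    def __init__(self):
        self.dtrrx = self.dtrtx = 0
        self.rxfull = self.txfull = False

    def write_dbgdtr(self, value):
        """Model of MSR DBGDTR_EL0, <Xt>."""
        self.dtrrx = (value >> 32) & 0xFFFFFFFF   # HighWord sets DTRRX, RXfull unchanged
        self.dtrtx = value & 0xFFFFFFFF           # LowWord sets DTRTX
        self.txfull = True                        # and sets TXfull to 1

    def read_dbgdtr(self):
        """Model of MRS <Xt>, DBGDTR_EL0."""
        value = (self.dtrtx << 32) | self.dtrrx   # HighWord from DTRTX, LowWord from DTRRX
        self.rxfull = False                       # RXfull cleared to 0, TXfull unchanged
        return value
```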
Accessing the DBGDTR_EL0:
To access the DBGDTR_EL0:

MRS <Xt>, DBGDTR_EL0 ; Read DBGDTR_EL0 into Xt
MSR DBGDTR_EL0, <Xt> ; Write Xt to DBGDTR_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   011  0000  0100  000
D7.3.7 DBGDTRRX_EL0, Debug Data Transfer Register, Receive


The DBGDTRRX_EL0 characteristics are:


Purpose 
Transfers data from an external debugger to the PE. For example, it is used by a debugger 
transferring commands and data to a debug target. It is a component of the Debug Communications 
Channel. 

Usage constraints 


This register is accessible as follows: 





EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  RO       RO      RO       RO             RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules, 
in Non-debug state: 


• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are
trapped to EL2.

• If MDCR_EL3.TDA==1, read accesses to this register from EL0, EL1, and EL2 are trapped
to EL3.

• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.

Note

Read accesses to this register are not trapped in Debug state.





Configurations 


AArch64 System register DBGDTRRX_EL0 is architecturally mapped to AArch32 System register
DBGDTRRXint.


AArch64 System register DBGDTRRX_EL0 is architecturally mapped to External register
DBGDTRRX_EL0.


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGDTRRX_EL0 is a 32-bit register.


Field descriptions


The DBGDTRRX_EL0 bit assignments are:


31 0 


Update DTRRX 


Bits [31:0] 
Update DTRRX. 


If RXfull is set to 1, then reads of this register return the last value written to DTRRX and clear
RXfull to 0.

For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug
Communication Channel and Instruction Transfer Register.

This field resets to a value that is architecturally UNKNOWN.

Accessing the DBGDTRRX_EL0:
To access the DBGDTRRX_EL0:

MRS <Xt>, DBGDTRRX_EL0 ; Read DBGDTRRX_EL0 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   011  0000  0101  000
D7.3.8 DBGDTRTX_EL0, Debug Data Transfer Register, Transmit


The DBGDTRTX_EL0 characteristics are:


Purpose 
Transfers data from the PE to an external debugger. For example, it is used by a debug target to 
transfer data to the debugger. It is a component of the Debug Communication Channel. 

Usage constraints 


This register is accessible as follows: 











EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-WO  WO       WO      WO       WO             WO
Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules, 
in Non-debug state: 


• If MDCR_EL2.TDA==1, Non-secure write accesses to this register from EL0 and EL1 are
trapped to EL2.

• If MDCR_EL3.TDA==1, write accesses to this register from EL0, EL1, and EL2 are trapped
to EL3.

• If MDSCR_EL1.TDCC==1, write accesses to this register from EL0 are trapped to EL1.

Note

Write accesses to this register are not trapped in Debug state.





Configurations 


AArch64 System register DBGDTRTX_EL0 is architecturally mapped to AArch32 System register
DBGDTRTXint.


AArch64 System register DBGDTRTX_EL0 is architecturally mapped to External register
DBGDTRTX_EL0.


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGDTRTX_EL0 is a 32-bit register.


Field descriptions


The DBGDTRTX_EL0 bit assignments are:


31 0 
Return DTRTX 
Bits [31:0] 
Return DTRTX. 


If TXfull is set to 0, then writes of this register update the value in DTRTX and set TXfull to 1. 


For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug
Communication Channel and Instruction Transfer Register.

This field resets to a value that is architecturally UNKNOWN.

Accessing the DBGDTRTX_EL0:
To access the DBGDTRTX_EL0:

MSR DBGDTRTX_EL0, <Xt> ; Write Xt to DBGDTRTX_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   011  0000  0101  000
D7.3.9 DBGPRCR_EL1, Debug Power Control Register
The DBGPRCR_EL1 characteristics are:


Purpose 


Controls behavior of the PE on powerdown request. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDOSA==1, Non-secure accesses to this register from EL1 are trapped to
EL2.

• If MDCR_EL3.TDOSA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register DBGPRCR_EL1 is architecturally mapped to AArch32 System register 
DBGPRCR. 


Bit [0] of this register is mapped to EDPRCR.CORENPDRQ, bit [0] of the external view of this 
register. 


The other bits in these registers are not mapped to each other. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGPRCR_EL1 is a 32-bit register.


Field descriptions 


The DBGPRCR_EL1 bit assignments are: 


31                                   1 0
RES0                                   CORENPDRQ

Bits [31:1]

Reserved, RES0.

CORENPDRQ, bit [0]
Core no powerdown request. Requests emulation of powerdown. Possible values of this bit are:
0 If the system responds to a powerdown request, it powers down Core power domain.
1 If the system responds to a powerdown request, it does not powerdown the Core power
domain, but instead emulates a powerdown of that domain.

Writes to this bit are permitted regardless of the state of the IMPLEMENTATION DEFINED
authentication interface. This means that a debugger can request Core no powerdown regardless of
whether invasive debug is permitted.

It is IMPLEMENTATION DEFINED whether this bit is reset to the value of EDPRCR.COREPURQ on
exit from an IMPLEMENTATION DEFINED software-visible retention state.

Accessing the DBGPRCR_EL1: 

To access the DBGPRCR_EL1: 


MRS <Xt>, DBGPRCR_EL1 ; Read DBGPRCR_EL1 into Xt 
MSR DBGPRCR_EL1, <Xt> ; Write Xt to DBGPRCR_EL1 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   000  0001  0100  100
D7.3.10 DBGVCR32_EL2, Debug Vector Catch Register


The DBGVCR32_EL2 characteristics are: 


Purpose 
Allows access to the AArch32 register DBGVCR from AArch64 state only. Its value has no effect
on execution in AArch64 state.

Usage constraints 


This register is accessible as follows: 





ELO EL1(NS) EL1(S) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 





- - - RW RW RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL3.TDA==1, accesses to this register from EL2 are trapped to EL3.


Configurations 


AArch64 System register DBGVCR32_EL2 is architecturally mapped to AArch32 System register 
DBGVCR. 


If EL1 does not support AArch32, this register is UNDEFINED. 


If EL2 is not implemented but EL3 is implemented, and EL1 is capable of using AArch32, then this
register is not RES0.


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
DBGVCR32_EL2 is a 32-bit register. 


Field descriptions 


The DBGVCR32_EL2 bit assignments are: 


When EL3 implemented and using AArch64: 


31   30   29    28   27   26   25   24       8  7   6   5     4   3   2   1   0
NSF  NSI  RES0  NSD  NSP  NSS  NSU  RES0        SF  SI  RES0  SD  SP  SS  SU  RES0

NSF, bit [31]
FIQ vector catch enable in Non-secure state.
The exception vector offset is 0x1C.

This field resets to a value that is architecturally UNKNOWN.

NSI, bit [30]
IRQ vector catch enable in Non-secure state.
The exception vector offset is 0x18.

This field resets to a value that is architecturally UNKNOWN.

Bit [29]
Reserved, RES0.

NSD, bit [28]
Data Abort vector catch enable in Non-secure state.
The exception vector offset is 0x10.

This field resets to a value that is architecturally UNKNOWN.

NSP, bit [27]
Prefetch Abort vector catch enable in Non-secure state.
The exception vector offset is 0x0C.

This field resets to a value that is architecturally UNKNOWN.

NSS, bit [26]
Supervisor Call (SVC) vector catch enable in Non-secure state.
The exception vector offset is 0x08.

This field resets to a value that is architecturally UNKNOWN.

NSU, bit [25]
Undefined Instruction vector catch enable in Non-secure state.
The exception vector offset is 0x04.

This field resets to a value that is architecturally UNKNOWN.

Bits [24:8]
Reserved, RES0.

SF, bit [7]
FIQ vector catch enable in Secure state.
The exception vector offset is 0x1C.

This field resets to a value that is architecturally UNKNOWN.

SI, bit [6]
IRQ vector catch enable in Secure state.
The exception vector offset is 0x18.

This field resets to a value that is architecturally UNKNOWN.

Bit [5]
Reserved, RES0.

SD, bit [4]
Data Abort vector catch enable in Secure state.
The exception vector offset is 0x10.

This field resets to a value that is architecturally UNKNOWN.

SP, bit [3]
Prefetch Abort vector catch enable in Secure state.
The exception vector offset is 0x0C.

This field resets to a value that is architecturally UNKNOWN.

SS, bit [2]
Supervisor Call (SVC) vector catch enable in Secure state.
The exception vector offset is 0x08.

This field resets to a value that is architecturally UNKNOWN.

SU, bit [1]
Undefined Instruction vector catch enable in Secure state.
The exception vector offset is 0x04.

This field resets to a value that is architecturally UNKNOWN.

Bit [0]

Reserved, RES0.
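
In the layout above, each Secure enable bit sits at the bit position equal to its exception vector offset divided by 4, and the matching Non-secure bit is 24 positions higher. That regularity is an observation drawn from the field list, not a rule stated by the architecture. The Python sketch below uses it, with hypothetical names, to locate an enable bit.

```python
# Vector offsets as listed in the field descriptions (U = Undefined Instruction,
# S = Supervisor Call, P = Prefetch Abort, D = Data Abort, I = IRQ, F = FIQ).
VECTOR_OFFSETS = {"U": 0x04, "S": 0x08, "P": 0x0C, "D": 0x10, "I": 0x18, "F": 0x1C}


def vector_catch_bit(name, non_secure=False):
    """Bit position of a vector catch enable in DBGVCR32_EL2 (EL3 using AArch64 layout)."""
    bit = VECTOR_OFFSETS[name] // 4           # SU..SF occupy bits [1]..[7]
    return bit + 24 if non_secure else bit    # NSU..NSF occupy bits [25]..[31]
```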


When EL3 not implemented: 


Bits [31:8]

Reserved, RES0.

F, bit [7]
FIQ vector catch enable.
The exception vector offset is 0x1C.

This field resets to a value that is architecturally UNKNOWN.

I, bit [6]
IRQ vector catch enable.
The exception vector offset is 0x18.

This field resets to a value that is architecturally UNKNOWN.

Bit [5]
Reserved, RES0.

D, bit [4]
Data Abort vector catch enable.
The exception vector offset is 0x10.

This field resets to a value that is architecturally UNKNOWN.

P, bit [3]
Prefetch Abort vector catch enable.
The exception vector offset is 0x0C.

This field resets to a value that is architecturally UNKNOWN.

S, bit [2]
Supervisor Call (SVC) vector catch enable.
The exception vector offset is 0x08.

This field resets to a value that is architecturally UNKNOWN.

U, bit [1]
Undefined Instruction vector catch enable.
The exception vector offset is 0x04.

This field resets to a value that is architecturally UNKNOWN.

Bit [0]

Reserved, RES0.


Accessing the DBGVCR32_EL2: 
To access the DBGVCR32_EL2: 


MRS <Xt>, DBGVCR32_EL2 ; Read DBGVCR32_EL2 into Xt 
MSR DBGVCR32_EL2, <Xt> ; Write Xt to DBGVCR32_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
10   100  0000  0111  000
D7.3.11 DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15
The DBGWCR<n>_EL1 characteristics are:


Purpose 


Holds control information for a watchpoint. Forms watchpoint n together with value register 
DBGWVR<n>_EL1. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548 and Software Access debug event 
on page H3-4903. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.
• If EDSCR.TDA==1, OSLSR_EL1.OSLK==0, and halting is allowed, EL1, EL2, and EL3
accesses to this register generate a Software Access Debug event.


Configurations 


AArch64 System register DBGWCR<n>_EL1 is architecturally mapped to AArch32 System 
register DBGWCR<n>. 


AArch64 System register DBGWCR<n>_EL1 is architecturally mapped to External register
DBGWCR<n>_EL1.


If breakpoint n is not implemented then this register is unallocated. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGWCR<n>_EL1 is a 32-bit register. 


Field descriptions 


The DBGWCR<n>_EL1 bit assignments are: 


31 29 28  24 23 21 20 19 16 15 14 13  12   5 4 3 2 1 0
RES0  MASK   RES0  WT LBN   SSC   HMC BAS    LSC PAC E


Bits [31:29] 
Reserved, RES0.

MASK, bits [28:24] 
Address mask. Only objects up to 2GB can be watched using a single mask.
00000 No mask.
00001 Reserved.
00010 Reserved.

If programmed with a reserved value, a watchpoint must behave as if either:

• MASK has been programmed with a defined value, which might be 0 (no mask), other than
for a direct read of DBGWCR<n>_EL1.

• The watchpoint is disabled.

Software must not rely on this property because the behavior of reserved values might change in a
future revision of the architecture.

Other values mask the corresponding number of address bits, from 0b00011 masking 3 address bits
(0x00000007 mask for address) to 0b11111 masking 31 address bits (0x7FFFFFFF mask for address).


This field resets to a value that is architecturally UNKNOWN. 
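
The relationship between a MASK value and the resulting address mask can be written out directly. The Python sketch below is illustrative, and its function name is hypothetical. It rejects the two reserved values rather than modeling their behavior.

```python
def watchpoint_address_mask(mask):
    """Return the address mask selected by a 5-bit DBGWCR<n>_EL1.MASK value."""
    if not 0 <= mask <= 0b11111:
        raise ValueError("MASK is a 5-bit field")
    if mask in (0b00001, 0b00010):
        raise ValueError("reserved MASK value")
    return (1 << mask) - 1      # 0b00000 gives 0, that is, no mask
```

A watched region then consists of the addresses that agree with the value register outside the masked low-order bits.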


Bits [23:21] 
Reserved, RES0.


WT, bit [20] 
Watchpoint type. Possible values are: 
0 Unlinked data address match. 
1 Linked data address match. 


This field resets to a value that is architecturally UNKNOWN. 


LBN, bits [19:16] 


Linked breakpoint number. For Linked data address watchpoints, this specifies the index of the 
Context-matching breakpoint linked to. 


This field resets to a value that is architecturally UNKNOWN. 


SSC, bits [15:14] 


Security state control. Determines the Security states under which a Watchpoint debug event for 
watchpoint n is generated. This field must be interpreted along with the HMC and PAC fields, see 
Execution conditions for which a watchpoint generates Watchpoint exceptions on page D2-1660. 


This field resets to a value that is architecturally UNKNOWN. 


HMC, bit [13] 


Higher mode control. Determines the debug perspective for deciding when a Watchpoint debug 
event for watchpoint n is generated. This field must be interpreted along with the SSC and PAC 
fields, see Execution conditions for which a watchpoint generates Watchpoint exceptions on 
page D2-1660. 


This field resets to a value that is architecturally UNKNOWN. 
BAS, bits [12:5] 


Byte address select. Each bit of this field selects whether a byte from within the word or 
double-word addressed by DBGWVR<n>_EL1 is being watched. 





BAS       Description
xxxxxxx1  Match byte at DBGWVR<n>_EL1
xxxxxx1x  Match byte at DBGWVR<n>_EL1+1
xxxxx1xx  Match byte at DBGWVR<n>_EL1+2
xxxx1xxx  Match byte at DBGWVR<n>_EL1+3










In cases where DBGWVR<n>_EL1 addresses a double-word:

BAS       Description, if DBGWVR<n>_EL1[2] == 0
xxx1xxxx  Match byte at DBGWVR<n>_EL1+4
xx1xxxxx  Match byte at DBGWVR<n>_EL1+5
x1xxxxxx  Match byte at DBGWVR<n>_EL1+6
1xxxxxxx  Match byte at DBGWVR<n>_EL1+7





If DBGWVR<n>_EL1[2] == 1, only BAS[3:0] are used and BAS[7:4] are ignored. ARM 
deprecates setting DBGWVR<n>_EL1[2] == 1. 


The valid values for BAS are non-zero binary numbers all of whose set bits are contiguous. All other 
values are reserved and must not be used by software. See Reserved DBGWCR<n>_EL1.BAS 
values on page D2-1668. 


This field resets to a value that is architecturally UNKNOWN. 
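
The rule that a valid BAS value is non-zero with all of its set bits contiguous can be checked with a short bit trick. The Python sketch below is illustrative, and the function name is hypothetical.

```python
def bas_is_valid(bas):
    """True if an 8-bit BAS value is non-zero and all of its set bits are contiguous."""
    if not 0 < bas <= 0xFF:
        return False
    low = bas & -bas                    # lowest set bit
    # Adding the lowest set bit carries through one run of ones. Any bit of the
    # original value that survives the AND belongs to a second, separate run.
    return ((bas + low) & bas) == 0
```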


LSC, bits [4:3] 


Load/store control. This field enables watchpoint matching on the type of access being made. 
Possible values of this field are: 


01 Match instructions that load from a watchpointed address. 
10 Match instructions that store to a watchpointed address. 
11 Match instructions that load from or store to a watchpointed address. 


All other values are reserved, but must behave as if the watchpoint is disabled. Software must not 
rely on this property as the behavior of reserved values might change in a future revision of the 
architecture. 


This field resets to a value that is architecturally UNKNOWN. 


PAC, bits [2:1] 


Privilege of access control. Determines the Exception level or levels at which a Watchpoint debug
event for watchpoint n is generated. This field must be interpreted along with the SSC and HMC
fields, see Execution conditions for which a watchpoint generates Watchpoint exceptions on
page D2-1660.


This field resets to a value that is architecturally UNKNOWN. 


E, bit [0] 
Enable watchpoint n. Possible values are: 
0 Watchpoint disabled. 
1 Watchpoint enabled. 


This field resets to a value that is architecturally UNKNOWN. 
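
Putting the field positions above together, a register value can be assembled as in the following Python sketch. It is illustrative only. The function name is hypothetical, and it checks only that each field fits its width, not the constraints on combinations of HMC, SSC and PAC or the reserved values of MASK, BAS and LSC.

```python
def dbgwcr_value(*, mask=0, wt=0, lbn=0, ssc=0, hmc=0, bas=0, lsc=0, pac=0, e=0):
    """Pack DBGWCR<n>_EL1 fields into a 32-bit value. Reserved bits are left as 0."""
    fields = [            # (value, lowest bit, width)
        (mask, 24, 5), (wt, 20, 1), (lbn, 16, 4), (ssc, 14, 2), (hmc, 13, 1),
        (bas, 5, 8), (lsc, 3, 2), (pac, 1, 2), (e, 0, 1),
    ]
    value = 0
    for field, shift, width in fields:
        if not 0 <= field < (1 << width):
            raise ValueError("field value does not fit its width")
        value |= field << shift
    return value
```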


Accessing the DBGWCR<n>_EL1: 
To access the DBGWCR<n>_EL1:

MRS <Xt>, DBGWCR<n>_EL1 ; Read DBGWCR<n>_EL1 into Xt, where n is in the range 0 to 15
MSR DBGWCR<n>_EL1, <Xt> ; Write Xt to DBGWCR<n>_EL1, where n is in the range 0 to 15

Register access is encoded as follows:

op0  op1  CRn   CRm     op2
10   000  0000  n<3:0>  111
D7.3.12 DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15
The DBGWVR<n>_EL1 characteristics are:


Purpose 


Holds a data address value for use in watchpoint matching. Forms watchpoint n together with 
control register DBGWCR<n>_EL1. 


Usage constraints 


This register is accessible as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548 and Software Access debug event 
on page H3-4903. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.
• If EDSCR.TDA==1, OSLSR_EL1.OSLK==0, and halting is allowed, EL1, EL2, and EL3
accesses to this register generate a Software Access Debug event.


Configurations 


AArch64 System register DBGWVR<n>_EL1[31:0] is architecturally mapped to AArch32 System 
register DBGWVR<n>. 


AArch64 System register DBGWVR<n>_EL1 is architecturally mapped to External register
DBGWVR<n>_EL1.


If breakpoint n is not implemented then this register is unallocated. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGWVR<n>_EL1 is a 64-bit register.


Field descriptions 


The DBGWVR<n>_EL1 bit assignments are: 


63 49 48 2 10 


RESS VA 


a 


RESS, bits [63:49] 


Reserved, Sign extended. Hardware and software must treat this field as RESO if bit[48] is 0 or RESO, 
and as RES1 if bit[48] is 1. 


Hardware always ignores the value of these bits and it is IMPLEMENTATION DEFINED whether: 


° The bits are hardwired to a copy of bit [48], meaning writes to these bits are ignored, and 
reads to the bits always return the hardwired value. 







• The value in those bits can be written, and reads will return the last value written. The value
held in those bits is ignored by hardware. 


This field resets to a value that is architecturally UNKNOWN. 


VA, bits [48:2] 

Bits[48:2] of the address value for comparison. 

ARM deprecates setting DBGWVR<n>_EL1[2] == 1. 

This field resets to a value that is architecturally UNKNOWN. 
Bits [1:0] 


Reserved, RES0.
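As an illustration of the RESS, VA, and bits [1:0] rules, the following Python sketch builds a DBGWVR<n>_EL1 value from a 64-bit address. The helper name make_dbgwvr is hypothetical and is not defined by the architecture. The sketch assumes that software writes RESS as a copy of bit [48], as the RESS description requires.

```python
MASK64 = (1 << 64) - 1


def make_dbgwvr(address: int) -> int:
    """Return a DBGWVR<n>_EL1 value for `address` (hypothetical helper)."""
    address &= MASK64
    # Keep VA, bits [48:2]. Bits [1:0] are RES0 and are written as zero.
    value = address & (((1 << 49) - 1) & ~0b11)
    # RESS, bits [63:49], is written as a copy of bit [48].
    if (value >> 48) & 1:
        value |= ((1 << 15) - 1) << 49
    return value
```

Because ARM deprecates setting DBGWVR<n>_EL1[2] to 1, a caller would normally pass a doubleword-aligned address.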


Accessing the DBGWVR<n>_EL1: 
To access the DBGWVR<n>_EL1:


MRS <Xt>, DBGWVR<n>_EL1 ; Read DBGWVR<n>_EL1 into Xt, where n is in the range 0 to 15
MSR DBGWVR<n>_EL1, <Xt> ; Write Xt to DBGWVR<n>_EL1, where n is in the range 0 to 15


Register access is encoded as follows: 





op0   op1   CRn    CRm      op2
10    000   0000   n<3:0>   110
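The encoding fields above can be packed into an instruction word. The following Python sketch does this for DBGWVR<n>_EL1, assuming the standard A64 System register instruction layout (op0 low bit at bit [19], op1 at bits [18:16], CRn at bits [15:12], CRm at bits [11:8], op2 at bits [7:5], Rt at bits [4:0]). The function name is hypothetical.

```python
def dbgwvr_access_word(n: int, rt: int, read: bool) -> int:
    """Encode MRS (read=True) or MSR (read=False) for DBGWVR<n>_EL1."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0 to 15")
    if not 0 <= rt <= 31:
        raise ValueError("rt must be in the range 0 to 31")
    # Encoding fields taken from the table: CRm carries the watchpoint number.
    op0, op1, crn, crm, op2 = 0b10, 0b000, 0b0000, n, 0b110
    # Base opcodes for System register access with op0 == 0b1x.
    base = 0xD5300000 if read else 0xD5100000
    return (base | ((op0 & 1) << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)
```

For example, dbgwvr_access_word(0, 0, True) gives the word for MRS X0, DBGWVR0_EL1.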
D7.3.13 DLR_EL0, Debug Link Register
The DLR_EL0 characteristics are:


Purpose 


In Debug state, holds the address to restart from. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW





Access to this register is from Debug state only. During normal execution this register is 
unallocated. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register DLR_EL0[31:0] is architecturally mapped to AArch32 System register 
DLR. 


Attributes 
DLR_EL0 is a 64-bit register.


Field descriptions 


The DLR_EL0 bit assignments are:

Bits     Field
[63:0]   Restart address


Bits [63:0] 


Restart address. 


Accessing the DLR_EL0:
To access the DLR_EL0:


MRS <Xt>, DLR_EL0 ; Read DLR_EL0 into Xt
MSR DLR_EL0, <Xt> ; Write Xt to DLR_EL0


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
11    011   0100   0101   001
D7.3.14 DSPSR_EL0, Debug Saved Program Status Register
The DSPSR_EL0 characteristics are:


Purpose 


Holds the saved process state on entry to Debug state. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
RW    RW        RW       RW        RW              RW





Access to this register is from Debug state only. During normal execution this register is 
unallocated. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
AArch64 System register DSPSR_EL0 is architecturally mapped to AArch32 System register
DSPSR. 

Attributes 


DSPSR_EL0 is a 32-bit register.


Field descriptions 


The DSPSR_EL0 bit assignments are:


When exiting Debug state to AArch32:

Bits      Field
[31]      N
[30]      Z
[29]      C
[28]      V
[27]      Q
[26:25]   IT[1:0]
[24]      J
[23:22]   RES0
[21]      SS
[20]      IL
[19:16]   GE
[15:10]   IT[7:2]
[9]       E
[8]       A
[7]       I
[6]       F
[5]       T
[4]       M[4]
[3:0]     M[3:0]





N, bit [31] 
Copied to CPSR.N on exiting Debug state. 

Z, bit [30] 
Copied to CPSR.Z on exiting Debug state. 

C, bit [29] 
Copied to CPSR.C on exiting Debug state. 

V, bit [28] 
Copied to CPSR.V on exiting Debug state. 

Q, bit [27] 
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some 
instructions. 

IT[1:0], bits [26:25]
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:22] 


Reserved, RES0.


SS, bit [21] 
Software step. Shows the value of PSTATE.SS immediately before Debug state was entered. 


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before Debug state was 
entered. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
condition code specified by the first condition field of the IT instruction.


• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
conditionally executed, by the position of the least significant 1 in this field. It also encodes
the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 
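Because the IT field is split across two parts of the register, the following Python sketch reassembles IT[7:0] from an AArch32-format DSPSR_EL0 value and separates the two parts described above. The function name is hypothetical. The mapping from the position of the least significant 1 to an instruction count (bit 0 for four remaining instructions, bit 3 for one) is an assumption taken from the usual T32 ITSTATE convention and is not spelled out in this section.

```python
def decode_it_state(dspsr: int):
    """Return (it, base_cond, remaining) for an AArch32-format DSPSR value."""
    it_low = (dspsr >> 25) & 0b11        # IT[1:0], bits [26:25]
    it_high = (dspsr >> 10) & 0b111111   # IT[7:2], bits [15:10]
    it = (it_high << 2) | it_low
    base_cond = it >> 5                  # IT[7:5], top 3 bits of the condition
    low4 = it & 0b1111
    if low4 == 0:
        remaining = 0                    # no IT block active
    else:
        # Position of the least significant 1 within IT[3:0].
        lsb_pos = (low4 & -low4).bit_length() - 1
        remaining = 4 - lsb_pos
    return it, base_cond, remaining
```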


E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.


When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.


If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.


If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.


Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.


A, bit [8] 
SError interrupt mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the Debug state entry
was taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.

M[4], bit [4]
Execution state that Debug state was entered from. Possible values of this bit are:
1 Exception taken from AArch32.

M[3:0], bits [3:0]


AArch32 mode that Debug state was entered from. The possible values are: 





M[3:0]   Mode
0b0000   User
0b0001   FIQ
0b0010   IRQ
0b0011   Supervisor
0b0110   Monitor
0b0111   Abort
0b1010   Hyp
0b1011   Undefined
0b1111   System





Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5492. 


When entering Debug state from AArch64 and exiting Debug state to AArch64: 


Bits      Field
[31]      N
[30]      Z
[29]      C
[28]      V
[27:22]   RES0
[21]      SS
[20]      IL
[19:10]   RES0
[9]       D
[8]       A
[7]       I
[6]       F
[5]       RES0
[4]       M[4]
[3:0]     M[3:0]





N, bit [31] 
Set to the value of the N condition flag on entering Debug state, and copied to the N condition flag 
on exiting Debug state. 

Z, bit [30]
Set to the value of the Z condition flag on entering Debug state, and copied to the Z condition flag 
on exiting Debug state. 

C, bit [29] 
Set to the value of the C condition flag on entering Debug state, and copied to the C condition flag 
on exiting Debug state. 

V, bit [28] 


Set to the value of the V condition flag on entering Debug state, and copied to the V condition flag 
on exiting Debug state. 


Bits [27:22] 

Reserved, RES0.

SS, bit [21] 

Software step. Shows the value of PSTATE.SS immediately before Debug state was entered. 
IL, bit [20] 


Illegal Execution state bit. Shows the value of PSTATE.IL immediately before Debug state was 
entered. 


Bits [19:10] 


Reserved, RES0.





D, bit [9] 
Process state D mask. The possible values of this bit are: 
0 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception 
level are not masked. 
1 Watchpoint, Breakpoint, and Software Step exceptions targeted at the current Exception 
level are masked. 
When the target Exception level of the debug exception is higher than the current Exception level, 
the exception is not masked by this bit. 
A, bit [8] 
SError interrupt mask bit. The possible values of this bit are: 
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

Bit [5]
Reserved, RES0.

M[4], bit [4]
Execution state that Debug state was entered from. Possible values of this bit are:
0 Exception taken from AArch64.

M[3:0], bits [3:0]
AArch64 mode that Debug state was entered from. The possible values are:





M[3:0]   Mode
0b0000   EL0t
0b0100   EL1t
0b0101   EL1h
0b1000   EL2t
0b1001   EL2h
0b1100   EL3t
0b1101   EL3h





Other values are reserved, and returning to an Exception level that is using AArch64 with a reserved 
value in this field is treated as an illegal exception return. 


The bits in this field are interpreted as follows: 
• M[3:2] holds the Exception Level.
• M[1] is unused and is RES0 for all non-reserved values.
• M[0] is used to select the SP:
- 0 means the SP is always SP0.
- 1 means the exception SP is determined by the EL.
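The interpretation of M[3:0] for the AArch64 format can be sketched in Python as follows. The function name is hypothetical, and returning None for a reserved value is a choice made for this sketch only; the architecture treats a reserved value as an illegal exception return.

```python
def decode_aarch64_mode(dspsr: int):
    """Return a mode name such as 'EL1h', or None for a reserved M[3:0]."""
    if (dspsr >> 4) & 1:
        raise ValueError("M[4] == 1: Debug state was entered from AArch32")
    m = dspsr & 0b1111
    el = m >> 2                  # M[3:2] holds the Exception level
    uses_sp_elx = bool(m & 1)    # M[0] selects the SP
    # M[1] is RES0 for all non-reserved values, and EL0 has no 'h' variant.
    if (m & 0b10) or (el == 0 and uses_sp_elx):
        return None
    return f"EL{el}{'h' if uses_sp_elx else 't'}"
```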


Accessing the DSPSR_EL0:
To access the DSPSR_EL0:


MRS <Xt>, DSPSR_EL0 ; Read DSPSR_EL0 into Xt
MSR DSPSR_EL0, <Xt> ; Write Xt to DSPSR_EL0


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
11    011   0100   0101   000
D7.3.15 MDCCINT_EL1, Monitor DCC Interrupt Enable Register
The MDCCINT_EL1 characteristics are:


Purpose 


Enables interrupt requests to be signaled based on the DCC status flags. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register MDCCINT_EL1 is architecturally mapped to AArch32 System register
DBGDCCINT. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch64. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 

Attributes 


MDCCINT_EL1 is a 32-bit register. 


Field descriptions 


The MDCCINT_EL1 bit assignments are: 


Bits     Field
[31]     RES0
[30]     RX
[29]     TX
[28:0]   RES0


Bit [31] 


Reserved, RES0.


RX, bit [30] 


DCC interrupt request enable control for DTRRX. Enables a common COMMIRQ interrupt request 
to be signaled based on the DCC status flags. 


0 No interrupt request generated by DTRRX. 
1 Interrupt request will be generated on RXfull == 1. 


If legacy COMMRX and COMMTX signals are implemented, then these are not affected by the 
value of this bit. 


When this register has an architecturally-defined reset value, this field resets to 0. 

TX, bit [29]


DCC interrupt request enable control for DTRTX. Enables a common COMMIRQ interrupt request
to be signaled based on the DCC status flags.


0 No interrupt request generated by DTRTX. 
1 Interrupt request will be generated on TXfull == 0. 


If legacy COMMRX and COMMTX signals are implemented, then these are not affected by the 
value of this bit. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [28:0] 


Reserved, RES0.
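The RX and TX enable rules can be combined into a single condition for the common COMMIRQ request. The following Python sketch is a simplified model with a hypothetical function name; it takes the DCC status flags RXfull and TXfull as inputs and ignores any legacy COMMRX and COMMTX signals.

```python
def commirq_requested(mdccint: int, rxfull: bool, txfull: bool) -> bool:
    """Model whether COMMIRQ is requested for the given enables and flags."""
    rx_enabled = bool((mdccint >> 30) & 1)   # RX, bit [30]
    tx_enabled = bool((mdccint >> 29) & 1)   # TX, bit [29]
    # RX requests on RXfull == 1, TX requests on TXfull == 0.
    return (rx_enabled and rxfull) or (tx_enabled and not txfull)
```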


Accessing the MDCCINT_EL1: 
To access the MDCCINT_EL1:


MRS <Xt>, MDCCINT_EL1 ; Read MDCCINT_EL1 into Xt 
MSR MDCCINT_EL1, <Xt> ; Write Xt to MDCCINT_EL1 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
10    000   0000   0010   000
D7.3.16 MDCCSR_EL0, Monitor DCC Status Register
The MDCCSR_EL0 characteristics are:


Purpose 
Main control register for the debug implementation, containing flow-control flags for the DCC. This 
is an internal, read-only view. 

Usage constraints 


This register is accessible as follows: 





EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-RO   RO        RO       RO        RO              RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are
trapped to EL2.


• If MDCR_EL3.TDA==1, read accesses to this register from EL0, EL1, and EL2 are trapped
to EL3.


• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.


Attributes 
MDCCSR_EL0 is a 32-bit register.


Field descriptions 


The MDCCSR_EL0 bit assignments are:

Bits      Field
[31]      RES0
[30]      RXfull
[29]      TXfull
[28:19]   RES0
[18:15]   RAZ
[14:13]   RES0
[12]      RAZ
[11:6]    RES0
[5:2]     RAZ
[1:0]     RES0


Bit [31] 
Reserved, RES0.


RXfull, bit [30]
DTRRX full. Read-only view of the equivalent bit in the EDSCR.


TXfull, bit [29]
DTRTX full. Read-only view of the equivalent bit in the EDSCR.

Bits [28:19]
Reserved, RES0.


Bits [18:15] 


Reserved. Hardware must implement this field as RAZ. Software must not rely on the register 
reading as zero. 

Bits [14:13]
Reserved, RES0.

Bit [12]
Reserved. Hardware must implement this field as RAZ. Software must not rely on the register
reading as zero.

Bits [11:6]
Reserved, RES0.

Bits [5:2]
Reserved. Hardware must implement this field as RAZ. Software must not rely on the register
reading as zero.

Bits [1:0]
Reserved, RES0.
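Since software must not rely on the reserved bits reading as zero, a reader of this register would normally mask out everything except the two flags. A minimal Python sketch, with hypothetical names, is:

```python
RXFULL = 1 << 30   # RXfull, bit [30]
TXFULL = 1 << 29   # TXfull, bit [29]


def dcc_status(mdccsr: int):
    """Return (rxfull, txfull), ignoring all reserved bits."""
    return bool(mdccsr & RXFULL), bool(mdccsr & TXFULL)
```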


Accessing the MDCCSR_EL0:
To access the MDCCSR_EL0:
MRS <Xt>, MDCCSR_EL0 ; Read MDCCSR_EL0 into Xt


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
10    011   0000   0001   000
D7.3.17 MDCR_EL2, Monitor Debug Configuration Register (EL2)
The MDCR_EL2 characteristics are: 


Purpose 


Provides EL2 configuration options for self-hosted debug and the Performance Monitors Extension. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL3.TDA==1, accesses to this register from EL2 are trapped to EL3.


Configurations 


AArch64 System register MDCR_EL2 is architecturally mapped to AArch32 System register 
HDCR. 


If EL2 is not implemented, this register is RES0 from EL3.


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch64. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 

Attributes 


MDCR_EL2 is a 32-bit register. 


Field descriptions 


The MDCR_EL2 bit assignments are: 


Bits      Field
[31:12]   RES0
[11]      TDRA
[10]      TDOSA
[9]       TDA
[8]       TDE
[7]       HPME
[6]       TPM
[5]       TPMCR
[4:0]     HPMN


Bits [31:12] 
Reserved, RES0.

TDRA, bit [11]


Trap Debug ROM Address register access. Traps Non-secure System register accesses to the Debug
ROM registers to EL2. This trap is from:


• Non-secure EL0 using AArch32.


• Non-secure EL1, regardless of which Execution state it is using.

0 Non-secure EL0 and EL1 System register accesses to the Debug ROM registers are not
trapped to EL2.

1 Non-secure EL0 and EL1 System register accesses to the Debug ROM registers are
trapped to EL2.


The registers for which accesses are trapped are as follows:
AArch64: MDRAR_EL1.
AArch32: DBGDRAR, DBGDSAR.


If MDCR_EL2.TDE == 1 or HCR_EL2.TGE == 1, behavior is as if this bit is 1 other than for the
purpose of a direct read.


This field resets to a value that is architecturally UNKNOWN. 
TDOSA, bit [10] 


Trap debug OS-related register access. Traps Non-secure EL1 System register accesses to the 
powerdown debug registers to EL2, from both Execution states: 


0 Non-secure EL1 System register accesses to the powerdown debug registers are not
trapped to EL2. 

1 Non-secure EL1 System register accesses to the powerdown debug registers are trapped 
to EL2. 


The registers for which accesses are trapped are as follows: 
AArch64: OSLAR_EL1, OSLSR_EL1, OSDLR_EL1, and the DBGPRCR_EL1. 
AArch32: DBGOSLSR, DBGOSLAR, DBGOSDLR, and the DBGPRCR. 


AArch64 and AArch32: Any IMPLEMENTATION DEFINED register with similar functionality that the 
implementation specifies as trapped by this bit. 


Note


These registers are not accessible at EL0.


If MDCR_EL2.TDE == 1 or HCR_EL2.TGE == 1, behavior is as if this bit is 1 other than for the
purpose of a direct read.


This field resets to a value that is architecturally UNKNOWN. 





TDA, bit [9] 

Trap Debug Access. Traps Non-secure EL0 and EL1 System register accesses to those debug
System registers that are not trapped by either of the following:

• MDCR_EL2.TDRA.

• MDCR_EL2.TDOSA.

0 Has no effect on System register accesses to the debug registers.

1 Non-secure EL0 or EL1 System register accesses to the debug registers, other than the
registers trapped by MDCR_EL2.TDRA and MDCR_EL2.TDOSA, are trapped to EL2,
from both Execution states.

Traps of AArch32 accesses to DBGDTRRXint and DBGDTRTXint are ignored in Debug state.

Traps of AArch64 accesses to DBGDTR_EL0, DBGDTRRX_EL0, and DBGDTRTX_EL0 are
ignored in Debug state.

If MDCR_EL2.TDE == 1 or HCR_EL2.TGE == 1, behavior is as if this bit is 1 other than for the
purpose of a direct read.

This field resets to a value that is architecturally UNKNOWN.


TDE, bit [8] 
Trap Debug exceptions. The possible values of this field are: 


0 This control has no effect on the routing of debug exceptions, and has no effect on
Non-secure accesses to debug registers.


1 In Non-secure state:
• Debug exceptions generated at EL1 or EL0 are routed to EL2.


• The MDCR_EL2.{TDRA, TDOSA, TDA} fields are treated as being 1 for all
purposes other than returning the result of a direct read of the register.


When HCR_EL2.TGE == 1, the PE behaves as if the value of this field is 1 for all purposes other 
than returning the value of a direct read of the register. 


This field resets to a value that is architecturally UNKNOWN. 
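The override relationship between HCR_EL2.TGE, TDE, and the three trap fields can be summarized in a short Python sketch. The function name and the dictionary result are hypothetical; the sketch models only the effective behavior and not the value returned by a direct read.

```python
def effective_el2_debug_traps(mdcr_el2: int, hcr_el2_tge: bool) -> dict:
    """Return the effective TDE, TDRA, TDOSA and TDA values."""
    # HCR_EL2.TGE == 1 makes the PE behave as if TDE is 1.
    tde = bool((mdcr_el2 >> 8) & 1) or hcr_el2_tge
    # An effective TDE of 1 makes TDRA, TDOSA and TDA behave as 1.
    return {
        "TDE": tde,
        "TDRA": tde or bool((mdcr_el2 >> 11) & 1),
        "TDOSA": tde or bool((mdcr_el2 >> 10) & 1),
        "TDA": tde or bool((mdcr_el2 >> 9) & 1),
    }
```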


HPME, bit [7] 
Hypervisor Performance Monitors Enable. The possible values of this bit are: 
0 EL2 Performance Monitors disabled. 
1 EL2 Performance Monitors enabled. 


When the value of this bit is 1, the Performance Monitors counters that are reserved for use from 
EL2 or Secure state are enabled. For more information see the description of the HPMN field. 


If the Performance Monitors Extension is not implemented, this field is RES0.


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


TPM, bit [6] 


Trap Performance Monitors accesses. Traps Non-secure EL0 and EL1 accesses to all Performance
Monitors registers to EL2, from both Execution states:


0 Non-secure EL0 and EL1 accesses to all Performance Monitors registers are not trapped
to EL2.

1 Non-secure EL0 and EL1 accesses to all Performance Monitors registers are trapped to
EL2.

Note


EL2 does not provide traps on Performance Monitor register accesses through the optional
memory-mapped external debug interface.


If the Performance Monitors Extension is not implemented, this field is RES0.


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


TPMCR, bit [5] 


Trap PMCR_EL0 or PMCR accesses. Traps Non-secure EL0 and EL1 accesses to the PMCR_EL0
or PMCR to EL2.


0 Non-secure EL0 and EL1 accesses to the PMCR_EL0 or PMCR are not trapped to EL2.
1 Non-secure EL0 and EL1 accesses to the PMCR_EL0 or PMCR are trapped to EL2.

Note


EL2 does not provide traps on Performance Monitor register accesses through the optional
memory-mapped external debug interface.


If the Performance Monitors Extension is not implemented, this field is RES0.


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 

HPMN, bits [4:0]


Defines the number of Performance Monitors counters that are accessible from Non-secure EL0 and
EL1 modes.


If the Performance Monitors Extension is not implemented, this field is RES0.


In Non-secure state, HPMN divides the Performance Monitors counters as follows. For counter n in
Non-secure state:


• If n is in the range 0<=n<HPMN, the counter is accessible from EL1 and EL2, and from EL0
if permitted by PMUSERENR_EL0. PMCR_EL0.E enables the operation of counters in this
range.


• If n is in the range HPMN<=n<PMCR_EL0.N, the counter is accessible only from EL2 and
from Secure state. MDCR_EL2.HPME enables the operation of counters in this range.


If this field is set to 0, or to a value larger than PMCR_EL0.N, then the following CONSTRAINED
UNPREDICTABLE behavior applies:


• The value returned by a direct read of MDCR_EL2.HPMN is UNKNOWN.
• Either:


- An UNKNOWN number of counters are reserved for EL2 use. That is, the PE behaves
as if MDCR_EL2.HPMN is set to an UNKNOWN non-zero value less than
PMCR_EL0.N.


- All counters are reserved for EL2 use, meaning no counters are accessible from
Non-secure EL1 and Non-secure EL0.


When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to the value of PMCR_EL0.N.
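The way HPMN divides the counters in Non-secure state can be illustrated with a small Python sketch. The function name and the returned labels are hypothetical. The sketch raises an error for the CONSTRAINED UNPREDICTABLE settings rather than modelling either permitted behavior.

```python
def counter_partition(n: int, hpmn: int, pmcr_n: int) -> str:
    """Classify counter n as usable from 'EL1' or reserved for 'EL2'."""
    if hpmn == 0 or hpmn > pmcr_n:
        raise ValueError("HPMN setting is CONSTRAINED UNPREDICTABLE")
    if not 0 <= n < pmcr_n:
        raise IndexError("counter n is not implemented")
    # 0 <= n < HPMN: accessible from EL1 and EL2 (and EL0 if permitted).
    # HPMN <= n < PMCR_EL0.N: accessible only from EL2 and Secure state.
    return "EL1" if n < hpmn else "EL2"
```

With HPMN equal to PMCR_EL0.N, the second range is empty and every counter falls in the first range.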


Accessing the MDCR_EL2: 


To access the MDCR_EL2: 


MRS <Xt>, MDCR_EL2 ; Read MDCR_EL2 into Xt 
MSR MDCR_EL2, <Xt> ; Write Xt to MDCR_EL2 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
11    100   0001   0001   001
D7.3.18 MDCR_EL3, Monitor Debug Configuration Register (EL3)
The MDCR_EL3 characteristics are: 


Purpose 


Provides EL3 configuration options for self-hosted debug and the Performance Monitors Extension. 


Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         RW              RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register MDCR_EL3 can be mapped to AArch32 System register SDCR, but this 
is not architecturally mandated. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch64. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
MDCR_EL3 is a 32-bit register. 


Field descriptions 


The MDCR_EL3 bit assignments are: 


Bits      Field
[31:22]   RES0
[21]      EPMAD
[20]      EDAD
[19:18]   RES0
[17]      SPME
[16]      SDD
[15:14]   SPD32
[13:11]   RES0
[10]      TDOSA
[9]       TDA
[8:7]     RES0
[6]       TPM
[5:0]     RES0


Bits [31:22] 


Reserved, RES0.


EPMAD, bit [21] 


External debug interface Performance Monitors registers disable. This disables access to these 
registers by an external debugger: 


0 Access to Performance Monitors registers from external debugger is permitted. 


1 Access to Performance Monitors registers from external debugger is disabled, unless 
overridden by the IMPLEMENTATION DEFINED authentication interface. 


If the Performance Monitors Extension is not implemented or does not support external debug
interface accesses this bit is RES0.

When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.

EDAD, bit [20]


External debug interface breakpoint and watchpoint register access disable. This disables access to 
these registers by an external debugger: 


0 Access to breakpoint and watchpoint registers from external debugger is permitted.


1 Access to breakpoint and watchpoint registers from external debugger is disabled, 
unless overridden by the IMPLEMENTATION DEFINED authentication interface. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [19:18] 


Reserved, RES0.


SPME, bit [17] 
Secure Performance Monitors enable. This allows event counting in Secure state: 


0 Event counting prohibited in Secure state, unless overridden by the IMPLEMENTATION
DEFINED authentication interface.


1 Event counting allowed in Secure state.

If the Performance Monitors Extension is not implemented, this field is RES0.


When this register has an architecturally-defined reset value, this field resets to 0. 


SDD, bit [16] 


AArch64 Secure self-hosted invasive debug disable. Disables Software debug exceptions in Secure 
state, other than Breakpoint Instruction exceptions. 


0 Debug exceptions from Secure EL0 are enabled, and debug exceptions from Secure
EL1 are enabled if the value of MDSCR_EL1.KDE is 1 and the value of PSTATE.D is 0.


1 Debug exceptions, other than Breakpoint Instruction exceptions, are disabled from all
Exception levels in Secure state.


The SDD bit is ignored unless both of the following are true:
• The PE is in Secure state.
• Secure EL1 is using AArch64.


This field resets to a value that is architecturally UNKNOWN. 


SPD32, bits [15:14] 


AArch32 Secure self-hosted privileged invasive debug control. Enables or disables debug 
exceptions from Secure EL1 using AArch32, other than Breakpoint Instruction exceptions. Valid 
values for this field are: 


00 Legacy mode. Debug exceptions from Secure EL1 are enabled by the IMPLEMENTATION 
DEFINED authentication interface. 

10 Secure privileged debug disabled. Debug exceptions from Secure EL1 are disabled. 

11 Secure privileged debug enabled. Debug exceptions from Secure EL1 are enabled. 


Other values are reserved, and have the CONSTRAINED UNPREDICTABLE behavior that they must 
have the same behavior as 0b00. Software must not rely on this property as the behavior of reserved 
values might change in a future revision of the architecture. 


This field has no effect on Breakpoint Instruction exceptions. These are always enabled. 





This field is: 
• Ignored if either the PE is in Non-secure state or Secure EL1 is using AArch64.
• RES0 if the implementation does not support EL1 using AArch32.

If Secure EL1 is using AArch32 then:
• If debug exceptions from Secure EL1 are enabled, then debug exceptions from Secure EL0
are also enabled.
• Otherwise, debug exceptions from Secure EL0 are enabled only if the value of
SDER32_EL3.SUIDEN is 1.
If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 
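The SPD32 rules for Secure EL1 using AArch32 can be sketched in Python as follows. The function and parameter names are hypothetical: auth_allows stands for the outcome of the IMPLEMENTATION DEFINED authentication interface in Legacy mode, and suiden stands for the value of SDER32_EL3.SUIDEN. The sketch does not cover Breakpoint Instruction exceptions, which are always enabled.

```python
def secure_aarch32_debug_enabled(mdcr_el3: int, auth_allows: bool,
                                 suiden: bool):
    """Return (el1_enabled, el0_enabled) for Secure state using AArch32."""
    spd32 = (mdcr_el3 >> 14) & 0b11   # SPD32, bits [15:14]
    if spd32 == 0b11:
        el1_enabled = True            # Secure privileged debug enabled
    elif spd32 == 0b10:
        el1_enabled = False           # Secure privileged debug disabled
    else:
        # 0b00 is Legacy mode; the reserved value 0b01 behaves as 0b00.
        el1_enabled = auth_allows
    # Secure EL0 follows Secure EL1, otherwise it depends on SUIDEN.
    el0_enabled = el1_enabled or suiden
    return el1_enabled, el0_enabled
```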
Bits [13:11] 
Reserved, RES0.
TDOSA, bit [10] 


Trap debug OS-related register access. Traps EL2 and EL1 System register accesses to the 
powerdown debug registers to EL3: 


0 EL2 and EL1 System register accesses to the powerdown debug registers are not 
trapped to EL3. 

1 EL2 and EL1 System register accesses to the powerdown debug registers are trapped to 
EL3. 


The registers for which accesses are trapped are as follows: 
AArch64: OSLAR_EL1, OSLSR_EL1, OSDLR_EL1, DBGPRCR_EL1. 
AArch32: DBGOSLAR, DBGOSLSR, DBGOSDLR, DBGPRCR. 


AArch64 and AArch32: Any IMPLEMENTATION DEFINED register with similar functionality that the 
implementation specifies as trapped by this bit. 


This field resets to a value that is architecturally UNKNOWN. 


TDA, bit [9] 


Trap Debug Access. Traps EL2, EL1, and EL0 System register accesses to those debug System
registers that cannot be trapped using the MDCR_EL3.TDOSA field. When MDCR_EL3.TDA is:


0 EL0, EL1, and EL2 accesses to the debug registers, other than the registers that can be
trapped by MDCR_EL3.TDOSA, are not trapped to EL3.


1 EL0, EL1, and EL2 accesses to the debug registers, other than the registers that can be
trapped by MDCR_EL3.TDOSA, are trapped to EL3, from both Security states and
both Execution states.


Traps of AArch32 accesses to DBGDTRRXint and DBGDTRTXint are ignored in Debug state.


Traps of AArch64 accesses to DBGDTR_EL0, DBGDTRRX_EL0, and DBGDTRTX_EL0 are
ignored in Debug state.


This field resets to a value that is architecturally UNKNOWN. 


Bits [8:7] 


Reserved, RES0.


TPM, bit [6] 


Trap Performance Monitors accesses. Traps EL2, EL1, and EL0 accesses to all Performance
Monitors registers to EL3, from both Security states and both Execution states.


0 EL2, EL1, and EL0 System register accesses to all Performance Monitors registers are
not trapped to EL3.


1 EL2, EL1, and EL0 System register accesses to all Performance Monitors registers are
trapped to EL3.


If the Performance Monitors Extension is not implemented, this field is RES0.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.

Bits [5:0]


Reserved, RES0.

Accessing the MDCR_EL3:
To access the MDCR_EL3: 


MRS <Xt>, MDCR_EL3 ; Read MDCR_EL3 into Xt 
MSR MDCR_EL3, <Xt> ; Write Xt to MDCR_EL3 


Register access is encoded as follows: 





op0   op1   CRn    CRm    op2
11    110   0001   0011   001
D7.3.19 MDRAR_EL1, Monitor Debug ROM Address Register
The MDRAR_EL1 characteristics are:


Purpose 


Defines the base physical address of a 4KB-aligned memory-mapped debug component, usually a 
ROM table that locates and describes the memory-mapped debug components in the system. 
ARMv8 deprecates any use of this register.

Usage constraints 


This register is accessible as follows: 





EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO


ARMv8 deprecates any use of this register.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDRA==1, Non-secure read accesses to this register from EL1 are trapped
to EL2.


• If MDCR_EL3.TDA==1, read accesses to this register from EL1 and EL2 are trapped to
EL3.


Configurations 


AArch64 System register MDRAR_EL1 is architecturally mapped to AArch32 System register 
DBGDRAR. 


Attributes 
MDRAR_EL1 is a 64-bit register. 


Field descriptions 


The MDRAR_EL1 bit assignments are:

Bits      Field
[63:48]   RES0
[47:12]   ROMADDR[47:12]
[11:2]    RES0
[1:0]     Valid


Bits [63:48]


Reserved, RES0.


ROMADDR[47:12], bits [47:12]
Bits[47:12] of the ROM table physical address.


If the physical address size in bits (PAsize) is less than 48 then the register bits corresponding to
ROMADDR[47:PAsize] are RES0.


Bits [11:0] of the ROM table physical address are zero. 


ARM strongly recommends that bits ROMADDR[(PAsize-1):32] are zero in any system that 
supports AArch32 at the highest implemented Exception level. 


In an implementation that includes EL3, ROMADDR is an address in Non-secure memory. It is 
IMPLEMENTATION DEFINED whether the ROM table is also accessible in Secure memory. 
Bits [11:2]

Reserved, RES0.


Valid, bits [1:0] 


This field indicates whether the ROM Table address is valid. The permitted values of this field are: 


00 ROM Table address is not valid. Software must ignore ROMADDR. 


11 ROM Table address is valid. 


Other values are reserved. 
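
The field layout above can be illustrated with a short host-side decoding sketch. This is not part of the architecture: the function name `decode_mdrar` and the `pa_size` parameter are hypothetical, and treating the reserved Valid encodings as "not valid" is an assumption made for the example.

```python
def decode_mdrar(value: int, pa_size: int = 48):
    """Split a raw 64-bit MDRAR_EL1 value into (rom_table_address, valid).

    Hypothetical helper for illustration only.
    """
    valid_field = value & 0b11  # Valid, bits [1:0]
    # ROMADDR supplies bits [47:12] of the address; bits [11:0] are zero.
    # Bits at or above PAsize are RES0, so they are masked off here.
    addr_mask = ((1 << min(pa_size, 48)) - 1) & ~0xFFF
    if valid_field == 0b11:
        return value & addr_mask, True
    # 0b00 means "not valid". Other encodings are reserved; this sketch
    # assumes they should also be treated as not valid.
    return None, False
```

A caller would pass the value read with MRS and ignore the address whenever the second element of the result is False.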


Accessing the MDRAR_EL1:
To access the MDRAR_EL1:
MRS <Xt>, MDRAR_EL1 ; Read MDRAR_EL1 into Xt

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0001   0000   000
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D7.3.20 MDSCR_EL1, Monitor Debug System Control Register 
The MDSCR_EL1 characteristics are:


Purpose 


Main control register for the debug implementation. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register MDSCR_EL1 is architecturally mapped to AArch32 System register 
DBGDSCRext. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch64. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
MDSCR_EL1 is a 32-bit register. 


Field descriptions 


The MDSCR_EL1 bit assignments are:

Bits      Field
[31]      RES0
[30]      RXfull
[29]      TXfull
[28]      RES0
[27]      RXO
[26]      TXU
[25:24]   RES0
[23:22]   INTdis
[21]      TDA
[20:19]   RES0
[18:16]   RAZ/WI
[15]      MDE
[14]      HDE
[13]      KDE
[12]      TDCC
[11:7]    RES0
[6]       ERR
[5:1]     RES0
[0]       SS

Bit [31]
Reserved, RES0.





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. D7-2197 
ID092916 Non-Confidential


D7 AArch64 System Register Descriptions 


D7.3 Debug registers 


RXfull, bit [30] 


Used for save/restore of EDSCR.RXfull. 


When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat 
it as UNK/SBZP. 


When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of 
EDSCR.RXfull. 


The architected behavior of this field determines the value it returns after a reset. 


TXfull, bit [29]

Used for save/restore of EDSCR.TXfull.

When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat
it as UNK/SBZP.

When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.TXfull.

The architected behavior of this field determines the value it returns after a reset.

Bit [28]

Reserved, RES0.

RXO, bit [27]

Used for save/restore of EDSCR.RXO.

When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat
it as UNK/SBZP.

When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.RXO.

The architected behavior of this field determines the value it returns after a reset.

TXU, bit [26]

Used for save/restore of EDSCR.TXU.

When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat
it as UNK/SBZP.

When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.TXU.

The architected behavior of this field determines the value it returns after a reset.

Bits [25:24]

Reserved, RES0.

INTdis, bits [23:22]

Used for save/restore of EDSCR.INTdis.

When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this field is RO, and software must treat
it as UNK/SBZP.

When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this field is RW and holds the value of
EDSCR.INTdis.

The architected behavior of this field determines the value it returns after a reset.

TDA, bit [21]

Used for save/restore of EDSCR.TDA.

When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat
it as UNK/SBZP.

When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.TDA.

The architected behavior of this field determines the value it returns after a reset.


Bits [20:19]

Reserved, RES0.


Bits [18:16] 
Reserved. Hardware must implement this field as RAZ/WI. Software must not rely on the register 
reading as zero, and must use a read-modify-write sequence to update the register. 
MDE, bit [15] 
Monitor debug events. Enable Breakpoint, Watchpoint, and Vector Catch exceptions. 
0 Breakpoint, Watchpoint, and Vector Catch exceptions disabled. 
1 Breakpoint, Watchpoint, and Vector Catch exceptions enabled. 


This field resets to a value that is architecturally UNKNOWN. 


HDE, bit [14] 
Used for save/restore of EDSCR.HDE. 


When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat 
it as UNK/SBZP. 


When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of 
EDSCR.HDE. 


The architected behavior of this field determines the value it returns after a reset. 


KDE, bit [13]
Local (kernel) debug enable. If ELD is using AArch64, enable debug exceptions within ELD.
Permitted values are:
0 Debug exceptions, other than Breakpoint Instruction exceptions, disabled within ELD.
1 Breakpoint exceptions enabled within ELD.
RES0 if ELD is using AArch32.


This field resets to a value that is architecturally UNKNOWN. 


TDCC, bit [12]
Traps EL0 accesses to the DCC registers to EL1, from both Execution states:
0 EL0 using AArch64: EL0 accesses to the MDCCSR_EL0, DBGDTR_EL0,
DBGDTRTX_EL0, and DBGDTRRX_EL0 registers are not trapped to EL1.

EL0 using AArch32: EL0 accesses to the DBGDSCRint, DBGDTRRXint,
DBGDTRTXint, DBGDIDR, DBGDSAR, and DBGDRAR registers are not trapped to
EL1.

1 EL0 using AArch64: EL0 accesses to the MDCCSR_EL0, DBGDTR_EL0,
DBGDTRTX_EL0, and DBGDTRRX_EL0 registers are trapped to EL1.

EL0 using AArch32: EL0 accesses to the DBGDSCRint, DBGDTRRXint,
DBGDTRTXint, DBGDIDR, DBGDSAR, and DBGDRAR registers are trapped to
EL1.

Note

All accesses to these AArch32 registers are trapped, including LDC and STC accesses to
DBGDTRTXint and DBGDTRRXint, and MRRC accesses to DBGDSAR and DBGDRAR.

Traps of AArch32 accesses to the DBGDTRRXint and DBGDTRTXint are ignored in Debug state.

Traps of AArch64 accesses to DBGDTR_EL0, DBGDTRRX_EL0, and DBGDTRTX_EL0 are
ignored in Debug state.


This field resets to a value that is architecturally UNKNOWN. 
Bits [11:7]
Reserved, RES0.
ERR, bit [6] 
Used for save/restore of EDSCR.ERR. 


When OSLSR_EL1.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat 
it as UNK/SBZP. 


When OSLSR_EL1.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of 
EDSCR.ERR. 


The architected behavior of this field determines the value it returns after a reset. 
Bits [5:1]
Reserved, RES0.
SS, bit [0]
Software step control bit. If ELD is using AArch64, enable Software step. Permitted values are:
0 Software step disabled.
1 Software step enabled.
RES0 if ELD is using AArch32.


This field resets to a value that is architecturally UNKNOWN. 
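
Because several fields of this register are reserved or are save/restore fields, updates are made with a read-modify-write sequence. The following is a minimal sketch of the bit arithmetic only, modelled on the host in Python. The constant names and the `update_mdscr` helper are hypothetical, and restricting the helper to the MDE, KDE, TDCC, and SS control bits is a choice made for this example.

```python
# Bit positions taken from the field descriptions above.
MDSCR_SS = 1 << 0     # Software step enable
MDSCR_TDCC = 1 << 12  # Trap EL0 DCC register accesses to EL1
MDSCR_KDE = 1 << 13   # Local (kernel) debug enable
MDSCR_MDE = 1 << 15   # Monitor debug events enable

_CONTROL_BITS = MDSCR_SS | MDSCR_TDCC | MDSCR_KDE | MDSCR_MDE


def update_mdscr(current: int, set_bits: int = 0, clear_bits: int = 0) -> int:
    """Return the value to write back, changing only the requested bits.

    `current` is the value previously read from the register; every bit
    not named in set_bits or clear_bits is preserved unchanged.
    """
    if (set_bits | clear_bits) & ~_CONTROL_BITS:
        raise ValueError("only MDE, KDE, TDCC and SS may be changed here")
    return ((current & ~clear_bits) | set_bits) & 0xFFFFFFFF
```

In a real sequence the `current` value would come from an MRS of MDSCR_EL1 and the result would be written back with MSR.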


Accessing the MDSCR_EL1: 
To access the MDSCR_EL1: 


MRS <Xt>, MDSCR_EL1 ; Read MDSCR_EL1 into Xt 
MSR MDSCR_EL1, <Xt> ; Write Xt to MDSCR_EL1 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0000   0010   010
D7.3.21 OSDLR_EL1, OS Double Lock Register


The OSDLR_EL1 characteristics are: 


Purpose 


Used to control the OS Double Lock. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDOSA==1, Non-secure accesses to this register from EL1 are trapped to
EL2.

• If MDCR_EL3.TDOSA==1, accesses to this register from EL1 and EL2 are trapped to EL3.

Configurations

AArch64 System register OSDLR_EL1 is architecturally mapped to AArch32 System register
DBGOSDLR.

This register is in the Warm reset domain. Some or all RW fields of this register have defined reset
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is
using AArch64. Otherwise, on a Warm or Cold reset RW fields in this register reset to
architecturally UNKNOWN values.

Attributes
OSDLR_EL1 is a 32-bit register.


Field descriptions 


The OSDLR_EL1 bit assignments are:

31                1   0
RES0                  DLK

Bits [31:1]

Reserved, RES0.

DLK, bit [0]

OS Double Lock control bit. Possible values are:
0 OS Double Lock unlocked.
1 OS Double Lock locked, if DBGPRCR_EL1.CORENPDRQ (Core no powerdown
request) bit is set to 0 and the PE is in Non-debug state.

When this register has an architecturally-defined reset value, this field resets to 0.
Accessing the OSDLR_EL1:
To access the OSDLR_EL1:

MRS <Xt>, OSDLR_EL1 ; Read OSDLR_EL1 into Xt
MSR OSDLR_EL1, <Xt> ; Write Xt to OSDLR_EL1

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0001   0011   100
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D7.3.22 OSDTRRX_EL1, OS Lock Data Transfer Register, Receive 
The OSDTRRX_EL1 characteristics are:
Purpose
Used for save/restore of DBGDTRRX_EL0. It is a component of the Debug Communications
Channel.
Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW

ARM deprecates reads and writes of OSDTRRX_EL1 when the OS lock is unlocked.
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.
Configurations
AArch64 System register OSDTRRX_EL1 is architecturally mapped to AArch32 System register
DBGDTRRXext.
Attributes
OSDTRRX_EL1 is a 32-bit register.
Field descriptions
The OSDTRRX_EL1 bit assignments are:
31                                  0
Update DTRRX without side-effect
Bits [31:0]
Update DTRRX without side-effect.
Writes to this register update the value in DTRRX and do not change RXfull.
Reads of this register return the last value written to DTRRX and do not change RXfull.
For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug
Communication Channel and Instruction Transfer Register.
Accessing the OSDTRRX_EL1:
To access the OSDTRRX_EL1:
MRS <Xt>, OSDTRRX_EL1 ; Read OSDTRRX_EL1 into Xt 
MSR OSDTRRX_EL1, <Xt> ; Write Xt to OSDTRRX_EL1 
Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0000   0000   010






















D7.3.23 OSDTRTX_EL1, OS Lock Data Transfer Register, Transmit 
The OSDTRTX_EL1 characteristics are:
Purpose
Used for save/restore of DBGDTRTX_EL0. It is a component of the Debug Communications
Channel.
Usage constraints
This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW

ARM deprecates reads and writes of OSDTRTX_EL1 when the OS lock is unlocked.
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.
Configurations
AArch64 System register OSDTRTX_EL1 is architecturally mapped to AArch32 System register
DBGDTRTXext.
Attributes
OSDTRTX_EL1 is a 32-bit register.
Field descriptions
The OSDTRTX_EL1 bit assignments are:
31                                  0
Return DTRTX without side-effect
Bits [31:0]
Return DTRTX without side-effect.
Reads of this register return the value in DTRTX and do not change TXfull.
Writes of this register update the value in DTRTX and do not change TXfull.
For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug
Communication Channel and Instruction Transfer Register.
Accessing the OSDTRTX_EL1:
To access the OSDTRTX_EL1:
MRS <Xt>, OSDTRTX_EL1 ; Read OSDTRTX_EL1 into Xt 
MSR OSDTRTX_EL1, <Xt> ; Write Xt to OSDTRTX_EL1 
Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0000   0011   010
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D7.3.24 OSECCR_EL1, OS Lock Exception Catch Control Register 
The OSECCR_EL1 characteristics are: 


Purpose 
Provides a mechanism for an operating system to access the contents of EDECCR that are otherwise 
invisible to software, so it can save/restore the contents of EDECCR over powerdown on behalf of 
the external debugger. 

Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


AArch64 System register OSECCR_EL1 is architecturally mapped to AArch32 System register 
DBGOSECCR. 


AArch64 System register OSECCR_EL1 is architecturally mapped to External register EDECCR. 


If OSLSR_EL1.OSLK == 0 then OSECCR_EL1 returns an UNKNOWN value on reads and ignores 
writes. 


Attributes 
OSECCR_EL1 is a 32-bit register. 


Field descriptions 


The OSECCR_EL1 bit assignments are: 


When OSLSR.OSLK==1: 


31 0 


EDECCR 


EDECCR, bits [31:0] 


Used for save/restore to EDECCR over powerdown. 


Accessing the OSECCR_EL1: 
To access the OSECCR_EL1: 


MRS <Xt>, OSECCR_EL1 ; Read OSECCR_EL1 into Xt 
MSR OSECCR_EL1, <Xt> ; Write Xt to OSECCR_EL1 
Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0000   0110   010
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D7.3.25 OSLAR_EL1, OS Lock Access Register 
The OSLAR_EL1 characteristics are: 
Purpose 
Used to lock or unlock the OS lock. 
Usage constraints 
This register is accessible as follows: 
EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     WO        WO       WO        WO              WO

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TDOSA==1, Non-secure write accesses to this register from EL1 are trapped
to EL2.
• If MDCR_EL3.TDOSA==1, write accesses to this register from EL1 and EL2 are trapped to
EL3.
Configurations
AArch64 System register OSLAR_EL1 is architecturally mapped to AArch32 System register
DBGOSLAR.
AArch64 System register OSLAR_EL1 is architecturally mapped to External register
OSLAR_EL1.
Attributes
OSLAR_EL1 is a 32-bit register.
Field descriptions
The OSLAR_EL1 bit assignments are:
31                1   0
RES0                  OSLK
Bits [31:1]
Reserved, RES0.
OSLK, bit [0]
On writes to OSLAR_EL1, bit[0] is copied to the OS lock.
Use OSLSR_EL1.OSLK to check the current status of the lock.
Accessing the OSLAR_EL1:
To access the OSLAR_EL1:
MSR OSLAR_EL1, <Xt> ; Write Xt to OSLAR_EL1 
Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0001   0000   100
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D7.3.26 OSLSR_EL1, OS Lock Status Register 
The OSLSR_EL1 characteristics are: 


Purpose 
Provides the status of the OS lock. 


Usage constraints 


This register is accessible as follows:

EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     RO        RO       RO        RO              RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TDOSA==1, Non-secure read accesses to this register from EL1 are trapped
to EL2.

• If MDCR_EL3.TDOSA==1, read accesses to this register from EL1 and EL2 are trapped to
EL3.

Configurations 


AArch64 System register OSLSR_EL1 is architecturally mapped to AArch32 System register 
DBGOSLSR. 


This register is in the Cold reset domain. Some or all RW fields of this register have defined reset 
values. On a Cold reset these apply only if the PE resets into an Exception level that is using 
AArch64. Otherwise, on a Cold reset RW fields in this register reset to architecturally UNKNOWN 
values. The register is not affected by a Warm reset. 


Attributes 
OSLSR_EL1 is a 32-bit register. 


Field descriptions 


The OSLSR_EL1 bit assignments are:

31          4   3         2     1      0
RES0            OSLM[1]   nTT   OSLK   OSLM[0]

Bits [31:4]
Reserved, RES0.
OSLM[1], bit [3]
See below for description of the OSLM field.
nTT, bit [2]

Not 32-bit access. This bit is always RAZ. It indicates that a 32-bit access is needed to write the key
to the OS Lock Access Register.

OSLK, bit [1]
OS Lock Status. The possible values are:
0 OS lock unlocked.
1 OS lock locked.
The OS lock is locked and unlocked by writing to the OS Lock Access Register.
When this register has an architecturally-defined reset value, this field resets to 1.
OSLM[0], bit [0]

OS lock model implemented. Identifies the form of OS save and restore mechanism implemented.
In ARMv8 these bits are as follows:

10 OS lock implemented. DBGOSSRR not implemented.

All other values are reserved.
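
Because OSLM is split across bit [3] and bit [0], software has to reassemble it after reading the register. The sketch below shows one way to do that on a raw value. The function name `decode_oslsr` and the dictionary it returns are hypothetical and only illustrate the bit positions listed above.

```python
def decode_oslsr(value: int) -> dict:
    """Decode a raw 32-bit OSLSR_EL1 value into its named fields."""
    oslm_hi = (value >> 3) & 1  # OSLM[1], bit [3]
    oslm_lo = value & 1         # OSLM[0], bit [0]
    return {
        "OSLM": (oslm_hi << 1) | oslm_lo,  # 0b10: OS lock implemented
        "nTT": (value >> 2) & 1,           # always RAZ
        "OSLK": (value >> 1) & 1,          # 1: OS lock locked
    }
```

With the only permitted OSLM encoding, a locked OS lock reads as 0b1010 and an unlocked one as 0b1000.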


Accessing the OSLSR_EL1: 
To access the OSLSR_EL1: 
MRS <Xt>, OSLSR_EL1 ; Read OSLSR_EL1 into Xt 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
10    000   0001   0001   100
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D7.3.27 SDER32_EL3, AArch32 Secure Debug Enable Register 
The SDER32_EL3 characteristics are: 
Purpose 
Allows access to the AArch32 register SDER from AArch64 state only. Its value has no effect on 
execution in AArch64 state. 
Usage constraints 
This register is accessible as follows: 
EL0   EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-     -         -        -         RW              RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 

AArch64 System register SDER32_EL3 is architecturally mapped to AArch32 System register
SDER.

If EL1 is AArch64 only, this register is UNDEFINED.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes 
SDER32_EL3 is a 32-bit register. 
Field descriptions 
The SDER32_EL3 bit assignments are: 
31          2   1         0
RES0            SUNIDEN   SUIDEN
Bits [31:2]
Reserved, RES0.
SUNIDEN, bit [1]

Secure User Non-Invasive Debug Enable:

0 Performance Monitors event counting prohibited in Secure EL0 unless allowed by
MDCR_EL3.SPME or the IMPLEMENTATION DEFINED authentication interface
ExternalSecureNoninvasiveDebugEnabled().

1 Performance Monitors event counting allowed in Secure EL0.

This field resets to a value that is architecturally UNKNOWN.

SUIDEN, bit [0]

Secure User Invasive Debug Enable:

0 Debug exceptions other than Breakpoint Instruction exceptions from Secure EL0 are
disabled, unless enabled by MDCR_EL3.SPD32.
1 Debug exceptions from Secure EL0 are enabled.

This field resets to a value that is architecturally UNKNOWN.


Accessing the SDER32_EL3: 
To access the SDER32_EL3: 


MRS <Xt>, SDER32_EL3 ; Read SDER32_EL3 into Xt 
MSR SDER32_EL3, <Xt> ; Write Xt to SDER32_EL3 


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    110   0001   0001   001
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D7.4 Performance Monitors registers 


This section lists the Performance Monitoring registers in AArch64. 
D7.4.1 PMCCFILTR_EL0, Performance Monitors Cycle Count Filter Register
The PMCCFILTR_EL0 characteristics are:

Purpose
Determines the modes in which the Cycle Counter, PMCCNTR_EL0, increments.


Usage constraints 


This register is accessible as follows:

EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-RW   RW        RW       RW        RW              RW

PMCCFILTR_EL0 can also be accessed by using PMXEVTYPER_EL0 with PMSELR_EL0.SEL
set to 0b11111.
Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.

• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.


Configurations 


AArch64 System register PMCCFILTR_EL0 is architecturally mapped to AArch32 System register
PMCCFILTR.

AArch64 System register PMCCFILTR_EL0 is architecturally mapped to External register
PMCCFILTR_EL0.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMCCFILTR_EL0 is a 32-bit register.


Field descriptions 


The PMCCFILTR_EL0 bit assignments are:

31   30   29    28    27    26   25        0
P    U    NSK   NSU   NSH   M    RES0

P, bit [31]
Privileged filtering bit. Controls counting in EL1. If EL3 is implemented, then counting in
Non-secure EL1 is further controlled by the NSK bit. The possible values of this bit are:
0 Count cycles in EL1.
1 Do not count cycles in EL1.
U, bit [30]
User filtering bit. Controls counting in EL0. If EL3 is implemented, then counting in Non-secure
EL0 is further controlled by the NSU bit. The possible values of this bit are:
0 Count cycles in EL0.
1 Do not count cycles in EL0.
NSK, bit [29]

Non-secure EL1 (kernel) modes filtering bit. Controls counting in Non-secure EL1. If EL3 is not
implemented, this bit is RES0.

If the value of this bit is equal to the value of P, cycles in Non-secure EL1 are counted.

Otherwise, cycles in Non-secure EL1 are not counted.

NSU, bit [28]

Non-secure EL0 (Unprivileged) filtering. Controls counting in Non-secure EL0. If EL3 is not
implemented, this bit is RES0.

If the value of this bit is equal to the value of U, cycles in Non-secure EL0 are counted.

Otherwise, cycles in Non-secure EL0 are not counted.

NSH, bit [27]

Non-secure EL2 (Hypervisor) filtering bit. Controls counting in Non-secure EL2. If EL2 is not
implemented, this bit is RES0.

0 Do not count cycles in EL2.
1 Count cycles in EL2.

M, bit [26]
Secure EL3 filtering bit. If EL3 is not implemented, this bit is RES0.
If the value of this bit is equal to the value of P, cycles in Secure EL3 are counted.
Otherwise, cycles in Secure EL3 are not counted.
Most applications can ignore this field and set its value to 0.

Note
This field is not visible in the AArch32 PMCCFILTR System register.

Bits [25:0]

Reserved, RES0.
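
The interaction between the P, U, NSK, NSU, NSH, and M bits can be summarized in a small model. The sketch below is illustrative only: the function `cycles_counted` and its parameters are hypothetical, it assumes both EL2 and EL3 are implemented, and it models only the filter bits of this register, not any other control that can stop the cycle counter.

```python
def cycles_counted(filtr: int, el: int, secure: bool) -> bool:
    """Return True if the filter bits allow cycle counting at the given level.

    filtr  : raw PMCCFILTR_EL0 value
    el     : current Exception level, 0 to 3
    secure : True for Secure state (EL3 is always treated as Secure here)
    """
    p = (filtr >> 31) & 1
    u = (filtr >> 30) & 1
    nsk = (filtr >> 29) & 1
    nsu = (filtr >> 28) & 1
    nsh = (filtr >> 27) & 1
    m = (filtr >> 26) & 1

    if el == 0:
        return u == 0 if secure else nsu == u
    if el == 1:
        return p == 0 if secure else nsk == p
    if el == 2:
        return nsh == 1
    if el == 3:
        return m == p
    raise ValueError("el must be in the range 0 to 3")
```

For example, setting P and NSK together excludes Secure EL1 while still counting cycles in Non-secure EL1.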


Accessing the PMCCFILTR_EL0:
To access the PMCCFILTR_EL0:

MRS <Xt>, PMCCFILTR_EL0 ; Read PMCCFILTR_EL0 into Xt
MSR PMCCFILTR_EL0, <Xt> ; Write Xt to PMCCFILTR_EL0

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   1110   1111   111

PMCCFILTR_EL0 can also be accessed by using PMXEVTYPER_EL0 with PMSELR_EL0.SEL set to 0b11111.
D7.4.2 PMCCNTR_EL0, Performance Monitors Cycle Count Register
The PMCCNTR_EL0 characteristics are:

Purpose

Holds the value of the processor Cycle Counter, CCNT, that counts processor clock cycles. See Time
as measured by the Performance Monitors cycle counter on page D5-1835 for more information.

PMCCFILTR_EL0 determines the modes and states in which the PMCCNTR_EL0 can increment.


Usage constraints 


This register is accessible as follows:

EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-RW   RW        RW       RW        RW              RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 


• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.

• If PMUSERENR_EL0.CR==0, and PMUSERENR_EL0.EN==0, read accesses to this
register from EL0 are trapped to EL1.

• If PMUSERENR_EL0.EN==0, write accesses to this register from EL0 are trapped to EL1.


Configurations 


AArch64 System register PMCCNTR_EL0 is architecturally mapped to AArch32 System register
PMCCNTR.

AArch64 System register PMCCNTR_EL0 is architecturally mapped to External register
PMCCNTR_EL0.

All counters are subject to any changes in clock frequency, including clock stopping caused by the
WFI and WFE instructions. This means that it is CONSTRAINED UNPREDICTABLE whether or not
PMCCNTR_EL0 continues to increment when clocks are stopped by WFI and WFE instructions.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMCCNTR_EL0 is a 64-bit register.

Field descriptions

The PMCCNTR_EL0 bit assignments are:

63                                  0
CCNT
CCNT, bits [63:0]

Cycle count. Depending on the values of PMCR_EL0.{LC,D}, this field increments in one of the
following ways:

• Every processor clock cycle.
• Every 64th processor clock cycle.
Writing 1 to PMCR_EL0.C sets this field to 0.

Accessing the PMCCNTR_EL0:
To access the PMCCNTR_EL0:

MRS <Xt>, PMCCNTR_EL0 ; Read PMCCNTR_EL0 into Xt
MSR PMCCNTR_EL0, <Xt> ; Write Xt to PMCCNTR_EL0


Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   1001   1101   000
D7.4.3 PMCEID0_EL0, Performance Monitors Common Event Identification register 0

The PMCEID0_EL0 characteristics are:

Purpose
Defines which common architectural and common microarchitectural feature events in the range
0x00 to 0x01F are implemented. If a particular bit is set to 1, then the event for that bit is
implemented.

Usage constraints
This register is accessible as follows:

EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-RO   RO        RO       RO        RO              RO

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure read accesses to this register from EL0 and EL1 are
trapped to EL2.
• If MDCR_EL3.TPM==1, read accesses to this register from EL0, EL1, and EL2 are trapped
to EL3.
• If PMUSERENR_EL0.EN==0, read accesses to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register PMCEID0_EL0[31:0] is architecturally mapped to AArch32 System
register PMCEID0.
AArch64 System register PMCEID0_EL0[31:0] is architecturally mapped to External register
PMCEID0.

Attributes
PMCEID0_EL0 is a 32-bit register.

Field descriptions

The PMCEID0_EL0 bit assignments are:

31                                  0
ID[31:0]

ID[31:0], bits [31:0]
PMCEID0_EL0[n] maps to event n. For a list of event numbers and descriptions, see Events, event
numbers, and mnemonics on page D5-1848.
For each bit:
0 The common event is not implemented.
1 The common event is implemented.
Bits that map to reserved event numbers are reserved to identify events that might be defined in
future revisions to the architecture.
Events that do not require additional features in the PMU can be defined retrospectively, meaning
that they can be implemented as part of a PMUv3 implementation.
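
As an illustration of the bit-to-event mapping described above, the following sketch lists the event numbers whose bits are set in a raw register value. The function name `implemented_events` is hypothetical, and the value passed in the usage line is made up for the example rather than taken from any implementation.

```python
def implemented_events(pmceid0: int) -> list:
    """Return the common event numbers (0 to 31) flagged as implemented.

    Bit n of the register maps directly to event number n.
    """
    return [n for n in range(32) if (pmceid0 >> n) & 1]


# Example with an invented register value: bits 0, 3 and 11 are set.
print([hex(n) for n in implemented_events(0x00000809)])
```

The helper does not distinguish reserved event numbers; a set bit in a reserved position would be reported like any other.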
Accessing the PMCEID0_EL0:
To access the PMCEID0_EL0:
MRS <Xt>, PMCEID0_EL0 ; Read PMCEID0_EL0 into Xt

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   1001   1100   110
D7.4.4 PMCEID1_EL0, Performance Monitors Common Event Identification register 1

The PMCEID1_EL0 characteristics are:

Purpose
Defines which common architectural and common microarchitectural feature events in the range
0x020 to 0x03F are implemented. If a particular bit is set to 1, then the event for that bit is
implemented.

Usage constraints
This register is accessible as follows:

EL0         EL1(NS)   EL1(S)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
Config-RO   RO        RO       RO        RO              RO

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure read accesses to this register from EL0 and EL1 are
trapped to EL2.
• If MDCR_EL3.TPM==1, read accesses to this register from EL0, EL1, and EL2 are trapped
to EL3.
• If PMUSERENR_EL0.EN==0, read accesses to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register PMCEID1_EL0[31:0] is architecturally mapped to AArch32 System
register PMCEID1.
AArch64 System register PMCEID1_EL0[31:0] is architecturally mapped to External register
PMCEID1.

Attributes
PMCEID1_EL0 is a 32-bit register.

Field descriptions

The PMCEID1_EL0 bit assignments are:

31                                  0
ID[63:32]

ID[63:32], bits [31:0]
PMCEID1_EL0[n] maps to event (n + 32). For a list of event numbers and descriptions, see Events,
event numbers, and mnemonics on page D5-1848.
For each bit:
0 The common event is not implemented.
1 The common event is implemented.
Bits that map to reserved event numbers are reserved to identify events that might be defined in
future revisions to the architecture.
Events that do not require additional features in the PMU can be defined retrospectively, meaning
that they can be implemented as part of a PMUv3 implementation.
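
Taken together, the two identification registers cover common events 0x000 to 0x03F, with the second register offset by 32. A minimal sketch of a combined lookup follows. The function name `common_event_implemented` and its argument names are hypothetical, and the register values would in practice come from MRS reads of the two registers.

```python
def common_event_implemented(event: int, pmceid0: int, pmceid1: int) -> bool:
    """Check one common event number against both identification registers.

    Events 0 to 31 are reported in the first register, bit n for event n.
    Events 32 to 63 are reported in the second register, bit n for event n + 32.
    """
    if not 0 <= event <= 0x3F:
        raise ValueError("event number outside the range 0x00 to 0x3F")
    if event < 32:
        return bool((pmceid0 >> event) & 1)
    return bool((pmceid1 >> (event - 32)) & 1)
```

Raising an error for out-of-range event numbers is a design choice for the sketch; a caller could equally return False for them.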
Accessing the PMCEID1_EL0:
To access the PMCEID1_EL0:
MRS <Xt>, PMCEID1_EL0 ; Read PMCEID1_EL0 into Xt

Register access is encoded as follows:

op0   op1   CRn    CRm    op2
11    011   1001   1100   111
D7.4.5 PMCNTENCLR_EL0, Performance Monitors Count Enable Clear register
The PMCNTENCLR_EL0 characteristics are:
Purpose
Disables the Cycle Count Register, PMCCNTR_EL0, and any implemented event counters
PMEVCNTR<n>. Reading this register shows which counters are enabled.
Usage constraints 
This register is accessible as follows: 
EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.
Configurations
AArch64 System register PMCNTENCLR_EL0 is architecturally mapped to AArch32 System
register PMCNTENCLR.
AArch64 System register PMCNTENCLR_EL0 is architecturally mapped to External register
PMCNTENCLR_EL0.
This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 
Attributes 
PMCNTENCLR_EL0 is a 32-bit register.
Field descriptions
The PMCNTENCLR_EL0 bit assignments are:
31  30              0
C   P<n>, bit [n]
C, bit [31]
PMCCNTR_EL0 disable bit. Disables the cycle counter register. Possible values are:
0 When read, means the cycle counter is disabled. When written, has no effect.
1 When read, means the cycle counter is enabled. When written, disables the cycle
counter.
P<n>, bit [n], for n = 0 to 30
Event counter disable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN. Otherwise, N is the value in PMCR_EL0.N.

Possible values of each bit are:
0 When read, means that PMEVCNTR<n>_EL0 is disabled. When written, has no effect.
1 When read, means that PMEVCNTR<n>_EL0 is enabled. When written, disables
PMEVCNTR<n>_EL0.
Accessing the PMCNTENCLR_EL0:
To access the PMCNTENCLR_EL0:

MRS <Xt>, PMCNTENCLR_EL0 ; Read PMCNTENCLR_EL0 into Xt
MSR PMCNTENCLR_EL0, <Xt> ; Write Xt to PMCNTENCLR_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1100  010
D7.4.6 PMCNTENSET_EL0, Performance Monitors Count Enable Set register
The PMCNTENSET_EL0 characteristics are:
Purpose
Enables the Cycle Count Register, PMCCNTR_EL0, and any implemented event counters
PMEVCNTR<n>. Reading this register shows which counters are enabled.
Usage constraints 
This register is accessible as follows: 
EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.
Configurations
AArch64 System register PMCNTENSET_EL0 is architecturally mapped to AArch32 System
register PMCNTENSET.
AArch64 System register PMCNTENSET_EL0 is architecturally mapped to External register
PMCNTENSET_EL0.
This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 
Attributes 
PMCNTENSET_EL0 is a 32-bit register.
Field descriptions
The PMCNTENSET_EL0 bit assignments are:
31  30              0
C   P<n>, bit [n]
C, bit [31]
PMCCNTR_EL0 enable bit. Enables the cycle counter register. Possible values are:
0 When read, means the cycle counter is disabled. When written, has no effect.
1 When read, means the cycle counter is enabled. When written, enables the cycle
counter.
P<n>, bit [n], for n = 0 to 30
Event counter enable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN. Otherwise, N is the value in PMCR_EL0.N.

Possible values of each bit are:
0 When read, means that PMEVCNTR<n>_EL0 is disabled. When written, has no effect.
1 When read, means that PMEVCNTR<n>_EL0 event counter is enabled. When written,
enables PMEVCNTR<n>_EL0.
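
The set and clear registers act on one shared enable state: a write of 1 to a bit changes that counter, a write of 0 has no effect, and bits [30:N] are RAZ/WI. The following is a minimal software model of that behavior, not a description of any implementation. The class name CountEnable and its methods are hypothetical, and the model starts with all counters disabled as an assumption, whereas the architectural reset value is UNKNOWN.

```python
class CountEnable:
    """Toy model of the state shared by PMCNTENSET_EL0 and PMCNTENCLR_EL0."""

    def __init__(self, n_accessible):
        # Bits [N-1:0] select event counters, bit 31 selects the cycle counter.
        # Bits [30:N] are RAZ/WI, so they are excluded from the writable mask.
        self.writable = ((1 << n_accessible) - 1) | (1 << 31)
        self.enabled = 0  # assumption: the real reset value is UNKNOWN

    def write_set(self, value):
        # Writing 1 enables a counter, writing 0 has no effect.
        self.enabled |= value & self.writable

    def write_clr(self, value):
        # Writing 1 disables a counter, writing 0 has no effect.
        self.enabled &= ~(value & self.writable) & 0xFFFFFFFF

    def read(self):
        # Both registers read back the same enable state.
        return self.enabled


regs = CountEnable(n_accessible=4)
regs.write_set(0xFFFFFFFF)   # only bits [3:0] and bit 31 take effect
regs.write_clr(0x80000001)   # disable the cycle counter and counter 0
print(hex(regs.read()))
```

Because a write of 0 never changes a bit, software can enable or disable one counter without a read-modify-write sequence.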
Accessing the PMCNTENSET_EL0:
To access the PMCNTENSET_EL0:

MRS <Xt>, PMCNTENSET_EL0 ; Read PMCNTENSET_EL0 into Xt
MSR PMCNTENSET_EL0, <Xt> ; Write Xt to PMCNTENSET_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1100  001
D7.4.7 PMCR_EL0, Performance Monitors Control Register

The PMCR_EL0 characteristics are:

Purpose
Provides details of the Performance Monitors implementation, including the number of counters
implemented, and configures and controls the counters.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If MDCR_EL2.TPMCR==1, Non-secure accesses to this register from EL0 and EL1 are
trapped to EL2.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.


Configurations
AArch64 System register PMCR_EL0 is architecturally mapped to AArch32 System register
PMCR.

AArch64 System register PMCR_EL0[6:0] is architecturally mapped to External register
PMCR_EL0[6:0].

This register is in the Warm reset domain. Some or all RW fields of this register have defined reset
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is
using AArch64. Otherwise, on a Warm or Cold reset RW fields in this register reset to
architecturally UNKNOWN values.

Attributes
PMCR_EL0 is a 32-bit register.


Field descriptions

The PMCR_EL0 bit assignments are:

31    24  23    16  15  11  10   7  6   5   4  3  2  1  0
IMP       IDCODE    N       RES0    LC  DP  X  D  C  P  E


IMP, bits [31:24] 
Implementer code. This field is RO with an IMPLEMENTATION DEFINED value. 


The implementer codes are allocated by ARM. Values have the same interpretation as bits [31:24] 
of the MIDR. 
IDCODE, bits [23:16]
Identification code. This field is RO with an IMPLEMENTATION DEFINED value.

Each implementer must maintain a list of identification codes that is specific to the implementer. A
specific implementation is identified by the combination of the implementer code and the
identification code.

N, bits [15:11]
Number of event counters. A RO field that indicates the number of counters implemented. A value of
0b00000 in this field indicates that only the Cycle Count Register PMCCNTR_EL0 is implemented.

The value of this field is the number of event counters implemented. This value is in the range of
0b00000, in which case only the PMCCNTR_EL0 is implemented, to 0b11111, which indicates that
the PMCCNTR_EL0 and 31 event counters are implemented.

In an implementation that includes EL2, reads of this field from Non-secure EL1 and Non-secure
EL0 return the value of MDCR_EL2.HPMN.

Bits [10:7]
Reserved, RES0.


LC, bit [6]
Long cycle counter enable. Determines which PMCCNTR_EL0 bit generates an overflow recorded
by PMOVSR[31].
0 Cycle counter overflow on increment that changes PMCCNTR_EL0[31] from 1 to 0.
1 Cycle counter overflow on increment that changes PMCCNTR_EL0[63] from 1 to 0.
ARM deprecates use of PMCR_EL0.LC = 0.

In an AArch64-only implementation, this field is RES1.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.
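
The effect of LC on overflow detection can be sketched as follows. This is a minimal model under the description above: an overflow is recorded when the selected bit of the 64-bit cycle counter changes from 1 to 0 on an increment. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def cycle_overflow(before, after, lc):
    """Return True if the step from before to after records a cycle counter overflow.

    With LC == 0 the overflow bit is PMCCNTR_EL0[31], with LC == 1 it is
    PMCCNTR_EL0[63]. An overflow is a 1 to 0 change of that bit.
    """
    bit = 63 if lc else 31
    return ((before >> bit) & 1) == 1 and ((after >> bit) & 1) == 0


def increment(counter):
    """Advance the modeled 64-bit cycle counter by one, wrapping at 2**64."""
    return (counter + 1) & MASK64


value = 0x00000000FFFFFFFF
print(cycle_overflow(value, increment(value), lc=0))
print(cycle_overflow(value, increment(value), lc=1))
```

With LC set to 0, the counter in this model still holds 64 bits. Only the bit that is watched for overflow changes.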


DP, bit [5]
Disable cycle counter when event counting is prohibited. The possible values of this bit are:
0 PMCCNTR_EL0, if enabled, counts when event counting is prohibited.
1 PMCCNTR_EL0 does not count when event counting is prohibited.

Counting events is never prohibited in Non-secure state. However, there are some restrictions on
counting events in Secure state. For more information about the interaction between the
Performance Monitors and EL3, see Interaction with EL3 on page D5-1841.

If EL3 is not implemented, this field is RES0, otherwise it is an RW field.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.


X, bit [4]
Enable export of events in an IMPLEMENTATION DEFINED event stream. The possible values of this
bit are:
0 Do not export events.
1 Export events where not prohibited.


This field enables the exporting of events over an event bus to another device, for example to an 
OPTIONAL trace macrocell. If the implementation does not include such an event bus then this field 
is RAZ/WI, otherwise it is an RW field. 


In an implementation that includes an event bus, no events are exported when counting is prohibited. 


This field does not affect the generation of Performance Monitors overflow interrupt requests or 
signaling to a cross-trigger interface (CTI) that can be implemented as signals exported from the PE. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 
D, bit [3]
Clock divider. The possible values of this bit are:
0 When enabled, PMCCNTR_EL0 counts every clock cycle.
1 When enabled, PMCCNTR_EL0 counts once every 64 clock cycles.

In an AArch64-only implementation this field is RES0, otherwise it is an RW field. If
PMCR_EL0.LC == 1, this bit is ignored and the cycle counter counts every clock cycle.

ARM deprecates use of PMCR_EL0.D = 1.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.


C, bit [2]
Cycle counter reset. This bit is WO. The effects of writing to this bit are:
0 No action.
1 Reset PMCCNTR_EL0 to zero.
This bit is always RAZ.

Resetting PMCCNTR_EL0 does not clear the PMCCNTR_EL0 overflow bit to 0.


P, bit [1]
Event counter reset. This bit is WO. The effects of writing to this bit are:
0 No action.
1 Reset all event counters accessible in the current EL, not including PMCCNTR_EL0,
to zero.
This bit is always RAZ.

In Non-secure EL0 and EL1, if EL2 is implemented, a write of 1 to this bit does not reset event
counters that MDCR_EL2.HPMN reserves for EL2 use.

In EL2 and EL3, a write of 1 to this bit resets all the event counters.

Resetting the event counters does not clear any overflow bits to 0.
E, bit [0]
Enable. The possible values of this bit are:
0 All counters that are accessible at Non-secure EL1, including PMCCNTR_EL0, are
disabled.
1 All counters that are accessible at Non-secure EL1 are enabled by
PMCNTENSET_EL0.
This bit is RW.

If EL2 is implemented, this bit does not affect the operation of event counters that
MDCR_EL2.HPMN reserves for EL2 use.

When this register has an architecturally-defined reset value, this field resets to 0.
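
The field layout above can be summarized by a small decoder. This is a sketch that only splits a 32-bit PMCR_EL0 read value into the fields described in this section. The function name decode_pmcr and the example value are hypothetical. The C and P bits are omitted because they are write-only and always read as zero.

```python
def decode_pmcr(value):
    """Split a PMCR_EL0 read value into its readable fields."""
    return {
        "IMP":    (value >> 24) & 0xFF,  # bits [31:24], implementer code
        "IDCODE": (value >> 16) & 0xFF,  # bits [23:16], identification code
        "N":      (value >> 11) & 0x1F,  # bits [15:11], number of event counters
        "LC":     (value >> 6) & 1,      # bit [6], long cycle counter enable
        "DP":     (value >> 5) & 1,      # bit [5], disable cycle counter when prohibited
        "X":      (value >> 4) & 1,      # bit [4], export enable
        "D":      (value >> 3) & 1,      # bit [3], clock divider
        "E":      value & 1,             # bit [0], enable
    }


# Hypothetical value: IMP = 0x41, IDCODE = 0x03, N = 6, LC = 1, E = 1.
fields = decode_pmcr(0x41033041)
print(fields["N"], fields["LC"], fields["E"])
```

When read from Non-secure EL1 or EL0 on an implementation that includes EL2, the N field returns MDCR_EL2.HPMN, so the decoded count is the number of accessible counters rather than the number implemented.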


Accessing the PMCR_EL0:
To access the PMCR_EL0:

MRS <Xt>, PMCR_EL0 ; Read PMCR_EL0 into Xt
MSR PMCR_EL0, <Xt> ; Write Xt to PMCR_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1100  000
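
The op0, op1, CRn, CRm, and op2 values in these tables select the register within an MRS or MSR instruction. The sketch below builds the 32-bit instruction word from them. It assumes the standard A64 layout for register-form MRS and MSR, which is defined in the instruction set chapters rather than in this section, and the function name sysreg_insn is hypothetical.

```python
def sysreg_insn(op0, op1, crn, crm, op2, rt, read):
    """Return the A64 instruction word for MRS (read=True) or MSR (read=False).

    Assumed layout: bits [31:22] fixed, bit [21] is 1 for MRS and 0 for MSR,
    op0 in [20:19], op1 in [18:16], CRn in [15:12], CRm in [11:8],
    op2 in [7:5], and the general-purpose register number Rt in [4:0].
    """
    word = 0xD5000000
    word |= (op0 & 0x3) << 19
    word |= (op1 & 0x7) << 16
    word |= (crn & 0xF) << 12
    word |= (crm & 0xF) << 8
    word |= (op2 & 0x7) << 5
    word |= rt & 0x1F
    if read:
        word |= 1 << 21
    return word


# PMCR_EL0 is op0=0b11, op1=0b011, CRn=0b1001, CRm=0b1100, op2=0b000.
print(hex(sysreg_insn(0b11, 0b011, 0b1001, 0b1100, 0b000, rt=0, read=True)))
```

An assembler normally performs this encoding from the register name. The sketch is mainly useful for checking a disassembly or for toolchains that lack the register name.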
D7.4.8 PMEVCNTR<n>_EL0, Performance Monitors Event Count Registers, n = 0 - 30
The PMEVCNTR<n>_EL0 characteristics are:

Purpose
Holds event counter n, which counts events, where n is 0 to 30.


Usage constraints 


This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW

PMEVCNTR<n>_EL0 can also be accessed by using PMXEVCNTR_EL0 with
PMSELR_EL0.SEL set to n.

If <n> is greater than or equal to the number of accessible counters, reads and writes of
PMEVCNTR<n>_EL0 are CONSTRAINED UNPREDICTABLE, and the following behaviors are
permitted:
• Accesses to the register are UNDEFINED.
• Accesses to the register behave as RAZ/WI.
• Accesses to the register execute as a NOP.
• For an access from Non-secure EL1, or an access from Non-secure EL0 when the value of
PMUSERENR_EL0.EN is 1, if PMSELR_EL0.SEL is greater than or equal to the number of
accessible counters but is less than the number of implemented counters, the register access
is trapped to EL2.

Note
In an implementation that includes EL2, in Non-secure state at EL0 and EL1, MDCR_EL2.HPMN
identifies the number of accessible counters. Otherwise, the number of accessible counters is the
number of implemented counters.





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.ER==0, read accesses to this
register from EL0 are trapped to EL1.
• If PMUSERENR_EL0.EN==0, write accesses to this register from EL0 are trapped to EL1.


Configurations
AArch64 System register PMEVCNTR<n>_EL0 is architecturally mapped to AArch32 System
register PMEVCNTR<n>.

AArch64 System register PMEVCNTR<n>_EL0 is architecturally mapped to External register
PMEVCNTR<n>_EL0.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMEVCNTR<n>_EL0 is a 32-bit register.
Field descriptions

The PMEVCNTR<n>_EL0 bit assignments are:

31                 0
Event counter n

Bits [31:0]
Event counter n. Value of event counter n, where n is the number of this register and is a number
from 0 to 30.


Accessing the PMEVCNTR<n>_EL0:
To access the PMEVCNTR<n>_EL0:

MRS <Xt>, PMEVCNTR<n>_EL0 ; Read PMEVCNTR<n>_EL0 into Xt, where n is in the range 0 to 30
MSR PMEVCNTR<n>_EL0, <Xt> ; Write Xt to PMEVCNTR<n>_EL0, where n is in the range 0 to 30

Register access is encoded as follows:

op0  op1  CRn   CRm        op2
11   011  1110  10:n<4:3>  n<2:0>

PMEVCNTR<n>_EL0 can also be accessed by using PMXEVCNTR_EL0 with PMSELR_EL0.SEL set to the
value of <n>.
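
The encoding table splits the counter number n across CRm and op2. The following sketch derives both fields for a given n, following the table above. The function name pmevcntr_encoding is hypothetical.

```python
def pmevcntr_encoding(n):
    """Return (op0, op1, CRn, CRm, op2) for PMEVCNTR<n>_EL0, with n in 0 to 30."""
    if not 0 <= n <= 30:
        raise ValueError("n must be in the range 0 to 30")
    crm = (0b10 << 2) | (n >> 3)  # CRm = 10:n<4:3>
    op2 = n & 0b111               # op2 = n<2:0>
    return (0b11, 0b011, 0b1110, crm, op2)


for n in (0, 7, 8, 30):
    op0, op1, crn, crm, op2 = pmevcntr_encoding(n)
    print(n, format(crm, "04b"), format(op2, "03b"))
```

The range check reflects that only counters 0 to 30 exist. Whether a particular n is accessible still depends on the number of accessible counters described in the usage constraints.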
D7.4.9 PMEVTYPER<n>_EL0, Performance Monitors Event Type Registers, n = 0 - 30
The PMEVTYPER<n>_EL0 characteristics are:

Purpose
Configures event counter n, where n is 0 to 30.


Usage constraints 


This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW

PMEVTYPER<n>_EL0 can also be accessed by using PMXEVTYPER_EL0 with
PMSELR_EL0.SEL set to n.

If <n> is greater than or equal to the number of accessible counters, reads and writes of
PMEVTYPER<n>_EL0 are CONSTRAINED UNPREDICTABLE, and the following behaviors are
permitted:
• Accesses to the register are UNDEFINED.
• Accesses to the register behave as RAZ/WI.
• Accesses to the register execute as a NOP.
• For an access from Non-secure EL1, or an access from Non-secure EL0 when the value of
PMUSERENR_EL0.EN is 1, if PMSELR_EL0.SEL is greater than or equal to the number of
accessible counters but is less than the number of implemented counters, the register access
is trapped to EL2.

Note
In an implementation that includes EL2, in Non-secure state at EL0 and EL1, MDCR_EL2.HPMN
identifies the number of accessible counters. Otherwise, the number of accessible counters is the
number of implemented counters.





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.


Configurations
AArch64 System register PMEVTYPER<n>_EL0 is architecturally mapped to AArch32 System
register PMEVTYPER<n>.

AArch64 System register PMEVTYPER<n>_EL0 is architecturally mapped to External register
PMEVTYPER<n>_EL0.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMEVTYPER<n>_EL0 is a 32-bit register.
Field descriptions

The PMEVTYPER<n>_EL0 bit assignments are:

31  30  29   28   27   26  25  24    10  9          0
P   U   NSK  NSU  NSH  M   MT  RES0      evtCount

P, bit [31]
Privileged filtering bit. Controls counting in EL1. If EL3 is implemented, then counting in
Non-secure EL1 is further controlled by the NSK bit. The possible values of this bit are:
0 Count events in EL1.
1 Do not count events in EL1.

U, bit [30]
User filtering bit. Controls counting in EL0. If EL3 is implemented, then counting in Non-secure
EL0 is further controlled by the NSU bit. The possible values of this bit are:
0 Count events in EL0.
1 Do not count events in EL0.

NSK, bit [29]
Non-secure EL1 (kernel) modes filtering bit. Controls counting in Non-secure EL1. If EL3 is not
implemented, this bit is RES0.
If the value of this bit is equal to the value of P, events in Non-secure EL1 are counted.
Otherwise, events in Non-secure EL1 are not counted.

NSU, bit [28]
Non-secure EL0 (Unprivileged) filtering. Controls counting in Non-secure EL0. If EL3 is not
implemented, this bit is RES0.
If the value of this bit is equal to the value of U, events in Non-secure EL0 are counted.
Otherwise, events in Non-secure EL0 are not counted.

NSH, bit [27]
Non-secure EL2 (Hypervisor) filtering. Controls counting in Non-secure EL2. If EL2 is not
implemented, this bit is RES0.
0 Do not count events in EL2.
1 Count events in EL2.

M, bit [26]
Secure EL3 filtering bit. If EL3 is not implemented, this bit is RES0.
If the value of this bit is equal to the value of P, cycles in Secure EL3 are counted.
Otherwise, cycles in Secure EL3 are not counted.
Most applications can ignore this field and set its value to 0.

Note
This field is not visible in the AArch32 PMEVTYPER System register.
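
The way the filter bits combine can be sketched as a single predicate. This is a minimal model of the rules stated above and nothing more: it ignores whether counting is prohibited and whether the counter is enabled. The function name counts_in and its parameters are hypothetical. Bits that are RES0 on a given implementation are passed as 0.

```python
def counts_in(el, secure, P, U, NSK=0, NSU=0, NSH=0, M=0):
    """Return True if the filter bits allow counting at the given Exception level.

    el is 0 to 3, and secure selects the Security state for EL0 and EL1.
    """
    if el == 0:
        # Secure EL0 follows U alone. Non-secure EL0 is counted when NSU equals U.
        return U == 0 if secure else NSU == U
    if el == 1:
        # Secure EL1 follows P alone. Non-secure EL1 is counted when NSK equals P.
        return P == 0 if secure else NSK == P
    if el == 2:
        # EL2 is counted only when NSH is 1.
        return NSH == 1
    if el == 3:
        # EL3 is counted when M equals P.
        return M == P
    raise ValueError("el must be in the range 0 to 3")


# P=1 with NSK=1 excludes Secure EL1 but still counts Non-secure EL1.
print(counts_in(1, secure=True, P=1, U=1, NSK=1))
print(counts_in(1, secure=False, P=1, U=1, NSK=1))
```

Expressing NSK, NSU, and M as comparisons against P or U means that leaving them at 0 gives the same filtering in both Security states.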
MT, bit [25]
Multithreading. When the implementation is multi-threaded, the valid values for this bit are:
0 Count events only on controlling PE.
1 Count events from any PE with the same affinity at level 1 and above as this PE.

When the implementation is not multi-threaded, this bit is RES0.

Note
• An implementation is described as multi-threaded when the lowest level of affinity consists
of logical PEs that are implemented using a multi-threading type approach. That is, the
performance of PEs at the lowest affinity level is highly interdependent. On such an
implementation, the value of MPIDR_EL1.MT, when read at the highest implemented
Exception level, is 1.
• Events from a different thread of a multithreaded implementation are not Attributable to the
thread counting the event.

Bits [24:10]
Reserved, RES0.


evtCount, bits [9:0]
Event to count. The event number of the event that is counted by event counter
PMEVCNTR<n>_EL0.

Software must program this field with an event that is supported by the PE being programmed.

There are three ranges of event numbers:
• Event numbers in the range 0x000 to 0x03F are common architectural and microarchitectural
events.
• Event numbers in the range 0x040 to 0x0BF are ARM recommended common architectural and
microarchitectural events.
• Event numbers in the range 0x0C0 to 0x3FF are IMPLEMENTATION DEFINED events.

If evtCount is programmed to an event that is reserved or not supported by the PE, the behavior
depends on the event type:
• For the range 0x000 to 0x03F, no events are counted, and the value returned by a direct or
external read of the evtCount field is the value written to the field.
• For IMPLEMENTATION DEFINED events, it is UNPREDICTABLE what event, if any, is counted,
and the value returned by a direct or external read of the evtCount field is UNKNOWN.

Note
UNPREDICTABLE means the event must not expose privileged information.

ARM recommends that the behavior across a family of implementations is defined such that if a
given implementation does not include an event from a set of common IMPLEMENTATION DEFINED
events, then no event is counted and the value read back on evtCount is the value written.
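
The three event number ranges can be captured in a small classifier. This is an illustrative sketch only. The function name event_range and the returned strings are hypothetical labels for the ranges listed above.

```python
def event_range(evt_count):
    """Classify a 10-bit evtCount value into one of the three event number ranges."""
    if not 0 <= evt_count <= 0x3FF:
        raise ValueError("evtCount is a 10-bit field")
    if evt_count <= 0x03F:
        return "common"                   # 0x000 to 0x03F
    if evt_count <= 0x0BF:
        return "ARM recommended"          # 0x040 to 0x0BF
    return "IMPLEMENTATION DEFINED"       # 0x0C0 to 0x3FF


for evt in (0x011, 0x040, 0x0C0):
    print(hex(evt), event_range(evt))
```

The range matters when an event is unsupported: in the common range the field reads back as written and nothing is counted, whereas in the IMPLEMENTATION DEFINED range the read-back value is UNKNOWN.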
Accessing the PMEVTYPER<n>_EL0:
To access the PMEVTYPER<n>_EL0:

MRS <Xt>, PMEVTYPER<n>_EL0 ; Read PMEVTYPER<n>_EL0 into Xt, where n is in the range 0 to 30
MSR PMEVTYPER<n>_EL0, <Xt> ; Write Xt to PMEVTYPER<n>_EL0, where n is in the range 0 to 30

Register access is encoded as follows:

op0  op1  CRn   CRm        op2
11   011  1110  11:n<4:3>  n<2:0>
D7.4.10 PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register
The PMINTENCLR_EL1 characteristics are:
Purpose
Disables the generation of interrupt requests on overflows from the Cycle Count Register,
PMCCNTR_EL0, and the event counters PMEVCNTR<n>_EL0. Reading the register shows which
overflow interrupt requests are enabled.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL1 and EL2 are trapped to EL3.
Configurations 
AArch64 System register PMINTENCLR_EL1 is architecturally mapped to AArch32 System 
register PMINTENCLR. 
AArch64 System register PMINTENCLR_EL1 is architecturally mapped to External register 
PMINTENCLR_EL1. 
This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 
Attributes 
PMINTENCLR_EL1 is a 32-bit register. 
Field descriptions 
The PMINTENCLR_EL1 bit assignments are:
31  30              0
C   P<n>, bit [n]
C, bit [31]
PMCCNTR_EL0 overflow interrupt request disable bit. Possible values are:
0 When read, means the cycle counter overflow interrupt request is disabled. When
written, has no effect.
1 When read, means the cycle counter overflow interrupt request is enabled. When
written, disables the cycle count overflow interrupt request.
P<n>, bit [n], for n = 0 to 30
Event counter overflow interrupt request disable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN. Otherwise, N is the value in PMCR_EL0.N.

Possible values are:
0 When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
disabled. When written, has no effect.
1 When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
enabled. When written, disables the PMEVCNTR<n>_EL0 interrupt request.


Accessing the PMINTENCLR_EL1:
To access the PMINTENCLR_EL1:

MRS <Xt>, PMINTENCLR_EL1 ; Read PMINTENCLR_EL1 into Xt
MSR PMINTENCLR_EL1, <Xt> ; Write Xt to PMINTENCLR_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  1001  1110  010
D7.4.11 PMINTENSET_EL1, Performance Monitors Interrupt Enable Set register
The PMINTENSET_EL1 characteristics are:
Purpose
Enables the generation of interrupt requests on overflows from the Cycle Count Register,
PMCCNTR_EL0, and the event counters PMEVCNTR<n>_EL0. Reading the register shows which
overflow interrupt requests are enabled.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules: 
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL1 and EL2 are trapped to EL3.
Configurations 
AArch64 System register PMINTENSET_EL1 is architecturally mapped to AArch32 System 
register PMINTENSET. 
AArch64 System register PMINTENSET_EL1 is architecturally mapped to External register 
PMINTENSET_EL1. 
This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 
Attributes 
PMINTENSET_EL1 is a 32-bit register. 
Field descriptions 
The PMINTENSET_EL1 bit assignments are:
31  30              0
C   P<n>, bit [n]
C, bit [31]
PMCCNTR_EL0 overflow interrupt request enable bit. Possible values are:
0 When read, means the cycle counter overflow interrupt request is disabled. When
written, has no effect.
1 When read, means the cycle counter overflow interrupt request is enabled. When
written, enables the cycle count overflow interrupt request.
P<n>, bit [n], for n = 0 to 30
Event counter overflow interrupt request enable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN. Otherwise, N is the value in PMCR_EL0.N.

Possible values are:
0 When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
disabled. When written, has no effect.
1 When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
enabled. When written, enables the PMEVCNTR<n>_EL0 interrupt request.


Accessing the PMINTENSET_EL1:
To access the PMINTENSET_EL1:

MRS <Xt>, PMINTENSET_EL1 ; Read PMINTENSET_EL1 into Xt
MSR PMINTENSET_EL1, <Xt> ; Write Xt to PMINTENSET_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  1001  1110  001
D7.4.12 PMOVSCLR_EL0, Performance Monitors Overflow Flag Status Clear Register

The PMOVSCLR_EL0 characteristics are:

Purpose
Contains the state of the overflow bit for the Cycle Count Register, PMCCNTR_EL0, and each of
the implemented event counters PMEVCNTR<n>. Writing to this register clears these bits.

Usage constraints 
This register is accessible as follows: 
EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW
This register is accessible at EL0 when PMUSERENR_EL0.EN is set to 1.

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.

Configurations 
AArch64 System register PMOVSCLR_EL0 is architecturally mapped to AArch32 System register
PMOVSR.
AArch64 System register PMOVSCLR_EL0 is architecturally mapped to External register
PMOVSCLR_EL0.
This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMOVSCLR_EL0 is a 32-bit register.

Field descriptions 

The PMOVSCLR_EL0 bit assignments are:

31  30              0
C   P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 overflow bit. Possible values are:
0 When read, means the cycle counter has not overflowed. When written, has no effect.
1 When read, means the cycle counter has overflowed. When written, clears the overflow
bit to 0.
PMCR_EL0.LC controls whether an overflow is detected from PMCCNTR_EL0[31] or from
PMCCNTR_EL0[63].

P<n>, bit [n], for n = 0 to 30
Event counter overflow clear bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI.

When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in MDCR_EL2.HPMN.
Otherwise, N is the value in PMCR_EL0.N.

Possible values of each bit are:
0 When read, means that PMEVCNTR<n>_EL0 has not overflowed. When written, has
no effect.
1 When read, means that PMEVCNTR<n>_EL0 has overflowed. When written, clears
the PMEVCNTR<n>_EL0 overflow bit to 0.
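
Because a write of 0 has no effect, software can clear exactly the overflow bits it has observed by writing the value it read back to the register. The sketch below models that pattern on a plain integer. It is not an access to the real register, and the function names are hypothetical.

```python
def overflowed(status, n_accessible):
    """List the counters whose overflow bit is set in a PMOVSCLR_EL0 read value."""
    names = [f"PMEVCNTR{n}_EL0" for n in range(n_accessible) if (status >> n) & 1]
    if (status >> 31) & 1:
        names.append("PMCCNTR_EL0")
    return names


def write_one_to_clear(status, written):
    """Return the overflow state after writing `written` to PMOVSCLR_EL0."""
    return status & ~written & 0xFFFFFFFF


seen = 0x00000005                        # value read: counters 0 and 2 overflowed
print(overflowed(seen, n_accessible=6))
current = seen | (1 << 31)               # the cycle counter overflows after the read
print(hex(write_one_to_clear(current, seen)))
```

In this model the cycle counter overflow that arrives after the read survives the write, because only the bits written as 1 are cleared.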


Accessing the PMOVSCLR_EL0:
To access the PMOVSCLR_EL0:

MRS <Xt>, PMOVSCLR_EL0 ; Read PMOVSCLR_EL0 into Xt
MSR PMOVSCLR_EL0, <Xt> ; Write Xt to PMOVSCLR_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1100  011
D7.4.13 PMOVSSET_EL0, Performance Monitors Overflow Flag Status Set register
The PMOVSSET_EL0 characteristics are:

Purpose
Sets the state of the overflow bit for the Cycle Count Register, PMCCNTR_EL0, and each of the
implemented event counters PMEVCNTR<n>.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.


Configurations
AArch64 System register PMOVSSET_EL0 is architecturally mapped to AArch32 System register
PMOVSSET.

AArch64 System register PMOVSSET_EL0 is architecturally mapped to External register
PMOVSSET_EL0.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMOVSSET_EL0 is a 32-bit register.


Field descriptions

The PMOVSSET_EL0 bit assignments are:

31  30               0
C   P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 overflow bit. Possible values are:
0  When read, means the cycle counter has not overflowed. When written, has no effect.
1  When read, means the cycle counter has overflowed. When written, sets the overflow
   bit to 1.

P<n>, bit [n], for n = 0 to 30
Event counter overflow set bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI.
When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in MDCR_EL2.HPMN.
Otherwise, N is the value in PMCR_EL0.N.

Possible values are:
0  When read, means that PMEVCNTR<n>_EL0 has not overflowed. When written, has
   no effect.
1  When read, means that PMEVCNTR<n>_EL0 has overflowed. When written, sets the
   PMEVCNTR<n>_EL0 overflow bit to 1.
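The write-one-to-set behavior and the RAZ/WI treatment of bits [30:N] can be illustrated with a small software model. This is a minimal sketch only: the function names and the idea of holding the flags in a plain integer are assumptions made for illustration, and n stands for the value N described above.

```python
C_BIT = 1 << 31  # C, bit [31]: cycle counter overflow flag


def visible_mask(n):
    """Bits that are not RAZ/WI: C plus P<0>..P<n-1> (bits [30:n] are RAZ/WI)."""
    if not 0 <= n <= 31:
        raise ValueError("n must be in the range 0..31")
    return C_BIT | ((1 << n) - 1)


def pmovsset_write(flags, value, n):
    """Model a write: each written 1 sets its flag, each written 0 has no effect."""
    return (flags | (value & visible_mask(n))) & 0xFFFFFFFF


def pmovsset_read(flags, n):
    """Model a read: bits [30:n] read as zero."""
    return flags & visible_mask(n)
```

For example, with four accessible event counters a write of all ones sets only C and P<3:0>.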


Accessing the PMOVSSET_EL0:
To access the PMOVSSET_EL0:

MRS <Xt>, PMOVSSET_EL0 ; Read PMOVSSET_EL0 into Xt
MSR PMOVSSET_EL0, <Xt> ; Write Xt to PMOVSSET_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1110  011

D7.4.14 PMSELR_EL0, Performance Monitors Event Counter Selection Register
The PMSELR_EL0 characteristics are:

Purpose
Selects the current event counter PMEVCNTR<n> or the cycle counter, CCNT.
PMSELR_EL0 is used in conjunction with PMXEVTYPER_EL0 to determine the event that
increments a selected event counter, and the modes and states in which the selected counter
increments.
It is also used in conjunction with PMXEVCNTR_EL0, to determine the value of a selected event
counter.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
  to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
  EL3.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.ER==0, accesses to this register
  from EL0 are trapped to EL1.

Configurations
AArch64 System register PMSELR_EL0 is architecturally mapped to AArch32 System register
PMSELR.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMSELR_EL0 is a 32-bit register.
Field descriptions

The PMSELR_EL0 bit assignments are:

31        5  4   0
RES0         SEL

Bits [31:5]
Reserved, RES0.

SEL, bits [4:0]
Selects event counter, PMEVCNTR<n>, where n is the value held in this field. This value identifies
which event counter is accessed when a subsequent access to PMXEVTYPER_EL0 or
PMXEVCNTR_EL0 occurs.

This field can take any value from 0 (0b00000) to (PMCR.N)-1, or 31 (0b11111).

When PMSELR_EL0.SEL is 0b11111 it selects the cycle counter and:

• A read of the PMXEVTYPER_EL0 returns the value of PMCCFILTR_EL0.
• A write of the PMXEVTYPER_EL0 writes to PMCCFILTR_EL0.
• A read or write of PMXEVCNTR_EL0 has CONSTRAINED UNPREDICTABLE effects, that can
  be one of the following:
  - Access to PMXEVCNTR_EL0 is UNDEFINED.
  - Access to PMXEVCNTR_EL0 behaves as a NOP.
  - Access to PMXEVCNTR_EL0 behaves as if the register is RAZ/WI.
  - Access to PMXEVCNTR_EL0 behaves as if the PMSELR_EL0.SEL field contains an
    UNKNOWN value.

If this field is set to a value greater than or equal to the number of implemented counters, but not
equal to 31, the results of access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 are CONSTRAINED
UNPREDICTABLE, and can be one of the following:

• Access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 is UNDEFINED.
• Access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 behaves as a NOP.
• Access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 behaves as if the register is
  RAZ/WI.
• Access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 behaves as if the
  PMSELR_EL0.SEL field contains an UNKNOWN value.
• Access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 behaves as if the
  PMSELR_EL0.SEL field contains 0b11111.

Direct reads of this field return an UNKNOWN value.
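The selection rules for SEL can be summarized as a lookup from the SEL value to the register that a subsequent PMXEVTYPER_EL0 or PMXEVCNTR_EL0 access reaches. The following is a minimal sketch; the function names, the string return values, and the num_counters parameter (standing for the number of implemented event counters) are illustrative assumptions, and the CONSTRAINED UNPREDICTABLE cases are reported rather than resolved to any one permitted behavior.

```python
CYCLE_COUNTER_SEL = 0b11111  # SEL value that selects the cycle counter

CU = "CONSTRAINED UNPREDICTABLE"


def pmxevtyper_target(sel, num_counters):
    """Register reached through PMXEVTYPER_EL0 for a given SEL value."""
    if not 0 <= sel <= 31:
        raise ValueError("SEL is a 5-bit field")
    if sel == CYCLE_COUNTER_SEL:
        return "PMCCFILTR_EL0"
    if sel < num_counters:
        return f"PMEVTYPER{sel}_EL0"
    return CU


def pmxevcntr_target(sel, num_counters):
    """Register reached through PMXEVCNTR_EL0 for a given SEL value."""
    if not 0 <= sel <= 31:
        raise ValueError("SEL is a 5-bit field")
    if sel != CYCLE_COUNTER_SEL and sel < num_counters:
        return f"PMEVCNTR{sel}_EL0"
    # SEL == 31, or SEL >= number of implemented counters.
    return CU
```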


Accessing the PMSELR_EL0:
To access the PMSELR_EL0:

MRS <Xt>, PMSELR_EL0 ; Read PMSELR_EL0 into Xt
MSR PMSELR_EL0, <Xt> ; Write Xt to PMSELR_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1100  101

D7.4.15 PMSWINC_EL0, Performance Monitors Software Increment register
The PMSWINC_EL0 characteristics are:

Purpose
Increments a counter that is configured to count the Software increment event, event 0x00. For more
information, see SW_INCR.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-WO  WO       WO      WO       WO             WO





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If MDCR_EL2.TPM==1, Non-secure write accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDCR_EL3.TPM==1, write accesses to this register from EL0, EL1, and EL2 are trapped
  to EL3.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.SW==0, write accesses to this
  register from EL0 are trapped to EL1.

Configurations
AArch64 System register PMSWINC_EL0 is architecturally mapped to AArch32 System register
PMSWINC.

AArch64 System register PMSWINC_EL0 is architecturally mapped to External register
PMSWINC_EL0.

Attributes
PMSWINC_EL0 is a 32-bit register.


Field descriptions

The PMSWINC_EL0 bit assignments are:

31    30               0
RES0  P<n>, bit [n]

Bit [31]
Reserved, RES0.

P<n>, bit [n], for n = 0 to 30
Event counter software increment bit for PMEVCNTR<n>_EL0.
Bits [30:N] are WI.
When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in MDCR_EL2.HPMN.
Otherwise, N is the value in PMCR.N.

The effects of writing to this bit are:
0  No action. The write to this bit is ignored.
1  If PMEVCNTR<n>_EL0 is enabled and configured to count the software increment
   event, increments PMEVCNTR<n>_EL0 by 1. If PMEVCNTR<n>_EL0 is disabled, or
   not configured to count the software increment event, the write to this bit is ignored.
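The effect of a write to PMSWINC_EL0 can be sketched as a loop over the accessible counters. This is an illustrative model only: the function name, the list-based representation of counter state, and the 32-bit wrap-around are assumptions, and overflow flag updates are deliberately left out.

```python
SW_INCR = 0x00  # event number of the Software increment event


def pmswinc_write(value, counters, event_types, enabled, n):
    """Return the counter values after writing `value` to PMSWINC_EL0.

    counters, event_types, and enabled are parallel lists indexed by counter
    number; n is the value N described above, so bits [30:n] are ignored.
    """
    result = list(counters)
    for i in range(min(n, len(counters))):
        written_one = (value >> i) & 1
        if written_one and enabled[i] and event_types[i] == SW_INCR:
            # Counters are modelled as 32-bit values (an assumption of this sketch).
            result[i] = (result[i] + 1) & 0xFFFFFFFF
    return result
```

Only counters that are both enabled and programmed with event 0x00 change; every other written bit is ignored.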

Accessing the PMSWINC_EL0:
To access the PMSWINC_EL0:

MSR PMSWINC_EL0, <Xt> ; Write Xt to PMSWINC_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1100  100

D7.4.16 PMUSERENR_EL0, Performance Monitors User Enable Register
The PMUSERENR_EL0 characteristics are:

Purpose
Enables or disables EL0 access to the Performance Monitors.

Usage constraints
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RO   RW       RW      RW       RW             RW





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
  to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
  EL3.

Configurations
AArch64 System register PMUSERENR_EL0 is architecturally mapped to AArch32 System
register PMUSERENR.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMUSERENR_EL0 is a 32-bit register.


Field descriptions

The PMUSERENR_EL0 bit assignments are:

31        4  3   2   1   0
RES0         ER  CR  SW  EN

Bits [31:4]
Reserved, RES0.

ER, bit [3]
Event counter read trap control:
0  EL0 using AArch64: EL0 reads of the PMXEVCNTR_EL0 and
   PMEVCNTR<n>_EL0, and EL0 read/write accesses to the PMSELR_EL0, are trapped
   to EL1 if PMUSERENR_EL0.EN is also 0.
   EL0 using AArch32: EL0 reads of the PMXEVCNTR and PMEVCNTR<n>, and EL0
   read/write accesses to the PMSELR, are trapped to EL1 if PMUSERENR_EL0.EN is
   also 0.
1  EL0 using AArch64: EL0 reads of the PMXEVCNTR_EL0 and
   PMEVCNTR<n>_EL0, and EL0 read/write accesses to the PMSELR_EL0, are not
   trapped to EL1.
   EL0 using AArch32: EL0 reads of the PMXEVCNTR and PMEVCNTR<n>, and EL0
   read/write accesses to the PMSELR, are not trapped to EL1.

CR, bit [2]
Cycle counter read trap control:
0  EL0 using AArch64: EL0 read accesses to the PMCCNTR_EL0 are trapped to EL1 if
   PMUSERENR_EL0.EN is also 0.
   EL0 using AArch32: EL0 read accesses to the PMCCNTR are trapped to EL1 if
   PMUSERENR_EL0.EN is also 0.
1  EL0 using AArch64: EL0 read accesses to the PMCCNTR_EL0 are not trapped to EL1.
   EL0 using AArch32: EL0 read accesses to the PMCCNTR are not trapped to EL1.

SW, bit [1]
Software Increment write trap control:
0  EL0 using AArch64: EL0 writes to the PMSWINC_EL0 are trapped to EL1 if
   PMUSERENR_EL0.EN is also 0.
   EL0 using AArch32: EL0 writes to the PMSWINC are trapped to EL1 if
   PMUSERENR_EL0.EN is also 0.
1  EL0 using AArch64: EL0 writes to the PMSWINC_EL0 are not trapped to EL1.
   EL0 using AArch32: EL0 writes to the PMSWINC are not trapped to EL1.

EN, bit [0]
Traps EL0 accesses to the Performance Monitors registers to EL1, from both Execution states:
0  EL0 accesses to the Performance Monitors registers are trapped to EL1, unless enabled
   by one of PMUSERENR_EL0.{ER, CR, SW}.
1  EL0 accesses to the Performance Monitors registers are not trapped to EL1. Software
   can access all PMU registers at EL0.

Note
• The PMUSERENR_EL0 and PMUSERENR registers are always RO at EL0 and not trapped
  by this bit.
• EL0 cannot read or write PMINTENSET_EL1 and PMINTENCLR_EL1, or PMINTENSET
  and PMINTENCLR.
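The interaction between EN and the ER, CR, and SW bits can be written as a single decision function for AArch64 accesses from EL0. The sketch below is illustrative: the function name and the use of register-name strings are assumptions, it covers only registers that are accessible from EL0, and it ignores the MDCR_EL2 and MDCR_EL3 traps.

```python
ER, CR, SW, EN = 1 << 3, 1 << 2, 1 << 1, 1 << 0  # PMUSERENR_EL0 bit masks


def el0_access_trapped(pmuserenr, register, is_write):
    """Return True if an EL0 access to a PMU register is trapped to EL1."""
    if register == "PMUSERENR_EL0":
        if is_write:
            raise ValueError("PMUSERENR_EL0 is RO at EL0")
        return False  # reads are not trapped by EN
    if pmuserenr & EN:
        return False  # EN == 1: no PMU access from EL0 is trapped to EL1
    # EN == 0: trapped unless one of ER, CR, SW enables this specific access.
    if register in ("PMXEVCNTR_EL0", "PMEVCNTR<n>_EL0") and not is_write:
        return not (pmuserenr & ER)
    if register == "PMSELR_EL0":
        return not (pmuserenr & ER)  # ER covers reads and writes of PMSELR_EL0
    if register == "PMCCNTR_EL0" and not is_write:
        return not (pmuserenr & CR)
    if register == "PMSWINC_EL0" and is_write:
        return not (pmuserenr & SW)
    return True
```

Note that ER and CR enable reads only, so with EN == 0 a write to PMCCNTR_EL0 is still trapped even when CR is 1.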





Accessing the PMUSERENR_EL0:
To access the PMUSERENR_EL0:

MRS <Xt>, PMUSERENR_EL0 ; Read PMUSERENR_EL0 into Xt
MSR PMUSERENR_EL0, <Xt> ; Write Xt to PMUSERENR_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1110  000

D7.4.17 PMXEVCNTR_EL0, Performance Monitors Selected Event Count Register
The PMXEVCNTR_EL0 characteristics are:

Purpose
Reads or writes the value of the selected event counter, PMEVCNTR<n>_EL0.
PMSELR_EL0.SEL determines which event counter is selected.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW





If PMSELR_EL0.SEL is greater than or equal to the number of accessible counters then reads and
writes of PMXEVCNTR_EL0 are CONSTRAINED UNPREDICTABLE, and the following behaviors are
permitted:

• Accesses to the register are UNDEFINED.
• Accesses to the register behave as RAZ/WI.
• Accesses to the register execute as a NOP.
• Accesses to the register behave as if PMSELR_EL0.SEL has an UNKNOWN value less than
  the number of counters accessible at the current Exception level and Security state.
• Accesses to the register behave as if PMSELR_EL0.SEL is 31.
• For an access from Non-secure EL1, or an access from Non-secure EL0 when the value of
  PMUSERENR_EL0.EN is 1, if PMSELR_EL0.SEL is greater than or equal to the number of
  accessible counters but is less than the number of implemented counters, the register access
  is trapped to EL2.

Note
In an implementation that includes EL2, in Non-secure state at EL0 and EL1, MDCR_EL2.HPMN
identifies the number of accessible counters. Otherwise, the number of accessible counters is the
number of implemented counters.





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
  to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
  EL3.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.ER==0, read accesses to this
  register from EL0 are trapped to EL1.
• If PMUSERENR_EL0.EN==0, write accesses to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register PMXEVCNTR_EL0 is architecturally mapped to AArch32 System
register PMXEVCNTR.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMXEVCNTR_EL0 is a 32-bit register.

Field descriptions

The PMXEVCNTR_EL0 bit assignments are:

31            0
PMEVCNTR<n>

PMEVCNTR<n>, bits [31:0]
Value of the selected event counter, PMEVCNTR<n>_EL0, where n is the value stored in
PMSELR_EL0.SEL.


Accessing the PMXEVCNTR_EL0:
To access the PMXEVCNTR_EL0:

MRS <Xt>, PMXEVCNTR_EL0 ; Read PMXEVCNTR_EL0 into Xt
MSR PMXEVCNTR_EL0, <Xt> ; Write Xt to PMXEVCNTR_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1101  010
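The five fields in an encoding table of this kind are the System register operand fields of the MRS and MSR (register) instruction words. The following sketch packs them, together with the number of the transfer register Xt, into a 32-bit word. The helper names are hypothetical, and the base opcodes and field positions are stated here as assumptions to be checked against the A64 MRS and MSR instruction descriptions: op0 contributes only its low bit (bit 19), op1 occupies bits [18:16], CRn bits [15:12], CRm bits [11:8], op2 bits [7:5], and Rt bits [4:0].

```python
MRS_BASE = 0xD5300000  # MRS <Xt>, <sysreg> (read), with all operand fields zero
MSR_BASE = 0xD5100000  # MSR <sysreg>, <Xt> (write), with all operand fields zero


def _fields(op0, op1, crn, crm, op2, rt):
    """Pack the System register operand fields and Rt into their bit positions."""
    if op0 not in (0b10, 0b11):
        raise ValueError("op0 must be 0b10 or 0b11 for MRS/MSR (register)")
    limits = (("op1", op1, 8), ("CRn", crn, 16), ("CRm", crm, 16),
              ("op2", op2, 8), ("Rt", rt, 32))
    for name, value, limit in limits:
        if not 0 <= value < limit:
            raise ValueError(f"{name} out of range")
    return (((op0 & 1) << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)


def encode_mrs(sysreg, rt):
    """Instruction word for MRS X<rt>, <sysreg>."""
    return MRS_BASE | _fields(*sysreg, rt)


def encode_msr(sysreg, rt):
    """Instruction word for MSR <sysreg>, X<rt>."""
    return MSR_BASE | _fields(*sysreg, rt)


# (op0, op1, CRn, CRm, op2) as given in the PMXEVCNTR_EL0 encoding table.
PMXEVCNTR_EL0 = (0b11, 0b011, 0b1001, 0b1101, 0b010)
```

A tuple per register keeps the table values readable in binary, so they can be compared field by field with the manual.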








Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
ID092916 


D7-2252 
Non-Confidential 


D7 AArch64 System Register Descriptions 
D7.4 Performance Monitors registers 


D7.4.18 PMXEVTYPER_EL0, Performance Monitors Selected Event Type Register
The PMXEVTYPER_EL0 characteristics are:

Purpose
When PMSELR_EL0.SEL selects an event counter, this accesses a PMEVTYPER<n>_EL0
register. When PMSELR_EL0.SEL selects the cycle counter, this accesses PMCCFILTR_EL0.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW





If PMSELR_EL0.SEL is greater than or equal to the number of accessible counters then reads and
writes of PMXEVTYPER_EL0 are CONSTRAINED UNPREDICTABLE, and the following behaviors are
permitted:

• Accesses to the register are UNDEFINED.
• Accesses to the register behave as RAZ/WI.
• Accesses to the register execute as a NOP.
• Accesses to the register behave as if PMSELR_EL0.SEL has an UNKNOWN value less than
  the number of counters accessible at the current Exception level and Security state.
• Accesses to the register behave as if PMSELR_EL0.SEL is 31.
• For an access from Non-secure EL1, or an access from Non-secure EL0 when the value of
  PMUSERENR_EL0.EN is 1, if PMSELR_EL0.SEL is greater than or equal to the number of
  accessible counters but is less than the number of implemented counters, the register access
  is trapped to EL2.

Note
In an implementation that includes EL2, in Non-secure state at EL0 and EL1, MDCR_EL2.HPMN
identifies the number of accessible counters. Otherwise, the number of accessible counters is the
number of implemented counters.





Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
  to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
  EL3.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register PMXEVTYPER_EL0 is architecturally mapped to AArch32 System
register PMXEVTYPER.

This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset
to architecturally UNKNOWN values.

Attributes
PMXEVTYPER_EL0 is a 32-bit register.

Field descriptions

The PMXEVTYPER_EL0 bit assignments are:

31                                      0
Event type register or PMCCFILTR_EL0

Bits [31:0]
Event type register or PMCCFILTR_EL0.
When PMSELR_EL0.SEL == 31, this register accesses PMCCFILTR_EL0.
Otherwise, this register accesses PMEVTYPER<n>_EL0 where n is the value in
PMSELR_EL0.SEL.


Accessing the PMXEVTYPER_EL0:
To access the PMXEVTYPER_EL0:

MRS <Xt>, PMXEVTYPER_EL0 ; Read PMXEVTYPER_EL0 into Xt
MSR PMXEVTYPER_EL0, <Xt> ; Write Xt to PMXEVTYPER_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1001  1101  001

D7.5 Generic Timer registers

This section lists the Generic Timer registers in AArch64.

D7.5.1 CNTFRQ_EL0, Counter-timer Frequency register
The CNTFRQ_EL0 characteristics are:

Purpose
This register is provided so that software can discover the frequency of the system counter. It must
be programmed with this value as part of system initialization. The value of the register is not
interpreted by hardware.


Usage constraints
If EL1 is the highest Exception level implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1
Config-RO  RW

If EL2 is the highest Exception level implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2(NS)
Config-RO  RO   RW

If EL3 is implemented and is using AArch64, this register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  RO       RO      RO       RW             RW

This register can only be written at the highest Exception level implemented. For example, if EL3
is the highest implemented Exception level, CNTFRQ_EL0 can only be written at EL3.
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If CNTKCTL_EL1.EL0PCTEN==0, and CNTKCTL_EL1.EL0VCTEN==0, read accesses
  to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register CNTFRQ_EL0 is architecturally mapped to AArch32 System register
CNTFRQ.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTFRQ_EL0 is a 32-bit register.

Field descriptions

The CNTFRQ_EL0 bit assignments are:

31                0
Clock frequency

Bits [31:0]
Clock frequency. Indicates the system counter clock frequency, in Hz.
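Because the register holds the counter frequency in Hz, software typically uses it to convert a difference between two counter readings into elapsed time. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the values passed in would come from reads of the counter and of CNTFRQ_EL0.

```python
NS_PER_SECOND = 1_000_000_000


def ticks_to_ns(ticks, cntfrq_hz):
    """Convert a number of system counter ticks to nanoseconds (rounded down)."""
    if cntfrq_hz <= 0:
        raise ValueError("CNTFRQ_EL0 has not been programmed with a valid frequency")
    # Multiply before dividing so that sub-second intervals keep their precision.
    return ticks * NS_PER_SECOND // cntfrq_hz
```

Since hardware does not interpret the register, the result is only as accurate as the value that initialization software programmed.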


Accessing the CNTFRQ_EL0:
To access the CNTFRQ_EL0:

MRS <Xt>, CNTFRQ_EL0 ; Read CNTFRQ_EL0 into Xt
MSR CNTFRQ_EL0, <Xt> ; Write Xt to CNTFRQ_EL0


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0000  000






















D7.5.2 CNTHCTL_EL2, Counter-timer Hypervisor Control register
The CNTHCTL_EL2 characteristics are:

Purpose
Controls the generation of an event stream from the physical counter, and access from Non-secure
EL1 to the physical counter and the Non-secure EL1 physical timer.

Usage constraints
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW

Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register CNTHCTL_EL2 is architecturally mapped to AArch32 System register
CNTHCTL.

If EL2 is not implemented, this register is RES0 from EL3.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTHCTL_EL2 is a 32-bit register.

Field descriptions

The CNTHCTL_EL2 bit assignments are:

31        8  7     4  3        2       1        0
RES0         EVNTI    EVNTDIR  EVNTEN  EL1PCEN  EL1PCTEN

Bits [31:8]
Reserved, RES0.

EVNTI, bits [7:4]
Selects which bit (0 to 15) of the counter register CNTPCT_EL0 is the trigger for the event stream
generated from that counter, when that stream is enabled.

EVNTDIR, bit [3]
Controls which transition of the counter register CNTPCT_EL0 trigger bit, defined by EVNTI,
generates an event when the event stream is enabled:
0  A 0 to 1 transition of the trigger bit triggers an event.
1  A 1 to 0 transition of the trigger bit triggers an event.

EVNTEN, bit [2]
Enables the generation of an event stream from the counter register CNTPCT_EL0:
0  Disables the event stream.
1  Enables the event stream.
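The way EVNTI, EVNTDIR, and EVNTEN combine can be shown by comparing the trigger bit of two successive counter values. This is an illustrative sketch only; the function name and the idea of sampling the counter before and after an update are assumptions.

```python
def event_generated(prev_count, new_count, evnti, evntdir, evnten):
    """Return True if the change from prev_count to new_count generates an event."""
    if not 0 <= evnti <= 15:
        raise ValueError("EVNTI is a 4-bit field")
    if not evnten:
        return False  # EVNTEN == 0: the event stream is disabled
    before = (prev_count >> evnti) & 1  # trigger bit before the update
    after = (new_count >> evnti) & 1    # trigger bit after the update
    if evntdir == 0:
        return before == 0 and after == 1  # 0 to 1 transition
    return before == 1 and after == 0      # 1 to 0 transition
```

For a counter that increments by one, the selected bit makes each kind of transition once every 2^(EVNTI+1) counts, so EVNTI sets the period of the event stream.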
EL1PCEN, bit [1]
Traps Non-secure EL0 and EL1 accesses to the physical timer registers to EL2.
0  From AArch64 state: Non-secure EL0 and EL1 accesses to the CNTP_CTL_EL0,
   CNTP_CVAL_EL0, and CNTP_TVAL_EL0 are trapped to EL2.
   From AArch32 state: Non-secure EL0 and EL1 accesses to the CNTP_CTL,
   CNTP_CVAL, and CNTP_TVAL are trapped to EL2.
1  From AArch64 state: Non-secure EL0 and EL1 accesses to the CNTP_CTL_EL0,
   CNTP_CVAL_EL0, and CNTP_TVAL_EL0 are not trapped to EL2.
   From AArch32 state: Non-secure EL0 and EL1 accesses to the CNTP_CTL,
   CNTP_CVAL, and CNTP_TVAL are not trapped to EL2.

If EL3 is implemented and EL2 is not implemented, behavior is as if this bit is 1 other than for the
purpose of a direct read.

EL1PCTEN, bit [0]
Traps Non-secure EL0 and EL1 accesses to the physical counter register to EL2.
0  From AArch64 state: Non-secure EL0 and EL1 accesses to the CNTPCT_EL0 are
   trapped to EL2.
   From AArch32 state: Non-secure EL0 and EL1 accesses to the CNTPCT are trapped to
   EL2.
1  From AArch64 state: Non-secure EL0 and EL1 accesses to the CNTPCT_EL0 are not
   trapped to EL2.
   From AArch32 state: Non-secure EL0 and EL1 accesses to the CNTPCT are not trapped
   to EL2.

If EL3 is implemented and EL2 is not implemented, behavior is as if this bit is 1 other than for the
purpose of a direct read.

Accessing the CNTHCTL_EL2: 

To access the CNTHCTL_EL2: 


MRS <Xt>, CNTHCTL_EL2 ; Read CNTHCTL_EL2 into Xt 
MSR CNTHCTL_EL2, <Xt> ; Write Xt to CNTHCTL_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  1110  0001  000

D7.5.3 CNTHP_CTL_EL2, Counter-timer Hypervisor Physical Timer Control register
The CNTHP_CTL_EL2 characteristics are:

Purpose
Control register for the EL2 physical timer.

Usage constraints
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register CNTHP_CTL_EL2 is architecturally mapped to AArch32 System
register CNTHP_CTL.

If EL2 is not implemented, this register is RES0 from EL3.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTHP_CTL_EL2 is a 32-bit register.


Field descriptions

The CNTHP_CTL_EL2 bit assignments are:

31        3  2        1      0
RES0         ISTATUS  IMASK  ENABLE

Bits [31:3]
Reserved, RES0.

ISTATUS, bit [2]
The status of the timer. This bit indicates whether the timer condition is met:
0  Timer condition is not met.
1  Timer condition is met.

When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met.
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the
value of IMASK is 0 then the timer interrupt is asserted.

When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN.

For more information see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.

This bit is read-only.

IMASK, bit [1]
Timer interrupt mask bit. Permitted values are:
0  Timer interrupt is not masked by the IMASK bit.
1  Timer interrupt is masked by the IMASK bit.

For more information, see the description of the ISTATUS bit.

ENABLE, bit [0]
Enables the timer. Permitted values are:
0  Timer disabled.
1  Timer enabled.

Setting this bit to 0 disables the timer output signal, but the timer value accessible from
CNTHP_TVAL_EL2 continues to count down.

Note
Disabling the output signal might be a power-saving option.
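The relationship between ENABLE, IMASK, and ISTATUS and the timer interrupt can be condensed into one expression. The sketch below is illustrative: the function name is hypothetical, and it treats a disabled timer as never asserting the interrupt, on the basis that ENABLE == 0 disables the timer output signal.

```python
ENABLE = 1 << 0   # ENABLE, bit [0]
IMASK = 1 << 1    # IMASK, bit [1]
ISTATUS = 1 << 2  # ISTATUS, bit [2]


def timer_interrupt_asserted(ctl):
    """Return True if a control register value implies an asserted interrupt."""
    if not ctl & ENABLE:
        return False  # output signal disabled; ISTATUS is UNKNOWN in this case
    # ISTATUS itself takes no account of IMASK; the mask applies to the interrupt.
    return bool(ctl & ISTATUS) and not (ctl & IMASK)
```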





Accessing the CNTHP_CTL_EL2: 
To access the CNTHP_CTL_EL2: 


MRS <Xt>, CNTHP_CTL_EL2 ; Read CNTHP_CTL_EL2 into Xt 
MSR CNTHP_CTL_EL2, <Xt> ; Write Xt to CNTHP_CTL_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  1110  0010  001

D7.5.4 CNTHP_CVAL_EL2, Counter-timer Hypervisor Physical Timer CompareValue register
The CNTHP_CVAL_EL2 characteristics are:

Purpose
Holds the compare value for the EL2 physical timer.

Usage constraints
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW





Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register CNTHP_CVAL_EL2 is architecturally mapped to AArch32 System
register CNTHP_CVAL.

If EL2 is not implemented, this register is RES0 from EL3.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTHP_CVAL_EL2 is a 64-bit register.

Field descriptions

The CNTHP_CVAL_EL2 bit assignments are:

63                                 0
EL2 physical timer compare value

Bits [63:0]
EL2 physical timer compare value.


Accessing the CNTHP_CVAL_EL2: 
To access the CNTHP_CVAL_EL2: 


MRS <Xt>, CNTHP_CVAL_EL2 ; Read CNTHP_CVAL_EL2 into Xt 
MSR CNTHP_CVAL_EL2, <Xt> ; Write Xt to CNTHP_CVAL_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  1110  0010  010

D7.5.5 CNTHP_TVAL_EL2, Counter-timer Hypervisor Physical Timer TimerValue register
The CNTHP_TVAL_EL2 characteristics are:

Purpose
Holds the timer value for the EL2 physical timer.

Usage constraints
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW

Where the value of CNTHP_CTL_EL2.ENABLE is 0:

• A write to CNTHP_TVAL_EL2 updates the register.
• The value held in CNTHP_TVAL_EL2 continues to decrement.
• A read of CNTHP_TVAL_EL2 returns an UNKNOWN value.


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch64 System register CNTHP_TVAL_EL2 is architecturally mapped to AArch32 System 
register CNTHP_TVAL. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CNTHP_TVAL_EL2 is a 32-bit register. 


Field descriptions 


The CNTHP_TVAL_EL2 bit assignments are: 


31 0 
EL2 physical timer value 
Bits [31:0] 


EL2 physical timer value. 


Accessing the CNTHP_TVAL_EL2: 
To access the CNTHP_TVAL_EL2: 


MRS <Xt>, CNTHP_TVAL_EL2 ; Read CNTHP_TVAL_EL2 into Xt 
MSR CNTHP_TVAL_EL2, <Xt> ; Write Xt to CNTHP_TVAL_EL2 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   100  1110  0010  000

D7.5.6 CNTKCTL_EL1, Counter-timer Kernel Control register
The CNTKCTL_EL1 characteristics are:

Purpose
Controls the generation of an event stream from the virtual counter, and access from EL0 to the
physical counter, virtual counter, EL1 physical timers, and the virtual timer.

Usage constraints
This register is accessible as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    RW       RW      RW       RW             RW
Traps and Enables
There are no traps or enables affecting this register.

Configurations
AArch64 System register CNTKCTL_EL1 is architecturally mapped to AArch32 System register
CNTKCTL.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTKCTL_EL1 is a 32-bit register.

Field descriptions

The CNTKCTL_EL1 bit assignments are:

31        10  9        8        7     4  3        2       1         0
RES0          EL0PTEN  EL0VTEN  EVNTI    EVNTDIR  EVNTEN  EL0VCTEN  EL0PCTEN

Bits [31:10]
Reserved, RES0.
EL0PTEN, bit [9]
Traps EL0 accesses to the physical timer registers to EL1.
0  EL0 using AArch64: EL0 accesses to the CNTP_CTL_EL0, CNTP_CVAL_EL0, and
   CNTP_TVAL_EL0 registers are trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTP_CTL, CNTP_CVAL, and
   CNTP_TVAL registers are trapped to EL1.
1  EL0 using AArch64: EL0 accesses to the CNTP_CTL_EL0, CNTP_CVAL_EL0, and
   CNTP_TVAL_EL0 registers are not trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTP_CTL, CNTP_CVAL, and
   CNTP_TVAL registers are not trapped to EL1.

EL0VTEN, bit [8]
Traps EL0 accesses to the virtual timer registers to EL1.
0  EL0 using AArch64: EL0 accesses to the CNTV_CTL_EL0, CNTV_CVAL_EL0, and
   CNTV_TVAL_EL0 registers are trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTV_CTL, CNTV_CVAL, and
   CNTV_TVAL registers are trapped to EL1.
1  EL0 using AArch64: EL0 accesses to the CNTV_CTL_EL0, CNTV_CVAL_EL0, and
   CNTV_TVAL_EL0 registers are not trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTV_CTL, CNTV_CVAL, and
   CNTV_TVAL registers are not trapped to EL1.

EVNTI, bits [7:4]
Selects which bit (0 to 15) of the counter register CNTVCT_EL0 is the trigger for the event stream
generated from that counter, when that stream is enabled.

EVNTDIR, bit [3]
Controls which transition of the counter register CNTVCT_EL0 trigger bit, defined by EVNTI,
generates an event when the event stream is enabled:
0  A 0 to 1 transition of the trigger bit triggers an event.
1  A 1 to 0 transition of the trigger bit triggers an event.

EVNTEN, bit [2]
Enables the generation of an event stream from the counter register CNTVCT_EL0:
0  Disables the event stream.
1  Enables the event stream.


EL0VCTEN, bit [1]
Traps EL0 accesses to the frequency register and virtual counter register to EL1.
0  EL0 using AArch64: EL0 accesses to the CNTVCT_EL0 are trapped to EL1.
   EL0 using AArch64: EL0 accesses to the CNTFRQ_EL0 register are trapped to EL1, if
   CNTKCTL_EL1.EL0PCTEN is also 0.
   EL0 using AArch32: EL0 accesses to the CNTVCT are trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTFRQ register are trapped to EL1, if
   CNTKCTL_EL1.EL0PCTEN is also 0.
1  EL0 using AArch64: EL0 accesses to the CNTFRQ_EL0 and CNTVCT_EL0 are not
   trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTFRQ and CNTVCT are not trapped to
   EL1.

EL0PCTEN, bit [0]
Traps EL0 accesses to the frequency register and physical counter register to EL1.
0  EL0 using AArch64: EL0 accesses to the CNTPCT_EL0 are trapped to EL1.
   EL0 using AArch64: EL0 accesses to the CNTFRQ_EL0 register are trapped to EL1, if
   CNTKCTL_EL1.EL0VCTEN is also 0.
   EL0 using AArch32: EL0 accesses to the CNTPCT are trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTFRQ register are trapped to EL1, if
   CNTKCTL_EL1.EL0VCTEN is also 0.
1  EL0 using AArch64: EL0 accesses to the CNTFRQ_EL0 and CNTPCT_EL0 are not
   trapped to EL1.
   EL0 using AArch32: EL0 accesses to the CNTFRQ and CNTPCT are not trapped to
   EL1.
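The EL0VCTEN and EL0PCTEN descriptions together imply that each counter register has its own enable, while the frequency register is trapped only when both enables are 0. A minimal sketch of that rule for AArch64 accesses from EL0 follows; the function name and the register-name strings are illustrative assumptions.

```python
EL0PCTEN = 1 << 0  # CNTKCTL_EL1.EL0PCTEN, bit [0]
EL0VCTEN = 1 << 1  # CNTKCTL_EL1.EL0VCTEN, bit [1]


def el0_counter_access_trapped(cntkctl, register):
    """Return True if an EL0 access to the named register is trapped to EL1."""
    physical_enabled = bool(cntkctl & EL0PCTEN)
    virtual_enabled = bool(cntkctl & EL0VCTEN)
    if register == "CNTPCT_EL0":
        return not physical_enabled
    if register == "CNTVCT_EL0":
        return not virtual_enabled
    if register == "CNTFRQ_EL0":
        # Either enable is sufficient to permit access to the frequency register.
        return not physical_enabled and not virtual_enabled
    raise ValueError("register not controlled by EL0PCTEN or EL0VCTEN")
```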

Accessing the CNTKCTL_EL1: 

To access the CNTKCTL_EL1:


MRS <Xt>, CNTKCTL_EL1 ; Read CNTKCTL_EL1 into Xt 
MSR CNTKCTL_EL1, <Xt> ; Write Xt to CNTKCTL_EL1 


Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   000  1110  0001  000

D7.5.7 CNTP_CTL_EL0, Counter-timer Physical Timer Control register
The CNTP_CTL_EL0 characteristics are:

Purpose
Control register for the EL1 physical timer.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)    EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW      RW       RW             RW
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If CNTHCTL_EL2.EL1PCEN==0, Non-secure accesses to this register from EL1 are
  trapped to EL2.
• If CNTHCTL_EL2.EL1PCEN==0, and CNTKCTL_EL1.EL0PTEN==1, Non-secure
  accesses to this register from EL0 are trapped to EL2.
• If CNTKCTL_EL1.EL0PTEN==0, accesses to this register from EL0 are trapped to EL1.
Configurations 
AArch64 System register CNTP_CTL_ELO is architecturally mapped to AArch32 System register 
CNTPJCTL: 
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
CNTP_CTL_ELO is a 32-bit register. 
Field descriptions 
The CNTP_CTL_ELO bit assignments are: 
31 3 1.0 
RESO TT] 
[| ENABLE 
IMASK 
ISTATUS 
Bits [31:3] 
Reserved, RESO. 
ISTATUS, bit [2] 
The status of the timer. This bit indicates whether the timer condition is met: 
0 Timer condition is not met. 
1 Timer condition is met. 
When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met.
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the
value of IMASK is 0 then the timer interrupt is asserted.

When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN.

For more information see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.

This bit is read-only.

IMASK, bit [1]
Timer interrupt mask bit. Permitted values are:
0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit.

For more information, see the description of the ISTATUS bit.

ENABLE, bit [0]
Enables the timer. Permitted values are:
0 Timer disabled.
1 Timer enabled.

Setting this bit to 0 disables the timer output signal, but the timer value accessible from
CNTP_TVAL_EL0 continues to count down.

Note

Disabling the output signal might be a power-saving option.
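
The interaction of the ENABLE, IMASK, and ISTATUS bits can be sketched in software. The following Python fragment is a minimal illustration, not part of the architecture definition: the constant and function names are hypothetical, and it treats a disabled timer as never asserting its interrupt, on the basis that ENABLE==0 disables the timer output signal.

```python
# Bit positions taken from the field descriptions above.
ENABLE_BIT = 1 << 0
IMASK_BIT = 1 << 1
ISTATUS_BIT = 1 << 2


def timer_interrupt_asserted(ctl: int) -> bool:
    """Return True if a timer control register value implies an asserted interrupt."""
    enabled = bool(ctl & ENABLE_BIT)
    masked = bool(ctl & IMASK_BIT)
    condition_met = bool(ctl & ISTATUS_BIT)
    # ISTATUS is only meaningful when ENABLE is 1, and IMASK gates the output.
    return enabled and condition_met and not masked
```

The same decoding applies to any of the timer control registers that share this ENABLE, IMASK, ISTATUS layout.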





Accessing the CNTP_CTL_EL0:
To access the CNTP_CTL_EL0:

MRS <Xt>, CNTP_CTL_EL0 ; Read CNTP_CTL_EL0 into Xt
MSR CNTP_CTL_EL0, <Xt> ; Write Xt to CNTP_CTL_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0010  001
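
The op0, op1, CRn, CRm, and op2 values in these encoding tables select the register in an MRS or MSR instruction. As a hedged illustration, the following Python sketch packs them into a 32-bit instruction word, assuming the A64 System register move layout in which bits [31:22] are 0b1101010100, bit [21] is 1 for MRS and 0 for MSR, bits [20:19] hold op0, and Rt occupies bits [4:0]. The function name is hypothetical.

```python
def encode_sysreg_access(op0: int, op1: int, crn: int, crm: int, op2: int,
                         rt: int, read: bool) -> int:
    """Return the 32-bit A64 encoding of MRS (read=True) or MSR (read=False)."""
    if op0 not in (2, 3):
        raise ValueError("op0 must be 0b10 or 0b11 for a System register move")
    for name, value, limit in (("op1", op1, 8), ("CRn", crn, 16),
                               ("CRm", crm, 16), ("op2", op2, 8), ("Rt", rt, 32)):
        if not 0 <= value < limit:
            raise ValueError(f"{name} out of range")
    l_bit = 1 if read else 0
    return (0xD5000000 | (l_bit << 21) | (op0 << 19) | (op1 << 16)
            | (crn << 12) | (crm << 8) | (op2 << 5) | rt)


# CNTP_CTL_EL0: op0=0b11, op1=0b011, CRn=0b1110, CRm=0b0010, op2=0b001.
mrs_x0_cntp_ctl = encode_sysreg_access(0b11, 0b011, 0b1110, 0b0010, 0b001, 0, True)
```

An assembler normally performs this packing; the sketch is useful mainly for checking a disassembly against the tables.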
D7.5.8 CNTP_CVAL_EL0, Counter-timer Physical Timer CompareValue register
The CNTP_CVAL_EL0 characteristics are:

Purpose

Holds the compare value for the EL1 physical timer.

Usage constraints

This register is accessible as follows:

EL0        EL1(NS)    EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW      RW       RW             RW

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If CNTHCTL_EL2.EL1PCEN==0, Non-secure accesses to this register from EL1 are
trapped to EL2.

• If CNTHCTL_EL2.EL1PCEN==0, and CNTKCTL_EL1.EL0PTEN==1, Non-secure
accesses to this register from EL0 are trapped to EL2.

• If CNTKCTL_EL1.EL0PTEN==0, accesses to this register from EL0 are trapped to EL1.

Configurations

AArch64 System register CNTP_CVAL_EL0 is architecturally mapped to AArch32 System
register CNTP_CVAL.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTP_CVAL_EL0 is a 64-bit register.

Field descriptions

The CNTP_CVAL_EL0 bit assignments are:

63                                0
EL1 physical timer compare value

Bits [63:0]

EL1 physical timer compare value.

Accessing the CNTP_CVAL_EL0:
To access the CNTP_CVAL_EL0:

MRS <Xt>, CNTP_CVAL_EL0 ; Read CNTP_CVAL_EL0 into Xt
MSR CNTP_CVAL_EL0, <Xt> ; Write Xt to CNTP_CVAL_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0010  010
D7.5.9 CNTP_TVAL_EL0, Counter-timer Physical Timer TimerValue register

The CNTP_TVAL_EL0 characteristics are:

Purpose
Holds the timer value for the EL1 physical timer.

Usage constraints
This register is accessible as follows:

EL0        EL1(NS)    EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW      RW       RW             RW

Where the value of CNTP_CTL_EL0.ENABLE is 0:
• A write to CNTP_TVAL_EL0 updates the register.
• The value held in CNTP_TVAL_EL0 continues to decrement.
• A read of CNTP_TVAL_EL0 returns an UNKNOWN value.

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If CNTHCTL_EL2.EL1PCEN==0, Non-secure accesses to this register from EL1 are
trapped to EL2.
• If CNTHCTL_EL2.EL1PCEN==0, and CNTKCTL_EL1.EL0PTEN==1, Non-secure
accesses to this register from EL0 are trapped to EL2.
• If CNTKCTL_EL1.EL0PTEN==0, accesses to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register CNTP_TVAL_EL0 is architecturally mapped to AArch32 System register
CNTP_TVAL.
RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTP_TVAL_EL0 is a 32-bit register.

Field descriptions

The CNTP_TVAL_EL0 bit assignments are:

31                     0
EL1 physical timer value

Bits [31:0]
EL1 physical timer value.

Accessing the CNTP_TVAL_EL0:

To access the CNTP_TVAL_EL0:

MRS <Xt>, CNTP_TVAL_EL0 ; Read CNTP_TVAL_EL0 into Xt
MSR CNTP_TVAL_EL0, <Xt> ; Write Xt to CNTP_TVAL_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0010  000
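
The TimerValue register is a 32-bit view onto the same timer as the 64-bit CompareValue register. The following Python sketch models that relationship, assuming the behavior described in Operation of the TimerValue views of the timers: a write sets CompareValue to the current count plus the sign-extended TimerValue, and a read returns the low 32 bits of CompareValue minus the current count. The function names are hypothetical and the counter is passed in explicitly.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend32(value: int) -> int:
    """Interpret the low 32 bits of `value` as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & (1 << 31) else value


def write_tval(counter: int, tval: int) -> int:
    """Return the 64-bit CompareValue that results from writing `tval`."""
    return (counter + sign_extend32(tval)) & MASK64


def read_tval(counter: int, cval: int) -> int:
    """Return the 32-bit TimerValue seen for a given count and CompareValue."""
    return (cval - counter) & MASK32
```

Because the count keeps advancing, successive reads return a decreasing value, which becomes negative (as a signed 32-bit quantity) once the compare value has been passed.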
D7.5.10 CNTPCT_EL0, Counter-timer Physical Count register
The CNTPCT_EL0 characteristics are:

Purpose

Holds the 64-bit physical count value.

Usage constraints

This register is accessible as follows:

EL0        EL1(NS)    EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO      RO       RO             RO

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If CNTHCTL_EL2.EL1PCTEN==0, Non-secure read accesses to this register from EL1 are
trapped to EL2.

• If CNTHCTL_EL2.EL1PCTEN==0, and CNTKCTL_EL1.EL0PCTEN==1, Non-secure
read accesses to this register from EL0 are trapped to EL2.

• If CNTKCTL_EL1.EL0PCTEN==0, read accesses to this register from EL0 are trapped to
EL1.

Configurations

AArch64 System register CNTPCT_EL0 is architecturally mapped to AArch32 System register
CNTPCT.

Attributes
CNTPCT_EL0 is a 64-bit register.

Field descriptions

The CNTPCT_EL0 bit assignments are:

63                   0
Physical count value

Bits [63:0]

Physical count value.

Accessing the CNTPCT_EL0:
To access the CNTPCT_EL0:
MRS <Xt>, CNTPCT_EL0 ; Read CNTPCT_EL0 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0000  001
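
The three trap rules listed under Traps and Enables for CNTPCT_EL0 can be read as a routing decision. The following Python sketch is one hedged interpretation: the function name is hypothetical, it assumes EL2 is implemented, and it does not model exception prioritization beyond the ordering implied by the rules themselves.

```python
from typing import Optional


def cntpct_read_trap_target(el: int, non_secure: bool,
                            el1pcten: int, el0pcten: int) -> Optional[int]:
    """Return the Exception level a CNTPCT_EL0 read traps to, or None if not trapped.

    el1pcten models CNTHCTL_EL2.EL1PCTEN, el0pcten models CNTKCTL_EL1.EL0PCTEN.
    """
    if el == 0:
        if el0pcten == 0:
            return 1  # EL0 reads are trapped to EL1.
        if non_secure and el1pcten == 0:
            return 2  # Permitted by EL1, but trapped to EL2.
        return None
    if el == 1:
        if non_secure and el1pcten == 0:
            return 2
        return None
    # Reads from EL2 and EL3 are not subject to these traps.
    return None
```

Note that, in this reading, an EL0 read with both controls clear is reported as trapped to EL1, because the EL2 rule for EL0 applies only when EL0PCTEN is 1.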
D7.5.11 CNTPS_CTL_EL1, Counter-timer Physical Secure Timer Control register
The CNTPS_CTL_EL1 characteristics are:
Purpose
Control register for the secure physical timer, usually accessible at EL3 but configurably accessible
at EL1 in Secure state.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)     EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        Config-RW  -        RW             RW
Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If SCR_EL3.ST==0, Secure accesses to this register from EL1 are trapped to EL3.
Configurations
RW fields in this register reset to architecturally UNKNOWN values.
Attributes
CNTPS_CTL_EL1 is a 32-bit register.
Field descriptions
The CNTPS_CTL_EL1 bit assignments are:
31                3  2        1      0
RES0                 ISTATUS  IMASK  ENABLE
Bits [31:3]
Reserved, RES0.
ISTATUS, bit [2]
The status of the timer. This bit indicates whether the timer condition is met:
0 Timer condition is not met.
1 Timer condition is met.
When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met.
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the
value of IMASK is 0 then the timer interrupt is asserted.
When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN.
For more information see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.
This bit is read-only.
IMASK, bit [1]
Timer interrupt mask bit. Permitted values are:
0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit.

For more information, see the description of the ISTATUS bit.

ENABLE, bit [0]
Enables the timer. Permitted values are:
0 Timer disabled.
1 Timer enabled.

Setting this bit to 0 disables the timer output signal, but the timer value accessible from
CNTPS_TVAL_EL1 continues to count down.

Note

Disabling the output signal might be a power-saving option.

Accessing the CNTPS_CTL_EL1:
To access the CNTPS_CTL_EL1:

MRS <Xt>, CNTPS_CTL_EL1 ; Read CNTPS_CTL_EL1 into Xt
MSR CNTPS_CTL_EL1, <Xt> ; Write Xt to CNTPS_CTL_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   111  1110  0010  001
D7.5.12 CNTPS_CVAL_EL1, Counter-timer Physical Secure Timer CompareValue register
The CNTPS_CVAL_EL1 characteristics are:

Purpose

Holds the compare value for the secure physical timer, usually accessible at EL3 but configurably
accessible at EL1 in Secure state.

Usage constraints

This register is accessible as follows:

EL0  EL1(NS)  EL1(S)     EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        Config-RW  -        RW             RW

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If SCR_EL3.ST==0, Secure accesses to this register from EL1 are trapped to EL3.

Configurations

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTPS_CVAL_EL1 is a 64-bit register.

Field descriptions

The CNTPS_CVAL_EL1 bit assignments are:

63                                  0
Secure physical timer compare value

Bits [63:0]

Secure physical timer compare value.

Accessing the CNTPS_CVAL_EL1:
To access the CNTPS_CVAL_EL1:

MRS <Xt>, CNTPS_CVAL_EL1 ; Read CNTPS_CVAL_EL1 into Xt
MSR CNTPS_CVAL_EL1, <Xt> ; Write Xt to CNTPS_CVAL_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   111  1110  0010  010
D7.5.13 CNTPS_TVAL_EL1, Counter-timer Physical Secure Timer TimerValue register

The CNTPS_TVAL_EL1 characteristics are:

Purpose
Holds the timer value for the secure physical timer, usually accessible at EL3 but configurably
accessible at EL1 in Secure state.

Usage constraints

This register is accessible as follows:

EL0  EL1(NS)  EL1(S)     EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        Config-RW  -        RW             RW

Where the value of CNTPS_CTL_EL1.ENABLE is 0:
• A write to CNTPS_TVAL_EL1 updates the register.
• The value held in CNTPS_TVAL_EL1 continues to decrement.
• A read of CNTPS_TVAL_EL1 returns an UNKNOWN value.

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If SCR_EL3.ST==0, Secure accesses to this register from EL1 are trapped to EL3.

Configurations

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTPS_TVAL_EL1 is a 32-bit register.

Field descriptions

The CNTPS_TVAL_EL1 bit assignments are:

31                         0
Secure physical timer value

Bits [31:0]

Secure physical timer value.

Accessing the CNTPS_TVAL_EL1:
To access the CNTPS_TVAL_EL1:

MRS <Xt>, CNTPS_TVAL_EL1 ; Read CNTPS_TVAL_EL1 into Xt
MSR CNTPS_TVAL_EL1, <Xt> ; Write Xt to CNTPS_TVAL_EL1

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   111  1110  0010  000










Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




D7.5.14 CNTV_CTL_EL0, Counter-timer Virtual Timer Control register
The CNTV_CTL_EL0 characteristics are:

Purpose

Control register for the virtual timer.

Usage constraints

This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If CNTKCTL_EL1.EL0VTEN==0, accesses to this register from EL0 are trapped to EL1.

Configurations

AArch64 System register CNTV_CTL_EL0 is architecturally mapped to AArch32 System register
CNTV_CTL.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTV_CTL_EL0 is a 32-bit register.

Field descriptions

The CNTV_CTL_EL0 bit assignments are:

31                3  2        1      0
RES0                 ISTATUS  IMASK  ENABLE

Bits [31:3]
Reserved, RES0.

ISTATUS, bit [2]
The status of the timer. This bit indicates whether the timer condition is met:
0 Timer condition is not met.
1 Timer condition is met.

When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met.
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the
value of IMASK is 0 then the timer interrupt is asserted.

When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN.

For more information see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.
This bit is read-only.

IMASK, bit [1]
Timer interrupt mask bit. Permitted values are:
0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit.

For more information, see the description of the ISTATUS bit.

ENABLE, bit [0]
Enables the timer. Permitted values are:
0 Timer disabled.
1 Timer enabled.

Setting this bit to 0 disables the timer output signal, but the timer value accessible from
CNTV_TVAL_EL0 continues to count down.

Note

Disabling the output signal might be a power-saving option.

Accessing the CNTV_CTL_EL0:
To access the CNTV_CTL_EL0:

MRS <Xt>, CNTV_CTL_EL0 ; Read CNTV_CTL_EL0 into Xt
MSR CNTV_CTL_EL0, <Xt> ; Write Xt to CNTV_CTL_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0011  001
D7.5.15 CNTV_CVAL_EL0, Counter-timer Virtual Timer CompareValue register

The CNTV_CVAL_EL0 characteristics are:

Purpose

Holds the compare value for the virtual timer.

Usage constraints

This register is accessible as follows:

EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:

• If CNTKCTL_EL1.EL0VTEN==0, accesses to this register from EL0 are trapped to EL1.

Configurations

AArch64 System register CNTV_CVAL_EL0 is architecturally mapped to AArch32 System
register CNTV_CVAL.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTV_CVAL_EL0 is a 64-bit register.

Field descriptions

The CNTV_CVAL_EL0 bit assignments are:

63                          0
Virtual timer compare value

Bits [63:0]

Virtual timer compare value.

Accessing the CNTV_CVAL_EL0:
To access the CNTV_CVAL_EL0:

MRS <Xt>, CNTV_CVAL_EL0 ; Read CNTV_CVAL_EL0 into Xt
MSR CNTV_CVAL_EL0, <Xt> ; Write Xt to CNTV_CVAL_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0011  010
D7.5.16 CNTV_TVAL_EL0, Counter-timer Virtual Timer TimerValue register

The CNTV_TVAL_EL0 characteristics are:

Purpose
Holds the timer value for the virtual timer.

Usage constraints
This register is accessible as follows:
EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  RW       RW      RW       RW             RW
Where the value of CNTV_CTL_EL0.ENABLE is 0:
• A write to CNTV_TVAL_EL0 updates the register.
• The value held in CNTV_TVAL_EL0 continues to decrement.
• A read of CNTV_TVAL_EL0 returns an UNKNOWN value.

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If CNTKCTL_EL1.EL0VTEN==0, accesses to this register from EL0 are trapped to EL1.

Configurations
AArch64 System register CNTV_TVAL_EL0 is architecturally mapped to AArch32 System
register CNTV_TVAL.
RW fields in this register reset to architecturally UNKNOWN values.

Attributes
CNTV_TVAL_EL0 is a 32-bit register.

Field descriptions

The CNTV_TVAL_EL0 bit assignments are:

31                 0
Virtual timer value

Bits [31:0]
Virtual timer value.

Accessing the CNTV_TVAL_EL0:

To access the CNTV_TVAL_EL0:

MRS <Xt>, CNTV_TVAL_EL0 ; Read CNTV_TVAL_EL0 into Xt
MSR CNTV_TVAL_EL0, <Xt> ; Write Xt to CNTV_TVAL_EL0

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0011  000
D7.5.17 CNTVCT_EL0, Counter-timer Virtual Count register

The CNTVCT_EL0 characteristics are:

Purpose
Holds the 64-bit virtual count value. The virtual count value is equal to the physical count value
visible in CNTPCT_EL0 minus the virtual offset visible in CNTVOFF_EL2.

Usage constraints
This register is accessible as follows:
EL0        EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  RO       RO      RO       RO             RO

Traps and Enables
For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch64 on page D1-1548. Subject to the prioritization rules:
• If CNTKCTL_EL1.EL0VCTEN==0, read accesses to this register from EL0 are trapped to
EL1.

Configurations
AArch64 System register CNTVCT_EL0 is architecturally mapped to AArch32 System register
CNTVCT.
The virtual count value is equal to the physical count value visible in CNTPCT_EL0 minus the
virtual offset visible in CNTVOFF_EL2.
When EL2 is not implemented, CNTVOFF_EL2 is RES0, and the value of this register is the same
as the value of CNTPCT_EL0.

Attributes
CNTVCT_EL0 is a 64-bit register.

Field descriptions

The CNTVCT_EL0 bit assignments are:

63                  0
Virtual count value

Bits [63:0]
Virtual count value.

Accessing the CNTVCT_EL0:

To access the CNTVCT_EL0:

MRS <Xt>, CNTVCT_EL0 ; Read CNTVCT_EL0 into Xt

Register access is encoded as follows:

op0  op1  CRn   CRm   op2
11   011  1110  0000  010
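
The relationship between the physical count, the virtual offset, and the virtual count can be illustrated with a short Python sketch. The function name and the el2_implemented parameter are hypothetical, and the modulo-2^64 wraparound is an assumption that follows from the registers being 64 bits wide.

```python
MASK64 = (1 << 64) - 1


def virtual_count(physical_count: int, virtual_offset: int,
                  el2_implemented: bool = True) -> int:
    """Return the 64-bit virtual count for a given physical count and offset."""
    # When EL2 is not implemented, the offset register is RES0, so the
    # virtual count equals the physical count.
    offset = virtual_offset if el2_implemented else 0
    return (physical_count - offset) & MASK64
```

For example, a hypervisor that wants a guest to see a count starting near zero could set the offset to the physical count sampled when the guest was created.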
D7.5.18 CNTVOFF_EL2, Counter-timer Virtual Offset register
The CNTVOFF_EL2 characteristics are:
Purpose
Holds the 64-bit virtual offset. This is the offset between the physical count value visible in
CNTPCT_EL0 and the virtual count value visible in CNTVCT_EL0.
Usage constraints
This register is accessible as follows:
EL0  EL1(NS)  EL1(S)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-    -        -       RW       RW             RW
Traps and Enables
There are no traps or enables affecting this register.
Configurations
AArch64 System register CNTVOFF_EL2 is architecturally mapped to AArch32 System register
CNTVOFF.
If EL2 is not implemented, this register is RES0 from EL3 and the virtual counter uses a fixed virtual
offset of zero.
RW fields in this register reset to architecturally UNKNOWN values.
Attributes
CNTVOFF_EL2 is a 64-bit register.
Field descriptions
The CNTVOFF_EL2 bit assignments are:
63             0
Virtual offset
Bits [63:0]
Virtual offset.
Accessing the CNTVOFF_EL2:
To access the CNTVOFF_EL2:
MRS <Xt>, CNTVOFF_EL2 ; Read CNTVOFF_EL2 into Xt
MSR CNTVOFF_EL2, <Xt> ; Write Xt to CNTVOFF_EL2
Register access is encoded as follows:
op0  op1  CRn   CRm   op2
11   100  1110  0000  011
Chapter E1
The AArch32 Application Level Programmers’ Model

This chapter gives an Application level description of the programmers’ model for software executing in AArch32
state. This means it describes execution in EL0 when EL0 is using AArch32. It contains the following sections:

• About the Application level programmers’ model on page E1-2288.
• The Application level programmers’ model in AArch32 state on page E1-2289.
• Advanced SIMD and floating-point instructions on page E1-2300.
• About the AArch32 System register interface on page E1-2312.
• Exceptions on page E1-2313.
E1.1 About the Application level programmers’ model

This chapter contains the programmers’ model information required for the development of applications that will
execute in AArch32 state.

The information in this chapter is distinct from the system information required to service and support application
execution under an operating system, or higher level of system software. However, some knowledge of that system
information is needed to put the Application level programmers’ model into context.

Depending on the implementation, the architecture supports multiple levels of execution privilege. These privilege
levels are indicated by different Exception levels that number upwards from EL0, where EL0 corresponds to the
lowest privilege level and is often described as unprivileged. The Application level programmers’ model is the
programmers’ model for software executing at EL0. For more information see ARMv8 architectural concepts on
page A1-33.

System software determines the Exception level, and therefore the level of privilege, at which application software
runs. When an operating system supports execution at both EL1 and EL0, an application usually runs unprivileged.
This has the following effects:

• It means that the operating system can allocate system resources to an application in a unique or shared
manner.
• It provides a degree of protection from other processes, and so helps protect the operating system from
malfunctioning software.

This chapter indicates where some System level understanding is helpful, and if appropriate it gives a reference to
the System level description.

When included in an implementation:

• EL3 provides two Security states, Secure and Non-secure. Secure state provides additional hardware features
that support the development of secure applications.
• EL2 provides virtualization of operation in Non-secure state.

However, application level software is generally unaware of its Security state, and of any virtualization. For more
information, see The ARMv8-A security model on page G1-3789 and The effect of implementing EL2 on the
Exception model on page G1-3794.

Note

• When an implementation includes EL3, application and operating system software normally executes in
Non-secure state.
• EL2, that provides the virtualization features, is implemented only in Non-secure state.
• Older documentation, describing implementations or architecture versions that support only two privilege
levels, often refers to execution at EL1 as privileged execution.
• In this manual, the terms CONSTRAINED UNPREDICTABLE, IMPLEMENTATION DEFINED,
OPTIONAL, RES0, RES1, UNDEFINED, UNKNOWN, and UNPREDICTABLE have ARM-specific
meanings, as defined in the Glossary. In body text, these terms are shown in SMALL CAPS, for example
IMPLEMENTATION DEFINED.
E1.2 The Application level programmers’ model in AArch32 state

The following sections give more information about the Application level programmers’ model in AArch32 state:
• Instruction sets, arithmetic operations, and register files.
• Core data types and arithmetic in AArch32 state.
• The general-purpose registers, and the PC, in AArch32 state on page E1-2291.
• Process state, PSTATE on page E1-2294.
• Jazelle support on page E1-2299.


E1.2.1 Instruction sets, arithmetic operations, and register files 


The A32 and T32 instruction sets both provide a wide range of integer arithmetic and logical operations. These
operate on a register file of sixteen 32-bit registers, comprising the AArch32 general-purpose registers and the
PC. As described in The general-purpose registers, and the PC, in AArch32 state on page E1-2291, these registers
include the registers SP (R13) and LR (R14), which have specialized uses. Core data types and arithmetic in
AArch32 state gives more information about these operations.

In addition, an implementation that implements the T32 and A32 instruction sets includes both:
• Scalar floating-point instructions.
• The Advanced SIMD vector instructions.


Floating-point and vector instructions operate on a separate common register file, described in The SIMD and 
floating-point register file on page E1-2300. Advanced SIMD and floating-point instructions on page E1-2300 gives 
more information about these instructions. 


E1.2.2 Core data types and arithmetic in AArch32 state 


When executing in AArch32 state, a PE supports the following data types in memory: 


Byte 8 bits 
Halfword 16 bits 
Word 32 bits 


Doubleword 64 bits. 


PE registers are 32 bits in size. The instruction sets provide instructions that use the following data types for data 
held in registers: 


• 32-bit pointers.
• Unsigned or signed 32-bit integers.
• Unsigned 16-bit or 8-bit integers, held in zero-extended form.
• Signed 16-bit or 8-bit integers, held in sign-extended form.
• Two 16-bit integers packed into a register.
• Four 8-bit integers packed into a register.
• Unsigned or signed 64-bit integers held in two registers.


Load and store operations can transfer bytes, halfwords, or words to and from memory. Loads of bytes or halfwords 
zero-extend or sign-extend the data as it is loaded, as specified in the appropriate load instruction. 


The instruction sets include load and store operations that transfer two or more words to and from memory. Software 
can load and store doublewords using these instructions. 


Note 


For information about the atomicity of memory accesses see Atomicity in the ARM architecture on page E2-2328. 








When any of the data types is described as unsigned, the N-bit data value represents a non-negative integer in the
range 0 to 2^N-1, using normal binary format.

When any of these types is described as signed, the N-bit data value represents an integer in the range -2^(N-1) to
+2^(N-1)-1, using two's complement format.
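
The two interpretations can be illustrated with a short Python sketch. The function names uint and sint are hypothetical stand-ins that mirror the unsigned and signed readings described above; Python integers are unbounded, so the N-bit value is masked explicitly.

```python
def uint(bits: int, n: int) -> int:
    """Interpret the low n bits of `bits` as an unsigned integer in 0 .. 2^n - 1."""
    return bits & ((1 << n) - 1)


def sint(bits: int, n: int) -> int:
    """Interpret the low n bits of `bits` as a two's complement integer
    in -2^(n-1) .. 2^(n-1) - 1."""
    value = bits & ((1 << n) - 1)
    # If the top bit is set, the value represents a negative number.
    return value - (1 << n) if value >> (n - 1) else value
```

The same 8-bit pattern 0xFF therefore reads as 255 when unsigned and as -1 when signed.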
The instructions that operate on packed halfwords or bytes include some multiply instructions that use only one of
two halfwords, and SIMD instructions that perform parallel addition or subtraction on all of the halfwords or bytes.

Note

These SIMD instructions operate on values held in the general-purpose registers, and must not be confused with the
Advanced SIMD instructions that operate on a separate register file that provides registers of up to 128 bits.








Direct instruction support for 64-bit integers is limited, and most 64-bit operations require sequences of two or more 
instructions to synthesize them. 


Integer arithmetic 


The instruction set provides a wide range of operations on the values in registers, including bitwise logical 
operations, shifts, additions, subtractions, multiplications, and divisions. The pseudocode described in 
Appendix K11 ARM Pseudocode Definition defines these operations, usually in one of three ways: 


• By direct use of the pseudocode operators and built-in functions defined in Operators on page K11-5638.
• By use of pseudocode helper functions defined in the main text. See Appendix K12 Pseudocode Index.
• By a sequence of the form:
1. Use of the SInt(), UInt(), and Int() built-in functions defined in Converting bitstrings to integers on
page K11-5651 to convert the bitstring contents of the instruction operands to the unbounded integers
that they represent as two's complement or unsigned integers.
2. Use of mathematical operators, built-in functions and helper functions on those unbounded integers to
calculate other such integers.
3. Use of either the bitstring extraction operator defined in Bitstring concatenation and slicing on
page K11-5639 or of the saturation helper functions described in Pseudocode description of saturation
on page E1-2291 to convert an unbounded integer result into a bitstring result that can be written to a
register.
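
The three-step sequence above can be sketched in Python, whose integers are unbounded like the pseudocode integers. The operation shown, a signed halving add, is a hypothetical example chosen only to illustrate the pattern, and the helper names are not taken from the pseudocode.

```python
def sint(bits: int, n: int) -> int:
    """Step 1 helper: convert an n-bit bitstring to the signed integer it represents."""
    value = bits & ((1 << n) - 1)
    return value - (1 << n) if value >> (n - 1) else value


def signed_halving_add(x: int, y: int, n: int = 32) -> int:
    """Add two n-bit signed operands, halve the sum, and return an n-bit bitstring."""
    # Step 1: convert the operand bitstrings to unbounded integers.
    a, b = sint(x, n), sint(y, n)
    # Step 2: calculate on unbounded integers, so no intermediate overflow occurs.
    total = (a + b) >> 1
    # Step 3: extract the low n bits to form a result that fits a register.
    return total & ((1 << n) - 1)
```

Working on unbounded integers in step 2 is what allows the sum of two large operands to be halved without losing the carry.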


Shift and rotate operations 
The following types of shift and rotate operations are used in instructions: 


Logical Shift Left
The LSL() pseudocode function moves each bit of a bitstring left by a specified number of bits. Zeros
are shifted in at the right end of the bitstring. Bits that are shifted off the left end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.

Logical Shift Right
The LSR() pseudocode function moves each bit of a bitstring right by a specified number of bits.
Zeros are shifted in at the left end of the bitstring. Bits that are shifted off the right end of the
bitstring are discarded, except that the last such bit can be produced as a carry output.

Arithmetic Shift Right
The ASR() pseudocode function moves each bit of a bitstring right by a specified number of bits.
Copies of the leftmost bit are shifted in at the left end of the bitstring. Bits that are shifted off the
right end of the bitstring are discarded, except that the last such bit can be produced as a carry output.

Rotate Right
The ROR() pseudocode function moves each bit of a bitstring right by a specified number of bits.
Each bit that is shifted off the right end of the bitstring is re-introduced at the left end. The last bit
shifted off the right end of the bitstring can be produced as a carry output.

Rotate Right with Extend
The RRX() pseudocode function moves each bit of a bitstring right by one bit. A carry input is shifted
in at the left end of the bitstring. The bit shifted off the right end of the bitstring can be produced as
a carry output.
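
The following Python sketch models the five operations on N-bit values, returning both the result and the carry output. It is an illustration of the descriptions above rather than the architectural pseudocode: the function names are hypothetical, bitstrings are represented as non-negative Python integers, and shift amounts are assumed to be greater than zero.

```python
def lsl_c(x: int, shift: int, n: int = 32) -> tuple:
    """Logical shift left; carry is the last bit shifted off the left end."""
    extended = x << shift
    return extended & ((1 << n) - 1), (extended >> n) & 1


def lsr_c(x: int, shift: int, n: int = 32) -> tuple:
    """Logical shift right; carry is the last bit shifted off the right end."""
    return (x >> shift) & ((1 << n) - 1), (x >> (shift - 1)) & 1


def asr_c(x: int, shift: int, n: int = 32) -> tuple:
    """Arithmetic shift right; copies of the leftmost bit are shifted in."""
    signed = x - (1 << n) if x >> (n - 1) else x
    return (signed >> shift) & ((1 << n) - 1), (signed >> (shift - 1)) & 1


def ror_c(x: int, shift: int, n: int = 32) -> tuple:
    """Rotate right; carry is the top bit of the result."""
    m = shift % n
    result = ((x >> m) | (x << (n - m))) & ((1 << n) - 1)
    return result, result >> (n - 1)


def rrx_c(x: int, carry_in: int, n: int = 32) -> tuple:
    """Rotate right by one bit through the carry."""
    return (carry_in << (n - 1)) | (x >> 1), x & 1
```

In each case the second element of the returned pair corresponds to the carry output mentioned in the description.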
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Pseudocode description of addition and subtraction 


In pseudocode, addition and subtraction can be performed on any combination of unbounded integers and bitstrings, 
provided that if they are performed on two bitstrings, the bitstrings must be identical in length. The result is another 
unbounded integer if both operands are unbounded integers, and a bitstring of the same length as the bitstring 
operand or operands otherwise. For the definition of these operations, see Addition and subtraction on
page K11-5640.


The main addition and subtraction instructions can produce status information about both unsigned carry and signed
overflow conditions. When necessary, multi-word additions and subtractions can be synthesized from this status
information. In pseudocode the AddWithCarry() function provides an addition with a carry input and a set of output
Condition flags including carry output and overflow.


An important property of the AddWithCarry() function is that if:


(result, nzcv) = AddWithCarry(x, NOT(y), carry_in)


Then:
• If carry_in == ‘1’, then result == x-y with:

— nzcv<0> == ‘1’ if signed overflow occurred during the subtraction.

— nzcv<1> == ‘1’ if unsigned borrow did not occur during the subtraction, that is, if x ≥ y.
• If carry_in == ‘0’, then result == x-y-1 with:

— nzcv<0> == ‘1’ if signed overflow occurred during the subtraction.

— nzcv<1> == ‘1’ if unsigned borrow did not occur during the subtraction, that is, if x > y.


Taken together, this means that the carry_in and nzcv<1> output in AddWithCarry() calls can act as NOT borrow flags
for subtractions as well as carry flags for additions.
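
The subtraction property can be checked with a small Python model. This is a sketch under stated assumptions, not the architecture pseudocode: operands are n-bit patterns held in Python integers, nzcv is returned as a 4-bit integer with N in bit 3 and V in bit 0, and the names add_with_carry and bit_not are chosen for this example.

```python
def add_with_carry(x, y, carry_in, n=32):
    """Add two n-bit values and a carry bit; return (result, nzcv)."""
    mask = (1 << n) - 1

    def to_signed(value):
        # Interpret an n-bit pattern as a two's complement integer.
        return value - (1 << n) if value >> (n - 1) else value

    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & mask
    n_flag = result >> (n - 1)
    z_flag = int(result == 0)
    # Carry: the unsigned sum did not fit in n bits.
    c_flag = int(result != unsigned_sum)
    # Overflow: the signed sum did not fit in n bits.
    v_flag = int(to_signed(result) != signed_sum)
    return result, (n_flag << 3) | (z_flag << 2) | (c_flag << 1) | v_flag


def bit_not(value, n=32):
    """Bitwise NOT of an n-bit pattern."""
    return value ^ ((1 << n) - 1)


# x - y is computed as x + NOT(y) + 1; nzcv bit 1 then acts as NOT borrow.
difference, flags = add_with_carry(5, bit_not(3), 1)
no_borrow = (flags >> 1) & 1
```

In this model, a multi-word subtraction would pass the carry output of the low-order word as the carry_in of the next word, which is the behavior the NOT borrow interpretation describes.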


Pseudocode description of saturation 


Some instructions perform saturating arithmetic, that is, if the result of the arithmetic overflows the destination 
signed or unsigned N-bit integer range, the result produced is the largest or smallest value in that range, rather than 
wrapping around modulo 2^N. This is supported in pseudocode by:


• The SignedSatQ() and UnsignedSatQ() functions when an operation requires, in addition to the saturated
result, a Boolean argument that indicates whether saturation occurred.


• The SignedSat() and UnsignedSat() functions when only the saturated result is required.


SatQ(i, N, unsigned) returns either UnsignedSatQ(i, N) or SignedSatQ(i, N) depending on the value of its third 
argument, and Sat(i, N, unsigned) returns either UnsignedSat(i, N) or SignedSat(i, N) depending on the value of 
its third argument. 
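
The relationship between these functions can be sketched in Python as follows. This is an illustrative model only. It assumes that the input i is an unbounded integer and that the result is returned as an N-bit pattern held in a Python integer, and the lower-case function names are chosen for this example.

```python
def signed_sat_q(i, n):
    """Saturate i to the signed n-bit range; return (n-bit pattern, saturated)."""
    upper = (1 << (n - 1)) - 1
    lower = -(1 << (n - 1))
    clamped = max(lower, min(upper, i))
    # Negative results are returned as two's complement bit patterns.
    return clamped & ((1 << n) - 1), clamped != i


def unsigned_sat_q(i, n):
    """Saturate i to the unsigned n-bit range; return (n-bit pattern, saturated)."""
    clamped = max(0, min((1 << n) - 1, i))
    return clamped, clamped != i


def sat_q(i, n, unsigned):
    """Select the unsigned or signed form from the third argument."""
    return unsigned_sat_q(i, n) if unsigned else signed_sat_q(i, n)


def sat(i, n, unsigned):
    """As sat_q(), but return only the saturated result."""
    return sat_q(i, n, unsigned)[0]
```

For example, saturating 200 to a signed 8-bit range produces the largest value in that range, 127, and indicates that saturation occurred.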


E1.2.3 The general-purpose registers, and the PC, in AArch32 state 
In the AArch32 Application level view, a PE has: 


• Fifteen general-purpose 32-bit registers, R0 to R14, of which R13 and R14 have alternative names reflecting
how they are, or can be, used:


— R13 is usually identified as SP.
— R14 is usually identified as LR.


• The PC (program counter), that can be described as R15.

The specialized uses of the SP (R13), LR (R14), and PC (R15) are:


SP, the stack pointer 
The PE uses SP as a pointer to the active stack. 


In the T32 instruction set, some instructions cannot access SP. Instructions that can access SP can 
use SP as a general-purpose register. 


The A32 instruction set provides more general access to SP, and it can be used as a general-purpose 
register. 
Note


Using SP for any purpose other than as a stack pointer might break the requirements of operating 
systems, debuggers, and other software systems, causing them to malfunction. 





Software can refer to SP as R13. 


LR, the link register 


The link register can be used to hold return link information, and some cases described in this 
manual require this use of the LR. When software does not require the LR for linking, it can use it 
for other purposes. Software can refer to LR as R14. 


PC, the program counter 


• When executing an A32 instruction, PC reads as the address of the current instruction plus 8.
• When executing a T32 instruction, PC reads as the address of the current instruction plus 4.
• Writing an address to PC causes a branch to that address.


Most T32 instructions cannot access PC. 


The A32 instruction set provides more general access to the PC, and many A32 instructions can use 
the PC as a general-purpose register. However, ARM deprecates the use of PC for any purpose other 
than as the program counter. See Writing to the PC for more information. 


Software can refer to PC as R15.


See AArch32 general-purpose registers, the PC, and the Special-purpose registers on page G1-3801 for the system 
level view of these registers. 


Note 


In general, ARM strongly recommends using the names SP, LR and PC instead of R13, R14 and R15. However, 
sometimes it is simpler to use the R13-R15 names when referring to a group of registers. For example, it is simpler 
to refer to registers R8 to R15, rather than to registers R8 to R12, the SP, LR and PC. These two descriptions of the 
group of registers have exactly the same meaning. 








Writing to the PC 


In the A32 and T32 instruction sets, many data-processing instructions can write to the PC. Writes to the PC are 
handled as follows: 


° In T32 state, the following 16-bit T32 instruction encodings branch to the value written to the PC: 
— Encoding T2 of ADD, ADDS (register) on page F5-2573. 
— Encoding T1 of MOV, MOVS (register) on page F5-2815. 


The value written to the PC is forced to be halfword-aligned by ignoring its least significant bit, treating that 
bit as being 0. 


° The B, BL, CBNZ, CBZ, CHKA, HB, HBL, HBLP, HBP, TBB, and TBH instructions remain in the same instruction set state 
and branch to the value written to the PC. 


The definition of each of these instructions ensures that the value written to the PC is correctly aligned for
the current instruction set state.

° The BLX (immediate) instruction switches between A32 and T32 states and branches to the value written to
the PC. Its definition ensures that the value written to the PC is correctly aligned for the new instruction set
state.

° The following instructions write a value to the PC, treating that value as an interworking address to branch
to, with low-order bits that determine the new instruction set state:

— BLX (register), BX, and BXJ.

— LDR instructions with <Rt> equal to the PC. 

— POP and all forms of LDM except LDM (exception return), when the register list includes the PC. 


— In A32 state only, ADC, ADD, ADR, AND, ASR (immediate), BIC, EOR, LSL (immediate), LSR (immediate), MOV, 
MVN, ORR, ROR (immediate), RRX, RSB, RSC, SBC, and SUB instructions with <Rd> equal to the PC and without 
flag-setting specified. 

For details of how an interworking address specifies the new instruction set state and instruction address, see
Pseudocode description of operations on the AArch32 general-purpose registers and the PC.


Note 


The register-shifted register instructions, that are available only in the A32 instruction set and are 
summarized in Data-processing register (register shift) on page F4-2513, are CONSTRAINED UNPREDICTABLE 
if they attempt to write to the PC, see Using R15 on page K1-5457. 








° Some instructions are treated as exception return instructions, and write both the PC and the CPSR. For more 
information, including which instructions are exception return instructions, see Exception return to an 
Exception level using AArch32 on page G1-3834. 


° Some instructions cause an exception, and the exception handler address is written to the PC as part of the 
exception entry. 


Pseudocode description of operations on the AArch32 general-purpose registers and 
the PC 


In pseudocode, the uses of the R[] function, with an index parameter n, are: 
• Reading or writing R0-R12, SP, and LR, using n = 0-12, 13, and 14 respectively.
• Reading the PC, using n = 15.


Pseudocode description of general-purpose register and PC operations on page G1-3803 describes accesses to 
these registers. 


Descriptions of A32 store instructions that store the PC value use the PCStoreValue() pseudocode function to specify 
the PC value stored by the instruction. 


Writing an address to the PC causes either a simple branch to that address or an interworking branch that also selects 
the instruction set to execute after the branch. A simple branch is performed by the BranchWritePC() function. 


An interworking branch is performed by the BXWritePC() function. 
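
The difference between a simple branch and an interworking branch can be sketched in Python. This is not the text of BranchWritePC() or BXWritePC(). It is a model that assumes bit[0] of an interworking address selects T32 when it is 1 and A32 when it is 0, and that an A32 target with bit[1] set is either forced to word alignment or left unaligned. The force_align parameter that selects between those two outcomes, and the function names, are hypothetical.

```python
def branch_write_pc(address, instr_set):
    """Simple branch: stay in the current instruction set and align the target."""
    if instr_set == "A32":
        return address & 0xFFFFFFFC, instr_set   # word-aligned
    return address & 0xFFFFFFFE, instr_set       # halfword-aligned


def bx_write_pc(address, force_align=True):
    """Interworking branch: bit[0] of the address selects the instruction set."""
    if address & 1:
        # Bit[0] == 1 selects T32; the bit itself is not part of the target.
        return address & 0xFFFFFFFE, "T32"
    if (address & 2) and force_align:
        # Assumed option: force an unaligned A32 target to word alignment.
        address &= 0xFFFFFFFD
    return address, "A32"
```

Under these assumptions, an interworking branch to 0x8001 continues execution at 0x8000 in T32 state, and a branch to 0x8000 continues at the same address in A32 state.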


The LoadWritePC() and ALUWritePC() functions are used for two cases where the behavior was systematically 
modified between architecture versions. 
E1.2.4 Process state, PSTATE
Process state or PSTATE is an abstraction of process state information. All of the instruction sets provide 
instructions that operate on elements of PSTATE. 
Note 
In this chapter, references to PSTATE link to the more appropriate of: 
° The Application-level view of PSTATE given in this section. 
° The System-level description in Process state, PSTATE on page G1-3805. 
The following PSTATE information is accessible at EL0:
The condition flags
Flag-setting instructions set these. They are:
N Negative condition flag. If the result of the instruction is regarded as a two's
complement signed integer, the PE sets this to:
• 1 if the result is negative.
• 0 if the result is positive or zero.
Z Zero condition flag. Set to:
• 1 if the result of the instruction is zero.
• 0 otherwise.
A result of zero often indicates an equal result from a comparison.
C Carry condition flag. Set to:
• 1 if the instruction results in a carry condition, for example an unsigned overflow
that is the result of an addition.
• 0 otherwise.
V Overflow condition flag. Set to:
• 1 if the instruction results in an overflow condition, for example a signed
overflow that is the result of an addition.
• 0 otherwise.
Conditional instructions test the N, Z, C, and V condition flags, combining them with the condition 
code for the instruction, to determine whether the instruction must be executed. In this way, 
execution of the instruction is conditional on the result of a previous operation. For more 
information about conditional execution, see Conditional execution on page F2-2407. 
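
The way a condition code is combined with the N, Z, C, and V flags can be sketched in Python. This is an illustrative model, not the architecture pseudocode. It assumes the standard 4-bit AArch32 condition code encoding, in which bits [3:1] select a base test and bit [0] inverts it, and the function name condition_holds is chosen for this example.

```python
def condition_holds(cond, n, z, c, v):
    """Return True if the 4-bit condition code passes for the given flags."""
    base = cond >> 1
    if base == 0b000:
        result = z == 1                 # EQ or NE
    elif base == 0b001:
        result = c == 1                 # CS or CC
    elif base == 0b010:
        result = n == 1                 # MI or PL
    elif base == 0b011:
        result = v == 1                 # VS or VC
    elif base == 0b100:
        result = c == 1 and z == 0      # HI or LS
    elif base == 0b101:
        result = n == v                 # GE or LT
    elif base == 0b110:
        result = n == v and z == 0      # GT or LE
    else:
        result = True                   # AL
    # Bit [0] inverts the base test, except for the encoding 0b1111.
    if (cond & 1) and cond != 0b1111:
        result = not result
    return result
```

For example, after a comparison of two equal values sets Z to 1, the EQ condition (0b0000) passes and the NE condition (0b0001) fails.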
The overflow or saturation flag 
Q Some instructions can set this. For those instructions that can, the PE: 
° Sets it to 1 if the instruction indicates overflow or saturation. 
° Leaves it unchanged otherwise. 
For more information, see Pseudocode description of saturation on page E1-2291. 
The greater than or equal flags 
GE[3:0] The instructions described in Parallel addition and subtraction instructions on
page F1-2378 update these to indicate the results from individual bytes or halfwords of 
the operation. These flags can control a later SEL instruction. For more information, see 
SEL on page F5-2964. 
PSTATE also contains PE state controls. There is no direct access to these from application level instructions, but
they can be changed by side-effects of application level instructions. They are: 


Instruction set state 


J,T The current instruction set state, as shown in Table E1-1. In ARMv8, the J bit is RES0,
see the Note in this section.


Table E1-1 PSTATE.{J, T} encoding

J  T  Instruction set state
0  0  A32
0  1  T32

A32 The PE is executing the A32 instruction set, summarized in Chapter F4 The
A32 Instruction Set Encoding.
T32 The PE is executing the T32 instruction set, summarized in Chapter F3 The
T32 Instruction Set Encoding.
Note


Encoding with J==1 before ARMv8, Jazelle and T32EE states 
In previous versions of the ARM architecture, the encoding {1, 0} selected 
Jazelle state, and encoding {1, 1} selected T32EE state. ARMv8 does not 
support either of these states, and these are encodings for unimplemented 
instruction set states, see Unimplemented instruction sets on page G1-3810. 
ARMv8 AArch32 state requires a Trivial Jazelle implementation, see
Trivial implementation of the Jazelle extension on page G1-3810. 





The IT block state 


IT[7:0] The If-Then controls for the T32 IT instruction, that applies to the IT block of
instructions that immediately follow the IT instruction. See IT on page F5-2681 for a
description of the IT instruction and its associated IT block.

For more information about the use of PSTATE.IT see Use of PSTATE.IT on
page E1-2297.


Endianness mapping 


E For data accesses, controls the endianness: 
0 Little-endian. 
1 Big-endian. 
If an implementation does not provide: 
• Big-endian support for data accesses, this bit is RES0.
• Little-endian support for data accesses, this bit is RES1.


Instruction fetches are always little-endian, and ignore PSTATE.E. 
Accessing PSTATE fields at EL0


The following sections describe which PSTATE fields can be directly accessed at EL0, and how they can be
accessed:


° The Application Program Status Register, APSR. 
° The SETEND instruction. 


The Application Program Status Register, APSR 


At EL0, some PSTATE fields can be accessed using the Special-purpose Application Program Status Register
(APSR). The APSR can be directly read using the MRS instruction, and directly written using the MSR (register) 
and MSR (immediate) instructions. 


The APSR bit assignments are: 

Bits      Field
[31:28]   N, Z, C, V (condition flags)
[27]      Q
[26:24]   RAZ/SBZP
[23:20]   Reserved
[19:16]   GE[3:0]
[15:0]    Reserved


N, Z, C, V, bits [31:28] 
The PSTATE condition flags. 


Q, bit [27] The PSTATE overflow or saturation flag. 


Bits[26:24] Reserved, RAZ/SBZP. Software can use MSR instructions that write the top byte of the APSR without 
using a read-modify-write sequence. If it does this, it must write zeros to bits[26:24]. 


Bits[23:20, 15:0] 


Reserved bits that are allocated to system features, or are available for future expansion. 
Unprivileged execution ignores writes to fields that are accessible only at EL1 or higher. However, 
application level software that writes to the APSR must treat reserved bits as Do-Not-Modify 
(DNM) bits. For more information about the reserved bits, see The Current Program Status Register, 
CPSR on page G1-3807. 


GE[3:0], bits [19:16] 
The PSTATE greater than or equal flags. 
The other PSTATE fields cannot be accessed by using the APSR. 
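
The APSR bit assignments listed above can be illustrated with a short Python sketch that extracts each field from a 32-bit APSR value. The function name decode_apsr and the dictionary layout are chosen for this example. The reserved bits are ignored, which is an assumption about what an application level reader of the value would do.

```python
def decode_apsr(value):
    """Split a 32-bit APSR value into the fields accessible at EL0."""
    return {
        "N": (value >> 31) & 1,      # bit [31]
        "Z": (value >> 30) & 1,      # bit [30]
        "C": (value >> 29) & 1,      # bit [29]
        "V": (value >> 28) & 1,      # bit [28]
        "Q": (value >> 27) & 1,      # bit [27]
        "GE": (value >> 16) & 0xF,   # bits [19:16]
    }


# Example: Z and C set, and GE[3:0] == 0b0101.
fields = decode_apsr(0x60050000)
```

Software that writes the APSR must still treat the reserved bits as described above. This sketch covers only the read direction.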


The system level alias for the APSR is the CPSR. The CPSR is a superset of the APSR. See The Current Program 
Status Register, CPSR on page G1-3807. 


Writes to the PSTATE fields have side-effects on various aspects of PE operation. All of these side-effects, except 
side-effects on memory accesses associated with fetching instructions, are synchronous to the APSR write. This 
means they are guaranteed: 


° Not to be visible to earlier instructions in the execution stream. 
° To be visible to later instructions in the execution stream. 


The SETEND instruction


The A32 and T32 instruction sets both include an instruction to manipulate PSTATE.E: 
SETEND BE Sets PSTATE.E to 1, for big-endian operation. 
SETEND LE Sets PSTATE.E to 0, for little-endian operation. 


The SETEND instruction is unconditional. For more information, see SETEND on page F5-2966. ARM deprecates use 
of the SETEND instruction. 
Use of PSTATE.IT


PSTATE.IT provides the If-Then controls for the T32 IT instruction, that applies to the IT block of instructions that
immediately follow the IT instruction.


PSTATE.IT divides into two subfields: 


IT[7:5] Holds the base condition for the current IT block. The base condition is the top three bits of the 
condition code specified by the <firstcond> field of the IT instruction. 


IT[4:0] Encodes: 


° Implicitly, the size of the IT block. This is the number of instructions that are to be 
conditionally executed. The size of the block is indicated by the position of the least 
significant 1 in this field, as shown in Table E1-2. 


° For each instruction in the IT block, the least significant bit of the condition code. This is 
encoded in the IT block entries that Table E1-2 shows as Nx. 
Note


Changing the least significant bit of a condition code from 0 to 1 has the effect of inverting 
the condition code. 





Both subfields are all zeros when no IT block is active. 


When an IT instruction is executed, PSTATE.IT is set according to the <firstcond> field of the instruction and the
Then and Else (T and E) parameters in the instruction, see IT on page F5-2681. This means that, on executing an IT
instruction, the initial state of PSTATE.IT depends on the number of instructions in the IT block, as Table E1-2
shows:


Table E1-2 Initial state of PSTATE.IT on executing an IT instruction

Number of instructions             PSTATE.IT bits (a)                      Notes
in IT block                        [7:5]      [4]  [3]  [2]  [1]  [0]
4                                  cond_base  N1   N2   N3   N4   1    -
3                                  cond_base  N1   N2   N3   1    0    -
2                                  cond_base  N1   N2   1    0    0    -
1                                  cond_base  N1   1    0    0    0    -
Not executing an IT instruction    000        0    0    0    0    0    No IT block is active

a. Combinations of the IT bits not shown in this table are reserved.


In Table E1-2, N1 refers to the first instruction in the IT block, and N2, N3, and N4 refer to the second, third, and
fourth instructions in the IT block if they are present.


When permitted, an instruction in an IT block is conditional, see Conditional instructions on page F1-2369 and
Conditional execution on page F2-2407. The condition code used is the current value of IT[7:4]. When an
instruction in an IT block completes its execution normally, PSTATE.IT[4:0] is left-shifted by one bit, so that
PSTATE.IT[4] always relates to the next instruction to be executed. Table E1-3 on page E1-2298 shows how
PSTATE.IT is updated during the execution of an IT instruction with four instructions in the IT block.

Table E1-3 Updates to PSTATE.IT when executing an IT instruction with a four-instruction IT block

IT block instruction               PSTATE.IT bits                          Notes
being executed                     [7:5]      [4]  [3]  [2]  [1]  [0]
First                              cond_base  N1   N2   N3   N4   1    -
Second                             cond_base  N2   N3   N4   1    0    -
Third                              cond_base  N3   N4   1    0    0    -
Fourth                             cond_base  N4   1    0    0    0    -
Not executing an IT instruction    000        0    0    0    0    0    No IT block is active
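
The progression of PSTATE.IT through an IT block can be traced with a small Python model. This is a sketch, not the architecture pseudocode. It assumes that PSTATE.IT is held as an 8-bit integer formed from the 4-bit first condition and a 4-bit mask, and the function names are chosen for this example.

```python
def it_initial(firstcond, mask):
    """Form the 8-bit IT value from the first condition and the 4-bit mask."""
    return ((firstcond & 0xF) << 4) | (mask & 0xF)


def in_it_block(it):
    """An IT block is active while IT[3:0] is nonzero."""
    return (it & 0xF) != 0


def last_in_it_block(it):
    """The last instruction of a block is marked by IT[3:0] == 0b1000."""
    return (it & 0xF) == 0b1000


def it_advance(it):
    """Advance the IT value after normal execution of an IT block instruction."""
    if (it & 0b111) == 0:
        return 0                      # the block is complete
    # Keep IT[7:5] and shift IT[4:0] left by one bit.
    return (it & 0xE0) | ((it << 1) & 0x1F)


def it_block_conditions(it):
    """List the condition code, IT[7:4], applied to each instruction in turn."""
    conditions = []
    while in_it_block(it):
        conditions.append(it >> 4)
        it = it_advance(it)
    return conditions


# A four-instruction block with first condition EQ (0b0000) and mask 0b0111:
# the conditions applied are EQ, EQ, NE, NE.
conditions = it_block_conditions(it_initial(0b0000, 0b0111))
```

The mask value 0b0111 in the example corresponds to the first row of Table E1-2 with N2 = 0, N3 = 1, and N4 = 1, so the third and fourth instructions use the inverted condition.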





A few instructions, for example BKPT, cannot be conditional and therefore are always executed ignoring the current 
value of PSTATE.IT. 


For details of what happens if an instruction in an IT block takes an exception, see Overview of exception entry on 
page G1-3819. 


An instruction that might complete its normal execution by branching is only permitted in an IT block as the last 
instruction in the block. This means that normal execution of the instruction always results in PSTATE.IT advancing 
to execution where no IT block is active. 


For performance reasons, ARMv8 deprecates the use of IT other than with a single 16-bit T32 instruction from a 
specified subset of the 16-bit T32 instructions, see Partial deprecation of IT on page K5-5531. In addition, 
implementations can provide a set of ITD control fields, SCTLR.ITD, SCTLR_EL1.ITD, and HSCTLR.ITD, to 
disable these deprecated uses, making them UNDEFINED. When an implementation includes ITD control fields, 
Changes to an ITD control by an instruction in an IT block describes the permitted CONSTRAINED UNPREDICTABLE 
behaviors if an instruction in an IT block changes the value of an ITD control to disable the use of the IT instruction. 


On a branch or an exception return, if PSTATE.IT is set to a value that is not consistent with the instruction stream 
being branched to or returned to, then instruction execution is CONSTRAINED UNPREDICTABLE. 


PSTATE.IT affects instruction execution only in T32 state. In A32 state, PSTATE.IT must be 0b00000000, otherwise 
the behavior is CONSTRAINED UNPREDICTABLE. 


For more information see CONSTRAINED UNPREDICTABLE behavior associated with IT instructions and 
PSTATE.IT on page K1-5458. 


Changes to an ITD control by an instruction in an IT block 


In an implementation that includes SCTLR.ITD, SCTLR_EL1.ITD, and HSCTLR.ITD controls, if an instruction in 
an IT block changes an ITD control so that the IT instruction using the IT block would be disabled, then one of the 
following behaviors applies: 


° The change to the ITD field, once synchronized, has no effect on the execution of instructions in the current 
IT block, but applies only to any subsequent execution of an IT instruction to which the control applies. 


° Synchronizing the change to the ITD field guarantees that all bits of PSTATE.IT are cleared to 0. 


In addition, after the change to the ITD field has been synchronized, any remaining instructions in the IT block that 
would be made UNDEFINED by the new value of ITD are either: 


° Executed normally. 
° Treated as UNDEFINED. 


The choice between the options described in this section is determined by the implementation, and any choice can 
vary between different changes to an ITD control by an instruction in an IT block. 
Pseudocode description of PSTATE PE state fields


The pseudocode function CurrentInstrSet() returns the current instruction set. The pseudocode function 
SelectInstrSet() selects a new instruction set. 


PSTATE.IT advances after normal execution of an IT block instruction. This is described by the
AArch32.ITAdvance() pseudocode function.


The pseudocode function InITBlock() tests whether the current instruction is in an IT block. The pseudocode 
function LastInITBlock() tests whether the current instruction is the last instruction in an IT block. 


The BigEndian() pseudocode function tests whether big-endian data memory accesses are currently selected. 


E1.2.5 Jazelle support 


ARMv8 requires AArch32 state to include a trivial implementation of the Jazelle extension, as described in Trivial
implementation of the Jazelle extension on page G1-3810. 







E1.3 Advanced SIMD and floating-point instructions


In general, ARMv8 requires implementation of Advanced SIMD and floating-point instructions in the T32 and A32 
instruction sets, but see Implications of not including Advanced SIMD and floating-point support on page E1-2306. 


The Advanced SIMD instructions perform packed Single Instruction Multiple Data (SIMD) operations, either 
integer or single-precision floating-point. The floating-point instructions perform single-precision or 
double-precision scalar floating-point operations. 


These instructions permit floating-point exceptions, such as overflow or division by zero, to be handled without 
trapping. When handled in this way, a floating-point exception causes a cumulative status register bit to be set to 1 
and a default result to be produced by the operation. ARMv8 also optionally supports the trapping of floating-point 
exceptions, see Trapping of floating-point exceptions on page E1-2302. For more information about floating-point 
exceptions see Floating-point exceptions on page E1-2303. 


The floating-point and Advanced SIMD instructions also provide conversion functions in both directions between 
half-precision floating-point and single-precision floating-point. 


Some Advanced SIMD instructions support polynomial arithmetic over {0, 1}, as described in Polynomial 
arithmetic over {0, 1} on page A1-45. 


For system level information about the Advanced SIMD and Floating-point implementation see Advanced SIMD 
and floating-point support on page G1-3880. 


The following sections give more information about the Advanced SIMD and floating-point instructions: 
• Floating-point standards, and terminology on page A1-48.
• The SIMD and floating-point register file.
• Data types supported by the Advanced SIMD implementation on page E1-2302.
• Advanced SIMD and floating-point System registers on page E1-2302.
• Trapping of floating-point exceptions on page E1-2302.
• Floating-point data types and arithmetic on page E1-2303.
• Floating-point exceptions on page E1-2303.
• Controls of Advanced SIMD operation that do not apply to floating-point operation on page E1-2306.
• Implications of not including Advanced SIMD and floating-point support on page E1-2306.
• Pseudocode description of floating-point operations on page E1-2306.

E1.3.1 The SIMD and floating-point register file 


The Advanced SIMD and floating-point instructions use the same register file, that comprises 32 registers. This is 
distinct from the register file that holds the general-purpose registers and the PC. 


The Advanced SIMD and floating-point views of the register file are different. The following sections describe these 
different views. Figure E1-1 on page E1-2301 shows the views of the register file, and the way the word, 
doubleword, and quadword registers overlap. 


Advanced SIMD views of the register file 


Advanced SIMD can view this register file as: 
• Sixteen 128-bit quadword registers, Q0-Q15.
• Thirty-two 64-bit doubleword registers, D0-D31.


These views can be used simultaneously. For example, a program might hold 64-bit vectors in D0 and D1 and a
128-bit vector in Q1.

Floating-point views of the register file


The Advanced SIMD and floating-point register file consists of thirty-two doubleword registers, that can be viewed
as:
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available to Advanced SIMD instructions.
• Thirty-two 32-bit single word registers, S0-S31. Only half of the set is accessible in this view.


The two views can be used simultaneously. 


SIMD and Floating-point register file mapping onto registers 


Figure E1-1 shows the different views of the SIMD and floating-point register file, and the relationship between
them.

Figure E1-1 SIMD and floating-point register file, AArch32 operation


The mapping between the registers is as follows: 


• S<2n> maps to the least significant half of D<n>.
• S<2n+1> maps to the most significant half of D<n>.
• D<2n> maps to the least significant half of Q<n>.
• D<2n+1> maps to the most significant half of Q<n>.


For example, software can access the least significant half of the elements of a vector in Q6 by referring to D12, and 
the most significant half of the elements by referring to D13. 
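
The overlap between the three views can be modelled in Python as a single block of storage that is indexed at different granularities. This is a sketch for illustration only. The class name SimdRegisterFile and its methods are hypothetical, and the model assumes that each register is stored least significant byte first so that the mapping rules above reduce to byte offsets.

```python
class SimdRegisterFile:
    """Model of the overlapping S, D, and Q views of one register file."""

    def __init__(self):
        # Thirty-two 64-bit doubleword registers, D0-D31.
        self._storage = bytearray(32 * 8)

    def _read(self, offset, size):
        return int.from_bytes(self._storage[offset:offset + size], "little")

    def _write(self, offset, size, value):
        self._storage[offset:offset + size] = value.to_bytes(size, "little")

    def read_s(self, n):
        assert 0 <= n <= 31          # S0-S31 cover only D0-D15
        return self._read(4 * n, 4)

    def write_s(self, n, value):
        assert 0 <= n <= 31
        self._write(4 * n, 4, value)

    def read_d(self, n):
        assert 0 <= n <= 31
        return self._read(8 * n, 8)

    def write_d(self, n, value):
        assert 0 <= n <= 31
        self._write(8 * n, 8, value)

    def read_q(self, n):
        assert 0 <= n <= 15
        return self._read(16 * n, 16)

    def write_q(self, n, value):
        assert 0 <= n <= 15
        self._write(16 * n, 16, value)
```

In this model, writing Q6 and then reading D12 and D13 returns the least significant and most significant halves of the value written, matching the example above.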
Pseudocode description of the SIMD and Floating-point register file


The functions _Dclone, S[], and D[] provide the S0-S31, D0-D31, and Q0-Q15 views of the Advanced SIMD and
floating-point registers:


The Din[] function returns a Doubleword register from the _Dclone[] copy of the SIMD and Floating-point register 
file, and the Qin[] function returns a Quadword register from that register file. 


Note 


The CheckAdvSIMDEnabled() function copies the D[] register file to _Dclone[], see Pseudocode description of 
enabling SIMD and floating-point functionality on page G1-3919. 











E1.3.2 Data types supported by the Advanced SIMD implementation 
Advanced SIMD instructions can operate on integer and floating-point data, and the implementation defines a set 
of data types that support the required data formats. Vector formats in AArch32 state on page A1-38 describes these 
formats. 
Advanced SIMD vectors 
In an implementation that includes support for Advanced SIMD operation, a register can hold one or more packed 
elements, all of the same size and type. The combination of a register and a data type describes a vector of elements. 
The vector is considered to be an array of elements of the data type specified in the instruction. The number of 
elements in the vector is implied by the size of the data elements and the size of the register. 
Vector indices are in the range 0 to (number of elements - 1). An index of 0 refers to the least significant end of the
vector. In Vector formats in AArch32 state on page A1-38, Figure A1-3 on page A1-40 shows the Advanced SIMD 
vector formats. 
Pseudocode description of Advanced SIMD vectors 
The pseudocode function Elem[] accesses the element of a specified index and size in a vector. 
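
Element access by index and size can be sketched in Python as follows. This is an illustrative model rather than the Elem[] pseudocode. It assumes that a vector is held as a Python integer with element 0 in the least significant bits, and the names elem_get and elem_set are chosen for this example.

```python
def elem_get(vector, e, size):
    """Return element e of the given size in bits from a vector."""
    return (vector >> (e * size)) & ((1 << size) - 1)


def elem_set(vector, e, size, value):
    """Return a copy of the vector with element e replaced by value."""
    field = ((1 << size) - 1) << (e * size)
    return (vector & ~field) | ((value << (e * size)) & field)


# A 64-bit vector of four 16-bit elements with values 1, 2, 3, 4.
vector = 0x0004000300020001
third = elem_get(vector, 2, 16)
```

The same register contents can be interpreted with a different element size, for example as eight 8-bit elements, because the number of elements is implied by the element size and the register size.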

E1.3.3 Advanced SIMD and floating-point System registers 
The Advanced SIMD and floating-point instructions have a shared register space for System registers. Only one 
register in this space is accessible at the Application level, see FPSCR, Floating-Point Status and Control Register 
on page G6-4335. 
Writes to the FPSCR can have side-effects on various aspects of PE operation. All of these side-effects are 
synchronous to the FPSCR write. This means they are guaranteed not to be visible to earlier instructions in the 
execution stream, and they are guaranteed to be visible to later instructions in the execution stream. 
See Advanced SIMD and floating-point System registers on page G1-3882 for the system level view of the registers. 
These registers can be described as the SIMD and floating-point System registers. 

E1.3.4 Trapping of floating-point exceptions 
It is IMPLEMENTATION DEFINED whether the floating-point implementation supports the trapping of floating-point 
exceptions: 
• If it does, the FPSCR.{IDE, IXE, UFE, OFE, DZE, IOE} bits enable the exception traps.
• Otherwise, the FPSCR trap bits are RES0.
Trapped exception handling never causes the corresponding cumulative exception bit of the FPSCR to be set to 1. 
If this behavior is desired, the trap handler routine must use a read, modify, write sequence on the FPSCR to set the 
cumulative exception bit. 
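
The interaction between the trap enable bits and the cumulative bits can be sketched in Python. This is a model for illustration only. The bit positions used are taken from the FPSCR register description rather than from this section, and the function names and the two-letter exception keys are hypothetical.

```python
# Assumed FPSCR bit positions for the cumulative and trap enable bits.
CUMULATIVE_BIT = {"IO": 0, "DZ": 1, "OF": 2, "UF": 3, "IX": 4, "ID": 7}
TRAP_ENABLE_BIT = {"IO": 8, "DZ": 9, "OF": 10, "UF": 11, "IX": 12, "ID": 15}


def record_exception(fpscr, exc, trapping_supported):
    """Return (new FPSCR value, trapped) for one floating-point exception."""
    enabled = bool((fpscr >> TRAP_ENABLE_BIT[exc]) & 1)
    if trapping_supported and enabled:
        # Trapped: the cumulative bit is not set by the hardware.
        return fpscr, True
    # Untrapped: the cumulative bit is set to 1.
    return fpscr | (1 << CUMULATIVE_BIT[exc]), False


def handler_set_cumulative(fpscr, exc):
    """Read-modify-write step a trap handler can use to set the cumulative bit."""
    return fpscr | (1 << CUMULATIVE_BIT[exc])
```

A trap handler that wants the untrapped bookkeeping would read the FPSCR, apply the equivalent of handler_set_cumulative(), and write the result back.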
E1.3.5 Floating-point data types and arithmetic


The T32 and A32 floating-point instructions support single-precision (32-bit) and double-precision (64-bit) data 
types and arithmetic as defined by the IEEE 754 floating-point standard. They also support the half-precision 
(16-bit) floating-point data type for data storage only, by supporting conversions between single-precision and 
half-precision data types. 


ARM standard floating-point arithmetic means IEEE 754 floating-point arithmetic with the restrictions described 
in Floating-point and Advanced SIMD support on page A1-46, including supporting only the input and output 
values described in ARM standard floating-point input and output values on page A1-49. 


The AArch32 Advanced SIMD instructions support only single-precision ARM standard floating-point arithmetic. 


The following sections describe the Advanced SIMD and floating-point formats: 
• Half-precision floating-point formats on page A1-40.
• Single-precision floating-point format on page A1-42.
• Double-precision floating-point format on page A1-43.


The following sections describe features of Advanced SIMD and floating-point processing: 
• Flush-to-zero on page A1-49.
• NaN handling and the Default NaN on page A1-50.


E1.3.6 Floating-point exceptions 


ARM Advanced SIMD and floating-point instructions record the following floating-point exceptions in the FPSCR 
cumulative bits, unless the floating-point exception is trapped and generates an exception: 


FPSCR.IOC Invalid Operation. The bit is set to 1 if the result of an operation has no mathematical value or cannot 
be represented. Cases include, for example: 


• (infinity) × 0.
• (+infinity) + (-infinity).

These tests are made after flush-to-zero processing. For example, if flush-to-zero mode is selected,
multiplying a denormalized number and an infinity is treated as (0 × infinity), and causes an Invalid
Operation floating-point exception.


IOC is also set on any floating-point operation with one or more signaling NaNs as operands, except 
for negation and absolute value, as described in Floating-point negation and absolute value on 
page E1-2307. 


FPSCR.DZC Division by Zero. The bit is set to 1 if a divide operation has a zero divisor and a dividend that is
not zero, an infinity or a NaN. These tests are made after flush-to-zero processing, so if flush-to-zero 
processing is selected, a denormalized dividend is treated as zero and prevents Division by Zero 
from occurring, and a denormalized divisor is treated as zero and causes Division by Zero to occur 
if the dividend is a normalized number. 


For the reciprocal and reciprocal square root estimate functions the dividend is assumed to be +1.0. 
This means that a zero or denormalized operand to these functions sets the DZC bit. 


FPSCR.OFC Overflow. The bit is set to 1 if the absolute value of the result of an operation, produced after 
rounding, is greater than the maximum positive normalized number for the destination precision. 


FPSCR.UFC Underflow. The bit is set to 1 if the absolute value of the result of an operation, produced before 
rounding, is less than the minimum positive normalized number for the destination precision, and 
the rounded result is inexact. 


The criteria for the Underflow exception to occur are different in Flush-to-zero mode. For details, 
see Flush-to-zero on page A1-49. 

FPSCR.IXC Inexact. The bit is set to 1 if the result of an operation is not equivalent to the value that would be
produced if the operation were performed with unbounded precision and exponent range.

The criteria for the Inexact exception to occur are different in Flush-to-zero mode. For details, see
Flush-to-zero on page A1-49.

FPSCR.IDC Input Denormal. The bit is set to 1 if a denormalized input operand is replaced in the computation
by a zero, as described in Flush-to-zero on page A1-49.
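As an informal illustration only, the Division by Zero rule described above can be modelled in C. This sketch is not architecture pseudocode: the names fp_class, classify(), and divide_sets_dzc() are assumptions made for this example, and only single-precision operands are considered.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operand classes used by this sketch. */
typedef enum { CLS_ZERO, CLS_DENORMAL, CLS_NORMAL, CLS_INFINITY, CLS_NAN } fp_class;

/* Classify a single-precision value from its IEEE 754 bit pattern. */
static fp_class classify(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;
    if (exponent == 0xFFu)
        return fraction ? CLS_NAN : CLS_INFINITY;
    if (exponent == 0u)
        return fraction ? CLS_DENORMAL : CLS_ZERO;
    return CLS_NORMAL;
}

/* Returns true if a divide would set DZC: the divisor is zero and the
   dividend is not zero, an infinity, or a NaN. The test is made after
   flush-to-zero processing of both operands. */
static bool divide_sets_dzc(float dividend, float divisor, bool flush_to_zero)
{
    fp_class n = classify(dividend);
    fp_class d = classify(divisor);
    if (flush_to_zero) {
        if (n == CLS_DENORMAL) n = CLS_ZERO;
        if (d == CLS_DENORMAL) d = CLS_ZERO;
    }
    return d == CLS_ZERO && (n == CLS_NORMAL || n == CLS_DENORMAL);
}
```

With flush-to-zero selected, a denormalized divisor is treated as zero and so reports Division by Zero for a normalized dividend, whereas a denormalized dividend is treated as zero and suppresses it.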


For Advanced SIMD instructions, and for floating-point instructions when floating-point exception trapping is not 
supported, these are non-trapping exceptions and the data-processing instructions do not generate any trapped 
exceptions. 


For floating-point instructions when floating-point exception trapping is supported: 


These exceptions can be trapped, by setting trap enable bits in the FPSCR, see Trapping of floating-point 
exceptions on page E1-2302 and Floating-point exception traps on page G1-3883, and: 


— When a trap is not enabled, the corresponding floating-point exception updates the corresponding
FPSCR cumulative bit and does not generate an exception.


— When a trap is enabled, the corresponding floating-point exception does not update the FPSCR, but
generates an exception. In this case, bits in the FPEXC indicate which floating-point exceptions have
occurred.


The definition of the Underflow exception is different in the trapped and cumulative exception cases. In the 
trapped case the definition is: 


— The trapped Underflow exception occurs if the absolute value of the result of an operation, produced 
before rounding, is less than the minimum positive normalized number for the destination precision, 
regardless of whether the rounded result is inexact. 


As with cumulative exceptions, higher priority trapped exceptions can prevent lower priority exceptions from 
occurring, as described in Combinations of floating-point exceptions on page E1-2305. 


For Invalid Operation exceptions, for details of which quiet NaN is produced as the default result see NaN 
handling and the Default NaN on page A1-50. 


For Overflow exceptions, the sign bit of the default result is determined normally for the overflowing 
operation. 


For Division by Zero exceptions, the sign bit of the default result is determined normally for a division. This 
means it is the exclusive OR of the sign bits of the two operands. 


Table E1-4 shows the results of untrapped floating-point exceptions. That table uses the following abbreviations: 


MaxNorm The maximum normalized number of the destination precision.

RM Round towards Minus Infinity mode, as defined in the IEEE 754 standard.

RN Round to Nearest mode, as defined in the IEEE 754 standard.

RP Round towards Plus Infinity mode, as defined in the IEEE 754 standard.

RZ Round towards Zero mode, as defined in the IEEE 754 standard.


For more information about the IEEE 754 descriptions of the rounding modes see Floating-point standards, and 
terminology on page A1-48. 


Table E1-4 Results of untrapped floating-point exceptions

Exception type           Default result for positive sign   Default result for negative sign
IOC, Invalid Operation   Quiet NaN                          Quiet NaN
DZC, Division by Zero    +infinity                          -infinity
OFC, Overflow            RN, RP: +infinity                  RN, RM: -infinity
                         RM, RZ: +MaxNorm                   RP, RZ: -MaxNorm
UFC, Underflow           Normal rounded result              Normal rounded result
IXC, Inexact             Normal rounded result              Normal rounded result
IDC, Input Denormal      Normal rounded result              Normal rounded result
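The Overflow row of Table E1-4 can be read as a small decision on rounding mode and sign. The following C sketch shows one way to express it for a single-precision destination. The type rounding_mode and the function overflow_default_result() are illustrative names, not part of the architecture pseudocode, and FLT_MAX stands in for MaxNorm.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Rounding modes, named after the abbreviations used with Table E1-4. */
typedef enum { RN, RP, RM, RZ } rounding_mode;

/* Default result of an untrapped Overflow for a single-precision
   destination. 'negative' is the sign of the overflowing result. */
static float overflow_default_result(rounding_mode mode, bool negative)
{
    bool to_infinity;
    if (negative)
        to_infinity = (mode == RN || mode == RM);
    else
        to_infinity = (mode == RN || mode == RP);

    /* FLT_MAX is the maximum normalized single-precision number. */
    float magnitude = to_infinity ? INFINITY : FLT_MAX;
    return negative ? -magnitude : magnitude;
}
```

The directed rounding modes return an infinity only when the overflow is in the direction of rounding. In every other case the result saturates at MaxNorm with the appropriate sign.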
Combinations of floating-point exceptions


Many pseudocode functions perform floating-point operations, including FixedToFP(), FPAdd(), FPCompare(), 
FPCompareEQ(), FPCompareGE(), FPCompareGT(), FPDiv(), FPMax(), FPMin(), FPMul(), FPMulAdd(), FPRecipEstimate(), 
FPRecipStep(), FPRSqrtEstimate(), FPRSqrtStep(), FPSqrt(), FPSub(), and FPToFixed(). All of these operations can 
generate floating-point exceptions. 





Note 
FPAbs() and FPNeg() are not classified as floating-point operations because: 
° They cannot generate floating-point exceptions.
° The floating-point operation behavior described in the following sections does not apply to them: 


—  Flush-to-zero on page A1-49. 
— NaN handling and the Default NaN on page A1-50. 





More than one exception can occur on the same operation. The only combinations of exceptions that can occur are: 


° Overflow with Inexact. 
° Underflow with Inexact. 
° Input Denormal with other exceptions. 


When none of the exceptions caused by an operation is trapped, any exception that occurs causes the associated 
cumulative bit in the FPSCR to be set. 


When one or more exceptions caused by an operation are trapped, the behavior of the instruction depends on the 
priority of the exceptions. The Inexact exception is treated as lowest priority, and Input Denormal as highest priority: 


° If the higher priority exception is trapped, its trap handler is called. It is IMPLEMENTATION DEFINED whether
any information about the lower priority exception is provided. 


Note 


Information about the lower priority exception might be provided in: 





— The FPEXC, if the exception generated by the trap is taken to an Exception level that is using 
AArch32. 


— The ESR_ELx.ISS field, if the exception generated by the trap is taken to an Exception level that is 
using AArch64. 


However, information might be provided in another IMPLEMENTATION DEFINED way, for example using an 
IMPLEMENTATION DEFINED register. 





Apart from this, the lower priority exception is ignored in this case. 


° If the higher priority exception is untrapped, its cumulative bit is set to 1 and its default result is evaluated. 
Then the lower priority exception is handled normally, using this default result. 


Some floating-point instructions specify more than one floating-point operation, as indicated by the pseudocode 
descriptions of the instruction. In such cases, an exception on one operation is treated as higher priority than an 
exception on another operation if the occurrence of the second exception depends on the result of the first operation. 
Otherwise, it is CONSTRAINED UNPREDICTABLE which exception is treated as higher priority. 


For example, a VMLA.F32 instruction specifies a floating-point multiplication followed by a floating-point addition. 
The addition can generate Overflow, Underflow and Inexact exceptions, all of which depend on both operands to 
the addition and so are treated as lower priority than any exception on the multiplication. The same applies to Invalid 
Operation exceptions on the addition caused by adding opposite-signed infinities. The addition can also generate an 
Input Denormal exception, caused by the addend being a denormalized number while in Flush-to-zero mode. It is 
CONSTRAINED UNPREDICTABLE which of an Input Denormal exception on the addition and an exception on the 
multiplication is treated as higher priority, because the occurrence of the Input Denormal exception does not depend 
on the result of the multiplication. The same applies to an Invalid Operation exception on the addition caused by the 
addend being a signaling NaN. 
Note


The VFMA instruction performs a vector addition and a vector multiplication as a single operation. The VFMS 
instruction performs a vector subtraction and a vector multiplication as a single operation. 








E1.3.7 Controls of Advanced SIMD operation that do not apply to floating-point operation 


ARMv7 permitted implementation of either, both, or neither of the Advanced SIMD and floating-point additions to
the base instruction set, and provided some controls that applied to the Advanced SIMD functionality but not to the 
floating-point functionality. In ARMv8, Advanced SIMD functionality cannot be separated from floating-point 
functionality, but in AArch32 state these controls function as they did in ARMv7. This means they apply only to the 
following instructions and instruction encodings: 


° All instructions with encodings defined in: 
— Advanced SIMD data-processing on page F3-2454, for the T32 instruction set. 
— Advanced SIMD data-processing on page F4-2541, for the A32 instruction set. 


° All instructions with encodings defined in:
— Advanced SIMD element or structure Load/Store on page F3-2479, for the T32 instruction set. 
— Advanced SIMD element or structure Load/Store on page F4-2553, for the A32 instruction set. 


° The form of the VDUP instruction described in VDUP (general-purpose register) on page F6-3394.


° The byte and halfword forms of the VMOV instructions described in each of: 
— VMOV (general-purpose register to scalar) on page F6-3512. 
—  VMOV (scalar to general-purpose register) on page F6-3516. 


The controls of this functionality are: 
° The CPACR.ASEDIS field. 
° The HCPTR.TASE field. 


In an implementation that supports Advanced SIMD functionality, support for each of these controls is optional: 


° If the CPACR.ASEDIS control is not supported then the CPACR.ASEDIS field is RAZ/WI. This is 
equivalent to the control permitting the execution of Advanced SIMD instructions at EL1 and EL0.


° If the HCPTR.TASE control is not supported then the HCPTR.TASE field is RAZ/WI. This means the 
HCPTR does not provide a control that can trap Non-secure execution of Advanced SIMD instructions to 
Hyp mode. 


E1.3.8 Implications of not including Advanced SIMD and floating-point support 


In general, ARMv8 requires the inclusion of the Advanced SIMD and floating-point instructions in all instruction 
sets. Exceptionally, for implementations targeting specialized markets, ARM might produce or license an ARMv8-A
implementation that does not provide any support for Advanced SIMD and floating-point instructions. In such an 
implementation, in AArch32 state: 


° Each of the CPACR.{cp10, cp11} fields is RES0.
° The CPACR.ASEDIS bit is RES0.
° Each of the HCPTR.{TASE, TCP10, TCP11} fields is RES1.
° Each of the NSACR.{NSASEDIS, cp10, cp11} fields is RES0.
° The FPEXC register is UNDEFINED.


E1.3.9 Pseudocode description of floating-point operations 
The following subsections contain pseudocode definitions of the floating-point functionality supported by the
ARMv8 architecture:
° Generation of specific floating-point values on page E1-2307.
° Floating-point negation and absolute value on page E1-2307.
° Floating-point value unpacking.
° Floating-point exception and NaN handling.
° Floating-point rounding.
° Selection of ARM standard floating-point arithmetic on page E1-2308.
° Floating-point comparisons on page E1-2308.
° Floating-point maximum and minimum on page E1-2308.
° Floating-point addition and subtraction on page E1-2308.
° Floating-point multiplication and division on page E1-2308.
° Floating-point fused multiply-add on page E1-2308.
° Floating-point reciprocal estimate and step on page E1-2308.
° Floating-point square root on page E1-2309.
° Floating-point reciprocal square root estimate and step on page E1-2309.
° Floating-point conversions on page E1-2311.


Generation of specific floating-point values 


The following pseudocode functions generate specific floating-point values. The sign argument is '0' for the 
positive version and '1' for the negative version: 


° FPInfinity().
° FPMaxNormal().
° FPZero().
° FPTwo().
° FPThree().
° FPDefaultNaN().


Floating-point negation and absolute value 


The floating-point negation and absolute value operations only affect the sign bit. They do not treat NaN operands 
specially, nor denormalized number operands when flush-to-zero is selected. 


The floating-point negation operation is described by the pseudocode function FPNeg(). The floating-point absolute 
value operation is described by the pseudocode function FPAbs(). 


Floating-point value unpacking 


The FPUnpack() function determines the type of a floating-point number, defined by FPType{}, and its numerical 
value. It also does flush-to-zero processing on input operands. 


Floating-point exception and NaN handling 


The FPProcessException() procedure checks whether a floating-point exception is trapped, and handles it 
accordingly. The floating-point exception types are defined by FPExc{}. 


The FPProcessNaN() function processes a NaN operand, producing the correct result value and generating an Invalid 
Operation exception if necessary. The FPProcessNaNs() function performs the standard NaN processing for a 
two-operand operation. The FPProcessNaNs3() function performs the standard NaN processing for a three-operand 
operation. 


Floating-point rounding 


The FPRound() function rounds and encodes a floating-point result value to a specified destination format. This 
includes processing Overflow, Underflow and Inexact floating-point exceptions and performing flush-to-zero 
processing on result values. 
Selection of ARM standard floating-point arithmetic


The StandardFPSCRValue() function returns the FPSCR value that selects ARM standard floating-point arithmetic. 
Most of the arithmetic functions have a Boolean fpscr_controlled argument that is TRUE for Floating-point 
operations and FALSE for Advanced SIMD operations, and that selects between using the real FPSCR value and this 
value. 


Floating-point comparisons 


The FPCompare() function compares two floating-point numbers, producing a {N, Z, C, V} Condition flags result as
shown in Table E1-5:


Table E1-5 Effect of a Floating-point comparison on the Condition flags

Comparison result   N   Z   C   V
Equal               0   1   1   0
Less than           1   0   0   0
Greater than        0   0   1   0
Unordered           0   0   1   1





This result defines the operation of the VCMP floating-point instruction. The VCMP instruction writes these flag values 
in the FPSCR. After using a VMRS instruction to transfer them to the APSR, they can control conditional execution 
as shown in Table F2-1 on page F2-2407. 
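The mapping in Table E1-5 can be sketched in C as follows. This is an illustration rather than the FPCompare() pseudocode: the function name compare_flags() and the packing of the flags into a 4-bit value are assumptions of this example, and the sketch does not model the Invalid Operation exception that a signaling NaN operand can generate.

```c
#include <stdint.h>

/* Condition flags packed as a 4-bit value in the order N, Z, C, V. */
#define FLAG_N 0x8u
#define FLAG_Z 0x4u
#define FLAG_C 0x2u
#define FLAG_V 0x1u

/* Returns the {N, Z, C, V} result for comparing a with b. */
static uint8_t compare_flags(double a, double b)
{
    if (a != a || b != b)        /* a NaN operand compares as unordered */
        return FLAG_C | FLAG_V;
    if (a == b)
        return FLAG_Z | FLAG_C;
    if (a < b)
        return FLAG_N;
    return FLAG_C;               /* greater than */
}
```

Only the unordered result sets V, so a condition that tests V after the flags are transferred to the APSR distinguishes a NaN operand from an ordinary ordering result.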


The FPCompareEQ(), FPCompareGE(), and FPCompareGT() functions describe the operation of Advanced SIMD 
instructions that perform floating-point comparisons. 


Floating-point maximum and minimum 


The FPMax() function returns the maximum of two floating-point numbers. The FPMin() function returns the 
minimum of two floating-point numbers. 


Floating-point addition and subtraction 


The FPAdd() function adds two floating-point numbers. The FPSub() function subtracts one floating-point number 
from another floating-point number. 


Floating-point multiplication and division 


The FPMul() function multiplies two floating-point numbers. The FPDiv() function divides one floating-point
number by another floating-point number. 


Floating-point fused multiply-add 


The FPMulAdd() function performs a floating-point fused multiply-add. 


Floating-point reciprocal estimate and step 


The Advanced SIMD implementation includes instructions that support Newton-Raphson calculation of the 
reciprocal of a number. 


The VRECPE instruction produces the initial estimate of the reciprocal. It uses the pseudocode functions: 

° FPRecipEstimate(). 

° UnsignedRecipEstimate(). This pseudocode function calls the C function recip_estimate(): 
    double recip_estimate(double a)
    {
        int q, s;
        double r;
        q = (int)(a * 512.0);                   /* a in units of 1/512 rounded down */
        r = 1.0 / (((double)q + 0.5) / 512.0);  /* reciprocal r */
        s = (int)(256.0 * r + 0.5);             /* r in units of 1/256 rounded to nearest */
        return (double)s / 256.0;
    }


Table E1-6 shows the results where input values are out of range. 


Table E1-6 VRECPE results for out of range inputs

Number type      Input Vm[i]                   Result Vd[i]
Integer          <= 0x7FFFFFFF                 0xFFFFFFFF
Floating-point   NaN                           Default NaN
Floating-point   ±0 or denormalized number     ±infinity (a)
Floating-point   ±infinity                     ±0
Floating-point   Absolute value >= 2^126       ±0

a. FPSCR.DZC is set to 1.


The Newton-Raphson iteration:
    x_{n+1} = x_n(2 - d x_n)
converges to (1/d) if x_0 is the result of VRECPE applied to d.


The VRECPS instruction performs a (2 - op1 × op2) calculation and can be used with a multiplication to perform a step
of this iteration. The functionality of this instruction is defined by the FPRecipStep() pseudocode function.
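As an informal illustration, the estimate and step can be combined in scalar C to refine a reciprocal. This sketch works in double precision and handles in-range inputs only, so it does not model the out-of-range behavior, rounding, or flush-to-zero processing of the instructions. The seed function follows the recip_estimate() listing, and the names recip_step() and reciprocal() are assumptions made for this example.

```c
/* Seed function, following the recip_estimate() listing.
   The input is expected to be in the range 0.5 <= a < 1.0. */
static double recip_estimate(double a)
{
    int q, s;
    double r;
    q = (int)(a * 512.0);                   /* a in units of 1/512 rounded down */
    r = 1.0 / (((double)q + 0.5) / 512.0);  /* reciprocal r */
    s = (int)(256.0 * r + 0.5);             /* r in units of 1/256 rounded to nearest */
    return (double)s / 256.0;
}

/* Models the (2 - op1 * op2) calculation for in-range inputs only. */
static double recip_step(double op1, double op2)
{
    return 2.0 - op1 * op2;
}

/* Approximates 1/d for 0.5 <= d < 1.0 using 'steps' Newton-Raphson steps.
   Each step is one step calculation followed by one multiplication. */
static double reciprocal(double d, int steps)
{
    double x = recip_estimate(d);
    for (int i = 0; i < steps; i++)
        x = x * recip_step(d, x);
    return x;
}
```

Because the relative error is approximately squared by each step, a seed that is accurate to about 8 bits needs only a few steps before the error falls below the precision of the format.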


Table E1-7 shows the results where input values are out of range. 


Table E1-7 VRECPS results for out of range inputs

Input Vn[i]                     Input Vm[i]                     Result Vd[i]
Any NaN                         -                               Default NaN
-                               Any NaN                         Default NaN
±0.0 or denormalized number     ±infinity                       2.0
±infinity                       ±0.0 or denormalized number     2.0





Floating-point square root 


The FPSqrt() function returns the square root of a floating-point number. 


Floating-point reciprocal square root estimate and step 


The Advanced SIMD implementation includes instructions that support Newton-Raphson calculation of the 
reciprocal of the square root of a number. 


The VRSQRTE instruction produces the initial estimate of the reciprocal of the square root. It uses the pseudocode
functions:
° FPRSqrtEstimate().
° UnsignedRSqrtEstimate(). This pseudocode function calls the C function recip_sqrt_estimate():
    #include <math.h>   /* declares sqrt() */

    double recip_sqrt_estimate(double a)
    {
        int q0, q1, s;
        double r;
        if (a < 0.5)    /* range 0.25 <= a < 0.5 */
        {
            q0 = (int)(a * 512.0);                           /* a in units of 1/512 rounded down */
            r = 1.0 / sqrt(((double)q0 + 0.5) / 512.0);      /* reciprocal root r */
        }
        else            /* range 0.5 <= a < 1.0 */
        {
            q1 = (int)(a * 256.0);                           /* a in units of 1/256 rounded down */
            r = 1.0 / sqrt(((double)q1 + 0.5) / 256.0);      /* reciprocal root r */
        }
        s = (int)(256.0 * r + 0.5);                          /* r in units of 1/256 rounded to nearest */
        return (double)s / 256.0;
    }


Table E1-8 shows the results where input values are out of range. 


Table E1-8 VRSQRTE results for out of range inputs

Number type      Input Vm[i]                             Result Vd[i]
Integer          <= 0x3FFFFFFF                           0xFFFFFFFF
Floating-point   NaN, -(normalized number), -infinity    Default NaN
Floating-point   -0 or -(denormalized number)            -infinity (a)
Floating-point   +0 or +(denormalized number)            +infinity (a)
Floating-point   +infinity                               +0

a. FPSCR.DZC is set to 1.


The Newton-Raphson iteration:
    x_{n+1} = x_n(3 - d x_n^2)/2
converges to (1/√d) if x_0 is the result of VRSQRTE applied to d.


The VRSQRTS instruction performs a (3 - op1 × op2)/2 calculation and can be used with two multiplications to perform
a step of this iteration. The functionality of this instruction is defined by the FPRSqrtStep() pseudocode function.


Table E1-9 shows the results where input values are out of range. 


Table E1-9 VRSQRTS results for out of range inputs

Input Vn[i]                     Input Vm[i]                     Result Vd[i]
Any NaN                         -                               Default NaN
-                               Any NaN                         Default NaN
±0.0 or denormalized number     ±infinity                       1.5
±infinity                       ±0.0 or denormalized number     1.5





FPRSqrtStep() calls the FPHalvedSub() pseudocode function. 
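The step calculation and its out of range rows can be sketched in C as shown below. This is an informal model, not the FPRSqrtStep() pseudocode: the helper names are assumptions of this example, the C NAN macro stands in for the Default NaN, and rounding behavior is not modelled.

```c
#include <float.h>
#include <math.h>

/* True for a zero or a denormalized number, which the out of range
   table treats alike. */
static int is_zero_or_denormal(float x)
{
    return x > -FLT_MIN && x < FLT_MIN;
}

/* True for +infinity or -infinity. */
static int is_infinity(float x)
{
    return x > FLT_MAX || x < -FLT_MAX;
}

/* Sketch of the (3 - op1 * op2)/2 step, including the out of range rows. */
static float rsqrt_step(float op1, float op2)
{
    if (op1 != op1 || op2 != op2)
        return NAN;                 /* stands in for the Default NaN */
    if ((is_zero_or_denormal(op1) && is_infinity(op2)) ||
        (is_infinity(op1) && is_zero_or_denormal(op2)))
        return 1.5f;
    return (3.0f - op1 * op2) / 2.0f;
}

/* One Newton-Raphson step for 1/sqrt(d): two multiplications and
   one step calculation. */
static float rsqrt_refine(float d, float x)
{
    return x * rsqrt_step(d * x, x);
}
```

A value x that already satisfies d × x × x = 1 is left unchanged by rsqrt_refine(), because the step calculation then returns exactly 1.0.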
Floating-point conversions


The FPConvert() pseudocode function performs conversions between half-precision, single-precision, and 
double-precision floating-point numbers. 


The FPToFixed() and FixedToFP() functions perform conversions between floating-point numbers and integers or 
fixed-point numbers. 
E1.4 About the AArch32 System register interface

AArch32 state provides a System register encoding space, that is indexed by the parameter set {coproc, opc1, CRn,
CRm, opc2}, and a set of System register access instructions. This encoding space is used for:

° System registers. 

° System instructions, for: 

— Cache and branch predictor maintenance. 
— Address translation. 
—  TLB maintenance. 
In ARMv8, this encoding space uses only the coproc values 0b111x.

Note

The encoding space with coproc values 0b101x is redefined to provide Advanced SIMD and floating-point
functionality.
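The two coproc ranges mentioned above differ only in the upper three bits of the 4-bit coproc value, so they can be told apart with a mask. The following C sketch is illustrative: the type sysreg_encoding and both function names are assumptions of this example, and the field widths in the comments are those of the MRC and MCR instruction fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* The parameter set that indexes the encoding space. Field widths are
   those of the MRC and MCR instruction fields. */
typedef struct {
    uint8_t coproc;  /* 4 bits */
    uint8_t opc1;    /* 3 bits */
    uint8_t crn;     /* 4 bits */
    uint8_t crm;     /* 4 bits */
    uint8_t opc2;    /* 3 bits */
} sysreg_encoding;

/* coproc matches 0b111x: the System register encoding space. */
static bool is_system_register_space(uint8_t coproc)
{
    return (coproc & 0xEu) == 0xEu;
}

/* coproc matches 0b101x: Advanced SIMD and floating-point functionality. */
static bool is_simd_fp_space(uint8_t coproc)
{
    return (coproc & 0xEu) == 0xAu;
}
```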

In ARMv8: 

° The (coproc==0b1111) encodings provide system control functionality, by providing access to System 
registers and System instructions. This includes architecture and feature identification, as well as control, 
status information and configuration support. 

The following sections give a general description of these encodings: 
— About the System registers for VMSAv8-32 on page G4-4148. 
— VMSAv8-32 organization of registers in the (coproc==0b1111) encoding space on page G4-4175.
— Functional grouping of VMSAv8-32 System registers on page G4-4193. 
These encodings also provide the Performance monitor registers, see Chapter D5 The Performance Monitors 
Extension. 
° The (coproc==0b1110) encodings provide access to additional registers, that support: 
— Debug, see Chapter G2 AArch32 Self-hosted Debug. 
— The Jazelle identification registers, see Jazelle support on page E1-2299. 

UNPREDICTABLE, CONSTRAINED UNPREDICTABLE, and UNDEFINED behavior for AArch32 System
register accesses on page G4-4151 gives more information about permitted accesses to the System
registers in AArch32 state.

Most functionality in the (coproc==0b111x) encoding space cannot be accessed by software executing at EL0. This
manual clearly identifies those functions that can be accessed at EL0.

For more information: 

° About this encoding space, including the naming of the parameters that index the space, see The AArch32 
System register interface on page G1-3877. 

° About the System interface access instructions, see System register access instructions on page F1-2387.
The ARM architecture uses the following terms to describe various types of exceptional condition:


Exceptions 


In the ARM architecture, an exception causes entry to EL1, EL2, or EL3. If the Exception level that 
is entered is using AArch32, it also causes entry to the PE mode in which the exception must be 
taken. A software handler for the exception is then executed. 


Note


The term floating-point exception does not use this meaning of exception. This term is described 
later in this list. 





Exceptions include: 


° Reset. 

° Interrupts. 

° Memory system aborts. 
° Undefined instructions. 


° Supervisor calls (SVCs), Secure Monitor calls (SMCs), and hypervisor calls (HVCs). 


° Debug exceptions. 


Most details of exception handling are not visible to application level software, and are described in 
Handling exceptions that are taken to an Exception level using AArch32 on page G1-3812. In an 
ARMv8 implementation that includes all the Exception levels, aspects that are visible to application 
level software are: 


° The SVC instruction causes a Supervisor Call exception. This provides a mechanism for 
unprivileged software to make a call to the operating system, or other system component that 
is accessible only at EL1. 


° The SMC instruction causes a Secure Monitor Call exception, but only if software execution is
at EL1 or higher. Unprivileged software can only cause a Secure Monitor Call exception by
methods defined by the operating system, or by another component of the software system
that executes at EL1 or higher.


° The HVC instruction causes a Hypervisor Call exception, but only if software execution is at
EL1 or higher. Unprivileged software can only cause a Hypervisor Call exception by methods
defined by the hypervisor, or by another component of the software system that executes at
EL1 or higher.


° The BKPT instruction causes a Breakpoint Instruction exception, that is taken as a Prefetch 
Abort exception. This provides a mechanism for a debugger to insert breakpoints into 
unprivileged software, or for unprivileged software to make a call into a debugger that is 
accessible at EL1. 


° The WFI (Wait for Interrupt) instruction provides a hint that nothing needs to be done until an 
interrupt or another WFI wake-up event occurs, see Wait For Interrupt on page G1-3875. 
This means the hardware might enter a low-power state until the wake-up event occurs. 


° The WFE (Wait for Event) instruction provides a hint that nothing needs to be done until either
an SEV instruction generates an event, or another WFE wake-up event occurs, see Wait For 
Event and Send Event on page G1-3872. This means the hardware might enter a low-power 
state until the wake-up event occurs. 


Floating-point exceptions 


These relate to exceptional conditions encountered during floating-point arithmetic, such as division 
by zero or overflow. For more information see: 


° Floating-point exceptions on page E1-2303. 

° FPEXC, Floating-Point Exception Control register on page G6-4330. 

° FPSCR, Floating-Point Status and Control Register on page G6-4335. 

° ANSI/IEEE Std. 754, IEEE Standard for Binary Floating-Point Arithmetic.
Chapter E2
The AArch32 Application Level Memory Model 


This chapter gives an application level description of the memory model for software executing in AArch32 state.
This means it describes the memory model for execution in EL0 when EL0 is using AArch32 in the following
sections:


° Address space on page E2-2316. 

° Memory type overview on page E2-2317. 

° Caches and memory hierarchy on page E2-2318. 

° Alignment support on page E2-2323. 

° Endian support on page E2-2325. 

° Atomicity in the ARM architecture on page E2-2328. 
° Memory ordering on page E2-2332.

° Memory types and attributes on page E2-2342.

° Mismatched memory attributes on page E2-2352.

° Synchronization and semaphores on page E2-2355.
Note 





In this chapter, System register names usually link to the description of the register in Chapter G6 AArch32 System 
Register Descriptions, for example SCTLR. 








ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. E2-2315 
1ID092916 Non-Confidential 




E2.1 Address space 
Address calculations are performed using 32-bit registers. Supervisory software determines the valid address range. 
Attempting to access an address that is not valid generates an MMU fault. 
Address calculations are performed modulo 2^32.
The result of an address calculation is UNKNOWN if it overflows or underflows the 32-bit address range A[31:0]. 
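As an informal illustration, C arithmetic on uint32_t wraps modulo 2^32 in the same way as a calculation in a 32-bit register, and an overflow of the address range can be detected before the addition is performed. The function names below are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Base-plus-offset address calculation in a 32-bit register. The cast
   makes the modulo 2^32 wrap explicit. */
static uint32_t add_address(uint32_t base, uint32_t offset)
{
    return (uint32_t)(base + offset);
}

/* True if base + offset does not fit in the 32-bit address range A[31:0]. */
static bool address_add_overflows(uint32_t base, uint32_t offset)
{
    return offset > UINT32_MAX - base;
}
```

Because the result of an overflowing address calculation is UNKNOWN, software that needs a defined result might perform a check of this kind rather than rely on the wrapped value.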


Memory accesses use the MemA[], MemO[], MemU[], and MemU_unpriv[] pseudocode functions: 


° The MemA[] function makes an aligned access of the required type. 

° The MemO[] function makes an ordered access of the required type. 

° The MemU[] function makes an unaligned access of the required type.

° The MemU_unpriv[] function makes an unaligned, unprivileged access of the required type. 


Each of these functions calls the Mem_with_type[] function, that specifies the required access. This calls
AArch32.MemSingle[], which performs an atomic, little-endian read of size bytes.


The AccType enumeration defines the different access types. 


Note 
° Chapter G3 The AArch32 System Level Memory Model and Chapter G4 The AArch32 Virtual Memory System 
Architecture include descriptions of memory system features that are transparent to the application, including 
memory access, address translation, memory maintenance instructions, and alignment checking and the 
associated fault handling. These chapters also reference pseudocode descriptions of these operations. 





° For references to the pseudocode that relates to memory accesses, see Basic memory access on 
page G3-4017, Unaligned memory access on page G3-4018, and Aligned memory access on page G3-4018. 
E2.2 Memory type overview

ARMv8 provides the following mutually-exclusive memory types:

Normal This is generally used for bulk memory operations, both read-write and read-only operations.

Device The ARM architecture forbids speculative reads of any type of Device memory. This means Device
memory types are suitable attributes for read-sensitive locations.


Locations of the memory map that are assigned to peripherals are usually assigned the Device 
memory attribute. 


Device memory has additional attributes that have the following effects: 


° They prevent aggregation of reads and writes, maintaining the number and size of the 
specified memory accesses. See Gathering on page E2-2348. 


° They preserve the access order and synchronization requirements, both for accesses to a 
single peripheral and where there is a synchronization requirement on the observability of 
one or more memory write and read accesses. See Reordering on page E2-2349.


° They indicate whether a write can be acknowledged other than at the end point. See Early 
Write Acknowledgement on page E2-2350. 


For more information on Normal memory and Device memory, see Memory types and attributes on
page E2-2342.


Note 
Earlier versions of the ARM architecture defined a single Device memory type and a Strongly-Ordered memory 
type. A Note in Device memory on page E2-2346 describes how these memory types map onto the ARMv8 memory 
types. 
E2.3 Caches and memory hierarchy


The implementation of a memory system depends heavily on the microarchitecture and therefore many details of 
the memory system are IMPLEMENTATION DEFINED. ARMv8 defines the application level interface to the memory 
system, including a hierarchical memory system with multiple levels of cache. This section describes an application 
level view of this system. It contains the subsections: 


° Introduction to caches. 
° Memory hierarchy. 
° Implication of caches for the application programmer on page E2-2320. 


° Preloading caches on page E2-2321. 


E2.3.1 Introduction to caches 


A cache is a block of high-speed memory that contains a number of entries, each consisting of: 
° Main memory address information, commonly known as a tag. 
° The associated data. 


Caches increase the average speed of a memory access and take account of two principles of locality: 


Spatial locality 


An access to one location is likely to be followed by accesses to adjacent locations. Examples of this 
principle are: 
° Sequential instruction execution. 


° Accessing a data structure. 


Temporal locality 


An access to an area of memory is likely to be repeated in a short time period. An example of this 
principle is the execution of a software loop. 


To minimize the quantity of control information stored, the spatial locality property groups several locations 
together under the same tag. This logical block is commonly known as a cache line. When data is loaded into a 
cache, access times for subsequent loads and stores are reduced, resulting in overall performance benefits. An access 
to information already in a cache is known as a cache hit, and other accesses are called cache misses. 


Normally, caches are self-managing, with the updates occurring automatically. Whenever the PE accesses a 
cacheable memory location, the cache is checked. If the access is a cache hit, the access occurs in the cache. 
Otherwise, the access is made to memory. Typically, when making this access, a cache location is allocated and the 
cache line loaded from memory. ARMv8 permits different cache topologies and access policies, provided they
comply with the memory coherency model described in this manual. 
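
The terms tag, cache line, cache hit, and cache miss can be illustrated with a small software model. The following is a minimal sketch only: it assumes a direct-mapped cache with a 64-byte line and four entries, and the names DirectMappedCache, LINE_BYTES, and NUM_LINES are hypothetical. None of these values or names is defined by the architecture.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64   # assumed cache line length in bytes
NUM_LINES = 4     # assumed number of cache entries


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only, not data."""
    tags: dict = field(default_factory=dict)  # entry index -> stored tag
    hits: int = 0
    misses: int = 0

    def access(self, address: int) -> bool:
        """Return True on a cache hit, allocate the line on a miss."""
        line_number = address // LINE_BYTES   # all bytes of a line share this
        index = line_number % NUM_LINES       # which entry the line maps to
        tag = line_number // NUM_LINES        # address information kept as tag
        if self.tags.get(index) == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                # allocate the line on a miss
        self.misses += 1
        return False


cache = DirectMappedCache()
for address in range(0, 256, 4):   # sequential word accesses
    cache.access(address)
first_pass = (cache.hits, cache.misses)

for address in range(0, 256, 4):   # the same loop again
    cache.access(address)
second_pass = (cache.hits, cache.misses)
```

In this model the first pass misses once per line and hits on the remaining words of each line, which corresponds to spatial locality. The second pass hits on every access, which corresponds to temporal locality.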


Caches introduce a number of potential problems, mainly because: 
° Memory accesses can occur at times other than when the programmer would expect them. 


° A data item can be held in multiple physical locations. 


E2.3.2 Memory hierarchy 


Typically memory close to a PE has very low latency, but is limited in size and expensive to implement. Further 
from the PE it is common to implement larger blocks of memory but these have increased latency. To optimize 
overall performance, an ARMv8 memory system can include multiple levels of cache in a hierarchical memory 
system that exploits this trade-off between size and latency. Figure E2-1 on page E2-2319 shows an example of such 
a system in an ARMv8-A system that supports virtual addressing. 
Figure E2-1 Multiple levels of cache in a memory hierarchy
Note 
In this manual, in a hierarchical memory system, Level 1 refers to the level closest to the PE, as shown in 
Figure E2-1. 





Instructions and data can be held in separate caches or in a unified cache. A cache hierarchy can have one or more 
levels of separate instruction and data caches, with one or more unified caches located at the levels closest to the 
main memory. Memory coherency for cache topologies can be defined by two conceptual points: 


Point of Unification (PoU) 


The point at which the instruction cache, data cache, and translation table walks of a particular PE 
are guaranteed to see the same copy of a memory location. In many cases, the point of unification 
is the point in a uniprocessor memory system by which the instruction and data caches and the 
translation table walks have merged. The point of unification might coincide with the point of 
coherency. 


Point of Coherency (PoC) 


The point at which all agents that can access memory are guaranteed to see the same copy of a 
memory location. In many cases this is effectively the main system memory, although the 
architecture does not prohibit the implementation of caches beyond the PoC that have no effect on 
the coherency between memory system agents. 


— Note 


The presence of system caches can affect the definition of the point of coherency as described in 
System level caches on page D3-1713. 





See also About cache maintenance in ARMv8 on page G3-3995.


The Cacheability and Shareability memory attributes 
Cacheability and Shareability are two attributes that describe the memory hierarchy in a multiprocessing system: 


Cacheability This term defines whether memory locations are allowed to be allocated into a cache or not. 
Cacheability is defined independently for Inner and Outer Cacheability locations. 


Shareability This term defines whether memory locations are shareable between different agents in a system. 
Marking a memory location as shareable for a particular domain requires hardware to ensure that 
the location is coherent for all agents in that domain. Shareability is defined independently for Inner 
and Outer Shareability domains. 


For more information about the Cacheability and Shareability attributes see Memory types and attributes on 
page E2-2342. 
E2.3.3 Implication of caches for the application programmer

In normal operation, the caches are largely invisible to the application programmer. However, they can become
visible when there is a breakdown in the coherency of the caches. Such a breakdown can occur:

° When memory locations are updated by other agents in the system that do not use hardware management of 
coherency. 

° When memory updates made from the application software must be made visible to other agents in the 
system, without the use of hardware management of coherency. 

For example: 

° In the absence of hardware management of coherency of DMA accesses, in a system with a DMA controller 
that reads memory locations that are held in the data cache of a PE, a breakdown of coherency occurs when 
the PE has written new data in the data cache, but the DMA controller reads the old data held in memory. 

° In a Harvard cache implementation, where there are separate instruction and data caches, a breakdown of
coherency occurs when new instruction data has been written into the data cache, but the instruction cache 
still contains the old instruction data. 
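
The DMA case can be sketched in software. This is an illustrative model only, assuming a write-back data cache and a DMA controller that reads memory directly. The names WriteBackCache and dma_read are hypothetical, and the clean method stands in for a cache maintenance operation.

```python
class WriteBackCache:
    """Toy write-back cache in front of a backing memory dictionary."""

    def __init__(self, memory):
        self.memory = memory   # backing store: address -> value
        self.lines = {}        # address -> (value, dirty flag)

    def write(self, address, value):
        # The write is held in the cache; memory is not updated yet.
        self.lines[address] = (value, True)

    def read(self, address):
        if address in self.lines:
            return self.lines[address][0]
        value = self.memory[address]
        self.lines[address] = (value, False)
        return value

    def clean(self, address):
        # Write a dirty line back to memory, as a clean operation would.
        line = self.lines.get(address)
        if line and line[1]:
            self.memory[address] = line[0]
            self.lines[address] = (line[0], False)


def dma_read(memory, address):
    """A non-coherent agent: it reads memory and never sees the cache."""
    return memory[address]


memory = {0x1000: "old"}
pe_cache = WriteBackCache(memory)
pe_cache.write(0x1000, "new")
stale = dma_read(memory, 0x1000)   # DMA controller sees the old data
pe_cache.clean(0x1000)
fresh = dma_read(memory, 0x1000)   # after the clean, it sees the new data
```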

Data coherency issues 

Software can ensure the data coherency of caches in the following ways: 

° By not using the caches in situations where coherency issues can arise. This can be achieved by:

— Using Non-cacheable or, in some cases, Write-Through Cacheable memory. 
— Not enabling caches in the system. 

° By using system calls to functions using cache maintenance instructions that execute at a higher Exception
level. 

° By using hardware coherency mechanisms to ensure the coherency of data accesses to memory for cacheable 
locations by observers within the different shareability domains, see Non-shareable Normal memory on 
page E2-2344 and Shareable, Inner Shareable, and Outer Shareable Normal memory on page E2-2343. 

Note 
The performance of these hardware coherency mechanisms is highly implementation-specific. In some 
implementations the mechanism suppresses the ability to cache shareable locations. In other 
implementations, cache coherency hardware can hold data in caches while managing coherency between 
observers within the shareability domains. 

Synchronization and coherency issues between data and instruction accesses 

How far ahead of the current point of execution instructions are fetched from is IMPLEMENTATION DEFINED. Such
prefetching can be either a fixed or a dynamically varying number of instructions, and can follow any or all possible
future execution paths. For all types of memory:

° The PE might have fetched the instructions from memory at any time since the last Context synchronization
event on that PE. 

° Any instructions fetched in this way might be executed multiple times, if this is required by the execution of 
the program, without being re-fetched from memory. 

The ARM architecture does not require the hardware to ensure coherency between instruction caches and memory,
even for locations of shared memory.

If software requires coherency between instruction execution and memory, it must manage this coherency using
Context synchronization events, DSB memory barriers, and cache maintenance instructions. See Context
synchronization event. The cache maintenance instructions can only be accessed from an Exception level that is
higher than EL0, and therefore require a system call, see Exception-generating and exception-handling
instructions on page F1-2386. The following code sequence can be used for this purpose:
; Coherency example for data and instruction accesses within the same Inner Shareable domain.
; Enter this code with <Rt> containing a new 32-bit instruction,
; to be held in Cacheable space at a location pointed to by Rn. Use STRH in the first line
; instead of STR for a 16-bit instruction.
STR Rt, [Rn]
DCCMVAU Rn   ; Clean data cache by MVA to point of unification (PoU)
DSB          ; Ensure visibility of the data cleaned from cache
ICIMVAU Rn   ; Invalidate instruction cache by MVA to PoU
BPIMVA Rn    ; Invalidate branch predictor by MVA to PoU
DSB          ; Ensure completion of the invalidations
ISB          ; Synchronize the fetched instruction stream
Note 
° For accesses that are Non-cacheable or Write-Through, the clean data cache instruction is not required. For
accesses that are Non-cacheable, the invalidate instruction cache is not required, because in AArch32 state
these accesses are not permitted to be held in an instruction cache.

° This code can be used when the thread of execution modifying the code is the same thread of execution that
is executing the code. The ARMv8 architecture limits the set of instructions that can be executed by one
thread of execution as they are being modified by another thread of execution without requiring explicit
synchronization. See Concurrent modification and execution of instructions on page E2-2330.
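
The reason the sequence needs both a data cache clean and an instruction cache invalidate can be seen in a simple model. The following sketch is not an architectural definition. It assumes separate instruction and data caches that unify at memory, and the class HarvardModel and its method names are hypothetical. The barriers and the branch predictor invalidation are not modelled.

```python
class HarvardModel:
    """Toy PE with separate instruction and data caches over one memory."""

    def __init__(self, memory):
        self.memory = dict(memory)   # stands in for the point of unification
        self.dcache = {}             # address -> data written by the PE
        self.icache = {}             # address -> instruction already fetched

    def store(self, address, value):
        # Models STR to Write-Back Cacheable memory: data cache only.
        self.dcache[address] = value

    def clean_dcache(self, address):
        # Models the data cache clean to PoU.
        if address in self.dcache:
            self.memory[address] = self.dcache[address]

    def invalidate_icache(self, address):
        # Models the instruction cache invalidate to PoU.
        self.icache.pop(address, None)

    def fetch(self, address):
        # Instruction fetches allocate into, and then hit in, the icache.
        if address not in self.icache:
            self.icache[address] = self.memory[address]
        return self.icache[address]


pe = HarvardModel({0x8000: "OLD_INSN"})
before = pe.fetch(0x8000)            # old instruction now in the icache
pe.store(0x8000, "NEW_INSN")
after_store = pe.fetch(0x8000)       # still the old instruction
pe.clean_dcache(0x8000)
after_clean = pe.fetch(0x8000)       # memory updated, icache still stale
pe.invalidate_icache(0x8000)
after_invalidate = pe.fetch(0x8000)  # the new instruction is fetched
```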





E2.3.4 Preloading caches 


The ARM architecture provides the memory system hints PLD (Preload Data), PLDW (Preload Data With Intent To 
Write) and PLI (Preload Instruction) that software can use to communicate the expected use of memory locations to 
the hardware. The memory system can respond by taking actions that are expected to speed up the memory accesses 
if they occur. The effect of these memory system hints is IMPLEMENTATION DEFINED. Typically, implementations 
use this information to bring data or instruction locations into caches. 


The Preload instructions are hints, and so implementations can treat them as NOPs without affecting the functional 
behavior of the device. The instructions cannot generate synchronous Data Abort exceptions, but the resulting 
memory system operations might, under exceptional circumstances, generate an asynchronous external abort, which 
is reported using an SError interrupt and taken using an asynchronous Data Abort exception. For more information, 
see Data Abort exception on page G1-3859. 


A PLD, PLDW, or PLI instruction can only cause allocation to software-visible caching structures such as caches or TLBs
for memory locations that can be accessed, according to the permissions defined by the current translation regime 
or a translation regime for a higher Exception level in the current Security state, by any of: 


° Reads. 
° Writes. 
° Instruction fetches. 


A PLD, PLDW, or PLI instruction can access any memory location in Normal memory that can be accessed, according 
to the permissions defined by the current translation regime or a translation regime for a higher Exception level in 
the current Security state, by any of: 


° Reads. 

° Writes. 

° Instruction fetches. 
Note 





In each case, the entire list applies to each of PLD, PLDW, and PLI. 





A PLD, PLDW, or PLI instruction is guaranteed not to access any type of Device memory. 


A PLI instruction must not perform any access that cannot be performed by a speculative instruction fetch by the 
processor. Therefore in a VMSA implementation, if all associated MMUs are disabled, a PLI instruction cannot 
access any memory location that cannot be accessed by instruction fetches. 
The pseudocode enumeration PrefetchHint defines the prefetch hint types.


The Hint_Prefetch() pseudocode function signals to the memory system that memory accesses of the type hint to 
or from the specified address are likely to occur in the near future. The memory system might take some action to 
speed up the memory accesses when they do occur, such as preloading the specified address into one or more caches 
as indicated by the innermost cache level target and non-temporal hint stream. 


For more information on PLD, PLI, and PLDW, see: 

° PLD, PLDW (immediate) on page F5-2869. 
° PLD (literal) on page F5-2871.

° PLD, PLDW (register) on page F5-2873. 

° PLI (immediate, literal) on page F5-2875. 

° PLI (register) on page F5-2878. 
E2.4 Alignment support

This section describes alignment support. It contains the following subsections: 

° Instruction alignment. 

° Unaligned data access. 

° Cases where unaligned accesses are CONSTRAINED UNPREDICTABLE on page E2-2324. 

° Unaligned data access restrictions on page E2-2324. 
E2.4.1 Instruction alignment 


A32 instructions are word-aligned. 


T32 instructions are halfword-aligned. 


E2.4.2 Unaligned data access 


An ARMv8 implementation must support unaligned data accesses to Normal memory by some load and store 
instructions. As Table E2-1 shows, software can control whether a misaligned access to Normal memory by one of 
these instructions causes an Alignment fault Data Abort exception: 


° By setting SCTLR.A, for unaligned accesses from any mode other than Hyp mode. 
° By setting HSCTLR.A, for unaligned accesses from Hyp mode. 


Table E2-1 Alignment requirements of load/store instructions

                                                                                 Result if check fails when:
Instructions                                                   Alignment check   SCTLR.A or HSCTLR.A is 0   SCTLR.A or HSCTLR.A is 1
LDRB, LDREXB, LDRBT, LDRSB, LDRSBT, STRB, STREXB, STRBT, TBB   None              -                          -
LDRH, LDRHT, LDRSH, LDRSHT, STRH, STRHT, TBH                   Halfword          Unaligned access           Alignment fault
LDREXH, STREXH, LDAH, STLH, LDAEXH, STLEXH                     Halfword          Alignment fault            Alignment fault
LDR, LDRT, STR, STRT                                           Word              Unaligned access           Alignment fault
PUSH, encodings T3 and A2 only
POP, encodings T3 and A2 only
LDREX, STREX, LDA, STL, LDAEX, STLEX                           Word              Alignment fault            Alignment fault
LDREXD, STREXD, LDAEXD, STLEXD                                 Doubleword        Alignment fault            Alignment fault
All forms of LDM and STM, LDRD, RFE, SRS, STRD                 Word              Alignment fault            Alignment fault
LDC, STC                                                       Word              Alignment fault            Alignment fault
VLDM, VLDR, VPOP, VPUSH, VSTM, VSTR                            Word              Alignment fault            Alignment fault
VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4,                Element size      Unaligned access           Alignment fault
all with standard alignment
VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4,                As specified      Alignment fault            Alignment fault
all with :<align> specified (a)                                by :<align>

a. Previous versions of this manual used @<align> to specify alignment. Both forms are supported, see Chapter F6 T32 and A32 Advanced
SIMD and floating-point Instruction Descriptions for more information.
Note


Any unaligned access to any type of Device memory generates an Alignment fault, see Alignment faults on 
page G4-4117. 
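
The decision summarized in Table E2-1, together with the Device memory rule in the preceding Note, can be expressed as a small lookup. The sketch below is illustrative only. It covers a subset of the instructions in the table, and the names RULES, ALIGNMENT_BYTES, and access_result are hypothetical. The a_bit argument stands for SCTLR.A or HSCTLR.A, whichever applies to the current mode.

```python
ALIGNMENT_BYTES = {"none": 1, "halfword": 2, "word": 4, "doubleword": 8}

# instruction -> (alignment check, True if a failed check faults when A is 0)
RULES = {
    "LDRB":   ("none", False),
    "LDRH":   ("halfword", False),
    "LDREXH": ("halfword", True),
    "LDR":    ("word", False),
    "LDREX":  ("word", True),
    "LDREXD": ("doubleword", True),
    "LDM":    ("word", True),
    "LDRD":   ("word", True),
    "VLDR":   ("word", True),
}


def access_result(instruction, address, a_bit, device_memory=False):
    """Return the outcome of the alignment check for one access."""
    check, always_faults = RULES[instruction]
    if address % ALIGNMENT_BYTES[check] == 0:
        return "aligned access"
    # Unaligned accesses to Device memory always fault.
    if device_memory or always_faults or a_bit:
        return "Alignment fault"
    return "unaligned access"
```

For example, access_result("LDR", 0x1001, a_bit=0) gives an unaligned access, whereas the same call with a_bit=1, or with "LDREX" in place of "LDR", gives an Alignment fault.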








E2.4.3 Cases where unaligned accesses are CONSTRAINED UNPREDICTABLE 


Any load instruction that is not faulted by the alignment restrictions shown in Table E2-1 on page E2-2323 and that 
loads the PC has CONSTRAINED UNPREDICTABLE behavior if the address it loads from is not word-aligned, see Loads 
and Stores to unaligned locations on page K1-5458. This overrules any permitted Load/Store behavior shown in 
Table E2-1 on page E2-2323. 


E2.4.4 Unaligned data access restrictions 


The following points apply to unaligned data accesses in ARMv8: 


° Accesses are not guaranteed to be single-copy atomic except at the byte access level, see Atomicity in the 
ARM architecture on page E2-2328. 


° Unaligned accesses typically take a number of additional cycles to complete compared to a naturally-aligned 
access. 
° An operation that performs an unaligned access can abort on any memory access that it makes, and can abort
on more than one access. This means that an unaligned access that occurs across a page boundary can
generate an abort on either side of the boundary.
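
The page boundary case can be made concrete with a short calculation. This sketch assumes a 4KB page size, which is an assumption for illustration only, and the helper names pages_touched and byte_accesses are hypothetical.

```python
PAGE_BYTES = 4096   # assumed page size for this illustration


def pages_touched(address, size):
    """Return the page numbers covered by an access of size bytes."""
    first = address // PAGE_BYTES
    last = (address + size - 1) // PAGE_BYTES
    return list(range(first, last + 1))


def byte_accesses(address, size):
    """Return the byte addresses that the access is made up of.

    Only the byte accesses are guaranteed to be single-copy atomic
    when the access is unaligned.
    """
    return [address + offset for offset in range(size)]
```

With these assumptions, a word access at address 0x1FFE touches pages 1 and 2, so a translation or permission fault on either page can cause an abort. A word access at 0x1000 touches page 1 only.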
E2.5 Endian support


General description of endianness in the ARM architecture describes the relationship between endianness and 
memory addressing in the ARM architecture. 


The following subsections then describe the endianness schemes supported by the architecture: 
° Instruction endianness. 


° Data endianness on page E2-2326. 


E2.5.1 General description of endianness in the ARM architecture 


This section only describes memory addressing and the effects of endianness for data elements up to doubleword 
of 64 bits. However, this description can be extended to apply to larger data elements. 


For an address A, Figure E2-2 shows, for big-endian and little-endian memory systems, the relationship between: 
° The doubleword at address A. 

° The words at addresses A and A+4. 

° The halfwords at addresses A, A+2, A+4, and A+6. 

° The bytes at addresses A, A+1, A+2, A+3, A+4, A+5, A+6, and A+7.


The terms in Figure E2-2 have the following definitions: 
MSByte Most-significant byte. 
LSByte Least-significant byte. 


[Figure: two panels, Big-endian memory system and Little-endian memory system, each showing the doubleword at
address A and its constituent words, halfwords, and bytes between MSByte and LSByte. In the big-endian panel, byte
addresses increment from MSByte to LSByte.]

In this figure, Byte_A+1 is an abbreviation for Byte at address A+1.

Figure E2-2 Endianness relationships in AArch32 state 
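
The relationships in Figure E2-2 can be checked with the Python struct module. The doubleword value used here is an arbitrary example chosen so that each byte is distinct. Index 0 of each byte string corresponds to address A.

```python
import struct

value = 0x0102030405060708   # example doubleword, MSByte is 0x01

big = struct.pack(">Q", value)      # bytes at A..A+7, big-endian system
little = struct.pack("<Q", value)   # bytes at A..A+7, little-endian system

# Byte at address A: the MSByte for big-endian, the LSByte for little-endian.
byte_at_a = (big[0], little[0])

# Words at A and A+4, read with the matching endianness.
big_words = struct.unpack(">2I", big)
little_words = struct.unpack("<2I", little)

# Halfword at address A in each system.
big_half = struct.unpack_from(">H", big, 0)[0]
little_half = struct.unpack_from("<H", little, 0)[0]
```

In the big-endian system the word at address A holds the most significant half of the doubleword. In the little-endian system it holds the least significant half.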


E2.5.2 Instruction endianness 


In ARMv8-A, the mapping of instruction memory is always little-endian. 
E2.5.3 Data endianness


The size of the data value that is loaded or stored is the size that is used for the purpose of endian conversion for 
floating-point, Advanced SIMD, and general-purpose register loads and stores. 


Table E2-2 shows the element sizes of all the load/store instructions, for all instruction sets. 


Table E2-2 Element size of load/store instructions

Instructions                                                                    Element size
LDRB, LDREXB, LDRBT, LDRSB, LDRSBT, STRB, STREXB, STRBT, TBB                    Byte
LDRH, LDREXH, LDRHT, LDRSH, LDRSHT, STRH, STREXH, STRHT, TBH                    Halfword
LDR, LDRT, LDREX, STR, STRT, STREX                                              Word
LDRD, LDREXD, STRD, STREXD                                                      Word
All forms of LDM, PUSH, POP, RFE, SRS, all forms of STM                         Word
LDC, STC                                                                        Word
Forms of VLDM, VLDR, VPOP, VPUSH, VSTM, VSTR that transfer 32-bit Si registers  Word
Forms of VLDM, VLDR, VPOP, VPUSH, VSTM, VSTR that transfer 64-bit Di registers  Doubleword
VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4                                  Element size of the Advanced SIMD access






CPSR.E determines the data endianness. 
The data size used for endianness conversions: 


° Is the size of the data value that is loaded or stored for Advanced SIMD and floating-point register and 
general-purpose register loads and stores. 


° Is the size of the data element that is loaded or stored for Advanced SIMD element and data structure loads 
and stores. For more information see Endianness in Advanced SIMD on page E2-2327. 


Instructions to reverse bytes in registers 


An application or device driver might have to interface to memory-mapped peripheral registers or shared memory 
structures that are not the same endianness as the internal data structures. Similarly, the endianness of the operating 
system might not match that of the peripheral registers or shared memory. In these cases, the PE requires an efficient 
method to transform explicitly the endianness of the data. 


Table E2-3 shows the instructions that provide this functionality in the A32 and T32 instruction sets. 


Table E2-3 Byte reversal instructions

Function                                    Instruction  Notes
Reverse bytes in whole register             REV          For use with general-purpose registers.
Reverse bytes in 16-bit halfwords           REV16        For use with general-purpose registers.
Reverse bytes in halfword and sign-extend   REVSH        For use with general-purpose registers.
Reverse elements in doublewords, vector     VREV64       For use with registers in the SIMD and floating-point register file.
Reverse elements in words, vector           VREV32       For use with registers in the SIMD and floating-point register file.
Reverse elements in halfwords, vector       VREV16       For use with registers in the SIMD and floating-point register file.
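
The effect of the three general-purpose register instructions in Table E2-3 can be modelled on 32-bit values. The functions below are a sketch of the documented behavior, not the architectural pseudocode. The function names are hypothetical, and results are returned as unsigned 32-bit integers.

```python
def rev(x):
    """Model REV: reverse the four bytes of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def rev16(x):
    """Model REV16: reverse the bytes within each 16-bit halfword."""
    x &= 0xFFFFFFFF
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)


def revsh(x):
    """Model REVSH: reverse the bytes of the bottom halfword, then
    sign-extend the result to 32 bits."""
    low = x & 0xFFFF
    swapped = ((low & 0xFF) << 8) | (low >> 8)
    if swapped & 0x8000:
        swapped |= 0xFFFF0000
    return swapped
```

For example, rev(0x11223344) returns 0x44332211 and rev16(0x11223344) returns 0x22114433.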
Endianness in Advanced SIMD


Advanced SIMD element Load/Store instructions transfer vectors of elements between memory and the SIMD and 
floating-point register file. An instruction specifies both the length of the transfer and the size of the data elements 
being transferred. This information is used by the PE to load and store data correctly in both big-endian and 
little-endian systems. 


Consider, for example, the A32 or T32 instruction: 
VLD1.16 {D0}, [R1]


This loads a 64-bit register with four 16-bit values. The four elements appear in the register in array order, with the 
lowest indexed element fetched from the lowest address. The order of bytes in the elements depends on the 
endianness configuration, as shown in Figure E2-3. Therefore, the order of the elements in the registers is the same 
regardless of the endianness configuration. 


[Figure: a 64-bit register containing four 16-bit elements, shown as the byte fields A[15:8], A[7:0], B[15:8],
B[7:0], C[15:8], C[7:0], D[15:8], and D[7:0], loaded by VLD1.16 {D0}, [R1] from byte addresses 0 to 7 of a memory
system with little-endian addressing (LE) and of a memory system with big-endian addressing (BE).]

Figure E2-3 Advanced SIMD byte order example for AArch32 state
For information about the alignment of Advanced SIMD instructions see Alignment support on page E2-2323. 
The BigEndian() pseudocode function determines the current endianness of the data. 
The BigEndianReverse() pseudocode function reverses the endianness of a bitstring. 


The BigEndian() and BigEndianReverse() functions are defined in Chapter J1 ARMv8 Pseudocode. 
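
The VLD1.16 example in this section can be modelled in software to show that the element order in the register does not depend on endianness. The sketch below is illustrative only. The function name vld1_16 and the element values are hypothetical, and the big_endian argument plays the role of the value returned by BigEndian().

```python
import struct


def vld1_16(memory, big_endian):
    """Model VLD1.16 {D0}, [R1]: load four 16-bit elements from 8 bytes.

    Endian conversion is applied per 16-bit element, so element 0
    always comes from the lowest address.
    """
    fmt = (">" if big_endian else "<") + "4H"
    return list(struct.unpack(fmt, memory[:8]))


elements = [0xA1A0, 0xB1B0, 0xC1C0, 0xD1D0]   # example values for A, B, C, D

le_memory = struct.pack("<4H", *elements)     # layout in a little-endian system
be_memory = struct.pack(">4H", *elements)     # layout in a big-endian system

le_loaded = vld1_16(le_memory, big_endian=False)
be_loaded = vld1_16(be_memory, big_endian=True)
```

Both loads return the elements in the same order. Only the order of the two bytes within each element differs between the two memory images.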
E2.6 Atomicity in the ARM architecture


Atomicity is a feature of memory accesses, described as atomic accesses. The ARM architecture description refers 
to two types of atomicity, single-copy atomicity and multi-copy atomicity. In the ARMv8 architecture, the atomicity
requirements for memory accesses depend on the memory type, and whether the access is explicit or implicit. For 
more information see: 


° Requirements for single-copy atomicity. 

° Properties of single-copy atomic accesses on page E2-2329. 
° Multi-copy atomicity on page E2-2329. 

° Requirements for multi-copy atomicity on page E2-2330. 


° Concurrent modification and execution of instructions on page E2-2330. 


For more information about the memory types, see Memory type overview on page E2-2317. 


E2.6.1 Requirements for single-copy atomicity 
In AArch32 state, the single-copy atomic PE accesses are: 
° All byte accesses. 
° All halfword accesses to halfword-aligned locations. 
° All word accesses to word-aligned locations. 
° Memory accesses caused by LDREXD and STREXD instructions to doubleword-aligned locations. 


LDM, LDC, LDRD, STM, STC, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM, and VSTR instructions are executed as a sequence 
of word-aligned word accesses. Each 32-bit word access is guaranteed to be single-copy atomic. The architecture 
does not require subsequences of two or more word accesses from the sequence to be single-copy atomic. 


LDRD and STRD accesses to 64-bit aligned locations are 64-bit single-copy atomic as seen by translation table walks 
and accesses to translation tables. 





Note 


This requirement has been added to avoid the need for complex measures to avoid atomicity issues when changing 
translation table entries, without creating a requirement that all locations in the memory system are 64-bit 
single-copy atomic. This addition means: 


° The system designer must ensure that all writable memory locations that might be used to hold translations, 
such as bulk SDRAM, can be accessed with 64-bit single-copy atomicity. 


° Software must ensure that translation tables are not held in memory locations that cannot meet this atomicity 
requirement, such as peripherals that are typically accessed using a narrow bus. 


This requirement places no burden on read-only memory locations for which reads have no side effects, since it is 
impossible to detect the size of memory accesses to such locations. 





Advanced SIMD element and structure loads and stores are executed as a sequence of accesses of the element or 
structure size. The architecture requires the element accesses to be single-copy atomic if and only if both: 
° The element size is 64 bits, or smaller. 


° The elements are naturally aligned.


Accesses to 64-bit elements or structures that are 32-bit aligned are executed as a sequence of 32-bit accesses, each 
of which is single-copy atomic. The architecture does not require subsequences of two or more 32-bit accesses from 
the sequence to be single-copy atomic. 


When an access is not single-copy atomic by the rules described in this section, it is executed as a sequence of one 
or more accesses that aggregate to the size of the original access. Each of the accesses in this sequence is single-copy 
atomic, at least at the byte level. 
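
The rules for explicit accesses by the general-purpose load and store instructions can be summarized as a predicate. This is a simplified sketch. It does not model Advanced SIMD element accesses or the LDRD and STRD rule for translation table walks, and the names is_single_copy_atomic and ldm_accesses are hypothetical.

```python
def is_single_copy_atomic(size_bytes, address, exclusive_doubleword=False):
    """Return True if a single access of this size and address is
    single-copy atomic under the rules listed in this section."""
    if size_bytes == 1:
        return True                      # all byte accesses
    if size_bytes == 2:
        return address % 2 == 0          # halfword-aligned halfwords
    if size_bytes == 4:
        return address % 4 == 0          # word-aligned words
    if size_bytes == 8 and exclusive_doubleword:
        return address % 8 == 0          # LDREXD and STREXD only
    return False


def ldm_accesses(base, register_count):
    """Return the (address, size) sequence for a multiple-register load.

    Each word access is single-copy atomic when word-aligned, but no
    group of two or more of them is required to be.
    """
    return [(base + 4 * i, 4) for i in range(register_count)]
```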
Note


In this section, the terms before the write operation and after the write operation mean before or after the write 
operation has had its effect on the coherence order of the bytes of the memory location accessed by the write 
operation. 








If, according to these rules, an instruction is executed as a sequence of accesses, a synchronous Data Abort exception 
or Debug state entry can be taken during that sequence. This causes execution of the instruction to be abandoned. 
See Data Abort exception on page G1-3859. 


If the synchronous Data Abort exception is returned from using the preferred return address, the instruction that 
generated the sequence of accesses is re-executed and so any access that was performed before the exception was 
taken is repeated. This also applies to an exit from Debug state. 





Note 


The exception behavior for these multiple access instructions means they are not suitable for use for writes to 
memory for the purpose of software synchronization. 





For implicit accesses: 


° Cache linefills and evictions have no effect on the single-copy atomicity of explicit transactions or instruction 
fetches. 
° Instruction fetches are single-copy atomic: 


— At 32-bit granularity in A32 state. 
— At 16-bit granularity in T32 state. 


° Concurrent modification and execution of instructions on page E2-2330 describes additional constraints on 
the behavior of instruction fetches. 


° Translation table walks are performed using accesses that are single-copy atomic: 
— At 32-bit granularity when using Short-descriptor format translation tables. 


— At 64-bit granularity when using Long-descriptor format translation tables. 


E2.6.2 Properties of single-copy atomic accesses 
A read or write operation that is single-copy atomic has the following properties: 


1. For a single-copy atomic store, if the store overlaps another single-copy atomic store, then all of the writes 
from one of the stores are inserted into the Coherence order of each overlapping byte before any of the writes 
of the other store are inserted into the Coherence orders of the overlapping bytes. 


2. If a single-copy atomic load overlaps a single-copy atomic store and for any of the overlapping bytes the load 
returns the data written by the write inserted into the Coherence order of that byte by the single-copy atomic 
store then the load must return data from a point in the Coherence order no earlier than the writes inserted 
into the Coherence order by the single-copy atomic store of all of the overlapping bytes. 


E2.6.3 Multi-copy atomicity 


In a multiprocessing system, writes to a memory location are multi-copy atomic if the following conditions are both
true:

° All writes to the same location are serialized, meaning they are observed in the same order by all observers, 
although some observers might not observe all of the writes. 

° A read of a location does not return the value of a write until all observers observe that write. 


Note 


Writes that are not coherent are not multi-copy atomic. 
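
The first condition, serialization of writes, can be checked for a recorded set of observations. The sketch below covers that condition only and does not model the second condition. It assumes that each write to the location has a unique label, and the function name writes_are_serialized is hypothetical. It uses graphlib, which requires Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter


def writes_are_serialized(observations):
    """Return True if one order of the writes agrees with every observer.

    observations maps an observer name to the list of writes that the
    observer saw, in the order it saw them. An observer may miss writes.
    """
    predecessors = {}
    for seen in observations.values():
        for write in seen:
            predecessors.setdefault(write, set())
        for earlier, later in zip(seen, seen[1:]):
            predecessors[later].add(earlier)
    try:
        # A common order exists only if the ordering constraints are acyclic.
        tuple(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True
```

For example, two observers that see writes w1 and w2 in opposite orders fail the check, whereas an observer that sees only w1 and w3 remains consistent with another that sees w1, w2, and w3.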
E2.6.4 Requirements for multi-copy atomicity
For Normal memory, writes are not required to be multi-copy atomic. 
For Device memory with the non-Gathering attribute, writes that are single-copy atomic are also multi-copy atomic. 
For Device memory with the Gathering attribute, writes are not required to be multi-copy atomic. 
E2.6.5 Concurrent modification and execution of instructions 
The ARMv8 architecture limits the set of instructions that can be executed by one thread of execution as they are
being modified by another thread of execution without requiring explicit synchronization. 
Concurrent modification and execution of instructions can lead to the resulting instruction performing any behavior 
that can be achieved by executing any sequence of instructions that can be executed from the same Exception level, 
except where the instruction before modification and the instruction after modification is a: 
° B, BL, NOP, BKPT, SVC, HVC, or SMC A32 instruction.
° B, BL, BLX, NOP, BKPT, or SVC 16-bit T32 instruction. 
In addition, for the T32 instructions, for which Instruction encodings on page F2-2402 describes the meaning of 
{hw1, hw2}: 
° hw1 of a 32-bit BL (immediate) instruction can be concurrently modified to hw1 of another BL (immediate) 
instruction: 
— This means that some of the most significant bits of the immediate value can be modified. 
° hw1 of a 32-bit BLX (immediate) instruction can be concurrently modified to hw1 of another BLX immediate 
instruction: 
— This means that some of the most significant bits of the immediate value can be modified. 
° hw1 of a 32-bit BL (immediate) or BLX (immediate) instruction can be concurrently modified to a T32 16-bit B, 
BL, BLX, BKPT, or SVC instruction. This modification also works in reverse. 
° hw2 of a 32-bit BL (immediate) instruction can be concurrently modified to hw2 of another BL (immediate) 
instruction with a different immediate: 
— This means that some bits of the immediate value, including the least significant bits, can be modified. 
° hw2 of a 32-bit BLX (immediate) instruction can be concurrently modified to hw2 of another BLX (immediate) 
instruction with a different immediate: 
— This means that some bits of the immediate value, including the least significant bits, can be modified. 
° hw2 of a 32-bit B (immediate) instruction with a condition field can be concurrently modified to hw2 of another 
32-bit B (immediate) instruction with a condition field with a different immediate: 
— This means that some bits of the immediate value, including the least significant bits, can be modified. 
° hw2 of a 32-bit B (immediate) instruction without a condition field can be concurrently modified to hw2 of 
another 32-bit B (immediate) instruction without a condition field: 
— This means that some bits of the immediate value, including the least significant bits, can be modified. 
Note 
In the T32 instruction set: 
° The only encodings of BKPT and SVC are 16-bit.
° The only encoding of BL is 32-bit. 
For the instructions explicitly identified in this section, the architecture guarantees that, after modification of the 
instruction, behavior is consistent with execution of either: 
° The instruction originally fetched.
° A fetch of the modified instruction.
The instructions to which this applies are the B, BL, NOP, BKPT, SVC, HVC, and SMC instructions.


For both instruction sets, if one thread of execution changes a conditional branch instruction to another conditional 
branch instruction, and the change affects both the condition field and the branch target, execution of the changed 
instruction by another thread of execution before the change is synchronized can lead to either: 


° The old condition being associated with the new target address. 


° The new condition being associated with the old target address. 


These possibilities apply regardless of whether the condition, either before or after the change to the branch 
instruction, is the always condition. 


For all other instructions, to avoid UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behavior, instruction 
modifications must be explicitly synchronized before they are executed. The required synchronization is as follows: 


1. No PE must be executing an instruction when another PE is modifying that instruction. 


2. To ensure that the modified instructions are observable, the PE that modified the instructions must issue the 
following sequence of instructions and operations: 


; Coherency example for self-modifying code
; Enter this code with <Rt> containing a new 32-bit instruction,
; to be held in Cacheable space at a location pointed to by Rn. Use STRH in the first
; line instead of STR for a 16-bit instruction.
STR <Rt>, [Rn]
DCCMVAU Rn      ; Clean data cache by MVA to point of unification (PoU)
DSB             ; Ensure visibility of the data stored
ICIMVAU Rn      ; Invalidate instruction cache by VA to PoU
BPIMVA Rn       ; Invalidate branch predictor by MVA to PoU
DSB             ; Ensure completion of the invalidations
Note 
The DCCMVAU operation is not required if the area of memory is either Non-cacheable or Write-through 
Cacheable. 
3. In a multiprocessor system, the ICIMVAU and BPIMVA are broadcast to all PEs within the Inner Shareable domain
of the PE running this sequence. However, once the modified instructions are observable, each PE that is
executing the modified instructions must issue the following instruction to ensure execution of the modified
instructions:


ISB ; Synchronize fetched instruction stream 


For more information about the required synchronization operation, see Synchronization and coherency issues 
between data and instruction accesses on page E2-2320. 


Note 


For information about memory accesses caused by instruction fetches, see Ordering requirements on page E2-2333.


E2.7 Memory ordering


This section describes observation ordering. It contains the following subsections: 
° Observability and completion. 

° Ordering requirements on page E2-2333. 

° Memory barriers on page E2-2335. 


° Summary of the memory ordering rules on page E2-2339. 
For information on endpoint ordering of memory accesses, see Reordering on page E2-2349.


In the ARMv8 memory model, the shareability memory attribute indicates whether hardware must ensure memory 
coherency. 


The ARMv8 memory system architecture defines additional attributes and associated behaviors, defined in the 
system level section of this manual. See: 


° Chapter G3 The AArch32 System Level Memory Model. 
° Chapter G4 The AArch32 Virtual Memory System Architecture. 


See also Mismatched memory attributes on page E2-2352. 


E2.7.1 Observability and completion 


An observer is a master in the system that is capable of observing memory accesses. For a PE, the following 
mechanisms must be treated as independent observers: 


° The mechanism that performs reads or writes to memory.


° A mechanism that causes an instruction cache to be filled from memory or that fetches instructions to be
executed directly from memory. These are treated as reads.


° A mechanism that performs translation table walks. These are treated as reads. 
The set of observers that can observe a memory access is defined by the system. 


In the definitions in this subsection, subsequent means whichever of the following is appropriate to the context: 
° After the point in time where the location is observed by that observer. 


° After the point in time where the location is globally observed. 


For all memory: 


° A write to a location in memory is said to be observed by an observer when: 


— A subsequent read of the location by the same observer returns the value written by the observed write,
or written by a write to that location by any observer that is sequenced in the Coherence order of the
location after the observed write.


— A subsequent write of the location by the same observer is sequenced in the Coherence order of the
location after the observed write.


° A write to a location in memory is said to be globally observed for a shareability domain or set of observers 
when: 


— A subsequent read of the location by any observer in that shareability domain returns the value written 
by the globally observed write, or written by a write to that location by any observer that is sequenced 
in the Coherence order of the location after the globally observed write. 


— A subsequent write of the location by any observer in that shareability domain is sequenced in the 
Coherence order of the location after the globally observed write. 


° A read of a location in memory is said to be observed by an observer when a subsequent write to the location 
by the same observer has no effect on the value returned by the read. 


° A read of a location in memory is said to be globally observed for a shareability domain when a subsequent
write to the location by any observer in that shareability domain has no effect on the value returned by the
read.
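
As an informal illustration of the first condition for an observed write, the following Python sketch models the Coherence order of a single location as a list of write identifiers, oldest first. The function name, the list representation, and the write identifiers are assumptions made for this example only and are not defined by the architecture.

```python
def read_shows_write_observed(coherence_order, observed_write, value_source):
    """Check one subsequent read against the 'observed write' condition.

    coherence_order: write identifiers for one location, oldest first
                     (a hypothetical representation of the Coherence order).
    observed_write:  the write whose observation is being checked.
    value_source:    the write whose value the subsequent read returned.
    """
    # The read must return the value of the observed write, or of a write
    # that is sequenced after it in the Coherence order of the location.
    return (coherence_order.index(value_source)
            >= coherence_order.index(observed_write))


order = ["W0", "W1", "W2"]
print(read_shows_write_observed(order, "W1", "W2"))  # later write: consistent
print(read_shows_write_observed(order, "W1", "W0"))  # earlier write: not consistent
```

The sketch covers only reads by a single observer of a single location. It does not model shareability domains or the corresponding condition for subsequent writes.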
Additionally, for Device-nGnRnE memory:

° A read or write of a memory-mapped location in a peripheral that exhibits side-effects is said to be observed, 
and globally observed, only when the read or write: 
— Meets the general conditions listed. 
— Can begin to affect the state of the memory-mapped peripheral. 


— Can trigger all associated side-effects, whether they affect other peripheral devices, processors, or 
memory. 


Note 


This definition is consistent with the memory access having reached the peripheral. 








For all memory, the completion rules are defined as: 


° A read or write is complete for a shareability domain when all of the following are true:
— The read or write is globally observed for that shareability domain.
— Any translation table walks associated with the read or write are complete for that shareability domain.


° A translation table walk is complete for a shareability domain when the memory accesses associated with the
translation table walk are globally observed for that shareability domain, and the TLB is updated.


° A cache or TLB maintenance instruction is complete for a shareability domain when the effects of the
instruction are globally observed for that shareability domain, and any translation table walks that arise from
the instruction are complete for that shareability domain.


The completion of any cache or TLB maintenance instruction includes its completion on all processors that 
are affected by both the instruction and the DSB operation that is required to guarantee visibility of the 
maintenance instruction. 


Completion of side-effects of accesses to Device memory 


The completion of a memory access to Device memory is not guaranteed to be sufficient to determine that the 
side-effects of the memory access are visible to all observers. The mechanism that ensures the visibility of 
side-effects of a memory access is IMPLEMENTATION DEFINED. 











E2.7.2 Ordering requirements 
ARMv8 defines restrictions for the permitted ordering of memory accesses. These restrictions depend on the
memory type of the addresses that are accessed, see Memory types and attributes on page E2-2342. 
Note 
See Summary of the memory ordering rules on page E2-2339 for the definition of address dependency. 
For accesses to all memory types, the only stores by an observer that can be observed by another observer are those 
stores that have been Architecturally executed. Speculative writes by an observer cannot be observed by another 
observer. For the purposes of this requirement, speculative writes are all of: 
° Writes generated by store instructions that appear in the Execution stream after a branch that is not 
architecturally resolved. 
° Writes generated by store instructions that appear in the Execution stream after an instruction where a
synchronous exception condition has not been architecturally resolved.
° Writes generated by conditional store instructions for which the conditions for the instruction have not been
architecturally resolved.
° Writes generated by store instructions for which the data being written comes from a register that has not been
architecturally committed.

The following additional restrictions apply to the order in which accesses to memory are observed:


° Reads and writes can be observed in any order provided the following constraints are met: 


— If an address dependency exists between two reads or between a read and a write, then those memory
accesses are observed in program order by all observers within the common shareability domain of the 
memory addresses being accessed. 


— Ordering can be achieved by using a DMB or DSB barrier. For more information on DMB and DSB 
instructions, see Memory barriers on page E2-2335. 


° Reads and writes to the same address are coherent within the shareability domain of the memory address
being accessed. 


° Two reads to the same address by the same observer are observed in program order by all observers within 
the shareability domain of the memory address being accessed. 


° Writes are not required to be multi-copy atomic. This means that in the absence of barriers, the observation
of a store by one observer does not imply the observation of the store by another observer. 


° Instructions that access multiple elements have no defined ordering requirements for the memory accesses 
relative to each other. 


For Device memory with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral
is the same as occurs in a Simple sequential execution of the program, see page Glossary-5728. This means the
accesses arrive in program order. This ordering applies for all accesses using any of the memory types with the
non-Reordering attribute, which means Device-nGnRE accesses are ordered with respect to Device-nGnRnE 
accesses to the same peripheral. If the memory accesses are not to a peripheral then there are no ordering restrictions 
from the non-Reordering attribute. For the purposes of this definition, a single peripheral is a region of memory of 
an IMPLEMENTATION DEFINED size that is defined by the peripheral. 


Memory accesses caused by instruction fetches are not required to be observed in program order, unless they are 
separated by an ISB or other Context synchronization event. 


Address dependencies and order 


In the ARMv8 architecture, a register data dependency between the value returned by a load instruction and the
address used by a subsequent memory transaction creates order between that load instruction and the subsequent 
memory transaction. 


A register data dependency exists between a first data value and a second data value when either: 

° The register used to hold the first data value is used in the calculation of the second data value, and the
calculation between the first data value and the second data value does not consist of either:
— A conditional branch whose condition is determined by the first data value.


— A conditional selection, move, or computation whose condition is determined by the first data value,
where the input data values for the selection, move, or computation do not have a data dependency on
the first data value.


° There is a register data dependency between the first data value and a third data value, and between the third
data value and the second data value.





Note 
A register data dependency can exist even if the value of the first data value is discarded as part of the calculation, 
as might be the case if it is ANDed with 0x0 or if arithmetic using the first data value cancels out its contribution.


For example, each of the following code sequences creates order between the memory transactions: 


Sequence 1
    LDR R1, [R2]
    AND R1, R1, #0
    LDR R4, [R3, R1]


Sequence 2
    LDR R1, [R2]
    ADD R3, R3, R1
    SUB R3, R3, R1
    STR R4, [R3]
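
The two sequences can be checked with a small toy model. The following Python sketch propagates a dependency through a list of data-processing instructions, each described as a destination register and its source registers. The function name and the tuple representation are assumptions made for this illustration. The model deliberately ignores the conditional branch and conditional selection cases that the definition excludes.

```python
def address_depends_on_load(instructions, load_dest, address_regs):
    """Toy check for a register data dependency from a loaded value to the
    address registers of a later memory access.

    instructions: (dest_register, source_registers) tuples for the
                  instructions between the two memory accesses.
    load_dest:    register written by the first load.
    address_regs: registers used to form the address of the later access.
    """
    dependent = {load_dest}              # registers that carry the dependency
    for dest, sources in instructions:
        if dependent & set(sources):
            # The dependency propagates even if the arithmetic discards or
            # cancels out the loaded value.
            dependent.add(dest)
        else:
            # The register is overwritten with an independent value.
            dependent.discard(dest)
    return bool(dependent & set(address_regs))


# Sequence 1: AND R1, R1, #0 then LDR R4, [R3, R1]
seq1 = [("R1", ["R1"])]
# Sequence 2: ADD R3, R3, R1 then SUB R3, R3, R1 then STR R4, [R3]
seq2 = [("R3", ["R3", "R1"]), ("R3", ["R3", "R1"])]

print(address_depends_on_load(seq1, "R1", ["R3", "R1"]))
print(address_depends_on_load(seq2, "R1", ["R3"]))
```

In both sequences the model reports a dependency, matching the statement that each sequence creates order between the memory transactions.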





E2.7.3 Memory barriers 


The ARM architecture is a weakly ordered memory architecture that supports out of order completion. Memory 
barrier is the general term applied to an instruction, or sequence of instructions, that forces synchronization events 
by a PE with respect to retiring Load/Store instructions. The memory barriers defined by the ARMv8 architecture
provide a range of functionality, including: 


° Ordering of Load/Store instructions. 
° Completion of Load/Store instructions. 
° Context synchronization. 


The following subsections describe the ARMv8 memory barrier instructions: 

° Instruction Synchronization Barrier (ISB). 

° Data Memory Barrier (DMB) on page E2-2336. 

° Data Synchronization Barrier (DSB) on page E2-2337. 

° Shareability and access limitations on the data barrier operations on page E2-2337. 


° Load-Acquire, Store-Release on page E2-2338. 





Note 
Depending on the required synchronization, a program might use memory barriers on their own, or it might use them 
in conjunction with cache maintenance and memory management instructions that in general are only available 
when software execution is at EL1 or higher. 





The DMB and DSB memory barriers affect reads and writes to the memory system generated by Load/Store instructions 
and data or unified cache maintenance instructions being executed by the PE. Instruction fetches or accesses caused 
by a hardware translation table access are not explicit accesses. 


AArch32 state also supports the legacy barrier instructions CP15DMB, CP15DSB, and CP15ISB. These
instructions are executed as MCRs using the appropriate encoding, and are accessible from EL0. However, for
performance reasons ARM deprecates any use of these operations, and strongly recommends that software uses the
DMB, DSB, and ISB instructions described in this section instead. Optionally, an implementation can support a
CP15BEN control that supervisory software can use to disable use of these instructions, meaning the corresponding
MCR encodings are UNDEFINED. When the CP15BEN control is supported, setting one of the following CP15BEN
fields to 0 makes execution of CP15DMB, CP15DSB, and CP15ISB UNDEFINED:


° SCTLR_EL1.CP15BEN, for execution of these instructions at EL0 using AArch32 when EL1 is using
AArch64.


° SCTLR.CP15BEN, for execution of these instructions at EL0 or EL1 when EL1 is using AArch32.


° HSCTLR.CP15BEN, for execution of these instructions at EL2 when EL2 is using AArch32.


Instruction Synchronization Barrier (ISB) 


An ISB instruction flushes the pipeline in the PE, so that all instructions that come after the ISB instruction in 
program order are fetched from the cache or memory after the ISB instruction has completed. Using an ISB ensures 
that the effects of context-changing operations executed before the ISB are visible to the instructions fetched after 
the ISB instruction. Examples of context-changing operations that require the insertion of an ISB instruction to ensure 
the effects of the operation are visible to instructions fetched after the ISB instruction are: 


° Completed cache and TLB maintenance instructions. 


° Changes to System registers. 


Any context-changing operations appearing in program order after the ISB instruction only take effect after the ISB 
has been executed. 







The pseudocode function for the operation of an ISB is InstructionSynchronizationBarrier(). 


See also Memory barriers on page G3-4019. 


Data Memory Barrier (DMB) 


The DMB instruction is a data memory barrier. The PE that executes the DMB instruction is referred to as the executing 
PE, PEe. The DMB instruction takes an <option> argument that specifies the shareability domains and access types to 
which the instruction applies, see Shareability and access limitations on the data barrier operations on
page E2-2337.


If the required shareability is Full system then the operation applies to all observers within the system. 
A DMB creates two groups of memory accesses, Group A and Group B: 


Group A Contains: 


° All explicit memory accesses of the required access types from observers in the same 
required shareability domain as PEe that are observed by PEe before the DMB instruction. 
These accesses include any accesses of the required access types performed by PEe. 


° All loads of required access types from an observer PEx in the same required shareability 
domain as PEe that have been observed by any given different observer, PEy, in the same 
required shareability domain as PEe before PEy has performed a memory access that is a 
member of Group A. 


Group B Contains: 


° All explicit memory accesses of the required access types by PEe that occur in program order 
after the DMB instruction. 


° All explicit memory accesses of the required access types by any given observer PEx in the 
same required shareability domain as PEe that can only occur after a load by PEx has returned 
the result of a store that is a member of Group B. 


Any observer with the same required shareability domain as PEe observes all members of Group A before it 
observes any member of Group B to the extent that those group members are required to be observed, as determined 
by the shareability and cacheability of the memory addresses accessed by the group members. 


If members of Group A and members of Group B access the same memory-mapped peripheral of arbitrary 
system-defined size, then members of Group A that are accessing Device or Normal Non-cacheable memory arrive 
at that peripheral before members of Group B that are accessing Device or Normal Non-cacheable memory. Where 
the members of Group A and Group B that must be ordered are from the same PE, a DMB NSH is sufficient for this 
guarantee. 


Note 


° A memory access might be in neither Group A nor Group B. The DMB does not affect the order of observation 
of such a memory access. 





° The second part of the definition of Group A is recursive. Ultimately, membership of Group A derives from 
the observation by PEy of a load before PEy performs an access that is a member of Group A as a result of 
the first part of the definition of Group A. 


° The second part of the definition of Group B is recursive. Ultimately, membership of Group B derives from 
the observation by any observer of an access by PEe that is a member of Group B as a result of the first part 
of the definition of Group B. 





DMB only affects memory accesses and the operation of data cache and unified cache maintenance instructions, see 
About cache maintenance in ARMv8 on page G3-3995. It has no effect on the ordering of any other instructions
executing on the PE. A DMB intended to ensure the completion of cache maintenance instructions must have an access 
type of both loads and stores. 


The pseudocode function for the operation of a DMB is DataMemoryBarrier(). 


See also Memory barrier instructions on page G3-4016 and Memory barriers on page G3-4019.


Data Synchronization Barrier (DSB)

The DSB instruction is a memory barrier, that synchronizes the execution stream with memory accesses. 


The DSB instruction takes an <option> argument that specifies the shareability domains and access types to which the 
instruction applies, see Shareability and access limitations on the data barrier operations. 


If the required shareability is Full system then the operation applies to all observers within the system. 


A DSB behaves as a DMB with the same arguments, and also has the additional properties defined in this section. The 
PE that executes the DSB instruction is referred to as the executing PE, PEe. 


Execution of a DSB at EL2 ensures that any memory accesses caused by speculative translation table walks from the 
Non-secure PL1&0 translation regime have been observed. 


For more information, see Use of out-of-context translation regimes on page G4-4028. 
A DSB completes when all of the following apply: 


° All explicit memory accesses that are observed by PEe before the DSB is executed and are of the required 
access types, and are from observers in the same required shareability domain as PEe, are complete for the 
set of observers in the required shareability domain. 


° If the required access types of the DSB is reads and writes, then all cache and branch predictor maintenance 
instructions and all TLB maintenance instructions issued by PEe before the DSB are complete for the required 
shareability domain. 


In addition, no instruction that appears in program order after the DSB instruction can execute until the DSB completes. 


See also Memory barrier instructions on page G3-4016 and Memory barriers on page G3-4019. 


Shareability and access limitations on the data barrier operations 


The DMB and DSB instructions can each take an optional limitation argument that specifies: 
° The shareability domain over which the instruction must operate. This is one of:
— Full system.
— Outer Shareable.
— Inner Shareable.
— Non-shareable.
° The accesses for which the instruction operates. This is one of:
— Read and write accesses in Group A and Group B.
— Write accesses only in Group A and Group B.
— Read access only in Group A and read and write accesses in Group B.


Note 


This form of a DMB or DSB instruction can be described as a Load-Load/Store barrier. 








Table E2-4 shows how these options are encoded in the <option> field of the instruction. 


Table E2-4 Encoding of the DMB and DSB <option> parameter

Accesses                              Shareability domain
Group A           Group B             Full system  Outer Shareable  Inner Shareable  Non-shareable
Reads and writes  Reads and writes    SY           OSH              ISH              NSH
Writes            Writes              ST           OSHST            ISHST            NSHST
Reads             Reads and writes    LD           OSHLD            ISHLD            NSHLD










If no <option> is specified then the instruction operates for read and write accesses, over the full system, meaning 
the operation is the same as for the SY option. See the instruction descriptions for more information: 


° DMB on page F5-2659.
° DSB on page F5-2662. 
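
As an informal aid, the encoding in Table E2-4 and the default described above can be written as a lookup. The following Python sketch transcribes the table. The function name `barrier_option` and the string keys are assumptions made for this example and are not assembler or architecture syntax.

```python
# Shareability-domain part of each option name, transcribed from Table E2-4.
DOMAIN_PREFIX = {
    "Full system": "",
    "Outer Shareable": "OSH",
    "Inner Shareable": "ISH",
    "Non-shareable": "NSH",
}

# Access-type part, keyed by (Group A accesses, Group B accesses).
ACCESS_SUFFIX = {
    ("reads and writes", "reads and writes"): "",
    ("writes", "writes"): "ST",
    ("reads", "reads and writes"): "LD",
}


def barrier_option(domain="Full system",
                   group_a="reads and writes",
                   group_b="reads and writes"):
    """Return the <option> name for a DMB or DSB, following Table E2-4."""
    prefix = DOMAIN_PREFIX[domain]
    suffix = ACCESS_SUFFIX[(group_a, group_b)]
    if prefix == "":
        # The Full system column uses SY, ST, and LD.
        return suffix or "SY"
    return prefix + suffix


print(barrier_option())                                     # default case
print(barrier_option("Inner Shareable", "writes", "writes"))
```

Calling `barrier_option()` with no arguments mirrors the case where no `<option>` is specified, which operates as the SY option. Combinations of Group A and Group B accesses that do not appear in the table raise `KeyError` in this sketch.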


Note 


ISB also supports an optional limitation argument that can only contain one value that corresponds to full system 
operation, see ISB on page F5-2679.








Load-Acquire, Store-Release 
ARMv8 provides a set of instructions with Acquire semantics for loads, and Release semantics for stores.
For all memory types, these instructions have the following ordering requirements: 


° A Store-Release followed by a Load-Acquire is observed in program order by each observer within the 
shareability domain of the memory address being accessed by the Store-Release and the memory address 
being accessed by the Load-Acquire. 


° For a Load-Acquire, observers in the shareability domain of the address accessed by the Load-Acquire 
observe accesses in the following order: 


1. The read caused by the Load-Acquire. 


2. Reads and writes caused by loads and stores that appear in program order after the Load-Acquire for
which the shareability of the address accessed by the load or store requires that the observer observes 
the access. 


There are no other ordering requirements on loads or stores that appear before the Load-Acquire. 


° For a Store-Release, observers in the shareability domain of the address accessed by the Store-Release 
observe accesses in the following order: 


1. All of the following for which the shareability of the address accessed requires that the observer 
observes the access: 


° Reads and writes caused by loads and stores that appear in program order before the 
Store-Release. 


° Writes that were observed by the PE executing the Store-Release before it executed the 
Store-Release. 


2. The write caused by the Store-Release. 


There are no additional ordering requirements on loads or stores that appear in program order after the 
Store-Release. 


° All Store-Release instructions must be multi-copy atomic when they are observed with Load-Acquire 
instructions. 


In addition, for accesses to a memory-mapped peripheral of an arbitrary system-defined size that are defined as any 
type of Device memory accesses, these instructions have the following requirements: 


° A Load-Acquire to an address in the memory-mapped peripheral ensures that all memory accesses using 
Device memory types to the same memory-mapped peripheral that are architecturally required to be observed 
after the Load-Acquire will arrive at the memory-mapped peripheral after the memory access of the 
Load-Acquire. 


° A Store-Release to an address in the memory-mapped peripheral ensures that all memory accesses using 
Device memory types to the same memory-mapped peripheral that are architecturally required to be observed 
before the Store-Release will arrive at the memory-mapped peripheral before the memory access of the 
Store-Release.


° If a Load-Acquire to a memory address in the memory-mapped peripheral has observed the value stored to
that address by a Store-Release, then any memory access to the memory-mapped peripheral that is 
architecturally required to be ordered before the memory access of the Store-Release will arrive at the 
memory-mapped peripheral before any memory access to the same peripheral that is architecturally required 
to be ordered after the memory access of the Load-Acquire. 


Load-Acquire and Store-Release, other than LDAEXD and STLEXD, access only a single data element. This access is 
single-copy atomic. The address of the data object must be aligned to the size of the data element being accessed, 
otherwise the access generates an Alignment fault. 


LDAEXD and STLEXD access two data elements. The address supplied to the instructions must be doubleword aligned, 
otherwise the access generates an Alignment fault. 


A Store-Release Exclusive instruction only has the release semantics if the store is successful. 





Note 


° Each Load-Acquire Exclusive and Store-Release Exclusive instruction is essentially a variant of the 
equivalent Load-Exclusive or Store-Exclusive instruction. All usage restrictions and single-copy atomicity 
properties: 


— That apply to the Load-Exclusive instructions also apply to the Load-Acquire Exclusive instructions. 
— That apply to the Store-Exclusive instructions also apply to the Store-Release Exclusive instructions. 


° The Load-Acquire/Store-Release instructions can remove the requirement to use the explicit DMB memory 
barrier instruction. 





Table E2-5 summarizes the Load-Acquire/Store-Release instructions.


Table E2-5 Load-Acquire/Store-Release instructions

Data type          Load-Acquire  Store-Release  Load-Acquire Exclusive  Store-Release Exclusive
32-bit word        LDA           STL            LDAEX                   STLEX
16-bit halfword    LDAH          STLH           LDAEXH                  STLEXH
8-bit byte         LDAB          STLB           LDAEXB                  STLEXB
64-bit doubleword  -             -              LDAEXD                  STLEXD

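
The alignment requirements stated earlier in this subsection, combined with the data types in Table E2-5, can be sketched as a simple check. The following Python example is illustrative only. The dictionary and the function name `generates_alignment_fault` are assumptions made for this sketch, and the byte sizes are derived from the data type column of the table.

```python
# Required alignment in bytes for each instruction in Table E2-5.
# Single-element accesses are aligned to the element size. LDAEXD and
# STLEXD access two elements and must be doubleword aligned.
REQUIRED_ALIGNMENT = {
    "LDA": 4, "STL": 4, "LDAEX": 4, "STLEX": 4,
    "LDAH": 2, "STLH": 2, "LDAEXH": 2, "STLEXH": 2,
    "LDAB": 1, "STLB": 1, "LDAEXB": 1, "STLEXB": 1,
    "LDAEXD": 8, "STLEXD": 8,
}


def generates_alignment_fault(mnemonic, address):
    """Return True if the access would generate an Alignment fault
    under the alignment rule described in the text."""
    return address % REQUIRED_ALIGNMENT[mnemonic] != 0


print(generates_alignment_fault("LDA", 0x1002))     # word access, halfword aligned
print(generates_alignment_fault("STLEXD", 0x1008))  # doubleword aligned
```

The sketch models only the address alignment rule. It does not model any other source of faults.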
E2.7.4 Summary of the memory ordering rules 
The following is a concise list of the situations that are required, by the ARM architecture specification, to cause 
externally-visible order of memory. This ordering means that if memory transaction A has externally visible order 
ahead of memory transaction B, then all observers within the shareability domains of A and B will observe A 
before B. See Terms used in the summary of the memory ordering rules on page E2-2340 for definitions of the terms 
used. 
Note 
This list applies to both AArch32 state and AArch64 state, and is consistent with the requirements of ARMv7. 
1. DMB and DSB barrier instructions, and load acquire/store release instructions, create externally-visible order as 
defined by those instructions. 
2. A True or False Address dependency from a Load to a Load or from a Load to a Store creates 
externally-visible order. 
3. A True Control dependency from a Load to an ISB instruction creates externally-visible order between the 
load and any memory accesses after the ISB instruction. 
4. A True Register data dependency from a Load to a Store creates externally-visible order.
5. A True Control dependency from a Load to a Store creates externally-visible order.


6. Memory is coherent within the shareability domain of a memory address, which means there is a total order 
of all writes to that address that all observers within that shareability domain will agree on. 


Note 


A consequence of this is that reads to the same address by the same processor are observed in order. 








7. A Dependency from a Store to a Load through memory between different PEs creates externally-visible order 
but stores are not multi-copy atomic except where explicitly defined to be by the definition of the store. 





Note 


A consequence of the lack of multi-copy atomicity is that a Store to Load dependency through memory on 
the same PE does not create externally-visible order. 





No other effects are required to create externally visible order in the ARM architecture. 
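
Rules 2 to 5 of the summary can be restated as a small decision function, using the dependency terms defined in the following subsection. This Python sketch is an informal restatement only. The function name and the string labels are assumptions made for this example, and the function covers dependencies that start at a Load, not barriers, coherence, or dependencies through memory.

```python
def load_dependency_creates_order(kind, is_true, target):
    """Restate rules 2 to 5 for a dependency that starts at a Load.

    kind:    "address", "control", or "register data".
    is_true: True for a True dependency, False for a False dependency.
    target:  "load", "store", or "isb".
    """
    if kind == "address":
        # Rule 2: True or False Address dependency, Load to Load or Store.
        return target in ("load", "store")
    if kind == "control":
        # Rule 3 (Load to ISB) and rule 5 (Load to Store), True only.
        return is_true and target in ("isb", "store")
    if kind == "register data":
        # Rule 4: True Register data dependency, Load to Store.
        return is_true and target == "store"
    raise ValueError("unknown dependency kind: " + kind)


print(load_dependency_creates_order("address", False, "load"))
print(load_dependency_creates_order("control", True, "load"))
```

For rule 3, the order created is between the load and any memory accesses after the ISB instruction. The function only reports that the rule applies.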


Terms used in the summary of the memory ordering rules 
The summary uses the following terms: 


Register data dependency 
This is defined in Address dependencies and order on page E2-2334. 


False Register data dependency 
A False Register data dependency is a Register data dependency where no register in the system 
holds a variable for which a change of the first data value causes a change of the second data value. 
True Register data dependency 
A True Register data dependency is a Register data dependency that is not a False Register data
dependency. 
True Address dependency 


A True Address dependency between a load and a subsequent memory transaction exists where 
there is a True Register data dependency between the data value returned from the load and the 
address used by the subsequent memory transaction. 


False Address dependency 


A False Address dependency between a load and a subsequent memory transaction exists where 
there is a False Register data dependency between the data value returned from the load and the 
address used by the subsequent memory transaction. 


True Control dependency 


A True Control dependency between a load and a subsequent instruction exists: 


° Where there is a True Register data dependency between the data value returned from the 
load and data value used in the evaluation of a conditional branch and the subsequent 
instruction is only executed as a result of one of the possible outcomes of that conditional 
branch. 


° Where there is a True Register data dependency between the data value returned from the 
load and the data value used in the evaluation of a subsequent instruction that is a conditional 
selection, move or computation for which both: 


— The condition is determined by the returned data value. 


— No input data value for the selection, move or computation has a register data 
dependency on the returned data value.


Dependency from a Store to a Load through memory


A Dependency from a Store to a Load through memory exists where the Store and Load are to the
same physical address, and the value returned by the Load is the value that was written by the Store,
and could not be the value that was previously held in that memory address.


E2.8 Memory types and attributes


In ARMv8, the ordering of accesses for addresses in memory, referred to as the memory order model, is defined by
the memory attributes. The following sections describe this model: 


° Normal memory.
° Device memory on page E2-2346.
° Memory access restrictions on page E2-2351.


E2.8.1 Normal memory 


The Normal memory type attribute applies to most memory in a system. It indicates that the hardware is permitted 
by the architecture to perform speculative data read accesses to these locations, regardless of the access permissions 
for these locations. 


The Normal memory type has the following properties: 


° A write to a memory location with the Normal attribute completes in finite time. This means that it is globally
observed for the shareability domain of the memory location in finite time. For a Non-cacheable location, the
location is observed by all observers in finite time.


° A completed write to a memory location with the Normal attribute is globally observed for the shareability
domain of the memory location in finite time without the need for explicit cache maintenance instructions or
barriers. For a Non-cacheable location, the completed write is globally observed for all observers in finite
time without the need for explicit cache maintenance instructions or barriers.


° Writes to a memory location with the Normal memory attribute that is Non-cacheable must reach the
endpoint for that location in the memory system in finite time.


° Unaligned memory accesses can access Normal memory if the system is configured to generate such
accesses.


° There is no requirement for the memory system beyond the PE to be able to identify the elements accessed
by multi-register Load/Store instructions. See Multi-register loads and stores that access Normal memory on
page E2-2346.


Note 





The Normal memory attribute is appropriate for locations of memory that are idempotent, meaning that they 
exhibit all of the following properties: 


— Read accesses can be repeated with no side-effects.
— Repeated read accesses return the last value written to the resource being read.
— Read accesses can fetch additional memory locations with no side-effects.


— Write accesses can be repeated with no side-effects if the contents of the location accessed are
unchanged between the repeated writes or as the result of an exception, as described in this section.


— Unaligned accesses can be supported.


— Accesses can be merged before accessing the target memory system.


An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on 
page E2-2328 might be abandoned as a result of an exception being taken during the sequence of accesses. 
On return from the exception the instruction is restarted, and therefore one or more of the memory locations 
might be accessed multiple times. This can result in repeated write accesses to a location that has been 
changed between the write accesses. 





The following sections describe the other attributes for Normal memory: 


Shareable Normal memory on page E2-2343. 
Non-shareable Normal memory on page E2-2344. 
Cacheability attributes for Normal memory on page E2-2344. 
See also:
° Multi-register loads and stores that access Normal memory on page E2-2346. 
° Atomicity in the ARM architecture on page E2-2328. 


° Memory barriers on page E2-2335. For accesses to Normal memory, a DMB instruction is required to ensure 
the required ordering. 


° Concurrent modification and execution of instructions on page E2-2330. 


Shareable Normal memory 


A Normal memory location has a Shareability attribute that is defined as one of: 


° Inner Shareable. 
° Outer Shareable. 
° Non-shareable. 


The shareability attributes define the data coherency requirements of the location, that hardware must enforce. They 
do not affect the coherency requirements of instruction fetches, see Synchronization and coherency issues between 
data and instruction accesses on page E2-2320. 





Note 


° System designers can use the Shareability attribute to specify the locations in Normal memory for which 
coherency must be maintained. However, software developers must not assume that specifying a memory 
location as Non-shareable permits software to make assumptions about the incoherency of the location 
between different PEs in a shared memory system. Such assumptions are not portable between different 
multiprocessing implementations that might use the Shareability attribute. Any multiprocessing 
implementation might implement caches that are shared, inherently, between different PEs. 


° This architecture assumes that all PEs that use the same operating system or hypervisor are in the same Inner 
Shareable shareability domain. 





Shareable, Inner Shareable, and Outer Shareable Normal memory 
The ARM architecture abstracts the system as a series of Inner and Outer Shareability domains. 


Each Inner Shareability domain contains a set of observers that are data coherent for each member of that set for 
data accesses with the Inner Shareable attribute made by any member of that set. 


Each Outer Shareability domain contains a set of observers that are data coherent for each member of that set for 
data accesses with the Outer Shareable attribute made by any member of that set. 


The following properties also hold: 


° Each observer is only a member of a single Inner Shareability domain.
° Each observer is only a member of a single Outer Shareability domain. 
° All observers in an Inner Shareability domain are always members of the same Outer Shareability domain. 


This means that an Inner Shareability domain is a subset of an Outer Shareability domain, although it is not 
required to be a proper subset. 
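

The three properties above can be checked mechanically for a given system description. The following is a minimal sketch only, assuming a hypothetical description in which each observer name is mapped to one Inner and one Outer Shareability domain. The function name, the observer names, and the domain names are illustrative and are not defined by the architecture.

```python
def check_shareability_domains(inner_of, outer_of):
    """Return True if the observer-to-domain maps satisfy the stated properties.

    inner_of and outer_of map each observer name to the single Inner or Outer
    Shareability domain it belongs to. Using a dict means an observer cannot
    be listed in two domains of the same kind.
    """
    if set(inner_of) != set(outer_of):
        return False  # every observer needs both an Inner and an Outer domain
    outer_for_inner = {}
    for observer, inner in inner_of.items():
        outer = outer_of[observer]
        # All observers in one Inner domain must be in the same Outer domain.
        if outer_for_inner.setdefault(inner, outer) != outer:
            return False
    return True


# Hypothetical system: two clusters of PEs inside one Outer Shareability domain.
inner_of = {"pe0": "cluster0", "pe1": "cluster0", "pe2": "cluster1", "pe3": "cluster1"}
outer_of = {"pe0": "subsys", "pe1": "subsys", "pe2": "subsys", "pe3": "subsys"}
```

Calling `check_shareability_domains(inner_of, outer_of)` on this description succeeds. Moving `pe1` alone into a different Outer domain makes the check fail, because `cluster0` would then span two Outer Shareability domains.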


Note 


° Because all data accesses to Non-cacheable locations are data coherent to all observers, Non-cacheable
locations are always treated as Outer Shareable. 











° The Inner Shareable domain is expected to be the set of PEs controlled by a single hypervisor or operating 
system. 
The details of the use of the Shareability attributes are system-specific. Example E2-1 shows how they might be
used. 


Example E2-1 Use of shareability attributes 


In an implementation, a particular subsystem with two clusters of PEs has the requirement that: 


° In each cluster, the data caches or unified caches of the PEs in the cluster are transparent for all data accesses
to memory locations with the Inner Shareable attribute.


° However, between the two clusters, the caches:
— Are not required to be coherent for data accesses that have only the Inner Shareable attribute. 


— Are coherent for data accesses that have the Outer Shareable attribute. 


In this system, each cluster is in a different Shareability domain for the Inner Shareable attribute, but all components 
of the subsystem are in the same Shareability domain for the Outer Shareable attribute. 


A system might implement two such subsystems. If the data caches or unified caches of one subsystem are not 
transparent to the accesses from the other subsystem, this system has two Outer Shareable Shareability domains. 


Having two levels of shareability means system designers can reduce the performance and power overhead for 
shared memory locations that do not need to be part of the Outer Shareable Shareability domain. 


For Shareable Normal memory, the Load-Exclusive and Store-Exclusive synchronization primitives take account 
of the possibility of accesses by more than one observer in the same Shareability domain. 


Non-shareable Normal memory 


For Normal memory locations, the Non-shareable attribute identifies Normal memory that is likely to be accessed 
only by a single PE. 


A location in Normal memory with the Non-shareable attribute does not require the hardware to make data accesses 
by different observers coherent, unless the memory is Non-cacheable. For a Non-shareable location, if other 
observers share the memory system, software must use cache maintenance instructions, if the presence of caches 
might lead to coherency issues when communicating between the observers. This cache maintenance requirement 
is in addition to the barrier operations that are required to ensure memory ordering. 


For Non-shareable Normal memory, it is IMPLEMENTATION DEFINED whether the Load-Exclusive and 
Store-Exclusive synchronization primitives take account of the possibility of accesses by more than one observer. 


Cacheability attributes for Normal memory 


In addition to being Outer Shareable, Inner Shareable or Non-shareable, each region of Normal memory is assigned 
a Cacheability attribute that is one of: 


° Write-Through Cacheable. 
° Write-Back Cacheable. 


° Non-cacheable. 
Also, for Write-Through Cacheable and Write-Back Cacheable Normal memory regions: 
° A region might be assigned cache allocation hints for read and write accesses. 


° It is IMPLEMENTATION DEFINED whether the cache allocation hints can have an additional attribute of 
Transient or Non-transient. 


For more information see Cacheability, cache allocation hints, and cache transient hints on page G3-3992. 
A memory location can be marked as having different cacheability attributes, for example when using aliases in a
virtual to physical address mapping: 


° If the attributes differ only in the cache allocation hint this does not affect the behavior of accesses to that
location. 
° For other cases see Mismatched memory attributes on page E2-2352. 


The cacheability attributes provide a mechanism of coherency control with observers that lie outside the 
Shareability domain of a region of memory. In some cases, the use of Write-Through Cacheable or Non-cacheable 
regions of memory might provide a better mechanism for controlling coherency than the use of hardware coherency 
mechanisms or the use of cache maintenance routines. To this end, the architecture requires the following properties 
for Non-cacheable or Write-Through Cacheable memory: 


° A completed write to a memory location that is Non-cacheable or Write-Through Cacheable for a level of 
cache made by an observer accessing the memory system inside the level of cache is visible to all observers 
accessing the memory system outside the level of cache without the need of explicit cache maintenance. 


° A completed write to a memory location that is Non-cacheable for a level of cache made by an observer
accessing the memory system outside the level of cache is visible to all observers accessing the memory 
system inside the level of cache without the need of explicit cache maintenance. 


Note 


Implementations can use the cache allocation hints to indicate a probable performance benefit of caching. For 
example, a programmer might know that a piece of memory is not going to be accessed again and would be better 
treated as Non-cacheable. The distinction between memory regions with attributes that differ only in the cache 
allocation hints exists only as a hint for performance. 








For Normal memory, the ARM architecture provides cacheability attributes that are defined independently for each
of two conceptual levels of cache, the inner and the outer cache. The relationship between these conceptual levels
of cache and the implemented physical levels of cache is IMPLEMENTATION DEFINED, and can differ from the
boundaries between the Inner and Outer Shareability domains. However:


° Inner refers to the innermost caches, meaning the caches that are closest to the PE, and always includes the
lowest level of cache.


° No cache that is controlled by the Inner cacheability attributes can lie outside a cache that is controlled by the
Outer cacheability attributes.


° An implementation might not have any outer cache.


Example E2-2, Example E2-3 on page E2-2346, and Example E2-4 on page E2-2346 describe the possible ways of
implementing a system with three levels of cache, level 1 (L1) to level 3 (L3).





Note 
° L1 cache is the level closest to the PE, see Memory hierarchy on page E2-2318.
° When managing coherency, system designs must consider both the inner and outer cacheability attributes, as
well as the Shareability attributes. This is because hardware might have to manage the coherency of caches
at one conceptual level, even when another conceptual level has the Non-cacheable attribute.





Example E2-2 Implementation with two inner and one outer cache levels 


Implement the three levels of cache in the system, L1 to L3, with: 





° The Inner cacheability attribute applied to L1 and L2 cache.
° The Outer cacheability attribute applied to L3 cache. 
Example E2-3 Implementation with three inner and no outer cache levels


Implement the three levels of cache in the system, L1 to L3, with the Inner cacheability attribute applied to L1, L2, 
and L3 cache. Do not use the Outer cacheability attribute. 


Example E2-4 Implementation with one inner and two outer cache levels 


Implement the three levels of cache in the system, L1 to L3, with: 
° The Inner cacheability attribute applied to L1 cache. 
° The Outer cacheability attribute applied to L2 and L3 cache. 
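

The constraints on how the Inner and Outer cacheability attributes map onto physical cache levels can be expressed as a small check. This is an illustrative sketch, not an architectural definition. It assumes a hypothetical representation in which a list names the controlling attribute of each implemented cache level, ordered from L1 outwards.

```python
def valid_cache_mapping(levels):
    """Check a mapping of cache levels to 'inner' or 'outer' control.

    levels lists the attribute controlling each cache level, ordered from
    L1 (closest to the PE) outwards.
    """
    if not levels or levels[0] != "inner":
        return False  # Inner always includes the lowest level of cache
    seen_outer = False
    for attr in levels:
        if attr == "outer":
            seen_outer = True
        elif attr == "inner":
            if seen_outer:
                return False  # inner-controlled cache outside an outer-controlled one
        else:
            return False  # unknown attribute name
    return True


# The three implementations described in Examples E2-2 to E2-4.
EXAMPLES = {
    "E2-2": ["inner", "inner", "outer"],
    "E2-3": ["inner", "inner", "inner"],
    "E2-4": ["inner", "outer", "outer"],
}
```

All three example mappings pass the check. A mapping such as `["inner", "outer", "inner"]` is rejected, because it would place a cache controlled by the Inner attributes outside a cache controlled by the Outer attributes.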


Multi-register loads and stores that access Normal memory 


For all instructions that load or store more than one general-purpose register from an Exception level there is no 
requirement for the memory system beyond the PE to be able to identify the size of the elements accessed by these 
load or store instructions. 


For all instructions that load or store more than one general-purpose register from an Exception level the order in 
which the registers are accessed is not defined by the architecture. 


For all instructions that load or store one or more registers from the SIMD and floating-point register file from an 
Exception level there is no requirement for the memory system beyond the PE to be able to identify the size of the 
element accessed by these load or store instructions. 


E2.8.2 Device memory 


The Device memory type attributes define memory locations where an access to the location can cause side-effects, 
or where the value returned for a load can vary depending on the number of loads performed. Typically, the Device 
memory attributes are used for memory-mapped peripherals and similar locations. 


The attributes for ARMv8 Device memory are: 
Gathering Identified as G or nG, see Gathering on page E2-2348. 
Reordering Identified as R or nR, see Reordering on page E2-2349. 


Early Write Acknowledgement hint 
Identified as E or nE, see Early Write Acknowledgement on page E2-2350. 


The ARMv8 Device memory types are: 


Device-nGnRnE Device non-Gathering, non-Reordering, No Early write acknowledgement. 


Equivalent to the Strongly-ordered memory type in earlier versions of the architecture. 


Device-nGnRE Device non-Gathering, non-Reordering, Early Write Acknowledgement. 


Equivalent to the Device memory type in earlier versions of the architecture. 


Device-nGRE Device non-Gathering, Reordering, Early Write Acknowledgement. 


ARMv8 adds this memory type to the translation table formats found in earlier versions of
the architecture. The use of barriers is required to order accesses to Device-nGRE memory. 


The Device-nGRE memory type is introduced into the AArch32 translation table formats 
when the PE is using the Long Descriptor Translation Table format. 


Device-GRE Device Gathering, Reordering, Early Write Acknowledgement. 
ARMv8 adds this memory type to the translation table formats found in earlier versions of
the architecture. Device-GRE memory has the fewest constraints. It behaves similarly to
Normal memory, with the restriction that speculative accesses to Device-GRE memory are
forbidden.


The Device-GRE memory type is introduced into the AArch32 translation table formats 
when the PE is using the Long Descriptor Translation Table format. 


Collectively these are referred to as any Device memory type. Going down the list, the memory types are described
as getting weaker; conversely, going up the list, the memory types are described as getting stronger.





Note 


As the list of types shows, these additional attributes are hierarchical. For example, a memory location that 
permits Gathering must also permit Reordering and Early Write Acknowledgement. 


The architecture does not require an implementation to distinguish between each of these memory types and 
ARM recognizes that not all implementations will do so. The subsection that describes each of the attributes, 
describes the implementation rules for the attribute. 


Earlier versions of the ARM architecture defined the following memory types: 
—  Strongly-ordered memory. This is the equivalent of the Device-nGnRnE memory type. 


— Device memory. This is the equivalent of the Device-nGnRE memory type. 
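

The naming scheme and the strength ordering of the four Device memory types can be illustrated with a short decoder. This is a sketch under stated assumptions: the helper names are hypothetical, and the decoding relies only on the G/nG, R/nR, and E/nE naming convention described above, not on any register encoding.

```python
# Device memory types listed from strongest to weakest, as in the text.
DEVICE_TYPES = ["Device-nGnRnE", "Device-nGnRE", "Device-nGRE", "Device-GRE"]


def attributes(name):
    """Decode a type name into (gathering, reordering, early_ack) booleans."""
    suffix = name[len("Device-"):]
    flags = []
    for letter in "GRE":
        if suffix.startswith("n" + letter):
            flags.append(False)
            suffix = suffix[2:]
        elif suffix.startswith(letter):
            flags.append(True)
            suffix = suffix[1:]
        else:
            raise ValueError(f"unrecognised Device memory type: {name}")
    return tuple(flags)


def gathering_implies_rest(name):
    """A location that permits Gathering must also permit R and E."""
    gathering, reordering, early_ack = attributes(name)
    return (not gathering) or (reordering and early_ack)


def is_weaker(a, b):
    """True if type a appears further down the list than type b."""
    return DEVICE_TYPES.index(a) > DEVICE_TYPES.index(b)
```

For example, `attributes("Device-nGnRE")` decodes to non-Gathering, non-Reordering, with Early Write Acknowledgement, and every listed type satisfies `gathering_implies_rest`.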





All of these memory types have the following properties: 


Speculative data accesses are not permitted to any memory location with any Device memory attribute. This 
means that each memory access to any Device memory type must be one that would be generated by a simple 
sequential execution of the program. 


An exception to this applies: 


— Reads generated by the Advanced SIMD and floating-point instructions can access bytes that are not
explicitly accessed by the instruction if the bytes accessed are in a 16-byte window, aligned to
16 bytes, that contains at least one byte that is explicitly accessed by the instruction.


Note 


An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on 
page E2-2328 might be abandoned as a result of an exception being taken during the sequence of accesses. 
On return from the exception the instruction is restarted, and therefore one or more of the memory locations 
might be accessed multiple times. This can result in repeated accesses to a location where the program only 
defines a single access. For this reason, ARM strongly recommends that no accesses to Device memory are 
performed from a single instruction that spans the boundary of a translation granule or which in some other 
way could lead to some of the accesses being aborted. 





— Write speculation that is visible to other observers is prohibited for all memory types. 





A write to a memory location with any Device memory attribute completes in finite time. This means that it 
is globally observed for all observers in the system in finite time. 


If a location with any Device memory attribute changes without an explicit write by an observer, this change 
must also be globally observed for all observers in the system in finite time. Such a change might occur in a 
peripheral location that holds status information. 


A completed write to a memory location with any Device memory attribute is globally observed for all 
observers in finite time without the need for explicit maintenance. 


Data accesses to memory locations are coherent for all observers in the system, and correspondingly are 
treated as being Outer Shareable. 


A memory location with any Device memory attribute cannot be allocated into a cache. 


Writes to a memory location with any Device memory attribute must reach the endpoint for that address in 
the memory system in finite time. Typically, the endpoint is a peripheral or some physical memory. 
° All accesses to memory with any Device memory attribute must be aligned. Any unaligned access generates
an Alignment fault at the first stage of translation that defined the location as being Device. 





Note 


In the Non-secure PL1&0 translation regime in systems where HCR.TGE==1 and HCR.DC==0, any 
Alignment fault that results from the fact that all locations are treated as Device is a fault at the first stage of 
translation. This causes the value of HSR.ISS[24] to be 0.





° Hardware does not prevent speculative instruction fetches from a memory location with any of the Device 
memory attributes unless the memory location is also marked as Execute-never for all Exception levels. 


Note 


This means that to prevent speculative instruction fetches from memory locations with Device memory 
attributes, any location that is assigned any Device memory type must also be marked as Execute-never for 
all Exception levels. Failure to mark a memory location with any Device memory attribute as Execute-never 
for all Exception levels is a programming error. 








See also Memory access restrictions on page E2-2351. 
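

The 16-byte window exception for reads generated by Advanced SIMD and floating-point instructions, described in the properties above, can be made concrete with a small calculation. The sketch below is one reading of that rule for a single contiguous access, and the function name is hypothetical. It returns every byte address that lies in a 16-byte aligned window containing at least one explicitly accessed byte.

```python
def permitted_read_bytes(addr, size):
    """Byte addresses that a SIMD/FP read of size bytes at addr might touch.

    The result covers each 16-byte aligned window that contains at least
    one byte explicitly accessed by the instruction. size must be positive.
    """
    first_window = addr & ~0xF               # window holding the first byte
    last_window = (addr + size - 1) & ~0xF   # window holding the last byte
    return range(first_window, last_window + 16)
```

Under this reading, an 8-byte read at address 0x1004 might touch any byte from 0x1000 to 0x100F, while an 8-byte read at 0x100C spans two windows and might touch any byte from 0x1000 to 0x101F.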


The memory types for Translation table walks cannot be defined as any Device memory type within the TCR. For 
the Non-secure EL1&0 translation regime, the memory accesses made during a stage 1 translation table walk are
subject to a stage 2 translation, and as a result of this second stage of translation, the accesses from the first stage 
translation table walk might be made to memory locations with any Device memory type. These accesses might be 
made speculatively. When the value of the HCR.PTW bit is 1, a stage 2 permission fault is generated if a first stage 
translation table walk is made to any Device memory type. 


For instruction fetches, if branches cause the program counter to point to an area of memory with the Device 
attribute which is not marked as Execute-never for the current Exception level, an implementation can either: 


° Treat the instruction fetch as if it were to a memory location with the Normal Non-cacheable attribute.
° Take a Permission fault.

Gathering 

In the Device memory attribute: 

G Indicates that the location has the Gathering attribute. 

nG Indicates that the location does not have the Gathering attribute, meaning it is non-Gathering. 


The Gathering attribute determines whether it is permissible for either: 


° Multiple memory accesses of the same type, read or write, to the same memory location to be merged into a 
single transaction. 


° Multiple memory accesses of the same type, read or write, to different memory locations to be merged into 
a single memory transaction on an interconnect. 


For memory types with the Gathering attribute, either of these behaviors is permitted, provided that the ordering and 
coherency rules of the memory location are followed. 


For memory types with the non-Gathering attribute, neither of these behaviors is permitted. As a result: 


° The number of memory accesses that are made corresponds to the number that would be generated by a
simple sequential execution of the program. 


° All accesses occur at their programmed size, except that there is no requirement for the memory system beyond
the PE to be able to identify the elements accessed by multi-register Load/Store instructions. See 
Multi-register loads and stores that access Device memory on page E2-2350. 







Gathering between memory accesses separated by a memory barrier that affects those memory accesses is not 
permitted. This applies if one memory access is in Group A and one memory access is in Group B. That is, gathering 
is not permitted between a memory access in Group A and a memory access in Group B if the two accesses are 
separated by a barrier that affects at least one of the accesses. 


Gathering between two memory accesses generated by a Load-Acquire/Store-Release is not permitted. 


A read from a memory location with the non-Gathering attribute cannot come from a cache or a buffer, but must 
come from the endpoint for that address in the memory system. Typically this is a peripheral or physical memory. 


Note 


° A read from a memory location with the Gathering attribute can come from intermediate buffering of a 
previous write, provided that: 





— The accesses are not separated by a DMB or DSB barrier that affects both of the accesses, for example if 
one access is in Group A and the other is in Group B. 


— The accesses are not separated by other ordering constructions that require that the accesses are in 
order. Such a construction might be a combination of Load-Acquire and Store-Release. 


— The accesses are not generated by a Store-Release instruction. 


° The ARM architecture only defines programmer visible behavior. Therefore, gathering can be performed if 
a programmer cannot tell whether gathering has occurred. 





An implementation is permitted to perform an access with the Gathering attribute in a manner consistent with the 
requirements specified by the non-Gathering attribute.


An implementation is not permitted to perform an access with the non-Gathering attribute in a manner consistent
with the relaxations allowed by the Gathering attribute. 


Reordering 


In the Device memory attribute: 
R Indicates that the location has the Reordering attribute. 


nR Indicates that the location does not have the Reordering attribute, meaning it is non-Reordering. 


For all memory types with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral 
of IMPLEMENTATION DEFINED size, as defined by the peripheral, must be the same order that occurs in a simple 
sequential execution of the program. That is, the accesses appear in program order. This ordering applies to all 
accesses using any of the memory types with the non-Reordering attribute. As a result, if there is a mixture of 
Device-nGnRE and Device-nGnRnE accesses to the same peripheral, these occur in program order. If the memory 
accesses are not to a peripheral, then this attribute imposes no restrictions. 


Note 


° The IMPLEMENTATION DEFINED size of the single peripheral is the same as applies for the ordering guarantee 
provided by the DMB instruction. 





° The ARM architecture only defines programmer visible behavior. Therefore, reordering can be performed if
a programmer cannot tell whether reordering has occurred. 





An implementation is permitted to perform an access with the Reordering attribute in a manner consistent with the 
requirements specified by the non-Reordering attribute. 


An implementation is not permitted to perform an access with the non-Reordering attribute in a manner
consistent with the relaxations allowed by the Reordering attribute.


The non-Reordering attribute does not require any additional ordering, other than that which applies to Normal 
memory, between: 


° Accesses with the non-Reordering attribute and accesses with the Reordering attribute. 
° Accesses with the non-Reordering attribute and accesses to Normal memory.


° Accesses with the non-Reordering attribute and accesses to different peripherals of IMPLEMENTATION 
DEFINED size. 
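

The scope of the non-Reordering guarantee can be summarized as a predicate over a pair of accesses. The following sketch is illustrative only. The `Access` record and the peripheral identifiers are hypothetical, and the predicate covers only the ordering that the non-Reordering attribute itself requires, not ordering from barriers or other constructs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Access:
    memory_type: str           # for example "Device-nGnRE" or "Normal"
    peripheral: Optional[str]  # hypothetical peripheral id, None if not a peripheral


# Device memory types that carry the non-Reordering (nR) attribute.
NON_REORDERING = {"Device-nGnRnE", "Device-nGnRE"}


def program_order_required(first, second):
    """True if the nR attribute alone requires arrival in program order."""
    return (
        first.memory_type in NON_REORDERING
        and second.memory_type in NON_REORDERING
        and first.peripheral is not None
        and first.peripheral == second.peripheral
    )
```

A Device-nGnRE access followed by a Device-nGnRnE access to the same peripheral must arrive in program order. The predicate returns False when either access has the Reordering attribute, is to Normal memory, or targets a different peripheral.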


Early Write Acknowledgement 


In the Device memory attribute: 
E Indicates that the location has the Early Write Acknowledgement attribute. 
nE Indicates that the location has the No Early Write Acknowledgement attribute. 


Early Write Acknowledgement is a hint to the platform memory system. Assigning the No Early Write 
Acknowledgement attribute to a Device memory location recommends that only the endpoint of the write access 
returns a write acknowledgement of the access, and that no earlier point in the memory system returns a write
acknowledgement. This means that a DSB barrier, executed by the PE that performed the write to the No Early Write
Acknowledgement location, completes only after the write has reached its endpoint in the memory system. 
Typically, this endpoint is a peripheral or physical memory. 


When the Early Write Acknowledgement attribute is assigned to a Device memory location, there is no such 
recommendation for the handling of accesses to that location. 





Note 


° The Early Write Acknowledgement hint has no effect on the ordering rules. The purpose of signaling No Early
Write Acknowledgement is to signal to the interconnect that the peripheral requires the ability to signal the
acknowledgement. The No Early Write Acknowledgement signal also provides an additional semantic that can be
interpreted by the driver that is accessing the peripheral.


° This attribute is treated as a hint, as the exact nature of the interconnects attached to a PE is outside the scope
of the ARM architecture definition, and not all interconnects provide a mechanism to ensure that a write has 
reached the physical endpoint of the memory system. 


° ARM recommends that writes with the No Early Write Acknowledgement hint are used for PCIe
configuration writes. However, the mechanisms by which PCIe configuration writes are identified are
IMPLEMENTATION DEFINED.


° ARM strongly recommends that the Early Write Acknowledgement hint is not ignored by a PE, but is made
available for use by the system. 





Because the No Early Write Acknowledgement attribute is a hint: 


° An implementation is permitted to perform an access with the Early Write Acknowledgement attribute in a
manner consistent with the requirements specified by the No Early Write Acknowledgement attribute. 


° An implementation is permitted to perform an access with the No Early Write Acknowledgement attribute in 
a manner consistent with the relaxations allowed by the Early Write Acknowledgement attribute. 


Multi-register loads and stores that access Device memory 


For all instructions that load or store more than one general-purpose register there is no requirement for the memory 
system beyond the PE to be able to identify the size of the elements accessed by these load and store instructions. 


For all instructions that load or store one or more registers from the SIMD and floating-point register file there is 
no requirement for the memory system beyond the PE to be able to identify the size of the element accessed by these 
load and store instructions. 


For an LDRD, STRD, or LDM instruction with a register list that includes the PC, or an STM instruction with a register list 
that includes the PC, the order in which the registers are accessed is not defined by the architecture. 


For a load or store of an Advanced SIMD element or structure, the order in which the registers are accessed is not 
defined by the architecture. 
For a VLDM, VSTM, LDM, or STM instruction with a register list that does not include the PC, all registers are accessed in
ascending address order for Device accesses with the non-Reordering attribute. 


E2.8.3 Memory access restrictions 


The following restrictions apply to memory accesses: 


For accesses to any two bytes, p and q, that are generated by the same instruction:


The bytes p and q must have the same memory type and Shareability attributes, otherwise the results
are CONSTRAINED UNPREDICTABLE. For example, an LDC, LDM, LDRD, STC, STM, or STRD instruction, or an
unaligned load or store that spans the boundary between Normal memory and Device memory is
CONSTRAINED UNPREDICTABLE.


Except for possible differences in the cache allocation hints, ARM deprecates having different 
cacheability attributes for bytes p and q. 


For the permitted CONSTRAINED UNPREDICTABLE behavior, see Crossing a page boundary with different 
memory types or Shareability attributes on page K1-5465. 


If the accesses of an instruction that causes multiple accesses to any type of Device memory cross a 4KB 
address boundary then behavior is CONSTRAINED UNPREDICTABLE and Crossing a 4KB boundary with a 
Device access on page K1-5466 describes the permitted behaviors. 





Note 


The boundary referred to is between two Device memory regions that are both of 4KB and aligned to 
4KB. 


This restriction means it is important that an access to a volatile memory device is not made using a 
single instruction that crosses a 4KB address boundary. 


ARM expects this restriction to constrain the placing of volatile memory devices in the system 
memory map, rather than expecting a compiler to be aware of the alignment of memory accesses. 
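

The 4KB restriction above can be checked mechanically when laying out a memory map. The following is a minimal sketch in Python, not part of the architecture: the helper name crosses_4kb_boundary and the constant PAGE_4KB are hypothetical, and the function only tests whether the byte range touched by a single access spans a 4KB-aligned boundary.

```python
PAGE_4KB = 4096  # size of the address boundary the restriction refers to


def crosses_4kb_boundary(address: int, size: int) -> bool:
    """Return True if the bytes [address, address + size) span a 4KB boundary."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_page = address // PAGE_4KB
    last_page = (address + size - 1) // PAGE_4KB
    return first_page != last_page
```

For example, an 8-byte access at 0xFFC touches bytes on both sides of 0x1000, so the helper reports a crossing, whereas the same access at 0xFF8 does not.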


E2.9 Mismatched memory attributes 


In the ARMv8 architecture mismatched memory attributes are controlled by privileged software. For more 
information, see Chapter G4 The AArch32 Virtual Memory System Architecture. 


Physical memory locations are accessed with mismatched attributes if all accesses to the location do not use a 
common definition of all of the following attributes of that location: 


° Memory type, Device or Normal. 
° Shareability. 
° Cacheability, for the same level of the inner or outer cache, but excluding any cache allocation hints. 


Collectively these are referred to as memory attributes. 
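

The definition above can be expressed as a comparison over the attributes that each alias uses for a location. The following is a minimal sketch in Python under stated assumptions: the MemAttrs record, its field names, and the string values are hypothetical, chosen only for illustration. The comparison covers memory type, Shareability, and Inner and Outer Cacheability, and deliberately ignores the cache allocation hint.

```python
from typing import Iterable, NamedTuple


class MemAttrs(NamedTuple):
    """Attributes one alias uses for a location (hypothetical representation)."""
    memory_type: str          # "Device" or "Normal"
    shareability: str         # for example "Inner Shareable"
    inner_cacheability: str   # for example "Write-Back" or "Non-cacheable"
    outer_cacheability: str
    allocation_hint: str = ""  # excluded from the comparison


def is_mismatched(aliases: Iterable[MemAttrs]) -> bool:
    """Return True if the aliases do not share a common definition of the
    memory type, Shareability, and Cacheability attributes."""
    compared = {(a.memory_type, a.shareability,
                 a.inner_cacheability, a.outer_cacheability) for a in aliases}
    return len(compared) > 1
```

Two aliases that differ only in the allocation hint compare as matching, while a Write-Back alias and a Non-cacheable alias of the same location compare as mismatched.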





Note 


The terms location and memory location refer to any byte within the current coherency granule and are used 
interchangeably. 





When a memory location is accessed with mismatched attributes the only software visible effects are one or more 
of the following: 
° Uniprocessor semantics for reads and writes to that memory location might be lost. This means: 


— A read of the memory location by one agent might not return the value most recently written to that 
memory location by the same agent. 


— Multiple writes to the memory location by one agent with different memory attributes might not be 
ordered in program order. 


° There might be a loss of coherency when multiple agents attempt to access a memory location. 
° There might be a loss of properties derived from the memory type, as described in later bullets in this section. 
° If all Load-Exclusive/Store-Exclusive instructions executed across all threads to access a given memory 
location do not use consistent memory attributes, the exclusive monitor state becomes UNKNOWN. 
° Bytes written without the Write-Back cacheable attribute within the same Write-Back granule as bytes 
written with the Write-Back cacheable attribute might have their values reverted to the old values as a result 
of cache Write-Back. 


The loss of properties associated with mismatched memory type attributes refers only to the following properties of 
Device memory that are additional to the properties of Normal memory: 


° Prohibition of speculative read accesses. 
° Prohibition on Gathering. 
° Prohibition on Re-ordering. 


For the following situations, when a physical memory location is accessed with mismatched attributes, a more 
restrictive set of behaviors applies. The description of each situation also describes the behaviors that apply: 


1. If the only memory type mismatch associated with a memory location across all users of the memory location 
is between different types of Device memory, then all accesses might take the properties of the weakest 
Device memory type. 


2. Any agent that reads that memory location using the same common definition of the Shareability and 
Cacheability attributes is guaranteed to access it coherently, to the extent required by that common definition 
of the memory attributes, only if all of the following conditions are met: 


° All aliases to the memory location with write permission both use a common definition of the 
Shareability and Cacheability attributes for the memory location, and either: 


— Have the Inner Cacheability attribute the same as the Outer Cacheability attribute. 
— In the Non-secure PL1&0 translation regime, have HCR2.MIOCNCE set to 0. 


° All aliases to a memory location use a definition of the Shareability attributes that encompasses all the 
agents with permission to access the location. 


3. The possible software-visible effects caused by mismatched attributes for a memory location are defined 
more precisely if all of the mismatched attributes define the memory location as one of: 
° Any Device memory type. 
° Normal Inner Non-cacheable, Outer Non-cacheable memory. 


In these cases, the only permitted software-visible effects of the mismatched attributes are one or more of the 
following: 


° Possible loss of properties derived from the memory type when multiple agents attempt to access the 
memory location. 


° Possible reordering of memory transactions to the same memory location with different memory 
attributes, potentially leading to a loss of coherency or uniprocessor semantics. Any possible loss of 
coherency or uniprocessor semantics can be avoided by inserting DMB barrier instructions between 
accesses to the same memory location that might use different attributes. 


Where there is a loss of the uniprocessor semantics, ordering, or coherency, the following approaches can be used: 


1. If the mismatched attributes for a memory location all assign the same Shareability attribute to the location, 
any loss of uniprocessor semantics, ordering, or coherency within a Shareability domain can be avoided by 
use of software cache management. To do so, software must use the techniques that are required for the 
software management of the ordering or coherency of cacheable locations between agents in different 
shareability domains. This means: 


° Before writing to a location not using the Write-Back attribute, software must invalidate, or clean, a 
location from the caches if any agent might have written to the location with the Write-Back attribute. 
This avoids the possibility of overwriting the location with stale data. 


° After writing to a location with the Write-Back attribute, software must clean the location from the 
caches, to make the write visible to external memory. 

° Before reading the location with a cacheable attribute, software must invalidate the location from the 
caches, to ensure that any value held in the caches reflects the last value made visible in external 
memory. 

° Executing a DMB barrier instruction, with scope that applies to the common Shareability of the accesses, 
between any accesses to the same memory location that use different attributes. 


Note 


In AArch32 state, cache maintenance instructions can only be accessed from an Exception level that is higher 
than EL0, and therefore require a system call. For information on system calls, see Exception-generating and 
exception-handling instructions on page F1-2386. For information about the AArch32 cache maintenance 
instructions, see AArch32 cache and branch predictor support on page G3-3989. 








In all cases: 


° Location refers to any byte within the current coherency granule. 

° A clean and invalidate instruction can be used instead of a clean instruction, or instead of an invalidate 
instruction. 

° In the sequences outlined in this section, all cache maintenance instructions and memory transactions 
must be completed, or ordered by the use of barrier operations, if they are not naturally ordered by the 
use of a common address, see Ordering of cache and branch predictor maintenance instructions on 
page G3-4007. 


Note 


With software management of coherency, race conditions can cause loss of data. A race condition occurs 
when different agents write simultaneously to bytes that are in the same location, and the invalidate, write, 
clean sequence of one agent overlaps with the equivalent sequence of another agent. A race condition also 
occurs if the first operation of either sequence is a clean, rather than an invalidate. 


2. If the mismatched attributes for a location mean that multiple cacheable accesses to the location might be 
made with different Shareability attributes, then ordering and coherency are guaranteed only if: 


° Each PE that accesses the location with a cacheable attribute performs a clean and invalidate of the 
location before and after accessing that location. 


° A DMB barrier with scope that covers the full Shareability of the accesses is placed between any accesses 
to the same memory location that use different attributes. 





Note 


The Note in rule 1 of this list, about possible race conditions, also applies to this rule. 
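

The reverting of bytes by a cache Write-Back, and the effect of cleaning before a write that does not use the Write-Back attribute, can be illustrated with a toy model. The following Python sketch is an illustration only and makes several assumptions: ToyLine and its methods are hypothetical names, the model holds a single coherency granule of four bytes, and it ignores barriers, multiple agents, and all timing.

```python
class ToyLine:
    """One coherency granule in memory, with an optional cached copy."""

    def __init__(self, size=4):
        self.memory = [0] * size
        self.cached = None   # copy of the granule while it is held in the cache
        self.dirty = False

    def write_back_store(self, offset, value):
        """Store using the Write-Back attribute: updates only the cached copy."""
        if self.cached is None:
            self.cached = list(self.memory)
        self.cached[offset] = value
        self.dirty = True

    def uncached_store(self, offset, value):
        """Store without the Write-Back attribute: goes straight to memory."""
        self.memory[offset] = value

    def clean(self):
        """Write a dirty cached copy back to memory."""
        if self.cached is not None and self.dirty:
            self.memory = list(self.cached)
            self.dirty = False

    def invalidate(self):
        """Discard the cached copy without writing it back."""
        self.cached = None
        self.dirty = False

    def evict(self):
        """Model a later cache Write-Back of the whole granule."""
        self.clean()
        self.invalidate()


def without_maintenance():
    line = ToyLine()
    line.write_back_store(0, 0xAA)
    line.uncached_store(1, 0xBB)   # written to memory only
    line.evict()                   # whole granule written back from the cache
    return line.memory


def with_maintenance():
    line = ToyLine()
    line.write_back_store(0, 0xAA)
    line.clean()                   # clean and invalidate before the uncached write
    line.invalidate()
    line.uncached_store(1, 0xBB)
    line.evict()
    return line.memory
```

In this model, without_maintenance() leaves byte 1 holding its old value because the later Write-Back overwrites it, while with_maintenance() preserves both writes.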





In addition, if multiple agents attempt to use Load-Exclusive or Store-Exclusive instructions to access a location, 
and the accesses from the different agents have different memory attributes associated with the location, the 
exclusive monitor state becomes UNKNOWN. 


ARM strongly recommends that software does not use mismatched attributes for aliases of the same location. An 
implementation might not optimize the performance of a system that uses mismatched aliases. 


E2.10 Synchronization and semaphores 


ARMv8 provides non-blocking synchronization of shared memory, using synchronization primitives. The 
information in this section about memory accesses by synchronization primitives applies to accesses to both Normal 
and Device memory. 


Note 


Use of the ARMv8 synchronization primitives scales for multiprocessing system designs. 








Table E2-6 shows the synchronization primitives and the associated CLREX instruction. 


Table E2-6 Synchronization primitives and associated instruction 

Function              A32/T32 Instruction 

Load-Exclusive 
    Byte              LDREXB, LDAEXB 
    Halfword          LDREXH, LDAEXH 
    Word              LDREX, LDAEX 
    Doubleword        LDREXD, LDAEXD 

Store-Exclusive 
    Byte              STREXB, STLEXB 
    Halfword          STREXH, STLEXH 
    Word              STREX, STLEX 
    Doubleword        STREXD, STLEXD 

Clear-Exclusive       CLREX 





The model for the use of a Load-Exclusive/Store-Exclusive instruction pair accessing a non-aborting memory 
address x is: 


° The Load-Exclusive instruction reads a value from memory address x. 


° The corresponding Store-Exclusive instruction succeeds in writing back to memory address x only if no other 
observer, process, or thread has performed a more recent store to address x. The Store-Exclusive instruction 
returns a status bit that indicates whether the memory write succeeded. 


A Load-Exclusive instruction marks a small block of memory for exclusive access. The size of the marked block is 
IMPLEMENTATION DEFINED, see Marking and the size of the marked memory block on page E2-2361. A 
Store-Exclusive instruction to any address in the marked block clears the marking. 





Note 


In this section, the term PE includes any observer that can generate a Load-Exclusive or a Store-Exclusive 
instruction. 





The following sections give more information: 

° Exclusive access instructions and Non-shareable memory locations on page E2-2356. 
° Exclusive access instructions and shareable memory locations on page E2-2358. 

° Marking and the size of the marked memory block on page E2-2361. 
° Context switch support on page E2-2361. 
° Load-Exclusive and Store-Exclusive instruction usage restrictions on page E2-2362. 
° Use of WFE and SEV instructions by spin-locks on page E2-2364. 


E2.10.1 Exclusive access instructions and Non-shareable memory locations 


For memory locations for which the Shareability attribute is Non-shareable, the exclusive access instructions rely 
on a local monitor that marks any address from which the PE executes a Load-Exclusive instruction. Any 
non-aborted attempt by the same PE to use a Store-Exclusive instruction to modify any address is guaranteed to 
clear the marking. 


A Load-Exclusive instruction performs a load from memory, and: 
° The executing PE marks the physical memory address for exclusive access. 
° The local monitor of the executing PE transitions to the Exclusive Access state. 


A Store-Exclusive instruction performs a conditional store to memory that depends on the state of the local monitor: 


If the local monitor is in the Exclusive Access state 


° If the address of the Store-Exclusive instruction is the same as the address that has been 
marked in the monitor by an earlier Load-Exclusive instruction, then the store occurs. 
Otherwise, it is IMPLEMENTATION DEFINED whether the store occurs. 


° A status value is returned to a register: 
— If the store took place the status value is 0. 


— Otherwise, the status value is 1. 


° The local monitor of the executing PE transitions to the Open Access state. 


If the local monitor is in the Open Access state 


° No store takes place. 
° A status value of 1 is returned to a register. 
° The local monitor remains in the Open Access state. 


The Store-Exclusive instruction defines the register to which the status value is returned. 


When a PE writes using any instruction other than a Store-Exclusive instruction: 


° If the write is to a physical address that is not marked as Exclusive Access by its local monitor and that local 
monitor is in the Exclusive Access state it is IMPLEMENTATION DEFINED whether the write affects the state of 
the local monitor. 


° If the write is to a physical address that is marked as Exclusive Access by its local monitor it is 
IMPLEMENTATION DEFINED whether the write affects the state of the local monitor. 


It is IMPLEMENTATION DEFINED whether a store to a marked physical address causes a mark in the local monitor to 
be cleared if that store is by an observer other than the one that caused the physical address to be marked. 


Figure E2-4 on page E2-2357 shows the state machine for the local monitor and the effect of each of the operations 
shown in the figure. 


[State diagram with two states, Open Access and Exclusive Access. Transitions:] 

Open Access, remaining in Open Access: 
    CLREX, StoreExcl(x), Store(x) 
Open Access to Exclusive Access: 
    LoadExcl(x) 
Exclusive Access to Open Access: 
    CLREX, StoreExcl(Marked_address), StoreExcl(!Marked_address), Store(Marked_address)*, Store(!Marked_address)* 
Exclusive Access, remaining in Exclusive Access: 
    LoadExcl(x), Store(Marked_address)*, Store(!Marked_address)* 

Operations marked * are possible alternative IMPLEMENTATION DEFINED options. 

In the diagram: 
    LoadExcl represents any Load-Exclusive instruction. 
    StoreExcl represents any Store-Exclusive instruction. 
    Store represents any other store instruction. 

Any LoadExcl operation updates the marked address to the most significant bits of the address x used for the operation. 


Figure E2-4 Local monitor state machine diagram 


For more information about marking see Marking and the size of the marked memory block on page E2-2361. 
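

The local monitor behavior described in this section can be sketched as a small state machine. The following Python model is an illustration under stated assumptions, not a definition of any implementation: the class LocalMonitor, its parameters, and the dictionary used as memory are hypothetical. The store_on_mismatch flag selects one of the IMPLEMENTATION DEFINED options for a Store-Exclusive to an unmarked address, and stores that are not Store-Exclusive instructions are not modeled because their effect on the monitor is IMPLEMENTATION DEFINED.

```python
OPEN = "Open Access"
EXCLUSIVE = "Exclusive Access"


class LocalMonitor:
    """Toy model of the local monitor of one PE."""

    def __init__(self, granule_bits=4, store_on_mismatch=False):
        self.state = OPEN
        self.marked_block = None
        self.granule_bits = granule_bits            # log2 of marked block size in bytes
        self.store_on_mismatch = store_on_mismatch  # IMPLEMENTATION DEFINED choice

    def _block(self, address):
        # Marking ignores the least significant bits of the address.
        return address >> self.granule_bits

    def load_excl(self, memory, address):
        """Load-Exclusive: mark the block and enter Exclusive Access."""
        self.marked_block = self._block(address)
        self.state = EXCLUSIVE
        return memory.get(address, 0)

    def store_excl(self, memory, address, value):
        """Store-Exclusive: return 0 if the store took place, otherwise 1."""
        if self.state == OPEN:
            return 1                     # no store, monitor stays Open Access
        matches = self._block(address) == self.marked_block
        self.state = OPEN                # always leaves Exclusive Access
        self.marked_block = None
        if matches or self.store_on_mismatch:
            memory[address] = value
            return 0
        return 1

    def clrex(self):
        """Clear-Exclusive: return to Open Access."""
        self.state = OPEN
        self.marked_block = None
```

A Load-Exclusive followed by a Store-Exclusive to the same address returns status 0 in this model, and a second Store-Exclusive without an intervening Load-Exclusive returns status 1.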


Note 


For the local monitor state machine, as shown in Figure E2-4: 





° The IMPLEMENTATION DEFINED options for the local monitor are consistent with the local monitor being 
constructed so that it does not hold any physical address, but instead treats any access as matching the address 
of the previous Load-Exclusive instruction. 


° A local monitor implementation can be unaware of Load-Exclusive and Store-Exclusive instructions from 
other PEs. 
° The architecture does not require a load instruction, by another PE, that is not a Load-Exclusive instruction, 
to have any effect on the local monitor. 


° It is IMPLEMENTATION DEFINED whether the transition from Exclusive Access to Open Access state occurs 
when the Store or StoreExcl is from another observer. 





Changes to the local monitor state resulting from speculative execution 


The architecture permits a local monitor to transition to the Open Access state as a result of speculation, or from 
some other cause. This is in addition to the transitions to Open Access state caused by the architectural execution 
of an operation shown in Figure E2-4. 


An implementation must ensure that: 


° The local monitor cannot be seen to transition to the Exclusive Access state except as a result of the 
architectural execution of one of the operations shown in Figure E2-4. 


° Any transition of the local monitor to the Open Access state not caused by the architectural execution of an 
operation shown in Figure E2-4 must not indefinitely delay forward progress of execution. 


E2.10.2 Exclusive access instructions and shareable memory locations 


In the context of this section, a shareable memory location is a memory location that has, or is treated as if it has, a 
Shareability attribute of Inner Shareable or Outer Shareable. 


For shareable memory locations, exclusive access instructions rely on: 


° A local monitor for each PE in the system, that marks any address from which the PE executes a 
Load-Exclusive. The local monitor operates as described in Exclusive access instructions and Non-shareable 
memory locations on page E2-2356, except that for shareable memory any Store-Exclusive is then subject to 
checking by the global monitor if it is described in that section as doing at least one of the following: 


— Updating memory. 
— Returning a status value of 0. 


The local monitor can ignore accesses from other PEs in the system. 


° A global monitor that marks a physical address as exclusive access for a particular PE. This marking is used 
later to determine whether a Store-Exclusive to that address that has not been failed by the local monitor can 
occur. Any successful write to the marked block by any other observer in the Shareability domain of the 
memory location is guaranteed to clear the marking. For each PE in the system, the global monitor: 

— Can hold at least one marked block. 


— Maintains a state machine for each marked block it can hold. 


Note 


For each PE, the architecture only requires global monitor support for a single marked address. Any situation 
that might benefit from the use of multiple marked addresses on a single PE is CONSTRAINED 
UNPREDICTABLE, see Load-Exclusive and Store-Exclusive instruction usage restrictions on page E2-2362. 











Note 


The global monitor can either reside in a block that is part of the hardware on which the PE executes or exist as a 
secondary monitor at the memory interfaces. The IMPLEMENTATION DEFINED aspects of the monitors mean that the 
global monitor and local monitor can be combined into a single unit, provided that the unit performs the global 
monitor and local monitor functions defined in this manual. 





For shareable memory locations, in some implementations and for some memory types, the properties of the global 
monitor require functionality outside the PE. Some system implementations might not implement this functionality 
for all locations of memory. In particular, this can apply to: 


° Any type of memory in the system implementation that does not support hardware cache coherency. 


° Non-cacheable memory, or memory treated as Non-cacheable, in an implementation that does support 
hardware cache coherency. 


In such a system, it is defined by the system: 





° Whether the global monitor is implemented. 
° If the global monitor is implemented, which address ranges or memory types it monitors. 


Note 


To support the use of the Load-Exclusive/Store-Exclusive mechanism when address translation is disabled, a system 
might define at least one location of memory, of at least the size of the translation granule, in the system memory 
map to support the global monitor for all PEs within a common Inner Shareable domain. However, this is not an 
architectural requirement. Therefore, architecturally-compliant software that requires mutual exclusion must not 
rely on using the Load-Exclusive/Store-Exclusive mechanism, and must instead use a software algorithm such as 
Lamport’s Bakery algorithm to achieve mutual exclusion. 





Because implementations can choose which memory types are treated as Non-cacheable, the only memory types for 
which it is architecturally guaranteed that a global exclusive monitor is implemented are: 


° Inner Shareable, Inner Write-Back, Outer Write-Back Normal memory with Read allocation hint and Write 
allocation hint and not transient. 
° Outer Shareable, Inner Write-Back, Outer Write-Back Normal memory with Read allocation hint and Write 
allocation hint and not transient. 


The set of memory types that support atomic instructions must include all of the memory types for which a global 
monitor is implemented. 


If the global monitor is not implemented for an address range or memory type, then performing a 
Load-Exclusive/Store-Exclusive instruction to such a location has one or more of the following effects: 


° The instruction generates an external abort. 
° The instruction generates an IMPLEMENTATION DEFINED MMU fault. This is reported using the Fault Status 
code of: 


— DFSR.STATUS = 0b110101 when using the Long-descriptor translation table format. The fault can also 
be reported in the HSR.ISS[5:0] field for exceptions to Hyp mode. 


— DFSR.FS = 0b10101 when using the Short-descriptor translation table format. 


If the IMPLEMENTATION DEFINED MMU fault is generated for the Non-secure PL1&0 translation regime then: 


— If the fault is generated because of the memory type defined in the first stage of translation, or if the 
second stage of translation is disabled, then this is a first stage fault and the exception is taken to EL1. 


— Otherwise, the fault is a second stage fault and the exception is taken to EL2. 


The priority of this fault is IMPLEMENTATION DEFINED. 


° The instruction is treated as a NOP. 


° The Load-Exclusive instruction is treated as if it were accessing a Non-shareable location, but the state of the 
local monitor becomes UNKNOWN. 


° The Store-Exclusive instruction is treated as if it were accessing a Non-shareable location, but the state of the 
local monitor becomes UNKNOWN. 


° The value held in the result register of the Store-Exclusive instruction becomes UNKNOWN. 


In addition, for write transactions generated by non-PE observers that do not implement exclusive accesses or other 
atomic access mechanisms, the effect that writes have on the global and local monitors used by an ARM PE is 
IMPLEMENTATION DEFINED. The writes might not clear the global monitors of other PEs for: 


° Some address ranges. 


° Some memory types. 


Operation of the global monitor 


A Load-Exclusive instruction from shareable memory performs a load from memory, and causes the physical 
address of the access to be marked as exclusive access for the requesting PE. This access can also cause the 
exclusive access mark to be removed from any other physical address that has been marked by the requesting PE. 


Note 


The global monitor only supports a single outstanding exclusive access to shareable memory for each PE. 








A Load-Exclusive instruction by one PE has no effect on the global monitor state for any other PE. 


A Store-Exclusive instruction performs a conditional store to memory: 


° The store is guaranteed to succeed only if the physical address accessed is marked as exclusive access for the 
requesting PE and both the local monitor and the global monitor state machines for the requesting PE are in 
the Exclusive Access state. In this case: 


— A status value of 0 is returned to a register to acknowledge the successful store. 
— The final state of the global monitor state machine for the requesting PE is IMPLEMENTATION DEFINED. 
— If the address accessed is marked for exclusive access in the global monitor state machine for any other 
PE then that state machine transitions to Open Access state. 


° If no address is marked as exclusive access for the requesting PE, the store does not succeed: 


— A status value of 1 is returned to a register to indicate that the store failed. 
— The global monitor is not affected and remains in Open Access state for the requesting PE. 


° If a different physical address is marked as exclusive access for the requesting PE, it is IMPLEMENTATION 
DEFINED whether the store succeeds or not: 


— If the store succeeds a status value of 0 is returned to a register, otherwise a value of 1 is returned. 


— If the global monitor state machine for the PE was in the Exclusive Access state before the 
Store-Exclusive instruction it is IMPLEMENTATION DEFINED whether that state machine transitions to 
the Open Access state. 


The Store-Exclusive instruction defines the register to which the status value is returned. 


In a shared memory system, the global monitor implements a separate state machine for each PE in the system. The 
state machine for accesses to shareable memory by PE(n) can respond to all the shareable memory accesses visible 
to it. This means it responds to: 


° Accesses generated by PE(n). 


° Accesses generated by the other observers in the Shareability domain of the memory location. These accesses 
are identified as (!n). 


In a shared memory system, the global monitor implements a separate state machine for each observer that can 
generate a Load-Exclusive or a Store-Exclusive instruction in the system. 


Clear global monitor event 


Whenever the global monitor state for a PE changes from Exclusive access to Open access, an event is generated 
and held in the Event register for that PE. This register is used by the Wait for Event mechanism, see Wait For Event 
and Send Event on page G1-3872. 


Figure E2-5 shows the state machine for PE(n) in a global monitor. 


[State diagram with two states, Open Access and Exclusive Access. Transitions:] 

Open Access, remaining in Open Access: 
    CLREX(n), CLREX(!n), LoadExcl(x,!n), StoreExcl(x,n), StoreExcl(x,!n), Store(x,n), Store(x,!n) 
Open Access to Exclusive Access: 
    LoadExcl(x,n) 
Exclusive Access to Open Access: 
    StoreExcl(Marked_address,!n)+, Store(Marked_address,!n), StoreExcl(Marked_address,n)*, 
    StoreExcl(!Marked_address,n)*, Store(Marked_address,n)*, CLREX(n)* 
Exclusive Access, remaining in Exclusive Access: 
    LoadExcl(x,n), StoreExcl(Marked_address,!n)+, Store(!Marked_address,n), StoreExcl(Marked_address,n)*, 
    StoreExcl(!Marked_address,n)*, Store(Marked_address,n)*, CLREX(n)*, StoreExcl(!Marked_address,!n), 
    Store(!Marked_address,!n), CLREX(!n) 

+ StoreExcl(Marked_address,!n) clears the monitor only if the StoreExcl updates memory. 
Operations marked * are possible alternative IMPLEMENTATION DEFINED options. 

In the diagram: 
    LoadExcl represents any Load-Exclusive instruction. 
    StoreExcl represents any Store-Exclusive instruction. 
    Store represents any other store instruction. 


Any LoadExcl operation updates the marked address to the most significant bits of the address x used for the operation. 


Figure E2-5 Global monitor state machine diagram for PE(n) in a multiprocessor system 


For more information about marking see Marking and the size of the marked memory block on page E2-2361. 
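

The interaction between two PEs through the global monitor can be sketched in the same style. The following Python model is an illustration under stated assumptions: GlobalMonitor, atomic_add, and the dictionary used as memory are hypothetical names, the local monitors are ignored, and where the behavior is IMPLEMENTATION DEFINED the model picks one permitted option and says so in a comment.

```python
class GlobalMonitor:
    """Toy model of a global monitor with one marked block for each PE."""

    def __init__(self, granule_bits=4):
        self.granule_bits = granule_bits
        self.marked = {}   # PE id -> marked block; present only in Exclusive Access

    def _block(self, address):
        return address >> self.granule_bits

    def load_excl(self, memory, pe, address):
        """Load-Exclusive by pe: replaces any earlier mark held by that PE."""
        self.marked[pe] = self._block(address)
        return memory.get(address, 0)

    def store_excl(self, memory, pe, address, value):
        """Store-Exclusive by pe: return 0 on success, otherwise 1."""
        block = self._block(address)
        if self.marked.get(pe) != block:
            # No mark: the store fails. Different mark: IMPLEMENTATION DEFINED,
            # this model chooses to fail and to leave the monitor unchanged.
            return 1
        memory[address] = value
        # Every PE marking this block moves to Open Access. For the requesting
        # PE the final state is IMPLEMENTATION DEFINED; this model chooses Open.
        self.marked = {p: b for p, b in self.marked.items() if b != block}
        return 0

    def store(self, memory, pe, address, value):
        """Any other store: clears the marks that other PEs hold on the block."""
        memory[address] = value
        block = self._block(address)
        for other in [p for p, b in self.marked.items() if p != pe and b == block]:
            del self.marked[other]


def atomic_add(monitor, memory, pe, address, delta):
    """Retry a Load-Exclusive/Store-Exclusive pair until the store succeeds."""
    while True:
        old = monitor.load_excl(memory, pe, address)
        if monitor.store_excl(memory, pe, address, old + delta) == 0:
            return old
```

If two PEs both perform a Load-Exclusive of the same address and then both attempt a Store-Exclusive, the first store succeeds and clears the mark of the other PE, so the second store returns status 1 and that PE must retry.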


Note 


For the global monitor state machine, as shown in Figure E2-5 on page E2-2360: 





° The architecture does not require a load instruction by another PE, that is not a Load-Exclusive instruction, 
to have any effect on the global monitor. 


° Whether a Store-Exclusive instruction successfully updates memory or not depends on whether the address 
accessed matches the marked shareable memory address for the PE issuing the Store-Exclusive instruction, 
and whether the local and global monitors are in the exclusive state. For this reason, Figure E2-5 on 
page E2-2360 only shows how the operations by (!n) cause state transitions of the state machine for PE(n). 


° A Load-Exclusive instruction can only update the marked shareable memory address for the PE issuing the 
Load-Exclusive instruction. 


° When the global monitor is in the Exclusive Access state, it is IMPLEMENTATION DEFINED whether a CLREX 
instruction causes the global monitor to transition from Exclusive Access to Open Access state. 


° It is IMPLEMENTATION DEFINED: 


— Whether a modification to a Non-shareable memory location can cause a global monitor to transition 
from Exclusive Access to Open Access state. 


— Whether a Load-Exclusive instruction to a Non-shareable memory location can cause a global monitor 
to transition from Open Access to Exclusive Access state. 





E2.10.3 Marking and the size of the marked memory block 


When a Load-Exclusive instruction is executed, the resulting marked block ignores the least significant bits of the 
64-bit memory address. 


When a Load-Exclusive instruction is executed, a marked block of size 2^a bytes is created by ignoring the least 
significant bits of the memory address. A marked address is any address within this marked block. The size of the 
marked memory block is called the Exclusives reservation granule. The Exclusives reservation granule is 
IMPLEMENTATION DEFINED in the range 4 - 512 words. 





Note 
This definition means that the Exclusives reservation granule is: 
° 4 words in an implementation where a is 4. 
° 512 words in an implementation where a is 11. 


For example, in an implementation where a is 4, a successful LDREXB of address 0x341B4 defines a marked block 
using bits[47:4] of the address. This means that the four words of memory from 0x341B0 to 0x341BF are marked for 
exclusive access. 





In some implementations the CTR identifies the Exclusives reservation granule, see CTR. Otherwise, software must 
assume that the maximum Exclusives reservation granule, 512 words, is implemented. 
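

The arithmetic behind the marked block can be sketched as follows. This is a minimal Python illustration, assuming 4-byte words and a value of a in the range 4 to 11 as described above; the function names marked_block and granule_words are hypothetical.

```python
def marked_block(address: int, a: int) -> tuple:
    """Return the first and last byte address of the 2^a-byte marked block
    that contains the given address."""
    if not 4 <= a <= 11:
        raise ValueError("a must be in the range 4 to 11")
    size = 1 << a
    base = address & ~(size - 1)   # ignore the least significant a bits
    return base, base + size - 1


def granule_words(a: int) -> int:
    """Size of the Exclusives reservation granule in 4-byte words."""
    return (1 << a) // 4
```

With a equal to 4, marked_block(0x341B4, 4) gives the range 0x341B0 to 0x341BF, matching the example in the Note, and granule_words returns 4 for a equal to 4 and 512 for a equal to 11.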


E2.10.4 Context switch support 


An exception return clears the local monitor. As a result, performing a CLREX instruction as part of a context switch 
is not required in most situations. 





Note 


Context switching is not an application level operation. However, this information is included here to complete the 
description of the exclusive operations. 


E2.10.5 Load-Exclusive and Store-Exclusive instruction usage restrictions 


The Load-Exclusive and Store-Exclusive instructions are intended to work together as a pair, for example a 
LDREX/STREX pair or a LDREXB/STREXB pair. To support different implementations of these functions, software must 
follow the notes and restrictions given in this subsection. 


The following notes describe use of a Load-Exclusive/Store-Exclusive pair, LoadExcl/StoreExcl, to indicate the use 
of any of the Load-Exclusive/Store-Exclusive instruction pairs shown in Table E2-6 on page E2-2355. In this 
context, a LoadExcl/StoreExcl pair comprises two instructions in the same thread of execution: 


° The exclusives support a single outstanding exclusive access for each PE thread that is executed. The 
architecture makes use of this by not requiring an address or size check as part of the IsExclusiveLocal() 
function. If the target virtual address of a StoreExcl is different from the virtual address of the preceding 
LoadExcl instruction in the same thread of execution, behavior can be CONSTRAINED UNPREDICTABLE with 
the following behavior: 


— The StoreExcl either passes or fails, and the status value returned by the StoreExcl is UNKNOWN. 


Note 


This means the StoreExcl might pass for some instances of a LoadExcl/StoreExcl pair with mismatched 
addresses, and fail for other instances of a LoadExcl/StoreExcl pair with mismatched addresses. 








— The data at the address accessed by the LoadExcl, and at the address accessed by the StoreExcl, is 
UNKNOWN. 


This means software can rely on a LoadExcl/StoreExcl pair to eventually succeed only if the LoadExcl and the 
StoreExcl are executed with the same virtual address. 


° If two StoreExcl instructions are executed without an intervening LoadExcl instruction the second StoreExcl 
instruction returns a status value of 1. This means that: 

— ARM recommends that, in a given thread of execution, every StoreExcl instruction has a preceding 
LoadExcl instruction associated with it. 


— It is not necessary for every LoadExcl instruction to have a subsequent StoreExcl instruction. 


° An implementation of the Load-Exclusive and Store-Exclusive instructions can require that, in any thread of 
execution, the transaction size of a StoreExcl instruction is the same as the transaction size of the preceding 
LoadExcl instruction executed in that thread. If the transaction size of a StoreExcl instruction is different from 
the preceding LoadExcl instruction in the same thread of execution, behavior can be CONSTRAINED 
UNPREDICTABLE with the following behavior: 


— The StoreExcl either passes or fails, and the status value returned by the StoreExcl is UNKNOWN. 


Note 


This means the StoreExcl might pass for some instances of a LoadExcl/StoreExcl pair with mismatched 
transaction sizes, and fail for other instances of a LoadExcl/StoreExcl pair with mismatched transaction 
sizes. 








— The block of data of the size of the larger of the transaction sizes used by the LoadExcl/StoreExcl pair 
at the address accessed by the LoadExcl/StoreExcl pair, is UNKNOWN. 


This means software can rely on a LoadExcl/StoreExcl pair to eventually succeed only if the LoadExcl and the 
StoreExcl have the same transaction size. 


• LoadExcl/StoreExcl loops are guaranteed to make forward progress only if, for any LoadExcl/StoreExcl loop 
within a single thread of execution, the software meets all of the following conditions: 

1 Between the Load-Exclusive and the Store-Exclusive, there are no explicit memory accesses, 
preloads, direct or indirect System register writes, address translation instructions, cache or TLB 
maintenance instructions, exception generating instructions, exception returns, or indirect 
branches. 

2 Between the Store-Exclusive returning a failing result and the retry of the corresponding 
Load-Exclusive: 

  • There are no stores or PLDW instructions to any address within the Exclusives reservation 
    granule accessed by the Store-Exclusive. 

  • There are no loads or preloads to any address within the Exclusives reservation granule 
    accessed by the Store-Exclusive that use a different VA alias to that address. 

  • There are no direct or indirect System register writes, other than changes to the flag fields 
    in the CPSR or FPSCR, caused by data processing or comparison instructions. 

  • There are no direct or indirect address translation instructions, cache or TLB maintenance 
    instructions, exception generating instructions, exception returns, or indirect branches. 

  • All loads and stores are to a block of contiguous virtual memory of not more than 512 
    bytes in size. 


The exclusive monitor can be cleared at any time without an application-related cause, provided that such 
clearing is not systematically repeated so as to prevent the forward progress in finite time of at least one of 
the threads that is accessing the exclusive monitor. 


• Implementations can benefit from keeping the LoadExcl and StoreExcl operations close together in a single 
thread of execution. This minimizes the likelihood of the exclusive monitor state being cleared between the 
LoadExcl instruction and the StoreExcl instruction. Therefore, for best performance, ARM strongly 
recommends a limit of 128 bytes between LoadExcl and StoreExcl instructions in a single thread of execution. 


• The architecture sets an upper limit of 2048 bytes on the Exclusives reservation granule that can be marked 
as exclusive. For performance reasons, ARM recommends that objects that are accessed by exclusive 
accesses are separated by the size of the exclusive reservations granule. This is a performance guideline 
rather than a functional requirement. 


• After taking a Data Abort exception, the state of the exclusive monitors is UNKNOWN. 


• For the memory location accessed by a LoadExcl/StoreExcl pair, if the memory attributes for a StoreExcl 
instruction are different from the memory attributes for the preceding LoadExcl instruction in the same thread 
of execution, behavior is CONSTRAINED UNPREDICTABLE. Where this occurs because the translation of the 
accessed address changes between the LoadExcl instruction and the StoreExcl instruction, the CONSTRAINED 
UNPREDICTABLE behavior is as follows: 


— The StoreExcl either passes or fails, and the status value returned by the StoreExcl is UNKNOWN. 


Note 


This means the StoreExcl might pass for some instances of a LoadExcl/StoreExcl pair with changed 
memory attributes, and fail for other instances of a LoadExcl/StoreExcl pair with changed memory 
attributes. 








— The data at the address accessed by the StoreExcl is UNKNOWN. 





Note 


Another bullet point in this list covers the case where the memory attributes of a LoadExcl/StoreExcl pair 
differ as a result of using different virtual addresses with different attributes that point to the same physical 
address. 





• The effect of a data or unified cache invalidate, clean, or clean and invalidate instruction on a local or global 
exclusive monitor that is in the Exclusive Access state is CONSTRAINED UNPREDICTABLE, and the instruction 
might clear the monitor, or it might leave it in the Exclusive Access state. For address-based maintenance 
instructions, this also applies to the monitors of other PEs in the same Shareability domain as the PE 
executing the cache maintenance instruction, as determined by the Shareability domain of the address being 
maintained. 
Note 


ARM strongly recommends that implementations ensure that the use of such maintenance instructions by a 
PE in the Non-secure state cannot cause a denial of service on a PE in the Secure state. 








• If the mapping of the virtual to physical address is changed between the LoadExcl instruction and the 
StoreExcl instruction, and the change is performed using a break-before-make sequence as described in 
Using break-before-make when updating translation table entries on page G4-4094, if the StoreExcl is 
performed after another write to the same physical address as the StoreExcl, and that other write was 
performed after the old translation was properly invalidated and that invalidation was properly synchronized, 
then the StoreExcl will not pass its monitor check. 


Note 
ARM expects that, in many implementations, either: 


— The TLB invalidation will clear either the local or global monitor. 
— The physical address will be checked between the LoadExcl and StoreExcl. 








Note 


In the event of repeatedly-contending LoadExcl/StoreExcl instruction sequences from multiple PEs, an 
implementation must ensure that forward progress is made by at least one PE. 
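
The single-reservation behavior described in these notes can be illustrated with a deliberately simplified model. The following Python sketch is not architectural pseudocode. The class `LocalMonitor`, its methods, and `atomic_increment` are hypothetical names chosen for illustration. The sketch models only one outstanding exclusive access per thread and the rule that a StoreExcl with no preceding LoadExcl returns a status value of 1. It does not model address or size checks, the global monitor, or any CONSTRAINED UNPREDICTABLE behavior.

```python
class LocalMonitor:
    """Toy model of one PE's local exclusive monitor (hypothetical names)."""

    def __init__(self, memory):
        self.memory = memory      # dict mapping address -> value
        self.exclusive = False    # True while in the Exclusive Access state

    def load_excl(self, addr):
        # A LoadExcl marks the single outstanding exclusive access.
        self.exclusive = True
        return self.memory.get(addr, 0)

    def store_excl(self, addr, value):
        # Returns the status value: 0 if the store passes, 1 if it fails.
        if not self.exclusive:
            return 1
        self.memory[addr] = value
        self.exclusive = False    # the reservation is consumed by the store
        return 0

    def clear(self):
        # Models the monitor being cleared between the two instructions.
        self.exclusive = False


def atomic_increment(monitor, addr):
    # Retry loop: repeat the LoadExcl/StoreExcl pair until the store passes.
    while True:
        value = monitor.load_excl(addr)
        if monitor.store_excl(addr, value + 1) == 0:
            return value + 1
```

In this model, a second `store_excl` call with no intervening `load_excl` always returns 1, which is why every StoreExcl in the retry loop is paired with its own LoadExcl.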








E2.10.6 Use of WFE and SEV instructions by spin-locks 


ARMv8 provides Wait For Event, Send Event, and Send Event Local instructions, WFE, SEV, SEVL, that can assist with 
reducing power consumption and bus contention caused by PEs repeatedly attempting to obtain a spin-lock. These 
instructions can be used at the application level, but a complete understanding of what they do depends on a system 
level understanding of exceptions. They are described in Wait For Event and Send Event on page G1-3872. 
However, in ARMv8, when the global monitor for a PE changes from Exclusive Access state to Open Access state, 
an event is generated. 


Note 


This is equivalent to issuing an SEVL instruction on the PE for which the monitor state has changed. It removes the 
need for spinlock code to include an SEV instruction after clearing a spinlock. 
Chapter F1 
The AArch32 Instruction Sets Overview 


This chapter describes the T32 and A32 instruction sets. It contains the following sections: 


• Support for instructions in different versions of the ARM architecture on page F1-2368. 
• Unified Assembler Language on page F1-2369. 
• Branch instructions on page F1-2371. 
• Data-processing instructions on page F1-2372. 
• PSTATE and banked register access instructions on page F1-2380. 
• Load/store instructions on page F1-2381. 
• Load/store multiple instructions on page F1-2384. 
• Miscellaneous instructions on page F1-2385. 
• Exception-generating and exception-handling instructions on page F1-2386. 
• System register access instructions on page F1-2387. 
• Advanced SIMD and floating-point load/store instructions on page F1-2388. 
• Advanced SIMD and floating-point register transfer instructions on page F1-2390. 
• Advanced SIMD data-processing instructions on page F1-2391. 
• Floating-point data-processing instructions on page F1-2399. 
F1.1 Support for instructions in different versions of the ARM architecture 


This manual describes the ARM AArch32 instruction sets, T32 and A32, for the ARMv8 architecture. Therefore, it 
indicates how any options or extensions in the ARMv8 architecture affect the available instructions. Also, 
Appendix K5 ARMv8 Changes to the T32 and A32 Instruction Sets provides information for those migrating from 
earlier versions of the architecture. 


This manual does not provide any information about which T32 and A32 instructions were supported in specific 
earlier versions of the architecture. For this information, see the ARM® Architecture Reference Manual, ARMv7-A 
and ARMv7-R edition. 
F1.2 Unified Assembler Language 


This manual uses the ARM Unified Assembler Language (UAL). This assembly language syntax provides a 
canonical form for all T32 and A32 instructions. 


UAL describes the syntax for the mnemonic and the operands of each instruction. In addition, it assumes that 
instructions and data items can be given labels. It does not specify the syntax to be used for labels, nor what 
assembler directives and options are available. See your assembler documentation for these details. 


Most earlier ARM assembly language mnemonics are still supported as synonyms, as described in the instruction 
details. 


Note 


Most earlier T32 assembly language mnemonics are not supported. 








UAL includes instruction selection rules that specify which instruction encoding is selected when more than one 
can provide the required functionality. For example, both 16-bit and 32-bit encodings exist for an ADD R0, R1, R2 
instruction. The most common instruction selection rule is that when both a 16-bit encoding and a 32-bit encoding 
are available, the 16-bit encoding is selected, to optimize code density. 


Syntax options exist to override the normal instruction selection rules and ensure that a particular encoding is 
selected. These are useful when disassembling code, to ensure that subsequent assembly produces the original code, 
and in some other situations. 


Conditional instructions 


For maximum portability of UAL assembly language between the T32 and A32 instruction sets, ARM recommends 
that: 


• IT instructions are written before conditional instructions in the correct way for the T32 instruction set. 

• When assembling to the A32 instruction set, assemblers check that any IT instructions are correct, but do not 
generate any code for them. 


Although other T32 instructions are unconditional, all instructions that are made conditional by an IT instruction 
must be written with a condition. These conditions must match the conditions imposed by the IT instruction. For 
example, an ITTEE EQ instruction imposes the EQ condition on the first two following instructions, and the NE 
condition on the next two. Those four instructions must be written with EQ, EQ, NE and NE conditions respectively. 
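
The relationship between an IT mnemonic and the conditions that must be written on the following instructions can be sketched as follows. This is an illustrative Python helper, not part of UAL: the function name `it_conditions` and the `INVERSE` table are hypothetical, and the sketch omits the AL condition and the HS and LO synonyms. It assumes the mnemonic is IT followed by up to three T (Then) or E (Else) letters.

```python
# Each condition paired with its logical inverse, used for the E (Else) slots.
INVERSE = {
    "EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS",
    "MI": "PL", "PL": "MI", "VS": "VC", "VC": "VS",
    "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}


def it_conditions(mnemonic, cond):
    """Return the condition each instruction in the IT block must be written with."""
    if not mnemonic.startswith("IT") or len(mnemonic) > 5:
        raise ValueError("expected IT followed by up to three T or E letters")
    if cond not in INVERSE:
        raise ValueError("unsupported condition " + cond)
    conds = [cond]  # the first instruction always uses the base condition
    for letter in mnemonic[2:]:
        if letter == "T":
            conds.append(cond)
        elif letter == "E":
            conds.append(INVERSE[cond])
        else:
            raise ValueError("unexpected letter " + letter)
    return conds
```

For the example in the text, `it_conditions("ITTEE", "EQ")` returns `["EQ", "EQ", "NE", "NE"]`.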


Some instructions cannot be made conditional by an IT instruction. Some instructions can be conditional if they are 
the last instruction in the IT block, but not otherwise. 


The branch instruction encodings that include a condition code field cannot be made conditional by an IT 
instruction. If the assembler syntax indicates a conditional branch that correctly matches a preceding IT instruction, 
it is assembled using a branch instruction encoding that does not include a condition code field. 





Note 


ARMv8 deprecates many uses of IT, for performance reasons, see Partial deprecation of IT on page K5-5531. As 
described in that section, an implementation can include ITD controls that disable those uses of IT, making them 
UNDEFINED. 





Use of labels in UAL instruction syntax 


The UAL syntax for some instructions includes the label of an instruction or a literal data item that is at a fixed offset 
from the instruction being specified. The assembler must: 


1. Calculate the PC or Align(PC, 4) value of the instruction. The PC value of an instruction is its address plus 4 
for a T32 instruction, or plus 8 for an A32 instruction. The Align(PC, 4) value of an instruction is its PC value 
ANDed with 0xFFFFFFFC to force it to be word-aligned. There is no difference between the PC and 
Align(PC, 4) values for an A32 instruction, but there can be for a T32 instruction. 

2. Calculate the offset from the PC or Align(PC, 4) value of the instruction to the address of the labeled 
instruction or literal data item. 

3. Assemble a PC-relative encoding of the instruction, that is, one that reads its PC or Align(PC, 4) value and 
adds the calculated offset to form the required address. 


Note 


For instructions that can encode a subtraction operation, if the instruction cannot encode the calculated offset 
but can encode minus the calculated offset, the instruction encoding specifies a subtraction of minus the 
calculated offset. 
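
The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names `pc_value`, `align_pc4`, and `literal_offset` are hypothetical, addresses are treated as 32-bit unsigned values, and no check is made that the resulting offset fits any particular encoding.

```python
def pc_value(address, instruction_set):
    """PC value of an instruction: its address plus 4 (T32) or plus 8 (A32)."""
    if instruction_set == "T32":
        return (address + 4) & 0xFFFFFFFF
    if instruction_set == "A32":
        return (address + 8) & 0xFFFFFFFF
    raise ValueError("instruction_set must be 'T32' or 'A32'")


def align_pc4(address, instruction_set):
    """Align(PC, 4): the PC value ANDed with 0xFFFFFFFC."""
    return pc_value(address, instruction_set) & 0xFFFFFFFC


def literal_offset(address, label_address, instruction_set):
    """Offset from Align(PC, 4) to a labeled literal data item.

    Returns (add, imm). add is True when imm is added and False when it is
    subtracted. An offset of 0 is reported as an addition.
    """
    offset = label_address - align_pc4(address, instruction_set)
    return (offset >= 0, abs(offset))
```

For a T32 instruction at address 0x8002, the PC value is 0x8006 and Align(PC, 4) is 0x8004, so a literal at 0x8010 gives an added offset of 12. For an A32 instruction at 0x8000, a literal at 0x8000 gives a subtracted offset of 8.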








The syntax of the following instructions includes a label: 


B, BL, and BLX (immediate). The assembler syntax for these instructions always specifies the label of the 
instruction that they branch to. Their encodings specify a sign-extended immediate offset that is added to the 
PC value of the instruction to form the target address of the branch. 


CBNZ and CBZ. The assembler syntax for these instructions always specifies the label of the instruction that they 
branch to. Their encodings specify a zero-extended immediate offset that is added to the PC value of the 
instruction to form the target address of the branch. They do not support backward branches. 


LDR, LDRB, LDRD, LDRH, LDRSB, LDRSH, PLD, PLDW, PLI, and VLDR. The normal assembler syntax of these load 
instructions can specify the label of a literal data item that is to be loaded. The encodings of these instructions 
specify a zero-extended immediate offset that is either added to or subtracted from the Align(PC, 4) value of 
the instruction to form the address of the data item. A few such encodings perform a fixed addition or a fixed 
subtraction and must only be used when that operation is required, but most contain a bit that specifies 
whether the offset is to be added or subtracted. 


When the assembler calculates an offset of 0 for the normal syntax of these instructions, it must assemble an 
encoding that adds 0 to the Align(PC, 4) value of the instruction. Encodings that subtract 0 from the Align(PC, 
4) value cannot be specified by the normal syntax. 


There is an alternative syntax for these instructions that specifies the addition or subtraction and the 
immediate offset explicitly. In this syntax, the label is replaced by [PC, #+/-<imm>], where: 


+/- Is + or omitted to specify that the immediate offset is to be added to the Align(PC, 4) value, or - 
if it is to be subtracted. 


<imm> Is the immediate offset. 


This alternative syntax makes it possible to assemble the encodings that subtract 0 from the Align(PC, 4) 
value, and to disassemble them to a syntax that can be re-assembled correctly. 


ADR. The normal assembler syntax for this instruction can specify the label of an instruction or literal data item 
whose address is to be calculated. Its encoding specifies a zero-extended immediate offset that is either added 
to or subtracted from the Align(PC, 4) value of the instruction to form the address of the data item, and some 
opcode bits that determine whether it is an addition or subtraction. 


When the assembler calculates an offset of 0 for the normal syntax of this instruction, it must assemble the 
encoding that adds 0 to the Align(PC, 4) value of the instruction. The encoding that subtracts 0 from the 
Align(PC, 4) value cannot be specified by the normal syntax. 


There is an alternative syntax for this instruction that specifies the addition or subtraction and the immediate 
value explicitly, by writing them as additions ADD <Rd>, PC, #<imm> or subtractions SUB <Rd>, PC, #<imm>. 
This alternative syntax makes it possible to assemble the encoding that subtracts 0 from the Align(PC, 4) 
value, and to disassemble it to a syntax that can be re-assembled correctly. 





Note 


ARM recommends that where possible, software avoids using: 

• The alternative syntax for the ADR, LDC, LDR, LDRB, LDRD, LDRH, LDRSB, LDRSH, PLD, PLI, PLDW, and VLDR instructions. 

• The encodings of these instructions that subtract 0 from the Align(PC, 4) value. 
F1.3 Branch instructions 


Table F1-1 summarizes the branch instructions in the T32 and A32 instruction sets. In addition to providing for 
changes in the flow of execution, some branch instructions can change instruction set. 


Table F1-1 Branch instructions 

Instruction                                             See                                   Range, T32       Range, A32 
Branch to target address                                B on page F5-2607                     ±16MB            ±32MB 
Compare and Branch on Nonzero,                          CBNZ, CBZ on page F5-2630             0-126 bytes      (a) 
Compare and Branch on Zero 
Call a subroutine                                       BL, BLX (immediate) on page F5-2623   ±16MB            ±32MB 
Call a subroutine, change instruction set (b)           BL, BLX (immediate) on page F5-2623   ±16MB            ±32MB 
Call a subroutine, optionally change instruction set    BLX (register) on page F5-2625        Any              Any 
Branch to target address, change instruction set        BX on page F5-2627                    Any              Any 
Change to Jazelle state                                 BXJ on page F5-2629                   -                - 
Table Branch (byte offsets)                             TBB, TBH on page F5-3143              0-510 bytes      (a) 
Table Branch (halfword offsets)                         TBB, TBH on page F5-3143              0-131070 bytes   (a) 

a. These instructions do not exist in the A32 instruction set. 

b. The range is determined by the instruction set of the BLX instruction, not of the instruction it branches to. 


Branches to loaded and calculated addresses can be performed by LDR, LDM and data-processing instructions. For 
details see Load/store instructions on page F1-2381, Load/store multiple instructions on page F1-2384, Standard 
data-processing instructions on page F1-2372, and Shift instructions on page F1-2374. 


In addition to the branch instructions shown in Table F1-1: 


° In the A32 instruction set, a data-processing instruction that targets the PC behaves as a branch instruction. 
For more information, see Data-processing instructions on page F1-2372. 


° In the T32 and A32 instruction sets, a load instruction that targets the PC behaves as a branch instruction. For 
more information, see Load/store instructions on page F1-2381. 
F1.4 Data-processing instructions 


Core data-processing instructions belong to one of the following groups: 

• Standard data-processing instructions. 
  These instructions perform basic data-processing operations, and share a common format with some 
  variations. 
• Shift instructions on page F1-2374. 
• Multiply instructions on page F1-2374. 
• Saturating instructions on page F1-2376. 
• Saturating addition and subtraction instructions on page F1-2376. 
• Packing and unpacking instructions on page F1-2377. 
• Parallel addition and subtraction instructions on page F1-2378. 
• Divide instructions on page F1-2379. 
• Miscellaneous data-processing instructions on page F1-2379. 


For related Advanced SIMD and floating-point instructions see Advanced SIMD data-processing instructions on 
page F1-2391 and Floating-point data-processing instructions on page F1-2399. 


F1.4.1 Standard data-processing instructions 


These instructions generally have a destination register Rd, a first operand register Rn, and a second operand. The 
second operand can be another register Rm, or an immediate constant. 


If the second operand is an immediate constant, it can be: 


• Encoded directly in the instruction. 

• A modified immediate constant that uses 12 bits of the instruction to encode a range of constants. T32 and 
A32 instructions have slightly different ranges of modified immediate constants. For more information, see 
Modified immediate constants in T32 instructions on page F2-2420 and Modified immediate constants in A32 
instructions on page F2-2422. 


If the second operand is another register, it can optionally be shifted in any of the following ways: 


LSL     Logical Shift Left by 1-31 bits. 
LSR     Logical Shift Right by 1-32 bits. 
ASR     Arithmetic Shift Right by 1-32 bits. 
ROR     Rotate Right by 1-31 bits. 
RRX     Rotate Right with Extend. For details see Shift and rotate operations on page E1-2290. 


In T32 code, the amount to shift by is always a constant encoded in the instruction. In A32 code, the amount to shift 
by is either a constant encoded in the instruction, or the value of a register, Rs. 
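
The five shift types can be illustrated on 32-bit values with a short Python sketch. The function names are hypothetical and this is not the architectural pseudocode referenced above. The sketch computes only the shifted result, not the carry-out, and assumes operands are given as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def _check(amount, low, high):
    # Reject shift amounts outside the range permitted for the shift type.
    if not low <= amount <= high:
        raise ValueError("shift amount out of range")


def lsl(x, n):
    _check(n, 1, 31)
    return (x << n) & MASK32


def lsr(x, n):
    _check(n, 1, 32)
    return (x & MASK32) >> n


def asr(x, n):
    _check(n, 1, 32)
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32   # Python >> on a negative int copies the sign


def ror(x, n):
    _check(n, 1, 31)
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def rrx(x, carry_in):
    # Rotate right by one bit through the carry flag: carry enters bit 31.
    return ((carry_in & 1) << 31) | ((x & MASK32) >> 1)
```

Note that LSR and ASR accept a shift of 32, which produces 0 for LSR and a result made entirely of copies of the sign bit for ASR.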


For instructions other than CMN, CMP, TEQ, and TST, the result of the data-processing operation is placed in the 
destination register. In the A32 instruction set, the destination register can be the PC, causing the result to be treated 
as a branch address. In the T32 instruction set, this is only permitted for some 16-bit forms of the ADD and MOV 
instructions. 


These instructions can optionally set the condition flags, according to the result of the operation. If they do not set 
the flags, existing flag settings from a previous instruction are preserved. 


Table F1-2 on page F1-2373 summarizes the main data-processing instructions in the T32 and A32 instruction sets. 
Generally, each of these instructions is described in three sections in Chapter F2 About the T32 and A32 Instruction 
Descriptions, one section for each of the following: 


INSTRUCTION (immediate) where the second operand is a modified immediate constant. 
INSTRUCTION (register) where the second operand is a register, or a register shifted by a constant. 


INSTRUCTION (register-shifted register) where the second operand is a register shifted by a value obtained from 
another register. These are only available in the A32 instruction set. 
Table F1-2 Standard data-processing instructions 

Instruction                    Mnemonic   Notes 
Add with Carry                 ADC        - 
Add                            ADD        T32 instruction set permits use of a modified immediate constant or a 
                                          zero-extended 12-bit immediate constant. 
Form PC-relative Address       ADR        First operand is the PC. Second operand is an immediate constant. T32 
                                          instruction set uses a zero-extended 12-bit immediate constant. Operation 
                                          is an addition or a subtraction. 
Bitwise AND                    AND        - 
Bitwise Bit Clear              BIC        - 
Compare Negative               CMN        Sets flags. Like ADD but with no destination register. 
Compare                        CMP        Sets flags. Like SUB but with no destination register. 
Bitwise Exclusive OR           EOR        - 
Copy operand to destination    MOV        Has only one operand, with the same options as the second operand in most 
                                          of these instructions. If the operand is a shifted register, the 
                                          instruction is an LSL, LSR, ASR, or ROR instruction instead. For details 
                                          see Shift instructions on page F1-2374. 
                                          The T32 and A32 instruction sets permit use of a modified immediate 
                                          constant or a zero-extended 16-bit immediate constant. 
Bitwise NOT                    MVN        Has only one operand, with the same options as the second operand in most 
                                          of these instructions. 
Bitwise OR NOT                 ORN        Not available in the A32 instruction set. 
Bitwise OR                     ORR        - 
Reverse Subtract               RSB        Subtracts first operand from second operand. This permits subtraction 
                                          from constants and shifted registers. 
Reverse Subtract with Carry    RSC        Not available in the T32 instruction set. 
Subtract with Carry            SBC        - 
Subtract                       SUB        T32 instruction set permits use of a modified immediate constant or a 
                                          zero-extended 12-bit immediate constant. 
Test Equivalence               TEQ        Sets flags. Like EOR but with no destination register. 
Test                           TST        Sets flags. Like AND but with no destination register. 
F1.4.2 Shift instructions 


Table F1-3 lists the shift instructions in the T32 and A32 instruction sets. 


In the A32 instruction set only, the destination register of these instructions can be the PC, causing the result to be 
treated as an address to branch to. 


Table F1-3 Shift instructions 

Instruction                 See 
Arithmetic Shift Right      ASR (immediate) on page F5-2599 
                            ASR (register) on page F5-2601 
                            ASRS (immediate) on page F5-2603 
                            ASRS (register) on page F5-2605 
Logical Shift Left          LSL (immediate) on page F5-2788 
                            LSL (register) on page F5-2790 
                            LSLS (immediate) on page F5-2792 
                            LSLS (register) on page F5-2794 
Logical Shift Right         LSR (immediate) on page F5-2796 
                            LSR (register) on page F5-2798 
                            LSRS (immediate) on page F5-2800 
                            LSRS (register) on page F5-2802 
Rotate Right                ROR (immediate) on page F5-2921 
                            ROR (register) on page F5-2923 
                            RORS (immediate) on page F5-2925 
                            RORS (register) on page F5-2927 
Rotate Right with Extend    RRX on page F5-2929 
                            RRXS on page F5-2931 


F1.4.3 Multiply instructions 


These instructions can operate on signed or unsigned quantities. In some types of operation, the results are the same 
whether the operands are signed or unsigned. 


Table F1-4 summarizes the multiply instructions where there is no distinction between signed and unsigned 
quantities. 


The least significant 32 bits of the result are used. More significant bits are discarded. 
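
The reason these instructions need no signed and unsigned variants can be checked with a short Python sketch: the least significant 32 bits of a product are the same whether the operands are interpreted as signed or unsigned. The function names below are hypothetical and follow the operations in Table F1-4 only loosely. Operand order is an assumption made for illustration.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul32(rn, rm):
    # 32 = 32 x 32: keep the least significant 32 bits of the product.
    return (rn * rm) & MASK32


def mla32(rn, rm, ra):
    # 32 = 32 + 32 x 32: multiply, then accumulate.
    return (ra + rn * rm) & MASK32


def mls32(rn, rm, ra):
    # 32 = 32 - 32 x 32: multiply, then subtract from the accumulator.
    return (ra - rn * rm) & MASK32
```

For example, 0xFFFFFFFE multiplied by 3 gives 0xFFFFFFFA in the low 32 bits whether 0xFFFFFFFE is read as 4294967294 or as -2.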


Table F1-5 on page F1-2375 summarizes the signed multiply instructions. 


Table F1-6 on page F1-2375 summarizes the unsigned multiply instructions. 


Table F1-4 General multiply instructions 

Instruction              See                          Operation (number of bits) 
Multiply Accumulate      MLA, MLAS on page F5-2808    32 = 32 + 32 x 32 
Multiply and Subtract    MLS on page F5-2810          32 = 32 - 32 x 32 
Multiply                 MUL, MULS on page F5-2845    32 = 32 x 32 
Table F1-5 Signed multiply instructions 

Instruction                                         See                                                   Operation (number of bits) 
Signed Multiply Accumulate (halfwords)              SMLABB, SMLABT, SMLATB, SMLATT on page F5-2985        32 = 32 + 16 x 16 
Signed Multiply Accumulate Dual                     SMLAD, SMLADX on page F5-2987                         32 = 32 + 16 x 16 + 16 x 16 
Signed Multiply Accumulate Long                     SMLAL, SMLALS on page F5-2989                         64 = 64 + 32 x 32 
Signed Multiply Accumulate Long (halfwords)         SMLALBB, SMLALBT, SMLALTB, SMLALTT on page F5-2991    64 = 64 + 16 x 16 
Signed Multiply Accumulate Long Dual                SMLALD, SMLALDX on page F5-2994                       64 = 64 + 16 x 16 + 16 x 16 
Signed Multiply Accumulate (word by halfword)       SMLAWB, SMLAWT on page F5-2996                        32 = 32 + 32 x 16 (a) 
Signed Multiply Subtract Dual                       SMLSD, SMLSDX on page F5-2998                         32 = 32 + 16 x 16 - 16 x 16 
Signed Multiply Subtract Long Dual                  SMLSLD, SMLSLDX on page F5-3000                       64 = 64 + 16 x 16 - 16 x 16 
Signed Most Significant Word Multiply Accumulate    SMMLA, SMMLAR on page F5-3002                         32 = 32 + 32 x 32 (b) 
Signed Most Significant Word Multiply Subtract      SMMLS, SMMLSR on page F5-3004                         32 = 32 - 32 x 32 (b) 
Signed Most Significant Word Multiply               SMMUL, SMMULR on page F5-3006                         32 = 32 x 32 (b) 
Signed Dual Multiply Add                            SMUAD, SMUADX on page F5-3008                         32 = 16 x 16 + 16 x 16 
Signed Multiply (halfwords)                         SMULBB, SMULBT, SMULTB, SMULTT on page F5-3010        32 = 16 x 16 
Signed Multiply Long                                SMULL, SMULLS on page F5-3012                         64 = 32 x 32 
Signed Multiply (word by halfword)                  SMULWB, SMULWT on page F5-3014                        32 = 32 x 16 (a) 
Signed Dual Multiply Subtract                       SMUSD, SMUSDX on page F5-3016                         32 = 16 x 16 - 16 x 16 





a. The most significant 32 bits of the 48-bit product are used. Less significant bits are discarded. 


b. The most significant 32 bits of the 64-bit product are used. Less significant bits are discarded. 
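
Footnotes a and b can be illustrated with a Python sketch of three of the signed multiplies. The function names mirror the mnemonics but are only hypothetical helpers. The sketch ignores the rounding variants and any flag effects, and assumes register values are passed as unsigned 32-bit patterns.

```python
MASK32 = 0xFFFFFFFF


def s32(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def s16(x):
    """Interpret the low 16 bits of x as a signed integer."""
    x &= 0xFFFF
    return x - (1 << 16) if x & 0x8000 else x


def smmul(rn, rm):
    # 32 = 32 x 32: most significant 32 bits of the 64-bit product.
    return ((s32(rn) * s32(rm)) >> 32) & MASK32


def smulwb(rn, rm):
    # 32 = 32 x 16: word by bottom halfword, most significant 32 bits
    # of the 48-bit product.
    return ((s32(rn) * s16(rm)) >> 16) & MASK32


def smulbb(rn, rm):
    # 32 = 16 x 16: bottom halfword by bottom halfword, full product kept.
    return (s16(rn) * s16(rm)) & MASK32
```

Python integers are unbounded, so the full product is formed first and the discarded low bits are removed with a right shift.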


Table F1-6 Unsigned multiply instructions 

Instruction                                     See                              Operation (number of bits) 
Unsigned Multiply Accumulate Accumulate Long    UMAAL on page F5-3178            64 = 32 + 32 + 32 x 32 
Unsigned Multiply Accumulate Long               UMLAL, UMLALS on page F5-3180    64 = 64 + 32 x 32 
Unsigned Multiply Long                          UMULL, UMULLS on page F5-3182    64 = 32 x 32 
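
The operations in Table F1-6 can be sketched in Python as follows. The helper names and the order of the arguments are assumptions made for illustration. Each function returns the 64-bit result as a (low word, high word) pair.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def _split(total):
    # Return the 64-bit value as (low 32 bits, high 32 bits).
    total &= MASK64
    return (total & MASK32, total >> 32)


def umull(rn, rm):
    # 64 = 32 x 32
    return _split((rn & MASK32) * (rm & MASK32))


def umlal(rdlo, rdhi, rn, rm):
    # 64 = 64 + 32 x 32: the accumulator is the 64-bit value rdhi:rdlo.
    acc = ((rdhi & MASK32) << 32) | (rdlo & MASK32)
    return _split(acc + (rn & MASK32) * (rm & MASK32))


def umaal(rdlo, rdhi, rn, rm):
    # 64 = 32 + 32 + 32 x 32: two separate 32-bit accumulators.
    return _split((rn & MASK32) * (rm & MASK32) + (rdhi & MASK32) + (rdlo & MASK32))
```

A useful property of the UMAAL form is that it cannot overflow 64 bits: with all four inputs at their maximum, (2^32 - 1)^2 + 2 x (2^32 - 1) is exactly 2^64 - 1.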
F1.4.4 Saturating instructions 


Table F1-7 lists the saturating instructions in the T32 and A32 instruction sets. For more information, see 
Pseudocode description of saturation on page E1-2291. 


Table F1-7 Saturating instructions 

Instruction             See                       Operation 
Signed Saturate         SSAT on page F5-3022      Saturates optionally shifted 32-bit value to selected range 
Signed Saturate 16      SSAT16 on page F5-3024    Saturates two 16-bit values to selected range 
Unsigned Saturate       USAT on page F5-3200      Saturates optionally shifted 32-bit value to selected range 
Unsigned Saturate 16    USAT16 on page F5-3202    Saturates two 16-bit values to selected range 
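
Saturation to a selected range can be illustrated with a minimal Python sketch. The functions `signed_sat` and `unsigned_sat` are hypothetical helpers, not the pseudocode functions referenced above. They take an ordinary Python integer and a bit width, and return the clamped value together with a flag that reports whether clamping occurred. The optional shift of the SSAT and USAT operand is not modeled.

```python
def signed_sat(value, bits):
    """Clamp value to the signed range of the given width.

    Returns (result, saturated).
    """
    high = (1 << (bits - 1)) - 1
    low = -(1 << (bits - 1))
    result = max(low, min(high, value))
    return result, result != value


def unsigned_sat(value, bits):
    """Clamp value to the unsigned range of the given width.

    Returns (result, saturated).
    """
    high = (1 << bits) - 1
    result = max(0, min(high, value))
    return result, result != value
```

For example, saturating 200 to a signed 8-bit range gives 127, and saturating -1 to an unsigned 8-bit range gives 0.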





F1.4.5 Saturating addition and subtraction instructions 


Table F1-8 lists the saturating addition and subtraction instructions in the T32 and A32 instruction sets. For more 
information, see Pseudocode description of saturation on page E1-2291. 


Table F1-8 Saturating addition and subtraction instructions 

Instruction                       See                      Operation 
Saturating Add                    QADD on page F5-2891     Add, saturating result to the 32-bit signed integer range 
Saturating Subtract               QSUB on page F5-2904     Subtract, saturating result to the 32-bit signed integer range 
Saturating Double and Add         QDADD on page F5-2891    Doubles one value and adds a second value, saturating the doubling and 
                                                           the addition to the 32-bit signed integer range 
Saturating Double and Subtract    QDSUB on page F5-2900    Doubles one value and subtracts the result from a second value, saturating 
                                                           the doubling and the subtraction to the 32-bit signed integer range 
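
The difference between the plain and the doubling forms is that the doubling forms saturate twice: once after the doubling and once after the addition or subtraction. The Python sketch below illustrates this with ordinary signed integers. The function and parameter names are hypothetical, and no status flag is modeled.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def sat32(value):
    """Clamp value to the 32-bit signed integer range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def qadd(first, second):
    return sat32(first + second)


def qsub(first, second):
    return sat32(first - second)


def qdadd(value, doubled):
    # The doubling saturates first, then the addition saturates.
    return sat32(value + sat32(2 * doubled))


def qdsub(value, doubled):
    # The doubling saturates first, then the subtraction saturates.
    return sat32(value - sat32(2 * doubled))
```

The intermediate saturation is visible in `qdadd(-1, 2**30)`: doubling 2^30 saturates to 2^31 - 1, so the result is 2^31 - 2, not the 2^31 - 1 that a single final saturation would give.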
F1.4.6 Packing and unpacking instructions 


Table F1-9 lists the packing and unpacking instructions in the T32 and A32 instruction sets. 


Table F1-9 Packing and unpacking instructions 

Instruction                          See                             Operation 
Pack Halfword                        PKHBT, PKHTB on page F5-2867    Combine halfwords 
Signed Extend and Add Byte           SXTAB on page F5-3131           Extend 8 bits to 32 and add 
Signed Extend and Add Byte 16        SXTAB16 on page F5-3133         Dual extend 8 bits to 16 and add 
Signed Extend and Add Halfword       SXTAH on page F5-3135           Extend 16 bits to 32 and add 
Signed Extend Byte                   SXTB on page F5-3137            Extend 8 bits to 32 
Signed Extend Byte 16                SXTB16 on page F5-3139          Dual extend 8 bits to 16 
Signed Extend Halfword               SXTH on page F5-3141            Extend 16 bits to 32 
Unsigned Extend and Add Byte         UXTAB on page F5-3210           Extend 8 bits to 32 and add 
Unsigned Extend and Add Byte 16      UXTAB16 on page F5-3212         Dual extend 8 bits to 16 and add 
Unsigned Extend and Add Halfword     UXTAH on page F5-3214           Extend 16 bits to 32 and add 
Unsigned Extend Byte                 UXTB on page F5-3216            Extend 8 bits to 32 
Unsigned Extend Byte 16              UXTB16 on page F5-3218          Dual extend 8 bits to 16 
Unsigned Extend Halfword             UXTH on page F5-3220            Extend 16 bits to 32 
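
The extend operations in Table F1-9 can be sketched in Python as shown below. The helper names mirror the mnemonics but are hypothetical. The sketch treats register values as unsigned 32-bit patterns, takes the dual forms to operate on bits [7:0] and bits [23:16] of the operand, and does not model any optional rotation of the operand.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def sxtb(rm):
    return sign_extend(rm, 8) & MASK32       # extend 8 bits to 32


def sxth(rm):
    return sign_extend(rm, 16) & MASK32      # extend 16 bits to 32


def uxtb(rm):
    return rm & 0xFF


def uxth(rm):
    return rm & 0xFFFF


def sxtb16(rm):
    # Dual extend: bytes at bits [7:0] and [23:16] each become a halfword.
    low = sign_extend(rm, 8) & 0xFFFF
    high = sign_extend(rm >> 16, 8) & 0xFFFF
    return (high << 16) | low


def uxtb16(rm):
    return rm & 0x00FF00FF


def sxtab(rn, rm):
    return (rn + sign_extend(rm, 8)) & MASK32  # extend 8 bits to 32 and add


def uxtab(rn, rm):
    return (rn + (rm & 0xFF)) & MASK32
```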
F1.4.7 Parallel addition and subtraction instructions 


These instructions perform additions and subtractions on the values of two registers and write the result to a 
destination register, treating the register values as sets of two halfwords or four bytes. That is, they perform SIMD 
additions or subtractions on the general-purpose registers. 


These instructions consist of a prefix followed by a main instruction mnemonic. The prefixes are as follows: 


S       Signed arithmetic modulo 2^8 or 2^16.
Q       Signed saturating arithmetic.
SH      Signed arithmetic, halving the results.
U       Unsigned arithmetic modulo 2^8 or 2^16.
UQ      Unsigned saturating arithmetic.
UH      Unsigned arithmetic, halving the results.


The main instruction mnemonics are as follows: 


ADD16   Adds the top halfwords of two operands to form the top halfword of the result, and the bottom
        halfwords of the same two operands to form the bottom halfword of the result.

ASX     Exchanges halfwords of the second operand, and then adds top halfwords and subtracts bottom
        halfwords.

SAX     Exchanges halfwords of the second operand, and then subtracts top halfwords and adds bottom
        halfwords.

SUB16   Subtracts each halfword of the second operand from the corresponding halfword of the first operand
        to form the corresponding halfword of the result.

ADD8    Adds each byte of the second operand to the corresponding byte of the first operand to form the
        corresponding byte of the result.

SUB8    Subtracts each byte of the second operand from the corresponding byte of the first operand to form
        the corresponding byte of the result.


The instruction set permits all 36 combinations of prefix and main instruction operand, as Table F1-10 shows. 


See also Advanced SIMD parallel addition and subtraction on page F1-2392. 


Table F1-10 Parallel addition and subtraction instructions

Main instruction                        Signed    Saturating    Signed halving    Unsigned    Unsigned saturating    Unsigned halving
ADD16, add, two halfwords               SADD16    QADD16        SHADD16           UADD16      UQADD16                UHADD16
ASX, add and subtract with exchange     SASX      QASX          SHASX             UASX        UQASX                  UHASX
SAX, subtract and add with exchange     SSAX      QSAX          SHSAX             USAX        UQSAX                  UHSAX
SUB16, subtract, two halfwords          SSUB16    QSUB16        SHSUB16           USUB16      UQSUB16                UHSUB16
ADD8, add, four bytes                   SADD8     QADD8         SHADD8            UADD8       UQADD8                 UHADD8
SUB8, subtract, four bytes              SSUB8     QSUB8         SHSUB8            USUB8       UQSUB8                 UHSUB8
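
The way the prefixes combine with the ADD16, SUB16, ADD8, and SUB8 mnemonics can be illustrated with a small Python model. This is a sketch for illustration only, not the architectural pseudocode: the function name parallel_op and its helpers are hypothetical, the ASX and SAX forms are not modelled, and the updates that some of these instructions make to the GE flags are ignored.

```python
def _lanes(value, width):
    """Split a 32-bit register value into lanes of `width` bits, lowest lane first."""
    mask = (1 << width) - 1
    return [(value >> shift) & mask for shift in range(0, 32, width)]


def _pack(lanes, width):
    """Pack lanes (lowest first) into a 32-bit value, truncating each lane to `width` bits."""
    mask = (1 << width) - 1
    result = 0
    for i, lane in enumerate(lanes):
        result |= (lane & mask) << (i * width)
    return result


def _signed(x, width):
    """Interpret a `width`-bit pattern as a two's complement signed integer."""
    return x - (1 << width) if x & (1 << (width - 1)) else x


def parallel_op(prefix, op, rn, rm, width):
    """Model <prefix>ADD<width> or <prefix>SUB<width> (hypothetical helper).

    prefix is one of S, Q, SH, U, UQ, UH; op is "add" or "sub"; width is 8 or 16.
    """
    signed = prefix in ("S", "Q", "SH")
    out = []
    for a, b in zip(_lanes(rn, width), _lanes(rm, width)):
        if signed:
            a, b = _signed(a, width), _signed(b, width)
        r = a + b if op == "add" else a - b
        if prefix in ("Q", "UQ"):
            # Saturate to the signed or unsigned range of one lane.
            if signed:
                lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
            else:
                lo, hi = 0, (1 << width) - 1
            r = max(lo, min(hi, r))
        elif prefix in ("SH", "UH"):
            r >>= 1  # halve the full-precision result
        # S and U fall through: _pack truncates, giving arithmetic modulo 2^width.
        out.append(r)
    return _pack(out, width)
```

For example, parallel_op("S", "add", 0x7FFF0001, 0x00010001, 16) wraps the top halfword to 0x8000, whereas the "Q" prefix saturates it to 0x7FFF and the "SH" prefix halves the full sum to give 0x4000.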
F1.4.8 Divide instructions


In ARMv8, signed and unsigned integer divide instructions are included in both the T32 instruction set and the A32
instruction set. For more information about their implementation in previous versions of the ARM architecture see
the ARM® Architecture Reference Manual, ARMv7-A and ARMv7-R edition.


For descriptions of the instructions see: 


° SDIV on page F5-2962. 
° UDIV on page F5-3164. 


For the SDIV and UDIV instructions, divide-by-zero always returns a zero result. 


The ID_ISAR0.Divide_instrs field indicates the level of support for these instructions. The field value of 0b0010
indicates they are implemented in both the T32 and A32 instruction sets.
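
The divide-by-zero behavior can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than the architectural pseudocode: the names sdiv and udiv are hypothetical helpers, the quotient is assumed to round towards zero, and the signed overflow case (0x80000000 divided by -1) is assumed to wrap to 0x80000000. Consult the SDIV and UDIV instruction descriptions for the authoritative definition.

```python
MASK32 = 0xFFFFFFFF


def _to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def udiv(rn, rm):
    """Unsigned 32-bit divide; divide-by-zero returns zero."""
    rn &= MASK32
    rm &= MASK32
    if rm == 0:
        return 0
    return rn // rm


def sdiv(rn, rm):
    """Signed 32-bit divide, returned as a 32-bit pattern; divide-by-zero returns zero."""
    n, m = _to_signed32(rn), _to_signed32(rm)
    if m == 0:
        return 0
    q = abs(n) // abs(m)      # assumption: round towards zero
    if (n < 0) != (m < 0):
        q = -q
    return q & MASK32         # assumption: overflow wraps to 32 bits
```

Note that Python's own // operator rounds towards negative infinity, which is why the sketch divides the magnitudes and restores the sign afterwards.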


F1.4.9 Miscellaneous data-processing instructions


Table F1-11 lists the miscellaneous data-processing instructions in the T32 and A32 instruction sets. Immediate 
values in these instructions are simple binary numbers. 


Table F1-11 Miscellaneous data-processing instructions

Instruction                                            See                       Notes
BitField Clear                                         BFC on page F5-2610       -
BitField Insert                                        BFI on page F5-2612       -
Count Leading Zeros                                    CLZ on page F5-2632       -
Move Top                                               MOVT on page F5-2824      Moves 16-bit immediate value to top halfword. Bottom halfword unchanged.
Reverse Bits                                           RBIT on page F5-2910      -
Byte-Reverse Word                                      REV on page F5-2912       -
Byte-Reverse Packed Halfword                           REV16 on page F5-2914     -
Byte-Reverse Signed Halfword                           REVSH on page F5-2916     -
Signed BitField Extract                                SBFX on page F5-2960      -
Select Bytes using GE flags                            SEL on page F5-2964       -
Unsigned BitField Extract                              UBFX on page F5-3160      -
Unsigned Sum of Absolute Differences                   USAD8 on page F5-3196     -
Unsigned Sum of Absolute Differences and Accumulate    USADA8 on page F5-3198    -


ARM DDI 0487A.k_iss10775 
1ID092916 


Non-Confidential 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 




F1.5 PSTATE and banked register access instructions 


These instructions transfer PE state information to or from a general-purpose register. 


F1.5.1 PSTATE access instructions 
PSTATE holds process state information, see Process state, PSTATE on page E1-2294. In AArch32 state:
• At EL1 or higher, PSTATE is accessible using the Current Program Status Register (CPSR).
• At EL0, a subset of the CPSR is accessible as the Application Program Status Register (APSR).


• On taking an exception, the contents of the CPSR are copied to the Saved Program Status Register (SPSR)
of the mode to which the exception is taken.


The MRS and MSR instructions move the contents of the CPSR, APSR, or the SPSR of the current mode to or from a 
general-purpose register, see: 


° MRS on page F5-2830. 
° MSR (immediate) on page F5-2840. 
° MSR (register) on page F5-2842. 


When executed at EL0, MRS and MSR instructions can only access the APSR.


The PSTATE condition flags, PSTATE.{N, Z, C, V} are set by the execution of data-processing instructions, and 
can control the execution of conditional instructions. However, software can set the condition flags explicitly using 
the MSR instruction, and can read the current state of the condition flags explicitly using the MRS instruction. 


In addition, at EL1 or higher, software can use the CPS instruction to change the PSTATE.M field and the 
PSTATE.{A, I, F} interrupt mask bits, see CPS, CPSID, CPSIE on page F5-2645. 


F1.5.2 Banked register access instructions 


At EL1 or higher, the MRS (Banked register) and MSR (Banked register) instructions move the contents of a Banked
general-purpose register, the SPSR, or the ELR_hyp, to or from a general-purpose register. See:


• MRS (Banked register) on page F5-2832.
• MSR (Banked register) on page F5-2836.
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F1.6 Load/store instructions 


Table F1-12 summarizes the general-purpose register load/store instructions in the T32 and A32 instruction sets. 


Some of these instructions can also operate on the PC. See also: 


° Load/store multiple instructions on page F1-2384. 


° Synchronization and semaphores on page E2-2355, for more information about the Load-Exclusive and 
Store-Exclusive instructions. 


° Load-Acquire, Store-Release on page E2-2338, for more information about the Load-Acquire/Store-Release 
and Load-Acquire Exclusive/Store-Release Exclusive instructions. 


. Advanced SIMD and floating-point load/store instructions on page F1-2388. 


Load/store instructions have several options for addressing memory. For more information, see Addressing modes 
on page F1-2382. 


Table F1-12 Load/store instructions

Data type                 Load   Store  Load          Store         Load-      Store-     Load-    Store-   Load-Acquire  Store-Release
                                        unprivileged  unprivileged  Exclusive  Exclusive  Acquire  Release  Exclusive     Exclusive
32-bit word               LDR    STR    LDRT          STRT          LDREX      STREX      LDA      STL      LDAEX         STLEX
16-bit halfword           -      STRH   -             STRHT         -          STREXH     LDAH     STLH     LDAEXH        STLEXH
16-bit unsigned halfword  LDRH   -      LDRHT         -             LDREXH     -          -        -        -             -
16-bit signed halfword    LDRSH  -      LDRSHT        -             -          -          -        -        -             -
8-bit byte                -      STRB   -             STRBT         -          STREXB     LDAB     STLB     LDAEXB        STLEXB
8-bit unsigned byte       LDRB   -      LDRBT         -             LDREXB     -          -        -        -             -
8-bit signed byte         LDRSB  -      LDRSBT        -             -          -          -        -        -             -
Two 32-bit words          LDRD   STRD   -             -             -          -          -        -        -             -
64-bit doubleword         -      -      -             -             LDREXD     STREXD     -        -        LDAEXD        STLEXD

F1.6.1 Loads to the PC 


The LDR instruction can load a value into the PC. The value loaded is treated as an interworking address, as described 
by the LoadWritePC() pseudocode function in Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 


F1.6.2 Halfword and byte loads and stores 


Halfword and byte stores store the least significant halfword or byte from the register, to 16 or 8 bits of memory 
respectively. There is no distinction between signed and unsigned stores. 


Halfword and byte loads load 16 or 8 bits from memory into the least significant halfword or byte of a register. 
Unsigned loads zero-extend the loaded value to 32 bits, and signed loads sign-extend the value to 32 bits. 
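
The difference between unsigned loads, signed loads, and narrow stores can be shown with a short Python model. This is an illustrative sketch: the function names are hypothetical and the values stand for the raw bits read from or written to memory.

```python
def load_unsigned(mem_value, bits):
    """Model LDRB (bits=8) or LDRH (bits=16): zero-extend to 32 bits."""
    return mem_value & ((1 << bits) - 1)


def load_signed(mem_value, bits):
    """Model LDRSB (bits=8) or LDRSH (bits=16): sign-extend to 32 bits."""
    value = mem_value & ((1 << bits) - 1)
    if value & (1 << (bits - 1)):
        # Top bit of the loaded value is set: fill the upper register bits with ones.
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)
    return value


def store_narrow(reg_value, bits):
    """Model STRB (bits=8) or STRH (bits=16): only the least significant bits are stored."""
    return reg_value & ((1 << bits) - 1)
```

A byte 0x80 in memory therefore loads as 0x00000080 with an unsigned load and as 0xFFFFFF80 with a signed load, while a store of 0x12345678 as a halfword writes 0x5678 regardless of signedness.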
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F1.6.3 Load unprivileged and Store unprivileged 


When executing at EL0, a Load unprivileged or Store unprivileged instruction operates in exactly the same way as
the corresponding ordinary load or store instruction. For example, an LDRT instruction executes in exactly the same
way as the equivalent LDR instruction. When executed at PL1, Load unprivileged and Store unprivileged instructions
behave as they would if they were executed at EL0. For example, an LDRT instruction executes in exactly the way
that the equivalent LDR instruction would execute at EL0. In particular, the instructions make unprivileged memory
accesses.


Note 


As described in Security state, Exception levels, and AArch32 execution privilege on page G1-3792, execution at
PL1 describes all of the following:

• Execution at Non-secure EL1 using AArch32.
• Execution at Secure EL1 using AArch32 when EL3 is not implemented.
• Execution at Secure EL1 using AArch32 when EL3 is implemented and is using AArch64.
• Execution at Secure EL3 when EL3 is implemented and is using AArch32.





The Load unprivileged and Store unprivileged instructions are CONSTRAINED UNPREDICTABLE if executed at EL2, 
see Execution of Load/Store unprivileged instructions in Hyp mode on page K1-5475. 


For more information about execution privilege, see Access permissions on page G4-4068. 


F1.6.4 Load-Exclusive and Store-Exclusive 


Load-Exclusive and Store-Exclusive instructions provide shared memory synchronization. For more information, 
see Synchronization and semaphores on page E2-2355. 


F1.6.5 Load-Acquire and Store-Release 


Load-Acquire and Store-Release instructions provide memory barriers. Load-Acquire Exclusive and Store-Release 
Exclusive instructions provide memory barriers with shared memory synchronization. For more information, see 
Load-Acquire, Store-Release on page E2-2338. 


F1.6.6 Addressing modes 


The address for a load or store is formed from two parts: a value from a base register, and an offset. 
The base register can be any one of the general-purpose registers R0-R12, SP, or LR.


For loads, the base register can be the PC. This provides PC-relative addressing for position-independent code. 
Instructions marked (literal) in their title in Chapter F2 About the T32 and A32 Instruction Descriptions are 
PC-relative loads. 


The offset takes one of three formats: 


Immediate The offset is an unsigned number that can be added to or subtracted from the base register 
value. Immediate offset addressing is useful for accessing data elements that are a fixed 
distance from the start of the data object, such as structure fields, stack offsets and 
input/output registers. 


Register The offset is a value from a general-purpose register. The value can be added to, or 
subtracted from, the base register value. Register offsets are useful for accessing arrays or 
blocks of data. 

Scaled register The offset is a general-purpose register, shifted by an immediate value, then added to or
subtracted from the base register. This means an array index can be scaled by the size of each
array element.
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The offset and base register can be used in three different ways to form the memory address. The addressing modes 
are described as follows: 


Offset The offset is added to or subtracted from the base register to form the memory address. 


Pre-indexed The offset is added to or subtracted from the base register to form the memory address. The 
base register is then updated with this new address, to permit automatic indexing through an 
array or memory block. 


Post-indexed The value of the base register alone is used as the memory address. The offset is then added 
to or subtracted from the base register. The result is stored back in the base register, to permit 
automatic indexing through an array or memory block. 


Note 
Not every variant is available for every instruction, and the range of permitted immediate values and the options for 
scaled registers vary from instruction to instruction. See Chapter F2 About the T32 and A32 Instruction Descriptions 
for full details for each instruction. 
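
The three addressing modes differ only in which address is used for the access and whether the base register is written back. The following Python sketch makes that explicit. The function names are hypothetical, and the scaled register helper models a left shift only, which is an assumption made for brevity.

```python
MASK32 = 0xFFFFFFFF


def access(mode, base, offset, add=True):
    """Return (address used for the memory access, base register value afterwards)."""
    target = (base + offset if add else base - offset) & MASK32
    if mode == "offset":
        return target, base        # base register unchanged
    if mode == "pre-indexed":
        return target, target      # access at base +/- offset, then write back
    if mode == "post-indexed":
        return base, target        # access at base, then write back base +/- offset
    raise ValueError("unknown addressing mode: " + mode)


def scaled_register_offset(index, shift):
    """Scaled register offset, modelled here as a left shift by an immediate."""
    return (index << shift) & MASK32
```

For instance, access("post-indexed", 0x1000, scaled_register_offset(1, 2)) accesses address 0x1000 and leaves the base register at 0x1004, which is the pattern used to step through an array of words.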
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F1.7 Load/store multiple instructions 
Load Multiple instructions load from memory a subset, or possibly all, of the general-purpose registers and the PC. 
Store Multiple instructions store to memory a subset, or possibly all, of the general-purpose registers. 
The memory locations are consecutive word-aligned words. The addresses used are obtained from a base register, 
and can be either above or below the value in the base register. The base register can optionally be updated by the 
total size of the data transferred. 
Table F1-13 summarizes the load/store multiple instructions in the T32 and A32 instruction sets. 
Table F1-13 Load/store multiple instructions

Instruction                                                 See
Load Multiple, Increment After or Full Descending           LDM, LDMIA, LDMFD on page F5-2699
Load Multiple, Decrement After or Full Ascending (a)        LDMDA, LDMFA on page F5-2707
Load Multiple, Decrement Before or Empty Ascending          LDMDB, LDMEA on page F5-2709
Load Multiple, Increment Before or Empty Descending (a)     LDMIB, LDMED on page F5-2712
Pop multiple registers off the stack (b)                    POP on page F5-2880
Push multiple registers onto the stack (c)                  PUSH on page F5-2886
Store Multiple, Increment After or Empty Ascending          STM, STMIA, STMEA on page F5-3049
Store Multiple, Decrement After or Empty Descending (a)     STMDA, STMED on page F5-3055
Store Multiple, Decrement Before or Full Descending         STMDB, STMFD on page F5-3057
Store Multiple, Increment Before or Full Ascending (a)      STMIB, STMFA on page F5-3060

a. Not available in the T32 instruction set.
b. This instruction is equivalent to an LDM instruction with the SP as base register, and base register updating.
c. This instruction is equivalent to an STMDB instruction with the SP as base register, and base register updating.

When executing at EL1, variants of the LDM and STM instructions load and store User mode registers. Another 
system level variant of the LDM instruction performs an exception return. 
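
The four transfer orders in Table F1-13 can be summarized by the addresses they touch. The Python sketch below is illustrative only: multiple_addresses is a hypothetical helper, and it assumes that registers are transferred in ascending address order and that the optional base register update moves the base by four bytes per register.

```python
MASK32 = 0xFFFFFFFF


def multiple_addresses(mode, base, count):
    """Return (list of word addresses accessed, updated base register value).

    mode is "IA" (Increment After), "IB" (Increment Before),
    "DA" (Decrement After), or "DB" (Decrement Before).
    """
    size = 4 * count
    if mode == "IA":
        start, new_base = base, base + size
    elif mode == "IB":
        start, new_base = base + 4, base + size
    elif mode == "DA":
        start, new_base = base - size + 4, base - size
    elif mode == "DB":
        start, new_base = base - size, base - size
    else:
        raise ValueError("unknown mode: " + mode)
    addresses = [(start + 4 * i) & MASK32 for i in range(count)]
    return addresses, new_base & MASK32
```

With the SP as the base register, the "DB" case with base register updating corresponds to PUSH and the "IA" case corresponds to POP, matching footnotes b and c of Table F1-13.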
F1.7.1 Loads to the PC 
The LDM, LDMDA, LDMDB, LDMIB, and POP instructions can load a value into the PC. The value loaded is treated as an 
interworking address, as described by the LoadWritePC() pseudocode function in Pseudocode description of 
operations on the AArch32 general-purpose registers and the PC on page E1-2293. 
F1.8 Miscellaneous instructions


Table F1-14 summarizes the miscellaneous instructions in the T32 and A32 instruction sets.


Table F1-14 Miscellaneous instructions

Instruction                            See
Clear-Exclusive                        CLREX on page F5-2631
Data Memory Barrier                    DMB on page F5-2659
Data Synchronization Barrier           DSB on page F5-2662
Instruction Synchronization Barrier    ISB on page F5-2679
If-Then                                IT on page F5-2681
No Operation                           NOP on page F5-2854
Preload Data                           PLD, PLDW (immediate) on page F5-2869
                                       PLD (literal) on page F5-2871
                                       PLD, PLDW (register) on page F5-2873
Preload Instruction                    PLI (immediate, literal) on page F5-2875
                                       PLI (register) on page F5-2878
Set Endianness                         SETEND on page F5-2966
Send Event                             SEV on page F5-2967
Send Event Local                       SEVL on page F5-2969
Wait For Event                         WFE on page F5-3222
Wait For Interrupt                     WFI on page F5-3224
Yield                                  YIELD on page F5-3226


Note 


Previous versions of the architecture defined the DBG instruction, that could provide a hint to the debug system, in
this group. In ARMv8, this instruction executes as a NOP. ARM deprecates any use of the DBG instruction.





F1.8.1 The Yield instruction


In a Symmetric Multithreading (SMT) design, a thread can use the YIELD instruction to give a hint to the PE that it
is running on. The YIELD hint indicates that whatever the thread is currently doing is of low importance, and so could
yield. For example, the thread might be sitting in a spin-lock. A similar use might be in modifying the arbitration
priority of the snoop bus in a multiprocessor (MP) system. Defining such an instruction permits binary compatibility
between SMT and SMP systems.


AArch32 state defines a YIELD instruction as a specific NOP (No Operation) hint instruction. 


The YIELD instruction has no effect in a single-threaded system, but developers of such systems can use the
instruction to flag its intended use on migration to a multiprocessor or multithreading system. Operating systems
can use YIELD in places where a yield hint is wanted, knowing that it will be treated as a NOP if there is no
implementation benefit.
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F1.9 Exception-generating and exception-handling instructions 


The following instructions are intended specifically to cause a synchronous exception to occur:

• The SVC instruction generates a Supervisor Call exception. For more information, see Supervisor Call (SVC)
  exception on page G1-3853.

• The Breakpoint instruction BKPT provides software breakpoints. For more information, see Breakpoint
  Instruction exceptions on page G2-3935.

• In an implementation that includes EL3, when executing at EL1 or higher, the SMC instruction generates a
  Secure Monitor Call exception. For more information, see Secure Monitor Call (SMC) exception on
  page G1-3854.

• In an implementation that includes EL2, in software executing in a Non-secure EL1 mode, the HVC instruction
  generates a Hypervisor Call exception. For more information, see Hypervisor Call (HVC) exception on
  page G1-3855.

For an exception taken to an EL1 mode:

• The system level variants of the SUBS and LDM instructions can perform a return from an exception.

  Note

  The variants of SUBS include MOVS. See the references to Subtract (exception return), Move (exception return),
  and Load Multiple (exception return) in Table F1-15 for more information.

• The SRS instruction can be used near the start of the handler, to store return information. The RFE instruction
  can then perform a return from the exception using the stored return information.

In an implementation that includes EL2, the ERET instruction performs a return from an exception taken to Hyp
mode.


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 


Table F1-15 summarizes the instructions, in the T32 and A32 instruction sets, for generating or handling an 
exception. Except for BKPT and SVC, these are system level instructions. 


Table F1-15 Exception-generating and exception-handling instructions

Instruction                         See
Supervisor Call                     SVC on page F5-3129
Breakpoint                          BKPT on page F5-2621
Secure Monitor Call                 SMC on page F5-2983
Return From Exception               RFE, RFEDA, RFEDB, RFEIA, RFEIB on page F5-2918
Subtract (exception return) (a)     SUB, SUBS (immediate) on page F5-3114 (a)
Move (exception return)             MOV, MOVS (register) on page F5-2815 (a)
Hypervisor Call                     HVC on page F5-2677
Exception Return                    ERET on page F5-2673
Load Multiple (exception return)    LDM (exception return) on page F5-2703
Store Return State                  SRS, SRSDA, SRSDB, SRSIA, SRSIB on page F5-3018

a. The A32 instruction set includes other instruction forms that can be used for an exception return, that have
   previously been described as variants of SUBS PC, LR. ARM deprecates any use of these instruction forms.
F1.10 System register access instructions


The System register encoding space is indexed using the parameters {coproc, opc1, CRn, CRm, opc2}, see The AArch32
System register interface on page G1-3877. This encoding space provides System registers and System instructions.
In ARMv8, the only permitted values of coproc are 0b1110 and 0b1111, and the following instructions give access to
this encoding space:


• Instructions that transfer data between general-purpose registers and System registers. See:
  - MCR on page F5-2804.
  - MCRR on page F5-2806.
  - MRC on page F5-2826.
  - MRRC on page F5-2828.


• Instructions that load or store from memory to a System register. See:
  - LDC (immediate) on page F5-2695.
  - LDC (literal) on page F5-2697.
  - STC on page F5-3032.


Note 


The System register encoding space with coproc==0b101x is redefined to provide some of the Advanced SIMD and
floating-point functionality. That is, to:

• Initiate a floating-point data-processing operation, see Floating-point data-processing instructions on
  page F1-2399.

• Transfer data between general-purpose registers and the Advanced SIMD and floating-point registers, see
  Advanced SIMD and floating-point register transfer instructions on page F1-2390.

• Load or store data to the Advanced SIMD and floating-point registers, see Advanced SIMD and floating-point
  load/store instructions on page F1-2388.





System register access instructions are part of the instruction stream executed by the PE, and therefore any System
register access instruction that cannot be executed by the implementation causes an Undefined Instruction
exception. In ARMv8-A and ARMv8-R, the instruction encodings in the System register access instruction
encoding space are unallocated, and generate Undefined Instruction exceptions, except for:


• The instructions summarized in this section that access the coproc==0b111x encoding space.


• The instructions in the coproc==0b101x encoding space that are redefined to provide Advanced SIMD and
  floating-point functionality, as summarized in the Note in this section.
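
How the {coproc, opc1, CRn, CRm, opc2} parameters index the encoding space can be seen by packing them into an instruction word. The Python sketch below assumes the A32 MRC and MCR field layout (cond in bits [31:28], opc1 in [23:21], the read/write bit in [20], CRn in [19:16], Rt in [15:12], coproc in [11:8], opc2 in [7:5], CRm in [3:0]). The function name encode_mrc_mcr is hypothetical, and the MCR and MRC instruction descriptions remain the authoritative source for the encodings.

```python
def encode_mrc_mcr(is_read, coproc, opc1, rt, crn, crm, opc2, cond=0b1110):
    """Pack an A32 MRC (is_read=True) or MCR (is_read=False) instruction word."""
    if coproc not in (0b1110, 0b1111):
        # Per the text above, these are the only permitted coproc values in ARMv8.
        raise ValueError("coproc must be 0b1110 or 0b1111")
    for name, value, limit in (("opc1", opc1, 8), ("opc2", opc2, 8),
                               ("rt", rt, 16), ("crn", crn, 16), ("crm", crm, 16)):
        if not 0 <= value < limit:
            raise ValueError(name + " out of range")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(is_read) << 20)
            | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
            | (1 << 4) | crm)
```

Under these assumptions, a read with coproc 0b1111, opc1 0, CRn 0, CRm 0, opc2 0 into R0 packs to 0xEE100F10, and the same fields with CRn 1 as a write pack to 0xEE010F10.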
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F1.11 Advanced SIMD and floating-point load/store instructions 


Table F1-16 summarizes the SIMD and floating-point register file load/store instructions in the Advanced SIMD 
and floating-point instruction sets. 


Advanced SIMD also provides instructions for loading and storing multiple elements, or structures of elements, see 
Element and structure load/store instructions. 


Table F1-16 SIMD and floating-point register file load/store instructions

Instruction              See                                     Operation
Vector Load Multiple     VLDM, VLDMDB, VLDMIA on page F6-3458    Load 1-16 consecutive 64-bit registers, Advanced SIMD and floating-point.
                                                                 Load 1-16 consecutive 32-bit registers, floating-point only.
Vector Load Register     VLDR (immediate) on page F6-3463        Load one 64-bit register, Advanced SIMD and floating-point.
                         VLDR (literal) on page F6-3465          Load one 32-bit register, floating-point only.
Vector Store Multiple    VSTM, VSTMDB, VSTMIA on page F6-3744    Store 1-16 consecutive 64-bit registers, Advanced SIMD and floating-point.
                                                                 Store 1-16 consecutive 32-bit registers, floating-point only.
Vector Store Register    VSTR on page F6-3749                    Store one 64-bit register, Advanced SIMD and floating-point.
                                                                 Store one 32-bit register, floating-point only.

F1.11.1 Element and structure load/store instructions 


Table F1-17 shows the element and structure load/store instructions available in the Advanced SIMD instruction
set. Loading and storing structures of more than one element automatically de-interleaves or interleaves the
elements, see Figure F1-1 on page F1-2389 for an example of de-interleaving. Interleaving is the inverse process.


Table F1-17 Element and structure load/store instructions

Instruction                  See
Load single element
  Multiple elements          VLD1 (multiple single elements) on page F6-3424
  To one lane                VLD1 (single element to one lane) on page F6-3418
  To all lanes               VLD1 (single element to all lanes) on page F6-3421
Load 2-element structure
  Multiple structures        VLD2 (multiple 2-element structures) on page F6-3435
  To one lane                VLD2 (single 2-element structure to one lane) on page F6-3428
  To all lanes               VLD2 (single 2-element structure to all lanes) on page F6-3432
Load 3-element structure
  Multiple structures        VLD3 (multiple 3-element structures) on page F6-3445
  To one lane                VLD3 (single 3-element structure to one lane) on page F6-3438
  To all lanes               VLD3 (single 3-element structure to all lanes) on page F6-3442
Load 4-element structure
  Multiple structures        VLD4 (multiple 4-element structures) on page F6-3455
  To one lane                VLD4 (single 4-element structure to one lane) on page F6-3448
  To all lanes               VLD4 (single 4-element structure to all lanes) on page F6-3452
Store single element
  Multiple elements          VST1 (multiple single elements) on page F6-3719
  From one lane              VST1 (single element from one lane) on page F6-3716
Store 2-element structure
  Multiple structures        VST2 (multiple 2-element structures) on page F6-3727
  From one lane              VST2 (single 2-element structure from one lane) on page F6-3723
Store 3-element structure
  Multiple structures        VST3 (multiple 3-element structures) on page F6-3734
  From one lane              VST3 (single 3-element structure from one lane) on page F6-3730
Store 4-element structure
  Multiple structures        VST4 (multiple 4-element structures) on page F6-3741
  From one lane              VST4 (single 4-element structure from one lane) on page F6-3737

Figure F1-1 shows the de-interleaving of a VLD3.16 (multiple 3-element structures) instruction: 
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Figure F1-1 De-interleaving an array of 3-element structures 


Figure F1-1 shows the VLD3.16 instruction operating to three 64-bit registers that comprise four 16-bit elements:

• Different instructions in this group would produce similar figures, but operate on different numbers of
  registers. For example, VLD4 and VST4 instructions operate on four registers.

• Different element sizes would produce similar figures but with 8-bit or 32-bit elements.

• These instructions operate only on doubleword (64-bit) registers.
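
The de-interleaving that Figure F1-1 depicts can also be expressed as a short Python model. This is an illustrative sketch: the function names are hypothetical, elements are represented as list items rather than packed bits, and the first element of each structure is assumed to go to the first register.

```python
def deinterleave(elements, n):
    """Split a flat sequence of n-element structures into n per-register lists.

    Models the load direction (for example VLD3 with n=3).
    """
    if len(elements) % n:
        raise ValueError("element count must be a multiple of the structure size")
    return [list(elements[i::n]) for i in range(n)]


def interleave(registers):
    """Inverse operation, modelling the store direction (for example VST3)."""
    return [element for group in zip(*registers) for element in group]
```

For an array of four 3-element structures laid out in memory as x0, y0, z0, x1, y1, z1, and so on, deinterleave(memory, 3) gathers the four x elements into the first register, the y elements into the second, and the z elements into the third. Applying interleave to the result restores the original memory order.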
F1.12 Advanced SIMD and floating-point register transfer instructions


Table F1-18 summarizes the SIMD and floating-point register file transfer instructions in the Advanced SIMD and 
floating-point instruction sets. These instructions transfer data between the general-purpose registers and the 
registers in the SIMD and floating-point register file. 


Advanced SIMD vectors, and single-precision and double-precision floating-point registers, are all views of the 
same register file. For details see The SIMD and floating-point register file on page E1-2300. 


Table F1-18 SIMD and floating-point register file transfer instructions

Instruction                                                                           See

Copy element from general-purpose register to every element of an Advanced SIMD       VDUP (general-purpose register) on page F6-3394
vector

Copy byte, halfword, or word from general-purpose register to a register in the       VMOV (general-purpose register to scalar) on
SIMD and floating-point register file                                                 page F6-3512

Copy byte, halfword, or word from a register in the SIMD and floating-point           VMOV (scalar to general-purpose register) on
register file to a general-purpose register                                           page F6-3516

Copy from single-precision floating-point register to general-purpose register, or    VMOV (between general-purpose register and
from general-purpose register to single-precision floating-point register             single-precision) on page F6-3514

Copy two words from general-purpose registers to consecutive single-precision         VMOV (between two general-purpose registers
floating-point registers, or from consecutive single-precision floating-point         and two single-precision registers) on
registers to general-purpose registers                                                page F6-3518

Copy two words from general-purpose registers to a doubleword register in the         VMOV (between two general-purpose registers
SIMD and floating-point register file, or from a doubleword register in the SIMD      and a doubleword floating-point register) on
and floating-point register file to general-purpose registers                         page F6-3503

Copy from an Advanced SIMD and floating-point System Register to a                    VMRS on page F6-3525
general-purpose register

Copy from a general-purpose register to an Advanced SIMD and floating-point           VMSR on page F6-3528
System Register
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F1.13 Advanced SIMD data-processing instructions 


Advanced SIMD data-processing instructions process registers containing vectors of elements of the same type 
packed together, enabling the same operation to be performed on multiple items in parallel. 


Instructions operate on vectors held in 64-bit or 128-bit registers. Figure F1-2 shows an operation on two 64-bit 
operand vectors, generating a 64-bit vector result. 


Note 


Figure F1-2 and other similar figures show 64-bit vectors that consist of four 16-bit elements, and 128-bit vectors 
that consist of four 32-bit elements. Other element sizes produce similar figures, but with one, two, eight, or sixteen 
operations performed in parallel instead of four. 
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Figure F1-2 Advanced SIMD instruction operating on 64-bit registers 


Many Advanced SIMD instructions have variants that produce vectors of elements double the size of the inputs. In 
this case, the number of elements in the result vector is the same as the number of elements in the operand vectors, 
but each element, and the whole vector, is double the size. 


Figure F1-3 shows an example of an Advanced SIMD instruction operating on 64-bit registers, and generating a 
128-bit result. 
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Figure F1-3 Advanced SIMD instruction producing wider result 
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There are also Advanced SIMD instructions that have variants that produce vectors containing elements half the 
size of the inputs. Figure F1-4 on page F1-2392 shows an example of an Advanced SIMD instruction operating on 
one 128-bit register, and generating a 64-bit result. 
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Figure F1-4 Advanced SIMD instruction producing narrower result 


Some Advanced SIMD instructions do not conform to these standard patterns. Their operation patterns are
described in the individual instruction descriptions.
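
The three standard patterns (same-size result, wider result, narrower result) can be modelled on integers that stand for 64-bit and 128-bit registers. The Python sketch below is illustrative only. The function names are hypothetical, and the choice of an unsigned lane-wise add, a widening add in the style of VADDL, and a truncating narrow in the style of VMOVN as representatives is an assumption made for the example.

```python
def split_lanes(vec, lane_bits, total_bits):
    """Split a vector value into lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(vec >> shift) & mask for shift in range(0, total_bits, lane_bits)]


def join_lanes(lanes, lane_bits):
    """Pack lanes (lowest first), truncating each lane to lane_bits."""
    mask = (1 << lane_bits) - 1
    out = 0
    for i, lane in enumerate(lanes):
        out |= (lane & mask) << (i * lane_bits)
    return out


def vadd(a, b, lane_bits, total_bits=64):
    """Same-size pattern: lane-wise add, each lane modulo 2^lane_bits."""
    sums = [x + y for x, y in zip(split_lanes(a, lane_bits, total_bits),
                                  split_lanes(b, lane_bits, total_bits))]
    return join_lanes(sums, lane_bits)


def vaddl_unsigned(a, b, lane_bits):
    """Wider pattern: 64-bit operands, 128-bit result with double-width lanes."""
    sums = [x + y for x, y in zip(split_lanes(a, lane_bits, 64),
                                  split_lanes(b, lane_bits, 64))]
    return join_lanes(sums, 2 * lane_bits)


def vmovn(a, lane_bits):
    """Narrower pattern: 128-bit operand, 64-bit result keeping the low half of each lane."""
    return join_lanes(split_lanes(a, lane_bits, 128), lane_bits // 2)
```

With four 16-bit lanes, adding 1 to a lane that holds 0xFFFF wraps to 0 in vadd but produces 0x00010000 in the corresponding 32-bit lane of vaddl_unsigned. In both cases the number of lanes is unchanged, as the text describes.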


Advanced SIMD instructions that perform floating-point arithmetic use the ARM standard floating-point arithmetic 
defined in Floating-point and Advanced SIMD support on page A1-46. 


F1.13.1 Advanced SIMD parallel addition and subtraction 


Table F1-19 shows the Advanced SIMD parallel add and subtract instructions. 


Table F1-19 Advanced SIMD parallel add and subtract instructions

Instruction                                               See

Vector Add                                                VADD (integer) on page F6-3289
                                                          VADD (floating-point) on page F6-3286
Vector Add and Narrow, returning High Half                VADDHN on page F6-3291
Vector Add Long                                           VADDL on page F6-3293
Vector Add Wide                                           VADDW on page F6-3295
Vector Halving Add                                        VHADD on page F6-3414
Vector Halving Subtract                                   VHSUB on page F6-3416
Vector Pairwise Add and Accumulate Long                   VPADAL on page F6-3562
Vector Pairwise Add                                       VPADD (integer) on page F6-3566
                                                          VPADD (floating-point) on page F6-3564
Vector Pairwise Add Long                                  VPADDL on page F6-3568
Vector Rounding Add and Narrow, returning High Half       VRADDHN on page F6-3629
Vector Rounding Halving Add                               VRHADD on page F6-3644
Vector Rounding Subtract and Narrow, returning High Half  VRSUBHN on page F6-3688
Vector Saturating Add                                     VQADD on page F6-3584
Vector Saturating Subtract                                VQSUB on page F6-3627
Vector Subtract                                           VSUB (integer) on page F6-3754
                                                          VSUB (floating-point) on page F6-3751
Vector Subtract and Narrow, returning High Half           VSUBHN on page F6-3756
Vector Subtract Long                                      VSUBL on page F6-3758
Vector Subtract Wide                                      VSUBW on page F6-3760
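
The halving, rounding halving, and saturating forms listed in Table F1-19 differ only in how each per-element sum is reduced to the element size. The following Python sketch illustrates that difference for 8-bit elements. It is an informal model under stated assumptions: the function names are hypothetical, vectors are plain lists, and the individual instruction descriptions remain the definitive reference.

```python
def vhadd_u8(a, b):
    """Halving add: the full sum is shifted right by one, so it cannot overflow."""
    return [(x + y) >> 1 for x, y in zip(a, b)]


def vrhadd_u8(a, b):
    """Rounding halving add: as the halving add, but rounds instead of truncating."""
    return [(x + y + 1) >> 1 for x, y in zip(a, b)]


def vqadd_u8(a, b):
    """Saturating add, unsigned: sums above 255 are clamped to 255."""
    return [min(x + y, 255) for x, y in zip(a, b)]


def vqadd_s8(a, b):
    """Saturating add, signed: sums are clamped to the range -128 to 127."""
    return [max(-128, min(x + y, 127)) for x, y in zip(a, b)]


halved = vhadd_u8([255], [254])
rounded = vrhadd_u8([255], [254])
saturated = vqadd_u8([200], [100])
```

This model does not represent the cumulative saturation flag that a real saturating instruction can set when a result is clamped.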





F1.13.2 Bitwise Advanced SIMD data-processing instructions 


Table F1-20 shows bitwise Advanced SIMD data-processing instructions. These operate on the doubleword (64-bit)
or quadword (128-bit) registers in the SIMD and floating-point register file, and there is no division into vector
elements.

Table F1-20 Bitwise Advanced SIMD data-processing instructions 

Instruction See 

Vector Bitwise AND VAND (register) on page F6-3299 

Vector Bitwise Bit Clear (AND complement) VBIC (immediate) on page F6-3301 
VBIC (register) on page F6-3303 

Vector Bitwise Exclusive OR VEOR on page F6-3398 

Vector Bitwise Insert if False VBIF on page F6-3305 

Vector Bitwise Insert if True VBIT on page F6-3307 

Vector Bitwise Move VMOV (immediate) on page F6-3505 
VMOV (register) on page F6-3508 

Vector Bitwise NOT VMVN (immediate) on page F6-3541 
VMVN (register) on page F6-3543 

Vector Bitwise OR VORR (immediate) on page F6-3558 
VORR (register) on page F6-3560 

Vector Bitwise OR NOT VORN (register) on page F6-3556 

Vector Bitwise Select VBSL on page F6-3309 
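
Because the bitwise instructions treat a register as a single bit string, the three select-style operations in Table F1-20 can be modelled with ordinary integer operations. The sketch below models a 64-bit doubleword register as a Python integer. The function names and the argument order (destination, first source, second source) are assumptions made for illustration, and the instruction descriptions remain definitive.

```python
MASK64 = (1 << 64) - 1  # a doubleword register holds 64 bits


def vbsl(dd, dn, dm):
    """Bitwise Select: take the Dn bit where the old Dd bit is 1, else the Dm bit."""
    return ((dn & dd) | (dm & ~dd)) & MASK64


def vbit(dd, dn, dm):
    """Bitwise Insert if True: take the Dn bit where the Dm bit is 1, else keep Dd."""
    return ((dn & dm) | (dd & ~dm)) & MASK64


def vbif(dd, dn, dm):
    """Bitwise Insert if False: take the Dn bit where the Dm bit is 0, else keep Dd."""
    return ((dd & dm) | (dn & ~dm)) & MASK64


selected = vbsl(0xF0, 0xAA, 0x55)     # high nibble from 0xAA, low nibble from 0x55
inserted_true = vbit(0x00, 0xFF, 0x0F)
inserted_false = vbif(0x00, 0xFF, 0x0F)
```

The three operations compute the same kind of multiplexer and differ only in which register supplies the selection mask.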

F1.13.3 Advanced SIMD comparison instructions 


Table F1-21 shows Advanced SIMD comparison instructions. 


Table F1-21 Advanced SIMD comparison instructions

Instruction                                     See

Vector Absolute Compare Greater Than or Equal   VACGE on page F6-3278
Vector Absolute Compare Greater Than            VACGT on page F6-3282
Vector Compare Equal                            VCEQ (register) on page F6-3313
Vector Compare Equal to Zero                    VCEQ (immediate #0) on page F6-3311
Vector Compare Greater Than or Equal            VCGE (register) on page F6-3318
Vector Compare Greater Than or Equal to Zero    VCGE (immediate #0) on page F6-3316
Vector Compare Greater Than                     VCGT (register) on page F6-3324
Vector Compare Greater Than Zero                VCGT (immediate #0) on page F6-3322
Vector Compare Less Than or Equal to Zero       VCLE (immediate #0) on page F6-3328
Vector Compare Less Than Zero                   VCLT (immediate #0) on page F6-3335
Vector Test Bits                                VTST on page F6-3770
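
An Advanced SIMD comparison writes a per-element mask rather than setting condition flags: each result element is all ones if the comparison holds for that element, and all zeros otherwise. The following Python sketch models this for two of the instructions in Table F1-21. The function names and the list representation of vectors are assumptions made for illustration only.

```python
def vceq(a, b, bits=8):
    """Compare Equal: all ones in each element where a == b, otherwise zero."""
    ones = (1 << bits) - 1
    return [ones if x == y else 0 for x, y in zip(a, b)]


def vtst(a, b, bits=8):
    """Test Bits: all ones in each element where (a AND b) is nonzero."""
    ones = (1 << bits) - 1
    return [ones if (x & y) != 0 else 0 for x, y in zip(a, b)]


equal_mask = vceq([1, 2, 3], [1, 0, 3])
test_mask = vtst([0b0101, 0b1000], [0b0100, 0b0111])
```

A mask produced this way is a natural input to the bitwise select instructions, which can then choose between two vectors element by element.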
F1.13.4 Advanced SIMD shift instructions


Table F1-22 lists the shift instructions in the Advanced SIMD instruction set.


Table F1-22 Advanced SIMD shift instructions

Instruction                                         See

Vector Saturating Rounding Shift Left               VQRSHL on page F6-3606
Vector Saturating Rounding Shift Right and Narrow   VQRSHRN, VQRSHRUN on page F6-3610
Vector Saturating Shift Left                        VQSHL (register) on page F6-3618
                                                    VQSHL, VQSHLU (immediate) on page F6-3615
Vector Saturating Shift Right and Narrow            VQSHRN, VQSHRUN on page F6-3622
Vector Rounding Shift Left                          VRSHL on page F6-3672
Vector Rounding Shift Right                         VRSHR on page F6-3674
Vector Rounding Shift Right and Accumulate          VRSRA on page F6-3686
Vector Rounding Shift Right and Narrow              VRSHRN on page F6-3678
Vector Shift Left                                   VSHL (immediate) on page F6-3693
                                                    VSHL (register) on page F6-3695
Vector Shift Left Long                              VSHLL on page F6-3697
Vector Shift Right                                  VSHR on page F6-3700
Vector Shift Right and Narrow                       VSHRN on page F6-3704
Vector Shift Left and Insert                        VSLI on page F6-3708
Vector Shift Right and Accumulate                   VSRA on page F6-3712
Vector Shift Right and Insert                       VSRI on page F6-3714
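
The plain, rounding, and accumulating right shifts in Table F1-22 differ only in how the shifted value is adjusted and combined. The Python sketch below shows the unsigned case for a single element size. The function names are hypothetical, vectors are lists of unsigned integers, and signed and saturating behaviour are deliberately left out of this simplified model.

```python
def vshr_u(values, n):
    """Shift Right (unsigned): truncating shift of each element."""
    return [v >> n for v in values]


def vrshr_u(values, n):
    """Rounding Shift Right (unsigned): add half of the discarded range first."""
    return [(v + (1 << (n - 1))) >> n for v in values]


def vsra_u(acc, values, n, bits=8):
    """Shift Right and Accumulate: add the shifted value, wrapping at the element size."""
    mask = (1 << bits) - 1
    return [(a + (v >> n)) & mask for a, v in zip(acc, values)]


truncated = vshr_u([7], 1)        # 7 / 2 truncates to 3
rounded = vrshr_u([7], 1)         # 7 / 2 rounds to 4
accumulated = vsra_u([250], [16], 2)
```

The rounding form computes the intermediate sum at a wider precision than the element, so the rounding constant cannot itself cause an overflow.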
F1.13.5 Advanced SIMD multiply instructions

Table F1-23 summarizes the Advanced SIMD multiply instructions.

Table F1-23 Advanced SIMD multiply instructions

Instruction                                                       See

Vector Multiply Accumulate                                        VMLA (integer) on page F6-3484
                                                                  VMLA (floating-point) on page F6-3481
                                                                  VMLA (by scalar) on page F6-3486
Vector Multiply Accumulate Long                                   VMLAL (integer) on page F6-3488
                                                                  VMLAL (by scalar) on page F6-3490
Vector Multiply Subtract                                          VMLS (integer) on page F6-3495
                                                                  VMLS (floating-point) on page F6-3492
                                                                  VMLS (by scalar) on page F6-3497
Vector Multiply Subtract Long                                     VMLSL (integer) on page F6-3499
                                                                  VMLSL (by scalar) on page F6-3501
Vector Multiply                                                   VMUL (integer and polynomial) on page F6-3533
                                                                  VMUL (floating-point) on page F6-3530
                                                                  VMUL (by scalar) on page F6-3535
Vector Multiply Long                                              VMULL (integer and polynomial) on page F6-3537
                                                                  VMULL (by scalar) on page F6-3539
Vector Fused Multiply Accumulate                                  VFMA on page F6-3404
Vector Fused Multiply Subtract                                    VFMS on page F6-3407
Vector Saturating Doubling Multiply Accumulate Long               VQDMLAL on page F6-3586
Vector Saturating Doubling Multiply Subtract Long                 VQDMLSL on page F6-3589
Vector Saturating Doubling Multiply Returning High Half           VQDMULH on page F6-3592
Vector Saturating Rounding Doubling Multiply Returning High Half  VQRDMULH on page F6-3603
Vector Saturating Doubling Multiply Long                          VQDMULL on page F6-3595





Advanced SIMD multiply instructions can operate on vectors of:


• 8-bit, 16-bit, or 32-bit unsigned integers.


• 8-bit, 16-bit, or 32-bit signed integers.


• 8-bit polynomials over {0, 1}. VMUL and VMULL are the only instructions that operate on polynomials. VMULL
produces a 16-bit polynomial over {0, 1}.


• Single-precision (32-bit) floating-point numbers.


They can also act on one vector and one scalar.


Long instructions have doubleword (64-bit) operands, and produce quadword (128-bit) results. Other Advanced
SIMD multiply instructions can have either doubleword or quadword operands, and produce results of the same
size.


Floating-point multiply instructions can operate on:


• Single-precision (32-bit) floating-point numbers.


• Double-precision (64-bit) floating-point numbers.
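
Multiplication of polynomials over {0, 1}, as mentioned above for VMUL and VMULL, is a carry-less multiply: partial products are combined with exclusive OR instead of addition. The Python sketch below illustrates this for 8-bit polynomial elements. The function names are hypothetical, and the code is an informal model of the element arithmetic rather than the architectural pseudocode.

```python
def polymul8(a, b):
    """Carry-less multiply of two 8-bit polynomials over {0, 1}."""
    result = 0
    for i in range(8):
        if (b >> i) & 1:
            result ^= a << i   # XOR replaces addition, so no carries propagate
    return result              # at most 15 bits, so it fits a 16-bit element


def vmull_p8(dn, dm):
    """Long polynomial multiply: 8-bit elements in, 16-bit polynomials out."""
    return [polymul8(a, b) for a, b in zip(dn, dm)]


def vmul_p8(dn, dm):
    """Polynomial multiply with same-size result: keeps the low 8 bits only."""
    return [polymul8(a, b) & 0xFF for a, b in zip(dn, dm)]


# (x + 1) * (x + 1) = x^2 + 1 over {0, 1}, because the two x terms cancel.
square = polymul8(0b11, 0b11)
long_product = vmull_p8([0xFF], [0xFF])
short_product = vmul_p8([0xFF], [0xFF])
```

The long form is the one that preserves the whole product, which is why its result elements are twice the size of its source elements.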
F1.13.6 Miscellaneous Advanced SIMD data-processing instructions

Table F1-24 shows miscellaneous Advanced SIMD data-processing instructions.

Table F1-24 Miscellaneous Advanced SIMD data-processing instructions

Instruction                                                 See

Vector Absolute Difference and Accumulate                   VABA on page F6-3265
Vector Absolute Difference and Accumulate Long              VABAL on page F6-3267
Vector Absolute Difference                                  VABD (integer) on page F6-3271
                                                            VABD (floating-point) on page F6-3269
Vector Absolute Difference Long                             VABDL (integer) on page F6-3273
Vector Absolute                                             VABS on page F6-3275
Vector Convert between floating-point and fixed-point       VCVT (between floating-point and fixed-point, Advanced SIMD) on
                                                            page F6-3361
Vector Convert between floating-point and integer           VCVT (between floating-point and integer, Advanced SIMD) on page F6-3354
Vector Convert between half-precision and single-precision  VCVT (between half-precision and single-precision, Advanced SIMD) on
                                                            page F6-3352
Vector Count Leading Sign Bits                              VCLS on page F6-3333
Vector Count Leading Zeros                                  VCLZ on page F6-3340
Vector Count Set Bits                                       VCNT on page F6-3348
Vector Duplicate scalar                                     VDUP (scalar) on page F6-3396
Vector Extract                                              VEXT (byte elements) on page F6-3400
Vector Move and Narrow                                      VMOVN on page F6-3523
Vector Move Long                                            VMOVL on page F6-3521
Vector Maximum                                              VMAX (integer) on page F6-3469
                                                            VMAX (floating-point) on page F6-3467
Vector Minimum                                              VMIN (integer) on page F6-3476
                                                            VMIN (floating-point) on page F6-3474
Vector Negate                                               VNEG on page F6-3545
Vector Pairwise Maximum                                     VPMAX (integer) on page F6-3572
                                                            VPMAX (floating-point) on page F6-3570
Vector Pairwise Minimum                                     VPMIN (integer) on page F6-3576
                                                            VPMIN (floating-point) on page F6-3574
Vector Reciprocal Estimate                                  VRECPE on page F6-3631
Vector Reciprocal Step                                      VRECPS on page F6-3633
Vector Reciprocal Square Root Estimate                      VRSQRTE on page F6-3682
Vector Reciprocal Square Root Step                          VRSQRTS on page F6-3684
Vector Reverse in halfwords                                 VREV16 on page F6-3635
Vector Reverse in words                                     VREV32 on page F6-3638
Vector Reverse in doublewords                               VREV64 on page F6-3641
Vector Saturating Absolute                                  VQABS on page F6-3582
Vector Saturating Move and Narrow                           VQMOVN, VQMOVUN on page F6-3598
Vector Saturating Negate                                    VQNEG on page F6-3601
Vector Swap                                                 VSWP on page F6-3762
Vector Table Lookup                                         VTBL, VTBX on page F6-3764
Vector Transpose                                            VTRN on page F6-3767
Vector Unzip                                                VUZP on page F6-3772
Vector Zip                                                  VZIP on page F6-3775
F1.14 Floating-point data-processing instructions

Table F1-25 summarizes the data-processing instructions in the floating-point instruction set. In this table, 
floating-point register means a register in the SIMD and floating-point register file. 
For details of the floating-point arithmetic used by floating-point instructions, see Floating-point and Advanced 
SIMD support on page A1-46. 
Table F1-25 Floating-point data-processing instructions

Instruction                                               See

Convert between double-precision and single-precision     VCVT (between double-precision and single-precision) on page F6-3350
Convert between floating-point and fixed-point            VCVT (between floating-point and fixed-point, floating-point) on
                                                          page F6-3364
Convert between half-precision and single-precision,      VCVTB on page F6-3371
writing to bottom half of single-precision register
Convert between half-precision and single-precision,      VCVTT on page F6-3389
writing to top half of single-precision register
Convert from floating-point to integer                    VCVT (floating-point to integer, floating-point) on page F6-3356
Convert from floating-point to integer using FPSCR        VCVTR on page F6-3386
rounding mode
Convert from integer to floating-point                    VCVT (integer to floating-point, floating-point) on page F6-3359
Copy from one floating-point register to another          VMOV (register) on page F6-3508
Divide                                                    VDIV on page F6-3392
Move immediate value to a floating-point register         VMOV (immediate) on page F6-3505
Square Root                                               VSQRT on page F6-3710
Vector Absolute value                                     VABS on page F6-3275
Vector Add                                                VADD (floating-point) on page F6-3286
Vector Compare with exceptions disabled                   VCMPE on page F6-3345
Vector Compare with exceptions enabled                    VCMP on page F6-3342
Vector Fused Multiply Accumulate                          VFMA on page F6-3404
Vector Fused Multiply Subtract                            VFMS on page F6-3407
Vector Fused Negate Multiply Accumulate                   VFNMA on page F6-3410
Vector Fused Negate Multiply Subtract                     VFNMS on page F6-3412
Vector Multiply                                           VMUL (floating-point) on page F6-3530
Vector Multiply Accumulate                                VMLA (floating-point) on page F6-3481
Vector Multiply Subtract                                  VMLS (floating-point) on page F6-3492
Vector Negate Multiply                                    VNMUL on page F6-3552
Vector Negate Multiply Accumulate                         VNMLA on page F6-3548
Vector Negate Multiply Subtract                           VNMLS on page F6-3550
Vector Negate, by inverting the sign bit                  VNEG on page F6-3545
Vector Subtract                                           VSUB (floating-point) on page F6-3751
Chapter F2


About the T32 and A32 Instruction Descriptions


This chapter describes each instruction. It contains the following sections:


• Format of instruction descriptions on page F2-2402.

• Standard assembler syntax fields on page F2-2406.

• Conditional execution on page F2-2407.

• Shifts applied to a register on page F2-2410.

• Memory accesses on page F2-2412.

• Encoding of lists of general-purpose registers and the PC on page F2-2413.

• General information about the T32 and A32 instruction descriptions on page F2-2414.

• Additional pseudocode support for instruction descriptions on page F2-2426.

• Additional information about Advanced SIMD and floating-point instructions on page F2-2427.
F2.1 Format of instruction descriptions


The instruction descriptions in Chapter F5 T32 and A32 Base Instruction Set Instruction Descriptions and
Chapter F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions normally use the following
format:


• Instruction section title.

• Introduction to the instruction.

• A description of each encoding of the instruction.

• Assembler syntax.

• Pseudocode describing how the instruction operates.

• Notes, if applicable.


Each of these items is described in more detail in the following subsections. 


F2.1.1 Instruction section title 


The instruction section title gives the base mnemonic for the instruction or instructions described in the section. 
When one mnemonic has multiple forms described in separate instruction sections, this is followed by a short 
description of the form in parentheses. The most common use of this is to distinguish between forms of an 
instruction in which one of the operands is an immediate value and forms in which it is a register. 


F2.1.2 Introduction to the instruction 


The introduction to the instruction briefly describes the main features of the instruction. This description is not 
necessarily complete and is not definitive. If there is any conflict between it and the more detailed information that 
follows, the latter takes priority. 


F2.1.3 Instruction encodings 


This is a list of one or more instruction encodings. Each instruction encoding is labelled as: 


• A1, A2, A3 ... for the first, second, third and any additional A32 encodings.
• T1, T2, T3 ... for the first, second, third and any additional T32 encodings.


Each instruction encoding description consists of: 


• An assembly syntax that ensures that the assembler selects the encoding in preference to any other encoding.
In some cases, multiple syntax variants are given. These are written in a typewriter font using the
conventions described in Assembler syntax prototype line conventions on page F2-2404. The correct one to
use can be indicated by:


— A subheading that identifies the encodings that correspond to the syntax. See, for example, the
subheading Flag setting, rotate right with extend variant in the description of the A1 encoding of the
ADC, ADCS (register) instructions in A1 on page F5-2563.


— An annotation to the syntax, such as Inside IT block or Outside IT block. See, for example, the syntax
descriptions of the T1 encoding of the ADC, ADCS (register) instructions in T1 on page F5-2564.


In other cases, the correct one to use can be determined by looking at the assembler syntax description and
using it to determine which syntax corresponds to the instruction being disassembled.


There is usually more than one syntax variant that ensures re-assembly to any particular encoding, and the
exact set of syntaxes that do so usually depends on the register numbers, immediate constants and other
operands to the instruction. For example, when assembling to the T32 instruction set, the syntax AND R0, R0,
R8 ensures selection of a 32-bit encoding but AND R0, R0, R1 selects a 16-bit encoding.

For each instruction encoding belonging to a target instruction set, an assembler can use this information to
determine whether it can use that encoding to encode the instruction requested by the UAL source. If multiple
encodings can encode the instruction then:


— If both a 16-bit encoding and a 32-bit encoding can encode the instruction, the architecture prefers the
16-bit encoding. This means the assembler must use the 16-bit encoding rather than the 32-bit
encoding.


Software can use the .W and .N qualifiers to specify the required encoding width, see Standard 
assembler syntax fields on page F2-2406. 


— If multiple encodings of the same length can encode the instruction, the Assembler syntax subsection 
says which encoding is preferred, and how software can, instead, select the other encodings. 


Each encoding also documents UAL syntax that selects it in preference to any other encoding. 


If no encodings of the target instruction set can encode the instruction requested by the UAL source, normally 
the assembler generates an error saying that the instruction is not available in that instruction set. 


Note 


In some cases, an instruction is available in one instruction set but not in another. The Assembler syntax 
subsection identifies many of these cases. For example, the A32 instructions with bits<31:28> == 0b1111 
described in Branch, branch with link, and block data transfer on page F4-2529, System register access, 
Advanced SIMD, floating-point, and Supervisor Call on page F4-2531, and Unconditional instructions on 
page F4-2540 cannot have a condition code, but the equivalent T32 instructions often can, and this usually 
appears in the Assembler syntax subsection as a statement that the A32 instruction cannot be conditional. 





However, some such cases are too complex to describe in the available space, so the definitive test of whether 
an instruction is available in a given instruction set is whether there is an available encoding for it in that 
instruction set. 





The assembly syntax given for an encoding is therefore a suitable one for a disassembler to disassemble that 
encoding to. However, disassemblers might wish to use simpler syntaxes when they are suitable for the 
operand combination, in order to produce more readable disassembled code. 


• An encoding diagram, where:

— For a 32-bit A32 encoding diagram, the bits are numbered from 31 to 0.


— For a 16-bit T32 encoding diagram, the bits are numbered from 15 to 0.
This halfword can be described as hw1 of the instruction.


— For a 32-bit T32 encoding diagram, the bits are numbered from 15 to 0 for each halfword, as a
reminder that a 32-bit T32 instruction consists of two consecutive halfwords rather than a word.


In this case, the left-hand halfword in the diagram is identified as hw1, and the right-hand halfword is
identified as hw2.


Where instructions are stored using the standard little-endian instruction endianness:


— The encoding diagram for an A32 instruction at address A shows, from left to right, the bytes at
addresses A+3, A+2, A+1, A.


— The encoding diagram for a 32-bit T32 instruction shows bytes in the order A+1, A for hw1, followed
by bytes A+3, A+2 for hw2.
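
The byte orderings described above can be checked with a short Python sketch that uses the standard struct module. The function names are hypothetical, and the instruction values are arbitrary bit patterns chosen only so that every byte is distinguishable. An A32 instruction is stored as one little-endian word, whereas a 32-bit T32 instruction is stored as two little-endian halfwords, hw1 first.

```python
import struct


def a32_bytes(instr):
    """Bytes at addresses A, A+1, A+2, A+3 for a 32-bit A32 instruction."""
    return struct.pack("<I", instr)


def t32_bytes(hw1, hw2):
    """Bytes at addresses A..A+3 for a 32-bit T32 instruction (hw1, then hw2)."""
    return struct.pack("<HH", hw1, hw2)


a32 = a32_bytes(0x11223344)
t32 = t32_bytes(0x1122, 0x3344)
```

The same 32 bits of encoding diagram therefore produce different byte sequences in memory for the two instruction sets, which is why a 32-bit T32 instruction cannot be treated as a single little-endian word.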


• Encoding-specific pseudocode. This is pseudocode that translates the encoding-specific instruction fields
into inputs to the encoding-independent pseudocode in the Operation subsection, and that picks out any 
special cases in the encoding. For a detailed description of the pseudocode used and of the relationship 
between the encoding diagram, the encoding-specific pseudocode and the encoding-independent 
pseudocode, see Appendix K11 ARM Pseudocode Definition. 
F2.1.4 Assembler symbols

The Assembler symbols subsection describes the standard UAL syntax for the instruction.
Each syntax description consists of the following elements:


• Descriptions of all of the variable or optional fields of the syntax.


Some syntax fields are standardized across all or most instructions. Standard assembler syntax fields on 
page F2-2406 describes these fields. 


By default, syntax fields that specify registers, such as <Rd>, <Rn>, or <Rt>, can be any of R0-R12 or LR in
T32 instructions, and any of R0-R12, SP or LR in A32 instructions. These require that the encoding-specific
pseudocode set the corresponding integer variable (such as d, n, or t) to the corresponding register number,
using 0-12 for R0-R12, 13 for SP, or 14 for LR:


— Normally, software can do this by setting the corresponding field in the instruction, typically named 
Rd, Rn, Rt, to the binary encoding of that number. 


— In the case of 16-bit T32 encodings, the field is normally of length 3, and so the encoding is only
available when the assembler syntax specifies one of R0-R7. Such encodings often use a register field
name like Rdn. This indicates that the encoding is only available if <Rd> and <Rn> specify the same
register, and that the register number of that register is encoded in the field if they do.


The description of a syntax field that specifies a register sometimes extends or restricts the permitted range 
of registers or documents other differences from the default rules for such fields. Examples of extensions are 
permitting the use of the SP in a T32 instruction, or permitting the use of the PC, identified using register 
number 15. 


• Where appropriate, text that briefly describes changes from the pre-UAL assembler syntax. Where present,
this usually consists of an alternative pre-UAL form of the assembler mnemonic. The pre-UAL assembler 
syntax does not conflict with UAL. ARM recommends that it is supported, as an optional extension to UAL, 
so that pre-UAL assembler source files can be assembled. 


Assembler syntax prototype line conventions 
The following conventions are used in assembler syntax prototype lines and their subfields: 


<> Any item bracketed by < and > is a short description of a type of value to be supplied by the user in 
that position. A longer description of the item is normally supplied by subsequent text. Such items 
often correspond to a similarly named field in an encoding diagram for an instruction. When the 
correspondence only requires the binary encoding of an integer value or register number to be 
substituted into the instruction encoding, it is not described explicitly. For example, if the assembler 
syntax for an instruction contains an item <Rn> and the instruction encoding diagram contains a 4-bit 
field named Rn, the number of the register specified in the assembler syntax is encoded in binary in 
the instruction field. 


If the correspondence between the assembler syntax item and the instruction encoding is more 
complex than simple binary encoding of an integer or register number, the item description indicates 
how it is encoded. This is often done by specifying a required output from the encoding-specific 
pseudocode, such as add = TRUE. The assembler must only use encodings that produce that output. 


{ } Any item bracketed by { and } is optional. A description of the item and of how its presence or 
absence is encoded in the instruction is normally supplied by subsequent text. 


Many instructions have an optional destination register. Unless otherwise stated, if such a 
destination register is omitted, it is the same as the immediately following source register in the 
instruction syntax. 


# In the assembler syntax, numeric constants are normally preceded by a #. Some UAL instruction 
syntax descriptions explicitly show this # as optional. Any UAL assembler: 
• Must treat the # as optional where an instruction syntax description shows it as optional.


• Can treat the # either as mandatory or as optional where an instruction syntax description does
not show it as optional.


Note


ARM recommends that UAL assemblers treat all uses of # shown in this manual as optional.





spaces Single spaces are used for clarity, to separate items. When a space is obligatory in the assembler 
syntax, two or more consecutive spaces are used. 


+/- This indicates an optional + or - sign. If neither is coded, + is assumed. 


All other characters must be encoded precisely as they appear in the assembler syntax. Apart from { and }, the 
special characters described above do not appear in the basic forms of assembler instructions documented in this 
manual. In a few places, the { and } characters must be encoded as part of a variable item. When this happens, the 
long description of the variable item indicates how they must be used. 


F2.1.5 Pseudocode describing how the instruction operates 


The Operation for all classes subsection contains encoding-independent pseudocode that describes the main
operation of the instruction. For a detailed description of the pseudocode used and of the relationship between the
encoding diagram, the encoding-specific pseudocode and the encoding-independent pseudocode, see
Appendix K11 ARM Pseudocode Definition.
F2.2 Standard assembler syntax fields

The following assembler syntax fields are standard across all or most instructions:


<c> Is an optional field. It specifies the condition under which the instruction is executed. See
Conditional execution on page F2-2407 for the range of available conditions and their encoding. If
<c> is omitted, it defaults to always (AL).


<q> Specifies optional assembler qualifiers on the instruction. The following qualifiers are defined: 


.N Meaning narrow, specifies that the assembler must select a 16-bit encoding for the 
instruction. If this is not possible, an assembler error is produced. 


.W Meaning wide, specifies that the assembler must select a 32-bit encoding for the 
instruction. If this is not possible, an assembler error is produced. 


If neither .W nor .N is specified, the assembler can select either 16-bit or 32-bit encodings. If both 
are available, it must select a 16-bit encoding. In a few cases, more than one encoding of the same 
length can be available for an instruction. The rules for selecting between such encodings are 
instruction-specific and are part of the instruction description. The assembler syntax includes a 
mandatory .W qualifier, along with a note describing the cases in which it applies, where this 
qualifier is required to select a particular encoding for an instruction. Additional assembler syntax 
will describe the syntax when the conditions are not met. 


Note


When assembling to the A32 instruction set, the .N qualifier produces an assembler error and the .W 
qualifier has no effect. 
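
The width selection rules for the <q> field when assembling to T32 can be summarized in a short Python sketch. The function select_width and its arguments are hypothetical and do not correspond to any real assembler interface. The sketch models only the choice between 16-bit and 32-bit encodings, not the instruction-specific rules for choosing between encodings of the same length, and it does not cover the A32 behaviour described in the Note.

```python
def select_width(available, qualifier=None):
    """Pick an encoding width for a T32 instruction.

    available: set of encoding widths that can encode the instruction, from {16, 32}.
    qualifier: None, ".N" (narrow), or ".W" (wide).
    """
    if qualifier == ".N":
        if 16 in available:
            return 16
        raise ValueError("no 16-bit encoding available for .N")
    if qualifier == ".W":
        if 32 in available:
            return 32
        raise ValueError("no 32-bit encoding available for .W")
    if 16 in available:        # with no qualifier, the 16-bit encoding is preferred
        return 16
    if 32 in available:
        return 32
    raise ValueError("instruction not available in this instruction set")


default_width = select_width({16, 32})
forced_wide = select_width({16, 32}, ".W")
only_wide = select_width({32})
```

With no qualifier the sketch returns 16 whenever a 16-bit encoding exists, matching the rule that the assembler must select the 16-bit encoding when both are available.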
F2.3 Conditional execution


Most T32 and A32 instructions can be executed conditionally, based on the values of the APSR condition flags. 
Table F2-1 lists the available conditions. 


Table F2-1 Condition codes

cond   Mnemonic extension   Meaning (integer)              Meaning (floating-point) a          Condition flags

0000   EQ                   Equal                          Equal                               Z == 1
0001   NE                   Not equal                      Not equal, or unordered             Z == 0
0010   CS b                 Carry set                      Greater than, equal, or unordered   C == 1
0011   CC c                 Carry clear                    Less than                           C == 0
0100   MI                   Minus, negative                Less than                           N == 1
0101   PL                   Plus, positive or zero         Greater than, equal, or unordered   N == 0
0110   VS                   Overflow                       Unordered                           V == 1
0111   VC                   No overflow                    Not unordered                       V == 0
1000   HI                   Unsigned higher                Greater than, or unordered          C == 1 and Z == 0
1001   LS                   Unsigned lower or same         Less than or equal                  C == 0 or Z == 1
1010   GE                   Signed greater than or equal   Greater than or equal               N == V
1011   LT                   Signed less than               Less than, or unordered             N != V
1100   GT                   Signed greater than            Greater than                        Z == 0 and N == V
1101   LE                   Signed less than or equal      Less than, equal, or unordered      Z == 1 or N != V
1110   None (AL) d          Always (unconditional)         Always (unconditional)              Any

a. Unordered means at least one NaN operand.

b. HS (unsigned higher or same) is a synonym for CS.

c. LO (unsigned lower) is a synonym for CC.

d. AL is an optional mnemonic extension for always, except in IT instructions. For details see IT on page F5-2681.


In T32 instructions, the condition, if it is not AL, is normally encoded in a preceding IT instruction. For more 
information see Conditional instructions on page F1-2369 and IT on page F5-2681. Some conditional branch 
instructions do not require a preceding IT instruction, because they include a condition code in their encoding. 


For performance reasons, ARMv8 deprecates the use of IT other than with a single 16-bit T32 instruction from a 
specified subset of the 16-bit T32 instructions, see Partial deprecation of IT on page K5-5531. In addition, 
implementations can provide a set of ITD control fields, SCTLR.ITD, SCTLR_EL1.ITD, and HSCTLR.ITD, to 
disable these deprecated uses, making them UNDEFINED. For more information see: 


• Disabling or enabling PL0 and PL1 use of AArch32 deprecated functionality on page G1-3888.


• Disabling or enabling EL2 use of AArch32 deprecated functionality on page G1-3897.


In A32 instructions, bits[31:28] of the instruction contain either: 


• The condition code, see The condition code field in A32 instruction encodings on page F2-2408.


• 0b1111 for some A32 instructions that can only be executed unconditionally.
ARM deprecates the conditional execution of any instruction encoding provided by Advanced SIMD that is not also 
provided by floating-point, and strongly recommends that: 


• For A32 instructions, any such Advanced SIMD instruction that can be conditionally executed is executed
with the <c> field omitted or set to AL.


Note


This applies only to VDUP, see VDUP (general-purpose register) on page F6-3394. The other A32 instructions
do not permit conditional execution.


• For T32 instructions, such Advanced SIMD instructions are never included in an IT block. This means they
must be specified with the <c> field omitted or set to AL.


This deprecation does not apply to Advanced SIMD instruction encodings that are also available as floating-point 
instruction encodings. That is, it does not apply to the Advanced SIMD encodings of the instructions described in 
the following sections: 


• VLDM, VLDMDB, VLDMIA on page F6-3458.

• VLDR (immediate) on page F6-3463 and VLDR (literal) on page F6-3465.

• VMOV (general-purpose register to scalar) on page F6-3512.

• VMOV (between two general-purpose registers and a doubleword floating-point register) on page F6-3503.

• VMRS on page F6-3525.

• VMSR on page F6-3528.

• VPOP on page F6-3578.

• VPUSH on page F6-3580.

• VSTM, VSTMDB, VSTMIA on page F6-3744.

• VSTR on page F6-3749.


See also Conditional execution of undefined instructions on page G1-3851. 


F2.3.1 The condition code field in A32 instruction encodings


Every conditional A32 instruction contains a 4-bit condition code field, the cond field, in bits 31 to 28:


 31      28 27                                                    0
+----------+-------------------------------------------------------+
|   cond   |                                                       |
+----------+-------------------------------------------------------+


This field contains one of the values 0b0000-0b1110, as shown in Table F2-1 on page F2-2407. Most instruction
mnemonics can be extended with the letters defined in the Mnemonic extension on page F2-2407 column of that
table.


If the always (AL) condition is specified, the instruction is executed irrespective of the value of the condition flags. 
The absence of a condition code on an instruction mnemonic implies the AL condition code. 


Pseudocode description of conditional execution 


The AArch32.CurrentCond() function returns a 4-bit condition specifier as follows: 


For A32 instructions, it returns bits[31:28] of the instruction. 


For the T1 and T3 encodings of the Branch instruction (see B on page F5-2607), it returns the 4-bit cond field 
of the encoding. 


For all other T32 instructions: 

— If PSTATE.IT<3:0> != '0000' it returns PSTATE.IT<7:4>.

— If PSTATE.IT<7:0> == '00000000' it returns '1110'.

— Otherwise, execution of the instruction is CONSTRAINED UNPREDICTABLE.


For more information, see Process state, PSTATE on page E1-2294. 
The ConditionPassed() function uses this condition specifier and the condition flags to determine whether the
instruction must be executed, by calling the ConditionHolds() function. 


Chapter J1 ARMv8 Pseudocode includes the definitions of these functions. 
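
As an illustration only, the condition test that ConditionHolds() performs can be sketched in Python. The function name condition_holds and the representation of the flags as separate 0/1 integers are assumptions made for this sketch; the pseudocode in Chapter J1 remains the authoritative definition.

```python
def condition_holds(cond: int, n: int, z: int, c: int, v: int) -> bool:
    """Return True if the 4-bit condition `cond` passes for flags N, Z, C, V (each 0 or 1)."""
    base = (cond >> 1) & 0b111
    if base == 0b000:
        result = z == 1                 # EQ or NE
    elif base == 0b001:
        result = c == 1                 # CS or CC
    elif base == 0b010:
        result = n == 1                 # MI or PL
    elif base == 0b011:
        result = v == 1                 # VS or VC
    elif base == 0b100:
        result = c == 1 and z == 0      # HI or LS
    elif base == 0b101:
        result = n == v                 # GE or LT
    elif base == 0b110:
        result = n == v and z == 0      # GT or LE
    else:
        result = True                   # AL
    # Bit 0 inverts the test, except for the value 0b1111.
    if (cond & 1) and cond != 0b1111:
        result = not result
    return result
```

For example, condition_holds(0b0000, 0, 1, 0, 0) evaluates the EQ condition with the Z flag set and returns True.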


Undefined Instruction exception on page G1-3849 describes the handling of conditional instructions that are 
UNDEFINED, UNPREDICTABLE, or CONSTRAINED UNPREDICTABLE. The pseudocode in the manual, as a sequential 
description of the instructions, has limitations in this respect. For more information, see Limitations of the 
instruction pseudocode on page K11-5632. 
F2.4 Shifts applied to a register


A32 register offset load/store word and unsigned byte instructions can apply a wide range of different constant shifts 
to the offset register. Both T32 and A32 data-processing instructions can apply the same range of different constant 
shifts to the second operand register. For details see Constant shifts. 


A32 data-processing instructions can apply a register-controlled shift to the second operand register. 


F2.4.1 Constant shifts 
These are the same in T32 and A32 instructions, except that the input bits come from different positions. 
<shift> is an optional shift to be applied to <Rm>. It can be any one of: 
(omitted) No shift. 
LSL #<n> Logical shift left <n> bits. 1 <= <n> <= 31. 


LSR #<n> Logical shift right <n> bits. 1 <= <n> <= 32. 


ASR #<n> Arithmetic shift right <n> bits. 1 <= <n> <= 32. 
ROR #<n> Rotate right <n> bits. 1 <= <n> <= 31. 
RRX Rotate right one bit, with extend. Bit[0] is written to shifter_carry_out, bits[31:1] are shifted right
one bit, and the Carry flag is shifted into bit[31].


Note 


Assemblers can permit the use of some or all of ASR #0, LSL #0, LSR #0, and ROR #0 to specify that no shift is to be 
performed. This is not standard UAL, and the encoding selected for T32 instructions might vary between UAL 
assemblers if it is used. To ensure disassembled code assembles to the original instructions, disassemblers must omit 
the shift specifier when the instruction specifies no shift. 





Similarly, assemblers can permit the use of #0 in the immediate forms of ASR, LSL, LSR, and ROR instructions to specify 
that no shift is to be performed, that is, that a MOV (register) instruction is wanted. Again, this is not standard UAL, 
and the encoding selected for T32 instructions might vary between UAL assemblers if it is used. To ensure 
disassembled code assembles to the original instructions, disassemblers must use the MOV (register) syntax when the 
instruction specifies no shift. 





Encoding 

The assembler encodes <shift> into two type bits and five immediate bits, as follows: 
(omitted) type = 0b00, immediate = 0. 

LSL #<n> type = 0b00, immediate = <n>. 


LSR #<n> type = 0b01. 
If <n> < 32, immediate = <n>. 


If <n> == 32, immediate = 0. 


ASR #<n> type = 0b10. 
If <n> < 32, immediate = <n>. 


If <n> == 32, immediate = 0. 





ROR #<n> type = 0b11, immediate = <n>. 
RRX type = 0b11, immediate = 0. 
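
The encoding rules above can be sketched as a pair of Python helpers. This is a minimal illustration, not assembler source: the names encode_imm_shift and decode_imm_shift, and the use of strings for the shift mnemonics, are assumptions made for the sketch.

```python
SHIFT_TYPE_BITS = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}
SHIFT_LIMITS = {"LSL": (1, 31), "LSR": (1, 32), "ASR": (1, 32), "ROR": (1, 31)}


def encode_imm_shift(shift=None, n=0):
    """Encode <shift> as (type, immediate): two type bits and five immediate bits."""
    if shift is None:              # shift omitted
        return (0b00, 0)
    if shift == "RRX":
        return (0b11, 0)
    low, high = SHIFT_LIMITS[shift]
    if not low <= n <= high:
        raise ValueError(f"{shift} #{n} is outside the permitted range")
    # A shift amount of 32 (LSR and ASR only) is encoded as immediate 0.
    return (SHIFT_TYPE_BITS[shift], n % 32)


def decode_imm_shift(type_bits, imm5):
    """Inverse of encode_imm_shift(); ("LSL", 0) means that no shift is applied."""
    if type_bits == 0b00:
        return ("LSL", imm5)
    if type_bits == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if imm5 == 0:
        return ("RRX", 1)
    return ("ROR", imm5)
```

The value type = 0b11 with immediate = 0 is reserved for RRX, which is why ROR #<n> is limited to the range 1 to 31.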
F2.4.2 Register controlled shifts
These are only available in A32 instructions. 


<type> is the type of shift to apply to the value read from <Rm>. It must be one of: 


ASR Arithmetic shift right, encoded as type = 0b10. 
LSL Logical shift left, encoded as type = 0b00. 

LSR Logical shift right, encoded as type = 0b01. 
ROR Rotate right, encoded as type = 0b11. 


The bottom byte of <Rs> contains the shift amount. 


F2.4.3 Pseudocode description of instruction-specified shifts and rotates 


The pseudocode enumeration SRType{} defines the shift types. Shift and rotate instruction decode is described by
the pseudocode functions:


° DecodeImmShift() for a constant shift.
° DecodeRegShift() for a register controlled shift.


Shift and rotate operations are made by the pseudocode function Shift(). 
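
A rough Python model of the shift and rotate operations on a 32-bit value is given here for orientation. The function name shift32 and the string shift types are assumptions of this sketch; it does not compute the carry out, and the pseudocode function Shift() remains the reference.

```python
MASK32 = 0xFFFFFFFF


def shift32(value, srtype, amount, carry_in=0):
    """Apply one of LSL, LSR, ASR, ROR, or RRX to a 32-bit value."""
    value &= MASK32
    if srtype == "RRX":
        # Rotate right one bit, with the Carry flag shifted into bit[31].
        return ((carry_in & 1) << 31) | (value >> 1)
    if amount == 0:
        return value
    if srtype == "LSL":
        return (value << amount) & MASK32
    if srtype == "LSR":
        return value >> amount
    if srtype == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32
    if srtype == "ROR":
        amount %= 32
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError(f"unknown shift type {srtype!r}")
```

For a register controlled shift, the amount passed in would be the bottom byte of <Rs>, for example shift32(rm, "LSL", rs & 0xFF).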
F2.5 Memory accesses


Commonly, the following addressing modes are permitted for memory access instructions: 


Offset addressing 


The offset value is applied to an address obtained from the base register. The result is used as the 
address for the memory access. The value of the base register is unchanged. 


The assembly language syntax for this mode is: 


[<Rn>, <offset>] 


Pre-indexed addressing 


The offset value is applied to an address obtained from the base register. The result is used as the 
address for the memory access, and written back into the base register. 


The assembly language syntax for this mode is: 


[<Rn>, <offset>]!


Post-indexed addressing 


The address obtained from the base register is used, unchanged, as the address for the memory 
access. The offset value is applied to the address, and written back into the base register.


The assembly language syntax for this mode is: 


[<Rn>], <offset> 


In each case, <Rn> is the base register. <offset> can be: 


° An immediate constant, such as <imm8> or <imm12>. 
° An index register, <Rm>. 
° A shifted index register, such as <Rm>, LSL #<shift>. 
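
The difference between the three addressing modes can be summarized with a small Python sketch. The function name resolve_address and the mode strings are assumptions made for illustration; the offset is taken as an already-evaluated signed integer.

```python
MASK32 = 0xFFFFFFFF


def resolve_address(base, offset, mode):
    """Return (access_address, new_base_value) for one memory access."""
    applied = (base + offset) & MASK32
    if mode == "offset":       # [<Rn>, <offset>]
        return (applied, base)
    if mode == "pre":          # [<Rn>, <offset>]!
        return (applied, applied)
    if mode == "post":         # [<Rn>], <offset>
        return (base, applied)
    raise ValueError(f"unknown addressing mode {mode!r}")
```

Only offset addressing leaves the base register unchanged, and only post-indexed addressing accesses memory at the original base address.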


For information about unaligned access, endianness, and exclusive access, see: 
° Alignment support on page E2-2323. 

° Endian support on page E2-2325. 

° Synchronization and semaphores on page E2-2355. 
F2.6 Encoding of lists of general-purpose registers and the PC


A number of instructions operate on lists of general-purpose registers. For some load instructions, the list of 
registers to be loaded can include the PC. For these instructions, the assembler syntax includes a <registers> field, 
that provides a list of the registers to be operated on, with list entries separated by commas. 


The registers list is encoded in the instruction encoding. Most often, this is done using an 8-bit, 13-bit, or 16-bit 
register_list field. This section gives more information about these and other possible register list encodings. 


In a register_list field, each bit corresponds to a single register, and if the <registers> field of the assembler 
instruction includes Rt then register_list<t> is set to 1, otherwise it is set to 0. 


The full rules for the encoding of lists of general-purpose registers, and possibly the PC, are: 


° Except for the cases listed here, 16-bit T32 encodings use an 8-bit register list, and can access only registers
R0-R7.

The exceptions to this rule are:

— The T1 encoding of POP uses an 8-bit register list, and an additional bit, P, that corresponds to the PC.
This means it can access any of R0-R7 and the PC.

— The T1 encoding of PUSH uses an 8-bit register list, and an additional bit, M, that corresponds to the LR.
This means it can access any of R0-R7 and the LR.

° 32-bit T32 encodings of load operations use a 13-bit register list, and two additional bits, M, corresponding to
the LR, and P, corresponding to the PC. This means these instructions can access any of R0-R12 and the LR
and PC.

° 32-bit T32 encodings of store operations use a 13-bit register list, and one additional bit, M, corresponding to
the LR. This means these instructions can access any of R0-R12 and the LR.

° Except for the case listed here, A32 encodings use a 16-bit register list. This means these instructions can
access any of R0-R12 and the SP, LR, and PC.

The exception to this rule is:

— The system instructions LDM (exception return) and LDM (User registers) use a 15-bit register list. This
means these instructions can access any of R0-R12 and the SP and LR.

° The T3 and A2 encodings of POP, and the T3 and A2 encodings of PUSH, access a single register from the set
of registers {R0-R12, LR, PC} and encode the register number in the Rt field.





Note 


POP is a load operation, and PUSH is a store operation. 





In every case, the encoding-specific pseudocode converts the register list into a 32-bit variable, registers, with a
bit corresponding to each of the registers R0-R12, SP, LR, and PC.
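
The following Python sketch shows how the encoded fields could be combined into such a bitmask, with bit t set when Rt is in the list (R13 is the SP, R14 the LR, and R15 the PC). The helper names are hypothetical, and only three of the encodings listed above are modelled.

```python
REGISTER_NAMES = [f"R{i}" for i in range(13)] + ["SP", "LR", "PC"]


def t1_pop_registers(register_list, p):
    """T1 encoding of POP: 8-bit register list plus the P bit for the PC."""
    return (p << 15) | (register_list & 0xFF)


def t1_push_registers(register_list, m):
    """T1 encoding of PUSH: 8-bit register list plus the M bit for the LR."""
    return (m << 14) | (register_list & 0xFF)


def t32_load_registers(register_list, m, p):
    """32-bit T32 load encodings: 13-bit register list plus the M and P bits."""
    return (p << 15) | (m << 14) | (register_list & 0x1FFF)


def register_names(registers):
    """List the registers selected by a bitmask, lowest register number first."""
    return [name for t, name in enumerate(REGISTER_NAMES) if (registers >> t) & 1]
```

For example, register_names(t1_pop_registers(0b00010001, 1)) lists R0, R4, and the PC.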





Note 


Some floating-point and Advanced SIMD instructions operate on lists of SIMD and floating-point registers. The 
assembler syntax of these instructions includes a <list> field that specifies the registers to be operated on, and the 
description of the instruction in Alphabetical list of T32 and A32 base instruction set instructions on page F5-2560 
defines the use and encoding of this field. 
F2.7 General information about the T32 and A32 instruction descriptions

Chapter F3 The T32 Instruction Set Encoding describes the T32 instruction encodings, and Chapter F4 The A32
Instruction Set Encoding describes the A32 instruction encodings. The following subsections give more information
about the descriptions of these instructions and their encodings:

° UNDEFINED, UNPREDICTABLE, and CONSTRAINED UNPREDICTABLE instruction set space. 

° T32 and A32 Advanced SIMD and floating-point instruction encodings on page F2-2415. 

° The PC and the use of 0b1111 as a register specifier in T32 and A32 instructions on page F2-2419.

° The SP and the use of 0b1101 as a register specifier in T32 and A32 instructions on page F2-2420.

° Modified immediate constants in T32 and A32 instructions on page F2-2420.

F2.7.1 UNDEFINED, UNPREDICTABLE, and CONSTRAINED UNPREDICTABLE instruction set space 

An attempt to execute an unallocated instruction results in either: 

° Unpredictable behavior. The instruction is described as UNPREDICTABLE or CONSTRAINED UNPREDICTABLE. 
ARMv8-A greatly reduces the architecturally UNPREDICTABLE behavior in AArch32 state. Most cases that 
earlier versions of the architecture describe as UNPREDICTABLE become either: 

— CONSTRAINED UNPREDICTABLE, meaning the architecture defines a limited range of permitted 
behaviors. 

— Fully predictable. 

For more information see Appendix K1 Architectural Constraints on UNPREDICTABLE behaviors. 

° An Undefined Instruction exception. The instruction is described as UNDEFINED. 

An instruction is UNDEFINED if it is declared as UNDEFINED in an instruction description, or in Chapter F3 The T32
Instruction Set Encoding or Chapter F4 The A32 Instruction Set Encoding.

An instruction is UNPREDICTABLE only if: 

° It is declared as UNPREDICTABLE in an instruction description or in Chapter F3 or Chapter F4, and
Appendix K1 does not redefine the behavior as CONSTRAINED UNPREDICTABLE.

° The pseudocode for that encoding does not indicate that a different special case applies, and a bit marked (0)
or (1) in the encoding diagram of an instruction is not 0 or 1 respectively. In most cases, ARMv8 makes these
cases CONSTRAINED UNPREDICTABLE, as described in SBZ or SBO fields in T32 and A32 instructions on
page K1-5460.

Unless otherwise specified, T32 and A32 instructions provided as part of an architectural extension, or by an
optional feature of the architecture, are UNDEFINED in an implementation that does not include that extension or
feature.

Note

Examples of where this rule applies are: 

° The instructions provided by the Cryptographic Extension. 

° The system instructions that provide access to the System registers of the OPTIONAL Performance Monitors 
Extension. 

° The Advanced SIMD and floating-point instructions. 

For more information about UNDEFINED, UNPREDICTABLE, and CONSTRAINED UNPREDICTABLE instruction behavior,
see Undefined Instruction exception on page G1-3849.

For more information about the behavior of T32 and A32 instructions in earlier versions of the architecture see the
ARM® Architecture Reference Manual, ARMv7-A and ARMv7-R edition.

F2.7.2 T32 and A32 Advanced SIMD and floating-point instruction encodings


The T32 and A32 encodings of Advanced SIMD and floating-point instructions that are described in Chapter F3 
The T32 Instruction Set Encoding and in Chapter F4 The A32 Instruction Set Encoding are common to the T32 and 
A32 instruction sets. This means: 


° The instruction groups, and the set of instructions in each group, are identical for T32 and A32. 
° For each instruction: 
— Each T32 encoding is exactly equivalent to an A32 encoding. 


— There is no T32 encoding without an equivalent A32 encoding, and no A32 encoding without an 
equivalent T32 encoding. 





Note 
° In the T32 instruction set, the Advanced SIMD and floating-point instructions have 32-bit encodings.
° In the base instruction sets, some instructions are common to the T32 and A32 instruction sets, whereas other
instructions have equivalent but not identical functionality in the two instruction sets.





32-bit T32 encodings are described as two contiguous halfwords, {hw1:hw2}, as described in Instruction encodings 
on page F2-2402. In general: 


° hw1 of a T32 encoding maps onto bits[31:16] of an equivalent A32 encoding. 
° hw2 of a T32 encoding maps onto bits[15:0] of an equivalent A32 encoding. 


However, the different structures of the T32 instruction encoding space and the A32 instruction encoding space 
mean that: 


° For a given Advanced SIMD and floating-point instruction group: 


— The positions of the fields that identify the instruction, or instruction encoding, within the instruction 
group might differ between the T32 encodings and the A32 encodings. 


— However, the field values that identify the instruction or instruction encoding are identical for the T32
encoding and the A32 encoding.


The remainder of this section describes the equivalence of the T32 and A32 encodings for each of the Advanced 
SIMD and floating-point instruction groups. 


Advanced SIMD data-processing 


The T32 encoding of the Advanced SIMD data-processing group is: 


[Encoding diagram: T32 Advanced SIMD data-processing group, with the identifying fields op0, op1, op2, op3, and op4.]


The A32 encoding of the Advanced SIMD data-processing group is: 


[Encoding diagram: A32 Advanced SIMD data-processing group, with the identifying fields op0, op1, op2, and op3.]

The encodings in this group are identified by:


° hw1[15:13] of the T32 encoding is equivalent to bits[27:25] of the A32 encoding, and: 
— Has the value 0b111 in the T32 encoding. 
— Has the value 0b001 in the A32 encoding. 
° hw1[11:8] of the T32 encoding is equivalent to bits[31:28] of the A32 encoding, and has the value 0b1111.


This table shows the equivalence of the fields that identify the instructions, or instruction encodings, within this 
group: 





T32 encoding    A32 encoding    Field size
op0:op1         op0             2 bits
op2             op1             15 bits
op3             op2             1 bit
op4             op3             1 bit





Advanced SIMD element or structure load/store 


The T32 encoding of the Advanced SIMD element or structure load/store group is: 


[Encoding diagram: T32 Advanced SIMD element or structure load/store group, with the identifying fields op0, op1, and op2.]


The A32 encoding of the Advanced SIMD element or structure load/store group is: 


[Encoding diagram: A32 Advanced SIMD element or structure load/store group, with the identifying fields op0, op1, and op2.]


The encodings in this group are identified by: 
° hw1[15:12] of the T32 encoding is equivalent to bits[31:28] of the A32 encoding, and has the value 0b1111. 


° hw1[11:8] of the T32 encoding is equivalent to bits[27:24] of the A32 encoding, and: 
— Has the value 0b1001 in the T32 encoding. 
— Has the value 0b0100 in the A32 encoding. 
° hw1[4] of the T32 encoding is equivalent to bit[20] of the A32 encoding, and has the value 0b0. 


op0, op1, and op2 are the fields that identify the instructions, or instruction encodings, within this group, and they are
in equivalent positions in the T32 and A32 encodings. 
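
Because only the fixed bits in the top byte differ for this group, converting a T32 encoding to the equivalent A32 encoding reduces to replacing that byte. The Python sketch below illustrates this under the field values listed above; the function name t32_to_a32_simd_ldst is an assumption, and the sketch does not check whether the resulting encoding is an allocated instruction.

```python
def t32_to_a32_simd_ldst(hw1, hw2):
    """Map a T32 element or structure load/store encoding to its A32 equivalent."""
    # hw1[15:12] must be 0b1111 and hw1[11:8] must be 0b1001.
    if (hw1 >> 8) != 0b11111001:
        raise ValueError("hw1[15:8] does not identify this group")
    # hw1[4] is equivalent to bit[20] of the A32 encoding and must be 0.
    if (hw1 >> 4) & 1:
        raise ValueError("hw1[4] must be 0 for this group")
    t32 = (hw1 << 16) | hw2
    # Keep bits[23:0]; bits[31:24] become 0b1111 followed by 0b0100.
    return (t32 & 0x00FFFFFF) | 0xF4000000
```

For example, the halfword pair 0xF920, 0x070F maps to the A32 word 0xF420070F.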
Floating-point and Advanced SIMD load/store and 64-bit register moves


The T32 encoding of the Floating-point and Advanced SIMD load/store and 64-bit register moves group is: 


[Encoding diagram: T32 Floating-point and Advanced SIMD load/store and 64-bit register moves group, with the identifying field op0.]


The A32 encoding of the Floating-point and Advanced SIMD load/store and 64-bit register moves group is: 


[Encoding diagram: A32 Floating-point and Advanced SIMD load/store and 64-bit register moves group, with the identifying field op0.]


The encodings in the group are identified by: 


° hw1[15:12] of the T32 encoding is equivalent to bits[31:28] of the A32 encoding, and: 
— Has the value 0b1110 in the T32 encoding. 


— Can have any value other than 0b1111 in the A32 encoding. 


This range of values is required because A32 instructions in this group can be executed conditionally, 
see Conditional execution on page F2-2407. 


° hw1[11:9] of the T32 encoding is equivalent to bits[27:25] of the A32 encoding, and has the value 0b110. 
° hw2[11:9] of the T32 encoding is equivalent to bits[11:9] of the A32 encoding, and has the value 0b101. 
op0 is the field that identifies the instructions, or instruction encodings, within this group, and is in equivalent 
positions in the T32 and A32 encodings. 

Floating-point and Advanced SIMD 32-bit register moves 


The T32 encoding of the Floating-point and Advanced SIMD 32-bit register moves group is:


[Encoding diagram: T32 Floating-point and Advanced SIMD 32-bit register moves group, showing the fields op0 and op1.]


The A32 encoding of the Floating-point and Advanced SIMD 32-bit register moves group is:


[Encoding diagram: A32 Floating-point and Advanced SIMD 32-bit register moves group, showing the fields op0 and op1.]


The encodings in this group are identified by: 


° hw1[15:12] of the T32 encoding is equivalent to bits[31:28] of the A32 encoding, and: 
— Has the value 0b1110 in the T32 encoding. 
— Can have any value other than 0b1111 in the A32 encoding. 


This range of values is required because A32 instructions in this group can be executed conditionally, 
see Conditional execution on page F2-2407. 


° hw1[11:8] of the T32 encoding is equivalent to bits[27:24] of the A32 encoding, and has the value 0b1110. 
° hw2[11:9] of the T32 encoding is equivalent to bits[11:9] of the A32 encoding, and has the value 0b101.
° hw2[4:0] of the T32 encoding is equivalent to bits[4:0] of the A32 encoding, and has the value 0b11111. 


op0 is the field that identifies the instructions, or instruction encodings, within this group, and is in equivalent
positions in the T32 and A32 encodings. 


Floating-point data-processing 


The T32 encoding of the Floating-point data-processing group is: 


[Encoding diagram: T32 Floating-point data-processing group, with the identifying fields op0, op1, op2, and op3.]


The A32 encoding of the Floating-point data-processing group is: 


[Encoding diagram: A32 Floating-point data-processing group, with the cond field and the identifying fields op0, op1, and op2.]


The encodings in this group are identified by: 


° hw1[15:12] of the T32 encoding is equivalent to bits[31:28] of the A32 encoding, and: 


— In the T32 encoding, hw1[15:13] has the value 0b111, and hw1[12] is the op0 parameter used in
identifying instruction encodings within this group.


— In the A32 encoding, this is the cond field, which also implies the value of bit[28] of some A32 instruction
encodings within this group, as the following table shows:


cond         Significance of bit[28] in A32 encodings
!= 0b1111    Part of the cond field.
0b1111       Has fixed value of 1.





The range of cond values other than 0b1111 is required because A32 instructions in this group can be 
executed conditionally, see Conditional execution on page F2-2407. 


° hw1[11:8] of the T32 encoding is equivalent to bits[27:24] of the A32 encoding, and has the value 0b1110.
° hw2[11:9] of the T32 encoding is equivalent to bits[11:9] of the A32 encoding, and has the value 0b101. 


° hw2[4] of the T32 encoding is equivalent to bit[4] of the A32 encoding, and has the value 0b0. 
This table shows the equivalence of the fields that identify the instructions, or instruction encodings, within this
group:

T32 encoding    A32 encoding
op0             Bit[28] of the instruction encoding is 1 when cond is 0b1111.
op1             op0
op2             op1
op3             op2

F2.7.3 The PC and the use of 0b1111 as a register specifier in T32 and A32 instructions 


Restrictions on the use of PC or 0b1111 as a register specifier differ between the T32 and the A32 instruction sets, 
as described in: 


° T32 restrictions on the use of the PC, and use of 0b1111 as a register specifier.
° A32 restrictions on the use of PC or 0b1111 as a register specifier on page F2-2420.


T32 restrictions on the use of the PC, and use of 0b1111 as a register specifier 


The use of 0b1111 as a register specifier is not normally permitted in T32 instructions. When a value of 0b1111 is 
permitted, a variety of meanings is possible. For register reads, these meanings include: 


° Read the PC value, that is, the address of the current instruction + 4. The base register of the table branch
instructions TBB and TBH can be the PC. This means branch tables can be placed in memory immediately after
the instruction.

Note

ARM deprecates use of the PC as the base register in the STC instruction.


° Read the word-aligned PC value, that is, the address of the current instruction + 4, with bits[1:0] forced to
zero. The base register of LDC, LDR, LDRB, LDRD (pre-indexed, no writeback), LDRH, LDRSB, and LDRSH instructions
can be the word-aligned PC. This provides PC-relative data addressing. In addition, some encodings of the
ADD and SUB instructions permit their source registers to be 0b1111 for the same purpose.


° Read zero. This is done in some cases when one instruction is a special case of another, more general
instruction, but with one operand zero. In these cases, the instructions are listed on separate pages, with a
special case in the pseudocode for the more general instruction cross-referencing the other page.
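
The two PC read values described above for T32 instructions differ only in the treatment of bits[1:0]. A minimal Python sketch, using the hypothetical helper names t32_pc_read and t32_aligned_pc_read, is:

```python
MASK32 = 0xFFFFFFFF


def t32_pc_read(instruction_address):
    """PC value read by a T32 instruction: address of the current instruction + 4."""
    return (instruction_address + 4) & MASK32


def t32_aligned_pc_read(instruction_address):
    """Word-aligned PC value: the PC read value with bits[1:0] forced to zero."""
    return t32_pc_read(instruction_address) & ~0b11 & MASK32
```

The two values are equal when the instruction is at a word-aligned address, and differ by 2 when it is at an address that is only halfword-aligned.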


For register writes, these meanings include: 


° The PC can be specified as the destination register of an LDR instruction. This is done by encoding Rt as
0b1111. The loaded value is treated as an address, and the effect of execution is a branch to that address. Bit[0]
of the loaded value selects whether to execute A32 or T32 instructions after the branch.


° Some other instructions write the PC in similar ways. An instruction can specify that the PC is written:
—  Implicitly, for example, branch instructions. 

— Explicitly by a register specifier of 0b1111, for example 16-bit MOV (register) instructions. 

— Explicitly by using a register mask, for example LDM instructions. 

The address to branch to can be: 

— A loaded value, for example, RFE. 

— A register value, for example, BX. 


— The result of a calculation, for example, TBB or TBH. 
The method of choosing the instruction set used after the branch can be:
— Similar to the LDR case, for example, LDM or BX. 
— A fixed instruction set other than the one currently being used, for example, the immediate form of BLX. 
— Unchanged, for example, branch instructions or 16-bit MOV (register) instructions. 
— Set from the SPSR.T bit, for RFE and SUBS PC, LR, #imm8. 

° Discard the result of a calculation. This is done in some cases when one instruction is a special case of
another, more general instruction, but with the result discarded. In these cases, the instructions are listed on
separate pages, with a special case in the pseudocode for the more general instruction cross-referencing the
other page.


° If the destination register specifier of an LDRB, LDRH, LDRSB, or LDRSH instruction is 0b1111, the instruction is a
memory hint instead of a load operation.


° If the destination register specifier of an MRC instruction is 0b1111, bits[31:28] of the value transferred from
the System register are written to the N, Z, C, and V condition flags in the APSR, and bits[27:0] are discarded.


A32 restrictions on the use of PC or 0b1111 as a register specifier 
In A32 instructions, the use of 0b1111 as a register specifier specifies the PC. 


Many instructions are CONSTRAINED UNPREDICTABLE if they use 0b1111 as a register specifier. This is specified by 
pseudocode in the instruction description. ARMv8-A constrains the resulting CONSTRAINED UNPREDICTABLE 
behavior, see Using R15 on page K1-5457. 


Note 


ARM deprecates use of the PC as the base register in any store instruction. 

















F2.7.4 The SP and the use of 0b1101 as a register specifier in T32 and A32 instructions

In the T32 and A32 instruction sets, ARM recommends that the use of 0b1101 as a register specifier specifies the SP.

Note

° The recommendation that the register specifier 0b1101 is only used to specify the SP applies to both the T32
and the A32 instruction sets.
° Despite this recommendation, T32 instructions that can access R13, or the SP, behave predictably in ARMv8.
This differs from ARMv7, where many uses of R13 are defined as UNPREDICTABLE. For more information
about these cases see the ARM® Architecture Reference Manual, ARMv7-A and ARMv7-R edition.

F2.7.5 Modified immediate constants in T32 and A32 instructions

The following sections describe the encoding of modified immediate constants:

° Modified immediate constants in T32 instructions.
° Modified immediate constants in A32 instructions on page F2-2422.
° Modified immediate constants in T32 and A32 Advanced SIMD instructions on page F2-2423.
° Modified immediate constants in T32 and A32 floating-point instructions on page F2-2424.

Modified immediate constants in T32 instructions

The encoding of a modified immediate constant in a 32-bit T32 instruction is:

[Encoding diagram: positions of the i bit, the imm3 field, and the a, b, c, d, e, f, g, and h bits within the two halfwords of the instruction.]

Table F2-2 shows the range of modified immediate constants available in T32 data-processing instructions, and
their encoding in the a, b, c, d, e, f, g, h, and i bits, and the imm3 field, in the instruction.


Table F2-2 Encoding of modified immediates in T32 data-processing instructions

i:imm3:a    <const> (a)
0000x       00000000 00000000 00000000 abcdefgh
0001x       00000000 abcdefgh 00000000 abcdefgh    (b)
0010x       abcdefgh 00000000 abcdefgh 00000000    (b)
0011x       abcdefgh abcdefgh abcdefgh abcdefgh    (b)
01000       1bcdefgh 00000000 00000000 00000000
01001       01bcdefg h0000000 00000000 00000000    (c)
01010       001bcdef gh000000 00000000 00000000
01011       0001bcde fgh00000 00000000 00000000    (c)
            8-bit values shifted to other positions
11101       00000000 00000000 000001bc defgh000    (c)
11110       00000000 00000000 0000001b cdefgh00
11111       00000000 00000000 00000001 bcdefgh0    (c)

a. This table shows the immediate constant value in binary form, to relate abcdefgh to the
encoding diagram. In assembly syntax, the immediate value is specified in the usual
way (a decimal number by default).

b. ARM deprecates using a modified immediate with abcdefgh == 00000000, and these
cases are CONSTRAINED UNPREDICTABLE, see UNPREDICTABLE cases in immediate
constants in T32 data-processing instructions on page K1-5460.

c. Not available in A32 instructions if h == 1.


As the footnotes to Table F2-2 show, the range of values available in T32 modified immediate constants is slightly 
different from the range of values available in A32 instructions. See Modified immediate constants in A32 
instructions on page F2-2422 for the A32 values. 





Carry out 


A logical instruction with i:imm3:a == '00xxx' does not affect the Carry flag. Otherwise, a logical flag-setting
instruction sets the Carry flag to the value of bit[31] of the modified immediate constant.


Operation of modified immediate constants, T32 instructions 


For a T32 data-processing instruction, the T32ExpandImm() pseudocode function returns the value of the 32-bit 
immediate constant, calling T32ExpandImm_C() to evaluate the constant. 
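
The expansion described by Table F2-2 and the carry out rule can be sketched in Python as follows. This is an illustrative model, not the manual's pseudocode: the name t32_expand_imm_c, the packing of the fields into a 12-bit integer i:imm3:abcdefgh, and the decision to raise an exception for the CONSTRAINED UNPREDICTABLE cases are all assumptions of the sketch.

```python
MASK32 = 0xFFFFFFFF


def t32_expand_imm_c(imm12, carry_in):
    """Expand i:imm3:abcdefgh (12 bits) to (imm32, carry_out)."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) == 0b00:
        # i:imm3:a == '00xxx': byte replication patterns, Carry flag unaffected.
        pattern = (imm12 >> 8) & 0b11
        if pattern != 0b00 and imm8 == 0:
            # Sketch-only choice: reject rather than model the permitted behaviors.
            raise ValueError("CONSTRAINED UNPREDICTABLE: abcdefgh == 00000000")
        if pattern == 0b00:
            imm32 = imm8
        elif pattern == 0b01:
            imm32 = (imm8 << 16) | imm8
        elif pattern == 0b10:
            imm32 = (imm8 << 24) | (imm8 << 8)
        else:
            imm32 = imm8 * 0x01010101
        return imm32, carry_in
    # Otherwise '1bcdefgh' is rotated right by the value of i:imm3:a (8 to 31).
    unrotated = 0x80 | (imm12 & 0x7F)
    rotation = imm12 >> 7
    imm32 = ((unrotated >> rotation) | (unrotated << (32 - rotation))) & MASK32
    return imm32, imm32 >> 31
```

For example, t32_expand_imm_c(0x400, 0) corresponds to the row 01000 with bcdefgh all zero, and returns the constant 0x80000000 with a carry out of 1.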
Modified immediate constants in A32 instructions
The encoding of a modified immediate constant in an A32 instruction is:


[Encoding diagram: the 4-bit rotation field followed by the a, b, c, d, e, f, g, and h bits, in bits[11:0] of the instruction.]


Table F2-3 shows the range of modified immediate constants available in A32 data-processing instructions, and 
their encoding in the a, b, c, d, e, f, g, and h bits and the rotation field in the instruction. 




Table F2-3 Encoding of modified immediates in A32 processing instructions

rotation    <const> (a)
0000        00000000 00000000 00000000 abcdefgh
0001        gh000000 00000000 00000000 00abcdef
0010        efgh0000 00000000 00000000 0000abcd
0011        cdefgh00 00000000 00000000 000000ab
0100        abcdefgh 00000000 00000000 00000000
            8-bit values shifted to other even-numbered positions
1001        00000000 00abcdef gh000000 00000000
            8-bit values shifted to other even-numbered positions
1110        00000000 00000000 0000abcd efgh0000
1111        00000000 00000000 000000ab cdefgh00

a. This table shows the immediate constant value in binary form, to relate abcdefgh to the encoding diagram.
In assembly syntax, the immediate value is specified in the usual way (a decimal number by default).





Note 


The range of values available in A32 modified immediate constants is slightly different from the range of values 
available in 32-bit T32 instructions. See Modified immediate constants in T32 instructions on page F2-2420. 





Carry out 


A logical instruction with the rotation field set to 0b0000 does not affect APSR.C. Otherwise, a logical flag-setting 
instruction sets APSR.C to the value of bit[31] of the modified immediate constant. 


Constants with multiple encodings 


Some constant values have multiple possible encodings. In this case, a UAL assembler must select the encoding 
with the lowest unsigned value of the rotation field. This is the encoding that appears first in Table F2-3. For 
example, the constant #3 must be encoded with (rotation, abcdefgh) == (0b0000, 0b00000011), not (0b0001,
0b00001100), (0b0010, 0b00110000), or (0b0011, 0b11000000).


In particular, this means that all constants in the range 0-255 are encoded with rotation == 0b0000, and permitted 
constants outside that range are encoded with rotation != 0b0000. A flag-setting logical instruction with a modified 
immediate constant therefore leaves APSR.C unchanged if the constant is in the range 0-255 and sets it to the most 
significant bit of the constant otherwise. This matches the behavior of T32 modified immediate constants for all 
constants that are permitted in both the A32 and T32 instruction sets. 
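
The selection rule can be illustrated with a short Python sketch that searches the rotation values in ascending order and returns the first one that fits. The function name a32_encode_imm is hypothetical, and returning None for a constant that has no encoding is a choice made for this sketch.

```python
MASK32 = 0xFFFFFFFF


def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def a32_encode_imm(const):
    """Return (rotation, abcdefgh) with the lowest rotation, or None if not encodable."""
    const &= MASK32
    for rotation in range(16):
        # The constant is abcdefgh rotated right by 2*rotation, so rotating the
        # constant left by the same amount recovers the candidate 8-bit value.
        candidate = rol32(const, 2 * rotation)
        if candidate <= 0xFF:
            return (rotation, candidate)
    return None
```

With this search order, a32_encode_imm(3) returns (0, 3), matching the required encoding of the constant #3.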
An alternative syntax is available for a modified immediate constant that permits the programmer to specify the
encoding directly. In this syntax, #<const> is instead written as #<byte>, #<rot>, where: 


<byte> Is the numeric value of abcdefgh, in the range 0-255. 
<rot> Is twice the numeric value of rotation, an even number in the range 0-30. 


This syntax permits all A32 data-processing instructions with modified immediate constants to be disassembled to 
assembler syntax that assembles to the original instruction. 


This syntax also makes it possible to write variants of some flag-setting logical instructions that have different 
effects on APSR.C to those obtained with the normal #<const> syntax. For example, ANDS R1, R2, #12, #2 has the 
same behavior as ANDS R1, R2, #3 except that it sets APSR.C to 0 instead of leaving it unchanged. Such variants of 
flag-setting logical instructions do not have equivalents in the T32 instruction set, and ARM deprecates their use. 


Operation of modified immediate constants, A32 instructions 


For an A32 data-processing instruction, the A32ExpandImm() pseudocode function returns the value of the 32-bit 
immediate constant, calling A32ExpandImm_C() to evaluate the constant. 
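
The expansion and the lowest-rotation encoding rule described above can be illustrated with a short Python sketch. This is not the manual's pseudocode: the names a32_expand_imm_c and a32_encode_imm are hypothetical, and the sketch assumes the constant is abcdefgh rotated right by twice the rotation field, with the carry out taken from bit[31] when the rotation field is nonzero.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    if amount == 0:
        return value & MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def a32_expand_imm_c(imm12, carry_in):
    """Return (imm32, carry_out) for a 12-bit rotation:abcdefgh field."""
    rotation = (imm12 >> 8) & 0xF
    byte = imm12 & 0xFF
    if rotation == 0:
        # A rotation field of 0b0000 leaves the carry flag unchanged.
        return byte, carry_in
    imm32 = ror32(byte, 2 * rotation)
    return imm32, imm32 >> 31


def a32_encode_imm(const):
    """Return the imm12 with the lowest rotation field, or None if not encodable."""
    for rotation in range(16):
        # Rotating left by 2*rotation undoes the rotate right done on expansion.
        byte = ror32(const, 32 - 2 * rotation)
        if byte <= 0xFF:
            return (rotation << 8) | byte
    return None
```

For example, a32_encode_imm(3) selects rotation 0b0000, while expanding the directly specified encoding #12, #2 (imm12 0x10C) yields the same value 3 with a carry out of 0, matching the ANDS example above.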


Modified immediate constants in T32 and A32 Advanced SIMD instructions 


Table F2-4 shows the modified immediate constants available with Advanced SIMD instructions, and how they are 
encoded. 


Table F2-4 Modified immediate values for Advanced SIMD instructions 

op  cmode  Constant (a)                                                               <dt> (b)  Notes
-   000x   00000000 00000000 00000000 abcdefgh 00000000 00000000 00000000 abcdefgh   I32       c
-   001x   00000000 00000000 abcdefgh 00000000 00000000 00000000 abcdefgh 00000000   I32       c, d
-   010x   00000000 abcdefgh 00000000 00000000 00000000 abcdefgh 00000000 00000000   I32       c, d
-   011x   abcdefgh 00000000 00000000 00000000 abcdefgh 00000000 00000000 00000000   I32       c, d
-   100x   00000000 abcdefgh 00000000 abcdefgh 00000000 abcdefgh 00000000 abcdefgh   I16       c
-   101x   abcdefgh 00000000 abcdefgh 00000000 abcdefgh 00000000 abcdefgh 00000000   I16       c, d
-   1100   00000000 00000000 abcdefgh 11111111 00000000 00000000 abcdefgh 11111111   I32       d, e
-   1101   00000000 abcdefgh 11111111 11111111 00000000 abcdefgh 11111111 11111111   I32       d, e
0   1110   abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh   I8        f
0   1111   aBbbbbbc defgh000 00000000 00000000 aBbbbbbc defgh000 00000000 00000000   F32       f, g
1   1110   aaaaaaaa bbbbbbbb cccccccc dddddddd eeeeeeee ffffffff gggggggg hhhhhhhh   I64       f
1   1111   UNDEFINED                                                                 -         -





a. In this table, the immediate value is shown in binary form, to relate abcdefgh to the encoding diagram. In assembler syntax, the constant is 
specified by a data type and a value of that type. That value is specified in the normal way (a decimal number by default) and is replicated 
enough times to fill the 64-bit immediate. For example, a data type of I32 and a value of 10 specify the 64-bit constant 0x0000000A0000000A. 


b. This specifies the data type used when the instruction is disassembled. On assembly, the data type must be matched in the table if possible. 
Other data types are permitted as pseudo-instructions when a program is assembled, provided the 64-bit constant specified by the data type 
and value is available for the instruction. If a constant is available in more than one way, the first entry in this table that can produce it is 
used. For example, VMOV.I64 D0, #0x8000000080000000 does not specify a 64-bit constant that is available from the I64 line of the table, but 
does specify one that is available from the fourth I32 line or the F32 line. It is assembled to the first of these, and therefore is disassembled 
as VMOV.I32 D0, #0x80000000. 


c. This constant is available for the VBIC, VMOV, VMVN, and VORR instructions. 
d. CONSTRAINED UNPREDICTABLE if abcdefgh == 0b00000000, see UNPREDICTABLE cases in immediate constants in Advanced SIMD 
instructions on page K1-5461. The required behavior is that these encodings produce an immediate constant of zero. 


e. This constant is available for the VMOV and VMVN instructions only. 
f. This constant is available for the VMOV instruction only. 


g. In this entry, B = NOT(b). The bit pattern represents the floating-point number (-1)^S x 2^exp x mantissa, where S = UInt(a), 
exp = UInt(NOT(b):c:d)-3 and mantissa = (16+UInt(e:f:g:h))/16. 


Operation of modified immediate constants, Advanced SIMD instructions 


For a T32 or A32 Advanced SIMD instruction that uses a modified immediate constant, the operation described by 
the AdvSIMDExpandImm() pseudocode function returns the value of the 64-bit immediate constant. 
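
As an illustration of Table F2-4, the following Python sketch expands an (op, cmode, abcdefgh) triple into the 64-bit constant. It is a hedged reading of the table rather than the AdvSIMDExpandImm() pseudocode itself: the names replicate and adv_simd_expand_imm are hypothetical, op is ignored for the rows where the table shows -, and the encodings that note d marks as CONSTRAINED UNPREDICTABLE simply produce zero, which is the behavior that note requires.

```python
def replicate(value, width):
    """Repeat a width-bit value until it fills 64 bits."""
    result = 0
    for shift in range(0, 64, width):
        result |= value << shift
    return result


def adv_simd_expand_imm(op, cmode, imm8):
    """Return the 64-bit constant for (op, cmode, abcdefgh), following Table F2-4."""
    top, low = cmode >> 1, cmode & 1
    if top <= 0b011:
        # I32 rows 000x to 011x: abcdefgh placed in byte 0, 1, 2, or 3 of each word.
        return replicate(imm8 << (8 * top), 32)
    if top == 0b100:
        return replicate(imm8, 16)          # I16, low byte
    if top == 0b101:
        return replicate(imm8 << 8, 16)     # I16, high byte
    if top == 0b110:
        # Rows 1100 and 1101: abcdefgh followed by one or two bytes of ones.
        if low == 0:
            return replicate((imm8 << 8) | 0xFF, 32)
        return replicate((imm8 << 16) | 0xFFFF, 32)
    # Remaining rows have cmode == 111x.
    if low == 0 and op == 0:
        return replicate(imm8, 8)           # I8
    if low == 0 and op == 1:
        # I64: each of the bits a..h expands to a whole byte, a most significant.
        result = 0
        for bit in range(8):
            if (imm8 >> bit) & 1:
                result |= 0xFF << (8 * bit)
        return result
    if op == 0:
        # F32: aBbbbbbc defgh000 00000000 00000000, where B = NOT(b).
        a, b = imm8 >> 7, (imm8 >> 6) & 1
        imm32 = (a << 31) | ((b ^ 1) << 30) | ((0x1F if b else 0) << 25) | ((imm8 & 0x3F) << 19)
        return replicate(imm32, 32)
    raise ValueError("op == 1, cmode == 0b1111 is UNDEFINED")
```

With these definitions, adv_simd_expand_imm(0, 0b0000, 10) reproduces the constant 0x0000000A0000000A given in note a, and the fourth I32 row with abcdefgh == 0x80 reproduces the constant 0x8000000080000000 discussed in note b.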


Modified immediate constants in T32 and A32 floating-point instructions 


Table F2-5 shows the immediate constants available in the VMOV (immediate) floating-point instruction, and 
Table F2-6 shows the resulting floating-point values. 


Table F2-5 Floating-point modified immediate constants 

Data type  imm4H  imm4L  Constant (a)
F32        abcd   efgh   aBbbbbbc defgh000 00000000 00000000
F64        abcd   efgh   aBbbbbbb bbcdefgh 00000000 00000000 00000000 00000000 00000000 00000000

a. In this column, B = NOT(b). The bit pattern represents the floating-point number (-1)^S x 2^exp x mantissa, where S = UInt(a), 
exp = UInt(NOT(b):c:d)-3 and mantissa = (16+UInt(e:f:g:h))/16. 


Table F2-6 Floating-point constant values 

        bcd
efgh    000     001    010    011    100         101        110       111
0000    2.0     4.0    8.0    16.0   0.125       0.25       0.5       1.0
0001    2.125   4.25   8.5    17.0   0.1328125   0.265625   0.53125   1.0625
0010    2.25    4.5    9.0    18.0   0.140625    0.28125    0.5625    1.125
0011    2.375   4.75   9.5    19.0   0.1484375   0.296875   0.59375   1.1875
0100    2.5     5.0    10.0   20.0   0.15625     0.3125     0.625     1.25
0101    2.625   5.25   10.5   21.0   0.1640625   0.328125   0.65625   1.3125
0110    2.75    5.5    11.0   22.0   0.171875    0.34375    0.6875    1.375
0111    2.875   5.75   11.5   23.0   0.1796875   0.359375   0.71875   1.4375
1000    3.0     6.0    12.0   24.0   0.1875      0.375      0.75      1.5
1001    3.125   6.25   12.5   25.0   0.1953125   0.390625   0.78125   1.5625
1010    3.25    6.5    13.0   26.0   0.203125    0.40625    0.8125    1.625
1011    3.375   6.75   13.5   27.0   0.2109375   0.421875   0.84375   1.6875
1100    3.5     7.0    14.0   28.0   0.21875     0.4375     0.875     1.75
1101    3.625   7.25   14.5   29.0   0.2265625   0.453125   0.90625   1.8125
1110    3.75    7.5    15.0   30.0   0.234375    0.46875    0.9375    1.875
1111    3.875   7.75   15.5   31.0   0.2421875   0.484375   0.96875   1.9375





Operation of modified immediate constants, floating-point instructions 


For a T32 or A32 floating-point instruction that uses a modified immediate constant, the operation described by the 
VFPExpandImm() pseudocode function returns the value of the immediate constant. 
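
The relationship between Table F2-5 and Table F2-6 can be checked with a small Python sketch. It is an illustration only, not the VFPExpandImm() pseudocode: the function names are hypothetical, and the sketch applies the formula from the note to Table F2-5 and, separately, builds the F32 bit pattern aBbbbbbc defgh000 00000000 00000000.

```python
import struct


def vfp_imm_value(imm8):
    """Evaluate (-1)^S x 2^exp x mantissa for abcdefgh == imm8."""
    a = (imm8 >> 7) & 1
    b = (imm8 >> 6) & 1
    cd = (imm8 >> 4) & 0b11
    efgh = imm8 & 0xF
    exp = (((b ^ 1) << 2) | cd) - 3        # UInt(NOT(b):c:d) - 3
    mantissa = (16 + efgh) / 16
    return (-1) ** a * 2.0 ** exp * mantissa


def vfp_imm_f32_bits(imm8):
    """Build the 32-bit pattern aBbbbbbc defgh000 00000000 00000000."""
    a = (imm8 >> 7) & 1
    b = (imm8 >> 6) & 1
    return (a << 31) | ((b ^ 1) << 30) | ((0x1F if b else 0) << 25) | ((imm8 & 0x3F) << 19)


def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

For instance, abcdefgh == 0b01110000 (bcd == 111, efgh == 0000) evaluates to 1.0, and the bit pattern built by vfp_imm_f32_bits for every 8-bit input reinterprets to the same value that the formula gives.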
F2.8 Additional pseudocode support for instruction descriptions 


Earlier sections of this chapter include pseudocode that describes features of the execution of A32 and T32 
instructions, see: 


• Pseudocode description of conditional execution on page F2-2408. 

• Pseudocode description of instruction-specified shifts and rotates on page F2-2411. 


The following subsection gives additional pseudocode support functions for some of the instructions described in 
Alphabetical list of T32 and A32 base instruction set instructions on page F5-2560. See also Pseudocode support 
for the Banked register transfer instructions on page F5-3231. 


F2.8.1 Pseudocode description of operations for System register access instructions 


The AArch32.CheckSystemAccess() pseudocode function determines whether a System register access instruction is 
accepted for execution. 


The AArch32.SysRegRead() function obtains the word for an MRC instruction from the System register. 
The AArch32.SysRegRead64() function obtains the two words for an MRRC instruction from the System register. 


Note 


The relative significance of the two words returned is IMPLEMENTATION DEFINED, but all uses within this manual 
present the two words in the order (most significant, least significant). 








The AArch32.SysRegWrite() procedure sends the word for an MCR instruction to the System register. 


The AArch32.SysRegWrite64() procedure sends the two words for an MCRR instruction to the System register. 





Note 


The relative significance of word2 and word1 is IMPLEMENTATION DEFINED, but all uses within this manual treat word2 
as more significant than word1. 





The CP14DebugInstrDecode() pseudocode function decodes an accepted access to a debug System register in the 
(coproc==0b1110) encoding space. 


The CP14JazelleInstrDecode() pseudocode function decodes an accepted access to a Jazelle System register. These 
registers are in the (coproc==0b1110) encoding space. 


The CP14TraceInstrDecode() pseudocode function decodes an accepted access to a Trace System register. These 
registers are in the (coproc==0b1110) encoding space. 


The CP15InstrDecode() pseudocode function decodes an accepted access to a System register in the 
(coproc==0b1111) encoding space. 


F2.8.2 Pseudocode details of system calls 


The AArch32.CallSupervisor() pseudocode function generates a Supervisor Call exception. Valid execution of the 
SVC instruction calls this function. 


The AArch32.CallHypervisor() pseudocode function generates an HVC exception. Valid execution of the HVC 
instruction calls this function. 
F2.9 Additional information about Advanced SIMD and floating-point instructions 


The following subsections give additional information about the Advanced SIMD and floating-point instructions: 

• Advanced SIMD and floating-point instruction syntax. 
• The Advanced SIMD addressing mode. 
• Advanced SIMD instruction modifiers on page F2-2428. 
• Advanced SIMD operand shapes on page F2-2428. 
• Data type specifiers on page F2-2429. 
• Register specifiers on page F2-2430. 
• Register lists on page F2-2431. 
• Register encoding on page F2-2431. 
• Advanced SIMD scalars on page F2-2432. 


Note 


The Advanced SIMD architecture, its associated implementations, and supporting software, are commonly referred 
to as NEON™ technology. 








F2.9.1 Advanced SIMD and floating-point instruction syntax 
Advanced SIMD and floating-point instructions use the general conventions of the T32 and A32 instruction sets. 
Advanced SIMD and floating-point data-processing instructions use the following general format: 
V{<modifier>}<operation>{<shape>}{<c>}{<q>}{.<dt>} {<dest>,} <src1>, <src2> 


All Advanced SIMD and floating-point instructions begin with a V. This distinguishes Advanced SIMD vector and 
floating-point instructions from scalar instructions. 


The main operation is specified in the <operation> field. It is usually a three letter mnemonic the same as or similar 
to the corresponding scalar integer instruction. 


The <c> and <q> fields are standard assembler syntax fields. For details see Standard assembler syntax fields on 
page F2-2406. 


F2.9.2 The Advanced SIMD addressing mode 
All the element and structure load/store instructions use this addressing mode. There is a choice of three formats: 


[<Rn>{:<align>}]        The address is contained in general-purpose register Rn. 
                        Rn is not updated by this instruction. 
                        Encoded as Rm = 0b1111. 
                        If Rn is encoded as 0b1111, the instruction is CONSTRAINED UNPREDICTABLE. 

[<Rn>{:<align>}]!       The address is contained in general-purpose register Rn. 
                        Rn is updated by this instruction: Rn = Rn + transfer_size 
                        Encoded as Rm = 0b1101. 
                        transfer_size is the number of bytes transferred by the instruction. This means that, after 
                        the instruction is executed, Rn points to the address in memory immediately following the 
                        last address loaded from or stored to. 
                        If Rn is encoded as 0b1111, the instruction is CONSTRAINED UNPREDICTABLE. 
                        This addressing mode can also be written as: 
                        [<Rn>{:<align>}], #<transfer_size> 
                        However, disassembly produces the [<Rn>{:<align>}]! form. 

[<Rn>{:<align>}], <Rm>  The address is contained in general-purpose register Rn. 
                        Rn is updated by this instruction: Rn = Rn + Rm 
                        Encoded as Rm = Rm. Rm must not be encoded as 0b1111 or 0b1101, the PC or the SP. 
                        If Rn is encoded as 0b1111, the instruction is CONSTRAINED UNPREDICTABLE. 

The CONSTRAINED UNPREDICTABLE behavior of encodings where Rn is 0b1111 is that the instruction is either 
UNDEFINED or executes as a NOP, see CONSTRAINED UNPREDICTABLE behavior for A32 memory hints, 
Advanced SIMD instructions, and miscellaneous instructions on page K1-5467. 


In all cases, <align> specifies an alignment, as specified by the individual instruction descriptions. 


Previous versions of the manual used the @ character for alignment. So, for example, the first format in this section 
was shown as [<Rn>{@<align>}]. Both @ and : are supported. However, to ensure portability of code to assemblers 
that treat @ as a comment character, : is preferred. 
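
The three formats differ only in how the Rm field selects the base-register update. The following Python sketch models that selection under stated assumptions: the function name advsimd_access and the list-of-registers model are hypothetical, alignment checking is omitted, and the CONSTRAINED UNPREDICTABLE case where Rn is 0b1111 is modeled by raising an exception rather than by either of the permitted behaviors.

```python
PC = 0b1111
SP = 0b1101


def advsimd_access(regs, rn, rm, transfer_size):
    """Return the access address and apply the base-register update, if any.

    regs is a list of 16 general-purpose register values, indexed by number.
    """
    if rn == PC:
        raise ValueError("CONSTRAINED UNPREDICTABLE: Rn is encoded as 0b1111")
    address = regs[rn]
    if rm == PC:
        pass                                                  # [<Rn>{:<align>}]
    elif rm == SP:
        regs[rn] = (address + transfer_size) & 0xFFFFFFFF     # [<Rn>{:<align>}]!
    else:
        regs[rn] = (address + regs[rm]) & 0xFFFFFFFF          # [<Rn>{:<align>}], <Rm>
    return address
```

In every format the access uses the original value of Rn; only the value left in Rn afterwards differs.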















































F2.9.3 Advanced SIMD instruction modifiers 
The <modifier> field provides additional variants of some instructions. Table F2-7 provides definitions of the 
modifiers. Modifiers are not available for every instruction. 

Table F2-7 Advanced SIMD instruction modifiers 

<modifier>  Meaning 
Q           The operation uses saturating arithmetic. 
R           The operation performs rounding. 
D           The operation doubles the result (before accumulation, if any). 
H           The operation halves the result. 

F2.9.4 Advanced SIMD operand shapes 
The <shape> field provides additional variants of some instructions. Table F2-8 provides definitions of the shapes. 
Operand shapes are not available for every instruction. 

Table F2-8 Advanced SIMD operand shapes 

<shape>  Meaning                                                                              Typical register shape 
(none)   The operands and result are all the same width.                                      Dd, Dn, Dm   Qd, Qn, Qm 
L        Long operation - result is twice the width of both operands                          Qd, Dn, Dm 
N        Narrow operation - result is half the width of both operands                         Dd, Qn, Qm 
W        Wide operation - result and first operand are twice the width of the second operand  Qd, Qn, Dm 

Note 
• Some assemblers support a Q shape specifier, that requires all operands to be Q registers. An example of 
using this specifier is VADDQ.S32 q0, q1, q2. This is not standard UAL, and ARM recommends that 
programmers do not use a Q shape specifier. 
• A disassembler must not generate any shape specifier not shown in Table F2-8. 
F2.9.5 Data type specifiers 
The <dt> field normally contains one data type specifier. Unless the assembler syntax description for the instruction 
indicates otherwise, this indicates the data type contained in: 
• The second operand, if any. 
• The operand, if there is no second operand. 
• The result, if there are no operand registers. 
The data types of the other operand and result are implied by the <dt> field combined with the instruction shape. For 
information about data type formats see Data types supported by the Advanced SIMD implementation on 
page E1-2302. 
In the instruction syntax descriptions in Chapter F2 About the T32 and A32 Instruction Descriptions, the <dt> field 
is usually specified as a single field. However, where more convenient, it is sometimes specified as a concatenation 
of two fields, <type><size>. 

Syntax flexibility 
There is some flexibility in the data type specifier syntax: 
• Software can specify three data types, specifying the result and both operand data types. For example: 
VSUBW.I16.I16.S8 Q3, Q5, D0 instead of VSUBW.S8 Q3, Q5, D0 
• Software can specify two data types, specifying the data types of the two operands. The data type of the result 
is implied by the instruction shape. For example: 
VSUBW.I16.S8 Q3, Q5, D0 instead of VSUBW.S8 Q3, Q5, D0 
• Software can specify two data types, specifying the data types of the single operand and the result. For 
example: 
VMOVN.I16.I32 D0, Q1 instead of VMOVN.I32 D0, Q1 
• Where an instruction requires a less specific data type, software can instead specify a more specific type, as 
shown in Table F2-9. 
• Where an instruction does not require a data type, software can provide one. 
• The F32 data type can be abbreviated to F. 
• The F64 data type can be abbreviated to D. 
In all cases, if software provides additional information, the additional information must match the instruction 
shape. Disassembly does not regenerate this additional information. 

Table F2-9 Data type specification flexibility 

Specified data type  Permitted more specific data types 
None                 Any 
.I<size>             -      .S<size>  .U<size>  -      - 
.8                   .I8    .S8       .U8       .P8    - 
.16                  .I16   .S16      .U16      .P16   .F16 
.32                  .I32   .S32      .U32      -      .F32 or .F 
.64                  .I64   .S64      .U64      -      .F64 or .D 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F2-2429 


ID092916 


Non-Confidential 


F2 About the T32 and A32 Instruction Descriptions 
F2.9 Additional information about Advanced SIMD and floating-point instructions 



























































F2.9.6 Register specifiers 

The <dest>, <src1>, and <src2> fields contain register specifiers, or in some cases scalar specifiers or register lists. 

Table F2-10 shows the register and scalar specifier formats that appear in the instruction descriptions. 

If <dest> is omitted, it is the same as <src1>. 

Table F2-10 Advanced SIMD and floating-point register specifier formats 

<specifier>  Usual meaning (a)                                                         Used in 
<Qd>         A quadword destination register for the result vector.                    Advanced SIMD 
<Qn>         A quadword source register for the first operand vector.                  Advanced SIMD 
<Qm>         A quadword source register for the second operand vector.                 Advanced SIMD 
<Dd>         A doubleword destination register for the result vector.                  Both 
<Dn>         A doubleword source register for the first operand vector.                Both 
<Dm>         A doubleword source register for the second operand vector.               Both 
<Sd>         A singleword destination register for the result vector.                  Floating-point 
<Sn>         A singleword source register for the first operand vector.                Floating-point 
<Sm>         A singleword source register for the second operand vector.               Floating-point 
<Dd[x]>      A destination scalar for the result. Element x of vector <Dd>.            Advanced SIMD 
<Dn[x]>      A source scalar for the first operand. Element x of vector <Dn>.          Both (b) 
<Dm[x]>      A source scalar for the second operand. Element x of vector <Dm>.         Advanced SIMD 
<Rt>         A general-purpose register, used for a source or destination address.     Both 
<Rt2>        A general-purpose register, used for a source or destination address.     Both 
<Rn>         A general-purpose register, used as a load or store base address.         Both 
<Rm>         A general-purpose register, used as a post-indexed address source.        Both 

a. In some instructions the roles of registers are different. 
b. In the floating-point instructions, <Dn[x]> is used only in VMOV (scalar to general-purpose register), see VMOV 
(scalar to general-purpose register) on page F6-3516. 





























F2.9.7 Register lists 

A register list is a list of register specifiers separated by commas and enclosed in brackets { and }. There are 
restrictions on what registers can appear in a register list. These restrictions are described in the individual 
instruction descriptions. Table F2-11 shows some register list formats, with examples of actual register lists 
corresponding to those formats. 

Note 

Register lists must not wrap around the end of the register bank. 

Syntax flexibility 

There is some flexibility in the register list syntax: 

• Where a register list contains consecutive registers, they can be specified as a range, instead of listing every 
register, for example {D0-D3} instead of {D0, D1, D2, D3}. 

• Where a register list contains an even number of consecutive doubleword registers starting with an even 
numbered register, it can be written as a list of quadword registers instead, for example {Q1, Q2} instead of 
{D2-D5}. 

• Where a register list contains only one register, the enclosing braces can be omitted, for example 
VLD1.8 D0, [R0] instead of VLD1.8 {D0}, [R0]. 

Table F2-11 Example register lists 

Format                    Example         Alternative 
{<Dd>}                    {D3}            D3 
{<Dd>, <Dd+1>, <Dd+2>}    {D3, D4, D5}    {D3-D5} 
{<Dd[x]>, <Dd+2[x]>}      {D0[3], D2[3]}  - 
{<Dd[]>}                  {D7[]}          D7[] 
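
The syntax flexibility described above amounts to several spellings of the same set of doubleword registers. The following Python sketch normalizes a list to doubleword register numbers. It is an illustration under stated assumptions: the function name expand_register_list is hypothetical, only plain D and Q registers and ranges are handled (the scalar forms such as D0[3] and D7[] are not), and the instruction-specific restrictions on lists are not checked.

```python
import re

ITEM = re.compile(r"\s*([DQ])(\d+)(?:\s*-\s*([DQ])(\d+))?\s*", re.IGNORECASE)


def expand_register_list(text):
    """Expand a register list such as '{Q1, Q2}' into doubleword register numbers."""
    dregs = []
    # Braces can be omitted when the list contains only one register.
    for item in text.strip().strip("{}").split(","):
        match = ITEM.fullmatch(item)
        if match is None:
            raise ValueError("unsupported register list item: " + item.strip())
        kind = match.group(1).upper()
        first = int(match.group(2))
        last = int(match.group(4)) if match.group(4) else first
        if match.group(3) and match.group(3).upper() != kind:
            raise ValueError("range mixes D and Q registers")
        if last < first:
            raise ValueError("register lists must not wrap around")
        for number in range(first, last + 1):
            # A quadword register Qn corresponds to D(2n) and D(2n+1).
            dregs.extend([2 * number, 2 * number + 1] if kind == "Q" else [number])
    if any(number > 31 for number in dregs):
        raise ValueError("register number out of range")
    return dregs
```

With this sketch, {Q1, Q2} and {D2-D5} expand to the same list, as do D3 and {D3}.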
F2.9.8 Register encoding 

An Advanced SIMD register is either: 

• Quadword, meaning it is 128 bits wide. 
• Doubleword, meaning it is 64 bits wide. 

Some instructions have options for either doubleword or quadword registers. This is normally encoded in Q, bit[6], 
as Q = 0 for doubleword operations, or Q = 1 for quadword operations. 

A floating-point register is either: 

• Double-precision, meaning it is 64 bits wide. 
• Single-precision, meaning it is 32 bits wide. 

This is encoded in the sz field, bit[8], as sz = 1 for double-precision operations, or sz = 0 for single-precision 
operations. 

The T32 instruction encoding of Advanced SIMD or floating-point registers is: 

[Encoding diagram. First halfword: D is bit[6], Vn is bits[3:0]. Second halfword: Vd is bits[15:12], sz is bit[8], 
N is bit[7], Q is bit[6], M is bit[5], Vm is bits[3:0].] 

The A32 instruction encoding of Advanced SIMD or floating-point registers is: 

[Encoding diagram. D is bit[22], Vn is bits[19:16], Vd is bits[15:12], sz is bit[8], N is bit[7], Q is bit[6], 
M is bit[5], Vm is bits[3:0].] 
Some instructions use only one or two registers, and use the unused register fields as additional opcode bits. 


Table F2-12 shows the encodings for the registers. 


Table F2-12 Encoding of register numbers 

Register mnemonic  Usual usage                        Register number encoded in (a)  Notes (a)          Used in 
<Qd>               Destination (quadword)             D, Vd (bits[22, 15:13])         bit[12] == 0 (b)   Advanced SIMD 
<Qn>               First operand (quadword)           N, Vn (bits[7, 19:17])          bit[16] == 0 (b)   Advanced SIMD 
<Qm>               Second operand (quadword)          M, Vm (bits[5, 3:1])            bit[0] == 0 (b)    Advanced SIMD 
<Dd>               Destination (doubleword)           D, Vd (bits[22, 15:12])         -                  Both 
<Dn>               First operand (doubleword)         N, Vn (bits[7, 19:16])          -                  Both 
<Dm>               Second operand (doubleword)        M, Vm (bits[5, 3:0])            -                  Both 
<Sd>               Destination (single-precision)     Vd, D (bits[15:12, 22])         -                  Floating-point 
<Sn>               First operand (single-precision)   Vn, N (bits[19:16, 7])          -                  Floating-point 
<Sm>               Second operand (single-precision)  Vm, M (bits[3:0, 5])            -                  Floating-point 

a. Bit numbers given for the A32 instruction encoding. See the figures in this section for the equivalent bits in the T32 encoding. 

b. If this bit is 1, the instruction is UNDEFINED. 


F2.9.9 Advanced SIMD scalars 


Advanced SIMD scalars can be 8-bit, 16-bit, 32-bit, or 64-bit. Instructions other than multiply instructions can 
access any element in the register set. The instruction syntax refers to the scalars using an index into a doubleword 
vector. The descriptions of the individual instructions contain details of the encodings. 


Table F2-13 shows the form of encoding for scalars used in multiply instructions. These instructions cannot access 
scalars in some registers. The descriptions of the individual instructions contain cross references to this section 
where appropriate. 


32-bit Advanced SIMD scalars, when used as single-precision floating-point numbers, are equivalent to 
Floating-point single-precision registers. That is, Dm[x] in a 32-bit context (0 <= m <= 15, 0 <= x <= 1) is equivalent 
to S[2m + x]. 


Table F2-13 Encoding of scalars in multiply instructions 

Scalar mnemonic  Usual usage     Scalar size  Register specifier  Index specifier  Accessible registers 
<Dm[x]>          Second operand  16-bit       Vm[2:0]             M, Vm[3]         D0-D7 
                                 32-bit       Vm[3:0]             M                D0-D15 
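
The field concatenations in Table F2-12 and Table F2-13 can be written out as a short Python sketch. The function names are hypothetical and the sketch takes the already-extracted field values as arguments, so it applies equally to the A32 and T32 encodings; an UNDEFINED quadword encoding is modeled by raising an exception.

```python
def d_register(extra_bit, v):
    """Doubleword register number: D:Vd, N:Vn, or M:Vm."""
    return (extra_bit << 4) | v


def s_register(v, extra_bit):
    """Single-precision register number: Vd:D, Vn:N, or Vm:M."""
    return (v << 1) | extra_bit


def q_register(extra_bit, v):
    """Quadword register number: the extra bit followed by the top three bits of V."""
    if v & 1:
        raise ValueError("UNDEFINED: low bit of the register field must be 0")
    return ((extra_bit << 4) | v) >> 1


def multiply_scalar(size, m, vm):
    """Return (register number, index) for <Dm[x]> in a multiply instruction."""
    if size == 16:
        return vm & 0b111, (m << 1) | (vm >> 3)   # register Vm[2:0], index M:Vm[3]
    if size == 32:
        return vm, m                              # register Vm[3:0], index M
    raise ValueError("scalar size must be 16 or 32")


def scalar_as_s_register(m, x):
    """Dm[x] in a 32-bit context is equivalent to S[2m + x], for m in 0..15."""
    return 2 * m + x
```

For example, a 16-bit scalar with M == 1 and Vm == 0b1010 selects D2[3], which is within the D0-D7 range that the table gives for 16-bit scalars.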
Chapter F3 
The T32 Instruction Set Encoding 


This chapter introduces the T32 instruction set and describes how it uses the ARM programmers’ model. It contains 
the following sections: 

• Top level T32 instruction set encoding on page F3-2434. 
• 16-bit T32 instruction encoding on page F3-2436. 
• 32-bit T32 instruction encoding on page F3-2447. 

In this chapter: 

• In the decode tables, an entry of - for a field value means the value of the field does not affect the decoding. 
• In the decode diagrams, a shaded field indicates that the bits in that field are not used in that level of decode. 


F3.1 Top level T32 instruction set encoding 


The T32 instruction stream is treated as a sequence of halfword-aligned halfwords. Each T32 instruction is either a 
single 16-bit halfword in that stream, or a 32-bit instruction consisting of two consecutive halfwords in that stream. 


When the value of bits[15:11] of the halfword being decoded is one of the following, the halfword is the first 
halfword of a 32-bit instruction: 


• 0b11101. 
• 0b11110. 
• 0b11111. 


Otherwise, the halfword is a 16-bit instruction. 


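[Encoding diagram. First halfword: op0 is bits[15:13], op1 is bits[12:11]. The remaining bits of the first halfword, 
and the second halfword where present, are not used in this level of decode.] 


Table F3-1 Main encoding table for the T32 instruction set 

Decode fields 
op0      op1     Decode group or instruction page 
!= 111   -       16-bit T32 instruction encoding on page F3-2436 
111      00      B - T2 variant 
111      != 00   32-bit T32 instruction encoding on page F3-2447 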





The behavior of an attempt to execute an unallocated instruction is described in UNDEFINED, UNPREDICTABLE, 
and CONSTRAINED UNPREDICTABLE instruction set space on page F2-2414. 
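
The halfword test and Table F3-1 can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, each halfword is assumed to be supplied as an integer in the range 0 to 0xFFFF, and no check is made that an instruction is actually allocated.

```python
def is_32bit_first_halfword(halfword):
    """True if bits[15:11] mark the first halfword of a 32-bit instruction."""
    return (halfword >> 11) in (0b11101, 0b11110, 0b11111)


def top_level_group(halfword):
    """Classify a first halfword by op0 (bits[15:13]) and op1 (bits[12:11])."""
    op0 = halfword >> 13
    op1 = (halfword >> 11) & 0b11
    if op0 != 0b111:
        return "16-bit T32 instruction encoding"
    if op1 == 0b00:
        return "B - T2 variant"
    return "32-bit T32 instruction encoding"


def split_t32_stream(halfwords):
    """Group a sequence of halfwords into 16-bit and 32-bit instructions."""
    instructions = []
    i = 0
    while i < len(halfwords):
        if is_32bit_first_halfword(halfwords[i]):
            if i + 1 >= len(halfwords):
                raise ValueError("stream ends inside a 32-bit instruction")
            instructions.append((halfwords[i], halfwords[i + 1]))
            i += 2
        else:
            instructions.append((halfwords[i],))
            i += 1
    return instructions
```

The two views agree: a halfword is the start of a 32-bit instruction exactly when op0 is 0b111 and op1 is not 0b00.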


F3.1.1 About the T32 Advanced SIMD and floating-point instructions and their encoding 


The Advanced SIMD and floating-point instructions are common to the T32 and A32 instruction sets. These 
instructions perform Advanced SIMD and floating-point operations on a common register file, the SIMD&FP 
register file. This means: 


• In general, the instructions that load or store registers in this file, or move data between general-purpose 
registers and this register file, are common to the Advanced SIMD and floating-point instructions. 

• There are distinct Advanced SIMD data-processing instructions and floating-point data-processing 
instructions. 


All T32 Advanced SIMD and floating-point instructions have 32-bit encodings. Different groups of these 
instructions are decoded from different points in the 32-bit T32 instruction decode structure. Table F3-2 shows these 
instruction groups, and where each group is decoded from the overall T32 decode structure: 


Table F3-2 Advanced SIMD and floating-point instructions in the T32 decode structure 

Advanced SIMD and floating-point instruction group | T32 decode is from 
Advanced SIMD data-processing on page F3-2454 | System register access, Advanced SIMD, and floating-point instructions on page F3-2447 
Floating-point data-processing on page F3-2450 | System register access, Advanced SIMD, and floating-point instructions on page F3-2447 
Floating-point and Advanced SIMD 32-bit register moves on page F3-2453 | System register access, Advanced SIMD, and floating-point instructions on page F3-2447 
Advanced SIMD and floating-point Load/Store and 64-bit register moves on page F3-2448 | System register access, Advanced SIMD, and floating-point instructions on page F3-2447 
Advanced SIMD element or structure Load/Store on page F3-2479 | 32-bit T32 instruction encoding on page F3-2447 
F3.2 16-bit T32 instruction encoding 


This section describes the encoding of the 16-bit T32 instruction encoding group. This section is decoded from Top 
level T32 instruction set encoding on page F3-2434. 

[Encoding diagram. op0 is bits[15:10]. Bits[9:0] are not used in this level of decode.] 

Decode fields 
op0      Decode group or instruction page 
00xxxx   Shift (immediate), add, subtract, move, and compare 
010000   Data-processing (two low registers) on page F3-2438 
010001   Special data instructions and branch and exchange on page F3-2439 
01001x   LDR (literal) - T1 variant 
0101xx   Load/Store (register offset) on page F3-2440 
011xxx   Load/Store word/byte (immediate offset) on page F3-2440 
1000xx   Load/Store halfword (immediate offset) on page F3-2441 
1001xx   Load/Store (SP-relative) on page F3-2441 
1010xx   Add PC/SP (immediate) on page F3-2441 
1011xx   Miscellaneous 16-bit instructions on page F3-2442 
1100xx   Load/Store multiple on page F3-2445 
1101xx   Conditional branch, and Supervisor Call on page F3-2445 


F3.2.1 Shift (immediate), add, subtract, move, and compare 


This section describes the encoding of the Shift (immediate), add, subtract, move, and compare group. This section 
is decoded from 16-bit T32 instruction encoding. 


 15 14 | 13  | 12 11 | 10  | 9 ... 0 
  0  0 | op0 |  op1  | op2 | 

Decode fields 
op0  op1    op2  Decode group or instruction page 
0    11     0    Add, subtract (three low registers) on page F3-2437 
0    11     1    Add, subtract (two low registers and immediate) on page F3-2437 
0    != 11  -    MOV, MOVS (register) - T2 variant 
1    -      -    Add, subtract, compare, move (one low register and immediate) on page F3-2437 
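
The two levels of decode described so far, op0 in bits[15:10] for the 16-bit group and then op0, op1, and op2 in bits[13], [12:11], and [10] for this group, can be sketched as table lookups in Python. The names below are hypothetical, the page references are dropped from the group names, and a result of None means that the halfword does not belong to the 16-bit encoding group listed in the table.

```python
GROUPS_16BIT = [
    ("00xxxx", "Shift (immediate), add, subtract, move, and compare"),
    ("010000", "Data-processing (two low registers)"),
    ("010001", "Special data instructions and branch and exchange"),
    ("01001x", "LDR (literal) - T1 variant"),
    ("0101xx", "Load/Store (register offset)"),
    ("011xxx", "Load/Store word/byte (immediate offset)"),
    ("1000xx", "Load/Store halfword (immediate offset)"),
    ("1001xx", "Load/Store (SP-relative)"),
    ("1010xx", "Add PC/SP (immediate)"),
    ("1011xx", "Miscellaneous 16-bit instructions"),
    ("1100xx", "Load/Store multiple"),
    ("1101xx", "Conditional branch, and Supervisor Call"),
]


def matches(pattern, value):
    """Compare value against a bit pattern in which x matches either bit value."""
    width = len(pattern)
    for position, char in enumerate(pattern):
        bit = (value >> (width - 1 - position)) & 1
        if char != "x" and int(char) != bit:
            return False
    return True


def decode_16bit_group(halfword):
    """Return the decode group selected by op0 (bits[15:10]), or None."""
    op0 = (halfword >> 10) & 0x3F
    for pattern, name in GROUPS_16BIT:
        if matches(pattern, op0):
            return name
    return None


def decode_shift_group(halfword):
    """Decode within the 00xxxx group using op0, op1, and op2."""
    op0 = (halfword >> 13) & 1
    op1 = (halfword >> 11) & 0b11
    op2 = (halfword >> 10) & 1
    if op0 == 1:
        return "Add, subtract, compare, move (one low register and immediate)"
    if op1 != 0b11:
        return "MOV, MOVS (register) - T2 variant"
    if op2 == 0:
        return "Add, subtract (three low registers)"
    return "Add, subtract (two low registers and immediate)"
```

For example, the halfword 0x1840 has bits[15:10] == 0b000110, so it falls in the Shift (immediate), add, subtract, move, and compare group and then decodes as Add, subtract (three low registers).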
Add, subtract (three low registers) 

This section describes the encoding of the Add, subtract (three low registers) instruction class. This section is 
decoded from Shift (immediate), add, subtract, move, and compare on page F3-2436. 

 15 14 13 12 11 10 | 9 | 8 7 6 | 5 4 3 | 2 1 0 
  0  0  0  1  1  0 | S |  Rm   |  Rn   |  Rd 

Decode fields 
S    Instruction page 
0    ADD, ADDS (register) 
1    SUB, SUBS (register) 





Add, subtract (two low registers and immediate) 
This section describes the encoding of the Add, subtract (two low registers and immediate) instruction class. This 
section is decoded from Shift (immediate), add, subtract, move, and compare on page F3-2436. 

 15 14 13 12 11 10 | 9 | 8 7 6 | 5 4 3 | 2 1 0 
  0  0  0  1  1  1 | S | imm3  |  Rn   |  Rd 

Decode fields 
S    Instruction page 
0    ADD, ADDS (immediate) 
1    SUB, SUBS (immediate) 





Add, subtract, compare, move (one low register and immediate) 


This section describes the encoding of the Add, subtract, compare, move (one low register and immediate) 
instruction class. This section is decoded from Shift (immediate), add, subtract, move, and compare on 
page F3-2436. 


 15 14 13 | 12 11 | 10 9 8 | 7 ... 0 
  0  0  1 |  op   |   Rd   |  imm8 

Decode fields 
op   Instruction page 
00   MOV, MOVS (immediate) 
01   CMP (immediate) 
10   ADD, ADDS (immediate) 
11   SUB, SUBS (immediate) 





F3.2.2 Data-processing (two low registers) 
This section describes the encoding of the Data-processing (two low registers) instruction class. This section is 
decoded from 16-bit T32 instruction encoding on page F3-2436. 

 15 14 13 12 11 10 | 9 8 7 6 | 5 4 3 | 2 1 0 
  0  1  0  0  0  0 |   op    |  Rs   |  Rd 

Decode fields 
op   Instruction page 
0000 AND, ANDS (register) 
0001 EOR, EORS (register) 
0010 MOV, MOVS (register-shifted register) - Logical shift left variant 
0011 MOV, MOVS (register-shifted register) - Logical shift right variant 
0100 MOV, MOVS (register-shifted register) - Arithmetic shift right variant 
0101 ADC, ADCS (register) 
0110 SBC, SBCS (register) 
0111 MOV, MOVS (register-shifted register) - Rotate right variant 
1000 TST (register) 
1001 RSB, RSBS (immediate) 
1010 CMP (register) 
1011 CMN (register) 
1100 ORR, ORRS (register) 
1101 MUL, MULS 
1110 BIC, BICS (register) 
1111 MVN, MVNS (register) 


F3.2.3 Special data instructions and branch and exchange 


This section describes the encoding of the Special data instructions and branch and exchange group. This section is 
decoded from 16-bit T32 instruction encoding on page F3-2436. 


|15 14 13 12 11 10|9   8|7              0|
| 0  1  0  0  0  1| op0 |                |

Decode fields   Decode group or instruction page
op0
11              Branch and exchange
!= 11           Add, subtract, compare, move (two high registers)





Branch and exchange 


This section describes the encoding of the Branch and exchange instruction class. This section is decoded from 
Special data instructions and branch and exchange. 


|15 14 13 12 11 10 9 8| 7 |6     3| 2   1   0 |
| 0  1  0  0  0  1 1 1| L |  Rm   |(0) (0) (0)|

Decode fields   Instruction Page
L
0               BX
1               BLX (register)





Add, subtract, compare, move (two high registers) 


This section describes the encoding of the Add, subtract, compare, move (two high registers) instruction class. This 
section is decoded from Special data instructions and branch and exchange. 


|15 14 13 12 11 10|9    8| 7 |6     3|2   0|
| 0  1  0  0  0  1| != 11| D |  Rs   | Rd  |
                     op

Decode fields                 Instruction Page
op   D:Rd      Rs
00   != 1101   != 1101        ADD, ADDS (register)
00   -         1101           ADD, ADDS (SP plus register) - T1 variant
00   1101      != 1101        ADD, ADDS (SP plus register) - T2 variant
01   -         -              CMP (register)
10   -         -              MOV, MOVS (register)
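
The table rows for op == 00 depend on whether SP (register 13, 0b1101) appears as Rs or as D:Rd. One way to express that, as a sketch only, is to test the more specific rows first. The function name decode_high_reg_class is hypothetical, and the halfword is assumed to be an integer.

```python
def decode_high_reg_class(hw):
    """Classify a halfword with bits[15:10] == 0b010001 and op != 0b11."""
    if (hw >> 10) & 0x3F != 0b010001:
        raise ValueError("not a special data instruction")
    op = (hw >> 8) & 0b11                          # bits[9:8]
    if op == 0b11:
        raise ValueError("op == 11 decodes as Branch and exchange")
    d_rd = (((hw >> 7) & 1) << 3) | (hw & 0b111)   # D:Rd, D is bit[7]
    rs = (hw >> 3) & 0xF                           # bits[6:3]
    if op == 0b01:
        return "CMP (register)"
    if op == 0b10:
        return "MOV, MOVS (register)"
    # op == 00: the row with Rs == 1101 matches any D:Rd, so test it first.
    if rs == 0b1101:
        return "ADD, ADDS (SP plus register) - T1 variant"
    if d_rd == 0b1101:
        return "ADD, ADDS (SP plus register) - T2 variant"
    return "ADD, ADDS (register)"
```

Testing Rs before D:Rd mirrors the table, where the T1 row places no constraint on D:Rd.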





F3.2.4 Load/Store (register offset) 


This section describes the encoding of the Load/Store (register offset) instruction class. This section is decoded from 
16-bit T32 instruction encoding on page F3-2436. 


|15 14 13 12|11|10| 9|8   6|5   3|2   0|
| 0  1  0  1| L| B| H| Rm  | Rn  | Rt  |

Decode fields   Instruction Page
L   B   H
0   0   0       STR (register)
0   0   1       STRH (register)
0   1   0       STRB (register)
0   1   1       LDRSB (register)
1   0   0       LDR (register)
1   0   1       LDRH (register)
1   1   0       LDRB (register)
1   1   1       LDRSH (register)
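
Because L, B and H occupy adjacent bits, the three decode fields can be read together as one 3-bit index into the table. A short sketch under that assumption follows; decode_ls_reg_offset and the output string format are illustrative choices, not architectural syntax.

```python
# Table order follows L:B:H = 000 .. 111.
LS_REG_OFFSET = ["STR", "STRH", "STRB", "LDRSB", "LDR", "LDRH", "LDRB", "LDRSH"]


def decode_ls_reg_offset(hw):
    """Decode a halfword whose bits[15:12] are 0b0101."""
    if (hw >> 12) & 0xF != 0b0101:
        raise ValueError("not a Load/Store (register offset) encoding")
    lbh = (hw >> 9) & 0b111   # L:B:H = bits[11:9]
    rm = (hw >> 6) & 0b111    # bits[8:6]
    rn = (hw >> 3) & 0b111    # bits[5:3]
    rt = hw & 0b111           # bits[2:0]
    return "{} r{}, [r{}, r{}]".format(LS_REG_OFFSET[lbh], rt, rn, rm)
```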





F3.2.5 Load/Store word/byte (immediate offset) 


This section describes the encoding of the Load/Store word/byte (immediate offset) instruction class. This section 
is decoded from 16-bit T32 instruction encoding on page F3-2436. 























|15 14 13|12|11|10     6|5   3|2   0|
| 0  1  1| B| L|  imm5  | Rn  | Rt  |

Decode fields   Instruction Page
B   L
0   0           STR (immediate)
0   1           LDR (immediate)
1   0           STRB (immediate)
1   1           LDRB (immediate)


F3.2.6 Load/Store halfword (immediate offset) 


This section describes the encoding of the Load/Store halfword (immediate offset) instruction class. This section is 
decoded from 16-bit T32 instruction encoding on page F3-2436. 














|15 14 13 12|11|10     6|5   3|2   0|
| 1  0  0  0| L|  imm5  | Rn  | Rt  |

Decode fields   Instruction Page
L
0               STRH (immediate)
1               LDRH (immediate)

F3.2.7 Load/Store (SP-relative) 


This section describes the encoding of the Load/Store (SP-relative) instruction class. This section is decoded from 
16-bit T32 instruction encoding on page F3-2436. 


|15 14 13 12|11|10   8|7          0|
| 1  0  0  1| L|  Rt  |    imm8    |

Decode fields   Instruction Page
L
0               STR (immediate)
1               LDR (immediate)





F3.2.8 Add PC/SP (immediate) 


This section describes the encoding of the Add PC/SP (immediate) instruction class. This section is decoded from 
16-bit T32 instruction encoding on page F3-2436. 

















|15 14 13 12|11|10   8|7          0|
| 1  0  1  0|SP|  Rd  |    imm8    |

Decode fields   Instruction Page
SP
0               ADR
1               ADD, ADDS (SP plus immediate)


F3.2.9 Miscellaneous 16-bit instructions 
This section describes the encoding of the Miscellaneous 16-bit instructions group. This section is decoded from
16-bit T32 instruction encoding on page F3-2436.

|15    12|11     8|7   6| 5 | 4 |3      0|
|  1011  |  op0   | op1 |op2|   |  op3   |








Decode fields              Decode group or instruction page
op0    op1     op3
0000   -       -           Adjust SP (immediate)
0010   -       -           Extend on page F3-2443
0110   00      -           Unallocated.
0110   01      -           Change Processor State on page F3-2443
0110   1x      -           Unallocated.
0111   -       -           Unallocated.
1000   -       -           Unallocated.
1010   10      -           HLT
1010   != 10   -           Reverse bytes on page F3-2444
1110   -       -           BKPT
1111   -       0000        Hints on page F3-2444
1111   -       != 0000     IT
x0x1   -       -           CBNZ, CBZ
x10x   -       -           Push and Pop on page F3-2445
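
Decode tables of this kind use three notations: x for a bit that can take either value, - for a field that is not examined, and != for a negated match. A small matcher covering all three is sketched below. The names matches, MISC_ROWS and decode_misc16 are hypothetical, and the third table column is assumed here to be the 4-bit field in bits[3:0].

```python
def matches(pattern, value):
    """Return True if `value` fits a decode-table pattern such as 'x0x1'."""
    pattern = pattern.replace(" ", "")
    if pattern == "-":                    # field is not examined
        return True
    if pattern.startswith("!="):          # negated match
        return not matches(pattern[2:], value)
    bits = format(value, "0{}b".format(len(pattern)))
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# (op0, op1, bits[3:0], target), in the same order as the table.
MISC_ROWS = [
    ("0000", "-", "-", "Adjust SP (immediate)"),
    ("0010", "-", "-", "Extend"),
    ("0110", "00", "-", "Unallocated"),
    ("0110", "01", "-", "Change Processor State"),
    ("0110", "1x", "-", "Unallocated"),
    ("0111", "-", "-", "Unallocated"),
    ("1000", "-", "-", "Unallocated"),
    ("1010", "10", "-", "HLT"),
    ("1010", "!= 10", "-", "Reverse bytes"),
    ("1110", "-", "-", "BKPT"),
    ("1111", "-", "0000", "Hints"),
    ("1111", "-", "!= 0000", "IT"),
    ("x0x1", "-", "-", "CBNZ, CBZ"),
    ("x10x", "-", "-", "Push and Pop"),
]


def decode_misc16(hw):
    """Classify a halfword whose bits[15:12] are 0b1011."""
    if (hw >> 12) & 0xF != 0b1011:
        raise ValueError("not a miscellaneous 16-bit instruction")
    op0 = (hw >> 8) & 0xF     # bits[11:8]
    op1 = (hw >> 6) & 0b11    # bits[7:6]
    low4 = hw & 0xF           # bits[3:0]
    for p0, p1, p3, target in MISC_ROWS:
        if matches(p0, op0) and matches(p1, op1) and matches(p3, low4):
            return target
    return None
```

The rows do not overlap, so the first match is the only match and the row order does not affect the result.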





Adjust SP (immediate) 


This section describes the encoding of the Adjust SP (immediate) instruction class. This section is decoded from 
Miscellaneous 16-bit instructions. 


|15 14 13 12 11 10 9 8| 7 |6        0|
| 1  0  1  1  0  0 0 0| S |   imm7   |

Decode fields   Instruction Page
S
0               ADD, ADDS (SP plus immediate)
1               SUB, SUBS (SP minus immediate)


Extend 


This section describes the encoding of the Extend instruction class. This section is decoded from Miscellaneous 
16-bit instructions on page F3-2442. 


|15 14 13 12 11 10 9 8| 7 | 6 |5   3|2   0|
| 1  0  1  1  0  0 1 0| U | B | Rm  | Rd  |

Decode fields   Instruction Page
U   B
0   0           SXTH
0   1           SXTB
1   0           UXTH
1   1           UXTB





Change Processor State 


This section describes the encoding of the Change Processor State instruction class. This section is decoded from 
Miscellaneous 16-bit instructions on page F3-2442. 


|15 14 13 12 11 10 9 8| 7 | 6 | 5 |4       0|
| 1  0  1  1  0  1 1 0| 0 | 1 | op|  flags  |

Decode fields   Instruction Page
op   flags
0    -          SETEND
1    -          CPS, CPSID, CPSIE


Reverse bytes 
This section describes the encoding of the Reverse bytes instruction class. This section is decoded from
Miscellaneous 16-bit instructions on page F3-2442.

|15 14 13 12 11 10 9 8|7    6|5   3|2   0|
| 1  0  1  1  1  0 1 0| != 10| Rm  | Rd  |
                         op

Decode fields   Instruction Page
op
00              REV
01              REV16
11              REVSH





Hints 


This section describes the encoding of the Hints instruction class. This section is decoded from Miscellaneous 16-bit 
instructions on page F3-2442. 


|15 14 13 12 11 10 9 8|7     4|3  2  1  0|
| 1  0  1  1  1  1 1 1| hint  |0  0  0  0|

Decode fields   Instruction Page
hint
0000            NOP
0001            YIELD
0010            WFE
0011            WFI
0100            SEV
0101            SEVL
011x            Reserved hint, behaves as NOP.
1xxx            Reserved hint, behaves as NOP.


Push and Pop 
This section describes the encoding of the Push and Pop instruction class. This section is decoded from
Miscellaneous 16-bit instructions on page F3-2442.

|15 14 13 12|11|10  9| 8 |7               0|
| 1  0  1  1| L| 1  0| P |  register_list  |

Decode fields   Instruction Page
L
0               PUSH
1               POP





F3.2.10 Load/Store multiple 


This section describes the encoding of the Load/Store multiple instruction class. This section is decoded from 16-bit
T32 instruction encoding on page F3-2436.

|15 14 13 12|11|10   8|7               0|
| 1  1  0  0| L|  Rn  |  register_list  |

Decode fields   Instruction Page
L
0               STM, STMIA, STMEA
1               LDM, LDMIA, LDMFD





F3.2.11 Conditional branch, and Supervisor Call 
This section describes the encoding of the Conditional branch, and Supervisor Call group. This section is decoded
from 16-bit T32 instruction encoding on page F3-2436.

|15    12|11     8|7              0|
|  1101  |  op0   |                |

Decode fields   Decode group or instruction page
op0
111x            Exception generation on page F3-2446
!= 111x         B - T1 variant


Exception generation 
This section describes the encoding of the Exception generation instruction class. This section is decoded from
Conditional branch, and Supervisor Call on page F3-2445.

|15 14 13 12 11 10 9| 8 |7          0|
| 1  1  0  1  1  1 1| S |    imm8    |

Decode fields   Instruction Page
S
0               UDF
1               SVC
F3.3 32-bit T32 instruction encoding


This section describes the encoding of the 32-bit T32 instruction encoding group. This section is decoded from Top 
level T32 instruction set encoding on page F3-2434. 


|15 13|12     9|8       4|3    0| |15 |14                    0|
| 111 |  op0   |   op1   |      | |op3|                       |





Decode fields               Decode group or instruction page
op0    op1        op3
x11x   -          -         System register access, Advanced SIMD, and floating-point instructions
0100   xx0xx      -         Load/Store multiple on page F3-2466
0100   xx1xx      -         Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on
                            page F3-2466
0101   -          -         Data-processing (shifted register) on page F3-2470
10xx   -          1         Branches and miscellaneous control on page F3-2471
10x0   -          0         Data-processing (modified immediate) on page F3-2475
10x1   -          0         Data-processing (plain binary immediate) on page F3-2477
1100   1xxx0      -         Advanced SIMD element or structure Load/Store on page F3-2479
1100   != 1xxx0   -         Load/Store single on page F3-2485
1101   0xxxx      -         Data-processing (register) on page F3-2490
1101   10xxx      -         Multiply, multiply accumulate, and absolute difference on page F3-2494
1101   11xxx      -         Long multiply and divide on page F3-2496
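
Each bit pattern in the table can be compiled into a mask and an expected value, so that a row matches when (field AND mask) equals the expected value. The sketch below applies this to the top-level 32-bit decode. The names bit_match, T32_ROWS and classify_t32_32bit are hypothetical, None stands for a field the table marks with -, and a leading ! stands for the table's != notation. The guard on the first halfword assumes the usual rule that a 32-bit T32 instruction has bits[15:13] == 0b111 and bits[12:11] != 0b00.

```python
def bit_match(pattern, value):
    """Match `value` against a pattern of 0, 1 and x characters."""
    if pattern is None:                       # field not examined
        return True
    if pattern.startswith("!"):               # negated row
        return not bit_match(pattern[1:], value)
    mask = int(pattern.replace("0", "1").replace("x", "0"), 2)
    want = int(pattern.replace("x", "0"), 2)
    return (value & mask) == want


T32_ROWS = [
    ("x11x", None, None,
     "System register access, Advanced SIMD, and floating-point instructions"),
    ("0100", "xx0xx", None, "Load/Store multiple"),
    ("0100", "xx1xx", None,
     "Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch"),
    ("0101", None, None, "Data-processing (shifted register)"),
    ("10xx", None, 1, "Branches and miscellaneous control"),
    ("10x0", None, 0, "Data-processing (modified immediate)"),
    ("10x1", None, 0, "Data-processing (plain binary immediate)"),
    ("1100", "1xxx0", None, "Advanced SIMD element or structure Load/Store"),
    ("1100", "!1xxx0", None, "Load/Store single"),
    ("1101", "0xxxx", None, "Data-processing (register)"),
    ("1101", "10xxx", None, "Multiply, multiply accumulate, and absolute difference"),
    ("1101", "11xxx", None, "Long multiply and divide"),
]


def classify_t32_32bit(hw1, hw2):
    """Return the decode group for a 32-bit T32 instruction (hw1 first)."""
    if (hw1 >> 13) != 0b111 or ((hw1 >> 11) & 0b11) == 0:
        raise ValueError("hw1 does not start a 32-bit T32 instruction")
    op0 = (hw1 >> 9) & 0xF     # first halfword bits[12:9]
    op1 = (hw1 >> 4) & 0x1F    # first halfword bits[8:4]
    op3 = (hw2 >> 15) & 1      # second halfword bit[15]
    for p0, p1, p3, group in T32_ROWS:
        if bit_match(p0, op0) and bit_match(p1, op1) and p3 in (None, op3):
            return group
    return None
```

Precomputing the mask and value pair per row, rather than recomputing it on each call as this sketch does, would be the natural next step in a decoder that is called frequently.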








F3.3.1 System register access, Advanced SIMD, and floating-point instructions


This section describes the encoding of the System register access, Advanced SIMD, and floating-point instructions 
group. This section is decoded from 32-bit T32 instruction encoding. 


|15 13|12|11 10|9   8|7         0| |15   12|11   9|8     5| 4 |3    0|
| 111 |  | 1  1| op0 |           | |       | op1  |       |op2|      |





Decode fields             Decode group or instruction page
op0     op1      op2
0x      101      -        Advanced SIMD and floating-point Load/Store and 64-bit register moves on page F3-2448
10      101      0        Floating-point data-processing on page F3-2450
10      101      1        Floating-point and Advanced SIMD 32-bit register moves on page F3-2453
!= 11   != 1x1   -        Unallocated
!= 11   111      -        System register access on page F3-2464
11      -        -        Advanced SIMD data-processing on page F3-2454





Advanced SIMD and floating-point Load/Store and 64-bit register moves 


This section describes the encoding of the Advanced SIMD and floating-point Load/Store and 64-bit register moves 
group. This section is decoded from System register access, Advanced SIMD, and floating-point instructions on 
page F3-2447. 


|15            9|8     5|4      0| |15   12|11   9|8            0|
| 1 1 1 0 1 1 0 |  op0  |        | |       | 101  |              |





Decode fields   Decode group or instruction page
op0
00x0            Advanced SIMD and floating-point 64-bit move
!= 00x0         Advanced SIMD and floating-point Load/Store on page F3-2449





Advanced SIMD and floating-point 64-bit move 
This section describes the encoding of the Advanced SIMD and floating-point 64-bit move instruction class. This
section is decoded from Advanced SIMD and floating-point Load/Store and 64-bit register moves.

|15 14 13 12 11 10 9 8 7| 6 | 5 | 4 |3    0| |15  12|11 10 9| 8 |7   6| 5 | 4 |3   0|
| 1  1  1  0  1  1 0 0 0| D | 0 | op| Rt2  | |  Rt  | 1  0 1| sz| opc2| M | o3| Vm  |

Decode fields                    Instruction Page
op   opc2   o3   D   sz
-    -      -    0   -           Unallocated.
0    00     1    1   0           VMOV (between two general-purpose registers and two single-precision registers)
0    00     1    1   1           VMOV (between two general-purpose registers and a doubleword floating-point register)
-    -      0    1   -           Unallocated.
-    01     -    1   -           Unallocated.
-    1x     -    1   -           Unallocated.
1    00     1    1   0           VMOV (between two general-purpose registers and two single-precision registers)
1    00     1    1   1           VMOV (between two general-purpose registers and a doubleword floating-point register)
Advanced SIMD and floating-point Load/Store


This section describes the encoding of the Advanced SIMD and floating-point Load/Store instruction class. This 
section is decoded from Advanced SIMD and floating-point Load/Store and 64-bit register moves on page F3-2448. 


|15 14 13 12 11 10 9| 8 | 7 | 6 | 5 | 4 |3   0| |15  12|11 10 9| 8 |7          0|
| 1  1  1  0  1  1 0| P | U | D | W | L | Rn  | |  Vd  | 1  0 1| sz|    imm8    |

Decode fields                              Instruction Page
P   U   L   sz   imm8       W   Rn
0   0   -   -    -          1   -          Unallocated.
0   1   0   0    -          -   -          VSTM, VSTMDB, VSTMIA
0   1   0   1    xxxxxxx0   -   -          VSTM, VSTMDB, VSTMIA
0   1   0   1    xxxxxxx1   -   -          FSTMDBX, FSTMIAX - Increment After variant
0   1   1   0    -          -   -          VLDM, VLDMDB, VLDMIA
0   1   1   1    xxxxxxx0   -   -          VLDM, VLDMDB, VLDMIA
0   1   1   1    xxxxxxx1   -   -          FLDMDBX, FLDMIAX - Increment After variant
1   -   0   -    -          0   -          VSTR
1   0   0   0    -          1   -          VSTM, VSTMDB, VSTMIA
1   0   0   1    xxxxxxx0   1   -          VSTM, VSTMDB, VSTMIA
1   0   0   1    xxxxxxx1   1   -          FSTMDBX, FSTMIAX - Decrement Before variant
1   0   1   0    -          1   -          VLDM, VLDMDB, VLDMIA
1   0   1   1    xxxxxxx0   1   -          VLDM, VLDMDB, VLDMIA
1   0   1   1    xxxxxxx1   1   -          FLDMDBX, FLDMIAX - Decrement Before variant
1   -   1   -    -          0   != 1111    VLDR (immediate)
1   -   1   -    -          0   1111       VLDR (literal)
1   1   -   -    -          1   -          Unallocated.


Floating-point data-processing 


This section describes the encoding of the Floating-point data-processing group. This section is decoded from 
System register access, Advanced SIMD, and floating-point instructions on page F3-2447. 


|15 13|12 |11    8|7    4| 3 |2   0| |15   12|11   9|8   7| 6 | 5 | 4 |3    0|
| 111 |op0| 1110  | op1  |op2|     | |       | 101  |     |op3|   | 0 |      |


Decode fields                   Decode group or instruction page
op0   op1       op2   op3
0     1x11      -     1         Floating-point data-processing (two registers)
0     1x11      -     0         VMOV (immediate)
0     != 1x11   -     -         Floating-point data-processing (three registers) on page F3-2451
1     0xxx      -     0         VSELEQ, VSELGE, VSELGT, VSELVS
1     1x00      -     -         Floating-point minNum/maxNum on page F3-2452
1     1x11      1     1         Floating-point directed convert to integer on page F3-2452





Floating-point data-processing (two registers) 
This section describes the encoding of the Floating-point data-processing (two registers) instruction class. This
section is decoded from Floating-point data-processing.

|15 14 13 12 11 10 9 8 7| 6 |5  4| 3 |2    0| |15  12|11 10 9| 8 | 7 | 6 | 5 | 4 |3   0|
| 1  1  1  0  1  1 1 0 1| D |1  1| o1| opc2 | |  Vd  | 1  0 1| sz| o3| 1 | M | 0 | Vm  |





Decode fields       Instruction Page
o1   opc2   o3
0    000    0       VMOV (register)
0    000    1       VABS
0    001    0       VNEG
0    001    1       VSQRT
0    010    0       VCVTB
0    010    1       VCVTT
0    011    0       VCVTB
0    011    1       VCVTT
0    100    0       VCMP
0    100    1       VCMPE
0    101    0       VCMP
0    101    1       VCMPE
0    110    0       VRINTR
0    110    1       VRINTZ (floating-point)
0    111    0       VRINTX (floating-point)
0    111    1       VCVT (between double-precision and single-precision)
1    000    -       VCVT (integer to floating-point, floating-point)
1    001    -       Unallocated.
1    01x    -       VCVT (between floating-point and fixed-point, floating-point)
1    100    0       VCVTR
1    10x    1       VCVT (floating-point to integer, floating-point)
1    101    0       VCVTR
1    11x    -       VCVT (between floating-point and fixed-point, floating-point)





Floating-point data-processing (three registers) 


This section describes the encoding of the Floating-point data-processing (three registers) instruction class. This 
section is decoded from Floating-point data-processing on page F3-2450. 


|15 14 13 12 11 10 9 8| 7 | 6 |5  4|3   0| |15  12|11 10 9| 8 | 7 | 6 | 5 | 4 |3   0|
| 1  1  1  0  1  1 1 0| o0| D | o1 | Vn  | |  Vd  | 1  0 1| sz| N | o2| M | 0 | Vm  |

Decode fields     Instruction Page
o0   o1   o2
0    00   0       VMLA (floating-point)
0    00   1       VMLS (floating-point)
0    01   0       VNMLS
0    01   1       VNMLA
0    10   0       VMUL (floating-point)
0    10   1       VNMUL
0    11   0       VADD (floating-point)
0    11   1       VSUB (floating-point)
1    00   0       VDIV
1    01   0       VFNMS
1    01   1       VFNMA
1    10   0       VFMA
1    10   1       VFMS





Floating-point minNum/maxNum 
This section describes the encoding of the Floating-point minNum/maxNum instruction class. This section is
decoded from Floating-point data-processing on page F3-2450.

|15 14 13 12 11 10 9 8 7| 6 |5  4|3   0| |15  12|11 10 9| 8 | 7 | 6 | 5 | 4 |3   0|
| 1  1  1  1  1  1 1 0 1| D |0  0| Vn  | |  Vd  | 1  0 1| sz| N | op| M | 0 | Vm  |

Decode fields   Instruction Page
op
0               VMAXNM
1               VMINNM





Floating-point directed convert to integer 


This section describes the encoding of the Floating-point directed convert to integer instruction class. This section 
is decoded from Floating-point data-processing on page F3-2450. 


|15 14 13 12 11 10 9 8 7| 6 |5  4  3| 2 |1  0| |15  12|11 10 9| 8 | 7 | 6 | 5 | 4 |3   0|
| 1  1  1  1  1  1 1 0 1| D |1  1  1| o1| rm | |  Vd  | 1  0 1| sz| op| 1 | M | 0 | Vm  |

Decode fields   Instruction Page
o1   rm
0    00         VRINTA (floating-point)
0    01         VRINTN (floating-point)
0    10         VRINTP (floating-point)
0    11         VRINTM (floating-point)
1    00         VCVTA (floating-point)
1    01         VCVTN (floating-point)
1    10         VCVTP (floating-point)
1    11         VCVTM (floating-point)
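
The table is regular: o1 selects between the VRINT and VCVT families, and rm selects the suffix A, N, P or M. A sketch that exploits this regularity follows. The function name is hypothetical, and the bit positions used for o1 and rm (bit[2] and bits[1:0] of the first halfword) are an assumption based on the field layout of this class. Only the first halfword is checked here; a full decoder would also check the fixed bits of the second halfword.

```python
def decode_fp_directed_convert(hw1):
    """Name the instruction from the first halfword of the encoding."""
    # Assumed fixed bits: bits[15:7] == 0b111111101 and bits[5:3] == 0b111.
    if (hw1 >> 7) != 0b111111101 or ((hw1 >> 3) & 0b111) != 0b111:
        raise ValueError("not a directed-convert first halfword")
    o1 = (hw1 >> 2) & 1       # 0 selects VRINT*, 1 selects VCVT*
    rm = hw1 & 0b11           # 00, 01, 10, 11 select A, N, P, M
    stem = "VCVT" if o1 else "VRINT"
    return stem + "ANPM"[rm] + " (floating-point)"
```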


Floating-point and Advanced SIMD 32-bit register moves 
This section describes the encoding of the Floating-point and Advanced SIMD 32-bit register moves group. This
section is decoded from System register access, Advanced SIMD, and floating-point instructions on page F3-2447.

|15              8|7    5|4      0| |15   12|11   9| 8 |7   5| 4 |3    0|
| 1 1 1 0 1 1 1 0 | op0  |        | |       | 101  |op1|     | 1 |      |





Decode fields   Decode group or instruction page
op0   op1
000   0         VMOV (between general-purpose register and single-precision)
111   0         Floating-point move special register
-     1         Advanced SIMD 8/16/32-bit element move/duplicate





Floating-point move special register 


This section describes the encoding of the Floating-point move special register instruction class. This section is 
decoded from Floating-point and Advanced SIMD 32-bit register moves. 


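|15 14 13 12 11 10 9 8 7 6 5| 4 |3   0| |15  12|11 10 9 8| 7   6   5 | 4 | 3   2   1   0 |
| 1  1  1  0  1  1 1 0 1 1 1| L | reg | |  Rt  | 1  0 1 0|(0) (0) (0)| 1 |(0) (0) (0) (0)|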








Decode fields   Instruction Page
L
0               VMSR
1               VMRS





Advanced SIMD 8/16/32-bit element move/duplicate 


This section describes the encoding of the Advanced SIMD 8/16/32-bit element move/duplicate instruction class. 
This section is decoded from Floating-point and Advanced SIMD 32-bit register moves. 


|15 14 13 12 11 10 9 8|7    5| 4 |3   0| |15  12|11 10 9 8| 7 |6   5| 4 | 3   2   1   0 |
| 1  1  1  0  1  1 1 0| opc1 | L | Vn  | |  Rt  | 1  0 1 1| N | opc2| 1 |(0) (0) (0) (0)|








Decode fields        Instruction Page
opc1   opc2   L
0xx    -      0      VMOV (general-purpose register to scalar)
-      -      1      VMOV (scalar to general-purpose register)
1xx    0x     0      VDUP (general-purpose register)
1xx    1x     0      Unallocated.





Advanced SIMD data-processing 


This section describes the encoding of the Advanced SIMD data-processing group. This section is decoded from 
System register access, Advanced SIMD, and floating-point instructions on page F3-2447. 


|15 13|12 |11    8| 7 | 6 |5       0| |15         7| 6 | 5 | 4 |3    0|
| 111 |op0| 1111  |op1|   |         op2            |op3|   |op4|      |





Decode fields                                   Decode group or instruction page
op0   op1   op2                   op3   op4
0     1     11xxxxxxxxxxxxx       -     0       VEXT (byte elements)
1     1     11xxxxxxxx0xxxx       -     0       Advanced SIMD two registers misc on page F3-2455
1     1     11xxxxxxxx10xxx       -     0       VTBL, VTBX
1     1     11xxxxxxxx11xxx       -     0       Advanced SIMD duplicate (scalar) on page F3-2457
-     0     -                     -     -       Advanced SIMD three registers of the same length on page F3-2457
-     1     000xxxxxxxxxxx0       -     1       Advanced SIMD one register and modified immediate on page F3-2460
-     1     != 11xxxxxxxxxxxxx    0     0       Advanced SIMD three registers of different lengths on page F3-2461
-     1     != 11xxxxxxxxxxxxx    1     0       Advanced SIMD two registers and a scalar on page F3-2462
-     1     != 000xxxxxxxxxxx0    -     1       Advanced SIMD two registers and shift amount on page F3-2463


Advanced SIMD two registers misc 


This section describes the encoding of the Advanced SIMD two registers misc instruction class. This section is 
decoded from Advanced SIMD data-processing on page F3-2454. 


|15 14 13 12 11 10 9 8 7| 6 |5  4|3    2|1    0| |15  12|11|10    7| 6 | 5 | 4 |3   0|
| 1  1  1  1  1  1 1 1 1| D |1  1| size | opc1 | |  Vd  | 0| opc2  | Q | M | 0 | Vm  |





Decode fields               Instruction Page
opc1   opc2   Q   size
00     0000   -   -         VREV64
00     0001   -   -         VREV32
00     0010   -   -         VREV16
00     0011   -   -         Unallocated.
00     010x   -   -         VPADDL
00     0110   0   -         AESE
00     0110   1   -         AESD
00     0111   0   -         AESMC
00     0111   1   -         AESIMC
00     1000   -   -         VCLS
00     1001   -   -         VCLZ
00     1010   -   -         VCNT
00     1011   -   -         VMVN (register)
00     110x   -   -         VPADAL
00     1110   -   -         VQABS
00     1111   -   -         VQNEG
01     x000   -   -         VCGT (immediate #0)
01     x001   -   -         VCGE (immediate #0)
01     x010   -   -         VCEQ (immediate #0)
01     x011   -   -         VCLE (immediate #0)
01     x100   -   -         VCLT (immediate #0)
01     x110   -   -         VABS
01     x111   -   -         VNEG
01     0101   1   -         SHA1H
10     0000   -   00        VSWP
10     0001   -   -         VTRN
10     0010   -   -         VUZP
10     0011   -   -         VZIP
10     0100   0   -         VMOVN
10     0100   1   -         VQMOVN, VQMOVUN - Unsigned result variant
10     0101   -   -         VQMOVN, VQMOVUN - Signed result variant
10     0110   0   -         VSHLL
10     0111   0   -         SHA1SU1
10     0111   1   -         SHA256SU0
10     1000   -   -         VRINTN (Advanced SIMD)
10     1001   -   -         VRINTX (Advanced SIMD)
10     1010   -   -         VRINTA (Advanced SIMD)
10     1011   -   -         VRINTZ (Advanced SIMD)
10     11x0   0   -         VCVT (between half-precision and single-precision, Advanced SIMD)
10     1100   1   -         Unallocated.
10     1101   -   -         VRINTM (Advanced SIMD)
10     1110   1   -         Unallocated.
10     1111   -   -         VRINTP (Advanced SIMD)
11     000x   -   -         VCVTA (Advanced SIMD)
11     001x   -   -         VCVTN (Advanced SIMD)
11     010x   -   -         VCVTP (Advanced SIMD)
11     011x   -   -         VCVTM (Advanced SIMD)
11     10x0   -   -         VRECPE
11     10x1   -   -         VRSQRTE
11     11xx   -   -         VCVT (between floating-point and integer, Advanced SIMD)
Advanced SIMD duplicate (scalar)


This section describes the encoding of the Advanced SIMD duplicate (scalar) instruction class. This section is 
decoded from Advanced SIMD data-processing on page F3-2454. 


|15 14 13 12 11 10 9 8 7| 6 |5  4|3    0| |15  12|11 10|9   7| 6 | 5 | 4 |3   0|
| 1  1  1  1  1  1 1 1 1| D |1  1| imm4 | |  Vd  | 1  1| opc | Q | M | 0 | Vm  |

Decode fields   Instruction Page
opc
000             VDUP (scalar)
001             Unallocated.
01x             Unallocated.
1xx             Unallocated.

Advanced SIMD three registers of the same length

This section describes the encoding of the Advanced SIMD three registers of the same length instruction class. This
section is decoded from Advanced SIMD data-processing on page F3-2454.


|15 13|12|11 10 9 8 7| 6 |5    4|3   0| |15  12|11    8| 7 | 6 | 5 | 4 |3   0|
| 111 | U| 1  1 1 1 0| D | size | Vn  | |  Vd  |  opc  | N | Q | M | o1| Vm  |

Decode fields                   Instruction Page
opc    o1   U   size   Q
0000   0    -   -      -        VHADD
0000   1    -   -      -        VQADD
0001   0    -   -      -        VRHADD
0001   1    0   00     -        VAND (register)
0001   1    0   01     -        VBIC (register)
0001   1    0   10     -        VORR (register)
0001   1    0   11     -        VORN (register)
0001   1    1   00     -        VEOR
0001   1    1   01     -        VBSL
0001   1    1   10     -        VBIT
0001   1    1   11     -        VBIF
0010   0    -   -      -        VHSUB
0010   1    -   -      -        VQSUB
0011   0    -   -      -        VCGT (register)
0011   1    -   -      -        VCGE (register)
0100   0    -   -      -        VSHL (register)
0100   1    -   -      -        VQSHL (register)
0101   0    -   -      -        VRSHL
0101   1    -   -      -        VQRSHL
0110   0    -   -      -        VMAX (integer)
0110   1    -   -      -        VMIN (integer)
0111   0    -   -      -        VABD (integer)
0111   1    -   -      -        VABA
1000   0    0   -      -        VADD (integer)
1000   0    1   -      -        VSUB (integer)
1000   1    0   -      -        VTST
1000   1    1   -      -        VCEQ (register)
1001   0    0   -      -        VMLA (integer)
1001   0    1   -      -        VMLS (integer)
1001   1    -   -      -        VMUL (integer and polynomial)
1010   0    -   -      0        VPMAX (integer)
1010   -    -   -      1        Unallocated.
1010   1    -   -      0        VPMIN (integer)
1011   0    0   -      -        VQDMULH
1011   0    1   -      -        VQRDMULH
1011   1    0   -      -        VPADD (integer)
1011   1    1   -      -        Unallocated.
1100   0    0   00     -        SHA1C
1100   0    0   01     -        SHA1P
1100   0    0   10     -        SHA1M
1100   0    0   11     -        SHA1SU0
1100   0    1   00     -        SHA256H
1100   0    1   01     -        SHA256H2
1100   0    1   10     -        SHA256SU1
1100   1    0   00     -        VFMA
1100   1    0   01     -        Unallocated.
1100   1    0   10     -        VFMS
1100   1    0   11     -        Unallocated.
1100   1    1   -      -        Unallocated.
1101   0    0   0x     -        VADD (floating-point)
1101   0    0   1x     -        VSUB (floating-point)
1101   0    1   0x     -        VPADD (floating-point)
1101   0    1   1x     -        VABD (floating-point)
1101   1    0   00     -        VMLA (floating-point)
1101   1    0   01     -        Unallocated.
1101   1    0   10     -        VMLS (floating-point)
1101   1    0   11     -        Unallocated.
1101   1    1   00     -        VMUL (floating-point)
1101   1    1   01     -        Unallocated.
1110   0    0   0x     -        VCEQ (register)
1110   0    0   1x     -        Unallocated.
1110   0    1   0x     -        VCGE (register)
1110   0    1   1x     -        VCGT (register)
1110   1    1   00     -        VACGE
1110   1    1   01     -        Unallocated.
1110   1    1   10     -        VACGT
1110   1    1   11     -        Unallocated.
1111   0    0   00     -        VMAX (floating-point)
1111   0    0   01     -        Unallocated.
1111   0    0   10     -        VMIN (floating-point)
1111   0    0   11     -        Unallocated.
1111   0    1   00     0        VPMAX (floating-point)
1111   0    1   01     0        Unallocated.
1111   0    1   -      1        Unallocated.
1111   0    1   10     0        VPMIN (floating-point)
1111   0    1   11     0        Unallocated.
1111   1    0   0x     -        VRECPS
1111   1    0   1x     -        VRSQRTS
1111   1    1   00     -        VMAXNM
1111   1    1   01     -        Unallocated.
1111   1    1   10     -        VMINNM
1111   1    1   11     -        Unallocated.








Advanced SIMD one register and modified immediate 
This section describes the encoding of the Advanced SIMD one register and modified immediate instruction class.
This section is decoded from Advanced SIMD data-processing on page F3-2454.

|15 13|12|11 10 9 8 7| 6 |5  4  3|2    0| |15  12|11     8| 7 | 6 | 5 | 4 |3    0|
| 111 | i| 1  1 1 1 1| D |0  0  0| imm3 | |  Vd  | cmode  | 0 | Q | op| 1 | imm4 |

Decode fields   Instruction Page
op   cmode
0    0xx0       VMOV (immediate)
0    0xx1       VORR (immediate)
0    10x0       VMOV (immediate)
0    10x1       VORR (immediate)
0    11xx       VMOV (immediate)
1    0xx0       VMVN (immediate)
1    0xx1       VBIC (immediate)
1    10x0       VMVN (immediate)
1    10x1       VBIC (immediate)
1    110x       VMVN (immediate)
1    1110       VMOV (immediate)
1    1111       UNDEFINED.
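
The twelve rows above collapse to a few tests on op and cmode, because bit[0] of cmode separates the move forms from the logical forms in all but the 11xx group. The following is a sketch of that reduction; the function name decode_simd_modified_imm is hypothetical, and op and cmode are assumed to be already extracted as integers.

```python
def decode_simd_modified_imm(op, cmode):
    """Map the op and cmode decode fields to the instruction page name."""
    if op == 0:
        if (cmode & 0b1100) == 0b1100:        # 11xx
            return "VMOV (immediate)"
        # 0xx0 and 10x0 are VMOV; 0xx1 and 10x1 are VORR
        return "VORR (immediate)" if cmode & 1 else "VMOV (immediate)"
    if cmode == 0b1111:
        return "UNDEFINED"
    if cmode == 0b1110:
        return "VMOV (immediate)"
    if (cmode & 0b1110) == 0b1100:            # 110x
        return "VMVN (immediate)"
    # 0xx0 and 10x0 are VMVN; 0xx1 and 10x1 are VBIC
    return "VBIC (immediate)" if cmode & 1 else "VMVN (immediate)"
```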


Advanced SIMD three registers of different lengths 


This section describes the encoding of the Advanced SIMD three registers of different lengths instruction class. This 
section is decoded from Advanced SIMD data-processing on page F3-2454. 


|15 13|12|11 10 9 8 7| 6 |5     4|3   0| |15  12|11    8| 7 | 6 | 5 | 4 |3   0|
| 111 | U| 1  1 1 1 1| D | != 11 | Vn  | |  Vd  |  opc  | N | 0 | M | 0 | Vm  |
                           size

Decode fields   Instruction Page
opc    U
0000   -        VADDL
0001   -        VADDW
0010   -        VSUBL
0011   -        VSUBW
0100   0        VADDHN
0100   1        VRADDHN
0101   -        VABAL
0110   0        VSUBHN
0110   1        VRSUBHN
0111   -        VABDL (integer)
1000   -        VMLAL (integer)
1001   0        VQDMLAL
1001   1        Unallocated.
1010   -        VMLSL (integer)
1011   0        VQDMLSL
1011   1        Unallocated.
11x0   -        VMULL (integer and polynomial)
1101   0        VQDMULL
1101   1        Unallocated.
1111   -        Unallocated.


Advanced SIMD two registers and a scalar 


This section describes the encoding of the Advanced SIMD two registers and a scalar instruction class. This section 
is decoded from Advanced SIMD data-processing on page F3-2454. 


|15 13|12|11 10 9 8 7| 6 |5     4|3   0| |15  12|11    8| 7 | 6 | 5 | 4 |3   0|
| 111 | Q| 1  1 1 1 1| D | != 11 | Vn  | |  Vd  |  opc  | N | 1 | M | 0 | Vm  |
                           size

Decode fields   Instruction Page
opc    Q
000x   -        VMLA (by scalar)
0010   -        VMLAL (by scalar)
0011   0        VQDMLAL
0011   1        Unallocated.
010x   -        VMLS (by scalar)
0110   -        VMLSL (by scalar)
0111   0        VQDMLSL
0111   1        Unallocated.
100x   -        VMUL (by scalar)
1010   -        VMULL (by scalar)
1011   0        VQDMULL
1011   1        Unallocated.
1100   -        VQDMULH
1101   -        VQRDMULH
111x   -        Unallocated.


Advanced SIMD two registers and shift amount 


This section describes the encoding of the Advanced SIMD two registers and shift amount instruction class. This
section is decoded from Advanced SIMD data-processing on page F3-2454. This decode imposes the constraint that
imm3H:L != '0000'.

|15 13|12|11 10 9 8 7| 6 |5     3|2     0| |15  12|11    8| 7 | 6 | 5 | 4 |3   0|
| 111 | U| 1  1 1 1 1| D | imm3H | imm3L | |  Vd  |  opc  | L | Q | M | 1 | Vm  |








Decode fields          Instruction Page
opc L imm3L Q

e000 - Sr VSHR 

e001 - SC - VSRA 

e010 - VRSHR 

0011 -) - VRSRA 

0100 -~ = - VSRI 

0101 =- - VSHL (immediate) 

0101 - - VSLI 

0110 - - VQSHL, VQSHLU (immediate) 

0111 - - 

1000 «6Ot- VSHRN 

1000 «0 (Ot stéir VQSHRN, VQSHRUN - Unsigned result variant 

1000 «0 Ot- VRSHRN 

1000 «S(O stéir VQRSHRN, VQRSHRUN - Unsigned result variant 

1001 O - VQSHRN, VQSHRUN - Signed result variant 

1001 O - VQRSHRN, VQRSHRUN - Signed result variant 

1010 OF - VSHLL 

1010 0 000 VMOVL 

lllix O - VCVT (between floating-point and fixed-point, Advanced SIMD) 


System register access 


This section describes the encoding of the System register access group. This section is decoded from System 
register access, Advanced SIMD, and floating-point instructions on page F3-2447. 


|15 13|12|11 10| 9 |8     5|4    0| |15   12|11     8|7    5| 4 |3    0|
| 111 |  | 1  1|op0|  op1  |      | |       |  111x  |      |op2|      |
                                              coproc





Decode fields            Decode group or instruction page
op0   op1       op2
0     00x0      -        System register 64-bit move
0     != 00x0   -        System register Load/Store on page F3-2465
1     0xxx      0        Unallocated
1     0xxx      1        System register 32-bit move on page F3-2465





System register 64-bit move 


This section describes the encoding of the System register 64-bit move instruction class. This section is decoded 
from System register access. 


[Encoding diagram: fields o0, D, L, Rt2, Rt, coproc, opc1, CRm.]

Decode fields
o0  D  L  Instruction Page
0   0  x  Unallocated.
0   1  0  MCRR
0   1  1  MRRC
1   x  x  Unallocated.


System register Load/Store 
This section describes the encoding of the System register Load/Store instruction class. This section is decoded from
System register access on page F3-2464.

[Encoding diagram: fields o0, P, U, D, W, L, Rn, CRd, coproc, imm8.]

Decode fields
coproc  o0  P:U:W   CRd      L  Rn       Instruction Page
1110    0   000     -        -  -        Unallocated
1110    0   != 000  != 0101  -  -        Unallocated
1110    0   != 000  0101     0  -        STC
1110    0   != 000  0101     1  != 1111  LDC (immediate)
1110    0   != 000  0101     1  1111     LDC (literal)
1110    1   -       -        -  -        Unallocated
1111    -   -       -        -  -        Unallocated





System register 32-bit move 


This section describes the encoding of the System register 32-bit move instruction class. This section is decoded 
from System register access on page F3-2464. 


[Encoding diagram: fields o0, opc1, L, CRn, Rt, coproc, opc2, CRm.]

Decode fields
o0  L  Instruction Page
0   0  MCR
0   1  MRC
1   x  Unallocated


F3.3.2 Load/Store multiple 


This section describes the encoding of the Load/Store multiple instruction class. This section is decoded from 32-bit
T32 instruction encoding on page F3-2447.

[Encoding diagram: fields opc, W, L, Rn, P, M, register_list.]

Decode fields
opc  L  Instruction Page
00   0  SRS, SRSDA, SRSDB, SRSIA, SRSIB - T1 variant
00   1  RFE, RFEDA, RFEDB, RFEIA, RFEIB - T1 variant
01   0  STM, STMIA, STMEA
01   1  LDM, LDMIA, LDMFD
10   0  STMDB, STMFD
10   1  LDMDB, LDMEA
11   0  SRS, SRSDA, SRSDB, SRSIA, SRSIB - T2 variant
11   1  RFE, RFEDA, RFEDB, RFEIA, RFEIB - T2 variant
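
The opc:L decode above amounts to a direct table lookup. The following Python sketch is illustrative only: the names `LDSTM_TABLE` and `decode_ldstm` are hypothetical, and the function takes already-extracted field values rather than a raw instruction word, so it assumes nothing about bit positions.

```python
# Instruction pages keyed by (opc, L), copied from the decode table above.
LDSTM_TABLE = {
    (0b00, 0): "SRS, SRSDA, SRSDB, SRSIA, SRSIB - T1 variant",
    (0b00, 1): "RFE, RFEDA, RFEDB, RFEIA, RFEIB - T1 variant",
    (0b01, 0): "STM, STMIA, STMEA",
    (0b01, 1): "LDM, LDMIA, LDMFD",
    (0b10, 0): "STMDB, STMFD",
    (0b10, 1): "LDMDB, LDMEA",
    (0b11, 0): "SRS, SRSDA, SRSDB, SRSIA, SRSIB - T2 variant",
    (0b11, 1): "RFE, RFEDA, RFEDB, RFEIA, RFEIB - T2 variant",
}


def decode_ldstm(opc: int, l: int) -> str:
    """Return the instruction page for the given opc (2 bits) and L (1 bit)."""
    if not (0 <= opc <= 3 and l in (0, 1)):
        raise ValueError("opc must be 0..3 and L must be 0 or 1")
    return LDSTM_TABLE[(opc, l)]
```

Every (opc, L) combination is allocated in this class, so the lookup never needs an "Unallocated" fallback.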
F3.3.3 Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch 


This section describes the encoding of the Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and 
table branch group. This section is decoded from 32-bit T32 instruction encoding on page F3-2447. 


























[Encoding diagram: fields op0, op1, op2, op3.]

Decode fields
op0      op1  op2      op3  Decode group or instruction page
0010     -    -        -    Load/Store Exclusive word on page F3-2467
0110     0    -        000  Unallocated.
0110     1    -        000  TBB, TBH
0110     -    -        01x  Load/Store Exclusive byte/half/dual on page F3-2467
0110     -    -        1xx  Load-Acquire/Store-Release on page F3-2468
0x11     -    != 1111  -    Load/Store dual (immediate, post-indexed) on page F3-2469
1x10     -    != 1111  -    Load/Store dual (immediate) on page F3-2469
1x11     -    != 1111  -    Load/Store dual (immediate, pre-indexed) on page F3-2469
!= 0xx0  -    1111     -    LDRD (literal)





Load/Store Exclusive word 


This section describes the encoding of the Load/Store Exclusive word instruction class. This section is decoded from 
Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on page F3-2466. 


[Encoding diagram: fields L, Rn, Rt, Rd, imm8.]

Decode fields
L  Instruction Page
0  STREX
1  LDREX





Load/Store Exclusive byte/half/dual 


This section describes the encoding of the Load/Store Exclusive byte/half/dual instruction class. This section is 
decoded from Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on 
page F3-2466. 


[Encoding diagram: fields L, Rn, Rt, Rt2, sz, Rd.]

Decode fields
L  sz  Instruction Page
0  00  STREXB
0  01  STREXH
0  10  Unallocated.
0  11  STREXD
1  00  LDREXB
1  01  LDREXH
1  10  Unallocated.
1  11  LDREXD


Load-Acquire/Store-Release 


This section describes the encoding of the Load-Acquire/Store-Release instruction class. This section is decoded 
from Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on page F3-2466. 


[Encoding diagram: fields L, Rn, Rt, Rt2, op, sz, Rd.]

Decode fields
op  L  sz  Instruction Page
0   0  00  STLB
0   0  01  STLH
0   0  10  STL
0   0  11  Unallocated.
0   1  00  LDAB
0   1  01  LDAH
0   1  10  LDA
0   1  11  Unallocated.
1   0  00  STLEXB
1   0  01  STLEXH
1   0  10  STLEX
1   0  11  STLEXD
1   1  00  LDAEXB
1   1  01  LDAEXH
1   1  10  LDAEX
1   1  11  LDAEXD
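
The table above is regular enough that each mnemonic can be assembled from the three fields instead of being looked up. The Python sketch below shows one way to do this. The function name `decode_lda_stl` is hypothetical, and the composition rule is an observation about this table, not a rule stated by the architecture.

```python
def decode_lda_stl(op: int, l: int, sz: int) -> str:
    """Build the mnemonic for the op, L and sz field values in the table above."""
    if op not in (0, 1) or l not in (0, 1) or not 0 <= sz <= 3:
        raise ValueError("field value out of range")
    # The only unallocated rows are the non-exclusive doubleword forms.
    if op == 0 and sz == 0b11:
        return "Unallocated"
    base = "LDA" if l else "STL"          # L selects load or store
    exclusive = "EX" if op else ""        # op selects the exclusive forms
    suffix = ("B", "H", "", "D")[sz]      # sz selects byte, halfword, word, dual
    return base + exclusive + suffix
```

For example, `decode_lda_stl(1, 0, 3)` yields `STLEXD`, matching the row op=1, L=0, sz=11.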


Load/Store dual (immediate, post-indexed) 


This section describes the encoding of the Load/Store dual (immediate, post-indexed) instruction class. This section 
is decoded from Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on 
page F3-2466. 


[Encoding diagram: fields U, L, Rn, Rt, Rt2, imm8.]

Decode fields
L  Instruction Page
0  STRD (immediate)
1  LDRD (immediate)





Load/Store dual (immediate) 
This section describes the encoding of the Load/Store dual (immediate) instruction class. This section is decoded
from Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on page F3-2466.

[Encoding diagram: fields include L and Rn.]

Decode fields
L  Instruction Page
0  STRD (immediate)
1  LDRD (immediate)





Load/Store dual (immediate, pre-indexed) 


This section describes the encoding of the Load/Store dual (immediate, pre-indexed) instruction class. This section 
is decoded from Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on 
page F3-2466. 


[Encoding diagram: fields U, L, Rn, Rt, Rt2, imm8.]

Decode fields
L  Instruction Page
0  STRD (immediate)
1  LDRD (immediate)
F3.3.4 Data-processing (shifted register)


This section describes the encoding of the Data-processing (shifted register) instruction class. This section is 
decoded from 32-bit T32 instruction encoding on page F3-2447. 


[Encoding diagram: fields op1, S, Rn, imm3, Rd, imm2, type, Rm.]

Decode fields
op1   Rn       Rd       imm3:imm2:type  Instruction Page
0000  -        -        -               AND, ANDS (register) - AND, rotate right with extend variant
0000  -        != 1111  != 0000011      AND, ANDS (register) - ANDS, shift or rotate by value variant
0000  -        != 1111  0000011         AND, ANDS (register) - ANDS, rotate right with extend variant
0000  -        1111     != 0000011      TST (register) - Shift or rotate by value variant
0000  -        1111     0000011         TST (register) - Rotate right with extend variant
0001  -        -        -               BIC, BICS (register)
0010  != 1111  -        -               ORR, ORRS (register) - ORR, rotate right with extend variant
0010  1111     -        -               MOV, MOVS (register) - MOV, rotate right with extend variant
0010  != 1111  -        -               ORR, ORRS (register) - ORRS, rotate right with extend variant
0010  1111     -        -               MOV, MOVS (register) - MOVS, rotate right with extend variant
0011  != 1111  -        -               ORN, ORNS (register) - ORN, rotate right with extend variant
0011  1111     -        -               MVN, MVNS (register) - MVN, rotate right with extend variant
0011  != 1111  -        -               ORN, ORNS (register) - ORNS, rotate right with extend variant
0011  1111     -        -               MVN, MVNS (register) - MVNS, rotate right with extend variant
0100  -        -        -               EOR, EORS (register) - EOR, rotate right with extend variant
0100  -        != 1111  != 0000011      EOR, EORS (register) - EORS, shift or rotate by value variant
0100  -        != 1111  0000011         EOR, EORS (register) - EORS, rotate right with extend variant
0100  -        1111     != 0000011      TEQ (register) - Shift or rotate by value variant
0100  -        1111     0000011         TEQ (register) - Rotate right with extend variant
0101  -        -        -               Unallocated.
0110  -        -        xxxxx00         PKHBT, PKHTB - PKHBT variant
0110  -        -        xxxxx01         Unallocated.
0110  -        -        xxxxx10         PKHBT, PKHTB - PKHTB variant
0110  -        -        xxxxx11         Unallocated.
0111  -        -        -               Unallocated.
1000  != 1101  -        -               ADD, ADDS (register) - ADD, rotate right with extend variant
1000  1101     -        -               ADD, ADDS (SP plus register) - ADD, rotate right with extend variant
1000  != 1101  != 1111  -               ADD, ADDS (register) - ADDS, rotate right with extend variant
1000  1101     != 1111  -               ADD, ADDS (SP plus register) - ADDS, rotate right with extend variant
1000  -        1111     -               CMN (register)
1001  -        -        -               Unallocated.
1010  -        -        -               ADC, ADCS (register)
1011  -        -        -               SBC, SBCS (register)
1100  -        -        -               Unallocated.
1101  != 1101  -        -               SUB, SUBS (register) - SUB, rotate right with extend variant
1101  1101     -        -               SUB, SUBS (SP minus register) - SUB, rotate right with extend variant
1101  != 1101  != 1111  -               SUB, SUBS (register) - SUBS, rotate right with extend variant
1101  1101     != 1111  -               SUB, SUBS (SP minus register) - SUBS, rotate right with extend variant
1101  -        1111     -               CMP (register)
1110  -        -        -               RSB, RSBS (register)
1111  -        -        -               Unallocated.
F3.3.5 Branches and miscellaneous control 


This section describes the encoding of the Branches and miscellaneous control group. This section is decoded from 
32-bit T32 instruction encoding on page F3-2447. 


[Encoding diagram: fields op0, op1, op2, op3, op4, op5.]

Decode fields
op0  op1   op2  op3  op4     op5  Decode group or instruction page
0    1110  0x   0x0  -       0    MSR (register)
0    1110  0x   0x0  -       1    MSR (Banked register)
0    1110  10   0x0  000     -    Hints on page F3-2472
0    1110  10   0x0  != 000  -    Change processor state on page F3-2473
0    1110  11   0x0  -       -    Miscellaneous system on page F3-2473
0    1111  00   0x0  -       -    BXJ
0    1111  01   0x0  -       -    Exception return on page F3-2474
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F3-2471 


ID092916


Non-Confidential 


Decode fields
op0  op1      op2  op3  op4  op5  Decode group or instruction page
0    1111     1x   0x0  -    0    MRS
0    1111     1x   0x0  -    1    MRS (Banked register)
1    1110     00   000  -    -    DCPS on page F3-2474
1    1110     00   010  -    -    Unallocated.
1    1110     01   0x0  -    -    Unallocated.
1    1110     1x   0x0  -    -    Unallocated.
1    1111     0x   0x0  -    -    Unallocated.
1    1111     1x   0x0  -    -    Exception generation on page F3-2475
-    != 111x  -    0x0  -    -    B - T3 variant
-    -        -    0x1  -    -    B - T4 variant
-    -        -    1x0  -    -    BL, BLX (immediate) - T2 variant
-    -        -    1x1  -    -    BL, BLX (immediate) - T1 variant
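
The decode tables in this chapter use a small cell notation: `-` matches any value, `x` matches either bit value in that position, and `!=` negates the pattern that follows. The Python sketch below shows how a single cell could be tested against a field value. The function name `field_matches` and the choice to pass fields as bit strings are assumptions made for illustration.

```python
def field_matches(pattern: str, bits: str) -> bool:
    """Return True if the bit string `bits` satisfies one decode-table cell.

    `pattern` is a cell such as "-", "0x0", "1110" or "!= 111x".
    """
    pattern = pattern.strip()
    if pattern == "-":
        return True                      # don't care: any value matches
    if pattern.startswith("!="):
        return not field_matches(pattern[2:], bits)
    if len(pattern) != len(bits):
        raise ValueError("pattern and field widths differ")
    # Each pattern character must be 'x' or equal to the corresponding bit.
    return all(p in ("x", b) for p, b in zip(pattern, bits))
```

For example, the "B - T3 variant" row requires op1 `!= 111x` and op3 `0x0`, so op1 = `0101` with op3 = `010` satisfies both cells, whereas op1 = `1110` does not.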





Hints 


This section describes the encoding of the Hints instruction class. This section is decoded from Branches and
miscellaneous control on page F3-2471.

[Encoding diagram: fields hint, option.]

Decode fields
hint  option  Instruction Page
0000  0000    NOP
0000  0001    YIELD
0000  0010    WFE
0000  0011    WFI
0000  0100    SEV
0000  0101    SEVL
0000  011x    Reserved hint, behaves as NOP.
0000  1xxx    Reserved hint, behaves as NOP.
0001  -       Reserved hint, behaves as NOP.
001x  -       Reserved hint, behaves as NOP.
01xx  -       Reserved hint, behaves as NOP.
10xx  -       Reserved hint, behaves as NOP.
110x  -       Reserved hint, behaves as NOP.
1110  -       Reserved hint, behaves as NOP.
1111  -       DBG
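
Because most of the hint space is reserved and behaves as NOP, a decoder needs only a handful of explicit cases. The following Python sketch reflects the table above. The names `HINT0_OPTIONS` and `decode_hint` are hypothetical, and the function takes the extracted 4-bit `hint` and `option` values.

```python
# Allocated options when hint == 0b0000, from the table above.
HINT0_OPTIONS = {
    0b0000: "NOP",
    0b0001: "YIELD",
    0b0010: "WFE",
    0b0011: "WFI",
    0b0100: "SEV",
    0b0101: "SEVL",
}


def decode_hint(hint: int, option: int) -> str:
    """Classify a Hints-class instruction from its hint and option fields."""
    if not (0 <= hint <= 15 and 0 <= option <= 15):
        raise ValueError("hint and option are 4-bit fields")
    if hint == 0b1111:
        return "DBG"                       # option is ignored for decode
    if hint == 0b0000 and option in HINT0_OPTIONS:
        return HINT0_OPTIONS[option]
    return "Reserved hint, behaves as NOP"
```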





Change processor state 


This section describes the encoding of the Change processor state instruction class. This section is decoded from 
Branches and miscellaneous control on page F3-2471. 


[Encoding diagram: fields imod, M, A, I, F, mode.]

Decode fields
imod  M  Instruction Page
00    1  CPS, CPSID, CPSIE - CPS variant
01    -  Unallocated.
10    -  CPS, CPSID, CPSIE - CPSIE variant
11    -  CPS, CPSID, CPSIE - CPSID variant





Miscellaneous system 


This section describes the encoding of the Miscellaneous system instruction class. This section is decoded from 
Branches and miscellaneous control on page F3-2471. 


[Encoding diagram: fields opc, option.]

Decode fields
opc   Instruction Page
000x  Unallocated.
0010  CLREX
0011  Unallocated.
0100  DSB
0101  DMB
0110  ISB
0111  Unallocated.
1xxx  Unallocated.





Exception return 


This section describes the encoding of the Exception return instruction class. This section is decoded from Branches 
and miscellaneous control on page F3-2471. 


[Encoding diagram: fields Rn, imm8.]

Decode fields
Rn    imm8         Instruction Page
-     != 00000000  SUB, SUBS (immediate)
1110  00000000     ERET





DCPS 


This section describes the encoding of the DCPS instruction class. This section is decoded from Branches and 
miscellaneous control on page F3-2471. 


[Encoding diagram: fields imm4, imm10, opt.]

Decode fields
imm4     imm10          opt  Instruction Page
!= 1111  -              -    Unallocated.
1111     != 0000000000  -    Unallocated.
1111     0000000000     00   Unallocated.
1111     0000000000     01   DCPS1, DCPS2, DCPS3 - DCPS1 variant
1111     0000000000     10   DCPS1, DCPS2, DCPS3 - DCPS2 variant
1111     0000000000     11   DCPS1, DCPS2, DCPS3 - DCPS3 variant


Exception generation 


This section describes the encoding of the Exception generation instruction class. This section is decoded from 
Branches and miscellaneous control on page F3-2471. 


[Encoding diagram: fields o1, imm4, o2, imm12.]

Decode fields
o1  o2  Instruction Page
0   0   HVC
0   1   Unallocated.
1   0   SMC
1   1   UDF





F3.3.6 Data-processing (modified immediate) 


This section describes the encoding of the Data-processing (modified immediate) instruction class. This section is 
decoded from 32-bit T32 instruction encoding on page F3-2447. 


[Encoding diagram: fields i, op1, S, Rn, imm3, Rd, imm8.]

Decode fields
op1   S  Rn       Rd       Instruction Page
0000  0  -        -        AND, ANDS (immediate) - AND variant
0000  1  -        != 1111  AND, ANDS (immediate) - ANDS variant
0000  1  -        1111     TST (immediate)
0001  -  -        -        BIC, BICS (immediate)
0010  0  != 1111  -        ORR, ORRS (immediate) - ORR variant
0010  0  1111     -        MOV, MOVS (immediate) - MOV variant
0010  1  != 1111  -        ORR, ORRS (immediate) - ORRS variant
0010  1  1111     -        MOV, MOVS (immediate) - MOVS variant
0011  0  != 1111  -        ORN, ORNS (immediate) - Not flag setting variant
0011  0  1111     -        MVN, MVNS (immediate) - MVN variant
0011  1  != 1111  -        ORN, ORNS (immediate) - Flag setting variant
0011  1  1111     -        MVN, MVNS (immediate) - MVNS variant
0100  0  -        -        EOR, EORS (immediate) - EOR variant
0100  1  -        != 1111  EOR, EORS (immediate) - EORS variant
0100  1  -        1111     TEQ (immediate)
0101  -  -        -        Unallocated.
011x  -  -        -        Unallocated.
1000  0  != 1101  -        ADD, ADDS (immediate) - ADD variant
1000  0  1101     -        ADD, ADDS (SP plus immediate) - ADD variant
1000  1  != 1101  != 1111  ADD, ADDS (immediate) - ADDS variant
1000  1  1101     != 1111  ADD, ADDS (SP plus immediate) - ADDS variant
1000  1  -        1111     CMN (immediate)
1001  -  -        -        Unallocated.
1010  -  -        -        ADC, ADCS (immediate)
1011  -  -        -        SBC, SBCS (immediate)
1100  -  -        -        Unallocated.
1101  0  != 1101  -        SUB, SUBS (immediate) - SUB variant
1101  0  1101     -        SUB, SUBS (SP minus immediate) - SUB variant
1101  1  != 1101  != 1111  SUB, SUBS (immediate) - SUBS variant
1101  1  1101     != 1111  SUB, SUBS (SP minus immediate) - SUBS variant
1101  1  -        1111     CMP (immediate)
1110  -  -        -        RSB, RSBS (immediate)
1111  -  -        -        Unallocated.
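
Several rows in this class differ only in whether Rn or Rd is 1111, which is how the compare and move forms are carved out of the general data-processing space. The Python sketch below illustrates that aliasing. It assumes that S = 1 selects the flag-setting variants, as the variant names indicate. The function name `decode_dp_modified_imm` is hypothetical, and it returns only a mnemonic, without distinguishing the SP-relative instruction pages.

```python
def decode_dp_modified_imm(op1: int, s: int, rn: int, rd: int) -> str:
    """Return the mnemonic selected by op1 (4 bits), S, Rn and Rd."""
    sfx = "S" if s else ""
    compare = bool(s) and rd == 0b1111   # Rd == 1111 with S == 1 selects a compare
    if op1 == 0b0000:
        return "TST" if compare else "AND" + sfx
    if op1 == 0b0001:
        return "BIC" + sfx
    if op1 == 0b0010:
        return ("MOV" if rn == 0b1111 else "ORR") + sfx
    if op1 == 0b0011:
        return ("MVN" if rn == 0b1111 else "ORN") + sfx
    if op1 == 0b0100:
        return "TEQ" if compare else "EOR" + sfx
    if op1 == 0b1000:
        return "CMN" if compare else "ADD" + sfx
    if op1 == 0b1010:
        return "ADC" + sfx
    if op1 == 0b1011:
        return "SBC" + sfx
    if op1 == 0b1101:
        return "CMP" if compare else "SUB" + sfx
    if op1 == 0b1110:
        return "RSB" + sfx
    return "Unallocated"                 # 0101, 011x, 1001, 1100, 1111
```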


F3.3.7 Data-processing (plain binary immediate) 
This section describes the encoding of the Data-processing (plain binary immediate) group. This section is decoded
from 32-bit T32 instruction encoding on page F3-2447.

[Encoding diagram: fields op0, op1.]

Decode fields
op0  op1  Decode group or instruction page
0    0x   Data-processing (simple immediate)
0    10   Move Wide (16-bit immediate) on page F3-2478
0    11   Unallocated.
1    -    Saturate, Bitfield on page F3-2478





Data-processing (simple immediate) 


This section describes the encoding of the Data-processing (simple immediate) instruction class. This section is 
decoded from Data-processing (plain binary immediate). 


[Encoding diagram: fields i, o1, o2, Rn, imm3, Rd, imm8.]

Decode fields
o1  o2  Rn       Instruction Page
0   0   != 11x1  ADD, ADDS (immediate)
0   0   1101     ADD, ADDS (SP plus immediate)
0   0   1111     ADR - T3 variant
0   1   -        Unallocated.
1   0   -        Unallocated.
1   1   != 11x1  SUB, SUBS (immediate)
1   1   1101     SUB, SUBS (SP minus immediate)
1   1   1111     ADR - T2 variant


Move Wide (16-bit immediate) 


This section describes the encoding of the Move Wide (16-bit immediate) instruction class. This section is decoded 
from Data-processing (plain binary immediate) on page F3-2477. 


[Encoding diagram: fields i, o1, imm4, imm3, Rd, imm8.]

Decode fields
o1  Instruction Page
0   MOV, MOVS (immediate)
1   MOVT





Saturate, Bitfield 


This section describes the encoding of the Saturate, Bitfield instruction class. This section is decoded from 
Data-processing (plain binary immediate) on page F3-2477. 


[Encoding diagram: fields op1, Rn, imm3, Rd, imm2, widthm1.]

Decode fields
op1  Rn       imm3:imm2  Instruction Page
000  -        -          SSAT - Logical shift left variant
001  -        != 00000   SSAT - Arithmetic shift right variant
001  -        00000      SSAT16
010  -        -          SBFX
011  != 1111  -          BFI
011  1111     -          BFC
100  -        -          USAT - Logical shift left variant
101  -        != 00000   USAT - Arithmetic shift right variant
101  -        00000      USAT16
110  -        -          UBFX
111  -        -          Unallocated.
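
Two kinds of special case appear in the table above: an all-zero shift amount turns the arithmetic-shift saturate forms into SSAT16 or USAT16, and Rn == 1111 turns BFI into BFC. The Python sketch below mirrors those rows. The function name `decode_sat_bitfield` and the parameter name `imm5` (standing for imm3:imm2) are hypothetical.

```python
def decode_sat_bitfield(op1: int, rn: int, imm5: int) -> str:
    """Return the instruction selected by op1 (3 bits), Rn and imm3:imm2."""
    if not (0 <= op1 <= 7 and 0 <= rn <= 15 and 0 <= imm5 <= 31):
        raise ValueError("field value out of range")
    if op1 in (0b000, 0b100):
        return ("SSAT" if op1 == 0b000 else "USAT") + " (logical shift left)"
    if op1 in (0b001, 0b101):
        base = "SSAT" if op1 == 0b001 else "USAT"
        # A zero shift amount selects the 16-bit parallel saturate instead.
        if imm5 == 0:
            return base + "16"
        return base + " (arithmetic shift right)"
    if op1 == 0b010:
        return "SBFX"
    if op1 == 0b011:
        return "BFC" if rn == 0b1111 else "BFI"
    if op1 == 0b110:
        return "UBFX"
    return "Unallocated"                 # op1 == 111
```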
F3.3.8 Advanced SIMD element or structure Load/Store
This section describes the encoding of the Advanced SIMD element or structure Load/Store group. This section is
decoded from 32-bit T32 instruction encoding on page F3-2447.

[Encoding diagram: fields op0, op1, op2.]

Decode fields
op0  op1    op2      Decode group or instruction page
0    -      1101     Advanced SIMD Load/Store multiple structures (immediate, post-indexed)
0    -      1111     Advanced SIMD Load/Store multiple structures (no writeback) on page F3-2480
0    -      != 11x1  Advanced SIMD Load/Store multiple structures (register, post-indexed) on page F3-2481
1    11     1101     Advanced SIMD Load single structure to all lanes (immediate, post-indexed) on page F3-2481
1    11     1111     Advanced SIMD load single structure to all lanes (no writeback) on page F3-2482
1    11     != 11x1  Advanced SIMD load single structure to all lanes (register, post-indexed) on page F3-2482
1    != 11  1101     Advanced SIMD Load/Store single structure to one lane (immediate, post-indexed) on page F3-2483
1    != 11  1111     Advanced SIMD Load/Store single structure to one lane (no writeback) on page F3-2484
1    != 11  != 11x1  Advanced SIMD Load/Store single structure to one lane (register, post-indexed) on page F3-2484

Advanced SIMD Load/Store multiple structures (immediate, post-indexed) 
This section describes the encoding of the Advanced SIMD Load/Store multiple structures (immediate, 
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store. 
[Encoding diagram: fields D, L, Rn, Vd, type, size, align.]

Decode fields
L  type              Instruction Page
0  0010, 011x, 1010  VST1 (multiple single elements) - Post-indexed variant
0  0011, 100x        VST2 (multiple 2-element structures) - Post-indexed variant
0  000x              VST4 (multiple 4-element structures) - Post-indexed variant
0  010x              VST3 (multiple 3-element structures) - Post-indexed variant
1  0010, 011x, 1010  VLD1 (multiple single elements) - Post-indexed variant
1  0011, 100x        VLD2 (multiple 2-element structures) - Post-indexed variant
1  000x              VLD4 (multiple 4-element structures) - Post-indexed variant
1  010x              VLD3 (multiple 3-element structures) - Post-indexed variant
-  1011              Unallocated.
-  11xx              Unallocated.





Advanced SIMD Load/Store multiple structures (no writeback) 


This section describes the encoding of the Advanced SIMD Load/Store multiple structures (no writeback) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F3-2479. 


[Encoding diagram: fields D, L, Rn, Vd, type, size, align.]

Decode fields
L  type              Instruction Page
0  0010, 011x, 1010  VST1 (multiple single elements) - Offset variant
0  0011, 100x        VST2 (multiple 2-element structures) - Offset variant
0  000x              VST4 (multiple 4-element structures) - Offset variant
0  010x              VST3 (multiple 3-element structures) - Offset variant
1  0010, 011x, 1010  VLD1 (multiple single elements) - Offset variant
1  0011, 100x        VLD2 (multiple 2-element structures) - Offset variant
1  000x              VLD4 (multiple 4-element structures) - Offset variant
1  010x              VLD3 (multiple 3-element structures) - Offset variant
-  1011              Unallocated.
-  11xx              Unallocated.
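
In the multiple-structures tables, one instruction can be selected by several `type` patterns, some containing `x` wildcards. The Python sketch below shows one way to resolve a 4-bit `type` value against those patterns. The names `TYPE_ROWS` and `decode_vldst_multiple` are hypothetical, and only the mnemonic is returned, not the addressing variant.

```python
# (type pattern, structure size n) pairs taken from the table above.
TYPE_ROWS = [
    ("0010", 1), ("011x", 1), ("1010", 1),
    ("0011", 2), ("100x", 2),
    ("000x", 4),
    ("010x", 3),
]


def decode_vldst_multiple(l: int, type_bits: str) -> str:
    """Return VLDn/VSTn for L and a 4-character type bit string."""
    if l not in (0, 1) or len(type_bits) != 4 or set(type_bits) - {"0", "1"}:
        raise ValueError("L must be 0 or 1 and type must be 4 binary digits")
    for pattern, n in TYPE_ROWS:
        # 'x' in the pattern matches either bit value.
        if all(p in ("x", b) for p, b in zip(pattern, type_bits)):
            return ("VLD" if l else "VST") + str(n)
    return "Unallocated"                 # type 1011 and 11xx
```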


Advanced SIMD Load/Store multiple structures (register, post-indexed) 


This section describes the encoding of the Advanced SIMD Load/Store multiple structures (register, post-indexed) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F3-2479. 


[Encoding diagram: fields D, L, Rn, Vd, type, size, align, Rm.]

Decode fields
L  type              Instruction Page
0  0010, 011x, 1010  VST1 (multiple single elements) - Post-indexed variant
0  0011, 100x        VST2 (multiple 2-element structures) - Post-indexed variant
0  000x              VST4 (multiple 4-element structures) - Post-indexed variant
0  010x              VST3 (multiple 3-element structures) - Post-indexed variant
1  0010, 011x, 1010  VLD1 (multiple single elements) - Post-indexed variant
1  0011, 100x        VLD2 (multiple 2-element structures) - Post-indexed variant
1  000x              VLD4 (multiple 4-element structures) - Post-indexed variant
1  010x              VLD3 (multiple 3-element structures) - Post-indexed variant





Advanced SIMD Load single structure to all lanes (immediate, post-indexed) 


This section describes the encoding of the Advanced SIMD Load single structure to all lanes (immediate,
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on
page F3-2479.

[Encoding diagram: fields D, L, Rn, Vd, N, size, T, a.]

Decode fields
L  N   a  Instruction Page
0  -   -  Unallocated.
1  00  -  VLD1 (single element to all lanes)
1  01  -  VLD2 (single 2-element structure to all lanes)
1  10  0  VLD3 (single 3-element structure to all lanes)
1  10  1  Unallocated.
1  11  -  VLD4 (single 4-element structure to all lanes)





Advanced SIMD load single structure to all lanes (no writeback) 


This section describes the encoding of the Advanced SIMD load single structure to all lanes (no writeback) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F3-2479. 


[Encoding diagram: decode fields L, N, a.]

Decode fields
L  N   a  Instruction Page
0  -   -  Unallocated.
1  00  -  VLD1 (single element to all lanes)
1  01  -  VLD2 (single 2-element structure to all lanes)
1  10  0  VLD3 (single 3-element structure to all lanes)
1  10  1  Unallocated.
1  11  -  VLD4 (single 4-element structure to all lanes)





Advanced SIMD load single structure to all lanes (register, post-indexed) 


This section describes the encoding of the Advanced SIMD load single structure to all lanes (register, post-indexed) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F3-2479. 


[Encoding diagram: fields D, L, Rn, Vd, N, size, T, a, Rm.]

Decode fields
L  N   a  Instruction Page
0  -   -  Unallocated.
1  00  -  VLD1 (single element to all lanes)
1  01  -  VLD2 (single 2-element structure to all lanes)
1  10  0  VLD3 (single 3-element structure to all lanes)
1  10  1  Unallocated.
1  11  -  VLD4 (single 4-element structure to all lanes)





Advanced SIMD Load/Store single structure to one lane (immediate, post-indexed) 


This section describes the encoding of the Advanced SIMD Load/Store single structure to one lane (immediate,
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on
page F3-2479.

[Encoding diagram: fields D, L, Rn, Vd, size, N, index_align.]

Decode fields
L  N   Instruction Page
0  00  VST1 (single element from one lane)
0  01  VST2 (single 2-element structure from one lane)
0  10  VST3 (single 3-element structure from one lane)
0  11  VST4 (single 4-element structure from one lane)
1  00  VLD1 (single element to one lane)
1  01  VLD2 (single 2-element structure to one lane)
1  10  VLD3 (single 3-element structure to one lane)
1  11  VLD4 (single 4-element structure to one lane)


Advanced SIMD Load/Store single structure to one lane (no writeback) 


This section describes the encoding of the Advanced SIMD Load/Store single structure to one lane (no writeback) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F3-2479. 


[Encoding diagram: fields D, L, Rn, Vd, size, N, index_align.]

Decode fields
L  N   Instruction Page
0  00  VST1 (single element from one lane)
0  01  VST2 (single 2-element structure from one lane)
0  10  VST3 (single 3-element structure from one lane)
0  11  VST4 (single 4-element structure from one lane)
1  00  VLD1 (single element to one lane)
1  01  VLD2 (single 2-element structure to one lane)
1  10  VLD3 (single 3-element structure to one lane)
1  11  VLD4 (single 4-element structure to one lane)





Advanced SIMD Load/Store single structure to one lane (register, post-indexed) 


This section describes the encoding of the Advanced SIMD Load/Store single structure to one lane (register,
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on
page F3-2479.

[Encoding diagram: fields D, L, Rn, Vd, size, N, index_align, Rm.]

Decode fields
L  N   Instruction Page
0  00  VST1 (single element from one lane)
0  01  VST2 (single 2-element structure from one lane)
0  10  VST3 (single 3-element structure from one lane)
0  11  VST4 (single 4-element structure from one lane)
1  00  VLD1 (single element to one lane)
1  01  VLD2 (single 2-element structure to one lane)
1  10  VLD3 (single 3-element structure to one lane)
1  11  VLD4 (single 4-element structure to one lane)
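
In the single-lane tables, L selects load or store and N is the structure size minus one, so the mnemonic can be computed rather than looked up. A minimal Python sketch follows. The function name `decode_single_lane` is hypothetical, and the formula is an observation about the table above, not an architectural definition.

```python
def decode_single_lane(l: int, n: int) -> str:
    """Return VLDn/VSTn for the L (1 bit) and N (2 bits) field values."""
    if l not in (0, 1) or not 0 <= n <= 3:
        raise ValueError("L must be 0 or 1 and N must be 0..3")
    # N encodes the number of elements per structure, minus one.
    return ("VLD" if l else "VST") + str(n + 1)
```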


F3.3.9 Load/Store single 


This section describes the encoding of the Load/Store single group. This section is decoded from 32-bit T32 
instruction encoding on page F3-2447. 


[Encoding diagram: fields op0, op1.]

Decode fields
op0      op1     Decode group or instruction page
!= 1111  000000  Load/Store (register offset)
!= 1111  10x0xx  Unallocated.
!= 1111  10x1xx  Load/Store (immediate, post-indexed) on page F3-2486
!= 1111  1100xx  Load/Store (negative immediate) on page F3-2487
!= 1111  1110xx  Load/Store (unprivileged) on page F3-2487
!= 1111  11x1xx  Load/Store (immediate, pre-indexed) on page F3-2488
!= 1111  -       Load/Store (positive immediate) on page F3-2489
1111     -       Load literal on page F3-2489





Load/Store (register offset) 
This section describes the encoding of the Load/Store (register offset) instruction class. This section is decoded from
Load/Store single.

[Encoding diagram: fields S, size, L, Rn, Rt, imm2, and Rm.]

Decode fields
S    size    L    Rt         Instruction Page
0    00      0    -          STRB (register)
0    00      1    != 1111    LDRB (register)
0    00      1    1111       PLD, PLDW (register) - Preload read variant
0    01      0    -          STRH (register)
0    01      1    != 1111    LDRH (register)
0    01      1    1111       PLD, PLDW (register) - Preload write variant
0    10      0    -          STR (register)
0    10      1    -          LDR (register)
0    11      -    -          Unallocated.
1    00      1    != 1111    LDRSB (register)
1    00      1    1111       PLI (register)
1    01      1    != 1111    LDRSH (register)
1    01      1    1111       Reserved hint, behaves as NOP.
1    1x      1    -          Unallocated.
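
As an illustration only, the decode table for this class can be written as a small lookup. The following Python sketch takes field values that have already been extracted, not a raw instruction word. The function name `decode_ldst_register_offset` is hypothetical and is not defined by the architecture.

```python
def decode_ldst_register_offset(s, size, l, rt):
    """Map the S, size, L and Rt decode fields to an instruction page name.

    Illustrative sketch of the Load/Store (register offset) table.
    Field values are plain integers. Combinations that the table does
    not list return None.
    """
    rt_is_1111 = (rt == 0b1111)
    if s == 0:
        if size == 0b11:
            return "Unallocated"
        if l == 0:
            return ["STRB (register)", "STRH (register)", "STR (register)"][size]
        if size == 0b10:
            return "LDR (register)"  # Rt does not affect this row
        if rt_is_1111:
            variant = "read" if size == 0b00 else "write"
            return f"PLD, PLDW (register) - Preload {variant} variant"
        return ["LDRB (register)", "LDRH (register)"][size]
    # S == 1: the table only lists rows with L == 1
    if l != 1:
        return None
    if size >= 0b10:
        return "Unallocated"
    if rt_is_1111:
        return "PLI (register)" if size == 0b00 else "Reserved hint, behaves as NOP"
    return ["LDRSB (register)", "LDRSH (register)"][size]
```

The Rt == 1111 rows show how one field value turns a load encoding into a hint: the same S, size, and L values select a preload instruction instead of a load.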





Load/Store (immediate, post-indexed) 


This section describes the encoding of the Load/Store (immediate, post-indexed) instruction class. This section is
decoded from Load/Store single on page F3-2485.

[Encoding diagram: fields S, size, L, Rn, Rt, U, and imm8.]

Decode fields
S    size    L    Instruction Page
0    00      0    STRB (immediate)
0    00      1    LDRB (immediate)
0    01      0    STRH (immediate)
0    01      1    LDRH (immediate)
0    10      0    STR (immediate)
0    10      1    LDR (immediate)
0    11      -    Unallocated.
1    00      1    LDRSB (immediate)
1    01      1    LDRSH (immediate)
1    1x      1    Unallocated.


Load/Store (negative immediate) 
This section describes the encoding of the Load/Store (negative immediate) instruction class. This section is
decoded from Load/Store single on page F3-2485.

[Encoding diagram: fields S, size, L, Rn, Rt, and imm8.]

Decode fields
S    size    L    Rt         Instruction Page
0    00      0    -          STRB (immediate)
0    00      1    != 1111    LDRB (immediate)
0    00      1    1111       PLD, PLDW (immediate) - Preload read variant
0    01      0    -          STRH (immediate)
0    01      1    != 1111    LDRH (immediate)
0    01      1    1111       PLD, PLDW (immediate) - Preload write variant
0    10      0    -          STR (immediate)
0    10      1    -          LDR (immediate)
0    11      -    -          Unallocated.
1    00      1    -          LDRSB (immediate)
1    00      1    1111       PLI (immediate, literal)
1    01      1    != 1111    LDRSH (immediate)
1    01      1    1111       Reserved hint, behaves as NOP.
1    1x      1    -          Unallocated.





Load/Store (unprivileged) 
This section describes the encoding of the Load/Store (unprivileged) instruction class. This section is decoded from
Load/Store single on page F3-2485.

[Encoding diagram: fields S, size, L, Rn, Rt, and imm8.]

Decode fields
S    size    L    Instruction Page
0    00      0    STRBT
0    00      1    LDRBT
0    01      0    STRHT
0    01      1    LDRHT
0    10      0    STRT
0    10      1    LDRT
0    11      -    Unallocated.
1    00      1    LDRSBT
1    01      1    LDRSHT
1    1x      1    Unallocated.





Load/Store (immediate, pre-indexed) 
This section describes the encoding of the Load/Store (immediate, pre-indexed) instruction class. This section is
decoded from Load/Store single on page F3-2485.

[Encoding diagram: fields S, size, L, Rn, Rt, U, and imm8.]

Decode fields
S    size    L    Instruction Page
0    00      0    STRB (immediate)
0    00      1    LDRB (immediate)
0    01      0    STRH (immediate)
0    01      1    LDRH (immediate)
0    10      0    STR (immediate)
0    10      1    LDR (immediate)
0    11      -    Unallocated.
1    00      1    LDRSB (immediate)
1    01      1    LDRSH (immediate)
1    1x      1    Unallocated.

Load/Store (positive immediate)

This section describes the encoding of the Load/Store (positive immediate) instruction class. This section is decoded
from Load/Store single on page F3-2485.

[Encoding diagram: decode fields S, size, L, and Rt.]

Decode fields
S    size    L    Rt         Instruction Page
0    00      0    -          STRB (immediate)
0    00      1    != 1111    LDRB (immediate)
0    00      1    1111       PLD, PLDW (immediate) - Preload read variant
0    01      0    -          STRH (immediate)
0    01      1    != 1111    LDRH (immediate)
0    01      1    1111       PLD, PLDW (immediate) - Preload write variant
0    10      0    -          STR (immediate)
0    10      1    -          LDR (immediate)
1    00      1    != 1111    LDRSB (immediate)
1    00      1    1111       PLI (immediate, literal)
1    01      1    != 1111    LDRSH (immediate)
1    01      1    1111       Reserved hint, behaves as NOP.





Load literal 


This section describes the encoding of the Load literal instruction class. This section is decoded from Load/Store
single on page F3-2485.

[Encoding diagram: decode fields S, size, L, and Rt.]

Decode fields
S    size    L    Rt         Instruction Page
0    0x      1    1111       PLD (literal)
0    00      1    != 1111    LDRB (literal)
0    01      1    != 1111    LDRH (literal)
0    10      1    -          LDR (literal)
0    11      -    -          Unallocated.
1    00      1    != 1111    LDRSB (literal)
1    00      1    1111       PLI (immediate, literal)
1    01      1    != 1111    LDRSH (literal)
1    01      1    1111       Reserved hint, behaves as NOP.
1    1x      1    -          Unallocated.





F3.3.10 Data-processing (register) 
This section describes the encoding of the Data-processing (register) group. This section is decoded from 32-bit T32
instruction encoding on page F3-2447.

[Encoding diagram: decode fields op0 and op1.]

Decode fields
op0    op1     Decode group or instruction page
0      0000    MOV, MOVS (register-shifted register) - Flag setting variant
0      0001    Unallocated.
0      001x    Unallocated.
0      01xx    Unallocated.
0      1xxx    Register extends on page F3-2491
1      0xxx    Parallel add-subtract on page F3-2491
1      10xx    Data-processing (two source registers) on page F3-2493
1      11xx    Unallocated.


Register extends 
This section describes the encoding of the Register extends instruction class. This section is decoded from
Data-processing (register) on page F3-2490.

[Encoding diagram: fields op1, U, Rn, Rd, rotate, and Rm.]

Decode fields
op1    U    Rn         Instruction Page
00     0    != 1111    SXTAH
00     0    1111       SXTH
00     1    != 1111    UXTAH
00     1    1111       UXTH
01     0    != 1111    SXTAB16
01     0    1111       SXTB16
01     1    != 1111    UXTAB16
01     1    1111       UXTB16
10     0    != 1111    SXTAB
10     0    1111       SXTB
10     1    != 1111    UXTAB
10     1    1111       UXTB
11     -    -          Unallocated.
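
The Register extends table is regular enough that each mnemonic can be assembled from its three decode fields. The Python sketch below shows one way to do that, as an illustration only. The function name `register_extend_mnemonic` is hypothetical, and the inputs are field values that have already been extracted.

```python
def register_extend_mnemonic(op1, u, rn):
    """Build the Register extends mnemonic from the op1, U and Rn fields.

    op1 selects the element width, U selects signed or unsigned, and
    Rn == 1111 selects the form without an accumulate operand.
    """
    widths = {0b00: "H", 0b01: "B16", 0b10: "B"}
    if op1 not in widths:
        return "Unallocated"  # op1 == 11
    sign = "U" if u else "S"
    accumulate = "A" if rn != 0b1111 else ""
    return f"{sign}XT{accumulate}{widths[op1]}"
```

For example, `register_extend_mnemonic(0b01, 1, 0b1111)` gives UXTB16, and changing Rn to any other register gives UXTAB16.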





Parallel add-subtract 
This section describes the encoding of the Parallel add-subtract instruction class. This section is decoded from
Data-processing (register) on page F3-2490.
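
The decode table for this class follows a regular pattern: op1 selects the operation, and U, H, and S together select the mnemonic prefix. The Python sketch below illustrates that pattern under two assumptions: the function and table names are hypothetical, and any op1 value that the decode table does not list is reported as Unallocated.

```python
# op1 selects the operation (values taken from the decode table for this class).
OPERATIONS = {
    0b000: "ADD8", 0b001: "ADD16", 0b010: "ASX",
    0b100: "SUB8", 0b101: "SUB16", 0b110: "SAX",
}

# (U, H, S) selects the prefix. H and S both set is Unallocated.
PREFIXES = {
    (0, 0, 0): "S", (0, 0, 1): "Q", (0, 1, 0): "SH",
    (1, 0, 0): "U", (1, 0, 1): "UQ", (1, 1, 0): "UH",
}


def parallel_add_sub_mnemonic(op1, u, h, s):
    """Return the mnemonic for a Parallel add-subtract encoding, or 'Unallocated'."""
    operation = OPERATIONS.get(op1)
    prefix = PREFIXES.get((u, h, s))
    if operation is None or prefix is None:
        # Assumption of this sketch: op1 values missing from the table
        # are treated the same way as the explicit Unallocated rows.
        return "Unallocated"
    return prefix + operation
```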


[Encoding diagram: fields op1, Rn, Rd, U, H, S, and Rm.]

Decode fields
op1    U    H    S    Instruction Page
000    0    0    0    SADD8
000    0    0    1    QADD8
000    0    1    0    SHADD8
000    0    1    1    Unallocated.
000    1    0    0    UADD8
000    1    0    1    UQADD8
000    1    1    0    UHADD8
000    1    1    1    Unallocated.
001    0    0    0    SADD16
001    0    0    1    QADD16
001    0    1    0    SHADD16
001    0    1    1    Unallocated.
001    1    0    0    UADD16
001    1    0    1    UQADD16
001    1    1    0    UHADD16
001    1    1    1    Unallocated.
010    0    0    0    SASX
010    0    0    1    QASX
010    0    1    0    SHASX
010    0    1    1    Unallocated.
010    1    0    0    UASX
010    1    0    1    UQASX
010    1    1    0    UHASX
010    1    1    1    Unallocated.
100    0    0    0    SSUB8
100    0    0    1    QSUB8
100    0    1    0    SHSUB8
100    0    1    1    Unallocated.
100    1    0    0    USUB8
100    1    0    1    UQSUB8
100    1    1    0    UHSUB8
100    1    1    1    Unallocated.
101    0    0    0    SSUB16
101    0    0    1    QSUB16
101    0    1    0    SHSUB16
101    0    1    1    Unallocated.
101    1    0    0    USUB16
101    1    0    1    UQSUB16
101    1    1    0    UHSUB16
101    1    1    1    Unallocated.
110    0    0    0    SSAX
110    0    0    1    QSAX
110    0    1    0    SHSAX
110    0    1    1    Unallocated.
110    1    0    0    USAX
110    1    0    1    UQSAX
110    1    1    0    UHSAX
110    1    1    1    Unallocated.
111    -    -    -    Unallocated.

F3-2492 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775
Non-Confidential ID092916

Data-processing (two source registers)

This section describes the encoding of the Data-processing (two source registers) instruction class. This section is
decoded from Data-processing (register) on page F3-2490.


[Encoding diagram: fields op1, Rn, Rd, op2, and Rm.]

Decode fields
op1    op2    Instruction Page
000    00     QADD
000    01     QDADD
000    10     QSUB
000    11     QDSUB
001    00     REV
001    01     REV16
001    10     RBIT
001    11     REVSH
010    00     SEL
010    01     Unallocated.
010    1x     Unallocated.
011    00     CLZ
011    01     Unallocated.
011    1x     Unallocated.
100    00     CRC32 - CRC32B variant
100    01     CRC32 - CRC32H variant
100    10     CRC32 - CRC32W variant
100    11     UNPREDICTABLE
101    00     CRC32C - CRC32CB variant
101    01     CRC32C - CRC32CH variant
101    10     CRC32C - CRC32CW variant
101    11     UNPREDICTABLE
11x    -      Unallocated.

F3.3.11 Multiply, multiply accumulate, and absolute difference 


This section describes the encoding of the Multiply, multiply accumulate, and absolute difference group. This 
section is decoded from 32-bit T32 instruction encoding on page F3-2447. 


[Encoding diagram: decode field op0.]

Decode fields
op0    Decode group or instruction page
00     Multiply and absolute difference on page F3-2495
01     Unallocated.
1x     Unallocated.

Multiply and absolute difference

This section describes the encoding of the Multiply and absolute difference instruction class. This section is decoded
from Multiply, multiply accumulate, and absolute difference on page F3-2494.

[Encoding diagram: fields op1, Rn, Ra, Rd, op2, and Rm.]

Decode fields
op1    op2    Ra         Instruction Page
000    00     != 1111    MLA, MLAS
000    00     1111       MUL, MULS
000    01     -          MLS
000    1x     -          Unallocated.
001    00     != 1111    SMLABB, SMLABT, SMLATB, SMLATT - SMLABB variant
001    00     1111       SMULBB, SMULBT, SMULTB, SMULTT - SMULBB variant
001    01     != 1111    SMLABB, SMLABT, SMLATB, SMLATT - SMLABT variant
001    01     1111       SMULBB, SMULBT, SMULTB, SMULTT - SMULBT variant
001    10     != 1111    SMLABB, SMLABT, SMLATB, SMLATT - SMLATB variant
001    10     1111       SMULBB, SMULBT, SMULTB, SMULTT - SMULTB variant
001    11     != 1111    SMLABB, SMLABT, SMLATB, SMLATT - SMLATT variant
001    11     1111       SMULBB, SMULBT, SMULTB, SMULTT - SMULTT variant
010    00     != 1111    SMLAD, SMLADX - SMLAD variant
010    00     1111       SMUAD, SMUADX - SMUAD variant
010    01     != 1111    SMLAD, SMLADX - SMLADX variant
010    01     1111       SMUAD, SMUADX - SMUADX variant
010    1x     -          Unallocated.
011    00     != 1111    SMLAWB, SMLAWT - SMLAWB variant
011    00     1111       SMULWB, SMULWT - SMULWB variant
011    01     != 1111    SMLAWB, SMLAWT - SMLAWT variant
011    01     1111       SMULWB, SMULWT - SMULWT variant
011    1x     -          Unallocated.
100    00     != 1111    SMLSD, SMLSDX - SMLSD variant
100    00     1111       SMUSD, SMUSDX - SMUSD variant
100    01     != 1111    SMLSD, SMLSDX - SMLSDX variant
100    01     1111       SMUSD, SMUSDX - SMUSDX variant
100    1x     -          Unallocated.
101    00     != 1111    SMMLA, SMMLAR - SMMLA variant
101    00     1111       SMMUL, SMMULR - SMMUL variant
101    01     != 1111    SMMLA, SMMLAR - SMMLAR variant
101    01     1111       SMMUL, SMMULR - SMMULR variant
101    1x     -          Unallocated.
110    00     -          SMMLS, SMMLSR - SMMLS variant
110    01     -          SMMLS, SMMLSR - SMMLSR variant
110    1x     -          Unallocated.
111    00     != 1111    USADA8
111    00     1111       USAD8
111    01     -          Unallocated.
111    1x     -          Unallocated.
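
Many rows in this table come in pairs, where Ra == 1111 selects the form without an accumulate operand. For the op1 == 001 rows, the two bits of op2 additionally select the bottom (B) or top (T) halfword of each source operand. The Python sketch below illustrates just those rows. The function name `halfword_multiply_mnemonic` is hypothetical.

```python
def halfword_multiply_mnemonic(op2, ra):
    """Mnemonic for the op1 == 001 rows of the table.

    The upper bit of op2 picks B or T for the first operand, the lower
    bit picks B or T for the second, and Ra == 1111 selects the
    multiply-only form (SMUL) over the multiply accumulate form (SMLA).
    """
    first = "T" if op2 & 0b10 else "B"
    second = "T" if op2 & 0b01 else "B"
    base = "SMUL" if ra == 0b1111 else "SMLA"
    return base + first + second
```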















































F3.3.12 Long multiply and divide 
This section describes the encoding of the Long multiply and divide instruction class. This section is decoded from 
32-bit T32 instruction encoding on page F3-2447. 

[Encoding diagram: decode fields op1 and op2.]

Decode fields
op1    op2        Instruction Page
000    != 0000    Unallocated.
000    0000       SMULL, SMULLS
001    != 1111    Unallocated.
001    1111       SDIV
010    != 0000    Unallocated.
010    0000       UMULL, UMULLS
011    != 1111    Unallocated.
011    1111       UDIV
100    0000       SMLAL, SMLALS
100    0001       Unallocated.
100    001x       Unallocated.
100    01xx       Unallocated.
100    1000       SMLALBB, SMLALBT, SMLALTB, SMLALTT - SMLALBB variant
100    1001       SMLALBB, SMLALBT, SMLALTB, SMLALTT - SMLALBT variant
100    1010       SMLALBB, SMLALBT, SMLALTB, SMLALTT - SMLALTB variant
100    1011       SMLALBB, SMLALBT, SMLALTB, SMLALTT - SMLALTT variant
100    1100       SMLALD, SMLALDX - SMLALD variant
100    1101       SMLALD, SMLALDX - SMLALDX variant
100    111x       Unallocated.
101    0xxx       Unallocated.
101    10xx       Unallocated.
101    1100       SMLSLD, SMLSLDX - SMLSLD variant
101    1101       SMLSLD, SMLSLDX - SMLSLDX variant
101    111x       Unallocated.
110    0000       UMLAL, UMLALS
110    0001       Unallocated.
110    001x       Unallocated.
110    010x       Unallocated.
110    0110       UMAAL
110    0111       Unallocated.
110    1xxx       Unallocated.
111    -          Unallocated.
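
The decode tables in this chapter use three kinds of pattern: a literal bit string that can contain x for a bit that is not decoded, a leading != to exclude a value, and - for a field that does not affect the decoding. As a sketch only, the Python below shows one way to evaluate such patterns, using a few rows of the Long multiply and divide table. The names `field_matches`, `ROWS`, and `decode_long_multiply` are hypothetical.

```python
def field_matches(pattern, value, width):
    """Return True if an integer field value satisfies a decode-table pattern.

    "-" matches any value, "x" matches either bit value, and a leading
    "!=" inverts the match of the pattern that follows it.
    """
    if pattern == "-":
        return True
    if pattern.startswith("!="):
        return not field_matches(pattern[2:].strip(), value, width)
    bits = format(value, f"0{width}b")
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# An excerpt of the table: (op1 pattern, op2 pattern, instruction page).
ROWS = [
    ("000", "!= 0000", "Unallocated"),
    ("000", "0000", "SMULL, SMULLS"),
    ("001", "1111", "SDIV"),
    ("010", "0000", "UMULL, UMULLS"),
    ("011", "1111", "UDIV"),
    ("100", "0000", "SMLAL, SMLALS"),
    ("110", "0110", "UMAAL"),
    ("110", "1xxx", "Unallocated"),
    ("111", "-", "Unallocated"),
]


def decode_long_multiply(op1, op2):
    """Return the page for the first matching row, or None if no row in the excerpt matches."""
    for op1_pattern, op2_pattern, page in ROWS:
        if field_matches(op1_pattern, op1, 3) and field_matches(op2_pattern, op2, 4):
            return page
    return None
```

Because the rows of a decode table do not overlap, the order in which the rows are tested does not change the result.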
Chapter F4


The A32 Instruction Set Encoding 


This chapter describes the encoding of the A32 instruction set. It contains the following sections: 


Top level A32 instruction set encoding on page F4-2500. 

Data-processing and miscellaneous instructions on page F4-2502. 

Load/Store Word, Unsigned byte (immediate, literal) on page F4-2519. 

Load/Store Word, Unsigned byte (register) on page F4-2520. 

Media instructions on page F4-2521. 

Branch, branch with link, and block data transfer on page F4-2529. 

System register access, Advanced SIMD, floating-point, and Supervisor Call on page F4-2531. 


Unconditional instructions on page F4-2540. 


In this chapter: 


In the decode tables, an entry of - for a field value means the value of the field does not affect the decoding. 


In the decode diagrams, a shaded field indicates that the bits in that field are not used in that level of decode. 
F4.1 Top level A32 instruction set encoding


The A32 instruction stream is a sequence of word-aligned words. Each A32 instruction is a single 32-bit word in 
that stream. The encoding of an A32 instruction is: 


[Encoding diagram: cond is bits[31:28], op0 is bits[27:25], and op1 is bit[4].]


Most A32 instructions can be conditional, with a condition determined by bits[31:28] of the instruction, the cond 
field. For more information, see The condition code field in A32 instruction encodings on page F2-2408. This 
applies to all instructions except those with the cond field equal to 0b1111. 


Table F4-1 shows the top-level encoding of the A32 instruction set, including the cases where bits[31:28] are
defined as 0b1111 rather than as the cond.


The behavior of an attempt to execute an unallocated instruction is described in UNDEFINED, UNPREDICTABLE, 
and CONSTRAINED UNPREDICTABLE instruction set space on page F2-2414. 


For more information on A32 instruction encodings see Chapter F2 About the T32 and A32 Instruction 
Descriptions. 


Table F4-1 Main encoding table for the A32 instruction set 





Decode fields
cond       op0    op1    Decode group or instruction page
!= 1111    00x    -      Data-processing and miscellaneous instructions on page F4-2502
!= 1111    010    -      Load/Store Word, Unsigned byte (immediate, literal) on page F4-2519
!= 1111    011    0      Load/Store Word, Unsigned byte (register) on page F4-2520
!= 1111    011    1      Media instructions on page F4-2521
-          10x    -      Branch, branch with link, and block data transfer on page F4-2529
-          11x    -      System register access, Advanced SIMD, floating-point, and Supervisor
                         Call on page F4-2531
1111       0xx    -      Unconditional instructions on page F4-2540
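
As an illustration of how Table F4-1 is applied, the Python sketch below classifies a 32-bit A32 instruction word into one of the top-level groups. It is a sketch under stated assumptions: cond is taken from bits[31:28] as described above, op0 is assumed to be bits[27:25] and op1 is assumed to be bit[4], and the rows for the first four groups are read as requiring cond != 1111. The function name `decode_a32_top_level` is hypothetical.

```python
def decode_a32_top_level(instr):
    """Return the Table F4-1 decode group name for a 32-bit A32 instruction word."""
    cond = (instr >> 28) & 0xF  # bits[31:28]
    op0 = (instr >> 25) & 0x7   # bits[27:25] (assumed field position)
    op1 = (instr >> 4) & 0x1    # bit[4] (assumed field position)

    # The 10x and 11x rows apply for any value of cond.
    if (op0 & 0b110) == 0b100:
        return "Branch, branch with link, and block data transfer"
    if (op0 & 0b110) == 0b110:
        return "System register access, Advanced SIMD, floating-point, and Supervisor Call"

    # op0 is now 0xx. cond == 1111 selects the unconditional space.
    if cond == 0b1111:
        return "Unconditional instructions"
    if (op0 & 0b110) == 0b000:
        return "Data-processing and miscellaneous instructions"
    if op0 == 0b010:
        return "Load/Store Word, Unsigned byte (immediate, literal)"
    # op0 == 0b011: op1 separates register Load/Store from Media instructions.
    if op1 == 0:
        return "Load/Store Word, Unsigned byte (register)"
    return "Media instructions"
```

For example, the word 0xEAFFFFFE has op0 equal to 0b101 and so falls in the branch group, whatever its cond value is.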





F4.1.1 About the A32 Advanced SIMD and floating-point instructions and their encodings


The Advanced SIMD and floating-point instructions are common to the A32 and T32 instruction sets. These 
instructions perform Advanced SIMD and floating-point operations on a common register file, the SIMD&FP 
register file. This means: 


° In general, the instructions that load or store registers in this file, or move data between general-purpose 
registers and this register file, are common to the Advanced SIMD and floating-point instructions. 


° There are distinct Advanced SIMD data-processing instructions and floating-point data-processing 
instructions. 


Different groups of the A32 Advanced SIMD and floating-point instructions are decoded from different points in 
the A32 instruction decode structure. Table F4-2 on page F4-2501 shows these instruction groups, and where each 
group is decoded from the overall A32 decode structure: 

Table F4-2 Advanced SIMD and floating-point instructions in the A32 decode structure

Advanced SIMD and floating-point instruction group          A32 decode is from

Advanced SIMD data-processing on page F4-2541               Unconditional instructions on page F4-2540

Floating-point data-processing on page F4-2533              System register access, Advanced SIMD, floating-point, and
                                                            Supervisor Call on page F4-2531

Floating-point and Advanced SIMD 32-bit register moves on   System register access, Advanced SIMD, floating-point, and
page F4-2536                                                Supervisor Call on page F4-2531

Floating-point and Advanced SIMD Load/Store and 64-bit      System register access, Advanced SIMD, floating-point, and
register moves on page F4-2531                              Supervisor Call on page F4-2531

Advanced SIMD element or structure Load/Store on            Unconditional instructions on page F4-2540
page F4-2553


F4.2 Data-processing and miscellaneous instructions 


This section describes the encoding of the Data-processing and miscellaneous instructions group. This section is 
decoded from Top level A32 instruction set encoding on page F4-2500. 


[Encoding diagram: decode fields op0, op1, op2, op3, and op4.]

Decode fields
op0    op1         op2    op3      op4    Decode group or instruction page
0      -           1      != 00    1      Extra Load/Store
0      0xxxx       1      00       1      Multiply and Accumulate on page F4-2505
0      1xxxx       1      00       1      Synchronization primitives and Load-Acquire/Store-Release on page F4-2506
0      10xx0       0      -        -      Miscellaneous on page F4-2507
0      10xx0       1      -        0      Halfword Multiply and Accumulate on page F4-2510
0      != 10xx0    -      -        0      Data-processing register (immediate shift) on page F4-2511
0      != 10xx0    0      -        1      Data-processing register (register shift) on page F4-2513
1      -           -      -        -      Data-processing immediate on page F4-2515

F4.2.1 Extra Load/Store 


This section describes the encoding of the Extra Load/Store group. This section is decoded from Data-processing 
and miscellaneous instructions. 


[Encoding diagram: decode field op0.]

Decode fields
op0    Decode group or instruction page
0      Load/Store Dual, Half, Signed byte (register) on page F4-2503
1      Load/Store Dual, Half, Signed byte (immediate, literal) on page F4-2504


Load/Store Dual, Half, Signed byte (register)

This section describes the encoding of the Load/Store Dual, Half, Signed byte (register) instruction class. This
section is decoded from Extra Load/Store on page F4-2502.

[Encoding diagram: fields cond, P, U, W, o1, Rn, Rt, op2, and Rm.]

Decode fields
o1    op2    P    Instruction Page
0     01     0    STRH (register) - Post-indexed variant
0     01     0    STRHT
0     01     1    STRH (register) - Pre-indexed variant
0     10     0    LDRD (register) - Post-indexed variant
0     10     0    Unallocated.
0     10     1    LDRD (register) - Pre-indexed variant
0     11     0    STRD (register) - Post-indexed variant
0     11     0    Unallocated.
0     11     1    STRD (register) - Pre-indexed variant
1     01     0    LDRH (register) - Post-indexed variant
1     01     0    LDRHT
1     01     1    LDRH (register) - Pre-indexed variant
1     10     0    LDRSB (register) - Post-indexed variant
1     10     0    LDRSBT
1     10     1    LDRSB (register) - Pre-indexed variant
1     11     0    LDRSH (register) - Post-indexed variant
1     11     0    LDRSHT
1     11     1    LDRSH (register) - Pre-indexed variant


Load/Store Dual, Half, Signed byte (immediate, literal) 


This section describes the encoding of the Load/Store Dual, Half, Signed byte (immediate, literal) instruction class. 
This section is decoded from Extra Load/Store on page F4-2502. 


[Encoding diagram: fields cond, P, U, W, o1, Rn, Rt, imm4H, op2, and imm4L.]

Decode fields
o1    op2    P:W      Rn         Instruction Page
0     01     00       -          STRH (immediate) - Post-indexed variant
0     01     01       -          STRHT
0     01     10       -          STRH (immediate) - Offset variant
0     01     11       -          STRH (immediate) - Pre-indexed variant
0     10     -        1111       LDRD (literal)
0     10     00       != 1111    LDRD (immediate) - Post-indexed variant
0     10     01       != 1111    Unallocated.
0     10     10       != 1111    LDRD (immediate) - Offset variant
0     10     11       != 1111    LDRD (immediate) - Pre-indexed variant
0     11     00       -          STRD (immediate) - Post-indexed variant
0     11     01       -          Unallocated.
0     11     10       -          STRD (immediate) - Offset variant
0     11     11       -          STRD (immediate) - Pre-indexed variant
1     01     != 01    1111       LDRH (literal)
1     01     00       != 1111    LDRH (immediate) - Post-indexed variant
1     01     01       -          LDRHT
1     01     10       != 1111    LDRH (immediate) - Offset variant
1     01     11       != 1111    LDRH (immediate) - Pre-indexed variant
1     10     != 01    1111       LDRSB (literal)
1     10     00       != 1111    LDRSB (immediate) - Post-indexed variant
1     10     01       -          LDRSBT
1     10     10       != 1111    LDRSB (immediate) - Offset variant
1     10     11       != 1111    LDRSB (immediate) - Pre-indexed variant
1     11     != 01    1111       LDRSH (literal)
1     11     00       != 1111    LDRSH (immediate) - Post-indexed variant
1     11     01       -          LDRSHT
1     11     10       != 1111    LDRSH (immediate) - Offset variant
1     11     11       != 1111    LDRSH (immediate) - Pre-indexed variant
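
For the load rows of this table (o1 == 1), the P:W field selects the addressing variant and Rn == 1111 selects the literal form. The Python sketch below illustrates those rows only. The names `LOADS`, `VARIANTS`, and `decode_extra_load` are hypothetical, and the inputs are field values that have already been extracted.

```python
LOADS = {0b01: "LDRH", 0b10: "LDRSB", 0b11: "LDRSH"}
VARIANTS = {0b00: "Post-indexed", 0b10: "Offset", 0b11: "Pre-indexed"}


def decode_extra_load(op2, pw, rn):
    """Instruction page for the o1 == 1 rows, from the op2, P:W and Rn fields."""
    name = LOADS[op2]
    if pw == 0b01:
        return name + "T"  # unprivileged form, Rn is not decoded
    if rn == 0b1111:
        return f"{name} (literal)"
    return f"{name} (immediate) - {VARIANTS[pw]} variant"
```

The P:W == 01 check comes first because that row has - in the Rn column, so it applies even when Rn is 1111.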





F4.2.2 Multiply and Accumulate 


This section describes the encoding of the Multiply and Accumulate instruction class. This section is decoded from 
Data-processing and miscellaneous instructions on page F4-2502. 


[Encoding diagram: fields include cond; decode fields opc and S.]

Decode fields
opc    S    Instruction Page
000    -    MUL, MULS
001    -    MLA, MLAS
010    0    UMAAL
010    1    Unallocated.
011    0    MLS
011    1    Unallocated.
100    -    UMULL, UMULLS
101    -    UMLAL, UMLALS
110    -    SMULL, SMULLS
111    -    SMLAL, SMLALS


F4.2.3 Synchronization primitives and Load-Acquire/Store-Release 
This section describes the encoding of the Synchronization primitives and Load-Acquire/Store-Release group. This
section is decoded from Data-processing and miscellaneous instructions on page F4-2502.

[Encoding diagram: decode field op0.]

Decode fields
op0    Decode group or instruction page
0      Unallocated.
1      Load/Store Exclusive and Load-Acquire/Store-Release





Load/Store Exclusive and Load-Acquire/Store-Release 


This section describes the encoding of the Load/Store Exclusive and Load-Acquire/Store-Release instruction class. 
This section is decoded from Synchronization primitives and Load-Acquire/Store-Release. 


[Encoding diagram: fields include cond; decode fields type, L, ex, and ord.]

Decode fields
type    L    ex    ord    Instruction Page
00      0    0     0      STL
00      0    0     1      Unallocated.
00      0    1     0      STLEX
00      0    1     1      STREX
00      1    0     0      LDA
00      1    0     1      Unallocated.
00      1    1     0      LDAEX
00      1    1     1      LDREX
01      0    0     -      Unallocated.
01      0    1     0      STLEXD
01      0    1     1      STREXD
01      1    0     -      Unallocated.
01      1    1     0      LDAEXD
01      1    1     1      LDREXD
10      0    0     0      STLB
10      0    0     1      Unallocated.
10      0    1     0      STLEXB
10      0    1     1      STREXB
10      1    0     0      LDAB
10      1    0     1      Unallocated.
10      1    1     0      LDAEXB
10      1    1     1      LDREXB
11      0    0     0      STLH
11      0    0     1      Unallocated.
11      0    1     0      STLEXH
11      0    1     1      STREXH
11      1    0     0      LDAH
11      1    0     1      Unallocated.
11      1    1     0      LDAEXH
11      1    1     1      LDREXH
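
The mnemonics in this class are built from the four decode fields in a regular way: type selects the size suffix, L selects load or store, ex selects the exclusive form, and ord selects between the acquire or release form and the plain exclusive form. The Python sketch below is one illustrative way to express this. The names `SIZE_SUFFIX` and `exclusive_mnemonic` are hypothetical.

```python
# type selects the access size: word (no suffix), doubleword, byte, halfword.
SIZE_SUFFIX = {0b00: "", 0b01: "D", 0b10: "B", 0b11: "H"}


def exclusive_mnemonic(type_, l, ex, ord_):
    """Return the mnemonic for the given type, L, ex and ord field values."""
    suffix = SIZE_SUFFIX[type_]
    if ex == 0:
        # Non-exclusive accesses exist only in the acquire/release form
        # (ord == 0), and not for doublewords.
        if ord_ == 1 or type_ == 0b01:
            return "Unallocated"
        return ("LDA" if l else "STL") + suffix
    if ord_ == 0:
        return ("LDAEX" if l else "STLEX") + suffix
    return ("LDREX" if l else "STREX") + suffix
```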





F4.2.4 Miscellaneous 


This section describes the encoding of the Miscellaneous group. This section is decoded from Data-processing and 
miscellaneous instructions on page F4-2502. 


[Encoding diagram: decode fields op0 and op1.]

Decode fields
op0    op1    Decode group or instruction page
00     001    Unallocated.
00     010    Unallocated.
00     011    Unallocated.
00     110    Unallocated.
01     001    BX
01     010    BXJ
01     011    BLX (register)
01     110    Unallocated.
10     001    Unallocated.
10     010    Unallocated.
10     011    Unallocated.
10     110    Unallocated.
11     001    CLZ
11     010    Unallocated.
11     011    Unallocated.
11     110    ERET
-      111    Exception Generation
-      000    Move special register (register) on page F4-2509
-      100    Cyclic Redundancy Check on page F4-2509
-      101    Integer Saturating Arithmetic on page F4-2510





Exception Generation 


This section describes the encoding of the Exception Generation instruction class. This section is decoded from 
Miscellaneous on page F4-2507. 


[Encoding diagram: fields cond, opc, imm12, and imm4.]

Decode fields
opc    Instruction Page
00     HLT
01     BKPT
10     HVC
11     SMC


Move special register (register) 


This section describes the encoding of the Move special register (register) instruction class. This section is decoded 
from Miscellaneous on page F4-2507. 


[Encoding diagram: fields cond, opc, mask, Rd, B, m, and Rn.]

Decode fields
opc    B    Instruction Page
x0     0    MRS
x0     1    MRS (Banked register)
x1     0    MSR (register)
x1     1    MSR (Banked register)





Cyclic Redundancy Check 


This section describes the encoding of the Cyclic Redundancy Check instruction class. This section is decoded from 
Miscellaneous on page F4-2507. 


[Encoding diagram: fields cond, sz, Rn, Rd, C, and Rm.]

Decode fields
sz    C    Instruction Page
00    0    CRC32 - CRC32B variant
00    1    CRC32C - CRC32CB variant
01    0    CRC32 - CRC32H variant
01    1    CRC32C - CRC32CH variant
10    0    CRC32 - CRC32W variant
10    1    CRC32C - CRC32CW variant
11    -    UNPREDICTABLE
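
In this table, sz selects the data size suffix and C selects between the CRC32 and CRC32C instructions. A minimal Python sketch of that mapping follows. The function name `crc32_mnemonic` is hypothetical.

```python
def crc32_mnemonic(sz, c):
    """Return the CRC32 variant name for the sz and C decode fields."""
    if sz == 0b11:
        return "UNPREDICTABLE"
    # sz indexes the size suffix: 00 is B, 01 is H, 10 is W.
    return "CRC32" + ("C" if c else "") + "BHW"[sz]
```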


Integer Saturating Arithmetic 


This section describes the encoding of the Integer Saturating Arithmetic instruction class. This section is decoded 
from Miscellaneous on page F4-2507. 


[Encoding diagram: fields cond, opc, Rn, Rd, and Rm.]

Decode fields
opc    Instruction Page
00     QADD
01     QSUB
10     QDADD
11     QDSUB





F4.2.5 Halfword Multiply and Accumulate 


This section describes the encoding of the Halfword Multiply and Accumulate instruction class. This section is 
decoded from Data-processing and miscellaneous instructions on page F4-2502. 


[Encoding diagram: fields cond, opc, Rd, Ra, Rm, M, N, and Rn.]

Decode fields
opc    M    N    Instruction Page
00     -    -    SMLABB, SMLABT, SMLATB, SMLATT
01     0    0    SMLAWB, SMLAWT - SMLAWB variant
01     0    1    SMULWB, SMULWT - SMULWB variant
01     1    0    SMLAWB, SMLAWT - SMLAWT variant
01     1    1    SMULWB, SMULWT - SMULWT variant
10     -    -    SMLALBB, SMLALBT, SMLALTB, SMLALTT
11     -    -    SMULBB, SMULBT, SMULTB, SMULTT


F4.2.6 Data-processing register (immediate shift) 


This section describes the encoding of the Data-processing register (immediate shift) group. This section is decoded 
from Data-processing and miscellaneous instructions on page F4-2502. 


[Encoding diagram: decode fields op0 and op1.]

Decode fields
op0    op1    Decode group or instruction page
0x     -      Integer Data Processing (three register, immediate shift)
10     1      Integer Test & Compare (two register, immediate shift) on page F4-2512
11     -      Logical Arithmetic (three register, immediate shift) on page F4-2512





Integer Data Processing (three register, immediate shift) 


This section describes the encoding of the Integer Data Processing (three register, immediate shift) instruction class. 
This section is decoded from Data-processing register (immediate shift). 















































|31    28|27   24|23 21|20|19  16|15  12|11   7|6  5| 4|3    0|
|!= 1111 |0 0 0 0| opc | S|  Rn  |  Rd  | imm5 |type| 0|  Rm  |
   cond

Decode fields
opc   S   Rn        Instruction Page
000   -   -         AND, ANDS (register)
001   -   -         EOR, EORS (register)
010   0   != 1101   SUB, SUBS (register) - SUB, rotate right with extend variant
010   0   1101      SUB, SUBS (SP minus register) - SUB, rotate right with extend variant
010   1   != 1101   SUB, SUBS (register) - SUBS, rotate right with extend variant
010   1   1101      SUB, SUBS (SP minus register) - SUBS, rotate right with extend variant
011   -   -         RSB, RSBS (register)
100   0   != 1101   ADD, ADDS (register) - ADD, rotate right with extend variant
100   0   1101      ADD, ADDS (SP plus register) - ADD, rotate right with extend variant
100   1   != 1101   ADD, ADDS (register) - ADDS, rotate right with extend variant
100   1   1101      ADD, ADDS (SP plus register) - ADDS, rotate right with extend variant
101   -   -         ADC, ADCS (register)
110   -   -         SBC, SBCS (register)
111   -   -         RSC, RSCS (register)





Integer Test & Compare (two register, immediate shift) 


This section describes the encoding of the Integer Test & Compare (two register, immediate shift) instruction class. 
This section is decoded from Data-processing register (immediate shift) on page F4-2511. 


|31    28|27      23|22 21|20|19  16|15        12|11   7|6  5| 4|3    0|
|!= 1111 |0 0 0 1 0 | opc | 1|  Rn  |(0)(0)(0)(0)| imm5 |type| 0|  Rm  |
   cond

Decode fields
opc   Instruction Page
00    TST (register)
01    TEQ (register)
10    CMP (register)
11    CMN (register)





Logical Arithmetic (three register, immediate shift) 


This section describes the encoding of the Logical Arithmetic (three register, immediate shift) instruction class. This 
section is decoded from Data-processing register (immediate shift) on page F4-2511. 


|31    28|27      23|22 21|20|19  16|15  12|11   7|6  5| 4|3    0|
|!= 1111 |0 0 0 1 1 | opc | S|  Rn  |  Rd  | imm5 |type| 0|  Rm  |
   cond

Decode fields
opc   Instruction Page
00    ORR, ORRS (register)
01    MOV, MOVS (register)
10    BIC, BICS (register)
11    MVN, MVNS (register)


F4.2.7 Data-processing register (register shift) 


This section describes the encoding of the Data-processing register (register shift) group. This section is decoded 
from Data-processing and miscellaneous instructions on page F4-2502. 


|31    28|27 25|24 23|22 21| 20|19      8| 7|6  5| 4|3    0|
|!= 1111 |0 0 0| op0 |     |op1|         | 0|    | 1|      |

Decode fields
op0   op1   Decode group or instruction page
0x    -     Integer Data Processing (three register, register shift)
10    1     Integer Test & Compare (two register, register shift) on page F4-2514
11    -     Logical Arithmetic (three register, register shift) on page F4-2514





Integer Data Processing (three register, register shift) 


This section describes the encoding of the Integer Data Processing (three register, register shift) instruction class. 
This section is decoded from Data-processing register (register shift). 


|31    28|27   24|23 21|20|19  16|15  12|11   8| 7|6  5| 4|3    0|
|!= 1111 |0 0 0 0| opc | S|  Rn  |  Rd  |  Rs  | 0|type| 1|  Rm  |
   cond

Decode fields
opc   Instruction Page
000   AND, ANDS (register-shifted register)
001   EOR, EORS (register-shifted register)
010   SUB, SUBS (register-shifted register)
011   RSB, RSBS (register-shifted register)
100   ADD, ADDS (register-shifted register)
101   ADC, ADCS (register-shifted register)
110   SBC, SBCS (register-shifted register)
111   RSC, RSCS (register-shifted register)


Integer Test & Compare (two register, register shift) 


This section describes the encoding of the Integer Test & Compare (two register, register shift) instruction class. 
This section is decoded from Data-processing register (register shift) on page F4-2513. 


|31    28|27      23|22 21|20|19  16|15        12|11   8| 7|6  5| 4|3    0|
|!= 1111 |0 0 0 1 0 | opc | 1|  Rn  |(0)(0)(0)(0)|  Rs  | 0|type| 1|  Rm  |
   cond

Decode fields
opc   Instruction Page
00    TST (register-shifted register)
01    TEQ (register-shifted register)
10    CMP (register-shifted register)
11    CMN (register-shifted register)





Logical Arithmetic (three register, register shift) 


This section describes the encoding of the Logical Arithmetic (three register, register shift) instruction class. This 
section is decoded from Data-processing register (register shift) on page F4-2513. 


|31    28|27      23|22 21|20|19  16|15  12|11   8| 7|6  5| 4|3    0|
|!= 1111 |0 0 0 1 1 | opc | S|  Rn  |  Rd  |  Rs  | 0|type| 1|  Rm  |
   cond

Decode fields
opc   Instruction Page
00    ORR, ORRS (register-shifted register)
01    MOV, MOVS (register-shifted register)
10    BIC, BICS (register-shifted register)
11    MVN, MVNS (register-shifted register)


F4.2.8 Data-processing immediate 


This section describes the encoding of the Data-processing immediate group. This section is decoded from 
Data-processing and miscellaneous instructions on page F4-2502. 


|31    28|27 25|24 23|22|21 20|19     0|
|!= 1111 |0 0 1| op0 |  | op1 |        |

Decode fields
op0   op1   Decode group or instruction page
0x    -     Integer Data Processing (two register and immediate)
10    00    Move Halfword (immediate) on page F4-2516
10    10    Move Special Register & Hints (immediate) on page F4-2516
10    x1    Integer Test & Compare (one register and immediate) on page F4-2517
11    -     Logical Arithmetic (two register and immediate) on page F4-2518
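
Each row of a decode table like the one above is a pattern over the decode fields, where x matches either bit value and - matches any value. The following Python sketch shows one way such rows could be matched. It is illustrative only: the helper names are hypothetical, and it assumes op0 in bits [24:23] and op1 in bits [21:20], and that the word has already been routed to the Data-processing immediate group.

```python
def matches(pattern: str, value: int) -> bool:
    """Test a field value against a pattern of '0', '1' and 'x'; '-' matches anything."""
    if pattern == "-":
        return True
    bits = format(value, f"0{len(pattern)}b")
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# Rows copied from the decode table above: (op0, op1, decode group).
DP_IMMEDIATE = [
    ("0x", "-", "Integer Data Processing (two register and immediate)"),
    ("10", "00", "Move Halfword (immediate)"),
    ("10", "10", "Move Special Register & Hints (immediate)"),
    ("10", "x1", "Integer Test & Compare (one register and immediate)"),
    ("11", "-", "Logical Arithmetic (two register and immediate)"),
]


def decode_dp_immediate(instr: int) -> str:
    op0 = (instr >> 23) & 0b11  # assumed position: bits [24:23]
    op1 = (instr >> 20) & 0b11  # assumed position: bits [21:20]
    for p0, p1, group in DP_IMMEDIATE:
        if matches(p0, op0) and matches(p1, op1):
            return group
    raise ValueError("no matching row")
```

Because the rows do not overlap, the order in which they are tested does not affect the result.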





Integer Data Processing (two register and immediate) 


This section describes the encoding of the Integer Data Processing (two register and immediate) instruction class. 
This section is decoded from Data-processing immediate. 


|31    28|27   24|23 21|20|19  16|15  12|11       0|
|!= 1111 |0 0 1 0| opc | S|  Rn  |  Rd  |  imm12   |
   cond

Decode fields
opc   S   Rn        Instruction Page
000   -   -         AND, ANDS (immediate)
001   -   -         EOR, EORS (immediate)
010   0   != 11x1   SUB, SUBS (immediate) - SUB variant
010   0   1101      SUB, SUBS (SP minus immediate) - SUB variant
010   0   1111      ADR - A2 variant
010   1   != 1101   SUB, SUBS (immediate) - SUBS variant
010   1   1101      SUB, SUBS (SP minus immediate) - SUBS variant
011   -   -         RSB, RSBS (immediate)
100   0   != 11x1   ADD, ADDS (immediate) - ADD variant
100   0   1101      ADD, ADDS (SP plus immediate) - ADD variant
100   0   1111      ADR - A1 variant
100   1   != 1101   ADD, ADDS (immediate) - ADDS variant
100   1   1101      ADD, ADDS (SP plus immediate) - ADDS variant
101   -   -         ADC, ADCS (immediate)
110   -   -         SBC, SBCS (immediate)
111   -   -         RSC, RSCS (immediate)





Move Halfword (immediate) 


This section describes the encoding of the Move Halfword (immediate) instruction class. This section is decoded 
from Data-processing immediate on page F4-2515. 


|31    28|27      23|22|21 20|19  16|15  12|11       0|
|!= 1111 |0 0 1 1 0 | H| 0 0 | imm4 |  Rd  |  imm12   |
   cond

Decode fields
H   Instruction Page
0   MOV, MOVS (immediate)
1   MOVT





Move Special Register & Hints (immediate) 


This section describes the encoding of the Move Special Register & Hints (immediate) instruction class. This 
section is decoded from Data-processing immediate on page F4-2515. 


|31    28|27      23|22|21 20|19  16|15        12|11       0|
|!= 1111 |0 0 1 1 0 | R| 1 0 | imm4 |(1)(1)(1)(1)|  imm12   |
   cond

Decode fields
R:imm4     imm12          Instruction Page
!= 00000   -              MSR (immediate)
00000      xxxx00000000   NOP
00000      xxxx00000001   YIELD
00000      xxxx00000010   WFE
00000      xxxx00000011   WFI
00000      xxxx00000100   SEV
00000      xxxx00000101   SEVL
00000      xxxx0000011x   Reserved hint, behaves as NOP.
00000      xxxx00001xxx   Reserved hint, behaves as NOP.
00000      xxxx0001xxxx   Reserved hint, behaves as NOP.
00000      xxxx001xxxxx   Reserved hint, behaves as NOP.
00000      xxxx01xxxxxx   Reserved hint, behaves as NOP.
00000      xxxx10xxxxxx   Reserved hint, behaves as NOP.
00000      xxxx110xxxxx   Reserved hint, behaves as NOP.
00000      xxxx1110xxxx   Reserved hint, behaves as NOP.
00000      xxxx1111xxxx   DBG
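
The hint rows above depend only on the low eight bits of imm12, since the top four bits are x in every row. A small Python sketch of that classification follows. The function name is hypothetical, and it assumes the caller has already checked that R:imm4 is 00000 (otherwise the word decodes as MSR (immediate)).

```python
# Hypothetical mapping of the named hint encodings listed in the table above.
NAMED_HINTS = {0: "NOP", 1: "YIELD", 2: "WFE", 3: "WFI", 4: "SEV", 5: "SEVL"}


def decode_hint(imm12: int) -> str:
    """Classify imm12 for a word whose R:imm4 field is 0b00000."""
    low8 = imm12 & 0xFF  # bits [11:8] of imm12 are 'x' in every table row
    if low8 in NAMED_HINTS:
        return NAMED_HINTS[low8]
    if (low8 >> 4) == 0b1111:
        return "DBG"  # xxxx1111xxxx
    return "Reserved hint, behaves as NOP"
```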





Integer Test & Compare (one register and immediate) 


This section describes the encoding of the Integer Test & Compare (one register and immediate) instruction class. 
This section is decoded from Data-processing immediate on page F4-2515. 























|31    28|27      23|22 21|20|19  16|15        12|11       0|
|!= 1111 |0 0 1 1 0 | opc | 1|  Rn  |(0)(0)(0)(0)|  imm12   |
   cond

Decode fields
opc   Instruction Page
00    TST (immediate)
01    TEQ (immediate)
10    CMP (immediate)
11    CMN (immediate)


Logical Arithmetic (two register and immediate) 


This section describes the encoding of the Logical Arithmetic (two register and immediate) instruction class. This 
section is decoded from Data-processing immediate on page F4-2515. 


|31    28|27      23|22 21|20|19  16|15  12|11       0|
|!= 1111 |0 0 1 1 1 | opc | S|  Rn  |  Rd  |  imm12   |
   cond

Decode fields
opc   Instruction Page
00    ORR, ORRS (immediate)
01    MOV, MOVS (immediate)
10    BIC, BICS (immediate)
11    MVN, MVNS (immediate)


F4.3 Load/Store Word, Unsigned byte (immediate, literal) 


This section describes the encoding of the Load/Store Word, Unsigned byte (immediate, literal) instructions group. 
This section is decoded from Top level A32 instruction set encoding on page F4-2500. 


|31    28|27 25|24|23|22|21|20|19  16|15  12|11       0|
|!= 1111 |0 1 0| P| U|o2| W|o1|  Rn  |  Rt  |  imm12   |
   cond

Decode fields
o1   o2   P:W     Rn        Instruction Page
0    0    00      -         STR (immediate) - Post-indexed variant
0    0    01      -         STRT
0    0    10      -         STR (immediate) - Offset variant
0    0    11      -         STR (immediate) - Pre-indexed variant
0    1    00      -         STRB (immediate) - Post-indexed variant
0    1    01      -         STRBT
0    1    10      -         STRB (immediate) - Offset variant
0    1    11      -         STRB (immediate) - Pre-indexed variant
1    0    != 01   1111      LDR (literal)
1    0    00      != 1111   LDR (immediate) - Post-indexed variant
1    0    01      -         LDRT
1    0    10      != 1111   LDR (immediate) - Offset variant
1    0    11      != 1111   LDR (immediate) - Pre-indexed variant
1    1    != 01   1111      LDRB (literal)
1    1    00      != 1111   LDRB (immediate) - Post-indexed variant
1    1    01      -         LDRBT
1    1    10      != 1111   LDRB (immediate) - Offset variant
1    1    11      != 1111   LDRB (immediate) - Pre-indexed variant
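
The table above combines four decode fields: o1 selects load or store, o2 selects word or unsigned byte, P:W selects the addressing variant, and Rn == 1111 selects the literal form for loads. The Python sketch below illustrates that structure. The function name is hypothetical, and the bit positions (P bit 24, o2 bit 22, W bit 21, o1 bit 20, Rn bits [19:16]) are taken from the field diagram; it assumes the word already belongs to this group.

```python
def decode_ldst_word_imm(instr: int) -> str:
    """Return the instruction page for a Load/Store Word, Unsigned byte (immediate, literal) word."""
    p = (instr >> 24) & 1
    o2 = (instr >> 22) & 1   # 0 = word, 1 = unsigned byte
    w = (instr >> 21) & 1
    o1 = (instr >> 20) & 1   # 0 = store, 1 = load
    rn = (instr >> 16) & 0xF
    base = ("LDR" if o1 else "STR") + ("B" if o2 else "")
    pw = (p << 1) | w
    if o1 and rn == 0b1111 and pw != 0b01:
        return f"{base} (literal)"       # only loads have a literal form
    if pw == 0b01:
        return base + "T"                # unprivileged variants, Rn is '-'
    variant = {0b00: "Post-indexed", 0b10: "Offset", 0b11: "Pre-indexed"}[pw]
    return f"{base} (immediate) - {variant} variant"
```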


F4.4 Load/Store Word, Unsigned byte (register) 


This section describes the encoding of the Load/Store Word, Unsigned byte (register) instructions group. This 
section is decoded from Top level A32 instruction set encoding on page F4-2500. 


|31    28|27 25|24|23|22|21|20|19  16|15  12|11   7|6  5| 4|3    0|
|!= 1111 |0 1 1| P| U|o2| W|o1|  Rn  |  Rt  | imm5 |type| 0|  Rm  |
   cond

Decode fields
o1   o2   P   W   Instruction Page
0    0    0   0   STR (register) - Post-indexed variant
0    0    0   1   STRT
0    0    1   -   STR (register) - Pre-indexed variant
0    1    0   0   STRB (register) - Post-indexed variant
0    1    0   1   STRBT
0    1    1   -   STRB (register) - Pre-indexed variant
1    0    0   0   LDR (register) - Post-indexed variant
1    0    0   1   LDRT
1    0    1   -   LDR (register) - Pre-indexed variant
1    1    0   0   LDRB (register) - Post-indexed variant
1    1    0   1   LDRBT
1    1    1   -   LDRB (register) - Pre-indexed variant


F4.5 Media instructions 


This section describes the encoding of the Media instructions group. This section is decoded from Top level A32 
instruction set encoding on page F4-2500. 


|31    28|27 25|24 20|19      8|7  5| 4|3    0|
|!= 1111 |0 1 1| op0 |         |op1 | 1|      |

Decode fields
op0     op1   Decode group or instruction page
00xxx   -     Parallel Arithmetic on page F4-2522
01000   101   SEL
01000   001   Unallocated.
01000   xx0   PKHBT, PKHTB
01001   x01   Unallocated.
01001   xx0   Unallocated.
0110x   x01   Unallocated.
0110x   xx0   Unallocated.
01x10   001   Saturate 16-bit on page F4-2524
01x10   101   Unallocated.
01x11   x01   Reverse Bit/Byte on page F4-2524
01x1x   xx0   Saturate 32-bit on page F4-2524
01xxx   111   Unallocated.
01xxx   011   Extend and Add on page F4-2525
10xxx   -     Signed multiply, Divide on page F4-2525
11000   000   Unsigned Sum of Absolute Differences on page F4-2527
11000   100   Unallocated.
11001   x00   Unallocated.
1101x   x00   Unallocated.
110xx   111   Unallocated.
1110x   111   Unallocated.
1110x   x00   Bitfield Insert on page F4-2527
11110   111   Unallocated.
11111   111   Permanently UNDEFINED on page F4-2527
1111x   x00   Unallocated.
11x0x   x10   Unallocated.
11x1x   x10   Bitfield Extract on page F4-2528
11xxx   011   Unallocated.
11xxx   x01   Unallocated.





F4.5.1 Parallel Arithmetic 


This section describes the encoding of the Parallel Arithmetic instruction class. This section is decoded from Media 
instructions on page F4-2521. 


|31    28|27      23|22 20|19  16|15  12|11         8| 7|6  5| 4|3    0|
|!= 1111 |0 1 1 0 0 | op1 |  Rn  |  Rd  |(1)(1)(1)(1)| B|op2 | 1|  Rm  |
   cond

Decode fields
op1   op2   B   Instruction Page
000   -     -   Unallocated.
001   00    0   SADD16
001   00    1   SADD8
001   01    0   SASX
001   01    1   Unallocated.
001   10    0   SSAX
001   10    1   Unallocated.
001   11    0   SSUB16
001   11    1   SSUB8
010   00    0   QADD16
010   00    1   QADD8
010   01    0   QASX
010   01    1   Unallocated.
010   10    0   QSAX
010   10    1   Unallocated.
010   11    0   QSUB16
010   11    1   QSUB8
011   00    0   SHADD16
011   00    1   SHADD8
011   01    0   SHASX
011   01    1   Unallocated.
011   10    0   SHSAX
011   10    1   Unallocated.
011   11    0   SHSUB16
011   11    1   SHSUB8
100   -     -   Unallocated.
101   00    0   UADD16
101   00    1   UADD8
101   01    0   UASX
101   01    1   Unallocated.
101   10    0   USAX
101   10    1   Unallocated.
101   11    0   USUB16
101   11    1   USUB8
110   00    0   UQADD16
110   00    1   UQADD8
110   01    0   UQASX
110   01    1   Unallocated.
110   10    0   UQSAX
110   10    1   Unallocated.
110   11    0   UQSUB16
110   11    1   UQSUB8
111   00    0   UHADD16
111   00    1   UHADD8
111   01    0   UHASX
111   01    1   Unallocated.
111   10    0   UHSAX
111   10    1   Unallocated.
111   11    0   UHSUB16
111   11    1   UHSUB8
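
The Parallel Arithmetic table is regular: op1 selects a mnemonic prefix and the (op2, B) pair selects the operation, so the mnemonic can be composed rather than enumerated. The Python sketch below illustrates this reading of the table. The names are hypothetical, and it assumes op1 in bits [22:20], B in bit 7 and op2 in bits [6:5], for a word already routed to this class.

```python
# Prefix selected by op1; op1 values 000 and 100 are Unallocated in the table.
PREFIX = {0b001: "S", 0b010: "Q", 0b011: "SH", 0b101: "U", 0b110: "UQ", 0b111: "UH"}

# Operation selected by (op2, B); the remaining pairs are Unallocated.
OPERATION = {
    (0b00, 0): "ADD16", (0b00, 1): "ADD8",
    (0b01, 0): "ASX",
    (0b10, 0): "SAX",
    (0b11, 0): "SUB16", (0b11, 1): "SUB8",
}


def decode_parallel(instr: int) -> str:
    op1 = (instr >> 20) & 0b111  # assumed position: bits [22:20]
    b = (instr >> 7) & 1         # assumed position: bit 7
    op2 = (instr >> 5) & 0b11    # assumed position: bits [6:5]
    if op1 not in PREFIX or (op2, b) not in OPERATION:
        return "Unallocated"
    return PREFIX[op1] + OPERATION[(op2, b)]
```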


F4.5.2 Saturate 16-bit 


This section describes the encoding of the Saturate 16-bit instruction class. This section is decoded from Media 
instructions on page F4-2521. 


|31    28|27      23|22|21 20|19   16|15  12|11         8|7     4|3    0|
|!= 1111 |0 1 1 0 1 | U| 1 0 |sat_imm|  Rd  |(1)(1)(1)(1)|0 0 1 1|  Rn  |
   cond

Decode fields
U   Instruction Page
0   SSAT16
1   USAT16





F4.5.3 Reverse Bit/Byte 


This section describes the encoding of the Reverse Bit/Byte instruction class. This section is decoded from Media 
instructions on page F4-2521. 























|31    28|27      23|22|21 20|19        16|15  12|11         8| 7|6   4|3    0|
|!= 1111 |0 1 1 0 1 |o1| 1 1 |(1)(1)(1)(1)|  Rd  |(1)(1)(1)(1)|o2|0 1 1|  Rm  |
   cond

Decode fields
o1   o2   Instruction Page
0    0    REV
0    1    REV16
1    0    RBIT
1    1    REVSH

F4.5.4 Saturate 32-bit 


This section describes the encoding of the Saturate 32-bit instruction class. This section is decoded from Media 
instructions on page F4-2521. 


|31    28|27      23|22|21|20   16|15  12|11   7| 6|5  4|3    0|
|!= 1111 |0 1 1 0 1 | U| 1|sat_imm|  Rd  | imm5 |sh|0 1 |  Rn  |
   cond

Decode fields
U   Instruction Page
0   SSAT
1   USAT


F4.5.5 Extend and Add 


This section describes the encoding of the Extend and Add instruction class. This section is decoded from Media 
instructions on page F4-2521. 


|31    28|27      23|22|21 20|19  16|15  12|11  10|9    8|7     4|3    0|
|!= 1111 |0 1 1 0 1 | U| op  |  Rn  |  Rd  |rotate|(0)(0)|0 1 1 1|  Rm  |
   cond

Decode fields
op   U   Rn        Instruction Page
00   0   != 1111   SXTAB16
00   0   1111      SXTB16
00   1   != 1111   UXTAB16
00   1   1111      UXTB16
10   0   != 1111   SXTAB
10   0   1111      SXTB
10   1   != 1111   UXTAB
10   1   1111      UXTB
11   0   != 1111   SXTAH
11   0   1111      SXTH
11   1   != 1111   UXTAH
11   1   1111      UXTH
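
In the Extend and Add table, U selects signed or unsigned, op selects the element size, and Rn == 1111 selects the form without an accumulate operand. The Python sketch below composes the mnemonic from those three fields. It is illustrative only: the function name is hypothetical, the bit positions (U bit 22, op bits [21:20], Rn bits [19:16]) are taken from the field diagram, and op == 01 raises an error because the table has no row for it.

```python
# Size suffix selected by op; the table above lists no row for op == 0b01.
SIZE = {0b00: "B16", 0b10: "B", 0b11: "H"}


def decode_extend(instr: int) -> str:
    u = (instr >> 22) & 1
    op = (instr >> 20) & 0b11
    rn = (instr >> 16) & 0xF
    if op not in SIZE:
        raise ValueError("op value is not listed in the decode table")
    sign = "U" if u else "S"
    add = "" if rn == 0b1111 else "A"  # Rn == 1111 means extend without add
    return f"{sign}XT{add}{SIZE[op]}"
```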











F4.5.6 Signed multiply, Divide 


This section describes the encoding of the Signed multiply, Divide instruction class. This section is decoded from 
Media instructions on page F4-2521. 


|31    28|27      23|22 20|19  16|15  12|11   8|7  5| 4|3    0|
|!= 1111 |0 1 1 1 0 | op1 |  Rd  |  Ra  |  Rm  |op2 | 1|  Rn  |
   cond

Decode fields
op1   op2      Ra        Instruction Page
000   000      != 1111   SMLAD, SMLADX - SMLAD variant
000   000      1111      SMUAD, SMUADX - SMUAD variant
000   001      != 1111   SMLAD, SMLADX - SMLADX variant
000   001      1111      SMUAD, SMUADX - SMUADX variant
000   010      != 1111   SMLSD, SMLSDX - SMLSD variant
000   010      1111      SMUSD, SMUSDX - SMUSD variant
000   011      != 1111   SMLSD, SMLSDX - SMLSDX variant
000   011      1111      SMUSD, SMUSDX - SMUSDX variant
000   1xx      -         Unallocated.
001   != 000   -         Unallocated.
001   000      -         SDIV
010   -        -         Unallocated.
011   != 000   -         Unallocated.
011   000      -         UDIV
100   000      -         SMLALD, SMLALDX - SMLALD variant
100   001      -         SMLALD, SMLALDX - SMLALDX variant
100   010      -         SMLSLD, SMLSLDX - SMLSLD variant
100   011      -         SMLSLD, SMLSLDX - SMLSLDX variant
100   1xx      -         Unallocated.
101   000      != 1111   SMMLA, SMMLAR - SMMLA variant
101   000      1111      SMMUL, SMMULR - SMMUL variant
101   001      != 1111   SMMLA, SMMLAR - SMMLAR variant
101   001      1111      SMMUL, SMMULR - SMMULR variant
101   01x      -         Unallocated.
101   10x      -         Unallocated.
101   110      -         SMMLS, SMMLSR - SMMLS variant
101   111      -         SMMLS, SMMLSR - SMMLSR variant
11x   -        -         Unallocated.


F4.5.7 Unsigned Sum of Absolute Differences 


This section describes the encoding of the Unsigned Sum of Absolute Differences instruction class. This section is 
decoded from Media instructions on page F4-2521. 


|31    28|27           20|19  16|15  12|11   8|7     4|3    0|
|!= 1111 |0 1 1 1 1 0 0 0|  Rd  |  Ra  |  Rm  |0 0 0 1|  Rn  |
   cond

Decode fields
Ra        Instruction Page
!= 1111   USADA8
1111      USAD8





F4.5.8 Bitfield Insert 


This section describes the encoding of the Bitfield Insert instruction class. This section is decoded from Media 
instructions on page F4-2521. 


|31    28|27         21|20 16|15  12|11  7|6   4|3    0|
|!= 1111 |0 1 1 1 1 1 0| msb |  Rd  | lsb |0 0 1|  Rn  |
   cond

Decode fields
Rn        Instruction Page
!= 1111   BFI
1111      BFC





F4.5.9 Permanently UNDEFINED 


This section describes the encoding of the Permanently UNDEFINED instruction class. This section is decoded 
from Media instructions on page F4-2521. 


|31    28|27           20|19       8|7     4|3    0|
|!= 1111 |0 1 1 1 1 1 1 1|  imm12   |1 1 1 1| imm4 |
   cond

Decode fields
cond   Instruction Page
0xxx   Unallocated.
10xx   Unallocated.
110x   Unallocated.
1110   UDF


F4.5.10 Bitfield Extract 


This section describes the encoding of the Bitfield Extract instruction class. This section is decoded from Media 
instructions on page F4-2521. 


|31    28|27      23|22|21|20   16|15  12|11  7|6   4|3    0|
|!= 1111 |0 1 1 1 1 | U| 1|widthm1|  Rd  | lsb |1 0 1|  Rn  |
   cond

Decode fields
U   Instruction Page
0   SBFX
1   UBFX


F4.6 Branch, branch with link, and block data transfer 


This section describes the encoding of the Branch, branch with link, and block data transfer group. This section is 
decoded from Top level A32 instruction set encoding on page F4-2500. 


|31    28|27 26| 25|24      0|
|  cond  | 1 0 |op0|         |

Decode fields
cond      op0   Decode group or instruction page
1111      0     Exception Save/Restore
!= 1111   0     Load/Store Multiple on page F4-2530
-         1     Branch (immediate) on page F4-2530





F4.6.1 Exception Save/Restore 


This section describes the encoding of the Exception Save/Restore instruction class. This section is decoded from 
Branch, branch with link, and block data transfer. 


|31    28|27 25|24|23|22|21|20|19  16|15      5|4    0|
|1 1 1 1 |1 0 0| P| U| S| W| L|  Rn  |   op    | mode |

Decode fields
P   U   S   L   Instruction Page
-   -   0   0   Unallocated.
0   0   0   1   RFE, RFEDA, RFEDB, RFEIA, RFEIB - Decrement After variant
0   0   1   0   SRS, SRSDA, SRSDB, SRSIA, SRSIB - Decrement After variant
0   1   0   1   RFE, RFEDA, RFEDB, RFEIA, RFEIB - Increment After variant
0   1   1   0   SRS, SRSDA, SRSDB, SRSIA, SRSIB - Increment After variant
1   0   0   1   RFE, RFEDA, RFEDB, RFEIA, RFEIB - Decrement Before variant
1   0   1   0   SRS, SRSDA, SRSDB, SRSIA, SRSIB - Decrement Before variant
-   -   1   1   Unallocated.
1   1   0   1   RFE, RFEDA, RFEDB, RFEIA, RFEIB - Increment Before variant
1   1   1   0   SRS, SRSDA, SRSDB, SRSIA, SRSIB - Increment Before variant


F4.6.2 Load/Store Multiple 


This section describes the encoding of the Load/Store Multiple instruction class. This section is decoded from 
Branch, branch with link, and block data transfer on page F4-2529. 


|31    28|27 25|24|23|22|21|20|19  16|15            0|
|!= 1111 |1 0 0| P| U|op| W| L|  Rn  | register_list |
   cond

Decode fields
op   P   U   L   register_list      Instruction Page
0    0   0   0   -                  STMDA, STMED
0    0   0   1   -                  LDMDA, LDMFA
0    0   1   0   -                  STM, STMIA, STMEA
0    0   1   1   -                  LDM, LDMIA, LDMFD
0    1   0   0   -                  STMDB, STMFD
0    1   0   1   -                  LDMDB, LDMEA
0    1   1   0   -                  STMIB, STMFA
0    1   1   1   -                  LDMIB, LDMED
1    -   -   0   -                  STM (User registers)
1    -   -   1   0xxxxxxxxxxxxxxx   LDM (User registers)
1    -   -   1   1xxxxxxxxxxxxxxx   LDM (exception return)
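
The Load/Store Multiple table splits on op first. With op == 0, the P, U and L bits pick one of eight addressing-mode pages. With op == 1, only L and the top bit of register_list (the PC bit) matter. A Python sketch of that two-level decode follows. The names are hypothetical, and it assumes P bit 24, U bit 23, op bit 22, L bit 20 and register_list in bits [15:0], for a word with cond != 1111 that is already in this class.

```python
# Pages for op == 0, keyed by (P, U, L), copied from the table above.
LDM_STM = {
    (0, 0, 0): "STMDA, STMED",      (0, 0, 1): "LDMDA, LDMFA",
    (0, 1, 0): "STM, STMIA, STMEA", (0, 1, 1): "LDM, LDMIA, LDMFD",
    (1, 0, 0): "STMDB, STMFD",      (1, 0, 1): "LDMDB, LDMEA",
    (1, 1, 0): "STMIB, STMFA",      (1, 1, 1): "LDMIB, LDMED",
}


def decode_ldm_stm(instr: int) -> str:
    p = (instr >> 24) & 1
    u = (instr >> 23) & 1
    op = (instr >> 22) & 1
    load = (instr >> 20) & 1
    if op == 0:
        return LDM_STM[(p, u, load)]
    if not load:
        return "STM (User registers)"
    # Bit 15 of register_list is the PC bit.
    pc_in_list = (instr >> 15) & 1
    return "LDM (exception return)" if pc_in_list else "LDM (User registers)"
```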








F4.6.3 Branch (immediate) 


This section describes the encoding of the Branch (immediate) instruction class. This section is decoded from 
Branch, branch with link, and block data transfer on page F4-2529. 


|31    28|27 25|24|23           0|
|  cond  |1 0 1| H|    imm24     |

Decode fields
cond      H   Instruction Page
!= 1111   0   B
!= 1111   1   BL, BLX (immediate) - A1 variant
1111      -   BL, BLX (immediate) - A2 variant
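
The Branch (immediate) table needs only cond and H. A short Python sketch of the three rows follows. The function name is hypothetical, and it assumes cond in bits [31:28] and H in bit 24, for a word already routed to this class.

```python
def decode_branch_immediate(instr: int) -> str:
    cond = (instr >> 28) & 0xF
    h = (instr >> 24) & 1
    if cond == 0b1111:
        return "BL, BLX (immediate) - A2 variant"  # H is '-' in this row
    return "BL, BLX (immediate) - A1 variant" if h else "B"
```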


F4.7 System register access, Advanced SIMD, floating-point, and Supervisor Call 


This section describes the encoding of the System register access, Advanced SIMD, floating-point, and Supervisor 
Call group. This section is decoded from Top level A32 instruction set encoding on page F4-2500. 


|31    28|27 26|25 24|23     12|11  9|8    5|  4|3    0|
|  cond  | 1 1 | op0 |         | op1 |      |op2|      |

Decode fields
op0     op1      op2   Decode group or instruction page
0x      101      -     Floating-point and Advanced SIMD Load/Store and 64-bit register moves
10      101      0     Floating-point data-processing on page F4-2533
10      101      1     Floating-point and Advanced SIMD 32-bit register moves on page F4-2536
!= 11   != 1x1   -     Unallocated
!= 11   111      -     System register access on page F4-2538
11      -        -     SVC

F4.7.1 Floating-point and Advanced SIMD Load/Store and 64-bit register moves 


This section describes the encoding of the Floating-point and Advanced SIMD Load/Store and 64-bit register moves 
group. This section is decoded from System register access, Advanced SIMD, floating-point, and Supervisor Call. 


|31    28|27 25|24 21|20     12|11  9|8      0|
|!= 1111 |1 1 0| op0 |         |1 0 1|        |

Decode fields
op0       Decode group or instruction page
00x0      Advanced SIMD and floating-point 64-bit move on page F4-2532
!= 00x0   Advanced SIMD and floating-point Load/Store on page F4-2532


Advanced SIMD and floating-point 64-bit move 


This section describes the encoding of the Advanced SIMD and floating-point 64-bit move instruction class. This 
section is decoded from Floating-point and Advanced SIMD Load/Store and 64-bit register moves on page F4-2531. 


|31    28|27      23|22|21|20|19  16|15  12|11  9| 8|7   6| 5| 4|3    0|
|!= 1111 |1 1 0 0 0 | D| 0|op| Rt2  |  Rt  |1 0 1|sz|opc2 | M|o3|  Vm  |
   cond

Decode fields
op   opc2   o3   D   sz   Instruction Page
-    -      -    0   -    Unallocated.
0    00     1    1   0    VMOV (between two general-purpose registers and two single-precision registers)
0    00     1    1   1    VMOV (between two general-purpose registers and a doubleword floating-point register)
-    -      0    1   -    Unallocated.
-    01     -    1   -    Unallocated.
-    1x     -    1   -    Unallocated.
1    00     1    1   0    VMOV (between two general-purpose registers and two single-precision registers)
1    00     1    1   1    VMOV (between two general-purpose registers and a doubleword floating-point register)





Advanced SIMD and floating-point Load/Store 


This section describes the encoding of the Advanced SIMD and floating-point Load/Store instruction class. This 
section is decoded from Floating-point and Advanced SIMD Load/Store and 64-bit register moves on page F4-2531. 


|31    28|27 25|24|23|22|21|20|19  16|15  12|11  9| 8|7      0|
|!= 1111 |1 1 0| P| U| D| W| L|  Rn  |  Vd  |1 0 1|sz|  imm8  |
   cond

Decode fields
P   U   L   sz   imm8       W   Rn        Instruction Page
0   0   -   -    -          1   -         Unallocated.
0   1   0   0    -          -   -         VSTM, VSTMDB, VSTMIA
0   1   0   1    xxxxxxx0   -   -         VSTM, VSTMDB, VSTMIA
0   1   0   1    xxxxxxx1   -   -         FSTMDBX, FSTMIAX - Increment After variant
0   1   1   0    -          -   -         VLDM, VLDMDB, VLDMIA
0   1   1   1    xxxxxxx0   -   -         VLDM, VLDMDB, VLDMIA
0   1   1   1    xxxxxxx1   -   -         FLDMDBX, FLDMIAX - Increment After variant
1   -   0   -    -          0   -         VSTR
1   0   0   0    -          1   -         VSTM, VSTMDB, VSTMIA
1   0   0   1    xxxxxxx0   1   -         VSTM, VSTMDB, VSTMIA
1   0   0   1    xxxxxxx1   1   -         FSTMDBX, FSTMIAX - Decrement Before variant
1   0   1   0    -          1   -         VLDM, VLDMDB, VLDMIA
1   0   1   1    xxxxxxx0   1   -         VLDM, VLDMDB, VLDMIA
1   0   1   1    xxxxxxx1   1   -         FLDMDBX, FLDMIAX - Decrement Before variant
1   -   1   -    -          0   != 1111   VLDR (immediate)
1   -   1   -    -          0   1111      VLDR (literal)
1   1   -   -    -          1   -         Unallocated.


F4.7.2 Floating-point data-processing 


This section describes the encoding of the Floating-point data-processing group. This section is decoded from 
System register access, Advanced SIMD, floating-point, and Supervisor Call on page F4-2531. 


|31    28|27   24|23 20| 19|18     12|11  9|8  7|  6| 5| 4|3    0|
|  cond  |1 1 1 0| op0 |op1|         |1 0 1|    |op2|  | 0|      |

Decode fields
cond      op0       op1   op2   Decode group or instruction page
1111      0xxx      -     0     VSELEQ, VSELGE, VSELGT, VSELVS
1111      1x00      -     -     Floating-point minNum/maxNum on page F4-2534
1111      1x11      1     1     Floating-point directed convert to integer on page F4-2534
!= 1111   1x11      -     1     Floating-point data-processing (two registers) on page F4-2535
!= 1111   1x11      -     0     VMOV (immediate)
!= 1111   != 1x11   -     -     Floating-point data-processing (three registers) on page F4-2536


Floating-point minNum/maxNum 


This section describes the encoding of the Floating-point minNum/maxNum instruction class. This section is 
decoded from Floating-point data-processing on page F4-2533. 


|31    28|27      23|22|21 20|19  16|15  12|11  9| 8| 7| 6| 5| 4|3    0|
|1 1 1 1 |1 1 1 0 1 | D| 0 0 |  Vn  |  Vd  |1 0 1|sz| N|op| M| 0|  Vm  |

Decode fields
op   Instruction Page
0    VMAXNM
1    VMINNM





Floating-point directed convert to integer 


This section describes the encoding of the Floating-point directed convert to integer instruction class. This section 
is decoded from Floating-point data-processing on page F4-2533. 


|31    28|27      23|22|21 20|19|18|17 16|15  12|11  9| 8| 7| 6| 5| 4|3    0|
|1 1 1 1 |1 1 1 0 1 | D| 1 1 | 1|o1| rm  |  Vd  |1 0 1|sz|op| 1| M| 0|  Vm  |

Decode fields
o1   rm   Instruction Page
0    00   VRINTA (floating-point)
0    01   VRINTN (floating-point)
0    10   VRINTP (floating-point)
0    11   VRINTM (floating-point)
1    00   VCVTA (floating-point)
1    01   VCVTN (floating-point)
1    10   VCVTP (floating-point)
1    11   VCVTM (floating-point)


Floating-point data-processing (two registers) 


This section describes the encoding of the Floating-point data-processing (two registers) instruction class. This 
section is decoded from Floating-point data-processing on page F4-2533. 


|31    28|27      23|22|21 20|19|18 16|15  12|11  9| 8| 7| 6| 5| 4|3    0|
|!= 1111 |1 1 1 0 1 | D| 1 1 |o1|opc2 |  Vd  |1 0 1|sz|o3| 1| M| 0|  Vm  |
   cond

Decode fields
o1   opc2   o3   Instruction Page
0    000    0    VMOV (register)
0    000    1    VABS
0    001    0    VNEG
0    001    1    VSQRT
0    010    0    VCVTB
0    010    1    VCVTT
0    011    0    VCVTB
0    011    1    VCVTT
0    100    0    VCMP
0    100    1    VCMPE
0    101    0    VCMP
0    101    1    VCMPE
0    110    0    VRINTR
0    110    1    VRINTZ (floating-point)
0    111    0    VRINTX (floating-point)
0    111    1    VCVT (between double-precision and single-precision)
1    000    -    VCVT (integer to floating-point, floating-point)
1    001    -    Unallocated.
1    01x    -    VCVT (between floating-point and fixed-point, floating-point)
1    100    0    VCVTR
1    10x    1    VCVT (floating-point to integer, floating-point)
1    101    0    VCVTR
1    11x    -    VCVT (between floating-point and fixed-point, floating-point)


Floating-point data-processing (three registers) 


This section describes the encoding of the Floating-point data-processing (three registers) instruction class. This 
section is decoded from Floating-point data-processing on page F4-2533. 


|31  28|27 26 25 24|23|22|21 20|19  16|15  12|11 10 9|8 |7|6 |5|4|3   0|
| cond | 1  1  1  0|o0| D| o1  |  Vn  |  Vd  | 1  0 1|sz|N|o2|M|0|  Vm |

Decode fields     Instruction Page
o0  o1  o2
0   00  0         VMLA (floating-point)
0   00  1         VMLS (floating-point)
0   01  0         VNMLS
0   01  1         VNMLA
0   10  0         VMUL (floating-point)
0   10  1         VNMUL
0   11  0         VADD (floating-point)
0   11  1         VSUB (floating-point)
1   00  0         VDIV
1   01  0         VFNMS
1   01  1         VFNMA
1   10  0         VFMA
1   10  1         VFMS
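
As a hedged illustration of how the three decode fields select an instruction page, the table above can be written as a dictionary lookup. The sketch assumes that o0 is bit 23, o1 is bits 21:20, and o2 is bit 6 of the instruction word; the names `FP_THREE_REG` and `decode_fp_three_reg` are hypothetical, and field combinations that the table does not list fall through to a placeholder string.

```python
# Sketch: (o0, o1, o2) -> instruction page, transcribed from the table above.
# Assumption: o0 = bit 23, o1 = bits 21:20, o2 = bit 6.
FP_THREE_REG = {
    (0, 0b00, 0): "VMLA (floating-point)",
    (0, 0b00, 1): "VMLS (floating-point)",
    (0, 0b01, 0): "VNMLS",
    (0, 0b01, 1): "VNMLA",
    (0, 0b10, 0): "VMUL (floating-point)",
    (0, 0b10, 1): "VNMUL",
    (0, 0b11, 0): "VADD (floating-point)",
    (0, 0b11, 1): "VSUB (floating-point)",
    (1, 0b00, 0): "VDIV",
    (1, 0b01, 0): "VFNMS",
    (1, 0b01, 1): "VFNMA",
    (1, 0b10, 0): "VFMA",
    (1, 0b10, 1): "VFMS",
}


def decode_fp_three_reg(instr: int) -> str:
    key = ((instr >> 23) & 0x1, (instr >> 20) & 0x3, (instr >> 6) & 0x1)
    # Combinations absent from the table are reported, not guessed.
    return FP_THREE_REG.get(key, "not listed in this table")
```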





F4.7.3 Floating-point and Advanced SIMD 32-bit register moves 


This section describes the encoding of the Floating-point and Advanced SIMD 32-bit register moves group. This 
section is decoded from System register access, Advanced SIMD, floating-point, and Supervisor Call on 
page F4-2531. 


|31  28|27   24|23  21|20|19      12|11 10 9|8  |7  5|4|3   0|
| cond |1 1 1 0| op0  |  |          | 1  0 1|op1|    |1|     |

Decode fields    Decode group or instruction page
op0  op1
000  0           VMOV (between general-purpose register and single-precision)
111  0           Floating-point move special register on page F4-2537
-    1           Advanced SIMD 8/16/32-bit element move/duplicate on page F4-2537


Floating-point move special register 


This section describes the encoding of the Floating-point move special register instruction class. This section is 
decoded from Floating-point and Advanced SIMD 32-bit register moves on page F4-2536. 


|31  28|27 26 25 24|23 22 21 20|19  16|15  12|11 10 9 8|7 6 5 4|3 2 1 0|
  cond

Decode fields    Instruction Page
L
0                VMSR
1                VMRS





Advanced SIMD 8/16/32-bit element move/duplicate 


This section describes the encoding of the Advanced SIMD 8/16/32-bit element move/duplicate instruction class. 
This section is decoded from Floating-point and Advanced SIMD 32-bit register moves on page F4-2536. 























|31  28|27 26 25 24|23  21|20|19  16|15  12|11 10 9 8|7|6   5|4|3 2 1 0|
| cond | 1  1  1  0| opc1 | L|      |      | 1  0 1 1| | opc2| |       |

Decode fields       Instruction Page
opc1  opc2  L
0xx   -     0       VMOV (general-purpose register to scalar)
-     -     1       VMOV (scalar to general-purpose register)
1xx   0x    0       VDUP (general-purpose register)
1xx   1x    0       Unallocated.

F4.7.4 Supervisor call 


This section describes the encoding of the Supervisor call group. This section is decoded from System register 
access, Advanced SIMD, floating-point, and Supervisor Call on page F4-2531. 


|31  28|27   24|23                                 0|
| cond |1 1 1 1|                                    |

Decode fields    Decode group or instruction page
cond
1111             Unallocated.
!= 1111          SVC


F4.7.5 System register access 


This section describes the encoding of the System register access instruction group. This section is decoded from 
System register access, Advanced SIMD, floating-point, and Supervisor Call on page F4-2531. 


|31  28|27 26|25 |24   21|20|19      12|11     8|7  5|4  |3   0|
|      | 1  1|op0|  op1  |  |          | 1 1 1 x|    |op2|     |
coproc: bits[11:8]

Decode fields        Decode group or instruction page
op0  op1      op2
0    00x0     -      System register 64-bit move
0    != 00x0  -      System register Load/Store on page F4-2539
1    0xxx     0      Unallocated
1    0xxx     1      System register 32-bit move on page F4-2539





System register 64-bit move 


This section describes the encoding of the System register 64-bit move instruction class. This section is decoded 
from System register access. 


|31  28|27 26 25 24|23|22|21|20|19  16|15  12|11     8|7    4|3   0|
| cond | 1  1  0  0| 0| D| 0| L| Rt2  |  Rt  | 1 1 1 x| opc1 | CRm |
coproc: bits[11:8]

Decode fields       Instruction Page
cond     D  L
xxxx     0  -       Unallocated
!= 1111  1  0       MCRR
!= 1111  1  1       MRRC
1111     -  -       Unallocated


System register Load/Store 
This section describes the encoding of the System register Load/Store instruction class. This section is decoded from
System register access on page F4-2538.


|31  28|27 26 25|24|23|22|21|20|19  16|15  12|11     8|7         0|
| cond | 1  1  0| P| U| D| W| L|  Rn  | CRd  | 1 1 1 x|   imm8    |
coproc: bits[11:8]

Decode fields                            Instruction Page
cond     P:U:W   CRd      L  Rn
xxxx     000     -        -  -           Unallocated
!= 1111  != 000  != 0101  -  -           Unallocated
!= 1111  != 000  0101     0  -           STC
!= 1111  != 000  0101     1  != 1111     LDC (immediate)
!= 1111  != 000  0101     1  1111        LDC (literal)
1111     -       -        -  -           Unallocated

: : - - - Unallocated 





System register 32-bit move 


This section describes the encoding of the System register 32-bit move instruction class. This section is decoded 
from System register access on page F4-2538. 


|31  28|27 26 25 24|23  21|20|19  16|15  12|11   8|7  5|4|3   0|
coproc: bits[11:8]

Decode fields    Instruction Page
cond     L
!= 1111  0       MCR
!= 1111  1       MRC
1111     x       Unallocated


F4.8 Unconditional instructions 


This section describes the encoding of the Unconditional instructions group. This section is decoded from Top level 
A32 instruction set encoding on page F4-2500. 


|31       27|26 25|24  21|20 |19                       0|
| 1 1 1 1 0 | op0 |      |op1|                          |

Decode fields    Decode group or instruction page
op0  op1
00   -           Miscellaneous
01   -           Advanced SIMD data-processing on page F4-2541
1x   1           Memory hints and barriers on page F4-2551
10   0           Advanced SIMD element or structure Load/Store on page F4-2553
11   0           Unallocated.
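
The group decode above can be sketched as a few comparisons on op0 and op1. This is a minimal sketch under stated assumptions: bits 31:27 are taken to be 11110 for this group, op0 is taken to be bits 26:25, and op1 is taken to be bit 20; the function name `decode_unconditional` is hypothetical.

```python
# Sketch: classify an A32 instruction word within the Unconditional
# instructions group. Assumption: op0 = bits 26:25, op1 = bit 20.
def decode_unconditional(instr: int) -> str:
    if ((instr >> 27) & 0x1F) != 0b11110:
        return "not in this group"
    op0 = (instr >> 25) & 0x3
    op1 = (instr >> 20) & 0x1
    if op0 == 0b00:
        return "Miscellaneous"
    if op0 == 0b01:
        return "Advanced SIMD data-processing"
    if op1 == 1:  # op0 is 1x here
        return "Memory hints and barriers"
    if op0 == 0b10:
        return "Advanced SIMD element or structure Load/Store"
    return "Unallocated"
```

The rows are tested in an order that keeps the `1x 1` row ahead of the two rows that require op1 to be 0, so each instruction word reaches exactly one result.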





F4.8.1 Miscellaneous 


This section describes the encoding of the Miscellaneous group. This section is decoded from Unconditional 
instructions. 


|31           25|24   20|19         8|7    4|3   0|
| 1 1 1 1 0 0 0 |  op0  |            | op1  |     |

Decode fields    Decode group or instruction page
op0    op1
0xxxx  -         Unallocated.
10000  xx0x      Change Process State on page F4-2541
10001  xx0x      Unallocated.
1001x  xx0x      Unallocated.
100xx  0011      Unallocated.
100xx  0111      UNPREDICTABLE
100xx  0x10      Unallocated.
100xx  1x1x      Unallocated.
101xx  -         Unallocated.
11xxx  -         Unallocated.


Change Process State 


This section describes the encoding of the Change Process State instruction class. This section is decoded from 
Miscellaneous on page F4-2540. 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18|17|16|15 14 13 12  11 10|9|8|7|6|5|4    0|
| 1  1  1  1| 0  0  0  1| 0  0  0  0|imod | M|op|(0)(0)(0)(0)(0)(0)|E|A|I|F|0| mode |

Decode fields          Instruction Page
op  imod  M  mode
0   -     -  -         CPS, CPSID, CPSIE
1   -     -  0xxxx     SETEND
1   -     -  1xxxx     Unallocated.
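
A short sketch of this three-row decode, assuming that op is bit 16 and mode is bits 4:0 of the instruction word. The function name `decode_change_process_state` is hypothetical, and the fixed bits of the class are not checked.

```python
# Sketch: decode the Change Process State class from the op and mode fields.
# Assumption: op = bit 16, mode = bits 4:0.
def decode_change_process_state(instr: int) -> str:
    op = (instr >> 16) & 0x1
    mode = instr & 0x1F
    if op == 0:
        return "CPS, CPSID, CPSIE"
    # op == 1: only mode == 0xxxx is allocated (SETEND).
    return "SETEND" if (mode >> 4) == 0 else "Unallocated"
```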





F4.8.2 Advanced SIMD data-processing 


This section describes the encoding of the Advanced SIMD data-processing group. This section is decoded from 
Unconditional instructions on page F4-2540. 


|31           25|24 23|22|21            7|6  |5|4  |3   0|
| 1 1 1 1 0 0 1 | op0 |  |      op1      |op2| |op3|     |

Decode fields                          Decode group or instruction page
op0  op1                 op2  op3
01   11xxxxxxxxxxxxx     -    0        VEXT (byte elements)
11   11xxxxxxxx0xxxx     -    0        Advanced SIMD two registers misc on page F4-2542
11   11xxxxxxxx10xxx     -    0        VTBL, VTBX
11   11xxxxxxxx11xxx     -    0        Advanced SIMD duplicate (scalar) on page F4-2544
x0   -                   -    -        Advanced SIMD three registers of the same length on page F4-2544
x1   000xxxxxxxxxxx0     -    1        Advanced SIMD one register and modified immediate on page F4-2547
x1   != 11xxxxxxxxxxxxx  0    0        Advanced SIMD three registers of different lengths on page F4-2548
x1   != 11xxxxxxxxxxxxx  1    0        Advanced SIMD two registers and a scalar on page F4-2549
x1   != 000xxxxxxxxxxx0  -    1        Advanced SIMD two registers and shift amount on page F4-2550
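
The decode tables in this chapter use bit patterns in which `x` marks a don't-care bit, `-` matches any value, and `!=` negates a pattern. A minimal sketch of a matcher for that notation follows, applied to the table above. It assumes that op0 is bits 24:23, op1 is bits 21:7 (15 bits, inferred from the pattern width), op2 is bit 6, and op3 is bit 4; the names `matches`, `SIMD_DP_ROWS`, and `decode_simd_data_processing` are hypothetical.

```python
# Sketch: match a field value against a decode-table pattern.
def matches(value: int, pattern: str) -> bool:
    pattern = pattern.strip()
    if pattern == "-":
        return True                       # field is ignored by this row
    if pattern.startswith("!="):
        return not matches(value, pattern[2:])
    width = len(pattern)
    bits = format(value & ((1 << width) - 1), f"0{width}b")
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# Rows transcribed from the table above: (op0, op1, op2, op3, page).
SIMD_DP_ROWS = [
    ("01", "11xxxxxxxxxxxxx", "-", "0", "VEXT (byte elements)"),
    ("11", "11xxxxxxxx0xxxx", "-", "0", "Advanced SIMD two registers misc"),
    ("11", "11xxxxxxxx10xxx", "-", "0", "VTBL, VTBX"),
    ("11", "11xxxxxxxx11xxx", "-", "0", "Advanced SIMD duplicate (scalar)"),
    ("x0", "-", "-", "-", "Advanced SIMD three registers of the same length"),
    ("x1", "000xxxxxxxxxxx0", "-", "1",
     "Advanced SIMD one register and modified immediate"),
    ("x1", "!= 11xxxxxxxxxxxxx", "0", "0",
     "Advanced SIMD three registers of different lengths"),
    ("x1", "!= 11xxxxxxxxxxxxx", "1", "0",
     "Advanced SIMD two registers and a scalar"),
    ("x1", "!= 000xxxxxxxxxxx0", "-", "1",
     "Advanced SIMD two registers and shift amount"),
]


def decode_simd_data_processing(instr: int) -> str:
    fields = ((instr >> 23) & 0x3, (instr >> 7) & 0x7FFF,
              (instr >> 6) & 0x1, (instr >> 4) & 0x1)
    for *patterns, page in SIMD_DP_ROWS:
        if all(matches(f, p) for f, p in zip(fields, patterns)):
            return page
    return "no matching row"
```

Keeping the rows as data rather than as nested conditionals makes it straightforward to compare the sketch against the table line by line.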


Advanced SIMD two registers misc 


This section describes the encoding of the Advanced SIMD two registers misc instruction class. This section is 
decoded from Advanced SIMD data-processing on page F4-2541. 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15  12|11|10   7|6|5|4|3   0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1|size |opc1 |  Vd  | 0| opc2 |Q|M|0|  Vm |

Decode fields          Instruction Page
opc1  opc2  Q  size
00    0000  -          VREV64
00    0001  -          VREV32
00    0010  -          VREV16
00    0011  -          Unallocated.
00    010x  -          VPADDL
00    0110  -          AESE
00    0110  -          AESD
00    0111  -          AESMC
00    0111  -          AESIMC
00    1000  -          VCLS
00    1001  -          VCLZ
00    1010  -          VCNT
00    1011  -          VMVN (register)
00    110x  -          VPADAL
00    1110  -          VQABS
00    1111  -          VQNEG
01    x000  -          VCGT (immediate #0)
01    x001  -          VCGE (immediate #0)
01    x010  -          VCEQ (immediate #0)
01    x011  -          VCLE (immediate #0)
01    x100  -          VCLT (immediate #0)
01    x110  -          VABS
01    x111  -          VNEG
01    0101  -          SHA1H
10    0000  00         VSWP
10    0001  -          VTRN
10    0010             VUZP
10    0011             VZIP
10    0100             VMOVN
10    0100             VQMOVN, VQMOVUN - Unsigned result variant
10    0101             VQMOVN, VQMOVUN - Signed result variant
10    0110             VSHLL
10    0111             SHA1SU1
10    0111             SHA256SU0
10    1000             VRINTN (Advanced SIMD)
10    1001             VRINTX (Advanced SIMD)
10    1010             VRINTA (Advanced SIMD)
10    1011             VRINTZ (Advanced SIMD)
10    11x0             VCVT (between half-precision and single-precision, Advanced SIMD)
10    1100             Unallocated.
10    1101             VRINTM (Advanced SIMD)
10    1110             Unallocated.
10    1111             VRINTP (Advanced SIMD)
11    000x             VCVTA (Advanced SIMD)
11    001x             VCVTN (Advanced SIMD)
11    010x             VCVTP (Advanced SIMD)
11    011x             VCVTM (Advanced SIMD)
11    10x0             VRECPE
11    10x1             VRSQRTE
11    11xx             VCVT (between floating-point and integer, Advanced SIMD)


Advanced SIMD duplicate (scalar) 
This section describes the encoding of the Advanced SIMD duplicate (scalar) instruction class. This section is
decoded from Advanced SIMD data-processing on page F4-2541.


|31 30 29 28|27 26 25 24|23|22|21 20|19  16|15  12|11 10|9  7|6|5|4|3   0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| imm4 |  Vd  | 1  1|opc |Q|M|0|  Vm |

Decode fields    Instruction Page
opc
000              VDUP (scalar)
001              Unallocated.
01x              Unallocated.
1xx              Unallocated.





Advanced SIMD three registers of the same length 


This section describes the encoding of the Advanced SIMD three registers of the same length instruction class. This 
section is decoded from Advanced SIMD data-processing on page F4-2541. 


|31 30 29 28|27 26 25|24|23|22|21 20|19  16|15  12|11   8|7|6|5|4 |3   0|
| 1  1  1  1| 0  0  1| U| 0| D|size |  Vn  |  Vd  | opc  |N|Q|M|o1|  Vm |

Decode fields              Instruction Page
opc   o1  U   size  Q
0000  0   -   -     -      VHADD
0000  1   -   -     -      VQADD
0001  0   -   -     -      VRHADD
0001  1   0   00    -      VAND (register)
0001  1   0   01    -      VBIC (register)
0001  1   0   10    -      VORR (register)
0001  1   0   11    -      VORN (register)
0001  1   1   00    -      VEOR
0001  1   1   01    -      VBSL
0001  1   1   10    -      VBIT
0001  1   1   11    -      VBIF
0010  0   -   -     -      VHSUB
0010  1   -   -     -      VQSUB
0011  0   -   -     -      VCGT (register)
0011  1   -   -     -      VCGE (register)
0100  0   -   -     -      VSHL (register)
0100  1   -   -     -      VQSHL (register)
0101  0   -   -     -      VRSHL
0101  1   -   -     -      VQRSHL
0110  0   -   -     -      VMAX (integer)
0110  1   -   -     -      VMIN (integer)
0111  0   -   -     -      VABD (integer)
0111  1   -   -     -      VABA
1000  0   0   -     -      VADD (integer)
1000  0   1   -     -      VSUB (integer)
1000  1   0   -     -      VTST
1000  1   1   -     -      VCEQ (register)
1001  0   0   -     -      VMLA (integer)
1001  0   1   -     -      VMLS (integer)
1001  1   -   -     -      VMUL (integer and polynomial)
1010  0   -   -     0      VPMAX (integer)
1010  -   -   -     1      Unallocated.
1010  1   -   -     0      VPMIN (integer)
1011  0   0   -     -      VQDMULH
1011  0   1   -     -      VQRDMULH
1011  1   0   -     -      VPADD (integer)
1011  1   1   -     -      Unallocated.
1100  0   0   00    -      SHA1C
1100  0   0   01    -      SHA1P
1100  0   0   10    -      SHA1M
1100  0   0   11    -      SHA1SU0
1100  0   1   00    -      SHA256H
1100  0   1   01    -      SHA256H2
1100  0   1   10    -      SHA256SU1
1100  1   0   00    -      VFMA
1100  1   0   01    -      Unallocated.
1100  1   0   10    -      VFMS
1100  1   0   11    -      Unallocated.
1100  1   1   -     -      Unallocated.
1101  0   0   0x    -      VADD (floating-point)
1101  0   0   1x    -      VSUB (floating-point)
1101  0   1   0x    -      VPADD (floating-point)
1101  0   1   1x    -      VABD (floating-point)
1101  1   0   00    -      VMLA (floating-point)
1101  1   0   01    -      Unallocated.
1101  1   0   10    -      VMLS (floating-point)
1101  1   0   11    -      Unallocated.
1101  1   1   00    -      VMUL (floating-point)
1101  1   1   01    -      Unallocated.
1110  0   0   0x    -      VCEQ (register)
1110  0   0   1x    -      Unallocated.
1110  0   1   0x    -      VCGE (register)
1110  0   1   1x    -      VCGT (register)
1110  1   1   00    -      VACGE
1110  1   1   01    -      Unallocated.
1110  1   1   10    -      VACGT
1110  1   1   11    -      Unallocated.
1111  0   0   00    -      VMAX (floating-point)
1111  0   0   01    -      Unallocated.
1111  0   0   10    -      VMIN (floating-point)
1111  0   0   11    -      Unallocated.
1111  0   1   00    0      VPMAX (floating-point)
1111  0   1   01    0      Unallocated.
1111  0   1   -     1      Unallocated.
1111  0   1   10    0      VPMIN (floating-point)
1111  0   1   11    0      Unallocated.
1111  1   0   0x    -      VRECPS
1111  1   0   1x    -      VRSQRTS
1111  1   1   00    -      VMAXNM
1111  1   1   01    -      Unallocated.
1111  1   1   10    -      VMINNM
1111  1   1   11    -      Unallocated.








Advanced SIMD one register and modified immediate 


This section describes the encoding of the Advanced SIMD one register and modified immediate instruction class. 
This section is decoded from Advanced SIMD data-processing on page F4-2541. 


|31 30 29 28|27 26 25|24|23|22|21 20 19|18  16|15  12|11    8|7|6|5 |4|3    0|
| 1  1  1  1| 0  0  1| i| 1| D| 0  0  0| imm3 |  Vd  | cmode |0|Q|op|1| imm4 |

Decode fields    Instruction Page
op  cmode
0   0xx0         VMOV (immediate)
0   0xx1         VORR (immediate)
0   10x0         VMOV (immediate)
0   10x1         VORR (immediate)
0   11xx         VMOV (immediate)
1   0xx0         VMVN (immediate)
1   0xx1         VBIC (immediate)
1   10x0         VMVN (immediate)
1   10x1         VBIC (immediate)
1   110x         VMVN (immediate)
1   1110         VMOV (immediate)
1   1111         UNDEFINED.
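
The twelve rows above collapse into a few tests on op and cmode. The following is a sketch only, assuming that op is bit 5 and cmode is bits 11:8 of the instruction word; the function name `decode_simd_modified_immediate` is hypothetical, and the fixed bits of the class are not checked.

```python
# Sketch: decode the one-register-and-modified-immediate class.
# Assumption: op = bit 5, cmode = bits 11:8.
def decode_simd_modified_immediate(instr: int) -> str:
    op = (instr >> 5) & 0x1
    cmode = (instr >> 8) & 0xF
    if cmode >= 0b1100:                      # cmode == 11xx
        if op == 0:
            return "VMOV (immediate)"
        if cmode == 0b1111:
            return "UNDEFINED"
        return "VMOV (immediate)" if cmode == 0b1110 else "VMVN (immediate)"
    if cmode & 1:                            # cmode == 0xx1 or 10x1
        return "VBIC (immediate)" if op else "VORR (immediate)"
    return "VMVN (immediate)" if op else "VMOV (immediate)"
```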


Advanced SIMD three registers of different lengths 


This section describes the encoding of the Advanced SIMD three registers of different lengths instruction class. This 
section is decoded from Advanced SIMD data-processing on page F4-2541. 


|31 30 29 28|27 26 25|24|23|22|21 20|19  16|15  12|11   8|7|6|5|4|3   0|
| 1  1  1  1| 0  0  1| U| 1| D|!= 11|  Vn  |  Vd  | opc  |N|0|M|0|  Vm |
size: bits[21:20]

Decode fields    Instruction Page
opc   U
0000  -          VADDL
0001  -          VADDW
0010  -          VSUBL
0011  -          VSUBW
0100  0          VADDHN
0100  1          VRADDHN
0101  -          VABAL
0110  0          VSUBHN
0110  1          VRSUBHN
0111  -          VABDL (integer)
1000  -          VMLAL (integer)
1001  0          VQDMLAL
1001  1          Unallocated.
1010  -          VMLSL (integer)
1011  0          VQDMLSL
1011  1          Unallocated.
11x0  -          VMULL (integer and polynomial)
1101  0          VQDMULL
1101  1          Unallocated.
1111  -          Unallocated.


Advanced SIMD two registers and a scalar 


This section describes the encoding of the Advanced SIMD two registers and a scalar instruction class. This section
is decoded from Advanced SIMD data-processing on page F4-2541.


|31 30 29 28|27 26 25|24|23|22|21 20|19  16|15  12|11   8|7|6|5|4|3   0|
| 1  1  1  1| 0  0  1| Q| 1| D|!= 11|  Vn  |  Vd  | opc  |N|1|M|0|  Vm |
size: bits[21:20]

Decode fields    Instruction Page
opc   Q
000x  -          VMLA (by scalar)
0010  -          VMLAL (by scalar)
0011  0          VQDMLAL
0011  1          Unallocated.
010x  -          VMLS (by scalar)
0110  -          VMLSL (by scalar)
0111  0          VQDMLSL
0111  1          Unallocated.
100x  -          VMUL (by scalar)
1010  -          VMULL (by scalar)
1011  0          VQDMULL
1011  1          Unallocated.
1100  -          VQDMULH
1101  -          VQRDMULH
111x  -          Unallocated.
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F4-2549 


ID092916


Non-Confidential 




Advanced SIMD two registers and shift amount 


This section describes the encoding of the Advanced SIMD two registers and shift amount instruction class. This
section is decoded from Advanced SIMD data-processing on page F4-2541. This decode imposes the constraint that
imm3H:L != '0000'.


|31 30 29 28|27 26 25|24|23|22|21   19|18   16|15  12|11   8|7|6|5|4|3   0|
| 1  1  1  1| 0  0  1| U| 1| D| imm3H | imm3L |  Vd  | opc  |L|Q|M|1|  Vm |

Decode fields                Instruction Page
opc   L   imm3L  Q   U
0000  -   -      -   -       VSHR
0001  -   -      -   -       VSRA
0010  -   -      -   -       VRSHR
0011  -   -      -   -       VRSRA
0100  -   -      -   1       VSRI
0101  -   -      -   0       VSHL (immediate)
0101  -   -      -   1       VSLI
0110  -   -      -   1       VQSHL, VQSHLU (immediate)
0111  -   -      -   -       VQSHL, VQSHLU (immediate)
1000  0   -      0   0       VSHRN
1000  0   -      0   1       VQSHRN, VQSHRUN - Unsigned result variant
1000  0   -      1   0       VRSHRN
1000  0   -      1   1       VQRSHRN, VQRSHRUN - Unsigned result variant
1001  0   -      0   -       VQSHRN, VQSHRUN - Signed result variant
1001  0   -      1   -       VQRSHRN, VQRSHRUN - Signed result variant
1010  0   -      0   -       VSHLL
1010  0   000    0   -       VMOVL
111x  0   -      -   -       VCVT (between floating-point and fixed-point, Advanced SIMD)


F4.8.3 Memory hints and barriers 
This section describes the encoding of the Memory hints and barriers group. This section is decoded from
Unconditional instructions on page F4-2540.


|31        26|25   21|20|19            5|4  |3   0|
| 1 1 1 1 0 1|  op0  | 1|               |op1|     |

Decode fields    Decode group or instruction page
op0    op1
00x11  -         UNPREDICTABLE.
01011  -         Barriers
01111  -         UNPREDICTABLE.
0xx01  -         UNPREDICTABLE.
0xxx0  -         Preload (immediate) on page F4-2552
1xxx0  0         Preload (register) on page F4-2552
1xxx1  0         UNPREDICTABLE.
1xxxx  1         Unallocated.





Barriers 


This section describes the encoding of the Barriers instruction class. This section is decoded from Memory hints and 
barriers. 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7    4|3   0|

Decode fields    Instruction Page
opcode
0000             UNPREDICTABLE.
0001             CLREX
001x             UNPREDICTABLE.
0100             DSB
0101             DMB
0110             ISB
0111             UNPREDICTABLE.
1xxx             UNPREDICTABLE.
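
Only four opcode values in the table above select an instruction, so the decode reduces to a dictionary with a default. This sketch assumes that the opcode field is bits 7:4 of the instruction word, which is not shown in the diagram above; the names `BARRIER_OPCODES` and `decode_barrier` are hypothetical.

```python
# Sketch: map the 4-bit opcode field to a barrier instruction.
# Assumption: opcode = bits 7:4 of the instruction word.
BARRIER_OPCODES = {0b0001: "CLREX", 0b0100: "DSB", 0b0101: "DMB", 0b0110: "ISB"}


def decode_barrier(instr: int) -> str:
    opcode = (instr >> 4) & 0xF
    # Every opcode value not listed above is UNPREDICTABLE in the table.
    return BARRIER_OPCODES.get(opcode, "UNPREDICTABLE")
```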


Preload (immediate) 


This section describes the encoding of the Preload (immediate) instruction class. This section is decoded from
Memory hints and barriers on page F4-2551.


|31 30 29 28|27 26 25|24|23|22|21 20|19  16|15 14 13 12 |11        0|
| 1  1  1  1| 0  1  0| D| U| R| 0  1|  Rn  |(1)(1)(1)(1)|   imm12   |

Decode fields      Instruction Page
D  R  Rn
0  0  -            Reserved hint, behaves as NOP.
0  1  -            PLI (immediate, literal)
1  -  1111         PLD (literal)
1  0  != 1111      PLD, PLDW (immediate) - Preload write variant
1  1  != 1111      PLD, PLDW (immediate) - Preload read variant
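
A brief sketch of the Preload (immediate) decode, assuming that D is bit 24, R is bit 22, and Rn is bits 19:16 of the instruction word. The function name `decode_preload_immediate` is hypothetical, and the fixed bits of the class are not checked.

```python
# Sketch: decode the Preload (immediate) class from D, R and Rn.
# Assumption: D = bit 24, R = bit 22, Rn = bits 19:16.
def decode_preload_immediate(instr: int) -> str:
    d = (instr >> 24) & 0x1
    r = (instr >> 22) & 0x1
    rn = (instr >> 16) & 0xF
    if d == 0:
        return "PLI (immediate, literal)" if r else "Reserved hint, behaves as NOP"
    if rn == 0b1111:
        return "PLD (literal)"
    variant = "read" if r else "write"
    return f"PLD, PLDW (immediate) - Preload {variant} variant"
```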





Preload (register) 


This section describes the encoding of the Preload (register) instruction class. This section is decoded from Memory
hints and barriers on page F4-2551.


|31 30 29 28|27 26 25|24|23|22|21 20|19  16|15 14 13 12 |11    7|6  5|4|3   0|
| 1  1  1  1| 0  1  1| D| U|o2| 0  1|  Rn  |(1)(1)(1)(1)| imm5  |type|0|  Rm |

Decode fields    Instruction Page
o2  D
0   0            Reserved hint, behaves as NOP.
0   1            PLD, PLDW (register) - Preload write, rotate right with extend variant
1   0            PLI (register)
1   1            PLD, PLDW (register) - Preload read, rotate right with extend variant
























































F4.8.4 Advanced SIMD element or structure Load/Store 
This section describes the encoding of the Advanced SIMD element or structure Load/Store group. This section is 
decoded from Unconditional instructions on page F4-2540. 
|31             24|23 |22|21 20|19      12|11 10|9    4|3   0|
| 1 1 1 1 0 1 0 0 |op0|  |     |          | op1 |      | op2 |

Decode fields          Decode group or instruction page
op0  op1    op2
0    -      1101       Advanced SIMD Load/Store multiple structures (immediate, post-indexed)
0    -      1111       Advanced SIMD Load/Store multiple structures (no writeback) on page F4-2554
0    -      != 11x1    Advanced SIMD Load/Store multiple structures (register, post-indexed) on page F4-2555
1    11     1101       Advanced SIMD load single structure to all lanes (immediate, post-indexed) on page F4-2556
1    11     1111       Advanced SIMD load single structure to all lanes (no writeback) on page F4-2556
1    11     != 11x1    Advanced SIMD load single structure to all lanes (register, post-indexed) on page F4-2557
1    != 11  1101       Advanced SIMD Load/Store single structure to one lane (immediate, post-indexed) on page F4-2557
1    != 11  1111       Advanced SIMD Load/Store single structure to one lane (no writeback) on page F4-2558
1    != 11  != 11x1    Advanced SIMD Load/Store single structure to one lane (register, post-indexed) on page F4-2558

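The nine rows above are the product of two independent choices: op0 and op1 select the kind of transfer, and op2 selects the addressing form. A minimal sketch of that structure follows, assuming that op0 is bit 23, op1 is bits 11:10, and op2 is bits 3:0 of the instruction word; the function name `decode_simd_load_store` is hypothetical.

```python
# Sketch: decode the Advanced SIMD element or structure Load/Store group.
# Assumption: op0 = bit 23, op1 = bits 11:10, op2 = bits 3:0.
def decode_simd_load_store(instr: int) -> str:
    op0 = (instr >> 23) & 0x1
    op1 = (instr >> 10) & 0x3
    op2 = instr & 0xF
    if op0 == 0:
        kind = "Load/Store multiple structures"
    elif op1 == 0b11:
        kind = "load single structure to all lanes"
    else:
        kind = "Load/Store single structure to one lane"
    if op2 == 0b1101:
        addressing = "immediate, post-indexed"
    elif op2 == 0b1111:
        addressing = "no writeback"
    else:                                    # op2 != 11x1
        addressing = "register, post-indexed"
    return f"Advanced SIMD {kind} ({addressing})"
```
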
Advanced SIMD Load/Store multiple structures (immediate, post-indexed) 
This section describes the encoding of the Advanced SIMD Load/Store multiple structures (immediate, 
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store. 
|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11   8|7  6|5   4|3 2 1 0|
| 1  1  1  1| 0  1  0  0| 0| D| L| 0|  Rn  |  Vd  | type |size|align|1 1 0 1|

Decode fields             Instruction Page
L  type
0  0010, 011x, 1010       VST1 (multiple single elements) - Post-indexed variant
0  0011, 100x             VST2 (multiple 2-element structures) - Post-indexed variant
0  000x                   VST4 (multiple 4-element structures) - Post-indexed variant
0  010x                   VST3 (multiple 3-element structures) - Post-indexed variant
1  0010, 011x, 1010       VLD1 (multiple single elements) - Post-indexed variant
1  0011, 100x             VLD2 (multiple 2-element structures) - Post-indexed variant
1  000x                   VLD4 (multiple 4-element structures) - Post-indexed variant
1  010x                   VLD3 (multiple 3-element structures) - Post-indexed variant
-  1011                   Unallocated.
-  11xx                   Unallocated.





Advanced SIMD Load/Store multiple structures (no writeback) 


This section describes the encoding of the Advanced SIMD Load/Store multiple structures (no writeback) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11   8|7  6|5   4|3 2 1 0|
| 1  1  1  1| 0  1  0  0| 0| D| L| 0|  Rn  |  Vd  | type |size|align|1 1 1 1|

Decode fields             Instruction Page
L  type
0  0010, 011x, 1010       VST1 (multiple single elements) - Offset variant
0  0011, 100x             VST2 (multiple 2-element structures) - Offset variant
0  000x                   VST4 (multiple 4-element structures) - Offset variant
0  010x                   VST3 (multiple 3-element structures) - Offset variant
1  0010, 011x, 1010       VLD1 (multiple single elements) - Offset variant
1  0011, 100x             VLD2 (multiple 2-element structures) - Offset variant
1  000x                   VLD4 (multiple 4-element structures) - Offset variant
1  010x                   VLD3 (multiple 3-element structures) - Offset variant
-  1011                   Unallocated.
-  11xx                   Unallocated.


Advanced SIMD Load/Store multiple structures (register, post-indexed) 


This section describes the encoding of the Advanced SIMD Load/Store multiple structures (register, post-indexed) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11   8|7  6|5   4|3   0|
| 1  1  1  1| 0  1  0  0| 0| D| L| 0|  Rn  |  Vd  | type |size|align|  Rm |

Decode fields             Instruction Page
L  type
0  0010, 011x, 1010       VST1 (multiple single elements) - Post-indexed variant
0  0011, 100x             VST2 (multiple 2-element structures) - Post-indexed variant
0  000x                   VST4 (multiple 4-element structures) - Post-indexed variant
0  010x                   VST3 (multiple 3-element structures) - Post-indexed variant
1  0010, 011x, 1010       VLD1 (multiple single elements) - Post-indexed variant
1  0011, 100x             VLD2 (multiple 2-element structures) - Post-indexed variant
1  000x                   VLD4 (multiple 4-element structures) - Post-indexed variant
1  010x                   VLD3 (multiple 3-element structures) - Post-indexed variant
-  1011                   Unallocated.
-  11xx                   Unallocated.


Advanced SIMD load single structure to all lanes (immediate, post-indexed) 


This section describes the encoding of the Advanced SIMD load single structure to all lanes (immediate, 
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on 
page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10|9 8|7  6|5|4|3 2 1 0|
| 1  1  1  1| 0  1  0  0| 1| D| L| 0|  Rn  |  Vd  | 1  1| N |size|T|a|1 1 0 1|

Decode fields    Instruction Page
L  N   a
0  -   -         Unallocated.
1  00  -         VLD1 (single element to all lanes)
1  01  -         VLD2 (single 2-element structure to all lanes)
1  10  0         VLD3 (single 3-element structure to all lanes)
1  10  1         Unallocated.
1  11  -         VLD4 (single 4-element structure to all lanes)





Advanced SIMD load single structure to all lanes (no writeback) 


This section describes the encoding of the Advanced SIMD load single structure to all lanes (no writeback) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10|9 8|7  6|5|4|3 2 1 0|
| 1  1  1  1| 0  1  0  0| 1| D| L| 0|  Rn  |  Vd  | 1  1| N |size|T|a|1 1 1 1|

Decode fields    Instruction Page
L  N   a
0  -   -         Unallocated.
1  00  -         VLD1 (single element to all lanes)
1  01  -         VLD2 (single 2-element structure to all lanes)
1  10  0         VLD3 (single 3-element structure to all lanes)
1  10  1         Unallocated.
1  11  -         VLD4 (single 4-element structure to all lanes)


Advanced SIMD load single structure to all lanes (register, post-indexed) 


This section describes the encoding of the Advanced SIMD load single structure to all lanes (register, post-indexed) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10|9 8|7  6|5|4|3   0|
| 1  1  1  1| 0  1  0  0| 1| D| L| 0|  Rn  |  Vd  | 1  1| N |size|T|a|  Rm |

Decode fields    Instruction Page
L  N   a
0  -   -         Unallocated.
1  00  -         VLD1 (single element to all lanes)
1  01  -         VLD2 (single 2-element structure to all lanes)
1  10  0         VLD3 (single 3-element structure to all lanes)
1  10  1         Unallocated.
1  11  -         VLD4 (single 4-element structure to all lanes)





Advanced SIMD Load/Store single structure to one lane (immediate, post-indexed) 


This section describes the encoding of the Advanced SIMD Load/Store single structure to one lane (immediate, 
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on 
page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10|9 8|7         4|3 2 1 0|
| 1  1  1  1| 0  1  0  0| 1| D| L| 0|  Rn  |  Vd  |!= 11| N |index_align|1 1 0 1|
size: bits[11:10]

Decode fields    Instruction Page
L  N
0  00            VST1 (single element from one lane)
0  01            VST2 (single 2-element structure from one lane)
0  10            VST3 (single 3-element structure from one lane)
0  11            VST4 (single 4-element structure from one lane)
1  00            VLD1 (single element to one lane)
1  01            VLD2 (single 2-element structure to one lane)
1  10            VLD3 (single 3-element structure to one lane)
1  11            VLD4 (single 4-element structure to one lane)


Advanced SIMD Load/Store single structure to one lane (no writeback) 


This section describes the encoding of the Advanced SIMD Load/Store single structure to one lane (no writeback) 
instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on page F4-2553. 


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10|9 8|7         4|3 2 1 0|
| 1  1  1  1| 0  1  0  0| 1| D| L| 0|  Rn  |  Vd  |!= 11| N |index_align|1 1 1 1|
size: bits[11:10]

Decode fields    Instruction Page
L  N
0  00            VST1 (single element from one lane)
0  01            VST2 (single 2-element structure from one lane)
0  10            VST3 (single 3-element structure from one lane)
0  11            VST4 (single 4-element structure from one lane)
1  00            VLD1 (single element to one lane)
1  01            VLD2 (single 2-element structure to one lane)
1  10            VLD3 (single 3-element structure to one lane)
1  11            VLD4 (single 4-element structure to one lane)





Advanced SIMD Load/Store single structure to one lane (register, post-indexed) 


This section describes the encoding of the Advanced SIMD Load/Store single structure to one lane (register,
post-indexed) instruction class. This section is decoded from Advanced SIMD element or structure Load/Store on
page F4-2553.


|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10|9 8|7         4|3   0|
| 1  1  1  1| 0  1  0  0| 1| D| L| 0|  Rn  |  Vd  |!= 11| N |index_align|  Rm |
size: bits[11:10]

Decode fields    Instruction Page
L  N
0  00            VST1 (single element from one lane)
0  01            VST2 (single 2-element structure from one lane)
0  10            VST3 (single 3-element structure from one lane)
0  11            VST4 (single 4-element structure from one lane)
1  00            VLD1 (single element to one lane)
1  01            VLD2 (single 2-element structure to one lane)
1  10            VLD3 (single 3-element structure to one lane)
1  11            VLD4 (single 4-element structure to one lane)
Chapter F5
T32 and A32 Base Instruction Set Instruction Descriptions


This chapter describes each instruction. It contains the following sections:

• Alphabetical list of T32 and A32 base instruction set instructions on page F5-2560.
• Encoding and use of Banked register transfer instructions on page F5-3228.
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


This section lists every instruction in the T32 and A32 base instruction sets. For details of the format used see 
Format of instruction descriptions on page F2-2402. 


This section is formatted so that each instruction description starts on a new page. 
F5.1.1 ADC, ADCS (immediate)


Add with Carry (immediate) adds an immediate value and the Carry flag value to a register value, and writes the 
result to the destination register. 


If the destination register is not the PC, the ADCS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

• The ADC variant of the instruction is an interworking branch, see Pseudocode description of operations on
  the AArch32 general-purpose registers and the PC on page E1-2293.

• The ADCS variant of the instruction performs an exception return without the use of the stack. In this case:
  - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
  - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
    AArch32 state on page G1-3835.
  - The instruction is UNDEFINED in Hyp mode.
  - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 0 |
| cond != 1111 | 0 0 1 0 | 1 0 1 S | Rn | Rd | imm12 |


ADC variant
Applies when S == 0.

ADC{<c>}{<q>} {<Rd>,} <Rn>, #<const>

ADCS variant
Applies when S == 1.

ADCS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = A32ExpandImm(imm12);

T1
| 15 14 13 12 11 | 10 | 9 | 8 7 6 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 0 |
| 1 1 1 1 0 | i | 0 | 1 0 1 0 | S | Rn | 0 | imm3 | Rd | imm8 |


ADC variant 
Applies when S == 0. 


ADC{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


ADCS variant 


Applies when S == 1. 
ADCS{<c>}{<q>} {<Rd>,} <Rn>, #<const>


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural
Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c>       See Standard assembler syntax fields on page F2-2406.

<q>       See Standard assembler syntax fields on page F2-2406.

<Rd>      For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
          PC is used:
          • For the ADC variant, the instruction is a branch to the address calculated by the operation.
            This is an interworking branch, see Pseudocode description of operations on the AArch32
            general-purpose registers and the PC on page E1-2293.
          • For the ADCS variant, the instruction performs an exception return, that restores PSTATE
            from SPSR_<current_mode>.
          For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>.

<Rn>      For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
          used, but this is deprecated.
          For encoding T1: is the general-purpose source register, encoded in the "Rn" field.

<const>   For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
          page F2-2422 for the range of values.
          For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on
          page F2-2420 for the range of values.
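
The A1 decode calls A32ExpandImm(imm12) to turn the 12-bit field into a 32-bit constant. As a rough illustration, the following Python sketch models the usual A32 modified-immediate rule: an 8-bit value rotated right by twice the 4-bit rotation field. The function names ror32 and a32_expand_imm are hypothetical, and the sketch ignores the carry output that the architectural function also produces.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits (taken modulo 32)."""
    amount %= 32
    value &= MASK32
    if amount == 0:
        return value
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def a32_expand_imm(imm12: int) -> int:
    """Expand an A32 modified immediate: imm12<7:0> rotated right by 2 * imm12<11:8>."""
    imm8 = imm12 & 0xFF
    rotation = (imm12 >> 8) & 0xF
    return ror32(imm8, 2 * rotation)
```

For example, a32_expand_imm(0x4FF) rotates 0xFF right by 8 bits and gives 0xFF000000.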


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(R[n], imm32, PSTATE.C);
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
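
The operation relies on AddWithCarry to produce both the result and the N, Z, C, and V flags. The following Python sketch shows one way to model that function for 32-bit operands. It is an illustration only: the name add_with_carry, the tuple layout of the flags, and the helper to_signed are assumptions of this sketch.

```python
def to_signed(value: int, bits: int = 32) -> int:
    """Interpret an unsigned bit pattern as a two's complement integer."""
    return value - (1 << bits) if (value >> (bits - 1)) & 1 else value


def add_with_carry(x: int, y: int, carry_in: int, bits: int = 32):
    """Return (result, (N, Z, C, V)) for x + y + carry_in on bits-wide operands."""
    mask = (1 << bits) - 1
    x &= mask
    y &= mask
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x, bits) + to_signed(y, bits) + carry_in
    result = unsigned_sum & mask
    n = (result >> (bits - 1)) & 1
    z = int(result == 0)
    c = int(result != unsigned_sum)                    # unsigned overflow
    v = int(to_signed(result, bits) != signed_sum)     # signed overflow
    return result, (n, z, c, v)
```

With this model, ADCS on 0xFFFFFFFF and 0 with the Carry flag set gives a zero result with Z and C set.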





F5-2562 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 
Non-Confidential 


ARM DDI 0487A.k _iss10775 
1ID092916 




F5.1.2 ADC, ADCS (register) 


Add with Carry (register) adds a register value, the Carry flag value, and an optionally-shifted register value, and 
writes the result to the destination register. 


If the destination register is not the PC, the ADCS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

• The ADC variant of the instruction is an interworking branch, see Pseudocode description of operations on
  the AArch32 general-purpose registers and the PC on page E1-2293.

• The ADCS variant of the instruction performs an exception return without the use of the stack. In this case:
  - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
  - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
    AArch32 state on page G1-3835.
  - The instruction is UNDEFINED in Hyp mode.
  - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 7 | 6 5 | 4 | 3 0 |
| cond != 1111 | 0 0 0 0 | 1 0 1 S | Rn | Rd | imm5 | type | 0 | Rm |


ADC, rotate right with extend variant 


Applies when S == 0 && imm5 == 00000 && type == 11.


ADC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ADC, shift or rotate by value variant 


Applies when S == 0 && !(imm5 == 00000 && type == 11). 


ADC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ADCS, rotate right with extend variant 


Applies when S == 1 && imm5 == 00000 && type == 11.


ADCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ADCS, shift or rotate by value variant 


Applies when S == 1 && !(imm5 == 00000 && type == 11). 


ADCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);

T1
| 15 14 13 12 11 10 | 9 8 7 6 | 5 3 | 2 0 |
| 0 1 0 0 0 0 | 0 1 0 1 | Rm | Rdn |

T1 variant

ADC<c>{<q>} {<Rdn>,} <Rdn>, <Rm> // Inside IT block
ADCS{<q>} {<Rdn>,} <Rdn>, <Rm> // Outside IT block

Decode for this encoding

d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);


T2
| 15 14 13 12 11 | 10 9 | 8 7 6 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 6 | 5 4 | 3 0 |
| 1 1 1 0 1 | 0 1 | 1 0 1 0 | S | Rn | (0) | imm3 | Rd | imm2 | type | Rm |


ADC, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


ADC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ADC, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).


ADC<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
ADC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ADCS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11. 


ADCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ADCS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11).


ADCS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
ADCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 
Assembler symbols

<c>       See Standard assembler syntax fields on page F2-2406.

<q>       See Standard assembler syntax fields on page F2-2406.

<Rdn>     Is the first general-purpose source register and the destination register, encoded in the "Rdn" field.

<Rd>      For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
          PC is used:
          • For the ADC variant, the instruction is a branch to the address calculated by the operation.
            This is an interworking branch, see Pseudocode description of operations on the AArch32
            general-purpose registers and the PC on page E1-2293.
          • For the ADCS variant, the instruction performs an exception return, that restores PSTATE
            from SPSR_<current_mode>.
          For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>.

<Rn>      For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
          be used, but this is deprecated.
          For encoding T2: is the first general-purpose source register, encoded in the "Rn" field.

<Rm>      For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
          can be used, but this is deprecated.
          For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field.

<shift>   Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
          the following values:
          LSL    when type = 00
          LSR    when type = 01
          ASR    when type = 10
          ROR    when type = 11

<amount>  For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
          (when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
          For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
          (when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.

In T32 assembly:

• Outside an IT block, if ADCS <Rd>, <Rn>, <Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
  using encoding T1 as though ADCS <Rd>, <Rn> had been written.

• Inside an IT block, if ADC<c> <Rd>, <Rn>, <Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
  using encoding T1 as though ADC<c> <Rd>, <Rn> had been written.

To prevent either of these happening, use the .W qualifier.
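
The <shift> and <amount> descriptions above, together with the RRX variant, correspond to the DecodeImmShift call in the decode pseudocode. The following Python sketch illustrates that mapping from the 2-bit type field and the 5-bit immediate to a shift type and amount. The function name decode_imm_shift and the string shift names are assumptions of this sketch.

```python
def decode_imm_shift(shift_type: int, imm5: int):
    """Map (type, imm5) to (shift name, shift amount) for an immediate shift."""
    if shift_type == 0b00:
        return ("LSL", imm5)
    if shift_type == 0b01:
        # An encoded amount of 0 means LSR #32.
        return ("LSR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b10:
        # An encoded amount of 0 means ASR #32.
        return ("ASR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b11:
        # type == 11 with imm5 == 00000 selects the RRX variant.
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("shift_type must be a 2-bit value")
```

This matches the statement that <amount> is encoded modulo 32: an LSR or ASR by 32 is stored as an imm5 value of 0.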


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], shifted, PSTATE.C);
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
F5.1.3 ADC, ADCS (register-shifted register)


Add with Carry (register-shifted register) adds a register value, the Carry flag value, and a register-shifted register 
value. It writes the result to the destination register, and can optionally update the condition flags based on the result. 


A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 | 6 5 | 4 | 3 0 |
| cond != 1111 | 0 0 0 0 | 1 0 1 S | Rn | Rd | Rs | 0 | type | 1 | Rm |


Flag setting variant 
Applies when S == 1. 


ADCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


ADC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>       See Standard assembler syntax fields on page F2-2406.

<q>       See Standard assembler syntax fields on page F2-2406.

<Rd>      Is the general-purpose destination register, encoded in the "Rd" field.

<Rn>      Is the first general-purpose source register, encoded in the "Rn" field.

<Rm>      Is the second general-purpose source register, encoded in the "Rm" field.

<type>    Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
          the following values:
          LSL    when type = 00
          LSR    when type = 01
          ASR    when type = 10
          ROR    when type = 11

<Rs>      Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
          the "Rs" field.
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], shifted, PSTATE.C);
    R[d] = result;
    if setflags then
        PSTATE.<N,Z,C,V> = nzcv;
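
In the register-shifted register form, the shift amount is the bottom byte of <Rs>, so it can range from 0 to 255. The following Python sketch illustrates how a 32-bit Shift could behave over that range. The names shift32 and reg_shifted_operand are hypothetical, shift types are passed as strings for readability, and the carry output of the architectural function is not modelled.

```python
MASK32 = 0xFFFFFFFF


def shift32(value: int, shift_t: str, amount: int, carry_in: int) -> int:
    """Shift a 32-bit value; amounts of 32 or more are allowed for LSL, LSR, ASR, ROR."""
    value &= MASK32
    if shift_t == "RRX":
        # Rotate right by one bit through the carry; amount is expected to be 1.
        return ((carry_in & 1) << 31) | (value >> 1)
    if amount == 0:
        return value
    if shift_t == "LSL":
        return (value << amount) & MASK32          # 0 once amount >= 32
    if shift_t == "LSR":
        return value >> amount                     # 0 once amount >= 32
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32         # sign fill once amount >= 32
    if shift_t == "ROR":
        m = amount % 32
        return value if m == 0 else ((value >> m) | (value << (32 - m))) & MASK32
    raise ValueError("unknown shift type")


def reg_shifted_operand(rm: int, shift_t: str, rs: int, carry_in: int) -> int:
    """Apply the shift using only bits <7:0> of the Rs value as the amount."""
    return shift32(rm, shift_t, rs & 0xFF, carry_in)
```

Only the low byte of Rs matters: an Rs value of 0x104 shifts by 4, not by 260.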
F5.1.4 ADD, ADDS (immediate)


Add (immediate) adds an immediate value to a register value, and writes the result to the destination register. 


If the destination register is not the PC, the ADDS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. If the
destination register is the PC:

• The ADD variant of the instruction is an interworking branch, see Pseudocode description of operations on
  the AArch32 general-purpose registers and the PC on page E1-2293.

• The ADDS variant of the instruction performs an exception return without the use of the stack. ARM
  deprecates use of this instruction. However, in this case:
  - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
  - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
    AArch32 state on page G1-3835.
  - The instruction is UNDEFINED in Hyp mode.
  - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 0 |
| cond != 1111 | 0 0 1 0 | 1 0 0 S | Rn | Rd | imm12 |


ADD variant
Applies when S == 0 && Rn != 11x1.

ADD{<c>}{<q>} {<Rd>,} <Rn>, #<const>

ADDS variant
Applies when S == 1 && Rn != 1101.

ADDS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding

if Rn == '1111' && S == '0' then SEE ADR;
if Rn == '1101' then SEE ADD (SP plus immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = A32ExpandImm(imm12);


T1
| 15 14 13 12 11 10 9 | 8 6 | 5 3 | 2 0 |
| 0 0 0 1 1 1 0 | imm3 | Rn | Rd |

T1 variant

ADD<c>{<q>} <Rd>, <Rn>, #<imm3> // Inside IT block
ADDS{<q>} <Rd>, <Rn>, #<imm3> // Outside IT block

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); setflags = !InITBlock(); imm32 = ZeroExtend(imm3, 32);
T2
| 15 14 13 12 11 | 10 8 | 7 0 |
| 0 0 1 1 0 | Rdn | imm8 |

T2 variant


ADD<c>{<q>} <Rdn>, #<imm8> // Inside IT block, and <Rdn>, <imm8> can be represented in T1 
ADD<c>{<q>} {<Rdn>,} <Rdn>, #<imm8> // Inside IT block, and <Rdn>, <imm8> cannot be represented in T1 
ADDS{<q>} <Rdn>, #<imm8> // Outside IT block, and <Rdn>, <imm8> can be represented in T1 
ADDS{<q>} {<Rdn>,} <Rdn>, #<imm8> // Outside IT block, and <Rdn>, <imm8> cannot be represented in T1 


Decode for this encoding 


d = UInt(Rdn); n = UInt(Rdn); setflags = !InITBlock(); imm32 = ZeroExtend(imm8, 32); 
T3
| 15 14 13 12 11 | 10 | 9 | 8 7 6 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 0 |
| 1 1 1 1 0 | i | 0 | 1 0 0 0 | S | Rn != 1101 | 0 | imm3 | Rd | imm8 |


ADD variant
Applies when S == 0.

ADD<c>.W {<Rd>,} <Rn>, #<const> // Inside IT block, and <Rd>, <Rn>, <const> can be represented in T1 or T2
ADD{<c>}{<q>} {<Rd>,} <Rn>, #<const>

ADDS variant
Applies when S == 1 && Rd != 1111.

ADDS.W {<Rd>,} <Rn>, #<const> // Outside IT block, and <Rd>, <Rn>, <const> can be represented in T1 or T2
ADDS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding

if Rd == '1111' && S == '1' then SEE CMN (immediate);
if Rn == '1101' then SEE ADD (SP plus immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8);
if (d == 15 && !setflags) || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


T4
| 15 14 13 12 11 | 10 | 9 | 8 7 6 5 4 | 3 0 | 15 | 14 12 | 11 8 | 7 0 |
| 1 1 1 1 0 | i | 1 | 0 0 0 0 0 | Rn != 11x1 | 0 | imm3 | Rd | imm8 |


T4 variant

ADD{<c>}{<q>} {<Rd>,} <Rn>, #<imm12> // <imm12> cannot be represented in T1, T2, or T3
ADDW{<c>}{<q>} {<Rd>,} <Rn>, #<imm12> // <imm12> can be represented in T1, T2, or T3

Decode for this encoding

if Rn == '1111' then SEE ADR;
if Rn == '1101' then SEE ADD (SP plus immediate);
d = UInt(Rd); n = UInt(Rn); setflags = FALSE; imm32 = ZeroExtend(i:imm3:imm8, 32);
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>       See Standard assembler syntax fields on page F2-2406.

<q>       See Standard assembler syntax fields on page F2-2406.

<Rdn>     Is the general-purpose source and destination register, encoded in the "Rdn" field.

<imm8>    Is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field.

<Rd>      For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>. If the PC is used:
          • For the ADD variant, the instruction is a branch to the address calculated by the operation.
            This is an interworking branch, see Pseudocode description of operations on the AArch32
            general-purpose registers and the PC on page E1-2293.
          • For the ADDS variant, the instruction performs an exception return, that restores PSTATE
            from SPSR_<current_mode>. ARM deprecates use of this instruction.
          For encoding T1, T3 and T4: is the general-purpose destination register, encoded in the "Rd" field.
          If omitted, this register is the same as <Rn>.

<Rn>      For encoding A1 and T4: is the general-purpose source register, encoded in the "Rn" field. If the SP
          is used, see ADD, ADDS (SP plus immediate). If the PC is used, see ADR.
          For encoding T1: is the general-purpose source register, encoded in the "Rn" field.
          For encoding T3: is the general-purpose source register, encoded in the "Rn" field. If the SP is used,
          see ADD, ADDS (SP plus immediate).

<imm3>    Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "imm3" field.

<imm12>   Is a 12-bit unsigned immediate, in the range 0 to 4095, encoded in the "i:imm3:imm8" field.

<const>   For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
          page F2-2422 for the range of values.
          For encoding T3: an immediate value. See Modified immediate constants in T32 instructions on
          page F2-2420 for the range of values.


When multiple encodings of the same length are available for an instruction, encoding T3 is preferred to encoding 
T4 (if encoding T4 is required, use the ADDW syntax). Encoding T1 is preferred to encoding T2 if <Rd> is specified 
and encoding T2 is preferred to encoding T1 if <Rd> is omitted. 
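
Encoding T3 passes i:imm3:imm8 to T32ExpandImm, whereas encoding T4 zero-extends the same 12 bits. To show the difference, the following Python sketch models the usual T32 modified-immediate expansion. The name t32_expand_imm is hypothetical, the carry output is not modelled, and raising an exception for the UNPREDICTABLE patterns is a choice of this sketch only.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits (taken modulo 32)."""
    amount %= 32
    value &= MASK32
    if amount == 0:
        return value
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def t32_expand_imm(imm12: int) -> int:
    """Expand a T32 modified immediate given as the 12-bit value i:imm3:imm8."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) & 0b11 == 0b00:
        pattern = (imm12 >> 8) & 0b11
        if pattern == 0b00:
            return imm8                              # 0x000000XY
        if imm8 == 0:
            raise ValueError("UNPREDICTABLE: replicated byte is zero")
        if pattern == 0b01:
            return (imm8 << 16) | imm8               # 0x00XY00XY
        if pattern == 0b10:
            return (imm8 << 24) | (imm8 << 8)        # 0xXY00XY00
        return imm8 * 0x01010101                     # 0xXYXYXYXY
    # Otherwise: an 8-bit value with its top bit forced to 1, rotated right.
    unrotated = 0x80 | (imm12 & 0x7F)
    return ror32(unrotated, (imm12 >> 7) & 0x1F)
```

For comparison, the T4 form simply yields imm12 itself, so the 12-bit value 0x1AB means 0x00AB00AB under T3 but 0x1AB under T4.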


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        (result, nzcv) = AddWithCarry(R[n], imm32, '0');
        if d == 15 then          // Can only occur for A32 encoding
            if setflags then
                ALUExceptionReturn(result);
            else
                ALUWritePC(result);
        else
            R[d] = result;
            if setflags then
                PSTATE.<N,Z,C,V> = nzcv;
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        (result, nzcv) = AddWithCarry(R[n], imm32, '0');
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
F5.1.5 ADD, ADDS (register)


Add (register) adds a register value and an optionally-shifted register value, and writes the result to the destination
register.

If the destination register is not the PC, the ADDS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. If the
destination register is the PC:

• The ADD variant of the instruction is an interworking branch, see Pseudocode description of operations on
  the AArch32 general-purpose registers and the PC on page E1-2293.

• The ADDS variant of the instruction performs an exception return without the use of the stack. ARM
  deprecates use of this instruction. However, in this case:
  - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
  - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
    AArch32 state on page G1-3835.
  - The instruction is UNDEFINED in Hyp mode.
  - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 7 | 6 5 | 4 | 3 0 |
| cond != 1111 | 0 0 0 0 | 1 0 0 S | Rn != 1101 | Rd | imm5 | type | 0 | Rm |


ADD, rotate right with extend variant 


Applies when S == 0 && imm5 == 00000 && type == 11. 


ADD{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ADD, shift or rotate by value variant 


Applies when S == 0 && !(imm5 == 00000 && type == 11).


ADD{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ADDS, rotate right with extend variant 


Applies when S == 1 && imm5 == 00000 && type == 11. 


ADDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ADDS, shift or rotate by value variant 


Applies when S == 1 && !(imm5 == 00000 && type == 11).


ADDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


if Rn == '1101' then SEE ADD (SP plus register); 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5); 
T1
| 15 14 13 12 11 10 9 | 8 6 | 5 3 | 2 0 |
| 0 0 0 1 1 0 0 | Rm | Rn | Rd |

T1 variant

ADD<c>{<q>} <Rd>, <Rn>, <Rm> // Inside IT block
ADDS{<q>} {<Rd>,} <Rn>, <Rm> // Outside IT block

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);

T2
| 15 14 13 12 11 10 | 9 8 | 7 | 6 3 | 2 0 |
| 0 1 0 0 0 1 | 0 0 | DN | Rm != 1101 | Rdn |

T2 variant
Applies when !(DN == 1 && Rdn == 101).

ADD<c>{<q>} <Rdn>, <Rm> // Preferred syntax, Inside IT block
ADD{<c>}{<q>} {<Rdn>,} <Rdn>, <Rm>

Decode for this encoding

if (DN:Rdn) == '1101' || Rm == '1101' then SEE ADD (SP plus register);
d = UInt(DN:Rdn); n = d; m = UInt(Rm); setflags = FALSE; (shift_t, shift_n) = (SRType_LSL, 0);
if n == 15 && m == 15 then UNPREDICTABLE;
if d == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;


T3
| 15 14 13 12 11 | 10 9 | 8 7 6 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 6 | 5 4 | 3 0 |
| 1 1 1 0 1 | 0 1 | 1 0 0 0 | S | Rn != 1101 | (0) | imm3 | Rd | imm2 | type | Rm |

ADD, rotate right with extend variant
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.

ADD{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

ADD, shift or rotate by value variant
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).

ADD<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1
ADD{<c>}.W {<Rd>,} <Rn>, <Rm> // <Rd> == <Rn>, and <Rd>, <Rn>, <Rm> can be represented in T2
ADD{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

ADDS, rotate right with extend variant
Applies when S == 1 && imm3 == 000 && Rd != 1111 && imm2 == 00 && type == 11.

ADDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

ADDS, shift or rotate by value variant
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11) && Rd != 1111.

ADDS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 or T2
ADDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

Decode for all variants of this encoding

if Rd == '1111' && S == '1' then SEE CMN (register);
if Rn == '1101' then SEE ADD (SP plus register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if (d == 15 && !setflags) || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>       See Standard assembler syntax fields on page F2-2406.

<q>       See Standard assembler syntax fields on page F2-2406.

<Rdn>     Is the general-purpose source and destination register, encoded in the "DN:Rdn" field. If the PC is
          used, the instruction is a branch to the address calculated by the operation. This is a simple branch,
          see Pseudocode description of operations on the AArch32 general-purpose registers and the PC on
          page E1-2293. The assembler language allows <Rdn> to be specified once or twice in the assembler
          syntax. When used inside an IT block, and <Rdn> and <Rm> are in the range R0 to R7, <Rdn> must be
          specified once so that encoding T2 is preferred to encoding T1. In all other cases there is no
          difference in behavior when <Rdn> is specified once or twice.

<Rd>      For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>. If the PC is used:
          • For the ADD variant, the instruction is a branch to the address calculated by the operation.
            This is an interworking branch, see Pseudocode description of operations on the AArch32
            general-purpose registers and the PC on page E1-2293.
          • For the ADDS variant, the instruction performs an exception return, that restores PSTATE
            from SPSR_<current_mode>. ARM deprecates use of this instruction.
          For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. When used
          inside an IT block, <Rd> must be specified. When used outside an IT block, <Rd> is optional, and:
          • If omitted, this register is the same as <Rn>.
          • If present, encoding T1 is preferred to encoding T2.
          For encoding T3: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
          this register is the same as <Rn>.

<Rn>      For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
          be used. If the SP is used, see ADD, ADDS (SP plus register).
          For encoding T1: is the first general-purpose source register, encoded in the "Rn" field.
          For encoding T3: is the first general-purpose source register, encoded in the "Rn" field. If the SP is
          used, see ADD, ADDS (SP plus register).

<Rm>      For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
          can be used, but this is deprecated.
          For encoding T1 and T3: is the second general-purpose source register, encoded in the "Rm" field.
          For encoding T2: is the second general-purpose source register, encoded in the "Rm" field. The PC
          can be used.

<shift>   Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
          the following values:
          LSL    when type = 00
          LSR    when type = 01
          ASR    when type = 10
          ROR    when type = 11

<amount>  For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
          (when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
          For encoding T3: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
          (when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.

Inside an IT block, if ADD<c> <Rd>, <Rn>, <Rd> cannot be assembled using encoding T1, it is assembled using
encoding T2 as though ADD<c> <Rd>, <Rn> had been written. To prevent this happening, use the .W qualifier.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], shifted, '0');
    if d == 15 then
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
F5.1.6 ADD, ADDS (register-shifted register)


Add (register-shifted register) adds a register value and a register-shifted register value. It writes the result to the 
destination register, and can optionally update the condition flags based on the result. 


A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 | 6 5 | 4 | 3 0 |
| cond != 1111 | 0 0 0 0 | 1 0 0 S | Rn | Rd | Rs | 0 | type | 1 | Rm |


Flag setting variant 
Applies when S == 1. 


ADDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0. 


ADD{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], shifted, '0');
    R[d] = result;
    if setflags then
        PSTATE.<N,Z,C,V> = nzcv;







F5.1.7 ADD, ADDS (SP plus immediate) 


Add to SP (immediate) adds an immediate value to the SP value, and writes the result to the destination register.

If the destination register is not the PC, the ADDS variant of the instruction updates the condition flags based on the
result.


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. However,
when the destination register is the PC:

•    The ADD variant of the instruction is an interworking branch, see Pseudocode description of operations on
     the AArch32 general-purpose registers and the PC on page E1-2293.

•    The ADDS variant of the instruction performs an exception return without the use of the stack. ARM
     deprecates use of this instruction. However, in this case:

     -    The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.

     -    The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
          AArch32 state on page G1-3835.

     -    The instruction is UNDEFINED in Hyp mode.

     -    The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1


|31    28|27 26 25 24|23 22 21|20|19 18 17 16|15  12|11            0|
| !=1111 | 0  0  1  0| 1  0  0| S| 1  1  0  1|  Rd  |     imm12     |
   cond


ADD variant 


Applies when S == 0. 


ADD{<c>}{<q>} {<Rd>,} SP, #<const> 


ADDS variant 


Applies when S == 1. 


ADDS{<c>}{<q>} {<Rd>,} SP, #<const> 


Decode for all variants of this encoding 


d = UInt(Rd); setflags = (S == '1'); imm32 = A32ExpandImm(imm12);


T1 


|15 14 13 12 11|10   8|7          0|
| 1  0  1  0  1|  Rd  |    imm8    |


T1 variant 


ADD{<c>}{<q>} <Rd>, SP, #<imm8> 


Decode for this encoding 


d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(imm8:'00', 32);


T2


|15 14 13 12 11 10  9  8  7|6         0|
| 1  0  1  1  0  0  0  0  0|   imm7    |


T2 variant 


ADD{<c>}{<q>} {SP,} SP, #<imm7> 


Decode for this encoding 


d = 13; setflags = FALSE; imm32 = ZeroExtend(imm7:'00', 32);


T3


|15 14 13 12 11|10| 9| 8  7  6  5| 4| 3  2  1  0|15|14  12|11   8|7          0|
| 1  1  1  1  0| i| 0| 1  0  0  0| S| 1  1  0  1| 0| imm3 |  Rd  |    imm8    |


ADD variant 
Applies when S == 0.


ADD{<c>}.W {<Rd>,} SP, #<const> // <Rd>, <const> can be represented in T1 or T2 
ADD{<c>}{<q>} {<Rd>,} SP, #<const> 


ADDS variant 
Applies when S == 1 && Rd != 1111.


ADDS{<c>}{<q>} {<Rd>,} SP, #<const> 


Decode for all variants of this encoding 
if Rd == '1111' && S == '1' then SEE CMN (immediate); 


d = UInt(Rd); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8);
if d == 15 && !setflags then UNPREDICTABLE; 


T4 


|15 14 13 12 11|10| 9  8  7  6  5  4| 3  2  1  0|15|14  12|11   8|7          0|
| 1  1  1  1  0| i| 1  0  0  0  0  0| 1  1  0  1| 0| imm3 |  Rd  |    imm8    |


T4 variant 


ADD{<c>}{<q>} {<Rd>,} SP, #<imm12> // <imm12> cannot be represented in T1, T2, or T3
ADDW{<c>}{<q>} {<Rd>,} SP, #<imm12> // <imm12> can be represented in T1, T2, or T3 


Decode for this encoding 

d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(i:imm3:imm8, 32); 
if d == 15 then UNPREDICTABLE; 

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<imm7> Is the unsigned immediate, a multiple of 4, in the range 0 to 508, encoded in the "imm7" field as 
<imm7>/4. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the SP. ARM deprecates using the PC as the destination register, but if the PC is used:

•    For the ADD variant, the instruction is a branch to the address calculated by the operation.
     This is an interworking branch, see Pseudocode description of operations on the AArch32
     general-purpose registers and the PC on page E1-2293.

•    For the ADDS variant, the instruction performs an exception return, that restores PSTATE
     from SPSR_<current_mode>.

For encoding T1: is the general-purpose destination register, encoded in the "Rd" field.

For encoding T3 and T4: is the general-purpose destination register, encoded in the "Rd" field. If
omitted, this register is the SP.


<imm8> Is an unsigned immediate, a multiple of 4, in the range 0 to 1020, encoded in the "imm8" field as 
<imm8>/4. 

<imm12> Is a 12-bit unsigned immediate, in the range 0 to 4095, encoded in the "i:imm3:imm8" field. 

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.

For encoding T3: an immediate value. See Modified immediate constants in T32 instructions on
page F2-2420 for the range of values.
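
As an illustration of what "modified immediate constant" means for <const>, the following Python sketch expands a 12-bit field the way A32ExpandImm and T32ExpandImm are understood to behave. The function names a32_expand_imm, t32_expand_imm and ror32 are hypothetical, and the sketch assumes imm12 is an integer in the range 0 to 4095 (for T32, the concatenation i:imm3:imm8).

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def a32_expand_imm(imm12):
    """A32 form: an 8-bit value rotated right by twice the 4-bit rotation field."""
    rotate = (imm12 >> 8) & 0xF
    imm8 = imm12 & 0xFF
    return ror32(imm8, 2 * rotate)


def t32_expand_imm(imm12):
    """T32 form: either a replicated byte pattern or a rotated 8-bit value with its top bit set."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) == 0:
        mode = (imm12 >> 8) & 0x3
        if mode == 0:
            return imm8                        # 0x000000XY
        if imm8 == 0:
            # The replicated forms with a zero byte are UNPREDICTABLE.
            raise ValueError("UNPREDICTABLE: replicated form with imm8 == 0")
        if mode == 1:
            return (imm8 << 16) | imm8         # 0x00XY00XY
        if mode == 2:
            return (imm8 << 24) | (imm8 << 8)  # 0xXY00XY00
        return (imm8 << 24) | (imm8 << 16) | (imm8 << 8) | imm8  # 0xXYXYXYXY
    unrotated = 0x80 | (imm12 & 0x7F)
    return ror32(unrotated, (imm12 >> 7) & 0x1F)


if __name__ == "__main__":
    print(hex(a32_expand_imm(0x4FF)), hex(t32_expand_imm(0x1AB)))
```

The two schemes cover different sets of constants, which is why <const> refers to a separate table for each instruction set.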


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(SP, imm32, '0');
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;


F5.1.8 ADD, ADDS (SP plus register)


Add to SP (register) adds an optionally-shifted register value to the SP value, and writes the result to the destination
register.

If the destination register is not the PC, the ADDS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

•    The ADD variant of the instruction is an interworking branch, see Pseudocode description of operations on
     the AArch32 general-purpose registers and the PC on page E1-2293.

•    The ADDS variant of the instruction performs an exception return without the use of the stack. In this case:

     -    The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.

     -    The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
          AArch32 state on page G1-3835.

     -    The instruction is UNDEFINED in Hyp mode.

     -    The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1


|31    28|27 26 25 24|23 22 21|20|19 18 17 16|15  12|11     7|6    5| 4|3    0|
| !=1111 | 0  0  0  0| 1  0  0| S| 1  1  0  1|  Rd  |  imm5  | type | 0|  Rm  |
   cond


ADD, rotate right with extend variant
Applies when S == 0 && imm5 == 00000 && type == 11.


ADD{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX


ADD, shift or rotate by value variant
Applies when S == 0 && !(imm5 == 00000 && type == 11).


ADD{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>}


ADDS, rotate right with extend variant
Applies when S == 1 && imm5 == 00000 && type == 11.


ADDS{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX


ADDS, shift or rotate by value variant
Applies when S == 1 && !(imm5 == 00000 && type == 11).


ADDS{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>}


Decode for all variants of this encoding 


d = UInt(Rd); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);


T1


|15 14 13 12 11 10| 9  8| 7| 6  5  4  3|2    0|
| 0  1  0  0  0  1| 0  0|DM| 1  1  0  1| Rdm  |


T1 variant 


ADD{<c>}{<q>} {<Rdm>,} SP, <Rdm> 


Decode for this encoding 

d = UInt(DM:Rdm); m = UInt(DM:Rdm); setflags = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);
if d == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;


T2 


|15 14 13 12 11 10| 9  8| 7|6    3| 2  1  0|
| 0  1  0  0  0  1| 0  0| 1|  Rm  | 1  0  1|


T2 variant 


ADD{<c>}{<q>} {SP,} SP, <Rm> 


Decode for this encoding 
if Rm == '1101' then SEE encoding T1; 


d = 13; m = UInt(Rm); setflags = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);


T3 


|15 14 13 12 11 10  9| 8  7  6  5| 4| 3  2  1  0| 15|14  12|11   8|7    6|5    4|3    0|
| 1  1  1  0  1  0  1| 1  0  0  0| S| 1  1  0  1|(0)| imm3 |  Rd  | imm2 | type |  Rm  |


ADD, rotate right with extend variant
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


ADD{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX


ADD, shift or rotate by value variant
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).


ADD{<c>}.W {<Rd>,} SP, <Rm> // <Rd>, <Rm> can be represented in T1 or T2
ADD{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>}


ADDS, rotate right with extend variant
Applies when S == 1 && imm3 == 000 && Rd != 1111 && imm2 == 00 && type == 11.


ADDS{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX


ADDS, shift or rotate by value variant
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11) && Rd != 1111.


ADDS{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding

if Rd == '1111' && S == '1' then SEE CMN (register);
d = UInt(Rd); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if (d == 15 && !setflags) || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rdm> Is the general-purpose destination and second source register, encoded in the "Rdm" field. If
omitted, this register is the SP. ARM deprecates using the PC as the destination register, but if the
PC is used, the instruction is a branch to the address calculated by the operation. This is a simple
branch, see Pseudocode description of operations on the AArch32 general-purpose registers and the
PC on page E1-2293.


<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the SP. ARM deprecates using the PC as the destination register, but if the PC is used: 


° For the ADD variant, the instruction is a branch to the address calculated by the operation. 
This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 


° For the ADDS variant, the instruction performs an exception return, that restores PSTATE 
from SPSR_<current_mode>. 


For encoding T3: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the SP. 


<Rm> For encoding A1 and T2: is the second general-purpose source register, encoded in the "Rm" field.
The PC can be used, but this is deprecated. 


For encoding T3: is the second general-purpose source register, encoded in the "Rm" field. 


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T3: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.
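
The "<amount> modulo 32" encoding and its decode can be sketched as follows. This is a minimal Python model of how DecodeImmShift is understood to treat the type and 5-bit immediate fields; the names decode_imm_shift and encode_amount are hypothetical, and shift types are represented as plain strings rather than SRType values.

```python
def decode_imm_shift(type_bits, imm5):
    """Map the 2-bit type field and 5-bit immediate to (shift type, shift amount)."""
    if type_bits == 0b00:
        return ("LSL", imm5)
    if type_bits == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b11:
        # An immediate of zero with type 11 selects rotate right with extend.
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("type field must be a 2-bit value")


def encode_amount(shift, amount):
    """Encode an assembler <amount> for <shift> into the 5-bit immediate field."""
    if shift not in ("LSL", "LSR", "ASR", "ROR"):
        raise ValueError(f"unsupported shift: {shift}")
    limit = 31 if shift in ("LSL", "ROR") else 32
    if not 1 <= amount <= limit:
        raise ValueError(f"{shift} amount must be in the range 1 to {limit}")
    return amount % 32


if __name__ == "__main__":
    print(decode_imm_shift(0b10, encode_amount("ASR", 32)))
```

An LSR or ASR amount of 32 is therefore stored as zero, and the decode restores it to 32.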


Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(SP, shifted, '0');
    if d == 15 then
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;


F5.1.9 ADD (immediate, to PC)

Add to PC adds an immediate value to the Align(PC, 4) value to form a PC-relative address, and writes the result
to the destination register. ARM recommends that, where possible, software avoids using this alias.

This instruction is a pseudo-instruction of the ADR instruction. This means that:

•    The encodings in this description are named to match the encodings of ADR.
•    The assembler syntax is used only for assembly, and is not used on disassembly.
•    The description of ADR gives the operational pseudocode for this instruction.

A1

|31    28|27 26 25 24|23 22 21|20|19 18 17 16|15  12|11            0|
| !=1111 | 0  0  1  0| 1  0  0| 0| 1  1  1  1|  Rd  |     imm12     |
   cond
A1 variant
ADD{<c>}{<q>} <Rd>, PC, #<const> 
is equivalent to 
ADR{<c>}{<q>} <Rd>, <label> 
and is never the preferred disassembly. 
T1 
|15 14 13 12 11|10   8|7          0|
| 1  0  1  0  0|  Rd  |    imm8    |
T1 variant 
ADD{<c>}{<q>} <Rd>, PC, #<imm8> 
is equivalent to 
ADR{<c>}{<q>} <Rd>, <label> 
and is never the preferred disassembly. 
T3 
|15 14 13 12 11|10| 9  8  7  6  5  4| 3  2  1  0|15|14  12|11   8|7          0|
| 1  1  1  1  0| i| 1  0  0  0  0  0| 1  1  1  1| 0| imm3 |  Rd  |    imm8    |
T3 variant 
ADDW{<c>}{<q>} <Rd>, PC, #<imm12> // <Rd>, <imm12> can be represented in T1 
is equivalent to 
ADR{<c>}{<q>} <Rd>, <label> 
and is never the preferred disassembly. 

ADD{<c>}{<q>} <Rd>, PC, #<imm12>
is equivalent to 
ADR{<c>}{<q>} <Rd>, <label> 


and is never the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If the PC is
used, the instruction is a branch to the address calculated by the operation. This is an interworking
branch, see Pseudocode description of operations on the AArch32 general-purpose registers and the
PC on page E1-2293.


For encoding T1 and T3: is the general-purpose destination register, encoded in the "Rd" field. 


<label> For encoding A1 and A2: the label of an instruction or literal data item whose address is to be loaded 
into <Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of 
the ADR instruction to this label. If the offset is zero or positive, encoding A1 is used, with imm32 equal 
to the offset. If the offset is negative, encoding A2 is used, with imm32 equal to the size of the offset. 
That is, the use of encoding A2 indicates that the required offset is minus the value of imm32. 
Permitted values of the size of the offset are any of the constants described in Modified immediate 
constants in A32 instructions on page F2-2422. 


For encoding T1: the label of an instruction or literal data item whose address is to be loaded into
<Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of the
ADR instruction to this label. Permitted values of the size of the offset are multiples of 4 in the range
0 to 1020.


For encoding T2 and T3: the label of an instruction or literal data item whose address is to be loaded 
into <Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of 
the ADR instruction to this label. If the offset is zero or positive, encoding T3 is used, with imm32 equal 
to the offset. If the offset is negative, encoding T2 is used, with imm32 equal to the size of the offset. 
That is, the use of encoding T2 indicates that the required offset is minus the value of imm32. 
Permitted values of the size of the offset are 0-4095. 


<imm8> Is an unsigned immediate, a multiple of 4, in the range 0 to 1020, encoded in the "imm8" field as 
<imm8>/4. 

<imm12> Is a 12-bit unsigned immediate, in the range 0 to 4095, encoded in the "i:imm3:imm8" field. 

<const> An immediate value. See Modified immediate constants in A32 instructions on page F2-2422 for the
range of values.


Operation for all encodings 


The description of ADR gives the operational pseudocode for this instruction. 


F5.1.10 ADR


Form PC-relative address adds an immediate value to the PC value to form a PC-relative address, and writes the 
result to the destination register. 


This instruction is used by the pseudo-instructions ADD (immediate, to PC) and SUB (immediate, from PC). The 
pseudo-instruction is never the preferred disassembly. 


A1 

|31    28|27 26 25 24|23 22 21|20|19 18 17 16|15  12|11            0|
| !=1111 | 0  0  1  0| 1  0  0| 0| 1  1  1  1|  Rd  |     imm12     |
   cond

A1 variant


ADR{<c>}{<q>} <Rd>, <label> 


Decode for this encoding 


d = UInt(Rd); imm32 = A32ExpandImm(imm12); add = TRUE; 


A2 

|31    28|27 26 25 24|23 22 21|20|19 18 17 16|15  12|11            0|
| !=1111 | 0  0  1  0| 0  1  0| 0| 1  1  1  1|  Rd  |     imm12     |
   cond

A2 variant 


ADR{<c>}{<q>} <Rd>, <label> 


Decode for this encoding 


d = UInt(Rd); imm32 = A32ExpandImm(imm12); add = FALSE;


T1 

|15 14 13 12 11|10   8|7          0|
| 1  0  1  0  0|  Rd  |    imm8    |

T1 variant 


ADR{<c>}{<q>} <Rd>, <label> 


Decode for this encoding 


d = UInt(Rd); imm32 = ZeroExtend(imm8:'00', 32); add = TRUE;


T2


|15 14 13 12 11|10| 9  8  7  6  5  4| 3  2  1  0|15|14  12|11   8|7          0|
| 1  1  1  1  0| i| 1  0  1  0  1  0| 1  1  1  1| 0| imm3 |  Rd  |    imm8    |


T2 variant


ADR{<c>}{<q>} <Rd>, <label> 


Decode for this encoding 


d = UInt(Rd); imm32 = ZeroExtend(i:imm3:imm8, 32); add = FALSE;
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


T3


|15 14 13 12 11|10| 9  8  7  6  5  4| 3  2  1  0|15|14  12|11   8|7          0|
| 1  1  1  1  0| i| 1  0  0  0  0  0| 1  1  1  1| 0| imm3 |  Rd  |    imm8    |


T3 variant 


ADR{<c>}.W <Rd>, <label> // <Rd>, <label> can be represented in T1
ADR{<c>}{<q>} <Rd>, <label> 


Decode for this encoding 

d = UInt(Rd); imm32 = ZeroExtend(i:imm3:imm8, 32); add = TRUE;
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural
Constraints on UNPREDICTABLE behaviors.


Alias conditions


Alias or pseudo-instruction    of variant    is preferred when
ADD (immediate, to PC)         -             Never
SUB (immediate, from PC)       T2            i:imm3:imm8 == '000000000000'
SUB (immediate, from PC)       A2            imm12 == '000000000000'


Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1 and A2: is the general-purpose destination register, encoded in the "Rd" field. If
the PC is used, the instruction is a branch to the address calculated by the operation. This is an
interworking branch, see Pseudocode description of operations on the AArch32 general-purpose
registers and the PC on page E1-2293.


For encoding T1, T2 and T3: is the general-purpose destination register, encoded in the "Rd" field. 


<label> For encoding A1 and A2: the label of an instruction or literal data item whose address is to be loaded 
into <Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of 
the ADR instruction to this label. If the offset is zero or positive, encoding A1 is used, with imm32 equal 
to the offset. If the offset is negative, encoding A2 is used, with imm32 equal to the size of the offset. 
That is, the use of encoding A2 indicates that the required offset is minus the value of imm32. 
Permitted values of the size of the offset are any of the constants described in Modified immediate 
constants in A32 instructions on page F2-2422. 

For encoding T1: the label of an instruction or literal data item whose address is to be loaded into
<Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of the
ADR instruction to this label. Permitted values of the size of the offset are multiples of 4 in the range
0 to 1020.


For encoding T2 and T3: the label of an instruction or literal data item whose address is to be loaded 
into <Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of 
the ADR instruction to this label. If the offset is zero or positive, encoding T3 is used, with imm32 equal 
to the offset. If the offset is negative, encoding T2 is used, with imm32 equal to the size of the offset. 
That is, the use of encoding T2 indicates that the required offset is minus the value of imm32. 
Permitted values of the size of the offset are 0-4095. 


The instruction aliases permit the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    if d == 15 then // Can only occur for A32 encodings
        ALUWritePC(result);
    else
        R[d] = result;
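
The address calculation above can be sketched in Python. This is an illustrative model under stated assumptions: adr_result and align are hypothetical names, and the sketch assumes the usual AArch32 convention that a read of the PC returns the instruction address plus 8 in A32 state and plus 4 in T32 state. The d == 15 branch case is not modeled.

```python
MASK32 = 0xFFFFFFFF


def align(value, boundary):
    """Round value down to a multiple of boundary."""
    return value - (value % boundary)


def adr_result(instr_addr, imm32, add, thumb):
    """Compute the value ADR writes to Rd.

    instr_addr: address of the ADR instruction itself
    imm32:      decoded immediate (the size of the offset)
    add:        True for the adding encodings, False for the subtracting ones
    thumb:      True for T32 encodings, False for A32 encodings
    """
    # Assumption: PC reads as the instruction address plus 4 (T32) or plus 8 (A32).
    pc = instr_addr + (4 if thumb else 8)
    base = align(pc, 4)
    result = base + imm32 if add else base - imm32
    return result & MASK32


if __name__ == "__main__":
    # T32 ADR at a halfword-aligned address: the base is rounded down to a word boundary.
    print(hex(adr_result(0x1002, 8, add=True, thumb=True)))
```

The Align(PC, 4) step only matters in T32 state, where an instruction can start at an address that is not a multiple of 4.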


F5.1.11 AND, ANDS (immediate)


Bitwise AND (immediate) performs a bitwise AND of a register value and an immediate value, and writes the result
to the destination register.

If the destination register is not the PC, the ANDS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

•    The AND variant of the instruction is an interworking branch, see Pseudocode description of operations on
     the AArch32 general-purpose registers and the PC on page E1-2293.

•    The ANDS variant of the instruction performs an exception return without the use of the stack. In this case:

     -    The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.

     -    The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
          AArch32 state on page G1-3835.

     -    The instruction is UNDEFINED in Hyp mode.

     -    The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1


|31    28|27 26 25 24|23 22 21|20|19  16|15  12|11            0|
| !=1111 | 0  0  1  0| 0  0  0| S|  Rn  |  Rd  |     imm12     |
   cond


AND variant 


Applies when S == 0. 


AND{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


ANDS variant 


Applies when S == 1. 


ANDS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C);


T1 


|15 14 13 12 11|10| 9| 8  7  6  5| 4|3    0|15|14  12|11   8|7          0|
| 1  1  1  1  0| i| 0| 0  0  0  0| S|  Rn  | 0| imm3 |  Rd  |    imm8    |


AND variant 


Applies when S == 0.


AND{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


ANDS variant 


Applies when S == 1 && Rd != 1111. 


ANDS{<c>}{<q>} {<Rd>,} <Rn>, #<const>


Decode for all variants of this encoding

if Rd == '1111' && S == '1' then SEE TST (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C);
if (d == 15 && !setflags) || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:

•    For the AND variant, the instruction is a branch to the address calculated by the operation.
     This is an interworking branch, see Pseudocode description of operations on the AArch32
     general-purpose registers and the PC on page E1-2293.

•    For the ANDS variant, the instruction performs an exception return, that restores PSTATE
     from SPSR_<current_mode>.

For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.


<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be 
used, but this is deprecated. 


For encoding T1: is the general-purpose source register, encoded in the "Rn" field. 


<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 


For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] AND imm32;
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
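
The flag behavior above, including where the carry comes from, can be sketched in Python for the A32 encoding. This is an illustrative model only: a32_expand_imm_c and ands_immediate are hypothetical names, the flags are held in a plain dictionary, and the d == 15 cases are not modeled.

```python
MASK32 = 0xFFFFFFFF


def a32_expand_imm_c(imm12, carry_in):
    """Expand an A32 modified immediate and return (imm32, carry_out).

    With a zero rotation the carry passes through unchanged; otherwise the
    carry is bit 31 of the rotated constant.
    """
    rotate = 2 * ((imm12 >> 8) & 0xF)
    imm8 = imm12 & 0xFF
    if rotate == 0:
        return imm8, carry_in
    imm32 = ((imm8 >> rotate) | (imm8 << (32 - rotate))) & MASK32
    return imm32, imm32 >> 31


def ands_immediate(rn, imm12, flags):
    """Model ANDS (immediate, A1) for a destination other than the PC.

    flags is a dict with keys "N", "Z", "C", "V"; a new dict is returned.
    """
    imm32, carry = a32_expand_imm_c(imm12, flags["C"])
    result = rn & imm32
    # N and Z follow the result, C follows the immediate expansion, V is unchanged.
    new_flags = dict(flags, N=result >> 31, Z=int(result == 0), C=carry)
    return result, new_flags


if __name__ == "__main__":
    start = {"N": 0, "Z": 0, "C": 0, "V": 1}
    print(ands_immediate(0xF0F0F0F0, 0x4FF, start))
```

Note that the C flag depends only on the encoded constant and the incoming carry, not on the value of Rn.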


F5.1.12 AND, ANDS (register)


Bitwise AND (register) performs a bitwise AND of a register value and an optionally-shifted register value, and
writes the result to the destination register.

If the destination register is not the PC, the ANDS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

•    The AND variant of the instruction is an interworking branch, see Pseudocode description of operations on
     the AArch32 general-purpose registers and the PC on page E1-2293.

•    The ANDS variant of the instruction performs an exception return without the use of the stack. In this case:

     -    The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.

     -    The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
          AArch32 state on page G1-3835.

     -    The instruction is UNDEFINED in Hyp mode.

     -    The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1


|31    28|27 26 25 24|23 22 21|20|19  16|15  12|11     7|6    5| 4|3    0|
| !=1111 | 0  0  0  0| 0  0  0| S|  Rn  |  Rd  |  imm5  | type | 0|  Rm  |
   cond


AND, rotate right with extend variant 


Applies when S == 0 && imm5 == 00000 && type == 11. 


AND{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


AND, shift or rotate by value variant 


Applies when S == 0 && !(imm5 == 00000 && type == 11).


AND{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ANDS, rotate right with extend variant 


Applies when S == 1 && imm5 == 00000 && type == 11. 


ANDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ANDS, shift or rotate by value variant 


Applies when S == 1 && !(imm5 == 00000 && type == 11). 


ANDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);


T1


|15 14 13 12 11 10| 9  8  7  6|5    3|2    0|
| 0  1  0  0  0  0| 0  0  0  0|  Rm  | Rdn  |


T1 variant 


AND<c>{<q>} {<Rdn>,} <Rdn>, <Rm> // Inside IT block 
ANDS{<q>} {<Rdn>,} <Rdn>, <Rm> // Outside IT block 


Decode for this encoding 

d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


|15 14 13 12 11 10  9| 8  7  6  5| 4|3    0| 15|14  12|11   8|7    6|5    4|3    0|
| 1  1  1  0  1  0  1| 0  0  0  0| S|  Rn  |(0)| imm3 |  Rd  | imm2 | type |  Rm  |


AND, rotate right with extend variant
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


AND{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX


AND, shift or rotate by value variant
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).


AND<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1
AND{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}


ANDS, rotate right with extend variant
Applies when S == 1 && imm3 == 000 && Rd != 1111 && imm2 == 00 && type == 11.


ANDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX


ANDS, shift or rotate by value variant
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11) && Rd != 1111.


ANDS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1
ANDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}


Decode for all variants of this encoding 


if Rd == '1111' && S == '1' then SEE TST (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if (d == 15 && !setflags) || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdn> Is the first general-purpose source register and the destination register, encoded in the "Rdn" field. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:

•    For the AND variant, the instruction is a branch to the address calculated by the operation.
     This is an interworking branch, see Pseudocode description of operations on the AArch32
     general-purpose registers and the PC on page E1-2293.

•    For the ANDS variant, the instruction performs an exception return, that restores PSTATE
     from SPSR_<current_mode>.

For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.


<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can 
be used, but this is deprecated. 
For encoding T2: is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC 
can be used, but this is deprecated. 


For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field. 


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


In T32 assembly: 


•    Outside an IT block, if ANDS <Rd>, <Rn>, <Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
     using encoding T1 as though ANDS <Rd>, <Rn> had been written.

•    Inside an IT block, if AND<c> <Rd>, <Rn>, <Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
     using encoding T1 as though AND<c> <Rd>, <Rn> had been written.


To prevent either of these happening, use the .W qualifier. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] AND shifted;
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
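
As an informal illustration of the operation above, the following Python sketch models the flag-setting case for a destination other than the PC. The names shift_c and ands_shifted are hypothetical stand-ins, not architecture-defined functions. RRX and the PC-destination paths (ALUWritePC, ALUExceptionReturn) are deliberately not modelled.

```python
MASK32 = 0xFFFFFFFF

def shift_c(value, shift_t, amount, carry_in):
    """Model LSL, LSR, ASR and ROR on a 32-bit value.

    Returns (result, carry_out). A shift amount of 0 leaves the value and
    the carry unchanged. RRX is not modelled.
    """
    value &= MASK32
    if amount == 0:
        return value, carry_in
    if shift_t == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_t == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value >> 31 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return result, result >> 31
    raise ValueError(f"unsupported shift type: {shift_t}")

def ands_shifted(rn, rm, shift_t, amount, carry_in):
    """Return (result, flags) for ANDS when the destination is not the PC."""
    shifted, carry = shift_c(rm, shift_t, amount, carry_in)
    result = rn & shifted & MASK32
    # V is not part of the returned flags because the instruction leaves it unchanged.
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags
```

The carry flag comes from the shifter rather than from the AND itself, which is why a zero result can still set C.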
F5.1.13 AND, ANDS (register-shifted register)


Bitwise AND (register-shifted register) performs a bitwise AND of a register value and a register-shifted register 
value. It writes the result to the destination register, and can optionally update the condition flags based on the result. 


A1 
| 31-28        | 27-25 | 24-21   | 20 | 19-16 | 15-12 | 11-8 | 7 | 6-5  | 4 | 3-0 |
| cond != 1111 | 0 0 0 | 0 0 0 0 | S  | Rn    | Rd    | Rs   | 0 | type | 1 | Rm  |


Flag setting variant 
Applies when S == 1. 


ANDS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0. 


AND{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 

ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-2597 


ID092916 Non-Confidential 


F5 T32 and A32 Base Instruction Set Instruction Descriptions 
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] AND shifted;
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
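
The distinctive step in this form is that the shift amount is taken from the bottom byte of <Rs>, so it can be 0 or exceed 31. The sketch below is a simplified model that covers only the LSL shift type. The function names are hypothetical and are not architecture pseudocode.

```python
MASK32 = 0xFFFFFFFF

def lsl_by_register(rm, rs, carry_in):
    """LSL of rm by the amount held in bits<7:0> of rs; returns (result, carry)."""
    shift_n = rs & 0xFF  # models UInt(R[s]<7:0>); upper bits of rs are ignored
    if shift_n == 0:
        return rm & MASK32, carry_in
    extended = (rm & MASK32) << shift_n
    # Bit 32 of the extended value is the last bit shifted out (0 if shift_n > 32).
    return extended & MASK32, (extended >> 32) & 1

def and_reg_shifted_lsl(rn, rm, rs, carry_in=0):
    """Return (result, carry) for AND with an LSL register-shifted operand."""
    shifted, carry = lsl_by_register(rm, rs, carry_in)
    return rn & shifted & MASK32, carry
```

With rs = 0x104 only the value 4 is used as the shift amount. A shift by 32 clears the operand and moves bit 0 of rm into the carry, and larger shifts clear both.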
F5.1.14 ASR (immediate)

Arithmetic Shift Right (immediate) shifts a register value right by an immediate number of bits, shifting in copies
of its sign bit, and writes the result to the destination register.

This instruction is an alias of the MOV, MOVS (register) instruction. This means that:
• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.
A1 
| 31-28        | 27-25 | 24-21   | 20  | 19-16           | 15-12 | 11-7 | 6-5      | 4 | 3-0 |
| cond != 1111 | 0 0 0 | 1 1 0 1 | S=0 | (0) (0) (0) (0) | Rd    | imm5 | type=1 0 | 0 | Rm  |
MOV, shift or rotate by value variant 
ASR{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, ASR #<imm> 
and is always the preferred disassembly. 
T2 
| 15-13 | 12-11  | 10-6 | 5-3 | 2-0 |
| 0 0 0 | op=1 0 | imm5 | Rm  | Rd  |
T2 variant 
ASR<c>{<q>} {<Rd>,} <Rm>, #<imm> // Inside IT block 
is equivalent to 
MOV<c>{<q>} <Rd>, <Rm>, ASR #<imm> 
and is the preferred disassembly when InITBlock(). 
T3 
| 15-11     | 10-9 | 8-5     | 4   | 3-0     | 15  | 14-12 | 11-8 | 7-6  | 5-4      | 3-0 |
| 1 1 1 0 1 | 0 1  | 0 0 1 0 | S=0 | 1 1 1 1 | (0) | imm3  | Rd   | imm2 | type=1 0 | Rm  |
MOV, shift or rotate by value variant 
ASR<c>.W {<Rd>,} <Rm>, #<imm> // Inside IT block, and <Rd>, <Rm>, <imm> can be represented in T2
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, ASR #<imm> 
and is always the preferred disassembly. 
ASR{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 
is equivalent to
MOV{<c>}{<q>} <Rd>, <Rm>, ASR #<imm> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction is a branch
to the address calculated by the operation. This is an interworking branch, see Pseudocode
description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.

For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field.
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.
For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field.
<imm> For encoding A1 and T2: is the shift amount, in the range 1 to 32, encoded in the "imm5" field as
<imm> modulo 32.

For encoding T3: is the shift amount, in the range 1 to 32, encoded in the "imm3:imm2" field as
<imm> modulo 32.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
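
As an informal complement to the description of <imm>, the following sketch shows how a 5-bit field value maps back to a shift amount for ASR, and what the shift does to a 32-bit value. Both function names are hypothetical; the authoritative behavior is the MOV, MOVS (register) pseudocode.

```python
MASK32 = 0xFFFFFFFF

def decode_asr_amount(imm5: int) -> int:
    """Map the 5-bit field back to the shift amount: 0 encodes a shift by 32."""
    return 32 if imm5 == 0 else imm5

def asr32(value: int, amount: int) -> int:
    """Arithmetic shift right of a 32-bit value, shifting in copies of bit 31."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> amount) & MASK32
```

A shift by 32 therefore yields either all zeros or all ones, depending on the sign bit of the source value.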
F5.1.15 ASR (register)

Arithmetic Shift Right (register) shifts a register value right by a variable number of bits, shifting in copies of its
sign bit, and writes the result to the destination register. The variable number of bits is read from the bottom byte of
a register.

This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that:
• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.
A1 
| 31-28        | 27-25 | 24-21   | 20  | 19-16           | 15-12 | 11-8 | 7 | 6-5      | 4 | 3-0 |
| cond != 1111 | 0 0 0 | 1 1 0 1 | S=0 | (0) (0) (0) (0) | Rd    | Rs   | 0 | type=1 0 | 1 | Rm  |
Not flag setting variant 
ASR{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, ASR <Rs> 
and is always the preferred disassembly. 
T1 
| 15-10       | 9-6        | 5-3 | 2-0 |
| 0 1 0 0 0 0 | op=0 1 0 0 | Rs  | Rdm |
Arithmetic shift right variant 
ASR<c>{<q>} {<Rdm>,} <Rdm>, <Rs> // Inside IT block 
is equivalent to 
MOV<c>{<q>} <Rdm>, <Rdm>, ASR <Rs> 
and is the preferred disassembly when InITBlock(). 
T2 
| 15-8            | 7 | 6-5      | 4   | 3-0 | 15-12   | 11-8 | 7-4     | 3-0 |
| 1 1 1 1 1 0 1 0 | 0 | type=1 0 | S=0 | Rm  | 1 1 1 1 | Rd   | 0 0 0 0 | Rs  |
Not flag setting variant 
ASR<c>.W {<Rd>,} <Rm>, <Rs> // Inside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, ASR <Rs> 
and is always the preferred disassembly.
ASR{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 

is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, ASR <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 

<Rs> Is the second general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 
F5.1.16 ASRS (immediate)


Arithmetic Shift Right, setting flags (immediate) shifts a register value right by an immediate number of bits,
shifting in copies of its sign bit, and writes the result to the destination register.


If the destination register is not the PC, this instruction updates the condition flags based on the result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.

• The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from AArch32
state on page G1-3835.

• The instruction is UNDEFINED in Hyp mode.

• The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 

° The encodings in this description are named to match the encodings of MOV, MOVS (register). 


° The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 


A1 


| 31-28        | 27-25 | 24-21   | 20  | 19-16           | 15-12 | 11-7 | 6-5      | 4 | 3-0 |
| cond != 1111 | 0 0 0 | 1 1 0 1 | S=1 | (0) (0) (0) (0) | Rd    | imm5 | type=1 0 | 0 | Rm  |


MOVS, shift or rotate by value variant 
ASRS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, ASR #<imm> 


and is always the preferred disassembly. 
T2 


| 15-13 | 12-11  | 10-6 | 5-3 | 2-0 |
| 0 0 0 | op=1 0 | imm5 | Rm  | Rd  |


T2 variant 

ASRS{<q>} {<Rd>,} <Rm>, #<imm> // Outside IT block 
is equivalent to 

MOVS{<q>} <Rd>, <Rm>, ASR #<imm> 


and is the preferred disassembly when !InITBlock().

T3

| 15-11     | 10-9 | 8-5     | 4   | 3-0     | 15  | 14-12 | 11-8 | 7-6  | 5-4      | 3-0 |
| 1 1 1 0 1 | 0 1  | 0 0 1 0 | S=1 | 1 1 1 1 | (0) | imm3  | Rd   | imm2 | type=1 0 | Rm  |


MOVS, shift or rotate by value variant 


ASRS.W {<Rd>,} <Rm>, #<imm> // Outside IT block, and <Rd>, <Rm>, <imm> can be represented in T2


is equivalent to 


MOVS{<c>}{<q>} <Rd>, <Rm>, ASR #<imm> 


and is always the preferred disassembly. 


ASRS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 


is equivalent to 


MOVS{<c>}{<q>} <Rd>, <Rm>, ASR #<imm> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction performs an
exception return, that restores PSTATE from SPSR_<current_mode>.

For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field.

<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.

For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field.

<imm> For encoding A1 and T2: is the shift amount, in the range 1 to 32, encoded in the "imm5" field as
<imm> modulo 32.

For encoding T3: is the shift amount, in the range 1 to 32, encoded in the "imm3:imm2" field as
<imm> modulo 32.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
F5.1.17 ASRS (register)


Arithmetic Shift Right, setting flags (register) shifts a register value right by a variable number of bits, shifting in
copies of its sign bit, writes the result to the destination register, and updates the condition flags based on the result.
The variable number of bits is read from the bottom byte of a register.


This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that:

• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).

• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.


A1 


| 31-28        | 27-25 | 24-21   | 20  | 19-16           | 15-12 | 11-8 | 7 | 6-5      | 4 | 3-0 |
| cond != 1111 | 0 0 0 | 1 1 0 1 | S=1 | (0) (0) (0) (0) | Rd    | Rs   | 0 | type=1 0 | 1 | Rm  |


Flag setting variant 
ASRS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, ASR <Rs> 


and is always the preferred disassembly. 
T1 


| 15-10       | 9-6        | 5-3 | 2-0 |
| 0 1 0 0 0 0 | op=0 1 0 0 | Rs  | Rdm |


Arithmetic shift right variant 

ASRS{<q>} {<Rdm>,} <Rdm>, <Rs> // Outside IT block 
is equivalent to 

MOVS{<q>} <Rdm>, <Rdm>, ASR <Rs> 


and is the preferred disassembly when !InITBlock(). 
T2 


| 15-8            | 7 | 6-5      | 4   | 3-0 | 15-12   | 11-8 | 7-4     | 3-0 |
| 1 1 1 1 1 0 1 0 | 0 | type=1 0 | S=1 | Rm  | 1 1 1 1 | Rd   | 0 0 0 0 | Rs  |


Flag setting variant 
ASRS.W {<Rd>,} <Rm>, <Rs> // Outside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
is equivalent to 


MOVS{<c>}{<q>} <Rd>, <Rm>, ASR <Rs> 
and is always the preferred disassembly.
ASRS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, ASR <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 

<Rs> Is the second general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 
F5.1.18 B


Branch causes a branch to a target address. 


A1 


| 31-28        | 27-24   | 23-0  |
| cond != 1111 | 1 0 1 0 | imm24 |


A1 variant


B{<c>}{<q>} <label> 


Decode for this encoding 


imm32 = SignExtend(imm24:'00', 32);

T1 


| 15-12   | 11-8 | 7-0  |
| 1 1 0 1 | cond | imm8 |


T1 variant 


B<c>{<q>} <label> // Not permitted in IT block 


Decode for this encoding 
if cond == '1110' then SEE UDF; 
if cond == '1111' then SEE SVC; 


imm32 = SignExtend(imm8:'0', 32);
if InITBlock() then UNPREDICTABLE; 


T2 


| 15-11     | 10-0  |
| 1 1 1 0 0 | imm11 |


T2 variant 


B{<c>}{<q>} <label> // Outside or last in IT block 


Decode for this encoding 
imm32 = SignExtend(imm11:'0', 32);
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


T3 


| 15-11     | 10 | 9-6  | 5-0  | 15-14 | 13 | 12 | 11 | 10-0  |
| 1 1 1 1 0 | S  | cond | imm6 | 1 0   | J1 | 0  | J2 | imm11 |


T3 variant


B<c>.W <label> // Not permitted in IT block, and <label> can be represented in T1 
B<c>{<q>} <label> // Not permitted in IT block 


Decode for this encoding 
if cond<3:1> == '111' then SEE "Related encodings"; 


imm32 = SignExtend(S:J2:J1:imm6:imm11:'0', 32); 
if InITBlock() then UNPREDICTABLE; 


T4 


| 15-11     | 10 | 9-0   | 15-14 | 13 | 12 | 11 | 10-0  |
| 1 1 1 1 0 | S  | imm10 | 1 0   | J1 | 1  | J2 | imm11 |


T4 variant 


B{<c>}.W <label> // <label> can be represented in T2 
B{<c>}{<q>} <label> 


Decode for this encoding 

I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10:imm11:'0', 32);
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Related encodings: Branches and miscellaneous control on page F3-2471. 


Assembler symbols 


<c> For encoding A1, T2 and T4: see Standard assembler syntax fields on page F2-2406.
For encoding T1: see Standard assembler syntax fields on page F2-2406. Must not be AL or omitted.
For encoding T3: see Standard assembler syntax fields on page F2-2406. <c> must not be AL or omitted.

<q> See Standard assembler syntax fields on page F2-2406.

<label> For encoding A1: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the B instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are multiples of 4 in the range -33554432
to 33554428.

For encoding T1: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the B instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are even numbers in the range -256 to 254.

For encoding T2: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the B instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are even numbers in the range -2048 to
2046.

For encoding T3: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the B instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are even numbers in the range -1048576 to
1048574.

For encoding T4: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the B instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are even numbers in the range -16777216
to 16777214.


Operation for all encodings 

if ConditionPassed() then
    EncodingSpecificOperations();
    BranchWritePC(PC + imm32);
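
The T4 decode is the least obvious of the five, because I1 and I2 are derived from J1, J2 and S. The following sketch extracts the fields from the two halfwords and rebuilds imm32 as the decode pseudocode describes. The function names are hypothetical. The branch target helper assumes the usual T32 rule that the PC value read by an instruction is its own address plus 4.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def decode_b_t4(hw1: int, hw2: int) -> int:
    """Return imm32 (as a signed integer) for a B instruction in encoding T4."""
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)  # I1 = NOT(J1 EOR S)
    i2 = 1 - (j2 ^ s)  # I2 = NOT(J2 EOR S)
    raw = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    return sign_extend(raw, 25)  # S:I1:I2:imm10:imm11:'0' is 25 bits wide

def t32_branch_target(instr_address: int, imm32: int) -> int:
    """Target of a T32 branch, assuming PC reads as instruction address + 4."""
    return (instr_address + 4 + imm32) & 0xFFFFFFFF
```

For instance, the halfword pair 0xF7FF, 0xBFFE decodes to an offset of -4, which is a branch to the instruction itself.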
F5.1.19 BFC


Bit Field Clear clears any number of adjacent bits at any position in a register, without affecting the other bits in the 
register. 


A1 


| 31-28        | 27-21         | 20-16 | 15-12 | 11-7 | 6-4   | 3-0     |
| cond != 1111 | 0 1 1 1 1 1 0 | msb   | Rd    | lsb  | 0 0 1 | 1 1 1 1 |


A1 variant


BFC{<c>}{<q>} <Rd>, #<lsb>, #<width>


Decode for this encoding
d = UInt(Rd); msbit = UInt(msb); lsbit = UInt(lsb);
if d == 15 then UNPREDICTABLE;


T1 


| 15-11     | 10  | 9-4         | 3-0     | 15 | 14-12 | 11-8 | 7-6  | 5   | 4-0 |
| 1 1 1 1 0 | (0) | 1 1 0 1 1 0 | 1 1 1 1 | 0  | imm3  | Rd   | imm2 | (0) | msb |


T1 variant 


BFC{<c>}{<q>} <Rd>, #<lsb>, #<width>


Decode for this encoding
d = UInt(Rd); msbit = UInt(msb); lsbit = UInt(imm3:imm2);
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.

<lsb> For encoding A1: is the least significant bit to be cleared, in the range 0 to 31, encoded in the "lsb"
field.

For encoding T1: is the least significant bit that is to be cleared, in the range 0 to 31, encoded in the
"imm3:imm2" field.

<width> Is the number of bits to be cleared, in the range 1 to 32-<lsb>, encoded in the "msb" field as
<lsb>+<width>-1.

Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    if msbit >= lsbit then
        R[d]<msbit:lsbit> = Replicate('0', msbit-lsbit+1);
        // Other bits of R[d] are unchanged
    else
        UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior


If msbit < lsbit, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
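
The relationship between the assembler operands and the cleared bit range can be sketched as follows. The function bfc is a hypothetical model of the well-defined case only. It rejects operand combinations that the assembler syntax does not permit, rather than modelling the CONSTRAINED UNPREDICTABLE behaviors.

```python
MASK32 = 0xFFFFFFFF

def bfc(rd_value: int, lsb: int, width: int) -> int:
    """Clear `width` bits of rd_value starting at bit `lsb`; other bits are kept."""
    if not 0 <= lsb <= 31:
        raise ValueError("lsb must be in the range 0 to 31")
    if not 1 <= width <= 32 - lsb:
        raise ValueError("width must be in the range 1 to 32-lsb")
    msbit = lsb + width - 1  # value held in the msb field
    mask = ((1 << (msbit - lsb + 1)) - 1) << lsb
    return rd_value & ~mask & MASK32
```

Clearing 12 bits from bit 8 of 0xFFFFFFFF, for example, leaves 0xFFF000FF.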
F5.1.20 BFI


Bit Field Insert copies any number of low order bits from a register into the same number of adjacent bits at any 
position in the destination register. 


A1 


| 31-28        | 27-21         | 20-16 | 15-12 | 11-7 | 6-4   | 3-0        |
| cond != 1111 | 0 1 1 1 1 1 0 | msb   | Rd    | lsb  | 0 0 1 | Rn != 1111 |


A1 variant


BFI{<c>}{<q>} <Rd>, <Rn>, #<lsb>, #<width>


Decode for this encoding
if Rn == '1111' then SEE BFC;
d = UInt(Rd); n = UInt(Rn); msbit = UInt(msb); lsbit = UInt(lsb);
if d == 15 then UNPREDICTABLE;


T1 


| 15-11     | 10  | 9-4         | 3-0        | 15 | 14-12 | 11-8 | 7-6  | 5   | 4-0 |
| 1 1 1 1 0 | (0) | 1 1 0 1 1 0 | Rn != 1111 | 0  | imm3  | Rd   | imm2 | (0) | msb |





T1 variant 


BFI{<c>}{<q>} <Rd>, <Rn>, #<lsb>, #<width>


Decode for this encoding
if Rn == '1111' then SEE BFC;
d = UInt(Rd); n = UInt(Rn); msbit = UInt(msb); lsbit = UInt(imm3:imm2);
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the general-purpose source register, encoded in the "Rn" field.

<lsb> For encoding A1: is the least significant destination bit, in the range 0 to 31, encoded in the "lsb"
field.

For encoding T1: is the least significant destination bit, in the range 0 to 31, encoded in the
"imm3:imm2" field.

<width> Is the number of bits to be copied, in the range 1 to 32-<lsb>, encoded in the "msb" field as
<lsb>+<width>-1.

Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    if msbit >= lsbit then
        R[d]<msbit:lsbit> = R[n]<(msbit-lsbit):0>;
        // Other bits of R[d] are unchanged
    else
        UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior


If msbit < lsbit, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
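
A minimal sketch of the well-defined case, using the assembler operands rather than the encoded fields, may help to show which bits are copied. The function bfi is hypothetical and does not model the CONSTRAINED UNPREDICTABLE behaviors; it raises an error for operands outside the permitted ranges.

```python
MASK32 = 0xFFFFFFFF

def bfi(rd_value: int, rn_value: int, lsb: int, width: int) -> int:
    """Copy the low `width` bits of rn_value into rd_value at bit `lsb`."""
    if not 0 <= lsb <= 31:
        raise ValueError("lsb must be in the range 0 to 31")
    if not 1 <= width <= 32 - lsb:
        raise ValueError("width must be in the range 1 to 32-lsb")
    field = (1 << width) - 1          # selects R[n]<(msbit-lsbit):0>
    mask = field << lsb               # selects R[d]<msbit:lsbit>
    inserted = (rn_value & field) << lsb
    return ((rd_value & ~mask) | inserted) & MASK32
```

Only the low-order bits of the source are used: inserting 8 bits of 0x0000ABCD at bit 8 of 0xFFFFFFFF gives 0xFFFFCDFF.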


F5.1.21 BIC, BICS (immediate) 


Bitwise Bit Clear (immediate) performs a bitwise AND of a register value and the complement of an immediate 
value, and writes the result to the destination register. 


If the destination register is not the PC, the BICS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The BIC variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.

• The BICS variant of the instruction performs an exception return without the use of the stack. In this case:
    - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
      AArch32 state on page G1-3835.
    - The instruction is UNDEFINED in Hyp mode.
    - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1 
| 31-28        | 27-25 | 24-21   | 20 | 19-16 | 15-12 | 11-0  |
| cond != 1111 | 0 0 1 | 1 1 1 0 | S  | Rn    | Rd    | imm12 |


BIC variant 

Applies when S == 0. 

BIC{<c>}{<q>} {<Rd>,} <Rn>, #<const> 

BICS variant 

Applies when S == 1. 

BICS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 

Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 


T1 


| 15-11     | 10 | 9 | 8-5     | 4 | 3-0 | 15 | 14-12 | 11-8 | 7-0  |
| 1 1 1 1 0 | i  | 0 | 0 0 0 1 | S | Rn  | 0  | imm3  | Rd   | imm8 |


BIC variant 

Applies when S == 0.

BIC{<c>}{<q>} {<Rd>,} <Rn>, #<const> 
BICS variant 


Applies when S == 1. 
BICS{<c>}{<q>} {<Rd>,} <Rn>, #<const>


Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the 
PC is used: 
• For the BIC variant, the instruction is a branch to the address calculated by the operation. This
is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.

• For the BICS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.


For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the same as <Rn>. 


<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be 
used, but this is deprecated. 


For encoding T1: is the general-purpose source register, encoded in the "Rn" field. 


<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 


For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] AND NOT(imm32);
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
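
For encoding A1, the carry written by BICS comes from expanding the modified immediate, not from the AND itself. The sketch below models that path for a destination other than the PC. The names a32_expand_imm_c and bics_imm_a1 are hypothetical Python stand-ins, and the expansion assumes the usual A32 modified immediate form: an 8-bit value rotated right by twice the 4-bit rotation field.

```python
MASK32 = 0xFFFFFFFF

def a32_expand_imm_c(imm12: int, carry_in: int):
    """Expand a 12-bit A32 modified immediate; returns (imm32, carry_out)."""
    unrotated = imm12 & 0xFF
    rotation = 2 * ((imm12 >> 8) & 0xF)
    if rotation == 0:
        return unrotated, carry_in  # no rotation: the carry is unchanged
    imm32 = ((unrotated >> rotation) | (unrotated << (32 - rotation))) & MASK32
    return imm32, imm32 >> 31       # the carry is bit 31 of the rotated value

def bics_imm_a1(rn: int, imm12: int, carry_in: int):
    """Return (result, flags) for BICS (immediate), destination not the PC."""
    imm32, carry = a32_expand_imm_c(imm12, carry_in)
    result = rn & ~imm32 & MASK32
    # V is omitted because the instruction leaves it unchanged.
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags
```

With imm12 = 0x0FF the constant is 0xFF and the incoming carry is preserved. With imm12 = 0x4FF the constant becomes 0xFF000000 and the carry is set.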


F5.1.22 BIC, BICS (register) 


Bitwise Bit Clear (register) performs a bitwise AND of a register value and the complement of an optionally-shifted
register value, and writes the result to the destination register.

If the destination register is not the PC, the BICS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

• The BIC variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.

• The BICS variant of the instruction performs an exception return without the use of the stack. In this case:
    - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
      AArch32 state on page G1-3835.
    - The instruction is UNDEFINED in Hyp mode.
    - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1 


| 31-28        | 27-25 | 24-21   | 20 | 19-16 | 15-12 | 11-7 | 6-5  | 4 | 3-0 |
| cond != 1111 | 0 0 0 | 1 1 1 0 | S  | Rn    | Rd    | imm5 | type | 0 | Rm  |


BIC, rotate right with extend variant 
Applies when S == 0 && imm5 == 00000 && type == 11. 


BIC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


BIC, shift or rotate by value variant 
Applies when S == 0 && !(imm5 == 00000 && type == 11).


BIC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


BICS, rotate right with extend variant 
Applies when S == 1 && imm5 == 00000 && type == 11.


BICS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


BICS, shift or rotate by value variant 
Applies when S == 1 && !(imm5 == 00000 && type == 11).


BICS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);

T1


| 15-10       | 9-6     | 5-3 | 2-0 |
| 0 1 0 0 0 0 | 1 1 1 0 | Rm  | Rdn |


T1 variant 


BIC<c>{<q>} {<Rdn>,} <Rdn>, <Rm> // Inside IT block 
BICS{<q>} {<Rdn>,} <Rdn>, <Rm> // Outside IT block 


Decode for this encoding 
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock(); 
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


| 15-11     | 10-9 | 8-5     | 4 | 3-0 | 15  | 14-12 | 11-8 | 7-6  | 5-4  | 3-0 |
| 1 1 1 0 1 | 0 1  | 0 0 0 1 | S | Rn  | (0) | imm3  | Rd   | imm2 | type | Rm  |


BIC, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


BIC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


BIC, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).


BIC<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
BIC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


BICS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11. 


BICS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


BICS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11).


BICS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
BICS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rdn> Is the first general-purpose source register and the destination register, encoded in the "Rdn" field.

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:

• For the BIC variant, the instruction is a branch to the address calculated by the operation. This
is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.

• For the BICS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.

For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.

<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.

For encoding T2: is the first general-purpose source register, encoded in the "Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.

For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field.

<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] AND NOT(shifted);
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
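
As an illustration of the Operation pseudocode above, the following Python sketch models the data path of BIC and BICS with a shifted register operand on 32-bit values. The names shift_c and bic are hypothetical stand-ins for Shift_C() and the instruction body. The sketch assumes a non-PC destination, omits RRX and condition checking, and is not the architectural definition.

```python
MASK32 = 0xFFFFFFFF


def shift_c(value, shift_t, amount, carry_in):
    """Hypothetical stand-in for Shift_C(): returns (result, carry_out)."""
    if amount == 0:
        # A zero shift leaves the value and the carry flag unchanged.
        return value & MASK32, carry_in
    if shift_t == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_t == "LSR":
        return (value >> amount) & MASK32, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        amount %= 32
        result = ((value >> amount) | (value << (32 - amount))) & MASK32
        return result, result >> 31
    raise ValueError(shift_t)


def bic(rn, rm, shift_t="LSL", shift_n=0, carry_in=0):
    """result = R[n] AND NOT(shifted); also returns the N, Z and C flag values."""
    shifted, carry = shift_c(rm, shift_t, shift_n, carry_in)
    result = rn & ~shifted & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags
```

For example, bic(0xFF, 0x0F) clears the low four bits of 0xFF. The flag values are only written to PSTATE by the flag-setting variant, and PSTATE.V is never changed.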


F5.1.23 BIC, BICS (register-shifted register) 


Bitwise Bit Clear (register-shifted register) performs a bitwise AND of a register value and the complement of a 
register-shifted register value. It writes the result to the destination register, and can optionally update the condition 
flags based on the result. 


A1 
|31 28|27 26 25|24 23 22 21|20|19 16|15 12|11 8| 7|6 5| 4|3 0|
|!=1111| 0 0 0 | 1 1 1 0 | S| Rn | Rd | Rs | 0|type| 1| Rm |
 cond


Flag setting variant 
Applies when S == 1. 


BICS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


BIC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);

if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE; 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the general-purpose source register holding a shift amount in its bottom 8 bits, encoded in the "Rs" 
field. 
Operation


if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] AND NOT(shifted);
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged


F5.1.24 BKPT


Breakpoint causes a Breakpoint Instruction exception. 


Breakpoint is always unconditional, even when inside an IT block. 
A1 


|31 28|27 26 25 24|23 22 21 20|19 8|7 6 5 4|3 0|
| cond | 0 0 0 1 | 0 0 1 0 | imm12 | 0 1 1 1 | imm4 |


A1 variant
BKPT{<q>} {#}<imm> 
Decode for this encoding 


imm16 = imm12:imm4; 
if cond != '1110' then UNPREDICTABLE; // BKPT must be encoded with AL condition 


CONSTRAINED UNPREDICTABLE behavior 

If cond != '1110', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes unconditionally.
• The instruction executes conditionally.


T1 


|15 14 13 12|11 10 9 8|7 0|
| 1 0 1 1 | 1 1 1 0 | imm8 |


T1 variant 


BKPT{<q>} {#}<imm> 


Decode for this encoding 


imm16 = ZeroExtend(imm8, 16);


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. A BKPT instruction must be unconditional.

<imm> For encoding A1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
"imm12:imm4" field. This value:

• Is recorded in the Comment field of ESR_ELx.ISS if the Software Breakpoint Instruction
exception is taken to an exception level that is using AArch64.

• Is ignored otherwise.

For encoding T1: is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field.
This value:

• Is recorded in the Comment field of ESR_ELx.ISS if the Software Breakpoint Instruction
exception is taken to an exception level that is using AArch64.

• Is ignored otherwise.


Operation for all encodings 


EncodingSpecificOperations(); 
AArch32.SoftwareBreakpoint(imm16) ; 
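
The A1 decode step imm16 = imm12:imm4 can be sketched in Python as follows. The function name decode_bkpt_a1 is hypothetical, and the field positions (imm12 in bits<19:8>, imm4 in bits<3:0>, cond in bits<31:28>) are stated here as assumptions about the A1 layout. This is a sketch of the concatenation only, not a full decoder.

```python
def decode_bkpt_a1(instr):
    """Return (cond, imm16) for a 32-bit A1 BKPT instruction word.

    Assumed layout: cond = bits<31:28>, imm12 = bits<19:8>, imm4 = bits<3:0>.
    """
    cond = (instr >> 28) & 0xF
    imm12 = (instr >> 8) & 0xFFF
    imm4 = instr & 0xF
    imm16 = (imm12 << 4) | imm4  # imm12:imm4 concatenation
    return cond, imm16
```

A caller would treat any cond value other than 0b1110 as the CONSTRAINED UNPREDICTABLE case described above.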
F5.1.25 BL, BLX (immediate)


Branch with Link calls a subroutine at a PC-relative address, and sets LR to the return address.


Branch with Link and Exchange Instruction Sets (immediate) calls a subroutine at a PC-relative address, setting LR 
to the return address, and changes the instruction set from A32 to T32, or from T32 to A32. 


A1 
|31 28|27 26 25 24|23 0|
|!=1111| 1 0 1 1 | imm24 |
 cond


A1 variant


BL{<c>}{<q>} <label> 


Decode for this encoding 


imm32 = SignExtend(imm24:'00', 32); targetInstrSet = InstrSet_A32; 


A2 
|31 28|27 26 25|24|23 0|
| 1 1 1 1 | 1 0 1 | H| imm24 |
 cond


A2 variant


BLX{<c>}{<q>} <label> 


Decode for this encoding 


imm32 = SignExtend(imm24:H:'0', 32); targetInstrSet = InstrSet_T32;


T1 

|15 14 13 12 11|10|9 0|15 14|13|12|11|10 0|
| 1 1 1 1 0 | S| imm10 | 1 1 |J1| 1|J2| imm11 |

T1 variant 


BL{<c>}{<q>} <label> 


Decode for this encoding 


I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10:imm11:'0', 32);
targetInstrSet = InstrSet_T32; 
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 





T2 
|15 14 13 12 11|10|9 0|15 14|13|12|11|10 1| 0|
| 1 1 1 1 0 | S| imm10H | 1 1 |J1| 0|J2| imm10L | H|


T2 variant


BLX{<c>}{<q>} <label> 


Decode for this encoding 


if H == '1' then UNDEFINED; 

I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10H:imm10L:'00', 32);
targetInstrSet = InstrSet_A32; 

if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1, T1 and T2: see Standard assembler syntax fields on page F2-2406.

For encoding A2: see Standard assembler syntax fields on page F2-2406. <c> must be AL or omitted.

<q> See Standard assembler syntax fields on page F2-2406.

<label> For encoding A1: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the BL instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are multiples of 4 in the range -33554432
to 33554428.

For encoding A2: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the BLX instruction to this label, then selects an
encoding with imm32 set to that offset. Permitted offsets are even numbers in the range -33554432
to 33554430.

For encoding T1: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the PC value of the BL instruction to this label, then selects an
encoding with imm32 set to that offset. Permitted offsets are even numbers in the range -16777216
to 16777214.

For encoding T2: the label of the instruction that is to be branched to. The assembler calculates the
required value of the offset from the Align(PC, 4) value of the BLX instruction to this label, then
selects an encoding with imm32 set to that offset. Permitted offsets are multiples of 4 in the range
-16777216 to 16777212.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if CurrentInstrSet() == InstrSet_A32 then
        LR = PC - 4;
    else
        LR = PC<31:1> : '1';
    if targetInstrSet == InstrSet_A32 then
        targetAddress = Align(PC,4) + imm32;
    else
        targetAddress = PC + imm32;
    SelectInstrSet(targetInstrSet);
    BranchWritePC(targetAddress);
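
The T1 decode of BL, in which I1 and I2 are derived from J1, J2 and S before sign extension, can be sketched in Python as below. The helper names sign_extend and decode_bl_t1 are hypothetical. The halfword field positions (S in bit 10 and imm10 in bits<9:0> of the first halfword, J1 in bit 13, J2 in bit 11 and imm11 in bits<10:0> of the second) are assumptions of this sketch.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode_bl_t1(hw1, hw2):
    """Return the signed branch offset imm32 for a T1 BL halfword pair."""
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)  # I1 = NOT(J1 EOR S)
    i2 = 1 - (j2 ^ s)  # I2 = NOT(J2 EOR S)
    # S:I1:I2:imm10:imm11:'0' is a 25-bit value.
    raw = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    return sign_extend(raw, 25)
```

The largest positive offset this produces is 16777214, which agrees with the permitted range given for encoding T1 under Assembler symbols.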
F5.1.26 BLX (register)


Branch with Link and Exchange (register) calls a subroutine at an address specified in the register, and if necessary 
changes to the instruction set indicated by bit[0] of the register value. If the value in bit[0] is 0, the instruction set 
after the branch will be A32. If the value in bit[0] is 1, the instruction set after the branch will be T32. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 0 1 0 |(1)(1)(1)(1)|(1)(1)(1)(1)|(1)(1)(1)(1)| 0 0 1 1 | Rm |
 cond


A1 variant


BLX{<c>}{<q>} <Rm> 


Decode for this encoding 
m = UInt(Rm); 
if m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12 11 10|9 8| 7|6 3|2 1 0|
| 0 1 0 0 0 1 |1 1| 1| Rm |(0)(0)(0)|


T1 variant 


BLX{<c>}{<q>} <Rm> 


Decode for this encoding 

m = UInt(Rm); 

if m == 15 then UNPREDICTABLE; 

if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rm> Is the general-purpose register holding the address to be branched to, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    target = R[m];
    if CurrentInstrSet() == InstrSet_A32 then
        next_instr_addr = PC - 4;
        LR = next_instr_addr;
    else
        next_instr_addr = PC - 2;
        LR = next_instr_addr<31:1> : '1';
    BXWritePC(target);


ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-2625
ID092916 Non-Confidential
F5.1.27 BX


Branch and Exchange causes a branch to an address and instruction set specified by a register. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 0 1 0 |(1)(1)(1)(1)|(1)(1)(1)(1)|(1)(1)(1)(1)| 0 0 0 1 | Rm |
 cond


A1 variant


BX{<c>}{<q>} <Rm> 


Decode for this encoding 


m = UInt(Rm); 
T1 


|15 14 13 12 11 10|9 8| 7|6 3|2 1 0|
| 0 1 0 0 0 1 |1 1| 0| Rm |(0)(0)(0)|


T1 variant 


BX{<c>}{<q>} <Rm> 


Decode for this encoding 

m = UInt(Rm); 

if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<Rm> For encoding A1: is the general-purpose register holding the address to be branched to, encoded in 
the "Rm" field. The PC can be used. 


For encoding T1: is the general-purpose register holding the address to be branched to, encoded in 
the "Rm" field. The PC can be used. 
Note


If <Rm> is the PC at a non word-aligned address, it results in UNPREDICTABLE behavior because the
address passed to the BXWritePC() pseudocode function has bits<1:0> = '10'.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    BXWritePC(R[m]);
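
The interworking behavior that BX relies on can be sketched in Python as follows. The function bx_write_pc is a hypothetical, simplified model of BXWritePC(): bit[0] of the target selects the instruction set, and the bits<1:0> = '10' case noted above is reported as an error rather than modelled. Clearing bit[0] of a T32 target address is an assumption of this sketch.

```python
def bx_write_pc(address):
    """Return (instruction_set, branch_address) for an interworking branch."""
    if address & 1:
        # bit[0] == 1 selects T32; the branch address has bit[0] cleared.
        return "T32", address & 0xFFFFFFFE
    if address & 2 == 0:
        # bits<1:0> == '00' selects A32 at a word-aligned address.
        return "A32", address
    # bits<1:0> == '10' is the UNPREDICTABLE case described in the Note.
    raise ValueError("UNPREDICTABLE: target bits<1:0> == '10'")
```

The architectural function permits a constrained set of behaviors in the '10' case; raising an exception here is simply the most conservative choice for a model.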


F5.1.28 BXJ 


Branch and Exchange, previously Branch and Exchange Jazelle. 


In ARMv8, BXJ behaves as a BX instruction, see BX. This means it causes a branch to an address and instruction set 
specified by a register. 





A1 
|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 0 1 0 |(1)(1)(1)(1)|(1)(1)(1)(1)|(1)(1)(1)(1)| 0 0 1 0 | Rm |
 cond


A1 variant


BXJ{<c>}{<q>} <Rm>


Decode for this encoding 
m = UInt(Rm); 
if m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12 11|10|9 8 7 6|5 4|3 0|15 14|13|12|11 10 9 8|7 6 5 4 3 2 1 0|
| 1 1 1 1 0 | 0|1 1 1 1|0 0| Rm | 1 0 |(0)| 0|(1)(1)(1)(1)|(0)(0)(0)(0)(0)(0)(0)(0)|


T1 variant 


BXJ{<c>}{<q>} <Rm> 


Decode for this encoding 

m = UInt(Rm); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rm> Is the general-purpose register holding the address to be branched to, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    BXWritePC(R[m]);
F5.1.29 CBNZ, CBZ

Compare and Branch on Nonzero and Compare and Branch on Zero compare the value in a register with zero, and 
conditionally branch forward a constant value. They do not affect the condition flags. 


T1 


|15 14 13 12|11|10| 9| 8|7 3|2 0|
| 1 0 1 1 |op| 0| i| 1| imm5 | Rn |


CBNZ variant 
Applies when op == 1. 
CBNZ{<q>} <Rn>, <label> 
CBZ variant 
Applies when op == 0. 
CBZ{<q>} <Rn>, <label> 
Decode for all variants of this encoding 
n = UInt(Rn); imm32 = ZeroExtend(i:imm5:'0', 32); nonzero = (op == '1');
if InITBlock() then UNPREDICTABLE; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose register to be tested, encoded in the "Rn" field. 
<label> Is the program label to be conditionally branched to. Its offset from the PC, a multiple of 2 and in
the range 0 to 126, is encoded as "i:imm5" times 2.
Operation 
EncodingSpecificOperations();
if nonzero != IsZero(R[n]) then
    BranchWritePC(PC + imm32);
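
The decode and branch test for CBNZ and CBZ can be sketched in Python as below. The names decode_cbz and cbz_target are hypothetical, and the field positions (op in bit 11, i in bit 9, imm5 in bits<7:3>, Rn in bits<2:0>) are assumptions of this sketch. The pc argument is the PC value as read by the instruction, and regs is any sequence of register values.

```python
def decode_cbz(hw):
    """Return (nonzero, n, imm32) for a 16-bit CBNZ/CBZ encoding."""
    op = (hw >> 11) & 1
    i = (hw >> 9) & 1
    imm5 = (hw >> 3) & 0x1F
    n = hw & 0x7
    imm32 = (i << 6) | (imm5 << 1)  # ZeroExtend(i:imm5:'0', 32)
    return bool(op), n, imm32


def cbz_target(hw, pc, regs):
    """Return the branch target, or None when the branch is not taken."""
    nonzero, n, imm32 = decode_cbz(hw)
    is_zero = regs[n] == 0
    if nonzero != is_zero:
        return pc + imm32
    return None
```

Because the offset is zero-extended, the branch can only go forward, and the largest offset is 126.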


F5.1.30 CLREX 


Clear-Exclusive clears the local monitor of the executing PE. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 1 1 1 | 0 1 0 1 | 0 1 1 1 |(1)(1)(1)(1)|(1)(1)(1)(1)|(0)(0)(0)(0)| 0 0 0 1 |(1)(1)(1)(1)|


A1 variant


CLREX{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 14|13|12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 1 1 1 | 0 0 1 1 | 1 0 1 1 |(1)(1)(1)(1)| 1 0 |(0)| 0|(1)(1)(1)(1)| 0 0 1 0 |(1)(1)(1)(1)|





T1 variant 
CLREX{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    ClearExclusiveLocal(ProcessorID());
F5.1.31 CLZ


Count Leading Zeros returns the number of binary zero bits before the first binary one bit in a value. 
A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 1 1 0 |(1)(1)(1)(1)| Rd |(1)(1)(1)(1)| 0 0 0 1 | Rm |
 cond


A1 variant

CLZ{<c>}{<q>} <Rd>, <Rm> 

Decode for this encoding 

d = UInt(Rd); m = UInt(Rm); 

if d == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 1 1 | Rn | 1 1 1 1 | Rd | 1 0 0 0 | Rm |


T1 variant 
CLZ{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); n = UInt(Rn);
if m != n || d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior

If m != n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as described, with no change to its behavior and no additional side effects.
• The instruction executes with the additional decode: m = UInt(Rn);.
• The value in the destination register is UNKNOWN.


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 


<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 


<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. 


For encoding T1: is the general-purpose source register, encoded in the "Rm" field. It must be 
encoded with an identical value in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = CountLeadingZeroBits(R[m]);
    R[d] = result<31:0>;
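
The count performed by CLZ can be sketched in Python as below. The function count_leading_zero_bits is a hypothetical stand-in for CountLeadingZeroBits(), and the 32-bit default width is an assumption matching the AArch32 general-purpose registers.

```python
def count_leading_zero_bits(value, bits=32):
    """Number of zero bits above the most significant set bit of value."""
    value &= (1 << bits) - 1
    # int.bit_length() is the position of the highest set bit plus one,
    # and is 0 for an input of zero, so a zero input yields `bits`.
    return bits - value.bit_length()
```

An input of zero therefore returns 32, and an input with bit 31 set returns 0.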


F5.1.32  CMN (immediate) 


Compare Negative (immediate) adds a register value and an immediate value. It updates the condition flags based 
on the result, and discards the result. 


A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 0|
|!=1111| 0 0 1 1 | 0 1 1 1 | Rn |(0)(0)(0)(0)| imm12 |
 cond


A1 variant


CMN{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 


n = UInt(Rn); imm32 = A32ExpandImm(imm12) ; 
T1 


|15 14 13 12 11|10| 9|8 7 6 5| 4|3 0|15|14 12|11 10 9 8|7 0|
| 1 1 1 1 0 | i| 0|1 0 0 0| 1| Rn | 0| imm3 | 1 1 1 1 | imm8 |


T1 variant 
CMN{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 


n = UInt(Rn); imm32 = T32ExpandImm(i:imm3:imm8) ; 
if n == 15 then UNPREDICTABLE; 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
used, but this is deprecated.

For encoding T1: is the general-purpose source register, encoded in the "Rn" field.

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 


For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(R[n], imm32, '0');
    PSTATE.<N,Z,C,V> = nzcv;
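
The flag computation used by CMN can be sketched in Python as below. The function add_with_carry is a hypothetical model of AddWithCarry() written from its usual description (an unsigned sum and a signed sum compared against the truncated result), and cmn is a hypothetical wrapper. Neither is the architectural definition.

```python
def add_with_carry(x, y, carry_in, bits=32):
    """Return (result, (N, Z, C, V)) for x + y + carry_in on `bits`-bit values."""
    mask = (1 << bits) - 1

    def signed(v):
        # Two's complement interpretation of a `bits`-bit value.
        return v - (1 << bits) if v >> (bits - 1) else v

    unsigned_sum = x + y + carry_in
    signed_sum = signed(x) + signed(y) + carry_in
    result = unsigned_sum & mask
    n = result >> (bits - 1)
    z = int(result == 0)
    c = int(result != unsigned_sum)           # unsigned overflow
    v = int(signed(result) != signed_sum)     # signed overflow
    return result, (n, z, c, v)


def cmn(rn, imm32):
    """Flags produced by CMN: add with carry-in 0 and discard the result."""
    return add_with_carry(rn, imm32, 0)[1]
```

For instance, cmn(0xFFFFFFFF, 1) sets Z and C, because the 32-bit sum wraps to zero.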


F5.1.33 CMN (register) 


Compare Negative (register) adds a register value and an optionally-shifted register value. It updates the condition
flags based on the result, and discards the result.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 7|6 5| 4|3 0|
|!=1111| 0 0 0 1 | 0 1 1 1 | Rn |(0)(0)(0)(0)| imm5 |type| 0| Rm |
 cond


Rotate right with extend variant 
Applies when imm5 == 00000 && type == 11. 


CMN{<c>}{<q>} <Rn>, <Rm>, RRX 

Shift or rotate by value variant 

Applies when !(imm5 == 00000 && type == 11). 
CMN{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1 


|15 14 13 12 11 10|9 8 7 6|5 3|2 0|
| 0 1 0 0 0 0 |1 0 1 1| Rm | Rn |


T1 variant 


CMN{<c>}{<q>} <Rn>, <Rm> 


Decode for this encoding 


n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


|15 14 13 12 11|10 9|8 7 6 5| 4|3 0|15|14 12|11 10 9 8|7 6|5 4|3 0|
| 1 1 1 0 1 | 0 1|1 0 0 0| 1| Rn |(0)| imm3 | 1 1 1 1 |imm2|type| Rm |


Rotate right with extend variant 
Applies when imm3 == 000 && imm2 == 00 && type == 11. 


CMN{<c>}{<q>} <Rn>, <Rm>, RRX 


Shift or rotate by value variant 


Applies when !(imm3 == 000 && imm2 == 00 && type == 11). 
CMN{<c>}.W <Rn>, <Rm> // <Rn>, <Rm> can be represented in T1
CMN{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 

n = UInt(Rn); m = UInt(Rm); 

(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2); 

if n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.

For encoding T1 and T2: is the first general-purpose source register, encoded in the "Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.

For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field.

<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.


For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], shifted, '0');
    PSTATE.<N,Z,C,V> = nzcv;


F5.1.34 CMN (register-shifted register) 
Compare Negative (register-shifted register) adds a register value and a register-shifted register value. It updates the 
condition flags based on the result, and discards the result. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 8| 7|6 5| 4|3 0|
|!=1111| 0 0 0 1 | 0 1 1 1 | Rn |(0)(0)(0)(0)| Rs | 0|type| 1| Rm |
 cond


A1 variant


CMN{<c>}{<q>} <Rn>, <Rm>, <type> <Rs> 


Decode for this encoding 

n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
shift_t = DecodeRegShift(type);
if n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], shifted, '0');
    PSTATE.<N,Z,C,V> = nzcv;


F5.1.35 CMP (immediate) 


Compare (immediate) subtracts an immediate value from a register value. It updates the condition flags based on 
the result, and discards the result. 


A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 0|
|!=1111| 0 0 1 1 | 0 1 0 1 | Rn |(0)(0)(0)(0)| imm12 |
 cond


A1 variant


CMP{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 


n = UInt(Rn); imm32 = A32ExpandImm(imm12) ; 


T1 
|15 14 13 12 11|10 8|7 0|
| 0 0 1 0 1 | Rn | imm8 |
T1 variant 


CMP{<c>}{<q>} <Rn>, #<imm8> 


Decode for this encoding 


n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); 
T2 


|15 14 13 12 11|10| 9|8 7 6 5| 4|3 0|15|14 12|11 10 9 8|7 0|
| 1 1 1 1 0 | i| 0|1 1 0 1| 1| Rn | 0| imm3 | 1 1 1 1 | imm8 |


T2 variant 


CMP{<c>}.W <Rn>, #<const> // <Rn>, <const> can be represented in T1
CMP{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 
n = UInt(Rn); imm32 = T32ExpandImm(i:imm3:imm8) ; 
if n == 15 then UNPREDICTABLE; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be 
used, but this is deprecated. 
For encoding T1: is a general-purpose source register, encoded in the "Rn" field. 
For encoding T2: is the general-purpose source register, encoded in the "Rn" field. 

<imm8> Is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field.

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.


For encoding T2: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(R[n], NOT(imm32), '1');
    PSTATE.<N,Z,C,V> = nzcv;
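
The subtraction in CMP is expressed as an addition of the inverted operand with a carry-in of 1. A minimal Python sketch of that identity follows. The names cmp_flags and MASK32 are hypothetical, and the flag derivation is a simplified model rather than the architectural AddWithCarry() function.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(rn, imm32):
    """Return (N, Z, C, V) for CMP: R[n] + NOT(imm32) + 1, result discarded."""

    def signed(v):
        # Two's complement interpretation of a 32-bit value.
        return v - (1 << 32) if v & 0x80000000 else v

    inverted = ~imm32 & MASK32
    unsigned_sum = rn + inverted + 1
    signed_sum = signed(rn) + signed(inverted) + 1
    result = unsigned_sum & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(result != unsigned_sum)        # set when no borrow occurs
    v = int(signed(result) != signed_sum)  # signed overflow
    return n, z, c, v
```

With this formulation the C flag is set when R[n] is greater than or equal to the operand as unsigned values, which is why cmp_flags(5, 5) sets both Z and C.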


F5.1.36 CMP (register) 


Compare (register) subtracts an optionally-shifted register value from a register value. It updates the condition flags 
based on the result, and discards the result. 


A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 7|6 5| 4|3 0|
|!=1111| 0 0 0 1 | 0 1 0 1 | Rn |(0)(0)(0)(0)| imm5 |type| 0| Rm |
 cond


Rotate right with extend variant 
Applies when imm5 == 00000 && type == 11. 


CMP{<c>}{<q>} <Rn>, <Rm>, RRX 


Shift or rotate by value variant 
Applies when !(imm5 == 00000 && type == 11). 


CMP{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 
n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1 


|15 14 13 12 11 10|9 8 7 6|5 3|2 0|
| 0 1 0 0 0 0 |1 0 1 0| Rm | Rn |


T1 variant 


CMP{<c>}{<q>} <Rn>, <Rm> // <Rn> and <Rm> both from R0-R7


Decode for this encoding 
n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


|15 14 13 12 11 10|9 8| 7|6 3|2 0|
| 0 1 0 0 0 1 |0 1| N| Rm | Rn |


T2 variant 


CMP{<c>}{<q>} <Rn>, <Rm> // <Rn> and <Rm> not both from R0-R7


Decode for this encoding 


n = UInt(N:Rn); m = UInt(Rm);
(shift_t, shift_n) = (SRType_LSL, 0);
if n < 8 && m < 8 then UNPREDICTABLE;
if n == 15 || m == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If n < 8 && m < 8, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as described, with no change to its behavior and no additional side effects.
• The condition flags become UNKNOWN.
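
The T2 decode, in which the register number n is formed from N:Rn, together with its two UNPREDICTABLE checks, can be sketched in Python as below. The function decode_cmp_t2 is hypothetical. The field positions (N in bit 7, Rm in bits<6:3>, Rn in bits<2:0>) are assumptions of this sketch.

```python
def decode_cmp_t2(hw):
    """Return (n, m, unpredictable) for a 16-bit T2 CMP (register) encoding.

    Assumed layout: N = bit 7, Rm = bits<6:3>, Rn = bits<2:0>.
    """
    n = (((hw >> 7) & 1) << 3) | (hw & 0x7)  # n = UInt(N:Rn)
    m = (hw >> 3) & 0xF
    # Both registers in R0-R7, or either register being the PC, is
    # UNPREDICTABLE according to the decode pseudocode.
    unpredictable = (n < 8 and m < 8) or n == 15 or m == 15
    return n, m, unpredictable
```

An assembler would select encoding T1 instead when both registers are in the range R0-R7, which is why T2 treats that combination as UNPREDICTABLE.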
T3 
|15 14 13 12 11|10 9|8 7 6 5| 4|3 0|15|14 12|11 10 9 8|7 6|5 4|3 0|
| 1 1 1 0 1 | 0 1|1 1 0 1| 1| Rn |(0)| imm3 | 1 1 1 1 |imm2|type| Rm |


Rotate right with extend variant 
Applies when imm3 == 000 && imm2 == 00 && type == 11. 


CMP{<c>}{<q>} <Rn>, <Rm>, RRX 


Shift or rotate by value variant 
Applies when !(imm3 == 000 && imm2 == 00 && type == 11). 


CMP{<c>}.W <Rn>, <Rm> // <Rn>, <Rm> can be represented in T1 or T2 
CMP{<c>}{<q>} <Rn>, <Rm>, <shift> #<amount> 


Decode for all variants of this encoding 

n = UInt(Rn); m = UInt(Rm); 

(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2); 

if n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.
For encoding T1 and T3: is the first general-purpose source register, encoded in the "Rn" field.
For encoding T2: is the first general-purpose source register, encoded in the "N:Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.
For encoding T1, T2 and T3: is the second general-purpose source register, encoded in the "Rm"
field.
<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01
ASR when type = 10 
ROR when type = 11 
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
For encoding T3: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], NOT(shifted), '1');
    PSTATE.<N,Z,C,V> = nzcv;
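
The flag computation in the operation above can be illustrated outside the pseudocode. The following Python sketch models a 32-bit AddWithCarry() and uses it to derive the N, Z, C, and V flags that a compare produces. The names to_signed, add_with_carry, and cmp_flags are illustrative assumptions and are not part of the architecture pseudocode, and the second operand is assumed to have been shifted already.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit unsigned value as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_carry(x, y, carry_in):
    """Model of a 32-bit AddWithCarry(): returns (result, (n, z, c, v))."""
    x &= MASK32
    y &= MASK32
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(result != unsigned_sum)           # carry out of bit 31
    v = int(to_signed(result) != signed_sum)  # signed overflow
    return result, (n, z, c, v)


def cmp_flags(rn, shifted_operand):
    """Flags set by a compare: Rn + NOT(operand) + 1, result discarded."""
    _, nzcv = add_with_carry(rn, ~shifted_operand & MASK32, 1)
    return nzcv


print(cmp_flags(5, 5))           # equal operands
print(cmp_flags(3, 5))           # Rn lower than the operand
print(cmp_flags(0x80000000, 1))  # signed overflow case
```

In this model the C flag is set when the subtraction does not borrow, which is why comparing equal values yields C == 1 together with Z == 1.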
F5.1.37 CMP (register-shifted register)


Compare (register-shifted register) subtracts a register-shifted register value from a register value. It updates the 
condition flags based on the result, and discards the result. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11     8| 7| 6  5| 4| 3     0|
|  cond  | 0  0  0  1| 0  1  0  1|   Rn   |(0)(0)(0)(0)|   Rs   | 0| type| 1|   Rm   |

A1 variant

CMP{<c>}{<q>} <Rn>, <Rm>, <type> <Rs>


Decode for this encoding 


n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
shift_t = DecodeRegShift(type);
if n == 15 || m == 15 || s == 15 then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11
<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], NOT(shifted), '1');
    PSTATE.<N,Z,C,V> = nzcv;
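
Because the shift amount comes from the bottom 8 bits of <Rs>, it can range from 0 to 255, which is wider than the register. The following Python sketch shows one way to model the shifted operand for each <type> value under that range. The function name shift_by_register and the SHIFT_TYPES table are illustrative assumptions, and the carry output of the shift is not modelled.

```python
MASK32 = 0xFFFFFFFF
SHIFT_TYPES = {0b00: "LSL", 0b01: "LSR", 0b10: "ASR", 0b11: "ROR"}


def shift_by_register(value, type_bits, rs):
    """Return the 32-bit value shifted as a register-shifted operand."""
    value &= MASK32
    amount = rs & 0xFF  # only the bottom 8 bits of Rs are used
    kind = SHIFT_TYPES[type_bits]
    if amount == 0:
        return value
    if kind == "LSL":
        return (value << amount) & MASK32  # 32 or more gives zero
    if kind == "LSR":
        return value >> amount             # 32 or more gives zero
    if kind == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32  # 32 or more replicates the sign
    rotation = amount % 32                  # ROR wraps modulo 32
    return ((value >> rotation) | (value << (32 - rotation))) & MASK32


print(hex(shift_by_register(0x12345678, 0b11, 36)))
```

A value of 0x100 in <Rs> behaves as a shift amount of zero in this model, because bits above bit 7 are ignored.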
F5.1.38 CPS, CPSID, CPSIE

Change PE State changes one or more of the PSTATE.{A, I, F} interrupt mask bits and, optionally, the PSTATE.M
mode field, without changing any other PSTATE bits.

CPS is treated as NOP if executed in User mode unless it is defined as being CONSTRAINED UNPREDICTABLE elsewhere
in this section.

The PE checks whether the value being written to PSTATE.M is legal. See Illegal changes to PSTATE.M on
page G1-3809.

A1 
[A1 encoding diagram: fields imod, M, A, I, F, mode]
CPS variant 
Applies when imod == 00 && M == 1. 
CPS{<q>} #<mode> // Cannot be conditional 
CPSID variant 
Applies when imod == 11 && M == 0. 
CPSID{<q>} <iflags> // Cannot be conditional 
CPSID variant 
Applies when imod == 11 && M == 1. 
CPSID{<q>} <iflags> , #<mode> // Cannot be conditional 
CPSIE variant 
Applies when imod == 10 && M == 0. 
CPSIE{<q>} <iflags> // Cannot be conditional 
CPSIE variant 
Applies when imod == 10 && M == 1. 
CPSIE{<q>} <iflags> , #<mode> // Cannot be conditional 
Decode for all variants of this encoding 
if mode != '00000' && M == '0' then UNPREDICTABLE;
if (imod<1> == '1' && A:I:F == '000') || (imod<1> == '0' && A:I:F != '000') then UNPREDICTABLE;
enable = (imod == '10'); disable = (imod == '11'); changemode = (M == '1');
affectA = (A == '1'); affectI = (I == '1'); affectF = (F == '1');
if (imod == '00' && M == '0') || imod == '01' then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If imod == '01', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.

If imod == '00' && M == '0', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.

If mode != '00000' && M == '0', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: changemode = TRUE.
• The instruction executes as described, and the value specified by mode is ignored. There are no additional
side-effects.

If imod<1> == '1' && A:I:F == '000', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction behaves as if imod<1> == '0'.
• The instruction behaves as if A:I:F has an UNKNOWN nonzero value.

If imod<1> == '0' && A:I:F != '000', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction behaves as if imod<1> == '1'.
• The instruction behaves as if A:I:F == '000'.

T1 


[T1 encoding diagram: fields im, A, I, F]

CPSID variant
Applies when im == 1.

CPSID{<q>} <iflags> // Not permitted in IT block

CPSIE variant
Applies when im == 0.

CPSIE{<q>} <iflags> // Not permitted in IT block


Decode for all variants of this encoding 


if A:I:F == '000' then UNPREDICTABLE;
enable = (im == '0'); disable = (im == '1'); changemode = FALSE;
affectA = (A == '1'); affectI = (I == '1'); affectF = (F == '1');
if InITBlock() then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If A:I:F == '000', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.


T2 


[T2 encoding diagram: fields imod, M, A, I, F, mode]

CPS variant
Applies when imod == 00 && M == 1.

CPS{<q>} #<mode> // Not permitted in IT block

CPSID variant
Applies when imod == 11 && M == 0.

CPSID.W <iflags> // Not permitted in IT block

CPSID variant
Applies when imod == 11 && M == 1.

CPSID{<q>} <iflags>, #<mode> // Not permitted in IT block

CPSIE variant
Applies when imod == 10 && M == 0.

CPSIE.W <iflags> // Not permitted in IT block

CPSIE variant
Applies when imod == 10 && M == 1.

CPSIE{<q>} <iflags>, #<mode> // Not permitted in IT block


Decode for all variants of this encoding 


if imod == '00' && M == '0' then SEE "Hint instructions";
if mode != '00000' && M == '0' then UNPREDICTABLE;
if (imod<1> == '1' && A:I:F == '000') || (imod<1> == '0' && A:I:F != '000') then UNPREDICTABLE;
enable = (imod == '10'); disable = (imod == '11'); changemode = (M == '1');
affectA = (A == '1'); affectI = (I == '1'); affectF = (F == '1');
if imod == '01' || InITBlock() then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If imod == '01', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.

If mode != '00000' && M == '0', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: changemode = TRUE.
• The instruction executes as described, and the value specified by mode is ignored. There are no additional
side-effects.

If imod<1> == '1' && A:I:F == '000', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction behaves as if imod<1> == '0'.
• The instruction behaves as if A:I:F has an UNKNOWN nonzero value.

If imod<1> == '0' && A:I:F != '000', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction behaves as if imod<1> == '1'.
• The instruction behaves as if A:I:F == '000'.


Notes for all encodings 


Hint instructions: In encoding T2, if the imod field is 00 and the M bit is 0, a hint instruction is encoded. To determine
which hint instruction, see Branches and miscellaneous control on page F3-2471.


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 
<iflags> Is a sequence of one or more of the following, specifying which interrupt mask bits are affected:
a Sets the A bit in the instruction, causing the specified effect on PSTATE.A, the SError
interrupt mask bit.
i Sets the I bit in the instruction, causing the specified effect on PSTATE.I, the IRQ
interrupt mask bit.
f Sets the F bit in the instruction, causing the specified effect on PSTATE.F, the FIQ
interrupt mask bit.


<mode> Is the number of the mode to change to, in the range 0 to 31, encoded in the "mode" field. 


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then
    EncodingSpecificOperations();
    if PSTATE.EL != EL0 then
        if enable then
            if affectA then PSTATE.A = '0';
            if affectI then PSTATE.I = '0';
            if affectF then PSTATE.F = '0';
        if disable then
            if affectA then PSTATE.A = '1';
            if affectI then PSTATE.I = '1';
            if affectF then PSTATE.F = '1';
        if changemode then
            // AArch32.WriteModeByInstr() sets PSTATE.IL to 1 if this is an illegal mode change.
            AArch32.WriteModeByInstr(mode);
else
    EncodingSpecificOperations();
    if PSTATE.EL != EL0 then
        if enable then
            if affectA then PSTATE.A = '0';
            if affectI then PSTATE.I = '0';
            if affectF then PSTATE.F = '0';
        if disable then
            if affectA then PSTATE.A = '1';
            if affectI then PSTATE.I = '1';
            if affectF then PSTATE.F = '1';
        if changemode then
            // AArch32.WriteModeByInstr() sets PSTATE.IL to 1 if this is an illegal mode change.
            AArch32.WriteModeByInstr(mode);
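
The A1 decode checks and the mask update can be followed more easily with concrete values. The Python sketch below mirrors the A1 decode conditions and the enable and disable steps of the operation. The names decode_cps_a1 and execute_cps, the dictionary layout, and the use of None to mark an UNPREDICTABLE encoding are all assumptions made for illustration. The mode change performed by AArch32.WriteModeByInstr() is deliberately not modelled.

```python
def decode_cps_a1(imod, m, aif, mode):
    """Return decoded controls, or None where the A1 decode is UNPREDICTABLE."""
    if mode != 0 and m == 0:
        return None
    if ((imod >> 1) == 1 and aif == 0) or ((imod >> 1) == 0 and aif != 0):
        return None
    if (imod == 0b00 and m == 0) or imod == 0b01:
        return None
    return {
        "enable": imod == 0b10,
        "disable": imod == 0b11,
        "changemode": m == 1,
        "affect": {
            "A": bool(aif & 0b100),
            "I": bool(aif & 0b010),
            "F": bool(aif & 0b001),
        },
    }


def execute_cps(masks, decoded, current_el):
    """Apply the mask changes; at EL0 the masks are left unchanged."""
    new_masks = dict(masks)
    if current_el == 0:
        return new_masks
    for bit, affected in decoded["affect"].items():
        if affected and decoded["enable"]:
            new_masks[bit] = 0  # CPSIE clears the mask bit
        if affected and decoded["disable"]:
            new_masks[bit] = 1  # CPSID sets the mask bit
    return new_masks


# CPSID if: imod == 0b11, M == 0, A:I:F == 0b011, mode == 0
cpsid_if = decode_cps_a1(0b11, 0, 0b011, 0)
print(execute_cps({"A": 0, "I": 0, "F": 0}, cpsid_if, current_el=1))
```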
F5.1.39 CRC32


CRC32 performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes
an input CRC value in the first source operand, performs a CRC on the input value in the second source operand,
and returns the output CRC value. The second source operand can be 8, 16, or 32 bits. To align with common usage,
the bit order of the values is reversed as part of the operation, and the polynomial 0x04C11DB7 is used for the CRC
calculation.


In ARMv8-A, this is an OPTIONAL instruction. 


Note

ID_ISAR5.CRC32 indicates whether this instruction is supported in the T32 and A32 instruction sets.

A1 


[A1 encoding diagram: fields cond, sz, Rn, Rd, C, Rm]


CRC32B variant 
Applies when sz == 00. 


CRC32B{<q>} <Rd>, <Rn>, <Rm> 


CRC32H variant 
Applies when sz == 01.


CRC32H{<q>} <Rd>, <Rn>, <Rm> 


CRC32W variant 
Applies when sz == 10. 


CRC32W{<q>} <Rd>, <Rn>, <Rm> 


Decode for all variants of this encoding 


if !HaveCRCExt() then UNDEFINED;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
size = 8 << UInt(sz);
crc32c = (C == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if size == 64 then UNPREDICTABLE;
if cond != '1110' then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If size == 64, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: size = 32;.

If cond != '1110', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes unconditionally.
• The instruction executes conditionally.

T1 
[T1 encoding diagram: fields C, Rn, Rd, sz, Rm]


CRC32B variant 
Applies when sz == 00. 


CRC32B{<q>} <Rd>, <Rn>, <Rm> 


CRC32H variant 
Applies when sz == 01. 


CRC32H{<q>} <Rd>, <Rn>, <Rm> 


CRC32W variant 
Applies when sz == 10. 


CRC32W{<q>} <Rd>, <Rn>, <Rm> 


Decode for all variants of this encoding 


if !HaveCRCExt() then UNDEFINED;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
size = 8 << UInt(sz);
crc32c = (C == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if size == 64 then UNPREDICTABLE;
if InITBlock() then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If size == 64, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The instruction executes with the additional decode: size = 32;. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. A CRC32 instruction must be unconditional.
<Rd> Is the general-purpose accumulator output register, encoded in the "Rd" field. 

<Rn> Is the general-purpose accumulator input register, encoded in the "Rn" field. 

<Rm> Is the general-purpose data source register, encoded in the "Rm" field. 

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    acc = R[n]; // accumulator
    val = R[m]<size-1:0>; // input value
    poly = (if crc32c then 0x1EDC6F41 else 0x04C11DB7)<31:0>;
    tempacc = BitReverse(acc):Zeros(size);
    tempval = BitReverse(val):Zeros(32);
    // Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
    R[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));
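
The operation above can be checked against a software CRC. The following Python sketch transcribes the pseudocode bit for bit and compares the byte variant with zlib.crc32 from the Python standard library. The helper names bit_reverse, poly32_mod2, crc32_step, and crc_bytes are illustrative assumptions. The initial value 0xFFFFFFFF and the final inversion belong to the usual software convention for CRC-32 and are assumed here; they are not part of the instruction.

```python
import zlib


def bit_reverse(value, width):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for i in range(width):
        if (value >> i) & 1:
            result |= 1 << (width - 1 - i)
    return result


def poly32_mod2(data, width, poly):
    """Remainder of a width-bit polynomial modulo x^32 + poly over {0,1}."""
    for i in range(width - 1, 31, -1):
        if (data >> i) & 1:
            data ^= ((1 << 32) | poly) << (i - 32)
    return data & 0xFFFFFFFF


def crc32_step(acc, val, size, crc32c=False):
    """One CRC32 or CRC32C instruction step on a value of `size` bits."""
    poly = 0x1EDC6F41 if crc32c else 0x04C11DB7
    val &= (1 << size) - 1
    tempacc = bit_reverse(acc, 32) << size  # BitReverse(acc):Zeros(size)
    tempval = bit_reverse(val, size) << 32  # BitReverse(val):Zeros(32)
    return bit_reverse(poly32_mod2(tempacc ^ tempval, 32 + size, poly), 32)


def crc_bytes(data, crc32c=False):
    """Conventional software CRC built from byte-sized steps."""
    acc = 0xFFFFFFFF
    for byte in data:
        acc = crc32_step(acc, byte, 8, crc32c)
    return acc ^ 0xFFFFFFFF


print(crc_bytes(b"123456789") == zlib.crc32(b"123456789"))
```

In this model a 32-bit step on a little-endian word gives the same accumulator as four successive byte steps, which is how the wider variants are typically used to process several bytes at once.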
F5.1.40 CRC32C


CRC32C performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes 
an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, 
and returns the output CRC value. The second source operand can be 8, 16, or 32 bits. To align with common usage, 
the bit order of the values is reversed as part of the operation, and the polynomial 0x1EDC6F41 is used for the CRC 
calculation. 


In ARMv8-A, this is an OPTIONAL instruction. 


Note

ID_ISAR5.CRC32 indicates whether this instruction is supported in the T32 and A32 instruction sets.

A1 


[A1 encoding diagram: fields cond, sz, Rn, Rd, C, Rm]


CRC32CB variant 
Applies when sz == 00. 


CRC32CB{<q>} <Rd>, <Rn>, <Rm> 


CRC32CH variant 
Applies when sz == 01.


CRC32CH{<q>} <Rd>, <Rn>, <Rm> 


CRC32CW variant 
Applies when sz == 10. 


CRC32CW{<q>} <Rd>, <Rn>, <Rm> 


Decode for all variants of this encoding 


if !HaveCRCExt() then UNDEFINED;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
size = 8 << UInt(sz);
crc32c = (C == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if size == 64 then UNPREDICTABLE;
if cond != '1110' then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If size == 64, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: size = 32;.

If cond != '1110', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes unconditionally.
• The instruction executes conditionally.

T1 
[T1 encoding diagram: fields C, Rn, Rd, sz, Rm]


CRC32CB variant 
Applies when sz == 00. 


CRC32CB{<q>} <Rd>, <Rn>, <Rm> 


CRC32CH variant 
Applies when sz == 01. 


CRC32CH{<q>} <Rd>, <Rn>, <Rm> 


CRC32CW variant 
Applies when sz == 10. 


CRC32CW{<q>} <Rd>, <Rn>, <Rm> 


Decode for all variants of this encoding 


if !HaveCRCExt() then UNDEFINED;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
size = 8 << UInt(sz);
crc32c = (C == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if size == 64 then UNPREDICTABLE;
if InITBlock() then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If size == 64, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The instruction executes with the additional decode: size = 32;. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<q> See Standard assembler syntax fields on page F2-2406. A CRC32C instruction must be
unconditional.
<Rd> Is the general-purpose accumulator output register, encoded in the "Rd" field. 
<Rn> Is the general-purpose accumulator input register, encoded in the "Rn" field. 
<Rm> Is the general-purpose data source register, encoded in the "Rm" field. 

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    acc = R[n]; // accumulator
    val = R[m]<size-1:0>; // input value
    poly = (if crc32c then 0x1EDC6F41 else 0x04C11DB7)<31:0>;
    tempacc = BitReverse(acc):Zeros(size);
    tempval = BitReverse(val):Zeros(32);
    // Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
    R[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));
F5.1.41 DBG


In ARMv8, DBG executes as a NOP. ARM deprecates any use of the DBG instruction. 
A1 


[A1 encoding diagram: fields cond, option]

A1 variant


DBG{<c>}{<q>} #<option> 


Decode for this encoding 


// DBG executes as a NOP. The 'option' field is ignored 


T1

[T1 encoding diagram: field option]


T1 variant 


DBG{<c>}{<q>} #<option> 


Decode for this encoding 


// DBG executes as a NOP. The 'option' field is ignored 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<option> Is a 4-bit unsigned immediate, in the range 0 to 15, encoded in the "option" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();


F5.1.42 DCPS1, DCPS2, DCPS3 
DCPSx, Debug Change PE State to ELx, where x is 1, 2, or 3.

When executed in Debug state, the target Exception level of the instruction is:
• ELx, if the instruction is executed at an Exception level lower than ELx.
• Otherwise, the Exception level at which the instruction is executed.

On executing a DCPSx instruction in Debug state when the instruction is not UNDEFINED:
• If the instruction is executed at an Exception level that is lower than the target Exception level, the PE enters
the target Exception level, ELx, and:
— If ELx is using AArch64, the PE selects SP_ELx.
— If the target Exception level is EL1 using AArch32, the PE enters Supervisor mode.
— If the instruction was executed in Non-secure state and the target Exception level is EL2 using
AArch32, the PE enters Hyp mode.
— If the target Exception level is EL3 using AArch32, the PE enters Supervisor mode and SCR.NS is set
to 0.
• Otherwise, there is no change to the Exception level and:
— If the instruction was executed at EL1, the PE enters Supervisor mode.
— If the instruction was executed at EL2, the PE remains in Hyp mode.
— If the instruction was a DCPS1 instruction executed at EL3, the PE enters Supervisor mode and SCR.NS
is set to 0.
— If the instruction was a DCPS3 instruction executed at EL3, the PE enters Monitor mode and SCR.NS is
set to 0.

These instructions are always UNDEFINED in Non-debug state.

DCPS1 is UNDEFINED at EL0 in Non-secure state if either:
• EL2 is implemented and using AArch64 and HCR_EL2.TGE == 1.
• EL2 is implemented and using AArch32 and HCR.TGE == 1.

DCPS2 is UNDEFINED at all Exception levels if EL2 is not implemented.

DCPS2 is UNDEFINED in the following states if EL2 is implemented:
• At EL0 and EL1 in Secure state.
• At EL3 if EL3 is using AArch32.

DCPS3 is UNDEFINED at all Exception levels if either:
• EDSCR.SDD == 1.
• EL3 is not implemented.

On executing a DCPSx instruction that is not UNDEFINED and targets ELx:
• If ELx is using AArch64:
— ELR_ELx, SPSR_ELx, and ESR_ELx become UNKNOWN.
— DLR_EL0 and DSPSR_EL0 become UNKNOWN.
• If ELx is using AArch32, DLR and DSPSR become UNKNOWN and:
— If the target Exception level is EL1 or EL3, the LR and SPSR of the target PE mode become
UNKNOWN.
— If the target Exception level is EL2, then ELR_hyp, SPSR_hyp, and HSR become UNKNOWN.

For more information on the operation of these instructions, see DCPS<n> on page H2-4870.
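
The rules above for the target Exception level and for the UNDEFINED cases can be summarised as a small decision function. The Python sketch below is one possible reading of those rules and is not the architecture's DCPSInstruction() pseudocode. The function names and every keyword parameter (halted, secure, have_el2, have_el3, tge, sdd, el3_aarch32) are hypothetical inputs introduced for illustration, where tge stands for HCR_EL2.TGE or HCR.TGE and sdd stands for EDSCR.SDD.

```python
def dcps_target_el(x, current_el):
    """Target Exception level: ELx if lower than ELx, else the current level."""
    return max(x, current_el)


def dcps_is_undefined(x, current_el, *, halted, secure, have_el2, have_el3,
                      tge=False, sdd=False, el3_aarch32=False):
    """Return True where the description says DCPSx is UNDEFINED."""
    if not halted or x not in (1, 2, 3):
        return True  # always UNDEFINED in Non-debug state
    if x == 1:
        return current_el == 0 and not secure and have_el2 and tge
    if x == 2:
        if not have_el2:
            return True
        return (secure and current_el in (0, 1)) or (current_el == 3 and el3_aarch32)
    return sdd or not have_el3  # DCPS3


print(dcps_target_el(1, 0), dcps_target_el(1, 3))
```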


T1 


|15 14 13 12|11 10  9  8| 7  6  5  4| 3  2  1  0|15 14 13 12|11 10  9  8| 7  6  5  4| 3  2| 1  0|
| 1  1  1  1| 0  1  1  1| 1  0  0  0| 1  1  1  1| 1  0  0  0| 0  0  0  0| 0  0  0  0| 0  0| opt |


DCPS1 variant 

Applies when opt == 01. 

DCPS1 

DCPS2 variant 

Applies when opt == 10. 

DCPS2 

DCPS3 variant 

Applies when opt == 11. 

DCPS3 

Decode for all variants of this encoding 


if !Halted() || opt == '00' then UNDEFINED;


Operation 


DCPSInstruction(opt);


F5.1.43 DMB 


Data Memory Barrier is a memory barrier that ensures the ordering of observations of memory accesses, see Data 
Memory Barrier (DMB) on page E2-2336. 


A1 


[A1 encoding diagram: field option]

A1 variant


DMB{<c>}{<q>} {<option>} 


Decode for this encoding 


// No additional decoding required 


T1 


[T1 encoding diagram: field option]


T1 variant 


DMB{<c>}{<q>} {<option>} 


Decode for this encoding 


// No additional decoding required 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<option> Specifies an optional limitation on the barrier operation. Values are:

SY Full system is the required shareability domain, reads and writes are the required access
types in both Group A and Group B. Can be omitted. This option is referred to as the
full system DMB. Encoded as option = 0b1111.

ST Full system is the required shareability domain, writes are the required access type in
both Group A and Group B. SYST is a synonym for ST. Encoded as option = 0b1110.

LD Full system is the required shareability domain, reads are the required access type in
Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b1101.

ISH Inner Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as option = 0b1011.

ISHST Inner Shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as option = 0b1010.

ISHLD Inner Shareable is the required shareability domain, reads are the required access type
in Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b1001.

NSH Non-shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as option = 0b0111.

NSHST Non-shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as option = 0b0110.

NSHLD Non-shareable is the required shareability domain, reads are the required access type in
Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b0101.

OSH Outer Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as option = 0b0011.

OSHST Outer Shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as option = 0b0010.

OSHLD Outer Shareable is the required shareability domain, reads are the required access type
in Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b0001.


All other encodings of option are reserved. It is IMPLEMENTATION DEFINED whether options other 
than SY are implemented. All unsupported and reserved options must execute as a full system DMB 
operation, but software must not rely on this behavior. 


Note

The instruction supports the following alternative <option> values, but ARM recommends that
software does not use these alternative values:
• SH as an alias for ISH.
• SHST as an alias for ISHST.
• UN as an alias for NSH.
• UNST as an alias for NSHST.





Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();

    case option of
        when '0001' domain = MBReqDomain_OuterShareable; types = MBReqTypes_Reads;
        when '0010' domain = MBReqDomain_OuterShareable; types = MBReqTypes_Writes;
        when '0011' domain = MBReqDomain_OuterShareable; types = MBReqTypes_All;
        when '0101' domain = MBReqDomain_Nonshareable; types = MBReqTypes_Reads;
        when '0110' domain = MBReqDomain_Nonshareable; types = MBReqTypes_Writes;
        when '0111' domain = MBReqDomain_Nonshareable; types = MBReqTypes_All;
        when '1001' domain = MBReqDomain_InnerShareable; types = MBReqTypes_Reads;
        when '1010' domain = MBReqDomain_InnerShareable; types = MBReqTypes_Writes;
        when '1011' domain = MBReqDomain_InnerShareable; types = MBReqTypes_All;
        when '1101' domain = MBReqDomain_FullSystem; types = MBReqTypes_Reads;
        when '1110' domain = MBReqDomain_FullSystem; types = MBReqTypes_Writes;
        otherwise domain = MBReqDomain_FullSystem; types = MBReqTypes_All;

    if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} then
        if HCR.BSU == '11' then
            domain = MBReqDomain_FullSystem;
        if HCR.BSU == '10' && domain != MBReqDomain_FullSystem then
            domain = MBReqDomain_OuterShareable;
        if HCR.BSU == '01' && domain == MBReqDomain_Nonshareable then
            domain = MBReqDomain_InnerShareable;

    DataMemoryBarrier(domain, types);
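
The case table above has a regular structure: bits [3:2] of option select the shareability domain and bits [1:0] select the access types, with every other value falling through to the full system barrier. The Python sketch below decodes option on that basis and then applies the HCR.BSU upgrade. The names decode_barrier_option and apply_bsu and the string labels are illustrative assumptions, and the bit-field view is an observation drawn from the table rather than a statement made by the pseudocode.

```python
DOMAINS = {0b00: "OuterShareable", 0b01: "Nonshareable",
           0b10: "InnerShareable", 0b11: "FullSystem"}
TYPES = {0b01: "Reads", 0b10: "Writes", 0b11: "All"}


def decode_barrier_option(option):
    """Map a 4-bit option value to (domain, types)."""
    access = option & 0b11
    if access == 0b00:
        # Reserved encodings take the 'otherwise' arm of the case statement.
        return "FullSystem", "All"
    return DOMAINS[(option >> 2) & 0b11], TYPES[access]


def apply_bsu(domain, bsu):
    """Upgrade the domain as HCR.BSU requires at Non-secure EL0 or EL1."""
    if bsu == 0b11:
        return "FullSystem"
    if bsu == 0b10 and domain != "FullSystem":
        return "OuterShareable"
    if bsu == 0b01 and domain == "Nonshareable":
        return "InnerShareable"
    return domain


print(decode_barrier_option(0b1011))  # ISH
print(apply_bsu("Nonshareable", 0b01))
```

The same decode applies to the DSB option field, since both barriers share the table.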


F5.1.44 DSB 


Data Synchronization Barrier is a memory barrier that ensures the completion of memory accesses, see Data 
Synchronization Barrier (DSB) on page E2-2337. 


A1 


[A1 encoding diagram: field option]

A1 variant


DSB{<c>}{<q>} {<option>} 


Decode for this encoding 


// No additional decoding required 


T1 


[T1 encoding diagram: field option]


T1 variant 


DSB{<c>}{<q>} {<option>} 


Decode for this encoding 


// No additional decoding required 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<option> Specifies an optional limitation on the barrier operation. Values are:

SY Full system is the required shareability domain, reads and writes are the required access
types in both Group A and Group B. Can be omitted. This option is referred to as the
full system DSB. Encoded as option = 0b1111.

ST Full system is the required shareability domain, writes are the required access type in
both Group A and Group B. SYST is a synonym for ST. Encoded as option = 0b1110.

LD Full system is the required shareability domain, reads are the required access type in
Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b1101.

ISH Inner Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as option = 0b1011.

ISHST Inner Shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as option = 0b1010.

ISHLD Inner Shareable is the required shareability domain, reads are the required access type
in Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b1001.

NSH Non-shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as option = 0b0111.

NSHST Non-shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as option = 0b0110.

NSHLD Non-shareable is the required shareability domain, reads are the required access type in
Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b0101.

OSH Outer Shareable is the required shareability domain, reads and writes are the required
access types in both Group A and Group B. Encoded as option = 0b0011.

OSHST Outer Shareable is the required shareability domain, writes are the required access type
in both Group A and Group B. Encoded as option = 0b0010.

OSHLD Outer Shareable is the required shareability domain, reads are the required access type
in Group A, and reads and writes are the required access types in Group B. Encoded as
option = 0b0001.


All other encodings of option are reserved. It is IMPLEMENTATION DEFINED whether options other 
than SY are implemented. All unsupported and reserved options must execute as a full system DSB 
operation, but software must not rely on this behavior. 


Note

The instruction supports the following alternative <option> values, but ARM recommends that
software does not use these alternative values:

•   SH as an alias for ISH.
•   SHST as an alias for ISHST.
•   UN as an alias for NSH.
•   UNST as an alias for NSHST.

Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();

    case option of
        when '0001'  domain = MBReqDomain_OuterShareable;  types = MBReqTypes_Reads;
        when '0010'  domain = MBReqDomain_OuterShareable;  types = MBReqTypes_Writes;
        when '0011'  domain = MBReqDomain_OuterShareable;  types = MBReqTypes_All;
        when '0101'  domain = MBReqDomain_Nonshareable;    types = MBReqTypes_Reads;
        when '0110'  domain = MBReqDomain_Nonshareable;    types = MBReqTypes_Writes;
        when '0111'  domain = MBReqDomain_Nonshareable;    types = MBReqTypes_All;
        when '1001'  domain = MBReqDomain_InnerShareable;  types = MBReqTypes_Reads;
        when '1010'  domain = MBReqDomain_InnerShareable;  types = MBReqTypes_Writes;
        when '1011'  domain = MBReqDomain_InnerShareable;  types = MBReqTypes_All;
        when '1101'  domain = MBReqDomain_FullSystem;      types = MBReqTypes_Reads;
        when '1110'  domain = MBReqDomain_FullSystem;      types = MBReqTypes_Writes;
        otherwise    domain = MBReqDomain_FullSystem;      types = MBReqTypes_All;

    if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} then
        if HCR.BSU == '11' then
            domain = MBReqDomain_FullSystem;
        if HCR.BSU == '10' && domain != MBReqDomain_FullSystem then
            domain = MBReqDomain_OuterShareable;
        if HCR.BSU == '01' && domain == MBReqDomain_Nonshareable then
            domain = MBReqDomain_InnerShareable;

    DataSynchronizationBarrier(domain, types);
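
The following Python sketch is illustrative only and is not part of the architecture pseudocode. It models the option decode and the HCR.BSU adjustment performed by the DSB operation, on the assumption that option<3:2> selects the shareability domain and option<1:0> selects the access types, which is consistent with the listed encodings. The function names decode_dsb_option and apply_bsu, and the string values they return, are hypothetical.

```python
# Hypothetical model of the DSB option decode; names are not an ARM API.
DOMAINS = {0b00: "OuterShareable", 0b01: "Nonshareable",
           0b10: "InnerShareable", 0b11: "FullSystem"}
TYPES = {0b01: "Reads", 0b10: "Writes", 0b11: "All"}


def decode_dsb_option(option):
    """Map the 4-bit option field to a (domain, types) pair."""
    domain_bits, type_bits = (option >> 2) & 0b11, option & 0b11
    if type_bits == 0b00:
        # Reserved encodings fall into the 'otherwise' case: full system, all accesses.
        return ("FullSystem", "All")
    return (DOMAINS[domain_bits], TYPES[type_bits])


def apply_bsu(domain, bsu):
    """Apply the HCR.BSU minimum-shareability adjustment (Non-secure EL0/EL1 with EL2)."""
    if bsu == 0b11:
        return "FullSystem"
    if bsu == 0b10 and domain != "FullSystem":
        return "OuterShareable"
    if bsu == 0b01 and domain == "Nonshareable":
        return "InnerShareable"
    return domain


print(decode_dsb_option(0b1011))          # ISH
print(apply_bsu("Nonshareable", 0b01))    # NSH upgraded by BSU == '01'
```

In this sketch a barrier is only ever strengthened by apply_bsu, never weakened, which mirrors the three conditional assignments in the pseudocode.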
F5 T32 and A32 Base Instruction Set Instruction Descriptions
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 





F5.1.45 EOR, EORS (immediate) 
Bitwise Exclusive OR (immediate) performs a bitwise Exclusive OR of a register value and an immediate value, 
and writes the result to the destination register. 
If the destination register is not the PC, the EORS variant of the instruction updates the condition flags based on the 
result. 
The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 
•   The EOR variant of the instruction is an interworking branch, see Pseudocode description of operations on
    the AArch32 general-purpose registers and the PC on page E1-2293.
•   The EORS variant of the instruction performs an exception return without the use of the stack. In this case:
    -   The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    -   The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
        AArch32 state on page G1-3835.
    -   The instruction is UNDEFINED in Hyp mode.
    -   The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1

| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:0  |
| != 1111 | 0 0 1 0 | 0 0 1 | S  | Rn    | Rd    | imm12 |
  cond

EOR variant 
Applies when S == 0. 
EOR{<c>}{<q>} {<Rd>,} <Rn>, #<const> 
EORS variant 
Applies when S == 1. 
EORS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 

T1

| 15:11     | 10 | 9 | 8:5     | 4 | 3:0 | 15 | 14:12 | 11:8 | 7:0  |
| 1 1 1 1 0 | i  | 0 | 0 1 0 0 | S | Rn  | 0  | imm3  | Rd   | imm8 |

EOR variant
Applies when S == 0.
EOR{<c>}{<q>} {<Rd>,} <Rn>, #<const>
EORS variant
Applies when S == 1 && Rd != 1111.
EORS{<c>}{<q>} {<Rd>,} <Rn>, #<const>


Decode for all variants of this encoding 


if Rd == '1111' && S == '1' then SEE TEQ (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C);
if (d == 15 && !setflags) || n == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:
•   For the EOR variant, the instruction is a branch to the address calculated by the operation.
    This is an interworking branch, see Pseudocode description of operations on the AArch32
    general-purpose registers and the PC on page E1-2293.
•   For the EORS variant, the instruction performs an exception return, that restores PSTATE
    from SPSR_<current_mode>.

For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.


<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be 
used, but this is deprecated. 


For encoding T1: is the general-purpose source register, encoded in the "Rn" field. 


<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 


For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] EOR imm32;
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
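
As an informal illustration of the A1 flag-setting behavior, the Python sketch below models the immediate expansion and the EORS result for the case where the destination is not the PC. It assumes that A32ExpandImm_C rotates imm12<7:0> right by twice imm12<11:8> and leaves the carry unchanged when the rotation is zero. The helper names a32_expand_imm_c and eors_immediate are hypothetical, and registers are modelled as plain 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def a32_expand_imm_c(imm12, carry_in):
    """Sketch of A32ExpandImm_C: rotate imm12<7:0> right by 2 * imm12<11:8>."""
    value = imm12 & 0xFF
    rotation = 2 * ((imm12 >> 8) & 0xF)
    if rotation == 0:
        return value, carry_in               # no rotation: carry is unchanged
    imm32 = ((value >> rotation) | (value << (32 - rotation))) & MASK32
    return imm32, imm32 >> 31                # carry out is bit 31 of the result


def eors_immediate(rn_value, imm12, carry_in):
    """Return (result, flags) for EORS <Rd>, <Rn>, #<const> when Rd is not the PC."""
    imm32, carry = a32_expand_imm_c(imm12, carry_in)
    result = (rn_value ^ imm32) & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags                     # PSTATE.V is not modified


result, flags = eors_immediate(0x0F0F0F0F, 0x4FF, 0)
print(hex(result), flags)
```

The carry flag comes from the immediate expansion and not from the exclusive OR itself, so a rotated constant can change PSTATE.C even though EOR is a logical operation.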


F5.1.46 EOR, EORS (register) 


Bitwise Exclusive OR (register) performs a bitwise Exclusive OR of a register value and an optionally-shifted
register value, and writes the result to the destination register.

If the destination register is not the PC, the EORS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

•   The EOR variant of the instruction is an interworking branch, see Pseudocode description of operations on
    the AArch32 general-purpose registers and the PC on page E1-2293.
•   The EORS variant of the instruction performs an exception return without the use of the stack. In this case:
    -   The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    -   The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
        AArch32 state on page G1-3835.
    -   The instruction is UNDEFINED in Hyp mode.
    -   The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1

| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:7 | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 0 | 0 0 1 | S  | Rn    | Rd    | imm5 | type | 0 | Rm  |
  cond


EOR, rotate right with extend variant

Applies when S == 0 && imm5 == 00000 && type == 11.

EOR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

EOR, shift or rotate by value variant

Applies when S == 0 && !(imm5 == 00000 && type == 11).

EOR{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

EORS, rotate right with extend variant

Applies when S == 1 && imm5 == 00000 && type == 11.

EORS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

EORS, shift or rotate by value variant

Applies when S == 1 && !(imm5 == 00000 && type == 11).

EORS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);

T1

| 15:10       | 9:6     | 5:3 | 2:0 |
| 0 1 0 0 0 0 | 0 0 0 1 | Rm  | Rdn |

T1 variant

EOR<c>{<q>} {<Rdn>,} <Rdn>, <Rm> // Inside IT block
EORS{<q>} {<Rdn>,} <Rdn>, <Rm> // Outside IT block

Decode for this encoding

d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);


T2

| 15:9          | 8:5     | 4 | 3:0 | 15  | 14:12 | 11:8 | 7:6  | 5:4  | 3:0 |
| 1 1 1 0 1 0 1 | 0 1 0 0 | S | Rn  | (0) | imm3  | Rd   | imm2 | type | Rm  |


EOR, rotate right with extend variant

Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.

EOR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

EOR, shift or rotate by value variant

Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).

EOR<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1
EOR{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

EORS, rotate right with extend variant

Applies when S == 1 && imm3 == 000 && Rd != 1111 && imm2 == 00 && type == 11.

EORS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

EORS, shift or rotate by value variant

Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11) && Rd != 1111.

EORS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1
EORS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

Decode for all variants of this encoding

if Rd == '1111' && S == '1' then SEE TEQ (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if (d == 15 && !setflags) || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdn> Is the first general-purpose source register and the destination register, encoded in the "Rdn" field. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:
•   For the EOR variant, the instruction is a branch to the address calculated by the operation.
    This is an interworking branch, see Pseudocode description of operations on the AArch32
    general-purpose registers and the PC on page E1-2293.
•   For the EORS variant, the instruction performs an exception return, that restores PSTATE
    from SPSR_<current_mode>.

For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.

<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.

For encoding T2: is the first general-purpose source register, encoded in the "Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.

For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field.


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.

In T32 assembly:

•   Outside an IT block, if EORS <Rd>, <Rn>, <Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
    using encoding T1 as though EORS <Rd>, <Rn> had been written.
•   Inside an IT block, if EOR<c> <Rd>, <Rn>, <Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
    using encoding T1 as though EOR<c> <Rd>, <Rn> had been written.

To prevent either of these happening, use the .W qualifier.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] EOR shifted;
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
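
The <shift> and <amount> encodings for this instruction can be sketched informally in Python as follows. The sketch assumes that DecodeImmShift treats an all-zero amount field as a shift of 32 for LSR and ASR, and as RRX when type is 11, which matches the "modulo 32" wording and the RRX variants of this description. The function names decode_imm_shift and t2_imm5 are hypothetical.

```python
def decode_imm_shift(shift_type, imm5):
    """Sketch of DecodeImmShift: return (shift_t, shift_n) for a 2-bit type and 5-bit amount."""
    if shift_type == 0b00:
        return ("LSL", imm5)
    if shift_type == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    # type == 11: an amount of zero selects rotate right with extend
    return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)


def t2_imm5(imm3, imm2):
    """Encoding T2 splits the amount across two fields, used as imm3:imm2."""
    return (imm3 << 2) | imm2


print(decode_imm_shift(0b01, 0))                      # LSR #32
print(decode_imm_shift(0b10, t2_imm5(0b001, 0b01)))   # ASR #5
```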


F5.1.47 EOR, EORS (register-shifted register) 


Bitwise Exclusive OR (register-shifted register) performs a bitwise Exclusive OR of a register value and a 
register-shifted register value. It writes the result to the destination register, and can optionally update the condition 
flags based on the result. 


A1

| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:8 | 7 | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 0 | 0 0 1 | S  | Rn    | Rd    | Rs   | 0 | type | 1 | Rm  |
  cond


Flag setting variant 
Applies when S == 1. 


EORS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.

EOR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs>

Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] EOR shifted;
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
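
Because the shift amount is taken from the bottom 8 bits of <Rs>, it can range from 0 to 255. The Python sketch below is an informal model of that behavior, assuming the usual Shift_C semantics: a shift of zero leaves the value and carry unchanged, and the carry out is the last bit shifted out (or bit 31 of the result for ROR). The names shift_c and eors_register_shifted are hypothetical, and registers are modelled as 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def shift_c(value, shift_t, amount, carry_in):
    """Sketch of Shift_C for LSL, LSR, ASR and ROR on a 32-bit value."""
    if amount == 0:
        return value, carry_in
    if shift_t == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_t == "LSR":
        return (value >> amount) & MASK32, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return result, result >> 31
    raise ValueError("unsupported shift type: " + shift_t)


def eors_register_shifted(rn, rm, rs, shift_t, carry_in):
    """Return (result, flags) for EORS <Rd>, <Rn>, <Rm>, <type> <Rs>."""
    shift_n = rs & 0xFF                      # only Rs<7:0> is used
    shifted, carry = shift_c(rm, shift_t, shift_n, carry_in)
    result = (rn ^ shifted) & MASK32
    return result, {"N": result >> 31, "Z": int(result == 0), "C": carry}


# Rs = 0x120 gives a shift amount of 0x20 (32): Rm is shifted out entirely.
print(eors_register_shifted(0, 1, 0x120, "LSL", 0))
```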


F5.1.48 ERET 
Exception Return. 


The PE branches to the address held in the register holding the preferred return address, and restores PSTATE from 
SPSR_<current_mode>. 


The register holding the preferred return address is:
•   ELR_hyp, when executing in Hyp mode.
•   LR, when executing in a mode other than Hyp mode, User mode, or System mode.


The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from AArch32 state on
page G1-3835.


Exception Return is CONSTRAINED UNPREDICTABLE in User mode and System mode. 


In Debug state, the T1 encoding of ERET executes the DRPS operation. 


A1

| 31:28   | 27:20           | 19:8                                 | 7:4     | 3:0          |
| != 1111 | 0 0 0 1 0 1 1 0 | (0)(0)(0)(0)(0)(0)(0)(0)(0)(0)(0)(0) | 0 1 1 0 | (1)(1)(1)(0) |
  cond

A1 variant


ERET{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T1

| 15:11     | 10 | 9:6     | 5 | 4 | 3:0          | 15:14 | 13  | 12 | 11:8         | 7:0             |
| 1 1 1 1 0 | 0  | 1 1 1 1 | 0 | 1 | (1)(1)(1)(0) | 1 0   | (0) | 0  | (1)(1)(1)(1) | 0 0 0 0 0 0 0 0 |


T1 variant 


ERET{<c>}{<q>} 


Decode for this encoding 


if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.M IN {M32_User,M32_System} then
        UNPREDICTABLE;                        // UNDEFINED or NOP
    else
        new_pc_value = if PSTATE.EL == EL2 then ELR_hyp else R[14];
        AArch32.ExceptionReturn(new_pc_value, SPSR[]);


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.M IN {M32_User,M32_System}, then one of the following behaviors must occur: 





° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
F5.1.49 HLT


Halting breakpoint causes a software breakpoint to occur. 


Halting breakpoint is always unconditional, even inside an IT block. 

A1

| 31:28   | 27:23     | 22:21 | 20 | 19:8  | 7:4     | 3:0  |
| != 1111 | 0 0 0 1 0 | 0 0   | 0  | imm12 | 0 1 1 1 | imm4 |
  cond


A1 variant
HLT{<q>} {#}<imm>
Decode for this encoding

if EDSCR.HDE == '0' || !HaltingAllowed() then UNDEFINED;
if cond != '1110' then UNPREDICTABLE; // HLT must be encoded with AL condition


CONSTRAINED UNPREDICTABLE behavior 

If cond != '1110', then one of the following behaviors must occur: 
•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The instruction executes unconditionally.
•   The instruction executes conditionally.


T1

| 15:6                | 5:0  |
| 1 0 1 1 1 0 1 0 1 0 | imm6 |


T1 variant 
HLT{<q>} {#}<imm> 
Decode for this encoding 


if EDSCR.HDE == '0' || !HaltingAllowed() then UNDEFINED;


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 

<q> See Standard assembler syntax fields on page F2-2406. An HLT instruction must be unconditional. 


<imm> For encoding A1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
"imm12:imm4" field. This value is for assembly and disassembly only. It is ignored by the PE, but
can be used by a debugger to store more information about the halting breakpoint.

For encoding T1: is a 6-bit unsigned immediate, in the range 0 to 63, encoded in the "imm6" field.
This value is for assembly and disassembly only. It is ignored by the PE, but can be used by a
debugger to store more information about the halting breakpoint.


Operation for all encodings 


EncodingSpecificOperations();
Halt(DebugHalt_HaltInstruction);
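
The different immediate ranges of the two HLT encodings can be illustrated with the following Python sketch. It is not an assembler implementation. It assumes the fixed opcode bits of the A1 and T1 encodings give base values of 0xE1000070 (with the AL condition) and 0xBA80 respectively, and the function names hlt_a1_encode and hlt_t1_encode are hypothetical.

```python
def hlt_a1_encode(imm):
    """A1: 16-bit immediate split as imm12 (bits[19:8]) and imm4 (bits[3:0])."""
    if not 0 <= imm <= 0xFFFF:
        raise ValueError("A1 immediate must be in the range 0 to 65535")
    return 0xE1000070 | ((imm >> 4) << 8) | (imm & 0xF)


def hlt_t1_encode(imm):
    """T1: 6-bit immediate held in imm6 (bits[5:0])."""
    if not 0 <= imm <= 63:
        raise ValueError("T1 immediate must be in the range 0 to 63")
    return 0xBA80 | imm


print(hex(hlt_a1_encode(0xABCD)))
print(hex(hlt_t1_encode(63)))
```

A debugger that stores information in the immediate therefore has 16 bits available in A32 code but only 6 bits in T32 code.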
F5.1.50 HVC


Hypervisor Call causes a Hypervisor Call exception. For more information see Hypervisor Call (HVC) exception 
on page G1-3855. Non-secure software executing at EL1 can use this instruction to call the hypervisor to request a 
service. 


The HVC instruction is: 


•   UNDEFINED in Secure state, and in User mode in Non-secure state.
•   When SCR.HCE is set to 0, UNDEFINED in Non-secure EL1 modes and CONSTRAINED UNPREDICTABLE in
    Hyp mode.


On executing an HVC instruction, the HSR reports the exception as a Hypervisor Call exception, using the EC value
0x12, and captures the value of the immediate argument, see Use of the HSR on page G4-4137.


A1

| 31:28   | 27:23     | 22:21 | 20 | 19:8  | 7:4     | 3:0  |
| != 1111 | 0 0 0 1 0 | 1 0   | 0  | imm12 | 0 1 1 1 | imm4 |
  cond

A1 variant


HVC{<q>} {#}<imm16> 


Decode for this encoding 


if cond != '1110' then UNPREDICTABLE;
imm16 = imm12:imm4; 


CONSTRAINED UNPREDICTABLE behavior 


If cond != '1110', then one of the following behaviors must occur: 
•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The instruction executes unconditionally.
•   The instruction executes conditionally.

T1

| 15:11     | 10:4          | 3:0  | 15:14 | 13 | 12 | 11:0  |
| 1 1 1 1 0 | 1 1 1 1 1 1 0 | imm4 | 1 0   | 0  | 0  | imm12 |


T1 variant 
HVC{<q>} {#}<imm16> 
Decode for this encoding 


imm16 = imm4:imm12; 
if InITBlock() then UNPREDICTABLE; 

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 
<q> See Standard assembler syntax fields on page F2-2406. An HVC instruction must be unconditional. 


<imm16> For encoding A1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the 
"imm12:imm4" field. This value is for assembly and disassembly only. It is reported in the HSR but 
otherwise is ignored by hardware. An HVC handler might interpret imm16, for example to 
determine the required service. 


For encoding T1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the 
"imm4:imm12" field. This value is for assembly and disassembly only. It is reported in the HSR but 
otherwise is ignored by hardware. An HVC handler might interpret imm16, for example to 
determine the required service. 


Operation for all encodings 


EncodingSpecificOperations();
if !HaveEL(EL2) || PSTATE.EL == EL0 || IsSecure() then
    UNDEFINED;

if HaveEL(EL3) then
    if ELUsingAArch32(EL3) && SCR.HCE == '0' && PSTATE.EL == EL2 then
        UNPREDICTABLE;
    else
        hvc_enable = SCR_GEN[].HCE;
else
    hvc_enable = if ELUsingAArch32(EL3) then NOT(HCR_EL2.HCD) else NOT(HCR.HCD);

if hvc_enable == '0' then
    UNDEFINED;
else
    AArch32.CallHypervisor(imm16);


CONSTRAINED UNPREDICTABLE behavior 


If ELUsingAArch32(EL3) && SCR.HCE == '0' && PSTATE.EL == EL2, then one of the following behaviors must occur:





° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
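
The two HVC encodings split the 16-bit immediate differently: A1 uses imm12:imm4 and T1 uses imm4:imm12. The Python sketch below illustrates the difference. It assumes base opcode values of 0xE1400070 for A1 (with the AL condition) and the halfword pair 0xF7E0, 0x8000 for T1. All function names are hypothetical and the sketch is not an assembler implementation.

```python
def hvc_a1_encode(imm16):
    """A1: imm16 = imm12:imm4, with imm12 in bits[19:8] and imm4 in bits[3:0]."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must be in the range 0 to 65535")
    return 0xE1400070 | ((imm16 >> 4) << 8) | (imm16 & 0xF)


def hvc_t1_encode(imm16):
    """T1: imm16 = imm4:imm12; returns (first halfword, second halfword)."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must be in the range 0 to 65535")
    return 0xF7E0 | (imm16 >> 12), 0x8000 | (imm16 & 0xFFF)


def hvc_a1_imm16(word):
    """Recover imm16 from an A1 encoding, as a handler or disassembler might."""
    return (((word >> 8) & 0xFFF) << 4) | (word & 0xF)


def hvc_t1_imm16(hw1, hw2):
    """Recover imm16 from the two T1 halfwords."""
    return ((hw1 & 0xF) << 12) | (hw2 & 0xFFF)


print(hex(hvc_a1_encode(0x1234)))
print([hex(h) for h in hvc_t1_encode(0x1234)])
```

Software that decodes the instruction directly needs to apply the ordering for the instruction set that was executing when the HVC was taken.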
F5.1.51 ISB


Instruction Synchronization Barrier flushes the pipeline in the PE, so that all instructions following the ISB are 
fetched from cache or memory, after the instruction has been completed. It ensures that the effects of context 
changing operations executed before the ISB instruction are visible to the instructions fetched after the ISB. Context 
changing operations include changing the Address Space Identifier (ASID), TLB maintenance instructions, branch 
predictor maintenance operations, and all changes to the System registers. 


In addition, any branches that appear in program order after the ISB instruction are written into the branch prediction 
logic with the context that is visible after the ISB instruction. This is needed to ensure correct execution of the 
instruction stream. 


For more information, see Instruction Synchronization Barrier (ISB) on page E2-2335.


A1

| 31:28   | 27:20           | 19:16        | 15:12        | 11:8         | 7:4     | 3:0    |
| 1 1 1 1 | 0 1 0 1 0 1 1 1 | (1)(1)(1)(1) | (1)(1)(1)(1) | (0)(0)(0)(0) | 0 1 1 0 | option |

A1 variant


ISB{<c>}{<q>} {<option>} 


Decode for this encoding 


// No additional decoding required 


T1

| 15:11     | 10 | 9:4         | 3:0          | 15:14 | 13  | 12 | 11:8         | 7:4     | 3:0    |
| 1 1 1 1 0 | 0  | 1 1 1 0 1 1 | (1)(1)(1)(1) | 1 0   | (0) | 0  | (1)(1)(1)(1) | 0 1 1 0 | option |


T1 variant 
ISB{<c>}{<q>} {<option>} 
Decode for this encoding 


// No additional decoding required 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<option> Specifies an optional limitation on the barrier operation. Values are: 
SY Full system barrier operation, encoded as option = 0b1111. Can be omitted. 


All other encodings of option are reserved. The corresponding instructions execute as full system 
barrier operations, but must not be relied upon by software. 

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    InstructionSynchronizationBarrier();
F5.1.52 IT

If-Then makes up to four following instructions (the IT block) conditional. The conditions for the instructions in the 
IT block are the same as, or the inverse of, the condition the IT instruction specifies for the first instruction in the 
block. 
The IT instruction itself does not affect the condition flags, but the execution of the instructions in the IT block can 
change the condition flags. 
16-bit instructions in the IT block, other than CMP, CMN and TST, do not set the condition flags. An IT instruction with 
the AL condition can change the behavior without conditional execution. 
The architecture permits exception return to an instruction in the IT block only if the restoration of the CPSR 
restores ITSTATE to a state consistent with the conditions specified by the IT instruction. Any other exception return 
to an instruction in an IT block is UNPREDICTABLE. Any branch to a target instruction in an IT block is not permitted, 
and if such a branch is made it is UNPREDICTABLE what condition is used when executing that target instruction and 
any subsequent instruction in the IT block. 
Many uses of the IT instruction are deprecated for performance reasons, and an implementation might include ITD 
controls that can disable those uses of IT, making them UNDEFINED. For more information see Conditional execution 
on page F2-2407. 
See also Conditional instructions on page F1-2369 and Conditional execution on page F2-2407. 

T1

| 15:8            | 7:4       | 3:0       |
| 1 0 1 1 1 1 1 1 | firstcond | != 0000   |
                                  mask

T1 variant 
IT{<x>{<y>{<z>}}}{<q>} <cond> 
Decode for this encoding 

if mask == '0000' then SEE "Related encodings";
if firstcond == '1111' || (firstcond == '1110' && BitCount(mask) != 1) then UNPREDICTABLE;
if InITBlock() then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior 
If firstcond == '1111', then one of the following behaviors must occur:
•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The instruction executes with the additional decode: firstcond = '1110'.
If firstcond == '1110' && BitCount(mask) != 1, then one of the following behaviors must occur:
•   The instruction is UNDEFINED.
•   The instruction executes as NOP.

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Related encodings: Miscellaneous 16-bit instructions on page F3-2442. 

Assembler symbols

<x> The condition for the second instruction in the IT block. If omitted, the "mask" field is set to 0b1000.
If present it is encoded in the "mask[3]" field:
T firstcond[0]
E NOT firstcond[0]

<y> The condition for the third instruction in the IT block. If omitted and <x> is present, the "mask[2:0]"
field is set to 0b100. If <y> is present it is encoded in the "mask[2]" field:
T firstcond[0]
E NOT firstcond[0]

<z> The condition for the fourth instruction in the IT block. If omitted and <y> is present, the "mask[1:0]"
field is set to 0b10. If <z> is present, the "mask[0]" field is set to 1, and it is encoded in the "mask[1]"
field:
T firstcond[0]
E NOT firstcond[0]

<q> See Standard assembler syntax fields on page F2-2406.

<cond> The condition for the first instruction in the IT block, encoded in the "firstcond" field. See
Table F2-1 on page F2-2407 for the range of conditions available, and the encodings.


The conditions specified in an IT instruction must match those specified in the syntax of the instructions in its IT 
block. When assembling to A32 code, assemblers check IT instruction syntax for validity but do not generate 
assembled instructions for them. See Conditional instructions on page F1-2369. 


Operation 
EncodingSpecificOperations();
AArch32.CheckITEnabled(mask);
PSTATE.IT<7:0> = firstcond:mask;
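
The rules for <x>, <y> and <z> amount to writing one mask bit per additional instruction, followed by a terminating 1 and zero padding. The Python sketch below is an informal illustration of that construction. The function names it_mask and it_encode are hypothetical, and the value 0xBF00 for the fixed bits of the T1 encoding is an assumption of the sketch.

```python
def it_mask(firstcond, suffix=""):
    """Build the 4-bit mask field from the x/y/z letters, e.g. "TE" for ITTE."""
    if len(suffix) > 3 or any(ch not in "TE" for ch in suffix):
        raise ValueError("suffix must be up to three characters, each 'T' or 'E'")
    low = firstcond & 1
    # 'T' copies firstcond[0]; 'E' uses NOT firstcond[0].
    bits = [low if ch == "T" else low ^ 1 for ch in suffix]
    bits.append(1)                          # terminating 1 marks the end of the block
    bits += [0] * (4 - len(bits))           # remaining low-order bits are 0
    mask = 0
    for bit in bits:
        mask = (mask << 1) | bit
    return mask


def it_encode(firstcond, suffix=""):
    """Return the 16-bit T1 encoding; bits[7:0] match the value written to PSTATE.IT."""
    return 0xBF00 | (firstcond << 4) | it_mask(firstcond, suffix)


print(bin(it_mask(0b0001, "TE")))    # ITTE NE
print(hex(it_encode(0b0000, "E")))   # ITE EQ
```

The position of the lowest set bit in the mask gives the length of the IT block: a mask of 0b1000 covers one instruction, and a mask with bit 0 set covers four.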
F5.1.53 LDA


Load-Acquire Word loads a word from memory and writes it to a register. The instruction also has memory ordering
semantics as described in Load-Acquire, Store-Release on page B2-90.


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1

| 31:28   | 27:23     | 22:21 | 20 | 19:16 | 15:12 | 11:10  | 9 | 8 | 7:4     | 3:0          |
| != 1111 | 0 0 0 1 1 | 0 0   | 1  | Rn    | Rt    | (1)(1) | 0 | 0 | 1 0 0 1 | (1)(1)(1)(1) |
  cond

A1 variant


LDA{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1

| 15:4                    | 3:0 | 15:12 | 11:8         | 7 | 6 | 5:4 | 3:0          |
| 1 1 1 0 1 0 0 0 1 1 0 1 | Rn  | Rt    | (1)(1)(1)(1) | 1 | 0 | 1 0 | (1)(1)(1)(1) |


T1 variant 


LDA{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural
Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    R[t] = MemO[address, 4];
F5.1.54 LDAB


Load-Acquire Byte loads a byte from memory, zero-extends it to form a 32-bit word and writes it to a register. The 
instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1

| 31:28   | 27:23     | 22:21 | 20 | 19:16 | 15:12 | 11:10  | 9 | 8 | 7:4     | 3:0          |
| != 1111 | 0 0 0 1 1 | 1 0   | 1  | Rn    | Rt    | (1)(1) | 0 | 0 | 1 0 0 1 | (1)(1)(1)(1) |
  cond

A1 variant


LDAB{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1

| 15:4                    | 3:0 | 15:12 | 11:8         | 7 | 6 | 5:4 | 3:0          |
| 1 1 1 0 1 0 0 0 1 1 0 1 | Rn  | Rt    | (1)(1)(1)(1) | 1 | 0 | 0 0 | (1)(1)(1)(1) |


T1 variant 


LDAB{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural
Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    R[t] = ZeroExtend(MemO[address, 1], 32);
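
As an illustration only, the zero-extension step of the LDAB operation can be sketched in Python. The names zero_extend and ldab_result are hypothetical helpers, not architectural pseudocode, and the byte is passed in directly instead of being read through MemO[], so no memory ordering is modelled.

```python
def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Model of ZeroExtend(): keep the low from_bits bits and pad with zeros."""
    if to_bits < from_bits:
        raise ValueError("target width must not be smaller than source width")
    return value & ((1 << from_bits) - 1)


def ldab_result(loaded_byte: int) -> int:
    """Value written to R[t] by LDAB for a byte read from memory (hypothetical helper)."""
    return zero_extend(loaded_byte, 8, 32)
```

Because the byte is zero-extended and not sign-extended, a loaded value with bit 7 set still produces a 32-bit result below 0x100.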


F5.1.55 LDAEX 

Load-Acquire Exclusive Word loads a word from memory, writes it to a register and:

• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


|31  28|27 26 25 24|23 22 21 20|19  16|15  12|11  10| 9 | 8 |7 6 5 4|3          0|
|!=1111| 0  0  0  1| 1  0  0  1|  Rn  |  Rt  |(1)(1)| 1 | 0 |1 0 0 1|(1)(1)(1)(1)|
 cond

A1 variant


LDAEX{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3  0|15  12|11         8|7 6 5 4|3          0|
| 1  1  1  0| 1  0 0 0|1 1 0 1| Rn |  Rt  |(1)(1)(1)(1)|1 1 1 0|(1)(1)(1)(1)|


T1 variant 


LDAEX{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address, 4);
    R[t] = MemO[address, 4];
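
The two monitor effects described for LDAEX can be pictured with a deliberately simplified Python model. Everything here is an assumption made for illustration: the ExclusiveMonitors class, the ldaex function, the per-PE dictionaries, the use of the untranslated address, and the little-endian byte order. Real monitors track physical addresses at an IMPLEMENTATION DEFINED granule, and the acquire ordering is not modelled.

```python
class ExclusiveMonitors:
    """Toy model: one local and one global monitor entry per PE (hypothetical)."""

    def __init__(self):
        self.local = {}    # pe_id -> (address, size)
        self.global_ = {}  # pe_id -> (address, size), only for shared memory

    def set_exclusive(self, pe_id, address, size, shared):
        # Shared memory: mark the address as exclusive for this PE in the global monitor.
        if shared:
            self.global_[pe_id] = (address, size)
        # Always record an active exclusive access in the local monitor.
        self.local[pe_id] = (address, size)


def ldaex(monitors, memory, pe_id, address, shared=True):
    """Sketch of LDAEX: set the monitors, then return the loaded word."""
    monitors.set_exclusive(pe_id, address, 4, shared)
    return int.from_bytes(memory[address:address + 4], "little")
```

A store-exclusive model would later consult these entries to decide whether its store succeeds; that side is outside the scope of this sketch.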


F5.1.56 LDAEXB 


Load-Acquire Exclusive Byte loads a byte from memory, zero-extends it to form a 32-bit word, writes it to a register
and:

• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


|31  28|27 26 25 24|23 22 21 20|19  16|15  12|11  10| 9 | 8 |7 6 5 4|3          0|
|!=1111| 0  0  0  1| 1  1  0  1|  Rn  |  Rt  |(1)(1)| 1 | 0 |1 0 0 1|(1)(1)(1)(1)|
 cond

A1 variant


LDAEXB{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3  0|15  12|11         8|7 6 5 4|3          0|
| 1  1  1  0| 1  0 0 0|1 1 0 1| Rn |  Rt  |(1)(1)(1)(1)|1 1 0 0|(1)(1)(1)(1)|


T1 variant 


LDAEXB{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address, 1);
    R[t] = ZeroExtend(MemO[address, 1], 32);


F5.1.57 LDAEXD 

Load-Acquire Exclusive Doubleword loads a doubleword from memory, writes it to two registers and:

• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

The instruction also acts as a barrier instruction with the ordering requirements described in Load-Acquire,
Store-Release on page B2-90.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


|31  28|27 26 25 24|23 22 21 20|19  16|15  12|11  10| 9 | 8 |7 6 5 4|3          0|
|!=1111| 0  0  0  1| 1  0  1  1|  Rn  |  Rt  |(1)(1)| 1 | 0 |1 0 0 1|(1)(1)(1)(1)|
 cond

A1 variant

LDAEXD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>]

Decode for this encoding

t = UInt(Rt); t2 = t + 1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction executes with the additional decode: t<0> = '0'.

• The instruction executes with the additional decode: t2 = t.

• The instruction executes as described, with no change to its behavior and no additional side effects.

If Rt == '1110', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction is handled as described in Using R15 on page K1-5457.
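
The A1 decode checks for LDAEXD can be restated as a small Python sketch. The function name decode_ldaexd_a1 and the use of None to stand for the UNPREDICTABLE outcome are assumptions made for illustration; the sketch does not model any of the CONSTRAINED UNPREDICTABLE choices listed above.

```python
def decode_ldaexd_a1(rt: int, rn: int):
    """Return (t, t2, n) for the A1 encoding, or None where the decode is UNPREDICTABLE.

    rt and rn are the 4-bit Rt and Rn field values (0 to 15).
    """
    t = rt
    t2 = t + 1          # A1 always pairs Rt with the next register
    n = rn
    if (rt & 1) == 1 or t2 == 15 or n == 15:
        return None     # odd Rt, Rt == R14, or PC as base register
    return t, t2, n
```

For example, an even Rt such as R2 pairs with R3, while Rt = R14 is rejected because its pair would be R15.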

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3  0|15  12|11   8|7 6 5 4|3          0|
| 1  1  1  0| 1  0 0 0|1 1 0 1| Rn |  Rt  | Rt2  |1 1 1 1|(1)(1)(1)(1)|


T1 variant 


LDAEXD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>] 


Decode for this encoding 


t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn);
if t == 15 || t2 == 15 || t == t2 || n == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == t2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The load instruction executes but the destination register takes an UNKNOWN value.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field.
<Rt> must be even-numbered and not R14.

For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field.

<Rt2> For encoding A1: is the second general-purpose register to be transferred. <Rt2> must be <R(t+1)>.

For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address, 8);
    value = MemO[address, 8];
    // Extract words from 64-bit loaded value such that R[t] is
    // loaded from address and R[t2] from address+4.
    R[t] = if BigEndian() then value<63:32> else value<31:0>;
    R[t2] = if BigEndian() then value<31:0> else value<63:32>;
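
The word extraction in the operation above can be checked with a short Python sketch. The function name split_doubleword is hypothetical, and the 64-bit value is assumed to have been assembled from memory in the current data endianness, which is how the comment in the pseudocode is read here.

```python
def split_doubleword(value: int, big_endian: bool):
    """Return (R[t], R[t2]) for a 64-bit loaded value, following the LDAEXD operation."""
    low = value & 0xFFFFFFFF            # value<31:0>
    high = (value >> 32) & 0xFFFFFFFF   # value<63:32>
    return (high, low) if big_endian else (low, high)


def load_pair(data: bytes, big_endian: bool):
    """Hypothetical helper: form the 64-bit value from 8 bytes, then split it."""
    order = "big" if big_endian else "little"
    return split_doubleword(int.from_bytes(data, order), big_endian)
```

In both byte orders the first result corresponds to the four bytes at address and the second to the four bytes at address+4, which is the property the pseudocode comment states.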


F5.1.58 LDAEXH 


Load-Acquire Exclusive Halfword loads a halfword from memory, zero-extends it to form a 32-bit word, writes it
to a register and:

• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


|31  28|27 26 25 24|23 22 21 20|19  16|15  12|11  10| 9 | 8 |7 6 5 4|3          0|
|!=1111| 0  0  0  1| 1  1  1  1|  Rn  |  Rt  |(1)(1)| 1 | 0 |1 0 0 1|(1)(1)(1)(1)|
 cond

A1 variant


LDAEXH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3  0|15  12|11         8|7 6 5 4|3          0|
| 1  1  1  0| 1  0 0 0|1 1 0 1| Rn |  Rt  |(1)(1)(1)(1)|1 1 0 1|(1)(1)(1)(1)|


T1 variant 


LDAEXH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address, 2);
    R[t] = ZeroExtend(MemO[address, 2], 32);


F5.1.59 LDAH 


Load-Acquire Halfword loads a halfword from memory, zero-extends it to form a 32-bit word and writes it to a
register. The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on
page B2-90.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


|31  28|27 26 25 24|23 22 21 20|19  16|15  12|11  10| 9 | 8 |7 6 5 4|3          0|
|!=1111| 0  0  0  1| 1  1  1  1|  Rn  |  Rt  |(1)(1)| 0 | 0 |1 0 0 1|(1)(1)(1)(1)|
 cond

A1 variant


LDAH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3  0|15  12|11         8|7 6 5 4|3          0|
| 1  1  1  0| 1  0 0 0|1 1 0 1| Rn |  Rt  |(1)(1)(1)(1)|1 0 0 1|(1)(1)(1)(1)|


T1 variant 


LDAH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE;

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior, see Appendix K1 Architectural 
Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    R[t] = ZeroExtend(MemO[address, 2], 32);


F5.1.60 LDC (immediate) 


Load data to System register (immediate) calculates an address from a base register value and an immediate offset,
loads a word from memory, and writes it to the DBGDTRTXint System register. It can use offset, post-indexed,
pre-indexed, or unindexed addressing. For information about memory accesses see Memory accesses on
page F2-2412.


In an implementation that includes EL2, the permitted LDC access to DBGDTRTXint can be trapped to Hyp mode, 
meaning that an attempt to execute an LDC instruction in a Non-secure mode other than Hyp mode, that would be 
permitted in the absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see 
Trapping general Non-secure System register accesses to debug registers on page G1-3911. 


For simplicity, the LDC pseudocode does not show this possible trap to Hyp mode. 





A1 
|31  28|27 26 25|24|23|22|21|20|19  16|15 14 13 12|11 10 9| 8 |7      0|
|!=1111| 1  1  0| P| U| 0| W| 1|!=1111| 0  1  0  1| 1  1 1| 0 |  imm8  |
 cond                           Rn


Offset variant 
Applies when P == 1 && W == 0. 


LDC{<c>}{<q>} p14, c5, [<Rn>{, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


LDC{<c>}{<q>} p14, c5, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDC{<c>}{<q>} p14, c5, [<Rn>, #{+/-}<imm>]! 


Unindexed variant 
Applies when P == 0 && U == 1 && W == 0.


LDC{<c>}{<q>} p14, c5, [<Rn>], <option> 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDC (literal);
if P == '0' && U == '0' && W == '0' then UNDEFINED;
n = UInt(Rn); cp = 14;
imm32 = ZeroExtend(imm8:'00', 32); index = (P == '1'); add = (U == '1'); wback = (W == '1');


T1 


|15 14 13 12|11 10 9| 8| 7| 6| 5| 4|3    0|15 14 13 12|11 10 9| 8 |7      0|
| 1  1  1  0| 1  1 0| P| U| 0| W| 1|!=1111| 0  1  0  1| 1  1 1| 0 |  imm8  |
                                     Rn





Offset variant 


Applies when P == 1 && W == 0. 


LDC{<c>}{<q>} p14, c5, [<Rn>{, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


LDC{<c>}{<q>} p14, c5, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDC{<c>}{<q>} p14, c5, [<Rn>, #{+/-}<imm>]! 


Unindexed variant 
Applies when P == 0 && U == 1 && W == 0.


LDC{<c>}{<q>} p14, c5, [<Rn>], <option> 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDC (literal);
if P == '0' && U == '0' && W == '0' then UNDEFINED;
n = UInt(Rn); cp = 14;
imm32 = ZeroExtend(imm8:'00', 32); index = (P == '1'); add = (U == '1'); wback = (W == '1');


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. If the PC is used, see LDC (literal). 
<option> Is an 8-bit immediate, in the range 0 to 255 enclosed in { }, encoded in the "imm8" field. The value
of this field is ignored when executing this instruction.

+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<imm> Is the immediate offset used for forming the address, a multiple of 4 in the range 0-1020, defaulting
to 0 and encoded in the "imm8" field, as <imm>/4.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];

    // System register write to DBGDTRTXint.
    DBGDTR_EL0[] = MemA[address,4];

    if wback then R[n] = offset_addr;
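
To make the four addressing forms concrete, the address and writeback calculation can be sketched in Python. The function name ldc_immediate_address and its return convention are assumptions for illustration; the sketch covers only the arithmetic and omits the system access check, the memory read, and the register write to DBGDTRTXint.

```python
MASK32 = 0xFFFFFFFF


def ldc_immediate_address(rn_value: int, imm8: int, p: int, u: int, w: int):
    """Return (address, new_rn) for LDC (immediate), given the P, U and W bits."""
    if p == 0 and u == 0 and w == 0:
        raise ValueError("P == 0, U == 0, W == 0 is UNDEFINED")
    imm32 = imm8 << 2                      # ZeroExtend(imm8:'00', 32)
    index, add, wback = p == 1, u == 1, w == 1
    offset_addr = (rn_value + imm32 if add else rn_value - imm32) & MASK32
    address = offset_addr if index else rn_value
    new_rn = offset_addr if wback else rn_value
    return address, new_rn
```

With a base of 0x1000 and imm8 = 4, the offset form accesses 0x1010 and leaves the base alone, the post-indexed form accesses 0x1000 and then updates the base to 0x1010, and the unindexed form accesses 0x1000 with no update.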


F5.1.61 LDC (literal) 


Load data to System register (literal) calculates an address from the PC value and an immediate offset, loads a word 
from memory, and writes it to the DBGDTRTXint System register. For information about memory accesses see 
Memory accesses on page F2-2412. 


In an implementation that includes EL2, the permitted LDC access to DBGDTRTXint can be trapped to Hyp mode, 
meaning that an attempt to execute an LDC instruction in a Non-secure mode other than Hyp mode, that would be 
permitted in the absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see 
Trapping general Non-secure System register accesses to debug registers on page G1-3911. 


For simplicity, the LDC pseudocode does not show this possible trap to Hyp mode. 


A1 
|31  28|27 26 25|24|23|22|21|20|19 18 17 16|15 14 13 12|11 10 9| 8 |7      0|
|!=1111| 1  1  0| P| U| 0| W| 1| 1  1  1  1| 0  1  0  1| 1  1 1| 0 |  imm8  |
 cond

A1 variant


Applies when !(P == 0 && U == 0 && W == 0).
LDC{<c>}{<q>} p14, c5, <label> 


LDC{<c>}{<q>} p14, c5, [PC, #{+/-}<imm>] 
LDC{<c>}{<q>} p14, c5, [PC], <option> 


Decode for this encoding 
if P == '0' && U == '0' && W == '0' then UNDEFINED;
index = (P == '1'); add = (U == '1'); cp = 14; imm32 = ZeroExtend(imm8:'00', 32);
if W == '1' || (P == '0' && CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If W == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction executes without writeback of the base address.

• The instruction uses the addressing mode described in the equivalent immediate offset instruction.

T1

|15 14 13 12|11 10 9| 8| 7| 6| 5| 4|3 2 1 0|15 14 13 12|11 10 9| 8 |7      0|
| 1  1  1  0| 1  1 0| P| U| 0| W| 1|1 1 1 1| 0  1  0  1| 1  1 1| 0 |  imm8  |


T1 variant 
Applies when !(P == 0 && U == 0 && W == 0).


LDC{<c>}{<q>} p14, c5, <label> 
LDC{<c>}{<q>} p14, c5, [PC, #{+/-}<imm>] 

Decode for this encoding

if P == '0' && U == '0' && W == '0' then UNDEFINED;
index = (P == '1'); add = (U == '1'); cp = 14; imm32 = ZeroExtend(imm8:'00', 32);
if W == '1' || (P == '0' && CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If W == '1' || P == '0', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction executes without writeback of the base address.

• The instruction executes as LDC (immediate) with writeback to the PC. The instruction is handled as described
  in Using R15 on page K1-5457.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<option> Is an 8-bit immediate, in the range 0 to 255 enclosed in { }, encoded in the "imm8" field. The value
of this field is ignored when executing this instruction.

<label> The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the required
value of the offset from the Align(PC, 4) value of the instruction to this label. Permitted values of
the offset are multiples of 4 in the range -1020 to 1020. If the offset is zero or positive, imm32 is equal
to the offset and add == TRUE (encoded as U == 1). If the offset is negative, imm32 is equal to minus
the offset and add == FALSE (encoded as U == 0).

+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<imm> Is the immediate offset used for forming the address, a multiple of 4 in the range 0-1020, defaulting
to 0 and encoded in the "imm8" field, as <imm>/4.


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 
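
The <label> rule above, which turns a signed offset into the U bit and an unsigned imm32, can be sketched in Python as follows. The function name encode_literal_offset and the dictionary it returns are hypothetical; pc_value is assumed to be the PC value as read by the instruction, and the sketch does not model how an assembler obtains it.

```python
def encode_literal_offset(pc_value: int, label_address: int) -> dict:
    """Derive U, imm8 and imm32 for LDC (literal) from a label address."""
    base = pc_value & ~3                   # Align(PC, 4)
    offset = label_address - base
    if offset % 4 != 0 or not -1020 <= offset <= 1020:
        raise ValueError("offset must be a multiple of 4 in the range -1020 to 1020")
    add = offset >= 0                      # zero or positive offsets use U == 1
    imm32 = offset if add else -offset
    return {"U": 1 if add else 0, "imm8": imm32 // 4, "imm32": imm32}
```

A label 8 bytes above the aligned PC gives U = 1 and imm8 = 2, and a label 8 bytes below gives U = 0 with the same imm8. A zero offset always encodes as U = 1, which is why subtraction of 0 needs the alternative syntax.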


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    offset_addr = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    address = if index then offset_addr else Align(PC,4);

    // System register write to DBGDTRTXint.
    DBGDTR_EL0[] = MemA[address,4];


F5.1.62 LDM, LDMIA, LDMFD 


Load Multiple (Increment After, Full Descending) loads multiple registers from consecutive memory locations 
using an address from a base register. The consecutive memory locations start at this address, and the address just 
above the highest of those locations can optionally be written back to the base register. 


The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register 
from the highest memory address. See also Encoding of lists of general-purpose registers and the PC on 
page F2-2413. 


The registers loaded can include the PC, causing a branch to a loaded address. This is an interworking branch, see 
Pseudocode description of operations on the AArch32 general-purpose registers and the PC on page E1-2293. 


Related system instructions are LDM (User registers) and LDM (exception return). 


This instruction is used by the alias POP (multiple registers). See Alias conditions on page F5-2701 for details of 
when each alias is preferred. 


A1 
|31  28|27 26 25|24|23|22|21|20|19  16|15                         0|
|!=1111| 1  0  0| 0| 1| 0| W| 1|  Rn  |       register_list        |
 cond

A1 variant


LDM{IA}{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax
LDMFD{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Descending stack 


Decode for this encoding 
n = UInt(Rn); registers = register_list; wback = (W == '1'); 


if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 
if wback && registers<n> == '1' then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.

If wback && registers<n> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


|15 14 13 12|11|10   8|7             0|
| 1  1  0  0| 1|  Rn  | register_list |


T1 variant 


LDM{IA}{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
LDMFD{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Descending stack 


Decode for this encoding 


n = UInt(Rn); registers = '00000000':register_list; wback = (registers<n> == '0');
if BitCount(registers) < 1 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.


T2 


|15 14 13 12|11 10 9|8 7| 6| 5| 4|3  0|15|14|13|12                 0|
| 1  1  1  0| 1  0 0|0 1| 0| W| 1| Rn | P| M| 0|   register_list    |
                                             register_list<13>


T2 variant 

LDM{IA}{<c>}.W <Rn>{!}, <registers> // Preferred syntax, if <Rn>, '!' and <registers> can be represented in T1
LDMFD{<c>}.W <Rn>{!}, <registers> // Alternate syntax, Full Descending stack, if <Rn>, '!' and <registers> can be represented in T1
LDM{IA}{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax
LDMFD{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Descending stack


Decode for this encoding 

n = UInt(Rn); registers = P:M:register_list; wback = (W == '1');
if n == 15 || BitCount(registers) < 2 || (P == '1' && M == '1') then UNPREDICTABLE;
if wback && registers<n> == '1' then UNPREDICTABLE;
if registers<13> == '1' then UNPREDICTABLE;
if registers<15> == '1' && InITBlock() && !LastInITBlock() then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.

If wback && registers<n> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


If BitCount(registers) == 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction loads a single register using the specified addressing modes.

• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.


If registers<13> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction performs all of the loads using the specified addressing mode, but R13 is UNKNOWN.

If P == '1' && M == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction loads the register list and either R14 or R15, both R14 and R15, or neither of these registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Alias conditions 





Alias                      of variant   is preferred when
POP (multiple registers)   T2           W == '1' && Rn == '1101' && BitCount(P:M:register_list) > 1
POP (multiple registers)   A1           W == '1' && Rn == '1101' && BitCount(register_list) > 1
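
Read as a predicate, the alias condition in the table above is the same for both variants apart from which bits make up the register list. A minimal Python sketch follows, in which the function name pop_alias_preferred and the convention of passing the complete register mask are assumptions.

```python
def pop_alias_preferred(w: int, rn: int, register_mask: int) -> bool:
    """True when a disassembler would prefer POP (multiple registers) over LDM.

    register_mask is register_list for A1, or P:M:register_list for T2.
    """
    register_count = bin(register_mask).count("1")   # BitCount()
    return w == 1 and rn == 0b1101 and register_count > 1
```

The check on the register count means that a writeback load of a single register from SP does not take the POP (multiple registers) alias.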





Assembler symbols 


IA Is an optional suffix for the Increment After form.

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


! For encoding A1 and T2: the address adjusted by the size of the data loaded is written back to the
base register. If specified, it is encoded in the "W" field as 1, otherwise this field defaults to 0.

For encoding T1: the address adjusted by the size of the data loaded is written back to the base
register. It is omitted if <Rn> is included in <registers>, otherwise it must be present.


<registers> For encoding A1: is a list of one or more registers to be loaded, separated by commas and
surrounded by { and }. The PC can be in the list. ARM deprecates using these instructions with both
the LR and the PC in the list.

For encoding T1: is a list of one or more registers to be loaded, separated by commas and surrounded
by { and }. The registers in the list must be in the range R0-R7, encoded in the "register_list" field.

For encoding T2: is a list of one or more registers to be loaded, separated by commas and surrounded
by { and }. The registers in the list must be in the range R0-R12, encoded in the "register_list" field,
and can optionally contain one of the LR or the PC. If the LR is in the list, the "M" field is set to 1,
otherwise it defaults to 0. If the PC is in the list, the "P" field is set to 1, otherwise it defaults to 0.

If the PC is in the list:

• The LR must not be in the list.

• The instruction must be either outside any IT block, or the last instruction in an IT block.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    for i = 0 to 14
        if registers<i> == '1' then
            R[i] = MemA[address,4]; address = address + 4;
    if registers<15> == '1' then
        LoadWritePC(MemA[address,4]);
    if wback && registers<n> == '0' then R[n] = R[n] + 4*BitCount(registers);
    if wback && registers<n> == '1' then R[n] = bits(32) UNKNOWN;
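
The operation above can be followed step by step with a small Python model. This is a sketch under stated assumptions: the function name ldm, the read_word callback, the use of None for an UNKNOWN base register, and returning the loaded PC value instead of performing the interworking branch are all hypothetical simplifications. Condition checks, alignment, and faults are not modelled.

```python
MASK32 = 0xFFFFFFFF


def ldm(regs, read_word, n, registers, wback):
    """Model of LDM/LDMIA/LDMFD. regs is a list of 16 values, updated in place.

    Returns the word loaded for the PC, or None if the PC is not in the list.
    """
    base = regs[n]
    address = base
    for i in range(15):                       # R0 to R14, lowest register first
        if (registers >> i) & 1:
            regs[i] = read_word(address)
            address = (address + 4) & MASK32
    branch_target = None
    if (registers >> 15) & 1:                 # PC loaded last, from the highest address
        branch_target = read_word(address)
    if wback:
        if ((registers >> n) & 1) == 0:
            regs[n] = (base + 4 * bin(registers).count("1")) & MASK32
        else:
            regs[n] = None                    # stands for bits(32) UNKNOWN
    return branch_target
```

For example, loading {R0, R1, R2} from a base of 0x100 in R4 with writeback reads 0x100, 0x104 and 0x108 in that order and leaves R4 at 0x10C, the address just above the highest location accessed.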


LDM (exception return) 


Load Multiple (exception return) loads multiple registers from consecutive memory locations using an address from 
a base register. The SPSR of the current mode is copied to the CPSR. An address adjusted by the size of the data 
loaded can optionally be written back to the base register. 


The registers loaded include the PC. The word loaded for the PC is treated as an address and a branch occurs to that 
address. 


Load Multiple (exception return) is:

• UNDEFINED in Hyp mode.

• UNPREDICTABLE in debug state, and in User mode and System mode.


A1 


|31  28|27 26 25|24|23|22|21|20|19  16|15|14                      0|
|!=1111| 1  0  0| P| U| 1| W| 1|  Rn  | 1|      register_list      |
 cond

A1 variant


LDM{<amode>}{<c>}{<q>} <Rn>{!}, <registers_with_pc>^


Decode for this encoding 


n = UInt(Rn);  registers = register_list;
wback = (W == '1');  increment = (U == '1');  wordhigher = (P == U);
if n == 15 then UNPREDICTABLE;
if wback && registers<n> == '1' then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && registers<n> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all the loads using the specified addressing mode and the content of the register
  being written back is UNKNOWN. In addition, if an exception occurs during the execution of this instruction,
  the base address might be corrupted so that the instruction cannot be repeated.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<amode> is one of: 


DA Decrement After. The consecutive memory addresses end at the address in the base 
register. Encoded as P= 0, U = 0. 


FA Full Ascending. For this instruction, a synonym for DA. 


DB Decrement Before. The consecutive memory addresses end one word below the address 
in the base register. Encoded as P = 1, U=0. 


EA Empty Ascending. For this instruction, a synonym for DB. 


IA Increment After. The consecutive memory addresses start at the address in the base 
register. This is the default. Encoded as P= 0, U=1. 


FD Full Descending. For this instruction, a synonym for IA. 


IB Increment Before. The consecutive memory addresses start one word above the address 
in the base register. Encoded as P= 1, U=1. 


ED Empty Descending. For this instruction, a synonym for IB. 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


! The address adjusted by the size of the data loaded is written back to the base register. If specified, 
it is encoded in the "W" field as 1, otherwise this field defaults to 0. 


<registers_with_pc> Is a list of one or more registers, separated by commas and surrounded by { and }. It specifies 
the set of registers to be loaded. The registers are loaded with the lowest-numbered register from the 
lowest memory address, through to the highest-numbered register from the highest memory address. 
The PC must be specified in the register list, and the instruction causes a branch to the address (data) 
loaded into the PC. See also Encoding of lists of general-purpose registers and the PC on 
page F2-2413. 


Instructions with similar syntax but without the PC included in the registers list are described in LDM (User 
registers). 

Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.EL == EL2 then
        UNDEFINED;
    elsif PSTATE.M IN {M32_User,M32_System} then
        UNPREDICTABLE;                        // UNDEFINED or NOP
    else
        length = 4*BitCount(registers) + 4;
        address = if increment then R[n] else R[n]-length;
        if wordhigher then address = address+4;

        for i = 0 to 14
            if registers<i> == '1' then
                R[i] = MemA[address,4];  address = address + 4;
        new_pc_value = MemA[address,4];

        if wback && registers<n> == '0' then R[n] = if increment then R[n]+length else R[n]-length;
        if wback && registers<n> == '1' then R[n] = bits(32) UNKNOWN;

        AArch32.ExceptionReturn(new_pc_value, SPSR[]);


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.M IN {M32_User,M32_System}, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.


F5.1.64 LDM (User registers) 


In an EL1 mode other than System mode, Load Multiple (User registers) loads multiple User mode registers from 
consecutive memory locations using an address from a base register. The registers loaded cannot include the PC. 
The PE reads the base register value normally, using the current mode to determine the correct Banked version of 
the register. This instruction cannot writeback to the base register. 


Load Multiple (User registers) is UNDEFINED in Hyp mode, and UNPREDICTABLE in User and System modes. 


A1

|31   28|27 26 25|24|23|22|21 |20|19 16|15|14          0|
|!= 1111| 1  0  0| P| U| 1|(0)| 1|  Rn | 0|register_list|
  cond

A1 variant


LDM{<amode>}{<c>}{<q>} <Rn>, <registers_without_pc>^


Decode for this encoding 


n = UInt(Rn); registers = register_list; increment = (U == '1'); wordhigher = (P == U); 
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If BitCount(registers) < 1, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<amode> is one of: 

DA Decrement After. The consecutive memory addresses end at the address in the base 
register. Encoded as P= 0, U = 0. 

FA Full Ascending. For this instruction, a synonym for DA. 

DB Decrement Before. The consecutive memory addresses end one word below the address 
in the base register. Encoded as P = 1, U=0. 

EA Empty Ascending. For this instruction, a synonym for DB. 

IA Increment After. The consecutive memory addresses start at the address in the base 
register. This is the default. Encoded as P= 0, U=1. 

FD Full Descending. For this instruction, a synonym for IA. 

IB Increment Before. The consecutive memory addresses start one word above the address 
in the base register. Encoded as P= 1, U=1. 

ED Empty Descending. For this instruction, a synonym for IB. 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


<registers_without_pc> 


Is a list of one or more registers, separated by commas and surrounded by { and }. It specifies the 
set of registers to be loaded by the LDM instruction. The registers are loaded with the 
lowest-numbered register from the lowest memory address, through to the highest-numbered 
register from the highest memory address. The PC must not be in the register list. See also Encoding 
of lists of general-purpose registers and the PC on page F2-2413. 


Instructions with similar syntax but with the PC included in the register list are described in LDM
(exception return).


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.EL == EL2 then UNDEFINED;
    elsif PSTATE.M IN {M32_User,M32_System} then UNPREDICTABLE;
    else
        length = 4*BitCount(registers);
        address = if increment then R[n] else R[n]-length;
        if wordhigher then address = address+4;
        for i = 0 to 14
            if registers<i> == '1' then  // Load User mode register
                Rmode[i, M32_User] = MemA[address,4];  address = address + 4;
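
The way `increment` and `wordhigher` select the first address for each <amode> can be sketched in Python. The sketch below only mirrors the address arithmetic of the pseudocode; the function and table names are hypothetical, and mode checks and the register transfers themselves are left out.

```python
# (P, U) encodings of <amode>; the stack-style names are synonyms for loads.
AMODE = {
    "DA": (0, 0), "FA": (0, 0),
    "DB": (1, 0), "EA": (1, 0),
    "IA": (0, 1), "FD": (0, 1),
    "IB": (1, 1), "ED": (1, 1),
}


def ldm_addresses(amode, base, registers):
    """Return the word addresses accessed, lowest address first."""
    p, u = AMODE[amode]
    increment = (u == 1)
    wordhigher = (p == u)
    length = 4 * bin(registers & 0x7FFF).count("1")   # R0-R14 only
    address = base if increment else base - length
    if wordhigher:
        address += 4
    return [address + 4 * k for k in range(length // 4)]


for mode in ("DA", "DB", "IA", "IB"):
    print(mode, [hex(a) for a in ldm_addresses(mode, 0x100, 0b0011)])
```

With a base of 0x100 and two registers, DA ends at the base address, DB ends one word below it, IA starts at it, and IB starts one word above it, matching the <amode> descriptions.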


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.M IN {M32_User,M32_System}, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.


F5.1.65 LDMDA, LDMFA 


Load Multiple Decrement After (Full Ascending) loads multiple registers from consecutive memory locations using 
an address from a base register. The consecutive memory locations end at this address, and the address just below 
the lowest of those locations can optionally be written back to the base register. 


The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register 
from the highest memory address. See also Encoding of lists of general-purpose registers and the PC on 
page F2-2413. 


The registers loaded can include the PC, causing a branch to a loaded address. This is an interworking branch, see 
Pseudocode description of operations on the AArch32 general-purpose registers and the PC on page E1-2293. 


Related system instructions are LDM (User registers) and LDM (exception return). 


A1

|31   28|27 26 25|24|23|22|21|20|19 16|15          0|
|!= 1111| 1  0  0| 0| 0| 0| W| 1|  Rn |register_list|
  cond

A1 variant


LDMDA{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
LDMFA{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Ascending stack 


Decode for this encoding 
n = UInt(Rn); registers = register_list; wback = (W == '1'); 


if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 
if wback && registers<n> == '1' then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If BitCount(registers) < 1, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.

If wback && registers<n> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


! The address adjusted by the size of the data loaded is written back to the base register. If specified, 
it is encoded in the "W" field as 1, otherwise this field defaults to 0. 


<registers> Is a list of one or more registers to be loaded, separated by commas and surrounded by { and }. The
PC can be in the list. ARM deprecates using these instructions with both the LR and the PC in the
list.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n] - 4*BitCount(registers) + 4;
    for i = 0 to 14
        if registers<i> == '1' then
            R[i] = MemA[address,4];  address = address + 4;
    if registers<15> == '1' then
        LoadWritePC(MemA[address,4]);
    if wback && registers<n> == '0' then R[n] = R[n] - 4*BitCount(registers);
    if wback && registers<n> == '1' then R[n] = bits(32) UNKNOWN;





F5-2708 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




F5.1.66 LDMDB, LDMEA 


Load Multiple Decrement Before (Empty Ascending) loads multiple registers from consecutive memory locations 
using an address from a base register. The consecutive memory locations end just below this address, and the 
address of the lowest of those locations can optionally be written back to the base register. 


The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register 
from the highest memory address. See also Encoding of lists of general-purpose registers and the PC on 
page F2-2413. 


The registers loaded can include the PC, causing a branch to a loaded address. This is an interworking branch, see 
Pseudocode description of operations on the AArch32 general-purpose registers and the PC on page E1-2293. 


Related system instructions are LDM (User registers) and LDM (exception return). 


A1

|31   28|27 26 25|24|23|22|21|20|19 16|15          0|
|!= 1111| 1  0  0| 1| 0| 0| W| 1|  Rn |register_list|
  cond

A1 variant


LDMDB{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
LDMEA{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Empty Ascending stack 


Decode for this encoding 
n = UInt(Rn); registers = register_list; wback = (W == '1'); 


if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 
if wback && registers<n> == '1' then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If wback && registers<n> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

If BitCount(registers) < 1, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.


T1

|15 14 13 12 11|10 9|8 7|6|5|4|3  0|15|14|13|12          0|
| 1  1  1  0  1| 0 0|1 0|0|W|1| Rn | P| M| 0|register_list|
                                          register_list<13>


T1 variant 


LDMDB{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
LDMEA{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Empty Ascending stack 


Decode for this encoding 


n = UInt(Rn);  registers = P:M:register_list;  wback = (W == '1');
if n == 15 || BitCount(registers) < 2 || (P == '1' && M == '1') then UNPREDICTABLE;
if wback && registers<n> == '1' then UNPREDICTABLE;
if registers<13> == '1' then UNPREDICTABLE;
if registers<15> == '1' && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
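
As an illustration of the T1 decode checks, the following Python sketch collects the conditions under which the decode reports UNPREDICTABLE. It assumes that register_list is a 14-bit field, which is what a 16-bit result for P:M:register_list implies; the function name, its parameters, and the returned strings are hypothetical.

```python
def ldmdb_t1_unpredictable(rn, p, m, register_list, w,
                           in_it_block=False, last_in_it_block=False):
    """Return the list of UNPREDICTABLE conditions hit by a T1 decode."""
    # registers = P:M:register_list (register_list assumed to be 14 bits)
    registers = (p << 15) | (m << 14) | (register_list & 0x3FFF)
    reasons = []
    if rn == 15:
        reasons.append("n == 15")
    if bin(registers).count("1") < 2:
        reasons.append("BitCount(registers) < 2")
    if p == 1 and m == 1:
        reasons.append("P == '1' && M == '1'")
    if w == 1 and (registers >> rn) & 1:
        reasons.append("wback && registers<n> == '1'")
    if (registers >> 13) & 1:
        reasons.append("registers<13> == '1'")
    if (registers >> 15) & 1 and in_it_block and not last_in_it_block:
        reasons.append("PC load inside an IT block, not last")
    return reasons


print(ldmdb_t1_unpredictable(rn=0, p=0, m=0, register_list=0b0110, w=1))
print(ldmdb_t1_unpredictable(rn=1, p=1, m=1, register_list=0b0010, w=1))
```

An empty list means that none of the decode checks fired; each reported condition corresponds to one of the CONSTRAINED UNPREDICTABLE cases for this encoding.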


CONSTRAINED UNPREDICTABLE behavior 


If wback && registers<n> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

If BitCount(registers) < 1, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.

If BitCount(registers) == 1, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction loads a single register using the specified addressing modes.
• The instruction executes as LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.

If registers<13> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode, but R13 is UNKNOWN.

If P == '1' && M == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction loads the register list and either R14 or R15, both R14 and R15, or neither of these registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.


! The address adjusted by the size of the data loaded is written back to the base register. If specified,
it is encoded in the "W" field as 1, otherwise this field defaults to 0.


<registers> For encoding A1: is a list of one or more registers to be loaded, separated by commas and
surrounded by { and }. The PC can be in the list. ARM deprecates using these instructions with both
the LR and the PC in the list.


For encoding T1: is a list of one or more registers to be loaded, separated by commas and surrounded
by { and }. The registers in the list must be in the range R0-R12, encoded in the "register_list" field,
and can optionally contain one of the LR or the PC. If the LR is in the list, the "M" field is set to 1,
otherwise it defaults to 0. If the PC is in the list, the "P" field is set to 1, otherwise it defaults to 0.

If the PC is in the list:
• The LR must not be in the list.
• The instruction must be either outside any IT block, or the last instruction in an IT block.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n] - 4*BitCount(registers);
    for i = 0 to 14
        if registers<i> == '1' then
            R[i] = MemA[address,4];  address = address + 4;
    if registers<15> == '1' then
        LoadWritePC(MemA[address,4]);
    if wback && registers<n> == '0' then R[n] = R[n] - 4*BitCount(registers);
    if wback && registers<n> == '1' then R[n] = bits(32) UNKNOWN;


F5.1.67 LDMIB, LDMED


Load Multiple Increment Before (Empty Descending) loads multiple registers from consecutive memory locations 
using an address from a base register. The consecutive memory locations start just above this address, and the 
address of the last of those locations can optionally be written back to the base register. 


The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register 
from the highest memory address. See also Encoding of lists of general-purpose registers and the PC on 
page F2-2413. 


The registers loaded can include the PC, causing a branch to a loaded address. This is an interworking branch, see 
Pseudocode description of operations on the AArch32 general-purpose registers and the PC on page E1-2293. 


Related system instructions are LDM (User registers) and LDM (exception return). 


A1

|31   28|27 26 25|24|23|22|21|20|19 16|15          0|
|!= 1111| 1  0  0| 1| 1| 0| W| 1|  Rn |register_list|
  cond

A1 variant


LDMIB{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
LDMED{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Empty Descending stack 


Decode for this encoding 
n = UInt(Rn); registers = register_list; wback = (W == '1'); 


if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 
if wback && registers<n> == '1' then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 

If BitCount(registers) < 1, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an LDM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers loaded.

If wback && registers<n> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 

Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


! The address adjusted by the size of the data loaded is written back to the base register. If specified, 
it is encoded in the "W" field as 1, otherwise this field defaults to 0. 


<registers> Is a list of one or more registers to be loaded, separated by commas and surrounded by { and }. The
PC can be in the list. ARM deprecates using these instructions with both the LR and the PC in the
list.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n] + 4;
    for i = 0 to 14
        if registers<i> == '1' then
            R[i] = MemA[address,4];  address = address + 4;
    if registers<15> == '1' then
        LoadWritePC(MemA[address,4]);
    if wback && registers<n> == '0' then R[n] = R[n] + 4*BitCount(registers);
    if wback && registers<n> == '1' then R[n] = bits(32) UNKNOWN;





F5.1.68 LDR (immediate) 

Load Register (immediate) calculates an address from a base register value and an immediate offset, loads a word
from memory, and writes it to a register. It can use offset, post-indexed, or pre-indexed addressing. For information
about memory accesses see Memory accesses on page F2-2412.

This instruction is used by the alias POP (single register). See Alias conditions on page F5-2716 for details of when
each alias is preferred.

A1

|31   28|27 26 25|24|23|22|21|20|19   16|15 12|11    0|
|!= 1111| 0  1  0| P| U| 0| W| 1|!= 1111|  Rt | imm12 |
  cond                             Rn

Offset variant 

Applies when P == 1 && W == 0. 

LDR{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 

Post-indexed variant 

Applies when P == 0 && W == 0. 

LDR{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 

Pre-indexed variant 

Applies when P == 1 && W == 1. 

LDR{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 

Decode for all variants of this encoding 

if Rn == '1111' then SEE LDR (literal);
if P == '0' && W == '1' then SEE LDRT;
t = UInt(Rt);  n = UInt(Rn);  imm32 = ZeroExtend(imm12, 32);
index = (P == '1');  add = (U == '1');  wback = (P == '0') || (W == '1');
if wback && n == t then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior 

If wback && n == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.
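
The A1 decode can be illustrated with a short Python sketch that extracts the fields from a 32-bit instruction word and derives index, add, and wback. It assumes the word already matches the A1 fixed bits (bits[27:25] == 010, bit[22] == 0, bit[20] == 1) and does not check them; the function name and the dictionary layout are hypothetical.

```python
def decode_ldr_imm_a1(instr):
    """Decode an A32 LDR (immediate) word; fixed bits are not verified."""
    p = (instr >> 24) & 1
    u = (instr >> 23) & 1
    w = (instr >> 21) & 1
    rn = (instr >> 16) & 0xF
    rt = (instr >> 12) & 0xF
    imm12 = instr & 0xFFF

    if rn == 0b1111:
        return "SEE LDR (literal)"
    if p == 0 and w == 1:
        return "SEE LDRT"

    wback = (p == 0) or (w == 1)
    return {
        "t": rt,
        "n": rn,
        "imm32": imm12,              # ZeroExtend(imm12, 32)
        "index": p == 1,
        "add": u == 1,
        "wback": wback,
        "unpredictable": wback and rn == rt,
    }


print(decode_ldr_imm_a1(0xE5910004))   # LDR R0, [R1, #4]
print(decode_ldr_imm_a1(0xE49D0004))   # LDR R0, [SP], #4
```

The first word uses offset addressing (P == 1, W == 0), so wback is False; the second is post-indexed (P == 0, W == 0), so wback is True.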

T1

|15 14 13 12 11|10   6|5  3|2  0|
| 0  1  1  0  1| imm5 | Rn | Rt |


T1 variant 


LDR{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 


t = UInt(Rt);  n = UInt(Rn);  imm32 = ZeroExtend(imm5:'00', 32);
index = TRUE;  add = TRUE;  wback = FALSE;


T2

|15 14 13 12 11|10 8|7    0|
| 1  0  0  1  1| Rt | imm8 |

T2 variant 


LDR{<c>}{<q>} <Rt>, [SP{, #{+}<imm>}] 


Decode for this encoding 
t = UInt(Rt);  n = 13;  imm32 = ZeroExtend(imm8:'00', 32);
index = TRUE;  add = TRUE;  wback = FALSE;


T3

|15 14 13 12 11|10 9|8|7|6 5|4|3     0|15 12|11    0|
| 1  1  1  1  1| 0 0|0|1|1 0|1|!= 1111|  Rt | imm12 |
                                  Rn


T3 variant 


LDR{<c>}.W <Rt>, [<Rn> {, #{+}<imm>}] // <Rt>, <Rn>, <imm> can be represented in T1 or T2
LDR{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
if Rn == '1111' then SEE LDR (literal);
t = UInt(Rt);  n = UInt(Rn);  imm32 = ZeroExtend(imm12, 32);
index = TRUE;  add = TRUE;  wback = FALSE;
if t == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;


T4

|15 14 13 12 11|10 9|8|7|6 5|4|3     0|15 12|11|10|9|8|7    0|
| 1  1  1  1  1| 0 0|0|0|1 0|1|!= 1111|  Rt | 1| P|U|W| imm8 |
                                  Rn





Offset variant

Applies when P == 1 && U == 0 && W == 0.

LDR{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}]

Post-indexed variant

Applies when P == 0 && W == 1.

LDR{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm>


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDR{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDR (literal);
if P == '1' && U == '1' && W == '0' then SEE LDRT;
if P == '0' && W == '0' then UNDEFINED;
t = UInt(Rt);  n = UInt(Rn);
imm32 = ZeroExtend(imm8, 32);  index = (P == '1');  add = (U == '1');  wback = (W == '1');
if (wback && n == t) || (t == 15 && InITBlock() && !LastInITBlock()) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Alias conditions


Alias                    of variant           is preferred when
POP (single register)    A1 (post-indexed)    P == '0' && U == '1' && W == '0' && Rn == '1101' && imm12 == '000000000100'
POP (single register)    T4 (post-indexed)    Rn == '1101' && U == '1' && imm8 == '00000100'
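
The alias conditions amount to simple field comparisons, sketched here in Python. The two predicate names are hypothetical, and the T4 predicate assumes the caller has already established that the post-indexed variant applies.

```python
def is_pop_alias_a1(p, u, w, rn, imm12):
    """A1 post-indexed LDR that is preferably disassembled as POP."""
    return (p == 0 and u == 1 and w == 0
            and rn == 0b1101 and imm12 == 0b000000000100)


def is_pop_alias_t4(rn, u, imm8):
    """T4 post-indexed LDR that is preferably disassembled as POP."""
    return rn == 0b1101 and u == 1 and imm8 == 0b00000100


# LDR R0, [SP], #4 meets the A1 condition; LDR R0, [R1], #4 does not.
print(is_pop_alias_a1(p=0, u=1, w=0, rn=13, imm12=4))
print(is_pop_alias_a1(p=0, u=1, w=0, rn=1, imm12=4))
```

In both encodings the alias is preferred only for a post-indexed load from SP (Rn == '1101') with an added offset of exactly 4.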





Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
can be used. If the PC is used, the instruction branches to the address (data) loaded to the PC. This
is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.


For encoding T1 and T2: is the general-purpose register to be transferred, encoded in the "Rt" field.


For encoding T3 and T4: is the general-purpose register to be transferred, encoded in the "Rt" field.
The PC can be used, provided the instruction is either outside an IT block or the last instruction of
an IT block. If the PC is used, the instruction branches to the address (data) loaded to the PC. This
is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.


<Rn> For encoding A1, T3 and T4: is the general-purpose base register, encoded in the "Rn" field. For PC 
use see LDR (literal). 


For encoding T1: is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:

    -    when U = 0
    +    when U = 1


+ Specifies the offset is added to the base register.


<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to
0 if omitted, and encoded in the "imm12" field.


For encoding T1: is the optional positive unsigned immediate byte offset, a multiple of 4, in the
range 0 to 124, defaulting to 0 and encoded in the "imm5" field as <imm>/4.


For encoding T2: is the optional positive unsigned immediate byte offset, a multiple of 4, in the
range 0 to 1020, defaulting to 0 and encoded in the "imm8" field as <imm>/4.


For encoding T3: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095,
defaulting to 0 and encoded in the "imm12" field.


For encoding T4: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm8" field.
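
The <imm> ranges and scalings listed above can be summarised in a small Python helper. It is a sketch only: the helper name and the table are hypothetical, and it covers just the immediate field, not the rest of the instruction.

```python
# encoding -> (maximum byte offset, scale, field name), taken from the text
IMM_RULES = {
    "A1": (4095, 1, "imm12"),
    "T1": (124, 4, "imm5"),
    "T2": (1020, 4, "imm8"),
    "T3": (4095, 1, "imm12"),
    "T4": (255, 1, "imm8"),
}


def encode_ldr_imm(encoding, imm=0):
    """Return (field name, field value) for a byte offset, or raise."""
    maximum, scale, field = IMM_RULES[encoding]
    if not 0 <= imm <= maximum:
        raise ValueError(f"{encoding}: offset {imm} outside 0 to {maximum}")
    if imm % scale:
        raise ValueError(f"{encoding}: offset {imm} is not a multiple of {scale}")
    return field, imm // scale


print(encode_ldr_imm("T1", 124))    # ('imm5', 31)
print(encode_ldr_imm("T2", 1020))   # ('imm8', 255)
```

T1 and T2 store the offset divided by 4, which is why their byte ranges exceed what a 5-bit or 8-bit field could hold directly.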


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        data = MemU[address,4];
        if wback then R[n] = offset_addr;
        if t == 15 then
            if address<1:0> == '00' then
                LoadWritePC(data);
            else
                UNPREDICTABLE;
        else
            R[t] = data;
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        data = MemU[address,4];
        if wback then R[n] = offset_addr;
        if t == 15 then
            if address<1:0> == '00' then
                LoadWritePC(data);
            else
                UNPREDICTABLE;
        else
            R[t] = data;
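
The address calculation and writeback in this operation can be modelled with a minimal Python sketch. It is illustrative only: registers are a list, memory is a dict keyed by address (hypothetical stand-ins for R[] and MemU[]), a plain register write stands in for LoadWritePC(), and UNPREDICTABLE is modelled as an exception.

```python
def ldr_immediate(regs, mem, t, n, imm32, index, add, wback):
    """Informal model of the LDR (immediate) operation."""
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base
    data = mem[address]                  # stand-in for MemU[address,4]
    if wback:
        regs[n] = offset_addr
    if t == 15:
        if address & 0b11 == 0:
            regs[15] = data              # stand-in for LoadWritePC(data)
        else:
            raise ValueError("UNPREDICTABLE")
    else:
        regs[t] = data
    return regs


regs = [0] * 16
regs[13] = 0x2000
memory = {0x2000: 0xCAFE}
# Post-indexed load from SP with an offset of 4, as in POP (single register).
ldr_immediate(regs, memory, t=0, n=13, imm32=4, index=False, add=True, wback=True)
print(hex(regs[0]), hex(regs[13]))       # 0xcafe 0x2004
```

With index False the access uses the unmodified base, and the offset only affects the value written back, which is the post-indexed behavior.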


F5.1.69 LDR (literal) 


Load Register (literal) calculates an address from the PC value and an immediate offset, loads a word from memory, 
and writes it to a register. For information about memory accesses see Memory accesses on page F2-2412. 


A1

|31   28|27 26 25|24|23|22|21|20|19 18 17 16|15 12|11    0|
|!= 1111| 0  1  0| P| U| 0| W| 1| 1  1  1  1|  Rt | imm12 |
  cond

A1 variant


Applies when !(P == 0 && W == 1).


LDR{<c>}{<q>} <Rt>, <label> // Normal form 
LDR{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 
if P == '0' && W == '1' then SEE LDRT;
t = UInt(Rt);  imm32 = ZeroExtend(imm12, 32);
add = (U == '1');  wback = (P == '0') || (W == '1');
if wback then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: wback = FALSE;.
• The instruction treats bit[24] as the P bit, and bit[21] as the writeback (W) bit, and uses the same addressing
  mode as described in LDR (immediate). The instruction uses post-indexed addressing when P == '0' and uses
  pre-indexed addressing otherwise. The instruction is handled as described in Using R15 on page K1-5457.


T1 


| 15 14 13 12 11 | 10  8 | 7    0 |
| 0  1  0  0  1  | Rt    | imm8   |


T1 variant 


LDR{<c>}{<q>} <Rt>, <label> // Normal form 


Decode for this encoding 


t = UInt(Rt); imm32 = ZeroExtend(imm8:'00', 32); add = TRUE;

T2 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3     0 | 15 12 | 11    0 |
| 1  1  1  1  1  | 0  0 | 0 | U | 1 0 | 1 | 1 1 1 1 | Rt    | imm12   |

T2 variant 
LDR{<c>}.W <Rt>, <label> // Preferred syntax, and <Rt>, <label> can be represented in T1 


LDR{<c>}{<q>} <Rt>, <label> // Preferred syntax 
LDR{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == '1');
if t == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE; 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
can be used. If the PC is used, the instruction branches to the address (data) loaded to the PC. This
is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.


For encoding T1: is the general-purpose register to be transferred, encoded in the "Rt" field. 


For encoding T2: is the general-purpose register to be transferred, encoded in the "Rt" field. The SP 
can be used. The PC can be used, provided the instruction is either outside an IT block or the last 
instruction of an IT block. If the PC is used, the instruction branches to the address (data) loaded to 
the PC. This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 


<label> For encoding A1 and T2: the label of the literal data item that is to be loaded into <Rt>. The
assembler calculates the required value of the offset from the Align(PC, 4) value of the instruction 
to this label. Permitted values of the offset are -4095 to 4095. If the offset is zero or positive, imm32 
is equal to the offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to 
minus the offset and add == FALSE, encoded as U == 0. 


For encoding T1: the label of the literal data item that is to be loaded into <Rt>. The assembler 
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label. 
Permitted values of the offset are multiples of four in the range 0 to 1020.


+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1 

<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to
0 if omitted, and encoded in the "imm12" field.


For encoding T2: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the 
"imm12" field. 


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 
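
The offset rules given for <label> can be sketched in Python. This is an illustrative sketch only, not how any particular assembler is implemented: the function names and the pc and label_addr parameters are hypothetical, and pc is assumed to be the PC value that the instruction reads.

```python
from typing import Tuple


def align(value: int, boundary: int) -> int:
    """Round value down to a multiple of boundary, like the pseudocode Align()."""
    return value - (value % boundary)


def encode_literal_offset_a1_t2(pc: int, label_addr: int) -> Tuple[int, int]:
    """Return (U, imm12) for encodings A1 and T2 (hypothetical helper)."""
    offset = label_addr - align(pc, 4)
    if not -4095 <= offset <= 4095:
        raise ValueError("offset must be in the range -4095 to 4095")
    if offset >= 0:
        return 1, offset       # add == TRUE, imm32 equals the offset
    return 0, -offset          # add == FALSE, imm32 equals minus the offset


def encode_literal_offset_t1(pc: int, label_addr: int) -> int:
    """Return imm8 for encoding T1, where imm32 = ZeroExtend(imm8:'00', 32)."""
    offset = label_addr - align(pc, 4)
    if offset % 4 != 0 or not 0 <= offset <= 1020:
        raise ValueError("offset must be a multiple of four in the range 0 to 1020")
    return offset >> 2         # the two low-order zero bits are implied
```

An offset of zero takes the U == 1 path here, which is why a subtraction of 0 can only be written with the alternative syntax.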

Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    base = Align(PC,4);
    address = if add then (base + imm32) else (base - imm32);
    data = MemU[address,4];
    if t == 15 then
        if address<1:0> == '00' then
            LoadWritePC(data);
        else
            UNPREDICTABLE;
    else
        R[t] = data;
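
As a rough illustration, the LDR (literal) operation can be modeled in Python as below. This is a simplified sketch under stated assumptions: the condition check is omitted, regs is a hypothetical 16-entry list standing in for R[], read_word is a hypothetical callable standing in for MemU[address,4], and LoadWritePC is reduced to returning the loaded value, so the interworking behavior is not modeled.

```python
class Unpredictable(Exception):
    """Raised where the pseudocode reaches UNPREDICTABLE."""


def ldr_literal(pc, imm32, add, t, regs, read_word):
    """Model LDR (literal); return the branch target when t == 15, else None."""
    base = pc & ~3                                   # Align(PC,4)
    address = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    data = read_word(address)                        # MemU[address,4]
    if t == 15:
        if address & 3 == 0:                         # address<1:0> == '00'
            return data                              # stands in for LoadWritePC(data)
        raise Unpredictable("load to PC from an address that is not word-aligned")
    regs[t] = data
    return None
```

Because the base is Align(PC,4), the same immediate reaches the same literal from either halfword of a word-aligned pair of T32 instructions.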

F5.1.70 LDR (register) 


Load Register (register) calculates an address from a base register value and an offset register value, loads a word 
from memory, and writes it to a register. The offset register value can optionally be shifted. For information about 
memory accesses, see Memory accesses on page F2-2412. 


The T32 form of LDR (register) does not support register writeback. 


A1 
| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15 12 | 11  7 | 6  5 | 4 | 3  0 |
| !=1111  | 0  1  1  | P  | U  | 0  | W  | 1  | Rn    | Rt    | imm5  | type | 0 | Rm   |
  cond


Offset variant 
Applies when P == 1 && W == 0. 


LDR{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDR{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDR{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE LDRT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


| 15 14 13 12 | 11 10 9 | 8  6 | 5  3 | 2  0 |
| 0  1  0  1  | 1  0  0 | Rm   | Rn   | Rt   |

T1 variant 


LDR{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = (SRType_LSL, 0);

T2 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3    0 | 15 12 | 11 10 9 8 7 6 | 5  4 | 3  0 |
| 1  1  1  1  1  | 0  0 | 0 | 0 | 1 0 | 1 | !=1111 | Rt    | 0  0  0 0 0 0 | imm2 | Rm   |
                                            Rn


T2 variant 


LDR{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
LDR{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rn == '1111' then SEE LDR (literal); 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if t == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
can be used. If the PC is used, the instruction branches to the address (data) loaded to the PC. This
branch is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.


For encoding T1: is the general-purpose register to be transferred, encoded in the "Rt" field. 


For encoding T2: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC 
can be used, provided the instruction is either outside an IT block or the last instruction of an IT 
block. If the PC is used, the instruction branches to the address (data) loaded to the PC. This is an 
interworking branch, see Pseudocode description of operations on the AArch32 general-purpose 
registers and the PC on page E1-2293. 


<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 
in the offset variant. 


For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field. 





+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

+ Specifies the index register is added to the base register.

<Rm> Is the general-purpose index register, encoded in the "Rm" field.

<shift> The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
applied to a register on page F2-2410.

<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
        offset_addr = if add then (R[n] + offset) else (R[n] - offset);
        address = if index then offset_addr else R[n];
        data = MemU[address,4];
        if wback then R[n] = offset_addr;
        if t == 15 then
            if address<1:0> == '00' then
                LoadWritePC(data);
            else
                UNPREDICTABLE;
        else
            R[t] = data;
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
        offset_addr = (R[n] + offset);
        address = offset_addr;
        data = MemU[address,4];
        if t == 15 then
            if address<1:0> == '00' then
                LoadWritePC(data);
            else
                UNPREDICTABLE;
        else
            R[t] = data;
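
The address calculation in the operation above can be sketched in Python for the LSL case. This is a minimal sketch under stated assumptions: it models only a left shift (the only shift the T32 encodings allow, and one of the shifts available to A1 through DecodeImmShift), the function name is hypothetical, and register values are passed in as plain integers.

```python
MASK32 = 0xFFFFFFFF


def ldr_register_address(rn, rm, shift_n=0, add=True, index=True):
    """Return (address, offset_addr) for an LSL-shifted register offset."""
    offset = (rm << shift_n) & MASK32                 # Shift(R[m], SRType_LSL, shift_n, ...)
    offset_addr = (rn + offset if add else rn - offset) & MASK32
    address = offset_addr if index else rn            # post-indexed forms access at R[n]
    return address, offset_addr
```

For the T32 encodings, add and index are always true, so the two returned values are equal. For the A1 post-indexed variant, the access uses the unmodified base while offset_addr is the value written back.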
F5.1.71 LDRB (immediate)


Load Register Byte (immediate) calculates an address from a base register value and an immediate offset, loads a 
byte from memory, zero-extends it to form a 32-bit word, and writes it to a register. It can use offset, post-indexed, 
or pre-indexed addressing. For information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19   16 | 15 12 | 11    0 |
| !=1111  | 0  1  0  | P  | U  | 1  | W  | 1  | !=1111  | Rt    | imm12   |
  cond                                          Rn


Offset variant 
Applies when P == 1 && W == 0.


LDRB{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0.


LDRB{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1.

LDRB{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]!


Decode for all variants of this encoding 
if Rn == '1111' then SEE LDRB (literal); 
if P == '0' && W == '1' then SEE LDRBT;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || (wback && n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 
| 15 14 13 12 11 | 10  6 | 5  3 | 2  0 |
| 0  1  1  1  1  | imm5  | Rn   | Rt   |


T1 variant 


LDRB{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 

Decode for this encoding

t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5, 32);
index = TRUE; add = TRUE; wback = FALSE;


T2 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3    0 | 15   12 | 11    0 |
| 1  1  1  1  1  | 0  0 | 0 | 1 | 0 0 | 1 | !=1111 | !=1111  | imm12   |
                                            Rn       Rt





T2 variant 


LDRB{<c>}.W <Rt>, [<Rn> {, #{+}<imm>}] // <Rt>, <Rn>, <imm> can be represented in T1 
LDRB{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 


if Rt == '1111' then SEE PLD; 
if Rn == '1111' then SEE LDRB (literal); 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE; 
// ARMv8-A removes UNPREDICTABLE for R13 


T3 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3    0 | 15 12 | 11 | 10 | 9 | 8 | 7    0 |
| 1  1  1  1  1  | 0  0 | 0 | 0 | 0 0 | 1 | !=1111 | Rt    | 1  | P  | U | W | imm8   |
                                            Rn





Offset variant 
Applies when Rt != 1111 && P == 1 && U == 0 && W == 0.


LDRB{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1.


LDRB{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1.

LDRB{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]!


Decode for all variants of this encoding 


if Rt == '1111' && P == '1' && U == '0' && W == '0' then SEE PLD, PLDW (immediate);
if Rn == '1111' then SEE LDRB (literal);
if P == '1' && U == '1' && W == '0' then SEE LDRBT;
if P == '0' && W == '0' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if (t == 15 && W == '1') || (wback && n == t) then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13

CONSTRAINED UNPREDICTABLE behavior

If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.
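
The order of the checks in the T3 decode matters, because the same bit pattern can belong to PLD, LDRB (literal), or LDRBT before it is treated as LDRB (immediate). The Python sketch below traces that order. It is illustrative only: the function name and the tuple-based return convention are hypothetical, and the arguments are the integer values of the instruction fields.

```python
def decode_ldrb_t3(rt, rn, p, u, w, imm8):
    """Follow the T3 decode order and return a (kind, detail) pair."""
    if rt == 0b1111 and p == 1 and u == 0 and w == 0:
        return ("SEE", "PLD, PLDW (immediate)")
    if rn == 0b1111:
        return ("SEE", "LDRB (literal)")
    if p == 1 and u == 1 and w == 0:
        return ("SEE", "LDRBT")
    if p == 0 and w == 0:
        return ("UNDEFINED", None)
    t, n = rt, rn
    index, add, wback = (p == 1), (u == 1), (w == 1)
    if (t == 15 and w == 1) or (wback and n == t):
        return ("UNPREDICTABLE", None)
    fields = {"t": t, "n": n, "imm32": imm8,          # ZeroExtend(imm8, 32)
              "index": index, "add": add, "wback": wback}
    return ("LDRB", fields)
```

Note that the offset variant of T3 is reached only with U == 0, since P == 1, U == 1, W == 0 is redirected to LDRBT.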


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> For encoding A1, T2 and T3: is the general-purpose base register, encoded in the "Rn" field. For PC 
use see LDRB (literal). 
For encoding T1: is the general-purpose base register, encoded in the "Rn" field. 
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1 
+ Specifies the offset is added to the base register. 
<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to 
0 if omitted, and encoded in the "imm12" field. 
For encoding T1: is an optional 5-bit unsigned immediate byte offset, in the range 0 to 31, defaulting 
to 0 and encoded in the "imm5" field. 
For encoding T2: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095, 
defaulting to 0 and encoded in the "imm12" field. 
For encoding T3: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 
if omitted, and encoded in the "imm8" field. 
Operation for all encodings 
if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        R[t] = ZeroExtend(MemU[address,1], 32);
        if wback then R[n] = offset_addr;
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        R[t] = ZeroExtend(MemU[address,1], 32);
        if wback then R[n] = offset_addr;
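
The three addressing modes differ only in the index and wback values that decode produces. The Python sketch below models the operation with those values as inputs. It is a simplified sketch: the condition check is omitted, and regs (a list standing in for R[]) and memory (a dict from byte address to byte value) are hypothetical stand-ins for the architectural state.

```python
def ldrb_immediate(regs, memory, t, n, imm32, index, add, wback):
    """Model LDRB (immediate): load one byte, zero-extend it, optionally write back."""
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base
    regs[t] = memory[address] & 0xFF          # ZeroExtend(MemU[address,1], 32)
    if wback:
        regs[n] = offset_addr
```

With index true and wback false this is offset addressing, with index false and wback true it is post-indexed, and with both true it is pre-indexed.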

F5.1.72 LDRB (literal)


Load Register Byte (literal) calculates an address from the PC value and an immediate offset, loads a byte from 
memory, zero-extends it to form a 32-bit word, and writes it to a register. For information about memory accesses 
see Memory accesses on page F2-2412. 


A1

| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19   16 | 15 12 | 11    0 |
| !=1111  | 0  1  0  | P  | U  | 1  | W  | 1  | 1 1 1 1 | Rt    | imm12   |
  cond

A1 variant


Applies when !(P == 0 && W == 1).


LDRB{<c>}{<q>} <Rt>, <label> // Normal form 
LDRB{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 
if P == '0' && W == '1' then SEE LDRBT;
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32);
add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || wback then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: wback = FALSE;
• The instruction treats bit[24] as the P bit, and bit[21] as the writeback (W) bit, and uses the same addressing
  mode as described in LDRB (immediate). The instruction uses post-indexed addressing when P == '0' and
  uses pre-indexed addressing otherwise. The instruction is handled as described in Using R15 on
  page K1-5457.


T1 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3     0 | 15   12 | 11    0 |
| 1  1  1  1  1  | 0  0 | 0 | U | 0 0 | 1 | 1 1 1 1 | !=1111  | imm12   |
                                                      Rt


T1 variant 


LDRB{<c>}{<q>} <Rt>, <label> // Preferred syntax 
LDRB{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 


if Rt == '1111' then SEE PLD; 
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == '1');
// ARMv8-A removes UNPREDICTABLE for R13 

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<label> The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the required
value of the offset from the Align(PC, 4) value of the instruction to this label. Permitted values of
the offset are -4095 to 4095. If the offset is zero or positive, imm32 is equal to the offset and add ==
TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset and add == FALSE,
encoded as U == 0.


+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1 

<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to
0 if omitted, and encoded in the "imm12" field.


For encoding T1: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the 
"imm12" field. 


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    base = Align(PC,4);
    address = if add then (base + imm32) else (base - imm32);
    R[t] = ZeroExtend(MemU[address,1], 32);

F5.1.73 LDRB (register)


Load Register Byte (register) calculates an address from a base register value and an offset register value, loads a 
byte from memory, zero-extends it to form a 32-bit word, and writes it to a register. The offset register value can 
optionally be shifted. For information about memory accesses see Memory accesses on page F2-2412. 


A1 
| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15 12 | 11  7 | 6  5 | 4 | 3  0 |
| !=1111  | 0  1  1  | P  | U  | 1  | W  | 1  | Rn    | Rt    | imm5  | type | 0 | Rm   |
  cond


Offset variant 
Applies when P == 1 && W == 0. 


LDRB{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRB{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRB{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE LDRBT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 
| 15 14 13 12 | 11 10 9 | 8  6 | 5  3 | 2  0 |
| 0  1  0  1  | 1  1  0 | Rm   | Rn   | Rt   |


T1 variant 


LDRB{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 

Decode for this encoding

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3    0 | 15   12 | 11 10 9 8 7 6 | 5  4 | 3  0 |
| 1  1  1  1  1  | 0  0 | 0 | 0 | 0 0 | 1 | !=1111 | !=1111  | 0  0  0 0 0 0 | imm2 | Rm   |
                                            Rn       Rt





T2 variant 


LDRB{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
LDRB{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rt == '1111' then SEE PLD; 

if Rn == '1111' then SEE LDRB (literal); 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);

index = TRUE; add = TRUE; wback = FALSE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used
in the offset variant.


For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted 
and encoded in the "U" field. It can have the following values: 


- when U = 0
+ when U = 1 
+ Specifies the index register is added to the base register. 
<Rm> Is the general-purpose index register, encoded in the "Rm" field. 
<shift> The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
applied to a register on page F2-2410.

<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if index then offset_addr else R[n];
    R[t] = ZeroExtend(MemU[address,1], 32);
    if wback then R[n] = offset_addr;

F5.1.74 LDRBT

Load Register Byte Unprivileged loads a byte from memory, zero-extends it to form a 32-bit word, and writes it to
a register. For information about memory accesses see Memory accesses on page F2-2412.

The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is
actually running in User mode.

LDRBT is UNPREDICTABLE in Hyp mode.

The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a
base register value and an immediate offset, and leaves the base register unchanged.

The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the
memory access, and calculates a new address from a base register value and an offset and writes it back to the base
register. The offset can be an immediate value or an optionally-shifted register value.

A1 

| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15 12 | 11    0 |
| !=1111  | 0  1  0  | 0  | U  | 1  | 1  | 1  | Rn    | Rt    | imm12   |
  cond

A1 variant

LDRBT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 

Decode for this encoding 

t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');
register_form = FALSE; imm32 = ZeroExtend(imm12, 32); 
if t == 15 || n == 15 || n == t then UNPREDICTABLE; 

CONSTRAINED UNPREDICTABLE behavior 

If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction uses post-indexed addressing with the base register as PC. This is handled as described in
  Using R15 on page K1-5457.
• The instruction uses immediate offset addressing with the base register as PC, without writeback.

If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

A2 

| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15 12 | 11  7 | 6  5 | 4 | 3  0 |
| !=1111  | 0  1  1  | 0  | U  | 1  | 1  | 1  | Rn    | Rt    | imm5  | type | 0 | Rm   |
  cond

A2 variant


LDRBT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 
Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');
register_form = TRUE; (shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


| 15 14 13 12 11 | 10 9 | 8 | 7 | 6 5 | 4 | 3    0 | 15 12 | 11 10 9 8 | 7    0 |
| 1  1  1  1  1  | 0  0 | 0 | 0 | 0 0 | 1 | !=1111 | Rt    | 1  1  1 0 | imm8   |
                                            Rn


T1 variant 


LDRBT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 

if Rn == '1111' then SEE LDRB (literal); 

t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32); 

if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
can be used, but this is deprecated.

For encoding A2 and T1: is the general-purpose register to be transferred, encoded in the "Rt" field.





<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to 
+ if omitted and encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1

For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<Rm> Is the general-purpose index register, encoded in the "Rm" field. 
<shift> The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
applied to a register on page F2-2410.
+ Specifies the offset is added to the base register. 


<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to 
0 if omitted, and encoded in the "imm12" field. 


For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255, 
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then
    if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then Shift(R[m], shift_t, shift_n, PSTATE.C) else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    R[t] = ZeroExtend(MemU_unpriv[address,1], 32);
    if postindex then R[n] = offset_addr;


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDRB (immediate).

F5.1.75 LDRD (immediate)


Load Register Dual (immediate) calculates an address from a base register value and an immediate offset, loads two 
words from memory, and writes them to two registers. It can use offset, post-indexed, or pre-indexed addressing. 
For information about memory accesses see Memory accesses on page F2-2412. 





A1

| 31   28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19   16 | 15 12 | 11   8 | 7 6 5 4 | 3    0 |
| !=1111  | 0  0  0  | P  | U  | 1  | W  | 0  | !=1111  | Rt    | imm4H  | 1 1 0 1 | imm4L  |
  cond                                          Rn


Offset variant 
Applies when P == 1 && W == 0. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDRD (literal);
if Rt<0> == '1' then UNPREDICTABLE;
t = UInt(Rt); t2 = t+1; n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if P == '0' && W == '1' then UNPREDICTABLE;
if wback && (n == t || n == t2) then UNPREDICTABLE;
if t2 == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && (n == t || n == t2), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

If P == '0' && W == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as an LDRD using one of offset, post-indexed, or pre-indexed addressing.

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: t<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side-effects. This does
  not apply when Rt == '1111'.


T1

| 15 14 13 12 11 10 9 | 8 | 7 | 6 | 5 | 4 | 3 0    | 15 12 | 11 8 | 7 0  |
| 1 1 1 0 1 0 0       | P | U | 1 | W | 1 | !=1111 | Rt    | Rt2  | imm8 |
|                     |   |   |   |   |   | Rn     |       |      |      |


Offset variant 
Applies when P == 1 && W == 0. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if P == '0' && W == '0' then SEE "Related encodings";
if Rn == '1111' then SEE LDRD (literal);
t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if wback && (n == t || n == t2) then UNPREDICTABLE;
if t == 15 || t2 == 15 || t == t2 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If wback && (n == t || n == t2), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

If t == t2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The load instruction executes but the destination register takes an UNKNOWN value.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Related encodings: Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on
page F3-2466. 


Assembler symbols


<c>     See Standard assembler syntax fields on page F2-2406.

<q>     See Standard assembler syntax fields on page F2-2406.

<Rt>    For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field.
        This register must be even-numbered and not R14.

        For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field.

<Rt2>   For encoding A1: is the second general-purpose register to be transferred. This register must be
        <R(t+1)>.

        For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field.

<Rn>    Is the general-purpose base register, encoded in the "Rn" field. For PC use see LDRD (literal).

+/-     Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
        encoded in the "U" field. It can have the following values:

        -   when U = 0
        +   when U = 1

<imm>   For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
        if omitted, and encoded in the "imm4H:imm4L" field.

        For encoding T1: is the unsigned immediate byte offset, a multiple of 4, in the range 0 to 1020,
        defaulting to 0 if omitted, and encoded in the "imm8" field as <imm>/4.


Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    if address == Align(address, 8) then
        data = MemA[address,8];
        if BigEndian() then
            R[t] = data<63:32>;
            R[t2] = data<31:0>;
        else
            R[t] = data<31:0>;
            R[t2] = data<63:32>;
    else
        R[t] = MemA[address,4];
        R[t2] = MemA[address+4,4];
    if wback then R[n] = offset_addr;
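
The address calculation in the operation pseudocode can be sketched as follows. This is a minimal Python illustration, not the architectural definition: the function name ldrd_addressing is hypothetical, and register values are assumed to be plain integers wrapped to 32 bits.

```python
MASK32 = 0xFFFFFFFF

def ldrd_addressing(base, imm32, index, add, wback):
    """Return (address, new_base) for the three LDRD (immediate) addressing modes.

    Hypothetical helper mirroring offset_addr / address / writeback in the
    pseudocode; arithmetic wraps at 32 bits, as for a 32-bit register.
    """
    offset_addr = (base + imm32) & MASK32 if add else (base - imm32) & MASK32
    address = offset_addr if index else base      # index == FALSE: post-indexed
    new_base = offset_addr if wback else base     # writeback only when wback
    return address, new_base
```

With base = 0x1000 and imm32 = 8, offset addressing (index TRUE, wback FALSE) accesses 0x1008 and leaves the base unchanged, post-indexed addressing (index FALSE, wback TRUE) accesses 0x1000 and then updates the base to 0x1008, and pre-indexed addressing (index TRUE, wback TRUE) does both at 0x1008.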
F5.1.76 LDRD (literal)
Load Register Dual (literal) calculates an address from the PC value and an immediate offset, loads two words from 
memory, and writes them to two registers. For information about memory accesses see Memory accesses on 
page F2-2412. 
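A1

| 31 28  | 27 26 25 | 24  | 23 | 22 | 21  | 20 | 19 18 17 16 | 15 12 | 11 8  | 7 6 5 4 | 3 0   |
| !=1111 | 0 0 0    | (1) | U  | 1  | (0) | 0  | 1 1 1 1     | Rt    | imm4H | 1 1 0 1 | imm4L |
| cond   |          |     |    |    |     |    |             |       |       |         |       |

A1 variant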
LDRD{<c>}{<q>} <Rt>, <Rt2>, <label> // Normal form 
LDRD{<c>}{<q>} <Rt>, <Rt2>, [PC, #{+/-}<imm>] // Alternative form 
Decode for this encoding 
if Rt<0> == '1' then UNPREDICTABLE;
t = UInt(Rt); t2 = t+1; imm32 = ZeroExtend(imm4H:imm4L, 32); add = (U == '1');
if t2 == 15 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If Rt<0> == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: t<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side-effects. This does
  not apply when Rt == '1111'.
If P == '0' || W == '1', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as if P == 1 and W == 0.
T1

| 15 14 13 12 11 10 9 | 8 | 7 | 6 | 5 | 4 | 3 2 1 0 | 15 12 | 11 8 | 7 0  |
| 1 1 1 0 1 0 0       | P | U | 1 | W | 1 | 1 1 1 1 | Rt    | Rt2  | imm8 |

T1 variant
Applies when !(P == 0 && W == 0).
LDRD{<c>}{<q>} <Rt>, <Rt2>, <label> // Normal form 
LDRD{<c>}{<q>} <Rt>, <Rt2>, [PC, #{+/-}<imm>] // Alternative form 
Decode for this encoding


if P == '0' && W == '0' then SEE "Related encodings";
t = UInt(Rt); t2 = UInt(Rt2);
imm32 = ZeroExtend(imm8:'00', 32); add = (U == '1');
if t == 15 || t2 == 15 || t == t2 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if W == '1' then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == t2, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The load instruction executes but the destination register takes an UNKNOWN value.

If W == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses post-indexed addressing when P == '0' and uses pre-indexed addressing otherwise. The
  instruction is handled as described in Using R15 on page K1-5457.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related encodings: Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on 
page F3-2466. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field. 


This register must be even-numbered and not R14. 


For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field. 


<Rt2> For encoding A1: is the second general-purpose register to be transferred. This register must be 
<R(t+1)>. 


For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field. 


<label> For encoding A1: the label of the literal data item that is to be loaded into <Rt>. The assembler 
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label. 
Any value in the range -255 to 255 is permitted. If the offset is zero or positive, imm32 is equal to the 
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset 
and add == FALSE, encoded as U == 0. 


For encoding T1: the label of the literal data item that is to be loaded into <Rt>. The assembler 
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label. 
Permitted values of the offset are multiples of 4 in the range -1020 to 1020. If the offset is zero or 
positive, imm32 is equal to the offset and add == TRUE, encoded as U == 1. If the offset is negative, 
imm32 is equal to minus the offset and add == FALSE, encoded as U == 0. 
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:


- when U = 0
+ when U = 1
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.


For encoding T1: is the optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more
information, see Use of labels in UAL instruction syntax on page F1-2369.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    if address == Align(address, 8) then
        data = MemA[address,8];
        if BigEndian() then
            R[t] = data<63:32>;
            R[t2] = data<31:0>;
        else
            R[t] = data<31:0>;
            R[t2] = data<63:32>;
    else
        R[t] = MemA[address,4];
        R[t2] = MemA[address+4,4];
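
The relationship between a label offset, the U bit, and imm32 can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names are hypothetical, pc stands for the PC value as read by the instruction, and range checking against the permitted offsets of each encoding is left out.

```python
MASK32 = 0xFFFFFFFF

def encode_label_offset(offset):
    """Map a signed label offset to (U, imm32).

    Hypothetical helper: a zero or positive offset gives add == TRUE (U == 1),
    a negative offset gives add == FALSE (U == 0) with imm32 = -offset.
    """
    if offset >= 0:
        return 1, offset
    return 0, -offset

def ldrd_literal_address(pc, imm32, add):
    """Compute the literal address from Align(PC, 4) and imm32 (hypothetical helper)."""
    aligned_pc = pc & ~0x3 & MASK32               # Align(PC, 4)
    if add:
        return (aligned_pc + imm32) & MASK32
    return (aligned_pc - imm32) & MASK32
```

For instance, a label 12 bytes below Align(PC, 4) is encoded as U == 0 with imm32 == 12.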
F5.1.77 LDRD (register)


Load Register Dual (register) calculates an address from a base register value and a register offset, loads two words 
from memory, and writes them to two registers. It can use offset, post-indexed, or pre-indexed addressing. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31 28  | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15 12 | 11 10 9 8       | 7 6 5 4 | 3 0 |
| !=1111 | 0 0 0    | P  | U  | 0  | W  | 0  | Rn    | Rt    | (0) (0) (0) (0) | 1 1 0 1 | Rm  |
| cond   |          |    |    |    |    |    |       |       |                 |         |     |


Offset variant 
Applies when P == 1 && W == 0. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, {+/-}<Rm>] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>], {+/-}<Rm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, {+/-}<Rm>]! 


Decode for all variants of this encoding 


if Rt<0> == '1' then UNPREDICTABLE;
t = UInt(Rt); t2 = t+1; n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if P == '0' && W == '1' then UNPREDICTABLE;
if t2 == 15 || m == 15 || m == t || m == t2 then UNPREDICTABLE;
if wback && (n == 15 || n == t || n == t2) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && (n == t || n == t2), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

If P == '0' && W == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as an LDRD using one of offset, post-indexed, or pre-indexed addressing.

If m == t || m == t2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction loads register Rm with an UNKNOWN value.

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: t<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side-effects. This does
  not apply when Rt == '1111'.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols


<c>     See Standard assembler syntax fields on page F2-2406.

<q>     See Standard assembler syntax fields on page F2-2406.

<Rt>    Is the first general-purpose register to be transferred, encoded in the "Rt" field. This register must
        be even-numbered and not R14.

<Rt2>   Is the second general-purpose register to be transferred. This register must be <R(t+1)>.

<Rn>    Is the general-purpose base register, encoded in the "Rn" field. The PC can be used in the offset
        variant.

+/-     Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
        and encoded in the "U" field. It can have the following values:

        -   when U = 0
        +   when U = 1

<Rm>    Is the general-purpose index register, encoded in the "Rm" field.


Operation


if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + R[m]) else (R[n] - R[m]);
    address = if index then offset_addr else R[n];
    if address == Align(address, 8) then
        data = MemA[address,8];
        if BigEndian() then
            R[t] = data<63:32>;
            R[t2] = data<31:0>;
        else
            R[t] = data<31:0>;
            R[t2] = data<63:32>;
    else
        R[t] = MemA[address,4];
        R[t2] = MemA[address+4,4];
    if wback then R[n] = offset_addr;
F5.1.78 LDREX


Load Register Exclusive calculates an address from a base register value and an immediate offset, loads a word from 
memory, writes it to a register and: 


• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


| 31 28  | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8   | 7 6 5 4 | 3 2 1 0         |
| !=1111 | 0 0 0 1     | 1 0 0 1     | Rn    | Rt    | (1) (1) 1 1 | 1 0 0 1 | (1) (1) (1) (1) |
| cond   |             |             |       |       |             |         |                 |


A1 variant


LDREX{<c>}{<q>} <Rt>, [<Rn> {, {#}<imm>}] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); imm32 = Zeros(32); // Zero offset
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 10 9 8       | 7 0  |
| 1 1 1 0     | 1 0 0 0   | 0 1 0 1 | Rn  | Rt    | (1) (1) (1) (1) | imm8 |


T1 variant 


LDREX{<c>}{<q>} <Rt>, [<Rn> {, #<imm>}] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
if t == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> Is the general-purpose base register, encoded in the "Rn" field. 

<imm> For encoding A1: the immediate offset added to the value of <Rn> to calculate the address. <imm> can
only be 0 or omitted.

For encoding T1: the immediate offset added to the value of <Rn> to calculate the address. <imm> can
be omitted, meaning an offset of 0. Values are multiples of 4 in the range 0-1020.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n] + imm32;
    AArch32.SetExclusiveMonitors(address,4);
    R[t] = MemA[address,4];
F5.1.79 LDREXB


Load Register Exclusive Byte derives an address from a base register value, loads a byte from memory, zero-extends 
it to form a 32-bit word, writes it to a register and: 


• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31 28  | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8   | 7 6 5 4 | 3 2 1 0         |
| !=1111 | 0 0 0 1     | 1 1 0 1     | Rn    | Rt    | (1) (1) 1 1 | 1 0 0 1 | (1) (1) (1) (1) |
| cond   |             |             |       |       |             |         |                 |


A1 variant


LDREXB{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE; 


T1 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 10 9 8       | 7 6 5 4 | 3 2 1 0         |
| 1 1 1 0     | 1 0 0 0   | 1 1 0 1 | Rn  | Rt    | (1) (1) (1) (1) | 0 1 0 0 | (1) (1) (1) (1) |


T1 variant 
LDREXB{<c>}{<q>} <Rt>, [<Rn>] 
Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address,1);
    R[t] = ZeroExtend(MemA[address,1], 32);
F5.1.80 LDREXD


Load Register Exclusive Doubleword derives an address from a base register value, loads a 64-bit doubleword from 
memory, writes it to two registers and: 


• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


| 31 28  | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8   | 7 6 5 4 | 3 2 1 0         |
| !=1111 | 0 0 0 1     | 1 0 1 1     | Rn    | Rt    | (1) (1) 1 1 | 1 0 0 1 | (1) (1) (1) (1) |
| cond   |             |             |       |       |             |         |                 |


A1 variant
LDREXD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>] 


Decode for this encoding 


t = UInt(Rt); t2 = t+1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: t<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side effects.

If Rt == '1110', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction is handled as described in Using R15 on page K1-5457.


T1 

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 2 1 0         |
| 1 1 1 0     | 1 0 0 0   | 1 1 0 1 | Rn  | Rt    | Rt2  | 0 1 1 1 | (1) (1) (1) (1) |


T1 variant 


LDREXD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>] 
Decode for this encoding
t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn);
if t == 15 || t2 == 15 || t == t2 || n == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If t == t2, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The load instruction executes but the destination register takes an UNKNOWN value.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field. 


<Rt> must be even-numbered and not R14. 


For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field. 


<Rt2> For encoding A1: is the second general-purpose register to be transferred. <Rt2> must be <R(t+1)>. 


For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address,8);
    value = MemA[address,8];
    // Extract words from 64-bit loaded value such that R[t] is
    // loaded from address and R[t2] from address+4.
    R[t] = if BigEndian() then value<63:32> else value<31:0>;
    R[t2] = if BigEndian() then value<31:0> else value<63:32>;
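
The word extraction in the last two lines of the pseudocode can be illustrated with a short Python sketch. The helper names split_doubleword and load_pair are hypothetical, and memory is modeled as a plain bytes object, which is an assumption of this sketch rather than anything the architecture defines.

```python
def split_doubleword(value, big_endian):
    """Split a 64-bit loaded value into (R[t], R[t2]) as in the pseudocode."""
    low = value & 0xFFFFFFFF            # value<31:0>
    high = (value >> 32) & 0xFFFFFFFF   # value<63:32>
    return (high, low) if big_endian else (low, high)

def load_pair(mem, address, big_endian):
    """Model MemA[address,8] on a bytes object, then split the result.

    The 8 bytes are interpreted with the current data endianness, so in
    both cases R[t] ends up holding the word located at `address` and
    R[t2] the word located at `address + 4`.
    """
    order = "big" if big_endian else "little"
    value = int.from_bytes(mem[address:address + 8], order)
    return split_doubleword(value, big_endian)
```

The selection of value<63:32> or value<31:0> therefore depends on endianness only so that the register order always follows the address order.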
F5.1.81 LDREXH


Load Register Exclusive Halfword derives an address from a base register value, loads a halfword from memory, 
zero-extends it to form a 32-bit word, writes it to a register and: 


• If the address has the Shared Memory attribute, marks the physical address as exclusive access for the
  executing PE in a global monitor.

• Causes the executing PE to indicate an active exclusive access in the local monitor.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1 


| 31 28  | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8   | 7 6 5 4 | 3 2 1 0         |
| !=1111 | 0 0 0 1     | 1 1 1 1     | Rn    | Rt    | (1) (1) 1 1 | 1 0 0 1 | (1) (1) (1) (1) |
| cond   |             |             |       |       |             |         |                 |


A1 variant


LDREXH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 10 9 8       | 7 6 5 4 | 3 2 1 0         |
| 1 1 1 0     | 1 0 0 0   | 1 1 0 1 | Rn  | Rt    | (1) (1) (1) (1) | 0 1 0 1 | (1) (1) (1) (1) |


T1 variant 


LDREXH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    AArch32.SetExclusiveMonitors(address,2);
    R[t] = ZeroExtend(MemA[address,2], 32);
F5.1.82 LDRH (immediate)


Load Register Halfword (immediate) calculates an address from a base register value and an immediate offset, loads
a halfword from memory, zero-extends it to form a 32-bit word, and writes it to a register. It can use offset,
post-indexed, or pre-indexed addressing. For information about memory accesses see Memory accesses on
page F2-2412.





A1

| 31 28  | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16  | 15 12 | 11 8  | 7 6 5 4 | 3 0   |
| !=1111 | 0 0 0    | P  | U  | 1  | W  | 1  | !=1111 | Rt    | imm4H | 1 0 1 1 | imm4L |
| cond   |          |    |    |    |    |    | Rn     |       |       |         |       |


Offset variant 
Applies when P == 1 && W == 0. 


LDRH{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRH{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRH{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDRH (literal);
if P == '0' && W == '1' then SEE LDRHT;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || (wback && n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1

| 15 14 13 12 11 | 10 6 | 5 3 | 2 0 |
| 1 0 0 0 1      | imm5 | Rn  | Rt  |


T1 variant 


LDRH{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-2751 
1ID092916 Non-Confidential 


F5 T32 and A32 Base Instruction Set Instruction Descriptions 
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5:'0', 32);
index = TRUE; add = TRUE; wback = FALSE;


T2 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0    | 15 12  | 11 0  |
| 1 1 1 1     | 1 0 0 0   | 1 0 1 1 | !=1111 | !=1111 | imm12 |
|             |           |         | Rn     | Rt     |       |





T2 variant 


LDRH{<c>}.W <Rt>, [<Rn> {, #{+}<imm>}] // <Rt>, <Rn>, <imm> can be represented in T1 
LDRH{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 


if Rt == '1111' then SEE PLD (immediate);
if Rn == '1111' then SEE LDRH (literal);
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
// ARMv8-A removes UNPREDICTABLE for R13


T3 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0    | 15 12 | 11 | 10 | 9 | 8 | 7 0  |
| 1 1 1 1     | 1 0 0 0   | 0 0 1 1 | !=1111 | Rt    | 1  | P  | U | W | imm8 |
|             |           |         | Rn     |       |    |    |   |   |      |





Offset variant 
Applies when Rt != 1111 && P == 1 && U == 0 && W == 0.


LDRH{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}]


Post-indexed variant
Applies when P == 0 && W == 1.


LDRH{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm>


Pre-indexed variant
Applies when P == 1 && W == 1.


LDRH{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]!


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDRH (literal);
if Rt == '1111' && P == '1' && U == '0' && W == '0' then SEE PLDW (immediate);
if P == '1' && U == '1' && W == '0' then SEE LDRHT;
if P == '0' && W == '0' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if (t == 15 && W == '1') || (wback && n == t) then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13
CONSTRAINED UNPREDICTABLE behavior


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols


<c>     See Standard assembler syntax fields on page F2-2406.

<q>     See Standard assembler syntax fields on page F2-2406.

<Rt>    Is the general-purpose register to be transferred, encoded in the "Rt" field.

<Rn>    For encoding A1, T2 and T3: is the general-purpose base register, encoded in the "Rn" field. For PC
        use see LDRH (literal).

        For encoding T1: is the general-purpose base register, encoded in the "Rn" field.

+/-     Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
        encoded in the "U" field. It can have the following values:

        -   when U = 0
        +   when U = 1

+       Specifies the offset is added to the base register.

<imm>   For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
        if omitted, and encoded in the "imm4H:imm4L" field.

        For encoding T1: is the optional positive unsigned immediate byte offset, a multiple of 2, in the
        range 0 to 62, defaulting to 0 and encoded in the "imm5" field as <imm>/2.

        For encoding T2: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095,
        defaulting to 0 and encoded in the "imm12" field.

        For encoding T3: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
        if omitted, and encoded in the "imm8" field.


Operation for all encodings


if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        data = MemU[address,2];
        if wback then R[n] = offset_addr;
        R[t] = ZeroExtend(data, 32);
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        data = MemU[address,2];
        if wback then R[n] = offset_addr;
        R[t] = ZeroExtend(data, 32);
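
As a rough illustration of the operation, the following Python sketch models a halfword load with zero-extension and optional writeback. The function ldrh and its arguments are hypothetical: memory is modeled as a bytes object, registers as a list of integers, and alignment checks, faults, and condition evaluation are omitted.

```python
MASK32 = 0xFFFFFFFF

def ldrh(mem, regs, t, n, imm32, index, add, wback, big_endian=False):
    """Model the LDRH (immediate) operation on a register list (sketch only)."""
    base = regs[n]
    offset_addr = (base + imm32) & MASK32 if add else (base - imm32) & MASK32
    address = offset_addr if index else base
    # MemU[address,2]: two bytes interpreted with the chosen endianness.
    order = "big" if big_endian else "little"
    data = int.from_bytes(mem[address:address + 2], order)
    if wback:
        regs[n] = offset_addr
    # ZeroExtend(data, 32): the upper 16 bits are zero, never sign bits.
    regs[t] = data & 0xFFFF
```

Because the result is zero-extended, a halfword of 0xFFFF is loaded as 0x0000FFFF and not as 0xFFFFFFFF.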
F5.1.83 LDRH (literal)


Load Register Halfword (literal) calculates an address from the PC value and an immediate offset, loads a halfword 
from memory, zero-extends it to form a 32-bit word, and writes it to a register. For information about memory 
accesses see Memory accesses on page F2-2412. 


A1 


| 31 28  | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 18 17 16 | 15 12 | 11 8  | 7 6 5 4 | 3 0   |
| !=1111 | 0 0 0    | P  | U  | 1  | W  | 1  | 1 1 1 1     | Rt    | imm4H | 1 0 1 1 | imm4L |
| cond   |          |    |    |    |    |    |             |       |       |         |       |


A1 variant
Applies when !(P == 0 && W == 1).


LDRH{<c>}{<q>} <Rt>, <label> // Normal form 
LDRH{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 
if P == '0' && W == '1' then SEE LDRHT;
t = UInt(Rt); imm32 = ZeroExtend(imm4H:imm4L, 32);
add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || wback then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: wback = FALSE.
• The instruction treats bit[24] as the P bit, and bit[21] as the writeback (W) bit, and uses the same addressing
  mode as described in LDRH (immediate). The instruction uses post-indexed addressing when P == '0' and
  uses pre-indexed addressing otherwise. The instruction is handled as described in Using R15 on
  page K1-5457.


T1 


15-8      7   6-4  3-0   | 15-12      11-0
11111000  U   011  1111  | Rt         imm12
                         | (!= 1111)


T1 variant 


LDRH{<c>}{<q>} <Rt>, <label> // Preferred syntax 
LDRH{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 


if Rt == '1111' then SEE PLD (literal); 
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == '1');
// ARMv8-A removes UNPREDICTABLE for R13 







Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<label> For encoding A1: the label of the literal data item that is to be loaded into <Rt>. The assembler 


calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label. 
Any value in the range -255 to 255 is permitted. If the offset is zero or positive, imm32 is equal to the 
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset 
and add == FALSE, encoded as U == 0. 


For encoding T1: the label of the literal data item that is to be loaded into <Rt>. The assembler 
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label. 
Permitted values of the offset are -4095 to 4095. If the offset is zero or positive, imm32 is equal to the 
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset 
and add == FALSE, encoded as U == 0. 


+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1 

<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 


if omitted, and encoded in the "imm4H:imm4L" field. 


For encoding T1: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the 
"imm12" field. 


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 
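
The mapping from a label offset to the U bit and the immediate field, as described for <label>, can be sketched in Python. This is only an illustration under stated assumptions, not the behavior of any particular assembler: the function name encode_literal_offset and its imm_bits parameter are hypothetical, and the offset is assumed to be already measured from the Align(PC, 4) value of the instruction.

```python
def encode_literal_offset(offset: int, imm_bits: int) -> tuple[int, int]:
    """Return (U, imm) for a signed byte offset from Align(PC, 4).

    imm_bits is 8 for encoding A1 (imm4H:imm4L) and 12 for encoding T1 (imm12).
    Hypothetical helper for illustration only.
    """
    limit = (1 << imm_bits) - 1           # 255 for A1, 4095 for T1
    if not -limit <= offset <= limit:
        raise ValueError(f"offset {offset} is outside -{limit}..{limit}")
    if offset >= 0:
        return 1, offset                  # add == TRUE, imm32 == offset
    return 0, -offset                     # add == FALSE, imm32 == minus the offset
```

A label can never produce the pair (U = 0, imm = 0) through this mapping, which is why a subtraction of 0 is only reachable through the alternative [PC, #-0] syntax.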


Operation for all encodings 


if ConditionPassed() then 
EncodingSpecificOperations(); 
base = Align(PC,4); 
address = if add then (base + imm32) else (base - imm32); 
data = MemU[address,2]; 
R[t] = ZeroExtend(data, 32); 
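
The operation pseudocode above can be modeled with a short Python sketch. It assumes a flat bytes object indexed from address 0 stands in for memory, that data is little-endian, and that pc is the PC value as read by the instruction. The alignment and fault checks performed by MemU are not modeled, and the names align and ldrh_literal are hypothetical.

```python
def align(value: int, boundary: int) -> int:
    """Align(x, n): round x down to a multiple of n."""
    return value - (value % boundary)


def ldrh_literal(memory: bytes, pc: int, imm32: int, add: bool) -> int:
    """Return the value written to R[t] by the literal halfword load."""
    base = align(pc, 4)
    address = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    # Assumption: little-endian data; MemU checks are omitted.
    data = int.from_bytes(memory[address:address + 2], "little")
    return data  # a 16-bit value held in an int is already zero-extended
```

Because the base is Align(PC, 4) and not the raw PC value, two loads at instruction addresses that differ only in bit[1] and use the same immediate read the same location.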
F5.1.84 LDRH (register)


Load Register Halfword (register) calculates an address from a base register value and an offset register value, loads 
a halfword from memory, zero-extends it to form a 32-bit word, and writes it to a register. The offset register value 
can be shifted left by 0, 1, 2, or 3 bits. For information about memory accesses see Memory accesses on 

page F2-2412. 


A1 


31-28      27-25  24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
cond       000    P   U   0   W   1   Rn     Rt     (0)(0)(0)(0)  1011  Rm
(!= 1111)


Offset variant 
Applies when P == 1 && W == 0. 


LDRH{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRH{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRH{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>]! 


Decode for all variants of this encoding 

if P == '0' && W == '1' then SEE LDRHT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
(shift_t, shift_n) = (SRType_LSL, 0);
if t == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 

15-9     8-6  5-3  2-0
0101101  Rm   Rn   Rt


T1 variant 


LDRH{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 

Decode for this encoding

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


15-4          3-0        | 15-12      11-6    5-4   3-0
111110000011  Rn         | Rt         000000  imm2  Rm
              (!= 1111)  | (!= 1111)





T2 variant 


LDRH{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
LDRH{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rn == '1111' then SEE LDRH (literal); 

if Rt == '1111' then SEE PLDW (register); 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);

index = TRUE; add = TRUE; wback = FALSE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 


in the offset variant. 


For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted 
and encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1 

+ Specifies the index register is added to the base register. 

<Rm> Is the general-purpose index register, encoded in the "Rm" field. 

<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.


Operation for all encodings 


if ConditionPassed() then 
EncodingSpecificOperations(); 
offset = Shift(R[m], shift_t, shift_n, PSTATE.C); 
offset_addr = if add then (R[n] + offset) else (R[n] - offset);
address = if index then offset_addr else R[n]; 

data = MemU[address,2]; 

if wback then R[n] = offset_addr; 

R[t] = ZeroExtend(data, 32); 
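
As an illustration of how index, add, and wback select between the offset, pre-indexed, and post-indexed forms, the operation can be modeled in Python. This is a sketch under stated assumptions: regs is a 16-entry list standing in for R[0..15], memory is a little-endian bytes object indexed from address 0, only the LSL shift used by these encodings is modeled, and the function name ldrh_register is hypothetical.

```python
def ldrh_register(regs, memory, t, n, m, shift_n=0,
                  add=True, index=True, wback=False):
    """Model LDRH (register): regs is mutated in place."""
    # Shift(R[m], SRType_LSL, shift_n, PSTATE.C): the carry is unused for LSL.
    offset = (regs[m] << shift_n) & 0xFFFFFFFF
    base = regs[n]
    offset_addr = (base + offset if add else base - offset) & 0xFFFFFFFF
    # index selects the offset address (offset, pre-indexed) or the
    # unmodified base (post-indexed) for the access.
    address = offset_addr if index else base
    data = int.from_bytes(memory[address:address + 2], "little")
    if wback:
        regs[n] = offset_addr
    regs[t] = data  # zero-extended: the upper 16 bits are 0
```

With index=True and wback=False this is the offset form. With index=False and wback=True it is the post-indexed form, where the load uses the old base and the base is updated afterwards.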
F5.1.85 LDRHT

Load Register Halfword Unprivileged loads a halfword from memory, zero-extends it to form a 32-bit word, and 

writes it to a register. For information about memory accesses see Memory accesses on page F2-2412. 

The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is 

actually running in User mode. 

LDRHT is UNPREDICTABLE in Hyp mode. 

The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a 

base register value and an immediate offset, and leaves the base register unchanged. 

The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the 

memory access, and calculates a new address from a base register value and an offset and writes it back to the base 

register. The offset can be an immediate value or a register value. 

A1 

31-28      27-24  23  22  21  20  19-16  15-12  11-8   7-4   3-0
cond       0000   U   1   1   1   Rn     Rt     imm4H  1011  imm4L
(!= 1111)

A1 variant

LDRHT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 

Decode for this encoding 

t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');
register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32); 
if t == 15 || n == 15 || n == t then UNPREDICTABLE; 

CONSTRAINED UNPREDICTABLE behavior 

If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction uses post-indexed addressing with the base register as PC. This is handled as described in
  Using R15 on page K1-5457.
• The instruction is treated as if bit[24] == '1' and bit[21] == '0'. The instruction uses immediate offset
  addressing with the base register as PC, without writeback.

If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.
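
The A1 decode for LDRHT can be illustrated with a small Python field extractor. It is a sketch, not a full decoder: it assumes the A1 field layout with cond in bits[31:28], U at bit[23], Rn in bits[19:16], Rt in bits[15:12], imm4H in bits[11:8], and imm4L in bits[3:0], with the remaining bits fixed as 0000 in bits[27:24], 111 in bits[22:20], and 1011 in bits[7:4]. The function name decode_ldrht_a1 is hypothetical, and the UNPREDICTABLE cases are simply reported as errors.

```python
def decode_ldrht_a1(instr: int) -> dict:
    """Extract the LDRHT A1 decode variables from a 32-bit A32 instruction."""
    cond = (instr >> 28) & 0xF
    fixed_ok = (
        ((instr >> 24) & 0xF) == 0b0000      # bits[27:24]
        and ((instr >> 20) & 0x7) == 0b111   # bits[22:20]
        and ((instr >> 4) & 0xF) == 0b1011   # bits[7:4]
    )
    if cond == 0xF or not fixed_ok:
        raise ValueError("not an LDRHT A1 encoding")
    t = (instr >> 12) & 0xF
    n = (instr >> 16) & 0xF
    if t == 15 or n == 15 or n == t:
        raise ValueError("UNPREDICTABLE")
    imm4h = (instr >> 8) & 0xF
    imm4l = instr & 0xF
    return {
        "t": t,
        "n": n,
        "postindex": True,
        "add": ((instr >> 23) & 1) == 1,
        "register_form": False,
        "imm32": (imm4h << 4) | imm4l,       # ZeroExtend(imm4H:imm4L, 32)
    }
```

For example, the word 0xE0F211B2 corresponds to LDRHT R1, [R2], #18 under this layout.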

A2

31-28      27-24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
cond       0000   U   0   1   1   Rn     Rt     (0)(0)(0)(0)  1011  Rm
(!= 1111)

A2 variant


LDRHT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');


register_form = TRUE; 
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


15-4          3-0        | 15-12  11-8  7-0
111110000011  Rn         | Rt     1110  imm8
              (!= 1111)  |


T1 variant 


LDRHT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 

if Rn == '1111' then SEE LDRH (literal); 

t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32); 

if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
+ if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<Rm> Is the general-purpose index register, encoded in the "Rm" field.
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then 
if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode 
EncodingSpecificOperations(); 
offset = if register_form then R[m] else imm32; 
offset_addr = if add then (R[n] + offset) else (R[n] - offset); 
address = if postindex then R[n] else offset_addr; 
data = MemU_unpriv[address, 2]; 
if postindex then R[n] = offset_addr; 
R[t] = ZeroExtend(data, 32); 


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDRH (immediate).
F5.1.86 LDRSB (immediate)


Load Register Signed Byte (immediate) calculates an address from a base register value and an immediate offset, 
loads a byte from memory, sign-extends it to form a 32-bit word, and writes it to a register. It can use offset, 
post-indexed, or pre-indexed addressing. For information about memory accesses see Memory accesses on 

page F2-2412. 


A1 


31-28      27-25  24  23  22  21  20  19-16      15-12  11-8   7-4   3-0
cond       000    P   U   1   W   1   Rn         Rt     imm4H  1101  imm4L
(!= 1111)                             (!= 1111)


Offset variant 
Applies when P == 1 && W == 0. 


LDRSB{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDRSB (literal);
if P == '0' && W == '1' then SEE LDRSBT;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || (wback && n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


15-4          3-0        | 15-12      11-0
111110011001  Rn         | Rt         imm12
              (!= 1111)  | (!= 1111)





T1 variant 


LDRSB{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 

Decode for this encoding

if Rt == '1111' then SEE PLI;
if Rn == '1111' then SEE LDRSB (literal);
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
// ARMv8-A removes UNPREDICTABLE for R13


T2 


15-4          3-0        | 15-12  11  10  9   8   7-0
111110010001  Rn         | Rt     1   P   U   W   imm8
              (!= 1111)  |





Offset variant 
Applies when P == 1 && U == 0 && W == 0. 


LDRSB{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if Rt == '1111' && P == '1' && U == '0' && W == '0' then SEE PLI;
if Rn == '1111' then SEE LDRSB (literal);
if P == '1' && U == '1' && W == '0' then SEE LDRSBT;
if P == '0' && W == '0' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if (t == 15 && W == '1') || (wback && n == t) then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.
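
The order of the checks in the T2 decode matters, because the same bit pattern is shared with PLI, LDRSB (literal), and LDRSBT. The following Python sketch walks the checks in the order given by the decode pseudocode and reports which case applies. The function name classify_ldrsb_t2 and its string results are hypothetical labels chosen for illustration.

```python
def classify_ldrsb_t2(rn: int, rt: int, p: int, u: int, w: int) -> str:
    """Classify an LDRSB (immediate) T2 bit pattern from its decoded fields."""
    if rt == 0b1111 and p == 1 and u == 0 and w == 0:
        return "PLI"
    if rn == 0b1111:
        return "LDRSB (literal)"
    if p == 1 and u == 1 and w == 0:
        return "LDRSBT"
    if p == 0 and w == 0:
        return "UNDEFINED"
    wback = (w == 1)
    if (rt == 15 and w == 1) or (wback and rn == rt):
        return "UNPREDICTABLE"
    if p == 1 and w == 0:
        return "offset"          # only U == 0 reaches here, so the offset is negative
    return "pre-indexed" if p == 1 else "post-indexed"
```

The offset variant of T2 is reached only with U == 0, which matches its #-<imm> syntax: a positive offset without writeback has already been claimed by LDRSBT.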


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 


Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field. For PC use see LDRSB (literal).
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095,
defaulting to 0 and encoded in the "imm12" field.

For encoding T2: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm8" field.


Operation for all encodings


if ConditionPassed() then 


EncodingSpecificOperations(); 

offset_addr = if add then (R[n] + imm32) else (R[n] - imm32); 
address = if index then offset_addr else R[n]; 

R[t] = SignExtend(MemU[address,1], 32); 

if wback then R[n] = offset_addr; 
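
The difference from the unsigned byte load is the SignExtend step. A minimal Python sketch of the operation follows, assuming regs is a 16-entry list standing in for R[0..15] and memory is a bytes object indexed from address 0. The names sign_extend and ldrsb_immediate are hypothetical, and MemU fault checks are not modeled.

```python
def sign_extend(value: int, from_bits: int, to_bits: int = 32) -> int:
    """SignExtend: replicate the top bit of a from_bits value up to to_bits."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    extended = (value ^ sign_bit) - sign_bit    # negative if the top bit was set
    return extended & ((1 << to_bits) - 1)      # back to an unsigned bit pattern


def ldrsb_immediate(regs, memory, t, n, imm32,
                    add=True, index=True, wback=False):
    """Model LDRSB (immediate): regs is mutated in place."""
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base
    regs[t] = sign_extend(memory[address], 8)
    if wback:
        regs[n] = offset_addr
```

A byte of 0xFE therefore loads as 0xFFFFFFFE, while 0x7F loads unchanged as 0x0000007F.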


F5.1.87 LDRSB (literal) 


Load Register Signed Byte (literal) calculates an address from the PC value and an immediate offset, loads a byte 
from memory, sign-extends it to form a 32-bit word, and writes it to a register. For information about memory 
accesses see Memory accesses on page F2-2412. 


A1 


31-28      27-25  24  23  22  21  20  19-16  15-12  11-8   7-4   3-0
cond       000    P   U   1   W   1   1111   Rt     imm4H  1101  imm4L
(!= 1111)


A1 variant
Applies when !(P == 0 && W == 1).


LDRSB{<c>}{<q>} <Rt>, <label> // Normal form 
LDRSB{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 
if P == '0' && W == '1' then SEE LDRSBT;
t = UInt(Rt); imm32 = ZeroExtend(imm4H:imm4L, 32);
add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || wback then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: wback = FALSE;
• The instruction treats bit[24] as the P bit, and bit[21] as the writeback (W) bit, and uses the same addressing
  mode as described in LDRSB (immediate). The instruction uses post-indexed addressing when P == '0' and
  uses pre-indexed addressing otherwise. The instruction is handled as described in Using R15 on
  page K1-5457.


T1 


15-8      7   6-4  3-0   | 15-12      11-0
11111001  U   001  1111  | Rt         imm12
                         | (!= 1111)


T1 variant 


LDRSB{<c>}{<q>} <Rt>, <label> // Preferred syntax 
LDRSB{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 


if Rt == '1111' then SEE PLI; 
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == '1');
// ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field.
<label> For encoding A1: the label of the literal data item that is to be loaded into <Rt>. The assembler
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label.
Any value in the range -255 to 255 is permitted. If the offset is zero or positive, imm32 is equal to the
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset
and add == FALSE, encoded as U == 0.

For encoding T1: the label of the literal data item that is to be loaded into <Rt>. The assembler
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label.
Permitted values of the offset are -4095 to 4095. If the offset is zero or positive, imm32 is equal to the
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset
and add == FALSE, encoded as U == 0.
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the
"imm12" field.


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 


Operation for all encodings 


if ConditionPassed() then 
EncodingSpecificOperations(); 
base = Align(PC,4); 
address = if add then (base + imm32) else (base - imm32); 
R[t] = SignExtend(MemU[address,1], 32); 
F5.1.88 LDRSB (register)


Load Register Signed Byte (register) calculates an address from a base register value and an offset register value, 
loads a byte from memory, sign-extends it to form a 32-bit word, and writes it to a register. The offset register value 
can be shifted left by 0, 1, 2, or 3 bits. For information about memory accesses see Memory accesses on 

page F2-2412. 


A1 


31-28      27-25  24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
cond       000    P   U   0   W   1   Rn     Rt     (0)(0)(0)(0)  1101  Rm
(!= 1111)


Offset variant 
Applies when P == 1 && W == 0. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRSB{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE LDRSBT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
(shift_t, shift_n) = (SRType_LSL, 0);
if t == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 

15-9     8-6  5-3  2-0
0101011  Rm   Rn   Rt


T1 variant 


LDRSB{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 

Decode for this encoding

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


15-4          3-0        | 15-12      11-6    5-4   3-0
111110010001  Rn         | Rt         000000  imm2  Rm
              (!= 1111)  | (!= 1111)





T2 variant 


LDRSB{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
LDRSB{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rt == '1111' then SEE PLI; 

if Rn == '1111' then SEE LDRSB (literal); 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);

index = TRUE; add = TRUE; wback = FALSE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 


Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 


in the offset variant. 


For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted 
and encoded in the "U" field. It can have the following values: 


- when U = 0
+ when U = 1 
+ Specifies the index register is added to the base register. 
<Rm> Is the general-purpose index register, encoded in the "Rm" field. 
<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.


Operation for all encodings 


if ConditionPassed() then 
EncodingSpecificOperations(); 
offset = Shift(R[m], shift_t, shift_n, PSTATE.C); 
offset_addr = if add then (R[n] + offset) else (R[n] - offset);
address = if index then offset_addr else R[n]; 

R[t] = SignExtend(MemU[address,1], 32); 

if wback then R[n] = offset_addr; 
F5.1.89 LDRSBT


Load Register Signed Byte Unprivileged loads a byte from memory, sign-extends it to form a 32-bit word, and 
writes it to a register. For information about memory accesses see Memory accesses on page F2-2412. 


The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is 
actually running in User mode. 


LDRSBT is UNPREDICTABLE in Hyp mode. 


The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a 
base register value and an immediate offset, and leaves the base register unchanged. 


The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the 
memory access, and calculates a new address from a base register value and an offset and writes it back to the base 
register. The offset can be an immediate value or a register value. 





A1 

31-28      27-24  23  22  21  20  19-16  15-12  11-8   7-4   3-0
cond       0000   U   1   1   1   Rn     Rt     imm4H  1101  imm4L
(!= 1111)

A1 variant


LDRSBT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');


register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32); 
if t == 15 || n == 15 || n == t then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction uses post-indexed addressing with the base register as PC. This is handled as described in
  Using R15 on page K1-5457.
• The instruction is treated as if bit[24] == '1' and bit[21] == '0'. The instruction uses immediate offset
  addressing with the base register as PC, without writeback.

If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

A2

31-28      27-24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
cond       0000   U   0   1   1   Rn     Rt     (0)(0)(0)(0)  1101  Rm
(!= 1111)

A2 variant


LDRSBT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');


register_form = TRUE; 
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


15-4          3-0        | 15-12  11-8  7-0
111110010001  Rn         | Rt     1110  imm8
              (!= 1111)  |


T1 variant 


LDRSBT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 

if Rn == '1111' then SEE LDRSB (literal); 

t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32); 

if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
+ if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<Rm> Is the general-purpose index register, encoded in the "Rm" field.
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then 
    if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then R[m] else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    R[t] = SignExtend(MemU_unpriv[address,1], 32);
    if postindex then R[n] = offset_addr;


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDRSB (immediate).
F5.1.90 LDRSH (immediate)


Load Register Signed Halfword (immediate) calculates an address from a base register value and an immediate 
offset, loads a halfword from memory, sign-extends it to form a 32-bit word, and writes it to a register. It can use 
offset, post-indexed, or pre-indexed addressing. For information about memory accesses see Memory accesses on 
page F2-2412. 


A1 


|31 28 |27 26 25|24|23|22|21|20|19 16 |15 12|11 8 |7 6 5 4|3 0  |
|!=1111|0 0 0   |P |U |1 |W |1 |!=1111|Rt   |imm4H|1 1 1 1|imm4L|
 cond                           Rn


Offset variant 
Applies when P == 1 && W == 0. 


LDRSH{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 
if Rn == '1111' then SEE LDRSH (literal);
if P == '0' && W == '1' then SEE LDRSHT;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || (wback && n == t) then UNPREDICTABLE;
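As an illustration of the A1 decode, the following Python sketch extracts the fields from a 32-bit instruction word. It is a minimal sketch and not part of the architecture pseudocode: the function name decode_ldrsh_imm_a1 and the dictionary it returns are assumptions made for this example, the bit positions are those of the A1 encoding diagram, and neither the cond field nor the CONSTRAINED UNPREDICTABLE choices are modelled.

```python
def decode_ldrsh_imm_a1(instr):
    """Decode the A1 encoding of LDRSH (immediate) from a 32-bit word.

    Hypothetical helper: returns a string for the SEE cases, otherwise a
    dict holding the values computed by the decode pseudocode.
    """
    P = (instr >> 24) & 1
    U = (instr >> 23) & 1
    W = (instr >> 21) & 1
    Rn = (instr >> 16) & 0xF
    Rt = (instr >> 12) & 0xF
    imm4H = (instr >> 8) & 0xF
    imm4L = instr & 0xF

    if Rn == 0b1111:
        return "SEE LDRSH (literal)"
    if P == 0 and W == 1:
        return "SEE LDRSHT"

    t, n = Rt, Rn
    imm32 = (imm4H << 4) | imm4L  # ZeroExtend(imm4H:imm4L, 32)
    index = P == 1
    add = U == 1
    wback = P == 0 or W == 1
    return {
        "t": t, "n": n, "imm32": imm32,
        "index": index, "add": add, "wback": wback,
        # Reported as a flag only; the permitted behaviors are not modelled.
        "unpredictable": t == 15 or (wback and n == t),
    }
```

Under this sketch, the word 0xE1D100F4 decodes as LDRSH R0, [R1, #4]: offset addressing, offset added, no writeback.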


CONSTRAINED UNPREDICTABLE behavior 

If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0   |15 12 |11 0 |
|1 1 1 1    |1 0 0 1  |1 0 1 1|!=1111|!=1111|imm12|
                               Rn     Rt





T1 variant 


LDRSH{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 

Decode for this encoding

if Rn == '1111' then SEE LDRSH (literal);
if Rt == '1111' then SEE "Related instructions";
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
// ARMv8-A removes UNPREDICTABLE for R13


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0   |15 12|11|10|9|8|7 0 |
|1 1 1 1    |1 0 0 1  |0 0 1 1|!=1111|Rt   |1 |P |U|W|imm8|
                               Rn





Offset variant 
Applies when Rt != 1111 && P == 1 && U == 0 && W == 0. 


LDRSH{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if Rn == '1111' then SEE LDRSH (literal);
if Rt == '1111' && P == '1' && U == '0' && W == '0' then SEE "Related instructions";
if P == '1' && U == '1' && W == '0' then SEE LDRSHT;
if P == '0' && W == '0' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if (t == 15 && W == '1') || (wback && n == t) then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related instructions: Load/Store single on page F3-2485. 

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field. For PC use see LDRSH (literal).

+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

+ Specifies the offset is added to the base register.

<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095,
defaulting to 0 and encoded in the "imm12" field.

For encoding T2: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm8" field.


Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    data = MemU[address,2];
    if wback then R[n] = offset_addr;
    R[t] = SignExtend(data, 32);
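
The three addressing forms differ only in the values of index and wback, which the following Python sketch makes concrete. It is an illustration under stated assumptions, not the architecture's definition: registers are a plain list, memory is a dict of byte values keyed by address, data is little-endian, and alignment checks, faults, and condition codes are ignored. The names sign_extend and ldrsh_immediate are invented for this example.

```python
def sign_extend(value, bits):
    """Treat the low `bits` bits of value as two's complement, widened to 32 bits."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & 0xFFFFFFFF


def ldrsh_immediate(regs, mem, t, n, imm32, index, add, wback):
    """Mirror the operation pseudocode on a register list and a byte-addressed dict."""
    offset_addr = (regs[n] + imm32 if add else regs[n] - imm32) & 0xFFFFFFFF
    address = offset_addr if index else regs[n]
    # Assumption: little-endian data, so the byte at the lower address is the low byte.
    data = mem[address] | (mem[(address + 1) & 0xFFFFFFFF] << 8)
    if wback:
        regs[n] = offset_addr
    regs[t] = sign_extend(data, 16)
```

With index TRUE and wback FALSE the base register is left unchanged (offset form); with index FALSE and wback TRUE the load uses the old base and then updates it (post-indexed); with both TRUE the updated address is used for the load and written back (pre-indexed).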
F5.1.91 LDRSH (literal)


Load Register Signed Halfword (literal) calculates an address from the PC value and an immediate offset, loads a 
halfword from memory, sign-extends it to form a 32-bit word, and writes it to a register. For information about 
memory accesses see Memory accesses on page F2-2412. 


A1 


|31 28 |27 26 25|24|23|22|21|20|19 18 17 16|15 12|11 8 |7 6 5 4|3 0  |
|!=1111|0 0 0   |P |U |1 |W |1 |1 1 1 1    |Rt   |imm4H|1 1 1 1|imm4L|
 cond


A1 variant
Applies when !(P == 0 && W == 1).


LDRSH{<c>}{<q>} <Rt>, <label> // Normal form 
LDRSH{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 
if P == '0' && W == '1' then SEE LDRSHT;
t = UInt(Rt); imm32 = ZeroExtend(imm4H:imm4L, 32);
add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 || wback then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: wback = FALSE;.
• The instruction treats bit[24] as the P bit, and bit[21] as the writeback (W) bit, and uses the same addressing
  mode as described in LDRSH (immediate). The instruction uses post-indexed addressing when P == '0' and
  uses pre-indexed addressing otherwise. The instruction is handled as described in Using R15 on
  page K1-5457.


T1 


|15 14 13 12|11 10 9 8|7|6 5 4|3 2 1 0|15 12 |11 0 |
|1 1 1 1    |1 0 0 1  |U|0 1 1|1 1 1 1|!=1111|imm12|
                                       Rt


T1 variant 


LDRSH{<c>}{<q>} <Rt>, <label> // Preferred syntax 
LDRSH{<c>}{<q>} <Rt>, [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 


if Rt == '1111' then SEE "Related instructions"; 
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == '1');
// ARMv8-A removes UNPREDICTABLE for R13 

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related instructions: Load literal on page F3-2489. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<label> For encoding A1: the label of the literal data item that is to be loaded into <Rt>. The assembler
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label.
Any value in the range -255 to 255 is permitted. If the offset is zero or positive, imm32 is equal to the
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset
and add == FALSE, encoded as U == 0.

For encoding T1: the label of the literal data item that is to be loaded into <Rt>. The assembler
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label.
Permitted values of the offset are -4095 to 4095. If the offset is zero or positive, imm32 is equal to the
offset and add == TRUE, encoded as U == 1. If the offset is negative, imm32 is equal to minus the offset
and add == FALSE, encoded as U == 0.
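
The sign-and-magnitude rule for <label> can be sketched in Python as follows. This is a hedged illustration of the rule as described above, not assembler source: the function name encode_label_offset and its imm_bits parameter are assumptions of this example (8 for encoding A1, 12 for encoding T1).

```python
def encode_label_offset(offset, imm_bits):
    """Split a signed label offset into (U, imm32).

    imm_bits is 8 for encoding A1 (imm4H:imm4L) and 12 for encoding T1 (imm12).
    """
    limit = (1 << imm_bits) - 1
    if not -limit <= offset <= limit:
        raise ValueError("offset out of range for this encoding")
    if offset >= 0:
        return 1, offset   # add == TRUE, encoded as U == 1
    return 0, -offset      # add == FALSE, encoded as U == 0
```

A zero offset therefore always encodes with U == 1 through the label form; a subtraction of 0 is only reachable through the alternative [PC, #-0] syntax.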


+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the
"imm12" field.


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations();
    base = Align(PC,4);
    address = if add then (base + imm32) else (base - imm32);
    data = MemU[address,2];
    R[t] = SignExtend(data, 32);
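
The address calculation can be sketched in Python as below. This is an illustration only: the function names are invented for the example, and it assumes the usual AArch32 convention, stated elsewhere in the architecture rather than in this section, that the PC reads as the address of the current instruction plus 8 in A32 state and plus 4 in T32 state.

```python
def align(x, n):
    """Round x down to a multiple of n, as Align(x, n) does."""
    return n * (x // n)


def ldrsh_literal_address(instr_addr, imm32, add, thumb):
    """Return the address accessed by LDRSH (literal) at instr_addr."""
    # Assumption: PC reads as instruction address + 8 (A32) or + 4 (T32).
    pc = instr_addr + (4 if thumb else 8)
    base = align(pc, 4)
    return (base + imm32 if add else base - imm32) & 0xFFFFFFFF
```

The Align step matters only in T32 state, where an instruction can sit at an address that is not a multiple of 4.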
F5.1.92 LDRSH (register)


Load Register Signed Halfword (register) calculates an address from a base register value and an offset register 
value, loads a halfword from memory, sign-extends it to form a 32-bit word, and writes it to a register. The offset 
register value can be shifted left by 0, 1, 2, or 3 bits. For information about memory accesses see Memory accesses 
on page F2-2412. 


A1 


|31 28 |27 26 25|24|23|22|21|20|19 16|15 12|11 10 9 8      |7 6 5 4|3 0|
|!=1111|0 0 0   |P |U |0 |W |1 |Rn   |Rt   |(0) (0) (0) (0)|1 1 1 1|Rm |
 cond


Offset variant 
Applies when P == 1 && W == 0. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


LDRSH{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE LDRSHT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
(shift_t, shift_n) = (SRType_LSL, 0);
if t == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


|15 14 13 12|11 10 9|8 6|5 3|2 0|
|0 1 0 1    |1 1 1  |Rm |Rn |Rt |


T1 variant


LDRSH{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0   |15 12 |11 10 9 8 7 6|5 4 |3 0|
|1 1 1 1    |1 0 0 1  |0 0 1 1|!=1111|!=1111|0 0 0 0 0 0  |imm2|Rm |
                               Rn     Rt





T2 variant 


LDRSH{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
LDRSH{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rn == '1111' then SEE LDRSH (literal); 

if Rt == '1111' then SEE "Related instructions"; 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);

index = TRUE; add = TRUE; wback = FALSE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related instructions: Load/Store (register offset) on page F3-2485. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used
in the offset variant.

For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field.

+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

+ Specifies the index register is added to the base register.

<Rm> Is the general-purpose index register, encoded in the "Rm" field.

<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.

Operation for all encodings


if ConditionPassed() then 
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if index then offset_addr else R[n];
    data = MemU[address,2];
    if wback then R[n] = offset_addr;
    R[t] = SignExtend(data, 32);
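
For the register form, the only new step is the shift applied to the index register. The Python sketch below shows the offset-form address calculation with an LSL shift, which is the only shift type these encodings decode. The function name is an assumption of this example, and writeback and the memory access itself are left out.

```python
def ldrsh_register_address(rn, rm, shift_n, add=True):
    """Address used when index is TRUE: R[n] plus or minus (R[m] LSL shift_n)."""
    offset = (rm << shift_n) & 0xFFFFFFFF  # Shift(R[m], SRType_LSL, shift_n, ...)
    return (rn + offset if add else rn - offset) & 0xFFFFFFFF
```

A shift of 1 is the natural choice for indexing an array of halfwords: with a base of 0x2000 and an element index of 5, the sketch gives 0x200A.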
F5.1.93 LDRSHT

Load Register Signed Halfword Unprivileged loads a halfword from memory, sign-extends it to form a 32-bit word,
and writes it to a register. For information about memory accesses see Memory accesses on page F2-2412.

The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is
actually running in User mode.

LDRSHT is UNPREDICTABLE in Hyp mode.

The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a
base register value and an immediate offset, and leaves the base register unchanged.

The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the
memory access, and calculates a new address from a base register value and an offset and writes it back to the base
register. The offset can be an immediate value or a register value.

A1 

|31 28 |27 26 25|24|23|22|21|20|19 16|15 12|11 8 |7 6 5 4|3 0  |
|!=1111|0 0 0   |0 |U |1 |1 |1 |Rn   |Rt   |imm4H|1 1 1 1|imm4L|
 cond

A1 variant

LDRSHT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 

Decode for this encoding 

t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');
register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32); 
if t == 15 || n == 15 || n == t then UNPREDICTABLE; 

CONSTRAINED UNPREDICTABLE behavior 

If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction uses post-indexed addressing with the base register as PC. This is handled as described in
  Using R15 on page K1-5457.
• The instruction is treated as if bit[24] == '1' and bit[21] == '0'. The instruction uses immediate offset
  addressing with the base register as PC, without writeback.

If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

A2

|31 28 |27 26 25|24|23|22|21|20|19 16|15 12|11 10 9 8      |7 6 5 4|3 0|
|!=1111|0 0 0   |0 |U |0 |1 |1 |Rn   |Rt   |(0) (0) (0) (0)|1 1 1 1|Rm |
 cond


A2 variant 


LDRSHT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>

Decode for this encoding

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');
register_form = TRUE;
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0   |15 12|11 10 9 8|7 0 |
|1 1 1 1    |1 0 0 1  |0 0 1 1|!=1111|Rt   |1 1 1 0  |imm8|
                               Rn


T1 variant


LDRSHT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 

if Rn == '1111' then SEE LDRSH (literal); 

t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32); 

if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
+ if omitted and encoded in the "U" field. It can have the following values:

- when U = 0
+ when U = 1

For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:

- when U = 0
+ when U = 1

<Rm> Is the general-purpose index register, encoded in the "Rm" field.
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.

For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then 
    if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then R[m] else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    data = MemU_unpriv[address, 2];
    if postindex then R[n] = offset_addr;
    R[t] = SignExtend(data, 32);


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDRSH (immediate).
F5.1.94 LDRT


Load Register Unprivileged loads a word from memory, and writes it to a register. For information about memory 
accesses see Memory accesses on page F2-2412. 


The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is 
actually running in User mode. 


LDRT is UNPREDICTABLE in Hyp mode. 


The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a 
base register value and an immediate offset, and leaves the base register unchanged. 


The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the 
memory access, and calculates a new address from a base register value and an offset and writes it back to the base 
register. The offset can be an immediate value or an optionally-shifted register value. 


A1 
|31 28 |27 26 25|24|23|22|21|20|19 16|15 12|11 0 |
|!=1111|0 1 0   |0 |U |0 |1 |1 |Rn   |Rt   |imm12|
 cond

A1 variant


LDRT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');


register_form = FALSE; imm32 = ZeroExtend(imm12, 32); 
if t == 15 || n == 15 || n == t then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction uses post-indexed addressing with the base register as PC. This is handled as described in
  Using R15 on page K1-5457.
• The instruction is treated as if bit[24] == '1' and bit[21] == '0'. The instruction uses immediate offset
  addressing with the base register as PC, without writeback.

If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.

A2

|31 28 |27 26 25|24|23|22|21|20|19 16|15 12|11 7|6 5 |4|3 0|
|!=1111|0 1 1   |0 |U |0 |1 |1 |Rn   |Rt   |imm5|type|0|Rm |
 cond

A2 variant

LDRT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>}

Decode for this encoding

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');
register_form = TRUE; (shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If n == t && n != 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs all of the loads using the specified addressing mode and the content of the register
  that is written back is UNKNOWN. In addition, if an exception occurs during such an instruction, the base
  address might be corrupted so that the instruction cannot be repeated.


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0   |15 12|11 10 9 8|7 0 |
|1 1 1 1    |1 0 0 0  |0 1 0 1|!=1111|Rt   |1 1 1 0  |imm8|
                               Rn


T1 variant 


LDRT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 

if Rn == '1111' then SEE LDR (literal); 

t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32); 

if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
can be used, but this is deprecated.

For encoding A2 and T1: is the general-purpose register to be transferred, encoded in the "Rt" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
+ if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<Rm> Is the general-purpose index register, encoded in the "Rm" field.

<shift> The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
applied to a register on page F2-2410.

+ Specifies the offset is added to the base register.

<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to
0 if omitted, and encoded in the "imm12" field.

For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then 
    if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then Shift(R[m], shift_t, shift_n, PSTATE.C) else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    data = MemU_unpriv[address,4];
    if postindex then R[n] = offset_addr;
    R[t] = data;
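
The effect of MemU_unpriv can be illustrated with a toy model in Python. Everything about the model is an assumption made for this sketch: memory is a dict of bytes, permissions are a set of 4KB page numbers readable from User mode, data is little-endian, and PermissionFault stands in for the abort a real PE would take. The names read_word and ldrt are invented for the example.

```python
class PermissionFault(Exception):
    """Stand-in for the fault taken on a denied access (assumption of this sketch)."""


def read_word(mem, user_pages, address, privileged):
    """Read a little-endian word, checking a toy per-page permission set."""
    if not privileged and (address >> 12) not in user_pages:
        raise PermissionFault(hex(address))
    return int.from_bytes(bytes(mem[address + i] for i in range(4)), "little")


def ldrt(regs, mem, user_pages, t, n, offset, add, postindex):
    """Mirror the LDRT operation pseudocode, with the shift already applied to offset."""
    offset_addr = (regs[n] + offset if add else regs[n] - offset) & 0xFFFFFFFF
    address = regs[n] if postindex else offset_addr
    # MemU_unpriv: checked as an unprivileged access whatever the current privilege is.
    data = read_word(mem, user_pages, address, privileged=False)
    if postindex:
        regs[n] = offset_addr
    regs[t] = data
```

Because the permission check precedes the writeback in this sketch, a denied access leaves the base register untouched, which is what lets privileged code probe a user-supplied pointer safely.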


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as LDR (immediate).
F5.1.95 LSL (immediate)


Logical Shift Left (immediate) shifts a register value left by an immediate number of bits, shifting in zeros, and 
writes the result to the destination register. 


This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 
• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.

A1 


|31 28 |27 26 25 24 23|22 21|20|19 18 17 16    |15 12|11 7   |6 5 |4|3 0|
|!=1111|0 0 0 1 1     |0 1  |0 |(0) (0) (0) (0)|Rd   |!=00000|0 0 |0|Rm |
 cond                        S                        imm5    type


MOV, shift or rotate by value variant 
LSL{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, LSL #<imm> 


and is always the preferred disassembly. 
T2 


|15 14 13|12 11|10 6   |5 3|2 0|
|0 0 0   |0 0  |!=00000|Rm |Rd |
          op    imm5


T2 variant 

LSL<c>{<q>} {<Rd>,} <Rm>, #<imm> // Inside IT block 
is equivalent to 

MOV<c>{<q>} <Rd>, <Rm>, LSL #<imm> 


and is the preferred disassembly when InITBlock(). 
T3 


|15 14 13 12 11|10 9|8 7 6 5|4|3 2 1 0|15 |14 12|11 8|7 6 |5 4|3 0|
|1 1 1 0 1     |0 1 |0 0 1 0|0|1 1 1 1|(0)|imm3 |Rd  |imm2|0 0|Rm |
                             S                             type


MOV, shift or rotate by value variant 

LSL<c>.W {<Rd>,} <Rm>, #<imm> // Inside IT block, and <Rd>, <Rm>, <imm> can be represented in T2 
is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, LSL #<imm> 

and is always the preferred disassembly. 


LSL{<c>}{<q>} {<Rd>,} <Rm>, #<imm>

is equivalent to

MOV{<c>}{<q>} <Rd>, <Rm>, LSL #<imm>

and is always the preferred disassembly.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction is a branch
to the address calculated by the operation. This is an interworking branch, see Pseudocode
description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.

For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field.

<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.

For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field.

<imm> For encoding A1 and T2: is the shift amount, in the range 1 to 31, encoded in the "imm5" field.

For encoding T3: is the shift amount, in the range 1 to 31, encoded in the "imm3:imm2" field.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
F5.1.96 LSL (register)

Logical Shift Left (register) shifts a register value left by a variable number of bits, shifting in zeros, and writes the
result to the destination register. The variable number of bits is read from the bottom byte of a register.

This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that:

• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
  register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
  instruction.

A1 
|31 28 |27 26 25 24 23|22 21|20|19 18 17 16    |15 12|11 8|7|6 5 |4|3 0|
|!=1111|0 0 0 1 1     |0 1  |0 |(0) (0) (0) (0)|Rd   |Rs  |0|0 0 |1|Rm |
 cond                        S                               type
Not flag setting variant 
LSL{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, LSL <Rs> 
and is always the preferred disassembly. 
T1 
|15 14 13 12 11 10|9 8 7 6|5 3|2 0|
|0 1 0 0 0 0      |0 0 1 0|Rs |Rdm|
                   op
Logical shift left variant 
LSL<c>{<q>} {<Rdm>,} <Rdm>, <Rs> // Inside IT block 
is equivalent to 
MOV<c>{<q>} <Rdm>, <Rdm>, LSL <Rs> 
and is the preferred disassembly when InITBlock(). 
T2 
|15 14 13 12 11 10 9 8 7|6 5 |4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
|1 1 1 1 1 0 1 0 0      |0 0 |0|Rm |1 1 1 1    |Rd  |0 0 0 0|Rs |
                         type S
Not flag setting variant 
LSL<c>.W {<Rd>,} <Rm>, <Rs> // Inside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, LSL <Rs> 
and is always the preferred disassembly. 
LSL{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, LSL <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 

<Rs> Is the second general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 
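
As a hedged illustration of the shift itself, the Python sketch below models the result of LSL (register) on 32-bit values. The function name is an assumption of this example, and the behavior for large shift amounts follows from treating the operation as a logical shift of a 32-bit value, as the MOV, MOVS (register-shifted register) description defines it. Condition flags are not modelled here.

```python
def lsl_register(rm, rs):
    """Result of LSL <Rd>, <Rm>, <Rs> on 32-bit register values."""
    shift_n = rs & 0xFF  # only the bottom byte of Rs supplies the shift amount
    return (rm << shift_n) & 0xFFFFFFFF
```

Because only the bottom byte of <Rs> is read, a value of 0x101 in <Rs> shifts by 1, and any shift amount of 32 or more clears the result.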
F5.1.97 LSLS (immediate)


Logical Shift Left, setting flags (immediate) shifts a register value left by an immediate number of bits, shifting in 
zeros, and writes the result to the destination register. 


If the destination register is not the PC, this instruction updates the condition flags based on the result. 
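
The flag update can be sketched in Python as follows. This is an illustration, not the operational pseudocode, which is given by MOV, MOVS (register). It assumes the flag behavior of that instruction for a logical shift left: N and Z are taken from the result, C is the last bit shifted out, and V is unchanged. The function name and the dictionary of flags are inventions of this example.

```python
def lsls_immediate(rm, imm):
    """Return (result, flags) for LSLS <Rd>, <Rm>, #<imm> with <Rd> not the PC."""
    if not 1 <= imm <= 31:
        raise ValueError("shift amount must be in the range 1 to 31")
    extended = rm << imm
    result = extended & 0xFFFFFFFF
    flags = {
        "N": result >> 31,            # bit[31] of the result
        "Z": int(result == 0),
        "C": (extended >> 32) & 1,    # last bit shifted out of bit[31]
    }
    return result, flags
```

Shifting 0x80000000 left by 1, for example, produces a zero result with both Z and C set.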


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
• The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from AArch32
state on page G1-3835.
• The instruction is UNDEFINED in Hyp mode.
• The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

This instruction is an alias of the MOV, MOVS (register) instruction. This means that:

• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.
A1 
Bit fields: [31:28] cond != 1111 | [27:21] 0001101 | [20] S = 1 | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:7] imm5 != 00000 | [6:5] type = 00 | [4] 0 | [3:0] Rm


MOVS, shift or rotate by value variant 
LSLS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSL #<imm> 


and is always the preferred disassembly. 


T2 

Bit fields: [15:13] 000 | [12:11] op = 00 | [10:6] imm5 != 00000 | [5:3] Rm | [2:0] Rd

T2 variant 


LSLS{<q>} {<Rd>,} <Rm>, #<imm> // Outside IT block 
is equivalent to 
MOVS{<q>} <Rd>, <Rm>, LSL #<imm> 


and is the preferred disassembly when !InITBlock(). 
T3
Bit fields, first halfword: [15:5] 11101010010 | [4] S = 1 | [3:0] 1111
Bit fields, second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type = 00 | [3:0] Rm


MOVS, shift or rotate by value variant 

LSLS.W {<Rd>,} <Rm>, #<imm> // Outside IT block, and <Rd>, <Rm>, <imm> can be represented in T2 
is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSL #<imm> 

and is always the preferred disassembly. 

LSLS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSL #<imm> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction performs an
exception return, that restores PSTATE from SPSR_<current_mode>.


For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field. 


<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be 
used, but this is deprecated. 


For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field. 


<imm> For encoding A1 and T2: is the shift amount, in the range 1 to 31, encoded in the "imm5" field.


For encoding T3: is the shift amount, in the range 1 to 31, encoded in the "imm3:imm2" field. 


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-2793 


ID092916


Non-Confidential 


F5.1.98 LSLS (register)


Logical Shift Left, setting flags (register) shifts a register value left by a variable number of bits, shifting in zeros, 
writes the result to the destination register, and updates the condition flags based on the result. The variable number 
of bits is read from the bottom byte of a register.

This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that:

• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.


A1 


Bit fields: [31:28] cond != 1111 | [27:21] 0001101 | [20] S = 1 | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:8] Rs | [7] 0 | [6:5] type = 00 | [4] 1 | [3:0] Rm


Flag setting variant 
LSLS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSL <Rs> 


and is always the preferred disassembly. 
T1 


Bit fields: [15:10] 010000 | [9:6] op = 0010 | [5:3] Rs | [2:0] Rdm


Logical shift left variant 

LSLS{<q>} {<Rdm>,} <Rdm>, <Rs> // Outside IT block 
is equivalent to 

MOVS{<q>} <Rdm>, <Rdm>, LSL <Rs> 


and is the preferred disassembly when !InITBlock(). 
T2 


Bit fields, first halfword: [15:7] 111110100 | [6:5] type = 00 | [4] S = 1 | [3:0] Rm
Bit fields, second halfword: [15:12] 1111 | [11:8] Rd | [7:4] 0000 | [3:0] Rs


Flag setting variant 
LSLS.W {<Rd>,} <Rm>, <Rs> // Outside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
is equivalent to 


MOVS{<c>}{<q>} <Rd>, <Rm>, LSL <Rs> 
and is always the preferred disassembly.
LSLS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSL <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.

<Rm> Is the first general-purpose source register, encoded in the "Rm" field.

<Rs> Is the second general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 
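
As an informal illustration of the behavior described for LSLS (register), the following Python sketch models the shift on 32-bit values. It assumes the usual shift-with-carry semantics of the MOV, MOVS (register-shifted register) pseudocode, which this description does not reproduce. The function name lsls_register and the dictionary used for the flags are hypothetical, and the V flag is not modelled because the instruction leaves it unchanged.

```python
MASK32 = 0xFFFFFFFF

def lsls_register(rm, rs, carry_in):
    """Model LSLS <Rd>, <Rm>, <Rs>; returns (result, flags)."""
    shift = rs & 0xFF                      # only the bottom byte of Rs is used
    if shift == 0:
        result, carry = rm & MASK32, carry_in   # no shift: C keeps its old value
    else:
        extended = (rm & MASK32) << shift
        result = extended & MASK32
        carry = (extended >> 32) & 1       # last bit shifted out past bit 31
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags
```

Under these assumptions, a shift amount of 32 yields a zero result with C taken from bit 0 of <Rm>, and larger amounts yield zero with C clear.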
F5.1.99 LSR (immediate)


Logical Shift Right (immediate) shifts a register value right by an immediate number of bits, shifting in zeros, and 
writes the result to the destination register. 


This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 
• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.
A1 


Bit fields: [31:28] cond != 1111 | [27:21] 0001101 | [20] S = 0 | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:7] imm5 | [6:5] type = 01 | [4] 0 | [3:0] Rm


MOV, shift or rotate by value variant 
LSR{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, LSR #<imm> 


and is always the preferred disassembly. 
T2 


Bit fields: [15:13] 000 | [12:11] op = 01 | [10:6] imm5 | [5:3] Rm | [2:0] Rd


T2 variant 

LSR<c>{<q>} {<Rd>,} <Rm>, #<imm> // Inside IT block 
is equivalent to 

MOV<c>{<q>} <Rd>, <Rm>, LSR #<imm> 


and is the preferred disassembly when InITBlock(). 
T3 


Bit fields, first halfword: [15:5] 11101010010 | [4] S = 0 | [3:0] 1111
Bit fields, second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type = 01 | [3:0] Rm


MOV, shift or rotate by value variant 

LSR<c>.W {<Rd>,} <Rm>, #<imm> // Inside IT block, and <Rd>, <Rm>, <imm> can be represented in T2 
is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, LSR #<imm> 

and is always the preferred disassembly. 


LSR{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to

MOV{<c>}{<q>} <Rd>, <Rm>, LSR #<imm> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
     deprecates using the PC as the destination register, but if the PC is used, the instruction is a branch
     to the address calculated by the operation. This is an interworking branch, see Pseudocode
     description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.
     For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field.
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
     used, but this is deprecated.
     For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field.
<imm> For encoding A1 and T2: is the shift amount, in the range 1 to 32, encoded in the "imm5" field as
     <imm> modulo 32.
     For encoding T3: is the shift amount, in the range 1 to 32, encoded in the "imm3:imm2" field as
     <imm> modulo 32.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
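
The "<imm> modulo 32" rule in the assembler symbols means that a shift amount of 32 is stored as a field value of 0. The Python sketch below illustrates that mapping in both directions. The helper names encode_lsr_imm5 and decode_lsr_imm5 are hypothetical and are not part of the architecture pseudocode.

```python
def encode_lsr_imm5(imm):
    """Map an LSR shift amount (1 to 32) to the 5-bit field value."""
    if not 1 <= imm <= 32:
        raise ValueError("LSR (immediate) shift amount must be in the range 1 to 32")
    return imm % 32                 # 32 is encoded as 0b00000

def decode_lsr_imm5(imm5):
    """Map the 5-bit field value back to the shift amount."""
    if not 0 <= imm5 <= 31:
        raise ValueError("imm5 is a 5-bit field")
    return 32 if imm5 == 0 else imm5
```

For encoding T3 the same 5-bit value would be split across the "imm3:imm2" field.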
F5.1.100 LSR (register)


Logical Shift Right (register) shifts a register value right by a variable number of bits, shifting in zeros, and writes 
the result to the destination register. The variable number of bits is read from the bottom byte of a register.

This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that:

• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.


A1 


Bit fields: [31:28] cond != 1111 | [27:21] 0001101 | [20] S = 0 | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:8] Rs | [7] 0 | [6:5] type = 01 | [4] 1 | [3:0] Rm


Not flag setting variant 
LSR{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, LSR <Rs> 


and is always the preferred disassembly. 
T1 


Bit fields: [15:10] 010000 | [9:6] op = 0011 | [5:3] Rs | [2:0] Rdm


Logical shift right variant 

LSR<c>{<q>} {<Rdm>,} <Rdm>, <Rs> // Inside IT block 
is equivalent to 

MOV<c>{<q>} <Rdm>, <Rdm>, LSR <Rs> 


and is the preferred disassembly when InITBlock(). 
T2 


Bit fields, first halfword: [15:7] 111110100 | [6:5] type = 01 | [4] S = 0 | [3:0] Rm
Bit fields, second halfword: [15:12] 1111 | [11:8] Rd | [7:4] 0000 | [3:0] Rs


Not flag setting variant 

LSR<c>.W {<Rd>,} <Rm>, <Rs> // Inside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, LSR <Rs> 


and is always the preferred disassembly. 


LSR{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, LSR <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 

<Rs> Is the second general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 


F5.1.101 LSRS (immediate) 


Logical Shift Right, setting flags (immediate) shifts a register value right by an immediate number of bits, shifting 
in zeros, and writes the result to the destination register. 


If the destination register is not the PC, this instruction updates the condition flags based on the result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
• The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from AArch32
state on page G1-3835.
• The instruction is UNDEFINED in Hyp mode.
• The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

This instruction is an alias of the MOV, MOVS (register) instruction. This means that:

• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.


A1 


Bit fields: [31:28] cond != 1111 | [27:21] 0001101 | [20] S = 1 | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:7] imm5 | [6:5] type = 01 | [4] 0 | [3:0] Rm


MOVS, shift or rotate by value variant 
LSRS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSR #<imm> 


and is always the preferred disassembly. 
T2 


Bit fields: [15:13] 000 | [12:11] op = 01 | [10:6] imm5 | [5:3] Rm | [2:0] Rd


T2 variant 

LSRS{<q>} {<Rd>,} <Rm>, #<imm> // Outside IT block 
is equivalent to 

MOVS{<q>} <Rd>, <Rm>, LSR #<imm> 


and is the preferred disassembly when !InITBlock().
T3
Bit fields, first halfword: [15:5] 11101010010 | [4] S = 1 | [3:0] 1111
Bit fields, second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type = 01 | [3:0] Rm


MOVS, shift or rotate by value variant 

LSRS.W {<Rd>,} <Rm>, #<imm> // Outside IT block, and <Rd>, <Rm>, <imm> can be represented in T2 
is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSR #<imm> 

and is always the preferred disassembly. 

LSRS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSR #<imm> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction performs an
exception return, that restores PSTATE from SPSR_<current_mode>.


For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be 
used, but this is deprecated. 
For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field. 
<imm> For encoding A1 and T2: is the shift amount, in the range 1 to 32, encoded in the "imm5" field as
<imm> modulo 32.


For encoding T3: is the shift amount, in the range 1 to 32, encoded in the "imm3:imm2" field as 
<imm> modulo 32. 


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
F5.1.102 LSRS (register)
Logical Shift Right, setting flags (register) shifts a register value right by a variable number of bits, shifting in
zeros, writes the result to the destination register, and updates the condition flags based on the result. The variable
number of bits is read from the bottom byte of a register.
This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that:
• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.
A1 
Bit fields: [31:28] cond != 1111 | [27:21] 0001101 | [20] S = 1 | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:8] Rs | [7] 0 | [6:5] type = 01 | [4] 1 | [3:0] Rm
Flag setting variant 
LSRS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 
MOVS{<c>}{<q>} <Rd>, <Rm>, LSR <Rs> 
and is always the preferred disassembly. 
T1 
Bit fields: [15:10] 010000 | [9:6] op = 0011 | [5:3] Rs | [2:0] Rdm
Logical shift right variant 
LSRS{<q>} {<Rdm>,} <Rdm>, <Rs> // Outside IT block 
is equivalent to 
MOVS{<q>} <Rdm>, <Rdm>, LSR <Rs> 
and is the preferred disassembly when !InITBlock(). 
T2 
Bit fields, first halfword: [15:7] 111110100 | [6:5] type = 01 | [4] S = 1 | [3:0] Rm
Bit fields, second halfword: [15:12] 1111 | [11:8] Rd | [7:4] 0000 | [3:0] Rs
Flag setting variant 
LSRS.W {<Rd>,} <Rm>, <Rs> // Outside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
is equivalent to 
MOVS{<c>}{<q>} <Rd>, <Rm>, LSR <Rs> 
and is always the preferred disassembly. 
LSRS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, LSR <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.

<Rm> Is the first general-purpose source register, encoded in the "Rm" field.

<Rs> Is the second general-purpose source register holding a shift amount in its bottom 8 bits, encoded in
the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 


F5.1.103 MCR 


Move to System register from general-purpose register or execute a System instruction. This instruction copies the 
value of a general-purpose register to a System register, or executes a System instruction. 


The System register and System instruction descriptions identify valid encodings for this instruction. Other 
encodings are UNDEFINED. For more information see About the AArch32 System register interface on page E1-2312 
and General behavior of System registers on page G4-4151. 


In an implementation that includes EL2, MCR accesses to System registers can be trapped to Hyp mode, meaning that 
an attempt to execute an MCR instruction in a Non-secure mode other than Hyp mode, that would be permitted in the 
absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see EL2 configurable 
controls on page G1-3894. 


Because of the range of possible traps to Hyp mode, the MCR pseudocode does not show these possible traps. 


A1 


Bit fields: [31:28] cond != 1111 | [27:24] 1110 | [23:21] opc1 | [20] 0 | [19:16] CRn | [15:12] Rt | [11:9] 111 (coproc<3:1>) | [8] coproc<0> | [7:5] opc2 | [4] 1 | [3:0] CRm


A1 variant
MCR{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <CRn>, <CRm>{, {#}<opc2>}


Decode for this encoding


t = UInt(Rt); cp = if coproc<0> == '0' then 14 else 15;
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


T1 


Bit fields, first halfword: [15:8] 11101110 | [7:5] opc1 | [4] 0 | [3:0] CRn
Bit fields, second halfword: [15:12] Rt | [11:9] 111 (coproc<3:1>) | [8] coproc<0> | [7:5] opc2 | [4] 1 | [3:0] CRm


T1 variant 

MCR{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <CRn>, <CRm>{, {#}<opc2>}

Decode for this encoding

t = UInt(Rt); cp = if coproc<0> == '0' then 14 else 15;
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<coproc> Is the System register encoding space, encoded in the "coproc<0>" field. It can have the following
     values:
     p14 when coproc<0> = 0
     p15 when coproc<0> = 1
<opc1> Is the opc1 parameter within the System register encoding space, in the range 0 to 7, encoded in the
     "opc1" field.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field.
<CRn> Is the CRn parameter within the System register encoding space, in the range c0 to c15, encoded in
     the "CRn" field.
<CRm> Is the CRm parameter within the System register encoding space, in the range c0 to c15, encoded in
     the "CRm" field.
<opc2> Is the opc2 parameter within the System register encoding space, in the range 0 to 7, encoded in the
     "opc2" field.

The possible values of { <coproc>, <opc1>, <CRn>, <CRm>, <opc2> } encode the entire System register and System
instruction encoding space. Not all of this space is allocated, and the System register and System instruction 
descriptions identify the allocated encodings. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    AArch32.SysRegWrite(cp, ThisInstr(), R[t]);
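
To make the A1 field layout concrete, the following Python sketch packs the assembler operands of MCR into a 32-bit A32 instruction word. It is an illustration only: the function name encode_mcr_a1 and its argument order are hypothetical, the default condition value of 0b1110 (always) is an assumption, and the sketch does not check whether the resulting encoding is allocated to any System register.

```python
def encode_mcr_a1(coproc, opc1, rt, crn, crm, opc2=0, cond=0b1110):
    """Pack MCR <coproc>, #<opc1>, <Rt>, <CRn>, <CRm>, #<opc2> (encoding A1)."""
    if coproc not in (14, 15):
        raise ValueError("coproc must be 14 (p14) or 15 (p15)")
    if not (0 <= opc1 <= 7 and 0 <= opc2 <= 7):
        raise ValueError("opc1 and opc2 are 3-bit fields")
    if not all(0 <= x <= 15 for x in (rt, crn, crm)):
        raise ValueError("Rt, CRn and CRm are 4-bit fields")
    if not 0 <= cond <= 14:
        raise ValueError("cond must not be 0b1111 for this encoding")
    # Rt == 15 is UNPREDICTABLE according to the decode pseudocode; not rejected here.
    coproc0 = coproc & 1                  # p14 -> 0, p15 -> 1
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (crn << 16)
            | (rt << 12) | (0b111 << 9) | (coproc0 << 8)
            | (opc2 << 5) | (1 << 4) | crm)
```

For example, encode_mcr_a1(15, 0, 0, 1, 0, 0) packs the operands of MCR p15, 0, R0, c1, c0, 0 and gives 0xEE010F10.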
F5.1.104 MCRR


Move to System register from two general-purpose registers. This instruction copies the values of two 
general-purpose registers to a System register. 


The System register descriptions identify valid encodings for this instruction. Other encodings are UNDEFINED. For 
more information see About the AArch32 System register interface on page E1-2312 and General behavior of 
System registers on page G4-4151. 


In an implementation that includes EL2, MCRR accesses to System registers can be trapped to Hyp mode, meaning 
that an attempt to execute an MCRR instruction in a Non-secure mode other than Hyp mode, that would be permitted 
in the absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see EL2 
configurable controls on page G1-3894. 


Because of the range of possible traps to Hyp mode, the MCRR pseudocode does not show these possible traps. 


A1 


Bit fields: [31:28] cond != 1111 | [27:20] 11000100 | [19:16] Rt2 | [15:12] Rt | [11:9] 111 (coproc<3:1>) | [8] coproc<0> | [7:4] opc1 | [3:0] CRm


A1 variant

MCRR{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <Rt2>, <CRm>

Decode for this encoding
t = UInt(Rt); t2 = UInt(Rt2); cp = if coproc<0> == '0' then 14 else 15;
if t == 15 || t2 == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13


T1 


Bit fields, first halfword: [15:4] 111011000100 | [3:0] Rt2
Bit fields, second halfword: [15:12] Rt | [11:9] 111 (coproc<3:1>) | [8] coproc<0> | [7:4] opc1 | [3:0] CRm


T1 variant 


MCRR{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <Rt2>, <CRm>

Decode for this encoding

t = UInt(Rt); t2 = UInt(Rt2); cp = if coproc<0> == '0' then 14 else 15;
if t == 15 || t2 == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406.
<coproc> Is the System register encoding space, encoded in the "coproc<0>" field. It can have the following
     values:
     p14 when coproc<0> = 0
     p15 when coproc<0> = 1
<opc1> Is the opc1 parameter within the System register encoding space, in the range 0 to 15, encoded in
     the "opc1" field.
<Rt> Is the first general-purpose register that is transferred, encoded in the "Rt" field.
<Rt2> Is the second general-purpose register that is transferred, encoded in the "Rt2" field.
<CRm> Is the CRm parameter within the System register encoding space, in the range c0 to c15, encoded in
     the "CRm" field.


The possible values of { <coproc>, <opc1>, <CRm> } encode the entire System register encoding space. Not all of this
space is allocated, and the System register descriptions identify the allocated encodings. 


For the permitted uses of these instructions, as described in this manual, <Rt2> transfers bits[63:32] of the selected 
System register, while <Rt> transfers bits[31:0]. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    value = R[t2]:R[t];
    AArch32.SysRegWrite64(cp, ThisInstr(), value);
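
The concatenation R[t2]:R[t] in the pseudocode places <Rt2> in bits[63:32] and <Rt> in bits[31:0] of the 64-bit value. The short Python sketch below illustrates that ordering. The helper name mcrr_value is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def mcrr_value(rt, rt2):
    """Form the 64-bit value R[t2]:R[t] that MCRR writes to the System register."""
    return ((rt2 & MASK32) << 32) | (rt & MASK32)
```

With rt = 0x89ABCDEF and rt2 = 0x01234567 the value written would be 0x0123456789ABCDEF.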

F5.1.105 MLA, MLAS 
Multiply Accumulate multiplies two register values, and adds a third register value. The least significant 32 bits of 
the result are written to the destination register. These 32 bits do not depend on whether the source register values 
are considered to be signed values or unsigned values. 
In an A32 instruction, the condition flags can optionally be updated based on the result. Use of this option adversely 
affects performance on many implementations. 
A1 
Bit fields: [31:28] cond != 1111 | [27:21] 0000001 | [20] S | [19:16] Rd | [15:12] Ra | [11:8] Rm | [7:4] 1001 | [3:0] Rn
Flag setting variant 
Applies when S == 1.
MLAS{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>
Not flag setting variant
Applies when S == 0.
MLA{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>
Decode for all variants of this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); setflags = (S == '1');
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE;
T1 
Bit fields, first halfword: [15:4] 111110110000 | [3:0] Rn
Bit fields, second halfword: [15:12] Ra != 1111 | [11:8] Rd | [7:4] 0000 | [3:0] Rm
T1 variant 
MLA{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for this encoding 
if Ra == '1111' then SEE MUL; 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); setflags = FALSE;
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 
<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = SInt(R[n]); // operand1 = UInt(R[n]) produces the same final results
    operand2 = SInt(R[m]); // operand2 = UInt(R[m]) produces the same final results
    addend = SInt(R[a]); // addend = UInt(R[a]) produces the same final results
    result = operand1 * operand2 + addend;
    R[d] = result<31:0>;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result<31:0>);
        // PSTATE.C, PSTATE.V unchanged
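
The pseudocode comments state that treating the operands as signed or unsigned produces the same final result. The following Python sketch is one way to check that claim on sample values. The names sint32 and mla are hypothetical, and the flags are returned as a plain dictionary for illustration only.

```python
MASK32 = 0xFFFFFFFF

def sint32(x):
    """Interpret the low 32 bits of x as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mla(rn, rm, ra, signed=True):
    """Model MLAS: returns (low 32 bits of Rn * Rm + Ra, N and Z flags)."""
    conv = sint32 if signed else (lambda v: v & MASK32)
    result = conv(rn) * conv(rm) + conv(ra)
    low = result & MASK32                 # only result<31:0> reaches Rd
    flags = {"N": low >> 31, "Z": int(low == 0)}
    return low, flags
```

Calling mla(0xFFFFFFFF, 2, 1, signed=True) and mla(0xFFFFFFFF, 2, 1, signed=False) both return 0xFFFFFFFF with N set, because the two interpretations differ only in bits above bit 31.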
F5.1.106 MLS


Multiply and Subtract multiplies two register values, and subtracts the product from a third register value. The least 
significant 32 bits of the result are written to the destination register. These 32 bits do not depend on whether the 
source register values are considered to be signed values or unsigned values. 


A1 


Bit fields: [31:28] cond != 1111 | [27:20] 00000110 | [19:16] Rd | [15:12] Ra | [11:8] Rm | [7:4] 1001 | [3:0] Rn


A1 variant
MLS{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE;


T1 


Bit fields, first halfword: [15:4] 111110110000 | [3:0] Rn
Bit fields, second halfword: [15:12] Ra | [11:8] Rd | [7:4] 0001 | [3:0] Rm


T1 variant 
MLS{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE; 
// ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 


<Ra> Is the third general-purpose source register holding the minuend, encoded in the "Ra" field. 

Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = SInt(R[n]); // operand1 = UInt(R[n]) produces the same final results
    operand2 = SInt(R[m]); // operand2 = UInt(R[m]) produces the same final results
    addend = SInt(R[a]); // addend = UInt(R[a]) produces the same final results
    result = addend - operand1 * operand2;
    R[d] = result<31:0>;
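
A minimal Python sketch of the operation above, reduced to 32-bit modular arithmetic, is shown below. The function name mls is hypothetical. The remainder computation in the usage note is a common idiom rather than something this description states, and it assumes the quotient has already been computed by other means.

```python
MASK32 = 0xFFFFFFFF

def mls(rn, rm, ra):
    """Model MLS <Rd>, <Rn>, <Rm>, <Ra>: low 32 bits of Ra - Rn * Rm."""
    # Working modulo 2**32 gives the same bits for signed and unsigned operands.
    return (ra - rn * rm) & MASK32
```

For instance, with a quotient of 3 for 17 divided by 5, mls(3, 5, 17) returns the remainder 2.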
F5.1.107 MOV, MOVS (immediate)


Move (immediate) writes an immediate value to the destination register. 


If the destination register is not the PC, the MOVS variant of the instruction updates the condition flags based on 
the result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The MOV variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.
• The MOVS variant of the instruction performs an exception return without the use of the stack. In this case:
  — The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
  — The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
    AArch32 state on page G1-3835.
  — The instruction is UNDEFINED in Hyp mode.
  — The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.
A1 


Bit fields: [31:28] cond != 1111 | [27:21] 0011101 | [20] S | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:0] imm12
MOV variant 


Applies when S == 0. 


MOV{<c>}{<q>} <Rd>, #<const> 


MOVS variant 
Applies when S == 1. 


MOVS{<c>}{<q>} <Rd>, #<const> 


Decode for all variants of this encoding 


d = UInt(Rd); setflags = (S == '1'); (imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 
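The decode above relies on A32ExpandImm_C() to turn the 12-bit field into a 32-bit value and a carry. The following Python sketch shows that expansion as it is commonly described: an 8-bit value rotated right by twice the 4-bit rotation field, with the carry taken from bit 31 of the result when the rotation is nonzero. The function names are assumptions for this example, and the manual's own pseudocode definition remains authoritative.

```python
from typing import Tuple

def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount (taken modulo 32)."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def a32_expand_imm_c(imm12: int, carry_in: int) -> Tuple[int, int]:
    """Sketch of the A32 modified immediate expansion: returns (imm32, carry)."""
    unrotated = imm12 & 0xFF             # imm12<7:0>
    rotation = 2 * ((imm12 >> 8) & 0xF)  # 2 * UInt(imm12<11:8>)
    if rotation == 0:
        return unrotated, carry_in       # no rotation: carry passes through
    imm32 = ror32(unrotated, rotation)
    return imm32, (imm32 >> 31) & 1      # carry is bit 31 of the rotated value
```

For example, `a32_expand_imm_c(0x4FF, 0)` rotates 0xFF right by 8 and returns `(0xFF000000, 1)`, which is the carry a MOVS with that encoding would write to PSTATE.C.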
A2

Bits    31:28          27:20             19:16   15:12   11:0
Field   cond != 1111   0 0 1 1 0 0 0 0   imm4    Rd      imm12



A2 variant 


MOV{<c>}{<q>} <Rd>, #<imm16> // <imm16> cannot be represented in A1
MOVW{<c>}{<q>} <Rd>, #<imm16> // <imm16> can be represented in A1


Decode for this encoding 


d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(imm4:imm12, 32); 
if d == 15 then UNPREDICTABLE; 


T1

Bits    15:13   12:11   10:8   7:0
Field   0 0 1   0 0     Rd     imm8

T1 variant 


MOV<c>{<q>} <Rd>, #<imm8> // Inside IT block 
MOVS{<q>} <Rd>, #<imm8> // Outside IT block 


Decode for this encoding 


d = UInt(Rd); setflags = !InITBlock(); imm32 = ZeroExtend(imm8, 32); carry = PSTATE.C; 
T2

Bits    15:11       10   9   8:5       4   3:0       |  15   14:12   11:8   7:0
Field   1 1 1 1 0   i    0   0 0 1 0   S   1 1 1 1   |  0    imm3    Rd     imm8


MOV variant 
Applies when S == 0.


MOV<c>.W <Rd>, #<const> // Inside IT block, and <Rd>, <const> can be represented in T1 
MOV{<c>}{<q>} <Rd>, #<const> 


MOVS variant 
Applies when S == 1.


MOVS.W <Rd>, #<const> // Outside IT block, and <Rd>, <const> can be represented in T1 
MOVS{<c>}{<q>} <Rd>, #<const> 


Decode for all variants of this encoding 
d = UInt(Rd); setflags = (S == '1');  (imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C); 
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
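Encoding T2 uses T32ExpandImm_C() on the 12-bit field i:imm3:imm8. The following Python sketch shows the T32 modified immediate scheme as it is commonly described: when the top two bits are 00 the low byte is replicated according to bits<9:8>, otherwise an 8-bit value with its top bit forced to 1 is rotated right by bits<11:7>. The function names are assumptions for this example, and the UNPREDICTABLE cases are modelled here by raising an exception.

```python
from typing import Tuple

def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount (taken modulo 32)."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def t32_expand_imm_c(imm12: int, carry_in: int) -> Tuple[int, int]:
    """Sketch of the T32 modified immediate expansion: returns (imm32, carry)."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) & 0b11 == 0b00:
        mode = (imm12 >> 8) & 0b11
        if mode != 0 and imm8 == 0:
            raise ValueError("UNPREDICTABLE: replicated byte is zero")
        if mode == 0b00:
            imm32 = imm8                         # 0x000000XY
        elif mode == 0b01:
            imm32 = (imm8 << 16) | imm8          # 0x00XY00XY
        elif mode == 0b10:
            imm32 = (imm8 << 24) | (imm8 << 8)   # 0xXY00XY00
        else:
            imm32 = imm8 * 0x01010101            # 0xXYXYXYXY
        return imm32, carry_in                   # carry passes through
    unrotated = 0x80 | (imm12 & 0x7F)            # '1':imm12<6:0>
    imm32 = ror32(unrotated, (imm12 >> 7) & 0x1F)
    return imm32, (imm32 >> 31) & 1
```

For example, the field value 0x1AB expands to 0x00AB00AB and leaves the carry unchanged, while 0x400 expands to 0x80000000 with a carry of 1.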


T3

Bits    15:11       10   9:4           3:0    |  15   14:12   11:8   7:0
Field   1 1 1 1 0   i    1 0 0 1 0 0   imm4   |  0    imm3    Rd     imm8


T3 variant 


MOV{<c>}{<q>} <Rd>, #<imm16> // <imm16> cannot be represented in T1 or T2 
MOVW{<c>}{<q>} <Rd>, #<imm16> // <imm16> can be represented in T1 or T2


Decode for this encoding 


d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(imm4:i:imm3:imm8, 32); 
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.

<q>        See Standard assembler syntax fields on page F2-2406.

<Rd>       For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
           deprecates using the PC as the destination register, but if the PC is used:
           • For the MOV variant, the instruction is a branch to the address calculated by the operation.
             This is an interworking branch, see Pseudocode description of operations on the AArch32
             general-purpose registers and the PC on page E1-2293.
           • For the MOVS variant, the instruction performs an exception return, that restores PSTATE
             from SPSR_<current_mode>.
           For encoding A2, T1, T2 and T3: is the general-purpose destination register, encoded in the "Rd"
           field.

<imm8>     Is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field.

<imm16>    For encoding A2: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
           "imm4:imm12" field.
           For encoding T3: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
           "imm4:i:imm3:imm8" field.

<const>    For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
           page F2-2422 for the range of values.
           For encoding T2: an immediate value. See Modified immediate constants in T32 instructions on
           page F2-2420 for the range of values.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = imm32;
    if d == 15 then // Can only occur for encoding A1
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged


F5.1.108 MOV, MOVS (register) 
Move (register) copies a value from a register to the destination register. 


If the destination register is not the PC, the MOVS variant of the instruction updates the condition flags based on 
the result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. If the 
destination register is the PC: 


• The MOV variant of the instruction is a branch. In the T32 instruction set (encoding T1) this is a simple
branch, and in the A32 instruction set it is an interworking branch, see Pseudocode description of operations
on the AArch32 general-purpose registers and the PC on page E1-2293.

• The MOVS variant of the instruction performs an exception return without the use of the stack. In this case:
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
AArch32 state on page G1-3835.
— The instruction is UNDEFINED in Hyp mode.
— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

This instruction is used by the aliases ASRS (immediate), ASR (immediate), LSLS (immediate), LSL (immediate),
LSRS (immediate), LSR (immediate), RORS (immediate), ROR (immediate), RRXS, and RRX. See Alias
conditions on page F5-2817 for details of when each alias is preferred.


A1

Bits    31:28          27:21           20   19:16          15:12   11:7   6:5    4   3:0
Field   cond != 1111   0 0 0 1 1 0 1   S    (0)(0)(0)(0)   Rd      imm5   type   0   Rm


MOV, rotate right with extend variant 
Applies when S == 0 && imm5 == 00000 && type == 11. 


MOV{<c>}{<q>} <Rd>, <Rm>, RRX 


MOV, shift or rotate by value variant 
Applies when S == 0 && !(imm5 == 00000 && type == 11).


MOV{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


MOVS, rotate right with extend variant 
Applies when S == 1 && imm5 == 00000 && type == 11.


MOVS{<c>}{<q>} <Rd>, <Rm>, RRX 


MOVS, shift or rotate by value variant 
Applies when S == 1 && !(imm5 == 00000 && type == 11). 


MOVS{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1

Bits    15:10         9:8   7   6:3   2:0
Field   0 1 0 0 0 1   1 0   D   Rm    Rd


T1 variant 


MOV{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(D:Rd); m = UInt(Rm); setflags = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);
if d == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;


T2

Bits    15:13   12:11      10:6   5:3   2:0
Field   0 0 0   op != 11   imm5   Rm    Rd


T2 variant 


MOV<c>{<q>} <Rd>, <Rm> {, <shift> #<amount>} // Inside IT block 
MOVS{<q>} <Rd>, <Rm> {, <shift> #<amount>} // Outside IT block 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = DecodeImmShift(op, imm5);
if op == '00' && imm5 == '00000' && InITBlock() then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If op == '00' && imm5 == '00000' && InITBlock(), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as if it passed its condition code check.
• The instruction executes as NOP, as if it failed its condition code check.
• The instruction executes as MOV Rd, Rm.

T3

Bits    15:9            8:5       4   3:0       |  15    14:12   11:8   7:6    5:4    3:0
Field   1 1 1 0 1 0 1   0 0 1 0   S   1 1 1 1   |  (0)   imm3    Rd     imm2   type   Rm


MOV, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


MOV{<c>}{<q>} <Rd>, <Rm>, RRX 


MOV, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).

MOV{<c>}.W <Rd>, <Rm> {, LSL #0} // <Rd>, <Rm> can be represented in T1
MOV<c>.W <Rd>, <Rm> {, <shift> #<amount>} // Inside IT block, and <Rd>, <Rm>, <shift>, <amount> can be represented in T2
MOV{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>}


MOVS, rotate right with extend variant 


Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11.


MOVS{<c>}{<q>} <Rd>, <Rm>, RRX 


MOVS, shift or rotate by value variant 


Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11). 


MOVS.W <Rd>, <Rm> {, <shift> #<amount>} // Outside IT block, and <Rd>, <Rm>, <shift>, <amount> can be represented in T1 or T2


MOVS{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Alias conditions 





Alias              of variant                              is preferred when

ASRS (immediate)   T3 (MOVS, shift or rotate by value),    S == '1' && type == '10'
                   A1 (MOVS, shift or rotate by value)
ASRS (immediate)   T2                                      op == '10' && !InITBlock()
ASR (immediate)    T3 (MOV, shift or rotate by value),     S == '0' && type == '10'
                   A1 (MOV, shift or rotate by value)
ASR (immediate)    T2                                      op == '10' && InITBlock()
LSLS (immediate)   T3 (MOVS, shift or rotate by value)     S == '1' && imm3:Rd:imm2 != '000xxxx00' && type == '00'
LSLS (immediate)   A1 (MOVS, shift or rotate by value)     S == '1' && imm5 != '00000' && type == '00'
LSLS (immediate)   T2                                      op == '00' && imm5 != '00000' && !InITBlock()
LSL (immediate)    T3 (MOV, shift or rotate by value)      S == '0' && imm3:Rd:imm2 != '000xxxx00' && type == '00'
LSL (immediate)    A1 (MOV, shift or rotate by value)      S == '0' && imm5 != '00000' && type == '00'
LSL (immediate)    T2                                      op == '00' && imm5 != '00000' && InITBlock()
LSRS (immediate)   T3 (MOVS, shift or rotate by value),    S == '1' && type == '01'
                   A1 (MOVS, shift or rotate by value)
LSRS (immediate)   T2                                      op == '01' && !InITBlock()
LSR (immediate)    T3 (MOV, shift or rotate by value),     S == '0' && type == '01'
                   A1 (MOV, shift or rotate by value)
LSR (immediate)    T2                                      op == '01' && InITBlock()
RORS (immediate)   T3 (MOVS, shift or rotate by value)     S == '1' && imm3:Rd:imm2 != '000xxxx00' && type == '11'
RORS (immediate)   A1 (MOVS, shift or rotate by value)     S == '1' && imm5 != '00000' && type == '11'
ROR (immediate)    T3 (MOV, shift or rotate by value)      S == '0' && imm3:Rd:imm2 != '000xxxx00' && type == '11'
ROR (immediate)    A1 (MOV, shift or rotate by value)      S == '0' && imm5 != '00000' && type == '11'
RRXS               T3 (MOVS, rotate right with extend)     S == '1' && imm3 == '000' && imm2 == '00' && type == '11'
RRXS               A1 (MOVS, rotate right with extend)     S == '1' && imm5 == '00000' && type == '11'
RRX                T3 (MOV, rotate right with extend)      S == '0' && imm3 == '000' && imm2 == '00' && type == '11'
RRX                A1 (MOV, rotate right with extend)      S == '0' && imm5 == '00000' && type == '11'





Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.

<q>        See Standard assembler syntax fields on page F2-2406.

<Rd>       For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If the PC is
           used:
           • For the MOV variant, the instruction is a branch to the address calculated by the operation.
             This is an interworking branch, see Pseudocode description of operations on the AArch32
             general-purpose registers and the PC on page E1-2293. ARM deprecates use of the
             instruction if <Rn> is the PC.
           • For the MOVS variant, the instruction performs an exception return, that restores PSTATE
             from SPSR_<current_mode>. ARM deprecates use of the instruction if <Rn> is not the LR,
             or if the optional shift or RRX argument is specified.
           For encoding T1: is the general-purpose destination register, encoded in the "D:Rd" field. If the PC
           is used:
           • The instruction causes a branch to the address moved to the PC. This is a simple branch, see
             Pseudocode description of operations on the AArch32 general-purpose registers and the PC
             on page E1-2293.
           • The instruction must either be outside an IT block or the last instruction of an IT block.
           For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field.

<Rm>       For encoding A1 and T1: is the general-purpose source register, encoded in the "Rm" field. The PC
           can be used. ARM deprecates use of the instruction if <Rd> is the PC.
           For encoding T2 and T3: is the general-purpose source register, encoded in the "Rm" field.

<shift>    For encoding A1 and T3: is the type of shift to be applied to the source register, encoded in the
           "type" field. It can have the following values:
           LSL when type = 00
           LSR when type = 01
           ASR when type = 10
           ROR when type = 11
           For encoding T2: is the type of shift to be applied to the source register, encoded in the "op" field.
           It can have the following values:
           LSL when op = 00
           LSR when op = 01
           ASR when op = 10

<amount>   For encoding A1 and T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or
           1 to 32 (when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
           For encoding T3: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
           (when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = shifted;
    if d == 15 then
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
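The operation above combines an immediate shift decode with Shift_C() and an optional flag update. The following Python sketch models that path for a destination other than the PC. It assumes the usual immediate shift decode, in which an encoded amount of 0 means 32 for LSR and ASR and selects RRX for the rotate type. The function names are assumptions for this example, not the manual's definitions.

```python
from typing import Dict, Tuple

MASK32 = 0xFFFFFFFF

def decode_imm_shift(type_bits: int, imm5: int) -> Tuple[str, int]:
    """Map the 2-bit type field and 5-bit immediate to (shift type, amount)."""
    if type_bits == 0b00:
        return "LSL", imm5
    if type_bits == 0b01:
        return "LSR", imm5 if imm5 != 0 else 32  # an encoded 0 means 32
    if type_bits == 0b10:
        return "ASR", imm5 if imm5 != 0 else 32  # an encoded 0 means 32
    return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)

def shift_c(value: int, shift_t: str, amount: int, carry_in: int) -> Tuple[int, int]:
    """Shift a 32-bit value and return (result, carry out)."""
    value &= MASK32
    if amount == 0:
        return value, carry_in
    if shift_t == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_t == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return result, result >> 31
    # RRX: shift right by one, inserting the old carry flag at bit 31.
    return ((carry_in & 1) << 31) | (value >> 1), value & 1

def movs_register(rm: int, type_bits: int, imm5: int, carry_in: int) -> Tuple[int, Dict[str, int]]:
    """Model MOVS <Rd>, <Rm>, <shift> #<amount> for Rd != PC."""
    shift_t, shift_n = decode_imm_shift(type_bits, imm5)
    result, carry = shift_c(rm, shift_t, shift_n, carry_in)
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}  # V is unchanged
    return result, flags
```

For example, `movs_register(0x80000000, 0b01, 0, 0)` models MOVS with LSR #32. The result is 0 and the flags are N = 0, Z = 1, C = 1.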
F5.1.109 MOV, MOVS (register-shifted register)


Move (register-shifted register) copies a register-shifted register value to the destination register. It can optionally
update the condition flags based on the value.

This instruction is used by the aliases ASRS (register), ASR (register), LSLS (register), LSL (register), LSRS
(register), LSR (register), RORS (register), and ROR (register). See Alias conditions on page F5-2822 for details of
when each alias is preferred.


A1

Bits    31:28          27:21           20   19:16          15:12   11:8   7   6:5    4   3:0
Field   cond != 1111   0 0 0 1 1 0 1   S    (0)(0)(0)(0)   Rd      Rs     0   type   1   Rm


Flag setting variant 
Applies when S == 1.


MOVS{<c>}{<q>} <Rd>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


MOV{<c>}{<q>} <Rd>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 


d = UInt(Rd); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || m == 15 || s == 15 then UNPREDICTABLE;


T1

Bits    15:10         9:6         5:3   2:0
Field   0 1 0 0 0 0   op = 0xxx   Rs    Rdm


Arithmetic shift right variant 
Applies when op == 0100. 


MOV<c>{<q>} <Rdm>, <Rdm>, ASR <Rs> // Inside IT block 
MOVS{<q>} <Rdm>, <Rdm>, ASR <Rs> // Outside IT block 


Logical shift left variant 
Applies when op == 0010. 


MOV<c>{<q>} <Rdm>, <Rdm>, LSL <Rs> // Inside IT block 
MOVS{<q>} <Rdm>, <Rdm>, LSL <Rs> // Outside IT block 


Logical shift right variant 
Applies when op == 0011. 


MOV<c>{<q>} <Rdm>, <Rdm>, LSR <Rs> // Inside IT block 
MOVS{<q>} <Rdm>, <Rdm>, LSR <Rs> // Outside IT block 


Rotate right variant 
Applies when op == 0111.


MOV<c>{<q>} <Rdm>, <Rdm>, ROR <Rs> // Inside IT block 
MOVS{<q>} <Rdm>, <Rdm>, ROR <Rs> // Outside IT block 


Decode for all variants of this encoding 
if !(op IN {'0010', '0011', '0100', '0111'}) then SEE "Related encodings";
d = UInt(Rdm); m = UInt(Rdm); s = UInt(Rs);
setflags = !InITBlock(); shift_t = DecodeRegShift(op<2>:op<0>);


T2

Bits    15:7                6:5    4   3:0   |  15:12     11:8   7:4       3:0
Field   1 1 1 1 1 0 1 0 0   type   S   Rm    |  1 1 1 1   Rd     0 0 0 0   Rs


Flag setting variant 
Applies when S == 1.


MOVS.W <Rd>, <Rm>, <type> <Rs> // Outside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1 
MOVS{<c>}{<q>} <Rd>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


MOV<c>.W <Rd>, <Rm>, <type> <Rs> // Inside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1
MOV{<c>}{<q>} <Rd>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || m == 15 || s == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


Related encodings: In encoding T1, for an op field value that is not described above, see Data-processing (two low 
registers) on page F3-2438. 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Alias conditions 





Alias             of variant                     is preferred when

ASRS (register)   A1 (flag setting)              S == '1' && type == '10'
ASRS (register)   T1 (arithmetic shift right)    op == '0100' && !InITBlock()
ASRS (register)   T2 (flag setting)              type == '10' && S == '1'
ASR (register)    A1 (not flag setting)          S == '0' && type == '10'
ASR (register)    T1 (arithmetic shift right)    op == '0100' && InITBlock()
ASR (register)    T2 (not flag setting)          type == '10' && S == '0'
LSLS (register)   A1 (flag setting)              S == '1' && type == '00'
LSLS (register)   T1 (logical shift left)        op == '0010' && !InITBlock()
LSLS (register)   T2 (flag setting)              type == '00' && S == '1'
LSL (register)    A1 (not flag setting)          S == '0' && type == '00'
LSL (register)    T1 (logical shift left)        op == '0010' && InITBlock()
LSL (register)    T2 (not flag setting)          type == '00' && S == '0'
LSRS (register)   A1 (flag setting)              S == '1' && type == '01'
LSRS (register)   T1 (logical shift right)       op == '0011' && !InITBlock()
LSRS (register)   T2 (flag setting)              type == '01' && S == '1'
LSR (register)    A1 (not flag setting)          S == '0' && type == '01'
LSR (register)    T1 (logical shift right)       op == '0011' && InITBlock()
LSR (register)    T2 (not flag setting)          type == '01' && S == '0'
RORS (register)   A1 (flag setting)              S == '1' && type == '11'
RORS (register)   T1 (rotate right)              op == '0111' && !InITBlock()
RORS (register)   T2 (flag setting)              type == '11' && S == '1'
ROR (register)    A1 (not flag setting)          S == '0' && type == '11'
ROR (register)    T1 (rotate right)              op == '0111' && InITBlock()
ROR (register)    T2 (not flag setting)          type == '11' && S == '0'





Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.

<q>        See Standard assembler syntax fields on page F2-2406.

<Rdm>      Is the general-purpose source register and the destination register, encoded in the "Rdm" field.

<Rd>       Is the general-purpose destination register, encoded in the "Rd" field.

<Rm>       Is the general-purpose source register, encoded in the "Rm" field.

<type>     Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
           the following values:
           LSL when type = 00
           LSR when type = 01
           ASR when type = 10
           ROR when type = 11

<Rs>       Is the general-purpose source register holding a shift amount in its bottom 8 bits, encoded in the "Rs"
           field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (result, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
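Only the bottom 8 bits of R[s] supply the shift amount, so the amount can range from 0 to 255. The following Python sketch illustrates the consequences for the LSL case only. The names `lsl_c` and `mov_lsl_register` are assumptions for this example, and the carry rule follows the usual definition in which the carry out is the last bit shifted out of bit 31.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF

def lsl_c(value: int, amount: int, carry_in: int) -> Tuple[int, int]:
    """Logical shift left of a 32-bit value, returning (result, carry out)."""
    value &= MASK32
    if amount == 0:
        return value, carry_in              # a zero shift leaves the carry alone
    extended = value << amount
    return extended & MASK32, (extended >> 32) & 1

def mov_lsl_register(rm: int, rs: int, carry_in: int = 0) -> Tuple[int, int]:
    """Model MOVS <Rd>, <Rm>, LSL <Rs>: only Rs<7:0> is used as the amount."""
    shift_n = rs & 0xFF                     # UInt(R[s]<7:0>)
    return lsl_c(rm, shift_n, carry_in)
```

With Rs = 0x104 the shift amount is 4, because bits above bit 7 are ignored. A shift by exactly 32 gives a zero result with the carry taken from bit 0 of Rm, and any larger amount gives a zero result and a zero carry.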


F5.1.110 MOVT

Move Top writes an immediate value to the top halfword of the destination register. It does not affect the contents
of the bottom halfword.

A1

Bits    31:28          27:20             19:16   15:12   11:0
Field   cond != 1111   0 0 1 1 0 1 0 0   imm4    Rd      imm12

A1 variant


MOVT{<c>}{<q>} <Rd>, #<imm16> 


Decode for this encoding 


d = UInt(Rd); imm16 = imm4:imm12; 
if d == 15 then UNPREDICTABLE; 


T1

Bits    15:11       10   9:4           3:0    |  15   14:12   11:8   7:0
Field   1 1 1 1 0   i    1 0 1 1 0 0   imm4   |  0    imm3    Rd     imm8


T1 variant 


MOVT{<c>}{<q>} <Rd>, #<imm16> 


Decode for this encoding 


d = UInt(Rd); imm16 = imm4:i:imm3:imm8;
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 

<c>        See Standard assembler syntax fields on page F2-2406.

<q>        See Standard assembler syntax fields on page F2-2406.

<Rd>       Is the general-purpose destination register, encoded in the "Rd" field.

<imm16>    For encoding A1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
           "imm4:imm12" field.
           For encoding T1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
           "imm4:i:imm3:imm8" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    R[d]<31:16> = imm16;
    // R[d]<15:0> unchanged
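Because MOVT leaves the bottom halfword unchanged, it can follow the 16-bit form of MOV (immediate), written MOVW, to build an arbitrary 32-bit constant in two instructions. The following Python sketch models that pairing on plain integers. The function names are assumptions for this example.

```python
def movw(imm16: int) -> int:
    """MOV (immediate), 16-bit form: the register becomes ZeroExtend(imm16, 32)."""
    return imm16 & 0xFFFF

def movt(rd_value: int, imm16: int) -> int:
    """MOVT: write imm16 to bits<31:16>, keep bits<15:0> of the old value."""
    return ((imm16 & 0xFFFF) << 16) | (rd_value & 0xFFFF)

def load_const32(value: int) -> int:
    """Build a 32-bit constant with a MOVW followed by a MOVT."""
    reg = movw(value & 0xFFFF)              # low halfword first
    reg = movt(reg, (value >> 16) & 0xFFFF) # then the top halfword
    return reg
```

The order matters in this sketch: the 16-bit move clears the top halfword, so it has to come before the MOVT.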


F5.1.111 MRC


Move to general-purpose register from System register. This instruction copies the value of a System register to a 
general-purpose register. 


The System register descriptions identify valid encodings for this instruction. Other encodings are UNDEFINED. For 
more information see About the AArch32 System register interface on page E1-2312 and General behavior of 
System registers on page G4-4151. 


In an implementation that includes EL2, MRC accesses to system control registers can be trapped to Hyp mode, 
meaning that an attempt to execute an MRC instruction in a Non-secure mode other than Hyp mode, that would be 
permitted in the absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see EL2 
configurable controls on page G1-3894. 


Because of the range of possible traps to Hyp mode, the MRC pseudocode does not show these possible traps. 


A1

Bits    31:28          27:24     23:21   20   19:16   15:12   11:9                  8           7:5    4   3:0
Field   cond != 1111   1 1 1 0   opc1    1    CRn     Rt      coproc<3:1> = 1 1 1   coproc<0>   opc2   1   CRm

A1 variant

MRC{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <CRn>, <CRm>{, {#}<opc2>}


Decode for this encoding 


t = UInt(Rt); cp = if coproc<0> == '0' then 14 else 15;
// ARMv8-A removes UNPREDICTABLE for R13


T1

Bits    15:13   12   11:8      7:5    4   3:0   |  15:12   11:9                  8           7:5    4   3:0
Field   1 1 1   0    1 1 1 0   opc1   1   CRn   |  Rt      coproc<3:1> = 1 1 1   coproc<0>   opc2   1   CRm

T1 variant

MRC{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <CRn>, <CRm>{, {#}<opc2>}

Decode for this encoding

t = UInt(Rt); cp = if coproc<0> == '0' then 14 else 15;
// ARMv8-A removes UNPREDICTABLE for R13

Assembler symbols

<c>        See Standard assembler syntax fields on page F2-2406.

<q>        See Standard assembler syntax fields on page F2-2406.

<coproc>   Is the System register encoding space, encoded in the "coproc<0>" field. It can have the following
           values:
           p14 when coproc<0> = 0
           p15 when coproc<0> = 1

<opc1>     Is the opc1 parameter within the System register encoding space, in the range 0 to 7, encoded in the
           "opc1" field.

<Rt>       Is the general-purpose register to be transferred or APSR_nzcv (encoded as 0b1111), encoded in the
           "Rt" field. If APSR_nzcv is used, bits [31:28] of the transferred value are written to the PSTATE
           condition flags.

<CRn>      Is the CRn parameter within the System register encoding space, in the range c0 to c15, encoded in
           the "CRn" field.

<CRm>      Is the CRm parameter within the System register encoding space, in the range c0 to c15, encoded in
           the "CRm" field.

<opc2>     Is the opc2 parameter within the System register encoding space, in the range 0 to 7, encoded in the
           "opc2" field.

The possible values of { <coproc>, <opc1>, <CRn>, <CRm>, <opc2> } encode the entire System register and System
instruction encoding space. Not all of this space is allocated, and the System register and System instruction
descriptions identify the allocated encodings.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    bits(32) value = AArch32.SysRegRead(cp, ThisInstr());
    if t != 15 then
        R[t] = value;
    elsif AArch32.SysRegReadCanWriteAPSR(cp, ThisInstr()) then
        PSTATE.<N,Z,C,V> = value<31:28>;
        // value<27:0> are not used.
    else
        PSTATE.<N,Z,C,V> = bits(4) UNKNOWN;
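The operands of MRC map directly onto bit fields of the A1 encoding. The following Python sketch extracts those fields from a 32-bit A32 instruction word, using the field positions given for encoding A1 in this section. The function name and the returned dictionary layout are assumptions for this example, and the sketch does not check whether the selected System register encoding is allocated.

```python
from typing import Dict

def decode_mrc_a1(instr: int) -> Dict[str, int]:
    """Split an A32 MRC (encoding A1) instruction word into its operand fields."""
    fixed_ok = (
        (instr >> 24) & 0xF == 0b1110   # bits 27:24
        and (instr >> 20) & 1 == 1      # bit 20
        and (instr >> 9) & 0x7 == 0b111 # coproc<3:1>
        and (instr >> 4) & 1 == 1       # bit 4
    )
    if not fixed_ok:
        raise ValueError("not an MRC A1 encoding")
    return {
        "cond": (instr >> 28) & 0xF,
        "opc1": (instr >> 21) & 0x7,
        "CRn": (instr >> 16) & 0xF,
        "Rt": (instr >> 12) & 0xF,                  # 15 selects APSR_nzcv
        "cp": 15 if (instr >> 8) & 1 else 14,       # from coproc<0>
        "opc2": (instr >> 5) & 0x7,
        "CRm": instr & 0xF,
    }

# 0xEE111F10 corresponds to: MRC p15, 0, r1, c1, c0, 0
fields = decode_mrc_a1(0xEE111F10)
```

When the decoded Rt is 15, the operation writes bits<31:28> of the value read to the condition flags instead of writing a general-purpose register.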


F5.1.112 MRRC


Move to two general-purpose registers from System register. This instruction copies the value of a System register 
to two general-purpose registers. 


The System register descriptions identify valid encodings for this instruction. Other encodings are UNDEFINED. For 
more information see About the AArch32 System register interface on page E1-2312 and General behavior of 
System registers on page G4-4151. 


In an implementation that includes EL2, MRRC accesses to System registers can be trapped to Hyp mode, meaning 
that an attempt to execute an MRRC instruction in a Non-secure mode other than Hyp mode, that would be permitted 
in the absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see EL2 
configurable controls on page G1-3894. 


Because of the range of possible traps to Hyp mode, the MRRC pseudocode does not show these possible traps. 


A1

Bits    31:28          27:20             19:16   15:12   11:9                  8           7:4    3:0
Field   cond != 1111   1 1 0 0 0 1 0 1   Rt2     Rt      coproc<3:1> = 1 1 1   coproc<0>   opc1   CRm

A1 variant

MRRC{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <Rt2>, <CRm>


Decode for this encoding 


t = UInt(Rt); t2 = UInt(Rt2); cp = if coproc<0> == '0' then 14 else 15;
if t == 15 || t2 == 15 || t == t2 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

CONSTRAINED UNPREDICTABLE behavior

If t == t2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


T1

Bits    15:13   12   11:4              3:0   |  15:12   11:9                  8           7:4    3:0
Field   1 1 1   0    1 1 0 0 0 1 0 1   Rt2   |  Rt      coproc<3:1> = 1 1 1   coproc<0>   opc1   CRm

T1 variant

MRRC{<c>}{<q>} <coproc>, {#}<opc1>, <Rt>, <Rt2>, <CRm>


Decode for this encoding 


t = UInt(Rt); t2 = UInt(Rt2); cp = if coproc<0> == '0' then 14 else 15;
if t == 15 || t2 == 15 || t == t2 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

CONSTRAINED UNPREDICTABLE behavior

If t == t2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.

<q>        See Standard assembler syntax fields on page F2-2406.

<coproc>   Is the System register encoding space, encoded in the "coproc<0>" field. It can have the following
           values:
           p14 when coproc<0> = 0
           p15 when coproc<0> = 1

<opc1>     Is the opc1 parameter within the System register encoding space, in the range 0 to 15, encoded in
           the "opc1" field.

<Rt>       Is the first general-purpose register that is transferred into, encoded in the "Rt" field.

<Rt2>      Is the second general-purpose register that is transferred into, encoded in the "Rt2" field.

<CRm>      Is the CRm parameter within the System register encoding space, in the range c0 to c15, encoded in
           the "CRm" field.

The possible values of { <coproc>, <opc1>, <CRm> } encode the entire System register encoding space. Not all of this
space is allocated, and the System register descriptions identify the allocated encodings.


For the permitted uses of these instructions, as described in this manual, <Rt2> transfers bits[63:32] of the selected 
System register, while <Rt> transfers bits[31:0]. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    value = AArch32.SysRegRead64(cp, ThisInstr());
    R[t] = value<31:0>;
    R[t2] = value<63:32>;
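The operation splits one 64-bit System register value across the two destination registers. The following Python sketch shows only that split, with `mrrc_split` as a hypothetical name chosen for this example.

```python
from typing import Tuple

def mrrc_split(value64: int) -> Tuple[int, int]:
    """Return (R[t], R[t2]) for a 64-bit System register value."""
    rt = value64 & 0xFFFFFFFF           # value<31:0>
    rt2 = (value64 >> 32) & 0xFFFFFFFF  # value<63:32>
    return rt, rt2

low, high = mrrc_split(0x1122334455667788)
```

Software that needs the full 64-bit value can recombine the halves as `(high << 32) | low`.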








F5.1.113 MRS

Move Special register to general-purpose register moves the value of the APSR, CPSR, or SPSR_<current_mode>
into a general-purpose register.

ARM recommends the APSR form when only the N, Z, C, V, Q, and GE[3:0] bits are being written. For more
information, see The Application Program Status Register, APSR on page E1-2296.

An MRS that accesses the SPSR is UNPREDICTABLE if executed in User mode or System mode.

An MRS that is executed in User mode and accesses the CPSR returns an UNKNOWN value for the CPSR.{E, A, I, F,
M} fields.

A1

Bits    31:28          27:23       22   21:20   19:16          15:12   11:10    9   8     7:4       3:0
Field   cond != 1111   0 0 0 1 0   R    0 0     (1)(1)(1)(1)   Rd      (0)(0)   0   (0)   0 0 0 0   (0)(0)(0)(0)

A1 variant

MRS{<c>}{<q>} <Rd>, <spec_reg>

Decode for this encoding

d = UInt(Rd); read_spsr = (R == '1');
if d == 15 then UNPREDICTABLE;

T1 
Bits (first halfword):    15-11  10  9-5    4  3-0
Fields:                   11110  0   11111  R  (1)(1)(1)(1)
Bits (second halfword):   15-14  13   12  11-8  7-6     5  4    3-0
Fields:                   10     (0)  0   Rd    (0)(0)  0  (0)  (0)(0)(0)(0)

T1 variant 
MRS{<c>}{<q>} <Rd>, <spec_reg> 
Decode for this encoding 
d = UInt(Rd);  read_spsr = (R == '1'); 
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<spec_reg> Is the special register to be accessed, encoded in the "R" field. It can have the following values: 
CPSR|APSR when R = 0
SPSR when R = 1 

Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if read_spsr then
        if PSTATE.M IN {M32_User,M32_System} then
            UNPREDICTABLE;
        else
            R[d] = SPSR[];
    else
        // CPSR has same bit assignments as SPSR, but with the IT, J, SS, IL, and T bits masked out.
        bits(32) mask = '11111000 00001111 00000011 11011111';
        psr_val = GetPSRFromPSTATE() AND mask;
        if PSTATE.EL == EL0 then
            // If accessed from User mode return UNKNOWN values for E, A, I, F bits, bits<9:6>,
            // and for the M field, bits<4:0>
            psr_val<9:6> = bits(4) UNKNOWN;
            psr_val<4:0> = bits(5) UNKNOWN;
        R[d] = psr_val;
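
The CPSR read path above can be sketched in Python to show the effect of the mask. This is an illustrative model only: the names CPSR_READ_MASK and mrs_cpsr_value are hypothetical, and the unknown_fill parameter is an assumption used to stand in for the UNKNOWN values, which real hardware is free to choose.

```python
# Mask from the operation pseudocode: IT, J, SS, IL, and T bits are cleared.
CPSR_READ_MASK = 0b11111000_00001111_00000011_11011111  # 0xF80F03DF

# Bits that read as UNKNOWN at EL0: E, A, I, F (bits<9:6>) and M (bits<4:0>).
EL0_UNKNOWN_BITS = (0xF << 6) | 0x1F


def mrs_cpsr_value(psr: int, at_el0: bool, unknown_fill: int = 0) -> int:
    """Model the value an MRS of the CPSR writes to Rd (hypothetical helper).

    unknown_fill supplies the bit pattern used for UNKNOWN fields; this is
    an assumption of the sketch, not architected behavior.
    """
    value = psr & CPSR_READ_MASK
    if at_el0:
        value = (value & ~EL0_UNKNOWN_BITS) | (unknown_fill & EL0_UNKNOWN_BITS)
    return value & 0xFFFFFFFF
```

Software that runs in User mode should therefore rely only on the bits outside EL0_UNKNOWN_BITS in the returned value.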


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.M IN {M32_User, M32_System} && read_spsr, then one of the following behaviors must occur: 





• The instruction is UNDEFINED.
• The instruction executes as NOP.


F5.1.114 MRS (Banked register)


Move to Register from Banked or Special register moves the value from the Banked general-purpose register or 
SPSR of the specified mode, or the value of ELR_hyp on page G1-3804, to a general-purpose register. 


MRS (Banked register) is UNPREDICTABLE if executed in User mode. 


When EL3 is using AArch64, if an MRS (Banked register) instruction that is executed in a Secure EL1 mode would 
access SPSR_mon, SP_mon, or LR_mon, it is trapped to EL3. 


The effect of using an MRS (Banked register) instruction with a register argument that is not valid for the current mode 
is UNPREDICTABLE. For more information see Usage restrictions on the Banked register transfer instructions on 
page F5-3229. 


A1

Bits:    31-28   27-23  22  21-20  19-16  15-12  11-10   9  8  7-4   3-0
Fields:  !=1111  00010  R   00     M1     Rd     (0)(0)  1  M  0000  (0)(0)(0)(0)
         cond

A1 variant


MRS{<c>}{<q>} <Rd>, <banked_reg> 


Decode for this encoding 
d = UInt(Rd); read_spsr = (R == '1');
if d == 15 then UNPREDICTABLE;
SYSm = M:M1;


T1

Bits (first halfword):    15-11  10  9-5    4  3-0
Fields:                   11110  0   11111  R  M1
Bits (second halfword):   15-14  13   12  11-8  7-6     5  4  3-0
Fields:                   10     (0)  0   Rd    (0)(0)  1  M  (0)(0)(0)(0)

T1 variant
MRS{<c>}{<q>} <Rd>, <banked_reg> 
Decode for this encoding 
d = UInt(Rd); read_spsr = (R == '1'); 
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
SYSm = M:M1; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<banked_reg> Is the name of the banked register to be transferred to or from, encoded in the "R:M:M1" field. It
can have the following values:

    R8_usr     when R = 0, M = 0, M1 = 0000
    R9_usr     when R = 0, M = 0, M1 = 0001
    R10_usr    when R = 0, M = 0, M1 = 0010
    R11_usr    when R = 0, M = 0, M1 = 0011
    R12_usr    when R = 0, M = 0, M1 = 0100
    SP_usr     when R = 0, M = 0, M1 = 0101
    LR_usr     when R = 0, M = 0, M1 = 0110
    R8_fiq     when R = 0, M = 0, M1 = 1000
    R9_fiq     when R = 0, M = 0, M1 = 1001
    R10_fiq    when R = 0, M = 0, M1 = 1010
    R11_fiq    when R = 0, M = 0, M1 = 1011
    R12_fiq    when R = 0, M = 0, M1 = 1100
    SP_fiq     when R = 0, M = 0, M1 = 1101
    LR_fiq     when R = 0, M = 0, M1 = 1110
    LR_irq     when R = 0, M = 1, M1 = 0000
    SP_irq     when R = 0, M = 1, M1 = 0001
    LR_svc     when R = 0, M = 1, M1 = 0010
    SP_svc     when R = 0, M = 1, M1 = 0011
    LR_abt     when R = 0, M = 1, M1 = 0100
    SP_abt     when R = 0, M = 1, M1 = 0101
    LR_und     when R = 0, M = 1, M1 = 0110
    SP_und     when R = 0, M = 1, M1 = 0111
    LR_mon     when R = 0, M = 1, M1 = 1100
    SP_mon     when R = 0, M = 1, M1 = 1101
    ELR_hyp    when R = 0, M = 1, M1 = 1110
    SP_hyp     when R = 0, M = 1, M1 = 1111
    SPSR_fiq   when R = 1, M = 0, M1 = 1110
    SPSR_irq   when R = 1, M = 1, M1 = 0000
    SPSR_svc   when R = 1, M = 1, M1 = 0010
    SPSR_abt   when R = 1, M = 1, M1 = 0100
    SPSR_und   when R = 1, M = 1, M1 = 0110
    SPSR_mon   when R = 1, M = 1, M1 = 1100
    SPSR_hyp   when R = 1, M = 1, M1 = 1110

The following encodings are UNPREDICTABLE:
• R = 0, M = 0, M1 = 0111.
• R = 0, M = 0, M1 = 1111.
• R = 0, M = 1, M1 = 10xx.
• R = 1, M = 0, M1 = 0xxx.
• R = 1, M = 0, M1 = 10xx.
• R = 1, M = 0, M1 = 110x.
• R = 1, M = 0, M1 = 1111.
• R = 1, M = 1, M1 = 0001.
• R = 1, M = 1, M1 = 0011.
• R = 1, M = 1, M1 = 0101.
• R = 1, M = 1, M1 = 0111.
• R = 1, M = 1, M1 = 10xx.
• R = 1, M = 1, M1 = 1101.
• R = 1, M = 1, M1 = 1111.








Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.EL == EL0 then
        UNPREDICTABLE;
    else
        mode = PSTATE.M;
        if read_spsr then
            SPSRaccessValid(SYSm, mode);            // Check for UNPREDICTABLE cases
            case SYSm of
                when '01110'  R[d] = SPSR_fiq;
                when '10000'  R[d] = SPSR_irq;
                when '10010'  R[d] = SPSR_svc;
                when '10100'  R[d] = SPSR_abt;
                when '10110'  R[d] = SPSR_und;
                when '11100'
                    if !ELUsingAArch32(EL3) then AArch64.MonitorModeTrap();
                    R[d] = SPSR_mon;
                when '11110'  R[d] = SPSR_hyp;
        else
            BankedRegisterAccessValid(SYSm, mode);  // Check for UNPREDICTABLE cases
            case SYSm of
                when '00xxx'                        // Access the User mode registers
                    m = UInt(SYSm<2:0>) + 8;
                    R[d] = Rmode[m,M32_User];
                when '01xxx'                        // Access the FIQ mode registers
                    m = UInt(SYSm<2:0>) + 8;
                    R[d] = Rmode[m,M32_FIQ];
                when '1000x'                        // Access the IRQ mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    R[d] = Rmode[m,M32_IRQ];
                when '1001x'                        // Access the Supervisor mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    R[d] = Rmode[m,M32_Svc];
                when '1010x'                        // Access the Abort mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    R[d] = Rmode[m,M32_Abort];
                when '1011x'                        // Access the Undefined mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    R[d] = Rmode[m,M32_Undef];
                when '1110x'                        // Access Monitor registers
                    if !ELUsingAArch32(EL3) then AArch64.MonitorModeTrap();
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    R[d] = Rmode[m,M32_Monitor];
                when '11110'                        // Access ELR_hyp register
                    R[d] = ELR_hyp;
                when '11111'                        // Access SP_hyp register
                    R[d] = Rmode[13,M32_Hyp];
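
The SYSm case analysis above can be mirrored by a short Python sketch that maps R and the 5-bit SYSm value (M:M1) to a banked register name. The function decode_banked_reg is hypothetical and covers only the encoding-level decode: it does not model the mode-dependent checks made by SPSRaccessValid() and BankedRegisterAccessValid(), nor the Monitor mode trap.

```python
from typing import Optional

_SPSR_NAMES = {
    0b01110: "SPSR_fiq", 0b10000: "SPSR_irq", 0b10010: "SPSR_svc",
    0b10100: "SPSR_abt", 0b10110: "SPSR_und", 0b11100: "SPSR_mon",
    0b11110: "SPSR_hyp",
}
# Modes whose LR/SP pair is selected by SYSm<4:1>, with SYSm<0> choosing SP.
_LR_SP_MODES = {0b1000: "irq", 0b1001: "svc", 0b1010: "abt",
                0b1011: "und", 0b1110: "mon"}


def decode_banked_reg(r: int, sysm: int) -> Optional[str]:
    """Return the banked register name, or None for an UNPREDICTABLE encoding."""
    if not (r in (0, 1) and 0 <= sysm < 32):
        raise ValueError("r must be 0 or 1 and sysm must be a 5-bit value")
    if r == 1:
        return _SPSR_NAMES.get(sysm)
    top, low = sysm >> 3, sysm & 0b111
    if top in (0b00, 0b01):              # User ('00xxx') or FIQ ('01xxx')
        if low == 0b111:                 # M1 = x111 is not allocated here
            return None
        m = low + 8                      # general-purpose register number 8..14
        base = {13: "SP", 14: "LR"}.get(m, "R%d" % m)
        return base + ("_usr" if top == 0 else "_fiq")
    mode = _LR_SP_MODES.get(sysm >> 1)
    if mode is not None:
        return ("SP_" if sysm & 1 else "LR_") + mode
    if sysm == 0b11110:
        return "ELR_hyp"
    if sysm == 0b11111:
        return "SP_hyp"
    return None
```

A disassembler could use a table of this kind to print the <banked_reg> operand, falling back to a raw encoding when the function returns None.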





CONSTRAINED UNPREDICTABLE behavior 
If PSTATE.EL == EL0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.


F5.1.115 MSR (Banked register)


Move to Banked or Special register from general-purpose register moves the value of a general-purpose register to 
the Banked general-purpose register or SPSR of the specified mode, or to ELR_hyp on page G1-3804. 


MSR (Banked register) is UNPREDICTABLE if executed in User mode. 


When EL3 is using AArch64, if an MSR (Banked register) instruction that is executed in a Secure EL1 mode would 
access SPSR_mon, SP_mon, or LR_mon, it is trapped to EL3. 


The effect of using an MSR (Banked register) instruction with a register argument that is not valid for the current mode 
is UNPREDICTABLE. For more information see Usage restrictions on the Banked register transfer instructions on 
page F5-3229. 


A1

Bits:    31-28   27-23  22  21-20  19-16  15-12         11-10   9  8  7-4   3-0
Fields:  !=1111  00010  R   10     M1     (1)(1)(1)(1)  (0)(0)  1  M  0000  Rn
         cond

A1 variant


MSR{<c>}{<q>} <banked_reg>, <Rn> 


Decode for this encoding 


n = UInt(Rn); write_spsr = (R == '1'); 
if n == 15 then UNPREDICTABLE; 
SYSm = M:M1; 


T1

Bits (first halfword):    15-11  10  9-5    4  3-0
Fields:                   11110  0   11100  R  Rn
Bits (second halfword):   15-14  13   12  11-8  7-6     5  4  3-0
Fields:                   10     (0)  0   M1    (0)(0)  1  M  (0)(0)(0)(0)

T1 variant
MSR{<c>}{<q>} <banked_reg>, <Rn> 
Decode for this encoding 
n = UInt(Rn); write_spsr = (R == '1'); 
if n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
SYSm = M:M1; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<banked_reg> Is the name of the banked register to be transferred to or from, encoded in the "R:M:M1" field. It
can have the following values:

    R8_usr     when R = 0, M = 0, M1 = 0000
    R9_usr     when R = 0, M = 0, M1 = 0001
    R10_usr    when R = 0, M = 0, M1 = 0010
    R11_usr    when R = 0, M = 0, M1 = 0011
    R12_usr    when R = 0, M = 0, M1 = 0100
    SP_usr     when R = 0, M = 0, M1 = 0101
    LR_usr     when R = 0, M = 0, M1 = 0110
    R8_fiq     when R = 0, M = 0, M1 = 1000
    R9_fiq     when R = 0, M = 0, M1 = 1001
    R10_fiq    when R = 0, M = 0, M1 = 1010
    R11_fiq    when R = 0, M = 0, M1 = 1011
    R12_fiq    when R = 0, M = 0, M1 = 1100
    SP_fiq     when R = 0, M = 0, M1 = 1101
    LR_fiq     when R = 0, M = 0, M1 = 1110
    LR_irq     when R = 0, M = 1, M1 = 0000
    SP_irq     when R = 0, M = 1, M1 = 0001
    LR_svc     when R = 0, M = 1, M1 = 0010
    SP_svc     when R = 0, M = 1, M1 = 0011
    LR_abt     when R = 0, M = 1, M1 = 0100
    SP_abt     when R = 0, M = 1, M1 = 0101
    LR_und     when R = 0, M = 1, M1 = 0110
    SP_und     when R = 0, M = 1, M1 = 0111
    LR_mon     when R = 0, M = 1, M1 = 1100
    SP_mon     when R = 0, M = 1, M1 = 1101
    ELR_hyp    when R = 0, M = 1, M1 = 1110
    SP_hyp     when R = 0, M = 1, M1 = 1111
    SPSR_fiq   when R = 1, M = 0, M1 = 1110
    SPSR_irq   when R = 1, M = 1, M1 = 0000
    SPSR_svc   when R = 1, M = 1, M1 = 0010
    SPSR_abt   when R = 1, M = 1, M1 = 0100
    SPSR_und   when R = 1, M = 1, M1 = 0110
    SPSR_mon   when R = 1, M = 1, M1 = 1100
    SPSR_hyp   when R = 1, M = 1, M1 = 1110

The following encodings are UNPREDICTABLE:
• R = 0, M = 0, M1 = 0111.
• R = 0, M = 0, M1 = 1111.
• R = 0, M = 1, M1 = 10xx.
• R = 1, M = 0, M1 = 0xxx.
• R = 1, M = 0, M1 = 10xx.
• R = 1, M = 0, M1 = 110x.
• R = 1, M = 0, M1 = 1111.
• R = 1, M = 1, M1 = 0001.
• R = 1, M = 1, M1 = 0011.
• R = 1, M = 1, M1 = 0101.
• R = 1, M = 1, M1 = 0111.
• R = 1, M = 1, M1 = 10xx.
• R = 1, M = 1, M1 = 1101.
• R = 1, M = 1, M1 = 1111.
<Rn> Is the general-purpose source register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.EL == EL0 then
        UNPREDICTABLE;
    else
        mode = PSTATE.M;
        if write_spsr then
            SPSRaccessValid(SYSm, mode);            // Check for UNPREDICTABLE cases
            case SYSm of
                when '01110'  SPSR_fiq = R[n];
                when '10000'  SPSR_irq = R[n];
                when '10010'  SPSR_svc = R[n];
                when '10100'  SPSR_abt = R[n];
                when '10110'  SPSR_und = R[n];
                when '11100'
                    if !ELUsingAArch32(EL3) then AArch64.MonitorModeTrap();
                    SPSR_mon = R[n];
                when '11110'  SPSR_hyp = R[n];
        else
            BankedRegisterAccessValid(SYSm, mode);  // Check for UNPREDICTABLE cases
            case SYSm of
                when '00xxx'                        // Access the User mode registers
                    m = UInt(SYSm<2:0>) + 8;
                    Rmode[m,M32_User] = R[n];
                when '01xxx'                        // Access the FIQ mode registers
                    m = UInt(SYSm<2:0>) + 8;
                    Rmode[m,M32_FIQ] = R[n];
                when '1000x'                        // Access the IRQ mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    Rmode[m,M32_IRQ] = R[n];
                when '1001x'                        // Access the Supervisor mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    Rmode[m,M32_Svc] = R[n];
                when '1010x'                        // Access the Abort mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    Rmode[m,M32_Abort] = R[n];
                when '1011x'                        // Access the Undefined mode registers
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    Rmode[m,M32_Undef] = R[n];
                when '1110x'                        // Access Monitor registers
                    if !ELUsingAArch32(EL3) then AArch64.MonitorModeTrap();
                    m = 14 - UInt(SYSm<0>);         // LR when SYSm<0> == 0, otherwise SP
                    Rmode[m,M32_Monitor] = R[n];
                when '11110'                        // Access ELR_hyp register
                    ELR_hyp = R[n];
                when '11111'                        // Access SP_hyp register
                    Rmode[13,M32_Hyp] = R[n];


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.


F5.1.116 MSR (immediate)


Move immediate value to Special register moves selected bits of an immediate value to the corresponding bits in
the APSR (see The Application Program Status Register, APSR on page E1-2296), CPSR, or SPSR_<current_mode>.


Because of the Do-Not-Modify nature of its reserved bits, the immediate form of MSR is normally only useful at the 
Application level for writing to APSR_nzcvq (CPSR_f). 


If an MSR (immediate) moves selected bits of an immediate value to the CPSR, the PE checks whether the value being 
written to PSTATE.M is legal. See Illegal changes to PSTATE.M on page G1-3809.


An MSR (immediate) executed in User mode: 

° Is CONSTRAINED UNPREDICTABLE if it attempts to update the SPSR. 

° Otherwise, does not update any CPSR field that is accessible only at EL1 or higher.

An MSR (immediate) executed in System mode is CONSTRAINED UNPREDICTABLE if it attempts to update the SPSR. 


The CPSR.E bit is writable from any mode using an MSR instruction. ARM deprecates using this to change its value. 

A1

Bits:    31-28   27-23  22  21-20  19-16  15-12         11-0
Fields:  !=1111  00110  R   10     mask   (1)(1)(1)(1)  imm12
         cond

A1 variant

Applies when !(R == 0 && mask == 0000).

MSR{<c>}{<q>} <spec_reg>, #<imm> 

Decode for this encoding 

if mask == '0000' && R == '0' then SEE "Related encodings";
imm32 = A32ExpandImm(imm12); write_spsr = (R == '1');
if mask == '0000' then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 
If mask == '0000' && R == '1', then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related encodings: Move Special Register & Hints (immediate) on page F4-2516. 


Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<spec_reg> Is one of:
• APSR_<bits>.
• CPSR_<fields>.
• SPSR_<fields>.


For CPSR and SPSR, <fields> is a sequence of one or more of the following: 


c mask<0> = '1' to enable writing of bits<7:0> of the destination PSR.
x mask<1> = '1' to enable writing of bits<15:8> of the destination PSR.
s mask<2> = '1' to enable writing of bits<23:16> of the destination PSR.
f mask<3> = '1' to enable writing of bits<31:24> of the destination PSR.


For APSR, <bits> is one of nzcvq, g, or nzcvqg. These map to the following CPSR_<fields> values: 
• APSR_nzcvq is the same as CPSR_f (mask == '1000').
• APSR_g is the same as CPSR_s (mask == '0100').
• APSR_nzcvqg is the same as CPSR_fs (mask == '1100').


ARM recommends the APSR_<bits> forms when only the N, Z, C, V, Q, and GE[3:0] bits are being 
written. For more information, see The Application Program Status Register, APSR on 
page E1-2296. 


<imm> Is an immediate value. See Modified immediate constants in A32 instructions on page F2-2422 for 
the range of values. 
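
The mapping from <spec_reg> to the mask field described above can be sketched in Python. The function names psr_field_mask and byte_lane_mask are hypothetical; the sketch models only the assembler-level field selection and the byte lanes each mask bit enables, not the privilege-dependent filtering that the architecture applies when the CPSR or SPSR is actually written.

```python
FIELD_BITS = {"c": 0, "x": 1, "s": 2, "f": 3}          # letter -> mask bit
APSR_ALIASES = {"nzcvq": "f", "g": "s", "nzcvqg": "fs"}  # APSR_<bits> -> <fields>


def psr_field_mask(spec_reg: str) -> int:
    """Return the 4-bit mask value for e.g. 'CPSR_fc' or 'APSR_nzcvq'."""
    name, sep, suffix = spec_reg.partition("_")
    name = name.upper()
    if not sep or not suffix:
        raise ValueError("expected <PSR>_<fields>")
    if name == "APSR":
        suffix = APSR_ALIASES[suffix]
    elif name not in ("CPSR", "SPSR"):
        raise ValueError("unknown special register: " + name)
    mask = 0
    for letter in suffix:
        mask |= 1 << FIELD_BITS[letter]
    return mask


def byte_lane_mask(mask: int) -> int:
    """Expand the 4-bit mask into the 32-bit set of PSR bits it enables."""
    lanes = 0
    for lane in range(4):
        if mask & (1 << lane):
            lanes |= 0xFF << (8 * lane)
    return lanes
```

For instance, CPSR_fc selects mask '1001', which enables writes to bits<31:24> and bits<7:0> of the destination PSR.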


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    if write_spsr then
        if PSTATE.M IN {M32_User,M32_System} then
            UNPREDICTABLE;
        else
            SPSRWriteByInstr(imm32, mask);
    else
        // Attempts to change to an illegal mode will invoke the Illegal Execution state mechanism
        CPSRWriteByInstr(imm32, mask);


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.M IN {M32_User,M32_System} && write_spsr, then one of the following behaviors must occur: 





• The instruction is UNDEFINED.
• The instruction executes as NOP.


F5.1.117 MSR (register) 


Move general-purpose register to Special register moves selected bits of a general-purpose register to the APSR (see
The Application Program Status Register, APSR on page E1-2296), CPSR, or SPSR_<current_mode>.


Because of the Do-Not-Modify nature of its reserved bits, a read-modify-write sequence is normally required when 
the MSR instruction is being used at Application level and its destination is not APSR_nzcvq (CPSR_f). 


If an MSR (register) moves selected bits of a general-purpose register to the CPSR, the PE checks whether the value being
written to PSTATE.M is legal. See Illegal changes to PSTATE.M on page G1-3809. 


An MSR (register) executed in User mode: 

° Is UNPREDICTABLE if it attempts to update the SPSR. 

° Otherwise, does not update any CPSR field that is accessible only at EL1 or higher. 

An MSR (register) executed in System mode is UNPREDICTABLE if it attempts to update the SPSR. 


The CPSR.E bit is writable from any mode using an MSR instruction. ARM deprecates using this to change its value. 


A1

Bits:    31-28   27-23  22  21-20  19-16  15-12         11-10   9  8    7-4   3-0
Fields:  !=1111  00010  R   10     mask   (1)(1)(1)(1)  (0)(0)  0  (0)  0000  Rn
         cond

A1 variant

MSR{<c>}{<q>} <spec_reg>, <Rn> 
Decode for this encoding 

n = UInt(Rn); write_spsr = (R == '1');
if mask == '0000' then UNPREDICTABLE;
if n == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If mask == '0000', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.

T1

Bits (first halfword):    15-11  10  9-5    4  3-0
Fields:                   11110  0   11100  R  Rn
Bits (second halfword):   15-14  13   12  11-8  7-6     5  4    3-0
Fields:                   10     (0)  0   mask  (0)(0)  0  (0)  (0)(0)(0)(0)


T1 variant 


MSR{<c>}{<q>} <spec_reg>, <Rn> 


Decode for this encoding 


n = UInt(Rn); write_spsr = (R == '1'); 
if mask == '0000' then UNPREDICTABLE;
if n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If mask == '0000', then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<spec_reg> Is one of: 


• APSR_<bits>.
• CPSR_<fields>.
• SPSR_<fields>.


For CPSR and SPSR, <fields> is a sequence of one or more of the following: 


c mask<0> = '1' to enable writing of bits<7:0> of the destination PSR.
x mask<1> = '1' to enable writing of bits<15:8> of the destination PSR.
s mask<2> = '1' to enable writing of bits<23:16> of the destination PSR.
f mask<3> = '1' to enable writing of bits<31:24> of the destination PSR.


For APSR, <bits> is one of nzcvq, g, or nzcvqg. These map to the following CPSR_<fields> values: 
• APSR_nzcvq is the same as CPSR_f (mask == '1000').
• APSR_g is the same as CPSR_s (mask == '0100').
• APSR_nzcvqg is the same as CPSR_fs (mask == '1100').


ARM recommends the APSR_<bits> forms when only the N, Z, C, V, Q, and GE[3:0] bits are being 
written. For more information, see The Application Program Status Register, APSR on 
page E1-2296. 


<Rn> Is the general-purpose source register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if write_spsr then
        if PSTATE.M IN {M32_User,M32_System} then
            UNPREDICTABLE;
        else
            SPSRWriteByInstr(R[n], mask);
    else
        // Attempts to change to an illegal mode will invoke the Illegal Execution state mechanism
        CPSRWriteByInstr(R[n], mask);


CONSTRAINED UNPREDICTABLE behavior 
If write_spsr && PSTATE.M IN {M32_User,M32_System}, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.


F5.1.118 MUL, MULS 


Multiply multiplies two register values. The least significant 32 bits of the result are written to the destination 
register. These 32 bits do not depend on whether the source register values are considered to be signed values or 
unsigned values. 


Optionally, it can update the condition flags based on the result. In the T32 instruction set, this option is limited to 
only a few forms of the instruction. Use of this option adversely affects performance on many implementations. 


A1

Bits:    31-28   27-21    20  19-16  15-12         11-8  7-4   3-0
Fields:  !=1111  0000000  S   Rd     (0)(0)(0)(0)  Rm    1001  Rn
         cond


Flag setting variant 
Applies when S == 1.


MULS{<c>}{<q>} <Rd>, <Rn>{, <Rm>} 


Not flag setting variant 
Applies when S == 0.


MUL{<c>}{<q>} <Rd>, <Rn>{, <Rm>} 


Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1

Bits:    15-10   9-6   5-3  2-0
Fields:  010000  1101  Rn   Rdm


T1 variant 


MUL<c>{<q>} <Rdm>, <Rn>{, <Rdm>} // Inside IT block 
MULS{<q>} <Rdm>, <Rn>{, <Rdm>} // Outside IT block 


Decode for this encoding 
d = UInt(Rdm); n = UInt(Rn); m = UInt(Rdm); setflags = !InITBlock();


T2

Bits (first halfword):    15-7       6-4  3-0
Fields:                   111110110  000  Rn
Bits (second halfword):   15-12  11-8  7-4   3-0
Fields:                   1111   Rd    0000  Rm


T2 variant 


MUL<c>.W <Rd>, <Rn>{, <Rm>} // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
MUL{<c>}{<q>} <Rd>, <Rn>{, <Rm>} 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = FALSE;

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rdm> Is the second general-purpose source register holding the multiplier and the destination register,
encoded in the "Rdm" field.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. If
omitted, <Rd> is used.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = SInt(R[n]); // operand1 = UInt(R[n]) produces the same final results
    operand2 = SInt(R[m]); // operand2 = UInt(R[m]) produces the same final results
    result = operand1 * operand2;
    R[d] = result<31:0>;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result<31:0>);
        // PSTATE.C, PSTATE.V unchanged
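
The comments in the operation state that interpreting the operands as signed or unsigned produces the same final result. A minimal Python sketch, using the hypothetical helper names to_signed32 and mul32, illustrates why: the low 32 bits of the product are identical under either interpretation.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def mul32(rn: int, rm: int, setflags: bool = False):
    """Model MUL/MULS on 32-bit register patterns (hypothetical helper).

    Returns (result, flags), where flags is None unless setflags is True.
    """
    signed_product = to_signed32(rn & MASK32) * to_signed32(rm & MASK32)
    unsigned_product = (rn & MASK32) * (rm & MASK32)
    result = signed_product & MASK32
    # The low 32 bits agree whichever interpretation is used.
    assert result == unsigned_product & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0)} if setflags else None
    return result, flags
```

Only N and Z are modelled because the operation leaves PSTATE.C and PSTATE.V unchanged.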





F5.1.119 MVN, MVNS (immediate) 
Bitwise NOT (immediate) writes the bitwise inverse of an immediate value to the destination register. 
If the destination register is not the PC, the MVNS variant of the instruction updates the condition flags based on 
the result. 
The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 
• The MVN variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.
• The MVNS variant of the instruction performs an exception return without the use of the stack. In this case:
    - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
      AArch32 state on page G1-3835.
    - The instruction is UNDEFINED in Hyp mode.
    - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.
A1

Bits:    31-28   27-21    20  19-16         15-12  11-0
Fields:  !=1111  0011111  S   (0)(0)(0)(0)  Rd     imm12
         cond

MVN variant 
Applies when S == 0. 
MVN{<c>}{<q>} <Rd>, #<const> 
MVNS variant 
Applies when S == 1. 
MVNS{<c>}{<q>} <Rd>, #<const> 
Decode for all variants of this encoding 
d = UInt(Rd); setflags = (S == '1'); 
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 
T1 
Bits (first halfword):    15-11  10  9  8-5   4  3-0
Fields:                   11110  i   0  0011  S  1111
Bits (second halfword):   15  14-12  11-8  7-0
Fields:                   0   imm3   Rd    imm8

MVN variant 
Applies when S == 0.
MVN{<c>}{<q>} <Rd>, #<const> 
MVNS variant 
Applies when S == 1.
MVNS{<c>}{<q>} <Rd>, #<const> 
Decode for all variants of this encoding


d = UInt(Rd); setflags = (S == '1'); 
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C); 
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used:
• For the MVN variant, the instruction is a branch to the address calculated by the operation.
This is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.
• For the MVNS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.
For encoding T1: is the general-purpose destination register, encoded in the "Rd" field.
<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.
For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on
page F2-2420 for the range of values.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = NOT(imm32);
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
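
As an illustration of the A1 encoding path, the following Python sketch models the immediate expansion and the flag-setting result. The expansion rule used here (an 8-bit value rotated right by twice the 4-bit rotation field, with the carry taken from bit 31 of the rotated value when the rotation is nonzero) is stated as an assumption recalled from the shared pseudocode for A32ExpandImm_C(), which is not reproduced in this section. The function names are hypothetical, and the PC-destination cases are not modelled.

```python
MASK32 = 0xFFFFFFFF


def a32_expand_imm_c(imm12: int, carry_in: int):
    """Assumed model of A32ExpandImm_C(): returns (imm32, carry_out)."""
    unrotated = imm12 & 0xFF
    amount = 2 * ((imm12 >> 8) & 0xF)
    if amount == 0:
        return unrotated, carry_in          # no rotation: carry is unchanged
    imm32 = ((unrotated >> amount) | (unrotated << (32 - amount))) & MASK32
    return imm32, imm32 >> 31               # carry out is bit 31 of the result


def mvn_immediate(imm12: int, carry_in: int = 0):
    """Model MVNS <Rd>, #<const> for Rd != PC: returns (result, flags)."""
    imm32, carry = a32_expand_imm_c(imm12, carry_in)
    result = ~imm32 & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags
```

With imm12 = 0x4FF, for example, the expanded constant is 0xFF000000 and the value written to <Rd> is 0x00FFFFFF.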
F5.1.120 MVN, MVNS (register)
Bitwise NOT (register) writes the bitwise inverse of a register value to the destination register. 


If the destination register is not the PC, the MVNS variant of the instruction updates the condition flags based on 
the result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The MVN variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.
• The MVNS variant of the instruction performs an exception return without the use of the stack. In this case:
    - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
      AArch32 state on page G1-3835.
    - The instruction is UNDEFINED in Hyp mode.
    - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1

Bits:    31-28   27-21    20  19-16         15-12  11-7  6-5   4  3-0
Fields:  !=1111  0001111  S   (0)(0)(0)(0)  Rd     imm5  type  0  Rm
         cond


MVN, rotate right with extend variant 
Applies when S == 0 && imm5 == 00000 && type == 11. 


MVN{<c>}{<q>} <Rd>, <Rm>, RRX 


MVN, shift or rotate by value variant 
Applies when S == 0 && !(imm5 == 00000 && type == 11). 


MVN{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


MVNS, rotate right with extend variant 
Applies when S == 1 && imm5 == 00000 && type == 11. 


MVNS{<c>}{<q>} <Rd>, <Rm>, RRX 

MVNS, shift or rotate by value variant 

Applies when S == 1 && !(imm5 == 00000 && type == 11).
MVNS{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 
d = UInt(Rd); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1

Bits:    15-10   9-6   5-3  2-0
Fields:  010000  1111  Rm   Rd

T1 variant


MVN<c>{<q>} <Rd>, <Rm> // Inside IT block 
MVNS{<q>} <Rd>, <Rm> // Outside IT block 


Decode for this encoding 
d = UInt(Rd); m= UInt(Rm); setflags = !InITBlock(); 
(shift_t, shift_n) = (SRType_LSL, Q); 


T2

| 15-9          | 8-5     | 4 | 3-0     | 15  | 14-12 | 11-8 | 7-6  | 5-4  | 3-0 |
| 1 1 1 0 1 0 1 | 0 0 1 1 | S | 1 1 1 1 | (0) | imm3  | Rd   | imm2 | type | Rm  |

MVN, rotate right with extend variant
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


MVN{<c>}{<q>} <Rd>, <Rm>, RRX 


MVN, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).


MVN<c>.W <Rd>, <Rm> // Inside IT block, and <Rd>, <Rm> can be represented in T1 
MVN{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


MVNS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11.


MVNS{<c>}{<q>} <Rd>, <Rm>, RRX 


MVNS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11).


MVNS.W <Rd>, <Rm> // Outside IT block, and <Rd>, <Rm> can be represented in T1 
MVNS{<c>}{<q>} <Rd>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding

d = UInt(Rd); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used:
• For the MVN variant, the instruction is a branch to the address calculated by the operation.
This is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.
• For the MVNS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.
For encoding T1 and T2: is the general-purpose destination register, encoded in the "Rd" field.

<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.
For encoding T1 and T2: is the general-purpose source register, encoded in the "Rm" field.

<shift> Is the type of shift to be applied to the source register, encoded in the "type" field. It can have the
following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = NOT(shifted);
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
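
The data path of the operation above (shift with carry, bitwise NOT, flag update) can be sketched in Python as follows. This is an illustrative model only, not the architectural pseudocode: the names shift_c and mvn_register are hypothetical, registers are modelled as plain 32-bit integers, and the d == 15 cases (ALUWritePC and ALUExceptionReturn) are deliberately left out.

```python
MASK32 = 0xFFFFFFFF

def shift_c(value, shift_type, amount, carry_in):
    """Illustrative model of a shift with carry-out on a 32-bit value."""
    if shift_type == "RRX":
        # Rotate right by one bit through the carry flag.
        return ((carry_in << 31) | (value >> 1)) & MASK32, value & 1
    if amount == 0:
        # A zero shift leaves both the value and the carry unchanged.
        return value, carry_in
    if shift_type == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_type == "LSR":
        return (value >> amount) & MASK32, (value >> (amount - 1)) & 1
    if shift_type == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_type == "ROR":
        amount %= 32
        result = ((value >> amount) | (value << (32 - amount))) & MASK32
        return result, result >> 31
    raise ValueError(f"unknown shift type: {shift_type}")

def mvn_register(rm, shift_type, amount, carry_in, setflags):
    """Return (result, flags); flags is None when setflags is False."""
    shifted, carry = shift_c(rm, shift_type, amount, carry_in)
    result = ~shifted & MASK32
    flags = None
    if setflags:
        # PSTATE.V is not modelled because the instruction leaves it unchanged.
        flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags

# MVNS R0, R1, LSL #4 with R1 = 0xFF and the carry flag initially clear
print(mvn_register(0x000000FF, "LSL", 4, 0, True))
```

Note that with a zero shift amount the carry flag written by the flag-setting variant is simply the incoming carry, which is why the model threads carry_in through shift_c.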





F5.1.121 MVN, MVNS (register-shifted register) 
Bitwise NOT (register-shifted register) writes the bitwise inverse of a register-shifted register value to the 
destination register. It can optionally update the condition flags based on the result. 
A1
| 31-28  | 27-23     | 22-21 | 20 | 19-16        | 15-12 | 11-8 | 7 | 6-5  | 4 | 3-0 |
| !=1111 | 0 0 0 1 1 | 1 1   | S  | (0)(0)(0)(0) | Rd    | Rs   | 0 | type | 1 | Rm  |
  cond
Flag setting variant 
Applies when S == 1. 
MVNS{<c>}{<q>} <Rd>, <Rm>, <type> <Rs> 
Not flag setting variant 
Applies when S == 0. 
MVN{<c>}{<q>} <Rd>, <Rm>, <type> <Rs> 
Decode for all variants of this encoding 
d = UInt(Rd); m= UInt(Rm); s = UInt(Rs); 
setflags = (S == '1'); shift_t = DecodeRegShift(type) ; 
if d == 15 || m == 15 || s == 15 then UNPREDICTABLE; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> Is the general-purpose source register, encoded in the "Rm" field. 
<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<Rs> Is the general-purpose source register holding a shift amount in its bottom 8 bits, encoded in the "Rs" 
field. 
Operation 
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = NOT(shifted);
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
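
The distinguishing step here is that the shift amount comes from the bottom 8 bits of Rs, so amounts from 0 to 255 are possible. The following Python sketch illustrates this for the LSL shift type only; the function name mvn_lsl_by_register is hypothetical, and the other shift types are omitted for brevity.

```python
MASK32 = 0xFFFFFFFF

def mvn_lsl_by_register(rm, rs, carry_in):
    """Model MVN Rd, Rm, LSL Rs; returns (result, carry_out)."""
    shift_n = rs & 0xFF  # only R[s]<7:0> is used as the shift amount
    if shift_n == 0:
        shifted, carry = rm, carry_in
    else:
        extended = rm << shift_n
        # Bit 32 of the extended value is the last bit shifted out,
        # and is 0 once the shift amount exceeds 32.
        shifted, carry = extended & MASK32, (extended >> 32) & 1
    return ~shifted & MASK32, carry

# Upper bits of Rs are ignored: 0x104 behaves like a shift by 4.
print(hex(mvn_lsl_by_register(0x1, 0x104, 0)[0]))
```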











F5.1.122 NOP 
No Operation does nothing. This instruction can be used for instruction alignment purposes. 
Note 
The timing effects of including a NOP instruction in a program are not guaranteed. It can increase execution time, 
leave it unchanged, or even reduce it. Therefore, NOP instructions are not suitable for timing loops. 
A1
| 31-28  | 27-23     | 22 | 21-20 | 19-16   | 15-12        | 11-8         | 7-0             |
| !=1111 | 0 0 1 1 0 | 0  | 1 0   | 0 0 0 0 | (1)(1)(1)(1) | (0)(0)(0)(0) | 0 0 0 0 0 0 0 0 |
  cond
A1 variant
NOP{<c>}{<q>} 
Decode for this encoding 
// No additional decoding required 
T1
| 15-8            | 7-4     | 3-0     |
| 1 0 1 1 1 1 1 1 | 0 0 0 0 | 0 0 0 0 |
T1 variant 
NOP{<c>}{<q>} 
Decode for this encoding 
// No additional decoding required 
T2
| 15-11     | 10 | 9-7   | 6-5 | 4 | 3-0          | 15-14 | 13  | 12 | 11  | 10-8  | 7-0             |
| 1 1 1 1 0 | 0  | 1 1 1 | 0 1 | 0 | (1)(1)(1)(1) | 1 0   | (0) | 0  | (0) | 0 0 0 | 0 0 0 0 0 0 0 0 |
T2 variant 
NOP{<c>}.W 
Decode for this encoding 
// No additional decoding required 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols
<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    // Do nothing


F5.1.123 ORN, ORNS (immediate) 


Bitwise OR NOT (immediate) performs a bitwise (inclusive) OR of a register value and the complement of an
immediate value, and writes the result to the destination register. It can optionally update the condition flags based
on the result.

T1

| 15-11     | 10 | 9 | 8-5     | 4 | 3-0    | 15 | 14-12 | 11-8 | 7-0  |
| 1 1 1 1 0 | i  | 0 | 0 0 1 1 | S | !=1111 | 0  | imm3  | Rd   | imm8 |
                                     Rn


Flag setting variant 
Applies when S == 1. 


ORNS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Not flag setting variant 
Applies when S == 0.


ORN{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Decode for all variants of this encoding

if Rn == '1111' then SEE MVN (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C);
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. If omitted, this register is the
same as <Rn>.
<Rn> Is the general-purpose source register, encoded in the "Rn" field.
<const> An immediate value. See Modified immediate constants in T32 instructions on page F2-2420 for the
range of values.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] OR NOT(imm32);
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
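
As a small worked example of the OR NOT operation, the Python sketch below computes the result and the N and Z flags for an already-expanded immediate. The function name orn_immediate is hypothetical, and the sketch assumes imm32 has already been produced by the modified immediate expansion, so the carry flag is not modelled.

```python
MASK32 = 0xFFFFFFFF

def orn_immediate(rn, imm32):
    """Return (result, n_flag, z_flag) for result = Rn OR NOT(imm32)."""
    result = (rn | (~imm32 & MASK32)) & MASK32
    return result, result >> 31, int(result == 0)

# ORN R0, R1, #0xFF with R1 = 0x12: every bit outside the low byte is set.
print(hex(orn_immediate(0x12, 0xFF)[0]))
```

One consequence visible in the sketch is that the result can only be zero when Rn is zero and the immediate is all ones.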


F5.1.124 ORN, ORNS (register) 


Bitwise OR NOT (register) performs a bitwise (inclusive) OR of a register value and the complement of an 
optionally-shifted register value, and writes the result to the destination register. It can optionally update the 
condition flags based on the result. 


T1

| 15-9          | 8-5     | 4 | 3-0    | 15  | 14-12 | 11-8 | 7-6  | 5-4  | 3-0 |
| 1 1 1 0 1 0 1 | 0 0 1 1 | S | !=1111 | (0) | imm3  | Rd   | imm2 | type | Rm  |
                                Rn





ORN, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11. 


ORN{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ORN, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11). 


ORN{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ORNS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11. 


ORNS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ORNS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11). 


ORNS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding

if Rn == '1111' then SEE MVN (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register, encoded in the "Rm" field.
<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11
<amount> Is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 (when <shift> =
LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] OR NOT(shifted);
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
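
The variant conditions and the <amount> description above imply how the "type" and immediate fields map onto a shift type and amount: an immediate of zero means 32 for LSR and ASR, and selects RRX when type is 11. A minimal Python sketch of that mapping follows. The function name decode_imm_shift is hypothetical and is meant only to illustrate the behavior described for DecodeImmShift, with the T32 fields imm3:imm2 concatenated into a 5-bit value by the caller.

```python
def decode_imm_shift(type_bits, imm5):
    """Map a 2-bit shift type and a 5-bit immediate to (shift_type, amount)."""
    if type_bits == 0b00:
        return ("LSL", imm5)
    if type_bits == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b11:
        # imm5 == 0 with type == 11 is the rotate right with extend variant.
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("type_bits must be a 2-bit value")

# T32 encodings split the amount into imm3:imm2.
imm3, imm2 = 0b001, 0b10
print(decode_imm_shift(0b10, (imm3 << 2) | imm2))
```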
F5.1.125 ORR, ORRS (immediate)


Bitwise OR (immediate) performs a bitwise (inclusive) OR of a register value and an immediate value, and writes 
the result to the destination register. 


If the destination register is not the PC, the ORRS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

• The ORR variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.

• The ORRS variant of the instruction performs an exception return without the use of the stack. In this case:
    - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
      AArch32 state on page G1-3835.
    - The instruction is UNDEFINED in Hyp mode.
    - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1

| 31-28  | 27-23     | 22-21 | 20 | 19-16 | 15-12 | 11-0  |
| !=1111 | 0 0 1 1 1 | 0 0   | S  | Rn    | Rd    | imm12 |
  cond

ORR variant 

Applies when S == 0. 


ORR{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


ORRS variant 


Applies when S == 1. 


ORRS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 


T1

| 15-11     | 10 | 9 | 8-5     | 4 | 3-0    | 15 | 14-12 | 11-8 | 7-0  |
| 1 1 1 1 0 | i  | 0 | 0 0 1 0 | S | !=1111 | 0  | imm3  | Rd   | imm8 |
                                     Rn

ORR variant

Applies when S == 0.


ORR{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


ORRS variant 


Applies when S == 1. 
ORRS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding

if Rn == '1111' then SEE MOV (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1');
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C);
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:
• For the ORR variant, the instruction is a branch to the address calculated by the operation.
This is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.
• For the ORRS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.
For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.

<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
used, but this is deprecated.
For encoding T1: is the general-purpose source register, encoded in the "Rn" field.

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.
For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on
page F2-2420 for the range of values.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] OR imm32;
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
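
For the A1 encoding, the carry value used above comes from expanding the 12-bit modified immediate. The Python sketch below assumes the usual A32 modified immediate scheme, in which the low 8 bits are rotated right by twice the value of the top 4 bits, and shows how that expansion feeds the OR. The function names a32_expand_imm_c and orr_immediate are hypothetical, and the d == 15 cases are not modelled.

```python
MASK32 = 0xFFFFFFFF

def a32_expand_imm_c(imm12, carry_in):
    """Expand a 12-bit A32 modified immediate; returns (imm32, carry_out)."""
    imm8 = imm12 & 0xFF
    rotation = 2 * ((imm12 >> 8) & 0xF)
    if rotation == 0:
        # No rotation: the carry flag passes through unchanged.
        return imm8, carry_in
    imm32 = ((imm8 >> rotation) | (imm8 << (32 - rotation))) & MASK32
    return imm32, imm32 >> 31

def orr_immediate(rn, imm12, carry_in):
    """Return (result, flags) as the flag-setting ORRS variant would compute them."""
    imm32, carry = a32_expand_imm_c(imm12, carry_in)
    result = (rn | imm32) & MASK32
    return result, {"N": result >> 31, "Z": int(result == 0), "C": carry}

# imm12 = 0x4FF encodes 0xFF rotated right by 8, that is 0xFF000000.
print(hex(a32_expand_imm_c(0x4FF, 0)[0]))
```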
F5.1.126 ORR, ORRS (register)


Bitwise OR (register) performs a bitwise (inclusive) OR of a register value and an optionally-shifted register value, 
and writes the result to the destination register. 


If the destination register is not the PC, the ORRS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

• The ORR variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.

• The ORRS variant of the instruction performs an exception return without the use of the stack. In this case:
    - The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    - The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
      AArch32 state on page G1-3835.
    - The instruction is UNDEFINED in Hyp mode.
    - The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1

| 31-28  | 27-23     | 22-21 | 20 | 19-16 | 15-12 | 11-7 | 6-5  | 4 | 3-0 |
| !=1111 | 0 0 0 1 1 | 0 0   | S  | Rn    | Rd    | imm5 | type | 0 | Rm  |
  cond


ORR, rotate right with extend variant 


Applies when S == 0 && imm5 == 00000 && type == 11. 


ORR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ORR, shift or rotate by value variant 


Applies when S == 0 && !(imm5 == 00000 && type == 11). 


ORR{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ORRS, rotate right with extend variant 


Applies when S == 1 && imm5 == 00000 && type == 11. 


ORRS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ORRS, shift or rotate by value variant 


Applies when S == 1 && !(imm5 == 00000 && type == 11). 


ORRS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);

T1

| 15-10       | 9-6     | 5-3 | 2-0 |
| 0 1 0 0 0 0 | 1 1 0 0 | Rm  | Rdn |

T1 variant


ORR<c>{<q>} {<Rdn>,} <Rdn>, <Rm> // Inside IT block 
ORRS{<q>} {<Rdn>,} <Rdn>, <Rm> // Outside IT block 


Decode for this encoding 


d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0); 


T2

| 15-9          | 8-5     | 4 | 3-0    | 15  | 14-12 | 11-8 | 7-6  | 5-4  | 3-0 |
| 1 1 1 0 1 0 1 | 0 0 1 0 | S | !=1111 | (0) | imm3  | Rd   | imm2 | type | Rm  |
                                Rn

ORR, rotate right with extend variant
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.


ORR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ORR, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11). 


ORR<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
ORR{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


ORRS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11. 


ORRS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


ORRS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11). 


ORRS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
ORRS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding

if Rn == '1111' then SEE "Related encodings";
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Related encodings: Data-processing (shifted register) on page F3-2470

Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdn> Is the first general-purpose source register and the destination register, encoded in the "Rdn" field. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the 
PC is used: 
° For the ORR variant, the instruction is a branch to the address calculated by the operation. 


This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 


° For the ORRS variant, the instruction performs an exception return, that restores PSTATE 
from SPSR_<current_mode>. 


For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the same as <Rn>. 


<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can 
be used, but this is deprecated. 
For encoding T2: is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC 
can be used, but this is deprecated. 


For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field. 


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.

In T32 assembly:

• Outside an IT block, if ORRS <Rd>, <Rn>, <Rd> is written with <Rd> and <Rn> both in the range R0-R7, it is
assembled using encoding T1 as though ORRS <Rd>, <Rn> had been written.

• Inside an IT block, if ORR<c> <Rd>, <Rn>, <Rd> is written with <Rd> and <Rn> both in the range R0-R7, it is
assembled using encoding T1 as though ORR<c> <Rd>, <Rn> had been written.

To prevent either of these happening, use the .W qualifier.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] OR shifted;
    if d == 15 then          // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.N = result<31>;
            PSTATE.Z = IsZeroBit(result);
            PSTATE.C = carry;
            // PSTATE.V unchanged
F5.1.127 ORR, ORRS (register-shifted register)


Bitwise OR (register-shifted register) performs a bitwise (inclusive) OR of a register value and a register-shifted 
register value, and writes the result to the destination register. It can optionally update the condition flags based on 
the result. 


A1

| 31-28  | 27-23     | 22-21 | 20 | 19-16 | 15-12 | 11-8 | 7 | 6-5  | 4 | 3-0 |
| !=1111 | 0 0 0 1 1 | 0 0   | S  | Rn    | Rd    | Rs   | 0 | type | 1 | Rm  |
  cond


Flag setting variant 
Applies when S == 1. 


ORRS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


ORR{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the general-purpose source register holding a shift amount in its bottom 8 bits, encoded in the "Rs" 
field. 
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] OR shifted;
    R[d] = result;
    if setflags then
        PSTATE.N = result<31>;
        PSTATE.Z = IsZeroBit(result);
        PSTATE.C = carry;
        // PSTATE.V unchanged
F5.1.128 PKHBT, PKHTB


Pack Halfword combines one halfword of its first operand with the other halfword of its shifted second operand. 


A1

| 31-28  | 27-20           | 19-16 | 15-12 | 11-7 | 6  | 5-4 | 3-0 |
| !=1111 | 0 1 1 0 1 0 0 0 | Rn    | Rd    | imm5 | tb | 0 1 | Rm  |
  cond

PKHBT variant 

Applies when tb == 0. 


PKHBT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, LSL #<imm>} 


PKHTB variant 
Applies when tb == 1. 


PKHTB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ASR #<imm>} 


Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); tbform = (tb == '1');
(shift_t, shift_n) = DecodeImmShift(tb:'0', imm5);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1

| 15-9          | 8-5     | 4 | 3-0 | 15  | 14-12 | 11-8 | 7-6  | 5  | 4 | 3-0 |
| 1 1 1 0 1 0 1 | 0 1 1 0 | 0 | Rn  | (0) | imm3  | Rd   | imm2 | tb | 0 | Rm  |
                            S                                        T


PKHBT variant 
Applies when tb == 0. 


PKHBT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, LSL #<imm>} // tbform == FALSE 


PKHTB variant 
Applies when tb == 1. 


PKHTB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ASR #<imm>} // tbform == TRUE 


Decode for all variants of this encoding

if S == '1' || T == '1' then UNDEFINED;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); tbform = (tb == '1');
(shift_t, shift_n) = DecodeImmShift(tb:'0', imm3:imm2);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rd> Is the general-purpose destination register, encoded in the "Rd" field.

<Rn> Is the first general-purpose source register, encoded in the "Rn" field.

<Rm> Is the second general-purpose source register, encoded in the "Rm" field.

<imm> For encoding A1: the shift to apply to the value read from <Rm>, encoded in imm5. For PKHBT,
it is one of:
omitted No shift, encoded as 0b00000.
1-31 Left shift by specified number of bits, encoded as a binary number.
For PKHTB, it is one of:
omitted Instruction is a pseudo-instruction and is assembled as though PKHBT{<c>}{<q>} <Rd>,
<Rm>, <Rn> had been written.
1-32 Arithmetic right shift by specified number of bits. A shift by 32 bits is encoded as
0b00000. Other shift amounts are encoded as binary numbers.

Note
An assembler can permit <imm> = 0 to mean the same thing as omitting the shift, but this is not
standard UAL and must not be used for disassembly.

For encoding T1: the shift to apply to the value read from <Rm>, encoded in imm3:imm2. For PKHBT,
it is one of:
omitted No shift, encoded as 0b00000.
1-31 Left shift by specified number of bits, encoded as a binary number.
For PKHTB, it is one of:
omitted Instruction is a pseudo-instruction and is assembled as though PKHBT{<c>}{<q>} <Rd>,
<Rm>, <Rn> had been written.
1-32 Arithmetic right shift by specified number of bits. A shift by 32 bits is encoded as
0b00000. Other shift amounts are encoded as binary numbers.

Note
An assembler can permit <imm> = 0 to mean the same thing as omitting the shift, but this is not
standard UAL and must not be used for disassembly.





Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = Shift(R[m], shift_t, shift_n, PSTATE.C); // PSTATE.C ignored
    R[d]<15:0> = if tbform then operand2<15:0> else R[n]<15:0>;
    R[d]<31:16> = if tbform then R[n]<31:16> else operand2<31:16>;
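
The halfword selection above can be illustrated with a short Python sketch. The function names pkhbt and pkhtb are hypothetical, register values are modelled as 32-bit integers, and the shift amounts are assumed to be already decoded (0 to 31 for LSL, 1 to 32 for ASR).

```python
MASK32 = 0xFFFFFFFF

def pkhbt(rn, rm, lsl=0):
    """Bottom halfword from Rn, top halfword from (Rm LSL lsl)."""
    operand2 = (rm << lsl) & MASK32
    return (operand2 & 0xFFFF0000) | (rn & 0x0000FFFF)

def pkhtb(rn, rm, asr):
    """Top halfword from Rn, bottom halfword from (Rm ASR asr)."""
    signed = rm - (1 << 32) if rm & 0x80000000 else rm
    operand2 = (signed >> asr) & MASK32
    return (rn & 0xFFFF0000) | (operand2 & 0x0000FFFF)

# Combine the bottom halfword of one value with the top halfword of another.
print(hex(pkhbt(0x1111AAAA, 0x2222BBBB)))
# Move the top halfword of Rm into the bottom halfword of the result.
print(hex(pkhtb(0x1111AAAA, 0x2222BBBB, 16)))
```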
F5.1.129 PLD, PLDW (immediate)


Preload Data (immediate) signals the memory system that data memory accesses from a specified address are likely 
in the near future. The memory system can respond by taking actions that are expected to speed up the memory 
accesses when they do occur, such as preloading the cache line containing the specified address into the data cache. 


The PLD instruction signals that the likely memory access is a read, and the PLDW instruction signals that it is a write. 
The effect of a PLD or PLDW instruction is IMPLEMENTATION DEFINED. For more information, see Preloading caches 
on page E2-2321. 


A1

| 31-28   | 27-24   | 23 | 22 | 21-20 | 19-16  | 15-12        | 11-0  |
| 1 1 1 1 | 0 1 0 1 | U  | R  | 0 1   | !=1111 | (1)(1)(1)(1) | imm12 |
                                        Rn

Preload read variant
Applies when R == 1.
PLD{<c>}{<q>} [<Rn> {, #{+/-}<imm>}]
Preload write variant
Applies when R == 0.
PLDW{<c>}{<q>} [<Rn> {, #{+/-}<imm>}]
Decode for all variants of this encoding

if Rn == '1111' then SEE PLD (literal);
n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = (U == '1'); is_pldw = (R == '0');


T1

| 15-11     | 10-9 | 8 | 7 | 6 | 5 | 4 | 3-0    | 15-12   | 11-0  |
| 1 1 1 1 1 | 0 0  | 0 | 1 | 0 | W | 1 | !=1111 | 1 1 1 1 | imm12 |
                                         Rn

Preload read variant
Applies when W == 0.
PLD{<c>}{<q>} [<Rn> {, #{+}<imm>}]
Preload write variant
Applies when W == 1.
PLDW{<c>}{<q>} [<Rn> {, #{+}<imm>}]
Decode for all variants of this encoding

if Rn == '1111' then SEE PLD (literal);
n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = TRUE; is_pldw = (W == '1');
T2

| 15-11     | 10-9 | 8 | 7 | 6 | 5 | 4 | 3-0    | 15-12   | 11-8    | 7-0  |
| 1 1 1 1 1 | 0 0  | 0 | 0 | 0 | W | 1 | !=1111 | 1 1 1 1 | 1 1 0 0 | imm8 |
                                         Rn


Preload read variant 
Applies when W == 0. 
PLD{<c>}{<q>} [<Rn> {, #-<imm>}] 
Preload write variant 
Applies when W == 1. 
PLDW{<c>}{<q>} [<Rn> {, #-<imm>}] 
Decode for all variants of this encoding 
if Rn == '1111' then SEE PLD (literal); 
n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); add = FALSE; is_pldw = (W == '1'); 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 


For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. If the PC is used, see PLD (literal). 
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0
+ when U = 1
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the optional 12-bit unsigned immediate byte offset, in the range 0 to 4095,
defaulting to 0 and encoded in the "imm12" field.


For encoding T1: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095, 
defaulting to 0 and encoded in the "imm12" field. 


For encoding T2: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 
if omitted, and encoded in the "imm8" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = if add then (R[n] + imm32) else (R[n] - imm32);
    if is_pldw then
        Hint_PreloadDataForWrite(address);
    else
        Hint_PreloadData(address);
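The decode and address calculation above can be sketched in Python. This is a minimal illustrative model, not part of the architecture definition: the function names are hypothetical, the fields are passed in as already-extracted integers, and the 32-bit wraparound of the address is an assumption based on R[n] being a 32-bit value.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as 32-bit unsigned values (assumption)


def decode_pld_imm_a1(u, r, rn, imm12):
    """Mirror the A1 decode: returns (n, imm32, add, is_pldw).

    u, r, rn and imm12 are the already-extracted instruction fields.
    """
    if rn == 0b1111:
        # The pseudocode redirects this case: SEE PLD (literal).
        raise ValueError("Rn == '1111': see PLD (literal)")
    n = rn
    imm32 = imm12 & 0xFFF      # ZeroExtend(imm12, 32)
    add = (u == 1)
    is_pldw = (r == 0)         # R == 0 selects PLDW, R == 1 selects PLD
    return n, imm32, add, is_pldw


def pld_hint_address(rn_value, imm32, add):
    """address = if add then (R[n] + imm32) else (R[n] - imm32)."""
    offset = imm32 if add else -imm32
    return (rn_value + offset) & MASK32


# Example: PLD [R2, #16] with R2 = 0x1000
n, imm32, add, is_pldw = decode_pld_imm_a1(u=1, r=1, rn=2, imm12=16)
example_address = pld_hint_address(0x1000, imm32, add)
```

The model only computes the hinted address. What the memory system does with the hint remains IMPLEMENTATION DEFINED.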
F5.1.130 PLD (literal)


Preload Data (literal) signals the memory system that data memory accesses from a specified address are likely in 
the near future. The memory system can respond by taking actions that are expected to speed up the memory 
accesses when they do occur, such as preloading the cache line containing the specified address into the data cache. 


The effect of a PLD instruction is IMPLEMENTATION DEFINED. For more information, see Preloading caches on 
page E2-2321. 
A1 


|31 30 29 28|27 26 25 24|23| 22|21 20|19 18 17 16|15 14 13 12 |11          0|
| 1  1  1  1| 0  1  0  1| U|(1)| 0  1| 1  1  1  1|(1)(1)(1)(1)|   imm12    |


A1 variant


PLD{<c>}{<q>} <label> // Normal form 
PLD{<c>}{<q>} [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 


imm32 = ZeroExtend(imm12, 32); add = (U == '1');
T1 


|15 14 13 12|11 10 9 8|7|6| 5 |4|3 2 1 0|15 14 13 12|11          0|
| 1  1  1  1| 1  0  0 0|U|0|(0)|1|1 1 1 1| 1  1  1  1|   imm12    |


T1 variant 


PLD{<c>}{<q>} <label> // Preferred syntax 
PLD{<c>}{<q>} [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 


imm32 = ZeroExtend(imm12, 32); add = (U == '1');


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<label> The label of the literal data item that is likely to be accessed in the near future. The assembler 
calculates the required value of the offset from the Align(PC, 4) value of the instruction to this label. 
The offset must be in the range -4095 to 4095. If the offset is zero or positive, imm32 is equal to the
offset and add == TRUE. If the offset is negative, imm32 is equal to minus the offset and add == FALSE. 
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in
the "imm12" field.


For encoding T1: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the 
"imm12" field. 


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    Hint_PreloadData(address);
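As an illustration of how a label resolves to the imm32 and add values used above, the following Python sketch follows the rules given for <label>. The function names are hypothetical, and the pc argument is assumed to be the PC value as read by the instruction, not necessarily the address of the instruction itself.

```python
MASK32 = 0xFFFFFFFF


def align(value, boundary):
    """Align(x, y): round x down to a multiple of y."""
    return value - (value % boundary)


def literal_fields(pc, label_address):
    """Return (imm32, add) for a label, following the <label> description."""
    offset = label_address - align(pc, 4)
    if not -4095 <= offset <= 4095:
        raise ValueError("offset out of range -4095 to 4095")
    if offset >= 0:
        return offset, True      # imm32 = offset, add == TRUE
    return -offset, False        # imm32 = minus the offset, add == FALSE


def literal_address(pc, imm32, add):
    """address = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32)."""
    base = align(pc, 4)
    return (base + imm32 if add else base - imm32) & MASK32


# Example: PC reads as 0x1006, label at 0x1000
imm32, add = literal_fields(0x1006, 0x1000)
round_trip = literal_address(0x1006, imm32, add)
```

The label form always yields add == TRUE for a zero offset. A subtraction of zero (add == FALSE with imm32 == 0) can only be requested through the alternative [PC, #-0] syntax, and it produces the same address.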
F5.1.131 PLD, PLDW (register)


Preload Data (register) signals the memory system that data memory accesses from a specified address are likely in 
the near future. The memory system can respond by taking actions that are expected to speed up the memory 
accesses when they do occur, such as preloading the cache line containing the specified address into the data cache. 


The PLD instruction signals that the likely memory access is a read, and the PLDW instruction signals that it is a write. 
The effect of a PLD or PLDW instruction is IMPLEMENTATION DEFINED. For more information, see Preloading caches 
on page E2-2321. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18 17 16|15 14 13 12 |11     7|6  5| 4|3 2 1 0|
| 1  1  1  1| 0  1  1  1| U| R| 0  1|     Rn    |(1)(1)(1)(1)|  imm5  |type| 0|   Rm  |


Preload read, optional shift or rotate variant 
Applies when R == 1 && !(imm5 == 00000 && type == 11).


PLD{<c>}{<q>} [<Rn>, {+/-}<Rm> {, <shift> #<amount>}] 


Preload read, rotate right with extend variant 
Applies when R == 1 && imm5 == 00000 && type == 11. 


PLD{<c>}{<q>} [<Rn>, {+/-}<Rm> , RRX] 


Preload write, optional shift or rotate variant 
Applies when R == 0 && !(imm5 == 00000 && type == 11).


PLDW{<c>}{<q>} [<Rn>, {+/-}<Rm> {, <shift> #<amount>}] 


Preload write, rotate right with extend variant 
Applies when R == 0 && imm5 == 00000 && type == 11. 


PLDW{<c>}{<q>} [<Rn>, {+/-}<Rm> , RRX] 


Decode for all variants of this encoding 
n = UInt(Rn); m = UInt(Rm); add = (U == '1'); is_pldw = (R == '0');
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if m == 15 || (n == 15 && is_pldw) then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6|5|4|3 2 1 0|15 14 13 12|11 10 9 8 7 6|5  4|3 2 1 0|
| 1  1  1  1| 1  0  0 0|0 0|W|1|!=1111 | 1  1  1  1| 0  0 0 0 0 0|imm2|   Rm  |
                                  Rn





Preload read variant 

Applies when W == 0. 

PLD{<c>}{<q>} [<Rn>, {+}<Rm> {, LSL #<amount>}] 
Preload write variant 


Applies when W == 1. 
PLDW{<c>}{<q>} [<Rn>, {+}<Rm> {, LSL #<amount>}]


Decode for all variants of this encoding 


if Rn == '1111' then SEE PLD (literal); 

n = UInt(Rn); m= UInt(Rm); add = TRUE; is_pldw = (W == '1'); 
(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 


Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. <c> must be AL or omitted. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be 
used. 


For encoding T1: is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
+ Specifies the index register is added to the base register. 
<Rm> Is the general-purpose index register, encoded in the "Rm" field. 
<shift> Is the type of shift to be applied to the index register, encoded in the "type" field. It can have the
following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11


<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32. 


For encoding T1: is the shift amount, in the range 0 to 3, defaulting to 0 and encoded in the "imm2"
field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
    address = if add then (R[n] + offset) else (R[n] - offset);
    if is_pldw then
        Hint_PreloadDataForWrite(address);
    else
        Hint_PreloadData(address);
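The shifted-register offset used above can be sketched in Python as follows. This is an illustrative model only: decode_imm_shift and shift are hypothetical stand-ins for the DecodeImmShift() and Shift() pseudocode functions, written to match the <shift> and <amount> descriptions for encoding A1 (an imm5 of 0 means 32 for LSR and ASR, and selects RRX when type is 11).

```python
MASK32 = 0xFFFFFFFF


def decode_imm_shift(type_bits, imm5):
    """Return (shift_t, shift_n) for the A1 'type' and 'imm5' fields."""
    if type_bits == 0b00:
        return "LSL", imm5
    if type_bits == 0b01:
        return "LSR", 32 if imm5 == 0 else imm5
    if type_bits == 0b10:
        return "ASR", 32 if imm5 == 0 else imm5
    return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)


def shift(value, shift_t, amount, carry_in):
    """Apply the shift to a 32-bit value; carry_in is only used by RRX."""
    value &= MASK32
    if amount == 0:
        return value
    if shift_t == "LSL":
        return (value << amount) & MASK32
    if shift_t == "LSR":
        return value >> amount
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32
    if shift_t == "ROR":
        amount %= 32
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    if shift_t == "RRX":
        return ((carry_in & 1) << 31) | (value >> 1)
    raise ValueError("unknown shift type")


def pld_register_address(rn_value, rm_value, type_bits, imm5, add, carry_in=0):
    """address = if add then (R[n] + offset) else (R[n] - offset)."""
    shift_t, shift_n = decode_imm_shift(type_bits, imm5)
    offset = shift(rm_value, shift_t, shift_n, carry_in)
    return (rn_value + offset if add else rn_value - offset) & MASK32


# Example: PLD [R1, R2, LSL #2] with R1 = 0x2000 and R2 = 3
example_address = pld_register_address(0x2000, 3, 0b00, 2, add=True)
```

For encoding T1 the shift is always LSL with an amount of 0 to 3, so the same shift helper applies with shift_t fixed to "LSL" and the amount taken from imm2.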
F5.1.132 PLI (immediate, literal)


Preload Instruction signals the memory system that instruction memory accesses from a specified address are likely 
in the near future. The memory system can respond by taking actions that are expected to speed up the memory 
accesses when they do occur, such as pre-loading the cache line containing the specified address into the instruction 
cache. 


The effect of a PLI instruction is IMPLEMENTATION DEFINED. For more information, see Preloading caches on 
page E2-2321. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18 17 16|15 14 13 12 |11          0|
| 1  1  1  1| 0  1  0  0| U| 1| 0  1|     Rn    |(1)(1)(1)(1)|   imm12    |


A1 variant
PLI{<c>}{<q>} [<Rn> {, #{+/-}<imm>}] 


PLI{<c>}{<q>} <label> // Normal form 
PLI{<c>}{<q>} [PC, #{+/-}<imm>] // Alternative form 


Decode for this encoding 


n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = (U == '1'); 
T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 14 13 12|11          0|
| 1  1  1  1| 1  0  0 1|1 0 0 1|!=1111 | 1  1  1  1|   imm12    |
                                  Rn





T1 variant 

PLI{<c>}{<q>} [<Rn> {, #{+}<imm>}] 

Decode for this encoding 

if Rn == '1111' then SEE encoding T3; 

n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = TRUE; 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 14 13 12|11 10 9 8|7      0|
| 1  1  1  1| 1  0  0 1|0 0 0 1|!=1111 | 1  1  1  1| 1  1  0 0|  imm8  |
                                  Rn


T2 variant 


PLI{<c>}{<q>} [<Rn> {, #-<imm>}] 


Decode for this encoding 


if Rn == '1111' then SEE encoding T3; 
n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); add = FALSE; 
T3


|15 14 13 12|11 10 9 8|7|6 5|4|3 2 1 0|15 14 13 12|11          0|
| 1  1  1  1| 1  0  0 1|U|0 0|1|1 1 1 1| 1  1  1  1|   imm12    |


T3 variant 


PLI{<c>}{<q>} <label> // Preferred syntax 
PLI{<c>}{<q>} [PC, #{+/-}<imm>] // Alternative syntax 


Decode for this encoding 


n= 15; imm32 = ZeroExtend(imm12, 32); add = (U == '1'); 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. Must be AL or omitted. 
For encoding T1, T2 and T3: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


<label> The label of the instruction that is likely to be accessed in the near future. The assembler calculates 
the required value of the offset from the Align(PC, 4) value of the instruction to this label. The offset 
must be in the range -4095 to 4095. If the offset is zero or positive, imm32 is equal to the offset and
add == TRUE. If the offset is negative, imm32 is equal to minus the offset and add == FALSE. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 


- when U = 0
+ when U = 1 
+ Specifies the offset is added to the base register. 
<imm> For encoding A1: is the optional 12-bit unsigned immediate byte offset, in the range 0 to 4095,
defaulting to 0 and encoded in the "imm12" field.


For encoding T1: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095, 
defaulting to 0 and encoded in the "imm12" field. 


For encoding T2: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 
if omitted, and encoded in the "imm8" field. 


For encoding T3: is a 12-bit unsigned immediate byte offset, in the range 0 to 4095, encoded in the 
"imm12" field. 


For the literal forms of the instruction, encoding T3 is used, or Rn is encoded as b1111 in encoding A1, to indicate 
that the PC is the base register. 


The alternative literal syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    base = if n == 15 then Align(PC,4) else R[n];
    address = if add then (base + imm32) else (base - imm32);
    Hint_PreloadInstr(address);
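The three T32 encodings differ only in the base register and in the sign and width of the offset. The following Python sketch shows one way a tool might pick among them from the decode rules above. The function name and the returned field dictionary are hypothetical, and the choice of T1 for a zero offset from a register base is an assumption rather than a requirement stated here.

```python
def select_pli_t32_encoding(base_is_pc, offset):
    """Pick a T32 PLI encoding for a byte offset from the base register.

    Returns (encoding_name, fields), where fields holds the immediate and,
    for T3, the U bit. Raises ValueError if no T32 encoding can hold it.
    """
    if base_is_pc:
        # T3: PC base, add = (U == '1'), 12-bit immediate.
        if not -4095 <= offset <= 4095:
            raise ValueError("literal offset out of range -4095 to 4095")
        return "T3", {"U": 1 if offset >= 0 else 0, "imm12": abs(offset)}
    if 0 <= offset <= 4095:
        # T1: add = TRUE, 12-bit immediate.
        return "T1", {"imm12": offset}
    if -255 <= offset < 0:
        # T2: add = FALSE, 8-bit immediate.
        return "T2", {"imm8": -offset}
    raise ValueError("offset not encodable in T1 or T2")


# Example: PLI [R3, #-16] needs the subtracting encoding
example = select_pli_t32_encoding(base_is_pc=False, offset=-16)
```

Note the asymmetry for a register base: offsets can be added up to 4095 bytes but subtracted only up to 255 bytes, because T2 carries an 8-bit immediate.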
F5.1.133 PLI (register)


Preload Instruction signals the memory system that instruction memory accesses from a specified address are likely 
in the near future. The memory system can respond by taking actions that are expected to speed up the memory 
accesses when they do occur, such as pre-loading the cache line containing the specified address into the instruction 
cache. 


The effect of a PLI instruction is IMPLEMENTATION DEFINED. For more information, see Preloading caches on 
page E2-2321. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18 17 16|15 14 13 12 |11     7|6  5| 4|3 2 1 0|
| 1  1  1  1| 0  1  1  0| U| 1| 0  1|     Rn    |(1)(1)(1)(1)|  imm5  |type| 0|   Rm  |


Rotate right with extend variant 
Applies when imm5 == 00000 && type == 11. 


PLI{<c>}{<q>} [<Rn>, {+/-}<Rm> , RRX] 


Shift or rotate by value variant 
Applies when !(imm5 == 00000 && type == 11). 


PLI{<c>}{<q>} [<Rn>, {+/-}<Rm> {, <shift> #<amount>}] 


Decode for all variants of this encoding 


n = UInt(Rn); m= UInt(Rm); add = (U == '1'); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 
if m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 14 13 12|11 10 9 8 7 6|5  4|3 2 1 0|
| 1  1  1  1| 1  0  0 1|0 0 0 1|!=1111 | 1  1  1  1| 0  0 0 0 0 0|imm2|   Rm  |
                                  Rn





T1 variant 


PLI{<c>}{<q>} [<Rn>, {+}<Rm> {, LSL #<amount>}] 


Decode for this encoding 

if Rn == '1111' then SEE PLI (immediate, literal); 

n = UInt(Rn); m= UInt(Rm); add = TRUE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. <c> must be AL or omitted. 
F5 T32 and A32 Base Instruction Set Instruction Descriptions
F5.1 Alphabetical list of T32 and A32 base instruction set instructions


For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.
+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
+ Specifies the index register is added to the base register.
<Rm> Is the general-purpose index register, encoded in the "Rm" field.
<shift> Is the type of shift to be applied to the index register, encoded in the "type" field. It can have the
following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T1: is the shift amount, in the range 0 to 3, defaulting to 0 and encoded in the "imm2"
field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
    address = if add then (R[n] + offset) else (R[n] - offset);
    Hint_PreloadInstr(address);
F5.1.134 POP

Pop Multiple Registers from Stack loads multiple general-purpose registers from the stack, loading from
consecutive memory locations starting at the address in SP, and updates SP to point just above the loaded data.

The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register
from the highest memory address. See also Encoding of lists of general-purpose registers and the PC on
page F2-2413.

The registers loaded can include the PC, causing a branch to a loaded address. This is an interworking branch, see
Pseudocode description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.

T1 

|15 14 13 12|11 10 9|8|7            0|
| 1  0  1  1| 1  1 0|P| register_list |

T1 variant 

POP{<c>}{<q>} <registers> // Preferred syntax 

LDM{<c>}{<q>} SP!, <registers> // Alternate syntax 

Decode for this encoding 

registers = P:'0000000':register_list; UnalignedAllowed = FALSE;
if BitCount(registers) < 1 then UNPREDICTABLE; 
if registers<15> == '1' && InITBlock() && !LastInITBlock() then UNPREDICTABLE; 

CONSTRAINED UNPREDICTABLE behavior 

If BitCount(registers) < 1, then one of the following behaviors must occur: 

• The instruction is UNDEFINED.

• The instruction executes as NOP.

• The instruction targets an unspecified set of registers. These registers might include R15. If the instruction
specifies writeback, the modification to the base address on writeback might differ from the number of
registers loaded.

Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 

Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<registers> Is a list of one or more registers to be loaded, separated by commas and surrounded by { and }. The
registers in the list must be in the range R0-R7, encoded in the "register_list" field, and can
optionally include the PC. If the PC is in the list, the "P" field is set to 1, otherwise this field defaults
to 0. If the PC is in the list, the instruction must be either outside any IT block, or the last instruction
in an IT block.
Operation


if ConditionPassed() then
    EncodingSpecificOperations();
    address = SP;
    for i = 0 to 14
        if registers<i> == '1' then
            R[i] = if UnalignedAllowed then MemU[address,4] else MemA[address,4];
            address = address + 4;
    if registers<15> == '1' then
        if UnalignedAllowed then
            if address<1:0> == '00' then
                LoadWritePC(MemU[address,4]);
            else
                UNPREDICTABLE;
        else
            LoadWritePC(MemA[address,4]);
    if registers<13> == '0' then SP = SP + 4*BitCount(registers);
    if registers<13> == '1' then SP = bits(32) UNKNOWN;
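A small Python model can make the load order and the SP update concrete. It is a simplified sketch under stated assumptions: memory is a dictionary keyed by word-aligned byte address, alignment checks and the interworking behavior of LoadWritePC() are not modelled, and pop_registers is a hypothetical name. An UNKNOWN SP is represented as None.

```python
MASK32 = 0xFFFFFFFF


def pop_registers(registers, sp, memory):
    """Model the POP operation for a 16-bit register mask.

    Returns (loaded, branch_target, new_sp), where loaded maps register
    numbers 0..14 to the values read, branch_target is the word loaded
    for the PC (or None), and new_sp is None when SP is in the list.
    """
    count = bin(registers & 0xFFFF).count("1")
    if count < 1:
        raise ValueError("UNPREDICTABLE: empty register list")
    loaded = {}
    address = sp
    for i in range(15):
        if (registers >> i) & 1:
            loaded[i] = memory[address]   # lowest register, lowest address
            address += 4
    branch_target = memory[address] if (registers >> 15) & 1 else None
    new_sp = None if (registers >> 13) & 1 else (sp + 4 * count) & MASK32
    return loaded, branch_target, new_sp


# Example: POP {R0, R4, PC} with SP = 0x1000
stack = {0x1000: 11, 0x1004: 44, 0x1008: 0x8001}
mask = (1 << 15) | (1 << 4) | (1 << 0)
loaded, branch_target, new_sp = pop_registers(mask, 0x1000, stack)
```

In the example the PC value is read from the highest address, after R0 and R4, and SP advances by 4 bytes for each of the three registers in the list.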
F5.1.135 POP (multiple registers)
Pop Multiple Registers from Stack loads multiple general-purpose registers from the stack, loading from
consecutive memory locations starting at the address in SP, and updates SP to point just above the loaded data.
This instruction is an alias of the LDM, LDMIA, LDMFD instruction. This means that:
• The encodings in this description are named to match the encodings of LDM, LDMIA, LDMFD.
• The description of LDM, LDMIA, LDMFD gives the operational pseudocode for this instruction.
A1 
|31  28|27 26 25|24|23|22|21|20|19 18 17 16|15            0|
| cond | 1  0  0| 0| 1| 0| 1| 1| 1  1  0  1| register_list |
                            W       Rn
A1 variant
POP{<c>}{<q>} <registers> 
is equivalent to 
LDM{<c>}{<q>} SP!, <registers> 
and is the preferred disassembly when BitCount(register_list) > 1. 
T2 
|15 14 13 12|11 10 9 8|7 6|5|4|3 2 1 0|15|14|13|12            0|
| 1  1  1  0| 1  0  0 0|1 0|1|1|1 1 0 1| P| M| 0| register_list |
                            W     Rn          register_list<13>
T2 variant
POP{<c>}.W <registers> // All registers in R0-R7, PC
is equivalent to 
LDM{<c>}{<q>} SP!, <registers> 
and is the preferred disassembly when BitCount(P:M:register_list) > 1. 
POP{<c>}{<q>} <registers> 
is equivalent to 
LDM{<c>}{<q>} SP!, <registers> 
and is the preferred disassembly when BitCount(P:M:register_list) > 1. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<registers> For encoding A1: is a list of two or more registers to be loaded, separated by commas and 
surrounded by { and }. The lowest-numbered register is loaded from the lowest memory address, 
through to the highest-numbered register from the highest memory address. See also Encoding of 
lists of general-purpose registers and the PC on page F2-2413. If the SP is in the list, the value of 
the SP after such an instruction is UNKNOWN. The PC can be in the list. If it is, the instruction 
branches to the address loaded to the PC. This is an interworking branch, see Pseudocode
description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.
ARM deprecates the use of this instruction with both the LR and the PC in the list.


For encoding T2: is a list of two or more registers to be loaded, separated by commas and
surrounded by { and }. The lowest-numbered register is loaded from the lowest memory address,
through to the highest-numbered register from the highest memory address. See also Encoding of
lists of general-purpose registers and the PC on page F2-2413. The registers in the list must be in
the range R0-R12, encoded in the "register_list" field, and can optionally contain one of the LR or
the PC. If the LR is in the list, the "M" field is set to 1, otherwise it defaults to 0. If the PC is in the
list, the "P" field is set to 1, otherwise it defaults to 0. The PC can be in the list. If it is, the instruction
branches to the address loaded to the PC. This is an interworking branch, see Pseudocode
description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.
If the PC is in the list:
If the PC is in the list: 


° The LR must not be in the list. 


° The instruction must be either outside any IT block, or the last instruction in an IT block. 


Operation for all encodings 


The description of LDM, LDMIA, LDMFD gives the operational pseudocode for this instruction.





ARM DDI 0487A.k_iss10775 
ID092916


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-2883 


Non-Confidential 




F5.1.136 POP (single register)


Pop Single Register from Stack loads a single general-purpose register from the stack, loading from the address in
SP, and updates SP to point just above the loaded data.


This instruction is an alias of the LDR (immediate) instruction. This means that:


• The encodings in this description are named to match the encodings of LDR (immediate).
• The description of LDR (immediate) gives the operational pseudocode for this instruction.
A1 
|31  28|27 26 25|24|23|22|21|20|19 18 17 16|15 12|11                      0|
| cond | 0  1  0| 0| 1| 0| 0| 1| 1  1  0  1| Rt  | 0 0 0 0 0 0 0 0 0 1 0 0 |
                  P  U     W        Rn                      imm12


Post-indexed variant 
POP{<c>}{<q>} <single_register_list> 
is equivalent to 

LDR{<c>}{<q>} <Rt>, [SP], #4 


and is always the preferred disassembly. 


T4 
|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11|10| 9| 8|7              0|
| 1  1  1  1| 1  0  0 0|0 1 0 1|1 1 0 1| Rt  | 1| 0| 1| 1|0 0 0 0 0 1 0 0|
                                  Rn             P  U  W       imm8
Post-indexed variant 
POP{<c>}{<q>} <single_register_list> 
is equivalent to 
LDR{<c>}{<q>} <Rt>, [SP], #4 
and is always the preferred disassembly. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<single_register_list> 


Is the general-purpose register <Rt> to be loaded surrounded by { and }. 


<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC 
can be used. If the PC is used, the instruction branches to the address (data) loaded to the PC. This 
is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 
For encoding T4: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
can be used, provided the instruction is either outside an IT block or the last instruction of an IT
block. If the PC is used, the instruction branches to the address (data) loaded to the PC. This is an
interworking branch, see Pseudocode description of operations on the AArch32 general-purpose
registers and the PC on page E1-2293.


Operation for all encodings 


The description of LDR (immediate) gives the operational pseudocode for this instruction. 
F5.1.137 PUSH


Push Multiple Registers to Stack stores multiple general-purpose registers to the stack, storing to consecutive 
memory locations ending just below the address in SP, and updates SP to point to the start of the stored data. 


The lowest-numbered register is stored to the lowest memory address, through to the highest-numbered register to 
the highest memory address. See also Encoding of lists of general-purpose registers and the PC on page F2-2413. 


T1 


|15 14 13 12|11 10 9|8|7            0|
| 1  0  1  1| 0  1 0|M| register_list |


T1 variant 


PUSH{<c>}{<q>} <registers> // Preferred syntax 
STMDB{<c>}{<q>} SP!, <registers> // Alternate syntax 


Decode for this encoding 


registers = '0':M:'000000':register_list; UnalignedAllowed = FALSE;
if BitCount(registers) < 1 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If BitCount(registers) < 1, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction targets an unspecified set of registers. These registers might include R15. If the instruction
specifies writeback, the modification to the base address on writeback might differ from the number of
registers loaded.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<registers> Is a list of one or more registers to be stored, separated by commas and surrounded by { and }. The
registers in the list must be in the range R0-R7, encoded in the "register_list" field, and can
optionally include the LR. If the LR is in the list, the "M" field is set to 1, otherwise this field defaults
to 0.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = SP - 4*BitCount(registers);
    for i = 0 to 14
        if registers<i> == '1' then
            if i == 13 && i != LowestSetBit(registers) then // Only possible for encoding A1
                MemA[address,4] = bits(32) UNKNOWN;
            else
                if UnalignedAllowed then
                    MemU[address,4] = R[i];
                else
                    MemA[address,4] = R[i];
            address = address + 4;
    if registers<15> == '1' then // Only possible for encoding A1 or A2
        if UnalignedAllowed then
            MemU[address,4] = PCStoreValue();
        else
            MemA[address,4] = PCStoreValue();
    SP = SP - 4*BitCount(registers);
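The store order and the final SP value can be illustrated with a short Python model of the T1 case. This is a sketch under stated assumptions: t1_push_mask and push_registers are hypothetical names, memory is a dictionary keyed by byte address, alignment checks are not modelled, and the SP and PC cases are left out because the T1 register mask can only select R0-R7 and the LR.

```python
MASK32 = 0xFFFFFFFF


def t1_push_mask(m, register_list):
    """registers = '0':M:'000000':register_list as a 16-bit mask."""
    return ((m & 1) << 14) | (register_list & 0xFF)


def push_registers(registers, regs, sp, memory):
    """Store the selected registers below SP and return the new SP.

    regs is a sequence of register values indexed by register number.
    """
    count = bin(registers & 0xFFFF).count("1")
    if count < 1:
        raise ValueError("UNPREDICTABLE: empty register list")
    new_sp = (sp - 4 * count) & MASK32
    address = new_sp
    for i in range(15):
        if (registers >> i) & 1:
            memory[address] = regs[i] & MASK32  # lowest register, lowest address
            address += 4
    return new_sp


# Example: PUSH {R0, R4, LR} with SP = 0x2000
regs = [0] * 16
regs[0], regs[4], regs[14] = 1, 5, 0xABCD
stack = {}
new_sp = push_registers(t1_push_mask(1, 0b00010001), regs, 0x2000, stack)
```

After the example, SP points at the value stored for R0, and the LR value sits in the word just below the original SP.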
F5.1.138 PUSH (multiple registers)


Push Multiple Registers to Stack stores multiple general-purpose registers to the stack, storing to consecutive
memory locations ending just below the address in SP, and updates SP to point to the start of the stored data.


This instruction is an alias of the STMDB, STMFD instruction. This means that:
• The encodings in this description are named to match the encodings of STMDB, STMFD.
• The description of STMDB, STMFD gives the operational pseudocode for this instruction.
A1 


|31  28|27 26 25|24|23|22|21|20|19 18 17 16|15            0|
| cond | 1  0  0| 1| 0| 0| 1| 0| 1  1  0  1| register_list |
                            W       Rn


A1 variant

PUSH{<c>}{<q>} <registers> 

is equivalent to 

STMDB{<c>}{<q>} SP!, <registers> 


and is the preferred disassembly when BitCount(register_list) > 1. 
T1 


|15 14 13 12|11 10 9 8|7 6|5|4|3 2 1 0|15|14|13|12            0|
| 1  1  1  0| 1  0  0 1|0 0|1|0|1 1 0 1| 0| M| 0| register_list |
                            W     Rn          register_list<13>


T1 variant

PUSH{<c>}.W <registers> // All registers in R0-R7, LR

is equivalent to 

STMDB{<c>}{<q>} SP!, <registers> 

and is the preferred disassembly when BitCount(M:register_list) > 1. 
PUSH{<c>}{<q>} <registers> 

is equivalent to 

STMDB{<c>}{<q>} SP!, <registers> 


and is the preferred disassembly when BitCount(M:register_list) > 1. 


Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<registers> For encoding A1: is a list of two or more registers to be stored, separated by commas and surrounded
by { and }. The lowest-numbered register is stored to the lowest memory address, through to the
highest-numbered register to the highest memory address. See also Encoding of lists of
general-purpose registers and the PC on page F2-2413. The SP and PC can be in the list. However:


• ARM deprecates the use of instructions that include the PC in the list.
• If the SP is in the list, and it is not the lowest-numbered register in the list, the instruction
stores an UNKNOWN value for the SP.


For encoding T1: is a list of one or more registers to be stored, separated by commas and surrounded
by { and }. The lowest-numbered register is stored to the lowest memory address, through to the
highest-numbered register to the highest memory address. See also Encoding of lists of
general-purpose registers and the PC on page F2-2413. The registers in the list must be in the range
R0-R12, encoded in the "register_list" field, and can optionally contain the LR. If the LR is in the
list, the "M" field is set to 1, otherwise it defaults to 0.


Operation for all encodings 


The description of STMDB, STMFD gives the operational pseudocode for this instruction.
F5.1.139 PUSH (single register)


Push Single Register to Stack stores a single general-purpose register to the stack, storing to the 32-bit word below
the address in SP, and updates SP to point to the start of the stored data.


This instruction is an alias of the STR (immediate) instruction. This means that: 


° The encodings in this description are named to match the encodings of STR (immediate). 
° The description of STR (immediate) gives the operational pseudocode for this instruction. 
A1 
|31  28|27 26 25|24|23|22|21|20|19 18 17 16|15 12|11                      0|
| cond | 0  1  0| 1| 0| 0| 1| 0| 1  1  0  1| Rt  | 0 0 0 0 0 0 0 0 0 1 0 0 |
                  P  U     W        Rn                      imm12


Pre-indexed variant 

PUSH{<c>}{<q>} <single_register_list> 
is equivalent to 

STR{<c>}{<q>} <Rt>, [SP, #-4]! 


and is always the preferred disassembly. 


T4 
|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11|10| 9| 8|7              0|
| 1  1  1  1| 1  0  0 0|0 1 0 0|1 1 0 1| Rt  | 1| 1| 0| 1|0 0 0 0 0 1 0 0|
                                  Rn             P  U  W       imm8
Pre-indexed variant 
PUSH{<c>}{<q>} <single_register_list> // Standard syntax 
is equivalent to 
STR{<c>}{<q>} <Rt>, [SP, #-4]! 
and is always the preferred disassembly. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<single_register_list> 


Is the general-purpose register <Rt> to be stored surrounded by { and }. 


<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC 
can be used, but this is deprecated. 


For encoding T4: is the general-purpose register to be transferred, encoded in the "Rt" field. 


Operation for all encodings 


The description of STR (immediate) gives the operational pseudocode for this instruction. 


F5.1.140 QADD 


Saturating Add adds two register values, saturates the result to the 32-bit signed integer range -2^31 to (2^31 - 1), and 
writes the result to the destination register. If saturation occurs, it sets PSTATE.Q to 1. 


A1 


[Encoding diagram; fields: cond, Rn, Rd, Rm] 


A1 variant 


QADD{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 


QADD{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 
<Rn> Is the second general-purpose source register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    (R[d], sat) = SignedSatQ(SInt(R[m]) + SInt(R[n]), 32); 
    if sat then 
        PSTATE.Q = '1'; 
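
The operation above can be sketched in Python. This is an illustrative model, not the architecture pseudocode itself: the helper names sint32, signed_sat_q and qadd are assumptions, and the Q flag is modeled as a Boolean that is returned rather than held in PSTATE.

```python
def sint32(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def signed_sat_q(i, n):
    """Saturate integer i to n signed bits; return (n-bit pattern, saturated)."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    mask = (1 << n) - 1
    if i > hi:
        return hi & mask, True
    if i < lo:
        return lo & mask, True
    return i & mask, False


def qadd(rm, rn, q=False):
    """Return (result, new Q flag). Q is sticky: it is set, never cleared."""
    result, sat = signed_sat_q(sint32(rm) + sint32(rn), 32)
    return result, q or sat
```

For example, adding 1 to 0x7FFFFFFF leaves the result at 0x7FFFFFFF and reports saturation, whereas 1 + 2 gives 3 with the flag unchanged.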


F5.1.141 QADD16 
Saturating Add 16 performs two 16-bit integer additions, saturates the results to the 16-bit signed integer range -2^15 
<= x <= 2^15 - 1, and writes the results to the destination register. 


A1 


[Encoding diagram; fields: cond, Rn, Rd, Rm] 


A1 variant 


QADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 


QADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    sum1 = SInt(R[n]<15:0>) + SInt(R[m]<15:0>); 
    sum2 = SInt(R[n]<31:16>) + SInt(R[m]<31:16>); 
    R[d]<15:0> = SignedSat(sum1, 16); 
    R[d]<31:16> = SignedSat(sum2, 16); 
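
As an illustration, the two halfword lanes can be modeled in Python as follows. The helper names sint, signed_sat and qadd16 are assumptions for this sketch; note that, as in the pseudocode, no flag is updated.

```python
def sint(x, n):
    """Interpret the low n bits of x as a signed integer."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x


def signed_sat(i, n):
    """Saturate integer i to n signed bits and return the n-bit pattern."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(i, hi)) & ((1 << n) - 1)


def qadd16(rn, rm):
    sum1 = sint(rn, 16) + sint(rm, 16)              # bits <15:0>
    sum2 = sint(rn >> 16, 16) + sint(rm >> 16, 16)  # bits <31:16>
    return (signed_sat(sum2, 16) << 16) | signed_sat(sum1, 16)
```

Each lane saturates independently: with rn = 0x7FFF8000 and rm = 0x0001FFFF the upper lane clamps at 0x7FFF and the lower lane at 0x8000.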


F5.1.142 QADD8 


Saturating Add 8 performs four 8-bit integer additions, saturates the results to the 8-bit signed integer range -2^7 <= 
x <= 2^7 - 1, and writes the results to the destination register. 


A1 


[Encoding diagram; fields: cond, Rn, Rd, Rm] 


A1 variant 


QADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 


QADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    sum1 = SInt(R[n]<7:0>) + SInt(R[m]<7:0>); 
    sum2 = SInt(R[n]<15:8>) + SInt(R[m]<15:8>); 
    sum3 = SInt(R[n]<23:16>) + SInt(R[m]<23:16>); 
    sum4 = SInt(R[n]<31:24>) + SInt(R[m]<31:24>); 
    R[d]<7:0> = SignedSat(sum1, 8); 
    R[d]<15:8> = SignedSat(sum2, 8); 
    R[d]<23:16> = SignedSat(sum3, 8); 
    R[d]<31:24> = SignedSat(sum4, 8); 
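
The four byte lanes lend themselves to a loop. The following Python sketch is illustrative only; the names sint, signed_sat and qadd8 are assumptions, and the loop is an equivalent restatement of the four explicit sums in the pseudocode.

```python
def sint(x, n):
    """Interpret the low n bits of x as a signed integer."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x


def signed_sat(i, n):
    """Saturate integer i to n signed bits and return the n-bit pattern."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(i, hi)) & ((1 << n) - 1)


def qadd8(rn, rm):
    result = 0
    for lane in range(4):                       # lanes <7:0> .. <31:24>
        shift = 8 * lane
        total = sint(rn >> shift, 8) + sint(rm >> shift, 8)
        result |= signed_sat(total, 8) << shift
    return result
```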





F5.1.143 QASX 
Saturating Add and Subtract with Exchange exchanges the two halfwords of the second operand, performs one 
16-bit integer addition and one 16-bit subtraction, saturates the results to the 16-bit signed integer range -2^15 <= x 
<= 2^15 - 1, and writes the results to the destination register. 
A1 
[Encoding diagram; fields: cond, Rn, Rd, Rm] 
A1 variant 
QASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
[Encoding diagram; fields: Rn, Rd, Rm] 
T1 variant 
QASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then 
    EncodingSpecificOperations(); 
    diff = SInt(R[n]<15:0>) - SInt(R[m]<31:16>); 
    sum = SInt(R[n]<31:16>) + SInt(R[m]<15:0>); 
    R[d]<15:0> = SignedSat(diff, 16); 
    R[d]<31:16> = SignedSat(sum, 16); 
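
The halfword exchange is easiest to see in a small model. The Python sketch below is illustrative; sint, signed_sat and qasx are assumed helper names, not architecture-defined functions.

```python
def sint(x, n):
    """Interpret the low n bits of x as a signed integer."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x


def signed_sat(i, n):
    """Saturate integer i to n signed bits and return the n-bit pattern."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(i, hi)) & ((1 << n) - 1)


def qasx(rn, rm):
    # The second operand's halfwords are crossed over:
    # the low lane subtracts Rm<31:16>, the high lane adds Rm<15:0>.
    diff = sint(rn, 16) - sint(rm >> 16, 16)
    total = sint(rn >> 16, 16) + sint(rm, 16)
    return (signed_sat(total, 16) << 16) | signed_sat(diff, 16)
```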





F5.1.144 QDADD 
Saturating Double and Add adds a doubled register value to another register value, and writes the result to the 
destination register. Both the doubling and the addition have their results saturated to the 32-bit signed integer range 
-2^31 <= x <= 2^31 - 1. If saturation occurs in either operation, it sets PSTATE.Q to 1. 
A1 
[Encoding diagram; fields: cond, Rn, Rd, Rm] 
A1 variant 
QDADD{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
[Encoding diagram; fields: Rn, Rd, Rm] 
T1 variant 
QDADD{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 
<Rn> Is the second general-purpose source register, encoded in the "Rn" field. 
Operation for all encodings 
if ConditionPassed() then 
    EncodingSpecificOperations(); 
    (doubled, sat1) = SignedSatQ(2 * SInt(R[n]), 32); 
    (R[d], sat2) = SignedSatQ(SInt(R[m]) + SInt(doubled), 32); 
    if sat1 || sat2 then 
        PSTATE.Q = '1'; 
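
Because the doubling is saturated before the addition, the result can differ from a single saturation of Rm + 2*Rn. The Python sketch below illustrates this; sint32, signed_sat_q and qdadd are assumed names, and the Q flag is returned as a Boolean rather than written to PSTATE.

```python
def sint32(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def signed_sat_q(i, n):
    """Saturate integer i to n signed bits; return (n-bit pattern, saturated)."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    mask = (1 << n) - 1
    if i > hi:
        return hi & mask, True
    if i < lo:
        return lo & mask, True
    return i & mask, False


def qdadd(rm, rn):
    doubled, sat1 = signed_sat_q(2 * sint32(rn), 32)           # first saturation
    result, sat2 = signed_sat_q(sint32(rm) + sint32(doubled), 32)  # second saturation
    return result, sat1 or sat2
```

With rm = 0x80000000 and rn = 0x40000000, the doubling saturates to 0x7FFFFFFF, so the final sum is 0xFFFFFFFF with saturation reported, even though the unsaturated value Rm + 2*Rn would be exactly 0.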





F5.1.145 QDSUB 
Saturating Double and Subtract subtracts a doubled register value from another register value, and writes the result 
to the destination register. Both the doubling and the subtraction have their results saturated to the 32-bit signed 
integer range -2^31 <= x <= 2^31 - 1. If saturation occurs in either operation, it sets PSTATE.Q to 1. 
A1 
[Encoding diagram; fields: cond, Rn, Rd, Rm] 
A1 variant 
QDSUB{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
[Encoding diagram; fields: Rn, Rd, Rm] 
T1 variant 
QDSUB{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 
<Rn> Is the second general-purpose source register, encoded in the "Rn" field. 
Operation for all encodings 
if ConditionPassed() then 
    EncodingSpecificOperations(); 
    (doubled, sat1) = SignedSatQ(2 * SInt(R[n]), 32); 
    (R[d], sat2) = SignedSatQ(SInt(R[m]) - SInt(doubled), 32); 
    if sat1 || sat2 then 
        PSTATE.Q = '1'; 





F5.1.146 QSAX 
Saturating Subtract and Add with Exchange exchanges the two halfwords of the second operand, performs one 
16-bit integer subtraction and one 16-bit addition, saturates the results to the 16-bit signed integer range -2^15 <= x 
<= 2^15 - 1, and writes the results to the destination register. 
A1 
[Encoding diagram; fields: cond, Rn, Rd, Rm] 
A1 variant 
QSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
[Encoding diagram; fields: Rn, Rd, Rm] 
T1 variant 
QSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then 
    EncodingSpecificOperations(); 
    sum = SInt(R[n]<15:0>) + SInt(R[m]<31:16>); 
    diff = SInt(R[n]<31:16>) - SInt(R[m]<15:0>); 
    R[d]<15:0> = SignedSat(sum, 16); 
    R[d]<31:16> = SignedSat(diff, 16); 


F5.1.147 QSUB 


Saturating Subtract subtracts one register value from another register value, saturates the result to the 32-bit signed 
integer range -2^31 <= x <= 2^31 - 1, and writes the result to the destination register. If saturation occurs, it sets 
PSTATE.Q to 1. 


A1 


[Encoding diagram; fields: cond, Rn, Rd, Rm] 


A1 variant 


QSUB{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 


QSUB{<c>}{<q>} {<Rd>,} <Rm>, <Rn> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> Is the first general-purpose source register, encoded in the "Rm" field. 
<Rn> Is the second general-purpose source register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    (R[d], sat) = SignedSatQ(SInt(R[m]) - SInt(R[n]), 32); 
    if sat then 
        PSTATE.Q = '1'; 


F5.1.148 QSUB16 


Saturating Subtract 16 performs two 16-bit integer subtractions, saturates the results to the 16-bit signed integer 
range -2^15 <= x <= 2^15 - 1, and writes the results to the destination register. 


A1 


[Encoding diagram; fields: cond, Rn, Rd, Rm] 


A1 variant 


QSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 


QSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    diff1 = SInt(R[n]<15:0>) - SInt(R[m]<15:0>); 
    diff2 = SInt(R[n]<31:16>) - SInt(R[m]<31:16>); 
    R[d]<15:0> = SignedSat(diff1, 16); 
    R[d]<31:16> = SignedSat(diff2, 16); 


F5.1.149 QSUB8 


Saturating Subtract 8 performs four 8-bit integer subtractions, saturates the results to the 8-bit signed integer range 
-2^7 <= x <= 2^7 - 1, and writes the results to the destination register. 


A1 


[Encoding diagram; fields: cond, Rn, Rd, Rm] 


A1 variant 


QSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 


QSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    diff1 = SInt(R[n]<7:0>) - SInt(R[m]<7:0>); 
    diff2 = SInt(R[n]<15:8>) - SInt(R[m]<15:8>); 
    diff3 = SInt(R[n]<23:16>) - SInt(R[m]<23:16>); 
    diff4 = SInt(R[n]<31:24>) - SInt(R[m]<31:24>); 
    R[d]<7:0> = SignedSat(diff1, 8); 
    R[d]<15:8> = SignedSat(diff2, 8); 
    R[d]<23:16> = SignedSat(diff3, 8); 
    R[d]<31:24> = SignedSat(diff4, 8); 


F5.1.150 RBIT 


Reverse Bits reverses the bit order in a 32-bit register. 


A1 


[Encoding diagram; fields: cond, Rd, Rm] 


A1 variant 

RBIT{<c>}{<q>} <Rd>, <Rm> 

Decode for this encoding 

d = UInt(Rd); m = UInt(Rm); 
if d == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rn, Rd, Rm] 


T1 variant 

RBIT{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 

d = UInt(Rd); m = UInt(Rm); n = UInt(Rn); 
if m != n || d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 

If m != n, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes with the additional decode: m = UInt(Rn);. 
• The instruction executes with the additional decode: m = UInt(Rm);. 
• The value in the destination register is UNKNOWN. 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 


<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. 


For encoding T1: is the general-purpose source register, encoded in the "Rm" field. It must be 
encoded with an identical value in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    bits(32) result; 
    for i = 0 to 31 
        result<31-i> = R[m]<i>; 
    R[d] = result; 
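
A direct Python transcription of the loop above is shown below as an illustration; the function name rbit is an assumption of this sketch.

```python
def rbit(value):
    """Reverse the bit order of a 32-bit value."""
    result = 0
    for i in range(32):
        if (value >> i) & 1:          # R[m]<i>
            result |= 1 << (31 - i)   # result<31-i>
    return result
```

For example, bit 0 of the source becomes bit 31 of the result, so an input of 1 gives 0x80000000.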


F5.1.151 REV 


Byte-Reverse Word reverses the byte order in a 32-bit register. 


A1 


[Encoding diagram; fields: cond, Rd, Rm] 


A1 variant 

REV{<c>}{<q>} <Rd>, <Rm> 

Decode for this encoding 

d = UInt(Rd); m = UInt(Rm); 
if d == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rm, Rd] 


T1 variant 
REV{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); 
T2 


[Encoding diagram; fields: Rn, Rd, Rm] 


T2 variant 


REV{<c>}.W <Rd>, <Rm> // <Rd>, <Rm> can be represented in T1 
REV{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); n = UInt(Rn); 
if m != n || d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 

If m != n, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes with the additional decode: m = UInt(Rn);. 
• The instruction executes with the additional decode: m = UInt(Rm);. 
• The value in the destination register is UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> For encoding A1 and T1: is the general-purpose source register, encoded in the "Rm" field. 


For encoding T2: is the general-purpose source register, encoded in the "Rm" field. It must be 
encoded with an identical value in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    bits(32) result; 
    result<31:24> = R[m]<7:0>; 
    result<23:16> = R[m]<15:8>; 
    result<15:8> = R[m]<23:16>; 
    result<7:0> = R[m]<31:24>; 
    R[d] = result; 
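
The byte moves above can be modeled with shifts and masks. The following Python sketch is illustrative, and the name rev is an assumption.

```python
def rev(value):
    """Reverse the byte order of a 32-bit value."""
    value &= 0xFFFFFFFF
    return (((value & 0x000000FF) << 24)   # <7:0>   -> <31:24>
            | ((value & 0x0000FF00) << 8)  # <15:8>  -> <23:16>
            | ((value >> 8) & 0x0000FF00)  # <23:16> -> <15:8>
            | (value >> 24))               # <31:24> -> <7:0>
```

This is the same transformation as converting a word between little-endian and big-endian byte order.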


F5.1.152  REV16 


Byte-Reverse Packed Halfword reverses the byte order in each 16-bit halfword of a 32-bit register. 





A1 
[Encoding diagram; fields: cond, Rd, Rm] 
A1 variant 


REV16{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 
d = UInt(Rd); m = UInt(Rm); 
if d == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rm, Rd] 


T1 variant 
REV16{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); 
T2 


[Encoding diagram; fields: Rn, Rd, Rm] 


T2 variant 


REV16{<c>}.W <Rd>, <Rm> // <Rd>, <Rm> can be represented in T1 
REV16{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); n = UInt(Rn); 
if m != n || d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 

If m != n, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes with the additional decode: m = UInt(Rn);. 
• The instruction executes with the additional decode: m = UInt(Rm);. 
• The value in the destination register is UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> For encoding A1 and T1: is the general-purpose source register, encoded in the "Rm" field. 


For encoding T2: is the general-purpose source register, encoded in the "Rm" field. It must be 
encoded with an identical value in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    bits(32) result; 
    result<31:24> = R[m]<23:16>; 
    result<23:16> = R[m]<31:24>; 
    result<15:8> = R[m]<7:0>; 
    result<7:0> = R[m]<15:8>; 
    R[d] = result; 
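
Since the bytes are swapped within each halfword, the operation reduces to two masked shifts. The Python sketch below is illustrative, with rev16 as an assumed name.

```python
def rev16(value):
    """Swap the two bytes inside each 16-bit halfword of a 32-bit value."""
    value &= 0xFFFFFFFF
    low_bytes_up = (value & 0x00FF00FF) << 8     # <7:0> -> <15:8>, <23:16> -> <31:24>
    high_bytes_down = (value >> 8) & 0x00FF00FF  # <15:8> -> <7:0>, <31:24> -> <23:16>
    return low_bytes_up | high_bytes_down
```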


F5.1.153  REVSH 


Byte-Reverse Signed Halfword reverses the byte order in the lower 16-bit halfword of a 32-bit register, and 
sign-extends the result to 32 bits. 





A1 
[Encoding diagram; fields: cond, Rd, Rm] 
A1 variant 


REVSH{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); 
if d == 15 || m == 15 then UNPREDICTABLE; 


T1 


[Encoding diagram; fields: Rm, Rd] 


T1 variant 
REVSH{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); 
T2 


[Encoding diagram; fields: Rn, Rd, Rm] 


T2 variant 


REVSH{<c>}.W <Rd>, <Rm> // <Rd>, <Rm> can be represented in T1 
REVSH{<c>}{<q>} <Rd>, <Rm> 


Decode for this encoding 


d = UInt(Rd); m = UInt(Rm); n = UInt(Rn); 
if m != n || d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 

If m != n, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes with the additional decode: m = UInt(Rn);. 
• The instruction executes with the additional decode: m = UInt(Rm);. 
• The value in the destination register is UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> For encoding A1 and T1: is the general-purpose source register, encoded in the "Rm" field. 


For encoding T2: is the general-purpose source register, encoded in the "Rm" field. It must be 
encoded with an identical value in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    bits(32) result; 
    result<31:8> = SignExtend(R[m]<7:0>, 24); 
    result<7:0> = R[m]<15:8>; 
    R[d] = result; 
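
The sign extension comes from the original low byte, which becomes the high byte of the swapped halfword. The Python sketch below illustrates this; revsh is an assumed name, and the upper halfword of the source is ignored, as in the pseudocode.

```python
def revsh(value):
    """Byte-reverse the low halfword and sign-extend the result to 32 bits."""
    low_byte = value & 0xFF           # R[m]<7:0>
    high_byte = (value >> 8) & 0xFF   # R[m]<15:8>
    # SignExtend(R[m]<7:0>, 24): replicate bit 7 of the low byte into 24 bits.
    extended = low_byte | 0xFFFF00 if low_byte & 0x80 else low_byte
    return (extended << 8) | high_byte   # result<31:8> : result<7:0>
```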
F5.1.154 RFE, RFEDA, RFEDB, RFEIA, RFEIB 
Return From Exception loads two consecutive memory locations using an address in a base register: 
• The word loaded from the lower address is treated as an instruction address. The PE branches to it. 
• The word loaded from the higher address is used to restore PSTATE. This word must be in the format of an 
SPSR. 
An address adjusted by the size of the data loaded can optionally be written back to the base register. 
The PE checks the value of the word loaded from the higher address for an illegal return event. See Illegal return 
events from AArch32 state on page G1-3835. 
RFE is UNDEFINED in Hyp mode and CONSTRAINED UNPREDICTABLE in User mode. 
A1 
[Encoding diagram; fields: P, U, W, Rn] 
Decrement After variant 
Applies when P == 0 && U == 0. 
RFEDA{<c>}{<q>} <Rn>{!} 
Decrement Before variant 
Applies when P == 1 && U == 0. 
RFEDB{<c>}{<q>} <Rn>{!} 
Increment After variant 
Applies when P == 0 && U == 1. 
RFE{IA}{<c>}{<q>} <Rn>{!} 
Increment Before variant 
Applies when P == 1 && U == 1. 
RFEIB{<c>}{<q>} <Rn>{!} 
Decode for all variants of this encoding 
n = UInt(Rn); 
wback = (W == '1'); increment = (U == '1'); wordhigher = (P == U); 
if n == 15 then UNPREDICTABLE; 
T1 
[Encoding diagram; fields: W, Rn] 
T1 variant 
RFEDB{<c>}{<q>} <Rn>{!} // Outside or last in IT block 
Decode for this encoding 
n = UInt(Rn); wback = (W == '1'); increment = FALSE; wordhigher = FALSE; 
if n == 15 then UNPREDICTABLE; 
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


T2 


[Encoding diagram; fields: W, Rn] 


T2 variant 


RFE{IA}{<c>}{<q>} <Rn>{!} // Outside or last in IT block 


Decode for this encoding 
n = UInt(Rn); whack = (W == '1'); increment = TRUE; wordhigher = FALSE; 


if n == 15 then UNPREDICTABLE; 
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.
Assembler symbols
IA For encoding A1: is an optional suffix to indicate the Increment After variant.
For encoding T2: is an optional suffix for the Increment After form.
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. <c> must be AL or omitted.
For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.
! The address adjusted by the size of the data loaded is written back to the base register. If specified,
it is encoded in the "W" field as 1, otherwise this field defaults to 0.
RFEFA, RFEEA, RFEFD, and RFEED are pseudo-instructions for RFEDA, RFEDB, RFEIA, and RFEIB respectively, referring to
their use for popping data from Full Ascending, Empty Ascending, Full Descending, and Empty Descending stacks.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.EL == EL2 then
        UNDEFINED;
    elsif PSTATE.EL == EL0 then
        UNPREDICTABLE; // UNDEFINED or NOP
    else
        address = if increment then R[n] else R[n]-8;
        if wordhigher then address = address+4;
        new_pc_value = MemA[address,4];
        spsr = MemA[address+4,4];
        if wback then R[n] = if increment then R[n]+8 else R[n]-8;
        AArch32.ExceptionReturn(new_pc_value, spsr);
CONSTRAINED UNPREDICTABLE behavior
If PSTATE.EL == EL0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
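The address arithmetic in the RFE operation pseudocode can be illustrated with a short Python sketch. It models only the address calculation and optional writeback, not the memory accesses or the exception return, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def rfe_addresses(base, increment, wordhigher, wback):
    """Return (pc_address, spsr_address, new_base) for an RFE access (sketch only)."""
    # Start at the base for the increment forms, or 8 bytes below it otherwise.
    address = base if increment else (base - 8) & MASK32
    # The "word higher" forms (P == U) start one word further up.
    if wordhigher:
        address = (address + 4) & MASK32
    pc_address = address
    spsr_address = (address + 4) & MASK32
    # Writeback adjusts the base by the 8 bytes transferred.
    if wback:
        new_base = (base + 8 if increment else base - 8) & MASK32
    else:
        new_base = base
    return pc_address, spsr_address, new_base
```

For a base of 0x1000, the four addressing forms read the return address from 0x1000 (IA), 0x1004 (IB), 0xFFC (DA), and 0xFF8 (DB), with the SPSR word always at the next higher word.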
F5.1.155 ROR (immediate)


Rotate Right (immediate) provides the value of the contents of a register rotated by a constant value. The bits that 
are rotated off the right end are inserted into the vacated bit positions on the left. 


This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 
• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.

A1

| 31:28   | 27:23     | 22:21 | 20 | 19:16        | 15:12 | 11:7     | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 1 1 | 0 1   | 0  | (0)(0)(0)(0) | Rd    | != 00000 | 1 1  | 0 | Rm  |
| cond    |           |       | S  |              |       | imm5     | type |   |     |


MOV, shift or rotate by value variant 
ROR{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, ROR #<imm> 


and is always the preferred disassembly. 


T3 
| 15:5                  | 4 | 3:0     | 15  | 14:12 | 11:8 | 7:6  | 5:4  | 3:0 |
| 1 1 1 0 1 0 1 0 0 1 0 | 0 | 1 1 1 1 | (0) | imm3  | Rd   | imm2 | 1 1  | Rm  |
|                       | S |         |     |       |      |      | type |     |
MOV, shift or rotate by value variant
Applies when !(imm3 == 000 && imm2 == 00).
ROR{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, ROR #<imm> 
and is always the preferred disassembly. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction is a branch
to the address calculated by the operation. This is an interworking branch, see Pseudocode
description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.
For encoding T3: is the general-purpose destination register, encoded in the "Rd" field.
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.
For encoding T3: is the general-purpose source register, encoded in the "Rm" field.
<imm> For encoding A1: is the shift amount, in the range 1 to 31, encoded in the "imm5" field.
For encoding T3: is the shift amount, in the range 1 to 31, encoded in the "imm3:imm2" field.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
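As an illustration of the rotation described above, the following Python sketch rotates a 32-bit value right by a constant. It is not the manual's pseudocode; the function name is hypothetical and the 32-bit register width is assumed.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits (sketch, 1 <= amount <= 31 for ROR #imm)."""
    value &= MASK32
    amount %= 32
    if amount == 0:
        return value
    # Bits shifted off the right end re-enter at the left end.
    return ((value >> amount) | (value << (32 - amount))) & MASK32
```

For example, rotating 0x12345678 right by 8 moves the low byte 0x78 to the top of the word.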







F5.1.156 ROR (register)


Rotate Right (register) provides the value of the contents of a register rotated by a variable number of bits. The bits 
that are rotated off the right end are inserted into the vacated bit positions on the left. The variable number of bits is 
read from the bottom byte of a register.


This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that: 


• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.

A1

| 31:28   | 27:23     | 22:21 | 20 | 19:16        | 15:12 | 11:8 | 7 | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 1 1 | 0 1   | 0  | (0)(0)(0)(0) | Rd    | Rs   | 0 | 1 1  | 1 | Rm  |
| cond    |           |       | S  |              |       |      |   | type |   |     |


Not flag setting variant 
ROR{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, ROR <Rs> 


and is always the preferred disassembly. 
T1 


| 15:10       | 9:6     | 5:3 | 2:0 |
| 0 1 0 0 0 0 | 0 1 1 1 | Rs  | Rdm |
|             | op      |     |     |


Rotate right variant 

ROR<c>{<q>} {<Rdm>,} <Rdm>, <Rs> // Inside IT block 
is equivalent to 

MOV<c>{<q>} <Rdm>, <Rdm>, ROR <Rs> 


and is the preferred disassembly when InITBlock(). 
T2 


| 15:7              | 6:5  | 4 | 3:0 | 15:12   | 11:8 | 7:4     | 3:0 |
| 1 1 1 1 1 0 1 0 0 | 1 1  | 0 | Rm  | 1 1 1 1 | Rd   | 0 0 0 0 | Rs  |
|                   | type | S |     |         |      |         |     |

Not flag setting variant
ROR<c>.W {<Rd>,} <Rm>, <Rs> // Inside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1
is equivalent to
MOV{<c>}{<q>} <Rd>, <Rm>, ROR <Rs>
and is always the preferred disassembly.
ROR{<c>}{<q>} {<Rd>,} <Rm>, <Rs>
is equivalent to
MOV{<c>}{<q>} <Rd>, <Rm>, ROR <Rs>
and is always the preferred disassembly.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rm> Is the first general-purpose source register, encoded in the "Rm" field.
<Rs> Is the second general-purpose source register holding a rotate amount in its bottom 8 bits, encoded
in the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 
F5.1.157 RORS (immediate)


Rotate Right, setting flags (immediate) provides the value of the contents of a register rotated by a constant value. 
The bits that are rotated off the right end are inserted into the vacated bit positions on the left. 


If the destination register is not the PC, this instruction updates the condition flags based on the result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
• The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from AArch32
state on page G1-3835.
• The instruction is UNDEFINED in Hyp mode.
• The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 

• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.

A1

| 31:28   | 27:23     | 22:21 | 20 | 19:16        | 15:12 | 11:7     | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 1 1 | 0 1   | 1  | (0)(0)(0)(0) | Rd    | != 00000 | 1 1  | 0 | Rm  |
| cond    |           |       | S  |              |       | imm5     | type |   |     |


MOVS, shift or rotate by value variant 
RORS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, ROR #<imm> 


and is always the preferred disassembly. 





T3 
| 15:5                  | 4 | 3:0     | 15  | 14:12 | 11:8 | 7:6  | 5:4  | 3:0 |
| 1 1 1 0 1 0 1 0 0 1 0 | 1 | 1 1 1 1 | (0) | imm3  | Rd   | imm2 | 1 1  | Rm  |
|                       | S |         |     |       |      |      | type |     |
MOVS, shift or rotate by value variant
Applies when !(imm3 == 000 && imm2 == 00).
RORS{<c>}{<q>} {<Rd>,} <Rm>, #<imm> 
is equivalent to 
MOVS{<c>}{<q>} <Rd>, <Rm>, ROR #<imm> 
and is always the preferred disassembly. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction performs an
exception return, that restores PSTATE from SPSR_<current_mode>.
For encoding T3: is the general-purpose destination register, encoded in the "Rd" field.
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.
For encoding T3: is the general-purpose source register, encoded in the "Rm" field.
<imm> For encoding A1: is the shift amount, in the range 1 to 31, encoded in the "imm5" field.
For encoding T3: is the shift amount, in the range 1 to 31, encoded in the "imm3:imm2" field.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
F5.1.158 RORS (register)
Rotate Right, setting flags (register) provides the value of the contents of a register rotated by a variable number of
bits, and updates the condition flags based on the result. The bits that are rotated off the right end are inserted into
the vacated bit positions on the left. The variable number of bits is read from the bottom byte of a register.
This instruction is an alias of the MOV, MOVS (register-shifted register) instruction. This means that: 
• The encodings in this description are named to match the encodings of MOV, MOVS (register-shifted
register).
• The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this
instruction.
A1
| 31:28   | 27:23     | 22:21 | 20 | 19:16        | 15:12 | 11:8 | 7 | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 1 1 | 0 1   | 1  | (0)(0)(0)(0) | Rd    | Rs   | 0 | 1 1  | 1 | Rm  |
| cond    |           |       | S  |              |       |      |   | type |   |     |
Flag setting variant 
RORS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 
is equivalent to 
MOVS{<c>}{<q>} <Rd>, <Rm>, ROR <Rs> 
and is always the preferred disassembly. 
T1 
| 15:10       | 9:6     | 5:3 | 2:0 |
| 0 1 0 0 0 0 | 0 1 1 1 | Rs  | Rdm |
|             | op      |     |     |
Rotate right variant 
RORS{<q>} {<Rdm>,} <Rdm>, <Rs> // Outside IT block 
is equivalent to 
MOVS{<q>} <Rdm>, <Rdm>, ROR <Rs> 
and is the preferred disassembly when !InITBlock(). 
T2 
| 15:7              | 6:5  | 4 | 3:0 | 15:12   | 11:8 | 7:4     | 3:0 |
| 1 1 1 1 1 0 1 0 0 | 1 1  | 1 | Rm  | 1 1 1 1 | Rd   | 0 0 0 0 | Rs  |
|                   | type | S |     |         |      |         |     |
Flag setting variant
RORS.W {<Rd>,} <Rm>, <Rs> // Outside IT block, and <Rd>, <Rm>, <type>, <Rs> can be represented in T1
is equivalent to
MOVS{<c>}{<q>} <Rd>, <Rm>, ROR <Rs>
and is always the preferred disassembly.
RORS{<c>}{<q>} {<Rd>,} <Rm>, <Rs> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, ROR <Rs> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rdm> Is the first general-purpose source register and the destination register, encoded in the "Rdm" field.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rm> Is the first general-purpose source register, encoded in the "Rm" field.
<Rs> Is the second general-purpose source register holding a rotate amount in its bottom 8 bits, encoded
in the "Rs" field.


Operation for all encodings 


The description of MOV, MOVS (register-shifted register) gives the operational pseudocode for this instruction. 
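The flag-setting rotate by a register amount can be sketched in Python as follows. This is a hedged illustration rather than the architectural pseudocode: the function name is hypothetical, and the carry handling (carry preserved for a zero amount, otherwise taken from bit[31] of the result) is an assumption that should be checked against the description of MOV, MOVS (register-shifted register).

```python
MASK32 = 0xFFFFFFFF

def rors_register(rm, rs, carry_in):
    """Return (result, n, z, c) for a RORS by register; V is left unchanged (sketch)."""
    value = rm & MASK32
    shift_n = rs & 0xFF                    # only the bottom byte of Rs is used
    if shift_n == 0:
        result, carry = value, carry_in    # no rotation: carry assumed preserved
    else:
        m = shift_n % 32                   # rotation is effectively modulo 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        carry = result >> 31               # carry-out assumed to be bit[31] of the result
    n = result >> 31
    z = int(result == 0)
    return result, n, z, carry
```

Note that an amount of 32 leaves the value unchanged but, under the stated assumption, still updates the carry from bit[31].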
F5.1.159 RRX


Rotate Right with Extend provides the value of the contents of a register shifted right by one place, with the Carry 
flag shifted into bit[31]. 


This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 


• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.
A1
| 31:28   | 27:23     | 22:21 | 20 | 19:16        | 15:12 | 11:7      | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 1 1 | 0 1   | 0  | (0)(0)(0)(0) | Rd    | 0 0 0 0 0 | 1 1  | 0 | Rm  |
| cond    |           |       | S  |              |       | imm5      | type |   |     |


MOV, rotate right with extend variant 
RRX{<c>}{<q>} {<Rd>,} <Rm> 

is equivalent to 

MOV{<c>}{<q>} <Rd>, <Rm>, RRX 


and is always the preferred disassembly. 


T3 
| 15:5                  | 4 | 3:0     | 15  | 14:12 | 11:8 | 7:6  | 5:4  | 3:0 |
| 1 1 1 0 1 0 1 0 0 1 0 | 0 | 1 1 1 1 | (0) | 0 0 0 | Rd   | 0 0  | 1 1  | Rm  |
|                       | S |         |     | imm3  |      | imm2 | type |     |
MOV, rotate right with extend variant 
RRX{<c>}{<q>} {<Rd>,} <Rm> 
is equivalent to 
MOV{<c>}{<q>} <Rd>, <Rm>, RRX 
and is always the preferred disassembly. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction is a branch
to the address calculated by the operation. This is an interworking branch, see Pseudocode
description of operations on the AArch32 general-purpose registers and the PC on page E1-2293.
For encoding T3: is the general-purpose destination register, encoded in the "Rd" field.
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.
For encoding T3: is the general-purpose source register, encoded in the "Rm" field.
Operation for all encodings


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
F5.1.160 RRXS


Rotate Right with Extend, setting flags provides the value of the contents of a register shifted right by one place, 
with the Carry flag shifted into bit[31]. 


If the destination register is not the PC, this instruction updates the condition flags based on the result, and bit[0] is 
shifted into the Carry flag. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
• The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from AArch32
state on page G1-3835.
• The instruction is UNDEFINED in Hyp mode.
• The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


This instruction is an alias of the MOV, MOVS (register) instruction. This means that: 


• The encodings in this description are named to match the encodings of MOV, MOVS (register).
• The description of MOV, MOVS (register) gives the operational pseudocode for this instruction.
A1
| 31:28   | 27:23     | 22:21 | 20 | 19:16        | 15:12 | 11:7      | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 1 1 | 0 1   | 1  | (0)(0)(0)(0) | Rd    | 0 0 0 0 0 | 1 1  | 0 | Rm  |
| cond    |           |       | S  |              |       | imm5      | type |   |     |


MOVS, rotate right with extend variant 
RRXS{<c>}{<q>} {<Rd>,} <Rm> 

is equivalent to 

MOVS{<c>}{<q>} <Rd>, <Rm>, RRX 


and is always the preferred disassembly. 





T3 
| 15:5                  | 4 | 3:0     | 15  | 14:12 | 11:8 | 7:6  | 5:4  | 3:0 |
| 1 1 1 0 1 0 1 0 0 1 0 | 1 | 1 1 1 1 | (0) | 0 0 0 | Rd   | 0 0  | 1 1  | Rm  |
|                       | S |         |     | imm3  |      | imm2 | type |     |
MOVS, rotate right with extend variant 
RRXS{<c>}{<q>} {<Rd>,} <Rm> 
is equivalent to 
MOVS{<c>}{<q>} <Rd>, <Rm>, RRX 
and is always the preferred disassembly. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. ARM
deprecates using the PC as the destination register, but if the PC is used, the instruction performs an
exception return, that restores PSTATE from SPSR_<current_mode>.
For encoding T3: is the general-purpose destination register, encoded in the "Rd" field.
<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. The PC can be
used, but this is deprecated.
For encoding T3: is the general-purpose source register, encoded in the "Rm" field.


Operation for all encodings 


The description of MOV, MOVS (register) gives the operational pseudocode for this instruction. 
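The Rotate Right with Extend behavior described for this instruction (the Carry flag shifted into bit[31], and bit[0] shifted into the Carry flag) can be sketched in Python. The function name is hypothetical and the sketch ignores the PC-destination cases.

```python
MASK32 = 0xFFFFFFFF

def rrxs(rm, carry_in):
    """Return (result, carry_out) for a 32-bit rotate right with extend (sketch)."""
    value = rm & MASK32
    # The old Carry flag becomes bit[31]; every other bit moves one place right.
    result = ((carry_in & 1) << 31) | (value >> 1)
    # Bit[0] of the source becomes the new Carry flag.
    carry_out = value & 1
    return result, carry_out
```

Chaining this operation across several words, from the most significant to the least significant, gives a multi-word shift right by one bit.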







F5.1.161 RSB, RSBS (immediate) 


Reverse Subtract (immediate) subtracts a register value from an immediate value, and writes the result to the 
destination register. 


If the destination register is not the PC, the RSBS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The RSB variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.
• The RSBS variant of the instruction performs an exception return without the use of the stack. In this case:
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
AArch32 state on page G1-3835.
— The instruction is UNDEFINED in Hyp mode.
— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1 
| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:0  |
| != 1111 | 0 0 1 0 | 0 1 1 | S  | Rn    | Rd    | imm12 |
| cond    |         |       |    |       |       |       |

RSB variant
Applies when S == 0.
RSB{<c>}{<q>} {<Rd>,} <Rn>, #<const>

RSBS variant
Applies when S == 1.
RSBS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = A32ExpandImm(imm12);
T1 


| 15:10       | 9:6     | 5:3 | 2:0 |
| 0 1 0 0 0 0 | 1 0 0 1 | Rn  | Rd  |

T1 variant
RSB<c>{<q>} {<Rd>,} <Rn>, #0 // Inside IT block
RSBS{<q>} {<Rd>,} <Rn>, #0 // Outside IT block

Decode for this encoding
d = UInt(Rd); n = UInt(Rn); setflags = !InITBlock(); imm32 = Zeros(32); // immediate = #0
T2

| 15:11     | 10 | 9 | 8:5     | 4 | 3:0 | 15 | 14:12 | 11:8 | 7:0  |
| 1 1 1 1 0 | i  | 0 | 1 1 1 0 | S | Rn  | 0  | imm3  | Rd   | imm8 |

RSB variant
Applies when S == 0.
RSB<c>.W {<Rd>,} <Rn>, #0 // Inside IT block
RSB{<c>}{<q>} {<Rd>,} <Rn>, #<const>

RSBS variant
Applies when S == 1.
RSBS.W {<Rd>,} <Rn>, #0 // Outside IT block
RSBS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:
• For the RSB variant, the instruction is a branch to the address calculated by the operation.
This is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.
• For the RSBS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.
For encoding T1 and T2: is the general-purpose destination register, encoded in the "Rd" field. If
omitted, this register is the same as <Rn>.
<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
used, but this is deprecated.
For encoding T1 and T2: is the general-purpose source register, encoded in the "Rn" field.
<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.
For encoding T2: an immediate value. See Modified immediate constants in T32 instructions on
page F2-2420 for the range of values.
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(NOT(R[n]), imm32, '1');
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
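The operation above computes imm32 minus R[n] as NOT(R[n]) + imm32 + 1. The following Python sketch models that arithmetic for 32-bit values. The helper names are hypothetical, and the flag definitions are an assumption modeled on the usual meaning of AddWithCarry (carry-out of the unsigned sum, overflow of the signed sum).

```python
MASK32 = 0xFFFFFFFF

def sint32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def add_with_carry(x, y, carry_in):
    """Return (result, (n, z, c, v)) for a 32-bit add with carry (sketch)."""
    unsigned_sum = x + y + carry_in
    signed_sum = sint32(x) + sint32(y) + carry_in
    result = unsigned_sum & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(result != unsigned_sum)          # carry out of bit 31
    v = int(sint32(result) != signed_sum)    # signed overflow
    return result, (n, z, c, v)

def rsbs_immediate(rn, imm32):
    """Model RSBS Rd, Rn, #imm32 as imm32 - Rn (sketch, ignores the PC cases)."""
    return add_with_carry(~rn & MASK32, imm32 & MASK32, 1)
```

With these definitions the Carry flag is 1 when no borrow occurs, so subtracting 3 from 10 sets C while subtracting 5 from 0 clears it.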
F5.1.162 RSB, RSBS (register)


Reverse Subtract (register) subtracts a register value from an optionally-shifted register value, and writes the result 
to the destination register. 


If the destination register is not the PC, the RSBS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The RSB variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.
• The RSBS variant of the instruction performs an exception return without the use of the stack. In this case:
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
AArch32 state on page G1-3835.
— The instruction is UNDEFINED in Hyp mode.
— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1 
| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:7 | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 0 | 0 1 1 | S  | Rn    | Rd    | imm5 | type | 0 | Rm  |
| cond    |         |       |    |       |       |      |      |   |     |

RSB, rotate right with extend variant
Applies when S == 0 && imm5 == 00000 && type == 11.
RSB{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

RSB, shift or rotate by value variant
Applies when S == 0 && !(imm5 == 00000 && type == 11).
RSB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

RSBS, rotate right with extend variant
Applies when S == 1 && imm5 == 00000 && type == 11.
RSBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

RSBS, shift or rotate by value variant
Applies when S == 1 && !(imm5 == 00000 && type == 11).
RSBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

Decode for all variants of this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);
T1

| 15:5                  | 4 | 3:0 | 15  | 14:12 | 11:8 | 7:6  | 5:4  | 3:0 |
| 1 1 1 0 1 0 1 1 1 1 0 | S | Rn  | (0) | imm3  | Rd   | imm2 | type | Rm  |

RSB, rotate right with extend variant
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11.
RSB{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

RSB, shift or rotate by value variant
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11).
RSB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

RSBS, rotate right with extend variant
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11.
RSBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX

RSBS, shift or rotate by value variant
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11).
RSBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>}

Decode for all variants of this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:
• For the RSB variant, the instruction is a branch to the address calculated by the operation.
This is an interworking branch, see Pseudocode description of operations on the AArch32
general-purpose registers and the PC on page E1-2293.
• For the RSBS variant, the instruction performs an exception return, that restores PSTATE
from SPSR_<current_mode>.
For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.
<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.
For encoding T1: is the first general-purpose source register, encoded in the "Rn" field.
<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.
For encoding T1: is the second general-purpose source register, encoded in the "Rm" field.
<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:
LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
For encoding T1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(NOT(R[n]), shifted, '1');
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
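The mapping from the "type" and "imm5" fields to a shift type and amount, as implied by the variant conditions and the <shift> and <amount> descriptions for this instruction, can be sketched in Python. The function name is hypothetical, and the treatment of LSL with a zero field as "no shift" is an assumption based on <amount> being optional in the syntax.

```python
def decode_imm_shift(shift_type, imm5):
    """Return (shift_name, shift_amount) for a 2-bit type and 5-bit immediate (sketch)."""
    if shift_type == 0b00:
        return "LSL", imm5                       # 0 is assumed to mean no shift
    if shift_type == 0b01:
        return "LSR", imm5 if imm5 != 0 else 32  # amount is encoded modulo 32
    if shift_type == 0b10:
        return "ASR", imm5 if imm5 != 0 else 32  # amount is encoded modulo 32
    if shift_type == 0b11:
        # imm5 == 00000 with type == 11 selects the RRX variant.
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("shift type must be a 2-bit value")
```

For the T1 encoding the same mapping applies with the concatenated imm3:imm2 field in place of imm5.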
F5.1.163 RSB, RSBS (register-shifted register)


Reverse Subtract (register-shifted register) subtracts a register value from a register-shifted register value, and 
writes the result to the destination register. It can optionally update the condition flags based on the result. 


A1 
| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:8 | 7 | 6:5  | 4 | 3:0 |
| != 1111 | 0 0 0 0 | 0 1 1 | S  | Rn    | Rd    | Rs   | 0 | type | 1 | Rm  |
| cond    |         |       |    |       |       |      |   |      |   |     |


Flag setting variant 
Applies when S == 1. 


RSBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0. 


RSB{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(NOT(R[n]), shifted, '1');
    R[d] = result;
    if setflags then
        PSTATE.<N,Z,C,V> = nzcv;
F5.1.164 RSC, RSCS (immediate)


Reverse Subtract with Carry (immediate) subtracts a register value and the value of NOT (Carry flag) from an 
immediate value, and writes the result to the destination register. 


If the destination register is not the PC, the RSCS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


• The RSC variant of the instruction is an interworking branch, see Pseudocode description of operations on
the AArch32 general-purpose registers and the PC on page E1-2293.
• The RSCS variant of the instruction performs an exception return without the use of the stack. In this case:
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
AArch32 state on page G1-3835.
— The instruction is UNDEFINED in Hyp mode.
— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1 
| 31:28   | 27:24   | 23:21 | 20 | 19:16 | 15:12 | 11:0  |
| != 1111 | 0 0 1 0 | 1 1 1 | S  | Rn    | Rd    | imm12 |
| cond    |         |       |    |       |       |       |

RSC variant
Applies when S == 0.
RSC{<c>}{<q>} {<Rd>,} <Rn>, #<const>

RSCS variant
Applies when S == 1.
RSCS{<c>}{<q>} {<Rd>,} <Rn>, #<const>

Decode for all variants of this encoding
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = A32ExpandImm(imm12);


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. If omitted, this register is the
same as <Rn>. ARM deprecates using the PC as the destination register, but if the PC is used:

•   For the RSC variant, the instruction is a branch to the address calculated by the operation.
    This is an interworking branch, see Pseudocode description of operations on the AArch32
    general-purpose registers and the PC on page E1-2293.

•   For the RSCS variant, the instruction performs an exception return, that restores PSTATE
    from SPSR_<current_mode>.

<Rn> Is the general-purpose source register, encoded in the "Rn" field. The PC can be used, but this is
deprecated.
<const> An immediate value. See Modified immediate constants in A32 instructions on page F2-2422 for the
range of values.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(NOT(R[n]), imm32, PSTATE.C);
    if d == 15 then
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
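
As a rough illustration of the pseudocode above, the following Python sketch models AddWithCarry() on 32-bit values and uses it for the case of RSC (immediate) where the destination is not the PC. The names add_with_carry, to_signed32 and rsc_imm are illustrative only, not architectural, and the sketch leaves out the condition check, the PC destination cases, and immediate expansion.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def add_with_carry(x: int, y: int, carry_in: int):
    """Model of AddWithCarry() for 32-bit operands.

    Returns (result, (n, z, c, v)).
    """
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed32(x) + to_signed32(y) + carry_in
    result = unsigned_sum & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(result != unsigned_sum)             # unsigned overflow
    v = int(to_signed32(result) != signed_sum)  # signed overflow
    return result, (n, z, c, v)


def rsc_imm(rn: int, imm32: int, carry: int):
    """imm32 - rn - NOT(carry), computed as NOT(rn) + imm32 + carry."""
    return add_with_carry(~rn & MASK32, imm32, carry)


print(rsc_imm(3, 10, 1))  # carry set: 10 - 3 - 0
print(rsc_imm(3, 10, 0))  # carry clear: 10 - 3 - 1
```

With the Carry flag set, no extra 1 is subtracted; with it clear, the result is one lower. This is why the pseudocode can pass PSTATE.C directly as the carry input.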







F5.1.165 RSC, RSCS (register) 


Reverse Subtract with Carry (register) subtracts a register value and the value of NOT (Carry flag) from an 
optionally-shifted register value, and writes the result to the destination register. 


If the destination register is not the PC, the RSCS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

•   The RSC variant of the instruction is an interworking branch, see Pseudocode description of operations on
    the AArch32 general-purpose registers and the PC on page E1-2293.

•   The RSCS variant of the instruction performs an exception return without the use of the stack. In this case:
    —   The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    —   The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
        AArch32 state on page G1-3835.
    —   The instruction is UNDEFINED in Hyp mode.
    —   The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1

| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11    7 | 6  5 | 4 | 3  0 |
| !=1111 |  0  0  0  0 |  1  1  1  S |   Rn   |   Rd   |  imm5   | type | 0 |  Rm  |
   cond


RSC, rotate right with extend variant 


Applies when S == 0 && imm5 == 00000 && type == 11. 


RSC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


RSC, shift or rotate by value variant 


Applies when S == 0 && !(imm5 == 00000 && type == 11).


RSC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


RSCS, rotate right with extend variant 


Applies when S == 1 && imm5 == 00000 && type == 11. 


RSCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


RSCS, shift or rotate by value variant 


Applies when S == 1 && !(imm5 == 00000 && type == 11).


RSCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. If omitted, this register is the
same as <Rn>. ARM deprecates using the PC as the destination register, but if the PC is used:

•   For the RSC variant, the instruction is a branch to the address calculated by the operation.
    This is an interworking branch, see Pseudocode description of operations on the AArch32
    general-purpose registers and the PC on page E1-2293.

•   For the RSCS variant, the instruction performs an exception return, that restores PSTATE
    from SPSR_<current_mode>.


<Rn> Is the first general-purpose source register, encoded in the "Rn" field. The PC can be used, but this 
is deprecated. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. The PC can be used, but 
this is deprecated. 


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<amount> Is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 (when <shift> =
LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.
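
The <shift> and <amount> rules above can be sketched in Python as follows. The decode function follows the behavior of the architectural DecodeImmShift() pseudocode function as commonly documented; the Python names decode_imm_shift and encode_amount are hypothetical helpers for this illustration, and shift types are returned as plain strings rather than SRType values.

```python
def decode_imm_shift(shift_type: int, imm5: int):
    """Map the 2-bit "type" field and the 5-bit "imm5" field to (shift, amount)."""
    if shift_type == 0b00:
        return ("LSL", imm5)
    if shift_type == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b11:
        # imm5 == 0 with type 11 selects rotate right with extend (RRX)
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("type must be a 2-bit value")


def encode_amount(shift: str, amount: int) -> int:
    """Encode an assembler <amount> as imm5, that is <amount> modulo 32."""
    limit = 31 if shift in ("LSL", "ROR") else 32
    if not 1 <= amount <= limit:
        raise ValueError("shift amount out of range for " + shift)
    return amount % 32


print(decode_imm_shift(0b01, encode_amount("LSR", 32)))
```

The modulo-32 encoding is what lets LSR #32 and ASR #32 fit in five bits: they are stored as imm5 == 0, and the decoder maps that value back to 32.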


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(NOT(R[n]), shifted, PSTATE.C);
    if d == 15 then
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;

F5.1.166 RSC, RSCS (register-shifted register)

Reverse Subtract with Carry (register-shifted register) subtracts a register value and the value of NOT (Carry flag)
from a register-shifted register value, and writes the result to the destination register. It can optionally update the
condition flags based on the result.


A1 
| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11  8 | 7 | 6  5 | 4 | 3  0 |
| !=1111 |  0  0  0  0 |  1  1  1  S |   Rn   |   Rd   |  Rs   | 0 | type | 1 |  Rm  |
   cond


Flag setting variant 
Applies when S == 1. 


RSCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


RSC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(NOT(R[n]), shifted, PSTATE.C);
    R[d] = result;
    if setflags then
        PSTATE.<N,Z,C,V> = nzcv;
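
Because shift_n is taken from the bottom 8 bits of <Rs>, a register-specified shift can be anywhere from 0 to 255. The following Python sketch shows one way to model the resulting value for each shift type. The function names shift32 and shift_by_register are assumptions made for this illustration, the shift types are plain strings, and only the shifted value is modeled, not the carry output.

```python
MASK32 = 0xFFFFFFFF


def shift32(value: int, shift_t: str, amount: int) -> int:
    """Shift a 32-bit value by amount (0 to 255) and return the 32-bit result."""
    if amount == 0:
        return value
    if shift_t == "LSL":
        return (value << amount) & MASK32          # 0 once amount >= 32
    if shift_t == "LSR":
        return value >> amount                     # 0 once amount >= 32
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32         # all sign bits once amount >= 32
    if shift_t == "ROR":
        m = amount % 32
        return ((value >> m) | (value << (32 - m))) & MASK32
    raise ValueError("unknown shift type")


def shift_by_register(rm: int, shift_t: str, rs: int) -> int:
    """Apply a shift whose amount is the bottom byte of rs."""
    shift_n = rs & 0xFF
    return shift32(rm, shift_t, shift_n)


print(hex(shift_by_register(0x12345678, "ROR", 36)))
```

Bits of <Rs> above bit 7 are ignored, so a value such as 0x104 shifts by 4.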
F5.1.167 SADD16

Signed Add 16 performs two 16-bit signed integer additions, and writes the results to the destination register. It sets 
PSTATE.GE according to the results of the additions. 


A1 


| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11 10  9  8  | 7 6 5 4 | 3  0 |
| !=1111 |  0  1  1  0 |  0  0  0  1 |   Rn   |   Rd   | (1)(1)(1)(1) | 0 0 0 1 |  Rm  |
   cond

A1 variant

SADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 14 13 12 | 11  8 | 7 6 5 4 | 3  0 |
|  1  1  1  1 |  1  0 1 0 | 1 0 0 1 |  Rn  |  1  1  1  1 |  Rd   | 0 0 0 0 |  Rm  |

T1 variant

SADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = SInt(R[n]<15:0>) + SInt(R[m]<15:0>);
    sum2 = SInt(R[n]<31:16>) + SInt(R[m]<31:16>);
    R[d]<15:0> = sum1<15:0>;
    R[d]<31:16> = sum2<15:0>;
    PSTATE.GE<1:0> = if sum1 >= 0 then '11' else '00';
    PSTATE.GE<3:2> = if sum2 >= 0 then '11' else '00';
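
A minimal Python sketch of the SADD16 operation above may help show how the GE bits are derived from the full-precision sums rather than from the truncated 16-bit results. The names sint and sadd16 are illustrative, and the function returns the GE flags as a 4-bit integer instead of writing PSTATE.

```python
def sint(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def sadd16(rn: int, rm: int):
    """Return (rd, ge) for two 32-bit operands."""
    sum1 = sint(rn & 0xFFFF, 16) + sint(rm & 0xFFFF, 16)
    sum2 = sint((rn >> 16) & 0xFFFF, 16) + sint((rm >> 16) & 0xFFFF, 16)
    rd = ((sum2 & 0xFFFF) << 16) | (sum1 & 0xFFFF)
    ge = (0b0011 if sum1 >= 0 else 0) | (0b1100 if sum2 >= 0 else 0)
    return rd, ge


rd, ge = sadd16(0x00007FFF, 0x00000001)
print(hex(rd), bin(ge))
```

In the example call, the low halfword sum is +32768, which does not fit in 16 signed bits. The stored halfword wraps to 0x8000, but GE<1:0> is still '11' because the untruncated sum is not negative.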
F5.1.168 SADD8

Signed Add 8 performs four 8-bit signed integer additions, and writes the results to the destination register. It sets 
PSTATE.GE according to the results of the additions. 
A1 
| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11 10  9  8  | 7 6 5 4 | 3  0 |
| !=1111 |  0  1  1  0 |  0  0  0  1 |   Rn   |   Rd   | (1)(1)(1)(1) | 1 0 0 1 |  Rm  |
   cond

A1 variant

SADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;

T1

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 14 13 12 | 11  8 | 7 6 5 4 | 3  0 |
|  1  1  1  1 |  1  0 1 0 | 1 0 0 0 |  Rn  |  1  1  1  1 |  Rd   | 0 0 0 0 |  Rm  |

T1 variant

SADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = SInt(R[n]<7:0>) + SInt(R[m]<7:0>);
    sum2 = SInt(R[n]<15:8>) + SInt(R[m]<15:8>);
    sum3 = SInt(R[n]<23:16>) + SInt(R[m]<23:16>);
    sum4 = SInt(R[n]<31:24>) + SInt(R[m]<31:24>);
    R[d]<7:0> = sum1<7:0>;
    R[d]<15:8> = sum2<7:0>;
    R[d]<23:16> = sum3<7:0>;
    R[d]<31:24> = sum4<7:0>;
    PSTATE.GE<0> = if sum1 >= 0 then '1' else '0';
    PSTATE.GE<1> = if sum2 >= 0 then '1' else '0';
    PSTATE.GE<2> = if sum3 >= 0 then '1' else '0';
    PSTATE.GE<3> = if sum4 >= 0 then '1' else '0';

F5.1.169 SASX

Signed Add and Subtract with Exchange exchanges the two halfwords of the second operand, performs one 16-bit 
integer addition and one 16-bit subtraction, and writes the results to the destination register. It sets PSTATE.GE 
according to the results. 
A1 
| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11 10  9  8  | 7 6 5 4 | 3  0 |
| !=1111 |  0  1  1  0 |  0  0  0  1 |   Rn   |   Rd   | (1)(1)(1)(1) | 0 0 1 1 |  Rm  |
   cond

A1 variant

SASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;

T1

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 14 13 12 | 11  8 | 7 6 5 4 | 3  0 |
|  1  1  1  1 |  1  0 1 0 | 1 0 1 0 |  Rn  |  1  1  1  1 |  Rd   | 0 0 0 0 |  Rm  |

T1 variant

SASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    diff = SInt(R[n]<15:0>) - SInt(R[m]<31:16>);
    sum = SInt(R[n]<31:16>) + SInt(R[m]<15:0>);
    R[d]<15:0> = diff<15:0>;
    R[d]<31:16> = sum<15:0>;
    PSTATE.GE<1:0> = if diff >= 0 then '11' else '00';
    PSTATE.GE<3:2> = if sum >= 0 then '11' else '00';
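
The halfword exchange in SASX is easy to misread, so here is a small Python sketch of the operation above. The names sint16 and sasx are hypothetical, and the GE flags are returned as a 4-bit integer rather than written to PSTATE.

```python
def sint16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sasx(rn: int, rm: int):
    """Return (rd, ge): low half = Rn.lo - Rm.hi, high half = Rn.hi + Rm.lo."""
    diff = sint16(rn) - sint16(rm >> 16)
    total = sint16(rn >> 16) + sint16(rm)
    rd = ((total & 0xFFFF) << 16) | (diff & 0xFFFF)
    ge = (0b0011 if diff >= 0 else 0) | (0b1100 if total >= 0 else 0)
    return rd, ge


rd, ge = sasx(0x00050003, 0x00040002)
print(hex(rd), bin(ge))
```

In the example call, the low result is 3 - 4 = -1 and the high result is 5 + 2 = 7, so only GE<3:2> is set.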
F5.1.170 SBC, SBCS (immediate)


Subtract with Carry (immediate) subtracts an immediate value and the value of NOT (Carry flag) from a register 
value, and writes the result to the destination register. 


If the destination register is not the PC, the SBCS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM 
deprecates any use of these encodings. However, when the destination register is the PC: 


•   The SBC variant of the instruction is an interworking branch, see Pseudocode description of operations on
    the AArch32 general-purpose registers and the PC on page E1-2293.

•   The SBCS variant of the instruction performs an exception return without the use of the stack. In this case:
    —   The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    —   The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
        AArch32 state on page G1-3835.
    —   The instruction is UNDEFINED in Hyp mode.
    —   The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.


A1 
| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11        0 |
| !=1111 |  0  0  1  0 |  1  1  0  S |   Rn   |   Rd   |    imm12    |
   cond

SBC variant 
Applies when S == 0. 


SBC{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


SBCS variant 
Applies when S == 1. 


SBCS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = A32ExpandImm(imm12);

T1

| 15 14 13 12 11 | 10 | 9 | 8 7 6 5 | 4 | 3  0 | 15 | 14  12 | 11  8 | 7      0 |
|  1  1  1  1  0 |  i | 0 | 1 0 1 1 | S |  Rn  |  0 |  imm3  |  Rd   |   imm8   |

SBC variant
Applies when S == 0.

SBC{<c>}{<q>} {<Rd>,} <Rn>, #<const>

SBCS variant
Applies when S == 1.

SBCS{<c>}{<q>} {<Rd>,} <Rn>, #<const>


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:

•   For the SBC variant, the instruction is a branch to the address calculated by the operation.
    This is an interworking branch, see Pseudocode description of operations on the AArch32
    general-purpose registers and the PC on page E1-2293.

•   For the SBCS variant, the instruction performs an exception return, that restores PSTATE
    from SPSR_<current_mode>.

For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.

<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
used, but this is deprecated.

For encoding T1: is the general-purpose source register, encoded in the "Rn" field.

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.

For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on
page F2-2420 for the range of values.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result, nzcv) = AddWithCarry(R[n], NOT(imm32), PSTATE.C);
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;
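
The reason SBC subtracts NOT (Carry flag) becomes clearer in a multi-word subtraction. The Python sketch below is an illustration only: it assumes the low word is subtracted first by a flag-setting subtract, modeled here as AddWithCarry with a carry input of 1, and that the resulting Carry flag then feeds an SBC on the high word. The names add_with_carry and sub64 are hypothetical, and only the result and the carry output are modeled.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(x: int, y: int, carry_in: int):
    """Return (32-bit result, carry_out) of x + y + carry_in."""
    total = x + y + carry_in
    return total & MASK32, int(total > MASK32)


def sub64(a: int, b: int) -> int:
    """64-bit a - b built from two 32-bit steps."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32
    # Low word: plain subtract, that is a_lo + NOT(b_lo) + 1 (assumed flag-setting step).
    lo, carry = add_with_carry(a_lo, ~b_lo & MASK32, 1)
    # High word: SBC, that is a_hi + NOT(b_hi) + Carry. Carry == 0 means a borrow occurred.
    hi, _ = add_with_carry(a_hi, ~b_hi & MASK32, carry)
    return (hi << 32) | lo


print(hex(sub64(0x100000000, 1)))
```

In the example, the low-word subtraction 0 - 1 borrows, which clears the carry, and the SBC step then subtracts the extra 1 from the high word.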
F5.1.171 SBC, SBCS (register)


Subtract with Carry (register) subtracts an optionally-shifted register value and the value of NOT (Carry flag) from 
a register value, and writes the result to the destination register. 


If the destination register is not the PC, the SBCS variant of the instruction updates the condition flags based on the
result.

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. ARM
deprecates any use of these encodings. However, when the destination register is the PC:

•   The SBC variant of the instruction is an interworking branch, see Pseudocode description of operations on
    the AArch32 general-purpose registers and the PC on page E1-2293.

•   The SBCS variant of the instruction performs an exception return without the use of the stack. In this case:
    —   The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>.
    —   The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from
        AArch32 state on page G1-3835.
    —   The instruction is UNDEFINED in Hyp mode.
    —   The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode.

A1

| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11    7 | 6  5 | 4 | 3  0 |
| !=1111 |  0  0  0  0 |  1  1  0  S |   Rn   |   Rd   |  imm5   | type | 0 |  Rm  |
   cond


SBC, rotate right with extend variant 


Applies when S == 0 && imm5 == 00000 && type == 11.


SBC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


SBC, shift or rotate by value variant 


Applies when S == 0 && !(imm5 == 00000 && type == 11). 


SBC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


SBCS, rotate right with extend variant 


Applies when S == 1 && imm5 == 00000 && type == 11. 


SBCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


SBCS, shift or rotate by value variant 


Applies when S == 1 && !(imm5 == 00000 && type == 11). 


SBCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm5);

T1

| 15 14 13 12 | 11 10 9 8 | 7 6 | 5  3 | 2  0 |
|  0  1  0  0 |  0  0 0 1 | 1 0 |  Rm  | Rdn  |


T1 variant 


SBC<c>{<q>} {<Rdn>,} <Rdn>, <Rm> // Inside IT block 
SBCS{<q>} {<Rdn>,} <Rdn>, <Rm> // Outside IT block 


Decode for this encoding 
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);

T2

| 15 14 13 12 | 11 10 9 8 | 7 6 5 | 4 | 3  0 | 15  | 14  12 | 11  8 | 7  6 | 5  4 | 3  0 |
|  1  1  1  0 |  1  0 1 1 | 0 1 1 | S |  Rn  | (0) |  imm3  |  Rd   | imm2 | type |  Rm  |


SBC, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11. 


SBC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


SBC, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11). 


SBC<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
SBC{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


SBCS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && imm2 == 00 && type == 11. 


SBCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


SBCS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11).


SBCS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
SBCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdn> Is the first general-purpose source register and the destination register, encoded in the "Rdn" field. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the
PC is used:

•   For the SBC variant, the instruction is a branch to the address calculated by the operation.
    This is an interworking branch, see Pseudocode description of operations on the AArch32
    general-purpose registers and the PC on page E1-2293.

•   For the SBCS variant, the instruction performs an exception return, that restores PSTATE
    from SPSR_<current_mode>.

For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. If omitted,
this register is the same as <Rn>.

<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.

For encoding T2: is the first general-purpose source register, encoded in the "Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.

For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field.


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], NOT(shifted), PSTATE.C);
    if d == 15 then // Can only occur for A32 encoding
        if setflags then
            ALUExceptionReturn(result);
        else
            ALUWritePC(result);
    else
        R[d] = result;
        if setflags then
            PSTATE.<N,Z,C,V> = nzcv;

F5.1.172 SBC, SBCS (register-shifted register)


Subtract with Carry (register-shifted register) subtracts a register-shifted register value and the value of NOT (Carry 
flag) from a register value, and writes the result to the destination register. It can optionally update the condition 
flags based on the result. 


A1 
| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11  8 | 7 | 6  5 | 4 | 3  0 |
| !=1111 |  0  0  0  0 |  1  1  0  S |   Rn   |   Rd   |  Rs   | 0 | type | 1 |  Rm  |
   cond


Flag setting variant 
Applies when S == 1. 


SBCS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0.


SBC{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == '1'); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation

if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C);
    (result, nzcv) = AddWithCarry(R[n], NOT(shifted), PSTATE.C);
    R[d] = result;
    if setflags then
        PSTATE.<N,Z,C,V> = nzcv;

F5.1.173 SBFX

Signed Bit Field Extract extracts any number of adjacent bits at any position from a register, sign-extends them to 
32 bits, and writes the result to the destination register. 
A1 
| 31  28 | 27 26 25 24 | 23 22 21 | 20     16 | 15  12 | 11    7 | 6 5 4 | 3  0 |
| !=1111 |  0  1  1  1 |  1  0  1 | widthm1   |   Rd   |   lsb   | 1 0 1 |  Rn  |
   cond

A1 variant

SBFX{<c>}{<q>} <Rd>, <Rn>, #<lsb>, #<width>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn);
lsbit = UInt(lsb); widthminus1 = UInt(widthm1);
if d == 15 || n == 15 then UNPREDICTABLE;

T1

| 15 14 13 12 11 | 10  | 9 8 | 7 6 5 4 | 3  0 | 15 | 14  12 | 11  8 | 7  6 |  5  | 4     0 |
|  1  1  1  1  0 | (0) | 1 1 | 0 1 0 0 |  Rn  |  0 |  imm3  |  Rd   | imm2 | (0) | widthm1 |

T1 variant

SBFX{<c>}{<q>} <Rd>, <Rn>, #<lsb>, #<width>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn);
lsbit = UInt(imm3:imm2); widthminus1 = UInt(widthm1);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the general-purpose source register, encoded in the "Rn" field. 
<lsb> For encoding A1: is the bit number of the least significant bit in the field, in the range 0 to 31,
encoded in the "lsb" field.
For encoding T1: is the bit number of the least significant bit in the field, in the range 0 to 31,
encoded in the "imm3:imm2" field.
<width> Is the width of the field, in the range 1 to 32-<lsb>, encoded in the "widthm1" field as <width>-1.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    msbit = lsbit + widthminus1;
    if msbit <= 31 then
        R[d] = SignExtend(R[n]<msbit:lsbit>, 32);
    else
        UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If msbit > 31, then one of the following behaviors must occur: 





•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The value in the destination register is UNKNOWN.
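
The extraction and sign extension performed by SBFX can be sketched in Python as below. The function name sbfx is illustrative, and raising ValueError when msbit exceeds 31 is a choice made for this sketch only. It stands in for the CONSTRAINED UNPREDICTABLE case and is not one of the architecturally permitted behaviors.

```python
def sbfx(rn: int, lsb: int, width: int) -> int:
    """Extract `width` bits of rn starting at bit `lsb`, sign-extended to 32 bits."""
    msbit = lsb + width - 1
    if not 0 <= lsb <= 31 or width < 1 or msbit > 31:
        raise ValueError("field does not fit in a 32-bit register")
    field = (rn >> lsb) & ((1 << width) - 1)
    if field & (1 << (width - 1)):      # top bit of the field is the sign bit
        field -= 1 << width
    return field & 0xFFFFFFFF


print(hex(sbfx(0x00000F00, 8, 4)))   # the 4-bit field 0b1111 reads as -1
print(hex(sbfx(0x0000F0A0, 4, 8)))   # the 8-bit field 0x0A stays positive
```

The instruction encodes <width> as widthm1 = <width> - 1, so a caller building an encoding from these arguments would store width - 1 in that field.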
F5.1.174 SDIV


Signed Divide divides a 32-bit signed integer register value by a 32-bit signed integer register value, and writes the 
result to the destination register. The condition flags are not affected. 


A1 


| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15        12 | 11  8 | 7 6 5 4 | 3  0 |
| !=1111 |  0  1  1  1 |  0  0  0  1 |   Rd   | (1)(1)(1)(1) |  Rm   | 0 0 0 1 |  Rn  |
   cond                                             Ra

A1 variant

SDIV{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a != 15 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If Ra != '1111', then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The instruction executes as described, with no change to its behavior and no additional side effects.
•   The instruction executes as described, and the register specified by Ra becomes UNKNOWN.


T1 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15        12 | 11  8 | 7 6 5 4 | 3  0 |
|  1  1  1  1 |  1  0 1 1 | 1 0 0 1 |  Rn  | (1)(1)(1)(1) |  Rd   | 1 1 1 1 |  Rm  |
                                                 Ra

T1 variant

SDIV{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a != 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

CONSTRAINED UNPREDICTABLE behavior

If Ra != '1111', then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The instruction executes as described, with no change to its behavior and no additional side effects.
•   The instruction executes as described, and the register specified by Ra becomes UNKNOWN.

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register holding the dividend, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register holding the divisor, encoded in the "Rm" field. 

Overflow

If the signed integer division 0x80000000 / 0xFFFFFFFF is performed, the pseudocode produces the intermediate
integer result +2^31, that overflows the 32-bit signed integer range. No indication of this overflow case is produced,
and the 32-bit result written to <Rd> must be the bottom 32 bits of the binary representation of +2^31. So the result of
the division is 0x80000000.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if SInt(R[m]) == 0 then
        result = 0;
    else
        result = RoundTowardsZero(Real(SInt(R[n])) / Real(SInt(R[m])));
    R[d] = result<31:0>;
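
The divide-by-zero and overflow cases described above can be checked with a short Python sketch. The names to_signed32 and sdiv are illustrative. Python's // operator rounds toward negative infinity, so the sketch divides the magnitudes and restores the sign to get the round-towards-zero behavior of the pseudocode.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def sdiv(rn: int, rm: int) -> int:
    """Signed divide of two 32-bit register values, returning a 32-bit pattern."""
    dividend, divisor = to_signed32(rn), to_signed32(rm)
    if divisor == 0:
        return 0                                   # division by zero yields 0
    quotient = abs(dividend) // abs(divisor)       # truncate the magnitude
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient & MASK32                       # keep the bottom 32 bits


print(hex(sdiv(0x80000000, 0xFFFFFFFF)))   # overflow case
print(hex(sdiv(0xFFFFFFF9, 2)))            # -7 / 2 rounds toward zero
```

For the overflow case the intermediate quotient is +2^31, and masking to 32 bits gives 0x80000000, matching the description in the Overflow section.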
F5.1.175 SEL


Select Bytes selects each byte of its result from either its first operand or its second operand, according to the values 
of the PSTATE.GE flags. 


A1 


| 31  28 | 27 26 25 24 | 23 22 21 20 | 19  16 | 15  12 | 11 10  9  8  | 7 6 5 4 | 3  0 |
| !=1111 |  0  1  1  0 |  1  0  0  0 |   Rn   |   Rd   | (1)(1)(1)(1) | 1 0 1 1 |  Rm  |
   cond

A1 variant

SEL{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 14 13 12 | 11  8 | 7 6 5 4 | 3  0 |
|  1  1  1  1 |  1  0 1 0 | 1 0 1 0 |  Rn  |  1  1  1  1 |  Rd   | 1 0 0 0 |  Rm  |

T1 variant

SEL{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    R[d]<7:0>   = if PSTATE.GE<0> == '1' then R[n]<7:0>   else R[m]<7:0>;
    R[d]<15:8>  = if PSTATE.GE<1> == '1' then R[n]<15:8>  else R[m]<15:8>;
    R[d]<23:16> = if PSTATE.GE<2> == '1' then R[n]<23:16> else R[m]<23:16>;
    R[d]<31:24> = if PSTATE.GE<3> == '1' then R[n]<31:24> else R[m]<31:24>;
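
The byte selection can be sketched in Python as follows. This is a minimal model, not ARM-provided code: the function name sel and the packing of the four PSTATE.GE flags into one integer (GE<0> in bit 0) are assumptions made for the example.

```python
def sel(rn, rm, ge):
    """Sketch of SEL: pick each result byte from rn (GE bit set) or rm (GE bit clear)."""
    result = 0
    for lane in range(4):
        source = rn if (ge >> lane) & 1 else rm
        result |= source & (0xFF << (8 * lane))
    return result

print(hex(sel(0x11223344, 0xAABBCCDD, 0b0101)))
```

With GE = 0b0101, bytes 0 and 2 come from the first operand and bytes 1 and 3 from the second.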

F5.1.176 SETEND


Set Endianness writes a new value to PSTATE.E. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 1 1 1 | 0 0 0 1 | 0 0 0 0 |(0)(0)(0) 1|(0)(0)(0)(0)|(0)(0) E (0)| 0 0 0 0 |(0)(0)(0)(0)|

A1 variant

SETEND{<q>} <endian_specifier> // Cannot be conditional

Decode for this encoding

set_bigend = (E == '1');


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 0 1 1 | 0 1 1 0 | 0 1 0 (1)| E (0)(0)(0)|


T1 variant 


SETEND{<q>} <endian_specifier> // Not permitted in IT block 


Decode for this encoding 


set_bigend = (E == '1'); 
if InITBlock() then UNPREDICTABLE; 


Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 


<endian_specifier> Is the endianness to be selected, and the value to be set in PSTATE.E, encoded in the "E" field.
It can have the following values:
LE when E = 0
BE when E = 1


Operation for all encodings 


EncodingSpecificOperations(); 
AArch32.CheckSETENDEnabled(); 
PSTATE.E = if set_bigend then '1' else '0'; 





F5-2966 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k_iss10775
Non-Confidential ID092916


F5 T32 and A32 Base Instruction Set Instruction Descriptions 
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


F5.1.177 SEV 


Send Event is a hint instruction. It causes an event to be signaled to all PEs in the multiprocessor system. For more 
information, see Wait For Event and Send Event on page G1-3872. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
|!=1111| 0 0 1 1 | 0 0 1 0 | 0 0 0 0 |(1)(1)(1)(1)|(0)(0)(0)(0)| 0 0 0 0 | 0 1 0 0 |
 cond

A1 variant


SEV{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 0 1 1 | 1 1 1 1 | 0 1 0 0 | 0 0 0 0 |


T1 variant 


SEV{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 1 1 1 | 0 0 1 1 | 1 0 1 0 |(1)(1)(1)(1)| 1 0 (0) 0 |(0) 0 0 0 | 0 0 0 0 | 0 1 0 0 |


T2 variant 


SEV{<c>}.W 


Decode for this encoding 


// No additional decoding required 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    SendEvent();

F5.1.178 SEVL


Send Event Local is a hint instruction. It causes an event to be signaled locally without the requirement to affect 
other PEs in the multiprocessor system. It can prime a wait-loop which starts with a WFE instruction. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
|!=1111| 0 0 1 1 | 0 0 1 0 | 0 0 0 0 |(1)(1)(1)(1)|(0)(0)(0)(0)| 0 0 0 0 | 0 1 0 1 |
 cond

A1 variant


SEVL{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 0 1 1 | 1 1 1 1 | 0 1 0 1 | 0 0 0 0 |


T1 variant 


SEVL{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 1 1 1 | 0 0 1 1 | 1 0 1 0 |(1)(1)(1)(1)| 1 0 (0) 0 |(0) 0 0 0 | 0 0 0 0 | 0 1 0 1 |


T2 variant 


SEVL{<c>}.W 


Decode for this encoding 


// No additional decoding required 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    EventRegisterSet();

F5.1.179 SHADD16

Signed Halving Add 16 performs two signed 16-bit integer additions, halves the results, and writes the results to the
destination register.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 1 1 0 | 0 0 1 1 | Rn | Rd |(1)(1)(1)(1)| 0 0 0 1 | Rm |
 cond

A1 variant

SHADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 1 | Rn | 1 1 1 1 | Rd | 0 0 1 0 | Rm |

T1 variant

SHADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = SInt(R[n]<15:0>) + SInt(R[m]<15:0>);
    sum2 = SInt(R[n]<31:16>) + SInt(R[m]<31:16>);
    R[d]<15:0> = sum1<16:1>;
    R[d]<31:16> = sum2<16:1>;
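
A rough Python model of this operation is sketched below. The names shadd16 and sint are hypothetical. Taking bits <16:1> of the 17-bit sum is modelled as an arithmetic shift right by one followed by a 16-bit mask, which is why the halved sum can never overflow its lane.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def shadd16(rn, rm):
    """Sketch of SHADD16: add each signed halfword pair and keep sum<16:1>."""
    result = 0
    for shift in (0, 16):
        total = sint(rn >> shift, 16) + sint(rm >> shift, 16)
        result |= ((total >> 1) & 0xFFFF) << shift
    return result

print(hex(shadd16(0x7FFF7FFF, 0x7FFF0001)))
```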










F5.1.180 SHADD8

Signed Halving Add 8 performs four signed 8-bit integer additions, halves the results, and writes the results to the
destination register.

A1

|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 1 1 0 | 0 0 1 1 | Rn | Rd |(1)(1)(1)(1)| 1 0 0 1 | Rm |
 cond

A1 variant

SHADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 0 | Rn | 1 1 1 1 | Rd | 0 0 1 0 | Rm |

T1 variant

SHADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register, encoded in the "Rm" field.

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = SInt(R[n]<7:0>) + SInt(R[m]<7:0>);
    sum2 = SInt(R[n]<15:8>) + SInt(R[m]<15:8>);
    sum3 = SInt(R[n]<23:16>) + SInt(R[m]<23:16>);
    sum4 = SInt(R[n]<31:24>) + SInt(R[m]<31:24>);
    R[d]<7:0> = sum1<8:1>;
    R[d]<15:8> = sum2<8:1>;
    R[d]<23:16> = sum3<8:1>;
    R[d]<31:24> = sum4<8:1>;
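
The four byte lanes can be modelled in the same way. The sketch below is illustrative only, and shadd8 and sint are hypothetical names.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def shadd8(rn, rm):
    """Sketch of SHADD8: add each signed byte pair and keep sum<8:1>."""
    result = 0
    for shift in (0, 8, 16, 24):
        total = sint(rn >> shift, 8) + sint(rm >> shift, 8)
        result |= ((total >> 1) & 0xFF) << shift
    return result

print(hex(shadd8(0x7F80FF01, 0x7F80FF02)))
```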

F5.1.181 SHASX


Signed Halving Add and Subtract with Exchange exchanges the two halfwords of the second operand, performs one 
signed 16-bit integer addition and one signed 16-bit subtraction, halves the results, and writes the results to the 
destination register. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 1 1 0 | 0 0 1 1 | Rn | Rd |(1)(1)(1)(1)| 0 0 1 1 | Rm |
 cond

A1 variant

SHASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 1 0 | Rn | 1 1 1 1 | Rd | 0 0 1 0 | Rm |

T1 variant

SHASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff = SInt(R[n]<15:0>) - SInt(R[m]<31:16>);
    sum = SInt(R[n]<31:16>) + SInt(R[m]<15:0>);
    R[d]<15:0> = diff<16:1>;
    R[d]<31:16> = sum<16:1>;
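
The exchange of halfwords is easiest to see in a worked model. The following Python sketch (hypothetical names shasx and sint) pairs the bottom half of the first operand with the top half of the second for the subtraction, and the top half of the first with the bottom half of the second for the addition.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def shasx(rn, rm):
    """Sketch of SHASX: low half = (Rn.lo - Rm.hi) / 2, high half = (Rn.hi + Rm.lo) / 2."""
    diff = sint(rn, 16) - sint(rm >> 16, 16)
    total = sint(rn >> 16, 16) + sint(rm, 16)
    return (((total >> 1) & 0xFFFF) << 16) | ((diff >> 1) & 0xFFFF)

print(hex(shasx(0x00040006, 0x00020008)))
```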

F5.1.182 SHSAX


Signed Halving Subtract and Add with Exchange exchanges the two halfwords of the second operand, performs one 
signed 16-bit integer subtraction and one signed 16-bit addition, halves the results, and writes the results to the 
destination register. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 1 1 0 | 0 0 1 1 | Rn | Rd |(1)(1)(1)(1)| 0 1 0 1 | Rm |
 cond

A1 variant

SHSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 1 0 | Rn | 1 1 1 1 | Rd | 0 0 1 0 | Rm |

T1 variant

SHSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    sum = SInt(R[n]<15:0>) + SInt(R[m]<31:16>);
    diff = SInt(R[n]<31:16>) - SInt(R[m]<15:0>);
    R[d]<15:0> = sum<16:1>;
    R[d]<31:16> = diff<16:1>;
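
For comparison with SHASX, a minimal Python sketch of this operation follows. The names shsax and sint are hypothetical. Here the addition lands in the bottom halfword and the subtraction in the top halfword.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def shsax(rn, rm):
    """Sketch of SHSAX: low half = (Rn.lo + Rm.hi) / 2, high half = (Rn.hi - Rm.lo) / 2."""
    total = sint(rn, 16) + sint(rm >> 16, 16)
    diff = sint(rn >> 16, 16) - sint(rm, 16)
    return (((diff >> 1) & 0xFFFF) << 16) | ((total >> 1) & 0xFFFF)

print(hex(shsax(0x00040006, 0x00020008)))
```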

F5.1.183 SHSUB16

Signed Halving Subtract 16 performs two signed 16-bit integer subtractions, halves the results, and writes the results
to the destination register.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 1 1 0 | 0 0 1 1 | Rn | Rd |(1)(1)(1)(1)| 0 1 1 1 | Rm |
 cond

A1 variant

SHSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 1 | Rn | 1 1 1 1 | Rd | 0 0 1 0 | Rm |

T1 variant

SHSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = SInt(R[n]<15:0>) - SInt(R[m]<15:0>);
    diff2 = SInt(R[n]<31:16>) - SInt(R[m]<31:16>);
    R[d]<15:0> = diff1<16:1>;
    R[d]<31:16> = diff2<16:1>;
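
A small Python model, given here only as a sketch with the hypothetical names shsub16 and sint, shows how the halving keeps even the extreme difference -32768 - 32767 inside a 16-bit lane.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def shsub16(rn, rm):
    """Sketch of SHSUB16: subtract each signed halfword pair and keep diff<16:1>."""
    result = 0
    for shift in (0, 16):
        diff = sint(rn >> shift, 16) - sint(rm >> shift, 16)
        result |= ((diff >> 1) & 0xFFFF) << shift
    return result

print(hex(shsub16(0x00018000, 0x00037FFF)))
```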

F5.1.184 SHSUB8

Signed Halving Subtract 8 performs four signed 8-bit integer subtractions, halves the results, and writes the results
to the destination register.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 1 1 0 | 0 0 1 1 | Rn | Rd |(1)(1)(1)(1)| 1 1 1 1 | Rm |
 cond

A1 variant

SHSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 0 | Rn | 1 1 1 1 | Rd | 0 0 1 0 | Rm |

T1 variant

SHSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>

Decode for this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = SInt(R[n]<7:0>) - SInt(R[m]<7:0>);
    diff2 = SInt(R[n]<15:8>) - SInt(R[m]<15:8>);
    diff3 = SInt(R[n]<23:16>) - SInt(R[m]<23:16>);
    diff4 = SInt(R[n]<31:24>) - SInt(R[m]<31:24>);
    R[d]<7:0> = diff1<8:1>;
    R[d]<15:8> = diff2<8:1>;
    R[d]<23:16> = diff3<8:1>;
    R[d]<31:24> = diff4<8:1>;
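
The byte-wise form can be sketched the same way. The names shsub8 and sint below are hypothetical, and the model covers only the data path, not the condition check or the register restrictions.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def shsub8(rn, rm):
    """Sketch of SHSUB8: subtract each signed byte pair and keep diff<8:1>."""
    result = 0
    for shift in (0, 8, 16, 24):
        diff = sint(rn >> shift, 8) - sint(rm >> shift, 8)
        result |= ((diff >> 1) & 0xFF) << shift
    return result

print(hex(shsub8(0x05800A00, 0x037F0401)))
```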

F5.1.185 SMC


Secure Monitor Call causes a Secure Monitor Call exception. For more information see Secure Monitor Call (SMC) 
exception on page G1-3854. 


SMC is available only for software executing at EL1 or higher. It is UNDEFINED in User mode. 


If the values of HCR.TSC and SCR.SCD are both 0, execution of an SMC instruction at EL1 or higher generates a 
Secure Monitor Call exception that is taken to EL3. When EL3 is using AArch32 this exception is taken to Monitor 
mode. When EL3 is using AArch64, it is the SCR_EL3.SMD bit, rather than the SCR.SCD bit, that can change the 
effect of executing an SMC instruction. 


If the value of HCR.TSC is 1, execution of an SMC instruction in a Non-secure EL1 mode generates an exception that
is taken to EL2, regardless of the value of SCR.SCD. When EL2 is using AArch32, this is a Hyp Trap exception
that is taken to Hyp mode. For more information see Traps to Hyp mode of Non-secure EL1 execution of SMC
instructions on page G1-3901.


If the value of HCR.TSC is 0 and the value of SCR.SCD is 1, the SMC instruction is:
• UNDEFINED in Non-secure state.
• CONSTRAINED UNPREDICTABLE if executed in Secure state at EL1 or higher.


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 14 13 12|11 10 9 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 1 1 0 |(0)(0)(0)(0)|(0)(0)(0)(0)|(0)(0)(0)(0)| 0 1 1 1 | imm4 |
 cond

A1 variant

SMC{<c>}{<q>} {#}<imm4>

Decode for this encoding

// imm4 is for assembly/disassembly only and is ignored by hardware


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|
| 1 1 1 1 | 0 1 1 1 | 1 1 1 1 | imm4 | 1 0 0 0 |(0)(0)(0)(0)|(0)(0)(0)(0)|(0)(0)(0)(0)|


T1 variant 

SMC{<c>}{<q>} {#}<imm4> 

Decode for this encoding 

// imm4 is for assembly/disassembly only and is ignored by hardware 
if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<imm4> Is a 4-bit unsigned immediate value, in the range 0 to 15, encoded in the "imm4" field. This is
ignored by the PE. The Secure Monitor Call exception handler (Secure Monitor code) can use this
value to determine what service is being requested, but ARM does not recommend this.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();

    if !HaveEL(EL3) || PSTATE.EL == EL0 then
        UNDEFINED;

    AArch32.CheckForSMCTrap();

    if !ELUsingAArch32(EL3) then
        if SCR_EL3.SMD == '1' then
            // SMC disabled.
            UNDEFINED;
    else
        if SCR.SCD == '1' then
            // SMC disabled
            if IsSecure() then
                // Executes either as a NOP or UNALLOCATED.
                c = ConstrainUnpredictable(Unpredictable_SMD);
                assert c IN {Constraint_NOP, Constraint_UNDEF};
                if c == Constraint_NOP then EndOfInstruction();
            UNDEFINED;

    if !ELUsingAArch32(EL3) then
        AArch64.CallSecureMonitor(Zeros(16));
    else
        AArch32.TakeSMCException();


CONSTRAINED UNPREDICTABLE behavior

If SCR.SCD == '1' && IsSecure(), then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
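
The decision flow of the Operation pseudocode can be summarised with a small Python sketch. It is a simplified model under stated assumptions: the function name smc_outcome, its parameters, and the returned strings are all hypothetical, and the HCR.TSC trap performed by AArch32.CheckForSMCTrap() is assumed not to be taken.

```python
def smc_outcome(have_el3, el, el3_is_aarch32, smd=0, scd=0,
                secure=False, secure_choice="UNDEFINED"):
    """Sketch of the SMC outcome, ignoring the HCR.TSC trap to EL2.

    secure_choice models the implementation's CONSTRAINED UNPREDICTABLE
    choice ("NOP" or "UNDEFINED") when SCR.SCD == 1 in Secure state.
    """
    if not have_el3 or el == 0:
        return "UNDEFINED"
    if not el3_is_aarch32:
        # EL3 using AArch64: SCR_EL3.SMD controls whether SMC is disabled.
        return "UNDEFINED" if smd == 1 else "SMC exception to EL3"
    if scd == 1:
        # EL3 using AArch32: SCR.SCD disables SMC.
        if secure:
            if secure_choice not in ("NOP", "UNDEFINED"):
                raise ValueError("choice must be NOP or UNDEFINED")
            return secure_choice
        return "UNDEFINED"
    return "SMC exception to Monitor mode"

print(smc_outcome(True, 1, True, scd=1, secure=True, secure_choice="NOP"))
```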

F5.1.186 SMLABB, SMLABT, SMLATB, SMLATT


Signed Multiply Accumulate (halfwords) performs a signed multiply accumulate operation. The multiply acts on 
two signed 16-bit quantities, taken from either the bottom or the top half of their respective source registers. The 
other halves of these source registers are ignored. The 32-bit product is added to a 32-bit accumulate value and the 
result is written to the destination register. 


If overflow occurs during the addition of the accumulate value, the instruction sets PSTATE.Q to 1. It is not possible 
for overflow to occur during the multiplication. 


A1 

|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 0 0 0 | Rd | Ra | Rm | 1 M N 0 | Rn |
 cond

SMLABB variant
Applies when M == 0 && N == 0.

SMLABB{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

SMLABT variant
Applies when M == 1 && N == 0.

SMLABT{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

SMLATB variant
Applies when M == 0 && N == 1.

SMLATB{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

SMLATT variant
Applies when M == 1 && N == 1.

SMLATT{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

Decode for all variants of this encoding

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
n_high = (N == '1'); m_high = (M == '1');
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 0 0 0 1 | Rn |!=1111| Rd | 0 0 N M | Rm |
                                     Ra

SMLABB variant
Applies when N == 0 && M == 0.

SMLABB{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

SMLABT variant
Applies when N == 0 && M == 1.


SMLABT{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


SMLATB variant 
Applies when N == 1 && M == 0. 


SMLATB{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


SMLATT variant 
Applies when N == 1 && M == 1. 


SMLATT{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


Decode for all variants of this encoding 

if Ra == '1111' then SEE SMULBB, SMULBT, SMULTB, SMULTT; 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);

n_high = (N == '1'); m_high = (M == '1'); 

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register holding the multiplicand in the bottom or top half
(selected by <x>), encoded in the "Rn" field.

<Rm> Is the second general-purpose source register holding the multiplier in the bottom or top half
(selected by <y>), encoded in the "Rm" field.


<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = if n_high then R[n]<31:16> else R[n]<15:0>;
    operand2 = if m_high then R[m]<31:16> else R[m]<15:0>;
    result = SInt(operand1) * SInt(operand2) + SInt(R[a]);
    R[d] = result<31:0>;
    if result != SInt(result<31:0>) then // Signed overflow
        PSTATE.Q = '1';
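
The halfword selection and the overflow test can be modelled as below. This is an illustrative Python sketch: smla_xy and sint are hypothetical names, and the second return value only reports whether the instruction would set PSTATE.Q (the flag is sticky, so the instruction never clears it).

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def smla_xy(rn, rm, ra, n_high, m_high):
    """Sketch of SMLA<x><y>: returns (Rd value, whether PSTATE.Q would be set)."""
    operand1 = sint(rn >> 16 if n_high else rn, 16)
    operand2 = sint(rm >> 16 if m_high else rm, 16)
    result = operand1 * operand2 + sint(ra, 32)
    rd = result & 0xFFFFFFFF
    return rd, result != sint(rd, 32)   # mismatch means signed overflow

# SMLABT: bottom half of Rn, top half of Rm.
print(smla_xy(0x00020003, 0x00040005, 10, n_high=False, m_high=True))
```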


F5.1.187  SMLAD, SMLADX 


Signed Multiply Accumulate Dual performs two signed 16 x 16-bit multiplications. It adds the products to a 32-bit 
accumulate operand. 


Optionally, the instruction can exchange the halfwords of the second operand before performing the arithmetic. This 
produces top x bottom and bottom x top multiplication. 


This instruction sets PSTATE.Q to 1 if the accumulate operation overflows. Overflow cannot occur during the
multiplications.

A1

|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
|!=1111| 0 1 1 1 | 0 0 0 0 | Rd |!=1111| Rm | 0 0 M 1 | Rn |
 cond                             Ra

SMLAD variant
Applies when M == 0.

SMLAD{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

SMLADX variant
Applies when M == 1.

SMLADX{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

Decode for all variants of this encoding

if Ra == '1111' then SEE SMUAD;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
m_swap = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 0 0 1 0 | Rn |!=1111| Rd | 0 0 0 M | Rm |
                                     Ra

SMLAD variant
Applies when M == 0.

SMLAD{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

SMLADX variant
Applies when M == 1.

SMLADX{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>

Decode for all variants of this encoding

if Ra == '1111' then SEE SMUAD;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
m_swap = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_swap then ROR(R[m],16) else R[m];
    product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
    product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
    result = product1 + product2 + SInt(R[a]);
    R[d] = result<31:0>;
    if result != SInt(result<31:0>) then // Signed overflow
        PSTATE.Q = '1';
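
As a hedged illustration, the dual multiply and the optional halfword exchange can be written in Python as follows. The names smlad and sint are hypothetical, inputs are assumed to be 32-bit register values, and the second return value only indicates whether PSTATE.Q would be set.

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def smlad(rn, rm, ra, m_swap=False):
    """Sketch of SMLAD (m_swap=False) and SMLADX (m_swap=True)."""
    if m_swap:
        rm = ((rm >> 16) | (rm << 16)) & 0xFFFFFFFF   # ROR(R[m], 16)
    product1 = sint(rn, 16) * sint(rm, 16)
    product2 = sint(rn >> 16, 16) * sint(rm >> 16, 16)
    result = product1 + product2 + sint(ra, 32)
    rd = result & 0xFFFFFFFF
    return rd, result != sint(rd, 32)

print(smlad(0x00020003, 0x00040005, 1))
print(smlad(0x00020003, 0x00040005, 1, m_swap=True))
```

The case where all four halfwords are 0x8000 and the addend is zero gives a sum of +2^31, so the sketch reports an overflow even though neither multiplication overflows on its own.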


F5.1.188 SMLAL, SMLALS 


Signed Multiply Accumulate Long multiplies two signed 32-bit values to produce a 64-bit value, and accumulates 
this with a 64-bit value. 


In A32 instructions, the condition flags can optionally be updated based on the result. Use of this option adversely 
affects performance on many implementations. 


A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|


cond


Flag setting variant
Applies when S == 1.

SMLALS{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

Not flag setting variant
Applies when S == 0.

SMLAL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

Decode for all variants of this encoding

dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|


T1 variant

SMLAL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

Decode for this encoding

dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = FALSE;
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<RdLo> Is the general-purpose source register holding the lower 32 bits of the addend, and the destination
register for the lower 32 bits of the result, encoded in the "RdLo" field.

<RdHi> Is the general-purpose source register holding the upper 32 bits of the addend, and the destination
register for the upper 32 bits of the result, encoded in the "RdHi" field.


<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = SInt(R[n]) * SInt(R[m]) + SInt(R[dHi]:R[dLo]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
    if setflags then
        PSTATE.N = result<63>;
        PSTATE.Z = IsZeroBit(result<63:0>);
        // PSTATE.C, PSTATE.V unchanged
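
The 64-bit accumulate and the flag computation can be sketched in Python as follows. The names smlal and sint are hypothetical, inputs are assumed to be 32-bit register values, and the returned N and Z values are those the flag-setting variant would write (the non-flag-setting variant leaves the flags unchanged).

```python
def sint(x, bits):
    """Interpret the low `bits` bits of x as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def smlal(rd_hi, rd_lo, rn, rm):
    """Sketch of SMLAL/SMLALS: returns (RdHi, RdLo, N, Z)."""
    addend = sint((rd_hi << 32) | rd_lo, 64)        # SInt(R[dHi]:R[dLo])
    result = (sint(rn, 32) * sint(rm, 32) + addend) & 0xFFFFFFFFFFFFFFFF
    new_hi, new_lo = result >> 32, result & 0xFFFFFFFF
    return new_hi, new_lo, result >> 63, int(result == 0)

print(smlal(0, 1, 0xFFFFFFFF, 2))
```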


F5.1.189 SMLALBB, SMLALBT, SMLALTB, SMLALTT 


Signed Multiply Accumulate Long (halfwords) multiplies two signed 16-bit values to produce a 32-bit value, and 
accumulates this with a 64-bit value. The multiply acts on two signed 16-bit quantities, taken from either the bottom 
or the top half of their respective source registers. The other halves of these source registers are ignored. The 32-bit 
product is sign-extended and accumulated with a 64-bit accumulate value. 


Overflow is possible during this instruction, but only as a result of the 64-bit addition. This overflow is not detected
if it occurs. Instead, the result wraps around modulo 2^64.


A1 

|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
|!=1111| 0 0 0 1 | 0 1 0 0 | RdHi | RdLo | Rm | 1 M N 0 | Rn |
 cond

SMLALBB variant
Applies when M == 0 && N == 0.

SMLALBB{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

SMLALBT variant
Applies when M == 1 && N == 0.

SMLALBT{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

SMLALTB variant
Applies when M == 0 && N == 1.

SMLALTB{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

SMLALTT variant
Applies when M == 1 && N == 1.

SMLALTT{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>

Decode for all variants of this encoding

dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm);
n_high = (N == '1'); m_high = (M == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


T1 


151413 12\11109 8|7 6 5 4|3 0 |15 12|11 8|7 6 5 4|3 0 | 


Tiit70%777100| Ra | Reo | Raw [1 O[N[M] Rm 


SMLALBB variant 
Applies when N == 0 && M == 


SMLALBB{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 


SMLALBT variant 
Applies when N == 0 && M == 1.


SMLALBT{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 


SMLALTB variant 
Applies when N == 1 && M == 0.


SMLALTB{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 


SMLALTT variant 
Applies when N == 1 && M == 1.


SMLALTT{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 


Decode for all variants of this encoding 


dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); 
n_high = (N == '1'); m_high = (M == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<RdLo> Is the general-purpose source register holding the lower 32 bits of the addend, and the destination
register for the lower 32 bits of the result, encoded in the "RdLo" field.
<RdHi> Is the general-purpose source register holding the upper 32 bits of the addend, and the destination
register for the upper 32 bits of the result, encoded in the "RdHi" field.
<Rn> For encoding A1: is the first general-purpose source register holding the multiplicand in the bottom
or top half (selected by <x>), encoded in the "Rn" field.
For encoding T1: is the first general-purpose source register holding the multiplicand in the bottom
or top half (selected by <x>), encoded in the "Rn" field.
<Rm> For encoding A1: is the second general-purpose source register holding the multiplier in the bottom
or top half (selected by <y>), encoded in the "Rm" field.
For encoding T1: is the second general-purpose source register holding the multiplier in the bottom
or top half (selected by <y>), encoded in the "Rm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = if n_high then R[n]<31:16> else R[n]<15:0>;
    operand2 = if m_high then R[m]<31:16> else R[m]<15:0>;
    result = SInt(operand1) * SInt(operand2) + SInt(R[dHi]:R[dLo]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
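
The operation above can be modeled in a few lines of Python. This is only an illustrative sketch, not part of the architecture: the names smlal_xy and sint are hypothetical, registers are passed as unsigned 32-bit integers, and the condition check and the UNPREDICTABLE register checks are omitted.

```python
MASK64 = (1 << 64) - 1

def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smlal_xy(rn, rm, rd_hi, rd_lo, n_high, m_high):
    """Return the new (RdHi, RdLo) pair for SMLAL<x><y> (hypothetical helper)."""
    operand1 = rn >> 16 if n_high else rn      # sint() keeps only the low 16 bits
    operand2 = rm >> 16 if m_high else rm
    accumulator = sint((rd_hi << 32) | rd_lo, 64)
    result = sint(operand1, 16) * sint(operand2, 16) + accumulator
    result &= MASK64                           # overflow wraps modulo 2^64
    return result >> 32, result & 0xFFFFFFFF
```

With n_high and m_high both False the sketch behaves as SMLALBB; with both True it behaves as SMLALTT.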





F5.1.190 SMLALD, SMLALDX 
Signed Multiply Accumulate Long Dual performs two signed 16 x 16-bit multiplications. It adds the products to a 
64-bit accumulate operand. 
Optionally, the instruction can exchange the halfwords of the second operand before performing the arithmetic. This 
produces top x bottom and bottom x top multiplication. 
Overflow is possible during this instruction, but only as a result of the 64-bit addition. This overflow is not detected 
if it occurs. Instead, the result wraps around modulo 2^64.
A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 1 0 0 | RdHi | RdLo | Rm | 0 0 M 1 | Rn |
SMLALD variant 
Applies when M == 0. 
SMLALD{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
SMLALDX variant 
Applies when M == 1. 
SMLALDX{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Decode for all variants of this encoding 
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1'); 
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior 
If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
T1
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 1 1 0 0 | Rn | RdLo | RdHi | 1 1 0 M | Rm |
SMLALD variant 
Applies when M == 0. 
SMLALD{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
SMLALDX variant 
Applies when M == 1. 
SMLALDX{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Decode for all variants of this encoding


dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<RdLo> Is the general-purpose source register holding the lower 32 bits of the addend, and the destination
register for the lower 32 bits of the result, encoded in the "RdLo" field.
<RdHi> Is the general-purpose source register holding the upper 32 bits of the addend, and the destination
register for the upper 32 bits of the result, encoded in the "RdHi" field.
<Rn> Is the first general-purpose source register, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register, encoded in the "Rm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_swap then ROR(R[m],16) else R[m];
    product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
    product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
    result = product1 + product2 + SInt(R[dHi]:R[dLo]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
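
As a rough illustration of the dual multiply and the optional halfword exchange, the pseudocode above can be sketched in Python. The names smlald, ror32 and sint are hypothetical; registers are assumed to be unsigned 32-bit integers, and the condition and UNPREDICTABLE checks are left out.

```python
MASK64 = (1 << 64) - 1

def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ror32(value, shift):
    """Rotate a 32-bit value right by `shift` bits (0 < shift < 32)."""
    return ((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF

def smlald(rn, rm, rd_hi, rd_lo, m_swap):
    """Return the new (RdHi, RdLo) pair; m_swap=True models SMLALDX."""
    operand2 = ror32(rm, 16) if m_swap else rm
    product1 = sint(rn, 16) * sint(operand2, 16)              # bottom x bottom
    product2 = sint(rn >> 16, 16) * sint(operand2 >> 16, 16)  # top x top
    result = product1 + product2 + sint((rd_hi << 32) | rd_lo, 64)
    result &= MASK64                                          # wraps modulo 2^64
    return result >> 32, result & 0xFFFFFFFF
```

With Rn holding halfwords (2, 3) and Rm holding (4, 5), top first, the non-exchanging form accumulates 3*5 + 2*4 and the exchanging form accumulates 3*4 + 2*5.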





F5.1.191 SMLAWB, SMLAWT 
Signed Multiply Accumulate (word by halfword) performs a signed multiply accumulate operation. The multiply 
acts on a signed 32-bit quantity and a signed 16-bit quantity. The signed 16-bit quantity is taken from either the 
bottom or the top half of its source register. The other half of the second source register is ignored. The top 32 bits 
of the 48-bit product are added to a 32-bit accumulate value and the result is written to the destination register. The 
bottom 16 bits of the 48-bit product are ignored. 
If overflow occurs during the addition of the accumulate value, the instruction sets PSTATE.Q to 1. No overflow 
can occur during the multiplication. 
A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 0 0 1 | 0 0 1 0 | Rd | Ra | Rm | 1 M 0 0 | Rn |
SMLAWB variant 
Applies when M == 0. 
SMLAWB{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
SMLAWT variant 
Applies when M == 1. 
SMLAWT{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); m_high = (M == '1');
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE; 
T1
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 0 1 1 | Rn | Ra (!= 1111) | Rd | 0 0 0 M | Rm |
SMLAWB variant 
Applies when M == 0. 
SMLAWB{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
SMLAWT variant 
Applies when M == 1. 
SMLAWT{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for all variants of this encoding 
if Ra == '1111' then SEE SMULWB, SMULWT; 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); m_high = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register holding the multiplier in the bottom or top half
(selected by <y>), encoded in the "Rm" field.
<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_high then R[m]<31:16> else R[m]<15:0>;
    result = SInt(R[n]) * SInt(operand2) + (SInt(R[a]) << 16);
    R[d] = result<47:16>;
    if (result >> 16) != SInt(R[d]) then // Signed overflow
        PSTATE.Q = '1';
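
The way the accumulate value is aligned with the top 32 bits of the 48-bit product, and how the overflow test sets Q, may be easier to see in a small Python model. This is a sketch under assumptions: smlaw and sint are hypothetical names, registers are unsigned 32-bit integers, and the returned flag only reports whether this instruction would set PSTATE.Q (it does not model Q being sticky).

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smlaw(rn, rm, ra, m_high):
    """Return (Rd, sets_q); m_high=True models SMLAWT, False models SMLAWB."""
    operand2 = rm >> 16 if m_high else rm          # sint() keeps the low 16 bits
    result = sint(rn, 32) * sint(operand2, 16) + (sint(ra, 32) << 16)
    rd = (result >> 16) & 0xFFFFFFFF               # result<47:16>
    sets_q = (result >> 16) != sint(rd, 32)        # signed overflow of the addition
    return rd, sets_q
```

Python's `>>` on a negative integer rounds toward minus infinity, which matches the behavior of `>>` on integers in the pseudocode.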


F5.1.192 SMLSD, SMLSDX


Signed Multiply Subtract Dual performs two signed 16 x 16-bit multiplications. It adds the difference of the 
products to a 32-bit accumulate operand. 


Optionally, the instruction can exchange the halfwords of the second operand before performing the arithmetic. This 
produces top x bottom and bottom x top multiplication. 


This instruction sets PSTATE.Q to 1 if the accumulate operation overflows. Overflow cannot occur during the 
multiplications or subtraction. 


A1

| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 0 0 0 | Rd | Ra (!= 1111) | Rm | 0 1 M 1 | Rn |


SMLSD variant 
Applies when M == 0. 


SMLSD{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


SMLSDX variant 
Applies when M == 1. 


SMLSDX{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


Decode for all variants of this encoding 
if Ra == '1111' then SEE SMUSD; 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); m_swap = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 1 0 0 | Rn | Ra (!= 1111) | Rd | 0 0 0 M | Rm |


SMLSD variant 
Applies when M == 0. 


SMLSD{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


SMLSDX variant 
Applies when M == 1. 


SMLSDX{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 


Decode for all variants of this encoding 


if Ra == '1111' then SEE SMUSD; 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); m_swap = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register, encoded in the "Rm" field.
<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_swap then ROR(R[m],16) else R[m];
    product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
    product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
    result = product1 - product2 + SInt(R[a]);
    R[d] = result<31:0>;
    if result != SInt(result<31:0>) then // Signed overflow
        PSTATE.Q = '1';
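
A minimal Python sketch of this operation follows, assuming unsigned 32-bit register values. The names smlsd, ror32 and sint are hypothetical, and the second return value only indicates whether the instruction would set PSTATE.Q.

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ror32(value, shift):
    """Rotate a 32-bit value right by `shift` bits (0 < shift < 32)."""
    return ((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF

def smlsd(rn, rm, ra, m_swap):
    """Return (Rd, sets_q); m_swap=True models SMLSDX."""
    operand2 = ror32(rm, 16) if m_swap else rm
    product1 = sint(rn, 16) * sint(operand2, 16)              # bottom halfwords
    product2 = sint(rn >> 16, 16) * sint(operand2 >> 16, 16)  # top halfwords
    result = product1 - product2 + sint(ra, 32)
    rd = result & 0xFFFFFFFF
    return rd, result != sint(rd, 32)                         # overflow on accumulate
```

The difference of the two products always fits in 32 bits, so only the final accumulate can overflow, as the description states.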


F5.1.193 SMLSLD, SMLSLDX


Signed Multiply Subtract Long Dual performs two signed 16 x 16-bit multiplications. It adds the difference of the 
products to a 64-bit accumulate operand. 


Optionally, the instruction can exchange the halfwords of the second operand before performing the arithmetic. This 
produces top x bottom and bottom x top multiplication. 


Overflow is possible during this instruction, but only as a result of the 64-bit addition. This overflow is not detected
if it occurs. Instead, the result wraps around modulo 2^64.


A1

| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 1 0 0 | RdHi | RdLo | Rm | 0 1 M 1 | Rn |

SMLSLD variant 

Applies when M == 0. 

SMLSLD{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
SMLSLDX variant 

Applies when M == 1. 

SMLSLDX{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Decode for all variants of this encoding 
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1'); 


if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 

If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


T1

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 1 1 0 1 | Rn | RdLo | RdHi | 1 1 0 M | Rm |


SMLSLD variant 

Applies when M == 0. 

SMLSLD{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
SMLSLDX variant 

Applies when M == 1. 


SMLSLDX{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Decode for all variants of this encoding


dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<RdLo> Is the general-purpose source register holding the lower 32 bits of the addend, and the destination
register for the lower 32 bits of the result, encoded in the "RdLo" field.
<RdHi> Is the general-purpose source register holding the upper 32 bits of the addend, and the destination
register for the upper 32 bits of the result, encoded in the "RdHi" field.
<Rn> Is the first general-purpose source register, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register, encoded in the "Rm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_swap then ROR(R[m],16) else R[m];
    product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
    product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
    result = product1 - product2 + SInt(R[dHi]:R[dLo]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
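
The same operation, expressed as a hedged Python sketch: smlsld, ror32 and sint are hypothetical names, registers are assumed to be unsigned 32-bit integers, and the condition and UNPREDICTABLE checks are omitted.

```python
MASK64 = (1 << 64) - 1

def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ror32(value, shift):
    """Rotate a 32-bit value right by `shift` bits (0 < shift < 32)."""
    return ((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF

def smlsld(rn, rm, rd_hi, rd_lo, m_swap):
    """Return the new (RdHi, RdLo) pair; m_swap=True models SMLSLDX."""
    operand2 = ror32(rm, 16) if m_swap else rm
    product1 = sint(rn, 16) * sint(operand2, 16)              # bottom halfwords
    product2 = sint(rn >> 16, 16) * sint(operand2 >> 16, 16)  # top halfwords
    result = product1 - product2 + sint((rd_hi << 32) | rd_lo, 64)
    result &= MASK64                                          # wraps modulo 2^64
    return result >> 32, result & 0xFFFFFFFF
```

A negative difference with a zero accumulator shows the sign extension into RdHi, for example a result of -7 is returned as (0xFFFFFFFF, 0xFFFFFFF9).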





F5.1.194 SMMLA, SMMLAR 
Signed Most Significant Word Multiply Accumulate multiplies two signed 32-bit values, extracts the most 
significant 32 bits of the result, and adds an accumulate value. 
Optionally, the instruction can specify that the result is rounded instead of being truncated. In this case, the constant 
0x80000000 is added to the product before the high word is extracted. 
A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 1 0 1 | Rd | Ra (!= 1111) | Rm | 0 0 R 1 | Rn |
SMMLA variant 
Applies when R == 0. 
SMMLA{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
SMMLAR variant 
Applies when R == 1. 
SMMLAR{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for all variants of this encoding 
if Ra == '1111' then SEE SMMUL; 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); round = (R == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 1 0 1 | Rn | Ra (!= 1111) | Rd | 0 0 0 R | Rm |
SMMLA variant 
Applies when R == 0. 
SMMLA{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
SMMLAR variant 
Applies when R == 1. 
SMMLAR{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for all variants of this encoding 
if Ra == '1111' then SEE SMMUL; 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); round = (R == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 
<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = (SInt(R[a]) << 32) + SInt(R[n]) * SInt(R[m]);
    if round then result = result + 0x80000000;
    R[d] = result<63:32>;
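
To illustrate the effect of the rounding constant, here is a small Python model of the operation. It is a sketch only: smmla and sint are hypothetical names, registers are taken as unsigned 32-bit integers, and the condition and UNPREDICTABLE checks are omitted.

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smmla(rn, rm, ra, rounding):
    """Return Rd; rounding=True models SMMLAR."""
    result = (sint(ra, 32) << 32) + sint(rn, 32) * sint(rm, 32)
    if rounding:
        result += 0x80000000                # add half of the discarded low word
    return (result >> 32) & 0xFFFFFFFF      # result<63:32>
```

For a product of exactly 2^31 and a zero accumulator, the truncating form returns 0 and the rounding form returns 1.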





F5.1.195 SMMLS, SMMLSR 
Signed Most Significant Word Multiply Subtract multiplies two signed 32-bit values, subtracts the result from a 
32-bit accumulate value that is shifted left by 32 bits, and extracts the most significant 32 bits of the result of that 
subtraction. 
Optionally, the instruction can specify that the result of the instruction is rounded instead of being truncated. In this 
case, the constant 0x80000000 is added to the result of the subtraction before the high word is extracted. 
A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 1 0 1 | Rd | Ra | Rm | 1 1 R 1 | Rn |
SMMLS variant 
Applies when R == 0. 
SMMLS{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
SMMLSR variant 
Applies when R == 1. 
SMMLSR{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); round = (R == '1');
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE; 
T1
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 1 1 0 | Rn | Ra | Rd | 0 0 0 R | Rm |
SMMLS variant 
Applies when R == 0. 
SMMLS{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
SMMLSR variant 
Applies when R == 1. 
SMMLSR{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); round = (R == '1');
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 
<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = (SInt(R[a]) << 32) - SInt(R[n]) * SInt(R[m]);
    if round then result = result + 0x80000000;
    R[d] = result<63:32>;
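
The subtraction from the shifted accumulate value can be sketched in Python as follows. The names smmls and sint are hypothetical, registers are assumed to be unsigned 32-bit integers, and the condition and UNPREDICTABLE checks are not modeled.

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smmls(rn, rm, ra, rounding):
    """Return Rd; rounding=True models SMMLSR."""
    result = (sint(ra, 32) << 32) - sint(rn, 32) * sint(rm, 32)
    if rounding:
        result += 0x80000000                # applied after the subtraction
    return (result >> 32) & 0xFFFFFFFF      # result<63:32>
```

With Ra = 1 and a product of 1, the 64-bit difference is 2^32 - 1, so the truncating form returns 0 while the rounding form returns 1.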





F5.1.196 SMMUL, SMMULR 
Signed Most Significant Word Multiply multiplies two signed 32-bit values, extracts the most significant 32 bits of 
the result, and writes those bits to the destination register. 
Optionally, the instruction can specify that the result is rounded instead of being truncated. In this case, the constant 
0x80000000 is added to the product before the high word is extracted. 
A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 1 0 1 | Rd | 1 1 1 1 | Rm | 0 0 R 1 | Rn |
SMMUL variant 
Applies when R == 0. 
SMMUL{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
SMMULR variant 
Applies when R == 1. 
SMMULR{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); round = (R == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 1 0 1 | Rn | 1 1 1 1 | Rd | 0 0 0 R | Rm |
SMMUL variant 
Applies when R == 0. 
SMMUL{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
SMMULR variant 
Applies when R == 1. 
SMMULR{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); round = (R == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = SInt(R[n]) * SInt(R[m]);
    if round then result = result + 0x80000000;
    R[d] = result<63:32>;
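
A short Python sketch of the high-word multiply is given below for illustration. The names smmul and sint are hypothetical, and registers are assumed to be unsigned 32-bit integers.

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smmul(rn, rm, rounding):
    """Return Rd; rounding=True models SMMULR."""
    result = sint(rn, 32) * sint(rm, 32)
    if rounding:
        result += 0x80000000                # round to nearest instead of truncating
    return (result >> 32) & 0xFFFFFFFF      # result<63:32>
```

A product of -1 truncates to a high word of 0xFFFFFFFF, whereas the rounding form yields 0.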


F5.1.197 SMUAD, SMUADX


Signed Dual Multiply Add performs two signed 16 x 16-bit multiplications. It adds the products together, and writes 
the result to the destination register. 


Optionally, the instruction can exchange the halfwords of the second operand before performing the arithmetic. This 
produces top x bottom and bottom x top multiplication. 


This instruction sets PSTATE.Q to 1 if the addition overflows. The multiplications cannot overflow. 
A1

| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 1 1 1 | 0 0 0 0 | Rd | 1 1 1 1 | Rm | 0 0 M 1 | Rn |


SMUAD variant 
Applies when M == 0. 


SMUAD{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMUADX variant 
Applies when M == 1. 


SMUADX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 0 1 0 | Rn | 1 1 1 1 | Rd | 0 0 0 M | Rm |


SMUAD variant 
Applies when M == 0. 


SMUAD{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMUADX variant 
Applies when M == 1. 


SMUADX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1');

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register, encoded in the "Rm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_swap then ROR(R[m],16) else R[m];
    product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
    product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
    result = product1 + product2;
    R[d] = result<31:0>;
    if result != SInt(result<31:0>) then // Signed overflow
        PSTATE.Q = '1';
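
The following Python sketch models the operation and shows the one input pattern for which the addition overflows. The names smuad, ror32 and sint are hypothetical; registers are assumed to be unsigned 32-bit integers, and the returned flag only indicates whether PSTATE.Q would be set.

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ror32(value, shift):
    """Rotate a 32-bit value right by `shift` bits (0 < shift < 32)."""
    return ((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF

def smuad(rn, rm, m_swap):
    """Return (Rd, sets_q); m_swap=True models SMUADX."""
    operand2 = ror32(rm, 16) if m_swap else rm
    product1 = sint(rn, 16) * sint(operand2, 16)
    product2 = sint(rn >> 16, 16) * sint(operand2 >> 16, 16)
    result = product1 + product2
    rd = result & 0xFFFFFFFF
    return rd, result != sint(rd, 32)
```

Overflow requires both products to equal 2^30, which happens only when all four halfwords are 0x8000 (-32768).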


F5.1.198 SMULBB, SMULBT, SMULTB, SMULTT


Signed Multiply (halfwords) multiplies two signed 16-bit quantities, taken from either the bottom or the top half of 
their respective source registers. The other halves of these source registers are ignored. The 32-bit product is written 
to the destination register. No overflow is possible during this instruction. 


A1

| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 0 0 1 | 0 1 1 0 | Rd | (0)(0)(0)(0) | Rm | 1 M N 0 | Rn |

SMULBB variant 


Applies when M == 0 && N == 0.


SMULBB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMULBT variant 
Applies when M == 1 && N == 0.


SMULBT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMULTB variant 
Applies when M == 0 && N == 1.


SMULTB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMULTT variant 

Applies when M == 1 && N == 1. 
SMULTT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);


n_high = (N == '1'); m_high = (M == '1'); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1

| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 0 0 0 1 | Rn | 1 1 1 1 | Rd | 0 0 N M | Rm |


SMULBB variant 
Applies when N == 0 && M == 0.


SMULBB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
SMULBT variant 
Applies when N == 0 && M == 1.


SMULBT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
SMULTB variant
Applies when N == 1 && M == 0.


SMULTB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMULTT variant 
Applies when N == 1 && M == 1.


SMULTT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);

n_high = (N == '1'); m_high = (M == '1'); 

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register holding the multiplicand in the bottom or top half
(selected by <x>), encoded in the "Rn" field.

<Rm> Is the second general-purpose source register holding the multiplier in the bottom or top half
(selected by <y>), encoded in the "Rm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = if n_high then R[n]<31:16> else R[n]<15:0>;
    operand2 = if m_high then R[m]<31:16> else R[m]<15:0>;
    result = SInt(operand1) * SInt(operand2);
    R[d] = result<31:0>;
    // Signed overflow cannot occur
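
The halfword selection and the absence of overflow can be checked with a small Python model. This is an illustrative sketch: smul_xy and sint are hypothetical names, and registers are assumed to be unsigned 32-bit integers.

```python
def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smul_xy(rn, rm, n_high, m_high):
    """Return Rd for SMUL<x><y>; n_high selects <x>, m_high selects <y>."""
    operand1 = rn >> 16 if n_high else rn      # sint() keeps the low 16 bits
    operand2 = rm >> 16 if m_high else rm
    result = sint(operand1, 16) * sint(operand2, 16)
    return result & 0xFFFFFFFF                 # result<31:0>
```

The largest possible product, (-32768) * (-32768) = 2^30, still fits in a signed 32-bit value, which is why no Q flag update is needed.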





F5.1.199 SMULL, SMULLS 
Signed Multiply Long multiplies two 32-bit signed values to produce a 64-bit result. 
In A32 instructions, the condition flags can optionally be updated based on the result. Use of this option adversely 
affects performance on many implementations. 
A1
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| cond | 0 0 0 0 | 1 1 0 S | RdHi | RdLo | Rm | 1 0 0 1 | Rn |
Flag setting variant 
Applies when S == 1.
SMULLS{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Not flag setting variant 
Applies when S == 0.
SMULL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Decode for all variants of this encoding 
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1'); 
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior 
If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
T1
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 8 | 7 6 5 4 | 3 0 |
| 1 1 1 1 | 1 0 1 1 | 1 0 0 0 | Rn | RdLo | RdHi | 0 0 0 0 | Rm |
T1 variant 
SMULL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm> 
Decode for this encoding 
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = FALSE; 
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior


If dHi == dLo, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<RdLo> Is the general-purpose destination register for the lower 32 bits of the result, encoded in the "RdLo" 
field. 

<RdHi> Is the general-purpose destination register for the upper 32 bits of the result, encoded in the "RdHi" 
field. 

<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = SInt(R[n]) * SInt(R[m]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
    if setflags then
        PSTATE.N = result<63>;
        PSTATE.Z = IsZeroBit(result<63:0>);
        // PSTATE.C, PSTATE.V unchanged
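
For illustration, the 64-bit multiply and the N and Z values that the flag-setting variant would write can be modeled in Python. The names smull and sint are hypothetical, registers are assumed to be unsigned 32-bit integers, and the sketch always computes N and Z, leaving it to the caller to apply them only when setflags is true.

```python
MASK64 = (1 << 64) - 1

def sint(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smull(rn, rm):
    """Return (RdHi, RdLo, N, Z) for a signed 32 x 32 -> 64-bit multiply."""
    result = (sint(rn, 32) * sint(rm, 32)) & MASK64
    n_flag = result >> 63                      # result<63>
    z_flag = int(result == 0)                  # IsZeroBit(result<63:0>)
    return result >> 32, result & 0xFFFFFFFF, n_flag, z_flag
```

The product of two signed 32-bit values always fits in 64 bits, so the mask only converts a negative Python integer to its two's-complement form.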
F5.1.200 SMULWB, SMULWT
Signed Multiply (word by halfword) multiplies a signed 32-bit quantity and a signed 16-bit quantity. The signed 
16-bit quantity is taken from either the bottom or the top half of its source register. The other half of the second 
source register is ignored. The top 32 bits of the 48-bit product are written to the destination register. The bottom 
16 bits of the 48-bit product are ignored. No overflow is possible during this instruction. 
A1 
31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 0 0 1 | 0 0 1 0 | Rd | (0) (0) (0) (0) | Rm | 1 M 1 0 | Rn |
cond 
SMULWB variant 
Applies when M == 0. 
SMULWB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
SMULWT variant 
Applies when M == 1. 
SMULWT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); m_high = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
1 1 1 1 | 1 0 1 1 | 0 0 1 1 | Rn | 1 1 1 1 | Rd | 0 0 0 M | Rm |
SMULWB variant 
Applies when M == 0. 
SMULWB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
SMULWT variant 
Applies when M == 1. 
SMULWT{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); m_high = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field.
<Rm> Is the second general-purpose source register holding the multiplier in the bottom or top half
(selected by <y>), encoded in the "Rm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_high then R[m]<31:16> else R[m]<15:0>;
    product = SInt(R[n]) * SInt(operand2);
    R[d] = product<47:16>;
    // Signed overflow cannot occur
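The word-by-halfword multiply can be sketched in Python as follows. This is a minimal model of the Operation pseudocode above, under the assumption that register values are passed as unsigned 32-bit integers; sint and smulw are hypothetical names introduced for illustration only.

```python
MASK32 = 0xFFFFFFFF

def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smulw(rn, rm, m_high):
    """Model SMULWB (m_high=False) or SMULWT (m_high=True); hypothetical helper."""
    operand2 = (rm >> 16) & 0xFFFF if m_high else rm & 0xFFFF
    product = sint(rn, 32) * sint(operand2, 16)
    # product<47:16>: Python's >> on a negative int is an arithmetic shift,
    # so masking afterwards yields the two's-complement bit slice.
    return (product >> 16) & MASK32
```

The bottom 16 bits of the 48-bit product are simply discarded by the shift, which matches the statement that they are ignored.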
F5.1.201 SMUSD, SMUSDX


Signed Multiply Subtract Dual performs two signed 16 x 16-bit multiplications. It subtracts one of the products from 
the other, and writes the result to the destination register. 


Optionally, the instruction can exchange the halfwords of the second operand before performing the arithmetic. This 
produces top x bottom and bottom x top multiplication. 


Overflow cannot occur. 
A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 1 1 1 | 0 0 0 0 | Rd | 1 1 1 1 | Rm | 0 1 M 1 | Rn |


cond 
SMUSD variant 


Applies when M == 0. 


SMUSD{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMUSDX variant 
Applies when M == 1. 


SMUSDX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1');
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
1 1 1 1 | 1 0 1 1 | 0 1 0 0 | Rn | 1 1 1 1 | Rd | 0 0 0 M | Rm |


SMUSD variant 
Applies when M == 0. 


SMUSD{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


SMUSDX variant 
Applies when M == 1. 


SMUSDX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); m_swap = (M == '1');

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = if m_swap then ROR(R[m],16) else R[m];
    product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
    product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
    result = product1 - product2;
    R[d] = result<31:0>;
    // Signed overflow cannot occur
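A minimal Python sketch of the dual multiply-subtract, assuming unsigned 32-bit register values as inputs. The names sint and smusd are hypothetical and exist only for this illustration; the rotate by 16 models the optional halfword exchange.

```python
MASK32 = 0xFFFFFFFF

def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def smusd(rn, rm, m_swap):
    """Model SMUSD (m_swap=False) or SMUSDX (m_swap=True); hypothetical helper."""
    # ROR(R[m], 16) swaps the two halfwords of the second operand.
    operand2 = ((rm >> 16) | (rm << 16)) & MASK32 if m_swap else rm
    product1 = sint(rn, 16) * sint(operand2, 16)
    product2 = sint(rn >> 16, 16) * sint(operand2 >> 16, 16)
    return (product1 - product2) & MASK32
```

With Rn holding halfwords (2, 3) and Rm holding (4, 5), top first, SMUSD computes 3*5 - 2*4 = 7, and SMUSDX computes 3*4 - 2*5 = 2.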
F5.1.202 SRS, SRSDA, SRSDB, SRSIA, SRSIB

Store Return State stores the LR_<current_mode> and SPSR_<current_mode> to the stack of a specified mode. For 
information about memory accesses see Memory accesses on page F2-2412. 
SRS is UNDEFINED in Hyp mode. 
SRS is CONSTRAINED UNPREDICTABLE if it is executed in User or System mode, or if the specified mode is any of the 
following: 
• Not implemented.
• A mode that Table G1-5 on page G1-3796 does not show.
• Hyp mode.
• Monitor mode, if the SRS instruction is executed in Non-secure state.
If EL3 is using AArch64 and an SRS instruction that is executed in a Secure EL1 mode specifies Monitor mode, it 
is trapped to EL3. 
See Traps to EL3 of Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590.
A1 

31 30 29 28 | 27 26 25 24 | 23 22 21 20 | 19 18 17 16 | 15 14 13 12 | 11 10 9 8 | 7 6 5 | 4 0 |
1 1 1 1 | 1 0 0 P | U 1 W 0 | (1) (1) (0) (1) | (0) (0) (0) (0) | (0) (1) (0) (1) | (0) (0) (0) | mode |
Decrement After variant 
Applies when P == 0 && U == 0. 
SRSDA{<c>}{<q>} SP{!}, #<mode> 
Decrement Before variant 
Applies when P == 1 && U == 0. 
SRSDB{<c>}{<q>} SP{!}, #<mode> 
Increment After variant 
Applies when P == 0 && U == 1. 
SRS{IA}{<c>}{<q>} SP{!}, #<mode>
Increment Before variant 
Applies when P == 1 && U == 1. 
SRSIB{<c>}{<q>} SP{!}, #<mode> 
Decode for all variants of this encoding 

wback = (W == '1'); increment = (U == '1'); wordhigher = (P == U); 
T1 

15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 2 1 0 | 15 14 13 12 | 11 10 9 8 | 7 6 5 | 4 0 |
1 1 1 0 | 1 0 0 0 | 0 0 W 0 | (1) (1) (0) (1) | (1) (1) (0) (0) | (0) (0) (0) (0) | (0) (0) (0) | mode |
T1 variant


SRSDB{<c>}{<q>} SP{!}, #<mode> 


Decode for this encoding 


wback = (W == '1'); increment = FALSE; wordhigher = FALSE; 
T2 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 2 1 0 | 15 14 13 12 | 11 10 9 8 | 7 6 5 | 4 0 |
1 1 1 0 | 1 0 0 1 | 1 0 W 0 | (1) (1) (0) (1) | (1) (1) (0) (0) | (0) (0) (0) (0) | (0) (0) (0) | mode |


T2 variant 


SRS{IA}{<c>}{<q>} SP{!}, #<mode>


Decode for this encoding 


wback = (W == '1'); increment = TRUE; wordhigher = FALSE; 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly SRS (T32) on page K1-5468 and SRS 
(A32) on page K1-5468. 


Assembler symbols 


IA For encoding A1: is an optional suffix to indicate the Increment After variant.
For encoding T2: is an optional suffix for the Increment After form.


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. <c> must be AL or omitted. 


For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


! The address adjusted by the size of the data loaded is written back to the base register. If specified, 
it is encoded in the "W" field as 1, otherwise this field defaults to 0. 


<mode> Is the number of the mode whose Banked SP is used as the base register, encoded in the "mode" 
field. For details of PE modes and their numbers see AArch32 PE mode descriptions on 
page G1-3796. 


SRSFA, SRSEA, SRSFD, and SRSED are pseudo-instructions for SRSIB, SRSIA, SRSDB, and SRSDA respectively, referring to 
their use for pushing data onto Full Ascending, Empty Ascending, Full Descending, and Empty Descending stacks. 


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        if PSTATE.EL == EL2 then // UNDEFINED at EL2
            UNDEFINED;

        // Check for UNPREDICTABLE cases. The definition of UNPREDICTABLE does not permit these
        // to be security holes
        if PSTATE.M IN {M32_User,M32_System} then
            UNPREDICTABLE;
        elsif mode == M32_Hyp then // Check for attempt to access Hyp mode SP
            UNPREDICTABLE;
        elsif mode == M32_Monitor then // Check for attempt to access Monitor mode SP
            if !HaveEL(EL3) || !IsSecure() then
                UNPREDICTABLE;
            elsif !ELUsingAArch32(EL3) then
                AArch64.MonitorModeTrap();
        elsif BadMode(mode) then
            UNPREDICTABLE;

        base = Rmode[13,mode];
        address = if increment then base else base-8;
        if wordhigher then address = address+4;
        MemA[address,4] = LR;
        MemA[address+4,4] = SPSR[];
        if wback then Rmode[13,mode] = if increment then base+8 else base-8;
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        if PSTATE.EL == EL2 then // UNDEFINED at EL2
            UNDEFINED;

        // Check for UNPREDICTABLE cases. The definition of UNPREDICTABLE does not permit these
        // to be security holes
        if PSTATE.M IN {M32_User,M32_System} then
            UNPREDICTABLE;
        elsif mode == M32_Hyp then // Check for attempt to access Hyp mode SP
            UNPREDICTABLE;
        elsif mode == M32_Monitor then // Check for attempt to access Monitor mode SP
            if !HaveEL(EL3) || !IsSecure() then
                UNPREDICTABLE;
            elsif !ELUsingAArch32(EL3) then
                AArch64.MonitorModeTrap();
        elsif BadMode(mode) then
            UNPREDICTABLE;

        base = Rmode[13,mode];
        address = if increment then base else base-8;
        if wordhigher then address = address+4;
        MemA[address,4] = LR;
        MemA[address+4,4] = SPSR[];
        if wback then Rmode[13,mode] = if increment then base+8 else base-8;
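The address arithmetic in the pseudocode above can be checked with a small Python sketch. It models only where LR and the SPSR are stored and the written-back base value; the mode checks, traps, and memory accesses are omitted. The function name srs_addresses and its return shape are assumptions made for this illustration, and the increment and wordhigher arguments follow the A1 decode (increment = (U == '1'), wordhigher = (P == U)).

```python
MASK32 = 0xFFFFFFFF

def srs_addresses(base, increment, wordhigher, wback):
    """Return (lr_address, spsr_address, new_base) for an SRS access (hypothetical helper)."""
    address = base if increment else base - 8
    if wordhigher:
        address += 4
    if wback:
        new_base = base + 8 if increment else base - 8
    else:
        new_base = base
    return address & MASK32, (address + 4) & MASK32, new_base & MASK32

# A1 variants as (increment, wordhigher) pairs derived from the P and U bits.
VARIANTS = {
    "SRSDA": (False, True),   # P == 0, U == 0
    "SRSDB": (False, False),  # P == 1, U == 0
    "SRSIA": (True, False),   # P == 0, U == 1
    "SRSIB": (True, True),    # P == 1, U == 1
}
```

With a banked SP of 0x1000 and writeback, SRSDB stores LR at 0xFF8 and the SPSR at 0xFFC and leaves the SP at 0xFF8, the usual Full Descending push.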
CONSTRAINED UNPREDICTABLE behavior 
If PSTATE.M IN {M32_User,M32_System}, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
If mode == M32_Hyp, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
If mode == M32_Monitor && (!HaveEL(EL3) || !IsSecure()), then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
If BadMode(mode), then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction stores to the stack of the mode in which it is executed.
• The instruction stores to an UNKNOWN address, and if the instruction specifies writeback then any
general-purpose register that can be accessed from the current Exception level without a privilege violation
becomes UNKNOWN.
F5.1.203 SSAT


Signed Saturate saturates an optionally-shifted signed value to a selectable signed range. 


This instruction sets PSTATE.Q to 1 if the operation saturates. 


A1 
31 28 | 27 26 25 24 | 23 22 21 | 20 16 | 15 12 | 11 7 | 6 | 5 4 | 3 0 |
!= 1111 | 0 1 1 0 | 1 0 1 | sat_imm | Rd | imm5 | sh | 0 1 | Rn |


cond 


Arithmetic shift right variant 
Applies when sh == 1. 


SSAT{<c>}{<q>} <Rd>, #<imm>, <Rn>, ASR #<amount> 


Logical shift left variant 
Applies when sh == 0.


SSAT{<c>}{<q>} <Rd>, #<imm>, <Rn> {, LSL #<amount>} 


Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm)+1;
(shift_t, shift_n) = DecodeImmShift(sh:'0', imm5);
if d == 15 || n == 15 then UNPREDICTABLE; 


T1 


15 14 13 12 11 | 10 | 9 8 | 7 6 | 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 6 | 5 | 4 0 |
1 1 1 1 0 | (0) | 1 1 | 0 0 | sh | 0 | Rn | 0 | imm3 | Rd | imm2 | (0) | sat_imm |





Arithmetic shift right variant 
Applies when sh == 1 && !(imm3 == 000 && imm2 == 00). 


SSAT{<c>}{<q>} <Rd>, #<imm>, <Rn>, ASR #<amount> 


Logical shift left variant 
Applies when sh == 0.


SSAT{<c>}{<q>} <Rd>, #<imm>, <Rn> {, LSL #<amount>} 


Decode for all variants of this encoding 


if sh == '1' && (imm3:imm2) == '00000' then SEE SSAT16;
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm)+1;
(shift_t, shift_n) = DecodeImmShift(sh:'0', imm3:imm2);

if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<imm> Is the bit position for saturation, in the range 1 to 32, encoded in the "sat_imm" field as <imm>-1. 
<Rn> Is the general-purpose source register, encoded in the "Rn" field. 

<amount> For encoding A1: is the optional shift amount, in the range 0 to 31, defaulting to 0 and encoded in
the "imm5" field.
For encoding A1: is the shift amount, in the range 1 to 32 encoded in the "imm5" field as <amount>
modulo 32.
For encoding T1: is the optional shift amount, in the range 0 to 31, defaulting to 0 and encoded in
the "imm3:imm2" field.
For encoding T1: is the shift amount, in the range 1 to 31 encoded in the "imm3:imm2" field as
<amount>.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand = Shift(R[n], shift_t, shift_n, PSTATE.C); // PSTATE.C ignored
    (result, sat) = SignedSatQ(SInt(operand), saturate_to);
    R[d] = SignExtend(result, 32);
    if sat then
        PSTATE.Q = '1';
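The saturation step can be sketched in Python as below. This is an informal model of the Operation pseudocode above, not the architecture's definition of SignedSatQ or Shift: signed_sat_q, ssat, and sint are hypothetical names, only the LSL and ASR shift types used by this instruction are modeled, and the second return value stands in for the update of PSTATE.Q.

```python
MASK32 = 0xFFFFFFFF

def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def signed_sat_q(value, n):
    """Clamp value to the signed n-bit range; return (result, saturated)."""
    upper = (1 << (n - 1)) - 1
    lower = -(1 << (n - 1))
    if value > upper:
        return upper, True
    if value < lower:
        return lower, True
    return value, False

def ssat(rn, saturate_to, shift_type="LSL", shift_n=0):
    """Return (rd, q_set) for a 32-bit source value (hypothetical helper)."""
    if shift_type == "LSL":
        operand = sint(rn << shift_n, 32)   # bits shifted past bit 31 are lost
    else:                                   # "ASR"
        operand = sint(rn, 32) >> shift_n
    result, sat = signed_sat_q(operand, saturate_to)
    return result & MASK32, sat
```

For instance, saturating 0x00012345 to 8 bits yields 0x7F with the Q indication set, while a value already inside the range passes through unchanged.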
F5.1.204 SSAT16


Signed Saturate 16 saturates two signed 16-bit values to a selected signed range. 


This instruction sets PSTATE.Q to 1 if the operation saturates. 
A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 1 1 0 | 1 0 1 0 | sat_imm | Rd | (1) (1) (1) (1) | 0 0 1 1 | Rn |


cond 


A1 variant


SSAT16{<c>}{<q>} <Rd>, #<imm>, <Rn> 


Decode for this encoding 


d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm)+1;
if d == 15 || n == 15 then UNPREDICTABLE; 


T1 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
1 1 1 1 | 0 (0) 1 1 | 0 0 1 0 | Rn | 0 0 0 0 | Rd | 0 0 (0) (0) | sat_imm |


T1 variant 
SSAT16{<c>}{<q>} <Rd>, #<imm>, <Rn> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm)+1;
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<imm> Is the bit position for saturation, in the range 1 to 16, encoded in the "sat_imm" field as <imm>-1. 


<Rn> Is the general-purpose source register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result1, sat1) = SignedSatQ(SInt(R[n]<15:0>), saturate_to);
    (result2, sat2) = SignedSatQ(SInt(R[n]<31:16>), saturate_to);
    R[d]<15:0> = SignExtend(result1, 16);
    R[d]<31:16> = SignExtend(result2, 16);
    if sat1 || sat2 then
        PSTATE.Q = '1';
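A short Python sketch of the two-lane saturation described above. The helpers sint, signed_sat_q, and ssat16 are hypothetical names for illustration, and the boolean in the result stands in for the update of PSTATE.Q.

```python
def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def signed_sat_q(value, n):
    """Clamp value to the signed n-bit range; return (result, saturated)."""
    upper = (1 << (n - 1)) - 1
    lower = -(1 << (n - 1))
    if value > upper:
        return upper, True
    if value < lower:
        return lower, True
    return value, False

def ssat16(rn, saturate_to):
    """Return (rd, q_set): each halfword of rn saturated independently (hypothetical helper)."""
    result1, sat1 = signed_sat_q(sint(rn, 16), saturate_to)
    result2, sat2 = signed_sat_q(sint(rn >> 16, 16), saturate_to)
    rd = ((result2 & 0xFFFF) << 16) | (result1 & 0xFFFF)
    return rd, sat1 or sat2
```

Saturating 0x7FFF8000 to 8 bits clamps the top halfword to 0x007F and the bottom halfword to 0xFF80, and either lane saturating is enough to set Q.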
F5.1.205 SSAX
Signed Subtract and Add with Exchange exchanges the two halfwords of the second operand, performs one 16-bit 
integer subtraction and one 16-bit addition, and writes the results to the destination register. It sets PSTATE.GE 
according to the results. 
A1 
31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 1 1 0 | 0 0 0 1 | Rn | Rd | (1) (1) (1) (1) | 0 1 0 1 | Rm |
cond 
A1 variant
SSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
1 1 1 1 | 1 0 1 0 | 1 1 1 0 | Rn | 1 1 1 1 | Rd | 0 0 0 0 | Rm |
T1 variant 
SSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum = SInt(R[n]<15:0>) + SInt(R[m]<31:16>);
    diff = SInt(R[n]<31:16>) - SInt(R[m]<15:0>);
    R[d]<15:0> = sum<15:0>;
    R[d]<31:16> = diff<15:0>;
    PSTATE.GE<1:0> = if sum >= 0 then '11' else '00';
    PSTATE.GE<3:2> = if diff >= 0 then '11' else '00';
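The exchange, add, and subtract steps can be sketched in Python as follows. The function ssax and the helper sint are hypothetical names; the second element of the result is an assumed representation of PSTATE.GE as a 4-bit integer with GE<0> in bit 0.

```python
def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ssax(rn, rm):
    """Return (rd, ge) for SSAX on 32-bit register values (hypothetical helper)."""
    total = sint(rn, 16) + sint(rm >> 16, 16)   # bottom of Rn + top of Rm
    diff = sint(rn >> 16, 16) - sint(rm, 16)    # top of Rn - bottom of Rm
    rd = ((diff & 0xFFFF) << 16) | (total & 0xFFFF)
    ge = (0b0011 if total >= 0 else 0) | (0b1100 if diff >= 0 else 0)
    return rd, ge
```

The GE bits are derived from the full-precision sum and difference, before truncation to 16 bits, which is why a lane result that wraps can still report a non-negative outcome.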
F5.1.206 SSUB16


Signed Subtract 16 performs two 16-bit signed integer subtractions, and writes the results to the destination register. 
It sets PSTATE.GE according to the results of the subtractions. 


A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 1 1 0 | 0 0 0 1 | Rn | Rd | (1) (1) (1) (1) | 0 1 1 1 | Rm |


cond 


A1 variant


SSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
1 1 1 1 | 1 0 1 0 | 1 1 0 1 | Rn | 1 1 1 1 | Rd | 0 0 0 0 | Rm |


T1 variant 


SSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = SInt(R[n]<15:0>) - SInt(R[m]<15:0>);
    diff2 = SInt(R[n]<31:16>) - SInt(R[m]<31:16>);
    R[d]<15:0> = diff1<15:0>;







    R[d]<31:16> = diff2<15:0>;
    PSTATE.GE<1:0> = if diff1 >= 0 then '11' else '00';
    PSTATE.GE<3:2> = if diff2 >= 0 then '11' else '00';
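A Python sketch of the two halfword subtractions, given as an informal check of the pseudocode above. The names sint and ssub16 are hypothetical, and PSTATE.GE is represented, by assumption, as a 4-bit integer with GE<0> in bit 0.

```python
def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ssub16(rn, rm):
    """Return (rd, ge) for SSUB16 on 32-bit register values (hypothetical helper)."""
    diff1 = sint(rn, 16) - sint(rm, 16)
    diff2 = sint(rn >> 16, 16) - sint(rm >> 16, 16)
    rd = ((diff2 & 0xFFFF) << 16) | (diff1 & 0xFFFF)
    ge = (0b0011 if diff1 >= 0 else 0) | (0b1100 if diff2 >= 0 else 0)
    return rd, ge
```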
F5.1.207 SSUB8
Signed Subtract 8 performs four 8-bit signed integer subtractions, and writes the results to the destination register. 
It sets PSTATE.GE according to the results of the subtractions. 


A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 1 1 0 | 0 0 0 1 | Rn | Rd | (1) (1) (1) (1) | 1 1 1 1 | Rm |


cond 


A1 variant


SSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 8 | 7 6 5 4 | 3 0 |
1 1 1 1 | 1 0 1 0 | 1 1 0 0 | Rn | 1 1 1 1 | Rd | 0 0 0 0 | Rm |


T1 variant 


SSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 


Decode for this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = SInt(R[n]<7:0>) - SInt(R[m]<7:0>);
    diff2 = SInt(R[n]<15:8>) - SInt(R[m]<15:8>);
    diff3 = SInt(R[n]<23:16>) - SInt(R[m]<23:16>);
    diff4 = SInt(R[n]<31:24>) - SInt(R[m]<31:24>);
    R[d]<7:0> = diff1<7:0>;
    R[d]<15:8> = diff2<7:0>;
    R[d]<23:16> = diff3<7:0>;
    R[d]<31:24> = diff4<7:0>;
    PSTATE.GE<0> = if diff1 >= 0 then '1' else '0';
    PSTATE.GE<1> = if diff2 >= 0 then '1' else '0';
    PSTATE.GE<2> = if diff3 >= 0 then '1' else '0';
    PSTATE.GE<3> = if diff4 >= 0 then '1' else '0';
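The four byte lanes can be modeled with a loop, as in the following Python sketch. The names sint and ssub8 are hypothetical, and PSTATE.GE is represented, by assumption, as a 4-bit integer in which bit i corresponds to GE<i>.

```python
def sint(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def ssub8(rn, rm):
    """Return (rd, ge) for SSUB8 on 32-bit register values (hypothetical helper)."""
    rd = 0
    ge = 0
    for lane in range(4):
        shift = 8 * lane
        diff = sint(rn >> shift, 8) - sint(rm >> shift, 8)
        rd |= (diff & 0xFF) << shift      # keep the low 8 bits of each difference
        if diff >= 0:
            ge |= 1 << lane               # GE<lane> reflects the untruncated sign
    return rd, ge
```

With Rn = 0x05030201 and Rm = 0x01040202 the lane differences are 4, -1, 0, and -1 from top to bottom, so Rd is 0x04FF00FF and GE is 1010.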
F5.1.208 STC

Store data to System register calculates an address from a base register value and an immediate offset, and stores a 
word from the DBGDTRRXint System register to memory. It can use offset, post-indexed, pre-indexed, or 
unindexed addressing. For information about memory accesses see Memory accesses on page F2-2412. 
In an implementation that includes EL2, the permitted STC access to DBGDTRRXint can be trapped to Hyp mode, 
meaning that an attempt to execute an STC instruction in a Non-secure mode other than Hyp mode, that would be 
permitted in the absence of the Hyp trap controls, generates a Hyp Trap exception. For more information, see 
Trapping general Non-secure System register accesses to debug registers on page G1-3911. 
For simplicity, the STC pseudocode does not show this possible trap to Hyp mode. 
A1 

31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 10 9 8 | 7 0 |
!= 1111 | 1 1 0 P | U 0 W 0 | Rn | 0 1 0 1 | 1 1 1 0 | imm8 |
cond 

Offset variant 
Applies when P == 1 && W == 0. 
STC{<c>}{<q>} p14, c5, [<Rn>{, #{+/-}<imm>}] 
Post-indexed variant 
Applies when P == 0 && W == 1. 
STC{<c>}{<q>} p14, c5, [<Rn>], #{+/-}<imm> 
Pre-indexed variant 
Applies when P == 1 && W == 1. 
STC{<c>}{<q>} p14, c5, [<Rn>, #{+/-}<imm>]! 
Unindexed variant 
Applies when P == 0 && U == 1 && W == 0.
STC{<c>}{<q>} p14, c5, [<Rn>], <option> 
Decode for all variants of this encoding 

if P == '0' && U == '0' && W == '0' then UNDEFINED;
n = UInt(Rn); cp = 14;
imm32 = ZeroExtend(imm8:'00', 32); index = (P == '1'); add = (U == '1'); wback = (W == '1');
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If n == 15 && wback, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction executes with writeback to the PC. The instruction is handled as described in Using R15 on
page K1-5457.
T1


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 14 13 12 | 11 10 9 8 | 7 0 |
1 1 1 0 | 1 1 0 P | U 0 W 0 | Rn | 0 1 0 1 | 1 1 1 0 | imm8 |


Offset variant 
Applies when P == 1 && W == 0. 


STC{<c>}{<q>} p14, c5, [<Rn>{, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


STC{<c>}{<q>} p14, c5, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STC{<c>}{<q>} p14, c5, [<Rn>, #{+/-}<imm>]! 


Unindexed variant 
Applies when P == 0 && U == 1 && W == 0. 


STC{<c>}{<q>} p14, c5, [<Rn>], <option> 


Decode for all variants of this encoding 
if P == '0' && U == '0' && W == '0' then UNDEFINED;
n = UInt(Rn); cp = 14;
imm32 = ZeroExtend(imm8:'00', 32); index = (P == '1'); add = (U == '1'); wback = (W == '1');
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If n == 15, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction executes with writeback to the PC. The instruction is handled as described in Using R15 on
page K1-5457.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> For the offset or unindexed variant: is the general-purpose base register, encoded in the "Rn" field. 


The PC can be used, but this is deprecated. 
For the offset, post-indexed or pre-indexed variant: is the general-purpose base register, encoded in 
the "Rn" field. 


<option> Is an 8-bit immediate, in the range 0 to 255 enclosed in { }, encoded in the "imm8" field. The value 
of this field is ignored when executing this instruction. 
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<imm> Is the immediate offset used for forming the address, a multiple of 4 in the range 0-1020, defaulting
to 0 and encoded in the "imm8" field, as <imm>/4.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CheckSystemAccess(cp, ThisInstr());
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];

    // System register read from DBGDTRRXint.
    MemA[address,4] = DBGDTR_EL0[];

    if wback then R[n] = offset_addr;
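The four addressing forms differ only in how P, U, and W select the access address and the written-back base. The Python sketch below models that selection under the decode rules given above; stc_address is a hypothetical name, the System register access and the memory write are not modeled, and the UNDEFINED case is represented, as an assumption of this sketch, by raising ValueError.

```python
MASK32 = 0xFFFFFFFF

def stc_address(rn, imm8, p, u, w):
    """Return (address, new_rn) for the given base value and P, U, W bits (hypothetical helper)."""
    if p == 0 and u == 0 and w == 0:
        raise ValueError("UNDEFINED encoding: P == 0, U == 0, W == 0")
    imm32 = imm8 << 2                      # ZeroExtend(imm8:'00', 32)
    offset_addr = (rn + imm32 if u else rn - imm32) & MASK32
    address = offset_addr if p else rn     # index = (P == '1')
    new_rn = offset_addr if w else rn      # wback = (W == '1')
    return address, new_rn
```

With a base of 0x1000 and imm8 = 4 (an offset of 16 bytes), the offset form accesses 0x1010 and leaves the base unchanged, the post-indexed form accesses 0x1000 and then updates the base to 0x1010, and the pre-indexed form with U == 0 accesses and writes back 0xFF0.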
F5.1.209 STL


Store-Release Word stores a word from a register to memory. The instruction also has memory ordering semantics 
as described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 0 0 1 | 1 0 0 0 | Rn | (1) (1) (1) (1) | (1) (1) 0 0 | 1 0 0 1 | Rt |


cond 


A1 variant


STL{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE;


T1 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 2 1 0 |
1 1 1 0 | 1 0 0 0 | 1 1 0 0 | Rn | Rt | (1) (1) (1) (1) | 1 0 1 0 | (1) (1) (1) (1) |


T1 variant 


STL{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    MemO[address, 4] = R[t];
F5.1.210 STLB


Store-Release Byte stores a byte from a register to memory. The instruction also has memory ordering semantics as 
described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 0 0 1 | 1 1 0 0 | Rn | (1) (1) (1) (1) | (1) (1) 0 0 | 1 0 0 1 | Rt |


cond 


A1 variant


STLB{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 


T1 


15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 2 1 0 |
1 1 1 0 | 1 0 0 0 | 1 1 0 0 | Rn | Rt | (1) (1) (1) (1) | 1 0 0 0 | (1) (1) (1) (1) |


T1 variant 
STLB{<c>}{<q>} <Rt>, [<Rn>] 
Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    MemO[address, 1] = R[t]<7:0>;
F5.1.211 STLEX


Store-Release Exclusive Word stores a word from a register to memory if the executing PE has exclusive access to 
the memory at that address, and returns a status value of 0 if the store was successful, or of 1 if no store was 
performed. 


The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
!= 1111 | 0 0 0 1 | 1 0 0 0 | Rn | Rd | (1) (1) 1 0 | 1 0 0 1 | Rt |


cond 


A1 variant

STLEX{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 

Decode for this encoding 

d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 


if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; 
if d == n || d == t then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If d == t, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.


If d == n, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.
T1 
15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3 0 | 15 12 | 11 10 9 8 | 7 6 5 4 | 3 0 |
1 1 1 0 | 1 0 0 0 | 1 1 0 0 | Rn | Rt | (1) (1) (1) (1) | 1 1 1 0 | Rd |


T1 variant 


STLEX{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 


Decode for this encoding 


d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; 
if d == n || d == t then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the destination general-purpose register into which the status result of the store exclusive is
written, encoded in the "Rd" field. The value returned is:

0 If the operation updates memory.
1 If the operation fails to update memory.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction:

• Memory is not updated.
• <Rd> is not updated.

A non word-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject to
the following rules:

• If AArch32.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.

If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    if AArch32.ExclusiveMonitorsPass(address,4) then
        MemO[address, 4] = R[t];
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');

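
The status-result behavior in the Operation pseudocode can be illustrated with a small Python model. This is a minimal sketch and not the architectural definition: the ToyPE class, its single-address local monitor, and the dictionary-backed memory are hypothetical simplifications, and the Store-Release ordering semantics are not modeled at all.

```python
class ToyPE:
    """Toy model of one PE with a single-address exclusive monitor (hypothetical)."""

    def __init__(self, memory):
        self.memory = memory          # dict: word address -> 32-bit value
        self.regs = [0] * 16          # R0-R15
        self.monitor = None           # address marked for exclusive access, or None

    def load_exclusive(self, t, n):
        """Model of a load-exclusive: read a word and mark the address."""
        address = self.regs[n]
        self.regs[t] = self.memory.get(address, 0)
        self.monitor = address

    def stlex(self, d, t, n):
        """Model of STLEX <Rd>, <Rt>, [<Rn>]: status 0 on success, 1 on failure."""
        if 15 in (d, t, n) or d == n or d == t:
            raise ValueError("UNPREDICTABLE register combination")
        address = self.regs[n]
        if self.monitor == address:          # stands in for ExclusiveMonitorsPass
            self.memory[address] = self.regs[t] & 0xFFFFFFFF
            self.regs[d] = 0
        else:
            self.regs[d] = 1
        self.monitor = None                  # this toy clears the monitor either way
        return self.regs[d]


pe = ToyPE({0x1000: 5})
pe.regs[2] = 0x1000                   # base address in R2
pe.load_exclusive(t=1, n=2)           # R1 = 5, monitor set
pe.regs[1] += 1
first = pe.stlex(d=0, t=1, n=2)       # exclusive access held: memory updated, R0 = 0
second = pe.stlex(d=0, t=1, n=2)      # exclusive access lost: no store, R0 = 1
```

In real code the failing case is normally handled by retrying the whole load-exclusive, modify, store-exclusive sequence until the status result is 0.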
F5.1.212 STLEXB


Store-Release Exclusive Byte stores a byte from a register to memory if the executing PE has exclusive access to
the memory at that address, and returns a status value of 0 if the store was successful, or of 1 if no store was
performed.

The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90.

For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For
information about memory accesses see Memory accesses on page F2-2412.


A1

| 31 28 | 27 26 25 24 23 | 22 21 | 20 | 19 16 | 15 12 | 11  | 10  | 9 | 8 | 7 6 5 4 | 3 0 |
| cond  |  0  0  0  1  1 |  1  0 |  0 |  Rn   |  Rd   | (1) | (1) | 1 | 0 | 1 0 0 1 | Rt  |

A1 variant

STLEXB{<c>}{<q>} <Rd>, <Rt>, [<Rn>]

Decode for this encoding

d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 


if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; 
if d == n || d == t then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.

T1

| 15 14 13 12 11 10 9 8 7 6 5 4 | 3 0 | 15 12 | 11  | 10  | 9   | 8   | 7 | 6 | 5 4 | 3 0 |
|  1  1  1  0  1  0 0 0 1 1 0 0 | Rn  |  Rt   | (1) | (1) | (1) | (1) | 1 | 1 | 0 0 | Rd  |

T1 variant


STLEXB{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 


Decode for this encoding 


d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; 
if d == n || d == t then UNPREDICTABLE; 

CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the destination general-purpose register into which the status result of the store exclusive is
written, encoded in the "Rd" field. The value returned is:

0 If the operation updates memory.
1 If the operation fails to update memory.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Aborts 

If a synchronous Data Abort exception is generated by the execution of this instruction:

• Memory is not updated.
• <Rd> is not updated.

If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    if AArch32.ExclusiveMonitorsPass(address,1) then
        MemO[address, 1] = R[t]<7:0>;
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');

F5.1.213 STLEXD


Store-Release Exclusive Doubleword stores a doubleword from two registers to memory if the executing PE has 
exclusive access to the memory at that address, and returns a status value of 0 if the store was successful, or of 1 if 
no store was performed. 


The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1

| 31 28 | 27 26 25 24 23 | 22 21 | 20 | 19 16 | 15 12 | 11  | 10  | 9 | 8 | 7 6 5 4 | 3 0 |
| cond  |  0  0  0  1  1 |  0  1 |  0 |  Rn   |  Rd   | (1) | (1) | 1 | 0 | 1 0 0 1 | Rt  |

A1 variant


STLEXD{<c>}{<q>} <Rd>, <Rt>, <Rt2>, [<Rn>]

Decode for this encoding

d = UInt(Rd); t = UInt(Rt); t2 = t+1; n = UInt(Rn);
if d == 15 || Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;
if d == n || d == t || d == t2 then UNPREDICTABLE;
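
The A1 decode rules above can be written out as a small Python function. This is an illustrative sketch only: `decode_stlexd_a1` and `make_word` are hypothetical helpers, the field positions (Rn in bits [19:16], Rd in bits [15:12], Rt in bits [3:0]) are assumptions taken from the A1 encoding diagram, and the opcode bits are ignored because they do not affect these checks.

```python
def make_word(d, t, n):
    """Hypothetical helper: place only the Rn, Rd, and Rt fields in a 32-bit word."""
    return (n << 16) | (d << 12) | t


def decode_stlexd_a1(instr):
    """Return (d, t, t2, n, unpredictable) for an A1 STLEXD instruction word."""
    n = (instr >> 16) & 0xF           # assumed Rn field, bits [19:16]
    d = (instr >> 12) & 0xF           # assumed Rd field, bits [15:12]
    rt = instr & 0xF                  # assumed Rt field, bits [3:0]
    t = rt
    t2 = t + 1                        # A1 always pairs <Rt> with <R(t+1)>
    unpredictable = (
        d == 15 or (rt & 1) == 1 or t2 == 15 or n == 15
        or d == n or d == t or d == t2
    )
    return d, t, t2, n, unpredictable


good = decode_stlexd_a1(make_word(d=0, t=2, n=4))      # STLEXD R0, R2, R3, [R4]
odd_rt = decode_stlexd_a1(make_word(d=0, t=3, n=4))    # Rt<0> == '1'
overlap = decode_stlexd_a1(make_word(d=3, t=2, n=4))   # d == t2
```

The function only reports whether the decode is UNPREDICTABLE. It does not pick any of the CONSTRAINED UNPREDICTABLE behaviors that an implementation is then permitted to exhibit.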


CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: Rt<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side effects.

If Rt == '1110', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction is handled as described in Using R15 on page K1-5457.







T1

| 15 14 13 12 11 10 9 8 7 6 5 4 | 3 0 | 15 12 | 11  8 | 7 6 5 4 | 3 0 |
|  1  1  1  0  1  0 0 0 1 1 0 0 | Rn  |  Rt   |  Rt2  | 1 1 1 1 | Rd  |

T1 variant


STLEXD{<c>}{<q>} <Rd>, <Rt>, <Rt2>, [<Rn>]

Decode for this encoding

d = UInt(Rd); t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn);
if d == 15 || t == 15 || t2 == 15 || n == 15 then UNPREDICTABLE;
if d == n || d == t || d == t2 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the destination general-purpose register into which the status result of the store exclusive is
written, encoded in the "Rd" field. The value returned is:

0 If the operation updates memory.
1 If the operation fails to update memory.
<Rt> For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field. 


<Rt> must be even-numbered and not R14. 


For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field. 


<Rt2> For encoding A1: is the second general-purpose register to be transferred. <Rt2> must be <R(t+1)>. 


For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 

Aborts and alignment

If a synchronous Data Abort exception is generated by the execution of this instruction:

• Memory is not updated.
• <Rd> is not updated.

A non doubleword-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject to
the following rules:

• If AArch32.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.

If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    // Create doubleword to store such that R[t] will be stored at address and R[t2] at address+4.
    value = if BigEndian() then R[t]:R[t2] else R[t2]:R[t];
    if AArch32.ExclusiveMonitorsPass(address, 8) then
        MemO[address, 8] = value;
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');

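
The comment in the pseudocode says the doubleword is built so that R[t] lands at address and R[t2] at address+4 in either endianness. The following Python sketch checks that property. It is a hypothetical illustration: `stlexd_bytes` is not an architectural function, and it models only how the 64-bit value is built and laid out in memory.

```python
def stlexd_bytes(rt_value, rt2_value, big_endian):
    """Return the 8 bytes written by the store, lowest address first."""
    if big_endian:
        value = (rt_value << 32) | rt2_value      # R[t]:R[t2]
        return value.to_bytes(8, "big")
    value = (rt2_value << 32) | rt_value          # R[t2]:R[t]
    return value.to_bytes(8, "little")


RT, RT2 = 0x11223344, 0xAABBCCDD
le = stlexd_bytes(RT, RT2, big_endian=False)
be = stlexd_bytes(RT, RT2, big_endian=True)

# In both cases the first four bytes hold R[t] and the last four hold R[t2],
# each in the byte order of the selected endianness.
le_words = (int.from_bytes(le[:4], "little"), int.from_bytes(le[4:], "little"))
be_words = (int.from_bytes(be[:4], "big"), int.from_bytes(be[4:], "big"))
```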
F5.1.214 STLEXH


Store-Release Exclusive Halfword stores a halfword from a register to memory if the executing PE has exclusive 
access to the memory at that address, and returns a status value of 0 if the store was successful, or of 1 if no store 
was performed. 


The instruction also has memory ordering semantics as described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1

| 31 28 | 27 26 25 24 23 | 22 21 | 20 | 19 16 | 15 12 | 11  | 10  | 9 | 8 | 7 6 5 4 | 3 0 |
| cond  |  0  0  0  1  1 |  1  1 |  0 |  Rn   |  Rd   | (1) | (1) | 1 | 0 | 1 0 0 1 | Rt  |

A1 variant

STLEXH{<c>}{<q>} <Rd>, <Rt>, [<Rn>]

Decode for this encoding
d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 


if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; 
if d == n || d == t then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.

T1

| 15 14 13 12 11 10 9 8 7 6 5 4 | 3 0 | 15 12 | 11  | 10  | 9   | 8   | 7 | 6 | 5 4 | 3 0 |
|  1  1  1  0  1  0 0 0 1 1 0 0 | Rn  |  Rt   | (1) | (1) | (1) | (1) | 1 | 1 | 0 1 | Rd  |

T1 variant


STLEXH{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 


Decode for this encoding 


d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; 
if d == n || d == t then UNPREDICTABLE; 

CONSTRAINED UNPREDICTABLE behavior

If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the destination general-purpose register into which the status result of the store exclusive is
written, encoded in the "Rd" field. The value returned is:

0 If the operation updates memory.
1 If the operation fails to update memory.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction:

• Memory is not updated.
• <Rd> is not updated.

A non halfword-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject to
the following rules:

• If AArch32.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.

If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    if AArch32.ExclusiveMonitorsPass(address,2) then
        MemO[address, 2] = R[t]<15:0>;
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');

F5.1.215 STLH


Store-Release Halfword stores a halfword from a register to memory. The instruction also has memory ordering 
semantics as described in Load-Acquire, Store-Release on page B2-90. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1

| 31 28 | 27 26 25 24 23 | 22 21 | 20 | 19 16 | 15  | 14  | 13  | 12  | 11  | 10  | 9 | 8 | 7 6 5 4 | 3 0 |
| cond  |  0  0  0  1  1 |  1  1 |  0 |  Rn   | (1) | (1) | (1) | (1) | (1) | (1) | 0 | 0 | 1 0 0 1 | Rt  |

A1 variant


STLH{<c>}{<q>} <Rt>, [<Rn>] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); 
if t == 15 || n == 15 then UNPREDICTABLE; 


T1

| 15 14 13 12 11 10 9 8 7 6 5 4 | 3 0 | 15 12 | 11  | 10  | 9   | 8   | 7 | 6 | 5 4 | 3   | 2   | 1   | 0   |
|  1  1  1  0  1  0 0 0 1 1 0 0 | Rn  |  Rt   | (1) | (1) | (1) | (1) | 1 | 0 | 0 1 | (1) | (1) | (1) | (1) |


T1 variant

STLH{<c>}{<q>} <Rt>, [<Rn>]

Decode for this encoding

t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE;

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field.


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    MemO[address, 2] = R[t]<15:0>;

F5.1.216 STM, STMIA, STMEA


Store Multiple (Increment After, Empty Ascending) stores multiple registers to consecutive memory locations using 
an address from a base register. The consecutive memory locations start at this address, and the address just above 
the last of those locations can optionally be written back to the base register. 


The lowest-numbered register is stored to the lowest memory address, through to the highest-numbered register
to the highest memory address. See also Encoding of lists of general-purpose registers and the PC on
page F2-2413.


For details of related system instructions see STM (User registers). 


A1

| 31 28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15          0 |
| cond  |  1  0  0 |  0 |  1 |  0 |  W |  0 |  Rn   | register_list |

A1 variant


STM{IA}{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
STMEA{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Empty Ascending stack 


Decode for this encoding 


n = UInt(Rn); registers = register_list; wback = (W == '1');
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers stored.

If n == 15 && wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction executes with writeback to the PC. The instruction is handled as described in Using R15 on
  page K1-5457.

T1

| 15 14 13 12 | 11 | 10  8 | 7           0 |
|  1  1  0  0 |  0 |   Rn  | register_list |

T1 variant


STM{IA}{<c>}{<q>} <Rn>!, <registers> // Preferred syntax 
STMEA{<c>}{<q>} <Rn>!, <registers> // Alternate syntax, Empty Ascending stack 

Decode for this encoding

n = UInt(Rn); registers = '00000000':register_list; wback = TRUE;
if BitCount(registers) < 1 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers stored.

If n == 15 && wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction executes with writeback to the PC. The instruction is handled as described in Using R15 on
  page K1-5457.

T2

| 15 14 13 12 11 10 9 | 8 7 | 6 | 5 | 4 | 3 0 | 15  | 14 | 13  | 12          0 |
|  1  1  1  0  1  0 0 | 0 1 | 0 | W | 0 | Rn  | (0) | M  | (0) | register_list |

Bit 15 is the P field used by the decode, and bit 13 is register_list<13>.

T2 variant

STM{IA}{<c>}.W <Rn>{!}, <registers> // Preferred syntax, if <Rn>, '!' and <registers> can be represented in T1
STMEA{<c>}.W <Rn>{!}, <registers> // Alternate syntax, Empty Ascending stack, if <Rn>, '!' and <registers> can be represented in T1
STM{IA}{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax
STMEA{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Empty Ascending stack


Decode for this encoding 


n = UInt(Rn); registers = P:M:register_list; wback = (W == '1');
if n == 15 || BitCount(registers) < 2 then UNPREDICTABLE;
if wback && registers<n> == '1' then UNPREDICTABLE;
if registers<13> == '1' then UNPREDICTABLE;
if registers<15> == '1' then UNPREDICTABLE;
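
The T2 decode checks above can be collected into a small predicate. The following Python sketch is illustrative only: `stm_t2_unpredictable` is a hypothetical helper, it takes the already assembled 16-bit registers value (P:M:register_list) rather than an instruction word, and it reports which of the listed conditions hold without modeling any CONSTRAINED UNPREDICTABLE outcome.

```python
def stm_t2_unpredictable(n, registers, wback):
    """Return the decode conditions (as strings) that make an STM T2 UNPREDICTABLE."""
    reasons = []
    if n == 15:
        reasons.append("n == 15")
    if bin(registers & 0xFFFF).count("1") < 2:
        reasons.append("BitCount(registers) < 2")
    if wback and (registers >> n) & 1:
        reasons.append("wback && registers<n> == '1'")
    if (registers >> 13) & 1:
        reasons.append("registers<13> == '1'")
    if (registers >> 15) & 1:
        reasons.append("registers<15> == '1'")
    return reasons


ok = stm_t2_unpredictable(n=0, registers=0b0110, wback=True)             # list {R1, R2}
base_in_list = stm_t2_unpredictable(n=1, registers=0b0110, wback=True)   # base R1 is listed
with_sp = stm_t2_unpredictable(n=0, registers=(1 << 13) | 0b10, wback=False)
```

An empty list means none of the decode checks fired. A toolchain would normally reject the other two cases at assembly time.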


CONSTRAINED UNPREDICTABLE behavior

If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15. If the instruction specifies writeback, the modification to the base address
  on writeback might differ from the number of registers stored.

If BitCount(registers) == 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as described, with no change to its behavior and no additional side effects.
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.

If wback && registers<n> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored for the base register is UNKNOWN.

If registers<13> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs all of the stores using the specified addressing mode but the value of R13 is
  UNKNOWN.

If registers<15> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs all of the stores using the specified addressing mode but the value of R15 is
  UNKNOWN.

If n == 15 && wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction executes with writeback to the PC. The instruction is handled as described in Using R15 on
  page K1-5457.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


IA Is an optional suffix for the Increment After form.

<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

! The address adjusted by the size of the data loaded is written back to the base register. If specified,
it is encoded in the "W" field as 1, otherwise this field defaults to 0.

<registers> For encoding A1: is a list of one or more registers to be stored, separated by commas and surrounded
by { and }. The PC can be in the list. However, ARM deprecates the use of instructions that include
the PC in the list. If base register writeback is specified, and the base register is not the
lowest-numbered register in the list, such an instruction stores an UNKNOWN value for the base
register.

For encoding T1: is a list of one or more registers to be stored, separated by commas and surrounded
by { and }. The registers in the list must be in the range R0-R7, encoded in the "register_list" field.
If the base register is not the lowest-numbered register in the list, such an instruction stores an
UNKNOWN value for the base register.

For encoding T2: is a list of one or more registers to be stored, separated by commas and surrounded
by { and }. The registers in the list must be in the range R0-R12, encoded in the "register_list" field,
and can optionally contain the LR. If the LR is in the list, the "M" field is set to 1, otherwise it
defaults to 0.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    for i = 0 to 14
        if registers<i> == '1' then
            if i == n && wback && i != LowestSetBit(registers) then
                MemA[address,4] = bits(32) UNKNOWN;  // Only possible for encodings T1 and A1
            else
                MemA[address,4] = R[i];
            address = address + 4;
    if registers<15> == '1' then  // Only possible for encoding A1
        MemA[address,4] = PCStoreValue();
    if wback then R[n] = R[n] + 4*BitCount(registers);

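
The Operation pseudocode can be followed step by step with a toy Python model. This is a sketch under stated assumptions, not the architectural behavior: `stm_ia` is a hypothetical function, memory is a plain dictionary keyed by address, None stands in for an UNKNOWN stored value, and alignment checking, aborts, and condition codes are not modeled.

```python
def stm_ia(regs, memory, n, registers, wback, pc_store_value=None):
    """Toy model of STM/STMIA: store the listed registers at ascending word addresses.

    regs is a list of 16 register values and memory is a dict keyed by address.
    """
    bit_count = bin(registers & 0xFFFF).count("1")
    lowest = (registers & -registers).bit_length() - 1    # LowestSetBit(registers)
    address = regs[n]
    for i in range(15):                                    # R0-R14
        if (registers >> i) & 1:
            if i == n and wback and i != lowest:
                memory[address] = None                     # UNKNOWN base register value
            else:
                memory[address] = regs[i]
            address += 4
    if (registers >> 15) & 1:                              # PC in the list
        memory[address] = pc_store_value
    if wback:
        regs[n] = regs[n] + 4 * bit_count


regs = [100 + i for i in range(16)]
regs[0] = 0x2000
mem = {}
stm_ia(regs, mem, n=0, registers=0b0110, wback=True)       # STM R0!, {R1, R2}

# Base register in the list but not lowest-numbered: its stored value is UNKNOWN.
regs2 = [100 + i for i in range(16)]
regs2[1] = 0x3000
mem2 = {}
stm_ia(regs2, mem2, n=1, registers=0b0011, wback=True)     # STM R1!, {R0, R1}
```

After the first call, R1 and R2 occupy 0x2000 and 0x2004 and the base register has advanced by eight bytes. The second call shows why a base register with writeback should be the lowest-numbered register in the list, or not be in the list at all.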
F5.1.217 STM (User registers)


In an EL1 mode other than System mode, Store Multiple (User registers) stores multiple User mode registers to 
consecutive memory locations using an address from a base register. The PE reads the base register value normally, 
using the current mode to determine the correct Banked version of the register. This instruction cannot writeback to 
the base register. 


Store Multiple (User registers) is UNDEFINED in Hyp mode, and CONSTRAINED UNPREDICTABLE in User or System
modes.

A1

| 31 28 | 27 26 25 | 24 | 23 | 22 | 21  | 20 | 19 16 | 15          0 |
| cond  |  1  0  0 |  P |  U |  1 | (0) |  0 |  Rn   | register_list |

A1 variant

STM{<amode>}{<c>}{<q>} <Rn>, <registers>^


Decode for this encoding 


n = UInt(Rn); registers = register_list; increment = (U == '1'); wordhigher = (P == U); 
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior

If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<amode> Is one of:

DA Decrement After. The consecutive memory addresses end at the address in the base
   register. Encoded as P = 0, U = 0.
ED Empty Descending. For this instruction, a synonym for DA.
DB Decrement Before. The consecutive memory addresses end one word below the address
   in the base register. Encoded as P = 1, U = 0.
FD Full Descending. For this instruction, a synonym for DB.
IA Increment After. The consecutive memory addresses start at the address in the base
   register. This is the default. Encoded as P = 0, U = 1.
EA Empty Ascending. For this instruction, a synonym for IA.
IB Increment Before. The consecutive memory addresses start one word above the address
   in the base register. Encoded as P = 1, U = 1.
FA Full Ascending. For this instruction, a synonym for IB.

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<registers> Is a list of one or more registers, separated by commas and surrounded by { and }. It specifies the
set of registers to be stored by the STM instruction. The registers are stored with the lowest-numbered
register to the lowest memory address, through to the highest-numbered register to the highest
memory address. See also Encoding of lists of general-purpose registers and the PC on
page F2-2413.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    if PSTATE.EL == EL2 then
        UNDEFINED;
    elsif PSTATE.M IN {M32_User,M32_System} then
        UNPREDICTABLE;
    else
        length = 4*BitCount(registers);
        address = if increment then R[n] else R[n]-length;
        if wordhigher then address = address+4;
        for i = 0 to 14
            if registers<i> == '1' then  // Store User mode register
                MemA[address,4] = Rmode[i, M32_User];
                address = address + 4;
        if registers<15> == '1' then
            MemA[address,4] = PCStoreValue();
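
The way the P and U bits select the first address stored to can be checked with a short Python sketch. The function name `stm_start_address` is hypothetical. It reproduces only the address arithmetic of the pseudocode (increment, wordhigher, and length) and assumes the register list is passed as a 16-bit mask.

```python
def stm_start_address(base, registers, p, u):
    """Return the lowest address written for the addressing mode selected by P and U."""
    increment = (u == 1)
    wordhigher = (p == u)
    length = 4 * bin(registers & 0xFFFF).count("1")
    address = base if increment else base - length
    if wordhigher:
        address += 4
    return address


BASE = 0x1000
LIST = 0b0111                                   # three registers, 12 bytes in total
da = stm_start_address(BASE, LIST, p=0, u=0)    # Decrement After: block ends at base
db = stm_start_address(BASE, LIST, p=1, u=0)    # Decrement Before: ends one word below base
ia = stm_start_address(BASE, LIST, p=0, u=1)    # Increment After: starts at base
ib = stm_start_address(BASE, LIST, p=1, u=1)    # Increment Before: starts one word above base
```

For each mode, the highest address written is the returned value plus the length minus 4, which matches the descriptions of DA, DB, IA, and IB in the assembler symbols.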


CONSTRAINED UNPREDICTABLE behavior

If PSTATE.M IN {M32_User,M32_System}, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers.
  These registers might include R15.

F5.1.218 STMDA, STMED

Store Multiple Decrement After (Empty Descending) stores multiple registers to consecutive memory locations
using an address from a base register. The consecutive memory locations end at this address, and the address just
below the lowest of those locations can optionally be written back to the base register.

The lowest-numbered register is stored to the lowest memory address, through to the highest-numbered register
to the highest memory address. See also Encoding of lists of general-purpose registers and the PC on
page F2-2413.

For details of related system instructions see STM (User registers).

A1

| 31 28 | 27 26 25 | 24 | 23 | 22 | 21 | 20 | 19 16 | 15          0 |
| cond  |  1  0  0 |  0 |  0 |  0 |  W |  0 |  Rn   | register_list |

A1 variant

STMDA{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax
STMED{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Empty Descending stack

Decode for this encoding 

n = UInt(Rn); registers = register_list; wback = (W == '1');
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If BitCount(registers) < 1, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction targets an unspecified set of registers. These registers might include R15. If the instruction
  specifies writeback, the modification to the base address on writeback might differ from the number of
  registers stored.

If n == 15 && wback, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses the addressing mode described in the equivalent immediate offset instruction.

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

! The address adjusted by the size of the data loaded is written back to the base register. If specified,
it is encoded in the "W" field as 1, otherwise this field defaults to 0.

<registers> Is a list of one or more registers to be stored, separated by commas and surrounded by { and }. The
PC can be in the list. However, ARM deprecates the use of instructions that include the PC in the
list. If base register writeback is specified, and the base register is not the lowest-numbered register
in the list, such an instruction stores an UNKNOWN value for the base register.


Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n] - 4*BitCount(registers) + 4;
    for i = 0 to 14
        if registers<i> == '1' then
            if i == n && wback && i != LowestSetBit(registers) then
                MemA[address,4] = bits(32) UNKNOWN;
            else
                MemA[address,4] = R[i];
            address = address + 4;
    if registers<15> == '1' then
        MemA[address,4] = PCStoreValue();
    if wback then R[n] = R[n] - 4*BitCount(registers);

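
The Decrement After layout can be made concrete with a short Python sketch that lists which register goes to which address. `stmda_plan` is a hypothetical helper. It follows the address arithmetic of the pseudocode but does not model the UNKNOWN base register case, aborts, or the value stored for the PC.

```python
def stmda_plan(base, registers, wback):
    """Return ([(address, register index), ...], final base value) for STMDA."""
    count = bin(registers & 0xFFFF).count("1")
    address = base - 4 * count + 4             # lowest address written
    stores = []
    for i in range(16):                        # R15, if listed, takes the highest address
        if (registers >> i) & 1:
            stores.append((address, i))
            address += 4
    new_base = base - 4 * count if wback else base
    return stores, new_base


stores, new_base = stmda_plan(0x1000, 0b1011, wback=True)   # register list {R0, R1, R3}
```

The highest-numbered register is stored at the original base address, and with writeback the base register ends up one word below the lowest address written, which is the Empty Descending stack convention.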
F5.1.219 STMDB, STMFD


Store Multiple Decrement Before (Full Descending) stores multiple registers to consecutive memory locations 
using an address from a base register. The consecutive memory locations end just below this address, and the 
address of the first of those locations can optionally be written back to the base register. 


The lowest-numbered register is stored to the lowest memory address, through to the highest-numbered register
to the highest memory address. See also Encoding of lists of general-purpose registers and the PC on
page F2-2413.


For details of related system instructions see STM (User registers). 


This instruction is used by the alias PUSH (multiple registers). See Alias conditions on page F5-3059 for details of 
when each alias is preferred. 


A1 

 31-28     27-25   24   23   22   21   20   19-16   15-0 
 != 1111   1 0 0   1    0    0    W    0    Rn      register_list 
 cond 

A1 variant 


STMDB{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
STMFD{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Descending stack 


Decode for this encoding 


n = UInt(Rn); registers = register_list; wback = (W == '1'); 
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 
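
The A1 field layout and the decode check can be illustrated with a short Python sketch. This is not ARM pseudocode: the function name encode_stmdb_a1 and its parameters are hypothetical, chosen only for illustration, and the sketch assumes the bit positions shown in the A1 encoding diagram. It raises an error for the cases that the decode pseudocode marks as UNPREDICTABLE rather than modelling any CONSTRAINED UNPREDICTABLE behavior.

```python
def encode_stmdb_a1(cond: int, rn: int, registers: int, wback: bool) -> int:
    """Pack the STMDB A1 fields: cond | 1 0 0 | 1 0 0 W 0 | Rn | register_list."""
    if not 0 <= cond <= 0xE:
        raise ValueError("cond must be in the range 0 to 14 (cond != 1111)")
    if not 0 <= rn <= 15 or not 0 <= registers <= 0xFFFF:
        raise ValueError("field out of range")
    # Mirrors the decode check: n == 15 or an empty register list is UNPREDICTABLE.
    if rn == 15 or bin(registers).count("1") < 1:
        raise ValueError("UNPREDICTABLE encoding")
    word = (cond << 28) | (0b100 << 25) | (1 << 24)  # bits 27-25 = 100, bit 24 = 1
    word |= (int(wback) << 21) | (rn << 16) | registers  # bits 23, 22 and 20 stay 0
    return word
```

For example, encode_stmdb_a1(0xE, 13, 0x4010, True) gives 0xE92D4010, the word for STMDB SP!, {R4, LR}.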


CONSTRAINED UNPREDICTABLE behavior 


If BitCount(registers) < 1, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers. 


These registers might include R15. If the instruction specifies writeback, the modification to the base address 
on writeback might differ from the number of registers stored. 


T1 


 15-9            8-7   6   5   4   3-0   |   15   14   13                  12-0 
 1 1 1 0 1 0 0   1 0   0   W   0   Rn    |   0    M    0                   register_list 
                                             P         register_list<13> 


T1 variant 


STMDB{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
STMFD{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Descending stack 


Decode for this encoding 


n = UInt(Rn); registers = P:M:register_list; wback = (W == '1'); 
if n == 15 || BitCount(registers) < 2 then UNPREDICTABLE; 

if wback && registers<n> == '1' then UNPREDICTABLE; 

if registers<13> == '1' then UNPREDICTABLE; 

if registers<15> == '1' then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior 

If BitCount(registers) < 1, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers. 

These registers might include R15. If the instruction specifies writeback, the modification to the base address 
on writeback might differ from the number of registers stored. 

If wback && registers<n> == '1', then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored for the base register is UNKNOWN. 

If BitCount(registers) == 1, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes as described, with no change to its behavior and no additional side effects. 
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers. 

These registers might include R15. 

If registers<13> == '1', then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes as described, with no change to its behavior and no additional side effects. 
• The store instruction performs all of the stores using the specified addressing mode but the value of R13 is 
UNKNOWN. 

If registers<15> == '1', then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs all of the stores using the specified addressing mode but the value of R15 is 
UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Alias conditions 

Alias                       Of variant   Is preferred when 
PUSH (multiple registers)   T1           W == '1' && Rn == '1101' && BitCount(M:register_list) > 1 
PUSH (multiple registers)   A1           W == '1' && Rn == '1101' && BitCount(register_list) > 1 
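
The alias conditions in the table can be restated as two small predicates. The following Python sketch is illustrative only: the function names are hypothetical, and the field values are assumed to have been extracted from the instruction already. Because BitCount of the concatenation M:register_list equals M plus BitCount(register_list), the T1 check does not depend on the width of the register_list field.

```python
def bit_count(value: int) -> int:
    """Number of bits set to 1, as BitCount() in the pseudocode."""
    return bin(value).count("1")


def push_preferred_a1(w: int, rn: int, register_list: int) -> bool:
    """A1: W == '1' && Rn == '1101' && BitCount(register_list) > 1."""
    return w == 1 and rn == 0b1101 and bit_count(register_list) > 1


def push_preferred_t1(w: int, rn: int, m: int, register_list: int) -> bool:
    """T1: W == '1' && Rn == '1101' && BitCount(M:register_list) > 1."""
    return w == 1 and rn == 0b1101 and (m + bit_count(register_list)) > 1
```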





Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


! The address adjusted by the size of the data stored is written back to the base register. If specified, 
it is encoded in the "W" field as 1, otherwise this field defaults to 0. 

<registers> For encoding A1: is a list of one or more registers to be stored, separated by commas and surrounded 
by { and }. The PC can be in the list. However, ARM deprecates the use of instructions that include 
the PC in the list. If base register writeback is specified, and the base register is not the 
lowest-numbered register in the list, such an instruction stores an UNKNOWN value for the base 
register. 

For encoding T1: is a list of one or more registers to be stored, separated by commas and surrounded 
by { and }. The registers in the list must be in the range R0-R12, encoded in the "register_list" field, 
and can optionally contain the LR. If the LR is in the list, the "M" field is set to 1, otherwise it 
defaults to 0. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    address = R[n] - 4*BitCount(registers); 
    for i = 0 to 14 
        if registers<i> == '1' then 
            if i == n && wback && i != LowestSetBit(registers) then 
                MemA[address,4] = bits(32) UNKNOWN; // Only possible for encoding A1 
            else 
                MemA[address,4] = R[i]; 
            address = address + 4; 
    if registers<15> == '1' then // Only possible for encoding A1 
        MemA[address,4] = PCStoreValue(); 
    if wback then R[n] = R[n] - 4*BitCount(registers); 
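
The operation pseudocode can be followed step by step with a small Python model. This is a sketch under stated assumptions, not an architectural definition: the function name stmdb, the register list regs (16 integers), the dictionary mem (word address to value), and the pc_store_value parameter are all hypothetical. Condition checking, alignment, and faults are not modelled, and an UNKNOWN stored value is represented as None.

```python
def stmdb(regs, mem, n, registers, wback, pc_store_value=0):
    """Model of STMDB: store the listed registers ending just below regs[n]."""
    count = bin(registers & 0xFFFF).count("1")           # BitCount(registers)
    lowest = (registers & -registers).bit_length() - 1   # LowestSetBit(registers)
    address = (regs[n] - 4 * count) & 0xFFFFFFFF
    for i in range(15):                                  # registers 0 to 14
        if (registers >> i) & 1:
            if i == n and wback and i != lowest:
                mem[address] = None                      # UNKNOWN value stored
            else:
                mem[address] = regs[i]
            address = (address + 4) & 0xFFFFFFFF
    if (registers >> 15) & 1:
        mem[address] = pc_store_value                    # stands in for PCStoreValue()
    if wback:
        regs[n] = (regs[n] - 4 * count) & 0xFFFFFFFF
```

With regs[13] = 0x1000 and a register list of R4 and R14, the model stores R4 at 0xFF8 and R14 at 0xFFC, and writeback leaves 0xFF8 in R13, which matches the description of a Full Descending stack push.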
F5.1.220 STMIB, STMFA 

Store Multiple Increment Before (Full Ascending) stores multiple registers to consecutive memory locations using 
an address from a base register. The consecutive memory locations start just above this address, and the address of 
the last of those locations can optionally be written back to the base register. 
The lowest-numbered register is stored to the lowest memory address, through to the highest-numbered register 
to the highest memory address. See also Encoding of lists of general-purpose registers and the PC on 
page F2-2413. 
For details of related system instructions see STM (User registers). 
A1 

 31-28     27-25   24   23   22   21   20   19-16   15-0 
 != 1111   1 0 0   1    1    0    W    0    Rn      register_list 
 cond 

A1 variant 
STMIB{<c>}{<q>} <Rn>{!}, <registers> // Preferred syntax 
STMFA{<c>}{<q>} <Rn>{!}, <registers> // Alternate syntax, Full Ascending stack 
Decode for this encoding 

n = UInt(Rn); registers = register_list; wback = (W == '1'); 

if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior 
If BitCount(registers) < 1, then one of the following behaviors must occur: 
• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction operates as an STM with the same addressing mode but targeting an unspecified set of registers. 

These registers might include R15. 

If n == 15 && wback, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
! The address adjusted by the size of the data stored is written back to the base register. If specified, 
it is encoded in the "W" field as 1, otherwise this field defaults to 0. 

<registers> Is a list of one or more registers to be stored, separated by commas and surrounded by { and }. The 
PC can be in the list. However, ARM deprecates the use of instructions that include the PC in the 
list. If base register writeback is specified, and the base register is not the lowest-numbered register 
in the list, such an instruction stores an UNKNOWN value for the base register. 


Operation 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    address = R[n] + 4; 
    for i = 0 to 14 
        if registers<i> == '1' then 
            if i == n && wback && i != LowestSetBit(registers) then 
                MemA[address,4] = bits(32) UNKNOWN; 
            else 
                MemA[address,4] = R[i]; 
            address = address + 4; 
    if registers<15> == '1' then 
        MemA[address,4] = PCStoreValue(); 
    if wback then R[n] = R[n] + 4*BitCount(registers); 
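
To make the Increment Before address sequence concrete, the following Python sketch lists which register goes to which address and what value writeback would leave in the base register. The function name stmib_plan and its return format are hypothetical; the sketch only computes addresses and does not model memory, the UNKNOWN base register case, or PCStoreValue().

```python
def stmib_plan(base: int, registers: int):
    """Return ([(address, register_number), ...], writeback_value) for STMIB."""
    address = (base + 4) & 0xFFFFFFFF        # first location is just above the base
    plan = []
    for i in range(16):                      # lowest-numbered register first
        if (registers >> i) & 1:
            plan.append((address, i))
            address = (address + 4) & 0xFFFFFFFF
    writeback = (base + 4 * len(plan)) & 0xFFFFFFFF  # address of the last location
    return plan, writeback
```

For a base of 0x1000 and the list {R0, R1, R2}, the plan is 0x1004, 0x1008, and 0x100C, and the writeback value is 0x100C, the address of the last location stored.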
F5.1.221 STR (immediate) 


Store Register (immediate) calculates an address from a base register value and an immediate offset, and stores a 
word from a register to memory. It can use offset, post-indexed, or pre-indexed addressing. For information about 
memory accesses see Memory accesses on page F2-2412. 


This instruction is used by the alias PUSH (single register). See Alias conditions on page F5-3065 for details of 
when each alias is preferred. 


A1 

 31-28     27-25   24   23   22   21   20   19-16   15-12   11-0 
 != 1111   0 1 0   P    U    0    W    0    Rn      Rt      imm12 
 cond 


Offset variant 
Applies when P == 1 && W == 0. 


STR{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STR{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STR{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 

if P == '0' && W == '1' then SEE STRT; 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); 
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1'); 
if wback && (n == 15 || n == t) then UNPREDICTABLE; 
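
The decode steps for the A1 encoding can be mirrored in Python as follows. This is an illustrative sketch: the function name decode_str_imm_a1 and the returned dictionary are hypothetical, and the sketch assumes the word has already been identified as an STR (immediate) A1 encoding, so the fixed bits and the cond field are not checked. The SEE STRT and UNPREDICTABLE cases are reported as errors rather than modelled.

```python
def decode_str_imm_a1(word: int) -> dict:
    """Extract the decode results for an A32 STR (immediate) instruction word."""
    p = (word >> 24) & 1
    u = (word >> 23) & 1
    w = (word >> 21) & 1
    if p == 0 and w == 1:
        raise ValueError("SEE STRT")
    t = (word >> 12) & 0xF            # Rt, bits 15-12
    n = (word >> 16) & 0xF            # Rn, bits 19-16
    imm32 = word & 0xFFF              # ZeroExtend(imm12, 32)
    index = p == 1
    add = u == 1
    wback = p == 0 or w == 1
    if wback and (n == 15 or n == t):
        raise ValueError("UNPREDICTABLE")
    return {"t": t, "n": n, "imm32": imm32,
            "index": index, "add": add, "wback": wback}
```

For example, the word 0xE52D0004 (STR R0, [SP, #-4]!) decodes to t = 0, n = 13, imm32 = 4, with index and wback true and add false.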


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored is UNKNOWN. 

If wback && n == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 

T1 

 15-13   12   11   10-6   5-3   2-0 
 0 1 1   0    0    imm5   Rn    Rt 

T1 variant 


STR{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5:'00', 32); 
index = TRUE; add = TRUE; wback = FALSE; 


T2 

 15-12     11   10-8   7-0 
 1 0 0 1   0    Rt     imm8 

T2 variant 


STR{<c>}{<q>} <Rt>, [SP{, #{+}<imm>}] 


Decode for this encoding 
t = UInt(Rt); n = 13; imm32 = ZeroExtend(imm8:'00', 32); 
index = TRUE; add = TRUE; wback = FALSE; 


T3 


 15-9            8   7   6-5   4   3-0       |   15-12   11-0 
 1 1 1 1 1 0 0   0   1   1 0   0   != 1111   |   Rt      imm12 
                                   Rn 


T3 variant 


STR{<c>}.W <Rt>, [<Rn> {, #{+}<imm>}] // <Rt>, <Rn>, <imm> can be represented in T1 or T2 
STR{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
if Rn == '1111' then UNDEFINED; 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); 


index = TRUE; add = TRUE; wback = FALSE; 
if t == 15 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 

T4 

 15-9            8   7   6-5   4   3-0       |   15-12   11   10   9   8   7-0 
 1 1 1 1 1 0 0   0   0   1 0   0   != 1111   |   Rt      1    P    U   W   imm8 
                                   Rn 





Offset variant 
Applies when P == 1 && U == 0 && W == 0. 


STR{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


STR{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STR{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if P == '1' && U == '1' && W == '0' then SEE STRT; 
if Rn == '1111' || (P == '0' && W == '0') then UNDEFINED; 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); 
index = (P == '1'); add = (U == '1'); wback = (W == '1'); 
if t == 15 || (wback && n == t) then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored is UNKNOWN. 

If wback && n == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 

If t == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 





F5-3064 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 


Non-Confidential 


F5 T32 and A32 Base Instruction Set Instruction Descriptions 
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


Alias conditions 

Alias                    Of variant         Is preferred when 
PUSH (single register)   A1 (pre-indexed)   P == '1' && U == '0' && W == '1' && Rn == '1101' && imm12 == '000000000100' 
PUSH (single register)   T4 (pre-indexed)   Rn == '1101' && U == '0' && imm8 == '00000100' 





Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC 
can be used, but this is deprecated. 

For encoding T1, T2, T3 and T4: is the general-purpose register to be transferred, encoded in the 
"Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 
in the offset variant, but this is deprecated. 

For encoding T1, T3 and T4: is the general-purpose base register, encoded in the "Rn" field. 

+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0 
+ when U = 1 

+ Specifies the offset is added to the base register. 

<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to 
0 if omitted, and encoded in the "imm12" field. 

For encoding T1: is the optional positive unsigned immediate byte offset, a multiple of 4, in the 
range 0 to 124, defaulting to 0 and encoded in the "imm5" field as <imm>/4. 

For encoding T2: is the optional positive unsigned immediate byte offset, a multiple of 4, in the 
range 0 to 1020, defaulting to 0 and encoded in the "imm8" field as <imm>/4. 

For encoding T3: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095, 
defaulting to 0 and encoded in the "imm12" field. 

For encoding T4: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 
if omitted, and encoded in the "imm8" field. 
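
The scaled immediates of encodings T1 and T2 (encoded as <imm>/4) can be checked with a short Python sketch. The helper name encode_scaled_imm is hypothetical; the field widths of 5 bits for T1 and 8 bits for T2 come from the symbol descriptions above.

```python
def encode_scaled_imm(imm: int, field_bits: int) -> int:
    """Return the field value for a byte offset that is encoded as <imm>/4.

    field_bits is 5 for encoding T1 (imm5) and 8 for encoding T2 (imm8).
    """
    limit = ((1 << field_bits) - 1) * 4      # 124 for T1, 1020 for T2
    if imm % 4 != 0 or not 0 <= imm <= limit:
        raise ValueError("offset must be a multiple of 4 in the range 0 to %d" % limit)
    return imm // 4
```

An offset that fails this check cannot be represented in T1 or T2, and an assembler would have to select a different encoding, such as T3, if one is available for the operands.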


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then 
    if ConditionPassed() then 
        EncodingSpecificOperations(); 
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32); 
        address = if index then offset_addr else R[n]; 
        MemU[address,4] = if t == 15 then PCStoreValue() else R[t]; 
        if wback then R[n] = offset_addr; 
else 
    if ConditionPassed() then 
        EncodingSpecificOperations(); 
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32); 
        address = if index then offset_addr else R[n]; 
        MemU[address,4] = R[t]; 
        if wback then R[n] = offset_addr; 
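
The three addressing forms differ only in the values of index and wback. The following Python sketch models the common operation; the function name str_imm, the register list regs, and the dictionary mem are hypothetical, and the t == 15 case (PCStoreValue()), condition checking, and faults are not modelled.

```python
def str_imm(regs, mem, t, n, imm32, index, add, wback):
    """Model of STR (immediate): store regs[t] as a word and optionally write back."""
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base   # post-indexed forms use the old base
    mem[address] = regs[t]
    if wback:
        regs[n] = offset_addr


regs = [0] * 16
regs[0], regs[1] = 0xDEADBEEF, 0x100
mem = {}
str_imm(regs, mem, 0, 1, 8, index=True, add=True, wback=False)   # offset
str_imm(regs, mem, 0, 1, 8, index=True, add=True, wback=True)    # pre-indexed
str_imm(regs, mem, 0, 1, 8, index=False, add=True, wback=True)   # post-indexed
```

In this usage the offset form stores at 0x108 and leaves R1 unchanged, the pre-indexed form stores at 0x108 and updates R1 to 0x108, and the post-indexed form then stores at 0x108 and updates R1 to 0x110.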


F5.1.222 STR (register) 


Store Register (register) calculates an address from a base register value and an offset register value, and stores a 
word from a register to memory. The offset register value can optionally be shifted. For information about memory 
accesses see Memory accesses on page F2-2412. 


A1 

 31-28     27-25   24   23   22   21   20   19-16   15-12   11-7   6-5    4   3-0 
 != 1111   0 1 1   P    U    0    W    0    Rn      Rt      imm5   type   0   Rm 
 cond 


Offset variant 
Applies when P == 1 && W == 0. 


STR{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STR{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STR{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE STRT; 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); 
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1'); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 
if m == 15 then UNPREDICTABLE; 
if wback && (n == 15 || n == t) then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If wback && n == t, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored is UNKNOWN. 

If wback && n == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 

T1 

 15-12     11   10   9   8-6   5-3   2-0 
 0 1 0 1   0    0    0   Rm    Rn    Rt 


T1 variant 


STR{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); 
index = TRUE; add = TRUE; wback = FALSE; 
(shift_t, shift_n) = (SRType_LSL, 0); 


T2 


 15-9            8   7   6-5   4   3-0       |   15-12   11-6          5-4    3-0 
 1 1 1 1 1 0 0   0   0   1 0   0   != 1111   |   Rt      0 0 0 0 0 0   imm2   Rm 
                                   Rn 


T2 variant 


STR{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
STR{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rn == '1111' then UNDEFINED; 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); 

index = TRUE; add = TRUE; wback = FALSE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if t == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC 
can be used, but this is deprecated. 

For encoding T1 and T2: is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 
in the offset variant, but this is deprecated. 

For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field. 

+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted 
and encoded in the "U" field. It can have the following values: 
- when U = 0 
+ when U = 1 

+ Specifies the index register is added to the base register. 

<Rm> Is the general-purpose index register, encoded in the "Rm" field. 

<shift> The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts 
applied to a register on page F2-2410. 

<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded 
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C); 
    offset_addr = if add then (R[n] + offset) else (R[n] - offset); 
    address = if index then offset_addr else R[n]; 
    if t == 15 then // Only possible for encoding A1 
        data = PCStoreValue(); 
    else 
        data = R[t]; 
    MemU[address,4] = data; 
    if wback then R[n] = offset_addr; 
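
The address calculation with a shifted index register can be sketched in Python as follows. The names shift32 and str_reg_address are hypothetical, and the string shift kinds stand in for the SRType values used by the pseudocode; this sketch assumes 32-bit register values and the shift amounts that the decode stage can produce.

```python
MASK32 = 0xFFFFFFFF


def shift32(value: int, kind: str, amount: int, carry_in: int = 0) -> int:
    """Shift a 32-bit value; kind is one of LSL, LSR, ASR, ROR, RRX."""
    value &= MASK32
    if kind == "RRX":                       # rotate right by one through carry
        return ((carry_in & 1) << 31) | (value >> 1)
    if amount == 0:
        return value
    if kind == "LSL":
        return (value << amount) & MASK32
    if kind == "LSR":
        return value >> amount
    if kind == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32
    if kind == "ROR":
        amount %= 32
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError("unknown shift kind")


def str_reg_address(rn, rm, kind, amount, add=True, index=True, carry_in=0):
    """Return (address, offset_addr) as computed by the operation pseudocode."""
    offset = shift32(rm, kind, amount, carry_in)
    offset_addr = (rn + offset if add else rn - offset) & MASK32
    return (offset_addr if index else rn & MASK32), offset_addr
```

For example, a base of 0x1000 with an index of 3 shifted by LSL #2 gives an address of 0x100C, which is the usual way to index an array of words.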


F5.1.223 STRB (immediate) 


Store Register Byte (immediate) calculates an address from a base register value and an immediate offset, and stores 
a byte from a register to memory. It can use offset, post-indexed, or pre-indexed addressing. For information about 
memory accesses see Memory accesses on page F2-2412. 


A1 


 31-28     27-25   24   23   22   21   20   19-16   15-12   11-0 
 != 1111   0 1 0   P    U    1    W    0    Rn      Rt      imm12 
 cond 


Offset variant 
Applies when P == 1 && W == 0. 


STRB{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STRB{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRB{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE STRBT; 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); 
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1'); 
if t == 15 then UNPREDICTABLE; 
if wback && (n == 15 || n == t) then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 

If wback && n == t, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored is UNKNOWN. 

If wback && n == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 

T1 


 15-13   12   11   10-6   5-3   2-0 
 0 1 1   1    0    imm5   Rn    Rt 


T1 variant 


STRB{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5, 32); 
index = TRUE; add = TRUE; wback = FALSE; 


T2 


 15-9            8   7   6-5   4   3-0       |   15-12   11-0 
 1 1 1 1 1 0 0   0   1   0 0   0   != 1111   |   Rt      imm12 
                                   Rn 


T2 variant 


STRB{<c>}.W <Rt>, [<Rn> {, #{+}<imm>}] // <Rt>, <Rn>, <imm> can be represented in T1 
STRB{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
if Rn == '1111' then UNDEFINED; 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); 


index = TRUE; add = TRUE; wback = FALSE; 
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 


T3 


 15-9            8   7   6-5   4   3-0       |   15-12   11   10   9   8   7-0 
 1 1 1 1 1 0 0   0   0   0 0   0   != 1111   |   Rt      1    P    U   W   imm8 
                                   Rn 


Offset variant 
Applies when P == 1 && U == 0 && W == 0. 


STRB{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


STRB{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRB{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if P == '1' && U == '1' && W == '0' then SEE STRBT; 
if Rn == '1111' || (P == '0' && W == '0') then UNDEFINED; 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); 
index = (P == '1'); add = (U == '1'); wback = (W == '1'); 
if t == 15 || (wback && n == t) then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 

If wback && n == t, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored is UNKNOWN. 

If wback && n == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 
in the offset variant, but this is deprecated. 

For encoding T1, T2 and T3: is the general-purpose base register, encoded in the "Rn" field. 

+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and 
encoded in the "U" field. It can have the following values: 
- when U = 0 
+ when U = 1 

+ Specifies the offset is added to the base register. 

<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to 
0 if omitted, and encoded in the "imm12" field. 

For encoding T1: is an optional 5-bit unsigned immediate byte offset, in the range 0 to 31, defaulting 
to 0 and encoded in the "imm5" field. 

For encoding T2: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095, 
defaulting to 0 and encoded in the "imm12" field. 

For encoding T3: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 
if omitted, and encoded in the "imm8" field. 


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then 
    if ConditionPassed() then 
        EncodingSpecificOperations(); 
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32); 
        address = if index then offset_addr else R[n]; 
        MemU[address,1] = R[t]<7:0>; 
        if wback then R[n] = offset_addr; 
else 
    if ConditionPassed() then 
        EncodingSpecificOperations(); 
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32); 
        address = if index then offset_addr else R[n]; 
        MemU[address,1] = R[t]<7:0>; 
        if wback then R[n] = offset_addr; 
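
The byte store writes only R[t]<7:0> and leaves adjacent bytes untouched. The following Python sketch shows this with a bytearray standing in for memory; the function name strb_imm and its arguments are hypothetical, and condition checking and faults are not modelled.

```python
def strb_imm(regs, mem, t, n, imm32, index=True, add=True, wback=False):
    """Model of STRB (immediate): store the low byte of regs[t] into mem."""
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base
    mem[address] = regs[t] & 0xFF           # R[t]<7:0>
    if wback:
        regs[n] = offset_addr


regs = [0] * 16
regs[0], regs[1] = 0x12345678, 4
mem = bytearray(16)
strb_imm(regs, mem, t=0, n=1, imm32=3)      # offset form: writes one byte at address 7
```

After this call only the byte at address 7 has changed, and it holds 0x78, the least significant byte of R0.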


F5.1.224 STRB (register) 


Store Register Byte (register) calculates an address from a base register value and an offset register value, and stores 
a byte from a register to memory. The offset register value can optionally be shifted. For information about memory 
accesses see Memory accesses on page F2-2412. 


A1 

 31-28     27-25   24   23   22   21   20   19-16   15-12   11-7   6-5    4   3-0 
 != 1111   0 1 1   P    U    1    W    0    Rn      Rt      imm5   type   0   Rm 
 cond 


Offset variant 
Applies when P == 1 && W == 0. 


STRB{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STRB{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRB{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>{, <shift>}]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE STRBT; 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); 
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1'); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 
if t == 15 || m == 15 then UNPREDICTABLE; 
if wback && (n == 15 || n == t) then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction performs the store using the specified addressing mode but the value corresponding to 
R15 is UNKNOWN. 

If wback && n == t, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The store instruction executes but the value stored is UNKNOWN. 

If wback && n == 15, then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction executes without writeback of the base address. 
• The instruction uses the addressing mode described in the equivalent immediate offset instruction. 


T1 


 15-12     11   10   9   8-6   5-3   2-0 
 0 1 0 1   0    1    0   Rm    Rn    Rt 


T1 variant 


STRB{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); 
index = TRUE; add = TRUE; wback = FALSE; 
(shift_t, shift_n) = (SRType_LSL, 0); 


T2 


 15-9            8   7   6-5   4   3-0       |   15-12   11-6          5-4    3-0 
 1 1 1 1 1 0 0   0   0   0 0   0   != 1111   |   Rt      0 0 0 0 0 0   imm2   Rm 
                                   Rn 


T2 variant 


STRB{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
STRB{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rn == '1111' then UNDEFINED; 

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); 

index = TRUE; add = TRUE; wback = FALSE; 

(shift_t, shift_n) = (SRType_LSL, UInt(imm2)); 

if t == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rt>       Is the general-purpose register to be transferred, encoded in the "Rt" field.
<Rn>       For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used
           in the offset variant, but this is deprecated.
           For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field.
+/-        Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
           and encoded in the "U" field. It can have the following values:
           -    when U = 0
           +    when U = 1
+          Specifies the index register is added to the base register.
<Rm>       Is the general-purpose index register, encoded in the "Rm" field.
<shift>    The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
           applied to a register on page F2-2410.
<imm>      If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
           in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if index then offset_addr else R[n];
    MemU[address,1] = R[t]<7:0>;
    if wback then R[n] = offset_addr;
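
The address calculation in the operation pseudocode can be illustrated with a small Python sketch. This is not part of the architecture definition: the function name strb_register, the dictionary-based register file and memory, and the restriction to an LSL shift are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def strb_register(regs, mem, t, n, m, shift_n, index, add, wback):
    """Illustrative model of the STRB (register) operation, LSL shift only.

    regs and mem are plain dicts (hypothetical stand-ins for R[] and MemU[]).
    Returns the address that was written.
    """
    offset = (regs[m] << shift_n) & MASK32
    if add:
        offset_addr = (regs[n] + offset) & MASK32
    else:
        offset_addr = (regs[n] - offset) & MASK32
    # Offset and pre-indexed forms use the computed address,
    # the post-indexed form uses the unmodified base register.
    address = offset_addr if index else regs[n]
    mem[address] = regs[t] & 0xFF  # only bits <7:0> are stored
    if wback:
        regs[n] = offset_addr
    return address
```

With index TRUE and wback FALSE this models the offset form; with index FALSE and wback TRUE it models the post-indexed form, where the store uses the old base value and the base register is updated afterwards.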


F5.1.225 STRBT 


Store Register Byte Unprivileged stores a byte from a register to memory. For information about memory accesses 
see Memory accesses on page F2-2412. 


The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is 
actually running in User mode. 


STRBT is UNPREDICTABLE in Hyp mode. 


The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a 
base register value and an immediate offset, and leaves the base register unchanged. 


The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the 
memory access, and calculates a new address from a base register value and an offset and writes it back to the base 
register. The offset can be an immediate value or an optionally-shifted register value. 


A1 
| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11    0 |
| != 1111  |  0  1  0  0 |  U  1  1  0 |   Rn  |   Rt  |  imm12  |
    cond

A1 variant


STRBT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');
register_form = FALSE; imm32 = ZeroExtend(imm12, 32);
if t == 15 || n == 15 || n == t then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.

If n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses the addressing mode described in the equivalent immediate offset instruction.


A2 


| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11   7 | 6  5 | 4 | 3  0 |
| != 1111  |  0  1  1  0 |  U  1  1  0 |   Rn  |   Rt  |  imm5  | type | 0 |  Rm  |
    cond

A2 variant
STRBT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 
Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');
register_form = TRUE; (shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.

If n == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses the addressing mode described in the equivalent immediate offset instruction.

T1 
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3     0 | 15 12 | 11 10 9 8 | 7    0 |
|  1  1  1  1 |  1  0 0 0 | 0 0 0 0 | != 1111 |   Rt  |  1  1 1 0 |  imm8  |
                                         Rn


T1 variant 


STRBT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 


if Rn == '1111' then UNDEFINED; 

t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32); 

if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rt>       For encoding A1: is the general-purpose register to be transferred, encoded in the "Rt" field. The PC
           can be used, but this is deprecated.
           For encoding A2 and T1: is the general-purpose register to be transferred, encoded in the "Rt" field.
<Rn>       Is the general-purpose base register, encoded in the "Rn" field.
+/-        For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
           + if omitted and encoded in the "U" field. It can have the following values:
           -    when U = 0
           +    when U = 1
           For encoding A2: specifies the index register is added to or subtracted from the base register,
           defaulting to + if omitted and encoded in the "U" field. It can have the following values:
           -    when U = 0
           +    when U = 1
<Rm>       Is the general-purpose index register, encoded in the "Rm" field.
<shift>    The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
           applied to a register on page F2-2410.
+          Specifies the offset is added to the base register.
<imm>      For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to
           0 if omitted, and encoded in the "imm12" field.
           For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
           defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then
    if PSTATE.EL == EL2 then UNPREDICTABLE;               // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then Shift(R[m], shift_t, shift_n, PSTATE.C) else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    MemU_unpriv[address,1] = R[t]<7:0>;
    if postindex then R[n] = offset_addr;
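
The difference between the A32 post-indexed form and the T32 offset form can be seen in a minimal Python sketch of the address selection. The function name strbt_access and its tuple return value are hypothetical; the sketch ignores the unprivileged access check and models only the arithmetic.

```python
MASK32 = 0xFFFFFFFF


def strbt_access(rn, offset, add, postindex):
    """Return (address used for the store, value of Rn afterwards).

    Models only the address arithmetic of the STRBT pseudocode; the
    unprivileged memory access itself is not modelled.
    """
    if add:
        offset_addr = (rn + offset) & MASK32
    else:
        offset_addr = (rn - offset) & MASK32
    if postindex:
        # A32 encodings: store at the old base, then write back the new base.
        return rn, offset_addr
    # T32 encoding: store at base + offset, base register unchanged.
    return offset_addr, rn
```

For example, strbt_access(0x2000, 8, True, True) gives the store address 0x2000 with the base updated to 0x2008, whereas strbt_access(0x2000, 8, True, False) gives the store address 0x2008 with the base left at 0x2000.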


CONSTRAINED UNPREDICTABLE behavior 


If PSTATE.EL == EL2, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as STRB (immediate).





F5.1.226 STRD (immediate) 
Store Register Dual (immediate) calculates an address from a base register value and an immediate offset, and stores 
two words from two registers to memory. It can use offset, post-indexed, or pre-indexed addressing. For information 
about memory accesses see Memory accesses on page F2-2412. 
A1 
| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11   8 | 7 6 5 4 | 3    0 |
| != 1111  |  0  0  0  P |  U  1  W  0 |   Rn  |   Rt  |  imm4H | 1 1 1 1 |  imm4L |
    cond
Offset variant 
Applies when P == 1 && W == 0. 
STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn> {, #{+/-}<imm>}] 
Post-indexed variant 
Applies when P == 0 && W == 0. 
STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>], #{+/-}<imm> 
Pre-indexed variant 
Applies when P == 1 && W == 1. 
STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, #{+/-}<imm>]! 
Decode for all variants of this encoding 
if Rt<0> == '1' then UNPREDICTABLE;
t = UInt(Rt); t2 = t+1; n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if P == '0' && W == '1' then UNPREDICTABLE;
if wback && (n == 15 || n == t || n == t2) then UNPREDICTABLE;
if t2 == 15 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If t == 15 || t2 == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.

If wback && (n == t || n == t2), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If wback && n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses the addressing mode described in the equivalent immediate offset instruction.

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: t<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side-effects. This does
  not apply when Rt == '1111'.

If P == '0' && W == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as an LDRD using one of offset, post-indexed, or pre-indexed addressing.

T1 
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3     0 | 15 12 | 11  8 | 7    0 |
|  1  1  1  0 |  1  0 0 P | U 1 W 0 | != 1111 |   Rt  |  Rt2  |  imm8  |
                                         Rn


Offset variant 
Applies when P == 1 && W == 0. 


STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 1. 


STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, #{+/-}<imm>]! 


Decode for all variants of this encoding 


if P == '0' && W == '0' then SEE "Related encodings";
t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if wback && (n == t || n == t2) then UNPREDICTABLE;
if n == 15 || t == 15 || t2 == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If t == 15 || t2 == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.

If wback && (n == t || n == t2), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If wback && n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses the addressing mode described in the equivalent immediate offset instruction.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related encodings: Load/Store dual, Load/Store Exclusive, Load-Acquire/Store-Release, and table branch on 
page F3-2466. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rt>       For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field.
           This register must be even-numbered and not R14.
           For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field.
<Rt2>      For encoding A1: is the second general-purpose register to be transferred. This register must be
           <R(t+1)>.
           For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field.
<Rn>       For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used
           in the offset variant, but this is deprecated.
           For encoding T1: is the general-purpose base register, encoded in the "Rn" field.
+/-        Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
           encoded in the "U" field. It can have the following values:
           -    when U = 0
           +    when U = 1
<imm>      For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
           if omitted, and encoded in the "imm4H:imm4L" field.
           For encoding T1: is the unsigned immediate byte offset, a multiple of 4, in the range 0 to 1020,
           defaulting to 0 if omitted, and encoded in the "imm8" field as <imm>/4.
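
The T1 scaling of <imm> described above can be sketched in Python as follows. The helper names encode_strd_t1_imm and decode_strd_t1_imm are hypothetical and exist only to illustrate the <imm>/4 relationship and its inverse, ZeroExtend(imm8:'00', 32).

```python
def encode_strd_t1_imm(imm):
    """Return the imm8 field value for a T1 byte offset (hypothetical helper)."""
    if imm % 4 != 0 or not 0 <= imm <= 1020:
        raise ValueError("offset must be a multiple of 4 in the range 0 to 1020")
    return imm // 4


def decode_strd_t1_imm(imm8):
    """Return the byte offset for an imm8 field, as ZeroExtend(imm8:'00', 32)."""
    return (imm8 & 0xFF) << 2
```

An offset of 1020 therefore encodes as imm8 = 255, and an offset that is not a multiple of 4 has no T1 encoding.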


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    if address == Align(address, 8) then
        bits(64) data;
        if BigEndian() then
            data<63:32> = R[t];
            data<31:0> = R[t2];
        else
            data<31:0> = R[t];
            data<63:32> = R[t2];
        MemA[address,8] = data;
    else
        MemA[address,4] = R[t];
        MemA[address+4,4] = R[t2];
    if wback then R[n] = offset_addr;
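
The doubleword-aligned path of this pseudocode assembles a 64-bit value whose layout depends on endianness. The following Python sketch, in which the function name strd_bytes is an assumption made for illustration, shows that in both cases the word from R[t] ends up at the lower address and the word from R[t2] at address+4.

```python
def strd_bytes(rt, rt2, big_endian):
    """Return the 8 bytes written to memory, lowest address first.

    rt and rt2 are the 32-bit values of R[t] and R[t2]. This is an
    illustrative model of the aligned 64-bit store only.
    """
    if big_endian:
        data = (rt << 32) | rt2      # data<63:32> = R[t], data<31:0> = R[t2]
        return data.to_bytes(8, "big")
    data = (rt2 << 32) | rt          # data<31:0> = R[t], data<63:32> = R[t2]
    return data.to_bytes(8, "little")
```

Only the byte order within each word differs between the two cases; the order of the two words in memory is the same.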
F5.1.227 STRD (register)


Store Register Dual (register) calculates an address from a base register value and a register offset, and stores two
words from two registers to memory. It can use offset, post-indexed, or pre-indexed addressing. For information
about memory accesses see Memory accesses on page F2-2412.


A1 


| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 10  9  8  | 7 6 5 4 | 3  0 |
| != 1111  |  0  0  0  P |  U  0  W  0 |   Rn  |   Rt  | (0)(0)(0)(0) | 1 1 1 1 |  Rm  |
    cond


Offset variant 
Applies when P == 1 && W == 0. 


STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, {+/-}<Rm>] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>], {+/-}<Rm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRD{<c>}{<q>} <Rt>, <Rt2>, [<Rn>, {+/-}<Rm>]! 


Decode for all variants of this encoding 

if Rt<0> == '1' then UNPREDICTABLE;
t = UInt(Rt); t2 = t+1; n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if P == '0' && W == '1' then UNPREDICTABLE;
if t2 == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t || n == t2) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 
If t == 15 || t2 == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction performs the store using the specified addressing mode but the value corresponding to
  R15 is UNKNOWN.

If wback && (n == t || n == t2), then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If wback && n == 15, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes without writeback of the base address.
• The instruction uses the addressing mode described in the equivalent immediate offset instruction.

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: t<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side-effects. This does
  not apply when Rt == '1111'.

If P == '0' && W == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: P = '1'; W = '0'.
• The instruction executes with the additional decode: P = '1'; W = '1'.
• The instruction executes with the additional decode: P = '0'; W = '0'.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rt>       Is the first general-purpose register to be transferred, encoded in the "Rt" field. This register must
           be even-numbered and not R14.
<Rt2>      Is the second general-purpose register to be transferred. This register must be <R(t+1)>.
<Rn>       Is the general-purpose base register, encoded in the "Rn" field. The PC can be used in the offset
           variant, but this is deprecated.
+/-        Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
           and encoded in the "U" field. It can have the following values:
           -    when U = 0
           +    when U = 1
<Rm>       Is the general-purpose index register, encoded in the "Rm" field.

Operation 


if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + R[m]) else (R[n] - R[m]);
    address = if index then offset_addr else R[n];
    if address == Align(address, 8) then
        bits(64) data;
        if BigEndian() then
            data<63:32> = R[t];
            data<31:0> = R[t2];
        else
            data<31:0> = R[t];
            data<63:32> = R[t2];
        MemA[address,8] = data;
    else
        MemA[address,4] = R[t];
        MemA[address+4,4] = R[t2];
    if wback then R[n] = offset_addr;
F5.1.228 STREX


Store Register Exclusive calculates an address from a base register value and an immediate offset, stores a word 
from a register to the calculated address if the PE has exclusive access to the memory at that address, and returns a 
status value of 0 if the store was successful, or of 1 if no store was performed. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11  10  9 8 | 7 6 5 4 | 3  0 |
| != 1111  |  0  0  0  1 |  1  0  0  0 |   Rn  |   Rd  | (1) (1) 1 1 | 1 0 0 1 |  Rt  |
    cond

A1 variant

STREX{<c>}{<q>} <Rd>, <Rt>, [<Rn> {, {#}<imm>}] 

Decode for this encoding 

d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); imm32 = Zeros(32); // Zero offset
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE;
if d == n || d == t then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.

T1 
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 12 | 11  8 | 7    0 |
|  1  1  1  0 |  1  0 0 0 | 0 1 0 0 |  Rn  |   Rt  |   Rd  |  imm8  |


T1 variant 


STREX{<c>}{<q>} <Rd>, <Rt>, [<Rn> {, #<imm>}] 


Decode for this encoding 


d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32); 
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if d == n || d == t then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior


If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rd>       Is the destination general-purpose register into which the status result of the store exclusive is
           written, encoded in the "Rd" field. The value returned is:
           0    If the operation updates memory.
           1    If the operation fails to update memory.
<Rt>       Is the general-purpose register to be transferred, encoded in the "Rt" field.
<Rn>       Is the general-purpose base register, encoded in the "Rn" field.
<imm>      For encoding A1: the immediate offset added to the value of <Rn> to calculate the address. <imm> can
           only be 0 or omitted.
           For encoding T1: the immediate offset added to the value of <Rn> to calculate the address. <imm> can
           be omitted, meaning an offset of 0. Values are multiples of 4 in the range 0-1020.

Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction:

• Memory is not updated.
• <Rd> is not updated.

A non word-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject to
the following rules:

• If AArch32.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
• Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.

If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n] + imm32;
    if AArch32.ExclusiveMonitorsPass(address,4) then
        MemA[address,4] = R[t];
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');
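
The status value written to R[d] can be illustrated with a deliberately simplified Python model. The class ToyExclusiveMonitor and its methods are hypothetical: the model tracks a single marked address for one PE and does not represent the architected local and global monitors, whose behavior is described in Synchronization and semaphores.

```python
class ToyExclusiveMonitor:
    """Toy single-PE model of a load-exclusive/store-exclusive pair.

    This is an assumption-laden sketch, not the architected monitor:
    it only remembers the last address marked by a load-exclusive.
    """

    def __init__(self):
        self.memory = {}
        self.marked = None  # address marked for exclusive access, if any

    def ldrex(self, address):
        self.marked = address
        return self.memory.get(address, 0)

    def strex(self, address, value):
        """Return 0 if the store was performed, 1 if no store was performed."""
        if self.marked == address:
            self.memory[address] = value & 0xFFFFFFFF
            self.marked = None
            return 0
        return 1
```

A store-exclusive that follows a matching load-exclusive returns 0 and updates memory. A second store-exclusive to the same address, with no intervening load-exclusive, returns 1 and leaves memory unchanged.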
F5.1.229 STREXB
Store Register Exclusive Byte derives an address from a base register value, stores a byte from a register to the 
derived address if the executing PE has exclusive access to the memory at that address, and returns a status value 
of 0 if the store was successful, or of 1 if no store was performed. 
For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 
A1 
| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11  10  9 8 | 7 6 5 4 | 3  0 |
| != 1111  |  0  0  0  1 |  1  1  0  0 |   Rn  |   Rd  | (1) (1) 1 1 | 1 0 0 1 |  Rt  |
    cond
A1 variant
STREXB{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 
Decode for this encoding 
d = UInt(Rd); t = UInt(Rt); n = UInt(Rn);
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE;
if d == n || d == t then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If d == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.
If d == n, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.
T1 
| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 12 | 11 10  9  8  | 7 6 5 4 | 3  0 |
|  1  1  1  0 |  1  0 0 0 | 1 1 0 0 |  Rn  |   Rt  | (1)(1)(1)(1) | 0 1 0 0 |  Rd  |
T1 variant 
STREXB{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 
Decode for this encoding 
d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if d == n || d == t then UNPREDICTABLE; 
CONSTRAINED UNPREDICTABLE behavior


If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rd>       Is the destination general-purpose register into which the status result of the store exclusive is
           written, encoded in the "Rd" field. The value returned is:
           0    If the operation updates memory.
           1    If the operation fails to update memory.
<Rt>       Is the general-purpose register to be transferred, encoded in the "Rt" field.
<Rn>       Is the general-purpose base register, encoded in the "Rn" field.


Aborts 

If a synchronous Data Abort exception is generated by the execution of this instruction:

• Memory is not updated.
• <Rd> is not updated.

If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    if AArch32.ExclusiveMonitorsPass(address,1) then
        MemA[address,1] = R[t]<7:0>;
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');
F5.1.230 STREXD


Store Register Exclusive Doubleword derives an address from a base register value, stores a 64-bit doubleword from 
two registers to the derived address if the executing PE has exclusive access to the memory at that address, and 
returns a status value of 0 if the store was successful, or of 1 if no store was performed. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31    28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11  10  9 8 | 7 6 5 4 | 3  0 |
| != 1111  |  0  0  0  1 |  1  0  1  0 |   Rn  |   Rd  | (1) (1) 1 1 | 1 0 0 1 |  Rt  |
    cond

A1 variant


STREXD{<c>}{<q>} <Rd>, <Rt>, <Rt2>, [<Rn>] 
Decode for this encoding 
d = UInt(Rd); t = UInt(Rt); t2 = t+1; n = UInt(Rn);
if d == 15 || Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;
if d == n || d == t || d == t2 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.

If Rt<0> == '1', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes with the additional decode: Rt<0> = '0'.
• The instruction executes with the additional decode: t2 = t.
• The instruction executes as described, with no change to its behavior and no additional side effects.

If Rt == '1110', then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction is handled as described in Using R15 on page K1-5457.

T1


| 15 14 13 12 | 11 10 9 8 | 7 6 5 4 | 3  0 | 15 12 | 11  8 | 7 6 5 4 | 3  0 |
|  1  1  1  0 |  1  0 0 0 | 1 1 0 0 |  Rn  |   Rt  |  Rt2  | 0 1 1 1 |  Rd  |


T1 variant 


STREXD{<c>}{<q>} <Rd>, <Rt>, <Rt2>, [<Rn>] 


Decode for this encoding 
d = UInt(Rd); t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn);
if d == 15 || t == 15 || t2 == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if d == n || d == t || d == t2 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d == t, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The store instruction executes but the value stored is UNKNOWN.

If d == n, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction performs the store to an UNKNOWN address.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c>        See Standard assembler syntax fields on page F2-2406.
<q>        See Standard assembler syntax fields on page F2-2406.
<Rd>       Is the destination general-purpose register into which the status result of the store exclusive is
           written, encoded in the "Rd" field. The value returned is:
           0    If the operation updates memory.
           1    If the operation fails to update memory.
           <Rd> must not be the same as <Rn>, <Rt>, or <Rt2>.
<Rt>       For encoding A1: is the first general-purpose register to be transferred, encoded in the "Rt" field.
           <Rt> must be even-numbered and not R14.
           For encoding T1: is the first general-purpose register to be transferred, encoded in the "Rt" field.
<Rt2>      For encoding A1: is the second general-purpose register to be transferred. <Rt2> must be <R(t+1)>.
           For encoding T1: is the second general-purpose register to be transferred, encoded in the "Rt2" field.
<Rn>       Is the general-purpose base register, encoded in the "Rn" field.
Aborts and alignment

If a synchronous Data Abort exception is generated by the execution of this instruction:
° Memory is not updated.
° <Rd> is not updated.


A non doubleword-aligned memory address causes an Alignment fault Data Abort exception to be generated, 
subject to the following rules: 


° If AArch32.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
° Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated.


If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated.
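
The alignment rules above can be summarized in a few lines of Python. This is only a sketch: exclusive_alignment_outcome and its string results are names invented here for illustration, and the monitors_pass argument stands in for the result of AArch32.ExclusiveMonitorsPass().

```python
def exclusive_alignment_outcome(address, size, monitors_pass):
    """Classify an exclusive store of `size` bytes (8 for STREXD) at `address`.

    Hypothetical helper: the returned strings are labels for this sketch only.
    """
    if address % size == 0:
        return "aligned"                 # the alignment rule does not apply
    if monitors_pass:
        return "alignment fault"         # the exception is generated
    return "implementation defined"      # the exception may or may not be generated
```

For example, exclusive_alignment_outcome(0x1004, 8, False) reports that a misaligned doubleword access whose monitor check fails is left to the implementation.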


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    // Create doubleword to store such that R[t] will be stored at address and R[t2] at address+4.
    value = if BigEndian() then R[t]:R[t2] else R[t2]:R[t];
    if AArch32.ExclusiveMonitorsPass(address,8) then
        MemA[address,8] = value; R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');
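
The doubleword construction and the status result in the pseudocode above can be modelled as follows. This is a minimal sketch, not the architectural definition: strexd_model, the byte-addressed memory dict, and the monitors_pass flag (standing in for AArch32.ExclusiveMonitorsPass()) are hypothetical names introduced for illustration.

```python
def strexd_model(memory, address, rt, rt2, big_endian, monitors_pass):
    """Toy model of the STREXD operation; returns the status value written to Rd."""
    if not monitors_pass:
        return 1  # no store is performed
    # Concatenate so that R[t] lands at address and R[t2] at address+4
    # in either endianness, as the pseudocode comment requires.
    value = (rt << 32) | rt2 if big_endian else (rt2 << 32) | rt
    byteorder = "big" if big_endian else "little"
    for i, byte in enumerate(value.to_bytes(8, byteorder)):
        memory[address + i] = byte
    return 0
```

In both byte orders the first word in memory is the one taken from R[t]; only the byte order inside each word changes.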
F5.1.231 STREXH


Store Register Exclusive Halfword derives an address from a base register value, stores a halfword from a register 
to the derived address if the executing PE has exclusive access to the memory at that address, and returns a status 
value of 0 if the store was successful, or of 1 if no store was performed. 


For more information about support for shared memory see Synchronization and semaphores on page E2-2355. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:8       | 7:4     | 3:0 |
| cond  | 0 0 0 1 | 1 1 1 0 | Rn    | Rd    | (1)(1) 1 1 | 1 0 0 1 | Rt  |


A1 variant


STREXH{<c>}{<q>} <Rd>, <Rt>, [<Rn>]


Decode for this encoding


d = UInt(Rd); t = UInt(Rt); n = UInt(Rn);
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE;
if d == n || d == t then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d == t, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction executes but the value stored is UNKNOWN. 


If d == n, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The instruction performs the store to an UNKNOWN address.


T1


| 15:4                    | 3:0 | 15:12 | 11:8         | 7:4     | 3:0 |
| 1 1 1 0 1 0 0 0 1 1 0 0 | Rn  | Rt    | (1)(1)(1)(1) | 0 1 0 1 | Rd  |


T1 variant 


STREXH{<c>}{<q>} <Rd>, <Rt>, [<Rn>] 


Decode for this encoding 


d = UInt(Rd); t = UInt(Rt); n = UInt(Rn); 
if d == 15 || t == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if d == n || d == t then UNPREDICTABLE; 
F5 T32 and A32 Base Instruction Set Instruction Descriptions
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


CONSTRAINED UNPREDICTABLE behavior 


If d == t, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction executes but the value stored is UNKNOWN. 


If d == n, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The instruction performs the store to an UNKNOWN address. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the destination general-purpose register into which the status result of the store exclusive is
written, encoded in the "Rd" field. The value returned is:
0 If the operation updates memory.
1 If the operation fails to update memory.
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


Aborts and alignment 

If a synchronous Data Abort exception is generated by the execution of this instruction: 
° Memory is not updated. 

° <Rd> is not updated. 


A non halfword-aligned memory address causes an Alignment fault Data Abort exception to be generated, subject 
to the following rules: 


° If AArch32.ExclusiveMonitorsPass() returns TRUE, the exception is generated.
° Otherwise, it is IMPLEMENTATION DEFINED whether the exception is generated. 


If AArch32.ExclusiveMonitorsPass() returns FALSE and the memory address, if accessed, would generate a 
synchronous Data Abort exception, it is IMPLEMENTATION DEFINED whether the exception is generated. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    address = R[n];
    if AArch32.ExclusiveMonitorsPass(address,2) then
        MemA[address,2] = R[t]<15:0>;
        R[d] = ZeroExtend('0');
    else
        R[d] = ZeroExtend('1');
F5.1.232 STRH (immediate)


Store Register Halfword (immediate) calculates an address from a base register value and an immediate offset, and 
stores a halfword from a register to memory. It can use offset, post-indexed, or pre-indexed addressing. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 
| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:8  | 7:4     | 3:0   |
| cond  | 0 0 0 P | U 1 W 0 | Rn    | Rt    | imm4H | 1 0 1 1 | imm4L |


Offset variant 
Applies when P == 1 && W == 0. 


STRH{<c>}{<q>} <Rt>, [<Rn> {, #{+/-}<imm>}] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STRH{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRH{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>] ! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE STRHT;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
if t == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;
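
As an illustration of the A1 decode above, the following Python sketch extracts the fields from a 32-bit instruction word. The function name decode_strh_imm_a1 and its return values are hypothetical. The field positions are an assumption read from the A1 encoding diagram (P, U, W at bits 24, 23, 21; Rn at 19:16; Rt at 15:12; imm4H at 11:8; imm4L at 3:0), and the sketch does not check the fixed opcode bits.

```python
def decode_strh_imm_a1(instr):
    """Decode the A1 fields of STRH (immediate) from a 32-bit word (sketch only)."""
    p = (instr >> 24) & 1
    u = (instr >> 23) & 1
    w = (instr >> 21) & 1
    if p == 0 and w == 1:
        return "SEE STRHT"
    t = (instr >> 12) & 0xF
    n = (instr >> 16) & 0xF
    # imm32 = ZeroExtend(imm4H:imm4L, 32)
    imm32 = (((instr >> 8) & 0xF) << 4) | (instr & 0xF)
    index = p == 1
    add = u == 1
    wback = p == 0 or w == 1
    if t == 15 or (wback and (n == 15 or n == t)):
        return "UNPREDICTABLE"
    return {"t": t, "n": n, "imm32": imm32,
            "index": index, "add": add, "wback": wback}
```

Under these assumptions the word 0xE1C211B2 decodes as the offset variant with t = 1, n = 2, and an offset of 18.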


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP.
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


If wback && n == t, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The store instruction executes but the value stored is UNKNOWN.


If wback && n == 15, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes without writeback of the base address.
° The instruction uses the addressing mode described in the equivalent immediate offset instruction.

T1 


| 15:11     | 10:6 | 5:3 | 2:0 |
| 1 0 0 0 0 | imm5 | Rn  | Rt  |


T1 variant 


STRH{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5:'0', 32);
index = TRUE; add = TRUE; wback = FALSE;


T2 


| 15:4                    | 3:0 | 15:12 | 11:0  |
| 1 1 1 1 1 0 0 0 1 0 1 0 | Rn  | Rt    | imm12 |


T2 variant 


STRH{<c>}.W <Rt>, [<Rn> {, #{+}<imm>}] // <Rt>, <Rn>, <imm> can be represented in T1 
STRH{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
if Rn == '1111' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP.
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


T3 


| 15:4                    | 3:0 | 15:12 | 11 | 10 | 9 | 8 | 7:0  |
| 1 1 1 1 1 0 0 0 0 0 1 0 | Rn  | Rt    | 1  | P  | U | W | imm8 |





Offset variant 
Applies when P == 1 && U == 0 && W == 0.


STRH{<c>}{<q>} <Rt>, [<Rn> {, #-<imm>}]


Post-indexed variant 
Applies when P == 0 && W == 1. 


STRH{<c>}{<q>} <Rt>, [<Rn>], #{+/-}<imm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRH{<c>}{<q>} <Rt>, [<Rn>, #{+/-}<imm>] ! 


Decode for all variants of this encoding 


if P == '1' && U == '1' && W == '0' then SEE STRHT;
if Rn == '1111' || (P == '0' && W == '0') then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == '1'); add = (U == '1'); wback = (W == '1');
if t == 15 || (wback && n == t) then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


If wback && n == t, then one of the following behaviors must occur:


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction executes but the value stored is UNKNOWN. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used
in the offset variant, but this is deprecated.
For encoding A1, T1, T2, T3: is the general-purpose base register, encoded in the "Rn" field.
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.


For encoding T1: is the optional positive unsigned immediate byte offset, a multiple of 2, in the 
range 0 to 62, defaulting to 0 and encoded in the "imm5" field as <imm>/2. 


For encoding T2: is an optional 12-bit unsigned immediate byte offset, in the range 0 to 4095, 
defaulting to 0 and encoded in the "imm12" field. 


For encoding T3: is an 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0 
if omitted, and encoded in the "imm8" field. 
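
The immediate ranges listed above for the T32 offset forms can be checked mechanically. The sketch below is illustrative only: offset_variant_candidates is a hypothetical helper, it considers only the offset variants, it ignores the register restrictions of the 16-bit T1 encoding, and it reports which encodings can represent an offset rather than which one an assembler would choose.

```python
def offset_variant_candidates(imm):
    """Return (encoding, field values) pairs able to represent byte offset imm."""
    candidates = []
    if 0 <= imm <= 62 and imm % 2 == 0:
        # T1 stores the offset as <imm>/2 in imm5.
        candidates.append(("T1", {"imm5": imm // 2}))
    if 0 <= imm <= 4095:
        candidates.append(("T2", {"imm12": imm}))
    if -255 <= imm < 0:
        # The T3 offset variant has U == 0, so the offset is subtracted.
        candidates.append(("T3", {"U": 0, "imm8": -imm}))
    return candidates
```

An offset of 6 fits both T1 (imm5 = 3) and T2, an odd offset such as 7 needs T2, and a negative offset needs T3.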


Operation for all encodings 


if CurrentInstrSet() == InstrSet_A32 then
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        MemU[address,2] = R[t]<15:0>;
        if wback then R[n] = offset_addr;
else
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        MemU[address,2] = R[t]<15:0>;
        if wback then R[n] = offset_addr;
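
The effect of index, add, and wback in the operation above can be traced with a small Python model. This is a sketch under stated assumptions: strh_immediate, the regs dict, and the byte-addressed memory dict are hypothetical, the store is modelled as little-endian, and aborts and alignment checks are not modelled.

```python
MASK32 = 0xFFFFFFFF

def strh_immediate(regs, memory, t, n, imm32, index, add, wback):
    """Toy model of the STRH (immediate) operation pseudocode."""
    base = regs[n]
    offset_addr = (base + imm32) & MASK32 if add else (base - imm32) & MASK32
    address = offset_addr if index else base
    half = regs[t] & 0xFFFF                  # R[t]<15:0>
    memory[address] = half & 0xFF            # little-endian byte order assumed
    memory[(address + 1) & MASK32] = half >> 8
    if wback:
        regs[n] = offset_addr
```

With index False and wback True (post-indexed) the store uses the old base and the base register is then updated. With index True and wback False (offset) the store uses the computed address and the base register is unchanged.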
F5.1.233 STRH (register)


Store Register Halfword (register) calculates an address from a base register value and an offset register value, and 
stores a halfword from a register to memory. The offset register value can be shifted left by 0, 1, 2, or 3 bits. For 
information about memory accesses see Memory accesses on page F2-2412. 


A1 


| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:8         | 7:4     | 3:0 |
| cond  | 0 0 0 P | U 0 W 0 | Rn    | Rt    | (0)(0)(0)(0) | 1 0 1 1 | Rm  |


Offset variant 
Applies when P == 1 && W == 0. 


STRH{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>] 


Post-indexed variant 
Applies when P == 0 && W == 0. 


STRH{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Pre-indexed variant 
Applies when P == 1 && W == 1. 


STRH{<c>}{<q>} <Rt>, [<Rn>, {+/-}<Rm>]! 


Decode for all variants of this encoding 


if P == '0' && W == '1' then SEE STRHT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == '1'); add = (U == '1'); wback = (P == '0') || (W == '1');
(shift_t, shift_n) = (SRType_LSL, 0);
if t == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


If wback && n == t, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The store instruction executes but the value stored is UNKNOWN.


If wback && n == 15, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes without writeback of the base address.
° The instruction uses the addressing mode described in the equivalent immediate offset instruction.


T1 


| 15:12   | 11:9  | 8:6 | 5:3 | 2:0 |
| 0 1 0 1 | 0 0 1 | Rm  | Rn  | Rt  |


T1 variant 


STRH{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>] 


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);


T2 


| 15:4                    | 3:0 | 15:12 | 11:6        | 5:4  | 3:0 |
| 1 1 1 1 1 0 0 0 0 0 1 0 | Rn  | Rt    | 0 0 0 0 0 0 | imm2 | Rm  |


T2 variant 


STRH{<c>}.W <Rt>, [<Rn>, {+}<Rm>] // <Rt>, <Rn>, <Rm> can be represented in T1 
STRH{<c>}{<q>} <Rt>, [<Rn>, {+}<Rm>{, LSL #<imm>}] 


Decode for this encoding 


if Rn == '1111' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, UInt(imm2));
if t == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP.
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-3103 


ID092916 Non-Confidential 




<Rn> For encoding A1: is the general-purpose base register, encoded in the "Rn" field. The PC can be used 
in the offset variant, but this is deprecated. 


For encoding T1 and T2: is the general-purpose base register, encoded in the "Rn" field. 


+/- Specifies the index register is added to or subtracted from the base register, defaulting to + if omitted
and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
+ Specifies the index register is added to the base register.
<Rm> Is the general-purpose index register, encoded in the "Rm" field.
<imm> If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. <imm> is encoded
in imm2. If absent, no shift is specified and imm2 is encoded as 0b00.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, PSTATE.C);
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if index then offset_addr else R[n];
    MemU[address,2] = R[t]<15:0>;
    if wback then R[n] = offset_addr;
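
The address calculation in the operation above reduces to a shift and a 32-bit add or subtract. The following Python sketch shows it for the LSL shift that every encoding of this instruction uses. The name strh_register_address is hypothetical, and register values are passed in directly instead of being read from a register file.

```python
def strh_register_address(rn, rm, shift_n=0, add=True, index=True):
    """Return (address used for the store, offset_addr) for STRH (register).

    Sketch only: models Shift(R[m], SRType_LSL, shift_n, ...) as a 32-bit left shift.
    """
    offset = (rm << shift_n) & 0xFFFFFFFF
    if add:
        offset_addr = (rn + offset) & 0xFFFFFFFF
    else:
        offset_addr = (rn - offset) & 0xFFFFFFFF
    address = offset_addr if index else rn
    return address, offset_addr
```

For the T2 form with LSL #2, a base of 0x2000 and an index of 3 give an access address of 0x200C.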
F5.1.234 STRHT


Store Register Halfword Unprivileged stores a halfword from a register to memory. For information about memory 
accesses see Memory accesses on page F2-2412. 


The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is 
actually running in User mode. 


STRHT is UNPREDICTABLE in Hyp mode. 


The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a 
base register value and an immediate offset, and leaves the base register unchanged. 


The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the 
memory access, and calculates a new address from a base register value and an offset and writes it back to the base 
register. The offset can be an immediate value or a register value. 
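
The difference between the two addressing modes described above can be made concrete with a short Python sketch. The helper unprivileged_store_addressing is a hypothetical name. It returns the address used for the access and the value the base register holds afterwards, and it models neither the unprivileged access check nor the store itself.

```python
def unprivileged_store_addressing(base, offset, postindex, add=True):
    """Return (access address, base register value after the instruction)."""
    if add:
        offset_addr = (base + offset) & 0xFFFFFFFF
    else:
        offset_addr = (base - offset) & 0xFFFFFFFF
    if postindex:
        # A32 form: access at the old base, then write the new address back.
        return base, offset_addr
    # T32 form: access at base plus offset, base register left unchanged.
    return offset_addr, base
```

With a base of 0x1000 and an offset of 8, the A32 form stores at 0x1000 and leaves 0x1008 in the base register, while the T32 form stores at 0x1008 and leaves the base at 0x1000.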


A1 


| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:8  | 7:4     | 3:0   |
| cond  | 0 0 0 0 | U 1 1 0 | Rn    | Rt    | imm4H | 1 0 1 1 | imm4L |


A1 variant


STRHT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>}


Decode for this encoding 


t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');
register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32);
if t == 15 || n == 15 || n == t then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


If n == t, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The store instruction executes but the value stored is UNKNOWN.


If n == 15, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes without writeback of the base address.
° The instruction uses the addressing mode described in the equivalent immediate offset instruction.


A2 
| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:8         | 7:4     | 3:0 |
| cond  | 0 0 0 0 | U 0 1 0 | Rn    | Rt    | (0)(0)(0)(0) | 1 0 1 1 | Rm  |


A2 variant


STRHT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm> 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');
register_form = TRUE;
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


If n == t, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The store instruction executes but the value stored is UNKNOWN.


If n == 15, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes without writeback of the base address.
° The instruction uses the addressing mode described in the equivalent immediate offset instruction.


T1


| 15:4                    | 3:0 | 15:12 | 11:8    | 7:0  |
| 1 1 1 1 1 0 0 0 0 0 1 0 | Rn  | Rt    | 1 1 1 0 | imm8 |


T1 variant 


STRHT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 


if Rn == '1111' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose register to be transferred, encoded in the "Rt" field. 

<Rn> Is the general-purpose base register, encoded in the "Rn" field. 

+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
+ if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<Rm> Is the general-purpose index register, encoded in the "Rm" field.
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 8-bit unsigned immediate byte offset, in the range 0 to 255, defaulting to 0
if omitted, and encoded in the "imm4H:imm4L" field.
For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then
    if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then R[m] else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    MemU_unpriv[address,2] = R[t]<15:0>;
    if postindex then R[n] = offset_addr;


CONSTRAINED UNPREDICTABLE behavior

If PSTATE.EL == EL2, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes as STRH (immediate).
F5.1.235 STRT


Store Register Unprivileged stores a word from a register to memory. For information about memory accesses see 
Memory accesses on page F2-2412. 


The memory access is restricted as if the PE were running in User mode. This makes no difference if the PE is 
actually running in User mode. 


STRT is UNPREDICTABLE in Hyp mode. 


The T32 instruction uses an offset addressing mode, that calculates the address used for the memory access from a 
base register value and an immediate offset, and leaves the base register unchanged. 


The A32 instruction uses a post-indexed addressing mode, that uses a base register value as the address for the 
memory access, and calculates a new address from a base register value and an offset and writes it back to the base 
register. The offset can be an immediate value or an optionally-shifted register value. 


A1 
| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:0  |
| cond  | 0 1 0 0 | U 0 1 0 | Rn    | Rt    | imm12 |


A1 variant


STRT{<c>}{<q>} <Rt>, [<Rn>] {, #{+/-}<imm>} 


Decode for this encoding 
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == '1');
register_form = FALSE; imm32 = ZeroExtend(imm12, 32);
if n == 15 || n == t then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If n == t, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction executes but the value stored is UNKNOWN. 


If n == 15, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes without writeback of the base address.
° The instruction uses the addressing mode described in the equivalent immediate offset instruction.


A2


| 31:28 | 27:24   | 23:20   | 19:16 | 15:12 | 11:7 | 6:5  | 4 | 3:0 |
| cond  | 0 1 1 0 | U 0 1 0 | Rn    | Rt    | imm5 | type | 0 | Rm  |

A2 variant 


STRT{<c>}{<q>} <Rt>, [<Rn>], {+/-}<Rm>{, <shift>} 
Decode for this encoding

t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == '1');
register_form = TRUE; (shift_t, shift_n) = DecodeImmShift(type, imm5);
if n == 15 || n == t || m == 15 then UNPREDICTABLE;
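
The A2 decode relies on DecodeImmShift(), which is defined elsewhere in the manual and not in this section. The Python sketch below shows the conventional A32 immediate-shift decoding as an assumption, to be checked against that definition. The function name decode_imm_shift and the string shift names are illustrative only.

```python
def decode_imm_shift(type_bits, imm5):
    """Map the 2-bit type field and imm5 to a (shift type, shift amount) pair.

    Assumed behavior, not quoted from this section: an imm5 of 0 means a
    shift of 32 for LSR and ASR, and selects RRX for type 0b11.
    """
    if type_bits == 0b00:
        return ("LSL", imm5)
    if type_bits == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b11:
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("type must be a 2-bit value")
```

The returned pair corresponds to (shift_t, shift_n) in the decode pseudocode.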


CONSTRAINED UNPREDICTABLE behavior 


If n == t, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction executes but the value stored is UNKNOWN. 


If n == 15, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes without writeback of the base address.
° The instruction uses the addressing mode described in the equivalent immediate offset instruction.


T1


| 15:4                    | 3:0 | 15:12 | 11:8    | 7:0  |
| 1 1 1 1 1 0 0 0 0 1 0 0 | Rn  | Rt    | 1 1 1 0 | imm8 |


T1 variant 


STRT{<c>}{<q>} <Rt>, [<Rn> {, #{+}<imm>}] 


Decode for this encoding 
if Rn == '1111' then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior 


If t == 15, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The store instruction performs the store using the specified addressing mode but the value corresponding to
R15 is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406.
<Rt> For encoding A1 and A2: is the general-purpose register to be transferred, encoded in the "Rt" field.
The PC can be used, but this is deprecated.
For encoding T1: is the general-purpose register to be transferred, encoded in the "Rt" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.
+/- For encoding A1: specifies the offset is added to or subtracted from the base register, defaulting to
+ if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
For encoding A2: specifies the index register is added to or subtracted from the base register,
defaulting to + if omitted and encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<Rm> Is the general-purpose index register, encoded in the "Rm" field.
<shift> The shift to apply to the value read from <Rm>. If absent, no shift is applied. Otherwise, see Shifts
applied to a register on page F2-2410.
+ Specifies the offset is added to the base register.
<imm> For encoding A1: is the 12-bit unsigned immediate byte offset, in the range 0 to 4095, defaulting to
0 if omitted, and encoded in the "imm12" field.
For encoding T1: is an optional 8-bit unsigned immediate byte offset, in the range 0 to 255,
defaulting to 0 and encoded in the "imm8" field.


Operation for all encodings 


if ConditionPassed() then
    if PSTATE.EL == EL2 then UNPREDICTABLE; // Hyp mode
    EncodingSpecificOperations();
    offset = if register_form then Shift(R[m], shift_t, shift_n, PSTATE.C) else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    if t == 15 then // Only possible for encodings A1 and A2
        data = PCStoreValue();
    else
        data = R[t];
    MemU_unpriv[address,4] = data;
    if postindex then R[n] = offset_addr;


CONSTRAINED UNPREDICTABLE behavior


If PSTATE.EL == EL2, then one of the following behaviors must occur:


° The instruction is UNDEFINED.
° The instruction executes as NOP.
° The instruction executes as STR (immediate).
F5.1.236 SUB (immediate, from PC)


Subtract from PC subtracts an immediate value from the Align(PC, 4) value to form a PC-relative address, and
writes the result to the destination register. ARM recommends that, where possible, software avoids using this alias.


This instruction is an alias of the ADR instruction. This means that:
° The encodings in this description are named to match the encodings of ADR.
° The description of ADR gives the operational pseudocode for this instruction.
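
The result described above, Align(PC, 4) minus an immediate, can be sketched in Python. The names align and sub_from_pc are hypothetical, and pc is assumed to be the PC value as read by the instruction, supplied by the caller. How that value relates to the instruction's own address is not modelled here.

```python
def align(value, n):
    """Round value down to a multiple of n, in the manner of Align(x, n)."""
    return value - (value % n)

def sub_from_pc(pc, imm):
    """Return the 32-bit PC-relative address Align(pc, 4) - imm."""
    return (align(pc, 4) - imm) & 0xFFFFFFFF
```

For example, with a PC value of 0x8006 and an immediate of 0, the result is 0x8004.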
A2 


| 31:28 | 27:24   | 23:20   | 19:16   | 15:12 | 11:0  |
| cond  | 0 0 1 0 | 0 1 0 0 | 1 1 1 1 | Rd    | imm12 |


A2 variant 

SUB{<c>}{<q>} <Rd>, PC, #<const> 
is equivalent to 

ADR{<c>}{<q>} <Rd>, <label> 


and is the preferred disassembly when imm12 == '000000000000'. 
T2 


| 15:11     | 10 | 9:4         | 3:0     | 15 | 14:12 | 11:8 | 7:0  |
| 1 1 1 1 0 | i  | 1 0 1 0 1 0 | 1 1 1 1 | 0  | imm3  | Rd   | imm8 |


T2 variant 

SUB{<c>}{<q>} <Rd>, PC, #<imm12> 
is equivalent to 

ADR{<c>}{<q>} <Rd>, <label> 


and is the preferred disassembly when i:imm3:imm8 == '000000000000'. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A2: is the general-purpose destination register, encoded in the "Rd" field. If the PC is 
used, the instruction is a branch to the address calculated by the operation. This is an interworking 
branch, see Pseudocode description of operations on the AArch32 general-purpose registers and the 
PC on page E1-2293. 


For encoding T2: is the general-purpose destination register, encoded in the "Rd" field. 


<label> For encoding A1 and A2: the label of an instruction or literal data item whose address is to be loaded 
into <Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of 
the ADR instruction to this label. If the offset is zero or positive, encoding A1 is used, with imm32 equal 
to the offset. If the offset is negative, encoding A2 is used, with imm32 equal to the size of the offset. 
That is, the use of encoding A2 indicates that the required offset is minus the value of imm32. 
Permitted values of the size of the offset are any of the constants described in Modified immediate 
constants in A32 instructions on page F2-2422. 

For encoding T1: the label of an instruction or literal data item whose address is to be loaded into 
<Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of the 
ADR instruction to this label. Permitted values of the size of the offset are multiples of 4 in the range 
0 to 1020. 

For encoding T2 and T3: the label of an instruction or literal data item whose address is to be loaded 
into <Rd>. The assembler calculates the required value of the offset from the Align(PC, 4) value of 
the ADR instruction to this label. If the offset is zero or positive, encoding T3 is used, with imm32 equal 
to the offset. If the offset is negative, encoding T2 is used, with imm32 equal to the size of the offset. 
That is, the use of encoding T2 indicates that the required offset is minus the value of imm32. 
Permitted values of the size of the offset are 0-4095. 

<imm12> Is a 12-bit unsigned immediate, in the range 0 to 4095, encoded in the "i:imm3:imm8" field. 

<const> An immediate value. See Modified immediate constants in A32 instructions on page F2-2422 for the 
range of values. 
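
The rule for choosing between the wide T32 encodings of ADR by the sign of the offset can be sketched as below. This is a minimal illustration, not an assembler implementation: the function name choose_adr_encoding_t32 is hypothetical, pc_value is assumed to be the PC value as read by the instruction, and the narrow T1 encoding is deliberately ignored.

```python
def choose_adr_encoding_t32(label_addr, pc_value):
    """Pick T3 (add) or T2 (subtract) for a label, per the <label> rules.

    Hypothetical helper: returns (encoding_name, imm32).
    """
    base = pc_value & ~3                 # Align(PC, 4)
    offset = label_addr - base
    if not -4095 <= offset <= 4095:
        raise ValueError("offset out of range for encodings T2 and T3")
    if offset >= 0:
        return ("T3", offset)            # imm32 equals the offset
    return ("T2", -offset)               # imm32 equals the size of the offset
```

An offset of exactly zero selects T3, which matches the statement that T2 with an all-zero immediate is disassembled as SUB rather than produced from a label.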


Operation for all encodings 


The description of ADR gives the operational pseudocode for this instruction. 


F5.1.237 SUB, SUBS (immediate) 


Subtract (immediate) subtracts an immediate value from a register value, and writes the result to the destination 
register. 


If the destination register is not the PC, the SUBS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. If the 
destination register is the PC: 


° The SUB variant of the instruction is an interworking branch, see Pseudocode description of operations on 
the AArch32 general-purpose registers and the PC on page E1-2293. 

° The SUBS variant of the instruction performs an exception return without the use of the stack. In this case: 
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>. 


— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from 
AArch32 state on page G1-3835. 


— The instruction is UNDEFINED in Hyp mode, except for encoding T5 with <imm8> set to zero, which is 
the encoding for the ERET instruction, see ERET. 


— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode. 


A1 
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 0 | 
| cond != 1111 | 0 0 1 0 | 0 1 0 S | Rn | Rd | imm12 | 


SUB variant 
Applies when S == 0 && Rn != 11x1. 


SUB{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


SUBS variant 
Applies when S == 1 && Rn != 1101. 


SUBS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Decode for all variants of this encoding 
if Rn == '1111' && S == '0' then SEE ADR; 
if Rn == '1101' then SEE SUB (SP minus immediate); 
d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = A32ExpandImm(imm12); 


T1 
| 15 14 13 12 11 10 9 | 8 6 | 5 3 | 2 0 | 
| 0 0 0 1 1 1 1 | imm3 | Rn | Rd | 


T1 variant 


SUB<c>{<q>} <Rd>, <Rn>, #<imm3> // Inside IT block 
SUBS{<q>} <Rd>, <Rn>, #<imm3> // Outside IT block 


Decode for this encoding 


d = UInt(Rd); n = UInt(Rn); setflags = !InITBlock(); imm32 = ZeroExtend(imm3, 32); 


T2 
| 15 14 13 12 11 | 10 8 | 7 0 | 
| 0 0 1 1 1 | Rdn | imm8 | 

T2 variant 


SUB<c>{<q>} <Rdn>, #<imm8> // Inside IT block, and <Rdn>, <imm8> can be represented in T1 
SUB<c>{<q>} {<Rdn>,} <Rdn>, #<imm8> // Inside IT block, and <Rdn>, <imm8> cannot be represented in T1 
SUBS{<q>} <Rdn>, #<imm8> // Outside IT block, and <Rdn>, <imm8> can be represented in T1 
SUBS{<q>} {<Rdn>,} <Rdn>, #<imm8> // Outside IT block, and <Rdn>, <imm8> cannot be represented in T1 


Decode for this encoding 


d = UInt(Rdn); n = UInt(Rdn); setflags = !InITBlock(); imm32 = ZeroExtend(imm8, 32); 
T3 


| 15 14 13 12 11 | 10 | 9 | 8 7 6 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 0 | 
| 1 1 1 1 0 | i | 0 | 1 1 0 1 | S | Rn != 1101 | 0 | imm3 | Rd | imm8 | 


SUB variant 
Applies when S == 0. 
SUB<c>.W {<Rd>,} <Rn>, #<const> // Inside IT block, and <Rd>, <Rn>, <const> can be represented in T1 or T2 
SUB{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


SUBS variant 
Applies when S == 1 && Rd != 1111. 


SUBS.W {<Rd>,} <Rn>, #<const> // Outside IT block, and <Rd>, <Rn>, <const> can be represented in T1 or T2 
SUBS{<c>}{<q>} {<Rd>,} <Rn>, #<const> 


Decode for all variants of this encoding 


if Rd == '1111' && S == '1' then SEE CMP (immediate); 

if Rn == '1101' then SEE SUB (SP minus immediate); 

d = UInt(Rd); n = UInt(Rn); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8); 

if (d == 15 && !setflags) || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


T4 


| 15 14 13 12 11 | 10 | 9 8 7 6 5 4 | 3 0 | 15 | 14 12 | 11 8 | 7 0 | 
| 1 1 1 1 0 | i | 1 0 1 0 1 0 | Rn != 11x1 | 0 | imm3 | Rd | imm8 | 

T4 variant 


SUB{<c>}{<q>} {<Rd>,} <Rn>, #<imm12> // <imm12> cannot be represented in T1, T2, or T3 
SUBW{<c>}{<q>} {<Rd>,} <Rn>, #<imm12> // <imm12> can be represented in T1, T2, or T3 


Decode for this encoding 


if Rn == '1111' then SEE ADR; 

if Rn == '1101' then SEE SUB (SP minus immediate); 

d = UInt(Rd); n = UInt(Rn); setflags = FALSE; imm32 = ZeroExtend(i:imm3:imm8, 32); 
if d == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 





T5 
| 15 14 13 12 11 10 9 8 7 6 5 4 | 3 0 | 15 14 13 12 | 11 10 9 8 | 7 0 | 
| 1 1 1 1 0 0 1 1 1 1 0 1 | Rn: (1)(1)(1)(0) | 1 0 (0) 0 | (1)(1)(1)(1) | imm8 != 00000000 | 

T5 variant 


SUBS{<c>}{<q>} PC, LR, #<imm8> 


Decode for this encoding 


if Rn == '1110' && IsZero(imm8) then SEE ERET; 

d = 15; n = UInt(Rn); setflags = TRUE; imm32 = ZeroExtend(imm8, 32); 
if n != 14 then UNPREDICTABLE; 

if InITBlock() && !LastInITBlock() then UNPREDICTABLE; 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly SUBS PC, LR and related instructions 
(A32) on page K1-5469 and SUBS PC, LR and related instructions (T32) on page K1-5468. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rdn> Is the general-purpose source and destination register, encoded in the "Rdn" field. 

<imm8> For encoding T2: is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field. 

For encoding T5: is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field. 
If <Rn> is the LR, and zero is used, see ERET. 


<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the same as <Rn>. If the PC is used: 


° For the SUB variant, the instruction is a branch to the address calculated by the operation. 
This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 


° For the SUBS variant, the instruction performs an exception return, that restores PSTATE 
from SPSR_<current_mode>. ARM deprecates use of this instruction unless <Rn> is the LR. 


For encoding T1, T3 and T4: is the general-purpose destination register, encoded in the "Rd" field. 
If omitted, this register is the same as <Rn>. 


<Rn> For encoding A1 and T4: is the general-purpose source register, encoded in the "Rn" field. If the SP 
is used, see SUB, SUBS (SP minus immediate). If the PC is used, see ADR. 
For encoding T1: is the general-purpose source register, encoded in the "Rn" field. 


For encoding T3: is the general-purpose source register, encoded in the "Rn" field. If the SP is used, 
see SUB, SUBS (SP minus immediate). 
In the T32 instruction set, MOVS{<c>}{<q>} PC, LR is a pseudo-instruction for SUBS{<c>}{<q>} PC, LR, #0. 

<imm3> Is a 3-bit unsigned immediate, in the range 0 to 7, encoded in the "imm3" field. 

<imm12> Is a 12-bit unsigned immediate, in the range 0 to 4095, encoded in the "i:imm3:imm8" field. 

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 

For encoding T3: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 
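
As an illustration of the A32 modified immediate constants used by encoding A1, the following Python sketch models the A32ExpandImm() expansion as an 8-bit value rotated right by twice the 4-bit rotation field. The helper names ror32 and a32_expand_imm are chosen for this example only, and the carry-out variant of the expansion is not modelled.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def a32_expand_imm(imm12):
    """Expand a 12-bit A32 modified immediate field to its 32-bit value."""
    unrotated = imm12 & 0xFF             # imm12<7:0>
    rotation = 2 * ((imm12 >> 8) & 0xF)  # 2 * UInt(imm12<11:8>)
    return ror32(unrotated, rotation)
```

Under this model, a field value of 0x4FF expands to 0xFF000000, because 0xFF is rotated right by 8 bits.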


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    (result, nzcv) = AddWithCarry(R[n], NOT(imm32), '1'); 
    if d == 15 then 
        if setflags then 
            ALUExceptionReturn(result); 
        else 
            ALUWritePC(result); 
    else 
        R[d] = result; 
        if setflags then 
            PSTATE.<N,Z,C,V> = nzcv; 
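
The subtraction in the pseudocode above is expressed as an addition of the inverted operand with a carry-in of 1. The Python sketch below models that arithmetic and the resulting N, Z, C and V flags for 32-bit values. The function names add_with_carry and sub_imm are illustrative assumptions, and the PC-destination cases are not modelled.

```python
MASK32 = 0xFFFFFFFF

def _sint32(value):
    """Interpret a 32-bit pattern as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def add_with_carry(x, y, carry_in):
    """Model of AddWithCarry() for 32-bit operands: returns (result, (N, Z, C, V))."""
    x &= MASK32
    y &= MASK32
    unsigned_sum = x + y + carry_in
    signed_sum = _sint32(x) + _sint32(y) + carry_in
    result = unsigned_sum & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(result != unsigned_sum)           # unsigned overflow, that is, carry out
    v = int(_sint32(result) != signed_sum)    # signed overflow
    return result, (n, z, c, v)

def sub_imm(rn, imm32):
    """R[n] - imm32 computed as R[n] + NOT(imm32) + 1."""
    return add_with_carry(rn, ~imm32 & MASK32, 1)
```

Note that for a subtraction the C flag is set when no borrow occurs, so sub_imm(5, 3) sets C while sub_imm(3, 5) clears it.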


F5.1.238 SUB, SUBS (register) 


Subtract (register) subtracts an optionally-shifted register value from a register value, and writes the result to the 
destination register. 

If the destination register is not the PC, the SUBS variant of the instruction updates the condition flags based on the 
result. 

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. However, 
when the destination register is the PC: 

° The SUB variant of the instruction is an interworking branch, see Pseudocode description of operations on 
the AArch32 general-purpose registers and the PC on page E1-2293. 

° The SUBS variant of the instruction performs an exception return without the use of the stack. ARM 
deprecates use of this instruction. However, in this case: 
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>. 
— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from 
AArch32 state on page G1-3835. 
— The instruction is UNDEFINED in Hyp mode. 
— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode. 


A1 

| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 7 | 6 5 | 4 | 3 0 | 
| cond != 1111 | 0 0 0 0 | 0 1 0 S | Rn != 1101 | Rd | imm5 | type | 0 | Rm | 

SUB, rotate right with extend variant 
Applies when S == 0 && imm5 == 00000 && type == 11. 

SUB{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 

SUB, shift or rotate by value variant 
Applies when S == 0 && !(imm5 == 00000 && type == 11). 

SUB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 

SUBS, rotate right with extend variant 
Applies when S == 1 && imm5 == 00000 && type == 11. 

SUBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 

SUBS, shift or rotate by value variant 
Applies when S == 1 && !(imm5 == 00000 && type == 11). 

SUBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 

Decode for all variants of this encoding 

if Rn == '1101' then SEE SUB (SP minus register); 
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1'); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1 


| 15 14 13 12 11 10 9 | 8 6 | 5 3 | 2 0 | 
| 0 0 0 1 1 0 1 | Rm | Rn | Rd | 


T1 variant 


SUB<c>{<q>} <Rd>, <Rn>, <Rm> // Inside IT block 
SUBS{<q>} {<Rd>,} <Rn>, <Rm> // Outside IT block 


Decode for this encoding 


d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = !InITBlock(); 
(shift_t, shift_n) = (SRType_LSL, 0); 

T2 


| 15 14 13 12 11 10 9 | 8 7 6 5 | 4 | 3 0 | 15 | 14 12 | 11 8 | 7 6 | 5 4 | 3 0 | 
| 1 1 1 0 1 0 1 | 1 1 0 1 | S | Rn != 1101 | (0) | imm3 | Rd | imm2 | type | Rm | 

SUB, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11. 


SUB{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


SUB, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11). 


SUB<c>.W {<Rd>,} <Rn>, <Rm> // Inside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
SUB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


SUBS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && Rd != 1111 && imm2 == 00 && type == 11. 


SUBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, RRX 


SUBS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11) && Rd != 1111. 


SUBS.W {<Rd>,} <Rn>, <Rm> // Outside IT block, and <Rd>, <Rn>, <Rm> can be represented in T1 
SUBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


if Rd == '1111' && S == '1' then SEE CMP (register); 

if Rn == '1101' then SEE SUB (SP minus register); 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1'); 
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2); 

if (d == 15 && !setflags) || n == 15 || m == 15 then UNPREDICTABLE; 
// ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the same as <Rn>. ARM deprecates using the PC as the destination register, but if the 
PC is used: 
° For the SUB variant, the instruction is a branch to the address calculated by the operation. 
This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 
° For the SUBS variant, the instruction performs an exception return, that restores PSTATE 
from SPSR_<current_mode>. 
For encoding T1 and T2: is the general-purpose destination register, encoded in the "Rd" field. If 
omitted, this register is the same as <Rn>. 
<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can 
be used, but this is deprecated. If the SP is used, see SUB, SUBS (SP minus register). 
For encoding T1: is the first general-purpose source register, encoded in the "Rn" field. 
For encoding T2: is the first general-purpose source register, encoded in the "Rn" field. If the SP is 
used, see SUB, SUBS (SP minus register). 
<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC 
can be used, but this is deprecated. 
For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field. 
<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32. 


For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32. 
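
The relationship between the "type" and "imm5" fields and the shift that is applied can be sketched as below. This is an illustrative Python model of the DecodeImmShift() behavior referenced by the decode pseudocode, with shift types represented as strings for readability. The function name decode_imm_shift is an assumption of this example.

```python
def decode_imm_shift(shift_type, imm5):
    """Return (shift_name, shift_amount) for a 2-bit type and a 5-bit immediate."""
    if shift_type == 0b00:
        return ("LSL", imm5)
    if shift_type == 0b01:
        # An encoded amount of 0 means a shift by 32 (<amount> modulo 32).
        return ("LSR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if shift_type == 0b11:
        # type == 11 with a zero immediate selects rotate right with extend.
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("shift_type must be a 2-bit value")
```

This also shows why the RRX variants are distinguished by imm5 == 00000 && type == 11 in the variant conditions. For encoding T2 the same mapping applies to the concatenated imm3:imm2 value.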


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C); 
    (result, nzcv) = AddWithCarry(R[n], NOT(shifted), '1'); 
    if d == 15 then // Can only occur for A32 encoding 
        if setflags then 
            ALUExceptionReturn(result); 
        else 
            ALUWritePC(result); 
    else 
        R[d] = result; 
        if setflags then 
            PSTATE.<N,Z,C,V> = nzcv; 
F5.1.239 SUB, SUBS (register-shifted register) 


Subtract (register-shifted register) subtracts a register-shifted register value from a register value, and writes the 
result to the destination register. It can optionally update the condition flags based on the result. 


A1 
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 16 | 15 12 | 11 8 | 7 | 6 5 | 4 | 3 0 | 
| cond != 1111 | 0 0 0 0 | 0 1 0 S | Rn | Rd | Rs | 0 | type | 1 | Rm | 


Flag setting variant 
Applies when S == 1. 


SUBS{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Not flag setting variant 
Applies when S == 0. 


SUB{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs> 


Decode for all variants of this encoding 

d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs); 
setflags = (S == '1'); shift_t = DecodeRegShift(type); 

if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE; 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 

<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation 

if ConditionPassed() then 
    EncodingSpecificOperations(); 
    shift_n = UInt(R[s]<7:0>); 
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C); 
    (result, nzcv) = AddWithCarry(R[n], NOT(shifted), '1'); 
    R[d] = result; 
    if setflags then 
        PSTATE.<N,Z,C,V> = nzcv; 
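
Because the shift amount is taken from the bottom 8 bits of <Rs>, amounts of 32 or more are possible. The Python sketch below illustrates how such amounts behave for each shift type under the usual definitions of the shift operations. The names shift_by_register and sub_rsr are hypothetical, flag setting is not modelled, and shift types are passed as strings.

```python
MASK32 = 0xFFFFFFFF

def shift_by_register(value, shift_type, rs):
    """Shift a 32-bit value by the amount held in bits <7:0> of rs."""
    amount = rs & 0xFF                   # UInt(R[s]<7:0>)
    value &= MASK32
    if amount == 0:
        return value
    if shift_type == "LSL":
        return (value << amount) & MASK32        # 0 once amount >= 32
    if shift_type == "LSR":
        return value >> amount                   # 0 once amount >= 32
    if shift_type == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32       # all sign bits once amount >= 32
    if shift_type == "ROR":
        n = amount % 32
        return ((value >> n) | (value << (32 - n))) & MASK32
    raise ValueError("unknown shift type")

def sub_rsr(rn, rm, shift_type, rs):
    """R[n] minus the register-shifted R[m], modulo 2^32."""
    return (rn - shift_by_register(rm, shift_type, rs)) & MASK32
```

Only bits <7:0> of the shift register matter, so a value of 0x104 in <Rs> gives the same result as 4.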
F5.1.240 SUB, SUBS (SP minus immediate) 


Subtract from SP (immediate) subtracts an immediate value from the SP value, and writes the result to the 
destination register. 


If the destination register is not the PC, the SUBS variant of the instruction updates the condition flags based on the 
result. 


The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. If the 
destination register is the PC: 


° The SUB variant of the instruction is an interworking branch, see Pseudocode description of operations on 
the AArch32 general-purpose registers and the PC on page E1-2293. 


° The SUBS variant of the instruction performs an exception return without the use of the stack. ARM 
deprecates use of this instruction. However, in this case: 
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>. 

— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from 
AArch32 state on page G1-3835. 


— The instruction is UNDEFINED in Hyp mode. 


— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode. 


A1 
| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 18 17 16 | 15 12 | 11 0 | 
| cond != 1111 | 0 0 1 0 | 0 1 0 S | 1 1 0 1 | Rd | imm12 | 


SUB variant 
Applies when S == 0. 


SUB{<c>}{<q>} {<Rd>,} SP, #<const> 


SUBS variant 
Applies when S == 1. 


SUBS{<c>}{<q>} {<Rd>,} SP, #<const> 
Decode for all variants of this encoding 
d = UInt(Rd); setflags = (S == '1'); imm32 = A32ExpandImm(imm12); 


T1 


| 15 14 13 12 11 10 9 8 7 | 6 0 | 
| 1 0 1 1 0 0 0 0 1 | imm7 | 


T1 variant 


SUB{<c>}{<q>} {SP,} SP, #<imm7> 


Decode for this encoding 


d = 13; setflags = FALSE; imm32 = ZeroExtend(imm7:'00', 32); 
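
As a worked illustration of the imm7:'00' scaling in encoding T1, the following Python sketch encodes and decodes the 16-bit instruction, assuming the fixed bits 1 0 1 1 0 0 0 0 1 in bits<15:7> with imm7 in bits<6:0>. The function names are hypothetical and the sketch covers only this one encoding.

```python
def encode_sub_sp_t1(imm):
    """Encode SUB SP, SP, #imm (T1). imm must be a multiple of 4 in 0..508."""
    if imm % 4 != 0 or not 0 <= imm <= 508:
        raise ValueError("immediate must be a multiple of 4 in the range 0 to 508")
    return 0xB080 | (imm >> 2)           # imm7 = imm / 4

def decode_sub_sp_t1(halfword):
    """Recover imm32 = ZeroExtend(imm7:'00', 32) from a T1 halfword."""
    if halfword & 0xFF80 != 0xB080:
        raise ValueError("not a SUB (SP minus immediate) T1 encoding")
    return (halfword & 0x7F) << 2
```

For example, SUB SP, SP, #16 encodes imm7 as 4, giving the halfword 0xB084.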
T2 

| 15 14 13 12 11 | 10 | 9 | 8 7 6 5 | 4 | 3 2 1 0 | 15 | 14 12 | 11 8 | 7 0 | 
| 1 1 1 1 0 | i | 0 | 1 1 0 1 | S | 1 1 0 1 | 0 | imm3 | Rd | imm8 | 


SUB variant 
Applies when S == 0. 


SUB{<c>}.W {<Rd>,} SP, #<const> // <Rd>, <const> can be represented in T1 
SUB{<c>}{<q>} {<Rd>,} SP, #<const> 


SUBS variant 
Applies when S == 1 && Rd != 1111. 


SUBS{<c>}{<q>} {<Rd>,} SP, #<const> 


Decode for all variants of this encoding 
if Rd == '1111' && S == '1' then SEE CMP (immediate); 


d = UInt(Rd); setflags = (S == '1'); imm32 = T32ExpandImm(i:imm3:imm8); 
if d == 15 && !setflags then UNPREDICTABLE; 


T3 


| 15 14 13 12 11 | 10 | 9 8 7 6 5 4 | 3 2 1 0 | 15 | 14 12 | 11 8 | 7 0 | 
| 1 1 1 1 0 | i | 1 0 1 0 1 0 | 1 1 0 1 | 0 | imm3 | Rd | imm8 | 


T3 variant 


SUB{<c>}{<q>} {<Rd>,} SP, #<imm12> // <imm12> cannot be represented in T1, T2, or T3 
SUBW{<c>}{<q>} {<Rd>,} SP, #<imm12> // <imm12> can be represented in T1, T2, or T3 


Decode for this encoding 

d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(i:imm3:imm8, 32); 
if d == 15 then UNPREDICTABLE; 

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<imm7> Is the unsigned immediate, a multiple of 4, in the range 0 to 508, encoded in the "imm7" field as 
<imm7>/4. 

<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the SP. If the PC is used: 

° For the SUB variant, the instruction is a branch to the address calculated by the operation. 
This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 

° For the SUBS variant, the instruction performs an exception return, that restores PSTATE 
from SPSR_<current_mode>. ARM deprecates use of this instruction unless <Rn> is the LR. 


For encoding T2 and T3: is the general-purpose destination register, encoded in the "Rd" field. If 
omitted, this register is the SP. 


<imm12> Is a 12-bit unsigned immediate, in the range 0 to 4095, encoded in the "i:imm3:imm8" field. 


<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 


For encoding T2: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 
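
The T32 modified immediate constants used by encoding T2 differ from the A32 scheme. The Python sketch below models the T32ExpandImm() expansion of the 12-bit i:imm3:imm8 value, ignoring the carry-out variant. The function name t32_expand_imm is an assumption of this example, and the UNPREDICTABLE case is reported by raising an exception purely for illustration.

```python
def t32_expand_imm(imm12):
    """Expand a 12-bit T32 modified immediate (i:imm3:imm8) to 32 bits."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) == 0:               # imm12<11:10> == '00': replicated byte patterns
        mode = (imm12 >> 8) & 0x3
        if mode == 0:
            return imm8                              # 0x000000XY
        if imm8 == 0:
            raise ValueError("UNPREDICTABLE: replicated pattern with zero byte")
        if mode == 1:
            return (imm8 << 16) | imm8               # 0x00XY00XY
        if mode == 2:
            return (imm8 << 24) | (imm8 << 8)        # 0xXY00XY00
        return imm8 * 0x01010101                     # 0xXYXYXYXY
    # Otherwise: an 8-bit value with its top bit forced to 1, rotated right.
    unrotated = 0x80 | (imm12 & 0x7F)
    amount = imm12 >> 7                  # UInt(imm12<11:7>), in the range 8 to 31
    return ((unrotated >> amount) | (unrotated << (32 - amount))) & 0xFFFFFFFF
```

For example, the field value 0x3AB expands to 0xABABABAB, and 0x400 expands to 0x80000000.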


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    (result, nzcv) = AddWithCarry(SP, NOT(imm32), '1'); 
    if d == 15 then // Can only occur for A32 encoding 
        if setflags then 
            ALUExceptionReturn(result); 
        else 
            ALUWritePC(result); 
    else 
        R[d] = result; 
        if setflags then 
            PSTATE.<N,Z,C,V> = nzcv; 
F5.1.241 SUB, SUBS (SP minus register) 


Subtract from SP (register) subtracts an optionally-shifted register value from the SP value, and writes the result to 
the destination register. 

If the destination register is not the PC, the SUBS variant of the instruction updates the condition flags based on the 
result. 

The field descriptions for <Rd> identify the encodings where the PC is permitted as the destination register. If the 
destination register is the PC: 

° The SUB variant of the instruction is an interworking branch, see Pseudocode description of operations on 
the AArch32 general-purpose registers and the PC on page E1-2293. 

° The SUBS variant of the instruction performs an exception return without the use of the stack. ARM 
deprecates use of this instruction. However, in this case: 
— The PE branches to the address written to the PC, and restores PSTATE from SPSR_<current_mode>. 
— The PE checks SPSR_<current_mode> for an illegal return event. See Illegal return events from 
AArch32 state on page G1-3835. 
— The instruction is UNDEFINED in Hyp mode. 
— The instruction is CONSTRAINED UNPREDICTABLE in User mode and System mode. 


A1 


| 31 28 | 27 26 25 24 | 23 22 21 20 | 19 18 17 16 | 15 12 | 11 7 | 6 5 | 4 | 3 0 | 
| cond != 1111 | 0 0 0 0 | 0 1 0 S | 1 1 0 1 | Rd | imm5 | type | 0 | Rm | 


SUB, rotate right with extend variant 
Applies when S == 0 && imm5 == 00000 && type == 11. 

SUB{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX 

SUB, shift or rotate by value variant 
Applies when S == 0 && !(imm5 == 00000 && type == 11). 

SUB{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>} 

SUBS, rotate right with extend variant 
Applies when S == 1 && imm5 == 00000 && type == 11. 

SUBS{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX 

SUBS, shift or rotate by value variant 
Applies when S == 1 && !(imm5 == 00000 && type == 11). 

SUBS{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


d = UInt(Rd); m = UInt(Rm); setflags = (S == '1'); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 
T1 

| 15 14 13 12 11 10 9 | 8 7 6 5 | 4 | 3 2 1 0 | 15 | 14 12 | 11 8 | 7 6 | 5 4 | 3 0 | 
| 1 1 1 0 1 0 1 | 1 1 0 1 | S | 1 1 0 1 | (0) | imm3 | Rd | imm2 | type | Rm | 


SUB, rotate right with extend variant 
Applies when S == 0 && imm3 == 000 && imm2 == 00 && type == 11. 


SUB{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX 


SUB, shift or rotate by value variant 
Applies when S == 0 && !(imm3 == 000 && imm2 == 00 && type == 11). 


SUB{<c>}.W {<Rd>,} SP, <Rm> // <Rd>, <Rm> can be represented in T1 or T2 
SUB{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>} 


SUBS, rotate right with extend variant 
Applies when S == 1 && imm3 == 000 && Rd != 1111 && imm2 == 00 && type == 11. 


SUBS{<c>}{<q>} {<Rd>,} SP, <Rm>, RRX 


SUBS, shift or rotate by value variant 
Applies when S == 1 && !(imm3 == 000 && imm2 == 00 && type == 11) && Rd != 1111. 


SUBS{<c>}{<q>} {<Rd>,} SP, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 

if Rd == '1111' && S == '1' then SEE CMP (register); 

d = UInt(Rd); m = UInt(Rm); setflags = (S == '1'); 

(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2); 

if (d == 15 && !setflags) || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> For encoding A1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the SP. ARM deprecates using the PC as the destination register, but if the PC is used: 

° For the SUB variant, the instruction is a branch to the address calculated by the operation. 
This is an interworking branch, see Pseudocode description of operations on the AArch32 
general-purpose registers and the PC on page E1-2293. 

° For the SUBS variant, the instruction performs an exception return, that restores PSTATE 
from SPSR_<current_mode>. 


For encoding T1: is the general-purpose destination register, encoded in the "Rd" field. If omitted, 
this register is the SP. 
<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC 
can be used, but this is deprecated. 


For encoding T1: is the second general-purpose source register, encoded in the "Rm" field. 


<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 


(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32. 


For encoding T1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32 
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    shifted = Shift(R[m], shift_t, shift_n, PSTATE.C); 
    (result, nzcv) = AddWithCarry(SP, NOT(shifted), '1'); 
    if d == 15 then // Can only occur for A32 encoding 
        if setflags then 
            ALUExceptionReturn(result); 
        else 
            ALUWritePC(result); 
    else 
        R[d] = result; 
        if setflags then 
            PSTATE.<N,Z,C,V> = nzcv; 
Supervisor Call causes a Supervisor Call exception. For more information, see Supervisor Call (SVC) exception on 
page G1-3853. 


Note 

SVC was previously called SWI, Software Interrupt, and this name is still found in some documentation. 

Software can use this instruction as a call to an operating system to provide a service. 

In the following cases, the Supervisor Call exception generated by the SVC instruction is taken to Hyp mode: 
° If the SVC is executed in Hyp mode. 


° If HCR.TGE is set to 1, and the SVC is executed in Non-secure User mode. For more information, see 
Supervisor Call exception, when the value of HCR.TGE is 1 on page G1-3829. 


In these cases, the HSR identifies that the exception entry was caused by a Supervisor Call exception, EC value 0x11, 
see Use of the HSR on page G4-4137. The immediate field in the HSR: 


° If the SVC is unconditional: 
— For the T32 instruction, is the zero-extended value of the imm8 field. 
— For the A32 instruction, is the least-significant 16 bits of the imm24 field. 

° If the SVC is conditional, is UNKNOWN. 

A1 


| 31 28 | 27 26 25 24 | 23 0 | 
| cond != 1111 | 1 1 1 1 | imm24 | 

A1 variant 


SVC{<c>}{<q>} {#}<imm> 
Decode for this encoding 
imm32 = ZeroExtend(imm24, 32); 


T1 


| 15 14 13 12 11 10 9 8 | 7 0 | 
| 1 1 0 1 1 1 1 1 | imm8 | 


T1 variant 


SVC{<c>}{<q>} {#}<imm> 


Decode for this encoding 


imm32 = ZeroExtend(imm8, 32); 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406.


<imm> For encoding A1: is a 24-bit unsigned immediate, in the range 0 to 16777215, encoded in the 
"imm24" field. This value is for assembly and disassembly only. SVC handlers in some systems 
interpret imm24 in software, for example to determine the required service. 


For encoding T1: is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field.
This value is for assembly and disassembly only. SVC handlers in some systems interpret imm8 in 
software, for example to determine the required service. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    AArch32.CallSupervisor(imm32<15:0>);
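
Because the immediate is interpreted only by software, an SVC handler typically reads it back from the instruction encoding. The following is a minimal sketch of that field extraction, assuming the handler has already fetched the instruction word or halfword. The function names are hypothetical and are not part of the architecture.

```python
def svc_imm_a32(instr: int) -> int:
    """Return imm24 from an A32 SVC encoding (cond != 1111, bits[27:24] == 1111)."""
    if (instr >> 28) & 0xF == 0xF or (instr >> 24) & 0xF != 0xF:
        raise ValueError("not an A32 SVC encoding")
    return instr & 0x00FFFFFF


def svc_imm_t32(instr: int) -> int:
    """Return imm8 from a 16-bit T32 SVC encoding (bits[15:8] == 11011111)."""
    if (instr >> 8) & 0xFF != 0xDF:
        raise ValueError("not a T32 SVC encoding")
    return instr & 0xFF


def hsr_imm16_a32(instr: int) -> int:
    """Least-significant 16 bits of imm24, as reported for an unconditional A32 SVC."""
    return svc_imm_a32(instr) & 0xFFFF
```

For example, the A32 encoding 0xEF123456 carries imm24 = 0x123456, of which only 0x3456 fits in the 16-bit HSR immediate field.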
F5.1.243 SXTAB


Signed Extend and Add Byte extracts an 8-bit value from a register, sign-extends it to 32 bits, adds the result to the 
value in another register, and writes the final result to the destination register. The instruction can specify a rotation 
by 0, 8, 16, or 24 bits before extracting the 8-bit value. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15    12|11  10| 9   8| 7  6  5  4|3      0|
| !=1111 | 0  1  1  0| 1  0  1  0| !=1111 |   Rd   |rotate|(0)(0)| 0  1  1  1|   Rm   |
   cond                              Rn

A1 variant

SXTAB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}

Decode for this encoding

if Rn == '1111' then SEE SXTB;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10  9  8| 7  6  5  4|3      0|15 14 13 12|11     8| 7|  6|5    4|3      0|
| 1  1  1  1| 1  0  1  0| 0  1  0  0| !=1111 | 1  1  1  1|   Rd   | 1|(0)|rotate|   Rm   |
                                        Rn

T1 variant

SXTAB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}

Decode for this encoding

if Rn == '1111' then SEE SXTB;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 


(omitted) when rotate = 00 


8 when rotate = 01 
16 when rotate = 10


24 when rotate = 11 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = R[n] + SignExtend(rotated<7:0>, 32);
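
The operation can be modelled on plain integers. The sketch below is illustrative only: `ror32`, `sign_extend`, and `sxtab` are hypothetical helper names, registers are passed as unsigned 32-bit values, and `rotate` is the 2-bit field value, so that the rotation is `rotate * 8` as in the decode pseudocode.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def sxtab(rn: int, rm: int, rotate: int = 0) -> int:
    """Model of SXTAB: Rn plus the sign-extended byte selected from Rm."""
    rotated = ror32(rm, rotate * 8)
    return (rn + sign_extend(rotated & 0xFF, 8)) & MASK32
```

With `rotate = 1` the byte taken is bits[15:8] of Rm, so `sxtab(0x10, 0x00008000, 1)` adds -128 to 0x10.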
F5.1.244 SXTAB16


Signed Extend and Add Byte 16 extracts two 8-bit values from a register, sign-extends them to 16 bits each, adds 
the results to two 16-bit values from another register, and writes the final results to the destination register. The 
instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 8-bit values. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15    12|11  10| 9   8| 7  6  5  4|3      0|
| !=1111 | 0  1  1  0| 1  0  0  0| !=1111 |   Rd   |rotate|(0)(0)| 0  1  1  1|   Rm   |
   cond                              Rn

A1 variant

SXTAB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}

Decode for this encoding

if Rn == '1111' then SEE SXTB16;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10  9  8| 7  6  5  4|3      0|15 14 13 12|11     8| 7|  6|5    4|3      0|
| 1  1  1  1| 1  0  1  0| 0  0  1  0| !=1111 | 1  1  1  1|   Rd   | 1|(0)|rotate|   Rm   |
                                        Rn

T1 variant

SXTAB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}

Decode for this encoding

if Rn == '1111' then SEE SXTB16;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 


(omitted) when rotate = 00 


8 when rotate = 01 
F5 T32 and A32 Base Instruction Set Instruction Descriptions
F5.1 Alphabetical list of T32 and A32 base instruction set instructions 


16 when rotate = 10 


24 when rotate = 11 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d]<15:0> = R[n]<15:0> + SignExtend(rotated<7:0>, 16);
    R[d]<31:16> = R[n]<31:16> + SignExtend(rotated<23:16>, 16);
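
The two halfword lanes are independent: a carry out of the low lane does not propagate into the high lane. A minimal sketch of this behavior follows, with hypothetical helper names and registers modelled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def sxtab16(rn: int, rm: int, rotate: int = 0) -> int:
    """Model of SXTAB16: each halfword of Rn plus a sign-extended byte of Rm."""
    rotated = ror32(rm, rotate * 8)
    low = ((rn & 0xFFFF) + sign_extend(rotated & 0xFF, 8)) & 0xFFFF
    high = ((rn >> 16) + sign_extend((rotated >> 16) & 0xFF, 8)) & 0xFFFF
    return (high << 16) | low
```

In `sxtab16(0x0000FFFF, 0x00000001)` the low lane wraps from 0xFFFF to 0x0000 and the high lane stays at 0x0000.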


F5.1.245 SXTAH 


Signed Extend and Add Halfword extracts a 16-bit value from a register, sign-extends it to 32 bits, adds the result 
to a value from another register, and writes the final result to the destination register. The instruction can specify a 
rotation by 0, 8, 16, or 24 bits before extracting the 16-bit value. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15    12|11  10| 9   8| 7  6  5  4|3      0|
| !=1111 | 0  1  1  0| 1  0  1  1| !=1111 |   Rd   |rotate|(0)(0)| 0  1  1  1|   Rm   |
   cond                              Rn

A1 variant

SXTAH{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}

Decode for this encoding

if Rn == '1111' then SEE SXTH;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10  9  8| 7  6  5  4|3      0|15 14 13 12|11     8| 7|  6|5    4|3      0|
| 1  1  1  1| 1  0  1  0| 0  0  0  0| !=1111 | 1  1  1  1|   Rd   | 1|(0)|rotate|   Rm   |
                                        Rn

T1 variant

SXTAH{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}

Decode for this encoding

if Rn == '1111' then SEE SXTH;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 

<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 


(omitted) when rotate = 00 


8 when rotate = 01 


16 when rotate = 10 


24 when rotate = 11 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = R[n] + SignExtend(rotated<15:0>, 32);





F5.1.246 SXTB 
Signed Extend Byte extracts an 8-bit value from a register, sign-extends it to 32 bits, and writes the result to the 
destination register. The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 8-bit value. 
A1

|31    28|27 26 25 24|23 22 21 20|19 18 17 16|15    12|11  10| 9   8| 7  6  5  4|3      0|
| !=1111 | 0  1  1  0| 1  0  1  0| 1  1  1  1|   Rd   |rotate|(0)(0)| 0  1  1  1|   Rm   |
   cond

A1 variant
SXTB{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}

Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10  9  8| 7  6|5    3|2    0|
| 1  0  1  1| 0  0  1  0| 0  1|  Rm  |  Rd  |

T1 variant
SXTB{<c>}{<q>} {<Rd>,} <Rm>

Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = 0;

T2

|15 14 13 12|11 10  9  8| 7  6  5  4| 3  2  1  0|15 14 13 12|11     8| 7|  6|5    4|3      0|
| 1  1  1  1| 1  0  1  0| 0  1  0  0| 1  1  1  1| 1  1  1  1|   Rd   | 1|(0)|rotate|   Rm   |

T2 variant
SXTB{<c>}.W {<Rd>,} <Rm> // <Rd>, <Rm> can be represented in T1
SXTB{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}

Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> Is the general-purpose source register, encoded in the "Rm" field. 

<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 


(omitted) when rotate = 00 


8 when rotate = 01 
16 when rotate = 10 
24 when rotate = 11 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = SignExtend(rotated<7:0>, 32);


F5.1.247 SXTB16 


Signed Extend Byte 16 extracts two 8-bit values from a register, sign-extends them to 16 bits each, and writes the 
results to the destination register. The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 
8-bit values. 


A1

|31    28|27 26 25 24|23 22 21 20|19 18 17 16|15    12|11  10| 9   8| 7  6  5  4|3      0|
| !=1111 | 0  1  1  0| 1  0  0  0| 1  1  1  1|   Rd   |rotate|(0)(0)| 0  1  1  1|   Rm   |
   cond

A1 variant

SXTB16{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}

Decode for this encoding

d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10  9  8| 7  6  5  4| 3  2  1  0|15 14 13 12|11     8| 7|  6|5    4|3      0|
| 1  1  1  1| 1  0  1  0| 0  0  1  0| 1  1  1  1| 1  1  1  1|   Rd   | 1|(0)|rotate|   Rm   |

T1 variant

SXTB16{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}

Decode for this encoding

d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rm> Is the general-purpose source register, encoded in the "Rm" field. 
<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 
(omitted) when rotate = 00 
8 when rotate = 01
16 when rotate = 10 
24 when rotate = 11 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d]<15:0> = SignExtend(rotated<7:0>, 16);
    R[d]<31:16> = SignExtend(rotated<23:16>, 16);


F5.1.248 SXTH 


Signed Extend Halfword extracts a 16-bit value from a register, sign-extends it to 32 bits, and writes the result to 
the destination register. The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 16-bit 
value. 


A1

|31    28|27 26 25 24|23 22 21 20|19 18 17 16|15    12|11  10| 9   8| 7  6  5  4|3      0|
| !=1111 | 0  1  1  0| 1  0  1  1| 1  1  1  1|   Rd   |rotate|(0)(0)| 0  1  1  1|   Rm   |
   cond

A1 variant

SXTH{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}

Decode for this encoding

d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;

T1

|15 14 13 12|11 10  9  8| 7  6|5    3|2    0|
| 1  0  1  1| 0  0  1  0| 0  0|  Rm  |  Rd  |

T1 variant

SXTH{<c>}{<q>} {<Rd>,} <Rm>

Decode for this encoding

d = UInt(Rd); m = UInt(Rm); rotation = 0;

T2

|15 14 13 12|11 10  9  8| 7  6  5  4| 3  2  1  0|15 14 13 12|11     8| 7|  6|5    4|3      0|
| 1  1  1  1| 1  0  1  0| 0  0  0  0| 1  1  1  1| 1  1  1  1|   Rd   | 1|(0)|rotate|   Rm   |

T2 variant

SXTH{<c>}.W {<Rd>,} <Rm> // <Rd>, <Rm> can be represented in T1
SXTH{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}

Decode for this encoding

d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> Is the general-purpose source register, encoded in the "Rm" field. 

<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 


(omitted) when rotate = 00 


8 when rotate = 01 
16 when rotate = 10 
24 when rotate = 11 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = SignExtend(rotated<15:0>, 32);
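
One consequence of rotating before extracting is that a rotation by 24 builds the halfword from bits[7:0] and bits[31:24] of Rm, wrapping around the word boundary. The sketch below illustrates this. The function names are hypothetical and registers are modelled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def sxth(rm: int, rotate: int = 0) -> int:
    """Model of SXTH: sign-extend the low halfword of the rotated Rm value."""
    halfword = ror32(rm, rotate * 8) & 0xFFFF
    if halfword & 0x8000:
        halfword -= 0x10000  # negative halfword: extend the sign bit
    return halfword & MASK32
```

For Rm = 0x12348001, no rotation gives 0xFFFF8001, a rotation by 16 gives 0x00001234, and a rotation by 24 gives 0x00000112.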





F5.1.249 TBB, TBH 
Table Branch Byte or Halfword causes a PC-relative forward branch using a table of single byte or halfword offsets. 
A base register provides a pointer to the table, and a second register supplies an index into the table. The branch 
length is twice the value returned from the table. 
T1

|15 14 13 12|11 10  9  8| 7  6  5  4|3      0|15 14 13 12 |11 10  9  8 | 7  6  5| 4|3      0|
| 1  1  1  0| 1  0  0  0| 1  1  0  1|   Rn   |(1)(1)(1)(1)|(0)(0)(0)(0)| 0  0  0| H|   Rm   |

Byte variant
Applies when H == 0.
TBB{<c>}{<q>} [<Rn>, <Rm>] // Outside or last in IT block

Halfword variant
Applies when H == 1.
TBH{<c>}{<q>} [<Rn>, <Rm>, LSL #1] // Outside or last in IT block

Decode for all variants of this encoding
n = UInt(Rn); m = UInt(Rm); is_tbh = (H == '1');
if m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register holding the address of the table of branch lengths, encoded in 
the "Rn" field. The PC can be used. If it is, the table immediately follows this instruction. 
<Rm> For the byte variant: is the general-purpose index register, encoded in the "Rm" field. This register 
contains an integer pointing to a single byte in the table. The offset in the table is the value of the 
index. 
For the halfword variant: is the general-purpose index register, encoded in the "Rm" field. This 
register contains an integer pointing to a halfword in the table. The offset in the table is twice the 
value of the index. 
Operation 
if ConditionPassed() then
    EncodingSpecificOperations();
    if is_tbh then
        halfwords = UInt(MemU[R[n]+LSL(R[m],1), 2]);
    else
        halfwords = UInt(MemU[R[n]+R[m], 1]);
    BranchWritePC(PC + 2*halfwords);
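
The target calculation can be sketched as follows. This is an illustrative model only: memory is assumed to be a `bytes` object addressed from 0, data accesses are assumed to be little-endian, and `pc` is the PC value as read by the instruction, which in T32 state is the address of the instruction plus 4. The function name is hypothetical.

```python
def table_branch_target(memory: bytes, pc: int, base: int, index: int, is_tbh: bool) -> int:
    """Return the branch target for TBB (is_tbh=False) or TBH (is_tbh=True)."""
    if is_tbh:
        address = base + (index << 1)  # halfword entries: index scaled by 2
        halfwords = int.from_bytes(memory[address:address + 2], "little")
    else:
        halfwords = memory[base + index]  # byte entries: index used directly
    return pc + 2 * halfwords  # table entries count halfwords, always forward
```

Because each entry is unsigned and is doubled, TBB can reach at most 510 bytes past the PC value and TBH at most 131070 bytes.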


F5.1.250 TEQ (immediate) 


Test Equivalence (immediate) performs a bitwise exclusive OR operation on a register value and an immediate 
value. It updates the condition flags based on the result, and discards the result. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11                   0|
| !=1111 | 0  0  1  1| 0  0  1  1|   Rn   |(0)(0)(0)(0)|        imm12         |
   cond

A1 variant


TEQ{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 
n = UInt(Rn); 
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 


T1 


|15 14 13 12 11|10| 9| 8  7  6  5| 4|3      0|15|14  12|11 10  9  8|7           0|
| 1  1  1  1  0| i| 0| 0  1  0  0| 1|   Rn   | 0| imm3 | 1  1  1  1|    imm8     |


T1 variant 


TEQ{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 

n = UInt(Rn); 

(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C); 

if n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
used, but this is deprecated.

For encoding T1: is the general-purpose source register, encoded in the "Rn" field.

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on 
page F2-2422 for the range of values. 


For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on 
page F2-2420 for the range of values. 
Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] EOR imm32;
    PSTATE.N = result<31>;
    PSTATE.Z = IsZeroBit(result);
    PSTATE.C = carry;
    // PSTATE.V unchanged
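
For the A1 encoding, the carry flag comes from expanding the modified immediate: an 8-bit value rotated right by twice the 4-bit rotation field, with the carry left unchanged when the rotation is zero. The following sketch models that expansion and the flag update. The Python names are hypothetical stand-ins for the pseudocode functions and the flags are returned as a dictionary for illustration.

```python
MASK32 = 0xFFFFFFFF


def a32_expand_imm_c(imm12: int, carry_in: int):
    """Model of A32ExpandImm_C: returns (imm32, carry_out)."""
    unrotated = imm12 & 0xFF
    amount = 2 * ((imm12 >> 8) & 0xF)
    if amount == 0:
        return unrotated, carry_in  # no rotation: carry passes through
    imm32 = ((unrotated >> amount) | (unrotated << (32 - amount))) & MASK32
    return imm32, imm32 >> 31  # rotation: carry is bit[31] of the result


def teq_imm_a32(rn: int, imm12: int, carry_in: int) -> dict:
    """Model of TEQ (immediate), encoding A1: returns the updated N, Z, C flags."""
    imm32, carry = a32_expand_imm_c(imm12, carry_in)
    result = (rn ^ imm32) & MASK32
    return {"N": result >> 31, "Z": int(result == 0), "C": carry}
```

Z is set exactly when the register equals the expanded immediate, which is why TEQ can test equality without affecting the V flag.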




F5.1.251 TEQ (register) 


Test Equivalence (register) performs a bitwise exclusive OR operation on a register value and an optionally-shifted 
register value. It updates the condition flags based on the result, and discards the result. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11      7|6  5| 4|3      0|
| !=1111 | 0  0  0  1| 0  0  1  1|   Rn   |(0)(0)(0)(0)|  imm5   |type| 0|   Rm   |
   cond


Rotate right with extend variant 
Applies when imm5 == 00000 && type == 11. 


TEQ{<c>}{<q>} <Rn>, <Rm>, RRX 


Shift or rotate by value variant 
Applies when !(imm5 == 00000 && type == 11). 


TEQ{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 
n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1 


|15 14 13 12|11 10  9  8| 7  6  5  4|3      0| 15|14  12|11 10  9  8|7    6|5  4|3      0|
| 1  1  1  0| 1  0  1  0| 1  0  0  1|   Rn   |(0)| imm3 | 1  1  1  1| imm2 |type|   Rm   |


Rotate right with extend variant 
Applies when imm3 == 000 && imm2 == 00 && type == 11. 
TEQ{<c>}{<q>} <Rn>, <Rm>, RRX 
Shift or rotate by value variant 
Applies when !(imm3 == 000 && imm2 == 00 && type == 11).
TEQ{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>} 
Decode for all variants of this encoding 
n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2); 
if n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406.

<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.

For encoding T1: is the first general-purpose source register, encoded in the "Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.

For encoding T1: is the second general-purpose source register, encoded in the "Rm" field.

<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:

LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] EOR shifted;
    PSTATE.N = result<31>;
    PSTATE.Z = IsZeroBit(result);
    PSTATE.C = carry;
    // PSTATE.V unchanged
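
The immediate shift decode and the carry-producing shift can be sketched together. The code below is an illustrative model, not the architectural pseudocode: the function names are hypothetical, `type_` and `imm5` are the raw field values, and shift kinds are represented as strings. It follows the rules stated above, where an encoded amount of 0 means 32 for LSR and ASR, and means RRX when the type is ROR.

```python
MASK32 = 0xFFFFFFFF


def decode_imm_shift(type_: int, imm5: int):
    """Map the (type, imm5) fields to a (shift kind, amount) pair."""
    if type_ == 0:
        return "LSL", imm5
    if type_ == 1:
        return "LSR", imm5 or 32
    if type_ == 2:
        return "ASR", imm5 or 32
    return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)


def shift_c(value: int, kind: str, amount: int, carry_in: int):
    """Shift a 32-bit value; returns (result, carry_out)."""
    if kind == "RRX":
        return (carry_in << 31) | (value >> 1), value & 1
    if amount == 0:
        return value, carry_in
    if kind == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if kind == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if kind == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    result = ((value >> amount) | (value << (32 - amount))) & MASK32  # ROR, 1 to 31
    return result, result >> 31


def teq_reg(rn: int, rm: int, type_: int, imm5: int, carry_in: int):
    """Model of TEQ (register): returns the updated (N, Z, C) flags."""
    kind, amount = decode_imm_shift(type_, imm5)
    shifted, carry = shift_c(rm, kind, amount, carry_in)
    result = (rn ^ shifted) & MASK32
    return result >> 31, int(result == 0), carry
```

With no shift specified (LSL by 0), the carry flag is passed through unchanged, so only N and Z reflect the comparison.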





F5.1.252 TEQ (register-shifted register) 
Test Equivalence (register-shifted register) performs a bitwise exclusive OR operation on a register value and a 
register-shifted register value. It updates the condition flags based on the result, and discards the result. 
A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11     8| 7|6  5| 4|3      0|
| !=1111 | 0  0  0  1| 0  0  1  1|   Rn   |(0)(0)(0)(0)|   Rs   | 0|type| 1|   Rm   |
   cond

A1 variant
TEQ{<c>}{<q>} <Rn>, <Rm>, <type> <Rs>

Decode for this encoding
n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
shift_t = DecodeRegShift(type);
if n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation 
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] EOR shifted;
    PSTATE.N = result<31>;
    PSTATE.Z = IsZeroBit(result);
    PSTATE.C = carry;
    // PSTATE.V unchanged


F5.1.253 TST (immediate) 


Test (immediate) performs a bitwise AND operation on a register value and an immediate value. It updates the 
condition flags based on the result, and discards the result. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11                   0|
| !=1111 | 0  0  1  1| 0  0  0  1|   Rn   |(0)(0)(0)(0)|        imm12         |
   cond

A1 variant


TST{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 


n = UInt(Rn); 
(imm32, carry) = A32ExpandImm_C(imm12, PSTATE.C); 


T1 


|15 14 13 12 11|10| 9| 8  7  6  5| 4|3      0|15|14  12|11 10  9  8|7           0|
| 1  1  1  1  0| i| 0| 0  0  0  0| 1|   Rn   | 0| imm3 | 1  1  1  1|    imm8     |


T1 variant 


TST{<c>}{<q>} <Rn>, #<const> 


Decode for this encoding 


n = UInt(Rn); 
(imm32, carry) = T32ExpandImm_C(i:imm3:imm8, PSTATE.C); 
if n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rn> For encoding A1: is the general-purpose source register, encoded in the "Rn" field. The PC can be
used, but this is deprecated.

For encoding T1: is the general-purpose source register, encoded in the "Rn" field.

<const> For encoding A1: an immediate value. See Modified immediate constants in A32 instructions on
page F2-2422 for the range of values.

For encoding T1: an immediate value. See Modified immediate constants in T32 instructions on
page F2-2420 for the range of values.

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] AND imm32;
    PSTATE.N = result<31>;
    PSTATE.Z = IsZeroBit(result);
    PSTATE.C = carry;
    // PSTATE.V unchanged
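
For the T1 encoding, the 12-bit value i:imm3:imm8 is expanded by the T32 modified immediate rules. The sketch below models that expansion as it is commonly described: when the top two bits are 00 the byte is used directly or replicated, and otherwise an 8-bit value with its top bit forced to 1 is rotated right. The function names are hypothetical, and the replicated forms with imm8 equal to zero, which are UNPREDICTABLE, are modelled here by raising an exception.

```python
MASK32 = 0xFFFFFFFF


def t32_expand_imm_c(imm12: int, carry_in: int):
    """Model of T32ExpandImm_C: returns (imm32, carry_out)."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) == 0:
        mode = (imm12 >> 8) & 0x3
        if mode == 0:
            return imm8, carry_in  # 00000000 00000000 00000000 abcdefgh
        if imm8 == 0:
            raise ValueError("UNPREDICTABLE")
        if mode == 1:
            return (imm8 << 16) | imm8, carry_in  # 00 ab 00 ab pattern
        if mode == 2:
            return (imm8 << 24) | (imm8 << 8), carry_in  # ab 00 ab 00 pattern
        return imm8 * 0x01010101, carry_in  # ab ab ab ab pattern
    unrotated = 0x80 | (imm12 & 0x7F)
    amount = (imm12 >> 7) & 0x1F  # at least 8 on this path
    imm32 = ((unrotated >> amount) | (unrotated << (32 - amount))) & MASK32
    return imm32, imm32 >> 31


def tst_imm_t32(rn: int, i: int, imm3: int, imm8: int, carry_in: int):
    """Model of TST (immediate), encoding T1: returns the updated (N, Z, C) flags."""
    imm32, carry = t32_expand_imm_c((i << 11) | (imm3 << 8) | imm8, carry_in)
    result = rn & imm32
    return result >> 31, int(result == 0), carry
```

As with the A32 form, the carry flag changes only when the constant is produced by a rotation.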
F5.1.254 TST (register)


Test (register) performs a bitwise AND operation on a register value and an optionally-shifted register value. It 
updates the condition flags based on the result, and discards the result. 


A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11      7|6  5| 4|3      0|
| !=1111 | 0  0  0  1| 0  0  0  1|   Rn   |(0)(0)(0)(0)|  imm5   |type| 0|   Rm   |
   cond


Rotate right with extend variant 
Applies when imm5 == 00000 && type == 11. 


TST{<c>}{<q>} <Rn>, <Rm>, RRX 


Shift or rotate by value variant 
Applies when !(imm5 == 00000 && type == 11). 


TST{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>} 


Decode for all variants of this encoding 


n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = DecodeImmShift(type, imm5); 


T1 


|15 14 13 12|11 10  9  8| 7  6|5    3|2    0|
| 0  1  0  0| 0  0  1  0| 0  0|  Rm  |  Rn  |


T1 variant 


TST{<c>}{<q>} <Rn>, <Rm> 


Decode for this encoding 
n = UInt(Rn); m = UInt(Rm); 
(shift_t, shift_n) = (SRType_LSL, 0);


T2

|15 14 13 12|11 10  9  8| 7  6  5  4|3      0| 15|14  12|11 10  9  8|7    6|5  4|3      0|
| 1  1  1  0| 1  0  1  0| 0  0  0  1|   Rn   |(0)| imm3 | 1  1  1  1| imm2 |type|   Rm   |

Rotate right with extend variant
Applies when imm3 == 000 && imm2 == 00 && type == 11.

TST{<c>}{<q>} <Rn>, <Rm>, RRX

Shift or rotate by value variant
Applies when !(imm3 == 000 && imm2 == 00 && type == 11).

TST{<c>}.W <Rn>, <Rm> // <Rn>, <Rm> can be represented in T1
TST{<c>}{<q>} <Rn>, <Rm> {, <shift> #<amount>}

Decode for all variants of this encoding

n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Rn> For encoding A1: is the first general-purpose source register, encoded in the "Rn" field. The PC can
be used, but this is deprecated.

For encoding T1 and T2: is the first general-purpose source register, encoded in the "Rn" field.

<Rm> For encoding A1: is the second general-purpose source register, encoded in the "Rm" field. The PC
can be used, but this is deprecated.

For encoding T1 and T2: is the second general-purpose source register, encoded in the "Rm" field.

<shift> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have
the following values:

LSL when type = 00
LSR when type = 01
ASR when type = 10
ROR when type = 11

<amount> For encoding A1: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm5" field as <amount> modulo 32.

For encoding T2: is the shift amount, in the range 1 to 31 (when <shift> = LSL or ROR) or 1 to 32
(when <shift> = LSR or ASR) encoded in the "imm3:imm2" field as <amount> modulo 32.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] AND shifted;
    PSTATE.N = result<31>;
    PSTATE.Z = IsZeroBit(result);
    PSTATE.C = carry;
    // PSTATE.V unchanged
F5.1.255 TST (register-shifted register)
Test (register-shifted register) performs a bitwise AND operation on a register value and a register-shifted register 
value. It updates the condition flags based on the result, and discards the result. 
A1

|31    28|27 26 25 24|23 22 21 20|19    16|15 14 13 12 |11     8| 7|6  5| 4|3      0|
| !=1111 | 0  0  0  1| 0  0  0  1|   Rn   |(0)(0)(0)(0)|   Rs   | 0|type| 1|   Rm   |
   cond

A1 variant
TST{<c>}{<q>} <Rn>, <Rm>, <type> <Rs>

Decode for this encoding
n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
shift_t = DecodeRegShift(type);
if n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<type> Is the type of shift to be applied to the second source register, encoded in the "type" field. It can have 
the following values: 
LSL when type = 00 
LSR when type = 01 
ASR when type = 10 
ROR when type = 11 
<Rs> Is the third general-purpose source register holding a shift amount in its bottom 8 bits, encoded in 
the "Rs" field. 
Operation 
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, PSTATE.C);
    result = R[n] AND shifted;
    PSTATE.N = result<31>;
    PSTATE.Z = IsZeroBit(result);
    PSTATE.C = carry;
    // PSTATE.V unchanged
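
Because the shift amount comes from the bottom 8 bits of Rs, it can be anywhere from 0 to 255, including values of 32 or more. The sketch below is an illustrative model of that case. The function names are hypothetical, `type_` is the raw 2-bit field, and the behavior for large amounts follows from treating the shift as an operation on an unbounded integer and keeping 32 bits.

```python
MASK32 = 0xFFFFFFFF


def shift_by_register(value: int, type_: int, amount: int, carry_in: int):
    """Shift a 32-bit value by 0 to 255 bits; returns (result, carry_out)."""
    if amount == 0:
        return value, carry_in  # carry passes through unchanged
    if type_ == 0:  # LSL: carry is the last bit shifted out, 0 beyond 32
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if type_ == 1:  # LSR
        return value >> amount, (value >> (amount - 1)) & 1
    if type_ == 2:  # ASR: amounts of 32 or more yield copies of the sign bit
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    rotation = amount % 32  # ROR: only the amount modulo 32 moves bits
    result = ((value >> rotation) | (value << (32 - rotation))) & MASK32
    return result, result >> 31


def tst_rsr(rn: int, rm: int, rs: int, type_: int, carry_in: int):
    """Model of TST (register-shifted register): returns the updated (N, Z, C) flags."""
    shift_n = rs & 0xFF  # only the bottom 8 bits of Rs are used
    shifted, carry = shift_by_register(rm, type_, shift_n, carry_in)
    result = rn & shifted
    return result >> 31, int(result == 0), carry
```

For example, with Rs = 0x120 the amount is 0x20, so an LSL clears the operand completely and leaves bit[0] of Rm in the carry flag.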
F5.1.256 UADD16
Unsigned Add 16 performs two 16-bit unsigned integer additions, and writes the results to the destination register. 
It sets PSTATE.GE according to the results of the additions. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|/11109 8|7 6 5 4/3 0| 
Fit jo 110 o[7 0 4] Rn | Rd farnyfofo of1] Rm __ | 
cond 
Al variant 
UADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); nm = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
|15141312|/1110 9 8|7 6 5 4/3 0 |15 14 13 12|11 8|7 6 5 4]|3 0| 
Titttoto yoo Re [1717] Ra jo[tfojo] rm | 
T1 variant 
UADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> 
Decode for this encoding 
d = UInt(Rd); nm = UInt(Rn); m = UInt(Rm); 
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<15:0>) + UInt(R[m]<15:0>);
    sum2 = UInt(R[n]<31:16>) + UInt(R[m]<31:16>);
    R[d]<15:0> = sum1<15:0>;
    R[d]<31:16> = sum2<15:0>;
    PSTATE.GE<1:0> = if sum1 >= 0x10000 then '11' else '00';
    PSTATE.GE<3:2> = if sum2 >= 0x10000 then '11' else '00';
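As an informal illustration of the pseudocode above, the following Python sketch computes the destination value and the GE bits for two register values. The function name uadd16 and the representation of PSTATE.GE as a 4-bit integer (GE<3> in bit 3) are assumptions made for this example only.

```python
def uadd16(rn, rm):
    """Hypothetical model of UADD16; returns (Rd, GE) with GE as a 4-bit integer."""
    sum1 = (rn & 0xFFFF) + (rm & 0xFFFF)                    # low halfwords
    sum2 = ((rn >> 16) & 0xFFFF) + ((rm >> 16) & 0xFFFF)    # high halfwords
    rd = ((sum2 & 0xFFFF) << 16) | (sum1 & 0xFFFF)
    ge = 0
    if sum1 >= 0x10000:      # carry out of the low halfword sets GE<1:0>
        ge |= 0b0011
    if sum2 >= 0x10000:      # carry out of the high halfword sets GE<3:2>
        ge |= 0b1100
    return rd, ge
```

Each halfword wraps independently: a carry out of the low halfword is reported in GE<1:0> and does not propagate into the high halfword.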
F5.1.257 UADD8
Unsigned Add 8 performs four unsigned 8-bit integer additions, and writes the results to the destination register. It
sets PSTATE.GE according to the results of the additions.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 0 1 | Rn | Rd | (1)(1)(1)(1) | 1 0 0 1 | Rm |
cond
A1 variant
UADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 0 | Rn | 1 1 1 1 | Rd | 0 1 0 0 | Rm |
T1 variant
UADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<7:0>) + UInt(R[m]<7:0>);
    sum2 = UInt(R[n]<15:8>) + UInt(R[m]<15:8>);
    sum3 = UInt(R[n]<23:16>) + UInt(R[m]<23:16>);
    sum4 = UInt(R[n]<31:24>) + UInt(R[m]<31:24>);
    R[d]<7:0> = sum1<7:0>;
    R[d]<15:8> = sum2<7:0>;
    R[d]<23:16> = sum3<7:0>;
    R[d]<31:24> = sum4<7:0>;
    PSTATE.GE<0> = if sum1 >= 0x100 then '1' else '0';
    PSTATE.GE<1> = if sum2 >= 0x100 then '1' else '0';
    PSTATE.GE<2> = if sum3 >= 0x100 then '1' else '0';
    PSTATE.GE<3> = if sum4 >= 0x100 then '1' else '0';
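The four byte lanes of the Operation pseudocode for UADD8 can also be expressed as a loop. The Python sketch below is illustrative only; the name uadd8 and the encoding of PSTATE.GE as a 4-bit integer with one bit per lane are assumptions of this example.

```python
def uadd8(rn, rm):
    """Hypothetical model of UADD8; returns (Rd, GE) with one GE bit per byte lane."""
    rd = 0
    ge = 0
    for lane in range(4):
        shift = 8 * lane
        total = ((rn >> shift) & 0xFF) + ((rm >> shift) & 0xFF)
        rd |= (total & 0xFF) << shift    # keep the low 8 bits of each sum
        if total >= 0x100:               # carry out of this lane sets GE<lane>
            ge |= 1 << lane
    return rd, ge
```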
F5.1.258 UASX
Unsigned Add and Subtract with Exchange exchanges the two halfwords of the second operand, performs one
unsigned 16-bit integer addition and one unsigned 16-bit subtraction, and writes the results to the destination
register. It sets PSTATE.GE according to the results.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 0 1 | Rn | Rd | (1)(1)(1)(1) | 0 0 1 1 | Rm |
cond
A1 variant
UASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 1 0 | Rn | 1 1 1 1 | Rd | 0 1 0 0 | Rm |
T1 variant
UASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff = UInt(R[n]<15:0>) - UInt(R[m]<31:16>);
    sum = UInt(R[n]<31:16>) + UInt(R[m]<15:0>);
    R[d]<15:0> = diff<15:0>;
    R[d]<31:16> = sum<15:0>;
    PSTATE.GE<1:0> = if diff >= 0 then '11' else '00';
    PSTATE.GE<3:2> = if sum >= 0x10000 then '11' else '00';
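The exchange of halfwords is easiest to see in executable form. The following Python sketch mirrors the pseudocode above under the same assumptions as a software model would make: the name uasx is hypothetical, and PSTATE.GE is returned as a 4-bit integer.

```python
def uasx(rn, rm):
    """Hypothetical model of UASX; returns (Rd, GE) with GE as a 4-bit integer."""
    diff = (rn & 0xFFFF) - ((rm >> 16) & 0xFFFF)     # Rn low minus Rm high
    total = ((rn >> 16) & 0xFFFF) + (rm & 0xFFFF)    # Rn high plus Rm low
    rd = ((total & 0xFFFF) << 16) | (diff & 0xFFFF)
    ge = 0
    if diff >= 0:            # no borrow in the subtraction sets GE<1:0>
        ge |= 0b0011
    if total >= 0x10000:     # carry out of the addition sets GE<3:2>
        ge |= 0b1100
    return rd, ge
```

Note that the two GE pairs have different meanings: GE<1:0> reports the absence of a borrow, while GE<3:2> reports a carry.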
F5.1.259 UBFX
Unsigned Bit Field Extract extracts any number of adjacent bits at any position from a register, zero-extends them
to 32 bits, and writes the result to the destination register.
A1
|31 28|27 26 25 24 23 22 21|20 16|15 12|11 7|6 5 4|3 0|
| !=1111 | 0 1 1 1 1 1 1 | widthm1 | Rd | lsb | 1 0 1 | Rn |
cond
A1 variant
UBFX{<c>}{<q>} <Rd>, <Rn>, #<lsb>, #<width>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn);
lsbit = UInt(lsb); widthminus1 = UInt(widthm1);
if d == 15 || n == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15|14 12|11 8|7 6|5|4 0|
| 1 1 1 1 | 0 (0) 1 1 | 1 1 0 0 | Rn | 0 | imm3 | Rd | imm2 | (0) | widthm1 |
T1 variant
UBFX{<c>}{<q>} <Rd>, <Rn>, #<lsb>, #<width>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn);
lsbit = UInt(imm3:imm2); widthminus1 = UInt(widthm1);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the general-purpose source register, encoded in the "Rn" field. 
<lsb> For encoding A1: is the bit number of the least significant bit in the field, in the range 0 to 31,
encoded in the "lsb" field.
For encoding T1: is the bit number of the least significant bit in the field, in the range 0 to 31, 
encoded in the "imm3:imm2" field. 
<width> Is the width of the field, in the range 1 to 32-<lsb>, encoded in the "widthm1" field as <width>-1. 
Operation for all encodings
if ConditionPassed() then
    EncodingSpecificOperations();
    msbit = lsbit + widthminus1;
    if msbit <= 31 then
        R[d] = ZeroExtend(R[n]<msbit:lsbit>, 32);
    else
        UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If msbit > 31, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
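The extraction and the msbit check can be sketched in Python as follows. This is an illustrative model only: the name ubfx is hypothetical, the function takes the assembler-level <lsb> and <width> values rather than the encoded fields, and it returns None for the CONSTRAINED UNPREDICTABLE case instead of choosing one of the permitted behaviors.

```python
def ubfx(rn, lsb, width):
    """Hypothetical model of UBFX; returns the extracted field, or None if msbit > 31."""
    if not (0 <= lsb <= 31 and 1 <= width <= 32):
        raise ValueError("lsb or width cannot be encoded")
    msbit = lsb + (width - 1)            # widthminus1 is encoded as width - 1
    if msbit > 31:
        return None                      # CONSTRAINED UNPREDICTABLE, not modelled
    return ((rn & 0xFFFFFFFF) >> lsb) & ((1 << width) - 1)
```

For example, ubfx(0xABCD1234, 8, 8) selects bits<15:8> of the source value and returns 0x12.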
F5.1.260 UDF
Permanently Undefined generates an Undefined Instruction exception.
The encodings for UDF used in this section are defined as permanently UNDEFINED in the ARMv8-A architecture.
However:
• With the T32 instruction set, ARM deprecates using the UDF instruction in an IT block.
• In the A32 instruction set, UDF is not conditional.
A1
|31 28|27 26 25 24|23 22 21 20|19 8|7 6 5 4|3 0|
| 1 1 1 0 | 0 1 1 1 | 1 1 1 1 | imm12 | 1 1 1 1 | imm4 |
cond
A1 variant
UDF{<c>}{<q>} {#}<imm>
Decode for this encoding
imm32 = ZeroExtend(imm12:imm4, 32);
// imm32 is for assembly and disassembly only, and is ignored by hardware.
T1
|15 14 13 12|11 10 9 8|7 0|
| 1 1 0 1 | 1 1 1 0 | imm8 |
T1 variant
UDF{<c>}{<q>} {#}<imm>
Decode for this encoding
imm32 = ZeroExtend(imm8, 32);
// imm32 is for assembly and disassembly only, and is ignored by hardware.
T2
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 0|
| 1 1 1 1 | 0 1 1 1 | 1 1 1 1 | imm4 | 1 0 1 0 | imm12 |
T2 variant
UDF{<c>}.W {#}<imm> // <imm> can be represented in T1
UDF{<c>}{<q>} {#}<imm>
Decode for this encoding
imm32 = ZeroExtend(imm4:imm12, 32);
// imm32 is for assembly and disassembly only, and is ignored by hardware.
Assembler symbols
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. <c> must be AL or omitted.
For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. ARM deprecates
using any <c> value other than AL.
<q> See Standard assembler syntax fields on page F2-2406.
<imm> For encoding A1: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
"imm12:imm4" field. The PE ignores the value of this constant.
For encoding T1: is an 8-bit unsigned immediate, in the range 0 to 255, encoded in the "imm8" field.
The PE ignores the value of this constant.
For encoding T2: is a 16-bit unsigned immediate, in the range 0 to 65535, encoded in the
"imm4:imm12" field. The PE ignores the value of this constant.
Operation for all encodings
if ConditionPassed() then
    EncodingSpecificOperations();
    UNDEFINED;
F5.1.261 UDIV
Unsigned Divide divides a 32-bit unsigned integer register value by a 32-bit unsigned integer register value, and
writes the result to the destination register. The condition flags are not affected.
See Divide instructions on page F1-2379 for more information about this instruction.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 1 | 0 0 1 1 | Rd | (1)(1)(1)(1) | Rm | 0 0 0 1 | Rn |
cond: bits 31-28; Ra: bits 15-12
A1 variant
UDIV{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a != 15 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If Ra != '1111', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as described, with no change to its behavior and no additional side effects.
• The instruction performs a divide and the register specified by Ra becomes UNKNOWN.


T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 1 0 1 1 | Rn | (1)(1)(1)(1) | Rd | 1 1 1 1 | Rm |
Ra: bits 15-12 of the second halfword
T1 variant
UDIV{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a != 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
CONSTRAINED UNPREDICTABLE behavior
If Ra != '1111', then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction executes as described, with no change to its behavior and no additional side effects.
• The instruction performs a divide and the register specified by Ra becomes UNKNOWN.
Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rn> Is the first general-purpose source register holding the dividend, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register holding the divisor, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if UInt(R[m]) == 0 then
        result = 0;
    else
        result = RoundTowardsZero(Real(UInt(R[n])) / Real(UInt(R[m])));
    R[d] = result<31:0>;
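A minimal Python sketch of the Operation pseudocode above follows. The name udiv is hypothetical. For non-negative operands, Python floor division gives the same result as RoundTowardsZero applied to the real quotient, which is the assumption this model relies on.

```python
def udiv(rn, rm):
    """Hypothetical model of UDIV on 32-bit unsigned register values."""
    rn &= 0xFFFFFFFF
    rm &= 0xFFFFFFFF
    if rm == 0:
        return 0                         # the pseudocode defines division by zero as 0
    return (rn // rm) & 0xFFFFFFFF       # floor equals round-toward-zero for unsigned values
```

In this model, a zero divisor produces a zero result rather than an exception, as in the pseudocode.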
F5.1 Alphabetical list of T32 and A32 base instruction set instructions

F5.1.262 UHADD16
Unsigned Halving Add 16 performs two unsigned 16-bit integer additions, halves the results, and writes the results
to the destination register.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 1 1 | Rn | Rd | (1)(1)(1)(1) | 0 0 0 1 | Rm |
cond
A1 variant
UHADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 1 | Rn | 1 1 1 1 | Rd | 0 1 1 0 | Rm |
T1 variant
UHADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<15:0>) + UInt(R[m]<15:0>);
    sum2 = UInt(R[n]<31:16>) + UInt(R[m]<31:16>);
    R[d]<15:0> = sum1<16:1>;
    R[d]<31:16> = sum2<16:1>;
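The halving step takes bits<16:1> of each 17-bit sum, so the carry out of the addition is kept and the result cannot overflow. The Python sketch below illustrates this; the name uhadd16 is hypothetical and the code is a model of the pseudocode rather than a definition.

```python
def uhadd16(rn, rm):
    """Hypothetical model of UHADD16; returns the 32-bit destination value."""
    sum1 = (rn & 0xFFFF) + (rm & 0xFFFF)
    sum2 = ((rn >> 16) & 0xFFFF) + ((rm >> 16) & 0xFFFF)
    low = (sum1 >> 1) & 0xFFFF           # sum1<16:1>
    high = (sum2 >> 1) & 0xFFFF          # sum2<16:1>
    return (high << 16) | low
```

For example, adding the halfwords 0xFFFF and 0x0001 gives the 17-bit sum 0x10000, and the halved result is 0x8000.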
F5.1.263 UHADD8
Unsigned Halving Add 8 performs four unsigned 8-bit integer additions, halves the results, and writes the results to
the destination register.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 1 1 | Rn | Rd | (1)(1)(1)(1) | 1 0 0 1 | Rm |
cond
A1 variant
UHADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 0 | Rn | 1 1 1 1 | Rd | 0 1 1 0 | Rm |
T1 variant
UHADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<7:0>) + UInt(R[m]<7:0>);
    sum2 = UInt(R[n]<15:8>) + UInt(R[m]<15:8>);
    sum3 = UInt(R[n]<23:16>) + UInt(R[m]<23:16>);
    sum4 = UInt(R[n]<31:24>) + UInt(R[m]<31:24>);
    R[d]<7:0> = sum1<8:1>;
    R[d]<15:8> = sum2<8:1>;
    R[d]<23:16> = sum3<8:1>;
    R[d]<31:24> = sum4<8:1>;
F5.1.264 UHASX
Unsigned Halving Add and Subtract with Exchange exchanges the two halfwords of the second operand, performs
one unsigned 16-bit integer addition and one unsigned 16-bit subtraction, halves the results, and writes the results
to the destination register.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 1 1 | Rn | Rd | (1)(1)(1)(1) | 0 0 1 1 | Rm |
cond
A1 variant
UHASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 1 0 | Rn | 1 1 1 1 | Rd | 0 1 1 0 | Rm |
T1 variant
UHASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    diff = UInt(R[n]<15:0>) - UInt(R[m]<31:16>);
    sum = UInt(R[n]<31:16>) + UInt(R[m]<15:0>);
    R[d]<15:0> = diff<16:1>;
    R[d]<31:16> = sum<16:1>;
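In the pseudocode above, diff can be negative, and diff<16:1> is then taken from its two's complement representation. The Python sketch below shows one way to model this, relying on the fact that Python's right shift on a negative integer is arithmetic. The name uhasx is hypothetical.

```python
def uhasx(rn, rm):
    """Hypothetical model of UHASX; returns the 32-bit destination value."""
    diff = (rn & 0xFFFF) - ((rm >> 16) & 0xFFFF)     # may be negative
    total = ((rn >> 16) & 0xFFFF) + (rm & 0xFFFF)
    low = (diff >> 1) & 0xFFFF           # diff<16:1>, two's complement for negatives
    high = (total >> 1) & 0xFFFF         # sum<16:1>
    return (high << 16) | low
```

With Rn = 0x00030001 and Rm = 0x00020005, diff is 1 - 2 = -1, which halves to 0xFFFF, and the sum 3 + 5 = 8 halves to 4.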
F5.1.265 UHSAX
Unsigned Halving Subtract and Add with Exchange exchanges the two halfwords of the second operand, performs
one unsigned 16-bit integer subtraction and one unsigned 16-bit addition, halves the results, and writes the results
to the destination register.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 1 1 | Rn | Rd | (1)(1)(1)(1) | 0 1 0 1 | Rm |
cond
A1 variant
UHSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 1 0 | Rn | 1 1 1 1 | Rd | 0 1 1 0 | Rm |
T1 variant
UHSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum = UInt(R[n]<15:0>) + UInt(R[m]<31:16>);
    diff = UInt(R[n]<31:16>) - UInt(R[m]<15:0>);
    R[d]<15:0> = sum<16:1>;
    R[d]<31:16> = diff<16:1>;
F5.1.266 UHSUB16
Unsigned Halving Subtract 16 performs two unsigned 16-bit integer subtractions, halves the results, and writes the
results to the destination register.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 1 1 | Rn | Rd | (1)(1)(1)(1) | 0 1 1 1 | Rm |
cond
A1 variant
UHSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 1 | Rn | 1 1 1 1 | Rd | 0 1 1 0 | Rm |
T1 variant
UHSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = UInt(R[n]<15:0>) - UInt(R[m]<15:0>);
    diff2 = UInt(R[n]<31:16>) - UInt(R[m]<31:16>);
    R[d]<15:0> = diff1<16:1>;
    R[d]<31:16> = diff2<16:1>;
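The bit-slice notation used in the Operation pseudocode, such as diff1<16:1>, can be mirrored directly with a small helper. The Python sketch below is illustrative only: the helper name bits and the function name uhsub16 are hypothetical, and the helper assumes that a slice of a negative integer is taken from its two's complement representation.

```python
def bits(value, hi, lo):
    """Assumed model of the pseudocode slice value<hi:lo>, valid for negative integers too."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

def uhsub16(rn, rm):
    """Hypothetical model of UHSUB16; returns the 32-bit destination value."""
    diff1 = bits(rn, 15, 0) - bits(rm, 15, 0)
    diff2 = bits(rn, 31, 16) - bits(rm, 31, 16)
    return (bits(diff2, 16, 1) << 16) | bits(diff1, 16, 1)
```

With Rn = 0x00040001 and Rm = 0x00020003, diff1 is -2 and halves to 0xFFFF, while diff2 is 2 and halves to 1.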
F5.1.267 UHSUB8
Unsigned Halving Subtract 8 performs four unsigned 8-bit integer subtractions, halves the results, and writes the
results to the destination register.
A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| !=1111 | 0 1 1 0 | 0 1 1 1 | Rn | Rd | (1)(1)(1)(1) | 1 1 1 1 | Rm |
cond
A1 variant
UHSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 0 | Rn | 1 1 1 1 | Rd | 0 1 1 0 | Rm |
T1 variant
UHSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = UInt(R[n]<7:0>) - UInt(R[m]<7:0>);
    diff2 = UInt(R[n]<15:8>) - UInt(R[m]<15:8>);
    diff3 = UInt(R[n]<23:16>) - UInt(R[m]<23:16>);
    diff4 = UInt(R[n]<31:24>) - UInt(R[m]<31:24>);
    R[d]<7:0> = diff1<8:1>;
    R[d]<15:8> = diff2<8:1>;
    R[d]<23:16> = diff3<8:1>;
    R[d]<31:24> = diff4<8:1>;





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F5-3177 
ID092916 Non-Confidential


F5 T32 and A32 Base Instruction Set Instruction Descriptions 


F5.1.268 UMAAL 


Unsigned Multiply Accumulate Accumulate Long multiplies two unsigned 32-bit values to produce a 64-bit value, 
adds two unsigned 32-bit values, and writes the 64-bit result to two registers. 


A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
| !=1111 | 0 0 0 0 | 0 1 0 0 | RdHi | RdLo | Rm | 1 0 0 1 | Rn |
cond
A1 variant
UMAAL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Decode for this encoding
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm);
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 1 1 1 0 | Rn | RdLo | RdHi | 0 1 1 0 | Rm |
T1 variant
UMAAL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Decode for this encoding
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm);
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
Notes for all encodings
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.
Assembler symbols
<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<RdLo> Is the general-purpose source register holding the first addend and the destination register for the
lower 32 bits of the result, encoded in the "RdLo" field.
<RdHi> Is the general-purpose source register holding the second addend and the destination register for the
upper 32 bits of the result, encoded in the "RdHi" field.


<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = UInt(R[n]) * UInt(R[m]) + UInt(R[dHi]) + UInt(R[dLo]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
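The result of the pseudocode above always fits in 64 bits, because (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1. The Python sketch below models the operation and can be used to check that bound; the name umaal and the argument order are assumptions of this example.

```python
MASK32 = 0xFFFFFFFF

def umaal(rn, rm, rd_hi, rd_lo):
    """Hypothetical model of UMAAL; returns the new (RdHi, RdLo) pair."""
    result = (rn & MASK32) * (rm & MASK32) + (rd_hi & MASK32) + (rd_lo & MASK32)
    return (result >> 32) & MASK32, result & MASK32
```

With all four inputs set to 0xFFFFFFFF, the model returns 0xFFFFFFFF in both halves, which is the largest 64-bit value.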
F5.1.269 UMLAL, UMLALS
Unsigned Multiply Accumulate Long multiplies two unsigned 32-bit values to produce a 64-bit value, and
accumulates this with a 64-bit value.
In A32 instructions, the condition flags can optionally be updated based on the result. Use of this option adversely
affects performance on many implementations.
A1
|31 28|27 26 25 24|23 22 21|20|19 16|15 12|11 8|7 6 5 4|3 0|
| !=1111 | 0 0 0 0 | 1 0 1 | S | RdHi | RdLo | Rm | 1 0 0 1 | Rn |
cond
Flag setting variant
Applies when S == 1.
UMLALS{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Not flag setting variant
Applies when S == 0.
UMLAL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Decode for all variants of this encoding
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 1 1 1 0 | Rn | RdLo | RdHi | 0 0 0 0 | Rm |
T1 variant
UMLAL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Decode for this encoding
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = FALSE;
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<RdLo> Is the general-purpose source register holding the lower 32 bits of the addend, and the destination
register for the lower 32 bits of the result, encoded in the "RdLo" field.


<RdHi> Is the general-purpose source register holding the upper 32 bits of the addend, and the destination 
register for the upper 32 bits of the result, encoded in the "RdHi" field. 


<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = UInt(R[n]) * UInt(R[m]) + UInt(R[dHi]:R[dLo]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
    if setflags then
        PSTATE.N = result<63>;
        PSTATE.Z = IsZeroBit(result<63:0>);
        // PSTATE.C, PSTATE.V unchanged
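Unlike UMAAL, the accumulation here can exceed 64 bits, and only result<63:0> is written back and tested for the flags. The Python sketch below models this behavior; the name umlal, the argument order, and the flag dictionary are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def umlal(rn, rm, rd_hi, rd_lo, setflags=False):
    """Hypothetical model of UMLAL/UMLALS; returns (RdHi, RdLo, flags or None)."""
    addend = ((rd_hi & MASK32) << 32) | (rd_lo & MASK32)     # R[dHi]:R[dLo]
    result = (rn & MASK32) * (rm & MASK32) + addend          # may need 65 bits
    new_hi = (result >> 32) & MASK32
    new_lo = result & MASK32
    flags = None
    if setflags:
        flags = {"N": (result >> 63) & 1, "Z": int((result & MASK64) == 0)}
    return new_hi, new_lo, flags
```

For example, multiplying 1 by 1 and accumulating into 0xFFFFFFFF_FFFFFFFF wraps to zero, so the flag-setting form reports Z = 1.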
F5.1.270 UMULL, UMULLS
Unsigned Multiply Long multiplies two 32-bit unsigned values to produce a 64-bit result.
In A32 instructions, the condition flags can optionally be updated based on the result. Use of this option adversely
affects performance on many implementations.
A1
|31 28|27 26 25 24|23 22 21|20|19 16|15 12|11 8|7 6 5 4|3 0|
| !=1111 | 0 0 0 0 | 1 0 0 | S | RdHi | RdLo | Rm | 1 0 0 1 | Rn |
cond
Flag setting variant
Applies when S == 1.
UMULLS{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Not flag setting variant
Applies when S == 0.
UMULL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Decode for all variants of this encoding
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = (S == '1');
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 1 0 1 0 | Rn | RdLo | RdHi | 0 0 0 0 | Rm |
T1 variant
UMULL{<c>}{<q>} <RdLo>, <RdHi>, <Rn>, <Rm>
Decode for this encoding
dLo = UInt(RdLo); dHi = UInt(RdHi); n = UInt(Rn); m = UInt(Rm); setflags = FALSE;
if dLo == 15 || dHi == 15 || n == 15 || m == 15 then UNPREDICTABLE;
// ARMv8-A removes UNPREDICTABLE for R13
if dHi == dLo then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If dHi == dLo, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<RdLo> Is the general-purpose destination register for the lower 32 bits of the result, encoded in the "RdLo" 
field. 

<RdHi> Is the general-purpose destination register for the upper 32 bits of the result, encoded in the "RdHi" 
field. 

<Rn> Is the first general-purpose source register holding the multiplicand, encoded in the "Rn" field. 

<Rm> Is the second general-purpose source register holding the multiplier, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    result = UInt(R[n]) * UInt(R[m]);
    R[dHi] = result<63:32>;
    R[dLo] = result<31:0>;
    if setflags then
        PSTATE.N = result<63>;
        PSTATE.Z = IsZeroBit(result<63:0>);
        // PSTATE.C, PSTATE.V unchanged
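
As an informal illustration of the UMULL operation above, the following Python sketch models the 64-bit unsigned product and the optional N and Z updates. The function name `umull` and the dictionary used for the flags are assumptions made for this example only; they are not part of the architecture pseudocode.

```python
MASK32 = 0xFFFFFFFF


def umull(rn, rm, setflags=False):
    """Model of UMULL: returns (rd_lo, rd_hi, flags).

    flags is None unless setflags is True, in which case it holds the
    N and Z values derived from the full 64-bit result. C and V are
    left unchanged by the instruction, so they are not modelled.
    """
    result = (rn & MASK32) * (rm & MASK32)   # UInt(R[n]) * UInt(R[m])
    rd_hi = (result >> 32) & MASK32          # result<63:32>
    rd_lo = result & MASK32                  # result<31:0>
    flags = None
    if setflags:
        flags = {"N": (result >> 63) & 1, "Z": int(result == 0)}
    return rd_lo, rd_hi, flags
```

For example, multiplying 0xFFFFFFFF by itself gives 0xFFFFFFFE00000001, so the low word is 1 and the high word is 0xFFFFFFFE.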
F5.1.271 UQADD16


Unsigned Saturating Add 16 performs two unsigned 16-bit integer additions, saturates the results to the 16-bit
unsigned integer range 0 <= x <= 2^16 - 1, and writes the results to the destination register.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 1 0 | Rn | Rd | (1)(1)(1)(1) | 0 0 0 1 | Rm |
  cond

A1 variant
UQADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 1 | Rn | 1 1 1 1 | Rd | 0 1 0 1 | Rm |

T1 variant
UQADD16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<15:0>) + UInt(R[m]<15:0>);
    sum2 = UInt(R[n]<31:16>) + UInt(R[m]<31:16>);
    R[d]<15:0> = UnsignedSat(sum1, 16);
    R[d]<31:16> = UnsignedSat(sum2, 16);
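
The UQADD16 operation above can be sketched informally in Python. The helper `unsigned_sat` stands in for the pseudocode function UnsignedSat(), and `uqadd16` is a hypothetical name chosen for this example; both are assumptions, not architecture definitions.

```python
def unsigned_sat(value, bits):
    """Clamp an unbounded integer to the range 0 .. 2^bits - 1."""
    return max(0, min(value, (1 << bits) - 1))


def uqadd16(rn, rm):
    """Model of UQADD16 on two 32-bit register values."""
    sum1 = (rn & 0xFFFF) + (rm & 0xFFFF)                  # low halfwords
    sum2 = ((rn >> 16) & 0xFFFF) + ((rm >> 16) & 0xFFFF)  # high halfwords
    return (unsigned_sat(sum2, 16) << 16) | unsigned_sat(sum1, 16)
```

With inputs 0xFFFF0001 and 0x00010002, the high halfword sum overflows and saturates to 0xFFFF, while the low halfword sum is simply 3.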
F5.1.272 UQADD8
Unsigned Saturating Add 8 performs four unsigned 8-bit integer additions, saturates the results to the 8-bit unsigned
integer range 0 <= x <= 2^8 - 1, and writes the results to the destination register.
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 1 0 | Rn | Rd | (1)(1)(1)(1) | 1 0 0 1 | Rm |
  cond
A1 variant
UQADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 0 0 | Rn | 1 1 1 1 | Rd | 0 1 0 1 | Rm |
T1 variant
UQADD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<7:0>) + UInt(R[m]<7:0>);
    sum2 = UInt(R[n]<15:8>) + UInt(R[m]<15:8>);
    sum3 = UInt(R[n]<23:16>) + UInt(R[m]<23:16>);
    sum4 = UInt(R[n]<31:24>) + UInt(R[m]<31:24>);
    R[d]<7:0> = UnsignedSat(sum1, 8);
    R[d]<15:8> = UnsignedSat(sum2, 8);
    R[d]<23:16> = UnsignedSat(sum3, 8);
    R[d]<31:24> = UnsignedSat(sum4, 8);
F5.1.273 UQASX
Unsigned Saturating Add and Subtract with Exchange exchanges the two halfwords of the second operand,
performs one unsigned 16-bit integer addition and one unsigned 16-bit subtraction, saturates the results to the 16-bit
unsigned integer range 0 <= x <= 2^16 - 1, and writes the results to the destination register.
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 1 0 | Rn | Rd | (1)(1)(1)(1) | 0 0 1 1 | Rm |
  cond
A1 variant
UQASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 0 1 0 | Rn | 1 1 1 1 | Rd | 0 1 0 1 | Rm |
T1 variant
UQASX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    diff = UInt(R[n]<15:0>) - UInt(R[m]<31:16>);
    sum = UInt(R[n]<31:16>) + UInt(R[m]<15:0>);
    R[d]<15:0> = UnsignedSat(diff, 16);
    R[d]<31:16> = UnsignedSat(sum, 16);
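
The halfword exchange in UQASX is easy to misread, so the following Python sketch may help. It is an informal model only; the function name `uqasx` is an assumption made for this example.

```python
def uqasx(rn, rm):
    """Model of UQASX: the low result is Rn.lo - Rm.hi, the high result
    is Rn.hi + Rm.lo, each saturated to 0 .. 0xFFFF."""
    diff = (rn & 0xFFFF) - ((rm >> 16) & 0xFFFF)
    total = ((rn >> 16) & 0xFFFF) + (rm & 0xFFFF)
    low = max(0, min(diff, 0xFFFF))    # UnsignedSat(diff, 16)
    high = max(0, min(total, 0xFFFF))  # UnsignedSat(sum, 16)
    return (high << 16) | low
```

A negative difference saturates to 0 and an overflowing sum saturates to 0xFFFF, so neither halfword wraps around.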
F5.1.274 UQSAX
Unsigned Saturating Subtract and Add with Exchange exchanges the two halfwords of the second operand,
performs one unsigned 16-bit integer subtraction and one unsigned 16-bit addition, saturates the results to the 16-bit
unsigned integer range 0 <= x <= 2^16 - 1, and writes the results to the destination register.
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 1 0 | Rn | Rd | (1)(1)(1)(1) | 0 1 0 1 | Rm |
  cond
A1 variant
UQSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 1 0 | Rn | 1 1 1 1 | Rd | 0 1 0 1 | Rm |
T1 variant
UQSAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations();
    sum = UInt(R[n]<15:0>) + UInt(R[m]<31:16>);
    diff = UInt(R[n]<31:16>) - UInt(R[m]<15:0>);
    R[d]<15:0> = UnsignedSat(sum, 16);
    R[d]<31:16> = UnsignedSat(diff, 16);
F5.1.275 UQSUB16


Unsigned Saturating Subtract 16 performs two unsigned 16-bit integer subtractions, saturates the results to the
16-bit unsigned integer range 0 <= x <= 2^16 - 1, and writes the results to the destination register.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 1 0 | Rn | Rd | (1)(1)(1)(1) | 0 1 1 1 | Rm |
  cond

A1 variant
UQSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 1 | Rn | 1 1 1 1 | Rd | 0 1 0 1 | Rm |

T1 variant
UQSUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = UInt(R[n]<15:0>) - UInt(R[m]<15:0>);
    diff2 = UInt(R[n]<31:16>) - UInt(R[m]<31:16>);
    R[d]<15:0> = UnsignedSat(diff1, 16);
    R[d]<31:16> = UnsignedSat(diff2, 16);
F5.1.276 UQSUB8


Unsigned Saturating Subtract 8 performs four unsigned 8-bit integer subtractions, saturates the results to the 8-bit
unsigned integer range 0 <= x <= 2^8 - 1, and writes the results to the destination register.


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 1 0 | Rn | Rd | (1)(1)(1)(1) | 1 1 1 1 | Rm |
  cond

A1 variant
UQSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 0 | Rn | 1 1 1 1 | Rd | 0 1 0 1 | Rm |

T1 variant
UQSUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = UInt(R[n]<7:0>) - UInt(R[m]<7:0>);
    diff2 = UInt(R[n]<15:8>) - UInt(R[m]<15:8>);
    diff3 = UInt(R[n]<23:16>) - UInt(R[m]<23:16>);
    diff4 = UInt(R[n]<31:24>) - UInt(R[m]<31:24>);
    R[d]<7:0> = UnsignedSat(diff1, 8);
    R[d]<15:8> = UnsignedSat(diff2, 8);
    R[d]<23:16> = UnsignedSat(diff3, 8);
    R[d]<31:24> = UnsignedSat(diff4, 8);
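
The four byte lanes of UQSUB8 follow the same pattern, so they can be modelled with a loop. This Python sketch is illustrative only; `uqsub8` is a hypothetical name, and the loop is a restructuring of the pseudocode above rather than a copy of it.

```python
def uqsub8(rn, rm):
    """Model of UQSUB8: per-byte unsigned subtraction, saturated at 0."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        diff = ((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF)
        # The difference of two bytes never exceeds 255, so only the
        # lower bound of UnsignedSat(diff, 8) can take effect.
        result |= max(diff, 0) << shift
    return result
```

Only the top byte of 0x10200005 exceeds the matching byte of 0x01300006, so every other lane saturates to 0.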
F5.1 Alphabetical list of T32 and A32 base instruction set instructions


F5.1.277 USAD8


Unsigned Sum of Absolute Differences performs four unsigned 8-bit subtractions, and adds the absolute values of 
the differences together. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 14 13 12|11 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 1 | 1 0 0 0 | Rd | 1 1 1 1 | Rm | 0 0 0 1 | Rn |
  cond

A1 variant
USAD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 0 1 1 1 | Rn | 1 1 1 1 | Rd | 0 0 0 0 | Rm |

T1 variant
USAD8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


F5-3196
Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775
Non-Confidential ID092916

F5 T32 and A32 Base Instruction Set Instruction Descriptions


Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    absdiff1 = Abs(UInt(R[n]<7:0>) - UInt(R[m]<7:0>));
    absdiff2 = Abs(UInt(R[n]<15:8>) - UInt(R[m]<15:8>));
    absdiff3 = Abs(UInt(R[n]<23:16>) - UInt(R[m]<23:16>));
    absdiff4 = Abs(UInt(R[n]<31:24>) - UInt(R[m]<31:24>));
    result = absdiff1 + absdiff2 + absdiff3 + absdiff4;
    R[d] = result<31:0>;
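
The USAD8 sum of absolute differences can be modelled compactly. The Python sketch below is an informal illustration; the name `usad8` is an assumption made for this example.

```python
def usad8(rn, rm):
    """Model of USAD8: sum of the absolute differences of the four
    corresponding bytes of two 32-bit values."""
    return sum(
        abs(((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF))
        for shift in (0, 8, 16, 24)
    )
```

The largest possible result is 4 * 255 = 1020, so the truncation to result<31:0> in the pseudocode never discards any bits.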
F5.1.278 USADA8 


Unsigned Sum of Absolute Differences and Accumulate performs four unsigned 8-bit subtractions, and adds the 
absolute values of the differences to a 32-bit accumulate operand. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 1 | 1 0 0 0 | Rd | != 1111 | Rm | 0 0 0 1 | Rn |
  cond                               Ra

A1 variant
USADA8{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>
Decode for this encoding
if Ra == '1111' then SEE USAD8;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 1 | 0 1 1 1 | Rn | != 1111 | Rd | 0 0 0 0 | Rm |
                                     Ra

T1 variant
USADA8{<c>}{<q>} <Rd>, <Rn>, <Rm>, <Ra>
Decode for this encoding
if Ra == '1111' then SEE USAD8;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<Ra> Is the third general-purpose source register holding the addend, encoded in the "Ra" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    absdiff1 = Abs(UInt(R[n]<7:0>) - UInt(R[m]<7:0>));
    absdiff2 = Abs(UInt(R[n]<15:8>) - UInt(R[m]<15:8>));
    absdiff3 = Abs(UInt(R[n]<23:16>) - UInt(R[m]<23:16>));
    absdiff4 = Abs(UInt(R[n]<31:24>) - UInt(R[m]<31:24>));
    result = UInt(R[a]) + absdiff1 + absdiff2 + absdiff3 + absdiff4;
    R[d] = result<31:0>;
F5.1.279 USAT 


Unsigned Saturate saturates an optionally-shifted signed value to a selected unsigned range. 


This instruction sets PSTATE.Q to 1 if the operation saturates. 


A1 
|31 28|27 26 25 24 23 22 21|20 16|15 12|11 7|6|5 4|3 0|
| != 1111 | 0 1 1 0 1 1 1 | sat_imm | Rd | imm5 | sh | 0 1 | Rn |


cond 


Arithmetic shift right variant 
Applies when sh == 1. 


USAT{<c>}{<q>} <Rd>, #<imm>, <Rn>, ASR #<amount> 


Logical shift left variant 
Applies when sh == 0.


USAT{<c>}{<q>} <Rd>, #<imm>, <Rn> {, LSL #<amount>} 


Decode for all variants of this encoding 
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm); 


(shift_t, shift_n) = DecodeImmShift(sh:'0', imm5);
if d == 15 || n == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12 11|10|9 8|7 6|5|4|3 0|15|14 12|11 8|7 6|5|4 0|
| 1 1 1 1 0 | (0) | 1 1 | 1 0 | sh | 0 | Rn | 0 | imm3 | Rd | imm2 | (0) | sat_imm |





Arithmetic shift right variant 
Applies when sh == 1 && !(imm3 == 000 && imm2 == 00). 


USAT{<c>}{<q>} <Rd>, #<imm>, <Rn>, ASR #<amount> 


Logical shift left variant 
Applies when sh == 0.


USAT{<c>}{<q>} <Rd>, #<imm>, <Rn> {, LSL #<amount>} 


Decode for all variants of this encoding 


if sh == '1' && (imm3:imm2) == '00000' then SEE USAT16;
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm);
(shift_t, shift_n) = DecodeImmShift(sh:'0', imm3:imm2);

if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 







Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<imm> Is the bit position for saturation, in the range 0 to 31, encoded in the "sat_imm" field. 

<Rn> Is the general-purpose source register, encoded in the "Rn" field. 

<amount> For encoding A1, logical shift left variant: is the optional shift amount, in the range 0 to 31, defaulting to 0
and encoded in the "imm5" field.

For encoding A1, arithmetic shift right variant: is the shift amount, in the range 1 to 32, encoded in the "imm5"
field as <amount> modulo 32.

For encoding T1, logical shift left variant: is the optional shift amount, in the range 0 to 31, defaulting to 0
and encoded in the "imm3:imm2" field.

For encoding T1, arithmetic shift right variant: is the shift amount, in the range 1 to 31, encoded in the
"imm3:imm2" field as <amount>.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    operand = Shift(R[n], shift_t, shift_n, PSTATE.C); // PSTATE.C ignored
    (result, sat) = UnsignedSatQ(SInt(operand), saturate_to);
    R[d] = ZeroExtend(result, 32);
    if sat then
        PSTATE.Q = '1';
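
The USAT operation above combines a shift, a signed reinterpretation, and an unsigned clamp. The Python sketch below is an informal model under stated assumptions: the names `usat`, `to_signed32`, and `unsigned_sat_q`, and the string values "LSL" and "ASR" for the shift type, are invented for this example. The function returns the saturation indication instead of updating PSTATE.Q.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer (SInt)."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def unsigned_sat_q(value, bits):
    """Clamp to 0 .. 2^bits - 1 and report whether clamping occurred."""
    upper = (1 << bits) - 1
    if value > upper:
        return upper, True
    if value < 0:
        return 0, True
    return value, False


def usat(rn, sat_imm, shift_type="LSL", amount=0):
    """Model of USAT. LSL takes amount 0..31, ASR takes amount 1..32."""
    if shift_type == "LSL":
        operand = (rn << amount) & MASK32
    else:  # "ASR": the arithmetic shift replicates the sign bit
        operand = (to_signed32(rn) >> amount) & MASK32
    return unsigned_sat_q(to_signed32(operand), sat_imm)
```

Because the operand is treated as signed, any value with bit 31 set saturates to 0 rather than to the upper bound.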

F5.1.280 USAT16 


Unsigned Saturate 16 saturates two signed 16-bit values to a selected unsigned range. 


This instruction sets PSTATE.Q to 1 if the operation saturates. 
A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 1 1 1 0 | sat_imm | Rd | (1)(1)(1)(1) | 0 0 1 1 | Rn |
  cond

A1 variant
USAT16{<c>}{<q>} <Rd>, #<imm>, <Rn>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm);
if d == 15 || n == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12 11|10|9 8|7 6|5|4|3 0|15|14 13 12|11 8|7 6|5|4|3 0|
| 1 1 1 1 0 | (0) | 1 1 | 1 0 | 1 | 0 | Rn | 0 | 0 0 0 | Rd | 0 0 | (0) | (0) | sat_imm |

T1 variant
USAT16{<c>}{<q>} <Rd>, #<imm>, <Rn>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); saturate_to = UInt(sat_imm);
if d == 15 || n == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<imm> Is the bit position for saturation, in the range 0 to 15, encoded in the "sat_imm" field. 
<Rn> Is the general-purpose source register, encoded in the "Rn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    (result1, sat1) = UnsignedSatQ(SInt(R[n]<15:0>), saturate_to);
    (result2, sat2) = UnsignedSatQ(SInt(R[n]<31:16>), saturate_to);
    R[d]<15:0> = ZeroExtend(result1, 16);
    R[d]<31:16> = ZeroExtend(result2, 16);
    if sat1 || sat2 then
        PSTATE.Q = '1';
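
As an informal illustration of USAT16, the Python sketch below saturates each signed halfword independently and reports whether either lane saturated. The name `usat16` and the returned boolean (standing in for the PSTATE.Q update) are assumptions made for this example.

```python
def usat16(rn, sat_imm):
    """Model of USAT16: clamp each signed 16-bit lane to 0 .. 2^sat_imm - 1.

    Returns (result, saturated), where saturated is True if either lane
    had to be clamped.
    """
    upper = (1 << sat_imm) - 1
    result, saturated = 0, False
    for shift in (0, 16):
        lane = (rn >> shift) & 0xFFFF
        if lane & 0x8000:          # SInt(): reinterpret as signed 16-bit
            lane -= 0x10000
        clamped = max(0, min(lane, upper))
        saturated = saturated or clamped != lane
        result |= clamped << shift
    return result, saturated
```

With sat_imm equal to 8, the input 0x7FFF8000 saturates in both lanes: the high lane clamps to 0x00FF and the negative low lane clamps to 0.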
F5.1.281 USAX


Unsigned Subtract and Add with Exchange exchanges the two halfwords of the second operand, performs one 
unsigned 16-bit integer subtraction and one unsigned 16-bit addition, and writes the results to the destination 
register. It sets PSTATE.GE according to the results. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 0 1 | Rn | Rd | (1)(1)(1)(1) | 0 1 0 1 | Rm |
  cond

A1 variant
USAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 1 0 | Rn | 1 1 1 1 | Rd | 0 1 0 0 | Rm |

T1 variant
USAX{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    sum = UInt(R[n]<15:0>) + UInt(R[m]<31:16>);
    diff = UInt(R[n]<31:16>) - UInt(R[m]<15:0>);
    R[d]<15:0> = sum<15:0>;
    R[d]<31:16> = diff<15:0>;
    PSTATE.GE<1:0> = if sum >= 0x10000 then '11' else '00';
    PSTATE.GE<3:2> = if diff >= 0 then '11' else '00';
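
The GE flag rules for USAX differ between the two halfwords: the addition sets its GE bits on carry out, while the subtraction sets its GE bits when no borrow occurs. The Python sketch below is an informal model; the name `usax` and the representation of PSTATE.GE as a 4-bit integer are assumptions made for this example.

```python
def usax(rn, rm):
    """Model of USAX: returns (rd, ge) where ge is a 4-bit GE value."""
    total = (rn & 0xFFFF) + ((rm >> 16) & 0xFFFF)   # low result
    diff = ((rn >> 16) & 0xFFFF) - (rm & 0xFFFF)    # high result
    rd = ((diff & 0xFFFF) << 16) | (total & 0xFFFF)
    ge = 0
    if total >= 0x10000:   # carry out of the 16-bit addition
        ge |= 0b0011
    if diff >= 0:          # no borrow in the 16-bit subtraction
        ge |= 0b1100
    return rd, ge
```

Unlike the saturating forms, the results wrap modulo 2^16, and the GE bits record what a later SEL instruction would need to know.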
F5.1.282 USUB16


Unsigned Subtract 16 performs two 16-bit unsigned integer subtractions, and writes the results to the destination 
register. It sets PSTATE.GE according to the results of the subtractions. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 0 1 | Rn | Rd | (1)(1)(1)(1) | 0 1 1 1 | Rm |
  cond

A1 variant
USUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 1 | Rn | 1 1 1 1 | Rd | 0 1 0 0 | Rm |

T1 variant
USUB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 


<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = UInt(R[n]<15:0>) - UInt(R[m]<15:0>);
    diff2 = UInt(R[n]<31:16>) - UInt(R[m]<31:16>);
    R[d]<15:0> = diff1<15:0>;
    R[d]<31:16> = diff2<15:0>;
    PSTATE.GE<1:0> = if diff1 >= 0 then '11' else '00';
    PSTATE.GE<3:2> = if diff2 >= 0 then '11' else '00';
F5.1.283 USUB8
Unsigned Subtract 8 performs four 8-bit unsigned integer subtractions, and writes the results to the destination 
register. It sets PSTATE.GE according to the results of the subtractions. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 0 1 0 1 | Rn | Rd | (1)(1)(1)(1) | 1 1 1 1 | Rm |
  cond

A1 variant
USUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 1 1 0 0 | Rn | 1 1 1 1 | Rd | 0 1 0 0 | Rm |

T1 variant
USUB8{<c>}{<q>} {<Rd>,} <Rn>, <Rm>
Decode for this encoding
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);

if d == 15 || n == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    diff1 = UInt(R[n]<7:0>) - UInt(R[m]<7:0>);
    diff2 = UInt(R[n]<15:8>) - UInt(R[m]<15:8>);
    diff3 = UInt(R[n]<23:16>) - UInt(R[m]<23:16>);
    diff4 = UInt(R[n]<31:24>) - UInt(R[m]<31:24>);
    R[d]<7:0>   = diff1<7:0>;
    R[d]<15:8>  = diff2<7:0>;
    R[d]<23:16> = diff3<7:0>;
    R[d]<31:24> = diff4<7:0>;
    PSTATE.GE<0> = if diff1 >= 0 then '1' else '0';
    PSTATE.GE<1> = if diff2 >= 0 then '1' else '0';
    PSTATE.GE<2> = if diff3 >= 0 then '1' else '0';
    PSTATE.GE<3> = if diff4 >= 0 then '1' else '0';
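
In USUB8 each byte lane drives exactly one GE bit. The Python sketch below models this informally; the name `usub8` and the 4-bit integer used for PSTATE.GE are assumptions made for this example.

```python
def usub8(rn, rm):
    """Model of USUB8: returns (rd, ge). Bit i of ge is set when byte
    lane i of rn is greater than or equal to byte lane i of rm."""
    rd, ge = 0, 0
    for lane in range(4):
        shift = 8 * lane
        diff = ((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF)
        rd |= (diff & 0xFF) << shift   # diff<7:0>, wraps on borrow
        if diff >= 0:
            ge |= 1 << lane
    return rd, ge
```

For the inputs 0x05000102 and 0x01010201, lanes 0 and 3 subtract without borrow, so GE is 0b1001 and the two middle lanes wrap to 0xFF.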
F5.1.284 UXTAB
Unsigned Extend and Add Byte extracts an 8-bit value from a register, zero-extends it to 32 bits, adds the result to 
the value in another register, and writes the final result to the destination register. The instruction can specify a 
rotation by 0, 8, 16, or 24 bits before extracting the 8-bit value. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10|9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 1 1 1 0 | != 1111 | Rd | rotate | (0)(0) | 0 1 1 1 | Rm |
  cond                          Rn
A1 variant
UXTAB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}
Decode for this encoding
if Rn == '1111' then SEE UXTB;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7|6|5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 0 1 0 1 | != 1111 | 1 1 1 1 | Rd | 1 | (0) | rotate | Rm |
                                 Rn
T1 variant
UXTAB{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}
Decode for this encoding
if Rn == '1111' then SEE UXTB;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values:
(omitted) when rotate = 00
8 when rotate = 01
16 when rotate = 10
24 when rotate = 11


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = R[n] + ZeroExtend(rotated<7:0>, 32);
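
The rotate-then-extract step of UXTAB selects which byte of Rm is added to Rn. The Python sketch below is an informal model; the names `ror32` and `uxtab` are assumptions made for this example, and the rotation argument is the amount in bits (0, 8, 16, or 24).

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def uxtab(rn, rm, rotation=0):
    """Model of UXTAB: add the selected zero-extended byte of rm to rn,
    wrapping to 32 bits."""
    byte = ror32(rm, rotation) & 0xFF
    return (rn + byte) & MASK32
```

With Rm equal to 0x12345678, rotations of 0, 8, 16, and 24 select the bytes 0x78, 0x56, 0x34, and 0x12 respectively.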
F5.1.285 UXTAB16
Unsigned Extend and Add Byte 16 extracts two 8-bit values from a register, zero-extends them to 16 bits each, adds 
the results to two 16-bit values from another register, and writes the final results to the destination register. The 
instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 8-bit values. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10|9 8|7 6 5 4|3 0|
| != 1111 | 0 1 1 0 | 1 1 0 0 | != 1111 | Rd | rotate | (0)(0) | 0 1 1 1 | Rm |
  cond                          Rn
A1 variant
UXTAB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}
Decode for this encoding
if Rn == '1111' then SEE UXTB16;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 14 13 12|11 8|7|6|5 4|3 0|
| 1 1 1 1 | 1 0 1 0 | 0 0 1 1 | != 1111 | 1 1 1 1 | Rd | 1 | (0) | rotate | Rm |
                                 Rn
T1 variant
UXTAB16{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}
Decode for this encoding
if Rn == '1111' then SEE UXTB16;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 
(omitted) when rotate = 00 
8 when rotate = 01
16 when rotate = 10
24 when rotate = 11
Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d]<15:0> = R[n]<15:0> + ZeroExtend(rotated<7:0>, 16);
    R[d]<31:16> = R[n]<31:16> + ZeroExtend(rotated<23:16>, 16);
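
The UXTAB16 operation can be sketched in Python to show that the two 16-bit additions are independent, so a carry out of the low halfword never reaches the high halfword. This is an illustrative model only, not part of the architecture pseudocode; the names ror32 and uxtab16 are hypothetical helpers that take register values as plain integers.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits (0 <= amount < 32)."""
    value &= 0xFFFFFFFF
    if amount == 0:
        return value
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def uxtab16(rn: int, rm: int, rotation: int = 0) -> int:
    """Model the UXTAB16 result for register values rn and rm."""
    if rotation not in (0, 8, 16, 24):
        raise ValueError("rotation must be 0, 8, 16, or 24")
    rotated = ror32(rm, rotation)
    # Each 16-bit lane wraps on its own; no carry crosses from bit 15 to bit 16.
    low = ((rn & 0xFFFF) + (rotated & 0xFF)) & 0xFFFF
    high = (((rn >> 16) & 0xFFFF) + ((rotated >> 16) & 0xFF)) & 0xFFFF
    return (high << 16) | low
```

For example, uxtab16(0x0001FFFF, 0x00020001) wraps the low lane to 0x0000 while the high lane becomes 0x0003.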

F5.1.286 UXTAH 
Unsigned Extend and Add Halfword extracts a 16-bit value from a register, zero-extends it to 32 bits, adds the result 
to a value from another register, and writes the final result to the destination register. The instruction can specify a 
rotation by 0, 8, 16, or 24 bits before extracting the 16-bit value. 
A1 
31-28   27-23  22  21-20  19-16   15-12  11-10   9-8     7-4   3-0
!=1111  01101  1   11     !=1111  Rd     rotate  (0)(0)  0111  Rm
cond                      Rn
A1 variant
UXTAH{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}
Decode for this encoding
if Rn == '1111' then SEE UXTH;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;
T1
15-7       6-4  3-0     | 15-12  11-8  7  6    5-4     3-0
111110100  001  !=1111  | 1111   Rd    1  (0)  rotate  Rm
                Rn
T1 variant
UXTAH{<c>}{<q>} {<Rd>,} <Rn>, <Rm> {, ROR #<amount>}
Decode for this encoding
if Rn == '1111' then SEE UXTH;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 
<Rn> Is the first general-purpose source register, encoded in the "Rn" field. 
<Rm> Is the second general-purpose source register, encoded in the "Rm" field. 
<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 
(omitted) when rotate = 00 
8 when rotate = 01
16 when rotate = 10
24 when rotate = 11
Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = R[n] + ZeroExtend(rotated<15:0>, 32);
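
As a rough illustration of how the A1 decode steps and the operation fit together, the following Python sketch decodes a 32-bit A32 word as UXTAH and models the result. It assumes the A1 field positions cond[31:28], Rn[19:16], Rd[15:12], rotate[11:10], and Rm[3:0], with fixed bits [27:20] = 01101111 and [7:4] = 0111. The names UxtahFields, decode_uxtah_a1, and uxtah are hypothetical, and UNPREDICTABLE cases are simply reported as errors here.

```python
from typing import NamedTuple


class UxtahFields(NamedTuple):
    cond: int
    n: int
    d: int
    m: int
    rotation: int


def decode_uxtah_a1(word: int) -> UxtahFields:
    """Decode a 32-bit A32 word as UXTAH (A1), following the Decode steps."""
    # Fixed opcode bits: [27:20] == 0b01101111 and [7:4] == 0b0111.
    if (word >> 20) & 0xFF != 0b01101111 or (word >> 4) & 0xF != 0b0111:
        raise ValueError("not a UXTAH A1 encoding")
    cond = (word >> 28) & 0xF
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    rotate = (word >> 10) & 0x3
    rm = word & 0xF
    if cond == 0b1111:
        raise ValueError("cond == '1111' is outside this encoding")
    if rn == 0b1111:
        raise ValueError("Rn == '1111': SEE UXTH")
    if rd == 15 or rm == 15:
        raise ValueError("UNPREDICTABLE: d == 15 or m == 15")
    # rotation = UInt(rotate:'000'), that is rotate * 8.
    return UxtahFields(cond, rn, rd, rm, rotate << 3)


def uxtah(rn_value: int, rm_value: int, rotation: int) -> int:
    """Model R[d] = R[n] + ZeroExtend(rotated<15:0>, 32) with 32-bit wraparound."""
    rm_value &= 0xFFFFFFFF
    rotated = ((rm_value >> rotation) | (rm_value << (32 - rotation))) & 0xFFFFFFFF
    return (rn_value + (rotated & 0xFFFF)) & 0xFFFFFFFF
```

Under these assumptions, the word 0xE6F10472 decodes as UXTAH R0, R1, R2, ROR #8 executed under the AL condition.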

F5.1.287 UXTB 
Unsigned Extend Byte extracts an 8-bit value from a register, zero-extends it to 32 bits, and writes the result to the 
destination register. The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 8-bit value. 
A1 
31-28   27-23  22  21-20  19-16   15-12  11-10   9-8     7-4   3-0
!=1111  01101  1   10     1111    Rd     rotate  (0)(0)  0111  Rm
cond
A1 variant
UXTB{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;
T1
15-8      7  6  5-3  2-0
10110010  1  1  Rm   Rd
T1 variant
UXTB{<c>}{<q>} {<Rd>,} <Rm>
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = 0;
T2
15-7       6-4  3-0     | 15-12  11-8  7  6    5-4     3-0
111110100  101  1111    | 1111   Rd    1  (0)  rotate  Rm
T2 variant
UXTB{<c>}.W {<Rd>,} <Rm> // <Rd>, <Rm> can be represented in T1
UXTB{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols
<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rm> Is the general-purpose source register, encoded in the "Rm" field.
<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values:
(omitted) when rotate = 00
8 when rotate = 01
16 when rotate = 10
24 when rotate = 11


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = ZeroExtend(rotated<7:0>, 32);
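
The relationship between the assembler <amount>, the 2-bit rotate field, and the byte that UXTB extracts can be sketched as follows. This is a minimal Python model under the mapping listed in the Assembler symbols; encode_rotate and uxtb are hypothetical names, and None stands for an omitted ROR operand.

```python
from typing import Optional


def encode_rotate(amount: Optional[int]) -> int:
    """Map the assembler <amount> to the 2-bit rotate field."""
    if amount is None:          # ROR operand omitted
        return 0b00
    if amount in (8, 16, 24):
        return amount >> 3      # 8 -> 01, 16 -> 10, 24 -> 11
    raise ValueError("amount must be omitted, 8, 16, or 24")


def uxtb(rm_value: int, rotate_field: int) -> int:
    """Model R[d] = ZeroExtend(ROR(R[m], rotation)<7:0>, 32)."""
    rotation = rotate_field << 3            # UInt(rotate:'000')
    value = rm_value & 0xFFFFFFFF
    rotated = ((value >> rotation) | (value << (32 - rotation))) & 0xFFFFFFFF
    return rotated & 0xFF
```

In effect, the rotate field selects which of the four bytes of Rm is zero-extended into Rd.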
F5.1.288 UXTB16


Unsigned Extend Byte 16 extracts two 8-bit values from a register, zero-extends them to 16 bits each, and writes 
the results to the destination register. The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting 
the 8-bit values. 


A1 


31-28   27-23  22  21-20  19-16   15-12  11-10   9-8     7-4   3-0
!=1111  01101  1   00     1111    Rd     rotate  (0)(0)  0111  Rm
cond
A1 variant
UXTB16{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;
T1
15-7       6-4  3-0     | 15-12  11-8  7  6    5-4     3-0
111110100  011  1111    | 1111   Rd    1  (0)  rotate  Rm
T1 variant
UXTB16{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rd> Is the general-purpose destination register, encoded in the "Rd" field. 

<Rm> For encoding A1: is the general-purpose source register, encoded in the "Rm" field. 


For encoding T1: is the second general-purpose source register, encoded in the "Rm" field. 


<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values: 


(omitted) when rotate = 00
8 when rotate = 01
16 when rotate = 10
24 when rotate = 11

Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d]<15:0> = ZeroExtend(rotated<7:0>, 16);
    R[d]<31:16> = ZeroExtend(rotated<23:16>, 16);
F5.1.289 UXTH
Unsigned Extend Halfword extracts a 16-bit value from a register, zero-extends it to 32 bits, and writes the result to 
the destination register. The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 16-bit 
value. 
A1 
31-28   27-23  22  21-20  19-16   15-12  11-10   9-8     7-4   3-0
!=1111  01101  1   11     1111    Rd     rotate  (0)(0)  0111  Rm
cond
A1 variant
UXTH{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE;
T1
15-8      7  6  5-3  2-0
10110010  1  0  Rm   Rd
T1 variant
UXTH{<c>}{<q>} {<Rd>,} <Rm>
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = 0;
T2
15-7       6-4  3-0     | 15-12  11-8  7  6    5-4     3-0
111110100  001  1111    | 1111   Rd    1  (0)  rotate  Rm
T2 variant
UXTH{<c>}.W {<Rd>,} <Rm> // <Rd>, <Rm> can be represented in T1
UXTH{<c>}{<q>} {<Rd>,} <Rm> {, ROR #<amount>}
Decode for this encoding
d = UInt(Rd); m = UInt(Rm); rotation = UInt(rotate:'000');
if d == 15 || m == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols
<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Rd> Is the general-purpose destination register, encoded in the "Rd" field.
<Rm> Is the general-purpose source register, encoded in the "Rm" field.
<amount> Is the rotate amount, encoded in the "rotate" field. It can have the following values:
(omitted) when rotate = 00
8 when rotate = 01
16 when rotate = 10
24 when rotate = 11


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    rotated = ROR(R[m], rotation);
    R[d] = ZeroExtend(rotated<15:0>, 32);
F5.1.290 WFE


Wait For Event is a hint instruction that permits the PE to enter a low-power state until one of a number of events 
occurs, including events signaled by executing the SEV instruction on any PE in the multiprocessor system. For more 
information, see Wait For Event and Send Event on page G1-3872. 


As described in Wait For Event and Send Event on page G1-3872, the execution of a WFE instruction that would 
otherwise cause entry to a low-power state can be trapped to a higher Exception level, see: 


° Traps to Undefined mode of EL0 execution of WFE and WFI instructions on page G1-3888.
° Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page G1-3904.
° Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than Monitor mode on
page G1-3915.


A1 


31-28   27-23  22  21-20  19-16  15-12         11-8          7-0
!=1111  00110  0   10     0000   (1)(1)(1)(1)  (0)(0)(0)(0)  00000010
cond

A1 variant


WFE{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T1 


15-8      7-4   3-0
10111111  0010  0000


T1 variant 


WFE{<c>}{<q>} 


Decode for this encoding 


// No additional decoding required 


T2 


15-4          3-0           | 15-14  13   12  11   10-8  7-0
111100111010  (1)(1)(1)(1)  | 10     (0)  0   (0)  000   00000010


T2 variant 


WFE{<c>}.W 


Decode for this encoding 


// No additional decoding required 
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if EventRegistered() then
        ClearEventRegister();
    else
        if PSTATE.EL == EL0 then
            // Check for traps described by the OS.
            AArch32.CheckForWFxTrap(EL1, TRUE);
        if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} then
            // Check for traps described by the Hypervisor.
            AArch32.CheckForWFxTrap(EL2, TRUE);
        if HaveEL(EL3) && PSTATE.M != M32_Monitor then
            // Check for traps described by the Secure Monitor.
            AArch32.CheckForWFxTrap(EL3, TRUE);
        WaitForEvent();
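
The control flow of this operation can be illustrated with a small Python model. It is a simplified sketch, not the architectural definition: EventRegisterModel and wfx_trap_checks are hypothetical names, the trap checks only report which Exception levels would be consulted and in what order, and whether a trap is actually taken depends on control bits that are not modeled here.

```python
from typing import List


class EventRegisterModel:
    """Toy model of the per-PE Event Register as used by WFE."""

    def __init__(self) -> None:
        self.event_registered = False

    def send_event(self) -> None:
        # Stands in for an event such as SEV reaching this PE.
        self.event_registered = True

    def wfe(self) -> str:
        if self.event_registered:
            self.event_registered = False   # ClearEventRegister()
            return "continue"
        return "wait"                       # trap checks, then WaitForEvent()


def wfx_trap_checks(el: int, secure: bool, have_el2: bool,
                    have_el3: bool, monitor_mode: bool) -> List[str]:
    """List, in pseudocode order, the levels whose trap controls are checked."""
    checks = []
    if el == 0:
        checks.append("EL1")
    if have_el2 and not secure and el in (0, 1):
        checks.append("EL2")
    if have_el3 and not monitor_mode:
        checks.append("EL3")
    return checks
```

A WFE executed while the Event Register is set only clears the register and completes; it is the following WFE that can wait or be trapped.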
F5.1.291 WFI
Wait For Interrupt is a hint instruction that permits the PE to enter a low-power state until one of a number of 
asynchronous events occurs. For more information, see Wait For Interrupt on page G1-3875. 
As described in Wait For Interrupt on page G1-3875, the execution of a WFI instruction that would otherwise cause 
entry to a low-power state can be trapped to a higher Exception level, see: 
° Traps to Undefined mode of EL0 execution of WFE and WFI instructions on page G1-3888.
° Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page G1-3904.
° Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than Monitor mode on 
page G1-3915. 

A1 

31-28   27-23  22  21-20  19-16  15-12         11-8          7-0
!=1111  00110  0   10     0000   (1)(1)(1)(1)  (0)(0)(0)(0)  00000011
cond

A1 variant
WFI{<c>}{<q>} 
Decode for this encoding 

// No additional decoding required 
T1 

15-8      7-4   3-0
10111111  0011  0000

T1 variant 
WFI{<c>}{<q>} 
Decode for this encoding 

// No additional decoding required 
T2 

15-4          3-0           | 15-14  13   12  11   10-8  7-0
111100111010  (1)(1)(1)(1)  | 10     (0)  0   (0)  000   00000011

T2 variant 
WFI{<c>}.W 
Decode for this encoding 

// No additional decoding required 
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    if !InterruptPending() then
        if PSTATE.EL == EL0 then
            // Check for traps described by the OS.
            AArch32.CheckForWFxTrap(EL1, FALSE);
        if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} then
            // Check for traps described by the Hypervisor.
            AArch32.CheckForWFxTrap(EL2, FALSE);
        if HaveEL(EL3) && PSTATE.M != M32_Monitor then
            // Check for traps described by the Secure Monitor.
            AArch32.CheckForWFxTrap(EL3, FALSE);
        WaitForInterrupt();
F5.1.292 YIELD

YIELD is a hint instruction. Software with a multithreading capability can use a YIELD instruction to indicate to the 
PE that it is performing a task, for example a spin-lock, that could be swapped out to improve overall system 
performance. The PE can use this hint to suspend and resume multiple software threads if it supports the capability. 
For more information about the recommended use of this instruction see The Yield instruction on page F1-2385. 
A1 

31-28   27-23  22  21-20  19-16  15-12         11-8          7-0
!=1111  00110  0   10     0000   (1)(1)(1)(1)  (0)(0)(0)(0)  00000001
cond

A1 variant
YIELD{<c>}{<q>} 
Decode for this encoding 

// No additional decoding required 
T1 

15-8      7-4   3-0
10111111  0001  0000

T1 variant 
YIELD{<c>}{<q>} 
Decode for this encoding 

// No additional decoding required 
T2 

15-4          3-0           | 15-14  13   12  11   10-8  7-0
111100111010  (1)(1)(1)(1)  | 10     (0)  0   (0)  000   00000001

T2 variant 
YIELD{<c>}.W 
Decode for this encoding 

// No additional decoding required 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();
    Hint_Yield();
F5.2 Encoding and use of Banked register transfer instructions


Software executing at EL1 or higher can use the MRS (Banked register) and MSR (Banked register) instructions to 
transfer values between the general-purpose registers and Special-purpose registers. One particular use of these 
instructions is for a hypervisor to save or restore the register values of a Guest OS. The following sections give more 
information about these instructions: 


° Register arguments in the Banked register transfer instructions. 

° Usage restrictions on the Banked register transfer instructions on page F5-3229. 

° Encoding the register argument in the Banked register transfer instructions on page F5-3230. 
° Pseudocode support for the Banked register transfer instructions on page F5-3231. 


For descriptions of the instructions see MRS (Banked register) on page F5-2832 and MSR (Banked register) on 
page F5-2836. 


F5.2.1 Register arguments in the Banked register transfer instructions 


Figure F5-1 shows the Banked general-purpose registers and Special-purpose registers: 


                   Associated PE mode
                   User or
                   System    Hyp       Supervisor  Abort     Undefined  Monitor   IRQ       FIQ
General-purpose    R8_usr                                                                   R8_fiq
registers          R9_usr                                                                   R9_fiq
                   R10_usr                                                                  R10_fiq
                   R11_usr                                                                  R11_fiq
                   R12_usr                                                                  R12_fiq
                   SP_usr    SP_hyp    SP_svc      SP_abt    SP_und     SP_mon    SP_irq    SP_fiq
                   LR_usr              LR_svc      LR_abt    LR_und     LR_mon    LR_irq    LR_fiq
Special-purpose              SPSR_hyp  SPSR_svc    SPSR_abt  SPSR_und   SPSR_mon  SPSR_irq  SPSR_fiq
registers                    ELR_hyp

For the general-purpose registers, if no other register is shown, the current mode register is the _usr register.
So, for example, the full set of current mode registers, including the registers that are not banked:
* For Hyp mode, is {R0_usr - R12_usr, SP_hyp, LR_usr, SPSR_hyp, ELR_hyp}.
* For Abort mode, is {R0_usr - R12_usr, SP_abt, LR_abt, SPSR_abt}.

Figure F5-1 Banking of general-purpose and Special-purpose registers



Figure F5-1 is based on Figure G1-2 on page G1-3799, that shows the complete set of general-purpose registers and 
Special-purpose registers accessible in each mode. 
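
The banking rules summarized by Figure F5-1 can be expressed as a short Python sketch that lists the current mode registers for a given PE mode. It is illustrative only: MODE_SUFFIX and current_mode_registers are hypothetical names, and the mode strings are chosen here for readability.

```python
from typing import List

# Register-name suffix used for each PE mode; System shares the User registers.
MODE_SUFFIX = {
    "User": "usr", "System": "usr", "Hyp": "hyp", "Supervisor": "svc",
    "Abort": "abt", "Undefined": "und", "Monitor": "mon", "IRQ": "irq",
    "FIQ": "fiq",
}


def current_mode_registers(mode: str) -> List[str]:
    """Return the current mode register names, following Figure F5-1."""
    suffix = MODE_SUFFIX[mode]
    regs = [f"R{i}_usr" for i in range(8)]           # R0-R7 are not Banked
    high = "fiq" if suffix == "fiq" else "usr"       # R8-R12: only FIQ has its own copy
    regs += [f"R{i}_{high}" for i in range(8, 13)]
    regs.append(f"SP_{suffix}")
    regs.append("LR_usr" if suffix == "hyp" else f"LR_{suffix}")  # Hyp has no Banked LR
    if suffix != "usr":
        regs.append(f"SPSR_{suffix}")                # no SPSR in User or System mode
    if suffix == "hyp":
        regs.append("ELR_hyp")
    return regs
```

Calling current_mode_registers("Hyp") reproduces the Hyp mode set given with the figure: R0_usr to R12_usr, then SP_hyp, LR_usr, SPSR_hyp, and ELR_hyp.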





Note 


° System mode uses the same set of registers as User mode. Neither of these modes can access an SPSR, except 
that System mode can use the MRS (Banked register) and MSR (Banked register) instructions to access some 
SPSRs, as described in Usage restrictions on the Banked register transfer instructions on page F5-3229. 


° General-purpose registers R0-R7, that are not Banked, cannot be accessed using the MRS (Banked register)
and MSR (Banked register) instructions.

° In addition to the registers shown in Figure F5-1, the DLR and DSPSR are AArch32 System registers that
map onto the AArch64 Special-purpose registers DLR_EL0 and DSPSR_EL0. However, DLR and DSPSR
are not accessible using the MRS (Banked register) and MSR (Banked register) instructions.





Software using an MRS (Banked register) or MSR (Banked register) instruction specifies one of these registers using a 
name shown in Figure F5-1, or an alternative name for SP or LR. These registers can be grouped as follows: 


R8-R12 Each of these registers has two Banked copies, _usr and _fiq, for example R8_usr and R8_fiq. 

SP There is a Banked copy of SP for every mode except System mode. For example, SP_svc is the SP
for Supervisor mode. 


LR There is a Banked copy of LR for every mode except System mode and Hyp mode. For example, 
LR_svc is the LR for Supervisor mode. 


SPSR There is a Banked copy of SPSR for every mode except System mode and User mode. 


ELR_hyp Except for the operations provided by MRS (Banked register) and MSR (Banked register), ELR_hyp is 
accessible only from Hyp mode. It is not Banked.

F5.2.2 Usage restrictions on the Banked register transfer instructions 
MRS (Banked register) and MSR (Banked register) instructions are CONSTRAINED UNPREDICTABLE if any of the 
following applies: 
° The instruction is executed in User mode. 
° The instruction accesses a Banked register that is not implemented, or that either: 
— Is not accessible from the current Privilege level and Security state. 
— Can be accessed from the current mode by using a different instruction. 
MSR/MRS Banked registers on page K1-5477 describes the permitted CONSTRAINED UNPREDICTABLE behavior. 
An MRS (Banked register) or an MSR (Banked register) executed: 
° At Non-secure EL1 cannot access any Hyp mode Banked registers. 
° At Non-secure EL1 or EL2 cannot access any Monitor mode Banked registers. 
° In a Secure mode other than Monitor mode cannot access any Hyp Banked registers. 
This means that the Banked registers that MRS (Banked register) and MSR (Banked register) instructions cannot access 
are: 
From Monitor mode 
° The current mode registers R8_usr-R12_usr, SP_mon, LR_mon, and SPSR_mon. 
From Hyp mode 
° The Monitor mode registers SP_mon, LR_mon, and SPSR_mon. 
° The current mode registers R8_usr-R12_usr, SP_hyp, LR_usr, and SPSR_hyp. 
Note
MRS (Banked register) and MSR (Banked register) instructions can access the current mode register
ELR_hyp.
From FIQ mode
° From Non-secure EL1, the Monitor mode registers SP_mon, LR_mon, and SPSR_mon.
° The Hyp mode registers SP_hyp, SPSR_hyp, and ELR_hyp.
° The current mode registers R8_fiq-R12_fiq, SP_fiq, LR_fiq, and SPSR_fiq.
From System mode
° From Non-secure EL1, the Monitor mode registers SP_mon, LR_mon, and SPSR_mon.
° The Hyp mode registers SP_hyp, SPSR_hyp, and ELR_hyp.
° The current mode registers R8_usr-R12_usr, SP_usr, and LR_usr.
From Supervisor mode, Abort mode, Undefined mode, and IRQ mode
° From Non-secure EL1, the Monitor mode registers SP_mon, LR_mon, and SPSR_mon.
° The Hyp mode registers SP_hyp, SPSR_hyp, and ELR_hyp.
° The current mode registers R8_usr-R12_usr, SP_<current_mode>, LR_<current_mode>,
and SPSR_<current_mode>.
If EL3 is using AArch64, all MRS (Banked register) and MSR (Banked register) accesses to the Monitor mode registers
from Secure EL1 modes are trapped to EL3. See Traps to EL3 of Secure monitor functionality from Secure EL1 
using AArch32 on page D1-1590. 


For more information, see: 

° Encoding the register argument in the Banked register transfer instructions. 

° Pseudocode support for the Banked register transfer instructions on page F5-3231. 
° MRS (Banked register) on page F5-2832. 

° MSR (Banked register) on page F5-2836. 





Note 


CONSTRAINED UNPREDICTABLE behavior must not give access to registers that are not accessible from the current 
Privilege level and Security state.

F5.2.3 Encoding the register argument in the Banked register transfer instructions 
The MRS (Banked register) and MSR (Banked register) instructions include a 5-bit field, SYSm, and an R bit, that 
together encode the register argument for the instruction. 
When the R bit is set to 0, the argument is a register other than a Banked copy of the SPSR, and Table F5-1 shows 
how the SYSm field defines the required register argument. In this table, CONST. UNPREDICTABLE indicates that 
behavior is CONSTRAINED UNPREDICTABLE. 
Table F5-1 Banked register encodings when R==0
           SYSm<4:3>
SYSm<2:0>  0b00                  0b01                  0b10    0b11
0b000      R8_usr                R8_fiq                LR_irq  CONST. UNPREDICTABLE
0b001      R9_usr                R9_fiq                SP_irq  CONST. UNPREDICTABLE
0b010      R10_usr               R10_fiq               LR_svc  CONST. UNPREDICTABLE
0b011      R11_usr               R11_fiq               SP_svc  CONST. UNPREDICTABLE
0b100      R12_usr               R12_fiq               LR_abt  LR_mon
0b101      SP_usr                SP_fiq                SP_abt  SP_mon
0b110      LR_usr                LR_fiq                LR_und  ELR_hyp
0b111      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  SP_und  SP_hyp
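
A lookup that follows Table F5-1 can be sketched in Python as below. The table contents are taken from the table itself; the names BANKED_REG_R0 and banked_register_r0 are hypothetical, and None is used here to stand for an encoding whose behavior is CONSTRAINED UNPREDICTABLE.

```python
from typing import Optional

# Rows are indexed by SYSm<2:0>, columns by SYSm<4:3> (0b00, 0b01, 0b10, 0b11).
BANKED_REG_R0 = [
    ["R8_usr",  "R8_fiq",  "LR_irq", None],       # 0b000
    ["R9_usr",  "R9_fiq",  "SP_irq", None],       # 0b001
    ["R10_usr", "R10_fiq", "LR_svc", None],       # 0b010
    ["R11_usr", "R11_fiq", "SP_svc", None],       # 0b011
    ["R12_usr", "R12_fiq", "LR_abt", "LR_mon"],   # 0b100
    ["SP_usr",  "SP_fiq",  "SP_abt", "SP_mon"],   # 0b101
    ["LR_usr",  "LR_fiq",  "LR_und", "ELR_hyp"],  # 0b110
    [None,      None,      "SP_und", "SP_hyp"],   # 0b111
]


def banked_register_r0(sysm: int) -> Optional[str]:
    """Return the register selected by the 5-bit SYSm field when R == 0."""
    if not 0 <= sysm <= 0b11111:
        raise ValueError("SYSm is a 5-bit field")
    return BANKED_REG_R0[sysm & 0b111][(sysm >> 3) & 0b11]
```

The lookup covers only the encoding; the usage restrictions for the current mode and Security state still apply to any register it returns.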
When the R bit is set to 1, the argument is a Banked copy of the SPSR, and Table F5-2 shows how the SYSm field 
defines the required register argument. In this table, CONST. UNPREDICTABLE indicates that behavior is 
CONSTRAINED UNPREDICTABLE. 
Table F5-2 Banked register encodings when R==1
           SYSm<4:3>
SYSm<2:0>  0b00                  0b01                  0b10                  0b11
0b000      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  SPSR_irq              CONST. UNPREDICTABLE
0b001      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE
0b010      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  SPSR_svc              CONST. UNPREDICTABLE
0b011      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE
0b100      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  SPSR_abt              SPSR_mon
0b101      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE
0b110      CONST. UNPREDICTABLE  SPSR_fiq              SPSR_und              SPSR_hyp
0b111      CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE  CONST. UNPREDICTABLE
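
Because only seven encodings in Table F5-2 name a register, a sparse mapping is a convenient way to sketch the R == 1 case in Python. The names BANKED_SPSR and banked_spsr are hypothetical, and None again stands for a CONSTRAINED UNPREDICTABLE encoding.

```python
from typing import Optional

# Keys are the full 5-bit SYSm value, SYSm<4:3> followed by SYSm<2:0>.
BANKED_SPSR = {
    0b01110: "SPSR_fiq",
    0b10000: "SPSR_irq",
    0b10010: "SPSR_svc",
    0b10100: "SPSR_abt",
    0b10110: "SPSR_und",
    0b11100: "SPSR_mon",
    0b11110: "SPSR_hyp",
}


def banked_spsr(sysm: int) -> Optional[str]:
    """Return the SPSR selected by the 5-bit SYSm field when R == 1."""
    if not 0 <= sysm <= 0b11111:
        raise ValueError("SYSm is a 5-bit field")
    return BANKED_SPSR.get(sysm)
```

Every encoding absent from the mapping corresponds to a CONST. UNPREDICTABLE cell of the table.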





F5.2.4 Pseudocode support for the Banked register transfer instructions


The pseudocode functions BankedRegisterAccessValid() and SPSRaccessValid() check the validity of MRS (Banked 
register) and MSR (Banked register) accesses. That is, they filter the accesses that are CONSTRAINED UNPREDICTABLE 
either because: 


° They attempt to access a register that Usage restrictions on the Banked register transfer instructions on
page F5-3229 shows is not accessible.

° They use an SYSm<4:0> encoding that Encoding the register argument in the Banked register transfer
instructions on page F5-3230 shows as CONSTRAINED UNPREDICTABLE.


BankedRegisterAccessValid() applies to accesses to the banked general-purpose registers, or to ELR_hyp, and 
SPSRaccessValid() applies to accesses to the SPSRs. 
Chapter F6
T32 and A32 Advanced SIMD and floating-point 
Instruction Descriptions 


This chapter describes each instruction. It contains the following sections: 


° Alphabetical list of floating-point and Advanced SIMD instructions on page F6-3234. 


Note 


Some headings in this chapter use the term floating-point register. This is an abbreviated description, and means a 
register in the Advanced SIMD and floating-point register file. 

F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


This section lists every floating-point and Advanced SIMD instruction in the T32 and A32 instruction sets. For 
details of the format used see Format of instruction descriptions on page F2-2402. 


This section is formatted so that each instruction description starts on a new page. 

F6.1.1 AESD 


AES single round decryption. 
A1 


31-23      22  21-20  19-18  17-16  15-12  11-7   6  5  4  3-0
111100111  D   11     size   00     Vd     00110  1  M  0  Vm

A1 variant

AESD.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);

T1

15-7       6   5-4  3-2   1-0  | 15-12  11-7   6  5  4  3-0
111111111  D   11   size  00   | Vd     00110  1  M  0  Vm

T1 variant

AESD.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<dt> Is the data type, encoded in the "size" field. It can have the following values: 
8 when size = 00 


The following encodings are reserved:
° size = 01.
° size = 1x.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3235 


ID092916 Non-Confidential 




Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    op1 = Q[d>>1]; op2 = Q[m>>1];
    Q[d>>1] = AESInvSubBytes(AESInvShiftRows(op1 EOR op2));

F6.1.2 AESE 


AES single round encryption. 
A1 


31-23      22  21-20  19-18  17-16  15-12  11-7   6  5  4  3-0
111100111  D   11     size   00     Vd     00110  0  M  0  Vm

A1 variant

AESE.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);

T1

15-7       6   5-4  3-2   1-0  | 15-12  11-7   6  5  4  3-0
111111111  D   11   size  00   | Vd     00110  0  M  0  Vm

T1 variant

AESE.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<dt> Is the data type, encoded in the "size" field. It can have the following values: 
8 when size = 00 


The following encodings are reserved:
° size = 01.
° size = 1x.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 

Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    op1 = Q[d>>1]; op2 = Q[m>>1];
    Q[d>>1] = AESSubBytes(AESShiftRows(op1 EOR op2));
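
The AESE operation, and the matching AESD operation, can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the architectural pseudocode: a Q register is represented as 16 bytes with element 0 as the least significant byte, state byte i is taken to sit in row i % 4 and column i // 4, and the S-box is generated from the standard AES definition rather than copied from a table. All function names here are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list:
    """Build the AES S-box: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, substituted in enumerate(SBOX):
    INV_SBOX[substituted] = value


def shift_rows(state: bytes) -> bytes:
    # Row r (r = i % 4) is rotated left by r columns.
    return bytes(state[(i + 4 * (i % 4)) % 16] for i in range(16))


def inv_shift_rows(state: bytes) -> bytes:
    return bytes(state[(i - 4 * (i % 4)) % 16] for i in range(16))


def aese(op1: bytes, op2: bytes) -> bytes:
    """Model AESSubBytes(AESShiftRows(op1 EOR op2))."""
    mixed = bytes(a ^ b for a, b in zip(op1, op2))
    return bytes(SBOX[b] for b in shift_rows(mixed))


def aesd(op1: bytes, op2: bytes) -> bytes:
    """Model AESInvSubBytes(AESInvShiftRows(op1 EOR op2))."""
    mixed = bytes(a ^ b for a, b in zip(op1, op2))
    return bytes(INV_SBOX[b] for b in inv_shift_rows(mixed))
```

Note that the EOR with the second operand happens before the substitution in both instructions, so applying aesd with an all-zero second operand to the output of aese(x, k) yields x EOR k rather than x.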

F6.1.3 AESIMC 


AES inverse mix columns. 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 0 | Vd | 0 0 1 1 | 1 1 M 0 | Vm |

A1 variant

AESIMC.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);


T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 0 | Vd | 0 0 1 1 | 1 1 M 0 | Vm |

T1 variant

AESIMC.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<dt> Is the data type, encoded in the "size" field. It can have the following values:
8 when size = 00

The following encodings are reserved:
• size = 01.
• size = 1x.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    Q[d>>1] = AESInvMixColumns(Q[m>>1]);





F6.1.4 AESMC 
AES mix columns. 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 0 | Vd | 0 0 1 1 | 1 0 M 0 | Vm |

A1 variant

AESMC.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 0 | Vd | 0 0 1 1 | 1 0 M 0 | Vm |

T1 variant

AESMC.<dt> <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '00' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<dt> Is the data type, encoded in the "size" field. It can have the following values:
8 when size = 00

The following encodings are reserved:
• size = 01.
• size = 1x.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    Q[d>>1] = AESMixColumns(Q[m>>1]);
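
As an illustration of what AESMixColumns computes, the following Python sketch applies the FIPS-197 MixColumns matrix to each 4-byte column. It assumes a column-major state with byte 0 in bits<7:0> of the Q register, and the helper names (gf_mul, mix_columns, aesmc, aesimc) are hypothetical. The same routine with the coefficients 14, 11, 13, 9 gives the inverse transform used by AESIMC.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = ((a << 1) ^ 0x11B) if a & 0x80 else (a << 1)
        b >>= 1
    return r


def mix_columns(state, coeffs):
    """Multiply every column by the circulant matrix whose first row is coeffs."""
    out = bytearray(16)
    for c in range(4):
        col = state[4 * c:4 * c + 4]
        for r in range(4):
            acc = 0
            for j in range(4):
                # Row r of a circulant matrix is the first row rotated right by r.
                acc ^= gf_mul(coeffs[(j - r) % 4], col[j])
            out[4 * c + r] = acc
    return bytes(out)


def aesmc(q):
    """Model of AESMixColumns on a 16-byte state."""
    return mix_columns(q, (2, 3, 1, 1))


def aesimc(q):
    """Model of AESInvMixColumns on a 16-byte state."""
    return mix_columns(q, (14, 11, 13, 9))
```

For example, aesmc maps the column db 13 53 45 to 8e 4d a1 bc, and aesimc undoes that mapping.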


F6.1.5 FLDMDBX, FLDMIAX


FLDMX loads multiple consecutive SIMD&FP registers in the Advanced SIMD and floating-point register file from
consecutive memory locations, using an address from a general-purpose register.


ARM deprecates use of FLDMDBX and FLDMIAX, except for disassembly purposes, and reassembly of 
disassembled code. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 1|0|
| !=1111 | 1 1 0 P | U D W 1 | Rn | Vd | 1 0 1 1 | imm8<7:1> | 1 |
| cond | | | | | | | imm8<0> |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1.


FLDMDBX{<c>}{<q>} <Rn>!, <dreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


FLDMIAX{<c>}{<q>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding 

if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VLDR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE; add = (U == '1'); wback = (W == '1');
d = UInt(D:Vd); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2; // If UInt(imm8) is odd, see "FLDMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If regs == 0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VLDM with the same addressing mode but loads no registers.

If regs > 16 || (d+regs) > 16, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
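
The decode rules for this encoding can be walked through with a small Python model. It is only a sketch under stated assumptions: the function name decode_fldmx, the string return values, and the is_a32 flag (standing in for CurrentInstrSet() == InstrSet_A32) are hypothetical, and the CONSTRAINED UNPREDICTABLE choices are collapsed into a single "UNPREDICTABLE" result.

```python
def decode_fldmx(P, U, D, W, Rn, Vd, imm8, is_a32=True):
    """Classify an FLDMX encoding by following the decode pseudocode step by step."""
    if P == 0 and U == 0 and W == 0:
        return "SEE Related encodings"
    if P == 1 and W == 0:
        return "SEE VLDR"
    if P == U and W == 1:
        return "UNDEFINED"
    # Remaining combinations are PUW = 010, 011 (Increment After) and 101 (Decrement Before).
    add = (U == 1)
    wback = (W == 1)
    d = (D << 4) | Vd          # UInt(D:Vd)
    n = Rn
    imm32 = imm8 << 2          # ZeroExtend(imm8:'00', 32)
    regs = imm8 // 2           # imm8 is odd for FLDMX, so one word is left over
    if n == 15 and (wback or not is_a32):
        return "UNPREDICTABLE"
    if regs == 0 or regs > 16 or d + regs > 32:
        return "UNPREDICTABLE"
    if (imm8 & 1) and d + regs > 16:
        return "UNPREDICTABLE"
    return {"add": add, "wback": wback, "d": d, "n": n,
            "imm32": imm32, "regs": regs}
```

For a two-register list starting at D4, imm8 is 5, so regs is 2 and imm32 is 20: the base register moves by 20 bytes even though only 16 bytes are transferred.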


T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 1|0|
| 1 1 1 0 | 1 1 0 P | U D W 1 | Rn | Vd | 1 0 1 1 | imm8<7:1> | 1 |
| | | | | | | | imm8<0> |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


FLDMDBX{<c>}{<q>} <Rn>!, <dreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


FLDMIAX{<c>}{<q>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VLDR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE; add = (U == '1'); wback = (W == '1');
d = UInt(D:Vd); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2; // If UInt(imm8) is odd, see "FLDMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If regs == 0, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VLDM with the same addressing mode but loads no registers.

If regs > 16 || (d+regs) > 16, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related encodings: See Advanced SIMD and floating-point 64-bit move on page F3-2448 for the T32 instruction 
set, or Advanced SIMD and floating-point 64-bit move on page F4-2532 for the A32 instruction set. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3243 


ID092916 Non-Confidential 




<Rn> Is the general-purpose base register, encoded in the "Rn" field. If writeback is not specified, the PC 
can be used. 


! Specifies base register writeback. Encoded in the "W" field as 1 if present, otherwise 0. 


<dreglist> Is the list of consecutively numbered 64-bit SIMD&FP registers to be transferred. The first register
in the list is encoded in "D:Vd", and "imm8" is set to twice the number of registers in the list plus
one. The list must contain at least one register, all registers must be in the range D0-D15, and must
not contain more than 16 registers.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    address = if add then R[n] else R[n]-imm32;
    for r = 0 to regs-1
        if single_regs then
            S[d+r] = MemA[address,4]; address = address+4;
        else
            word1 = MemA[address,4]; word2 = MemA[address+4,4]; address = address+8;
            // Combine the word-aligned words in the correct order for current endianness.
            D[d+r] = if BigEndian() then word1:word2 else word2:word1;
    if wback then R[n] = if add then R[n]+imm32 else R[n]-imm32;
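
The doubleword path of this operation (single_regs is always FALSE for FLDMX) can be sketched in Python as follows. The function name fldmx_load, the dictionary used for the general-purpose registers, and the flat bytes object standing in for memory are illustrative assumptions; alignment checks and faults performed by MemA are not modelled.

```python
def fldmx_load(memory, R, n, d, regs, imm32, add, wback, big_endian=False):
    """Load `regs` doublewords starting at the decoded address; returns {index: value}."""
    order = "big" if big_endian else "little"
    address = R[n] if add else R[n] - imm32
    loaded = {}
    for r in range(regs):
        # Each register is read as two word-aligned 32-bit accesses.
        word1 = int.from_bytes(memory[address:address + 4], order)
        word2 = int.from_bytes(memory[address + 4:address + 8], order)
        address += 8
        # Combine the words in the order required by the current endianness.
        loaded[d + r] = (word1 << 32 | word2) if big_endian else (word2 << 32 | word1)
    if wback:
        R[n] = R[n] + imm32 if add else R[n] - imm32
    return loaded
```

With a two-register list, imm32 is 20 rather than 16, so writeback skips one extra word after the transferred data.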
F6.1.6 FSTMDBX, FSTMIAX


FSTMX stores multiple SIMD&FP registers from the Advanced SIMD and floating-point register file to
consecutive memory locations, using an address from a general-purpose register.


ARM deprecates use of FSTMDBX and FSTMIAX, except for disassembly purposes, and reassembly of
disassembled code.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 1|0|
| !=1111 | 1 1 0 P | U D W 0 | Rn | Vd | 1 0 1 1 | imm8<7:1> | 1 |
| cond | | | | | | | imm8<0> |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1.


FSTMDBX{<c>}{<q>} <Rn>!, <dreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


FSTMIAX{<c>}{<q>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VSTR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE; add = (U == '1'); wback = (W == '1');
d = UInt(D:Vd); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2; // If UInt(imm8) is odd, see "FSTMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If regs == 0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VSTM with the same addressing mode but stores no registers.

If regs > 16 || (d+regs) > 16, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
  the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
  register becomes UNKNOWN. This behavior does not affect any other memory locations.


T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 1|0|
| 1 1 1 0 | 1 1 0 P | U D W 0 | Rn | Vd | 1 0 1 1 | imm8<7:1> | 1 |
| | | | | | | | imm8<0> |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


FSTMDBX{<c>}{<q>} <Rn>!, <dreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


FSTMIAX{<c>}{<q>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VSTR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE; add = (U == '1'); wback = (W == '1');
d = UInt(D:Vd); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2; // If UInt(imm8) is odd, see "FSTMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If regs == 0, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VSTM with the same addressing mode but stores no registers.

If regs > 16 || (d+regs) > 16, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
  the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
  register becomes UNKNOWN. This behavior does not affect any other memory locations.

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related encodings: See Advanced SIMD and floating-point 64-bit move on page F3-2448 for the T32 instruction 
set, or Advanced SIMD and floating-point 64-bit move on page F4-2532 for the A32 instruction set. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Rn> Is the general-purpose base register, encoded in the "Rn" field. If writeback is not specified, the PC
can be used. However, ARM deprecates use of the PC.

! Specifies base register writeback. Encoded in the "W" field as 1 if present, otherwise 0.

<dreglist> Is the list of consecutively numbered 64-bit SIMD&FP registers to be transferred. The first register
in the list is encoded in "D:Vd", and "imm8" is set to twice the number of registers in the list plus
one. The list must contain at least one register, all registers must be in the range D0-D15, and must
not contain more than 16 registers.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    address = if add then R[n] else R[n]-imm32;
    for r = 0 to regs-1
        if single_regs then
            MemA[address,4] = S[d+r]; address = address+4;
        else
            // Store as two word-aligned words in the correct order for current endianness.
            MemA[address,4] = if BigEndian() then D[d+r]<63:32> else D[d+r]<31:0>;
            MemA[address+4,4] = if BigEndian() then D[d+r]<31:0> else D[d+r]<63:32>;
            address = address+8;
    if wback then R[n] = if add then R[n]+imm32 else R[n]-imm32;
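
A minimal Python sketch of the doubleword store path follows, shown here for a Decrement Before access. The name fstmx_store, the dictionaries for R and D, and the bytearray standing in for memory are assumptions made for illustration; MemA alignment checks and faults are not modelled.

```python
def fstmx_store(memory, R, D, n, d, regs, imm32, add, wback, big_endian=False):
    """Store `regs` doublewords from D[d..] to `memory`, updating R[n] on writeback."""
    order = "big" if big_endian else "little"
    address = R[n] if add else R[n] - imm32
    for r in range(regs):
        value = D[d + r]
        low, high = value & 0xFFFFFFFF, value >> 32
        # The most significant word goes first only when the access is big-endian.
        first, second = (high, low) if big_endian else (low, high)
        memory[address:address + 4] = first.to_bytes(4, order)
        memory[address + 4:address + 8] = second.to_bytes(4, order)
        address += 8
    if wback:
        R[n] = R[n] + imm32 if add else R[n] - imm32
```

For FSTMDBX with one register, imm32 is 12, so the store starts 12 bytes below the original base and the base is written back to that lower address.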


F6.1.7 SHA1C 


SHA1 hash update (choose).

A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 0 | 0 D 0 0 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

A1 variant

SHA1C.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 0 | 1 1 1 1 | 0 D 0 0 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

T1 variant

SHA1C.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1];
    Y = Q[n>>1]<31:0>; // Note: 32 bits wide
    W = Q[m>>1];
    for e = 0 to 3
        t = SHAchoose(X<63:32>, X<95:64>, X<127:96>);
        Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32];
        X<63:32> = ROL(X<63:32>, 30);
        <Y, X> = ROL(Y:X, 32);
    Q[d>>1] = X;
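
The four-round loop above can be traced with a short Python model. This is a sketch, not the architected definition: registers are represented as lists of four 32-bit lanes with lane 0 holding bits<31:0>, and SHAchoose is assumed to be the standard SHA-1 choose function ((y EOR z) AND x) EOR z, which is not defined in this section. The names rol32, sha_choose and sha1c are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha_choose(x, y, z):
    # Assumed definition of SHAchoose: selects y where x is 1, otherwise z.
    return ((y ^ z) & x) ^ z


def sha1c(qd, qn, qm):
    """Four SHA1 'choose' rounds; each argument is a list of four 32-bit lanes."""
    x = list(qd)
    y = qn[0]                      # only bits<31:0> of Qn are used
    for e in range(4):
        t = sha_choose(x[1], x[2], x[3])
        y = (y + rol32(x[0], 5) + t + qm[e]) & MASK32
        x[1] = rol32(x[1], 30)
        # <Y, X> = ROL(Y:X, 32): every lane moves up one place and Y wraps into lane 0.
        y, x = x[3], [y, x[0], x[1], x[2]]
    return x
```

The round constant does not appear in the loop, so software is expected to add it to the schedule words supplied in Qm beforehand. SHA1M and SHA1P follow the same structure with a different three-input function.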


F6.1.8 SHA1H 


SHA1 fixed rotate.

A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 0 1 0 | 1 1 M 0 | Vm |

A1 variant

SHA1H.32 <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '10' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 0 1 0 | 1 1 M 0 | Vm |

T1 variant

SHA1H.32 <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '10' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    Q[d>>1] = ZeroExtend(ROL(Q[m>>1]<31:0>, 30), 128);
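
As a small illustration, the operation rotates only the lowest 32-bit element and clears the rest of the destination. The sketch below models Q registers as 128-bit Python integers; the name sha1h is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF


def sha1h(qm):
    """Rotate bits<31:0> of qm left by 30 and zero-extend the result to 128 bits."""
    low = qm & MASK32
    return ((low << 30) | (low >> 2)) & MASK32
```

Bits<127:32> of the source do not affect the result.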


F6.1.9 SHA1M 


SHA1 hash update (majority). 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 0 | 0 D 1 0 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

A1 variant

SHA1M.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 0 | 1 1 1 1 | 0 D 1 0 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

T1 variant

SHA1M.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1];
    Y = Q[n>>1]<31:0>; // Note: 32 bits wide
    W = Q[m>>1];
    for e = 0 to 3
        t = SHAmajority(X<63:32>, X<95:64>, X<127:96>);
        Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32];
        X<63:32> = ROL(X<63:32>, 30);
        <Y, X> = ROL(Y:X, 32);
    Q[d>>1] = X;


F6.1.10 SHA1P 


SHA1 hash update (parity). 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 0 | 0 D 0 1 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

A1 variant

SHA1P.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 0 | 1 1 1 1 | 0 D 0 1 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

T1 variant

SHA1P.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1];
    Y = Q[n>>1]<31:0>; // Note: 32 bits wide
    W = Q[m>>1];
    for e = 0 to 3
        t = SHAparity(X<63:32>, X<95:64>, X<127:96>);
        Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32];
        X<63:32> = ROL(X<63:32>, 30);
        <Y, X> = ROL(Y:X, 32);
    Q[d>>1] = X;


F6.1.11 SHA1SU0 


SHA1 schedule update 0. 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 0 | 0 D 1 1 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

A1 variant

SHA1SU0.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 0 | 1 1 1 1 | 0 D 1 1 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

T1 variant

SHA1SU0.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 





<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    op1 = Q[d>>1]; op2 = Q[n>>1]; op3 = Q[m>>1];
    op2 = op2<63:0> : op1<127:64>;
    Q[d>>1] = op1 EOR op2 EOR op3;
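
The concatenation step is easy to misread, so the following Python sketch spells it out on 128-bit integers. The name sha1su0 is a hypothetical helper, and registers are assumed to be plain non-negative integers below 2^128.

```python
MASK64 = (1 << 64) - 1


def sha1su0(op1, op2, op3):
    """Model of SHA1SU0 on 128-bit integers (op1 = Qd, op2 = Qn, op3 = Qm)."""
    # op2<63:0> : op1<127:64> puts the low half of op2 above the high half of op1.
    op2 = ((op2 & MASK64) << 64) | (op1 >> 64)
    return op1 ^ op2 ^ op3
```

The effect is a three-way exclusive-OR in which the second operand is a 64-bit window that straddles Qd and Qn.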





F6.1.12 SHA1SU1 
SHA1 schedule update 1. 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 1 | 1 0 M 0 | Vm |

A1 variant

SHA1SU1.32 <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '10' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 1 | 1 0 M 0 | Vm |

T1 variant

SHA1SU1.32 <Qd>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if size != '10' then UNDEFINED;
if Vd<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;

Notes for all encodings

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors.

Assembler symbols

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.

Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1]; Y = Q[m>>1];
    T = X EOR LSR(Y, 32);
    W0 = ROL(T<31:0>, 1);
    W1 = ROL(T<63:32>, 1);
    W2 = ROL(T<95:64>, 1);
    W3 = ROL(T<127:96>, 1) EOR ROL(T<31:0>, 2);
    Q[d>>1] = W3:W2:W1:W0;
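
A brief Python model of this operation, working on 128-bit integers, is given below as a sketch. The helper names rol32, lane and sha1su1 are hypothetical and exist only for this illustration.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def lane(v, e):
    """Return 32-bit element e of a 128-bit integer, element 0 being bits<31:0>."""
    return (v >> (32 * e)) & MASK32


def sha1su1(x, y):
    """Model of SHA1SU1 (x = Qd, y = Qm) on 128-bit integers."""
    t = x ^ (y >> 32)                       # T = X EOR LSR(Y, 32)
    w0 = rol32(lane(t, 0), 1)
    w1 = rol32(lane(t, 1), 1)
    w2 = rol32(lane(t, 2), 1)
    # The top word also depends on the new bottom word, hence the extra rotate by 2.
    w3 = rol32(lane(t, 3), 1) ^ rol32(lane(t, 0), 2)
    return (w3 << 96) | (w2 << 64) | (w1 << 32) | w0
```

The extra term in W3 reflects that the fourth schedule word of a group depends on the first word computed in the same group.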


F6.1.13 SHA256H 


SHA256 hash update part 1. 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D 0 0 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

A1 variant

SHA256H.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D 0 0 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

T1 variant

SHA256H.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1]; Y = Q[n>>1]; W = Q[m>>1]; part1 = TRUE;
    Q[d>>1] = SHA256hash(X, Y, W, part1);


F6.1.14 SHA256H2 


SHA256 hash update part 2. 
A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D 0 1 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

A1 variant

SHA256H2.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);

T1

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D 0 1 | Vn | Vd | 1 1 0 0 | N Q M 0 | Vm |

T1 variant

SHA256H2.32 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[n>>1]; Y = Q[d>>1]; W = Q[m>>1]; part1 = FALSE;
    Q[d>>1] = SHA256hash(X, Y, W, part1);
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F6.1.15 SHA256SU0 
SHA256 schedule update 0. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20/19 18 17 16|15 12/1110 9 8|7 6 5 4|3 0| 
T1i10017 o]1 isze]1 0] ve lojo+1|1[Mjo] vm | 
Al variant 
SHA256SUQ.32 <Qd>, <Qm> 
Decode for this encoding 
if ! HaveCryptoExt() then UNDEFINED; 
if size != '10' then UNDEFINED; 
if Vd<@> == '1' || Vm<@> == '1' then UNDEFINED; 
d = UInt(D:Vd); m = UInt(M:Vm); 
T1 
115141312/1110 9 8|7 6 5 4|/3 2 1 0|15 12/1110 9 8|7 6 5 4|3 0| 
14111111414 1{[d{1 1{sze]1 of va folo111{i[mjo] vm _| 
T1 variant 
SHA256SUQ.32 <Qd>, <Qm> 
Decode for this encoding 
if ! HaveCryptoExt() then UNDEFINED; 
if size != '10' then UNDEFINED; 
if Vd<@> == '1' || Vm<@> == '1' then UNDEFINED; 
d = UInt(D:Vd); m = UInt(M:Vm); 
if InITBlock() then UNPREDICTABLE; 
Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
Operation for all encodings 
if ConditionPassed() then
    bits(128) result;
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1]; Y = Q[m>>1];
    T = Y<31:0> : X<127:32>;
    for e = 0 to 3
        elt = Elem[T, e, 32];
        elt = ROR(elt, 7) EOR ROR(elt, 18) EOR LSR(elt, 3);
        Elem[result, e, 32] = elt + Elem[X, e, 32];
    Q[d>>1] = result;
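
The SHA256SU0 operation can be modelled on four 32-bit lanes. The following Python sketch is illustrative only: the function names and the list-of-lanes representation (lane 0 holds bits <31:0>) are assumptions of this example, not part of the architecture pseudocode.

```python
MASK32 = 0xFFFFFFFF


def ror32(x, n):
    """Rotate a 32-bit value right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def sha256su0(x, y):
    """x and y are lists of four 32-bit lanes read from Q[d>>1] and Q[m>>1]."""
    t = x[1:] + y[:1]                  # T = Y<31:0> : X<127:32>
    result = []
    for e in range(4):
        elt = t[e]
        elt = ror32(elt, 7) ^ ror32(elt, 18) ^ (elt >> 3)
        result.append((elt + x[e]) & MASK32)   # 32-bit wrapping add
    return result
```

Each result lane is the sum of one lane of X and the rotate-and-shift mix of the next higher lane, with lane 3 taking its partner from the lowest lane of Y.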
F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


F6.1.16 SHA256SU1 


SHA256 schedule update 1. 
A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1  1| 0| D| 1  0| Vn  | Vd  | 1  1 0 0|N|Q|M|0|Vm |


A1 variant


SHA256SU1.32 <Qd>, <Qn>, <Qm> 


Decode for this encoding 
if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


|15 14 13 12|11 10 9 8|7|6|5 4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 1  1 1 1|0|D|1 0|Vn | Vd  | 1  1 0 0|N|Q|M|0|Vm |


T1 variant 


SHA256SU1.32 <Qd>, <Qn>, <Qm> 


Decode for this encoding 


if !HaveCryptoExt() then UNDEFINED;
if Q != '1' then UNDEFINED;
if Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    bits(128) result;
    EncodingSpecificOperations(); CheckCryptoEnabled32();
    X = Q[d>>1]; Y = Q[n>>1]; Z = Q[m>>1];
    T0 = Z<31:0> : Y<127:32>;

    T1 = Z<127:64>;
    for e = 0 to 1
        elt = Elem[T1, e, 32];
        elt = ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10);
        elt = elt + Elem[X, e, 32] + Elem[T0, e, 32];
        Elem[result, e, 32] = elt;

    T1 = result<63:0>;
    for e = 2 to 3
        elt = Elem[T1, e - 2, 32];
        elt = ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10);
        elt = elt + Elem[X, e, 32] + Elem[T0, e, 32];
        Elem[result, e, 32] = elt;

    Q[d>>1] = result;
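
The two-stage structure of SHA256SU1, where lanes 2 and 3 depend on the freshly computed lanes 0 and 1, can be sketched in Python as follows. The function names and the list-of-lanes representation (lane 0 holds bits <31:0>) are assumptions of this example and do not come from the manual.

```python
MASK32 = 0xFFFFFFFF


def ror32(x, n):
    """Rotate a 32-bit value right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def mix(elt):
    """ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10)."""
    return ror32(elt, 17) ^ ror32(elt, 19) ^ (elt >> 10)


def sha256su1(x, y, z):
    """x, y, z are four-lane lists read from Q[d>>1], Q[n>>1], Q[m>>1]."""
    t0 = y[1:] + z[:1]                 # T0 = Z<31:0> : Y<127:32>
    result = [0, 0, 0, 0]
    for e in range(2):                 # first pass uses T1 = Z<127:64>
        result[e] = (mix(z[2 + e]) + x[e] + t0[e]) & MASK32
    for e in range(2, 4):              # second pass uses T1 = result<63:0>
        result[e] = (mix(result[e - 2]) + x[e] + t0[e]) & MASK32
    return result
```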


F6.1.17 VABA 


Vector Absolute Difference and Accumulate subtracts the elements of one vector from the corresponding elements 
of another vector, and accumulates the absolute values of the results into the elements of the destination vector. 


Operand and result elements are all integers of the same length. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1| U| 0| D|size | Vn  | Vd  | 0  1 1 1|N|Q|M|1|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 

if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (U == '1'); long_destination = FALSE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13|12|11 10 9 8|7|6|5  4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1| U| 1  1 1 1|0|D|size|Vn | Vd  | 0  1 1 1|N|Q|M|1|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (U == '1'); long_destination = FALSE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize];
            op2 = Elem[Din[m+r],e,esize];
            absdiff = Abs(Int(op1,unsigned) - Int(op2,unsigned));
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + absdiff;
            else
                Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + absdiff;
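
As a rough illustration of the non-long path of VABA, the Python sketch below accumulates absolute differences lane by lane. The names `to_int` and `vaba` and the list-of-lanes representation are assumptions of this example; the accumulation wraps modulo 2^esize, which is how the element assignment in the pseudocode is read here.

```python
def to_int(bits, esize, unsigned):
    """Interpret an esize-bit pattern as an unsigned or two's complement integer."""
    if unsigned or bits < (1 << (esize - 1)):
        return bits
    return bits - (1 << esize)


def vaba(acc, op1, op2, esize, unsigned):
    """Per lane: acc + |op1 - op2|, truncated to esize bits."""
    mask = (1 << esize) - 1
    return [
        (a + abs(to_int(x, esize, unsigned) - to_int(y, esize, unsigned))) & mask
        for a, x, y in zip(acc, op1, op2)
    ]
```

For example, the bit pattern 200 in an 8-bit lane is 200 when U8 is selected but -56 when S8 is selected, so the same inputs can accumulate different values depending on `<dt>`.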


F6.1.18 VABAL 


Vector Absolute Difference and Accumulate Long subtracts the elements of one vector from the corresponding 
elements of another vector, and accumulates the absolute values of the results into the elements of the destination 
vector. 


Operand elements are all integers of the same length, and the result elements are double the length of the operands. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1| U| 1| D|!=11 | Vn  | Vd  | 0  1 0 1|N|0|M|0|Vm |
                               size

A1 variant


VABAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); long_destination = TRUE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;


T1 


|15 14 13|12|11 10 9 8|7|6|5  4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1| U| 1  1 1 1|1|D|!=11|Vn | Vd  | 0  1 0 1|N|0|M|0|Vm |
                           size


T1 variant 


VABAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); long_destination = TRUE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;


Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize];
            op2 = Elem[Din[m+r],e,esize];
            absdiff = Abs(Int(op1,unsigned) - Int(op2,unsigned));
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + absdiff;
            else
                Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + absdiff;
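
For VABAL the long_destination path is taken, so each accumulator lane is twice as wide as the source lanes. The Python sketch below models that path only; the name `vabal` and the list-of-lanes representation are assumptions of this example.

```python
def vabal(acc, op1, op2, esize, unsigned):
    """Per lane: acc (2*esize bits) + |op1 - op2| (operands are esize bits)."""

    def to_int(bits):
        # Interpret an esize-bit pattern as unsigned or two's complement.
        if unsigned or bits < (1 << (esize - 1)):
            return bits
        return bits - (1 << esize)

    wide_mask = (1 << (2 * esize)) - 1   # destination lanes are 2*esize wide
    return [
        (a + abs(to_int(x) - to_int(y))) & wide_mask
        for a, x, y in zip(acc, op1, op2)
    ]
```

The wider accumulator lets many absolute differences be summed before a lane wraps, although the sum is still taken modulo 2^(2*esize) in this model.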


F6.1.19 VABD (floating-point) 


Vector Absolute Difference (floating-point) subtracts the elements of one vector from the corresponding elements 
of another vector, and places the absolute values of the results in the elements of the destination vector. 


Operand and result elements are all single-precision floating-point numbers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1  1| 0| D| 1|sz| Vn  | Vd  | 1  1 0 1|N|Q|M|0|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10 9 8|7|6|5| 4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 1  1 1 1|0|D|1|sz|Vn | Vd  | 1  1 0 1|N|Q|M|0|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r],e,esize]; op2 = Elem[D[m+r],e,esize];
            Elem[D[d+r],e,esize] = FPAbs(FPSub(op1, op2, StandardFPSCRValue()));
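
A simplified Python model of the per-lane floating-point operation is sketched below. It rounds through single precision with the standard `struct` module and is an approximation under stated assumptions: the helper names `f32` and `vabd_f32` are hypothetical, and the flush-to-zero and default-NaN behavior implied by StandardFPSCRValue() is not modelled.

```python
import math
import struct


def f32(x):
    """Round a Python float to the nearest single-precision value."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # Magnitude exceeds the single-precision range: round to infinity.
        return math.copysign(math.inf, x)


def vabd_f32(op1, op2):
    """Per lane: |op1 - op2| with the difference rounded to single precision."""
    return [abs(f32(f32(a) - f32(b))) for a, b in zip(op1, op2)]
```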


F6.1.20 VABD (integer) 


Vector Absolute Difference (integer) subtracts the elements of one vector from the corresponding elements of 
another vector, and places the absolute values of the results in the elements of the destination vector. 


Operand and result elements are all integers of the same length. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1| U| 0| D|size | Vn  | Vd  | 0  1 1 1|N|Q|M|0|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (U == '1'); long_destination = FALSE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13|12|11 10 9 8|7|6|5  4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1| U| 1  1 1 1|0|D|size|Vn | Vd  | 0  1 1 1|N|Q|M|0|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (U == '1'); long_destination = FALSE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize];
            op2 = Elem[Din[m+r],e,esize];
            absdiff = Abs(Int(op1,unsigned) - Int(op2,unsigned));
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = absdiff<2*esize-1:0>;
            else
                Elem[D[d+r],e,esize] = absdiff<esize-1:0>;
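
The decode step and the non-long operation of VABD (integer) can be sketched together in Python. The function names and the dictionary returned by `decode_vabd` are assumptions of this example, and only the fields that affect lane arithmetic are modelled.

```python
def decode_vabd(u, size):
    """Derive unsigned, esize and elements from the U and size fields."""
    if size == 0b11:
        raise ValueError("UNDEFINED")
    esize = 8 << size                     # 8, 16 or 32
    return {"unsigned": u == 1, "esize": esize, "elements": 64 // esize}


def vabd(op1, op2, esize, unsigned):
    """Per lane: |op1 - op2|, keeping bits <esize-1:0> of the result."""

    def to_int(bits):
        if unsigned or bits < (1 << (esize - 1)):
            return bits
        return bits - (1 << esize)

    mask = (1 << esize) - 1
    return [abs(to_int(x) - to_int(y)) & mask for x, y in zip(op1, op2)]
```

With S8 operands 0x80 (-128) and 0x7F (127) the absolute difference is 255, which is stored as the bit pattern 0xFF in the 8-bit result lane.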


F6.1.21 VABDL (integer)


Vector Absolute Difference Long (integer) subtracts the elements of one vector from the corresponding elements of 
another vector, and places the absolute values of the results in the elements of the destination vector. 


Operand elements are all integers of the same length, and the result elements are double the length of the operands. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1| U| 1| D|!=11 | Vn  | Vd  | 0  1 1 1|N|0|M|0|Vm |
                               size

A1 variant


VABDL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); long_destination = TRUE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;


T1 


|15 14 13|12|11 10 9 8|7|6|5  4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1| U| 1  1 1 1|1|D|!=11|Vn | Vd  | 0  1 1 1|N|0|M|0|Vm |
                           size


T1 variant 


VABDL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); long_destination = TRUE;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;


Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize];
            op2 = Elem[Din[m+r],e,esize];
            absdiff = Abs(Int(op1,unsigned) - Int(op2,unsigned));
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = absdiff<2*esize-1:0>;
            else
                Elem[D[d+r],e,esize] = absdiff<esize-1:0>;
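
For VABDL the long_destination path writes each absolute difference into a lane of 2*esize bits. The Python sketch below models that path; the name `vabdl` and the list-of-lanes representation are assumptions of this example.

```python
def vabdl(op1, op2, esize, unsigned):
    """Per lane: |op1 - op2| written to a result lane of 2*esize bits."""

    def to_int(bits):
        if unsigned or bits < (1 << (esize - 1)):
            return bits
        return bits - (1 << esize)

    wide_mask = (1 << (2 * esize)) - 1
    return [abs(to_int(x) - to_int(y)) & wide_mask for x, y in zip(op1, op2)]
```

The largest possible absolute difference of two esize-bit values is 2^esize - 1, so the widened lane always holds the full result and the mask never discards significant bits.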


F6.1.22 VABS 


Vector Absolute takes the absolute value of each element in a vector, and places the results in a second vector. The 
floating-point version only clears the sign bit. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15 12|11|10|9 8 7|6|5|4|3 0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1|size | 0  1| Vd  | 0| F|1 1 0|Q|M|0|Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VABS{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABS{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
advsimd = TRUE; floating_point = (F == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


A2 
|31  28|27 26 25 24|23|22|21 20|19 18 17 16|15 12|11 10|9 8|7|6|5|4|3 0|
|!=1111| 1  1  1  0| 1| D| 1  1| 0  0  0  0| Vd  | 1  0|1 x|1|1|M|0|Vm |
 cond                                                   size


Single-precision scalar variant 
Applies when size == 10. 


VABS{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VABS{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
T1


|15 14 13 12|11 10 9 8|7|6|5 4|3  2|1 0|15 12|11|10|9 8 7|6|5|4|3 0|
| 1  1  1  1| 1  1 1 1|1|D|1 1|size|0 1| Vd  | 0| F|1 1 0|Q|M|0|Vm |





64-bit SIMD vector variant 
Applies when Q == 0. 


VABS{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VABS{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
advsimd = TRUE; floating_point = (F == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T2 


|15 14 13 12|11 10 9 8|7|6|5 4|3 2 1 0|15 12|11 10|9 8|7|6|5|4|3 0|
| 1  1  1  0| 1  1 1 0|1|D|1 1|0 0 0 0| Vd  | 1  0|1 x|1|1|M|0|Vm |
                                                   size


Single-precision scalar variant 
Applies when size == 10. 


VABS{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VABS{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "F:size" field. It can have the
following values:
S8 when F = 0, size = 00
S16 when F = 0, size = 01
S32 when F = 0, size = 10
F32 when F = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                if floating_point then
                    Elem[D[d+r],e,esize] = FPAbs(Elem[D[m+r],e,esize]);
                else
                    result = Abs(SInt(Elem[D[m+r],e,esize]));
                    Elem[D[d+r],e,esize] = result<esize-1:0>;
    else // VFP instruction
        case esize of
            when 32 S[d] = FPAbs(S[m]);
            when 64 D[d] = FPAbs(D[m]);
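
The integer and floating-point paths of VABS differ in an edge case that a short Python sketch makes visible. The function names and the representation of lanes as raw bit patterns are assumptions of this example.

```python
def vabs_int(lanes, esize):
    """Per lane: Abs(SInt(x)), keeping bits <esize-1:0> of the result."""
    mask = (1 << esize) - 1
    out = []
    for x in lanes:
        signed = x - (1 << esize) if x >> (esize - 1) else x
        out.append(abs(signed) & mask)
    return out


def vabs_f32_bits(lanes):
    """Per lane: clear the sign bit of a single-precision bit pattern."""
    return [x & 0x7FFFFFFF for x in lanes]
```

In this model the most negative integer maps to itself: for S8, the input 0x80 (-128) has absolute value 128, whose low 8 bits are again 0x80. The floating-point path has no such case because it only clears bit 31.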
F6.1.23 VACGE
Vector Absolute Compare Greater Than or Equal takes the absolute value of each element in a vector, and compares 
it with the absolute value of the corresponding element of a second vector. If the first is greater than or equal to the 
second, the corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros. 
The operands and result can be quadword or doubleword vectors. They must all be the same size. 
The operand vector elements must be 32-bit floating-point numbers. 
The result vector elements are 32-bit fields. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
This instruction is used by the pseudo-instruction VACLE. The pseudo-instruction is never the preferred 
disassembly. 
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1  1| 0| D| 0|sz| Vn  | Vd  | 1  1 1 0|N|Q|M|1|Vm |
                               op
64-bit SIMD vector variant 
Applies when Q == 0. 
VACGE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VACGE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
or_equal = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10 9 8|7|6|5| 4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 1  1 1 1|0|D|0|sz|Vn | Vd  | 1  1 1 0|N|Q|M|1|Vm |
                           op
64-bit SIMD vector variant 
Applies when Q == 0. 
VACGE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VACGE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
or_equal = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Alias conditions


Pseudo-instruction    is preferred when
VACLE                 Never


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = FPAbs(Elem[D[n+r],e,esize]); op2 = FPAbs(Elem[D[m+r],e,esize]);
            if or_equal then
                test_passed = FPCompareGE(op1, op2, StandardFPSCRValue());
            else
                test_passed = FPCompareGT(op1, op2, StandardFPSCRValue());
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
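
The per-lane comparison and mask generation can be sketched in Python as follows. This is an approximation under stated assumptions: the name `vacge` is hypothetical, lanes are given as Python floats rather than bit patterns, and the flush-to-zero behavior implied by StandardFPSCRValue() is not modelled.

```python
def vacge(op1, op2, or_equal=True):
    """Per lane: all ones if |op1| >= |op2| (or > when or_equal is False)."""
    out = []
    for a, b in zip(op1, op2):
        a, b = abs(a), abs(b)
        passed = (a >= b) if or_equal else (a > b)   # NaN compares as False
        out.append(0xFFFFFFFF if passed else 0x00000000)
    return out
```

The all-ones or all-zeros result lanes are convenient as bit masks for later bitwise select operations.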
F6.1.24 VACLE


Vector Absolute Compare Less Than or Equal takes the absolute value of each element in a vector, and compares it 
with the absolute value of the corresponding element of a second vector. If the first is less than or equal to the second, 
the corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros.


This instruction is a pseudo-instruction of the VACGE instruction. This means that: 


• The encodings in this description are named to match the encodings of VACGE.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VACGE gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 0  0  1  1| 0| D| 0|sz| Vn  | Vd  | 1  1 1 0|N|Q|M|1|Vm |
                               op


64-bit SIMD vector variant 

Applies when Q == 0. 
VACLE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VACGE{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VACLE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VACGE{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7|6|5| 4|3 0|15 12|11 10 9 8|7|6|5|4|3 0|
| 1  1  1  1| 1  1 1 1|0|D|0|sz|Vn | Vd  | 1  1 1 0|N|Q|M|1|Vm |
                           op


64-bit SIMD vector variant 

Applies when Q == 0. 
VACLE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VACGE{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 
128-bit SIMD vector variant

Applies when Q == 1. 
VACLE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VACGE{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


Assembler symbols 


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


Operation for all encodings 


The description of VACGE gives the operational pseudocode for this instruction. 
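The operand swap that turns VACLE into VACGE can be illustrated with a minimal Python sketch. The function names `vacge_lane` and `vacle_lane` are hypothetical, and the sketch models one lane with ordinary Python floats, ignoring the flush-to-zero and default NaN behavior implied by StandardFPSCRValue():

```python
ONES32 = 0xFFFFFFFF  # Ones(32), the "true" result for a 32-bit lane


def vacge_lane(a: float, b: float) -> int:
    """Return all ones if |a| >= |b|, otherwise zero. NaN inputs compare false."""
    return ONES32 if abs(a) >= abs(b) else 0


def vacle_lane(a: float, b: float) -> int:
    """VACLE Dd, Dn, Dm assembles as VACGE Dd, Dm, Dn: the two sources swap."""
    return vacge_lane(b, a)
```

Because |a| <= |b| is the same test as |b| >= |a|, no separate encoding is needed for the pseudo-instruction.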
F6.1.25 VACGT
Vector Absolute Compare Greater Than takes the absolute value of each element in a vector, and compares it with 
the absolute value of the corresponding element of a second vector. If the first is greater than the second, the 
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros. 
The operands and result can be quadword or doubleword vectors. They must all be the same size. 
The operand vector elements must be 32-bit floating-point numbers. 
The result vector elements are 32-bit fields. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
This instruction is used by the pseudo-instruction VACLT. The pseudo-instruction is never the preferred 
disassembly. 
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1  1| 0| D| 1|sz| Vn  | Vd  | 1  1 1 0| N| Q| M| 1| Vm|
                               op
64-bit SIMD vector variant 
Applies when Q == 0. 
VACGT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VACGT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
or_equal = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10 9 8| 7| 6| 5| 4|3 0|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 1  1 1 1| 0| D| 1|sz| Vn| Vd  | 1  1 1 0| N| Q| M| 1| Vm|
                             op
64-bit SIMD vector variant 
Applies when Q == 0. 
VACGT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VACGT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
or_equal = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Alias conditions


Pseudo-instruction    is preferred when
VACLT                 Never


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = FPAbs(Elem[D[n+r],e,esize]); op2 = FPAbs(Elem[D[m+r],e,esize]);
            if or_equal then
                test_passed = FPCompareGE(op1, op2, StandardFPSCRValue());
            else
                test_passed = FPCompareGT(op1, op2, StandardFPSCRValue());
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
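The element loop above can be modeled in a few lines of Python. This is a sketch only: the function name `vacg` is hypothetical, vectors are plain lists of Python floats rather than packed 32-bit fields, and the flush-to-zero and default NaN handling implied by StandardFPSCRValue() is not modeled.

```python
from typing import List

ONES32 = 0xFFFFFFFF  # Ones(32)


def vacg(op_n: List[float], op_m: List[float], or_equal: bool) -> List[int]:
    """Absolute compare per lane: or_equal selects VACGE, otherwise VACGT."""
    result = []
    for a, b in zip(op_n, op_m):
        op1, op2 = abs(a), abs(b)  # FPAbs on each source element
        passed = (op1 >= op2) if or_equal else (op1 > op2)
        result.append(ONES32 if passed else 0)
    return result
```

For example, `vacg([1.0, -3.0], [-1.0, 2.0], or_equal=False)` treats the first lane as a tie (so it fails the greater-than test) and the second lane as 3.0 against 2.0.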
F6.1.26 VACLT


Vector Absolute Compare Less Than takes the absolute value of each element in a vector, and compares it with the 
absolute value of the corresponding element of a second vector. If the first is less than the second, the corresponding 
element in the destination vector is set to all ones. Otherwise, it is set to all zeros.


This instruction is a pseudo-instruction of the VACGT instruction. This means that: 


•   The encodings in this description are named to match the encodings of VACGT.
•   The assembler syntax is used only for assembly, and is not used on disassembly.
•   The description of VACGT gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1  1| 0| D| 1|sz| Vn  | Vd  | 1  1 1 0| N| Q| M| 1| Vm|
                               op


64-bit SIMD vector variant 

Applies when Q == 0. 
VACLT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VACGT{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VACLT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VACGT{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8| 7| 6| 5| 4|3 0|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 1  1 1 1| 0| D| 1|sz| Vn| Vd  | 1  1 1 0| N| Q| M| 1| Vm|
                             op


64-bit SIMD vector variant 

Applies when Q == 0. 
VACLT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VACGT{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 
128-bit SIMD vector variant

Applies when Q == 1. 
VACLT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VACGT{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


Assembler symbols 


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


Operation for all encodings 


The description of VACGT gives the operational pseudocode for this instruction. 
F6.1.27 VADD (floating-point)


Vector Add (floating-point) adds corresponding elements in two vectors, and places the results in the destination
vector.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1  0| 0| D| 0|sz| Vn  | Vd  | 1  1 0 1| N| Q| M| 0| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 


VADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


A2


|31  28|27 26 25 24|23|22|21 20|19 16|15 12|11 10|9 8| 7| 6| 5| 4|3 0|
|!=1111| 1  1  1  0| 0| D| 1  1| Vn  | Vd  | 1  0|1 x| N| 0| M| 0| Vm|
 cond                                             size


Single-precision scalar variant 
Applies when size == 10. 
VADD{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 


VADD{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


T1 


|15 14 13 12|11 10 9 8| 7| 6| 5| 4|3 0|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  0| 1  1 1 1| 0| D| 0|sz| Vn| Vd  | 1  1 0 1| N| Q| M| 0| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 


VADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T2 
|15 14 13 12|11 10 9 8| 7| 6|5 4|3 0|15 12|11 10|9 8| 7| 6| 5| 4|3 0|
| 1  1  1  0| 1  1 1 0| 0| D|1 1| Vn| Vd  | 1  0|1 x| N| 0| M| 0| Vm|
                                                 size


Single-precision scalar variant 
Applies when size == 10. 


VADD{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VADD{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
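The register fields above are concatenated in different orders for D, Q, and S registers. The following Python sketch shows how the register numbers come out of the bit fields; the helper names are hypothetical, and the functions take the raw field values as small integers:

```python
def dreg_index(D: int, Vd: int) -> int:
    """UInt(D:Vd): D is the top bit of a 5-bit index, selecting D0-D31."""
    return (D << 4) | Vd


def sreg_index(Vd: int, D: int) -> int:
    """UInt(Vd:D): D is the bottom bit of a 5-bit index, selecting S0-S31."""
    return (Vd << 1) | D


def qreg_index(D: int, Vd: int) -> int:
    """<Qd> is encoded as <Qd>*2, so the Q number is half of the D index."""
    d = dreg_index(D, Vd)
    if d & 1:
        # Matches the decode check that Vd<0> == '1' is UNDEFINED when Q == '1'.
        raise ValueError("odd register index is UNDEFINED for a Q-register variant")
    return d >> 1
```

With D = 1 and Vd = 0b0011, the doubleword index is 19, while Vd = 0b0011 with D = 1 in the single-precision ordering gives S7.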


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                Elem[D[d+r],e,esize] = FPAdd(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize],
                                             StandardFPSCRValue());
    else // VFP instruction
        case esize of
            when 32
                S[d] = FPAdd(S[n], S[m], FPSCR);
            when 64
                D[d] = FPAdd(D[n], D[m], FPSCR);


F6.1.28 VADD (integer) 


Vector Add (integer) adds corresponding elements in two vectors, and places the results in the destination vector. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 16|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1  0| 0| D| size| Vn  | Vd  | 1  0 0 0| N| Q| M| 0| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 


VADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10 9 8| 7| 6| 5 4|3 0|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  0| 1  1 1 1| 0| D|size| Vn| Vd  | 1  0 0 0| N| Q| M| 0| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 
VADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 





<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following 
values: 
I8 when size = 00
I16 when size = 01
I32 when size = 10
I64 when size = 11
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = Elem[D[n+r],e,esize] + Elem[D[m+r],e,esize];
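Each lane of the integer add wraps independently, with no carry into the neighboring element. A minimal Python sketch of one doubleword register, assuming the register is held as a 64-bit integer with element 0 in the low bits (the function name `vadd_int` is hypothetical):

```python
def vadd_int(dn: int, dm: int, esize: int) -> int:
    """Lane-wise add of two 64-bit register values; esize is 8, 16, 32, or 64."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):  # elements = 64 DIV esize
        shift = e * esize
        # The sum is truncated to esize bits, so carries never cross lanes.
        lane = ((dn >> shift) + (dm >> shift)) & mask
        result |= lane << shift
    return result
```

Adding 0xFF and 0x01 with esize = 8 therefore gives 0x00 in that lane, whereas the same values with esize = 16 give 0x0100.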


F6.1.29 VADDHN 


Vector Add and Narrow, returning High Half adds corresponding elements in two quadword vectors, and places the 
most significant half of each result in a doubleword vector. The results are truncated. For rounded results, see 
VRADDHN. 


The operand elements can be 16-bit, 32-bit, or 64-bit integers. There is no distinction between signed and unsigned 
integers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28|27 26 25 24|23|22|21 20|19 16|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1  0| 1| D|!=11 | Vn  | Vd  | 0  1 0 0| N| 0| M| 0| Vm|
                               size

A1 variant


VADDHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 


Decode for this encoding 
if size == '11' then SEE "Related encodings";
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


|15 14 13 12|11 10 9 8| 7| 6| 5 4|3 0|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  0| 1  1 1 1| 1| D|!=11| Vn| Vd  | 0  1 0 0| N| 0| M| 0| Vm|
                             size


T1 variant 


VADDHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 


Decode for this encoding 

if size == '11' then SEE "Related encodings";
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the
following values:
I16 when size = 00
I32 when size = 01
I64 when size = 10

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        result = Elem[Qin[n>>1],e,2*esize] + Elem[Qin[m>>1],e,2*esize];
        Elem[D[d],e,esize] = result<2*esize-1:esize>;
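The narrowing step keeps bits <2*esize-1:esize> of each sum and discards the rest without rounding. A Python sketch under the assumption that each source vector is a list of unsigned 2*esize-bit lane values (the function name `vaddhn` is hypothetical):

```python
from typing import List


def vaddhn(qn: List[int], qm: List[int], esize: int) -> List[int]:
    """Add 2*esize-bit lanes and return the high esize bits of each sum."""
    wide_mask = (1 << (2 * esize)) - 1
    out = []
    for a, b in zip(qn, qm):
        result = (a + b) & wide_mask  # the sum wraps at 2*esize bits
        out.append(result >> esize)   # result<2*esize-1:esize>, truncated
    return out
```

With esize = 8, the lane sum 0x01FF narrows to 0x01: the low byte is dropped rather than rounded, which is the difference from VRADDHN.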


F6.1.30 VADDL 


Vector Add Long adds corresponding elements in two doubleword vectors, and places the results in a quadword 
vector. Before adding, it sign-extends or zero-extends the elements of both operands. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10 9| 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1| U| 1| D|!=11 | Vn  | Vd  | 0  0 0| 0| N| 0| M| 0| Vm|
                               size                      op
A1 variant


VADDL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vaddw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


|15 14 13|12|11 10 9 8| 7| 6| 5 4|3 0|15 12|11 10 9| 8| 7| 6| 5| 4|3 0|
| 1  1  1| U| 1  1 1 1| 1| D|!=11| Vn| Vd  | 0  0 0| 0| N| 0| M| 0| Vm|
                             size                   op


T1 variant 


VADDL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vaddw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the second operand vector, encoded in the "U:size" field. It can 
have the following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        if is_vaddw then
            op1 = Int(Elem[Qin[n>>1],e,2*esize], unsigned);
        else
            op1 = Int(Elem[Din[n],e,esize], unsigned);
        result = op1 + Int(Elem[Din[m],e,esize], unsigned);
        Elem[Q[d>>1],e,2*esize] = result<2*esize-1:0>;
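The long form extends both source elements before adding, so the choice of signed or unsigned data type changes the result. A Python sketch of the is_vaddw = FALSE path, assuming each source vector is a list of raw esize-bit lane values (the names `to_int` and `vaddl` are hypothetical):

```python
from typing import List


def to_int(bits: int, width: int, unsigned: bool) -> int:
    """Model of Int(x, unsigned): interpret a width-bit pattern as an integer."""
    if unsigned or bits < (1 << (width - 1)):
        return bits
    return bits - (1 << width)  # two's complement negative value


def vaddl(dn: List[int], dm: List[int], esize: int, unsigned: bool) -> List[int]:
    """Widening add: esize-bit sources, 2*esize-bit results."""
    wide_mask = (1 << (2 * esize)) - 1
    return [
        (to_int(a, esize, unsigned) + to_int(b, esize, unsigned)) & wide_mask
        for a, b in zip(dn, dm)
    ]
```

For the byte pair 0xFF and 0x01, the U8 form yields 0x0100 (255 + 1), while the S8 form yields 0x0000 (-1 + 1).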


F6.1.31 VADDW 


Vector Add Wide adds corresponding elements in one quadword and one doubleword vector, and places the results 
in a quadword vector. Before adding, it sign-extends or zero-extends the elements of the doubleword operand. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10 9| 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1| U| 1| D|!=11 | Vn  | Vd  | 0  0 0| 1| N| 0| M| 0| Vm|
                               size                      op
A1 variant


VADDW{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vaddw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


|15 14 13|12|11 10 9 8| 7| 6| 5 4|3 0|15 12|11 10 9| 8| 7| 6| 5| 4|3 0|
| 1  1  1| U| 1  1 1 1| 1| D|!=11| Vn| Vd  | 0  0 0| 1| N| 0| M| 0| Vm|
                             size                   op


T1 variant 


VADDW{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vaddw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the second operand vector, encoded in the "U:size" field. It can
have the following values:
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        if is_vaddw then
            op1 = Int(Elem[Qin[n>>1],e,2*esize], unsigned);
        else
            op1 = Int(Elem[Din[n],e,esize], unsigned);
        result = op1 + Int(Elem[Din[m],e,esize], unsigned);
        Elem[Q[d>>1],e,2*esize] = result<2*esize-1:0>;
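In the wide form only the doubleword operand needs extending, because the quadword operand already has 2*esize-bit elements. A Python sketch of the is_vaddw = TRUE path, assuming `qn` holds raw 2*esize-bit lanes and `dm` holds raw esize-bit lanes (the function name `vaddw` is hypothetical):

```python
from typing import List


def vaddw(qn: List[int], dm: List[int], esize: int, unsigned: bool) -> List[int]:
    """Wide add: 2*esize-bit first operand plus extended esize-bit second operand."""
    wide_mask = (1 << (2 * esize)) - 1
    out = []
    for a, b in zip(qn, dm):
        if not unsigned and b >= (1 << (esize - 1)):
            b -= 1 << esize  # sign-extend the narrow element
        # Interpreting the wide element as signed or unsigned gives the same
        # result once the sum is truncated to 2*esize bits.
        out.append((a + b) & wide_mask)
    return out
```

Adding the byte 0xFF to the halfword 0x0100 gives 0x01FF for U8 and 0x00FF for S8.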


F6.1.32  VAND (immediate) 


Vector Bitwise AND (immediate) performs a bitwise AND between a register value and an immediate value, and
returns the result into the destination vector.


This instruction is a pseudo-instruction of the VBIC (immediate) instruction. This means that: 


•   The encodings in this description are named to match the encodings of VBIC (immediate).
•   The assembler syntax is used only for assembly, and is not used on disassembly.
•   The description of VBIC (immediate) gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25|24|23|22|21 20 19|18 16|15 12|11  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1| i| 1| D| 0  0  0| imm3| Vd  |cmode| 0| Q| 1| 1|imm4|


64-bit SIMD vector variant 

Applies when Q == 0. 

VAND{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 
is equivalent to 

VBIC{<c>}{<q>}.<dt> <Dd>, #~<imm>


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 

VAND{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 
is equivalent to 

VBIC{<c>}{<q>}.<dt> <Qd>, #~<imm>


and is never the preferred disassembly. 


T1 


|15 14 13|12|11 10 9 8| 7| 6| 5 4 3|2  0|15 12|11  8| 7| 6| 5| 4|3  0|
| 1  1  1| i| 1  1 1 1| 1| D| 0 0 0|imm3| Vd  |cmode| 0| Q| 1| 1|imm4|


64-bit SIMD vector variant 

Applies when Q == 0. 

VAND{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 
is equivalent to 

VBIC{<c>}{<q>}.<dt> <Dd>, #~<imm>


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 

VAND{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 
is equivalent to 

VBIC{<c>}{<q>}.<dt> <Qd>, #~<imm>


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> The data type used for <imm>. It can be either I16 or I32. I8, I64, and F32 are also permitted, but the
resulting syntax is a pseudo-instruction.


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<imm> Is a constant of the type specified by <dt> that is replicated to fill the destination register. For details
of the range of constants available and the encoding of <dt> and <imm>, see Modified immediate
constants in T32 and A32 Advanced SIMD instructions on page F2-2423.


Operation for all encodings 


The description of VBIC (immediate) gives the operational pseudocode for this instruction. 
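The alias works because ANDing with a constant is the same as clearing the bits of its bitwise inverse. The Python sketch below illustrates this on one doubleword register, assuming the assembler maps VAND with #<imm> to VBIC with the inverted immediate. The helper names are hypothetical, and the sketch does not check whether the inverted constant is encodable as a modified immediate.

```python
MASK64 = (1 << 64) - 1


def replicate(imm: int, esize: int) -> int:
    """Repeat an esize-bit constant across a 64-bit register value."""
    mask = (1 << esize) - 1
    value = 0
    for lane in range(64 // esize):
        value |= (imm & mask) << (lane * esize)
    return value


def vbic_imm(d: int, imm: int, esize: int) -> int:
    """Bit clear: d AND NOT(replicated immediate)."""
    return d & ~replicate(imm, esize) & MASK64


def vand_imm(d: int, imm: int, esize: int) -> int:
    """VAND (immediate) expressed as VBIC with the inverted immediate."""
    inverted = ~imm & ((1 << esize) - 1)
    return vbic_imm(d, inverted, esize)
```

For instance, `vand_imm(d, 0x00FF, 16)` is carried out as a bit clear with 0xFF00 in every halfword.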


F6.1.33 VAND (register) 


Vector Bitwise AND (register) performs a bitwise AND operation between two registers, and places the result in 
the destination register. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 16|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  1| 0  0  1  0| 0| D| 0  0| Vn  | Vd  | 0  0 0 1| N| Q| M| 1| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 
VAND{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VAND{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10 9 8| 7| 6| 5 4|3 0|15 12|11 10 9 8| 7| 6| 5| 4|3 0|
| 1  1  1  0| 1  1 1 1| 0| D| 0 0| Vn| Vd  | 0  0 0 1| N| Q| M| 1| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 
VAND{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VAND{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[n+r] AND D[m+r];
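As a rough illustration of the operation pseudocode, the following Python sketch models the D register file as a list of 64-bit integers. The register file layout, the function name `vand`, and the sample values are assumptions made for the example; enable checks and condition codes are not modelled.

```python
MASK64 = (1 << 64) - 1

def vand(D, d, n, m, regs):
    """Apply the VAND loop to a list of 64-bit D register values."""
    for r in range(regs):
        D[d + r] = (D[n + r] & D[m + r]) & MASK64
    return D

# Hypothetical register file of 32 doubleword registers.
D = [0] * 32
D[2], D[3] = 0xFFFF0000FFFF0000, 0x00000000FFFFFFFF   # Q1
D[4], D[5] = 0x0F0F0F0F0F0F0F0F, 0xFFFFFFFF00000000   # Q2
vand(D, d=0, n=2, m=4, regs=2)                        # models VAND Q0, Q1, Q2
```

With `regs` equal to 2 the loop processes both halves of each Q register, which is how the 128-bit variant reuses the 64-bit operation.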


F6.1.34 VBIC (immediate) 


Vector Bitwise Bit Clear (immediate) performs a bitwise AND between a register value and the complement of an 
immediate value, and returns the result into the destination vector. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


This instruction is used by the pseudo-instruction VAND (immediate). The pseudo-instruction is never the preferred 
disassembly. 


A1 


Bits    31:25    24  23  22  21:19  18:16  15:12  11:8   7  6  5  4  3:0
Field   1111001  i   1   D   000    imm3   Vd     cmode  0  Q  1  1  imm4


64-bit SIMD vector variant 
Applies when Q == 0. 


VBIC{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBIC{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 


Decode for all variants of this encoding 
if cmode<0> == '0' || cmode<3:2> == '11' then SEE "Related encodings";
if Q == '1' && Vd<0> == '1' then UNDEFINED;
imm64 = AdvSIMDExpandImm('1', cmode, i:imm3:imm4);
d = UInt(D:Vd);  regs = if Q == '0' then 1 else 2;


T1


Bits    15:13  12  11:8  7  6  5:3  2:0   |  15:12  11:8   7  6  5  4  3:0
Field   111    i   1111  1  D  000  imm3  |  Vd     cmode  0  Q  1  1  imm4


64-bit SIMD vector variant 
Applies when Q == 0. 


VBIC{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBIC{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 


Decode for all variants of this encoding 


if cmode<0> == '0' || cmode<3:2> == '11' then SEE "Related encodings";
if Q == '1' && Vd<0> == '1' then UNDEFINED;
imm64 = AdvSIMDExpandImm('1', cmode, i:imm3:imm4);
d = UInt(D:Vd);  regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Alias conditions


Pseudo-instruction      Is preferred when
VAND (immediate)        Never

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> The data type used for <imm>. It can be either I16 or I32. I8, I64, and F32 are also permitted, but the
resulting syntax is a pseudo-instruction.


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<imm> Is a constant of the type specified by <dt> that is replicated to fill the destination register. For details
of the range of constants available and the encoding of <dt> and <imm>, see Modified immediate
constants in T32 and A32 Advanced SIMD instructions on page F2-2423.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[d+r] AND NOT(imm64);
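The following Python sketch illustrates how the immediate might be expanded and applied. It covers only the cmode values that the decode accepts for VBIC (immediate), and the expansion rule it uses (an 8-bit value shifted within a 32-bit or 16-bit element, then replicated to 64 bits) is an assumption about AdvSIMDExpandImm that should be checked against Modified immediate constants in T32 and A32 Advanced SIMD instructions. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def expand_imm_for_vbic(cmode, imm8):
    """Expand imm8 (i:imm3:imm4) for the cmode values VBIC (immediate) accepts."""
    if (cmode & 1) == 0 or (cmode >> 2) == 0b11:
        raise ValueError("not VBIC (immediate): see related encodings")
    if (cmode & 0b1000) == 0:
        # Assumed rule for cmode = 0xx1: 32-bit elements, shift 0, 8, 16 or 24.
        shift = 8 * ((cmode >> 1) & 0b11)
        elem, esize = imm8 << shift, 32
    else:
        # Assumed rule for cmode = 10x1: 16-bit elements, shift 0 or 8.
        shift = 8 * ((cmode >> 1) & 0b1)
        elem, esize = imm8 << shift, 16
    imm64 = 0
    for i in range(64 // esize):
        imm64 |= elem << (i * esize)   # replicate the element across 64 bits
    return imm64

def vbic_imm(D, d, regs, imm64):
    """Clear the bits of imm64 in each destination doubleword."""
    for r in range(regs):
        D[d + r] = D[d + r] & ~imm64 & MASK64
    return D

imm64 = expand_imm_for_vbic(0b1011, 0xFF)    # 16-bit elements, imm8 << 8
D = vbic_imm([0xFFFFFFFFFFFFFFFF], d=0, regs=1, imm64=imm64)
```

The destination register is also the first source, so the operation clears the selected bits in place.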


F6.1.35 VBIC (register) 


Vector Bitwise Bit Clear (register) performs a bitwise AND between a register value and the complement of a 
register value, and places the result in the destination register. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


Bits    31:28  27:24  23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111   0010   0   D   01     Vn     Vd     0001  N  Q  M  1  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 
VBIC{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VBIC{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


Bits    15:12  11:8  7  6  5:4  3:0  |  15:12  11:8  7  6  5  4  3:0
Field   1110   1111  0  D  01   Vn   |  Vd     0001  N  Q  M  1  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 
VBIC{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VBIC{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[n+r] AND NOT(D[m+r]);
F6.1.36 VBIF


Vector Bitwise Insert if False inserts each bit from the first source register into the destination register if the 
corresponding bit of the second source register is 0, otherwise leaves the bit in the destination register unchanged. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


Bits    31:28  27:24  23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111   0011   0   D   11     Vn     Vd     0001  N  Q  M  1  Vm
                              op


64-bit SIMD vector variant 
Applies when Q == 0. 


VBIF{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBIF{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if op == '00' then SEE VEOR;
if op == '01' then operation = VBitOps_VBSL;
if op == '10' then operation = VBitOps_VBIT;
if op == '11' then operation = VBitOps_VBIF;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


Bits    15:12  11:8  7  6  5:4  3:0  |  15:12  11:8  7  6  5  4  3:0
Field   1111   1111  0  D  11   Vn   |  Vd     0001  N  Q  M  1  Vm
                           op


64-bit SIMD vector variant 
Applies when Q == 0. 


VBIF{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBIF{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if op == '00' then SEE VEOR;
if op == '01' then operation = VBitOps_VBSL;
if op == '10' then operation = VBitOps_VBIT;
if op == '11' then operation = VBitOps_VBIF;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 
enumeration VBitOps {VBitOps_VBIF, VBitOps_VBIT, VBitOps_VBSL}; 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        case operation of
            when VBitOps_VBIF  D[d+r] = (D[d+r] AND D[m+r]) OR (D[n+r] AND NOT(D[m+r]));
            when VBitOps_VBIT  D[d+r] = (D[n+r] AND D[m+r]) OR (D[d+r] AND NOT(D[m+r]));
            when VBitOps_VBSL  D[d+r] = (D[n+r] AND D[d+r]) OR (D[m+r] AND NOT(D[d+r]));
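The three bitwise insert and select operations differ only in which register acts as the selection mask. The Python sketch below applies each formula to a single 64-bit doubleword; the function names and the sample values are chosen for illustration and are not taken from the manual.

```python
MASK64 = (1 << 64) - 1

def vbif(dd, dn, dm):
    """Insert bits of dn into dd where the corresponding bit of dm is 0."""
    return ((dd & dm) | (dn & ~dm)) & MASK64

def vbit(dd, dn, dm):
    """Insert bits of dn into dd where the corresponding bit of dm is 1."""
    return ((dn & dm) | (dd & ~dm)) & MASK64

def vbsl(dd, dn, dm):
    """Select dn where the original dd bit is 1, otherwise dm."""
    return ((dn & dd) | (dm & ~dd)) & MASK64

dd, dn, dm = 0x00FF, 0x0F0F, 0x3333
results = (vbif(dd, dn, dm), vbit(dd, dn, dm), vbsl(dd, dn, dm))
```

Note that for VBIF and VBIT the mask is the second source register, while for VBSL the mask is the original value of the destination register.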
F6.1.37 VBIT


Vector Bitwise Insert if True inserts each bit from the first source register into the destination register if the 
corresponding bit of the second source register is 1, otherwise leaves the bit in the destination register unchanged. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


Bits    31:28  27:24  23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111   0011   0   D   10     Vn     Vd     0001  N  Q  M  1  Vm
                              op


64-bit SIMD vector variant 
Applies when Q == 0. 


VBIT{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBIT{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if op == '00' then SEE VEOR;
if op == '01' then operation = VBitOps_VBSL;
if op == '10' then operation = VBitOps_VBIT;
if op == '11' then operation = VBitOps_VBIF;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


Bits    15:12  11:8  7  6  5:4  3:0  |  15:12  11:8  7  6  5  4  3:0
Field   1111   1111  0  D  10   Vn   |  Vd     0001  N  Q  M  1  Vm
                           op


64-bit SIMD vector variant 
Applies when Q == 0. 


VBIT{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBIT{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if op == '00' then SEE VEOR;
if op == '01' then operation = VBitOps_VBSL;
if op == '10' then operation = VBitOps_VBIT;
if op == '11' then operation = VBitOps_VBIF;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 
enumeration VBitOps {VBitOps_VBIF, VBitOps_VBIT, VBitOps_VBSL}; 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        case operation of
            when VBitOps_VBIF  D[d+r] = (D[d+r] AND D[m+r]) OR (D[n+r] AND NOT(D[m+r]));
            when VBitOps_VBIT  D[d+r] = (D[n+r] AND D[m+r]) OR (D[d+r] AND NOT(D[m+r]));
            when VBitOps_VBSL  D[d+r] = (D[n+r] AND D[d+r]) OR (D[m+r] AND NOT(D[d+r]));





F6-3308 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


F6.1.38 VBSL


Vector Bitwise Select sets each bit in the destination to the corresponding bit from the first source operand when the 
original destination bit was 1, otherwise from the second source operand. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


Bits    31:28  27:24  23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111   0011   0   D   01     Vn     Vd     0001  N  Q  M  1  Vm
                              op


64-bit SIMD vector variant 
Applies when Q == 0. 


VBSL{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBSL{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if op == '00' then SEE VEOR;
if op == '01' then operation = VBitOps_VBSL;
if op == '10' then operation = VBitOps_VBIT;
if op == '11' then operation = VBitOps_VBIF;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


Bits    15:12  11:8  7  6  5:4  3:0  |  15:12  11:8  7  6  5  4  3:0
Field   1111   1111  0  D  01   Vn   |  Vd     0001  N  Q  M  1  Vm
                           op


64-bit SIMD vector variant 
Applies when Q == 0. 


VBSL{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VBSL{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if op == '00' then SEE VEOR;
if op == '01' then operation = VBitOps_VBSL;
if op == '10' then operation = VBitOps_VBIT;
if op == '11' then operation = VBitOps_VBIF;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 
enumeration VBitOps {VBitOps_VBIF, VBitOps_VBIT, VBitOps_VBSL}; 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        case operation of
            when VBitOps_VBIF  D[d+r] = (D[d+r] AND D[m+r]) OR (D[n+r] AND NOT(D[m+r]));
            when VBitOps_VBIT  D[d+r] = (D[n+r] AND D[m+r]) OR (D[d+r] AND NOT(D[m+r]));
            when VBitOps_VBSL  D[d+r] = (D[n+r] AND D[d+r]) OR (D[m+r] AND NOT(D[d+r]));
F6.1.39 VCEQ (immediate #0)


Vector Compare Equal to Zero takes each element in a vector, and compares it with zero. If it is equal to zero, the 
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros. 


The operand vector elements can be any one of: 

• 8-bit, 16-bit, or 32-bit integers. There is no distinction between signed and unsigned integers.
• 32-bit floating-point numbers.

The result vector elements are fields the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


Bits    31:28  27:24  23  22  21:20  19:18  17:16  15:12  11  10  9:7  6  5  4  3:0
Field   1111   0011   1   D   11     size   01     Vd     0   F   010  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VCEQ{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCEQ{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
floating_point = (F == '1');
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


Bits    15:12  11:8  7  6  5:4  3:2   1:0  |  15:12  11  10  9:7  6  5  4  3:0
Field   1111   1111  1  D  11   size  01   |  Vd     0   F   010  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 
VCEQ{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCEQ{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 
Decode for all variants of this encoding


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
floating_point = (F == '1');
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "F:size" field. It can have the 
following values: 
I8 when F = 0, size = 00
I16 when F = 0, size = 01
I32 when F = 0, size = 10
F32 when F = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if floating_point then
                bits(esize) zero = FPZero('0');
                test_passed = FPCompareEQ(Elem[D[m+r],e,esize], zero, StandardFPSCRValue());
            else
                test_passed = (Elem[D[m+r],e,esize] == Zeros(esize));
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
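For the integer data types, the per-element behaviour can be sketched in Python as below. The sketch handles one 64-bit doubleword at a time; the function name `vceq_zero_int` is hypothetical, and the floating-point path (FPCompareEQ with the standard FPSCR value) is deliberately not modelled.

```python
def vceq_zero_int(dreg, esize):
    """Integer VCEQ #0 on one 64-bit value holding esize-bit elements."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        elem = (dreg >> (e * esize)) & mask       # Elem[D[m+r], e, esize]
        if elem == 0:
            result |= mask << (e * esize)         # Ones(esize)
        # otherwise the element stays Zeros(esize)
    return result

out = vceq_zero_int(0x00FF000000120000, 16)
```

Elements 0 and 2 of the example input are zero, so only those two 16-bit lanes of the result are set to all ones.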
F6.1.40 VCEQ (register)


Vector Compare Equal takes each element in a vector, and compares it with the corresponding element of a second 
vector. If they are equal, the corresponding element in the destination vector is set to all ones. Otherwise, it is set to 
all zeros. 


The operand vector elements can be any one of: 

• 8-bit, 16-bit, or 32-bit integers. There is no distinction between signed and unsigned integers.
• 32-bit floating-point numbers.

The result vector elements are fields the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


Bits    31:28  27:24  23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111   0011   0   D   size   Vn     Vd     1000  N  Q  M  1  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VCEQ{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCEQ{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
int_operation = TRUE;  esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


A2 


Bits    31:28  27:24  23  22  21  20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111   0010   0   D   0   sz  Vn     Vd     1110  N  Q  M  0  Vm


64-bit SIMD vector variant 

Applies when Q == 0. 
VCEQ{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VCEQ{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
int_operation = FALSE;
esize = 32;  elements = 2;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


Bits    15:12  11:8  7  6  5:4   3:0  |  15:12  11:8  7  6  5  4  3:0
Field   1111   1111  0  D  size  Vn   |  Vd     1000  N  Q  M  1  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VCEQ{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCEQ{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
int_operation = TRUE;  esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T2 


Bits    15:12  11:8  7  6  5  4   3:0  |  15:12  11:8  7  6  5  4  3:0
Field   1110   1111  0  D  0  sz  Vn   |  Vd     1110  N  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VCEQ{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCEQ{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
int_operation = FALSE;
esize = 32;  elements = 2;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
Assembler symbols

<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.
For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> For encoding A1 and T1: is the data type for the elements of the vectors, encoded in the "size" field.
It can have the following values:
I8 when size = 00
I16 when size = 01
I32 when size = 10
For encoding A2 and T2: is the data type for the elements of the vectors, encoded in the "sz" field.
It can have the following values:
F32 when sz = 0
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r],e,esize];  op2 = Elem[D[m+r],e,esize];
            if int_operation then
                test_passed = (op1 == op2);
            else
                test_passed = FPCompareEQ(op1, op2, StandardFPSCRValue());
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
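The integer form of this operation, including the derivation of the element size from the size field, can be sketched in Python as follows. The helper names `elements_of`, `pack`, and `vceq_int` are hypothetical, `size` is passed as an integer rather than a bit string, and the floating-point comparison is not modelled.

```python
def elements_of(dreg, esize):
    """Split a 64-bit value into esize-bit elements, lowest element first."""
    mask = (1 << esize) - 1
    return [(dreg >> (e * esize)) & mask for e in range(64 // esize)]

def pack(elems, esize):
    """Inverse of elements_of: rebuild the 64-bit value."""
    value = 0
    for e, x in enumerate(elems):
        value |= x << (e * esize)
    return value

def vceq_int(dn, dm, size):
    """Integer VCEQ (register) on one doubleword pair; size is the 2-bit field."""
    if size == 0b11:
        raise ValueError("UNDEFINED")
    esize = 8 << size                     # esize = 8 << UInt(size)
    ones = (1 << esize) - 1
    pairs = zip(elements_of(dn, esize), elements_of(dm, esize))
    return pack([ones if a == b else 0 for a, b in pairs], esize)

out = vceq_int(0x0102030405060708, 0x0102FF04050607FF, size=0)
```

In the example, byte lanes 0 and 5 differ between the two inputs, so those lanes are cleared and every other lane is set to all ones.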
F6.1.41 VCGE (immediate #0)
Vector Compare Greater Than or Equal to Zero takes each element in a vector, and compares it with zero. If it is 
greater than or equal to zero, the corresponding element in the destination vector is set to all ones. Otherwise, it is 
set to all zeros. 
The operand vector elements can be any one of: 
• 8-bit, 16-bit, or 32-bit signed integers.
• 32-bit floating-point numbers.
The result vector elements are fields the same size as the operand vector elements. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
Bits    31:28  27:24  23  22  21:20  19:18  17:16  15:12  11  10  9:7  6  5  4  3:0
Field   1111   0011   1   D   11     size   01     Vd     0   F   001  Q  M  0  Vm
64-bit SIMD vector variant 
Applies when Q == 0. 
VCGE{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 
128-bit SIMD vector variant 
Applies when Q == 1. 
VCGE{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 
Decode for all variants of this encoding 
if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
floating_point = (F == '1');
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
T1 
Bits    15:12  11:8  7  6  5:4  3:2   1:0  |  15:12  11  10  9:7  6  5  4  3:0
Field   1111   1111  1  D  11   size  01   |  Vd     0   F   001  Q  M  0  Vm
64-bit SIMD vector variant 
Applies when Q == 0. 
VCGE{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 
128-bit SIMD vector variant 
Applies when Q == 1. 
VCGE{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 
Decode for all variants of this encoding


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
floating_point = (F == '1');
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "F:size" field. It can have the 
following values: 
S8 when F = 0, size = 00
S16 when F = 0, size = 01
S32 when F = 0, size = 10
F32 when F = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if floating_point then
                bits(esize) zero = FPZero('0');
                test_passed = FPCompareGE(Elem[D[m+r],e,esize], zero, StandardFPSCRValue());
            else
                test_passed = (SInt(Elem[D[m+r],e,esize]) >= 0);
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
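The integer path depends on interpreting each element as a signed value, which the pseudocode expresses with SInt(). The Python sketch below shows one way to model that for a single doubleword; `sint` and `vcge_zero_int` are hypothetical helper names, and the floating-point path is not modelled.

```python
def sint(value, width):
    """Interpret a width-bit pattern as a two's complement signed integer."""
    return value - (1 << width) if value & (1 << (width - 1)) else value

def vcge_zero_int(dreg, esize):
    """Signed integer VCGE #0 on one 64-bit value holding esize-bit elements."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        elem = (dreg >> (e * esize)) & mask
        if sint(elem, esize) >= 0:
            result |= mask << (e * esize)   # Ones(esize) when the test passes
    return result

out = vcge_zero_int(0x8000000000000001, 32)
```

The upper 32-bit element has its sign bit set and therefore fails the test, while the lower element (value 1) passes.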
F6.1.42 VCGE (register)
Vector Compare Greater Than or Equal takes each element in a vector, and compares it with the corresponding 
element of a second vector. If the first is greater than or equal to the second, the corresponding element in the 
destination vector is set to all ones. Otherwise, it is set to all zeros. 
The operand vector elements can be any one of: 
• 8-bit, 16-bit, or 32-bit signed integers.
• 8-bit, 16-bit, or 32-bit unsigned integers.
• 32-bit floating-point numbers.
The result vector elements are fields the same size as the operand vector elements. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
This instruction is used by the pseudo-instruction VCLE (register). The pseudo-instruction is never the preferred 
disassembly. 
A1 
Bits    31:25    24  23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   1111001  U   0   D   size   Vn     Vd     0011  N  Q  M  1  Vm
64-bit SIMD vector variant 
Applies when Q == 0. 
VCGE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VCGE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
type = if U == '1' then VCGEtype_unsigned else VCGEtype_signed;
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
A2 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D 0 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |
64-bit SIMD vector variant 
Applies when Q == 0. 
VCGE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCGE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
type = VCGEtype_fp; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 1 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VCGE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCGE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if size == '11' then UNDEFINED; 
type = if U == '1' then VCGEtype_unsigned else VCGEtype_signed; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D 0 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCGE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VCGE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
type = VCGEtype_fp; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Alias conditions 





Pseudo-instruction    Is preferred when 
VCLE (register)       Never 





Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding 
must be unconditional. 

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 
<dt> For encoding A1 and T1: is the data type for the elements of the operands, encoded in the "U:size" 
field. It can have the following values: 
S8 when U = 0, size = 00 
S16 when U = 0, size = 01 
S32 when U = 0, size = 10 
U8 when U = 1, size = 00 
U16 when U = 1, size = 01 
U32 when U = 1, size = 10 


For encoding A2 and T2: is the data type for the elements of the vectors, encoded in the "sz" field. 
It can have the following values: 


F32 when sz = 0 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 
enumeration VCGEtype {VCGEtype_signed, VCGEtype_unsigned, VCGEtype_fp}; 

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r],e,esize]; op2 = Elem[D[m+r],e,esize];
            case type of
                when VCGEtype_signed test_passed = (SInt(op1) >= SInt(op2));
                when VCGEtype_unsigned test_passed = (UInt(op1) >= UInt(op2));
                when VCGEtype_fp test_passed = FPCompareGE(op1, op2, StandardFPSCRValue());
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
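
The integer cases of the VCGE (register) operation can be modelled outside the architecture pseudocode. The following Python sketch is illustrative only: the function name vcge_register and the representation of a D register as a 64-bit Python integer are assumptions made for this example, and the floating-point case (FPCompareGE) is deliberately left out.

```python
def vcge_register(dn: int, dm: int, esize: int, unsigned: bool = False) -> int:
    """Lane-wise dn >= dm on one 64-bit D register (integer lanes only)."""
    if esize not in (8, 16, 32):
        raise ValueError("esize must be 8, 16, or 32")
    mask = (1 << esize) - 1            # corresponds to Ones(esize)
    sign = 1 << (esize - 1)
    result = 0
    for e in range(64 // esize):       # elements = 64 DIV esize
        op1 = (dn >> (e * esize)) & mask
        op2 = (dm >> (e * esize)) & mask
        if not unsigned:
            # SInt(): reinterpret each lane as a two's complement value
            op1 = op1 - (1 << esize) if op1 & sign else op1
            op2 = op2 - (1 << esize) if op2 & sign else op2
        if op1 >= op2:
            result |= mask << (e * esize)
    return result
```

With 32-bit lanes, a lane holding 0xFFFFFFFF passes the test against 0 when treated as U32 but fails it when treated as S32, which is the only difference between the two integer data types.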
F6.1.43 VCGT (immediate #0) 


Vector Compare Greater Than Zero takes each element in a vector, and compares it with zero. If it is greater than 
zero, the corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros. 


The operand vector elements can be any one of: 

• 8-bit, 16-bit, or 32-bit signed integers. 
• 32-bit floating-point numbers. 

The result vector elements are fields the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 F 0 0 | 0 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VCGT{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCGT{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
floating_point = (F == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 F 0 0 | 0 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 
VCGT{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCGT{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 
Decode for all variants of this encoding 

if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
floating_point = (F == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "F:size" field. It can have the 
following values: 
S8 when F = 0, size = 00 
S16 when F = 0, size = 01 
S32 when F = 0, size = 10 
F32 when F = 1, size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
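
The decode pseudocode and the <dt> table above describe the same mapping from encoding fields to operand shape. As a rough illustration, the Python sketch below mirrors that decode step. The names decode_compare_zero and UndefinedInstruction are hypothetical, chosen for this example, and the fields are passed as plain integers.

```python
class UndefinedInstruction(Exception):
    """Stands in for the UNDEFINED outcome of the decode pseudocode."""


def decode_compare_zero(F, size, Q, D, Vd, M, Vm):
    """Decode the fields of a compare-with-zero encoding into operand info."""
    if size == 0b11 or (F == 1 and size != 0b10):
        raise UndefinedInstruction("reserved F:size combination")
    if Q == 1 and ((Vd & 1) or (Vm & 1)):
        # A Q register is encoded as an even-numbered D register index.
        raise UndefinedInstruction("odd register index with Q == 1")
    esize = 8 << size
    return {
        "dt": ("F" if F else "S") + str(esize),
        "esize": esize,
        "elements": 64 // esize,
        "d": (D << 4) | Vd,            # UInt(D:Vd)
        "m": (M << 4) | Vm,            # UInt(M:Vm)
        "regs": 1 if Q == 0 else 2,
    }
```

For example, F = 0, size = 01, Q = 1 decodes to S16 with four elements per 64-bit register and two registers processed.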


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if floating_point then
                bits(esize) zero = FPZero('0');
                test_passed = FPCompareGT(Elem[D[m+r],e,esize], zero, StandardFPSCRValue());
            else
                test_passed = (SInt(Elem[D[m+r],e,esize]) > 0);
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
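
The following Python sketch models the operation above for one 64-bit D register, including the F32 case. It is not a definitive model: the function name vcgt_zero is hypothetical, floating-point exception flags are ignored, and the sketch assumes that StandardFPSCRValue() selects flush-to-zero, so a denormal input is treated as zero.

```python
import struct


def vcgt_zero(dm: int, esize: int, floating_point: bool = False) -> int:
    """Lane-wise test "element > 0" on one 64-bit D register."""
    if floating_point and esize != 32:
        raise ValueError("only F32 lanes are supported")
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        lane = (dm >> (e * esize)) & mask
        if floating_point:
            if (lane >> 23) & 0xFF == 0:
                value = 0.0            # zero, or denormal flushed to zero (assumption)
            else:
                value = struct.unpack("<f", struct.pack("<I", lane))[0]
            passed = value > 0.0       # any comparison with a NaN is False
        else:
            value = lane - (1 << esize) if lane >> (esize - 1) else lane
            passed = value > 0
        if passed:
            result |= mask << (e * esize)
    return result
```

A NaN lane therefore produces all zeros, the same result as a lane that is less than or equal to zero.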
F6.1.44 VCGT (register) 
Vector Compare Greater Than takes each element in a vector, and compares it with the corresponding element of a 
second vector. If the first is greater than the second, the corresponding element in the destination vector is set to all 
ones. Otherwise, it is set to all zeros. 
The operand vector elements can be any one of: 
• 8-bit, 16-bit, or 32-bit signed integers. 
• 8-bit, 16-bit, or 32-bit unsigned integers. 
• 32-bit floating-point numbers. 
The result vector elements are fields the same size as the operand vector elements. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
This instruction is used by the pseudo-instruction VCLT (register). The pseudo-instruction is never the preferred 
disassembly. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 0 | Vm |
64-bit SIMD vector variant 
Applies when Q == 0. 
VCGT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VCGT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if size == '11' then UNDEFINED; 
type = if U == '1' then VCGTtype_unsigned else VCGTtype_signed; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 
A2 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D 1 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |
64-bit SIMD vector variant 
Applies when Q == 0. 
VCGT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCGT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
type = VCGTtype_fp; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VCGT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCGT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if size == '11' then UNDEFINED; 
type = if U == '1' then VCGTtype_unsigned else VCGTtype_signed; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D 1 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCGT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VCGT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
type = VCGTtype_fp; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Alias conditions 





Pseudo-instruction    Is preferred when 
VCLT (register)       Never 





Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding 
must be unconditional. 

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 
<dt> For encoding A1 and T1: is the data type for the elements of the operands, encoded in the "U:size" 
field. It can have the following values: 
S8 when U = 0, size = 00 
S16 when U = 0, size = 01 
S32 when U = 0, size = 10 
U8 when U = 1, size = 00 
U16 when U = 1, size = 01 
U32 when U = 1, size = 10 


For encoding A2 and T2: is the data type for the elements of the vectors, encoded in the "sz" field. 
It can have the following values: 


F32 when sz = 0 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 
enumeration VCGTtype {VCGTtype_signed, VCGTtype_unsigned, VCGTtype_fp}; 

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r],e,esize]; op2 = Elem[D[m+r],e,esize];
            case type of
                when VCGTtype_signed test_passed = (SInt(op1) > SInt(op2));
                when VCGTtype_unsigned test_passed = (UInt(op1) > UInt(op2));
                when VCGTtype_fp test_passed = FPCompareGT(op1, op2, StandardFPSCRValue());
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
F6.1.45 VCLE (immediate #0) 


Vector Compare Less Than or Equal to Zero takes each element in a vector, and compares it with zero. If it is less 
than or equal to zero, the corresponding element in the destination vector is set to all ones. Otherwise, it is set to all 
zeros. 


The operand vector elements can be any one of: 

• 8-bit, 16-bit, or 32-bit signed integers. 
• 32-bit floating-point numbers. 

The result vector elements are fields the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 F 0 1 | 1 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VCLE{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCLE{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
floating_point = (F == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 F 0 1 | 1 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 
VCLE{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCLE{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 
Decode for all variants of this encoding 

if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
floating_point = (F == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "F:size" field. It can have the 
following values: 
S8 when F = 0, size = 00 
S16 when F = 0, size = 01 
S32 when F = 0, size = 10 
F32 when F = 1, size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if floating_point then
                bits(esize) zero = FPZero('0');
                test_passed = FPCompareGE(zero, Elem[D[m+r],e,esize], StandardFPSCRValue());
            else
                test_passed = (SInt(Elem[D[m+r],e,esize]) <= 0);
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
F6.1.46 VCLE (register) 


Vector Compare Less Than or Equal takes each element in a vector, and compares it with the corresponding element 
of a second vector. If the first is less than or equal to the second, the corresponding element in the destination vector 
is set to all ones. Otherwise, it is set to all zeros. 


This instruction is a pseudo-instruction of the VCGE (register) instruction. This means that: 


• The encodings in this description are named to match the encodings of VCGE (register). 
• The assembler syntax is used only for assembly, and is not used on disassembly. 
• The description of VCGE (register) gives the operational pseudocode for this instruction. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 1 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VCLE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


A2 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D 0 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 
128-bit SIMD vector variant 

Applies when Q == 1. 
VCLE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 1 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VCLE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D 0 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLE{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGE{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 


Applies when Q == 1. 
VCLE{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 
VCGE{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


Assembler symbols 


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding 
must be unconditional. 

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 
<dt> For encoding A1 and T1: is the data type for the elements of the operands, encoded in the "U:size" 
field. It can have the following values: 
S8 when U = 0, size = 00 
S16 when U = 0, size = 01 
S32 when U = 0, size = 10 
U8 when U = 1, size = 00 
U16 when U = 1, size = 01 
U32 when U = 1, size = 10 


For encoding A2 and T2: is the data type for the elements of the vectors, encoded in the "sz" field. 
It can have the following values: 


F32 when sz = 0 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


Operation for all encodings 


The description of VCGE (register) gives the operational pseudocode for this instruction. 
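
Because VCLE (register) is assembled as VCGE (register) with the two source operands exchanged, a software model needs no separate comparison. The Python sketch below illustrates this for integer lanes on one 64-bit D register. The function names vcge and vcle are hypothetical, and the floating-point data type is not modelled.

```python
def vcge(dn: int, dm: int, esize: int, signed: bool = True) -> int:
    """Lane-wise dn >= dm for integer lanes of one 64-bit D register."""
    mask = (1 << esize) - 1

    def lane(value: int, e: int) -> int:
        x = (value >> (e * esize)) & mask
        # Reinterpret as two's complement when the data type is signed.
        return x - (1 << esize) if signed and x >> (esize - 1) else x

    result = 0
    for e in range(64 // esize):
        if lane(dn, e) >= lane(dm, e):
            result |= mask << (e * esize)
    return result


def vcle(dn: int, dm: int, esize: int, signed: bool = True) -> int:
    """VCLE Dd, Dn, Dm is modelled as VCGE Dd, Dm, Dn (operands swapped)."""
    return vcge(dm, dn, esize, signed)
```

This operand exchange is also why a disassembler always reports the VCGE form: the encoding carries no record of which mnemonic the programmer wrote.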
F6.1.47 VCLS 


Vector Count Leading Sign Bits counts the number of consecutive bits following the topmost bit, that are the same 
as the topmost bit, in each element in a vector, and places the results in a second vector. The count does not include 
the topmost bit itself. 


The operand vector elements can be any one of 8-bit, 16-bit, or 32-bit signed integers. 
The result vector elements are the same data type as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 0 | Vd | 0 1 0 0 | 0 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VCLS{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCLS{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 
if size == '11' then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 0 | Vd | 0 1 0 0 | 0 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 
VCLS{<c>}{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCLS{<c>}{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding 

if size == '11' then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the 
following values: 


S8 when size = 00 
S16 when size = 01 
S32 when size = 10 


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = CountLeadingSignBits(Elem[D[m+r],e,esize])<esize-1:0>;
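
As an informal illustration of the count performed on each element, the Python sketch below counts the bits below the topmost bit that match it, then applies that count to every lane of a 64-bit D register. The names count_leading_sign_bits and vcls are chosen for this example and are not the architecture's pseudocode functions.

```python
def count_leading_sign_bits(x: int, esize: int) -> int:
    """Count consecutive bits below the top bit that equal the top bit."""
    top = (x >> (esize - 1)) & 1
    count = 0
    for bit in range(esize - 2, -1, -1):
        if (x >> bit) & 1 != top:
            break
        count += 1
    return count


def vcls(dm: int, esize: int) -> int:
    """Apply the count to every esize-bit lane of one 64-bit D register."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        lane = (dm >> (e * esize)) & mask
        result |= count_leading_sign_bits(lane, esize) << (e * esize)
    return result
```

Because the topmost bit itself is excluded, the largest possible result for an element is esize-1, which occurs for all-zeros and all-ones elements.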
F6.1.48 VCLT (immediate #0) 


Vector Compare Less Than Zero takes each element in a vector, and compares it with zero. If it is less than zero, the 
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros. 


The operand vector elements can be any one of: 

• 8-bit, 16-bit, or 32-bit signed integers. 
• 32-bit floating-point numbers. 

The result vector elements are fields the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 F 1 0 | 0 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VCLT{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCLT{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
floating_point = (F == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 0 1 | Vd | 0 F 1 0 | 0 Q M 0 | Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 
VCLT{<c>}{<q>}.<dt> {<Dd>,} <Dm>, #0 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCLT{<c>}{<q>}.<dt> {<Qd>,} <Qm>, #0 
Decode for all variants of this encoding 

if size == '11' || (F == '1' && size != '10') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
floating_point = (F == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "F:size" field. It can have the 
following values: 
S8 when F = 0, size = 00 
S16 when F = 0, size = 01 
S32 when F = 0, size = 10 
F32 when F = 1, size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if floating_point then
                bits(esize) zero = FPZero('0');
                test_passed = FPCompareGT(zero, Elem[D[m+r],e,esize], StandardFPSCRValue());
            else
                test_passed = (SInt(Elem[D[m+r],e,esize]) < 0);
            Elem[D[d+r],e,esize] = if test_passed then Ones(esize) else Zeros(esize);
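
For the signed integer data types, the test SInt(x) < 0 holds exactly when the topmost bit of the element is set, so the integer case amounts to replicating each lane's sign bit across the lane. The Python sketch below shows this for one 64-bit D register. The function name vclt_zero_int is hypothetical, and the shortcut does not apply to the F32 data type, where negative zero and NaN inputs must fail the test.

```python
def vclt_zero_int(dm: int, esize: int) -> int:
    """Lane-wise test "element < 0" for signed integer lanes."""
    if esize not in (8, 16, 32):
        raise ValueError("esize must be 8, 16, or 32")
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        lane = (dm >> (e * esize)) & mask
        if lane >> (esize - 1):        # sign bit set means SInt(lane) < 0
            result |= mask << (e * esize)
    return result
```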
F6.1.49 VCLT (register) 


Vector Compare Less Than takes each element in a vector, and compares it with the corresponding element of a 
second vector. If the first is less than the second, the corresponding element in the destination vector is set to all 
ones. Otherwise, it is set to all zeros. 


This instruction is a pseudo-instruction of the VCGT (register) instruction. This means that: 


• The encodings in this description are named to match the encodings of VCGT (register). 
• The assembler syntax is used only for assembly, and is not used on disassembly. 
• The description of VCGT (register) gives the operational pseudocode for this instruction. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VCLT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


A2 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D 1 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 
128-bit SIMD vector variant 

Applies when Q == 1. 
VCLT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 0 D size | Vn | Vd | 0 0 1 1 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VCLT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D 1 sz | Vn | Vd | 1 1 1 0 | N Q M 0 | Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VCLT{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
is equivalent to 

VCGT{<c>}{<q>}.<dt> <Dd>, <Dm>, <Dn> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 


Applies when Q == 1. 
VCLT{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


is equivalent to 


VCGT{<c>}{<q>}.<dt> <Qd>, <Qm>, <Qn> 


and is never the preferred disassembly. 


Assembler symbols 


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding 
must be unconditional. 

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 
<dt> For encoding A1 and T1: is the data type for the elements of the operands, encoded in the "U:size" 
field. It can have the following values: 
S8 when U = 0, size = 00 
S16 when U = 0, size = 01 
S32 when U = 0, size = 10 
U8 when U = 1, size = 00 
U16 when U = 1, size = 01 
U32 when U = 1, size = 10 

For encoding A2 and T2: is the data type for the elements of the vectors, encoded in the "sz" field. 
It can have the following values: 

F32 when sz = 0 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


Operation for all encodings 


The description of VCGT (register) gives the operational pseudocode for this instruction. 
F6.1.50 VCLZ


Vector Count Leading Zeros counts the number of consecutive zeros, starting from the most significant bit, in each 
element in a vector, and places the results in a second vector. 


The operand vector elements can be any one of 8-bit, 16-bit, or 32-bit integers. There is no distinction between 
signed and unsigned integers. 


The result vector elements are the same data type as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


[Encoding diagram: fields D, size, Vd, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCLZ{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCLZ{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


[Encoding diagram: fields D, size, Vd, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 
VCLZ{<c>}{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCLZ{<c>}{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the 
following values: 


I8 when size = 00 
I16 when size = 01
I32 when size = 10


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = CountLeadingZeroBits(Elem[D[m+r],e,esize])<esize-1:0>;
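
The per-element loop above can be modelled in a few lines of Python. This is a minimal sketch, not the architectural definition: the function names `count_leading_zeros` and `vclz` are hypothetical, and a vector register is assumed to be represented as a plain integer with element 0 in the least significant bits.

```python
def count_leading_zeros(value: int, esize: int) -> int:
    """Count consecutive zero bits, starting from the most significant bit,
    of an esize-bit value. A value of zero yields esize."""
    value &= (1 << esize) - 1
    return esize - value.bit_length()


def vclz(vector: int, esize: int, total_bits: int = 64) -> int:
    """Apply count_leading_zeros to each esize-bit element of a vector that
    is modelled as an integer (an illustrative representation only)."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(total_bits // esize):
        elem = (vector >> (e * esize)) & mask
        result |= count_leading_zeros(elem, esize) << (e * esize)
    return result
```

For example, with 16-bit elements, `vclz(0x000100FF80000000, 16)` processes the elements 0x0000, 0x8000, 0x00FF, and 0x0001 (lowest first) and yields counts of 16, 0, 8, and 15.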
F6.1.51 VCMP
Vector Compare compares two floating-point registers, or one floating-point register and zero. It writes the result to 
the FPSCR flags. These are normally transferred to the PSTATE.{N, Z, C, V} Condition flags by a subsequent VMRS 
instruction. 
It raises an Invalid Operation exception only if either operand is a signaling NaN. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
[Encoding diagram: fields cond, D, Vd, size, E, M, Vm]
Single-precision scalar variant 
Applies when size == 10. 
VCMP{<c>}{<q>}.F32 <Sd>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VCMP{<c>}{<q>}.F64 <Dd>, <Dm> 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
A2 
[Encoding diagram: fields cond, D, Vd, size, E]
Single-precision scalar variant 
Applies when size == 10. 
VCMP{<c>}{<q>}.F32 <Sd>, #0.0 
Double-precision scalar variant 
Applies when size == 11. 
VCMP{<c>}{<q>}.F64 <Dd>, #0.0 
Decode for all variants of this encoding


if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = TRUE;
case size of
    when '10' esize = 32; d = UInt(Vd:D);
    when '11' esize = 64; d = UInt(D:Vd);


T1 


[Encoding diagram: fields D, Vd, size, E, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCMP{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCMP{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);


T2 


[Encoding diagram: fields D, Vd, size, E]


Single-precision scalar variant 
Applies when size == 10. 


VCMP{<c>}{<q>}.F32 <Sd>, #0.0 


Double-precision scalar variant 
Applies when size == 11. 


VCMP{<c>}{<q>}.F64 <Dd>, #0.0 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = TRUE;
case size of
    when '10' esize = 32; d = UInt(Vd:D);
    when '11' esize = 64; d = UInt(D:Vd);
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 

NaNs 

The IEEE 754 standard specifies that the result of a comparison is precisely one of <, ==, > or unordered. If either
or both of the operands are NaNs, they are unordered, and all three of (Operand1 < Operand2), (Operand1 ==
Operand2) and (Operand1 > Operand2) are false. This results in the FPSCR flags being set as N=0, Z=0, C=1 and
V=1.


VCMPE raises an Invalid Operation exception if either operand is any type of NaN, and is suitable for testing for <, 
<=, >, >=, and other predicates that raise an exception when the operands are unordered. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            bits(32) op32 = if with_zero then FPZero('0') else S[m];
            FPSCR.<N,Z,C,V> = FPCompare(S[d], op32, quiet_nan_exc, FPSCR);
        when 64
            bits(64) op64 = if with_zero then FPZero('0') else D[m];
            FPSCR.<N,Z,C,V> = FPCompare(D[d], op64, quiet_nan_exc, FPSCR);
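
The flag result of the comparison can be illustrated with a small Python model. This is a sketch under stated assumptions: the unordered result (N=0, Z=0, C=1, V=1) is taken from the NaNs text in this section, while the encodings used for equal, less than, and greater than are the conventional ARM floating-point compare results and are not spelled out on this page. The function name `fp_compare_flags` is hypothetical, and exception signaling is not modelled.

```python
import math


def fp_compare_flags(op1: float, op2: float) -> str:
    """Return the N, Z, C, V flags as a 4-character bit string."""
    if math.isnan(op1) or math.isnan(op2):
        return "0011"  # unordered: N=0, Z=0, C=1, V=1 (from the NaNs text)
    if op1 == op2:
        return "0110"  # equal (assumed conventional encoding)
    if op1 < op2:
        return "1000"  # less than (assumed conventional encoding)
    return "0010"      # greater than (assumed conventional encoding)
```

Because the comparison with zero form uses the same flag logic, `fp_compare_flags(x, 0.0)` models the `#0.0` variants.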
F6.1.52 VCMPE
Vector Compare, raising Invalid Operation on NaN compares two floating-point registers, or one floating-point 
register and zero. It writes the result to the FPSCR flags. These are normally transferred to the PSTATE.{N, Z, C, 
V} Condition flags by a subsequent VMRS instruction. 
It raises an Invalid Operation exception if either operand is any type of NaN. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
[Encoding diagram: fields cond, D, Vd, size, E, M, Vm]
Single-precision scalar variant 
Applies when size == 10. 
VCMPE{<c>}{<q>}.F32 <Sd>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VCMPE{<c>}{<q>}.F64 <Dd>, <Dm> 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
A2 
[Encoding diagram: fields cond, D, Vd, size, E]
Single-precision scalar variant 
Applies when size == 10. 
VCMPE{<c>}{<q>}.F32 <Sd>, #0.0 
Double-precision scalar variant 
Applies when size == 11. 
VCMPE{<c>}{<q>}.F64 <Dd>, #0.0 
Decode for all variants of this encoding


if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = TRUE;
case size of
    when '10' esize = 32; d = UInt(Vd:D);
    when '11' esize = 64; d = UInt(D:Vd);


T1 


[Encoding diagram: fields D, Vd, size, E, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCMPE{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCMPE{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);


T2 


[Encoding diagram: fields D, Vd, size, E]


Single-precision scalar variant 
Applies when size == 10. 


VCMPE{<c>}{<q>}.F32 <Sd>, #0.0 


Double-precision scalar variant 
Applies when size == 11. 


VCMPE{<c>}{<q>}.F64 <Dd>, #0.0 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
quiet_nan_exc = (E == '1'); with_zero = TRUE;
case size of
    when '10' esize = 32; d = UInt(Vd:D);
    when '11' esize = 64; d = UInt(D:Vd);
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 

NaNs 

The IEEE 754 standard specifies that the result of a comparison is precisely one of <, ==, > or unordered. If either
or both of the operands are NaNs, they are unordered, and all three of (Operand1 < Operand2), (Operand1 ==
Operand2) and (Operand1 > Operand2) are false. This results in the FPSCR flags being set as N=0, Z=0, C=1 and
V=1.


VCMPE raises an Invalid Operation exception if either operand is any type of NaN, and is suitable for testing for <, 
<=, >, >=, and other predicates that raise an exception when the operands are unordered. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            bits(32) op32 = if with_zero then FPZero('0') else S[m];
            FPSCR.<N,Z,C,V> = FPCompare(S[d], op32, quiet_nan_exc, FPSCR);
        when 64
            bits(64) op64 = if with_zero then FPZero('0') else D[m];
            FPSCR.<N,Z,C,V> = FPCompare(D[d], op64, quiet_nan_exc, FPSCR);
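
The role of quiet_nan_exc can be sketched in Python as follows. This is an illustrative model only: the helper names `nan_kind` and `raises_invalid_operation` are hypothetical, operands are assumed to be 32-bit single-precision bit patterns, and the quiet or signaling distinction is assumed to follow the usual IEEE 754 convention in which the most significant fraction bit is set for a quiet NaN.

```python
def nan_kind(bits: int) -> str:
    """Classify a 32-bit single-precision bit pattern as 'quiet',
    'signaling', or 'not-nan'."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return "not-nan"  # finite values and infinities
    return "quiet" if fraction & 0x400000 else "signaling"


def raises_invalid_operation(op1_bits: int, op2_bits: int,
                             quiet_nan_exc: bool) -> bool:
    """Model when a compare raises Invalid Operation: always for a signaling
    NaN operand, and for a quiet NaN operand only when quiet_nan_exc is set."""
    kinds = {nan_kind(op1_bits), nan_kind(op2_bits)}
    if "signaling" in kinds:
        return True
    return quiet_nan_exc and "quiet" in kinds
```

With `quiet_nan_exc` set to `True`, as for the E == '1' form, a quiet NaN operand such as 0x7FC00000 is reported; with it set to `False`, only a signaling NaN is.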
F6.1.53 VCNT


Vector Count Set Bits counts the number of bits that are one in each element in a vector, and places the results in a
second vector.

The operand vector elements must be 8-bit fields.


The result vector elements are 8-bit integers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For
more information see Enabling Advanced SIMD and floating-point support on page G1-3881.


A1 


[Encoding diagram: fields D, size, Vd, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCNT{<c>}{<q>}.8 <Dd>, <Dm> // Encoded as Q = 0 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCNT{<c>}{<q>}.8 <Qd>, <Qm> // Encoded as Q=1 


Decode for all variants of this encoding 


if size != '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8; elements = 8;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


[Encoding diagram: fields D, size, Vd, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCNT{<c>}{<q>}.8 <Dd>, <Dm> // Encoded as Q = 0 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCNT{<c>}{<q>}.8 <Qd>, <Qm> // Encoded as Q=1 


Decode for all variants of this encoding 


if size != '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8; elements = 8;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = BitCount(Elem[D[m+r],e,esize])<esize-1:0>;
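
As a rough illustration of the operation above, the following Python sketch counts the set bits in each byte of a vector. The function name `vcnt` is hypothetical, and the vector is assumed to be modelled as an integer with element 0 in the least significant byte.

```python
def vcnt(vector: int, total_bits: int = 64) -> int:
    """Replace each 8-bit element of the vector with the number of bits
    that are one in that element."""
    result = 0
    for e in range(total_bits // 8):
        byte = (vector >> (8 * e)) & 0xFF
        result |= bin(byte).count("1") << (8 * e)
    return result
```

Passing `total_bits=128` models the 128-bit variant, which the pseudocode treats as two consecutive 64-bit registers.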
F6.1.54 VCVT (between double-precision and single-precision)


Convert between double-precision and single-precision does one of the following: 


• Converts the value in a double-precision register to single-precision and writes the result to a single-precision
register.
• Converts the value in a single-precision register to double-precision and writes the result to a
double-precision register.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
[Encoding diagram: fields cond, D, Vd, size, M, Vm]


Single-precision to double-precision variant 
Applies when size == 10. 


VCVT{<c>}{<q>}.F64.F32 <Dd>, <Sm> 


Double-precision to single-precision variant 
Applies when size == 11. 


VCVT{<c>}{<q>}.F32.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 
double_to_single = (sz == '1'); 


d = if double_to_single then UInt(Vd:D) else UInt(D:Vd); 
m = if double_to_single then UInt(M:Vm) else UInt(Vm:M); 


T1 


[Encoding diagram: fields D, Vd, size, M, Vm]

Single-precision to double-precision variant 


Applies when size == 10. 


VCVT{<c>}{<q>}.F64.F32 <Dd>, <Sm> 


Double-precision to single-precision variant 
Applies when size == 11. 


VCVT{<c>}{<q>}.F32.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


double_to_single = (sz == '1'); 
d = if double_to_single then UInt(Vd:D) else UInt(D:Vd); 
m = if double_to_single then UInt(M:Vm) else UInt(Vm:M); 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
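
The two field orderings used for register numbers, "Vd:D" for 32-bit registers and "D:Vd" for 64-bit registers, can be made concrete with a short Python sketch. The function names are hypothetical. The only thing assumed beyond the text is the usual meaning of the ":" operator as bit concatenation with the left operand most significant.

```python
def s_register_index(vd: int, d: int) -> int:
    """UInt(Vd:D): the 4-bit Vd field supplies the upper bits and the
    single D bit is the least significant bit."""
    return (vd << 1) | d


def d_register_index(d: int, vd: int) -> int:
    """UInt(D:Vd): the single D bit is the most significant bit and the
    4-bit Vd field supplies the lower bits."""
    return (d << 4) | vd
```

With Vd = 0b0101 and D = 1, the 32-bit form selects S11 and the 64-bit form selects D21. The same arithmetic applies to the "Vm:M" and "M:Vm" source fields.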


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if double_to_single then
        S[d] = FPConvert(D[m], FPSCR);
    else
        D[d] = FPConvert(S[m], FPSCR);
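
The effect of narrowing and widening can be observed with Python's standard `struct` module. This is only a sketch: the function name `double_to_single` is hypothetical, it assumes the default round-to-nearest behaviour of the host rather than the rounding mode selected by the FPSCR, and it does not model exceptions (Python raises `OverflowError` for values too large for single precision).

```python
import struct


def double_to_single(x: float) -> float:
    """Round a double-precision value to the nearest single-precision value,
    returned as a Python float (which is double-precision)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]
```

Widening a single-precision value to double-precision is exact, so only the narrowing direction can lose precision. For instance, 0.5 survives unchanged while 0.1 does not.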
F6.1.55 VCVT (between half-precision and single-precision, Advanced SIMD)


Vector Convert between half-precision and single-precision converts each element in a vector from single-precision 
to half-precision floating-point, or from half-precision to single-precision, and places the results in a second vector. 


The vector elements must be 32-bit floating-point numbers, or 16-bit floating-point numbers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


[Encoding diagram: fields D, size, Vd, op, M, Vm]


Half-precision to single-precision variant 
Applies when op == 1. 


VCVT{<c>}{<q>}.F32.F16 <Qd>, <Dm> // Encoded as op = 1 


Single-precision to half-precision variant 


Applies when op == 0. 
VCVT{<c>}{<q>}.F16.F32 <Dd>, <Qm> // Encoded as op = 0

Decode for all variants of this encoding

if size != '01' then UNDEFINED;
half_to_single = (op == '1');
if half_to_single && Vd<0> == '1' then UNDEFINED;
if !half_to_single && Vm<0> == '1' then UNDEFINED;
esize = 16; elements = 4;
m = UInt(M:Vm); d = UInt(D:Vd);

T1 

[Encoding diagram: fields D, size, Vd, op, M, Vm]


Half-precision to single-precision variant 

Applies when op == 1. 

VCVT{<c>}{<q>}.F32.F16 <Qd>, <Dm> // Encoded as op = 1 
Single-precision to half-precision variant 

Applies when op == 0.


VCVT{<c>}{<q>}.F16.F32 <Dd>, <Qm> // Encoded as op = 0


Decode for all variants of this encoding 


if size != '01' then UNDEFINED;
half_to_single = (op == '1');
if half_to_single && Vd<0> == '1' then UNDEFINED;
if !half_to_single && Vm<0> == '1' then UNDEFINED;
esize = 16; elements = 4;
m = UInt(M:Vm); d = UInt(D:Vd);


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        if half_to_single then
            Elem[Q[d>>1],e,32] = FPConvert(Elem[Din[m],e,16], StandardFPSCRValue());
        else
            Elem[D[d],e,16] = FPConvert(Elem[Qin[m>>1],e,32], StandardFPSCRValue());
F6.1.56 VCVT (between floating-point and integer, Advanced SIMD)


Vector Convert between floating-point and integer converts each element in a vector from floating-point to integer, 
or from integer to floating-point, and places the results in a second vector. 


The vector elements must be 32-bit floating-point numbers, or 32-bit integers. Signed and unsigned integers are 
distinct. 


The floating-point to integer operation uses the Round towards Zero rounding mode. The integer to floating-point 
operation uses the Round to Nearest rounding mode. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


[Encoding diagram: fields D, size, Vd, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCVT{<c>}{<q>}.<dt1>.<dt2> <Dd>, <Dm>


128-bit SIMD vector variant 
Applies when Q == 1. 


VCVT{<c>}{<q>}.<dt1>.<dt2> <Qd>, <Qm>


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
to_integer = (op<1> == '1'); unsigned = (op<0> == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


[Encoding diagram: fields D, size, Vd, op, Q, M, Vm]





64-bit SIMD vector variant 

Applies when Q == 0. 
VCVT{<c>}{<q>}.<dt1>.<dt2> <Dd>, <Dm>
128-bit SIMD vector variant 
Applies when Q == 1. 


VCVT{<c>}{<q>}.<dt1>.<dt2> <Qd>, <Qm>
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
to_integer = (op<1> == '1'); unsigned = (op<0> == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt1> Is the data type for the elements of the destination vector, encoded in the "size:op" field. It can have
the following values:

F32 when size = 10, op = 0x
S32 when size = 10, op = 10
U32 when size = 10, op = 11

<dt2> Is the data type for the elements of the source vector, encoded in the "size:op" field. It can have the
following values:

S32 when size = 10, op = 00
U32 when size = 10, op = 01
F32 when size = 10, op = 1x

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(esize) result;
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[m+r],e,esize];
            if to_integer then
                result = FPToFixed(op1, 0, unsigned, StandardFPSCRValue(), FPRounding_ZERO);
            else
                result = FixedToFP(op1, 0, unsigned, StandardFPSCRValue(), FPRounding_TIEEVEN);
            Elem[D[d+r],e,esize] = result;
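
The floating-point to integer direction, which rounds towards zero, can be approximated in Python as shown below. This is a sketch with assumptions that are not stated on this page: out-of-range values are assumed to saturate to the limits of the destination type, and a NaN input is assumed to produce zero. The function name `fp_to_int32_rtz` is hypothetical, and exception flags are not modelled.

```python
import math


def fp_to_int32_rtz(x: float, unsigned: bool) -> int:
    """Convert a floating-point value to a 32-bit integer using Round
    towards Zero, saturating at the limits of the destination type
    (saturation and NaN handling are assumptions of this sketch)."""
    lo, hi = (0, 2**32 - 1) if unsigned else (-2**31, 2**31 - 1)
    if math.isnan(x):
        return 0
    if math.isinf(x):
        return hi if x > 0 else lo
    return max(lo, min(hi, math.trunc(x)))
```

Round towards Zero discards the fractional part, so 2.9 becomes 2 and -2.9 becomes -2, in contrast to rounding to nearest.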
F6.1.57 VCVT (floating-point to integer, floating-point)


Convert floating-point to integer with Round towards Zero converts a value in a register from floating-point to a 
32-bit integer, using the Round towards Zero rounding mode, and places the result in a second register. 


VCVT (between floating-point and fixed-point, floating-point) describes conversions between floating-point and 
16-bit integers. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
[Encoding diagram: fields cond, D, opc2, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when opc2 == 100 && size == 10. 


VCVT{<c>}{<q>}.U32.F32 <Sd>, <Sm> 


Single-precision scalar variant 
Applies when opc2 == 101 && size == 10. 


VCVT{<c>}{<q>}.S32.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when opc2 == 100 && size == 11. 


VCVT{<c>}{<q>}.U32.F64 <Sd>, <Dm> 


Double-precision scalar variant 
Applies when opc2 == 101 && size == 11. 


VCVT{<c>}{<q>}.S32.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if opc2 != '000' && opc2 != '10x' then SEE "Related encodings";
if size != '1x' then UNDEFINED;
to_integer = (opc2<2> == '1');
if to_integer then
    unsigned = (opc2<0> == '0');
    rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
    d = UInt(Vd:D);
    case size of
        when '10' esize = 32; m = UInt(Vm:M);
        when '11' esize = 64; m = UInt(M:Vm);
else
    unsigned = (op == '0');
    rounding = FPRoundingMode(FPSCR);
    m = UInt(Vm:M);
    case size of
        when '10' esize = 32; d = UInt(Vd:D);
        when '11' esize = 64; d = UInt(D:Vd);
T1

[Encoding diagram: fields D, opc2, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when opc2 == 100 && size == 10. 


VCVT{<c>}{<q>}.U32.F32 <Sd>, <Sm> 


Single-precision scalar variant 
Applies when opc2 == 101 && size == 10. 


VCVT{<c>}{<q>}.S32.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when opc2 == 100 && size == 11. 


VCVT{<c>}{<q>}.U32.F64 <Sd>, <Dm> 


Double-precision scalar variant 
Applies when opc2 == 101 && size == 11. 


VCVT{<c>}{<q>}.S32.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if opc2 != '000' && opc2 != '10x' then SEE "Related encodings";
if size != '1x' then UNDEFINED;
to_integer = (opc2<2> == '1');
if to_integer then
    unsigned = (opc2<0> == '0');
    rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
    d = UInt(Vd:D);
    case size of
        when '10' esize = 32; m = UInt(Vm:M);
        when '11' esize = 64; m = UInt(M:Vm);
else
    unsigned = (op == '0');
    rounding = FPRoundingMode(FPSCR);
    m = UInt(Vm:M);
    case size of
        when '10' esize = 32; d = UInt(Vd:D);
        when '11' esize = 64; d = UInt(D:Vd);


Notes for all encodings 


Related encodings: See Floating-point data-processing on page F3-2450 for the T32 instruction set, or 
Floating-point data-processing on page F4-2533 for the A32 instruction set. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_integer then
        case esize of
            when 32
                S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
            when 64
                S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);
    else
        case esize of
            when 32
                S[d] = FixedToFP(S[m], 0, unsigned, FPSCR, rounding);
            when 64
                D[d] = FixedToFP(S[m], 0, unsigned, FPSCR, rounding);
F6.1.58 VCVT (integer to floating-point, floating-point)


Convert integer to floating-point converts a 32-bit integer to floating-point using the rounding mode specified by 
the FPSCR, and places the result in a second register. 


VCVT (between floating-point and fixed-point, floating-point) describes conversions between floating-point and 
16-bit integers. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
[Encoding diagram: fields cond, D, opc2, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVT{<c>}{<q>}.F32.<dt> <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCVT{<c>}{<q>}.F64.<dt> <Dd>, <Sm> 


Decode for all variants of this encoding 


if opc2 != '000' && opc2 != '10x' then SEE "Related encodings";
if size != '1x' then UNDEFINED;
to_integer = (opc2<2> == '1');
if to_integer then
    unsigned = (opc2<0> == '0');
    rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
    d = UInt(Vd:D);
    case size of
        when '10' esize = 32; m = UInt(Vm:M);
        when '11' esize = 64; m = UInt(M:Vm);
else
    unsigned = (op == '0');
    rounding = FPRoundingMode(FPSCR);
    m = UInt(Vm:M);
    case size of
        when '10' esize = 32; d = UInt(Vd:D);
        when '11' esize = 64; d = UInt(D:Vd);

T1 

[Encoding diagram: fields D, opc2, Vd, size, op, M, Vm]

Single-precision scalar variant 
Applies when size == 10. 
VCVT{<c>}{<q>}.F32.<dt> <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCVT{<c>}{<q>}.F64.<dt> <Dd>, <Sm> 


Decode for all variants of this encoding 


if opc2 != '000' && opc2 != '10x' then SEE "Related encodings";
if size != '1x' then UNDEFINED;
to_integer = (opc2<2> == '1');
if to_integer then
    unsigned = (opc2<0> == '0');
    rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
    d = UInt(Vd:D);
    case size of
        when '10' esize = 32; m = UInt(Vm:M);
        when '11' esize = 64; m = UInt(M:Vm);
else
    unsigned = (op == '0');
    rounding = FPRoundingMode(FPSCR);
    m = UInt(Vm:M);
    case size of
        when '10' esize = 32; d = UInt(Vd:D);
        when '11' esize = 64; d = UInt(D:Vd);


Notes for all encodings


Related encodings: See Floating-point data-processing on page F3-2450 for the T32 instruction set, or 
Floating-point data-processing on page F4-2533 for the A32 instruction set. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the operand, encoded in the "op" field. It can have the following values: 
U32 when op = 0 
S32 when op = 1

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
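
The two register-number forms used in the symbol descriptions above differ only in where the extra bit goes. The following Python sketch illustrates this; the function names are hypothetical and are not part of the architecture pseudocode.

```python
def s_reg_index(vx: int, x: int) -> int:
    """Single-precision register number, UInt(Vx:X): the extra bit is the least significant bit."""
    assert 0 <= vx < 16 and x in (0, 1)
    return (vx << 1) | x


def d_reg_index(vx: int, x: int) -> int:
    """Doubleword register number, UInt(X:Vx): the extra bit is the most significant bit."""
    assert 0 <= vx < 16 and x in (0, 1)
    return (x << 4) | vx
```

For example, with Vd = 0b0101 and D = 1 the same field values name S11 in the single-precision form and D21 in the double-precision form.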


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_integer then
        case esize of
            when 32
                S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
            when 64
                S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);
    else
        case esize of
            when 32
                S[d] = FixedToFP(S[m], 0, unsigned, FPSCR, rounding);
            when 64
                D[d] = FixedToFP(S[m], 0, unsigned, FPSCR, rounding);
F6.1.59 VCVT (between floating-point and fixed-point, Advanced SIMD)


Vector Convert between floating-point and fixed-point converts each element in a vector from floating-point to 
fixed-point, or from fixed-point to floating-point, and places the results in a second vector. 


The vector elements must be 32-bit floating-point numbers, or 32-bit integers. Signed and unsigned integers are 
distinct. 


The floating-point to fixed-point operation uses the Round towards Zero rounding mode. The fixed-point to 
floating-point operation uses the Round to Nearest rounding mode. 
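
As an informal illustration of the two rounding behaviors described above, the following Python sketch models one 32-bit element. It is not the architecture's FPToFixed or FixedToFP pseudocode: the function names are hypothetical, the saturation of out-of-range values and the conversion of NaN to zero are stated here as assumptions, and exception flags are not modelled.

```python
import math
import struct
from fractions import Fraction


def fp_to_fixed_rz(value: float, frac_bits: int, unsigned: bool, width: int = 32) -> int:
    """Scale by 2**frac_bits, truncate towards zero, then saturate to the integer range."""
    if unsigned:
        lo, hi = 0, 2**width - 1
    else:
        lo, hi = -(2 ** (width - 1)), 2 ** (width - 1) - 1
    if math.isnan(value):
        return 0  # assumption: NaN converts to zero
    if math.isinf(value):
        return hi if value > 0 else lo
    scaled = math.trunc(Fraction(value) * 2**frac_bits)  # exact arithmetic, then Round towards Zero
    return max(lo, min(hi, scaled))  # assumption: out-of-range results saturate


def fixed_to_f32(value: int, frac_bits: int) -> float:
    """Divide by 2**frac_bits and round once to single precision (Round to Nearest)."""
    exact = value / 2**frac_bits  # exact in double precision for 32-bit inputs
    return struct.unpack("<f", struct.pack("<f", exact))[0]
```

With one fraction bit, 1.75 becomes the fixed-point value 3 (that is, 1.5) and -1.75 becomes -3, because the discarded fraction is always dropped rather than rounded.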


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields U, D, imm6, Vd, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when imm6 != 000xxx && Q == 0.


VCVT{<c>}{<q>}.<dt1>.<dt2> <Dd>, <Dm>, #<fbits>


128-bit SIMD vector variant
Applies when imm6 != 000xxx && Q == 1.


VCVT{<c>}{<q>}.<dt1>.<dt2> <Qd>, <Qm>, #<fbits>


Decode for all variants of this encoding


if imm6 == '000xxx' then SEE "Related encodings";
if op<1> == '0' then UNDEFINED;
if imm6 == '0xxxxx' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
to_fixed = (op<0> == '1'); frac_bits = 64 - UInt(imm6);
unsigned = (U == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1


[Encoding diagram: fields U, D, imm6, Vd, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when imm6 != 000xxx && Q == 0.


VCVT{<c>}{<q>}.<dt1>.<dt2> <Dd>, <Dm>, #<fbits>


128-bit SIMD vector variant
Applies when imm6 != 000xxx && Q == 1.


VCVT{<c>}{<q>}.<dt1>.<dt2> <Qd>, <Qm>, #<fbits>


Decode for all variants of this encoding


if imm6 == '000xxx' then SEE "Related encodings";
if op<1> == '0' then UNDEFINED;
if imm6 == '0xxxxx' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
to_fixed = (op<0> == '1'); frac_bits = 64 - UInt(imm6);
unsigned = (U == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt1> Is the data type for the elements of the destination vector, encoded in the "op:U" field. It can have
the following values:
F32 when op = 10, U = 0
F32 when op = 10, U = 1
S32 when op = 11, U = 0
U32 when op = 11, U = 1


The encoding op = 0x, U = x is reserved.


<dt2> Is the data type for the elements of the source vector, encoded in the "op:U" field. It can have the
following values:
S32 when op = 10, U = 0
U32 when op = 10, U = 1
F32 when op = 11, U = 0
F32 when op = 11, U = 1


The encoding op = 0x, U = x is reserved.


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<fbits> The number of fraction bits in the fixed point number, in the range 1 to 32: 

° (64 - <fbits>) is encoded in imm6. 


An assembler can permit an <fbits> value of 0. This is encoded as floating-point to integer or 
integer to floating-point instruction, see VCVT (between floating-point and integer, Advanced 
SIMD). 
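
The relationship between <fbits> and imm6 can be sketched in Python as follows. The helper names are hypothetical; the checks mirror the decode conditions given for this encoding (imm6 == '000xxx' selects a related encoding, and imm6 == '0xxxxx' is UNDEFINED).

```python
def encode_imm6(fbits: int) -> int:
    """Return the imm6 field value for a given <fbits>, in the range 1 to 32."""
    if not 1 <= fbits <= 32:
        raise ValueError("fbits must be in the range 1 to 32")
    return 64 - fbits


def decode_imm6(imm6: int) -> int:
    """Return frac_bits = 64 - UInt(imm6), rejecting field values the decode excludes."""
    if not 0 <= imm6 < 64:
        raise ValueError("imm6 is a 6-bit field")
    if imm6 >> 3 == 0:
        raise ValueError("imm6 == '000xxx': see related encodings")
    if imm6 >> 5 == 0:
        raise ValueError("imm6 == '0xxxxx': UNDEFINED")
    return 64 - imm6
```

Every valid <fbits> value therefore produces an imm6 with its top bit set: 1 maps to 0b111111 and 32 maps to 0b100000.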
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(esize) result;
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[m+r],e,esize];
            if to_fixed then
                result = FPToFixed(op1, frac_bits, unsigned, StandardFPSCRValue(),
                                   FPRounding_ZERO);
            else
                result = FixedToFP(op1, frac_bits, unsigned, StandardFPSCRValue(),
                                   FPRounding_TIEEVEN);
            Elem[D[d+r],e,esize] = result;
F6.1.60 VCVT (between floating-point and fixed-point, floating-point)


Convert between floating-point and fixed-point converts a value in a register from floating-point to fixed-point, or 
from fixed-point to floating-point. Software can specify the fixed-point value as either signed or unsigned. 


The floating-point value can be single-precision or double-precision. 


The fixed-point value can be 16-bit or 32-bit. Conversions from fixed-point values take their operand from the 
low-order bits of the source register and ignore any remaining bits. Signed conversions to fixed-point values 
sign-extend the result value to the destination register width. Unsigned conversions to fixed-point values 
zero-extend the result value to the destination register width. 
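
The handling of 16-bit fixed-point values described above can be sketched in Python on raw register images. The helper names are hypothetical; registers are modelled as unsigned integers holding the bit pattern.

```python
def low_bits(register: int, size: int) -> int:
    """Take the operand from the low-order `size` bits, ignoring any remaining bits."""
    return register & ((1 << size) - 1)


def extend_to_register(result: int, size: int, reg_bits: int, unsigned: bool) -> int:
    """Sign-extend or zero-extend a `size`-bit result to a `reg_bits`-wide register image."""
    result &= (1 << size) - 1
    if not unsigned and (result >> (size - 1)) & 1:
        result -= 1 << size  # reinterpret the bit pattern as a negative value
    return result & ((1 << reg_bits) - 1)
```

For instance, the 16-bit result 0x8000 becomes 0xFFFF8000 in a 32-bit register for a signed conversion, and 0x00008000 for an unsigned one.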


The floating-point to fixed-point operation uses the Round towards Zero rounding mode. The fixed-point to 
floating-point operation uses the Round to Nearest rounding mode. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 





A1


[Encoding diagram: fields cond, D, op, U, Vd, sf, sx, i, imm4]


Single-precision scalar variant 
Applies when op == 0 && sf == 10.


VCVT{<c>}{<q>}.F32.<dt> <Sdm>, <Sdm>, #<fbits> 


Single-precision scalar variant 
Applies when op == 1 && sf == 10. 


VCVT{<c>}{<q>}.<dt>.F32 <Sdm>, <Sdm>, #<fbits> 


Double-precision scalar variant 
Applies when op == 0 && sf == 11.


VCVT{<c>}{<q>}.F64.<dt> <Ddm>, <Ddm>, #<fbits> 


Double-precision scalar variant 
Applies when op == 1 && sf == 11. 


VCVT{<c>}{<q>}.<dt>.F64 <Ddm>, <Ddm>, #<fbits> 


Decode for all variants of this encoding 


if sf != '1x' then UNDEFINED;
to_fixed = (op == '1'); unsigned = (U == '1');
size = if sx == '0' then 16 else 32;
frac_bits = size - UInt(imm4:i);
case sf of
    when '10' fp_size = 32; d = UInt(Vd:D);
    when '11' fp_size = 64; d = UInt(D:Vd);
if frac_bits < 0 then UNPREDICTABLE;







CONSTRAINED UNPREDICTABLE behavior 


If frac_bits < 0, then one of the following behaviors must occur:


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


T1


[Encoding diagram: fields D, op, U, Vd, sf, sx, i, imm4]





Single-precision scalar variant 
Applies when op == 0 && sf == 10.


VCVT{<c>}{<q>}.F32.<dt> <Sdm>, <Sdm>, #<fbits> 


Single-precision scalar variant 
Applies when op == 1 && sf == 10. 


VCVT{<c>}{<q>}.<dt>.F32 <Sdm>, <Sdm>, #<fbits> 


Double-precision scalar variant 
Applies when op == 0 && sf == 11.


VCVT{<c>}{<q>}.F64.<dt> <Ddm>, <Ddm>, #<fbits> 


Double-precision scalar variant 
Applies when op == 1 && sf == 11. 


VCVT{<c>}{<q>}.<dt>.F64 <Ddm>, <Ddm>, #<fbits> 


Decode for all variants of this encoding 


if sf != '1x' then UNDEFINED;
to_fixed = (op == '1'); unsigned = (U == '1');
size = if sx == '0' then 16 else 32;
frac_bits = size - UInt(imm4:i);
case sf of
    when '10' fp_size = 32; d = UInt(Vd:D);
    when '11' fp_size = 64; d = UInt(D:Vd);
if frac_bits < 0 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior


If frac_bits < 0, then one of the following behaviors must occur:


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VCVT (between floating-point and 
fixed-point) on page K1-5470. 
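
The condition that triggers the CONSTRAINED UNPREDICTABLE behavior can be seen by evaluating the decode arithmetic directly. The following Python sketch uses a hypothetical helper name and only reproduces the computation of size and frac_bits from the sx, imm4, and i fields; it does not model which of the permitted behaviors an implementation chooses.

```python
def decode_fixed_point_fields(sx: int, imm4: int, i: int):
    """Return (size, frac_bits, is_unpredictable) for the sx, imm4 and i fields."""
    assert sx in (0, 1) and 0 <= imm4 < 16 and i in (0, 1)
    size = 32 if sx else 16
    frac_bits = size - ((imm4 << 1) | i)  # size - UInt(imm4:i)
    return size, frac_bits, frac_bits < 0
```

Because imm4:i is a 5-bit value in the range 0 to 31, frac_bits can only go negative for the 16-bit case (sx == 0), when imm4:i is greater than 16.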


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the fixed-point number, encoded in the "U:sx" field. It can have the following
values:
S16 when U = 0, sx = 0
S32 when U = 0, sx = 1
U16 when U = 1, sx = 0
U32 when U = 1, sx = 1

<Sdm> Is the 32-bit name of the SIMD&FP destination and source register, encoded in the "Vd:D" field.

<Ddm> Is the 64-bit name of the SIMD&FP destination and source register, encoded in the "D:Vd" field.

<fbits> The number of fraction bits in the fixed-point number:

• If <dt> is S16 or U16, <fbits> must be in the range 0-16. (16 - <fbits>) is encoded in [imm4, i].

• If <dt> is S32 or U32, <fbits> must be in the range 1-32. (32 - <fbits>) is encoded in [imm4, i].


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_fixed then
        bits(size) result;
        case fp_size of
            when 32
                result = FPToFixed(S[d], frac_bits, unsigned, FPSCR, FPRounding_ZERO);
                S[d] = Extend(result, 32, unsigned);
            when 64
                result = FPToFixed(D[d], frac_bits, unsigned, FPSCR, FPRounding_ZERO);
                D[d] = Extend(result, 64, unsigned);
    else
        case fp_size of
            when 32
                S[d] = FixedToFP(S[d]<size-1:0>, frac_bits, unsigned, FPSCR, FPRounding_TIEEVEN);
            when 64
                D[d] = FixedToFP(D[d]<size-1:0>, frac_bits, unsigned, FPSCR, FPRounding_TIEEVEN);
F6.1.61 VCVTA (Advanced SIMD)


Vector Convert floating-point to integer with Round to Nearest with Ties to Away converts each element in a vector 
from floating-point to integer using the Round to Nearest with Ties to Away rounding mode, and places the results 
in a second vector. 


The operand vector elements must be 32-bit floating-point numbers. 
The result vector elements are 32-bit integers. Signed and unsigned integers are distinct. 
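
Round to Nearest with Ties to Away differs from the more common ties-to-even rule only for values exactly halfway between two integers. The following Python sketch of a single element conversion is illustrative only: the function names are hypothetical, and the saturation of out-of-range values and the conversion of NaN to zero are assumptions rather than statements taken from this section.

```python
import math
from fractions import Fraction


def round_ties_away(value: float) -> int:
    """Round a finite value to the nearest integer, with halfway cases going away from zero."""
    exact = Fraction(value)
    magnitude = math.floor(abs(exact) + Fraction(1, 2))
    return magnitude if exact >= 0 else -magnitude


def vcvta_element(value: float, unsigned: bool) -> int:
    """Convert one 32-bit element, saturating to the destination integer range."""
    lo, hi = (0, 2**32 - 1) if unsigned else (-(2**31), 2**31 - 1)
    if math.isnan(value):
        return 0  # assumption: NaN converts to zero
    if math.isinf(value):
        return hi if value > 0 else lo
    return max(lo, min(hi, round_ties_away(value)))
```

Under this rule 2.5 converts to 3 and -2.5 converts to -3, whereas ties-to-even would give 2 and -2.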


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCVTA{<q>}.<dt>.<dt2> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCVTA{<q>}.<dt>.<dt2> <Qd>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1


[Encoding diagram: fields D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 
VCVTA{<q>}.<dt>.<dt2> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VCVTA{<q>}.<dt>.<dt2> <Qd>, <Qm> 
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
S32 when op = 0
U32 when op = 1 
<dt2> Is the data type for the elements of the source vector, encoded in the "size" field. It can have the 
following values: 
F32 when size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
bits(esize) result;
for r = 0 to regs-1
    for e = 0 to elements-1
        Elem[D[d+r],e,esize] = FPToFixed(Elem[D[m+r],e,esize], 0, unsigned,
                                         StandardFPSCRValue(), rounding);
F6.1.62 VCVTA (floating-point)


Convert floating-point to integer with Round to Nearest with Ties to Away converts a value in a register from 
floating-point to a 32-bit integer using the Round to Nearest with Ties to Away rounding mode, and places the result 
in a second register. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTA{<q>}.<dt>.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCVTA{<q>}.<dt>.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


T1


[Encoding diagram: fields D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTA{<q>}.<dt>.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCVTA{<q>}.<dt>.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the
following values:
U32 when op = 0
S32 when op = 1

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field.

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
    when 64
        S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);
F6.1.63 VCVTB


Convert to or from a half-precision value in the bottom half of a single-precision register does one of the following:


• Converts the half-precision value in the bottom half of a single-precision register to single-precision and
writes the result to a single-precision register.

• Converts the half-precision value in the bottom half of a single-precision register to double-precision and
writes the result to a double-precision register.

• Converts the single-precision value in a single-precision register to half-precision and writes the result into
the bottom half of a single-precision register, preserving the other half of the destination register.

• Converts the double-precision value in a double-precision register to half-precision and writes the result into
the bottom half of a single-precision register, preserving the other half of the destination register.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881.


A1


[Encoding diagram: fields cond, D, op, Vd, sz, T, M, Vm]


Half-precision to single-precision variant 
Applies when op == 0 && sz == 0.


VCVTB{<c>}{<q>}.F32.F16 <Sd>, <Sm>


Half-precision to double-precision variant
Applies when op == 0 && sz == 1.


VCVTB{<c>}{<q>}.F64.F16 <Dd>, <Sm>


Single-precision to half-precision variant
Applies when op == 1 && sz == 0.


VCVTB{<c>}{<q>}.F16.F32 <Sd>, <Sm>


Double-precision to half-precision variant
Applies when op == 1 && sz == 1.


VCVTB{<c>}{<q>}.F16.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


uses_double = (sz == '1'); convert_from_half = (op == '0');
lowbit = (if T == '1' then 16 else 0);
if uses_double then
    if convert_from_half then
        d = UInt(D:Vd); m = UInt(Vm:M);
    else
        d = UInt(Vd:D); m = UInt(M:Vm);
else
    d = UInt(Vd:D); m = UInt(Vm:M);
T1


[Encoding diagram: fields D, op, Vd, sz, T, M, Vm]


Half-precision to single-precision variant 
Applies when op == 0 && sz == 0.


VCVTB{<c>}{<q>}.F32.F16 <Sd>, <Sm>


Half-precision to double-precision variant
Applies when op == 0 && sz == 1.


VCVTB{<c>}{<q>}.F64.F16 <Dd>, <Sm>


Single-precision to half-precision variant
Applies when op == 1 && sz == 0.


VCVTB{<c>}{<q>}.F16.F32 <Sd>, <Sm>


Double-precision to half-precision variant
Applies when op == 1 && sz == 1.


VCVTB{<c>}{<q>}.F16.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


uses_double = (sz == '1'); convert_from_half = (op == '0');
lowbit = (if T == '1' then 16 else 0);
if uses_double then
    if convert_from_half then
        d = UInt(D:Vd); m = UInt(Vm:M);
    else
        d = UInt(Vd:D); m = UInt(M:Vm);
else
    d = UInt(Vd:D); m = UInt(Vm:M);


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
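
The way a half-precision value is read from, or written into, one half of a 32-bit register while the other half is preserved can be sketched in Python. This is an illustration under stated assumptions: the function names are hypothetical, registers are modelled as unsigned 32-bit integers, the `lowbit` argument corresponds to the value computed in the decode pseudocode, and only the IEEE half-precision format is modelled (Python's `struct` format `e`), not any alternative format selected through FPSCR.

```python
import struct


def write_half(dest_reg: int, value: float, lowbit: int = 0) -> int:
    """Convert `value` to half precision and insert it at bits <lowbit+15:lowbit> of dest_reg."""
    assert lowbit in (0, 16)
    hp = struct.unpack("<H", struct.pack("<e", value))[0]  # IEEE half-precision bit pattern
    mask = 0xFFFF << lowbit
    return (dest_reg & ~mask & 0xFFFFFFFF) | (hp << lowbit)  # the other half is preserved


def read_half(src_reg: int, lowbit: int = 0) -> float:
    """Extract bits <lowbit+15:lowbit> of src_reg and convert them from half precision."""
    assert lowbit in (0, 16)
    hp = (src_reg >> lowbit) & 0xFFFF
    return struct.unpack("<e", struct.pack("<H", hp))[0]
```

Writing 1.0 into the bottom half of a register holding 0xDEADBEEF gives 0xDEAD3C00: the top 16 bits are unchanged and the bottom 16 bits hold the half-precision encoding of 1.0.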


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    bits(16) hp;
    if convert_from_half then
        hp = S[m]<lowbit+15:lowbit>;
        if uses_double then
            D[d] = FPConvert(hp, FPSCR);
        else
            S[d] = FPConvert(hp, FPSCR);
    else
        if uses_double then
            hp = FPConvert(D[m], FPSCR);
        else
            hp = FPConvert(S[m], FPSCR);
        S[d]<lowbit+15:lowbit> = hp;
F6.1.64 VCVTM (Advanced SIMD)


Vector Convert floating-point to integer with Round towards -Infinity converts each element in a vector from 
floating-point to integer using the Round towards -Infinity rounding mode, and places the results in a second vector. 


The operand vector elements must be 32-bit floating-point numbers. 
The result vector elements are 32-bit integers. Signed and unsigned integers are distinct. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCVTM{<q>}.<dt>.<dt2> <Dd>, <Dm>


128-bit SIMD vector variant
Applies when Q == 1.


VCVTM{<q>}.<dt>.<dt2> <Qd>, <Qm>


Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1


[Encoding diagram: fields D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 
VCVTM{<q>}.<dt>.<dt2> <Dd>, <Dm>


128-bit SIMD vector variant
Applies when Q == 1.


VCVTM{<q>}.<dt>.<dt2> <Qd>, <Qm>


Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
S32 when op = 0
U32 when op = 1 
<dt2> Is the data type for the elements of the source vector, encoded in the "size" field. It can have the 
following values: 
F32 when size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
bits(esize) result;
for r = 0 to regs-1
    for e = 0 to elements-1
        Elem[D[d+r],e,esize] = FPToFixed(Elem[D[m+r],e,esize], 0, unsigned,
                                         StandardFPSCRValue(), rounding);
F6.1.65 VCVTM (floating-point)


Convert floating-point to integer with Round towards -Infinity converts a value in a register from floating-point to 
a 32-bit integer using the Round towards -Infinity rounding mode, and places the result in a second register. 
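
The VCVTA, VCVTN, VCVTP, and VCVTM instructions differ only in the rounding mode that the decode obtains from the RM field through FPDecodeRM(). The Python sketch below compares the four modes on the same input. The mapping of RM values to modes is stated here from recollection of the pseudocode library and should be treated as an assumption to check against the definition of FPDecodeRM(); the table and function names are hypothetical, and saturation and NaN handling are omitted.

```python
import math
from fractions import Fraction


def _ties_away(x: Fraction) -> int:
    """Round to nearest, with halfway cases going away from zero."""
    n = math.floor(abs(x) + Fraction(1, 2))
    return n if x >= 0 else -n


# Assumed RM encoding: 00 ties away, 01 ties to even, 10 towards +Infinity, 11 towards -Infinity.
RM_TABLE = {
    0b00: ("VCVTA", _ties_away),
    0b01: ("VCVTN", round),       # round() on a Fraction rounds halfway cases to even
    0b10: ("VCVTP", math.ceil),
    0b11: ("VCVTM", math.floor),
}


def convert_with_rm(rm: int, value: float) -> int:
    """Round a finite value to an integer using the mode selected by the 2-bit RM field."""
    _, rounder = RM_TABLE[rm]
    return rounder(Fraction(value))
```

For the input -2.5 the four modes give -3, -2, -2, and -3 respectively, and for 2.5 they give 3, 2, 3, and 2, which shows that Round towards -Infinity is the only mode that moves both inputs downwards.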


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTM{<q>}.<dt>.F32 <Sd>, <Sm>


Double-precision scalar variant
Applies when size == 11.


VCVTM{<q>}.<dt>.F64 <Sd>, <Dm>


Decode for all variants of this encoding


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


T1


[Encoding diagram: fields D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTM{<q>}.<dt>.F32 <Sd>, <Sm>


Double-precision scalar variant
Applies when size == 11.


VCVTM{<q>}.<dt>.F64 <Sd>, <Dm>


Decode for all variants of this encoding


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
U32 when op = 0 
S32 when op = 1

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
    when 64
        S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);
F6.1.66 VCVTN (Advanced SIMD)


Vector Convert floating-point to integer with Round to Nearest converts each element in a vector from floating-point 
to integer using the Round to Nearest rounding mode, and places the results in a second vector. 


The operand vector elements must be 32-bit floating-point numbers. 
The result vector elements are 32-bit integers. Signed and unsigned integers are distinct. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCVTN{<q>}.<dt>.<dt2> <Dd>, <Dm>


128-bit SIMD vector variant
Applies when Q == 1.


VCVTN{<q>}.<dt>.<dt2> <Qd>, <Qm>


Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1


[Encoding diagram: fields D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 
VCVTN{<q>}.<dt>.<dt2> <Dd>, <Dm>


128-bit SIMD vector variant
Applies when Q == 1.


VCVTN{<q>}.<dt>.<dt2> <Qd>, <Qm>


Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
S32 when op = 0
U32 when op = 1 
<dt2> Is the data type for the elements of the source vector, encoded in the "size" field. It can have the 
following values: 
F32 when size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
bits(esize) result;
for r = 0 to regs-1
    for e = 0 to elements-1
        Elem[D[d+r],e,esize] = FPToFixed(Elem[D[m+r],e,esize], 0, unsigned,
                                         StandardFPSCRValue(), rounding);
F6.1.67 VCVTN (floating-point)


Convert floating-point to integer with Round to Nearest converts a value in a register from floating-point to a 32-bit
integer using the Round to Nearest rounding mode, and places the result in a second register.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


[Encoding diagram: fields D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTN{<q>}.<dt>.F32 <Sd>, <Sm>


Double-precision scalar variant
Applies when size == 11.


VCVTN{<q>}.<dt>.F64 <Sd>, <Dm>


Decode for all variants of this encoding


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant
Applies when size == 10.


VCVTN{<q>}.<dt>.F32 <Sd>, <Sm>


Double-precision scalar variant
Applies when size == 11.


VCVTN{<q>}.<dt>.F64 <Sd>, <Dm>


Decode for all variants of this encoding


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
U32 when op = 0 
S32 when op = 1 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
    when 64
        S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);
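
The conversion performed by FPToFixed with zero fraction bits can be illustrated with a short Python sketch. This is not the architectural pseudocode: the function name fp_to_int32 and the string rounding-mode names are assumptions made for the example, exception flags are not modelled, and the input is a Python float rather than a register value. It shows rounding followed by saturation to the 32-bit range.

```python
import math

def fp_to_int32(value, unsigned, rounding):
    """Round `value` to an integer, then saturate to the 32-bit range."""
    lo, hi = (0, 2**32 - 1) if unsigned else (-2**31, 2**31 - 1)
    if math.isnan(value):
        return 0  # a NaN converts to zero; Invalid Operation is not modelled
    if math.isinf(value):
        return hi if value > 0 else lo
    if rounding == "TIEEVEN":    # Round to Nearest, as used by VCVTN
        n = round(value)         # Python's round() resolves ties to even
    elif rounding == "POSINF":   # Round towards +Infinity, as used by VCVTP
        n = math.ceil(value)
    elif rounding == "NEGINF":   # Round towards -Infinity
        n = math.floor(value)
    elif rounding == "ZERO":     # Round towards Zero
        n = math.trunc(value)
    else:
        raise ValueError("unknown rounding mode: " + rounding)
    return max(lo, min(hi, n))   # saturate instead of wrapping
```

For example, fp_to_int32(2.5, False, "TIEEVEN") gives 2 because the tie resolves to the even integer, and a negative input with unsigned set saturates to 0.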
F6.1.68 VCVTP (Advanced SIMD)


Vector Convert floating-point to integer with Round towards +Infinity converts each element in a vector from 
floating-point to integer using the Round towards +Infinity rounding mode, and places the results in a second vector. 


The operand vector elements must be 32-bit floating-point numbers. 
The result vector elements are 32-bit integers. Signed and unsigned integers are distinct. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant 
Applies when Q == 0. 


VCVTP{<q>}.<dt>.<dt2> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VCVTP{<q>}.<dt>.<dt2> <Qd>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, size, Vd, RM, op, Q, M, Vm]


64-bit SIMD vector variant
Applies when Q == 0.


VCVTP{<q>}.<dt>.<dt2> <Dd>, <Dm>


128-bit SIMD vector variant
Applies when Q == 1.


VCVTP{<q>}.<dt>.<dt2> <Qd>, <Qm>


Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
S32 when op = 0 
U32 when op = 1 
<dt2> Is the data type for the elements of the source vector, encoded in the "size" field. It can have the 
following values: 
F32 when size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
bits(esize) result;
for r = 0 to regs-1
    for e = 0 to elements-1
        Elem[D[d+r],e,esize] = FPToFixed(Elem[D[m+r],e,esize], 0, unsigned,
                                         StandardFPSCRValue(), rounding);
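
As an illustration of the per-element loop, the following Python sketch converts the two 32-bit floating-point lanes of one doubleword register, rounding each towards +Infinity and saturating the result. The function name vcvtp_f32_lanes and the use of an 8-byte little-endian buffer to stand in for a D register are assumptions of the example; exception flags and the Standard FPSCR value (for instance flush-to-zero) are not modelled.

```python
import math
import struct

def vcvtp_f32_lanes(src, unsigned=False):
    """Convert two little-endian F32 lanes to 32-bit integers, rounding up."""
    lo, hi = (0, 2**32 - 1) if unsigned else (-2**31, 2**31 - 1)
    out = []
    for x in struct.unpack("<2f", src):       # one iteration per element
        if math.isnan(x):
            n = 0                             # NaN converts to zero
        elif math.isinf(x):
            n = hi if x > 0 else lo
        else:
            n = max(lo, min(hi, math.ceil(x)))  # round up, then saturate
        out.append(n)
    return struct.pack("<2I" if unsigned else "<2i", *out)
```

A quadword operation (Q == 1) would apply the same step to two consecutive doubleword registers.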
F6.1.69 VCVTP (floating-point)


Convert floating-point to integer with Round towards +Infinity converts a value in a register from floating-point to 
a 32-bit integer using the Round towards +Infinity rounding mode, and places the result in a second register. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTP{<q>}.<dt>.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCVTP{<q>}.<dt>.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, RM, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when size == 10. 


VCVTP{<q>}.<dt>.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VCVTP{<q>}.<dt>.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); unsigned = (op == '0');
d = UInt(Vd:D);
case size of
    when '10' esize = 32; m = UInt(Vm:M);
    when '11' esize = 64; m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the destination, encoded in the "op" field. It can have the 
following values: 
U32 when op = 0 
S32 when op = 1 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
    when 64
        S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);







F6.1.70 VCVTR 


Convert floating-point to integer converts a value in a register from floating-point to a 32-bit integer, using the 
rounding mode specified by the FPSCR and places the result in a second register. 


VCVT (between floating-point and fixed-point, floating-point) describes conversions between floating-point and 
16-bit integers. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 28|27 26 25 24|23 22 21 20|19 18 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: cond, D, opc2, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when opc2 == 100 && size == 10. 


VCVTR{<c>}{<q>}.U32.F32 <Sd>, <Sm> 


Single-precision scalar variant 
Applies when opc2 == 101 && size == 10. 


VCVTR{<c>}{<q>}.S32.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when opc2 == 100 && size == 11. 


VCVTR{<c>}{<q>}.U32.F64 <Sd>, <Dm> 


Double-precision scalar variant 
Applies when opc2 == 101 && size == 11. 


VCVTR{<c>}{<q>}.S32.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if opc2 != '000' && opc2 != '10x' then SEE "Related encodings";
if size != '1x' then UNDEFINED;
to_integer = (opc2<2> == '1');
if to_integer then
    unsigned = (opc2<0> == '0');
    rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
    d = UInt(Vd:D);
    case size of
        when '10' esize = 32; m = UInt(Vm:M);
        when '11' esize = 64; m = UInt(M:Vm);
else
    unsigned = (op == '0');
    rounding = FPRoundingMode(FPSCR);
    m = UInt(Vm:M);
    case size of
        when '10' esize = 32; d = UInt(Vd:D);
        when '11' esize = 64; d = UInt(D:Vd);
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 2 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, opc2, Vd, size, op, M, Vm]


Single-precision scalar variant 
Applies when opc2 == 100 && size == 10. 


VCVTR{<c>}{<q>}.U32.F32 <Sd>, <Sm> 


Single-precision scalar variant 
Applies when opc2 == 101 && size == 10. 


VCVTR{<c>}{<q>}.S32.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when opc2 == 100 && size == 11. 


VCVTR{<c>}{<q>}.U32.F64 <Sd>, <Dm> 


Double-precision scalar variant 
Applies when opc2 == 101 && size == 11. 


VCVTR{<c>}{<q>}.S32.F64 <Sd>, <Dm> 


Decode for all variants of this encoding 


if opc2 != '000' && opc2 != '10x' then SEE "Related encodings";
if size != '1x' then UNDEFINED;
to_integer = (opc2<2> == '1');
if to_integer then
    unsigned = (opc2<0> == '0');
    rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
    d = UInt(Vd:D);
    case size of
        when '10' esize = 32; m = UInt(Vm:M);
        when '11' esize = 64; m = UInt(M:Vm);
else
    unsigned = (op == '0');
    rounding = FPRoundingMode(FPSCR);
    m = UInt(Vm:M);
    case size of
        when '10' esize = 32; d = UInt(Vd:D);
        when '11' esize = 64; d = UInt(D:Vd);


Notes for all encodings 


Related encodings: See Floating-point data-processing on page F3-2450 for the T32 instruction set, or 
Floating-point data-processing on page F4-2533 for the A32 instruction set. 


Assembler symbols 





<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_integer then
        case esize of
            when 32
                S[d] = FPToFixed(S[m], 0, unsigned, FPSCR, rounding);
            when 64
                S[d] = FPToFixed(D[m], 0, unsigned, FPSCR, rounding);
    else
        case esize of
            when 32
                S[d] = FixedToFP(S[m], 0, unsigned, FPSCR, rounding);
            when 64
                D[d] = FixedToFP(S[m], 0, unsigned, FPSCR, rounding);
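
VCVTR takes its rounding mode from the FPSCR. The following Python sketch shows one way to model that selection for the signed 32-bit case, assuming the rounding mode is held in FPSCR.RMode at bits [23:22]. The names RMODE and vcvtr_s32 are illustrative, only finite inputs are handled, and exception flags are not modelled.

```python
import math

# FPSCR.RMode encodings mapped to a Python rounding function.
RMODE = {
    0b00: round,        # Round to Nearest (ties to even)
    0b01: math.ceil,    # Round towards Plus Infinity
    0b10: math.floor,   # Round towards Minus Infinity
    0b11: math.trunc,   # Round towards Zero
}

def vcvtr_s32(value, fpscr):
    """Convert a finite float to a signed 32-bit integer using FPSCR.RMode."""
    rmode = (fpscr >> 22) & 0b11
    n = RMODE[rmode](value)
    return max(-2**31, min(2**31 - 1, n))   # saturate to the S32 range
```

With RMode set to 0b00 the value 2.5 converts to 2, and with RMode set to 0b01 it converts to 3.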
F6.1.71 VCVTT


Convert to or from a half-precision value in the top half of a single-precision register does one of the following:


• Converts the half-precision value in the top half of a single-precision register to single-precision and writes
the result to a single-precision register.
• Converts the half-precision value in the top half of a single-precision register to double-precision and writes
the result to a double-precision register.
• Converts the single-precision value in a single-precision register to half-precision and writes the result into
the top half of a single-precision register, preserving the other half of the destination register.
• Converts the double-precision value in a double-precision register to half-precision and writes the result into
the top half of a single-precision register, preserving the other half of the destination register.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881.


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: cond, D, op, Vd, sz, T, M, Vm]


Half-precision to single-precision variant
Applies when op == 0 && sz == 0.


VCVTT{<c>}{<q>}.F32.F16 <Sd>, <Sm>


Half-precision to double-precision variant
Applies when op == 0 && sz == 1.


VCVTT{<c>}{<q>}.F64.F16 <Dd>, <Sm>


Single-precision to half-precision variant
Applies when op == 1 && sz == 0.


VCVTT{<c>}{<q>}.F16.F32 <Sd>, <Sm>


Double-precision to half-precision variant
Applies when op == 1 && sz == 1.


VCVTT{<c>}{<q>}.F16.F64 <Sd>, <Dm>


Decode for all variants of this encoding 


uses_double = (sz == '1'); convert_from_half = (op == '0');
lowbit = (if T == '1' then 16 else 0);
if uses_double then
    if convert_from_half then
        d = UInt(D:Vd); m = UInt(Vm:M);
    else
        d = UInt(Vd:D); m = UInt(M:Vm);
else
    d = UInt(Vd:D); m = UInt(Vm:M);
T1


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, op, Vd, sz, T, M, Vm]


Half-precision to single-precision variant
Applies when op == 0 && sz == 0.


VCVTT{<c>}{<q>}.F32.F16 <Sd>, <Sm>


Half-precision to double-precision variant
Applies when op == 0 && sz == 1.


VCVTT{<c>}{<q>}.F64.F16 <Dd>, <Sm>


Single-precision to half-precision variant
Applies when op == 1 && sz == 0.


VCVTT{<c>}{<q>}.F16.F32 <Sd>, <Sm>


Double-precision to half-precision variant
Applies when op == 1 && sz == 1.


VCVTT{<c>}{<q>}.F16.F64 <Sd>, <Dm>


Decode for all variants of this encoding 


uses_double = (sz == '1'); convert_from_half = (op == '0');
lowbit = (if T == '1' then 16 else 0);
if uses_double then
    if convert_from_half then
        d = UInt(D:Vd); m = UInt(Vm:M);
    else
        d = UInt(Vd:D); m = UInt(M:Vm);
else
    d = UInt(Vd:D); m = UInt(Vm:M);


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    bits(16) hp;
    if convert_from_half then
        hp = S[m]<lowbit+15:lowbit>;
        if uses_double then
            D[d] = FPConvert(hp, FPSCR);
        else
            S[d] = FPConvert(hp, FPSCR);
    else
        if uses_double then
            hp = FPConvert(D[m], FPSCR);
        else
            hp = FPConvert(S[m], FPSCR);
        S[d]<lowbit+15:lowbit> = hp;
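
The top-half selection (lowbit = 16) and the preservation of the other half of the destination can be sketched in Python with the struct module, whose "e" format is IEEE 754 binary16. The function names are illustrative, registers are modelled as 32-bit integers, and the sketch assumes the IEEE half-precision format with round-to-nearest rather than whatever FPSCR selects.

```python
import struct

def vcvtt_f32_f16(s_m):
    """Half to single: convert bits [31:16] of a 32-bit register value."""
    hp = (s_m >> 16) & 0xFFFF
    return struct.unpack("<e", struct.pack("<H", hp))[0]

def vcvtt_f16_f32(s_d, value):
    """Single to half: write the result to bits [31:16], keep bits [15:0]."""
    hp = struct.unpack("<H", struct.pack("<e", value))[0]
    return (hp << 16) | (s_d & 0xFFFF)
```

For example, a register holding 0x3C001234 converts to 1.0, because 0x3C00 is the half-precision encoding of 1.0 and the bottom half is ignored.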
F6.1.72 VDIV 
Divide divides one floating-point value by another floating-point value and writes the result to a third floating-point 
register. 
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: cond, D, Vn, Vd, size, N, M, Vm]
Single-precision scalar variant 
Applies when size == 10. 
VDIV{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VDIV{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 
Decode for all variants of this encoding 
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, Vn, Vd, size, N, M, Vm]
Single-precision scalar variant 
Applies when size == 10. 
VDIV{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VDIV{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 
Decode for all variants of this encoding 
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            S[d] = FPDiv(S[n], S[m], FPSCR);
        when 64
            D[d] = FPDiv(D[n], D[m], FPSCR);
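
The decode uses two different concatenations for register numbers: Vd:D for single-precision registers and D:Vd for double-precision registers. The following Python sketch, with illustrative function names, shows the arithmetic behind each form for a 4-bit Vx field and a 1-bit extension.

```python
def s_reg_index(vx, x):
    """UInt(Vx:X): the extra bit is the least significant bit (S0 to S31)."""
    return (vx << 1) | x

def d_reg_index(x, vx):
    """UInt(X:Vx): the extra bit is the most significant bit (D0 to D31)."""
    return (x << 4) | vx
```

For example, Vd = 0b0011 with D = 1 names S7 in the single-precision variant and D19 in the double-precision variant.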
F6.1.73 VDUP (general-purpose register)


Duplicate general-purpose register to vector duplicates an element from a general-purpose register into every 
element of the destination vector. 


The destination vector elements can be 8-bit, 16-bit, or 32-bit fields. The source element is the least significant 8, 
16, or 32 bits of the general-purpose register. There is no distinction between data types. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 2 1 0|
[Encoding diagram; fields: cond, B, Q, Vd, Rt, D, E]


A1 variant


VDUP{<c>}{<q>}.<size> <Qd>, <Rt> // Encoded as Q = 1
VDUP{<c>}{<q>}.<size> <Dd>, <Rt> // Encoded as Q = 0


Decode for this encoding 


if Q == '1' && Vd<0> == '1' then UNDEFINED;
d = UInt(D:Vd); t = UInt(Rt); regs = if Q == '0' then 1 else 2;
case B:E of
    when '00' esize = 32; elements = 2;
    when '01' esize = 16; elements = 4;
    when '10' esize = 8; elements = 8;
    when '11' UNDEFINED;
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 2 1 0|
[Encoding diagram; fields: B, Q, Vd, Rt, D, E]


T1 variant


VDUP{<c>}{<q>}.<size> <Qd>, <Rt> // Encoded as Q = 1
VDUP{<c>}{<q>}.<size> <Dd>, <Rt> // Encoded as Q = 0


Decode for this encoding 


if Q == '1' && Vd<0> == '1' then UNDEFINED;
d = UInt(D:Vd); t = UInt(Rt); regs = if Q == '0' then 1 else 2;
case B:E of
    when '00' esize = 32; elements = 2;
    when '01' esize = 16; elements = 4;
    when '10' esize = 8; elements = 8;
    when '11' UNDEFINED;
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. ARM strongly recommends that any VDUP 
instruction is unconditional, see Conditional execution on page F2-2407. 
<q> See Standard assembler syntax fields on page F2-2406. 
<size> The data size for the elements of the destination vector. It must be one of: 
8 Encoded as [b, e] = 0b10. 
16 Encoded as [b, e] = 0b01. 
32 Encoded as [b, e] = 0b00. 
<Qd> The destination vector for a quadword operation. 
<Dd> The destination vector for a doubleword operation. 
<Rt> The ARM source register. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    scalar = R[t]<esize-1:0>;
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = scalar;
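
The decode and operation above can be sketched together in Python. The function name vdup_gpr and the modelling of each D register as a 64-bit integer are assumptions of the example; condition checks and enable checks are omitted.

```python
def vdup_gpr(rt, b, e, q):
    """Return the 64-bit values written to D[d] (and D[d+1] when q == 1)."""
    layouts = {(0, 0): (32, 2), (0, 1): (16, 4), (1, 0): (8, 8)}
    if (b, e) not in layouts:
        raise ValueError("B:E == '11' is UNDEFINED")
    esize, elements = layouts[(b, e)]
    scalar = rt & ((1 << esize) - 1)          # R[t]<esize-1:0>
    d_value = 0
    for lane in range(elements):              # copy the scalar into every lane
        d_value |= scalar << (lane * esize)
    regs = 1 if q == 0 else 2
    return [d_value] * regs
```

For example, duplicating the low byte of 0x12345678 into a doubleword register gives 0x7878787878787878.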
F6.1.74 VDUP (scalar) 
Duplicate vector element to vector duplicates a single element of a vector into every element of the destination 
vector. 
The scalar, and the destination vector elements, can be any one of 8-bit, 16-bit, or 32-bit fields. There is no 
distinction between data types. 
For more information about scalars see Advanced SIMD scalars on page F2-2432. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, imm4, Vd, Q, M, Vm]
Encoding 
Applies when Q == 0. 
VDUP{<c>}{<q>}.<size> <Dd>, <Dm[x]> 
Encoding 
Applies when Q == 1. 
VDUP{<c>}{<q>}.<size> <Qd>, <Dm[x]> 
Decode for all variants of this encoding 
if imm4 == 'x000' then UNDEFINED;
if Q == '1' && Vd<0> == '1' then UNDEFINED;
case imm4 of
    when 'xxx1' esize = 8; elements = 8; index = UInt(imm4<3:1>);
    when 'xx10' esize = 16; elements = 4; index = UInt(imm4<3:2>);
    when 'x100' esize = 32; elements = 2; index = UInt(imm4<3>);
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, imm4, Vd, Q, M, Vm]
Encoding 
Applies when Q == 0. 
VDUP{<c>}{<q>}.<size> <Dd>, <Dm[x]> 
Encoding 
Applies when Q == 1. 
VDUP{<c>}{<q>}.<size> <Qd>, <Dm[x]> 
Decode for all variants of this encoding


if imm4 == 'x000' then UNDEFINED;
if Q == '1' && Vd<0> == '1' then UNDEFINED;
case imm4 of
    when 'xxx1' esize = 8; elements = 8; index = UInt(imm4<3:1>);
    when 'xx10' esize = 16; elements = 4; index = UInt(imm4<3:2>);
    when 'x100' esize = 32; elements = 2; index = UInt(imm4<3>);
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> The data size. It must be one of: 
8 Encoded as imm4<0> = '1'. imm4<3:1> encodes the index [x] of the scalar. 
16 Encoded as imm4<1:0> = '10'. imm4<3:2> encodes the index [x] of the scalar. 
32 Encoded as imm4<2:0> = '100'. imm4<3> encodes the index [x] of the scalar. 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm[x]> The scalar. For details of how [x] is encoded, see the description of <size>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    scalar = Elem[D[m],index,esize];
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = scalar;
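
The imm4 field encodes both the element size and the scalar index: the position of the lowest set bit selects the size, and the bits above it give the index. A minimal Python sketch of that decode follows; the function name decode_imm4 is illustrative, and None stands in for the UNDEFINED case.

```python
def decode_imm4(imm4):
    """Return (esize, elements, index) for a 4-bit imm4, or None if UNDEFINED."""
    if (imm4 & 0b0001) == 0b0001:     # 'xxx1'
        return 8, 8, imm4 >> 1        # index = imm4<3:1>
    if (imm4 & 0b0011) == 0b0010:     # 'xx10'
        return 16, 4, imm4 >> 2       # index = imm4<3:2>
    if (imm4 & 0b0111) == 0b0100:     # 'x100'
        return 32, 2, imm4 >> 3       # index = imm4<3>
    return None                       # 'x000' is UNDEFINED
```

For example, imm4 = 0b0110 selects 16-bit elements with index 1.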
F6.1.75 VEOR 
Vector Bitwise Exclusive OR performs a bitwise Exclusive OR operation between two registers, and places the 
result in the destination register. The operand and result registers can be quadword or doubleword. They must all be 
the same size. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, Vn, Vd, N, Q, M, Vm]
64-bit SIMD vector variant 
Applies when Q == 0. 
VEOR{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VEOR{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, Vn, Vd, N, Q, M, Vm]
64-bit SIMD vector variant 
Applies when Q == 0. 
VEOR{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VEOR{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[n+r] EOR D[m+r];
F6.1.76 VEXT (byte elements) 
Vector Extract extracts elements from the bottom end of the second operand vector and the top end of the first, 
concatenates them and places the result in the destination vector. 
The elements of the vectors are treated as being 8-bit fields. There is no distinction between data types. 
The following figure shows an example of the operation of VEXT doubleword operation for imm = 3. 
[Figure: operand vectors Vm and Vn, each with byte lanes 7 to 0, and the result vector Vd]
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
This instruction is used by the pseudo-instruction VEXT (multibyte elements). The pseudo-instruction is never the 
preferred disassembly. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, Vn, Vd, imm4, N, Q, M, Vm]
64-bit SIMD vector variant 
Applies when Q == 0. 
VEXT{<c>}{<q>}.8 {<Dd>,} <Dn>, <Dm>, #<imm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VEXT{<c>}{<q>}.8 {<Qd>,} <Qn>, <Qm>, #<imm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if Q == '0' && imm4<3> == '1' then UNDEFINED;
quadword_operation = (Q == '1'); position = 8 * UInt(imm4);
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 8|7 6 5 4|3 0|
[Encoding diagram; fields: D, Vn, Vd, imm4, N, Q, M, Vm]
64-bit SIMD vector variant 
Applies when Q == 0. 
VEXT{<c>}{<q>}.8 {<Dd>,} <Dn>, <Dm>, #<imm> 


128-bit SIMD vector variant 


Applies when Q == 1. 


VEXT{<c>}{<q>}.8 {<Qd>,} <Qn>, <Qm>, #<imm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if Q == '0' && imm4<3> == '1' then UNDEFINED;
quadword_operation = (Q == '1'); position = 8 * UInt(imm4);
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


Alias conditions


Pseudo-instruction            is preferred when
VEXT (multibyte elements)     Never





Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.

<imm> For the 64-bit SIMD vector variant: is the location of the extracted result in the concatenation of the
operands, as a number of bytes from the least significant end, in the range 0 to 7, encoded in the
"imm4" field.

For the 128-bit SIMD vector variant: is the location of the extracted result in the concatenation of
the operands, as a number of bytes from the least significant end, in the range 0 to 15, encoded in
the "imm4" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if quadword_operation then
        Q[d>>1] = (Q[m>>1]:Q[n>>1])<position+127:position>;
    else
        D[d] = (D[m]:D[n])<position+63:position>;
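
The extraction can be sketched in Python by treating each register as a little-endian byte string, so that index 0 is the least significant byte. The function name vext8 is illustrative; the same code covers the doubleword case (8-byte operands) and the quadword case (16-byte operands).

```python
def vext8(vn, vm, imm):
    """Extract len(vn) bytes from Vm:Vn, starting imm bytes from the low end."""
    if len(vn) != len(vm) or not 0 <= imm < len(vn):
        raise ValueError("operand sizes must match and imm must be in range")
    concat = vn + vm                 # Vn supplies the low half of (Vm:Vn)
    return concat[imm:imm + len(vn)]
```

With imm = 3 and doubleword operands, the result holds bytes 3 to 7 of Vn in its low five lanes and bytes 0 to 2 of Vm in its top three lanes.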
F6.1.77 VEXT (multibyte elements)


Vector Extract extracts elements from the bottom end of the second operand vector and the top end of the first,
concatenates them and places the result in the destination vector.


This instruction is a pseudo-instruction of the VEXT (byte elements) instruction. This means that:


• The encodings in this description are named to match the encodings of VEXT (byte elements).
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VEXT (byte elements) gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19 16|15 12|11  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  0| 1| D| 1  1|  Vn |  Vd | imm4| N| Q| M| 0|  Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 

VEXT{<c>}{<q>}.<size> {<Dd>,} <Dn>, <Dm>, #<imm> 

is equivalent to 

VEXT{<c>}{<q>}.8 {<Dd>,} <Dn>, <Dm>, #<imm*(size/8)>


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 

VEXT{<c>}{<q>}.<size> {<Qd>,} <Qn>, <Qm>, #<imm> 

is equivalent to 

VEXT{<c>}{<q>}.8 {<Qd>,} <Qn>, <Qm>, #<imm*(size/8)>


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  1| 1| D| 1  1|  Vn |  Vd | imm4| N| Q| M| 0|  Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 

VEXT{<c>}{<q>}.<size> {<Dd>,} <Dn>, <Dm>, #<imm> 

is equivalent to 

VEXT{<c>}{<q>}.8 {<Dd>,} <Dn>, <Dm>, #<imm*(size/8)>


and is never the preferred disassembly. 
128-bit SIMD vector variant


Applies when Q == 1. 


VEXT{<c>}{<q>}.<size> {<Qd>,} <Qn>, <Qm>, #<imm> 


is equivalent to 


VEXT{<c>}{<q>}.8 {<Qd>,} <Qn>, <Qm>, #<imm*(size/8)>


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<size> For the 64-bit SIMD vector variant: is the size of the operation, and can be one of 16 or 32.
For the 128-bit SIMD vector variant: is the size of the operation, and can be one of 16, 32 or 64.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.
<imm> For the 64-bit SIMD vector variant: is the location of the extracted result in the concatenation of the
operands, as a number of <size>-bit elements from the least significant end, in the range 0 to (64/<size>)-1.
For the 128-bit SIMD vector variant: is the location of the extracted result in the concatenation of
the operands, as a number of <size>-bit elements from the least significant end, in the range 0 to (128/<size>)-1.


Operation for all encodings 


The description of VEXT (byte elements) gives the operational pseudocode for this instruction. 
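
The mapping from the multibyte form to the byte form can be sketched in Python as follows. This is an illustration only, and the function name is hypothetical. It assumes that the scaled immediate must fit the byte-form range for the vector length, so that <imm> is limited to the number of <size>-bit elements in one register minus one.

```python
def vext_multibyte_to_byte_form(size: int, imm: int, quadword: bool) -> int:
    """Return the byte-form immediate for VEXT.<size> ..., #<imm>.

    size is the element size in bits (16 or 32, plus 64 for Q registers).
    The result is the #imm value of the equivalent VEXT.8 instruction.
    """
    vector_bits = 128 if quadword else 64
    allowed_sizes = (16, 32, 64) if quadword else (16, 32)
    if size not in allowed_sizes:
        raise ValueError("size not permitted for this variant")
    elements = vector_bits // size
    if not 0 <= imm < elements:
        raise ValueError("imm must be in the range 0 to elements-1")
    # <imm*(size/8)>: scale the element offset to a byte offset.
    return imm * (size // 8)


print(vext_multibyte_to_byte_form(16, 3, quadword=False))
```

For example, VEXT.16 with #3 on D registers corresponds to VEXT.8 with #6 under these assumptions.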
F6.1.78 VFMA


Vector Fused Multiply Accumulate multiplies corresponding elements of two vectors, and accumulates the results
into the elements of the destination vector. The instruction does not round the result of the multiply before the
accumulation.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  0| 0| D| 0|sz|  Vn |  Vd | 1  1  0  0| N| Q| M| 1|  Vm |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 
VFMA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VFMA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; op1_neg = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
regs = if Q == '0' then 1 else 2;


A2 


|31  28|27 26 25 24|23|22|21 20|19 16|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
|!=1111| 1  1  1  0| 1| D| 1  0|  Vn |  Vd | 1  0| 1  x| N| 0| M| 0|  Vm |
  cond                                            size     op


Single-precision scalar variant 
Applies when size == 10. 
VFMA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 


VFMA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE; op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3  0|15 12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  1| 0| D| 0|sz|  Vn |  Vd | 1  1  0  0| N| Q| M| 1|  Vm |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VFMA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VFMA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; op1_neg = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
regs = if Q == '0' then 1 else 2;


T2 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  0| 1| D| 1  0|  Vn |  Vd | 1  0| 1  x| N| 0| M| 0|  Vm |
                                                       size     op


Single-precision scalar variant 
Applies when size == 10. 


VFMA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VFMA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE; op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following
values:
F32 when sz = 0
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field.
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field.


Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                bits(esize) op1 = Elem[D[n+r],e,esize];
                if op1_neg then op1 = FPNeg(op1);
                Elem[D[d+r],e,esize] = FPMulAdd(Elem[D[d+r],e,esize],
                                                op1, Elem[D[m+r],e,esize], StandardFPSCRValue());
    else // VFP instruction
        case esize of
            when 32
                op32 = if op1_neg then FPNeg(S[n]) else S[n];
                S[d] = FPMulAdd(S[d], op32, S[m], FPSCR);
            when 64
                op64 = if op1_neg then FPNeg(D[n]) else D[n];
                D[d] = FPMulAdd(D[d], op64, D[m], FPSCR);
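
The effect of not rounding the product before the accumulation can be seen in a small Python sketch. It is a model only, not the architected FPMulAdd(): it assumes finite double-precision inputs, round-to-nearest, and no flush-to-zero or special handling of signed zeros, and it obtains the single rounding by computing the exact value with rational arithmetic. The names fp_mul_add and vfma_scalar are hypothetical.

```python
from fractions import Fraction


def fp_mul_add(addend: float, op1: float, op2: float) -> float:
    """addend + op1*op2 with a single rounding (finite inputs only)."""
    exact = Fraction(addend) + Fraction(op1) * Fraction(op2)
    return float(exact)  # one rounding, to nearest even


def vfma_scalar(d: float, n: float, m: float, op1_neg: bool = False) -> float:
    """Mirror the scalar path: optionally negate op1, then fuse."""
    op1 = -n if op1_neg else n
    return fp_mul_add(d, op1, m)


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
fused = vfma_scalar(-1.0, a, b)   # exact product 1 - 2**-60 is kept
unfused = (a * b) + (-1.0)        # product is rounded to 1.0 first
print(fused, unfused)
```

In this example the separately rounded multiply and add lose the low-order term entirely, while the fused form returns it.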
F6.1.79 VFMS


Vector Fused Multiply Subtract negates the elements of one vector and multiplies them with the corresponding 
elements of another vector, adds the products to the corresponding elements of the destination vector, and places the 
results in the destination vector. The instruction does not round the result of the multiply before the addition. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21|20|19 16|15 12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  0| 0| D| 1|sz|  Vn |  Vd | 1  1  0  0| N| Q| M| 1|  Vm |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VFMS{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VFMS{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; op1_neg = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
regs = if Q == '0' then 1 else 2;


A2 
|31  28|27 26 25 24|23|22|21 20|19 16|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
|!=1111| 1  1  1  0| 1| D| 1  0|  Vn |  Vd | 1  0| 1  x| N| 1| M| 0|  Vm |
  cond                                            size     op


Single-precision scalar variant 
Applies when size == 10. 


VFMS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VFMS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm>


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE; op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3  0|15 12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  1| 0| D| 1|sz|  Vn |  Vd | 1  1  0  0| N| Q| M| 1|  Vm |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VFMS{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VFMS{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; op1_neg = (op == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
regs = if Q == '0' then 1 else 2;


T2


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  0| 1| D| 1  0|  Vn |  Vd | 1  0| 1  x| N| 1| M| 0|  Vm |
                                                       size     op


Single-precision scalar variant
Applies when size == 10.


VFMS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm>


Double-precision scalar variant
Applies when size == 11.


VFMS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm>


Decode for all variants of this encoding


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE; op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);







Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following
values:
F32 when sz = 0
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field.
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                bits(esize) op1 = Elem[D[n+r],e,esize];
                if op1_neg then op1 = FPNeg(op1);
                Elem[D[d+r],e,esize] = FPMulAdd(Elem[D[d+r],e,esize],
                                                op1, Elem[D[m+r],e,esize], StandardFPSCRValue());
    else // VFP instruction
        case esize of
            when 32
                op32 = if op1_neg then FPNeg(S[n]) else S[n];
                S[d] = FPMulAdd(S[d], op32, S[m], FPSCR);
            when 64
                op64 = if op1_neg then FPNeg(D[n]) else D[n];
                D[d] = FPMulAdd(D[d], op64, D[m], FPSCR);
F6.1.80 VFNMA


Vector Fused Negate Multiply Accumulate negates one floating-point register value and multiplies it by another
floating-point register value, adds the negation of the floating-point value in the destination register to the product,
and writes the result back to the destination register. The instruction does not round the result of the multiply before
the addition.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31  28|27 26 25 24|23|22|21 20|19 16|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
|!=1111| 1  1  1  0| 1| D| 0  1|  Vn |  Vd | 1  0| 1  x| N| 1| M| 0|  Vm |
  cond                                            size     op


Single-precision scalar variant 
Applies when size == 10. 


VFNMA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VFNMA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  0| 1| D| 0  1|  Vn |  Vd | 1  0| 1  x| N| 1| M| 0|  Vm |
                                                       size     op


Single-precision scalar variant 
Applies when size == 10. 


VFNMA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VFNMA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            op32 = if op1_neg then FPNeg(S[n]) else S[n];
            S[d] = FPMulAdd(FPNeg(S[d]), op32, S[m], FPSCR);
        when 64
            op64 = if op1_neg then FPNeg(D[n]) else D[n];
            D[d] = FPMulAdd(FPNeg(D[d]), op64, D[m], FPSCR);
F6.1.81 VFNMS


Vector Fused Negate Multiply Subtract multiplies together two floating-point register values, adds the negation of
the floating-point value in the destination register to the product, and writes the result back to the destination
register. The instruction does not round the result of the multiply before the addition.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31  28|27 26 25 24|23|22|21 20|19 16|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
|!=1111| 1  1  1  0| 1| D| 0  1|  Vn |  Vd | 1  0| 1  x| N| 0| M| 0|  Vm |
  cond                                            size     op


Single-precision scalar variant 
Applies when size == 10. 


VFNMS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VFNMS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm>


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  0| 1| D| 0  1|  Vn |  Vd | 1  0| 1  x| N| 0| M| 0|  Vm |
                                                       size     op


Single-precision scalar variant 
Applies when size == 10. 


VFNMS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VFNMS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm>


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
op1_neg = (op == '1');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols 
<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
Operation for all encodings 
if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            op32 = if op1_neg then FPNeg(S[n]) else S[n];
            S[d] = FPMulAdd(FPNeg(S[d]), op32, S[m], FPSCR);
        when 64
            op64 = if op1_neg then FPNeg(D[n]) else D[n];
            D[d] = FPMulAdd(FPNeg(D[d]), op64, D[m], FPSCR);
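
The sign conventions of the four fused scalar instructions, VFMA, VFMS, VFNMA, and VFNMS, can be compared side by side in a Python sketch. This is an illustration derived from the instruction descriptions, not architected pseudocode. It assumes finite double-precision inputs and round-to-nearest, and the names fp_mul_add and fused_family are hypothetical.

```python
from fractions import Fraction


def fp_mul_add(addend: float, op1: float, op2: float) -> float:
    """addend + op1*op2, rounded once (finite inputs only)."""
    return float(Fraction(addend) + Fraction(op1) * Fraction(op2))


def fused_family(mnemonic: str, d: float, n: float, m: float) -> float:
    """Scalar result of VFMA, VFMS, VFNMA or VFNMS for finite inputs."""
    # (negate addend, negate first multiplicand) for each mnemonic.
    negate = {
        "VFMA": (False, False),
        "VFMS": (False, True),
        "VFNMA": (True, True),
        "VFNMS": (True, False),
    }
    neg_d, neg_n = negate[mnemonic]
    addend = -d if neg_d else d
    op1 = -n if neg_n else n
    return fp_mul_add(addend, op1, m)


for name in ("VFMA", "VFMS", "VFNMA", "VFNMS"):
    print(name, fused_family(name, 1.0, 2.0, 3.0))
```

With d = 1.0, n = 2.0, and m = 3.0, the four forms compute d + n*m, d - n*m, -d - n*m, and -d + n*m respectively.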
F6.1.82 VHADD


Vector Halving Add adds corresponding elements in two vectors of integers, shifts each result right one bit, and 
places the final results in the destination vector. The results of the halving operations are truncated. For rounded 
results, see VRHADD.


The operand and result elements are all the same type, and can be any one of: 
• 8-bit, 16-bit, or 32-bit signed integers.
• 8-bit, 16-bit, or 32-bit unsigned integers.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10| 9| 8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1| U| 0| D| size|  Vn |  Vd | 0  0| 0| 0| N| Q| M| 0|  Vm |
                                                       op


64-bit SIMD vector variant 
Applies when Q == 0. 


VHADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VHADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
add = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13|12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9| 8| 7| 6| 5| 4| 3  0|
| 1  1  1| U| 1  1  1  1| 0| D| size|  Vn |  Vd | 0  0| 0| 0| N| Q| M| 0|  Vm |
                                                       op


64-bit SIMD vector variant 

Applies when Q == 0. 
VHADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VHADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
add = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Int(Elem[D[n+r],e,esize], unsigned);
            op2 = Int(Elem[D[m+r],e,esize], unsigned);
            result = if add then op1+op2 else op1-op2;
            Elem[D[d+r],e,esize] = result<esize:1>;
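
The per-lane behavior of the halving operations, including the truncation implied by result<esize:1>, can be sketched in Python. This is a model of one lane only, and halving_op is a hypothetical name. It relies on Python integers being unbounded, which matches the unbounded integer arithmetic of the pseudocode.

```python
def halving_op(a: int, b: int, esize: int, unsigned: bool, add: bool = True) -> int:
    """One lane of VHADD (add=True) or VHSUB (add=False).

    a and b are raw esize-bit lane values. The return value is the raw
    esize-bit result lane.
    """
    mask = (1 << esize) - 1

    def to_int(bits: int) -> int:
        # Int(x, unsigned): interpret the lane as unsigned or two's complement.
        bits &= mask
        if not unsigned and bits >> (esize - 1):
            return bits - (1 << esize)
        return bits

    op1, op2 = to_int(a), to_int(b)
    result = op1 + op2 if add else op1 - op2
    # result<esize:1>: drop bit 0 (truncate) and keep the next esize bits.
    return (result >> 1) & mask


print(hex(halving_op(0xFF, 0xFF, 8, unsigned=True)))
```

Because the sum is formed at full precision before the shift, adding two U8 lanes of 0xFF gives 0xFF and does not overflow.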
F6.1.83 VHSUB


Vector Halving Subtract subtracts the elements of the second operand from the corresponding elements of the first 
operand, shifts each result right one bit, and places the final results in the destination vector. The results of the 
halving operations are truncated. There is no rounding version. 


The operand and result elements are all the same type, and can be any one of: 
• 8-bit, 16-bit, or 32-bit signed integers.
• 8-bit, 16-bit, or 32-bit unsigned integers.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10| 9| 8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1| U| 0| D| size|  Vn |  Vd | 0  0| 1| 0| N| Q| M| 0|  Vm |
                                                       op


64-bit SIMD vector variant 
Applies when Q == 0. 


VHSUB{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VHSUB{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
add = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13|12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9| 8| 7| 6| 5| 4| 3  0|
| 1  1  1| U| 1  1  1  1| 0| D| size|  Vn |  Vd | 0  0| 1| 0| N| Q| M| 0|  Vm |
                                                       op


64-bit SIMD vector variant 

Applies when Q == 0. 
VHSUB{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VHSUB{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
add = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Int(Elem[D[n+r],e,esize], unsigned);
            op2 = Int(Elem[D[m+r],e,esize], unsigned);
            result = if add then op1+op2 else op1-op2;
            Elem[D[d+r],e,esize] = result<esize:1>;
F6.1.84 VLD1 (single element to one lane)


Load single 1-element structure to one lane of one register loads one element from memory into one element of a 
register. Elements of the register that are not loaded are unchanged. For details of the addressing mode see The 
Advanced SIMD addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 16|15 12|11 10| 9  8| 7        4| 3  0|
| 1  1  1  1| 0  1  0  0| 1| D| 1  0|  Rn |  Vd | !=11| 0  0|index_align|  Rm |
                                                 size


Offset variant 
Applies when Rm == 1111. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then SEE VLD1 (single element to all lanes);
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1; index = UInt(index_align<3:1>); alignment = 1;
    when '01'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 2; index = UInt(index_align<3:2>);
        alignment = if index_align<0> == '0' then 1 else 2;
    when '10'
        if index_align<2> != '0' then UNDEFINED;
        if index_align<1:0> != '00' && index_align<1:0> != '11' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        alignment = if index_align<1:0> == '00' then 1 else 4;
d = UInt(D:Vd); n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 then UNPREDICTABLE;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15 12|11 10| 9  8| 7        4| 3  0|
| 1  1  1  1| 1  0  0  1| 1| D| 1  0|  Rn |  Vd | !=11| 0  0|index_align|  Rm |
                                                 size
Offset variant
Applies when Rm == 1111. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then SEE VLD1 (single element to all lanes);
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1; index = UInt(index_align<3:1>); alignment = 1;
    when '01'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 2; index = UInt(index_align<3:2>);
        alignment = if index_align<0> == '0' then 1 else 2;
    when '10'
        if index_align<2> != '0' then UNDEFINED;
        if index_align<1:0> != '00' && index_align<1:0> != '11' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        alignment = if index_align<1:0> == '00' then 1 else 4;
d = UInt(D:Vd); n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 then UNPREDICTABLE;








Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01
32 when size = 10 
<list> Is a list containing the single 64-bit name of the SIMD&FP register holding the element. The list
must be { <Dd>[<index>] }. The register <Dd> is encoded in the "D:Vd" field. The permitted values
and encoding of <index> depend on <size>:
<size> == 8 <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
<size> == 16 <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
<size> == 32 <index> is 0 or 1, encoded in the "index_align<3>" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.


<align> When <size> == 8, <align> must be omitted, otherwise it is the optional alignment. Whenever 
<align> is omitted, the standard alignment is used, see Unaligned data access on page E2-2323, and 
the encoding depends on <size>: 


<size> == 8 Encoded in the "index_align<0>" field as 0.
<size> == 16 Encoded in the "index_align<1:0>" field as 0b00. 
<size> == 32 Encoded in the "index_align<2:0>" field as 0b000. 


Whenever <align> is present, the permitted values and encoding depend on <size>: 


<size> == 16 <align> is 16, meaning 16-bit alignment, encoded in the 
"index_align<1:0>" field as 0b01. 


<size> == 32 <align> is 32, meaning 32-bit alignment, encoded in the 
"index_align<2:0>" field as 0b011. 


: is the preferred separator before the <align> value, but the alignment can be specified as @<align>, 
see The Advanced SIMD addressing mode on page F2-2427. 


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    Elem[D[d],index] = MemU[address,ebytes];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + ebytes;
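
The base register update shared by the three addressing variants can be sketched in Python. This is a hedged illustration rather than a definitive model: the function name post_access_base, its parameters, and the 32-bit wraparound mask are assumptions introduced here.

```python
def post_access_base(rn_value: int, rm_field: int, rm_value: int, increment: int):
    """Return the new value of the base register, or None if there is no writeback.

    rm_field is the 4-bit Rm field of the encoding:
      15 -> offset variant, no writeback
      13 -> post-indexed by the size of the transfer ("!" syntax)
      other -> post-indexed by the value of register Rm
    """
    wback = rm_field != 15
    register_index = rm_field not in (13, 15)
    if not wback:
        return None
    step = rm_value if register_index else increment
    # Assumption of this sketch: general-purpose registers are 32 bits wide.
    return (rn_value + step) & 0xFFFFFFFF
```

For VLD1 (single element to one lane) the increment argument would be ebytes; other instructions in this group use a different transfer size.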
F6.1.85 VLD1 (single element to all lanes)


Load single 1-element structure and replicate to all lanes of one register loads one element from memory into every 
element of one or two vectors. For details of the addressing mode see The Advanced SIMD addressing mode on 
page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|    Rn     |    Vd     | 1  1  0  0|size   T  a|    Rm     |


Offset variant 
Applies when Rm == 1111. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' || (size == '00' && a == '1') then UNDEFINED;
ebytes = 1 << UInt(size); regs = if T == '0' then 1 else 2;
alignment = if a == '0' then 1 else ebytes;
d = UInt(D:Vd); n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d+regs > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d+regs > 32, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


T1 


|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|    Rn     |    Vd     | 1  1  0  0|size   T  a|    Rm     |

Offset variant
Applies when Rm == 1111. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' || (size == '00' && a == '1') then UNDEFINED;
ebytes = 1 << UInt(size); regs = if T == '0' then 1 else 2;
alignment = if a == '0' then 1 else ebytes;
d = UInt(D:Vd); n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d+regs > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d+regs > 32, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD1 (single element to all lanes) on
page K1-5470.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 


unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01 
32 when size = 10 


The encoding size = 11 is reserved. 


<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of: 
{ <Dd>[] } Encoded in the "T" field as 0. 
{ <Dd>[], <Dd+1>[] } Encoded in the "T" field as 1. 
The register <Dd> is encoded in the "D:Vd" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> When <size> == 8, <align> must be omitted, otherwise it is the optional alignment. Whenever
<align> is omitted, the standard alignment is used, see Unaligned data access on page E2-2323, and
is encoded in the "a" field as 0. Whenever <align> is present, the permitted values and encoding
depend on <size>:

<size> == 16 <align> is 16, meaning 16-bit alignment, encoded in the "a" field as 1.
<size> == 32 <align> is 32, meaning 32-bit alignment, encoded in the "a" field as 1.

: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    bits(64) replicated_element = Replicate(MemU[address,ebytes]);
    for r = 0 to regs-1
        D[d+r] = replicated_element;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + ebytes;
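
The effect of Replicate() on a 64-bit destination can be illustrated with a small Python sketch. The function name replicate_to_d and the modelling of a doubleword register as a Python integer are assumptions made for this example only.

```python
def replicate_to_d(element: int, ebytes: int) -> int:
    """Fill a 64-bit value with copies of an element that is ebytes bytes wide."""
    bits = 8 * ebytes
    element &= (1 << bits) - 1          # keep only the element's own bits
    result = 0
    for lane in range(8 // ebytes):     # 8, 4, or 2 lanes per 64-bit register
        result |= element << (lane * bits)
    return result
```

With regs == 2, the same 64-bit value would be written to both D[d] and D[d+1].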
F6.1.86 VLD1 (multiple single elements)
Load multiple single 1-element structures to one, two, three, or four registers loads elements from memory into one, 
two, three, or four registers, without de-interleaving. Every element of each register is loaded. For details of the 
addressing mode see The Advanced SIMD addressing mode on page F2-2427. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  1  0  0| 0  D  1  0|    Rn     |    Vd     | x  x  1  x|size  align|    Rm     |
                                                                type
Offset variant 
Applies when Rm == 1111. 
VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
case type of
    when '0111'
        regs = 1; if align<1> == '1' then UNDEFINED;
    when '1010'
        regs = 2; if align == '11' then UNDEFINED;
    when '0110'
        regs = 3; if align<1> == '1' then UNDEFINED;
    when '0010'
        regs = 4;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d+regs > 32 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If d+regs > 32, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
T1
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  0  0  1| 0  D  1  0|    Rn     |    Vd     | x  x  1  x|size  align|    Rm     |
                                                                type
Offset variant 
Applies when Rm == 1111. 
VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
case type of
    when '0111'
        regs = 1; if align<1> == '1' then UNDEFINED;
    when '1010'
        regs = 2; if align == '11' then UNDEFINED;
    when '0110'
        regs = 3; if align<1> == '1' then UNDEFINED;
    when '0010'
        regs = 4;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d+regs > 32 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If d+regs > 32, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD1 (multiple single elements) on
page K1-5470. 


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 
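
The decode of the type, align, and size fields for VLD1 (multiple single elements) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the names VLD1_TYPE_TO_REGS and decode_vld1_multiple are hypothetical, ValueError stands in for UNDEFINED, and LookupError stands in for the "Related encodings" case.

```python
# Register count selected by the 4-bit type field.
VLD1_TYPE_TO_REGS = {0b0111: 1, 0b1010: 2, 0b0110: 3, 0b0010: 4}

def decode_vld1_multiple(type_field: int, align: int, size: int):
    """Return (regs, alignment_bytes, ebytes, elements) for the given fields."""
    if type_field not in VLD1_TYPE_TO_REGS:
        raise LookupError('SEE "Related encodings"')
    regs = VLD1_TYPE_TO_REGS[type_field]
    # One and three registers allow at most 64-bit alignment (align<1> == 0).
    if regs in (1, 3) and (align & 0b10):
        raise ValueError("UNDEFINED")
    # Two registers allow at most 128-bit alignment.
    if regs == 2 and align == 0b11:
        raise ValueError("UNDEFINED")
    alignment = 1 if align == 0b00 else 4 << align   # 8, 16, or 32 bytes
    ebytes = 1 << size
    elements = 8 // ebytes
    return regs, alignment, ebytes, elements
```

The alignment returned is in bytes, so align == 0b10 corresponds to the 128-bit alignment described for the <align> symbol.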
Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10
64 when size = 11

<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
{ <Dd> }
Encoded in the "type" field as 0b0111.
{ <Dd>, <Dd+1> }
Encoded in the "type" field as 0b1010.
{ <Dd>, <Dd+1>, <Dd+2> }
Encoded in the "type" field as 0b0110.
{ <Dd>, <Dd+1>, <Dd+2>, <Dd+3> }
Encoded in the "type" field as 0b0010.
The register <Dd> is encoded in the "D:Vd" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
<align> is present, the permitted values are:

64 64-bit alignment, encoded in the "align" field as 0b01.
128 128-bit alignment, encoded in the "align" field as 0b10. Available only if <list> contains
two or four registers.
256 256-bit alignment, encoded in the "align" field as 0b11. Available only if <list> contains
four registers.

: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for r = 0 to regs-1
        for e = 0 to elements-1
            bits(ebytes*8) data;
            if ebytes != 8 then
                data = MemU[address,ebytes];
            else
                data<31:0> = if BigEndian() then MemU[address+4,4] else MemU[address,4];
                data<63:32> = if BigEndian() then MemU[address,4] else MemU[address+4,4];
            Elem[D[d+r],e] = data;
            address = address + ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 8*regs;
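
The element loop of this operation can be sketched in Python to show that consecutive elements fill one register completely before the next register is started. The sketch makes several assumptions that are not part of the architecture text: memory is modelled as a Python bytes object, data is little-endian, alignment checking and writeback are omitted, and the name vld1_multiple is hypothetical.

```python
def vld1_multiple(memory: bytes, address: int, regs: int, ebytes: int):
    """Return a list of regs registers, each a list of 8 // ebytes element values."""
    elements = 8 // ebytes
    registers = []
    for _ in range(regs):
        lanes = []
        for _ in range(elements):
            # Little-endian read of one element (assumption of this sketch).
            lanes.append(int.from_bytes(memory[address:address + ebytes], "little"))
            address += ebytes
        registers.append(lanes)
    return registers
```

Because there is no de-interleaving, the result is the same as copying 8*regs consecutive bytes into consecutive registers.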
F6.1.87 VLD2 (single 2-element structure to one lane)


Load single 2-element structure to one lane of two registers loads one 2-element structure from memory into 
corresponding elements of two registers. Elements of the registers that are not loaded are unchanged. For details of 
the addressing mode see The Advanced SIMD addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|    Rn     |    Vd     |!=11   0  1|index_align|    Rm     |
                                                             size
Offset variant 
Applies when Rm == 1111. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then SEE VLD2 (single 2-element structure to all lanes);
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 2;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '10'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d2 > 31, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
T1
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|    Rn     |    Vd     |!=11   0  1|index_align|    Rm     |
                                                             size
Offset variant 
Applies when Rm == 1111. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then SEE VLD2 (single 2-element structure to all lanes);
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 2;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '10'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d2 > 31, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD2 (single 2-element structure to 
one lane) on page K1-5470. 
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01 
32 when size = 10 
<list> Is a list containing the 64-bit names of the two SIMD&FP registers holding the element. The list
must be one of:
{ <Dd>[<index>], <Dd+1>[<index>] }
Single-spaced registers, encoded as "spacing" = 0.
{ <Dd>[<index>], <Dd+2>[<index>] }
Double-spaced registers, encoded as "spacing" = 1. Not permitted when <size> == 8.

The encoding of "spacing" depends on <size>:
<size> == 16 "spacing" is encoded in the "index_align<1>" field.
<size> == 32 "spacing" is encoded in the "index_align<2>" field.

The register <Dd> is encoded in the "D:Vd" field. The permitted values and encoding of <index>
depend on <size>:

<size> == 8 <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
<size> == 16 <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
<size> == 32 <index> is 0 or 1, encoded in the "index_align<3>" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field. 

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and the encoding depends on <size>:

<size> == 8 Encoded in the "index_align<0>" field as 0.
<size> == 16 Encoded in the "index_align<0>" field as 0.
<size> == 32 Encoded in the "index_align<1:0>" field as 0b00.

Whenever <align> is present, the permitted values and encoding depend on <size>:

<size> == 8 <align> is 16, meaning 16-bit alignment, encoded in the "index_align<0>"
field as 1.
<size> == 16 <align> is 32, meaning 32-bit alignment, encoded in the "index_align<0>"
field as 1.
<size> == 32 <align> is 64, meaning 64-bit alignment, encoded in the
"index_align<1:0>" field as 0b01.

: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    Elem[D[d], index] = MemU[address,ebytes];
    Elem[D[d2],index] = MemU[address+ebytes,ebytes];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 2*ebytes;
F6.1.88 VLD2 (single 2-element structure to all lanes)


Load single 2-element structure and replicate to all lanes of two registers loads one 2-element structure from 
memory into all lanes of two registers. For details of the addressing mode see The Advanced SIMD addressing mode 
on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|    Rn     |    Vd     | 1  1  0  1|size   T  a|    Rm     |


Offset variant 
Applies when Rm == 1111. 


VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 

if size == '11' then UNDEFINED;
ebytes = 1 << UInt(size);
alignment = if a == '0' then 1 else 2*ebytes;
inc = if T == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If d2 > 31, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


T1 


|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|    Rn     |    Vd     | 1  1  0  1|size   T  a|    Rm     |

Offset variant
Applies when Rm == 1111. 


VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
ebytes = 1 << UInt(size);
alignment = if a == '0' then 1 else 2*ebytes;
inc = if T == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d2 > 31, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD2 (single 2-element structure to 
all lanes) on page K1-5471. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01
32 when size = 10 


The encoding size = 11 is reserved. 


<list> Is a list containing the 64-bit names of two SIMD&FP registers. The list must be one of: 
{ <Dd>[], <Dd+1>[] } Single-spaced registers, encoded in the "T" field as 0.
{ <Dd>[], <Dd+2>[] } Double-spaced registers, encoded in the "T" field as 1.
The register <Dd> is encoded in the "D:Vd" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "a" field as 0. Whenever <align> is
present, the permitted values and encoding depend on <size>:

<size> == 8 <align> is 16, meaning 16-bit alignment, encoded in the "a" field as 1.
<size> == 16 <align> is 32, meaning 32-bit alignment, encoded in the "a" field as 1.
<size> == 32 <align> is 64, meaning 64-bit alignment, encoded in the "a" field as 1.


: is the preferred separator before the <align> value, but the alignment can be specified as @<align>, 
see The Advanced SIMD addressing mode on page F2-2427. 


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    D[d] = Replicate(MemU[address,ebytes]);
    D[d2] = Replicate(MemU[address+ebytes,ebytes]);
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 2*ebytes;
F6.1.89 VLD2 (multiple 2-element structures)
Load multiple 2-element structures to two or four registers loads multiple 2-element structures from memory into 
two or four registers, with de-interleaving. For more information, see Element and structure load/store instructions 
on page F1-2388. Every element of each register is loaded. For details of the addressing mode see The Advanced 
SIMD addressing mode on page F2-2427. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  1  0  0| 0  D  1  0|    Rn     |    Vd     | x  0  x  x|size  align|    Rm     |
                                                                type
Offset variant 
Applies when Rm == 1111. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
case type of
    when '1000'
        regs = 1; inc = 1; if align == '11' then UNDEFINED;
    when '1001'
        regs = 1; inc = 2; if align == '11' then UNDEFINED;
    when '0011'
        regs = 2; inc = 2;
    otherwise
        SEE "Related encodings";
if size == '11' then UNDEFINED;
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2+regs > 32 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If d2+regs > 32, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
T1
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  0  0  1| 0  D  1  0|    Rn     |    Vd     | x  0  x  x|size  align|    Rm     |
                                                                type
Offset variant 
Applies when Rm == 1111. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
case type of
    when '1000'
        regs = 1; inc = 1; if align == '11' then UNDEFINED;
    when '1001'
        regs = 1; inc = 2; if align == '11' then UNDEFINED;
    when '0011'
        regs = 2; inc = 2;
    otherwise
        SEE "Related encodings";
if size == '11' then UNDEFINED;
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2+regs > 32 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If d2+regs > 32, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD2 (multiple 2-element structures) 
on page K1-5470. 


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 
Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10

The encoding size = 11 is reserved.

<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
{ <Dd>, <Dd+1> }
Single-spaced registers, encoded in the "type" field as 0b1000.
{ <Dd>, <Dd+2> }
Double-spaced registers, encoded in the "type" field as 0b1001.
{ <Dd>, <Dd+1>, <Dd+2>, <Dd+3> }
Single-spaced registers, encoded in the "type" field as 0b0011.
The register <Dd> is encoded in the "D:Vd" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
<align> is present, the permitted values are:

64 64-bit alignment, encoded in the "align" field as 0b01.
128 128-bit alignment, encoded in the "align" field as 0b10.
256 256-bit alignment, encoded in the "align" field as 0b11. Available only if <list> contains
four registers.

: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e] = MemU[address,ebytes];
            Elem[D[d2+r],e] = MemU[address+ebytes,ebytes];
            address = address + 2*ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 16*regs;
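
As an informal illustration of the operation pseudocode above, the following Python sketch models the de-interleaving loop on a plain byte buffer. The function name `vld2_multiple`, the dictionary used as a register file, and the little-endian element packing are assumptions made for this sketch only. The values of `d2` and `regs` come from a decode step that is not shown in this section, so they are passed in as parameters. Alignment and enable checks are omitted.

```python
def vld2_multiple(mem, address, d, d2, regs, ebytes):
    """Model the load loop of VLD2 (multiple 2-element structures).

    mem is a bytes-like object and address an index into it.
    Returns (regfile, address), where regfile maps a D register number
    to the list of element values loaded into that register.
    """
    elements = 8 // ebytes  # a D register holds 8 bytes
    regfile = {}
    for r in range(regs):
        first = []
        second = []
        for _ in range(elements):
            # Even-numbered structure members go to D[d+r], odd to D[d2+r].
            first.append(int.from_bytes(mem[address:address + ebytes], "little"))
            second.append(
                int.from_bytes(mem[address + ebytes:address + 2 * ebytes], "little")
            )
            address += 2 * ebytes
        regfile[d + r] = first
        regfile[d2 + r] = second
    return regfile, address


# Four-register form, assuming d2 = d + 2 and regs = 2 from the decode.
memory = bytes(range(32))
loaded, end_address = vld2_multiple(memory, 0, d=0, d2=2, regs=2, ebytes=1)
```

In this example the final address has advanced by 32 bytes, which matches the `16*regs` writeback increment in the pseudocode.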
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F6.1.90 


VLD3 (single 3-element structure to one lane) 


Load single 3-element structure to one lane of three registers loads one 3-element structure from memory into 
corresponding elements of three registers. Elements of the registers that are not loaded are unchanged. For details 
of the addressing mode see The Advanced SIMD addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19  16|15  12|11 10| 9  8|7         4|3    0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|  Rn  |  Vd  |!=11 | 1  0|index_align|  Rm  |
                                                   size

Offset variant 
Applies when Rm == 1111. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>], <Rm> 


Decode for all variants of this encoding 


if size == '11' then SEE VLD3 (single 3-element structure to all lanes);
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
    when '01'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
    when '10'
        if index_align<1:0> != '00' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d3 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
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T1 
|15 14 13 12|11 10  9  8| 7  6  5  4|3    0|15  12|11 10| 9  8|7         4|3    0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|  Rn  |  Vd  |!=11 | 1  0|index_align|  Rm  |
                                                   size
Offset variant 
Applies when Rm == 1111. 
VLD3{<c>}{<q>}.<size> <list>, [<Rn>] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD3{<c>}{<q>}.<size> <list>, [<Rn>]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD3{<c>}{<q>}.<size> <list>, [<Rn>], <Rm> 
Decode for all variants of this encoding 
if size == '11' then SEE VLD3 (single 3-element structure to all lanes);
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
    when '01'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
    when '10'
        if index_align<1:0> != '00' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d3 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD3 (single 3-element structure to 
one lane) on page K1-5471. 
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Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406.

<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10

<list> Is a list containing the 64-bit names of the three SIMD&FP registers holding the element. The list
must be one of:
{ <Dd>[<index>], <Dd+1>[<index>], <Dd+2>[<index>] }
Single-spaced registers, encoded as "spacing" = 0.
{ <Dd>[<index>], <Dd+2>[<index>], <Dd+4>[<index>] }
Double-spaced registers, encoded as "spacing" = 1. Not permitted when
<size> == 8.

The encoding of "spacing" depends on <size>:
<size> == 8 "spacing" is encoded in the "index_align<0>" field.
<size> == 16 "spacing" is encoded in the "index_align<1>" field, and "index_align<0>"
is set to 0.
<size> == 32 "spacing" is encoded in the "index_align<2>" field, and
"index_align<1:0>" is set to 0b00.

The register <Dd> is encoded in the "D:Vd" field. The permitted values and encoding of <index>
depend on <size>:
<size> == 8 <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
<size> == 16 <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
<size> == 32 <index> is 0 or 1, encoded in the "index_align<3>" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field. 

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 
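
The way "spacing" and <index> share the index_align field can be sketched in Python. This is only an illustration of the decode rules stated for this instruction; the function name `decode_vld3_lane` and the use of `ValueError` to stand in for UNDEFINED are assumptions of the sketch.

```python
def decode_vld3_lane(size, index_align):
    """Return (ebytes, index, inc) for VLD3 (single 3-element structure to one lane).

    size is the 2-bit "size" field and index_align the 4-bit "index_align"
    field. inc is 1 for single-spaced and 2 for double-spaced registers.
    ValueError stands in for the UNDEFINED cases of the decode.
    """
    if size == 0b00:
        if index_align & 0b1:
            raise ValueError("UNDEFINED")
        return 1, index_align >> 1, 1
    if size == 0b01:
        if index_align & 0b1:
            raise ValueError("UNDEFINED")
        return 2, index_align >> 2, 2 if index_align & 0b10 else 1
    if size == 0b10:
        if index_align & 0b11:
            raise ValueError("UNDEFINED")
        return 4, index_align >> 3, 2 if index_align & 0b100 else 1
    raise ValueError("size == 0b11 selects the all-lanes encoding")


byte_lane = decode_vld3_lane(0b00, 0b1010)      # 8-bit elements, lane 5
halfword_lane = decode_vld3_lane(0b01, 0b0110)  # 16-bit, lane 1, double-spaced
word_lane = decode_vld3_lane(0b10, 0b1100)      # 32-bit, lane 1, double-spaced
```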


Alignment 


Standard alignment rules apply, see Alignment support on page B2-76. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n];
    Elem[D[d],index] = MemU[address,ebytes];
    Elem[D[d2],index] = MemU[address+ebytes,ebytes];
    Elem[D[d3],index] = MemU[address+2*ebytes,ebytes];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 3*ebytes;
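
A minimal Python sketch of the single-lane load described above follows. It assumes each D register is modelled as a list of elements and that memory is little-endian; the name `vld3_one_lane` and the example data are hypothetical. The point it illustrates is that only the selected lane of each register changes.

```python
def vld3_one_lane(mem, address, d_regs, index, ebytes):
    """Load one 3-element structure into lane `index` of three registers.

    d_regs is a list of three lists modelling D[d], D[d2] and D[d3].
    Lanes other than `index` are left unchanged. Returns the address
    after the structure, as used by the immediate writeback form.
    """
    for i, reg in enumerate(d_regs):
        start = address + i * ebytes
        reg[index] = int.from_bytes(mem[start:start + ebytes], "little")
    return address + 3 * ebytes


# Hypothetical data: three 16-bit values loaded into lane 1 of 4.
memory = bytes([0x34, 0x12, 0x78, 0x56, 0xBC, 0x9A])
registers = [[0xFFFF] * 4 for _ in range(3)]
next_address = vld3_one_lane(memory, 0, registers, index=1, ebytes=2)
```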
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F6.1.91 VLD3 (single 3-element structure to all lanes) 

Load single 3-element structure and replicate to all lanes of three registers loads one 3-element structure from 
memory into all lanes of three registers. For details of the addressing mode see The Advanced SIMD addressing 
mode on page F2-2427. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 

|31 30 29 28|27 26 25 24|23 22 21 20|19  16|15  12|11 10  9  8| 7  6| 5| 4|3    0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|  Rn  |  Vd  | 1  1  1  0|size | T| 0|  Rm  |
                                                                         a

Offset variant 
Applies when Rm == 1111. 
VLD3{<c>}{<q>}.<size> <list>, [<Rn>] 
Post-indexed variant 
Applies when Rm == 1101. 
VLD3{<c>}{<q>}.<size> <list>, [<Rn>]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VLD3{<c>}{<q>}.<size> <list>, [<Rn>], <Rm> 
Decode for all variants of this encoding 

if size == '11' || a == '1' then UNDEFINED;
ebytes = 1 << UInt(size);
inc = if T == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If d3 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.

T1 

|15 14 13 12|11 10  9  8| 7  6  5  4|3    0|15  12|11 10  9  8| 7  6| 5| 4|3    0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|  Rn  |  Vd  | 1  1  1  0|size | T| 0|  Rm  |
                                                                         a
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Offset variant 
Applies when Rm == 1111. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>], <Rm> 


Decode for all variants of this encoding 


if size == '11' || a == '1' then UNDEFINED;
ebytes = 1 << UInt(size);
inc = if T == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d3 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD3 (single 3-element structure to 
all lanes) on page K1-5471. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01 
32 when size = 10 


The encoding size = 11 is reserved. 


<list> Is a list containing the 64-bit names of three SIMD&FP registers. The list must be one of: 
{ <Dd>[], <Dd+1>[], <Dd+2>[] } 


Single-spaced registers, encoded in the "T" field as 0. 
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{ <Dd>[], <Dd+2>[], <Dd+4>[] } 
Double-spaced registers, encoded in the "T" field as 1. 


The register <Dd> is encoded in the "D:Vd" field. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 


Alignment 


Standard alignment rules apply, see Alignment support on page B2-76. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n];
    D[d] = Replicate(MemU[address,ebytes]);
    D[d2] = Replicate(MemU[address+ebytes,ebytes]);
    D[d3] = Replicate(MemU[address+2*ebytes,ebytes]);
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 3*ebytes;
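
The effect of Replicate in the pseudocode above can be illustrated with a short Python sketch. It assumes a 64-bit D register modelled as a list of elements and little-endian memory; `replicate` and `vld3_all_lanes` are names chosen for this sketch, not architectural functions.

```python
def replicate(value, ebytes):
    """Return a 64-bit register image with every lane set to value."""
    return [value] * (8 // ebytes)


def vld3_all_lanes(mem, address, ebytes):
    """Load one 3-element structure and copy each member to all lanes.

    Returns three lists modelling D[d], D[d2] and D[d3].
    """
    result = []
    for i in range(3):
        start = address + i * ebytes
        element = int.from_bytes(mem[start:start + ebytes], "little")
        result.append(replicate(element, ebytes))
    return result


# Hypothetical data: three 16-bit values 1, 2 and 3.
lanes = vld3_all_lanes(bytes([1, 0, 2, 0, 3, 0]), 0, ebytes=2)
```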
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F6.1.92 VLD3 (multiple 3-element structures) 


Load multiple 3-element structures to three registers loads multiple 3-element structures from memory into three
registers, with de-interleaving. For more information, see Element and structure load/store instructions on
page F1-2388. Every element of each register is loaded. For details of the addressing mode see The Advanced SIMD
addressing mode on page F2-2427.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19  16|15  12|11 10  9  8| 7  6| 5  4|3    0|
| 1  1  1  1| 0  1  0  0| 0  D  1  0|  Rn  |  Vd  | 0  1  0  x|size |align|  Rm  |
                                                      type


Offset variant 
Applies when Rm == 1111. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


case type of
    when '0100'
        inc = 1;
    when '0101'
        inc = 2;
    otherwise
        SEE "Related encodings";
if size == '11' || align<1> == '1' then UNDEFINED;
alignment = if align<0> == '0' then 1 else 8;
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d3 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
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T1 
|15 14 13 12|11 10  9  8| 7  6  5  4|3    0|15  12|11 10  9  8| 7  6| 5  4|3    0|
| 1  1  1  1| 1  0  0  1| 0  D  1  0|  Rn  |  Vd  | 0  1  0  x|size |align|  Rm  |
                                                      type


Offset variant 
Applies when Rm == 1111. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


case type of
    when '0100'
        inc = 1;
    when '0101'
        inc = 2;
    otherwise
        SEE "Related encodings";
if size == '11' || align<1> == '1' then UNDEFINED;
alignment = if align<0> == '0' then 1 else 8;
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d3 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD3 (multiple 3-element structures)
on page K1-5471.


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set.
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Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10

The encoding size = 11 is reserved.

<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
{ <Dd>, <Dd+1>, <Dd+2> }
Single-spaced registers, encoded in the "type" field as 0b0100.
{ <Dd>, <Dd+2>, <Dd+4> }
Double-spaced registers, encoded in the "type" field as 0b0101.
The register <Dd> is encoded in the "D:Vd" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
<align> is present, the only permitted value is 64, meaning 64-bit alignment, encoded in the "align"
field as 0b01. : is the preferred separator before the <align> value, but the alignment can be specified
as @<align>, see The Advanced SIMD addressing mode on page F2-2427.

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about <Rn>, !, and <Rm>, see The Advanced SIMD addressing mode on page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for e = 0 to elements-1
        Elem[D[d],e] = MemU[address,ebytes];
        Elem[D[d2],e] = MemU[address+ebytes,ebytes];
        Elem[D[d3],e] = MemU[address+2*ebytes,ebytes];
        address = address + 3*ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 24;
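
To make the de-interleaving concrete, the sketch below models the element loop above in Python for a buffer of packed three-member structures, for example hypothetical R, G, B byte triples. The name `vld3_multiple`, the list-based registers, and the little-endian packing are assumptions of the sketch. Alignment checking is left out.

```python
def vld3_multiple(mem, address, ebytes):
    """De-interleave 3-element structures into three register images.

    Returns (d, d2, d3, address), where each register image is a list of
    8 // ebytes elements and address is the value after the last access.
    """
    elements = 8 // ebytes
    d, d2, d3 = [], [], []
    for _ in range(elements):
        # Member 0 goes to D[d], member 1 to D[d2], member 2 to D[d3].
        for offset, reg in enumerate((d, d2, d3)):
            start = address + offset * ebytes
            reg.append(int.from_bytes(mem[start:start + ebytes], "little"))
        address += 3 * ebytes
    return d, d2, d3, address


# Eight hypothetical byte triples: 0,1,2, 3,4,5, ...
red, green, blue, end_address = vld3_multiple(bytes(range(24)), 0, ebytes=1)
```

The address always advances by 24 bytes, whatever the element size, which is why the immediate writeback form adds the constant 24.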
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F6.1.93 


VLD4 (single 4-element structure to one lane) 


Load single 4-element structure to one lane of four registers loads one 4-element structure from memory into 
corresponding elements of four registers. Elements of the registers that are not loaded are unchanged. For details of 
the addressing mode see The Advanced SIMD addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19  16|15  12|11 10| 9  8|7         4|3    0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|  Rn  |  Vd  |!=11 | 1  1|index_align|  Rm  |
                                                   size


Offset variant 
Applies when Rm == 1111. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then SEE VLD4 (single 4-element structure to all lanes);
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
    when '10'
        if index_align<1:0> == '11' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<1:0> == '00' then 1 else 4 << UInt(index_align<1:0>);
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d4 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
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T1 


|15 14 13 12|11 10  9  8| 7  6  5  4|3    0|15  12|11 10| 9  8|7         4|3    0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|  Rn  |  Vd  |!=11 | 1  1|index_align|  Rm  |
                                                   size


Offset variant 
Applies when Rm == 1111. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then SEE VLD4 (single 4-element structure to all lanes);
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
    when '10'
        if index_align<1:0> == '11' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<1:0> == '00' then 1 else 4 << UInt(index_align<1:0>);
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d4 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD4 (single 4-element structure to 
one lane) on page K1-5471. 
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Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406.

<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10

<list> Is a list containing the 64-bit names of the four SIMD&FP registers holding the element. The list
must be one of:
{ <Dd>[<index>], <Dd+1>[<index>], <Dd+2>[<index>], <Dd+3>[<index>] }
Single-spaced registers, encoded as "spacing" = 0.
{ <Dd>[<index>], <Dd+2>[<index>], <Dd+4>[<index>], <Dd+6>[<index>] }
Double-spaced registers, encoded as "spacing" = 1. Not permitted when
<size> == 8.

The encoding of "spacing" depends on <size>:
<size> == 16 "spacing" is encoded in the "index_align<1>" field.
<size> == 32 "spacing" is encoded in the "index_align<2>" field.

The register <Dd> is encoded in the "D:Vd" field. The permitted values and encoding of <index>
depend on <size>:
<size> == 8 <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
<size> == 16 <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
<size> == 32 <index> is 0 or 1, encoded in the "index_align<3>" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and the encoding depends on <size>:
<size> == 8 Encoded in the "index_align<0>" field as 0.
<size> == 16 Encoded in the "index_align<0>" field as 0.
<size> == 32 Encoded in the "index_align<1:0>" field as 0b00.

Whenever <align> is present, the permitted values and encoding depend on <size>:
<size> == 8 <align> is 32, meaning 32-bit alignment, encoded in the "index_align<0>"
field as 1.
<size> == 16 <align> is 64, meaning 64-bit alignment, encoded in the "index_align<0>"
field as 1.
<size> == 32 <align> can be 64 or 128. 64-bit alignment is encoded in the
"index_align<1:0>" field as 0b01, and 128-bit alignment is encoded in the
"index_align<1:0>" field as 0b10.

: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 
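
The relationship between <align> and the low bits of index_align can be checked with a small Python sketch. It follows the decode rules given for this instruction and returns the alignment in bytes, so 4, 8 and 16 correspond to the 32, 64 and 128 values of <align>. The function name and the use of `ValueError` for UNDEFINED are assumptions of the sketch.

```python
def vld4_lane_alignment(size, index_align):
    """Return the alignment in bytes required by VLD4 (single lane).

    size is the 2-bit "size" field and index_align the 4-bit
    "index_align" field. A result of 1 means standard alignment.
    """
    if size == 0b00:
        return 4 if index_align & 0b1 else 1
    if size == 0b01:
        return 8 if index_align & 0b1 else 1
    if size == 0b10:
        low = index_align & 0b11
        if low == 0b11:
            raise ValueError("UNDEFINED")
        return 1 if low == 0 else 4 << low
    raise ValueError("size == 0b11 selects the all-lanes encoding")


byte_align = vld4_lane_alignment(0b00, 0b0001)    # <align> = 32
word_align_64 = vld4_lane_alignment(0b10, 0b01)   # <align> = 64
word_align_128 = vld4_lane_alignment(0b10, 0b10)  # <align> = 128
```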







Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    Elem[D[d],index] = MemU[address,ebytes];
    Elem[D[d2],index] = MemU[address+ebytes,ebytes];
    Elem[D[d3],index] = MemU[address+2*ebytes,ebytes];
    Elem[D[d4],index] = MemU[address+3*ebytes,ebytes];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 4*ebytes;
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F6.1.94 


VLD4 (single 4-element structure to all lanes) 


Load single 4-element structure and replicate to all lanes of four registers loads one 4-element structure from
memory into all lanes of four registers. For details of the addressing mode see The Advanced SIMD addressing mode
on page F2-2427.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19  16|15  12|11 10  9  8| 7  6| 5| 4|3    0|
| 1  1  1  1| 0  1  0  0| 1  D  1  0|  Rn  |  Vd  | 1  1  1  1|size | T| a|  Rm  |


Offset variant 
Applies when Rm == 1111. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' && a == '0' then UNDEFINED;
if size == '11' then
    ebytes = 4; alignment = 16;
else
    ebytes = 1 << UInt(size);
    if size == '10' then
        alignment = if a == '0' then 1 else 8;
    else
        alignment = if a == '0' then 1 else 4*ebytes;
inc = if T == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d4 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.







T1 


|15 14 13 12|11 10  9  8| 7  6  5  4|3    0|15  12|11 10  9  8| 7  6| 5| 4|3    0|
| 1  1  1  1| 1  0  0  1| 1  D  1  0|  Rn  |  Vd  | 1  1  1  1|size | T| a|  Rm  |


Offset variant 
Applies when Rm == 1111. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 

Applies when Rm == 1101. 

VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 

Applies when Rm != 11x1. 

VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 


if size == '11' && a == '0' then UNDEFINED;
if size == '11' then
    ebytes = 4; alignment = 16;
else
    ebytes = 1 << UInt(size);
    if size == '10' then
        alignment = if a == '0' then 1 else 8;
    else
        alignment = if a == '0' then 1 else 4*ebytes;
inc = if T == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If d4 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
  the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD4 (single 4-element structure to 
all lanes) on page K1-5472. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
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<q> See Standard assembler syntax fields on page F2-2406. 

<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 1x

<list> Is a list containing the 64-bit names of four SIMD&FP registers. The list must be one of:
{ <Dd>[], <Dd+1>[], <Dd+2>[], <Dd+3>[] }
Single-spaced registers, encoded in the "T" field as 0.
{ <Dd>[], <Dd+2>[], <Dd+4>[], <Dd+6>[] }
Double-spaced registers, encoded in the "T" field as 1.
The register <Dd> is encoded in the "D:Vd" field.

<Rn> Is the general-purpose base register, encoded in the "Rn" field.

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "a" field as 0. Whenever <align> is
present, the permitted values and encoding depend on <size>:
<size> == 8 <align> is 32, meaning 32-bit alignment, encoded in the "a" field as 1.
<size> == 16 <align> is 64, meaning 64-bit alignment, encoded in the "a" field as 1.
<size> == 32 <align> can be 64 or 128. 64-bit alignment is encoded in the "a:size<0>"
field as 0b10, and 128-bit alignment is encoded in the "a:size<0>" field as
0b11.

: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.
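
Because 128-bit alignment reuses the otherwise reserved size value, the mapping from the "size" and "a" fields to element size and alignment is easy to misread. The Python sketch below restates the decode rules for this instruction; the function name and the `ValueError` used for UNDEFINED are assumptions of the sketch, and alignment is returned in bytes.

```python
def decode_vld4_all_lanes(size, a):
    """Return (ebytes, alignment_bytes) for VLD4 (single structure to all lanes).

    size is the 2-bit "size" field and a the 1-bit "a" field.
    An alignment of 1 means standard alignment.
    """
    if size == 0b11 and a == 0:
        raise ValueError("UNDEFINED")
    if size == 0b11:
        # 32-bit elements with 128-bit alignment.
        return 4, 16
    ebytes = 1 << size
    if size == 0b10:
        return ebytes, (8 if a else 1)
    return ebytes, (4 * ebytes if a else 1)


word_128 = decode_vld4_all_lanes(0b11, 1)  # <size> 32, <align> 128
word_64 = decode_vld4_all_lanes(0b10, 1)   # <size> 32, <align> 64
half_64 = decode_vld4_all_lanes(0b01, 1)   # <size> 16, <align> 64
byte_32 = decode_vld4_all_lanes(0b00, 1)   # <size> 8, <align> 32
```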


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    D[d] = Replicate(MemU[address,ebytes]);
    D[d2] = Replicate(MemU[address+ebytes,ebytes]);
    D[d3] = Replicate(MemU[address+2*ebytes,ebytes]);
    D[d4] = Replicate(MemU[address+3*ebytes,ebytes]);
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 4*ebytes;
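
The load-and-replicate behavior of the operation above can be illustrated with a minimal Python sketch. This is not ARM pseudocode and not any real API: the function name `vld4_all_lanes`, the use of a `bytes` object to stand in for memory, little-endian element order, and the representation of each D register as a list of lane values are all assumptions made for illustration. Alignment checking and base register writeback are omitted.

```python
def vld4_all_lanes(mem: bytes, address: int, ebytes: int) -> list[list[int]]:
    """Hypothetical model of a 4-element load replicated to all lanes.

    Reads four consecutive elements of `ebytes` bytes each (assumed
    little-endian) and copies each element into every lane of its own
    64-bit register. Returns lane lists for D[d], D[d2], D[d3], D[d4].
    """
    lanes = 8 // ebytes  # a 64-bit D register holds 8/ebytes elements
    regs = []
    for i in range(4):
        start = address + i * ebytes
        value = int.from_bytes(mem[start:start + ebytes], "little")
        regs.append([value] * lanes)  # stands in for Replicate()
    return regs
```

Each of the four registers ends up holding copies of a single memory element, which is the difference from the multiple-structure form, where every lane receives a distinct element.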





F6-3454 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 


Non-Confidential 


ARM DDI 0487A.k _iss10775 
1ID092916 


F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions 
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


F6.1.95 VLD4 (multiple 4-element structures) 


This instruction loads multiple 4-element structures from memory into four registers, with de-interleaving.
Every element of each register is loaded. For more information, see Element and structure load/store
instructions on page F1-2388. For details of the addressing mode see The Advanced SIMD addressing mode
on page F2-2427.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1

| 31-28 | 27-24 | 23 | 22 | 21 | 20 | 19-16 | 15-12 | 11-8        | 7-6  | 5-4   | 3-0 |
| 1111  | 0100  | 0  | D  | 1  | 0  | Rn    | Vd    | type (000x) | size | align | Rm  |


Offset variant 
Applies when Rm == 1111. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding

case type of
    when '0000'
        inc = 1;
    when '0001'
        inc = 2;
    otherwise
        SEE "Related encodings";
if size == '11' then UNDEFINED;
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d4 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.
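
As an informal illustration of the decode step for this instruction, the following Python sketch mirrors the pseudocode field by field. The function name `decode_vld4_multiple`, the dictionary it returns, and the use of `ValueError` to represent the UNDEFINED, UNPREDICTABLE, and "Related encodings" outcomes are assumptions made for this example only; they are not part of the architecture.

```python
def decode_vld4_multiple(type_bits: int, size: int, align: int,
                         D: int, Vd: int, Rn: int, Rm: int) -> dict:
    """Hypothetical mirror of the VLD4 (multiple) decode pseudocode."""
    if type_bits == 0b0000:
        inc = 1  # single-spaced register list
    elif type_bits == 0b0001:
        inc = 2  # double-spaced register list
    else:
        raise ValueError('SEE "Related encodings"')
    if size == 0b11:
        raise ValueError("UNDEFINED")
    alignment = 1 if align == 0b00 else 4 << align  # in bytes
    ebytes = 1 << size
    d = (D << 4) | Vd  # UInt(D:Vd)
    d4 = d + 3 * inc
    if Rn == 15 or d4 > 31:
        raise ValueError("UNPREDICTABLE")
    return {
        "registers": [d + i * inc for i in range(4)],
        "alignment": alignment,
        "ebytes": ebytes,
        "elements": 8 // ebytes,
        "wback": Rm != 15,
        "register_index": Rm not in (13, 15),
    }
```

For example, an align field of 0b01 yields an alignment of 8 bytes, which corresponds to the 64-bit alignment option in the assembler syntax.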
T1

First halfword:
| 15-12 | 11-8 | 7 | 6 | 5 | 4 | 3-0 |
| 1111  | 1001 | 0 | D | 1 | 0 | Rn  |
Second halfword:
| 15-12 | 11-8        | 7-6  | 5-4   | 3-0 |
| Vd    | type (000x) | size | align | Rm  |


Offset variant 


Applies when Rm == 1111. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 


Applies when Rm == 1101. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 


Applies when Rm != 11x1. 


VLD4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding

case type of
    when '0000'
        inc = 1;
    when '0001'
        inc = 2;
    otherwise
        SEE "Related encodings";
if size == '11' then UNDEFINED;
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If d4 > 31, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLD4 (multiple 4-element structures)
on page K1-5471.


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 
Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10
The encoding size = 11 is reserved.
<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
{ <Dd>, <Dd+1>, <Dd+2>, <Dd+3> }
Single-spaced registers, encoded in the "type" field as 0b0000.
{ <Dd>, <Dd+2>, <Dd+4>, <Dd+6> }
Double-spaced registers, encoded in the "type" field as 0b0001.
The register <Dd> is encoded in the "D:Vd" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.
<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
<align> is present, the permitted values are:
64 64-bit alignment, encoded in the "align" field as 0b01.
128 128-bit alignment, encoded in the "align" field as 0b10.
256 256-bit alignment, encoded in the "align" field as 0b11.
: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.
<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = FALSE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for e = 0 to elements-1
        Elem[D[d], e] = MemU[address, ebytes];
        Elem[D[d2],e] = MemU[address+ebytes, ebytes];
        Elem[D[d3],e] = MemU[address+2*ebytes, ebytes];
        Elem[D[d4],e] = MemU[address+3*ebytes, ebytes];
        address = address + 4*ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 32;
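
The de-interleaving performed by the loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: `vld4_multiple` is a hypothetical name, memory is modeled as a `bytes` object read in little-endian element order, each D register is modeled as a list of element values, and alignment checking and writeback are left out.

```python
def vld4_multiple(mem: bytes, address: int, ebytes: int) -> list[list[int]]:
    """Hypothetical model of a de-interleaving 4-element structure load.

    Memory holds consecutive structures {a, b, c, d}. Element e of the
    first register receives a from structure e, element e of the second
    register receives b, and so on.
    """
    elements = 8 // ebytes  # elements per 64-bit D register
    regs = [[0] * elements for _ in range(4)]
    for e in range(elements):
        for i in range(4):
            start = address + i * ebytes
            regs[i][e] = int.from_bytes(mem[start:start + ebytes], "little")
        address += 4 * ebytes  # step to the next 4-element structure
    return regs
```

Whatever the element size, the loop consumes 4 * 8 = 32 bytes in total, which matches the fixed increment of 32 applied to the base register when writeback is used without an index register.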
F6.1.96 VLDM, VLDMDB, VLDMIA

Load Multiple SIMD&FP registers loads multiple registers in the Advanced SIMD and floating-point register
file from consecutive memory locations, using an address from a general-purpose register.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


This instruction is used by the alias VPOP. See Alias conditions on page F6-3461 for details of when each alias is 
preferred. 


A1

| 31-28        | 27-25 | 24 | 23 | 22 | 21 | 20 | 19-16 | 15-12 | 11-8 | 7-1       | 0 |
| cond != 1111 | 110   | P  | U  | D  | W  | 1  | Rn    | Vd    | 1011 | imm8<7:1> | 0 |

Bit 0 is imm8<0>.


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


VLDMDB{<c>}{<q>}{.<size>} <Rn>!, <dreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


VLDM{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 
VLDMIA{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding

if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VLDR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE; add = (U == '1'); wback = (W == '1');
d = UInt(D:Vd); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2; // If UInt(imm8) is odd, see "FLDMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VLDM with the same addressing mode but loads no registers.

If regs > 16 || (d+regs) > 32, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.







A2

| 31-28        | 27-25 | 24 | 23 | 22 | 21 | 20 | 19-16 | 15-12 | 11-8 | 7-0  |
| cond != 1111 | 110   | P  | U  | D  | W  | 1  | Rn    | Vd    | 1010 | imm8 |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


VLDMDB{<c>}{<q>}{.<size>} <Rn>!, <sreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


VLDM{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 
VLDMIA{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 


Decode for all variants of this encoding

if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VLDR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = TRUE; add = (U == '1'); wback = (W == '1'); d = UInt(Vd:D); n = UInt(Rn);
imm32 = ZeroExtend(imm8:'00', 32); regs = UInt(imm8);
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || (d+regs) > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If regs == 0, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VLDM with the same addressing mode but loads no registers.

If (d+regs) > 32, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.





T1

First halfword:
| 15-9    | 8 | 7 | 6 | 5 | 4 | 3-0 |
| 1110110 | P | U | D | W | 1 | Rn  |
Second halfword:
| 15-12 | 11-8 | 7-1       | 0 |
| Vd    | 1011 | imm8<7:1> | 0 |

Bit 0 of the second halfword is imm8<0>.

Decrement Before variant
Applies when P == 1 && U == 0 && W == 1.

VLDMDB{<c>}{<q>}{.<size>} <Rn>!, <dreglist>

Increment After variant
Applies when P == 0 && U == 1. 


VLDM{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 
VLDMIA{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding

if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VLDR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE; add = (U == '1'); wback = (W == '1');
d = UInt(D:Vd); n = UInt(Rn); imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2; // If UInt(imm8) is odd, see "FLDMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VLDM with the same addressing mode but loads no registers.

If regs > 16 || (d+regs) > 32, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


T2

First halfword:
| 15-9    | 8 | 7 | 6 | 5 | 4 | 3-0 |
| 1110110 | P | U | D | W | 1 | Rn  |
Second halfword:
| 15-12 | 11-8 | 7-0  |
| Vd    | 1010 | imm8 |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


VLDMDB{<c>}{<q>}{.<size>} <Rn>!, <sreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


VLDM{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 
VLDMIA{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 


Decode for all variants of this encoding

if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VLDR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = TRUE; add = (U == '1'); wback = (W == '1'); d = UInt(Vd:D); n = UInt(Rn);
imm32 = ZeroExtend(imm8:'00', 32); regs = UInt(imm8);
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || (d+regs) > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VLDM with the same addressing mode but loads no registers.

If (d+regs) > 32, then one of the following behaviors must occur:

• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. If the instruction specifies writeback,
the base register becomes UNKNOWN. This behavior does not affect any general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VLDM on page K1-5472. 


Related encodings: See Advanced SIMD and floating-point 64-bit move on page F3-2448 for the T32 instruction 
set, or Advanced SIMD and floating-point 64-bit move on page F4-2532 for the A32 instruction set. 
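
The P, U, and W checks that open the decode pseudocode of every VLDM encoding can be summarized with a small Python sketch. The function name `classify_vldm_puw` and the strings it returns are hypothetical labels chosen for this illustration; only the bit combinations themselves come from the decode pseudocode.

```python
def classify_vldm_puw(P: int, U: int, W: int) -> str:
    """Hypothetical classifier for the P, U, W bits of a VLDM-class encoding."""
    if P == 0 and U == 0 and W == 0:
        return "related encodings"
    if P == 1 and W == 0:
        return "VLDR"
    if P == U and W == 1:
        return "UNDEFINED"
    # Remaining combinations: PUW = 010, 011, 101
    if U == 1:
        return "VLDMIA with writeback" if W == 1 else "VLDMIA without writeback"
    return "VLDMDB with writeback"
```

Enumerating all eight combinations shows that only PUW = 010, 011, and 101 reach the load-multiple operation.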


Alias conditions

Alias    Is preferred when
VPOP     P == '0' && U == '1' && W == '1' && Rn == '1101'





Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<size> An optional data size specifier. If present, it must be equal to the size in bits, 32 or 64, of the registers
being transferred.


<Rn> Is the general-purpose base register, encoded in the "Rn" field. If writeback is not specified, the PC 
can be used. 


! Specifies base register writeback. Encoded in the "W" field as 1 if present, otherwise 0. 


<sreglist> Is the list of consecutively numbered 32-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "Vd:D", and "imm8" is set to the number of registers in the list. The list must 
contain at least one register. 


<dreglist> Is the list of consecutively numbered 64-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "D:Vd", and "imm8" is set to twice the number of registers in the list. The 
list must contain at least one register, and must not contain more than 16 registers. 
Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    address = if add then R[n] else R[n]-imm32;
    for r = 0 to regs-1
        if single_regs then
            S[d+r] = MemA[address,4]; address = address+4;
        else
            word1 = MemA[address,4]; word2 = MemA[address+4,4]; address = address+8;
            // Combine the word-aligned words in the correct order for current endianness.
            D[d+r] = if BigEndian() then word1:word2 else word2:word1;
    if wback then R[n] = if add then R[n]+imm32 else R[n]-imm32;
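
A minimal Python sketch of the operation above may help to show how the start address, the register count, and the written-back base value relate to imm8. The names `read_word` and `vldm`, the `bytes` model of memory, and the returned tuple are assumptions made for illustration; alignment checks, enable checks, and the UNPREDICTABLE cases are not modeled.

```python
def read_word(mem: bytes, address: int, big_endian: bool) -> int:
    """Read one 32-bit word from the modeled memory."""
    order = "big" if big_endian else "little"
    return int.from_bytes(mem[address:address + 4], order)


def vldm(mem: bytes, rn: int, imm8: int, add: bool, wback: bool,
         single_regs: bool, big_endian: bool = False) -> tuple[list[int], int]:
    """Hypothetical model: returns (loaded register values, new base value)."""
    imm32 = imm8 * 4  # ZeroExtend(imm8:'00', 32)
    regs = imm8 if single_regs else imm8 // 2
    address = rn if add else rn - imm32
    loaded = []
    for _ in range(regs):
        if single_regs:
            loaded.append(read_word(mem, address, big_endian))
            address += 4
        else:
            word1 = read_word(mem, address, big_endian)
            word2 = read_word(mem, address + 4, big_endian)
            address += 8
            # Combine the two words in the order required by the endianness.
            if big_endian:
                loaded.append((word1 << 32) | word2)
            else:
                loaded.append((word2 << 32) | word1)
    if wback:
        new_rn = rn + imm32 if add else rn - imm32
    else:
        new_rn = rn
    return loaded, new_rn
```

In the Decrement Before form the transfer starts at the base value minus imm32 and proceeds upwards, so the registers are still loaded from ascending addresses.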
F6.1.97 VLDR (immediate)


Load SIMD&FP register (immediate) loads a single register in the Advanced SIMD and floating-point register
file from memory, using an address from a general-purpose register, with an optional offset.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 





A1

| 31-28        | 27-24 | 23 | 22 | 21 | 20 | 19-16      | 15-12 | 11-10 | 9-8       | 7-0  |
| cond != 1111 | 1101  | U  | D  | 0  | 1  | Rn != 1111 | Vd    | 10    | size (1x) | imm8 |


Single-precision scalar variant 
Applies when size == 10. 


VLDR{<c>}{<q>}{.32} <Sd>, [<Rn> {, #{+/-}<imm>}] 


Double-precision scalar variant 
Applies when size == 11. 


VLDR{<c>}{<q>}{.64} <Dd>, [<Rn> {, #{+/-}<imm>}] 


Decode for all variants of this encoding

if size IN { '00', '01' } then UNDEFINED;
esize = 8 << UInt(size); add = (U == '1');
imm32 = if esize == 16 then ZeroExtend(imm8:'0', 32) else ZeroExtend(imm8:'00', 32);
d = UInt(D:Vd); n = UInt(Rn);


T1

First halfword:
| 15-12 | 11-8 | 7 | 6 | 5 | 4 | 3-0        |
| 1110  | 1101 | U | D | 0 | 1 | Rn != 1111 |
Second halfword:
| 15-12 | 11-10 | 9-8       | 7-0  |
| Vd    | 10    | size (1x) | imm8 |


Single-precision scalar variant 
Applies when size == 10. 


VLDR{<c>}{<q>}{.32} <Sd>, [<Rn> {, #{+/-}<imm>}] 


Double-precision scalar variant 
Applies when size == 11. 


VLDR{<c>}{<q>}{.64} <Dd>, [<Rn> {, #{+/-}<imm>}] 


Decode for all variants of this encoding

if size IN { '00', '01' } then UNDEFINED;
esize = 8 << UInt(size); add = (U == '1');
imm32 = if esize == 16 then ZeroExtend(imm8:'0', 32) else ZeroExtend(imm8:'00', 32);
d = UInt(D:Vd); n = UInt(Rn);
Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
.64 Is an optional data size specifier for 64-bit memory accesses that can be used in the assembler source
code, but is otherwise ignored.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
.32 Is an optional data size specifier for 32-bit memory accesses that can be used in the assembler source
code, but is otherwise ignored.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<imm> Is the optional unsigned immediate byte offset, a multiple of 4, in the range 0 to 1020, defaulting to
0, and encoded in the "imm8" field as <imm>/4.


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    base = if n == 15 then Align(PC,4) else R[n];
    address = if add then (base + imm32) else (base - imm32);
    case esize of
        when 32
            S[d] = MemA[address,4];
        when 64
            word1 = MemA[address,4]; word2 = MemA[address+4,4];
            // Combine the word-aligned words in the correct order for current endianness.
            D[d] = if BigEndian() then word1:word2 else word2:word1;
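
The address calculation and the endianness-dependent word ordering in the operation above can be sketched in Python as follows. The helper names `align`, `vldr_address`, and `combine_words` are hypothetical, the 32-bit wraparound mask is an assumption of this model, and `pc_value` stands for whatever value the PE would read as the PC for the instruction; none of these names come from the architecture.

```python
def align(value: int, boundary: int) -> int:
    """Round value down to a multiple of boundary, like Align(x, y)."""
    return boundary * (value // boundary)


def vldr_address(n: int, rn_value: int, pc_value: int, U: int, imm8: int) -> int:
    """Hypothetical model of the VLDR address calculation (32/64-bit forms)."""
    imm32 = imm8 << 2  # ZeroExtend(imm8:'00', 32)
    base = align(pc_value, 4) if n == 15 else rn_value
    address = base + imm32 if U == 1 else base - imm32
    return address & 0xFFFFFFFF  # assumed wrap to 32 bits


def combine_words(word1: int, word2: int, big_endian: bool) -> int:
    """Build the 64-bit register value from two word-sized accesses."""
    if big_endian:
        return (word1 << 32) | word2
    return (word2 << 32) | word1
```

The word at the lower address supplies the least significant half of the D register in little-endian operation, and the most significant half in big-endian operation.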
F6.1.98 VLDR (literal)


Load SIMD&FP register (literal) loads a single register in the Advanced SIMD and floating-point register file
from memory, using an address formed from the PC value and an immediate offset.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 





A1

| 31-28        | 27-24 | 23 | 22 | 21 | 20 | 19-16     | 15-12 | 11-10 | 9-8       | 7-0  |
| cond != 1111 | 1101  | U  | D  | 0  | 1  | Rn = 1111 | Vd    | 10    | size (1x) | imm8 |


Single-precision scalar variant 
Applies when size == 10. 


VLDR{<c>}{<q>}{.32} <Sd>, <label> 
VLDR{<c>}{<q>}{.32} <Sd>, [PC, #{+/-}<imm>] 


Double-precision scalar variant 
Applies when size == 11. 


VLDR{<c>}{<q>}{.64} <Dd>, <label> 
VLDR{<c>}{<q>}{.64} <Dd>, [PC, #{+/-}<imm>] 


Decode for all variants of this encoding

if size IN { '00', '01' } then UNDEFINED;
esize = 8 << UInt(size); add = (U == '1');
imm32 = if esize == 16 then ZeroExtend(imm8:'0', 32) else ZeroExtend(imm8:'00', 32);
d = UInt(D:Vd); n = UInt(Rn);


T1

First halfword:
| 15-12 | 11-8 | 7 | 6 | 5 | 4 | 3-0       |
| 1110  | 1101 | U | D | 0 | 1 | Rn = 1111 |
Second halfword:
| 15-12 | 11-10 | 9-8       | 7-0  |
| Vd    | 10    | size (1x) | imm8 |


Single-precision scalar variant 
Applies when size == 10. 


VLDR{<c>}{<q>}{.32} <Sd>, <label> 
VLDR{<c>}{<q>}{.32} <Sd>, [PC, #{+/-}<imm>] 


Double-precision scalar variant 
Applies when size == 11. 


VLDR{<c>}{<q>}{.64} <Dd>, <label> 
VLDR{<c>}{<q>}{.64} <Dd>, [PC, #{+/-}<imm>] 


Decode for all variants of this encoding

if size IN { '00', '01' } then UNDEFINED;
esize = 8 << UInt(size); add = (U == '1');
imm32 = if esize == 16 then ZeroExtend(imm8:'0', 32) else ZeroExtend(imm8:'00', 32);
d = UInt(D:Vd); n = UInt(Rn);
Assembler symbols

<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
.64 Is an optional data size specifier for 64-bit memory accesses that can be used in the assembler source
code, but is otherwise ignored.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
.32 Is an optional data size specifier for 32-bit memory accesses that can be used in the assembler source
code, but is otherwise ignored.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.
<label> The label of the literal data item to be loaded. The assembler calculates the required value of the
offset from the Align(PC, 4) value of the instruction to this label. Permitted values are multiples of
4 in the range -1020 to 1020. If the offset is zero or positive, imm32 is equal to the offset and add ==
TRUE. If the offset is negative, imm32 is equal to minus the offset and add == FALSE.
+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1
<imm> Is the optional unsigned immediate byte offset, a multiple of 4, in the range 0 to 1020, defaulting to
0, and encoded in the "imm8" field as <imm>/4.


The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified 
separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more 
information, see Use of labels in UAL instruction syntax on page F1-2369. 
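
The rule for turning a label offset into the U and imm8 fields can be illustrated with a short Python sketch. The function name `encode_vldr_literal_offset` and the use of `ValueError` for offsets outside the permitted set are assumptions made for this example; an actual assembler may report such cases differently.

```python
def encode_vldr_literal_offset(offset: int) -> tuple[int, int]:
    """Hypothetical helper: map a byte offset from Align(PC, 4) to (U, imm8).

    Permitted offsets are multiples of 4 in the range -1020 to 1020.
    """
    if offset % 4 != 0 or not -1020 <= offset <= 1020:
        raise ValueError("offset not encodable")
    U = 1 if offset >= 0 else 0  # add == TRUE for zero or positive offsets
    imm8 = abs(offset) // 4      # imm32 is the magnitude of the offset
    return U, imm8
```

A label offset of zero always maps to U = 1 under this rule, which is why a subtraction of 0 is only expressible through the alternative [PC, #-0] syntax.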


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    base = if n == 15 then Align(PC,4) else R[n];
    address = if add then (base + imm32) else (base - imm32);
    case esize of
        when 32
            S[d] = MemA[address,4];
        when 64
            word1 = MemA[address,4]; word2 = MemA[address+4,4];
            // Combine the word-aligned words in the correct order for current endianness.
            D[d] = if BigEndian() then word1:word2 else word2:word1;
F6.1.99 VMAX (floating-point)


Vector Maximum compares corresponding elements in two vectors, and copies the larger of each pair into the 
corresponding element in the destination vector. 


The operand vector elements are 32-bit floating-point numbers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1

| 31-28 | 27-24 | 23 | 22 | 21 | 20 | 19-16 | 15-12 | 11-8 | 7 | 6 | 5 | 4 | 3-0 |
| 1111  | 0010  | 0  | D  | op | sz | Vn    | Vd    | 1111 | N | Q | M | 0 | Vm  |

The "op" field (bit 21) is 0 for VMAX.


64-bit SIMD vector variant 
Applies when Q == 0. 


VMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMAX{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


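T1

First halfword:
| 15-12 | 11-8 | 7 | 6 | 5  | 4  | 3-0 |
| 1110  | 1111 | 0 | D | op | sz | Vn  |
Second halfword:
| 15-12 | 11-8 | 7 | 6 | 5 | 4 | 3-0 |
| Vd    | 1111 | N | Q | M | 0 | Vm  |

The "op" field (bit 5 of the first halfword) is 0 for VMAX.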


64-bit SIMD vector variant 
Applies when Q == 0. 


VMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMAX{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Floating-point maximum and minimum

• max(+0.0, -0.0) = +0.0
• If any input is a NaN, the corresponding result element is the default NaN.


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r],e,esize]; op2 = Elem[D[m+r],e,esize];
            if maximum then
                Elem[D[d+r],e,esize] = FPMax(op1, op2, StandardFPSCRValue());
            else
                Elem[D[d+r],e,esize] = FPMin(op1, op2, StandardFPSCRValue());
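
The two special cases listed under "Floating-point maximum and minimum" can be made concrete with a Python sketch that works on raw single-precision bit patterns. This is an illustrative model only: `fpmax_f32` and `bits_to_float` are hypothetical names, 0x7FC00000 is used as the single-precision default NaN, and other effects of the standard FPSCR value, such as flushing denormal inputs to zero and exception flags, are deliberately not modeled.

```python
import math
import struct

DEFAULT_NAN_F32 = 0x7FC00000  # single-precision default NaN bit pattern


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fpmax_f32(a_bits: int, b_bits: int) -> int:
    """Hypothetical model of the per-element maximum used by VMAX.F32."""
    a, b = bits_to_float(a_bits), bits_to_float(b_bits)
    if math.isnan(a) or math.isnan(b):
        return DEFAULT_NAN_F32  # any NaN input gives the default NaN
    if a == 0.0 and b == 0.0:
        # max(+0.0, -0.0) = +0.0: the sign bit survives only if both are -0.0
        return a_bits & b_bits
    return a_bits if a > b else b_bits
```

Working on bit patterns rather than Python floats keeps the sign of zero and the exact NaN encoding visible in the result.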
F6.1.100 VMAX (integer)


Vector Maximum compares corresponding elements in two vectors, and copies the larger of each pair into the 
corresponding element in the destination vector. 


The operand vector elements can be any one of:

• 8-bit, 16-bit, or 32-bit signed integers.
• 8-bit, 16-bit, or 32-bit unsigned integers.

The result vector elements are the same size as the operand vector elements.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1

| 31-25   | 24 | 23 | 22 | 21-20 | 19-16 | 15-12 | 11-8 | 7 | 6 | 5 | 4  | 3-0 |
| 1111001 | U  | 0  | D  | size  | Vn    | Vd    | 0110 | N | Q | M | op | Vm  |

The "op" field (bit 4) is 0 for VMAX.


64-bit SIMD vector variant 
Applies when Q == 0. 


VMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMAX{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


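T1

First halfword:
| 15-13 | 12 | 11-8 | 7 | 6 | 5-4  | 3-0 |
| 111   | U  | 1111 | 0 | D | size | Vn  |
Second halfword:
| 15-12 | 11-8 | 7 | 6 | 5 | 4  | 3-0 |
| Vd    | 0110 | N | Q | M | op | Vm  |

The "op" field (bit 4 of the second halfword) is 0 for VMAX.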


64-bit SIMD vector variant
Applies when Q == 0.

VMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.

VMAX{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the
following values:
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Int(Elem[D[n+r],e,esize], unsigned);
            op2 = Int(Elem[D[m+r],e,esize], unsigned);
            result = if maximum then Max(op1,op2) else Min(op1,op2);
            Elem[D[d+r],e,esize] = result<esize-1:0>;
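
The element-wise operation above depends on whether each element is interpreted as signed or unsigned. A minimal Python sketch, in which the names `to_int` and `vmax_int` and the modeling of a vector as a list of raw element bit patterns are assumptions made for illustration, is:

```python
def to_int(bits: int, esize: int, unsigned: bool) -> int:
    """Interpret an esize-bit pattern as an integer, like Int(x, unsigned)."""
    if unsigned or bits < (1 << (esize - 1)):
        return bits
    return bits - (1 << esize)  # two's complement negative value


def vmax_int(n_elems: list[int], m_elems: list[int], esize: int,
             unsigned: bool, maximum: bool = True) -> list[int]:
    """Hypothetical model of the per-element integer maximum or minimum."""
    mask = (1 << esize) - 1
    out = []
    for a, b in zip(n_elems, m_elems):
        op1 = to_int(a, esize, unsigned)
        op2 = to_int(b, esize, unsigned)
        result = max(op1, op2) if maximum else min(op1, op2)
        out.append(result & mask)  # result<esize-1:0>
    return out
```

The same bit pattern can therefore win or lose the comparison depending on the data type: 0xFF is the largest U8 value but represents -1 as an S8 value.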
F6.1.101 VMAXNM


This instruction determines the floating-point maximum number.


It handles NaNs consistently with the IEEE 754-2008 specification. It returns the numerical operand when one
operand is numerical and the other is a quiet NaN, but otherwise the result is identical to floating-point VMAX.


This instruction is not conditional. 
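
The difference from floating-point VMAX can be shown with a Python sketch on single-precision bit patterns. It is a simplified model under stated assumptions: the function names are hypothetical, the result for the remaining NaN cases is taken to be the default NaN 0x7FC00000 as for the Advanced SIMD form of VMAX, and denormal flushing, exception flags, and the FPSCR-controlled behavior of the scalar variants are not modeled.

```python
import math
import struct

DEFAULT_NAN_F32 = 0x7FC00000  # assumed default NaN bit pattern


def bits_to_float(bits: int) -> float:
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def is_nan(bits: int) -> bool:
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0


def is_quiet_nan(bits: int) -> bool:
    return is_nan(bits) and (bits & 0x00400000) != 0


def fpmax_f32(a_bits: int, b_bits: int) -> int:
    """Plain maximum: any NaN input gives the default NaN."""
    a, b = bits_to_float(a_bits), bits_to_float(b_bits)
    if math.isnan(a) or math.isnan(b):
        return DEFAULT_NAN_F32
    if a == 0.0 and b == 0.0:
        return a_bits & b_bits  # +0.0 unless both inputs are -0.0
    return a_bits if a > b else b_bits


def fpmaxnum_f32(a_bits: int, b_bits: int) -> int:
    """Hypothetical model of the maximum-number rule used by VMAXNM."""
    if is_quiet_nan(a_bits) and not is_nan(b_bits):
        return b_bits  # numeric operand wins over a quiet NaN
    if is_quiet_nan(b_bits) and not is_nan(a_bits):
        return a_bits
    return fpmax_f32(a_bits, b_bits)
```

Only the combination of one quiet NaN and one numeric operand is treated specially; a signaling NaN, or two NaN operands, fall through to the ordinary maximum.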


A1

| 31-28 | 27-24 | 23 | 22 | 21 | 20 | 19-16 | 15-12 | 11-8 | 7 | 6 | 5 | 4 | 3-0 |
| 1111  | 0011  | 0  | D  | op | sz | Vn    | Vd    | 1111 | N | Q | M | 1 | Vm  |

The "op" field (bit 21) is 0 for VMAXNM.


64-bit SIMD vector variant 
Applies when Q == 0. 


VMAXNM{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMAXNM{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
advsimd = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


A2 


|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  1  1  0| 1  D  0  0|    Vn     |    Vd     | 1  0  1  x| N  0  M  0|    Vm     |
                                                                   size     op


Single-precision scalar variant 
Applies when size == 10. 


VMAXNM{<q>}.F32 <Sd>, <Sn>, <Sm> // Cannot be conditional 


Double-precision scalar variant 
Applies when size == 11. 


VMAXNM{<q>}.F64 <Dd>, <Dn>, <Dm> // Cannot be conditional 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
advsimd = FALSE;
maximum = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;


T1


|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  1  1  1| 0  D  0 sz|    Vn     |    Vd     | 1  1  1  1| N  Q  M  1|    Vm     |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VMAXNM{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMAXNM{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
advsimd = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T2


|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  1  1  0| 1  D  0  0|    Vn     |    Vd     | 1  0  1  x| N  0  M  0|    Vm     |
                                                                   size     op


Single-precision scalar variant
Applies when size == 10.


VMAXNM{<q>}.F32 <Sd>, <Sn>, <Sm> // Not permitted in IT block 


Double-precision scalar variant 


Applies when size == 11. 


VMAXNM{<q>}.F64 <Dd>, <Dn>, <Dm> // Not permitted in IT block 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
advsimd = FALSE;
maximum = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
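
The register field descriptions above can be sketched in Python as follows. The helper names d_fields, q_fields and s_fields are hypothetical and introduced only for illustration; the sketch shows how a register number is split into the one-bit and four-bit fields, assuming the field orders "D:Vd" and "Vd:D" given above.

```python
from typing import Tuple


def d_fields(d: int) -> Tuple[int, int]:
    """Split a D register number (0-31) into the (D, Vd) fields of "D:Vd"."""
    if not 0 <= d <= 31:
        raise ValueError("D register number out of range")
    return d >> 4, d & 0xF


def q_fields(q: int) -> Tuple[int, int]:
    """A Q register (0-15) is encoded in "D:Vd" as <Qd>*2, so Vd<0> is always 0."""
    if not 0 <= q <= 15:
        raise ValueError("Q register number out of range")
    return d_fields(q * 2)


def s_fields(s: int) -> Tuple[int, int]:
    """Split an S register number (0-31) into the (Vd, D) fields of "Vd:D"."""
    if not 0 <= s <= 31:
        raise ValueError("S register number out of range")
    return s >> 1, s & 1
```

Because a Q register number is doubled before it is split, a valid 128-bit encoding always has bit 0 of the four-bit field clear, which is why the decode pseudocode treats Vd<0>, Vn<0> or Vm<0> equal to '1' as UNDEFINED when Q == '1'.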


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
if advsimd then // Advanced SIMD instruction
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r], e, esize]; op2 = Elem[D[m+r], e, esize];
            if maximum then
                Elem[D[d+r], e, esize] = FPMaxNum(op1, op2, StandardFPSCRValue());
            else
                Elem[D[d+r], e, esize] = FPMinNum(op1, op2, StandardFPSCRValue());
else // VFP instruction
    case esize of
        when 32
            if maximum then
                S[d] = FPMaxNum(S[n], S[m], FPSCR);
            else
                S[d] = FPMinNum(S[n], S[m], FPSCR);
        when 64
            if maximum then
                D[d] = FPMaxNum(D[n], D[m], FPSCR);
            else
                D[d] = FPMinNum(D[n], D[m], FPSCR);
F6.1.102 VMIN (floating-point)
Vector Minimum compares corresponding elements in two vectors, and copies the smaller of each pair into the 
corresponding element in the destination vector. 
The operand vector elements are 32-bit floating-point numbers. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  0| 0  D  1 sz|    Vn     |    Vd     | 1  1  1  1| N  Q  M  0|    Vm     |
                               op
64-bit SIMD vector variant 
Applies when Q == 0. 
VMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VMIN{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  0| 1  1  1  1| 0  D  1 sz|    Vn     |    Vd     | 1  1  1  1| N  Q  M  0|    Vm     |
                               op
64-bit SIMD vector variant 
Applies when Q == 0. 
VMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VMIN{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Floating-point minimum 
° min(+0.0, -0.0) = -0.0 


° If any input is a NaN, the corresponding result element is the default NaN. 
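
The two special cases listed above can be sketched in Python. The name fp_min is a hypothetical stand-in for the FPMin pseudocode function; math.nan stands in for the default NaN, and exceptions and flush-to-zero behavior are deliberately not modelled.

```python
import math


def fp_min(a: float, b: float) -> float:
    """VMIN-style minimum of two floating-point values (illustrative only)."""
    if math.isnan(a) or math.isnan(b):
        # Any NaN input produces the default NaN.
        return math.nan
    if a == 0.0 and b == 0.0:
        # Zeros compare equal, so choose by sign: min(+0.0, -0.0) = -0.0.
        any_negative = math.copysign(1.0, a) < 0 or math.copysign(1.0, b) < 0
        return -0.0 if any_negative else 0.0
    return a if a < b else b
```

The explicit zero check is needed because an ordinary comparison treats +0.0 and -0.0 as equal and would return whichever operand happened to come second.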


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r],e,esize]; op2 = Elem[D[m+r],e,esize];
            if maximum then
                Elem[D[d+r],e,esize] = FPMax(op1, op2, StandardFPSCRValue());
            else
                Elem[D[d+r],e,esize] = FPMin(op1, op2, StandardFPSCRValue());
F6.1.103 VMIN (integer)


Vector Minimum compares corresponding elements in two vectors, and copies the smaller of each pair into the 
corresponding element in the destination vector. 


The operand vector elements can be any one of: 

° 8-bit, 16-bit, or 32-bit signed integers. 

° 8-bit, 16-bit, or 32-bit unsigned integers. 

The result vector elements are the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  U| 0  D  size|    Vn     |    Vd     | 0  1  1  0| N  Q  M  1|    Vm     |
                                                                                  op
64-bit SIMD vector variant 


Applies when Q == 0. 


VMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMIN{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  U| 1  1  1  1| 0  D  size|    Vn     |    Vd     | 0  1  1  0| N  Q  M  1|    Vm     |
                                                                                  op


64-bit SIMD vector variant 

Applies when Q == 0. 
VMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VMIN{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Int(Elem[D[n+r],e,esize], unsigned);
            op2 = Int(Elem[D[m+r],e,esize], unsigned);
            result = if maximum then Max(op1,op2) else Min(op1,op2);
            Elem[D[d+r],e,esize] = result<esize-1:0>;
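
The element-wise operation above can be sketched in Python for a single 64-bit D register. This is an illustrative model only: the function names to_int and vmin_d are hypothetical, registers are modelled as plain Python integers, and the size argument is assumed to be the two-bit size field from the encoding.

```python
def to_int(bits: int, esize: int, unsigned: bool) -> int:
    """Interpret an esize-bit field as an unsigned or two's complement integer."""
    if unsigned or bits < (1 << (esize - 1)):
        return bits
    return bits - (1 << esize)


def vmin_d(dn: int, dm: int, size: int, unsigned: bool) -> int:
    """Element-wise minimum of two 64-bit vectors held as Python integers."""
    if size == 0b11:
        raise ValueError("size == '11' is UNDEFINED")
    esize = 8 << size            # 8, 16 or 32 bits per element
    elements = 64 // esize
    mask = (1 << esize) - 1
    result = 0
    for e in range(elements):
        op1 = to_int((dn >> (e * esize)) & mask, esize, unsigned)
        op2 = to_int((dm >> (e * esize)) & mask, esize, unsigned)
        # Keep only the low esize bits, as result<esize-1:0> does.
        result |= (min(op1, op2) & mask) << (e * esize)
    return result
```

The same bit pattern can give different results for the signed and unsigned data types: with 8-bit elements, 0xFF is smaller than 0x01 when read as S8 (-1) but larger when read as U8 (255).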
F6.1.104 VMINNM


This instruction determines the floating-point minimum number.


It handles NaNs consistently with the IEEE 754-2008 specification: it returns the numerical operand when one
operand is numerical and the other is a quiet NaN. Otherwise the result is identical to floating-point VMIN.


This instruction is not conditional. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  1| 0  D  1 sz|    Vn     |    Vd     | 1  1  1  1| N  Q  M  1|    Vm     |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VMINNM{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMINNM{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
advsimd = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;





A2 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  1  1  0| 1  D  0  0|    Vn     |    Vd     | 1  0  1  x| N  1  M  0|    Vm     |
                                                                   size     op
Single-precision scalar variant 
Applies when size == 10. 
VMINNM{<q>}.F32 <Sd>, <Sn>, <Sm> // Cannot be conditional 
Double-precision scalar variant 
Applies when size == 11. 
VMINNM{<q>}.F64 <Dd>, <Dn>, <Dm> // Cannot be conditional 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED;
advsimd = FALSE;
maximum = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
T1 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  1  1  1| 0  D  1 sz|    Vn     |    Vd     | 1  1  1  1| N  Q  M  1|    Vm     |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VMINNM{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMINNM{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
maximum = (op == '0');
advsimd = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T2 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 1  1  1  0| 1  D  0  0|    Vn     |    Vd     | 1  0  1  x| N  1  M  0|    Vm     |
                                                                   size     op


Single-precision scalar variant 
Applies when size == 10. 


VMINNM{<q>}.F32 <Sd>, <Sn>, <Sm> // Not permitted in IT block 


Double-precision scalar variant 
Applies when size == 11. 


VMINNM{<q>}.F64 <Dd>, <Dn>, <Dm> // Not permitted in IT block 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
advsimd = FALSE;
maximum = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
if advsimd then // Advanced SIMD instruction
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[D[n+r], e, esize]; op2 = Elem[D[m+r], e, esize];
            if maximum then
                Elem[D[d+r], e, esize] = FPMaxNum(op1, op2, StandardFPSCRValue());
            else
                Elem[D[d+r], e, esize] = FPMinNum(op1, op2, StandardFPSCRValue());
else // VFP instruction
    case esize of
        when 32
            if maximum then
                S[d] = FPMaxNum(S[n], S[m], FPSCR);
            else
                S[d] = FPMinNum(S[n], S[m], FPSCR);
        when 64
            if maximum then
                D[d] = FPMaxNum(D[n], D[m], FPSCR);
            else
                D[d] = FPMinNum(D[n], D[m], FPSCR);
F6.1.105 VMLA (floating-point)


Vector Multiply Accumulate multiplies corresponding elements in two vectors, and accumulates the results into the 
elements of the destination vector. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  0| 0  D  0 sz|    Vn     |    Vd     | 1  1  0  1| N  Q  M  1|    Vm     |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VMLA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMLA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; add = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


A2 
|31       28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
|  !=1111   | 1  1  1  0| 0  D  0  0|    Vn     |    Vd     | 1  0  1  x| N  0  M  0|    Vm     |
    cond                                                           size     op


Single-precision scalar variant 
Applies when size == 10. 


VMLA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VMLA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
advsimd = FALSE; add = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1


|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  0| 1  1  1  1| 0  D  0 sz|    Vn     |    Vd     | 1  1  0  1| N  Q  M  1|    Vm     |
                               op


64-bit SIMD vector variant 
Applies when Q == 0. 


VMLA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMLA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; add = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T2 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  0| 1  1  1  0| 0  D  0  0|    Vn     |    Vd     | 1  0  1  x| N  0  M  0|    Vm     |
                                                                   size     op


Single-precision scalar variant 
Applies when size == 10. 


VMLA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VMLA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
advsimd = FALSE; add = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                product = FPMul(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize], StandardFPSCRValue());
                addend = if add then product else FPNeg(product);
                Elem[D[d+r],e,esize] = FPAdd(Elem[D[d+r],e,esize], addend, StandardFPSCRValue());
    else // VFP instruction
        case esize of
            when 32
                addend32 = if add then FPMul(S[n], S[m], FPSCR) else FPNeg(FPMul(S[n], S[m], FPSCR));
                S[d] = FPAdd(S[d], addend32, FPSCR);
            when 64
                addend64 = if add then FPMul(D[n], D[m], FPSCR) else FPNeg(FPMul(D[n], D[m], FPSCR));
                D[d] = FPAdd(D[d], addend64, FPSCR);
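
The pseudocode above applies FPMul and then FPAdd, so the product is rounded before it is accumulated. A minimal Python sketch of that two-step rounding for single-precision values follows. The names f32 and vmla_f32 are hypothetical; the sketch assumes round-to-nearest, finite inputs that fit in single precision, and it does not model flush-to-zero, NaN handling, or exception flags.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def vmla_f32(acc: float, a: float, b: float, add: bool = True) -> float:
    """Multiply, round, then accumulate and round again (not a fused operation)."""
    product = f32(a * b)                    # first rounding, as in FPMul
    addend = product if add else -product   # FPNeg for the subtracting form
    return f32(acc + addend)                # second rounding, as in FPAdd
```

For example, with a = b = 1 + 2**-12 and acc = -(1 + 2**-11), the exact value of acc + a*b is 2**-24, but the intermediate rounding of the product discards that term and the sketch returns 0.0.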
F6.1.106  VMLA (integer) 
Vector Multiply Accumulate multiplies corresponding elements in two vectors, and adds the products to the 
corresponding elements of the destination vector. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  0| 0  D  size|    Vn     |    Vd     | 1  0  0  1| N  Q  M  0|    Vm     |
                      op
64-bit SIMD vector variant 
Applies when Q == 0. 
VMLA{<c>}{<q>}.<type><size> <Dd>, <Dn>, <Dm> // Encoding T1/A1, encoded as Q = 0 
128-bit SIMD vector variant 
Applies when Q == 1. 
VMLA{<c>}{<q>}.<type><size> <Qd>, <Qn>, <Qm> // Encoding T1/A1, encoded as Q = 1 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
add = (op == '0'); long_destination = FALSE;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  0| 1  1  1  1| 0  D  size|    Vn     |    Vd     | 1  0  0  1| N  Q  M  0|    Vm     |
          op
64-bit SIMD vector variant 
Applies when Q == 0. 
VMLA{<c>}{<q>}.<type><size> <Dd>, <Dn>, <Dm> // Encoding T1/A1, encoded as Q = 0 
128-bit SIMD vector variant 
Applies when Q == 1. 
VMLA{<c>}{<q>}.<type><size> <Qd>, <Qn>, <Qm> // Encoding T1/A1, encoded as Q = 1 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
add = (op == '0'); long_destination = FALSE;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<type> The data type for the elements of the operands. It must be one of:
S Optional in encoding T1/A1. Encoded as U = 0 in encoding T2/A2.
U Optional in encoding T1/A1. Encoded as U = 1 in encoding T2/A2.
I Available only in encoding T1/A1.

<size> The data size for the elements of the operands. It must be one of:
8 Encoded as size = 0b00.
16 Encoded as size = 0b01.
32 Encoded as size = 0b10.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            product = Int(Elem[Din[n+r],e,esize],unsigned) * Int(Elem[Din[m+r],e,esize],unsigned);
            addend = if add then product else -product;
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
            else
                Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
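
The non-long path of the operation above (long_destination == FALSE) can be sketched in Python for one 64-bit D register. The names to_int and vmla_d are hypothetical, registers are modelled as Python integers, and the sketch assumes that storing the sum into an esize-bit element keeps only its low esize bits. It also illustrates why the decode pseudocode can treat unsigned as a "don't care" value.

```python
def to_int(bits: int, esize: int, unsigned: bool) -> int:
    """Interpret an esize-bit field as an unsigned or two's complement integer."""
    if unsigned or bits < (1 << (esize - 1)):
        return bits
    return bits - (1 << esize)


def vmla_d(dd: int, dn: int, dm: int, size: int, unsigned: bool = False) -> int:
    """Per-element dd + dn * dm on 64-bit vectors, truncated to the element size."""
    if size == 0b11:
        raise ValueError("size == '11' is UNDEFINED")
    esize = 8 << size
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        shift = e * esize
        op1 = to_int((dn >> shift) & mask, esize, unsigned)
        op2 = to_int((dm >> shift) & mask, esize, unsigned)
        acc = (dd >> shift) & mask
        # Only the low esize bits of the sum are kept in the destination element.
        result |= ((acc + op1 * op2) & mask) << shift
    return result
```

Because only the low esize bits survive, signed and unsigned interpretations of the sources give the same element value, for example 0x05 + 0xFF * 0x03 produces 0x02 in both cases.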
F6.1.107 VMLA (by scalar)


Vector Multiply Accumulate multiplies elements of a vector by a scalar, and adds the products to corresponding 
elements of the destination vector. 


For more information about scalars see Advanced SIMD scalars on page F2-2432. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  Q| 1  D  !=11|    Vn     |    Vd     | 0  0  0  F| N  1  M  0|    Vm     |
                               size                             op


64-bit SIMD vector variant 
Applies when Q == 0. 


VMLA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm[x]> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMLA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Dm[x]> 


Decode for all variants of this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' || (F == '1' && size == '01') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
add = (op == '0'); floating_point = (F == '1'); long_destination = FALSE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  Q| 1  1  1  1| 1  D  !=11|    Vn     |    Vd     | 0  0  0  F| N  1  M  0|    Vm     |
                               size                             op


64-bit SIMD vector variant 

Applies when Q == 0. 
VMLA{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm[x]> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VMLA{<c>}{<q>}.<dt> <Qd>, <Qn>, <Dm[x]> 
Decode for all variants of this encoding


if size == '11' then SEE "Related encodings";
if size == '00' || (F == '1' && size == '01') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
add = (op == '0'); floating_point = (F == '1'); long_destination = FALSE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);


Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the scalar and the elements of the operand vector, encoded in the "F:size" field.
It can have the following values:
I16 when F = 0, size = 01
I32 when F = 0, size = 10
F32 when F = 1, size = 10

<Qd> Is the 128-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd"
field as <Qd>*2.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Dd> Is the 64-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd"
field.

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm[x]> The scalar. Dm is restricted to D0-D7 if <size> is 16, or D0-D15 otherwise.
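
The scalar restriction follows from how the decode pseudocode splits the "M:Vm" bits between the register number and the element index. A minimal Python sketch of that split is shown below; the function name decode_scalar is hypothetical, and the arguments are assumed to be the raw size, M and Vm field values.

```python
from typing import Tuple


def decode_scalar(size: int, M: int, Vm: int) -> Tuple[int, int, int]:
    """Return (esize, m, index) for the scalar operand Dm[x]."""
    if size == 0b01:
        # 16-bit scalars: three bits select D0-D7, two bits select one of four elements.
        return 16, Vm & 0b111, (M << 1) | (Vm >> 3)
    if size == 0b10:
        # 32-bit scalars: four bits select D0-D15, one bit selects one of two elements.
        return 32, Vm, M
    raise ValueError("size must be '01' or '10' for this instruction")
```

For example, size = '01', M = 1 and Vm = '1011' select element 3 of D3, while size = '10', M = 1 and Vm = '1111' select element 1 of D15.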


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned);
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned);
            if floating_point then
                fp_addend = if add then FPMul(op1,op2,StandardFPSCRValue()) else
                            FPNeg(FPMul(op1,op2,StandardFPSCRValue()));
                Elem[D[d+r],e,esize] = FPAdd(Elem[Din[d+r],e,esize], fp_addend, StandardFPSCRValue());
            else
                addend = if add then op1val*op2val else -op1val*op2val;
                if long_destination then
                    Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
                else
                    Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
F6.1.108 VMLAL (integer) 
Vector Multiply Accumulate Long multiplies corresponding elements in two vectors, and adds the products to the
corresponding elements of the destination vector. Each destination vector element is twice as long as the elements
that are multiplied.
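
The widening behavior described above can be sketched in Python. The names to_int and vmlal are hypothetical; the 64-bit sources and the 128-bit accumulator are modelled as Python integers, and the size argument is assumed to be the two-bit size field from the encoding.

```python
def to_int(bits: int, esize: int, unsigned: bool) -> int:
    """Interpret an esize-bit field as an unsigned or two's complement integer."""
    if unsigned or bits < (1 << (esize - 1)):
        return bits
    return bits - (1 << esize)


def vmlal(qd: int, dn: int, dm: int, size: int, unsigned: bool) -> int:
    """Accumulate esize-bit products into elements of width 2*esize."""
    if size == 0b11:
        raise ValueError("size == '11' selects a different instruction")
    esize = 8 << size
    in_mask = (1 << esize) - 1
    out_mask = (1 << (2 * esize)) - 1
    result = 0
    for e in range(64 // esize):
        op1 = to_int((dn >> (e * esize)) & in_mask, esize, unsigned)
        op2 = to_int((dm >> (e * esize)) & in_mask, esize, unsigned)
        acc = (qd >> (e * 2 * esize)) & out_mask
        # Each destination element is twice as wide as the source elements.
        result |= ((acc + op1 * op2) & out_mask) << (e * 2 * esize)
    return result
```

Unlike the non-long form, the data type matters here: with 8-bit sources 0xFF and 0x02 and a zero accumulator, the 16-bit result element is 0xFFFE for S8 (-1 * 2) and 0x01FE for U8 (255 * 2).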
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19       16|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  1| 0  0  1  U| 1  D  !=11|    Vn     |    Vd     | 1  0  0  0| N  0  M  0|    Vm     |
                               size                                op
A1 variant
VMLAL{<c>}{<q>}.<type><size> <Qd>, <Dn>, <Dm> // Encoding T2/A2 
Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if Vd<0> == '1' then UNDEFINED;
add = (op == '0'); long_destination = TRUE; unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;
T1 
|15 14 13 12|11 10  9  8| 7  6  5  4| 3        0|15       12|11 10  9  8| 7  6  5  4| 3        0|
| 1  1  1  U| 1  1  1  1| 1  D  !=11|    Vn     |    Vd     | 1  0  0  0| N  0  M  0|    Vm     |
                               size                                op
T1 variant 
VMLAL{<c>}{<q>}.<type><size> <Qd>, <Dn>, <Dm> // Encoding T2/A2
Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if Vd<@> == '1' then UNDEFINED; 
add = (op == '0'); long_destination = TRUE; unsigned = (U == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); nm = UInt(N:Vn); | m = UInt(M:Vm); regs = 1; 
Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<type> The data type for the elements of the operands. It must be one of:
S Optional in encoding T1/A1. Encoded as U = 0 in encoding T2/A2.
U Optional in encoding T1/A1. Encoded as U = 1 in encoding T2/A2.
I Available only in encoding T1/A1.
<size> The data size for the elements of the operands. It must be one of:
8 Encoded as size = 0b00.
16 Encoded as size = 0b01.
32 Encoded as size = 0b10.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            product = Int(Elem[Din[n+r],e,esize],unsigned) * Int(Elem[Din[m+r],e,esize],unsigned);
            addend = if add then product else -product;
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
            else
                Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
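
The widening behavior of the operation above can be illustrated with a small Python model. This is a sketch only, not part of the architecture: the function name vmlal_int and the use of Python lists of raw element bit patterns are assumptions made for illustration. It covers the long_destination case with add set, where each product is accumulated into an element twice as wide as the sources.

```python
def vmlal_int(acc, dn, dm, esize, unsigned):
    """Hypothetical model of VMLAL (integer) on lists of raw element bit patterns.

    acc holds 64 // esize accumulator elements, each 2*esize bits wide.
    dn and dm hold the same number of esize-bit source elements.
    """
    def to_int(bits, width):
        # Interpret a raw bit pattern as signed or unsigned, like Int(x, unsigned).
        if not unsigned and bits >> (width - 1):
            return bits - (1 << width)
        return bits

    elements = 64 // esize
    mask = (1 << (2 * esize)) - 1  # results wrap to the doubled element width
    return [
        (acc[e] + to_int(dn[e], esize) * to_int(dm[e], esize)) & mask
        for e in range(elements)
    ]
```

With esize = 16, the bit pattern 0xFFFF is 65535 when unsigned is TRUE and -1 when it is FALSE, so the same source registers give different 32-bit results for the U16 and S16 data types.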
F6.1.109 VMLAL (by scalar)
Vector Multiply Accumulate Long multiplies elements of a vector by a scalar, and adds the products to 
corresponding elements of the destination vector. The destination vector elements are twice as long as the elements 
that are multiplied. 
For more information about scalars see Advanced SIMD scalars on page F2-2432. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|0 0 1 U|1 D size|Vn|Vd|0 0 1 0|N 1 M 0|Vm|
size is bits[21:20]; op is bit[10].
A1 variant
VMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); add = (op == '0'); floating_point = FALSE; long_destination = TRUE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 U|1 1 1 1|1 D size|Vn|Vd|0 0 1 0|N 1 M 0|Vm|
size is bits[5:4] of the first halfword; op is bit[10] of the second halfword.
T1 variant
VMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); add = (op == '0'); floating_point = FALSE; long_destination = TRUE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the scalar and the elements of the operand vector, encoded in the "U:size" field.
It can have the following values:
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd"
field as <Qd>*2.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]> The scalar. Dm is restricted to D0-D7 if <size> is 16, or D0-D15 otherwise.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned);
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned);
            if floating_point then
                fp_addend = if add then FPMul(op1,op2,StandardFPSCRValue()) else
                                        FPNeg(FPMul(op1,op2,StandardFPSCRValue()));
                Elem[D[d+r],e,esize] = FPAdd(Elem[Din[d+r],e,esize], fp_addend, StandardFPSCRValue());
            else
                addend = if add then op1val*op2val else -op1val*op2val;
                if long_destination then
                    Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
                else
                    Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
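
The way the by-scalar decode splits the M and Vm fields into a register number and an element index can be sketched in Python. The helper name decode_scalar and its integer arguments are hypothetical, chosen only to mirror the two decode lines for size == '01' and size == '10'.

```python
def decode_scalar(size, M, Vm):
    """Hypothetical helper: return (esize, elements, m, index) for a by-scalar encoding.

    size is the 2-bit size field, M is the 1-bit M field, Vm is the 4-bit Vm field.
    """
    if size == 0b01:
        # 16-bit scalar: Vm<2:0> selects D0-D7, M:Vm<3> selects one of four elements.
        return 16, 4, Vm & 0b111, (M << 1) | (Vm >> 3)
    if size == 0b10:
        # 32-bit scalar: Vm selects D0-D15, M selects one of two elements.
        return 32, 2, Vm, M
    raise ValueError("size 0b00 is UNDEFINED; size 0b11 selects a related encoding")
```

Because one bit of Vm is borrowed for the index when the scalar is 16 bits wide, only three bits remain for the register number, which is why Dm is restricted to D0-D7 in that case.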
F6.1.110 VMLS (floating-point)
Vector Multiply Subtract multiplies corresponding elements in two vectors, subtracts the products from 
corresponding elements of the destination vector, and places the results in the destination vector. 
Note 
ARM recommends that software does not use the VMLS instruction in the Round towards Plus Infinity and Round 
towards Minus Infinity rounding modes, because the rounding of the product and of the sum can change the result 
of the instruction in opposite directions, defeating the purpose of these rounding modes. 
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|0 0 1 0|0 D 1 sz|Vn|Vd|1 1 0 1|N Q M 1|Vm|
op is bit[21].
64-bit SIMD vector variant
Applies when Q == 0.
VMLS{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm>
128-bit SIMD vector variant
Applies when Q == 1.
VMLS{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm>
Decode for all variants of this encoding
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; add = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
A2
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111|1 1 1 0|0 D 0 0|Vn|Vd|1 0 1 x|N 1 M 0|Vm|
cond is bits[31:28]; size is bits[9:8]; op is bit[6].
Single-precision scalar variant
Applies when size == 10.
VMLS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm>
Double-precision scalar variant
Applies when size == 11.
VMLS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm>
Decode for all variants of this encoding
if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
advsimd = FALSE; add = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 0|1 1 1 1|0 D 1 sz|Vn|Vd|1 1 0 1|N Q M 1|Vm|
op is bit[5] of the first halfword.
64-bit SIMD vector variant
Applies when Q == 0.
VMLS{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm>
128-bit SIMD vector variant
Applies when Q == 1.
VMLS{<c>}{<q>}.<dt> <Qd>, <Qn>, <Qm>
Decode for all variants of this encoding
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE; add = (op == '0');
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T2
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 0|1 1 1 0|0 D 0 0|Vn|Vd|1 0 1 x|N 1 M 0|Vm|
size is bits[9:8] of the second halfword; op is bit[6] of the second halfword.
Single-precision scalar variant
Applies when size == 10.
VMLS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm>
Double-precision scalar variant
Applies when size == 11.
VMLS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm>
Decode for all variants of this encoding
if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
advsimd = FALSE; add = (op == '0');
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
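
The three register-number layouts named in the symbol descriptions above ("Vd:D" for S registers, "D:Vd" for D registers, and "D:Vd" holding <Qd>*2 for Q registers) can be compared with a short Python sketch. The function names are hypothetical and the sketch assumes register numbers in the range 0 to 31 for S and D registers and 0 to 15 for Q registers.

```python
def s_reg_fields(s):
    """Split an S register number into (Vd, D) for the "Vd:D" layout."""
    return s >> 1, s & 1          # D is the least significant bit


def d_reg_fields(d):
    """Split a D register number into (D, Vd) for the "D:Vd" layout."""
    return d >> 4, d & 0xF        # D is the most significant bit


def q_reg_fields(q):
    """A Q register is encoded as the D register number <Qd>*2."""
    return d_reg_fields(q * 2)
```

Since a Q register always maps to an even D register number, bit 0 of Vd is 0 for every valid Q operand. This is consistent with the decode treating Vd<0> == '1' as UNDEFINED when Q == '1'.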


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                product = FPMul(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize], StandardFPSCRValue());
                addend = if add then product else FPNeg(product);
                Elem[D[d+r],e,esize] = FPAdd(Elem[D[d+r],e,esize], addend, StandardFPSCRValue());
    else // VFP instruction
        case esize of
            when 32
                addend32 = if add then FPMul(S[n], S[m], FPSCR) else FPNeg(FPMul(S[n], S[m], FPSCR));
                S[d] = FPAdd(S[d], addend32, FPSCR);
            when 64
                addend64 = if add then FPMul(D[n], D[m], FPSCR) else FPNeg(FPMul(D[n], D[m], FPSCR));
                D[d] = FPAdd(D[d], addend64, FPSCR);
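
The operation above rounds twice: once in FPMul and once in FPAdd. The following Python sketch illustrates that chained behavior for single precision. It is an approximation under stated assumptions: it models Round to Nearest only, ignores flush-to-zero, default NaN handling, and exceptions, and the helper names f32 and vmls_f32 are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def vmls_f32(d, n, m):
    """Chained multiply-subtract on single-precision values: d - (n * m)."""
    product = f32(n * m)        # first rounding, corresponding to FPMul
    return f32(d + (-product))  # second rounding, corresponding to FPAdd of the negated product
```

For example, with n = m = 1 + 2**-12 the exact product is 1 + 2**-11 + 2**-24, which rounds to 1 + 2**-11 in single precision. Calling vmls_f32(1 + 2**-11, 1 + 2**-12, 1 + 2**-12) therefore returns 0.0, whereas the unrounded result would be -2**-24. The same intermediate rounding is the reason for the recommendation against using VMLS in the directed rounding modes.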
F6.1.111 VMLS (integer)


Vector Multiply Subtract multiplies corresponding elements in two vectors, and subtracts the products from the 
corresponding elements of the destination vector. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|0 0 1 1|0 D size|Vn|Vd|1 0 0 1|N Q M 0|Vm|
op is bit[24].
64-bit SIMD vector variant
Applies when Q == 0.
VMLS{<c>}{<q>}.<type><size> <Dd>, <Dn>, <Dm> // Encoding T1/A1, encoded as Q = 0
128-bit SIMD vector variant
Applies when Q == 1.
VMLS{<c>}{<q>}.<type><size> <Qd>, <Qn>, <Qm> // Encoding T1/A1, encoded as Q = 1
Decode for all variants of this encoding
if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
add = (op == '0'); long_destination = FALSE;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|1 1 1 1|0 D size|Vn|Vd|1 0 0 1|N Q M 0|Vm|
op is bit[12] of the first halfword.
64-bit SIMD vector variant
Applies when Q == 0.
VMLS{<c>}{<q>}.<type><size> <Dd>, <Dn>, <Dm> // Encoding T1/A1, encoded as Q = 0
128-bit SIMD vector variant
Applies when Q == 1.
VMLS{<c>}{<q>}.<type><size> <Qd>, <Qn>, <Qm> // Encoding T1/A1, encoded as Q = 1
Decode for all variants of this encoding
if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
add = (op == '0'); long_destination = FALSE;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<type> The data type for the elements of the operands. It must be one of: 
S Optional in encoding T1/A1. Encoded as U = 0 in encoding T2/A2. 
U Optional in encoding T1/A1. Encoded as U = 1 in encoding T2/A2. 
I Available only in encoding T1/A1. 
<size> The data size for the elements of the operands. It must be one of: 
8 Encoded as size = 0b00.
16 Encoded as size = 0b01.
32 Encoded as size = 0b10.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            product = Int(Elem[Din[n+r],e,esize],unsigned) * Int(Elem[Din[m+r],e,esize],unsigned);
            addend = if add then product else -product;
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
            else
                Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
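
The decode for this instruction describes unsigned as a "Don't care" value. A small Python model can be used to check that claim for the non-long case, where the result is truncated to esize bits. The function name vmls_int and the list-based register representation are assumptions made for illustration.

```python
def vmls_int(d, n, m, esize, unsigned):
    """Hypothetical model of VMLS (integer): element-wise d - n*m, truncated to esize bits."""
    def to_int(bits):
        # Interpret a raw bit pattern as signed or unsigned, like Int(x, unsigned).
        if not unsigned and bits >> (esize - 1):
            return bits - (1 << esize)
        return bits

    mask = (1 << esize) - 1
    return [(dv - to_int(nv) * to_int(mv)) & mask for dv, nv, mv in zip(d, n, m)]
```

The signed and unsigned interpretations of an element differ by a multiple of 2^esize, so their products also differ by a multiple of 2^esize and the truncated results agree. This does not hold for the long forms, where the destination element is 2*esize bits wide and the U bit is significant.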
F6.1.112 VMLS (by scalar)


Vector Multiply Subtract multiplies elements of a vector by a scalar, and subtracts the products from
corresponding elements of the destination vector.


For more information about scalars see Advanced SIMD scalars on page F2-2432. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|0 0 1 Q|1 D size|Vn|Vd|0 1 0 F|N 1 M 0|Vm|
size is bits[21:20]; op is bit[10].
64-bit SIMD vector variant
Applies when Q == 0.
VMLS{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm[x]>
128-bit SIMD vector variant
Applies when Q == 1.
VMLS{<c>}{<q>}.<dt> <Qd>, <Qn>, <Dm[x]>
Decode for all variants of this encoding
if size == '11' then SEE "Related encodings";
if size == '00' || (F == '1' && size == '01') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
add = (op == '0'); floating_point = (F == '1'); long_destination = FALSE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 Q|1 1 1 1|1 D size|Vn|Vd|0 1 0 F|N 1 M 0|Vm|
size is bits[5:4] of the first halfword; op is bit[10] of the second halfword.
64-bit SIMD vector variant
Applies when Q == 0.
VMLS{<c>}{<q>}.<dt> <Dd>, <Dn>, <Dm[x]>
128-bit SIMD vector variant
Applies when Q == 1.
VMLS{<c>}{<q>}.<dt> <Qd>, <Qn>, <Dm[x]>
Decode for all variants of this encoding
if size == '11' then SEE "Related encodings";
if size == '00' || (F == '1' && size == '01') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality
add = (op == '0'); floating_point = (F == '1'); long_destination = FALSE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);



Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the scalar and the elements of the operand vector, encoded in the "F:size" field.
It can have the following values:
I16 when F = 0, size = 01
I32 when F = 0, size = 10
F32 when F = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd"
field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Dd> Is the 64-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd"
field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]> The scalar. Dm is restricted to D0-D7 if <size> is 16, or D0-D15 otherwise.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned);
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned);
            if floating_point then
                fp_addend = if add then FPMul(op1,op2,StandardFPSCRValue()) else
                                        FPNeg(FPMul(op1,op2,StandardFPSCRValue()));
                Elem[D[d+r],e,esize] = FPAdd(Elem[Din[d+r],e,esize], fp_addend, StandardFPSCRValue());
            else
                addend = if add then op1val*op2val else -op1val*op2val;
                if long_destination then
                    Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
                else
                    Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
F6.1.113 VMLSL (integer)


Vector Multiply Subtract Long multiplies corresponding elements in two vectors, and subtracts the products from the
corresponding elements of the destination vector. The destination vector elements are twice as long as the elements
that are multiplied.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|0 0 1 U|1 D size|Vn|Vd|1 0 1 0|N 0 M 0|Vm|
size is bits[21:20]; op is bit[9].
A1 variant
VMLSL{<c>}{<q>}.<type><size> <Qd>, <Dn>, <Dm> // Encoding T2/A2
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
add = (op == '0'); long_destination = TRUE; unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;


T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 U|1 1 1 1|1 D size|Vn|Vd|1 0 1 0|N 0 M 0|Vm|
size is bits[5:4] of the first halfword; op is bit[9] of the second halfword.
T1 variant
VMLSL{<c>}{<q>}.<type><size> <Qd>, <Dn>, <Dm> // Encoding T2/A2
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
add = (op == '0'); long_destination = TRUE; unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1;


Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3499 
1ID092916 Non-Confidential 


<type> The data type for the elements of the operands. It must be one of:
S Optional in encoding T1/A1. Encoded as U = 0 in encoding T2/A2.
U Optional in encoding T1/A1. Encoded as U = 1 in encoding T2/A2.
I Available only in encoding T1/A1.
<size> The data size for the elements of the operands. It must be one of:
8 Encoded as size = 0b00.
16 Encoded as size = 0b01.
32 Encoded as size = 0b10.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            product = Int(Elem[Din[n+r],e,esize],unsigned) * Int(Elem[Din[m+r],e,esize],unsigned);
            addend = if add then product else -product;
            if long_destination then
                Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
            else
                Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
F6.1.114 VMLSL (by scalar)


Vector Multiply Subtract Long multiplies elements of a vector by a scalar, and subtracts the products from 
corresponding elements of the destination vector. The destination vector elements are twice as long as the elements 
that are multiplied. 


For more information about scalars see Advanced SIMD scalars on page F2-2432. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 1|0 0 1 U|1 D size|Vn|Vd|0 1 1 0|N 1 M 0|Vm|
size is bits[21:20]; op is bit[10].
A1 variant
VMLSL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); add = (op == '0'); floating_point = FALSE; long_destination = TRUE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 U|1 1 1 1|1 D size|Vn|Vd|0 1 1 0|N 1 M 0|Vm|
size is bits[5:4] of the first halfword; op is bit[10] of the second halfword.
T1 variant
VMLSL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); add = (op == '0'); floating_point = FALSE; long_destination = TRUE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);


Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 





<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the scalar and the elements of the operand vector, encoded in the "U:size" field.
It can have the following values:
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd"
field as <Qd>*2.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]> The scalar. Dm is restricted to D0-D7 if <size> is 16, or D0-D15 otherwise.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned);
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned);
            if floating_point then
                fp_addend = if add then FPMul(op1,op2,StandardFPSCRValue()) else
                                        FPNeg(FPMul(op1,op2,StandardFPSCRValue()));
                Elem[D[d+r],e,esize] = FPAdd(Elem[Din[d+r],e,esize], fp_addend, StandardFPSCRValue());
            else
                addend = if add then op1val*op2val else -op1val*op2val;
                if long_destination then
                    Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
                else
                    Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;
F6.1.115 VMOV (between two general-purpose registers and a doubleword floating-point register)


Copy two general-purpose registers to or from a SIMD&FP register copies two words from two general-purpose 
registers into a doubleword register in the Advanced SIMD and floating-point register file, or from a doubleword 
register in the Advanced SIMD and floating-point register file to two general-purpose registers. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
|!=1111|1 1 0 0|0 1 0 op|Rt2|Rt|1 0 1 1|0 0 M 1|Vm|
cond is bits[31:28].
From general-purpose registers variant
Applies when op == 0.


VMOV{<c>}{<q>} <Dm>, <Rt>, <Rt2> 


To general-purpose registers variant 
Applies when op == 1. 


VMOV{<c>}{<q>} <Rt>, <Rt2>, <Dm> 


Decode for all variants of this encoding 


to_arm_registers = (op == '1'); t = UInt(Rt); t2 = UInt(Rt2); m = UInt(M:Vm); 
if t == 15 || t2 == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if to_arm_registers && t == t2 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If to_arm_registers && t == t2, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.
T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
|1 1 1 0|1 1 0 0|0 1 0 op|Rt2|Rt|1 0 1 1|0 0 M 1|Vm|


From general-purpose registers variant 
Applies when op == 0. 

VMOV{<c>}{<q>} <Dm>, <Rt>, <Rt2> 

To general-purpose registers variant 
Applies when op == 1. 


VMOV{<c>}{<q>} <Rt>, <Rt2>, <Dm> 
Decode for all variants of this encoding
to_arm_registers = (op == '1'); t = UInt(Rt); t2 = UInt(Rt2); m = UInt(M:Vm); 


if t == 15 || t2 == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
if to_arm_registers && t == t2 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If to_arm_registers && t == t2, then one of the following behaviors must occur: 


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The value in the destination register is UNKNOWN.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VMOV (between two general-purpose 
registers and a doubleword floating-point register) on page K1-5472. 


Assembler symbols 


<Dm> Is the 64-bit name of the SIMD&FP register to be transferred, encoded in the "M:Vm" field. 

<Rt2> Is the second general-purpose register that <Dm>[63:32] will be transferred to or from, encoded in the
"Rt2" field.

<Rt> Is the first general-purpose register that <Dm>[31:0] will be transferred to or from, encoded in the "Rt" 
field. 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_arm_registers then
        R[t] = D[m]<31:0>;
        R[t2] = D[m]<63:32>;
    else
        D[m]<31:0> = R[t];
        D[m]<63:32> = R[t2];
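
The transfer described by the pseudocode above can be sketched in Python, modelling the 64-bit SIMD&FP register as a plain integer. This is an illustration only: the function names d_to_gprs and gprs_to_d are hypothetical and are not part of the architecture pseudocode.

```python
MASK32 = 0xFFFFFFFF


def d_to_gprs(d_value):
    """Split a 64-bit D register value into (Rt, Rt2).

    Rt receives bits <31:0> and Rt2 receives bits <63:32>.
    """
    return d_value & MASK32, (d_value >> 32) & MASK32


def gprs_to_d(rt, rt2):
    """Combine Rt (low word) and Rt2 (high word) into a 64-bit D value."""
    return ((rt2 & MASK32) << 32) | (rt & MASK32)
```

For example, a D register holding 0x1122334455667788 yields Rt = 0x55667788 and Rt2 = 0x11223344.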
F6.1.116 VMOV (immediate)


Copy immediate value to a SIMD&FP register places an immediate constant into every element of the destination 
register. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18  16|15 12|11  8|7 6 5  4|3  0|
| 1  1  1  1| 0  0  1  i| 1  D  0  0| 0  imm3 | Vd  |cmode|0 Q op 1|imm4|


64-bit SIMD vector variant 
Applies when Q == 0. 


VMOV{<c>}{<q>}.<dt> <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMOV{<c>}{<q>}.<dt> <Qd>, #<imm> 


Decode for all variants of this encoding 


if op == '0' && cmode<0> == '1' && cmode<3:2> != '11' then SEE VORR (immediate);
if op == '1' && cmode != '1110' then SEE "Related encodings";
if Q == '1' && Vd<0> == '1' then UNDEFINED;
single_register = FALSE; advsimd = TRUE; imm64 = AdvSIMDExpandImm(op, cmode, i:imm3:imm4);
d = UInt(D:Vd); regs = if Q == '0' then 1 else 2;


A2

|31 28|27 26 25 24|23 22 21 20|19  16|15 12|11 10|9  8|7   6  5  4|3   0|
|cond | 1  1  1  0| 1  D  1  1|imm4H | Vd  | 1  0|1  x|(0) 0 (0) 0|imm4L|
                                                  size


Single-precision scalar variant 
Applies when size == 10. 


VMOV{<c>}{<q>}.F32 <Sd>, #<imm> 


Double-precision scalar variant 
Applies when size == 11. 


VMOV{<c>}{<q>}.F64 <Dd>, #<imm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
single_register = (size != '11'); advsimd = FALSE;
bits(32) imm32;
bits(64) imm64;
case size of
    when '10' d = UInt(Vd:D); imm32 = VFPExpandImm(imm4H:imm4L);
    when '11' d = UInt(D:Vd); imm64 = VFPExpandImm(imm4H:imm4L); regs = 1;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2   0|15 12|11  8|7 6 5  4|3  0|
| 1  1  1  i| 1  1 1 1|1 D 0 0|0 imm3 | Vd  |cmode|0 Q op 1|imm4|


64-bit SIMD vector variant 
Applies when Q == 0. 


VMOV{<c>}{<q>}.<dt> <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMOV{<c>}{<q>}.<dt> <Qd>, #<imm> 


Decode for all variants of this encoding 


if op == '0' && cmode<0> == '1' && cmode<3:2> != '11' then SEE VORR (immediate);
if op == '1' && cmode != '1110' then SEE "Related encodings";
if Q == '1' && Vd<0> == '1' then UNDEFINED;
single_register = FALSE; advsimd = TRUE; imm64 = AdvSIMDExpandImm(op, cmode, i:imm3:imm4);
d = UInt(D:Vd); regs = if Q == '0' then 1 else 2;


T2

|15 14 13 12|11 10 9 8|7 6 5 4|3   0|15 12|11 10|9  8|7   6  5  4|3   0|
| 1  1  1  0| 1  1 1 0|1 D 1 1|imm4H| Vd  | 1  0|1  x|(0) 0 (0) 0|imm4L|
                                                 size


Single-precision scalar variant 
Applies when size == 10. 


VMOV{<c>}{<q>}.F32 <Sd>, #<imm> 


Double-precision scalar variant 
Applies when size == 11. 


VMOV{<c>}{<q>}.F64 <Dd>, #<imm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
single_register = (size != '11'); advsimd = FALSE;
bits(32) imm32;
bits(64) imm64;
case size of
    when '10' d = UInt(Vd:D); imm32 = VFPExpandImm(imm4H:imm4L);
    when '11' d = UInt(D:Vd); imm64 = VFPExpandImm(imm4H:imm4L); regs = 1;
Notes for all encodings


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction
set.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> The data type. It must be one of I8, I16, I32, I64, or F32.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.

<imm> For encoding A1 and T1: is a constant of the type specified by <dt> that is replicated to fill the
destination register. For details of the range of constants available and the encoding of <dt> and
<imm>, see Modified immediate constants in T32 and A32 Advanced SIMD instructions on
page F2-2423.

For encoding A2 and T2: is a signed floating-point constant with 3-bit exponent and normalized 4
bits of precision, encoded in "imm4H:imm4L". For details of the range of constants available and
the encoding of <imm>, see Modified immediate constants in T32 and A32 floating-point instructions
on page F2-2424.
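
The expansion of the 8-bit floating-point immediate for encodings A2 and T2 can be illustrated with a short Python sketch. It assumes the single-precision layout sign = imm8<7>, exponent = NOT(imm8<6>):Replicate(imm8<6>,5):imm8<5:4>, fraction = imm8<3:0>:Zeros(19). The function names are hypothetical, and the referenced section on modified immediate constants remains the authoritative definition.

```python
import struct


def vfp_expand_imm32(imm8):
    """Expand an 8-bit immediate (imm4H:imm4L) to a 32-bit float bit pattern."""
    sign = (imm8 >> 7) & 1
    b6 = (imm8 >> 6) & 1
    # 8-bit exponent: NOT(b6), then five copies of b6, then imm8<5:4>.
    exp = ((b6 ^ 1) << 7) | ((0b11111 if b6 else 0) << 2) | ((imm8 >> 4) & 0b11)
    # 23-bit fraction: imm8<3:0> followed by 19 zero bits.
    frac = (imm8 & 0xF) << 19
    return (sign << 31) | (exp << 23) | frac


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Under these assumptions, imm8 = 0x70 expands to the bit pattern 0x3F800000, which is the single-precision value 1.0.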


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if single_register then
        S[d] = imm32;
    else
        for r = 0 to regs-1
            D[d+r] = imm64;
F6.1.117 VMOV (register)


Copy between FP registers copies the contents of one FP register to another. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A2

|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10|9  8|7 6 5 4|3  0|
|cond | 1  1  1  0| 1  D  1  1| 0  0  0  0| Vd  | 1  0|1  x|0 1 M 0| Vm |
                                                       size


Single-precision scalar variant 
Applies when size == 10. 


VMOV{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VMOV{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
single_register = (sz == '0'); advsimd = FALSE;
if single_register then
    d = UInt(Vd:D); m = UInt(Vm:M);
else
    d = UInt(D:Vd); m = UInt(M:Vm); regs = 1;


T2

|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10|9  8|7 6 5 4|3  0|
| 1  1  1  0| 1  1 1 0|1 D 1 1|0 0 0 0| Vd  | 1  0|1  x|0 1 M 0| Vm |
                                                   size


Single-precision scalar variant 
Applies when size == 10. 


VMOV{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VMOV{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
single_register = (sz == '0'); advsimd = FALSE;
if single_register then
    d = UInt(Vd:D); m = UInt(Vm:M);
else
    d = UInt(D:Vd); m = UInt(M:Vm); regs = 1;





F6-3508 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 
Non-Confidential 


ARM DDI 0487A.k _iss10775 
1ID092916 


F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions 
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if single_register then
        S[d] = S[m];
    else
        for r = 0 to regs-1
            D[d+r] = D[m+r];
F6.1.118 VMOV (register, SIMD)


Copy between SIMD registers copies the contents of one SIMD register to another.


This instruction is an alias of the VORR (register) instruction. This means that: 


° The encodings in this description are named to match the encodings of VORR (register). 
° The description of VORR (register) gives the operational pseudocode for this instruction. 

A1

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  1| 0  0  1  0| 0  D  1  0| Vn  | Vd  | 0  0 0 1|N Q M 1| Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VMOV{<c>}{<q>}{.<dt>} <Dd>, <Dm> 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Dd>, <Dm>, <Dm> 
and is the preferred disassembly when N:Vn == M:Vm.

128-bit SIMD vector variant 

Applies when Q == 1. 
VMOV{<c>}{<q>}{.<dt>} <Qd>, <Qm> 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Qd>, <Qm>, <Qm> 


and is the preferred disassembly when N:Vn == M:Vm. 


T1

|15 14 13 12|11 10 9 8|7 6 5 4|3  0|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  0| 1  1 1 1|0 D 1 0| Vn | Vd  | 0  0 0 1|N Q M 1| Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VMOV{<c>}{<q>}{.<dt>} <Dd>, <Dm> 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Dd>, <Dm>, <Dm> 


and is the preferred disassembly when N:Vn == M:Vm. 
128-bit SIMD vector variant 
Applies when Q == 1. 


VMOV{<c>}{<q>}{.<dt>} <Qd>, <Qm> 
is equivalent to


VORR{<c>}{<q>}{.<dt>} <Qd>, <Qm>, <Qm> 


and is the preferred disassembly when N:Vn == M:Vm. 
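
The alias rule above can be sketched as a disassembler check. This is a minimal illustration, assuming the fields have already been extracted from the instruction word as integers. The function name preferred_mnemonic is hypothetical.

```python
def preferred_mnemonic(n, vn, m, vm):
    """Return 'VMOV' when N:Vn and M:Vm name the same register, else 'VORR'.

    n and m are the single-bit N and M fields; vn and vm are the 4-bit
    Vn and Vm fields.
    """
    first_source = (n << 4) | vn
    second_source = (m << 4) | vm
    return "VMOV" if first_source == second_source else "VORR"
```

With both source fields naming D3 the sketch reports VMOV. With different sources it reports VORR.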


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> An optional data type. <dt> must not be F64, but it is otherwise ignored.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "N:Vn" and "M:Vm" field as
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "N:Vn" and "M:Vm" field.


Operation for all encodings 


The description of VORR (register) gives the operational pseudocode for this instruction. 
F6.1.119 VMOV (general-purpose register to scalar)


Copy a general-purpose register to a vector element copies a byte, halfword, or word from a general-purpose register 
into an Advanced SIMD scalar. 


On a Floating-point-only system, this instruction transfers one word to the upper or lower half of a double-precision 
floating-point register from a general-purpose register. This is an identical operation to the Advanced SIMD single 
word transfer. 


For more information about scalars see Advanced SIMD scalars on page F2-2432. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6  5 4|3 2 1 0     |
|cond | 1  1  1  0| 0 opc1   0| Vd  | Rt  | 1  0 1 1|D opc2 1|(0)(0)(0)(0)|


A1 variant


VMOV{<c>}{<q>}{.<size>} <Dd[x]>, <Rt> 


Decode for this encoding 


case opc1:opc2 of
    when '1xxx' advsimd = TRUE; esize = 8; index = UInt(opc1<0>:opc2);
    when '0xx1' advsimd = TRUE; esize = 16; index = UInt(opc1<0>:opc2<1>);
    when '0x00' advsimd = FALSE; esize = 32; index = UInt(opc1<0>);
    when '0x10' UNDEFINED;
d = UInt(D:Vd); t = UInt(Rt);
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


T1 


|15 14 13 12|11 10 9 8|7 6  5  4|3  0|15 12|11 10 9 8|7 6  5 4|3 2 1 0     |
| 1  1  1  0| 1  1 1 0|0 opc1  0| Vd | Rt  | 1  0 1 1|D opc2 1|(0)(0)(0)(0)|


T1 variant 


VMOV{<c>}{<q>}{.<size>} <Dd[x]>, <Rt> 


Decode for this encoding 


case opc1:opc2 of
    when '1xxx' advsimd = TRUE; esize = 8; index = UInt(opc1<0>:opc2);
    when '0xx1' advsimd = TRUE; esize = 16; index = UInt(opc1<0>:opc2<1>);
    when '0x00' advsimd = FALSE; esize = 32; index = UInt(opc1<0>);
    when '0x10' UNDEFINED;
d = UInt(D:Vd); t = UInt(Rt);
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<size> The data size. It must be one of:
8 Encoded as opc1<1> = 1. [x] is encoded in opc1<0>, opc2.
16 Encoded as opc1<1> = 0, opc2<0> = 1. [x] is encoded in opc1<0>, opc2<1>.
32 Encoded as opc1<1> = 0, opc2 = 0b00. [x] is encoded in opc1<0>.
omitted Equivalent to 32.

<Dd[x]> The scalar. The register <Dd> is encoded in D:Vd. For details of how [x] is encoded, see the
description of <size>.


<Rt> The source general-purpose register. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    Elem[D[d],index,esize] = R[t]<esize-1:0>;
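
The element write in the operation above can be sketched in Python, modelling the D register as a 64-bit integer. The helper name set_elem is hypothetical and stands in for the Elem[] assignment. It is an illustration, not the architectural definition.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def set_elem(d_value, index, esize, r_value):
    """Write the low esize bits of r_value into element index of d_value.

    All other elements of the 64-bit register value are left unchanged.
    """
    mask = (1 << esize) - 1
    shift = index * esize
    cleared = d_value & ~(mask << shift) & MASK64
    return cleared | ((r_value & mask) << shift)
```

For a halfword transfer (esize = 16) to index 3, only bits <63:48> of the register change.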
F6.1.120 VMOV (between general-purpose register and single-precision)
Copy a general-purpose register to or from a 32-bit SIMD&FP register. This instruction transfers the value held in 
a 32-bit SIMD&FP register to a general-purpose register, or the value held in a general-purpose register to a 32-bit 
SIMD&FP register. 
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6   5   4|3 2 1 0     |
|cond | 1  1  1  0| 0  0  0 op| Vn  | Rt  | 1  0 1 0|N (0)(0)  1|(0)(0)(0)(0)|
From general-purpose register variant 
Applies when op == 0. 
VMOV{<c>}{<q>} <Sn>, <Rt> 
To general-purpose register variant 
Applies when op == 1. 
VMOV{<c>}{<q>} <Rt>, <Sn> 
Decode for all variants of this encoding 
if size != '10' then UNDEFINED;
to_arm_register = (op == '1'); t = UInt(Rt); n = UInt(Vn:N);
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
T1 
|15 14 13 12|11 10 9 8|7 6 5  4|3  0|15 12|11 10 9 8|7 6   5   4|3 2 1 0     |
| 1  1  1  0| 1  1 1 0|0 0 0 op| Vn | Rt  | 1  0 1 0|N (0)(0)  1|(0)(0)(0)(0)|
From general-purpose register variant 
Applies when op == 0. 
VMOV{<c>}{<q>} <Sn>, <Rt> 
To general-purpose register variant 
Applies when op == 1. 
VMOV{<c>}{<q>} <Rt>, <Sn> 
Decode for all variants of this encoding 
if size != '10' then UNDEFINED;
to_arm_register = (op == '1'); t = UInt(Rt); n = UInt(Vn:N);
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<Rt> Is the general-purpose register that <Sn> will be transferred to or from, encoded in the "Rt" field. 
<Sn> Is the 32-bit name of the SIMD&FP register to be transferred, encoded in the "Vn:N" field. 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_arm_register then
        R[t] = S[n];
    else
        S[n] = R[t];

F6.1.121 VMOV (scalar to general-purpose register)
Copy a vector element to a general-purpose register with sign or zero extension copies a byte, halfword, or word 
from an Advanced SIMD scalar to a general-purpose register. Bytes and halfwords can be either zero-extended or 
sign-extended. 
On a Floating-point-only system, this instruction transfers one word from the upper or lower half of a 
double-precision floating-point register to a general-purpose register. This is an identical operation to the Advanced 
SIMD single word transfer. 
For more information about scalars see Advanced SIMD scalars on page F2-2432. 
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6  5 4|3 2 1 0     |
|cond | 1  1  1  0| U opc1   1| Vn  | Rt  | 1  0 1 1|N opc2 1|(0)(0)(0)(0)|

A1 variant
VMOV{<c>}{<q>}{.<dt>} <Rt>, <Dn[x]> 
Decode for this encoding 
case U:opc1:opc2 of
    when 'x1xxx' advsimd = TRUE; esize = 8; index = UInt(opc1<0>:opc2);
    when 'x0xx1' advsimd = TRUE; esize = 16; index = UInt(opc1<0>:opc2<1>);
    when '00x00' advsimd = FALSE; esize = 32; index = UInt(opc1<0>);
    when '10x00' UNDEFINED;
    when 'x0x10' UNDEFINED;
t = UInt(Rt); n = UInt(N:Vn); unsigned = (U == '1');
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
T1 
|15 14 13 12|11 10 9 8|7 6  5  4|3  0|15 12|11 10 9 8|7 6  5 4|3 2 1 0     |
| 1  1  1  0| 1  1 1 0|U opc1  1| Vn | Rt  | 1  0 1 1|N opc2 1|(0)(0)(0)(0)|

T1 variant
VMOV{<c>}{<q>}{.<dt>} <Rt>, <Dn[x]> 
Decode for this encoding 
case U:opc1:opc2 of
    when 'x1xxx' advsimd = TRUE; esize = 8; index = UInt(opc1<0>:opc2);
    when 'x0xx1' advsimd = TRUE; esize = 16; index = UInt(opc1<0>:opc2<1>);
    when '00x00' advsimd = FALSE; esize = 32; index = UInt(opc1<0>);
    when '10x00' UNDEFINED;
    when 'x0x10' UNDEFINED;
t = UInt(Rt); n = UInt(N:Vn); unsigned = (U == '1');
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<dt> The data type. It must be one of:
S8 Encoded as U = 0, opc1<1> = 1. [x] is encoded in opc1<0>, opc2.
S16 Encoded as U = 0, opc1<1> = 0, opc2<0> = 1. [x] is encoded in opc1<0>, opc2<1>.
U8 Encoded as U = 1, opc1<1> = 1. [x] is encoded in opc1<0>, opc2.
U16 Encoded as U = 1, opc1<1> = 0, opc2<0> = 1. [x] is encoded in opc1<0>, opc2<1>.
32 Encoded as U = 0, opc1<1> = 0, opc2 = 0b00. [x] is encoded in opc1<0>.
omitted Equivalent to 32.


<Rt> The destination general-purpose register. 


<Dn[x]> The scalar. For details of how [x] is encoded see the description of <dt>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if unsigned then
        R[t] = ZeroExtend(Elem[D[n],index,esize], 32);
    else
        R[t] = SignExtend(Elem[D[n],index,esize], 32);
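
The extraction with zero or sign extension can be sketched in Python as follows. The D register is modelled as a 64-bit integer and the result as a 32-bit unsigned pattern. The function name get_elem_extended is hypothetical.

```python
def get_elem_extended(d_value, index, esize, unsigned):
    """Read element index of width esize and extend it to 32 bits.

    Returns the 32-bit pattern that would be written to R[t].
    """
    elem = (d_value >> (index * esize)) & ((1 << esize) - 1)
    if not unsigned and elem & (1 << (esize - 1)):
        # Sign bit set: subtract 2**esize to get the negative value.
        elem -= 1 << esize
    return elem & 0xFFFFFFFF
```

A byte element 0x80 therefore reads as 0x00000080 for U8 and as 0xFFFFFF80 for S8.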
F6.1.122 VMOV (between two general-purpose registers and two single-precision registers)


Copy two general-purpose registers to a pair of 32-bit SIMD&FP registers transfers the contents of two 
consecutively numbered single-precision Floating-point registers to two general-purpose registers, or the contents 
of two general-purpose registers to a pair of single-precision Floating-point registers. The general-purpose registers 
do not have to be contiguous. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3  0|
|cond | 1  1  0  0| 0  1  0 op| Rt2 | Rt  | 1  0 1 0|0 0 M 1| Vm |


From general-purpose registers variant 
Applies when op == 0. 
VMOV{<c>}{<q>} <Sm>, <Sm1>, <Rt>, <Rt2>
To general-purpose registers variant 
Applies when op == 1. 
VMOV{<c>}{<q>} <Rt>, <Rt2>, <Sm>, <Sm1> 
Decode for all variants of this encoding 
to_arm_registers = (op == '1'); t = UInt(Rt); t2 = UInt(Rt2); m = UInt(Vm:M); 


if t == 15 || t2 == 15 || m == 31 then UNPREDICTABLE; 
if to_arm_registers && t == t2 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If to_arm_registers && t == t2, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
° The value in the destination register is UNKNOWN. 


If m == 31, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
. The instruction executes as NOP. 
° One or more of the single-precision registers become UNKNOWN for a move to the single-precision register. 


The general-purpose registers listed in the instruction become UNKNOWN for a move from the 
single-precision registers. This behavior does not affect any other general-purpose registers. 


T1 


|15 14 13 12|11 10 9 8|7 6 5  4|3   0|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  0| 1  1 0 0|0 1 0 op| Rt2 | Rt  | 1  0 1 0|0 0 M 1| Vm |

From general-purpose registers variant
Applies when op == 0. 
VMOV{<c>}{<q>} <Sm>, <Sm1>, <Rt>, <Rt2>
To general-purpose registers variant 
Applies when op == 1. 
VMOV{<c>}{<q>} <Rt>, <Rt2>, <Sm>, <Sm1> 
Decode for all variants of this encoding 
to_arm_registers = (op == '1'); t = UInt(Rt); t2 = UInt(Rt2); m = UInt(Vm:M); 


if t == 15 || t2 == 15 || m == 31 then UNPREDICTABLE; 
if to_arm_registers && t == t2 then UNPREDICTABLE; 


CONSTRAINED UNPREDICTABLE behavior 


If to_arm_registers && t == t2, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as NOP. 
. The value in the destination register is UNKNOWN. 


If m == 31, then one of the following behaviors must occur: 


° The instruction is UNDEFINED. 
. The instruction executes as NOP. 
° One or more of the single-precision registers become UNKNOWN for a move to the single-precision register. 


The general-purpose registers listed in the instruction become UNKNOWN for a move from the 
single-precision registers. This behavior does not affect any other general-purpose registers. 
Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VMOV (between two general-purpose 
registers and two single-precision registers) on page K1-5472. 


Assembler symbols 


<Rt2> Is the second general-purpose register that <Sm1> will be transferred to or from, encoded in the "Rt2"
field.

<Rt> Is the first general-purpose register that <Sm> will be transferred to or from, encoded in the "Rt" field. 

<Sm1> Is the 32-bit name of the second SIMD&FP register to be transferred. This is the next SIMD&FP 
register after <Sm>. 

<Sm> Is the 32-bit name of the first SIMD&FP register to be transferred, encoded in the "Vm:M" field. 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    if to_arm_registers then
        R[t] = S[m];
        R[t2] = S[m+1];
    else
        S[m] = R[t];
        S[m+1] = R[t2];
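
The decode checks for this instruction, including the m == 31 case in which S[m+1] does not exist, can be sketched in Python. This is an illustration under the assumption that the fields are passed in as integers. The function name decode_vmov_two_singles is hypothetical, and the sketch only reports whether the encoding is UNPREDICTABLE, not which CONSTRAINED UNPREDICTABLE behavior an implementation chooses.

```python
def decode_vmov_two_singles(op, rt, rt2, vm, m_bit):
    """Return (to_arm_registers, m, unpredictable) for the given fields."""
    to_arm_registers = op == 1
    m = (vm << 1) | m_bit  # m = UInt(Vm:M)
    unpredictable = (
        rt == 15
        or rt2 == 15
        or m == 31  # S31 has no successor register S[m+1]
        or (to_arm_registers and rt == rt2)
    )
    return to_arm_registers, m, unpredictable
```

Note that t == t2 is only flagged for the move to the general-purpose registers, matching the decode pseudocode.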
F6.1.123 VMOVL


Vector Move Long takes each element in a doubleword vector, sign or zero-extends them to twice their original 
length, and places the results in a quadword vector. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28|27 26 25 24|23 22 21  19|18 17 16|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  1| 0  0  1  U| 1  D  imm3H| 0  0  0| Vd  | 1  0 1 0|0 0 M 1| Vm |

A1 variant


VMOVL{<c>}{<q>}.<dt> <Qd>, <Dm> 


Decode for this encoding 


if imm3H == '000' then SEE "Related encodings";
if imm3H != '001' && imm3H != '010' && imm3H != '100' then SEE VSHLL;
if Vd<0> == '1' then UNDEFINED;
esize = 8 * UInt(imm3H);
unsigned = (U == '1'); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm);


T1 


|15 14 13 12|11 10 9 8|7 6 5    3|2 1 0|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  U| 1  1 1 1|1 D imm3H |0 0 0| Vd  | 1  0 1 0|0 0 M 1| Vm |


T1 variant 


VMOVL{<c>}{<q>}.<dt> <Qd>, <Dm> 


Decode for this encoding 


if imm3H == '000' then SEE "Related encodings";
if imm3H != '001' && imm3H != '010' && imm3H != '100' then SEE VSHLL;
if Vd<0> == '1' then UNDEFINED;
esize = 8 * UInt(imm3H);
unsigned = (U == '1'); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm);


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the elements of the operand, encoded in the "U:imm3H" field. It can have the
following values:
S8 when U = 0, imm3H = 001
S16 when U = 0, imm3H = 010
S32 when U = 0, imm3H = 100
U8 when U = 1, imm3H = 001
U16 when U = 1, imm3H = 010
U32 when U = 1, imm3H = 100

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        result = Int(Elem[Din[m],e,esize], unsigned);
        Elem[Q[d>>1],e,2*esize] = result<2*esize-1:0>;
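
The widening loop above can be sketched in Python, with the doubleword source modelled as a 64-bit integer and the quadword result as a 128-bit integer. The function name vmovl is used here only as a label for the sketch and is not an architectural function.

```python
def vmovl(dm, esize, unsigned):
    """Widen each esize-bit element of a 64-bit vector to 2*esize bits."""
    elements = 64 // esize
    mask = (1 << esize) - 1
    wide_mask = (1 << (2 * esize)) - 1
    result = 0
    for e in range(elements):
        elem = (dm >> (e * esize)) & mask
        if not unsigned and elem & (1 << (esize - 1)):
            elem -= 1 << esize  # interpret the element as a signed integer
        result |= (elem & wide_mask) << (e * 2 * esize)
    return result
```

For S8, the byte 0x80 widens to the halfword 0xFF80. For U8 it widens to 0x0080.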
F6.1.124 VMOVN


Vector Move and Narrow copies the least significant half of each element of a quadword vector into the 
corresponding elements of a doubleword vector. 


The operand vector elements can be any one of 16-bit, 32-bit, or 64-bit integers. There is no distinction between 
signed and unsigned integers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


This instruction is used by the pseudo-instructions VRSHRN (zero) and VSHRN (zero). The pseudo-instruction is 
never the preferred disassembly. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  1| 0  0  1  1| 1  D  1  1| size  1  0| Vd  | 0  0 1 0|0 0 M 0| Vm |


A1 variant


VMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


Decode for this encoding 


if size == '11' then UNDEFINED; 
if Vm<@> == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm); 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2  1 0|15 12|11 10 9 8|7 6 5 4|3  0|
| 1  1  1  1| 1  1 1 1|1 D 1 1|size 1 0| Vd  | 0  0 1 0|0 0 M 0| Vm |


T1 variant 


VMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


Decode for this encoding 


if size == '11' then UNDEFINED; 
if Vm<@> == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); m = UInt(M:Vm);

Alias conditions

Pseudo-instruction    is preferred when
VRSHRN (zero)         Never
VSHRN (zero)          Never

Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following
values:
I16 when size = 00
I32 when size = 01
I64 when size = 10
The encoding size = 11 is reserved.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        Elem[D[d],e,esize] = Elem[Qin[m>>1],e,2*esize]<esize-1:0>;
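
The narrowing loop above can be sketched in Python, with the quadword source modelled as a 128-bit integer. Here esize is the destination element size, as in the decode pseudocode, so the source elements are 2*esize bits wide. The function name vmovn is only a label for the sketch.

```python
def vmovn(qm, esize):
    """Keep the least significant esize bits of each 2*esize-bit element."""
    elements = 64 // esize
    mask = (1 << esize) - 1
    result = 0
    for e in range(elements):
        wide = qm >> (e * 2 * esize)  # source element e starts at this bit
        result |= (wide & mask) << (e * esize)
    return result
```

For the I16 data type (esize = 8), the source halfwords 0x5678 and 0x1234 narrow to the bytes 0x78 and 0x34.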
F6.1.125 VMRS


Move SIMD&FP Special register to general-purpose register moves the value of an Advanced SIMD and 
floating-point System register to a general-purpose register. When the specified System register is the FPSCR, a 
form of the instruction transfers the FPSCR.{N, Z, C, V} condition flags to the APSR.{N, Z, C, V} condition flags. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


When these settings permit the execution of floating-point and Advanced SIMD instructions, if the specified 
floating-point System register is not the FPSCR, the instruction is UNDEFINED if executed in User mode. 


In an implementation that includes EL2, when HCR.TID0 is set to 1, any VMRS access to FPSID from a Non-secure
EL1 mode that would be permitted if HCR.TID0 was set to 0 generates a Hyp Trap exception. For more information,
see ID group 0, Primary device identification registers on page G1-3902.


For simplicity, the VMRS pseudocode does not show the possible trap to Hyp mode. 





A1

|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 2 1 0|
cond

A1 variant


VMRS{<c>}{<q>} <Rt>, <spec_reg> 


Decode for this encoding 
t = UInt(Rt); 


if !(reg IN {'000x', '0101', '011x', '1000'}) then UNPREDICTABLE;
if t == 15 && reg != '0001' then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13


CONSTRAINED UNPREDICTABLE behavior


If !(reg IN {'000x', '0101', '011x', '1000'}), then one of the following behaviors must occur:


• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction transfers an UNKNOWN value to the specified target register. When the Rt field holds the value
  0b1111, the specified target register is the APSR.{N, Z, C, V} bits, and these bits become UNKNOWN.
  Otherwise, the specified target register is the register specified by the Rt field, R0 - R14.


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 2 1 0| 





T1 variant 


VMRS{<c>}{<q>} <Rt>, <spec_reg> 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3525 
1ID092916 Non-Confidential 


F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions 
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


Decode for this encoding 


t = UInt(Rt); 


if !(reg IN {'000x', '0101', '011x', '1000'}) then UNPREDICTABLE; 
if t == 15 && reg != '0001' then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 


CONSTRAINED UNPREDICTABLE behavior 


If !(reg IN {'000x', '0101', '011x', '1000'}), then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction transfers an UNKNOWN value to the specified target register. When the Rt field holds the value 
0b1111, the specified target register is the APSR.{N, Z, C, V} bits, and these bits become UNKNOWN. 
Otherwise, the specified target register is the register specified by the Rt field, R0 - R14. 


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Rt> Is the general-purpose destination register, encoded in the "Rt" field. Is one of: 
R0-R14 General-purpose register. 
APSR_nzcv Permitted only when <spec_reg> is FPSCR. Encoded as 0b1111. The instruction transfers 
the FPSCR.{N, Z, C, V} condition flags to the APSR.{N, Z, C, V} condition flags. See The 
Application Program Status Register, APSR on page E1-2296. 


<spec_reg> Is the source Advanced SIMD and floating-point System register, encoded in the "reg" field. It can 
have the following values: 

FPSID when reg = 0000 
FPSCR when reg = 0001 
MVFR2 when reg = 0101 
MVFR1 when reg = 0110 
MVFR0 when reg = 0111 
FPEXC when reg = 1000 

The following encodings are UNPREDICTABLE: 
• reg = 001x. 
• reg = 0100. 
• reg = 1001. 
• reg = 101x. 
• reg = 11xx. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    if reg == '0001' then // FPSCR 
        CheckVFPEnabled(TRUE); 
        if t == 15 then 
            PSTATE.<N,Z,C,V> = FPSR.<N,Z,C,V>; 
        else 
            R[t] = FPSCR; 
    elsif PSTATE.EL == EL0 then 
        UNDEFINED; // Non-FPSCR registers accessible only at PL1 or above 
    else 
        CheckVFPEnabled(FALSE); // Non-FPSCR registers are not affected by FPEXC.EN 
        case reg of 
            // Pseudocode does not consider possible HCR.TIDn Hyp Traps of Non-secure register reads 
            when '0000' R[t] = FPSID; 
            when '0101' R[t] = MVFR2; 
            when '0110' R[t] = MVFR1; 
            when '0111' R[t] = MVFR0; 
            when '1000' R[t] = FPEXC; 
            otherwise Unreachable(); // Dealt with above or in encoding-specific pseudocode 
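
The mapping from the "reg" field to the source register, together with the restriction on Rt == 0b1111, can be illustrated with a small Python sketch. This is not part of the architecture pseudocode: the function name decode_vmrs and the returned strings are hypothetical, and the sketch ignores privilege checks, enables, and Hyp traps.

```python
# Illustrative only: models the VMRS "reg"/"Rt" decode rules described above.
VMRS_SOURCES = {
    0b0000: "FPSID",
    0b0001: "FPSCR",
    0b0101: "MVFR2",
    0b0110: "MVFR1",
    0b0111: "MVFR0",
    0b1000: "FPEXC",
}


def decode_vmrs(reg: int, rt: int) -> str:
    """Describe the transfer selected by reg and Rt, or return 'UNPREDICTABLE'."""
    if not (0 <= reg <= 0xF and 0 <= rt <= 0xF):
        raise ValueError("reg and Rt are 4-bit fields")
    name = VMRS_SOURCES.get(reg)
    if name is None:
        # reg is not one of 000x, 0101, 011x, 1000
        return "UNPREDICTABLE"
    if rt == 15:
        # Rt == 0b1111 selects APSR_nzcv, permitted only for FPSCR
        if name != "FPSCR":
            return "UNPREDICTABLE"
        return "APSR_nzcv <- FPSCR.{N,Z,C,V}"
    return f"R{rt} <- {name}"
```

For example, decode_vmrs(0b0001, 15) describes the flag transfer form, while decode_vmrs(0b0100, 0) falls into the UNPREDICTABLE encodings listed above.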













F6.1.126 VMSR 
Move general-purpose register to SIMD&FP Special register moves the value of a general-purpose register to a 
floating-point System register. 
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
When these settings permit the execution of floating-point and Advanced SIMD instructions: 
• If the specified floating-point System register is not the FPSCR, the instruction is UNDEFINED if executed in 
User mode. 
• If the specified floating-point System register is the FPSID and the instruction is executed in a mode other 
than User mode, the instruction is ignored. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 2 1 0| 
| cond | 1 1 1 0 | 1 1 1 0 | reg | Rt | 1 0 1 0 | (0) (0) (0) 1 | (0) (0) (0) (0) | 
A1 variant 
VMSR{<c>}{<q>} <spec_reg>, <Rt> 
Decode for this encoding 
t = UInt(Rt); 
if reg != '000x' && reg != '1000' then UNPREDICTABLE; 
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
CONSTRAINED UNPREDICTABLE behavior 
If reg != '000x' && reg != '1000', then one of the following behaviors must occur: 
• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction transfers the value in the general-purpose register to one of the allocated registers accessible 
using VMSR at the same Exception level. 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 2 1 0| 
| 1 1 1 0 | 1 1 1 0 | 1 1 1 0 | reg | Rt | 1 0 1 0 | (0) (0) (0) 1 | (0) (0) (0) (0) | 
T1 variant 
VMSR{<c>}{<q>} <spec_reg>, <Rt> 
Decode for this encoding 
t = UInt(Rt); 
if reg != '000x' && reg != '1000' then UNPREDICTABLE; 
if t == 15 then UNPREDICTABLE; // ARMv8-A removes UNPREDICTABLE for R13 
CONSTRAINED UNPREDICTABLE behavior 

If reg != '000x' && reg != '1000', then one of the following behaviors must occur: 

• The instruction is UNDEFINED. 
• The instruction executes as NOP. 
• The instruction transfers the value in the general-purpose register to one of the allocated registers accessible 
using VMSR at the same Exception level. 


Notes for all encodings 

For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 

Assembler symbols 

<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 


<spec_reg> Is the destination Advanced SIMD and floating-point System register, encoded in the "reg" field. It 
can have the following values: 


FPSID when reg = 0000 
FPSCR when reg = 0001 
FPEXC when reg = 1000 


The following encodings are UNPREDICTABLE: 
• reg = 001x. 
• reg = 01xx. 
• reg = 1001. 
• reg = 101x. 
• reg = 11xx. 
<Rt> Is the general-purpose source register, encoded in the "Rt" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); 
    if reg == '0001' then // FPSCR 
        CheckVFPEnabled(TRUE); 
        FPSCR = R[t]; 
    elsif PSTATE.EL == EL0 then 
        UNDEFINED; // Non-FPSCR registers accessible only at PL1 or above 
    else 
        CheckVFPEnabled(FALSE); // Non-FPSCR registers are not affected by FPEXC.EN 
        case reg of 
            when '0000' // VMSR access to FPSID is ignored 
            when '1000' FPEXC = R[t]; 
            otherwise Unreachable(); // Dealt with above or in encoding-specific pseudocode 

F6.1.127 VMUL (floating-point) 


Vector Multiply multiplies corresponding elements in two vectors, and places the results in the destination vector. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 1 | 0 0 1 1 | 0 D 0 sz | Vn | Vd | 1 1 0 1 | N Q M 1 | Vm | 


64-bit SIMD vector variant 

Applies when Q == 0. 
VMUL{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VMUL{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
advsimd = TRUE; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


A2 


|31 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0| 
| cond | 1 1 1 0 | 0 D 1 0 | Vn | Vd | 1 0 1 x | N 0 M 0 | Vm | 
The size field is bits<9:8>. 


Single-precision scalar variant 
Applies when size == 10. 


VMUL{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VMUL{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED; 
if size != '1x' then UNDEFINED; 
advsimd = FALSE; 
case size of 
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M); 
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 

T1 

|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 1 | 1 1 1 1 | 0 D 0 sz | Vn | Vd | 1 1 0 1 | N Q M 1 | Vm | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMUL{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMUL{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
advsimd = TRUE; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T2 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 0 | 1 1 1 0 | 0 D 1 0 | Vn | Vd | 1 0 1 x | N 0 M 0 | Vm | 
The size field is bits<9:8> of the second halfword. 
Single-precision scalar variant 
Applies when size == 10. 
VMUL{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VMUL{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 
Decode for all variants of this encoding 
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED; 
if size != '1x' then UNDEFINED; 
advsimd = FALSE; 
case size of 
when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M); 
when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDOrVFPEnabled(TRUE, advsimd); 
    if advsimd then // Advanced SIMD instruction 
        for r = 0 to regs-1 
            for e = 0 to elements-1 
                Elem[D[d+r],e,esize] = FPMul(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize], 
                                             StandardFPSCRValue()); 
    else // VFP instruction 
        case esize of 
            when 32 
                S[d] = FPMul(S[n], S[m], FPSCR); 
            when 64 
                D[d] = FPMul(D[n], D[m], FPSCR); 
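
The decode pseudocode forms register numbers by concatenating a 1-bit field with a 4-bit field, in a different order for single-precision (Vd:D) and doubleword (D:Vd) registers. The following Python sketch illustrates that arithmetic only; the helper names uint_concat, d_reg_index, and s_reg_index are hypothetical and do not appear in the architecture pseudocode.

```python
def uint_concat(hi: int, hi_bits: int, lo: int, lo_bits: int) -> int:
    """Return UInt(hi:lo) for two bit fields of the given widths."""
    if not (0 <= hi < (1 << hi_bits) and 0 <= lo < (1 << lo_bits)):
        raise ValueError("field value does not fit its width")
    return (hi << lo_bits) | lo


def d_reg_index(d_bit: int, vd: int) -> int:
    """Doubleword register number, UInt(D:Vd): D is the most significant bit."""
    return uint_concat(d_bit, 1, vd, 4)


def s_reg_index(vd: int, d_bit: int) -> int:
    """Single-precision register number, UInt(Vd:D): D is the least significant bit."""
    return uint_concat(vd, 4, d_bit, 1)
```

With D = 1 and Vd = 0b0010, the doubleword form selects D18 and the single-precision form selects S5, which shows why the two field orders are not interchangeable.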
F6.1.128 VMUL (integer and polynomial) 
Vector Multiply multiplies corresponding elements in two vectors. 
For information about multiplying polynomials see Polynomial arithmetic over {0, 1} on page A1-45. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 1 | 0 0 1 op | 0 D size | Vn | Vd | 1 0 0 1 | N Q M 1 | Vm | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMUL{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMUL{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 

if size == '11' || (op == '1' && size != '00') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality 
polynomial = (op == '1'); long_destination = FALSE; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 op | 1 1 1 1 | 0 D size | Vn | Vd | 1 0 0 1 | N Q M 1 | Vm | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMUL{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMUL{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if size == '11' || (op == '1' && size != '00') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality 
polynomial = (op == '1'); long_destination = FALSE; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 

Notes for all encodings 
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "op:size" field. It can have the 
following values: 


I8 when op = 0, size = 00 
I16 when op = 0, size = 01 
I32 when op = 0, size = 10 
P8 when op = 1, size = 00 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned); 
            op2 = Elem[Din[m+r],e,esize]; op2val = Int(op2, unsigned); 
            if polynomial then 
                product = PolynomialMult(op1,op2); 
            else 
                product = (op1val*op2val)<2*esize-1:0>; 
            if long_destination then 
                Elem[Q[d>>1],e,2*esize] = product; 
            else 
                Elem[D[d+r],e,esize] = product<esize-1:0>; 
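
The per-element behavior can be illustrated with a short Python sketch. It assumes that polynomial multiplication over {0, 1} is a carry-less product, in which partial products are combined with exclusive-OR instead of addition, and it treats the integer operands as unsigned because the low esize bits of the product do not depend on signedness. The function names polynomial_mult and vmul_element are hypothetical.

```python
def polynomial_mult(op1: int, op2: int, esize: int) -> int:
    """Carry-less product of two esize-bit values; fits in 2*esize bits."""
    mask = (1 << esize) - 1
    op1 &= mask
    op2 &= mask
    result = 0
    for i in range(esize):
        if (op1 >> i) & 1:
            result ^= op2 << i  # XOR replaces addition of partial products
    return result


def vmul_element(op1: int, op2: int, esize: int, polynomial: bool) -> int:
    """One destination element of VMUL: the low esize bits of the product."""
    mask = (1 << esize) - 1
    if polynomial:
        product = polynomial_mult(op1, op2, esize)
    else:
        product = (op1 & mask) * (op2 & mask)
    return product & mask
```

For instance, the polynomial product of 0b11 and 0b11 is 0b101, whereas the integer product is 0b1001, which is the distinction the polynomial flag selects.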
F6.1.129 VMUL (by scalar) 
Vector Multiply multiplies each element in a vector by a scalar, and places the results in a second vector. 
For more information about scalars see Advanced SIMD scalars on page F2-2432. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 1 | 0 0 1 Q | 1 D size | Vn | Vd | 1 0 0 F | N 1 M 0 | Vm | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMUL{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>[<index>] 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMUL{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm>[<index>] 


Decode for all variants of this encoding 


if size == '11' then SEE "Related encodings"; 
if size == '00' || (F == '1' && size == '01') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED; 
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality 
floating_point = (F == '1'); long_destination = FALSE; 
d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2; 
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>); 
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M); 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 Q | 1 1 1 1 | 1 D size | Vn | Vd | 1 0 0 F | N 1 M 0 | Vm | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMUL{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>[<index>] 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMUL{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm>[<index>] 


Decode for all variants of this encoding 


if size == '11' then SEE "Related encodings"; 
if size == '00' || (F == '1' && size == '01') then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED; 
unsigned = FALSE; // "Don't care" value: TRUE produces same functionality 
floating_point = (F == '1'); long_destination = FALSE; 
d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2; 
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>); 
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M); 


Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the scalar and the elements of the operand vector, encoded in the "F:size" field. 
It can have the following values: 
I16 when F = 0, size = 01 
I32 when F = 0, size = 10 
F32 when F = 1, size = 10 
The following encodings are reserved: 
• F = 0, size = 00. 
• F = 0, size = 11. 
• F = 1, size = 0x. 
• F = 1, size = 11. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Vm<2:0>" field when 
<dt> is I16, otherwise the "Vm" field. 
<index> Is the element index in the range 0 to 3, encoded in the "M:Vm<3>" field when <dt> is I16, 
otherwise in range 0 to 1, encoded in the "M" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned); 
            if floating_point then 
                Elem[D[d+r],e,esize] = FPMul(op1, op2, StandardFPSCRValue()); 
            else 
                if long_destination then 
                    Elem[Q[d>>1],e,2*esize] = (op1val*op2val)<2*esize-1:0>; 
                else 
                    Elem[D[d+r],e,esize] = (op1val*op2val)<esize-1:0>; 
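
How the scalar register number and element index are extracted from the M and Vm fields depends on the element size. A minimal Python sketch of that decode step follows; the function name decode_scalar is hypothetical, and the reserved and UNDEFINED size values are simply rejected with an exception.

```python
def decode_scalar(size: int, m_bit: int, vm: int):
    """Return (esize, elements, m, index) for the by-scalar forms.

    size is the 2-bit size field, m_bit the 1-bit M field, vm the 4-bit Vm field.
    """
    if not (0 <= m_bit <= 1 and 0 <= vm <= 0xF):
        raise ValueError("M is 1 bit and Vm is 4 bits")
    if size == 0b01:
        # 16-bit elements: m = UInt(Vm<2:0>), index = UInt(M:Vm<3>)
        return 16, 4, vm & 0b111, (m_bit << 1) | (vm >> 3)
    if size == 0b10:
        # 32-bit elements: m = UInt(Vm), index = UInt(M)
        return 32, 2, vm, m_bit
    raise ValueError("size is not '01' or '10'")
```

With 16-bit elements the scalar can only come from D0 to D7, because one bit of Vm is needed for the index; with 32-bit elements it can come from D0 to D15.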







F6.1.130 VMULL (integer and polynomial) 


Vector Multiply Long multiplies corresponding elements in two vectors. The destination vector elements are twice 
as long as the elements that are multiplied. 


For information about multiplying polynomials see Polynomial arithmetic over {0, 1} on page A1-45. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 1 | 0 0 1 U | 1 D size | Vn | Vd | 1 1 op 0 | N 0 M 0 | Vm | 
A1 variant 


VMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 
unsigned = (U == '1'); polynomial = (op == '1'); long_destination = TRUE; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
if polynomial then 
    if U == '1' || size == '01' then UNDEFINED; 
    if size == '10' then // .p64 
        if !HaveCryptoExt() then UNDEFINED; 
        if InITBlock() then UNPREDICTABLE; 
        esize = 64; elements = 1; 
if Vd<0> == '1' then UNDEFINED; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0| 

size 


T1 variant 


VMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 
unsigned = (U == '1'); polynomial = (op == '1'); long_destination = TRUE; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
if polynomial then 
    if U == '1' || size == '01' then UNDEFINED; 
    if size == '10' then // .p64 
        if !HaveCryptoExt() then UNDEFINED; 
        if InITBlock() then UNPREDICTABLE; 
        esize = 64; elements = 1; 
if Vd<0> == '1' then UNDEFINED; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = 1; 

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "op:U:size" field. It can have the 
following values: 
S8 when op = 0, U = 0, size = 00 
S16 when op = 0, U = 0, size = 01 
S32 when op = 0, U = 0, size = 10 
U8 when op = 0, U = 1, size = 00 
U16 when op = 0, U = 1, size = 01 
U32 when op = 0, U = 1, size = 10 
P8 when op = 1, U = 0, size = 00 
P64 when op = 1, U = 0, size = 10 
The following encodings are reserved: 
• op = 0, U = 0, size = 11. 
• op = 0, U = 1, size = 11. 
• op = 1, U = 0, size = 01. 
• op = 1, U = 0, size = 11. 
• op = 1, U = 1, size = xx. 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned); 
            op2 = Elem[Din[m+r],e,esize]; op2val = Int(op2, unsigned); 
            if polynomial then 
                product = PolynomialMult(op1,op2); 
            else 
                product = (op1val*op2val)<2*esize-1:0>; 
            if long_destination then 
                Elem[Q[d>>1],e,2*esize] = product; 
            else 
                Elem[D[d+r],e,esize] = product<esize-1:0>; 
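
Unlike VMUL, the long form keeps all 2*esize bits of the product, so the signedness of the operands matters. The following Python sketch models one integer element of the result under that reading of the pseudocode; the helper names to_int and vmull_element are hypothetical, and the polynomial forms are not covered.

```python
def to_int(value: int, bits: int, unsigned: bool) -> int:
    """Interpret the low `bits` bits of value as an unsigned or signed integer."""
    value &= (1 << bits) - 1
    if not unsigned and (value >> (bits - 1)) & 1:
        value -= 1 << bits  # two's complement sign adjustment
    return value


def vmull_element(op1: int, op2: int, esize: int, unsigned: bool) -> int:
    """One destination element of integer VMULL, as a 2*esize-bit pattern."""
    product = to_int(op1, esize, unsigned) * to_int(op2, esize, unsigned)
    return product & ((1 << (2 * esize)) - 1)
```

The same 8-bit inputs 0xFF and 0x02 give 0x01FE for the U8 data type and 0xFFFE for S8, because 0xFF is 255 when unsigned and -1 when signed.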


F6.1.131 VMULL (by scalar) 


Vector Multiply Long multiplies each element in a vector by a scalar, and places the results in a second vector. The 
destination vector elements are twice as long as the elements that are multiplied. 


For more information about scalars see Advanced SIMD scalars on page F2-2432. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 1 | 0 0 1 U | 1 D size | Vn | Vd | 1 0 1 0 | N 1 M 0 | Vm | 

A1 variant 


VMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>[<index>] 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 
if size == '00' || Vd<0> == '1' then UNDEFINED; 
unsigned = (U == '1'); long_destination = TRUE; floating_point = FALSE; 
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1; 
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>); 
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M); 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0| 
| 1 1 1 U | 1 1 1 1 | 1 D size | Vn | Vd | 1 0 1 0 | N 1 M 0 | Vm | 
T1 variant 


VMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>[<index>] 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 
if size == '00' || Vd<0> == '1' then UNDEFINED; 
unsigned = (U == '1'); long_destination = TRUE; floating_point = FALSE; 
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1; 
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>); 
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M); 


Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the scalar and the elements of the operand vector, encoded in the "U:size" field. 
It can have the following values: 
S16 when U = 0, size = 01 
S32 when U = 0, size = 10 
U16 when U = 1, size = 01 
U32 when U = 1, size = 10 


The following encodings are reserved: 
• U = 0, size = 00. 
• U = 0, size = 11. 
• U = 1, size = 00. 
• U = 1, size = 11. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "Vm<2:0>" field when 
<dt> is S16 or U16, otherwise the "Vm" field. 
<index> Is the element index in the range 0 to 3, encoded in the "M:Vm<3>" field when <dt> is S16 or U16, 
otherwise in range 0 to 1, encoded in the "M" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned); 
            if floating_point then 
                Elem[D[d+r],e,esize] = FPMul(op1, op2, StandardFPSCRValue()); 
            else 
                if long_destination then 
                    Elem[Q[d>>1],e,2*esize] = (op1val*op2val)<2*esize-1:0>; 
                else 
                    Elem[D[d+r],e,esize] = (op1val*op2val)<esize-1:0>; 







F6.1.132 VMVN (immediate) 


Vector Bitwise NOT (immediate) places the bitwise inverse of an immediate integer constant into every element of 
the destination register. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19|18 16|15 12|11 8|7 6 5 4|3 0| 
| 1 1 1 1 | 0 0 1 i | 1 D 0 0 | 0 | imm3 | Vd | cmode | 0 Q 1 1 | imm4 | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMVN{<c>}{<q>}.<dt> <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMVN{<c>}{<q>}.<dt> <Qd>, #<imm> 


Decode for all variants of this encoding 
if (cmode<0> == '1' && cmode<3:2> != '11') || cmode<3:1> == '111' then SEE "Related encodings"; 
if Q == '1' && Vd<0> == '1' then UNDEFINED; 
imm64 = AdvSIMDExpandImm('1', cmode, i:imm3:imm4); 
d = UInt(D:Vd); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3|2 0|15 12|11 8|7 6 5 4|3 0| 
| 1 1 1 i | 1 1 1 1 | 1 D 0 0 | 0 | imm3 | Vd | cmode | 0 Q 1 1 | imm4 | 


64-bit SIMD vector variant 
Applies when Q == 0. 


VMVN{<c>}{<q>}.<dt> <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMVN{<c>}{<q>}.<dt> <Qd>, #<imm> 


Decode for all variants of this encoding 


if (cmode<0> == '1' && cmode<3:2> != '11') || cmode<3:1> == '111' then SEE "Related encodings"; 
if Q == '1' && Vd<0> == '1' then UNDEFINED; 
imm64 = AdvSIMDExpandImm('1', cmode, i:imm3:imm4); 
d = UInt(D:Vd); regs = if Q == '0' then 1 else 2; 


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> The data type. It must be either I16 or I32. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<imm> Is a constant of the type specified by <dt> that is replicated to fill the destination register. For details 
of the range of constants available and the encoding of <dt> and <imm>, see Modified immediate 
constants in T32 and A32 Advanced SIMD instructions on page F2-2423. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for r = 0 to regs-1 
        D[d+r] = NOT(imm64); 
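
The value written to each doubleword can be sketched in Python as follows. The expansion of cmode and the 8-bit immediate i:imm3:imm4 is an assumption about the modified immediate layout, which this section references but does not reproduce: a byte placed at one of four positions in each 32-bit element, at one of two positions in each 16-bit element, or followed by 8 or 16 one bits in each 32-bit element. The names replicate and vmvn_immediate are hypothetical, and any restrictions on the immediate value itself are not modelled.

```python
MASK64 = (1 << 64) - 1


def replicate(value: int, width: int, count: int) -> int:
    """Concatenate `count` copies of the low `width` bits of value."""
    value &= (1 << width) - 1
    result = 0
    for _ in range(count):
        result = (result << width) | value
    return result


def vmvn_immediate(cmode: int, imm8: int) -> int:
    """Return the 64-bit pattern NOT(imm64) for a VMVN (immediate) encoding."""
    if not (0 <= cmode <= 0xF and 0 <= imm8 <= 0xFF):
        raise ValueError("cmode is 4 bits and imm8 is 8 bits")
    top = cmode >> 1  # cmode<3:1>
    # Same condition as the decode pseudocode's SEE "Related encodings" check.
    if ((cmode & 1) and (cmode >> 2) != 0b11) or top == 0b111:
        raise ValueError("not a VMVN (immediate) encoding")
    if top <= 0b011:
        # Assumed layout: byte shifted left by 0, 8, 16 or 24 in 32-bit elements
        imm64 = replicate(imm8 << (8 * top), 32, 2)
    elif top <= 0b101:
        # Assumed layout: byte shifted left by 0 or 8 in 16-bit elements
        imm64 = replicate(imm8 << (8 * (top - 0b100)), 16, 4)
    else:
        # Assumed layout: byte followed by 8 (cmode<0> = 0) or 16 one bits
        ones = 8 if (cmode & 1) == 0 else 16
        imm64 = replicate((imm8 << ones) | ((1 << ones) - 1), 32, 2)
    return ~imm64 & MASK64
```

Under these assumptions, cmode 0b0000 with an immediate of 0xFF yields 0xFFFFFF00FFFFFF00 in each doubleword of the destination.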


F6.1.133 VMVN (register) 


Vector Bitwise NOT (register) takes a value from a register, inverts the value of each bit, and places the result in the 
destination register. The registers can be either doubleword or quadword. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


31-28  27-24  23  22  21-20  19-18  17-16  15-12  11  10-7  6  5  4  3-0
1111   0011   1   D   11     size   00     Vd     0   1011  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VMVN{<c>}{<q>}{.<dt>} <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VMVN{<c>}{<q>}{.<dt>} <Qd>, <Qm> 


Decode for all variants of this encoding 
if size != '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1


15-12  11-8  7  6  5-4  3-2   1-0  15-12  11  10-7  6  5  4  3-0
1111   1111  1  D  11   size  00   Vd     0   1011  Q  M  0  Vm


64-bit SIMD vector variant 

Applies when Q == 0. 

VMVN{<c>}{<q>}{.<dt>} <Dd>, <Dm> 

128-bit SIMD vector variant 

Applies when Q == 1. 

VMVN{<c>}{<q>}{.<dt>} <Qd>, <Qm> 

Decode for all variants of this encoding 
if size != '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Assembler symbols 





<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
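
The register encodings above can be sketched in Python as follows. This is a minimal model, assuming the D and Vd fields are passed as plain integers; the function names are hypothetical. It shows why a quadword register is encoded as <Qd>*2 and why the decode step treats an odd Vd as UNDEFINED when Q == '1'.

```python
def encode_q_register(q):
    """Return (D, Vd) for quadword register Qq, where D:Vd holds q * 2."""
    if not 0 <= q <= 15:
        raise ValueError("Q register number must be in 0..15")
    index = q * 2                      # Qq overlays D(2q) and D(2q+1)
    return index >> 4, index & 0xF     # D is the top bit, Vd the low four bits


def decode_d_index(D, Vd, Q):
    """Model of d = UInt(D:Vd), with the Q == '1' && Vd<0> == '1' check."""
    if Q == 1 and (Vd & 1) == 1:
        raise ValueError("UNDEFINED")  # an odd index cannot name a Q register
    return (D << 4) | Vd
```

For example, Q9 is encoded with D = 1 and Vd = 0b0010, which decodes back to d = 18, the first doubleword of the pair D18 and D19.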


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = NOT(D[m+r]);


F6.1.134 VNEG 


Vector Negate negates each element in a vector, and places the results in a second vector. The floating-point version 
only inverts the sign bit. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


31-28  27-24  23  22  21-20  19-18  17-16  15-12  11  10  9-7  6  5  4  3-0
1111   0011   1   D   11     size   01     Vd     0   F   111  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VNEG{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VNEG{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
advsimd = TRUE;  floating_point = (F == '1');
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


A2 

31-28    27-24  23  22  21-20  19-16  15-12  11-10  9-8   7  6  5  4  3-0
!= 1111  1110   1   D   11     0001   Vd     10     1 x   0  1  M  0  Vm
cond                                                size


Single-precision scalar variant 
Applies when size == 10. 


VNEG{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VNEG{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);


T1 


15-12  11-8  7  6  5-4  3-2   1-0  15-12  11  10  9-7  6  5  4  3-0
1111   1111  1  D  11   size  01   Vd     0   F   111  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VNEG{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VNEG{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' || (F == '1' && size != '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
advsimd = TRUE;  floating_point = (F == '1');
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T2 


15-12  11-8  7  6  5-4  3-0   15-12  11-10  9-8   7  6  5  4  3-0
1110   1110  1  D  11   0001  Vd     10     1 x   0  1  M  0  Vm
                                            size


Single-precision scalar variant 
Applies when size == 10. 


VNEG{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VNEG{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "F:size" field. It can have the
following values:

S8 when F = 0, size = 00
S16 when F = 0, size = 01
S32 when F = 0, size = 10
F32 when F = 1, size = 10

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.

<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then  // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                if floating_point then
                    Elem[D[d+r],e,esize] = FPNeg(Elem[D[m+r],e,esize]);
                else
                    result = -SInt(Elem[D[m+r],e,esize]);
                    Elem[D[d+r],e,esize] = result<esize-1:0>;
    else             // VFP instruction
        case esize of
            when 32 S[d] = FPNeg(S[m]);
            when 64 D[d] = FPNeg(D[m]);
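
The Advanced SIMD path of this pseudocode can be sketched in Python for a single doubleword register. This is a simplified model under stated assumptions: register values are plain unsigned 64-bit integers, floating-point elements are handled as raw bit patterns, and the function names are hypothetical. It illustrates that the floating-point form only inverts the sign bit, while the integer form wraps to esize bits.

```python
def fp_neg(bits, esize):
    """FPNeg on a raw encoding: invert only the sign bit."""
    return bits ^ (1 << (esize - 1))


def vneg_doubleword(src, esize, floating_point):
    """Negate each esize-bit element of one 64-bit register value."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):
        elem = (src >> (e * esize)) & mask
        if floating_point:
            out = fp_neg(elem, esize)
        else:
            # SInt(): interpret the element as a two's complement integer.
            signed = elem - (1 << esize) if elem >> (esize - 1) else elem
            # result<esize-1:0>: keep only the low esize bits.
            out = (-signed) & mask
        result |= out << (e * esize)
    return result


# S16 elements, lowest first: 0x7FFF, 0x0000, 0x80FF, 0x0001.
ints = vneg_doubleword(0x000180FF00007FFF, 16, floating_point=False)
# F32 elements, lowest first: +0.0 and +1.0 (0x3F800000).
floats = vneg_doubleword(0x3F80000000000000, 32, floating_point=True)
# The most negative S16 value negates to itself after truncation.
wrap = vneg_doubleword(0x8000, 16, floating_point=False)
```

In this sketch the +0.0 lane becomes 0x80000000 (negative zero), which an arithmetic negation written as 0 - x would not produce.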
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3547 


ID092916


Non-Confidential 


F6.1.135 VNMLA 
Vector Negate Multiply Accumulate multiplies together two floating-point register values, adds the negation of the 
floating-point value in the destination register to the negation of the product, and writes the result back to the 
destination register. 
Note 
ARM recommends that software does not use the VNMLA instruction in the Round towards Plus Infinity and Round 
towards Minus Infinity rounding modes, because the rounding of the product and of the sum can change the result 
of the instruction in opposite directions, defeating the purpose of these rounding modes. 
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
31-28    27-24  23  22  21-20  19-16  15-12  11-10  9-8   7  6   5  4  3-0
!= 1111  1110   0   D   01     Vn     Vd     10     1 x   N  1   M  0  Vm
cond                                                size     op
Single-precision scalar variant 
Applies when size == 10. 
VNMLA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VNMLA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 
Decode for all variants of this encoding 
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
type = if op == '1' then VFPNegMul_VNMLA else VFPNegMul_VNMLS;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1 
15-12  11-8  7  6  5-4  3-0  15-12  11-10  9-8   7  6   5  4  3-0
1110   1110  0  D  01   Vn   Vd     10     1 x   N  1   M  0  Vm
                                           size     op
Single-precision scalar variant 
Applies when size == 10. 
VNMLA{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VNMLA{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 
Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
type = if op == '1' then VFPNegMul_VNMLA else VFPNegMul_VNMLS;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
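
The two register index forms in this decode differ in where the single extra bit is placed. The following Python sketch, with a hypothetical function name and fields passed as plain integers, illustrates the difference between UInt(Vd:D) for single-precision registers and UInt(D:Vd) for double-precision registers.

```python
def vfp_reg_index(V, x, size):
    """V is a 4-bit Vd/Vn/Vm field, x the matching D/N/M bit, size the 2-bit size field."""
    if size == 0b10:
        return (V << 1) | x   # UInt(Vd:D): extra bit is the least significant bit
    if size == 0b11:
        return (x << 4) | V   # UInt(D:Vd): extra bit is the most significant bit
    raise ValueError("UNDEFINED: size != '1x'")
```

With V = 0b0011 and x = 1, the same field values select S7 when size is '10' and D19 when size is '11'.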


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 
enumeration VFPNegMul {VFPNegMul_VNMLA, VFPNegMul_VNMLS, VFPNegMul_VNMUL};

if ConditionPassed() then
    EncodingSpecificOperations();  CheckVFPEnabled(TRUE);
    case esize of
        when 32
            product32 = FPMul(S[n], S[m], FPSCR);
            case type of
                when VFPNegMul_VNMLA  S[d] = FPAdd(FPNeg(S[d]), FPNeg(product32), FPSCR);
                when VFPNegMul_VNMLS  S[d] = FPAdd(FPNeg(S[d]), product32, FPSCR);
                when VFPNegMul_VNMUL  S[d] = FPNeg(product32);
        when 64
            product64 = FPMul(D[n], D[m], FPSCR);
            case type of
                when VFPNegMul_VNMLA  D[d] = FPAdd(FPNeg(D[d]), FPNeg(product64), FPSCR);
                when VFPNegMul_VNMLS  D[d] = FPAdd(FPNeg(D[d]), product64, FPSCR);
                when VFPNegMul_VNMUL  D[d] = FPNeg(product64);
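
The three cases of this operation can be compared side by side with a short Python sketch. It is an approximation under stated assumptions: Python floats stand in for double-precision values with round-to-nearest, FPSCR rounding modes, flush-to-zero and exception flags are not modelled, and the names VFPNegMul and vfp_neg_mul are illustrative only.

```python
import math
from enum import Enum


class VFPNegMul(Enum):
    VNMLA = 1
    VNMLS = 2
    VNMUL = 3


def vfp_neg_mul(kind, dest, n, m):
    """Return the new destination value for the double-precision case."""
    product = n * m                      # FPMul: the product is rounded first
    if kind is VFPNegMul.VNMLA:
        return (-dest) + (-product)      # FPAdd(FPNeg(D[d]), FPNeg(product64))
    if kind is VFPNegMul.VNMLS:
        return (-dest) + product         # FPAdd(FPNeg(D[d]), product64)
    return -product                      # FPNeg(product64)


vnmla = vfp_neg_mul(VFPNegMul.VNMLA, 1.0, 2.0, 3.0)
vnmls = vfp_neg_mul(VFPNegMul.VNMLS, 1.0, 2.0, 3.0)
vnmul = vfp_neg_mul(VFPNegMul.VNMUL, 1.0, 2.0, 3.0)
neg_zero = vfp_neg_mul(VFPNegMul.VNMUL, 0.0, 0.0, 1.0)
```

Because the product and the sum are rounded separately, these are not fused operations. The last call shows that negating a zero product yields negative zero.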
F6.1.136 VNMLS


Vector Negate Multiply Subtract multiplies together two floating-point register values, adds the negation of the 
floating-point value in the destination register to the product, and writes the result back to the destination register. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


31-28    27-24  23  22  21-20  19-16  15-12  11-10  9-8   7  6   5  4  3-0
!= 1111  1110   0   D   01     Vn     Vd     10     1 x   N  0   M  0  Vm
cond                                                size     op


Single-precision scalar variant 
Applies when size == 10. 


VNMLS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VNMLS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
type = if op == '1' then VFPNegMul_VNMLA else VFPNegMul_VNMLS;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


15-12  11-8  7  6  5-4  3-0  15-12  11-10  9-8   7  6   5  4  3-0
1110   1110  0  D  01   Vn   Vd     10     1 x   N  0   M  0  Vm
                                           size     op


Single-precision scalar variant 
Applies when size == 10. 


VNMLS{<c>}{<q>}.F32 <Sd>, <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VNMLS{<c>}{<q>}.F64 <Dd>, <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
type = if op == '1' then VFPNegMul_VNMLA else VFPNegMul_VNMLS;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field.

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


enumeration VFPNegMul {VFPNegMul_VNMLA, VFPNegMul_VNMLS, VFPNegMul_VNMUL};

if ConditionPassed() then
    EncodingSpecificOperations();  CheckVFPEnabled(TRUE);
    case esize of
        when 32
            product32 = FPMul(S[n], S[m], FPSCR);
            case type of
                when VFPNegMul_VNMLA  S[d] = FPAdd(FPNeg(S[d]), FPNeg(product32), FPSCR);
                when VFPNegMul_VNMLS  S[d] = FPAdd(FPNeg(S[d]), product32, FPSCR);
                when VFPNegMul_VNMUL  S[d] = FPNeg(product32);
        when 64
            product64 = FPMul(D[n], D[m], FPSCR);
            case type of
                when VFPNegMul_VNMLA  D[d] = FPAdd(FPNeg(D[d]), FPNeg(product64), FPSCR);
                when VFPNegMul_VNMLS  D[d] = FPAdd(FPNeg(D[d]), product64, FPSCR);
                when VFPNegMul_VNMUL  D[d] = FPNeg(product64);
F6.1.137 VNMUL


Vector Negate Multiply multiplies together two floating-point register values, and writes the negation of the result
to the destination register.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


31-28    27-24  23  22  21-20  19-16  15-12  11-10  9-8   7  6  5  4  3-0
!= 1111  1110   0   D   10     Vn     Vd     10     1 x   N  1  M  0  Vm
cond                                                size


Single-precision scalar variant 
Applies when size == 10. 


VNMUL{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VNMUL{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
type = VFPNegMul_VNMUL;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


15-12  11-8  7  6  5-4  3-0  15-12  11-10  9-8   7  6  5  4  3-0
1110   1110  0  D  10   Vn   Vd     10     1 x   N  1  M  0  Vm
                                           size


Single-precision scalar variant 
Applies when size == 10. 


VNMUL{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VNMUL{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
type = VFPNegMul_VNMUL;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field.

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


enumeration VFPNegMul {VFPNegMul_VNMLA, VFPNegMul_VNMLS, VFPNegMul_VNMUL};

if ConditionPassed() then
    EncodingSpecificOperations();  CheckVFPEnabled(TRUE);
    case esize of
        when 32
            product32 = FPMul(S[n], S[m], FPSCR);
            case type of
                when VFPNegMul_VNMLA  S[d] = FPAdd(FPNeg(S[d]), FPNeg(product32), FPSCR);
                when VFPNegMul_VNMLS  S[d] = FPAdd(FPNeg(S[d]), product32, FPSCR);
                when VFPNegMul_VNMUL  S[d] = FPNeg(product32);
        when 64
            product64 = FPMul(D[n], D[m], FPSCR);
            case type of
                when VFPNegMul_VNMLA  D[d] = FPAdd(FPNeg(D[d]), FPNeg(product64), FPSCR);
                when VFPNegMul_VNMLS  D[d] = FPAdd(FPNeg(D[d]), product64, FPSCR);
                when VFPNegMul_VNMUL  D[d] = FPNeg(product64);
F6.1.138 VORN (immediate)


Vector Bitwise OR NOT (immediate) performs a bitwise OR between a register value and the complement of an
immediate value, and returns the result into the destination vector.


This instruction is a pseudo-instruction of the VORR (immediate) instruction. This means that: 


• The encodings in this description are named to match the encodings of VORR (immediate).
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VORR (immediate) gives the operational pseudocode for this instruction.
A1 
31-28  27-25  24  23  22  21-19  18-16  15-12  11-8   7  6  5  4  3-0
1111   001    i   1   D   000    imm3   Vd     cmode  0  Q  0  1  imm4


64-bit SIMD vector variant 

Applies when Q == 0. 

VORN{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 
is equivalent to 

VORR{<c>}{<q>}.<dt> <Dd>, #~<imm>


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 

VORN{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 
is equivalent to 

VORR{<c>}{<q>}.<dt> <Qd>, #~<imm>


and is never the preferred disassembly. 


T1 


15-13  12  11-7   6  5-3  2-0   15-12  11-8   7  6  5  4  3-0
111    i   11111  D  000  imm3  Vd     cmode  0  Q  0  1  imm4


64-bit SIMD vector variant 

Applies when Q == 0. 

VORN{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 
is equivalent to 

VORR{<c>}{<q>}.<dt> <Dd>, #~<imm>


and is never the preferred disassembly. 
128-bit SIMD vector variant


Applies when Q == 1. 


VORN{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 


is equivalent to 


VORR{<c>}{<q>}.<dt> <Qd>, #~<imm>


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> The data type used for <imm>. It can be either I16 or I32. I8, I64, and F32 are also permitted, but the
resulting syntax is a pseudo-instruction.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<imm> Is a constant of the type specified by <dt> that is replicated to fill the destination register. For details
of the range of constants available and the encoding of <dt> and <imm>, see Modified immediate
constants in T32 and A32 Advanced SIMD instructions on page F2-2423.


Operation for all encodings 


The description of VORR (immediate) gives the operational pseudocode for this instruction. 
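
The relationship between this pseudo-instruction and VORR (immediate) can be illustrated with a small Python model. It is a sketch under assumptions: the register file is a hypothetical list of 32 unsigned 64-bit integers, imm64 is the already expanded and replicated immediate, and whether the complemented constant is encodable is not checked. The function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def vorr_immediate(D, d, regs, imm64):
    """Model of 'D[d+r] = D[d+r] OR imm64' for r = 0 to regs-1."""
    for r in range(regs):
        D[d + r] |= imm64
    return D


def vorn_immediate(D, d, regs, imm64):
    """VORN with an immediate behaves as VORR with the complemented immediate."""
    return vorr_immediate(D, d, regs, ~imm64 & MASK64)


D = [0] * 32
D[0] = 0x1234000000000000
# I32 immediate 0xFFFFFF00 replicated into both lanes; its complement is 0x000000FF.
vorn_immediate(D, d=0, regs=1, imm64=0xFFFFFF00FFFFFF00)
```

An assembler handling VORN in this way would emit the VORR encoding for the complemented constant, which is why a disassembler never shows VORN.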
F6.1.139 VORN (register)
Vector bitwise OR NOT (register) performs a bitwise OR NOT operation between two registers, and places the 
result in the destination register. The operand and result registers can be quadword or doubleword. They must all be 
the same size. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
31-28  27-24  23  22  21-20  19-16  15-12  11-8  7  6  5  4  3-0
1111   0010   0   D   11     Vn     Vd     0001  N  Q  M  1  Vm
64-bit SIMD vector variant 
Applies when Q == 0. 
VORN{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VORN{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
T1 
15-12  11-8  7  6  5-4  3-0  15-12  11-8  7  6  5  4  3-0
1110   1111  0  D  11   Vn   Vd     0001  N  Q  M  1  Vm
64-bit SIMD vector variant 
Applies when Q == 0. 
VORN{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VORN{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[n+r] OR NOT(D[m+r]);
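
A minimal Python sketch of this operation follows, assuming a hypothetical register file represented as a list of unsigned 64-bit integers. The function name and the sample values are illustrative only.

```python
MASK64 = (1 << 64) - 1


def vorn_register(D, d, n, m, regs):
    """Model of 'D[d+r] = D[n+r] OR NOT(D[m+r])' for r = 0 to regs-1."""
    for r in range(regs):
        D[d + r] = D[n + r] | (~D[m + r] & MASK64)
    return D


D = [0] * 32
D[2], D[3] = 0x00000000FFFF0000, 0x0F0F0F0F0F0F0F0F   # Q1, first source
D[4], D[5] = 0xFFFFFFFF0000FFFF, MASK64               # Q2, second source
vorn_register(D, d=0, n=2, m=4, regs=2)                # Q0 = Q1 OR NOT Q2
```

When the second source is all ones, as in D5 here, its complement is zero and the first source passes through unchanged.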
F6.1.140 VORR (immediate)


Vector Bitwise OR (immediate) performs a bitwise OR between a register value and an immediate value, and returns
the result into the destination vector.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For
more information see Enabling Advanced SIMD and floating-point support on page G1-3881.


This instruction is used by the pseudo-instruction VORN (immediate). The pseudo-instruction is never the preferred
disassembly.


A1 


31-28  27-25  24  23  22  21-19  18-16  15-12  11-8   7  6  5  4  3-0
1111   001    i   1   D   000    imm3   Vd     cmode  0  Q  0  1  imm4


64-bit SIMD vector variant 
Applies when Q == 0. 


VORR{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VORR{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 


Decode for all variants of this encoding 
if cmode<0> == '0' || cmode<3:2> == '11' then SEE VMOV (immediate);
if Q == '1' && Vd<0> == '1' then UNDEFINED;
imm64 = AdvSIMDExpandImm('0', cmode, i:imm3:imm4);
d = UInt(D:Vd);  regs = if Q == '0' then 1 else 2;


T1


15-13  12  11-7   6  5-3  2-0   15-12  11-8   7  6  5  4  3-0
111    i   11111  D  000  imm3  Vd     cmode  0  Q  0  1  imm4


64-bit SIMD vector variant 
Applies when Q == 0. 


VORR{<c>}{<q>}.<dt> {<Dd>,} <Dd>, #<imm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VORR{<c>}{<q>}.<dt> {<Qd>,} <Qd>, #<imm> 


Decode for all variants of this encoding 


if cmode<0> == '0' || cmode<3:2> == '11' then SEE VMOV (immediate);
if Q == '1' && Vd<0> == '1' then UNDEFINED;
imm64 = AdvSIMDExpandImm('0', cmode, i:imm3:imm4);
d = UInt(D:Vd);  regs = if Q == '0' then 1 else 2;
Alias conditions


Pseudo-instruction    Is preferred when
VORN (immediate)      Never





Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> The data type used for <imm>. It can be either I16 or I32. I8, I64, and F32 are also permitted, but the
resulting syntax is a pseudo-instruction.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<imm> Is a constant of the type specified by <dt> that is replicated to fill the destination register. For details
of the range of constants available and the encoding of <dt> and <imm>, see Modified immediate
constants in T32 and A32 Advanced SIMD instructions on page F2-2423.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[d+r] OR imm64;
F6.1.141 VORR (register)


Vector bitwise OR (register) performs a bitwise OR operation between two registers, and places the result in the 
destination register. The operand and result registers can be quadword or doubleword. They must all be the same 
size. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


This instruction is used by the pseudo-instructions VMOV (register, SIMD), VRSHR (zero), and VSHR (zero). The 
pseudo-instruction is never the preferred disassembly. 


A1 


31-28  27-24  23  22  21-20  19-16  15-12  11-8  7  6  5  4  3-0
1111   0010   0   D   10     Vn     Vd     0001  N  Q  M  1  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 
VORR{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VORR{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1 


15-12  11-8  7  6  5-4  3-0  15-12  11-8  7  6  5  4  3-0
1110   1111  0  D  10   Vn   Vd     0001  N  Q  M  1  Vm


64-bit SIMD vector variant 

Applies when Q == 0. 

VORR{<c>}{<q>}{.<dt>} {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 

VORR{<c>}{<q>}{.<dt>} {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
Alias conditions


Alias or pseudo-instruction    Is preferred when
VMOV (register, SIMD)          N:Vn == M:Vm
VRSHR (zero)                   Never
VSHR (zero)                    Never

Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        D[d+r] = D[n+r] OR D[m+r];
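
The operation and the VMOV (register, SIMD) alias condition can be sketched in Python as follows. The register file is assumed to be a list of unsigned 64-bit integers, and the helper names are hypothetical. The sketch shows that when both source fields name the same register, the OR reduces to a copy.

```python
def vorr_register(D, d, n, m, regs):
    """Model of 'D[d+r] = D[n+r] OR D[m+r]' for r = 0 to regs-1."""
    for r in range(regs):
        D[d + r] = D[n + r] | D[m + r]
    return D


def is_vmov_alias(N, Vn, M, Vm):
    """Alias condition from the table: VMOV is preferred when N:Vn == M:Vm."""
    return (N, Vn) == (M, Vm)


D = [0] * 32
D[6] = 0xDEADBEEF00C0FFEE
vorr_register(D, d=1, n=6, m=6, regs=1)   # same source twice: acts as a move
```

A disassembler following the alias table would therefore show this encoding as a VMOV from D6 to D1.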
F6.1.142 VPADAL


Vector Pairwise Add and Accumulate Long adds adjacent pairs of elements of a vector, and accumulates the results 
into the elements of the destination vector. 


The vectors can be doubleword or quadword. The operand elements can be 8-bit, 16-bit, or 32-bit integers. The 
result elements are twice the length of the operand elements. 
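
As a worked illustration of the description above, the following Python sketch models one doubleword register. It is not the manual's pseudocode: it assumes that each pair sum is added to the existing double-width destination element and that the result wraps to the destination element width. The function names are hypothetical.

```python
def to_int(bits, esize, unsigned):
    """Interpret an esize-bit pattern as an unsigned or two's complement integer."""
    if not unsigned and bits >> (esize - 1):
        return bits - (1 << esize)
    return bits


def vpadal_doubleword(acc, src, esize, unsigned):
    """Add adjacent pairs of esize-bit elements of src into 2*esize-bit elements of acc."""
    in_mask = (1 << esize) - 1
    out_size = 2 * esize
    out_mask = (1 << out_size) - 1
    result = 0
    for e in range(64 // out_size):
        op1 = (src >> ((2 * e) * esize)) & in_mask
        op2 = (src >> ((2 * e + 1) * esize)) & in_mask
        pair_sum = to_int(op1, esize, unsigned) + to_int(op2, esize, unsigned)
        old = (acc >> (e * out_size)) & out_mask
        result |= ((old + pair_sum) & out_mask) << (e * out_size)
    return result


# Source 16-bit elements, lowest first: 0x0001, 0xFFFE, 0x0003, 0x0004.
# Accumulator 32-bit elements, lowest first: 10 and 100.
signed = vpadal_doubleword(0x000000640000000A, 0x00040003FFFE0001, 16, unsigned=False)
unsigned = vpadal_doubleword(0x000000640000000A, 0x00040003FFFE0001, 16, unsigned=True)
```

With data type S16 the element 0xFFFE is read as -2, so the low result is 10 + 1 - 2 = 9. With U16 it is read as 65534, so the low result is 65545 (0x10009), which fits only because the destination elements are twice as wide.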


The following figure shows an example of the operation of VPADAL doubleword operation for data type S16 





Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


31-28  27-24  23  22  21-20  19-18  17-16  15-12  11  10-8  7   6  5  4  3-0
1111   0011   1   D   11     size   00     Vd     0   110   op  Q  M  0  Vm


64-bit SIMD vector variant 
Applies when Q == 0. 


VPADAL{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VPADAL{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 

if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (op == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15  12|11|10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 0  0|  Vd  | 0| 1  1  0|op| Q| M| 0| Vm  |


64-bit SIMD vector variant 


Applies when Q == 0. 
VPADAL{<c>}{<q>}.<dt> <Dd>, <Dm>


128-bit SIMD vector variant 
Applies when Q == 1. 


VPADAL{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (op == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the vectors, encoded in the "op:size" field. It can have the 
following values: 


S8 when op = 0, size = 00
S16 when op = 0, size = 01
S32 when op = 0, size = 10
U8 when op = 1, size = 00
U16 when op = 1, size = 01
U32 when op = 1, size = 10

The following encodings are reserved:

• op = 0, size = 11.
• op = 1, size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    h = elements DIV 2;

    for r = 0 to regs-1
        for e = 0 to h-1
            op1 = Elem[D[m+r],2*e,esize]; op2 = Elem[D[m+r],2*e+1,esize];
            result = Int(op1, unsigned) + Int(op2, unsigned);
            Elem[D[d+r],e,2*esize] = Elem[D[d+r],e,2*esize] + result;
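
The VPADAL operation above can be traced with a small software model. The following Python sketch is illustrative only and is not part of the architecture: the names elem, to_int, and vpadal_d are hypothetical, a D register is modelled as a 64-bit Python integer, and only the doubleword (Q == 0) case is covered.

```python
def elem(vec, e, size):
    """Return element e (size bits wide) of the integer vec as a raw bit pattern."""
    return (vec >> (e * size)) & ((1 << size) - 1)


def to_int(bits, size, unsigned):
    """Mirror Int(x, unsigned): read a size-bit pattern as unsigned or two's complement."""
    if unsigned or bits < (1 << (size - 1)):
        return bits
    return bits - (1 << size)


def vpadal_d(dd, dm, esize, unsigned):
    """Hypothetical model of VPADAL for one 64-bit destination and one 64-bit source."""
    wide = 2 * esize
    out = 0
    for e in range((64 // esize) // 2):
        pair_sum = (to_int(elem(dm, 2 * e, esize), esize, unsigned)
                    + to_int(elem(dm, 2 * e + 1, esize), esize, unsigned))
        # The accumulation wraps modulo 2**(2*esize); VPADAL does not saturate.
        acc = (elem(dd, e, wide) + pair_sum) & ((1 << wide) - 1)
        out |= acc << (e * wide)
    return out


dm = 0x0004_0003_FFFE_0001   # 16-bit lanes, low to high: 1, 0xFFFE, 3, 4
dd = (100 << 32) | 10        # 32-bit lanes, low to high: 10, 100
signed_result = vpadal_d(dd, dm, 16, unsigned=False)    # models VPADAL.S16
unsigned_result = vpadal_d(dd, dm, 16, unsigned=True)   # models VPADAL.U16
```

With these inputs the S16 form treats 0xFFFE as -2, so the low accumulator lane becomes 10 + (1 - 2) = 9, while the U16 form adds 1 + 65534 to the same lane.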
F6.1.143 VPADD (floating-point)
Vector Pairwise Add (floating-point) adds adjacent pairs of elements of two vectors, and places the results in the 
destination vector. 
The operands and result are doubleword vectors. 
The operand and result elements are 32-bit floating-point numbers. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 0  0  1  1| 0| D| 0|sz|  Vn  |  Vd  | 1  1  0  1| N| Q| M| 0| Vm  |
A1 variant
VPADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if Q == '1' then UNDEFINED; 
if sz == '1' then UNDEFINED; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5| 4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 1  1  1  1| 0| D| 0|sz| Vn  |  Vd  | 1  1  0  1| N| Q| M| 0| Vm  |
T1 variant 
VPADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if Q == '1' then UNDEFINED; 
if sz == '1' then UNDEFINED; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(64) dest;
    h = elements DIV 2;

    for e = 0 to h-1
        Elem[dest,e,esize] = FPAdd(Elem[D[n],2*e,esize], Elem[D[n],2*e+1,esize],
                                   StandardFPSCRValue());
        Elem[dest,e+h,esize] = FPAdd(Elem[D[m],2*e,esize], Elem[D[m],2*e+1,esize],
                                     StandardFPSCRValue());

    D[d] = dest;
F6.1.144 VPADD (integer)
Vector Pairwise Add (integer) adds adjacent pairs of elements of two vectors, and places the results in the destination 
vector. 
The operands and result are doubleword vectors. 
The operand and result elements must all be the same type, and can be 8-bit, 16-bit, or 32-bit integers. There is no 
distinction between signed and unsigned integers. 
The following figure shows an example of the operation of VPADD doubleword operation for data type I16. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 0  0  1  0| 0| D| size|  Vn  |  Vd  | 1  0  1  1| N| Q| M| 1| Vm  |
A1 variant
VPADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if size == '11' || Q == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  0| 1  1  1  1| 0| D| size| Vn  |  Vd  | 1  0  1  1| N| Q| M| 1| Vm  |
T1 variant 
VPADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if size == '11' || Q == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:

I8 when size = 00
I16 when size = 01
I32 when size = 10

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(64) dest;
    h = elements DIV 2;

    for e = 0 to h-1
        Elem[dest,e,esize] = Elem[D[n],2*e,esize] + Elem[D[n],2*e+1,esize];
        Elem[dest,e+h,esize] = Elem[D[m],2*e,esize] + Elem[D[m],2*e+1,esize];

    D[d] = dest;
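
As an illustration of the VPADD (integer) operation above, the following Python sketch places the pairwise sums of the first operand in the lower half of the result and those of the second operand in the upper half. It is a minimal model under stated assumptions, not an architectural definition: vpadd_int and lane are hypothetical names, and each D register is modelled as a 64-bit Python integer.

```python
def vpadd_int(dn, dm, esize):
    """Hypothetical model of VPADD (integer) on two 64-bit operands."""
    mask = (1 << esize) - 1
    h = (64 // esize) // 2

    def lane(vec, e):
        # Raw element e of vec; signedness does not matter for a wrapping add.
        return (vec >> (e * esize)) & mask

    dest = 0
    for e in range(h):
        low = (lane(dn, 2 * e) + lane(dn, 2 * e + 1)) & mask    # pairs from Dn
        high = (lane(dm, 2 * e) + lane(dm, 2 * e + 1)) & mask   # pairs from Dm
        dest |= low << (e * esize)
        dest |= high << ((e + h) * esize)
    return dest


# VPADD.I16 with Dn lanes (low to high) 1, 2, 3, 4 and Dm lanes 0xFFFF, 1, 5, 6.
result = vpadd_int(0x0004_0003_0002_0001, 0x0006_0005_0001_FFFF, 16)
```

The sum 0xFFFF + 1 wraps to 0 in its 16-bit lane, which shows why the instruction does not distinguish signed from unsigned integers.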
F6.1.145 VPADDL


Vector Pairwise Add Long adds adjacent pairs of elements of two vectors, and places the results in the destination 
vector. 


The vectors can be doubleword or quadword. The operand elements can be 8-bit, 16-bit, or 32-bit integers. The 
result elements are twice the length of the operand elements. 


The following figure shows an example of the operation of VPADDL doubleword operation for data type S16. 





Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15  12|11|10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 0  0|  Vd  | 0| 0  1  0|op| Q| M| 0| Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VPADDL{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VPADDL{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 

if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (op == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15  12|11|10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 0  0|  Vd  | 0| 0  1  0|op| Q| M| 0| Vm  |


64-bit SIMD vector variant 


Applies when Q == 0. 
VPADDL{<c>}{<q>}.<dt> <Dd>, <Dm>


128-bit SIMD vector variant 
Applies when Q == 1. 


VPADDL{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (op == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the vectors, encoded in the "op:size" field. It can have the 
following values: 


S8 when op = 0, size = 00
S16 when op = 0, size = 01
S32 when op = 0, size = 10
U8 when op = 1, size = 00
U16 when op = 1, size = 01
U32 when op = 1, size = 10

The following encodings are reserved:

• op = 0, size = 11.
• op = 1, size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    h = elements DIV 2;

    for r = 0 to regs-1
        for e = 0 to h-1
            op1 = Elem[D[m+r],2*e,esize]; op2 = Elem[D[m+r],2*e+1,esize];
            result = Int(op1, unsigned) + Int(op2, unsigned);
            Elem[D[d+r],e,2*esize] = result<2*esize-1:0>;
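
The widening behavior of the VPADDL operation above can be sketched in Python as follows. This is an illustrative model only: vpaddl is a hypothetical name, and a Q register is assumed to be modelled as one 128-bit integer with D[m] in the low half and D[m+1] in the high half. Because each D register holds an even number of elements, pairs never straddle the two halves, so a single loop over the whole value gives the same result as the per-register loop in the pseudocode.

```python
def vpaddl(vm, esize, unsigned, width=64):
    """Hypothetical model of VPADDL; width is 64 (D register) or 128 (Q register)."""
    mask = (1 << esize) - 1
    wide = 2 * esize
    out = 0
    for e in range(width // wide):
        a = (vm >> (2 * e * esize)) & mask
        b = (vm >> ((2 * e + 1) * esize)) & mask
        if not unsigned:
            # Sign-extend each element before adding, as Int(x, FALSE) does.
            a -= (a >> (esize - 1)) << esize
            b -= (b >> (esize - 1)) << esize
        # The sum of two esize-bit values always fits in 2*esize bits.
        out |= ((a + b) & ((1 << wide) - 1)) << (e * wide)
    return out


signed8 = vpaddl(0x7F7F_FFFF, 8, unsigned=False)    # models VPADDL.S8
unsigned8 = vpaddl(0x7F7F_FFFF, 8, unsigned=True)   # models VPADDL.U8
```

For the byte pair 0xFF, 0xFF the S8 form produces -2 (0xFFFE) in the widened lane, while the U8 form produces 510 (0x01FE).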
F6.1.146 VPMAX (floating-point)
Vector Pairwise Maximum compares adjacent pairs of elements in two doubleword vectors, and copies the larger 
of each pair into the corresponding element in the destination doubleword vector. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 0  0  1  1| 0| D| 0|sz|  Vn  |  Vd  | 1  1  1  1| N| 0| M| 0| Vm  |
op is bit[21]; Q is bit[6].
A1 variant
VPMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if Q == '1' then UNDEFINED; 
if sz == '1' then UNDEFINED; 
maximum = (op == '0');
esize = 32; elements = 2; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
T1 
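|15 14 13 12|11 10  9  8| 7| 6| 5| 4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 1  1  1  1| 0| D| 0|sz| Vn  |  Vd  | 1  1  1  1| N| 0| M| 0| Vm  |
op is bit[5] of the first halfword; Q is bit[6] of the second halfword.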
T1 variant 
VPMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if Q == '1' then UNDEFINED; 
if sz == '1' then UNDEFINED; 
maximum = (op == '0');
esize = 32; elements = 2; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(64) dest;
    h = elements DIV 2;

    for e = 0 to h-1
        op1 = Elem[D[n],2*e,esize]; op2 = Elem[D[n],2*e+1,esize];
        Elem[dest,e,esize] = if maximum then FPMax(op1,op2,StandardFPSCRValue()) else
                             FPMin(op1,op2,StandardFPSCRValue());
        op1 = Elem[D[m],2*e,esize]; op2 = Elem[D[m],2*e+1,esize];
        Elem[dest,e+h,esize] = if maximum then FPMax(op1,op2,StandardFPSCRValue()) else
                               FPMin(op1,op2,StandardFPSCRValue());

    D[d] = dest;
F6.1.147 VPMAX (integer)


Vector Pairwise Maximum compares adjacent pairs of elements in two doubleword vectors, and copies the larger 
of each pair into the corresponding element in the destination doubleword vector. 


The following figure shows an example of the operation of VPMAX doubleword operation for data type S16 or 
U16. 





Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28 27 26 25|24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1  0  0  1| U| 0| D| size|  Vn  |  Vd  | 1  0  1  0| N| 0| M| 0| Vm  |
Q is bit[6]; op is bit[4].

A1 variant

VPMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 

Decode for this encoding 

if size == '11' || Q == '1' then UNDEFINED; 

maximum = (op == '0'); unsigned = (U == '1');

esize = 8 << UInt(size); elements = 64 DIV esize; 

d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 

T1 

|15 14 13|12|11 10  9  8| 7| 6| 5  4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1| U| 1  1  1  1| 0| D| size| Vn  |  Vd  | 1  0  1  0| N| 0| M| 0| Vm  |
Q is bit[6] and op is bit[4] of the second halfword.


T1 variant 


VPMAX{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


Decode for this encoding 


if size == '11' || Q == '1' then UNDEFINED; 
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(64) dest;
    h = elements DIV 2;

    for e = 0 to h-1
        op1 = Int(Elem[D[n],2*e,esize], unsigned);
        op2 = Int(Elem[D[n],2*e+1,esize], unsigned);
        result = if maximum then Max(op1,op2) else Min(op1,op2);
        Elem[dest,e,esize] = result<esize-1:0>;
        op1 = Int(Elem[D[m],2*e,esize], unsigned);
        op2 = Int(Elem[D[m],2*e+1,esize], unsigned);
        result = if maximum then Max(op1,op2) else Min(op1,op2);
        Elem[dest,e+h,esize] = result<esize-1:0>;

    D[d] = dest;
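
The effect of the unsigned and maximum flags in the operation above can be seen in a short Python sketch. It is a minimal model under stated assumptions rather than an architectural definition: vpminmax_int and lane are hypothetical names, D registers are modelled as 64-bit Python integers, and the same function covers VPMIN (integer) by passing maximum=False.

```python
def vpminmax_int(dn, dm, esize, unsigned, maximum=True):
    """Hypothetical model of integer VPMAX (maximum=True) or VPMIN (maximum=False)."""
    mask = (1 << esize) - 1
    h = (64 // esize) // 2
    pick = max if maximum else min

    def lane(vec, e):
        bits = (vec >> (e * esize)) & mask
        if not unsigned and bits >> (esize - 1):
            bits -= 1 << esize          # reinterpret as two's complement
        return bits

    dest = 0
    for e in range(h):
        # Results from Dn land in the lower half, results from Dm in the upper half.
        dest |= (pick(lane(dn, 2 * e), lane(dn, 2 * e + 1)) & mask) << (e * esize)
        dest |= (pick(lane(dm, 2 * e), lane(dm, 2 * e + 1)) & mask) << ((e + h) * esize)
    return dest


dn = 0x0003_0005_FFFF_0001   # 16-bit lanes, low to high: 1, 0xFFFF, 5, 3
dm = 0x0002_0000_FFF8_FFF9   # 16-bit lanes, low to high: 0xFFF9, 0xFFF8, 0, 2
max_s16 = vpminmax_int(dn, dm, 16, unsigned=False)
max_u16 = vpminmax_int(dn, dm, 16, unsigned=True)
min_s16 = vpminmax_int(dn, dm, 16, unsigned=False, maximum=False)
```

The pair 1, 0xFFFF selects 1 for VPMAX.S16 (0xFFFF is -1) but 0xFFFF for VPMAX.U16, which is the only difference between the two results here.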
F6.1.148 VPMIN (floating-point)
Vector Pairwise Minimum compares adjacent pairs of elements in two doubleword vectors, and copies the smaller 
of each pair into the corresponding element in the destination doubleword vector. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21|20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 0  0  1  1| 0| D| 1|sz|  Vn  |  Vd  | 1  1  1  1| N| 0| M| 0| Vm  |
op is bit[21]; Q is bit[6].
A1 variant
VPMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if Q == '1' then UNDEFINED; 
if sz == '1' then UNDEFINED; 
maximum = (op == '0');
esize = 32; elements = 2; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
T1 
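|15 14 13 12|11 10  9  8| 7| 6| 5| 4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1| 1  1  1  1| 0| D| 1|sz| Vn  |  Vd  | 1  1  1  1| N| 0| M| 0| Vm  |
op is bit[5] of the first halfword; Q is bit[6] of the second halfword.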
T1 variant 
VPMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if Q == '1' then UNDEFINED; 
if sz == '1' then UNDEFINED; 
maximum = (op == '0');
esize = 32; elements = 2; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.


<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(64) dest;
    h = elements DIV 2;

    for e = 0 to h-1
        op1 = Elem[D[n],2*e,esize]; op2 = Elem[D[n],2*e+1,esize];
        Elem[dest,e,esize] = if maximum then FPMax(op1,op2,StandardFPSCRValue()) else
                             FPMin(op1,op2,StandardFPSCRValue());
        op1 = Elem[D[m],2*e,esize]; op2 = Elem[D[m],2*e+1,esize];
        Elem[dest,e+h,esize] = if maximum then FPMax(op1,op2,StandardFPSCRValue()) else
                               FPMin(op1,op2,StandardFPSCRValue());

    D[d] = dest;
F6.1.149 VPMIN (integer)
Vector Pairwise Minimum compares adjacent pairs of elements in two doubleword vectors, and copies the smaller 
of each pair into the corresponding element in the destination doubleword vector. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28 27 26 25|24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1  0  0  1| U| 0| D| size|  Vn  |  Vd  | 1  0  1  0| N| 0| M| 1| Vm  |
Q is bit[6]; op is bit[4].
A1 variant
VPMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if size == '11' || Q == '1' then UNDEFINED; 
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
T1 
|15 14 13|12|11 10  9  8| 7| 6| 5  4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1| U| 1  1  1  1| 0| D| size| Vn  |  Vd  | 1  0  1  0| N| 0| M| 1| Vm  |
Q is bit[6] and op is bit[4] of the second halfword.
T1 variant 
VPMIN{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
Decode for this encoding 
if size == '11' || Q == '1' then UNDEFINED; 
maximum = (op == '0'); unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    bits(64) dest;
    h = elements DIV 2;

    for e = 0 to h-1
        op1 = Int(Elem[D[n],2*e,esize], unsigned);
        op2 = Int(Elem[D[n],2*e+1,esize], unsigned);
        result = if maximum then Max(op1,op2) else Min(op1,op2);
        Elem[dest,e,esize] = result<esize-1:0>;
        op1 = Int(Elem[D[m],2*e,esize], unsigned);
        op2 = Int(Elem[D[m],2*e+1,esize], unsigned);
        result = if maximum then Max(op1,op2) else Min(op1,op2);
        Elem[dest,e+h,esize] = result<esize-1:0>;

    D[d] = dest;
F6.1.150 VPOP
Pop SIMD&FP registers from Stack loads multiple consecutive Advanced SIMD and floating-point register file
registers from the stack.
This instruction is an alias of the VLDM, VLDMDB, VLDMIA instruction. This means that:
• The encodings in this description are named to match the encodings of VLDM, VLDMDB, VLDMIA.
• The description of VLDM, VLDMDB, VLDMIA gives the operational pseudocode for this instruction.
A1 
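|31   28|27 26 25|24|23|22|21|20|19 18 17 16|15  12|11 10| 9  8|7         1| 0|
| !=1111| 1  1  0| 0| 1| D| 1| 1| 1  1  0  1|  Vd  | 1  0| 1  1| imm8<7:1> | 0|
cond is bits[31:28], P is bit[24], U is bit[23], W is bit[21], Rn is bits[19:16], and imm8<0> is bit[0].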
Increment After variant 
VPOP{<c>}{<q>}{.<size>} <dreglist> 
is equivalent to 
VLDM{<c>}{<q>}{.<size>} SP!, <dreglist> 
and is always the preferred disassembly. 
A2 
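|31   28|27 26 25|24|23|22|21|20|19 18 17 16|15  12|11 10| 9  8|7      0|
| !=1111| 1  1  0| 0| 1| D| 1| 1| 1  1  0  1|  Vd  | 1  0| 1  0|  imm8  |
cond is bits[31:28], P is bit[24], U is bit[23], W is bit[21], and Rn is bits[19:16].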
Increment After variant 
VPOP{<c>}{<q>}{.<size>} <sreglist> 
is equivalent to 
VLDM{<c>}{<q>}{.<size>} SP!, <sreglist> 
and is always the preferred disassembly. 
T1 
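|15 14 13 12|11 10  9| 8| 7| 6| 5| 4| 3  2  1  0|15  12|11 10| 9  8|7         1| 0|
| 1  1  1  0| 1  1  0| 0| 1| D| 1| 1| 1  1  0  1|  Vd  | 1  0| 1  1| imm8<7:1> | 0|
In the first halfword, P is bit[8], U is bit[7], W is bit[5], and Rn is bits[3:0]. imm8<0> is bit[0] of the second halfword.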
Increment After variant 
VPOP{<c>}{<q>}{.<size>} <dreglist> 
is equivalent to 
VLDM{<c>}{<q>}{.<size>} SP!, <dreglist> 
and is always the preferred disassembly. 
T2

|15 14 13 12|11 10  9| 8| 7| 6| 5| 4| 3  2  1  0|15  12|11 10| 9  8|7      0|
| 1  1  1  0| 1  1  0| 0| 1| D| 1| 1| 1  1  0  1|  Vd  | 1  0| 1  0|  imm8  |
In the first halfword, P is bit[8], U is bit[7], W is bit[5], and Rn is bits[3:0].


Increment After variant 
VPOP{<c>}{<q>}{.<size>} <sreglist> 

is equivalent to 

VLDM{<c>}{<q>}{.<size>} SP!, <sreglist> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<size> An optional data size specifier. If present, it must be equal to the size in bits, 32 or 64, of the registers
being transferred.


<sreglist> Is the list of consecutively numbered 32-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "Vd:D", and "imm8" is set to the number of registers in the list. The list must 
contain at least one register. 


<dreglist> Is the list of consecutively numbered 64-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "D:Vd", and "imm8" is set to twice the number of registers in the list. The 
list must contain at least one register, and must not contain more than 16 registers. 


Operation for all encodings 


The description of VLDM, VLDMDB, VLDMIA gives the operational pseudocode for this instruction. 
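
The <sreglist> and <dreglist> rules under Assembler symbols can be sketched as a small field encoder. The following Python is illustrative only: encode_reglist is a hypothetical helper that is not part of any assembler, it takes the number of the first register and the register count, and it checks only the constraints stated in this description. Any further limits imposed by the VLDM description are not modelled.

```python
def encode_reglist(first, count, double):
    """Return the D, Vd and imm8 fields for a list of consecutive registers.

    first  -- number of the first register in the list (0 to 31)
    count  -- number of registers in the list
    double -- True for <dreglist> (64-bit), False for <sreglist> (32-bit)
    """
    if count < 1:
        raise ValueError("the list must contain at least one register")
    if double:
        if count > 16:
            raise ValueError("<dreglist> must not contain more than 16 registers")
        # D:Vd holds the register number; imm8 is twice the register count.
        return {"D": first >> 4, "Vd": first & 0xF, "imm8": 2 * count}
    # Vd:D holds the register number; imm8 is the register count.
    return {"D": first & 1, "Vd": first >> 1, "imm8": count}


d8_to_d15 = encode_reglist(8, 8, double=True)    # fields for VPOP {d8-d15}
s5_to_s7 = encode_reglist(5, 3, double=False)    # fields for VPOP {s5-s7}
```

Note the different bit order of the two forms: a doubleword register number is split as D:Vd, whereas a single-precision register number is split as Vd:D.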
F6.1.151 VPUSH
Push SIMD&FP registers to Stack stores multiple consecutive registers from the Advanced SIMD and
floating-point register file to the stack.
This instruction is an alias of the VSTM, VSTMDB, VSTMIA instruction. This means that:
• The encodings in this description are named to match the encodings of VSTM, VSTMDB, VSTMIA.
• The description of VSTM, VSTMDB, VSTMIA gives the operational pseudocode for this instruction.
A1 
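|31   28|27 26 25|24|23|22|21|20|19 18 17 16|15  12|11 10| 9  8|7         1| 0|
| !=1111| 1  1  0| 1| 0| D| 1| 0| 1  1  0  1|  Vd  | 1  0| 1  1| imm8<7:1> | 0|
cond is bits[31:28], P is bit[24], U is bit[23], W is bit[21], Rn is bits[19:16], and imm8<0> is bit[0].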
Decrement Before variant 
VPUSH{<c>}{<q>}{.<size>} <dreglist> 
is equivalent to 
VSTMDB{<c>}{<q>}{.<size>} SP!, <dreglist> 
and is always the preferred disassembly. 
A2 
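|31   28|27 26 25|24|23|22|21|20|19 18 17 16|15  12|11 10| 9  8|7      0|
| !=1111| 1  1  0| 1| 0| D| 1| 0| 1  1  0  1|  Vd  | 1  0| 1  0|  imm8  |
cond is bits[31:28], P is bit[24], U is bit[23], W is bit[21], and Rn is bits[19:16].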
Decrement Before variant 
VPUSH{<c>}{<q>}{.<size>} <sreglist> 
is equivalent to 
VSTMDB{<c>}{<q>}{.<size>} SP!, <sreglist> 
and is always the preferred disassembly. 
T1 
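|15 14 13 12|11 10  9| 8| 7| 6| 5| 4| 3  2  1  0|15  12|11 10| 9  8|7         1| 0|
| 1  1  1  0| 1  1  0| 1| 0| D| 1| 0| 1  1  0  1|  Vd  | 1  0| 1  1| imm8<7:1> | 0|
In the first halfword, P is bit[8], U is bit[7], W is bit[5], and Rn is bits[3:0]. imm8<0> is bit[0] of the second halfword.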
Decrement Before variant 
VPUSH{<c>}{<q>}{.<size>} <dreglist> 
is equivalent to 
VSTMDB{<c>}{<q>}{.<size>} SP!, <dreglist> 
and is always the preferred disassembly. 
T2

|15 14 13 12|11 10  9| 8| 7| 6| 5| 4| 3  2  1  0|15  12|11 10| 9  8|7      0|
| 1  1  1  0| 1  1  0| 1| 0| D| 1| 0| 1  1  0  1|  Vd  | 1  0| 1  0|  imm8  |
In the first halfword, P is bit[8], U is bit[7], W is bit[5], and Rn is bits[3:0].


Decrement Before variant 
VPUSH{<c>}{<q>}{.<size>} <sreglist> 

is equivalent to 

VSTMDB{<c>}{<q>}{.<size>} SP!, <sreglist> 


and is always the preferred disassembly. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<size> An optional data size specifier. If present, it must be equal to the size in bits, 32 or 64, of the registers
being transferred.


<sreglist> Is the list of consecutively numbered 32-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "Vd:D", and "imm8" is set to the number of registers in the list. The list must 
contain at least one register. 


<dreglist> Is the list of consecutively numbered 64-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "D:Vd", and "imm8" is set to twice the number of registers in the list. The 
list must contain at least one register, and must not contain more than 16 registers. 


Operation for all encodings 


The description of VSTM, VSTMDB, VSTMIA gives the operational pseudocode for this instruction. 
F6.1.152 VQABS


Vector Saturating Absolute takes the absolute value of each element in a vector, and places the results in the
destination vector.

If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation
occurs. For details see Pseudocode description of saturation on page E1-2291.

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For
more information see Enabling Advanced SIMD and floating-point support on page G1-3881.


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15  12|11 10  9  8  7| 6| 5| 4|3   0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 0  0|  Vd  | 0  1  1  1  0| Q| M| 0| Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VQABS{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VQABS{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1

|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15  12|11 10  9  8  7| 6| 5| 4|3   0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 0  0|  Vd  | 0  1  1  1  0| Q| M| 0| Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VQABS{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VQABS{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following 
values: 
S8 when size = 00
S16 when size = 01
S32 when size = 10


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            result = Abs(SInt(Elem[D[m+r],e,esize]));
            (Elem[D[d+r],e,esize], sat) = SignedSatQ(result, esize);
            if sat then FPSR.QC = '1';
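
The only input that saturates in the VQABS operation above is the most negative value of each element size, because its absolute value is one more than the largest positive value. The following Python sketch illustrates this under stated assumptions: signed_sat_q and vqabs_d are hypothetical names, a D register is modelled as a 64-bit Python integer, and the cumulative saturation bit is returned as a Boolean instead of being written to a status register.

```python
def signed_sat_q(value, n):
    """Clamp value to the signed n-bit range; return (result, saturated)."""
    hi = (1 << (n - 1)) - 1
    lo = -(1 << (n - 1))
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def vqabs_d(dm, esize):
    """Hypothetical model of VQABS on one 64-bit register; returns (result, qc)."""
    mask = (1 << esize) - 1
    out, qc = 0, False
    for e in range(64 // esize):
        bits = (dm >> (e * esize)) & mask
        signed = bits - (1 << esize) if bits >> (esize - 1) else bits
        result, sat = signed_sat_q(abs(signed), esize)
        out |= (result & mask) << (e * esize)
        qc = qc or sat          # sticky, like the cumulative saturation bit
    return out, qc


# VQABS.S8 on bytes (low to high) 0x80, 0xFF, 0x7F, 0x00, 0x01, 0x81, 0x00, 0x00.
result, qc = vqabs_d(0x0000_8101_007F_FF80, 8)
```

Here the byte 0x80 (-128) becomes 0x7F and sets the flag, while 0x81 (-127) becomes 0x7F without saturating.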
F6.1.153 VQADD
Vector Saturating Add adds the values of corresponding elements of two vectors, and places the results in the 
destination vector. 
If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28 27 26 25|24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1  1  0  0  1| U| 0| D| size|  Vn  |  Vd  | 0  0  0  0| N| Q| M| 1| Vm  |
64-bit SIMD vector variant 
Applies when Q == 0. 
VQADD{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VQADD{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13|12|11 10  9  8| 7| 6| 5  4|3   0|15  12|11 10  9  8| 7| 6| 5| 4|3   0|
| 1  1  1| U| 1  1  1  1| 0| D| size| Vn  |  Vd  | 0  0  0  0| N| Q| M| 1| Vm  |
64-bit SIMD vector variant 
Applies when Q == 0. 
VQADD{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VQADD{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "U:size" field. It can have the
following values:
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
S64 when U = 0, size = 11
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
U64 when U = 1, size = 11
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            sum = Int(Elem[D[n+r],e,esize], unsigned) + Int(Elem[D[m+r],e,esize], unsigned);
            (Elem[D[d+r],e,esize], sat) = SatQ(sum, esize, unsigned);
            if sat then FPSR.QC = '1';
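
The per-element behavior of VQADD can be sketched in Python. This is an illustrative model only, not the architectural pseudocode: the names `sat_q` and `vqadd_lanes` are hypothetical, lanes are modeled as plain Python integers rather than register bit fields, and the returned flag stands in for the cumulative saturation bit.

```python
def sat_q(value, esize, unsigned):
    """Clamp value to the esize-bit range and report whether clamping occurred."""
    if unsigned:
        lo, hi = 0, (1 << esize) - 1
    else:
        lo, hi = -(1 << (esize - 1)), (1 << (esize - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def vqadd_lanes(a, b, esize, unsigned):
    """Saturating add of two equal-length lists of lane values.

    Returns (results, qc), where qc models the cumulative saturation flag.
    """
    results, qc = [], False
    for x, y in zip(a, b):
        r, sat = sat_q(x + y, esize, unsigned)
        results.append(r)
        qc = qc or sat  # the flag is sticky: one saturating lane sets it
    return results, qc
```

For example, with signed 8-bit lanes, adding 100 to 100 would give 127 rather than wrapping, and the modeled flag would be set.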
F6.1.154 VQDMLAL
Vector Saturating Doubling Multiply Accumulate Long multiplies corresponding elements in two doubleword 
vectors, doubles the products, and accumulates the results into the elements of a quadword vector. 
The second operand can be a scalar instead of a vector. For more information about scalars see Advanced SIMD 
scalars on page F2-2432. 
If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11:10 | 9      | 8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 1  | D  | size  | Vn    | Vd    | 10    | 0 (op) | 1 | N | 0 | M | 0 | Vm  |
A1 variant
VQDMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 
Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0'); 
scalar_form = FALSE; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
A2 
| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11 | 10     | 9:8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 1  | D  | size  | Vn    | Vd    | 0  | 0 (op) | 11  | N | 1 | M | 0 | Vm  |
A2 variant 
VQDMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>[<index>] 
Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0');
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn);
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1 
| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11:10 | 9      | 8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 1 | D | size | Vn  | Vd    | 10    | 0 (op) | 1 | N | 0 | M | 0 | Vm  |
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F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions 
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


T1 variant 


VQDMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0');
scalar_form = FALSE; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
esize = 8 << UInt(size); elements = 64 DIV esize;


T2 


| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11 | 10     | 9:8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 1 | D | size | Vn  | Vd    | 0  | 0 (op) | 11  | N | 1 | M | 0 | Vm  |


T2 variant 


VQDMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>[<index>] 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0');
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn);
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
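
The scalar-form decode above splits the M and Vm fields differently depending on the element size. A minimal Python sketch of that field split follows. The function name `decode_scalar_operand` and the use of plain integers for the bit fields are assumptions made for illustration.

```python
def decode_scalar_operand(size, m_bit, vm):
    """Return (esize, elements, m, index) for a scalar-form encoding.

    size  -- the 2-bit size field
    m_bit -- the M bit (0 or 1)
    vm    -- the 4-bit Vm field
    """
    if size == 0b01:
        # 16-bit elements: Vm<2:0> names D0-D7, M:Vm<3> selects lane 0-3.
        return 16, 4, vm & 0b111, (m_bit << 1) | (vm >> 3)
    if size == 0b10:
        # 32-bit elements: Vm names D0-D15, M selects lane 0-1.
        return 32, 2, vm, m_bit
    # size 00 is UNDEFINED and size 11 selects a related encoding.
    raise ValueError("size must be 0b01 or 0b10 for the scalar form")
```

Under these assumptions, `decode_scalar_operand(0b01, 1, 0b1010)` would select lane 3 of D2.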


Notes for all encodings 

Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 

Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the
following values:
S16 when size = 01
S32 when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> For encoding A1 and T1: is the 64-bit name of the second SIMD&FP source register, encoded in
the "M:Vm" field.

For encoding A2 and T2: is the 64-bit name of the second SIMD&FP source register, encoded in
the "Vm<2:0>" field when <dt> is S16, otherwise the "Vm" field.

<index> Is the element index in the range 0 to 3, encoded in the "M:Vm<3>" field when <dt> is S16,
otherwise in range 0 to 1, encoded in the "M" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if scalar_form then op2 = SInt(Elem[Din[m],index,esize]);
    for e = 0 to elements-1
        if !scalar_form then op2 = SInt(Elem[Din[m],e,esize]);
        op1 = SInt(Elem[Din[n],e,esize]);
        // The following only saturates if both op1 and op2 equal -(2^(esize-1))
        (product, sat1) = SignedSatQ(2*op1*op2, 2*esize);
        if add then
            result = SInt(Elem[Qin[d>>1],e,2*esize]) + SInt(product);
        else
            result = SInt(Elem[Qin[d>>1],e,2*esize]) - SInt(product);
        (Elem[Q[d>>1],e,2*esize], sat2) = SignedSatQ(result, 2*esize);
        if sat1 || sat2 then FPSR.QC = '1';
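
The two saturation points in this operation (the doubled product, then the accumulate) can be sketched for a single lane in Python. The names `signed_sat_q` and `vqdml_long_lane` are hypothetical, and the `add` parameter mirrors the `add` variable of the decode pseudocode, so the same sketch covers the subtracting form.

```python
def signed_sat_q(value, nbits):
    """Clamp value to the signed nbits range and report whether clamping occurred."""
    lo, hi = -(1 << (nbits - 1)), (1 << (nbits - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def vqdml_long_lane(acc, op1, op2, esize, add=True):
    """Model one lane: acc is a 2*esize-bit signed value, op1/op2 are esize-bit."""
    # The doubled product saturates only when op1 == op2 == -(2**(esize-1)).
    product, sat1 = signed_sat_q(2 * op1 * op2, 2 * esize)
    result = acc + product if add else acc - product
    # The accumulate step saturates independently of the product step.
    result, sat2 = signed_sat_q(result, 2 * esize)
    return result, sat1 or sat2
```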


F6.1.155 VQDMLSL 


Vector Saturating Doubling Multiply Subtract Long multiplies corresponding elements in two doubleword vectors, 
subtracts double the products from corresponding elements of a quadword vector, and places the results in the same 
quadword vector. 


The second operand can be a scalar instead of a vector. For more information about scalars see Advanced SIMD 
scalars on page F2-2432. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11:10 | 9      | 8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 1  | D  | size  | Vn    | Vd    | 10    | 1 (op) | 1 | N | 0 | M | 0 | Vm  |
A1 variant


VQDMLSL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 

if size == '00' || Vd<0> == '1' then UNDEFINED;

add = (op == '0'); 

scalar_form = FALSE; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 
esize = 8 << UInt(size); elements = 64 DIV esize; 


A2 
| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11 | 10     | 9:8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 1  | D  | size  | Vn    | Vd    | 0  | 1 (op) | 11  | N | 1 | M | 0 | Vm  |
A2 variant 


VQDMLSL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>[<index>] 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0');
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn);
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1 
| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11:10 | 9      | 8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 1 | D | size | Vn  | Vd    | 10    | 1 (op) | 1 | N | 0 | M | 0 | Vm  |


T1 variant 


VQDMLSL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0');
scalar_form = FALSE; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
esize = 8 << UInt(size); elements = 64 DIV esize;


T2 


| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11 | 10     | 9:8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 1 | D | size | Vn  | Vd    | 0  | 1 (op) | 11  | N | 1 | M | 0 | Vm  |


T2 variant 


VQDMLSL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>[<index>] 


Decode for this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
add = (op == '0');
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn);
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);


Notes for all encodings 

Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 

Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the
following values:
S16 when size = 01
S32 when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> For encoding A1 and T1: is the 64-bit name of the second SIMD&FP source register, encoded in
the "M:Vm" field.

For encoding A2 and T2: is the 64-bit name of the second SIMD&FP source register, encoded in
the "Vm<2:0>" field when <dt> is S16, otherwise the "Vm" field.

<index> Is the element index in the range 0 to 3, encoded in the "M:Vm<3>" field when <dt> is S16,
otherwise in range 0 to 1, encoded in the "M" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if scalar_form then op2 = SInt(Elem[Din[m],index,esize]);
    for e = 0 to elements-1
        if !scalar_form then op2 = SInt(Elem[Din[m],e,esize]);
        op1 = SInt(Elem[Din[n],e,esize]);
        // The following only saturates if both op1 and op2 equal -(2^(esize-1))
        (product, sat1) = SignedSatQ(2*op1*op2, 2*esize);
        if add then
            result = SInt(Elem[Qin[d>>1],e,2*esize]) + SInt(product);
        else
            result = SInt(Elem[Qin[d>>1],e,2*esize]) - SInt(product);
        (Elem[Q[d>>1],e,2*esize], sat2) = SignedSatQ(result, 2*esize);
        if sat1 || sat2 then FPSR.QC = '1';


F6.1.156 VQDMULH 


Vector Saturating Doubling Multiply Returning High Half multiplies corresponding elements in two vectors,
doubles the results, and places the most significant half of the final results in the destination vector. The results are
truncated. For rounded results see VQRDMULH.


The second operand can be a scalar instead of a vector. For more information about scalars see Advanced SIMD 
scalars on page F2-2432. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 0  | D  | size  | Vn    | Vd    | 1011 | N | Q | M | 0 | Vm  |


64-bit SIMD vector variant
Applies when Q == 0.

VQDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.

VQDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm>


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '00' || size == '11' then UNDEFINED;
scalar_form = FALSE; esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


A2 
| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | Q  | 1  | D  | size  | Vn    | Vd    | 1100 | N | 1 | M | 0 | Vm  |


64-bit SIMD vector variant 

Applies when Q == 0. 

VQDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm[x]> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VQDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm[x]> 


Decode for all variants of this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1 
| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 0 | D | size | Vn  | Vd    | 1011 | N | Q | M | 0 | Vm  |


64-bit SIMD vector variant
Applies when Q == 0.

VQDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.

VQDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm>


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '00' || size == '11' then UNDEFINED;
scalar_form = FALSE; esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T2 


| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | Q  | 1111 | 1 | D | size | Vn  | Vd    | 1100 | N | 1 | M | 0 | Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VQDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm[x]> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VQDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm[x]> 


Decode for all variants of this encoding 


if size == '11' then SEE "Related encodings";
if size == '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);


Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> For encoding A1 and T1: is the data type for the elements of the operands, encoded in the "size"
field. It can have the following values:
S16 when size = 01
S32 when size = 10

For encoding A2 and T2: is the data type for the elements of the operands, encoded in the "size"
field. It can have the following values:
S16 when size = 01
S32 when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]> The scalar for either a quadword or a doubleword scalar operation. If <dt> is S16, Dm is restricted to
D0-D7. If <dt> is S32, Dm is restricted to D0-D15.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if scalar_form then op2 = SInt(Elem[D[m],index,esize]);
    for r = 0 to regs-1
        for e = 0 to elements-1
            if !scalar_form then op2 = SInt(Elem[D[m+r],e,esize]);
            op1 = SInt(Elem[D[n+r],e,esize]);
            // The following only saturates if both op1 and op2 equal -(2^(esize-1))
            (result, sat) = SignedSatQ((2*op1*op2) >> esize, esize);
            Elem[D[d+r],e,esize] = result;
            if sat then FPSR.QC = '1';
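
As an illustration of the "doubling, high half" step, the following Python sketch models a single lane. The function name `vqdmulh_lane` is hypothetical. The sketch relies on Python's `>>` being a flooring shift for negative integers, which is assumed here to match the truncating behavior described for this instruction.

```python
def vqdmulh_lane(op1, op2, esize):
    """Model one lane: return (result, saturated) for signed esize-bit inputs."""
    lo, hi = -(1 << (esize - 1)), (1 << (esize - 1)) - 1
    # Double the full-width product, then keep the most significant half.
    high = (2 * op1 * op2) >> esize
    result = max(lo, min(hi, high))
    # Saturation occurs only when both inputs equal -(2**(esize-1)).
    return result, result != high
```

Read as Q15 fixed point, `vqdmulh_lane(16384, 16384, 16)` multiplies 0.5 by 0.5 and would return 8192, which is 0.25.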


F6.1.157  VQDMULL 


Vector Saturating Doubling Multiply Long multiplies corresponding elements in two doubleword vectors, doubles 
the products, and places the results in a quadword vector. 


The second operand can be a scalar instead of a vector. For more information about scalars see Advanced SIMD 
scalars on page F2-2432. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 1  | D  | size  | Vn    | Vd    | 1101 | N | 0 | M | 0 | Vm  |

A1 variant


VQDMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 

if size == '00' || Vd<0> == '1' then UNDEFINED;

scalar_form = FALSE; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 
esize = 8 << UInt(size); elements = 64 DIV esize; 


A2 
| 31:25   | 24 | 23 | 22 | 21:20 | 19:16 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 1111001 | 0  | 1  | D  | size  | Vn    | Vd    | 1011 | N | 1 | M | 0 | Vm  |
A2 variant 


VQDMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 

if size == '00' || Vd<0> == '1' then UNDEFINED;
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn);
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
T1 
| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 1 | D | size | Vn  | Vd    | 1101 | N | 0 | M | 0 | Vm  |


T1 variant 


VQDMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 

if size == '00' || Vd<0> == '1' then UNDEFINED;

scalar_form = FALSE; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 
esize = 8 << UInt(size); elements = 64 DIV esize; 


T2 


| 15:13 | 12 | 11:8 | 7 | 6 | 5:4  | 3:0 | 15:12 | 11:8 | 7 | 6 | 5 | 4 | 3:0 |
| 111   | 0  | 1111 | 1 | D | size | Vn  | Vd    | 1011 | N | 1 | M | 0 | Vm  |


T2 variant 


VQDMULL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]> 


Decode for this encoding 


if size == '11' then SEE "Related encodings"; 

if size == '00' || Vd<0> == '1' then UNDEFINED;
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn);
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);


Notes for all encodings 

Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 

Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the
following values:
S16 when size = 01
S32 when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]> The scalar for a scalar operation. If <dt> is S16, Dm is restricted to D0-D7. If <dt> is S32, Dm is restricted
to D0-D15.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
F6-3596 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 


Non-Confidential ID092916 




Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if scalar_form then op2 = SInt(Elem[Din[m],index,esize]);
    for e = 0 to elements-1
        if !scalar_form then op2 = SInt(Elem[Din[m],e,esize]);
        op1 = SInt(Elem[Din[n],e,esize]);
        // The following only saturates if both op1 and op2 equal -(2^(esize-1))
        (product, sat) = SignedSatQ(2*op1*op2, 2*esize);
        Elem[Q[d>>1],e,2*esize] = product;
        if sat then FPSR.QC = '1';
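
The difference between the vector form and the scalar form of this operation can be sketched in Python. The function name `vqdmull_lanes` and its `index` parameter are hypothetical: passing `index` models the scalar form, where one element of the second operand is reused for every lane.

```python
def vqdmull_lanes(dn, dm, esize, index=None):
    """Model the widening doubling multiply over lists of signed esize-bit lanes.

    Returns (results, qc); each result is a signed value of width 2*esize.
    """
    wide = 2 * esize
    lo, hi = -(1 << (wide - 1)), (1 << (wide - 1)) - 1
    results, qc = [], False
    for e, op1 in enumerate(dn):
        # Scalar form: the same element dm[index] is used for every lane.
        op2 = dm[index] if index is not None else dm[e]
        product = 2 * op1 * op2
        clamped = max(lo, min(hi, product))
        results.append(clamped)
        qc = qc or clamped != product
    return results, qc
```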
F6.1.158 VQMOVN, VQMOVUN


Vector Saturating Move and Narrow copies each element of the operand vector to the corresponding element of the 
destination vector. 


The operand is a quadword vector. The elements can be any one of: 
• 16-bit, 32-bit, or 64-bit signed integers.
• 16-bit, 32-bit, or 64-bit unsigned integers.


The result is a doubleword vector. The elements are half the length of the operand vector elements. If the operand 
is unsigned, the results are unsigned. If the operand is signed, the results can be signed or unsigned. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


This instruction is used by the pseudo-instructions VQRSHRN (zero), VQRSHRUN (zero), VQSHRN (zero), and 
VQSHRUN (zero). The pseudo-instruction is never the preferred disassembly. 


A1 


| 31:24    | 23 | 22 | 21:20 | 19:18 | 17:16 | 15:12 | 11:8 | 7:6 | 5 | 4 | 3:0 |
| 11110011 | 1  | D  | 11    | size  | 10    | Vd    | 0010 | op  | M | 0 | Vm  |


Signed result variant 
Applies when op == 1x. 


VQMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


Unsigned result variant 
Applies when op == 01.


VQMOVUN{<c>}{<q>}.<dt> <Dd>, <Qm> 


Decode for all variants of this encoding 

if op == '00' then SEE VMOVN;
if size == '11' || Vm<0> == '1' then UNDEFINED;
src_unsigned = (op == '11'); dest_unsigned = (op<0> == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm);


T1 


| 15:8     | 7 | 6 | 5:4 | 3:2  | 1:0 | 15:12 | 11:8 | 7:6 | 5 | 4 | 3:0 |
| 11111111 | 1 | D | 11  | size | 10  | Vd    | 0010 | op  | M | 0 | Vm  |


Signed result variant 
Applies when op == 1x. 


VQMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 
Unsigned result variant
Applies when op == 01. 


VQMOVUN{<c>}{<q>}.<dt> <Dd>, <Qm> 


Decode for all variants of this encoding 


if op == '00' then SEE VMOVN;
if size == '11' || Vm<0> == '1' then UNDEFINED;
src_unsigned = (op == '11'); dest_unsigned = (op<0> == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm);


Alias conditions

Pseudo-instruction      Is preferred when
VQRSHRN (zero)          Never
VQRSHRUN (zero)         Never
VQSHRN (zero)           Never
VQSHRUN (zero)          Never

Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> For the signed result variant: is the data type for the elements of the operand, encoded in the
"op<0>:size" field. It can have the following values:
S16 when op<0> = 0, size = 00
S32 when op<0> = 0, size = 01
S64 when op<0> = 0, size = 10
U16 when op<0> = 1, size = 00
U32 when op<0> = 1, size = 01
U64 when op<0> = 1, size = 10

The following encodings are reserved:
• op<0> = 0, size = 11.
• op<0> = 1, size = 11.

For the unsigned result variant: is the data type for the elements of the operand, encoded in the "size"
field. It can have the following values:
S16 when size = 00
S32 when size = 01
S64 when size = 10

The encoding size = 11 is reserved.





<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        operand = Int(Elem[Qin[m>>1],e,2*esize], src_unsigned);
        (Elem[D[d],e,esize], sat) = SatQ(operand, esize, dest_unsigned);
        if sat then FPSR.QC = '1';
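
A minimal Python sketch of the narrowing step follows. The function name `saturating_narrow` is hypothetical, and the source lanes are assumed to have already been interpreted as signed or unsigned integers, which corresponds to the `Int(..., src_unsigned)` step. As in the pseudocode, `esize` is the size of the destination elements.

```python
def saturating_narrow(values, esize, dest_unsigned):
    """Clamp each source value into an esize-bit destination lane.

    Returns (results, qc), where qc models the cumulative saturation flag.
    """
    if dest_unsigned:
        lo, hi = 0, (1 << esize) - 1
    else:
        lo, hi = -(1 << (esize - 1)), (1 << (esize - 1)) - 1
    results, qc = [], False
    for v in values:
        clamped = max(lo, min(hi, v))
        results.append(clamped)
        qc = qc or clamped != v
    return results, qc
```

Under these assumptions, a signed source with `dest_unsigned=True` models the unsigned result variant: negative inputs would clamp to 0.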







F6.1.159 VQNEG 
Vector Saturating Negate negates each element in a vector, and places the results in the destination vector. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


| 31:24    | 23 | 22 | 21:20 | 19:18 | 17:16 | 15:12 | 11 | 10:7 | 6 | 5 | 4 | 3:0 |
| 11110011 | 1  | D  | 11    | size  | 00    | Vd    | 0  | 1111 | Q | M | 0 | Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VQNEG{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VQNEG{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


| 15:8     | 7 | 6 | 5:4 | 3:2  | 1:0 | 15:12 | 11 | 10:7 | 6 | 5 | 4 | 3:0 |
| 11111111 | 1 | D | 11  | size | 00  | Vd    | 0  | 1111 | Q | M | 0 | Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VQNEG{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VQNEG{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following 
values: 
S8 when size = 00 
S16 when size = 01
S32 when size = 10


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            result = -SInt(Elem[D[m+r],e,esize]);
            (Elem[D[d+r],e,esize], sat) = SignedSatQ(result, esize);
            if sat then FPSR.QC = '1';
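
The only input that can saturate here is the most negative value, because its negation is not representable in the same width. A short Python sketch of one lane, using the hypothetical name `vqneg_lane`:

```python
def vqneg_lane(value, esize):
    """Negate a signed esize-bit value with saturation; return (result, saturated)."""
    hi = (1 << (esize - 1)) - 1
    negated = -value
    # -(-(2**(esize-1))) exceeds the maximum, so it clamps to hi.
    result = min(hi, negated)
    return result, result != negated
```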
F6.1.160 VQRDMULH


Vector Saturating Rounding Doubling Multiply Returning High Half multiplies corresponding elements in two 
vectors, doubles the results, and places the most significant half of the final results in the destination vector. The 
results are rounded. For truncated results see VQDMULH. 


The second operand can be a scalar instead of a vector. For more information about scalars see Advanced SIMD 
scalars on page F2-2432. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 0 D size | Vn | Vd | 1 0 1 1 | N Q M 0 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQRDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.

VQRDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '00' || size == '11' then UNDEFINED;
scalar_form = FALSE; esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


A2 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 Q | 1 D size | Vn | Vd | 1 1 0 1 | N 1 M 0 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQRDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm[x]>

128-bit SIMD vector variant
Applies when Q == 1.

VQRDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm[x]>

Decode for all variants of this encoding

if size == '11' then SEE "Related encodings";
if size == '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);

T1
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 0 D size | Vn | Vd | 1 0 1 1 | N Q M 0 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQRDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.

VQRDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '00' || size == '11' then UNDEFINED;
scalar_form = FALSE; esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T2 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 Q | 1 1 1 1 | 1 D size | Vn | Vd | 1 1 0 1 | N 1 M 0 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQRDMULH{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm[x]>

128-bit SIMD vector variant
Applies when Q == 1.

VQRDMULH{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm[x]>

Decode for all variants of this encoding

if size == '11' then SEE "Related encodings";
if size == '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
scalar_form = TRUE; d = UInt(D:Vd); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);







Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 


Assembler symbols

<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.

For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> For encoding A1 and T1: is the data type for the elements of the operands, encoded in the "size"
field. It can have the following values:
S16 when size = 01
S32 when size = 10

For encoding A2 and T2: is the data type for the elements of the operands, encoded in the "size"
field. It can have the following values:
S16 when size = 01
S32 when size = 10

The following encodings are reserved:
• size = 00.
• size = 11.

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]> The scalar for either a quadword or a doubleword scalar operation. If <dt> is S16, Dm is restricted to
D0-D7. If <dt> is S32, Dm is restricted to D0-D15.
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    round_const = 1 << (esize-1);
    if scalar_form then op2 = SInt(Elem[D[m],index,esize]);
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = SInt(Elem[D[n+r],e,esize]);
            if !scalar_form then op2 = SInt(Elem[D[m+r],e,esize]);
            (result, sat) = SignedSatQ((2*op1*op2 + round_const) >> esize, esize);
            Elem[D[d+r],e,esize] = result;
            if sat then FPSR.QC = '1';
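
As an illustration only, the arithmetic in the operation pseudocode above can be sketched in Python for plain lists of signed integers. The names signed_sat_q, vqrdmulh_element, and vqrdmulh are hypothetical stand-ins chosen for this sketch and are not part of the architecture pseudocode. Register access, the enable checks, and the FPSR/FPSCR update are reduced to a returned flag.

```python
def signed_sat_q(value, bits):
    """Clamp value to the signed range of `bits` bits; return (result, saturated)."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = min(max(value, lo), hi)
    return clamped, clamped != value


def vqrdmulh_element(op1, op2, esize):
    """Rounded, doubled product of two signed elements, keeping the high half."""
    round_const = 1 << (esize - 1)
    return signed_sat_q((2 * op1 * op2 + round_const) >> esize, esize)


def vqrdmulh(vn, vm, esize, scalar=None):
    """Element-wise form; when `scalar` is given it replaces every element of vm."""
    results, qc = [], False
    for i, a in enumerate(vn):
        b = scalar if scalar is not None else vm[i]
        r, sat = vqrdmulh_element(a, b, esize)
        results.append(r)
        qc = qc or sat  # models the cumulative saturation flag
    return results, qc


# 0.5 * 0.5 in Q15, the single overflowing case (-1.0 * -1.0), and a rounded-up result.
print(vqrdmulh([16384, -32768, 1], [16384, -32768, 16384], 16))
# ([8192, 32767, 1], True)
```

The only input pair that saturates is the most negative value multiplied by itself, because doubling that product exceeds the largest positive element value.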

F6.1.161 VQRSHL


Vector Saturating Rounding Shift Left takes each element in a vector, shifts them by a value from the least 
significant byte of the corresponding element of a second vector, and places the results in the destination vector. If 
the shift value is positive, the operation is a left shift. Otherwise, it is a right shift. 


For truncated results see VQSHL (register). 

The first operand and result elements are the same data type, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.

The second operand is a signed integer of the same size. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 0 D size | Vn | Vd | 0 1 0 1 | N Q M 1 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQRSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn>

128-bit SIMD vector variant
Applies when Q == 1.

VQRSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;


T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 0 D size | Vn | Vd | 0 1 0 1 | N Q M 1 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQRSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn>

128-bit SIMD vector variant
Applies when Q == 1.

VQRSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
S64 when U = 0, size = 11
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
U64 when U = 1, size = 11
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            shift = SInt(Elem[D[n+r],e,esize]<7:0>);
            round_const = 1 << (-1-shift); // 0 for left shift, 2^(n-1) for right shift
            operand = Int(Elem[D[m+r],e,esize], unsigned);
            (result, sat) = SatQ((operand + round_const) << shift, esize, unsigned);
            Elem[D[d+r],e,esize] = result;
            if sat then FPSR.QC = '1';
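
The per-element behaviour described above, where the sign of the shift byte selects between a left shift and a rounding right shift, might be modelled as in the following Python sketch. The function names sat_q and vqrshl_element are assumptions made for this illustration. The sketch also assumes that a left shift by a negative amount in the pseudocode behaves as a right shift that rounds towards minus infinity, which is how Python's >> behaves on integers.

```python
def sat_q(value, bits, unsigned):
    """Clamp value to the signed or unsigned range of `bits` bits."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = min(max(value, lo), hi)
    return clamped, clamped != value


def vqrshl_element(operand, shift_elem, esize, unsigned):
    """Saturating rounding shift of one element by a signed per-element shift."""
    # The shift is the least significant byte of the second operand, read as signed.
    shift = shift_elem & 0xFF
    if shift >= 0x80:
        shift -= 0x100
    if shift >= 0:
        value = operand << shift                  # left shift, round_const is 0
    else:
        n = -shift
        value = (operand + (1 << (n - 1))) >> n   # rounding right shift
    return sat_q(value, esize, unsigned)


print(vqrshl_element(100, 1, 8, False))    # (127, True): 200 saturates as S8
print(vqrshl_element(5, 0xFF, 8, False))   # (3, False): 5/2 rounds up to 3
```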

F6.1.162 VQRSHRN (zero)

Vector Saturating Rounding Shift Right, Narrow takes each element in a quadword vector of integers, right shifts
them by an immediate value, and places the signed rounded results in a doubleword vector.

This instruction is a pseudo-instruction of the VQMOVN, VQMOVUN instruction. This means that:
• The encodings in this description are named to match the encodings of VQMOVN, VQMOVUN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction.

A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 0 | 1 x M 0 | Vm |
op (bits 7:6)
Signed result variant 
VQRSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 
VQMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 
and is never the preferred disassembly. 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 0 | 1 x M 0 | Vm |
op (bits 7:6)
Signed result variant 
VQRSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 
VQMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 
and is never the preferred disassembly. 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operand, encoded in the "op<0>:size" field. It can have the 
following values: 
S16 when op<0> = 0, size = 00
S32 when op<0> = 0, size = 01
S64 when op<0> = 0, size = 10
U16 when op<0> = 1, size = 00
U32 when op<0> = 1, size = 01
U64 when op<0> = 1, size = 10
The following encodings are reserved:
• op<0> = 0, size = 11.
• op<0> = 1, size = 11.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction. 

F6.1.163 VQRSHRN, VQRSHRUN


Vector Saturating Rounding Shift Right, Narrow takes each element in a quadword vector of integers, right shifts 
them by an immediate value, and places the rounded results in a doubleword vector. 


For truncated results, see VQSHRN, VQSHRUN.

The operand elements must all be the same size, and can be any one of: 
• 16-bit, 32-bit, or 64-bit signed integers.
• 16-bit, 32-bit, or 64-bit unsigned integers.


The result elements are half the width of the operand elements. If the operand elements are signed, the results can 
be either signed or unsigned. If the operand elements are unsigned, the result elements must also be unsigned. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22|21 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 1 D | imm6 | Vd | 1 0 0 op | 0 1 M 1 | Vm |

Signed result variant
Applies when !(imm6 == 000xxx) && op == 1.

VQRSHRN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm>

Unsigned result variant
Applies when U == 1 && !(imm6 == 000xxx) && op == 0.

VQRSHRUN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm>

Decode for all variants of this encoding

if imm6 == '000xxx' then SEE "Related encodings";
if U == '0' && op == '0' then SEE VRSHRN;
if Vm<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
src_unsigned = (U == '1' && op == '1'); dest_unsigned = (U == '1');
d = UInt(D:Vd); m = UInt(M:Vm);
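
The case statement above packs both the result element size and the shift amount into imm6. A minimal Python sketch of that mapping follows. The function names decode_narrow_imm6 and encode_narrow_imm6 are hypothetical, and the encode direction is simply the decode arithmetic inverted rather than a rule taken from elsewhere in this manual.

```python
def decode_narrow_imm6(imm6):
    """Return (esize, elements, shift_amount) for a 6-bit imm6 field value."""
    if not 0 <= imm6 < 64:
        raise ValueError("imm6 is a 6-bit field")
    if imm6 >> 3 == 0:
        raise ValueError("imm6 == 000xxx selects a related encoding")
    if imm6 & 0b100000:                 # '1xxxxx'
        return 32, 2, 64 - imm6
    if imm6 & 0b010000:                 # '01xxxx'
        return 16, 4, 32 - imm6
    return 8, 8, 16 - imm6              # '001xxx'


def encode_narrow_imm6(esize, shift_amount):
    """Inverse of the decode above: imm6 = 2*esize - shift_amount."""
    if esize not in (8, 16, 32) or not 1 <= shift_amount <= esize:
        raise ValueError("shift_amount must be in the range 1 to esize")
    return 2 * esize - shift_amount


print(decode_narrow_imm6(0b001111))  # (8, 8, 1)
print(encode_narrow_imm6(16, 16))    # 16, that is '010000'
```

Here esize is the width of the narrowed result elements, so the source elements are 2*esize bits wide.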


T1 
|15 14 13 12|11 10 9 8|7 6|5 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 1 D | imm6 | Vd | 1 0 0 op | 0 1 M 1 | Vm |

Signed result variant
Applies when !(imm6 == 000xxx) && op == 1.

VQRSHRN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm>

Unsigned result variant
Applies when U == 1 && !(imm6 == 000xxx) && op == 0.

VQRSHRUN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm>

Decode for all variants of this encoding

if imm6 == '000xxx' then SEE "Related encodings";
if U == '0' && op == '0' then SEE VRSHRN;
if Vm<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
src_unsigned = (U == '1' && op == '1'); dest_unsigned = (U == '1');
d = UInt(D:Vd); m = UInt(M:Vm);


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction
set.


Assembler symbols

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<type> For the signed result variant: is the data type for the elements of the vectors, encoded in the "U"
field. It can have the following values:
S when U = 0
U when U = 1

For the unsigned result variant: is the data type for the elements of the vectors, encoded in the "U"
field. It can have the following values:
S when U = 1

<size> Is the data size for the elements of the vectors, encoded in the "imm6<5:3>" field. It can have the
following values:
16 when imm6<5:3> = 001
32 when imm6<5:3> = 01x
64 when imm6<5:3> = 1xx

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<imm> Is an immediate value, in the range 1 to <size>/2, encoded in the "imm6" field as <size>/2 - <imm>.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    round_const = 1 << (shift_amount - 1);
    for e = 0 to elements-1
        operand = Int(Elem[Qin[m>>1],e,2*esize], src_unsigned);
        (result, sat) = SatQ((operand + round_const) >> shift_amount, esize, dest_unsigned);
        Elem[D[d],e,esize] = result;
        if sat then FPSR.QC = '1';
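
The rounding, shifting, and narrowing performed on each element can be sketched in Python as follows. This is an illustrative model under the assumption that the operand has already been read as a signed or unsigned integer (the role of src_unsigned in the pseudocode). The names sat_q and vqrshrn_element are hypothetical.

```python
def sat_q(value, bits, unsigned):
    """Clamp value to the signed or unsigned range of `bits` bits."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = min(max(value, lo), hi)
    return clamped, clamped != value


def vqrshrn_element(operand, shift_amount, esize, dest_unsigned):
    """Rounding right shift of a 2*esize-bit operand, saturated to esize bits."""
    round_const = 1 << (shift_amount - 1)
    return sat_q((operand + round_const) >> shift_amount, esize, dest_unsigned)


print(vqrshrn_element(1000, 4, 8, False))    # (63, False)
print(vqrshrn_element(0x7FFF, 8, 8, False))  # (127, True)
print(vqrshrn_element(-100, 1, 8, True))     # (0, True): signed source, unsigned result
```

The last call corresponds to the unsigned result variant, where a negative source element saturates to zero.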

F6.1.164 VQRSHRUN (zero)

Vector Saturating Rounding Shift Right, Narrow takes each element in a quadword vector of integers, right shifts
them by an immediate value, and places the unsigned rounded results in a doubleword vector.

This instruction is a pseudo-instruction of the VQMOVN, VQMOVUN instruction. This means that:
• The encodings in this description are named to match the encodings of VQMOVN, VQMOVUN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction.

A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 0 | 0 1 M 0 | Vm |
op (bits 7:6)


Unsigned result variant 
VQRSHRUN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VQMOVUN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 0 | 0 1 M 0 | Vm |
op (bits 7:6)


Unsigned result variant 
VQRSHRUN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VQMOVUN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
S16 when size = 00
S32 when size = 01
S64 when size = 10

The encoding size = 11 is reserved.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction.

F6.1.165 VQSHL, VQSHLU (immediate)


Vector Saturating Shift Left (immediate) takes each element in a vector of integers, left shifts them by an immediate 
value, and places the results in a second vector. 


The operand elements must all be the same size, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.


The result elements are the same size as the operand elements. If the operand elements are signed, the results can 
be either signed or unsigned. If the operand elements are unsigned, the result elements must also be unsigned. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22|21 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 1 D | imm6 | Vd | 0 1 1 op | L Q M 1 | Vm |

VQSHL, double, signed-result variant
Applies when !(imm6 == 000xxx && L == 0) && op == 1 && Q == 0.

VQSHL{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>

VQSHL, quad, signed-result variant
Applies when !(imm6 == 000xxx && L == 0) && op == 1 && Q == 1.

VQSHL{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>

VQSHLU, double, unsigned-result variant
Applies when U == 1 && !(imm6 == 000xxx && L == 0) && op == 0 && Q == 0.

VQSHLU{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>

VQSHLU, quad, unsigned-result variant
Applies when U == 1 && !(imm6 == 000xxx && L == 0) && op == 0 && Q == 1.

VQSHLU{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>

Decode for all variants of this encoding

if (L:imm6) == '0000xxx' then SEE "Related encodings";
if U == '0' && op == '0' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = UInt(imm6) - 8;
    when '001xxxx' esize = 16; elements = 4; shift_amount = UInt(imm6) - 16;
    when '01xxxxx' esize = 32; elements = 2; shift_amount = UInt(imm6) - 32;
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = UInt(imm6);
src_unsigned = (U == '1' && op == '1'); dest_unsigned = (U == '1');
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
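
For the left-shift form, the L:imm6 field selects the element size and the shift amount as shown in the case statement above. The following Python sketch mirrors that mapping. The function name decode_shift_left_imm and its argument layout are assumptions made for illustration.

```python
def decode_shift_left_imm(l_bit, imm6):
    """Return (esize, elements, shift_amount) for the L and imm6 field values."""
    if l_bit not in (0, 1) or not 0 <= imm6 < 64:
        raise ValueError("L is a 1-bit field and imm6 is a 6-bit field")
    if l_bit == 1:                      # '1xxxxxx'
        return 64, 1, imm6
    if imm6 & 0b100000:                 # '01xxxxx'
        return 32, 2, imm6 - 32
    if imm6 & 0b010000:                 # '001xxxx'
        return 16, 4, imm6 - 16
    if imm6 & 0b001000:                 # '0001xxx'
        return 8, 8, imm6 - 8
    raise ValueError("L:imm6 == 0000xxx selects a related encoding")


print(decode_shift_left_imm(0, 0b001011))  # (8, 8, 3)
print(decode_shift_left_imm(1, 5))         # (64, 1, 5)
```

In each case the resulting shift amount lies in the range 0 to esize-1, which matches the range given for <imm>.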

T1

|15 14 13 12|11 10 9 8|7 6|5 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 1 D | imm6 | Vd | 0 1 1 op | L Q M 1 | Vm |

VQSHL, double, signed-result variant
Applies when !(imm6 == 000xxx && L == 0) && op == 1 && Q == 0.

VQSHL{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>

VQSHL, quad, signed-result variant
Applies when !(imm6 == 000xxx && L == 0) && op == 1 && Q == 1.

VQSHL{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>

VQSHLU, double, unsigned-result variant
Applies when U == 1 && !(imm6 == 000xxx && L == 0) && op == 0 && Q == 0.

VQSHLU{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>

VQSHLU, quad, unsigned-result variant
Applies when U == 1 && !(imm6 == 000xxx && L == 0) && op == 0 && Q == 1.

VQSHLU{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>

Decode for all variants of this encoding

if (L:imm6) == '0000xxx' then SEE "Related encodings";
if U == '0' && op == '0' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = UInt(imm6) - 8;
    when '001xxxx' esize = 16; elements = 4; shift_amount = UInt(imm6) - 16;
    when '01xxxxx' esize = 32; elements = 2; shift_amount = UInt(imm6) - 32;
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = UInt(imm6);
src_unsigned = (U == '1' && op == '1'); dest_unsigned = (U == '1');
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<type> Is the data type for the elements of the vectors, encoded in the "U" field. It can have the following
values:
S when U = 0
U when U = 1

<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the
following values:
8 when L = 0, imm6<5:3> = 001
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx
64 when L = 1, imm6<5:3> = xxx

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.
<imm> Is an immediate value, in the range 0 to <size>-1, encoded in the "imm6" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            operand = Int(Elem[D[m+r],e,esize], src_unsigned);
            (result, sat) = SatQ(operand << shift_amount, esize, dest_unsigned);
            Elem[D[d+r],e,esize] = result;
            if sat then FPSR.QC = '1';
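
A short Python sketch of the element operation above shows how the signed and unsigned result forms differ only in the saturation range. The names sat_q and vqshl_imm_element are hypothetical, and the operand is assumed to have already been interpreted according to src_unsigned.

```python
def sat_q(value, bits, unsigned):
    """Clamp value to the signed or unsigned range of `bits` bits."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = min(max(value, lo), hi)
    return clamped, clamped != value


def vqshl_imm_element(operand, shift_amount, esize, dest_unsigned):
    """Left shift one element by an immediate, saturating to esize bits."""
    return sat_q(operand << shift_amount, esize, dest_unsigned)


print(vqshl_imm_element(64, 1, 8, False))  # (127, True): signed source and result
print(vqshl_imm_element(-1, 3, 8, True))   # (0, True): signed source, unsigned result
print(vqshl_imm_element(3, 2, 8, False))   # (12, False)
```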

F6.1.166 VQSHL (register)

Vector Saturating Shift Left (register) takes each element in a vector, shifts them by a value from the least significant
byte of the corresponding element of a second vector, and places the results in the destination vector. If the shift
value is positive, the operation is a left shift. Otherwise, it is a right shift.

The results are truncated. For rounded results, see VQRSHL.

The first operand and result elements are the same data type, and can be any one of:
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.

The second operand is a signed integer of the same size.

If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation
occurs. For details see Pseudocode description of saturation on page E1-2291.

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For
more information see Enabling Advanced SIMD and floating-point support on page G1-3881.

A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 U | 0 D size | Vn | Vd | 0 1 0 0 | N Q M 1 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn>

128-bit SIMD vector variant
Applies when Q == 1.

VQSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;

T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 U | 1 1 1 1 | 0 D size | Vn | Vd | 0 1 0 0 | N Q M 1 | Vm |

64-bit SIMD vector variant
Applies when Q == 0.

VQSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn>

128-bit SIMD vector variant
Applies when Q == 1.

VQSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn>

Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
S64 when U = 0, size = 11
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
U64 when U = 1, size = 11
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            shift = SInt(Elem[D[n+r],e,esize]<7:0>);
            operand = Int(Elem[D[m+r],e,esize], unsigned);
            (result,sat) = SatQ(operand << shift, esize, unsigned);
            Elem[D[d+r],e,esize] = result;
            if sat then FPSR.QC = '1';
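
To show what "truncated" means here, the following Python sketch models one element of the operation above. It assumes that a left shift by a negative amount in the pseudocode is a right shift that rounds towards minus infinity, which matches Python's >> on integers. The names sat_q and vqshl_reg_element are hypothetical.

```python
def sat_q(value, bits, unsigned):
    """Clamp value to the signed or unsigned range of `bits` bits."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = min(max(value, lo), hi)
    return clamped, clamped != value


def vqshl_reg_element(operand, shift_elem, esize, unsigned):
    """Saturating shift of one element by a signed per-element shift, no rounding."""
    # The shift is the least significant byte of the second operand, read as signed.
    shift = shift_elem & 0xFF
    if shift >= 0x80:
        shift -= 0x100
    value = operand << shift if shift >= 0 else operand >> -shift
    return sat_q(value, esize, unsigned)


print(vqshl_reg_element(5, 0xFF, 8, False))   # (2, False): 5/2 truncates to 2
print(vqshl_reg_element(-5, 0xFF, 8, False))  # (-3, False): rounds towards minus infinity
print(vqshl_reg_element(100, 1, 8, False))    # (127, True)
```

With the same inputs, the rounding form VQRSHL would give 3 and -2 for the first two calls.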







F6.1.167 VQSHRN (zero)


Vector Saturating Shift Right, Narrow takes each element in a quadword vector of integers, right shifts them by an
immediate value, and places the signed truncated results in a doubleword vector.

This instruction is a pseudo-instruction of the VQMOVN, VQMOVUN instruction. This means that:
• The encodings in this description are named to match the encodings of VQMOVN, VQMOVUN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction.


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 0 0 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 0 | 1 x M 0 | Vm |
op (bits 7:6)


Signed result variant 
VQSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VQMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 
T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1 1 1 1 | 1 1 1 1 | 1 D 1 1 | size 1 0 | Vd | 0 0 1 0 | 1 x M 0 | Vm |
op (bits 7:6)


Signed result variant 
VQSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VQMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operand, encoded in the "op<0>:size" field. It can have the 
following values: 
S16 when op<0> = 0, size = 00
S32 when op<0> = 0, size = 01
S64 when op<0> = 0, size = 10
U16 when op<0> = 1, size = 00
U32 when op<0> = 1, size = 01
U64 when op<0> = 1, size = 10
The following encodings are reserved:
• op<0> = 0, size = 11.
• op<0> = 1, size = 11.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction. 










F6.1.168 VQSHRN, VQSHRUN 
Vector Saturating Shift Right, Narrow takes each element in a quadword vector of integers, right shifts them by an 
immediate value, and places the truncated results in a doubleword vector. 
For rounded results, see VQRSHRN, VQRSHRUN. 
The operand elements must all be the same size, and can be any one of: 
° 16-bit, 32-bit, or 64-bit signed integers. 
° 16-bit, 32-bit, or 64-bit unsigned integers. 
The result elements are half the width of the operand elements. If the operand elements are signed, the results can 
be either signed or unsigned. If the operand elements are unsigned, the result elements must also be unsigned. 
If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22|21    16|15 12|11 10 9  8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  U| 1  D|  imm6  |  Vd | 1  0 0 op|0 0 M 1| Vm|

Signed result variant 
Applies when !(imm6 == 000xxx) && op == 1. 
VQSHRN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm> 

Unsigned result variant 
Applies when U == 1 && !(imm6 == 000xxx) && op == 0. 
VQSHRUN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm> 

Decode for all variants of this encoding 
if imm6 == '000xxx' then SEE "Related encodings"; 
if U == '0' && op == '0' then SEE VSHRN; 
if Vm<0> == '1' then UNDEFINED; 
case imm6 of 
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6); 
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6); 
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6); 
src_unsigned = (U == '1' && op == '1'); dest_unsigned = (U == '1'); 
d = UInt(D:Vd); m = UInt(M:Vm); 
T1 
|15 14 13 12|11 10 9 8|7 6|5     0|15 12|11 10 9  8|7 6 5 4|3 0|
| 1  1  1  U| 1  1 1 1|1 D|  imm6 |  Vd | 1  0 0 op|0 0 M 1| Vm|

Signed result variant 
Applies when !(imm6 == 000xxx) && op == 1. 
VQSHRN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm> 

Unsigned result variant 
Applies when U == 1 && !(imm6 == 000xxx) && op == 0. 
VQSHRUN{<c>}{<q>}.<type><size> <Dd>, <Qm>, #<imm> 

Decode for all variants of this encoding 
if imm6 == '000xxx' then SEE "Related encodings"; 
if U == '0' && op == '0' then SEE VSHRN; 
if Vm<0> == '1' then UNDEFINED; 
case imm6 of 
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6); 
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6); 
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6); 
src_unsigned = (U == '1' && op == '1'); dest_unsigned = (U == '1'); 
d = UInt(D:Vd); m = UInt(M:Vm); 


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<type> For the signed result variant: is the data type for the elements of the vectors, encoded in the "U" 
field. It can have the following values: 
S when U = 0 
U when U = 1 
For the unsigned result variant: is the data type for the elements of the vectors, encoded in the "U" 
field. It can have the following values: 
S when U = 1 

<size> Is the data size for the elements of the vectors, encoded in the "imm6<5:3>" field. It can have the 
following values: 
16 when imm6<5:3> = 001 
32 when imm6<5:3> = 01x 
64 when imm6<5:3> = 1xx 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 

<imm> Is an immediate value, in the range 1 to <size>/2, encoded in the "imm6" field as <size>/2 - <imm>. 
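
The following Python sketch is illustrative only and is not part of the architecture definition. It mirrors the `case imm6 of` decode for VQSHRN and VQSHRUN, returning the result element size, the element count, and the shift amount. The function name `decode_vqshrn_imm6` is hypothetical, and the UNDEFINED and SEE checks on U, op, and Vm are deliberately left out.

```python
def decode_vqshrn_imm6(imm6: int) -> tuple[int, int, int]:
    """Return (esize, elements, shift_amount) for a 6-bit imm6 field value."""
    if not 0 <= imm6 < 64:
        raise ValueError("imm6 is a 6-bit field")
    if imm6 >> 3 == 0b000:
        # imm6 == '000xxx' is not this instruction (see "Related encodings").
        raise ValueError("imm6 == '000xxx' selects a related encoding")
    if imm6 >> 3 == 0b001:            # '001xxx': 16-bit operands, 8-bit results
        return 8, 8, 16 - imm6
    if imm6 >> 4 == 0b01:             # '01xxxx': 32-bit operands, 16-bit results
        return 16, 4, 32 - imm6
    return 32, 2, 64 - imm6           # '1xxxxx': 64-bit operands, 32-bit results
```

For example, under these assumptions a shift of 3 on 16-bit operands corresponds to imm6 = 0b001101, and `decode_vqshrn_imm6(0b001101)` returns `(8, 8, 3)`.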


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for e = 0 to elements-1 
        operand = Int(Elem[Qin[m>>1],e,2*esize], src_unsigned); 
        (result, sat) = SatQ(operand >> shift_amount, esize, dest_unsigned); 
        Elem[D[d],e,esize] = result; 
        if sat then FPSR.QC = '1'; 
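
The per-element behavior of the pseudocode above can be modeled with a short Python sketch. This is an illustration under stated assumptions, not the architectural definition: operands are passed as Python integers that have already been interpreted as signed or unsigned, and the names `sat_q` and `vqshrn_model` are hypothetical stand-ins for `SatQ()` and the element loop.

```python
def sat_q(value: int, bits: int, unsigned: bool) -> tuple[int, bool]:
    """Clamp value to the bits-wide range and report whether clamping occurred."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def vqshrn_model(operands, shift_amount, esize, dest_unsigned):
    """Shift each operand right, saturate to esize bits, and accumulate a QC flag."""
    results, qc = [], False
    for operand in operands:
        # Python's >> on integers is an arithmetic shift, matching the pseudocode.
        result, sat = sat_q(operand >> shift_amount, esize, dest_unsigned)
        results.append(result)
        qc = qc or sat
    return results, qc
```

With signed 16-bit inputs `[1000, -1000, 64]` and a shift of 2, the signed-result form saturates the first two elements to 127 and -128, while the unsigned-result form keeps 250 and saturates the negative element to 0.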
F6.1.169 VQSHRUN (zero) 

Vector Saturating Shift Right, Narrow takes each element in a quadword vector of integers, right shifts them by an 
immediate value, and places the unsigned truncated results in a doubleword vector. 

This instruction is a pseudo-instruction of the VQMOVN, VQMOVUN instruction. This means that: 
° The encodings in this description are named to match the encodings of VQMOVN, VQMOVUN. 
° The assembler syntax is used only for assembly, and is not used on disassembly. 
° The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction. 
A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  1| 1  D  1  1| size  1  0|  Vd | 0  0 1 0|0 1 M 0| Vm|
                                                                 op


Unsigned result variant 
VQSHRUN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VQMOVUN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4| 3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 1  1 1 1|1 D 1 1|size 1 0|  Vd | 0  0 1 0|0 1 M 0| Vm|
                                                        op


Unsigned result variant 
VQSHRUN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VQMOVUN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
S16 when size = 00 
S32 when size = 01 
S64 when size = 10 
The encoding size = 11 is reserved. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


The description of VQMOVN, VQMOVUN gives the operational pseudocode for this instruction. 

F6.1.170 VQSUB 


Vector Saturating Subtract subtracts the elements of the second operand vector from the corresponding elements of 
the first operand vector, and places the results in the destination vector. Signed and unsigned operations are distinct. 


The operand and result elements must all be the same type, and can be any one of: 
° 8-bit, 16-bit, 32-bit, or 64-bit signed integers. 
° 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers. 


If any of the results overflow, they are saturated. The cumulative saturation bit, FPSCR.QC, is set if saturation 
occurs. For details see Pseudocode description of saturation on page E1-2291. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  U| 0  D  size|  Vn |  Vd | 0  0 1 0|N Q M 1| Vm|


64-bit SIMD vector variant 

Applies when Q == 0. 

VQSUB{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm> 

128-bit SIMD vector variant 

Applies when Q == 1. 

VQSUB{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm> 

Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
unsigned = (U == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5  4|3  0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  U| 1  1 1 1|0 D size| Vn |  Vd | 0  0 1 0|N Q M 1| Vm|


64-bit SIMD vector variant 

Applies when Q == 0. 
VQSUB{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VQSUB{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm> 

Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
unsigned = (U == '1'); 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00 
S16 when U = 0, size = 01 
S32 when U = 0, size = 10 
S64 when U = 0, size = 11 
U8 when U = 1, size = 00 
U16 when U = 1, size = 01 
U32 when U = 1, size = 10 
U64 when U = 1, size = 11 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            diff = Int(Elem[D[n+r],e,esize], unsigned) - Int(Elem[D[m+r],e,esize], unsigned); 
            (Elem[D[d+r],e,esize], sat) = SatQ(diff, esize, unsigned); 
            if sat then FPSR.QC = '1'; 
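
As an illustration of why signed and unsigned operations are distinct, the following Python sketch models the element loop above. It is a sketch under stated assumptions rather than the architectural definition: elements are Python integers already interpreted with the chosen signedness, and `sat_q` and `vqsub_model` are hypothetical names.

```python
def sat_q(value: int, bits: int, unsigned: bool) -> tuple[int, bool]:
    """Clamp value to the bits-wide range and report whether clamping occurred."""
    if unsigned:
        lo, hi = 0, (1 << bits) - 1
    else:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def vqsub_model(op1, op2, esize, unsigned):
    """Subtract op2 from op1 element by element, saturating each difference."""
    results, qc = [], False
    for a, b in zip(op1, op2):
        result, sat = sat_q(a - b, esize, unsigned)
        results.append(result)
        qc = qc or sat      # models the cumulative saturation bit
    return results, qc
```

An unsigned 8-bit subtraction of 20 from 10 saturates to 0, whereas a signed 8-bit subtraction of 100 from -100 saturates to -128. In both cases the modeled saturation flag is set.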
F6.1.171 VRADDHN 


Vector Rounding Add and Narrow, returning High Half adds corresponding elements in two quadword vectors, and 
places the most significant half of each result in a doubleword vector. The results are rounded. For truncated results, 
see VADDHN. 


The operand elements can be 16-bit, 32-bit, or 64-bit integers. There is no distinction between signed and unsigned 
integers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 

|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  1| 1  D  size|  Vn |  Vd | 0  1 0 0|N 0 M 0| Vm|

A1 variant 

VRADDHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 

Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 


T1 


|15 14 13 12|11 10 9 8|7 6 5  4|3  0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 1  1 1 1|1 D size| Vn |  Vd | 0  1 0 0|N 0 M 0| Vm|

T1 variant 

VRADDHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 

Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); 
Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the 
following values: 
I16 when size = 00 
I32 when size = 01 
I64 when size = 10 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    round_const = 1 << (esize-1); 
    for e = 0 to elements-1 
        result = Elem[Qin[n>>1],e,2*esize] + Elem[Qin[m>>1],e,2*esize] + round_const; 
        Elem[D[d],e,esize] = result<2*esize-1:esize>; 
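
The effect of the rounding constant can be seen in a small Python sketch of the loop above. This is an illustration only: elements are passed as raw bit patterns of width 2*esize, and the name `vraddhn_model` is hypothetical.

```python
def vraddhn_model(op1, op2, esize):
    """Add 2*esize-bit elements with rounding and keep the high esize bits."""
    wide_mask = (1 << (2 * esize)) - 1
    round_const = 1 << (esize - 1)
    results = []
    for a, b in zip(op1, op2):
        # The sum wraps at 2*esize bits, as a bitstring addition would.
        total = (a + b + round_const) & wide_mask
        results.append(total >> esize)   # bits <2*esize-1:esize>
    return results
```

For 16-bit operands 0x1234 and 0x0080, the truncated high half of the sum would be 0x12, but the rounded result returned by this sketch is 0x13.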
F6.1.172 VRECPE 


Vector Reciprocal Estimate finds an approximate reciprocal of each element in the operand vector, and places the 
results in the destination vector. 


The operand and result elements are the same type, and can be 32-bit floating-point numbers, or 32-bit unsigned 
integers. 


For details of the operation performed by this instruction see Floating-point reciprocal square root estimate and 
step on page E1-2309. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  1| 1  D  1  1| size  1  1|  Vd | 0  1 0 F|0 Q M 0| Vm|

64-bit SIMD vector variant 
Applies when Q == 0. 
VRECPE{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VRECPE{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if size != '10' then UNDEFINED; 
floating_point = (F == '1'); 
esize = 32; elements = 2; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4| 3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 1  1 1 1|1 D 1 1|size 1 1|  Vd | 0  1 0 F|0 Q M 0| Vm|

64-bit SIMD vector variant 
Applies when Q == 0. 
VRECPE{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VRECPE{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if size != '10' then UNDEFINED; 
floating_point = (F == '1'); 
esize = 32; elements = 2; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 

<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "F:size" field. It can have the 
following values: 
U32 when F = 0, size = 10 
F32 when F = 1, size = 10 
The following encodings are reserved: 
° F = 0, size = 0x. 
° F = 0, size = 11. 
° F = 1, size = 0x. 
° F = 1, size = 11. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Newton-Raphson iteration 


For details of the operation performed and how it can be used in a Newton-Raphson iteration to calculate the 
reciprocal of a number, see Floating-point reciprocal estimate and step on page E1-2308. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            if floating_point then 
                Elem[D[d+r],e,esize] = FPRecipEstimate(Elem[D[m+r],e,esize], StandardFPSCRValue()); 
            else 
                Elem[D[d+r],e,esize] = UnsignedRecipEstimate(Elem[D[m+r],e,esize]); 

F6.1.173 VRECPS 


Vector Reciprocal Step multiplies the elements of one vector by the corresponding elements of another vector, 
subtracts each of the products from 2.0, and places the results into the elements of the destination vector. 


The operand and result elements are 32-bit floating-point numbers. 


For details of the operation performed by this instruction see Floating-point reciprocal estimate and step on 
page E1-2308. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  0| 0  D  0 sz|  Vn |  Vd | 1  1 1 1|N Q M 1| Vm|


64-bit SIMD vector variant 
Applies when Q == 0. 


VRECPS{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRECPS{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


T1 


|15 14 13 12|11 10 9 8|7 6 5  4|3  0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  0| 1  1 1 1|0 D 0 sz| Vn |  Vd | 1  1 1 1|N Q M 1| Vm|


64-bit SIMD vector variant 

Applies when Q == 0. 
VRECPS{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VRECPS{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 

Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; 
if sz == '1' then UNDEFINED; 
esize = 32; elements = 2; 
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Newton-Raphson iteration 


For details of the operation performed and how it can be used in a Newton-Raphson iteration to calculate the 
reciprocal of a number, see Floating-point reciprocal estimate and step on page E1-2308. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 
    for r = 0 to regs-1 
        for e = 0 to elements-1 
            Elem[D[d+r],e,esize] = FPRecipStep(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize]); 
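
The way the step value 2.0 - a*b is used in a Newton-Raphson iteration can be sketched in Python. This is an illustration under stated assumptions, not the architectural behavior: it uses Python double-precision floats instead of single precision, it takes a caller-supplied initial estimate in place of a reciprocal estimate instruction, and it does not model the special-case handling of infinities, zeros, or NaNs. The names `recip_step` and `refine_reciprocal` are hypothetical.

```python
def recip_step(a: float, b: float) -> float:
    """Model of the step value: multiply the operands and subtract from 2.0."""
    return 2.0 - a * b


def refine_reciprocal(d: float, estimate: float, iterations: int = 4) -> float:
    """Refine an estimate of 1/d using x = x * (2.0 - d * x)."""
    x = estimate
    for _ in range(iterations):
        x = x * recip_step(d, x)
    return x
```

Starting from an estimate of 0.2 for 1/4.0, the relative error is squared on each iteration, so four iterations bring the value to within about 1e-11 of 0.25.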
F6.1.174 VREV16 


Vector Reverse in halfwords reverses the order of 8-bit elements in each halfword of the vector, and places the result 
in the corresponding destination vector. 


There is no distinction between data types, other than size. 
The following figure shows an example of the operation of VREV16 doubleword operation. 

[Figure: VREV16.8, doubleword, showing source register Dm and destination register Dd] 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  1| 1  D  1  1| size  0  0|  Vd | 0  0 0 1|0 Q M 0| Vm|
                                                               op

64-bit SIMD vector variant 
Applies when Q == 0. 
VREV16{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VREV16{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if UInt(op)+UInt(size) >= 3 then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 

esize = 8 << UInt(size); 
integer container_size; 
case op of 
    when '10' container_size = 16; 
    when '01' container_size = 32; 
    when '00' container_size = 64; 
integer containers = 64 DIV container_size; 
integer elements_per_container = container_size DIV esize; 

d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 

T1 


|15 14 13 12|11 10 9 8|7 6 5 4| 3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 1  1 1 1|1 D 1 1|size 0 0|  Vd | 0  0 0 1|0 Q M 0| Vm|
                                                      op

64-bit SIMD vector variant 
Applies when Q == 0. 
VREV16{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VREV16{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if UInt(op)+UInt(size) >= 3 then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 

esize = 8 << UInt(size); 
integer container_size; 
case op of 
    when '10' container_size = 16; 
    when '01' container_size = 32; 
    when '00' container_size = 64; 
integer containers = 64 DIV container_size; 
integer elements_per_container = container_size DIV esize; 

d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
8 when size = 00 
The following encodings are reserved: 
° size = 01. 
° size = 1x. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 

Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 

    bits(64) result; 
    integer element; 
    integer rev_element; 
    for r = 0 to regs-1 
        element = 0; 
        for c = 0 to containers-1 
            rev_element = element + elements_per_container - 1; 
            for e = 0 to elements_per_container-1 
                Elem[result, rev_element, esize] = Elem[D[m+r], element, esize]; 
                element = element + 1; 
                rev_element = rev_element - 1; 
        D[d+r] = result; 
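
The container-wise reversal performed by the pseudocode above can be sketched in Python. This is an illustration only: a register is modeled as a list of elements with element 0 first, and the name `vrev_model` is hypothetical. The container size is a parameter, so a value of 16 corresponds to the VREV16 case described here.

```python
def vrev_model(elements, esize, container_size):
    """Reverse the order of esize-bit elements within each container."""
    per_container = container_size // esize
    if per_container < 1 or len(elements) % per_container != 0:
        raise ValueError("elements must fill a whole number of containers")
    result = []
    for start in range(0, len(elements), per_container):
        # Each container is reversed independently of its neighbours.
        result.extend(reversed(elements[start:start + per_container]))
    return result
```

For a doubleword of eight 8-bit elements numbered 0 to 7, a 16-bit container swaps each adjacent pair, giving `[1, 0, 3, 2, 5, 4, 7, 6]`.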
F6.1.175 VREV32 

Vector Reverse in words reverses the order of 8-bit or 16-bit elements in each word of the vector, and places the 
result in the corresponding destination vector. 


There is no distinction between data types, other than size. 


The following figure shows an example of the operation of VREV32 doubleword operations. 

[Figure: two panels, VREV32.8, doubleword and VREV32.16, doubleword, each showing source register Dm and destination register Dd] 

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  1| 1  D  1  1| size  0  0|  Vd | 0  0 0 0|1 Q M 0| Vm|
                                                               op

64-bit SIMD vector variant 
Applies when Q == 0. 
VREV32{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VREV32{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if UInt(op)+UInt(size) >= 3 then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 

esize = 8 << UInt(size); 
integer container_size; 
case op of 
    when '10' container_size = 16; 
    when '01' container_size = 32; 
    when '00' container_size = 64; 
integer containers = 64 DIV container_size; 
integer elements_per_container = container_size DIV esize; 

d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 

T1 


|15 14 13 12|11 10 9 8|7 6 5 4| 3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 1  1 1 1|1 D 1 1|size 0 0|  Vd | 0  0 0 0|1 Q M 0| Vm|
                                                      op

64-bit SIMD vector variant 
Applies when Q == 0. 
VREV32{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VREV32{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if UInt(op)+UInt(size) >= 3 then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 

esize = 8 << UInt(size); 
integer container_size; 
case op of 
    when '10' container_size = 16; 
    when '01' container_size = 32; 
    when '00' container_size = 64; 
integer containers = 64 DIV container_size; 
integer elements_per_container = container_size DIV esize; 

d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
8 when size = 00 
16 when size = 01 


The encoding size = 1x is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 

    bits(64) result; 
    integer element; 
    integer rev_element; 
    for r = 0 to regs-1 
        element = 0; 
        for c = 0 to containers-1 
            rev_element = element + elements_per_container - 1; 
            for e = 0 to elements_per_container-1 
                Elem[result, rev_element, esize] = Elem[D[m+r], element, esize]; 
                element = element + 1; 
                rev_element = rev_element - 1; 
        D[d+r] = result; 

F6.1.176 VREV64 


Vector Reverse in doublewords reverses the order of 8-bit, 16-bit, or 32-bit elements in each doubleword of the 
vector, and places the result in the corresponding destination vector. 


There is no distinction between data types, other than size. 


The following figure shows an example of the operation of VREV64 doubleword operations. 

[Figure: two panels, VREV64.8, doubleword (registers Dm and Dd) and VREV64.32, quadword] 

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 0  0  1  1| 1  D  1  1| size  0  0|  Vd | 0  0 0 0|0 Q M 0| Vm|
                                                               op

64-bit SIMD vector variant 
Applies when Q == 0. 
VREV64{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VREV64{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if UInt(op)+UInt(size) >= 3 then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 

esize = 8 << UInt(size); 
integer container_size; 
case op of 
    when '10' container_size = 16; 
    when '01' container_size = 32; 
    when '00' container_size = 64; 
integer containers = 64 DIV container_size; 
integer elements_per_container = container_size DIV esize; 

d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 

T1 


|15 14 13 12|11 10 9 8|7 6 5 4| 3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
| 1  1  1  1| 1  1 1 1|1 D 1 1|size 0 0|  Vd | 0  0 0 0|0 Q M 0| Vm|
                                                      op

64-bit SIMD vector variant 
Applies when Q == 0. 
VREV64{<c>}{<q>}.<dt> <Dd>, <Dm> 

128-bit SIMD vector variant 
Applies when Q == 1. 
VREV64{<c>}{<q>}.<dt> <Qd>, <Qm> 

Decode for all variants of this encoding 
if UInt(op)+UInt(size) >= 3 then UNDEFINED; 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED; 

esize = 8 << UInt(size); 
integer container_size; 
case op of 
    when '10' container_size = 16; 
    when '01' container_size = 32; 
    when '00' container_size = 64; 
integer containers = 64 DIV container_size; 
integer elements_per_container = container_size DIV esize; 

d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2; 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
8 when size = 00 
16 when size = 01 
32 when size = 10 
The encoding size = 11 is reserved. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 

Operation for all encodings 


if ConditionPassed() then 
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); 

    bits(64) result; 
    integer element; 
    integer rev_element; 
    for r = 0 to regs-1 
        element = 0; 
        for c = 0 to containers-1 
            rev_element = element + elements_per_container - 1; 
            for e = 0 to elements_per_container-1 
                Elem[result, rev_element, esize] = Elem[D[m+r], element, esize]; 
                element = element + 1; 
                rev_element = rev_element - 1; 
        D[d+r] = result; 

F6.1.177 VRHADD 
Vector Rounding Halving Add adds corresponding elements in two vectors of integers, shifts each result right one 
bit, and places the final results in the destination vector. 
The operand and result elements are all the same type, and can be any one of: 
• 8-bit, 16-bit, or 32-bit signed integers.
• 8-bit, 16-bit, or 32-bit unsigned integers.
The results of the halving operations are rounded. For truncated results, see VHADD. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25|24|23|22|21 20|19 16|15 12|11 10  9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1| U| 0| D| size|  Vn |  Vd | 0  0  0  1| N| Q| M| 0| Vm |
64-bit SIMD vector variant 
Applies when Q == 0. 
VRHADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRHADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13|12|11 10  9  8| 7| 6| 5  4|3  0|15 12|11 10  9  8| 7| 6| 5| 4|3  0|
| 1  1  1| U| 1  1  1  1| 0| D| size| Vn |  Vd | 0  0  0  1| N| Q| M| 0| Vm |
64-bit SIMD vector variant 
Applies when Q == 0. 
VRHADD{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRHADD{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
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Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
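The relationship between a Q register name and its field encoding, and the reason the decode pseudocode rejects odd field values when Q == 1, can be sketched as follows. The helper names are hypothetical, and the sketch assumes the sixteen 128-bit registers Q0 to Q15 of AArch32 state.

```python
def q_register_fields(q: int) -> tuple[int, int]:
    """Return (D, Vd) for Q register number q, using D:Vd = q * 2."""
    if not 0 <= q <= 15:
        raise ValueError("assumed range is Q0 to Q15")
    index = q * 2                      # always even, so Vd<0> is 0
    return (index >> 4) & 1, index & 0xF


def q_form_is_undefined(vd: int, vn: int, vm: int) -> bool:
    """Mirror the check Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1'."""
    return any(field & 1 for field in (vd, vn, vm))
```

For example, Q9 maps to register index 18, which is D = 1 and Vd = 0b0010. Any field with bit 0 set cannot name a Q register, which is why the 128-bit variant treats it as UNDEFINED.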


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            op1 = Int(Elem[D[n+r],e,esize], unsigned);
            op2 = Int(Elem[D[m+r],e,esize], unsigned);
            result = op1 + op2 + 1;
            Elem[D[d+r],e,esize] = result<esize:1>;
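As an illustration of the rounding halving add, the following Python sketch applies the same per-element arithmetic to one 64-bit register held as an integer. The function name vrhadd_d is an assumption made for this example. Python integers are unbounded, so the intermediate sum op1 + op2 + 1 cannot overflow, which matches the unbounded integers of the pseudocode.

```python
def vrhadd_d(dn: int, dm: int, esize: int, unsigned: bool) -> int:
    """Rounding halving add over the esize-bit elements of two 64-bit values."""
    elements = 64 // esize
    mask = (1 << esize) - 1

    def to_int(bits: int) -> int:
        # Int(x, unsigned): reinterpret as two's complement when signed.
        if not unsigned and bits >> (esize - 1):
            return bits - (1 << esize)
        return bits

    result = 0
    for e in range(elements):
        op1 = to_int((dn >> (e * esize)) & mask)
        op2 = to_int((dm >> (e * esize)) & mask)
        total = op1 + op2 + 1
        # result<esize:1>: drop bit 0 and keep the next esize bits.
        result |= ((total >> 1) & mask) << (e * esize)
    return result
```

Adding 1 before the shift is what makes the result rounded rather than truncated: for U8 elements 2 and 3 the sum 6 halves to 3, whereas a truncating halving add would give 2.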
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F6.1.178  VRINTA (Advanced SIMD)
Vector Round floating-point to integer towards Nearest with Ties to Away rounds a vector of floating-point values 
to integral floating-point values of the same size using the Round to Nearest with Ties to Away rounding mode. A 
zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and a 
NaN is propagated as for normal arithmetic. 
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 0  1  0| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |
64-bit SIMD vector variant 
Applies when Q == 0. 
VRINTA{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRINTA{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding 
if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
// Rounding encoded differently from other VCVT and VRINT instructions
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 0  1  0| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |
64-bit SIMD vector variant 
Applies when Q == 0. 
VRINTA{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRINTA{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding 
if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED; 
// Rounding encoded differently from other VCVT and VRINT instructions 
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE; 
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esize = 32; elements = 2; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE; 


Notes for all encodings 


Related encodings: See Advanced SIMD two registers misc on page F3-2455 for the T32 instruction set, or 
Advanced SIMD two registers misc on page F4-2542 for the A32 instruction set. 
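The decode of the 3-bit op field into a rounding mode can be sketched as below. The table for FPDecodeRM is an assumption taken from the shared pseudocode rather than from this section ('00' Ties to Away, '01' Ties to Even, '10' towards +Infinity, '11' towards -Infinity), and the function name is hypothetical.

```python
# Assumed FPDecodeRM mapping; it is defined elsewhere in the manual.
FP_DECODE_RM = {0b00: "TIEAWAY", 0b01: "TIEEVEN", 0b10: "POSINF", 0b11: "NEGINF"}


def decode_vrint_simd_rounding(op: int):
    """Return the rounding mode for a 3-bit op, or None for related encodings."""
    op2, op1, op0 = (op >> 2) & 1, (op >> 1) & 1, op & 1
    if op2 != op0:
        return None                    # SEE "Related encodings"
    rm = (op2 << 1) | (op1 ^ 1)        # op<2>:NOT(op<1>)
    return FP_DECODE_RM[rm]
```

Under this assumption op == '010', the value shown in the VRINTA encoding diagrams, selects Ties to Away, which agrees with the instruction description.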


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
F32 when size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
for r = 0 to regs-1
    for e = 0 to elements-1
        op1 = Elem[D[m+r],e,esize];
        result = FPRoundInt(op1, StandardFPSCRValue(), rounding, exact);
        Elem[D[d+r],e,esize] = result;
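The effect of the rounding argument passed to FPRoundInt can be illustrated with a small Python model. This is a sketch under stated assumptions, not the architectural definition: the function name and the string mode names are hypothetical, NaN handling is reduced to returning the input, and exceptions and flush-to-zero are not modelled. Python floats are double-precision, and every single-precision value is exactly representable in one, so the sketch applies to F32 element values as well.

```python
import math
from fractions import Fraction


def fp_round_int(x: float, rounding: str) -> float:
    """Round x to an integral floating-point value under the named mode."""
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x                       # NaN, infinities and signed zeros pass through
    exact = Fraction(x)                # exact rational value of the input
    lower = math.floor(exact)          # largest integer <= x
    error = exact - lower              # fractional part, in [0, 1)
    half = Fraction(1, 2)
    if rounding == "TIEEVEN":
        round_up = error > half or (error == half and lower % 2 == 1)
    elif rounding == "TIEAWAY":
        round_up = error > half or (error == half and lower >= 0)
    elif rounding == "POSINF":
        round_up = error != 0
    elif rounding == "NEGINF":
        round_up = False
    elif rounding == "ZERO":
        round_up = error != 0 and lower < 0
    else:
        raise ValueError(rounding)
    result = lower + 1 if round_up else lower
    # A zero result keeps the sign of the input, for example -0.25 gives -0.0.
    return math.copysign(0.0, x) if result == 0 else float(result)
```

For VRINTA the mode is Ties to Away, so 2.5 rounds to 3.0 and -2.5 rounds to -3.0, whereas Ties to Even would give 2.0 and -2.0.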
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F6.1.179  VRINTA (floating-point)


Round floating-point to integer to Nearest with Ties to Away rounds a floating-point value to an integral
floating-point value of the same size using the Round to Nearest with Ties to Away rounding mode. A zero input
gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and a NaN is
propagated as for normal arithmetic.


A1 


|31 30 29 28|27 26 25 24|23|22|21 20 19|18|17 16|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 0  0|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |


Single-precision scalar variant 
Applies when size == 10. 


VRINTA{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTA{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 
rounding = FPDecodeRM(RM); exact = FALSE; 
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE; 


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4  3| 2| 1  0|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 0  0|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |


Single-precision scalar variant 
Applies when size == 10. 


VRINTA{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTA{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 
rounding = FPDecodeRM(RM); exact = FALSE; 
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
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Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
    when 64
        D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
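The decode pseudocode for the scalar variants concatenates the register fields in a different order for each precision: Vd:D for single-precision and D:Vd for double-precision. A minimal Python sketch of the two concatenations follows. The function names are hypothetical.

```python
def s_register_index(vd: int, d: int) -> int:
    """UInt(Vd:D): the 4-bit field is the high part, D is bit 0."""
    return (vd << 1) | d


def d_register_index(vd: int, d: int) -> int:
    """UInt(D:Vd): D is bit 4, above the 4-bit field."""
    return (d << 4) | vd
```

With Vd = 0b0101 and D = 1, the single-precision form names S11 and the double-precision form names D21.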
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F6.1.180  VRINTM (Advanced SIMD)


Vector Round floating-point to integer towards -Infinity rounds a vector of floating-point values to integral 
floating-point values of the same size, using the Round towards -Infinity rounding mode. A zero input gives a zero 
result with the same sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated as 
for normal arithmetic. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 1  0  1| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |


64-bit SIMD vector variant 
Applies when Q == 0. 


VRINTM{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTM{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
// Rounding encoded differently from other VCVT and VRINT instructions
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 1  0  1| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |


64-bit SIMD vector variant 
Applies when Q == 0. 


VRINTM{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTM{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
// Rounding encoded differently from other VCVT and VRINT instructions
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE;
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esize = 32; elements = 2; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE; 


Notes for all encodings 


Related encodings: See Advanced SIMD two registers misc on page F3-2455 for the T32 instruction set, or 
Advanced SIMD two registers misc on page F4-2542 for the A32 instruction set. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
F32 when size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
for r = 0 to regs-1
    for e = 0 to elements-1
        op1 = Elem[D[m+r],e,esize];
        result = FPRoundInt(op1, StandardFPSCRValue(), rounding, exact);
        Elem[D[d+r],e,esize] = result;
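The per-element loop can be modelled for one 64-bit register holding two F32 lanes, as in the following Python sketch. The function name is hypothetical. The sketch assumes that the standard FPSCR value enables flush-to-zero, so single-precision denormal inputs are treated as zeros of the same sign. That behavior is defined elsewhere in the manual and is stated here as an assumption. NaN handling is simplified to returning the lane unchanged.

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126           # smallest normal single-precision magnitude


def vrintm_f32_d(dm: int) -> int:
    """Round both F32 lanes of a 64-bit register value towards -Infinity."""
    lanes = struct.unpack("<2f", dm.to_bytes(8, "little"))
    rounded = []
    for lane in lanes:
        # Assumed flush-to-zero: a denormal input becomes a zero of the same sign.
        if lane != 0.0 and abs(lane) < F32_MIN_NORMAL:
            lane = math.copysign(0.0, lane)
        if math.isnan(lane) or math.isinf(lane) or lane == 0.0:
            rounded.append(lane)       # simplified special-value handling
        else:
            rounded.append(float(math.floor(lane)))
    return int.from_bytes(struct.pack("<2f", *rounded), "little")
```

A register value can be built for experimentation with int.from_bytes(struct.pack("<2f", 1.5, -1.5), "little"), which the sketch maps to lanes 1.0 and -2.0.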
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F6.1.181  VRINTM (floating-point)


Round floating-point to integer towards -Infinity rounds a floating-point value to an integral floating-point value of 
the same size using the Round towards -Infinity rounding mode. A zero input gives a zero result with the same sign, 
an infinite input gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20 19|18|17 16|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 1  1|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |


Single-precision scalar variant 
Applies when size == 10. 


VRINTM{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTM{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 
rounding = FPDecodeRM(RM); exact = FALSE; 
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4  3| 2| 1  0|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 1  1|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |

Single-precision scalar variant
Applies when size == 10.

VRINTM{<q>}.F32 <Sd>, <Sm>

Double-precision scalar variant
Applies when size == 11.

VRINTM{<q>}.F64 <Dd>, <Dm>

Decode for all variants of this encoding

if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); exact = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
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Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
    when 64
        D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
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F6.1.182  VRINTN (Advanced SIMD)
Vector Round floating-point to integer to Nearest rounds a vector of floating-point values to integral floating-point 
values of the same size using the Round to Nearest rounding mode. A zero input gives a zero result with the same 
sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 0  0  0| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |
64-bit SIMD vector variant 
Applies when Q == 0. 
VRINTN{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRINTN{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding 
if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
// Rounding encoded differently from other VCVT and VRINT instructions
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 0  0  0| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |
64-bit SIMD vector variant 
Applies when Q == 0. 
VRINTN{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRINTN{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding 
if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED; 
// Rounding encoded differently from other VCVT and VRINT instructions 
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE; 
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esize = 32; elements = 2; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE; 


Notes for all encodings 


Related encodings: See Advanced SIMD two registers misc on page F3-2455 for the T32 instruction set, or 
Advanced SIMD two registers misc on page F4-2542 for the A32 instruction set. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
F32 when size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
for r = 0 to regs-1
    for e = 0 to elements-1
        op1 = Elem[D[m+r],e,esize];
        result = FPRoundInt(op1, StandardFPSCRValue(), rounding, exact);
        Elem[D[d+r],e,esize] = result;
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F6.1.183  VRINTN (floating-point)


Round floating-point to integer to Nearest rounds a floating-point value to an integral floating-point value of the 
same size using the Round to Nearest rounding mode. A zero input gives a zero result with the same sign, an infinite 
input gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20 19|18|17 16|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 0  1|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |


Single-precision scalar variant 
Applies when size == 10. 


VRINTN{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTN{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 
rounding = FPDecodeRM(RM); exact = FALSE; 
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4  3| 2| 1  0|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 0  1|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |

Single-precision scalar variant
Applies when size == 10.

VRINTN{<q>}.F32 <Sd>, <Sm>

Double-precision scalar variant
Applies when size == 11.

VRINTN{<q>}.F64 <Dd>, <Dm>

Decode for all variants of this encoding

if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); exact = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
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Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
    when 64
        D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
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F6.1.184  VRINTP (Advanced SIMD)


Vector Round floating-point to integer towards +Infinity rounds a vector of floating-point values to integral
floating-point values of the same size using the Round towards +Infinity rounding mode. A zero input gives a zero
result with the same sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated as
for normal arithmetic.


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 1  1  1| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |


64-bit SIMD vector variant 
Applies when Q == 0. 


VRINTP{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTP{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
// Rounding encoded differently from other VCVT and VRINT instructions
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 1  1  1| Q| M| 0| Vm |
|           |           |  |  |     |     |     |     |     |   op   |  |  |  |    |


64-bit SIMD vector variant 
Applies when Q == 0. 


VRINTP{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTP{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if op<2> != op<0> then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
// Rounding encoded differently from other VCVT and VRINT instructions
rounding = FPDecodeRM(op<2>:NOT(op<1>)); exact = FALSE;
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esize = 32; elements = 2; 
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE; 


Notes for all encodings 


Related encodings: See Advanced SIMD two registers misc on page F3-2455 for the T32 instruction set, or 
Advanced SIMD two registers misc on page F4-2542 for the A32 instruction set. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
F32 when size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
for r = 0 to regs-1
    for e = 0 to elements-1
        op1 = Elem[D[m+r],e,esize];
        result = FPRoundInt(op1, StandardFPSCRValue(), rounding, exact);
        Elem[D[d+r],e,esize] = result;
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F6.1.185  VRINTP (floating-point)


Round floating-point to integer towards +Infinity rounds a floating-point value to an integral floating-point value 
of the same size using the Round towards +Infinity rounding mode. A zero input gives a zero result with the same 
sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20 19|18|17 16|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 1  0|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |


Single-precision scalar variant 
Applies when size == 10. 


VRINTP{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTP{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 
rounding = FPDecodeRM(RM); exact = FALSE; 
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE; 
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4  3| 2| 1  0|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  0| 1| D| 1  1  1| 0| 1  0|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |  |  RM |     |     | size|  |  |  |  |    |

Single-precision scalar variant
Applies when size == 10.

VRINTP{<q>}.F32 <Sd>, <Sm>

Double-precision scalar variant
Applies when size == 11.

VRINTP{<q>}.F64 <Dd>, <Dm>

Decode for all variants of this encoding

if size != '1x' then UNDEFINED;
rounding = FPDecodeRM(RM); exact = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
if InITBlock() then UNPREDICTABLE;
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Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
    when 64
        D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
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F6.1.186 VRINTR 
Round floating-point to integer rounds a floating-point value to an integral floating-point value of the same size 
using the rounding mode specified in the FPSCR. A zero input gives a zero result with the same sign, an infinite 
input gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 
A1 
|31       28|27 26 25 24|23|22|21 20 19|18 17 16|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
|  != 1111  | 1  1  1  0| 1| D| 1  1  0| 1  1  0|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|    cond   |           |  |  |        |        |     |     | size|op|  |  |  |    |
Single-precision scalar variant 
Applies when size == 10. 
VRINTR{<c>}{<q>}.F32 <Sd>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VRINTR{<c>}{<q>}.F64 <Dd>, <Dm> 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED; 
rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
exact = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4  3| 2  1  0|15 12|11 10| 9  8| 7| 6| 5| 4|3  0|
| 1  1  1  0| 1  1  1  0| 1| D| 1  1  0| 1  1  0|  Vd | 1  0| 1  x| 0| 1| M| 0| Vm |
|           |           |  |  |        |        |     |     | size|op|  |  |  |    |
Single-precision scalar variant 
Applies when size == 10. 
VRINTR{<c>}{<q>}.F32 <Sd>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VRINTR{<c>}{<q>}.F64 <Dd>, <Dm> 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED; 
rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
exact = FALSE;
case size of
    when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm);
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Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
        when 64
            D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
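The selection of the rounding mode in the decode pseudocode can be sketched as follows. The position and encoding of the FPSCR rounding mode field (bits [23:22], with 0b00 Round to Nearest, 0b01 Round towards Plus Infinity, 0b10 Round towards Minus Infinity, 0b11 Round towards Zero) are assumptions taken from the FPSCR description elsewhere in the manual, and the function name is hypothetical.

```python
# Assumed FPSCR.RMode encoding, bits [23:22]; defined in the FPSCR description.
FPSCR_RMODE = {0b00: "TIEEVEN", 0b01: "POSINF", 0b10: "NEGINF", 0b11: "ZERO"}


def vrintr_rounding(op: int, fpscr: int) -> str:
    """Mirror: rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR)."""
    if op == 1:
        return "ZERO"
    return FPSCR_RMODE[(fpscr >> 22) & 0b11]
```

The encoding diagrams for VRINTR show op as 0, so the FPSCR rounding mode applies. The op == '1' branch of the shared decode selects Round towards Zero regardless of the FPSCR setting.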
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F6.1.187  VRINTX (Advanced SIMD)


Vector round floating-point to integer inexact rounds a vector of floating-point values to integral floating-point 
values of the same size, using the Round to Nearest rounding mode, and raises the Inexact exception when the result 
value is not numerically equal to the input value. A zero input gives a zero result with the same sign, an infinite input 
gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 0  0  1| Q| M| 0| Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VRINTX{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTX{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPRounding_TIEEVEN; exact = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15 12|11 10| 9  8  7| 6| 5| 4|3  0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd | 0  1| 0  0  1| Q| M| 0| Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VRINTX{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTX{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPRounding_TIEEVEN; exact = TRUE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;
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Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
F32 when size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
for r = 0 to regs-1
    for e = 0 to elements-1
        op1 = Elem[D[m+r],e,esize];
        result = FPRoundInt(op1, StandardFPSCRValue(), rounding, exact);
        Elem[D[d+r],e,esize] = result;
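The effect of exact = TRUE can be illustrated with a short Python sketch that rounds to nearest with ties to even and reports whether the result differs from the input. The function name and the returned flag are hypothetical stand-ins for the Inexact exception. NaN handling and flush-to-zero are not modelled.

```python
import math


def round_int_exact(x: float) -> tuple[float, bool]:
    """Return (rounded value, inexact) using round-to-nearest, ties to even."""
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x, False                # simplified special-value handling
    # Python's built-in round() on a float rounds half to even.
    result = math.copysign(float(round(x)), x)
    return result, result != x
```

An input of 2.5 gives (2.0, True), while an input that is already integral, such as 3.0, gives (3.0, False) and would not raise Inexact.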
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F6.1.188  VRINTX (floating-point)


Round floating-point to integer inexact rounds a floating-point value to an integral floating-point value of the same 
size, using the rounding mode specified in the FPSCR, and raises an Inexact exception when the result value is not 
numerically equal to the input value. A zero input gives a zero result with the same sign, an infinite input gives an 
infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 


A1 


|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: cond, D, Vd, size, M, Vm.]

Single-precision scalar variant 
Applies when size == 10. 


VRINTX{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTX{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 

exact = TRUE; 

case size of 
when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M); 
when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm); 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, Vd, size, M, Vm.]


Single-precision scalar variant 
Applies when size == 10. 


VRINTX{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VRINTX{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED; 

exact = TRUE; 

case size of 
when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M); 
when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm); 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    rounding = FPRoundingMode(FPSCR);
    case esize of
        when 32
            S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
        when 64
            D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
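
As a rough illustration of the behavior described for VRINTX, the following Python sketch rounds a value to an integral value under a named rounding mode and reports whether the result differs from the input, which is the condition under which the Inexact exception would be raised. The function name round_int_exact, the mode strings, and the returned flag are assumptions made for this example. It is not the architectural FPRoundInt() function, and it ignores FPSCR controls such as flush-to-zero and the details of NaN processing.

```python
import math

def round_int_exact(x, mode="TIEEVEN"):
    """Return (rounded_value, inexact) for a Python float (hypothetical helper)."""
    # Zeros, infinities and NaNs pass through; the result is never inexact here.
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x, False
    if mode == "TIEEVEN":
        r = round(x)            # Python rounds ties to even
    elif mode == "POSINF":
        r = math.ceil(x)
    elif mode == "NEGINF":
        r = math.floor(x)
    elif mode == "ZERO":
        r = math.trunc(x)
    else:
        raise ValueError("unknown rounding mode: " + mode)
    # A zero result keeps the sign of the input.
    result = math.copysign(float(r), x)
    return result, result != x
```

The second element of the returned pair stands in for the Inexact exception that VRINTX signals when exact is TRUE.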
F6.1.189 VRINTZ (Advanced SIMD)


Vector round floating-point to integer towards Zero rounds a vector of floating-point values to integral
floating-point values of the same size, using the Round towards Zero rounding mode. A zero input gives a zero result
with the same sign, an infinite input gives an infinite result with the same sign, and a NaN is propagated as for
normal arithmetic.


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, size, Vd, Q, M, Vm.]


64-bit SIMD vector variant 
Applies when Q == 0. 
VRINTZ{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTZ{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPRounding_ZERO; exact = FALSE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, size, Vd, Q, M, Vm.]


64-bit SIMD vector variant 
Applies when Q == 0. 
VRINTZ{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VRINTZ{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
rounding = FPRounding_ZERO; exact = FALSE;
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
if InITBlock() then UNPREDICTABLE;
Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
F32 when size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


EncodingSpecificOperations(); CheckAdvSIMDEnabled();
for r = 0 to regs-1
    for e = 0 to elements-1
        op1 = Elem[D[m+r],e,esize];
        result = FPRoundInt(op1, StandardFPSCRValue(), rounding, exact);
        Elem[D[d+r],e,esize] = result;
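
The register and element loops above can be sketched in Python as follows. This is a simplified model, not the architectural definition: the list d_regs standing in for the D register file, and the helper names pack_f32x2, unpack_f32x2, and vrintz_f32, are assumptions made for this example, and NaN processing is reduced to passing the value through.

```python
import math
import struct

def pack_f32x2(lo, hi):
    """Pack two single-precision values into a 64-bit integer, element 0 in the low bits."""
    return struct.unpack("<Q", struct.pack("<2f", lo, hi))[0]

def unpack_f32x2(value):
    """Split a 64-bit integer into its two single-precision elements."""
    return struct.unpack("<2f", struct.pack("<Q", value))

def vrintz_f32(d_regs, d, m, q):
    """Round every F32 element of D[m..] towards zero into D[d..] (hypothetical model)."""
    regs = 2 if q else 1                      # regs = if Q == '0' then 1 else 2
    for r in range(regs):
        rounded = []
        for x in unpack_f32x2(d_regs[m + r]):  # elements = 2, esize = 32
            if math.isnan(x) or math.isinf(x):
                rounded.append(x)
            else:
                # Truncate towards zero; a zero result keeps the input sign.
                rounded.append(math.copysign(float(math.trunc(x)), x))
        d_regs[d + r] = pack_f32x2(*rounded)
```

Because exact is FALSE for VRINTZ, the sketch does not track an inexact condition.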
F6.1.190 VRINTZ (floating-point)
Round floating-point to integer towards Zero rounds a floating-point value to an integral floating-point value of the 
same size, using the Round towards Zero rounding mode. A zero input gives a zero result with the same sign, an 
infinite input gives an infinite result with the same sign, and a NaN is propagated as for normal arithmetic. 
A1 
|31 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: cond, D, Vd, size, op, M, Vm.]
Single-precision scalar variant 
Applies when size == 10. 
VRINTZ{<c>}{<q>}.F32 <Sd>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VRINTZ{<c>}{<q>}.F64 <Dd>, <Dm> 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED; 
rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
exact = FALSE; 
case size of 
when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M); 
when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm); 
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, Vd, size, op, M, Vm.]
Single-precision scalar variant 
Applies when size == 10. 
VRINTZ{<c>}{<q>}.F32 <Sd>, <Sm> 
Double-precision scalar variant 
Applies when size == 11. 
VRINTZ{<c>}{<q>}.F64 <Dd>, <Dm> 
Decode for all variants of this encoding 
if size != '1x' then UNDEFINED; 
rounding = if op == '1' then FPRounding_ZERO else FPRoundingMode(FPSCR);
exact = FALSE; 
case size of 
when '10' esize = 32; d = UInt(Vd:D); m = UInt(Vm:M); 
when '11' esize = 64; d = UInt(D:Vd); m = UInt(M:Vm); 
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406. 

<q> See Standard assembler syntax fields on page F2-2406. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
    case esize of
        when 32
            S[d] = FPRoundInt(S[m], FPSCR, rounding, exact);
        when 64
            D[d] = FPRoundInt(D[m], FPSCR, rounding, exact);
F6.1.191 VRSHL
Vector Rounding Shift Left takes each element in a vector, shifts them by a value from the least significant byte of 
the corresponding element of a second vector, and places the results in the destination vector. If the shift value is 
positive, the operation is a left shift. If the shift value is negative, it is a rounding right shift. For a truncating shift, 
see VSHL. 
The first operand and result elements are the same data type, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.
The second operand is always a signed integer of the same size. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: U, D, size, Vn, Vd, N, Q, M, Vm.]
64-bit SIMD vector variant 
Applies when Q == 0. 
VRSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: U, D, size, Vn, Vd, N, Q, M, Vm.]
64-bit SIMD vector variant 
Applies when Q == 0. 
VRSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VRSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn> 
Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
S64 when U = 0, size = 11
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
U64 when U = 1, size = 11
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            shift = SInt(Elem[D[n+r],e,esize]<7:0>);
            round_const = 1 << (-shift-1); // 0 for left shift, 2^(n-1) for right shift
            result = (Int(Elem[D[m+r],e,esize], unsigned) + round_const) << shift;
            Elem[D[d+r],e,esize] = result<esize-1:0>;
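
The per-element computation above can be sketched in Python as follows. The function name vrshl_element and its parameter names are assumptions made for this example, and elements are modeled as unsigned Python integers holding the raw bit pattern. The sketch treats a negative shift as a rounding right shift of -shift bits, which is how the pseudocode's round_const and << shift combine.

```python
def vrshl_element(value, shift_elem, esize, unsigned):
    """Rounding shift of one element by a signed, per-element shift (hypothetical helper)."""
    mask = (1 << esize) - 1
    value &= mask
    # The shift is the signed value of the least significant byte of the shift element.
    shift = shift_elem & 0xFF
    if shift >= 0x80:
        shift -= 0x100
    # Interpret the operand as signed or unsigned, as Int(x, unsigned) does.
    if unsigned or value < (1 << (esize - 1)):
        operand = value
    else:
        operand = value - (1 << esize)
    if shift >= 0:
        result = operand << shift            # left shift, round_const is 0
    else:
        n = -shift
        result = (operand + (1 << (n - 1))) >> n   # rounding right shift
    return result & mask                     # keep the low esize bits
```

For example, with 8-bit unsigned elements a value of 5 shifted by -1 gives (5 + 1) >> 1 = 3, where the truncating VSHL would give 2.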
F6.1.192 VRSHR
Vector Rounding Shift Right takes each element in a vector, right shifts them by an immediate value, and places the 
rounded results in the destination vector. For truncated results, see VSHR. 
The operand and result elements must be the same size, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: U, D, imm6, Vd, L, Q, M, Vm.]
64-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.
VRSHR{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>
128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.
VRSHR{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>
Decode for all variants of this encoding
if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '001xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '01xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = 64 - UInt(imm6);
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10 9 8|7 6 5 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: U, D, imm6, Vd, L, Q, M, Vm.]
64-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.
VRSHR{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>
128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.
VRSHR{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm> 
Decode for all variants of this encoding

if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '001xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '01xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = 64 - UInt(imm6);
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction
set.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<type> Is the data type for the elements of the vectors, encoded in the "U" field. It can have the following
values:
S when U = 0
U when U = 1
<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the
following values:
8 when L = 0, imm6<5:3> = 001
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx
64 when L = 1, imm6<5:3> = xxx
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<imm> Is an immediate value, in the range 1 to <size>, encoded in the "imm6" field as <size> - <imm>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    round_const = 1 << (shift_amount - 1);
    for r = 0 to regs-1
        for e = 0 to elements-1
            result = (Int(Elem[D[m+r],e,esize], unsigned) + round_const) >> shift_amount;
            Elem[D[d+r],e,esize] = result<esize-1:0>;
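
The decode of L:imm6 and the per-element rounding shift can be sketched together in Python. The function names decode_shift_imm and vrshr_element are assumptions made for this example, and elements are modeled as unsigned integers holding the raw bit pattern. The sketch follows the decode and operation pseudocode above and does not model register access or the UNDEFINED checks.

```python
def decode_shift_imm(l_bit, imm6):
    """Return (esize, shift_amount) from the L and imm6 fields (hypothetical helper)."""
    if l_bit == 1:
        return 64, 64 - imm6          # '1xxxxxx'
    if imm6 & 0b100000:
        return 32, 64 - imm6          # '01xxxxx'
    if imm6 & 0b010000:
        return 16, 32 - imm6          # '001xxxx'
    if imm6 & 0b001000:
        return 8, 16 - imm6           # '0001xxx'
    raise ValueError("L:imm6 == '0000xxx' belongs to a related encoding")

def vrshr_element(value, shift_amount, esize, unsigned):
    """Rounding right shift of one element by an immediate (hypothetical helper)."""
    mask = (1 << esize) - 1
    value &= mask
    # Interpret the operand as signed or unsigned, as Int(x, unsigned) does.
    if unsigned or value < (1 << (esize - 1)):
        operand = value
    else:
        operand = value - (1 << esize)
    round_const = 1 << (shift_amount - 1)
    return ((operand + round_const) >> shift_amount) & mask
```

Adding round_const before the shift is what distinguishes this from the truncating VSHR: an unsigned 8-bit value of 7 shifted right by 1 gives 4 here, not 3.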
F6.1.193 VRSHR (zero)


Vector Rounding Shift Right copies the contents of one SIMD register to another.

This instruction is a pseudo-instruction of the VORR (register) instruction. This means that:

• The encodings in this description are named to match the encodings of VORR (register).
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VORR (register) gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, Vn, Vd, N, Q, M, Vm.]


64-bit SIMD vector variant 

Applies when Q == 0. 
VRSHR{<c>}{<q>}.<dt> <Dd>, <Dm>, #0 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Dd>, <Dm>, <Dm> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VRSHR{<c>}{<q>}.<dt> <Qd>, <Qm>, #0 
is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Qd>, <Qm>, <Qm> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, Vn, Vd, N, Q, M, Vm.]


64-bit SIMD vector variant 

Applies when Q == 0. 
VRSHR{<c>}{<q>}.<dt> <Dd>, <Dm>, #0 
is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Dd>, <Dm>, <Dm> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 


Applies when Q == 1. 
VRSHR{<c>}{<q>}.<dt> <Qd>, <Qm>, #0


is equivalent to 


VORR{<c>}{<q>}{.<dt>} <Qd>, <Qm>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, and must be one of: S8, S16, S32, S64, U8, U16,
U32 or U64.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "N:Vn" and "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "N:Vn" and "M:Vm" field.


Operation for all encodings 


The description of VORR (register) gives the operational pseudocode for this instruction. 
F6.1.194 VRSHRN
Vector Rounding Shift Right and Narrow takes each element in a vector, right shifts them by an immediate value, 
and places the rounded results in the destination vector. For truncated results, see VSHRN. 
The operand elements can be 16-bit, 32-bit, or 64-bit integers. There is no distinction between signed and unsigned 
integers. The destination elements are half the size of the source elements. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23 22 21 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, imm6, Vd, M, Vm.]
A1 variant
Applies when imm6 != 000xxx.
VRSHRN{<c>}{<q>}.I<size> <Dd>, <Qm>, #<imm>
Decode for this encoding
if imm6 == '000xxx' then SEE "Related encodings";
if Vm<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
d = UInt(D:Vd); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10 9 8|7 6 5 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, imm6, Vd, M, Vm.]
T1 variant 
Applies when imm6 != 000xxx.
VRSHRN{<c>}{<q>}.I<size> <Dd>, <Qm>, #<imm>
Decode for this encoding
if imm6 == '000xxx' then SEE "Related encodings";
if Vm<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
d = UInt(D:Vd); m = UInt(M:Vm); 
Notes for all encodings 
Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size for the elements of the vectors, encoded in the "imm6<5:3>" field. It can have the 
following values: 
16 when imm6<5:3> = 001 
32 when imm6<5:3> = 01x
64 when imm6<5:3> = 1xx 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<imm> Is an immediate value, in the range 1 to <size>/2, encoded in the "imm6" field as <size>/2 - <imm>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    round_const = 1 << (shift_amount-1);
    for e = 0 to elements-1
        result = LSR(Elem[Qin[m>>1],e,2*esize] + round_const, shift_amount);
        Elem[D[d],e,esize] = result<esize-1:0>;
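
The narrowing step can be sketched in Python as follows. The function name vrshrn_elements and the use of a list of unsigned integers for the source elements are assumptions made for this example. Here esize is the destination element size, as in the pseudocode, so each source element is 2*esize bits wide.

```python
def vrshrn_elements(src_elems, shift_amount, esize):
    """Rounding shift right and narrow a list of 2*esize-bit elements (hypothetical helper)."""
    wide_mask = (1 << (2 * esize)) - 1
    narrow_mask = (1 << esize) - 1
    round_const = 1 << (shift_amount - 1)
    out = []
    for x in src_elems:
        # The addition is performed on a 2*esize-bit value, then shifted logically.
        wide = ((x & wide_mask) + round_const) & wide_mask
        out.append((wide >> shift_amount) & narrow_mask)  # keep the low esize bits
    return out
```

Because only the low esize bits of the shifted value are kept, the result is the same for signed and unsigned source elements, which is why the instruction makes no distinction between them.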
F6.1.195 VRSHRN (zero)


Vector Rounding Shift Right and Narrow takes each element in a vector, right shifts them by an immediate value,
and places the rounded results in the destination vector.

This instruction is a pseudo-instruction of the VMOVN instruction. This means that:

• The encodings in this description are named to match the encodings of VMOVN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VMOVN gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, size, Vd, M, Vm.]


A1 variant

VRSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, size, Vd, M, Vm.]


T1 variant 

VRSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
I16 when size = 00 
I32 when size = 01
I64 when size = 10


The encoding size = 11 is reserved. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


The description of VMOVN gives the operational pseudocode for this instruction. 
F6.1.196 VRSQRTE


Vector Reciprocal Square Root Estimate finds an approximate reciprocal square root of each element in a vector, 
and places the results in a second vector. 


The operand and result elements are the same type, and can be 32-bit floating-point numbers, or 32-bit unsigned 
integers. 


For details of the operation performed by this instruction see Floating-point reciprocal estimate and step on 
page E1-2308. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 18 17 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, size, Vd, F, Q, M, Vm.]


64-bit SIMD vector variant 
Applies when Q == 0. 


VRSQRTE{<c>}{<q>}.<dt> <Dd>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRSQRTE{<c>}{<q>}.<dt> <Qd>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
floating_point = (F == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 2 1 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, size, Vd, F, Q, M, Vm.]





64-bit SIMD vector variant 
Applies when Q == 0. 
VRSQRTE{<c>}{<q>}.<dt> <Dd>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 


VRSQRTE{<c>}{<q>}.<dt> <Qd>, <Qm> 
Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size != '10' then UNDEFINED;
floating_point = (F == '1');
esize = 32; elements = 2;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "F:size" field. It can have the 
following values: 
U32 when F = 0, size = 10 
F32 when F = 1, size = 10 


The following encodings are reserved:
• F = 0, size = 0x.
• F = 0, size = 11.
• F = 1, size = 0x.
• F = 1, size = 11.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Newton-Raphson iteration 


For details of the operation performed and how it can be used in a Newton-Raphson iteration to calculate the 
reciprocal of the square root of a number, see Floating-point reciprocal estimate and step on page E1-2308. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if floating_point then
                Elem[D[d+r],e,esize] = FPRSqrtEstimate(Elem[D[m+r],e,esize], StandardFPSCRValue());
            else
                Elem[D[d+r],e,esize] = UnsignedRSqrtEstimate(Elem[D[m+r],e,esize]);
F6.1.197 VRSQRTS


Vector Reciprocal Square Root Step multiplies the elements of one vector by the corresponding elements of another 
vector, subtracts each of the products from 3.0, divides these results by 2.0, and places the results into the elements 
of the destination vector. 


The operand and result elements are 32-bit floating-point numbers. 


For details of the operation performed by this instruction see Floating-point reciprocal estimate and step on 
page E1-2308. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 20|19 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, sz, Vn, Vd, N, Q, M, Vm.]


64-bit SIMD vector variant 
Applies when Q == 0. 


VRSQRTS{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VRSQRTS{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10 9 8|7 6 5 4|3 0|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: D, sz, Vn, Vd, N, Q, M, Vm.]


64-bit SIMD vector variant 

Applies when Q == 0. 
VRSQRTS{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 

Applies when Q == 1. 


VRSQRTS{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding

if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
esize = 32; elements = 2;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Newton-Raphson iteration 


For details of the operation performed and how it can be used in a Newton-Raphson iteration to calculate the 
reciprocal of the square root of a number, see Floating-point reciprocal estimate and step on page E1-2308. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = FPRSqrtStep(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize]);
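
To show how the step fits a Newton-Raphson iteration for the reciprocal square root, here is a minimal Python sketch. It assumes the standard iteration x[n+1] = x[n] * (3 - d * x[n]^2) / 2 and uses Python double-precision arithmetic, so it does not reproduce the single-precision rounding or the special-case handling of the architectural FPRSqrtStep() function. The names rsqrt_step and refine_rsqrt are hypothetical, and the starting estimate is supplied by the caller and is not the VRSQRTE estimate.

```python
def rsqrt_step(a, b):
    """Compute (3.0 - a*b) / 2.0, the arithmetic described for VRSQRTS (hypothetical helper)."""
    return (3.0 - a * b) / 2.0

def refine_rsqrt(d, estimate, iterations=3):
    """Refine an estimate of 1/sqrt(d) with Newton-Raphson steps (hypothetical helper)."""
    x = estimate
    for _ in range(iterations):
        # One refinement: x = x * (3 - (d*x) * x) / 2
        x = x * rsqrt_step(d * x, x)
    return x
```

Starting from a rough estimate such as 0.4 for d = 4.0, a few iterations converge towards 0.5, and the error shrinks roughly quadratically with each step.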
F6.1.198 VRSRA


Vector Rounding Shift Right and Accumulate takes each element in a vector, right shifts them by an immediate
value, and accumulates the rounded results into the destination vector. For truncated results, see VSRA.


The operand and result elements must all be the same type, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.
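
The shift-and-accumulate behavior described above can be sketched in Python for a single element. This sketch is based only on the prose description: the function name vrsra_element and its parameters are assumptions made for this example, and it assumes the accumulation wraps to the element size.

```python
def vrsra_element(acc, value, shift_amount, esize, unsigned):
    """Add the rounded right shift of value to acc, modulo 2**esize (hypothetical helper)."""
    mask = (1 << esize) - 1
    value &= mask
    # Interpret the operand as signed or unsigned according to the data type.
    if unsigned or value < (1 << (esize - 1)):
        operand = value
    else:
        operand = value - (1 << esize)
    rounded = (operand + (1 << (shift_amount - 1))) >> shift_amount
    # Accumulate into the destination element, keeping the low esize bits.
    return (acc + rounded) & mask
```

For 8-bit unsigned elements, an accumulator of 10 and an operand of 7 shifted right by 1 give 10 + 4 = 14.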


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23 22 21 16|15 12|11 10 9 8|7 6 5 4|3 0|
[Encoding diagram. Fields: U, D, imm6, Vd, L, Q, M, Vm.]


64-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VRSRA{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm> 


128-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VRSRA{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm> 


Decode for all variants of this encoding 


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '001xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '01xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = 64 - UInt(imm6);
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 


|15 14 13 12|11 10  9  8| 7| 6| 5     0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  U| 1  1  1  1| 1| D|  imm6  |  Vd  | 0  0  1  1| L| Q| M| 1|  Vm |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VRSRA{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>


128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VRSRA{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>


Decode for all variants of this encoding


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '001xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '01xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = 64 - UInt(imm6);
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction
set.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406.
<type> Is the data type for the elements of the vectors, encoded in the "U" field. It can have the following
values:
S when U = 0
U when U = 1
<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the
following values:
8 when L = 0, imm6<5:3> = 001
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx
64 when L = 1, imm6<5:3> = xxx
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.
<imm> Is an immediate value, in the range 1 to <size>, encoded in the "imm6" field as <size> - <imm>.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    round_const = 1 << (shift_amount - 1);
    for r = 0 to regs-1
        for e = 0 to elements-1
            result = (Int(Elem[D[m+r],e,esize], unsigned) + round_const) >> shift_amount;
            Elem[D[d+r],e,esize] = Elem[D[d+r],e,esize] + result;
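
The per-element behavior of the operation pseudocode above can be sketched in Python as follows. This is a minimal model under stated assumptions, not an architected definition: the function name vrsra_element is hypothetical, elements are passed as non-negative integers holding the raw element bits, and the accumulate is assumed to wrap modulo 2^esize as implied by assignment to an esize-bit Elem[].

```python
def vrsra_element(acc, src, shift_amount, esize, unsigned):
    """Rounding shift right of src, accumulated into acc, for one element."""
    mask = (1 << esize) - 1
    src &= mask
    # Int(x, unsigned): interpret the raw bits as signed when unsigned is False.
    if not unsigned and src >> (esize - 1):
        src -= 1 << esize
    round_const = 1 << (shift_amount - 1)
    # Python's >> on int is an arithmetic shift, matching the pseudocode.
    result = (src + round_const) >> shift_amount
    # The destination keeps only the low esize bits of the sum.
    return (acc + result) & mask
```

For example, with esize = 8 and shift_amount = 1, an unsigned source element of 5 contributes 3 to the accumulator, because the rounding constant is added before the shift.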
F6.1.199 VRSUBHN


Vector Rounding Subtract and Narrow, returning High Half subtracts the elements of one quadword vector from the 
corresponding elements of another quadword vector, takes the most significant half of each result, and places the 
final results in a doubleword vector. The results are rounded. For truncated results, see VSUBHN. 


The operand elements can be 16-bit, 32-bit, or 64-bit integers. There is no distinction between signed and unsigned 
integers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  1| 1| D| size|  Vn  |  Vd  | 0  1  1  0| N| 0| M| 0|  Vm |


A1 variant


VRSUBHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 


Decode for this encoding 
if size == '11' then SEE "Related encodings";
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 1  1  1  1| 1| D| size|  Vn |  Vd  | 0  1  1  0| N| 0| M| 0|  Vm |


T1 variant 


VRSUBHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 


Decode for this encoding 

if size == '11' then SEE "Related encodings";
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);


Notes for all encodings 


Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the
following values:
I16 when size = 00
I32 when size = 01
I64 when size = 10
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    round_const = 1 << (esize-1);
    for e = 0 to elements-1
        result = Elem[Qin[n>>1],e,2*esize] - Elem[Qin[m>>1],e,2*esize] + round_const;
        Elem[D[d],e,esize] = result<2*esize-1:esize>;
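
As a worked illustration of the rounding and narrowing in the pseudocode above, the following Python sketch computes one result element. The function name vrsubhn_element is hypothetical, and esize is taken to be the width of the result element, so each operand is 2*esize bits wide, as in the decode pseudocode.

```python
def vrsubhn_element(a, b, esize):
    """Return the rounded high half of (a - b) for 2*esize-bit operands."""
    wide_mask = (1 << (2 * esize)) - 1
    round_const = 1 << (esize - 1)
    # The subtraction wraps modulo 2^(2*esize), so signedness does not matter.
    result = (a - b + round_const) & wide_mask
    # result<2*esize-1:esize> is the most significant half.
    return result >> esize
```

Because the arithmetic is modular, the same function serves signed and unsigned operands, which matches the statement that the instruction makes no distinction between them.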
F6.1.200 VSELEQ, VSELGE, VSELGT, VSELVS


Floating-point conditional select allows the destination register to take the value in either one or the other source
register according to the condition codes in the APSR. See The Application Program Status Register, APSR on page E1-2296.


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19  16|15  12|11 10| 9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 1  1  1  0| 0| D|  cc |  Vn  |  Vd  | 1  0| 1  x| N| 0| M| 0|  Vm |
                                                          size


VSELEQ, doubleprec variant
Applies when cc == 00 && size == 11.


VSELEQ.F64 <Dd>, <Dn>, <Dm> // Cannot be conditional


VSELEQ, singleprec variant
Applies when cc == 00 && size == 10.


VSELEQ.F32 <Sd>, <Sn>, <Sm> // Cannot be conditional


VSELGE, doubleprec variant
Applies when cc == 10 && size == 11.


VSELGE.F64 <Dd>, <Dn>, <Dm> // Cannot be conditional


VSELGE, singleprec variant
Applies when cc == 10 && size == 10.


VSELGE.F32 <Sd>, <Sn>, <Sm> // Cannot be conditional


VSELGT, doubleprec variant
Applies when cc == 11 && size == 11.


VSELGT.F64 <Dd>, <Dn>, <Dm> // Cannot be conditional


VSELGT, singleprec variant
Applies when cc == 11 && size == 10.


VSELGT.F32 <Sd>, <Sn>, <Sm> // Cannot be conditional


VSELVS, doubleprec variant
Applies when cc == 01 && size == 11.


VSELVS.F64 <Dd>, <Dn>, <Dm> // Cannot be conditional


VSELVS, singleprec variant
Applies when cc == 01 && size == 10.


VSELVS.F32 <Sd>, <Sn>, <Sm> // Cannot be conditional


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
cond = cc:(cc<1> EOR cc<0>):'0';
if InITBlock() then UNPREDICTABLE;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15  12|11 10| 9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 1  1  1  0| 0| D|  cc |  Vn |  Vd  | 1  0| 1  x| N| 0| M| 0|  Vm |
                                                         size


VSELEQ, doubleprec variant
Applies when cc == 00 && size == 11.


VSELEQ.F64 <Dd>, <Dn>, <Dm> // Not permitted in IT block


VSELEQ, singleprec variant
Applies when cc == 00 && size == 10.


VSELEQ.F32 <Sd>, <Sn>, <Sm> // Not permitted in IT block


VSELGE, doubleprec variant
Applies when cc == 10 && size == 11.


VSELGE.F64 <Dd>, <Dn>, <Dm> // Not permitted in IT block


VSELGE, singleprec variant
Applies when cc == 10 && size == 10.


VSELGE.F32 <Sd>, <Sn>, <Sm> // Not permitted in IT block


VSELGT, doubleprec variant
Applies when cc == 11 && size == 11.


VSELGT.F64 <Dd>, <Dn>, <Dm> // Not permitted in IT block


VSELGT, singleprec variant
Applies when cc == 11 && size == 10.


VSELGT.F32 <Sd>, <Sn>, <Sm> // Not permitted in IT block


VSELVS, doubleprec variant
Applies when cc == 01 && size == 11.


VSELVS.F64 <Dd>, <Dn>, <Dm> // Not permitted in IT block


VSELVS, singleprec variant
Applies when cc == 01 && size == 10.


VSELVS.F32 <Sd>, <Sn>, <Sm> // Not permitted in IT block


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
case size of
    when '10' esize = 32; d = UInt(Vd:D); n = UInt(Vn:N); m = UInt(Vm:M);
    when '11' esize = 64; d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
cond = cc:(cc<1> EOR cc<0>):'0';
if InITBlock() then UNPREDICTABLE;


Assembler symbols 


<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 
<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 
<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 


Operation for all encodings 


EncodingSpecificOperations(); CheckVFPEnabled(TRUE);
case esize of
    when 32
        S[d] = if ConditionHolds(cond) then S[n] else S[m];
    when 64
        D[d] = if ConditionHolds(cond) then D[n] else D[m];
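
The decode expression cond = cc:(cc<1> EOR cc<0>):'0' maps the 2-bit cc field onto four of the standard 4-bit condition codes. The Python sketch below works through that mapping and the resulting selection. The function names and the (N, Z, C, V) tuple used for the flags are assumptions made for this illustration.

```python
def vsel_cond(cc):
    """Build the 4-bit condition cc:(cc<1> EOR cc<0>):'0' from the cc field."""
    bit1 = (cc >> 1) & 1
    bit0 = cc & 1
    return (cc << 2) | ((bit1 ^ bit0) << 1)


def condition_holds(cond, n, z, c, v):
    """Evaluate the four conditions that a VSEL encoding can produce."""
    if cond == 0b0000:   # EQ
        return z == 1
    if cond == 0b0110:   # VS
        return v == 1
    if cond == 0b1010:   # GE
        return n == v
    if cond == 0b1100:   # GT
        return z == 0 and n == v
    raise ValueError("condition not reachable from a VSEL encoding")


def vsel(cc, flags, src_n, src_m):
    """Select src_n when the condition holds, otherwise src_m."""
    n, z, c, v = flags
    return src_n if condition_holds(vsel_cond(cc), n, z, c, v) else src_m
```

The mapping gives cc = 00 for EQ, 01 for VS, 10 for GE, and 11 for GT, which agrees with the variant conditions listed for each encoding.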
F6.1.201 VSHL (immediate)


Vector Shift Left (immediate) takes each element in a vector of integers, left shifts them by an immediate value, and 
places the results in the destination vector. 


Bits shifted out of the left of each element are lost. 


The elements must all be the same size, and can be 8-bit, 16-bit, 32-bit, or 64-bit integers. There is no distinction 
between signed and unsigned integers. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21    16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  0| 1| D|  imm6  |  Vd  | 0  1  0  1| L| Q| M| 1|  Vm |


64-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VSHL{<c>}{<q>}.I<size> {<Dd>,} <Dm>, #<imm> 


128-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSHL{<c>}{<q>}.I<size> {<Qd>,} <Qm>, #<imm> 


Decode for all variants of this encoding 


if L:imm6 == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = UInt(imm6) - 8;
    when '001xxxx' esize = 16; elements = 4; shift_amount = UInt(imm6) - 16;
    when '01xxxxx' esize = 32; elements = 2; shift_amount = UInt(imm6) - 32;
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = UInt(imm6);
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5     0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  1| 1| D|  imm6  |  Vd  | 0  1  0  1| L| Q| M| 1|  Vm |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VSHL{<c>}{<q>}.I<size> {<Dd>,} <Dm>, #<imm>


128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSHL{<c>}{<q>}.I<size> {<Qd>,} <Qm>, #<imm>


Decode for all variants of this encoding


if L:imm6 == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = UInt(imm6) - 8;
    when '001xxxx' esize = 16; elements = 4; shift_amount = UInt(imm6) - 16;
    when '01xxxxx' esize = 32; elements = 2; shift_amount = UInt(imm6) - 32;
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = UInt(imm6);
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the 
following values: 
8 when L = 0, imm6<5:3> = 001 
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx 
64 when L = 1, imm6<5:3> = xxx 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<imm> Is an immediate value, in the range 0 to <size>-1, encoded in the "imm6" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = LSL(Elem[D[m+r],e,esize], shift_amount);
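
The decode table and the element operation above can be sketched together in Python. This is an illustrative model only: the names decode_vshl_imm and vshl_imm_element are hypothetical, and the fields L and imm6 are passed as plain integers.

```python
def decode_vshl_imm(l, imm6):
    """Return (esize, elements, shift_amount) following the case L:imm6 table."""
    l_imm6 = (l << 6) | imm6
    if l_imm6 >> 3 == 0:
        raise ValueError("L:imm6 == '0000xxx': see Related encodings")
    if l_imm6 & 0b1000000:
        return 64, 1, imm6
    if l_imm6 & 0b0100000:
        return 32, 2, imm6 - 32
    if l_imm6 & 0b0010000:
        return 16, 4, imm6 - 16
    return 8, 8, imm6 - 8


def vshl_imm_element(value, shift_amount, esize):
    """LSL on one element: bits shifted out of the left are lost."""
    return (value << shift_amount) & ((1 << esize) - 1)
```

The position of the most significant set bit in L:imm6 selects the element size, and the remaining low bits hold the shift amount directly, which is why the range of <imm> is 0 to <size>-1.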
F6.1.202 VSHL (register)


Vector Shift Left (register) takes each element in a vector, shifts them by a value from the least significant byte of 
the corresponding element of a second vector, and places the results in the destination vector. If the shift value is 
positive, the operation is a left shift. If the shift value is negative, it is a truncating right shift. 


For a rounding shift, see VRSHL. 

The first operand and result elements are the same data type, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.

The second operand is always a signed integer of the same size. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  U| 0| D| size|  Vn  |  Vd  | 0  1  0  0| N| Q| M| 0|  Vm |


64-bit SIMD vector variant 
Applies when Q == 0. 


VSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn> 


Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  U| 1  1  1  1| 0| D| size|  Vn |  Vd  | 0  1  0  0| N| Q| M| 0|  Vm |


64-bit SIMD vector variant
Applies when Q == 0.


VSHL{<c>}{<q>}.<dt> {<Dd>,} <Dm>, <Dn>


128-bit SIMD vector variant
Applies when Q == 1.


VSHL{<c>}{<q>}.<dt> {<Qd>,} <Qm>, <Qn>


Decode for all variants of this encoding


if Q == '1' && (Vd<0> == '1' || Vm<0> == '1' || Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); n = UInt(N:Vn); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "U:size" field. It can have the 
following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
S64 when U = 0, size = 11
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
U64 when U = 1, size = 11
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            shift = SInt(Elem[D[n+r],e,esize]<7:0>);
            result = Int(Elem[D[m+r],e,esize], unsigned) << shift;
            Elem[D[d+r],e,esize] = result<esize-1:0>;
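
The following Python sketch models one element of the operation above, including the interpretation of the least significant byte of the second operand as a signed shift count. The function name vshl_register_element is hypothetical, and a negative count is modeled as a truncating right shift, as described in the introduction to this instruction.

```python
def vshl_register_element(value, shift_elem, esize, unsigned):
    """Shift one element by the signed byte held in shift_elem<7:0>."""
    mask = (1 << esize) - 1
    # shift = SInt(Elem[...]<7:0>): only the low byte is used, as a signed value.
    shift = shift_elem & 0xFF
    if shift >= 0x80:
        shift -= 0x100
    # Int(x, unsigned): interpret the operand bits as signed or unsigned.
    x = value & mask
    if not unsigned and x >> (esize - 1):
        x -= 1 << esize
    # A negative shift is a truncating right shift (arithmetic for signed data).
    result = x << shift if shift >= 0 else x >> -shift
    return result & mask
```

Note that a shift byte of 0xFF means a right shift by one, and that the signedness of the data affects only right shifts, where it determines whether sign bits or zeros are shifted in.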
F6.1.203 VSHLL


Vector Shift Left Long takes each element in a doubleword vector, left shifts them by an immediate value, and places 
the results in a quadword vector. 


The operand elements can be:
• 8-bit, 16-bit, or 32-bit signed integers.
• 8-bit, 16-bit, or 32-bit unsigned integers.
• 8-bit, 16-bit, or 32-bit untyped integers, maximum shift only.

The result elements are twice the length of the operand elements.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21    16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  U| 1| D|  imm6  |  Vd  | 1  0  1  0| 0| 0| M| 1|  Vm |


A1 variant
Applies when imm6 != 000xxx.


VSHLL{<c>}{<q>}.<type><size> <Qd>, <Dm>, #<imm> 


Decode for this encoding 


if imm6 == '000xxx' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = UInt(imm6) - 8;
    when '01xxxx' esize = 16; elements = 4; shift_amount = UInt(imm6) - 16;
    when '1xxxxx' esize = 32; elements = 2; shift_amount = UInt(imm6) - 32;
if shift_amount == 0 then SEE VMOVL;
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm);


A2 


|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd  | 0  0  1  1| 0| 0| M| 0|  Vm |


A2 variant 


VSHLL{<c>}{<q>}.<type><size> <Qd>, <Dm>, #<imm> 


Decode for this encoding 


if size == '11' || Vd<0> == '1' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize; shift_amount = esize;
unsigned = FALSE; // Or TRUE without change of functionality
d = UInt(D:Vd); m = UInt(M:Vm);


T1


|15 14 13 12|11 10  9  8| 7| 6| 5     0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  U| 1  1  1  1| 1| D|  imm6  |  Vd  | 1  0  1  0| 0| 0| M| 1|  Vm |


T1 variant 
Applies when imm6 != 000xxx.


VSHLL{<c>}{<q>}.<type><size> <Qd>, <Dm>, #<imm> 


Decode for this encoding 


if imm6 == '000xxx' then SEE "Related encodings";
if Vd<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = UInt(imm6) - 8;
    when '01xxxx' esize = 16; elements = 4; shift_amount = UInt(imm6) - 16;
    when '1xxxxx' esize = 32; elements = 2; shift_amount = UInt(imm6) - 32;
if shift_amount == 0 then SEE VMOVL;
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm);


T2 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd  | 0  0  1  1| 0| 0| M| 0|  Vm |


T2 variant 


VSHLL{<c>}{<q>}.<type><size> <Qd>, <Dm>, #<imm> 


Decode for this encoding 


if size == '11' || Vd<0> == '1' then UNDEFINED;

esize = 8 << UInt(size); elements = 64 DIV esize; shift_amount = esize; 
unsigned = FALSE; // Or TRUE without change of functionality 

d = UInt(D:Vd); m = UInt(M:Vm); 


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1 and A2: see Standard assembler syntax fields on page F2-2406. This encoding
must be unconditional.


For encoding T1 and T2: see Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.


<type> The data type for the elements of the operand. It must be one of:
S Signed. In encoding T1/A1, encoded as U = 0.
U Unsigned. In encoding T1/A1, encoded as U = 1.
I Untyped integer. Available only in encoding T2/A2.
<size> The data size for the elements of the operand. The following table shows the permitted values and
their encodings:

<size>  Encoding T1/A1                  Encoding T2/A2
8       Encoded as imm6<5:3> = 0b001    Encoded as size = 0b00
16      Encoded as imm6<5:4> = 0b01     Encoded as size = 0b01
32      Encoded as imm6<5> = 1          Encoded as size = 0b10

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<imm> The immediate value. <imm> must lie in the range 1 to <size>, and:
• If <size> == <imm>, the encoding is T2/A2.
• Otherwise, the encoding is T1/A1, and:
— If <size> == 8, <imm> is encoded in imm6<2:0>. 
— If <size> == 16, <imm> is encoded in imm6<3:0>. 


— If <size> == 32, <imm> is encoded in imm6<4:0>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        result = Int(Elem[Din[m],e,esize], unsigned) << shift_amount;
        Elem[Q[d>>1],e,2*esize] = result<2*esize-1:0>;
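
A minimal Python sketch of the widening shift above follows. The function name vshll and the use of a list of integers to represent the elements of the doubleword source are assumptions for this illustration.

```python
def vshll(src_elems, shift_amount, esize, unsigned):
    """Left shift each esize-bit element into a 2*esize-bit result element."""
    in_mask = (1 << esize) - 1
    out_mask = (1 << (2 * esize)) - 1
    results = []
    for x in src_elems:
        x &= in_mask
        # Int(x, unsigned): sign-extend the element when it is treated as signed.
        if not unsigned and x >> (esize - 1):
            x -= 1 << esize
        # Keep result<2*esize-1:0>.
        results.append((x << shift_amount) & out_mask)
    return results
```

When shift_amount equals esize, as in encodings T2/A2, every extension bit is shifted out of the result, so the signed and unsigned interpretations give the same value. This is consistent with the "Or TRUE without change of functionality" comment in the decode pseudocode.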
F6.1.204 VSHR
Vector Shift Right takes each element in a vector, right shifts them by an immediate value, and places the truncated
results in the destination vector. For rounded results, see VRSHR.
The operand and result elements must be the same size, and can be any one of:
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21    16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  U| 1| D|  imm6  |  Vd  | 0  0  0  0| L| Q| M| 1|  Vm |
64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.
VSHR{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>
128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.
VSHR{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>
Decode for all variants of this encoding
if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '001xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '01xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = 64 - UInt(imm6);
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5     0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  U| 1  1  1  1| 1| D|  imm6  |  Vd  | 0  0  0  0| L| Q| M| 1|  Vm |
64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.
VSHR{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>
128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.
VSHR{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm>
Decode for all variants of this encoding


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '001xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '01xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
    when '1xxxxxx' esize = 64; elements = 1; shift_amount = 64 - UInt(imm6);
unsigned = (U == '1'); d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction
set.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406.
<type> Is the data type for the elements of the vectors, encoded in the "U" field. It can have the following
values:
S when U = 0
U when U = 1
<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the
following values:
8 when L = 0, imm6<5:3> = 001
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx
64 when L = 1, imm6<5:3> = xxx
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.
<imm> Is an immediate value, in the range 1 to <size>, encoded in the "imm6" field as <size> - <imm>.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            result = Int(Elem[D[m+r],e,esize], unsigned) >> shift_amount;
            Elem[D[d+r],e,esize] = result<esize-1:0>;
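
The relationship between <imm>, the L:imm6 field, and the truncating shift can be sketched in Python as follows. The function names are hypothetical. The encoder returns the full 6-bit imm6 value, including the bits that mark the element size, which is a reading derived from the decode pseudocode (shift_amount = 16, 32, or 64 minus UInt(imm6)) rather than a statement taken directly from the symbol table.

```python
def encode_shift_right_imm(size, imm):
    """Return (L, imm6) for a right shift by imm on size-bit elements."""
    if size not in (8, 16, 32, 64) or not 1 <= imm <= size:
        raise ValueError("imm must be in the range 1 to size")
    if size == 64:
        return 1, 64 - imm
    return 0, 2 * size - imm


def decode_shift_right_imm(l, imm6):
    """Return (esize, shift_amount), following the case L:imm6 table."""
    l_imm6 = (l << 6) | imm6
    if l_imm6 >> 3 == 0:
        raise ValueError("L:imm6 == '0000xxx': see Related encodings")
    if l == 1:
        return 64, 64 - imm6
    if imm6 & 0b100000:
        return 32, 64 - imm6
    if imm6 & 0b010000:
        return 16, 32 - imm6
    return 8, 16 - imm6


def vshr_element(value, shift_amount, esize, unsigned):
    """Truncating right shift of one element (arithmetic when signed)."""
    mask = (1 << esize) - 1
    x = value & mask
    if not unsigned and x >> (esize - 1):
        x -= 1 << esize
    return (x >> shift_amount) & mask
```

A shift by the full element size is permitted: it yields zero for unsigned data, and either zero or all ones for signed data, depending on the sign bit.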
F6.1.205 VSHR (zero)


Vector Shift Right copies the contents of one SIMD register to another.


This instruction is a pseudo-instruction of the VORR (register) instruction. This means that:


• The encodings in this description are named to match the encodings of VORR (register).
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VORR (register) gives the operational pseudocode for this instruction.
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19  16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  0| 0| D| 1  0|  Vn  |  Vd  | 0  0  0  1| N| Q| M| 1|  Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VSHR{<c>}{<q>}.<dt> <Dd>, <Dm>, #0 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Dd>, <Dm>, <Dm> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 

Applies when Q == 1. 
VSHR{<c>}{<q>}.<dt> <Qd>, <Qm>, #0 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Qd>, <Qm>, <Qm> 


and is never the preferred disassembly. 


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  1| 0| D| 1  0|  Vn |  Vd  | 0  0  0  1| N| Q| M| 1|  Vm |


64-bit SIMD vector variant 

Applies when Q == 0. 
VSHR{<c>}{<q>}.<dt> <Dd>, <Dm>, #0 

is equivalent to 

VORR{<c>}{<q>}{.<dt>} <Dd>, <Dm>, <Dm> 


and is never the preferred disassembly. 


128-bit SIMD vector variant 


Applies when Q == 1. 
VSHR{<c>}{<q>}.<dt> <Qd>, <Qm>, #0


is equivalent to 


VORR{<c>}{<q>}{.<dt>} <Qd>, <Qm>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406.
<dt> Is the data type for the elements of the vectors, and must be one of: S8, S16, S32, S64, U8, U16,
U32 or U64.
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "N:Vn" and "M:Vm" field as
<Qm>*2.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "N:Vn" and "M:Vm" field.


Operation for all encodings 


The description of VORR (register) gives the operational pseudocode for this instruction. 
F6.1.206 VSHRN
Vector Shift Right Narrow takes each element in a vector, right shifts them by an immediate value, and places the 
truncated results in the destination vector. For rounded results, see VRSHRN. 
The operand elements can be 16-bit, 32-bit, or 64-bit integers. There is no distinction between signed and unsigned 
integers. The destination elements are half the size of the source elements. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21    16|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  1| 0  0  1  0| 1| D|  imm6  |  Vd  | 1  0  0  0| 0| 0| M| 1|  Vm |
A1 variant
Applies when imm6 != 000xxx.
VSHRN{<c>}{<q>}.I<size> <Dd>, <Qm>, #<imm> 
Decode for this encoding 
if imm6 == '000xxx' then SEE "Related encodings";
if Vm<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
d = UInt(D:Vd); m = UInt(M:Vm);
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5     0|15  12|11 10  9  8| 7| 6| 5| 4| 3  0|
| 1  1  1  0| 1  1  1  1| 1| D|  imm6  |  Vd  | 1  0  0  0| 0| 0| M| 1|  Vm |
T1 variant
Applies when imm6 != 000xxx.
VSHRN{<c>}{<q>}.I<size> <Dd>, <Qm>, #<imm> 
Decode for this encoding 
if imm6 == '000xxx' then SEE "Related encodings";
if Vm<0> == '1' then UNDEFINED;
case imm6 of
    when '001xxx' esize = 8; elements = 8; shift_amount = 16 - UInt(imm6);
    when '01xxxx' esize = 16; elements = 4; shift_amount = 32 - UInt(imm6);
    when '1xxxxx' esize = 32; elements = 2; shift_amount = 64 - UInt(imm6);
d = UInt(D:Vd); m = UInt(M:Vm); 
Notes for all encodings 
Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size for the elements of the vectors, encoded in the "imm6<5:3>" field. It can have the 
following values: 
16 when imm6<5:3> = 001 
32 when imm6<5:3> = 01x
64 when imm6<5:3> = 1xx 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<imm> Is an immediate value, in the range 1 to <size>/2, encoded in the "imm6" field as <size>/2 - <imm>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        result = LSR(Elem[Qin[m>>1],e,2*esize], shift_amount);
        Elem[D[d],e,esize] = result<esize-1:0>;
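
As an illustration of the VSHRN decode and operation pseudocode, the following Python sketch models the instruction on plain integers. It is not part of the architecture specification: the function name vshrn and the list-of-integers representation of the source register are assumptions made for this example.

```python
def vshrn(src_elems, imm6):
    """Hypothetical model of VSHRN: right shift, then narrow each element.

    src_elems holds the unsigned source elements (each 2*esize bits wide),
    lowest-numbered element first. imm6 is the raw 6-bit immediate field.
    """
    if imm6 >> 3 == 0b000:
        raise ValueError("imm6 == 000xxx is a related encoding, not VSHRN")
    if imm6 >> 3 == 0b001:
        esize, elements, shift_amount = 8, 8, 16 - imm6
    elif imm6 >> 4 == 0b01:
        esize, elements, shift_amount = 16, 4, 32 - imm6
    else:  # imm6 == 1xxxxx
        esize, elements, shift_amount = 32, 2, 64 - imm6
    if len(src_elems) != elements:
        raise ValueError("expected %d source elements" % elements)
    mask = (1 << esize) - 1
    # LSR followed by truncation to the low esize bits of the result.
    return [(x >> shift_amount) & mask for x in src_elems]
```

For example, a shift of 4 on 16-bit source elements corresponds to imm6 = 16 - 4 = 12, so vshrn(elems, 12) keeps bits <11:4> of each source element.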
F6.1.207 VSHRN (zero)


Vector Shift Right Narrow takes each element in a vector, right shifts them by an immediate value, and places the
truncated results in the destination vector.


This instruction is a pseudo-instruction of the VMOVN instruction. This means that: 


• The encodings in this description are named to match the encodings of VMOVN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VMOVN gives the operational pseudocode for this instruction.
A1
|31 30 29 28|27 26 25 24|23|22|21 20|19 18|17 16|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 0  0  1  1| 1| D| 1  1| size| 1  0|  Vd   | 0  0  1  0| 0| 0| M| 0|  Vm   |


A1 variant

VSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


T1


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2| 1  0|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 1  1  1  1| 1| D| 1  1| size| 1  0|  Vd   | 0  0  1  0| 0| 0| M| 0|  Vm   |


T1 variant 

VSHRN{<c>}{<q>}.<dt> <Dd>, <Qm>, #0 
is equivalent to 

VMOVN{<c>}{<q>}.<dt> <Dd>, <Qm> 


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the operand, encoded in the "size" field. It can have the following 
values: 
I16 when size = 00
I32 when size = 01
I64 when size = 10


The encoding size = 11 is reserved. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.


<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 


Operation for all encodings 


The description of VMOVN gives the operational pseudocode for this instruction. 
F6.1.208 VSLI


Vector Shift Left and Insert takes each element in the operand vector, left shifts them by an immediate value, and 
inserts the results in the destination vector. Bits shifted out of the left of each element are lost. 


The elements must all be the same size, and can be 8-bit, 16-bit, 32-bit, or 64-bit. There is no distinction between 
data types. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


|31 30 29 28|27 26 25 24|23|22|21     16|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 0  0  1  1| 1| D|  imm6   |  Vd   | 0  1  0  1| L| Q| M| 1|  Vm   |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VSLI{<c>}{<q>}.<size> {<Dd>,} <Dm>, #<imm> 


128-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSLI{<c>}{<q>}.<size> {<Qd>,} <Qm>, #<imm> 


Decode for all variants of this encoding 


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx'  esize = 8;  elements = 8;  shift_amount = UInt(imm6) - 8;
    when '001xxxx'  esize = 16;  elements = 4;  shift_amount = UInt(imm6) - 16;
    when '01xxxxx'  esize = 32;  elements = 2;  shift_amount = UInt(imm6) - 32;
    when '1xxxxxx'  esize = 64;  elements = 1;  shift_amount = UInt(imm6);
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1


|15 14 13 12|11 10  9  8| 7| 6| 5      0|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 1  1  1  1| 1| D|  imm6   |  Vd   | 0  1  0  1| L| Q| M| 1|  Vm   |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.

VSLI{<c>}{<q>}.<size> {<Dd>,} <Dm>, #<imm>


128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSLI{<c>}{<q>}.<size> {<Qd>,} <Qm>, #<imm> 
Decode for all variants of this encoding


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx'  esize = 8;  elements = 8;  shift_amount = UInt(imm6) - 8;
    when '001xxxx'  esize = 16;  elements = 4;  shift_amount = UInt(imm6) - 16;
    when '01xxxxx'  esize = 32;  elements = 2;  shift_amount = UInt(imm6) - 32;
    when '1xxxxxx'  esize = 64;  elements = 1;  shift_amount = UInt(imm6);
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the 
following values: 
8 when L = 0, imm6<5:3> = 001 
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx 
64 when L = 1, imm6<5:3> = xxx 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<imm> Is an immediate value, in the range 0 to <size>-1, encoded in the "imm6" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    mask = LSL(Ones(esize), shift_amount);
    for r = 0 to regs-1
        for e = 0 to elements-1
            shifted_op = LSL(Elem[D[m+r],e,esize], shift_amount);
            Elem[D[d+r],e,esize] = (Elem[D[d+r],e,esize] AND NOT(mask)) OR shifted_op;
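
The per-element effect of the VSLI operation can be sketched in Python as follows. This is an illustrative model only, assuming unsigned integers for element values; the name vsli_element is hypothetical and does not appear in the architecture.

```python
def vsli_element(dest, src, esize, shift_amount):
    """Hypothetical model of one VSLI element update.

    The low shift_amount bits of dest are preserved; the remaining bits
    are replaced by src shifted left, with overflow bits discarded.
    """
    ones = (1 << esize) - 1
    mask = (ones << shift_amount) & ones        # LSL(Ones(esize), shift_amount)
    shifted_op = (src << shift_amount) & ones   # bits shifted out are lost
    return (dest & ~mask & ones) | shifted_op
```

With a shift of 0 the mask covers the whole element, so the destination element is simply overwritten by the source element.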
F6.1.209 VSQRT


Square Root calculates the square root of the value in a floating-point register and writes the result to another
floating-point register.


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


|31   28|27 26 25 24|23|22|21 20|19 18 17 16|15   12|11 10| 9  8| 7| 6| 5| 4| 3    0|
| !=1111| 1  1  1  0| 1| D| 1  1| 0  0  0  1|  Vd   | 1  0| 1  x| 1| 1| M| 0|  Vm   |

cond is bits<31:28>; size is bits<9:8>.


Single-precision scalar variant
Applies when size == 10.


VSQRT{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VSQRT{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
case size of
    when '10'  esize = 32;  d = UInt(Vd:D);  m = UInt(Vm:M);
    when '11'  esize = 64;  d = UInt(D:Vd);  m = UInt(M:Vm);


T1


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3  2  1  0|15   12|11 10| 9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  0| 1  1  1  0| 1| D| 1  1| 0  0  0  1|  Vd   | 1  0| 1  x| 1| 1| M| 0|  Vm   |

size is bits<9:8> of the second halfword.


Single-precision scalar variant 
Applies when size == 10. 


VSQRT{<c>}{<q>}.F32 <Sd>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VSQRT{<c>}{<q>}.F64 <Dd>, <Dm> 


Decode for all variants of this encoding 


if size != '1x' then UNDEFINED;
if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
case size of
    when '10'  esize = 32;  d = UInt(Vd:D);  m = UInt(Vm:M);
    when '11'  esize = 64;  d = UInt(D:Vd);  m = UInt(M:Vm);
Assembler symbols


<c> See Standard assembler syntax fields on page F2-2406.
<q> See Standard assembler syntax fields on page F2-2406.
<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field.
<Sm> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vm:M" field.
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckVFPEnabled(TRUE);
    case esize of
        when 32  S[d] = FPSqrt(S[m], FPSCR);
        when 64  D[d] = FPSqrt(D[m], FPSCR);
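
The VSQRT decode concatenates the register fields in a different order for each precision: Vd:D for single-precision registers and D:Vd for double-precision registers. A minimal Python sketch of that field concatenation, assuming a 4-bit Vd field and a 1-bit D field as in the encoding diagrams, is shown below. The helper names are hypothetical.

```python
def s_reg_index(vd, d):
    """UInt(Vd:D): the extra bit D is the least significant bit."""
    return (vd << 1) | d


def d_reg_index(d, vd):
    """UInt(D:Vd): the extra bit D is the most significant bit."""
    return (d << 4) | vd
```

Both helpers return a value in the range 0 to 31; the same field values therefore name different register numbers depending on the selected precision.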
F6.1.210 VSRA


Vector Shift Right and Accumulate takes each element in a vector, right shifts them by an immediate value, and 
accumulates the truncated results into the destination vector. For rounded results, see VRSRA. 


The operand and result elements must all be the same type, and can be any one of: 
• 8-bit, 16-bit, 32-bit, or 64-bit signed integers.
• 8-bit, 16-bit, 32-bit, or 64-bit unsigned integers.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


|31 30 29 28|27 26 25 24|23|22|21     16|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 0  0  1  U| 1| D|  imm6   |  Vd   | 0  0  0  1| L| Q| M| 1|  Vm   |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VSRA{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm> 


128-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSRA{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm> 


Decode for all variants of this encoding 


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx'  esize = 8;  elements = 8;  shift_amount = 16 - UInt(imm6);
    when '001xxxx'  esize = 16;  elements = 4;  shift_amount = 32 - UInt(imm6);
    when '01xxxxx'  esize = 32;  elements = 2;  shift_amount = 64 - UInt(imm6);
    when '1xxxxxx'  esize = 64;  elements = 1;  shift_amount = 64 - UInt(imm6);
unsigned = (U == '1');  d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1


|15 14 13 12|11 10  9  8| 7| 6| 5      0|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  U| 1  1  1  1| 1| D|  imm6   |  Vd   | 0  0  0  1| L| Q| M| 1|  Vm   |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.

VSRA{<c>}{<q>}.<type><size> {<Dd>,} <Dm>, #<imm>


128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSRA{<c>}{<q>}.<type><size> {<Qd>,} <Qm>, #<imm> 
Decode for all variants of this encoding


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx'  esize = 8;  elements = 8;  shift_amount = 16 - UInt(imm6);
    when '001xxxx'  esize = 16;  elements = 4;  shift_amount = 32 - UInt(imm6);
    when '01xxxxx'  esize = 32;  elements = 2;  shift_amount = 64 - UInt(imm6);
    when '1xxxxxx'  esize = 64;  elements = 1;  shift_amount = 64 - UInt(imm6);
unsigned = (U == '1');  d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction
set.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<type> Is the data type for the elements of the vectors, encoded in the "U" field. It can have the following 
values: 
S when U = 0
U when U = 1 

<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the
following values:
8 when L = 0, imm6<5:3> = 001
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx
64 when L = 1, imm6<5:3> = xxx
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<imm> Is an immediate value, in the range 1 to <size>, encoded in the "imm6" field as <size> - <imm>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            result = Int(Elem[D[m+r],e,esize], unsigned) >> shift_amount;
            Elem[D[d+r],e,esize] = Elem[D[d+r],e,esize] + result;
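
The following Python sketch models the VSRA accumulate step for a single element. It is an illustration under stated assumptions, not the architectural definition: element values are passed as raw bit patterns, the name vsra_element is hypothetical, and the result is wrapped to esize bits to model the write back to the element.

```python
def vsra_element(acc, src, esize, shift_amount, unsigned):
    """Hypothetical model of one VSRA element update."""
    ones = (1 << esize) - 1
    value = src & ones
    if not unsigned and value >> (esize - 1):
        value -= 1 << esize              # Int(x, FALSE): interpret as signed
    result = value >> shift_amount       # arithmetic shift on Python integers
    return (acc + result) & ones         # only esize bits are written back
```

A signed source element of 0x80 (that is, -128) shifted right by 4 contributes -8 to the accumulator, whereas the same bit pattern treated as unsigned contributes 8.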
F6.1.211 VSRI


Vector Shift Right and Insert takes each element in the operand vector, right shifts them by an immediate value, and 
inserts the results in the destination vector. Bits shifted out of the right of each element are lost. 


The elements must all be the same size, and can be 8-bit, 16-bit, 32-bit, or 64-bit. There is no distinction between 
data types. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


|31 30 29 28|27 26 25 24|23|22|21     16|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 0  0  1  1| 1| D|  imm6   |  Vd   | 0  1  0  0| L| Q| M| 1|  Vm   |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.


VSRI{<c>}{<q>}.<size> {<Dd>,} <Dm>, #<imm> 


128-bit SIMD vector variant 
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSRI{<c>}{<q>}.<size> {<Qd>,} <Qm>, #<imm> 


Decode for all variants of this encoding 


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx'  esize = 8;  elements = 8;  shift_amount = 16 - UInt(imm6);
    when '001xxxx'  esize = 16;  elements = 4;  shift_amount = 32 - UInt(imm6);
    when '01xxxxx'  esize = 32;  elements = 2;  shift_amount = 64 - UInt(imm6);
    when '1xxxxxx'  esize = 64;  elements = 1;  shift_amount = 64 - UInt(imm6);
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T1


|15 14 13 12|11 10  9  8| 7| 6| 5      0|15   12|11 10  9  8| 7| 6| 5| 4| 3    0|
| 1  1  1  1| 1  1  1  1| 1| D|  imm6   |  Vd   | 0  1  0  0| L| Q| M| 1|  Vm   |


64-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 0.

VSRI{<c>}{<q>}.<size> {<Dd>,} <Dm>, #<imm>


128-bit SIMD vector variant
Applies when !(imm6 == 000xxx && L == 0) && Q == 1.


VSRI{<c>}{<q>}.<size> {<Qd>,} <Qm>, #<imm> 
Decode for all variants of this encoding


if (L:imm6) == '0000xxx' then SEE "Related encodings";
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
case L:imm6 of
    when '0001xxx'  esize = 8;  elements = 8;  shift_amount = 16 - UInt(imm6);
    when '001xxxx'  esize = 16;  elements = 4;  shift_amount = 32 - UInt(imm6);
    when '01xxxxx'  esize = 32;  elements = 2;  shift_amount = 64 - UInt(imm6);
    when '1xxxxxx'  esize = 64;  elements = 1;  shift_amount = 64 - UInt(imm6);
d = UInt(D:Vd);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


Notes for all encodings 


Related encodings: See Advanced SIMD one register and modified immediate on page F3-2460 for the T32 
instruction set, or Advanced SIMD one register and modified immediate on page F4-2547 for the A32 instruction 
set. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size for the elements of the vectors, encoded in the "L:imm6<5:3>" field. It can have the 
following values: 
8 when L = 0, imm6<5:3> = 001 
16 when L = 0, imm6<5:3> = 01x
32 when L = 0, imm6<5:3> = 1xx 
64 when L = 1, imm6<5:3> = xxx 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 
<imm> Is an immediate value, in the range 1 to <size>, encoded in the "imm6" field as <size> - <imm>. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    mask = LSR(Ones(esize), shift_amount);
    for r = 0 to regs-1
        for e = 0 to elements-1
            shifted_op = LSR(Elem[D[m+r],e,esize], shift_amount);
            Elem[D[d+r],e,esize] = (Elem[D[d+r],e,esize] AND NOT(mask)) OR shifted_op;
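
The per-element effect of the VSRI operation can be sketched in Python as follows. This is an illustrative model only, assuming unsigned integers for element values; the name vsri_element is hypothetical.

```python
def vsri_element(dest, src, esize, shift_amount):
    """Hypothetical model of one VSRI element update.

    The high shift_amount bits of dest are preserved; the remaining bits
    are replaced by src shifted right.
    """
    ones = (1 << esize) - 1
    mask = ones >> shift_amount                  # LSR(Ones(esize), shift_amount)
    shifted_op = (src & ones) >> shift_amount    # bits shifted out are lost
    return (dest & ~mask & ones) | shifted_op
```

When the shift amount equals the element size, the mask is zero and the destination element is left unchanged.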
F6.1.212 VST1 (single element from one lane)


Store single element from one lane of one register stores one element to memory from one element of a register. For 
details of the addressing mode see The Advanced SIMD addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 30 29 28|27 26 25 24|23|22|21|20|19   16|15   12|11 10| 9| 8| 7         4| 3    0|
| 1  1  1  1| 0  1  0  0| 1| D| 0| 0|  Rn   |  Vd   | !=11| 0| 0| index_align|  Rm   |

size is bits<11:10>.


Offset variant 
Applies when Rm == 1111. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1;  index = UInt(index_align<3:1>);  alignment = 1;
    when '01'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 2;  index = UInt(index_align<3:2>);
        alignment = if index_align<0> == '0' then 1 else 2;
    when '10'
        if index_align<2> != '0' then UNDEFINED;
        if index_align<1:0> != '00' && index_align<1:0> != '11' then UNDEFINED;
        ebytes = 4;  index = UInt(index_align<3>);
        alignment = if index_align<1:0> == '00' then 1 else 4;
d = UInt(D:Vd);  n = UInt(Rn);  m = UInt(Rm);
wback = (m != 15);  register_index = (m != 15 && m != 13);
if n == 15 then UNPREDICTABLE;








T1
|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3    0|15   12|11 10| 9| 8| 7         4| 3    0|
| 1  1  1  1| 1  0  0  1| 1| D| 0| 0|  Rn   |  Vd   | !=11| 0| 0| index_align|  Rm   |

size is bits<11:10> of the second halfword.

Offset variant
Applies when Rm == 1111.
VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]


Post-indexed variant 
Applies when Rm == 1101. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1;  index = UInt(index_align<3:1>);  alignment = 1;
    when '01'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 2;  index = UInt(index_align<3:2>);
        alignment = if index_align<0> == '0' then 1 else 2;
    when '10'
        if index_align<2> != '0' then UNDEFINED;
        if index_align<1:0> != '00' && index_align<1:0> != '11' then UNDEFINED;
        ebytes = 4;  index = UInt(index_align<3>);
        alignment = if index_align<1:0> == '00' then 1 else 4;
d = UInt(D:Vd);  n = UInt(Rn);  m = UInt(Rm);
wback = (m != 15);  register_index = (m != 15 && m != 13);
if n == 15 then UNPREDICTABLE;





Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01
32 when size = 10 
<list> Is a list containing the single 64-bit name of the SIMD&FP register holding the element. The list
must be { <Dd>[<index>] }. The register <Dd> is encoded in the "D:Vd" field. The permitted values
and encoding of <index> depend on <size>:
<size> == 8 <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
<size> == 16 <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
<size> == 32 <index> is 0 or 1, encoded in the "index_align<3>" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field. 
<align> When <size> == 8, <align> must be omitted, otherwise it is the optional alignment. Whenever
<align> is omitted, the standard alignment is used, see Unaligned data access on page E2-2323, and
the encoding depends on <size>:


<size> == 8 Encoded in the "index_align<0>" field as 0.
<size> == 16 Encoded in the "index_align<1:0>" field as 0b00.
<size> == 32 Encoded in the "index_align<2:0>" field as 0b000.


Whenever <align> is present, the permitted values and encoding depend on <size>: 


<size> == 16 <align> is 16, meaning 16-bit alignment, encoded in the 
"index_align<1:0>" field as 0b01. 


<size> == 32 <align> is 32, meaning 32-bit alignment, encoded in the 
"index_align<2:0>" field as 0b011. 


: is the preferred separator before the <align> value, but the alignment can be specified as @<align>, 
see The Advanced SIMD addressing mode on page F2-2427. 


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    address = R[n];  iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    MemU[address,ebytes] = Elem[D[d], index];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + ebytes;
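
The index_align decode for this instruction can be sketched in Python as follows. This is an illustrative helper, not part of the architecture: the name decode_vst1_lane is hypothetical, and an UNDEFINED encoding is modeled by raising ValueError.

```python
def decode_vst1_lane(size, index_align):
    """Hypothetical decode of size and index_align for VST1 (single lane).

    Returns (ebytes, index, alignment), with alignment in bytes.
    """
    if size == 0b11:
        raise ValueError("UNDEFINED: size == 11")
    if size == 0b00:
        if index_align & 0b0001:
            raise ValueError("UNDEFINED: index_align<0> != 0")
        return 1, index_align >> 1, 1
    if size == 0b01:
        if index_align & 0b0010:
            raise ValueError("UNDEFINED: index_align<1> != 0")
        alignment = 1 if (index_align & 0b0001) == 0 else 2
        return 2, index_align >> 2, alignment
    # size == 0b10
    if index_align & 0b0100:
        raise ValueError("UNDEFINED: index_align<2> != 0")
    low = index_align & 0b0011
    if low not in (0b00, 0b11):
        raise ValueError("UNDEFINED: index_align<1:0> not 00 or 11")
    return 4, index_align >> 3, (1 if low == 0b00 else 4)
```

For a 32-bit element, index_align = 0b1011 therefore selects element 1 with 4-byte alignment checking.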
F6.1.213 VST1 (multiple single elements)


Store multiple single elements from one, two, three, or four registers stores elements to memory from one, two, 
three, or four registers, without interleaving. Every element of each register is stored. For details of the addressing 
mode see The Advanced SIMD addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1


|31 30 29 28|27 26 25 24|23|22|21|20|19   16|15   12|11 10  9  8| 7  6| 5  4| 3    0|
| 1  1  1  1| 0  1  0  0| 0| D| 0| 0|  Rn   |  Vd   | x  x  1  x| size|align|  Rm   |

type is bits<11:8>.


Offset variant 
Applies when Rm == 1111. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


case type of
    when '0111'
        regs = 1;  if align<1> == '1' then UNDEFINED;
    when '1010'
        regs = 2;  if align == '11' then UNDEFINED;
    when '0110'
        regs = 3;  if align<1> == '1' then UNDEFINED;
    when '0010'
        regs = 4;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size);  elements = 8 DIV ebytes;
d = UInt(D:Vd);  n = UInt(Rn);  m = UInt(Rm);
wback = (m != 15);  register_index = (m != 15 && m != 13);
if n == 15 || d+regs > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If d+regs > 32, then one of the following behaviors must occur: 
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.
T1
|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3    0|15   12|11 10  9  8| 7  6| 5  4| 3    0|
| 1  1  1  1| 1  0  0  1| 0| D| 0| 0|  Rn   |  Vd   | x  x  1  x| size|align|  Rm   |

type is bits<11:8> of the second halfword.

Offset variant 
Applies when Rm == 1111. 
VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST1{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
case type of
    when '0111'
        regs = 1;  if align<1> == '1' then UNDEFINED;
    when '1010'
        regs = 2;  if align == '11' then UNDEFINED;
    when '0110'
        regs = 3;  if align<1> == '1' then UNDEFINED;
    when '0010'
        regs = 4;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size);  elements = 8 DIV ebytes;
d = UInt(D:Vd);  n = UInt(Rn);  m = UInt(Rm);
wback = (m != 15);  register_index = (m != 15 && m != 13);
if n == 15 || d+regs > 32 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If d+regs > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST1 (multiple single elements) on 
page K1-5472. 


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406.
<size> Is the data size, encoded in the "size" field. It can have the following values:
8 when size = 00
16 when size = 01
32 when size = 10
64 when size = 11
<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
{ <Dd> }
Encoded in the "type" field as 0b0111.
{ <Dd>, <Dd+1> }
Encoded in the "type" field as 0b1010.
{ <Dd>, <Dd+1>, <Dd+2> }
Encoded in the "type" field as 0b0110.
{ <Dd>, <Dd+1>, <Dd+2>, <Dd+3> }
Encoded in the "type" field as 0b0010.
The register <Dd> is encoded in the "D:Vd" field.
<Rn> Is the general-purpose base register, encoded in the "Rn" field.
<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
<align> is present, the permitted values are:


64 64-bit alignment, encoded in the "align" field as 0b01.


128 128-bit alignment, encoded in the "align" field as 0b10. Available only if <list> contains
two or four registers.


256 256-bit alignment, encoded in the "align" field as 0b11. Available only if <list> contains
four registers.


: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.


For more information about <Rn>, !, and <Rm>, see The Advanced SIMD addressing mode on page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    address = R[n];  iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for r = 0 to regs-1
        for e = 0 to elements-1
            if ebytes != 8 then
                MemU[address,ebytes] = Elem[D[d+r],e];
            else
                bits(64) data = Elem[D[d+r],e];
                MemU[address,4] = if BigEndian() then data<63:32> else data<31:0>;
                MemU[address+4,4] = if BigEndian() then data<31:0> else data<63:32>;
            address = address + ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 8*regs;
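
The store loop above can be sketched in Python on a byte array. This is a simplified model under several assumptions that are not part of the source: memory is a bytearray, each D register is passed as a 64-bit integer, alignment checking is omitted, and the big_endian flag stands in for BigEndian() and is assumed to also select the byte order of each MemU access. The name vst1_multiple is hypothetical.

```python
def vst1_multiple(mem, address, d_regs, ebytes, big_endian=False):
    """Hypothetical model of the VST1 (multiple single elements) store loop.

    Stores every element of each register, lowest-numbered element first,
    and returns the address following the last byte written.
    """
    order = "big" if big_endian else "little"
    elem_mask = (1 << (8 * ebytes)) - 1
    for value in d_regs:
        for e in range(8 // ebytes):
            data = (value >> (8 * ebytes * e)) & elem_mask
            if ebytes != 8:
                mem[address:address + ebytes] = data.to_bytes(ebytes, order)
            else:
                # A 64-bit element is written as two 32-bit accesses.
                low, high = data & 0xFFFFFFFF, data >> 32
                first, second = (high, low) if big_endian else (low, high)
                mem[address:address + 4] = first.to_bytes(4, order)
                mem[address + 4:address + 8] = second.to_bytes(4, order)
            address += ebytes
    return address
```

Without register indexing, the writeback value of the base register equals the returned address, that is, the original base plus 8*regs.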
F6.1.214 VST2 (single 2-element structure from one lane)


Store single 2-element structure from one lane of two registers stores one 2-element structure to memory from 
corresponding elements of two registers. For details of the addressing mode see The Advanced SIMD addressing 
mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1
|31 30 29 28|27 26 25 24|23|22|21|20|19   16|15   12|11 10| 9| 8| 7         4| 3    0|
| 1  1  1  1| 0  1  0  0| 1| D| 0| 0|  Rn   |  Vd   | !=11| 0| 1| index_align|  Rm   |

size is bits<11:10>.

Offset variant 
Applies when Rm == 1111. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 2;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '10'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d2 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


T1 


|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3     0|15    12|11 10|9 8|7         4|3     0|
| 1  1  1  1| 1  0  0  1| 1| D| 0  0|   Rn   |   Vd   | !=11|0 1|index_align|   Rm  |
                                                        size


Offset variant 
Applies when Rm == 1111. 


VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 2;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '10'
        if index_align<1> != '0' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If d2 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST2 (single 2-element structure from 
one lane) on page K1-5473. 


Assembler symbols 


<c>      For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
         unconditional.

         For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q>      See Standard assembler syntax fields on page F2-2406.

<size>   Is the data size, encoded in the "size" field. It can have the following values:
         8     when size = 00
         16    when size = 01
         32    when size = 10

<list>   Is a list containing the 64-bit names of the two SIMD&FP registers holding the element. The list
         must be one of:
         { <Dd>[<index>], <Dd+1>[<index>] }
               Single-spaced registers, encoded as "spacing" = 0.
         { <Dd>[<index>], <Dd+2>[<index>] }
               Double-spaced registers, encoded as "spacing" = 1. Not permitted when <size> == 8.
         The encoding of "spacing" depends on <size>:
         <size> == 16   "spacing" is encoded in the "index_align<1>" field.
         <size> == 32   "spacing" is encoded in the "index_align<2>" field.
         The register <Dd> is encoded in the "D:Vd" field. The permitted values and encoding of <index>
         depend on <size>:
         <size> == 8    <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
         <size> == 16   <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
         <size> == 32   <index> is 0 or 1, encoded in the "index_align<3>" field.

<Rn>     Is the general-purpose base register, encoded in the "Rn" field.

<align>  Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
         Unaligned data access on page E2-2323, and the encoding depends on <size>:
         <size> == 8    Encoded in the "index_align<0>" field as 0.
         <size> == 16   Encoded in the "index_align<0>" field as 0.
         <size> == 32   Encoded in the "index_align<1:0>" field as 0b00.
         Whenever <align> is present, the permitted values and encoding depend on <size>:
         <size> == 8    <align> is 16, meaning 16-bit alignment, encoded in the "index_align<0>" field as 1.
         <size> == 16   <align> is 32, meaning 32-bit alignment, encoded in the "index_align<0>" field as 1.
         <size> == 32   <align> is 64, meaning 64-bit alignment, encoded in the "index_align<1:0>" field as 0b01.
         : is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
         see The Advanced SIMD addressing mode on page F2-2427.

<Rm>     Is the general-purpose index register containing an offset applied after the access, encoded in the
         "Rm" field.


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    MemU[address,ebytes] = Elem[D[d],index];
    MemU[address+ebytes,ebytes] = Elem[D[d2],index];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 2*ebytes;
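
The decode rules for the "size" and "index_align" fields can be easier to follow as executable code. The following Python function is only an illustrative sketch of the VST2 (single 2-element structure from one lane) decode pseudocode, not part of the architecture. The name decode_vst2_single and the use of ValueError to signal an UNDEFINED encoding are assumptions made for this example.

```python
def decode_vst2_single(size: int, index_align: int):
    """Sketch of the VST2 (single lane) decode pseudocode.

    size is the 2-bit "size" field and index_align the 4-bit "index_align"
    field. Returns (ebytes, index, inc, alignment). ValueError stands in for
    UNDEFINED; that choice is an assumption of this example.
    """
    if size == 0b11:
        raise ValueError("UNDEFINED: size == '11'")

    def bit(n: int) -> int:
        # Extract index_align<n>.
        return (index_align >> n) & 1

    if size == 0b00:
        ebytes, index, inc = 1, index_align >> 1, 1        # index_align<3:1>
        alignment = 1 if bit(0) == 0 else 2
    elif size == 0b01:
        ebytes, index = 2, index_align >> 2                # index_align<3:2>
        inc = 1 if bit(1) == 0 else 2
        alignment = 1 if bit(0) == 0 else 4
    else:
        if bit(1) != 0:
            raise ValueError("UNDEFINED: index_align<1> != '0'")
        ebytes, index = 4, index_align >> 3                # index_align<3>
        inc = 1 if bit(2) == 0 else 2
        alignment = 1 if bit(0) == 0 else 8
    return ebytes, index, inc, alignment
```

For example, size = 0b01 with index_align = 0b1011 selects 16-bit elements, lane 2, double-spaced registers, and 32-bit (4-byte) alignment.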





F6-3726 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions 
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


F6.1.215 VST2 (multiple 2-element structures) 


Store multiple 2-element structures from two or four registers stores multiple 2-element structures from two or four 
registers to memory, with interleaving. For more information, see Element and structure load/store instructions on 
page F1-2388. Every element of each register is saved. For details of the addressing mode see The Advanced SIMD 
addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19    16|15    12|11      8|7  6|5   4|3     0|
| 1  1  1  1| 0  1  0  0| 0| D| 0  0|   Rn   |   Vd   | x 0 x x |size|align|   Rm  |
                                                         type
Offset variant 
Applies when Rm == 1111. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
case type of
    when '1000'
        regs = 1; inc = 1; if align == '11' then UNDEFINED;
    when '1001'
        regs = 1; inc = 2; if align == '11' then UNDEFINED;
    when '0011'
        regs = 2; inc = 2;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2+regs > 32 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If d2+regs > 32, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


T1
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3     0|15    12|11      8|7  6|5   4|3     0|
| 1  1  1  1| 1  0  0  1| 0| D| 0  0|   Rn   |   Vd   | x 0 x x |size|align|   Rm  |
                                                         type
Offset variant 
Applies when Rm == 1111. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST2{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
case type of
    when '1000'
        regs = 1; inc = 1; if align == '11' then UNDEFINED;
    when '1001'
        regs = 1; inc = 2; if align == '11' then UNDEFINED;
    when '0011'
        regs = 2; inc = 2;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d2+regs > 32 then UNPREDICTABLE;

CONSTRAINED UNPREDICTABLE behavior

If d2+regs > 32, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST2 (multiple 2-element structures) 
on page K1-5472. 


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 

Assembler symbols


<c>      For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
         unconditional.

         For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q>      See Standard assembler syntax fields on page F2-2406.

<size>   Is the data size, encoded in the "size" field. It can have the following values:
         8     when size = 00
         16    when size = 01
         32    when size = 10
         The encoding size = 11 is reserved.

<list>   Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
         { <Dd>, <Dd+1> }
               Single-spaced registers, encoded in the "type" field as 0b1000.
         { <Dd>, <Dd+2> }
               Double-spaced registers, encoded in the "type" field as 0b1001.
         { <Dd>, <Dd+1>, <Dd+2>, <Dd+3> }
               Single-spaced registers, encoded in the "type" field as 0b0011.
         The register <Dd> is encoded in the "D:Vd" field.

<Rn>     Is the general-purpose base register, encoded in the "Rn" field.

<align>  Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
         Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
         <align> is present, the permitted values are:
         64    64-bit alignment, encoded in the "align" field as 0b01.
         128   128-bit alignment, encoded in the "align" field as 0b10.
         256   256-bit alignment, encoded in the "align" field as 0b11. Available only if <list> contains
               four registers.
         : is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
         see The Advanced SIMD addressing mode on page F2-2427.

<Rm>     Is the general-purpose index register containing an offset applied after the access, encoded in the
         "Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for r = 0 to regs-1
        for e = 0 to elements-1
            MemU[address,ebytes] = Elem[D[d+r],e];
            MemU[address+ebytes,ebytes] = Elem[D[d2+r],e];
            address = address + 2*ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 16*regs;
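
The interleaving performed by the nested loops above can be modelled in a few lines of Python. This is a hedged sketch rather than a definitive implementation: the function name vst2_multiple and the dictionary-based register bank are hypothetical, and the model assumes little-endian data accesses, so that element e of a doubleword register occupies bytes e*ebytes to (e+1)*ebytes-1 of its 8-byte image.

```python
def vst2_multiple(dregs, d, inc, regs, ebytes):
    """Return the bytes VST2 (multiple) would write, in ascending address order.

    dregs maps a D register number to its 8-byte contents (little-endian
    image, an assumption of this sketch). d, inc, regs and ebytes have the
    meanings given by the decode pseudocode.
    """
    d2 = d + inc
    elements = 8 // ebytes
    out = bytearray()
    for r in range(regs):
        for e in range(elements):
            lo = e * ebytes
            out += dregs[d + r][lo:lo + ebytes]    # Elem[D[d+r], e]
            out += dregs[d2 + r][lo:lo + ebytes]   # Elem[D[d2+r], e]
    return bytes(out)


# Two registers, 16-bit elements: type = 0b1000, so regs = 1 and inc = 1.
bank = {0: bytes(range(0, 8)), 1: bytes(range(16, 24))}
stored = vst2_multiple(bank, d=0, inc=1, regs=1, ebytes=2)
```

With the four-register list (type = 0b0011, regs = 2, inc = 2), the first 16 bytes interleave <Dd> with <Dd+2> and the next 16 bytes interleave <Dd+1> with <Dd+3>, which matches the 16*regs writeback amount in the pseudocode.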

F6.1.216 VST3 (single 3-element structure from one lane)


Store single 3-element structure from one lane of three registers stores one 3-element structure to memory from 
corresponding elements of three registers. For details of the addressing mode see The Advanced SIMD addressing 
mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19    16|15    12|11 10|9 8|7         4|3     0|
| 1  1  1  1| 0  1  0  0| 1| D| 0  0|   Rn   |   Vd   | !=11|1 0|index_align|   Rm  |
                                                        size


Offset variant 
Applies when Rm == 1111. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>] 


Post-indexed variant 
Applies when Rm == 1101. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
    when '01'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
    when '10'
        if index_align<1:0> != '00' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;








CONSTRAINED UNPREDICTABLE behavior 


If d3 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


T1
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3     0|15    12|11 10|9 8|7         4|3     0|
| 1  1  1  1| 1  0  0  1| 1| D| 0  0|   Rn   |   Vd   | !=11|1 0|index_align|   Rm  |
                                                        size


Offset variant 
Applies when Rm == 1111. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>] 


Post-indexed variant 
Applies when Rm == 1101. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
case size of
    when '00'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
    when '01'
        if index_align<0> != '0' then UNDEFINED;
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
    when '10'
        if index_align<1:0> != '00' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;








CONSTRAINED UNPREDICTABLE behavior 


If d3 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST3 (single 3-element structure from 
one lane) on page K1-5473. 

Assembler symbols


<c>      For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
         unconditional.

         For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q>      See Standard assembler syntax fields on page F2-2406.

<size>   Is the data size, encoded in the "size" field. It can have the following values:
         8     when size = 00
         16    when size = 01
         32    when size = 10

<list>   Is a list containing the 64-bit names of the three SIMD&FP registers holding the element. The list
         must be one of:
         { <Dd>[<index>], <Dd+1>[<index>], <Dd+2>[<index>] }
               Single-spaced registers, encoded as "spacing" = 0.
         { <Dd>[<index>], <Dd+2>[<index>], <Dd+4>[<index>] }
               Double-spaced registers, encoded as "spacing" = 1. Not permitted when <size> == 8.
         The encoding of "spacing" depends on <size>:
         <size> == 8    "spacing" is encoded in the "index_align<0>" field.
         <size> == 16   "spacing" is encoded in the "index_align<1>" field, and "index_align<0>" is set to 0.
         <size> == 32   "spacing" is encoded in the "index_align<2>" field, and "index_align<1:0>" is set to 0b00.
         The register <Dd> is encoded in the "D:Vd" field. The permitted values and encoding of <index>
         depend on <size>:
         <size> == 8    <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field.
         <size> == 16   <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field.
         <size> == 32   <index> is 0 or 1, encoded in the "index_align<3>" field.

<Rn>     Is the general-purpose base register, encoded in the "Rn" field.

<Rm>     Is the general-purpose index register containing an offset applied after the access, encoded in the
         "Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Alignment 


Standard alignment rules apply, see Alignment support on page B2-76. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n];
    MemU[address,ebytes] = Elem[D[d],index];
    MemU[address+ebytes,ebytes] = Elem[D[d2],index];
    MemU[address+2*ebytes,ebytes] = Elem[D[d3],index];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 3*ebytes;
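
The three addressing variants differ only in how the "Rm" field controls writeback. As a rough illustration, the Python sketch below mirrors the wback and register_index computation from the decode pseudocode and the final update of R[n]. The function name post_index and the explicit 32-bit wrap of the result are assumptions made for this example.

```python
def post_index(rn_value, rm_field, rm_value, transfer_bytes):
    """Return the value of R[n] after the store completes.

    rm_field is the 4-bit "Rm" field, rm_value the contents of R[m], and
    transfer_bytes the immediate writeback amount (3*ebytes for VST3 single
    lane). The 32-bit wrap is an assumption of this sketch.
    """
    wback = rm_field != 15
    register_index = rm_field != 15 and rm_field != 13
    if not wback:
        return rn_value                       # Offset variant: [<Rn>]
    step = rm_value if register_index else transfer_bytes
    return (rn_value + step) & 0xFFFFFFFF     # [<Rn>], <Rm>  or  [<Rn>]!


# VST3.16 single lane transfers 3 * 2 = 6 bytes.
no_wback = post_index(0x1000, 0b1111, 0, 6)
imm_wback = post_index(0x1000, 0b1101, 0, 6)
reg_wback = post_index(0x1000, 0b0010, 0x20, 6)
```

Here Rm == 1111 leaves the base register unchanged, Rm == 1101 advances it by the number of bytes transferred, and any other value adds the contents of <Rm>.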

F6.1.217 VST3 (multiple 3-element structures)


Store multiple 3-element structures from three registers stores multiple 3-element structures to memory from three 
registers, with interleaving. For more information, see Element and structure load/store instructions on
page F1-2388. Every element of each register is saved. For details of the addressing mode see The Advanced SIMD
addressing mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19    16|15    12|11      8|7  6|5   4|3     0|
| 1  1  1  1| 0  1  0  0| 0| D| 0  0|   Rn   |   Vd   | 0 1 0 x |size|align|   Rm  |
                                                         type


Offset variant 
Applies when Rm == 1111. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' || align<1> == '1' then UNDEFINED;
case type of
    when '0100'
        inc = 1;
    when '0101'
        inc = 2;
    otherwise
        SEE "Related encodings";
alignment = if align<0> == '0' then 1 else 8;
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d3 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


T1
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3     0|15    12|11      8|7  6|5   4|3     0|
| 1  1  1  1| 1  0  0  1| 0| D| 0  0|   Rn   |   Vd   | 0 1 0 x |size|align|   Rm  |
                                                         type


Offset variant 
Applies when Rm == 1111. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST3{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' || align<1> == '1' then UNDEFINED;
case type of
    when '0100'
        inc = 1;
    when '0101'
        inc = 2;
    otherwise
        SEE "Related encodings";
alignment = if align<0> == '0' then 1 else 8;
ebytes = 1 << UInt(size); elements = 8 DIV ebytes;
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d3 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d3 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.
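
The d3 > 31 condition depends only on the "D", "Vd", and "type" fields, so a tool such as an assembler or disassembler could detect it before execution. The following Python sketch shows one possible check. The function name vst3_registers is hypothetical, and the sketch covers only the register-list part of the UNPREDICTABLE test, not the n == 15 case.

```python
def vst3_registers(D: int, Vd: int, inc: int):
    """Compute the VST3 register list and whether it runs past D31.

    D is the 1-bit "D" field, Vd the 4-bit "Vd" field, and inc the register
    spacing (1 or 2) selected by the "type" field.
    Returns ((d, d2, d3), out_of_range).
    """
    d = (D << 4) | Vd          # UInt(D:Vd)
    d2 = d + inc
    d3 = d2 + inc
    return (d, d2, d3), d3 > 31


in_range = vst3_registers(D=1, Vd=0b1101, inc=1)
past_end = vst3_registers(D=1, Vd=0b1110, inc=1)
```

In this example { D29, D30, D31 } is a valid list, whereas a single-spaced list starting at D30 would need D32 and therefore falls under the CONSTRAINED UNPREDICTABLE behavior described above.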


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST3 (multiple 3-element structures) 
on page K1-5473. 


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 

Assembler symbols


<c>      For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
         unconditional.

         For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q>      See Standard assembler syntax fields on page F2-2406.

<size>   Is the data size, encoded in the "size" field. It can have the following values:
         8     when size = 00
         16    when size = 01
         32    when size = 10
         The encoding size = 11 is reserved.

<list>   Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of:
         { <Dd>, <Dd+1>, <Dd+2> }
               Single-spaced registers, encoded in the "type" field as 0b0100.
         { <Dd>, <Dd+2>, <Dd+4> }
               Double-spaced registers, encoded in the "type" field as 0b0101.
         The register <Dd> is encoded in the "D:Vd" field.

<Rn>     Is the general-purpose base register, encoded in the "Rn" field.

<align>  Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
         Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
         <align> is present, the only permitted value is 64, meaning 64-bit alignment, encoded in the "align"
         field as 0b01. : is the preferred separator before the <align> value, but the alignment can be specified
         as @<align>, see The Advanced SIMD addressing mode on page F2-2427.

<Rm>     Is the general-purpose index register containing an offset applied after the access, encoded in the
         "Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    address = R[n]; iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for e = 0 to elements-1
        MemU[address,ebytes] = Elem[D[d],e];
        MemU[address+ebytes,ebytes] = Elem[D[d2],e];
        MemU[address+2*ebytes,ebytes] = Elem[D[d3],e];
        address = address + 3*ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 24;

F6.1.218 VST4 (single 4-element structure from one lane)


Store single 4-element structure from one lane of four registers stores one 4-element structure to memory from 
corresponding elements of four registers. For details of the addressing mode see The Advanced SIMD addressing 
mode on page F2-2427. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19    16|15    12|11 10|9 8|7         4|3     0|
| 1  1  1  1| 0  1  0  0| 1| D| 0  0|   Rn   |   Vd   | !=11|1 1|index_align|   Rm  |
                                                        size
Offset variant 
Applies when Rm == 1111. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
    when '10'
        if index_align<1:0> == '11' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<1:0> == '00' then 1 else 4 << UInt(index_align<1:0>);
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d4 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3     0|15    12|11 10|9 8|7         4|3     0|
| 1  1  1  1| 1  0  0  1| 1| D| 0  0|   Rn   |   Vd   | !=11|1 1|index_align|   Rm  |
                                                        size
Offset variant 
Applies when Rm == 1111. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
case size of
    when '00'
        ebytes = 1; index = UInt(index_align<3:1>); inc = 1;
        alignment = if index_align<0> == '0' then 1 else 4;
    when '01'
        ebytes = 2; index = UInt(index_align<3:2>);
        inc = if index_align<1> == '0' then 1 else 2;
        alignment = if index_align<0> == '0' then 1 else 8;
    when '10'
        if index_align<1:0> == '11' then UNDEFINED;
        ebytes = 4; index = UInt(index_align<3>);
        inc = if index_align<2> == '0' then 1 else 2;
        alignment = if index_align<1:0> == '00' then 1 else 4 << UInt(index_align<1:0>);
d = UInt(D:Vd); d2 = d + inc; d3 = d2 + inc; d4 = d3 + inc; n = UInt(Rn); m = UInt(Rm);
wback = (m != 15); register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 


If d4 > 31, then one of the following behaviors must occur:

•   The instruction is UNDEFINED.
•   The instruction executes as NOP.
•   The memory locations specified by the instruction and the number of registers specified by the instruction if
    the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
    register becomes UNKNOWN. This behavior does not affect any other memory locations.


Notes for all encodings


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST4 (single 4-element structure from 
one lane) on page K1-5473. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01 
32 when size = 10 
<list> Is a list containing the 64-bit names of the four SIMD&FP registers holding the element. The list 


must be one of: 


{ <Dd>[<index>], <Dd+1>[<index>], <Dd+2>[<index>], <Dd+3>[<index>] } 


Single-spaced registers, encoded as "spacing" = @. 


{ <Dd>[<index>], <Dd+2>[<index>], <Dd+4>[<index>], <Dd+6>[<index>] } 


Double-spaced registers, encoded as "spacing" = 1. Not permitted when 
<size> == 


The encoding of "spacing" depends on <size>: 
<size> == 16 "spacing" is encoded in the "index_align<1>" field. 
<size> == 32 "spacing" is encoded in the "index_align<2>" field. 


The register <Dd> is encoded in the "D:Vd" field. The permitted values and encoding of <index> 
depend on <size>: 


<size> == <index> is in the range 0 to 7, encoded in the "index_align<3:1>" field. 
<size> == 16 <index> is in the range 0 to 3, encoded in the "index_align<3:2>" field. 
<size> == 32 <index> is 0 or 1, encoded in the "index_align<3>" field. 

<Rn> Is the general-purpose base register, encoded in the "Rn" field. 

<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see 


Unaligned data access on page E2-2323, and the encoding depends on <size>: 


<size> == Encoded in the "index_align<0>" field as 0. 

<size> == 16 Encoded in the "index_align<0>" field as 0. 

<size> == 32 Encoded in the "index_align<1:0>" field as 0b00. 

Whenever <align> is present, the permitted values and encoding depend on <size>: 

<size> == 8 <align> is 32, meaning 32-bit alignment, encoded in the "index_align<0>"
field as 1.

<size> == 16 <align> is 64, meaning 64-bit alignment, encoded in the "index_align<0>"
field as 1.

<size> == 32 <align> can be 64 or 128. 64-bit alignment is encoded in the
"index_align<1:0>" field as 0b01, and 128-bit alignment is encoded in the
"index_align<1:0>" field as 0b10.





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3739 
1ID092916 Non-Confidential 


F6 T32 and A32 Advanced SIMD and floating-point Instruction Descriptions 
F6.1 Alphabetical list of floating-point and Advanced SIMD instructions 


: is the preferred separator before the <align> value, but the alignment can be specified as @<align>,
see The Advanced SIMD addressing mode on page F2-2427.

<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the
"Rm" field.

For more information about the variants of this instruction, see The Advanced SIMD addressing mode on
page F2-2427.
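
As an illustration of the <size>-dependent field layout described above, the following Python sketch splits "index_align" into <index>, spacing and <align>. The function name and the returned tuple are hypothetical and are not part of the architecture pseudocode. A value of "index_align<1:0>" that the text does not list as a permitted alignment is simply rejected here.

```python
def decode_index_align(size, index_align):
    """Split the 4-bit index_align field for VST4 (single lane).

    size is the element size in bits (8, 16 or 32). Returns the tuple
    (index, spacing, align); align is None when <align> is omitted.
    """
    if size == 8:
        index = (index_align >> 1) & 0b111        # index_align<3:1>
        spacing = 0                               # double spacing not permitted
        align = 32 if index_align & 1 else None   # index_align<0>
    elif size == 16:
        index = (index_align >> 2) & 0b11         # index_align<3:2>
        spacing = (index_align >> 1) & 1          # index_align<1>
        align = 64 if index_align & 1 else None   # index_align<0>
    elif size == 32:
        index = (index_align >> 3) & 1            # index_align<3>
        spacing = (index_align >> 2) & 1          # index_align<2>
        listed = {0b00: None, 0b01: 64, 0b10: 128}
        low = index_align & 0b11                  # index_align<1:0>
        if low not in listed:
            raise ValueError("index_align<1:0> is not a listed alignment")
        align = listed[low]
    else:
        raise ValueError("size must be 8, 16 or 32")
    return index, spacing, align
```

For example, with <size> == 16 the value 0b1110 selects lane 3 of double-spaced registers with the standard alignment.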


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    address = R[n];  iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    MemU[address,          ebytes] = Elem[D[d], index];
    MemU[address+ebytes,   ebytes] = Elem[D[d2],index];
    MemU[address+2*ebytes, ebytes] = Elem[D[d3],index];
    MemU[address+3*ebytes, ebytes] = Elem[D[d4],index];
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 4*ebytes;
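
The single-lane store above can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions: memory is a bytearray, each D register is a 64-bit integer, data is little-endian, and alignment checking, enable checks and writeback are omitted. The name vst4_lane is hypothetical.

```python
def vst4_lane(mem, regs, address, d, inc, index, ebytes):
    """Store element `index` of D[d], D[d+inc], D[d+2*inc], D[d+3*inc].

    mem  : bytearray standing in for memory (assumed little-endian)
    regs : list of 32 integers, each holding one 64-bit D register
    Returns the address just past the four stored elements.
    """
    mask = (1 << (8 * ebytes)) - 1
    for k in range(4):
        # Elem[D[...], index] is the index-th ebytes-wide slice of the register.
        value = (regs[d + k * inc] >> (8 * ebytes * index)) & mask
        offset = address + k * ebytes
        mem[offset:offset + ebytes] = value.to_bytes(ebytes, "little")
    return address + 4 * ebytes
```

The returned address corresponds to the R[n] + 4*ebytes writeback value of the immediate post-indexed form.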
VST4 (multiple 4-element structures)


Store multiple 4-element structures from four registers stores multiple 4-element structures to memory from four
registers, with interleaving. For more information, see Element and structure load/store instructions on
page F1-2388. Every element of each register is saved. For details of the addressing mode see The Advanced SIMD
addressing mode on page F2-2427.


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21|20|19     16|15     12|11    8|7  6|5   4|3    0|
| 1  1  1  1| 0  1  0  0| 0| D| 0| 0|    Rn   |    Vd   |0 0 0 x|size|align|  Rm  |
                                                          type


Offset variant 
Applies when Rm == 1111. 


VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 


Post-indexed variant 
Applies when Rm == 1101. 


VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 


Post-indexed variant 
Applies when Rm != 11x1. 


VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 


Decode for all variants of this encoding 


if size == '11' then UNDEFINED;
case type of
    when '0000'
        inc = 1;
    when '0001'
        inc = 2;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size);  elements = 8 DIV ebytes;
d = UInt(D:Vd);  d2 = d + inc;  d3 = d2 + inc;  d4 = d3 + inc;  n = UInt(Rn);  m = UInt(Rm);
wback = (m != 15);  register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior 

If d4 > 31, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.
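
The decode steps for this encoding can be sketched in Python as follows. This is an illustrative model only: fields are passed as integers, and the UNDEFINED, UNPREDICTABLE and "Related encodings" outcomes are all reported as exceptions, which is an assumption of the sketch rather than architectural behavior. The function name is hypothetical.

```python
def decode_vst4_multiple(type_, size, align, D, Vd, Rn, Rm):
    """Mirror the VST4 (multiple 4-element structures) decode pseudocode."""
    if size == 0b11:
        raise ValueError("UNDEFINED")
    if type_ == 0b0000:
        inc = 1                      # single-spaced register list
    elif type_ == 0b0001:
        inc = 2                      # double-spaced register list
    else:
        raise ValueError('SEE "Related encodings"')
    alignment = 1 if align == 0b00 else 4 << align   # in bytes
    ebytes = 1 << size
    elements = 8 // ebytes
    d = (D << 4) | Vd                # UInt(D:Vd)
    d4 = d + 3 * inc
    wback = Rm != 15
    register_index = Rm != 15 and Rm != 13
    if Rn == 15 or d4 > 31:
        raise ValueError("UNPREDICTABLE")
    return {
        "inc": inc, "alignment": alignment, "ebytes": ebytes,
        "elements": elements, "d": d, "d4": d4,
        "wback": wback, "register_index": register_index,
    }
```

Note that an "align" value of 0b10 yields an alignment of 16 bytes, which matches the 128-bit alignment listed under the assembler symbols.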
T1
|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3      0|15     12|11    8|7  6|5   4|3    0|
| 1  1  1  1| 1  0  0  1| 0| D| 0| 0|    Rn   |    Vd   |0 0 0 x|size|align|  Rm  |
                                                          type
Offset variant 
Applies when Rm == 1111. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}] 
Post-indexed variant 
Applies when Rm == 1101. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}]! 
Post-indexed variant 
Applies when Rm != 11x1. 
VST4{<c>}{<q>}.<size> <list>, [<Rn>{:<align>}], <Rm> 
Decode for all variants of this encoding 
if size == '11' then UNDEFINED;
case type of
    when '0000'
        inc = 1;
    when '0001'
        inc = 2;
    otherwise
        SEE "Related encodings";
alignment = if align == '00' then 1 else 4 << UInt(align);
ebytes = 1 << UInt(size);  elements = 8 DIV ebytes;
d = UInt(D:Vd);  d2 = d + inc;  d3 = d2 + inc;  d4 = d3 + inc;  n = UInt(Rn);  m = UInt(Rm);
wback = (m != 15);  register_index = (m != 15 && m != 13);
if n == 15 || d4 > 31 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior 
If d4 > 31, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VST4 (multiple 4-element structures)
on page K1-5473.


Related encodings: See Advanced SIMD element or structure Load/Store on page F3-2479 for the T32 instruction 
set, or Advanced SIMD element or structure Load/Store on page F4-2553 for the A32 instruction set. 
Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<size> Is the data size, encoded in the "size" field. It can have the following values: 
8 when size = 00 
16 when size = 01 
32 when size = 10 


The encoding size = 11 is reserved. 


<list> Is a list containing the 64-bit names of the SIMD&FP registers. The list must be one of: 
{ <Dd>, <Dd+1>, <Dd+2>, <Dd+3> } 
Single-spaced registers, encoded in the "type" field as 0b0000. 
{ <Dd>, <Dd+2>, <Dd+4>, <Dd+6> } 
Double-spaced registers, encoded in the "type" field as 0b0001. 
The register <Dd> is encoded in the "D:Vd" field. 


<Rn> Is the general-purpose base register, encoded in the "Rn" field. 


<align> Is the optional alignment. Whenever <align> is omitted, the standard alignment is used, see
Unaligned data access on page E2-2323, and is encoded in the "align" field as 0b00. Whenever
<align> is present, the permitted values are:


64 64-bit alignment, encoded in the "align" field as 0b01.
128 128-bit alignment, encoded in the "align" field as 0b10.
256 256-bit alignment, encoded in the "align" field as 0b11.


: is the preferred separator before the <align> value, but the alignment can be specified as @<align>, 
see The Advanced SIMD addressing mode on page F2-2427. 


<Rm> Is the general-purpose index register containing an offset applied after the access, encoded in the 
"Rm" field. 


For more information about the variants of this instruction, see The Advanced SIMD addressing mode on 
page F2-2427. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    address = R[n];  iswrite = TRUE;
    - = AArch32.CheckAlignment(address, alignment, AccType_VEC, iswrite);
    for e = 0 to elements-1
        MemU[address,          ebytes] = Elem[D[d], e];
        MemU[address+ebytes,   ebytes] = Elem[D[d2],e];
        MemU[address+2*ebytes, ebytes] = Elem[D[d3],e];
        MemU[address+3*ebytes, ebytes] = Elem[D[d4],e];
        address = address + 4*ebytes;
    if wback then
        if register_index then
            R[n] = R[n] + R[m];
        else
            R[n] = R[n] + 32;
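
To make the interleaving in the operation pseudocode concrete, here is a minimal Python sketch. It assumes a bytearray for memory, 64-bit integers for the D registers and little-endian data, and it leaves out the alignment check and writeback. The function name is hypothetical.

```python
def vst4_multiple(mem, regs, address, d, inc, ebytes):
    """Store every element of four D registers, interleaved element by element.

    mem  : bytearray standing in for memory (assumed little-endian)
    regs : list of 32 integers, each holding one 64-bit D register
    Returns the address just past the last byte written.
    """
    elements = 8 // ebytes
    mask = (1 << (8 * ebytes)) - 1
    for e in range(elements):
        for k in range(4):           # D[d], D[d2], D[d3], D[d4] in turn
            value = (regs[d + k * inc] >> (8 * ebytes * e)) & mask
            mem[address:address + ebytes] = value.to_bytes(ebytes, "little")
            address += ebytes
    return address
```

Whatever the element size, the four registers always occupy 32 bytes, which is why the immediate writeback in the pseudocode adds a constant 32.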
F6.1.220 VSTM, VSTMDB, VSTMIA


Store multiple SIMD&FP registers stores multiple registers from the Advanced SIMD and floating-point register 
file to consecutive memory locations using an address from a general-purpose register. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


This instruction is used by the alias VPUSH. See Alias conditions on page F6-3747 for details of when each alias is 
preferred. 


A1 


|31     28|27 26 25|24|23|22|21|20|19     16|15     12|11 10| 9  8|7         1| 0|
|  !=1111 | 1  1  0| P| U| D| W| 0|    Rn   |    Vd   | 1  0| 1  1| imm8<7:1> | 0|
    cond                                                                    imm8<0>


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


VSTMDB{<c>}{<q>}{.<size>} <Rn>!, <dreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


VSTM{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 
VSTMIA{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VSTR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE;  add = (U == '1');  wback = (W == '1');
d = UInt(D:Vd);  n = UInt(Rn);  imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2;  // If UInt(imm8) is odd, see "FSTMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VSTM with the same addressing mode but stores no registers.

If regs > 16 || (d+regs) > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.
A2
|31     28|27 26 25|24|23|22|21|20|19     16|15     12|11 10| 9  8|7            0|
|  !=1111 | 1  1  0| P| U| D| W| 0|    Rn   |    Vd   | 1  0| 1  0|     imm8     |
    cond


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


VSTMDB{<c>}{<q>}{.<size>} <Rn>!, <sreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


VSTM{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 
VSTMIA{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VSTR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = TRUE;  add = (U == '1');  wback = (W == '1');  d = UInt(Vd:D);  n = UInt(Rn);
imm32 = ZeroExtend(imm8:'00', 32);  regs = UInt(imm8);
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || (d+regs) > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VSTM with the same addressing mode but stores no registers.

If (d+regs) > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.





T1
|15 14 13 12|11 10  9| 8| 7| 6| 5| 4| 3      0|15     12|11 10| 9  8|7         1| 0|
| 1  1  1  0| 1  1  0| P| U| D| W| 0|    Rn   |    Vd   | 1  0| 1  1| imm8<7:1> | 0|
                                                                              imm8<0>
Decrement Before variant
Applies when P == 1 && U == 0 && W == 1.
VSTMDB{<c>}{<q>}{.<size>} <Rn>!, <dreglist> 
Increment After variant
Applies when P == 0 && U == 1. 


VSTM{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 
VSTMIA{<c>}{<q>}{.<size>} <Rn>{!}, <dreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VSTR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = FALSE;  add = (U == '1');  wback = (W == '1');
d = UInt(D:Vd);  n = UInt(Rn);  imm32 = ZeroExtend(imm8:'00', 32);
regs = UInt(imm8) DIV 2;  // If UInt(imm8) is odd, see "FSTMX".
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || regs > 16 || (d+regs) > 32 then UNPREDICTABLE;
if imm8<0> == '1' && (d+regs) > 16 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VSTM with the same addressing mode but stores no registers.

If regs > 16 || (d+regs) > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.


T2 


|15 14 13 12|11 10  9| 8| 7| 6| 5| 4| 3      0|15     12|11 10| 9  8|7            0|
| 1  1  1  0| 1  1  0| P| U| D| W| 0|    Rn   |    Vd   | 1  0| 1  0|     imm8     |


Decrement Before variant 
Applies when P == 1 && U == 0 && W == 1. 


VSTMDB{<c>}{<q>}{.<size>} <Rn>!, <sreglist> 


Increment After variant 
Applies when P == 0 && U == 1. 


VSTM{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 
VSTMIA{<c>}{<q>}{.<size>} <Rn>{!}, <sreglist> 


Decode for all variants of this encoding 


if P == '0' && U == '0' && W == '0' then SEE "Related encodings";
if P == '1' && W == '0' then SEE VSTR;
if P == U && W == '1' then UNDEFINED;
// Remaining combinations are PUW = 010 (IA without !), 011 (IA with !), 101 (DB with !)
single_regs = TRUE;  add = (U == '1');  wback = (W == '1');  d = UInt(Vd:D);  n = UInt(Rn);
imm32 = ZeroExtend(imm8:'00', 32);  regs = UInt(imm8);
if n == 15 && (wback || CurrentInstrSet() != InstrSet_A32) then UNPREDICTABLE;
if regs == 0 || (d+regs) > 32 then UNPREDICTABLE;


CONSTRAINED UNPREDICTABLE behavior

If regs == 0, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The instruction operates as a VSTM with the same addressing mode but stores no registers.

If (d+regs) > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• The memory locations specified by the instruction and the number of registers specified by the instruction if
the register list had not gone out of range, become UNKNOWN. If the instruction specifies writeback, then that
register becomes UNKNOWN. This behavior does not affect any other memory locations.

Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors, and particularly VSTM on page K1-5473. 


Related encodings: See Advanced SIMD and floating-point 64-bit move on page F3-2448 for the T32 instruction 
set, or Advanced SIMD and floating-point 64-bit move on page F4-2532 for the A32 instruction set. 


Alias conditions 





Alias    Is preferred when
VPUSH    P == '1' && U == '0' && W == '1' && Rn == '1101'
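
The P, U and W checks in the decode pseudocode, together with the VPUSH alias condition, can be summarized in a short Python sketch. The function name and the returned strings are hypothetical labels chosen for illustration.

```python
def classify_vstm(P, U, W, Rn):
    """Classify a VSTM-family encoding from its P, U, W bits and Rn field."""
    if P == 0 and U == 0 and W == 0:
        return 'SEE "Related encodings"'
    if P == 1 and W == 0:
        return "SEE VSTR"
    if P == U and W == 1:
        return "UNDEFINED"
    # Remaining combinations are PUW = 010, 011 and 101.
    if P == 1:
        # Decrement Before with writeback; VPUSH is preferred when Rn is SP.
        return "VPUSH" if Rn == 0b1101 else "VSTMDB"
    return "VSTMIA"
```

For example, PUW = 101 with Rn == 1101 is disassembled as VPUSH, while the same PUW with any other base register is VSTMDB.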





Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<size> An optional data size specifier. If present, it must be equal to the size in bits, 32 or 64, of the registers
being transferred.


<Rn> Is the general-purpose base register, encoded in the "Rn" field. If writeback is not specified, the PC 
can be used. However, ARM deprecates use of the PC. 


! Specifies base register writeback. Encoded in the "W" field as 1 if present, otherwise 0. 


<sreglist> Is the list of consecutively numbered 32-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "Vd:D", and "imm8" is set to the number of registers in the list. The list must 
contain at least one register. 


<dreglist> Is the list of consecutively numbered 64-bit SIMD&FP registers to be transferred. The first register 
in the list is encoded in "D:Vd", and "imm8" is set to twice the number of registers in the list. The 
list must contain at least one register, and must not contain more than 16 registers. 
Operation for all encodings


if ConditionPassed() then
    EncodingSpecificOperations();  CheckVFPEnabled(TRUE);
    address = if add then R[n] else R[n]-imm32;
    for r = 0 to regs-1
        if single_regs then
            MemA[address,4] = S[d+r];  address = address+4;
        else
            // Store as two word-aligned words in the correct order for current endianness.
            MemA[address,4] = if BigEndian() then D[d+r]<63:32> else D[d+r]<31:0>;
            MemA[address+4,4] = if BigEndian() then D[d+r]<31:0> else D[d+r]<63:32>;
            address = address+8;
    if wback then R[n] = if add then R[n]+imm32 else R[n]-imm32;
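
The doubleword path of the operation above (single_regs == FALSE) can be modelled as follows. This is a sketch under stated assumptions: memory is a bytearray, R and D are Python lists of integers, imm32 is taken as 8*regs (an even imm8), and enable and alignment checks are omitted. The function name is hypothetical.

```python
def vstm_d(mem, R, D, n, d, regs, add=True, wback=False, big_endian=False):
    """Store D[d] .. D[d+regs-1] as pairs of word accesses, as VSTM does."""
    imm32 = 8 * regs
    address = R[n] if add else R[n] - imm32
    order = "big" if big_endian else "little"
    for r in range(regs):
        lo = D[d + r] & 0xFFFFFFFF        # D[d+r]<31:0>
        hi = D[d + r] >> 32               # D[d+r]<63:32>
        # The word order depends on the current endianness.
        first, second = (hi, lo) if big_endian else (lo, hi)
        mem[address:address + 4] = first.to_bytes(4, order)
        mem[address + 4:address + 8] = second.to_bytes(4, order)
        address += 8
    if wback:
        R[n] = R[n] + imm32 if add else R[n] - imm32
```

Calling it with add=False and wback=True on R[13] reproduces the stack push performed by the VPUSH alias.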
F6.1.221 VSTR


Store SIMD&FP register stores a single register from the Advanced SIMD and floating-point register file to 
memory, using an address from a general-purpose register, with an optional offset. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 
|31     28|27 26 25 24|23|22|21|20|19     16|15     12|11 10| 9  8|7            0|
|  !=1111 | 1  1  0  1| U| D| 0| 0|    Rn   |    Vd   | 1  0| 1  x|     imm8     |
    cond                                                      size


Single-precision scalar variant 
Applies when size == 10. 


VSTR{<c>}{<q>}{.32} <Sd>, [<Rn>{, #{+/-}<imm>}] 


Double-precision scalar variant 
Applies when size == 11. 


VSTR{<c>}{<q>}{.64} <Dd>, [<Rn>{, #{+/-}<imm>}] 


Decode for all variants of this encoding 


if size IN { '00', '01' } then UNDEFINED;
esize = 8 << UInt(size);  add = (U == '1');
imm32 = if esize == 16 then ZeroExtend(imm8:'0', 32) else ZeroExtend(imm8:'00', 32);
d = UInt(D:Vd);  n = UInt(Rn);
if n == 15 && CurrentInstrSet() != InstrSet_A32 then UNPREDICTABLE;


T1 
|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3      0|15     12|11 10| 9  8|7            0|
| 1  1  1  0| 1  1  0  1| U| D| 0| 0|    Rn   |    Vd   | 1  0| 1  x|     imm8     |
                                                                size


Single-precision scalar variant 
Applies when size == 10. 


VSTR{<c>}{<q>}{.32} <Sd>, [<Rn>{, #{+/-}<imm>}] 


Double-precision scalar variant 
Applies when size == 11. 


VSTR{<c>}{<q>}{.64} <Dd>, [<Rn>{, #{+/-}<imm>}] 


Decode for all variants of this encoding 


if size IN { '00', '01' } then UNDEFINED;
esize = 8 << UInt(size);  add = (U == '1');
imm32 = if esize == 16 then ZeroExtend(imm8:'0', 32) else ZeroExtend(imm8:'00', 32);
d = UInt(D:Vd);  n = UInt(Rn);
if n == 15 && CurrentInstrSet() != InstrSet_A32 then UNPREDICTABLE;







Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> See Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
.64 Is an optional data size specifier for 64-bit memory accesses that can be used in the assembler source
code, but is otherwise ignored.

<Dd> Is the 64-bit name of the SIMD&FP source register, encoded in the "D:Vd" field.

.32 Is an optional data size specifier for 32-bit memory accesses that can be used in the assembler source
code, but is otherwise ignored.


<Sd> Is the 32-bit name of the SIMD&FP source register, encoded in the "Vd:D" field. 

<Rn> Is the general-purpose base register, encoded in the "Rn" field. The PC can be used, but this is 
deprecated. 

+/- Specifies the offset is added to or subtracted from the base register, defaulting to + if omitted and
encoded in the "U" field. It can have the following values:
- when U = 0
+ when U = 1

<imm> Is the optional unsigned immediate byte offset, a multiple of 4, in the range 0 to 1020, defaulting to
0, and encoded in the "imm8" field as <imm>/4.
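
As a worked example of the "U" and "imm8" encodings described above, the following Python sketch converts a signed byte offset into the two fields and computes the resulting store address. Both function names are hypothetical. Representing the offset as one signed integer is an assumption of the sketch, so the assembler form #-0 cannot be expressed with it.

```python
def encode_vstr_offset(offset):
    """Return (U, imm8) for a signed byte offset such as -8 or 1020."""
    if offset % 4 != 0 or not -1020 <= offset <= 1020:
        raise ValueError("offset must be a multiple of 4 in the range -1020 to 1020")
    U = 1 if offset >= 0 else 0      # + when U = 1, - when U = 0
    imm8 = abs(offset) // 4          # encoded as <imm>/4
    return U, imm8


def vstr_address(rn_value, U, imm8):
    """Compute the 32-bit store address from the base value and the fields."""
    imm32 = imm8 << 2                # ZeroExtend(imm8:'00', 32)
    address = rn_value + imm32 if U else rn_value - imm32
    return address & 0xFFFFFFFF
```

For instance, an offset of -8 encodes as U = 0 and imm8 = 2, and with a base of 0x1000 the store address is 0xFF8.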


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckVFPEnabled(TRUE);
    address = if add then (R[n] + imm32) else (R[n] - imm32);
    case esize of
        when 32
            MemA[address,4] = S[d];
        when 64
            // Store as two word-aligned words in the correct order for current endianness.
            MemA[address,4] = if BigEndian() then D[d]<63:32> else D[d]<31:0>;
            MemA[address+4,4] = if BigEndian() then D[d]<31:0> else D[d]<63:32>;


F6.1.222 VSUB (floating-point) 


Vector Subtract (floating-point) subtracts the elements of one vector from the corresponding elements of another 
vector, and places the results in the destination vector. 


Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and mode in 
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp 
mode. For more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


|31 30 29 28|27 26 25 24|23|22|21|20|19     16|15     12|11 10  9  8| 7| 6| 5| 4|3    0|
| 1  1  1  1| 0  0  1  0| 0| D| 1|sz|    Vn   |    Vd   | 1  1  0  1| N| Q| M| 0|  Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VSUB{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VSUB{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE;
esize = 32;  elements = 2;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


A2 
|31     28|27 26 25 24|23|22|21 20|19     16|15     12|11 10| 9  8| 7| 6| 5| 4|3    0|
|  !=1111 | 1  1  1  0| 0| D| 1  1|    Vn   |    Vd   | 1  0| 1  x| N| 1| M| 0|  Vm  |
    cond                                                      size


Single-precision scalar variant 
Applies when size == 10. 


VSUB{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VSUB{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32;  d = UInt(Vd:D);  n = UInt(Vn:N);  m = UInt(Vm:M);
    when '11' esize = 64;  d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);







T1 


|15 14 13 12|11 10  9  8| 7| 6| 5| 4| 3      0|15     12|11 10  9  8| 7| 6| 5| 4|3    0|
| 1  1  1  0| 1  1  1  1| 0| D| 1|sz|    Vn   |    Vd   | 1  1  0  1| N| Q| M| 0|  Vm  |


64-bit SIMD vector variant 
Applies when Q == 0. 


VSUB{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 


128-bit SIMD vector variant 
Applies when Q == 1. 


VSUB{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 


Decode for all variants of this encoding 


if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if sz == '1' then UNDEFINED;
advsimd = TRUE;
esize = 32;  elements = 2;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;


T2 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3      0|15     12|11 10| 9  8| 7| 6| 5| 4|3    0|
| 1  1  1  0| 1  1  1  0| 0| D| 1  1|    Vn   |    Vd   | 1  0| 1  x| N| 1| M| 0|  Vm  |
                                                                size


Single-precision scalar variant 
Applies when size == 10. 


VSUB{<c>}{<q>}.F32 {<Sd>,} <Sn>, <Sm> 


Double-precision scalar variant 
Applies when size == 11. 


VSUB{<c>}{<q>}.F64 {<Dd>,} <Dn>, <Dm> 


Decode for all variants of this encoding 


if FPSCR.Len != '000' || FPSCR.Stride != '00' then UNDEFINED;
if size != '1x' then UNDEFINED;
advsimd = FALSE;
case size of
    when '10' esize = 32;  d = UInt(Vd:D);  n = UInt(Vn:N);  m = UInt(Vm:M);
    when '11' esize = 64;  d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);


Assembler symbols


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding A2, T1 and T2: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the vectors, encoded in the "sz" field. It can have the following 
values: 
F32 when sz = 0 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 

<Sd> Is the 32-bit name of the SIMD&FP destination register, encoded in the "Vd:D" field. 

<Sn> Is the 32-bit name of the first SIMD&FP source register, encoded in the "Vn:N" field. 

<Sm> Is the 32-bit name of the second SIMD&FP source register, encoded in the "Vm:M" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDOrVFPEnabled(TRUE, advsimd);
    if advsimd then  // Advanced SIMD instruction
        for r = 0 to regs-1
            for e = 0 to elements-1
                Elem[D[d+r],e,esize] = FPSub(Elem[D[n+r],e,esize], Elem[D[m+r],e,esize],
                                             StandardFPSCRValue());
    else             // VFP instruction
        case esize of
            when 32
                S[d] = FPSub(S[n], S[m], FPSCR);
            when 64
                D[d] = FPSub(D[n], D[m], FPSCR);
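
The Advanced SIMD path of the operation above, for one 64-bit register holding two F32 elements, can be approximated in Python as shown below. This is only a sketch: it relies on the host's IEEE 754 arithmetic and does not model the flush-to-zero and default NaN behavior implied by StandardFPSCRValue(), nor overflow to infinity (struct raises OverflowError instead). All names are hypothetical.

```python
import struct


def f32(bits):
    """Interpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bits32(x):
    """Round a Python float to single precision and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def vsub_f32(dn, dm):
    """Subtract the two F32 lanes of dm from the two F32 lanes of dn."""
    result = 0
    for e in range(2):               # elements = 2, esize = 32
        a = f32((dn >> (32 * e)) & 0xFFFFFFFF)
        b = f32((dm >> (32 * e)) & 0xFFFFFFFF)
        result |= bits32(a - b) << (32 * e)
    return result
```

A 128-bit (Q) operation applies the same per-register step to two consecutive D registers, as the loop over regs in the pseudocode indicates.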





F6.1.223 VSUB (integer)
Vector Subtract (integer) subtracts the elements of one vector from the corresponding elements of another vector, 
and places the results in the destination vector. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19     16|15     12|11 10  9  8| 7| 6| 5| 4|3    0|
| 1  1  1  1| 0  0  1  1| 0| D| size|    Vn   |    Vd   | 1  0  0  0| N| Q| M| 0|  Vm  |
64-bit SIMD vector variant 
Applies when Q == 0. 
VSUB{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VSUB{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
T1 
|15 14 13 12|11 10  9  8| 7| 6| 5  4| 3      0|15     12|11 10  9  8| 7| 6| 5| 4|3    0|
| 1  1  1  1| 1  1  1  1| 0| D| size|    Vn   |    Vd   | 1  0  0  0| N| Q| M| 0|  Vm  |
64-bit SIMD vector variant 
Applies when Q == 0. 
VSUB{<c>}{<q>}.<dt> {<Dd>, }<Dn>, <Dm> 
128-bit SIMD vector variant 
Applies when Q == 1. 
VSUB{<c>}{<q>}.<dt> {<Qd>, }<Qn>, <Qm> 
Decode for all variants of this encoding 
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size);  elements = 64 DIV esize;
d = UInt(D:Vd);  n = UInt(N:Vn);  m = UInt(M:Vm);  regs = if Q == '0' then 1 else 2;
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.

For encoding T1: see Standard assembler syntax fields on page F2-2406.

<q> See Standard assembler syntax fields on page F2-2406.

<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following
values:
I8 when size = 00
I16 when size = 01
I32 when size = 10
I64 when size = 11

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.

<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations();  CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            Elem[D[d+r],e,esize] = Elem[D[n+r],e,esize] - Elem[D[m+r],e,esize];
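
The per-element subtraction above wraps modulo 2^esize within each lane. A minimal Python sketch for a single 64-bit register, with D registers represented as integers and a hypothetical function name, is:

```python
def vsub_int(dn, dm, esize):
    """Lane-wise dn - dm on one 64-bit register with esize-bit elements."""
    mask = (1 << esize) - 1
    result = 0
    for e in range(64 // esize):     # elements = 64 DIV esize
        a = (dn >> (e * esize)) & mask
        b = (dm >> (e * esize)) & mask
        # The mask discards any borrow, so lanes never affect each other.
        result |= ((a - b) & mask) << (e * esize)
    return result
```

Because the result is truncated to esize bits, the same operation serves signed and unsigned elements alike.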





F6.1.224 VSUBHN 
Vector Subtract and Narrow, returning High Half subtracts the elements of one quadword vector from the 
corresponding elements of another quadword vector, takes the most significant half of each result, and places the 
final results in a doubleword vector. The results are truncated. For rounded results, see VRSUBHN. 
There is no distinction between signed and unsigned integers. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1 
|31 30 29 28|27 26 25 24|23|22|21 20|19     16|15     12|11 10  9  8| 7| 6| 5| 4|3    0|
| 1  1  1  1| 0  0  1  0| 1| D| !=11|    Vn   |    Vd   | 0  1  1  0| N| 0| M| 0|  Vm  |
                               size
A1 variant
VSUBHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm> 
Decode for this encoding 
if size == '11' then SEE "Related encodings"; 
if Vn<@> == '1' || Vm<@> == '1' then UNDEFINED; 
esize = 8 << UInt(size); elements = 64 DIV esize; 
d = UInt(D:Vd); n= UInt(N:Vn); m = UInt(M:Vm); 
T1
| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4  | 3 0 | 15 12 | 11 10 9 8 | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  0  | 1  1  1 1 | 1 | D | size | Vn  | Vd    | 0  1  1 0 | N | 0 | M | 0 | Vm  |
T1 variant
VSUBHN{<c>}{<q>}.<dt> <Dd>, <Qn>, <Qm>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vn<0> == '1' || Vm<0> == '1' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the 
following values: 


I16 when size = 00
I32 when size = 01
I64 when size = 10
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as
<Qm>*2.


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        result = Elem[Qin[n>>1],e,2*esize] - Elem[Qin[m>>1],e,2*esize];
        Elem[D[d],e,esize] = result<2*esize-1:esize>;
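
As an informal illustration only, and not part of the architectural definition, the narrowing step of VSUBHN can be modeled in Python. The function name vsubhn and the representation of a vector as a list of unsigned integers (index 0 is the least significant element) are assumptions made for this sketch.

```python
def vsubhn(qn, qm, esize):
    """Model of the VSUBHN data path (hypothetical helper, not an ARM API).

    qn, qm: lists of unsigned lanes, each 2*esize bits wide.
    Returns a list of esize-bit lanes holding the high half of each difference.
    """
    wide_mask = (1 << (2 * esize)) - 1
    narrow_mask = (1 << esize) - 1
    result = []
    for a, b in zip(qn, qm):
        diff = (a - b) & wide_mask                    # wrap to 2*esize bits
        result.append((diff >> esize) & narrow_mask)  # keep bits <2*esize-1:esize>
    return result


print([hex(x) for x in vsubhn([0x1234, 0x0100], [0x0034, 0x0200], 8)])
```

Because the result is truncated rather than rounded, the low esize bits of each difference are discarded without affecting the value that is kept.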








F6.1.225 VSUBL 
Vector Subtract Long subtracts the elements of one doubleword vector from the corresponding elements of another 
doubleword vector, and places the results in a quadword vector. Before subtracting, it sign-extends or zero-extends 
the elements of both operands. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1
| 31 30 29 28 | 27 26 25 | 24 | 23 | 22 | 21 20 | 19 16 | 15 12 | 11 10 9 | 8  | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  | U  | 1  | D  | size  | Vn    | Vd    | 0  0  1 | 0  | N | 0 | M | 0 | Vm  |
|             |          |    |    |    |       |       |       |         | op |   |   |   |   |     |
A1 variant
VSUBL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vsubw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1
| 15 14 13 | 12 | 11 10 9 8 | 7 | 6 | 5 4  | 3 0 | 15 12 | 11 10 9 | 8  | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  | U  | 1  1  1 1 | 1 | D | size | Vn  | Vd    | 0  0  1 | 0  | N | 0 | M | 0 | Vm  |
|          |    |           |   |   |      |     |       |         | op |   |   |   |   |     |
T1 variant
VSUBL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vsubw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the second operand vector, encoded in the "U:size" field. It can 
have the following values: 
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        if is_vsubw then
            op1 = Int(Elem[Qin[n>>1],e,2*esize], unsigned);
        else
            op1 = Int(Elem[Din[n],e,esize], unsigned);
        result = op1 - Int(Elem[Din[m],e,esize], unsigned);
        Elem[Q[d>>1],e,2*esize] = result<2*esize-1:0>;
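
The following Python sketch is an informal model of the VSUBL case (is_vsubw FALSE) of the pseudocode above. The names to_int and vsubl, and the list-of-lanes representation with index 0 as the least significant element, are assumptions made for illustration and are not defined by the architecture.

```python
def to_int(value, bits, unsigned):
    """Interpret a bits-wide field as an integer, like the pseudocode Int()."""
    value &= (1 << bits) - 1
    if not unsigned and (value >> (bits - 1)) & 1:
        value -= 1 << bits          # sign-extend
    return value


def vsubl(dn, dm, esize, unsigned):
    """Widening subtract: esize-bit lanes in, 2*esize-bit lanes out."""
    wide_mask = (1 << (2 * esize)) - 1
    return [
        (to_int(a, esize, unsigned) - to_int(b, esize, unsigned)) & wide_mask
        for a, b in zip(dn, dm)
    ]


print([hex(x) for x in vsubl([0x01, 0x80], [0x02, 0x01], 8, unsigned=False)])
```

With the same inputs, the signed and unsigned forms differ only in lanes where an operand has its top bit set, because that is the only case in which sign-extension and zero-extension give different values.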








F6.1.226 VSUBW 
Vector Subtract Wide subtracts the elements of a doubleword vector from the corresponding elements of a quadword 
vector, and places the results in another quadword vector. Before subtracting, it sign-extends or zero-extends the 
elements of the doubleword operand. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1
| 31 30 29 28 | 27 26 25 | 24 | 23 | 22 | 21 20 | 19 16 | 15 12 | 11 10 9 | 8  | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  | U  | 1  | D  | size  | Vn    | Vd    | 0  0  1 | 1  | N | 0 | M | 0 | Vm  |
|             |          |    |    |    |       |       |       |         | op |   |   |   |   |     |
A1 variant
VSUBW{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vsubw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
T1
| 15 14 13 | 12 | 11 10 9 8 | 7 | 6 | 5 4  | 3 0 | 15 12 | 11 10 9 | 8  | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  | U  | 1  1  1 1 | 1 | D | size | Vn  | Vd    | 0  0  1 | 1  | N | 0 | M | 0 | Vm  |
|          |    |           |   |   |      |     |       |         | op |   |   |   |   |     |
T1 variant
VSUBW{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Dm>
Decode for this encoding
if size == '11' then SEE "Related encodings";
if Vd<0> == '1' || (op == '1' && Vn<0> == '1') then UNDEFINED;
unsigned = (U == '1');
esize = 8 << UInt(size); elements = 64 DIV esize; is_vsubw = (op == '1');
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
Notes for all encodings 
Related encodings: See Advanced SIMD data-processing on page F3-2454 for the T32 instruction set, or Advanced 
SIMD data-processing on page F4-2541 for the A32 instruction set. 
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the second operand vector, encoded in the "U:size" field. It can
have the following values:
S8 when U = 0, size = 00
S16 when U = 0, size = 01
S32 when U = 0, size = 10
U8 when U = 1, size = 00
U16 when U = 1, size = 01
U32 when U = 1, size = 10
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for e = 0 to elements-1
        if is_vsubw then
            op1 = Int(Elem[Qin[n>>1],e,2*esize], unsigned);
        else
            op1 = Int(Elem[Din[n],e,esize], unsigned);
        result = op1 - Int(Elem[Din[m],e,esize], unsigned);
        Elem[Q[d>>1],e,2*esize] = result<2*esize-1:0>;
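
As a hedged illustration of the is_vsubw TRUE path, the Python sketch below models VSUBW on lists of lanes. The names extend and vsubw are hypothetical, index 0 is taken as the least significant element, and the sketch relies on the observation that truncating the result to 2*esize bits makes the signedness of the wide operand irrelevant.

```python
def extend(value, bits, unsigned):
    """Zero-extend or sign-extend a bits-wide field to a Python integer."""
    value &= (1 << bits) - 1
    if not unsigned and (value >> (bits - 1)) & 1:
        value -= 1 << bits
    return value


def vsubw(qn, dm, esize, unsigned):
    """Wide subtract: 2*esize-bit lanes minus extended esize-bit lanes."""
    wide_mask = (1 << (2 * esize)) - 1
    # Only the narrow operand needs extending; the wide operand is already
    # 2*esize bits and the result is truncated back to that width.
    return [(a - extend(b, esize, unsigned)) & wide_mask for a, b in zip(qn, dm)]


print([hex(x) for x in vsubw([0x0100, 0x0000], [0xFF, 0x01], 8, unsigned=False)])
```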





F6.1.227 VSWP 
Vector Swap exchanges the contents of two vectors. The vectors can be either doubleword or quadword. There is 
no distinction between data types. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1
| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 18 | 17 16 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  1  | 1  | D  | 1  1  | size  | 1  0  | Vd    | 0  0  0 0 0 | Q | M | 0 | Vm  |
64-bit SIMD vector variant
Applies when Q == 0.
VSWP{<c>}{<q>}{.<dt>} <Dd>, <Dm>
128-bit SIMD vector variant
Applies when Q == 1.
VSWP{<c>}{<q>}{.<dt>} <Qd>, <Qm>
Decode for all variants of this encoding
if size != '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
T1
| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4 | 3 2  | 1 0 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 1  1  1 1 | 1 | D | 1 1 | size | 1 0 | Vd    | 0  0  0 0 0 | Q | M | 0 | Vm  |
64-bit SIMD vector variant
Applies when Q == 0.
VSWP{<c>}{<q>}{.<dt>} <Dd>, <Dm>
128-bit SIMD vector variant
Applies when Q == 1.
VSWP{<c>}{<q>}{.<dt>} <Qd>, <Qm>
Decode for all variants of this encoding
if size != '00' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 
For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> An optional data type. It is ignored by assemblers, and does not affect the encoding. 

<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        if d == m then
            D[d+r] = bits(64) UNKNOWN;
        else
            D[d+r] = Din[m+r];
            D[m+r] = Din[d+r];
F6.1.228 VTBL, VTBX
Vector Table Lookup uses byte indexes in a control vector to look up byte values in a table and generate a new 
vector. Indexes out of range return 0. 
Vector Table Extension works in the same way, except that indexes out of range leave the destination element 
unchanged. 
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 
A1
| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 16 | 15 12 | 11 10 | 9 8 | 7 | 6  | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  1  | 1  | D  | 1  1  | Vn    | Vd    | 1  0  | len | N | op | M | 0 | Vm  |
VTBL variant
Applies when op == 0.
VTBL{<c>}{<q>}.8 <Dd>, <list>, <Dm>
VTBX variant
Applies when op == 1.
VTBX{<c>}{<q>}.8 <Dd>, <list>, <Dm>
Decode for all variants of this encoding
is_vtbl = (op == '0'); length = UInt(len)+1;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if n+length > 32 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If n + length > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. This behavior does not affect any
general-purpose registers.
T1
| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4 | 3 0 | 15 12 | 11 10 | 9 8 | 7 | 6  | 5 | 4 | 3 0 |
| 1  1  1  1  | 1  1  1 1 | 1 | D | 1 1 | Vn  | Vd    | 1  0  | len | N | op | M | 0 | Vm  |
VTBL variant
Applies when op == 0.
VTBL{<c>}{<q>}.8 <Dd>, <list>, <Dm>
VTBX variant
Applies when op == 1.
VTBX{<c>}{<q>}.8 <Dd>, <list>, <Dm>
Decode for all variants of this encoding
is_vtbl = (op == '0'); length = UInt(len)+1;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm);
if n+length > 32 then UNPREDICTABLE;
CONSTRAINED UNPREDICTABLE behavior
If n + length > 32, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as NOP.
• One or more of the SIMD and floating-point registers are UNKNOWN. This behavior does not affect any
general-purpose registers.


Notes for all encodings 


For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Appendix K1 
Architectural Constraints on UNPREDICTABLE behaviors. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<list> The vectors containing the table. It must be one of: 


{<Dn>} Encoded as len = 0b00.
{<Dn>, <Dn+1>} Encoded as len = 0b01.
{<Dn>, <Dn+1>, <Dn+2>} Encoded as len = 0b10.
{<Dn>, <Dn+1>, <Dn+2>, <Dn+3>} Encoded as len = 0b11.


<Dm> Is the 64-bit name of the SIMD&FP source register holding the indices, encoded in the "M:Vm" 
field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();

    // Create 256-bit = 32-byte table variable, with zeros in entries that will not be used.
    table3 = if length == 4 then D[n+3] else Zeros(64);
    table2 = if length >= 3 then D[n+2] else Zeros(64);
    table1 = if length >= 2 then D[n+1] else Zeros(64);
    table = table3 : table2 : table1 : D[n];

    for i = 0 to 7
        index = UInt(Elem[D[m],i,8]);
        if index < 8*length then
            Elem[D[d],i,8] = Elem[table,index,8];
        else
            if is_vtbl then
                Elem[D[d],i,8] = Zeros(8);
            // else Elem[D[d],i,8] unchanged
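
The difference between the two variants for out-of-range indexes can be illustrated with a small Python model. This is an informal sketch: the function name table_lookup, its dest parameter (None selects VTBL behavior, a list selects VTBX behavior), and the representation of each doubleword register as a list of eight bytes with index 0 as the least significant byte are all assumptions made for the example.

```python
def table_lookup(table_regs, indices, dest=None):
    """Model VTBL (dest is None) or VTBX (dest is the old destination bytes).

    table_regs: list of 1 to 4 registers, each a list of 8 byte values,
                ordered D[n], D[n+1], ...
    indices:    list of 8 byte indexes taken from the control vector D[m].
    """
    table = [byte for reg in table_regs for byte in reg]  # D[n] occupies the lowest bytes
    out = []
    for i, index in enumerate(indices):
        if index < len(table):          # same test as index < 8*length
            out.append(table[index])
        elif dest is None:
            out.append(0)               # VTBL: out-of-range index returns 0
        else:
            out.append(dest[i])         # VTBX: destination byte unchanged
    return out


regs = [[10, 11, 12, 13, 14, 15, 16, 17]]
control = [0, 7, 8, 255, 1, 2, 3, 4]
print(table_lookup(regs, control))
print(table_lookup(regs, control, dest=[99] * 8))
```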
F6.1.229 VTRN


Vector Transpose treats the elements of its operand vectors as elements of 2 x 2 matrices, and transposes the 
matrices. 


The elements of the vectors can be 8-bit, 16-bit, or 32-bit. There is no distinction between data types. 


The following figure shows an example of the operation of VTRN doubleword operations. 


[Figure: VTRN.32, VTRN.16, and VTRN.8 operations on doubleword registers Dd and Dm.]

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For
more information see Enabling Advanced SIMD and floating-point support on page G1-3881.


This instruction is used by the pseudo-instructions VUZP (alias) and VZIP (alias). The pseudo-instruction is never 
the preferred disassembly. 


A1 


| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 18 | 17 16 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  1  | 1  | D  | 1  1  | size  | 1  0  | Vd    | 0  0  0 0 1 | Q | M | 0 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VTRN{<c>}{<q>}.<dt> <Dd>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VTRN{<c>}{<q>}.<dt> <Qd>, <Qm>

Decode for all variants of this encoding
if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4 | 3 2  | 1 0 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 1  1  1 1 | 1 | D | 1 1 | size | 1 0 | Vd    | 0  0  0 0 1 | Q | M | 0 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VTRN{<c>}{<q>}.<dt> <Dd>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VTRN{<c>}{<q>}.<dt> <Qd>, <Qm>

Decode for all variants of this encoding
if size == '11' then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Alias conditions

Pseudo-instruction    is preferred when
VUZP (alias)          Never
VZIP (alias)          Never
Assembler symbols 
<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.
For encoding T1: see Standard assembler syntax fields on page F2-2406.


<q> See Standard assembler syntax fields on page F2-2406. 
<dt> Is the data type for the elements of the vectors, encoded in the "size" field. It can have the following 
values: 
8 when size = 00 
16 when size = 01
32 when size = 10 


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    h = elements DIV 2;

    for r = 0 to regs-1
        if d == m then
            D[d+r] = bits(64) UNKNOWN;
        else
            for e = 0 to h-1
                Elem[D[d+r],2*e+1,esize] = Elem[Din[m+r],2*e,esize];
                Elem[D[m+r],2*e,esize] = Elem[Din[d+r],2*e+1,esize];
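
To make the 2 x 2 transposition concrete, the pseudocode for one doubleword register pair can be modeled in Python as below. The function name vtrn and the list-of-elements representation (index 0 is the least significant element) are assumptions for this sketch, and the d == m case, where the result is UNKNOWN, is deliberately not modeled.

```python
def vtrn(dd, dm):
    """Transpose pairs of elements between two vectors of equal length.

    Returns the new (Dd, Dm) contents. Odd elements of Dd are replaced by the
    even elements of Dm, and even elements of Dm by the odd elements of Dd.
    """
    new_d, new_m = list(dd), list(dm)
    for e in range(len(dd) // 2):
        new_d[2 * e + 1] = dm[2 * e]
        new_m[2 * e] = dd[2 * e + 1]
    return new_d, new_m


print(vtrn([0, 1, 2, 3], [10, 11, 12, 13]))
```

Reading from the original values (the role of Din in the pseudocode) matters here: both assignments in the loop use the register contents from before the instruction, which the sketch preserves by copying the inputs first.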





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. F6-3769 
1ID092916 Non-Confidential 




F6.1.230 VTST 


Vector Test Bits takes each element in a vector, and bitwise ANDs it with the corresponding element of a second 
vector. If the result is not zero, the corresponding element in the destination vector is set to all ones. Otherwise, it is 
set to all zeros. 


The operand vector elements can be any one of: 
• 8-bit, 16-bit, or 32-bit fields.
The result vector elements are fields the same size as the operand vector elements. 


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 16 | 15 12 | 11 10 9 8 | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  0  | 0  | D  | size  | Vn    | Vd    | 1  0  0 0 | N | Q | M | 1 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VTST{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VTST{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm>

Decode for all variants of this encoding
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


T1 


| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4  | 3 0 | 15 12 | 11 10 9 8 | 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  0  | 1  1  1 1 | 0 | D | size | Vn  | Vd    | 1  0  0 0 | N | Q | M | 1 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VTST{<c>}{<q>}.<dt> {<Dd>,} <Dn>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VTST{<c>}{<q>}.<dt> {<Qd>,} <Qn>, <Qm>

Decode for all variants of this encoding
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
if size == '11' then UNDEFINED;
esize = 8 << UInt(size); elements = 64 DIV esize;
d = UInt(D:Vd); n = UInt(N:Vn); m = UInt(M:Vm); regs = if Q == '0' then 1 else 2;


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 


<dt> Is the data type for the elements of the operands, encoded in the "size" field. It can have the 
following values: 


8 when size = 00 
16 when size = 01 
32 when size = 10 
<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qn> Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. 
<Qm> Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as 
<Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 
<Dn> Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. 
<Dm> Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    for r = 0 to regs-1
        for e = 0 to elements-1
            if !IsZero(Elem[D[n+r],e,esize] AND Elem[D[m+r],e,esize]) then
                Elem[D[d+r],e,esize] = Ones(esize);
            else
                Elem[D[d+r],e,esize] = Zeros(esize);
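
A minimal Python sketch of the per-element test is shown below, purely as an illustration. The function name vtst and the list-of-lanes representation are assumptions of this example rather than anything defined by the architecture.

```python
def vtst(dn, dm, esize):
    """Return all ones in each lane where (a AND b) is nonzero, else zero."""
    ones = (1 << esize) - 1
    return [ones if (a & b) != 0 else 0 for a, b in zip(dn, dm)]


print([hex(x) for x in vtst([0x0F, 0xF0, 0x00], [0x01, 0x0F, 0xFF], 8)])
```

The all-ones or all-zeros result can then be used directly as a lane mask by later bitwise instructions.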
F6.1.231 VUZP


Vector Unzip de-interleaves the elements of two vectors. 
The elements of the vectors can be 8-bit, 16-bit, or 32-bit. There is no distinction between data types. 


The following figure shows an example of the operation of VUZP doubleword operation for data type 8.

VUZP.8, doubleword
      Register state before operation    Register state after operation
Dd    A7 A6 A5 A4 A3 A2 A1 A0            B6 B4 B2 B0 A6 A4 A2 A0
Dm    B7 B6 B5 B4 B3 B2 B1 B0            B7 B5 B3 B1 A7 A5 A3 A1

The following figure shows an example of the operation of VUZP quadword operation for data type 32.

VUZP.32, quadword
      Register state before operation    Register state after operation
Qd    A3 A2 A1 A0                        B2 B0 A2 A0
Qm    B3 B2 B1 B0                        B3 B1 A3 A1

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 18 | 17 16 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  1  | 1  | D  | 1  1  | size  | 1  0  | Vd    | 0  0  0 1 0 | Q | M | 0 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VUZP{<c>}{<q>}.<dt> <Dd>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VUZP{<c>}{<q>}.<dt> <Qd>, <Qm>

Decode for all variants of this encoding
if size == '11' || (Q == '0' && size == '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
quadword_operation = (Q == '1'); esize = 8 << UInt(size);
d = UInt(D:Vd); m = UInt(M:Vm);


T1 


| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4 | 3 2  | 1 0 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 1  1  1 1 | 1 | D | 1 1 | size | 1 0 | Vd    | 0  0  0 1 0 | Q | M | 0 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VUZP{<c>}{<q>}.<dt> <Dd>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VUZP{<c>}{<q>}.<dt> <Qd>, <Qm>

Decode for all variants of this encoding
if size == '11' || (Q == '0' && size == '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
quadword_operation = (Q == '1'); esize = 8 << UInt(size);
d = UInt(D:Vd); m = UInt(M:Vm);


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> For the 64-bit SIMD vector variant: is the data type for the elements of the vectors, encoded in the 
"size" field. It can have the following values: 
8 when size = 00 
16 when size = 01


The encoding size = 1x is reserved. 


For the 128-bit SIMD vector variant: is the data type for the elements of the vectors, encoded in the 
"size" field. It can have the following values: 


8 when size = 00 
16 when size = 01 
32 when size = 10 


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if quadword_operation then
        if d == m then
            Q[d>>1] = bits(128) UNKNOWN; Q[m>>1] = bits(128) UNKNOWN;
        else
            zipped_q = Q[m>>1]:Q[d>>1];
            for e = 0 to (128 DIV esize) - 1
                Elem[Q[d>>1],e,esize] = Elem[zipped_q,2*e,esize];
                Elem[Q[m>>1],e,esize] = Elem[zipped_q,2*e+1,esize];
    else
        if d == m then
            D[d] = bits(64) UNKNOWN; D[m] = bits(64) UNKNOWN;
        else
            zipped_d = D[m]:D[d];
            for e = 0 to (64 DIV esize) - 1
                Elem[D[d],e,esize] = Elem[zipped_d,2*e,esize];
                Elem[D[m],e,esize] = Elem[zipped_d,2*e+1,esize];
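
The de-interleave can be sketched in a few lines of Python. This is an informal model under stated assumptions: the function name vuzp is hypothetical, vectors are lists with index 0 as the least significant element, and the d == m case (UNKNOWN results) is not modeled.

```python
def vuzp(vd, vm):
    """De-interleave two vectors; returns the new (destination, source) contents."""
    zipped = list(vd) + list(vm)   # models Q[m]:Q[d], with vd in the low half
    evens = zipped[0::2]           # even-numbered elements go to the first register
    odds = zipped[1::2]            # odd-numbered elements go to the second register
    return evens, odds


print(vuzp(["A0", "A1", "A2", "A3"], ["B0", "B1", "B2", "B3"]))
```

The printed lists are in least-significant-first order, so they read in the opposite direction to a register diagram that places the most significant element on the left.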
F6.1.232 VUZP (alias)

Vector Unzip de-interleaves the elements of two vectors.

This instruction is a pseudo-instruction of the VTRN instruction. This means that:

• The encodings in this description are named to match the encodings of VTRN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VTRN gives the operational pseudocode for this instruction.


A1 


| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 18 | 17 16 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  1  | 1  | D  | 1  1  | size  | 1  0  | Vd    | 0  0  0 0 1 | 0 | M | 0 | Vm  |
|             |             |    |    |       |       |       |       |             | Q |   |   |     |

64-bit SIMD vector variant
VUZP{<c>}{<q>}.32 <Dd>, <Dm>
is equivalent to
VTRN{<c>}{<q>}.32 <Dd>, <Dm>
and is never the preferred disassembly.


T1 


| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4 | 3 2  | 1 0 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 1  1  1 1 | 1 | D | 1 1 | size | 1 0 | Vd    | 0  0  0 0 1 | 0 | M | 0 | Vm  |
|             |           |   |   |     |      |     |       |             | Q |   |   |     |

64-bit SIMD vector variant
VUZP{<c>}{<q>}.32 <Dd>, <Dm>
is equivalent to
VTRN{<c>}{<q>}.32 <Dd>, <Dm>
and is never the preferred disassembly.


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


The description of VTRN gives the operational pseudocode for this instruction. 
F6.1.233 VZIP
Vector Zip interleaves the elements of two vectors. 
The elements of the vectors can be 8-bit, 16-bit, or 32-bit. There is no distinction between data types. 


The following figure shows an example of the operation of VZIP doubleword operation for data type 8.

VZIP.8, doubleword
      Register state before operation    Register state after operation
Dd    A7 A6 A5 A4 A3 A2 A1 A0            B3 A3 B2 A2 B1 A1 B0 A0
Dm    B7 B6 B5 B4 B3 B2 B1 B0            B7 A7 B6 A6 B5 A5 B4 A4

The following figure shows an example of the operation of VZIP quadword operation for data type 32.

VZIP.32, quadword
      Register state before operation    Register state after operation
Qd    A3 A2 A1 A0                        B1 A1 B0 A0
Qm    B3 B2 B1 B0                        B3 A3 B2 A2


Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the 
instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For 
more information see Enabling Advanced SIMD and floating-point support on page G1-3881. 


A1 


| 31 30 29 28 | 27 26 25 24 | 23 | 22 | 21 20 | 19 18 | 17 16 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 0  0  1  1  | 1  | D  | 1  1  | size  | 1  0  | Vd    | 0  0  0 1 1 | Q | M | 0 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VZIP{<c>}{<q>}.<dt> <Dd>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VZIP{<c>}{<q>}.<dt> <Qd>, <Qm>

Decode for all variants of this encoding
if size == '11' || (Q == '0' && size == '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
quadword_operation = (Q == '1'); esize = 8 << UInt(size);
d = UInt(D:Vd); m = UInt(M:Vm);


T1 


| 15 14 13 12 | 11 10 9 8 | 7 | 6 | 5 4 | 3 2  | 1 0 | 15 12 | 11 10 9 8 7 | 6 | 5 | 4 | 3 0 |
| 1  1  1  1  | 1  1  1 1 | 1 | D | 1 1 | size | 1 0 | Vd    | 0  0  0 1 1 | Q | M | 0 | Vm  |

64-bit SIMD vector variant
Applies when Q == 0.
VZIP{<c>}{<q>}.<dt> <Dd>, <Dm>

128-bit SIMD vector variant
Applies when Q == 1.
VZIP{<c>}{<q>}.<dt> <Qd>, <Qm>

Decode for all variants of this encoding
if size == '11' || (Q == '0' && size == '10') then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vm<0> == '1') then UNDEFINED;
quadword_operation = (Q == '1'); esize = 8 << UInt(size);
d = UInt(D:Vd); m = UInt(M:Vm);


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be
unconditional.


For encoding T1: see Standard assembler syntax fields on page F2-2406. 


<q> See Standard assembler syntax fields on page F2-2406. 

<dt> For the 64-bit SIMD vector variant: is the data type for the elements of the vectors, encoded in the 
"size" field. It can have the following values: 
8 when size = 00 
16 when size = 01


The encoding size = 1x is reserved. 


For the 128-bit SIMD vector variant: is the data type for the elements of the vectors, encoded in the
"size" field. It can have the following values:


8 when size = 00 
16 when size = 01 
32 when size = 10 


The encoding size = 11 is reserved. 


<Qd> Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. 
<Qm> Is the 128-bit name of the SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 

<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled();
    if quadword_operation then
        if d == m then
            Q[d>>1] = bits(128) UNKNOWN; Q[m>>1] = bits(128) UNKNOWN;
        else
            bits(256) zipped_q;
            for e = 0 to (128 DIV esize) - 1
                Elem[zipped_q,2*e,esize] = Elem[Q[d>>1],e,esize];
                Elem[zipped_q,2*e+1,esize] = Elem[Q[m>>1],e,esize];
            Q[d>>1] = zipped_q<127:0>; Q[m>>1] = zipped_q<255:128>;
    else
        if d == m then
            D[d] = bits(64) UNKNOWN; D[m] = bits(64) UNKNOWN;
        else
            bits(128) zipped_d;
            for e = 0 to (64 DIV esize) - 1
                Elem[zipped_d,2*e,esize] = Elem[D[d],e,esize];
                Elem[zipped_d,2*e+1,esize] = Elem[D[m],e,esize];
            D[d] = zipped_d<63:0>; D[m] = zipped_d<127:64>;
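
The interleaving performed by the operation pseudocode can be illustrated with a short Python sketch. It is an informal model under stated assumptions: each register is represented as a list of elements with index 0 as the least significant element, the function name `vzip` is hypothetical, and the `d == m` case, where the results are UNKNOWN, is not modelled.

```python
def vzip(d_elems, m_elems):
    """Return the new (destination, source) element lists after a zip."""
    if len(d_elems) != len(m_elems):
        raise ValueError("both registers must hold the same number of elements")
    zipped = []
    for d_e, m_e in zip(d_elems, m_elems):
        zipped.append(d_e)   # corresponds to Elem[zipped, 2*e]
        zipped.append(m_e)   # corresponds to Elem[zipped, 2*e+1]
    n = len(d_elems)
    # The low half of the double-width result goes to d, the high half to m.
    return zipped[:n], zipped[n:]
```

With four elements per register, `vzip([0, 1, 2, 3], [4, 5, 6, 7])` gives `[0, 4, 1, 5]` for the first register and `[2, 6, 3, 7]` for the second.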
F6.1.234 VZIP (alias)

Vector Zip interleaves the elements of two vectors.

This instruction is a pseudo-instruction of the VTRN instruction. This means that: 

• The encodings in this description are named to match the encodings of VTRN.
• The assembler syntax is used only for assembly, and is not used on disassembly.
• The description of VTRN gives the operational pseudocode for this instruction.


A1 


[Encoding diagram not reproduced. The A1 encoding is a 32-bit word, bits 31 to 0, that includes the D, size, Vd,
Q, M, and Vm fields. See the matching encoding of VTRN.]


64-bit SIMD vector variant 
VZIP{<c>}{<q>}.32 <Dd>, <Dm> 
is equivalent to 

VTRN{<c>}{<q>}.32 <Dd>, <Dm>


and is never the preferred disassembly. 


T1 


[Encoding diagram not reproduced. The T1 encoding is two 16-bit halfwords, each numbered from bit 15 to bit 0, that
include the D, size, Vd, Q, M, and Vm fields. See the matching encoding of VTRN.]


64-bit SIMD vector variant 
VZIP{<c>}{<q>}.32 <Dd>, <Dm> 
is equivalent to 

VTRN{<c>}{<q>}.32 <Dd>, <Dm>


and is never the preferred disassembly. 


Assembler symbols 


<c> For encoding A1: see Standard assembler syntax fields on page F2-2406. This encoding must be 
unconditional. 


For encoding T1: see Standard assembler syntax fields on page F2-2406. 
<q> See Standard assembler syntax fields on page F2-2406. 
<Dd> Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. 


<Dm> Is the 64-bit name of the SIMD&FP source register, encoded in the "M:Vm" field. 


Operation for all encodings 


The description of VTRN gives the operational pseudocode for this instruction. 
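
The following Python sketch illustrates why the 64-bit `.32` form can be an alias of VTRN: with only two elements per register, zipping and transposing give the same result. The `vtrn` model below is an assumption based on the usual description of Vector Transpose, because the VTRN pseudocode is not part of this section. Both function names are hypothetical, and registers are modelled as element lists with index 0 as the least significant element.

```python
def vtrn(d_elems, m_elems):
    """Assumed model of Vector Transpose: swap d[2h+1] with m[2h] for each pair."""
    d, m = list(d_elems), list(m_elems)
    for h in range(len(d) // 2):
        d[2 * h + 1], m[2 * h] = m[2 * h], d[2 * h + 1]
    return d, m


def vzip(d_elems, m_elems):
    """Interleave the two registers, then split the result into low and high halves."""
    zipped = [x for pair in zip(d_elems, m_elems) for x in pair]
    n = len(d_elems)
    return zipped[:n], zipped[n:]


# Two 32-bit elements per 64-bit register: the two operations coincide.
two = ([10, 11], [20, 21])
same_for_two_elements = vtrn(*two) == vzip(*two)

# Four elements per register: the results differ.
four = ([0, 1, 2, 3], [4, 5, 6, 7])
same_for_four_elements = vtrn(*four) == vzip(*four)
```

Under this model, `same_for_two_elements` is true and `same_for_four_elements` is false, which is consistent with the alias being defined only for the `.32` data type on 64-bit vectors.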
Chapter G1
The AArch32 System Level Programmers’ Model


This chapter gives a system level description of the programmers’ model for execution in AArch32 state. It contains 
the following sections: 


• About the AArch32 System level programmers’ model on page G1-3782.
• Exception levels on page G1-3783.
• Exception terminology on page G1-3784.
• Execution state on page G1-3786.
• Instruction Set state on page G1-3788.
• Security state on page G1-3789.
• Security state, Exception levels, and AArch32 execution privilege on page G1-3792.
• Virtualization on page G1-3794.
• AArch32 PE modes, and general-purpose and Special-purpose registers on page G1-3796.
• Process state, PSTATE on page G1-3805.
• Instruction set states on page G1-3810.
• Handling exceptions that are taken to an Exception level using AArch32 on page G1-3812.
• Exception return to an Exception level using AArch32 on page G1-3834.
• Asynchronous exception behavior for exceptions taken from AArch32 state on page G1-3839.
• AArch32 state exception descriptions on page G1-3849.
• Reset into AArch32 state on page G1-3868.
• Mechanisms for entering a low-power state on page G1-3872.
• The AArch32 System register interface on page G1-3877.
• Advanced SIMD and floating-point support on page G1-3880.
• Configurable instruction enables and disables, and trap controls on page G1-3885.
G1.1 About the AArch32 System level programmers’ model


An application programmer has only a restricted view of the system. The System level programmers’ model 
supports this application level view of the system, and includes features that are required for one or both of an 
operating system (OS) and a hypervisor to provide the programming environment seen by an application. This 
chapter describes the System level programmers’ model when executing at EL1 or higher in an Exception level that 
is using AArch32. 


The system level programmers’ model includes all of the system features required to support operating systems and 
to handle hardware events. 


The following sections give a system level introduction to the basic concepts of the ARM architecture AArch32 
state, and the terminology that is used for describing the architecture when executing in this state: 


• Exception levels on page G1-3783.
• Exception terminology on page G1-3784.
• Execution state on page G1-3786.
• Instruction Set state on page G1-3788.
• Security state on page G1-3789.
• Virtualization on page G1-3794.


The rest of this chapter describes the system level programmers’ model when executing in AArch32 state. 


The other chapters in this part describe: 


• The memory system architecture, as seen when executing in an Exception level that is using AArch32:
    — Chapter G3 The AArch32 System Level Memory Model describes the general features of the ARMv8
      memory model, when executing in AArch32 state, that are not visible at the application level.

      Note
      Chapter E2 The AArch32 Application Level Memory Model describes the application level view of the
      memory model.

    — Chapter G4 The AArch32 Virtual Memory System Architecture describes the Virtual Memory System
      Architecture (VMSA) used in AArch32 state.
• The AArch32 System Registers, see Chapter G6 AArch32 System Register Descriptions.


Note 


The T32 and A32 instruction sets include instructions that provide system level functionality, such as returning from 
an exception. See for example, ERET on page F5-2673. 
G1.2 Exception levels


The ARMv8-A architecture defines a set of Exception levels, EL0 to EL3, where:

• If ELn is the Exception level, increased values of n indicate increased software execution privilege.
• Execution at EL0 is called unprivileged execution.
• EL2 provides support for virtualization of Non-secure operation.
• EL3 provides support for switching between two Security states, Secure state and Non-secure state.

An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1.
EL2 and EL3 are optional.


Note 


A PE is not required to implement a contiguous set of Exception levels. For example, it is permissible for an
implementation to include only EL0, EL1, and EL3.
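
The constraints on which Exception levels an implementation includes can be captured in a few lines of Python. This is only a sketch of the rules stated above: the function name is hypothetical, and Exception levels are represented by the integers 0 to 3.

```python
def is_valid_el_set(implemented) -> bool:
    """Check a set of implemented Exception levels against the rules above."""
    levels = set(implemented)
    # Only EL0 to EL3 exist; EL0 and EL1 are mandatory; EL2 and EL3 are optional.
    return levels <= {0, 1, 2, 3} and {0, 1} <= levels
```

For example, `is_valid_el_set({0, 1, 3})` is true, matching the non-contiguous case in the Note, while `is_valid_el_set({1, 2, 3})` is false because EL0 is missing.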








The effect of implementation choices on the programmers’ model on page D1-1619 provides information on 
implementations. 


When executing in AArch32 state, execution can move between Exception levels only on taking an exception or on
returning from an exception:

• On taking an exception, the Exception level can only increase or remain the same.
• On returning from an exception, the Exception level can only decrease or remain the same.

The Exception level that execution changes to or remains in on taking an exception is called the target Exception
level of the exception.

Each exception type has a target Exception level that is either:

• Implicit in the nature of the exception.
• Defined by configuration bits in the System registers.

An exception cannot target EL0.
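
The rules for moving between Exception levels can be expressed as two small predicates. The sketch below is illustrative only: the function names are hypothetical, Exception levels are the integers 0 to 3, and whether a given level is implemented is not considered.

```python
def valid_exception_entry(current_el: int, target_el: int) -> bool:
    """Taking an exception: the level rises or stays the same, and never targets EL0."""
    return target_el != 0 and current_el <= target_el <= 3


def valid_exception_return(current_el: int, target_el: int) -> bool:
    """Returning from an exception: the level falls or stays the same."""
    return 0 <= target_el <= current_el
```

An exception taken from EL0 therefore always raises the Exception level, because the target must be EL1 or higher.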


Exception levels exist within Security states. The ARMv8-A security model on page G1-3789 describes this. When
executing at an Exception level, the PE can access both of the following:

• The resources that are available for the combination of the current Exception level and the current Security
  state.
• The resources that are available at all lower Exception levels, provided that those resources are available to
  the current Security state.


This means that if the implementation includes EL3, then because EL3 is only implemented in Secure state, 
execution at EL3 can access all resources available at all Exception levels, for both Security states. 


Each Exception level other than EL0 has its own translation regime and associated control registers. For information
on the translation regimes, see Chapter G4 The AArch32 Virtual Memory System Architecture. 
G1.2.1 Typical Exception level usage model 


The architecture does not specify what software uses which Exception level. Such choices are outside the scope of 
the architecture. However, the following is a common usage model for the Exception levels: 





EL0 Applications.
EL1 OS kernel and associated functions that are typically described as privileged. 
EL2 Hypervisor. 
EL3 Secure monitor. 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G1-3783 


ID092916 Non-Confidential 




G1.3 Exception terminology


The following subsections define the terms that are used when describing exceptions: 


• Terminology for taking an exception.
• Terminology for returning from an exception.
• Exception levels.
• Definition of a precise exception.
• Definitions of synchronous and asynchronous exceptions on page G1-3785.


G1.3.1 Terminology for taking an exception


An exception is generated when the PE first responds to an exceptional condition. The PE state at this time is the 
state that the exception is taken from. The PE state immediately after taking the exception is the state that the 
exception is taken to. 


G1.3.2 Terminology for returning from an exception


To return from an exception, the PE must execute an exception return instruction. The PE state when an exception 
return instruction is committed for execution is the state the exception returns from. The PE state immediately after 
the execution of that instruction is the state that the exception returns to. 


G1.3.3 Exception levels


An Exception level, ELn, with a larger value of n than another Exception level, is described as being a higher 
Exception level than the other Exception level. For example, EL3 is a higher Exception level than EL1. 


An Exception level with a smaller value of n than another Exception level is described as being a lower Exception
level than the other Exception level. For example, EL0 is a lower Exception level than EL1.


An Exception level is described as:

• Using AArch64 when execution in that Exception level is in the AArch64 Execution state.
• Using AArch32 when execution in that Exception level is in the AArch32 Execution state.


G1.3.4 Definition of a precise exception


An exception is described as precise when the exception handler receives the PE state and memory system state that 
is consistent with the PE having executed all of the instructions up to but not including the point in the instruction 
stream where the exception was taken, and none afterwards. 


Other than the SError interrupt, all exceptions that are taken to AArch32 state are required to be precise. For each
occurrence of an SError interrupt, whether the interrupt is precise or imprecise is IMPLEMENTATION DEFINED. 


Where a synchronous exception that is taken to AArch32 state is generated as part of an instruction that performs 
more than one single-copy atomic memory access, the definition of precise permits that the values in registers or 
memory affected by those instructions can be UNKNOWN, provided that: 


• The accesses affecting those registers or memory locations do not, themselves, generate exceptions.
• The registers are not involved in the calculation of the memory address that is used by the instruction.


In AArch32 state, examples of instructions that perform more than one single-copy atomic memory access are the 
LDM and STM instructions. 


Note

• For the definition of a single-copy atomic access, see Properties of single-copy atomic accesses on
  page E2-2329.
• The SError interrupt replaces the ARMv7 asynchronous abort.
G1.3.5 Definitions of synchronous and asynchronous exceptions


An exception is described as synchronous if all of the following apply: 


• The exception is generated as a result of direct execution or attempted execution of an instruction.
• The return address presented to the exception handler is guaranteed to indicate the instruction that caused the
  exception.
• The exception is precise.

An exception is described as asynchronous if any of the following apply:

• The exception is not generated as a result of direct execution or attempted execution of the instruction stream.
• The return address presented to the exception handler is not guaranteed to indicate the instruction that caused
  the exception.
• The exception is imprecise.
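
The two definitions are complementary: an exception is synchronous only when all three conditions hold, and asynchronous as soon as any one of them fails. A minimal Python sketch, with hypothetical function and parameter names, makes this explicit.

```python
def classify_exception(caused_by_instruction_execution: bool,
                       return_address_identifies_instruction: bool,
                       precise: bool) -> str:
    """Classify an exception using the three conditions listed above."""
    if (caused_by_instruction_execution
            and return_address_identifies_instruction
            and precise):
        return "synchronous"
    # Failing any single condition is enough to make the exception asynchronous.
    return "asynchronous"
```

For instance, an exception that results from instruction execution but is imprecise is classified as asynchronous.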


For more information about exceptions, see Handling exceptions that are taken to an Exception level using AArch32 
on page G1-3812. 
G1.4 Execution state


The Execution states are: 
AArch64 The 64-bit Execution state. 
AArch32 The 32-bit Execution state. Operation in this state is compatible with ARMv7-A operation. 


Execution state on page A1-33 gives more information about them. 


Exception levels use Execution states. For example, EL0, EL1 and EL2 might all be using AArch32, under EL3
using AArch64.


This means that: 


• Different software layers, such as an application, an operating system kernel, and a hypervisor, executing at
  different Exception levels, can execute in different Execution states.
• The PE can change Execution states only either:
    — At reset.
    — On a change of Exception level.


Note

• Typical Exception level usage model on page G1-3783 shows which Exception levels different software
  layers might typically use.
• The effect of implementation choices on the programmers’ model on page D1-1619 gives information on
  supported configurations of Exception levels and Execution states.





The interaction between the AArch64 and AArch32 Execution states is called interprocessing. Interprocessing on 
page D1-1607 describes this. 


G1.4.1 About the AArch32 PE modes 


AArch32 state provides a set of PE modes that support normal software execution and handle exceptions. The 
current mode determines the set of registers that are available, as described in AArch32 general-purpose registers, 
the PC, and the Special-purpose registers on page G1-3801. 


The AArch32 modes are: 

• Monitor mode. This mode always executes at Secure EL3.
• Hyp mode. This mode always executes at Non-secure EL2.
• System, Supervisor, Abort, Undefined, IRQ and FIQ modes. The Exception level these modes execute at
  depends on the Security state, as described in Security state on page G1-3789.
• User mode. This mode always executes at EL0.


Note 


AArch64 state does not support modes. Modes are a concept that is specific to AArch32 state. Modes that execute 
at a particular Exception level are only implemented if that Exception level supports using AArch32. 








For more information on modes see AArch32 PE mode descriptions on page G1-3796. 


The mode in use immediately before an exception is taken is described as the mode the exception is taken from. The 
mode that is used on taking the exception is described as the mode the exception is taken to. 


All of the following define the mode that an exception is taken to:

• The type of exception.
• The mode the exception is taken from.
• Configuration settings defined at EL2 and EL3.
Monitor mode and Hyp mode can create system traps that cause exceptions to EL3 or EL2 respectively. There is an
architected hierarchy where EL2 and EL3 configuration settings affect a common condition, for example interrupt 
routing. When no traps are enabled for a particular condition, the AArch32 mode an exception is taken to is called 
the default mode for that exception. 


In AArch32 state, a number of different modes can exist at the same Exception level. All modes at a particular
Exception level have the same execution privilege, meaning they have the same access rights for accesses to memory and
to System registers. However, the mapping of PE modes to Exception levels depends on the Security state, as 
described in Security state on page G1-3789. Security state, Exception levels, and AArch32 execution privilege on 
page G1-3792 gives more information about the PE modes, their associated execution privilege, and how this maps 
onto the Exception levels. 
G1.5 Instruction Set state


In AArch32 state, the Instruction Set state determines the instruction set that the PE is executing. In an 
implementation that follows the ARM recommendations, the available Instruction Set states are: 


T32 state The PE is executing T32 instructions. 


A32 state The PE is executing A32 instructions. 





Note

In previous versions of the ARM architecture:

• The T32 instruction set was called the Thumb instruction set.
• The A32 instruction set was called the ARM instruction set.





For more information, see Process state, PSTATE on page E1-2294. 
G1.6 Security state


The ARMv8-A architecture provides two Security states, each with an associated physical memory address space,
as follows:

Secure state      When in this state, the PE can access both the Secure physical address space and the
                  Non-secure physical address space.

Non-secure state  When in this state, the PE:
                  • Can access only the Non-secure physical address space.
                  • Cannot access the Secure system control resources.

For information on how virtual addresses translate onto Secure physical and Non-secure physical addresses, see About
VMSAv8-32 on page G4-4022.


G1.6.1 The ARMv8-A security model 
The general principles of the ARMv8-A security model are: 


• If the implementation includes EL3 then it has two Security states, Secure and Non-secure, and:
    — EL3 exists only in Secure state.
    — A change from Non-secure state to Secure state can only occur on taking an exception to EL3.
    — A change from Secure state to Non-secure state can only occur on an exception return from EL3.
    — If EL2 is implemented, it exists only in Non-secure state.
• If the implementation does not include EL3 it has one Security state, that is:
    — IMPLEMENTATION DEFINED, if the implementation does not include EL2.
    — Non-secure state if the implementation includes EL2.
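
The relationship between the implemented Exception levels and the available Security states can be summarized in a short Python sketch. The function name and the string labels are hypothetical, and the IMPLEMENTATION DEFINED case is returned as a label because this section does not say which state such an implementation chooses.

```python
def security_states(has_el2: bool, has_el3: bool) -> set:
    """Return the Security states an implementation has, per the principles above."""
    if has_el3:
        # With EL3 there are always two Security states.
        return {"Secure", "Non-secure"}
    if has_el2:
        # EL2 exists only in Non-secure state, so that is the single state.
        return {"Non-secure"}
    # Neither EL2 nor EL3: a single state, chosen by the implementation.
    return {"IMPLEMENTATION DEFINED"}
```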


The AArch32 security model, and execution privilege 


The Exception level hierarchy of four Exception levels, EL0, EL1, EL2, and EL3, applies to execution in both
Execution states. This section describes the mapping between Exception levels, AArch32 modes, and execution 
privilege. 


The AArch32 modes Monitor, System, Supervisor, Abort, Undefined, IRQ, and FIQ all have the same execution 
privilege. 
In Secure state:

• Monitor mode executes only at EL3, and is accessible only when EL3 is using AArch32.
• System mode, Supervisor mode, Abort mode, Undefined mode, IRQ mode, and FIQ mode all:
    — Execute at EL1 when EL3 is using AArch64.
    — Execute at EL3 when EL3 is using AArch32.

This means that there is a difference in the Secure state hierarchy that the PE is using, depending on which Execution
state EL3 is using:

• If EL3 is using AArch64:
    — There is no support for Monitor mode.
    — If EL1 is using AArch32, System mode, Supervisor mode, Abort mode, Undefined mode, IRQ mode,
      and FIQ mode execute at Secure EL1.
• If EL3 is using AArch32:
    — Monitor mode is supported, and executes at Secure EL3.
    — System mode, Supervisor mode, Abort mode, Undefined mode, IRQ mode, and FIQ mode execute at
      Secure EL3.
    — There is no support for a Secure EL1 Exception level.
See Security behavior in Exception levels using AArch32 when EL3 is using AArch64 on page G1-3824 for more
information about operation in a Secure EL1 mode when EL3 is using AArch64. 


In Non-secure state, the PL1 modes System, Supervisor, Abort, Undefined, IRQ, and FIQ always execute at EL1.
User mode always executes at EL0 and has the lowest possible execution privilege.

Hyp mode always executes in Non-secure state at EL2 and has higher execution privilege than all of:

• User mode.
• System mode, Supervisor mode, Abort mode, Undefined mode, IRQ mode, and FIQ mode.

Limited use of Privilege level in ARMv8 AArch32 state on page G1-3793 describes how, in some contexts, the
concept of Privilege levels can be used to represent the execution privilege hierarchy. 


For more information about the modes see About the AArch32 PE modes on page G1-3786. 


Figure G1-1 shows the security model when EL3 is using AArch32, and shows the expected use of the different 
Exception levels, and which modes execute at which Exception levels. 


[Figure G1-1 is not reproduced here. It shows the following arrangement, with every Exception level using AArch32:

Non-secure state
EL0  Applications (App1, App2) for each Guest OS. Modes: User.
EL1  Guest OS1 and Guest OS2. Modes: System, FIQ, IRQ, Supervisor, Abort, Undefined.
EL2  Hypervisor. Modes: Hyp.

Secure state
EL0  Secure App1 and Secure App2. Modes: User.
EL3  Secure monitor (Modes: Monitor) and Secure OS (Modes: System, FIQ, IRQ, Supervisor, Abort, Undefined).]

Figure G1-1 ARMv8-A Security model when EL3 is using AArch32
Note

For an overview of the Security models when EL3 is using AArch64:

• See Figure G1-2 on page G1-3799 for the case where EL2, EL1, and EL0 are all using AArch32. This figure
  shows the implementation of the PE modes.
• See Figure D1-1 on page D1-1503 for an overview of the set of possible implementations.


Figure G1-1 on page G1-3790 shows that when EL3 is using AArch32, the Exception levels and modes available 
in each Security state are as follows: 


Secure state
    EL0  User mode.
    EL3  Any mode that is available in Secure state, other than User mode.

Non-secure state
    EL0  User mode.
    EL1  Any mode that is available in Non-secure state, other than Hyp mode and User mode.
    EL2  Hyp mode.


Execution at EL0 is described as unprivileged execution.


A mode associated with a particular Exception level, ELn, is described as an ELn mode. 





Note 


The Exception level defines the ability to access resources in the current Security state, and does not imply anything 
about the ability to access resources in the other Security state. 





When EL3 is using AArch32, many AArch32 System registers accessible at PL1 are banked between the Secure 
and Non-secure states. 


When EL3 is using AArch64 and Secure EL1 is using AArch32, System registers accessible at PL1 are not banked
between the Non-secure and Secure states. Software running at EL3 is expected to switch the content of the 
PL1-accessible System registers between the Secure and Non-secure context, in a similar manner to switching the 
contents of general purpose registers. For information on the relationship between AArch64 and AArch32 System 
registers in an interprocessing environment, see Mapping of the System registers between the Execution states on 
page D1-1610. 


For more information on the System registers, see The AArch32 System register interface on page G1-3877. 


The Secure Monitor Call (SMC) instruction provides software with a system call to EL3. When executing at a 
privileged Exception level, SMC instructions generate exceptions. For more information, see Secure Monitor Call
(SMC) exception on page G1-3854 and SMC on page F5-2983. 





Note 


For more information about the Privilege level terminology, see Security state, Exception levels, and AArch32 
execution privilege on page G1-3792. 





Changing from Secure state to Non-secure state 


Monitor mode is provided to support switching between Secure and Non-secure states. When executing in an 
Exception level that is using AArch32, except in Monitor mode and Hyp mode, the Security state is controlled: 
• By the SCR.NS bit, when EL3 is using AArch32.
• By the SCR_EL3.NS bit, when EL3 is using AArch64.
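
How the current Security state follows from the PE mode and the NS bit can be sketched as below. The function name is hypothetical. The sketch assumes the usual meaning of the NS bit, where 0 selects Secure state and 1 selects Non-secure state; that meaning comes from the SCR and SCR_EL3 register descriptions rather than from this section.

```python
def is_secure(mode: str, ns_bit: int) -> bool:
    """Return True if the PE is in Secure state for the given AArch32 mode and NS bit."""
    if mode == "Monitor":
        return True        # Monitor mode always executes at Secure EL3.
    if mode == "Hyp":
        return False       # Hyp mode always executes at Non-secure EL2.
    # In all other modes the NS bit selects the Security state (assumed: 0 = Secure).
    return ns_bit == 0
```

Here `ns_bit` stands for SCR.NS when EL3 is using AArch32, or SCR_EL3.NS when EL3 is using AArch64.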


The mapping of AArch32 privileged modes to the exception hierarchy means that it is possible when EL3 is using 
AArch32 to change from EL3 to Non-secure EL1 without an exception return. This can occur in one of the 
following ways: 


° Using an MSR or CPS instruction to switch from Monitor mode to another privileged mode while SCR.NS is 1. 


° Using an MCR instruction that writes SCR.NS to change from Secure to Non-secure state when in a privileged 
mode other than Monitor mode. 


ARM strongly recommends that software executing at EL3 using AArch32 does not use either of these mechanisms 
to change from EL3 to Non-secure EL1 without an exception return. The use of both of these mechanisms is 
deprecated. 
G1.7 Security state, Exception levels, and AArch32 execution privilege


In ARMv8, the hierarchy of software execution privilege, within a particular Security state, is defined by the
Exception levels, with higher Exception level numbers indicating higher privilege. Table G1-1 shows this hierarchy
for each Security state. 


Table G1-1 Execution privilege and Exception levels, by Security state

Execution privilege    Secure state   Non-secure state   Typical use
Highest                EL3            - (a)              Secure monitor
-                      - (a)          EL2                Hypervisor
-                      EL1            EL1                Secure or Non-secure OS
Lowest, Unprivileged   EL0            EL0                Secure or Non-secure application

a. EL2 is never implemented in Secure state, and EL3 is never implemented in Non-secure state.


When executing in AArch32 state, within a given Security state, the current PE state, including the execution 
privilege, is primarily indicated by the current PE mode. In Secure state, how the PE modes map onto the Exception 
levels depends on whether EL3 is using AArch32 or is using AArch64, and: 


• Figure G1-1 on page G1-3790 shows this mapping when EL3 is using AArch32.
• Figure G1-2 on page G1-3799 shows this mapping when EL3 is using AArch64.

Table G1-2 shows this mapping. In interpreting this table:

• Monitor mode is implemented only in Secure state, and only if EL3 is using AArch32.
• Hyp mode is implemented only in Non-secure state, and only if EL2 is using AArch32.
• System, FIQ, IRQ, Supervisor, Abort, and Undefined modes are implemented:
    In Secure state      If either:
                         • EL3 is using AArch32.
                         • EL3 is using AArch64 and EL1 is using AArch32.
    In Non-secure state  If EL1 is using AArch32.
• User mode is implemented if EL0 is using AArch32.


Table G1-2 Mapping of AArch32 PE modes to Exception levels

           PE modes in the given Security state, and EL3 Execution state
Exception  Secure state,                 Secure state,                 Non-secure state
level      EL3 using AArch32             EL3 using AArch64
EL3        Monitor, System, FIQ, IRQ,    -                             -
           Supervisor, Abort, Undefined
EL2        -                             -                             Hyp
EL1        -                             System, FIQ, IRQ,             System, FIQ, IRQ,
                                         Supervisor, Abort, Undefined  Supervisor, Abort, Undefined
EL0        User                          User                          User

Because AArch32 behavior is described in terms of the PE modes, and transitions between PE modes, the Exception 
levels are implicit in most of the description of operation in AArch32 state. 
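
The mapping in Table G1-2 can be written as a lookup function. The Python sketch below is illustrative: the function and constant names are hypothetical, Exception levels are returned as integers, and the function does not check whether the relevant Exception level is actually using AArch32.

```python
OTHER_PRIVILEGED_MODES = {"System", "FIQ", "IRQ", "Supervisor", "Abort", "Undefined"}


def exception_level(mode: str, secure: bool, el3_using_aarch32: bool) -> int:
    """Return the Exception level that an AArch32 PE mode executes at."""
    if mode == "User":
        return 0
    if mode == "Hyp":
        if secure:
            raise ValueError("Hyp mode is implemented only in Non-secure state")
        return 2
    if mode == "Monitor":
        if not (secure and el3_using_aarch32):
            raise ValueError("Monitor mode needs Secure state and EL3 using AArch32")
        return 3
    if mode in OTHER_PRIVILEGED_MODES:
        # In Secure state with EL3 using AArch32 these modes execute at EL3;
        # in every other case they execute at EL1.
        return 3 if (secure and el3_using_aarch32) else 1
    raise ValueError(f"unknown mode: {mode}")
```

The only case where a mode other than Monitor executes at EL3 is Secure state with EL3 using AArch32, which is the asymmetry that motivates the Privilege level terminology.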
G1.7.1 Limited use of Privilege level in ARMv8 AArch32 state


As described in The VMSAv8-32 translation regimes on page G4-4024, a translation regime maps a virtual address 
(VA) to the corresponding physical address (PA). The VMSAv8-64 translation regimes are defined by the Exception 
levels that use them. However, because the mapping between PE modes and Exception levels in Secure state 
depends on whether EL3 is using AArch32 or is using AArch64, as shown in Table G1-2 on page G1-3792, the 
VMSAv8-32 translation regimes cannot be described simply in terms of either the Exception levels or the PE modes
that use them. 


To provide a consistent description of address translation as seen from AArch32 state, the VMSAv8-32 translation 
regimes are described in terms of the Privilege levels originally defined in the ARMv7 descriptions of AArch32 
state. Table G1-3 shows how the PE modes map to these Privilege levels: 


Table G1-3 Mapping of PE modes to AArch32 Privilege levels

Privilege level  Secure state                                Non-secure state
PL2              -                                           Hyp (a)
PL1              Monitor (b), System, FIQ, IRQ, Supervisor,  System, FIQ, IRQ, Supervisor,
                 Abort, Undefined                            Abort, Undefined
PL0              User                                        User

a. Implemented only in Non-secure state, and only if EL2 is using AArch32.
b. Implemented only in Secure state, and only if EL3 is using AArch32.


Comparing Table G1-3 with Table G1-2 on page G1-3792 shows that: 


In Non-secure state
    Each privilege level maps to the corresponding Exception level. For example PL1 maps to EL1.

In Secure state
    PL0 maps to EL0.
    The mapping of PL1 depends on the Execution state being used by EL3, as follows:
    EL3 using AArch64  Secure PL1 maps to Secure EL1. Monitor mode is not implemented.
    EL3 using AArch32  Secure PL1 maps to Secure EL3. Monitor mode is implemented as one of
                       the Secure PL1 modes.
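
The comparison above amounts to a small mapping from Privilege level to Exception level. A minimal Python sketch, with a hypothetical function name and integer levels, is:

```python
def pl_to_el(pl: int, secure: bool, el3_using_aarch32: bool) -> int:
    """Map an AArch32 Privilege level to the Exception level it corresponds to."""
    if pl == 0:
        return 0
    if pl == 1:
        # Secure PL1 maps to EL3 only when EL3 is using AArch32.
        return 3 if (secure and el3_using_aarch32) else 1
    if pl == 2:
        if secure:
            raise ValueError("PL2 exists only in Non-secure state")
        return 2
    raise ValueError(f"unknown Privilege level: {pl}")
```

In Non-secure state the result always equals the Privilege level number, so the function only departs from the identity mapping for Secure PL1 with EL3 using AArch32.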







G1.8 Virtualization 


The support for virtualization described in this section applies only to an implementation that includes EL2. A PE 
is in Hyp mode when it is executing at EL2 in the AArch32 state. An exception return from Hyp mode to software
running at EL1 or EL0 is performed using the ERET instruction.


EL2 provides a set of features that support virtualizing the Non-secure state of an ARMv8-A implementation. The 
basic model of a virtualized system involves: 


• A hypervisor, running in EL2, that is responsible for switching between virtual machines. A virtual machine
  is comprised of Non-secure EL1 and Non-secure EL0.
• A number of Guest operating systems, that each run in Non-secure EL1, on a virtual machine.
• For each Guest operating system, applications, that usually run in Non-secure EL0, on a virtual machine.

Note

In some systems, a Guest OS is unaware that it is running on a virtual machine, and is unaware of any other Guest
OS. In other systems, a hypervisor makes the Guest OS aware of these facts. The ARMv8-A architecture supports
both of these models.





The hypervisor assigns a virtual machine identifier (VMID) to each virtual machine.

EL2 is implemented only in Non-secure state, to support Guest OS management. EL2 provides controls to:

• Provide virtual values for the contents of a small number of identification registers. A read of one of these
  registers by a Guest OS or the applications for a Guest OS returns the virtual value.
• Trap various operations, including memory management operations and accesses to many other registers. A
  trapped operation generates an exception that is taken to EL2.
• Route interrupts to the appropriate one of:
    — The current Guest OS.
    — A Guest OS that is not currently running.
    — The hypervisor.
In Non-secure state:

• The implementation provides an independent translation regime for memory accesses from EL2.
• For the PL1&0 translation regime, address translation occurs in two stages:
    — Stage 1 maps the virtual address (VA) to an intermediate physical address (IPA). This is managed at
      EL1, usually by a Guest OS. The Guest OS believes that the IPA is the physical address (PA).
    — Stage 2 maps the IPA to the PA. This is managed at EL2. The Guest OS might be completely unaware
      of this stage.
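
The two-stage scheme can be illustrated with a toy model in Python. Everything in this sketch is an assumption made for illustration: the 4KB page size, the use of dictionaries that map page numbers to page numbers in place of real translation tables, and the function names. It shows only the composition of the two stages, not permissions, faults, or table formats.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


def lookup(table: dict, address: int) -> int:
    """Translate one address through a page-number-to-page-number mapping."""
    page, offset = divmod(address, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset   # KeyError models a translation fault


def translate(va: int, stage1: dict, stage2: dict) -> int:
    """VA -> IPA using the Guest OS mapping, then IPA -> PA using the hypervisor mapping."""
    ipa = lookup(stage1, va)    # stage 1, managed at EL1
    return lookup(stage2, ipa)  # stage 2, managed at EL2


# Hypothetical mappings: guest page 0x10 -> IPA page 0x80 -> PA page 0x200.
guest_stage1 = {0x10: 0x80}
hyp_stage2 = {0x80: 0x200}
pa = translate(0x10123, guest_stage1, hyp_stage2)
```

The Guest OS only ever sees `guest_stage1`, which is why it can treat the IPA as if it were the physical address.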


For more information on the translation regimes, see Chapter G4 The AArch32 Virtual Memory System Architecture. 


G1.8.1 The effect of implementing EL2 on the Exception model 


An implementation that includes EL2 implements the following exceptions: 
° Hypervisor Call (HVC) exception. 
° Traps to EL2. EL2 configurable controls on page G1-3894, describes these. 
° All of the virtual interrupts: 
— Virtual SError. 
— Virtual IRQ. 
— Virtual FIQ. 


HVC exceptions are always taken to EL2. All virtual interrupts are always taken to EL1, and can only be taken from 
Non-secure EL1 or EL0. 


Each of the virtual interrupts can be independently enabled using controls at EL2. 
Each of the virtual interrupts has a corresponding physical interrupt. See Virtual interrupts. 


When a virtual interrupt is enabled, its corresponding physical exception is taken to EL2, unless EL3 has configured 
that physical exception to be taken to EL3. For more information, see Asynchronous exception behavior for 
exceptions taken from AArch32 state on page G1-3839. 


An implementation that includes EL2 also: 


° Provides controls that can be used to route some synchronous exceptions, taken from Non-secure state, to 
EL2. For more information see: 


— Routing exceptions from Non-secure EL0 to EL2 on page G1-3828. 
— Routing debug exceptions to EL2 on page G1-3833. 


° Provides mechanisms to trap PE operations to EL2. See EL2 configurable controls on page G1-3894. 
When an operation is trapped to EL2, the hypervisor typically either: 
— Emulates the required operation. The application running in the Guest OS is unaware of the trap. 
— Returns an error to the Guest OS. 


Virtual interrupts 


The virtual interrupts have names that correspond to the physical interrupts, as shown in Table G1-4. 


Table G1-4 The virtual interrupts 





Physical interrupt    Corresponding virtual interrupt 
External SError       Virtual SError 
IRQ                   Virtual IRQ 
FIQ                   Virtual FIQ 





Software executing at EL2 can use virtual interrupts to signal physical interrupts to Non-secure EL1 and Non-secure 
EL0. Example G1-1 shows a usage model for virtual interrupts. 


Example G1-1 Virtual interrupt usage model 


A usage model is as follows: 


1. Software executing at EL2 routes a physical interrupt to EL2. 

2. When a physical interrupt of that type occurs, the exception handler executing in EL2 determines whether 
the interrupt can be handled in EL2 or requires routing to a Guest OS in EL1. If the interrupt requires routing 
to a Guest OS: 

° If the Guest OS is currently running, the hypervisor uses the appropriate virtual interrupt type to signal 
the physical interrupt to the Guest OS. 

° If the Guest OS is not currently running, the physical interrupt is marked as pending for the Guest OS. 
When the hypervisor next switches to the virtual machine that is running that Guest OS, the hypervisor 
uses the appropriate virtual interrupt type to signal the physical interrupt to the Guest OS. 


Non-secure EL1 and Non-secure EL0 modes cannot distinguish a virtual interrupt from the corresponding physical 
interrupt. 


For more information see Virtual exceptions when an implementation includes EL2 on page G1-3839. 
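
The usage model in Example G1-1 can be sketched in Python as follows. The Guest and Hypervisor classes, their method names, and the string return values are hypothetical and exist only for this illustration. The sketch models the bookkeeping described in the example, not the EL2 controls that a real hypervisor programs.

```python
from dataclasses import dataclass, field


@dataclass
class Guest:
    name: str
    pending: list = field(default_factory=list)    # physical interrupts marked as pending
    delivered: list = field(default_factory=list)  # virtual interrupts signaled to the guest


class Hypervisor:
    def __init__(self, guests, running):
        self.guests = {g.name: g for g in guests}
        self.running = running  # name of the Guest OS that is currently running

    def on_physical_interrupt(self, kind, target=None):
        """Handle a physical interrupt ("IRQ", "FIQ", or "SError") routed to EL2."""
        if target is None:
            return "handled in EL2"
        guest = self.guests[target]
        if target == self.running:
            # Guest OS is running: signal it with the matching virtual interrupt.
            guest.delivered.append("Virtual " + kind)
            return "signaled"
        # Guest OS is not running: mark the interrupt as pending for it.
        guest.pending.append(kind)
        return "pending"

    def switch_to(self, name):
        """Switch virtual machine, then signal any pending interrupts."""
        self.running = name
        guest = self.guests[name]
        while guest.pending:
            guest.delivered.append("Virtual " + guest.pending.pop(0))
```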
G1.9 AArch32 PE modes, and general-purpose and Special-purpose registers 


The following sections describe the AArch32 PE modes and the general-purpose registers and the PC: 
° AArch32 PE mode descriptions. 
° AArch32 general-purpose registers, the PC, and the Special-purpose registers on page G1-3801. 
° Saved Program Status Registers (SPSRs) on page G1-3803. 
° ELR_hyp on page G1-3804. 


Note 





The PC is included in the scope of this section because, in AArch32 state, it is defined as being part of the same 
register file as the general-purpose registers. That is, the AArch32 register file R0-R15 comprises: 

° The general-purpose registers R0-R14. 
° The PC, that can be described as R15. 





G1.9.1 AArch32 PE mode descriptions 


Table G1-5 shows the PE modes defined by the ARM architecture, for execution in AArch32 state. In this table: 


° The PE mode column gives the name of each mode and the abbreviation used, for example, in the 
general-purpose register name suffixes used in AArch32 general-purpose registers, the PC, and the 
Special-purpose registers on page G1-3801. 

° The Encoding column gives the corresponding PSTATE.M field. 

° The Exception level column gives the Exception level at which the mode is implemented, including 
dependencies on the current Security state and on whether EL3 is using AArch32, see Exception levels on 
page G1-3783. 


Table G1-5 AArch32 PE modes 



































PE mode            Encoding  Security state  Exception level  Implemented 
User        usr    10000     Both            EL0              Always 
FIQ         fiq    10001     Non-secure      EL1              Always 
                             Secure          EL1 or EL3 (a) 
IRQ         irq    10010     Non-secure      EL1              Always 
                             Secure          EL1 or EL3 (a) 
Supervisor  svc    10011     Non-secure      EL1              Always 
                             Secure          EL1 or EL3 (a) 
Monitor     mon    10110     Secure          EL3              If EL3 implemented and using AArch32 
Abort       abt    10111     Non-secure      EL1              Always 
                             Secure          EL1 or EL3 (a) 
Hyp         hyp    11010     Non-secure      EL2              If EL2 implemented and using AArch32 
Undefined   und    11011     Non-secure      EL1              Always 
                             Secure          EL1 or EL3 (a) 
System      sys    11111     Non-secure      EL1              Always 
                             Secure          EL1 or EL3 (a) 

a. EL3 if EL3 is using AArch32. EL1 if EL3 is using AArch64 and EL1 is using AArch32. 


Mode changes can be made under software control, or can be caused by an external or internal exception. 
Notes on the AArch32 PE modes 


PE modes are defined only in AArch32 state. Because each mode is implemented as part of a particular Exception 
level that is using AArch32, the set of available modes depends on which Exception levels are implemented and 
using AArch32, as described in Effect of the EL3 Execution state on the PE modes and Exception levels on 
page G1-3799. 


This section gives more information about each of the modes, when it is implemented. 


User mode 

Software executing in User mode executes at EL0. Execution in User mode is sometimes described 
as unprivileged execution. Application programs normally execute in User mode, and any program 
executed in User mode: 

° Makes only unprivileged accesses to system resources, meaning it cannot access protected 
system resources. 

° Makes only unprivileged access to memory. 

° Cannot change mode except by causing an exception, see Handling exceptions that are taken 
to an Exception level using AArch32 on page G1-3812. 


System mode 

System mode is implemented at EL1 or EL3, see Effect of the EL3 Execution state on the PE modes 
and Exception levels on page G1-3799. 

System mode has the same registers available as User mode, and is not entered by any exception. 


Supervisor mode 


Supervisor mode is implemented at EL1 or EL3, see Effect of the EL3 Execution state on the PE 
modes and Exception levels on page G1-3799. 


Supervisor mode is the default mode to which a Supervisor Call exception is taken. Executing an SVC 
(Supervisor Call) instruction generates a Supervisor Call exception. 


In an implementation where the highest implemented Exception level is using AArch32, if that 
Exception level is EL3 or EL1, a PE enters Supervisor mode on Reset. 





Abort mode Abort mode is implemented at EL1 or EL3, see Effect of the EL3 Execution state on the PE modes 
and Exception levels on page G1-3799. 
Abort mode is the default mode to which a Data Abort exception or Prefetch Abort exception is 
taken. 

Undefined mode 
Undefined mode is implemented at EL1 or EL3, see Effect of the EL3 Execution state on the PE 
modes and Exception levels on page G1-3799. 
Undefined mode is the default mode to which an instruction-related exception, including any 
attempt to execute an UNDEFINED instruction, is taken. 

FIQ mode FIQ mode is implemented at EL1 or EL3, see Effect of the EL3 Execution state on the PE modes 
and Exception levels on page G1-3799. 
FIQ mode is the default mode to which an FIQ interrupt is taken. 

IRQ mode IRQ mode is implemented at EL1 or EL3, see Effect of the EL3 Execution state on the PE modes 
and Exception levels on page G1-3799. 
IRQ mode is the default mode to which an IRQ interrupt is taken. 

Hyp mode Hyp mode is the Non-secure EL2 mode. 
Hyp mode is entered on taking an exception from Non-secure state that must be taken to EL2. 
In an implementation where the highest implemented Exception level is EL2 and EL2 uses 
AArch32 on reset, a PE enters Hyp mode on Reset. 
The Hypervisor Call exception and Hyp Trap exception are implemented as part of EL2 and are 
always taken to Hyp mode. 
Note 


This means that Hypervisor Call and Hyp Trap exceptions cannot be taken from Secure state. 





When the value of the Hypervisor Call enable bit, SCR.HCE, is 1, executing an HVC (Hypervisor 
Call) instruction in a Non-secure EL1 mode generates a Hypervisor Call exception. 


For more information, see Hyp mode on page G1-3800. 


Monitor mode 


Monitor mode is the Secure EL3 mode. This means it is always in the Secure state, regardless of the 
value of the SCR.NS bit. 


Monitor mode is the mode to which a Secure Monitor Call exception is taken. In a Non-secure EL1 
mode, or a Secure EL3 mode, executing an SMC (Secure Monitor Call) instruction generates a Secure 
Monitor Call exception. 


When EL3 is using AArch32, some exceptions that are taken to a different mode by default can be 
configured to be taken to EL3, see PE mode for taking exceptions on page G1-3822. 


When EL3 is using AArch32, software executing in Monitor mode: 
° Has access to both the Secure and Non-secure copies of System registers. 


° Can perform an exception return to Secure state, or to Non-secure state. 
This means that, when EL3 is using AArch32, Monitor mode provides the only recommended 
method of changing between the Secure and Non-secure Security states. 

Secure and Non-secure modes 


In an implementation that includes EL3, the names of most implemented modes can be qualified as 
Secure or Non-secure, to indicate whether the PE is also in Secure state or Non-secure state. For 
example: 
° If a PE is in Supervisor mode and Secure state, it is in Secure Supervisor mode. 
° If a PE is in User mode and Non-secure state, it is in Non-secure User mode. 

Note 
As indicated in the appropriate Mode descriptions: 
° Monitor mode is a Secure mode, meaning it is always in the Secure state. 
° Hyp mode is a Non-secure mode, meaning it is accessible only in Non-secure state. 
Effect of the EL3 Execution state on the PE modes and Exception levels 


Figure G1-1 on page G1-3790 shows the PE modes, Exception levels, and Security states, for an implementation 
that includes all of the Exception levels, when EL3 is using AArch32. Figure G1-2 shows how the implemented 
modes change when EL3 is using AArch64. 


[Figure G1-2 shows, for Non-secure state and Secure state: 
° The applications App1 and App2 for each Guest OS, and Secure App1 and Secure App2, using AArch32, in User mode. 
° Guest OS1 and Guest OS2 in Non-secure state, and the Secure OS in Secure state, using AArch32†, in System, FIQ, 
IRQ, Supervisor, Abort, and Undefined modes. 
° The hypervisor, in Non-secure state only, using AArch32‡. 
° The Secure monitor, using AArch64.] 

† When EL1 is using AArch64, System, FIQ, IRQ, Supervisor, Abort, and Undefined modes are not implemented. 
‡ When EL2 is using AArch64, Hyp mode is not implemented. 


Figure G1-2 ARMv8 Exception levels, and PE modes, when EL3 is using AArch64 


Comparing Figure G1-1 on page G1-3790 and Figure G1-2 shows how, in Secure state only, the implementation of 
System, FIQ, IRQ, Supervisor, Abort, and Undefined mode depends on the Execution state that EL3 is using. That 
is, these modes are implemented as follows: 


Non-secure state 

If Non-secure EL1 is using AArch32 then System, FIQ, IRQ, Supervisor, Abort, and Undefined 
modes are implemented as part of EL1. Otherwise, these modes are not implemented in Non-secure 
state. 


Secure state 

The implementation of these modes depends on the Execution state that EL3 is using, as follows: 

EL3 using AArch64 If Secure EL1 is using AArch32 then System, FIQ, IRQ, Supervisor, Abort, 
and Undefined modes are implemented as part of EL1. Otherwise, these 
modes are not implemented in Secure state. 

EL3 using AArch32 In Secure state, System, FIQ, IRQ, Supervisor, Abort, and Undefined modes 
are implemented as part of EL3, see Figure G1-1 on page G1-3790. 
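
The rules above can be summarized by the following Python sketch. The function name and its boolean parameters are assumptions made for this illustration. The el1_aarch32 parameter refers to EL1 in the Security state being queried.

```python
from typing import Optional


def exception_level_of_el1_modes(secure: bool, el3_aarch32: bool,
                                 el1_aarch32: bool) -> Optional[str]:
    """Return the Exception level that implements System, FIQ, IRQ, Supervisor,
    Abort, and Undefined modes, or None if they are not implemented."""
    if secure and el3_aarch32:
        # Secure state with EL3 using AArch32: the modes are part of EL3.
        return "EL3"
    # Non-secure state, or Secure state with EL3 using AArch64:
    # the modes exist only if EL1 in that Security state is using AArch32.
    return "EL1" if el1_aarch32 else None
```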
Hyp mode 


Hyp mode is the Non-secure EL2 mode. When EL2 is using AArch32, it provides the usual method of controlling 
the virtualization of Non-secure execution at EL1 and EL0. 


Note 





The alternative method of controlling this functionality is by accessing the EL2 controls from EL3 with the 
SCR_EL3.NS or SCR.NS bit set to 1. 





This section summarizes how Hyp mode differs from the other modes, and references where this part of the manual 
describes the features of Hyp mode in more detail: 


° Software executing in Hyp mode executes at EL2, see Figure G1-1 on page G1-3790. 

° Hyp mode is accessible only in Non-secure state. In Secure state, an attempt by a CPS or an MSR instruction to 
change PSTATE.M to Hyp mode is an illegal change to PSTATE.M, as described in Illegal changes to 
PSTATE.M on page G1-3809. 

° In Non-debug state, the only mechanisms for changing to Hyp mode are: 
— An exception taken from a Non-secure EL1 or EL0 mode. 
— When EL3 is using AArch32, an exception return from Secure Monitor mode. 
— When EL3 is using AArch64, an exception return from EL3. 

° In Hyp mode, the only exception return is execution of an ERET instruction, see ERET on page F5-2673. 

° In Hyp mode, the CPACR has no effect on the execution of: 
— System register access instructions. 
— Advanced SIMD and floating-point instructions. 
The HCPTR controls execution of these instructions in Hyp mode. 

° If software running in Hyp mode executes an SVC instruction, the Supervisor Call exception generated by the 
instruction is taken to Hyp mode, see SVC on page F5-3129. 

° An exception return with restored PSTATE specifying Hyp mode is an illegal return event, as described in 
Illegal return events from AArch32 state on page G1-3835, if any of the following applies: 
— EL3 is using AArch64 and the value of SCR_EL3.NS is 0. 
— EL3 is using AArch32 and the value of SCR.NS is 0. 
— The return is from a Non-secure EL1 mode. 

° The instructions described in the following sections are UNDEFINED if executed in Hyp mode: 
— SRS. See SRS, SRSDA, SRSDB, SRSIA, SRSIB on page F5-3018. 
— RFE. See RFE, RFEDA, RFEDB, RFEIA, RFEIB on page F5-2918. 
— LDM (exception return) on page F5-2703. 
— LDM (User registers) on page F5-2705. 
— STM (User registers) on page F5-3053. 
— The SUBS PC, LR forms of the instructions described in SUB, SUBS (immediate) on page F5-3114. 

Note 
In T32 state, ERET is encoded as SUBS PC, LR, #0, and therefore this is a valid instruction. 

— The exception return form of the instructions described in MOV, MOVS (register) on page F5-2815. 

In addition, deprecated forms of the A32 ADCS, ADDS, ANDS, BICS, EORS, MOVS, MVNS, ORRS, RSBS, RSCS, SBCS, and 
SUBS instructions with the PC as the destination register are UNDEFINED if executed in Hyp mode. The 
instruction descriptions identify these UNDEFINED cases. 

° The Load unprivileged and Store unprivileged instructions LDRT, LDRSHT, LDRHT, LDRBT, STRT, STRHT, and STRBT, 
are CONSTRAINED UNPREDICTABLE if executed in Hyp mode, see Execution of Load/Store unprivileged 
instructions in Hyp mode on page K1-5475. 


In an implementation that includes EL3, from reset, the HVC instruction is UNDEFINED in Non-secure EL1 modes, 
meaning entry to Hyp mode is disabled by default. To permit entry to Hyp mode using the Hypervisor Call 
exception, Secure software must enable use of the HVC instruction: 


° By setting the SCR_EL3.HCE bit to 1, if EL3 is using AArch64. 
° By setting the SCR.HCE bit to 1, if EL3 is using AArch32. 


When the HVC instruction is UNDEFINED in Non-secure EL1 modes because of the value of the SCR_EL3.HCE or 
SCR.HCE bit, HVC is CONSTRAINED UNPREDICTABLE in Hyp mode. 


Pseudocode description of mode operations 
The BadMode() function tests whether a 5-bit mode number corresponds to one of the permitted modes. 


The BadMode() function is defined in Chapter J1 ARMv8 Pseudocode. 
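
For orientation only, the following Python sketch shows the kind of check that BadMode() performs, using the encodings from Table G1-5. It is a simplified illustration and not the Chapter J1 definition. The names PE_MODES and bad_mode, and the two boolean parameters that stand for "implemented and using AArch32", are assumptions made for this sketch.

```python
# PSTATE.M encodings from Table G1-5.
PE_MODES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10110: "mon",
    0b10111: "abt",
    0b11010: "hyp",
    0b11011: "und",
    0b11111: "sys",
}


def bad_mode(mode: int, mon_implemented: bool = True,
             hyp_implemented: bool = True) -> bool:
    """Return True if the 5-bit mode number is not a permitted mode."""
    name = PE_MODES.get(mode)
    if name is None:
        return True                  # reserved encoding
    if name == "mon":
        return not mon_implemented   # needs EL3 implemented and using AArch32
    if name == "hyp":
        return not hyp_implemented   # needs EL2 implemented and using AArch32
    return False
```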











G1.9.2 AArch32 general-purpose registers, the PC, and the Special-purpose registers 
The general-purpose registers, and the PC, in AArch32 state on page E1-2291 describes the application level view 
of the general-purpose registers, and the PC. This view provides: 
° The general-purpose registers R0-R14, of which: 
— The preferred name for R13 is SP (stack pointer). 
— The preferred name for R14 is LR (link register). 
° The PC (program counter), that can be described as R15. 
These registers are selected from a larger set of registers, that includes Banked copies of some registers, with the 
current register selected by the execution mode. The implementation and banking of the general-purpose registers 
depends on whether or not the implementation includes EL2 and EL3, and whether those Exception levels are using 
AArch32. Figure G1-3 on page G1-3802 shows the full set of Banked general-purpose registers, and the 
Special-purpose registers: 
° The Program Status Registers CPSR and SPSR. 
° ELR_hyp. 
Note 
The architecture uses system level register names, such as R0_usr, R8_usr, and R8_fiq, when it must identify a 
specific register. The application level names refer to the registers for the current mode, and usually are sufficient 
to identify a register. 
[Figure G1-3: Banked general-purpose registers and Special-purpose registers, by PE mode. The SPSR row lists 
SPSR_svc, SPSR_abt, SPSR_und, SPSR_mon, SPSR_irq, and SPSR_fiq.] 

† Part of EL3. Exists only in Secure state, and only when EL3 is using AArch32. 
‡ Part of EL2. Exists only in Non-secure state, and only when EL2 is using AArch32. 
Cells with no entry indicate that the User mode register is used. 


Figure G1-3 AArch32 general-purpose registers, PC, and Special-purpose registers, showing banking 


As described in PE mode for taking exceptions on page G1-3822, on taking an exception the PE changes mode, 
unless it is already in the mode to which it must take the exception. Each mode that the PE might enter in this way 
has: 

° A Banked copy of the stack pointer, for example SP_irq and SP_hyp. 
° A register that holds a preferred return address for the exception. This is: 
— For the EL2 mode, Hyp mode, the Special-purpose register ELR_hyp. 
— For the other privileged modes to which exceptions can be taken, a Banked copy of the link register, 
for example LR_und and LR_mon. 
° A saved copy of PSTATE, made on exception entry, for example SPSR_irq and SPSR_hyp. 


In addition FIQ mode has Banked copies of the general-purpose registers R8 to R12. 


User mode and System mode share the same general-purpose registers. 


User mode, System mode, and Hyp mode share the same LR. 
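
The banking rules in this section can be sketched as a lookup from an application level register number and a PE mode to a system level register name. The function banked_register_name and its use of mode abbreviations as strings are assumptions made for this illustration. The architectural lookup is the LookUpRIndex() pseudocode function.

```python
MODES = ("usr", "sys", "fiq", "irq", "svc", "abt", "und", "mon", "hyp")


def banked_register_name(n: int, mode: str) -> str:
    """Return the system level name of general-purpose register Rn in a mode."""
    if not 0 <= n <= 14:
        raise ValueError("n must be in the range 0 to 14")
    if mode not in MODES:
        raise ValueError("unknown mode abbreviation")
    if n <= 7:
        return f"R{n}_usr"                    # R0-R7 are not Banked
    if n <= 12:
        # Only FIQ mode has Banked copies of R8 to R12.
        return f"R{n}_fiq" if mode == "fiq" else f"R{n}_usr"
    if n == 13:
        # User and System modes share SP; every other mode has its own copy.
        return "SP_usr" if mode in ("usr", "sys") else f"SP_{mode}"
    # n == 14: User, System, and Hyp modes share the same LR.
    return "LR_usr" if mode in ("usr", "sys", "hyp") else f"LR_{mode}"
```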


For more information about the application level view of the SP, LR, and PC, and the alternative descriptions of 
them as R13, R14 and R15, see The general-purpose registers, and the PC, in AArch32 state on page E1-2291. 


AArch32 Special-purpose registers 


In AArch32 state, the Special-purpose registers are: 


° The CPSR and its view as the APSR. 

° The SPSR, including the banked copies SPSR_abt, SPSR_fiq, SPSR_hyp, SPSR_irq, SPSR_mon, 
SPSR_svc, and SPSR_und. 

° The ELR_hyp. 
Pseudocode description of general-purpose register and PC operations 


The following pseudocode gives access to the general-purpose registers and the PC. These registers are an array, 
_R[], indexed by parameter n. This array is common to AArch32 and AArch64 operation and therefore contains 31 
64-bit registers. _PC is the program counter, and its definition is common to AArch32 and AArch64 operation and 
therefore its size is 64-bit. 


LookUpRIndex() looks up the index value, n, for the specified register number and PE mode, using RBankSelect() to 
evaluate the result. 


R[] accesses the specified general-purpose register in the current PE mode, using Rmode[] to access the register, 
accessing _R[] if necessary. SP accesses the stack pointer, LR accesses the link register, and PC accesses the program 
counter. Each function has a non-assignment form for register reads and an assignment form for register writes, 
other than PC, which has only a non-assignment form. 


BranchTo() performs a branch to the specified address. 


The _R[], _PC, LR, SP, LookUpRIndex(), RBankSelect(), Rmode[], and BranchTo() functions are defined in Chapter J1 
ARMv8 Pseudocode. 


G1.9.3 Saved Program Status Registers (SPSRs) 


The Saved Program Status Registers (SPSRs) are used to save PE state on taking exceptions. In AArch32 state, there 
is an SPSR for every mode that an exception can be taken to, as shown in Figure G1-3 on page G1-3802. For 
example, the SPSR for Monitor mode is called SPSR_mon. 





Note 


Exceptions cannot be taken to EL0. 





When the PE takes an exception, PE state is saved from PSTATE in the SPSR for the mode the exception is taken 
to. For example, if the PE takes an exception to Monitor mode, PE state is saved in SPSR_mon. For more 
information on PSTATE, see Process state, PSTATE on page G1-3805. 


Note 


All PSTATE fields are saved, including those which have no direct read and write access. 








Saving the PSTATE fields means the exception handler can: 


° On return from the exception, restore the PE state to the values it had immediately before the exception was 
taken. When the PE returns from an exception, PE state is restored to the state stored in the SPSR of the mode 
the exception is returning from, if the exception return is made using one of: 
— ERET. 
— LDM. 
— The Exception return form of the instruction described in MOV, MOVS (register) on page F5-2815. 
— The Exception return form of the instruction described in SUB, SUBS (immediate) on page F5-3114. 
For example, on returning from Monitor mode, PE state is restored to the state stored in SPSR_mon. If the 
exception return is made using the RFE instruction, the PE restores the PE state from an SPSR value read 
from memory. 


° Examine the value that PSTATE had when the exception was taken, for example to determine the instruction 
set state and privilege level in which the instruction that caused an Undefined Instruction exception was 
executed. 


The SPSRs are UNKNOWN on reset. Any operation in a Non-secure EL1 or EL0 mode makes SPSR_hyp UNKNOWN. 
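
The save and restore behavior described in this section can be sketched as follows. The PE class, its methods, and the packing of PSTATE into a single integer are assumptions made for this illustration. The sketch changes only the mode bits on exception entry and does not model the other PSTATE changes that exception entry makes.

```python
# A few PSTATE.M encodings from Table G1-5.
MODE_BITS = {"usr": 0b10000, "svc": 0b10011, "mon": 0b10110, "und": 0b11011}
MODE_NAMES = {bits: name for name, bits in MODE_BITS.items()}


class PE:
    def __init__(self, pstate: int):
        self.pstate = pstate
        self.spsr = {}  # SPSRs are UNKNOWN on reset, modeled here as absent

    def take_exception(self, target: str) -> None:
        """Save PSTATE in the SPSR of the target mode, then enter that mode."""
        self.spsr[target] = self.pstate
        self.pstate = (self.pstate & ~0b11111) | MODE_BITS[target]

    def exception_return(self) -> None:
        """Restore PSTATE from the SPSR of the mode being returned from."""
        current = MODE_NAMES[self.pstate & 0b11111]
        self.pstate = self.spsr[current]
```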
SPSR format for exceptions taken to AArch32 state 


The SPSR bit assignments are: 








[Figure: SPSR bit assignment diagrams, bits 31 to 0, showing the condition flags, the mask bits, and the RES0 fields. 
One of the two diagrams also labels a PAN field. The fields are described below.] 

N, Z, C, V, Q, GE[3:0], bits[31:27, 19:16] 
Shows the value of the PSTATE.{N, Z, C, V, Q, GE} flags immediately before the exception was 
taken. 


IT[1:0], J, IL, IT[7:2], E, T, bits[26:24, 20, 15:9, 5] 
Shows the values of the PSTATE.{IT[7:0], J, IL, E, T} PE state controls immediately before the 
exception was taken. The J bit is RES0. 


Bits[23:21] Reserved, RES0. 


A, I, F, M, bits[8:6, 4:0] 
Shows the values of the PSTATE.{A, I, F} exception mask and PSTATE.M mode bits immediately 
before the exception was taken. 


Bits[23:21] of an SPSR are ignored on an exception return from AArch32 state. 
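
The field layout listed above can be exercised with a small decoder. The following Python sketch is illustrative only. The function name decode_spsr and the dictionary it returns are assumptions, and the sketch ignores bits[23:21].

```python
def decode_spsr(spsr: int) -> dict:
    """Split a 32-bit AArch32 SPSR value into its fields."""

    def bits(hi: int, lo: int) -> int:
        return (spsr >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "N": bits(31, 31), "Z": bits(30, 30), "C": bits(29, 29), "V": bits(28, 28),
        "Q": bits(27, 27),
        # IT[7:0] is split: IT[7:2] in bits[15:10], IT[1:0] in bits[26:25].
        "IT": (bits(15, 10) << 2) | bits(26, 25),
        "J": bits(24, 24),
        "IL": bits(20, 20),
        "GE": bits(19, 16),
        "E": bits(9, 9),
        "A": bits(8, 8), "I": bits(7, 7), "F": bits(6, 6),
        "T": bits(5, 5),
        "M": bits(4, 0),
    }
```

For example, decoding 0x600001D3 gives Z and C set, the A, I, and F mask bits set, and M equal to 0b10011, which Table G1-5 lists as Supervisor mode.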


Pseudocode description of SPSR operations 
The following pseudocode gives access to the SPSRs. 
The SPSR[] function accesses the current SPSR and is common to AArch32 and AArch64 operation. 


The SPSRWriteByInstr() function is used by the MSR (register) and MSR (immediate) instructions to update the 
current SPSR. 


The SPSR[] and SPSRWriteByInstr() functions are defined in Chapter J1 ARMv8 Pseudocode. 


G1.9.4 ELR_hyp 


Hyp mode does not provide its own Banked copy of LR. Instead, on taking an exception to Hyp mode, the preferred 
return address is stored in ELR_hyp, a 32-bit Special-purpose register implemented for this purpose. 


ELR_hyp can be accessed explicitly only by executing: 

° An MRS or MSR instruction that targets ELR_hyp, see: 
— MRS (Banked register) on page F5-2832. 
— MSR (Banked register) on page F5-2836. 


The ERET instruction uses the value in ELR_hyp as the return address for the exception. For more information, see 
ERET on page F5-2673. 


Software execution in any Non-secure EL1 or EL0 mode makes ELR_hyp UNKNOWN. 
G1.10 Process state, PSTATE 


In the ARMv8-A architecture, Process state or PSTATE is an abstraction of process state information. All of the 
instruction sets provide instructions that operate on elements of PSTATE. 


PSTATE includes all of the following: 


° Fields that are meaningful only in AArch32 state. 
° Fields that are meaningful only in AArch64 state. 
° Fields that are meaningful in both Execution states. 


PSTATE is defined in pseudocode as the PSTATE structure, of type ProcState. ProcState is defined in Chapter J1 
ARMv8 Pseudocode. 


The PSTATE fields that are meaningful in AArch32 state are: 


The condition flags 
N Negative Condition flag. 
Z Zero Condition flag. 
C Carry Condition flag. 
V Overflow Condition flag. 


Process state, PSTATE on page E1-2294 gives more information about these. 


The overflow or saturation flag 
Q See Process state, PSTATE on page E1-2294. 


The greater than or equal flags 
GE[3:0] See Process state, PSTATE on page E1-2294. 


The PE state controls 


J, T Instruction set state. See Process state, PSTATE on page E1-2294. J is RES0. On a reset 
to AArch32 state, T is set to an IMPLEMENTATION DEFINED value. On taking an 
exception to: 


° A PL1 mode using AArch32, T is set to SCTLR.TE. 
° EL2 using AArch32, T is set to HSCTLR.TE. 


IT[7:0] IT block state bits. See Process state, PSTATE on page E1-2294. On a reset or taking an 
exception to AArch32 state, these bits are set to 0. 


E Endianness of data accesses. See Process state, PSTATE on page E1-2294. If an 
implementation provides both Big-endian and Little-endian support, then: 


° On a reset to AArch32 state this bit is set to the IMPLEMENTATION DEFINED reset 
value of: 
—  SCTLR.EE if the highest implemented Exception level is not EL2. 
—  HSCTLR.EE if the highest implemented Exception level is EL2. 


° On taking an exception to: 
— A PL1 mode using AArch32, this bit is set to SCTLR.EE. 
— EL2 using AArch32, this bit is set to HSCTLR.EE. 


IL Illegal Execution state bit. See The Illegal Execution state exception on page G1-3837. 
On a reset or taking an exception to AArch32 state, this bit is set to 0. 


For information on how the J, T, IT[7:0], E, and IL fields can be accessed, see Accessing the PE 
state controls and the Execution state bit on page G1-3808. 


The asynchronous exception mask bits 





A SError interrupt mask bit. 
I IRQ interrupt mask bit. 
F FIQ interrupt mask bit. 
For each bit, the values are: 

0 Exception not masked. 

1 Exception masked. 

On a reset to AArch32 state, these bits are set to 1. 

On taking an exception to AArch32 state, one or more of these bits are set to 1. 
For more information, see both: 

° Asynchronous exception masking controls on page G1-3842. 

° PE state on exception entry on page G1-3826. 


The mode bits 


M[4:0] Current mode of the PE. Table G1-5 on page G1-3796 lists the permitted values of this 
field. All other values are reserved. Illegal changes to PSTATE.M on page G1-3809 
describes the effect of setting M[4:0] to a reserved value. 


M[4] is: 

M[4], Execution state 
The current Execution state: 
0 AArch64 state. 
1 AArch32 state. 


Note 


This is consistent with the use of M[4:0] in previous versions of the 
architecture. 








On a reset to AArch32 state, M[4:0] is set to: 


° 0b10011, meaning Supervisor mode, if the highest implemented Exception level 
is not EL2. 
° 0b11010, meaning Hyp mode, if the highest implemented Exception level is EL2. 


On taking an exception to AArch32 state, M[4:0] is set to the target mode for the 
exception type. 


For more information about the PE modes, see: 
° AArch32 PE mode descriptions on page G1-3796. 
° PE state on exception entry on page G1-3826. 


G1.10.1 Accessing PSTATE fields 


The PSTATE fields can be accessed as described in the following subsections: 

° The Current Program Status Register, CPSR on page G1-3807. 

° Accessing the PE state controls and the Execution state bit on page G1-3808. 
° The CPS instruction on page G1-3808. 

° The SETEND instruction on page G1-3808. 
The Current Program Status Register, CPSR 


Some PSTATE fields can be accessed using the Special-purpose Current Program Status Register (CPSR). The 
CPSR can be directly read using the MRS instruction, and directly written using the MSR (register) and MSR 
(immediate) instructions. 
The CPSR bit assignments are: 

[Figure: CPSR bit assignment diagram, bits 31 to 0, showing the condition flags, the mask bits, and the RES0 fields. 
The fields are described below.] 


N, Z, C, V, bits [31:28] 
The PSTATE condition flags. 


Q, bit [27] The PSTATE overflow or saturation flag. 


Bits[26:24] Reserved, RAZ/SBZP. Software can use MSR instructions that write the top byte of the CPSR without 
using a read-modify-write sequence. If it does this, it must write zeros to bits[26:24]. 


Bits[23:20, 15:10, 5] 
Reserved, RES0. 


GE[3:0], bits [19:16] 
The PSTATE greater than or equal flags. 


E, bit [9] The PSTATE endianness bit. 


A, I, F, bits [8:6] 
The PSTATE asynchronous exception mask bits. 


M[4:0], bits [4:0] 
The PSTATE mode bits. 


The other PSTATE fields cannot be accessed by using the CPSR. For information on how to access them, see 
Accessing the PE state controls and the Execution state bit on page G1-3808. 


The application level alias for the CPSR is the APSR. The APSR is a subset of the CPSR. See The Application 
Program Status Register, APSR on page E1-2296. 


Writes to the CPSR have side-effects on various aspects of PE operation. All of these side-effects, except 
side-effects on memory accesses associated with fetching instructions, are synchronous to the CPSR write. This 
means that they are guaranteed: 


° Not to be visible to earlier instructions in the execution stream. 


° To be visible to later instructions in the execution stream. 


The privilege level and address space of memory accesses associated with fetching instructions depend on the 
current Exception level and Security state. Writes to PSTATE.M can change one or both of the Exception level and 
Security state. The effect, on memory accesses associated with fetching instructions, of a change of Exception level 
or Security state is: 


° Synchronous to the change of Exception level or Security state, if that change is caused by an exception entry 
or exception return. 


° Guaranteed not to be visible to any memory access caused by fetching an earlier instruction in the execution 
stream. 
° Guaranteed to be visible to any memory access caused by fetching any instruction after the next Context 
synchronization event in the execution stream. 


° Might or might not affect memory accesses caused by fetching instructions between the mode change 
instruction and the point where the mode change is guaranteed to be visible. 


See Exception return to an Exception level using AArch32 on page G1-3834 for the definition of exception return 
instructions. 


Accessing the PE state controls and the Execution state bit 
The PE state controls are the PSTATE.{IL, IT[7:0], J, E, T} fields. Software can read or write these in an SPSR. 


In the CPSR: 
° The PE state controls, other than PSTATE.E, are RAZ when read by an MRS instruction. 


° Writes to the PE state controls, other than PSTATE.E, by MSR (register) or MSR (immediate), are ignored 
in all modes. 


Instructions other than MRS, MSR (register), or MSR (immediate) that access the PE state controls can read and 
write them in any mode. 


Unlike the other PSTATE PE state controls, PSTATE.E can be read by an MRS instruction and might be written by 
MSR (register) or MSR (immediate). However, ARM deprecates PSTATE.E having a different value from the 
equivalent System register EE bit, see Mixed-endian support on page G3-3988. 


Note 


To determine the current endianness, software can use an LDR instruction to load a word from memory with a known 
value that differs if the endianness is reversed. For example, using an LDR instruction to load a word whose four bytes 
are 0x01, 0x00, 0x00, and 0x00 in ascending order of memory address loads the destination register with: 





° 0x00000001 if the current endianness is little-endian. 


° 0x01000000 if the current endianness is big-endian. 
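
The two results in this note can be reproduced off-target with a short sketch. It models only how the same four bytes are interpreted under each byte order, and does not execute an LDR instruction. The variable names are illustrative.

```python
# The four bytes from the example, in ascending order of memory address.
word_bytes = bytes([0x01, 0x00, 0x00, 0x00])

# Interpret the same bytes as a 32-bit word under each byte order.
as_little_endian = int.from_bytes(word_bytes, "little")
as_big_endian = int.from_bytes(word_bytes, "big")

print(f"little-endian: 0x{as_little_endian:08X}")
print(f"big-endian:    0x{as_big_endian:08X}")
```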





The PSTATE.M[4] bit is the Execution state bit. When read by an MRS instruction in AArch32 state, this bit always 
reads as 1. When written by an MSR (register) instruction or MSR (immediate) instruction, writing a value other 
than 1 is an illegal change to the PSTATE.M field. See Illegal changes to PSTATE.M on page G1-3809. 


The CPS instruction 
The A32 and T32 instruction sets both include an instruction to manipulate PSTATE.{A, I, F} and PSTATE.M: 


CPSIE <iflags> {, #<mode>} 
Sets the specified PSTATE.{A, I, F} exception masks to 0, enabling the exception, and optionally 
changes to the specified mode. 


CPSID <iflags> {, #<mode>} 
Sets the specified PSTATE.{A, I, F} exception masks to 1, disabling the exception, and optionally 
changes to the specified mode. 


CPS #<mode> 
Changes to the specified mode without affecting the PSTATE.{A, I, F} exception masks. 


The CPS instruction is unconditional. For more information, see CPS, CPSID, CPSIE on page F5-2645. 


The SETEND instruction 


The A32 and T32 instruction sets both include an instruction to manipulate PSTATE.E: 
SETEND BE Sets PSTATE.E to 1, for big-endian operation. 
SETEND LE Sets PSTATE.E to 0, for little-endian operation. 


The SETEND instruction is unconditional. For more information, see SETEND on page F5-2966. ARM deprecates use 
of the SETEND instruction. 


G1.10.2 The Saved Program Status Registers (SPSRs) 


On taking an exception, PSTATE is preserved in the SPSR of the mode to which the exception is taken. The SPSRs 
are described in Saved Program Status Registers (SPSRs) on page G1-3803. 


G1.10.3 Illegal changes to PSTATE.M 


In AArch32 PE modes other than User mode, MSR and CPS instructions can explicitly change PSTATE.M. The 
following changes to PSTATE.M by MSR or CPS instructions are illegal: 


° A change to an encoding that Table G1-5 on page G1-3796 does not show. 
° A change to a mode that is not implemented. 


° A change to a mode that is not accessible from the context the MSR or CPS instruction is executed in, as follows: 
— A change to a mode that would cause entry to a higher Exception level. 
— When executing in Non-secure state, a change to Monitor mode. 
— When executing in Secure EL1, a change to Monitor mode when EL3 is using AArch64. 
— A change to Hyp mode from any other mode. 
— A change from Hyp mode to any other mode. 
— When the value of HCR.TGE is 1, attempting to change from Monitor mode to a Non-secure PL1 
mode, see Trapping of general exceptions to Hyp mode on page K1-5476. 


On executing an instruction that attempts an illegal change to PSTATE.M: 


° PSTATE.M is unchanged, and the current mode remains unchanged. 
° PSTATE.IL is set to 1. 
° All other PSTATE fields are written to as normal. 

Note 





For the PSTATE fields that MSR and CPS instructions update, see the instruction descriptions: 
° MSR (register) on page F5-2842. 

° MSR (immediate) on page F5-2840. 

° CPS, CPSID, CPSIE on page F5-2645. 





When the value of PSTATE.IL is 1, any attempt to execute any instruction results in an Illegal Execution state 
exception. See The Illegal Execution state exception on page G1-3837. 
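
The following sketch models part of this behavior. It is deliberately partial: it checks only for an unknown encoding, an unimplemented mode, a change to Monitor mode from Non-secure state, and a change into or out of Hyp mode. It omits the rules that depend on Exception levels and HCR.TGE, and assumes the current mode is not User mode. The mode encodings are the commonly documented AArch32 values and are an assumption here, because Table G1-5 is not reproduced in this section. All function and variable names are hypothetical.

```python
# Assumed mode encodings (Table G1-5 is not reproduced in this section).
MODE_ENCODINGS = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "Supervisor",
    0b10110: "Monitor", 0b10111: "Abort", 0b11010: "Hyp",
    0b11011: "Undefined", 0b11111: "System",
}


def is_illegal_mode_change(current_m, target_m, non_secure, implemented=None):
    """Partial check covering only some of the rules listed above."""
    implemented = set(MODE_ENCODINGS) if implemented is None else implemented
    if target_m not in MODE_ENCODINGS:      # encoding not shown in the table
        return True
    if target_m not in implemented:         # mode not implemented
        return True
    current, target = MODE_ENCODINGS[current_m], MODE_ENCODINGS[target_m]
    if non_secure and target == "Monitor":  # Monitor mode from Non-secure state
        return True
    if (current == "Hyp") != (target == "Hyp"):  # into or out of Hyp mode
        return True
    return False


def attempt_mode_write(pstate, target_m, non_secure):
    """Return a new PSTATE dict after an attempted write to PSTATE.M."""
    new_pstate = dict(pstate)
    if is_illegal_mode_change(pstate["M"], target_m, non_secure):
        new_pstate["IL"] = 1        # PSTATE.M unchanged, PSTATE.IL set to 1
    else:
        new_pstate["M"] = target_m
    return new_pstate
```

In this model, an attempt to change from Supervisor mode to Hyp mode leaves M unchanged and sets IL to 1, after which any attempt to execute an instruction would result in an Illegal Execution state exception.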





Note 
° The PE ignores writes to PSTATE.M when executing at PL0. 
° In ARMv7, an instruction that attempts to make an illegal change to PSTATE.M is UNPREDICTABLE. 





G1.10.4 Pseudocode description of PSTATE operations 


The CPSRWriteByInstr() function is used by the MSR (register) and MSR (immediate) instructions to update 
PSTATE. 


The SetPSTATEFromPSR() function updates PSTATE from a CPSR or SPSR. 


Chapter J1 ARMv8 Pseudocode defines these functions. 


G1.11 Instruction set states 


The instruction set states are described in Chapter E2 The AArch32 Application Level Memory Model and 
application level operations on them are described there. This section supplies more information about how they 
interact with system level functionality, in the sections: 


° Exceptions and instruction set state. 


° Unimplemented instruction sets. 

















G1.11.1 Exceptions and instruction set state 
If an exception is taken to an EL1 mode, the SCTLR.TE bit for the Security state the exception is taken to determines 
the instruction set state that handles the exception, and if necessary, the PE changes to this instruction set state on 
exception entry. 
If the exception is taken to Hyp mode, the HSCTLR.TE bit determines the instruction set state that handles the 
exception, and if necessary, the PE changes to this instruction set state on exception entry. 
On coming out of reset, if the highest implemented Exception level is using AArch32: 
° If the highest implemented Exception level is EL2, the PE starts execution in Hyp mode, in the instruction 
set state determined by the reset value of HSCTLR.TE. 
° Otherwise, the PE starts execution in Supervisor mode, in the instruction set state determined by the reset 
value of SCTLR.TE. If the implementation includes EL3, this execution is in Secure Supervisor mode. 
For more information about exception entry, see Overview of exception entry on page G1-3819. 
G1.11.2 Unimplemented instruction sets 
The PSTATE.T bit defines the current instruction set state, see Process state, PSTATE on page E1-2294. 
In the ARMv8 architecture there is no support for the hardware acceleration of Java bytecodes, and the Jazelle 
Instruction set state is obsolete. Every AArch32 implementation must support the Trivial Jazelle implementation 
described in Trivial implementation of the Jazelle extension. 
Note 
In previous versions of the ARM architecture, the PSTATE.{J, T} bits determined the Instruction set state. In 
ARMv8, PSTATE.J is RES0. 
Trivial implementation of the Jazelle extension 
ARMv8 requires that the implementation of AArch32 state includes the trivial Jazelle implementation. 
In a trivial implementation of the Jazelle extension: 
° At EL1, EL2, or EL3, if the Exception level is using AArch32: 
— The JMCR and JOSCR are RAZ/WI. 
— The JIDR is a RAZ read-only register. 
° At EL0 when EL0 is using AArch32: 
— It is IMPLEMENTATION DEFINED whether the JMCR and JOSCR are RAZ/WI or UNDEFINED. 
— It is IMPLEMENTATION DEFINED whether JIDR is RAZ or UNDEFINED. 
° The BXJ instruction behaves identically to the BX instruction in all circumstances. 
Note 
This is consistent with the JMCR.JE bit being RAZ, and means that the A32 and T32 instruction sets do not 
provide any mechanism for attempting to enter Jazelle state. 
° Jazelle state, as defined in previous versions of the ARM architecture, is an unimplemented instruction set 
state. 


These requirements ensure that operating systems that support an EJVM execute correctly. 


A trivial implementation is not required to extend the PC to 32 bits, that is, it can implement PC[0] as RAZ/WI. 





Note 


This is because the only way that PC[0] is visible in A32 or T32 state is as a result of an exception occurring during 
Jazelle state execution, and Jazelle state execution cannot occur on a trivial implementation. 

























G1.12 Handling exceptions that are taken to an Exception level using AArch32 
An exception causes the PE to suspend program execution to handle an event, such as an externally generated 
interrupt or an attempt to execute an undefined instruction. Exceptions can be generated by internal and external 
sources. 
Normally, when an exception is taken the PE state is preserved immediately, before handling the exception. This 
means that, when the event has been handled, the original state can be restored and program execution resumed from 
the point where the exception was taken. 
More than one exception might be generated at the same time, and a new exception can be generated while the PE 
is handling an exception. 
The following sections describe exception handling: 
° Exception vectors and the exception base address. 
° Exception prioritization for exceptions taken to AArch32 state on page G1-3816. 
° Overview of exception entry on page G1-3819. 
° PE mode for taking exceptions on page G1-3822. 
° PE state on exception entry on page G1-3826. 
° Routing exceptions from Non-secure EL0 to EL2 on page G1-3828. 
° Routing debug exceptions to EL2 on page G1-3833. 
° Exception return to an Exception level using AArch32 on page G1-3834. 
Asynchronous exception behavior for exceptions taken from AArch32 state on page G1-3839 gives a full description 
of asynchronous exception handling, for exceptions taken asynchronously from AArch32 state. 
Note 
Because of the common model for handling exceptions, the current section requires some understanding of the 
asynchronous exception behaviors described in Asynchronous exception behavior for exceptions taken from 
AArch32 state on page G1-3839. 
AArch32 state exception descriptions on page G1-3849 then describes each exception. 
G1.12.1 Exception vectors and the exception base address 
When an exception is taken, PE execution is forced to an address that corresponds to the type of exception. This 
address is called the exception vector for that exception. The vectors for the different types of exception form a 
vector table. 
Note 
There are significant differences in the sets of exception vectors for exceptions taken to an Exception level that is 
using AArch32 and for exceptions taken to an Exception level that is using AArch64. This part of this manual 
describes only how exceptions are taken to an Exception level that is using AArch32. So, for example, when 
executing at EL1 or EL0, an exception might be generated that must be taken to EL3. In this case: 
° If EL3 is using AArch32 then the exception is taken as described in this chapter, using the exception vectors 
described in this section. 
° If EL3 is using AArch64 then the exception is taken as described in Chapter D1 The AArch64 System Level 
Programmers’ Model using the exception vectors described in Exception vectors on page D1-1522. 
AArch32 state defines exception vector tables for exceptions taken to EL2 and EL3 when those Exception levels 
are using AArch32. Those vector tables are not used when the corresponding Exception levels are using AArch64. 
A set of exception vectors for an Exception level that is using AArch32 comprises eight consecutive word-aligned 
memory addresses, starting at an exception base address. These eight vectors form an AArch32 vector table. 
The number of possible exception base addresses, and therefore the number of vector tables, depends on the 
implemented Exception levels, as follows: 


Implementation that does not include EL3 


Any implementation that does not include EL3 must include the following AArch32 vector table if 
EL1 can use AArch32: 


An exception table for exceptions taken to EL1 modes other than System mode. This is the 
EL1 vector table, and is in the address space of the PL1&0 translation regime. 


Note 


Exceptions cannot be taken to System mode. 





For this vector table, the VBAR holds the exception base address. 


Implementation that includes EL2 


Any implementation that includes EL2 must include the following additional AArch32 vector table 
if EL2 can use AArch32: 


An exception table for exceptions taken to Hyp mode. This is the Hyp vector table, and is in 
the address space of the Non-secure PL2 translation regime. 


For this vector table, HVBAR holds the exception base address. 


Implementation that includes EL3 


Any implementation that includes EL3 must include the following AArch32 vector tables: 


If EL3 can use AArch32, a vector table for exceptions taken to Secure Monitor mode. This 
is the Monitor vector table, and is in the address space of the Secure PL1&0 translation 
regime. 

For this vector table, MVBAR holds the exception base address. 


If Secure EL1 can use AArch32, a vector table for exceptions taken to Secure privileged 
modes other than Monitor mode and System mode. This is the Secure vector table, and is in 
the address space of the Secure PL1&0 translation regime. 


For this vector table, the Secure VBAR holds the exception base address. 


If Non-secure EL1 can use AArch32, a vector table for exceptions taken to Non-secure PL1 
modes. This is the Non-secure vector table, and is in the address space of the Non-secure 
PL1&0 translation regime. 


For this vector table, the Non-secure VBAR holds the exception base address. 
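

As an informal illustration of the rules above, the following sketch lists the AArch32 vector tables and the register holding each exception base address for a given configuration. The function name and its boolean parameters are hypothetical. Each parameter mirrors one condition stated in the text, and the sketch does not check that a combination of parameters is architecturally permitted.

```python
def aarch32_vector_tables(have_el2, have_el3, el1_aarch32=True, el2_aarch32=True,
                          el3_aarch32=True, secure_el1_aarch32=True,
                          nonsecure_el1_aarch32=True):
    """Return (vector table, base address register) pairs for a configuration."""
    tables = []
    if not have_el3 and el1_aarch32:
        tables.append(("EL1 vector table", "VBAR"))
    if have_el2 and el2_aarch32:
        tables.append(("Hyp vector table", "HVBAR"))
    if have_el3:
        if el3_aarch32:
            tables.append(("Monitor vector table", "MVBAR"))
        if secure_el1_aarch32:
            tables.append(("Secure vector table", "Secure VBAR"))
        if nonsecure_el1_aarch32:
            tables.append(("Non-secure vector table", "Non-secure VBAR"))
    return tables


# Example: an implementation that includes EL2 but not EL3.
for name, register in aarch32_vector_tables(have_el2=True, have_el3=False):
    print(f"{name}: base address held in {register}")
```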


The following subsections give more information: 


The vector tables and exception offsets. 


Pseudocode determination of the exception base address on page G1-3816. 


The vector tables and exception offsets 


Table G1-6 on page G1-3814 defines the AArch32 vector table entries. In this table: 


° The Hyp mode column defines the vector table entries for exceptions taken to Hyp mode. 

° The Monitor mode column defines the vector table entries for exceptions taken to Monitor mode. 

° The Secure and Non-secure columns define the Secure and Non-secure vector table entries, that are used for 
exceptions taken to modes other than Monitor mode, Hyp mode, System mode, and User mode. Table G1-7 
on page G1-3814 shows the mode to which each of these exceptions is taken. Each of these modes is 
described as the default mode for taking the corresponding exception. 


Note 


Exceptions cannot be taken to System mode or User mode. 


For more information about determining the mode to which an exception is taken, see PE mode for taking exceptions 
on page G1-3822. 


When EL2 is using AArch32, it provides a number of additional exceptions, some of which are not shown explicitly 
in the vector tables. For more information, see Offsets of AArch32 exceptions provided by EL2 on 
page G1-3815. 


Table G1-6 The AArch32 vector tables 


          Vector tables 
Offset    Hyp (a)                                Monitor (b)            Secure (c)               Non-secure (c) 

0x00      Not used                               Not used               Not used (d)             Not used 
0x04      Undefined Instruction, from Hyp mode   Monitor Trap           Undefined Instruction    Undefined Instruction 
0x08      Hypervisor Call, from Hyp mode         Secure Monitor Call    Supervisor Call          Supervisor Call 
0x0C      Prefetch Abort, from Hyp mode          Prefetch Abort         Prefetch Abort           Prefetch Abort 
0x10      Data Abort, from Hyp mode              Data Abort             Data Abort               Data Abort 
0x14      Hyp Trap, or Hyp mode entry (e)        Not used               Not used                 Not used 
0x18      IRQ interrupt                          IRQ interrupt          IRQ interrupt            IRQ interrupt 
0x1C      FIQ interrupt                          FIQ interrupt          FIQ interrupt            FIQ interrupt 

a. Non-secure state only. Implemented only if the implementation includes EL2 and EL2 can use AArch32. 

b. Secure state only. Implemented only if the implementation includes EL3 and EL3 can use AArch32. 

c. If the implementation does not include EL3 then there is a single vector table for exceptions taken to EL1 when EL1 is using AArch32. 
That table holds the vectors shown in the Secure column of this table. 

d. In previous versions of the architecture, this entry has been used for the Reset vector, meaning the address at which execution starts on 
coming out of reset. In ARMv8, the AArch32 Reset vector is IMPLEMENTATION DEFINED. An implementation might use this vector table 
entry to hold the Reset vector. 

e. See Use of offset 0x14 in the Hyp vector table on page G1-3815. 


Table G1-7 Modes for taking the exceptions shown in the Secure or Non-secure vector table 


Exception                Mode taken to 

Undefined Instruction    Undefined 
Supervisor Call          Supervisor 
Prefetch Abort           Abort 
Data Abort               Abort 
IRQ interrupt            IRQ 
FIQ interrupt            FIQ 
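

The relationship between an exception base address, the offsets in the Secure and Non-secure columns of Table G1-6, and the default modes in Table G1-7 can be sketched as follows. The function name, the dictionary names, and the base address used in the example are hypothetical. The sketch ignores any routing of the exception to Hyp mode or Monitor mode.

```python
# Offsets from the Secure and Non-secure columns of Table G1-6.
VECTOR_OFFSETS = {
    "Undefined Instruction": 0x04,
    "Supervisor Call": 0x08,
    "Prefetch Abort": 0x0C,
    "Data Abort": 0x10,
    "IRQ interrupt": 0x18,
    "FIQ interrupt": 0x1C,
}

# Default modes from Table G1-7.
DEFAULT_MODES = {
    "Undefined Instruction": "Undefined",
    "Supervisor Call": "Supervisor",
    "Prefetch Abort": "Abort",
    "Data Abort": "Abort",
    "IRQ interrupt": "IRQ",
    "FIQ interrupt": "FIQ",
}


def vector_for(exception, base_address):
    """Return (vector address, default mode) for an exception name."""
    address = (base_address + VECTOR_OFFSETS[exception]) & 0xFFFFFFFF
    return address, DEFAULT_MODES[exception]


# Hypothetical exception base address of 0x00008000.
address, mode = vector_for("IRQ interrupt", 0x00008000)
print(f"IRQ vector at 0x{address:08X}, taken to {mode} mode")
```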





For more information about use of the vector tables see Overview of exception entry on page G1-3819. 


Offsets of AArch32 exceptions provided by EL2 


EL2 provides the following exceptions. When EL2 is using AArch32, these exceptions are taken to Hyp mode, and 
the PE enters the handlers for these exceptions using the following vector table entries shown in Table G1-6 on 
page G1-3814: 


Hypervisor Call 


If taken from Hyp mode, shown explicitly in the Hyp mode vector table. Otherwise, see Use of offset 
0x14 in the Hyp vector table. 


Hyp Trap Shown explicitly in the Hyp mode vector table. 
Virtual Abort Entered through the Data Abort vector in the Non-secure vector table. 
Virtual IRQ Entered through the IRQ vector in the Non-secure vector table. 


Virtual FIQ Entered through the FIQ vector in the Non-secure vector table. 


Note 


Virtual exceptions when an implementation includes EL2 on page G1-3839 gives more information about the virtual 
exceptions. 








Use of offset 0x14 in the Hyp vector table 


The vector at offset 0x14 in the Hyp vector table is used for exceptions that cause entry to Hyp mode. This means it 
is: 


° Always used for the Hyp Trap exception. 


° Used for any Hypervisor Call exception that is taken from a mode other than Hyp mode. 

° Used for any Supervisor Call exception that is taken from Non-secure User mode when the value of 
HCR.TGE is 1. 

° Used for any Undefined Instruction exception that is taken from Non-secure User mode when the value of 
HCR.TGE is 1. 

° Used for any Prefetch Abort exception that is: 


— Taken from Non-secure User mode when the value of HCR.TGE is 1. 
— Generated by a Debug exception from Non-secure state when the value of HDCR.TDE is 1. 


— Generated by a stage 2 abort on an address translation instruction. 


° Used for any Data Abort exception that is: 
— Taken from Non-secure User mode when the value of HCR.TGE is 1. 
— Generated by an SError interrupt from Non-secure state when the value of HCR.AMO is 1. 
— Generated by a Watchpoint exception from Non-secure state when the value of HDCR.TDE is 1. 


— Generated by a stage 2 abort on an address translation operation. 





Note 
Offset 0x14 is never used for the following exceptions: 
° IRQ exceptions. 
° Virtual IRQ exceptions. 
° FIQ exceptions. 
° Virtual FIQ exceptions. 


Virtual exceptions are never taken to Hyp mode. 
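

The selection of an offset in the Hyp vector table can be summarized in a short sketch. It assumes that the decision to take the exception to Hyp mode has already been made, and it covers only the cases described in this subsection and in the Hyp column of Table G1-6. The function name and the string labels are hypothetical.

```python
# Entries in the Hyp column of Table G1-6 that apply when taken from Hyp mode.
FROM_HYP_OFFSETS = {
    "Undefined Instruction": 0x04,
    "Hypervisor Call": 0x08,
    "Prefetch Abort": 0x0C,
    "Data Abort": 0x10,
}


def hyp_vector_offset(exception, taken_from_hyp):
    """Offset used in the Hyp vector table for an exception taken to Hyp mode."""
    if exception == "IRQ interrupt":
        return 0x18                      # offset 0x14 is never used for IRQ
    if exception == "FIQ interrupt":
        return 0x1C                      # offset 0x14 is never used for FIQ
    if exception == "Hyp Trap":
        return 0x14                      # always uses offset 0x14
    if taken_from_hyp:
        if exception not in FROM_HYP_OFFSETS:
            raise ValueError("case not covered by this sketch")
        return FROM_HYP_OFFSETS[exception]
    if exception in FROM_HYP_OFFSETS or exception == "Supervisor Call":
        return 0x14                      # entry to Hyp mode from another mode
    raise ValueError("case not covered by this sketch")
```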





For more information, see PE mode for taking exceptions on page G1-3822. 


Pseudocode determination of the exception base address 
For an exception taken to a PL1 mode, the ExcVectorBase() function determines the exception base address. 


The ExcVectorBase() function is defined in Chapter J1 ARMv8 Pseudocode. 


Note 


The PL1 modes to which exceptions can be taken are Supervisor mode, Undefined mode, Abort mode, IRQ mode, 
and FIQ mode. In Non-secure state, and in Secure state when EL3 is using AArch64, these are EL1 modes. 
However, in Secure state when EL3 is using AArch32, these are EL3 modes. For more information see Security 
state, Exception levels, and AArch32 execution privilege on page G1-3792. 








G1.12.2 Exception prioritization for exceptions taken to AArch32 state 


The following sections describe the ARMv8 requirements for the prioritization of synchronous exceptions, and the 
limits on when asynchronous exceptions can be taken: 


° Synchronous exception prioritization for exceptions taken to AArch32 state. 

° Architectural requirements for taking asynchronous exceptions on page G1-3818. 

See also: 

° AArch32 state prioritization of synchronous aborts from a single stage of address translation on 


page G4-4120, for information about: 
— The prioritization of aborts on a single memory access in a VMSA implementation. 


— The prioritization of exceptions generated during address translation. 

° Debug state entry and debug event prioritization on page H2-4847 for information about the relative 
prioritization of exceptions and the debug events that cause entry to Debug state. 

Synchronous exception prioritization for exceptions taken to AArch32 state 


In principle, any single instruction can generate a number of different synchronous exceptions, between the fetching 
of the instruction, its decode, and eventual execution. This section describes the prioritization of such exceptions 
when they are taken to an Exception level that is using AArch32. 





Note 


° An exception that is taken to an Exception level that is using AArch32 must have been taken from an 
Exception level that is using AArch32. 


° The priority numbering in this list only shows the relative priorities of exceptions taken to an Exception level 
that is using AArch32. This numbering has no global significance and, for example, does not correlate with 
the equivalent AArch64 list in Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548. 





For an exception that is taken to an Exception level that is using AArch32, exceptions are prioritized as follows, 
where 1 is the highest priority. 


1. PC alignment fault exceptions. A PC alignment fault exception can only be taken to an Exception level that 
is using AArch32 as a result of: 


° The CONSTRAINED UNPREDICTABLE handling of a branch to an unaligned address, see Branching to an 
unaligned PC on page K1-5458. 


° Exiting from Debug state to AArch32 specifying an unaligned PC value, see Exiting Debug state on 
page H2-4880. 


A PC alignment fault exception that is taken to an Exception level that is using AArch32 is reported as a 
Prefetch Abort exception, see Prefetch Abort exception reporting a PC alignment fault exception on 
page G1-3857. 


2. Prefetch Abort exceptions. See Prefetch Abort exception on page G1-3856 and AArch32 state prioritization 
of synchronous aborts from a single stage of address translation on page G4-4120. 


3. Breakpoint exceptions or Address Matching Vector Catch exceptions. See: 
° Breakpoint exceptions on page G2-3938. 
° Vector Catch exceptions on page G2-3975. 





Note 


An Exception Trapping Vector Catch exception is generated on exception entry for an exception that has been 
prioritized as described in this section. This means that it does not have its own entry in this list. 





4. Illegal Execution state exceptions. See The Illegal Execution state exception on page G1-3837. 


5. Exceptions taken from EL1 to EL2 because of one of the following configuration settings: 
° HSTR.Tn. 
° HCR.TIDCP. 


6. Undefined Instruction exceptions that occur as a result of one or more of the following: 

° An attempt to execute an unallocated instruction encoding, including an encoding for an instruction 
that is not implemented in the PE implementation. 

° An attempt to execute an instruction that is defined never to be accessible at the current Exception 
level regardless of any enables or traps. 

° Debug state execution of an instruction encoding that is unallocated in Debug state. 

° Non-debug state execution of an instruction encoding that is unallocated in Non-debug state. 

° Execution of an HVC instruction, when HVC instructions are disabled by SCR.HCE or HCR.HCD. 

° Execution of an HLT instruction when HLT instructions are disabled by EDSCR.HDE. 

° In Debug state: 
— Execution of a DCPS1 instruction in Non-secure EL0 when HCR.TGE is 1. 
— Execution of a DCPS2 instruction in EL1 or EL0 when SCR.NS is 0 or when EL2 is not 
implemented. 
— Execution of a DCPS3 instruction when EDSCR.SDD is 1 or when EL3 is not implemented. 
— When the value of EDSCR.SDD is 1, execution in EL2, EL1, or EL0 of an instruction that is 
trapped to EL3. 

° Execution of an instruction that is UNDEFINED as a result of any of: 
— Being in an IT block when SCTLR.ITD is 1, or when HSCTLR.ITD is 1. 
— Executing a SETEND instruction when SCTLR.SED is 1, or when HSCTLR.SED is 1. 
— Executing a CP15DMB, CP15DSB, or CP15ISB barrier instruction when SCTLR.CP15BEN is 
0, or when HSCTLR.CP15BEN is 0. 

See Disabling or enabling PL0 and PL1 use of AArch32 deprecated functionality on page G1-3888 
and Disabling or enabling EL2 use of AArch32 deprecated functionality on page G1-3897. 

° Execution of an instruction that is UNDEFINED because at least one of FPSCR.{Stride, Len} is nonzero, 
when programming these bits to nonzero values is supported. See Floating-point exception traps on 
page G1-3883. 


7. Exceptions taken to EL1, or taken to EL2 because the value of HCR.TGE is 1, that are generated because of 
configurable access to instructions, and that are not covered by any of priorities 1-6. 


8. Exceptions taken from EL0 to EL2 because of one of the following configuration settings: 
° HSTR.Tn. 
° HCR.TIDCP. 


9. Exceptions taken to EL2 because of configuration settings in the HCPTR. 


10. Exceptions taken to EL2 because of one of the following configuration settings: 
° Any setting in HCR, other than the TIDCP bit. 
° Any setting in CNTHCTL. 
° Any setting in HDCR. 


11. Exceptions taken to EL2 because of configurable access to instructions, and that are not covered by any of 
priorities 1-10. 


12. Exceptions caused by the SMC instruction being UNDEFINED because the value of SCR.SCD is 1. 
13. Exceptions caused by the execution of an Exception generating instruction, SVC, HVC, SMC, or BKPT. 


14. Exceptions taken to EL3 because of configuration settings in the SDCR. These might be taken from EL0, 
EL1, or EL2. 


15. Exceptions taken to EL3 because of configurable access to instructions, and that are not covered by any of 
priorities 1-14. 


16. Trapped floating-point exceptions, if supported. See Floating-point exception traps on page G1-3883. 


17. Data Abort exceptions other than a Data Abort exception generated by a Synchronous external abort that was 
not generated by a translation table walk. That is, any Data Abort exception that is not covered by item 19. 
See Data Abort exception on page G1-3859 and AArch32 state prioritization of synchronous aborts from a 
single stage of address translation on page G4-4120. It is IMPLEMENTATION DEFINED whether Synchronous 
external aborts are prioritized here or as item 19. 


18. Watchpoint exceptions. See Watchpoint exceptions on page G2-3961. 


19. Data Abort exception generated by a Synchronous external abort that was not generated by a translation table 
walk, see External aborts on page G3-4014. It is IMPLEMENTATION DEFINED whether Synchronous external 
aborts are prioritized here or as item 17. 


For items 17-19, if an instruction results in more than one single-copy atomic memory access, the prioritization 
between synchronous exceptions generated on each of those different memory accesses is not defined by the 
architecture. 





Note 


Exceptions generated by a translation table walk are reported and prioritized as either a Prefetch Abort exception, 
priority 2 in this list, or a Data Abort exception, priority 17 in this list. See also AArch32 state prioritization of 
synchronous aborts from a single stage of address translation on page G4-4120. 
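

As an informal illustration of how this numbering is applied, the sketch below selects the highest priority exception from a set of candidates generated by one instruction. Only a subset of the list is modeled, and the enumeration and function names are hypothetical. The Data Abort entry stands for item 17 only. Synchronous external aborts that are not generated by a translation table walk are omitted because their position, as item 17 or item 19, is IMPLEMENTATION DEFINED.

```python
from enum import IntEnum


class SyncPriority(IntEnum):
    """Subset of the prioritization list. A lower value is a higher priority."""
    PC_ALIGNMENT_FAULT = 1
    PREFETCH_ABORT = 2
    BREAKPOINT_OR_ADDRESS_MATCHING_VECTOR_CATCH = 3
    ILLEGAL_EXECUTION_STATE = 4
    UNDEFINED_INSTRUCTION = 6
    EXCEPTION_GENERATING_INSTRUCTION = 13
    DATA_ABORT = 17
    WATCHPOINT = 18


def highest_priority(candidates):
    """Return the candidate that is taken, that is, the lowest numbered one."""
    return min(candidates)


taken = highest_priority({SyncPriority.WATCHPOINT, SyncPriority.DATA_ABORT})
print(taken.name)
```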





Architectural requirements for taking asynchronous exceptions 


The ARM architecture does not define when asynchronous exceptions are taken. The prioritization of asynchronous 
exceptions, including virtual asynchronous exceptions, is IMPLEMENTATION DEFINED. 


An asynchronous exception that is pending before one of the following context synchronizing events is taken before 
the first instruction after the context synchronizing event completes its execution, provided that the pending 
asynchronous event is not masked: 





° Execution of an ISB instruction that does not fail its condition code check. 
° Exception entry. 
° Exception return. 
° Exit from Debug state. 





Note 
° If the first instruction after the context synchronizing event generates a synchronous exception, then the 
architecture does not define the order in which that synchronous exception and the asynchronous exception 
are taken. 
° The ISR identifies any pending asynchronous exceptions. 
° Interrupts are masked when the PE is in Debug state, and therefore this list of context synchronizing events 
does not include the DCPS and DRPS instructions. 





In the absence of a specific requirement to take an asynchronous exception, the only requirement of the architecture 
is that an unmasked asynchronous exception is taken in finite time. 


Note 


The taking of an unmasked asynchronous exception in finite time must occur with all code sequences, including 
with a sequence that consists of unconditional loops. 








If an unmasked interrupt was pending but is changed to not pending before it is taken, then the architecture permits 
the interrupt to be taken, but does not require this to happen. If the interrupt is taken then it must be taken before the 
first Context synchronization event after the interrupt was changed to not pending. 


PSTATE includes a mask bit for each type of asynchronous exception. Setting one of these bits to 1 can prevent the 
corresponding asynchronous exception from being taken, although when the PE is in Non-secure state other controls 
can modify the effect of these bits. For more information, see Asynchronous exception behavior for exceptions taken 
from AArch32 state on page G1-3839. 


Taking an exception sets an exception-dependent subset of these mask bits. 


Note 


In some contexts, the PSTATE.{A, I, F} bits mask the taking of asynchronous exceptions. The way these are set on 
exception entry, described in PSTATE.{A, I, F, M} values on exception entry on page G1-3827, can prevent an 
exception handler being interrupted by an asynchronous exception. 











G1.12.3 Overview of exception entry 
There are some significant differences between the handling of exceptions taken to Hyp mode and exceptions taken 
to other modes. Because Hyp mode is the EL2 mode, this means that the following descriptions sometimes 
distinguish between the EL2 mode and the non-EL2 modes. 
On taking an exception to an Exception level that is using AArch32: 
1. The hardware determines the mode to which the exception must be taken, see PE mode for taking exceptions 
on page G1-3822. 
2. A link value, indicating the preferred return address for the exception, is saved. This is a possible return 
address for the exception handler, and depends on: 
° The exception type. 
° Whether the exception is taken to the EL2 mode or to a non-EL2 mode. 
° For some exceptions taken to non-EL2 modes, the instruction set state when the exception was taken. 
Where the link value is saved depends on whether the exception is taken to the EL2 mode. 
For more information see Link values saved on exception entry on page G1-3821. 
3. The value of PSTATE is saved in the SPSR for the mode to which the exception must be taken. The value
saved in SPSR.IT[7:0] is always correct for the preferred return address.
4. In an implementation that includes EL3, when EL3 is using AArch32:
° If the exception is taken from Monitor mode, SCR.NS is cleared to 0.
° Otherwise, taking the exception leaves SCR.NS unchanged.
When EL3 is using AArch64, Monitor mode is not available.


5. PSTATE is updated with new context information for the exception handler. This includes:
° Setting PSTATE.M to the PE mode to which the exception is taken.
° Setting the appropriate PSTATE mask bits. This can disable the corresponding exceptions, preventing
uncontrolled nesting of exception handlers.
° Setting the instruction set state to the state required for exception entry.
° Setting the endianness to the required value for exception entry.
° Clearing the PSTATE.IT[7:0] bits to 0.


For more information, see PE state on exception entry on page G1-3826. 


6. The appropriate exception vector is loaded into the PC, see Exception vectors and the exception base address 
on page G1-3812. 


7. Execution continues from the address held in the PC. 


For an exception taken to a non-EL2 mode, on exception entry, the exception handler can use the SRS instruction to 
store the return state onto the stack of any mode at the same Exception level and in the same Security state, and can 
use the CPS instruction to change mode. For more information about the instructions, see SRS, SRSDA, SRSDB, 
SRSIA, SRSIB on page F5-3018 and CPS, CPSID, CPSIE on page F5-2645. 


Later sections of this chapter describe each of the possible exceptions, and each of these descriptions includes a 
pseudocode description of the PE state changes on taking that exception. Table G1-8 gives an index to these 
descriptions: 


Table G1-8 Pseudocode descriptions of exception entry for exceptions taken to AArch32 state

Exception              Description of exception entry
Reset                  Pseudocode descriptions of reset on page G1-3871
Undefined Instruction  Pseudocode description of taking the Undefined Instruction exception on page G1-3851
Hyp Trap               Pseudocode description of taking the Hyp Trap exception on page G1-3853
Monitor Trap           Pseudocode description of taking the Monitor Trap exception on page G1-3852
Supervisor Call        Pseudocode description of taking the Supervisor Call exception on page G1-3854
Secure Monitor Call    Pseudocode description of taking the Secure Monitor Call exception on page G1-3855
Hypervisor Call        Pseudocode description of taking the Hypervisor Call exception on page G1-3856
Prefetch Abort         Pseudocode description of taking the Prefetch Abort exception on page G1-3858
Data Abort             Pseudocode description of taking the Data Abort exception on page G1-3861
Virtual Abort          Pseudocode description of taking the Virtual SError interrupt exception on page G1-3862
IRQ                    Pseudocode description of taking the IRQ exception on page G1-3864
Virtual IRQ            Pseudocode description of taking the Virtual IRQ exception on page G1-3865
FIQ                    Pseudocode description of taking the FIQ exception on page G1-3866
Virtual FIQ            Pseudocode description of taking the Virtual FIQ exception on page G1-3867

The following sections give more information about the PE state changes, for different architecture
implementations. However, you must refer to the pseudocode for a full description of the state changes:


PE mode for taking exceptions on page G1-3822. 
PE state on exception entry on page G1-3826. 


Link values saved on exception entry 


On exception entry, a link value, for use on return from the exception, is saved. This link value is based on the
preferred return address for the exception, as shown in Table G1-9:


Table G1-9 Exception return addresses for exceptions taken to AArch32 state

Exception                   Preferred return address                              Taken to a mode at
Undefined Instruction       Address of the UNDEFINED instruction                  Non-EL2(a), or EL2(c)
Hyp Trap                    Address of the trapped instruction                    EL2 only(c)
Monitor Trap                Address of the trapped instruction                    EL3 only
Supervisor Call             Address of the instruction after the SVC instruction  Non-EL2(a), or EL2(c)
Secure Monitor Call         Address of the instruction after the SMC instruction  EL3(b), and only in Secure state
Hypervisor Call             Address of the instruction after the HVC instruction  EL2 only(c)
Prefetch Abort              Address of aborted instruction fetch                  Non-EL2(a), or EL2(c)
Data Abort                  Address of instruction that generated the abort       Non-EL2(a), or EL2(c)
Virtual Abort               Address of next instruction to execute                EL1, and only in Non-secure state
IRQ or FIQ                  Address of next instruction to execute                Non-EL2(a), or EL2(c)
Virtual IRQ or Virtual FIQ  Address of next instruction to execute                EL1, and only in Non-secure state

a. EL1 if the exception is taken to a Non-secure mode, or is taken to a Secure mode when EL3 is using AArch64. EL3 if the exception is taken
to a Secure mode when EL3 is using AArch32.

b. A Secure Monitor Call exception is taken to EL3, and therefore is taken to AArch32 state only if EL3 is using AArch32, in which case it is
taken to Monitor mode.

c. EL2 is implemented only in Non-secure state. Therefore, an exception can be taken to EL2 mode only if it is taken from Non-secure state.








Although Reset is described as an exception, it differs significantly from other exceptions. The architecture
has no concept of a return from a Reset and therefore it is not listed in this section.

For each exception, the preferred return address is not affected by the Exception level from which the
exception was taken.

The link value saved, and where it is saved, depend on whether the exception is taken to a non-EL2 mode, or to an
EL2 mode, as follows:


Exception taken to a non-EL2 mode 


The link value is saved in the LR for the mode to which the exception is taken. 


The saved link value is the preferred return address for the exception, plus an offset that depends on 
the instruction set state when the exception was taken, as Table G1-10 shows: 


Table G1-10 Offsets applied to Link value for exceptions taken to non-EL2 modes

                            Offset, for PE state of:
Exception                   A32     T32
Undefined Instruction       +4      +2
Monitor Trap                +4      +2
Supervisor Call             None    None
Secure Monitor Call         None    None
Prefetch Abort              +4      +4
Data Abort                  +8      +8
Virtual Abort               +8      +8
IRQ or FIQ                  +4      +4
Virtual IRQ or Virtual FIQ  +4      +4





Exception taken to an EL2 mode 


The link value is saved in the ELR_hyp Special-purpose register. 


The saved link value is the preferred return address for the exception, as shown in Table G1-9 on 
page G1-3821, with no offset. 
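
The link value rules above can be sketched as a small lookup. This is an illustrative model only, not the architectural pseudocode: the function name, the string keys, and the 32-bit wrap of the result are assumptions made for the example, and the offsets are copied from Table G1-10.

```python
# Offsets from Table G1-10, keyed by exception name, as (A32 offset, T32 offset).
# A table entry of "None" is modelled here as an offset of 0.
LINK_OFFSETS = {
    "Undefined Instruction": (4, 2),
    "Monitor Trap": (4, 2),
    "Supervisor Call": (0, 0),
    "Secure Monitor Call": (0, 0),
    "Prefetch Abort": (4, 4),
    "Data Abort": (8, 8),
    "Virtual Abort": (8, 8),
    "IRQ or FIQ": (4, 4),
    "Virtual IRQ or Virtual FIQ": (4, 4),
}


def link_value(exception, preferred_return, t32, to_el2):
    """Return (register, value) for the link value saved on exception entry.

    exception        -- exception name, as spelled in Table G1-10
    preferred_return -- preferred return address from Table G1-9
    t32              -- True if the exception was taken from T32 state
    to_el2           -- True if the exception is taken to the EL2 (Hyp) mode
    """
    if to_el2:
        # Taken to EL2: saved in ELR_hyp with no offset.
        return ("ELR_hyp", preferred_return)
    a32_offset, t32_offset = LINK_OFFSETS[exception]
    offset = t32_offset if t32 else a32_offset
    # Taken to a non-EL2 mode: saved in the LR of the target mode.
    # The 32-bit wrap is an assumption of this sketch.
    return ("LR", (preferred_return + offset) & 0xFFFFFFFF)
```

For example, under these assumptions a Data Abort taken from A32 state to a non-EL2 mode with a preferred return address of 0x8000 gives a link value of 0x8008 in the LR, whereas the same abort taken to Hyp mode saves 0x8000 in ELR_hyp.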


G1.12.4 PE mode for taking exceptions 


The following principles determine the Exception level to which an exception is taken, and if that Exception level 
is using AArch32, the PE mode to which the exception is taken: 


An exception cannot be taken to the EL0 mode.


An exception is taken either: 

— To the Exception level at which the PE was executing when it took the exception. 

— To a higher Exception level.

This means that, in Secure state: 

— When EL3 is using AArch32, an exception is always taken to an EL3 mode. 

— When EL3 is using AArch64, an exception that is taken to AArch32 state is taken to an EL1 mode. 


Configuration options and other features provided by EL2 and EL3 can determine the mode to which some 
exceptions are taken, as follows: 


In an implementation that does not include EL2 or EL3

An exception is always taken to the default mode for that exception.

In an implementation that includes EL3

A Secure Monitor Call exception is always taken to EL3. This means:
° If EL3 is using AArch32 the exception is taken to Secure Monitor mode.
° If EL3 is using AArch64 then executing the instruction generates an exception that is taken
to EL3, see Execution of an SMC instruction from a privileged Exception level that is using
AArch32 on page G1-3824.

IRQ, FIQ, and External abort exceptions can be configured to be taken to EL3. Therefore, if EL3
is using AArch32 the exceptions are taken to Secure Monitor mode.

When EL3 is using AArch32, a Monitor Trap exception is taken to Secure Monitor mode.

Any exception taken from Secure state that is not taken to Secure Monitor mode is taken to
Secure state in the default mode for that exception. As described in Security state, Exception
levels, and AArch32 execution privilege on page G1-3792, this means it is taken to:
° An EL3 mode other than Monitor mode if EL3 is using AArch32.
° An EL1 mode if EL3 is using AArch64.


If the implementation does not include EL2, any exception taken from Non-secure state that is 
not taken to Secure Monitor mode is taken to Non-secure state to the default mode for that 
exception. The default mode will be an EL1 mode. 


In an implementation that includes EL2 


An exception taken from Non-secure state that is not taken to Secure Monitor mode is taken to 
Non-secure state and: 





° If the exception is taken from Hyp mode then it is taken to Hyp mode.
° Otherwise, the exception is either taken to Hyp mode, as described in Exceptions taken to
Hyp mode on page G1-3824, or taken to the default mode for the exception.

Note

° Hyp mode is the EL2 mode. The other modes to which an exception can be taken in
Non-secure state are EL1 modes.
° EL2 has no effect on the handling of exceptions taken from Secure state.





Table G1-7 on page G1-3814 shows the default mode to which each exception is taken. 


Asynchronous exception routing controls on page G1-3841 describes the exception routing controls provided by 
EL2 and EL3. 


Routing of aborts taken to AArch32 state on page G4-4110 gives more information about the modes to which 
memory aborts are taken. 


The possible modes for taking each exception on page G1-3825 shows all modes to which each exception might be 
taken, in any implementation. That is, it applies to implementations: 


° That include neither EL2 nor EL3.
° That include EL2 but not EL3.
° That do not include EL2 but include EL3.
° That include both EL2 and EL3.


Exceptions taken to Hyp mode

In an implementation that includes EL2 and EL3, when EL2 is using AArch32:

° Any exception taken from Hyp mode, that is not routed to EL3 by the controls described in Asynchronous
exception routing controls on page G1-3841, is taken to Hyp mode.
° The following exceptions, if taken from Non-secure state, are taken to Hyp mode:

— An abort that Routing of aborts taken to AArch32 state on page G4-4110 identifies as taken to Hyp
mode.
— A Hyp Trap exception, see EL2 configurable controls on page G1-3894.
— A Hypervisor Call exception. This is generated by executing an HVC instruction in a Non-secure mode.
— An SError interrupt exception, IRQ exception or FIQ exception that is not routed to EL3 but is
explicitly routed to Hyp mode, as described in Asynchronous exception routing controls on
page G1-3841.
— A synchronous external abort, Alignment fault, Undefined Instruction exception, or Supervisor Call
exception taken from the Non-secure EL0 mode and explicitly routed to Hyp mode, as described in
Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.

Note

A synchronous external abort can be routed to Hyp mode only if it is not routed to EL3.

— A debug exception that is explicitly routed to Hyp mode as described in Routing debug exceptions to
EL2 on page G1-3833.


Note 


The virtual exceptions cannot be taken to Hyp mode. They are always taken to a Non-secure EL1 mode. 








Security behavior in Exception levels using AArch32 when EL3 is using AArch64 


As described in The ARMv8-A security model on page G1-3789, when EL3 is using AArch64, lower Exception 
levels, in either Security state, can be using AArch32. This means software executing in those Exception levels 
might try to access AArch32 security features that are not available. The following subsections describe the 
associated behaviors: 


° Execution of an SMC instruction from a privileged Exception level that is using AArch32
° Non-secure reads of the NSACR
° Secure EL1 operations when Secure EL1 is using AArch32 on page G1-3825


Execution of an SMC instruction from a privileged Exception level that is using AArch32 


When EL3 is using AArch64, an SMC instruction executed from Secure or Non-secure EL1 using AArch32, or from 
Non-secure EL2 using AArch32 when the value of HCR.TSC is 0, generates an exception that is taken to EL3. The 
exception syndrome is reported with an EC value of 0x13, SMC instruction executed in AArch32 state, see Exception 
from SMC instruction execution in AArch32 state on page D1-1531. 


Non-secure reads of the NSACR 


The NSACR is defined as being RO from Non-secure PE modes other than User mode. When EL3 is using 
AArch64, a read of the NSACR returns a fixed value of 0x00000C00 in the following cases: 


° If the read is from a Non-secure EL1 mode when EL1 is using AArch32.
° If the read is from Hyp mode when EL2 is using AArch32.


Secure EL1 operations when Secure EL1 is using AArch32


When Secure EL1 is using AArch32 and EL3 is using AArch64:

° Any of the following operations performed in a Secure EL1 mode is trapped to Secure EL3:
— A read or write of any of the SCR, NSACR, MVBAR, and SDCR.
— Executing any of the ATS12NSOxx instructions described in ATS12NSOxx, Address translation
stages 1 and 2, Non-secure state only on page G4-4143.
— Executing an SRS instruction that would use SP_mon, see SRS, SRSDA, SRSDB, SRSIA, SRSIB on
page F5-3018.
— Executing an MRS (Banked register) or MSR (Banked register) instruction that would access SPSR_mon,
SP_mon, or LR_mon, see MRS (Banked register) on page F5-2832 and MSR (Banked register) on
page F5-2836.

For more information about these traps, including the associated exception syndromes, see Traps to EL3 of
Secure monitor functionality from Secure EL1 using AArch32 on page D1-1590.

° Writes to the CNTFRQ register are UNDEFINED.

° Any attempt to move into Monitor mode, either by an exception return or by executing a CPS or MSR
instruction, is treated as an illegal operation and is handled as described in Illegal return events from AArch32
state on page G1-3835.

Note
This functionality supports a usage model where:
° EL3 uses AArch64.
° Secure software executes in Secure EL1 using AArch32 and Secure EL0 using AArch32.
° The Non-secure state uses AArch64.








The possible modes for taking each exception 


Each of the exception descriptions in AArch32 state exception descriptions on page G1-3849 includes a subsection 
that describes the modes to which each exception can be taken. Those subsections are: 


° The PE mode to which the Undefined Instruction exception is taken on page G1-3850.
° The PE mode to which the Hyp Trap exception is taken on page G1-3852.
° The PE mode to which the Monitor Trap exception is taken on page G1-3852.
° The PE mode to which the Supervisor Call exception is taken on page G1-3853.
° The PE mode to which the Secure Monitor Call exception is taken on page G1-3855.
° The PE mode to which the Hypervisor Call exception is taken on page G1-3855.
° The PE mode to which the Prefetch Abort exception is taken on page G1-3858.
° The PE mode to which the Data Abort exception is taken on page G1-3860.
° The PE mode to which the Virtual SError interrupt exception is taken on page G1-3862.
° The PE mode to which the physical IRQ exception is taken on page G1-3863.
° The PE mode to which the Virtual IRQ exception is taken on page G1-3865.
° The PE mode to which the physical FIQ exception is taken on page G1-3866.
° The PE mode to which the Virtual FIQ exception is taken on page G1-3867.


These descriptions also show the vector offset for the exception entry for each mode. These descriptions assume 
that all Exception levels are using AArch32, meaning: 


° HCR, rather than HCR_EL2, controls the routing of exceptions to EL2.
° SCR, rather than SCR_EL3, controls the routing of exceptions to EL3.

For more information about:
° Vector offsets, see Exception vectors and the exception base address on page G1-3812.


° The routing of synchronous external aborts or SError, IRQ, and FIQ interrupt exceptions, and the virtual 
exceptions, see Asynchronous exception routing controls on page G1-3841. 


UNPREDICTABLE cases when the value of HCR.TGE is 1 


When the value of HCR.TGE is 1, exceptions that would otherwise be taken to EL1 are, instead, routed to EL2, see
Routing exceptions from Non-secure EL0 to EL2 on page G1-3828. Related to this, when the value of HCR.TGE is
1, execution in a Non-secure EL1 mode is UNPREDICTABLE. ARMv8 does not constrain this UNPREDICTABLE
behavior, but ARMv8 software that follows the ARM recommendations cannot reach this state. When following
the ARM recommendations, any attempt to move to a Non-secure EL1 mode when the value of HCR.TGE is 1 is
either:


° An illegal exception return, see Illegal return events from AArch32 state on page G1-3835.
° An illegal PE mode change, see Illegal changes to PSTATE.M on page G1-3809.


G1.12.5 PE state on exception entry 


The description of each exception includes a pseudocode description of entry to that exception, as Table G1-8 on 
page G1-3820 shows. The following sections describe the PE state changes on entering an exception, for different 
implementations and operating states. However, you must always see the exception entry pseudocode for a full 
description of the state changes on exception entry: 


° Instruction set state on exception entry. 
° PSTATE.E value on exception entry on page G1-3827. 
° PSTATE.{A, I, F, M} values on exception entry on page G1-3827. 


Note 


The descriptions in these sections assume that EL2 and EL3, that control some aspects of the routing of exceptions
taken from EL1 or EL0, are both using AArch32. If this is not the case:


° If EL2 is using AArch64:
— Controls shown as provided by the HSCTLR are provided by the SCTLR_EL2.
— Controls shown as provided by the HCR are provided by the HCR_EL2.
° If EL3 is using AArch64, controls shown as provided by the SCR are provided by the SCR_EL3.








Instruction set state on exception entry 


Exception handlers can execute in either T32 state or A32 state. On exception entry, PSTATE.T is set to the required 
value, as determined by SCTLR.TE or HSCTLR.TE, depending on the mode the exception is taken to. Table G1-11 
shows this: 


Table G1-11 PSTATE.T bit value on exception entry

Mode to which exception is taken  HSCTLR.TE  SCTLR.TE  PSTATE.T  Exception handler state
Not Hyp mode                      x          0         0         A32
                                  x          1         1         T32
Hyp mode                          0          x         0         A32
                                  1          x         1         T32

When an implementation includes EL3 and EL3 is using AArch32, SCTLR is Banked for Secure and Non-secure
states, and therefore the TE bit value might be different for Secure and Non-secure states. For an exception taken to
a PE mode other than Hyp mode, the SCTLR.TE bit for the Security state to which the exception is taken determines
the instruction set state for the exception handler. This means the instruction set state in which an exception handler
executes might depend on the Security state to which the exception is taken.


PSTATE.E value on exception entry 


PSTATE.E controls the load and store endianness for data handling. Table G1-12 shows the value to which this bit
is set on exception entry: 


Table G1-12 PSTATE.E value on exception entry

Exception mode            HSCTLR.EE  SCTLR.EE  Endianness for data loads and stores  PSTATE.E
Secure or Non-secure EL1  x          0         Little-endian                         0
                          x          1         Big-endian                            1
Hyp                       0          x         Little-endian                         0
                          1          x         Big-endian                            1





For more information, see the bit description in SPSR format for exceptions taken to AArch32 state on 
page G1-3804. 
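
Tables G1-11 and G1-12 follow the same pattern, which the following sketch models. It is illustrative only: the function and parameter names are hypothetical, and the caller is assumed to pass the SCTLR bits for the Security state to which the exception is taken.

```python
def entry_t_and_e(to_hyp, sctlr_te, sctlr_ee, hsctlr_te=0, hsctlr_ee=0):
    """Return (PSTATE.T, PSTATE.E) as set on exception entry.

    When to_hyp is True the Hyp rows of Tables G1-11 and G1-12 apply, and the
    HSCTLR bits are used. Otherwise the SCTLR bits are used.
    """
    if to_hyp:
        return (hsctlr_te & 1, hsctlr_ee & 1)
    return (sctlr_te & 1, sctlr_ee & 1)


def describe(pstate_t, pstate_e):
    """Return (handler instruction set state, data endianness) as strings."""
    return ("T32" if pstate_t else "A32",
            "Big-endian" if pstate_e else "Little-endian")
```

With these assumptions, `describe(*entry_t_and_e(False, sctlr_te=1, sctlr_ee=0))` models a handler in a mode other than Hyp mode that runs in T32 state with little-endian data accesses.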


PSTATE.{A, I, F, M} values on exception entry 


On exception entry, PSTATE.M is set to the value for the mode to which the exception is taken, as described in PE 
mode for taking exceptions on page G1-3822. 


Table G1-13 on page G1-3828 shows the cases where PSTATE.{A, I, F} bits are set to 1 on an exception entry, and
how this depends on the mode and Security state to which an exception is taken. If the table entry for a particular
mode and Security state does not define a value for a PSTATE.{A, I, F} bit then that bit is unchanged by the
exception entry. In this table:


° The Exception mode column is the mode to which the exception is taken.
° The Non-secure, EL2 not implemented column applies to exceptions taken to Non-secure state in an
implementation that includes EL3 but does not include EL2.
° The All others column applies to:
— Exceptions taken to Secure state.
— Implementations that do not include EL3.
— Exceptions taken to Non-secure state in an implementation that includes EL2.


Table G1-13 PSTATE.{A, I, F} values on exception entry

                               Security state
PE mode exception is taken to  Non-secure                               Secure
Hyp                            If SCR.EA==0 then PSTATE.A is set to 1   -
                               If SCR.IRQ==0 then PSTATE.I is set to 1
                               If SCR.FIQ==0 then PSTATE.F is set to 1
Monitor                        -                                        PSTATE.A is set to 1
                                                                        PSTATE.I is set to 1
                                                                        PSTATE.F is set to 1
FIQ                            PSTATE.A is set to 1                     PSTATE.A is set to 1
                               PSTATE.I is set to 1                     PSTATE.I is set to 1
                               PSTATE.F is set to 1                     PSTATE.F is set to 1
IRQ, Abort                     PSTATE.A is set to 1                     PSTATE.A is set to 1
                               PSTATE.I is set to 1                     PSTATE.I is set to 1
Undefined, Supervisor          PSTATE.I is set to 1                     PSTATE.I is set to 1





Asynchronous exception behavior for exceptions taken from AArch32 state on page G1-3839 describes how, in some 
situations, the PSTATE.{A, I, F} bits mask the taking of SError interrupts, IRQ interrupts, and FIQ interrupts. 
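
As a reading aid, the entries of Table G1-13 can be modelled as a function that returns the set of mask bits forced to 1 on entry. This is a sketch under stated assumptions: the function name and the string mode names are hypothetical, and any bit not in the returned set is taken to be unchanged by the exception entry.

```python
def pstate_aif_set_on_entry(mode, secure, scr_ea=0, scr_irq=0, scr_fiq=0):
    """Return the PSTATE mask bits, from {"A", "I", "F"}, set to 1 on entry.

    mode   -- PE mode to which the exception is taken, for example "Hyp"
    secure -- True if the exception is taken to Secure state
    scr_*  -- values of SCR.EA, SCR.IRQ, and SCR.FIQ, used for Hyp mode only
    """
    if mode == "Hyp":
        if secure:
            return set()  # The table has no entry for Hyp mode in Secure state.
        bits = set()
        if scr_ea == 0:
            bits.add("A")
        if scr_irq == 0:
            bits.add("I")
        if scr_fiq == 0:
            bits.add("F")
        return bits
    if mode == "Monitor":
        # The table has no entry for Monitor mode in Non-secure state.
        return {"A", "I", "F"} if secure else set()
    if mode == "FIQ":
        return {"A", "I", "F"}
    if mode in ("IRQ", "Abort"):
        return {"A", "I"}
    if mode in ("Undefined", "Supervisor"):
        return {"I"}
    raise ValueError("unknown mode: " + mode)
```

For instance, in this model an exception taken to Hyp mode with SCR.EA set to 1 leaves PSTATE.A unchanged and sets only PSTATE.I and PSTATE.F.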


G1.12.6 Routing exceptions from Non-secure EL0 to EL2





Note 


The routing control described in this section permits a Non-secure state usage model where applications execute in 
User mode under a hypervisor, that executes in Hyp mode, without a Guest OS running at Non-secure EL1. This 
control applies when either: 

° EL2 is using AArch32 and the value of HCR.TGE is 1. 

° EL2 is using AArch64 and the value of HCR_EL2.TGE is 1. 





If the PE is in Non-secure User mode, any exception that would otherwise be taken to Non-secure EL1 is taken to
EL2 if either:


° EL2 is using AArch32 and the value of HCR.TGE is 1. 


In this case the exception is taken to Hyp mode, instead of to the default Non-secure mode for handling the 
exception. For more information see Exception reporting when HCR.TGE routes an exception to EL2 using 
AArch32 on page G1-3829. 


° EL2 is using AArch64 and the value of HCR_EL2.TGE is 1. 


Any exception that is routed to Secure Monitor mode or to EL3 using AArch64 is unaffected by the value of
HCR.TGE or HCR_EL2.TGE.

When TGE routing from Non-secure EL0 applies:

° When EL2 is using AArch32:
— The SCTLR.M bit is treated as 0 for all purposes other than a direct read of the SCTLR register.
— Each of the HCR.{FMO, IMO, AMO} bits is treated as 1 for all purposes other than a direct read of
the HCR register.
— Each of the HDCR.{TDE, TDA, TDRA, TDOSA} bits is treated as 1 for all purposes other than a
direct read of the HDCR register.

Note
When EL2 is using AArch64:
— The SCTLR_EL2.M bit is treated as 0 for all purposes other than a direct read of the register.
— The HCR_EL2.{FMO, IMO, AMO} bits are treated as 1 for all purposes other than a direct read of
the register.
— The MDCR_EL2.{TDE, TDA, TDRA, TDOSA} bits are treated as 1 for all purposes other than a
direct read of the register.

° An exception return to EL1 is treated as an illegal exception return, see Illegal return events from AArch32
state on page G1-3835.


° All virtual interrupts, including any IMPLEMENTATION DEFINED mechanisms for signaling virtual interrupts, 
are disabled. 
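
The "treated as" wording for EL2 using AArch32 can be illustrated by separating the stored register values from the effective values the PE acts on. The sketch below is a hypothetical model: the function name and the use of dictionaries keyed by field name are assumptions, and a direct register read would still return the stored values.

```python
def effective_controls(sctlr_m, hcr, hdcr, tge_routing):
    """Return the effective (SCTLR.M, HCR fields, HDCR fields).

    hcr and hdcr map field names to their stored 0/1 values. When
    tge_routing is True, the listed fields are forced as described above.
    The input dictionaries are not modified.
    """
    eff_hcr = dict(hcr)
    eff_hdcr = dict(hdcr)
    if not tge_routing:
        return sctlr_m, eff_hcr, eff_hdcr
    for name in ("FMO", "IMO", "AMO"):
        eff_hcr[name] = 1  # Treated as 1 while TGE routing applies.
    for name in ("TDE", "TDA", "TDRA", "TDOSA"):
        eff_hdcr[name] = 1  # Treated as 1 while TGE routing applies.
    return 0, eff_hcr, eff_hdcr  # SCTLR.M is treated as 0.
```

Keeping the stored and effective values separate mirrors the exception for direct reads: software that reads HCR still sees the values it wrote.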


Exception reporting when HCR.TGE routes an exception to EL2 using AArch32 

The following sections give more information about the behavior of synchronous exceptions that are routed to Hyp 
mode because the value of HCR.TGE is 1: 

° Undefined Instruction exception, when the value of HCR.TGE is 1. 

° Supervisor Call exception, when the value of HCR.TGE is 1. 

° Abort exceptions, when the value of HCR.TGE is 1 on page G1-3830.

° Reporting of exceptions routed to EL2 using AArch32 because the value of HCR.TGE is 1 on page G1-3830. 


Undefined Instruction exception, when the value of HCR.TGE is 1 


When HCR.TGE is set to 1, if the PE is executing in Non-secure User mode and attempts to execute an UNDEFINED 
instruction, it takes the Hyp Trap exception, instead of an Undefined Instruction exception. On taking the Hyp Trap 
exception, the HSR reports an unknown reason for the exception, using the EC value 0x0. For more information 
see Use of the HSR on page G4-4137. 


Supervisor Call exception, when the value of HCR.TGE is 1 


When HCR.TGE is set to 1, if the PE executes an SVC instruction in Non-secure User mode, the Supervisor Call 
exception generated by the instruction is taken to Hyp mode. 
The HSR reports that entry to Hyp mode was because of a Supervisor Call exception, and: 
° If the SVC is unconditional, the imm16 value in the HSR takes:
— A zero-extended 8-bit immediate value for the T32 SVC instruction.
Note
The only T32 encoding for SVC is a 16-bit instruction encoding.
— The bottom 16 bits of the immediate value for the A32 SVC instruction.
° If the SVC is conditional, the imm16 value in the HSR is UNKNOWN.


If the SVC is conditional, the PE takes the exception only if the instruction passes its condition code check. 


The HSR reports the exception as a Supervisor Call exception taken to Hyp mode, using the EC value 0x11. For
more information, see Use of the HSR on page G4-4137.

Note


The effect of setting HCR.TGE to 1 is to route the Supervisor Call exception to Hyp mode, not to trap the execution 
of the SVC instruction. This means that the preferred return address for the exception, when routed to Hyp mode in 
this way, is the instruction after the SVC instruction. 
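
The imm16 reporting rule for a Supervisor Call exception routed to Hyp mode can be sketched as follows. This is an illustrative model with hypothetical names, and it represents an UNKNOWN value as `None`, which is a choice made for the example only.

```python
SVC_EC = 0x11  # EC value for a Supervisor Call exception taken to Hyp mode.


def hsr_svc_imm16(t32, immediate, conditional):
    """Return the imm16 value reported in the HSR, or None if it is UNKNOWN.

    t32         -- True for the T32 SVC encoding, False for the A32 encoding
    immediate   -- immediate value encoded in the SVC instruction
    conditional -- True if the SVC instruction is conditional
    """
    if conditional:
        return None  # The imm16 value is UNKNOWN.
    if t32:
        return immediate & 0xFF  # Zero-extended 8-bit immediate.
    return immediate & 0xFFFF  # Bottom 16 bits of the A32 immediate.
```

Under these assumptions, an unconditional A32 SVC with the immediate 0x123456 is reported with an imm16 value of 0x3456.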








Abort exceptions, when the value of HCR.TGE is 1 


When the value of HCR.TGE is 1, if the PE is executing in Non-secure User mode then any abort exception that is 
not routed to Secure Monitor mode or to EL3 using AArch64 generates an exception that is taken as a Hyp Trap 
exception. Where an attempt to execute an instruction causes an abort, on taking the Hyp Trap exception, the HSR 
indicates whether a Data Abort exception or a Prefetch Abort exception caused the Hyp Trap exception entry, and 
presents a valid syndrome in the HSR. 


When SCR.EA is set to 1, external aborts and SError interrupts are routed to Secure Monitor mode, and this takes
priority over the HCR.TGE routing. For more information, see Asynchronous exception routing controls on
page G1-3841. The SCR.EA control described in that section applies to both synchronous external aborts and
asynchronous SError interrupts.


An SError interrupt that is routed to Hyp mode because the value of HCR.TGE is 1 is reported as a Data Abort 
exception routed to Hyp mode. 


The HSR reports the exception either: 
° As a Prefetch Abort exception routed to Hyp mode, using the EC value 0x20. 
° As a Data Abort exception routed to Hyp mode, using the EC value 0x24. 


For more information about the exception reporting, see Use of the HSR on page G4-4137. 


Reporting of exceptions routed to EL2 using AArch32 because the value of HCR.TGE is 1 


PL1 configurable controls on page G1-3886 describes controls that, when the value of HCR.TGE is 0, can generate
exceptions that are taken from Non-secure EL0 to EL1. When EL2 is using AArch32 and the value of HCR.TGE
is 1, the exceptions generated by these controls are routed to Hyp mode. Table G1-14 shows how the exception is
then reported in the HSR.


Table G1-14 Syndrome reporting in HSR from HCR.TGE routing of traps, disables, and enables

Control provided by PL1                          Control type(a)  Syndrome reporting in HSR
SCTLR.{nTWE, nTWI}                               T                Trapped WFI or WFE instruction, using EC value 0x01
SCTLR.{SED, ITD}                                 D                Exception for an unknown reason, using EC value 0x00
SCTLR.CP15BEN                                    E                Exception for an unknown reason, using EC value 0x00
CPACR.TRCDIS                                     T                For accesses using:
                                                                  ° MCR or MRC instructions, trapped MCR or MRC (coproc==0b1110) access, using EC value 0x05.
                                                                  ° MCRR or MRRC instructions, trapped MCRR or MRRC (coproc==0b1110) access, using EC value 0x0C.
CPACR.{cp11, cp10}                               E                Exception for an unknown reason, using EC value 0x00
FPEXC.EN                                         E                Exception for an unknown reason, using EC value 0x00
CPACR.ASEDIS                                     D                Exception for an unknown reason, using EC value 0x00
DBGDSCRext.UDCCdis                               T                For accesses using:
                                                                  ° MCR or MRC instructions, trapped MCR or MRC (coproc==0b1110) access, using EC value 0x05.
                                                                  ° LDC or STC instructions, trapped LDC or STC access, using EC value 0x06.
                                                                  ° MRRC instructions, trapped MCRR or MRRC (coproc==0b1110) access, using EC value 0x0C.
CNTKCTL.{PL0PTEN, PL0VTEN, PL0PCTEN, PL0VCTEN}   T                For accesses using:
                                                                  ° MRC or MCR instructions, trapped MCR or MRC (coproc==0b1111) access, using EC value 0x03.
                                                                  ° MCRR or MRRC instructions, trapped MCRR or MRRC (coproc==0b1111) access, using EC value 0x04.
PMUSERENR.{ER, CR, SW, EN}                       T                For accesses using:
                                                                  ° MRC or MCR instructions, trapped MCR or MRC (coproc==0b1111) access, using EC value 0x03.
                                                                  ° MCRR or MRRC instructions, trapped MCRR or MRRC (coproc==0b1111) access, using EC value 0x04.





a. T indicates a trap control, E indicates an instruction enable, and D indicates an instruction disable. For the definition of these terms, see the 
list that begins with Instruction enables and instruction disables on page G1-3885. 


Exception reporting when HCR_EL2.TGE routes an exception to EL2 using AArch64 


This section gives more information about the behavior of synchronous exceptions that are routed from Non-secure
EL0 using AArch32 to EL2 using AArch64 because the value of HCR_EL2.TGE is 1. That is, it describes the
behavior of exceptions that are routed from Non-secure User mode to EL2 using AArch64. See:


° Synchronous aborts from Non-secure User mode, when the value of HCR_EL2.TGE is 1.
° Reporting of exceptions routed to EL2 using AArch64 because the value of HCR_EL2.TGE is 1 on
page G1-3832.


Synchronous aborts from Non-secure User mode, when the value of HCR_EL2.TGE is 1 


When executing in Non-secure User mode, the following sections describe synchronous exceptions that can occur: 
° Undefined Instruction exception on page G1-3849. 

° Supervisor Call (SVC) exception on page G1-3853. 

° Prefetch Abort exception on page G1-3856. 

° Data Abort exception on page G1-3859. 


When EL2 is using AArch64 and the value of HCR_EL2.TGE is 1, these exceptions are routed to EL2, and reported 
in the ESR_EL2. Table G1-15 shows how they are reported. 


Table G1-15 Syndrome reporting in ESR_EL2 of HCR_EL2 routing of exceptions 





AArch32 exception        Syndrome reporting in ESR_EL2 
Undefined Instruction    Exception for an unknown reason, using EC value 0x00 
Supervisor Call          Exception from SVC executed in AArch32 state, using EC value 0x11 
Prefetch Abort           Exception from an Instruction abort at a lower Exception level, using EC value 0x20 
Data Abort               Exception from a Data abort at a lower Exception level, using EC value 0x24 
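

The mapping in Table G1-15 can be sketched as a small lookup. This is an illustrative sketch only, not architectural pseudocode: the enumeration and the function name tge_routed_esr_el2_ec() are hypothetical names introduced here, and only the EC values come from the table.

```c
#include <stdint.h>

/* Hypothetical names for the four AArch32 synchronous exceptions
 * listed in Table G1-15. */
typedef enum {
    A32_EXC_UNDEFINED_INSTRUCTION,
    A32_EXC_SUPERVISOR_CALL,
    A32_EXC_PREFETCH_ABORT,
    A32_EXC_DATA_ABORT
} a32_sync_exception_t;

/* Returns the ESR_EL2.EC value that Table G1-15 gives for an exception
 * routed from Non-secure User mode to EL2 using AArch64 because
 * HCR_EL2.TGE is 1. */
uint8_t tge_routed_esr_el2_ec(a32_sync_exception_t exc)
{
    switch (exc) {
    case A32_EXC_UNDEFINED_INSTRUCTION: return 0x00; /* unknown reason */
    case A32_EXC_SUPERVISOR_CALL:       return 0x11; /* SVC in AArch32 state */
    case A32_EXC_PREFETCH_ABORT:        return 0x20; /* Instruction abort, lower EL */
    case A32_EXC_DATA_ABORT:            return 0x24; /* Data abort, lower EL */
    }
    return 0x00; /* Assumption of this sketch: unlisted inputs report "unknown reason" */
}
```

A handler running at EL2 could compare the EC field it reads from ESR_EL2 against these values to decide which AArch32 exception it is emulating.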
Reporting of exceptions routed to EL2 using AArch64 because the value of HCR_EL2.TGE is 1 


Table G1-16 shows syndrome reporting for exceptions that are routed from Non-secure User mode to EL2 using 
AArch64 by HCR_EL2.TGE. 


For: 


• Instruction enable and disable controls provided by PL1, instructions are UNDEFINED when disabled. The 
Undefined Instruction exceptions that are routed to EL2 using AArch64 are reported with ESR_EL2.EC 
value 0x00, indicating an unknown reason. 


• Trap controls provided by PL1, trapped instructions generate Trap exceptions. These are reported in the 
ESR_EL2 with an appropriate EC value. 


Table G1-16 Syndrome reporting in ESR_EL2 of HCR_EL2 routing of traps, enables, and disables 


Control provided by PL1    Control type (a)    Syndrome reporting in ESR_EL2 
SCTLR.{nTWE, nTWI}    T    Trapped WFI or WFE instruction, using EC value 0x01 
SCTLR.{SED, ITD}    D    Exception for an unknown reason, using EC value 0x00 
SCTLR.CP15BEN    E    Exception for an unknown reason, using EC value 0x00 
CPACR.TRCDIS    T    For accesses using: 
    • MCR or MRC instructions, trapped MCR or MRC (coproc==0b1110) access, using EC value 0x05. 
    • MCRR or MRRC instructions, trapped MCRR or MRRC (coproc==0b1110) access, using EC value 0x0C. 
CPACR.{cp11, cp10}    E    Exception for an unknown reason, using EC value 0x00 
FPEXC.EN    E    Exception for an unknown reason, using EC value 0x00 
CPACR.ASEDIS    D    Exception for an unknown reason, using EC value 0x00 
DBGDSCRext.UDCCdis    T    For accesses using: 
    • MCR or MRC instructions, trapped MCR or MRC (coproc==0b1110) access, using EC value 0x05. 
    • LDC or STC instructions, trapped LDC or STC access, using EC value 0x06. 
    • MRRC instructions, trapped MCRR or MRRC (coproc==0b1110) access, using EC value 0x0C. 
CNTKCTL.{PL0PTEN, PL0VTEN, PL0PCTEN, PL0VCTEN}    T    For accesses using: 
    • MRC or MCR instructions, trapped MCR or MRC (coproc==0b1111) access, using EC value 0x03. 
    • MCRR or MRRC instructions, trapped MCRR or MRRC (coproc==0b1111) access, using EC value 0x04. 
PMUSERENR.{ER, CR, SW, EN}    T    For accesses using: 
    • MRC or MCR instructions, trapped MCR or MRC (coproc==0b1111) access, using EC value 0x03. 
    • MCRR or MRRC instructions, trapped MCRR or MRRC (coproc==0b1111) access, using EC value 0x04. 





a. T indicates a trap control, E indicates an instruction enable, and D indicates an instruction disable. For the definition of these terms, see the 
list that begins with Instruction enables and instruction disables on page G1-3885. 
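

For the trap controls in Table G1-16, the reported EC value depends only on the class of the trapped instruction. The following sketch collects those classes in one lookup. The enumeration and the function name trapped_access_ec() are hypothetical and are not part of the architecture; the EC values are the ones listed in the table.

```c
#include <stdint.h>

/* Hypothetical classification of the trapped accesses that appear in
 * the "T" (trap control) rows of Table G1-16. */
typedef enum {
    TRAP_WFI_WFE,         /* Trapped WFI or WFE instruction */
    TRAP_CP15_MCR_MRC,    /* MCR or MRC, coproc == 0b1111 */
    TRAP_CP15_MCRR_MRRC,  /* MCRR or MRRC, coproc == 0b1111 */
    TRAP_CP14_MCR_MRC,    /* MCR or MRC, coproc == 0b1110 */
    TRAP_CP14_LDC_STC,    /* LDC or STC */
    TRAP_CP14_MCRR_MRRC   /* MCRR or MRRC, coproc == 0b1110 */
} trapped_access_t;

/* Returns the EC value that the table associates with each class. */
uint8_t trapped_access_ec(trapped_access_t access)
{
    switch (access) {
    case TRAP_WFI_WFE:        return 0x01;
    case TRAP_CP15_MCR_MRC:   return 0x03;
    case TRAP_CP15_MCRR_MRRC: return 0x04;
    case TRAP_CP14_MCR_MRC:   return 0x05;
    case TRAP_CP14_LDC_STC:   return 0x06;
    case TRAP_CP14_MCRR_MRRC: return 0x0C;
    }
    return 0x00; /* Assumption of this sketch: anything else is "unknown reason" */
}
```

Instruction enables and disables (types E and D in the table) are not covered by this lookup, because they are always reported with EC value 0x00.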
For all exceptions that Table G1-16 on page G1-3832 applies to except those generated by CPACR.{cp11, cp10}, 
ESR_EL2 reporting is the same as it would have been in ESR_EL1 under the following conditions: 


• EL1 using AArch64. 
• HCR_EL2.TGE set to 0. 


For exceptions generated by CPACR.{cp11, cp10}, ESR_EL2 reporting is as follows: 


° The ESR_EL2.EC value is reported as 0x00. This differs from the EC value reported in ESR_EL1 when an 
exception is generated by CPACR_EL1.FPEN and is taken to EL1 using AArch64. 











G1.12.7 Routing debug exceptions to EL2 
When the value of HDCR.TDE is 1, if the PE is executing in a Non-secure mode other than Hyp mode, any Debug 
exception is routed to Hyp mode. This means it generates a Hyp Trap exception. This applies to: 
• Debug exceptions associated with an instruction fetch, that would otherwise generate a Prefetch Abort 
exception. These are the Breakpoint, Breakpoint Instruction, and Vector Catch exceptions, see Chapter G2 
AArch32 Self-hosted Debug. 
• Watchpoint exceptions associated with data accesses, that would otherwise generate a Data Abort exception. 
See Watchpoint exceptions on page G2-3961. 
When the value of HDCR.TDE is 1, each of the HDCR.{TDRA, TDOSA, TDA} bits is treated as 1 for all purposes 
other than reading the HDCR register. 
Note 
• A Breakpoint or Watchpoint debug event that generates entry to Debug state cannot be trapped to Hyp mode. 
See Breakpoint and Watchpoint debug events on page H2-4846. 
• When HDCR.TDE is set to 1, the Hyp Trap exception is generated instead of the Prefetch Abort exception 
or Data Abort exception that is otherwise generated by the Debug exception. 
• Debug exceptions, other than Breakpoint Instruction exceptions, are never generated in Hyp mode. 

When a Hyp Trap exception is generated because HDCR.TDE is set to 1, the HSR reports the exception either: 
• As a Prefetch Abort exception routed to Hyp mode, using the EC value 0x20. 
• As a Data Abort exception routed to Hyp mode, using the EC value 0x24. 
For more information see Use of the HSR on page G4-4137. 
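

The routing rule in this section can be summarized in a few lines of C. This is a sketch under simplifying assumptions: it considers only debug exceptions, not debug events that cause entry to Debug state, and the type and function names are hypothetical. The EC values 0x20 and 0x24 are the ones given above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the debug exceptions discussed in this section. */
typedef enum {
    DBG_EXC_BREAKPOINT,
    DBG_EXC_BREAKPOINT_INSTRUCTION,
    DBG_EXC_VECTOR_CATCH,
    DBG_EXC_WATCHPOINT
} debug_exception_t;

/* Returns true if the debug exception is routed to Hyp mode because
 * HDCR.TDE is 1, and in that case writes the HSR EC value to *ec.
 * Returns false, leaving *ec unchanged, when the routing does not apply. */
bool tde_routes_to_hyp(bool hdcr_tde, bool non_secure, bool in_hyp_mode,
                       debug_exception_t exc, uint8_t *ec)
{
    /* Routing applies only in a Non-secure mode other than Hyp mode. */
    if (!hdcr_tde || !non_secure || in_hyp_mode) {
        return false;
    }
    /* Watchpoints would otherwise be Data Aborts. The other debug
     * exceptions would otherwise be Prefetch Aborts. */
    *ec = (exc == DBG_EXC_WATCHPOINT) ? 0x24 : 0x20;
    return true;
}
```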
G1.13 Exception return to an Exception level using AArch32 


In the ARM architecture, exception return to an Exception level that is using AArch32 requires the simultaneous 
restoration of the PC and PSTATE to values that are consistent with the desired state of execution on returning from 
the exception. Typically, exception return involves returning to one of: 


• The instruction after the instruction boundary at which an asynchronous exception was taken. 

• The instruction following an SVC, SMC, or HVC instruction, for an exception generated by one of those 
instructions. 

• The instruction that caused the exception, after the reason for the exception has been removed. 

• The subsequent instruction, if the instruction that caused the exception has been emulated in the exception 
handler. 


The ARM architecture defines a preferred return address for each exception other than Reset, see Link values saved 
on exception entry on page G1-3821. The values of the SPSR.IT[7:0] bits generated on exception entry are always 
correct for this preferred return address, but might require adjustment by the exception handler if returning 
elsewhere. 


In some cases, to calculate the appropriate preferred return address for a return to an Exception level that is using 
AArch32, a subtraction must be performed on the link value saved on taking the exception. The description of each 
exception includes any value that must be subtracted from the link value, and other information about the required 
exception return. 


On an exception return, the PSTATE takes either: 
• The value loaded by the RFE instruction. 


• If the exception return is not performed by executing an RFE instruction, the value of the current SPSR at the 
time of the exception return. 


Illegal return events from AArch32 state on page G1-3835 describes the behavior if the restored PE state would not 
be valid for the Exception level, PE mode, and Security state targeted by the exception return. 


G1.13.1 Exception return instructions 


The instructions that an exception handler can use to return from an exception depend on whether the exception was 
taken to an EL1 mode, or in an EL2 mode, see: 


• Return from an exception taken to a PE mode other than Hyp mode. 


• Return from an exception taken to Hyp mode on page G1-3835. 


Return from an exception taken to a PE mode other than Hyp mode 


For an exception taken to a PE mode other than Hyp mode, the ARM AArch32 architecture provides the following 
exception return instructions: 


• From privileged modes other than System mode, the ERET instruction. After the exception return, execution 
resumes from the address held in the LR (R14) for the mode in which ERET is executed. See ERET on 
page F5-2673. 


• Data-processing instructions with the S bit set and the PC as a destination, see MOV, MOVS (register) on 
page F5-2815 and SUB, SUBS (immediate) on page F5-3114. 
Note 


The A32 instruction set includes other instructions that can be used for an exception return, but ARM 
deprecates any use of those instructions. 
Typically: 


— A return where no subtraction is required uses SUBS with an operand of 0, or the equivalent MOVS 
instruction. 


— A return requiring subtraction uses SUBS with a nonzero operand. 


• The RFE instruction, see RFE, RFEDA, RFEDB, RFEIA, RFEIB on page F5-2918. If a subtraction is required, 
typically it is performed before saving the LR value to memory. After the exception return, execution resumes 
from the address held in the memory location indicated by the base register specified by the RFE instruction. 


• In A32 state, a form of the LDM instruction in which the PC is one of the registers loaded, see LDM (exception 
return) on page F5-2703. If a subtraction is required, typically it is performed before saving the LR value to 
memory. 


Return from an exception taken to Hyp mode 


For an exception taken to Hyp mode, the ARM architecture provides the ERET instruction, see ERET on 
page F5-2673. An exception handler executing in Hyp mode must return using the ERET instruction. 


Both Hyp mode and the ERET instruction are implemented only as part of EL2. 


G1.13.2 Alignment of exception returns 


The T bit of the value transferred to the PSTATE by an exception return controls the target instruction set of that 
return. The behavior of the hardware for exception returns for different values of the T bit is as follows: 





T == 0 The target instruction set state is A32 state. Bits[1:0] of the address transferred to the PC are ignored 
by the hardware. 
T == 1 The target instruction set state is T32 state: 
• Bit[0] of the address transferred to the PC is ignored by the hardware. 
• Bit[1] of the address transferred to the PC is part of the instruction address. 
Note 


In previous versions of the ARM architecture, the PSTATE.{J, T} bits determined the Instruction set state. In 
ARMv8, PSTATE.J is RES0. 





ARM deprecates any dependence on the requirements that the hardware ignores bits of the address. ARM 
recommends that the address transferred to the PC for an exception return is correctly aligned for the target 
instruction set. 


After an exception entry other than Reset, the LR value has the correct alignment for the instruction set indicated 
by the SPSR.T bit. This means that if exception return instructions are used with the LR and SPSR values produced 
by such an exception entry, the only precaution software needs to take to ensure correct alignment is that any 
subtraction is of a multiple of four if returning to A32 state, or a multiple of two if returning to T32 state. 
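

As an illustration of the alignment rules in this section, the following sketch computes the address that execution resumes from for a given T bit, and checks whether a subtraction preserves alignment. The function names are hypothetical. Because ARM deprecates reliance on the hardware ignoring address bits, the first function is best read as a description of hardware behavior, not as something software should depend on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models which address bits the hardware ignores on an exception return:
 * bits[1:0] when returning to A32 state (T == 0), bit[0] only when
 * returning to T32 state (T == 1). */
uint32_t effective_return_address(uint32_t address, bool t_bit)
{
    return t_bit ? (address & ~UINT32_C(1)) : (address & ~UINT32_C(3));
}

/* Checks that a value subtracted from a correctly aligned LR keeps the
 * result aligned: a multiple of four for A32, a multiple of two for T32. */
bool subtraction_preserves_alignment(uint32_t subtrahend, bool t_bit)
{
    return (subtrahend % (t_bit ? 2u : 4u)) == 0u;
}
```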


G1.13.3 Illegal return events from AArch32 state 


Throughout this section: 


Return In AArch32 state, refers to any of: 
• Execution of any exception return instruction. 
• Execution of a DRPS instruction in Debug state. 
• Exit from Debug state. 


If an exception or debug return from an Exception level using AArch32 triggers an illegal exception 
return, then bit[1] of the PC is either: 





° Zero. 
° The value of bit[1] of the return address for the exception or debug return. 
The choice between these two alternatives is made by the implementation, and might differ from 
instance to instance of an illegal exception return. 


Note 


This means software must support both alternatives. 





Saved process state value 
In AArch32 state, refers to any of: 


° The value held in the SPSR for any exception return other than an exception return made by 
executing an RFE instruction. 


° The value read from memory that is to be restored to PSTATE by the execution of an RFE 
instruction. 
° The value held in the SPSR for the execution of a DRPS instruction in Debug state. 


° The value held in the DSPSR for a Debug state exit. 


Link address In AArch32 state, refers to any of: 


• The address held in the link register for any exception return other than an exception return 
made by executing an ERET, LDM, or RFE instruction. 


• The address held in ELR_hyp for any exception return made by executing an ERET instruction. 


• The address read from memory that is to be restored to the PC by the execution of an LDM or 
RFE instruction. 


• The address held in the DLR for Debug state exit. 


Configured from reset 
Indicates the state determined on powerup or reset by a configuration input signal, or by another 
IMPLEMENTATION DEFINED mechanism. 


The ARMv8 architecture has a generic mechanism for handling exception or debug returns to a mode or state that 
is illegal. In AArch32 state, this can occur as a result of any of the following situations: 


° A return where the Exception level being returned to is higher than the current Exception level. 


• A return where the mode being returned to is not implemented. For example: 
— A return to Hyp mode when EL2 is not implemented. 
— A return to Monitor mode, when EL3 is either not implemented or using AArch64 state. 


• A return to EL2 when: 
— EL3 is implemented and using AArch64, and the value of the SCR_EL3.NS bit is 0. 
— EL3 is implemented and using AArch32, and the value of the SCR.NS bit is 0. 


• A return to Non-secure EL1 when: 
— EL2 is implemented and using AArch64, and the value of the HCR_EL2.TGE bit is 1. 
— EL2 is implemented and using AArch32, and the value of the HCR.TGE bit is 1. 


° A return where the value of the saved process state M[4:0] field is not a valid AArch32 PE mode for the 
implementation. Table G1-5 on page G1-3796 shows the valid M[4:0] values for AArch32 PE modes. 


In these cases: 
° PSTATE.IL is set to 1, to indicate an illegal return. 
° PSTATE.M is unchanged. This means the PE mode does not change. 


° The SS bit is handled in the same way as any other exception or debug return, see Software Step exceptions 
on page D2-1673. 
° The following PSTATE bits are restored from the saved process state value: 
— The N, Z, C, V Condition flags. 
— The Q Overflow or saturation flag. 
— The GE Greater than or Equal flags. 
— The E Endianness mapping bit. 
— The A, I, F exception mask bits. 


° The PSTATE.{IT, T} bits are each either: 
— Set to 0. 
— Copied from the saved process state in the SPSR for the PE mode in which the exception is handled. 


The choice between these two options is determined by an implementation, and might vary dynamically 
within an implementation. Correspondingly software must regard the value as being an UNKNOWN choice 
between the two values. 


° The PC is restored from the link address, unless the illegal return is the execution of a DRPS instruction in 
Debug state. 


When the value of the PSTATE.IL bit is 1, any attempt to execute any instruction results in an Illegal Execution state 
exception. See The Illegal Execution state exception. 


All aspects of the illegal return, other than the effects described in this section, are the same as for a legal return. 
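

The conditions that make a return illegal can be collected into a single predicate. The sketch below is a simplified model and not the architectural IllegalExceptionReturn() pseudocode: the structures, field names, and function names are hypothetical, the caller is assumed to have already derived the target Exception level and Security state, and the M[4:0] encodings are the usual AArch32 mode values, which should be checked against Table G1-5.

```c
#include <stdbool.h>
#include <stdint.h>

/* AArch32 PE mode encodings, M[4:0]. Verify against Table G1-5. */
enum {
    MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
    MODE_MON = 0x16, MODE_ABT = 0x17, MODE_HYP = 0x1A, MODE_UND = 0x1B,
    MODE_SYS = 0x1F
};

/* Hypothetical view of the PE configuration relevant to the checks. */
typedef struct {
    unsigned current_el;   /* Exception level executing the return, 0 to 3 */
    bool have_el2;
    bool have_el3;
    bool el3_uses_aarch64;
    bool scr_ns;           /* SCR.NS or SCR_EL3.NS, meaningful if have_el3 */
    bool hcr_tge;          /* HCR.TGE or HCR_EL2.TGE, meaningful if have_el2 */
} pe_config_t;

/* Hypothetical description of where the return is trying to go. */
typedef struct {
    uint8_t mode;          /* M[4:0] from the saved process state value */
    unsigned el;           /* Exception level of that mode */
    bool non_secure;       /* Security state being returned to */
} return_target_t;

static bool is_valid_a32_mode(uint8_t mode)
{
    switch (mode & 0x1F) {
    case MODE_USR: case MODE_FIQ: case MODE_IRQ: case MODE_SVC:
    case MODE_MON: case MODE_ABT: case MODE_HYP: case MODE_UND:
    case MODE_SYS:
        return true;
    default:
        return false;
    }
}

/* Returns true if any of the listed illegal-return conditions applies. */
bool is_illegal_return(const pe_config_t *pe, const return_target_t *t)
{
    if (!is_valid_a32_mode(t->mode)) return true;
    if (t->el > pe->current_el) return true;
    if ((t->mode & 0x1F) == MODE_HYP && !pe->have_el2) return true;
    if ((t->mode & 0x1F) == MODE_MON &&
        (!pe->have_el3 || pe->el3_uses_aarch64)) return true;
    if (t->el == 2 && pe->have_el3 && !pe->scr_ns) return true;
    if (t->el == 1 && t->non_secure && pe->have_el2 && pe->hcr_tge) return true;
    return false;
}
```

When the predicate is true, the text above applies: PSTATE.IL is set to 1 and PSTATE.M is left unchanged.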


G1.13.4 Legal returns that set PSTATE.IL to 1 


In this section, return, saved process state value, and link address have the meaning that is defined in Illegal return 
events from AArch32 state on page G1-3835. 


If the IL bit in the saved process state value is 1, then it is copied to PSTATE meaning that PSTATE.IL is set to 1. 
In this case, the PSTATE.{IT, T} bits are each either: 


• Set to 0. 
• Copied from the SPSR, or loaded from memory if the exception return was performed by executing an RFE 
instruction. 


The choice between these two options is determined by an implementation, and might vary dynamically within the 
implementation. This means software must regard each value as being an UNKNOWN choice between the two 
permitted values. 


Because the return sets the PSTATE.IL bit to 1, any attempt to execute any instruction results in an Illegal Execution 
state exception. See The Illegal Execution state exception. 


G1.13.5 The Illegal Execution state exception 


When the value of the PSTATE.IL bit is 1, any attempt to execute an instruction generates an Illegal Execution state 
exception. In AArch32 state, the PSTATE.IL bit can be set to 1 by one of the following: 


• An illegal return, as described in Illegal return events from AArch32 state on page G1-3835. 
• An illegal change to PSTATE.M, as described in Illegal changes to PSTATE.M on page G1-3809. 
• A legal return that sets PSTATE.IL to 1, as described in Legal returns that set PSTATE.IL to 1. 


An Illegal Execution state exception is taken in the same way as an Undefined Instruction exception in the current 
Exception level. If the current Exception level is EL2 using AArch32 state, the HSR provides additional syndrome 
information for the exception, see Use of the HSR on page G4-4137. 


An Illegal Execution state exception has priority over any other Undefined Instruction exception that might arise 
from instruction execution. 
Note 


This section only describes the handling of an Illegal Execution state exception that is taken to an Exception level 
that is using AArch32 state. The Illegal Execution state exception on page D1-1539 describes the cases where an 
Illegal Execution state exception is taken to an Exception level that is using AArch64 state. 








On taking any exception to an Exception level that is using AArch32 state: 


1. The value of the PSTATE.IL bit is 1 and this is copied to the SPSR.IL bit for the PE mode to which the 
exception is taken. 


2. The PSTATE.IL bit is cleared to 0. 


Note 
This means that it is not possible for software to observe the value of PSTATE.IL. 








Pseudocode description of exception return 


The AArch32.ExceptionReturn() function transfers the return address to the PC and restores PSTATE to its saved 
value. 


This function uses the function SetPSTATEFromPSR(). 
The IllegalExceptionReturn() function checks for an Illegal Execution state exception. 


Chapter J1 ARMv8 Pseudocode includes the definitions of these functions. 
G1.14 Asynchronous exception behavior for exceptions taken from AArch32 state 


In an implementation that does not include EL2 or EL3, the asynchronous exceptions behave as follows when EL1 
and EL0 are both using AArch32: 


° An SError interrupt is taken to Abort mode. 
° An IRQ exception is taken to IRQ mode. 
° An FIQ exception is taken to FIQ mode. 


These are the default PE modes for taking these exceptions. 


Note 


The SError interrupt replaces the ARMv7 asynchronous abort. The new name better describes the nature of the 
exception. 








However, the PSTATE.{A, I, F} bits mask the asynchronous exceptions, meaning that when the value of one of these 
PSTATE bits is 1, the corresponding exception is not taken. 


If a masked asynchronous exception remains signaled, then the exception remains pending unless the value of the 
PSTATE bit is changed to 0. 


EL2 and EL3 provide controls that affect: 
• The routing of these exceptions, see Asynchronous exception routing controls on page G1-3841. 


• Masking of these exceptions in Non-secure state, see Asynchronous exception masking controls on 
page G1-3842. 


Similar register control bits are provided regardless of whether EL2 and EL3 are using AArch32 or AArch64: 


° The EL2 controls are provided by the HCR when EL2 is using AArch32, and by the HCR_EL2 when EL2 is 
using AArch64. 


° The EL3 controls are provided by the SCR when EL3 is using AArch32, and by the SCR_EL3 when EL3 is 
using AArch64. 


Therefore, most references to the HCR or SCR in this section are to entries in Table K12-1 on page K12-5660, that 
disambiguates between AArch32 registers and AArch64 registers. However, the Execution states used by EL2 and 
EL3 do affect some aspects of the routing and masking of the asynchronous exceptions, see Asynchronous exception 
routing and masking with higher Exception levels using AArch64 on page G1-3844. 


G1.14.1 Virtual exceptions when an implementation includes EL2 


When implemented, EL2 provides the following virtual exceptions, that correspond to the physical asynchronous 
exceptions: 


° Virtual SError, that corresponds to a physical external SError interrupt. 
° Virtual IRQ, that corresponds to a physical IRQ. 
° Virtual FIQ, that corresponds to a physical FIQ. 


When the value of HCR.TGE is 0 and the value of an HCR.{AMO, IMO, FMO} routing control bit is 1, the 
corresponding virtual interrupt is enabled and a virtual exception is generated either: 


• By setting the corresponding virtual interrupt pending bit, HCR.{VA, VI, VF}, to 1. 


• For a Virtual IRQ or Virtual FIQ, by an IMPLEMENTATION DEFINED mechanism. This might be a signal from 
an interrupt controller. See, for example, the ARM Generic Interrupt Controller Architecture Specification. 


When the value of HCR_EL2.TGE is 1 all virtual interrupts are disabled. 


When a virtual interrupt is disabled: 





° It cannot be taken. 
° It cannot be seen in the ISR. 
In AArch32 state, a virtual exception is taken only from a Non-secure EL1 or EL0 mode. In any other mode, if the 
exception is generated it is not taken. 


A virtual exception is taken in Non-secure state to the default mode for the corresponding physical exception. This 
means: 


° A Virtual SError is taken to Non-secure Abort mode. 
° A Virtual IRQ is taken to Non-secure IRQ mode. 
° A Virtual FIQ is taken to Non-secure FIQ mode. 


Table G1-17 summarizes the HCR bits that route asynchronous exceptions to EL2, and the bits that generate the 
virtual exceptions. 


Table G1-17 HCR bits controlling asynchronous exceptions 





Exception    Routing the physical exception to EL2    Generating the virtual exception 
SError       HCR.AMO                                  HCR.VA 
IRQ          HCR.IMO                                  HCR.VI 
FIQ          HCR.FMO                                  HCR.VF 





The HCR.{VA, VI, VF} bits generate a virtual exception only if set to 1 when the value of the corresponding 
HCR.{AMO, IMO, FMO} is 1. 
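

The enable condition for a virtual exception generated through the HCR can be written as a short predicate. This is a sketch only: the structure models the relevant HCR bits as booleans instead of using their register bit positions, the names are hypothetical, and the IMPLEMENTATION DEFINED mechanism for generating a Virtual IRQ or Virtual FIQ is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical boolean view of the HCR bits named in this section. */
typedef struct {
    bool tge;
    bool amo, imo, fmo;   /* routing controls, also enable virtual exceptions */
    bool va, vi, vf;      /* virtual interrupt pending bits */
} hcr_bits_t;

typedef enum { VIRT_SERROR, VIRT_IRQ, VIRT_FIQ } virtual_exception_t;

/* Returns true if the HCR settings alone generate the virtual exception:
 * TGE must be 0, the matching {AMO, IMO, FMO} bit must be 1, and the
 * matching {VA, VI, VF} bit must be 1. */
bool hcr_generates_virtual_exception(const hcr_bits_t *hcr,
                                     virtual_exception_t exc)
{
    if (hcr->tge) {
        return false; /* all virtual interrupts are disabled */
    }
    switch (exc) {
    case VIRT_SERROR: return hcr->amo && hcr->va;
    case VIRT_IRQ:    return hcr->imo && hcr->vi;
    case VIRT_FIQ:    return hcr->fmo && hcr->vf;
    }
    return false;
}
```

Whether a generated virtual exception is then taken still depends on the current mode and on the PSTATE mask bits, which this predicate does not model.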


Similarly, if the implementation also includes EL3, the HCR.{AMO, IMO, FMO} bits route the corresponding 
physical exception to Hyp mode only if the physical exception is not routed to Monitor mode by the SCR.{EA, IRQ, 
FIQ} bit. For more information, see Asynchronous exception routing controls on page G1-3841. 


When the value of an HCR.{AMO, IMO, FMO} control bit is 1, the corresponding mask bit in PSTATE: 
° Does not mask the physical exception. 
° Masks the virtual exception when the PE is executing in a Non-secure EL1 or EL0 mode. 


Taking a Virtual Abort exception clears HCR.VA to zero. Taking a Virtual IRQ exception or a Virtual FIQ exception 
does not affect the value of HCR.VI or HCR.VF. 


Note 


This means that the exception handler for a Virtual IRQ exception or a Virtual FIQ exception must cause software 
that is executing at EL2 or EL3 to update the HCR to clear the appropriate virtual exception bit to 0. 








See WFE wake-up events on page G1-3874 and Wait For Interrupt on page G1-3875 for information about how 
virtual exceptions affect wake up from power-saving states. 





Note 


A hypervisor can use virtual exceptions to signal exceptions to the current Guest OS. The Guest OS takes a virtual 
exception exactly as it would take the corresponding physical exception, and is unaware of any distinction between 
virtual exception and the corresponding physical exception. 





Effects of the HCR.{AMO, IMO, FMO} bits 


As described in this section, the HCR.{AMO, IMO, FMO} bits are part of the mechanism for enabling the virtual 
exceptions. In addition, for exceptions generated in Non-secure state, these bits: 


• As mentioned in this section, affect the routing of the exceptions. See Asynchronous exception routing 
controls on page G1-3841. 


• Affect the masking of the exceptions. See Asynchronous exception masking controls on page G1-3842. 
G1.14.2 Asynchronous exception routing controls 





Note 


This section describes the behavior when all exception levels are using AArch32. For the differences when this is 
not the case see Asynchronous exception routing and masking with higher Exception levels using AArch64 on 
page G1-3844. 





In an implementation that includes EL3 the following bits in the SCR control the routing of asynchronous 
exceptions, and also the routing of synchronous external aborts: 


SCR.EA When the value of this bit is 1, any synchronous external abort or SError interrupt is taken to EL3. 
Note 
• Although this section describes the asynchronous exception routing controls, SCR.EA 
controls the routing of both synchronous external aborts and asynchronous SError interrupts. 


• The other classes of abort cannot be routed to EL3. For more information about the 
classification of aborts, see VMSAv8-32 memory aborts on page G4-4110. 





SCR.FIQ When the value of this bit is 1, any FIQ exception is taken to EL3. 
SCR.IRQ When the value of this bit is 1, any IRQ exception is taken to EL3. 


When EL3 is using AArch32 and the value of one of the SCR.{EA, FIQ, IRQ} bits is 1, the exception is taken to 
Monitor mode. 


Only Secure software can change the values of these bits. 


In an implementation that includes EL2, the following bits in the HCR route asynchronous exceptions to EL2, for 
exceptions that are both: 


° Taken from a Non-secure EL1 or EL0 mode. 


° If the implementation also includes EL3, not configured, by the SCR.{EA, FIQ, IRQ} controls, to be taken 
to EL3. 


HCR.AMO When the value of this bit is 1, an SError interrupt exception taken from a Non-secure EL1 or EL0 
mode is taken to EL2, instead of to Non-secure Abort mode. If the implementation also includes 
EL3, this control applies only if the value of SCR.EA is 0. When the value of SCR.EA is 1, the value 
of the AMO bit is ignored. 


HCR.FMO When the value of this bit is 1, an FIQ exception taken from a Non-secure EL1 or EL0 mode is taken 
to EL2, instead of to Non-secure FIQ mode. If the implementation also includes EL3, this control 
applies only if the value of SCR.FIQ is 0. When the value of SCR.FIQ is 1, the value of the FMO 
bit is ignored. 


HCR.IMO When the value of this bit is 1, an IRQ exception taken from a Non-secure EL1 or EL0 mode is taken 
to EL2, instead of to Non-secure IRQ mode. If the implementation also includes EL3, this control 
applies only if the value of SCR.IRQ is 0. When the value of SCR.IRQ is 1, the value of the IMO 
bit is ignored. 


When EL2 is using AArch32 and the value of one of the HCR.{AMO, FMO, IMO} bits is 1, the exception is taken 
to Hyp mode. 


Only software executing in Hyp mode, or Secure software executing at EL3 with SCR.NS set to 1, can change the 
values of these bits. If EL3 is using AArch32, this requires the Secure software to be executing in Monitor mode. 


The HCR.{AMO, FMO, IMO} bits also affect the masking of asynchronous exceptions in Non-secure state, as 
described in Asynchronous exception masking controls on page G1-3842. 


The SCR.{EA, FIQ, IRQ} and HCR.{AMO, FMO, IMO} bits have no effect on the routing of Virtual Abort, Virtual 
FIQ, and Virtual IRQ exceptions. 
Note 
When the PE is in Hyp mode: 
° Physical asynchronous exceptions that are not routed to Monitor mode are taken to Hyp mode. 
° Virtual exceptions are not signaled to the PE. 





See also Asynchronous exception behavior for exceptions taken from AArch32 state on page G1-3839. 
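

The routing decisions described in this section, for the case where all Exception levels are using AArch32, can be sketched as follows. The types and names are hypothetical, the control bits are modeled as booleans, and virtual exceptions are not covered. TARGET_DEFAULT_MODE stands for the default PE mode for the exception, that is Abort, IRQ, or FIQ mode.

```c
#include <stdbool.h>

typedef enum { ASYNC_SERROR, ASYNC_IRQ, ASYNC_FIQ } async_exception_t;

/* Where the PE was executing when the physical exception was taken. */
typedef enum {
    SRC_SECURE,
    SRC_NON_SECURE_EL1_EL0,
    SRC_HYP_MODE
} async_source_t;

typedef enum {
    TARGET_DEFAULT_MODE,   /* Abort, IRQ, or FIQ mode */
    TARGET_HYP_MODE,
    TARGET_MONITOR_MODE
} async_target_t;

/* Hypothetical boolean view of the routing controls. */
typedef struct {
    bool have_el2, have_el3;
    bool scr_ea, scr_irq, scr_fiq;
    bool hcr_amo, hcr_imo, hcr_fmo;
} routing_controls_t;

async_target_t route_physical_async(const routing_controls_t *c,
                                    async_exception_t exc,
                                    async_source_t src)
{
    bool scr_bit = false;
    bool hcr_bit = false;

    switch (exc) {
    case ASYNC_SERROR: scr_bit = c->scr_ea;  hcr_bit = c->hcr_amo; break;
    case ASYNC_IRQ:    scr_bit = c->scr_irq; hcr_bit = c->hcr_imo; break;
    case ASYNC_FIQ:    scr_bit = c->scr_fiq; hcr_bit = c->hcr_fmo; break;
    }

    /* The SCR controls take priority over the HCR controls. */
    if (c->have_el3 && scr_bit) {
        return TARGET_MONITOR_MODE;
    }
    /* In Hyp mode, anything not routed to Monitor mode is taken to Hyp mode. */
    if (c->have_el2 && src == SRC_HYP_MODE) {
        return TARGET_HYP_MODE;
    }
    /* The HCR controls apply only to Non-secure EL1 and EL0 modes. */
    if (c->have_el2 && src == SRC_NON_SECURE_EL1_EL0 && hcr_bit) {
        return TARGET_HYP_MODE;
    }
    return TARGET_DEFAULT_MODE;
}
```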


G1.14.3 Asynchronous exception masking controls 


Note 


This section describes the behavior when all exception levels are using AArch32. For the differences when this is 
not the case see Asynchronous exception routing and masking with higher Exception levels using AArch64 on 
page G1-3844. 








The PSTATE.{A, I, F} bits can mask the taking of the corresponding exceptions from AArch32 state, as follows: 
° PSTATE.A can mask SError interrupt exceptions. 

° PSTATE.I can mask IRQ exceptions. 

° PSTATE.F can mask FIQ exceptions. 


In an implementation that does not include either of EL2 and EL3, setting one of these bits to 1 masks the 
corresponding exception, meaning the exception cannot be taken. 


In an implementation that includes EL2, the HCR.{AMO, IMO, FMO} bits modify the masking of exceptions taken 
from Non-secure state. 


Similarly, in an implementation that includes EL3, the SCR.{AW, FW} bits modify the masking of exceptions taken 
from Non-secure state by the PSTATE.{A, F} bits. 


An implementation that includes only EL1 and EL0 does not provide any masking of the PSTATE.{A, I, F} bits. 
The following subsections describe the masking of these bits in other implementations: 


• Asynchronous exception masking in an implementation that includes EL2 but not EL3. 

• Asynchronous exception masking in an implementation that includes EL3 but not EL2. 

• Asynchronous exception masking in an implementation that includes both EL2 and EL3 on page G1-3843. 

• Summary of the asynchronous exception masking controls on page G1-3843. 


Asynchronous exception masking in an implementation that includes EL2 but not EL3 


The HCR.{AMO, IMO, FMO} bits modify the effect of the PSTATE.{A, I, F} bits. When the value of an 
HCR.{AMO, IMO, FMO} mask override bit is 1, the value of the corresponding PSTATE.{A, I, F} bit is ignored 
when the exception is taken from a Non-secure mode other than Hyp mode. 


Asynchronous exception masking in an implementation that includes EL3 but not EL2 
The SCR.{AW, FW} bits modify the effect of the PSTATE.{A, F} bits. When the value of one of the 
SCR.{AW, FW} bits is 0, the corresponding PSTATE bit is ignored when both of the following apply: 

° The corresponding exception is taken from Non-secure state. 


° The value of the corresponding SCR.{EA, FIQ} bit is 1, routing the exception to EL3. This means the 
exception is routed to Monitor mode if EL3 is using AArch32. 


Note 
Whenever the value of PSTATE.I is 1, IRQ exceptions are masked and cannot be taken. 
Asynchronous exception masking in an implementation that includes both EL2 and EL3 


When the value of an HCR.{AMO, IMO, FMO} mask override bit is 1, the value of the corresponding PSTATE.{A, 
I, F} bit is ignored when both of the following apply: 


• The exception is taken from Non-secure state. 
• Either: 
— The corresponding SCR.{EA, IRQ, FIQ} bit routes the exception to Monitor mode. 
— The exception is taken from a Non-secure mode other than Hyp mode. 


In addition, when the value of an SCR.{AW, FW} bit is 0, the value of the corresponding PSTATE.{A, F} bit is 
ignored when all of the following apply: 


• The exception is taken from Non-secure state. 
• The corresponding SCR.{EA, FIQ} bit routes the exception to Monitor mode. 
• The corresponding HCR.{AMO, FMO} mask override bit is set to 0. 
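

The two rules above combine into one predicate that reports whether a PSTATE mask bit is ignored. The sketch below assumes an implementation with both EL2 and EL3, all using AArch32. The structure and function names are hypothetical. For IRQ there is no SCR override bit corresponding to SCR.AW or SCR.FW, so a caller would pass scr_w as true to disable the second rule.

```c
#include <stdbool.h>

/* Hypothetical per-exception view of the masking controls. For SError use
 * HCR.AMO, SCR.EA, SCR.AW. For FIQ use HCR.FMO, SCR.FIQ, SCR.FW. For IRQ
 * use HCR.IMO, SCR.IRQ, and scr_w = true. */
typedef struct {
    bool non_secure;   /* exception taken from Non-secure state */
    bool in_hyp_mode;  /* exception taken from Hyp mode */
    bool hcr_mo;       /* HCR.{AMO, IMO, FMO} mask override bit */
    bool scr_route;    /* SCR.{EA, IRQ, FIQ} routes the exception to Monitor mode */
    bool scr_w;        /* SCR.{AW, FW} */
} mask_context_t;

/* Returns true if the corresponding PSTATE.{A, I, F} bit is ignored, and
 * false if the bit masks the exception when it is set to 1. */
bool pstate_mask_ignored(const mask_context_t *c)
{
    if (!c->non_secure) {
        return false; /* Secure state: PSTATE always masks */
    }
    /* HCR override: applies if routed to Monitor mode, or if not in Hyp mode. */
    if (c->hcr_mo && (c->scr_route || !c->in_hyp_mode)) {
        return true;
    }
    /* SCR override: applies only when the HCR override bit is 0. */
    if (!c->scr_w && c->scr_route && !c->hcr_mo) {
        return true;
    }
    return false;
}
```

The result can be compared row by row with the summary tables for PSTATE.A, PSTATE.I, and PSTATE.F.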


Summary of the asynchronous exception masking controls 


The tables in this section show the masking controls for each of the PSTATE.{A, I, F} bits. For an implementation 
that does not include all of the exception levels: 


If the implementation includes only EL1 and EL0 
The PSTATE bits cannot be masked. The behavior is as shown in the Secure row of the tables. 


If the implementation includes EL2 but not EL3 


The behavior is as shown in the Non-secure table rows when the control bits in the SCR are both 0. 


If the implementation includes EL3 but not EL2 


The behavior is as shown in the table rows where the control bit in the HCR is 0. 


Table G1-18 shows the controls of the masking of SError interrupt exceptions by PSTATE.A. 


Table G1-18 Control of masking by PSTATE.A 

Security state  HCR.AMO  SCR.EA  SCR.AW  Mode     PSTATE.A 
Secure          x        x       x       x        Masks SError interrupt, when set to 1 
Non-secure      0        0       x       x        Masks SError interrupt, when set to 1 
Non-secure      0        1       0       x        Ignored 
Non-secure      0        1       1       x        Masks SError interrupt, when set to 1 
Non-secure      1        x       x       Not Hyp  Ignored 
Non-secure      1        0       x       Hyp      Masks SError interrupt, when set to 1 
Non-secure      1        1       x       x        Ignored 
Table G1-19 shows the controls of the masking of IRQ exceptions by PSTATE.I: 


Table G1-19 Control of masking by PSTATE.I 

Security state  HCR.IMO  SCR.IRQ  Mode     PSTATE.I 
Secure          x        x        x        Masks IRQs, when set to 1 
Non-secure      0        x        x        Masks IRQs, when set to 1 
Non-secure      1        x        Not Hyp  Ignored 
Non-secure      1        0        Hyp      Masks IRQs, when set to 1 
Non-secure      1        1        x        Ignored 





Table G1-20 shows the controls of the masking of FIQ exceptions by PSTATE.F: 


Table G1-20 Control of masking by PSTATE.F 

Security state  HCR.FMO  SCR.FIQ  SCR.FW  Mode     PSTATE.F 
Secure          x        x        x       x        Masks FIQs, when set to 1 
Non-secure      0        0        x       x        Masks FIQs, when set to 1 
Non-secure      0        1        0       x        Ignored 
Non-secure      0        1        1       x        Masks FIQs, when set to 1 
Non-secure      1        x        x       Not Hyp  Ignored 
Non-secure      1        0        x       Hyp      Masks FIQs, when set to 1 
Non-secure      1        1        x       x        Ignored 





G1.14.4 Asynchronous exception routing and masking with higher Exception levels using AArch64 


Asynchronous exception routing controls on page G1-3841 and Asynchronous exception masking controls on 
page G1-3842 give full descriptions of the routing and masking of the asynchronous exceptions when all Exception 
levels are using AArch32. However, when EL0 and EL1 are using AArch32: 


° As already described, the SCR and HCR controls might be from Exception levels that are using AArch64. 


° If EL3 is using AArch64, or EL2 is using AArch64, there are some changes to the asynchronous exception 
behaviors. 


Therefore, the following sections summarize the asynchronous exception behaviors, taking account of the 
Execution state being used at EL2 and EL3: 


° Summary of physical interrupt routing. 


° Summary of physical interrupt masking on page G1-3846. 


Summary of physical interrupt routing 


The following tables show the routing of physical interrupts. Table G1-21 on page G1-3845 shows the routing of 
physical FIQ interrupts, Table G1-22 on page G1-3845 shows the routing of physical IRQ interrupts, and 
Table G1-23 on page G1-3846 shows the routing of physical SError interrupts. 
In these tables, for exceptions that must be taken to an Exception level that is using AArch32, the table shows the 
target Exception level and PE mode. In these entries, Mon indicates Monitor mode, and Abt indicates Abort mode. 


Table G1-21 Routing of physical FIQ exceptions 

EL3 Execution  Control bits                     Target when taken from: 
state          SCR.FIQ  SCR.RW(a)  HCR.FMO(b)   Non-secure                  Secure 
                                                EL0      EL1      EL2      EL0      EL1(c)   EL3 
AArch32        0        x          0            EL1 FIQ  EL1 FIQ  EL2 Hyp  EL3 FIQ  -        EL3 FIQ 
AArch32        0        x          1            EL2 Hyp  EL2 Hyp  EL2 Hyp  EL3 FIQ  -        EL3 FIQ 
AArch32        1        x          x            EL3 Mon  EL3 Mon  EL3 Mon  EL3 Mon  -        EL3 Mon 
AArch64        0        0          0            EL1 FIQ  EL1 FIQ  EL2 Hyp  EL1 FIQ  EL1 FIQ  P(d) 
AArch64        0        x          1            EL2(e)   EL2(e)   EL2(e)   EL1(f)   EL1(f)   P(d) 
AArch64        0        1          0            EL1      EL1      P(d)     EL1      EL1      P(d) 
AArch64        1        x          x            EL3      EL3      EL3      EL3      EL3      EL3 

a. SCR_EL3.RW. When 1, the next lower Exception level is using AArch64. This control is not present when EL3 is using AArch32. 
b. When the value of HCR.TGE is 1, the effective value of this bit is 1. 
c. When EL3 is using AArch32, the only Secure Exception levels are EL0 and EL3. 
d. Interrupt is not taken, but remains pending. 
e. If EL2 is using AArch32, taken to Hyp mode. 
f. If EL1 is using AArch32, taken to Abort mode. 

Table G1-22 Routing of physical IRQ exceptions 

EL3 Execution  Control bits                     Target when taken from: 
state          SCR.IRQ  SCR.RW(a)  HCR.IMO(b)   Non-secure                  Secure 
                                                EL0      EL1      EL2      EL0      EL1(c)   EL3 
AArch32        0        x          0            EL1 IRQ  EL1 IRQ  EL2 Hyp  EL3 IRQ  -        EL3 IRQ 
AArch32        0        x          1            EL2 Hyp  EL2 Hyp  EL2 Hyp  EL3 IRQ  -        EL3 IRQ 
AArch32        1        x          x            EL3 Mon  EL3 Mon  EL3 Mon  EL3 Mon  -        EL3 Mon 
AArch64        0        0          0            EL1 IRQ  EL1 IRQ  EL2 Hyp  EL1 IRQ  EL1 IRQ  P(d) 
AArch64        0        x          1            EL2(e)   EL2(e)   EL2(e)   EL1(f)   EL1(f)   P(d) 
AArch64        0        1          0            EL1      EL1      P(d)     EL1      EL1      P(d) 
AArch64        1        x          x            EL3      EL3      EL3      EL3      EL3      EL3 

a. SCR_EL3.RW. When 1, the next lower Exception level is using AArch64. This control is not present when EL3 is using AArch32. 
b. When the value of HCR.TGE is 1, the effective value of this bit is 1. 
c. When EL3 is using AArch32, the only Secure Exception levels are EL0 and EL3. 
d. Interrupt is not taken, but remains pending. 
e. If EL2 is using AArch32, taken to Hyp mode. 
f. If EL1 is using AArch32, taken to Abort mode. 


Table G1-23 Routing of physical SError interrupts 

EL3 Execution  Control bits                     Target when taken from: 
state          SCR.EA   SCR.RW(a)  HCR.AMO(b)   Non-secure                  Secure 
                                                EL0      EL1      EL2      EL0      EL1(c)   EL3 
AArch32        0        x          0            EL1 Abt  EL1 Abt  EL2 Hyp  EL3 Abt  -        EL3 Abt 
AArch32        0        x          1            EL2 Hyp  EL2 Hyp  EL2 Hyp  EL3 Abt  -        EL3 Abt 
AArch32        1        x          x            EL3 Mon  EL3 Mon  EL3 Mon  EL3 Mon  -        EL3 Mon 
AArch64        0        0          0            EL1 Abt  EL1 Abt  EL2 Hyp  EL1 Abt  EL1 Abt  P(d) 
AArch64        0        x          1            EL2(e)   EL2(e)   EL2(e)   EL1(f)   EL1(f)   P(d) 
AArch64        0        1          0            EL1      EL1      P(d)     EL1      EL1      P(d) 
AArch64        1        x          x            EL3      EL3      EL3      EL3      EL3      EL3 

a. SCR_EL3.RW. When 1, the next lower Exception level is using AArch64. This control is not present when EL3 is using AArch32. 
b. When the value of HCR.TGE is 1, the effective value of this bit is 1. 
c. When EL3 is using AArch32, the only Secure Exception levels are EL0 and EL3. 
d. Interrupt is not taken, but remains pending. 
e. If EL2 is using AArch32, taken to Hyp mode. 
f. If EL1 is using AArch32, taken to Abort mode. 

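
As an illustration of how the routing tables are read, the rows of Table G1-21 that apply when EL3 is using AArch32 can be expressed as a small lookup function. This is a sketch under stated assumptions, not the architecture pseudocode: the function name, argument names, and string results are hypothetical, and only the AArch32 rows of the FIQ table are covered. Footnote b is modelled by treating HCR.TGE == 1 as giving HCR.FMO an effective value of 1.

```python
def route_physical_fiq_el3_aarch32(scr_fiq, hcr_fmo, hcr_tge, secure, el):
    """Return the target of a physical FIQ when EL3 is using AArch32.

    el is the Exception level the FIQ is taken from (0, 1, 2, or 3).
    The result is a string such as "EL1 FIQ", naming the target
    Exception level and PE mode.
    """
    if secure and el == 1:
        # When EL3 is using AArch32, the only Secure Exception levels
        # are EL0 and EL3.
        raise ValueError("no Secure EL1 when EL3 is using AArch32")
    if scr_fiq == 1:
        return "EL3 Mon"   # routed to Monitor mode from every level
    if secure:
        return "EL3 FIQ"   # Secure EL0 and EL3 rows with SCR.FIQ == 0
    if el == 2:
        return "EL2 Hyp"   # taken from Non-secure EL2
    # Effective HCR.FMO is 1 when HCR.TGE is 1.
    effective_fmo = 1 if hcr_tge == 1 else hcr_fmo
    return "EL2 Hyp" if effective_fmo == 1 else "EL1 FIQ"
```

The IRQ and SError tables have the same shape for these rows, with IRQ or Abt in place of FIQ and the corresponding SCR and HCR control bits.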
Summary of physical interrupt masking 
The following tables show the masking of physical interrupts. Table G1-24 on page G1-3847 shows the masking of 
physical FIQ interrupts, Table G1-25 on page G1-3847 shows the masking of physical IRQ interrupts, and 
Table G1-26 on page G1-3848 shows the masking of physical SError interrupts. In these tables: 
M Indicates that the exception is masked when the value of the PSTATE mask bit is 1. 
T Indicates that the exception is taken, regardless of the value of the PSTATE mask bit. 
P Indicates that the exception is not taken but remains pending. The value of the PSTATE mask bit 
has no effect on this behavior. 
Table G1-24 Masking of physical FIQ exceptions 

EL3 Execution  Control bits                              Effect of PSTATE.F(a) mask in Exception level 
state          SCR.FIQ  SCR.FW  SCR.RW(b)  HCR.FMO(c)    Non-secure       Secure 
                                                         EL0  EL1  EL2    EL0  EL1(d)  EL3 
AArch32        0        x       x          0             M    M    M      M    -       M 
AArch32        0        x       x          1             T    T    M      M    -       M 
AArch32        1        0       x          0             T    T    T      M    -       M 
AArch32        1        1       x          0             M    M    M      M    -       M 
AArch32        1        x       x          1             T    T    T      M    -       M 
AArch64        0        x       0          0             M    M    M      M    M       P 
AArch64        0        x       0          1             T    T    M      M    M       P 
AArch64        0        x       1          0             M    M    P      M    M       P 
AArch64        0        x       1          1             T    T    M      M    M       P 
AArch64        1        x       x          x             T    T    T      T    T       M 

a. In AArch32 state this is visible as CPSR.F. 
b. SCR_EL3 only, this control is not present when EL3 is using AArch32. When the value of RW is 1, the next lower Exception level is 
using AArch64. 
c. When the value of HCR.TGE is 1, the effective value of this bit is 1. 
d. When EL3 is using AArch32, the only Secure Exception levels are EL0 and EL3. 


Table G1-25 Masking of physical IRQ exceptions 

EL3 Execution  Control bits                      Effect of PSTATE.I(a) mask in Exception level 
state          SCR.IRQ  SCR.RW(b)  HCR.IMO(c)    Non-secure       Secure 
                                                 EL0  EL1  EL2    EL0  EL1(d)  EL3 
AArch32        0        x          0             M    M    M      M    -       M 
AArch32        0        x          1             T    T    M      M    -       M 
AArch32        1        x          0             M    M    M      M    -       M 
AArch32        1        x          1             T    T    T      M    -       M 
AArch64        0        0          0             M    M    M      M    M       P 
AArch64        0        0          1             T    T    M      M    M       P 
AArch64        0        1          0             M    M    P      M    M       P 
AArch64        0        1          1             T    T    M      M    M       P 
AArch64        1        x          x             T    T    T      T    T       M 

a. In AArch32 state this is visible as CPSR.I. 
b. SCR_EL3 only, this control is not present when EL3 is using AArch32. When the value of RW is 1, the next lower 
Exception level is using AArch64. 
c. When the value of HCR.TGE is 1, the effective value of this bit is 1. 
d. When EL3 is using AArch32, the only Secure Exception levels are EL0 and EL3. 


Table G1-26 Masking of physical SError interrupts 

EL3 Execution  Control bits                              Effect of PSTATE.A(a) mask in Exception level 
state          SCR.EA   SCR.AW  SCR.RW(b)  HCR.AMO(c)    Non-secure       Secure 
                                                         EL0  EL1  EL2    EL0  EL1(d)  EL3 
AArch32        0        x       x          0             M    M    M      M    -       M 
AArch32        0        x       x          1             T    T    M      M    -       M 
AArch32        1        0       x          0             T    T    T      M    -       M 
AArch32        1        1       x          0             M    M    M      M    -       M 
AArch32        1        x       x          1             T    T    T      M    -       M 
AArch64        0        x       0          0             M    M    M      M    M       P 
AArch64        0        x       0          1             T    T    M      M    M       P 
AArch64        0        x       1          0             M    M    P      M    M       P 
AArch64        0        x       1          1             T    T    M      M    M       P 
AArch64        1        x       x          x             T    T    T      T    T       M 

a. In AArch32 state this is visible as CPSR.A. 
b. SCR_EL3 only, this control is not present when EL3 is using AArch32. When the value of RW is 1, the next lower Exception level is 
using AArch64. 
c. When the value of HCR.TGE is 1, the effective value of this bit is 1. 
d. When EL3 is using AArch32, the only Secure Exception levels are EL0 and EL3. 
G1.15 AArch32 state exception descriptions 


Handling exceptions that are taken to an Exception level using AArch32 on page G1-3812 gives general information 
about exception handling. This section describes each of the exceptions, in the following subsections: 


• Undefined Instruction exception. 
• Monitor Trap exception on page G1-3851. 
• Hyp Trap exception on page G1-3852. 
• Supervisor Call (SVC) exception on page G1-3853. 
• Secure Monitor Call (SMC) exception on page G1-3854. 
• Hypervisor Call (HVC) exception on page G1-3855. 
• Prefetch Abort exception on page G1-3856. 
• Data Abort exception on page G1-3859. 
• Virtual SError interrupt exception on page G1-3862. 
• IRQ exception on page G1-3863. 
• Virtual IRQ exception on page G1-3864. 
• FIQ exception on page G1-3865. 
• Virtual FIQ exception on page G1-3866. 


Additional pseudocode functions for exception handling on page G1-3867 gives additional pseudocode that is used 
in the pseudocode descriptions of a number of the exceptions. 


G1.15.1 Undefined Instruction exception 
An Undefined Instruction exception might be caused by: 


• A System register access, floating-point, or Advanced SIMD instruction that is not accessible because of the 
settings in one or more of the CPACR, NSACR, HCPTR, and DBGDSCRext. 

• A System register access, floating-point, or Advanced SIMD instruction that is not implemented. 

• A System register access, floating-point, or Advanced SIMD instruction that causes an exception during 
execution. This includes: 

  — Trapped floating-point exceptions that are taken to AArch32, if an implementation supports these 
  traps. See Floating-point exceptions on page E1-2303. 

  — Execution of certain floating-point instructions when one or both of the FPSCR.{Stride, Len} fields 
  is nonzero, in an implementation in which those fields are RW. The description of FPEXC specifies 
  the instructions to which this applies. 

• An instruction that is UNDEFINED. 


Note 


The Undefined Instruction exception is taken using offset 0x04 in the Hyp, Secure, or Non-secure vector table. In 
the Monitor vector table this offset is used for the Monitor Trap exception. See Monitor Trap exception on 
page G1-3851 and The vector tables and exception offsets on page G1-3813. 








By default, an Undefined Instruction exception is taken to Undefined mode, but an Undefined Instruction exception 
can be taken to EL2, meaning it is taken to Hyp mode if EL2 is using AArch32, see The PE mode to which the 
Undefined Instruction exception is taken on page G1-3850. 


The Undefined Instruction exception can provide: 





• Signaling of an illegal instruction execution. 
• Lazy context switching of System registers. 
The preferred return address for an Undefined Instruction exception is the address of the instruction that generated 
the exception. This return is performed as follows: 


• If returning from Secure or Non-secure Undefined mode, the exception return uses the SPSR and LR_und 
values generated by the exception entry, as follows: 

  — If SPSR.T is 0, indicating that the exception occurred in A32 state, the return uses an exception return 
  instruction with a subtraction of 4. 

  — If SPSR.T is 1, indicating that the exception occurred in T32 state, the return uses an exception return 
  instruction with a subtraction of 2. 

• If returning from Hyp mode, the exception return is performed by an ERET instruction, using the SPSR and 
ELR_hyp values generated by the exception entry. 


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 
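
The return from Undefined mode described above amounts to subtracting a state-dependent constant from LR_und. The following Python is an illustrative sketch only; the function name and arguments are hypothetical and are not part of the architecture pseudocode.

```python
def undef_preferred_return(lr_und, spsr_t):
    """Return the preferred return address for an Undefined Instruction
    exception handled in Undefined mode.

    lr_und: the LR_und value generated by the exception entry.
    spsr_t: the SPSR.T bit; 0 means A32 state, 1 means T32 state.
    """
    subtraction = 2 if spsr_t else 4
    # Addresses are 32 bits wide in AArch32 state.
    return (lr_und - subtraction) & 0xFFFFFFFF
```

For an exception taken from A32 state with LR_und equal to 0x8004, this gives 0x8000, the address of the instruction that generated the exception.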


Note 


If handling the Undefined Instruction exception requires instruction emulation, followed by return to the next 
instruction after the instruction that caused the exception, the instruction emulator must use the instruction length 
to calculate the correct return address, and to calculate the updated values of the IT bits if necessary. 
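
A minimal sketch of the calculation an instruction emulator might perform is shown below. It is written under stated assumptions: the helper names are hypothetical, the instruction length is supplied by the caller (4 bytes for A32, 2 or 4 bytes for T32), and the IT state update is intended to follow the behavior of the architecture's ITAdvance() pseudocode function as understood here, which should be checked against that pseudocode before use.

```python
def it_advance(itstate):
    """Advance an 8-bit IT state value past one instruction.

    If bits[2:0] are zero the IT block is finished and the state is
    cleared; otherwise bits[4:0] are shifted left by one and bits[7:5]
    are preserved.
    """
    if itstate & 0b111 == 0:
        return 0
    return (itstate & 0b11100000) | ((itstate << 1) & 0b00011111)


def resume_after_emulation(instr_addr, instr_len, itstate):
    """Return (return_address, new_itstate) for resuming at the
    instruction that follows an emulated instruction."""
    return (instr_addr + instr_len) & 0xFFFFFFFF, it_advance(itstate)
```

An emulator would write the returned values to the link register and the saved IT bits before performing the exception return.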








The PE mode to which the Undefined Instruction exception is taken 


Figure G1-4 shows how the implementation, state, and configuration options determine the PE mode to which an 
Undefined Instruction exception is taken. 
Figure G1-4 The PE mode the Undefined Instruction exception is taken to 


See also UNPREDICTABLE cases when the value of HCR.TGE is 1 on page G1-3826. 
Pseudocode description of taking the Undefined Instruction exception 


The AArch32.TakeUndefInstrException() pseudocode procedure describes how the PE takes the exception. 


Conditional execution of undefined instructions 


The conditional execution rules described in Conditional execution on page F2-2407 apply to all instructions. This 
includes undefined instructions and other instructions that would cause entry to the Undefined Instruction 
exception. 


If such an instruction fails its condition check, the behavior depends on the potential cause of entry to the Undefined 
Instruction exception, as follows: 


• If the potential cause is the execution of the instruction itself and depends on data values used by the 
instruction, the instruction executes as a NOP and does not cause an Undefined Instruction exception. 

• In the following cases, it is IMPLEMENTATION DEFINED whether the instruction executes as a NOP or causes an 
Undefined Instruction exception: 

  — The potential cause is the execution of an earlier System register access instruction, floating-point 
  instruction, or Advanced SIMD instruction. 

  — The potential cause is the execution of the instruction itself without dependence on the data values 
  used by the instruction. 


An implementation must handle all such cases in the same way. 
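
The decision described above can be summarized in a few lines. The following Python sketch is illustrative only: the `Cause` enumeration and the `impl_takes_exception` flag are hypothetical names, with the flag standing for the single IMPLEMENTATION DEFINED choice that an implementation applies to all such cases.

```python
from enum import Enum


class Cause(Enum):
    # The instruction itself, depending on data values it uses.
    SELF_DATA_DEPENDENT = 1
    # The instruction itself, without dependence on data values.
    SELF_NOT_DATA_DEPENDENT = 2
    # An earlier System register access, floating-point,
    # or Advanced SIMD instruction.
    EARLIER_INSTRUCTION = 3


def failed_condition_behavior(cause, impl_takes_exception):
    """Behavior of an instruction that fails its condition check and
    would otherwise cause an Undefined Instruction exception."""
    if cause is Cause.SELF_DATA_DEPENDENT:
        return "NOP"
    # IMPLEMENTATION DEFINED, but the same choice for all such cases.
    return "UNDEFINED" if impl_takes_exception else "NOP"
```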


Note 


Before ARMv7, all implementations executed any instruction that failed its condition check as a NOP, even if it 
would otherwise have caused an Undefined Instruction exception. An Undefined Instruction handler written for 
these implementations might assume without checking that the undefined instruction passed its condition check. 
Such an Undefined Instruction handler is likely to need rewriting, to check the condition is passed, before it 
functions correctly on all AArch32 implementations. 








Interaction of UNDEFINED instruction behavior with UNPREDICTABLE or 
CONSTRAINED UNPREDICTABLE instruction behavior 


If this manual describes an instruction as both: 





• UNPREDICTABLE and UNDEFINED then the instruction is UNPREDICTABLE. 
• CONSTRAINED UNPREDICTABLE and UNDEFINED then the instruction is CONSTRAINED UNPREDICTABLE. 


Note 

An example of this is where both: 
• An instruction, or instruction class, is made UNDEFINED by some general principle, or by a configuration 
field. 
• A particular encoding of that instruction or instruction class is specified as CONSTRAINED UNPREDICTABLE. 





G1.15.2 Monitor Trap exception 


The Monitor Trap exception is implemented only as part of EL3, and can be generated only if EL3 is using 
AArch32. 





Note 


The Monitor Trap exception is taken using offset 0x04 in the Monitor vector table. In the other vector tables, this 
offset is used for the Undefined Instruction exception. See Undefined Instruction exception on page G1-3849 and 
The vector tables and exception offsets on page G1-3813. 


A Monitor Trap exception is generated if the PE is running in a mode other than Monitor mode, and commits for 
execution a WFI or WFE instruction that would otherwise cause suspension of execution when: 


° In the case of the WFI instruction, the value of the SCR.TWI bit is 1. 
° In the case of the WFE instruction, the value of the SCR.TWE bit is 1. 
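
The condition above can be written as a predicate. This is a sketch only, assuming EL3 is implemented and using AArch32; the function and argument names are hypothetical, and `would_suspend` stands for "the instruction would otherwise cause suspension of execution", which the architecture does not guarantee for any particular execution.

```python
def monitor_trap_generated(in_monitor_mode, instr, would_suspend, scr_twi, scr_twe):
    """Return True if committing instr for execution generates a
    Monitor Trap exception.

    instr is "WFI" or "WFE"; scr_twi and scr_twe are the values of
    the SCR.TWI and SCR.TWE bits.
    """
    if in_monitor_mode or not would_suspend:
        return False
    if instr == "WFI":
        return scr_twi == 1
    if instr == "WFE":
        return scr_twe == 1
    return False
```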


Note 


Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed 
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the 
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken. 








The preferred return address for a Monitor Trap exception is the address of the instruction that generated the 
exception. The exception return uses the SPSR and LR_mon values generated by the exception entry, as follows: 


° If SPSR.T is 0, indicating that the exception occurred in A32 state, the return uses an exception return 
instruction with a subtraction of 4. 


° If SPSR.T is 1, indicating that the exception occurred in T32 state, the return uses an exception return 
instruction with a subtraction of 2. 


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 


The PE mode to which the Monitor Trap exception is taken 


When EL3 is using AArch32, a Monitor Trap exception is taken to Monitor mode, using a vector offset of 0x04 from 
the Monitor exception base address. 


Pseudocode description of taking the Monitor Trap exception 


The AArch32.TakeMonitorTrapException() pseudocode procedure describes how the PE takes the exception. 


G1.15.3 Hyp Trap exception 


The Hyp Trap exception provides the standard mechanism for trapping Guest OS functions to the hypervisor. 
The Hyp Trap exception is implemented only as part of EL2 and can be generated only if EL2 is using AArch32. 


A Hyp Trap exception is generated if the PE is running in a Non-secure mode other than Hyp mode, and commits 
for execution an instruction that is trapped to Hyp mode. Instruction traps are enabled by setting bits to 1 in the HCR, 
HCPTR, HDCR, or HSTR. For more information see EL2 configurable controls on page G1-3894. 


Traps to Hyp mode never apply in Secure state, regardless of the value of the SCR.NS bit. 


The preferred return address for a Hyp Trap exception is the address of the trapped instruction. The exception return 
is performed by an ERET instruction, using the SPSR and ELR_hyp values generated by the exception entry. 


Note 


The SPSR and ELR_hyp values generated on exception entry can be used, without modification, for an exception 
return to re-execute the trapped instruction. If the exception handler emulates the trapped instruction, and must 
return to the following instruction, the emulation of the instruction must include modifying ELR_hyp, and possibly 
updating SPSR_hyp. 








When the PE enters the handler for a Hyp Trap exception, the HSR holds syndrome information for the exception. 
For more information see Use of the HSR on page G4-4137. 


The PE mode to which the Hyp Trap exception is taken 


A Hyp Trap exception is taken to Hyp mode, using a vector offset of 0x14 from the Hyp exception base address. 
Pseudocode description of taking the Hyp Trap exception 


The AArch32.TakeHypTrapException() pseudocode procedure describes how the PE takes the exception. 











G1.15.4 Supervisor Call (SVC) exception 

The Supervisor Call instruction, SVC, requests a supervisor function, typically to request an operating system 
function. When EL1 is using AArch32, executing an SVC instruction causes the PE to enter Supervisor mode. For 
more information, see SVC on page F5-3129. 

Note 

In an implementation that includes EL2: 

• When an SVC instruction is executed in Hyp mode, the Supervisor Call exception is taken to Hyp mode. For 
more information see SVC on page F5-3129. 

• When the HCR.TGE bit is set to 1, the Supervisor Call exception generated by execution of an SVC instruction 
in Non-secure User mode is routed to Hyp mode. For more information, see Supervisor Call exception, when 
the value of HCR.TGE is 1 on page G1-3829. 

By default, a Supervisor Call exception is taken to Supervisor mode, but a Supervisor Call exception can be taken 
to EL2, meaning it is taken to Hyp mode if EL2 is using AArch32, see The PE mode to which the Supervisor Call 
exception is taken. 

The preferred return address for a Supervisor Call exception is the address of the next instruction after the SVC 
instruction. This return is performed as follows: 

• If returning from Secure or Non-secure Supervisor mode, the exception return uses the SPSR and LR_svc 
values generated by the exception entry, in an exception return instruction without subtraction. 

• If returning from Hyp mode, the exception return is performed by an ERET instruction, using the SPSR and 
ELR_hyp values generated by the exception entry. 

For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 
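
The mode selection described in this section can be sketched as follows. This is an illustration under stated assumptions, not the architecture pseudocode: it assumes an implementation that includes EL2, with EL1 and EL2 both using AArch32, and the function name, argument names, and mode strings are hypothetical. The figure for the Supervisor Call exception remains the authoritative description.

```python
def svc_target_mode(current_mode, non_secure, hcr_tge):
    """Return the PE mode to which a Supervisor Call exception is taken.

    current_mode is the mode executing the SVC instruction, for
    example "User", "Supervisor", or "Hyp".
    """
    if current_mode == "Hyp":
        # SVC executed in Hyp mode is taken to Hyp mode.
        return "Hyp"
    if non_secure and current_mode == "User" and hcr_tge == 1:
        # HCR.TGE routes SVC from Non-secure User mode to Hyp mode.
        return "Hyp"
    return "Supervisor"
```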

The PE mode to which the Supervisor Call exception is taken 

Figure G1-5 on page G1-3854 shows how the implementation, state, and configuration options determine the PE 
mode to which a Supervisor Call exception is taken. 

Figure G1-5 The PE mode the Supervisor Call exception is taken to 


See also UNPREDICTABLE cases when the value of HCR.TGE is 1 on page G1-3826. 


Pseudocode description of taking the Supervisor Call exception 


The AArch32.TakeSVCException() pseudocode procedure describes how the PE takes the exception. 


G1.15.5 Secure Monitor Call (SMC) exception 


The Secure Monitor Call exception is implemented only as part of EL3. When EL3 is using AArch32, the exception 
is taken to Monitor mode. 


The Secure Monitor Call instruction, SMC, requests a Secure Monitor function. When EL3 is using AArch32, 
executing an SMC instruction causes the PE to enter Monitor mode. For more information, see SMC on page F5-2983. 


Note 
In an implementation that includes EL2, execution of an SMC instruction in a Non-secure EL1 mode can be trapped 
to EL2. When EL2 is using AArch32, this means that when HCR.TSC is 1, execution of an SMC instruction in a 
Non-secure EL1 mode generates a Hyp Trap exception that is taken to Hyp mode. For more information see Traps 
to Hyp mode of Non-secure EL1 execution of SMC instructions on page G1-3901. 








The preferred return address for a Secure Monitor Call exception is the address of the next instruction after the SMC 
instruction. This return is performed using the SPSR and LR_mon values generated by the exception entry, using 
an exception return instruction without a subtraction. 


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 
Note 


The exception handler can return to the SMC instruction itself by returning using a subtraction of 4, without any 
adjustment to the SPSR.IT[7:0] bits. If it does this, the return occurs, then asynchronous exceptions might occur and 
be handled, then the SMC instruction is re-executed and another Secure Monitor Call exception occurs. 





This relies on: 


° The SMC instruction being used correctly, either outside an IT block or as the last instruction in an IT block, 
so that the SPSR.IT[7:0] bits indicate unconditional execution. 


° The Secure Monitor Call handler not changing the result of the original conditional execution test for the SMC 
instruction. 





The PE mode to which the Secure Monitor Call exception is taken 


The Secure Monitor Call exception is supported only as part of EL3. When EL3 is using AArch32, a Secure Monitor 
Call exception is taken to Monitor mode, using vector offset 0x08 from the Monitor exception base address. 


Note 


° An SMC instruction that is trapped to Hyp mode because HCR.TSC is set to 1 generates a Hyp Trap exception, 
see The PE mode to which the Hyp Trap exception is taken on page G1-3852. 





° If EL3 is using AArch64 then Security behavior in Exception levels using AArch32 when EL3 is using 
AArch64 on page G1-3824 describes the effect of executing an SMC instruction in a mode that is part of an 
Exception level that is using AArch32. 





Pseudocode description of taking the Secure Monitor Call exception 


The AArch32.TakeSMCException() pseudocode procedure describes how the PE takes the exception. 


G1.15.6 Hypervisor Call (HVC) exception 
The Hypervisor Call exception is implemented only as part of EL2. 


The Hypervisor Call instruction, HVC, requests a hypervisor function. When EL2 is using AArch32, executing an 
HVC instruction generates a Hypervisor Call exception that is taken to Hyp mode. For more information, see HVC 
on page F5-2677. 


The preferred return address for a Hypervisor Call exception is the address of the next instruction after the HVC 
instruction. The exception return is performed by an ERET instruction, using the SPSR and ELR_hyp values 
generated by the exception entry. 


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 


When EL2 is using AArch32, executing an HVC instruction transfers the immediate argument of the instruction to 
the HSR. The exception handler retrieves the argument from the HSR, and therefore does not have to access the 
original HVC instruction. For more information see Use of the HSR on page G4-4137. 
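
A handler might recover the HVC immediate from the syndrome value as sketched below. The field positions used here (EC in bits[31:26], IL in bit[25], ISS in bits[24:0], with the 16-bit immediate in ISS[15:0]) and the EC value 0b010010 for an HVC instruction are taken from the HSR register description rather than from this section, so treat them as assumptions to verify against Use of the HSR. The function names are hypothetical.

```python
HSR_EC_HVC = 0b010010  # assumed EC encoding for HVC instruction execution


def decode_hsr(hsr):
    """Split a 32-bit HSR value into its (EC, IL, ISS) fields."""
    ec = (hsr >> 26) & 0x3F
    il = (hsr >> 25) & 0x1
    iss = hsr & 0x1FFFFFF
    return ec, il, iss


def hvc_immediate(hsr):
    """Return the 16-bit HVC immediate, or None if the syndrome does
    not report an HVC instruction."""
    ec, _il, iss = decode_hsr(hsr)
    if ec != HSR_EC_HVC:
        return None
    return iss & 0xFFFF
```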


The PE mode to which the Hypervisor Call exception is taken 


The Hypervisor Call exception is supported only as part of EL2. When EL2 is using AArch32, a Hypervisor Call 
exception is taken to Hyp mode, using a vector offset that depends on the mode from which the exception is taken, 
as Figure G1-6 on page G1-3856 shows. This offset is from the Hyp exception base address. 
Figure G1-6 The PE mode the Hypervisor Call exception is taken to 


Pseudocode description of taking the Hypervisor Call exception 


The AArch32.TakeHVCException() pseudocode procedure describes how the PE takes the exception. 

















G1.15.7 Prefetch Abort exception 

A Prefetch Abort exception can be generated by: 

• A synchronous memory abort on an instruction fetch. 

  Note 
  Asynchronous external aborts on instruction fetches are reported as SError interrupts using the Data Abort 
  exception, see Data Abort exception on page G1-3859. 

  A Prefetch Abort exception entry is synchronous to the instruction whose fetch aborted. 

  For more information about memory aborts see VMSAv8-32 memory aborts on page G4-4110. 

• A Breakpoint, Vector Catch or Breakpoint Instruction exception, see Chapter G2 AArch32 Self-hosted 
Debug. 

Note 
If an implementation fetches instructions speculatively, it must handle a synchronous abort on such an instruction 
fetch by: 
• Generating a Prefetch Abort exception only if the instruction would be executed in a simple sequential 
execution of the program. 
• Ignoring the abort if the instruction would not be executed in a simple sequential execution of the program. 

By default, when EL1 is using AArch32, a Prefetch Abort exception is taken to Abort mode, but a Prefetch Abort 
exception can be taken to: 
• EL2, meaning it is taken to Hyp mode if EL2 is using AArch32. 
• EL3, meaning it is taken to Monitor mode if EL3 is using AArch32. 

For more information, see The PE mode to which the Prefetch Abort exception is taken on page G1-3858. 

The preferred return address for a Prefetch Abort exception is the address of the aborted instruction. This return is 
performed as follows: 
• If returning from a mode other than Hyp mode, using the SPSR and LR values generated by the exception 
entry, using an exception return instruction with a subtraction of 4. This means using: 
  — SPSR_abt and LR_abt if returning from Abort mode. 
  — SPSR_mon and LR_mon if returning from Monitor mode. 
• If returning from Hyp mode, using the SPSR_hyp and ELR_hyp values generated by the exception entry, 
using an ERET instruction. 

For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 


Prefetch Abort exception reporting a PC alignment fault exception 


A PC alignment fault exception that is taken to an Exception level that is using AArch32 is reported as a Prefetch 
Abort exception, and: 


If the exception is taken to EL1 using AArch32 or EL3 using AArch32 


The IFSR indicates the cause of the exception: 
— If the value of TTBCR.EAE is 0, IFSR.FS takes the value 0b00001. 
— If the value of TTBCR.EAE is 1, IFSR.STATUS takes the value 0b100001. 


IFAR holds the value of the address that faulted, including the misaligned low order bit or 
bits. 


R14_abt holds the address that faulted, including the misaligned low order bit or bits, with 
the standard offset for a Prefetch Abort exception. 


If the exception is taken to EL2 using AArch32 


HSR.EC takes the value 0b100010. 
HSR.IL is UNKNOWN. 
HSR.ISS is RES0. 


HIFAR and ELR_hyp each hold the value of the address that faulted, including the 
misaligned low order bit or bits. 
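
The reporting rules above can be collected into a small lookup. The following Python is a sketch only; the function name and the dictionary layout are hypothetical, while the register fields and values are those listed above.

```python
def pc_alignment_fault_report(target_el, ttbcr_eae=0):
    """Return where and how a PC alignment fault is reported when it
    is taken to target_el using AArch32.

    ttbcr_eae is the value of TTBCR.EAE and matters only for EL1 and EL3.
    """
    if target_el in (1, 3):
        if ttbcr_eae:
            return {"register": "IFSR", "field": "STATUS", "value": 0b100001,
                    "address_register": "IFAR"}
        return {"register": "IFSR", "field": "FS", "value": 0b00001,
                "address_register": "IFAR"}
    if target_el == 2:
        return {"register": "HSR", "field": "EC", "value": 0b100010,
                "address_register": "HIFAR"}
    raise ValueError("exception must be taken to EL1, EL2, or EL3")
```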


For a PC alignment fault exception taken to an Exception level that is using AArch32: 


• If the exception occurred because of the CONSTRAINED UNPREDICTABLE behavior of a branch to an unaligned 
PC value, as described in Branching to an unaligned PC on page K1-5458, then bit[0] of the faulting address 
is forced to zero, and therefore the misalignment is because the value of bit[1] of this address is 1. 

• If the exception occurred on an exit from Debug state, as described in Exiting Debug state on page H2-4880, 
then it is CONSTRAINED UNPREDICTABLE whether bit[0] of the faulting address is forced to zero. 
The PE mode to which the Prefetch Abort exception is taken 


Figure G1-7 shows how the implementation, state, and configuration options determine the PE mode to which a 
Prefetch Abort exception is taken. 

Figure G1-7 The PE mode the Prefetch Abort exception is taken to 


See also UNPREDICTABLE cases when the value of HCR.TGE is 1 on page G1-3826. 


Pseudocode description of taking the Prefetch Abort exception 


The AArch32.TakePrefetchAbortException() pseudocode procedure describes how the PE takes the exception. 
G1.15.8 Data Abort exception 

A Data Abort exception can be generated by: 

° A synchronous abort on a data read or write memory access. Exception entry is synchronous to the instruction 
that generated the memory access. 

° An SError interrupt. The SError interrupt might be caused by an external abort on a memory access, which 
can be any of: 
— A data read or write access. 
— An instruction fetch. 
— In a VMSA memory system, a translation table access. 
Exception entry occurs asynchronously. 
As described in Asynchronous exception masking controls on page G1-3842, SError interrupts can be 
masked. When this happens, a generated SError interrupt is not taken until it is not masked. 

° A watchpoint, see Watchpoint exceptions on page G2-3961. 

By default, when EL1 is using AArch32, a Data Abort exception is taken to Abort mode, but a Data Abort exception 
can be taken to: 

° EL2, meaning it is taken to Hyp mode if EL2 is using AArch32. 

° EL3, meaning it is taken to Monitor mode if EL3 is using AArch32. 

For more information see The PE mode to which the Data Abort exception is taken on page G1-3860. 

For more information about memory aborts see VMSAv8-32 memory aborts on page G4-4110. 

The preferred return address for a Data Abort exception is the address of the instruction that generated the aborting 
memory access, or the address of the instruction following the instruction boundary at which an SError interrupt 
exception was taken. This return is performed as follows: 

° If returning from a mode other than Hyp mode, using the SPSR and LR values generated by the exception 
entry, using an exception return instruction with a subtraction of 8. This means using: 
— SPSR_abt and LR_abt if returning from Abort mode. 
— SPSR_mon and LR_mon if returning from Monitor mode. 

° If returning from Hyp mode, using the SPSR_hyp and ELR_hyp values generated by the exception entry, 
using an ERET instruction. 

For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 
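
The two return paths described above can be sketched as a small model. This is a minimal illustration and not ARM pseudocode: the function name and the mode labels are hypothetical, and the arithmetic only mirrors the subtraction of 8 stated in the text.

```python
def data_abort_return_address(handler_mode, lr=None, elr_hyp=None):
    """Model the address at which execution resumes after a Data Abort handler.

    handler_mode is a hypothetical label: "abt", "mon", or "hyp".
    lr stands for LR_abt or LR_mon, elr_hyp for ELR_hyp.
    """
    if handler_mode == "hyp":
        # ERET returns to the address held in ELR_hyp, with no subtraction.
        return elr_hyp & 0xFFFFFFFF
    if handler_mode in ("abt", "mon"):
        # The exception return instruction subtracts 8 from the banked LR.
        return (lr - 8) & 0xFFFFFFFF
    raise ValueError("unexpected handler mode: " + handler_mode)
```

In this model the link register value saved on entry to Abort mode or Monitor mode is 8 bytes beyond the preferred return address, which is why the handler subtracts 8 on return.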
The PE mode to which the Data Abort exception is taken 


Figure G1-8 shows the determination of the mode to which a Data Abort exception is taken. 


Figure G1-8 The PE mode the Data Abort exception is taken to 


See also UNPREDICTABLE cases when the value of HCR.TGE is 1 on page G1-3826. 


Pseudocode description of taking the Data Abort exception 


The AArch32.TakeDataAbortException() pseudocode procedure describes how the PE takes the exception. 


Effects of data-aborted instructions 


An instruction that accesses data memory can modify memory by storing one or more values. If the execution of 
such an instruction generates a Data Abort exception, or causes Debug state entry because of a watchpoint set on 
the instruction, the value of each memory location that the instruction stores to is: 


° Unchanged for any location for which one of the following applies: 
— An MMU fault is generated. 
— A Watchpoint is generated. 
— An external abort is generated, if that external abort is taken synchronously. 


° UNKNOWN for any location for which no exception and no debug event is generated. 


If the access to a memory location generates an external abort that is taken asynchronously, it is outside the scope 
of the architecture to define the effect of the store on that memory location, because this depends on the 
system-specific nature of the external abort. However, in general, ARM recommends that such locations are 
unchanged. 


For external aborts and Watchpoints, where in principle faulting could be identified at byte or halfword granularity, 
the size of a location in this definition is the size for which a memory access is single-copy atomic. 
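
The per-location rules above can be summarized as a small classification function. This is a sketch under stated assumptions: the enum, the function name, and the "sync"/"async" labels are hypothetical and only restate the cases listed in this section.

```python
from enum import Enum


class StoreOutcome(Enum):
    UNCHANGED = "unchanged"
    UNKNOWN = "UNKNOWN"
    NOT_ARCHITECTED = "outside the scope of the architecture"


def aborted_store_outcome(mmu_fault=False, watchpoint=False, external_abort=None):
    """Classify one memory location stored to by a data-aborted instruction.

    external_abort is None, "sync", or "async" (hypothetical labels).
    """
    if mmu_fault or watchpoint or external_abort == "sync":
        # A fault, Watchpoint, or synchronous external abort on this location.
        return StoreOutcome.UNCHANGED
    if external_abort == "async":
        # System-specific; ARM recommends, but does not require, unchanged.
        return StoreOutcome.NOT_ARCHITECTED
    # No exception and no debug event was generated for this location.
    return StoreOutcome.UNKNOWN
```

The function is applied once per location, where a location has the size for which a memory access is single-copy atomic.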


In AArch32 state, instructions that access data memory can modify registers in the following ways: 


° By loading values into one or more of the general-purpose registers. The registers loaded can include the PC. 
° By loading values into one or more of the registers in the Advanced SIMD and floating-point register file. 
° By specifying base register writeback, in which the base register used in the address calculation has a 
modified value written to it. All instructions that support base register writeback have CONSTRAINED 
UNPREDICTABLE results if base register writeback is specified with the PC as the base register. Only 
general-purpose registers can be modified reliably in this way. 


° By a direct or indirect write to one or more System registers, for example: 


— An LDC instruction that writes to DBGDTRTXint using a value read from memory is a direct write to 
the System register DBGDTRTXint. 


— An STC instruction that reads DBGDTRRXint makes an indirect write to DBGDSCRint.RXfull. 
° By modifying PSTATE. 


If the execution of such an instruction generates a synchronous Data Abort exception, the following rules determine 
the values left in these registers: 


° On entry to the Data Abort exception handler: 
— The PC value is the Data Abort vector address, see Exception vectors and the exception base address 
on page G1-3812. 
— The LR_abt value is determined from the address of the aborted instruction. 
Neither value is affected by the results of any load specified by the instruction. 


° The base register is restored to its original value if either: 
— The aborted instruction is a load and the list of registers to be loaded includes the base register. 
— The base register is being written back. 


° If the instruction only loads one general-purpose register, the value in that register is unchanged. 


° If the instruction loads more than one general-purpose register, UNKNOWN values are left in destination 
registers other than the PC and the base register of the instruction. 


° If the instruction affects any registers in the Advanced SIMD and floating-point register file, UNKNOWN 
values are left in the registers that are affected. 


° PSTATE bits that are not defined as updated on exception entry retain their current value. 
° If the instruction is a STREX, STREXB, STREXH, or STREXD, <Rd> is not updated. 


After taking a Data Abort exception, the state of the exclusive monitors is UNKNOWN. Therefore, ARM strongly 
recommends that the abort handler performs a CLREX instruction, or a dummy STREX instruction, to clear the exclusive 
monitor state. 


The ARM abort model 


The abort model used by an ARM PE is described as a Base Restored Abort Model. This means that if a synchronous 
Data Abort exception is generated by executing an instruction that specifies base register writeback, the value in the 
base register is unchanged. 


The abort model applies uniformly across all instructions. 


G1.15.9 Virtual SError interrupt exception 


The Virtual SError interrupt exception is implemented only as part of EL2. 


A Virtual SError interrupt exception is generated if all of the following apply: 
° The PE is in a Non-secure mode other than Hyp mode. 
° The value of PSTATE.A is 0. 
° Either: 
— EL2 is using AArch32 and the values of the HCR.{TGE, AMO, VA} bits are {0, 1, 1}. 
— EL2 is using AArch64 and the values of the HCR_EL2.{TGE, AMO, VA} bits are {0, 1, 1}. 


The conditions for generating a Virtual SError interrupt exception mean the exception is always: 
° Taken from a Non-secure EL1 or EL0 mode. 
° Taken to Non-secure Abort mode. 


For more information see Virtual exceptions when an implementation includes EL2 on page G1-3839. 
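
The generation conditions listed for the Virtual SError interrupt exception can be written as a single predicate. This is an illustrative sketch, not the architectural pseudocode: the function and parameter names are hypothetical, and the three control bits are read from HCR or HCR_EL2 depending on the Execution state that EL2 is using.

```python
def virtual_serror_generated(non_secure, hyp_mode, pstate_a, tge, amo, va):
    """Return True if every listed condition for a Virtual SError holds.

    tge, amo and va model HCR.{TGE, AMO, VA} when EL2 is using AArch32,
    or HCR_EL2.{TGE, AMO, VA} when EL2 is using AArch64.
    """
    in_eligible_mode = non_secure and not hyp_mode
    not_masked = pstate_a == 0
    enabled_and_pending = (tge, amo, va) == (0, 1, 1)
    return bool(in_eligible_mode and not_masked and enabled_and_pending)
```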





Note 


Because the Virtual SError interrupt exception is always taken to Non-secure Abort mode, on exception entry the 
preferred return address is always saved to LR_abt. 





The preferred return address for a Virtual SError interrupt exception is the address of the instruction immediately 
after the instruction boundary where the exception was taken. This return is performed using the SPSR and LR_abt 
values generated by the exception entry, using an exception return instruction with a subtraction of 8. 


The PE mode to which the Virtual SError interrupt exception is taken 


The Virtual SError interrupt exception is supported only as part of EL2. A Virtual SError interrupt exception is taken 
from a Non-secure EL1 or EL0 mode, and is taken to Non-secure Abort mode, using a vector offset of 0x10 from 
the Non-secure exception base address. 


For more information about this exception see Virtual exceptions when an implementation includes EL2 on 
page G1-3839. 


Pseudocode description of taking the Virtual SError interrupt exception 


The AArch32.TakeVirtualSErrorException() pseudocode procedure describes how the PE takes the exception. 
G1.15.10 IRQ exception 


The IRQ exception is generated by IMPLEMENTATION DEFINED means. Typically this is by asserting an IRQ interrupt 
request input to the PE. 


When an IRQ exception is taken, exception entry is precise to an instruction boundary. 


As described in Asynchronous exception masking controls on page G1-3842, IRQ exceptions can be masked. When 
this happens, a generated IRQ exception is not taken until it is not masked. 


By default, when EL1 is using AArch32, an IRQ exception is taken to IRQ mode, but an IRQ exception can be taken 
to: 


° EL2, meaning it is taken to Hyp mode if EL2 is using AArch32. 
° EL3, meaning it is taken to Monitor mode if EL3 is using AArch32. 


For more information, see The PE mode to which the physical IRQ exception is taken. 


The preferred return address for an IRQ exception is the address of the instruction following the instruction 
boundary at which the exception was taken. This return is performed as follows: 


° If returning from a mode other than Hyp mode, using the SPSR and LR values generated by the exception 
entry, using an exception return instruction with a subtraction of 4. This means using: 


—  SPSR_irq and LR_irq if returning from IRQ mode. 


— SPSR_mon and LR_mon if returning from Monitor mode. 


° If returning from Hyp mode, using the SPSR_hyp and ELR_hyp values generated by the exception entry, 
using an ERET instruction. 


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 


The PE mode to which the physical IRQ exception is taken 


Figure G1-9 on page G1-3864 shows how the implementation, state, and configuration options determine the mode 
to which an IRQ exception is taken. 
Figure G1-9 The PE mode the IRQ exception is taken to 


Pseudocode description of taking the IRQ exception 


The AArch32.TakePhysicalIRQException() pseudocode procedure describes how the PE takes the exception. 


G1.15.11 Virtual IRQ exception 


The Virtual IRQ exception is implemented only as part of EL2. 


A Virtual IRQ exception is generated if all of the following apply: 


° The PE is in a Non-secure mode other than Hyp mode. 
° The value of PSTATE.I is 0. 
° Either: 
— EL2 is using AArch32 and the value of HCR.{TGE, IMO} is {0, 1}. 
— EL2 is using AArch64 and the value of HCR_EL2.{TGE, IMO} is {0, 1}. 
° One of the following applies: 
— EL2 is using AArch32 and the value of HCR.VI is 1. 
— EL2 is using AArch64 and the value of HCR_EL2.VI is 1. 
— A Virtual IRQ exception is generated by an IMPLEMENTATION DEFINED mechanism. 


The conditions for generating a Virtual IRQ exception mean the exception is always: 
° Taken from a Non-secure EL1 or EL0 mode. 
° Taken to Non-secure IRQ mode. 

For more information see Virtual exceptions when an implementation includes EL2 on page G1-3839. 
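
The four conditions listed for the Virtual IRQ exception combine as shown in the following sketch. The function and parameter names are hypothetical, and impdef_virq stands in for whatever IMPLEMENTATION DEFINED mechanism an implementation might provide.

```python
def virtual_irq_generated(non_secure, hyp_mode, pstate_i, tge, imo, vi,
                          impdef_virq=False):
    """Return True if every listed condition for a Virtual IRQ holds.

    tge, imo and vi model HCR.{TGE, IMO, VI}, or the HCR_EL2 equivalents
    when EL2 is using AArch64.
    """
    in_eligible_mode = non_secure and not hyp_mode
    not_masked = pstate_i == 0
    routing_enabled = (tge, imo) == (0, 1)
    # Pending either through the VI bit or through an
    # IMPLEMENTATION DEFINED source of virtual IRQs.
    pending = vi == 1 or bool(impdef_virq)
    return bool(in_eligible_mode and not_masked and routing_enabled and pending)
```

The only structural difference from a purely register-driven check is the last term: the pending indication can come from a source other than the VI bit.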


The preferred return address for a Virtual IRQ exception is the address of the instruction immediately after the 
instruction boundary where the exception was taken. This return is performed using the SPSR and LR_irq values 
generated by the exception entry, using an exception return instruction with a subtraction of 4. 


The PE mode to which the Virtual IRQ exception is taken 


The Virtual IRQ exception is supported only as part of EL2. A Virtual IRQ exception is taken from a Non-secure 
EL1 or EL0 mode, and is taken to Non-secure IRQ mode, using a vector offset of 0x18 from the Non-secure 
exception base address. 


For more information about this exception see Virtual exceptions when an implementation includes EL2 on 
page G1-3839. 


Pseudocode description of taking the Virtual IRQ exception 


The AArch32.TakeVirtualIRQException() pseudocode procedure describes how the PE takes the exception. 


G1.15.12 FIQ exception 


The FIQ exception is generated by IMPLEMENTATION DEFINED means. Typically this is by asserting an FIQ interrupt 
request input to the PE. 


When an FIQ exception is taken, exception entry is precise to an instruction boundary. 


As described in Asynchronous exception masking controls on page G1-3842, FIQ exceptions can be masked. When 
this happens, a generated FIQ exception is not taken until it is not masked. 


By default, an FIQ exception is taken to FIQ mode, but an FIQ exception can be taken to: 
° EL2, meaning it is taken to Hyp mode if EL2 is using AArch32. 
° EL3, meaning it is taken to Monitor mode if EL3 is using AArch32. 


For more information, see The PE mode to which the physical FIQ exception is taken on page G1-3866. 


The preferred return address for an FIQ exception is the address of the instruction following the instruction 
boundary at which the exception was taken. This return is performed as follows: 


° If returning from a mode other than Hyp mode, using the SPSR and LR values generated by the exception 
entry, using an exception return instruction with a subtraction of 4. This means using: 


—  SPSR_fiq and LR_fiq if returning from FIQ mode. 


— SPSR_mon and LR_mon if returning from Monitor mode. 


° If returning from Hyp mode, using the SPSR_hyp and ELR_hyp values generated by the exception entry, 
using an ERET instruction. 


For more information, see Exception return to an Exception level using AArch32 on page G1-3834. 
The PE mode to which the physical FIQ exception is taken 


Figure G1-10 shows how the implementation, state, and configuration options determine the PE 
mode to which an FIQ exception is taken. 


Figure G1-10 The PE mode the FIQ exception is taken to 


Pseudocode description of taking the FIQ exception 


The AArch32.TakePhysicalFIQException() pseudocode procedure describes how the PE takes the exception. 


G1.15.13 Virtual FIQ exception 


The Virtual FIQ exception is implemented only as part of EL2. 


A Virtual FIQ exception is generated if all of the following apply: 


° The PE is in a Non-secure mode other than Hyp mode. 
° The value of PSTATE.F is 0. 
° Either: 
— EL2 is using AArch32 and the value of HCR.{TGE, FMO} is {0, 1}. 
— EL2 is using AArch64 and the value of HCR_EL2.{TGE, FMO} is {0, 1}. 
° One of the following applies: 
— EL2 is using AArch32 and the value of HCR.VF is 1. 
— EL2 is using AArch64 and the value of HCR_EL2.VF is 1. 
— A Virtual FIQ exception is generated by an IMPLEMENTATION DEFINED mechanism. 


The conditions for generating a Virtual FIQ exception mean the exception is always: 
° Taken from a Non-secure EL1 or EL0 mode. 
° Taken to Non-secure FIQ mode. 

For more information see Virtual exceptions when an implementation includes EL2 on page G1-3839. 


The preferred return address for a Virtual FIQ exception is the address of the instruction immediately after the 
instruction boundary where the exception was taken. This return is performed using the SPSR and LR_fiq values 
generated by the exception entry, using an exception return instruction with a subtraction of 4. 


The PE mode to which the Virtual FIQ exception is taken 


The Virtual FIQ exception is supported only as part of EL2. A Virtual FIQ exception is taken from a Non-secure 
EL1 or EL0 mode, and is taken to Non-secure FIQ mode, using a vector offset of 0x1C from the Non-secure 
exception base address. 


For more information about this exception see Virtual exceptions when an implementation includes EL2 on 
page G1-3839. 
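
The target modes and vector offsets given for the three virtual exceptions in sections G1.15.9 to G1.15.13 can be collected in one lookup. This is a minimal sketch: the dictionary keys, the function name, and the example base address are hypothetical, and only the modes and offsets come from the text.

```python
# Target mode and vector offset for each virtual exception, as stated above.
VIRTUAL_EXCEPTION_VECTORS = {
    "virtual_serror": ("Abort", 0x10),
    "virtual_irq": ("IRQ", 0x18),
    "virtual_fiq": ("FIQ", 0x1C),
}


def virtual_exception_entry(kind, nonsecure_base):
    """Return (target mode, vector address) for a virtual exception.

    nonsecure_base models the Non-secure exception base address.
    """
    mode, offset = VIRTUAL_EXCEPTION_VECTORS[kind]
    return mode, (nonsecure_base + offset) & 0xFFFFFFFF
```

For example, with a hypothetical Non-secure exception base address of 0x8000, a Virtual FIQ exception is taken to FIQ mode at address 0x801C.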
Pseudocode description of taking the Virtual FIQ exception 


The AArch32.TakeVirtualFIQException() pseudocode procedure describes how the PE takes the exception. 





G1.15.14 Additional pseudocode functions for exception handling 
The AArch32.EnterMonitorMode() pseudocode function changes the PE mode to Monitor mode, with the required 
state changes. 
The AArch32.EnterHypMode() pseudocode function changes the PE mode to Hyp mode, with the required state 
changes. 
The AArch32.EnterMode() pseudocode function changes the PE mode to a PL1 mode, with the required state changes. 
It is used for all exceptions that are not routed to Hyp mode or Monitor mode. 
The AArch32.EnterMonitorMode(), AArch32.EnterHypMode(), and AArch32.EnterMode() functions are described in 
Chapter J1 ARMv8 Pseudocode. 
G1.16 Reset into AArch32 state 


Reset on page D1-1517 describes the ARMv8 reset model, including the defined levels of reset. When reset is 
deasserted, the PE starts executing instructions in the highest implemented Exception level. If that Exception level 
is using AArch32, then it starts execution: 


° In Secure state, if the implementation includes EL3. 
° With interrupts disabled: 
— In Hyp mode, if the highest implemented Exception level is EL2. 


— In Supervisor mode, otherwise. 
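
The starting conditions listed above can be expressed as a small function of the highest implemented Exception level. This is a sketch for illustration only: the function name and the dictionary keys are hypothetical, and a security state of None means that this rule does not determine it.

```python
def aarch32_reset_entry(highest_el):
    """Model the state in which execution starts after reset into AArch32.

    highest_el is the highest implemented Exception level: 1, 2, or 3.
    """
    if highest_el not in (1, 2, 3):
        raise ValueError("highest implemented Exception level must be 1, 2 or 3")
    return {
        # Secure state applies when the implementation includes EL3.
        "security_state": "Secure" if highest_el == 3 else None,
        "interrupts_disabled": True,
        # Hyp mode only when EL2 is the highest level, Supervisor otherwise.
        "mode": "Hyp" if highest_el == 2 else "Supervisor",
    }
```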


Note 


° This section describes the architectural requirements for a reset into AArch32 state. It takes no account of 
whether ARM licenses any particular combination of Exception levels and Execution state. For more 
information about the licensed combinations see Support for Exception levels and Execution states on 
page D1-1620. 





° The Execution state in which the highest implemented Exception level starts executing instructions on 
coming out of reset might be determined by a configuration input signal. 





Reset returns some PE state to architecturally-defined or IMPLEMENTATION DEFINED values, and makes other state 
UNKNOWN, as described in PE state on reset into AArch32 state on page G1-3869. For more information about 
behavior when resetting into an Exception level using AArch32, see: 


° Behavior of caches at reset on page G3-3995. 

° Enabling stages of address translation on page G4-4033. 
° TLB behavior at reset on page G4-4090. 

° Reset and debug on page H6-4955. 


When reset is deasserted, if the PE resets into an Exception level that is using AArch32, it is IMPLEMENTATION 
DEFINED whether execution starts: 





° From an IMPLEMENTATION DEFINED address. 

° If reset is into EL3 or EL1, from the low or high reset vector address, as determined by the reset value of the 
SCTLR.V bit. This reset value can be determined by an IMPLEMENTATION DEFINED configuration input 
signal. 

Note 


This option might be implemented for compatibility with earlier versions of the architecture. 
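
The second option can be sketched as a lookup on the reset value of SCTLR.V. The addresses are not given in this section: 0x00000000 and 0xFFFF0000 are the conventional AArch32 low and high exception vector bases and are used here as an assumption, as are the names in the code.

```python
# Conventional AArch32 vector bases (assumption, not stated in this section).
LOW_VECTORS_BASE = 0x00000000
HIGH_VECTORS_BASE = 0xFFFF0000


def legacy_reset_vector(sctlr_v_reset_value):
    """Return the reset vector address selected by the reset value of SCTLR.V.

    The reset vector is at offset 0 from the selected vector base.
    """
    return HIGH_VECTORS_BASE if sctlr_v_reset_value else LOW_VECTORS_BASE
```

An implementation that starts execution from an IMPLEMENTATION DEFINED address instead is not covered by this sketch.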





Software might be able to identify the reset address: 


° If reset is into EL3, by reading the reset value of MVBAR. That is, after coming out of reset, by reading 
MVBAR before the boot software has updated it. It is IMPLEMENTATION DEFINED whether this discovery 
mechanism is supported. 


° If reset is into EL2 or EL1, by reading RVBAR. RVBAR can only be implemented at the highest implemented 
Exception level, and only if that Exception level is not EL3. 


If RVBAR is not implemented, and at all Exception levels other than the highest implemented Exception level, the 
encoding for RVBAR is UNDEFINED. 


The ARM architecture does not define any way of returning to a previous Execution state from a reset. 







Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




G1.16.1 PE state on reset into AArch32 state 





Note 


See the ARM® Generic Interrupt Controller Architecture Specification, GIC architecture version 3.0 and version 4.0 
for the reset requirements for GIC System registers. 





Immediately after a reset, much of the PE state is UNKNOWN. However, some of the PE state is defined. If the PE 
resets to AArch32 state using either a Cold or a Warm reset, the PE state that is defined is as follows: 


If reset is into EL3 using AArch32 then all fields of the SCR reset to zero. 


Note 
This means SCR.NS correctly indicates that the PE is in Secure state. 








If reset is into EL2 using AArch32 then reset is into Hyp mode and CPSR.M resets to 0b1010, otherwise reset 
is into Supervisor mode and CPSR.M resets to 0b0011. 


CPSR.IL resets to 0. 


The CPACR.{cp11, cp10} fields reset to zero, and if CPACR.ASEDIS is implemented as an RW field it resets 
to zero. 





Note 
When CPACR.TRCDIS is a RW field, its reset value is architecturally UNKNOWN. 





PSTATE is reset to the values defined by the AArch32.TakeReset() pseudocode function, see Pseudocode 
descriptions of reset on page G1-3871. 


The FPEXC.EN field resets to 0. 


In the SCTLR: 
— The {AFE, TRE, UWXN, WXN, I, SED, ITD, C, A, M} fields reset to 0. 
— The {nTWE, nTWI, CP15BEN} fields reset to 1. 


— The {TE, EE, V} fields reset to IMPLEMENTATION DEFINED values, see the register description for 
more information. 


When the reset is to EL3 using AArch32 then these reset values apply only to the Secure instance of the 
SCTLR, and the reset value of the Non-secure SCTLR is architecturally UNKNOWN. 

All fields of the TTBCR reset to 0. 

When the reset is to EL3 using AArch32 then: 

— All fields of the Secure TTBCR reset to 0. 

— In the Non-secure TTBCR, the EAE field resets to 0, and the reset values of all other fields are 
architecturally UNKNOWN. 

The VBAR resets to an IMPLEMENTATION DEFINED value. 

When the reset is to EL3 using AArch32 then this reset value applies only to the Secure instance of the 
register, and the reset value of the Non-secure VBAR is architecturally UNKNOWN. 

All fields of the DBGDCCINT reset to 0. 

The DBGDSCRext.{MDBGen, UDCCdis} fields reset to 0. 


The DBGOSDLR.DLK field resets to 0. 
In addition: 


If the reset is into EL1 using AArch32 


In the RMR (at EL1) register, the RR field resets to 0 on any warm or cold reset, and the 
AA64 field resets to 0 on a cold reset. 


If the reset is into EL2 using AArch32 


In the HRMR, the RR field resets to 0 on any warm or cold reset, and the AA64 field resets 
to 0 on a cold reset. 


The HSCTLR.{I, C, M} fields all reset to 0, and the HSCTLR.EE field resets to an 
IMPLEMENTATION DEFINED value. 


If the reset is into EL2 using AArch32 or into EL3 using AArch32 


For a reset into EL3 using AArch32 these reset values apply only if the implementation includes 
EL2, see the register descriptions for more information. 


All fields of the HCPTR reset to zero. 
All fields of the HCR reset to zero. 

The HCR2.{ID, CD} fields reset to zero. 
All fields of the HSTR reset to zero. 


The VMPIDR resets to the value of the MPIDR, see the register description for more 
information. 


The VPIDR resets to the value of the MIDR, see the register description for more 
information. 


The VTTBR.VMID field resets to zero. 

In the HDCR: 

— The HPMN field resets to the IMPLEMENTATION DEFINED value of PMCR.N. 
— The reset value of the HPME field is architecturally UNKNOWN. 

— All other fields reset to 0. 


If the reset is into EL3 using AArch32 


The MVBAR resets to an IMPLEMENTATION DEFINED value, see the register description for 
more information. 


If the NSACR.{NSTRCDIS, NSASEDIS} fields are RW fields then they reset to 0. 


In the RMR (at EL3) register, the RR field resets to 0 on any warm or cold reset, and the 
AA64 field resets to 0 on a cold reset. 


All fields of the SCR reset to zero. 
All fields of the SDER reset to 0. 
All fields of the SDCR reset to zero. 


For either a warm or a cold reset 


The EDPRSR.SR field resets to 1. 
The EDESR.{SS, RC, OSUC} fields reset to 0. 


For a cold reset only 


The EDSCR.{RXO, TXU, INTdis, TDA, MA, HDE, ERR, RXfull, TXfull} fields reset to 0. 
The EDECCR.{NSE, SE} fields reset to 0. 

The EDPRSR.{SPMAD, SDAD} fields reset to 0, and the EDPRSR.SPD field resets to 1. 
The DBGOSLSR.OSLK field resets to 1. 

The DBGPRCR.CORENPDRQ field resets to the value of EDPRCR.COREPURQ. 
An External Debug reset sets EDPRCR.COREPURQ to 0, see External debug register resets 
on page H8-4981. If an External Debug reset and a cold reset coincide, both 
DBGPRCR.CORENPDRQ and EDPRCR.COREPURQ are reset to 0. 


° The debug CLAIM bits are reset to 0. 


Note 


These are the bits that are set to 1 by writing to DBGCLAIMSET.CLAIM, and reset to 0 by 
writing to DBGCLAIMCLR.CLAIM. 














G1.16.2 Pseudocode descriptions of reset 
The AArch32.TakeReset() pseudocode procedure describes how the PE behaves when reset is deasserted. 
The AArch32.ResetGeneralRegisters() pseudocode function resets the general-purpose registers. 
The AArch32.ResetSIMDFPRegisters() pseudocode function resets the SIMD and floating-point registers. 
The AArch32.ResetSpecialRegisters() pseudocode function resets the Special-purpose registers, and the debug 
System registers DLR and DSPSR, that are used for handling Debug exceptions. 
The AArch32.ResetSystemRegisters() pseudocode function resets all System registers in the (coproc==0b111x) 
encoding space to their reset state as defined in the register descriptions in Chapter G6 AArch32 System Register 
Descriptions. 
Note 
The ResetSystemRegisters() function only resets the System registers. It has no effect on memory-mapped registers. 
The ResetExternalDebugRegisters() pseudocode function resets all external debug registers to their reset state as 
defined in the register descriptions in Chapter H9 External Debug Register Descriptions. 
G1.17 Mechanisms for entering a low-power state 


The following sections describe the architectural mechanisms that a PE can use to request entry to a low-power 
state: 


° Wait For Event and Send Event. 
° Wait For Interrupt on page G1-3875. 


G1.17.1 Wait For Event and Send Event 


The Wait For Event (WFE) mechanism permits a PE to request entry to a low-power state, and, if the request 
succeeds, to remain in that state until an event is generated by a Send Event operation, or another WFE wake-up 
event occurs. Example G1-2 describes how a spinlock implementation might use this mechanism to save energy. 


Example G1-2 Spinlock as an example of using Wait For Event and Send Event 


A multiprocessor operating system requires locking mechanisms to protect data structures from being accessed 
simultaneously by multiple PEs. These mechanisms prevent the data structures becoming inconsistent or corrupted 
if different PEs try to make conflicting changes. If a lock is busy, because a data structure is being used by one PE, 
it might not be practical for another PE to do anything except wait for the lock to be released. For example, if a PE 
is handling an interrupt from a device it might need to add data received from the device to a queue. If another PE 
is removing data from the queue, it will have locked the memory area that holds the queue. The first PE cannot add 
the new data until the queue is in a consistent state and the lock has been released. It cannot return from the interrupt 
handler until the data has been added to the queue, so it must wait. 


Typically, a spin-lock mechanism is used in these circumstances: 


° A PE requiring access to the protected data attempts to obtain the lock using single-copy atomic 
synchronization primitives such as the Load-Exclusive and Store-Exclusive operations described in 
Synchronization and semaphores on page E2-2355. 


° If the PE obtains the lock it performs its memory operation and releases the lock. 


° If the PE cannot obtain the lock, it reads the lock value repeatedly in a tight loop until the lock becomes 
available. At this point it again attempts to obtain the lock. 


A spin-lock mechanism is not ideal for all situations: 
° In a low-power system the tight read loop is undesirable because it uses energy to no effect. 


° In a multithreaded implementation the execution of spin-locks by waiting threads can significantly degrade 
overall performance. 


Using the Wait For Event and Send Event mechanism can improve the energy efficiency of a spinlock. In this 
situation, a PE that fails to obtain a lock can execute a Wait For Event instruction, WFE, to request entry to a 
low-power state. When a PE releases a lock, it must execute a Send Event instruction, SEV, causing any waiting PEs 
to wake up. Then, these PEs can attempt to gain the lock again. 


The execution of a WFE instruction can cause suspension of execution only if all of the following are true: 
° The instruction does not cause any other exception. 
° When the instruction is executed: 
— The Event Register is not set. 
— There is not a pending WFE wake-up event. 


For more information about the trap to EL2 see Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE 
and WFI instructions on page G1-3904. 


The architecture does not define the exact nature of the low power state entered as a result of executing a WFE 
instruction, but the execution of a WFE instruction must not cause a loss of memory coherency. 
Note 


Although a complex operating system can contain thousands of distinct locks, the event sent by this mechanism does 
not indicate which lock has been released. If the event relates to a different lock, or if another PE acquires the lock 
more quickly, the PE fails to acquire the lock and can reenter the low-power state waiting for the next event. 








The Wait For Event system relies on hardware and software working together to achieve energy saving: 
° The hardware provides the mechanism to enter the Wait For Event low-power state. 


° The operating system software is responsible for issuing: 


— A Wait For Event instruction, to request entry to the low-power state, used in the example when 
waiting for a spin-lock. 


— A Send Event instruction, required in the example when releasing a spin-lock. 


The mechanism depends on the interaction of: 
° WFE wake-up events, see WFE wake-up events on page G1-3874. 
° The Event Register, see The Event Register. 
° The Send Event instructions, see The Send Event instructions on page G1-3874. 
° The Wait For Event instruction, see The Wait For Event instruction. 


The Event Register 


The Event Register is a single bit register for each PE. When set, an event register indicates that an event has 
occurred, since the register was last cleared, that might require some action by the PE. Therefore, the PE must not 
suspend operation on issuing a WFE instruction. 


The reset value of the Event Register is UNKNOWN. 


The Event Register for a PE is set by: 


° The execution of an SEV instruction on any PE in the multiprocessor system. 

° The execution of an SEVL instruction by the PE. 

° An exception return. 

° An event from a Generic Timer event stream, see Event streams on page G5-4223. 
° An event sent by some IMPLEMENTATION DEFINED mechanism. 


As shown in this list, the Event Register might be set by IMPLEMENTATION DEFINED mechanisms. 
The Event Register is cleared only by a Wait For Event instruction. 


Software cannot read or write the value of the Event Register directly. 


The Wait For Event instruction 
The action of the Wait For Event instruction depends on the state of the Event Register: 


° If the Event Register is set, the instruction clears the register and completes immediately. Normally, if this 
happens the software makes another attempt to claim the lock. 


° If the Event Register is clear the PE can suspend execution, and hardware might enter a low-power state. The 
PE can remain suspended until a WFE wake-up event or a reset occurs. When a WFE wake-up event occurs, 
or earlier if the implementation chooses, the WFE instruction completes. 


The execution in AArch32 state of a WFE instruction that would otherwise cause suspension of execution might be 
trapped, see: 


• Traps to Undefined mode of EL0 execution of WFE and WFI instructions on page G1-3888.
• Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page G1-3904.
• Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than Monitor mode on
  page G1-3915.

The Wait For Event instruction, WFE, is available at all privilege levels, see WFE on page F5-3222.


Software using the Wait For Event mechanism must tolerate spurious wake-up events, including multiple wake ups. 
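
The requirement to tolerate spurious wake-ups can be illustrated with a small Python model of a lock-acquire loop. This is only a sketch under stated assumptions: the lock is a plain dictionary, the wake-up sources are ordinary callables standing in for whatever happened before the PE resumed, and the names `acquire_with_wfe`, `spurious`, and `release` are hypothetical. The point it shows is that the lock is re-tested after every wake-up, because a wake-up does not indicate which lock, if any, was released.

```python
def acquire_with_wfe(lock, wake_events):
    """Take `lock` and return how many times the PE had to wait.

    `lock` is a dict of the form {"held": bool}. `wake_events` is an
    iterator of callables; calling the next one stands in for executing
    WFE and then being woken by some event, which may be unrelated to
    this lock.
    """
    waits = 0
    while True:
        if not lock["held"]:      # always re-test the lock after waking
            lock["held"] = True
            return waits
        next(wake_events)()       # stand-in for WFE followed by a wake-up
        waits += 1


lock = {"held": True}             # another PE currently owns the lock


def spurious():
    """A wake-up that does not release the lock."""


def release():
    """The owning PE releases the lock and sends an event."""
    lock["held"] = False


waits = acquire_with_wfe(lock, iter([spurious, spurious, release]))
```

In this run the modelled PE wakes twice without being able to claim the lock and succeeds only after the third wake-up.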


WFE wake-up events 


The following events are WFE wake-up events: 


• The execution of an SEV instruction on any PE in the system.
• The execution of an SEVL instruction on the PE.
• A physical IRQ interrupt, unless masked by the PSTATE.I bit.
• A physical FIQ interrupt, unless masked by the PSTATE.F bit.
• A physical SError interrupt, unless masked by the PSTATE.A bit.
• In Non-secure state in any mode other than Hyp mode:
  — When HCR.IMO is set to 1, a virtual IRQ interrupt, unless masked by the PSTATE.I bit.
  — When HCR.FMO is set to 1, a virtual FIQ interrupt, unless masked by the PSTATE.F bit.
  — When HCR.AMO is set to 1, a virtual SError interrupt, unless masked by the PSTATE.A bit.
• An asynchronous External Debug Request debug event, if halting is allowed. For the definition of halting is
  allowed see Halting allowed and halting prohibited on page H2-4845.

  See also External Debug Request debug event on page H3-4900.
• An event sent by the timer event stream, see Event streams on page D6-1882.
• An event sent by some IMPLEMENTATION DEFINED mechanism.
• An event caused by the clearing of the global monitor associated with the PE.


In addition to the possible masking of WFE wake-up events shown in this list, when invasive debug is enabled and 
EDSCR.HDE is set to 1, EDSCR.INTdis can mask interrupts, including masking them acting as WFE wake-up 
events. For more information, see EDSCR, External Debug Status and Control Register on page H9-5064. 


As shown in the list of wake-up events, an implementation can include IMPLEMENTATION DEFINED hardware 
mechanisms to generate wake-up events. 


Note 





For more information about PSTATE masking see Asynchronous exception masking controls on page G1-3842. If
the configuration of the masking controls provided by EL2 and EL3 means that a PSTATE mask bit cannot mask the
corresponding exception, then the physical exception is a WFE wake-up event, regardless of the value of the
PSTATE mask bit.





The Send Event instructions 


The Send Event instructions are: 


SEV, Send Event

This causes an event to be signaled to all PEs in the multiprocessor system.


SEVL, Send Event Local

This must set the local Event Register. It might signal an event to other PEs, but is not
required to do so.


The mechanism that signals an event to other PEs is IMPLEMENTATION DEFINED. The PE is not required to guarantee 
the ordering of this event with respect to the completion of memory accesses by instructions before the SEV 
instruction. Therefore, ARM recommends that software includes a DSB instruction before any SEV instruction. 


Note 





A DSB instruction ensures that no instruction, including any SEV instruction, that appears in program order after the 
DSB instruction, can execute until the DSB instruction has completed. For more information, see Data Synchronization 
Barrier (DSB) on page E2-2337. 

The SEVL instruction appears to execute in program order relative to any subsequent WFE instruction executed on the
same PE, without the need for any explicit insertion of barrier instructions.


Execution of the Send Event instruction sets the Event Register. 


The Send Event instructions are available at all privilege levels. 


Pseudocode description of the Wait For Event mechanism 
This section defines pseudocode functions that describe the operation of the Wait For Event mechanism. 
The ClearEventRegister() pseudocode procedure clears the Event Register of the current PE. 


The EventRegistered() pseudocode function returns TRUE if the Event Register of the current PE is set and FALSE 
if it is clear. 


The WaitForEvent() pseudocode procedure optionally suspends execution until a WFE wake-up event or reset 
occurs, or until some earlier time if the implementation chooses. It is IMPLEMENTATION DEFINED whether restarting 
execution after the period of suspension causes a ClearEventRegister() to occur. 


The SendEvent() pseudocode procedure sets the Event Register of every PE in the system. 
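
The behavior described by these pseudocode functions can be modelled in a few lines of Python. The following is a minimal sketch, not the architectural pseudocode: the class and function names are hypothetical, the Event Register is modelled as a Boolean that starts clear even though its architectural reset value is UNKNOWN, and suspension is represented only by a return value.

```python
class PE:
    """Model of one processing element with a one-bit Event Register."""

    def __init__(self):
        # Architecturally the reset value is UNKNOWN; clear is assumed here.
        self.event_register = False

    def clear_event_register(self):
        self.event_register = False

    def event_registered(self):
        return self.event_register


def send_event(pes):
    """Model of SEV: set the Event Register of every PE in the system."""
    for pe in pes:
        pe.event_register = True


def send_event_local(pe):
    """Model of SEVL: set the Event Register of the executing PE only."""
    pe.event_register = True


def wait_for_event(pe):
    """Model of WFE on `pe`.

    Returns "completed" if the Event Register was set (it is cleared),
    or "may_suspend" if the register was clear and the PE is therefore
    permitted to suspend execution.
    """
    if pe.event_registered():
        pe.clear_event_register()
        return "completed"
    return "may_suspend"


system = [PE(), PE()]
first = wait_for_event(system[0])    # register clear, PE may suspend
send_event(system)                   # SEV reaches both PEs
second = wait_for_event(system[0])   # register set, WFE completes at once
third = wait_for_event(system[0])    # register was cleared by previous WFE
```

The SEVL followed by WFE idiom at the top of a wait loop relies on the same rule: the first WFE consumes the locally set event and completes immediately.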


G1.17.2 Wait For Interrupt 


AArch32 state supports Wait For Interrupt through an instruction, WFI, that is provided in the A32 and T32 
instruction sets. For more information, see WFI on page F5-3224. 


When a PE issues a WFI instruction, its execution can be suspended, and a low-power state can be entered. 


The execution in AArch32 state of a WFI instruction that would otherwise cause suspension of execution might be 
trapped, see: 


• Traps to Undefined mode of EL0 execution of WFE and WFI instructions on page G1-3888.
• Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page G1-3904.
• Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than Monitor mode on
  page G1-3915.


The execution of a WFI instruction can cause suspension of execution only if both: 
• The instruction does not cause any other exception.
• When the instruction is executed, there is not a pending WFI wake-up event.


WFI wake-up events 


The PE can remain suspended in its WFI state until it is reset, or one of the following WFI wake-up events occurs: 
• A physical IRQ interrupt, regardless of the value of the PSTATE.I bit.
• A physical FIQ interrupt, regardless of the value of the PSTATE.F bit.
• A physical SError interrupt, regardless of the value of the PSTATE.A bit.
• In Non-secure state in any mode other than Hyp mode:
  — When HCR.IMO is set to 1, a virtual IRQ interrupt, regardless of the value of the PSTATE.I bit.
  — When HCR.FMO is set to 1, a virtual FIQ interrupt, regardless of the value of the PSTATE.F bit.
  — When HCR.AMO is set to 1, a virtual SError interrupt, regardless of the value of the PSTATE.A bit.
• An asynchronous External Debug Request debug event, if halting is allowed. For the definition of halting is
  allowed see Halting allowed and halting prohibited on page H2-4845.

  See also External Debug Request debug event on page H3-4900.


An implementation can include other IMPLEMENTATION DEFINED hardware mechanisms to generate WFI wake-up 
events. 


When a WFI wake-up event is detected, or earlier if the implementation chooses, the WFI instruction completes. 


WFI wake-up events cannot be masked by the mask bits in the PSTATE. 
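
The difference between WFE and WFI in their treatment of the PSTATE mask bits, for physical interrupts, can be summarized by the following Python sketch. It is an illustration only: the function name, the dictionary representation of PSTATE, and the `mask_effective` parameter are assumptions made for the example. `mask_effective` stands for the case, described in the WFE section, where the EL2 and EL3 configuration means the PSTATE bit cannot mask the exception.

```python
# PSTATE mask bit that applies to each physical interrupt type.
MASK_BIT = {"IRQ": "I", "FIQ": "F", "SError": "A"}


def is_wake_up_event(instruction, interrupt, pstate, mask_effective=True):
    """Return True if a pending physical `interrupt` wakes the PE.

    `instruction` is "WFE" or "WFI". `pstate` maps "I", "F" and "A" to
    0 or 1. `mask_effective` is False when the EL2/EL3 masking controls
    mean the PSTATE bit cannot mask the corresponding exception.
    """
    if interrupt not in MASK_BIT:
        raise ValueError("unknown interrupt type: " + interrupt)
    if instruction == "WFI":
        # WFI wake-up events are not affected by the PSTATE mask bits.
        return True
    if instruction == "WFE":
        masked = pstate[MASK_BIT[interrupt]] == 1 and mask_effective
        return not masked
    raise ValueError("unknown instruction: " + instruction)


pstate = {"I": 1, "F": 0, "A": 0}   # IRQs masked, FIQ and SError unmasked
```

With this PSTATE value, a pending physical IRQ wakes a PE that is waiting in WFI but not one that is waiting in WFE, while a pending physical FIQ wakes both.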

The architecture does not define the exact nature of the low-power state, but the execution of a WFI instruction must
not cause a loss of memory coherency.





Note 


• Because debug events are WFI wake-up events, ARM strongly recommends that WFI is used as part of an idle
  loop rather than waiting for a single specific interrupt event to occur and then moving forward. This ensures
  the intervention of debug while waiting does not significantly change the function of the program being
  debugged.

• In some previous implementations of Wait For Interrupt, the idle loop is followed by exit functions that must
  be executed before taking the interrupt. The operation of Wait For Interrupt remains consistent with this
  model, and therefore differs from the operation of Wait For Event.

• Some implementations of Wait For Interrupt drain down any pending memory activity before suspending
  execution. The ARM architecture does not require this operation, and software must not rely on Wait For
  Interrupt operating in this way.





Using WFI to indicate an idle state on bus interfaces 


A common implementation practice is to complete any entry into powerdown routines with a WFI instruction. 
Typically, the WFI instruction: 


1. Forces the completion of execution of any instructions that are in progress, and of all associated bus activity. 


2. Suspends the execution of instructions by the PE. 


The control logic required to do this tracks the activity of the bus interfaces used by the PE. This means it can signal 
to an external power controller when there is no ongoing bus activity. 


However, memory-mapped and external debug interface accesses to debug registers must continue to be processed 
while the PE is in the WFI state. The indication of idle state to the system normally only applies to the non-debug 
functional interfaces used by the PE, not the debug interfaces. 


When the value of DBGOSDLR.DLK, the OS Double Lock status bit, is set to 1, this idle state must not be signaled 
to the PE unless the system can guarantee, also, that the debug interface is idle. 





Note 


When separate core and debug power domains are implemented, the debug interface referred to in this section is the 
interface between the core and debug power domains, since the signal to the power controller indicates that the core 
power domain is idle. For more information about the power domains see Power domains and debug on
page H6-4945.





The exact nature of this interface is IMPLEMENTATION DEFINED, but the use of Wait For Interrupt as the only 
architecturally-defined mechanism that completely suspends execution makes it very suitable as the preferred 
powerdown entry mechanism. 


Pseudocode description of Wait For Interrupt 


The WaitForInterrupt() pseudocode function optionally suspends execution until a WFI wake-up event or reset 
occurs, or until some earlier time if the implementation chooses. 

G1.18 The AArch32 System register interface


In ARMv8, most system registers are accessed using the instructions described in System register access 
instructions on page F1-2387. The System register interface provides access to those instructions, and: 


• These registers are encoded using the parameters {coproc, opc1, CRn, CRm, opc2}, with permitted coproc values
  of 0b1110 and 0b1111.

• Some of these encodings provide the AArch32 System instructions summarized in:
  — Cache maintenance instructions, functional group on page G4-4201.
  — TLB maintenance instructions, functional group on page G4-4202.
  — Address translation instructions, functional group on page G4-4204.

• To maintain compatibility with previous versions of the ARM architecture, the access controls for the
  AArch32 System registers include the access controls for AArch32 Advanced SIMD and floating-point
  functionality.





Note 


See Background to the System register interface on page G1-3879 for more information. 





The following sections give more information about the AArch32 System register interface: 

• System registers in the coproc == 0b111x encoding space.
• Access to System registers.
• Access controls for Advanced SIMD and floating-point functionality on page G1-3878.
• Pseudocode description of checking accesses to the System registers on page G1-3878.


G1.18.1 System registers in the coproc == 0b111x encoding space 
In AArch32 state: 


• The coproc == 0b1110 encoding space is reserved for the configuration and control of:
  — Debug features, see Debug registers on page G6-4668.
  — Trace features, see the Embedded Trace Macrocell Architecture Specification.
  — Identification registers for the Trivial Jazelle implementation, see Trivial implementation of the Jazelle
    extension on page G1-3810.


° The coproc == 0b1111 encoding space is reserved for the control and configuration of the PE, including 
architecture and feature identification. This means these encodings provide access to the System registers that 
control and return status information for PE operation. 


For more information see Chapter G6 AArch32 System Register Descriptions. 


G1.18.2 Access to System registers 
Most System registers are accessible only from EL1 or higher. For possible accesses from EL0:


• The register descriptions in Chapter G6 AArch32 System Register Descriptions indicate whether a register is
  accessible from EL0.

• PL0 views of the System registers in the (coproc==0b1111) encoding space on page G4-4190 summarizes
  the permitted accesses to System registers in the coproc == 0b1111 encoding space from EL0.

G1.18.3 Access controls for Advanced SIMD and floating-point functionality


In ARMv8, the CPACR controls access to Advanced SIMD and floating-point functionality from software
executing at PL1 or EL0 in AArch32 state:


• The {cp10, cp11} fields control access to all Advanced SIMD and floating-point functionality, and can:
  — Disable EL0 and PL1 access to this functionality.
  — Enable access to this functionality at PL1 only.
  — Enable access to this functionality at EL0 and PL1.

• The ASEDIS field controls access to Advanced SIMD instructions that are not also floating-point
  instructions.


Initially on powerup or reset into AArch32 state, all access to all Advanced SIMD and floating-point functionality
from PL1 and EL0 is disabled.
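
As an illustration of the three {cp10, cp11} settings, the following Python sketch decodes a CPACR value. It rests on assumptions that are not stated in this section and should be checked against the CPACR register description: that cp10 occupies bits [21:20], cp11 bits [23:22], and ASEDIS bit [31], and that the field encodings are 0b00 (no access), 0b01 (PL1 only), and 0b11 (EL0 and PL1), with 0b10 reserved. The function name is hypothetical. When the two fields differ, the sketch follows the ARMv8 CONSTRAINED UNPREDICTABLE behavior described in Enabling Advanced SIMD and floating-point support, and treats cp11 as having the same value as cp10.

```python
# Assumed field encodings; see the lead-in text.
ACCESS_LEVELS = {0b00: "no access", 0b01: "PL1 only", 0b11: "EL0 and PL1"}


def fp_simd_access(cpacr):
    """Decode a 32-bit CPACR value.

    Returns (level, simd_disabled), where `level` describes the access
    granted by {cp10, cp11} and `simd_disabled` is True when ASEDIS
    disables the Advanced SIMD instructions that are not also
    floating-point instructions.
    """
    cp10 = (cpacr >> 20) & 0b11
    # cp11 is bits [23:22]; it is deliberately ignored because behavior
    # is as if cp11 had the same value as cp10.
    asedis = (cpacr >> 31) & 1
    if cp10 not in ACCESS_LEVELS:
        raise ValueError("reserved cp10 encoding: 0b10")
    return ACCESS_LEVELS[cp10], bool(asedis)


reset_like = fp_simd_access(0x00000000)    # all access disabled
full_access = fp_simd_access(0x00F00000)   # cp10 == cp11 == 0b11
```

A value of 0x00000000 matches the state after reset into AArch32 state described above, in which all access from PL1 and EL0 is disabled.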


Note 
The CPACR has no effect on accesses from Hyp mode. 








If an implementation includes EL3, the NSACR determines whether Advanced SIMD and floating-point 
functionality can be accessed from Non-secure state: 


° The {cp10, cp11} fields control Non-secure access to all Advanced SIMD and floating-point functionality. 


° The NSASEDIS field controls Non-secure access to Advanced SIMD instructions that are not also 
floating-point instructions. 


If an implementation includes EL2, the HCPTR provides additional controls on Non-secure accesses to Advanced 
SIMD and floating-point functionality. For accesses that are otherwise permitted by the CPACR and NSACR 
settings, setting HCPTR bits to 1: 


• Traps otherwise-permitted accesses from EL1 or EL0 to EL2. When EL2 is using AArch32, these accesses
  are trapped to Hyp mode.

• Makes accesses from EL2 UNDEFINED. When EL2 is using AArch32, this makes accesses from Hyp
  mode UNDEFINED.


In the HCPTR: 
° The {TCP10, TCP11} fields control access to all Advanced SIMD and floating-point functionality. 
° The TASE field controls access to Advanced SIMD instructions that are not also floating-point instructions. 


° The TCPAC field traps Non-secure EL1 accesses to the CPACR to Hyp mode. 


For more information, see General trapping to Hyp mode of Non-secure accesses to the SIMD and floating-point 
registers on page G1-3905. 


Note 


Whenever a pair of fields controls the access to the Advanced SIMD and floating-point functionality, the values of
each field of the pair must be identical. In ARMv8, if these settings are not identical the behavior of the Advanced
SIMD and floating-point functionality is CONSTRAINED UNPREDICTABLE, see Handling of System register control
fields for Advanced SIMD and floating-point operation on page K1-5462.








For more information about Advanced SIMD and floating-point support see Advanced SIMD and floating-point 
support on page G1-3880. 


G1.18.4 Pseudocode description of checking accesses to the System registers


The AArch32.CheckSystemAccess() function determines whether a System register access instruction that targets a 
System register in the (coproc == 0b111x) encoding space is accepted. 

G1.18.5 Background to the System register interface

Note

This section is not part of the ARMv8 Architecture specification. It is included only to present the rationale of some
aspects of the System register interface.

The interface to the System registers was originally defined as part of a generic coprocessor interface, that gave
access to sixteen coprocessors, CP0 - CP15. Of these, CP8 - CP15 were reserved for use by ARM, while CP0 - CP7
were available for IMPLEMENTATION DEFINED coprocessors.

The coprocessors were accessed using coprocessor instructions. These instructions remain part of the T32 and A32
instruction sets, see System register access instructions on page F1-2387.

In the ARM coprocessor model, a coprocessor included both:

• Primary and secondary coprocessor registers, that form part of the coprocessor interface.

• A number of internal registers.

When accessing a 32-bit internal coprocessor register, using an MCR or MRC instruction, the instruction specified:

• The target coprocessor, specified by the coproc parameter and taking a value between p0 for CP0 and p15 for
  CP15.

• The primary coprocessor register, specified by the CRn parameter and taking a value between c0 and c15.

• The secondary coprocessor register, specified by the CRm parameter and taking a value between c0 and c15.

• Up to two additional parameters, opc1 and opc2, taking values between 0 and 7.

To maintain backwards compatibility, the arguments to an MCR or MRC instruction remain {coproc, opc1, CRn, CRm,
opc2}. Similarly, the encoding of the AArch64 System registers is described using the parameters {op0, op1, CRn, CRm,
op2}. However the naming of these parameters no longer has any particular significance.
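
To show how the {coproc, opc1, CRn, CRm, opc2} parameters identify a register in practice, the following Python sketch packs them into an A32 MRC or MCR instruction word. The bit positions are an assumption taken from the A32 encodings of these instructions rather than from this section, and the function name is hypothetical, so check the instruction descriptions before relying on it.

```python
def encode_mrc_mcr(coproc, opc1, crn, crm, opc2, rt, read=True, cond=0b1110):
    """Return the assumed 32-bit A32 encoding of MRC (read) or MCR (write).

    Assumed layout: cond[31:28], 0b1110[27:24], opc1[23:21], L[20],
    CRn[19:16], Rt[15:12], coproc[11:8], opc2[7:5], 1[4], CRm[3:0].
    """
    fields = (
        ("coproc", coproc, 15), ("opc1", opc1, 7), ("CRn", crn, 15),
        ("CRm", crm, 15), ("opc2", opc2, 7), ("Rt", rt, 15), ("cond", cond, 15),
    )
    for name, value, limit in fields:
        if not 0 <= value <= limit:
            raise ValueError(name + " out of range")
    return (
        (cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(read) << 20)
        | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
        | (1 << 4) | crm
    )


# MRC p15, 0, r0, c0, c0, 0: read a coproc == 0b1111 register into r0.
read_word = encode_mrc_mcr(coproc=15, opc1=0, crn=0, crm=0, opc2=0, rt=0)

# MCR p15, 0, r1, c1, c0, 0: write r1 to a coproc == 0b1111 register.
write_word = encode_mrc_mcr(coproc=15, opc1=0, crn=1, crm=0, opc2=0, rt=1,
                            read=False)
```

Under the assumed layout, `read_word` is 0xEE100F10 and `write_word` is 0xEE011F10.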

Of the coprocessors reserved for use by ARM: 

° CP15 provided access to the System registers relating to non-debug operation, and was originally called the 
System control coprocessor. In ARMv8 these registers are described as being in the coproc == 0b1111 
encoding space. 

° CP14 provided access to additional System registers, including those relating to debug and trace. In ARMv8 
these registers are described as being in the coproc == 0b1110 encoding space. 

° CP10 and CP11 were used for Advanced SIMD and floating-point control, and many coprocessor instruction 
encodings targeting CP10 and CP11 were used as floating-point instruction encodings: 

— Generally ARMv8 does not relate these instructions to the coprocessor encoding space, but the naming 
of registers and register fields for Advanced SIMD and floating-point control reflects the historic 
coprocessor model. 

— Because the Advanced SIMD and floating-point functionality used both CP10 and CP11, some System 
register controls of this functionality have a pair of fields, for example NSACR.{cp10, cp11}. In these 
cases, both fields must be set to the same value. For more information see Access controls for 
Advanced SIMD and floating-point functionality on page G1-3878. 

In ARMv8, the AArch32 System registers include registers that were described as Special registers in ARMv7 and
earlier versions of the architecture. This means that the ARMv8 System registers include registers that are outside
the earlier coprocessor model.

G1.19 Advanced SIMD and floating-point support


Advanced SIMD and floating-point instructions on page E1-2300 introduces: 


• The scalar floating-point instructions in the A32 and T32 instruction sets.
• The Advanced SIMD integer and floating-point vector instructions in the A32 and T32 instruction sets.
• The SIMD and floating-point register file, that can be viewed as:
  — Singleword registers S0 - S31.
  — Doubleword registers D0 - D31.
  — Quadword registers Q0 - Q15.
• The Floating-Point Status and Control Register (FPSCR).


For more information about the System registers for the Advanced SIMD and floating-point operation see Advanced 
SIMD and floating-point System registers on page G1-3882. Software can interrogate these registers to discover the 
implemented Advanced SIMD and floating-point support. 


AArch32 implications of not including support for Advanced SIMD and floating-point summarizes the effects of not 
supporting these instructions, and the following subsections give more information about the Advanced SIMD and 
Floating-point support: 


• Enabling Advanced SIMD and floating-point support on page G1-3881.
• Advanced SIMD and floating-point System registers on page G1-3882.
• Context switching when using Advanced SIMD and floating-point functionality on page G1-3883.
• Floating-point exception traps on page G1-3883.


G1.19.1 AArch32 implications of not including support for Advanced SIMD and floating-point 


As stated in Implementations not including Advanced SIMD and floating-point instructions on page D1-1621, 
although ARMv8-A generally requires the inclusion of the Advanced SIMD and floating-point instructions in all 
instruction sets, for implementations targeting specialized markets, ARM might produce or license ARMv8-A 
implementations that do not provide any support for Advanced SIMD and floating-point instructions. In such an 
implementation, in AArch32 state: 


• The CPACR.{ASEDIS, cp11, cp10} fields are RES0.
• The NSACR.{NSASEDIS, cp11, cp10} fields are RES0.
• The HCPTR.{TASE, TCP11, TCP10} fields are RES1.

• The FPEXC, FPSCR, FPSID, MVFR0, MVFR1, and MVFR2 registers are not implemented and their
  encodings are UNDEFINED.

• Attempted accesses to Advanced SIMD and floating-point functionality are UNDEFINED. This means:
  — All Advanced SIMD and floating-point instructions are UNDEFINED.
  — Attempts to access the Advanced SIMD and floating-point System registers are UNDEFINED.

G1.19.2 Enabling Advanced SIMD and floating-point support


Software must ensure that the required access to the Advanced SIMD and floating-point features is enabled. Most 
of those controls are described in Configurable instruction enables and disables, and trap controls on 
page G1-3885, and this section: 


• Summarizes those controls.

• Provides additional information in the following subsections:
  — FPEXC control of access to Advanced SIMD and floating-point functionality on page G1-3882.
  — EL0 access to Advanced SIMD and floating-point functionality on page G1-3882.


Note 


This section shows the controls when the controlling Exception levels are using AArch32. Similar controls are 
provided when the Exception levels are using AArch64, and then apply to lower Exception levels that are using 
AArch32. 








The controls of access to Advanced SIMD and floating-point functionality are: 


General {cp10, cp11} or {TCP10, TCP11} controls 


This relates to the CPACR.{cp10, cp11}, NSACR.{cp10, cp11}, and HCPTR.{TCP10, TCP11} 
controls. 


Note


Background to the System register interface on page G1-3879 explains the naming of these controls. 





The {cp10, cp11} controls provide general control of the use of Advanced SIMD and floating-point 
functionality, as follows: 
• CPACR.{cp10, cp11} control access from PE modes other than Hyp mode.
  These fields have no effect on accesses to Advanced SIMD and floating-point functionality
  from Hyp mode.

• In an implementation that includes EL3, NSACR.{cp10, cp11} control access from
  Non-secure state.

• In an implementation that includes EL2, if NSACR.{cp10, cp11} permit Non-secure
  accesses, or if EL3 is not implemented, HCPTR.{TCP10, TCP11} provide an additional
  control on those accesses.


In each case, the {cp10, cp11} controls must be programmed to the same value, otherwise operation 
is CONSTRAINED UNPREDICTABLE. The ARMv8 CONSTRAINED UNPREDICTABLE behavior is that, for 
all purposes other than reading the value of the register field, behavior is as if the cp11 field has the 
same value as the cp10 field. For more information see Handling of System register control fields 
for Advanced SIMD and floating-point operation on page K1-5462. 


For more information about these controls, see: 
• Enabling PL0 and PL1 accesses to the SIMD and floating-point registers on page G1-3890.

• General trapping to Hyp mode of Non-secure accesses to the SIMD and floating-point
  registers on page G1-3905.

• Enabling Non-secure access to SIMD and floating-point functionality on page G1-3917.


Control of accesses to the CPACR from Non-secure PL1 modes 


As stated in General {cp10, cp11} or {TCP10, TCP11} controls, the CPACR controls access to
Advanced SIMD and floating-point functionality from PE modes other than Hyp mode. Accesses
to the CPACR from Non-secure PL1 modes can be trapped to EL2, see Traps to Hyp mode of
Non-secure EL1 accesses to the CPACR on page G1-3906.


Additional controls of Advanced SIMD functionality 


• If implemented as an RW field, CPACR.ASEDIS can make all Advanced SIMD instructions
  UNDEFINED in all modes other than Hyp mode.

• In an implementation that includes EL3, when CPACR.ASEDIS permits use of the Advanced
  SIMD instructions or if the CPACR.ASEDIS control is not implemented,
  NSACR.NSASEDIS can make all Advanced SIMD instructions UNDEFINED in Non-secure
  state.

• In an implementation that includes EL2, when the CPACR and NSACR settings permit
  Non-secure use of the Advanced SIMD instructions, if HCPTR.TASE is implemented as an
  RW field it can make these instructions UNDEFINED in Hyp mode, and trap to Hyp mode any
  use of these instructions in a Non-secure PL0 or PL1 mode.


For more information about these controls, see: 
• Disabling PL0 and PL1 execution of Advanced SIMD instructions on page G1-3891.

• Traps to Hyp mode of Non-secure accesses to Advanced SIMD functionality on
  page G1-3906.

• Disabling Non-secure access to Advanced SIMD functionality on page G1-3918.


Pseudocode description of enabling SIMD and floating-point functionality on page G1-3919 describes all of these
controls.


FPEXC control of access to Advanced SIMD and floating-point functionality 


In addition, FPEXC.EN is an enable bit for most Advanced SIMD and floating-point operations. When FPEXC.EN 
is 0, all Advanced SIMD and floating-point instructions are treated as UNDEFINED except for: 


• A VMSR to the FPEXC or FPSID register.
• A VMRS from the FPEXC, FPSID, MVFR0, MVFR1, or MVFR2 register.


These instructions can be executed only at EL1 or higher. 
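
The FPEXC.EN rule above can be written as a small predicate. The following Python sketch is illustrative only: it assumes that EN is bit [30] of the FPEXC, which is not stated in this section, and the function name and the (instruction, register) representation are hypothetical. It models only the FPEXC.EN check, not the other access controls or the Exception level restriction.

```python
# System register accesses that remain available when FPEXC.EN is 0.
ALLOWED_WHEN_DISABLED = {
    ("VMSR", "FPEXC"), ("VMSR", "FPSID"),
    ("VMRS", "FPEXC"), ("VMRS", "FPSID"),
    ("VMRS", "MVFR0"), ("VMRS", "MVFR1"), ("VMRS", "MVFR2"),
}


def fpexc_makes_undefined(fpexc, instruction, register=None):
    """Return True if FPEXC.EN == 0 makes the instruction UNDEFINED.

    `fpexc` is the 32-bit FPEXC value, with EN assumed to be bit 30.
    `instruction` is a mnemonic such as "VADD", "VMRS" or "VMSR", and
    `register` names the System register for VMRS and VMSR.
    """
    enabled = (fpexc >> 30) & 1
    if enabled:
        return False
    return (instruction, register) not in ALLOWED_WHEN_DISABLED


FPEXC_EN = 1 << 30
```

For example, with FPEXC.EN set to 0 a VMRS from the FPSID is still permitted, but a VMSR to the FPSCR is treated as UNDEFINED.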





Note 
• When the FPSID is accessible, any write access to the FPSID is ignored.
• When FPEXC.EN is 0, these operations are treated as UNDEFINED:
  — A VMSR to the FPSCR.
  — A VMRS from the FPSCR.





EL0 access to Advanced SIMD and floating-point functionality


When the access controls summarized in this section permit EL0 access to the Advanced SIMD and floating-point
functionality, this applies only to the subset of functionality that is available at EL0. In particular:


• The only Advanced SIMD and floating-point System register that is accessible is the FPSCR.

• The Advanced SIMD and floating-point instructions are available.


Execution at EL0 corresponds to the application level view of the Advanced SIMD and floating-point functionality,
as described in Advanced SIMD and floating-point System registers on page E1-2302.





G1.19.3 Advanced SIMD and floating-point System registers


AArch32 state provides a common set of System registers for the Advanced SIMD and floating-point functionality.
This section gives general information about this set of registers, and indicates where each register is described in
detail. It contains the following subsections:

• Register map of the Advanced SIMD and floating-point System registers.
• Accessing the Advanced SIMD and floating-point System registers on page G1-3883.


Register map of the Advanced SIMD and floating-point System registers


Table G4-67 on page G4-4212 shows the register map of the Advanced SIMD and Floating-point registers. Each
register is 32 bits wide. In an implementation that includes EL3, the Advanced SIMD and Floating-point registers
are common registers, see Common System registers on page G4-4160.


Accessing the Advanced SIMD and floating-point System registers


Software accesses the Advanced SIMD and floating-point System registers using the VMRS and VMSR instructions, see: 
° VMRS on page F6-3525. 
° VMSR on page F6-3528. 


For example: 


VMRS <Rt>, FPSID ; Read Floating-Point System ID Register 
VMRS <Rt>, MVFR1 ; Read Media and VFP Feature Register 1 
VMSR FPSCR, <Rt> ; Write Floating-Point Status and Control Register 


Software can access the Advanced SIMD and floating-point System registers only if the access controls permit the 
access, see Enabling Advanced SIMD and floating-point support on page G1-3881. 





Note 
All hardware ID information can be accessed only from EL1 or higher. This means:

• The FPSID is accessible only from EL1 or higher.

  This is a change introduced from VFPv3. Previously, the FPSID register could be accessed in all
  modes.

• The MVFR registers are accessible only from EL1 or higher.

Unprivileged software must issue a system call to determine what features are supported.





G1.19.4 Context switching when using Advanced SIMD and floating-point functionality 


When the Advanced SIMD and floating-point functionality is used by only a subset of processes, the operating 
system might implement lazy context switching of the Advanced SIMD and floating-point register file and System 
registers. 


In the simplest lazy context switch implementation, the primary context switch software uses the
CPACR.{cp10, cp11} controls to disable access to the Advanced SIMD and floating-point functionality, see
Enabling Advanced SIMD and floating-point support on page G1-3881. Subsequently, when a process or thread 
attempts to use an Advanced SIMD or floating-point instruction, it triggers an Undefined Instruction exception. The 
operating system responds by saving and restoring the Advanced SIMD and floating-point register file and System 
registers. Typically, it then re-executes the Advanced SIMD or floating-point instruction that generated the 
Undefined Instruction exception. 
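
The sequence described above can be sketched as a small Python simulation. This is a hypothetical model, not operating system code: the class and method names are invented for the example, a single value stands in for the whole Advanced SIMD and floating-point register file and System registers, and a Boolean stands in for the CPACR.{cp10, cp11} access control.

```python
class LazyFpSwitcher:
    """Simulation of the simplest lazy context switch scheme."""

    def __init__(self):
        self.fp_enabled = False   # stands in for CPACR.{cp10, cp11}
        self.current = None       # task that is currently running
        self.owner = None         # task whose state is live in the registers
        self.registers = 0        # stands in for the register file
        self.saved = {}           # task -> saved register state
        self.traps = 0            # Undefined Instruction exceptions taken

    def context_switch(self, task):
        """Primary context switch: only disable access, save nothing."""
        self.current = task
        self.fp_enabled = False

    def _undefined_instruction_handler(self):
        """Save the old owner's state, restore the current task's state."""
        self.traps += 1
        if self.owner != self.current:
            if self.owner is not None:
                self.saved[self.owner] = self.registers
            self.registers = self.saved.get(self.current, 0)
            self.owner = self.current
        self.fp_enabled = True

    def use_fp(self, new_value=None):
        """Model one floating-point instruction in the current task."""
        if not self.fp_enabled:
            self._undefined_instruction_handler()
        # The instruction is re-executed now that access is enabled.
        if new_value is not None:
            self.registers = new_value
        return self.registers


switcher = LazyFpSwitcher()
switcher.context_switch("A")
switcher.use_fp(1.5)              # first use by A traps, then runs
switcher.context_switch("B")      # B never touches floating-point
switcher.context_switch("A")
value_seen_by_a = switcher.use_fp()
```

Because task B never executes a floating-point instruction, the register state of task A is never saved or restored in this run; the only cost is one trap per switch back to A.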


G1.19.5 Floating-point exception traps 


Execution of a floating-point instruction can generate an exceptional condition, called a floating-point exception. 


The ARMv8-A architecture supports synchronous exception generation in the event of any or all of the following
floating-point exceptions:

• Input Denormal.
• Inexact.
• Underflow.
• Overflow.
• Divide by Zero.
• Invalid Operation.


Note


Do not confuse floating-point exceptions with the AArch32 architectural exceptions summarized in AArch32 state 
exception descriptions on page G1-3849. 

Whether an implementation includes synchronous exception generation for these floating-point exceptions is
IMPLEMENTATION DEFINED:


• For an implementation that does provide this capability, FPSCR.{IDE, IXE, UFE, OFE, DZE, IOE} are the
  control bits that enable synchronous exception generation for each of the floating-point exceptions.

• For an implementation that does not provide this capability, the FPSCR.{IDE, IXE, UFE, OFE, DZE, IOE}
  bits are RAZ/WI.
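
As an illustration, the following Python sketch lists which floating-point exception traps an FPSCR value enables. The bit positions are an assumption that does not come from this section (IOE bit 8, DZE bit 9, OFE bit 10, UFE bit 11, IXE bit 12, IDE bit 15) and should be checked against the FPSCR register description. The function name and the `traps_implemented` parameter are hypothetical; the parameter models an implementation in which the bits are RAZ/WI.

```python
# Assumed FPSCR bit positions of the trap enable controls.
TRAP_ENABLE_BITS = {
    "IOE": 8,    # Invalid Operation
    "DZE": 9,    # Divide by Zero
    "OFE": 10,   # Overflow
    "UFE": 11,   # Underflow
    "IXE": 12,   # Inexact
    "IDE": 15,   # Input Denormal
}


def enabled_traps(fpscr, traps_implemented=True):
    """Return the set of trap enable names that are set in `fpscr`.

    When the implementation does not support trapped floating-point
    exceptions the control bits read as zero, so the result is empty.
    """
    if not traps_implemented:
        return set()
    return {name for name, bit in TRAP_ENABLE_BITS.items()
            if (fpscr >> bit) & 1}


two_traps = enabled_traps(0x0300)   # bits 8 and 9 set
```

Here `two_traps` contains IOE and DZE, so only Invalid Operation and Divide by Zero exceptions would be trapped.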


Note 


The ARMv8-A architecture does not support asynchronous reporting of floating-point exceptions. 








When generating synchronous exceptions for one or more floating-point exceptions is enabled, the synchronous 
exceptions generated by the floating-point exception traps are taken to the lowest Exception level that can handle 
such an exception, while adhering to the rule that an exception can never be taken to a lower Exception level. This 
means that trapped floating-point exceptions taken: 


• From EL0 are taken to EL1, except for the following cases when they are taken from Non-secure EL0 to EL2:
  — EL2 is using AArch32 and the value of HCR.TGE is 1.
  — EL2 is using AArch64 and the value of HCR_EL2.TGE is 1.
• From EL1 are taken to EL1.
• From EL2 are taken to EL2.
• From EL3 are taken to EL3.
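
The routing rule above can be sketched as a small function. This is an illustrative model only, not the architectural pseudocode: the function and parameter names are assumptions, `tge` stands for HCR.TGE or HCR_EL2.TGE depending on the Execution state of EL2, and the sketch assumes EL2 is implemented whenever `tge` is true.

```c
#include <stdbool.h>

/*
 * Hypothetical model: returns the Exception level to which a trapped
 * floating-point exception is taken.
 *   from_el     Exception level at which the instruction executed (0 to 3).
 *   non_secure  true when executing in Non-secure state.
 *   tge         value of HCR.TGE (EL2 using AArch32) or HCR_EL2.TGE
 *               (EL2 using AArch64).
 */
static unsigned fp_trap_target_el(unsigned from_el, bool non_secure, bool tge)
{
    if (from_el == 0) {
        /* EL0 traps go to EL1, unless Non-secure EL0 is routed to EL2. */
        return (non_secure && tge) ? 2u : 1u;
    }
    /* An exception is never taken to a lower Exception level. */
    return from_el;
}
```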


If the exception is taken to an Exception level that is using AArch64 then it is reported in the ESR_ELx for the
Exception level to which it is taken, as described in Exception entry on page D1-1521.


If the exception is taken to an Exception level that is using AArch32 then it is taken as an Undefined Instruction 
exception, see Undefined Instruction exception on page G1-3849. The FPEXC identifies the floating-point 
exceptions that occurred since the corresponding status bits in that register were last set to 0. 


See also Floating-point exceptions on page E1-2303. 
In an implementation that provides synchronous exception generation for floating-point exceptions: 


• Synchronous exception generation applies to floating-point exceptions generated by scalar SIMD and
  floating-point instructions executed in AArch32 state.

• The registers that are presented to the exception handler are consistent with the state of the PE immediately
  before the instruction that caused the exception. An implementation is permitted not to restore the cumulative
  exception flags in the event of such an exception.

ARMv8 does not support the trapping of floating-point exceptions from Advanced SIMD instructions executed in
AArch32 state.


The AArch32.FPTrappedException() and FPProcessException() pseudocode functions describe the handling of 
trapped floating-point exceptions generated in AArch32 state. 


The AArch32.FPTrappedException() and FPProcessException() functions are described in Chapter J1 ARMv8 
Pseudocode. 

G1.20 Configurable instruction enables and disables, and trap controls


This section describes the controls provided by AArch32 state for enabling, disabling, and trapping particular 
instructions. Each control is categorized as an instruction enable, an instruction disable, or a trap control: 


Instruction enables and instruction disables

    Enable or disable the use of one or more particular instructions at a particular Privilege level and
    Security state.

    When an instruction is disabled as a result of an instruction enable or disable, it is UNDEFINED.
    The exception generated by attempting to execute an UNDEFINED instruction is:

    • Taken to EL1 if the UNDEFINED instruction was executed at EL0, unless the instruction was
      executed at Non-secure EL0 and is routed to EL2 by the control described in Routing
      exceptions from Non-secure EL0 to EL2 on page G1-3828.

      When the exception is taken to EL1 it is taken to Undefined mode.

    • Otherwise, taken to the Exception level at which the UNDEFINED instruction was executed:
      — If the instruction was executed in Hyp mode the exception is taken to Hyp mode.
      — Otherwise, the exception is taken to Undefined mode.

Trap controls

    Control whether one or more instructions, when executed at a particular Privilege level, are trapped.

    Note

    AArch32 trap controls are described in terms of Privilege levels, rather than Exception levels,
    because the PL1 traps apply at and are controlled from:

    EL1    In Non-secure state, and in Secure state when EL3 is using AArch64.
    EL3    In Secure state when EL3 is using AArch32.

    For more information see Security state, Exception levels, and AArch32 execution privilege on
    page G1-3792.





    Trap controls are grouped as:

    PL1, excluding Monitor mode

        Trapped instructions generate Undefined Instruction exceptions that are taken to
        Undefined mode, unless the instruction was executed at Non-secure EL0 and is routed
        to EL2 by the control described in Routing exceptions from Non-secure EL0 to EL2 on
        page G1-3828.

        For more information about these traps see PL1 configurable controls on page G1-3886.

    Hyp mode (PL2)

        These traps apply only to execution in Non-secure state. This section only describes the
        traps that apply when EL2 is using AArch32.

        Trapped instructions generate:

        • Hyp Trap exceptions, taken to Hyp mode, if trapped from a mode other than Hyp mode.
        • Undefined Instruction exceptions taken to Hyp mode, if trapped from Hyp mode.

        For more information about these traps see EL2 configurable controls on page G1-3894.
        See also Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.

    Monitor mode (Secure PL1)

        This section only describes the traps that apply when EL3 is using AArch32.

        Trapped instructions generate Monitor Trap exceptions, that are taken to Monitor mode.

        For more information about these traps see EL3 configurable controls on page G1-3914.


An exception generated as a result of an instruction enable or disable, or a trap control, is only taken if the instruction 
does not also generate a higher priority exception. Exception prioritization for exceptions taken to AArch32 state on 
page G1-3816 defines the prioritization of different exceptions on the same instruction. 

Exceptions generated as a result of these controls are synchronous exceptions.


For exceptions taken to an Exception level that is using AArch32, only exceptions that are taken to Hyp mode are 
reported in a syndrome register, the HSR. 





Note

• A particular control might have a mnemonic that suggests it is a different type of control to the control type it
  is categorized as. For example, CPACR.TRCDIS is a trap control even though TRCDIS is a mnemonic for
  Trace Disable.
• An implementation might provide additional controls, in IMPLEMENTATION DEFINED registers, to provide
  control of trapping of IMPLEMENTATION DEFINED features.
• Configurable instruction enables and disables, and trap controls on page D1-1562 describes controls
  provided by AArch64 state for enabling, disabling, and trapping instructions. Generally, where an AArch64
  control applies to execution at lower Exception levels, it traps the equivalent functionality when that lower
  Exception level is using AArch32. See the AArch64 trap controls for more information.





This section is organized as follows: 

• Register access instructions.
• PL1 configurable controls.
• EL2 configurable controls on page G1-3894.
• EL3 configurable controls on page G1-3914.
• Pseudocode description of configurable instruction enables, disables, and traps on page G1-3919.


G1.20.1 Register access instructions 


When an instruction is disabled or trapped, the exception is taken before execution of the instruction. This means 
that if the instruction is a register access instruction: 


• No access is made before the exception is taken.
• Side-effects that are normally associated with the access do not occur before the exception is taken.


G1.20.2 PL1 configurable controls 


In AArch32 state, each control is associated with a particular System register field that is accessible: 
• When EL3 is using AArch64, or when an implementation does not include EL3, from EL1.
• When EL3 is using AArch32:
  — In Non-secure state, from EL1.
  — In Secure state, from EL3.


This means that the controls are described as PL1 controls, because PL1 is defined as being the Privilege level of 
software that is executing: 


• At EL3, if the PE is executing in EL3 and EL3 is using AArch32.
• At EL1 under all other conditions.
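
The mapping from PL1 to an Exception level can be expressed as a short function. This is a sketch for illustration only. The function and parameter names are hypothetical, and it models only where the PL1 controls apply for a given Security state.

```c
#include <stdbool.h>

/*
 * Hypothetical model: returns the Exception level at which PL1 software
 * executes, and therefore from which the PL1 controls are accessed.
 */
static unsigned pl1_exception_level(bool secure_state,
                                    bool el3_implemented,
                                    bool el3_uses_aarch32)
{
    /* Secure PL1 is EL3 only when EL3 is present and is using AArch32. */
    if (secure_state && el3_implemented && el3_uses_aarch32)
        return 3u;
    /* Under all other conditions PL1 is EL1. */
    return 1u;
}
```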


Where there is an AArch64 control that is equivalent to an AArch32 PL1 control, the AArch64 control is an EL1 
control. 


Any exception that is generated because of an AArch32 PL1 control is taken to a PL1 mode. 





Note 
Any exception generated because of an AArch32 PL1 control is taken to AArch32 state. 

Table G1-27 shows the AArch32 System registers that contain these controls.

Table G1-27 System registers that contain instruction enables and disables, and trap controls

Register name    Register description
SCTLR            System Control Register
FPEXC            Floating-point Exception Control Register
CPACR            Architectural Feature Access Control Register
DBGDSCRext       Monitor System Debug Control Register
PMUSERENR        Performance Monitors User Enable Register




Table G1-28 summarizes these controls. 


Table G1-28 Instruction enables and disables, and trap controls, for exceptions taken to Undefined mode

Control                          Control    Description
                                 type(a)
SCTLR.{nTWE, nTWI}               T          Traps to Undefined mode of EL0 execution of WFE and WFI instructions on
                                            page G1-3888
SCTLR.{SED, ITD}                 D          Disabling or enabling PL0 and PL1 use of AArch32 deprecated functionality on
SCTLR.CP15BEN                    E          page G1-3888
CPACR.TRCDIS                     T          Traps to Undefined mode of PL0 and PL1 System register accesses to trace
                                            registers on page G1-3889
CPACR.{cp11, cp10}               E          Enabling use of Advanced SIMD and floating-point functionality on
FPEXC.EN                         E          page G1-3890
CPACR.ASEDIS                     D
DBGDSCRext.UDCCdis               T          Traps to Undefined mode of EL0 accesses to the Debug Communications
                                            Channel (DCC) registers on page G1-3891
CNTKCTL.{PL0PTEN, PL0VTEN,       T          Traps to Undefined mode of EL0 accesses to the Generic Timer registers on
PL0PCTEN, PL0VCTEN}                         page G1-3892
PMUSERENR.{ER, CR, SW, EN}       T          Traps to Undefined mode of EL0 accesses to Performance Monitors registers on
                                            page G1-3892

a. See Table G1-29.


Table G1-29 Control types, for exceptions taken to Undefined mode

Abbreviation    Type       See
D               Disable    Instruction enables and instruction disables on page G1-3885
E               Enable     Instruction enables and instruction disables on page G1-3885
T               Trap       Trap controls on page G1-3885





When generated in Non-secure User mode, exceptions generated by these controls can be routed to EL2, as
described in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.

Instructions that fail their condition code check


See Conditional execution of undefined instructions on page G1-3851. 


Trapping to PL1 of instructions that are UNPREDICTABLE 


For an instruction that is UNPREDICTABLE or CONSTRAINED UNPREDICTABLE, when the instruction is disabled or 
trapped then it is CONSTRAINED UNPREDICTABLE whether execution of the instruction generates an Undefined 
Instruction exception. 


Traps to Undefined mode of EL0 execution of WFE and WFI instructions

SCTLR.{nTWE, nTWI} trap EL0 execution of WFE and WFI instructions to Undefined mode:

SCTLR.nTWE
    1    This control has no effect on the EL0 execution of WFE instructions.
    0    Any attempt to execute a WFE instruction at EL0 is trapped to Undefined mode, if the
         instruction would otherwise have caused the PE to enter a low-power state.
SCTLR.nTWI
    1    This control has no effect on the EL0 execution of WFI instructions.
    0    Any attempt to execute a WFI instruction at EL0 is trapped to Undefined mode, if the
         instruction would otherwise have caused the PE to enter a low-power state.


The attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction passes its condition 
code check. 
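
The conditions under which the trap applies can be summarized in a small predicate. This is a sketch under stated assumptions: the bit positions are those of the SCTLR register description (nTWI is bit 16, nTWE is bit 18), the names are hypothetical, and whether the instruction would enter a low-power state is supplied by the caller because it depends on the event and interrupt state of the PE.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_NTWI (UINT32_C(1) << 16)  /* 1: WFI at EL0 is not trapped */
#define SCTLR_NTWE (UINT32_C(1) << 18)  /* 1: WFE at EL0 is not trapped */

enum wfx_kind { WFX_WFI, WFX_WFE };

/*
 * Hypothetical model of the EL0 WFE/WFI trap to Undefined mode.
 * Returns true when the attempted execution is trapped.
 */
static bool wfx_trapped_at_el0(uint32_t sctlr, enum wfx_kind kind,
                               bool cond_passed, bool would_enter_low_power)
{
    uint32_t not_trap_bit = (kind == WFX_WFE) ? SCTLR_NTWE : SCTLR_NTWI;

    /* A conditional WFE or WFI is trapped only if it passes its condition. */
    if (!cond_passed)
        return false;
    /* The trap applies only if the PE would otherwise enter low-power state. */
    if (!would_enter_low_power)
        return false;
    /* A value of 0 in the control enables the trap. */
    return (sctlr & not_trap_bit) == 0;
}
```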


Note 


Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken.








When generated in Non-secure User mode, exceptions generated by these controls can be routed to EL2, as
described in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.

For more information about these instructions, and when they can cause the PE to enter a low-power state, see:
• Wait For Event and Send Event on page G1-3872.
• Wait For Interrupt on page G1-3875.


Disabling or enabling PL0 and PL1 use of AArch32 deprecated functionality

Table G1-30 on page G1-3889 shows the deprecated AArch32 functionality that might have disable controls in the
SCTLR:

• The SED control is always implemented.

• Whether each of the ITD or CP15BEN controls is implemented is IMPLEMENTATION DEFINED. If a control is
  not implemented then the associated functionality cannot be disabled.

When an instruction is disabled by one of these controls, it is UNDEFINED at PL0 and PL1. This means an attempt
to execute the instruction at PL0 or PL1 generates an Undefined Instruction exception that is taken to Undefined
mode, unless both of the following apply, in which case the attempted execution generates an exception that is taken
to EL2, as described in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828:

• The instruction is executed at Non-secure EL0 using AArch32.
• Either:
  — EL2 is using AArch32 and the value of HCR.TGE is 1.
  — EL2 is using AArch64 and the value of HCR_EL2.TGE is 1.

Table G1-30 PL1 controls for disabling and enabling PL0 and PL1 use of AArch32 deprecated functionality

Deprecated AArch32 functionality            Instruction enable or       Disabled instructions
                                            disable in the SCTLR(a)
SETEND instructions                         SED(b)                      SETEND instructions
Some uses of IT instructions                ITD(c)                      See the SCTLR.ITD description
Accesses to the CP15DMB, CP15DSB, and       CP15BEN(d)                  MCR accesses to the CP15DMB, CP15DSB,
CP15ISB barrier instructions                                            and CP15ISB instructions

a. The controls that are implemented in SCTLR are also implemented in SCTLR_EL1, and apply when PL1 is using AArch64 and PL0 is using
   AArch32.
b. SETEND instruction disable. SETEND instructions are disabled when the value of this field is 1.
c. IT instruction disable. If this control is implemented, some uses of IT instructions are disabled when the value of this field is 1.
d. System register (coproc==0b1111) memory barrier enable. If this control is implemented, the specified register accesses are disabled when
   the value of CP15BEN is 0.


Note 


The uses of the IT instruction, and use of the CP15DMB, CP15DSB, and CP15ISB barrier instructions, are 
deprecated for performance reasons. 








Traps to Undefined mode of PL0 and PL1 System register accesses to trace registers

If implemented, the CPACR.TRCDIS control traps PL0 and PL1 System register accesses to the trace registers to
Undefined mode, as follows:

1    PL0 and PL1 accesses to the System register interface to the trace macrocell are trapped to
     Undefined mode.
0    This control has no effect on PL0 and PL1 accesses to the System register interface to the trace
     macrocell.

If the CPACR.TRCDIS control is not implemented, then the CPACR.TRCDIS field is RAZ/WI. This means the
CPACR does not provide a trap to Undefined mode of PL1 and PL0 System register accesses to trace registers. See
the register description for more information.





Note

• System register accesses to the trace macrocell use the (coproc==0b1110) encoding space.
• The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A architecture is
  implemented with an ETMv4 implementation, EL0 accesses to the trace System registers are UNDEFINED.
• The ARMv8-A architecture does not provide traps on trace register accesses through the optional
  memory-mapped external debug interface.





System register accesses to the trace System registers can have side-effects. When a System register access is 
trapped, no side-effects occur before the exception is taken, see Register access instructions on page G1-3886. 


If EL3 is implemented and is using AArch32, and NSACR.NSTRCDIS is 1, CPACR.TRCDIS behaves as RAO/WI 
in Non-secure state. This behavior also applies if the CPACR.TRCDIS control is not implemented. 
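
The interaction between the stored CPACR.TRCDIS value and the NSACR override can be sketched as follows. This is an illustrative model with hypothetical names. It works on already-extracted field values rather than raw register contents, and a non-implemented TRCDIS control is modeled by passing `false` for the stored value, because the field is then RAZ/WI.

```c
#include <stdbool.h>

/*
 * Hypothetical model: returns the value of CPACR.TRCDIS as seen by the PE.
 *   trcdis_stored     stored field value (false if the control is not
 *                     implemented, because the field is then RAZ/WI).
 *   non_secure        true when executing in Non-secure state.
 *   el3_uses_aarch32  true when EL3 is implemented and is using AArch32.
 *   nsacr_nstrcdis    value of NSACR.NSTRCDIS.
 */
static bool cpacr_trcdis_effective(bool trcdis_stored, bool non_secure,
                                   bool el3_uses_aarch32, bool nsacr_nstrcdis)
{
    /* With NSACR.NSTRCDIS set, the field behaves as RAO/WI in Non-secure state. */
    if (el3_uses_aarch32 && non_secure && nsacr_nstrcdis)
        return true;
    return trcdis_stored;
}
```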


When generated in Non-secure User mode, an exception generated by this control can be routed to EL2, as described
in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.

Enabling use of Advanced SIMD and floating-point functionality

Table G1-31 summarizes the controls of Advanced SIMD and floating-point functionality.

Table G1-31 Controls of use of Advanced SIMD and floating-point functionality

Control               Type    Description, see
CPACR.{cp11, cp10}    E       Enabling PL0 and PL1 accesses to the SIMD and floating-point registers
FPEXC.EN              E       Enabling access to the SIMD and floating-point registers on page G1-3891
CPACR.ASEDIS          D       Disabling PL0 and PL1 execution of Advanced SIMD instructions on page G1-3891





If any of CPACR.{cp11, cp10}, FPEXC.EN, or for Advanced SIMD instructions, CPACR.ASEDIS, disable a
floating-point or an Advanced SIMD instruction, the instruction is UNDEFINED. Support for the CPACR.ASEDIS
control is optional, and if the control is not implemented behavior is as if the control permits the execution of
Advanced SIMD instructions at PL1 and PL0.
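
The way the three controls combine can be sketched as a single predicate. This is a simplified illustration with hypothetical names. It takes the outcome of the CPACR.{cp11, cp10} check as a boolean for the current Privilege level, treats a non-implemented CPACR.ASEDIS as `false`, and ignores the register accesses that FPEXC.EN does not control.

```c
#include <stdbool.h>

/*
 * Hypothetical model: returns true when a floating-point or Advanced SIMD
 * instruction executed at PL0 or PL1 is UNDEFINED because of these controls.
 *   cpacr_permits     CPACR.{cp11, cp10} permit the access at this level.
 *   fpexc_en          value of FPEXC.EN.
 *   asedis            value of CPACR.ASEDIS (false if not implemented).
 *   is_advanced_simd  the instruction is an Advanced SIMD instruction.
 */
static bool simd_fp_insn_is_undefined(bool cpacr_permits, bool fpexc_en,
                                      bool asedis, bool is_advanced_simd)
{
    if (!cpacr_permits)
        return true;            /* disabled by CPACR.{cp11, cp10} */
    if (!fpexc_en)
        return true;            /* disabled by FPEXC.EN */
    if (is_advanced_simd && asedis)
        return true;            /* Advanced SIMD disabled by CPACR.ASEDIS */
    return false;
}
```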


When generated in Non-secure User mode, exceptions generated by these controls can be routed to EL2, as
described in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.


Enabling PL0 and PL1 accesses to the SIMD and floating-point registers

CPACR.{cp11, cp10} enable PL0 and PL1 accesses to the SIMD and floating-point registers.

When CPACR.cp10 is:

00    PL0 and PL1 accesses to Advanced SIMD and floating-point registers or instructions are
      UNDEFINED.
01    PL0 accesses to Advanced SIMD and floating-point registers or instructions are UNDEFINED.
10    Reserved. The effect of programming this field to this value is CONSTRAINED UNPREDICTABLE.
11    This control permits full access to the Advanced SIMD and floating-point functionality from PL0
      and PL1.

The value of CPACR.cp11 is ignored. If CPACR.cp11 is programmed with a different value to CPACR.cp10 then
CPACR.cp11 is UNKNOWN on a direct read of the CPACR.
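
The field encoding above can be illustrated with a decode helper and a helper that programs both fields to the same value. This is a sketch only. The field positions are those of the CPACR register description (cp10 is bits [21:20], cp11 is bits [23:22]), and the type and function names are hypothetical.

```c
#include <stdint.h>

#define CPACR_CP10_SHIFT 20u
#define CPACR_CP11_SHIFT 22u
#define CPACR_CP_MASK    UINT32_C(3)

enum cp_access {
    CP_ACCESS_NONE     = 0,  /* 0b00: PL0 and PL1 accesses are UNDEFINED */
    CP_ACCESS_PL1_ONLY = 1,  /* 0b01: PL0 accesses are UNDEFINED         */
    CP_ACCESS_RESERVED = 2,  /* 0b10: CONSTRAINED UNPREDICTABLE          */
    CP_ACCESS_FULL     = 3   /* 0b11: full access from PL0 and PL1       */
};

/* Decode CPACR.cp10. CPACR.cp11 is ignored, as the text above states. */
static enum cp_access cpacr_cp10_access(uint32_t cpacr)
{
    return (enum cp_access)((cpacr >> CPACR_CP10_SHIFT) & CPACR_CP_MASK);
}

/* Return 'cpacr' with cp10 and cp11 both programmed to 'access'. */
static uint32_t cpacr_set_simd_fp_access(uint32_t cpacr, enum cp_access access)
{
    uint32_t field = (uint32_t)access & CPACR_CP_MASK;

    cpacr &= ~((CPACR_CP_MASK << CPACR_CP10_SHIFT) |
               (CPACR_CP_MASK << CPACR_CP11_SHIFT));
    cpacr |= (field << CPACR_CP10_SHIFT) | (field << CPACR_CP11_SHIFT);
    return cpacr;
}
```

Writing both fields together avoids the case where CPACR.cp11 becomes UNKNOWN on a direct read.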


Note

Software must set CPACR.cp11 and CPACR.cp10 to the same value.








Table G1-32 shows the registers for which accesses are enabled. 


Table G1-32 Register accesses enabled at PL0 and PL1 by CPACR.{cp11, cp10}

Enabled at                       Registers
PL0 and PL1, or PL0 only(a)      FPSCR, FPEXC, FPSID(b), MVFR0, MVFR1, MVFR2, and any of the SIMD and floating-point registers
                                 Q0-Q15, including their views as D0-D31 registers or S0-S31 registers

a. Depending on the value of CPACR.{cp11, cp10}. See the register description for details.
b. Permitted VMSR accesses to the FPSID are ignored, but for the purposes of the {cp10, cp11} controls the architecture defines that a VMSR access
   to the FPSID from EL1 or higher is an access to a SIMD and floating-point register.


If EL3 is implemented and is using AArch32, and NSACR.{cp11, cp10} are both set to 0, the functionality
described in this section is disabled in Non-secure state, and CPACR.{cp11, cp10} are RAZ/WI in Non-secure state.
See Enabling Non-secure access to SIMD and floating-point functionality on page G1-3917.

For more information about SIMD and floating-point support, see Advanced SIMD and floating-point support on
page G1-3880.


Enabling access to the SIMD and floating-point registers 


FPEXC.EN enables accesses to the SIMD and floating-point registers at all Exception levels, but does not control 
the following: 


• VMSR accesses to the FPEXC or FPSID.
• VMRS accesses from the FPEXC, FPSID, MVFR0, MVFR1, or MVFR2.


When FPEXC.EN is: 
1 Accesses to the registers shown in Table G1-33 are enabled at all Exception levels. 
0 All accesses to the registers shown in Table G1-33 are UNDEFINED. 


Table G1-33 shows the registers for which accesses are enabled, and for an exception taken to Hyp mode, how the
exception is reported in HSR.

Table G1-33 Register accesses enabled when FPEXC.EN is 1

Enabled at       Registers                                                          Syndrome reporting in HSR(a)
All Exception    FPSCR, and any of the SIMD and floating-point registers Q0-Q15,    Exception for an unknown
levels           including their views as D0-D31 registers or S0-S31 registers.     reason, using EC value 0x00

a. Only for exceptions that are taken to Hyp mode.


For more information, see Advanced SIMD and floating-point support on page G1-3880. 


Disabling PL0 and PL1 execution of Advanced SIMD instructions

If implemented as an RW field, CPACR.ASEDIS can disable PL0 and PL1 execution of Advanced SIMD
instructions, as follows:

1    Advanced SIMD instructions are UNDEFINED at PL0 and PL1.
0    Advanced SIMD instruction execution is enabled at PL0 and PL1.

The instructions that CPACR.ASEDIS disables are those described in Controls of Advanced SIMD operation that
do not apply to floating-point operation on page E1-2306.

When the control is not implemented, meaning the CPACR.ASEDIS field is RAZ/WI, behavior is as if the control
permits execution of Advanced SIMD instructions at PL0 and PL1.

If EL3 is implemented and is using AArch32, and NSACR.NSASEDIS is 1, CPACR.ASEDIS is RAO/WI in
Non-secure state. This also applies when the CPACR.ASEDIS control is not implemented.


Traps to Undefined mode of EL0 accesses to the Debug Communications Channel (DCC) registers

DBGDSCRext.UDCCdis traps EL0 accesses to the DCC registers to Undefined mode:

1    EL0 accesses to the DCC registers are trapped to Undefined mode.
0    This control has no effect on EL0 accesses to the DCC registers.

Traps of EL0 accesses to the DBGDTRRXint and DBGDTRTXint are ignored in Debug state.

Table G1-34 shows the registers for which accesses are trapped.

Table G1-34 Register accesses trapped to Undefined mode when DBGDSCRext.UDCCdis is 1

Traps from    Registers
EL0           DBGDSCRint, DBGDTRRXint, DBGDTRTXint, DBGDIDR, DBGDSAR, DBGDRAR

Note

All accesses to these registers are trapped, including LDC and STC accesses to DBGDTRTXint and DBGDTRRXint,
and MRRC accesses to DBGDSAR and DBGDRAR.

When generated in Non-secure User mode, an exception generated by this control can be routed to EL2, as described
in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.


Traps to Undefined mode of EL0 accesses to the Generic Timer registers

CNTKCTL.{PL0PTEN, PL0VTEN, PL0PCTEN, PL0VCTEN} trap EL0 accesses to the Generic Timer registers
to Undefined mode, as follows:

• CNTKCTL.PL0PTEN traps EL0 accesses to the physical timer registers.
• CNTKCTL.PL0VTEN traps EL0 accesses to the virtual timer registers.
• CNTKCTL.PL0PCTEN traps EL0 accesses to the frequency register and physical counter register.
• CNTKCTL.PL0VCTEN traps EL0 accesses to the frequency register and virtual counter register.

For all of these controls:

1    This control has no effect on EL0 accesses to the corresponding registers.
0    EL0 accesses to the corresponding registers are trapped to Undefined mode.

Accesses to the frequency register, CNTFRQ, are only trapped if CNTKCTL.PL0PCTEN and
CNTKCTL.PL0VCTEN are both 0.
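
The CNTFRQ rule is the one case where two controls combine, which the following sketch makes explicit. It is an illustration only. The bit positions are those of the CNTKCTL register description (PL0PCTEN bit 0, PL0VCTEN bit 1, PL0VTEN bit 8, PL0PTEN bit 9), and the enumeration and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CNTKCTL_PL0PCTEN (UINT32_C(1) << 0)
#define CNTKCTL_PL0VCTEN (UINT32_C(1) << 1)
#define CNTKCTL_PL0VTEN  (UINT32_C(1) << 8)
#define CNTKCTL_PL0PTEN  (UINT32_C(1) << 9)

enum timer_reg {
    TIMER_REG_CNTFRQ,      /* frequency register                      */
    TIMER_REG_CNTPCT,      /* physical counter register               */
    TIMER_REG_CNTVCT,      /* virtual counter register                */
    TIMER_REG_PHYS_TIMER,  /* CNTP_CTL, CNTP_CVAL, CNTP_TVAL          */
    TIMER_REG_VIRT_TIMER   /* CNTV_CTL, CNTV_CVAL, CNTV_TVAL          */
};

/* Hypothetical model: true when an EL0 access to 'reg' is trapped. */
static bool timer_el0_access_trapped(uint32_t cntkctl, enum timer_reg reg)
{
    bool pcten = (cntkctl & CNTKCTL_PL0PCTEN) != 0;
    bool vcten = (cntkctl & CNTKCTL_PL0VCTEN) != 0;
    bool vten  = (cntkctl & CNTKCTL_PL0VTEN) != 0;
    bool pten  = (cntkctl & CNTKCTL_PL0PTEN) != 0;

    switch (reg) {
    case TIMER_REG_CNTFRQ:     return !pcten && !vcten; /* both must be 0 */
    case TIMER_REG_CNTPCT:     return !pcten;
    case TIMER_REG_CNTVCT:     return !vcten;
    case TIMER_REG_PHYS_TIMER: return !pten;
    case TIMER_REG_VIRT_TIMER: return !vten;
    }
    return false;
}
```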


Table G1-35 shows the registers for which accesses are trapped.

Table G1-35 Register accesses trapped to Undefined mode by CNTKCTL trap controls

Traps from    Trap control    Registers
EL0           PL0PTEN         CNTP_CTL, CNTP_CVAL, CNTP_TVAL
              PL0VTEN         CNTV_CTL, CNTV_CVAL, CNTV_TVAL
              PL0PCTEN        CNTFRQ, CNTPCT
              PL0VCTEN        CNTFRQ, CNTVCT

When generated in Non-secure User mode, an exception generated by this control can be routed to EL2, as described
in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.


Traps to Undefined mode of EL0 accesses to Performance Monitors registers

PMUSERENR.{ER, CR, SW, EN} trap EL0 accesses to the Performance Monitors registers to Undefined mode.
For each of these controls:

1    This control has no effect on EL0 accesses to the corresponding registers.
0    EL0 accesses to the corresponding registers are trapped to Undefined mode.

For those Performance Monitors registers that more than one PMUSERENR.{ER, CR, SW, EN} control applies to,
accesses are only trapped if all controls that apply are set to 0.
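
The "all applicable controls are 0" rule reduces to a single mask test, sketched below. The bit positions are those of the PMUSERENR register description (EN bit 0, SW bit 1, CR bit 2, ER bit 3). The function name and the idea of passing the applicable controls as a mask are assumptions made for this illustration. Which controls apply to a given register and access type must be taken from the table of trapped registers.

```c
#include <stdbool.h>
#include <stdint.h>

#define PMUSERENR_EN (UINT32_C(1) << 0)
#define PMUSERENR_SW (UINT32_C(1) << 1)
#define PMUSERENR_CR (UINT32_C(1) << 2)
#define PMUSERENR_ER (UINT32_C(1) << 3)

/*
 * Hypothetical model: 'applicable' is the non-zero mask of PMUSERENR
 * controls that apply to the register and access type being checked.
 * The EL0 access is trapped only if every applicable control is 0.
 */
static bool pmu_el0_access_trapped(uint32_t pmuserenr, uint32_t applicable)
{
    return (pmuserenr & applicable) == 0;
}
```

For example, a read of PMSELR is covered by both the ER and EN controls, so the caller would pass `PMUSERENR_ER | PMUSERENR_EN` and the access is trapped only when both bits are 0.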


The accesses that these trap controls trap might be reads, writes, or both. 


Note

• The architecture does not provide traps on Performance Monitors register accesses through the
  memory-mapped external debug interface.
• If the Performance Monitors Extension is not implemented, the Performance Monitors registers, including
  PMUSERENR, are reserved.

Table G1-36 shows the registers for which EL0 accesses are trapped. For each register, the table shows the type of
access trapped.

Table G1-36 Register accesses trapped to Undefined mode when disabled from EL0

Traps from    Trap       Registers                                                               Access
              control                                                                            type
EL0           ER         PMXEVCNTR, PMEVCNTR<n>                                                  R
                         PMSELR                                                                  RW
              CR         PMCCNTR, accessed using an MRC                                          R
              CR         PMCCNTR, accessed using an MRRC                                         R
              SW         PMSWINC                                                                 W
              EN         PMCNTENSET, PMCNTENCLR, PMCR, PMOVSR, PMSWINC, PMSELR, PMCEID0,         RW
                         PMCEID1, PMCCNTR, PMXEVTYPER, PMXEVCNTR, PMOVSSET, PMEVCNTR<n>,
                         PMEVTYPER<n>, PMCCFILTR

When generated in Non-secure User mode, an exception generated by this control can be routed to EL2, as described
in Routing exceptions from Non-secure EL0 to EL2 on page G1-3828.

G1.20.3 EL2 configurable controls

These controls are ignored in Secure state.

Table G1-37 shows the System registers that contain these controls.

Table G1-37 System registers that contain instruction enables and disables, and trap controls

Register name    Register description
FPEXC            Floating-point Exception Control Register
HCR              Hypervisor Configuration Register
HSTR             Hypervisor System Trap Register
HCPTR            Hyp Architectural Feature Trap Register
HDCR             Hyp Debug Control Register

Note

• FPEXC.EN is a control that is in a System register provided by PL1. However, it results in exceptions taken
  to Hyp mode.
• For completeness, Table G1-38 includes the HCR.TGE routing control, that is described in Routing
  exceptions from Non-secure EL0 to EL2 on page G1-3828.





Table G1-38 summarizes the controls. 


Table G1-38 Instruction enables and disables, and trap controls, for exceptions taken to Hyp mode

Control                          Control    Description
                                 type(a)
HSCTLR.{SED, ITD}                D          Disabling or enabling EL2 use of AArch32 deprecated functionality on
HSCTLR.CP15BEN                   E          page G1-3897
HCR.{TRVM, TVM}                  T          Traps to Hyp mode of Non-secure EL1 accesses to virtual memory control
                                            registers on page G1-3897
HCR.HCD                          D          Disabling Non-secure state execution of HVC instructions on page G1-3898
HCR.TGE                          R          Routing exceptions from Non-secure EL0 to EL2 on page G1-3828
HCR.TTLB                         T          Traps to Hyp mode of Non-secure EL1 execution of TLB maintenance
                                            instructions on page G1-3898
HCR.{TSW, TPC, TPU}              T          Traps to Hyp mode of Non-secure EL1 execution of cache maintenance
                                            instructions on page G1-3899
HCR.TAC                          T          Traps to Hyp mode of Non-secure EL1 accesses to the Auxiliary Control
                                            Register on page G1-3899
HCR.TIDCP                        T          Traps to Hyp mode of Non-secure EL0 and EL1 accesses to lockdown, DMA,
                                            and TCM operations on page G1-3900
HCR.TSC                          T          Traps to Hyp mode of Non-secure EL1 execution of SMC instructions on
                                            page G1-3901
HCR.{TID0, TID1, TID2, TID3}     T          Traps to Hyp mode of Non-secure EL0 and EL1 accesses to the ID registers on
                                            page G1-3901
HCR.{TWI, TWE}                   T          Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI
                                            instructions on page G1-3904
HCPTR.{TCP11, TCP10}             T          General trapping to Hyp mode of Non-secure accesses to the SIMD and
                                            floating-point registers on page G1-3905
FPEXC.EN                         T          Enabling access to the SIMD and floating-point registers on page G1-3906
HCPTR.TASE                       T          Traps to Hyp mode of Non-secure accesses to Advanced SIMD functionality on
                                            page G1-3906
HCPTR.TCPAC                      T          Traps to Hyp mode of Non-secure EL1 accesses to the CPACR on
                                            page G1-3906
HCPTR.TTA                        T          Traps to Hyp mode of Non-secure System register accesses to trace registers on
                                            page G1-3907
HSTR.{T0-T3, T5-T13, T15}        T          General trapping to Hyp mode of Non-secure EL0 and EL1 accesses to System
                                            registers in the (coproc==0b1111) encoding space on page G1-3908
HDCR.{TDRA, TDOSA, TDA}          T          Traps to Hyp mode of Non-secure System register accesses to debug registers
                                            on page G1-3909
CNTHCTL.{PL1PCEN, PL1PCTEN}      T          Traps to Hyp mode of Non-secure EL0 and EL1 accesses to the Generic Timer
                                            registers on page G1-3911
HDCR.{TPM, TPMCR}                T          Traps to Hyp mode of Non-secure EL0 and EL1 accesses to Performance
                                            Monitors registers on page G1-3912

a. See Table G1-39.


Table G1-39 Control types, for exceptions taken to Hyp mode

Abbreviation    Type               See
D               Disable            Instruction enables and instruction disables on page G1-3885
E               Enable             Instruction enables and instruction disables on page G1-3885
R               Routing control    Routing exceptions from Non-secure EL0 to EL2 on page G1-3828
T               Trap               Trap controls on page G1-3885

Also see the following:
• Register access instructions on page G1-3886.
• Instructions that fail their condition code check on page G1-3896.
• Trapping to EL2 of instructions that are UNPREDICTABLE on page G1-3896.

Instructions that fail their condition code check

For UNDEFINED instructions that fail their condition code check, see Conditional execution of undefined instructions
on page G1-3851.

For an instruction that has a Hyp trap set, that fails its condition code check:

• Unless the trap description states otherwise, it is IMPLEMENTATION DEFINED whether the instruction:
  — Generates a Hyp Trap exception.
  — Executes as a NOP.

Any implementation must be consistent in its handling of instructions that fail their condition code check. This
means that:

• Whenever a Hyp trap is set on an instruction it must either:
  — Always generate a Hyp Trap exception.
  — Always treat the instruction as a NOP.

• The IMPLEMENTATION DEFINED part of the requirements of Conditional execution of undefined instructions
  on page G1-3851 must be consistent with the handling of Hyp traps on instructions that fail their condition
  code check. Table G1-40 shows this:


Table G1-40 Consistent handling of instructions that fail their condition code check

Behavior of conditional UNDEFINED instruction(a)    Hyp trap on instruction that fails its condition code check(b)
Executes as a NOP                                   Executes as a NOP
Generates an Undefined Instruction exception        Generates a Hyp Trap exception

a. As defined in Conditional execution of undefined instructions on page G1-3851. In Non-secure EL0 and EL1 modes, this applies only
   if no Hyp trap is set for the instruction, otherwise see the behavior in the other column of the table.
b. For a trapped instruction executed in a Non-secure EL1 or EL0 mode.


Note

Hyp traps on WFE and WFI instructions generate Hyp Trap exceptions only if the instruction passes its condition code
check. See Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions on
page G1-3904.








Trapping to EL2 of instructions that are UNPREDICTABLE 


For an instruction that is UNPREDICTABLE or CONSTRAINED UNPREDICTABLE, when the instruction is disabled or 
trapped then it is CONSTRAINED UNPREDICTABLE whether execution of the instruction generates a Hyp Trap 
exception. 


Note 


UNPREDICTABLE and CONSTRAINED UNPREDICTABLE behavior must not perform any function that cannot be 
performed at the current or lower Exception level using instructions that are not UNPREDICTABLE and are not 
CONSTRAINED UNPREDICTABLE. This means that disabling or trapping an instruction changes the set of instructions 
that might be executed in Non-secure state at EL1 or EL0. This indirectly affects the permitted behavior of
UNPREDICTABLE and CONSTRAINED UNPREDICTABLE instructions. 








If no instructions are trapped, the attempted execution of an UNPREDICTABLE instruction in a Non-secure EL1 or 
EL0 mode must not generate a Hyp Trap exception.
Disabling or enabling EL2 use of AArch32 deprecated functionality

Table G1-41 shows the deprecated AArch32 functionality that might have disable controls in the HSCTLR:
• The SED control is always implemented.
• Whether each of the ITD and CP15BEN controls is implemented is IMPLEMENTATION DEFINED. If a control is
  not implemented then the associated functionality cannot be disabled.


These HSCTLR controls apply only to execution at EL2 using AArch32. When an instruction is disabled by one of 
these controls, it is UNDEFINED at EL2, meaning it is undefined in Hyp mode. 


Table G1-41 EL2 controls for disabling and enabling EL2 use of AArch32 deprecated functionality

Deprecated AArch32 functionality                   Instruction enable or     Disabled instructions
                                                   disable in the HSCTLR
-------------------------------------------------  ------------------------  ------------------------------------
SETEND instructions                                SED [a]                   SETEND instructions
Some uses of IT instructions                       ITD [b]                   See the HSCTLR.ITD description
Accesses to the System register (coproc==0b1111)   CP15BEN [c]               MCR accesses to the CP15DMB, CP15DSB,
DMB, DSB, and ISB barrier operations                                         and CP15ISB

a. SETEND instruction disable. SETEND instructions are disabled when the value of this field is 1.
b. IT instruction disable. If this control is implemented, some uses of IT instructions are disabled when the value of
   this field is 1.
c. System register (coproc==0b1111) memory barrier enable. If this control is implemented, the specified register
   accesses are disabled when the value of CP15BEN is 0.

Note 
• These controls have no effect on instructions executed in any mode other than Hyp mode. The SCTLR
  provides similar controls that apply to execution in other modes.
• The uses of the IT instruction, and use of the CP15DMB, CP15DSB, and CP15ISB barrier instructions, are
  deprecated for performance reasons.





Traps to Hyp mode of Non-secure EL1 accesses to virtual memory control registers 
HCR.{TRVM, TVM} trap Non-secure EL1 accesses to the virtual memory control registers to Hyp mode:


HCR.TRVM, for read accesses: 


1 Non-secure EL1 reads of the virtual memory control registers are trapped to Hyp mode. 
0 This control has no effect on Non-secure EL1 reads of the virtual memory control 
registers. 


HCR.TVM, for write accesses:


1 Non-secure EL1 writes to the virtual memory control registers are trapped to Hyp mode. 
0 This control has no effect on Non-secure EL1 writes to the virtual memory control 
registers. 


Table G1-42 on page G1-3898 shows the registers for which:
• Reads are trapped to Hyp mode when HCR.TRVM is 1.
• Writes are trapped to Hyp mode when HCR.TVM is 1.

The table also shows how the exceptions are reported in HSR.


Table G1-42 Register read and write accesses trapped when HCR.{TRVM, TVM} are 1

Traps from        Registers                                Syndrome reporting in HSR
----------------  ---------------------------------------  ---------------------------------------------
Non-secure EL1    SCTLR, TTBR0, TTBR1, TTBCR, DACR, DFSR,  Trapped MCR or MRC access (coproc==0b1111),
                  IFSR, DFAR, IFAR, ADFSR, AIFSR, PRRR,    using EC value 0x03
                  NMRR, MAIR0, MAIR1, AMAIR0, AMAIR1,
                  CONTEXTIDR                               Trapped MCRR or MRRC access (coproc==0b1111),
                                                           using EC value 0x04

Note

These registers are not accessible at EL0.
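
The read/write split between HCR.TRVM and HCR.TVM can be sketched as a small decision function. This is only an illustrative model, not architectural pseudocode: the function name is hypothetical, and the bit positions used for TVM and TRVM are an assumption that this section does not state, so check them against the HCR register description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the trap controls in the AArch32 HCR.
 * They are not given in this section; verify against the HCR description. */
#define HCR_TVM  (UINT32_C(1) << 26)  /* traps Non-secure EL1 writes */
#define HCR_TRVM (UINT32_C(1) << 30)  /* traps Non-secure EL1 reads  */

/* Hypothetical helper: returns true if a Non-secure EL1 access to one of the
 * virtual memory control registers in Table G1-42 is trapped to Hyp mode. */
static inline bool vm_ctrl_access_is_trapped(uint32_t hcr, bool is_write)
{
    uint32_t control = is_write ? HCR_TVM : HCR_TRVM;
    return (hcr & control) != 0;
}
```

With only HCR.TVM set, the sketch reports that writes are trapped and reads are not, which matches the independent descriptions of the two controls above.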





Disabling Non-secure state execution of HVC instructions 
HCR.HCD disables Non-secure state execution of HVC instructions: 


1 HVC instructions are UNDEFINED at EL2 and Non-secure EL1. The Undefined Instruction exception 
is taken from the current Exception level to the current Exception level. 


0 HVC instruction execution is enabled at EL2 and Non-secure EL1. 





Note 
HVC instructions are always UNDEFINED at EL0.





HCR.HCD is only implemented if EL3 is not implemented. Otherwise, it is RES0. See the HCR register description.


Table G1-43 shows how the exceptions are reported in HSR. 


Table G1-43 Instruction that causes exceptions when HCR.HCD is 1

Attempted execution in      Disabled instruction    Syndrome reporting in HSR
--------------------------  ----------------------  ----------------------------------------------------
Hyp mode                    HVC                     Exception for an unknown reason, using EC value 0x00
Mode other than Hyp mode    HVC                     Not applicable





Traps to Hyp mode of Non-secure EL1 execution of TLB maintenance instructions 
In the ARMv8-A architecture, the system instruction encoding space includes TLB maintenance instructions. 


HCR.TTLB traps Non-secure EL1 execution of TLB maintenance instructions to Hyp mode: 
1 Any attempt to execute a TLBI instruction at Non-secure EL1 is trapped to Hyp mode. 


0 This control has no effect on the Non-secure EL1 execution of TLBI instructions. 


Table G1-44 shows the instructions that are trapped, and how the exceptions are reported in HSR. 


Table G1-44 Instructions trapped to Hyp mode when HCR.TTLB is 1

Traps from        Trapped instructions                                        Syndrome reporting in HSR
----------------  ----------------------------------------------------------  --------------------------------
Non-secure EL1    TLBIALLIS, TLBIMVAIS, TLBIASIDIS, TLBIMVAAIS, TLBIMVALIS,   Trapped MCR or MRC access
                  TLBIMVAALIS, ITLBIALL, ITLBIMVA, ITLBIASID, DTLBIALL,       (coproc==0b1111), using EC value
                  DTLBIMVA, DTLBIASID, TLBIALL, TLBIMVA, TLBIASID,            0x03
                  TLBIMVAA, TLBIMVAL, TLBIMVAAL.

Note
These instructions are always UNDEFINED at EL0.








For more information about these instructions, see The scope of TLB maintenance instructions on page G4-4101. 


Traps to Hyp mode of Non-secure EL1 execution of cache maintenance instructions 
HCR.{TSW, TPC, TPU} trap cache maintenance instructions to Hyp mode: 
0 The control has no effect on the execution of cache maintenance instructions. 


1 Any attempt to execute one of the cache maintenance instructions shown in Table G1-46 at 
Non-secure EL1 is trapped to Hyp mode.


Table G1-45 Controls for trapping cache maintenance instructions to Hyp mode

Trap control    Trapped instructions
--------------  --------------------------------------------------------
HCR.TSW         Data or unified cache maintenance by set/way
HCR.TPC         Data or unified cache maintenance to point of coherency
HCR.TPU         Cache maintenance to point of unification





Table G1-46 shows the instructions that are trapped to Hyp mode, and how the exceptions are reported in HSR. 


Table G1-46 Instructions trapped to Hyp mode when HCR.{TSW, TPC, TPU} are 1

Traps from        Trap control    Trapped instructions                    Syndrome reporting in HSR
----------------  --------------  --------------------------------------  ------------------------------------
Non-secure EL1    TSW             DCISW, DCCSW, DCCISW                    Trapped MCR or MRC access
                  TPC             DCIMVAC, DCCIMVAC, DCCMVAC              (coproc==0b1111), using EC value 0x03
                  TPU             ICIMVAU, ICIALLU, ICIALLUIS, DCCMVAU

Note

These instructions are always UNDEFINED at EL0.





For more information about these instructions, see Cache maintenance instructions, functional group on 
page G4-4201. 
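
As a rough illustration of Table G1-46, the mapping from a cache maintenance instruction to the control that traps it can be written as a lookup. The type and function names below are hypothetical and exist only for this sketch; the instruction-to-control mapping is taken from the table, and the three controls are passed as plain booleans so that no HCR bit positions need to be assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Which HCR control, if any, traps a given cache maintenance instruction. */
typedef enum { TRAP_NONE, TRAP_TSW, TRAP_TPC, TRAP_TPU } cache_trap_control;

typedef struct {
    const char *name;
    cache_trap_control control;
} cache_op_entry;

/* Instruction-to-control mapping, as listed in Table G1-46. */
static const cache_op_entry cache_ops[] = {
    {"DCISW", TRAP_TSW},   {"DCCSW", TRAP_TSW},    {"DCCISW", TRAP_TSW},
    {"DCIMVAC", TRAP_TPC}, {"DCCIMVAC", TRAP_TPC}, {"DCCMVAC", TRAP_TPC},
    {"ICIMVAU", TRAP_TPU}, {"ICIALLU", TRAP_TPU},  {"ICIALLUIS", TRAP_TPU},
    {"DCCMVAU", TRAP_TPU},
};

/* Returns the control that governs the named instruction, or TRAP_NONE. */
static inline cache_trap_control cache_op_control(const char *name)
{
    for (size_t i = 0; i < sizeof cache_ops / sizeof cache_ops[0]; i++) {
        if (strcmp(cache_ops[i].name, name) == 0)
            return cache_ops[i].control;
    }
    return TRAP_NONE;
}

/* True if a Non-secure EL1 execution of the named instruction is trapped
 * to Hyp mode for the given values of HCR.{TSW, TPC, TPU}. */
static inline bool cache_op_is_trapped(const char *name, bool tsw, bool tpc, bool tpu)
{
    switch (cache_op_control(name)) {
    case TRAP_TSW: return tsw;
    case TRAP_TPC: return tpc;
    case TRAP_TPU: return tpu;
    default:       return false;
    }
}
```

For example, with only HCR.TPU set the sketch reports ICIALLU as trapped and DCCISW as not trapped.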


Traps to Hyp mode of Non-secure EL1 accesses to the Auxiliary Control Register 


HCR.TAC traps Non-secure EL1 accesses to the Auxiliary Control Registers to Hyp mode: 
1 Non-secure EL1 accesses to the Auxiliary Control Registers are trapped to Hyp mode. 


0 This control has no effect on Non-secure EL1 accesses to the Auxiliary Control Registers. 


Table G1-47 shows the registers for which accesses are trapped, and how the exceptions are reported in HSR: 


Table G1-47 Register accesses trapped to Hyp mode when HCR.TAC is 1

Traps from        Registers                            Syndrome reporting in HSR
----------------  -----------------------------------  ---------------------------------------------------
Non-secure EL1    ACTLR and, if implemented, ACTLR2.   Trapped MCR or MRC access (coproc==0b1111), using EC
                                                       value 0x03

Note
The ACTLR and ACTLR2 are not accessible at EL0.








Traps to Hyp mode of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM
operations


The lockdown, DMA, and TCM features of the ARMv8-A architecture are IMPLEMENTATION DEFINED. The 
architecture reserves the encodings of a number of System registers for control of these features. 


HCR.TIDCP traps the execution of System register access instructions that access these registers, as follows: 


1       At Non-secure EL1, any attempt to execute an MCR or MRC instruction with a reserved register
        encoding shown in Table G1-48 is trapped to Hyp mode.

        At Non-secure EL0, it is IMPLEMENTATION DEFINED whether attempts to execute MCR or MRC
        instructions with reserved register encodings are:
        • Trapped to Hyp mode.
        • UNDEFINED, and the PE takes the Undefined Instruction exception to Non-secure Undefined
          mode.

        Any lockdown fault in the memory system caused by the use of these operations in Non-secure state
        generates a Data Abort exception that is taken to Hyp mode.

0       This control has no effect on Non-secure EL0 and EL1 System register access instructions with
        reserved register encodings shown in Table G1-48.


Note 


This means that a Hyp Trap exception taken from Non-secure EL1 to Hyp mode, generated because of a 
configuration setting in HCR.TIDCP is a higher priority exception than an Undefined Instruction exception 
generated because either the System register encoding is unallocated or because the register is never accessible at 
EL1. As Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816 shows, this 
is an exception to the general exception prioritization rules, that prioritize most Undefined Instruction exceptions 
taken to Undefined mode above traps to EL2. 








Table G1-48 shows the register encodings for which accesses are trapped to Hyp mode, and how the exceptions are 
reported in HSR. 


Table G1-48 Encodings trapped to Hyp mode when HCR.TIDCP is 1

Traps from        Register encodings                                                    Syndrome reporting in HSR
----------------  --------------------------------------------------------------------  --------------------------
Non-secure EL0    An access to any of the following encodings:                          Trapped MCR or MRC access
and EL1           • CRn==c9, opc1=={0-7}, CRm=={c0-c2, c5-c8}, opc2=={0-7}.             (coproc==0b1111), using EC
                  • CRn==c10, opc1=={0-7}, CRm=={c0, c1, c4, c8}, opc2=={0-7}.          value 0x03
                  • CRn==c11, opc1=={0-7}, CRm=={c0-c8, c15}, opc2=={0-7}.

An implementation can also include IMPLEMENTATION DEFINED registers that provide additional controls, to give 
finer-grained control of the trapping of IMPLEMENTATION DEFINED features. 
Note 
ARM expects the trapping of Non-secure User mode accesses to these functions to Hyp mode to be unusual, and 
used only when the hypervisor is virtualizing User mode operation. ARM strongly recommends that unless the 
hypervisor must virtualize User mode operation, a Non-secure User mode access to any of these functions generates 
an Undefined Instruction exception, as it would if the implementation did not include EL2. The PE then takes this 
exception to Non-secure Undefined mode. 
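
The encoding ranges in Table G1-48 can be checked with a short predicate. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the register numbers are passed as plain integers (c9 as 9, and so on), and the CRm set for CRn==c10 is read as {c0, c1, c4, c8}. The function only tests membership of the reserved encoding space; it does not model the IMPLEMENTATION DEFINED choice for accesses from EL0.

```c
#include <stdbool.h>

/* Hypothetical helper: true if an MCR or MRC with coproc==0b1111 uses one of
 * the reserved encodings of Table G1-48 that HCR.TIDCP can trap. */
static inline bool tidcp_reserved_encoding(unsigned crn, unsigned opc1,
                                           unsigned crm, unsigned opc2)
{
    /* opc1 and opc2 are 3-bit fields, and every value 0-7 is included. */
    if (opc1 > 7 || opc2 > 7)
        return false;

    switch (crn) {
    case 9:   /* CRm in {c0-c2, c5-c8} */
        return crm <= 2 || (crm >= 5 && crm <= 8);
    case 10:  /* CRm in {c0, c1, c4, c8} */
        return crm == 0 || crm == 1 || crm == 4 || crm == 8;
    case 11:  /* CRm in {c0-c8, c15} */
        return crm <= 8 || crm == 15;
    default:
        return false;
    }
}
```

A hypervisor model could combine this predicate with the value of HCR.TIDCP and the Exception level of the access to decide whether a Hyp Trap exception is generated.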

Traps to Hyp mode of Non-secure EL1 execution of SMC instructions


HCR.TSC traps Non-secure EL1 execution of SMC instructions to Hyp mode: 


1 Any attempt to execute an SMC instruction at Non-secure EL1 is trapped to Hyp mode, regardless of
the value of SCR.SCD. 
0 This control has no effect on Non-secure EL1 execution of SMC instructions. 


Table G1-49 shows how the exceptions are reported in HSR: 


Table G1-49 SMC Instruction trapped to Hyp mode when HCR.TSC is 1

Traps from        Trapped instruction    Syndrome reporting in HSR
----------------  ---------------------  -------------------------------------------------------------------------
Non-secure EL1    SMC on page F5-2983    Trapped SMC instruction execution in AArch32 state, using EC value 0x13





The ARMv8-A architecture permits, but does not require, this trap to apply to conditional SMC instructions that fail 
their condition code check, in the same way as with traps on other conditional instructions. 





Note 
• This trap is implemented only if the implementation includes EL3.
• SMC instructions are always UNDEFINED at EL0.
• HCR.TSC traps execution of the SMC instruction. It is not a routing control for the SMC exception. Hyp Trap
  and SMC exceptions have different preferred return addresses.





For more information about SMC instructions, see SMC on page F5-2983. 


Traps to Hyp mode of Non-secure EL0 and EL1 accesses to the ID registers


Other than the MIDR, MPIDR, and PMCR.N, the ID registers are divided into groups, with a trap control in the 
HCR for each group. 


Table G1-50 ID register groups

Trap control    Register group
--------------  -----------------------------------------------------------------------------
HCR.TID0        ID group 0, Primary device identification registers on page G1-3902
HCR.TID1        ID group 1, Implementation identification registers on page G1-3903
HCR.TID2        ID group 2, Cache identification registers on page G1-3903
HCR.TID3        ID group 3, Detailed feature identification registers on page G1-3904





These controls trap register accesses from Non-secure EL0 or EL1 to Hyp mode, as follows:

HCR.TID0    0   This control has no effect on Non-secure EL1 reads of the ID group 0 registers.
            1   Any attempt at Non-secure EL0 or EL1 to read any register in ID group 0 is trapped to
                Hyp mode.
HCR.TID1    0   This control has no effect on Non-secure EL1 reads of the ID group 1 registers.
            1   Any attempt at Non-secure EL1 to read any register in ID group 1 is trapped to Hyp
                mode.
HCR.TID2    0   This control has no effect on Non-secure EL1 and EL0 accesses to the ID group 2
                registers.
            1   Any attempt at Non-secure EL0 or EL1 to read any register in ID group 2, and any
                attempt at Non-secure EL0 or EL1 to write to the CSSELR, is trapped to Hyp mode.
HCR.TID3    0   This control has no effect on Non-secure EL1 reads of the ID group 3 registers.
            1   Any attempt at Non-secure EL1 to read any register in ID group 3 is trapped to Hyp
                mode.


For the MIDR and MPIDR, and for PMCR.N, the architecture provides read/write aliases. The original register 
becomes accessible only from Hyp mode and Secure state, and a Non-secure EL0 or EL1 read of the original register
returns the value of the read/write alias. This substitution is invisible to the EL0 or EL1 software reading the register.


Table G1-51 ID register substitution

Register                                 Original    Alias, EL2 using AArch32
---------------------------------------  ----------  ------------------------
Main ID                                  MIDR        VPIDR
Multiprocessor Affinity                  MPIDR       VMPIDR
Performance Monitors Control Register    PMCR.N      HDCR.HPMN





Reads of the MIDR, MPIDR, or PMCR.N from Hyp mode or Secure state are unchanged by the implementation of 
EL2, and access the physical registers. 





Note 
• If the optional Performance Monitors Extension is not implemented, HDCR.HPMN is RES0 and PMCR is
  reserved.
• HDCR.HPMN also affects whether a Performance Monitors counter can be accessed from Non-secure EL1
  or EL0. See the register description of HDCR for more information.
• PMCR contains other fields that identify the implementation. For more information about trapping accesses
  to the PMCR, see Traps to Hyp mode of Non-secure EL0 and EL1 accesses to Performance Monitors
  registers on page G1-3912.





A reset into AArch32 state sets VPIDR to the MIDR value, VMPIDR to the MPIDR value, and HDCR.HPMN to 
the PMCR.N value. 
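
The substitution in Table G1-51 can be pictured with a small software model. This is a sketch for illustration only: the structure, enumeration, and function names are hypothetical, PMCR.N and HDCR.HPMN are left out for brevity, and the register values used with it are placeholders rather than values from any real implementation.

```c
#include <stdint.h>

/* Hypothetical model of the physical ID registers and their EL2 aliases. */
typedef struct {
    uint32_t midr;    /* physical Main ID Register */
    uint32_t mpidr;   /* physical Multiprocessor Affinity Register */
    uint32_t vpidr;   /* read/write alias returned for MIDR reads */
    uint32_t vmpidr;  /* read/write alias returned for MPIDR reads */
} id_regs;

/* Where the read is performed from. */
typedef enum { CTX_NS_EL0_EL1, CTX_HYP, CTX_SECURE } read_context;

/* A reset into AArch32 state copies the physical values into the aliases. */
static inline void id_regs_reset(id_regs *r, uint32_t midr, uint32_t mpidr)
{
    r->midr = midr;
    r->mpidr = mpidr;
    r->vpidr = midr;
    r->vmpidr = mpidr;
}

/* Non-secure EL0 or EL1 reads return the alias. Hyp mode and Secure state
 * reads access the physical register. */
static inline uint32_t read_midr(const id_regs *r, read_context ctx)
{
    return (ctx == CTX_NS_EL0_EL1) ? r->vpidr : r->midr;
}

static inline uint32_t read_mpidr(const id_regs *r, read_context ctx)
{
    return (ctx == CTX_NS_EL0_EL1) ? r->vmpidr : r->mpidr;
}
```

After reset every context reads the same value. Once the hypervisor writes a different value to VPIDR, only the Non-secure EL0 and EL1 view of MIDR changes.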


ID group 0, Primary device identification registers 
These registers identify some top-level implementation choices. 


Table G1-52 shows the registers that are in ID group 0 for traps to Hyp mode, and how the exceptions are reported
in HSR.


Table G1-52 ID group 0 registers

Traps from                Group 0 registers    Syndrome reporting in HSR
------------------------  -------------------  ----------------------------------------------------------------
Non-secure EL1            FPSID                Trapped VMRS access, for ID group traps, using EC value 0x08
Non-secure EL0 and EL1    JIDR                 Trapped MCR or MRC access (coproc==0b1110), using EC value 0x05

Note
The FPSID is not accessible at EL0.








If HCPTR.{TCP11, TCP10} traps accesses to SIMD and floating-point functionality, then for a read of FPSID, that 
trap has priority over this trap. 
When the FPSID is accessible, a VMSR FPSID, <Rt> instruction is permitted but is ignored. The execution of this VMSR
instruction is not trapped by the ID group 0 trap. 


ID group 1, Implementation identification registers 
These registers often provide coarse-grained identification mechanisms for implementation-specific features. 


Table G1-53 shows the registers that are in ID group 1 for traps to Hyp mode, and how the exceptions are reported 
in HSR. 


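Table G1-53 ID group 1 registers

Traps from        Group 1 registers             Syndrome reporting in HSR
----------------  ----------------------------  ----------------------------------------------------------------
Non-secure EL1    TCMTR, TLBTR, REVIDR, AIDR    Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03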





ID group 2, Cache identification registers 
These registers describe and control the cache implementation. 


Table G1-54 shows the registers that are in ID group 2 for traps to Hyp mode, and how the exceptions are reported 
in HSR. 


Table G1-54 ID group 2 registers

Traps from                Group 2 registers             Syndrome reporting in HSR
------------------------  ----------------------------  ----------------------------------------------------------------
Non-secure EL0 and EL1    CTR, CCSIDR, CLIDR, CSSELR    Trapped MCR or MRC access (coproc==0b1111), using EC value 0x03


ID group 3, Detailed feature identification registers


These registers provide detailed information about the features of the implementation. 


Note 


These registers are called the CPUID registers. There is no requirement for this trap to apply to those registers that 
the CPUID Identification Scheme defines as reserved. See The CPUID identification scheme on page G4-4195. 








Table G1-55 shows the registers that are in ID group 3 for traps to Hyp mode, and how the exceptions are reported
in HSR.


Table G1-55 ID group 3 registers

Traps from        Group 3 registers                                                    Syndrome reporting in HSR
----------------  -------------------------------------------------------------------  --------------------------------
Non-secure EL1    MVFR0, MVFR1, MVFR2.                                                 Trapped VMRS access for ID group
                                                                                       traps, using EC value 0x08

                  ID_PFR0, ID_PFR1, ID_DFR0, ID_AFR0.                                  Trapped MCR or MRC access
                  ID_MMFR0, ID_MMFR1, ID_MMFR2, ID_MMFR3, and ID_MMFR4,                (coproc==0b1111), using EC value
                  except that if ID_MMFR4 is implemented as RAZ/WI then it is          0x03
                  IMPLEMENTATION DEFINED whether reads of the register are trapped.
                  ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, ID_ISAR4, ID_ISAR5.
                  Any MRC access to any of the following encodings when
                  coproc==0b1111:
                  • opc1 == 0, CRn == c0, CRm == {c3-c7}, opc2 == {0, 1}.
                  • opc1 == 0, CRn == c0, CRm == c3, opc2 == 2.
                  • opc1 == 0, CRn == c0, CRm == c5, opc2 == {4, 5}.
                  It is IMPLEMENTATION DEFINED whether HCR.TID3 traps MRC accesses
                  with coproc==0b1111 to encodings in the following range that are
                  not already mentioned in this table:
                  • CRn == c0, opc1 == 0, CRm == {c2-c7}, opc2 == {0-7}.


If HCPTR traps accesses to SIMD and floating-point functionality, then for reads of MVFR0, MVFR1, and
MVFR2, that trap has priority over this trap.


Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions


HCR.{TWE, TWI} trap Non-secure EL0 and EL1 execution of WFE and WFI instructions to Hyp mode:

HCR.TWE:
    1   Any attempt to execute a WFE instruction at Non-secure EL0 or EL1 is trapped to Hyp
        mode, if the instruction would otherwise have caused the PE to enter a low-power state.
    0   This control has no effect on Non-secure EL0 or EL1 execution of WFE instructions.
HCR.TWI:
    1   Any attempt to execute a WFI instruction at Non-secure EL0 or EL1 is trapped to Hyp
        mode, if the instruction would otherwise have caused the PE to enter a low-power state.
    0   This control has no effect on Non-secure EL0 or EL1 execution of WFI instructions.

Table G1-56 shows how the exceptions are reported in HSR.


Table G1-56 Instructions trapped to Hyp mode when HCR.{TWE, TWI} are 1

Traps from                Trapped instructions    Syndrome reporting in HSR
------------------------  ----------------------  ------------------------------------------------------
Non-secure EL0 and EL1    WFE                     Trapped WFI or WFE instruction, using EC value 0x01
                          WFI





The attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction passes its condition 
code check. 


Note 


Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the 
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken. 








For more information about these instructions, and when they can cause the PE to enter a low-power state, see: 
° Wait For Event and Send Event on page G1-3872. 
° Wait For Interrupt on page G1-3875. 
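
The conditions under which a WFE or WFI is trapped can be summarized in one predicate. The sketch below uses hypothetical names and takes the trap controls as plain booleans. It assumes the caller already knows whether the instruction passed its condition code check and whether it would otherwise have caused the PE to enter a low-power state. It does not model the fact that a WFE or WFI can complete at any time.

```c
#include <stdbool.h>

typedef enum { WFX_WFE, WFX_WFI } wfx_kind;

/* Hypothetical helper: true if a WFE or WFI executed at Non-secure EL0 or EL1
 * is trapped to Hyp mode, given the values of HCR.TWE and HCR.TWI. */
static inline bool wfx_is_trapped(wfx_kind kind, bool twe, bool twi,
                                  bool cond_passed, bool would_enter_low_power)
{
    /* Select the control that applies to this instruction. */
    bool control = (kind == WFX_WFE) ? twe : twi;

    /* The trap is taken only if the control is set, the condition code check
     * passes, and the PE would otherwise have entered a low-power state. */
    return control && cond_passed && would_enter_low_power;
}
```

Note that a conditional WFI that fails its condition code check is never reported as trapped by this model, whatever the value of HCR.TWI.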


General trapping to Hyp mode of Non-secure accesses to the SIMD and floating-point 
registers 


HCPTR.{TCP11, TCP10} trap Non-secure accesses to the SIMD and floating-point registers to Hyp mode: 


0b11    All Non-secure accesses to the SIMD and floating-point registers are trapped to Hyp mode. Trapped
        instructions generate:
        • Hyp Trap exceptions, if the exception is taken from Non-secure EL0 or EL1.
        • Undefined Instruction exceptions taken to Hyp mode, if the exception is taken from EL2.
0b00    This control has no effect on Non-secure accesses to the SIMD and floating-point registers.


Note
Software must set HCPTR.TCP11 and HCPTR.TCP10 to the same value.





Table G1-57 shows the registers for which accesses are trapped, and how the exceptions are reported in HSR. 


Table G1-57 Register accesses trapped to Hyp mode when HCPTR.{TCP11, TCP10} are both 0b11

Traps from          Registers                                                     Syndrome reporting in HSR
------------------  ------------------------------------------------------------  ------------------------------------------
Non-secure state    FPSID, MVFR0, MVFR1, MVFR2, FPSCR, FPEXC, and any of the      Trapped access to SIMD and floating-point
                    SIMD and floating-point registers Q0-Q15, including their     register, resulting from HCPTR, using EC
                    views as D0-D31 registers or S0-S31 registers. See Advanced   value 0x07 [a]
                    SIMD and floating-point System registers on page G1-3882.

a. VMSR accesses to the FPSID are ignored, but for the purposes of this trap the architecture defines a VMSR access to
   the FPSID from EL1 or higher as an access to a SIMD and floating-point register.


If EL3 is implemented and is using AArch32, and NSACR.{cp11, cp10} are both set to 0, then 
HCPTR.{TCP11, TCP10} behave as RAO/WI, regardless of their actual value. 


For more information about SIMD and floating-point support, see Advanced SIMD and floating-point support on 
page G1-3880. 

Enabling access to the SIMD and floating-point registers


FPEXC.EN is an instruction enable that enables access to the SIMD and floating-point registers from all Exception
levels, but does not control the following:

• VMSR accesses to the FPEXC or FPSID.
• VMRS accesses from the FPEXC, FPSID, MVFR0, MVFR1, or MVFR2.


FPEXC.EN is a PL1 control that also applies at EL2. See Enabling access to the SIMD and floating-point registers
on page G1-3891.


Traps to Hyp mode of Non-secure accesses to Advanced SIMD functionality


If implemented as an RW field, HCPTR.TASE can trap Non-secure execution of Advanced SIMD instructions to 
Hyp mode, as follows. This trap applies only when HCPTR.{TCP11, TCP10} are both 0: 


1       Any attempt to execute an Advanced SIMD instruction in Non-secure state is trapped to Hyp mode.
        Trapped instructions generate:
        • Hyp Trap exceptions, if the exception is taken from Non-secure EL0 or EL1.
        • Undefined Instruction exceptions taken to Hyp mode, if the exception is taken from EL2.
0       This control has no effect on Non-secure execution of Advanced SIMD instructions.


When the control is not implemented, meaning the HCPTR.TASE field is RAZ/WI, the HCPTR does not provide a
trap to Hyp mode of the Non-secure execution of Advanced SIMD instructions, other than the
HCPTR.{TCP11, TCP10} trap that applies to Non-secure execution of both Advanced SIMD and floating-point
instructions.


Table G1-32 on page G1-3890 shows the instructions that are trapped, and how the exceptions are reported in HSR. 


Table G1-58 Instructions trapped to Hyp mode when HCPTR.TASE is set to 1

Traps from          Instructions                                                    Syndrome reporting in HSR
------------------  --------------------------------------------------------------  ------------------------------------------
Non-secure state    All Advanced SIMD instructions that are not also floating-point Trapped access to SIMD and floating-point
                    instructions. For more information see Controls of Advanced     register, resulting from HCPTR, using EC
                    SIMD operation that do not apply to floating-point operation    value 0x07
                    on page E1-2306.





If EL3 is implemented and is using AArch32, and NSACR.NSASEDIS is 1, then HCPTR.TASE behaves as 
RAO/WI, regardless of its actual value. This behavior also applies when the HCPTR.TASE control is not 
implemented. 
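
The interaction of the HCPTR.{TCP11, TCP10} and HCPTR.TASE traps can be sketched as follows. All names are hypothetical. The model assumes Non-secure execution, represents TCP11 and TCP10 by a single boolean because software must set them to the same value, assumes TASE is implemented as an RW field, and leaves out the NSACR overrides described above.

```c
#include <stdbool.h>

/* Coarse classification of a SIMD and floating-point instruction. */
typedef enum {
    INSN_FP,            /* floating-point, or shared with floating-point */
    INSN_ADVSIMD_ONLY   /* Advanced SIMD that is not also floating-point */
} simd_fp_class;

typedef enum {
    HCPTR_NO_TRAP,        /* HCPTR does not trap the instruction */
    HCPTR_HYP_TRAP,       /* Hyp Trap exception, taken from Non-secure EL0 or EL1 */
    HCPTR_UNDEF_TO_HYP    /* Undefined Instruction exception taken to Hyp mode */
} hcptr_outcome;

/* Hypothetical helper: outcome of a Non-secure SIMD or floating-point
 * instruction for the given HCPTR control values. */
static inline hcptr_outcome hcptr_outcome_for(bool tcp, bool tase,
                                              simd_fp_class cls, bool from_el2)
{
    /* The TCP trap covers every access. The TASE trap is relevant only when
     * the TCP controls are 0, and only for Advanced SIMD instructions that
     * are not also floating-point instructions. */
    bool trapped = tcp || (tase && cls == INSN_ADVSIMD_ONLY);

    if (!trapped)
        return HCPTR_NO_TRAP;
    return from_el2 ? HCPTR_UNDEF_TO_HYP : HCPTR_HYP_TRAP;
}
```

In this model a floating-point instruction is unaffected by TASE alone, while the same instruction executed at EL2 with the TCP controls set produces an Undefined Instruction exception taken to Hyp mode rather than a Hyp Trap exception.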


Traps to Hyp mode of Non-secure EL1 accesses to the CPACR 


HCPTR.TCPAC traps Non-secure EL1 accesses to the CPACR to Hyp mode: 
1 Non-secure EL1 accesses to the CPACR are trapped to Hyp mode. 
0 This control has no effect on Non-secure EL1 accesses to the CPACR. 


Table G1-59 shows how the exceptions are reported in HSR: 


Table G1-59 Register accesses trapped to Hyp mode when HCPTR.TCPAC is 1 





Traps from Register Syndrome reporting in HSR 





Non-secure EL1 CPACR Trapped MCR or MRC access to System register with coproc==0b1111, using EC value 0x03 

Note
• The CPACR is not accessible at EL0.
• In ARMv7 and earlier versions of the ARM architecture, one use of the CPACR is to identify what
  coprocessor, or conceptual coprocessor, functionality is implemented. Legacy software might use this
  identification mechanism. A hypervisor can use this trap to emulate this mechanism. See Background to the
  System register interface on page G1-3879 for more information about this functionality.





Traps to Hyp mode of Non-secure System register accesses to trace registers 


If implemented, the HCPTR.TTA control traps System register accesses to the trace registers from Non-secure state 
to Hyp mode, as follows: 


1       Non-secure System register accesses to the trace registers are trapped to Hyp mode. Trapped
        instructions generate:
        • Hyp Trap exceptions, if the exception is taken from Non-secure EL0 or EL1.
        • Undefined Instruction exceptions taken to Hyp mode, if the exception is taken from EL2.
0       This control has no effect on Non-secure System register accesses to the trace registers.


If the HCPTR.TTA control is not implemented, then HCPTR.TTA is RAO/WI. See the register description for more
information.

Note 
• System register accesses to the trace registers use the System register (coproc==0b1110) encoding space.
• The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A architecture is
  implemented with an ETMv4 implementation, EL0 accesses to the trace registers are UNDEFINED. A resulting
  Undefined Instruction exception is higher priority than an HCPTR.TTA Hyp Trap exception.
• EL2 does not provide traps on trace register accesses through the optional memory-mapped external debug
  interface.





System register accesses to the trace registers can have side-effects. When a System register access is trapped, no 
side-effects occur before the exception is taken, see Register access instructions on page G1-3886. 


Table G1-60 shows the registers for which accesses are trapped to Hyp mode when HCPTR.TTA is 1, and how the 
exceptions are reported in HSR. 


Table G1-60 Register accesses trapped to Hyp mode when HCPTR.TTA is 1

Traps from          Registers                    Syndrome reporting in HSR
------------------  ---------------------------  ---------------------------------------------------------------
Non-secure state    System register accesses     For accesses using:
                    to all implemented trace     • MCR or MRC instructions, trapped MCR or MRC access
                    registers                      (coproc==0b1110), using EC value 0x05.
                                                 • MCRR or MRRC instructions, trapped MCRR or MRRC access
                                                   (coproc==0b1110), using EC value 0x0C.





If EL3 is implemented and is using AArch32, and NSACR.NSTRCDIS is 1, then HCPTR.TTA behaves as 
RAO/WI, regardless of its actual value. This behavior applies, also, when the HCPTR.TTA control is not 
implemented. 

General trapping to Hyp mode of Non-secure EL0 and EL1 accesses to System registers
in the (coproc==0b1111) encoding space


HSTR.{T0-T3, T5-T13, T15} trap Non-secure EL0 and EL1 accesses, using MCR, MRC, MCRR, or MRRC instructions, to
the System registers in the (coproc==0b1111) encoding space, by:
• The value of the CRn argument to the instruction, for MCR and MRC instructions.
• The value of the CRm argument to the instruction, for MCRR and MRRC instructions.

This applies for the set of CRn, or CRm, values {c0-c3, c5-c13, c15}.

When an HSTR.Tn trap control is:

1       Non-secure EL1 accesses to the corresponding System registers in the (coproc==0b1111) encoding
        space are trapped to Hyp mode.

        EL0 accesses to the corresponding System registers are trapped to Hyp mode if they would not be
        UNDEFINED if the bit was zero.

0       This control has no effect on Non-secure EL0 or EL1 accesses to System registers.


Note 


This means that a Hyp Trap exception taken from EL1 to EL2, generated because of a configuration setting in 
HSTR.Tn, is a higher priority exception than an Undefined Instruction exception generated because either the 
System register encoding is unallocated or because a register is never accessible at Non-secure EL1. As 
Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816 shows, this is an 
exception to the general exception prioritization rules, that prioritize most Undefined Instruction exceptions taken 
to Undefined mode above traps to EL2. This prioritization includes any access from Non-secure EL1 to a 
register that is only accessible in Secure state. So, for example, an access to the SCR from Non-secure EL1: 





— When the value of HSTR.T1 is 0, generates an Undefined Instruction exception. 
— When the value of HSTR.T1 is 1, generates a Hyp Trap exception. 





Table G1-61 shows the accesses that are trapped, and how the exceptions are reported in HSR. 


Table G1-61 Accesses trapped to Hyp mode when an HSTR.Tn trap is enabled

Traps from            Trap control    Trapped accesses                             Syndrome reporting in HSR
--------------------  --------------  -------------------------------------------  ---------------------------------------------
Non-secure EL0 and    Tn              MCR and MRC instructions, with coproc set    Trapped MCR or MRC access (coproc==0b1111),
EL1 [a]                               to 0b1111 and CRn set to n                   using EC value 0x03

                                      MCRR and MRRC instructions, with coproc      Trapped MCRR or MRRC access
                                      set to 0b1111 and CRm set to n               (coproc==0b1111), using EC value 0x04

a. As described in this section, traps from EL1 apply whenever the value of HSTR.Tn is 1. Traps from EL0 apply only if
   the value of HSTR.Tn is 1 and the access would not be UNDEFINED if the value of HSTR.Tn was 0.


For example, when HSTR.T7 is 1, considering only accesses from Non-secure EL1: 


• Any 32-bit access from a Non-secure PL1 mode using an MRC or MCR instruction with coproc set to 0b1111 and
  CRn set to c7, is trapped to Hyp mode.
• Any 64-bit access from a Non-secure PL1 mode using an MRRC or MCRR instruction with coproc set to 0b1111
  and CRm set to c7, is trapped to Hyp mode.
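
The selection of the HSTR.Tn control by CRn or CRm can be sketched in a few lines. This is an illustrative model with a hypothetical function name. It assumes that HSTR.Tn occupies bit n of the HSTR, which should be checked against the HSTR register description, and it covers only accesses from Non-secure EL1, because traps from EL0 also depend on whether the access would otherwise be UNDEFINED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a Non-secure EL1 access with coproc==0b1111
 * is trapped to Hyp mode by an HSTR.Tn control.
 * Assumption: HSTR.Tn is bit n of the HSTR value passed in. */
static inline bool hstr_traps_el1_access(uint32_t hstr, bool is_64bit,
                                         unsigned crn, unsigned crm)
{
    /* MCR and MRC select the control by CRn, MCRR and MRRC by CRm. */
    unsigned n = is_64bit ? crm : crn;

    /* There is no T4 or T14 control, and register numbers stop at c15. */
    if (n > 15 || n == 4 || n == 14)
        return false;

    return ((hstr >> n) & 1u) != 0;
}
```

With only HSTR.T7 set, the model reports both a 32-bit access with CRn set to c7 and a 64-bit access with CRm set to c7 as trapped, matching the example above.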

Note

• Bits[4,14] of the HSTR are reserved, RES0. Although the Generic Timer control registers are implemented in
  the coproc == 0b1111 encoding space with CRn==c14 for an MRC or MCR access, EL2 does not provide a trap
  on accesses to the Generic Timer System registers.
• An implementation might provide additional controls, in IMPLEMENTATION DEFINED registers, to provide
  finer-grained control of trapping of IMPLEMENTATION DEFINED features.





System registers in the (coproc==0b1111) encoding space with IMPLEMENTATION DEFINED access 
permission from EL0


For a System register in the (coproc==0b1111) encoding space, that is accessed using a CRn or CRm value that can 
be trapped by a HSTR.Tn control, if an access to the register from User mode is UNDEFINED when the value of the 
corresponding HSTR.Tn trap control is 0, then when that HSTR.Tn trap control is 1, it is IMPLEMENTATION DEFINED 
whether an access from Non-secure User mode generates: 


° A Hyp Trap exception. 


° An Undefined Instruction exception taken to Non-secure Undefined mode. 


Note 


ARM expects that trapping to Hyp mode of Non-secure User mode accesses to System registers in the
(coproc==0b1111) encoding space will be unusual, and used only when the hypervisor must virtualize User mode
operation. ARM recommends that, whenever possible, Non-secure User mode accesses to System registers in the
(coproc==0b1111) encoding space behave as they would if the processor did not implement EL2, generating an
Undefined Instruction exception taken to Non-secure Undefined mode if the architecture does not support the User
mode access.








Traps to Hyp mode of Non-secure System register accesses to debug registers 


HDCR.{TDRA, TDOSA, TDA} trap Non-secure System register accesses to debug registers to Hyp mode, as 
follows: 


° HDCR.{TDRA, TDA} trap Non-secure EL0 and EL1 accesses.
° HDCR.TDOSA traps Non-secure EL1 accesses. 





Note 


EL2 does not provide traps of debug register accesses through the optional memory-mapped external debug 
interface. 





System register accesses to the debug registers can have side-effects. When a System register access is trapped to 
Hyp mode, no side-effects occur before the exception is taken to Hyp mode. See Register access instructions on 
page G1-3886. 


Table G1-62 shows the subsections that list the accesses trapped. The subsections describe how the traps are
reported in HSR.


Table G1-62 Traps of Non-secure EL0 and EL1 accesses to debug registers


Trap control    Subsection

HDCR.TDRA       Trapping Non-secure System register accesses to Debug ROM registers on page G1-3910
HDCR.TDOSA      Trapping Non-secure System register accesses to powerdown debug registers on page G1-3910
HDCR.TDA        Trapping general Non-secure System register accesses to debug registers on page G1-3911


Note


System register accesses to debug registers use the (coproc==0b1110) encoding space. 








Trapping Non-secure System register accesses to Debug ROM registers 


HDCR.TDRA traps Non-secure EL0 and EL1 System register accesses to the Debug ROM registers to Hyp mode:


1 Non-secure EL0 or EL1 System register accesses to the Debug ROM registers are trapped to Hyp
mode.

0 This control has no effect on Non-secure EL0 and EL1 System register accesses to the Debug ROM
registers.


Table G1-63 shows the register accesses that are trapped, and how the exceptions are reported in HSR: 


Table G1-63 Register accesses trapped to Hyp mode when HDCR.TDRA is 1 





Traps from        Registers           Syndrome reporting in HSR

Non-secure EL0    DBGDRAR, DBGDSAR    For accesses using:
and EL1                               ° MCR or MRC instructions, trapped MCR or MRC access (coproc==0b1110), using EC
                                        value 0x05.
                                      ° MRRC instructions, trapped MRRC access (coproc==0b1110), using EC value 0x0C.





If HDCR.TDE or HCR.TGE is 1, behavior is as if HDCR.TDRA is 1 other than for the purpose of a direct read. 


Trapping Non-secure System register accesses to powerdown debug registers 
HDCR.TDOSA traps Non-secure EL1 System register accesses to the powerdown debug registers to Hyp mode: 


1 Non-secure EL1 System register accesses to the powerdown debug registers are trapped to Hyp 
mode. 


0 This control has no effect on Non-secure EL1 System register accesses to the powerdown debug 
registers. 


Table G1-64 shows the register accesses that are trapped, and how the exceptions are reported in HSR. 


Table G1-64 Register accesses trapped to Hyp mode when HDCR.TDOSA is 1 





Traps from        Registers                                               Syndrome reporting in HSR

Non-secure EL1    DBGOSLSR, DBGOSLAR, DBGOSDLR, DBGPRCR                   Trapped MCR or MRC access
                  Any IMPLEMENTATION DEFINED integration registers.       (coproc==0b1110), using EC value 0x05
                  Any IMPLEMENTATION DEFINED register with similar
                  functionality that the implementation specifies as
                  trapped by HDCR.TDOSA.








Note 


These registers are not accessible at EL0.





If HDCR.TDE or HCR.TGE is 1, behavior is as if HDCR.TDOSA is 1 other than for the purpose of a direct read.


Trapping general Non-secure System register accesses to debug registers


HDCR.TDA traps Non-secure EL0 and EL1 System register accesses to the debug registers that are not mentioned
in either of the following:


° Trapping Non-secure System register accesses to Debug ROM registers on page G1-3910.


° Trapping Non-secure System register accesses to powerdown debug registers on page G1-3910. 


This means that HDCR.TDA traps to Hyp mode Non-secure EL0 and EL1 System register accesses to all debug
registers except the following: 


° Non-secure System register accesses to DBGDRAR or DBGDSAR. The HDCR.TDRA trap traps these 
accesses. 


° Non-secure System register access to DBGOSLSR, DBGOSLAR, DBGOSDLR, or DBGPRCR. The 
HDCR.TDOSA trap traps these accesses. 


HDCR.TDA does not trap accesses to DBGDTRTXint or DBGDTRRXint when the PE is in Debug state. 


When HDCR.TDA is: 

1 Non-secure EL0 or EL1 System register accesses to any of the registers shown in Table G1-65 are
trapped to Hyp mode.

0 This control has no effect on Non-secure EL0 or EL1 System register accesses.


Table G1-65 shows how the exceptions are reported in HSR. 


Table G1-65 Accesses trapped to Hyp mode when HDCR.TDA is 1 





Traps from        Trapped accesses                                   Syndrome reporting in HSR

Non-secure EL0    Accesses to the DBGDIDR, DBGDSCRint, DBGDCCINT,    For accesses using MCR or MRC instructions, trapped MCR
and EL1           DBGDTRRXint, DBGDTRTXint, DBGWFAR, DBGVCR,         or MRC access (coproc==0b1110), using EC value 0x05
                  DBGDSCRext, DBGDTRTXext, DBGDTRRXext,
                  DBGBVR<n>, DBGBCR<n>, DBGBXVR<n>, DBGWCR<n>,
                  DBGWVR<n>, DBGCLAIMSET, DBGCLAIMCLR,
                  DBGAUTHSTATUS, DBGDEVID, DBGDEVID1, DBGDEVID2,
                  and DBGOSECCR

                  STC accesses to DBGDTRRXint.                       Trapped LDC or STC access, using EC value 0x06
                  LDC accesses to DBGDTRTXint.





If HDCR.TDE or HCR.TGE is 1, behavior is as if HDCR.TDA is 1 other than for the purpose of a direct read. 
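

The division of the debug registers between HDCR.{TDRA, TDOSA, TDA} can be sketched as follows. This is a minimal Python model under stated assumptions, not architectural pseudocode: the function names and the dictionary representation of HDCR are hypothetical, the register name is assumed to identify a debug System register, and the sketch models neither IMPLEMENTATION DEFINED registers nor the Debug state exception for DBGDTRTXint and DBGDTRRXint.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
DEBUG_ROM_REGS = {"DBGDRAR", "DBGDSAR"}
POWERDOWN_REGS = {"DBGOSLSR", "DBGOSLAR", "DBGOSDLR", "DBGPRCR"}


def controlling_field(reg):
    """Return the HDCR field that governs traps of accesses to reg."""
    if reg in DEBUG_ROM_REGS:
        return "TDRA"
    if reg in POWERDOWN_REGS:
        return "TDOSA"
    return "TDA"  # all other debug registers


def debug_access_trapped(reg, el, hdcr, tge=0):
    """Return True if a Non-secure access at el (0 or 1) is trapped to Hyp mode.

    hdcr is a dict of field values, for example {"TDA": 1, "TDE": 0}.
    """
    field = controlling_field(reg)
    if field == "TDOSA" and el == 0:
        # The powerdown debug registers are not accessible at EL0,
        # so HDCR.TDOSA traps Non-secure EL1 accesses only.
        return False
    # If HDCR.TDE or HCR.TGE is 1, behavior is as if the control is 1.
    forced = hdcr.get("TDE", 0) == 1 or tge == 1
    return forced or hdcr.get(field, 0) == 1
```

For example, debug_access_trapped("DBGOSLAR", 1, {"TDOSA": 0, "TDE": 1}) is True because HDCR.TDE makes the access behave as if HDCR.TDOSA is 1.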


Traps to Hyp mode of Non-secure EL0 and EL1 accesses to the Generic Timer registers


CNTHCTL.{PL1PCEN, PL1PCTEN} trap Non-secure EL0 and EL1 accesses to the Generic Timer registers to Hyp
mode, as follows:


° CNTHCTL.PL1PCEN traps Non-secure EL0 and EL1 accesses to the physical timer registers.
° CNTHCTL.PL1PCTEN traps Non-secure EL0 and EL1 accesses to the physical counter register.


For each of these controls:


1 This control has no effect on Non-secure EL0 and EL1 accesses to the registers shown in
Table G1-66 on page G1-3912.

0 Non-secure EL0 and EL1 accesses are trapped to Hyp mode.
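

Unlike most trap controls in this section, the CNTHCTL controls trap when the field is 0, not 1. The following Python sketch illustrates that polarity. The function name cnthctl_traps and its parameters are hypothetical, the field values are passed in directly so that no bit positions are assumed, and other controls that can affect EL0 timer accesses are not modeled.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
PHYSICAL_TIMER_REGS = {"CNTP_CTL", "CNTP_CVAL", "CNTP_TVAL"}
PHYSICAL_COUNTER_REGS = {"CNTPCT"}


def cnthctl_traps(pl1pcen, pl1pcten, register):
    """Return True if a Non-secure EL0 or EL1 access to register is
    trapped to Hyp mode by the CNTHCTL controls.

    Note the polarity: a field value of 0 enables the trap.
    """
    if register in PHYSICAL_TIMER_REGS:
        return pl1pcen == 0
    if register in PHYSICAL_COUNTER_REGS:
        return pl1pcten == 0
    return False  # not governed by these two controls
```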
Table G1-66 shows the registers for which accesses are trapped, and how the exceptions are reported in HSR.


Table G1-66 Register accesses trapped to Hyp mode by CNTHCTL trap controls


Traps from        Trap control   Registers               Syndrome reporting in HSR

Non-secure EL0    PL1PCEN        CNTP_CTL, CNTP_CVAL,    For accesses using:
and EL1                          CNTP_TVAL               ° MCR or MRC instructions, trapped MCR or MRC access
                                                           (coproc==0b1111), using EC value 0x03
                                                         ° MCRR or MRRC instructions, trapped MCRR or MRRC access
                                                           (coproc==0b1111), using EC value 0x04

                  PL1PCTEN       CNTPCT                  Trapped MCRR or MRRC access (coproc==0b1111), using EC value 0x04





Traps to Hyp mode of Non-secure EL0 and EL1 accesses to Performance Monitors
registers


If the Performance Monitors Extension is implemented, HDCR.{TPM, TPMCR} trap Non-secure EL0 and EL1
accesses to the Performance Monitors registers to Hyp mode:


HDCR.TPM:

1 Non-secure EL0 and EL1 accesses to all Performance Monitors registers are trapped to
Hyp mode.

0 This control has no effect on Non-secure EL0 and EL1 accesses to the Performance
Monitors registers.


HDCR.TPMCR:

1 Non-secure EL0 and EL1 accesses to the Performance Monitors Control Register are
trapped to Hyp mode.

Note
The conditions for this trap are identical to those for the trap controlled by HDCR.TPM.

0 This control has no effect on Non-secure EL0 and EL1 accesses to the Performance
Monitors Control Register.


Note

° EL2 does not provide traps on Performance Monitor register accesses through the optional memory-mapped
external debug interface.

° If the Performance Monitors Extension is not implemented, HDCR.{TPM, TPMCR} are RES0.


Table G1-67 shows the registers for which accesses are trapped, and how the exceptions are reported in HSR.


Table G1-67 Register accesses trapped to Hyp mode when HDCR.{TPM, TPMCR} are 1 








Traps from        Trap control   Registers                                  Syndrome reporting in HSR

Non-secure EL0    TPM            PMCR, PMCNTENSET, PMCNTENCLR, PMOVSR,      For accesses using:
and EL1                          PMSWINC, PMSELR, PMCEID0, PMCEID1,         ° MCR or MRC instructions, trapped MCR or MRC
                                 PMCCNTR, PMXEVTYPER, PMXEVCNTR,              access (coproc==0b1111), using EC value 0x03.
                                 PMUSERENR, PMINTENSET, PMINTENCLR,         ° MCRR or MRRC instructions, trapped MCRR or
                                 PMOVSSET, PMEVCNTR<n>, PMEVTYPER<n>,         MRRC access (coproc==0b1111), using EC
                                 PMCCFILTR                                    value 0x04.

                  TPMCR          PMCR                                       Trapped MCR or MRC access (coproc==0b1111),
                                                                            using EC value 0x03





Note 


HDCR.HPMN affects whether a counter can be accessed from Non-secure EL1 or EL0. See the register description
of HDCR for more information. 














































G1.20.4 EL3 configurable controls 
Table G1-68 shows the System registers that contain these controls.


Table G1-68 System registers that contain instruction enables and disables, and trap controls


Register name    Register description

SCR              Secure Configuration Register
NSACR            Non-secure Access Control Register


Table G1-69 summarizes the controls.


Table G1-69 EL3 Instruction enables and disables, and trap controls


Control               Type of control (a)    Description

SCR.{TWE, TWI}        T                      Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than
                                             Monitor mode on page G1-3915
SCR.HCE               E                      Enabling EL2 and Non-secure EL1 execution of HVC instructions on page G1-3916
SCR.SCD               D                      Disabling SMC instructions on page G1-3916
NSACR.NSTRCDIS        D                      Disabling Non-secure System register access to the trace registers on page G1-3917
NSACR.{cp11, cp10}    E                      Enabling Non-secure access to SIMD and floating-point functionality on page G1-3917
NSACR.NSASEDIS        D                      Disabling Non-secure access to Advanced SIMD functionality on page G1-3918





a. See Table G1-70. 


Table G1-70 Control types, for AArch32 EL3 controls 





Abbreviation    Type       See

D               Disable    Instruction enables and instruction disables on page G1-3885
E               Enable     Instruction enables and instruction disables on page G1-3885
T               Trap       Trap controls on page G1-3885





Also see the following: 

° Register access instructions on page G1-3886. 

° Instructions that fail their condition code check. 

° Trapping to EL3 of instructions that are UNPREDICTABLE on page G1-3915. 


Instructions that fail their condition code check 


For UNDEFINED instructions that fail their condition code check, see Conditional execution of undefined instructions 
on page G1-3851. 


For an instruction that has a Monitor trap set, that fails its condition code check: 


° Unless the trap description states otherwise, it is IMPLEMENTATION DEFINED whether the instruction: 
— Generates a Monitor Trap exception. 


— Executes as a NOP. 
Any implementation must be consistent in its handling of instructions that fail their condition code check. This
means that:

° Whenever a Monitor trap is set on such an instruction it must either:
— Always generate a Monitor trap exception.
— Always treat the instruction as a NOP.

° The IMPLEMENTATION DEFINED part of the requirements of Conditional execution of undefined instructions
on page G1-3851 must be consistent with the handling of Monitor traps on instructions that fail their
condition code check. Table G1-71 shows this:


Table G1-71 Consistent handling of instructions that fail their condition code check 





Behavior of conditional UNDEFINED instruction (a)    Monitor trap on instruction that fails its condition code check (b)

Executes as a NOP                                    Executes as a NOP

Generates an Undefined Instruction exception         Generates a Monitor trap exception


a. As defined in Conditional execution of undefined instructions on page G1-3851. In Non-secure EL0 and EL1 modes, this applies only if no
   Monitor trap is set for the instruction, otherwise see the behavior in the other column of the table.

b. For a trapped instruction executed in a Non-secure EL1 or EL0 mode.





Note 


When SCR.{TWE, TWI} is set so that conditional WFE and WFI instructions are trapped to Monitor mode, the
attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction passes its condition code
check. See Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than Monitor mode.





Trapping to EL3 of instructions that are UNPREDICTABLE 


For an instruction that is UNPREDICTABLE, when the instruction is disabled or trapped then it is CONSTRAINED 
UNPREDICTABLE whether execution of the instruction generates a Monitor Trap exception. 


Note 


UNPREDICTABLE and CONSTRAINED UNPREDICTABLE behavior must not perform any function that cannot be 
performed at the current or lower Exception level using instructions that are not UNPREDICTABLE and are not 
CONSTRAINED UNPREDICTABLE. This means that disabling or trapping an instruction changes the set of instructions 
that might be executed in modes other than Monitor mode. This affects, indirectly, the permitted behavior of 
UNPREDICTABLE and CONSTRAINED UNPREDICTABLE instructions. 








If no instructions are trapped, the attempted execution of an UNPREDICTABLE instruction in a mode other than 
Monitor mode must not generate a Monitor Trap exception. 


Traps to Monitor mode of the execution of WFE and WFI instructions in modes other than 
Monitor mode 


SCR.{TWE, TWI} trap WFE and WFI instructions to Monitor mode:


SCR.TWE:

1 Any attempt to execute a WFE instruction in any mode other than Monitor mode is
trapped to Monitor mode, if the instruction would otherwise have caused the PE to enter
a low-power state.

0 This control has no effect on the execution of WFE instructions.


SCR.TWI:

1 Any attempt to execute a WFI instruction in any mode other than Monitor mode is
trapped to Monitor mode, if the instruction would otherwise have caused the PE to enter
a low-power state.

0 This control has no effect on the execution of WFI instructions.


For PL0 and PL1, these traps apply to WFE and WFI instruction execution in both Security states.


The attempted execution of a conditional WFE or WFI instruction is only trapped if the instruction passes its condition 
code check. 
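

The conditions above combine as a simple conjunction, which the following Python sketch illustrates. It is a model under stated assumptions, not architectural pseudocode: the function name and parameters are hypothetical, whether the instruction would otherwise enter a low-power state is supplied by the caller, and higher-priority traps of WFE and WFI to other modes are not modeled.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
def wfx_trapped_to_monitor(instr, scr_twe, scr_twi, in_monitor_mode,
                           condition_passed, would_enter_low_power):
    """Return True if executing instr ("WFE" or "WFI") is trapped to
    Monitor mode by SCR.{TWE, TWI}."""
    if instr not in ("WFE", "WFI"):
        raise ValueError("instr must be 'WFE' or 'WFI'")
    enable = scr_twe if instr == "WFE" else scr_twi
    return bool(
        enable == 1
        and not in_monitor_mode        # traps apply outside Monitor mode only
        and condition_passed           # a failed condition check is never trapped
        and would_enter_low_power      # otherwise the instruction just completes
    )
```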


Note 


Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or WFI are not guaranteed
to be taken, even if the WFE or WFI is executed when there is no Wakeup event. The only guarantee is that if the 
instruction does not complete in finite time in the absence of a Wakeup event, the trap will be taken. 








For more information about these instructions, and when they can cause the PE to enter a low-power state, see: 
° Wait For Event and Send Event on page G1-3872. 
° Wait For Interrupt on page G1-3875. 


Enabling EL2 and Non-secure EL1 execution of HVC instructions 


SCR.HCE enables EL2 and Non-secure EL1 execution of HVC instructions: 


1 HVC instruction execution is enabled at EL2 and Non-secure EL1. 
0 HVC instructions are: 
° UNDEFINED at Non-secure EL1. The Undefined Instruction exception is taken to Undefined 
mode. 
° CONSTRAINED UNPREDICTABLE at EL2. The behavior must be one of the following: 


— The instruction is UNDEFINED. 


— The instruction executes as a NOP. 





Note 
° If EL2 is not implemented, SCR.HCE is RES0 and HVC is UNDEFINED.
° HVC instructions are always UNDEFINED at EL0 and in Secure state.
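

The rules for HVC execution in this subsection can be summarized as a lookup. The following Python sketch is illustrative only: the function name hvc_behavior, its parameters, and the returned strings are hypothetical labels, and controls described outside this subsection are not modeled.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
def hvc_behavior(el, secure, scr_hce, el2_implemented=True):
    """Return a label for how an HVC instruction behaves at Exception
    level el (0, 1 or 2) in the given Security state."""
    # HVC is always UNDEFINED at EL0 and in Secure state, and is
    # UNDEFINED if EL2 is not implemented (SCR.HCE is then RES0).
    if el == 0 or secure or not el2_implemented:
        return "UNDEFINED"
    if scr_hce == 1:
        return "ENABLED"  # at EL2 and Non-secure EL1
    if el == 1:
        return "UNDEFINED"  # exception taken to Undefined mode
    # SCR.HCE == 0 at EL2: either UNDEFINED or executes as a NOP.
    return "CONSTRAINED UNPREDICTABLE"
```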





Disabling SMC instructions 
SCR.SCD disables SMC instructions:


1 In Non-secure state:
SMC instructions are UNDEFINED. The Undefined Instruction exception is taken from the
current Exception level to the current Exception level.

In Secure state:
Behavior is one of the following:
° The instruction is UNDEFINED.
° The instruction executes as a NOP.

0 SMC instructions are enabled.


Note

° SMC instructions are always UNDEFINED at EL0.

° When the value of HCR.TSC is 1, any attempted execution of an SMC instruction at Non-secure EL1 is trapped
to EL2, regardless of the value of SCR.SCD, see Traps to Hyp mode of Non-secure EL1 execution of SMC
instructions on page G1-3901. As Synchronous exception prioritization for exceptions taken to AArch32 state
on page G1-3816 shows, this is an exception to the general exception prioritization rules, that prioritize most
Undefined Instruction exceptions taken to Undefined mode above traps to a higher Exception level.
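

The ordering described in the Note, where the HCR.TSC trap is evaluated before the SCR.SCD disable, can be sketched as follows. This Python model is illustrative only: the function name smc_behavior, its parameters, and the returned strings are hypothetical labels, not architectural pseudocode.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
def smc_behavior(el, secure, scr_scd, hcr_tsc=0):
    """Return a label for how an SMC instruction behaves."""
    if el == 0:
        return "UNDEFINED"  # always UNDEFINED at EL0
    if not secure and el == 1 and hcr_tsc == 1:
        # Trapped to EL2 regardless of SCR.SCD. This takes priority
        # over the Undefined Instruction exception.
        return "TRAP_TO_EL2"
    if scr_scd == 1:
        # In Secure state the behavior is one of UNDEFINED or NOP.
        return "UNDEFINED_OR_NOP" if secure else "UNDEFINED"
    return "ENABLED"
```

For example, smc_behavior(1, False, scr_scd=1, hcr_tsc=1) returns "TRAP_TO_EL2", even though SCR.SCD is 1.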





Disabling Non-secure System register access to the trace registers 
NSACR.NSTRCDIS disables Non-secure System register accesses to the trace registers, from all Privilege levels: 


1 Non-secure state accesses are disabled. Secure state accesses are enabled. If the PE is in Non-secure 
state: 
° CPACR.TRCDIS behaves as RAO/WI, regardless of its actual value. See Traps to Undefined
mode of PL0 and PL1 System register accesses to trace registers on page G1-3889.
This behavior applies even if the CPACR.TRCDIS control is not implemented. See the 
referenced section for more information. 


° HCPTR.TTA behaves as RAO/WI, regardless of its actual value. See Traps to Hyp mode of 
Non-secure System register accesses to trace registers on page G1-3907. 


0 There is no effect on accesses to CPACR.TRCDIS and HCPTR.TTA. 





Note 


° System register accesses to the trace registers use the (coproc==0b1110) encoding space.


° NSACR.NSTRCDIS might be implemented as RAZ/WI. See the NSACR register description for more 
information. 


° The ETMv4 architecture does not permit EL0 to access the trace registers. If the ARMv8-A architecture is
implemented with an ETMv4 implementation, EL0 accesses to the trace registers are UNDEFINED.


° EL3 does not provide Non-secure access controls on trace register accesses through the optional 
memory-mapped external debug interface. 





Enabling Non-secure access to SIMD and floating-point functionality 


NSACR.{cp11, cp10} enable Non-secure access to the SIMD and floating-point registers, from all Privilege levels:


0b11 All accesses, from both Security states, are enabled.
0b00 Non-secure state accesses are disabled. Secure state accesses are enabled. If the PE is in Non-secure
state:


° CPACR.{cp11, cp10} behave as RAZ/WI. See Enabling PL0 and PL1 accesses to the SIMD
and floating-point registers on page G1-3890.


° HCPTR.{TCP11, TCP10} behave as RAO/WI. See General trapping to Hyp mode of
Non-secure accesses to the SIMD and floating-point registers on page G1-3905.


Note 
Software must set NSACR.cp11 and NSACR.cp10 to the same value. 








For more information about SIMD and floating-point support, see Advanced SIMD and floating-point support on
page G1-3880.


Disabling Non-secure access to Advanced SIMD functionality


NSACR.NSASEDIS disables Non-secure accesses to the Advanced SIMD functionality, from all Privilege levels:


1 Non-secure state accesses are disabled. Secure accesses are enabled. If the PE is in Non-secure state:


° CPACR.ASEDIS behaves as RAO/WI. See Disabling PL0 and PL1 execution of Advanced
SIMD instructions on page G1-3891.


° HCPTR.TASE behaves as RAO/WI. See Traps to Hyp mode of Non-secure accesses to 
Advanced SIMD functionality on page G1-3906. 


These behaviors apply even if one or both of the CPACR.ASEDIS and HCPTR.TASE controls is 
not implemented. See the referenced sections for more information. 


0 There is no effect on CPACR.ASEDIS and HCPTR.TASE. 
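

The effect of NSACR.NSASEDIS on the lower-level controls can be sketched as a function that returns the values those controls behave as having. This Python model is illustrative only: the function name and parameters are hypothetical, and it models only the RAO/WI behavior described above, not the resulting instruction traps.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
def effective_asedis_controls(nsacr_nsasedis, secure, cpacr_asedis, hcptr_tase):
    """Return (CPACR.ASEDIS, HCPTR.TASE) as they behave in the current
    Security state, given the stored field values."""
    if nsacr_nsasedis == 1 and not secure:
        # In Non-secure state both controls behave as RAO/WI,
        # regardless of the values actually held.
        return (1, 1)
    # Otherwise NSACR.NSASEDIS has no effect on either control.
    return (cpacr_asedis, hcptr_tase)
```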
Pseudocode description of configurable instruction enables, disables, and traps


The pseudocode function AArch32.CheckITEnabled() checks whether the T32 IT instruction is enabled.
The pseudocode function AArch32.CheckSETENDEnabled() checks whether the SETEND instruction is disabled.
The pseudocode function AArch32.CheckForSMCTrap() checks for traps on an SMC instruction.


The AArch32.CheckForWFxTrap() pseudocode function checks for traps on WFE and WFI instructions.


Pseudocode description of enabling SIMD and floating-point functionality 


The AArch32.CheckAdvSIMDOrFPEnabled() and AArch32.CheckFPAdvSIMDTrap() pseudocode functions take appropriate 
action if an SIMD or floating-point instruction is used when the SIMD and floating-point functionality is not 
enabled or is trapped. 


The CheckAdvSIMDOrVFPEnabled(), CheckAdvSIMDEnabled(), and CheckVFPEnabled() wrapper functions support the 
AArch32.CheckAdvSIMDOrFPEnabled() and AArch32.CheckFPAdvSIMDTrap() functions. 


The AArch32.CheckAdvSIMDOrFPEnabled(), AArch32.CheckFPAdvSIMDTrap(), CheckAdvSIMDOrVFPEnabled(), 
CheckAdvSIMDEnabled(), and CheckVFPEnabled() functions are described in Chapter J1 ARMv8 Pseudocode. 
Chapter G2
AArch32 Self-hosted Debug 


When the PE is using self-hosted debug, it generates debug exceptions. This chapter describes the AArch32 
self-hosted debug exception model. It is organized as follows: 
Introductory information 

° About self-hosted debug on page G2-3922. 

° The debug exception enable controls on page G2-3926. 


The debug Exception model 
° Routing debug exceptions on page G2-3927.


° Enabling debug exceptions from the current Privilege level and Security state on
page G2-3929.


° The effect of powerdown on debug exceptions on page G2-3931.
° Summary of permitted routing and enabling of debug exceptions on page G2-3932.
° Pseudocode description of debug exceptions on page G2-3934.


The debug exceptions 
° Breakpoint Instruction exceptions on page G2-3935. 
° Breakpoint exceptions on page G2-3938. 
° Watchpoint exceptions on page G2-3961. 
° Vector Catch exceptions on page G2-3975. 


Synchronization requirements 


The behavior of self-hosted debug after changes to System registers, or after changes to the 
authentication interface, but before a Context synchronization event guarantees the effects of the 
changes: 


° Synchronization and debug exceptions on page G2-3983. 
G2.1 About self-hosted debug


Self-hosted debug supports debugging through the generation and handling of debug exceptions, that are taken using 
the exception model described in: 


° Chapter D1 The AArch64 System Level Programmers’ Model, if the exception is taken to AArch64 state. 
° Chapter G1 The AArch32 System Level Programmers’ Model, if the exception is taken to AArch32 state. 


This section introduces some terms used in describing self-hosted debug, and then introduces the debug exceptions. 
See: 


° Definition of a debugger in the context of self-hosted debug. 
° Context ID and Process ID. 


G2.1.1 Definition of a debugger in the context of self-hosted debug


Within this chapter, debugger means that part of an operating system, or higher level of system software, that 
handles debug exceptions and programs the debug System registers. An operating system with rich application 
environments might provide debug services that support a debugger user interface executing at ELO. From the 
architectural perspective, the debug services are the debugger. 


G2.1.2 Context ID and Process ID


In AArch32 state, the CONTEXTIDR identifies the current Context ID, that is used by:

° The debug logic, for breakpoint and watchpoint matching.

° Implemented trace logic, to identify the current process.


When using the Long-descriptor translation table format, the CONTEXTIDR has a single field, PROCID, that is
defined as the Process Identifier (Process ID). Therefore, in AArch32 state, the Context ID and Process ID are
identical when using this translation table format.


When using the Short-descriptor translation table format:

° CONTEXTIDR[31:0] defines the Context ID, that is used for breakpoint and watchpoint matching.

° CONTEXTIDR[31:8] defines the Process ID.

° CONTEXTIDR[7:0] defines the ASID. See Global and process-specific translation table entries on
page G4-4089. This means that, when using the Short-descriptor translation table format, the ASID is always
bits[7:0] of the Context ID.
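

The field layout described above can be sketched as a pair of decoding helpers. This Python code is illustrative only: the function names and the dictionary keys are hypothetical, and the Long-descriptor helper assumes that the single PROCID field spans the whole 32-bit register.

```python
# Illustrative model only; names are hypothetical, not ARM pseudocode.
def decode_contextidr_short(contextidr):
    """Decode CONTEXTIDR when using the Short-descriptor format."""
    value = contextidr & 0xFFFFFFFF
    return {
        "context_id": value,         # CONTEXTIDR[31:0]
        "process_id": value >> 8,    # CONTEXTIDR[31:8]
        "asid": value & 0xFF,        # CONTEXTIDR[7:0]
    }


def decode_contextidr_long(contextidr):
    """Decode CONTEXTIDR when using the Long-descriptor format, where
    the Context ID and the Process ID are identical."""
    value = contextidr & 0xFFFFFFFF  # assumed: PROCID is bits[31:0]
    return {"context_id": value, "process_id": value}
```

For example, decoding 0x12345678 in the Short-descriptor format gives a Process ID of 0x123456 and an ASID of 0x78, and the ASID is bits[7:0] of the Context ID.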


G2.1.3 About debug exceptions


Debug exceptions occur during normal program flow if a debugger has programmed the PE to generate them. For 
example, a software developer might use a debugger contained in an operating system to debug an application. To 
do this, the debugger might enable one or more debug exceptions. The debug exceptions that can be generated in 
an AArch32 stage 1 translation regime are: 


° Breakpoint Instruction exceptions on page G2-3923. 

° Breakpoint exceptions on page G2-3923, generated by hardware breakpoints. 
° Watchpoint exceptions on page G2-3924, generated by hardware watchpoints. 
° Vector Catch exceptions on page G2-3924. 


Note 


In addition, Software Step exceptions can be generated in stage 1 of an AArch32 translation regime. However, these 
are always taken to AArch64 state. Software Step exceptions on page D2-1628 describes this.








The PE can only generate a particular debug exception when both: 


1. Debug exceptions are enabled from the current Exception level and Security state.

See Enabling debug exceptions from the current Privilege level and Security state on page G2-3929.
Breakpoint Instruction exceptions are always enabled from the current Exception level and Security state.

2. A debugger has enabled that particular debug exception.

All of the debug exceptions except for Breakpoint Instruction exceptions have an enable control contained in
the DBGDSCRext. See The debug exception enable controls on page G2-3926.


Note 


If halting is allowed and EDSCR.HDE is 1, hardware breakpoints and watchpoints cause entry to Debug state 
instead of causing debug exceptions. In Debug state, the PE is halted. 





For the definition of halting is allowed, see Halting allowed and halting prohibited on page H2-4845. 





When a debug exception is taken to an Exception level that is using AArch32: 
° If the debug exception is a Watchpoint exception, it is taken as a Data Abort exception.


° Otherwise, it is taken as a Prefetch Abort exception.


The following list summarizes each of the debug exceptions:


Breakpoint Instruction exceptions 


Breakpoint instructions generate these. Breakpoint instructions are instructions that software 
developers can use to cause exceptions at particular points in the program flow. 


The breakpoint instruction in the A32 and T32 instruction sets is BKPT #<immediate>. Whenever one 
of these is committed for execution, the PE takes a Breakpoint Instruction exception.


PE behavior


Breakpoint Instruction exceptions cannot be masked. The PE takes Breakpoint
Instruction exceptions regardless of both of the following:


° The current Privilege level and AArch32 mode.


° The current Security state.


For more information, see Breakpoint Instruction exceptions on page G2-3935. 


Breakpoint exceptions 


The ARMv8-A architecture provides 2-16 hardware breakpoints. These can be programmed to 
generate Breakpoint exceptions based on particular instruction addresses, or based on particular PE 
contexts, or both. 


For example, a software developer might program a hardware breakpoint to generate a Breakpoint 
exception whenever the instruction with address 0x1000 is committed for execution. 


The ARMv8-A architecture supports the following types of hardware breakpoint for use in stage 1
of an AArch32 translation regime:

° Address:
— Address Match.
— Address Mismatch.

Comparisons are made with the virtual address of each instruction in the program flow.

° Context:
— Context ID Match. Matches with the Context ID value held in the CONTEXTIDR.
— VMID Match. Matches with the VMID value held in the VTTBR.
— Context ID and VMID Match. Matches with both the Context ID and the VMID value.

An Address breakpoint can link to a Context breakpoint, so that the Address breakpoint only
generates a Breakpoint exception if the PE is in a particular context when the address match or
mismatch occurs.

A breakpoint generates a Breakpoint exception whenever an instruction that causes a match is
committed for execution.

PE behavior 


If halting is allowed and EDSCR.HDE is 1, hardware breakpoints cause entry to Debug 
state. That is, they halt the PE. See Chapter H2 Debug State. 


Otherwise: 

° If debug exceptions are enabled, hardware breakpoints cause Breakpoint
exceptions.

° If debug exceptions are disabled, hardware breakpoints are ignored.


For more information, see Breakpoint exceptions on page G2-3938. 


Watchpoint exceptions 


The ARMv8-A architecture provides 2-16 hardware watchpoints. These can be programmed to 
generate Watchpoint exceptions based on accesses to particular data addresses, or based on accesses 
to any address in a data address range. 


For example, a software developer might program a hardware watchpoint to generate a Watchpoint 
exception on an access to any address in the data address range 0x1000 - 0x101F. 


A hardware watchpoint can link to a hardware breakpoint if the hardware breakpoint is a Linked 
Context type. In this case, the watchpoint only generates a Watchpoint exception if the PE is in a
particular context when the data address match occurs. 


The smallest data address size that a watchpoint can be programmed to match on is a byte. A single 
watchpoint can be programmed to match on one or more bytes. 


A watchpoint generates a Watchpoint exception whenever an instruction that initiates an access that 
causes a match is committed for execution. 


PE behavior 


If halting is allowed and EDSCR.HDE is 1, hardware watchpoints cause entry to Debug 
state. That is, they halt the PE. See Chapter H2 Debug State. 


Otherwise: 

° If debug exceptions are enabled, hardware watchpoints cause Watchpoint
exceptions. 

° If debug exceptions are disabled, hardware watchpoints are ignored. 


For more information, see Watchpoint exceptions on page G2-3961. 


Vector Catch exceptions 


These are used to trap exceptions. The ARMv8-A architecture provides two forms of vector catch, 
address-matching and exception-trapping. Only one form can be implemented. 


Whichever form is implemented, a debugger must enable Vector Catch exceptions for one or more 
exception vectors by programming the DBGVCR. Generation of Vector Catch exceptions is then as 
follows: 


° For the address-matching form, a Vector Catch exception is generated whenever the virtual
address of an instruction matches a vector that Vector Catch exceptions are enabled for.


° For the exception-trapping form, a Vector Catch exception is generated as part of exception
entry for exception types that correspond to vectors that Vector Catch exceptions are enabled
for.

PE behavior 

If debug exceptions are: 
• Enabled, Vector Catch exceptions can be generated. 
• Disabled, vector catch is ignored. 


For more information, see Vector Catch exceptions on page G2-3975. 


Table G2-1 on page G2-3925 summarizes PE behavior and shows the location of the pseudocode for each of the 
debug exceptions. 
Table G2-1 PE behavior and pseudocode for each of the debug exceptions 

                         PE behavior if debug exceptions are:
Debug exception          Enabled                 Disabled               Pseudocode
Breakpoint Instruction   Takes Prefetch Abort    Takes Prefetch Abort   AArch32.SoftwareBreakpoint()
exception                exception               exception
Breakpoint exception     Takes Prefetch Abort    Ignored                See Pseudocode description of Breakpoint
                         exception                                      exceptions taken from AArch32 state on
                                                                        page G2-3959
Watchpoint exception     Takes Data Abort        Ignored                See Pseudocode description of Watchpoint
                         exception (a)                                  exceptions taken from AArch32 state on
                                                                        page G2-3974
Vector Catch exception   Takes Prefetch Abort    Ignored                See Pseudocode description of Vector Catch
                         exception                                      exceptions on page G2-3981

a. If halting is allowed and EDSCR.HDE is 1, hardware breakpoints and watchpoints cause the PE to enter Debug state instead of causing debug 
   exceptions. See Chapter H2 Debug State. 
G2.2 The debug exception enable controls 
The enable controls for each debug exception are as follows: 


Breakpoint Instruction exceptions 


None. Breakpoint Instruction exceptions are always enabled. 


Breakpoint exceptions 
DBGDSCRext.MDBGen, plus an enable control for each breakpoint, DBGBCR<n>.E. 


Watchpoint exceptions 
DBGDSCRext.MDBGen, plus an enable control for each watchpoint, DBGWCR<n>.E. 


Vector Catch exceptions 
DBGDSCRext.MDBGen. 


In addition, for all debug exceptions other than Breakpoint Instruction exceptions, software must configure the 
controls that enable debug exceptions from the current Exception level and Security state. See Enabling debug 
exceptions from the current Privilege level and Security state on page G2-3929. 


The PE cannot take a debug exception if debug exceptions are disabled from either the current Exception level or 
the current Security state. 


Breakpoint Instruction exceptions are always enabled from the current Exception level and Security state. 
G2.3 Routing debug exceptions 
Debug exceptions are usually routed to Abort mode. However, if EL2 is implemented, the following applies: 


• Breakpoint Instruction exceptions taken from Hyp mode are routed to Hyp mode. 

  All other debug exceptions are disabled from Hyp mode. 

• The routing of debug exceptions taken from Non-secure PL1 and PL0 depends on HDCR.TDE: 

  1 Debug exceptions taken from Non-secure PL1 and PL0 are routed to Hyp mode. 
  0 Debug exceptions taken from Non-secure PL1 and PL0 are routed to Abort mode. 
Note 





If HCR.TGE is 1, HDCR.TDE is treated as being 1 except for the purpose of a direct read of HDCR. 





Table G2-2 shows this. 


Table G2-2 The effect of the TGE and TDE control bits on debug exception routing 

HCR.TGE   HDCR.TDE   Debug exceptions taken from Non-secure PL1 and PL0 are taken to:
0         0          Non-secure Abort mode
0         1          Hyp mode
1         x          Hyp mode





Note 
If EL2 is not implemented, the PE behaves as if both HCR.TGE and HDCR.TDE are 0. 
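As an illustration of Table G2-2 and the accompanying Notes, the routing of debug exceptions taken from Non-secure PL1 and PL0 can be modelled as below. The function name and string results are hypothetical; this sketch is not the DebugTarget() pseudocode.

```python
def nonsecure_pl1_pl0_debug_target(el2_implemented: bool,
                                   hcr_tge: int, hdcr_tde: int) -> str:
    """Model the target mode for debug exceptions from Non-secure PL1/PL0."""
    if not el2_implemented:
        # Without EL2, the PE behaves as if both control bits are 0.
        hcr_tge, hdcr_tde = 0, 0
    # HCR.TGE == 1 makes HDCR.TDE behave as 1 for routing purposes.
    effective_tde = 1 if hcr_tge == 1 else hdcr_tde
    return "Hyp mode" if effective_tde == 1 else "Non-secure Abort mode"
```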








The following tables show the routing of debug exceptions: 


Table G2-3 Routing when both EL3 and EL2 are implemented 

               Target AArch32 mode when executing in:
               Non-secure:                                                     Secure state
HDCR.TDE (a)   PL0                     PL1                     PL2
0              Non-secure Abort mode   Non-secure Abort mode   Hyp mode (b)    Secure Abort mode
1              Hyp mode                Hyp mode                Hyp mode (b)    Secure Abort mode

a. If HCR.TGE is 1, this bit is treated as being 1 other than for a direct read of HDCR. 

b. Only applies to Breakpoint Instruction exceptions. All other debug exceptions are disabled. 


Table G2-4 Routing when EL3 is implemented and EL2 is not implemented 





Target AArch32 mode when executing in: 


Non-secure state Secure state 





Non-secure Abort mode Secure Abort mode 
Table G2-5 Routing when EL3 is not implemented and EL2 is implemented 

               Target AArch32 mode when executing in Non-secure:
HDCR.TDE (a)   PL0                     PL1                     PL2
0              Non-secure Abort mode   Non-secure Abort mode   Hyp mode (b)
1              Hyp mode                Hyp mode                Hyp mode (b)

a. If HCR.TGE is 1, this bit is treated as being 1 other than for a direct read of HDCR. 

b. Only applies to Breakpoint Instruction exceptions. All other debug exceptions are disabled. 


G2.3.1 Pseudocode description of routing debug exceptions 


DebugTarget() returns the current debug target Exception level. DebugTargetFrom() returns the debug target 
Exception level for the specified Security state. 
Enabling debug exceptions from the current Privilege level and Security state 


A debug exception can only be taken if all of the following are true: 

• The OS lock is unlocked. 
• DoubleLockStatus() == FALSE. 
• The debug exception is enabled from the current Privilege level. 
• The debug exception is enabled from the current Security state. 


Table G2-6 shows when debug exceptions are enabled from the current Privilege level. 


Table G2-6 Whether debug exceptions are enabled from the current Privilege level 

Current Privilege level   Breakpoint Instruction exceptions   All other debug exceptions
PL2                       Enabled                             Disabled
PL1                       Enabled                             Enabled
PL0                       Enabled                             Enabled








Table G2-7 shows when debug exceptions are enabled from the current Security state. 


Table G2-7 Whether debug exceptions are enabled from the current Security state 

Current Security state   Breakpoint Instruction exceptions   All other debug exceptions
Non-secure               Enabled                             Enabled from PL1 and PL0 only.
Secure                   Enabled                             Depends on SDCR.SPD and SDER.SUIDEN.
                                                             See Disabling debug exceptions from Secure state.








Disabling debug exceptions from Secure state 


If EL3 is implemented, software executing at EL3 can enable or disable all debug exceptions taken from Secure PL1 
other than Breakpoint Instruction exceptions, by using one of: 


° The Secure Privileged Debug field, SDCR.SPD, if EL3 is using AArch32. 
° The AArch32 Secure Privileged Debug field, MDCR_EL3.SPD32, if EL3 is using AArch64. 


If debug exceptions are disabled from Secure PL1, software executing at Secure PL1 can set the Secure User 
Invasive Debug Enable bit, SDER.SUIDEN, to 1 to enable all debug exceptions taken from Secure PL0 other than 
Breakpoint Instruction exceptions. 


Note 


Breakpoint Instruction exceptions are always enabled. 








The ARMv8-A architecture does not support disabling debug in Non-secure state. 


Note 


If the boot software that is executed when reset is deasserted programs SUIDEN and SPD so that all debug 
exceptions are disabled from Secure state, software operating at EL3 never has to switch any of the debug registers 
between the Security states. 
G2.4.2 Pseudocode description of enabling debug exceptions 


AArch64.GenerateDebugExceptions() determines whether debug exceptions are enabled from the current Exception 
level and Security state. AArch64.GenerateDebugExceptionsFrom() determines whether debug exceptions are enabled 
from the specified Exception level and Security state. 
G2.5 The effect of powerdown on debug exceptions 
Debug OS Save and Restore sequences on page H6-4951 describes the powerdown save routine and the restore 
routine. 
When executing either routine, software must use the OS Lock to disable generation of all of the following: 
• Breakpoint exceptions. 
• Watchpoint exceptions. 
• Vector Catch exceptions. 
This is because the generation of these exceptions depends on the state of the debug registers, and the state of the 
debug registers might be lost over these routines. 
Debug exceptions other than Breakpoint Instruction exceptions are enabled only if both the OS Lock is unlocked 
and DoubleLockStatus() == FALSE. 
Breakpoint Instruction exceptions are enabled regardless of the state of the OS Lock and the OS Double Lock. 
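The effect of the two locks can be summarised in a short sketch. The function and parameter names are hypothetical, and the sketch models only the lock conditions described in this section, not the Privilege level or Security state controls.

```python
def locks_permit_debug_exception(is_breakpoint_instruction: bool,
                                 os_lock_locked: bool,
                                 double_lock_status: bool) -> bool:
    """Model whether the OS Lock and OS Double Lock permit a debug exception."""
    # Breakpoint Instruction exceptions ignore both locks.
    if is_breakpoint_instruction:
        return True
    # All other debug exceptions need the OS Lock unlocked and
    # DoubleLockStatus() == FALSE.
    return (not os_lock_locked) and (not double_lock_status)
```

A powerdown save or restore routine that holds the OS Lock therefore suppresses Breakpoint, Watchpoint, and Vector Catch exceptions while the debug register state might be invalid.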
G2.6 Summary of permitted routing and enabling of debug exceptions 


Behavior is as follows: 


Breakpoint Instruction exceptions 


These are always enabled, regardless of the current Privilege level and Security state. Table G2-8 
shows the routing of these. In the table, n/a means not applicable. 


Table G2-8 Routing of Breakpoint Instruction exceptions 

Current Security                  Target when enabled from:
state              HDCR.TDE (a)   PL0                      PL1                     PL2
Secure             X              Secure Abort mode (b)    Secure Abort mode       n/a
Non-secure         0              Non-secure Abort mode    Non-secure Abort mode   Hyp mode
Non-secure         1              Hyp mode                 Hyp mode                Hyp mode

a. If EL2 is not implemented, behavior is as if the value of this bit is 0. Otherwise, if the value of HCR.TGE is 1, 
   HDCR.TDE is treated as being 1 other than for a direct read of HDCR. 

b. If EL3 is implemented and is using AArch32, Secure Abort mode is at EL3. Otherwise, Secure Abort mode is at 
   EL1. 


All other debug exceptions 


The enabling and permitted routing is controlled by all of the following: 

• SDCR.SPD. 
• SDER.SUIDEN. 
• HDCR.TDE. 
• The IMPLEMENTATION DEFINED authentication interface. 


Table G2-9 shows the valid combinations of the values of SDCR.SPD, SDER.SUIDEN, 
HDCR.TDE, and, in the Auth column, the input from the IMPLEMENTATION DEFINED authentication 
interface described by the pseudocode function 
AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled(). For each combination, the table 
shows where debug exceptions are enabled from and where they are taken to. 


In the table, n/a means not applicable and a dash, -, means that debug exceptions are disabled from 
that Exception level. 


Table G2-9 Breakpoint, Watchpoint, and Vector Catch exceptions 

Debug   Lock (a)   Current      SPD (b)   Auth (c)   SUIDEN   TDE (d)   Target AArch32 mode when enabled from:
state              Security                                             PL0                     PL1                     PL2
                   state
Yes     X          X            0bXX      X          X        X         -                       -                       -
No      TRUE       X            0bXX      X          X        X         -                       -                       -
No      FALSE      Secure       0b00      TRUE       0        X         -                       -                       n/a
No      FALSE      Secure       0b00      TRUE       1        X         Secure Abort mode (e)   -                       n/a
No      FALSE      Secure       0b00      FALSE      X        X         Secure Abort mode (e)   Secure Abort mode (e)   n/a
No      FALSE      Secure       0b10      X          0        X         -                       -                       n/a
No      FALSE      Secure       0b10      X          1        X         Secure Abort mode (e)   -                       n/a
No      FALSE      Secure       0b11      X          X        X         Secure Abort mode (e)   Secure Abort mode (e)   n/a
No      FALSE      Non-secure   0bXX      X          X        0         Non-secure Abort mode   Non-secure Abort mode   -
No      FALSE      Non-secure   0bXX      X          X        1         Hyp mode                Hyp mode                -

a. The value of (OSLSR_EL1.OSLK == '1' || DoubleLockStatus()). 
b. If EL3 is not implemented, behavior is as if this is 0b11. 
c. See the text that introduces this table on page G2-3932 for an explanation of the Auth column. An entry of TRUE indicates that the 
   authentication mechanism permits the debug exceptions to be taken to their default target PE mode. 
d. If HCR.TGE is 1, this bit is treated as being 1 other than for a direct read of HDCR. If EL2 is not implemented, behavior is as if TDE is 0. 
e. If EL3 is implemented and is using AArch32, Secure Abort mode is at EL3. Otherwise, Secure Abort mode is at EL1. 
G2.7 Pseudocode description of debug exceptions 


AArch32.DebugFault() returns a FaultRecord() that indicates that a memory access has generated a debug exception. 


The AArch32.Abort() function processes FaultRecord(), as described in Abort exceptions on page G3-4019, and 
generates: 
• Data Abort exceptions for watchpoints. 
• Prefetch Abort exceptions for all other debug exceptions. 
G2.8 Breakpoint Instruction exceptions 


This section describes Breakpoint Instruction exceptions in an AArch32 translation regime. 


Note 


When the PE is executing in EL0 using AArch32 and EL1 is using AArch64, it is using the AArch64 EL1&0 
translation regime. A T32 or A32 BKPT instruction executed at EL0 can generate a Breakpoint Instruction exception 
that is taken to an Exception level that is using AArch64. For more information about the handling of these 
exceptions, see Breakpoint Instruction exceptions on page D2-1639. 








It contains the following subsections: 


° About Breakpoint Instruction exceptions. 

° Breakpoint instruction in the A32 and T32 instruction sets. 

° BKPT instructions as the first instruction in an IT block on page G2-3936. 

° Exception syndrome information and preferred return address on page G2-3936. 

° Pseudocode description of Breakpoint Instruction exceptions on page G2-3937. 
G2.8.1 About Breakpoint Instruction exceptions 


A breakpoint is an event that results from the execution of an instruction, based on either: 
° The instruction address, the PE context, or both. This type of breakpoint is called a hardware breakpoint. 


° The instruction itself. That is, the instruction is a breakpoint instruction. These can be included in the 
program that the PE executes. This type of breakpoint is called a software breakpoint. 


Breakpoint Instruction exceptions, that this section describes, are software breakpoints. Breakpoint exceptions on 
page G2-3938 describes hardware breakpoints. 


There is no enable control for Breakpoint Instruction exceptions. They are always enabled, and cannot be masked. 


A Breakpoint Instruction exception is generated whenever a breakpoint instruction is committed for execution, 
regardless of all of the following: 


• The current Exception level. 
• The current Security state. 
• Whether the debug target Exception level, ELD, is using AArch64 or AArch32. 


Note 


° ELp is the Exception level that debug exceptions are targeting. See Enabling debug exceptions from the 
current Privilege level and Security state on page G2-3929. 





° Debuggers using breakpoint instructions must be aware of the ARMv8 rules for concurrent modification and 
execution of instructions. See Concurrent modification and execution of instructions on page B2-83. 





G2.8.2 Breakpoint instruction in the A32 and T32 instruction sets 
The breakpoint instruction, in both instruction sets, is: 
° BKPT #<immediate> 


For details of the instruction encoding, see BKPT on page F5-2621. 
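A debugger that plants software breakpoints needs the opcode for each instruction set. The sketch below is based on the commonly documented encodings (A32 BKPT with the AL condition as 0xE1200070 with a split 16-bit immediate, and the 16-bit T32 BKPT as 0xBE00 with an 8-bit immediate). These values are an assumption taken from outside this section, and the function names are hypothetical, so check them against the BKPT instruction description before relying on them.

```python
def encode_a32_bkpt(imm16: int) -> int:
    """Return the assumed 32-bit A32 encoding of BKPT #imm16 (condition AL)."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("A32 BKPT immediate must fit in 16 bits")
    imm12 = (imm16 >> 4) & 0xFFF   # upper 12 bits go to bits [19:8]
    imm4 = imm16 & 0xF             # lower 4 bits go to bits [3:0]
    return 0xE1200070 | (imm12 << 8) | imm4


def encode_t32_bkpt(imm8: int) -> int:
    """Return the assumed 16-bit T32 encoding of BKPT #imm8."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("T32 BKPT immediate must fit in 8 bits")
    return 0xBE00 | imm8
```

The A32 encoder always uses the AL condition, which keeps the instruction out of the CONSTRAINED UNPREDICTABLE cases that apply to other condition codes.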


About whether the BKPT instruction is conditional 
In the T32 instruction set, BKPT instructions are always unconditional. 


In the A32 instruction set: 


• If the condition code field is AL, the BKPT instruction is unconditional. 
• If the condition code field is anything other than AL, behavior is CONSTRAINED UNPREDICTABLE, and is one 
  of the following: 


— The instruction is UNDEFINED. 
— The instruction is treated as a NOP instruction. 
— The instruction is executed unconditionally. 


— The instruction is executed conditionally. 


G2.8.3 BKPT instructions as the first instruction in an IT block 


If the first instruction in an IT block is a T32 BKPT instruction, then in an implementation that supports the ITD 
control, if the ITD field that applies to the current Exception level is: 


0 The BKPT instruction generates a Breakpoint Instruction exception. 


1 The combination of IT instruction and BKPT instruction is UNDEFINED. Either the IT instruction or the 
BKPT instruction generates an Undefined Instruction exception. 


In such an implementation, to ensure consistent behavior when making the first instruction in one or more IT blocks 
a BKPT instruction, the debugger must replace the IT instruction. 


An implementation that does not support the ITD control behaves as if the value of the ITD field is 0. 


The ITD control fields are: 
HSCTLR.ITD      Applies to execution at EL2 when EL2 is using AArch32. 
SCTLR.ITD       Applies to execution at EL0 or EL1 when EL1 is using AArch32. 
SCTLR_EL1.ITD   Applies to execution at EL0 using AArch32 when EL1 is using AArch64. 


Note 
T32 BKPT instructions are always unconditional, even when they are inside an IT block. See: 
• Disabling or enabling PL0 and PL1 use of AArch32 deprecated functionality on page G1-3888. 
• Disabling or enabling EL2 use of AArch32 deprecated functionality on page G1-3897. 








G2.8.4 Exception syndrome information and preferred return address 


See the following: 
° Exception syndrome information. 


. Preferred return address on page G2-3937. 


Exception syndrome information 
The PE takes a Breakpoint Instruction exception as either: 
° A Prefetch Abort exception if it is taken to PL1. In this case, it is taken to Abort mode. 


° A Hyp Trap exception, if it is taken to PL2 because either HCR.TGE or HDCR.TDE is 1. In this case, it is 
taken to Hyp mode. 


If the exception is taken to: 


PL1 Abort mode 
The PE sets all of the following: 
° DBGDSCRext.MOE to 0b0011, to indicate a Breakpoint Instruction exception. 
° IFSR.FS to the code for a debug exception, 0b00010. 
° The IFAR with an UNKNOWN value. 
PL2 Hyp mode 
The PE does all of the following: 
° Records information about the exception in the Hypervisor Syndrome Register, HSR. See 
Table G2-10. 


° Sets DBGDSCRext.MOE to 0b0011, to indicate a Breakpoint Instruction exception. 


° Sets the HIFAR to an UNKNOWN value. 


Table G2-10 Information recorded in the HSR 

HSR field                Information recorded
Exception Class, EC      The PE sets this to the code for a Prefetch Abort exception routed to Hyp mode, 0x20.
Instruction Length, IL   The PE sets this to:
                         • 0 for a T32 BKPT instruction.
                         • 1 for an A32 BKPT instruction.
Instruction Specific     ISS[24:10]   RES0.
Syndrome, ISS            ISS[9]       External Abort type (EA). The PE sets this to 0.
                         ISS[8:6]     RES0.
                         ISS[5:0]     Instruction Fault Status Code (IFSC). The PE sets this to the code for a debug exception,
                                      0b100010.
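The values in Table G2-10 can be combined into a single HSR value, as in the sketch below. It assumes the usual HSR layout with EC in bits [31:26], IL in bit [25], and ISS in bits [24:0]. The EC and IL positions are not stated in this table, and the function name is hypothetical.

```python
EC_PREFETCH_ABORT_TO_HYP = 0x20   # Exception Class from Table G2-10
IFSC_DEBUG_EXCEPTION = 0b100010   # ISS[5:0] from Table G2-10


def hsr_for_bkpt(is_a32: bool) -> int:
    """Build the HSR value modelled for a BKPT taken to Hyp mode."""
    il = 1 if is_a32 else 0       # IL is 1 for A32 BKPT, 0 for T32 BKPT
    ea = 0                        # ISS[9], External Abort type, is set to 0
    iss = (ea << 9) | IFSC_DEBUG_EXCEPTION   # RES0 fields are left as 0
    return (EC_PREFETCH_ABORT_TO_HYP << 26) | (il << 25) | iss
```

A hypervisor handler could compare the EC and IFSC fields against these constants to recognise a debug exception, then read DBGDSCRext.MOE to distinguish the Breakpoint Instruction case.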





Note 


For information about how debug exceptions can be routed to PL2, see Routing debug exceptions 
on page G2-3927. 





Preferred return address 


The preferred return address is the address of the breakpoint instruction, not the next instruction. This is different 
to the behavior of other exception-generating instructions, like SVC. 


G2.8.5 Pseudocode description of Breakpoint Instruction exceptions 


AArch32.SoftwareBreakpoint() generates a Prefetch Abort exception that is taken from AArch32 state. 
G2.9 Breakpoint exceptions 
This section describes Breakpoint exceptions in stage 1 of an AArch32 translation regime. 


The PE is using an AArch32 translation regime when it is executing either: 
• At EL1 or higher in an Exception level that is using AArch32. 
• At EL0 using AArch32 when EL1 is using AArch32. 


This section contains the following subsections: 
° About Breakpoint exceptions. 


. Breakpoint types and linking of breakpoints on page G2-3939. 


. Execution conditions for which a breakpoint generates Breakpoint exceptions on page G2-3946. 
. Breakpoint instruction address comparisons on page G2-3947. 

° Breakpoint context comparisons on page G2-3952. 

° Using breakpoints on page G2-3953. 

. Exception syndrome information and preferred return address on page G2-3958. 

° Pseudocode description of Breakpoint exceptions taken from AArch32 state on page G2-3959. 


G2.9.1 About Breakpoint exceptions 
A breakpoint is an event that results from the execution of an instruction, based on either: 
° The instruction address, the PE context, or both. This type of breakpoint is called a hardware breakpoint. 


° The instruction itself. That is, the instruction is a breakpoint instruction. These can be included in the 
program that the PE executes. This type of breakpoint is called a software breakpoint. 


Breakpoint exceptions are generated by Breakpoint debug events. Breakpoint debug events are generated by 
hardware breakpoints. Software breakpoints are described in Breakpoint Instruction exceptions on page G2-3935. 


An implementation can include between 2 and 16 hardware breakpoints. DBGDIDR.BRPs shows how many are 
implemented. 


To use an implemented hardware breakpoint, a debugger programs the following registers for the breakpoint: 


° The Breakpoint Control Register, DBGBCR<n>. This contains controls for the breakpoint, for example an 
enable control. 


° The Breakpoint Value Register, DBGBVR<n>. This holds a value used for breakpoint matching, that is one 
of: 


— An instruction virtual address. 
— A Context ID. 


° If EL2 is implemented, the Breakpoint Extended Value Register, DBGBXVR<n>, that holds a VMID value 
used for breakpoint matching. 


These registers are numbered, so that: 
° DBGBCR1, DBGBVR1, and DBGBXVR1 are for breakpoint number one. 
° DBGBCR2, DBGBVR2, and DBGBXVR2 are for breakpoint number two. 


° DBGBCR<n>, DBGBVR<n>, and DBGBXVR<n> are for breakpoint number <n>. 


A debugger can link a breakpoint that is programmed with an address and a breakpoint that is programmed with 
anything other than an address together, so that a Breakpoint debug event is only generated if both breakpoints 
match. 
For each instruction in the program flow, all of the breakpoints are tested. When a breakpoint is tested, it generates 
a Breakpoint debug event if all of the following are true: 

• The breakpoint is enabled. That is, the breakpoint enable control for it, DBGBCR<n>.E, is 1. 
• The conditions specified in the DBGBCR<n> are met. 
• The comparisons with the values held in one or both of the DBGBVR<n> and DBGBXVR<n>, as applicable, 
  are successful. 
• If the breakpoint is linked to another breakpoint, the comparisons made by that other breakpoint are also 
  successful. 
• The instruction is committed for execution. 


If all of these conditions are met, the breakpoint generates the Breakpoint debug event regardless of the following: 
• Whether the instruction passes its condition code check. 
• The instruction type. 

If halting is allowed and EDSCR.HDE is 1, Breakpoint debug events cause entry to Debug state. 

Otherwise, if debug exceptions are: 
• Enabled, Breakpoint debug events generate Breakpoint exceptions. 
• Disabled, Breakpoint debug events are ignored. 
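The conditions for generating a Breakpoint debug event can be collected into one predicate. The class and field names are hypothetical, and each field stands for the result of a check that the hardware is assumed to have already made.

```python
from dataclasses import dataclass


@dataclass
class BreakpointTest:
    enabled: bool                # DBGBCR<n>.E is 1
    conditions_met: bool         # conditions specified in DBGBCR<n> are met
    value_compare_ok: bool       # DBGBVR<n> / DBGBXVR<n> comparisons succeed
    is_linked: bool              # breakpoint is linked to another breakpoint
    linked_compare_ok: bool      # comparisons made by the other breakpoint
    committed: bool              # instruction is committed for execution


def generates_breakpoint_debug_event(t: BreakpointTest) -> bool:
    """Model whether a tested breakpoint generates a Breakpoint debug event."""
    # The linked comparison only matters when the breakpoint is linked.
    link_ok = t.linked_compare_ok if t.is_linked else True
    return (t.enabled and t.conditions_met and t.value_compare_ok
            and link_ok and t.committed)
```

The predicate has no input for the condition code check or the instruction type, because neither affects whether the event is generated.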
Note 





The remainder of this Breakpoint exceptions section, including all subsections, describes breakpoints as generating 
Breakpoint exceptions. 


However, the behavior described also applies if breakpoints are causing entry to Debug state. 





The debug exception enable controls on page G2-3926 describes the enable controls for Breakpoint debug events. 


G2.9.2 Breakpoint types and linking of breakpoints 
Each implemented breakpoint is one of the following: 


° A context-aware breakpoint. This is a breakpoint that can be programmed to generate a Breakpoint exception 
on any one of the following: 


— An instruction address match. 

— An instruction address mismatch. 

— A Context ID match, with the value held in the CONTEXTIDR. 
— A VMID match, with the value held in the VTTBR. 

— Both a Context ID match and a VMID match. 


° A breakpoint that is not context-aware. These can only be programmed to generate a Breakpoint exception 
on an instruction address match or an instruction address mismatch. 


DBGDIDR.CTX_CMPs shows how many of the implemented breakpoints are context-aware breakpoints. At least 
one implemented breakpoint must be context-aware. The context-aware breakpoints are the highest numbered 
breakpoints. 


Any breakpoint that is programmed to generate a Breakpoint exception on an instruction address match or mismatch 
is categorized as an Address breakpoint. Breakpoints that are programmed to match on anything else are categorized 
as Context breakpoints. 


When a debugger programs a breakpoint to be an Address or a Context breakpoint, it must also program that 
breakpoint so that it is either: 





• Used in isolation. In this case, the breakpoint is called an Unlinked breakpoint. 
• Enabled for linking to another breakpoint. In this case, the breakpoint is called a Linked breakpoint. 

By linking an Address breakpoint and a Context breakpoint together, the debugger can create a breakpoint pair that 
only generates a Breakpoint exception if the PE is in a particular context when an instruction address match or 
mismatch occurs. For example, a debugger might: 


1. Program breakpoint number one to be a Linked Address Match breakpoint. 


2. Program breakpoint number five to be a Linked Context ID Match breakpoint. 


3. Link these two breakpoints together. A Breakpoint exception is only generated if both the instruction address 
   matches and the Context ID matches. 


The Breakpoint Type field for a breakpoint, DBGBCR<n>.BT, controls the breakpoint type and whether the 
breakpoint is enabled for linking. If BT[0] is 1, the breakpoint is enabled for linking. 
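The example of linking breakpoint number one to breakpoint number five can be sketched as a pair of DBGBCR<n> values. The field positions used here (E in bit [0], LBN in bits [19:16], BT in bits [23:20]) are an assumption taken from the usual DBGBCR<n> layout rather than from this section, and the helper names are hypothetical. The SSC, HMC, PMC, and BAS fields are left as zero in the sketch, although a real debugger must program them.

```python
BT_LINKED_ADDRESS_MATCH = 0b0001
BT_LINKED_CONTEXT_ID_MATCH = 0b0011


def bt_is_linked(bt: int) -> bool:
    """BT[0] == 1 means the breakpoint is enabled for linking."""
    return (bt & 1) == 1


def pack_dbgbcr(bt: int, lbn: int = 0, enable: bool = True) -> int:
    """Pack BT, LBN and E into a DBGBCR<n> value (assumed field positions)."""
    if not 0 <= bt <= 0xF or not 0 <= lbn <= 0xF:
        raise ValueError("BT and LBN are 4-bit fields")
    return (bt << 20) | (lbn << 16) | int(enable)


# Breakpoint one: Linked Address Match, linked to breakpoint five.
dbgbcr1 = pack_dbgbcr(BT_LINKED_ADDRESS_MATCH, lbn=5)
# Breakpoint five: Linked Context ID Match; its LBN field is ignored.
dbgbcr5 = pack_dbgbcr(BT_LINKED_CONTEXT_ID_MATCH)
```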


Figure G2-1 shows all of the possible breakpoint types that stage 1 of an AArch32 translation regime supports, and 
their associated BT field values. 


Figure G2-1 Breakpoint types and their associated BT field values 


Address breakpoints can be programmed to generate Breakpoint exceptions on addresses that are halfword-aligned 
but not word-aligned. This makes it possible to breakpoint on T32 instructions. See Specifying the halfword-aligned 
address that an Address breakpoint matches on on page G2-3948. 
Rules for linking breakpoints 
The rules for breakpoint linking are as follows: 
° Only Linked breakpoint types can be linked. 


° Any type of Linked Address breakpoint can link to any type of Linked Context breakpoint. The Linked 
Breakpoint Number field, DBGBCR<n>.LBN, for the Linked Address breakpoint specifies the particular 
Linked Context breakpoint that the Linked Address breakpoint links to, and: 


— DBGBCR<n>.{SSC, HMC, PMC} for the Linked Address breakpoint define the execution conditions 
that the breakpoint pair generates Breakpoint exceptions for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page G2-3946. 


—  DBGBCR<n>.{SSC, HMC, PMC} for the Linked Context breakpoint are ignored. 


° Linked Context breakpoint types can only be linked to. The LBN field for Context breakpoints is therefore 
ignored. 

• Linked Address breakpoints cannot link to watchpoints. The LBN field can therefore only specify another 
  breakpoint. 

• If a Linked Address breakpoint links to a breakpoint that is not context-aware, the behavior of the Linked 
  Address breakpoint is CONSTRAINED UNPREDICTABLE. See Other usage constraints for Address breakpoints 
  on page G2-3957. 

• If a Linked Address breakpoint links to an Unlinked Context breakpoint, the Linked Address breakpoint 
  never generates any Breakpoint exceptions. 

• Multiple Linked Address breakpoints can link to a single Linked Context breakpoint. 





Note 


Multiple Linked watchpoints can also link to a single Linked Context breakpoint. Watchpoint exceptions on 
page G2-3961 describes watchpoints. 





These rules mean that a single Linked Context breakpoint might be linked to by all, or any combination of, the 
following: 


° Multiple Linked Address Match breakpoints. 
° Multiple Linked Address Mismatch breakpoints. 
° Multiple Linked watchpoints. 


It is also possible that a Linked Context breakpoint might have no breakpoints or watchpoints linked to it. 
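A debugger might check a proposed link against these rules before programming the registers. The sketch below covers only the Context breakpoint types whose BT values are listed in this part of the section. The function name and result strings are hypothetical.

```python
LINKED_CONTEXT_BT = {0b0011, 0b1001}     # Linked Context ID / VMID Match
UNLINKED_CONTEXT_BT = {0b0010, 0b1000}   # Unlinked Context ID / VMID Match


def linked_address_breakpoint_outcome(target_is_context_aware: bool,
                                      target_bt: int) -> str:
    """Classify a Linked Address breakpoint by the breakpoint LBN selects."""
    if not target_is_context_aware:
        # Linking to a breakpoint that is not context-aware.
        return "CONSTRAINED UNPREDICTABLE"
    if target_bt in UNLINKED_CONTEXT_BT:
        # Linking to an Unlinked Context breakpoint.
        return "never generates Breakpoint exceptions"
    if target_bt in LINKED_CONTEXT_BT:
        return "generates Breakpoint exceptions when both comparisons succeed"
    return "not covered by this sketch"
```

Because several Linked Address breakpoints can select the same Linked Context breakpoint, the check is made per Linked Address breakpoint and needs no global bookkeeping.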


Figure G2-2 on page G2-3942 shows an example of permitted breakpoint and watchpoint linking. 
Figure G2-2 The role of linking in Breakpoint and Watchpoint exception generation 


In Figure G2-2, each Linked Address breakpoint can only generate a Breakpoint exception if the comparisons made 
by both it, and the Linked Context breakpoint that it links to, are successful. Similarly, each Linked watchpoint can 
only generate a Watchpoint exception if the comparisons made by both it, and the Linked Context breakpoint that 
it links to, are successful. 


Breakpoint types defined by DBGBCR<n>.BT 


The following list provides more detail about each breakpoint type: 


0b0000, Unlinked Address Match breakpoint 


Generation of a Breakpoint exception depends on both: 


° DBGBCR<n>.{SSC, HMC, PMC}. These define the execution conditions that the 
breakpoint generates Breakpoint exceptions for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page G2-3946. 


° A successful address match, as described in Breakpoint instruction address comparisons on 
page G2-3947. 


DBGBCR<n>.LBN for this breakpoint is ignored. 
0b0001, Linked Address Match breakpoint 


Generation of a Breakpoint exception depends on all of the following: 


• DBGBCR<n>.{SSC, HMC, PMC} for this breakpoint. These define the execution 
  conditions that the breakpoint generates Breakpoint exceptions for. See Execution conditions 
  for which a breakpoint generates Breakpoint exceptions on page G2-3946. 


• A successful address match defined by this breakpoint, as described in Breakpoint instruction 
  address comparisons on page G2-3947. 


• A successful context match defined by the Linked Context breakpoint that this breakpoint 
  links to. 


DBGBCR<n>.LBN for this breakpoint selects the Linked Context breakpoint that this breakpoint 
links to. 


0b0010, Unlinked Context ID Match breakpoint 


BT == 0b0010 is a reserved value if the breakpoint is not a context-aware breakpoint. 


For context-aware breakpoints, generation of a Breakpoint exception depends on both: 


• DBGBCR<n>.{SSC, HMC, PMC}. These define the execution conditions that the 
  breakpoint generates Breakpoint exceptions for. See Execution conditions for which a 
  breakpoint generates Breakpoint exceptions on page G2-3946. 


• A successful Context ID match, as described in Breakpoint context comparisons on 
  page G2-3952. 


DBGBCR<n>.{LBN, BAS} for this breakpoint are ignored. 


0b0011, Linked Context ID Match breakpoint 


BT == 0b0011 is a reserved value if the breakpoint is not a context-aware breakpoint. 


For context-aware breakpoints, either: 


• This breakpoint does not generate any Breakpoint exceptions, if no Linked breakpoints or 
  Linked watchpoints link to it. 


• Generation of a Breakpoint exception depends on both: 


  — A successful instruction address match, defined by a Linked Address breakpoint that 
    links to this breakpoint, see Breakpoint instruction address comparisons on 
    page G2-3947. 


  — A successful Context ID match defined by this breakpoint, as described in Breakpoint 
    context comparisons on page G2-3952. 


• Generation of a Watchpoint exception depends on both: 


  — A successful data address match, defined by a Linked watchpoint that links to this 
    breakpoint, see Watchpoint data address comparisons on page G2-3965. 


  — A successful Context ID match defined by this breakpoint, as described in Breakpoint 
    context comparisons on page G2-3952. 


DBGBCR<n>.{LBN, SSC, HMC, BAS, PMC} for this breakpoint are ignored. 


0b0100, Unlinked Address Mismatch breakpoint 


Generation of a Breakpoint exception depends on both: 


DBGBCR<n>.{SSC, HMC, PMC}. These define the execution conditions that the 
breakpoint generates Breakpoint exceptions for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page G2-3946. 


A successful address mismatch, as described in Breakpoint instruction address comparisons 
on page G2-3947.


DBGBCR<n>.LBN for this breakpoint is ignored. 
0b0101, Linked Address Mismatch breakpoint
Generation of a Breakpoint exception depends on all of the following: 


° DBGBCR<n>.{SSC, HMC, PMC}. These define the execution conditions that the 
breakpoint generates Breakpoint exceptions for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page G2-3946. 


° A successful address mismatch defined by this breakpoint, as described in Breakpoint 
instruction address comparisons on page G2-3947. 

° A successful context match defined by the Linked Context breakpoint that this breakpoint 
links to. 


DBGBCR<n>.LBN for this breakpoint selects the Linked Context breakpoint that this breakpoint 
links to. 
0b1000, Unlinked VMID Match breakpoint 


BT == 0b1000 is a reserved value if either: 
° The breakpoint is not a context-aware breakpoint. 


° EL2 is not implemented. 
For context-aware breakpoints, generation of a Breakpoint exception depends on both: 


° DBGBCR<n>.{SSC, HMC, PMC}. These define the execution conditions that the 
breakpoint generates Breakpoint exceptions for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page G2-3946. 


° A successful VMID match, as described in Breakpoint context comparisons on 
page G2-3952. 


DBGBCR<n>.{LBN, BAS} for this breakpoint are ignored. 


0b1001, Linked VMID Match breakpoint 


BT == 0b1001 is a reserved value if either:
° The breakpoint is not a context-matching breakpoint. 


° EL2 is not implemented. 
For context-aware breakpoints, either: 


° This breakpoint does not generate any Breakpoint exceptions, if no Linked breakpoints or
Linked watchpoints link to it. 


° Generation of a Breakpoint exception depends on both: 


— A successful instruction address match, defined by a Linked Address Match 
breakpoint that links to this breakpoint. See Breakpoint instruction address 
comparisons on page G2-3947. 


— A successful VMID match defined by this breakpoint. 
° Generation of a Watchpoint exception depends on both: 


— A successful data address match, defined by a Linked watchpoint that links to this
breakpoint, see Watchpoint data address comparisons on page G2-3965. 


— A successful VMID match defined by this breakpoint, as described in Breakpoint 
context comparisons on page G2-3952. 


DBGBCR<n>.{LBN, SSC, HMC, BAS, PMC} for this breakpoint are ignored. 
0b1010, Unlinked Context ID and VMID Match breakpoint
BT == 0b1010 is a reserved value if either: 


° The breakpoint is not a context-matching breakpoint. 


° EL2 is not implemented.


For context-matching breakpoints, generation of a Breakpoint exception depends on all of the 
following: 


° DBGBCR<n>.{SSC, HMC, PMC}. These define the execution conditions that the 
breakpoint generates Breakpoint exceptions for. See Execution conditions for which a 
breakpoint generates Breakpoint exceptions on page G2-3946. 


° A successful Context ID match, as described in Breakpoint context comparisons on
page G2-3952. 


° A successful VMID match. 


Breakpoint context comparisons on page G2-3952 describes the requirements for a successful 
Context ID match and a successful VMID match. 


DBGBCR<n>.{LBN, BAS} for this breakpoint are ignored. 


0b1011, Linked Context ID and VMID Match breakpoint 
BT == 0b1011 is a reserved value if either: 
° The breakpoint is not a context-matching breakpoint. 


° EL2 is not implemented.
For context-matching breakpoints, either: 


° This breakpoint does not generate any Breakpoint exceptions, if no Linked breakpoints or
Linked watchpoints link to it. 


° Generation of a Breakpoint exception depends on all of the following: 


— A successful instruction address match, defined by a Linked Address breakpoint that 
links to this breakpoint, see Breakpoint instruction address comparisons on 
page G2-3947. 


— A successful Context ID match defined by this breakpoint, as described in Breakpoint
context comparisons on page G2-3952. 


— A successful VMID match defined by this breakpoint. 
° Generation of a Watchpoint exception depends on all of the following: 


— A successful data address match, defined by a Linked watchpoint that links to this 
breakpoint, see Watchpoint data address comparisons on page G2-3965. 


— A successful Context ID match defined by this breakpoint, as described in Breakpoint
context comparisons on page G2-3952. 


— A successful VMID match defined by this breakpoint. 


Breakpoint context comparisons on page G2-3952 describes the requirements for a successful 
Context ID match and a successful VMID match by this breakpoint. 


DBGBCR<n>.{LBN, SSC, HMC, BAS, PMC} for this breakpoint are ignored. 











Note 
See Reserved DBGBCR<n>.BT values on page G2-3955 for the behavior of breakpoints programmed with reserved 
BT values. 
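The BT encodings listed above can be summarized as a lookup table. The following is a minimal sketch, not an architectural definition. The names `BT_TYPES` and `describe_bt` are hypothetical, and the entries for 0b0000 and 0b0001 are assumed from the standard DBGBCR<n>.BT encoding because they are described before this part of the list.

```python
# Breakpoint types keyed by DBGBCR<n>.BT (illustrative table, names are hypothetical).
BT_TYPES = {
    0b0000: "Unlinked Address Match",      # assumed, described earlier in the list
    0b0001: "Linked Address Match",        # assumed, described earlier in the list
    0b0010: "Unlinked Context ID Match",
    0b0011: "Linked Context ID Match",
    0b0100: "Unlinked Address Mismatch",
    0b0101: "Linked Address Mismatch",
    0b1000: "Unlinked VMID Match",
    0b1001: "Linked VMID Match",
    0b1010: "Unlinked Context ID and VMID Match",
    0b1011: "Linked Context ID and VMID Match",
}


def describe_bt(bt):
    """Return (name, is_linked, is_context) for a BT value, or None if it is not listed."""
    name = BT_TYPES.get(bt)
    if name is None:
        return None
    is_linked = bool(bt & 1)  # bit 0 separates Linked from Unlinked types
    # Context types are 0b001x, 0b100x and 0b101x
    is_context = bt >= 0b1000 or (bt & 0b1110) == 0b0010
    return name, is_linked, is_context
```

A value that returns None here is one of the reserved encodings. Whether a listed value is usable on a particular breakpoint still depends on the conditions given for each type.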
G2.9.3 Execution conditions for which a breakpoint generates Breakpoint exceptions


Each breakpoint can be programmed so that it only generates Breakpoint exceptions for certain execution 
conditions. For example, a breakpoint might be programmed to generate Breakpoint exceptions only when the PE 
is executing at PL0 in Secure state.


DBGBCR<n>.{SSC, HMC, PMC} define the execution conditions the breakpoint generates Breakpoint exceptions 
for, as follows: 
Security State Control, SSC 

Controls whether the breakpoint generates Breakpoint exceptions only in Secure state, only in
Non-secure state, or in both Security states.


Note


This is determined by the Security state of the PE, not from the NS attribute returned by the 
translation of the virtual address on which the breakpoint is set. 





Higher Mode Control, HMC, and Privileged Mode Control, PMC 


HMC and PMC together control which AArch32 modes the breakpoint generates Breakpoint 
exceptions in. 


Table G2-11 shows the valid combinations of the values of HMC, SSC, and PMC, and for each combination shows 
which Privilege levels breakpoints generate Breakpoint exceptions in. 


In the table: 


Y or - Means that a breakpoint programmed with the values of HMC, SSC and PMC shown in that row: 
Y Can generate Breakpoint exceptions in AArch32 modes at that Privilege level. 


- Cannot generate Breakpoint exceptions in AArch32 modes at that Privilege level. 


Res Means that the combination of HMC, SSC, and PMC is reserved. See Reserved 
DBGBCR<n>.{SSC, HMC, PMC} values on page G2-3956. 


Table G2-11 Summary of breakpoint HMC, SSC, and PMC encodings

                Security state the                            Implementation
                breakpoint is programmed                                No EL2 and
HMC  SSC  PMC   to match in               PL2(a)  PL1   PL0    No EL3   no EL3
0    00   00    Both                      -       Y(b)  Y(b)   -        -
0    00   01    Both                      -       Y     -      -        -
0    00   10    Both                      -       -     Y      -        -
0    00   11    Both                      -       Y     Y      -        -
0    01   00    Non-secure                -       Y(b)  Y(b)   Res      Res
0    01   01    Non-secure                -       Y     -      Res      Res
0    01   10    Non-secure                -       -     Y      Res      Res
0    01   11    Non-secure                -       Y     Y      Res      Res
0    10   00    Secure                    -       Y(b)  Y(b)   Res      Res
0    10   01    Secure                    -       Y     -      Res      Res
0    10   10    Secure                    -       -     Y      Res      Res
0    10   11    Secure                    -       Y     Y      Res      Res
1    00   01    Both                      Y       Y     -      -        Res
1    00   11    Both                      Y       Y     Y      -        Res
1    01   01    Non-secure                Y       Y     -      Res      Res
1    01   11    Non-secure                Y       Y     Y      Res      Res
1    10   01    Secure                    -       Y     -      Res      Res
1    10   11    Secure                    -       Y     Y      Res      Res
1    11   00    Non-secure                Y       -     -      -        Res if no EL2(c)











a. Debug exceptions are not generated at PL2 using AArch32. This means that these combinations of HMC, SSC, and PMC are only 
relevant if breakpoints cause entry to Debug state. Self-hosted debuggers must avoid combinations of HMC, SSC, and PMC that 
generate Breakpoint exceptions at PL2 using AArch32. 


b. Only in User, System and Supervisor modes. 


c. This encoding is only reserved when EL2 is not implemented, regardless of whether EL3 is implemented. 
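The encodings in Table G2-11 lend themselves to a table-driven lookup. The following is a minimal sketch that transcribes the table as read here; it should be checked against the published table before use. The names `TABLE` and `lookup` are hypothetical, and the implementation-dependent Res columns are not modeled.

```python
# (HMC, SSC, PMC) -> (Security state, privilege levels that can generate Breakpoint exceptions)
SECURITY = {0b00: "Both", 0b01: "Non-secure", 0b10: "Secure"}
# For HMC == 0, PMC alone selects the privilege levels.
# PMC == 0b00 matches only in User, System and Supervisor modes (table footnote b).
PMC_LEVELS = {0b00: {"PL1", "PL0"}, 0b01: {"PL1"}, 0b10: {"PL0"}, 0b11: {"PL1", "PL0"}}

TABLE = {}
for ssc, state in SECURITY.items():
    for pmc, levels in PMC_LEVELS.items():
        TABLE[(0, ssc, pmc)] = (state, set(levels))

# Rows with HMC == 1, transcribed individually.
TABLE.update({
    (1, 0b00, 0b01): ("Both", {"PL2", "PL1"}),
    (1, 0b00, 0b11): ("Both", {"PL2", "PL1", "PL0"}),
    (1, 0b01, 0b01): ("Non-secure", {"PL2", "PL1"}),
    (1, 0b01, 0b11): ("Non-secure", {"PL2", "PL1", "PL0"}),
    (1, 0b10, 0b01): ("Secure", {"PL1"}),
    (1, 0b10, 0b11): ("Secure", {"PL1", "PL0"}),
    (1, 0b11, 0b00): ("Non-secure", {"PL2"}),
})


def lookup(hmc, ssc, pmc):
    """Return (security_state, privilege_levels), or None for a combination not in the table."""
    return TABLE.get((hmc, ssc, pmc))
```

A None result corresponds to a combination that the table does not show.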


All combinations of HMC, SSC, and PMC that this table does not show are reserved. See Reserved HMC, SSC, and
PMC combinations on page G2-3956.


G2.9.4 Breakpoint instruction address comparisons


Address comparisons are made for each instruction in the program flow. The following subsections describe the 
criteria for a successful address comparison, for: 


° Address Match breakpoints. 
° Address Mismatch breakpoints on page G2-3948.


Address Match breakpoints 


An address match comparison is successful if both: 


° Bits [31:2] of the current instruction virtual address are equal to DBGBVR<n>[31:2]. 


° The word or halfword selected by DBGBCR<n>.BAS matches. That is, either:

— DBGBCR<n>.BAS is programmed with 0b0011 or 0b1111, and the instruction is at a word-aligned
address.

— DBGBCR<n>.BAS is programmed with 0b1100, and the instruction is not at a word-aligned address.


See Specifying the halfword-aligned address that an Address breakpoint matches on on page G2-3948. 


Note 





DBGBVR<n>[1:0] are RES0 and are ignored.
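The two conditions for a successful address match can be expressed as a small predicate. This is an illustrative sketch only: the function name `address_match` is hypothetical, addresses are assumed to be 32-bit and halfword-aligned, and only the three BAS values named above are handled.

```python
def address_match(instr_addr, dbgbvr, bas):
    """Return True if an Address Match comparison succeeds for this instruction address."""
    # Compare bits [31:2]; DBGBVR<n>[1:0] are ignored.
    if (instr_addr & 0xFFFFFFFF) >> 2 != (dbgbvr & 0xFFFFFFFF) >> 2:
        return False
    word_aligned = (instr_addr & 0b11) == 0
    if bas in (0b0011, 0b1111):
        return word_aligned          # instruction at the word-aligned address
    if bas == 0b1100:
        return not word_aligned      # instruction at the second halfword
    raise ValueError("BAS value not covered by this sketch")
```

For example, with DBGBVR<n> holding 0x1014, an instruction at 0x1016 matches only when BAS is 0b1100.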
Address Mismatch breakpoints
An address mismatch comparison is successful if either: 
° Bits [31:2] of the current instruction virtual address are not equal to DBGBVR<n>[31:2]. 


° The word or halfword selected by DBGBCR<n>.BAS does not match. That is, either: 


— DBGBCR<n>.BAS is programmed with 0b0011 or 0b1111, and the instruction is not at a word-aligned
address. 


—  DBGBCR<n>.BAS is programmed with 0b1100, and the instruction is at a word-aligned address. 


See Specifying the halfword-aligned address that an Address breakpoint matches on. 





Note 
° DBGBVR<n>[1:0] are RES0 and are ignored.


° Address Mismatch breakpoints can be used to single-step through code. See Using an Address Mismatch 
breakpoint to single-step an instruction on page G2-3953. 





Specifying the halfword-aligned address that an Address breakpoint matches on 


For an Address breakpoint, a debugger can use the Byte Address Selection field, DBGBCR<n>.BAS, so that the 
address comparison is successful on one of: 


° The whole word starting at address DBGBVR<n>[31:2]:00. 
° The halfword starting at address DBGBVR<n>[31:2]:00. 
° The halfword starting at address ((DBGBVR<n>[31:2]:00) + 2). 





Note 
The address programmed into the DBGBVR<n> must be word-aligned. 





DBGBCR<n>.BAS can be used in both Address Match breakpoints and Address Mismatch breakpoints, as follows: 


° For an Address Match breakpoint, DBGBCR<n>.BAS selects which halfword-aligned address the 
breakpoint must generate a Breakpoint exception for. This means that an address comparison is successful 
only if both of the following match: 


— The instruction address held in bits [31:2] of the DBGBVR<n>. 
— The halfword defined by the BAS field. 


That is, a successful address comparison = DBGBVR<n>[31:2] match AND BAS match. 


° For an Address Mismatch breakpoint, DBGBCR<n>.BAS selects which halfword-aligned address the 
breakpoint must not generate a Breakpoint exception for. This means that an address comparison is successful 
if either or both of the following do not match: 


— The instruction address held in bits [31:2] of the DBGBVR<n>. 
— The halfword defined by the BAS field. 


That is, a successful address comparison = NOT (DBGBVR<n>[31:2] match AND BAS match). 


The following subsections show the supported BAS values: 
° Using the BAS field in Address Match breakpoints on page G2-3949. 
° Using the BAS field in Address Mismatch breakpoints on page G2-3950. 


For Context breakpoints, DBGBCR<n>.BAS is RES1 and is ignored.
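The Address Mismatch condition is stated in two ways in this section: as "either the address bits differ or the BAS-selected halfword does not match", and as NOT (address match AND BAS match). The following sketch checks that the two forms agree over a small range of addresses. All function names are hypothetical, and only the BAS values 0b0011, 0b1100 and 0b1111 are considered.

```python
def bas_halfword_match(instr_addr, bas):
    """True if the halfword selected by BAS is the one the instruction starts at."""
    word_aligned = (instr_addr & 0b11) == 0
    return word_aligned if bas in (0b0011, 0b1111) else not word_aligned


def match(instr_addr, dbgbvr, bas):
    """Address Match: DBGBVR<n>[31:2] match AND BAS match."""
    return (instr_addr >> 2) == (dbgbvr >> 2) and bas_halfword_match(instr_addr, bas)


def mismatch_either(instr_addr, dbgbvr, bas):
    """Address Mismatch in its 'either' form."""
    return (instr_addr >> 2) != (dbgbvr >> 2) or not bas_halfword_match(instr_addr, bas)


def forms_agree(dbgbvr=0x1014):
    """Check the 'either' form against NOT(match) for halfword addresses around dbgbvr."""
    addrs = range(dbgbvr - 4, dbgbvr + 8, 2)
    return all(
        mismatch_either(a, dbgbvr, b) == (not match(a, dbgbvr, b))
        for a in addrs
        for b in (0b0011, 0b1100, 0b1111)
    )
```

The agreement follows directly from De Morgan's law, so the check is a consistency aid rather than a proof of architectural behavior.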
Using the BAS field in Address Match breakpoints
The supported BAS values are: 


0b0000 This value is reserved. Behavior is a CONSTRAINED UNPREDICTABLE choice of: 
° The breakpoint is disabled.
° The breakpoint behaves as if BAS is 0b0011, 0b1100, or 0b1111.


0b0011 The breakpoint generates a Breakpoint exception if an instruction with an address described as 
follows is committed for execution: 


° Bits [31:2] of the address equals DBGBVR<n>[31:2]. 
° Bits [1:0] of the address are 0b00. 


This means that breakpoints programmed with this BAS value generate Breakpoint exceptions for
all of the following:

° 32-bit T32 instructions at word-aligned addresses.
° 16-bit T32 instructions at word-aligned addresses.
° A32 instructions. These are always at word-aligned addresses.


However, ARM recommends that a debugger uses this BAS value only for T32 instructions. 


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value 
generates a Breakpoint exception on the second halfword of a 32-bit T32 instruction starting at the 
halfword-aligned address ((DBGBVR<n>[31:2]:00) - 2).


0b1100 The breakpoint generates a Breakpoint exception if an instruction with an address described as 
follows is committed for execution: 


° Bits [31:2] of the address equals DBGBVR<n>[31:2]. 
° Bits [1:0] of the address are 0b10. 


This means that breakpoints programmed with this BAS value generate Breakpoint exceptions for
both of the following:
° 32-bit T32 instructions at addresses that are halfword-aligned but not word-aligned. 
° 16-bit T32 instructions at addresses that are halfword-aligned but not word-aligned. 


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value 
generates a Breakpoint exception on the second halfword of a 32-bit T32 or A32 instruction starting 
at a word-aligned address. 


0b1111 The breakpoint generates a Breakpoint exception if an instruction with an address described as 
follows is committed for execution: 


° Bits [31:2] of the address equals DBGBVR<n>[31:2]. 
° Bits [1:0] of the address are 0b00. 


This means that breakpoints programmed with this BAS value generate Breakpoint exceptions for 
all of the following: 


° 32-bit T32 instructions at word-aligned addresses. 
° 16-bit T32 instructions at word-aligned addresses. 
° A32 instructions. These are always at word-aligned addresses.


However, ARM recommends that a debugger uses this BAS value only for A32 instructions. 


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value 
generates a Breakpoint exception on the second halfword of a 32-bit T32 instruction starting at the 
halfword-aligned address ((DBGBVR<n>[31:2]:00) - 2).


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value 
generates a Breakpoint exception on a 32-bit T32 instruction or a 16-bit T32 instruction at the 
halfword-aligned address ((DBGBVR<n>[31:2]:00) + 2).


All other BAS values are reserved. For these other values, DBGBCR<n>.BAS[3,1] ignore writes and read
the same values as DBGBCR<n>.BAS[2,0] respectively. This means that the smallest instruction size a debugger can
program breakpoints to match on is a halfword.
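The read-back behavior for BAS can be illustrated as follows. This is a sketch under the assumption that the statement above means BAS[3] always reads as BAS[2] and BAS[1] always reads as BAS[0]; the function name `bas_readback` is hypothetical.

```python
def bas_readback(written):
    """Value read from the 4-bit BAS field after `written` is written to it.

    Assumes BAS[3] mirrors BAS[2] and BAS[1] mirrors BAS[0], so only
    bits 2 and 0 of the written value are retained.
    """
    b2 = (written >> 2) & 1
    b0 = written & 1
    return (b2 << 3) | (b2 << 2) | (b0 << 1) | b0
```

Under this reading, every write results in one of 0b0000, 0b0011, 0b1100 or 0b1111, which is why the granularity of a match is a halfword.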
Figure G2-3 shows a summary of when breakpoints programmed with particular BAS values generate Breakpoint
exceptions. 


The figure contains four parts: 


° A column showing the row number, on the left. 
° An instruction set and instruction size table. 
° A location of instruction figure. 


° A BAS field values table, on the right. 
To use the figure, read across the rows. For example: 


° Row 2 shows that a breakpoint with a BAS value of 0b1100 generates Breakpoint exceptions for 16-bit T32 
instructions starting at the halfword-aligned address ((DBGBVR<n>[31:2]:00) + 2).


° Row 6 shows that a breakpoint with a BAS value of either 0b0011 or 0b1111 generates Breakpoint exceptions 
for A32 instructions. A32 instructions are always at word-aligned addresses. 


In the figure: 


Yes Means that the breakpoint generates a Breakpoint exception. 
No Means that the breakpoint does not generate a Breakpoint exception. 
UNP Means that it is CONSTRAINED UNPREDICTABLE whether the breakpoint generates a Breakpoint
exception. See Other usage constraints for Address breakpoints on page G2-3957.


























                                Location of              BAS[3:0]
       Instruction set  Size    instruction(a)   0b0011  0b1100  0b1111
Row 1  T32              16-bit   0 to +1         Yes     No      Yes
Row 2  T32              16-bit  +2 to +3         No      Yes     UNP
Row 3  T32              32-bit  -2 to +1         UNP     No      UNP
Row 4  T32              32-bit   0 to +3         Yes     UNP     Yes
Row 5  T32              32-bit  +2 to +5         No      Yes     UNP
Row 6  A32              32-bit   0 to +3         Yes     UNP     Yes

a. 0 means the word-aligned address held in the DBGBVRn. The other locations are as follows:
-2 means ((DBGBVRn[31:2]:00) - 2).
-1 means ((DBGBVRn[31:2]:00) - 1).
+5 means ((DBGBVRn[31:2]:00) + 5).

The location column gives the bytes that the instruction occupies.
Figure G2-3 Summary of BAS field meanings for Address Match breakpoints
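The outcomes summarized in Figure G2-3 can be derived from which halfwords the BAS value selects and which halfwords the instruction occupies. The following is a minimal sketch of that derivation, not an architectural definition. The function name `match_outcome` is hypothetical, and the instruction start offsets used for each row are inferred from the BAS descriptions above, which is an assumption of this sketch.

```python
def match_outcome(bas, start, size_bytes):
    """Return "Yes", "No" or "UNP" for an Address Match breakpoint.

    `start` is the byte offset of the instruction from the word-aligned
    address in DBGBVR<n>; `size_bytes` is 2 or 4.
    """
    selected = {0b0011: {0}, 0b1100: {2}, 0b1111: {0, 2}}[bas]
    halfwords = set(range(start, start + size_bytes, 2))
    # A definite match needs the instruction to start at a selected halfword.
    # With BAS == 0b1111 only a start at the word-aligned address is definite.
    if start in selected and (bas != 0b1111 or start == 0):
        return "Yes"
    # Any other overlap with a selected halfword is CONSTRAINED UNPREDICTABLE.
    return "UNP" if halfwords & selected else "No"
```

For example, `match_outcome(0b1100, 0, 4)` gives "UNP", corresponding to a 32-bit instruction at the word-aligned address whose second halfword is the selected one.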


Using the BAS field in Address Mismatch breakpoints 


An Address Mismatch breakpoint generates Breakpoint exceptions for all instructions committed for execution, 
except the instruction whose address the breakpoint is programmed to match. 


The supported BAS values are: 


0b0000 The breakpoint ignores the address held in the DBGBVR<n> and generates Breakpoint exceptions 
for all instruction addresses. 
0b0011 The breakpoint does not generate a Breakpoint exception if an instruction with an address described
as follows is committed for execution: 


° Bits [31:2] of the address equals DBGBVR<n>[31:2]. 
° Bits [1:0] of the address are 0b00. 


This means that breakpoints programmed with this BAS value do not generate Breakpoint 
exceptions for any of the following: 


° 32-bit T32 instructions at word-aligned addresses.
° 16-bit T32 instructions at word-aligned addresses. 
° A32 instructions. These are always at word-aligned addresses. 


However, ARM recommends that a debugger uses this BAS value only for T32 instructions. 


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value does 
not generate a Breakpoint exception on the second halfword of a 32-bit T32 instruction starting at 
the halfword-aligned address ((DBGBVR<n>[31:2]:00) - 2).


0b1100 The breakpoint does not generate a Breakpoint exception if an instruction with an address described 
as follows is committed for execution: 


° Bits [31:2] of the address equals DBGBVR<n>[31:2].

° Bits [1:0] of the address are 0b10. 

This means that breakpoints programmed with this BAS value do not generate Breakpoint 
exceptions for either of the following: 

° 32-bit T32 instructions at addresses that are halfword-aligned but not word-aligned.

° 16-bit T32 instructions at addresses that are halfword-aligned but not word-aligned. 

It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value does
not generate a Breakpoint exception on the second halfword of a 32-bit T32 or A32 instruction at a
word-aligned address.


0b1111 The breakpoint does not generate a Breakpoint exception if an instruction with an address described 
as follows is committed for execution: 


° Bits [31:2] of the address equals DBGBVR<n>[31:2]. 
° Bits [1:0] of the address are 0b00. 


This means that breakpoints programmed with this BAS value do not generate Breakpoint 
exceptions for any of the following: 


° 32-bit T32 instructions at word-aligned addresses. 
° 16-bit T32 instructions at word-aligned addresses.
° A32 instructions. These are always at word-aligned addresses.


However, ARM recommends that a debugger uses this BAS value only for A32 instructions. 


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value does 
not generate a Breakpoint exception on the second halfword of a 32-bit T32 instruction starting at 
the halfword-aligned address ((DBGBVR<n>[31:2]:00) - 2).


It is CONSTRAINED UNPREDICTABLE whether a breakpoint programmed with this BAS value does 
not generate a Breakpoint exception on a 32-bit T32 instruction or a 16-bit T32 instruction at the 
halfword-aligned address ((DBGBVR<n>[31:2]:00) + 2).


All other BAS values are reserved. For these other values, DBGBCR<n>.BAS[3,1] ignore writes and read
the same values as DBGBCR<n>.BAS[2,0] respectively. This means that the smallest instruction size that a breakpoint
can never generate a Breakpoint exception for is a halfword.


Figure G2-4 on page G2-3952 shows a summary of when breakpoints programmed with particular BAS values 
generate Breakpoint exceptions. 


The figure contains four parts: 


° A column showing the row number, on the left.
° An instruction set and instruction size table.
° A location of instruction figure.


° A BAS field values table, on the right. 
To use the figure, read across the rows. For example:

° Row 1 shows that a breakpoint with a BAS value of 0b1100 generates Breakpoint exceptions for 16-bit T32
instructions starting at the word-aligned address held in the DBGBVR<n>.

° Row 5 shows that a breakpoint with a BAS value of 0b0011 generates Breakpoint exceptions for 32-bit T32
instructions starting at the halfword-aligned address immediately after the word-aligned address held in the
DBGBVR<n>.

In the figure:

Yes Means that the breakpoint does generate a Breakpoint exception.
No Means that the breakpoint does not generate a Breakpoint exception.
UNP Means that it is CONSTRAINED UNPREDICTABLE whether the breakpoint generates a Breakpoint
exception. See Other usage constraints for Address breakpoints on page G2-3957.

                                Location of                  BAS[3:0]
       Instruction set  Size    instruction(a)   0b0000  0b0011  0b1100  0b1111
Row 1  T32              16-bit   0 to +1         Yes     No      Yes     No
Row 2  T32              16-bit  +2 to +3         Yes     Yes     No      UNP
Row 3  T32              32-bit  -2 to +1         Yes     UNP     Yes     UNP
Row 4  T32              32-bit   0 to +3         Yes     No      UNP     No
Row 5  T32              32-bit  +2 to +5         Yes     Yes     No      UNP
Row 6  A32              32-bit   0 to +3         Yes     No      UNP     No

a. 0 means the word-aligned address held in the DBGBVRn. The other locations are as follows:
-2 means ((DBGBVRn[31:2]:00) - 2).
-1 means ((DBGBVRn[31:2]:00) - 1).
+5 means ((DBGBVRn[31:2]:00) + 5).

The location column gives the bytes that the instruction occupies.
Figure G2-4 Summary of BAS field meanings for Address Mismatch breakpoints


G2.9.5 Breakpoint context comparisons


The breakpoint type defined by DBGBCR<n>.BT determines what context comparison is required, if any.
Table G2-12 shows the BT values that require a comparison, and the match required for the comparison to be
successful.


Table G2-12 Breakpoint context comparison tests

DBGBCR<n>.BT   Test required for successful context comparison
0b001x         CONTEXTIDR value matches DBGBVR<n>.ContextID value
0b100x         VTTBR.VMID value matches DBGBXVR<n>.VMID value
0b101x         CONTEXTIDR value matches DBGBVR<n>.ContextID value and
               VTTBR.VMID value matches DBGBXVR<n>.VMID value





No context comparison is required for other valid DBGBCR<n>.BT values. 


Context breakpoints do not generate Breakpoint exceptions when execution is in EL2. 


The following Context breakpoint types do not generate Breakpoint exceptions in Secure state: 


° VMID Match breakpoints.


° VMID and Context ID Match breakpoints. 
Note
° For all Context breakpoints, DBGBCR<n>.BAS is RES1 and is ignored.
° For Linked Context breakpoints, DBGBCR<n>.{LBN, SSC, HMC, PMC} are RES0 and are ignored.
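The selection of the context test by BT can be sketched as follows. This is an illustration only: the function name and parameter names are hypothetical, the current and programmed Context ID and VMID values are passed in as plain integers, and the restrictions on EL2 and Secure state described above are not modeled.

```python
def context_comparison(bt, current_context_id, current_vmid, bp_context_id, bp_vmid):
    """Return True or False for the context test selected by BT.

    Returns None when BT does not require a context comparison.
    """
    kind = bt >> 1  # BT[3:1]; bit 0 only separates Linked from Unlinked
    if kind == 0b001:    # 0b001x, Context ID Match
        return current_context_id == bp_context_id
    if kind == 0b100:    # 0b100x, VMID Match
        return current_vmid == bp_vmid
    if kind == 0b101:    # 0b101x, Context ID and VMID Match
        return current_context_id == bp_context_id and current_vmid == bp_vmid
    return None
```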














G2.9.6 Using breakpoints 

This section contains the following: 

° Using an Address Mismatch breakpoint to single-step an instruction. 

° ITD control effects on address breakpoints on the first instruction in an IT block on page G2-3954. 

° Usage constraints on page G2-3955. 

Using an Address Mismatch breakpoint to single-step an instruction 

In execution conditions that an Address Mismatch breakpoint matches, defined by DBGBCR<n>.{SSC, HMC,
PMC}, the breakpoint generates Breakpoint exceptions for all instructions committed for execution, except the
instruction whose address the breakpoint is programmed with. Figure G2-5 shows an example of Address Mismatch
breakpoint operation, for an Address Mismatch breakpoint programmed with address 0x1014.

Instruction addresses, in program flow order:

0x1000 \
0x1004  |
0x1008  >  The breakpoint generates a Breakpoint exception for all of these instructions
0x100C  |
0x1010 /
0x1014  <-- The breakpoint does not generate a Breakpoint exception
0x1018 \
0x101C  >  The breakpoint generates a Breakpoint exception for all of these instructions
0x1020 /

All executed in execution conditions that the breakpoint matches

Figure G2-5 Operation of an Address Mismatch breakpoint

This means that an Address Mismatch breakpoint can be used to single-step an instruction. 
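The single-step effect can be illustrated with a simple model of the example in Figure G2-5. This is a sketch under several assumptions: every instruction executes in execution conditions that the breakpoint matches, the addresses are word-aligned A32 instruction addresses, and exceptions and branch-to-self cases are not modeled. The function name `first_exception` is hypothetical.

```python
def first_exception(trace, bp_addr):
    """Return the first address in `trace` that generates a Breakpoint exception.

    `trace` is the sequence of instruction addresses committed for execution.
    An Address Mismatch breakpoint fires on every address except bp_addr.
    Returns None if no instruction in the trace fires the breakpoint.
    """
    return next((addr for addr in trace if addr != bp_addr), None)
```

For a trace that enters at 0x1014 and continues to 0x1018, the first Breakpoint exception is generated at 0x1018, so exactly one instruction has been stepped.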

In the example shown in Figure G2-5: 

° If the target of a branch is an instruction other than the instruction at address 0x1014, the breakpoint generates 
a Breakpoint exception when the instruction is committed for execution. 

° If the target of a branch is the instruction at address 0x1014, the PE executes the instruction at 0x1014 and the
breakpoint does not generate a Breakpoint exception until the instruction at address 0x1018 is committed for 
execution. The instruction at address 0x1014 is therefore single-stepped. 

However, if the instruction at 0x1014 generates a synchronous exception, or if the PE takes an asynchronous
exception while the instruction is being stepped, the breakpoint is evaluated again after taking the exception.

This means that behavior is as follows:

— If the exception handler executes in execution conditions that the breakpoint matches, the breakpoint
generates a Breakpoint exception for the exception vector, because the exception vector is not address 
0x1014. This means that software execution steps into the exception. 

— If the exception handler executes in execution conditions that the breakpoint does not match, the 
breakpoint does not generate any Breakpoint exceptions after the PE has taken the exception, until the 
exception handler completes and executes an exception return instruction. The effect is to step over 
the exception. Whether the instruction is stepped again depends on whether the target of the exception 
return instruction is the instruction at 0x1014 or the instruction at 0x1018. 


If the instruction at 0x1014 is single-stepped and branches to itself, it is CONSTRAINED UNPREDICTABLE 
whether the breakpoint generates a Breakpoint exception after the PE has executed the branch. 


This means that an instruction is only single-stepped if it is the target of a branch instruction and its address matches 
the address the breakpoint is programmed for. In the example shown in Figure G2-5 on page G2-3953, this is 0x1014. 


Usually this branch instruction is an exception return instruction that changes PE mode, branching from a PE mode 
in which the breakpoint does not generate a Breakpoint exception. A branch instruction that does not change PE 
mode would itself generate a Breakpoint exception. However, it might be a branch-to-self instruction as described 
above. 


Because Address Mismatch breakpoints can single-step instructions, the behavior of an address mismatch 
Breakpoint exception is similar to the behavior of an AArch64 Software Step exception. 





Note 
° The example shown in Figure G2-5 on page G2-3953 assumes an A32 instruction. The same behavior applies
for both 32-bit and 16-bit T32 instructions. 


° Software Step exceptions are the highest priority exception. Breakpoint exceptions are lower priority. See 
Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548. 





ITD control effects on address breakpoints on the first instruction in an IT block 


In an implementation that supports the ITD control, if the value of the ITD field that applies to the current Exception 
level is 1, all of the following are true: 


° An IT instruction can only be used to apply to one 16-bit T32 instruction. 
° Only certain combinations of an IT instruction and second single 16-bit T32 instruction are permitted.
° For a permitted combination, it is IMPLEMENTATION DEFINED whether the implementation treats the
combination as:

— A pair of 16-bit instructions.
— One 32-bit instruction.


If the implementation treats the combination as one 32-bit instruction, then as described in Other usage constraints 
for Address breakpoints on page G2-3957, an Address breakpoint might not generate a Breakpoint exception for an 
address match only on the second halfword of the instruction. 


For this reason, if the ITD bit associated with the current Exception level is 1, ARM recommends that a debugger 
that wants to program a breakpoint to match on the second T32 instruction programs it to match on the IT instruction 
instead. 


However, if returning from an exception whose preferred return address is the address of the second T32 instruction, 
then because the debugger is aware that the implementation has treated the combination as a pair of 16-bit 
instructions, the debugger is permitted to program the breakpoint to match on the second T32 instruction. 


The ITD control fields are: 
HSCTLR.ITD Applies to execution at EL2 when EL2 is using AArch32. 
SCTLR.ITD Applies to execution at EL0 or EL1 when EL1 is using AArch32.
SCTLR_EL1.ITD Applies to execution at EL0 using AArch32 when EL1 is using AArch64.


An implementation that does not support the ITD control behaves as if the value of the ITD field is 0, and therefore 
the information in this section does not apply to such an implementation. 
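The choice of ITD control field can be written as a small selection function. This is an illustrative sketch: the function name and its parameters are hypothetical, and it covers only AArch32 execution at EL0, EL1 or EL2 as listed above.

```python
def applicable_itd_field(el, el1_uses_aarch32):
    """Return the name of the ITD field that applies to AArch32 execution at `el`.

    Returns None for cases the list above does not cover, such as EL1
    when EL1 is using AArch64.
    """
    if el == 2:
        return "HSCTLR.ITD"        # EL2 using AArch32
    if el in (0, 1) and el1_uses_aarch32:
        return "SCTLR.ITD"         # EL0 or EL1 when EL1 is using AArch32
    if el == 0:
        return "SCTLR_EL1.ITD"     # EL0 using AArch32 when EL1 is using AArch64
    return None
```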


Note 


Programming the breakpoint to match on the second T32 instruction might be necessary when using an Address 
Mismatch breakpoint for single stepping. 
Usage constraints


See the following: 

° Reserved DBGBCR<n>.BT values. 

° Reserved DBGBCR<n>.{SSC, HMC, PMC} values on page G2-3956. 
° Reserved DBGBCR<n>.BAS values on page G2-3956. 

° Reserved DBGBCR<n>.LBN values on page G2-3957.

° Other usage constraints for Address breakpoints on page G2-3957.

° Other usage constraints for Context breakpoints on page G2-3958. 


Reserved DBGBCR<n>.BT values 
Table G2-13 shows when particular DBGBCR<n>.BT values are reserved. 


Table G2-13 Reserved BT values 


























BT value  Breakpoint type            Reserved
0b001x    Context ID Match           For non context-aware breakpoints.
0b010x    Address Mismatch           If EDSCR.HDE is 1 and halting is allowed.
0b011x    -                          Always.
0b100x    VMID Match                 For non context-aware breakpoints, or if EL2 is not implemented.
0b101x    Context ID and VMID Match  For non context-aware breakpoints, or if EL2 is not implemented.
0b11xx    -                          Always.





If a breakpoint is programmed with one of these reserved BT values: 


° The breakpoint must behave as if it is either: 
— Disabled. 
— Programmed with a BT value that is not reserved, other than for a direct or external read of 
DBGBCR<n>. 
° For a direct or external read of DBGBCR<n>, if the reserved BT value: 


— Has no function for any execution conditions, the value read back is UNKNOWN. 


— Has a function for execution conditions other than the current execution conditions, the value read
back is the value written. This permits software to save and restore the BT value so that the breakpoint 
functions for the other execution conditions. 


The behavior of breakpoints with reserved BT values might change in future revisions of the architecture. For this 
reason, software must not rely on the behavior described here. 
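The conditions in Table G2-13 can be collected into a single check. This is a minimal sketch: the function name and its boolean parameters are hypothetical, and the caller is assumed to know whether the breakpoint is context-aware, whether EL2 is implemented, and whether EDSCR.HDE is 1 with halting allowed.

```python
def bt_is_reserved(bt, context_aware, el2_implemented, hde_set_and_halting_allowed):
    """Return True if the 4-bit BT value is reserved under the given conditions."""
    kind = (bt >> 1) & 0b111          # BT[3:1]
    if kind == 0b001:                 # 0b001x, Context ID Match
        return not context_aware
    if kind == 0b010:                 # 0b010x, Address Mismatch
        return hde_set_and_halting_allowed
    if kind in (0b100, 0b101):        # 0b100x and 0b101x, VMID-based types
        return not context_aware or not el2_implemented
    if kind == 0b011 or (bt >> 2) & 0b11 == 0b11:
        return True                   # 0b011x and 0b11xx are always reserved
    return False                      # 0b000x, Address Match
```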
Reserved DBGBCR<n>.{SSC, HMC, PMC} values


Table G2-14 shows when particular combinations of DBGBCR<n>.{SSC, HMC, PMC} are reserved in stage 1 of 
an AArch32 translation regime. 


Table G2-14 Reserved HMC, SSC, and PMC combinations

HMC, SSC, and PMC combination                               Reserved
All combinations with SSC set to 0b01 or 0b10.              When EL3 is not implemented and EL2 is implemented.
Any combination where HMC or SSC is nonzero, except for     When both of EL2 and EL3 are not implemented.
the combination with HMC set to 1, SSC set to 0b11, and
PMC set to 0b00.
The combination with HMC set to 1, SSC set to 0b11, and     When EL2 is not implemented.
PMC set to 0b00.
Combinations not included in Table G2-11 on page G2-3946.   Always.





For all breakpoints except Linked Context breakpoints, if a breakpoint is programmed with one of these reserved 
combinations: 
• If the reserved combination has a function for other execution conditions:
  — The breakpoint must behave as if it is disabled.
  — A direct or external read of DBGBCR<n>.{SSC, HMC, PMC} returns the values written. This means
    that software can save and restore the combination so that the breakpoint can function for the other
    execution conditions.
• If the reserved combination does not have a function for other execution conditions:
  — It must behave either as if it is programmed with a combination that is not reserved or as if it is
    disabled.
  — A direct or external read of DBGBCR<n>.{SSC, HMC, PMC} returns UNKNOWN values.

If the breakpoint is a Linked Context breakpoint, then:
• The values of HMC, SSC, and PMC are ignored.
• A direct or external read of DBGBCR<n>.{SSC, HMC, PMC} returns UNKNOWN values.


The behavior of breakpoints with reserved combinations of HMC, SSC, and PMC might change in future revisions 
of the architecture. For this reason, software must not rely on the behavior described here. 
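
The implementation-dependent rows of Table G2-14 can be sketched as a predicate. This Python example is a hedged illustration: the function name and parameters are hypothetical, and it deliberately does not model the final row (combinations not included in Table G2-11), because that table is outside this section.

```python
def combination_reserved_by_implementation(hmc, ssc, pmc, el2_implemented, el3_implemented):
    """Apply the first three rows of Table G2-14 to one {HMC, SSC, PMC} combination.

    hmc is 0 or 1; ssc and pmc are 2-bit values. Names are assumptions for this sketch.
    """
    special = (hmc, ssc, pmc) == (1, 0b11, 0b00)
    # Row 1: SSC of 0b01 or 0b10 needs EL3 when EL2 is implemented.
    if el2_implemented and not el3_implemented and ssc in (0b01, 0b10):
        return True
    # Row 2: with neither EL2 nor EL3, any nonzero HMC or SSC is reserved,
    # except the special combination, which row 3 handles.
    if not el2_implemented and not el3_implemented and (hmc or ssc) and not special:
        return True
    # Row 3: the special combination needs EL2.
    if special and not el2_implemented:
        return True
    return False


print(combination_reserved_by_implementation(1, 0b11, 0b00,
                                             el2_implemented=False,
                                             el3_implemented=True))
```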


Reserved DBGBCR<n>.BAS values 


For all Context breakpoints 
DBGBCR<n>.BAS is RES1 and is ignored. 


For all Address breakpoints 


The supported values of the BAS field for the Address Match and Address Mismatch breakpoints 
are shown in Specifying the halfword-aligned address that an Address breakpoint matches on on 
page G2-3948. 
If a breakpoint is programmed with a reserved BAS value: 

• The breakpoint must behave as if it is either:
  — Disabled.
  — Programmed with a BAS value that is not reserved, other than for a direct or external read of
    DBGBCR<n>.
• A direct or external read of DBGBCR<n>.BAS returns an UNKNOWN value.


Software must not rely on these properties as the behavior of reserved values might change in a future revision of 
the architecture. 


Reserved DBGBCR<n>.LBN values 


For all Context breakpoints 


DBGBCR<n>.LBN reads UNKNOWN and its value is ignored. 


For Linked Address breakpoints 


A Linked Address breakpoint must link to a context-aware breakpoint. For a Linked Address 
breakpoint, any DBGBCR<n>.LBN value that is not for a context-aware breakpoint is reserved. 


If a Linked Address breakpoint links to a breakpoint that is not implemented, or that is not 
context-aware, then reads of DBGBCR<n>.LBN return an UNKNOWN value and the behavior is 
CONSTRAINED UNPREDICTABLE. The Linked Address breakpoint behaves as if it is either: 


• Disabled.
• Linked to an UNKNOWN context-aware breakpoint.

If a Linked Address breakpoint links to a breakpoint that is implemented and that is
context-aware, but that is either not enabled or not programmed as a Linked Context breakpoint, it
behaves as if it is disabled.


For Unlinked Address breakpoints 
DBGBCR<n>.LBN reads UNKNOWN and its value is ignored. 


Other usage constraints for Address breakpoints 


For all Address breakpoints 
• DBGBVR<n>[1:0] are RES0 and are ignored.
• The DBGBXVR<n> is ignored.


For Address Match breakpoints 


• For 32-bit instructions, if a breakpoint matches on the address of the second halfword but not
  the address of the first halfword, it is CONSTRAINED UNPREDICTABLE whether the breakpoint
  generates a Breakpoint exception.
• If DBGBCR<n>.BAS is 0b1111, it is CONSTRAINED UNPREDICTABLE whether the breakpoint
  generates a Breakpoint exception for a T32 instruction starting at address
  ((DBGBVR<n>[31:2]:00) + 2). For T32 instructions, ARM recommends that the debugger
  programs the BAS field with either 0b0011 or 0b1100.


For Address Mismatch breakpoints 


The constraints are the same as those described in For Address Match breakpoints, except that if 
two Address Mismatch breakpoints are programmed to match in the same Exception level and 
Security state, it is CONSTRAINED UNPREDICTABLE whether or not the instruction is stepped or a 
Breakpoint debug event is generated. 
Other usage constraints for Context breakpoints 


For all Context breakpoints 


Any bits of DBGBVR<n> and DBGBXVR<n> that are not used to specify Context ID or VMID 
are RES0 and are ignored. 

Note 

This means that for Context ID Match breakpoints, the DBGBXVR<n> is RES0 and is ignored, and 
for VMID Match breakpoints, the DBGBVR<n> is RES0 and is ignored. 





For Linked Context breakpoints 


If no Linked Address breakpoints or Linked Watchpoints link to a Linked Context breakpoint, the 
Linked Context breakpoint does not generate any Breakpoint exceptions. 


G2.9.7 Exception syndrome information and preferred return address 


See the following: 
• Exception syndrome information.
• Preferred return address on page G2-3959.


Exception syndrome information 
The PE takes a Breakpoint exception as either: 
• A Prefetch Abort exception if it is taken to PL1. In this case, it is taken to Abort mode.
• A Hyp trap exception, if it is taken to PL2 because HCR.TGE or HDCR.TDE is 1. In this case, it is taken to
  Hyp mode.


If the exception is taken to: 





Abort mode 
The PE sets all of the following: 
• DBGDSCRext.MOE to 0b0001, to indicate a Breakpoint exception.
• IFSR.FS to the code for a debug exception, 0b00010.
• The IFAR with an UNKNOWN value.
Hyp mode 
The PE does all of the following: 
• Records information about the exception in the Hypervisor Syndrome Register, HSR. See
  Table G2-15.
• Sets DBGDSCRext.MOE to 0b0001, to indicate a Breakpoint exception.
• Sets the HIFAR to an UNKNOWN value.

Table G2-15 Information recorded in the HSR

HSR field                            Information recorded
Exception Class, EC                  The PE sets this to the code for a Prefetch Abort exception routed to Hyp mode, 0x20.
Instruction Length, IL               The PE sets this to 1.
Instruction Specific Syndrome, ISS   ISS[24:10]  RES0.
                                     ISS[9]      External Abort type (EA). The PE sets this to 0.
                                     ISS[8:6]    RES0.
                                     ISS[5:0]    Instruction Fault Status Code (IFSC). The PE sets this to the code
                                                 for a debug exception, 0b100010.





Note 


For information about how debug exceptions can be routed to PL2, see Routing debug exceptions 
on page G2-3927. 
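
As an illustration of the HSR contents for a Breakpoint exception taken to Hyp mode, the following Python sketch packs the field values listed for the HSR into a 32-bit word and unpacks them again. It assumes the standard HSR layout (EC in bits [31:26], IL in bit [25], ISS in bits [24:0]), which this section does not restate; the function and constant names are hypothetical.

```python
EC_PREFETCH_ABORT_TO_HYP = 0x20   # Exception Class value from the HSR table
IFSC_DEBUG_EXCEPTION = 0b100010   # ISS[5:0] code for a debug exception


def breakpoint_hsr():
    """Build the HSR value described for a Breakpoint exception taken to Hyp mode."""
    ea = 0                                   # ISS[9], set to 0 by the PE
    iss = (ea << 9) | IFSC_DEBUG_EXCEPTION   # ISS[24:10] and ISS[8:6] are RES0
    il = 1
    return (EC_PREFETCH_ABORT_TO_HYP << 26) | (il << 25) | iss


def decode_hsr(hsr):
    """Split an HSR value into the fields a hypervisor handler would inspect."""
    return {
        "EC": (hsr >> 26) & 0x3F,
        "IL": (hsr >> 25) & 0x1,
        "EA": (hsr >> 9) & 0x1,
        "IFSC": hsr & 0x3F,
    }


print(hex(breakpoint_hsr()))
print(decode_hsr(breakpoint_hsr()))
```

A handler that sees this EC value still has to check the IFSC field, and then DBGDSCRext.MOE, to tell a Breakpoint exception apart from other Prefetch Aborts.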





Preferred return address 


The preferred return address of a Breakpoint exception is the address of the instruction that was not executed 
because the PE took the Breakpoint exception instead. 


This means that the preferred return address is the address of the instruction that caused the exception. 


G2.9.8 Pseudocode description of Breakpoint exceptions taken from AArch32 state 


AArch32.BreakpointValueMatch() returns a pair of results: 
• A result for Address Match and Context breakpoints.
• A result for Address Mismatch breakpoints.


AArch32.StateMatch() tests the values in DBGBCR<n>.{SSC, HMC, PMC} and, if the breakpoint links to a Linked 
Context breakpoint, also tests the Linked Context breakpoint. 


AArch32.BreakpointMatch() tests a committed instruction against all breakpoints. 
AArch32.CheckBreakpoint() generates a FaultRecord. A Breakpoint exception is taken if all of the following are true: 
• DBGDSCRext.MDBGen is 1.
• Debug exceptions are enabled from the current Exception level and Security state. See Enabling debug
  exceptions from the current Privilege level and Security state on page G2-3929.
• All of the conditions required for Breakpoint exception generation are met. See About Breakpoint exceptions
  on page G2-3938.


Note 
AArch32.CheckBreakpoint() might halt the PE and cause it to enter Debug state. External debug uses Debug state. 
The AArch32.Abort() function processes the FaultRecord object returned by AArch32.CheckBreakpoint(), as 
described in Abort exceptions on page G3-4019. When a Breakpoint exception is taken to AArch32 state, the 
AArch32.Abort() function generates a Prefetch Abort exception. 
Watchpoint exceptions 


This section describes Watchpoint exceptions in stage 1 of an AArch32 translation regime. 


The PE is using an AArch32 translation regime when it is executing either: 
• At EL1 or higher in an Exception level that is using AArch32.
• At EL0 using AArch32 when EL1 is using AArch32.


This section contains the following subsections: 


• About Watchpoint exceptions.
• Watchpoint types and linking of watchpoints on page G2-3963.
• Execution conditions for which a watchpoint generates Watchpoint exceptions on page G2-3964.
• Watchpoint data address comparisons on page G2-3965.
• Determining the memory location that caused a Watchpoint exception on page G2-3969.
• Watchpoint behavior on other instructions on page G2-3970.
• Usage constraints on page G2-3970.
• Exception syndrome information and preferred return address on page G2-3973.
• Pseudocode description of Watchpoint exceptions taken from AArch32 state on page G2-3974.

G2.10.1 About Watchpoint exceptions 

A watchpoint is an event that results from the execution of an instruction, based on a data address. Watchpoints are 
also known as data breakpoints. 

A watchpoint operates as follows: 

1. A debugger programs the watchpoint with a data address, or a data address range. 

2. The watchpoint generates a Watchpoint debug event on an access to the address, or any address in the address 
range. 

A watchpoint never generates a Watchpoint debug event on an instruction fetch. 

An implementation can include between 2 and 16 watchpoints. In an implementation, DBGDIDR.WRPs shows how 
many are implemented. 

To use an implemented watchpoint, a debugger programs the following registers for the watchpoint: 

• The Watchpoint Control Register, DBGWCR<n>. This holds control information for the watchpoint, for
  example an enable control.
• The Watchpoint Value Register, DBGWVR<n>. This holds the data virtual address used for watchpoint
  matching.

The registers are numbered, so that:

• DBGWCR1 and DBGWVR1 are for watchpoint number one.
• DBGWCR2 and DBGWVR2 are for watchpoint number two.
• DBGWCRn and DBGWVRn are for watchpoint number n.

A watchpoint can: 

• Be programmed to generate Watchpoint debug events on read accesses only, on write accesses only, or on
  both types of access.
• Link to a Linked Context breakpoint, so that a Watchpoint debug event is only generated if the PE is in a
  particular context when the address match occurs.

A single watchpoint can be programmed to match on one or more address bytes. A watchpoint generates a 
Watchpoint debug event on an access to any byte that it is watching. The number of bytes a watchpoint is watching 
is either: 


• One to eight bytes, provided that these bytes are contiguous and that they are all in the same naturally-aligned
  doubleword. A debugger uses the Byte Address Select field, DBGWCR<n>.BAS, to select the bytes. See
  Programming a watchpoint with eight bytes or fewer on page G2-3966.
• Eight bytes to 2GB, provided that both of the following are true:
  — The number of bytes is a power-of-two.
  — The range starts at an address that is aligned to the range size.
  A debugger uses the MASK field, DBGWCR<n>.MASK, to program a watchpoint with eight bytes to 2GB.
  See Programming a watchpoint with eight or more bytes on page G2-3968.


A debugger must use either the BAS field or the MASK field. If it uses both, whether the watchpoint generates 
Watchpoint exceptions is CONSTRAINED UNPREDICTABLE. See Programming dependencies of the BAS and MASK 
fields on page G2-3971. 


For each memory access, all of the watchpoints are tested. When a watchpoint is tested, it generates a Watchpoint 
debug event if all of the following are true: 


• The watchpoint is enabled. That is, the watchpoint enable control for it, DBGWCR<n>.E, is 1.
• The conditions specified in the DBGWCR<n> are met.
• The comparison with the address held in the DBGWVR<n> is successful.
• If the watchpoint links to a Linked Context breakpoint, the comparison or comparisons made by the Linked
  Context breakpoint are successful. Figure G2-2 on page G2-3942 shows this. See also Breakpoint context
  comparisons on page G2-3952.
• The instruction that initiates the memory access is committed for execution.
• The instruction that initiates the memory access passes its condition code check.

If halting is allowed and EDSCR.HDE is 1, Watchpoint debug events cause entry to Debug state. 


Otherwise, if debug exceptions are: 


• Enabled, Watchpoint debug events generate Watchpoint exceptions.
• Disabled, Watchpoint debug events are ignored.
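
The outcome of a Watchpoint debug event, as described above, can be summarized in a short Python sketch. The function name, parameter names, and returned strings are hypothetical labels chosen for illustration.

```python
def watchpoint_debug_event_outcome(halting_allowed, edscr_hde, debug_exceptions_enabled):
    """Return what happens to a Watchpoint debug event under the given controls."""
    if halting_allowed and edscr_hde:
        return "enter Debug state"       # halting debug takes priority
    if debug_exceptions_enabled:
        return "Watchpoint exception"    # self-hosted debug
    return "ignored"


print(watchpoint_debug_event_outcome(False, True, True))
```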
Note 





The remainder of this Watchpoint Exceptions section, including all subsections, describes watchpoints as generating 
Watchpoint exceptions. 


However, the behavior described also applies if watchpoints are causing entry to Debug state. 





The debug exception enable controls on page G2-3926 describes the enable controls for Watchpoint debug events. 
G2.10.2 Watchpoint types and linking of watchpoints 


When a debugger programs a watchpoint, it must program that watchpoint so that it is either: 

• Used in isolation. In this case, the watchpoint is called an Unlinked watchpoint.
• Enabled for linking to a Linked Context breakpoint. In this case, the watchpoint is called a Linked watchpoint.

When a Linked watchpoint links to a Linked Context breakpoint, the Linked watchpoint only generates a 
Watchpoint exception if the PE is in a particular context when the data address match occurs. For example, a 
debugger might: 


1. Program watchpoint number one with a data address. 
2. Program breakpoint number five to be a Linked VMID Match breakpoint. 


3. Link the watchpoint and the breakpoint together. A Watchpoint exception is only generated if both the data 
address matches and the VMID matches. 


The Watchpoint Type field for a watchpoint, DBGWCR<n>.WT, controls whether the watchpoint is enabled for 
linking. If DBGWCR<n>.WT is 1, the watchpoint is enabled for linking. 


Rules for linking watchpoints 
The rules for watchpoint linking are as follows: 
• Only Linked watchpoints can be linked.
• A Linked watchpoint can link to any type of Linked Context breakpoint. The Linked Breakpoint Number
  field, DBGWCR<n>.LBN, for the Linked watchpoint specifies the particular Linked Context breakpoint that
  the Linked watchpoint links to, and:
  — DBGWCR<n>.{SSC, HMC, PAC} for the Linked watchpoint define the execution conditions that
    the watchpoint generates Watchpoint exceptions for. See Execution conditions for which a watchpoint
    generates Watchpoint exceptions on page G2-3964.
  — DBGBCR<n>.{SSC, HMC, PMC} for the Linked Context breakpoint are ignored.
• A Linked watchpoint cannot link to another watchpoint. The LBN field can therefore only specify a
  breakpoint.
• If a Linked watchpoint links to a breakpoint that is not context-aware, the behavior of the Linked watchpoint
  is CONSTRAINED UNPREDICTABLE. See Usage constraints on page G2-3970.
• If a Linked watchpoint links to an Unlinked Context breakpoint, the Linked watchpoint never generates any
  Watchpoint exceptions.
• Multiple Linked watchpoints can link to a single Linked Context breakpoint.


Note 


Multiple Address breakpoints can also link to a single Linked Context breakpoint. Breakpoint exceptions on 
page G2-3938 describes breakpoints. 








Figure G2-2 on page G2-3942 shows an example of permitted watchpoint linking. 
G2.10.3 Execution conditions for which a watchpoint generates Watchpoint exceptions 


Each watchpoint can be programmed so that it only generates Watchpoint exceptions for certain execution 
conditions. For example, a watchpoint might be programmed to generate Watchpoint exceptions only when the PE 
is executing at EL2. 


DBGWCR<n>.{SSC, HMC, PAC} define the execution conditions a watchpoint generates Watchpoint exceptions 
for, as follows: 
Security State Control, SSC 
Controls whether the watchpoint generates Watchpoint exceptions only in Secure state, only in 
Non-secure state, or in both Security states. 
Note 


This is determined by the Security state of the PE, not from the NS attribute returned by the 
translation of the virtual address on which the watchpoint is set. 





Higher Mode Control, HMC, and Privileged Access Control, PAC 


HMC and PAC together control which Privilege level the watchpoint generates Watchpoint 
exceptions in. 


The PAC control relates to the privilege of the memory access, not to the Exception level or 
Privilege level at which the access was made. 


Note 


This means that, if the PE executes a Load unprivileged or Store unprivileged instruction at PL1, 
the resulting data access triggers a watchpoint only if both: 


• PAC is programmed to a value that generates watchpoints on PL0 accesses.
• All other conditions for generating the watchpoint are met.


Example A32/T32 Load unprivileged and Store unprivileged instructions are LDRT and STRT. 





Table G2-16 on page G2-3965 shows the valid combinations of HMC, SSC, and PAC, and for each combination 
shows which Privilege levels watchpoints generate Watchpoint exceptions in. 


In the table: 


Y or - Means that a watchpoint programmed with the values of HMC, SSC, and PAC shown in that row: 
Y Can generate Watchpoint exceptions at that Privilege level. 


- Cannot generate Watchpoint exceptions at that Privilege level. 


Res Means that the combination of HMC, SSC, and PAC is reserved. See Reserved 
DBGWCR<n>.{SSC, HMC, PAC} values on page G2-3971. 
Table G2-16 Summary of watchpoint HMC, SSC, and PAC encodings

                 Security state the watchpoint                         Implementation
HMC  SSC  PAC    is programmed to match in      PL2 (a)  PL1  PL0      No EL3   No EL2 and no EL3
0    00   01     Both                           -        Y    -        -        -
0    00   10     Both                           -        -    Y        -        -
0    00   11     Both                           -        Y    Y        -        -
0    01   01     Non-secure                     -        Y    -        Res      Res
0    01   10     Non-secure                     -        -    Y        Res      Res
0    01   11     Non-secure                     -        Y    Y        Res      Res
0    10   01     Secure                         -        Y    -        Res      Res
0    10   10     Secure                         -        -    Y        Res      Res
0    10   11     Secure                         -        Y    Y        Res      Res
1    00   01     Both                           Y        Y    -        -        Res
1    00   11     Both                           Y        Y    Y        -        Res
1    01   01     Non-secure                     Y        Y    -        Res      Res
1    01   11     Non-secure                     Y        Y    Y        Res      Res
1    10   01     Secure                         -        Y    -        Res      Res
1    10   11     Secure                         -        Y    Y        Res      Res
1    11   00     Non-secure                     Y        -    -        -        Res if no EL2 (b)











a. Debug exceptions are not generated at PL2 using AArch32. This means that these combinations of HMC, SSC, and PAC are only 
relevant if watchpoints cause entry to Debug state. Self-hosted debuggers must avoid combinations of HMC, SSC, and PAC that 
generate Watchpoint exceptions at PL2 using AArch32. 


b. This encoding is only reserved when EL2 is not implemented, regardless of whether EL3 is implemented. 


All combinations of HMC, SSC, and PAC that this table does not show are reserved. See Reserved 
DBGWCR<n>.{SSC, HMC, PAC} values on page G2-3971. 


G2.10.4 Watchpoint data address comparisons 


An address comparison is successful if bits [31:2] of the current data virtual address are equal to 
DBGWVR<n>[31:2], taking into account all of the following: 


• The size of the access. See Size of the data access on page G2-3966.
• The bytes selected by DBGWCR<n>.BAS. See Programming a watchpoint with eight bytes or fewer on
  page G2-3966.
• Any address ranges indicated by DBGWCR<n>.MASK. See Programming a watchpoint with eight or more
  bytes on page G2-3968.


Note 





DBGWVR<n>[1:0] are RES0 and are ignored. 
Size of the data access 

Because watchpoints can be programmed to generate Watchpoint exceptions on individual bytes, the size of each 
access must be taken into account. See Example G2-1. 


Example G2-1 


1. A debugger programs a watchpoint to generate Watchpoint exceptions only when the byte at address 0x1009 
is accessed. 


2. The PE accesses the unaligned doubleword starting at address 0x1003. 


In this scenario, the watchpoint must generate a Watchpoint exception. 
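
The reasoning in Example G2-1 is a byte-overlap test. The following Python sketch, with a hypothetical function name, checks whether any watched byte falls inside the span of an access, regardless of the alignment of the access.

```python
def access_touches_watched_byte(access_address, access_size, watched_bytes):
    """Return True if an access of access_size bytes covers any watched byte address."""
    first = access_address
    last = access_address + access_size - 1
    return any(first <= byte <= last for byte in watched_bytes)


# Example G2-1: watched byte 0x1009, unaligned doubleword access at 0x1003.
print(access_touches_watched_byte(0x1003, 8, {0x1009}))
```

The doubleword at 0x1003 spans 0x1003 to 0x100A, so it covers 0x1009 even though the access starts in a different aligned doubleword.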


The size of data accesses initiated by DCIMVAC instructions is an IMPLEMENTATION DEFINED size that is both: 


• From the inclusive range between:
  — The size that CTR.DminLine defines.
  — 2KB.
• A power-of-two.


The lowest address accessed by a DCIMVAC instruction is the address supplied to the instruction, rounded down to the 
nearest multiple of the access size initiated by that instruction. 


The highest address accessed is (size - 1) bytes above the lowest address accessed. 
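
The address range touched by a DCIMVAC instruction follows from rounding the supplied address down to a multiple of the access size. A minimal Python sketch, assuming the IMPLEMENTATION DEFINED access size is already known to the caller (the function name is hypothetical):

```python
def dcimvac_access_range(address, size):
    """Return (lowest, highest) byte addresses accessed, for a power-of-two size."""
    if size <= 0 or size & (size - 1):
        raise ValueError("access size must be a power of two")
    lowest = address & ~(size - 1)       # round down to a multiple of size
    highest = lowest + size - 1          # (size - 1) bytes above the lowest
    return lowest, highest


low, high = dcimvac_access_range(0x1234, 64)
print(hex(low), hex(high))
```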


See also, Watchpoint behavior on accesses by DCIMVAC instructions on page G2-3970. 


Programming a watchpoint with eight bytes or fewer 


The Byte Address Select field, DBGWCR<n>.BAS, selects which bytes in the doubleword starting at the address 
contained in the DBGWVR<n> the watchpoint generates Watchpoint exceptions for. 


If the address programmed into the DBGWVR<n> is: 


• Doubleword-aligned:
  — All eight bits of DBGWCR<n>.BAS are used, and the descriptions given in Table G2-17 apply.
• Word-aligned but not doubleword-aligned:
  — Only DBGWCR<n>.BAS[3:0] are used, and the descriptions given in Table G2-18 on page G2-3967
    apply. In this case, DBGWCR<n>.BAS[7:4] are RES0.


Table G2-17 Supported BAS values when the DBGWVRn address alignment is doubleword

BAS value      Description
0b00000000     Watchpoint never generates a Watchpoint exception
BAS[0] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:000 is accessed
BAS[1] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:001 is accessed
BAS[2] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:010 is accessed
BAS[3] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:011 is accessed
BAS[4] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:100 is accessed
BAS[5] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:101 is accessed
BAS[6] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:110 is accessed
BAS[7] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:3]:111 is accessed





Table G2-18 Supported BAS values when the DBGWVRn address alignment is word

BAS value (a)  Description
0b00000000     Watchpoint never generates a Watchpoint exception
BAS[0] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:2]:00 is accessed
BAS[1] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:2]:01 is accessed
BAS[2] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:2]:10 is accessed
BAS[3] == 1    Generates a Watchpoint exception if byte at address DBGWVR<n>[31:2]:11 is accessed

a. DBGWCR<n>.BAS[7:4] are RES0.


If the BAS field is programmed with more than one byte, the bytes that it is programmed with must be contiguous. 
For watchpoint behavior when its BAS field is programmed with non-contiguous bytes, see Other usage constraints 
on page G2-3972. 


When programming the BAS field with anything other than 0b11111111, a debugger must also program 
DBGWCR<n>.MASK to be 0b00000. See Programming dependencies of the BAS and MASK fields on 
page G2-3971. 


A watchpoint generates a Watchpoint exception whenever a watched byte is accessed, even if: 
• The access size is smaller or larger than the address region being watched.
• The access is misaligned, and the base address of the access is not in the doubleword or word of memory
  addressed by the DBGWVR<n>[31:3]. See Example G2-1 on page G2-3966.


The following are some example configurations of the BAS field: 


• To program a watchpoint to generate a Watchpoint exception on the byte at address 0x1003, program:
  — DBGWVR<n> with 0x1000.
  — DBGWCR<n>.BAS to be 0b00001000.
• To program a watchpoint to generate a Watchpoint exception on the bytes at addresses 0x2003, 0x2004 and
  0x2005, program:
  — DBGWVR<n> with 0x2000.
  — DBGWCR<n>.BAS to be 0b00111000.
• If the address programmed into the DBGWVR<n> is doubleword-aligned:
  — To generate a Watchpoint exception when any byte in the word starting at the doubleword-aligned
    address is accessed, program DBGWCR<n>.BAS to be 0b00001111.
  — To generate a Watchpoint exception when any byte in the word starting at address
    DBGWVR<n>[31:3]:100 is accessed, program DBGWCR<n>.BAS to be 0b11110000.

Note 
ARM deprecates programming a DBGWVR<n> with an address that is not doubleword-aligned. 
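
The example BAS configurations can be derived mechanically. This Python sketch computes a doubleword-aligned DBGWVR<n> value and the matching BAS field for a set of byte addresses; the function name is hypothetical, and the sketch rejects byte sets that are non-contiguous or that span more than one doubleword.

```python
def bas_programming(byte_addresses):
    """Return (dbgwvr_value, bas) for contiguous bytes within one aligned doubleword."""
    addrs = sorted(set(byte_addresses))
    if not addrs:
        raise ValueError("at least one byte address is required")
    base = addrs[0] & ~0x7                       # doubleword-aligned base address
    if any((a & ~0x7) != base for a in addrs):
        raise ValueError("bytes must lie in the same naturally-aligned doubleword")
    if addrs != list(range(addrs[0], addrs[-1] + 1)):
        raise ValueError("bytes must be contiguous")
    bas = 0
    for a in addrs:
        bas |= 1 << (a - base)                   # BAS[i] selects byte base + i
    return base, bas


base, bas = bas_programming([0x2003, 0x2004, 0x2005])
print(hex(base), format(bas, "#010b"))
```

Using a doubleword-aligned base throughout is a choice that follows the note above, because programming an address that is not doubleword-aligned is deprecated.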








Programming a watchpoint with eight or more bytes 


A debugger can use the MASK field, DBGWCR<n>.MASK, to program a single watchpoint with a data address 
range. The data address range must meet all of the following criteria: 


• It is a size that is all of the following:
  — A power-of-two.
  — A minimum of eight bytes.
  — A maximum of 2GB.
• It starts at an address that is aligned to the size.


The MASK field specifies the number of least significant data address bits that must be masked. Up to 31 least 
significant bits can be masked: 


MASK    0b00000    No bits are masked.
        0b00001    Reserved.
        0b00010    Reserved.
        0b00011    Three least significant bits are masked.
        0b00100    Four least significant bits are masked.
        0b00101    Five least significant bits are masked.
        ...
        0b11111    31 least significant bits are masked.


If n least significant address bits are masked, the watchpoint generates a Watchpoint exception on all of the 
following: 


• Address DBGWVR<n>[31:n]:000...
• Address DBGWVR<n>[31:n]:111...
• Any address between these two addresses.


For example, if the four least significant address bits are masked, Watchpoint exceptions are generated for all 
addresses between DBGWVR<n>[31:4]:0000 and DBGWVR<n>[31:4]:1111, including these addresses. 
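
The watched range for a given MASK value can be computed as in the following Python sketch. The function name is hypothetical, and the sketch covers only MASK values from 3 to 31, which are the values that mask address bits.

```python
def masked_watch_range(dbgwvr, mask):
    """Return (lowest, highest) watched address when `mask` low address bits are masked."""
    if not 3 <= mask <= 31:
        raise ValueError("this sketch handles MASK values 3 to 31 only")
    low_bits = (1 << mask) - 1
    lowest = (dbgwvr & 0xFFFFFFFF) & ~low_bits   # DBGWVR<n>[31:mask] followed by zeros
    highest = lowest | low_bits                  # DBGWVR<n>[31:mask] followed by ones
    return lowest, highest


low, high = masked_watch_range(0x80001230, 4)
print(hex(low), hex(high))
```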





Note 
• The most significant bit cannot be masked. This means that the full address cannot be masked.
• For watchpoint behavior when its MASK field is programmed with a reserved value, see Reserved
  DBGWCR<n>.MASK values on page G2-3972.





When masking address bits, a debugger must both: 


• Program DBGWCR<n>.BAS to be 0b11111111. See Programming dependencies of the BAS and MASK fields
  on page G2-3971.
• In the DBGWVR<n>, set the masked address bits to 0. For watchpoint behavior when any of the masked
  address bits are not 0, see Other usage constraints on page G2-3972.
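
Putting the range criteria and the programming rules together, a debugger could derive the register field values for a range watchpoint as sketched below in Python. The function name and the returned dictionary keys are hypothetical; the sketch only computes values and does not model register access.

```python
def range_watchpoint_fields(base, size):
    """Return DBGWVR<n>, MASK and BAS values for a power-of-two, size-aligned range."""
    if size < 8 or size > 2**31 or size & (size - 1):
        raise ValueError("size must be a power of two from 8 bytes to 2GB")
    if not 0 <= base <= 0xFFFFFFFF or base % size:
        raise ValueError("base must be a 32-bit address aligned to size")
    mask = size.bit_length() - 1     # number of least significant bits to mask
    return {
        "DBGWVR": base,              # masked address bits are already 0 (aligned)
        "MASK": mask,
        "BAS": 0b11111111,           # required whenever address bits are masked
    }


print(range_watchpoint_fields(0x4000, 4096))
```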
G2.10.5 Determining the memory location that caused a Watchpoint exception 


On a Watchpoint exception, the PE records an address in a Fault Address Register that the debugger can use to 
determine the memory location that triggered the watchpoint. 


The Fault Address Register (FAR) used is either: 
• DFAR, if the exception is taken to PL1.
• HDFAR, if the exception is taken to PL2.


In cases where one instruction triggers multiple watchpoints, only one address is recorded. 


On entering Debug state on a Watchpoint debug event, the PE records the address in the EDWAR. 


Note 


If Debug state was entered from AArch32 state, then EDWAR[63:32] is UNKNOWN and must be ignored by the 
debugger. 








For more information, see the subsections that follow. These are: 
• Address recorded for Watchpoint exceptions generated by instructions other than Data Cache instructions.
• Address recorded for Watchpoint exceptions generated by Data Cache instructions on page G2-3970.


Address recorded for Watchpoint exceptions generated by instructions other than Data 
Cache instructions 


The address recorded must be both: 


• From the inclusive range between:
  — The lowest address accessed by the memory access that triggered the watchpoint.
  — The highest watchpointed address accessed by the memory access. A watchpointed address is an
    address that the watchpoint is watching.
• Within a naturally-aligned block of memory that is all of the following:
  — A power-of-two size.
  — No larger than 2KB.
  — No larger than the block size used by the A64 DC ZVA instruction.

    Note
    There are no architectural means to discover the A64 DC ZVA instruction block size from AArch32
    state.

  — Contains a watchpointed address accessed by the memory access.


The size of the block is IMPLEMENTATION DEFINED. There is no architectural means of discovering the size. 


Example G2-2 Address recorded for a watchpoint programmed on 0x8019 


A debugger programs a watchpoint to generate a Watchpoint exception on any access to the byte 0x8019. 


An A32 load multiple instruction then loads nine registers starting from address 0x8004 upwards. This triggers the 
watchpoint. 


If the DC ZVA block size is: 





• 32 bytes, the address that the PE records must be between 0x8004 and 0x8019 inclusive.
• 16 bytes, the address that the PE records must be between 0x8010 and 0x8019 inclusive.
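
The bounds in Example G2-2 can be reproduced with a short calculation. The following Python sketch assumes a single watched byte, so that only one naturally-aligned block can contain a watchpointed address; the function name is hypothetical, and the block size is passed in because it is IMPLEMENTATION DEFINED.

```python
def recorded_address_bounds(lowest_accessed, watched_byte, block_size):
    """Return the inclusive (low, high) bounds for the recorded address.

    Assumes one watched byte that the access actually touches.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a power of two")
    block_base = watched_byte & ~(block_size - 1)   # aligned block holding the byte
    low = max(lowest_accessed, block_base)          # cannot be below the access start
    high = watched_byte                             # highest watchpointed address accessed
    return low, high


for size in (32, 16):
    low, high = recorded_address_bounds(0x8004, 0x8019, size)
    print(size, hex(low), hex(high))
```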
Address recorded for Watchpoint exceptions generated by Data Cache instructions 


The address recorded is the address passed to the instruction. This means that the address recorded might be higher 
than the address of the location that triggered the watchpoint. 











G2.10.6 Watchpoint behavior on other instructions 
Under normal operating conditions, the following do not generate Watchpoint exceptions: 
• Instruction cache maintenance instructions.
• Address translation instructions.
• TLB maintenance instructions.
• Preload instructions.
• All data cache maintenance instructions except DCIMVAC.

However, the debug architecture allows for IMPLEMENTATION DEFINED controls, such as those in ACTLR registers, 
to enable watchpoints on an IMPLEMENTATION DEFINED subset of these instructions. Whether a watchpoint treats 
the instruction as a load or a store, and the access size of instruction cache, address translation, and TLB operations 
are IMPLEMENTATION DEFINED. 
The access size of the IMPLEMENTATION DEFINED instruction cache, address translation, and TLB operations which 
generate Watchpoint exceptions are IMPLEMENTATION DEFINED. 
See also: 
• Watchpoint behavior on accesses by Store-Exclusive instructions.
• Watchpoint behavior on accesses by DCIMVAC instructions.

Watchpoint behavior on accesses by Store-Exclusive instructions 
If a watchpoint matches on a data access caused by a Store-Exclusive instruction, then: 
• If the store fails because an exclusive monitor does not permit it, it is IMPLEMENTATION DEFINED whether the
  watchpoint generates a Watchpoint exception.
• Otherwise, the watchpoint generates a Watchpoint exception.

Watchpoint behavior on accesses by DCIMVAC instructions 
It is IMPLEMENTATION DEFINED whether DCIMVAC operations can generate Watchpoint exceptions. If they can, they 
are treated as data stores. This means that for a watchpoint to match on an access caused by a DCIMVAC instruction, 
the debugger must program DBGWCR<n>.LSC to be one of the following: 
10 Match on data stores only. 
11 Match on data stores and data loads. 
Note 
For the size of data accesses performed by DCIMVAC instructions, see Watchpoint data address comparisons on 
page G2-3965. The size of all data accesses must be considered because watchpoints can be programmed to match 
on individual bytes. 
G2.10.7 Usage constraints 
See the following: 
° Reserved DBGWCR<n>.{SSC, HMC, PAC} values on page G2-3971. 
° Reserved DBGWCR<n>.LBN values on page G2-3971. 
. Programming dependencies of the BAS and MASK fields on page G2-3971. 
° Reserved DBGWCR<n>.BAS values on page G2-3972. 
° Reserved DBGWCR<n>.MASK values on page G2-3972. 
° Other usage constraints on page G2-3972. 
Reserved DBGWCR<n>.{SSC, HMC, PAC} values


Table G2-19 shows when particular combinations of DBGWCR<n>.{SSC, HMC, PAC} are reserved.


Table G2-19 Reserved SSC, HMC, and PAC combinations

HMC, SSC, and PAC combination                                             Reserved
All combinations with SSC set to 0b01 or 0b10.                            When EL3 is not implemented and EL2 is implemented.
All combinations where HMC or SSC is nonzero, except for the              When both of EL2 and EL3 are not implemented.
combination with HMC set to 1, SSC set to 0b11, and PAC set to 0b00.
The combination with HMC set to 1, SSC set to 0b11, and PAC set to 0b00.  When EL2 is not implemented.
Combinations not included in Table G2-16 on page G2-3965.                 Always.


If a watchpoint is programmed with one of these reserved combinations: 


° The watchpoint must behave as if it is either:
— Disabled.
— Programmed with a combination that is not reserved, other than for a direct or external read of
DBGWCR<n>.
° For a direct or external read of DBGWCR<n>, if the reserved combination:


— Has no function for any execution conditions, the value read back for each of SSC, HMC, and PAC
is UNKNOWN.


— Has a function for execution conditions other than the current execution conditions, the value read 
back is the value written. This permits software to save and restore the combination so that the 
watchpoint functions for the other execution conditions. 


The behavior of watchpoints with reserved combinations of SSC, HMC, and PAC might change in future revisions 
of the architecture. For this reason, software must not rely on the behavior described here. 


Reserved DBGWCR<n>.LBN values 


For Linked watchpoints 


A Linked watchpoint must link to a context-aware breakpoint. For a Linked watchpoint, any 
DBGWCR<n>.LBN value that is not for a context-aware breakpoint is reserved. 


If a Linked watchpoint links to a breakpoint that is not implemented, or that is not context-aware, 
then reads of DBGWCR<n>.LBN return an UNKNOWN value and the behavior is CONSTRAINED 
UNPREDICTABLE. The Linked watchpoint behaves as if it is either: 


° Disabled.
° Linked to an UNKNOWN context-aware breakpoint.


If a Linked watchpoint links to a breakpoint that is implemented and is context-aware, but that is 
either not enabled or not programmed as a Linked Context breakpoint, it behaves as if it is disabled. 


For Unlinked watchpoints 


For Unlinked watchpoints, DBGWCR<n>.LBN reads UNKNOWN and its value is ignored. 


Programming dependencies of the BAS and MASK fields 
When programming a watchpoint, a debugger must use either: 
° The MASK field, to program the watchpoint with an address range that can be eight bytes to 2GB. 


° The BAS field, to select which bytes in the doubleword or word starting at the address contained in the 
DBGWVR<n> the watchpoint must generate Watchpoint exceptions for. 
If the debugger uses the:
° MASK field, it must program BAS to be 0b11111111, so that all bytes in the doubleword or word are selected. 


° BAS field, it must program MASK to be 0b00000, so that the MASK field does not indicate any address 
ranges. 


If an enabled watchpoint has a MASK field that is non-zero and a BAS field that is not set to 0b11111111, then for 
each byte in the address range, it is CONSTRAINED UNPREDICTABLE whether or not a Watchpoint exception 
is generated. 


Reserved DBGWCR<n>.BAS values 


The BAS field must be programmed with a value Zeros(8-n-m):Ones(n):Zeros(m), where:


° n is a non-zero positive integer less-than-or-equal-to 8.
° m is a positive integer less-than 8.
° n+m is less-than-or-equal-to 8.


All other values are reserved. 


Note 


If x is zero, then Zeros(x) is an empty bitstring. 








If DBGWVR<n>[2] is 1, DBGWCR<n>.BAS[7:4] are RES0 and are ignored.
If a watchpoint is programmed with a reserved BAS value: 


° It is CONSTRAINED UNPREDICTABLE whether the watchpoint generates a Watchpoint exception for each byte 
in the doubleword or word of memory addressed by the DBGWVR<n>. 


° A direct or external read of DBGWCR<n>.BAS returns an UNKNOWN value. 

Software must not rely on these properties as the behavior of reserved values might change in a future revision of 
the architecture. 
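
The BAS rule above can be checked mechanically. The following is a minimal sketch, not architectural pseudocode: the function name is_valid_bas and its parameters are hypothetical. It assumes that m can be zero, which the Note about Zeros(x) and the required value 0b11111111 both suggest, and it interprets "ignored" for BAS[7:4] as masking those bits off before the test.

```python
def is_valid_bas(bas, wvr_bit2=0):
    """Return True if bas has the form Zeros(8-n-m):Ones(n):Zeros(m).

    Hypothetical helper. Assumes m may be 0 and n is 1 to 8.
    """
    if not 0 <= bas <= 0xFF:
        raise ValueError("BAS is an 8-bit field")
    if wvr_bit2:
        # BAS[7:4] are RES0 and ignored when DBGWVR<n>[2] is 1.
        bas &= 0x0F
    if bas == 0:
        # n must be non-zero, so at least one byte must be selected.
        return False
    # Drop the m trailing zeros.
    while bas & 1 == 0:
        bas >>= 1
    # What remains must be a single contiguous run of ones.
    return (bas & (bas + 1)) == 0
```

For example, 0b00111100 selects four contiguous bytes and passes the check, while 0b00000101 selects two non-adjacent bytes and is treated as reserved.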

Reserved DBGWCR<n>.MASK values 

If a watchpoint is programmed with a reserved MASK value: 


° The watchpoint must behave as if it is either:
— Disabled.
— Programmed with an UNKNOWN value that is not reserved, that might be 0b00000, other than for a direct
or external read of DBGWCR<n>.


° A direct or external read of DBGWCR<n>.MASK returns an UNKNOWN value. 


Other usage constraints 
For all watchpoints: 
° DBGWVR<n>[1:0] are RES0 and are ignored.


° If DBGWCR<n>.MASK is nonzero, and any masked bits of DBGWVR<n> are not 0, it is CONSTRAINED 
UNPREDICTABLE whether the watchpoint generates a Watchpoint exception when the unmasked bits match. 


. A watchpoint never generates any Watchpoint exceptions if DBGWCR<n>.LSC is 0b00. 
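
A debugger might combine the constraints in this section into one consistency check before enabling a watchpoint. The sketch below is illustrative only. The function name and the returned strings are hypothetical, and it assumes that a MASK value of k masks the low k address bits, which is consistent with the stated range of eight bytes to 2GB but is not spelled out in this section.

```python
def check_watchpoint_programming(wvr, bas, mask, lsc):
    """Return a list of problems found in a proposed watchpoint setup.

    Hypothetical helper: wvr is the DBGWVR<n> value, and bas, mask, lsc
    are the DBGWCR<n> field values.
    """
    problems = []
    if wvr & 0b11:
        problems.append("DBGWVR<n>[1:0] are RES0")
    if mask != 0 and bas != 0xFF:
        problems.append("MASK nonzero requires BAS == 0b11111111")
    # Assumption: MASK == k means address bits [k-1:0] are masked.
    if mask != 0 and wvr & ((1 << mask) - 1):
        problems.append("masked address bits are not 0")
    if lsc == 0b00:
        problems.append("LSC == 0b00 never generates Watchpoint exceptions")
    return problems
```

An empty list means that none of the conditions checked here applies. It does not show that the programming is valid in every other respect, for example reserved MASK or LBN values are not tested.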





G2-3972 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


G2 AArch32 Self-hosted Debug 
G2.10 Watchpoint exceptions 





G2.10.8 Exception syndrome information and preferred return address 

See the following: 

° Exception syndrome information. 

° Preferred return address on page G2-3974. 

Exception syndrome information 

The PE takes a Watchpoint exception as either: 

° A Data Abort exception, if it is taken to PL1. In this case, it is taken to Abort mode. 

° A Hyp trap exception, if it is taken to PL2 because HCR.TGE or HDCR.TDE is 1. In this case, it is taken to 
Hyp mode. 

If the exception is taken to: 

Abort mode 

The PE sets all of the following: 

° DBGDSCRext.MOE to 0b1010, to indicate a Watchpoint exception. 

° DFSR.CM to indicate whether a cache maintenance instruction caused the exception. 

° DFSR.WnR to indicate whether the exception was generated on a read instruction or a write
instruction.

° DFAR to an address that the debugger can use to determine the memory location that 
triggered the watchpoint. See Determining the memory location that caused a Watchpoint 
exception on page G2-3969. 

In addition, if using the: 

° Short-descriptor format, the PE sets DFSR.FS to the code for a debug exception, 0b00010, and 
DFSR.Domain to an UNKNOWN value. 

° Long-descriptor format, the PE sets DFSR.STATUS to the code for a debug exception, 
0b100010. 
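
An Abort mode handler can use these values to recognize a debug exception in the DFSR. The sketch below is a rough illustration. The function name is hypothetical, and the bit positions (CM at bit 13, WnR at bit 11, FS[4] at bit 10, LPAE at bit 9, FS[3:0] at bits [3:0], STATUS at bits [5:0]) are assumptions taken from the DFSR register description, not from this section.

```python
def decode_dfsr_debug(dfsr):
    """Classify a DFSR value using the fault codes given above.

    Hypothetical helper. Field positions are assumptions, see lead-in.
    """
    long_descriptor = (dfsr >> 9) & 1          # DFSR.LPAE
    if long_descriptor:
        # Long-descriptor format: STATUS, bits [5:0].
        debug = (dfsr & 0x3F) == 0b100010
    else:
        # Short-descriptor format: FS is split across bit 10 and bits [3:0].
        fs = (((dfsr >> 10) & 1) << 4) | (dfsr & 0xF)
        debug = fs == 0b00010
    return {
        "debug_exception": debug,
        "write": bool((dfsr >> 11) & 1),               # DFSR.WnR
        "cache_maintenance": bool((dfsr >> 13) & 1),   # DFSR.CM
        "long_descriptor": bool(long_descriptor),
    }
```

The fault code alone identifies a debug exception. To distinguish a Watchpoint exception from other debug exceptions, the handler would also examine DBGDSCRext.MOE.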

Hyp mode 

The PE does all of the following: 

° Records information about the exception in the Hypervisor Syndrome Register, HSR. See 
Table G2-20 on page G2-3974. 

° Sets DBGDSCRext.MOE to 0b1010, to indicate a Watchpoint exception.

° Sets the HDFAR to an address that the debugger can use to determine the memory location 
that triggered the watchpoint. See Determining the memory location that caused a 
Watchpoint exception on page G2-3969. 
Table G2-20 Information recorded in the HSR

HSR field                            Information recorded
Exception Class, EC                  The PE sets this to the code for a Data Abort exception routed to Hyp mode, 0x24.
Instruction Length, IL               The PE sets this to 1.
Instruction Specific Syndrome, ISS   ISS[24]     Instruction Syndrome Valid (ISV). The PE sets this to 0.
                                     ISS[23:10]  RES0.
                                     ISS[9]      External Abort type (EA). The PE sets this to 0.
                                     ISS[8]      Cache Maintenance (CM). The PE sets this to indicate whether a cache
                                                 maintenance instruction caused the exception.
                                     ISS[7]      RES0.
                                     ISS[6]      Write not Read (WnR). The PE sets this to indicate whether the exception
                                                 was generated on a read instruction or a write instruction.
                                     ISS[5:0]    Data Fault Status Code (DFSC). The PE sets this to the code for a debug
                                                 exception, 0b100010.





Note


For information about how debug exceptions can be routed to PL2, see Routing debug exceptions 
on page G2-3927. 
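
A hypervisor can test an HSR value against the fields that Table G2-20 lists. The following sketch is illustrative. The function names are hypothetical, and the positions of EC at bits [31:26], IL at bit 25, and ISS at bits [24:0] are assumptions taken from the HSR register description.

```python
def decode_watchpoint_hsr(hsr):
    """Split an HSR value into the fields that Table G2-20 describes.

    Hypothetical helper. EC, IL, and ISS positions are assumptions.
    """
    iss = hsr & 0x1FFFFFF
    return {
        "ec": (hsr >> 26) & 0x3F,
        "il": (hsr >> 25) & 1,
        "isv": (iss >> 24) & 1,
        "ea": (iss >> 9) & 1,
        "cm": (iss >> 8) & 1,
        "wnr": (iss >> 6) & 1,
        "dfsc": iss & 0x3F,
    }


def hsr_matches_watchpoint(hsr):
    """Return True if hsr has the values recorded for a Watchpoint exception."""
    f = decode_watchpoint_hsr(hsr)
    return (f["ec"] == 0x24 and f["il"] == 1 and f["isv"] == 0
            and f["ea"] == 0 and f["dfsc"] == 0b100010)
```

A match shows only that the syndrome is consistent with a debug exception reported as a Data Abort. The CM and WnR fields then say what kind of access was involved.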





Preferred return address 


The preferred return address of a Watchpoint exception is the address of the instruction that was not executed 
because the PE took the Watchpoint exception instead. 


This means that the preferred return address is the address of the instruction that caused the exception. 











G2.10.9 Pseudocode description of Watchpoint exceptions taken from AArch32 state 
AArch32.WatchpointByteMatch() tests an individual byte accessed by an operation. 
AArch32.StateMatch() tests the values in DBGWCR<n>.{HMC, SSC, PAC}, and if the watchpoint is Linked, also 
tests the Linked Context breakpoint that the watchpoint links to. 
AArch32.WatchpointMatch() tests the value in DBGWVR<n>. 
AArch32.CheckWatchpoint() generates a FaultRecord. A Watchpoint exception is taken if all of the following are true: 
° DBGDSCRext.MDBGen is 1. 
. Debug exceptions are enabled from the current Exception level and Security state. See Enabling debug 
exceptions from the current Privilege level and Security state on page G2-3929. 
° All of the conditions required for Watchpoint exception generation are met. See About Watchpoint exceptions 
on page G2-3961. 
Note 
AArch32.CheckWatchpoint might halt the PE and cause it to enter Debug state. External debug uses Debug state. 
The AArch32.Abort() function processes the FaultRecord object returned by AArch32.CheckWatchpoint(), as 
described in Abort exceptions on page G3-4019. If a Watchpoint exception is taken to AArch32 state, the 
AArch32.Abort() function generates a Data Abort exception. 


G2.11 Vector Catch exceptions


ARM deprecates the use of vector catch. 
This section describes Vector Catch exceptions in stage 1 of an AArch32 translation regime. 


The PE is using an AArch32 translation regime when it is executing either: 
° At EL1 or higher in an Exception level that is using AArch32.
° At EL0 using AArch32 when EL1 is using AArch32.


Note 


Vector Catch exceptions cannot be generated when the PE is using an AArch64 translation regime. 








This section contains the following subsections:
° About Vector Catch exceptions.
° Exception vectors that Vector Catch exceptions can be enabled for on page G2-3977.
° Generation of Vector Catch exceptions on page G2-3978.
° Usage constraints on page G2-3980.
° Exception syndrome information and preferred return address on page G2-3980.
° Pseudocode description of Vector Catch exceptions on page G2-3981.
G2.11.1 About Vector Catch exceptions 
Whenever the PE takes an exception, execution is forced to an address that is the exception vector for that exception. 
Vector catch permits a debugger to trap exceptions based on the exception vector, or based on the exception type 
associated with the exception vector, as follows: 
. If the address-matching form of vector catch is implemented, the debugger can trap exceptions based on the 
exception vector. 
° If the exception-trapping form of vector catch is implemented, the debugger can trap exceptions based on the 
exception type associated with the exception vector. 
The ARMv8-A architecture supports only these two forms of vector catch. Only one form can be implemented, and 
which is implemented is IMPLEMENTATION DEFINED. The DBGDEVID indicates which form is implemented. 
Regardless of the form of vector catch implemented, a debugger enables Vector Catch exceptions for exception 
vectors or types by programming the DBGVCR. This register contains vector catch enable bits. Each of these bits 
corresponds to a different vector. When a debugger sets a vector catch enable bit to 1, Vector Catch exceptions are 
enabled for the corresponding exception vector or type. 
Note 
EL2 using AArch64 or EL3 using AArch64 can enable Vector Catch exceptions for vectors by programming the 
DBGVCR32_EL2. The DBGVCR32_EL2 is architecturally mapped to the DBGVCR. 
When Vector Catch exceptions are enabled for an exception vector, this is called an enabled vector catch. The set 
of exception vectors that Vector Catch exceptions are enabled for is called the enabled vector catch set. 
If the form of vector catch implemented is the: 
Address-matching form: 
The PE compares the virtual address of each instruction in the program flow with a subset of the 
enabled vector catch set. 
If an address match occurs, a Vector Catch exception is generated when the instruction that caused 
the match is committed for execution. 
Exception-trapping form:


Whenever the PE takes an exception, if the vector the exception is taken to is included in a subset 
of the enabled vector catch set, a Vector Catch exception is generated. 


The Vector Catch exception is generated as part of entry to the exception, and must be taken before 
the PE either executes any instructions or takes any further exceptions. 


The addresses that comprise the subset depend on whether EL3 is implemented and, for the: 


. Address-matching form, the current Security state. 


° Exception-trapping form, the Security state that the exception is handled in. 


See Generation of Vector Catch exceptions on page G2-3978. 


Table G2-21 summarizes the differences between the address-matching and exception-trapping forms. 


Table G2-21 Differences in behavior of the address-matching and exception-trapping forms of vector catch

Address-matching                                                      Exception-trapping

An enabled vector catch generates a Vector Catch exception when       An enabled vector catch generates a Vector Catch exception
an instruction that is fetched from the vector is committed for       immediately after the PE takes the exception that is associated
execution.                                                            with the vector.
This means that spurious Vector Catch exceptions might occur,         This means that Vector Catch exceptions always result from
where the Vector Catch exception does not result from an              exception entry, and not from branches to exception vectors.
exception entry, but is instead caused by a branch to the vector.
A branch to the vector might occur, for example, on a return from
a nested exception or when simulating an exception entry.

A Vector Catch exception is generated as a result of an instruction   A Vector Catch exception is generated as a result of an exception
fetch. This means that the Vector Catch exception has a priority      entry. This means that the Vector Catch exception is part of the
relative to the other synchronous exceptions that result from an      exception that caused the Vector Catch exception. Therefore, the
instruction fetch.                                                    Vector Catch exception has no priority associated with it.
Synchronous exception prioritization for exceptions taken to          For this reason, Vector Catch exceptions are outside the scope of
AArch64 on page D1-1548 describes this prioritization.                the prioritization that Synchronous exception prioritization for
                                                                      exceptions taken to AArch64 on page D1-1548 describes.

A Vector Catch exception can be preempted by another exception.       Vector Catch exceptions must be taken before other exceptions.
If this happens, the Vector Catch exception is generated again
when the exception handler branches back to the vector.

A Vector Catch exception can be generated as a result of an           Because a Vector Catch exception is generated as the result of an
instruction fetch executed in any AArch32 mode except Hyp             exception entry, the Vector Catch exception is only generated
mode, including User mode.                                            when the PE is in the AArch32 exception handling mode.

If HCR.TGE is 1, Vector Catch exceptions can be generated for         If HCR.TGE is 1, Vector Catch exceptions are never generated in
User mode instruction fetches from Non-secure PL1 vectors.            Non-secure state, because:
                                                                      ° Exceptions are routed away from Non-secure PL1 vectors,
                                                                        to PL2.
                                                                      ° The architecture does not provide vector catch enable bits
                                                                        for the Hyp exception vectors.

Depending on the implementation, some vector catch enable bits in the DBGVCR might be RES0. For example, if
EL3 is not implemented or is implemented but is using AArch64, Monitor mode is not implemented, and so the
enable bits for exception vectors for exceptions taken to Monitor mode are RES0. See Exception vectors that Vector
Catch exceptions can be enabled for on page G2-3977 for the vector catch enable bits that exist for different
implementations.


The debug exception enable controls on page G2-3926 describes the enable controls for Vector Catch exceptions. 
G2.11.2 Exception vectors that Vector Catch exceptions can be enabled for


When the PE takes an exception, the exception vector is contained in a vector table at the Privilege level the
exception is taken to.


Depending on the Security state and AArch32 mode the exception is taken to, when the exception is taken, the
vector table used is the table that contains one of:


° Local exception vectors.
° Non-secure Local exception vectors.
° Secure Local exception vectors.
° Hyp exception vectors.
° Monitor exception vectors.


Table G2-22 shows which vector tables are implemented for different implementations. In the table: 


. A dash, -, means that the Exception level is not implemented. 
. 64 means that the Exception level is using AArch64. 
. 32 means that the Exception level is using AArch32. 


Table G2-22 Vector tables implemented for different implementations

Implementation
EL0   EL1   EL2   EL3    Vector table or tables implemented
32    32    -     -      Local exception vectors.
32    32    64    -      Non-secure Local exception vectors.
32    32    32    -      Non-secure Local exception vectors.
                         Hyp exception vectors.
32    32    -     64     Secure Local exception vectors.
                         Non-secure Local exception vectors.
32    32    -     32     Secure Local exception vectors.
                         Non-secure Local exception vectors.
                         Monitor exception vectors.
32    32    64    64     Secure Local exception vectors.
                         Non-secure Local exception vectors.
32    32    32    64     Secure Local exception vectors.
                         Non-secure Local exception vectors.
                         Hyp exception vectors.
32    32    32    32     Secure Local exception vectors.
                         Non-secure Local exception vectors.
                         Hyp exception vectors.
                         Monitor exception vectors.






For example, in an AArch32-only implementation that includes EL0, EL1, and EL3, when the PE takes an exception
to Monitor mode, it uses the vector table containing Monitor exception vectors. 
The tables that follow show the vectors that Vector Catch exceptions can be enabled for, and their corresponding
vector catch enable bits in the DBGVCR: 


° Table G2-23 shows the Local exception vectors, Secure Local exception vectors, and Non-secure Local 
exception vectors that Vector Catch exceptions can be enabled for. 


° Table G2-24 shows the Monitor exception vectors that Vector Catch exceptions can be enabled for. 


The ARMv8-A architecture does not provide vector catch enable bits for the Hyp exception vectors. 


Table G2-23 Local exception vectors, Secure Local exception vectors, and Non-secure Local exception vectors that
Vector Catch exceptions can be enabled for

Vector catch enable bit                                            Local exception vectors
Local or Secure Local   Non-secure Local    Exception type         Normal.                  High.
exception vectors       exception vectors                          SCTLR.V is 0. (a)        SCTLR.V is 1.
SF                      NSF                 FIQ interrupt          VBAR + 0x0000001C        0xFFFF001C
SI                      NSI                 IRQ interrupt          VBAR + 0x00000018        0xFFFF0018
SD                      NSD                 Data Abort             VBAR + 0x00000010        0xFFFF0010
SP                      NSP                 Prefetch Abort         VBAR + 0x0000000C        0xFFFF000C
SS                      NSS                 Supervisor Call        VBAR + 0x00000008        0xFFFF0008
SU                      NSU                 Undefined Instruction  VBAR + 0x00000004        0xFFFF0004

a. If EL3 is implemented and is using AArch32, VBAR is banked. The Secure Local exception vectors use VBAR_S and the
   Non-secure Local exception vectors use VBAR_NS.


Table G2-24 Monitor exception vectors that Vector Catch exceptions can be enabled for

Vector catch enable bit   Exception type        Monitor exception vectors
MF                        FIQ interrupt         MVBAR + 0x0000001C
MI                        IRQ interrupt         MVBAR + 0x00000018
MD                        Data Abort            MVBAR + 0x00000010
MP                        Prefetch Abort        MVBAR + 0x0000000C
MS                        Secure Monitor Call   MVBAR + 0x00000008








Note 


There is no vector catch enable bit for Monitor trap exceptions. 
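
The vector addresses in Table G2-23 and Table G2-24 are a base plus a fixed offset for each exception type. The sketch below computes them under that reading. The function names, dictionary names, and parameters are hypothetical, and the high vector base of 0xFFFF0000 is assumed from the high vector addresses listed for SCTLR.V set to 1.

```python
# Offsets of the vectors that have vector catch enable bits.
LOCAL_OFFSETS = {
    "Undefined Instruction": 0x04,
    "Supervisor Call": 0x08,
    "Prefetch Abort": 0x0C,
    "Data Abort": 0x10,
    "IRQ interrupt": 0x18,
    "FIQ interrupt": 0x1C,
}
MONITOR_OFFSETS = {
    "Secure Monitor Call": 0x08,
    "Prefetch Abort": 0x0C,
    "Data Abort": 0x10,
    "IRQ interrupt": 0x18,
    "FIQ interrupt": 0x1C,
}
HIGH_VECTOR_BASE = 0xFFFF0000  # assumption: base used when SCTLR.V is 1


def local_vector(exception_type, vbar, sctlr_v):
    """Address of a Local, Secure Local, or Non-secure Local vector."""
    base = HIGH_VECTOR_BASE if sctlr_v else vbar
    return (base + LOCAL_OFFSETS[exception_type]) & 0xFFFFFFFF


def monitor_vector(exception_type, mvbar):
    """Address of a Monitor exception vector."""
    return (mvbar + MONITOR_OFFSETS[exception_type]) & 0xFFFFFFFF
```

When VBAR is banked, the caller would pass the Secure or Non-secure copy as appropriate. The sketch does not model the banking itself.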











G2.11.3 Generation of Vector Catch exceptions 
How Vector Catch exceptions are generated depends on which form is implemented: 
. Address-matching form on page G2-3979. 
° Exception-trapping form on page G2-3979. 
Address-matching form


The PE compares the virtual address of each instruction in the program flow with some or all of the addresses in
the enabled vector catch set, as follows:


° If EL3 is not implemented, the enabled vector catch set contains only Local exception vectors. The PE
compares the virtual address of each instruction in the program flow, including those executed at EL0, with
all addresses in the enabled vector catch set.


° If EL3 is implemented, the enabled vector catch set might contain one or more of the following: 
— Monitor exception vectors, if EL3 is using AArch32. 
— Secure Local exception vectors. 
—  Non-secure Local exception vectors. 


In this case, Table G2-25 shows which addresses, in the enabled vector catch set, the virtual address of each 
instruction in the program flow is compared with. 


Table G2-25 Comparisons made if the implementation includes EL3

                For exceptions taken to:
EL3 is using    Secure PL1 modes                    Non-secure PL1 modes
AArch64         Secure Local exception vectors      Non-secure Local exception vectors
AArch32         Secure Local exception vectors      Non-secure Local exception vectors
                and Monitor exception vectors








For example, for exceptions taken to a Secure PL1 mode when EL3 is using AArch64, the virtual address of each 
instruction in the program flow is compared with each Secure Local exception vector in the enabled vector catch set. 


For each instruction in the program flow, the PE tests for any possible Vector Catch exceptions before executing the 
instruction. If a match occurs, a Vector Catch exception is generated when the instruction is committed for 
execution, regardless of all of the following: 


. Whether the instruction passes its condition code check. 

° Whether the instruction is executed as part of exception entry. 

° If EL2 is implemented, what HCR.{IMO, FMO, AMO} are set to. 
° If EL3 is implemented, what SCR.{IRQ, FIQ, EA} are set to.
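
The address-matching comparison described above can be modeled as a set membership test. This is a rough sketch and not the AArch32.VCRMatch() pseudocode. All names are hypothetical, the enabled vector catch set is represented as a dictionary of address sets, and the selection of sets when EL3 is implemented follows one reading of Table G2-25, using the current Security state.

```python
def compared_vectors(enabled, el3_implemented, el3_aarch32, secure):
    """Select the addresses that instruction addresses are compared with.

    Hypothetical model: enabled maps "local", "secure_local",
    "non_secure_local", and "monitor" to sets of enabled vector addresses.
    """
    if not el3_implemented:
        return set(enabled["local"])
    if not secure:
        return set(enabled["non_secure_local"])
    subset = set(enabled["secure_local"])
    if el3_aarch32:
        # Monitor exception vectors exist only if EL3 is using AArch32.
        subset |= set(enabled["monitor"])
    return subset


def generates_vector_catch(instr_addr, enabled, el3_implemented,
                           el3_aarch32, secure):
    """True if committing the instruction at instr_addr would match.

    The condition code check and the routing controls play no part.
    """
    return instr_addr in compared_vectors(
        enabled, el3_implemented, el3_aarch32, secure)
```

Because the test depends only on the instruction address, a plain branch to a vector matches in the same way as an exception entry, which is the source of the spurious matches that this form permits.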


Exception-trapping form 


When the PE takes an exception, it tests whether the exception is by branching to an exception vector in a subset of 
the enabled vector catch set, as follows: 


° If EL3 is not implemented, the enabled vector catch set contains only Local exception vectors. The PE tests 
whether the exception is by branching to any address in the enabled vector catch set. 


° If EL3 is implemented, the enabled vector catch set might contain one or more of the following: 
— Monitor exception vectors, if EL3 is using AArch32. 
— Secure Local exception vectors. 


—  Non-secure Local exception vectors. 
In this case, the PE tests whether the exception is by branching to a vector in one of the subsets that
Table G2-26 shows. In the table, n/a means not applicable.


Table G2-26 Subsets that the PE tests within if EL3 is implemented

                For exceptions taken to:
EL3 is using    Monitor mode                Other Secure PL1 modes            Non-secure PL1 modes
AArch64         n/a                         Secure Local exception vectors    Non-secure Local exception vectors
AArch32         Monitor exception vectors   Secure Local exception vectors    Non-secure Local exception vectors


For example, for an exception taken to a Secure PL1 mode when EL3 is using AArch64, the PE tests whether the
exception is by branching to any of the Secure Local exception vectors in the enabled vector catch set.


If the exception is by branching to a vector in the subset, a Vector Catch exception is generated as part of exception
entry. That is, a Vector Catch exception is generated instead of the exception handler executing its first instruction.


G2.11.4 Usage constraints


See the following subsections:
° Usage constraints that apply to both forms of vector catch.
° Usage constraints that apply only to the address-matching form.


Usage constraints that apply to both forms of vector catch


For Vector Catch exceptions enabled for either the Prefetch Abort exception vector or the Data Abort exception
vector, if one of these exception types is taken to the Exception level that debug exceptions are targeting, behavior
is CONSTRAINED UNPREDICTABLE. Either:


° Vector catch is ignored, therefore a Vector Catch exception is not generated.


° Vector catch generates a Prefetch Abort debug exception. For Vector Catch exceptions enabled for the
Prefetch Abort exception vector, the PE might enter a recursive loop of Prefetch Abort exceptions causing
Vector Catch exceptions and Vector Catch exceptions causing Prefetch Abort exceptions.


Note
The Exception level that debug exceptions are targeting is called the debug target Exception level, ELD. Routing
debug exceptions on page G2-3927 describes how ELD is derived.


Usage constraints that apply only to the address-matching form
Exception vectors are at word-aligned addresses, and:


° It is CONSTRAINED UNPREDICTABLE whether an enabled vector catch generates a Vector Catch exception for
a 32-bit T32 instruction starting at the halfword-aligned address immediately prior to the vector address.


° T32 instructions that start at the halfword-aligned address immediately after the exception vector do not
generate Vector Catch exceptions.


For the address-matching form, Vector Catch exceptions have the same priority as Breakpoint exceptions. If a single
instruction causes both a Vector Catch exception and a Breakpoint exception, it is CONSTRAINED UNPREDICTABLE
which of these debug exceptions the PE takes.


G2.11.5 Exception syndrome information and preferred return address


See the following:


° Exception syndrome information on page G2-3981.
° Preferred return address.


Exception syndrome information 
The PE takes a Vector Catch exception as either: 
. A Prefetch Abort exception if it is taken to PL1. In this case, it is taken to Abort mode. 


° A Hyp trap exception, if it is taken to PL2 because HCR.TGE or HDCR.TDE is 1. In this case, it is taken to 
Hyp mode. 


If the exception is taken to: 


PL1 Abort mode 
The PE sets all of the following: 
° IFSR.FS to the code for a debug exception, 0b00010.
° DBGDSCRext.MOE to 0b0101, to indicate a Vector Catch exception.
° IFAR to an UNKNOWN value.
PL2 Hyp mode 
The PE does all of the following: 
° Records information about the exception in the Hypervisor Syndrome Register, HSR. See 
Table G2-27. 


° Sets DBGDSCRext.MOE to 0b0101, to indicate a Vector Catch exception. 
° Sets the HIFAR to an UNKNOWN value.


Table G2-27 Information recorded in the HSR

HSR field                            Information recorded
Exception Class, EC                  The PE sets this to the code for a Prefetch Abort exception routed to Hyp mode, 0x20.
Instruction Length, IL               The PE sets this to 1.
Instruction Specific Syndrome, ISS   ISS[24:10]  RES0.
                                     ISS[9]      External Abort type (EA). The PE sets this to 0.
                                     ISS[8:6]    RES0.
                                     ISS[5:0]    Instruction Fault Status Code (IFSC). The PE sets this to the code for a
                                                 debug exception, 0b100010.





Note


For information about how debug exceptions can be routed to PL2, see Routing debug exceptions 
on page G2-3927. 





Preferred return address 


The preferred return address of a Vector Catch exception is the address of the instruction that was not executed
because the PE took the Vector Catch exception instead. 


This means that the preferred return address is the exception vector. This is true regardless of whether the 
address-matching form or the exception trapping form is implemented. 


G2.11.6 Pseudocode description of Vector Catch exceptions 


The AArch32.VCRMatch() pseudocode function checks whether the instruction at address generates a Vector Catch 
exception. It therefore shows the address-matching form of vector catch. 
The AArch32.CheckVectorCatch() pseudocode function uses AArch32.VCRMatch() to test whether the instruction
generates a Vector Catch exception, and if AArch32.VCRMatch() returns TRUE it generates that event.


The AArch32.Abort() function processes the FaultRecord object returned by AArch32.CheckVectorCatch(), as
described in Abort exceptions on page G3-4019. If there is a Vector Catch exception, the AArch32.Abort() function
generates a Prefetch Abort exception.
G2.12 Synchronization and debug exceptions


The behavior of debug depends on all of the following: 

° The state of the external debug authentication interface. 

° Indirect reads of:
— External debug registers.
— System registers, including system debug registers.
— Special-purpose registers.


If a change is made to any of these, the effect of that change on debug exception generation cannot be relied on until 
after a Context synchronization event has occurred. 


For any instructions executed between the time when the change is made and the time when the next Context 
synchronization event occurs, it is CONSTRAINED UNPREDICTABLE whether debug uses the state of the PE before the 
change, or the state of the PE after the change. 


Example G2-3 
1. Software changes DBGDSCRext.MDBGen from 0 to 1.
2. An instruction is executed, that would cause a Breakpoint exception if self-hosted debug uses the state of the
PE after the change.
3. A Context synchronization event occurs.


In this case, it is CONSTRAINED UNPREDICTABLE whether the instruction generates a Breakpoint exception. 


Example G2-4 


1. Software unlocks the OS lock.
2. The PE executes some instructions.
3. A Context synchronization event occurs.


During the time when the PE is executing some instructions, step 2, it is CONSTRAINED UNPREDICTABLE whether 
debug exceptions other than Breakpoint Instruction exceptions can be generated. 





Note 
° See Context synchronization event for the definition of this term.
° Some register updates are self-synchronizing. Others require an explicit Context synchronization event. For
more information, see:
— Synchronization of changes to AArch32 System registers on page G4-4163.
— Accessing PSTATE fields on page G1-3806.
— Synchronization of changes to the external debug registers on page H8-4964.
G2.12.1 State and mode changes without explicit context synchronization events


Most changes to the Exception level, and most changes to the Security state if EL3 is implemented, happen as a 
result of operations that are an explicit Context synchronization event. This is because taking an exception and 
returning from an exception are both explicit Context synchronization events, and the Privilege level and Security 
state can only change as a result of taking or returning from an exception. 


However, some Security state and AArch32 mode changes can happen because of operations that are not an explicit 
Context synchronization event. These are: 


° AArch32 mode changes caused by MSR and CPS instructions. A mode change might be to a mode at a lower 
Privilege level. 


° If EL3 is using AArch32, a Security state change caused by a direct write to the SCR in a privileged mode 
other than Monitor mode, to set SCR.NS to 1. 


Chapter G3


The AArch32 System Level Memory Model 


This chapter provides a system level view of the general features of the memory system. It contains the following 
sections: 


° About the memory system architecture on page G3-3986.
° Address space on page G3-3987.
° Mixed-endian support on page G3-3988.
° AArch32 cache and branch predictor support on page G3-3989.
° System register support for IMPLEMENTATION DEFINED memory features on page G3-4013.
° External aborts on page G3-4014.
° Memory barrier instructions on page G3-4016.
° Pseudocode description of general memory system instructions on page G3-4017.


G3.1 About the memory system architecture


The ARM architecture supports different implementation choices for the memory system microarchitecture and 
memory hierarchy, depending on the requirements of the system being implemented. In this respect, the memory 
system architecture describes a design space in which an implementation is made. The architecture does not 
prescribe a particular form for the memory systems. Key concepts are abstracted in a way that permits 
implementation choices to be made while enabling the development of common software routines that do not have 
to be specific to a particular microarchitectural form of the memory system. For more information about the concept 
of a hierarchical memory system see Memory hierarchy on page E2-2318. 


G3.1.1 Form of the memory system architecture 
The ARMv8 A-profile architecture includes a Virtual Memory System Architecture (VMSA), described in
Chapter G4 The AArch32 Virtual Memory System Architecture.


G3.1.2 Memory attributes 


Memory types and attributes on page E2-2342 describes the memory attributes, including how different memory 
types have different attributes. Each location in memory has a set of memory attributes, and the translation tables 
define the virtual memory locations, and the attributes for each location. 


Table G3-1 shows the memory attributes that are visible at the system level. 


Table G3-1 Memory attribute summary

Memory type   Shareability           Cacheability
Device (a)    Outer Shareable        Non-cacheable.
Normal        One of:                One of (b):
              ° Non-shareable.       ° Non-cacheable.
              ° Inner Shareable.     ° Write-Through Cacheable.
              ° Outer Shareable.     ° Write-Back Cacheable.

a. Takes additional attributes, see Device memory on page E2-2346.
b. See also Cacheability, cache allocation hints, and cache transient hints on page G3-3992.


For more information on Cacheability and Shareability see The Cacheability and Shareability memory attributes on
page E2-2319, Non-shareable Normal memory on page E2-2344, and Caches and memory hierarchy on
page E2-2318.


G3.2 Address space


The ARMv8 architecture is designed to support a wide range of applications with different memory requirements.
It supports a range of physical address (PA) sizes, and provides associated control and identification mechanisms. 
For more information, see About VMSAv8-32 on page G4-4022. 


Address space overflow or underflow 


This subsection describes address space overflow or underflow: 


Instruction address space overflow 

When a PE performs a normal, sequential execution of instructions, it calculates: 
(address_of_current_instruction) + (size_of_executed_instruction) 

This calculation is performed after each instruction to determine which instruction to execute next. 


If the address calculation performed after executing an A32 or T32 instruction overflows 0xFFFFFFFF, the program
counter becomes UNKNOWN.


If the PE executes an instruction for which the instruction address, size, and alignment mean that it contains the
bytes 0xFFFFFFFF and 0x00000000, the bytes that apparently come from 0x00000000 onwards come from an UNKNOWN
address.


Data address space overflow and underflow 


If the PE executes a load or store instruction for which the computed address, total access size, and alignment mean
that it accesses bytes 0xFFFFFFFF and 0x00000000, then the bytes that apparently come from 0x00000000 onwards come
from UNKNOWN addresses.
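
The two overflow rules above can be modeled with a short sketch. This is only an illustration of the arithmetic, not architectural pseudocode: the names `next_sequential_pc`, `access_wraps`, and the use of `None` to stand in for UNKNOWN are assumptions made for this example.

```python
ADDRESS_SPACE = 1 << 32  # AArch32 addresses are 32 bits wide

UNKNOWN = None  # hypothetical stand-in for the architectural value UNKNOWN


def next_sequential_pc(current, size):
    """Address of the next sequential instruction, or UNKNOWN on overflow.

    Models (address_of_current_instruction) + (size_of_executed_instruction).
    """
    target = current + size
    if target > 0xFFFFFFFF:
        # The calculation overflowed 0xFFFFFFFF, so the PC becomes UNKNOWN.
        return UNKNOWN
    return target


def access_wraps(address, size):
    """True if an access of `size` bytes at `address` touches both
    byte 0xFFFFFFFF and byte 0x00000000 (that is, it wraps)."""
    return size > 0 and address + size > ADDRESS_SPACE


print(hex(next_sequential_pc(0x8000, 4)))
print(next_sequential_pc(0xFFFFFFFC, 4))
print(access_wraps(0xFFFFFFFE, 4))
```

An access that ends exactly at 0xFFFFFFFF does not wrap, so `access_wraps(0xFFFFFFFC, 4)` is false in this model.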


G3.3 Mixed-endian support


Table G3-2 shows the endianness of explicit data accesses and translation table walks. 


Table G3-2 Endianness support

Exception level   Explicit data accesses   Stage 1 translation table walks   Stage 2 translation table walks
EL0               PSTATE.E                 SCTLR(S/NS).EE                    HSCTLR.EE
EL1               PSTATE.E                 SCTLR(S/NS).EE                    HSCTLR.EE
EL2               PSTATE.E                 HSCTLR.EE                         N/A
EL3               PSTATE.E                 SCTLR(S).EE                       N/A

ARMv8 provides the following options for endianness support:
° All Exception levels support mixed-endianness:
— SCTLR(S/NS).EE, HSCTLR.EE, and PSTATE.E are RW.
° Only EL0 supports mixed-endianness and EL1, EL2, and EL3 support only little-endianness:
— SCTLR(S/NS).EE and HSCTLR.EE are RES0. PSTATE.E is RW when in EL0 and RES0 when in EL1,
EL2, or EL3. SPSR.E is also RES0 when not returning to EL0.
° Only EL0 supports mixed-endianness and EL1, EL2, and EL3 support only big-endianness:
— SCTLR(S/NS).EE and HSCTLR.EE are RES1. PSTATE.E is RW when in EL0 and RES1 when in EL1,
EL2, or EL3. SPSR.E is also RES1 when not returning to EL0.
° All Exception levels support only little-endianness:
— Each of SCTLR(S/NS).EE, HSCTLR.EE, PSTATE.E, and SPSR.E is RES0.
° All Exception levels support only big-endianness:
— Each of SCTLR(S/NS).EE, HSCTLR.EE, PSTATE.E, and SPSR.E is RES1.

If mixed-endian support is implemented for an Exception level using AArch32, endianness is controlled by
PSTATE.E. For exception returns to AArch32 state, PSTATE.E is copied from SPSR_ELx.E. If the target Exception
level supports only little-endian accesses, SPSR_ELx.E is RES0. If the target Exception level supports only
big-endian accesses, SPSR_ELx.E is RES1.

Note 
° When using AArch32, ARM deprecates PSTATE.E having a different value from the equivalent System 
register EE bit when in EL1, EL2 or EL3. The use of the SETEND instruction is also deprecated. 
° If the higher Exception levels are using AArch64, the corresponding registers are: 
—  SCTLR_EL1 for SCTLR(NS). 
—  SCTLR_EL2 for HSCTLR. 
—  SCTLR_EL3 for SCTLR(S). 
The BigEndian() function determines whether the current Exception level and Execution state is using big-endian 
data. 
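
As a reading aid, the contents of Table G3-2 can be written as a lookup. The sketch below is illustrative only: the dictionary keys, the function names, and the use of `None` for the N/A entries are assumptions of this example, and `data_is_big_endian` relies on the usual convention that an E bit value of 1 selects big-endian.

```python
# Control field that selects endianness, per Table G3-2.
ENDIANNESS_CONTROL = {
    "EL0": {"data": "PSTATE.E", "stage1_walk": "SCTLR(S/NS).EE", "stage2_walk": "HSCTLR.EE"},
    "EL1": {"data": "PSTATE.E", "stage1_walk": "SCTLR(S/NS).EE", "stage2_walk": "HSCTLR.EE"},
    "EL2": {"data": "PSTATE.E", "stage1_walk": "HSCTLR.EE", "stage2_walk": None},
    "EL3": {"data": "PSTATE.E", "stage1_walk": "SCTLR(S).EE", "stage2_walk": None},
}


def endianness_control(exception_level, access):
    """Name of the field controlling endianness, or None where the table says N/A."""
    return ENDIANNESS_CONTROL[exception_level][access]


def data_is_big_endian(pstate_e):
    """In AArch32 state, explicit data accesses follow PSTATE.E (1 assumed big-endian)."""
    return pstate_e == 1


print(endianness_control("EL2", "stage1_walk"))
print(data_is_big_endian(0))
```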


G3.4 AArch32 cache and branch predictor support


The following sections describe the support for caches and branch predictors in AArch32 state: 

° General behavior of the caches. 

° Cache identification on page G3-3991. 

° Cacheability, cache allocation hints, and cache transient hints on page G3-3992. 

° Enabling and disabling the caching of memory accesses in AArch32 state on page G3-3993.
° Behavior of caches at reset on page G3-3995.
° About cache maintenance in ARMv8 on page G3-3995.
° AArch32 cache and branch predictor maintenance instructions on page G3-3999.
° Cache lockdown on page G3-4010.
° System level caches on page G3-4012.


See also Chapter G4 The AArch32 Virtual Memory System Architecture, and in particular Caches in VMSAv8-32 on 
page G4-4106. 





Note 


° Branch predictors typically use a form of cache to hold branch target data. Therefore, they are included in 
this section. 


° In the instruction mnemonics, MVA is a synonym for VA.





G3.4.1 General behavior of the caches 


When a memory location is marked with a Normal Cacheable memory attribute, determining whether a copy of the 
memory location is held in a cache still depends on many aspects of the implementation. The following 
non-exhaustive list of factors might be involved: 


° The size, line length, and associativity of the cache.

° The cache allocation algorithm. 

° Activity by other elements of the system that can access the memory. 
° Speculative instruction fetching algorithms. 

° Speculative data fetching algorithms. 

° Interrupt behaviors. 


Given this range of factors, and the large variety of cache systems that might be implemented, the architecture 
cannot guarantee whether: 


° A memory location present in the cache remains in the cache.


° A memory location not present in the cache is brought into the cache.


Instead, the following principles apply to the behavior of caches:


° The architecture has a concept of an entry locked down in the cache. How lockdown is achieved is 
IMPLEMENTATION DEFINED, and lockdown might not be supported by: 


— A particular implementation.


— Some memory attributes. 


° An unlocked entry in a cache might not remain in that cache. The architecture does not guarantee that an
unlocked cache entry remains in the cache or remains incoherent with the rest of memory. Software must not
assume that an unlocked item that remains in the cache remains dirty.


° A locked entry in a cache is guaranteed to remain in that cache. The architecture does not guarantee that a
locked cache entry remains incoherent with the rest of memory, that is, it might not remain dirty.


Note 


For more information, see The interaction of cache lockdown with cache maintenance instructions on 
page G3-4010. 








° Any memory location that has a Normal Cacheable attribute at either the current Exception level or at a 
higher Exception level can be allocated to a cache at any time. 


° It is guaranteed that no memory location that does not have a Normal Cacheable attribute is allocated into the
cache.


° It is guaranteed that no memory location is allocated to the cache if it has a Normal Non-cacheable attribute
or any type of Device memory attribute in both:
— The translation regime at the current Exception level.
— The translation regime at any higher Exception level.


° For data accesses, any memory location with a Normal Inner Shareable or Normal Outer Shareable attribute
is guaranteed to be coherent with all masters in its Shareability domain.


° Any memory location is not guaranteed to remain incoherent with the rest of memory.


° The eviction of a cache entry from a cache level can overwrite memory that has been written by another
observer only if the entry contains a memory location that has been written to by an observer in the
Shareability domain of that memory location. The maximum size of the memory that can be overwritten is
called the Cache Write-back Granule. In some implementations the CTR identifies the Cache Write-back
Granule, see CTR, Cache Type Register on page G6-4293.


° The allocation of a memory location into a cache cannot cause the most recent value of that memory location
to become invisible to an observer, if it was previously visible to that observer.


Note 


The Cacheability attribute of an address is determined by the applicable translation table entry for that address, as 
modified by any applicable System register Cacheability controls, such as the SCTLR.{I, C} controls. 








For the purpose of these principles, a cache entry covers at least 16 bytes and no more than 2KB of contiguous 
address space, aligned to the size of the cache entry. 
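
The size and alignment rule for a cache entry can be illustrated with a small helper. This is a sketch under stated assumptions: the function name `cache_entry_range` is hypothetical, and the requirement that the entry size is a power of two is an assumption of this example (the text above only gives the 16-byte and 2KB bounds and the alignment).

```python
MIN_ENTRY_BYTES = 16
MAX_ENTRY_BYTES = 2048  # 2KB


def cache_entry_range(address, entry_bytes):
    """Return (first, last) byte addresses of the aligned cache entry
    that covers `address`, for an entry of `entry_bytes` bytes."""
    if not MIN_ENTRY_BYTES <= entry_bytes <= MAX_ENTRY_BYTES:
        raise ValueError("entry size outside the 16 byte to 2KB range")
    if entry_bytes & (entry_bytes - 1):
        # Assumption of this sketch: entry sizes are powers of two.
        raise ValueError("entry size is not a power of two")
    base = address & ~(entry_bytes - 1)  # align down to the entry size
    return base, base + entry_bytes - 1


first, last = cache_entry_range(0x1234, 64)
print(hex(first), hex(last))
```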


G3.4.2 Cache identification


The ARMv8 cache identification consists of a set of registers that describe the implemented caches that are affected 
by cache maintenance instructions executed on the PE. This includes cache maintenance instructions that: 


° Affect the entire cache, for example ICIALLUIS. 
° Operate by address, for example ICIMVAU. 
° Operate by set/way, for example DCISW. 


The cache identification registers are: 


° A single Cache Type Register, CTR, that defines: 


— The minimum line length of any of the instruction caches affected by the instruction cache 
maintenance instructions. 


— The minimum line length of any of the data or unified caches, affected by the data cache maintenance 
instructions. 


— The cache indexing and tagging policy of the Level 1 instruction cache.


Note 


It is IMPLEMENTATION DEFINED whether caches beyond the PoC will be reported by this mechanism, and 
because of the possible existence of system caches some caches before the PoC might not be reported. For 
more information about system caches see System level caches on page G3-4012. 








° A single Cache Level ID Register, CLIDR, that defines: 


— The type of cache that is implemented and can be maintained using the architected cache maintenance
instructions that operate by set/way or operate on the entire cache at each cache level, up to the 
maximum of seven levels. 


— The Level of Unification Inner Shareable (LoUIS), Level of Coherence (LoC) and the Level of 
Unification (LoU) for the caches. See Terms used in describing the maintenance instructions on 
page G3-3996 for a definition of these terms. 


— An optional ICB field to indicate the boundary between the caches used for caching Inner Cacheable
memory regions and those used only for caching Outer Cacheable regions.


° A single Cache Size Selection Register, CSSELR, that selects the cache level and cache type of the current 
Cache Size Identification Register. 


° For each implemented cache that is identifiable by this mechanism, across all the levels of caching, a Cache
Size Identification Register, CCSIDR, that defines:


— Whether the cache supports Write-Through, Write-Back, Read-Allocate and Write-Allocate.


— The number of sets, associativity and line length of the cache. See Terms used in describing the
maintenance instructions on page G3-3996 for a definition of these terms.
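
The following sketch decodes raw values of the identification registers listed above. It is illustrative rather than definitive: the bit positions used here (for example CTR.IminLine in bits [3:0], the CLIDR Ctype fields in 3-bit groups from bit 0, and CCSIDR.LineSize in bits [2:0]) are assumptions recalled from the register descriptions and should be checked against the CTR, CLIDR, CSSELR, and CCSIDR descriptions before use. Reading the registers themselves requires System register accesses that are not shown.

```python
def bits(value, high, low):
    """Extract value[high:low] as an unsigned integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


CTYPE_NAMES = {1: "instruction", 2: "data", 3: "separate instruction and data", 4: "unified"}


def decode_ctr(ctr):
    # Line lengths are encoded as log2 of the number of 4-byte words.
    return {
        "imin_line_bytes": 4 << bits(ctr, 3, 0),
        "l1ip": bits(ctr, 15, 14),
        "dmin_line_bytes": 4 << bits(ctr, 19, 16),
    }


def decode_clidr(clidr):
    levels = []
    for level in range(1, 8):
        ctype = bits(clidr, 3 * level - 1, 3 * level - 3)
        if ctype == 0:  # the scan ends at the first level with no cache
            break
        levels.append((level, CTYPE_NAMES.get(ctype, "reserved")))
    return {
        "levels": levels,
        "louis": bits(clidr, 23, 21),
        "loc": bits(clidr, 26, 24),
        "lou": bits(clidr, 29, 27),
    }


def csselr_value(level, instruction):
    """Value to write to CSSELR to select a cache (level is 1-based)."""
    return ((level - 1) << 1) | int(instruction)


def decode_ccsidr(ccsidr):
    return {
        "line_bytes": 1 << (bits(ccsidr, 2, 0) + 4),
        "ways": bits(ccsidr, 12, 3) + 1,
        "sets": bits(ccsidr, 27, 13) + 1,
        "write_allocate": bool(bits(ccsidr, 28, 28)),
        "read_allocate": bool(bits(ccsidr, 29, 29)),
        "write_back": bool(bits(ccsidr, 30, 30)),
        "write_through": bool(bits(ccsidr, 31, 31)),
    }


# Example values for a hypothetical PE with split L1 caches and a unified L2.
print(decode_clidr(0x0A200023))
print(decode_ccsidr((0b0111 << 28) | (127 << 13) | (3 << 3) | 2))
```

With these example values, the decoded Level 1 data cache has 128 sets, 4 ways, and 64-byte lines, which corresponds to a 32KB cache.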


To determine the cache topology associated with a PE: 


1. Read the Cache Type Register to find the indexing and tagging policy used for the Level 1 instruction cache. 
This register also provides the size of the smallest cache lines used for the instruction caches, and for the data 
and unified caches. These values are used in cache maintenance instructions. 


2. Read the Cache Level ID Register to find what caches are implemented. The register includes seven Cache 
type fields, for cache levels 1 to 7. Scanning these fields, starting from Level 1, identifies the instruction, data 
or unified caches implemented at each level. This scan ends when it reaches a level at which no caches are 
defined. The Cache Level ID Register also specifies the Level of Unification (LoU) and the Level of 
Coherence (LoC) for the cache implementation. 


3. For each cache identified at stage 2:
° Write to the Cache Size Selection Register to select the required cache. A cache is identified by its
level, and whether it is:
— An instruction cache.
— A data or unified cache.
° Read the Cache Size ID Register to find details of the cache.


G3.4.3 Cacheability, cache allocation hints, and cache transient hints


Cacheability only applies to Normal memory, and is defined independently for Inner and Outer cache locations. All 
types of Device memory are always treated as Non-cacheable. 


As described in Memory types and attributes on page E2-2342, the memory attributes include a cacheability 
attribute that is one of: 


° Non-cacheable. 
° Write-Through cacheable.
° Write-Back cacheable. 


In ARMv8, Cacheability attributes other than Non-cacheable can be complemented by a cache allocation hint. This
is an indication to the memory system of whether allocating a value to a cache is likely to improve performance. In 
addition, it is IMPLEMENTATION DEFINED whether a cache transient hint is supported, see Transient cacheability 
hint. 


The cache allocation hints are assigned independently for read and write accesses, and therefore when the Transient
hint is supported the following cache allocation hints can be used:


For read accesses: Read-Allocate, Transient Read-Allocate, or No Read-Allocate.


For write accesses: Write-Allocate, Transient Write-Allocate, or No Write-Allocate.


Note 


° A Cacheable location with both No Read-Allocate and No Write-Allocate hints is not the same as a 
Non-cacheable location. A Non-cacheable location has coherency guarantees for all observers within the 
system that do not apply for a location that is Cacheable, No Read-Allocate, No Write-Allocate. 





° Implementations can use the cache allocation hints to limit cache pollution to a part of a cache, such as to a
subset of ways. 


° For VMSAv8-32 translation table walks using the Long-descriptor translation table format, the appropriate 
TCR.{IRGNn, ORGNn} fields define the memory attributes of the translation tables, including the 
cacheability. However, this assignment supports only a subset of the cacheability attributes described in this 
section. 





The architecture does not require an implementation to make any use of cache allocation hints. This means an 
implementation might not make any distinction between memory locations with attributes that differ only in their 
cache allocation hint. 
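
One way to picture the relationship between the Cacheability attribute and the allocation hints is the small data model below. It is a sketch only: the class and function names are hypothetical, and the model simply records that hints accompany the cacheable attributes and that a Cacheable, No Read-Allocate, No Write-Allocate location is a different attribute from a Non-cacheable one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Cacheability(Enum):
    NON_CACHEABLE = "Non-cacheable"
    WRITE_THROUGH = "Write-Through cacheable"
    WRITE_BACK = "Write-Back cacheable"


class AllocHint(Enum):
    ALLOCATE = "Allocate"
    TRANSIENT_ALLOCATE = "Transient Allocate"
    NO_ALLOCATE = "No Allocate"


@dataclass(frozen=True)
class MemoryAttr:
    cacheability: Cacheability
    read_hint: Optional[AllocHint] = None   # assigned independently of write_hint
    write_hint: Optional[AllocHint] = None

    def __post_init__(self):
        cacheable = self.cacheability is not Cacheability.NON_CACHEABLE
        hints = (self.read_hint, self.write_hint)
        if cacheable and None in hints:
            raise ValueError("cacheable attributes carry read and write hints")
        if not cacheable and hints != (None, None):
            raise ValueError("Non-cacheable takes no allocation hints")


def differ_only_in_hints(a, b):
    """True if two attributes share a Cacheability value but not their hints.
    An implementation is permitted to treat such attributes identically."""
    return a.cacheability is b.cacheability and a != b


no_alloc = MemoryAttr(Cacheability.WRITE_BACK, AllocHint.NO_ALLOCATE, AllocHint.NO_ALLOCATE)
non_cacheable = MemoryAttr(Cacheability.NON_CACHEABLE)
print(no_alloc == non_cacheable)
```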


Transient cacheability hint 


In ARMv8, it is IMPLEMENTATION DEFINED whether a Transient hint is supported for the VMSAv8-32 translation
scheme when using the Long-descriptor translation table format. In an implementation that supports the Transient 
hint, the Transient hint is a qualifier of the cache allocation hints, and indicates that the benefit of caching is for a 
relatively short period. It indicates that it might be better to restrict allocation of transient entries, to avoid possibly 
casting-out other, less transient, entries. 





Note 


The architecture does not specify what is meant by a relatively short period. 


When using the Short-descriptor translation table format, VMSAv8-32 cannot support the Transient hint.


The description of the MAIR0, MAIR1, HMAIR0, and HMAIR1 registers includes the assignment of the Transient
attribute in an implementation that supports this option. In this assignment: 


° The Transient hint is defined independently for Inner Cacheable and Outer Cacheable memory regions. 


° A single Transient hint applies to both read and write accesses to a memory region. 


G3.4.4 Enabling and disabling the caching of memory accesses in AArch32 state 


In ARMv8, Cacheability control fields can force all memory locations with the Normal memory type to be treated
as Non-cacheable, regardless of their assigned Cacheability attribute. Independent controls are provided for each 
stage of address translation, with separate controls for: 


° Data accesses. These controls also apply to accesses to the translation tables. 


° Instruction accesses.


Note 


These Cacheability controls replace the cache enable controls provided in previous versions of the ARM 
architecture. 








In AArch32 state, the Cacheability control fields and their effects are as follows: 


For the Non-secure PL1&0 translation regime
The Non-secure instance of SCTLR holds the EL1 controls that affect cacheability:
° When the value of SCTLR.C is 0:
— All stage 1 translations for data accesses to Normal memory are Non-cacheable.
— All accesses to the PL1&0 stage 1 translation tables are Non-cacheable.

° When the value of SCTLR.I is 0:
— All stage 1 translations for instruction accesses to Normal memory are Non-cacheable.

° When the value of HCR2.CD is 1:
— All stage 2 translations for data accesses to Normal memory are Non-cacheable.
— All accesses to the PL1&0 stage 2 translation tables are Non-cacheable.

° When the value of HCR2.ID is 1:
— All stage 2 translations for instruction accesses to Normal memory are Non-cacheable.

° When the value of HCR.DC is 1, all Non-secure stage 1 translations and all accesses to the
Non-secure EL1&0 stage 1 translation tables, are treated as accesses to Normal
Non-shareable Inner Write-Back Cacheable Read-Allocate Write-Allocate, Outer
Write-Back Cacheable Read-Allocate Write-Allocate memory, regardless of the value of
SCTLR.C. This applies to translations for both data and instruction accesses.


In addition, when the value of SCTLR.M is 0, indicating that the stage 1 translations are disabled
for the translation regime, then if EL2 is using AArch32 and the value of HCR.DC is 0 or if EL2 is
using AArch64 and the value of HCR_EL2.DC is 0, then:

° If the value of SCTLR.I is 0, instruction accesses to Normal memory from stage 1 of the
translation regime are Outer Shareable, Inner Non-cacheable, Outer Non-cacheable.

° If the value of SCTLR.I is 1, instruction accesses to Normal memory from stage 1 of the
translation regime are Outer Shareable, Inner Write-Through cacheable, Outer
Write-Through cacheable.


Note


° In Non-secure state, the stage 1 and stage 2 cacheability attributes are combined as described 
in Combining the Cacheability attribute on page G4-4087. 


° The Non-secure SCTLR.{C, I} and HCR.DC fields have no effect on the Secure PL1&0 and 
EL2 translation regimes. 


° The HCR2.{ID, CD} fields affect only stage 2 of the Non-secure PL1&0 translation regime. 


° In Non-secure state, the PL1&0 translation regime can be described as the Non-secure
EL1&0 translation regime. This is consistent with the equivalent AArch64 descriptions.
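
The Non-secure PL1&0 controls described above can be summarized as a predicate. This is a minimal sketch, not architectural pseudocode: the function name and the `access` labels are hypothetical, and the sketch assumes HCR.DC is 0, so the HCR.DC override of stage 1 is deliberately not modeled.

```python
def forced_non_cacheable(stage, access, sctlr_c, sctlr_i, hcr2_cd, hcr2_id):
    """True if the control fields force accesses of this kind to Normal memory
    to be Non-cacheable at the given stage of the Non-secure PL1&0 regime.

    access is "data", "instruction", or "table_walk" (a walk of that stage's
    translation tables). Assumes HCR.DC == 0.
    """
    if access not in ("data", "instruction", "table_walk"):
        raise ValueError("unknown access kind")
    if stage == 1:
        if access == "instruction":
            return sctlr_i == 0
        return sctlr_c == 0  # data accesses and stage 1 table walks
    if stage == 2:
        if access == "instruction":
            return hcr2_id == 1
        return hcr2_cd == 1  # data accesses and stage 2 table walks
    raise ValueError("stage must be 1 or 2")


# SCTLR.C == 0 affects data accesses but not instruction accesses at stage 1.
print(forced_non_cacheable(1, "data", sctlr_c=0, sctlr_i=1, hcr2_cd=0, hcr2_id=0))
print(forced_non_cacheable(1, "instruction", sctlr_c=0, sctlr_i=1, hcr2_cd=0, hcr2_id=0))
```

Note the opposite polarity of the two sets of controls: the SCTLR fields force Non-cacheable behavior when 0, and the HCR2 fields do so when 1.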





For the Secure PL1&0 translation regime
The Secure instance of SCTLR holds the controls that determine cacheability:
° When the value of SCTLR.C is 0:
— All data accesses to Normal memory using the Secure PL1&0 translation regime are
Non-cacheable.
— All accesses to the Secure PL1&0 translation tables are Non-cacheable.

° When the value of SCTLR.I is 0:
— All instruction accesses to Normal memory using the Secure PL1&0 translation
regime are Non-cacheable.


In addition, when the value of SCTLR.M is 0, indicating that stage 1 translations are disabled, then:

° If the value of SCTLR.I is 0, instruction accesses to Normal memory from stage 1 of the
translation regime are Outer Shareable, Inner Non-cacheable, Outer Non-cacheable.

° If the value of SCTLR.I is 1, instruction accesses to Normal memory from stage 1 of the
translation regime are Outer Shareable, Inner Write-Through cacheable, Outer
Write-Through cacheable.


Note


The Secure SCTLR.{I, C, M} fields have no effect on the Non-secure PL1&0 and EL2 translation
regimes.





For the EL2 translation regime 
° When the value of HSCTLR.C is 0: 


— All data accesses to Normal memory using the EL2 translation regime are 
Non-cacheable. 


— All accesses to the EL2 translation tables are Non-cacheable. 


° When the value of HSCTLR.I is 0: 


— All instruction accesses to Normal memory using the EL2 translation regime are 
Non-cacheable. 


In addition, when the value of HSCTLR.M is 0, indicating that stage 1 translations are disabled, then:

° If the value of HSCTLR.I is 0, instruction accesses to Normal memory from stage 1 of the
translation regime are Outer Shareable, Inner Non-cacheable, Outer Non-cacheable.

° If the value of HSCTLR.I is 1, instruction accesses to Normal memory from stage 1 of the
translation regime are Outer Shareable, Inner Write-Through cacheable, Outer
Write-Through cacheable.


Note
The HSCTLR.{I, C, M} fields have no effect on the PL1&0 and EL3 translation regimes.





The effect of the SCTLR.C or HSCTLR.C and HCR2.CD bits is reflected in the result of the address translation 
instructions in the PAR. 


Note


° The requirements in this section mean the architecturally required effects of SCTLR.I and HSCTLR.I are 
limited to their effects on caching instruction accesses in unified caches. 


° This specification can give rise to different cacheability attributes between instruction and data accesses to
the same location. Where this occurs, the measures for mismatched memory attributes described in Mismatched
memory attributes on page E2-2352 must be followed to manage the corresponding loss of coherency.
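
The translation-disabled cases described in this section follow the same pattern for each regime, and can be sketched as a single function. The function name and dictionary keys are hypothetical. The argument `i_bit` stands for SCTLR.I or HSCTLR.I, whichever applies to the translation regime, and for the Non-secure PL1&0 regime the sketch assumes the HCR.DC (or HCR_EL2.DC) bit is 0.

```python
def instruction_attrs_translation_disabled(i_bit):
    """Attributes of instruction accesses to Normal memory from stage 1 of a
    translation regime when the M bit for that regime is 0."""
    if i_bit not in (0, 1):
        raise ValueError("i_bit must be 0 or 1")
    cacheability = "Write-Through cacheable" if i_bit == 1 else "Non-cacheable"
    return {
        "shareability": "Outer Shareable",
        "inner": cacheability,
        "outer": cacheability,
    }


print(instruction_attrs_translation_disabled(0))
print(instruction_attrs_translation_disabled(1))
```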





G3.4.5 Behavior of caches at reset 
In ARMv8:
° All caches reset to IMPLEMENTATION DEFINED states that might be UNKNOWN. 


° The Cacheability control fields described in Enabling and disabling the caching of memory accesses in 
AArch32 state on page G3-3993 reset to values that force all memory locations to be treated as 
Non-cacheable. 





Note 


This applies only to the controls that apply to the Translation regime that is used by the Exception level, PE 
mode, and Security state entered on reset. 





° An implementation can require the use of a specific cache initialization routine to invalidate its storage array
before caching is enabled. The exact form of any required initialization routine is IMPLEMENTATION DEFINED, 
and the routine must be documented clearly as part of the documentation of the device. 


° If an implementation permits cache hits when the Cacheability control fields force all memory locations to 
be treated as Non-cacheable then the cache initialization routine must: 


— Provide a mechanism to ensure the correct initialization of the caches.

— Be documented clearly as part of the documentation of the device.

In particular, if an implementation permits cache hits when the Cacheability controls force all memory
locations to be treated as Non-cacheable, and the cache contents are not invalidated at reset, the initialization
routine must avoid any possibility of running from an uninitialized cache. It is acceptable for an initialization
routine to require a fixed instruction sequence to be placed in a restricted range of memory.


° ARM recommends that whenever an invalidation routine is required, it is based on the ARMv8 cache
maintenance instructions.


Similar rules apply to: 
° Branch predictor behavior, see Behavior of the branch predictors at reset on page G3-4003. 
° TLB behavior, see TLB behavior at reset on page G4-4090. 


G3.4.6 About cache maintenance in ARMv8 


The following sections give general information about the ARMv8 cache maintenance:
° Terms used in describing the maintenance instructions on page G3-3996.
° The ARMv8 abstraction of the cache hierarchy on page G3-3999.


The following sections describe the AArch32 state cache maintenance instructions for ARMv8: 


° AArch32 instruction cache maintenance instructions (IC*) on page G3-4001. 
° AArch32 data cache maintenance instructions (DC*) on page G3-4001.


Note





Some descriptions of the cache maintenance instructions refer to the cacheability of the address on which the 
instruction operates. The Cacheability of an address is determined by the applicable translation table entry for that 
address, as modified by any applicable System register Cacheability controls, such as the SCTLR.{I, C} controls. 


Terms used in describing the maintenance instructions


Cache maintenance instructions are defined to act on particular memory locations. Instructions can be defined: 
° By the address of the memory location to be maintained, referred to as operating by VA. 


° By a mechanism that describes the location in the hardware of the cache, referred to as operating by set/way.


In addition, for instruction caches and branch predictors, there are instructions that invalidate all entries.


The following subsections define the terms used in the descriptions of the cache maintenance instructions: 


° Terminology for cache maintenance instruction operating by virtual address, VA. 
° Terminology for cache maintenance instructions operating by set/way. 
° Terminology for Clean, Invalidate, and Clean and Invalidate instructions on page G3-3997.


Terminology for cache maintenance instruction operating by virtual address, VA 


In a VMSA implementation, the addresses used by the PE are VAs. When all applicable stages of translation are 
disabled, the VA is identical to the PA. 





Note 


For more information about memory system behavior when address translation is disabled, see The effects of 
disabling address translation stages on VMSAv8-32 behavior on page G4-4031. 





For the cache maintenance instructions, any instruction described as operating by VA includes as part of any required
VA to PA translation:


° For an instruction executed at EL1, the current system Address Space IDentifier, ASID.

° The current Security state.

° Whether the instruction was performed from Hyp mode, or from Non-secure EL1 state.

° For an instruction executed from a Non-secure EL1 state, the Virtual Machine IDentifier, VMID.


For a data or unified cache maintenance instruction by VA, the operation cannot generate a Data Abort exception 
for a Domain fault or a Permission fault, except for the Permission fault cases described in: 


° Data cache maintenance instructions (DC*) on page D3-1704. 


° Stage 2 fault on a stage 1 translation table walk on page G4-4117.


For an instruction cache maintenance instruction by VA:


° It is IMPLEMENTATION DEFINED whether the operation can generate a Data Abort exception for a Translation 
fault or an Access flag fault. 


° The operation cannot generate a Data Abort exception for a Domain fault or a Permission fault, except for
the Permission fault case described in Stage 2 fault on a stage 1 translation table walk on page G4-4117.


For more information about these faults, see MMU faults in AArch32 state on page G4-4118. 


Terminology for cache maintenance instructions operating by set/way 


Cache maintenance instructions that operate by set/way refer to the particular structures in a cache. Three parameters
describe the location in a cache hierarchy that an instruction works on. These parameters are: 


Level The cache level of the hierarchy. The number of levels of cache is IMPLEMENTATION DEFINED. The 
cache levels that can be managed using the architected cache maintenance instructions that operate 
by set/way can be determined from the CLIDR. 


In the ARM architecture, the lower numbered cache levels are those closest to the PE. See Memory
hierarchy on page E2-2318.


Set Each level of a cache is split up into a number of sets. Each set is a set of locations in a cache level 
to which an address can be assigned. Usually, the set number is an IMPLEMENTATION DEFINED 
function of an address. 


In the ARM architecture, sets are numbered from 0. 


Way The associativity of a cache is the number of locations in a set to which a specific address can be
assigned. The way number specifies one of these locations. 


In the ARM architecture, ways are numbered from 0. 





Note 


Because the allocation of a memory address to a cache location is entirely IMPLEMENTATION DEFINED, ARM expects 
that most portable software will use only the cache maintenance instructions by set/way as single steps in a routine 
to perform maintenance on the entire cache. 
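
A routine of the kind this Note describes steps through every set and way of one cache level. The sketch below only computes the operand values for such a loop. It is illustrative: the operand layout used (Level in bits [3:1], Set starting at bit log2(line length in bytes), Way in the top bits) is an assumption taken from the usual set/way register format and must be checked against the description of the instruction being used, and the function names are hypothetical. Issuing the maintenance instruction and the required barriers is not shown.

```python
def set_way_operand(level, cache_set, way, line_bytes, num_sets, associativity):
    """Encode the register operand for a set/way maintenance instruction.

    level is 1-based. line_bytes, num_sets and associativity are the values
    decoded from the Cache Size Identification Register for the selected cache.
    """
    l = line_bytes.bit_length() - 1        # log2(line length in bytes)
    a = (associativity - 1).bit_length()   # log2(associativity), rounded up
    if not (1 <= level <= 7 and 0 <= cache_set < num_sets and 0 <= way < associativity):
        raise ValueError("level, set or way out of range")
    value = ((level - 1) << 1) | (cache_set << l)
    if a > 0:
        value |= way << (32 - a)           # way number occupies the top a bits
    return value & 0xFFFFFFFF


def all_set_way_operands(level, line_bytes, num_sets, associativity):
    """Yield one operand per (way, set) pair, covering the whole cache level."""
    for way in range(associativity):
        for cache_set in range(num_sets):
            yield set_way_operand(level, cache_set, way, line_bytes, num_sets, associativity)


# Example: a Level 1 cache with 128 sets, 4 ways, and 64-byte lines.
operands = list(all_set_way_operands(1, 64, 128, 4))
print(len(operands), hex(operands[-1]))
```

Because the field positions depend on the geometry of the selected cache, the operand has to be recomputed for each cache level rather than reused across levels.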





Terminology for Clean, Invalidate, and Clean and Invalidate instructions 
Caches introduce coherency problems in two possible directions: 


1. An update to a memory location by a PE that accesses a cache might not be visible to other observers that 
can access memory. This can occur because new updates are still in the cache and are not visible yet to the 
other observers that do not access that cache. 


2. Updates to memory locations by other observers that can access memory might not be visible to a PE that 
accesses a cache. This can occur when the cache contains an old, or stale, copy of the memory location that 
has been updated. 


The Clean and Invalidate instructions address these two issues. The definitions of these instructions are: 


Clean A cache clean instruction ensures that updates made by an observer that controls the cache are made 
visible to other observers that can access memory at the point to which the instruction is performed. 
Once the Clean has completed, the new memory values are guaranteed to be visible to the point to 
which the instruction is performed, for example to the Point of Unification. 


The cleaning of a cache entry from a cache can overwrite memory that has been written by another 
observer only if the entry contains a location that has been written to by an observer in the 
Shareability domain of that memory location. 


Invalidate A cache invalidate instruction ensures that updates made visible by observers that access memory 
at the point to which the invalidate is defined, are made visible to an observer that controls the cache. 
This might result in the loss of updates to the locations affected by the invalidate instruction that 
have been written by observers that access the cache, if those updates have not been cleaned from 
the cache since they were made. 


If the address of an entry on which the invalidate instruction operates is Normal, Non-cacheable or 
any type of Device memory then an invalidate instruction also ensures that this address is not 
present in the cache. 


Note 


Entries for addresses that are Normal Cacheable can be allocated to the cache at any time, and so 
the cache invalidate instruction cannot ensure that the address is not present in a cache. 





Clean and Invalidate 
A cache clean and invalidate instruction behaves as the execution of a clean instruction followed 
immediately by an invalidate instruction. Both instructions are performed to the same location. 
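
The difference between the three operations can be seen in a toy model. The sketch below is not an architectural definition: it assumes a single write-back cache with one-word lines in front of a dictionary that stands in for memory at the point to which the instructions are performed, and the class and method names are hypothetical.

```python
class ToyWriteBackCache:
    """A toy write-back cache in front of a dict standing in for memory."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.lines = {}                      # address -> (value, dirty)

    def write(self, addr, value):
        self.lines[addr] = (value, True)     # update is held only in the cache

    def read(self, addr):
        if addr not in self.lines:           # miss: allocate a clean copy
            self.lines[addr] = (self.memory[addr], False)
        return self.lines[addr][0]

    def clean(self, addr):
        value, dirty = self.lines.get(addr, (None, False))
        if dirty:                            # make the update visible in memory
            self.memory[addr] = value
            self.lines[addr] = (value, False)

    def invalidate(self, addr):
        self.lines.pop(addr, None)           # any uncleaned update is lost

    def clean_and_invalidate(self, addr):
        self.clean(addr)
        self.invalidate(addr)
```

In this model a write followed by invalidate() discards the new value, whereas a write followed by clean_and_invalidate() preserves it in memory, which mirrors the loss of updates described for the Invalidate instruction.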


The points to which a cache maintenance instruction can be defined differ depending on whether the instruction 
operates by VA or by set/way: 


• For instructions operating by set/way, the point is defined to be to the next level of caching. For the All 
operations, the point is defined as the Point of Unification for each location held in the cache. 

• For instructions operating by VA, two conceptual points are defined: 


Point of Coherency (PoC) 
For a particular VA, the PoC is the point at which all agents that can access memory are 
guaranteed to see the same copy of a memory location. In many cases, this is effectively the main 
system memory, although the architecture does not prohibit the implementation of caches beyond 
the PoC that have no effect on the coherence between memory system agents. 


Note 


The presence of system caches can affect the definition of the point of coherency as described in 
System level caches on page G3-4012. 








Point of Unification (PoU) 
The PoU for a PE is the point by which the instruction and data caches and the translation table 
walks of that PE are guaranteed to see the same copy of a memory location. In many cases, the 
Point of Unification is the point in a uniprocessor memory system by which the instruction and 
data caches and the translation table walks have merged. 
The PoU for an Inner Shareable Shareability domain is the point by which the instruction and 
data caches and the translation table walks of all the PEs in that Inner Shareable Shareability 
domain are guaranteed to see the same copy of a memory location. Defining this point permits 
self-modifying software to ensure future instruction fetches are associated with the modified 
version of the software by using the standard correctness policy of: 
1. Clean data cache entry by address. 


2. Invalidate instruction cache entry by address. 
The following fields in the CLIDR relate to these conceptual points: 


LoC, Level of Coherence 


This field defines the last level of cache that must be cleaned or invalidated when cleaning or 
invalidating to the Point of Coherency. The LoC value is a cache level, so, for example, if LoC 
contains the value 3: 


° A clean to the Point of Coherency operation requires the level 1, level 2 and level 3 caches 
to be cleaned. 
° Level 4 cache is the first level that does not have to be maintained. 


If the LoC field value is 0x0, this means that no levels of cache need to be cleaned or invalidated 
when cleaning or invalidating to the Point of Coherency. 


If the LoC field value is a nonzero value that corresponds to a level that is not implemented, this 
indicates that all implemented caches are before the Point of Coherency. 

LoUU, Level of Unification, uniprocessor 
This field defines the last level of cache that must be cleaned or invalidated when cleaning or 
invalidating to the Point of Unification for the PE. As with LoC, the LoUU value is a cache level. 
If the LoUU field value is 0x0, this means that no levels of cache need to be cleaned or invalidated 
when cleaning or invalidating to the Point of Unification. 
If the LoUU field value is a nonzero value that corresponds to a level that is not implemented, 
this indicates that all implemented caches are before the Point of Unification. 


LoUIS, Level of Unification, Inner Shareable 
In any implementation: 
• This field defines the last level of cache that must be cleaned or invalidated when cleaning 
or invalidating to the Point of Unification for the Inner Shareable Shareability domain. As 
with LoC, the LoUIS value is a cache level. 

• If the LoUIS field value is 0x0, this means that no levels of cache need to be cleaned or 
invalidated when cleaning or invalidating to the Point of Unification for the Inner 
Shareable Shareability domain. 

• If the LoUIS field value is a nonzero value that corresponds to a level that is not 
implemented, this indicates that all implemented caches are before the Point of 
Unification. 

For more information, see CLIDR, Cache Level ID Register on page G6-4274. 
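
The way software might use these fields can be sketched as follows. This is an illustrative model only: it assumes the field positions from the CLIDR register description (LoUIS in bits [23:21], LoC in bits [26:24], LoUU in bits [29:27]), and the function names are hypothetical.

```python
def clidr_levels(clidr: int) -> dict:
    """Extract the LoUIS, LoC and LoUU fields from a 32-bit CLIDR value."""
    return {
        "LoUIS": (clidr >> 21) & 0x7,
        "LoC": (clidr >> 24) & 0x7,
        "LoUU": (clidr >> 27) & 0x7,
    }


def levels_to_maintain(field_value: int) -> list:
    """Cache levels that must be cleaned or invalidated for a field value.

    A value of 0 yields an empty list: no level needs maintenance.
    """
    return list(range(1, field_value + 1))
```

With a LoC value of 3, levels_to_maintain() returns levels 1, 2 and 3, matching the example given for the LoC field.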


The ARMv8 abstraction of the cache hierarchy 


The following subsections describe the ARMv8 abstraction of the cache hierarchy: 
° Cache maintenance instructions that operate by address. 


° Cache maintenance instructions that operate by set/way. 


Cache maintenance instructions that operate by address 


The address-based cache maintenance instructions are described as operating by VA. Each of these instructions is 
always qualified as being either: 


• Performed to the Point of Coherency. 

• Performed to the Point of Unification. 


See Terms used in describing the maintenance instructions on page G3-3996 for definitions of Point of Coherency 
and Point of Unification, and more information about possible meanings of VA. 


AArch32 cache and branch predictor maintenance instructions lists the address-based maintenance instructions. 


The CTR holds minimum line length values for: 
• The instruction caches. 
• The data and unified caches. 

These values support efficient invalidation of a range of addresses, because each value is the most efficient address 
stride to use when applying a sequence of address-based maintenance instructions to a range of addresses. 


For the Invalidate data or unified cache line by VA instruction, the Cache Write-back Granule field of the CTR 
defines the maximum granule that a single invalidate instruction can invalidate. This meaning of the Cache 
Write-back Granule is in addition to its defining the maximum size that can be written back. 
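
The use of the CTR line lengths as an address stride can be sketched as below. The sketch assumes the field positions from the CTR register description (IminLine in bits [3:0], DminLine in bits [19:16], each holding the base-2 logarithm of the line length in words). The helper names are hypothetical, and the placement of the DSB and ISB barriers in the sequence is an assumption, not a statement of the required sequence. Any branch predictor maintenance that an implementation requires is omitted.

```python
WORD_BYTES = 4


def ctr_min_line_bytes(ctr: int) -> tuple:
    """Return (instruction, data/unified) minimum line lengths in bytes."""
    imin = ctr & 0xF
    dmin = (ctr >> 16) & 0xF
    return WORD_BYTES << imin, WORD_BYTES << dmin


def line_addresses(start: int, size: int, line_bytes: int) -> list:
    """Addresses to pass to a by-VA instruction to cover [start, start + size)."""
    if size <= 0:
        return []
    first = start & ~(line_bytes - 1)    # align down to the line holding start
    return list(range(first, start + size, line_bytes))


def sync_code_range(start: int, size: int, ctr: int) -> list:
    """Ordered (mnemonic, address) steps after writing new instructions."""
    i_line, d_line = ctr_min_line_bytes(ctr)
    steps = [("DCCMVAU", a) for a in line_addresses(start, size, d_line)]
    steps.append(("DSB", None))
    steps += [("ICIMVAU", a) for a in line_addresses(start, size, i_line)]
    steps += [("DSB", None), ("ISB", None)]
    return steps
```

Aligning the first address down to a line boundary is a design choice of the sketch: the instructions themselves impose no alignment restriction, but stepping from an unaligned start could skip the final line of the range.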


Cache maintenance instructions that operate by set/way 


AArch32 cache and branch predictor maintenance instructions lists the set/way-based maintenance instructions. 
Some encodings of these instructions include a required field that specifies the cache level for the instruction: 


• A clean instruction cleans from the level of cache specified through to at least the next level of cache, moving 
further from the PE. 

• An invalidate instruction invalidates only at the level specified. 











G3.4.7 AArch32 cache and branch predictor maintenance instructions 
The instruction and data cache maintenance instructions have the same functionality in AArch32 state and in 
AArch64 state. Table G3-3 on page G3-4000 shows these system instructions. Instructions that take an argument 
include Rt in the instruction description. 
AArch32 state also provides branch predictor maintenance instructions. 
Note 
° In Table G3-3 on page G3-4000 the Point of Unification is the Point of Unification of the PE executing the 
cache maintenance instruction. 
° In AArch32 state, all of the maintenance instructions are available from EL1 or higher. 
° In AArch64 state, branch predictors are always invisible to software, and therefore AArch64 state does not 
provide any branch predictor maintenance instructions. 

Table G3-3 AArch32 System instructions for cache maintenance 

Register         Instruction 

Instruction cache maintenance instructions 
ICIALLUIS        Invalidate all to Point of Unification, Inner Shareable 
ICIALLU          Invalidate all to Point of Unification 
ICIMVAU, Rt      Invalidate by virtual address to Point of Unification 

Data cache maintenance instructions 
DCIMVAC, Rt      Invalidate by virtual address to Point of Coherency 
DCISW, Rt        Invalidate by set/way 
DCCMVAC, Rt      Clean by virtual address to Point of Coherency 
DCCSW, Rt        Clean by set/way 
DCCMVAU, Rt      Clean by virtual address to Point of Unification 
DCCIMVAC, Rt     Clean and invalidate by virtual address to Point of Coherency 
DCCISW, Rt       Clean and invalidate by set/way 

Branch prediction maintenance instructions 
BPIMVA, Rt       Invalidate the virtual address from the branch predictors 
BPIALLIS, Rt     Invalidate all entries from branch predictors, Inner Shareable 
BPIALL, Rt       Invalidate all entries from branch predictors 





A DSB or DMB instruction intended to ensure the completion of cache or branch predictor maintenance instructions 
must have an access type of both loads and stores. 


In an implementation where the branch predictors are architecturally invisible, the BPIMVA, BPIALLIS, and 
BPIALL instructions can execute as NOPs. 


The following subsections give more information about these instructions: 

• AArch32 instruction cache maintenance instructions (IC*) on page G3-4001. 
• AArch32 data cache maintenance instructions (DC*) on page G3-4001. 
• Branch predictors on page G3-4002. 
• General requirements for the scope of cache and branch predictor maintenance instructions on 
page G3-4003. 
• Effects of instructions that operate by virtual address to the Point of Coherency on page G3-4004. 
• Effects of instructions that operate by virtual address but not to the Point of Coherency on page G3-4004. 
• Effects of All and set/way maintenance instructions on page G3-4005. 
• Effects of virtualization and security on the AArch32 cache maintenance instructions on page G3-4005. 
• Boundary conditions for cache maintenance instructions on page G3-4007. 
• Ordering of cache and branch predictor maintenance instructions on page G3-4007. 
• Performing cache maintenance instructions on page G3-4009. 


AArch32 instruction cache maintenance instructions (IC*) 


Where an address argument for these instructions is required, it takes the form of a 32-bit register that holds the 
virtual address argument. No alignment restrictions apply for this address. 


All instruction cache maintenance instructions can execute in any order relative to other instruction cache 
maintenance instructions, data cache maintenance instructions, and loads and stores, unless a DSB is executed 
between the instructions. 


An instruction cache maintenance instruction can complete at any time after it is executed, but is only guaranteed 
to be complete, and its effects visible to other observers, following a DSB instruction executed by the PE that executed 
the cache maintenance instruction. 


See also Ordering of cache and branch predictor maintenance instructions on page G3-4007. 


AArch32 data cache maintenance instructions (DC*) 


Where an address argument for these instructions is required, it takes the form of a 32-bit register that holds the 
virtual address argument. No alignment restrictions apply for this address. 


Data cache maintenance instructions that take a set/way/level argument take a 32-bit register. 


A data or unified cache invalidation by virtual address instruction performed in a Non-secure EL1 mode must not 
change data in any location for which the stage 2 translation permissions do not permit write access. Where such a 
permission violation occurs, it is IMPLEMENTATION DEFINED whether: 


° A stage 2 Permission fault is generated for the DCIMVAC operation. 
° The DCIMVAC operation is upgraded to DCCIMVAC. 


DCIMVAC and DCISW at EL1 are performed by the PE as clean and invalidate, that is, as DCCIMVAC or DCCISW, if 
all of the following apply: 

• EL2 is implemented. 
• PL1&0 stage two address translation is enabled, meaning either: 
— EL2 is using AArch32 and the value of HCR.VM is 1. 
— EL2 is using AArch64 and the value of HCR_EL2.VM is 1. 
• EL3 is not implemented, or either: 
— EL3 is using AArch32 and the value of SCR.NS is 1. 
— EL3 is using AArch64 and the value of SCR_EL3.NS is 1. 


Note 


Similarly, DCIMVAC and DCISW at EL1 must be performed as clean and invalidate, that is DCCIMVAC and 
DCCISW at EL1 when EL1 is using AArch32, if all of the following apply: 

• EL2 is implemented. 

• EL2 is using AArch32 and HCR.VM is set to the value of 1, or EL2 is using AArch64 and HCR_EL2.VM 
is set to the value of 1. 

• EL3 is using AArch32 and SCR.NS is set to the value of 1, or EL3 is using AArch64 and SCR_EL3.NS is set to 
the value of 1, or EL3 is not implemented. 
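
The three conditions can be read as a single predicate. The following sketch is an illustrative model only, with hypothetical parameter names that stand in for the register fields named above.

```python
def invalidate_performed_as_clean_invalidate(el2_implemented: bool,
                                             stage2_enabled: bool,
                                             el3_implemented: bool,
                                             ns: bool) -> bool:
    """True if DCIMVAC or DCISW at EL1 is performed as DCCIMVAC or DCCISW.

    stage2_enabled stands for HCR.VM == 1 (EL2 using AArch32) or
    HCR_EL2.VM == 1 (EL2 using AArch64). ns stands for SCR.NS == 1 or
    SCR_EL3.NS == 1, and is ignored when EL3 is not implemented.
    """
    non_secure = (not el3_implemented) or ns
    return el2_implemented and stage2_enabled and non_secure
```

For example, with EL2 implemented and stage 2 translation enabled, the invalidate is upgraded in Non-secure state but not when EL3 is implemented and the NS bit is 0.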





If a data cache maintenance by set/way instruction specifies a set, way, or level argument that is larger than the value 
supported by the implementation then the instruction is CONSTRAINED UNPREDICTABLE, see Out of range values of 
the Set/Way/Index fields in cache maintenance instructions on page K1-5467 or the instruction description. 


If a memory fault that sets FAR for the translation regime applicable for the cache maintenance instruction is 
generated from a data cache maintenance instruction, the FAR holds the address specified in the register argument 
of the instruction. 

See also Ordering of cache and branch predictor maintenance instructions on page G3-4007. 


Branch predictors 


In AArch32 state it is IMPLEMENTATION DEFINED whether branch prediction is architecturally visible. This means 
that under some circumstances software must perform branch predictor maintenance to avoid incorrect execution 
caused by out-of-date entries in the branch predictor. For example, to ensure correct operation it might be necessary 
to invalidate branch predictor entries on a change to instruction memory, or a change of instruction address mapping. 
For more information, see Specific requirements for branch predictor maintenance instructions. 


In an implementation where the branch predictors are architecturally invisible, the branch predictor maintenance 
instructions can execute as NOPs. 


An invalidate all operation on the branch predictor ensures that any location held in the branch predictor has no 
functional effect on execution. An invalidate branch predictor by VA instruction operates on the address of the 
branch instruction, but can affect other branch predictor entries. 





Note 


The architecture does not make visible the range of addresses in a branch predictor to which the invalidate operation 
applies. This means the address used in the invalidate by VA operation must be the address of the branch to be 
invalidated. 





If branch prediction is architecturally visible, an instruction cache invalidate all operation also invalidates all branch 
predictors. 


See also Ordering of cache and branch predictor maintenance instructions on page G3-4007. 


Specific requirements for branch predictor maintenance instructions 


If, for a given translation regime and a given ASID and VMID as appropriate, the instructions at any virtual address 
change, then branch predictor maintenance instructions must be performed to invalidate entries in the branch 
predictor, to ensure that the change is visible to subsequent execution. This maintenance is required when writing 
new values to instruction locations. It can also be required as a result of any of the following situations that change 
the translation of a virtual address to a physical address, if, as a result of the change to the translation, the instructions 
at the virtual addresses change: 


• For any translation regime other than the Non-secure PL1&0 translation regime, enabling or disabling stage 1 
translations. 

• For the Non-secure PL1&0 translation regime: 

— When stage 2 translations are enabled, enabling or disabling stage 1 translations unless accompanied 
by a change of VMID. 

— When stage 2 translations are disabled, enabling or disabling stage 1 translations. 

— Enabling or disabling stage 2 translations. 

• Writing new mappings to the translation tables. 

• Any change to the TTBR0, TTBR1, or TTBCR registers, unless: 

— For a change to the Secure PL1&0 translation regime, the change is accompanied by a change to the 
ASID. 

— For a change to the stage 1 translations of the Non-secure PL1&0 translation regime, the change is 
accompanied by a change to the ASID or a change to the VMID. 

• Any change to the VTTBR or VTCR registers, unless accompanied by a change to the VMID. 

Note 


Invalidation is not required if the changes to the translations are such that the instructions associated with the 
non-faulting translations of a virtual address, for a given translation regime and a given ASID and VMID, as 
appropriate, remain unchanged throughout the sequence of changes to the translations. Examples of translation 
changes to which this applies are: 





° Changing a valid translation to a translation that generates an MMU fault. 


° Changing a translation that generates an MMU fault to a valid translation. 





Failure to invalidate entries might give CONSTRAINED UNPREDICTABLE results, caused by the execution of old 
branches. For more information, see Ordering of cache and branch predictor maintenance instructions on 
page G3-4007. 


Note 


• In ARMv8, there is no requirement to use the branch predictor maintenance operations to invalidate the 
branch predictor after: 

— Changing the ContextID or VMID. 

— A cache maintenance instruction that is identified as also flushing the branch predictors, see AArch32 
cache and branch predictor maintenance instructions on page G3-3999. 

Cache maintenance instructions, functional group on page G4-4201 shows the branch predictor maintenance 
operations in a VMSA implementation. 


Behavior of the branch predictors at reset 


In ARMv8: 
• If branch predictors are not architecturally invisible: 
— The branch predictors reset to an IMPLEMENTATION DEFINED state that might be UNKNOWN. 
— The branch predictors are disabled at reset. 

• An implementation can require the use of a specific branch predictor initialization routine to invalidate the 
branch predictor storage array before it is enabled. The exact form of any required initialization routine is 
IMPLEMENTATION DEFINED, but the routine must be documented clearly as part of the documentation of the 
device. 

• ARM recommends that whenever an invalidation routine is required, it is based on the ARMv8 branch 
predictor maintenance operations. 

Similar rules apply: 
• To cache behavior, see Behavior of caches at reset on page G3-3995. 
• To TLB behavior, see TLB behavior at reset on page G4-4090. 


General requirements for the scope of cache and branch predictor maintenance 
instructions 


The ARMv8 specification of the cache maintenance and branch predictor instructions describes what each 
instruction is guaranteed to do in a system. It does not limit other behaviors that might occur, provided they are 
consistent with the requirements described in General behavior of the caches on page G3-3989, Behavior of caches 
at reset on page G3-3995, and Preloading caches on page E2-2321. 


This means that as a side-effect of a cache maintenance instruction: 
• Any location in the cache might be cleaned. 

• Any unlocked location in the cache might be cleaned and invalidated. 


As a side-effect of a branch predictor maintenance instruction, any entry in the branch predictor might be 
invalidated. 

Note 


ARM recommends that, for best performance, such side-effects are kept to a minimum. ARM strongly recommends 
that the side-effects of operations performed in Non-secure state do not have a significant performance impact on 
execution in Secure state. 








Effects of instructions that operate by virtual address to the Point of Coherency 


For Normal memory that is not Inner Non-cacheable, Outer Non-cacheable, these instructions must affect the caches 
of other PEs in the Shareability domain described by the Shareability attributes of the VA supplied with the 
instruction. 


For Device memory and Normal memory that is Inner Non-cacheable, Outer Non-cacheable, these instructions must 
affect the caches of all PEs in the Outer Shareable Shareability domain of the PE on which the instruction is 
operating. 


In all cases, for any affected PE, these instructions affect all data and unified caches to the Point of Coherency. 


Table G3-4 PEs affected by cache maintenance instructions to the Point of Coherency 

Shareability       PEs affected                                                         Effective to 
Non-shareable      The PE performing the operation                                      The Point of Coherency of the entire system 
Inner Shareable    All PEs in the same Inner Shareable Shareability domain as the PE    The Point of Coherency of the entire system 
                   performing the operation 
Outer Shareable    All PEs in the same Outer Shareable Shareability domain as the PE    The Point of Coherency of the entire system 
                   performing the operation 





Effects of instructions that operate by virtual address but not to the Point of Coherency 


The following instructions operate by virtual address but not to the Point of Coherency: 

° Clean data or unified cache line by MVA to the Point of Unification, DCCMVAU. 
° Invalidate instruction cache line by MVA to Point of Unification, ICIMVAU. 

° Invalidate by MVA from branch predictors, BPIMVA. 


For these instructions, Table G3-5 shows how, for a VA in a Normal or Device memory location, the Shareability 
attribute of the VA determines the minimum set of PEs affected, and the point to which the instruction must be 
effective. 


Table G3-5 PEs affected by cache maintenance instructions to the Point of Unification 

Shareability       PEs affected                      Effective to 
Non-shareable      The PE executing the instruction  The Point of Unification of instruction cache fills, data cache fills and 
                                                     write-backs, and translation table walks, on the PE executing the instruction 
Inner Shareable    All PEs in the same Inner         The Point of Unification of instruction cache fills, data cache fills and 
or                 Shareable Shareability domain     write-backs, and translation table walks, of all PEs in the same Inner 
Outer Shareable    as the PE executing the           Shareable Shareability domain as the PE executing the instruction 
                   instruction 

Note 


The set of PEs guaranteed to be affected is never greater than the PEs in the Inner Shareable Shareability domain 
containing the PE executing the instruction. 

Effects of All and set/way maintenance instructions 


The ICIALLU, BPIALL and DC* set/way instructions apply only to the caches and branch predictors of the PE that 
performs the instruction. If the branch predictors are architecturally-visible, ICIALLU also performs a BPIALL 
operation. 


The ICIALLUIS and BPIALLIS instructions can affect the caches and branch predictors of all PEs in the same Inner 
Shareable Shareability domain as the PE that performs the instruction. If the branch predictors are 
architecturally-visible, ICIALLUIS also performs a BPIALLIS operation. These instructions have an effect to the 
Point of Unification of instruction cache fills, data cache fills, and write-backs, and translation table walks, of all 
PEs in the same Inner Shareable Shareability domain. 


Note 


The possible presence of system caches, as described in System level caches on page G3-4012, means the architecture 
does not guarantee that all levels of cache can be maintained using set/way instructions. 








Effects of virtualization and security on the AArch32 cache maintenance instructions 


Each Security state has its own physical address space, and therefore cache entries are associated with physical 
address space. In addition, cache maintenance and branch predictor instructions performed in Non-secure state have 
to take account of: 


• Whether the instruction was performed at EL1 or at EL2. 
• For instructions that operate by VA, the current VMID. 


Table G3-6 shows the effects of virtualization and security on these maintenance instructions. 


Table G3-6 Effects of virtualization and security on the AArch32 cache maintenance instructions 

Cache maintenance             Security 
instructions                  state         Specified entry 

Data or unified cache maintenance instructions 

Invalidate, Clean, or Clean   Either        All lines that hold the PA that, in the current translation regime, are mapped to by 
and Invalidate by VA:                       the combination of all of: 
DCIMVAC, DCCMVAC,                           • The specified VA. 
DCCMVAU, DCCIMVAC                           • For an instruction executed at EL1, the current ASID if the location is 
                                              mapped to by a non-global page. 
                                            • For an instruction executed at Non-secure EL1, the current VMID (a). 

Invalidate, Clean, or Clean   Non-secure    Line specified by set/way provided that the entry comes from the Non-secure PA 
and Invalidate by set/way:                  space. 
DCISW, DCCSW,                 Secure        Line specified by set/way regardless of the PA space that the entry has come from. 
DCCISW 

Instruction cache maintenance instructions 

Invalidate by VA:             Either        All lines corresponding to the specified VA (b) in the current translation regime and: 
ICIMVAU                                     • For an instruction executed at EL1 or EL0, the current ASID. 
                                            • For an instruction executed at Non-secure EL1 or Non-secure EL0, the 
                                              current VMID (a). 

Invalidate All: ICIALLU,      • Can invalidate any unlocked entry in the instruction cache. 
ICIALLUIS                     • Are required to invalidate any entries relevant to the software component that 
                                executed it. The Non-secure and Secure descriptions give more information: 
                              Non-secure    An instruction executed at EL1 must operate on all instruction 
                                            cache lines that contain entries associated with the current virtual 
                                            machine, meaning any entry with the current VMID (a). 
                                            An instruction executed at EL2 must operate on all instruction 
                                            cache lines that contain entries that can be accessed from 
                                            Non-secure state. 
                              Secure        The instruction must invalidate all instruction cache lines. 

Branch predictor instructions (c) 

Invalidate by VA: BPIMVA      Either        All lines that, in the current translation regime, are mapped to by the combination 
                                            of all of: 
                                            • The specified VA. 
                                            • For an instruction executed at EL1 or EL0, the current ASID if the location 
                                              is mapped to by a non-global page. 
                                            • For an instruction executed at Non-secure EL1 or EL0, the current VMID (a). 

Invalidate all: BPIALL,       • Can invalidate any unlocked entry in the branch predictor. 
BPIALLIS                      • Are required to invalidate any entries relevant to the software component that executed it. 
                                The Non-secure and Secure descriptions give more information. 


a. Dependencies on the VMID apply even when HCR_EL2.VM is set to 0. However, VTTBR_EL2.VMID resets to zero, meaning there is a 
valid VMID from reset. 

b. The type of instruction cache used affects the interpretation of the specified entries in this table such that: 
• For a PIPT instruction cache, the cache maintenance applies to all entries whose physical address corresponds to the specified address. 
• For a VIPT instruction cache, the cache maintenance applies to entries whose virtual index and physical tag correspond to the specified 
address. 
• For an ASID and VMID tagged VIVT instruction cache, the cache maintenance applies to entries whose virtual address corresponds to 
the specified address. 
For information on types of instruction cache, see Instruction caches on page G4-4106. 

c. In an implementation where the branch predictors are architecturally invisible, these instructions can execute as NOPs. 


For locked entries and entries that might be locked, the behavior of cache maintenance instructions described in The 
interaction of cache lockdown with cache maintenance instructions on page G3-4010 applies. 


With an implementation that generates aborts if entries are locked or might be locked in the cache, when the use of 
lockdown aborts is enabled, these aborts can occur on any cache maintenance instructions. 


In an implementation that includes EL2: 


° The architecture does not require cache cleaning when switching between virtual machines. Cache 
invalidation by set/way must not present an opportunity for one virtual machine to corrupt state associated 
with a second virtual machine. To ensure this requirement is met, Non-secure clean by set/way operations 
can be upgraded to clean and invalidate by set/way. 

• The AArch32 Data cache invalidate instructions DCIMVAC and DCISW perform a cache clean as well as 
a cache invalidate, meaning they operate as DCCIMVAC and DCCISW respectively, if both of the following 
apply: 

— The value of HCR.VM is 1. 


— The instruction is executed in Non-secure state, or EL3 is not implemented. 


° The AArch32 Data cache invalidate by set/way instruction DCISW performs a cache clean as well as a cache 
invalidate, meaning it operates as DCCISW, if both of the following apply: 


— The value of HCR.SWIO is 1. 


— The instruction is executed in Non-secure state, or EL3 is not implemented. 


• When the value of HCR.FB is 1, TLB and instruction cache invalidate instructions executed in the 
Non-secure EL1 Exception level are broadcast across the Inner Shareable domain. When Non-secure EL1 is 
using AArch32, this applies to the TLBIMVA, TLBIASID, TLBIMVAA, TLBIMVAL, TLBIMVAAL, and 
ICIALLU instructions. This means the instruction is upgraded to the corresponding Inner Shareable 
instruction, for example ICIALLU is upgraded to ICIALLUIS, and BPIALL is upgraded to BPIALLIS. 
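
The HCR.FB upgrade can be pictured as a lookup from each affected instruction to its Inner Shareable form. The sketch below is an illustrative model only. The table covers just the instructions named in the text above, the Inner Shareable TLB mnemonics are assumed to follow the usual IS suffix naming, and the function and parameter names are hypothetical.

```python
# Instruction -> Inner Shareable form, for the instructions named in the text.
INNER_SHAREABLE_FORM = {
    "TLBIMVA": "TLBIMVAIS",
    "TLBIASID": "TLBIASIDIS",
    "TLBIMVAA": "TLBIMVAAIS",
    "TLBIMVAL": "TLBIMVALIS",
    "TLBIMVAAL": "TLBIMVAALIS",
    "ICIALLU": "ICIALLUIS",
    "BPIALL": "BPIALLIS",
}


def effective_instruction(mnemonic: str, hcr_fb: int, at_non_secure_el1: bool) -> str:
    """Return the instruction actually performed, after any HCR.FB upgrade."""
    if hcr_fb == 1 and at_non_secure_el1:
        return INNER_SHAREABLE_FORM.get(mnemonic, mnemonic)
    return mnemonic
```

Instructions that are not in the table, or that are executed outside Non-secure EL1, are returned unchanged.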


For more information about the cache maintenance instructions, see About cache maintenance in ARMv8 on 
page G3-3995, AArch32 cache and branch predictor maintenance instructions on page G3-3999, and Chapter G4 
The AArch32 Virtual Memory System Architecture. 


Boundary conditions for cache maintenance instructions 


Cache maintenance instructions operate on the caches regardless of whether the System register Cacheability 
controls force all memory accesses to be Non-cacheable. 


For address-based cache maintenance instructions, the instructions operate on the caches regardless of the memory 
type and cacheability attributes marked for the memory address in the VMSA translation table entries. This means 
that the effects of the cache maintenance instructions can apply regardless of: 
° Whether the address accessed: 

— Is Normal memory or Device memory. 

— Has the Cacheable attribute or the Non-cacheable attribute. 


° Any applicable domain control of the address accessed. 


° The access permissions for the address accessed, other than the effect of the stage two write permission on 
data or unified cache invalidation instructions. 


Ordering of cache and branch predictor maintenance instructions 


The following rules describe the effect of the memory order model on the cache and branch predictor maintenance 
instructions: 


° All cache and branch predictor maintenance instructions that do not specify an address execute, relative to 
each other, in program order. 


° All cache and branch predictor instructions that specify an address: 


— Execute in program order relative to all cache and branch predictor operations that do not specify an 
address. 


— Execute in program order relative to all cache and branch predictor operations that specify the same 
address. 


— Can execute in any order relative to cache and branch predictor operations that specify a different 
address. 


° Where a cache maintenance or branch predictor instruction appears in program order before a change to the 
translation tables, the architecture guarantees that the cache or branch predictor maintenance instruction uses 
the translations that were visible before the change to the translation tables. 


° Where a change of the translation tables appears in program order before a cache maintenance or branch 
predictor instruction, software must execute the sequence outlined in Ordering and completion of TLB 
maintenance instructions on page G4-4096 before performing the cache or branch predictor maintenance 
instruction, to ensure that the maintenance operation uses the new translations. 


° A DMB instruction causes the effect of all data or unified cache maintenance instructions appearing in program 
order before the DMB to be visible to all explicit load and store operations appearing in program order after the 
DMB. 


Also, a DMB instruction ensures that the effects of any data or unified cache maintenance instruction appearing 
in program order before the DMB are observable by any observer in the same required Shareability domain 
before any data or unified cache maintenance or explicit memory operations appearing in program order after 
the DMB are observed by the same observer. Completion of the DMB does not guarantee the visibility of all data 
to other observers. For example, all data might not be visible to a translation table walk, or to instruction 
fetches. 


° A DSB is required to guarantee the completion of all cache maintenance instructions that appear in program 
order before the DSB instruction. 


° A Context synchronization event is required to guarantee the effects of any branch predictor maintenance 
operation. This means a Context synchronization event causes the effect of all completed branch predictor 
maintenance operations appearing in program order before the Context synchronization event to be visible to 
all instructions after the Context synchronization event. 


Note 


See Context synchronization event in the Glossary for the definition of this term. 








This means that, if a branch instruction appears after an invalidate branch predictor operation and before any 
Context synchronization event, it is CONSTRAINED UNPREDICTABLE whether the branch instruction is affected 
by the invalidate. Software must avoid this ordering of instructions, because it might cause CONSTRAINED 
UNPREDICTABLE behavior. 


° Any data or unified cache maintenance instruction by VA must be executed in program order relative to any 
explicit load or store on the same PE to an address covered by the VA of the cache instruction if that load or 
store is to Normal Cacheable memory. The order of memory accesses that result from the cache maintenance 
instruction, relative to any other memory accesses to Normal Cacheable memory, is subject to the memory 
ordering rules. For more information, see Memory ordering on page E2-2332. 


Any data or unified cache maintenance instruction by VA can be executed in any order relative to any explicit 
load or store on the same PE to an address covered by the VA of the cache maintenance instruction if that 
load or store is not to Normal Cacheable memory. 


° There is no restriction on the ordering of a data or unified cache maintenance instruction by VA relative to any 
explicit load or store on the same PE where the address of the explicit load or store is not covered by the VA 
of the cache instruction. Where the ordering must be restricted, a DMB instruction must be inserted to enforce 
ordering. 


° There is no restriction on the ordering of a data or unified cache maintenance instruction by set/way relative 
to any explicit load or store on the same PE. Where the ordering must be restricted, a DMB instruction must be 
inserted to enforce ordering. 


° Software must execute a Context synchronization event after the completion of an instruction cache 
maintenance instruction, to guarantee that the effect of the maintenance instruction is visible to any 
instruction fetch. 


A DSB or DMB instruction intended to ensure the completion of cache maintenance instructions or branch predictor 
instructions must have an access type of both loads and stores. 


The scope of instruction cache maintenance depends on the type of the instruction cache. For more information see 
Instruction caches on page G4-4106. 


Example G3-1 Cache cleaning operations for self-modifying code 


The sequence of cache cleaning operations for a line of self-modifying code on a uniprocessor system is: 


; Coherency example for data and instruction accesses within the same Inner Shareable domain. 
; Enter this code with <Rt> containing a new 32-bit instruction, 
; to be held in Cacheable space at a location pointed to by Rn. Use STRH in the first line 
; instead of STR for a 16-bit instruction. 

STR Rt, [Rn] 
DCCMVAU Rn ; Clean data cache by MVA to point of unification (PoU) 
DSB ; Ensure visibility of the data cleaned from cache 
ICIMVAU Rn ; Invalidate instruction cache by MVA to PoU 
BPIMVA Rn ; Invalidate branch predictor by MVA to PoU 
DSB ; Ensure completion of the invalidations 
ISB ; Synchronize the fetched instruction stream 


Performing cache maintenance instructions 


To ensure all cache lines in a block of address space are maintained through all levels of cache, ARM strongly 
recommends that software: 


° For data or unified cache maintenance, uses the CTR.DminLine value to determine the loop increment size 
for a loop of data cache maintenance by VA instructions. 


° For instruction cache maintenance, uses the CTR.IminLine value to determine the loop increment size for a 
loop of instruction cache maintenance by VA instructions. 
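
The recommendation above can be illustrated with a short model of the loop arithmetic. This is a sketch only: the function names are hypothetical, and it assumes the CTR field layout in which IminLine occupies bits[3:0] and DminLine occupies bits[19:16], each holding log2 of the smallest line length in words.

```python
def min_line_bytes(ctr, instruction=False):
    """Smallest cache line size in bytes, decoded from a CTR value."""
    # Assumed layout: CTR.IminLine is bits[3:0], CTR.DminLine is bits[19:16].
    # Each field holds log2 of the line length in 4-byte words.
    shift = 0 if instruction else 16
    log2_words = (ctr >> shift) & 0xF
    return 4 << log2_words


def maintenance_addresses(start, length, line_bytes):
    """Yield one line-aligned VA per cache line overlapping [start, start + length)."""
    if length <= 0:
        return
    addr = start & ~(line_bytes - 1)  # align down to the line containing start
    end = start + length
    while addr < end:
        yield addr  # software would issue one maintenance instruction by VA here
        addr += line_bytes
```

Using the smallest line size as the increment means that no line at any cache level is skipped, at the cost of possibly issuing more than one instruction for a larger line.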


Example code for cache maintenance instructions 


The cache maintenance instructions by set/way can be used to clean or invalidate, or both, the entirety of one or 
more levels of cache attached to a PE. However, unless all PEs attached to the caches regard all memory locations 
as Non-cacheable, it is not possible to prevent locations being allocated into the cache during such a sequence of 
the cache maintenance instructions. 


Note 


Because the set/way instructions operate only locally, there is no guarantee of the atomicity of cache maintenance 
between different PEs, even if those different PEs are each executing the same cache maintenance instructions at 
the same time. Because any cacheable line can be allocated into the cache at any time, it is possible for a cache line 
to migrate from an entry in the cache of one PE to the cache of a different PE in a way that means the cache line is 
not affected by set/way based cache maintenance. Therefore, ARM strongly discourages the use of set/way 
instructions to manage coherency in coherent systems. The expected use of the cache maintenance instructions that 
operate by set/way is limited to the cache maintenance associated with the powerdown and powerup of caches, if 
this is required by the implementation. 





The limitations of cache maintenance by set/way mean maintenance by set/way does not happen on multiple PEs, 
and cannot be made to happen atomically for each address on each PE. Therefore in multiprocessor or multithreaded 
systems, the use of cache maintenance by set/way to clean, or clean and invalidate, the entire cache for coherency 
management with very large buffers or with buffers with unknown address can fail to provide the expected 
coherency results because of speculation by other PEs, or possibly by other threads. The only way that these 
instructions can be used in this way is to first ensure that all PEs that might cause speculative accesses to caches that 
need to be maintained are not capable of generating speculative accesses. This can be achieved by ensuring that 
those PEs have no memory locations with a Normal Cacheable attribute. Such an approach can have very large 
system performance effects, and ARM advises implementers to use hardware coherency mechanisms in systems 
where this will be an issue. 


System level caches on page G3-4012 refers to other limitations of cache maintenance by set/way. 


The following example code for cleaning a data or unified cache to the Point of Coherency illustrates a generic 
mechanism for cleaning the entire data or unified cache to the Point of Coherency. 


        MRC p15, 1, R0, c0, c0, 1    ; Read CLIDR into R0 
        ANDS R3, R0, #0x07000000 
        MOV R3, R3, LSR #23          ; Cache level value (naturally aligned) 
        BEQ Finished 
        MOV R10, #0 
Loop1 
        ADD R2, R10, R10, LSR #1     ; Work out 3 x cache level 
        MOV R1, R0, LSR R2           ; bottom 3 bits are the Cache type for this level 
        AND R1, R1, #7               ; get those 3 bits alone 
        CMP R1, #2 
        BLT Skip                     ; no cache or only instruction cache at this level 
        MCR p15, 2, R10, c0, c0, 0   ; write CSSELR from R10 
        ISB                          ; ISB to sync the change to the CCSIDR 
        MRC p15, 1, R1, c0, c0, 0    ; read current CCSIDR to R1 
        AND R2, R1, #7               ; extract the line length field 
        ADD R2, R2, #4               ; add 4 for the line length offset (log2 16 bytes) 
        MOV R4, #0x3FF 
        ANDS R4, R4, R1, LSR #3      ; R4 is the max number on the way size (right aligned) 
        CLZ R5, R4                   ; R5 is the bit position of the way size increment 
        MOV R9, R4                   ; R9 working copy of the max way size (right aligned) 
Loop2 
        MOV R7, #0x00007FFF 
        ANDS R7, R7, R1, LSR #13     ; R7 is the max number of the index size (right aligned) 
Loop3 
        ORR R11, R10, R9, LSL R5     ; factor in the way number and cache number into R11 
        ORR R11, R11, R7, LSL R2     ; factor in the index number 
        MCR p15, 0, R11, c7, c10, 2  ; DCCSW, clean by set/way 
        SUBS R7, R7, #1              ; decrement the index 
        BGE Loop3 
        SUBS R9, R9, #1              ; decrement the way number 
        BGE Loop2 
Skip 
        ADD R10, R10, #2             ; increment the cache number 
        CMP R3, R10 
        DSB                          ; ensure completion of previous cache maintenance instruction 
        BGT Loop1 
Finished 


Similar approaches can be used for all cache maintenance instructions. 
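
As an aid to reading the example, the following sketch models how the inner loops build the set/way operand for one cache level. It is not part of the architecture: the function names are hypothetical, and it assumes the same CCSIDR field positions that the example masks out (line length in bits[2:0], ways minus one in bits[12:3], sets minus one in bits[27:13]).

```python
def clz32(value):
    """Count leading zeros of a 32-bit value, like the A32 CLZ instruction."""
    return 32 - value.bit_length()


def set_way_operands(ccsidr, level_index):
    """Yield the DCCSW operand for every set and way of one cache level.

    level_index is the zero-based cache level (0 for level 1), and ccsidr is
    the CCSIDR value read after selecting that level in CSSELR.
    """
    set_shift = (ccsidr & 0x7) + 4      # log2 of the line length in bytes
    max_way = (ccsidr >> 3) & 0x3FF     # number of ways minus one
    max_set = (ccsidr >> 13) & 0x7FFF   # number of sets minus one
    way_shift = clz32(max_way)          # way number sits in the top bits
    for way in range(max_way, -1, -1):          # mirrors Loop2
        for index in range(max_set, -1, -1):    # mirrors Loop3
            operand = (way << way_shift) | (index << set_shift) | (level_index << 1)
            yield operand & 0xFFFFFFFF  # model the 32-bit register width
```

For a 4-way cache with 128 sets and 64-byte lines, this produces 512 distinct operands, one for each line in that level.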


G3.4.8 Cache lockdown 


The concept of an entry locked in a cache is allowed, but not architecturally defined. How lockdown is achieved is 
IMPLEMENTATION DEFINED and might not be supported by: 


° An implementation. 


° Some memory attributes. 


An unlocked entry in a cache might not remain in that cache. The architecture does not guarantee that an unlocked 
cache entry remains in the cache or remains incoherent with the rest of memory. Software must not assume that an 
unlocked item that remains in the cache remains dirty. 


A locked entry in a cache is guaranteed to remain in that cache. The architecture does not guarantee that a locked 
cache entry remains incoherent with the rest of memory, that is, it might not remain dirty. 


The interaction of cache lockdown with cache maintenance instructions 


The interaction of cache lockdown and cache maintenance instructions is IMPLEMENTATION DEFINED. However, an 
architecturally-defined cache maintenance instruction on a locked cache line must comply with the following 
general rules: 


° The effect of the following instructions on locked cache entries is IMPLEMENTATION DEFINED: 
— Cache clean by set/way, DCCSW. 
— Cache invalidate by set/way, DCISW. 
— Cache clean and invalidate by set/way, DCCISW. 
— Instruction cache invalidate all, ICIALLU and ICIALLUIS. 


However, one of the following approaches must be adopted in all these cases: 


1. If the instruction specified an invalidation, a locked entry is not invalidated from the cache. 
2. If the instruction specified a clean it is IMPLEMENTATION DEFINED whether locked entries are cleaned. 
3. If an entry is locked down, or could be locked down, an IMPLEMENTATION DEFINED Data Abort 
exception is generated, using the fault status code defined for this purpose. See Data Abort exception 
on page G1-3859. 


This permits a usage model for cache invalidate routines to operate on a large range of addresses by 
performing the required operation on the entire cache, without having to consider whether any cache entries 
are locked. 


The effect of the following instructions is IMPLEMENTATION DEFINED: 
° Cache clean by virtual address, DCCMVAC and DCCMVAU. 
° Cache invalidate by virtual address, DCIMVAC. 

° Cache clean and invalidate by virtual address, DCCIMVAC. 


However, one of the following approaches must be adopted in all these cases: 


1. If the instruction specified an invalidation, a locked entry is invalidated from the cache. For the clean and 
invalidate instructions, the entry must be cleaned before it is invalidated. 


2. If the instruction specified an invalidation, a locked entry is not invalidated from the cache. If the instruction 
specified a clean it is IMPLEMENTATION DEFINED whether locked entries are cleaned. 


3. If an entry is locked down, or could be locked down, an IMPLEMENTATION DEFINED Data Abort exception is 
generated, using the fault status code defined for this purpose. See DFSR or HSR. 


In an implementation that includes EL2, if HCR.TIDCP is set to 1, any exception relating to lockdown of an entry 
associated with Non-secure memory is routed to EL2. 


Note 


An implementation that uses an abort mechanism for entries that can be locked down but are not actually locked 
down must: 





° Document the IMPLEMENTATION DEFINED instruction sequences that perform the required operations on 
entries that are not locked down. 


° Implement one of the other permitted alternatives for the locked entries. 


ARM recommends that, when possible, such IMPLEMENTATION DEFINED instruction sequences use 
architecturally-defined instructions. This minimizes the number of customized instructions required. 


In addition, an implementation that uses an abort to handle cache maintenance instructions for entries that might be 
locked must provide a mechanism that ensures that no entries are locked in the cache. 


The reset setting of the cache must be that no cache entries are locked. 





Additional cache functions for the implementation of lockdown 


An implementation can add additional cache maintenance functions for the handling of lockdown in the 
IMPLEMENTATION DEFINED space. See IMPLEMENTATION DEFINED registers, functional group on 
page G4-4211. 


G3.4.9 System level caches 


The system level architecture might define further aspects of the software view of caches and the memory model 
that are not defined by the ARMv8 architecture. These aspects of the system level architecture can affect the 
requirements for software management of caches and coherency. For example, a system design might introduce 
additional levels of caching that cannot be managed using the architecturally-defined maintenance instructions. 
Such caches are referred to as system caches. 


Conceptually, three classes of system cache can be envisaged: 


1. System caches which lie before the point of coherency and cannot be managed by cache maintenance 
instructions. Such systems fundamentally undermine the concept of cache maintenance instructions 
operating to the point of coherency, as they imply the use of non-architecture mechanisms to manage 
coherency. The use of such systems in the ARM architecture is explicitly prohibited. 


2. System caches which lie before the point of coherency and can be managed by cache maintenance by address 
instructions that apply to the point of coherency, but cannot be managed by cache maintenance by set/way 
instructions. Where maintenance of the entire system cache must be performed, as is the case for power 
management, it must be performed using non-architectural mechanisms. 


3. System caches which lie beyond the point of coherency and so are invisible to software. The management of 
such caches is outside the scope of architecture. 


G3.5 System register support for IMPLEMENTATION DEFINED memory features 


The VMSAv8-32 defines the following registers for describing IMPLEMENTATION DEFINED features of the memory 
system: 


° The TCM Type Register, TCMTR must be implemented. The following conditions apply to this register: 
— If no TCMs are implemented, the TCMTR indicates zero-size TCMs. 


— If bits[31:29] are 0b100, the format of the rest of the register is IMPLEMENTATION DEFINED. This value 
indicates that the implementation includes TCMs that do not follow the legacy usage model. Other 
fields in the register might give more information about the TCMs. 


° The System register encoding space with {coproc==0b1111, CRn==c9, CRm=={c0-c2, c5-c7}} is 
IMPLEMENTATION DEFINED for all values of opc2 and opc1. This space is reserved for branch predictor, cache 
and TCM functionality, for example maintenance, override behaviors and lockdown. 


° In a VMSAv8-32 implementation, part of the System register encoding space with {coproc==0b1111, 
CRn==c10} is IMPLEMENTATION DEFINED and reserved for TLB functionality, see TLB lockdown on 
page G4-4091. 


° The System register encoding space with {coproc==0b1111, CRn==c11, CRm=={c0-c8, c15}} is 
IMPLEMENTATION DEFINED for all values of opc2 and opc1. This space is reserved for DMA operations to and 
from the TCMs. 


In addition, the System register encoding space with {coproc==0b1111, CRn==c15} is reserved for 
IMPLEMENTATION DEFINED registers, and can provide additional registers for the memory system. For more 
information, see Reserved encodings in the VMSAv8-32 System register (coproc==0b1111) space on 
page G4-4177. 


G3.6 External aborts 


The ARM architecture defines external aborts as errors that occur in the memory system, other than those that are 
detected by the MMU or Debug hardware. External aborts include parity or ECC errors detected by the caches or 
other parts of the memory system. For example, an uncorrectable parity or ECC failure on a Level 2 Memory 
structure might generate an external abort. 


An external abort is one of: 


° Synchronous. 
° Precise asynchronous. 
° Imprecise asynchronous. 


For more information, see Exception terminology on page G1-3784. 


The ARM architecture does not provide any method to distinguish between precise asynchronous and imprecise 
asynchronous external aborts. 


VMSAv8-32 permits external aborts on data accesses, translation table walks, and instruction fetches to be either 
synchronous or asynchronous. The reported fault code identifies whether the external abort is synchronous or 
asynchronous. 


It is IMPLEMENTATION DEFINED which external aborts, if any, are supported. Asynchronous external aborts are taken 
as SError interrupt exceptions. 


In AArch32 state: 
° SError interrupts are taken as asynchronous Data Abort exceptions. 
° Synchronous external aborts: 
— On data accesses are taken as synchronous Data Abort exceptions. 


— On instruction fetches, or prefetches, are taken as synchronous Prefetch Abort exceptions. 


See also: 
° External abort on a translation table walk on page G4-4120. 
° Handling exceptions that are taken to an Exception level using AArch32 on page G1-3812. 


Normally, external aborts are rare. An imprecise asynchronous external abort is likely to be fatal to the process that 
is running. ARM recommends that implementations make external aborts precise wherever possible. 


The following subsections give more information about possible external aborts: 


° External abort on instruction fetch. 
° External abort on data read or write. 
° Provision for classification of external aborts on page G3-4015. 


° Parity or ECC error reporting on page G3-4015. 


The section Exception reporting in a VMSAv8-32 implementation on page G4-4123 describes the reporting of 
external aborts. 


G3.6.1 External abort on instruction fetch 


An external abort on an instruction fetch can be either synchronous or asynchronous. A synchronous external abort 
on an instruction fetch is taken precisely. 


An implementation can report the external abort asynchronously from the instruction that it applies to. In such an 
implementation these aborts behave essentially as interrupts. The aborts are masked when PSTATE.A is set to 1, 
otherwise they are reported using the Data Abort exception. 


G3.6.2 External abort on data read or write 


Externally-generated errors during a data read or write can be either synchronous or asynchronous. 


An implementation can report the external abort asynchronously from the instruction that generated the access. In 
such an implementation these aborts behave essentially as interrupts. The aborts are masked when PSTATE.A is set 
to 1, otherwise they are reported using the Data Abort exception. 


Provision for classification of external aborts 


For a synchronous external abort taken to a privileged mode other than Hyp mode, an implementation can use the 
DFSR.ExT and IFSR.ExT bits to provide more information about synchronous external aborts: 

° DFSR.ExT provides an IMPLEMENTATION DEFINED classification of synchronous external aborts on data 
accesses. 

° IFSR.ExT provides an IMPLEMENTATION DEFINED classification of synchronous external aborts on 
instruction accesses. 


For a synchronous external abort taken to Hyp mode, the HSR.EA, ISS[9] bit, provides an IMPLEMENTATION 
DEFINED classification of external aborts. 


For all aborts other than synchronous external aborts these bits return a value of 0. 


Parity or ECC error reporting 


The ARM architecture supports the reporting of both synchronous and asynchronous parity or ECC errors from the 
cache systems. It is IMPLEMENTATION DEFINED what parity or ECC errors in the cache systems, if any, result in 
synchronous or asynchronous parity or ECC errors. 


A fault code is defined for reporting parity or ECC errors, see Exception reporting in a VMSAv8-32 implementation 
on page G4-4123. However when parity or ECC error reporting is implemented it is IMPLEMENTATION DEFINED 
whether a parity or ECC error is reported using the assigned fault code, or using another appropriate encoding. 


For all purposes other than the fault status encoding, parity or ECC errors are treated as external aborts. 


G3.7 Memory barrier instructions 


Memory barriers on page E2-2335 describes the memory barrier instructions. This section describes the system 
level controls of those instructions. 


G3.7.1 EL2 control of the Shareability of data barrier instructions executed at EL0 or EL1 


In an implementation that includes EL2 and supports Shareability limitations on the data barrier instructions, the 
HCR.BSU field can upgrade the required Shareability of an instruction that is executed at EL0 or EL1 in Non-secure 
state. Table G3-7 shows the encoding of this field: 


Table G3-7 EL2 control of Shareability of barrier instructions executed at EL0 or EL1 

HCR.BSU   Minimum Shareability of barrier instructions 
00        No effect, Shareability is as specified by the instruction 
01        Inner Shareable 
10        Outer Shareable 
11        Full system 


For an instruction executed at EL0 or EL1 in Non-secure state, Table G3-8 shows how the HCR.BSU is combined 
with the Shareability specified by the argument of the DMB or DSB instruction to give the scope of the instruction: 


Table G3-8 Effect of the HCR.BSU on barrier instructions executed at Non-secure EL0 or EL1 

Shareability specified by     HCR.BSU               Resultant Shareability 
the DMB or DSB argument 
Full system                   Any                   Full system 
Outer Shareable               00, 01, or 10         Outer Shareable 
                              11, Full system       Full system 
Inner Shareable               00 or 01              Inner Shareable 
                              10, Outer Shareable   Outer Shareable 
                              11, Full system       Full system 
Non-shareable                 00, No effect         Non-shareable 
                              01, Inner Shareable   Inner Shareable 
                              10, Outer Shareable   Outer Shareable 
                              11, Full system       Full system 
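

The combination shown in Table G3-8 amounts to taking the wider of the two Shareability scopes. The following sketch expresses that reading of the table. The names SHAREABILITY and resultant_shareability are illustrative only, and the sketch applies only to barriers executed at EL0 or EL1 in Non-secure state.

```python
# Shareability scopes ordered from narrowest to widest. The index of each
# entry equals the HCR.BSU value that sets it as the minimum scope, with
# 0b00 meaning "no effect".
SHAREABILITY = ["Non-shareable", "Inner Shareable", "Outer Shareable", "Full system"]


def resultant_shareability(requested, bsu):
    """Combine the DMB or DSB argument with HCR.BSU as in Table G3-8.

    requested is one of the SHAREABILITY names, bsu is the 2-bit field value.
    """
    if not 0 <= bsu <= 3:
        raise ValueError("HCR.BSU is a 2-bit field")
    # The barrier takes the wider of the requested scope and the BSU minimum.
    return SHAREABILITY[max(SHAREABILITY.index(requested), bsu)]
```

For example, an Inner Shareable barrier with HCR.BSU set to 0b10 behaves as an Outer Shareable barrier, while a Full system barrier is unaffected by any HCR.BSU value.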


G3.8 Pseudocode description of general memory system instructions 


This section lists the pseudocode describing general memory operations: 
° Memory data type definitions. 

° Basic memory access. 

° Aligned memory access on page G3-4018. 

° Unaligned memory access on page G3-4018. 

° Exclusive monitors operations on page G3-4018. 

° Access permission checking on page G3-4019. 

° Abort exceptions on page G3-4019. 

° Memory barriers on page G3-4019. 


G3.8.1 Memory data type definitions 
This section lists the memory data types. 


The memory data types are: 
° Address descriptor, defined by the AddressDescriptor type. 
° Full address, defined by the FullAddress type. 
° Memory attributes, defined by the MemoryAttributes type. 
° Memory type, defined by the MemType enumeration. 
° Device memory type, defined by the DeviceType enumeration. 
° Normal memory attributes, defined by the MemAttrHints type. 
° Cacheability attributes, defined by the MemAttr_NC, MemAttr_WT, and MemAttr_WB constants. 
° Allocation hints, defined by the MemHint_No, MemHint_WA, MemHint_RA, and MemHint_RWA constants. 
° Access permissions, defined by the Permissions type. 
G3.8.2 Basic memory access 


The two _Mem[] accessors, Non-assignment (memory read) _Mem[] and Assignment (memory write) _Mem[], are the 
operations that perform single-copy atomic, aligned, little-endian memory accesses of size bytes to or from the 
underlying physical memory array of bytes. 


The functions address the array using desc.paddress which supplies: 
° A 48-bit physical address. 


° A single NS bit to select between Secure and Non-secure parts of the array. 


The AccType parameter describes the access type, such as normal, exclusive, ordered, and streaming. For a definition 
of AccType, see Address space on page E2-2316. 


The actual implemented array of memory might be smaller than the 2^48 bytes implied. In this case the scheme for 
aliasing is IMPLEMENTATION DEFINED, or some parts of the address space might give rise to external aborts or a 
System Error. 


The attributes in memaddrdesc.memattrs are used by the memory system to determine caching and ordering behaviors 
as described in Memory types and attributes on page E2-2342, Memory ordering on page E2-2332, and Atomicity 
in the ARM architecture on page E2-2328. 


PAMax() returns the IMPLEMENTATION DEFINED size of the physical address. 











Note 
A translation regime used when the PE is executing in AArch32 state can never generate more than 40 bits of an 
address. 


G3.8.3 Aligned memory access 


The AArch32.MemSingle[] functions make atomic, little-endian accesses of size bytes. 


G3.8.4 Unaligned memory access 


See Unaligned data access on page E2-2323 for details of the SCTLR.A and HSCTLR.A controls on the generation 
of alignment faults. The HSCTLR control applies to Normal memory accesses from Hyp mode, and the SCTLR 
control applies to Normal memory accesses from all other modes. 


The Mem_with_type[] functions make an access of the required type. If that access is naturally aligned, each form of 
the function performs an atomic access by making a single call to AArch32.MemSingle[]. If that access is not aligned 
but passes the AArch32.CheckAlignment() checks, each form of the function synthesizes the required access from 
multiple calls to AArch32.MemSingle[]. It also reverses the byte order if the access is big-endian. 
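

The behavior described above can be sketched as a small model. This is an illustration, not the architectural pseudocode: memory, mem_single, and mem_read are hypothetical stand-ins, the alignment fault checks performed by AArch32.CheckAlignment() are omitted, and the unaligned case is assumed to be built from byte-sized accesses.

```python
memory = bytearray(64)  # hypothetical stand-in for the physical memory array


def mem_single(address, size):
    """Aligned, little-endian read of size bytes (stand-in for AArch32.MemSingle[])."""
    assert address % size == 0, "mem_single requires a naturally aligned address"
    return int.from_bytes(memory[address:address + size], "little")


def mem_read(address, size, big_endian=False):
    """Read size bytes at address, synthesizing unaligned accesses from bytes."""
    if address % size == 0:
        # Naturally aligned: one single-copy atomic access.
        value = mem_single(address, size)
    else:
        # Not aligned: assemble the value from byte accesses, so the access
        # as a whole is not atomic.
        value = 0
        for i in range(size):
            value |= mem_single(address + i, 1) << (8 * i)
    if big_endian:
        # Reverse the byte order of the assembled value.
        value = int.from_bytes(value.to_bytes(size, "little"), "big")
    return value
```

The same structure applies to writes, with the byte order reversed before the value is split into single accesses.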


G3.8.5 Exclusive monitors operations 


The AArch32.SetExclusiveMonitors() function sets the exclusive monitors for a Load-Exclusive instruction, for a 
block of bytes. The size of the blocks is determined by size, at the VA address. The ExclusiveMonitorsPass() 
function checks whether a Store-Exclusive instruction still has possession of the exclusive monitors and therefore 
completes successfully. 


The AArch32.ExclusiveMonitorsPass() function checks whether a Store-Exclusive instruction still has possession of 
the exclusive monitors, by checking whether the exclusive monitors are set to include the location of the memory 
block specified by size, at the virtual address defined by address. The atomic write that follows after the exclusive 
monitors have been set must be to the same physical address. It is permitted, but not required, for this function to 
return FALSE if the virtual address is not the same as that used in the previous call to 
AArch32.SetExclusiveMonitors(). 


The ExclusiveMonitorsStatus() function returns 0 if the previous atomic write was to the same physical memory 
locations selected by ExclusiveMonitorsPass() and therefore succeeded. Otherwise the function returns 1, indicating 
that the address translation delivered a different physical address. 


The MarkExclusiveGlobal() procedure takes as arguments a FullAddress paddress, the PE identifier processorid and 
the size of the transfer. The procedure records that the PE processorid has requested exclusive access covering at 
least size bytes from address paddress. The size of the location marked as exclusive is IMPLEMENTATION DEFINED, 
up to a limit of 2KB and no smaller than two words, and aligned in the address space to the size of the location. It 
is CONSTRAINED UNPREDICTABLE whether this causes any previous request for exclusive access to any other address 
by the same PE to be cleared. 


The MarkExclusiveLocal() procedure takes as arguments a FullAddress paddress, the PE identifier processorid and 
the size of the transfer. The procedure records in a local record that PE processorid has requested exclusive access 
to an address covering at least size bytes from address paddress. The size of the location marked as exclusive is 
IMPLEMENTATION DEFINED, and can at its largest cover the whole of memory but is no smaller than two words, and 
is aligned in the address space to the size of the location. It is IMPLEMENTATION DEFINED whether this procedure 
also performs a MarkExclusiveGlobal() using the same parameters. 


The IsExclusiveGlobal() function takes as arguments a FullAddress paddress, the PE identifier processorid and the 
size of the transfer. The function returns TRUE if the PE processorid has marked in a global record an address range 
as exclusive access requested that covers at least size bytes from address paddress. It is IMPLEMENTATION DEFINED 
whether it returns TRUE or FALSE if a global record has marked a different address as exclusive access requested. 
If no address is marked in a global record as exclusive access, IsExclusiveGlobal() returns FALSE. 


The IsExclusiveLocal() function takes as arguments a FullAddress paddress, the PE identifier processorid and the 
size of the transfer. The function returns TRUE if the PE processorid has marked an address range as exclusive 
access requested that covers at least the size bytes from address paddress. It is IMPLEMENTATION DEFINED whether 
this function returns TRUE or FALSE if the address marked as exclusive access requested does not cover all of size 
bytes from address paddress. If no address is marked as exclusive access requested, then this function returns 
FALSE. It is IMPLEMENTATION DEFINED whether this result is ANDed with the result of IsExclusiveGlobal() with 
the same parameters.


The ClearExclusiveByAddress() procedure takes as arguments a FullAddress paddress, the PE identifier processorid
and the size of the transfer. The procedure clears the global records of all PEs, other than processorid, for which an 
address region including any of size bytes starting from paddress has had a request for an exclusive access. It is 
IMPLEMENTATION DEFINED whether the equivalent global record of the PE processorid is also cleared if any of size 
bytes starting from paddress has had a request for an exclusive access, or if any other address has had a request for 
an exclusive access. 


The ClearExclusiveLocal() procedure takes as arguments the PE identifier processorid. The procedure clears the 
local record of PE processorid for which an address has had a request for an exclusive access. It is IMPLEMENTATION 
DEFINED whether this operation also clears the global record of PE processorid that an address has had a request for 
an exclusive access. 
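
The global monitor procedures described above can be sketched in Python as follows. This is an illustrative model only, not the architectural pseudocode. It fixes several IMPLEMENTATION DEFINED choices as assumptions: a 16-byte marked region (the hypothetical `GRANULE`), one record per PE, accesses that do not cross a granule boundary, and a store that never clears the storing PE's own record.

```python
from dataclasses import dataclass, field

GRANULE = 16  # assumed size of the marked region in bytes (IMPLEMENTATION DEFINED)


def _granule(paddress: int) -> int:
    """Return the base address of the granule that contains paddress."""
    return paddress & ~(GRANULE - 1)


@dataclass
class GlobalMonitor:
    # Maps a PE identifier to the base address of the granule it has marked.
    marked: dict = field(default_factory=dict)

    def mark_exclusive_global(self, paddress: int, processorid: int, size: int) -> None:
        # Assumption: a new request replaces any earlier request by the same PE.
        self.marked[processorid] = _granule(paddress)

    def is_exclusive_global(self, paddress: int, processorid: int, size: int) -> bool:
        # Assumption: FALSE is returned when a different address is marked.
        return self.marked.get(processorid) == _granule(paddress)

    def clear_exclusive_by_address(self, paddress: int, processorid: int, size: int) -> None:
        # Clear the records of all other PEs whose marked granule overlaps the access.
        first, last = _granule(paddress), _granule(paddress + size - 1)
        stale = [pe for pe, g in self.marked.items()
                 if pe != processorid and first <= g <= last]
        for pe in stale:
            del self.marked[pe]
```

In this model a store by one PE to a marked location causes a later exclusive check by any other PE that had marked the same granule to fail, which is the behavior the ClearExclusiveByAddress() description requires.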


Access permission checking 


The function AArch32.CheckPermission() is used by the architecture to perform access permission checking based 
on attributes derived from the translation tables or location descriptors. 


The interpretation of access permission is shown in Memory access control on page G4-4068. 


Abort exceptions 


The function AArch32.Abort() generates a Data Abort exception or a Prefetch Abort exception by calling the 
AArch32.TakeDataAbortException() or AArch32.TakePrefetchAbortException() function. 


The FaultRecord type describes a fault. Functions that check for faults return a record of this type appropriate to the 
type of fault. Pseudocode description of VMSAv8-32 memory system operations on page G4-4215 provides a 
number of wrappers to generate a FaultRecord. 


The function AArch32.NoFault() returns a null record that indicates no fault. The IsFault() function tests whether a
FaultRecord contains a fault. 
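
As a rough illustration of the relationship between a fault record, the null record, and the fault test, the following Python sketch uses a reduced, hypothetical record with only two fields. The names `Fault`, `no_fault`, and `is_fault` are assumptions made for this example, and the architectural FaultRecord carries more information than is shown here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Fault(Enum):
    NONE = auto()         # the null value: no fault occurred
    TRANSLATION = auto()
    PERMISSION = auto()


@dataclass
class FaultRecord:
    type: Fault
    write: bool = False   # True if the faulting access was a write


def no_fault() -> FaultRecord:
    """Return a null record that indicates no fault."""
    return FaultRecord(type=Fault.NONE)


def is_fault(record: FaultRecord) -> bool:
    """Test whether a record describes a fault."""
    return record.type is not Fault.NONE
```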


Memory barriers 


The definition for the memory barrier functions is given by the enumerations MBReqDomain and MBReqTypes. 


These enumerations define the required Shareability domains and required access types used as arguments for DMB 
and DSB instructions. 


The procedures DataMemoryBarrier(), DataSynchronizationBarrier(), and InstructionSynchronizationBarrier() 
perform the memory barriers.


This chapter describes the ARMv8-A AArch32 Virtual Memory System Architecture (VMSA), that is
backwards-compatible with VMSAv7. It includes the following sections:


• About VMSAv8-32 on page G4-4022.
• The effects of disabling address translation stages on VMSAv8-32 behavior on page G4-4031.
• Translation tables on page G4-4035.
• The VMSAv8-32 Short-descriptor translation table format on page G4-4040.
• The VMSAv8-32 Long-descriptor translation table format on page G4-4049.
• Memory access control on page G4-4068.
• Memory region attributes on page G4-4077.
• Translation Lookaside Buffers (TLBs) on page G4-4089.
• TLB maintenance requirements on page G4-4093.
• Caches in VMSAv8-32 on page G4-4106.
• VMSAv8-32 memory aborts on page G4-4110.
• Exception reporting in a VMSAv8-32 implementation on page G4-4123.
• Address translation instructions on page G4-4142.
• About the System registers for VMSAv8-32 on page G4-4148.
• VMSAv8-32 organization of registers in the (coproc==0b1110) encoding space on page G4-4172.
• VMSAv8-32 organization of registers in the (coproc==0b1111) encoding space on page G4-4175.
• Functional grouping of VMSAv8-32 System registers on page G4-4193.
• Pseudocode description of VMSAv8-32 memory system operations on page G4-4215.





Note 
This chapter must be read with Chapter G3 The AArch32 System Level Memory Model.


G4.1 About VMSAv8-32


Note 


• This chapter describes the ARMv8 VMSA for AArch32 state, VMSAv8-32. This is generally equivalent to
VMSAv7 for an implementation that includes all of the Security Extensions, the Multiprocessing Extensions,
the Large Physical Address Extension, and the Virtualization Extensions.

• This chapter describes the control of the VMSA by Exception levels that are using AArch32. Security state,
Exception levels, and AArch32 execution privilege on page G1-3792 summarizes how the AArch32 PE
modes map onto the Exception levels.

Chapter D4 The AArch64 Virtual Memory System Architecture describes the control of the VMSA by
Exception levels that are using AArch64.

• For details of the VMSA differences in previous versions of the ARM architecture see the ARM® Architecture
Reference Manual, ARMv7-A and ARMv7-R edition.





The main function of the VMSA is to perform address translation, and access permissions and memory attribute 
determination and checking, for memory accesses made by the PE. Address translation, and permissions and 
attribute determination and checking, is performed by a stage of address translation. 


In VMSAv8-32, the Memory Management Unit (MMU) provides a number of stages of address translation. This 
chapter describes only the stages that are visible from Exception levels that are using AArch32, which are as 
follows: 


For operation in Secure state 


A single stage of address translation, for use when executing at PL1 or PL0. This is the Secure
PL1&0 stage 1 address translation stage.


For operation in Non-secure state


• A single stage of address translation for use when executing at PL2. This is the Non-secure
PL2 stage 1 address translation stage.

• Two stages of address translation for use when executing at PL1 or PL0. These are:
— The Non-secure PL1&0 stage 1 address translation stage.
— The Non-secure PL1&0 stage 2 address translation stage.


The System registers provide independent control of each supported stage of address translation, including a control 
to disable that stage of translation. 


However, if the PE is executing at EL0 using AArch32 when EL1 is using AArch64 then it is using the VMSAv8-64
EL1&0 translation regime, described in Chapter D4 The AArch64 Virtual Memory System Architecture. 


These features mean the VMSAv8-32 can support a hierarchy of software supervision, for example an Operating 
System and a hypervisor. 


Each stage of address translation uses address translations and associated memory properties held in memory 
mapped tables called translation tables. 


For information about how the MMU features differ if an implementation does not include all of the Exception 
levels, see About address translation for VMSAv8-32 on page G4-4026. 


The translation tables define the following properties: 


Access to the Secure or Non-secure address map 


The translation table entries determine whether an access from Secure state accesses the Secure or 
the Non-secure address map. Any access from Non-secure state accesses the Non-secure address 
map.


Memory access permission control
This controls whether a program is permitted to access a memory region. For instruction and data
access, the possible settings are:
• No access.
• Read-only.
• Write-only. This is possible only in a translation regime with two stages of translation.
• Read/write.


For instruction accesses, additional controls determine whether instructions can be fetched and 
executed from the memory region. 


If a PE attempts an access that is not permitted, a memory fault is signaled to the PE. 


Memory region attributes 


These describe the properties of a memory region. The top-level attribute, the Memory type, is one 
of Normal, or a type of Device memory, as follows: 
• Both translation table formats support the following Device memory types:
—  Device-nGnRnE
—  Device-nGnRE
• The Long-descriptor translation table format supports, in addition, the following Device
memory types:
—  Device-nGRE
—  Device-GRE


Note


ARMv8 added the Device-nGRE and Device-GRE memory types. Also, in versions of the ARM
architecture before ARMv8:

• Device-nGnRnE memory is described as Strongly-ordered memory.

• Device-nGnRE memory is described as Device memory.





Normal memory regions can have additional attributes. 


For more information, see Memory types and attributes on page E2-2342. 


Address translation mappings 
An address translation maps an input address to an output address. 


A stage 1 translation takes the address of an explicit data access or instruction fetch, a virtual
address (VA), as the input address, and translates it to a different output address:

• If only one stage of translation is provided, this output address is the physical address (PA).

• If two stages of address translation are provided, the output address of the stage 1 translation
is an intermediate physical address (IPA).


Note


In the ARMv8-32 architecture, a software agent, such as an Operating System, that uses or defines 
stage 1 memory translations, might be unaware of the distinction between IPA and PA. 





A stage 2 translation translates the IPA to a PA. 
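
The relationship between VA, IPA, and PA can be sketched in Python as follows. This is a simplified model under stated assumptions: each stage is a plain dictionary from input page number to output page number, the page size is fixed at 4KB, and a missing entry raises `KeyError` in place of a Translation fault. The function names are hypothetical.

```python
from typing import Optional

PAGE_SHIFT = 12                       # 4KB pages, assumed for illustration
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate_stage(table: dict, address: int) -> int:
    """Map an input address to an output address through one stage."""
    out_page = table[address >> PAGE_SHIFT]   # KeyError stands in for a fault
    return (out_page << PAGE_SHIFT) | (address & PAGE_MASK)


def translate(va: int, stage1: dict, stage2: Optional[dict] = None) -> int:
    """Translate a VA to a PA using one or two stages of translation."""
    ipa = translate_stage(stage1, va)
    if stage2 is None:
        return ipa                    # single stage: the stage 1 output is the PA
    return translate_stage(stage2, ipa)
```

With only `stage1` supplied, the stage 1 output is used directly as the PA. With both tables, software that maintains `stage1` sees the same interface and need not know that its output is an IPA.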


The possible security states and privilege levels of memory accesses define a set of translation 
regimes, where a translation regime maps an input VA to the corresponding PA, using one or two 
stages of translation. See The VMSAv8-32 translation regimes on page G4-4024. 


System registers control VMSAv8-32, including defining the location of the translation tables, and enabling and 
configuring the MMU, including enabling and disabling the different address translation stages. Also, they report 
any faults that occur on a memory access. For more information, see Functional grouping of VMSAv8-32 System 
registers on page G4-4193.


The following sections give an overview of VMSAv8-32, and of the implementation options for VMSAv8-32:
• The VMSAv8-32 translation regimes.
• Address types used in a VMSAv8-32 description on page G4-4025.
• Address spaces in VMSAv8-32 on page G4-4025.
• About address translation for VMSAv8-32 on page G4-4026.


The remainder of the chapter fully describes the VMSA, including the different implementation options, as
summarized in Organization of this chapter on page G4-4029.


G4.1.1 The VMSAv8-32 translation regimes


As introduced in Address translation mappings on page G4-4023, a translation regime maps an input VA to the 
corresponding PA, using one or two stages of translation. Figure G4-1 shows the VMSAv8-32 translation regimes, 
and their associated translation stages and the Exception levels from which they are controlled. 


Translation regimes, for Exception levels that are using AArch32:

Secure PL1&0: VA → Secure PL1&0 stage 1 (controlled from Secure PL1 modes†) → PA, Secure or Non-secure
Non-secure EL2: VA → Non-secure EL2 stage 1 (controlled from Hyp mode†) → PA, Non-secure only
Non-secure PL1&0: VA → Non-secure PL1&0 stage 1 (controlled from Non-secure PL1 modes†) → Non-secure PL1&0 stage 2 (controlled from Hyp mode†) → PA, Non-secure only

† Typical control when controlled from an Exception level using AArch32.

Figure G4-1 VMSAv8-32 translation regimes, and associated control


Note 


Conceptually, a translation regime that has only a stage 1 address translation is equivalent to a regime with a fixed, 
flat stage 2 mapping from IPA to PA. 








Limited use of Privilege level in ARMv8 AArch32 state on page G1-3793 describes the mapping between the PE
modes and the Privilege levels (PLs). 


Alternative descriptions of the PL1&0 translation regime 


The PL1&0 translation regime is described in terms of Privilege level because of the way the AArch32 PE modes map
onto the Exception levels, as described in Limited use of Privilege level in ARMv8 AArch32 state on page G1-3793.
The description of this translation regime in terms of the Exception levels depends on the current state of the PE,
as follows:


• In Non-secure state, PL1 always maps to EL1, and therefore the Non-secure PL1&0 translation regime could
be described as the Non-secure EL1&0 translation regime.
• In Secure state:

— When EL3 is using AArch32, PL1 maps to EL3, and therefore under these conditions the Secure
PL1&0 translation regime could be described as the Secure EL3&0 translation regime.

— When EL3 is using AArch64, Secure PL1 maps to Secure EL1, and therefore under these conditions
the Secure PL1&0 translation regime could be described as the Secure EL1&0 translation regime.


However, these descriptions all refer to the same translation regime, with the same System registers associated with 
its stage 1 translations. Therefore, the regime is generally referred to as the PL1&0 translation regime. 





Note 


As Figure G4-1 shows, stage 2 translation is supported only in Non-secure state.


G4.1.2 Address types used in a VMSAv8-32 description


A description of VMSAv8-32 refers to the following address types. 


Note 


These descriptions relate to a VMSAv8-32 description and therefore sometimes differ from the generic definitions 
given in the Glossary. 








Virtual address (VA) 
An address used in an instruction, as a data or instruction address, is a Virtual Address (VA). 
An address held in the PC, LR, or SP, is a VA. 
The VA map runs from zero to the size of the VA space. For AArch32 state, the maximum VA space 
is 4GB, giving a maximum VA range of 0x00000000-0xFFFFFFFF. 
Intermediate physical address (IPA) 


In a translation regime that provides two stages of address translation, the IPA is the address after
the stage 1 translation, and is the input address for the stage 2 translation.


In a translation regime that provides only one stage of address translation, the IPA is identical to the
PA.

A VMSAv8-32 implementation provides only one stage of address translation:
• If the implementation does not include EL2.
• When executing in Secure state.
• When executing in Hyp mode.


Physical address (PA) 


The address of a location in the Secure or Non-secure memory map. That is, an output address from 
the PE to the memory system. 


G4.1.3 Address spaces in VMSAv8-32


For execution in AArch32 state, the ARMv8 architecture supports:
• A VA space of up to 32 bits. The actual width is IMPLEMENTATION DEFINED.

• An IPA space of up to 40 bits. The translation tables and associated System registers define the width of the
implemented address space.


Note 


AArch32 defines two translation table formats. The Long-descriptor format gives access to the full 40-bit IPA or 
PA space at a granularity of 4KB. The Short-descriptor format: 





• Gives access to a 32-bit PA space at 4KB granularity.
• Gives access to a 40-bit PA space, but only at 16MB granularity, by the use of Supersections.





If an implementation includes EL3, the address maps are defined independently for Secure and Non-secure 
operation, providing two independent 40-bit address spaces, where: 





• A VA accessed from Non-secure state can only be translated to the Non-secure address map.
• A VA accessed from Secure state can be translated to either the Secure or the Non-secure address map.


G4.1.4 About address translation for VMSAv8-32


Address translation is the process of mapping one address type to another, for example, mapping VAs to IPAs, or 
mapping VAs to PAs. A translation table defines the mapping from one address type to another, and a Translation 
table base register (TTBR) indicates the start of a translation table. Each implemented stage of address translation 
shown in Figure G4-1 on page G4-4024 requires its own translation tables. 


For PL1&0 stage 1 translations, the mapping can be split between two tables, one controlling the lower part of the
VA space, and the other controlling the upper part of the VA space. This can be used, for example, so that: 


• One table defines the mapping for operating system and I/O addresses, that do not change on a context switch.

• A second table defines the mapping for application-specific addresses, and therefore might require updating
on a context switch.
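
The split between the two tables can be illustrated with a short Python sketch. It assumes the Short-descriptor rule in which the TTBCR.N field gives the number of most-significant VA bits that must all be zero for TTBR0 to be used. The function name `select_ttbr` is hypothetical, and the Long-descriptor format, which uses different controls, is not modelled.

```python
def select_ttbr(va: int, n: int) -> str:
    """Return the base register that covers a 32-bit VA for a given TTBCR.N."""
    if not 0 <= n <= 7:
        raise ValueError("TTBCR.N is a 3-bit field")
    if n == 0:
        return "TTBR0"                # TTBR0 covers the whole VA space
    top_bits = (va & 0xFFFFFFFF) >> (32 - n)
    # TTBR0 controls the lower part of the VA space, TTBR1 the upper part.
    return "TTBR0" if top_bits == 0 else "TTBR1"
```

For example, with N set to 2 the lowest 1GB of the VA space is translated through TTBR0 and the remaining 3GB through TTBR1.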


The VMSAv8-32 implementation options determine the supported address translation stages. The following 
descriptions apply when all implemented Exception levels are using AArch32: 


VMSAv8-32 without EL2 or EL3 


Supports only a single PL1&0 stage 1 address translation. Translation of this stage of address 
translation can be split between two sets of translation tables, with base addresses defined by 
TTBR0 and TTBR1, and controlled by TTBCR.


VMSAv8-32 with EL3 but without EL2 


Supports only the Secure PL1&0 stage 1 address translation and the Non-secure PL1&0 stage 1 
address translation. In each security state, this stage of translation can be split between two sets of 
translation tables, with base addresses defined by the Secure and Non-secure copies of TTBR0 and
TTBR1, and controlled by the Secure and Non-secure copies of TTBCR.


VMSAv8-32 with EL2 but without EL3 
The implementation supports the following stages of address translation: 


Non-secure PL2 stage 1 address translation 
The HTTBR defines the base address of the translation table for this stage of address 
translation, controlled by HTCR. 

Non-secure PL1&0 stage 1 address translation 
Translation of this stage of address translation can be split between two sets of 
translation tables, with base addresses defined by the Non-secure copies of TTBR0 and
TTBR1 and controlled by the Non-secure instance of TTBCR.

Non-secure PL1&0 stage 2 address translation 


The VTTBR defines the base address of the translation table for this stage of address 
translation, controlled by VTCR. 


VMSAv8-32 with EL2 and EL3 
The implementation supports all of the stages of address translation, as follows: 


Secure PL1&0 stage 1 address translation 


Translation of this stage of address translation can be split between two sets of 
translation tables, with base addresses defined by the Secure copies of TTBR0 and
TTBR1, and controlled by the Secure instance of TTBCR.


Non-secure PL2 stage 1 address translation 
The HTTBR defines the base address of the translation table for this stage of address 
translation, controlled by HTCR. 

Non-secure PL1&0 stage 1 address translation 


Translation of this stage of address translation can be split between two sets of 
translation tables, with base addresses defined by the Non-secure copies of TTBR0 and
TTBR1 and controlled by the Non-secure instance of TTBCR.

Non-secure PL1&0 stage 2 address translation


The VTTBR defines the base address of the translation table for this stage of address 
translation, controlled by VTCR. 
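
The four implementation options listed above reduce to a small decision function. The following Python sketch is only a restatement of that list for the case where all implemented Exception levels are using AArch32. The function name and the string labels are chosen for this example.

```python
def supported_stages(has_el2: bool, has_el3: bool) -> list:
    """List the stages of address translation for a given set of Exception levels."""
    stages = []
    if has_el3:
        stages.append("Secure PL1&0 stage 1")
    if has_el2:
        stages.append("Non-secure PL2 stage 1")
    # Without EL2 and EL3 there is a single, unqualified PL1&0 stage 1 translation.
    if has_el2 or has_el3:
        stages.append("Non-secure PL1&0 stage 1")
    else:
        stages.append("PL1&0 stage 1")
    if has_el2:
        stages.append("Non-secure PL1&0 stage 2")
    return stages
```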


Figure G4-2 shows the translation regimes and stages in a VMSAv8-32 implementation that includes all of the 
Exception levels, and indicates the PE mode that, typically, defines each set of translation tables, if that stage of 
address translation is controlled by a Privilege level that is using AArch32: 


Translation regimes, with the translation table base address and control registers for each stage:

Secure PL1&0: VA → Secure PL1&0 stage 1 (Secure TTBR0, TTBR1, and TTBCR, typically configured from a Secure PL1 mode) → PA, Secure or Non-secure
Non-secure PL2: VA → Non-secure PL2 stage 1 (HTTBR and HTCR, typically configured from Hyp mode) → PA, Non-secure only
Non-secure PL1&0: VA → Non-secure PL1&0 stage 1 (Non-secure TTBR0, TTBR1, and TTBCR, typically configured from a Non-secure PL1 mode) → Non-secure PL1&0 stage 2 (VTTBR and VTCR, typically configured from Hyp mode) → PA, Non-secure only

See the Note that follows this figure for other configuration options.


Figure G4-2 VMSAv8-32 address translation summary


Note 


The term Typically configured is used in Figure G4-2 to indicate the expected software usage. However, stages of 
address translation used in AArch32 state can also be configured: 





• From an Exception level higher than the Exception level of the configuring PE mode shown in Figure G4-2,
regardless of whether that Exception level is using AArch32 or is using AArch64, except that a Non-secure
Exception level can never configure a stage of address translation that is used in Secure state.

• From an Exception level that is using AArch64 and is higher than the level at which the translation stage is
being used. For example, if Non-secure EL0 is the only Non-secure Exception level that is using AArch32,
then the Non-secure PL1&0 stage of address translation is configured from Non-secure EL1, that is using
AArch64.


In general:
• The translation from VA to PA can require multiple stages of address translation, as Figure G4-2 shows.
• A single stage of address translation takes an input address and translates it to an output address.


A full translation table lookup is called a translation table walk. It is performed automatically by hardware, and can 
have a significant cost in execution time. To support fine granularity of the VA to PA mapping, a single input address 
to output address translation can require multiple accesses to the translation tables, with each access giving finer 
granularity. Each access is described as a level of address lookup. The final level of the lookup defines: 

• The required output address.

• The attributes and access permissions of the addressed memory.
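
The idea of levels of lookup can be sketched in Python as follows. This is a deliberately simplified model and not the descriptor encoding of either translation table format. It assumes a 32-bit VA, a first level indexed by VA[31:20] that holds either a 1MB block entry or a pointer to a second-level table, and a second level indexed by VA[19:12] that holds 4KB page entries. All names are hypothetical.

```python
def walk(va: int, level1: dict):
    """Return (output_address, attributes, levels_used) for a 32-bit VA.

    level1 maps VA[31:20] to ("block", base, attrs) or ("table", level2).
    level2 maps VA[19:12] to ("page", base, attrs).
    """
    entry = level1.get(va >> 20)
    if entry is None:
        raise LookupError("Translation fault at level 1")
    if entry[0] == "block":
        _, base, attrs = entry
        return base | (va & 0xFFFFF), attrs, 1    # lookup ends at level 1
    level2 = entry[1]
    entry = level2.get((va >> 12) & 0xFF)
    if entry is None:
        raise LookupError("Translation fault at level 2")
    _, base, attrs = entry
    return base | (va & 0xFFF), attrs, 2          # finer granularity, one more access
```

The second lookup costs an extra table access but resolves the mapping at 4KB rather than 1MB granularity, which is the trade-off the paragraph above describes.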


Translation Lookaside Buffers (TLBs) reduce the average cost of a memory access by caching the results of 
translation table walks. TLBs behave as caches of the translation table information, and VMSAv8-32 provides TLB 
maintenance instructions for the management of TLB contents. 


Note 


The ARM architecture permits TLBs to hold any translation table entry that does not directly cause a Translation 
fault, an Address size fault, or an Access flag fault.


To reduce the software overhead of TLB maintenance, for the PL1&0 translation regimes VMSAv8-32
distinguishes between Global pages and Process-specific pages. The Address Space Identifier (ASID) identifies 
pages associated with a specific process and provides a mechanism for changing process-specific tables without 
having to maintain the TLB structures. 


If an implementation includes EL2, the virtual machine identifier (VMID) identifies the current virtual machine, 
with its own independent ASID space. The TLB entries include this VMID information, meaning TLBs do not 
require explicit invalidation when changing from one virtual machine to another, if the virtual machines have 
different VMIDs. For stage 2 translations, all translations are associated with the current VMID. There is no 
mechanism to associate a particular stage 2 translation with multiple virtual machines. 
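
A minimal Python sketch of how ASID and VMID tagging avoids TLB invalidation on a context switch is shown below. It assumes a TLB modelled as a dictionary keyed by (VMID, ASID, virtual page), with Global pages stored under an ASID of `None` so that they match any ASID within the same VMID. The class and method names are hypothetical, and real TLB organization is IMPLEMENTATION DEFINED.

```python
class Tlb:
    def __init__(self):
        self.entries = {}

    def insert(self, vmid, asid, vpage, ppage, is_global=False):
        # Global entries match any ASID, so they are stored with asid=None.
        key = (vmid, None if is_global else asid, vpage)
        self.entries[key] = ppage

    def lookup(self, vmid, asid, vpage):
        # Try the process-specific entry first, then a Global entry.
        for key in ((vmid, asid, vpage), (vmid, None, vpage)):
            if key in self.entries:
                return self.entries[key]
        return None   # miss: a translation table walk would be needed
```

Switching process or virtual machine in this model only changes the `asid` or `vmid` passed to `lookup`. Entries that belong to other contexts stay in the dictionary but can no longer hit.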


Atomicity of register changes on changing virtual machine 


From the viewpoint of software executing at Non-secure PL1 or PL0, when there is a switch from one virtual
machine to another, the registers that control or affect address translation must be changed atomically. This applies 
to the registers for the Non-secure PL1&0 translation regime. This means that all of the following registers must 
change atomically: 
• The registers associated with the stage 1 translations:
—  MAIR0, MAIR1, AMAIR0, and AMAIR1.
—  TTBR0, TTBR1, TTBCR, and CONTEXTIDR.
—  SCTLR.

• The registers associated with the stage 2 translations:
—  VTTBR and VTCR.
—  HSCTLR.


Note 


Only some bits of SCTLR affect the stage 1 translation, and only some bits of HSCTLR affect the stage 2 translation. 
However, in each case, changing these bits requires a write to the register, and that write must be atomic with the 
other register updates. 








These registers apply to execution using the Non-secure PL1&0 translation regime. However, when updated as part 
of a switch of virtual machines they are updated by software executing at EL2. This means the registers are out of 
context when they are updated, and no synchronization precautions are required. 


Use of out-of-context translation regimes 
The architecture requires that: 


• When executing at EL3 or EL2, the PE must not use the registers associated with the Non-secure PL1&0
translation regime for speculative memory accesses.

• When executing at EL3 the PE must not use the registers associated with the EL2 translation regime for
speculative memory accesses.

• When executing at EL3, EL2, or Non-secure EL1, the PE must not use the registers associated with the
Secure PL1&0 translation regime for speculative memory accesses.


When entering an Exception level on completion of a DSB instruction, no new memory accesses using any translation
table entries from a translation regime of an Exception level lower than the Exception level that has been entered,
will be observed by any observers to the extent that those accesses are required to be observed, as determined by
the Shareability and Cacheability of those translation table entries.





Note 


• This does not require that speculative memory accesses cannot be performed using those entries if it is
impossible to tell that those memory accesses have been observed by the observers.

• This requirement does not imply that, on taking an exception to a higher Exception level, any translation table
walks started before the exception was taken will be completed by the time the higher Exception level is
entered, and therefore memory accesses required for such a translation table walk might, in effect, be
performed speculatively. However, the execution of a DSB on entry to the higher Exception level ensures that
these accesses are complete.








G4.1.5 Organization of this chapter


The remainder of this chapter is organized as follows.


The first part of the chapter describes address translation and the associated memory properties held in the
translation table entries, in the following sections:
• The effects of disabling address translation stages on VMSAv8-32 behavior on page G4-4031.
• Translation tables on page G4-4035.
• Secure and Non-secure address spaces on page G4-4038.
• The VMSAv8-32 Short-descriptor translation table format on page G4-4040.
• The VMSAv8-32 Long-descriptor translation table format on page G4-4049.
• Memory access control on page G4-4068.
• Memory region attributes on page G4-4077.
• Translation Lookaside Buffers (TLBs) on page G4-4089.
• TLB maintenance requirements on page G4-4093.


Caches in VMSAv8-32 on page G4-4106 describes VMSAv8-32-specific cache requirements.


The following sections describe aborts on VMSAv8-32 memory accesses, and how these and other faults are
reported:
• VMSAv8-32 memory aborts on page G4-4110.
• Exception reporting in a VMSAv8-32 implementation on page G4-4123.


Address translation instructions on page G4-4142 describes these operations, and how they relate to address
translation.


A number of sections then describe the System registers for VMSAv8-32. The following sections give general
information about the System registers, and the organization of the registers in the primary encoding spaces,
(coproc==0b1110) and (coproc==0b1111) for these registers:
• About the System registers for VMSAv8-32 on page G4-4148.
• VMSAv8-32 organization of registers in the (coproc==0b1110) encoding space on page G4-4172.
• VMSAv8-32 organization of registers in the (coproc==0b1111) encoding space on page G4-4175.
• Functional grouping of VMSAv8-32 System registers on page G4-4193.


The following sections then describe each of the functional groups of the System registers in the (coproc==0b1111)
encoding space, including a full description of each register in the group:
• Identification registers, functional group on page G4-4194.
• Virtual memory control registers, functional group on page G4-4196.
• Exception and fault handling registers, functional group on page G4-4200.
• General system control registers, functional group on page G4-4195.
• Lockdown, DMA, and TCM features, functional group on page G4-4205.
• Cache maintenance instructions, functional group on page G4-4201.
• TLB maintenance instructions, functional group on page G4-4202.
• Address translation instructions, functional group on page G4-4204.
• Legacy feature registers, functional group on page G4-4211.
• Performance Monitors Extension registers, functional group on page G4-4205.
• Security registers, functional group on page G4-4199.
• Virtualization registers, functional group on page G4-4197.
• IMPLEMENTATION DEFINED registers, functional group on page G4-4211.


Note


The System registers in the (coproc==0b1110) encoding space provide the following functionality:
• Self-hosted debug. These registers are described in Debug registers on page G6-4668.
• The System register interface to a trace macrocell. These registers are not described in this manual.
• Jazelle registers. These registers are summarized in Legacy feature registers, functional group on
page G4-4211.

Therefore, there is no summary of these registers by functional groups.


Pseudocode description of VMSAv8-32 memory system operations on page G4-4215 then summarizes the
pseudocode functions that describe many features of VMSAv8-32 operation.


G4.2 The effects of disabling address translation stages on VMSAv8-32 behavior


About VMSAv8-32 on page G4-4022 defines the translation regimes and the associated stages of address translation, 
each of which has its own System registers for control and configuration. VMSAv8-32 includes an enable bit for 
each stage of address translation, as follows: 


• SCTLR.M, in the Secure instance of the register, controls Secure PL1&0 stage 1 address translation.
• SCTLR.M, in the Non-secure instance of the register, controls Non-secure PL1&0 stage 1 address
translation.
• HCR.VM controls Non-secure PL1&0 stage 2 address translation.
• HSCTLR.M controls Non-secure PL2 stage 1 address translation.
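
The mapping from each stage of address translation to its enable bit can be written as a lookup table. The following Python sketch is only a restatement of the list above. The `registers` dictionary and the function name are hypothetical, and treating a bit that is absent from the dictionary as 0 is an assumption of the example, not a statement about reset values.

```python
ENABLE_BIT = {
    "Secure PL1&0 stage 1": "SCTLR.M (Secure instance)",
    "Non-secure PL1&0 stage 1": "SCTLR.M (Non-secure instance)",
    "Non-secure PL1&0 stage 2": "HCR.VM",
    "Non-secure PL2 stage 1": "HSCTLR.M",
}


def stage_enabled(stage: str, registers: dict) -> bool:
    """Return True if the enable bit for the given stage is set to 1."""
    return registers.get(ENABLE_BIT[stage], 0) == 1
```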

Note


• The descriptions throughout this chapter describe address translation as seen by Exception levels that are
using AArch32. However, for the Non-secure PL1&0 translation regime, the stage 2 translation:
—  Is controlled by the HCR if EL2 is using AArch32.
—  Is controlled by the HCR_EL2 if EL2 is using AArch64.

For this reason, links to the HCR link to a table that disambiguates between the AArch32 HCR and the
AArch64 HCR_EL2.

• If EL2 is using AArch64, then the equivalent of the Non-secure PL2 translation regime is described in
Chapter D4 The AArch64 Virtual Memory System Architecture, not in this chapter.





The following sections describe the effect on VMSAv8-32 behavior of disabling each stage of translation: 

° VMSAv8-32 behavior when stage 1 address translation is disabled. 

° VMSAv8-32 behavior when stage 2 address translation is disabled on page G4-4033. 

° Behavior of instruction fetches when all associated address translations are disabled on page G4-4033. 


Enabling stages of address translation on page G4-4033 gives more information about each stage of address 
translation, in particular after a reset on an implementation that includes EL3. 


G4.2.1 VMSAv8-32 behavior when stage 1 address translation is disabled 


When stage 1 address translation is disabled, memory accesses that would otherwise be translated by that stage of 
address translation are treated as follows: 


Non-secure PL1 and PL0 accesses when EL2 is implemented and HCR.DC is set to 1 


In an implementation that includes EL2, for an access from a Non-secure PL1 or PL0 mode when 
HCR.DC is set to 1, the stage 1 translation assigns the Normal Non-shareable, Inner Write-Back 
Read-Allocate Write-Allocate, Outer Write-Back Read-Allocate Write-Allocate memory attributes. 


See also Effect of the HCR.DC bit on page G4-4032. 


All other accesses 


For all other accesses, when a stage 1 address translation is disabled, the assigned attributes depend 
on whether the access is a data access or an instruction access, as follows: 
Data access 
The stage 1 translation assigns the Device-nGnRnE memory type. 
Instruction access 
The stage 1 translation assigns the Normal memory attribute, with the Cacheability and 
Shareability attributes determined by the value of: 

° The Secure instance of SCTLR.I for the Secure PL1&0 translation regime. 
° The Non-secure instance of SCTLR.I for the Non-secure PL1&0 translation 
regime. 
° HSCTLR.I for the Non-secure PL2 translation regime. 

In these cases, the meaning of the I bit is as follows: 

When I is set to 0 
The stage 1 translation assigns the attributes Outer Shareable, 
Non-cacheable. 

When I is set to 1 
The stage 1 translation assigns the attributes Inner Write-Through 
Read-Allocate No Write-Allocate, Outer Write-Through Read-Allocate 
No Write-Allocate Cacheable. 


Note 


On some implementations, if the SCTLR.TRE bit is set to 0 then this behavior can be 
changed by the remap settings in the memory remap registers. The details of TEX remap 
when SCTLR.TRE is set to 0 are IMPLEMENTATION DEFINED, see SCTLR.TRE, 
SCTLR.M, and the effect of the TEX remap registers on page G4-4082. 





For this stage of translation, no memory access permission checks are performed, and therefore no MMU faults 
relating to this stage of translation can be generated. 





Note 


Alignment checking is performed, and therefore Alignment faults can occur. 





For every access, when stage 1 translation is disabled, the output address of the stage 1 translation is equal to the 
input address. This is called a flat address mapping. If the implementation supports output addresses of more than 
32 bits then the output address bits above bit[31] are zero. For example, for a VA to PA translation on an 
implementation that supports 40-bit PAs, PA[39:32] is 0x00. 
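
The default assignments above can be summarized in a short sketch. This is an illustrative model only, not architectural pseudocode: the function name, the `dc_applies` flag (standing for a Non-secure PL1 or PL0 access with EL2 implemented and HCR.DC set to 1), and the attribute strings are assumptions made for this example, and the sketch ignores any IMPLEMENTATION DEFINED TEX remap effect.

```python
from typing import NamedTuple


class Stage1Disabled(NamedTuple):
    output_address: int
    attributes: str


def stage1_disabled_access(input_address, is_instruction, i_bit, dc_applies=False):
    """Hypothetical model of an access through a disabled stage 1 translation."""
    if not 0 <= input_address < (1 << 32):
        raise ValueError("input address must fit in 32 bits")
    # Flat address mapping: the output address equals the input address,
    # so any output address bits above bit[31] are zero.
    output_address = input_address
    if dc_applies:
        # Non-secure PL1 or PL0 access, EL2 implemented, HCR.DC == 1.
        attributes = "Normal, Non-shareable, Inner WB RA WA, Outer WB RA WA"
    elif not is_instruction:
        attributes = "Device-nGnRnE"
    elif i_bit == 0:
        attributes = "Normal, Outer Shareable, Non-cacheable"
    else:
        attributes = "Normal, Inner WT RA no WA, Outer WT RA no WA"
    return Stage1Disabled(output_address, attributes)
```

For example, `stage1_disabled_access(0x80001000, False, 0)` models a data access, which is assigned the Device-nGnRnE memory type at an unchanged address.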


For a Non-secure PL1 or PL0 access, if the PL1&0 stage 2 address translation is enabled, the stage 1 memory 
attribute assignments and output address can be modified by the stage 2 translation. 


See also Behavior of instruction fetches when all associated address translations are disabled on page G4-4033. 


Effect of the HCR.DC bit 


The HCR.DC bit determines the default memory attributes assigned for the first stage of the Non-secure PL1&0 
translation regime when that stage of translation is disabled. 


When executing in a Non-secure PL1 or PL0 mode with HCR.DC set to 1: 
° For all purposes other than reading the value of the SCTLR, the PE behaves as if the value of the SCTLR.M 
bit is 0. This means Non-secure PL1&0 stage 1 address translation is disabled. 


° For all purposes other than reading the value of the HCR, the PE behaves as if the value of the HCR.VM bit 
is 1. This means Non-secure PL1&0 stage 2 address translation is enabled. 
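
The two rules above amount to an override of the programmed enable bits. The following is a minimal sketch of that override for execution in a Non-secure PL1 or PL0 mode; the function name and the returned dictionary keys are hypothetical and exist only for this example.

```python
def effective_enables(sctlr_m, hcr_vm, hcr_dc):
    """Hypothetical helper: effective Non-secure PL1&0 stage enables."""
    if hcr_dc == 1:
        # The PE behaves as if SCTLR.M == 0 and HCR.VM == 1,
        # except for direct reads of SCTLR and HCR.
        return {"stage1_enabled": False, "stage2_enabled": True}
    return {"stage1_enabled": sctlr_m == 1, "stage2_enabled": hcr_vm == 1}
```

The register values returned by a read of SCTLR or HCR are unaffected, so the sketch takes the programmed bit values as inputs and reports only the behavior.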


The effect of HCR.DC might be held in TLB entries associated with a particular VMID. Therefore, if software 
executing at EL2 changes the HCR.DC value without also changing the current VMID, it must also invalidate all 
TLB entries associated with the current VMID. Otherwise, the behavior of Non-secure software executing at EL1 
or EL0 is CONSTRAINED UNPREDICTABLE, see CONSTRAINED UNPREDICTABLE behaviors due to caching of 
control or data values on page K1-5461. 


Effect of disabling translation on maintenance and address translation instructions 


Cache maintenance instructions act on the target cache whether address translation is enabled or not, and regardless 
of the values of the memory attributes. However, if a stage of translation is disabled, they use the flat address 
mapping for that stage, and all mappings are considered global. 


TLB invalidate operations act on the target TLB whether address translation is enabled or not. 


When the Non-secure PL1&0 stage 1 address translation is disabled, any ATS1C** or ATS12NSO** address 
translation instruction that accesses the Non-secure state translation reflects the effect of the HCR.DC bit. For more 
information about these operations see Address translation instructions on page G4-4142. 
G4.2.2 VMSAv8-32 behavior when stage 2 address translation is disabled 


When stage 2 address translation is disabled: 
° The IPA output from the stage 1 translation maps flat to the PA. 
° The memory attributes and permissions from the stage 1 translation apply to the PA. 


If the stage 1 address translation and the stage 2 address translation are both disabled, see Behavior of instruction 
fetches when all associated address translations are disabled. 


G4.2.3 Behavior of instruction fetches when all associated address translations are disabled 


The information in this section applies to memory accesses: 
° From Secure PL1 and PL0 modes, when the Secure PL1&0 stage 1 address translation is disabled. 
° From Hyp mode, when the Non-secure PL2 stage 1 address translation is disabled. 
° From Non-secure PL1 and PL0 modes, when all of the following apply: 
— The Non-secure PL1&0 stage 1 address translation is disabled. 
— The Non-secure PL1&0 stage 2 address translation is disabled. 
— HCR.DC is set to 0. 


In these cases, when execution is in AArch32 state a memory location might be accessed as a result of an instruction 
fetch if either: 


° The memory location is in the same 4KB block of memory, aligned to 4KB, as an instruction which a simple 
sequential execution of the program either requires to be fetched now or has required to be fetched since the 
last reset, or is in the 4KB block immediately following such a block. 


° The memory location is the target of a direct branch that a simple sequential execution of the program would 
have taken since the most recent of: 
— The last reset. 


— If the branch predictor is architecturally invisible, the last synchronization of instruction cache 
maintenance targeting the address of the branch instruction. 


— If the branch predictor is not architecturally invisible, the last synchronization of branch predictor 
maintenance targeting the address of the branch instruction. 


These accesses can be caused by speculative instruction fetches, regardless of whether the prefetched instruction is 
committed for execution. 


Note 


To ensure architectural compliance, software must ensure that both of the following apply: 





° Instructions that will be executed when address translation is disabled are located in 4KB blocks of the 
address space that contain only memory that is tolerant to speculative accesses. 


° Each 4KB block of the address space that immediately follows a 4KB block that holds instructions that will 
be executed when address translation is disabled also contains only memory that is tolerant to speculative 
accesses. 
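
As an illustration of the two requirements in the Note, the following sketch lists the 4KB blocks that must hold only speculation-tolerant memory for a given code region. The function name and the half-open address range convention are assumptions of this example, and the sketch does not model wrap-around at the top of the address space.

```python
BLOCK_SIZE = 0x1000  # 4KB


def blocks_needing_speculation_tolerance(code_start, code_end):
    """Hypothetical helper. code_end is exclusive.

    Returns the base addresses of every 4KB block that contains code, plus
    the 4KB block immediately following each such block.
    """
    if code_end <= code_start:
        raise ValueError("empty code region")
    first = code_start & ~(BLOCK_SIZE - 1)
    last = (code_end - 1) & ~(BLOCK_SIZE - 1)
    bases = set()
    for base in range(first, last + BLOCK_SIZE, BLOCK_SIZE):
        bases.add(base)               # block that holds instructions
        bases.add(base + BLOCK_SIZE)  # block that immediately follows it
    return sorted(bases)
```

For code occupying 0x80000F00 up to, but not including, 0x80001100, the sketch returns the blocks based at 0x80000000, 0x80001000, and 0x80002000.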





G4.2.4 Enabling stages of address translation 


On powerup or reset, only the SCTLR.M bit for the Exception level and Security state entered on reset is reset to 0, 
disabling address translation for the initial state of the PE. All other SCTLR.M and HSCTLR.M bits that are 
implemented are UNKNOWN after the reset. 
This means, on powerup or reset: 


° On an implementation that includes EL3, where EL3 is using AArch32: 
— The PL1&0 stage 1 address translation enable bit, SCTLR.M, is Banked, meaning there are separate 
enables for operation in Secure and Non-secure state. 
— If EL3 is using AArch32, only the Secure instance of the SCTLR.M bit resets to 0, disabling the Secure 
state PL1&0 stage 1 address translation. The reset value of the Non-secure instance of SCTLR.M is 
UNKNOWN. 


° On an implementation that includes EL2, where EL2 is using AArch32, the HSCTLR.M bit, that controls the 
Non-secure PL2 stage 1 address translation: 
— If the implementation does not include EL3, resets to 0. 
— Otherwise, is UNKNOWN. 


° On an implementation that does not include either EL2 or EL3, there is a single stage of translation. This is 
controlled by SCTLR.M, that resets to 0. 
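
The reset cases above can be tabulated with a small sketch. It assumes that the highest implemented Exception level uses AArch32 and is the level entered on reset. The function name, the dictionary keys, and the use of `None` for an UNKNOWN value are choices made for this example. The EL2-without-EL3 entry for SCTLR.M is inferred from the general rule that all other implemented enable bits are UNKNOWN.

```python
UNKNOWN = None  # stands in for an architecturally UNKNOWN reset value


def m_bit_reset_values(has_el2, has_el3):
    """Hypothetical helper: reset values of the stage 1 enable bits."""
    if has_el3:
        values = {"SCTLR.M (Secure)": 0, "SCTLR.M (Non-secure)": UNKNOWN}
        if has_el2:
            values["HSCTLR.M"] = UNKNOWN
        return values
    if has_el2:
        # Reset enters EL2, so only HSCTLR.M is reset to 0.
        return {"HSCTLR.M": 0, "SCTLR.M": UNKNOWN}
    # Single stage of translation, controlled by SCTLR.M.
    return {"SCTLR.M": 0}
```

Software that relies on a bit marked UNKNOWN here would need to program it explicitly before use.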


Note 


If, for the software that enables or disables a stage of address translation, the input address of a stage 1 translation 
differs from the output address of that stage 1 translation, and the software is running in translation regime that is 
affected by that stage of translation, then the requirement to synchronize changes to the System registers means it 
is uncertain where in the instruction stream the change of the translation takes place. For this reason, ARM strongly 
recommends that the input address and the output address are identical in this situation. 
G4.3 Translation tables 
VMSAv8-32 defines two alternative translation table formats: 


Short-descriptor format 


It uses 32-bit descriptor entries in the translation tables, and provides: 


° Up to two levels of address lookup. 

° 32-bit input addresses. 

° Output addresses of up to 40 bits. 

° Support for PAs of more than 32 bits by use of supersections, with 16MB granularity. 
° Support for No access, Client, and Manager domains. 


Long-descriptor format 


It uses 64-bit descriptor entries in the translation tables, and provides: 


° Up to three levels of address lookup. 

° Input addresses of up to 40 bits, when used for stage 2 translations. 

° Output addresses of up to 40 bits. 

° 4KB assignment granularity across the entire PA range. 

° No support for domains, all memory regions are treated as in a Client domain. 

° Fixed 4KB table size, unless truncated by the size of the input address space. 

Note 

— Translation with a 40-bit input address range requires two concatenated 4KB top-level 
tables, aligned to 8KB. 

— The VMSAv8-64 Long-descriptor translation table format is generally similar to this 
format, but supports input and output addresses of up to 48 bits, and has an assignment 
granularity and table size defined by its translation granule. This can be 4KB, 16KB, 
or 64KB. See The VMSAv8-64 translation table format on page D4-1756. 
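
The 8KB figure for a 40-bit input address follows from simple arithmetic. The sketch below assumes that the top-level lookup is a level 1 lookup in which each 64-bit descriptor covers 1GB, that is, the lookup resolves the input address bits from bit[30] upwards. That assumption comes from the Long-descriptor format description rather than from this section, and the function name is hypothetical.

```python
DESCRIPTOR_BYTES = 8    # 64-bit Long-descriptor entries
LEVEL1_REGION_BITS = 30  # assumed: each level 1 descriptor covers 1GB


def top_level_table_bytes(input_address_bits):
    """Hypothetical helper: size of the level 1 table for a given input size."""
    entries = 1 << (input_address_bits - LEVEL1_REGION_BITS)
    return entries * DESCRIPTOR_BYTES
```

Under this assumption a 40-bit input address needs 1024 descriptors, or 8192 bytes, which is two concatenated 4KB tables, while a 39-bit input address fits in a single 4KB table.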





In all implementations, of the possible address translations shown in Figure G4-2 on page G4-4027, for stages of 
address translation that are using AArch32: 


° In a particular Security state, the translation tables for the PL1&0 stage 1 translations can use either 
translation table format, and the TTBCR.EAE bit indicates the current translation table format. 


° The translation tables for the Non-secure PL2 stage 1 translations, and for the Non-secure PL1&0 stage 2 
translations, must use the Long-descriptor translation table format. 


Many aspects of performing a translation table walk depend on the current translation table format. Therefore, the 
following sections describe the two formats, including how the MMU performs a translation table walk for each 
format: 

° The VMSAv8-32 Short-descriptor translation table format on page G4-4040. 


° The VMSAv8-32 Long-descriptor translation table format on page G4-4049. 


The following subsections describe aspects of the translation tables and translation table walks, for memory 
accesses from AArch32 state, that are independent of the translation table format: 


° Translation table walks for memory accesses using VMSAv8-32 translation regimes on page G4-4036. 

° Information returned by a translation table lookup on page G4-4036. 

° Determining the translation table base address in the VMSAv8-32 translation regimes on page G4-4037. 
° Control of translation table walks on a TLB miss on page G4-4038. 

° Access to the Secure or Non-secure physical address map on page G4-4038. 


See also TLB maintenance requirements on page G4-4093. 
G4.3.1 Translation table walks for memory accesses using VMSAv8-32 translation regimes 


A translation table walk occurs as the result of a TLB miss, and starts with a read of the appropriate starting-level 
translation table. The result of that read determines whether additional translation table reads are required, for this 
stage of translation, as described in either: 


° Translation table walks, when using the VMSAv8-32 Short-descriptor translation table format on 
page G4-4046. 
° Translation table walks, when using the VMSAv8-32 Long-descriptor translation table format on 
page G4-4063. 


Note 


When using the Short-descriptor translation table format, the starting level for a translation table walk is always a 
level 1 lookup. However, with the Long-descriptor translation table format, the starting-level can be either a 
level 1 or a level 2 lookup. 








For the PL1&0 stage 1 translations, SCTLR.EE determines the endianness of the translation table lookups. SCTLR 
is Banked, and therefore the endianness is determined independently for each Security state. 


HSCTLR.EE defines the endianness for the Non-secure PL2 stage 1 and Non-secure PL1&0 stage 2 translations. 





Note 
Dynamically changing translation table endianness 

Because any change to SCTLR.EE or HSCTLR.EE requires synchronization before it is visible to 

subsequent operations, ARM strongly recommends that: 

° SCTLR.EE is changed only when either: 
— Executing in a mode that does not use the translation tables affected by SCTLR.EE. 
— Executing with SCTLR.M set to 0. 

° HSCTLR.EE is changed only when either: 
— Executing in a mode that does not use the translation tables affected by HSCTLR.EE. 
— Executing with HSCTLR.M set to 0. 





The physical address of the base of the starting-level translation table is determined from the appropriate TTBR, see 
Determining the translation table base address in the VMSAv8-32 translation regimes on page G4-4037. 


For more information, see Ordering and completion of TLB maintenance instructions on page G4-4096. 


Translation table walks must access data or unified caches, or data and unified caches, of other agents participating 
in the coherency protocol, according to the Shareability attributes described in the TTBR. These Shareability 
attributes must be consistent with the Shareability attributes for the translation tables themselves. 


G4.3.2 Information returned by a translation table lookup 


When an associated stage of address translation is enabled, a memory access requires one or more translation table 
lookups. If the required translation table descriptor is not held in a TLB, a translation table walk is performed to 
obtain the descriptor. A lookup, whether from the TLB or as the result of a translation table walk, returns both: 


° An output address that corresponds to the input address for the lookup. 


° A set of properties that correspond to that output address. 


The returned properties are classified as providing address map control, access controls, or region attributes. This 
classification determines how the descriptions of the properties are grouped. The classification is based on the 
following model: 


Address map control 


Memory accesses from Secure state can access either the Secure or the Non-secure address map, as 
summarized in Access to the Secure or Non-secure physical address map on page G4-4038. 


Memory accesses from Non-secure state can only access the Non-secure address map. 
Access controls 


Determine whether the PE, in its current state, can access the output address that corresponds to the 
given input address. If not, an MMU fault is generated and there is no memory access. 


Memory access control on page G4-4068 describes the properties in this group. 
Attributes Are valid only for an output address that the PE, in its current state, can access. The attributes define 
aspects of the required behavior of accesses to the target memory region. 


Memory region attributes on page G4-4077 describes the properties in this group. 





G4.3.3 Determining the translation table base address in the VMSAv8-32 translation regimes 
On a TLB miss, the VMSA must perform a translation table walk, and therefore must find the base address of the 
translation table to use for its lookup. A TTBR holds this address. As Figure G4-2 on page G4-4027 shows: 
° For a Non-secure PL2 stage 1 translation, the HTTBR holds the required base address. The HTCR is the 
control register for these translations. 
° For a Non-secure PL1&0 stage 2 translation, the VTTBR holds the required base address. The VTCR is the 
control register for these translations. 
° For a PL1&0 stage 1 translation, either TTBR0 or TTBR1 holds the required base address. The TTBCR is 
the control register for these translations. 
The Non-secure copies of TTBR0, TTBR1, and TTBCR, relate to the Non-secure PL1&0 stage 1 translation. 
The Secure copies of TTBR0, TTBR1, and TTBCR, relate to the Secure PL1&0 stage 1 translation. 
For the PL1&0 translation table walks: 
° TTBR0 can be configured to describe the translation of VAs in the entire address map, or to describe only the 
translation of VAs in the lower part of the address map. 
° If TTBR0 is configured to describe the translation of VAs in the lower part of the address map, TTBR1 is 
configured to describe the translation of VAs in the upper part of the address map. 
The contents of the appropriate instance of the TTBCR determine whether the address map is separated into two 
parts, and where the separation occurs. The details of the separation depend on the current translation table format, 
see: 
° Selecting between TTBR0 and TTBR1, VMSAv8-32 Short-descriptor translation table format on 
page G4-4045. 
° Selecting between TTBR0 and TTBR1, VMSAv8-32 Long-descriptor translation table format on 
page G4-4057. 
Example G4-1 shows a typical use of the two sets of translation tables: 
Example G4-1 Example use of TTBR0 and TTBR1 
An example of using the two TTBRs for PL1&0 stage 1 address translations is: 
TTBR0 Used for process-specific addresses. 
Each process maintains a separate level 1 translation table. On a context switch: 
° TTBR0 is updated to point to the level 1 translation table for the new context. 
° TTBCR is updated if this change changes the size of the translation table. 
° The CONTEXTIDR is updated. 
TTBCR can be programmed so that all translations use TTBR0 in a manner compatible with 
architecture versions before ARMv6. 
TTBR1 Used for operating system and I/O addresses, that do not change on a context switch. 


G4.3.4 Control of translation table walks on a TLB miss 


Two bits in the TCR for the translation stage required by a memory access control whether a translation table walk 
is performed on a TLB miss. These two bits are the: 


° PD0 and PD1 bits, on a PE using the Short-descriptor translation table format. 
° EPD0 and EPD1 bits, on a PE using the Long-descriptor translation table format. 
Note 





For the VMSAv8-32 translation regimes, the different bit names are because the bits are in different positions in 
TTBCR, depending on the translation table format. 





The effect of these bits is: 


{E}PDx == 0 If a TLB miss occurs based on TTBRx, a translation table walk is performed. The current security 
state determines whether the memory access is Secure or Non-secure. 


{E}PDx == 1 If a TLB miss occurs based on TTBRx, a level 1 Translation fault is returned, and no translation 
table walk is performed. 
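
The effect of these bits can be expressed as a small decision function. This is a sketch under stated assumptions: the function name and the returned strings are hypothetical, and `pd_bits` is taken to be the pair (PD0, PD1) or (EPD0, EPD1) read from the appropriate control register.

```python
def on_tlb_miss(ttbr_index, pd_bits):
    """Hypothetical helper: outcome of a TLB miss based on TTBR0 or TTBR1.

    ttbr_index is 0 or 1; pd_bits is (PD0, PD1) or (EPD0, EPD1).
    """
    if ttbr_index not in (0, 1):
        raise ValueError("ttbr_index must be 0 or 1")
    if pd_bits[ttbr_index] == 1:
        # No translation table walk is performed.
        return "level 1 Translation fault"
    return "translation table walk"
```

For example, with `pd_bits = (0, 1)`, a miss based on TTBR0 leads to a walk, while a miss based on TTBR1 returns a fault without a walk.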


G4.3.5 Access to the Secure or Non-secure physical address map 


As stated in Address spaces in VMSAv8-32 on page G4-4025, a PE can access independent Secure and Non-secure 
address maps. When the PL1 Exception level is using AArch32, these are defined by the translation tables identified 
by the Secure TTBR0 and TTBR1. In both translation table formats in the Secure translation tables, the NS bit in a 
descriptor indicates whether the descriptor refers to the Secure or the Non-secure address map: 


NS == 0 Access the Secure physical address space. 


NS == 1 Access the Non-secure physical address space. 


Note 


In the Non-secure translation tables, the corresponding bit is SBZ. Non-secure accesses always access the 
Non-secure physical address space, regardless of the value of this bit. 








The Long-descriptor translation table format extends this control, adding an NSTable bit to the Secure translation 
tables, as described in Hierarchical control of Secure or Non-secure memory accesses, Long-descriptor format on 
page G4-4056. In the Non-secure translation tables, the corresponding bit is SBZ, and Non-secure accesses ignore 
the value of this bit. 


The following sections describe the address map controls in the two implementations: 
° Control of Secure or Non-secure memory access, VMSAv8-32 Short-descriptor format on page G4-4045. 
° Control of Secure or Non-secure memory access, VMSAv8-32 Long-descriptor format on page G4-4056. 


The following subsection gives more information. 


Secure and Non-secure address spaces 


EL3 provides two physical address spaces, a Secure physical address space and a Non-secure physical address 
space. 


As described in Access to the Secure or Non-secure physical address map, for the PL1&0 stage 1 translations when 
controlled from an Exception level using AArch32, the registers that control the stage of translation, TTBR0, 
TTBR1, and TTBCR, are Banked between Secure and Non-secure versions, and the Security state of the PE when 
it performs a memory access selects the corresponding version of the registers. This means there are independent 
Secure and Non-secure versions of these translation tables, and translation table walks are made to the physical 
address space corresponding to the security state of the translation tables used. 


For a translation table walk caused by a memory access from Non-secure state, all memory accesses are to the 
Non-secure address space. 
For a translation table walk caused by a memory access from Secure state: 


° When address translation is using the Long-descriptor translation table format: 
— The initial lookup performed must access the Secure address space. 
— If a table descriptor read from the Secure address space has the NSTable bit set to 0, then the next level 
of lookup is from the Secure address space. 
— If a table descriptor read from the Secure address space has the NSTable bit set to 1, then the next level 
of lookup, and any subsequent level of lookup, is from the Non-secure address space. 

For more information, see Control of Secure or Non-secure memory access, VMSAv8-32 Long-descriptor 
format on page G4-4056. 


° Otherwise, all memory accesses are to the Secure address space. 
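
The rules for selecting the address space of each lookup can be sketched as follows. The function and parameter names are hypothetical. `nstable_bits` is assumed to hold the NSTable values of the successive table descriptors read during one walk, so the result has one entry more than the input.

```python
def walk_address_spaces(nstable_bits, from_secure=True, long_descriptor=True):
    """Hypothetical helper: address space used for each lookup in a walk."""
    if not from_secure:
        # Walks for Non-secure accesses use only the Non-secure address space.
        return ["Non-secure"] * (len(nstable_bits) + 1)
    current = "Secure"  # the initial lookup accesses the Secure address space
    spaces = [current]
    for nstable in nstable_bits:
        # NSTable is honored only for descriptors read from the Secure
        # address space, and only in the Long-descriptor format.
        if long_descriptor and current == "Secure" and nstable == 1:
            current = "Non-secure"
        spaces.append(current)
    return spaces
```

Once a lookup has moved to the Non-secure address space, later NSTable values have no effect, which the sketch models by testing `current` before each switch.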


Note 





When executing in Non-secure state, additional translations are supported. For memory accesses from 
AArch32 state these are: 


— Non-secure PL2 stage 1 translation. 


—  Non-secure PL1&0 stage 2 translation. 


These translations can access only the Non-secure address space. 


A system implementation can alias parts of the Secure physical address space to the Non-secure physical 
address space in an implementation-specific way. As with any other aliasing of physical memory, the use of 
aliases in this way can require the use of cache maintenance instructions to ensure that changes to memory 
made using one alias of the physical memory are visible to accesses to the other alias of the physical memory. 
G4.4 The VMSAv8-32 Short-descriptor translation table format 


The Short-descriptor translation table format supports a memory map based on memory sections or pages: 


Supersections Consist of 16MB blocks of memory. Support for Supersections is optional, except that an 
implementation that supports more than 32 bits of Physical Address must also support 
Supersections to provide access to the entire Physical Address space. 


Sections Consist of 1MB blocks of memory. 
Large pages Consist of 64KB blocks of memory. 
Small pages Consist of 4KB blocks of memory. 


Supersections, Sections and Large pages map large regions of memory using only a single TLB entry. 


Note 


Whether a VMSAv8-32 implementation of the Short-descriptor format translation tables supports supersections is 
IMPLEMENTATION DEFINED. 








When using the Short-descriptor translation table format, two levels of translation tables are held in memory: 


Level 1 table 

Holds level 1 descriptors that contain the base address and: 

° Translation properties for a Section and Supersection. 

° Translation properties and pointers to a level 2 table for a Large page or a Small page. 
Level 2 tables 


Hold level 2 descriptors that contain the base address and translation properties for a Small page or 
a Large page. With the Short-descriptor format, level 2 tables can be referred to as Page tables. 


A level 2 table requires 1KB of memory. 


In the translation tables, in general, a descriptor is one of: 


° An invalid or fault entry. 

° A Page table entry, that points to a next-level translation table. 

° A page or section entry, that defines the memory properties for the access. 
° A reserved format. 


Bits[1:0] of the descriptor give the primary indication of the descriptor type. 


Figure G4-3 on page G4-4041 gives a general view of address translation when using the Short-descriptor 
translation table format. 
+ Repeated entries required because of descriptor field overlaps. 


Figure G4-3 General view of address translation using VMSAv8-32 Short-descriptor format translation tables 


Additional requirements for Short-descriptor format translation tables on page G4-4044 describes why, when using 
the Short-descriptor format, Supersection and Large page entries must be repeated 16 times, as shown in 
Figure G4-3. 


VMSAv8-32 Short-descriptor translation table format descriptors, Memory attributes in the VMSAv8-32 
Short-descriptor translation table format descriptors on page G4-4044, and Control of Secure or Non-secure 
memory access, VMSAv8-32 Short-descriptor format on page G4-4045 describe the format of the descriptors in the 
Short-descriptor format translation tables. 


The following sections then describe the use of this translation table format: 


° Selecting between TTBRO and TTBR1, VMSAv8-32 Short-descriptor translation table format on 
page G4-4045. 


° Translation table walks, when using the VMSAv8-32 Short-descriptor translation table format on 
page G4-4046. 


G4.4.1 VMSAv8-32 Short-descriptor translation table format descriptors 


The following sections describe the formats of the entries in the Short-descriptor translation tables: 
° Short-descriptor translation table level 1 descriptor formats on page G4-4042. 
° Short-descriptor translation table level 2 descriptor formats on page G4-4043. 


For more information about level 2 translation tables see Additional requirements for Short-descriptor format 
translation tables on page G4-4044. 


Note 


Previous versions of the ARM Architecture Reference Manual, and some other documentation, describe the AP[2] 
bit in the translation table entries as the APX bit. 








Information returned by a translation table lookup on page G4-4036 describes the classification of the non-address 
fields in the descriptors as address map control, access control, or attribute fields. 
Short-descriptor translation table level 1 descriptor formats 
Each entry in the level 1 table describes the mapping of the associated 1MB VA range. 


Figure G4-4 shows the possible level 1 descriptor formats. 





Invalid Bits[31:2] IGNORED. Bits[1:0] 0b00. 

Page table Bits[31:10] Page table base address, bits[31:10]. Bit[9] IMPLEMENTATION DEFINED. Bits[8:5] Domain. 
Bit[4] RES0. Bit[3] NS. Bit[2] PXN. Bits[1:0] 0b01. 

Section Bits[31:20] Section base address, PA[31:20]. Bit[19] NS. Bit[18] 0. Bit[17] nG. Bit[16] S. Bit[15] AP[2]. 
Bits[14:12] TEX[2:0]. Bits[11:10] AP[1:0]. Bit[9] IMPLEMENTATION DEFINED. Bits[8:5] Domain. Bit[4] XN. 
Bit[3] C. Bit[2] B. Bit[1] 1. Bit[0] PXN. 

Supersection Bits[31:24] Supersection base address, PA[31:24]. Bits[23:20] Extended base address, PA[35:32]. 
Bit[19] NS. Bit[18] 1. Bit[17] nG. Bit[16] S. Bit[15] AP[2]. Bits[14:12] TEX[2:0]. Bits[11:10] AP[1:0]. 
Bit[9] IMPLEMENTATION DEFINED. Bits[8:5] Extended base address, PA[39:36]. Bit[4] XN. Bit[3] C. Bit[2] B. 
Bit[1] 1. Bit[0] PXN. 

Figure G4-4 VMSAv8-32 Short-descriptor level 1 descriptor formats 
Descriptor bits[1:0] identify the descriptor type. The encoding of these bits is: 


0b00, Invalid entry 
The associated VA is unmapped, and any attempt to access it generates a Translation fault. 
Bits[31:2] of the descriptor are IGNORED, see IGNORED on page Glossary-5718. This means 
software can use these bits for its own purposes. 

0b01, Page table 
The descriptor gives the address of a level 2 translation table, that specifies the mapping of the 
associated 1MByte VA range. 

0b10, Section or Supersection 


The descriptor gives the base address of the Section or Supersection. Bit[18] determines whether 
the entry describes a Section or a Supersection. 


This encoding also defines the PXN bit as 0. 


0b11, Section or Supersection, if the implementation supports the PXN attribute 


This encoding is identical to 0b10, except that it defines the PXN bit as 1. 
Note 


A VMSAv8-32 implementation can use the Short-descriptor translation table format for the PL1&0 stage 1 
translations, by setting TTBCR.EAE to 0. 








The address information in the level 1 descriptors is: 
Page table Bits[31:10] of the descriptor are bits[31:10] of the address of a Page table. 
Section Bits[31:20] of the descriptor are bits[31:20] of the address of the Section. 
Supersection Bits[31:24] of the descriptor are bits[31:24] of the address of the Supersection. 
Optionally, bits[8:5, 23:20] of the descriptor are bits[39:32] of the extended Supersection address. 


For the Non-secure PL1&0 translation tables, the address in the descriptor is the IPA of the Page table, Section, or 
Supersection. Otherwise, the address is the PA of the Page table, Section, or Supersection. 


For descriptions of the other fields in the descriptors, see Memory attributes in the VMSAv8-32 Short-descriptor 
translation table format descriptors on page G4-4044. 
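
The type encoding and address fields described above can be decoded with a few masks. The following sketch is illustrative only: the function name and the returned dictionary layout are assumptions of this example, it treats the 0b11 encoding as a Section or Supersection with PXN set to 1, which applies only if the implementation supports the PXN attribute, and it always includes the optional extended Supersection address bits.

```python
def decode_level1(desc):
    """Hypothetical decoder for a Short-descriptor level 1 descriptor."""
    kind = desc & 0b11
    if kind == 0b00:
        return {"type": "Invalid"}
    if kind == 0b01:
        # Bits[31:10] are bits[31:10] of the Page table address.
        return {"type": "Page table", "base": desc & 0xFFFFFC00}
    pxn = desc & 1  # 0b10 defines PXN as 0, 0b11 defines PXN as 1
    if (desc >> 18) & 1:
        base = desc & 0xFF000000              # address bits[31:24]
        base |= ((desc >> 20) & 0xF) << 32    # bits[23:20] -> address bits[35:32]
        base |= ((desc >> 5) & 0xF) << 36     # bits[8:5]   -> address bits[39:36]
        return {"type": "Supersection", "base": base, "pxn": pxn}
    # Bits[31:20] are bits[31:20] of the Section address.
    return {"type": "Section", "base": desc & 0xFFF00000, "pxn": pxn}
```

As the text notes, the decoded base is an IPA for the Non-secure PL1&0 translation tables and a PA otherwise. The sketch does not distinguish the two.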


Short-descriptor translation table level 2 descriptor formats 


Figure G4-5 shows the possible formats of a level 2 descriptor. 
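Invalid Bits[31:2] IGNORED. Bits[1:0] 0b00. 

Large page Bits[31:16] Large page base address, PA[31:16]. Bit[15] XN. Bits[14:12] TEX[2:0]. Bit[11] nG. 
Bit[10] S. Bit[9] AP[2]. Bits[8:6] RES0. Bits[5:4] AP[1:0]. Bit[3] C. Bit[2] B. Bits[1:0] 0b01. 

Small page Bits[31:12] Small page base address, PA[31:12]. Bit[11] nG. Bit[10] S. Bit[9] AP[2]. 
Bits[8:6] TEX[2:0]. Bits[5:4] AP[1:0]. Bit[3] C. Bit[2] B. Bit[1] 1. Bit[0] XN. 
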


Figure G4-5 Short-descriptor level 2 descriptor formats 


Descriptor bits[1:0] identify the descriptor type. The encoding of these bits is: 


0b00, Invalid entry 
The associated VA is unmapped, and attempting to access it generates a Translation fault. 
Bits[31:2] of the descriptor are IGNORED, see IGNORED on page Glossary-5718. This means 
software can use these bits for its own purposes. 

0b01, Large page 
The descriptor gives the base address and properties of the Large page. 

0b1x, Small page 
The descriptor gives the base address and properties of the Small page. 
In this descriptor format, bit[0] of the descriptor is the XN bit. 

The address information in the level 2 descriptors is: 


Large page Bits[31:16] of the descriptor are bits[31:16] of the address of the Large page. 
Small page Bits[31:12] of the descriptor are bits[31:12] of the address of the Small page. 


For the Non-secure PL1&0 translation tables, the address in the descriptor is the IPA of the Large page or 
Small page. Otherwise, the address is the PA of the Large page or Small page. 
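
As an illustration of the level 2 encodings above, the following C sketch classifies a level 2 descriptor and extracts its base address. The names are hypothetical and are not defined by the architecture. Only fields stated in this section are decoded.

```c
#include <stdint.h>

/* Hypothetical names for the level 2 descriptor types. */
typedef enum { L2_INVALID, L2_LARGE_PAGE, L2_SMALL_PAGE } l2_type_t;

/* Classify a level 2 Short-descriptor entry from bits[1:0]. */
static l2_type_t l2_type(uint32_t desc)
{
    if (desc & 0x2u)
        return L2_SMALL_PAGE;   /* 0b1x */
    if (desc & 0x1u)
        return L2_LARGE_PAGE;   /* 0b01 */
    return L2_INVALID;          /* 0b00 */
}

/* Return the page base address held in the descriptor, or 0 if invalid. */
static uint32_t l2_base(uint32_t desc)
{
    switch (l2_type(desc)) {
    case L2_LARGE_PAGE:
        return desc & 0xFFFF0000u;   /* bits[31:16] */
    case L2_SMALL_PAGE:
        return desc & 0xFFFFF000u;   /* bits[31:12] */
    default:
        return 0;
    }
}

/* In the Small page format, bit[0] of the descriptor is the XN bit. */
static unsigned l2_small_page_xn(uint32_t desc)
{
    return desc & 0x1u;
}
```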

For descriptions of the other fields in the descriptors, see Memory attributes in the VMSAv8-32 Short-descriptor 
translation table format descriptors. 


Additional requirements for Short-descriptor format translation tables 


When using Supersection or Large page descriptors in the Short-descriptor translation table format, the input 
address field that defines the Supersection or Large page descriptor address overlaps the table address field. In each 
case, the size of the overlap is 4 bits. The following diagrams show these overlaps: 


° Figure K7-14 on page K7-5566 for the level 1 translation table entry for a Supersection. 
° Figure K7-16 on page K7-5568 for the level 2 translation table entry for a Large page. 


Considering the case of using Large page descriptors in a level 2 translation table, this overlap means that for any 
specific Large page, the bottom four bits of the level 2 translation table entry might take any value from 0b0000 to 
0b1111. Therefore, each of these sixteen index values must point to a separate copy of the same descriptor. 


This means that each Large page or Supersection descriptor must: 
° Occur first on a sixteen-word boundary. 


° Be repeated in 16 consecutive memory locations. 
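
The replication requirement above could be met by a helper like the following sketch. The function name, the in-memory table representation, and the error convention are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write a Large page or Supersection descriptor into 16 consecutive
 * entries of a translation table held as an array of 32-bit words.
 * first_index must be a multiple of 16, so that the first copy lies on
 * a sixteen-word boundary. Returns 0 on success, -1 on a bad index.
 */
static int write_replicated(uint32_t *table, size_t entries,
                            size_t first_index, uint32_t desc)
{
    if ((first_index & 0xFu) != 0 || first_index + 16 > entries)
        return -1;
    for (size_t i = 0; i < 16; i++)
        table[first_index + i] = desc;
    return 0;
}
```

A caller would pass the number of entries in the table, for example 256 for a table indexed by an 8-bit field, and the index of the first of the sixteen entries.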


G4.4.2 Memory attributes in the VMSAv8-32 Short-descriptor translation table format descriptors 
This section describes the descriptor fields other than the descriptor type field and the address field: 
TEX[2:0], C, B 
Memory region attribute bits, see Memory region attributes on page G4-4077. 
These bits are not present in a descriptor for a Page table. 
XN bit The Execute-never bit. Determines whether the PE can execute software from the addressed region, 
see Execute-never restrictions on instruction fetching on page G4-4071. 
This bit is not present in a descriptor for a Page table. 


PXN bit The Privileged execute-never bit. Determines whether the PE can execute software from the region 
when executing at PL1, see Execute-never restrictions on instruction fetching on page G4-4071. 


Note 


Memory accesses by software executing at EL2 always use the Long-descriptor translation table 
format. 





When this bit is set to 1 in the descriptor for a Page table, it indicates that all memory pages 
described in the corresponding Page table are Privileged execute-never. 


NS bit Non-secure bit. Specifies whether the translated PA is in the Secure or Non-secure address map, see 
Control of Secure or Non-secure memory access, VMSAv8-32 Short-descriptor format on 
page G4-4045. 


This bit is not present in level 2 descriptors. The value of the NS bit in a level 1 descriptor for a 
Page table applies to all entries in the corresponding level 2 translation table. 

Domain Domain field, see Domains, Short-descriptor format only on page G4-4073. 
This field is not present in a Supersection entry. Memory described by Supersections is in domain 0. 
This field is not present in level 2 descriptors. The value of the Domain field in the level 1 descriptor 
for a Page table applies to all entries in the corresponding level 2 translation table. 

An IMPLEMENTATION DEFINED bit 


This bit is not present in level 2 descriptors. 

AP[2], AP[1:0] 
Access Permissions bits, see Memory access control on page G4-4068. 
AP[0] can be configured as the Access flag, see The Access flag on page G4-4074. 
These bits are not present in a descriptor for a Page table. 


S bit Shareable bit. Used in determining the Shareability of the addressed region, see Memory region 
attributes on page G4-4077. 


Note 


The naming of this bit as the Shareable bit is carried forward from early versions of the ARM 
architecture. This name is no longer an adequate description of the interpretation of the bit. 





This bit is not present in a descriptor for a Page table. 


nG bit The not global bit. If a lookup using this descriptor is cached in a TLB, determines whether the TLB 
entry applies to all ASID values, or only to the current ASID value. See Global and process-specific 
translation table entries on page G4-4089. 


This bit is not present in a descriptor for a Page table. 


Bit[18], when bits[1:0] indicate a Section or Supersection descriptor 


0 Descriptor is for a Section. 
1 Descriptor is for a Supersection. 
G4.4.3 Control of Secure or Non-secure memory access, VMSAv8-32 Short-descriptor format 


Access to the Secure or Non-secure physical address map on page G4-4038 describes how the NS bit in the 
translation table entries: 


° For accesses from Secure state, determines whether the access is to Secure or Non-secure memory. 


° Is ignored by accesses from Non-secure state. 


In the Short-descriptor translation table format, the NS bit is defined only in the level 1 translation tables. This 
means that, in a level 1 descriptor for a Page table, the NS bit defines the physical address space, Secure or 
Non-secure, for all of the Large pages and Small pages of memory described by that table. 


The NS bit of a level 1 descriptor for a Page table has no effect on the physical address space in which that translation 
table is held. As stated in Secure and Non-secure address spaces on page G4-4038, the physical address of that 
translation table is in: 


° The Secure address space if the translation table walk is in Secure state. 


° The Non-secure address space if the translation table walk is in Non-secure state. 


This means the granularity of the Secure and Non-secure memory spaces is 1MB. However, in these memory 
spaces, table entries can define physical memory regions with a granularity of 4KB. 


G4.4.4 Selecting between TTBR0 and TTBR1, VMSAv8-32 Short-descriptor translation table format 


As described in Determining the translation table base address in the VMSAv8-32 translation regimes on 
page G4-4037, two sets of translation tables can be defined for each of the PL1&0 stage 1 translations, and TTBR0 
and TTBR1 hold the base addresses for the two sets of tables. When using the Short-descriptor translation table 
format, the value of TTBCR.N indicates the number of most significant bits of the input VA that determine whether 
TTBR0 or TTBR1 holds the required translation table base address, as follows: 
° If N == 0 then use TTBR0. Setting TTBCR.N to zero disables use of a second set of translation tables. 
° If N > 0 then: 

— If bits[31:32-N] of the input VA are all zero then use TTBR0. 

— Otherwise use TTBR1. 
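
The selection rule above can be sketched as a small C function. The function name and its return convention (0 for TTBR0, 1 for TTBR1) are hypothetical.

```c
#include <stdint.h>

/*
 * Select the translation table base register for an input VA.
 * n is the value of TTBCR.N, in the range 0 to 7.
 * Returns 0 to indicate TTBR0 and 1 to indicate TTBR1.
 */
static int select_ttbr(uint32_t va, unsigned n)
{
    if (n == 0)
        return 0;                      /* second set of tables disabled */
    /* Bits[31:32-N] of the VA: all zero selects TTBR0. */
    return (va >> (32u - n)) != 0 ? 1 : 0;
}
```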





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G4-4045 
1ID092916 Non-Confidential 


G4 The AArch32 Virtual Memory System Architecture 
G4.4 The VMSAv8-32 Short-descriptor translation table format 


Table G4-1 shows how the value of N determines the lowest address translated using TTBR1, and the size of the 
level 1 translation table addressed by TTBR0. 


Table G4-1 Effect of TTBCR.N on address translation, Short-descriptor format 

TTBCR.N   First address translated with TTBR1   TTBR0 table size   TTBR0 table index range 
0b000     TTBR1 not used                        16KB               VA[31:20] 
0b001     0x80000000                            8KB                VA[30:20] 
0b010     0x40000000                            4KB                VA[29:20] 
0b011     0x20000000                            2KB                VA[28:20] 
0b100     0x10000000                            1KB                VA[27:20] 
0b101     0x08000000                            512 bytes          VA[26:20] 
0b110     0x04000000                            256 bytes          VA[25:20] 
0b111     0x02000000                            128 bytes          VA[24:20] 








Whenever TTBCR.N is nonzero, the size of the translation table addressed by TTBR1 is 16KB. 
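
The rows of Table G4-1 follow a regular pattern, which the following C sketch reproduces. The function names are hypothetical, and the closed-form expressions are inferred from the table values rather than quoted from the architecture.

```c
#include <stdint.h>

/* First VA translated with TTBR1, for TTBCR.N in the range 1 to 7. */
static uint32_t ttbr1_first_va(unsigned n)
{
    return 1u << (32u - n);
}

/* Size in bytes of the level 1 table addressed by TTBR0, for N in 0 to 7. */
static uint32_t ttbr0_table_bytes(unsigned n)
{
    return 0x4000u >> n;               /* 16KB when N == 0, halved per step */
}

/* Most significant VA bit used to index the TTBR0 table, for N in 0 to 7. */
static unsigned ttbr0_index_msb(unsigned n)
{
    return 31u - n;                    /* index range is VA[31-N:20] */
}
```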


Figure G4-6 shows how the value of TTBCR.N controls the boundary between VAs that are translated using 
TTBR0, and VAs that are translated using TTBR1. 

[Figure G4-6: the VA range from 0x00000000 to 0xFFFFFFFF. When TTBCR.N == 0b000, use of TTBR1 is disabled and the 
whole range is the TTBR0 region. Otherwise the TTBR0 region lies below the boundary and the TTBR1 region lies above 
it. The boundary is at 0x02000000 when TTBCR.N == 0b111, and decreasing N moves it upward.] 

Figure G4-6 How TTBCR.N controls the boundary between the TTBRs, Short-descriptor format 

In the selected TTBR, bits RGN, S and IRGN[1:0] define the memory region attributes for the translation table walk. 
Translation table walks, when using the VMSAv8-32 Short-descriptor translation table format describes the 
translation. 

G4.4.5 Translation table walks, when using the VMSAv8-32 Short-descriptor translation table format 


When using the Short-descriptor translation table format, and a memory access requires a translation table walk: 





° A section-mapped access only requires a read of the level 1 translation table. 
° A page-mapped access also requires a read of the level 2 translation table. 

Reading a level 1 translation table describes how either TTBR1 or TTBR0 is used, with the accessed VA, to 
determine the address of the level 1 descriptor. 


Reading a level 1 translation table shows the output address as A[39:0]: 


° For a Non-secure PL1&0 stage 1 translation, this is the IPA of the required descriptor. A Non-secure PL1&0 
stage 2 translation of this address is performed to obtain the PA of the descriptor. 


° Otherwise, this address is the PA of the required descriptor. 


The full translation flow for Sections, Supersections, Small pages and Large pages on page G4-4048 then shows the 
complete translation flow for each valid memory access. 


Reading a level 1 translation table 


When performing a fetch based on TTBR0: 

° The address bits taken from TTBR0 vary between bits[31:14] and bits[31:7]. 

° The address bits taken from the VA, that is the input address for the translation, vary between bits[31:20] and 
bits[24:20]. 

The widths of the TTBR0 and VA fields depend on the value of TTBCR.N, as Figure G4-7 shows. 


When performing a fetch based on TTBR1, bits TTBR1[31:14] are concatenated with bits[31:20] of the VA. This 
makes the fetch equivalent to that shown in Figure G4-7, with N==0. 
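
The level 1 descriptor address calculation can be sketched in C as follows. The function names are hypothetical. The sketch assumes that, for a given N, the base comes from TTBR0[31:14-N] and the index from VA[31-N:20], which is inferred from the bit ranges quoted above, and that each descriptor is four bytes so that bits[1:0] of the address are zero. Bits[39:32] of the output address are zero and are not modeled.

```c
#include <stdint.h>

/* Address of the level 1 descriptor for a fetch based on TTBR0.
 * n is the value of TTBCR.N, in the range 0 to 7. */
static uint32_t l1_desc_addr_ttbr0(uint32_t ttbr0, uint32_t va, unsigned n)
{
    uint32_t base_mask = ~((1u << (14u - n)) - 1u);      /* TTBR0[31:14-N] */
    uint32_t index = (va & (0xFFFFFFFFu >> n)) >> 20;    /* VA[31-N:20] */
    return (ttbr0 & base_mask) | (index << 2);           /* 4-byte entries */
}

/* Address of the level 1 descriptor for a fetch based on TTBR1.
 * This is the TTBR0 calculation with N == 0. */
static uint32_t l1_desc_addr_ttbr1(uint32_t ttbr1, uint32_t va)
{
    return (ttbr1 & 0xFFFFC000u) | ((va >> 20) << 2);    /* TTBR1[31:14], VA[31:20] */
}
```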


Note 


See The address and Properties fields shown in the translation flows on page K7-5570 for more information about 
the Properties label used in this and other figures. 

[Figure G4-7: the descriptor address is formed from the translation base taken from TTBR0, followed by the table 
index taken from the VA, followed by two zero bits, with A[39:32] = 0x00.] 

N is the value of TTBCR.N 

For details of the Properties field, see the register description 

Figure G4-7 Accessing level 1 translation table based on TTBR0, Short-descriptor format 


Regardless of which register is used as the base for the fetch, the resulting output address selects a four-byte 
translation table entry that is one of: 


° A level 1 descriptor for a Section or Supersection. 
° A descriptor for a Page table, that points to a level 2 translation table. In this case: 
— A second fetch is performed to retrieve a level 2 descriptor. 
— The descriptor also contains some attributes for the access, see Figure G4-4 on page G4-4042. 


° A faulting entry. 

The full translation flow for Sections, Supersections, Small pages and Large pages 

In a translation table walk, only the initial lookup uses the translation table base address from the appropriate TTBR. 
Subsequent lookups use a combination of address information from: 

° The table descriptor read in the previous lookup. 

° The input address. 

Address translation examples using the VMSAv8-32 Short descriptor translation table format on page K7-5565 
shows the full translation flow for each of the memory section and page options. As described in VMSAv8-32 
Short-descriptor translation table format descriptors on page G4-4041, these options are: 


Supersection A 16MB memory region, see Translation flow for a Supersection on page K7-5566. 
Section A 1MB memory region, see Translation flow for a Section on page K7-5567. 


Large page A 64KB memory region, described by the combination of: 
° A level 1 translation table entry that indicates the address of a level 2 Page table. 
° A level 2 descriptor that indicates a Large page. 


See Translation flow for a Large page on page K7-5568. 


Small page A 4KB memory region, described by the combination of: 
° A level 1 translation table entry that indicates the address of a level 2 Page table. 
° A level 2 descriptor that indicates a Small page. 

See Translation flow for a Small page on page K7-5569. 

G4.5 The VMSAv8-32 Long-descriptor translation table format 


The VMSAv8-32 Long-descriptor translation table format supports the assignment of memory attributes to memory 
Pages, at a granularity of 4KB, across the complete input address range. It also supports the assignment of memory 
attributes to blocks of memory, where a block can be 2MB or 1GB. 


Note 


° Although the VMSAv8-32 Long-descriptor format is limited to three levels of address lookup, its design and 
naming conventions support extension to additional levels, to support a larger input address range. 





° Similarly, while the VMSAv8-32 implementation limits the output address range to 40 bits, its design 
supports extension to a larger output address range. 





Figure G4-2 on page G4-4027 shows the different address translation stages. The Long-descriptor translation table 
format: 
° Is used for: 
— The Non-secure PL2 stage 1 translation. 
— The Non-secure PL1&0 stage 2 translation. 
° Can be used for the Secure and Non-secure PL1&0 translations. 


When used for a stage 1 translation, the translation tables support an input address of up to 32 bits, corresponding 
to the VA address range of the PE. 


When used for a stage 2 translation, the translation tables support an input address range of up to 40 bits, to support 
the translation from IPA to PA. If the input address for the stage 2 translation is a 32-bit address then this address is 
zero-extended to 40 bits. 


Note 


When the Short-descriptor translation table format is used for the Non-secure stage 1 translations, this generates 
32-bit IPAs. These are zero-extended to 40 bits to provide the input address for the stage 2 translation. 








Overview of VMSAv8-32 address translation using Long-descriptor translation tables on page G4-4050 
summarizes address translation from AArch32 state when using the Long-descriptor format translation tables. 


The following sections then describe the format of the descriptors in the Long-descriptor format translation tables: 


° VMSAv8-32 Long-descriptor translation table format descriptors on page G4-4050. 
° Memory attribute fields in the VMSAv8-32 Long-descriptor translation table format descriptors on 
page G4-4053. 
° Control of Secure or Non-secure memory access, VMSAv8-32 Long-descriptor format on page G4-4056. 


The following sections then describe this translation table format: 


° Selecting between TTBR0 and TTBR1, VMSAv8-32 Long-descriptor translation table format on 
page G4-4057. 
° VMSAv8-32 Long-descriptor translation table format address lookup levels on page G4-4060. 
° Translation table walks, when using the VMSAv8-32 Long-descriptor translation table format on 
page G4-4063. 
° The algorithm for finding the translation table entries, VMSAv8-32 Long-descriptor format on 
page G4-4066. 

G4.5.1 Overview of VMSAv8-32 address translation using Long-descriptor translation tables 


Figure G4-8 gives a general view of VMSAv8-32 stage 1 address translation when using the Long-descriptor 
translation table format. 
































































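[Figure G4-8: TTBR0 or TTBR1 points to the level 1 table, indexed by VA[31:30]. A level 1 entry is either a Block 
entry for a 1GB memory region or a Table entry that points to a level 2 table, indexed by VA[29:21]. A level 2 entry 
is either a Block entry for a 2MB memory region or a Table entry that points to a level 3 table, indexed by 
VA[20:12], in which each Page entry maps a 4KB memory page.] 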


If a level 1 table would contain only one entry, it is skipped, and the TTBR points to 
the level 2 table. This happens if the VA address range is 30 bits or less. 





Figure G4-8 General view of VMSAv8-32 stage 1 address translation using Long-descriptor format 
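
The index fields that Figure G4-8 names for a stage 1 lookup can be extracted as in the following C sketch. The function names are hypothetical. The byte offset helper assumes eight-byte descriptors, which matches the 64-bit descriptor layouts shown later in this section.

```c
#include <stdint.h>

/* Level 1 table index, VA[31:30]. */
static unsigned ld_l1_index(uint32_t va)
{
    return va >> 30;
}

/* Level 2 table index, VA[29:21]. */
static unsigned ld_l2_index(uint32_t va)
{
    return (va >> 21) & 0x1FFu;
}

/* Level 3 table index, VA[20:12]. */
static unsigned ld_l3_index(uint32_t va)
{
    return (va >> 12) & 0x1FFu;
}

/* Byte offset of an entry within its table (assumed: 8-byte descriptors). */
static unsigned ld_entry_offset(unsigned index)
{
    return index * 8u;
}
```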


Figure G4-9 gives a general view of VMSAv8-32 stage 2 address translation. Stage 2 translation always uses the 
Long-descriptor translation table format. 

[Figure G4-9: the stage 2 lookup starts with up to two concatenated level 1 tables, so that IPA[39] indexes the 
table. Block and Table entries lead to the level 2 table, indexed by IPA[29:21], and then to the level 3 table, 
indexed by IPA[20:12], in which each Page entry maps a 4KB memory page.] 

If a level 1 table would contain 16 entries or fewer, level 1 lookup can be omitted. If so, VTTBR 
points to the start of a block of concatenated level 2 tables. See text for more information. 


Figure G4-9 General view of VMSAv8-32 stage 2 address translation, Long-descriptor translation table format 


Use of concatenated translation tables for the initial stage 2 lookup on page G4-4060 describes how using 
concatenated level 2 tables means lookup can start at level 2, as referred to in Figure G4-9. 


G4.5.2 VMSAv8-32 Long-descriptor translation table format descriptors 


As described in VMSAv8-32 Long-descriptor translation table format address lookup levels on page G4-4060, the 
Long-descriptor translation table format provides up to three levels of address lookup. A translation table walk starts 
either at level 1 or level 2 of the address lookup. 


In general, a descriptor is one of: 


° An invalid or fault entry. 
° A table entry, that points to the next-level translation table. 
° A block entry, that defines the memory properties for the access. 
° A reserved format. 


Bit[1] of the descriptor indicates the descriptor type, and bit[0] indicates whether the descriptor is valid. 

The following sections describe the Long-descriptor translation table descriptor formats: 
° VMSAv8-32 Long-descriptor level 1 and level 2 descriptor formats. 
° VMSAv8-32 Long-descriptor translation table level 3 descriptor formats on page G4-4052. 


Information returned by a translation table lookup on page G4-4036 describes the classification of the non-address 
fields in the descriptors between address map control, access controls, and region attributes. 


VMSAv8-32 Long-descriptor level 1 and level 2 descriptor formats 


In the Long-descriptor translation tables, the formats of the level 1 and level 2 descriptors differ only in the size of 
the block of memory addressed by the block descriptor. A block entry: 


° In a level 1 table describes the mapping of the associated 1GB input address range. 


° In a level 2 table describes the mapping of the associated 2MB input address range. 


Figure G4-10 shows the Long-descriptor level 1 and level 2 descriptor formats: 


[Figure G4-10: bit-field layout of the level 1 and level 2 descriptors] 

Invalid   bits[63:1] IGNORED; bit[0] 0. 
Block     bits[63:52] Upper block attributes; bits[51:40] SBZ†; bits[39:n] Output address[39:n]; 
          bits[n-1:12] RES0; bits[11:2] Lower block attributes; bit[1] 0; bit[0] 1. 
          For the level 1 descriptor, n is 30. For the level 2 descriptor, n is 21. 
Table     bit[63] NSTable; bits[62:61] APTable; bit[60] XNTable; bit[59] PXNTable; bits[58:52] IGNORED; 
          bits[51:40] SBZ†; bits[39:12] Next-level table address[39:12]; bits[11:2] IGNORED; bit[1] 1; bit[0] 1. 
          NSTable, APTable, XNTable, and PXNTable are stage 1 only, and are SBZ at stage 2. 
          The level 1 descriptor returns the address of the level 2 table. 
          The level 2 descriptor returns the address of the level 3 table. 

† See the descriptions of the address fields for more information about bits[47:40] of the Block and Table descriptors. 

Figure G4-10 VMSAv8-32 Long-descriptor level 1 and level 2 descriptor formats 


Descriptor encodings, Long-descriptor level 1 and level 2 formats 


Descriptor bit[0] identifies whether the descriptor is valid, and is 1 for a valid descriptor. If a lookup returns an 
invalid descriptor, the associated input address is unmapped, and any attempt to access it generates a Translation 
fault. 


Descriptor bit[1] identifies the descriptor type, and is encoded as: 


0, Block The descriptor gives the base address of a block of memory, and the attributes for that memory 
region. 
1, Table The descriptor gives the address of the next level of translation table, and for a stage 1 translation, 
some attributes for that translation. 
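
The bit[0] and bit[1] encodings for level 1 and level 2 descriptors can be sketched in C as follows. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names for the level 1 and level 2 descriptor types. */
typedef enum { LD_INVALID, LD_BLOCK, LD_TABLE } ld_l12_type_t;

/* Classify a level 1 or level 2 Long-descriptor entry. */
static ld_l12_type_t ld_l12_type(uint64_t desc)
{
    if ((desc & 0x1u) == 0)
        return LD_INVALID;                       /* bit[0] == 0: Translation fault */
    return (desc & 0x2u) ? LD_TABLE : LD_BLOCK;  /* bit[1]: 1 Table, 0 Block */
}
```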







The other fields in the valid descriptors are: 


Block descriptor 
Gives the base address and attributes of a block of memory: 


° For a level 1 Block descriptor, bits[39:30] are bits[39:30] of the output address that specifies 
a 1GB block of memory. 


° For a level 2 Block descriptor, bits[39:21] are bits[39:21] of the output address that specifies 
a 2MB block of memory. 


In both cases, if bits[47:40] of the descriptor are not zero then a translation that uses the descriptor 
will generate an Address size fault, see Address size fault on page G4-4119. 


Bits[63:52, 11:2] provide attributes for the target memory block, see Memory attribute fields in the 
VMSAv8-32 Long-descriptor translation table format descriptors on page G4-4053. The position 
and contents of these bits are identical in the level 2 block descriptor and in the level 3 page descriptor. 


Table descriptor 


Bits[39:m] are bits[39:m] of the address of the required next-level table. Bits[m-1:0] of the table 
address are zero. In Figure G4-10 the field is Next-level table address[39:12], so m is 12: 


° For a level 1 Table descriptor, this is the address of a level 2 table. 
° For a level 2 Table descriptor, this is the address of a level 3 table. 


In both cases, if bits[47:40] of the descriptor are not zero then a translation that uses the descriptor 
will generate an Address size fault, see Address size fault on page G4-4119. 


For a stage 1 translation only, bits[63:59] provide attributes for the next-level lookup, see Memory 
attribute fields in the VMSAv8-32 Long-descriptor translation table format descriptors on 
page G4-4053. 


If the translation table defines the Non-secure PL1&0 stage 1 translations, then the output address in the descriptor 
is the IPA of the target block or table. Otherwise, it is the PA of the target block or table. 
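
The address fields of the Block and Table descriptors, and the bits[47:40] check, can be sketched in C as follows. The function names are hypothetical, and the Table descriptor field is taken as bits[39:12], as Figure G4-10 labels it. The sketch does not check that the descriptor is valid or of the expected type.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bits[47:40] are nonzero, in which case a translation that
 * uses the descriptor generates an Address size fault. */
static bool ld_address_size_fault(uint64_t desc)
{
    return ((desc >> 40) & 0xFFu) != 0;
}

/* Output address of a Block descriptor: bits[39:n], where n is 30 for
 * level 1 and 21 for level 2. Any level other than 1 is treated as 2. */
static uint64_t ld_block_base(uint64_t desc, int level)
{
    unsigned n = (level == 1) ? 30u : 21u;
    uint64_t low40 = (1ULL << 40) - 1;
    return desc & low40 & ~((1ULL << n) - 1);
}

/* Next-level table address of a Table descriptor: bits[39:12]. */
static uint64_t ld_next_table(uint64_t desc)
{
    return desc & 0x000000FFFFFFF000ULL;
}
```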


VMSAv8-32 Long-descriptor translation table level 3 descriptor formats 
Each entry in a level 3 table describes the mapping of the associated 4KB input address range. 


Figure G4-11 shows the Long-descriptor level 3 descriptor formats. 


[Figure G4-11: bit-field layout of the level 3 descriptors] 

Invalid             bits[63:1] IGNORED; bit[0] 0. 
Reserved, invalid   bit[1] 0; bit[0] 1. 
Page                bits[63:52] Upper page attributes; bits[51:40] SBZ†; bits[39:12] Output address[39:12]; 
                    bits[11:2] Lower page attributes; bit[1] 1; bit[0] 1. 

† See the description of the address field for more information about bits[47:40] of the Page descriptor. 


Figure G4-11 VMSAv8-32 Long-descriptor level 3 descriptor formats 


Descriptor bit[0] identifies whether the descriptor is valid, and is 1 for a valid descriptor. If a lookup returns an 
invalid descriptor, the associated input address is unmapped, and any attempt to access it generates a Translation 
fault. 

Descriptor bit[1] identifies the descriptor type, and is encoded as: 


0, Reserved, invalid 
Behaves identically to encodings with bit[0] set to 0. 
This encoding must not be used in level 3 translation tables. 
1, Page Gives the address and attributes of a 4KB page of memory. 

At this level, the only valid format is the Page descriptor. The other fields in the Page descriptor are: 


Page descriptor 
Bits[39:12] are bits[39:12] of the output address for a page of memory. 


If bits[47:40] of the descriptor are not zero then a translation that uses the descriptor will generate 
an Address size fault, see Address size fault on page G4-4119. 


Bits[63:52, 11:2] provide attributes for the target memory page, see Memory attribute fields in the 
VMSAv8-32 Long-descriptor translation table format descriptors. The position and contents of 
these bits are identical in the level 1 block descriptor and in the level 2 block descriptor. 


If the translation table defines the Non-secure PL1&0 stage 1 translations, then the output address in the descriptor 
is the IPA of the target page. Otherwise, it is the PA of the target page. 
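
A level 3 descriptor can be checked and decoded as in the following C sketch. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True only for the Page format, bits[1:0] == 0b11. The encoding with
 * bit[1] == 0 is reserved and behaves as an invalid descriptor. */
static bool ld_l3_is_page(uint64_t desc)
{
    return (desc & 0x3u) == 0x3u;
}

/* Output address of a Page descriptor: bits[39:12]. */
static uint64_t ld_page_base(uint64_t desc)
{
    return desc & 0x000000FFFFFFF000ULL;
}
```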





G4.5.3 Memory attribute fields in the VMSAv8-32 Long-descriptor translation table format 
descriptors 

The memory attributes in the VMSAv8-32 Long-descriptor translation tables are based on those in the 
Short-descriptor translation table format, with some extensions. Memory region attributes on page G4-4077 
describes these attributes. In the Long-descriptor translation table format: 
° Table entries for stage 1 translations define attributes for the next level of lookup, see Next-level attributes in 
VMSAv8-32 Long-descriptor stage 1 Table descriptors. 
° Block and Page entries define memory attributes for the target block or page of memory. Stage 1 and stage 2 
translations have some differences in these attributes, see: 
— Attribute fields in VMSAv8-32 Long-descriptor stage 1 Block and Page descriptors on page G4-4054. 
— Attribute fields in VMSAv8-32 Long-descriptor stage 2 Block and Page descriptors on page G4-4055. 

Next-level attributes in VMSAv8-32 Long-descriptor stage 1 Table descriptors 

In a Table descriptor for a stage 1 translation, bits[63:59] of the descriptor define the following attributes for the 
next-level translation table access: 

NSTable, bit[63] For memory accesses from Secure state, specifies the Security state for subsequent levels of 
lookup, see Hierarchical control of Secure or Non-secure memory accesses, 
Long-descriptor format on page G4-4056. 
For memory accesses from Non-secure state, this bit is ignored. 
APTable, bits[62:61] Access permissions limit for subsequent levels of lookup, see Hierarchical control of access 
permissions, Long-descriptor format on page G4-4069. 
APTable[0] is reserved, SBZ, in the Non-secure PL2 stage 1 translation tables. 
XNTable, bit[60] XN limit for subsequent levels of lookup, see Hierarchical control of instruction fetching, 
Long-descriptor format on page G4-4072. 
PXNTable, bit[59] PXN limit for subsequent levels of lookup, see Hierarchical control of instruction fetching, 
Long-descriptor format on page G4-4072. 
This bit is RES0 in the Non-secure PL2 stage 1 translation tables. 
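
The next-level attribute fields of a stage 1 Table descriptor can be extracted as in the following C sketch. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the bits[63:59] attributes. */
typedef struct {
    unsigned ns_table;    /* NSTable, bit[63] */
    unsigned ap_table;    /* APTable, bits[62:61] */
    unsigned xn_table;    /* XNTable, bit[60] */
    unsigned pxn_table;   /* PXNTable, bit[59] */
} ld_table_attrs_t;

/* Extract the next-level attributes from a stage 1 Table descriptor. */
static ld_table_attrs_t ld_table_attrs(uint64_t desc)
{
    ld_table_attrs_t a;
    a.ns_table  = (unsigned)((desc >> 63) & 0x1u);
    a.ap_table  = (unsigned)((desc >> 61) & 0x3u);
    a.xn_table  = (unsigned)((desc >> 60) & 0x1u);
    a.pxn_table = (unsigned)((desc >> 59) & 0x1u);
    return a;
}
```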

Attribute fields in VMSAv8-32 Long-descriptor stage 1 Block and Page descriptors 


Block and Page descriptors split the memory attributes into an upper block and a lower block. Figure G4-12 shows 
the memory attribute fields in these blocks, for a stage 1 translation: 

[Figure G4-12: bit-field layout of the stage 1 attribute fields] 

Upper attributes   bits[63:59] IGNORED; bits[58:55] Reserved for software use; bit[54] XN; bit[53] PXN; 
                   bit[52] Contiguous. 
Lower attributes   bit[11] nG; bit[10] AF; bits[9:8] SH[1:0]; bits[7:6] AP[2:1]; bit[5] NS; 
                   bits[4:2] AttrIndx[2:0]. 








Figure G4-12 VMSAv8-32 memory attribute fields in Long-descriptor stage 1 Block and Page descriptors 


For a stage 1 descriptor, the attributes are: 


XN, bit[54] 
The Execute-never bit. Determines whether the region is executable, see Execute-never restrictions 
on instruction fetching on page G4-4071. 

PXN, bit[53] 
The Privileged execute-never bit. Determines whether the region is executable at EL1, see 
Execute-never restrictions on instruction fetching on page G4-4071. 
This bit is RES0 in the Non-secure PL2 stage 1 translation tables. 

Contiguous, bit[52] 
Indicates that 16 adjacent translation table entries point to contiguous memory regions, see 
Contiguous bit on page G4-4084. 

nG, bit[11] 
The not global bit. Determines how the translation is marked in the TLB, see Global and 
process-specific translation table entries on page G4-4089. 
This bit is RES0 in the Non-secure PL2 stage 1 translation tables. 

AF, bit[10] 
The Access flag, see The Access flag on page G4-4074. 

SH, bits[9:8] 
Shareability field, see Memory region attributes on page G4-4077. 

AP[2:1], bits[7:6] 
Access Permissions bits, see Memory access control on page G4-4068. 

Note 

For consistency with the Short-descriptor translation table formats, the Long-descriptor format 
defines AP[2:1] as the Access Permissions bits, and does not define an AP[0] bit. 

AP[1] is RES1 in the Non-secure PL2 stage 1 translation tables. 

NS, bit[5] 
Non-secure bit. For memory accesses from Secure state, specifies whether the output address is in 
Secure or Non-secure memory, see Control of Secure or Non-secure memory access, VMSAv8-32 
Long-descriptor format on page G4-4056. 
For memory accesses from Non-secure state, this bit is RES0 and is ignored by the PE. 

AttrIndx[2:0], bits[4:2] 
Stage 1 memory attributes index field, for the indicated Memory Attribute Indirection Register, see 
VMSAv8-32 Long-descriptor format memory region attributes on page G4-4083. 


The definition of IGNORED means the architecture guarantees that the PE makes no use of the field, see IGNORED 
on page Glossary-5718. For more information about these fields see Other fields in the Long-descriptor translation 
table format descriptors on page G4-4084. 
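
The stage 1 Block and Page attribute fields listed above can be extracted as in the following C sketch. The structure and function names are hypothetical, and the sketch does not model the fields that are RES0 or RES1 in the Non-secure PL2 stage 1 translation tables.

```c
#include <stdint.h>

/* Hypothetical container for the stage 1 Block and Page attributes. */
typedef struct {
    unsigned xn;          /* bit[54] */
    unsigned pxn;         /* bit[53] */
    unsigned contiguous;  /* bit[52] */
    unsigned ng;          /* bit[11] */
    unsigned af;          /* bit[10] */
    unsigned sh;          /* bits[9:8] */
    unsigned ap21;        /* AP[2:1], bits[7:6] */
    unsigned ns;          /* bit[5] */
    unsigned attr_indx;   /* AttrIndx[2:0], bits[4:2] */
} s1_attrs_t;

/* Extract the attribute fields from a stage 1 Block or Page descriptor. */
static s1_attrs_t s1_attrs(uint64_t desc)
{
    s1_attrs_t a;
    a.xn         = (unsigned)((desc >> 54) & 0x1u);
    a.pxn        = (unsigned)((desc >> 53) & 0x1u);
    a.contiguous = (unsigned)((desc >> 52) & 0x1u);
    a.ng         = (unsigned)((desc >> 11) & 0x1u);
    a.af         = (unsigned)((desc >> 10) & 0x1u);
    a.sh         = (unsigned)((desc >> 8) & 0x3u);
    a.ap21       = (unsigned)((desc >> 6) & 0x3u);
    a.ns         = (unsigned)((desc >> 5) & 0x1u);
    a.attr_indx  = (unsigned)((desc >> 2) & 0x7u);
    return a;
}
```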

Attribute fields in VMSAv8-32 Long-descriptor stage 2 Block and Page descriptors 


Block and Page descriptors split the memory attributes into an upper block and a lower block. Figure G4-13 shows 
the memory attribute fields in these blocks, for a stage 2 translation: 

[Figure G4-13: bit-field layout of the stage 2 attribute fields] 

Upper attributes   IGNORED bits, some of which are Reserved for use by System MMU; bits[58:55] Reserved for 
                   software use; bit[54] XN; bit[52] Contiguous. 
Lower attributes   bit[10] AF; bits[9:8] SH[1:0]; bits[7:6] S2AP[1:0]; bits[5:2] MemAttr[3:0]. 


Figure G4-13 VMSAv8-32 memory attribute fields in Long-descriptor stage 2 Block and Page descriptors 


For a stage 2 descriptor, the attributes are: 


XN, bit[54] The Execute-never bit. Determines whether the region is executable, see Execute-never restrictions 
on instruction fetching on page G4-4071. 

Contiguous, bit[52] 
Indicates that 16 adjacent translation table entries point to contiguous memory regions, see 
Contiguous bit on page G4-4084. 

AF, bit[10] The Access flag, see The Access flag on page G4-4074. 
SH, bits[9:8] Shareability field, see EL2 control of Non-secure memory region attributes on page G4-4085. 

S2AP, bits[7:6] 
Stage 2 Access Permissions bits, see Hyp mode control of Non-secure access permissions on 
page G4-4075. 

Note 

In the original VMSAv7-32 Long-descriptor attribute definition, this field was called HAP[2:1], for 
consistency with the AP[2:1] field in the stage 1 descriptors and despite there being no HAP[0] bit. 
ARMv8 renames the field for greater clarity. 

MemAttr, bits[5:2] 
Stage 2 memory attributes, see EL2 control of Non-secure memory region attributes on 
page G4-4085. 


The definition of IGNORED means the architecture guarantees that the PE makes no use of the field, see IGNORED 
on page Glossary-5718. For more information about these fields see Other fields in the Long-descriptor translation 
table format descriptors on page G4-4084. 
G4.5.4 Control of Secure or Non-secure memory access, VMSAv8-32 Long-descriptor format 


Access to the Secure or Non-secure physical address map on page G4-4038 describes how the NS bit in the 
translation table entries: 
° For accesses from Secure state, determines whether the access is to Secure or Non-secure memory. 


° Is ignored by accesses from Non-secure state. 
In the Long-descriptor format: 
° The NS bit relates only to the memory block or page at the output address defined by the descriptor. 


° The descriptors also include an NSTable bit, see Hierarchical control of Secure or Non-secure memory 
accesses, Long-descriptor format. 


The NS and NSTable bits are valid only for memory accesses from Secure state. Memory accesses from Non-secure 
state ignore the values of these bits. 


Hierarchical control of Secure or Non-secure memory accesses, Long-descriptor 
format 


For Long-descriptor format table descriptors for stage 1 translations, the descriptor includes an NSTable bit, that 
indicates whether the table identified in the descriptor is in Secure or Non-secure memory. For accesses from Secure 
state, the meaning of the NSTable bit is: 


NSTable == 0 The defined table address is in the Secure physical address space. In the descriptors in that 
translation table, NS bits and NSTable bits have their defined meanings. 


NSTable == 1 The defined table address is in the Non-secure physical address space. Because this table is fetched 
from the Non-secure address space, the NS and NSTable bits in the descriptors in this table must be 
ignored. This means that, for this table: 


° The value of the NS bit in any block or page descriptor is ignored. The block or page address 
refers to Non-secure memory. 


° The value of the NSTable bit in any table descriptor is ignored, and the table address refers 
to Non-secure memory. When this table is accessed, the NS bit in any block or page 
descriptor is ignored, and all descriptors in the table refer to Non-secure memory. 


In addition, an entry fetched in Secure state is treated as non-global if it is read from Non-secure memory. That is, 
these entries must be treated as if nG==1, regardless of the value of the nG bit. For more information about the nG 
bit, see Global and process-specific translation table entries on page G4-4089. 


The effect of NSTable applies to later entries in the translation table walk, and so its effects can be held in one or 
more TLB entries. Therefore, a change to NSTable requires coarse-grained invalidation of the TLB to ensure that 
the effect of the change is visible to subsequent memory transactions. 





Note 
° When using the Long-descriptor format, table descriptors are defined only for level 1 and level 2 of 
lookup. 
° Stage 2 translations are performed only for operations in Non-secure state, that can access only the 
Non-secure address space. Therefore, the stage 2 descriptors do not include NS or NSTable bits. 
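
The hierarchical NSTable rule can be illustrated with a small model. This is only a sketch of the behavior described above for a Secure-state stage 1 walk. The function name, its arguments, and the assumption that the initial translation table is in Secure memory are all hypothetical and are not taken from the architecture.

```python
def walk_security(nstable_bits, leaf_ns, leaf_ng):
    """Model the NSTable rule for one Secure-state stage 1 table walk.

    nstable_bits: NSTable values of the table descriptors visited, in walk order.
    leaf_ns, leaf_ng: NS and nG bits of the final Block or Page descriptor.
    Returns (output_is_non_secure, effective_nG).
    Assumption: the initial translation table is fetched from Secure memory.
    """
    fetched_from_non_secure = False
    for nstable in nstable_bits:
        # NSTable has its defined meaning only while the walk is still in
        # Secure memory. Once a table is fetched from Non-secure memory,
        # NSTable in its descriptors is ignored and later tables are
        # Non-secure as well.
        if not fetched_from_non_secure and nstable == 1:
            fetched_from_non_secure = True
    if fetched_from_non_secure:
        # NS is ignored, and the entry is treated as if nG == 1.
        return True, 1
    return leaf_ns == 1, leaf_ng


# NSTable == 1 at level 1 overrides NS == 0 and nG == 0 in the leaf descriptor.
print(walk_security([1, 0], leaf_ns=0, leaf_ng=0))
```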
G4.5.5 Selecting between TTBR0 and TTBR1, VMSAv8-32 Long-descriptor translation table format 


As described in Determining the translation table base address in the VMSAv8-32 translation regimes on 
page G4-4037, two sets of translation tables can be defined for each of the PL1&0 stage 1 translations, and TTBR0 
and TTBR1 hold the base addresses for the two sets of tables. The Long-descriptor translation table format provides 
more flexibility in defining the boundary between using TTBR0 and using TTBR1. When a PL1&0 stage 1 address 
translation is enabled, TTBR0 is always used. If TTBR1 is also used then: 


° TTBR1 is used for the top part of the input address range. 
° TTBR0 is used for the bottom part of the input address range. 


The TTBCR.T0SZ and TTBCR.T1SZ size fields control the use of TTBR0 and TTBR1, as Table G4-2 shows. 


Table G4-2 Use of TTBR0 and TTBR1, Long-descriptor format 

TTBCR             Input address range using: 
T0SZ    T1SZ      TTBR0                             TTBR1 
0b000   0b000     All addresses                     Not used 
M(a)    0b000     Zero to (2^(32-M) - 1)            2^(32-M) to maximum input address 
0b000   N(a)      Zero to (2^32 - 2^(32-N) - 1)     (2^32 - 2^(32-N)) to maximum input address 
M(a)    N(a)      Zero to (2^(32-M) - 1)            (2^32 - 2^(32-N)) to maximum input address 

a. M, N must be greater than 0. The maximum possible value for each of T0SZ and T1SZ is 7. 
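
The ranges in Table G4-2 can be expressed as a short selection function. The following is a minimal sketch, not an architectural definition. The name select_ttbr and its string return values are hypothetical. It also ignores the effect of the Contiguous bit on the faulting region.

```python
def select_ttbr(va, t0sz, t1sz):
    """Return which TTBR translates a 32-bit VA, following Table G4-2.

    Returns "TTBR0", "TTBR1", or "Translation fault" when the VA falls in
    the gap between two non-contiguous regions.
    """
    if not (0 <= va < 2**32 and 0 <= t0sz <= 7 and 0 <= t1sz <= 7):
        raise ValueError("va must be a 32-bit VA, and TxSZ must be 0 to 7")

    if t0sz == 0 and t1sz == 0:
        return "TTBR0"                      # TTBR1 is not used

    if t1sz == 0:
        # TTBR0 covers zero to 2^(32-M) - 1, TTBR1 covers the rest.
        return "TTBR0" if va < 2**(32 - t0sz) else "TTBR1"

    ttbr1_base = 2**32 - 2**(32 - t1sz)     # lowest VA translated by TTBR1
    if t0sz == 0:
        return "TTBR0" if va < ttbr1_base else "TTBR1"

    # Both fields nonzero: the two regions might not be contiguous.
    if va < 2**(32 - t0sz):
        return "TTBR0"
    if va >= ttbr1_base:
        return "TTBR1"
    return "Translation fault"


print(select_ttbr(0x40000000, t0sz=2, t1sz=1))
```

With T0SZ and T1SZ both set to 1 the two regions meet at 0x80000000, so the final branch is never reached for that programming.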


For stage 1 translations, the input address is always a VA, and the maximum possible VA is (2^32 - 1). 
When address translation is using the Long-descriptor translation table format: 


° Figure G4-14 shows how, when TTBCR.T1SZ is zero, the value of TTBCR.T0SZ controls the boundary 
between VAs that are translated using TTBR0, and VAs that are translated using TTBR1. 

[Figure: VA range 0x00000000 to 0xFFFFFFFF with TTBCR.T1SZ==0b000. When TTBCR.T0SZ==0b000, use of TTBR1 is 
disabled and the whole range is the TTBR0 region. Otherwise, the TTBR0 region lies below the boundary and the 
TTBR1 region lies above it. The boundary is at 0x80000000 when TTBCR.T0SZ==0b001, and increasing TTBCR.T0SZ 
lowers it, to 0x02000000 when TTBCR.T0SZ==0b111.] 


Figure G4-14 Control of TTBR boundary, when TTBCR.T1SZ is zero 
° Figure G4-15 shows how, when TTBCR.T1SZ is nonzero, the values of TTBCR.T0SZ and TTBCR.T1SZ 
control the boundaries between VAs that are translated using TTBR0, and VAs that are translated using 
TTBR1. 

[Figure: VA range 0x00000000 to 0xFFFFFFFF with TTBCR.T1SZ nonzero, shown for TTBCR.T0SZ==0b000 and for 
TTBCR.T0SZ>0b000. The TTBR1 region is at the top of the range, with its lower boundary at 0x80000000 when 
TTBCR.T1SZ==0b001. The TTBR0 region is at the bottom of the range, with its upper boundary at 0x40000000 when 
TTBCR.T0SZ==0b010. Arrows show the effect of increasing TTBCR.T1SZ and of increasing or decreasing TTBCR.T0SZ. 
An access to an address between the two regions generates a Translation fault.] 








Figure G4-15 Control of TTBR boundaries, when TTBCR.T1SZ is nonzero 
When T0SZ and T1SZ are both nonzero: 


— If both fields are set to 0b001, the boundary between the two regions is 0x80000000. This is identical to 
having T0SZ set to 0b000 and T1SZ set to 0b001. 


— Otherwise, the TTBR0 and TTBR1 regions are non-contiguous. In this case, any attempt to access an 
address that is in the gap between the TTBR0 and TTBR1 regions generates a Translation fault. 


Note 


The handling of the Contiguous bit can mean that the boundary between the translation regions defined 
by the TCR_EL1.TnSZ values and the region for which an access generates a Translation fault is wider 
than shown in Figure G4-15. That is, if the descriptor for an access to the region shown as generating 
a fault has the Contiguous bit set to 1, the access might not generate a fault. Possible translation table 
registers programming errors describes this possibility. 








When using the Long-descriptor translation table format: 


° The TTBCR contains fields that define memory region attributes for the translation table walk, for each 
TTBR. These are the SH0, ORGN0, IRGN0, SH1, ORGN1, and IRGN1 bits. 


° TTBR0 and TTBR1 each contain an ASID field, and the TTBCR.A1 field selects which ASID to use. 


For this translation table format, VMSAv8-32 Long-descriptor translation table format address lookup levels on 
page G4-4060 summarizes the lookup levels, and Translation table walks, when using the VMSAv8-32 
Long-descriptor translation table format on page G4-4063 describes the possible translations. 


Possible translation table registers programming errors 


In all the descriptions in this subsection, the size of the input address supported for a PL1&0 stage 1 translation 
refers to the size specified by a TTBCR.TxSZ field. 


Note 


For a PL1&0 stage 1 translation, this section has described how the input address range can be split so that the lower 
addresses are translated by TTBR0 and the higher addresses are translated by TTBR1. In this case, each of the input 
address sizes specified by TTBCR.{T0SZ, T1SZ} is smaller than the total address size supported by the stage of 
translation. 
The following are possible errors in the programming of TTBR0, TTBR1, and TTBCR. For the translation of a 
particular address at a particular stage of translation, either: 


° The block size being used to translate the address is larger than the size of the input address supported at a 
stage of translation used in performing the required translation. This can occur only for the PL1&0 stage 1 
translations, and only when either TTBCR.T0SZ or TTBCR.T1SZ is zero, meaning there is no gap between 
the address range translated by TTBR0 and the range translated by TTBR1. In this case, this programming 
error occurs if a block translated from the region that has TxSZ set to zero straddles the boundary between 
the two address ranges. Example G4-2 shows an example of this mis-programming. 


° The address range translated by a set of blocks marked as contiguous, by use of the Contiguous bit, is larger 
than the size of the input address supported at a stage of translation used in performing the required 
translation. 


Example G4-2 Translation table programming error 


If TTBCR.T0SZ is programmed to 0 and TTBCR.T1SZ is programmed to 7, this means: 
° TTBR0 translates addresses in the range 0x00000000-0xFDFFFFFF. 
° TTBR1 translates addresses in the range 0xFE000000-0xFFFFFFFF. 


The translation table indicated by TTBR0 might be programmed with a block entry for a 1GB region starting at 
0xC0000000. This covers the address range 0xC0000000-0xFFFFFFFF, that overlaps the TTBR1 address range. This 
means this block size is larger than the input address size supported for translations using TTBR0, and therefore this 
is a programming error. 


To understand why this must be a programming error, consider a memory access to address 0xFFFF0000. According 
to the TTBCR.{T0SZ, T1SZ} values, this must be translated using TTBR1. However, the access matches a TLB 
entry for the translation, using TTBR0, of the block at 0xC0000000. Hardware is not required to detect that the access 
to 0xFFFF0000 is being translated incorrectly. 
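
Software that builds translation tables could check for this kind of mis-programming before installing a block entry. The following sketch covers only the case in Example G4-2, where TTBCR.T0SZ is zero and TTBCR.T1SZ is nonzero. The function name and arguments are hypothetical.

```python
def ttbr0_block_straddles_ttbr1(block_base, block_size, t1sz):
    """Check a TTBR0 block entry when TTBCR.T0SZ == 0 and TTBCR.T1SZ != 0.

    Returns True if a block that starts in the TTBR0 range extends into
    the address range translated by TTBR1, which is a programming error.
    """
    if not 1 <= t1sz <= 7:
        raise ValueError("this check assumes a nonzero T1SZ in the range 1 to 7")
    ttbr1_base = 2**32 - 2**(32 - t1sz)      # lowest VA translated by TTBR1
    block_end = block_base + block_size - 1  # highest VA covered by the block
    return block_base < ttbr1_base <= block_end


GB = 2**30
# The 1GB block at 0xC0000000 from Example G4-2, with T1SZ == 7.
print(ttbr0_block_straddles_ttbr1(0xC0000000, GB, t1sz=7))
```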


In these cases, an implementation might use one of the following approaches: 


° Treat such a block, that might be a block within a contiguous set of blocks, as causing a Translation fault, 
even though the block is valid, and the address accessed within that block is within the size of the input 
address supported at a stage of translation. 


° Treat such a block, that might be a block within a contiguous set of blocks, as not causing a Translation fault, 
even though the address accessed within that block is outside the size of the input address supported at a stage 
of translation, provided that both of the following apply: 


— The block is valid. 
— At least one address within the block, or contiguous set of blocks, is within the size of the input address 
supported at a stage of translation. 


Additional constraints apply to programming the VTCR, see Determining the required initial lookup level for stage 
2 translations on page G4-4065. 
G4.5.6 VMSAv8-32 Long-descriptor translation table format address lookup levels 


As stated at the start of this section, because the Long-descriptor translation table format is used for the Non-secure 
PL1&0 stage 2 translations, the format must support input addresses of up to 40 bits. 


Table G4-3 summarizes the properties of the different levels of address lookup when using this format. 


Table G4-3 Properties of the three levels of address lookup with VMSAv8-32 Long-descriptor translation tables 














          Input address                          Output address(a) 
Level     Size           Address range(b)        Size    Address range     Number of entries 
First     Up to 512GB    Up to Address[38:0]     1GB     Address[39:30]    Up to 512 
Second    Up to 1GB      Up to Address[29:0]     2MB     Address[39:21]    Up to 512 
Third     2MB            Address[20:0]           4KB     Address[39:12]    512 

a. Output address when an entry addresses a block of memory or a memory page. If an entry addresses the next level of 
address lookup it specifies Address[39:12] for the next-level translation table. 

b. Input address range for the translation table. See Use of concatenated level 1 translation tables on page G4-4061 for 
details of support for additional bits of address at a given level, including possible support of a 40-bit input address 
range for stage 2 translations at level 1. For stage 1 translations at level 1 the input address range is limited to the VA 
size of [31:0]. 


For level 1 and level 2 tables, reducing the input address range reduces the number of addresses in the table and 
therefore reduces the table size. The appropriate Translation Table Control Register specifies the input address 
range. 


Stage 1 translations require an input address range of up to 32 bits, corresponding to VA[31:0]. For these 
translations: 


° For a memory access from a mode other than Hyp mode, the Secure or Non-secure TTBR0 or TTBR1 holds 
the translation table base address, and the Secure or Non-secure TTBCR is the control register. 


° For a memory access from Hyp mode, HTTBR holds the translation table base address, and HTCR is the 
control register. 





Note 


For translations controlled by TTBR0 and TTBR1, if neither TTBR has an input address range larger than 1GB, 
then translation starts at level 2. Together, TTBR0 and TTBR1 can still cover the 32-bit VA input address range. 





Stage 2 translations require an input address range of up to 40 bits, corresponding to IPA[39:0], and the supported 
input address size is configurable in the range 25-40 bits. Table G4-3 indicates a requirement for the translation 
mechanism to support a 39-bit input address range, Address[38:0]. Use of concatenated translation tables for the 
initial stage 2 lookup describes how a 40-bit IPA address range is supported. For stage 2 translations: 


° VTTBR holds the translation table base address, and VTCR is the control register. 


° If a supplied input address is larger than the configured input address size, a Translation fault is generated. 


Use of concatenated translation tables for the initial stage 2 lookup 


If a stage 2 translation would require 16 entries or fewer in its top-level translation table, that stage of translation 
can, instead, be configured so that: 


° It requires the corresponding number of concatenated translation tables at the next translation level, aligned 
to the size of the block of concatenated translation tables. 


° The stage 2 translation starts at that next translation level. 

Note 


Stage 2 translations always use the Long-descriptor translation table format. 








This use of concatenated translation tables is: 


° Required when the stage 2 translation supports a 40-bit input address range, see Use of concatenated level 1 
translation tables. 


° Supported for a stage 2 translation with an input address range of 31-34 bits, see Use of concatenated level 2 
translation tables. 


The use of concatenated translation tables requires the software that is defining the translation to: 


° Define the concatenated translation tables with the required overall alignment. 

° Program VTTBR to hold the address of the first of the concatenated translation tables. 

° Program VTCR to indicate the required input address range and initial lookup level. 

Note 

The use of concatenated translation tables avoids the overhead of an additional level of translation. 





Use of concatenated level 1 translation tables 


The Long-descriptor format translation tables provide 9 bits of address resolution at each level of lookup. However, 
a 40-bit input address range with a translation granularity of 4KB requires a total of 28 bits of address resolution. 
Therefore, a stage 2 translation that supports a 40-bit input address range requires two concatenated level 1 
translation tables, together aligned to 8KB, where: 


° The table at the address with PA[12:0]==0b0_0000_0000_0000 defines the translations for input addresses with 
bit[39]==0. 

° The table at the address with PA[12:0]==0b1_0000_0000_0000 defines the translations for input addresses with 
bit[39]==1. 

° The 8KB alignment requirement means that both tables have the same value for PA[39:13]. 


Use of concatenated level 2 translation tables 


A stage 2 translation with an input address range of 31-34 bits can start the translation either: 
° With a level 1 lookup, accessing a level 1 translation table with 2-16 entries. 


° With a level 2 lookup, accessing a set of concatenated level 2 translation tables. 


Table G4-4 on page G4-4062 shows these options, for each of the input address ranges that can use this scheme. 


Note 


Because these are stage 2 translations, the input address range is an IPA range. 
Table G4-4 Possible uses of concatenated translation tables for level 2 lookup 

Input address range       Lookup starts at level 1     Lookup starts at level 2 
IPA range    Size         Required level 1 entries     Number of concatenated tables    Required alignment(a) 
IPA[30:0]    2^31 bytes   2                            2                                8KB 
IPA[31:0]    2^32 bytes   4                            4                                16KB 
IPA[32:0]    2^33 bytes   8                            8                                32KB 
IPA[33:0]    2^34 bytes   16                           16                               64KB 

a. Required alignment of the set of concatenated level 2 tables. 
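
The values in Table G4-4 follow from the fact that one level 2 table resolves a 1GB (2^30 byte) input range and occupies 4KB, that is, 512 entries of 8 bytes each. The following sketch reproduces the table under that reading. The function name is hypothetical.

```python
def concatenated_level2_tables(ipa_bits):
    """Return (number_of_tables, alignment_in_bytes) for a stage 2
    translation that starts at level 2 with concatenated tables.

    ipa_bits is the size of the IPA input address range, 31 to 34 bits.
    """
    if not 31 <= ipa_bits <= 34:
        raise ValueError("concatenated level 2 tables apply to 31-34 bit ranges")
    table_bytes = 512 * 8                   # 512 entries of 8 bytes: 4KB
    count = 1 << (ipa_bits - 30)            # one table per 1GB of input range
    return count, count * table_bytes       # aligned to the size of the whole set


for size in range(31, 35):
    tables, alignment = concatenated_level2_tables(size)
    print(size, tables, alignment // 1024, "KB")
```

The number of concatenated tables equals the number of level 1 entries that the same range would need if the lookup started at level 1.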


See also Determining the required initial lookup level for stage 2 translations on page G4-4065. 
G4.5.7 Translation table walks, when using the VMSAv8-32 Long-descriptor translation table format 

Figure G4-2 on page G4-4027 shows the possible address translations. If a stage of translation is controlled from an 
Exception level that is using AArch32, the input and output address constraints and the registers that control the 
translation are as follows: 

Stage 1 translations 
For all stage 1 translations: 
° The input address range is up to 32 bits, as determined by either: 
—  TTBCR.T0SZ or TTBCR.T1SZ, for a PL1&0 stage 1 translation. 
—  HTCR.T0SZ, for a PL2 stage 1 translation. 
° The output address range is 40 bits. 
The stage 1 translations are: 
Non-secure PL1&0 stage 1 translation 
The stage 1 translation for memory accesses from Non-secure modes other than Hyp 
mode. This translates a VA to an IPA. For this translation, when Non-secure EL1 is 
using AArch32: 
° Non-secure TTBR0 or TTBR1 holds the translation table base address. 
° Non-secure TTBCR determines which TTBR is used. 
Non-secure PL2 stage 1 translation 
The stage 1 translation for memory accesses from Hyp mode. This translates a VA to a PA. 
For this translation, when EL2 is using AArch32, HTTBR holds the translation table 
base address. 
Secure PL1&0 stage 1 translation 
The stage 1 translation for memory accesses from Secure modes. This translates a VA to a 
PA. For this translation, when the Secure PL1 modes are using AArch32: 
° Secure TTBR0 or TTBR1 holds the translation table base address. 
° Secure TTBCR determines which TTBR is used. 

Stage 2 translation 
Non-secure PL1&0 stage 2 translation 
The stage 2 translation for memory accesses from Non-secure modes other than Hyp 
mode. This translates an IPA to a PA. For this translation, when EL2 is using AArch32: 
° The input address range is 40 bits, and VTCR.T0SZ determines the input address 
size. 
° The output address range depends on the implemented memory system, and is up 
to 40 bits. 
° VTTBR holds the translation table base address. 
° VTCR specifies the required input address range, and whether the initial lookup 
is at level 1 or at level 2. 

The descriptions of the VMSAv8-32 translation stages state that the maximum output address size is 40 bits. 
However, the register and Long-descriptor format descriptor fields that hold these addresses are 48 bits wide. If 
bits[47:40] of an output address are not all zero then the address generates an Address size fault. 

The Long-descriptor translation table format provides up to three levels of address lookup, as described in 
VMSAv8-32 Long-descriptor translation table format address lookup levels on page G4-4060, and the initial 
lookup, in which the MMU reads the translation table base address, is at either level 1 or level 2. The following 
determines the level of the initial lookup: 
° For a stage 1 translation, the required input address range. For more information see Determining the required 
initial lookup level for stage 1 translations on page G4-4065. 
° For a stage 2 translation, the level specified by the VTCR.SL0 field. For more information see Determining 
the required initial lookup level for stage 2 translations on page G4-4065. 
Note 

For a stage 2 translation, the size of the required input address range constrains the VTCR.SL0 value. 

Figure G4-16 shows how the descriptor address for the initial lookup for a translation using the Long-descriptor 
translation table format is determined from the input address and the TTBR value. This figure shows the lookup for 
a translation that starts with a level 1 lookup, that translates bits[39:30] of the input address, zero extended if 
necessary. 








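[Figure: formation of the descriptor address from the TTBR and the input address. In the 64-bit TTBR, bits[39:n] 
hold the translation table base address, bits[n-1:0] are RES0, and the bits above bit[39] are RES0 or 
register-defined. Bits[n+26:30] of the input address provide bits[n-1:3] of the descriptor address, and 
bits[2:0] of the descriptor address are 0b000.] 

See text for more information about the translation table base register used, and the value of n. 

† This field is absent if n is 13. 

‡ For a Non-secure PL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor. 

§ See the lookup description for more information about bits[47:40] of the TTBR. 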


Figure G4-16 VMSAv8-32 Long-descriptor initial lookup, starting at level 1 


If bits[47:40] of the TTBR are not zero then the initial lookup will generate an Address size fault, see Address size 
fault on page G4-4119. 


For a translation that starts with a level 1 lookup, as shown in Figure G4-16: 


For a stage 1 translation 
n is in the range 4-5 and: 


° For a memory access from Hyp mode: 
—  HTTBR is the TTBR. 
—  n=5-(HTCR.T0SZ). 


° For other accesses: 
—  The Secure or Non-secure instance of TTBR0 or TTBR1 is the TTBR. 
—  n=(5-TTBCR.TxSZ), where x is 0 when using TTBR0, and 1 when using TTBR1. 


For a stage 2 translation 
n is in the range 4-13 and: 
° VTTBR is the TTBR. 
° n=5-(VTCR.T0SZ). 
For a translation that starts with a level 2 lookup, the descriptor address is obtained in the same way, except that 
bits[(n+17):21] of the input address provide bits[(n-1):3] of the descriptor address, where: 

For a stage 1 translation 
n is in the range 7-12. As Determining the required initial lookup level for stage 1 translations 
shows, for a stage 1 translation to start with a level 2 lookup, the corresponding T0SZ or T1SZ field 
must be 2 or more. This means: 


° For a memory access from Hyp mode, n=14-HTCR.T0SZ. 
° For other memory accesses, n=14-(TTBCR.TxSZ), where x is 0 when using TTBR0, and 1 
when using TTBR1. 

For a stage 2 translation 
n is in the range 7-16. For a stage 2 translation to start with a level 2 lookup, VTCR.SL0 is 0b00, and 
n=14-(VTCR.T0SZ). 


The following sections describe how the level of the initial lookup is determined: 
° Determining the required initial lookup level for stage 1 translations. 


° Determining the required initial lookup level for stage 2 translations. 


Address translation examples using the VMSAv8-32 Long descriptor translation table format on page K7-5570 
shows examples of full translation flows, to an entry for a 4KB memory page, for lookups starting at level 1 and 
lookups starting at level 2. 


Determining the required initial lookup level for stage 1 translations 


For a stage 1 translation, the required input address range, indicated by a T0SZ or T1SZ field in a translation table 
control register, determines the initial lookup level. The size of this input address region is 2^(32-TxSZ) bytes, and if 
this size is: 


° Less than or equal to 2^30 bytes, the required start is at level 2, and translation requires two levels of table to 
map to 4KB pages. This corresponds to a TxSZ value of 2 or more. 


° More than 2^30 bytes, the required start is at level 1, and translation requires three levels of table to map to 
4KB pages. This corresponds to a TxSZ value that is less than 2. 
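
The rule above can be written as a short function. This is an illustrative sketch only, and the function name is hypothetical.

```python
def stage1_initial_lookup_level(txsz):
    """Return the initial lookup level (1 or 2) for a stage 1 translation.

    txsz is the value of the relevant T0SZ or T1SZ field, 0 to 7.
    """
    if not 0 <= txsz <= 7:
        raise ValueError("TxSZ is a three-bit unsigned field")
    region_bytes = 2**(32 - txsz)           # size of the input address region
    # A region of 2^30 bytes or less fits in one level 2 table.
    return 2 if region_bytes <= 2**30 else 1


print([stage1_initial_lookup_level(t) for t in range(8)])
```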


For the PL1&0 stage 1 translations, the TTBCR: 


° Splits the 32-bit VA input address range between TTBR0 and TTBR1, see Selecting between TTBR0 and 
TTBR1, VMSAv8-32 Long-descriptor translation table format on page G4-4057. 


° Holds the input address range sizes for TTBR0 and TTBR1, in the TTBCR.T0SZ and TTBCR.T1SZ fields. 


For the PL2 stage 1 translations, HTCR.T0SZ indicates the size of the required input address range. For example, 
if this field is 0b000, it indicates a 32-bit VA input address range, and translation lookup must start at level 1. 


Determining the required initial lookup level for stage 2 translations 


For a PL1&0 stage 2 translation, the output address range from the PL1&0 stage 1 translations determines the 
required input address range for the stage 2 translation. 


VTCR.SL0 indicates the starting level for the lookup. The permitted SL0 values are: 
0b00 Stage 2 translation lookup must start at level 2. 


0b01 Stage 2 translation lookup must start at level 1. 


In addition, VTCR.T0SZ must indicate the required input address range. The size of the input address region is 
2^(32-T0SZ) bytes. 
G4.5.8 


Note 


VTCR.T0SZ holds a four-bit signed integer value, meaning it supports values from -8 to 7. This is different from 
the other translation control registers, where TnSZ holds a three-bit unsigned integer, supporting values from 0 to 7. 








The programming of VTCR must follow the constraints shown in Table G4-5, otherwise any attempt to perform a 
translation table walk that uses the stage 2 address translation generates a stage 2 level 1 Translation fault. The table 
also shows how the VTCR.SL0 and VTCR.T0SZ values determine the VTTBR.BADDR field width. 


Note 


If VTCR.SL0 is programmed to a reserved value then the constraints shown in Table G4-5 are not met, and a 
translation table walk that uses stage 2 translation generates a stage 2 level 1 Translation fault. 








Table G4-5 Input address range constraints on programming VTCR 





VTCR.SL0    VTCR.T0SZ    Input address range, R       Initial lookup level    BADDR[39:x] width(a) 
0b00        2 to 7       R ≤ 2^30 bytes               Level 2                 [39:12] to [39:7] 
0b00        -2 to 1      2^30 < R ≤ 2^34 bytes        Level 2                 [39:16] to [39:13] 
0b01        -2 to 1      2^30 < R ≤ 2^34 bytes        Level 1                 [39:7] to [39:4] 
0b01        -8 to -3     2^34 < R                     Level 1                 [39:13] to [39:8] 

a. The first range corresponds to the first T0SZ value, the second range to the second T0SZ value. 
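
The constraints in Table G4-5 can be checked in software before VTCR is written. The following is a minimal sketch. The function names are hypothetical, the four-bit T0SZ field is assumed to use two's complement encoding, and the additional VTCR.S requirement is not modeled.

```python
def sign_extend_4bit(field):
    """Interpret a four-bit field as a signed value in the range -8 to 7."""
    field &= 0xF
    return field - 16 if field & 0x8 else field


def vtcr_stage2_config(sl0, t0sz_field):
    """Validate VTCR.SL0 and VTCR.T0SZ against Table G4-5.

    Returns (initial_level, input_address_bits, x), where BADDR occupies
    bits[39:x] of VTTBR, or None if the programming is not permitted and
    would generate a stage 2 level 1 Translation fault.
    """
    t0sz = sign_extend_4bit(t0sz_field)
    if sl0 == 0b00 and -2 <= t0sz <= 7:
        return 2, 32 - t0sz, 14 - t0sz      # lookup starts at level 2
    if sl0 == 0b01 and -8 <= t0sz <= 1:
        return 1, 32 - t0sz, 5 - t0sz       # lookup starts at level 1
    return None                             # reserved SL0 or inconsistent T0SZ


# A 40-bit IPA range: T0SZ == -8, encoded as 0b1000, starting at level 1.
print(vtcr_stage2_config(0b01, 0b1000))
```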


In addition, VTCR.S must be programmed to the value of T0SZ[3], otherwise behavior is CONSTRAINED 
UNPREDICTABLE with the resulting behavior being that VTCR.T0SZ is treated as an UNKNOWN value. 


Note 


VTCR.T0SZ being treated as an UNKNOWN value results in a stage 2 level 1 Translation fault if that UNKNOWN 
value is not consistent with the programmed value of VTCR.SL0. 








CONSTRAINED UNPREDICTABLE behaviors associated with the VTCR on page K1-5474 describes these 
CONSTRAINED UNPREDICTABLE behaviors. 


Where necessary, the initial lookup level provides multiple concatenated translation tables, as described in Use of 
concatenated level 2 translation tables on page G4-4061. This section also gives more information about the 
alternatives, shown in Table G4-5, when R is in the range 2^31 to 2^34. 


The algorithm for finding the translation table entries, VMSAv8-32 Long-descriptor format 


This section gives the algorithm for finding the translation table entry that corresponds to a given IA, for each 
required level of lookup. The algorithm encodes the descriptions of address translation given earlier in this section. 
The VMSAv8-32 Long-descriptor format uses a 4KB translation granule. 


The description uses the following terms: 


BaseAddr The base address for the level of lookup, as defined by: 


. For the initial lookup level, the TTBR.BADDR base address field in the appropriate TTBR, 
see the description of TnSZ on page G4-4067. 


. Otherwise, the translation table address returned by the previous level of lookup. 


IA The supplied IA for this stage of translation. 
TnSZ The translation table size for this stage of translation: 

For a PL1&0 stage 1 translation 
Either: 
° TTBCR.T0SZ if the translation is using TTBR0. 
° TTBCR.T1SZ if the translation is using TTBR1. 

For a PL1&0 stage 2 translation 
VTCR.T0SZ. The translation uses VTTBR. 

For a PL2 stage 1 translation 
HTCR.T0SZ. The translation uses HTTBR. 


SL0 VTCR.SL0. Applies to the Non-secure PL1&0 stage 2 translation only. 


Table G4-6 shows the translation table descriptor address, for each level of lookup. The table shows only 
architecturally-valid programming of the TCR. See also Possible translation table registers programming errors on 


page G4-4058. 


Table G4-6 Translation table entry addresses, VMSAv8-32 using Long-descriptor format 





Lookup    Entry address and conditions                                                       General 
level     Stage 1 translation                   Stage 2 translation                          conditions 

One       BaseAddr[39:x]:IA[y:30]:0b000         BaseAddr[39:x]:IA[y:30]:0b000                y = (x + 26) 
          if(a) 0 ≤ TnSZ ≤ 1 then               if SL0 == 1 then 
            x = (5 - TnSZ)                        if(a) -8 ≤ T0SZ ≤ 1 then x = (5 - T0SZ) 

Two       BaseAddr[39:x]:IA[y:21]:0b000         BaseAddr[39:x]:IA[y:21]:0b000                y = (x + 17) 
          if(a) 2 ≤ TnSZ ≤ 7 then               if SL0 == 0 then 
            x = (14 - TnSZ)                       if(a) -2 ≤ T0SZ ≤ 7 then x = (14 - T0SZ) 
          else(c) x = 12                        elsif(c) SL0(b) == 1 then x = 12 

Three     BaseAddr[39:12]:IA[20:12]:0b000       BaseAddr[39:12]:IA[20:12]:0b000              - 

a. This line indicates the range of permitted values for TnSZ, for a lookup that starts at this level, see Use of 
concatenated translation tables for the initial stage 2 lookup on page G4-4060. 

b. SL0 == 0 if the initial lookup is level 2, SL0 == 1 if the initial lookup is level 1. 

c. This is the case where this level of lookup is not the initial level of lookup. 
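
The entry addresses in Table G4-6 are bit-field concatenations, so they can be computed with shifts and masks. The following sketch assumes that the caller has already derived x from TnSZ or T0SZ as the table describes. The helper names bits and descriptor_address are hypothetical.

```python
def bits(value, hi, lo):
    """Return value[hi:lo] as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def descriptor_address(base_addr, ia, level, x=12):
    """Form BaseAddr[39:x]:IA[y:lo]:0b000 for one level of lookup.

    x is the lowest BaseAddr bit used. It is 12 unless this is the
    initial lookup level, and it is always 12 at level 3.
    """
    if level == 1:
        y, lo = x + 26, 30
    elif level == 2:
        y, lo = x + 17, 21
    elif level == 3:
        x, y, lo = 12, 20, 12
    else:
        raise ValueError("level must be 1, 2, or 3")
    index = bits(ia, y, lo)                 # x - 3 bits of table index
    base = bits(base_addr, 39, x) << x      # table base, aligned to 2^x bytes
    return base | (index << 3)              # each descriptor is 8 bytes


# Stage 1, TnSZ == 0: the initial lookup is at level 1 with x == 5,
# so IA[31:30] selects one of four level 1 descriptors.
print(hex(descriptor_address(0x80004000, 0xC0000000, level=1, x=5)))
```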
G4.6 Memory access control 


In addition to an output address, a translation table entry that refers to a page or region of memory includes fields that 
define properties of the target memory region. Information returned by a translation table lookup on page G4-4036 
describes the classification of those fields as address map control, access control, and memory attribute fields. The 
access control fields, described in this section, determine whether the PE, in its current state, is permitted to perform 
the required access to the output address given in the translation table descriptor. If a translation stage does not 
permit the access then an MMU fault is generated for that translation stage, and no memory access is performed. 


The following sections describe the memory access controls: 


° Access permissions. 
° Execute-never restrictions on instruction fetching on page G4-4071. 
° Domains, Short-descriptor format only on page G4-4073. 
° The Access flag on page G4-4074. 
° Hyp mode control of Non-secure access permissions on page G4-4075. 
G4.6.1 Access permissions 
Note 





This section gives a general description of memory access permissions. Software executing at PL1 in Non-secure 
state can see only the access permissions defined by the Non-secure PL1&0 stage 1 translations. However, software 
executing at PL2 can modify these permissions, as described in Hyp mode control of Non-secure access permissions 
on page G4-4075. This modification is invisible to Non-secure software executing at EL1 or EL0.





Access permission bits in a translation table descriptor control access to the corresponding memory region. The 
details of this control depend on the translation table format, as follows: 
Short-descriptor format 


This format supports two options for defining the access permissions: 
° Three bits, AP[2:0], define the access permissions. 


° Two bits, AP[2:1], define the access permissions, and AP[0] can be used as an Access flag. 


SCTLR.AFE selects the access permissions option. Setting this bit to 1, to enable the Access flag, 
also selects use of AP[2:1] to define access permissions. 


ARM deprecates any use of the AP[2:0] scheme for defining access permissions. 


Long-descriptor format 


This format uses AP[2:1] to control the access permissions, and the descriptors provide an AF bit for use as an Access
flag. This means VMSAv8-32 behaves as if the value of SCTLR.AFE is 1, regardless of the value 
that software has written to this bit. 

Note

When use of the Long-descriptor format is enabled, SCTLR.AFE is UNK/SBOP. 





The Access flag on page G4-4074 describes the Access flag, for both translation table formats. 


The XN and PXN bits provide additional access controls for instruction fetches, see Execute-never restrictions on 
instruction fetching on page G4-4071. 


An attempt to perform a memory access that the translation table access permission bits do not permit generates a 
Permission fault, for the corresponding stage of translation. However, when using the Short-descriptor translation 
table format, it generates the fault only if the access is to memory in the Client domain, see Domains, 
Short-descriptor format only on page G4-4073. 
Note


For the Non-secure PL1&0 translation regime, memory accesses are subject to two stages of translation. Each stage 
of translation has its own, independent, fault checking. Fault handling is different for the two stages, see Exception 
reporting in a VMSAv8-32 implementation on page G4-4123. 





The following sections describe the two access permissions models: 
° AP[2:1] access permissions model. 


° AP[2:0] access permissions control, Short-descriptor format only on page G4-4070. This section includes 
some information on access permission control in earlier versions of the ARM VMSA. 


AP[2:1] access permissions model 





Note 


ARM recommends that this model is always used, even where the AP[2:0] model is permitted. Some documentation 
describes the AP[2:1] model as the simplified access permissions model. 





This access permissions model is used if the translation is either: 
° Using the Long-descriptor translation table format. 


° Using Short-descriptor translation table format, and the SCTLR.AFE bit is set to 1. 


In this model: 

° One bit, AP[2], selects between read-only and read/write access. 

° A second bit, AP[1], selects between Application level (PL0) and System level (PL1) control.
For the Non-secure PL2 stage 1 translations, AP[1] is SBO. 


This provides four access combinations: 


° Read-only at all privilege levels.
° Read/write at all privilege levels.
° Read-only at PL1, no access by software executing at PL0.
° Read/write at PL1, no access by software executing at PL0.


Table G4-7 shows this access control model. 


Table G4-7 VMSAv8-32 AP[2:1] access permissions model 





AP[2], disable write access   AP[1], enable unprivileged access   Access
0                             0(a)                                Read/write, only at PL1
0                             1                                   Read/write, at any privilege level
1                             0(a)                                Read-only, only at PL1
1                             1                                   Read-only, at any privilege level

a. Not valid for Non-secure PL2 stage 1 translation tables. AP[1] is SBO in these tables.
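
As an illustration of the table, the following minimal Python sketch decodes the two bits into a pair of access rights. The function name `ap21_permissions` and the "RW", "RO", and None return values are hypothetical conventions chosen for this example.

```python
def ap21_permissions(ap2, ap1):
    """Decode AP[2] and AP[1] into (PL1 access, unprivileged access).

    Each element is "RW", "RO", or None for no access. The names and
    return convention are assumptions made for this sketch.
    """
    access = "RO" if ap2 else "RW"          # AP[2] == 1 disables writes
    unprivileged = access if ap1 else None  # AP[1] == 1 enables PL0 access
    return access, unprivileged
```

Because AP[1] only adds unprivileged access and never grants more than PL1 has, the unprivileged result is either identical to the PL1 result or no access at all.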


Hierarchical control of access permissions, Long-descriptor format 


The Long-descriptor translation table format introduces a mechanism that entries at one level of translation table 
lookup can use to set limits on the permitted entries at subsequent levels of lookup. This applies to the access 
permissions, and also to the restrictions on instruction fetching described in Hierarchical control of instruction 
fetching, Long-descriptor format on page G4-4072. 


The restrictions apply only to subsequent levels of lookup at the same stage of translation. The APTable[1:0] field 
restricts the access permissions, as Table G4-8 on page G4-4070 shows. 







As stated in the table footnote, for the Non-secure PL2 stage 1 translation tables, APTable[0] is reserved, SBZ. 


Table G4-8 Effect of APTable[1:0] on subsequent levels of lookup 





APTable[1:0]   Effect
00             No effect on permissions in subsequent levels of lookup.
01(a)          Access at PL0 not permitted, regardless of permissions in subsequent levels of lookup.
10             Write access not permitted, at any exception level, regardless of permissions in subsequent levels of lookup.
11(a)          Regardless of permissions in subsequent levels of lookup:
               ° Write access not permitted, at any exception level.
               ° Read access not permitted at PL0.

a. Not valid for the Non-secure PL2 stage 1 translation tables. In those tables, APTable[0] is SBZ.

Note 
The APTable[1:0] settings are combined with the translation table access permissions in the translation table
descriptors accessed in subsequent levels of lookup. They do not restrict or change the values entered in those 
descriptors. 
The Long-descriptor format provides APTable[1:0] control only for the stage 1 translations. The corresponding bits 
are SBZ in the stage 2 translation table descriptors. 
The effect of APTable applies to later entries in the translation table walk, and so its effects can be held in one or 
more TLB entries. Therefore, a change to APTable requires coarse-grained invalidation of the TLB to ensure that 
the effect of the change is visible to subsequent memory transactions. 
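
The way APTable[1:0] limits the permissions from later levels of lookup can be sketched as a filter applied to the permissions that the final descriptor would otherwise grant. This is an illustrative model only. The function name `apply_aptable` and the use of sets of "r" and "w" to represent permissions are assumptions of this example.

```python
def apply_aptable(aptable, pl1, pl0):
    """Restrict the PL1 and PL0 permissions according to APTable[1:0].

    pl1 and pl0 are sets drawn from {"r", "w"}, giving the permissions
    from the descriptor at a later level of lookup. Returns the
    restricted (pl1, pl0) pair. The descriptor itself is not changed.
    """
    pl1, pl0 = frozenset(pl1), frozenset(pl0)
    if aptable & 0b10:           # APTable[1]: no writes at any level
        pl1 -= {"w"}
        pl0 -= {"w"}
    if aptable & 0b01:           # APTable[0]: no access at PL0
        pl0 = frozenset()
    return pl1, pl0
```

Where several table descriptors are traversed, applying the function once per level gives the cumulative restriction, because each call can only remove permissions.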
AP[2:0] access permissions control, Short-descriptor format only 
This access permissions model applies when using the Short-descriptor translation tables format, and the 
SCTLR.AFE bit is set to 0. ARM deprecates any use of this access permissions model. 
When SCTLR.AFE is set to 0, ensuring that the AP[0] bit is always set to 1 effectively changes the access model to 
the simpler model described in AP[2:1] access permissions model on page G4-4069.
Table G4-9 shows the full AP[2:0] access permissions model: 
Table G4-9 VMSAv8-32 MMU access permissions

AP[2]   AP[1:0]   PL1 access   Unprivileged access   Description
0       00        No access    No access             All accesses generate Permission faults
0       01        Read/write   No access             Access only at PL1
0       10        Read/write   Read-only             Writes at PL0 generate Permission faults
0       11        Read/write   Read/write            Full access
1       00        -            -                     Reserved
1       01        Read-only    No access             Read-only, only at PL1
1       10        Read-only    Read-only             Read-only at any exception level, deprecated(a)
1       11        Read-only    Read-only             Read-only at any exception level(b)

a. From VMSAv7, ARM strongly recommends use of the 0b11 encoding for Read-only at any exception level.

b. This mapping was introduced in VMSAv7, and is reserved in earlier versions of the VMSA.
Note

° VMSAv8-32 supports the full set of access permissions shown in Table G4-9 on page G4-4070 only when
SCTLR.AFE is set to 0. When SCTLR.AFE is set to 1, the only supported access permissions are those
described in AP[2:1] access permissions model on page G4-4069.

° Some old documentation describes the AP[2] bit in the translation table entries as the APX bit.
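
The full AP[2:0] encoding can be written as a lookup table. The sketch below is illustrative only, and the names `AP20_TABLE` and `ap20_permissions` are hypothetical. It also shows why holding AP[0] at 1 reduces this model to the AP[2:1] model.

```python
# (PL1 access, unprivileged access) for each AP[2:0] value.
# "RW" is read/write, "RO" is read-only, None is no access.
AP20_TABLE = {
    0b000: (None, None),
    0b001: ("RW", None),
    0b010: ("RW", "RO"),
    0b011: ("RW", "RW"),
    0b101: ("RO", None),
    0b110: ("RO", "RO"),   # deprecated encoding
    0b111: ("RO", "RO"),
}


def ap20_permissions(ap):
    """Decode AP[2:0]; the reserved encoding 0b100 raises ValueError."""
    if ap not in AP20_TABLE:
        raise ValueError("AP[2:0] == 0b100 is reserved")
    return AP20_TABLE[ap]


# With AP[0] fixed at 1, the four remaining encodings are exactly the
# four combinations of the AP[2:1] model.
simplified = {ap >> 1: ap20_permissions(ap) for ap in (0b001, 0b011, 0b101, 0b111)}
```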








G4.6.2 Execute-never restrictions on instruction fetching 
Execute-never (XN) controls provide an additional level of control on memory accesses permitted by the access 
permissions settings. These controls are: 
XN, Execute-never 
When the XN bit is 1, a Permission fault is generated if the PE attempts to execute an instruction 
from the corresponding memory region. However, when using the Short-descriptor translation table 
format, the fault is generated only if the access is to memory in the Client domain, see Domains, 
Short-descriptor format only on page G4-4073. A PE can execute instructions from a memory 
region only if the access permissions for its current state permit read access, and the value of the XN 
bit is 0. 
PXN, Privileged execute-never 
When the PXN bit is 1, a Permission fault is generated if the PE is executing at PL1 and attempts to 
execute an instruction from the corresponding memory region. As with the XN bit, when using the 
Short-descriptor translation table format, the fault is generated only if the access is to memory in the 
Client domain. 
In both the Short-descriptor format and the Long-descriptor format translation tables, all descriptors for memory 
blocks and pages always include an XN bit. 
Support for the PXN bit is as follows: 
° The Long-descriptor translation table formats always include the PXN bit. However, in the Non-secure PL2
stage 1 translation tables, and in the Non-secure PL1&0 stage 2 translation tables, the PXN bit is reserved,
SBZ.
° All implementations must: 
— Support the use of the PXN bit in the PL1&0 translation regime. 
— Use the Short-descriptor translation table formats that include the PXN bit. 
In the Non-secure PL1&0 translation regime, a region is execute-never if the value of the XN bit is 1 in one or both
of the stage 1 translation table descriptor and the stage 2 translation table descriptor. See also Hyp mode control of
Non-secure access permissions on page G4-4075.
In addition, System register controls can enforce execute-never restrictions, regardless of the settings in the 
translation table XN and PXN fields, see: 
° Restriction on Secure instruction fetch on page G4-4073. 
° Preventing execution from writable locations on page G4-4073. 
The execute-never controls apply also to speculative instruction fetching. This means a speculative instruction fetch 
from a memory region that is execute-never at the current level of privilege is prohibited. 
The XN control means that, when the stage of address translation is enabled, the PE can fetch, or speculatively fetch, 
an instruction from a memory location only if all of the following apply: 
° If using the Short-descriptor translation table format, the translation table descriptor for the location does not 
indicate that it is in a No access domain. 
° If using the Long-descriptor translation table format, or using the Short descriptor format and the descriptor 
indicates that the location is in a Client domain, in the descriptor for the location the following apply: 
— XN is set to 0.
— The access permissions permit a read access from the current PE mode. 
° No other Prefetch Abort condition exists.
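
The conditions in this list can be expressed as a single predicate. The following Python sketch is an illustrative reading of the list, not architectural pseudocode. The function name `fetch_permitted`, the string values used for the format and domain, and the `other_abort` parameter are all assumptions of this example.

```python
def fetch_permitted(fmt, domain, xn, read_permitted, other_abort=False):
    """Return True if an instruction fetch from the location is permitted.

    fmt is "short" or "long". domain is "no_access", "client", or
    "manager", and is ignored for the Long-descriptor format.
    read_permitted says whether the access permissions allow a read
    from the current PE mode.
    """
    if other_abort:                  # some other Prefetch Abort condition
        return False
    if fmt == "short":
        if domain == "no_access":
            return False
        if domain == "manager":      # XN and permissions are not checked
            return True
    # Long-descriptor format, or Short-descriptor Client domain.
    return (not xn) and read_permitted
```

The Manager case returning True regardless of XN is the reason that read-sensitive memory must not be placed in a Manager domain.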
Note 
° The PXN control applies to the PE privilege when it attempts to execute the instruction. In an implementation
that fetches instructions speculatively, this might not be the privilege when the instruction was prefetched.
Therefore, the architecture does not require the PXN control to prevent instruction fetching.


° Although the XN control applies to speculative fetching, on a speculative instruction fetch from an XN
location, no Permission fault is generated unless the PE attempts to execute the instruction that would have
been fetched from that location. This means that, if a speculative fetch from an XN location is attempted, but
there is no attempt to execute the corresponding instruction, a Permission fault is not generated.


° The software that defines a translation table must mark any region of memory that is read-sensitive as XN, 
to avoid the possibility of a speculative fetch accessing the memory region. This means it must mark any 
memory region that corresponds to a read-sensitive peripheral as XN. Hardware does not prevent speculative 
accesses to a region of any Device memory type unless that region is also marked as execute-never for all 
Exception levels from which it can be accessed. 


° When using the Short-descriptor translation table format, the XN attribute is not checked for domains marked 
as Manager. Therefore, the system must not include read-sensitive memory in domains marked as Manager, 
because the XN bit does not prevent speculative fetches from a Manager domain. 





When no stage of address translation for the translation regime is enabled, memory regions cannot have XN or PXN 
attributes assigned. Behavior of instruction fetches when all associated address translations are disabled on 
page G4-4033 describes how disabling all MMUs affects instruction fetching. 


Hierarchical control of instruction fetching, Long-descriptor format 


The Long-descriptor translation table format introduces a mechanism that means entries at one level of translation 
tables lookup can set limits on the permitted entries at subsequent levels of lookup. This applies to the restrictions 
on instruction fetching, and also to the access permissions described in Hierarchical control of access permissions, 
Long-descriptor format on page G4-4069. 


The restrictions apply only to subsequent levels of lookup at the same stage of translation, and: 


° XNTable restricts the XN control: 


— When XNTable is set to 1, the XN bit is treated as 1 in all subsequent levels of lookup, regardless of 
the actual value of the bit. 


— When XNTable is set to 0 it has no effect. 


° PXNTable restricts the PXN control:


— When PXNTable is set to 1, the PXN bit is treated as 1 in all subsequent levels of lookup, regardless 
of the actual value of the bit. 


— When PXNTable is set to 0 it has no effect. 


Note 


The XNTable and PXNTable settings are combined with the XN and PXN bits in the translation table descriptors 
accessed at subsequent levels of lookup. They do not restrict or change the values entered in those descriptors. 








The XNTable and PXNTable controls are provided only in the Long-descriptor translation table format, and only 
for stage 1 translations. The corresponding bits are SBZ in the stage 2 translation table descriptors. 


The effect of XNTable or PXNTable applies to later entries in the translation table walk, and so its effects can be 
held in one or more TLB entries. Therefore, a change to XNTable or PXNTable requires coarse-grained invalidation 
of the TLB to ensure that the effect of the change is visible to subsequent memory transactions. 
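
As a sketch of how these controls accumulate during a translation table walk, the following Python function combines the XNTable and PXNTable bits from each table descriptor with the XN and PXN bits of the final descriptor. The function name `effective_execute_never` and its argument layout are hypothetical.

```python
def effective_execute_never(table_controls, xn, pxn):
    """Combine table-level controls with the final descriptor bits.

    table_controls is a sequence of (XNTable, PXNTable) pairs, one per
    table descriptor traversed at the same stage of translation.
    xn and pxn are the bits from the final Block or Page descriptor.
    Returns the effective (XN, PXN) pair as booleans.
    """
    for xntable, pxntable in table_controls:
        xn = xn or bool(xntable)      # once set, XN stays set
        pxn = pxn or bool(pxntable)   # once set, PXN stays set
    return bool(xn), bool(pxn)
```

The combination is a logical OR, so a table-level control can only add a restriction. It never clears an XN or PXN bit that a later descriptor sets.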
Preventing execution from writable locations


ARMv8 provides control bits that, when the corresponding stage 1 address translation is enabled, force writable
memory to be treated as XN or PXN, regardless of the value of the XN or PXN bit. When the translation stages are 
controlled by an Exception level that is using AArch32: 


° For PL1&0 stage 1 translations, when SCTLR.WXN is set to 1, all regions that are writable at stage 1 of the 
address translation are treated as XN. 


° For PL1&0 stage 1 translations, when SCTLR.UWXN is set to 1, an instruction fetch is treated as accessing 
a PXN region if it accesses a region that software executing at PL0 can write to.


° For Non-secure PL2 stage 1 translations, when HSCTLR.WXN is set to 1, all regions that are writable at 
stage 1 of the address translation are treated as XN. 
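
For the PL1&0 stage 1 case, the effect of the two controls can be sketched as an override applied on top of the descriptor XN and PXN values. This is an illustrative model only. The function name `apply_wxn` is hypothetical, and the sketch assumes that a region counts as writable at stage 1 if either PL1 or PL0 has write permission.

```python
def apply_wxn(xn, pxn, pl1_writable, pl0_writable, wxn, uwxn):
    """Return the effective (XN, PXN) pair after the WXN and UWXN overrides.

    xn and pxn are the descriptor bits. pl1_writable and pl0_writable
    give the stage 1 write permissions. wxn and uwxn are the values of
    the SCTLR.WXN and SCTLR.UWXN bits.
    """
    if wxn and (pl1_writable or pl0_writable):
        xn = True      # writable memory is treated as XN
    if uwxn and pl0_writable:
        pxn = True     # memory writable at PL0 is treated as PXN
    return bool(xn), bool(pxn)
```

The translation table entry is not modified. Only the interpretation of the entry changes, which is why the result is computed rather than written back.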





Note 
° The SCTLR.WXN controls are intended to be used in systems with very high security requirements.
° Setting a WXN or UWXN bit to 1 changes the interpretation of the translation table entry, overriding a zero
value of an XN or PXN field. It does not cause any change to the translation table entry.





For any given virtual machine, ARM expects WXN and UWXN to remain static in normal operation. In particular, 
it is IMPLEMENTATION DEFINED whether TLB entries associated with a particular VMID reflect the effect of the 
values of these fields. A generic sequence to ensure synchronization of a change to these fields, when that change 
is made without a corresponding change of VMID, is: 


Change the WXN or UWXN bit 


ISB ; This ensures synchronization of the change 
Invalidate entire TLB of associated entries 

DSB ; This completes the TLB Invalidation 

ISB ; This ensures instruction synchronization 


As with all Permission fault checking, if the stage 1 translation is using the Short-descriptor translation table format, 
the permission checks are performed only for Client domains. For more information see Access permissions on 
page G4-4068. 


For more information about address translation see About address translation for VMSAv8-32 on page G4-4026. 


Restriction on Secure instruction fetch 


EL3 provides a Secure instruction fetch bit, SCR.SIF. When this bit is set to 1, any attempt in Secure state to execute 
an instruction fetched from Non-secure physical memory causes a Permission fault. As with all Permission fault 
checking, when using the Short-descriptor format translation tables the check applies only to Client domains, see 
Access permissions on page G4-4068. 


ARM expects SCR.SIF to be static during normal operation. In particular, whether the TLB holds the effect of the 
SIF bit is IMPLEMENTATION DEFINED. The generic sequence to ensure visibility of a change to the SIF bit is: 


Change the SCR.SIF bit 


ISB ; This ensures synchronization of the change 

Invalidate entire TLB 

DSB ; This completes the TLB Invalidation 

ISB ; This ensures instruction synchronization 
G4.6.3 Domains, Short-descriptor format only 


A domain is a collection of memory regions. The Short-descriptor translation table format supports 16 domains, and 
requires the software that defines a translation table to assign each VMSAv8-32 memory region to a domain. When 
using the Short-descriptor format: 


° Level 1 translation table entries for Page tables and Sections include a domain field.
° Translation table entries for Supersections do not include a domain field. The Short-descriptor format defines
Supersections as being in domain 0.
° Level 2 translation table entries inherit a domain setting from the parent level 1 Page table descriptor.
° Each TLB entry includes a domain field.


The domain field specifies which of the 16 domains the entry is in, and a two-bit field in the DACR defines the 
permitted access for each domain. The possible settings for each domain are: 


No access Any access using the translation table descriptor generates a Domain fault. 


Clients On an access using the translation table descriptor, the access permission attributes are checked. 
Therefore, the access might generate a Permission fault. 


Managers On an access using the translation table descriptor, the access permission attributes are not checked. 
Therefore, the access cannot generate a Permission fault. 


See The MMU fault-checking sequence on page G4-4113 for more information about how, when using the 
Short-descriptor translation table format, the Domain attribute affects the checking of the other attributes in the 
translation table descriptor. 





Note 
A single program might: 
° Be a Client of some domains. 
° Be a Manager of some other domains. 
° Have no access to the remaining domains. 





The Long-descriptor translation table format does not support domains. When a stage of translation is using this 
format, all memory is treated as being in a Client domain, and the settings in the DACR are ignored. 
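
As an illustration of the two-bit fields in the DACR, the sketch below extracts the setting for one domain. The field layout, with domain n at bits [2n+1:2n], and the encodings 0b00 No access, 0b01 Client, 0b10 Reserved, and 0b11 Manager are taken from the DACR register description and are not stated in this section. The names `DOMAIN_ACCESS` and `domain_access` are hypothetical.

```python
# Encoding of each two-bit domain field, per the DACR register description.
DOMAIN_ACCESS = {
    0b00: "No access",
    0b01: "Client",
    0b10: "Reserved",
    0b11: "Manager",
}


def domain_access(dacr, domain):
    """Return the access setting for the given domain (0 to 15)."""
    if not 0 <= domain <= 15:
        raise ValueError("the Short-descriptor format supports domains 0 to 15")
    field = (dacr >> (2 * domain)) & 0b11
    return DOMAIN_ACCESS[field]
```

For example, a DACR value of 0x55555555 makes the PE a Client of all 16 domains, so that the access permission attributes are checked on every access.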


G4.6.4 The Access flag 


The Access flag indicates when a page or section of memory is accessed for the first time since the Access flag in 
the corresponding translation table descriptor was set to 0: 


° If address translation is using the Short-descriptor translation table format, it must set SCTLR.AFE to 1 to
enable use of the Access flag. Setting this bit to 1 redefines the AP[0] bit in the translation table descriptors
as an Access flag, and limits the access permissions information in the translation table descriptors to
AP[2:1], as described in AP[2:1] access permissions model on page G4-4069.


° The Long-descriptor format always supports an Access flag bit in the translation table descriptors, and 
address translation using this format behaves as if SCTLR.AFE is set to 1, regardless of the value of that bit. 


In ARMv8.0, the Access flag is managed by software as described in Software management of the Access flag. 


Note 


Previous versions of the ARM architecture optionally supported hardware management of the Access flag. ARMv8.0
obsoletes this option. 








Software management of the Access flag 


ARMv8.0 requires that software manages the Access flag. This means an Access flag fault is generated whenever
an attempt is made to read into the TLB a translation table descriptor entry for which the value of the Access flag 
is 0. 
Note


When using the Short-descriptor translation table format, Access flag faults are generated only if SCTLR.AFE is 
set to 1, to enable use of a translation table descriptor bit as an Access flag. 





The Access flag mechanism expects that, when an Access flag fault occurs, software resets the Access flag to 1 in 
the translation table entry that caused the fault. This prevents the fault occurring the next time that memory location 
is accessed. Entries with the Access flag set to 0 are never held in the TLB, meaning software does not have to flush 
the entry from the TLB after setting the flag. 





Note 


If a system incorporates components that can autonomously update translation table entries that are shared with the 
ARM PE, then the software must be aware of the possibility that such components can update the access flag 
autonomously. 


In such a system, system software should perform any changes of translation table entries with an Access flag of 0, 
other than changes to the Access flag value, by using a Load-Exclusive/Store-Exclusive loop, to allow for the
possibility of simultaneous updates. 





G4.6.5 Hyp mode control of Non-secure access permissions 


When EL2 is using AArch32, Non-secure software executing in Hyp mode controls two sets of translation tables, 
both of which use the Long-descriptor translation table format: 


° The translation tables that control the Non-secure PL2 stage 1 translations. These map VAs to PAs, for
memory accesses made when executing in Non-secure state in Hyp mode, and are indicated and controlled 
by the HTTBR and HTCR. 


These translations have similar access controls to other Non-secure stage 1 translations using the 
Long-descriptor translation table format, as described in: 
—  AP[2:1] access permissions model on page G4-4069. 


—  Execute-never restrictions on instruction fetching on page G4-4071. 


The differences from the Non-secure stage 1 translations are that:
— The APTable[0], PXNTable, and PXN bits are reserved, SBZ.
— AP[1] is reserved, SBO.


° The translation tables that control the Non-secure PL1&0 stage 2 translations. These map the IPAs from the 
stage 1 translation onto PAs, for memory accesses made when executing in Non-secure state at PL1 or PL0,
and are indicated and controlled by the VTTBR and VTCR. 


The descriptors in the virtualization translation tables define stage 2 access permissions, that are combined 
with the permissions defined in the stage 1 translation. This section describes this combining of access
permissions. 


Note 
The stage 2 access permissions mean a hypervisor can define additional access restrictions to those defined by a
Guest OS in the stage 1 translation tables. For a particular access, the actual access permission is the more restrictive
of the permissions defined by:





° The Guest OS, in the stage 1 translation tables. 


° The hypervisor, in the stage 2 translation tables. 





The stage 2 access controls defined from Hyp mode: 





° Affect only the Non-secure stage 1 access permissions settings. 
° Take no account of whether the accesses are from a Non-secure PL1 mode or a Non-secure PL0 mode.
° Permit software executing in Hyp mode to assign a write-only attribute to a memory region. 
The S2AP field in the stage 2 descriptors defines the stage 2 access permissions, as Table G4-10 shows:


Table G4-10 Stage 2 control of access permissions 





S2AP   Access permission
00     No access permitted
01     Read-only. Writes to the region are not permitted, regardless of the stage 1 permissions.
10     Write-only. Reads from the region are not permitted, regardless of the stage 1 permissions.
11     Read/write. The stage 1 permissions determine the access permissions for the region.





For more information about the S2AP field see Attribute fields in VMSAv8-32 Long-descriptor stage 2 Block and
Page descriptors on page G4-4055. 
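
The combination of stage 1 and stage 2 permissions can be sketched as a set intersection, which gives the more restrictive of the two for each type of access. This is an illustrative model only. The function name `combine_stages` and the representation of permissions as sets of "r" and "w" are assumptions of this example.

```python
def combine_stages(stage1, s2ap):
    """Return the permissions that both stages of translation allow.

    stage1 is an iterable drawn from {"r", "w"}, giving the stage 1
    permissions for the current privilege level. s2ap is the two-bit
    S2AP field from the stage 2 descriptor.
    """
    stage2 = set()
    if s2ap & 0b01:     # S2AP[0]: reads permitted at stage 2
        stage2.add("r")
    if s2ap & 0b10:     # S2AP[1]: writes permitted at stage 2
        stage2.add("w")
    return frozenset(stage1) & stage2
```

Stage 2 takes no account of the privilege level of the access, so the same S2AP value is combined with whichever stage 1 permissions apply to the access.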


If the stage 2 permissions cause a Permission fault, this is a stage 2 MMU fault. Stage 2 MMU faults are taken to 
Hyp mode, and reported in the HSR using an EC code of 0x20 or 0x24. For more information, see Use of the HSR 
on page G4-4137. 


Note 


In the HSR, the combination of the EC code and the DFSC or IFSC value in the ISS indicates that the fault is a
stage 2 MMU fault.








The stage 2 permissions include an XN attribute. If this is set to 1, execution from the region is not permitted, 
regardless of the value of the XN attribute in the stage 1 translation. If a Permission fault is generated because the 
stage 2 XN bit is set to 1, this is reported as a stage 2 MMU fault. 


AArch32 state prioritization of synchronous aborts from a single stage of address translation on page G4-4120 
describes the abort prioritization if both stages of a translation generate a fault. 
G4.7 Memory region attributes


In addition to an output address, a translation table entry that refers to a page or region of memory includes fields
that define properties of that target memory region. Information returned by a translation table lookup on
page G4-4036 describes the classification of those fields as address map control, access control, and memory
attribute fields. The memory region attribute fields control the memory type, Cacheability, and Shareability of the
region.


The following sections describe the assignment of memory region attributes for stage 1 translations: 


° Overview of memory region attributes for stage 1 translations.
° Short-descriptor format memory region attributes, without TEX remap on page G4-4078. 
° Short-descriptor format memory region attributes, with TEX remap on page G4-4080. 


° VMSAv8-32 Long-descriptor format memory region attributes on page G4-4083.


For an implementation that is operating in Secure state, or in Hyp mode, these assignments define the memory 
attributes of the accessed region. 


For an implementation that is operating in a Non-secure PL1 or PLO mode, the Non-secure PL1&0 stage 2 
translation can modify the memory attributes assigned by the stage 1 translation. EL2 control of Non-secure memory 
region attributes on page G4-4085 describes these stage 2 assignments. 


G4.7.1 Overview of memory region attributes for stage 1 translations 
The description of the memory region attributes in a translation descriptor divides into: 


Memory type and attributes 
These are described either: 
° Directly, by bits in the translation table descriptor. 


° Indirectly, by registers referenced by bits in the table descriptor. This is described as
remapping the memory type and attribute description. 


The Short-descriptor translation table format can use either of these approaches, selected by the 
SCTLR.TRE bit: 


TRE == 0 Remap disabled. The TEX[2:0], C, and B bits in the translation table descriptor define 
the memory region attributes. Short-descriptor format memory region attributes, 
without TEX remap on page G4-4078 describes this encoding. 

Note


With the Short-descriptor format, remapping is called TEX remap, and the SCTLR.TRE 
bit is the TEX remap enabled bit. 





The description of the TRE == 0 encoding includes information about the encoding in 
previous versions of the architecture. 


TRE == 1 Remap enabled. The TEX[0], C, and B bits in the translation table descriptor are index 
bits to the remap registers, that define the memory region attributes: 
° The Primary Region Remap Register, PRRR. 
° The Normal Memory Remap Register, NMRR.
Short-descriptor format memory region attributes, with TEX remap on page G4-4080 
describes this encoding scheme. 
This scheme reassigns translation table descriptor bits TEX[2:1] for use as bits managed 
by the operating system. 


The Long-descriptor translation table format always uses remapping. This means that when the
value of TTBCR.EAE is 1, enabling use of the Long-descriptor translation table format,
SCTLR.TRE is RES1.
VMSAv8-32 Long-descriptor format memory region attributes on page G4-4083 describes this
encoding.



Shareability 


In the Short-descriptor translation table format, the S bit in the translation table descriptor is used in 
determining the Shareability of the region. How the S bit is interpreted depends on whether TEX
remap is enabled, see: 

° Shareability and the S bit, without TEX remap on page G4-4079. 

° Determining the Shareability, with TEX remap on page G4-4081. 


In the Long-descriptor translation table format, the SH[1:0] field in the translation table descriptor 
encodes the Shareability of the region, see Shareability, Long-descriptor format on page G4-4083. 


Note

Shareability is one of Non-shareable, Inner Shareable, and Outer Shareable. However, when using 
the Short-descriptor translation table format without TEX remap, VMSAv8-32 does not support any 
distinction between Inner Shareable and Outer Shareable memory, and a memory region is either 
Non-shareable or Outer Shareable. 


















































G4.7.2 Short-descriptor format memory region attributes, without TEX remap 
When using the Short-descriptor translation table formats, TEX remap is disabled when the value of SCTLR.TRE 
is 0. 
Note 
° The Short-descriptor format scheme without TEX remap is the scheme used in VMSAv6. 
° The B (Bufferable), C (Cacheable), and TEX (Type extension) bit names are inherited from earlier versions 
of the architecture. These names no longer adequately describe the function of the B, C, and TEX bits. 
Table G4-11 shows the C, B, and TEX[2:0] encodings when TEX remap is disabled. In the Page Shareability 
column, an entry of S bit indicates that the S bit in the translation table descriptor determines the Shareability, see 
Shareability and the S bit, without TEX remap on page G4-4079. 
Table G4-11 TEX, C, and B encodings when TRE == 0 

TEX[2:0]  C  B  Description                                                       Memory type             Page Shareability 
000       0  0  Device-nGnRnE                                                     Device-nGnRnE           Outer Shareable 
000       0  1  Device-nGnRE (a)                                                  Device-nGnRE            Outer Shareable (a) 
000       1  0  Outer and Inner Write-Through, Read-Allocate No Write-Allocate    Normal                  S bit 
000       1  1  Outer and Inner Write-Back, Read-Allocate No Write-Allocate       Normal                  S bit 
001       0  0  Outer and Inner Non-cacheable                                     Normal                  Outer Shareable (b) 
001       0  1  Reserved                                                          -                       - 
001       1  0  IMPLEMENTATION DEFINED                                            IMPLEMENTATION DEFINED  IMPLEMENTATION DEFINED 
001       1  1  Outer and Inner Write-Back, Read-Allocate Write-Allocate          Normal                  S bit 
010       0  0  Device-nGnRE (a)                                                  Device-nGnRE            Outer Shareable (a) 
010       0  1  Reserved                                                          -                       - 
010       1  x  Reserved                                                          -                       - 
011       x  x  Reserved                                                          -                       - 
1BB       A  A  Cacheable memory: AA = Inner attribute (c), BB = Outer attribute  Normal                  S bit 





a. In ARMv8, all Device memory types are Outer Shareable. For the Device-nGnRE memory type this is a change from previous versions of 
the architecture. This is why Device-nGnRE memory has two entries in this table. For compatibility with ARMv7, software should use the 
{TEX, C, B} values {000, 0, 1}. 

b. In ARMv8, Normal Inner Non-cacheable, Outer Non-cacheable memory is always Outer Shareable. This is a change from previous versions 
of the architecture, where the S bit determined the Shareability. For compatibility with ARMv7, software should set S to 1. 


c. For more information, see Cacheability attributes, without TEX remap. 


See Memory types and attributes on page E2-2342 for an explanation of Normal memory, and the types of Device 
memory, and of the Shareability attribute. 


Cacheability attributes, without TEX remap 


When the value of TEX[2] is 0, the same Cacheability attribute applies to Inner Cacheable and Outer Cacheable 
memory regions, and the {TEX[1:0], C, B} values identify this attribute, as Table G4-11 on page G4-4078 shows. 


When the value of TEX[2] is 1, the memory described by the translation table entry is cacheable, and the rest of the 
encoding defines the Inner Cacheability and Outer Cacheability attributes: 


TEX[1:0] Define the Outer Cacheability attribute. 
C,B Define the Inner Cacheability attribute. 


The translation table entries use the same encoding for the Outer and Inner Cacheability attributes, as Table G4-12 
shows. 


Table G4-12 Inner and Outer Cacheability attribute encoding 

Encoding  Cacheability attribute 
00        Non-cacheable 
01        Write-Back, Read-Allocate Write-Allocate 
10        Write-Through, Read-Allocate No Write-Allocate 
11        Write-Back, Read-Allocate No Write-Allocate 





Shareability and the S bit, without TEX remap 


The Short-descriptor format translation table entries include an S bit. This bit: 


° Is ignored if the entry refers to any type of Device memory, or to Normal memory that is Inner 
Non-cacheable, Outer Non-cacheable. 


° For Normal memory that is not Inner Non-cacheable, Outer Non-cacheable, determines whether the memory 
region is Outer Shareable or Non-shareable: 


S == 0 Normal memory region is Non-shareable. 

S == 1 Normal memory region is Outer Shareable. 


Without TEX remapping there is no distinction between Inner Shareable and Outer Shareable memory, meaning the 
S bit determines whether the region is Non-shareable or Outer Shareable. 
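
The decoding rules of this section can be sketched in C as follows. This is a minimal illustration, not part of the architecture: the type and function names are hypothetical, and the C and B values assumed for the rows with TEX[2] == 0 follow the row order of Table G4-11 and should be checked against that table.

```c
/* Hypothetical model of Short-descriptor attribute decoding with SCTLR.TRE == 0. */
typedef enum { MT_DEVICE_nGnRnE, MT_DEVICE_nGnRE, MT_NORMAL,
               MT_RESERVED, MT_IMP_DEF } mem_type_t;

/* Enumerator values equal the two-bit encodings of Table G4-12. */
typedef enum { CA_NON_CACHEABLE = 0, CA_WB_RA_WA = 1,
               CA_WT_RA_NWA = 2, CA_WB_RA_NWA = 3 } cache_attr_t;

typedef enum { SH_NON_SHAREABLE, SH_OUTER_SHAREABLE } share_t;

typedef struct {
    mem_type_t   type;
    cache_attr_t inner, outer;   /* meaningful only when type == MT_NORMAL */
    share_t      share;          /* meaningful only for Device and Normal */
} region_attrs_t;

static cache_attr_t decode_cache(unsigned enc)
{
    return (cache_attr_t)(enc & 3u);
}

static region_attrs_t decode_no_tex_remap(unsigned tex, unsigned c,
                                          unsigned b, unsigned s)
{
    region_attrs_t r = { MT_RESERVED, CA_NON_CACHEABLE, CA_NON_CACHEABLE,
                         SH_OUTER_SHAREABLE };
    unsigned cb = ((c & 1u) << 1) | (b & 1u);
    tex &= 7u;

    if (tex & 4u) {
        /* TEX[2] == 1: TEX[1:0] is the Outer attribute, {C, B} the Inner. */
        r.type  = MT_NORMAL;
        r.outer = decode_cache(tex & 3u);
        r.inner = decode_cache(cb);
    } else {
        switch ((tex << 2) | cb) {          /* {TEX[1:0], C, B} */
        case 0x0: r.type = MT_DEVICE_nGnRnE; break;
        case 0x1:                            /* {000, 0, 1} */
        case 0x8: r.type = MT_DEVICE_nGnRE; break;   /* {010, 0, 0} */
        case 0x2: r.type = MT_NORMAL; r.inner = r.outer = CA_WT_RA_NWA; break;
        case 0x3: r.type = MT_NORMAL; r.inner = r.outer = CA_WB_RA_NWA; break;
        case 0x4: r.type = MT_NORMAL; break;         /* Non-cacheable */
        case 0x6: r.type = MT_IMP_DEF; break;
        case 0x7: r.type = MT_NORMAL; r.inner = r.outer = CA_WB_RA_WA; break;
        default:  r.type = MT_RESERVED; break;
        }
    }

    /* The S bit is used only for Normal memory that is not Inner
       Non-cacheable, Outer Non-cacheable. Otherwise it is ignored. */
    if (r.type == MT_NORMAL &&
        !(r.inner == CA_NON_CACHEABLE && r.outer == CA_NON_CACHEABLE))
        r.share = (s & 1u) ? SH_OUTER_SHAREABLE : SH_NON_SHAREABLE;

    return r;
}
```

For example, `decode_no_tex_remap(1, 1, 1, 0)` describes Normal Write-Back, Read-Allocate Write-Allocate memory that is Non-shareable, while the same call with `s` set to 1 describes the same memory as Outer Shareable.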
G4.7.3 Short-descriptor format memory region attributes, with TEX remap 


When using the Short-descriptor translation table formats, TEX remap is enabled when the value of SCTLR.TRE 
is 1. In this configuration: 


The software that defines the translation tables must program the PRRR and NMRR to define seven possible 
memory region attributes. 


The TEX[0], C, and B bits of the translation table descriptors define the memory region attributes, by 
indexing PRRR and NMRR. 


Hardware makes no use of TEX[2:1], see The OS managed translation table bits on page G4-4082. 


When TEX remap is enabled: 


For seven of the eight possible combinations of the TEX[0], C and B bits, fields in the PRRR and NMRR 
define the region attributes, as described in this section. 


The meaning of the eighth combination for the TEX[0], C and B bits is IMPLEMENTATION DEFINED. 


If the TEX[0], C and B bits determine that the region is a Device memory type, or is Normal Inner 
Non-cacheable, Outer Non-cacheable, then the region is Outer Shareable. Otherwise, the Shareability is 
determined by the combination of: 


— The S bit from the translation table descriptor. 
— The value of the PRRR.NS0 or PRRR.NS1 bit. 
— The value of the appropriate PRRR.NOSn bit, as shown in Table G4-13. 


For more information, see Determining the Shareability, with TEX remap on page G4-4081. 


For each of the possible encodings of the TEX[0], C, and B bits in a translation table entry, Table G4-13 shows 
which fields of the PRRR and NMRR registers describe the memory region attributes. 


Table G4-13 TEX, C, and B encodings when TRE == 1 

Encoding       Memory type (a)  Cache attributes (a, b):                Outer Shareable attribute (a, c) 
TEX[0]  C  B                    Inner cacheability  Outer cacheability 
0       0  0   PRRR.TR0         NMRR.IR0            NMRR.OR0            NOT(PRRR.NOS0) 
0       0  1   PRRR.TR1         NMRR.IR1            NMRR.OR1            NOT(PRRR.NOS1) 
0       1  0   PRRR.TR2         NMRR.IR2            NMRR.OR2            NOT(PRRR.NOS2) 
0       1  1   PRRR.TR3         NMRR.IR3            NMRR.OR3            NOT(PRRR.NOS3) 
1       0  0   PRRR.TR4         NMRR.IR4            NMRR.OR4            NOT(PRRR.NOS4) 
1       0  1   PRRR.TR5         NMRR.IR5            NMRR.OR5            NOT(PRRR.NOS5) 
1       1  0   IMPLEMENTATION DEFINED 
1       1  1   PRRR.TR7         NMRR.IR7            NMRR.OR7            NOT(PRRR.NOS7) 





a. For details of the Memory type and Outer Shareable encodings see the description of the PRRR. For details of the Cache attributes 
encodings see the description of the NMRR. 


b. Applies only if the memory type for the region is mapped as Normal memory. 


c. Applies only if both of the following apply: 


. The memory type for the region is mapped as Normal memory that is not Inner Non-cacheable and Outer Non-cacheable. 


. The region is not Non-shareable. 


See Determining the Shareability, with TEX remap on page G4-4081. 





G4-4080 


Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


G4 The AArch32 Virtual Memory System Architecture 
G4.7 Memory region attributes 


As Table G4-13 on page G4-4080 shows, when TEX remap is enabled, for a given set of {TEX[0], C, B} bits from 
a translation table descriptor: 


1. The primary mapping, to memory type, is given by the PRRR.TRn field as shown in the Memory type 
column. 


2. For any region that PRRR.TRn maps as Normal memory, NMRR.IRn determines the Inner cacheability 
attribute, and NMRR.ORn determines the Outer cacheability attribute. 


3. For a region that PRRR.TRn maps as Normal memory, if NMRR.{IRn, ORn} do not map the region as Inner 
Non-cacheable, Outer Non-cacheable, PRRR.{NS0, NS1} and PRRR.NOSn are used to determine the 
Shareability of the region, see Determining the Shareability, with TEX remap. 


The TEX remap registers and the SCTLR.TRE bit are Banked between the Secure and Non-secure Security states. 
For more information, see The effect of EL3 on TEX remap on page G4-4083. 


The TEX remap registers must be static during normal operation. In particular, when the remap registers are 
changed: 

• It is IMPLEMENTATION DEFINED when the changes take effect. 

• It is CONSTRAINED UNPREDICTABLE whether the TLB caches the effect of the TEX remap on translation 
tables, see CONSTRAINED UNPREDICTABLE behaviors due to caching of control or data values on 
page K1-5461. 


The software sequence to ensure the synchronization of changes to the TEX remap registers is: 

1. Execute a DSB instruction. This ensures any memory accesses using the old mapping have completed. 

2. Write the TEX remap registers or SCTLR.TRE bit. 

3. Execute an ISB instruction. This ensures synchronization of the register updates. 

4. Invalidate the entire TLB. 

5. Execute a DSB instruction. This ensures completion of the entire TLB operation. 

6. Clean and invalidate all caches. This removes any cached information associated with the old mapping. 

7. Execute a DSB instruction. This ensures completion of the cache maintenance. 

8. Execute an ISB instruction. This ensures instruction synchronization. 


This extends the standard rules for the synchronization of changes to System registers described in Synchronization 
of changes to AArch32 System registers on page G4-4163, and provides implementation freedom as to whether or 
not the effect of the TEX remap is cached. 


Determining the Shareability, with TEX remap 


The memory type of a region, as indicated in the Memory type column of Table G4-13 on page G4-4080, provides 
the first level of control of the Shareability of the region: 


• If the memory is any type of Device memory, then the region is Outer Shareable, and any Shareability 
attributes in the translation table descriptor and PRRR for that region are ignored. 
This applies also to a Normal memory region that the NMRR attributes identify as Inner Non-cacheable and 
Outer Non-cacheable. 


• If using the Short-descriptor translation table format then the Shareability of the region is determined using 
the value of the S bit in the translation table descriptor to index one of the PRRR.{NS1, NS0} bits, as 
described in this section. 


Table G4-14 shows how the translation table S bit indexes into the PRRR: 


Table G4-14 Determining the Shareability attribute, with TEX remap 

Memory type                                                Remapping when S == 0  Remapping when S == 1 
Device or Normal Inner Non-cacheable, Outer Non-cacheable  Outer Shareable        Outer Shareable 
Normal, not Inner Non-cacheable, Outer Non-cacheable       PRRR.NS0               PRRR.NS1 

For a Normal memory region that is not Inner Non-cacheable, Outer Non-cacheable, the appropriate bit of the 
PRRR indicates whether the region is Non-shareable, as follows: 

PRRR.NSn == 0 Region is Non-shareable. 
PRRR.{NOS7:NOS0} are ignored. 

PRRR.NSn == 1 The appropriate PRRR.NOSm field, as shown in Table G4-13 on page G4-4080, indicates 
whether the region is Inner Shareable or Outer Shareable: 

PRRR.NOSm == 0 Region is Outer Shareable. 
PRRR.NOSm == 1 Region is Inner Shareable. 


Note 


This means that TEX remapping can map a translation table entry with S == 0 as shareable memory. 
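
The Shareability determination described above can be sketched in C as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, and the register field positions and encodings used (PRRR.TRn at bits[2n+1:2n] with 0b10 meaning Normal and 0b11 reserved, PRRR.NS0 at bit 18, PRRR.NS1 at bit 19, PRRR.NOSn at bit 24+n, NMRR.IRn at bits[2n+1:2n], NMRR.ORn at bits[2n+17:2n+16], with 0b00 meaning Non-cacheable) are not given in this section and must be checked against the PRRR and NMRR register descriptions.

```c
#include <stdint.h>

typedef enum { SH_NON_SHAREABLE, SH_INNER_SHAREABLE, SH_OUTER_SHAREABLE,
               SH_NOT_MODELED } share_t;

/* Extract a bit field. The field layout is an assumption, see the text. */
static unsigned field(uint32_t reg, unsigned lsb, unsigned width)
{
    return (reg >> lsb) & ((1u << width) - 1u);
}

static share_t shareability_tex_remap(uint32_t prrr, uint32_t nmrr,
                                      unsigned tex0, unsigned c,
                                      unsigned b, unsigned s)
{
    /* {TEX[0], C, B} selects the index n into PRRR and NMRR. */
    unsigned n   = ((tex0 & 1u) << 2) | ((c & 1u) << 1) | (b & 1u);
    unsigned tr  = field(prrr, 2u * n, 2u);
    unsigned ir  = field(nmrr, 2u * n, 2u);
    unsigned orn = field(nmrr, 2u * n + 16u, 2u);

    /* Index 6 is IMPLEMENTATION DEFINED, and TRn == 0b11 is assumed reserved. */
    if (n == 6u || tr == 3u)
        return SH_NOT_MODELED;

    /* Any Device type, and Normal Inner Non-cacheable, Outer
       Non-cacheable memory, is Outer Shareable. */
    if (tr != 2u || (ir == 0u && orn == 0u))
        return SH_OUTER_SHAREABLE;

    /* The descriptor S bit selects PRRR.NS0 or PRRR.NS1. */
    if (field(prrr, (s & 1u) ? 19u : 18u, 1u) == 0u)
        return SH_NON_SHAREABLE;

    /* PRRR.NOSn == 0 is Outer Shareable, PRRR.NOSn == 1 is Inner Shareable. */
    return field(prrr, 24u + n, 1u) ? SH_INNER_SHAREABLE : SH_OUTER_SHAREABLE;
}
```

With `prrr` set to 0x00040002 and `nmrr` set to 0x00010001, index 0 is Normal cacheable memory with PRRR.NS0 set to 1, so a descriptor with S == 0 is treated as Outer Shareable. This is the case that the Note above describes.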








SCTLR.TRE, SCTLR.M, and the effect of the TEX remap registers 
When TEX remap is disabled, because the value of the SCTLR.TRE bit is 0: 
. The effect of the PRRR and NMRR registers can be IMPLEMENTATION DEFINED. 


° The interpretation of the fields of the PRRR and NMRR registers can differ from the description given earlier 
in this section. One implication of this is that the implementation can provide an IMPLEMENTATION DEFINED 
mechanism to interpret the PRRR.{NOS7:NOS0} fields. 


VMSAv8-32 requires that the effect of these registers is limited to remapping the attributes of memory locations. 
These registers must not change whether any cache hardware or stages of address translation are enabled. The 
mechanism by which the TEX remap registers have an effect when the value of the SCTLR.TRE bit is 0 is 
IMPLEMENTATION DEFINED. The AArch32 architecture requires that from reset, if the IMPLEMENTATION DEFINED 
mechanism has not been invoked: 


° If the PL1&0 stage 1 address translation is enabled and is using the Short-descriptor format translation tables, 
the architecturally-defined behavior of the TEX[2:0], C, and B bits must apply, without reference to the TEX 
remap functionality. In other words, memory attribute assignment must comply with the scheme described 
in Short-descriptor format memory region attributes, without TEX remap on page G4-4078. 


° If the PL1&0 stage 1 address translation is disabled, then the architecturally-defined behavior of VMSAv8-32 
with address translation disabled must apply, without reference to the TEX remap functionality. See The 
effects of disabling address translation stages on VMSAv8-32 behavior on page G4-4031. 


Possible mechanisms for enabling the IMPLEMENTATION DEFINED effect of the TEX remap registers when the value 
of SCTLR.TRE is 0 include: 


° A control bit in the ACTLR, or in an IMPLEMENTATION DEFINED System register. 


° Changing the behavior when the PRRR and NMRR registers are changed from their IMPLEMENTATION 
DEFINED reset values. 


In addition, if the stage of address translation is disabled and the value of the SCTLR.TRE bit is 1, the 
architecturally-defined behavior of the VMSAv8-32 with the stage of address translation disabled must apply 
without reference to the TEX remap functionality. 


In an implementation that includes EL3, the IMPLEMENTATION DEFINED effect of these registers must only take 
effect in the Security state of the registers. See also The effect of EL3 on TEX remap on page G4-4083. 


The OS managed translation table bits 


When TEX remap is enabled, the TEX[2:1] bits in the translation table descriptors are available as two bits that can 
be managed by the operating system. In VMSAv8-32, as long as the SCTLR.TRE bit is set to 1, the values of the 
TEX[2:1] bits are IGNORED by the PE. Software can write any value to these bits in the translation tables. 

The effect of EL3 on TEX remap 


In an implementation that includes EL3, when EL3 is using AArch32, the TEX remap registers are Banked between 
the Secure and Non-secure Security states. When EL3 is using AArch32, write accesses to the Secure register for 
the current security state apply to all PL1&0 stage 1 translation table lookups in that state. The SCTLR.TRE bit is 
Banked in the Secure and Non-secure copies of the register, and the appropriate version of this bit determines 
whether TEX remap is applied to translation table lookups in the current security state. 


Write accesses to the Secure copies of the TEX remap registers are disabled when the CP15SDISABLE input is 
asserted HIGH, meaning the MCR operations to access these registers are UNDEFINED. For more information, see The 
CP15SDISABLE input signal on page G4-4161. 























G4.7.4 VMSAv8-32 Long-descriptor format memory region attributes 
When a PE is using the VMSAv8-32 Long-descriptor translation table format, the AttrIndx[2:0] field in a block or 
page translation table descriptor for a stage 1 translation indicates the 8-bit field, in the appropriate MAIR register, 
that specifies the attributes for the corresponding memory region, as follows: 
• AttrIndx[2] indicates the MAIR register to be used: 
AttrIndx[2] == 0 Use MAIR0. 
AttrIndx[2] == 1 Use MAIR1. 
• AttrIndx[2:0] indicates the required Attr field, Attrn, where n = AttrIndx[2:0]. 
Each AttrIndx field defines, for the corresponding memory region: 
. The memory type, Normal or a type of Device memory. 
° For Normal memory: 
— The Inner cacheability and Outer cacheability attributes, each of which is one of Non-cacheable, 
Write-Through Cacheable, or Write-Back Cacheable. 
— _ For Write-Through Cacheable and Write-Back Cacheable regions, the Read-Allocate and 
Write-Allocate policy hints, each of which is Allocate or No allocate. 
For more information about the AttrIndx[2:0] descriptor field, see Attribute fields in VMSAv8-32 Long-descriptor 
stage 1 Block and Page descriptors on page G4-4054. 
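
The selection of an Attr field by AttrIndx[2:0] can be sketched in C as follows. The function name is hypothetical, and the sketch assumes that Attr0 to Attr3 occupy successive bytes of MAIR0 starting at bits[7:0], with Attr4 to Attr7 laid out in the same way in MAIR1. That byte layout is not stated in this section and should be checked against the MAIR0 and MAIR1 register descriptions.

```c
#include <stdint.h>

/* Return the 8-bit Attr<n> field selected by AttrIndx[2:0]. */
static uint8_t mair_attr(uint32_t mair0, uint32_t mair1, unsigned attrindx)
{
    /* AttrIndx[2] selects the register, AttrIndx[1:0] the byte within it. */
    uint32_t reg = (attrindx & 4u) ? mair1 : mair0;
    return (uint8_t)(reg >> (8u * (attrindx & 3u)));
}
```

For example, with `mair0` equal to 0x44FF0400, an AttrIndx value of 2 selects the byte 0xFF.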
Shareability, Long-descriptor format 
When a PE is using the Long-descriptor translation table format, the SH[1:0] field in a block or page translation 
table descriptor specifies the Shareability attributes of the corresponding memory region, if the MAIR entry for that 
region identifies it as Normal memory that is not both Inner Non-cacheable and Outer Non-cacheable. Table G4-15 
shows the encoding of this field. 
Table G4-15 SH[1:0] field encoding for Normal memory, Long-descriptor format 
SH[1:0] Normal memory 
00 Non-shareable 
01 Reserved, CONSTRAINED UNPREDICTABLE, see Reserved values in System and memory-mapped registers and translation 
table entries on page K1-5477 for the permitted behavior. 
10 Outer Shareable 
11 Inner Shareable 
See Combining the Shareability attribute on page G4-4088 for constraints on the Shareability attributes of a Normal 
memory region that is Inner Non-cacheable, Outer Non-cacheable. 

For any type of Device memory, and for Normal Inner Non-cacheable, Outer Non-cacheable memory, the value of 
the SH[1:0] field of the translation table descriptor is ignored. 


Other fields in the Long-descriptor translation table format descriptors 


The following subsections describe the other fields in the translation table block and page descriptors when a PE is 
using the Long-descriptor translation table format: 


• Contiguous bit. 

• IGNORED fields. 

• Field reserved for software use. 


Contiguous bit 


The Long-descriptor translation table format descriptors contain a Contiguous bit. Setting this bit to 1 indicates that 
16 adjacent translation table entries point to a contiguous output address range. These 16 entries must be aligned in 
the translation table so that the top five bits of their input addresses, that index their position in the translation table, 
are the same. For example, to use this bit for a block of 16 entries in the level 3 translation table, bits[20:16] of the 
input addresses for the 16 entries must be the same. 


The contiguous output address range must be aligned to the size of 16 translation table entries at the same translation 
table level. 
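
As an illustration of these alignment requirements, the following C sketch checks a candidate group of 16 level 3 entries. It is a sketch under assumptions: the names are hypothetical, and it assumes that each level 3 entry maps a 4KB page, so that a group of 16 entries spans 64KB and bits[15:12] of the input address select the entry within the group.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12u                                     /* assumed 4KB pages */
#define CONTIG_COUNT 16u
#define CONTIG_SPAN  ((uint64_t)CONTIG_COUNT << PAGE_SHIFT)  /* 64KB */

/* A group can use the Contiguous bit only if its first input address and
   its first output address are both aligned to the size of 16 entries. */
static bool contig_group_ok(uint64_t first_ia, uint64_t first_oa)
{
    return (first_ia % CONTIG_SPAN) == 0 && (first_oa % CONTIG_SPAN) == 0;
}

/* Two level 3 input addresses belong to the same group when all bits above
   bit 15 match. For addresses that index the same table, this is the
   requirement that bits[20:16] are the same. */
static bool same_contig_group(uint64_t ia_a, uint64_t ia_b)
{
    return (ia_a / CONTIG_SPAN) == (ia_b / CONTIG_SPAN);
}
```

For example, input addresses 0x00210000 and 0x0021F000 fall in the same group, while 0x0021F000 and 0x00220000 do not.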


Use of this bit means that the TLB can cache a single entry to cover the 16 translation table entries. 


This bit acts as a hint. The architecture does not require a PE to cache TLB entries in this way. To avoid TLB 
coherency issues, any TLB maintenance by address must not assume any optimization of the TLB tables that might 
result from use of this bit. 





Note 


The use of the contiguous bit is similar to the approach used, in the Short-descriptor translation table format, for 
optimized caching of Large Pages and Supersections in the TLB. However, an important difference in the 
contiguous bit capability is that TLB maintenance must be performed based on the size of the underlying translation 
table entries, to avoid TLB coherency issues. That is, any use of the contiguous bit has no effect on the minimum 
size of entry that must be invalidated from the TLB. 





IGNORED fields 

For stage 1 and stage 2 Block and Page descriptors, the architecture defines bits[63:59] and bits[58:55] as IGNORED 
fields, meaning the architecture guarantees that a PE makes no use of these fields. In addition: 

° Bits[58:55] are reserved for software use, see Field reserved for software use. 

. In the stage 2 Block and Page descriptors, bits[63:60] are reserved for use by a System MMU. 


Note 
The definition of IGNORED means there is no need to invalidate the TLB if these bits are changed. 








Field reserved for software use 


The architecture reserves a 4-bit field in the Block and Page table descriptors, bits[58:55], for software use. In 
considering migration from using the Short-descriptor format to the Long-descriptor format, this field is an 
extension of the Short-descriptor field described in The OS managed translation table bits on page G4-4082. 





Note 
The definition of IGNORED means there is no need to invalidate the TLB if these bits are changed. 

G4.7.5 EL2 control of Non-secure memory region attributes 


Software executing at EL2 controls two sets of translation tables, both of which use the Long-descriptor translation 
table format. These are: 


. The translation tables that control Non-secure PL2 stage 1 translations. These map VAs to PAs, and when 
EL2 is using AArch32 they are indicated and controlled by the HTTBR and HTCR. 


These translations have exactly the same memory region attribute controls as any other stage 1 translations, 
as described in VMSAv8-32 Long-descriptor format memory region attributes on page G4-4083. 


• The translation tables that control Non-secure PL1&0 stage 2 translations. These map the IPAs from the stage 
1 translation onto PAs, and when EL2 is using AArch32 they are indicated and controlled by the VTTBR 
and VTCR. 


The descriptors in the virtualization translation tables define stage 2 memory region attributes, that are 
combined with the attributes defined in the stage 1 translation. This section describes this combining of 
attributes. 


VMSAv8-32 Long-descriptor translation table format descriptors on page G4-4050 describes the format of the 
entries in these tables. 





Note 
In a virtualization implementation, a hypervisor might usefully: 
° Reduce the permitted Cacheability of a region. 
° Increase the required Shareability of a region. 


The combining of attributes from stage 1 and stage 2 translations supports both of these options. 





In the stage 2 translation table descriptors for memory regions and pages, the MemAttr[3:0] and SH[1:0] fields 
describe the stage 2 memory region attributes: 


° The definition of the stage 2 SH[1:0] field is identical to the same field for a stage 1 translation, see 
Shareability, Long-descriptor format on page G4-4083. 


° MemAttr[3:2] give a top-level definition of the memory type, and of the cacheability of a Normal memory 
region, as Table G4-16 shows: 


Table G4-16 Long-descriptor MemAttr[3:2] encoding, stage 2 translation 

MemAttr[3:2]  Memory type                                            Cacheability 
00            Device, of type determined by MemAttr[1:0]             Not applicable 
01            Normal, Inner cacheability determined by MemAttr[1:0]  Outer Non-cacheable 
10            Normal, Inner cacheability determined by MemAttr[1:0]  Outer Write-Through Cacheable 
11            Normal, Inner cacheability determined by MemAttr[1:0]  Outer Write-Back Cacheable 

The encoding of MemAttr[1:0] depends on the Memory type indicated by MemAttr[3:2]: 


— When MemAttr[3:2]==0b00, indicating a type of Device memory, Table G4-17 shows the encoding of 
MemAttr[1:0]: 


Table G4-17 MemAttr[1:0] encoding for the types of Device memory 

MemAttr[1:0]  Meaning when MemAttr[3:2] == 0b00 
00            Region is Device-nGnRnE memory 
01            Region is Device-nGnRE memory 
10            Region is Device-nGRE memory 
11            Region is Device-GRE memory 





— When MemAttr[3:2] !=0b00, indicating Normal memory, Table G4-18 shows the encoding of 
MemAttr[1:0]: 


Table G4-18 MemAttr[1:0] encoding for Normal memory 

MemAttr[1:0]  Meaning when MemAttr[3:2] != 0b00 
00            Reserved, CONSTRAINED UNPREDICTABLE, see Reserved values in System and memory-mapped registers and 
              translation table entries on page K1-5477 for the permitted behavior. 
01            Inner Non-cacheable 
10            Inner Write-Through Cacheable 
11            Inner Write-Back Cacheable 

Note 
The stage 2 translation does not assign any allocation hints. 
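
The MemAttr[3:0] encodings in Table G4-16, Table G4-17, and Table G4-18 can be decoded as in the following C sketch. The type and function names are hypothetical. The reserved encoding, MemAttr[3:2] != 0b00 with MemAttr[1:0] == 0b00, is reported as reserved rather than modeling any of the permitted CONSTRAINED UNPREDICTABLE behaviors.

```c
/* Enumerator order of the Device types matches Table G4-17. */
typedef enum { S2_DEVICE_nGnRnE = 0, S2_DEVICE_nGnRE, S2_DEVICE_nGRE,
               S2_DEVICE_GRE, S2_NORMAL, S2_RESERVED } s2_type_t;

/* Values 1 to 3 equal the two-bit cacheability encodings of
   Table G4-16 (Outer) and Table G4-18 (Inner). */
typedef enum { S2_CACHE_NA = 0, S2_NON_CACHEABLE = 1,
               S2_WRITE_THROUGH = 2, S2_WRITE_BACK = 3 } s2_cache_t;

typedef struct {
    s2_type_t  type;
    s2_cache_t inner, outer;   /* S2_CACHE_NA unless type == S2_NORMAL */
} s2_attrs_t;

static s2_attrs_t decode_memattr(unsigned memattr)
{
    unsigned hi = (memattr >> 2) & 3u;   /* MemAttr[3:2] */
    unsigned lo = memattr & 3u;          /* MemAttr[1:0] */
    s2_attrs_t a = { S2_RESERVED, S2_CACHE_NA, S2_CACHE_NA };

    if (hi == 0u) {                      /* a type of Device memory */
        a.type = (s2_type_t)lo;
        return a;
    }
    if (lo == 0u)                        /* reserved encoding */
        return a;

    a.type  = S2_NORMAL;
    a.outer = (s2_cache_t)hi;
    a.inner = (s2_cache_t)lo;
    return a;
}
```

For example, `decode_memattr(0xF)` describes Normal memory that is Outer Write-Back Cacheable and Inner Write-Back Cacheable.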
The following sections describe how the memory type attributes assigned at stage 2 of the translation are combined 
with those assigned at stage 1: 
° Combining the memory type attribute on page G4-4087. 
° Combining the Cacheability attribute on page G4-4087. 
° Combining the Shareability attribute on page G4-4088. 
Note 
° The following stage 2 translation table attribute settings leave the stage 1 settings unchanged: 
— MemAttr[3:2] == 0b11, Normal memory, Outer Write-Back Cacheable. 
— MemAttr[1:0] == 0b11, Inner Write-Back Cacheable. 
• In addition to the attribute combinations described in this section, Execute-never restrictions on instruction 
fetching on page G4-4071 describes how the stage 1 and stage 2 XN permission fields are combined, so that 
a region is execute-never if the value of the XN field is 1 in at least one stage of translation. 

Combining the memory type attribute 

Table G4-19 shows how the stage 1 and stage 2 memory type assignments are combined: 

Table G4-19 Combining the stage 1 and stage 2 memory type assignments 

Assignment in stage 1  Assignment in stage 2                Resultant type 
Device-nGnRnE          Any                                  Device-nGnRnE 
Device-nGnRE           Device-nGnRnE                        Device-nGnRnE 
                       Not Device-nGnRnE                    Device-nGnRE 
Device-nGRE            Device-nGnRnE                        Device-nGnRnE 
                       Device-nGnRE                         Device-nGnRE 
                       Not (Device-nGnRnE or Device-nGnRE)  Device-nGRE 
Device-GRE             Device-nGnRnE                        Device-nGnRnE 
                       Device-nGnRE                         Device-nGnRE 
                       Device-nGRE                          Device-nGRE 
                       Device-GRE or Normal                 Device-GRE 
Normal                 Any type of Device                   Device type assigned at stage 2 
                       Normal                               Normal 




See Combining the Shareability attribute on page G4-4088 for information about the Shareability of: 


° A region for which the resultant type is any Device type. 


• A region with a resultant type of Normal for which the resultant cacheability, described in Combining the 
Cacheability attribute, is Inner Non-cacheable, Outer Non-cacheable. 


The combining of the memory type attribute means a translation table walk for a stage 1 translation can be made to 
a type of Device memory. If this occurs, then: 


° If the value of HCR.PTW is 0, then the translation table walk occurs as if it is to Normal Non-cacheable 
memory. This means it can be done speculatively. 


° If the value of HCR.PTW is 1, then the memory access generates a stage 2 Permission fault. 


Combining the Cacheability attribute 


For a Normal memory region, Table G4-20 shows how the stage 1 and stage 2 Cacheability assignments are 
combined. This combination applies, independently, for the Inner Cacheability and Outer Cacheability attributes: 


Table G4-20 Combining the stage 1 and stage 2 cacheability assignments 

Assignment in stage 1                  Assignment in stage 2                  Resultant cacheability 
Non-cacheable                          Any                                    Non-cacheable 
Any                                    Non-cacheable                          Non-cacheable 
Write-Through Cacheable                Write-Through or Write-Back Cacheable  Write-Through Cacheable 
Write-Through or Write-Back Cacheable  Write-Through Cacheable                Write-Through Cacheable 
Write-Back Cacheable                   Write-Back Cacheable                   Write-Back Cacheable 
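
Table G4-20 is equivalent to taking the weaker of the two Cacheability assignments, with Non-cacheable weaker than Write-Through Cacheable, and Write-Through Cacheable weaker than Write-Back Cacheable. The following C sketch, using hypothetical names, applies that rule independently to the Inner and Outer attributes.

```c
/* Ordered from weakest to strongest caching. */
typedef enum { C_NON_CACHEABLE = 0, C_WRITE_THROUGH, C_WRITE_BACK } cacheability_t;

typedef struct { cacheability_t inner, outer; } cache_pair_t;

static cacheability_t combine_one(cacheability_t stage1, cacheability_t stage2)
{
    return stage1 < stage2 ? stage1 : stage2;
}

/* The Inner and Outer attributes are combined independently. */
static cache_pair_t combine_cacheability(cache_pair_t stage1, cache_pair_t stage2)
{
    cache_pair_t r;
    r.inner = combine_one(stage1.inner, stage2.inner);
    r.outer = combine_one(stage1.outer, stage2.outer);
    return r;
}
```

This is how a hypervisor can reduce the permitted Cacheability of a region: a stage 2 assignment of Non-cacheable forces the result to Non-cacheable, whatever the stage 1 assignment.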





Note 
Only Normal memory has a Cacheability attribute. 








Combining the Shareability attribute 


In the following cases, a memory region is treated as Outer Shareable, regardless of any shareability assignments at 
either stage of translation: 


° The resultant memory type attribute, described in Combining the memory type attribute on page G4-4087, is 
any type of Device memory. 


. The resultant memory type attribute is Normal memory, and the resultant Cacheability, described in 
Combining the Cacheability attribute on page G4-4087, is Inner Non-cacheable Outer Non-cacheable. 


For a memory region with a resultant memory type attribute of Normal that is not Inner Non-cacheable Outer 
Non-cacheable, Table G4-21 shows how the stage 1 and stage 2 shareability assignments are combined: 


Table G4-21 Combining the stage 1 and stage 2 Shareability assignments 

Assignment in stage 1  Assignment in stage 2  Resultant Shareability 
Outer Shareable        Any                    Outer Shareable 
Inner Shareable        Outer Shareable        Outer Shareable 
Inner Shareable        Inner Shareable        Inner Shareable 
Inner Shareable        Non-shareable          Inner Shareable 
Non-shareable          Outer Shareable        Outer Shareable 
Non-shareable          Inner Shareable        Inner Shareable 
Non-shareable          Non-shareable          Non-shareable 
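
Table G4-21 is equivalent to taking the wider of the two Shareability assignments, after applying the two cases in which the region is always Outer Shareable. The following C sketch uses hypothetical names, and takes the resultant memory type and Cacheability as inputs that the caller has already computed.

```c
#include <stdbool.h>

/* Ordered from narrowest to widest Shareability. */
typedef enum { SH_NON_SHAREABLE = 0, SH_INNER_SHAREABLE, SH_OUTER_SHAREABLE } share_t;

static share_t combine_shareability(share_t stage1, share_t stage2,
                                    bool result_is_device,
                                    bool result_is_inner_nc_outer_nc)
{
    /* Device memory, and Normal Inner Non-cacheable Outer Non-cacheable
       memory, is Outer Shareable regardless of the assignments. */
    if (result_is_device || result_is_inner_nc_outer_nc)
        return SH_OUTER_SHAREABLE;

    /* Otherwise the wider of the two assignments applies. */
    return stage1 > stage2 ? stage1 : stage2;
}
```

This is how a hypervisor can increase the required Shareability of a region: a stage 2 assignment of Outer Shareable always gives a resultant Shareability of Outer Shareable.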
G4.8 Translation Lookaside Buffers (TLBs) 


Translation Lookaside Buffers (TLBs) are an implementation technique that caches translations or translation table 
entries. TLBs avoid the requirement to perform a translation table walk in memory for every memory access. The 
ARM architecture does not specify the exact form of the TLB structures for any design. In a similar way to the 
requirements for caches, the architecture only defines certain principles for TLBs: 


. The architecture has a concept of an entry locked down in the TLB. The method by which lockdown is 
achieved is IMPLEMENTATION DEFINED, and an implementation might not support lockdown. 


° The architecture does not guarantee that an unlocked TLB entry remains in the TLB. 


. The architecture guarantees that a locked TLB entry remains in the TLB. However, a locked TLB entry might 
be updated by subsequent updates to the translation tables. Therefore, when a change is made to the 
translation tables, the architecture does not guarantee that a locked TLB entry remains incoherent with an 
entry in the translation table. 


° The architecture guarantees that a translation table entry that generates a Translation fault, an Address size 
fault, or an Access flag fault is not held in the TLB. However a translation table entry that generates a Domain 
fault or a Permission fault might be held in the TLB. 


° When address translation is enabled, any translation table entry that does not generate a Translation fault, an 
Address size fault, or an Access flag fault and is not from a translation regime for an Exception level that is 
lower than the current Exception level can be allocated to a TLB at any time. The only translation table entries 
guaranteed not to be held in the TLB are those that generate a Translation fault, an Address size fault, or an 
Access flag fault. 





Note 


A TLB can hold translation table entries that do not generate a Translation fault but point to subsequent 
tables in the translation table walk. This can be referred to as intermediate caching of TLB entries. 





° Software can rely on the fact that between disabling and re-enabling a stage of address translation, entries in 
the TLB relating to that stage of translation have not been corrupted to give incorrect translations. 


The following sections give more information about TLB implementation: 
° Global and process-specific translation table entries. 

° TLB matching on page G4-4090. 

° TLB behavior at reset on page G4-4090. 

° TLB lockdown on page G4-4091. 

° TLB conflict aborts on page G4-4091. 


See also TLB maintenance requirements on page G4-4093. 


G4.8.1 Global and process-specific translation table entries 


For VMSAv8-32, system software can divide a virtual memory map used by memory accesses at PL1 and PL0 into 
global and non-global regions, indicated by the nG bit in the translation table descriptors: 


nG == 0 The translation is global, meaning the region is available for all processes. 
nG == 1 The translation is non-global, or process-specific, meaning it relates to the current ASID, as defined 
by: 


• TTBR0.ASID or TTBR1.ASID, if using the Long-descriptor translation table format. In this 
case, TTBCR.A1 selects which ASID is current. 


• CONTEXTIDR.ASID, if using the Short-descriptor translation table format. 

Each non-global region has an associated ASID. These identifiers mean different translation table mappings can 
co-exist in a caching structure such as a TLB. This means that software can create a new mapping of a non-global 
memory region without removing previous mappings. 
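
The effect of the nG bit on a TLB lookup can be sketched with a simple software model. This is an illustration only: the architecture does not specify the form of any TLB structure, and the `tlb_entry_t` type, its fields, and the function names are hypothetical. The model ignores the VMID and the Security state, and it assumes that a TTBCR.A1 value of 1 selects TTBR1.ASID, which should be checked against the TTBCR description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a cached translation. */
typedef struct {
    uint32_t vpn;          /* virtual page number covered by the entry */
    uint8_t  asid;         /* ASID recorded when the entry was allocated */
    bool     non_global;   /* value of the nG bit in the descriptor */
    bool     valid;
} tlb_entry_t;

/* Long-descriptor format: TTBCR.A1 selects which ASID is current. */
static uint8_t current_asid_long(unsigned ttbcr_a1, uint8_t ttbr0_asid,
                                 uint8_t ttbr1_asid)
{
    return ttbcr_a1 ? ttbr1_asid : ttbr0_asid;
}

/* A global entry matches under any ASID. A non-global entry matches only
   when its ASID equals the current ASID. */
static bool tlb_entry_matches(const tlb_entry_t *e, uint32_t vpn,
                              uint8_t current_asid)
{
    if (!e->valid || e->vpn != vpn)
        return false;
    return !e->non_global || e->asid == current_asid;
}
```

In this model, two non-global entries for the same virtual page but with different ASID values can be held at the same time, because at most one of them matches for any current ASID.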
For a symmetric multiprocessor cluster where a single operating system is running on the set of PEs, the architecture 
requires all ASID values to be assigned uniquely within any single Inner Shareable domain. In other words, each 
ASID value must have the same meaning to all PEs in the system. 


The translation regime used for accesses made at PL2 does not support ASIDs, and all pages are treated as global. 


When a PE is using the Long-descriptor translation table format, and is in Secure state, a translation must be treated 
as non-global, regardless of the value of the nG bit, if NSTable is set to 1 at any level of the translation table walk. 


For more information see Control of Secure or Non-secure memory access, VMSAv8-32 Long-descriptor format on 
page G4-4056. 


G4.8.2 TLB matching 


A TLB is a hardware caching structure for translation table information. Like other hardware caching structures, it 
is mostly invisible to software. However, there are some situations where it can become visible. These are associated 
with coherency problems caused by an update to the translation table that has not been reflected in the TLB. Use of 
the TLB maintenance instructions described in TLB maintenance requirements on page G4-4093 can prevent any 
TLB incoherency becoming a problem. 


A particular case where the presence of the TLB can become visible is if the translation table entries that are in use 
under a particular ASID and VMID are changed without suitable invalidation of the TLB. This can occur only if the 
architecturally-required break-before-make sequence described in Using break-before-make when updating 
translation table entries on page G4-4094 is not used. If the break-before-make sequence is not used, the TLB can 
hold two mappings for the same address, and this: 


• Might generate an exception that is reported using the TLB Conflict fault code, see TLB conflict aborts on 
page G4-4091. 


• Might lead to CONSTRAINED UNPREDICTABLE behavior. In this case, behavior will be consistent with one of 
the mappings held in the TLB, or with some amalgamation of the values held in the TLB, but cannot give 
access to regions of memory with permissions or attributes that could not be assigned by valid translation 
table entries in the translation regime being used for the access. See CONSTRAINED UNPREDICTABLE 
behaviors due to caching of control or data values on page K1-5461. 


G4.8.3 TLB behavior at reset 


The ARM architecture does not require a reset to invalidate the TLBs, and recognizes that an implementation might 
require caches, including TLBs, to maintain context over a system reset. Possible reasons for doing so include power 
management and debug requirements. 


Therefore, for ARMv8: 


• All TLBs reset to an IMPLEMENTATION DEFINED state that might be UNKNOWN. 


• All TLBs are disabled from reset. All stages of address translation that are used from the PE state entered on 
coming out of reset are disabled from reset, and the contents of the TLBs have no effect on address 
translation. For more information see Enabling stages of address translation on page G4-4033. 


• An implementation can require the use of a specific TLB invalidation routine, to invalidate the TLB arrays 
before they are enabled after a reset. The exact form of this routine is IMPLEMENTATION DEFINED, but if an 
invalidation routine is required it must be documented clearly as part of the documentation of the device. 


ARM recommends that if an invalidation routine is required for this purpose, and the PE resets into AArch32 
state, the routine is based on the AArch32 TLB maintenance instructions described in The scope of TLB 
maintenance instructions on page G4-4101. 


Similar rules apply: 


• To cache behavior, see Behavior of caches at reset on page G3-3995. 
• To branch predictor behavior, see Behavior of the branch predictors at reset on page G3-4003. 


G4.8.4 TLB lockdown 


The ARM architecture recognizes that any TLB lockdown scheme is heavily dependent on the microarchitecture, 
making it inappropriate to define a common mechanism across all implementations. This means that: 


• The architecture does not require TLB lockdown support. 


• If TLB lockdown support is implemented, the lockdown mechanism is IMPLEMENTATION DEFINED. However, 
key properties of the interaction of lockdown with the architecture must be documented as part of the 
implementation documentation. 


This means that: 


• The TLB Type Register, TLBTR, does not define the lockdown scheme in use. 


• In AArch32 state, a region of the {coproc==0b1111, CRn==c10} encodings is reserved for IMPLEMENTATION 
DEFINED TLB functions, such as TLB lockdown functions. The reserved encodings are those with: 


— <CRm> == {c0, c1, c4, c8}. 
— All values of <opc2> and <opc1>. 


An implementation might use some of the {coproc==0b1111, CRn==c10} encodings that are reserved for 
IMPLEMENTATION DEFINED TLB functions to implement additional TLB control functions. These functions might 
include: 


• Unlock all locked TLB entries. 
• Preload into a specific level of TLB. This is beyond the scope of the PLI and PLD hint instructions. 


The inclusion of EL2 in an implementation does not affect the TLB lockdown requirements. However, in an 
implementation that includes EL2, exceptions generated as a result of TLB lockdown when executing in a 
Non-secure PL1 mode or in Non-secure User mode can be routed to either: 


• Non-secure Abort mode, using the Non-secure Data Abort exception vector. 


• Hyp mode, using the Hyp Trap exception vector. 


For more information, see Traps to Hyp mode of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM 
operations on page G1-3900. 


G4.8.5 TLB conflict aborts 


If an address matches multiple entries in the TLB, it is IMPLEMENTATION DEFINED whether a TLB conflict abort is 
generated. 


An implementation can generate TLB conflict aborts on either or both instruction fetches and data accesses. A TLB 
conflict abort is classified as an MMU fault, see MMU faults in AArch32 state on page G4-4118. This means: 


• A TLB conflict abort on an instruction fetch is reported as a Prefetch Abort exception. 


• A TLB conflict abort on a data access is reported as a Data Abort exception. 


Fault status encodings for TLB conflict aborts are defined for both the Short-descriptor and Long-descriptor 
translation table formats, see: 


• PL1 fault reporting with the Short-descriptor translation table format on page G4-4128. 
• PL1 fault reporting with the Long-descriptor translation table format on page G4-4129. 


On a TLB conflict abort, the fault address register returns the address that generated the fault. That is, it returns the 
address that was being looked up in the TLB. 
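

The following sketch models a lookup that finds more than one matching entry. It is illustrative only: TlbConflictAbort, translate, and the reports_conflicts flag are hypothetical names, and the flag stands in for the IMPLEMENTATION DEFINED choice of whether a conflict abort is generated. When no abort is generated the model returns the first match, which is just one arbitrary outcome among those the CONSTRAINED UNPREDICTABLE behavior permits.

```python
class TlbConflictAbort(Exception):
    """Models a TLB conflict abort. fault_address is the looked-up address."""

    def __init__(self, fault_address):
        super().__init__(hex(fault_address))
        self.fault_address = fault_address


def translate(tlb, va, reports_conflicts=True):
    """Look up va in tlb, a list of (va, pa) pairs.

    Returns the physical address on a single hit, or None on a miss
    (a real PE would then perform a translation table walk).
    """
    hits = [pa for (entry_va, pa) in tlb if entry_va == va]
    if not hits:
        return None
    if len(hits) > 1 and reports_conflicts:
        # The fault address register would hold the address being looked up.
        raise TlbConflictAbort(va)
    # Single hit, or an unreported conflict. For the conflict case this
    # choice is arbitrary: software must not rely on which mapping is used.
    return hits[0]


# A TLB holding two mappings for the same address, for example because a
# required invalidation was skipped.
stale = [(0x1000, 0x8000), (0x1000, 0x9000)]
```

Calling translate(stale, 0x1000) raises the modelled abort with fault_address equal to 0x1000, the address that was looked up. 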


It is IMPLEMENTATION DEFINED whether a TLB conflict abort is a stage 1 abort or a stage 2 abort. 


Note 


• An address can hit multiple entries in the TLB if the TLB has been invalidated inappropriately, for example 
if TLB invalidation required by this manual has not been performed. 


• A stage 2 abort cannot be generated if the Non-secure PL1&0 stage 2 address translation is disabled. 


The priority of the TLB conflict abort is IMPLEMENTATION DEFINED, because it depends on the form of any TLB 
that can generate the abort. However, the TLB conflict abort must have higher priority than any abort that depends 
on a value held in the TLB. 


If an address matches multiple entries in the TLB and no TLB conflict abort is generated, the resulting behavior 
is CONSTRAINED UNPREDICTABLE, see CONSTRAINED UNPREDICTABLE behaviors due to caching of control or 
data values on page K1-5461. The CONSTRAINED UNPREDICTABLE behavior must not permit access to regions of 
memory with permissions or attributes that mean they cannot be accessed in the current Security state at the current 
Privilege level. 


G4.9 TLB maintenance requirements 


Translation Lookaside Buffers (TLBs) on page G4-4089 describes the ARM architectural provision for TLBs. 
Although the ARM architecture does not specify the form of any TLB structures, it does define the mechanisms by 
which TLBs can be maintained. The following sections describe the VMSAv8-32 TLB maintenance instructions: 


• General TLB maintenance requirements. 
• Maintenance requirements on changing System register values on page G4-4097. 
• Atomicity of register changes on changing virtual machine on page G4-4098. 
• Synchronization of changes of ASID and TTBR on page G4-4099. 
• The scope of TLB maintenance instructions on page G4-4101. 


G4.9.1 General TLB maintenance requirements 


TLB maintenance instructions provide a mechanism to invalidate entries from a TLB. As stated at the start of 
Translation Lookaside Buffers (TLBs) on page G4-4089, when address translation is enabled any translation table 
entry that does not generate a Translation fault, an Address size fault, or an Access flag fault can be allocated to a 
TLB at any time. This means that software must perform TLB maintenance between updating translation table 
entries that apply in a particular context and accessing memory locations whose translation is determined by those 
entries in that context. 


Note 


This requirement applies to any translation table entry at any level of the translation tables, including an entry that 
points to further levels of the tables, provided that the entry in that level of the tables does not cause a Translation 
fault, an Address size fault, or an Access flag fault. 


In addition to any TLB maintenance requirement, when changing the cacheability attributes of an area of memory, 
software must ensure that any cached copies of affected locations are removed from the caches. For more 
information see Cache maintenance requirement created by changing translation table attributes on page G4-4108. 


Because a TLB never holds any translation table entry that generates a Translation fault, an Address size fault, or 
an Access flag fault, a change from a translation table entry that causes a Translation, Address size, or Access flag 
fault to one that does not fault, does not require any TLB or branch predictor invalidation. 


Special considerations apply to translation table updates that change the memory type, cacheability, or output 
address of an entry, see Using break-before-make when updating translation table entries on page G4-4094. 


In addition, software must perform TLB maintenance after updating the System registers if the update means that 
the TLB might hold information that applies to a current translation context, but is no longer valid for that context. 
Maintenance requirements on changing System register values on page G4-4097 gives more information about this 
maintenance requirement. 


Each of the translation regimes defined in Figure G4-1 on page G4-4024 is a different context, and: 


• For the Non-secure PL1&0 regime, a change in the VMID or ASID value changes the context. 
• For the Secure PL1&0 regime, a change in the ASID value changes the context. 


For operation in Non-secure PL1 or PL0 modes, a change of HCR.VM, unless made at the same time as a change 
of VMID, requires the invalidation of all TLB entries for the Non-secure PL1&0 translation regime that apply to 
the current VMID. Otherwise, there is no guarantee that the effect of the change of HCR.VM is visible to software 
executing in the Non-secure PL1 and PL0 modes. 


Any TLB maintenance instruction can affect any other TLB entries that are not locked down. 


AArch32 state defines {coproc==0b1111, CRn==c8} System instructions for TLB maintenance instructions, and 
supports the following operations: 


• Invalidate all unlocked entries in the TLB. 
• Invalidate a single TLB entry, by VA, or VA and ASID for a non-global entry. 
• Invalidate all TLB entries that match a specified ASID. 
• Invalidate all TLB entries that match a specified VA, regardless of the ASID. 
• Operations that apply across multiprocessors in the same Inner Shareable domain. 


Note 


An address-based TLB maintenance instruction that applies to the Inner Shareable domain does so regardless 
of the Shareability attributes of the address supplied as an argument to the instruction. 





A TLB maintenance instruction that specifies a virtual address that would generate any MMU fault, including a 
virtual address that is not in the range of virtual addresses that can be translated, does not generate an abort. 


EL2 provides additional TLB maintenance instructions for use in AArch32 state at PL2, and has some implications 
for the effect of the other TLB maintenance instructions, see The scope of TLB maintenance instructions on 
page G4-4101. 


In an implementation that includes EL3, the TLB maintenance instructions take account of the current Security 
state, as part of the address translation required for the TLB maintenance instruction. 


Some TLB maintenance instructions are defined as operating only on instruction TLBs, or only on data TLBs. 
ARMv8 AArch32 state includes these instructions for backwards compatibility. However, more recent TLB 
maintenance instructions do not support this distinction. From the introduction of ARMv7, ARM deprecates any 
use of Instruction TLB maintenance instructions, or of Data TLB maintenance instructions, and developers must 
not rely on this distinction being maintained in future revisions of the ARM architecture. 


The ARM architecture does not dictate the form in which the TLB stores translation table entries. However, for TLB 
invalidate instructions, the minimum size of the table entry that is invalidated from the TLB must be at least the size 
that appears in the translation table entry. 


The scope of TLB maintenance instructions on page G4-4101 describes the TLB maintenance instructions. The 
following subsections give more information about the general requirements for TLB maintenance: 


• Using break-before-make when updating translation table entries. 
• The interaction of TLB lockdown with TLB maintenance instructions on page G4-4095. 
• Ordering and completion of TLB maintenance instructions on page G4-4096. 


Using break-before-make when updating translation table entries 


To avoid possibly creating multiple TLB entries for the same address, and to avoid the effects of TLB caching 
possibly breaking coherency, ordering guarantees or uniprocessor semantics, or possibly failing to clear the 
exclusive monitors, the architecture requires the use of a break-before-make sequence when changing translation 
table entries whenever multiple threads of execution can use the same translation tables and the change to the 
translation table entries involves any of: 


• A change of the memory type. 
• A change of the cacheability attributes. 
• A change of the output address (OA), if the OA of at least one of the old translation table entries and the new 
translation table entry is writable. 


• A change to the size of block used by the translation system. This applies both: 


— When changing from a smaller size to a larger size, for example by replacing a table mapping with a 
block mapping in a stage 2 translation table. 


— When changing from a larger size to a smaller size, for example by replacing a block mapping with a 
table mapping in a stage 2 translation table. 


• Creating a global entry when there might be non-global entries in a TLB that overlap with that global entry. 


A break-before-make sequence on changing from an old translation table entry to a new translation table entry 
requires the following steps: 


1. Replace the old translation table entry with an invalid entry, and execute a DSB instruction. 


2. Invalidate the translation table entry with a broadcast TLB invalidation instruction, and execute a DSB 
instruction to ensure the completion of that invalidation. 


3. Write the new translation table entry, and execute a DSB instruction to ensure that the new entry is visible. 


This sequence ensures that at no time are both the old and new entries simultaneously visible to different threads of 
execution, and therefore the problems described at the start of this subsection cannot arise. 
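

The three steps above can be sketched as a small model with a shared translation table and one TLB per PE. This is an illustration under stated assumptions, not an implementation: the System class, its methods, and break_before_make are hypothetical names, a table walk is modelled as a dictionary lookup, and the DSB instructions are represented only by the order of the Python statements.

```python
INVALID = None  # models an invalid translation table entry


class System:
    def __init__(self, num_pes):
        self.table = {}                               # shared table: va -> pa or INVALID
        self.tlbs = [dict() for _ in range(num_pes)]  # per-PE TLB: va -> pa

    def access(self, pe, va):
        """PE accesses va. On a TLB miss it walks the table.

        An invalid entry models a Translation fault and is never cached.
        """
        if va not in self.tlbs[pe]:
            pa = self.table.get(va, INVALID)
            if pa is INVALID:
                return None
            self.tlbs[pe][va] = pa
        return self.tlbs[pe][va]

    def broadcast_invalidate(self, va):
        """Models a broadcast TLB invalidate by VA across all PEs."""
        for tlb in self.tlbs:
            tlb.pop(va, None)

    def cached_mappings(self, va):
        """Set of distinct output addresses cached for va across all TLBs."""
        return {tlb[va] for tlb in self.tlbs if va in tlb}


def break_before_make(system, va, new_pa):
    system.table[va] = INVALID       # step 1: break (followed by a DSB)
    system.broadcast_invalidate(va)  # step 2: invalidate (followed by a DSB)
    system.table[va] = new_pa        # step 3: make (followed by a DSB)
```

In this model, an access made between steps 1 and 3 faults instead of caching a second mapping, so cached_mappings never holds more than one value for the address. Overwriting the table entry directly, without the sequence, can leave the old mapping in one TLB while another PE caches the new one. 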


The interaction of TLB lockdown with TLB maintenance instructions 


The precise interaction of TLB lockdown with the TLB maintenance instructions is IMPLEMENTATION DEFINED. 
However, the architecturally-defined TLB maintenance instructions must comply with these rules: 


• The effect on locked entries of a TLB invalidate all unlocked entries instruction or a TLB invalidate by VA 
all ASID instruction is IMPLEMENTATION DEFINED. However, these instructions must implement one of the 
following options: 


— Have no effect on entries that are locked down. 


— Generate an IMPLEMENTATION DEFINED Data Abort exception if an entry is locked down, or might be 
locked down. For an invalidate instruction performed in AArch32 state, the {coproc==0b1111, 
CRn==c5} fault status register definitions include a fault code for cache and TLB lockdown faults, see 
Table G4-24 on page G4-4128 for the codes used with the Short-descriptor translation table formats, 
or Table G4-25 on page G4-4130 for the codes used with the Long-descriptor translation table formats. 
In an implementation that includes EL2, if EL2 is using AArch32 and the value of HCR.TIDCP is 1, 
any such exceptions taken from a Non-secure PL1 mode are routed to Hyp mode, see Traps to Hyp 
mode of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on 
page G1-3900. 


This permits a usage model for TLB invalidate routines, where the routine invalidates a large range of 
addresses, without considering whether any entries are locked in the TLB. 


• The effect on locked entries of a TLB invalidate by VA instruction or a TLB invalidate by ASID match 
instruction is IMPLEMENTATION DEFINED. However, these instructions must implement one of the following 
options: 

— A locked entry is invalidated in the TLB. 


— The instruction has no effect on a locked entry in the TLB. In the case of the Invalidate single entry by 
VA, this means the PE treats the instruction as a NOP. 


— The instruction generates an IMPLEMENTATION DEFINED Data Abort exception if it operates on an entry 
that is locked down, or might be locked down. For an invalidate instruction performed in AArch32 
state, the {coproc==0b1111, CRn==c5} fault status register definitions include a fault code for cache 
and TLB lockdown faults, see Table G4-24 on page G4-4128 and Table G4-25 on page G4-4130. 


Note 


Any implementation that uses an abort mechanism for entries that can be locked down but are not actually locked 
down must: 


• Document the IMPLEMENTATION DEFINED instruction sequences that perform the required invalidation on 
entries that are not locked down. 


• Implement one of the other specified alternatives for the locked entries. 


ARM recommends that, when possible, such IMPLEMENTATION DEFINED instruction sequences use the 
architecturally-defined maintenance instructions. This minimizes the number of customized maintenance 
operations required. 


In addition, an implementation that uses an abort mechanism for handling TLB maintenance instructions on entries 
that can be locked down but are not actually locked down must also provide a mechanism that ensures that no 
TLB entries are locked. 


Similar rules apply to cache lockdown, see The interaction of cache lockdown with cache maintenance instructions 
on page G3-4010. 





The architecture does not guarantee that any unlocked entry in the TLB remains in the TLB. This means that, as a 
side-effect of a TLB maintenance instruction, any unlocked entry in the TLB might be invalidated. 


Ordering and completion of TLB maintenance instructions 


The following rules describe the relations between the memory order model and the TLB maintenance instructions: 


• A TLB invalidate instruction is complete when all memory accesses using the invalidated TLB entries have 
been observed by all observers, to the extent that those accesses must be observed. The Shareability and 
Cacheability of the accessed memory locations determine the extent to which the accesses must be observed. 


Note 


For a TLB invalidate instruction that affects other PEs, the set of memory accesses that have been observed 
when the TLB maintenance instruction is complete must include the memory accesses from those processes 
that used the invalidated TLB entries. 


After the TLB invalidate instruction is complete, no new memory accesses using the invalidated TLB entries 
will be observed by those observers. 


Note 


This requirement does not mean that speculative memory accesses cannot be performed using those entries 
if it is impossible to tell that those memory accesses have been observed by the observers. 


• A TLB maintenance instruction is only guaranteed to be complete after the execution of a DSB instruction. 


• An ISB instruction, or a return from an exception, causes the effect of all completed TLB maintenance 
instructions that appear in program order before the ISB or return from exception to be visible to all 
subsequent instructions, including the instruction fetches for those instructions. 


• An exception causes all completed TLB maintenance instructions, that appear in the instruction stream before 
the point where the exception is taken, to be visible to all subsequent instructions, including the instruction 
fetches for those instructions. 


• All TLB maintenance instructions are executed in program order relative to each other. 


• The execution of a Data or Unified TLB maintenance instruction is only guaranteed to be visible to a 
subsequent explicit load or store instruction after both: 


— The execution of a DSB instruction to ensure the completion of the TLB maintenance instruction. 


— Execution of a subsequent Context synchronization event. 


• The execution of an Instruction or Unified TLB maintenance instruction is only guaranteed to be visible to a 
subsequent instruction fetch after both: 


— The execution of a DSB instruction to ensure the completion of the TLB maintenance instruction. 


— Execution of a subsequent Context synchronization event. 


In all cases in this section where a DMB or DSB is referred to, it refers to a DMB or DSB whose required access type is 
both loads and stores. A DSB NSH is sufficient to ensure completion of TLB maintenance instructions that apply to a 
single PE. A DSB ISH is sufficient to ensure completion of TLB maintenance instructions that apply to PEs in the 
same Inner Shareable domain. 
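

The completion and visibility rules above can be summarized as a two-stage progression: a DSB makes outstanding TLB maintenance instructions complete, and a later Context synchronization event makes the completed ones visible. The sketch below models only that ordering. TlbMaintenanceTracker is a hypothetical name, and the model does not distinguish DSB NSH from DSB ISH or local from broadcast operations.

```python
class TlbMaintenanceTracker:
    """Tracks when the effect of TLB maintenance instructions is guaranteed."""

    def __init__(self):
        self.issued = 0     # executed, completion not yet guaranteed
        self.completed = 0  # complete after a DSB, visibility not yet guaranteed
        self.visible = 0    # guaranteed visible to subsequent instructions

    def tlbi(self):
        """Execute one TLB maintenance instruction."""
        self.issued += 1

    def dsb(self):
        """A DSB guarantees completion of all earlier TLB maintenance."""
        self.completed += self.issued
        self.issued = 0

    def context_sync(self):
        """ISB, exception entry, or exception return.

        Only instructions that are already complete become visible.
        """
        self.visible += self.completed
        self.completed = 0

    def guaranteed(self):
        """True when every executed instruction is guaranteed visible."""
        return self.issued == 0 and self.completed == 0
```

In this model an ISB executed before the DSB has no effect on an outstanding invalidation, which is why the sequences in this section place the DSB first and the ISB second. 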


The following rules apply when writing translation table entries. They ensure that the updated entries are visible to 
subsequent accesses and cache maintenance instructions. 


For TLB maintenance, the translation table walk is treated as a separate observer. This means: 


• A write to the translation tables is only guaranteed to be seen by a translation table walk caused by an explicit 
load or store after the execution of both a DSB and an ISB. 


However, the architecture guarantees that any writes to the translation tables are not seen by any explicit 
memory access that occurs in program order before the write to the translation tables. 


• A write to the translation tables is only guaranteed to be seen by a translation table walk caused by the 
instruction fetch of an instruction that follows the write to the translation tables after both a DSB and an ISB. 


Therefore, in a uniprocessor system, an example instruction sequence for writing a translation table entry, covering 
changes to the instruction or data mappings is: 


STR rx, [Translation table entry] ; write new entry to the translation table 
DSB ; ensures visibility of the new entry 
Invalidate TLB entry by VA (and ASID if non-global) [page address] 
Invalidate BTC 
DSB ; ensure completion of the Invalidate TLB instruction 
ISB ; ensure table changes visible to instruction fetch 


G4.9.2 Maintenance requirements on changing System register values 


The TLB contents can be influenced by control bits in a number of System registers. This means the TLB must be 
invalidated after any changes to these bits, unless the changes are accompanied by a change to the VMID or ASID 
that defines the context to which the bits apply. The general form of the required invalidation sequence is as follows: 


; Change control bits in System registers 
ISB ; Synchronize changes to the control bits 
; Perform TLB invalidation of all entries that might be affected by the changed control bits 


The System register changes that this applies to are: 


• Any change to the NMRR, PRRR, MAIR0, MAIR1, HMAIR0, or HMAIR1 registers. 
• Any change to the SCTLR.AFE bit, see Changing the Access flag enable on page G4-4098. 
• Any change to any of the SCTLR.{TRE, WXN, UWXN} bits. 
• Any change to the Translation table base 0 address in TTBR0. 
• Any change to the Translation table base 1 address in TTBR1. 
• Any change to HTTBR.BADDR. 
• Any change to VTTBR.BADDR. 
• Changing TTBCR.EAE, see Changing the current Translation table format on page G4-4098. 
• In an implementation that includes EL3, any change to the SCR.SIF bit. 
• In an implementation that includes EL2: 
— Any change to the HCR.VM bit. 
— Any change to the HCR.PTW bit, see Changing HCR.PTW on page G4-4098. 
• When using the Short-descriptor translation table format: 
— Any change to the RGN, IRGN, S, or NOS fields in TTBR0 or TTBR1. 
— Any change to the N, EAE, PD0 or PD1 fields in TTBCR. 
• When using the Long-descriptor translation table format: 
— Any change to the EAE, TnSZ, ORGNn, IRGNn, SHn, or EPDn fields in the TTBCR, where n is 0 or 1. 


— Any change to the T0SZ, ORGN0, IRGN0, or SH0 fields in the HTCR. 
— Any change to the T0SZ, ORGN0, IRGN0, or SH0 fields in the VTCR. 


Changing the Access flag enable 


In a PE that is using the Short-descriptor translation table format, it is CONSTRAINED UNPREDICTABLE whether the 
TLB caches the effect of the SCTLR.AFE bit on translation tables. This means that, after changing the SCTLR.AFE 
bit software must invalidate the TLB before it relies on the effect of the new value of the SCTLR.AFE bit, otherwise 
behavior is CONSTRAINED UNPREDICTABLE, see CONSTRAINED UNPREDICTABLE behaviors due to caching of 
control or data values on page K1-5461. 


Note 


There is no enable bit for use of the Access flag when using the Long-descriptor translation table format. 


Changing HCR.PTW 


When EL2 is using AArch32 and the value of the Protected table walk bit, HCR.PTW, is 1, a stage 1 translation 
table access in the Non-secure PL1&0 translation regime, to an address that is mapped to any type of Device 
memory by its stage 2 translation, generates a stage 2 Permission fault. A TLB associated with a particular VMID 
might hold entries that depend on the effect of HCR.PTW. Therefore, if the value of HCR.PTW is changed without 
a change to the VMID value, all TLB entries associated with the current VMID must be invalidated before executing 
software in a Non-secure PL1 or PL0 mode. If this is not done, behavior is CONSTRAINED UNPREDICTABLE, see 
CONSTRAINED UNPREDICTABLE behaviors due to caching of control or data values on page K1-5461. 


Changing the current Translation table format 


The effect of changing TTBCR.EAE when executing in the translation regime affected by TTBCR.EAE with any 
stage of address translation for that translation regime enabled is CONSTRAINED UNPREDICTABLE. This means that, 
when TTBCR.EAE is changed for a given context, the TLB must be invalidated before resuming execution in that 
context, otherwise the effect is CONSTRAINED UNPREDICTABLE, see CONSTRAINED UNPREDICTABLE behaviors 
due to caching of control or data values on page K1-5461. 


G4.9.3 Atomicity of register changes on changing virtual machine 


From the viewpoint of software executing in a Non-secure PL1 or PL0 mode, when there is a switch from one virtual 
machine to another, the registers that control or affect address translation must be changed atomically. This applies 
to the registers for: 


• Non-secure PL1&0 stage 1 address translations. This means that all of the following registers must change 
atomically: 


— PRRR and NMRR, if using the Short-descriptor translation table format. 
— MAIR0 and MAIR1, if using the Long-descriptor translation table format. 
— TTBR0, TTBR1, TTBCR, DACR, and CONTEXTIDR. 
— The SCTLR. 


• Non-secure PL1&0 stage 2 address translations. When EL2 is using AArch32, this means that all of the 
following registers and register fields must change atomically: 


— VTTBR and VTCR. 
— HMAIR0 and HMAIR1. 
— The HSCTLR. 





Note 


Only some bits of SCTLR affect the stage 1 translation, and only some bits of HSCTLR affect the stage 2 translation. 
However, in each case, changing these bits requires a write to the register, and that write must be atomic with the 
other register updates. 





These registers apply to execution in Non-secure PL1&0 modes. However, when updated as part of a switch of 
virtual machines they are updated by software executing in Hyp mode. This means the registers are out of context 
when they are updated, and no synchronization precautions are required. 


Note 


By contrast, a translation table change associated with a change of ASID, made by software executing at PL1, can 
require changes to registers that are in context. Synchronization of changes of ASID and TTBR describes appropriate 
precautions for such a change. 








Software executing in Hyp mode, or in Secure state, must not use the registers associated with the Non-secure 
PL1&0 translation regime for speculative memory accesses. 


G4.9.4 Synchronization of changes of ASID and TTBR 


A common virtual memory management requirement is to change the ASID and TTBR together to associate the 
new ASID with different translation tables, without any change to the current translation regime. When using the 
Short-descriptor translation table format, different registers hold the ASID and the translation table base address, 
meaning these two values cannot be updated atomically. Since a PE can perform a speculative memory access at 
any time, this lack of atomicity is a problem that software must address. Such a change is complicated by: 


• The depth of speculative fetch being IMPLEMENTATION DEFINED. 


• The use of branch prediction. 


When using the Short-descriptor translation table format, the virtual memory management operations must ensure 
the synchronization of changes of the ContextID and the translation table registers. For example, some or all of the 
TLBs, branch predictors, and other caching of ASID and translation information might become corrupt with invalid 
translations. Synchronization is required to avoid either: 


• The old ASID being associated with translation table walks from the new translation tables. 


• The new ASID being associated with translation table walks from the old translation tables. 


There are a number of possible solutions to this problem, and the most appropriate approach depends on the system. 
Example G4-3, Example G4-4 on page G4-4100, and Example G4-5 on page G4-4100 describe three possible 
approaches. 


Note 


Another instance of the synchronization problem occurs if a branch is encountered between changing the ASID and 
performing the synchronization. In this case the value in the branch predictor might be associated with the incorrect 
ASID. Software can address this possibility using any of these approaches, but instead software might be written in 
a way that avoids such branches. 








Example G4-3 Using a reserved ASID to synchronize ASID and TTBR changes 


In this approach, a particular ASID value is reserved for use by the operating system, and is used only for the 
synchronization of the ASID and TTBR. This example uses the value of 0 for this purpose, but any value could be 
used. 


This approach can be used only when the size of the mapping for any given virtual address is the same in the old 
and new translation tables. 


The maintenance software uses the following sequence, that must be executed from memory marked as global: 


Change ASID to 0 
ISB 
Change TTBR 
ISB 
Change ASID to new value 
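

The property this sequence provides can be checked with a short model that records every (ASID, translation tables) combination that is current at some point during the switch. This is a sketch under simplifying assumptions: observable_pairs and reserved_asid_switch are hypothetical names, each register write is treated as fully synchronized by the following ISB, and a speculative walk is assumed to be possible under any recorded combination.

```python
RESERVED_ASID = 0  # the value this example reserves; any unused value would do


def observable_pairs(steps, asid, ttbr):
    """Apply register writes in order and collect every (ASID, tables) pair
    that is current at some point, starting with the initial pair."""
    seen = {(asid, ttbr)}
    for register, value in steps:
        if register == "ASID":
            asid = value
        elif register == "TTBR":
            ttbr = value
        else:
            raise ValueError(register)
        seen.add((asid, ttbr))
    return seen


def reserved_asid_switch(new_asid, new_tables):
    """The three register writes of the sequence, with the ISBs implied."""
    return [
        ("ASID", RESERVED_ASID),
        ("TTBR", new_tables),
        ("ASID", new_asid),
    ]
```

For a switch from ASID 1 with the old tables to ASID 2 with the new tables, the recorded pairs never combine ASID 1 with the new tables or ASID 2 with the old tables. The only mixed combinations involve the reserved ASID. 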


This approach ensures that any non-global pages fetched at a time when it is uncertain whether the old or new 
translation tables are being accessed are associated with the unused ASID value of 0. Since the ASID value of 0 is 
not used for any normal operations these entries cannot cause corruption of execution. 
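The reasoning above can be illustrated with a toy model. The following is only a sketch under simplifying assumptions: `ToyPE`, `write`, `isb`, and `switch_with_reserved_asid` are hypothetical names, and the model assumes that between a register write and the next ISB a translation table walk might pair either the old or the new value of the written register with the current value of the other register.

```python
from dataclasses import dataclass, field

RESERVED_ASID = 0  # the example reserves 0; any otherwise unused value would do


@dataclass
class ToyPE:
    """Toy model that records every (ASID, table) pair a walk might observe."""
    asid: int
    ttbr: str
    observed: set = field(default_factory=set)

    def isb(self):
        # After the barrier, walks use the current ASID and TTBR values.
        self.observed.add((self.asid, self.ttbr))

    def write(self, asid=None, ttbr=None):
        # Until the next ISB, a walk might use the state from before or
        # after this write, so both pairs are recorded as observable.
        before = (self.asid, self.ttbr)
        if asid is not None:
            self.asid = asid
        if ttbr is not None:
            self.ttbr = ttbr
        self.observed.update({before, (self.asid, self.ttbr)})


def switch_with_reserved_asid(pe, new_asid, new_ttbr):
    """Follow the sequence of Example G4-3 on the toy model."""
    pe.write(asid=RESERVED_ASID)
    pe.isb()
    pe.write(ttbr=new_ttbr)
    pe.isb()
    pe.write(asid=new_asid)
```

In this model the only mixed pairings that can be observed involve the reserved ASID, whereas writing the new ASID and then the new TTBR directly makes the pairing of the new ASID with the old tables observable.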
Example G4-4 Using translation tables containing only global mappings when changing the ASID


A second approach involves switching the translation tables to a set of translation tables that only contain global 
mappings while switching the ASID. 


The maintenance software uses the following sequence, that must be executed from memory marked as global: 


Change TTBR to the global-only mappings 
ISB 

Change ASID to new value 

ISB 

Change TTBR to new value 


This approach ensures that no non-global pages can be fetched at a time when it is uncertain whether the old or new 
ASID value will be used. 


This approach works without the need for TLB invalidations in systems that have caching of intermediate levels of 
translation tables, as described in General TLB maintenance requirements on page G4-4093, provided that the 
translation tables containing only global mappings have only level 1 translation table entries of the following kinds: 


° Entries that are global. 


° Pointers to level 2 tables that hold only global entries, and that are the same level 2 tables that are used for
accessing global entries by both: 


— The set of translation tables that were used under the old ASID value. 
— The set of translation tables that will be used with the new ASID value. 


° Invalid level 1 entries.


In addition, all sets of translation tables in this example should have the same Shareability and Cacheability 
attributes, as held in the TTBR0.{ORGN, IRGN} or TTBR1.{ORGN, IRGN} fields.


If these rules are not followed, then the implementation might cache level 1 translation table entries that require
explicit invalidation. 
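The level 1 entry rules listed above can be expressed as a simple check. This is a minimal sketch, assuming a hypothetical dictionary representation of level 1 entries, with `kind` being one of `"invalid"`, `"block"`, or `"table"`. The `shared_global_l2_tables` argument is assumed to name the level 2 tables that hold only global entries and that are used by both the old and the new sets of translation tables.

```python
def global_only_table_ok(level1_entries, shared_global_l2_tables):
    """Return True if every level 1 entry is of a kind the rules permit."""
    for entry in level1_entries:
        kind = entry["kind"]
        if kind == "invalid":
            continue  # invalid level 1 entries are permitted
        if kind == "block" and entry.get("global", False):
            continue  # global entries are permitted
        if kind == "table" and entry["target"] in shared_global_l2_tables:
            continue  # pointer to a shared, global-only level 2 table
        return False
    return True
```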


Example G4-5 Disabling non-global mappings when changing the ASID 


In systems where only the translation tables indexed by TTBR0 hold non-global mappings, maintenance software
can use the TTBCR.PD0 field to disable use of TTBR0 during the change of ASID. This means the system does not
require a set of global-only mappings. 


The maintenance software uses the following sequence, that must be executed from a memory region with a 
translation that is accessed using the base address in the TTBR1 register, and is marked as global: 


Set TTBCR.PD0 = 1

ISB 

Change ASID to new value 
Change TTBR to new value 
ISB 

Set TTBCR.PD0 = 0


This approach ensures that no non-global pages can be fetched at a time when it is uncertain whether the old or new 
ASID value will be used. 
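The property that this sequence relies on can be checked with a small trace. The following is only an illustrative sketch: `pd0_switch_trace` is a hypothetical helper that records the modelled register state after each step, so that a test can confirm that TTBCR.PD0 is 1 whenever the ASID and TTBR0 values do not belong together.

```python
def pd0_switch_trace(old_asid, new_asid, old_ttbr0, new_ttbr0):
    """Return (step label, register state) pairs for the Example G4-5 sequence."""
    state = {"pd0": 0, "asid": old_asid, "ttbr0": old_ttbr0}
    trace = []

    def step(label, **updates):
        # Apply the register updates for this step and snapshot the state.
        state.update(updates)
        trace.append((label, dict(state)))

    step("Set TTBCR.PD0 = 1", pd0=1)
    step("ISB")
    step("Change ASID to new value", asid=new_asid)
    step("Change TTBR to new value", ttbr0=new_ttbr0)
    step("ISB")
    step("Set TTBCR.PD0 = 0", pd0=0)
    return trace
```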


When using the Long-descriptor translation table format, TTBCR.A1 holds the number, 0 or 1, of the TTBR that 
holds the current ASID. This means the current TTBR can also hold the current ASID, and the current translation 
table base address and ASID can be updated atomically when: 


° TTBR0 is the only TTBR being used. TTBCR.A1 must be set to 0.
° TTBR0 points to the only translation tables that hold non-global entries, and TTBCR.A1 is set to 0.
° TTBR1 points to the only translation tables that hold non-global entries, and TTBCR.A1 is set to 1.


In these cases, software can update the current translation table base address and ASID atomically, by updating the 
appropriate TTBR, and does not require a specific routine to ensure synchronization of the change of ASID and base 
address. 


However, in all other cases using the Long-descriptor format, the synchronization requirements are identical to 
those when using the Short-descriptor formats, and the examples in this section indicate how synchronization might 
be achieved. 
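The three cases in which an atomic update is possible can be summarized as a predicate. This is a sketch only: the function name and its boolean parameters are hypothetical, and describe how software has arranged its translation tables rather than any architected register field other than TTBCR.A1.

```python
def atomic_update_ttbr(a1, ttbr0_nonglobal, ttbr1_nonglobal, ttbr1_used=True):
    """Return 0 or 1 for the TTBR to update atomically, or None if a
    synchronization routine is still needed.

    a1              -- value of TTBCR.A1
    ttbr0_nonglobal -- tables addressed by TTBR0 hold non-global entries
    ttbr1_nonglobal -- tables addressed by TTBR1 hold non-global entries
    ttbr1_used      -- False when TTBR0 is the only TTBR being used
    """
    if a1 == 0 and (not ttbr1_used or (ttbr0_nonglobal and not ttbr1_nonglobal)):
        return 0
    if a1 == 1 and ttbr1_used and ttbr1_nonglobal and not ttbr0_nonglobal:
        return 1
    return None
```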


Note 


When using the Long-descriptor translation table format, CONTEXTIDR.ASID has no significance for address 
translation, and is only an extension of the Context ID value. 








G4.9.5 The scope of TLB maintenance instructions 


TLB maintenance instructions provide a mechanism for invalidating entries from TLB caching structures, to ensure 
that changes to the translation tables are reflected correctly in the TLB caching structures. To support TLB 
maintenance in multiprocessor systems, there are maintenance operations that apply to the TLBs of all PEs in the 
same Inner Shareable domain. 


The architecture permits the caching of any translation table entry that has been returned from memory without a 
fault and that does not, itself, cause a Translation Fault, an Address size fault, or an Access Flag fault. This means 
the TLB: 


° Cannot hold an entry that, when used for a translation table lookup, causes a Translation fault, an Address 
size fault, or an Access Flag fault. 


° Can hold an entry for a translation table lookup for a translation that causes a Translation Fault, an Address 
size fault, or an Access Flag fault at a subsequent level of translation table lookup. For example, it can hold 
an entry for the level 1 lookup of a translation that causes a Translation fault, an Address size fault, or an 
Access Flag fault at level 2 or level 3 of lookup. 
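As an illustration of this rule, the following sketch takes the outcome of each lookup level in a walk and returns the levels whose entries are permitted to be cached. It is a simplified model: `cacheable_levels` is a hypothetical helper, and each element of `walk` is assumed to be either `None` (no fault) or one of the three fault names used below. Other kinds of fault are outside the scope of the sketch.

```python
FAULTS_BLOCKING_CACHING = {"translation", "address_size", "access_flag"}


def cacheable_levels(walk):
    """Return the lookup levels (numbered from 1) whose entries may be cached.

    walk -- per-level outcomes in lookup order, None meaning no fault.
    """
    cached = []
    for level, fault in enumerate(walk, start=1):
        if fault in FAULTS_BLOCKING_CACHING:
            break  # the faulting entry must not be cached, and the walk ends
        cached.append(level)
    return cached
```

For example, a walk that faults at level 2 still leaves the level 1 entry eligible for caching.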


This means that entries cached in the TLB can include: 
° Translation table entries that point to a subsequent table to be used in the current stage of translation.
° In an implementation that includes EL2:

— Stage 2 translation table entries that are used as part of a stage 1 translation table walk. 


— Stage 2 translation table entries for translating the output address of a stage 1 translation. 


Such entries might be held in intermediate TLB caching structures that are distinct from the data caches, in that they 
are not required to be invalidated as the result of writes of the data. The architecture makes no restriction on the form 
of these intermediate TLB caching structures. 


The architecture does not intend to restrict the form of TLB caching structures used for holding translation table 
entries. In particular for translation regimes that involve two stages of translation, it recognizes that such caching 
structures might contain: 


° At any level of the translation table walk, entries containing information from stage 1 translation table entries.


° In an implementation that includes EL2: 


— At any level of the translation table walk, entries containing information from stage 2 translation table
entries.


— At any level of the translation table walk, entries combining information from both stage 1 and stage
2 translation table entries.


Note 


For the purpose of TLB maintenance, the term TLB entry denotes any structure, including temporary working 
registers in translation table walk hardware, that holds a translation table entry. 
Where a TLB maintenance instruction is:


° Required to apply to stage 1 entries, then it must apply to any cached entry in the caching structures that 
includes any stage 1 information that would be used to translate the address being invalidated, including any 
entry that combines information from both stage 1 and stage 2 translation table entries. 





Note 


— Where stage 1 information has been cached in multiple TLB entries, as could occur from splintering 
a page when caching in the TLB, then the invalidation must apply to each cached entry containing 
stage 1 information from the page that is used to translate the address being invalidated, regardless of 
whether or not that cached entry would be used to translate the address being invalidated. 


— As stated in Global and process-specific translation table entries on page G4-4089, translation table
entries from levels of translation other than the final level are treated as being non-global. ARM 
expects that, in at least some implementations, cached copies of levels of the translation table walk 
other than the last level are tagged with their ASID, regardless of whether the final level is global. This 
means that TLB invalidations that involve the ASID require the ASID to match such entries to perform 
the required invalidation. 





° Required to apply to stage 2 entries only, then:


— It is not required to apply to caching structures that combine stage 1 and stage 2 translation table
entries. 


— It must apply to caching structures that contain information only from stage 2 translation table entries. 


° Required to apply to both stage 1 and stage 2 entries, then it must apply to any entry in the caching structures
that includes information from either a stage 1 translation table entry or a stage 2 translation table entry, 
including any entry that combines information from both stage 1 and stage 2 translation table entries. 
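The stage criteria above can be sketched as a predicate over the stages whose information a cached entry contains. This is a minimal illustration that models only the stage criterion and ignores VA, ASID, and VMID matching. The function name and the scope strings are hypothetical.

```python
def stage_scope_requires_invalidation(entry_stages, scope):
    """Return True if the instruction is required to apply to the entry.

    entry_stages -- set of stages the cached entry holds information from,
                    for example {1}, {2}, or {1, 2} for a combined entry
    scope        -- "stage1", "stage2_only", or "both"
    """
    if scope == "stage1":
        return 1 in entry_stages  # includes combined stage 1 and 2 entries
    if scope == "stage2_only":
        return entry_stages == {2}  # combined entries are not required
    if scope == "both":
        return bool(entry_stages & {1, 2})
    raise ValueError(f"unknown scope: {scope}")
```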


Table G4-22 on page G4-4103 summarizes the required effect of the preferred TLB operations, for execution in 
AArch32 state, that operate only on TLBs on the PE that executes the instruction. Additional TLB operations: 


° Apply across all PEs in the same Inner Shareable domain. Each operation shown in the table has an Inner 
Shareable equivalent, identified by an IS suffix. For example, the Inner Shareable equivalent of TLBIALL is 
TLBIALLIS. See also EL2 upgrading of TLB maintenance instructions on page G4-4104. 


° Can apply to separate Instruction or Data TLBs, as indicated by a footnote to the table. ARM deprecates any 
use of these operations. 


Note 


° The architecture permits a TLB invalidation operation to affect any unlocked entry in the TLB. Table G4-22 
on page G4-4103 defines only the entries that each operation must invalidate. 





° All TLB operations, including those that operate on a VA match, operate regardless of the value of
SCTLR.M. 





When interpreting the table: 


Related operations Each operation description applies also to any equivalent operation that either:
° Applies to all PEs in the same Inner Shareable domain.
° Applies only to a data TLB, or only to an instruction TLB.


So, for example, the TLBIALL description applies also to TLBIALLIS, ITLBIALL, and 
DTLBIALL. 


TLB maintenance instructions, functional group on page G4-4202 lists all of the TLB 
maintenance instructions. 


Matches the VA Means the VA argument for the operation must match the VA value in the TLB entry. 


Matches the ASID Means the ASID argument for the operation must match the ASID in use when the TLB
entry was assigned. 
Matches the current VMID


Means the current VMID must match the VMID in use when the TLB entry was assigned. 


The dependency on the VMID applies even when the value of HCR.VM is 0, including 
situations where there is no use of virtualization. However, VTTBR.VMID resets to zero, 
meaning there is a valid VMID from reset. 


Execution at PL2 Descriptions of operations at PL2 apply only to implementations that include EL2. 


For the definitions of the translation regimes referred to in the table see About VMSAv8-32 on page G4-4022. 


Table G4-22 Effect of the TLB maintenance instructions

Columns: Operation; Executed from (State, Mode); Effect, must invalidate any entry that matches all stated conditions.

TLBIALL (a)
    Secure, PL1
        All entries for the Secure PL1&0 translation regime. That is, all entries that were allocated in Secure state.
    Non-secure, PL1
        All entries for stage 1 of the Non-secure PL1&0 translation regime that match the current VMID.
    Non-secure, Hyp
        All entries for stage 1 or stage 2 of the Non-secure PL1&0 translation regime that match the current VMID.

TLBIMVA (a)
    Secure, PL1
        Any entry for the Secure PL1&0 translation regime that both:
        ° Matches the VA argument.
        ° Matches the ASID argument, or is global.
    Non-secure, PL1 or Hyp
        Any entry for stage 1 of the Non-secure PL1&0 translation regime to which all of the following apply. The entry:
        ° Matches the VA argument.
        ° Matches the ASID argument, or is global.
        ° Matches the current VMID.

TLBIASID (a)
    Secure, PL1
        Any entry for the Secure PL1&0 translation regime that matches the specified ASID and either:
        ° Is from a level of lookup above the final level.
        ° Is a non-global entry from the final level of lookup.
    Non-secure, PL1 or Hyp
        Any entry for stage 1 of the Non-secure PL1&0 translation regime that both:
        ° Matches the specified ASID and either:
            — Is from a level of lookup above the final level.
            — Is a non-global entry from the final level of lookup.
        ° Matches the current VMID.

TLBIMVAA
    Secure, PL1
        Any entry for the Secure PL1&0 translation regime that matches the VA argument.
    Non-secure, PL1 or Hyp
        Any entry for stage 1 of the Non-secure PL1&0 translation regime that both:
        ° Matches the VA argument.
        ° Matches the current VMID.

TLBIALLNSNH (b)
    Secure, Monitor
    Non-secure, Hyp
        All entries for stage 1 or stage 2 of the Non-secure PL1&0 translation regime, regardless of the associated VMID.

TLBIALLH (b)
    Secure, Monitor
    Non-secure, Hyp
        All entries for the Non-secure PL2 translation regime. That is, any entry that was allocated in Non-secure state
        from Hyp mode.

TLBIMVAL
    Secure, PL1
        Any entry for stage 1 of the Secure PL1&0 translation regime that is from the last level of the translation table
        walk and both:
        ° Matches the VA argument.
        ° Matches the ASID argument, or is global.
    Non-secure, PL1 or Hyp
        Any entry for stage 1 of the Non-secure PL1&0 translation regime that is from the last level of the translation
        table walk and to which all of the following apply. The entry:
        ° Matches the VA argument.
        ° Matches the ASID argument, or is global.
        ° Matches the current VMID.

TLBIMVAAL
    Secure, PL1
        Any entry for stage 1 of the Secure PL1&0 translation regime that is from the last level of the translation table
        walk and matches the VA argument.
    Non-secure, PL1 or Hyp
        Any entry for stage 1 of the Non-secure PL1&0 translation regime that is from the last level of the translation
        table walk and both:
        ° Matches the VA argument.
        ° Matches the current VMID.

TLBIMVAH (b)
    Secure, Monitor
    Non-secure, Hyp
        Any entry for the Non-secure PL2 translation regime that matches the VA argument.

TLBIMVALH (b)
    Secure, Monitor
    Non-secure, Hyp
        Any entry for the Non-secure PL2 translation regime that is from the last level of the translation table walk and
        matches the VA argument.

TLBIIPAS2 (b, c)
    Secure, Monitor (d)
    Non-secure, Hyp
        Any entry for stage 2 of the PL1&0 translation regime that both:
        ° Matches the IPA argument.
        ° Matches the current VMID.

TLBIIPAS2L (b, c)
    Secure, Monitor (d)
    Non-secure, Hyp
        Any entry for stage 2 of the PL1&0 translation regime that is from the last level of translation and both:
        ° Matches the IPA argument.
        ° Matches the current VMID.





a. The architecture defines variants of these operations that apply only to instruction TLBs, and only to data TLBs. ARM deprecates any use 
of these variants. For more information, see the referenced description of the operation. 


b. Available only in an implementation that includes EL2. See also EL2 upgrading of TLB maintenance instructions. 


c. This instruction is CONSTRAINED UNPREDICTABLE if executed in any AArch32 Secure privileged mode, see Hyp mode TLB maintenance
instructions on page K1-5476.


d. This instruction executes as a NOP when SCR.NS == 0. 
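To make the matching conditions in the table concrete, the following sketch models the TLBIMVA and TLBIASID rows for execution in a Non-secure PL1 mode. It is an illustration under assumptions, not a description of any implementation: `TlbEntry` and both predicates are hypothetical names, VA matching is reduced to equality of a page-aligned address, and only stage 1 entries of the Non-secure PL1&0 translation regime are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TlbEntry:
    va: int                   # page-aligned virtual address of the entry
    asid: int                 # ASID in use when the entry was assigned
    vmid: int                 # VMID in use when the entry was assigned
    is_global: bool           # global attribute of the final-level entry
    final_level: bool = True  # False for entries from earlier lookup levels


def tlbimva_must_invalidate(entry, va, asid, current_vmid):
    """TLBIMVA: matches the VA, matches the ASID or is global, matches the VMID."""
    return (entry.va == va
            and (entry.asid == asid or entry.is_global)
            and entry.vmid == current_vmid)


def tlbiasid_must_invalidate(entry, asid, current_vmid):
    """TLBIASID: matches the ASID and is either above the final level of
    lookup or a non-global final-level entry, and matches the VMID."""
    asid_scoped = (not entry.final_level) or (not entry.is_global)
    return entry.asid == asid and asid_scoped and entry.vmid == current_vmid
```

The table defines only the entries that each operation must invalidate, so an implementation is permitted to invalidate more entries than these predicates select.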


EL2 upgrading of TLB maintenance instructions 


In an implementation that includes EL2, when the value of HCR.FB is 1, the TLB maintenance instructions that are 
not broadcast across the Inner Shareable domain are upgraded to operate across the Inner Shareable domain when 
performed in a Non-secure PL1 mode. For example, when the value of HCR.FB is 1, a TLBIMVA operation 
performed in a Non-secure PL1 mode operates as a TLBIMVAIS operation. 
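The upgrade can be sketched as a mapping from the executed operation to the operation that takes effect. This is a minimal illustration: `effective_tlb_op` is a hypothetical helper, and it relies on the naming convention stated earlier in this section, that the Inner Shareable equivalent of an operation is identified by an IS suffix.

```python
def effective_tlb_op(op, hcr_fb, nonsecure_pl1):
    """Return the name of the TLB operation that takes effect.

    op            -- name of the executed operation, for example "TLBIMVA"
    hcr_fb        -- value of HCR.FB, 0 or 1
    nonsecure_pl1 -- True if executed in a Non-secure PL1 mode
    """
    if hcr_fb == 1 and nonsecure_pl1 and not op.endswith("IS"):
        return op + "IS"  # upgraded to the Inner Shareable equivalent
    return op
```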
TLB maintenance with different translation granule sizes


If a TLB maintenance instruction specifying a virtual address affecting the PL2 translation regime is broadcast from
a PE using AArch32 to a PE using AArch64 using a translation granule size that is different from the AArch32 
translation granule size for that same translation regime, the TLB maintenance instruction is not required to perform 
any invalidation on the recipient PE. 


If a TLB maintenance instruction specifying a virtual address affecting the PL1 translation regime is broadcast from
a PE using AArch32 using one translation granule size for that translation regime for a particular ASID, VMID (if 
applicable), and Security state, to a PE using AArch64 where EL1 for the same ASID, VMID (if applicable), and 
Security state, is using a translation granule size that is different from the AArch32 translation granule size, the TLB 
maintenance instruction is not required to perform any invalidation on the recipient PE. 
G4.10 Caches in VMSAv8-32


The ARM architecture describes the required behavior of an implementation of the architecture. As far as possible 
it does not restrict the implemented microarchitecture, or the implementation techniques that might achieve the 
required behavior. 


Maintaining this level of abstraction is difficult when describing the relationship between memory address 
translation and caches, especially regarding the indexing and tagging policy of caches. This section: 


° Summarizes the architectural requirements for the interaction between caches and memory translation. 


° Gives some information about the likely implementation impact of the required behavior. 


The following sections give this information: 
° Data and unified caches. 


° Instruction caches. 


In addition Cache maintenance requirement created by changing translation table attributes on page G4-4108 
describes the cache maintenance required after updating the translation tables to change the attributes of an area of 
memory. 


For more information about cache maintenance see: 


° AArch32 cache and branch predictor support on page G3-3989. This section describes the ARM cache 
maintenance instructions. 


° Cache maintenance instructions, functional group on page G4-4201. This section summarizes the System 
register encodings used for these operations when executing in AArch32 state. 


G4.10.1 Data and unified caches


For data and unified caches, the use of memory address translation is entirely transparent to any data access other 
than as described in Mismatched memory attributes on page E2-2352. 


This means that the behavior of accesses from the same observer to different VAs, that are translated to the same PA 
with the same memory attributes, is fully coherent. This means these accesses behave as follows, regardless of 
which VA is accessed: 


° Two writes to the same PA occur in program order.
° A read of a PA returns the value of the last successful write to that PA. 


° A write to a PA that occurs, in program order, after a read of that PA, has no effect on the value returned by 
that read. 


The memory system behaves in this way without any requirement to use barrier or cache maintenance instructions. 


In addition, if cache maintenance is performed on a memory location, the effect of that cache maintenance is visible 
to all aliases of that physical memory location. 


These properties are consistent with implementing all caches that can handle data accesses as Physically-indexed, 
physically-tagged (PIPT) caches. 
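The aliasing behavior described above can be illustrated with a toy memory model whose storage is keyed by physical address. This is only a sketch: `PiptMemory` is a hypothetical class and it models a single observer with no timing or cache hierarchy.

```python
class PiptMemory:
    """Toy model in which data is identified by PA, so that VA aliases of
    the same PA always see the same value."""

    def __init__(self, va_to_pa):
        self.va_to_pa = dict(va_to_pa)
        self.lines = {}  # keyed by physical address

    def write(self, va, value):
        self.lines[self.va_to_pa[va]] = value

    def read(self, va):
        # Returns the last value written to the PA, whichever VA was used.
        return self.lines.get(self.va_to_pa[va])
```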


G4.10.2 Instruction caches


In the ARM architecture, an instruction cache is a cache that is accessed only as a result of an instruction fetch. 
Therefore, an instruction cache is never written to by any load or store instruction executed by the PE. 


The ARM architecture supports three different behaviors for instruction caches. For ease of reference and 
description these are identified by descriptions of the associated expected implementation, as follows: 


° PIPT instruction caches.
° Virtually-indexed, physically-tagged (VIPT) instruction caches. 
° ASID and VMID tagged Virtually-indexed, virtually-tagged (VIVT) instruction caches. 
In AArch32 state, the CTR identifies the form of the instruction caches, see CTR, Cache Type Register on
page G6-4293. 


The following subsections describe the behavior associated with these cache types, including any occasions where 
explicit cache maintenance is required to make the use of memory address translation transparent to the instruction
cache:
° PIPT instruction caches. 
° VIPT instruction caches. 


° ASID and VMID tagged VIVT instruction caches. 
° The IVIPT architecture Extension on page G4-4108. 





Note 


For software to be portable between implementations that might use any of PIPT instruction caches, VIPT 
instruction caches, or ASID and VMID tagged VIVT instruction caches, the software must invalidate the instruction 
cache whenever any condition occurs that would require instruction cache maintenance for at least one of the 
instruction cache types. 





PIPT instruction caches 


For PIPT instruction caches, the use of memory address translation is entirely transparent to all instruction fetches 
other than as described in Mismatched memory attributes on page E2-2352. 


If cache maintenance is performed on a memory location, the effect of that cache maintenance is visible to all aliases 
of that physical memory location. 


An implementation that provides PIPT instruction caches implements the IVIPT Extension, see The IVIPT
architecture Extension on page G4-4108.


VIPT instruction caches 


For VIPT instruction caches, the use of memory address translation is transparent to all instruction fetches other 
than for the effect of memory address translation on instruction cache invalidate by address operations or as 
described in Mismatched memory attributes on page E2-2352. 


Note 


Cache invalidation is the only cache maintenance instruction that can be performed on an instruction cache. 








If instruction cache invalidation by address is performed on a memory location, the effect of that invalidation is 
visible only to the virtual address supplied with the operation. The effect of the invalidation might not be visible to 
any other virtual address aliases of that physical memory location. 


The only architecturally-guaranteed way to invalidate all aliases of a physical address from a VIPT instruction cache 
is to invalidate the entire instruction cache. 
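The following toy model illustrates why invalidation by address might not reach other aliases. It is a sketch under assumptions: `ToyViptICache` is a hypothetical class, the set index is taken directly from the VA, and the model shows the worst case that the architecture permits rather than the behavior of any particular implementation.

```python
class ToyViptICache:
    """Toy instruction cache indexed by VA and tagged by PA."""

    def __init__(self, num_sets, va_to_pa):
        self.num_sets = num_sets
        self.va_to_pa = dict(va_to_pa)
        self.lines = {}  # keyed by (virtual index, physical tag)

    def _key(self, va):
        return (va % self.num_sets, self.va_to_pa[va])

    def fetch(self, va, memory):
        key = self._key(va)
        if key not in self.lines:
            self.lines[key] = memory[self.va_to_pa[va]]  # fill on a miss
        return self.lines[key]

    def invalidate_by_va(self, va):
        # Only the line reached through this VA's index is removed.
        self.lines.pop(self._key(va), None)

    def invalidate_all(self):
        self.lines.clear()
```

In this model, two VAs that map to the same PA but select different sets hold separate copies, so invalidating by one VA leaves the other copy in place until the entire cache is invalidated.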


An implementation that provides VIPT instruction caches implements the IVIPT Extension, see The IVIPT
architecture Extension on page G4-4108.


ASID and VMID tagged VIVT instruction caches 


For ASID and VMID tagged VIVT instruction caches, if the instructions at any virtual address change, for a given 
translation regime and a given ASID and VMID, as appropriate, then instruction cache maintenance is required to 
ensure that the change is visible to subsequent execution. This maintenance is required when writing new values to 
instruction locations. It can also be required as a result of any of the following situations that change the translation 
of a virtual address to a physical address, if, as a result of the change to the translation, the instructions at the virtual 
addresses change: 





° For any translation regime other than the Non-secure PL1&0 translation regime, enabling or disabling stage 1 
translations. 


° For the Non-secure PL1&0 translation regime: 


— When stage 2 translations are enabled, enabling or disabling stage 1 translations unless accompanied 
by a change of VMID. 


— When stage 2 translations are disabled, enabling or disabling stage 1 translations.


— Enabling or disabling stage 2 translations.
° Writing new mappings to the translation tables.


° In AArch32 state, any change to the TTBRO, TTBR1, or TTBCR registers, unless: 


— For a change to the Secure PL1&0 translation regime, the change is accompanied by a change to the
ASID.


— For a change to the stage 1 translations of the Non-secure PL1&0 translation regime, the change is
accompanied by a change to the ASID or a change to the VMID.


° In AArch32 state, any change to the VTTBR or VTCR registers, unless accompanied by a change to the
VMID. 


Note 


For ASID and VMID tagged VIVT instruction caches only, for a given translation regime and a given ASID and 
VMID, as appropriate, invalidation is not required if a change to the translations is such that the instructions 
associated with the non-faulting translations of a virtual address remain unchanged through the change to the 
translations, even if the physical locations being mapped to by the changed translation have been written as part of 
changing the translation. 





Examples of situations where this might occur include: 
° Copy-on-Write. 
° Demand Paging of memory locations to/from disk. 


This does not apply for VIPT or PIPT instruction caches, because those caches hold copies of physical addresses, 
and therefore must be invalidated when the contents are written to, to avoid the use of stale entries. 





If instruction cache invalidation by address is performed on a memory location, the effect of that invalidation is 
visible only to the virtual address supplied with the operation. The effect of the invalidation might not be visible to 
any other virtual address aliases of that physical memory location. 


The only architecturally-guaranteed way to invalidate all aliases of a physical address from an ASID and VMID 
tagged VIVT instruction cache is to invalidate the entire instruction cache. 


The IVIPT architecture Extension 


An implementation in which the instruction cache exhibits the behaviors described in PIPT instruction caches on 
page G4-4107, or those described in VIPT instruction caches on page G4-4107, is said to implement the IVIPT
Extension to the ARM architecture. 


The formal definition of the IVIPT Extension to the ARM architecture is that it reduces the instruction cache 
maintenance requirement to the following condition: 


° Instruction cache maintenance is required only after writing new data to a physical address that holds an 
instruction. 
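The difference between the instruction cache behaviors can be summarized in a small decision function. This is a simplified sketch: the function names and the two boolean inputs are hypothetical, and they reduce the conditions described in this section to whether new data was written to a physical address that holds an instruction, and whether the instructions seen at a virtual address changed for a given translation regime, ASID, and VMID.

```python
IVIPT_TYPES = {"PIPT", "VIPT"}


def icache_invalidate_required(cache_type, pa_written, va_view_changed):
    """Return True if instruction cache invalidation is required.

    pa_written      -- new data was written to a PA that holds an instruction
    va_view_changed -- the instructions at some VA changed
    """
    if cache_type in IVIPT_TYPES:
        return pa_written  # the IVIPT condition
    if cache_type == "VIVT":
        return va_view_changed  # ASID and VMID tagged VIVT condition
    raise ValueError(f"unknown cache type: {cache_type}")


def portable_invalidate_required(pa_written, va_view_changed):
    """Portable software invalidates if any cache type would require it."""
    return any(icache_invalidate_required(t, pa_written, va_view_changed)
               for t in ("PIPT", "VIPT", "VIVT"))
```

For example, a Copy-on-Write that leaves the instructions at the virtual address unchanged sets only `pa_written`, so the sketch requires invalidation for PIPT and VIPT caches but not for an ASID and VMID tagged VIVT cache.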


G4.10.3 Cache maintenance requirement created by changing translation table attributes


Any change to the translation tables to change the attributes of an area of memory can require maintenance of the 
translation tables, as described in General TLB maintenance requirements on page G4-4093. If the change affects 
the cacheability attributes of the area of memory, including any change between Write-Through and Write-Back
attributes, software must ensure that any cached copies of affected locations are removed from the caches, typically
by cleaning and invalidating the locations from the levels of cache that might hold copies of the locations affected
by the attribute change. Any of the following changes to the inner cacheability or outer cacheability attribute creates 
this maintenance requirement: 


° Write-Back to Write-Through.
° Write-Back to Non-cacheable.
° Write-Through to Non-cacheable.
° Write-Through to Write-Back.


The cache clean and invalidate avoids any possible coherency errors caused by mismatched memory attributes. 
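The list of attribute changes above can be captured as a lookup. This is a minimal sketch in which the abbreviations `"WB"`, `"WT"`, and `"NC"` and the function name are hypothetical. The check applies separately to the inner cacheability attribute and to the outer cacheability attribute.

```python
# Changes of a cacheability attribute that create the maintenance requirement.
REQUIRES_CLEAN_INVALIDATE = {
    ("WB", "WT"),  # Write-Back to Write-Through
    ("WB", "NC"),  # Write-Back to Non-cacheable
    ("WT", "NC"),  # Write-Through to Non-cacheable
    ("WT", "WB"),  # Write-Through to Write-Back
}


def needs_clean_invalidate(old_attr, new_attr):
    """Return True if changing old_attr to new_attr is one of the listed changes."""
    return (old_attr, new_attr) in REQUIRES_CLEAN_INVALIDATE
```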


Similarly, to avoid possible coherency errors caused by mismatched memory attributes, the following sequence 
must be followed when changing the Shareability attributes of a cacheable memory location: 





1. Make the memory location Non-cacheable, Outer Shareable. 
2. Clean and invalidate the location from the cache.
3. Change the Shareability attributes to the required new values. 
G4.11 VMSAv8-32 memory aborts


In a VMSAv8-32 implementation, the following mechanisms cause a PE to take an exception on a failed memory 
access: 


Debug exception An exception caused by the debug configuration, see Chapter G2 AArch32 Self-hosted 
Debug. 


Alignment fault An Alignment fault is generated if the address used for a memory access does not have the 
required alignment for the operation. For more information see Unaligned data access on 
page E2-2323 and Alignment faults on page G4-4117. 


MMU fault An MMU fault is a fault generated by the fault checking sequence for the current translation 
regime. See MMU faults in AArch32 state on page G4-4118. 


External abort Any memory system fault other than a Debug exception, an Alignment fault, or an MMU 
fault. 


Collectively, these mechanisms are called aborts. Chapter G2 AArch32 Self-hosted Debug and Chapter H3 Halting 
Debug Events describe Debug exceptions, and the remainder of this section describes Alignment faults, MMU 
faults, and External aborts. 


The exception generated on a synchronous memory abort: 





° On an instruction fetch is called the Prefetch Abort exception. 
° On a data access is called the Data Abort exception. 
Note 


The Prefetch Abort exception applies to any synchronous memory abort on an instruction fetch. It is not restricted 
to speculative instruction fetches. 





In AArch32 state, asynchronous memory aborts are a type of External abort, and are treated as a type of Data Abort 
exception. 


The following sections describe the abort mechanisms: 

° Routing of aborts taken to AArch32 state. 

° VMSAv8-32 MMU fault terminology on page G4-4113. 
° The MMU fault-checking sequence on page G4-4113. 
° Alignment faults on page G4-4117.

° MMU faults in AArch32 state on page G4-4118. 


° External abort on a translation table walk on page G4-4120.
° AArch32 state prioritization of synchronous aborts from a single stage of address translation on 
page G4-4120. 


An access that causes an abort is said to be aborted. On an abort, System registers are used to record context 
information. For more information see Exception reporting in a VMSAv8-32 implementation on page G4-4123. 





G4.11.1 Routing of aborts taken to AArch32 state 
A memory abort is either a Data Abort exception or a Prefetch Abort exception. When executing in AArch32 state, 
depending on the cause of the abort, and possibly on configuration settings, an abort is taken either: 
° To the Exception level of the PE mode from which the abort is taken. In this case the abort is taken to 
AArch32 state. 
° To a higher Exception level. In this case the Exception level to which the abort is taken is either:
— Using AArch32. In this case, this chapter describes how the abort is handled. 
— Using AArch64. In this case, Chapter D4 The AArch64 Virtual Memory System Architecture describes 
how the abort is handled. 
For an abort taken to an Exception level that is using AArch32, the mode to which a memory abort is taken depends
on the reason for the exception, the mode the PE is in when it takes the exception, and configuration settings, as 
follows: 


Memory aborts taken to Monitor mode 


If an implementation includes EL3, when the value of SCR.EA is 1, all External aborts are taken to 
EL3, and if EL3 is using AArch32 they are taken to Monitor mode. This applies to aborts taken from 
Secure modes and from Non-secure modes. For more information see Asynchronous exception 
routing controls on page G1-3841. 


Note


° Although the referenced section mostly describes the routing of asynchronous exceptions, it 
includes the SCR.EA control that applies to both synchronous external aborts and SError 
interrupts. 


° The SCR is implemented only as part of EL3. 





Memory aborts taken to Secure Abort mode 


If an implementation includes EL3, when the PE is executing in Secure state, all memory aborts that 
are not routed to EL3 are taken to Secure Abort mode. 


— Note 


The only memory aborts that can be routed to Monitor mode are External aborts. 





Memory aborts taken to Hyp mode 


If an implementation includes EL2, when the PE is executing in Non-secure state, the following 
aborts are taken to EL2. If EL2 is using AArch32 this means they are taken to Hyp mode: 


• Alignment faults taken:
— When the PE is in Hyp mode.
— When the PE is in a Non-secure PL1 or PL0 mode and the exception is generated
because the Non-secure PL1&0 stage 2 translation identifies the target of an unaligned
access as any type of Device memory.
— When the PE is in Non-secure User mode and HCR.TGE is set to 1. For more
information see Abort exceptions, when the value of HCR.TGE is 1 on page G1-3830.


• When the PE is using the Non-secure PL1&0 translation regime:
— MMU faults from stage 2 translations, for which the stage 1 translation did not cause
an MMU fault.
— Any abort taken during the stage 2 translation of an address accessed in a stage 1
translation table walk that is not routed to Secure Monitor mode, see Stage 2 fault on
a stage 1 translation table walk on page G4-4117.


• When the PE is using the Non-secure PL2 translation regime, MMU faults from stage 1
translations.


Note


The Non-secure PL2 translation regime has only one stage of translation.


• External aborts, if SCR.EA is set to 0 and any of the following applies:
— The PE was executing in Hyp mode when it took the exception.
— The PE was executing in a Non-secure PL0 or PL1 mode when it took the exception,
the abort is asynchronous, and HCR.AMO is set to 1. For more information see
Asynchronous exception routing controls on page G1-3841.
— The PE was executing in the Non-secure User mode when it took the exception, the
abort is synchronous, and HCR.TGE is set to 1. For more information see Abort
exceptions, when the value of HCR.TGE is 1 on page G1-3830.
— The abort occurred on a stage 2 translation table walk.


• Debug exceptions, if HDCR.TDE is set to 1. For more information, see Routing debug
exceptions to EL2 on page G1-3833.


Memory aborts taken to Non-secure Abort mode 


In an implementation that does not include EL3, all memory aborts that are taken to an Exception 
level that is using AArch32 are taken to Abort mode. 


Otherwise, when the PE is executing in Non-secure state, the following aborts are taken to 
Non-secure Abort mode: 


• When the PE is in a Non-secure PL1 or PL0 mode, Alignment faults taken for any of the
following reasons:
— SCTLR.A is set to 1.
— An instruction that does not support unaligned accesses is committed for execution,
and the instruction accesses an unaligned address.
— The PL1&0 stage 1 translation identifies the target of an unaligned access as any type
of Device memory.


Note


In an implementation that does not include EL2, this case results in a CONSTRAINED
UNPREDICTABLE memory access, see Cases where unaligned accesses are
CONSTRAINED UNPREDICTABLE on page E2-2324 and Loads and Stores to
unaligned locations on page K1-5458.


If an implementation includes EL2 and the PE is in Non-secure User mode, these exceptions
are taken to Abort mode only if the value of HCR.TGE is 0.


• When the PE is using the Non-secure PL1&0 translation regime, an MMU fault from a stage
1 translation.


• External aborts, if all of the following apply:
— The abort is not on a stage 2 translation table walk.
— The PE is not in Hyp mode.
— The value of SCR.EA is 0.
— The abort is asynchronous, and HCR.AMO is set to 0.
— The abort is synchronous, and HCR.TGE is set to 0.


• Virtual Aborts, see Virtual exceptions when an implementation includes EL2 on
page G1-3839.


• When the value of HDCR.TDE is 0, Debug exceptions. For more information, see Routing
debug exceptions to EL2 on page G1-3833.


Note


If EL0 is using AArch32 and EL1 is using AArch64 then any of these memory aborts taken from
User mode are taken to EL1 as described in Chapter D4 The AArch64 Virtual Memory System
Architecture.










Memory aborts with IMPLEMENTATION DEFINED behavior 


In addition, a PE can generate an abort for an IMPLEMENTATION DEFINED reason associated with 
lockdown. In an implementation that includes EL2, whether such an abort is taken to Non-secure 
Abort mode or is taken to EL2 is IMPLEMENTATION DEFINED, and an implementation might include 
a mechanism to select whether the abort is routed to Non-secure Abort mode or to EL2. 


When the PE is in a Non-secure mode other than Hyp mode, if multiple factors cause an Alignment fault, the abort 
is taken to Non-secure Abort mode if any of the factors require the abort to be taken to Abort mode. For example, 
if the SCTLR.A bit is set to 1, and the access is an unaligned access to an address that the stage 2 translation tables 
mark as Device-nGnRnE, then the abort is taken to Non-secure Abort mode. 


For more information see Handling exceptions that are taken to an Exception level using AArch32 on 
page G1-3812. 
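
The routing rules for External aborts described above can be sketched as a small decision function. This is a minimal illustration only, assuming an implementation that includes both EL2 and EL3 with all Exception levels using AArch32. The AbortContext type, its field names, and the mode labels are hypothetical names introduced for this sketch and are not part of the architecture.

```python
from dataclasses import dataclass


@dataclass
class AbortContext:
    """Hypothetical snapshot of the state that the routing rules consult."""
    secure: bool               # True if the PE is in Secure state
    mode: str                  # "user", "pl1" or "hyp" (illustrative labels)
    synchronous: bool          # True for a synchronous External abort
    stage2_walk: bool = False  # abort occurred on a stage 2 translation table walk
    scr_ea: int = 0            # value of SCR.EA
    hcr_amo: int = 0           # value of HCR.AMO
    hcr_tge: int = 0           # value of HCR.TGE


def route_external_abort(ctx: AbortContext) -> str:
    """Return the mode to which an External abort is taken (sketch only)."""
    # SCR.EA == 1 routes all External aborts to EL3, from either Security state.
    if ctx.scr_ea == 1:
        return "Monitor"
    # In Secure state, aborts not routed to EL3 go to Secure Abort mode.
    if ctx.secure:
        return "Secure Abort"
    # Non-secure state with SCR.EA == 0: the Hyp mode conditions.
    if ctx.mode == "hyp" or ctx.stage2_walk:
        return "Hyp"
    if not ctx.synchronous and ctx.hcr_amo == 1:
        return "Hyp"
    if ctx.synchronous and ctx.mode == "user" and ctx.hcr_tge == 1:
        return "Hyp"
    return "Non-secure Abort"
```

For example, an asynchronous External abort taken from a Non-secure PL1 mode with HCR.AMO set to 1 and SCR.EA set to 0 is routed to Hyp mode by this sketch. The sketch does not model MMU faults, Alignment faults, Debug exceptions, or the IMPLEMENTATION DEFINED lockdown aborts.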


G4.11.2 VMSAv8-32 MMU fault terminology


The ARMv7 Large Physical Address Extension introduced new terminology for faults on a stage of address
translation, to provide consistent terminology across all implementations. Table G4-23 shows the terminology used
in this manual for MMU faults, compared with older ARM documentation. The current terms are the same for
faults that occur with the Short-descriptor translation table format and with the Long-descriptor format, and also
apply to faults in a level 3 lookup when using the Long-descriptor translation table format.


Table G4-23 MMU fault terminology

Current term               Old term                   Note
Level 1 Translation fault  Section Translation fault  -
Level 2 Translation fault  Page Translation fault     -
Level 3 Translation fault  -                          Long-descriptor translation table format only.
Level 1 Access flag fault  Section Access flag fault  -
Level 2 Access flag fault  Page Access flag fault     -
Level 3 Access flag fault  -                          Long-descriptor translation table format only.
Level 1 Domain fault       Section Domain fault       Short-descriptor translation table format only, except for
Level 2 Domain fault       Page Domain fault          reporting faults on address translation instructions in the
                                                      64-bit PAR, see Determining the PAR format on page G4-4145.
                                                      Cannot occur at level 3.
Level 1 Permission fault   Section Permission fault   -
Level 2 Permission fault   Page Permission fault      -
Level 3 Permission fault   -                          Long-descriptor translation table format only.

In an implementation that includes EL2, MMU faults are also classified by the translation stage at which the fault
is generated. This means that a memory access from a Non-secure PL1 or PL0 mode can generate:


• A stage 1 MMU fault, for example, a stage 1 Translation fault.
• A stage 2 MMU fault, for example, a stage 2 Translation fault.


G4.11.3 The MMU fault-checking sequence


This section describes the MMU checks made for the memory accesses required for instruction fetches and for
explicit memory accesses:
• If an instruction fetch faults it generates a Prefetch Abort exception.
• If a data memory access faults it generates a Data Abort exception.


For more information about Prefetch Abort exceptions and Data Abort exceptions see Handling exceptions that are
taken to an Exception level using AArch32 on page G1-3812.


In VMSAv8-32, all memory accesses require VA to PA translation. Therefore, when a corresponding stage of 
address translation is enabled, each access requires a lookup of the translation table descriptor for the accessed VA. 
For more information, see Translation tables on page G4-4035 and subsequent sections of this chapter. MMU fault 
checking is performed for each level of translation table lookup. If an implementation includes EL2 and is operating 
in Non-secure state, MMU fault checking is performed for each stage of address translation. 


Note 


In an implementation that includes EL2, if a PE is executing in Non-secure state, the operating system or similar
Non-secure system software defines the stage 1 translation tables in the IPA address space, and typically is unaware
of the stage 2 translation from IPA to PA. However, each Non-secure stage 1 translation table access is subject to
stage 2 address translation, and might be faulted at that stage.








The MMU fault checking sequence is largely independent of the translation table format, as the figures in this 
section show. The differences are: 


When using the Short-descriptor format 


° There are one or two levels of lookup. 
° Lookup always starts at level 1. 
° The final level of lookup checks the Domain field of the descriptor and: 


— Faults if there is no access to the Domain. 


— Checks the access permissions only for Client domains. 


When using the Long-descriptor format 


° There are one, two, or three levels of lookup. 
° Lookup starts at either level 1 or level 2. 
° Domains are not supported. All accesses are treated as Client domain accesses. 


The fault-checking sequence shows a translation from an Input address to an Output address. For more information 
about this terminology, see About address translation for VMSAv8-32 on page G4-4026. 





Note 


The descriptions in this section do not include the possibility that the attempted address translation generates a TLB 
conflict abort, as described in TLB conflict aborts on page G4-4091. 





MMU faults in AArch32 state on page G4-4118 describes the faults that an MMU fault-checking sequence can 
report. 


Figure G4-17 on page G4-4115 shows the process of fetching a descriptor from the translation table. For the 
top-level fetch for any translation, the descriptor is fetched only if the input address passes any required alignment 
check. As the figure shows, in an implementation that includes EL2, if the translation is stage 1 of the Non-secure 
PL1&0 translation regime, then the descriptor address is in the IPA address space, and is subject to a stage 2 
translation to obtain the required PA. This stage 2 translation requires a recursive entry to the fault checking 
sequence. 


Note 


Figure G4-17 on page G4-4115 and Figure G4-18 on page G4-4116 give an overview of the fault checking 
performed by the MMU. See AArch32 state prioritization of synchronous aborts from a single stage of address 
translation on page G4-4120 for the complete set of possible faults and their prioritization. 
Figure G4-17 Fetching the descriptor in a VMSAv8-32 translation table walk


Figure G4-18 shows the full VMSAv8-32 fault checking sequence, including the alignment check on the initial
access.


Figure G4-18 VMSAv8-32 fault checking sequence
Stage 2 fault on a stage 1 translation table walk


When an implementation that includes EL2 is operating in a Non-secure PL1 or PL0 mode, any memory access goes
through two stages of translation:


° Stage 1, from VA to IPA. 
° Stage 2, from IPA to PA. 


Note 


In a virtualized system that is using AArch32, typically, a Guest OS operating in a Non-secure PL1 mode defines 
the translation tables and translation table register entries controlling the Non-secure PL1&0 stage 1 translations. A 
Guest OS has no awareness of the stage 2 address translation, and therefore believes it is specifying translation table 
addresses in the physical address space. However, it actually specifies these addresses in its IPA space. Therefore, 
to support virtualization, translation table addresses for the Non-secure PL1&0 stage 1 translations are always 
defined in the IPA address space. 








On performing a translation table walk for the stage 1 translations, the descriptor addresses must be translated from
IPA to PA, using a stage 2 translation. This means that a memory access made as part of a stage 1 translation table
lookup might generate, on a stage 2 translation:

• A Translation fault, Access flag fault, or Permission fault.
• A synchronous external abort on the memory access.


If SCR.EA is set to 1, a synchronous external abort is taken to EL3, and if EL3 is using AArch32 it is taken to Secure
Monitor mode. Otherwise, these faults are reported as stage 2 memory aborts. When EL2 is using AArch32,
HSR.ISS[7] is set to 1, to indicate a stage 2 fault during a stage 1 translation table walk, and the part of the ISS field
that might contain details of the instruction is invalid. For more information see Use of the HSR on page G4-4137.
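
A hypervisor handler might test the HSR.ISS[7] indication as in the following sketch. It assumes that the ISS field occupies HSR[24:0], so that ISS[7] is bit 7 of the register value. The function name is hypothetical, and how the HSR value is read is outside the scope of the sketch.

```python
ISS_MASK = 0x1FFFFFF  # assumption: the ISS field is HSR[24:0]


def stage2_fault_on_stage1_walk(hsr: int) -> bool:
    """Return True if HSR.ISS[7] indicates a stage 2 fault on a stage 1 walk."""
    iss = hsr & ISS_MASK
    return bool((iss >> 7) & 1)
```

When this returns True, the handler should not rely on the part of the ISS that would otherwise describe the faulting instruction, because the text above states that it is invalid in this case.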


Alternatively, a memory access made as part of a stage 1 translation table lookup might target an area of memory
with any type of Device memory attribute assigned on the stage 2 translation of the address accessed. When the
value of the HCR.PTW bit is 1, such an access generates a stage 2 Permission fault.





Note 


• On most systems, such a mapping to a Device memory type on the stage 2 translation is likely to indicate a
Guest OS error, where the stage 1 translation table is corrupted. Therefore, it is appropriate to trap this access
to the hypervisor.


A TLB might hold entries that depend on the effect of HCR.PTW. Therefore, if HCR.PTW is changed without
changing the current VMID, the TLBs must be invalidated before executing in a Non-secure PL1 or PL0 mode. For
more information see Changing HCR.PTW on page G4-4098.


A cache maintenance instruction performed from a Non-secure PL1 mode can cause a stage 1 translation table walk
that might generate a stage 2 Permission fault, as described in this section. This is an exception to the general rule
that a cache maintenance instruction cannot generate a Permission fault.


G4.11.4 Alignment faults 


The ARM memory architecture requires support for strict alignment checking. This checking is controlled by 
SCTLR.A. In addition, some instructions do not support unaligned accesses, regardless of the value of SCTLR.A. 
Unaligned data access on page E2-2323 defines when Alignment faults are generated, for both values of SCTLR.A. 


An Alignment fault can occur on an access for which the stage of address translation is disabled. 
Any unaligned access to a memory region with any Device memory type attribute generates an Alignment fault.
Routing of aborts taken to AArch32 state on page G4-4110 defines the mode to which an Alignment fault is taken. 


The prioritization of Alignment faults depends on whether the fault was generated because of an access to a Device 
memory type, or for another reason. For more information see AArch32 state prioritization of synchronous aborts 
from a single stage of address translation on page G4-4120. 
G4.11.5 MMU faults in AArch32 state

This section describes the faults that might be detected during one of the fault-checking sequences described in The
MMU fault-checking sequence on page G4-4113. Unless indicated otherwise, information in this section applies to
the fault checking sequences for both the Short-descriptor translation table format and the Long-descriptor
translation table format.

MMU faults are always synchronous.

When an MMU fault generates an abort for a region of memory, no memory access is made if that region is or could
be marked as any type of Device memory.

The following subsections describe the MMU faults that might be detected during a fault checking sequence:

• Permission fault.
• Translation fault on page G4-4119.
• Address size fault on page G4-4119.
• Access flag fault on page G4-4119.
• Domain fault, Short-descriptor format translation tables only on page G4-4120.
• TLB conflict aborts on page G4-4091.

See also External abort on a translation table walk on page G4-4120.

Note 

° Although the TLB conflict abort is classified as an MMU fault, it is described in the section Translation 
Lookaside Buffers (TLBs) on page G4-4089. 

° In VMSAv8-64 an external abort on a translation table walk is classified as an MMU fault. However, in 
VMSAv8-32, for consistency with earlier versions of the architecture these aborts are not classified as MMU 
faults. 

Permission fault 

A Permission fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 

See Access permissions on page G4-4068 for information about conditions that cause a Permission fault. 

Note 

When using the Short-descriptor translation table format, the translation table descriptors are checked for
Permission faults only for accesses to memory regions in Client domains.

A TLB might hold a translation table entry that causes a Permission fault. Therefore, if the handling of a Permission
fault results in an update to the associated translation tables, the software that updates the translation tables must
invalidate the appropriate TLB entry, to prevent the stale information in the TLB being used on a subsequent
memory access. For more information, see the translation table entry update examples in Ordering and completion
of TLB maintenance instructions on page G4-4096.

Note

In an implementation that includes EL2, this maintenance requirement applies to Permission faults in both stage 1
and stage 2 translations.

Cache or branch predictor maintenance operations cannot cause a Permission fault, except that: 

• A stage 1 translation table walk performed as part of a cache or branch predictor maintenance operation can
generate a stage 2 Permission fault as described in Stage 2 fault on a stage 1 translation table walk on
page G4-4117.

• A DCIMVAC issued in Non-secure state that attempts to update a location for which it does not have stage 2
write access can generate a stage 2 Permission fault, as described in AArch32 data cache maintenance
instructions (DC*) on page G3-4001.


Translation fault


A Translation fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 
A Translation fault is generated if bits[1:0] of a translation table descriptor identify the descriptor as either a Fault 
encoding or a reserved encoding. For more information see: 


• VMSAv8-32 Short-descriptor translation table format descriptors on page G4-4041.
• VMSAv8-32 Long-descriptor translation table format descriptors on page G4-4050.


In addition, a Translation fault is generated if the input address for a translation either does not map onto an address 
range of a TTBR, or the TTBR range that it maps onto is disabled. In these cases the fault is reported as a level 1 
Translation fault on the translation stage at which the mapping to a region described by a TTBR failed. 


The architecture guarantees that any translation table entry that causes a Translation fault is not cached, meaning 
the TLB never holds such an entry. Therefore, when a Translation fault occurs, the fault handler does not have to 
perform any TLB maintenance instructions to remove the faulting entry. 


A data or unified cache maintenance instruction by VA can generate a Translation fault. It is IMPLEMENTATION 
DEFINED whether an instruction cache invalidate by VA operation can generate a Translation fault. 


It is IMPLEMENTATION DEFINED whether a branch predictor maintenance operation can generate a Translation fault. 


Address size fault 
An Address size fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 


An Address size fault is generated if the translation table entries or the TTBR for the stage of translation have 
nonzero address bits above the most significant bit of the maximum output address size. Because VMSAv8-32
supports a maximum PA and IPA size of 40 bits, this means any case where a translation table entry or the TTBR 
holds an address for which A[47:40] is nonzero generates an Address size fault. 
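
The A[47:40] condition can be expressed as a simple bit test. The following is a minimal sketch, in which the function name is hypothetical and the address is assumed to have already been extracted from the translation table entry or the TTBR as an integer.

```python
MAX_OUTPUT_ADDRESS_BITS = 40  # maximum PA and IPA size in VMSAv8-32


def address_size_fault(address: int) -> bool:
    """Return True if any of A[47:40] is nonzero (sketch of the check)."""
    upper_bits = (address >> MAX_OUTPUT_ADDRESS_BITS) & 0xFF  # A[47:40]
    return upper_bits != 0
```

For example, the largest 40-bit address does not fault in this sketch, while any address with bit 40 set does.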


A data or unified cache maintenance instruction by VA can generate an Address size fault. It is IMPLEMENTATION 
DEFINED whether an instruction cache invalidate by VA operation can generate an Address size fault. 


It is IMPLEMENTATION DEFINED whether a branch predictor maintenance operation can generate an Address size 
fault. 


The architecture guarantees that any translation table entry that causes an Address size fault is not cached, meaning 
the TLB never holds such an entry. Therefore, when an Address size fault occurs, the fault handler does not have to 
perform any TLB maintenance instructions to remove the faulting entry. 


Access flag fault 


An Access flag fault can be generated at any level of lookup, and the reported fault code identifies the lookup level. 
An Access flag fault is generated only if all of the following apply: 


• The translation tables support an Access flag bit:
— The Short-descriptor format supports an Access flag only when SCTLR.AFE is set to 1.
— The Long-descriptor format always supports an Access flag.
• A translation table descriptor with the Access flag bit set to 0 is loaded.
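
The two conditions above combine as in the following sketch. The function name and the "short" and "long" format labels are hypothetical, and the Access flag bit is assumed to have already been extracted from the loaded descriptor.

```python
def access_flag_fault(table_format: str, access_flag: int, sctlr_afe: int = 0) -> bool:
    """Return True if loading the descriptor generates an Access flag fault."""
    if table_format == "long":
        supports_access_flag = True            # always supported
    elif table_format == "short":
        supports_access_flag = sctlr_afe == 1  # only when SCTLR.AFE is 1
    else:
        raise ValueError("table_format must be 'short' or 'long'")
    # Both conditions must hold: the flag is supported, and it is 0.
    return supports_access_flag and access_flag == 0
```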


For more information about the Access flag bit see: 
• VMSAv8-32 Short-descriptor translation table format descriptors on page G4-4041.
• VMSAv8-32 Long-descriptor translation table format descriptors on page G4-4050.


The architecture guarantees that any translation table entry that causes an Access flag fault is not cached, meaning 
the TLB never holds such an entry. Therefore, when an Access flag fault occurs, the fault handler does not have to 
perform any TLB maintenance instructions to remove the faulting entry. 


Whether any cache maintenance instruction by VA can generate Access flag faults is IMPLEMENTATION DEFINED. 
Whether branch predictor invalidate by VA operations can generate Access flag faults is IMPLEMENTATION DEFINED. 


For more information, see The Access flag on page G4-4074. 
Domain fault, Short-descriptor format translation tables only


When using the Short-descriptor translation table format, a Domain fault can be generated at level 1 or level 2 of
lookup. The reported fault code identifies the lookup level. The conditions for generating a Domain fault are:


Level 1 When a level 1 descriptor fetch returns a valid Section level 1 descriptor, the domain field of that 
descriptor is checked against the DACR. A level 1 Domain fault is generated if this check fails. 


Level 2 When a level 2 descriptor fetch returns a valid level 2 descriptor, the domain field of the level 1 
descriptor that required the level 2 fetch is checked against the DACR, and a level 2 Domain fault 
is generated if this check fails. 


For more information, see Domains, Short-descriptor format only on page G4-4073. 
Domain faults cannot occur on cache or branch predictor maintenance operations. 
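
The check of a descriptor's domain field against the DACR can be sketched as follows. The sketch assumes the usual DACR layout of sixteen 2-bit fields, with 0b00 meaning No access, 0b01 Client, and 0b11 Manager. That encoding is not stated in this section, so treat it as an assumption, and the function names are hypothetical. The reserved encoding 0b10 is reported but not modeled further.

```python
DOMAIN_ACCESS = {0b00: "no access", 0b01: "client", 0b10: "reserved", 0b11: "manager"}


def domain_access(dacr: int, domain: int) -> str:
    """Decode the 2-bit DACR field for the given domain number (0 to 15)."""
    if not 0 <= domain <= 15:
        raise ValueError("domain must be in the range 0 to 15")
    return DOMAIN_ACCESS[(dacr >> (2 * domain)) & 0b11]


def domain_check(dacr: int, domain: int) -> str:
    """Return the outcome of the domain check for one descriptor (sketch)."""
    access = domain_access(dacr, domain)
    if access == "no access":
        return "domain fault"
    if access == "client":
        return "check access permissions"  # permissions are checked for Client only
    return access                          # "manager" or "reserved"
```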


A TLB might hold a translation table entry that causes a Domain fault. Therefore, if the handling of a Domain fault
results in an update to the associated translation tables, the software that updates the translation tables must 
invalidate the appropriate TLB entry, to prevent the stale information in the TLB being used on a subsequent 
memory access. For more information, see the translation table entry update examples in Ordering and completion 
of TLB maintenance instructions on page G4-4096. 


Any change to the DACR must be synchronized by a Context synchronization event. For more information see 
Synchronization of changes to AArch32 System registers on page G4-4163. 


G4.11.6 External abort on a translation table walk


An external abort on a translation table walk can be either synchronous or asynchronous. For more information on 
external aborts, see External aborts on page G3-4014. 


An external abort on a translation table walk is reported: 

• If the external abort is synchronous, using:
— A synchronous Prefetch Abort exception if the translation table walk is for an instruction fetch.
— A synchronous Data Abort exception if the translation table walk is for a data access.


• If the external abort is asynchronous, using an SError interrupt, which is taken as an asynchronous Data Abort
exception.


If an implementation reports the error in the translation table walk asynchronously from executing the instruction 
whose instruction fetch or memory access caused the translation table walk, these aborts behave essentially as 
interrupts. The aborts are masked when PSTATE.A is set to 1, otherwise they are reported using the Data Abort 
exception. 
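
The reporting rules above can be summarized in a short sketch. The function name and the returned strings are hypothetical labels, and a return value of None stands for an asynchronous abort that is currently masked by PSTATE.A.

```python
from typing import Optional


def report_walk_external_abort(synchronous: bool,
                               for_instruction_fetch: bool,
                               pstate_a: int = 0) -> Optional[str]:
    """Return the exception used to report an external abort on a table walk."""
    if synchronous:
        # Reported against the access that required the walk.
        return "Prefetch Abort" if for_instruction_fetch else "Data Abort"
    # Asynchronous: an SError interrupt, masked when PSTATE.A is 1.
    if pstate_a == 1:
        return None
    return "Data Abort"
```

Note that in this sketch an asynchronous abort is reported as a Data Abort exception even when the walk was for an instruction fetch, which matches the text above.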


Behavior of external aborts on a translation table walk caused by address translation 
instructions 


The address translation instructions summarized in Address translation instructions, functional group on
page G4-4204 require translation table walks. An external abort can occur in the translation table walk. The abort
generates a Data Abort exception, and can be synchronous or asynchronous. For more information, see Handling of
faults and aborts during an address translation instruction on page G4-4146.


G4.11.7 AArch32 state prioritization of synchronous aborts from a single stage of address translation


Exception prioritization for exceptions taken to AArch32 state on page G1-3816 describes the prioritization of 
exceptions taken from an Exception level that is using AArch32. This section gives additional information about 
the prioritization of MMU faults from VMSAv8-32 translation regimes. 


If a single instruction generates aborts on more than one memory access, the architecture does not define any 
prioritization between those aborts. 


In general, the ARM architecture does not define when asynchronous events are taken, and therefore the 
prioritization of asynchronous events is IMPLEMENTATION DEFINED. 
Note


The priority numbering in this list only shows the relative priorities of aborts from a single stage of address 
translation in a VMSAv8-32 translation regime. This numbering has no global significance and, for example, does 
not correlate with the equivalent AArch64 list in AArch64 state prioritization of synchronous aborts from a single 
stage of address translation on page D4-1807. 








For a single stage of translation in a VMSAv8-32 translation regime, the following numbered list shows the priority
of the possible memory management faults on a memory access. In this list:


• For memory accesses that undergo two stages of translation, the italic entries show where the faults from the
stage 2 translation can occur. A stage 2 fault within a stage 1 translation table walk follows the same
prioritization of faults.


• For synchronous external aborts from translation table walks see also Synchronous external abort errors from
address translation caching structures on page G4-4122.


The priority order, from highest priority to lowest priority, is: 
1. Alignment fault not caused by memory type. This is possible for a stage 1 translation only.

2. Translation fault due to the input address being out of the address range to be translated or requiring an
AArch32 TTBR that is disabled. This includes VTCR.SL0 being inconsistent with VTCR.T0SZ or
programmed to a reserved value.

3. Address size fault on an AArch32 TTBR caused by the physical address being out of the range implemented.

4. Second stage abort on a level 1 lookup of a stage 1 table walk. When stage 2 address translation is enabled
this includes an Address size fault caused by the physical address being out of the range implemented. This
is a second stage abort during a first stage translation table walk.

5. Synchronous parity or ECC error on a level 1 lookup of a translation table walk.

6. Synchronous external abort on a level 1 lookup of a translation table walk.

7. Translation fault on a level 1 translation table entry.

8. Address size fault on a level 1 lookup translation table entry caused by the output address being out of the
range implemented.

9. Second stage abort on a level 2 lookup of a stage 1 table walk. When stage 2 address translation is enabled
this includes an Address size fault caused by the physical address being out of the range implemented. This
is a second stage abort during a first stage translation table walk.

10. Synchronous parity or ECC error on a level 2 lookup of a translation table walk.

11. Synchronous external abort on a level 2 lookup of a translation table walk.

12. Translation fault on a level 2 translation table entry.

13. Address size fault on a level 2 lookup translation table entry caused by the output address being out of the
range implemented.

14. Second stage abort on a level 3 lookup of a stage 1 table walk. When stage 2 address translation is enabled
this includes an Address size fault caused by the physical address being out of the range implemented. This
is a second stage abort during a first stage translation table walk.

15. Synchronous parity or ECC error on a level 3 lookup of a translation table walk.

16. Synchronous external abort on a level 3 lookup of a translation table walk.

17. Translation fault on a level 3 translation table entry.

18. Address size fault on a level 3 lookup translation table entry caused by the output address being out of the
range implemented.

19. Access flag fault.

20. Alignment fault caused by the memory type.

21. Domain fault.

Note

Domain faults are possible only when using the VMSAv8-32 Short-descriptor translation table format, see
Domain fault, Short-descriptor format translation tables only on page G4-4120.

22. Permission fault.

23. A fault from the stage 2 translation of the memory access. When stage 2 address translation is enabled this
includes an Address size fault caused by the physical address being out of the range implemented.

24. Synchronous parity or ECC error on the memory access.

25. Synchronous External Abort on the memory access.


Note


• The prioritization of TLB Conflict aborts is IMPLEMENTATION DEFINED, as the exact cause of these aborts
depends on the form of TLBs implemented. However, the TLB conflict abort must have higher priority than
any abort that depends on a value held in the TLB.


• The prioritization of IMPLEMENTATION DEFINED MMU faults for a Load-Exclusive or Store-Exclusive to an
unsupported memory type is IMPLEMENTATION DEFINED.





See also The MMU fault-checking sequence on page G4-4113. 
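
The 25-entry priority order for a single stage of translation can be modeled as an ordered list, with the reported fault being the pending one that appears earliest. This is only a sketch: the short fault labels are hypothetical abbreviations of the list entries, and TLB conflict aborts and IMPLEMENTATION DEFINED faults are not modeled because their priority is IMPLEMENTATION DEFINED.

```python
# Entries 1 to 3 of the priority list.
PRIORITY = [
    "alignment (not memory type)",
    "translation (input address range or TTBR disabled)",
    "address size (TTBR)",
]
# Entries 4 to 18: the same five checks repeat for each level of lookup.
for level in (1, 2, 3):
    PRIORITY += [
        f"stage 2 abort on level {level} walk",
        f"parity or ECC error on level {level} walk",
        f"external abort on level {level} walk",
        f"translation level {level}",
        f"address size level {level}",
    ]
# Entries 19 to 25.
PRIORITY += [
    "access flag",
    "alignment (memory type)",
    "domain",
    "permission",
    "stage 2 fault on access",
    "parity or ECC error on access",
    "external abort on access",
]


def highest_priority(pending):
    """Return the pending fault that appears earliest in the priority order."""
    return min(pending, key=PRIORITY.index)
```

For example, if both a Domain fault and a Permission fault apply to an access, this sketch selects the Domain fault, because it is entry 21 and the Permission fault is entry 22.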


Synchronous external abort errors from address translation caching structures 


A caching structure used for caching translation table walks might support: 


• An arbitrary number of levels of translation table lookup.

• One or more stages of translation, that might not correspond to the stages of an address translation lookup.

This might mean that, on a synchronous external abort arising from the caching structure, such as from a parity or
ECC error, the PE cannot precisely determine one or both of the translation stage and level of lookup at which the
error occurred. In this case:

• If the PE cannot determine precisely the translation stage at which the error occurred, it is reported and
  prioritized as a stage 1 error.

• If the PE cannot determine precisely the lookup level at which the error occurred, the level is reported and
  prioritized as either:

  — The lowest-numbered level that could have given rise to the error.

  — Level 1 if the PE cannot determine any information about the level.
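
The two fallback rules above can be sketched as a small Python model. This is only an illustration: the function name and the idea of passing a set of candidate levels are assumptions made for this sketch, not part of the architecture.

```python
from typing import Iterable, Optional, Tuple


def reported_stage_and_level(
    known_stage: Optional[int],
    candidate_levels: Iterable[int],
) -> Tuple[int, int]:
    """Return the (stage, level) reported for an abort from a walk cache.

    known_stage is 1 or 2 when the PE can determine the stage, else None.
    candidate_levels holds every lookup level that could have given rise
    to the error. It is empty when the PE has no level information.
    """
    # An undeterminable stage is reported and prioritized as stage 1.
    stage = known_stage if known_stage is not None else 1
    levels = sorted(candidate_levels)
    # With no information report level 1, otherwise report the
    # lowest-numbered level that could have given rise to the error.
    level = levels[0] if levels else 1
    return stage, level
```

A precisely known level is simply the case where `candidate_levels` holds a single value.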







Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




G4.12 Exception reporting in a VMSAv8-32 implementation 


This section describes exception reporting, in AArch32 state, in a VMSAv8-32 implementation. That is, it describes
only the reporting of exceptions that are taken to an Exception level that is using AArch32. EL2 provides an 
enhanced reporting mechanism for exceptions taken to the Non-secure EL2 mode, Hyp mode. This means that, for 
VMSAv8-32, the exception reporting depends on the mode to which the exception is taken. 





Note 


The enhanced reporting mechanism for exceptions that are taken to Hyp mode is generally similar to the reporting 
of exceptions that are taken to an Exception level that is using AArch64. 





About exception reporting introduces the general approach to exception reporting, and the following sections then 
describe exception reporting at different privilege levels: 


• Reporting exceptions taken to PL1 modes on page G4-4124.
• Fault reporting in PL1 modes on page G4-4127.
• Summary of register updates on faults taken to PL1 modes on page G4-4131.
• Reporting exceptions taken to Hyp mode on page G4-4133.
• Use of the HSR on page G4-4137.
• Summary of register updates on exceptions taken to Hyp mode on page G4-4140.

Note

The registers used for exception reporting also report information about debug exceptions. For more information
see:

• Data Abort exceptions, taken to a PL1 mode on page G4-4125.
• Prefetch Abort exceptions, taken to a PL1 mode on page G4-4127.
• Reporting exceptions taken to Hyp mode on page G4-4133.





G4.12.1 About exception reporting 


In an implementation that includes EL2 and EL3, exceptions can be taken to: 
• Monitor mode, if EL3 is using AArch32.
• Hyp mode, if EL2 is using AArch32.
• A Secure or Non-secure PL1 mode.

Monitor mode is a PL1 mode, but:

• It is accessible only when EL3 is using AArch32.
• It is present only in Secure state.
• When EL3 is using AArch32, System register controls route some exceptions from Non-secure state to
  Monitor mode. These are the only cases where taking an exception to an Exception level that is using
  AArch32 changes the Security state of the PE.


Exception reporting in Hyp mode differs significantly from that in the other modes, but in general, exception 
reporting returns: 
• Information about the exception:

  — On taking an exception to Hyp mode, the Hyp Syndrome Register, HSR, returns syndrome
    information.

  — On taking an exception to any other mode, a Fault Status Register (FSR) returns status information.

• For synchronous exceptions, one or more addresses associated with the exceptions, returned in Fault Address
  Registers (FARs). For a permitted exception to this requirement see Fault address reporting on synchronous
  external aborts on page G4-4124.


In all modes, additional IMPLEMENTATION DEFINED registers can provide additional information about exceptions. 

Note

• PE mode for taking exceptions on page G1-3822 describes how the mode to which an exception is taken is
  determined.

• EL2 provides:

  — Specific exception types, that can only be taken from Non-secure PL1 and PL0 modes, and are always
    taken to Hyp mode.

  — Routing controls that can route some exceptions from Non-secure PL1 and PL0 modes to Hyp mode.

  These exceptions are reported using the same mechanism as the Hyp mode reporting of VMSAv8-32 memory
  aborts, as described in this section.





Memory system faults generate either a Data Abort exception or a Prefetch Abort exception, as summarized in: 
• Reporting exceptions taken to PL1 modes.
• Memory fault reporting in Hyp mode on page G4-4135.


On an access that might have multiple aborts, the MMU fault checking sequence and the prioritization of aborts 
determine which abort occurs. For more information, see The MMU fault-checking sequence on page G4-4113 and 
AArch32 state prioritization of synchronous aborts from a single stage of address translation on page G4-4120. 


Fault address reporting on synchronous external aborts 


The general architectural requirement is that, on a synchronous abort, the faulting address is recorded in a Fault 
Address Register (FAR). This requirement is relaxed for the case of a synchronous external abort that is not a 
synchronous external abort on a translation table walk. In this case only: 


° It is IMPLEMENTATION DEFINED whether the faulting address is recorded in a FAR. 


° A bit in a fault reporting register, the FnV bit, indicates whether a valid address is recorded. 


For exceptions taken to an Exception level that is using AArch32, the details of this reporting depend on whether 
the exception is taken to: 


• A PL1 mode, as described in Reporting exceptions taken to PL1 modes.
• Hyp mode, as described in Reporting exceptions taken to Hyp mode on page G4-4133.


G4.12.2 Reporting exceptions taken to PL1 modes 


The following sections give general information about the reporting of exceptions when they are taken to a Secure 
or Non-secure PL1 mode: 


• Registers used for reporting exceptions taken to PL1 modes on page G4-4125.
• Data Abort exceptions, taken to a PL1 mode on page G4-4125.
• Prefetch Abort exceptions, taken to a PL1 mode on page G4-4127.

Fault reporting in PL1 modes on page G4-4127 then describes the fault reporting in these modes, including the
encodings used for reporting the faults.


Note 


Security state, Exception levels, and AArch32 execution privilege on page G1-3792 describes how the Secure and 
Non-secure PL1 modes map onto the Exception levels. 

Registers used for reporting exceptions taken to PL1 modes

AArch32 state defines the following registers, and register encodings, for exceptions taken to PL1 modes:

• The DFSR holds information about a Data Abort exception.
• The DFAR holds the faulting address for some synchronous Data Abort exceptions.
• The IFSR holds information about a Prefetch Abort exception.
• The IFAR holds the faulting address for some synchronous Prefetch Abort exceptions.


In addition, if implemented, the optional ADFSR and AIFSR can provide additional fault information, see Auxiliary 
Fault Status Registers. 


Auxiliary Fault Status Registers 


AArch32 state defines the following Auxiliary Fault Status Registers: 
• The Auxiliary Data Fault Status Register, ADFSR.
• The Auxiliary Instruction Fault Status Register, AIFSR.


The position of these registers is architecturally-defined, but the content and use of the registers is IMPLEMENTATION 
DEFINED. An implementation can use these registers to return additional fault status information. An example use 
of these registers is to return more information for diagnosing parity or ECC errors. 


An implementation that does not need to report additional fault information must implement these registers as 
UNK/SBZP. This ensures that an attempt to access these registers from software executing at PL1 does not cause 
an Undefined Instruction exception. 


Data Abort exceptions, taken to a PL1 mode

On taking a Data Abort exception to a PL1 mode:

• If the exception is on an instruction cache or branch predictor maintenance operation by VA, its reporting
  depends on the value of TTBCR.EAE. For more information about the registers used when reporting the
  exception, see Data Abort on an instruction cache or branch predictor maintenance instruction by VA on
  page G4-4126.

• Otherwise, the DFSR is updated with details of the fault, including the appropriate fault status code. If the
  Data Abort exception is synchronous, DFSR.WnR is updated to indicate whether the faulted instruction was
  a read or a write. However, if the fault is on a cache maintenance instruction, or on an address translation
  instruction, WnR is set to 1, to indicate a fault on a write instruction, and the CM bit is set to 1.

  See the register description for more information about the returned fault information. See also Data Abort
  on a Watchpoint exception on page G4-4126.

  If the Data Abort exception is:

  — Synchronous, the DFAR is updated with the VA that caused the exception, but see Fault address
    reporting on synchronous external aborts on page G4-4124 for a permitted exception to this
    requirement.

  — Asynchronous, the DFAR becomes UNKNOWN.

  DFSR.WnR and DFSR.CM are UNKNOWN on an asynchronous Data Abort exception.


For all Data Abort exceptions, if the implementation includes EL3, the Security state of the PE in the mode to which 
the Data Abort exception is taken determines whether the Secure or Non-secure DFSR and DFAR are updated. 
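
As an illustration of the WnR and CM rules described above, the following Python sketch models the values these two DFSR bits take. The function name and its parameters are hypothetical. The value 0 for CM on an ordinary synchronous access is an assumption taken from the usual meaning of the bit, since the text above states only when CM is set to 1.

```python
from typing import Dict, Optional

UNKNOWN = None  # stands in for an architecturally UNKNOWN value


def dfsr_wnr_cm(synchronous: bool, is_write: bool,
                cache_maint_or_at: bool) -> Dict[str, Optional[int]]:
    """Model DFSR.WnR and DFSR.CM for a Data Abort taken to a PL1 mode."""
    if not synchronous:
        # Both bits are UNKNOWN on an asynchronous Data Abort exception.
        return {"WnR": UNKNOWN, "CM": UNKNOWN}
    if cache_maint_or_at:
        # Cache maintenance and address translation instructions are
        # reported as a fault on a write, with CM set to 1.
        return {"WnR": 1, "CM": 1}
    # Assumption: CM is 0 when the abort is not from cache maintenance.
    return {"WnR": 1 if is_write else 0, "CM": 0}
```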

Data Abort on an instruction cache or branch predictor maintenance instruction by VA

If an instruction cache invalidation by VA or branch predictor invalidation by VA operation generates a Data Abort
exception that is taken to a PL1 mode, the DFAR is updated to hold the faulting VA. However, the reporting of the
fault depends on the value of TTBCR.EAE:

TTBCR.EAE == 0
    When the value of TTBCR.EAE is 0, it is IMPLEMENTATION DEFINED which of the following is used
    when reporting the fault:
    • The DFSR indicates an Instruction cache maintenance instruction fault, and the IFSR is valid
      and indicates the cause of the fault, a Translation fault or Access flag fault.
    • The DFSR indicates the cause of the fault, a Translation fault or Access flag fault. The IFSR
      is UNKNOWN.
    In either case:
    • DFSR.WnR is set to 1.
    • DFSR.CM is set to 1, to indicate a fault on a cache maintenance instruction.

TTBCR.EAE == 1
    When the value of TTBCR.EAE is 1:
    • DFSR.CM is set to 1, to indicate a fault on a cache maintenance instruction.
    • DFSR.STATUS indicates the cause of the fault, a Translation or Access flag fault.
    • DFSR.WnR is set to 1.
    • The IFSR is UNKNOWN.
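
The two TTBCR.EAE cases can be summarized in a short Python sketch. The function, its dictionary keys, and the `dfsr_reports_icache_fault` flag (which stands for the IMPLEMENTATION DEFINED choice available when TTBCR.EAE is 0) are illustrative names chosen for this sketch only.

```python
def icache_maintenance_fault_report(eae: int, cause: str,
                                    dfsr_reports_icache_fault: bool = False) -> dict:
    """Model the registers written for a fault on an instruction cache or
    branch predictor maintenance instruction by VA, taken to a PL1 mode."""
    if eae not in (0, 1):
        raise ValueError("TTBCR.EAE is a single bit")
    if cause not in ("Translation fault", "Access flag fault"):
        raise ValueError("cause must be a Translation or Access flag fault")
    # Common to every case: DFAR holds the VA, WnR and CM are set to 1.
    report = {"DFAR": "faulting VA", "DFSR.WnR": 1, "DFSR.CM": 1}
    if eae == 0 and dfsr_reports_icache_fault:
        # First permitted option when EAE is 0: the cause is in the IFSR.
        report["DFSR"] = "Instruction cache maintenance instruction fault"
        report["IFSR"] = cause
    else:
        # Second option when EAE is 0, and the only option when EAE is 1.
        report["DFSR"] = cause
        report["IFSR"] = "UNKNOWN"
    return report
```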


Data Abort on a Watchpoint exception 


On taking a Data Abort exception caused by a watchpoint: 
• DFSR.FS is updated to indicate a debug exception.
• DFSR.{WnR, Domain} are UNKNOWN.
• DFAR is set to the address that generated the watchpoint.

Note

• LR_abt indicates the address of the instruction that triggered the watchpoint.

• In some ARMv7 AArch32 implementations, the DBGWFAR is set to the address of the instruction that
  triggered the watchpoint. In ARMv8 this register is RES0.





A watchpointed address can be any byte-aligned address. The address reported in DFAR might not be the 
watchpointed address, and: 


• For a watchpoint due to an operation other than a Data Cache maintenance instruction, can be any address
  between and including:

  — The lowest address accessed by the instruction that triggered the watchpoint.
  — The highest watchpointed address accessed by that instruction.

  If multiple watchpoints are set in this range, there is no guarantee of which watchpoint is generated.

  The address must also be within a naturally-aligned block of memory of an IMPLEMENTATION DEFINED
  power-of-two size, containing a watchpoint address accessed by that location.

  Note

  — In particular, there is no guarantee of generating the watchpoint with the lowest address in the range.

  — The IMPLEMENTATION DEFINED power-of-two size must be no larger than the block size of the
    AArch64 DC ZVA operation.

• For a watchpoint due to a Data Cache operation, the address is the address passed to the instruction. This
  might be an address that is above the watchpointed location.
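
For the case of a watchpoint that is not due to a Data Cache maintenance instruction, the two constraints on the reported address can be checked with a sketch like the following. All names are hypothetical, and `block_size` stands for the IMPLEMENTATION DEFINED power-of-two size, which a real implementation fixes.

```python
from typing import Iterable


def dfar_is_permitted(dfar: int, lowest_accessed: int,
                      watchpointed_accessed: Iterable[int],
                      block_size: int) -> bool:
    """Check a candidate DFAR value against the rules described above.

    watchpointed_accessed lists the watchpointed byte addresses that the
    triggering instruction accessed.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    hits = list(watchpointed_accessed)
    if not hits:
        return False
    # Rule 1: between the lowest address accessed by the instruction and
    # the highest watchpointed address it accessed, inclusive.
    in_range = lowest_accessed <= dfar <= max(hits)
    # Rule 2: the naturally-aligned block containing dfar must also
    # contain a watchpointed address accessed by the instruction.
    block_base = dfar & ~(block_size - 1)
    in_block = any(block_base <= a < block_base + block_size for a in hits)
    return in_range and in_block
```

For example, with a 16-byte access starting at 0x1000 and a single watchpointed byte at 0x1008, the address 0x1000 is permitted for a block size of 16 but not for a block size of 8.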


Prefetch Abort exceptions, taken to a PL1 mode 


For a Prefetch Abort exception generated by an instruction fetch, the Prefetch Abort exception is taken 
synchronously with the instruction that the abort is reported on. This means: 


• If the PE attempts to execute the instruction, a Prefetch Abort exception is generated.

• If an instruction fetch is issued but the PE does not attempt to execute the prefetched instruction, no Prefetch
  Abort exception is generated for that instruction. For example, if the execution flow branches round a
  prefetched instruction, no Prefetch Abort exception is generated.

In addition, Breakpoint Instruction, Breakpoint, and Vector Catch exceptions generate a Prefetch Abort exception.
See the following for more information:

• Exception syndrome information and preferred return address on page G2-3936 for Breakpoint Instruction
  exceptions.
• Exception syndrome information and preferred return address on page G2-3958 for Breakpoint exceptions.
• Exception syndrome information and preferred return address on page G2-3980 for Vector Catch exceptions.


On taking a Prefetch Abort exception to a PL1 mode: 


• The IFSR is updated with details of the fault, including the appropriate fault code. If appropriate, the fault
  code indicates that the exception was generated by a debug exception.
  See the register description for more information about the returned fault information.

• For a Prefetch Abort exception generated by an instruction fetch, the IFAR is updated with the VA that caused
  the exception, but see Fault address reporting on synchronous external aborts on page G4-4124 for a
  permitted exception to this requirement.

• For a Prefetch Abort exception generated by a debug exception, the IFAR is UNKNOWN.


If the implementation includes EL3, the security state of the PE in the mode to which it takes the Prefetch Abort 
exception determines whether the exception updates the Secure or Non-secure IFSR and IFAR. 


G4.12.3 Fault reporting in PL1 modes 


The FSRs provide fault information, including an indication of the fault that occurred. The following subsections 
describe fault reporting in PL1 modes for each of the translation table formats: 


• PL1 fault reporting with the Short-descriptor translation table format on page G4-4128.
• PL1 fault reporting with the Long-descriptor translation table format on page G4-4129.


Reserved encoding in the IFSR and DFSR encodings tables on page G4-4131 gives some additional information 
about the encodings for both formats. 


Summary of register updates on faults taken to PL1 modes on page G4-4131 shows which registers are updated on 
each of the reported faults. 


Reporting of External aborts taken from Non-secure state to Monitor mode describes how the fault status register 
format is determined for those aborts. For all other aborts, the current translation table format determines the format 
of the fault status registers. 


Reporting of External aborts taken from Non-secure state to Monitor mode 


When an External abort is taken from Non-secure state to Monitor mode: 





• For a Data Abort exception, the Secure DFSR and DFAR hold information about the abort.
• For a Prefetch Abort exception, the Secure IFSR and IFAR hold information about the abort.
• The abort does not affect the contents of the Non-secure copies of the fault reporting registers.

Normally, the current translation table format determines the format of the DFSR and IFSR. However, when
SCR.EA is set to 1, to route external aborts to Monitor mode, and an external abort is taken from Non-secure state,
this section defines the DFSR and IFSR format.

For an External abort taken from Non-secure state to Monitor mode, the DFSR or IFSR uses the format associated
with the Long-descriptor translation table format, as described in PL1 fault reporting with the Long-descriptor
translation table format on page G4-4129, if any of the following applies:

• The Secure TTBCR.EAE bit is set to 1.

• The External abort is synchronous and either:

  — It is taken from Hyp mode.

  — It is taken from a Non-secure PL1 or PL0 mode, and the Non-secure TTBCR.EAE bit is set to 1.

Otherwise, the DFSR or IFSR uses the format associated with the Short-descriptor translation table format, as
described in PL1 fault reporting with the Short-descriptor translation table format.
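
The format selection rule above reduces to a small predicate. The following Python sketch restates it. The function and parameter names are chosen for illustration, and `taken_from_hyp` being false is taken to mean the abort came from a Non-secure PL1 or PL0 mode.

```python
def monitor_abort_uses_long_descriptor_fsr(secure_eae: int,
                                           synchronous: bool,
                                           taken_from_hyp: bool,
                                           nonsecure_eae: int) -> bool:
    """Return True if an External abort taken from Non-secure state to
    Monitor mode is reported using the Long-descriptor FSR format."""
    # The Secure TTBCR.EAE bit selects the Long-descriptor format outright.
    if secure_eae == 1:
        return True
    # Otherwise only a synchronous abort can select it, either because it
    # came from Hyp mode or because the Non-secure TTBCR.EAE bit is 1.
    if synchronous and (taken_from_hyp or nonsecure_eae == 1):
        return True
    # In every other case the Short-descriptor format is used.
    return False
```

Note that for an asynchronous abort only the Secure TTBCR.EAE bit matters.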


PL1 fault reporting with the Short-descriptor translation table format 


This subsection describes the fault reporting for a fault taken to a PL1 mode when address translation is using the
Short-descriptor translation table format.

On taking an exception, bit[9] of the FSR is RAZ, or set to 0, if the PE is using this FSR format.

An FSR encodes the fault in a 5-bit FS field, that comprises FSR[10, 3:0]. Table G4-24 shows the encoding of that
field. Summary of register updates on faults taken to PL1 modes on page G4-4131 shows:

• Whether the corresponding FAR is updated on the fault. That is:
  — For a fault reported in the IFSR, whether the IFAR holds a valid address.
  — For a fault reported in the DFSR, whether the DFAR holds a valid address.
• For faults that update DFSR, whether DFSR.Domain is valid.

When reading Table G4-24:

• FS values not shown in the table are reserved.
• FS values shown as DFSR only are reserved for the IFSR.


Table G4-24 FSR encodings when using the Short-descriptor translation table format

FS         Source                                                        Level    Notes
00001      Alignment fault                                               -        DFSR only. Fault on initial lookup
00100      Fault on instruction cache maintenance                        -        DFSR only
01100      Synchronous external abort on translation table walk(a)       Level 1  -
01110                                                                    Level 2  -
11100      Synchronous parity or ECC error on translation table walk(a)  Level 1  -
11110                                                                    Level 2  -
00101      Translation fault(a)                                          Level 1  MMU fault
00111                                                                    Level 2
00011(b)   Access flag fault(a)                                          Level 1  MMU fault
00110                                                                    Level 2
01001      Domain fault(a)                                               Level 1  MMU fault
01011                                                                    Level 2
01101      Permission fault(a)                                           Level 1  MMU fault
01111                                                                    Level 2
00010      Debug exception                                               -        See Chapter G2 AArch32 Self-hosted Debug
01000      Synchronous external abort                                    -        -
10000      TLB conflict abort                                            -        See TLB conflict aborts on page G4-4091
10100      IMPLEMENTATION DEFINED                                        -        Lockdown
10101      IMPLEMENTATION DEFINED                                        -        Unsupported Exclusive access
11001      Synchronous parity or ECC error on memory access              -        -
10110      SError interrupt(c)                                           -        DFSR only
11000      SError interrupt(c) from a parity or ECC error on memory      -        DFSR only
           access

a. See The level associated with MMU faults on a Short-descriptor translation table lookup.

b. Previously, this encoding was a deprecated encoding for Alignment fault. The extensive changes in the memory model in VMSAv8-32 mean
   there should be no possibility of confusing the new use of this encoding with its previous use.

c. Including asynchronous external abort on a data access, a translation table walk, or an instruction fetch.
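
As a sketch of how software might read Table G4-24, the following Python fragment extracts the FS field from FSR[10, 3:0] and looks up the fault source. The dictionary transcribes the table. The function names and the "Reserved" return string are assumptions made for this example.

```python
SHORT_DESCRIPTOR_FS = {
    0b00001: "Alignment fault",
    0b00100: "Fault on instruction cache maintenance",
    0b01100: "Synchronous external abort on translation table walk, level 1",
    0b01110: "Synchronous external abort on translation table walk, level 2",
    0b11100: "Synchronous parity or ECC error on translation table walk, level 1",
    0b11110: "Synchronous parity or ECC error on translation table walk, level 2",
    0b00101: "Translation fault, level 1",
    0b00111: "Translation fault, level 2",
    0b00011: "Access flag fault, level 1",
    0b00110: "Access flag fault, level 2",
    0b01001: "Domain fault, level 1",
    0b01011: "Domain fault, level 2",
    0b01101: "Permission fault, level 1",
    0b01111: "Permission fault, level 2",
    0b00010: "Debug exception",
    0b01000: "Synchronous external abort",
    0b10000: "TLB conflict abort",
    0b10100: "IMPLEMENTATION DEFINED (Lockdown)",
    0b10101: "IMPLEMENTATION DEFINED (Unsupported Exclusive access)",
    0b11001: "Synchronous parity or ECC error on memory access",
    0b10110: "SError interrupt",
    0b11000: "SError interrupt from a parity or ECC error on memory access",
}


def short_descriptor_fs(fsr: int) -> int:
    """Assemble the 5-bit FS field from FSR[10] and FSR[3:0]."""
    return ((fsr >> 10) & 0x1) << 4 | (fsr & 0xF)


def describe_short_descriptor_fault(fsr: int) -> str:
    """Name the fault source, or 'Reserved' for an unlisted FS value."""
    if (fsr >> 9) & 0x1:
        raise ValueError("bit[9] is 1: not the Short-descriptor FSR format")
    return SHORT_DESCRIPTOR_FS.get(short_descriptor_fs(fsr), "Reserved")
```

The sketch does not distinguish the DFSR from the IFSR, so it does not model the FS values that are reserved for the IFSR.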

The level associated with MMU faults on a Short-descriptor translation table lookup

The lookup level associated with a fault is:

• For a fault generated on a translation table walk, the lookup level of the walk being performed.

• For a Translation fault, the lookup level of the translation table that gave the fault. If a fault occurs because
  a stage of address translation is disabled, or because the input address is outside the range specified by the
  appropriate base address register or registers, the fault is reported as a level 1 fault.

• For an Access flag fault, Permission fault, or Domain fault, the lookup level of the final level of translation
  table accessed for the translation. That is, the lookup level of the translation table that returned a
  Supersection, Section, or Page descriptor.

Also see Synchronous external abort errors from address translation caching structures on page G4-4122.

The Domain field in the DFSR

The DFSR includes a Domain field. This is inherited from previous versions of the VMSA. The IFSR does not
include a Domain field. Summary of register updates on faults taken to PL1 modes on page G4-4131 describes when
DFSR.Domain is valid.

ARM deprecates any use of the Domain field in the DFSR. The Long-descriptor translation table format does not
support a Domain field, and future versions of the ARM architecture might not support a Domain field in the
Short-descriptor translation table format. ARM strongly recommends that new software does not use this field.

For both Data Abort exceptions and Prefetch Abort exceptions, software can find the domain information by
performing a translation table read for the faulting address and extracting the Domain field from the translation table
entry.

PL1 fault reporting with the Long-descriptor translation table format

This subsection describes the fault reporting for a fault taken to a PL1 mode when address translation is using the
Long-descriptor translation table format.

When the PE takes an exception, bit[9] of the FSR is set to 1 if the PE is using this FSR format.

The FSRs encode the fault in a 6-bit STATUS field, that comprises FSR[5:0]. Table G4-25 shows the encoding of
that field. In addition:

• For a fault taken to a PL1 mode, Summary of register updates on faults taken to PL1 modes on page G4-4131
  shows whether the corresponding FAR is updated on the fault. That is:

  — For a fault reported in the IFSR, whether the IFAR holds a valid address.
  — For a fault reported in the DFSR, whether the DFAR holds a valid address.

• For a fault taken to Hyp mode, Summary of register updates on exceptions taken to Hyp mode on
  page G4-4140 shows what registers are updated on the fault.


Table G4-25 FSR encodings when using the Long-descriptor translation table format

STATUS(a)  Source                                                            Notes
0000LL     Address size fault. LL bits indicate level(b).                    MMU fault
0001LL     Translation fault. LL bits indicate level(b).                     MMU fault
0010LL     Access flag fault. LL bits indicate level(b).                     MMU fault
0011LL     Permission fault. LL bits indicate level(b).                      MMU fault
010000     Synchronous external abort.                                       -
011000     Synchronous parity or ECC error on memory access.                 -
010001     SError interrupt(c).                                              DFSR only
011001     SError interrupt from a parity or ECC error on memory access.     DFSR only
0101LL     Synchronous external abort on translation table walk.             -
           LL bits indicate level(b).
0111LL     Synchronous parity or ECC error on memory access on translation   -
           table walk. LL bits indicate level(b).
100001     Alignment fault.                                                  Fault on initial lookup
100010     Debug exception.                                                  See Chapter G2 AArch32 Self-hosted Debug
110000     TLB conflict abort.                                               See TLB conflict aborts on page G4-4091
110100     IMPLEMENTATION DEFINED.                                           Lockdown, DFSR only
110101     IMPLEMENTATION DEFINED.                                           Unsupported Exclusive access
1111LL     Domain fault. LL bits indicate level.                             MMU fault. 64-bit PAR only, level 1 or level
                                                                             2 only. Never used in DFSR, IFSR, or HSR(d)

a. STATUS values not shown in this table are reserved. STATUS values not supported in the IFSR or DFSR are reserved for the register or
   registers in which they are not supported.

b. See The level associated with MMU faults on a Long-descriptor translation table lookup on page G4-4131.

c. Including asynchronous external abort on a data access, a translation table walk, or an instruction fetch.

d. A Domain fault can be reported using the Long-descriptor STATUS encodings only as a result of a fault on an address translation instruction.
   For more information see MMU fault on an address translation instruction on page G4-4146.

The level associated with MMU faults on a Long-descriptor translation table lookup

For MMU faults, Table G4-26 shows how the LL bits in the xFSR.STATUS field encode the lookup level associated
with the fault.

Table G4-26 Use of LL bits to encode the lookup level at which the fault occurred

LL bits    Meaning
00         Address size fault: Address size fault in TTBR0 or TTBR1.
           All other faults: Reserved.
01         Level 1.
10         Level 2.
11         Level 3. When xFSR.STATUS indicates a Domain fault, this value is reserved.





The lookup level associated with a fault is: 
• For a fault generated on a translation table walk, the lookup level of the walk being performed.

• For a Translation fault, the lookup level of the translation table that gave the fault. If a fault occurs because
  a stage of address translation is disabled, or because the input address is outside the range specified by the
  appropriate base address register or registers, the fault is reported as a level 1 fault.

• For an Access flag fault, the lookup level of the translation table that gave the fault.

• For a Permission fault, including a Permission fault caused by hierarchical permissions, the lookup level of
  the final level of translation table accessed for the translation. That is, the lookup level of the translation table
  that returned a Block or Page descriptor.


Also see Synchronous external abort errors from address translation caching structures on page G4-4122. 
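
The STATUS and LL encodings in Table G4-25 and Table G4-26 can be decoded with a sketch along the following lines. It covers only the fault families that the DFSR and IFSR report with LL bits. The function name and the returned strings are assumptions for this example, and the Domain fault encoding is left out because it is used only in the 64-bit PAR.

```python
from typing import Optional, Tuple

# STATUS[5:2] values of the fault families that carry LL bits in STATUS[1:0].
LL_FAULT_FAMILIES = {
    0b0000: "Address size fault",
    0b0001: "Translation fault",
    0b0010: "Access flag fault",
    0b0011: "Permission fault",
    0b0101: "Synchronous external abort on translation table walk",
    0b0111: "Synchronous parity or ECC error on memory access "
            "on translation table walk",
}


def decode_ll_fault(fsr: int) -> Optional[Tuple[str, str]]:
    """Return (fault family, level) for an LL-encoded STATUS value,
    or None when STATUS does not belong to one of those families."""
    if not (fsr >> 9) & 0x1:
        raise ValueError("bit[9] is 0: not the Long-descriptor FSR format")
    status = fsr & 0x3F              # STATUS is FSR[5:0]
    family = LL_FAULT_FAMILIES.get(status >> 2)
    if family is None:
        return None
    ll = status & 0b11
    if ll == 0b00:
        # LL == 00 is meaningful only for an Address size fault.
        if family == "Address size fault":
            return family, "TTBR0 or TTBR1"
        return family, "reserved"
    return family, f"level {ll}"
```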


Reserved encoding in the IFSR and DFSR encodings tables 


With both the Short-descriptor and the Long-descriptor FSR format, the fault encodings reserve a single encoding 
for Cache and TLB lockdown faults. The details of these faults and any associated subsidiary registers are 
IMPLEMENTATION DEFINED. 


G4.12.4 Summary of register updates on faults taken to PL1 modes 


For faults that generate exceptions that are taken to a PL1 mode, Table G4-27 on page G4-4132 shows the registers 
affected by each fault. In this table: 





• Yes indicates that the register is updated.
• UNK indicates that the fault makes the register value UNKNOWN.
• A null entry, -, indicates that the fault does not affect the register.

For faults that update the DFSR using the Short-descriptor format FSR encodings, Table G4-28 on page G4-4133
shows whether DFSR.Domain is valid.


Table G4-27 Effect of a fault taken to a PL1 mode on the reporting registers

Fault                                                             IFSR   IFAR         DFSR   DFAR
Faults reported as Prefetch Abort exceptions:
MMU fault, always synchronous                                     Yes    Yes          -      -
Synchronous external abort on translation table walk              Yes    Yes          -      -
Synchronous parity or ECC error on translation table walk         Yes    Yes          -      -
Synchronous external abort                                        Yes    IMP DEF(a)   -      -
Synchronous parity or ECC error on memory access                  Yes    Yes          -      -
TLB conflict abort                                                Yes    Yes          -      -
Faults reported as Data Abort exceptions:
Alignment fault, always synchronous                               -      -            Yes    Yes
MMU fault, always synchronous                                     -      -            Yes    Yes
Fault on instruction cache maintenance, when using
Long-descriptor translation table format(b)                       UNK    -            Yes    Yes
Fault on instruction cache maintenance, when using
Short-descriptor translation table format(c), either:             Yes    -            Yes    Yes
or:                                                               UNK    -            Yes    Yes
Synchronous external abort on translation table walk              -      -            Yes    Yes
Synchronous parity or ECC error on translation table walk         -      -            Yes    Yes
Synchronous external abort                                        -      -            Yes    IMP DEF(a)
Synchronous parity or ECC error on memory access                  -      -            Yes    Yes
SError interrupt                                                  -      -            Yes    UNK
SError interrupt from a parity or ECC error on memory access      -      -            Yes    UNK
TLB conflict abort                                                -      -            Yes    Yes
Debug exceptions:
Breakpoint, Breakpoint Instruction, or Vector Catch(d)            Yes    UNK          -      -
Watchpoint(e)                                                     -      -            Yes    Yes

a. IMPLEMENTATION DEFINED. The IFSR.FnV or DFSR.FnV bit indicates whether the register holds a valid address. See Fault address
   reporting on synchronous external aborts on page G4-4124.

b. When using the Long-descriptor translation table format, there is not a specific fault code for a fault on an instruction cache
   maintenance instruction. For more information see Data Abort on an instruction cache or branch predictor maintenance instruction
   by VA on page G4-4126.

c. The two lines of this entry show the alternative ways of reporting the fault when using the Short-descriptor translation table format.
   It is IMPLEMENTATION DEFINED which method is used, see Data Abort on an instruction cache or branch predictor maintenance
   instruction by VA on page G4-4126.

d. Generates a Prefetch Abort exception.

e. Generates a Data Abort exception.

For those faults for which Table G4-27 on page G4-4132 shows that the DFSR is updated, if the fault is reported
using the Short-descriptor FSR encodings, Table G4-28 shows whether DFSR.Domain is valid. In this table, UNK
indicates that the fault makes DFSR.Domain UNKNOWN.


Table G4-28 Validity of Domain field on faults that update the DFSR when using the Short-descriptor encodings

DFSR.FS    Source                                                        Level    DFSR.Domain  Notes
00001      Alignment fault                                               -        UNK          -
00100      Fault on instruction cache maintenance instruction            -        UNK          -
01100      Synchronous external abort on translation table walk          Level 1  UNK          -
01110                                                                    Level 2  Valid
11100      Synchronous parity or ECC error on translation table walk     Level 1  UNK          -
11110                                                                    Level 2  Valid
00101      Translation fault                                             Level 1  UNK          MMU fault
00111                                                                    Level 2  Valid
00011(a)   Access flag fault                                             Level 1  UNK          MMU fault
00110                                                                    Level 2  Valid
01001      Domain fault                                                  Level 1  Valid        MMU fault
01011                                                                    Level 2  Valid
01101      Permission fault                                              Level 1  UNK          MMU fault
01111                                                                    Level 2  UNK
01000      Synchronous external abort                                    -        UNK          -
10000      TLB conflict abort                                            -        UNK          -
11001      Synchronous parity or ECC error on memory access              -        UNK          -
10110      SError interrupt(b)                                           -        UNK          -
11000      SError interrupt from a parity or ECC error on memory access  -        UNK          -
00010      Watchpoint                                                    -        UNK          -

a. Previously, this encoding was a deprecated encoding for Alignment fault. The extensive changes in the memory model in VMSAv8-32
   mean there should be no possibility of confusing the new use of this encoding with its previous use.

b. Including asynchronous external abort on a data access, a translation table walk, or an instruction fetch.


G4.12.5 Reporting exceptions taken to Hyp mode


Hyp mode is the Non-secure EL2 mode. It is entered by taking an exception to Hyp mode. 


Note 





Software executing in Monitor mode, or at EL3 when EL3 is using AArch64, can perform an exception return to 
Hyp mode. This means Hyp mode can be entered either by taking an exception, or by a permitted exception return. 





When EL2 is using AArch32, the following exceptions are taken to Hyp mode: 


° SError interrupt exceptions, IRQ exceptions, and FIQ exceptions, from Non-secure PLO and PL1 modes, if 
not routed to Secure Monitor mode, and if configured by the AMO, FMO or IMO bits. For more information 
see Asynchronous exception routing controls on page G1-3841. 


° When HCR.TGE is set to 1, all exceptions that would be routed to Non-secure PL1 modes. 
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For more information, see Routing exceptions from Non-secure ELO to EL2 on page G1-3828. 


° When HDCR.TDE is set to 1, any debug exception that would otherwise be taken to a Non-secure PL1 mode, 
see Routing debug exceptions to EL2 on page G1-3833. 


. The privilege rules for taking exceptions mean that any exception taken from Hyp mode, if not routed to EL3, 
must be taken to Hyp mode. 


° Hypervisor Call exceptions, and Hyp Trap exceptions, are always taken to Hyp mode. These exceptions are 
supported only as part of EL2. 


When EL2 is implemented, various operations from Non-secure PLO and PL1 modes can be trapped to Hyp 
mode, using the Hyp Trap exception. For more information, see EL2 configurable controls on page G1-3894. 


These exceptions include any memory system fault that occurs: 

° On a memory access from Hyp mode. 

° On memory access from a Non-secure PL0 or PL1 mode:
— On a stage 2 translation, from IPA to PA.

— On the stage 2 translation of an address accessed in performing a stage 1 translation table walk.

Memory fault reporting in Hyp mode on page G4-4135 gives more information about these faults.

The following exceptions provide syndrome information in the HSR:
° Any synchronous exception taken to Hyp mode. 


° Some exceptions taken from Debug state that would be taken to Hyp mode if the PE was in Non-debug state, 
see Exceptions in Debug state on page H2-4874. 


Note 


— In Debug state, the PE does not change mode on taking an exception. 





— As Exceptions in Debug state on page H2-4874 describes, some other exceptions taken from Debug 
state make the HSR UNKNOWN. 





The syndrome information in the HSR includes the fault status code otherwise provided by the fault status register, 
and extends the fault reporting compared to that available for an exception taken to a PL1 mode. For more 
information, see Use of the HSR on page G4-4137. 


In addition, for a Debug exception taken to Hyp mode, DBGDSCRint.MOE or DBGDSCRext.MOE shows what
caused the Debug exception. This field is valid regardless of whether the Debug exception was taken from Hyp mode
or from another Non-secure mode.


Registers used for reporting exceptions taken to Hyp mode lists all of the registers used for exception reporting in 
Hyp mode. 


Registers used for reporting exceptions taken to Hyp mode 


The following registers are used for reporting exceptions taken to Hyp mode: 

° The HSR holds syndrome information for the exception. 

° The HDFAR holds the VA associated with a Data Abort exception. 

° The HIFAR holds the VA associated with a Prefetch Abort exception. 

° The HPFAR holds bits[39:12] of the IPA associated with some aborts on stage 2 address translations. 


In addition, if implemented, the optional HADFSR and HAIFSR can provide additional fault information, see Hyp 
Auxiliary Fault Syndrome Registers. 
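
As an illustration of how software might use the HPFAR value, the following C sketch recovers the page-aligned faulting IPA. It is only a sketch: the function name is hypothetical, and it assumes that IPA bits[39:12] are held in HPFAR bits[31:4], which is taken from the HPFAR register description and is not stated in this section.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the architecture.
 * Assumption: HPFAR[31:4] holds bits[39:12] of the faulting IPA.
 * HPFAR[3:0] are ignored. Returns the 4KB-aligned IPA. */
static inline uint64_t hpfar_to_ipa_page(uint32_t hpfar)
{
    uint64_t ipa_39_12 = (uint64_t)(hpfar >> 4) & 0x0FFFFFFFu;
    return ipa_39_12 << 12;
}
```

The HPFAR gives only the page address. Any use of the low-order 12 bits of the HDFAR or HIFAR to complete the address is a software decision, and is not meaningful for a fault on a stage 1 translation table walk.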


Hyp Auxiliary Fault Syndrome Registers 


EL2 also defines encodings for the following Hyp Auxiliary Fault Syndrome Registers: 
° The Hyp Auxiliary Data Fault Syndrome Register, HADFSR. 
° The Hyp Auxiliary Instruction Fault Syndrome Register, HAIFSR.

An implementation can use these registers to return additional fault status information for aborts taken to Hyp mode.
They are the Hyp mode equivalents of the registers described in Auxiliary Fault Status Registers on page G4-4125.
An example use of these registers is to return more information for diagnosing parity or ECC errors.


The architectural requirements for the HADFSR and HAIFSR are: 


° The position of these registers is architecturally-defined, but the content and use of the registers is 
IMPLEMENTATION DEFINED. 


° An implementation with no requirement for additional fault reporting can implement these registers as 
UNK/SBZP, but the architecture does not require it to do so. 


Memory fault reporting in Hyp mode 


Prefetch Abort and Data Abort exceptions taken to Hyp mode report memory faults. For these aborts, the HSR 
contains the following fault status information: 


° The HSR.EC field indicates the type of abort, as Table G4-29 shows. 


° The HSR.ISS field holds more information about the abort. In particular: 


— Bits[5:0] of this field hold the STATUS field for the abort, using the encodings defined in PL1 fault
reporting with the Long-descriptor translation table format on page G4-4129.


— Other subfields of the ISS give more information about the exception, equivalent to the information 
returned in the FSR for a memory fault reported at PL1. 


See the descriptions of the ISS fields for the memory faults, referenced from the Syndrome description 
column of Table G4-29, for information about the returned fault information. 


Table G4-29 HSR.EC encodings for aborts taken to Hyp mode

HSR.EC  Abort                                                 Syndrome description
0x20    Prefetch Abort taken from Non-secure PL0 or PL1 mode  ISS encoding for an exception from a Prefetch Abort on page G6-4406
0x21    Prefetch Abort taken from Hyp mode                    ISS encoding for an exception from a Prefetch Abort on page G6-4406
0x24    Data Abort taken from Non-secure PL0 or PL1 mode      ISS encoding for an exception from a Data Abort on page G6-4408
0x25    Data Abort taken from Hyp mode                        ISS encoding for an exception from a Data Abort on page G6-4408





For more information, see Use of the HSR on page G4-4137.

A Prefetch Abort exception is taken synchronously with the instruction that the abort is reported on. This means:

° If the PE attempts to execute the instruction, a Prefetch Abort exception is generated.


° If an instruction fetch is issued but the PE does not attempt to execute the prefetched instruction, no Prefetch 
Abort exception is generated for that instruction. For example, if the execution flow branches round a 
prefetched instruction that would abort if the PE attempted to execute it, no Prefetch Abort exception is 
generated. 


Register updates on exception reporting in Hyp mode 


The use of the HSR, and of the other registers listed in Registers used for reporting exceptions taken to Hyp mode 
on page G4-4134, depends on the cause of the Abort. In reporting these faults, in general: 


° If the fault generates a synchronous Data Abort exception, the HDFAR holds the associated VA, but see Fault 
address reporting on synchronous external aborts on page G4-4124 for a permitted exception to this 
requirement. 


° If the fault generates a Prefetch Abort exception, the HIFAR holds the associated VA, but see Fault address 
reporting on synchronous external aborts on page G4-4124 for a permitted exception to this requirement. 







In the following cases, the HPFAR holds the faulting IPA:


° A Translation or Access flag fault on a stage 2 translation.


° A Translation, Access flag, or Permission fault on the stage 2 translation of an address accessed in a
stage 1 translation table walk.


° A stage 2 Address size fault.


In all other cases, the HPFAR is UNKNOWN. 


On a Data Abort exception that is taken to Hyp mode, the HIFAR is UNKNOWN. 


On a Prefetch Abort exception that is taken to Hyp mode, the HDFAR is UNKNOWN. 
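
The HPFAR rules above can be expressed as a small predicate. The following C sketch is illustrative only: the enumeration and function names are hypothetical, and the caller is assumed to have already classified the fault from the STATUS subfield of HSR.ISS.

```c
#include <stdbool.h>

/* Hypothetical fault classification, derived by the caller from HSR.ISS. */
enum fault_type {
    FAULT_TRANSLATION,
    FAULT_ACCESS_FLAG,
    FAULT_PERMISSION,
    FAULT_ADDRESS_SIZE,
    FAULT_OTHER
};

/* Returns true if, according to the cases listed in the text, the HPFAR
 * holds the faulting IPA. In every other case the HPFAR is UNKNOWN.
 *   stage2          the fault was generated on a stage 2 translation
 *   on_stage1_walk  the stage 2 translation was for an address accessed
 *                   in a stage 1 translation table walk */
static bool hpfar_valid(enum fault_type type, bool stage2, bool on_stage1_walk)
{
    if (!stage2)
        return false;
    switch (type) {
    case FAULT_TRANSLATION:
    case FAULT_ACCESS_FLAG:
    case FAULT_ADDRESS_SIZE:
        return true;
    case FAULT_PERMISSION:
        return on_stage1_walk;   /* only on a stage 1 table walk */
    default:
        return false;
    }
}
```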


In addition, the reporting of particular aborts is as follows: 


Abort on the stage 1 translation for a memory access from Hyp mode 


The HDFAR or HIFAR holds the VA that caused the fault. The STATUS subfield of HSR.ISS 
indicates the type of fault, Translation, Address size, Access flag, or Permission. The HPFAR is 
UNKNOWN. 


Abort on the stage 2 translation for a memory access from a Non-secure PL1 or PL0 mode


This includes aborts on the stage 2 translation of a memory access made as part of a translation table
walk for a stage 1 translation. The HDFAR or HIFAR holds the VA that caused the fault. The
STATUS subfield of HSR.ISS indicates the type of fault, Translation, Address size, Access flag, or
Permission.


For any Access flag fault or Translation fault, and also for any Permission fault on the stage 2
translation of a memory access made as part of a translation table walk for a stage 1 translation, the
HPFAR holds the IPA that caused the fault. Otherwise, the HPFAR is UNKNOWN.


Abort caused by a synchronous external abort, or synchronous parity or ECC error, and taken to Hyp mode 


The HDFAR or HIFAR holds the VA that caused the fault, but see Fault address reporting on 
synchronous external aborts on page G4-4124 for a permitted exception to this requirement. The 
HPFAR is UNKNOWN. 


Data Abort caused by a Watchpoint exception and routed to Hyp mode because HDCR.TDE is set to 1 


When HDCR.TDE is set to 1, a Watchpoint exception generated in a Non-secure PL1 or PL0 mode,
that would otherwise generate a Data Abort exception, is routed to Hyp mode and generates a Hyp
Trap exception.


HDFAR is set to the address that generated the watchpoint.


Note
ELR_hyp indicates the address of the instruction that triggered the watchpoint.





A watchpointed address can be any byte-aligned address. The address reported in HDFAR might 
not be the watchpointed address, and, for a watchpoint due to an operation other than a Data Cache 
maintenance instruction, can be any address between and including: 


° The lowest address accessed by the instruction that triggered the watchpoint. 


° The highest watchpointed address accessed by that instruction.

If multiple watchpoints are set in this range, there is no guarantee of which watchpoint is generated.


Note


In particular, there is no guarantee of generating the watchpoint with the lowest address in the range. 





The address must also be within a naturally-aligned block of memory of an IMPLEMENTATION
DEFINED power-of-two size, containing a watchpoint address accessed by that location.

Note


The IMPLEMENTATION DEFINED power-of-two size must be no larger than the block size of the 
AArch64 DC ZVA operation. 
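
The two constraints on the address reported in HDFAR can be checked together. The following C sketch is an illustration for a debugger or test harness, not an architectural definition: the function and parameter names are hypothetical, and the block size is assumed to be supplied by the caller because it is IMPLEMENTATION DEFINED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check of whether a reported HDFAR value is one that the
 * text permits for a watchpoint not caused by a Data Cache maintenance
 * instruction.
 *   reported              value read from HDFAR
 *   lowest_accessed       lowest address accessed by the instruction
 *   highest_watchpointed  highest watchpointed address it accessed
 *   watchpointed          a watchpointed address it accessed
 *   block_size            nonzero power of two (IMPLEMENTATION DEFINED) */
static bool hdfar_value_permitted(uint32_t reported,
                                  uint32_t lowest_accessed,
                                  uint32_t highest_watchpointed,
                                  uint32_t watchpointed,
                                  uint32_t block_size)
{
    /* Constraint 1: within the inclusive address range. */
    bool in_range = reported >= lowest_accessed &&
                    reported <= highest_watchpointed;

    /* Constraint 2: in the same naturally-aligned block as the
     * watchpointed address. */
    uint32_t mask = ~(block_size - 1u);
    bool same_block = (reported & mask) == (watchpointed & mask);

    return in_range && same_block;
}
```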





See also Watchpoint exceptions on page G2-3961.

In all cases, HPFAR is UNKNOWN.


Prefetch Abort caused by a Breakpoint Instruction exception and taken to Hyp mode 


This abort is generated if a BKPT instruction is executed in Hyp mode. The abort leaves the HIFAR 
and HPFAR UNKNOWN. 


See also Breakpoint Instruction exceptions on page G2-3935.

Prefetch Abort caused by a Breakpoint Instruction, Breakpoint, or Vector Catch exception, and routed to
Hyp mode because HDCR.TDE is set to 1


When HDCR.TDE is set to 1, a debug exception, generated in a Non-secure PL1 or PL0 mode, that
would otherwise generate a Prefetch Abort exception, is routed to Hyp mode and generates a Hyp
Trap exception.


The abort leaves the HIFAR and HPFAR UNKNOWN. This is identical to the reporting of a Prefetch
Abort exception caused by a Debug exception on a BKPT instruction that is executed in Hyp mode.

Note

The difference between these two cases is: 


° The Debug exception on a BKPT instruction executed in Hyp mode generates a Prefetch Abort 
exception, taken to Hyp mode, and reported in the HSR using EC value 0x21. 


° Aborts generated because HDCR.TDE is set to 1 generate a Hyp Trap exception, and are 
reported in the HSR using EC value 0x20. 





Use of the HSR 


The HSR holds syndrome information for any synchronous exception taken to Hyp mode. Compared with the 
reporting of exceptions taken to PL1 modes, the HSR: 


° Always provides details of the fault. The DFSR and IFSR are not used. 


° Provides more extensive information, for a wider range of exceptions. 


Note
IRQ and FIQ exceptions taken to Hyp mode do not report any syndrome information in the HSR.


HSR, Hyp Syndrome Register on page G6-4393 describes the HSR. This section summarizes the general form of the
register, to show how it encodes exception syndrome information. The register comprises:


° A 6-bit Exception class field, EC, that indicates the cause of the exception. 

° An instruction length bit, IL. When an exception is caused by trapping an instruction to Hyp mode, this bit
indicates the length of the trapped instruction, as follows:
0    16-bit instruction trapped.
1    32-bit instruction trapped.


In other cases the IL field is not valid and is RES1. 


° An instruction specific syndrome field, ISS. Architecturally, this field could be defined independently for
each defined Exception class (EC), but in practice several ISS formats are common to more than one EC.

The format of the HSR depends on the value of the EC field, as follows:


0b000000 < EC ≤ 0b001100


The ISS part of the returned value includes the CV and COND fields described in Encoding
of ISS[24:20] when 0b000000 < EC ≤ 0b001100. Figure G4-19 shows the HSR format in this
case.


Bits   31:26  25  24  23:20  19:0
Field  EC     IL  CV  COND   ISS[19:0]

CV and COND are ISS[24] and ISS[23:20].

Figure G4-19 HSR format when the ISS includes CV and COND fields


EC == 0b000000 or EC ≥ 0b001110

There are no generic fields within the ISS. Figure G4-20 shows the HSR format
in this case.


Bits   31:26  25  24:0
Field  EC     IL  ISS

Figure G4-20 HSR format when the ISS does not include a COND field
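
The general form of the register can be decoded with simple shifts and masks. The following C sketch is illustrative only and uses hypothetical function names. It relies only on the field positions given in this section: EC in bits[31:26], IL in bit[25], and ISS in bits[24:0].

```c
#include <stdint.h>

/* Hypothetical accessors for the generic HSR fields. */

/* Exception class, HSR[31:26]. */
static inline uint32_t hsr_ec(uint32_t hsr)  { return (hsr >> 26) & 0x3Fu; }

/* Instruction length bit, HSR[25]: 0 = 16-bit, 1 = 32-bit instruction. */
static inline uint32_t hsr_il(uint32_t hsr)  { return (hsr >> 25) & 0x1u; }

/* Instruction specific syndrome, HSR[24:0]. */
static inline uint32_t hsr_iss(uint32_t hsr) { return hsr & 0x01FFFFFFu; }
```

A handler would typically read the HSR once, switch on the EC value, and only then interpret the ISS, because the ISS format depends on the EC.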


Encoding of ISS[24:20] when 0b000000 < EC ≤ 0b001100


For EC values that are nonzero and less than or equal to 0b001100, ISS[24:20] provides the condition code field for
the trapped instruction, together with a valid flag for this field. The encoding of this part of the ISS field is:


CV, ISS[24]
Condition code valid. Possible values of this bit are:
0    The COND field is not valid.
1    The COND field is valid.

COND, ISS[23:20]
The condition code for the trapped instruction. This field is valid only when CV is set to 1.
If CV is set to 0, this field is UNK/SBZP.

The full descriptions of the HSR.ISS formats give more information about the CV field.
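
The CV and COND rules can be applied as follows. This C sketch is illustrative, with a hypothetical function name. It returns the condition code only when the EC value is in the range that defines these fields and CV is 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper. Writes the condition code of the trapped
 * instruction to *cond and returns true only if:
 *   - 0b000000 < EC <= 0b001100, so ISS[24:20] holds CV and COND, and
 *   - CV, ISS[24], is 1, so COND is valid.
 * Otherwise returns false and leaves *cond unchanged. */
static bool hsr_cond(uint32_t hsr, uint32_t *cond)
{
    uint32_t ec = (hsr >> 26) & 0x3Fu;

    if (ec == 0u || ec > 0x0Cu)
        return false;               /* no CV or COND field for this EC */
    if (((hsr >> 24) & 0x1u) == 0u)
        return false;               /* CV == 0: COND is UNK/SBZP */

    *cond = (hsr >> 20) & 0xFu;     /* COND, ISS[23:20] */
    return true;
}
```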


Note

In some circumstances, it is IMPLEMENTATION DEFINED whether a conditional instruction that fails its condition
code check generates an Undefined Instruction exception, see Conditional execution of undefined instructions on
page G1-3851.

Table G4-30 shows the encoding of the HSR exception class field, EC. Values of EC not shown in the table are
reserved. For each EC value, the table references a subsection of the description of the HSR that describes the
associated ISS format and gives information about the cause of the exception, for example the configuration
required to enable the associated trap.


Table G4-30 HSR.EC field encoding

EC        Exception class                                     ISS description, or notes
0b000000  Unknown reason                                      ISS encoding for exceptions with an unknown reason on page G6-4396.
0b000001  Trapped WFI or WFE instruction                      ISS encoding for an exception from a WFI or WFE instruction on page G6-4397.
0b000011  Trapped MCR or MRC access with (coproc==0b1111)     ISS encoding for an exception from an MCR or MRC access on page G6-4398.
0b000100  Trapped MCRR or MRRC access with (coproc==0b1111)   ISS encoding for an exception from an MCRR or MRRC access on page G6-4400.
0b000101  Trapped MCR or MRC access with (coproc==0b1110)     ISS encoding for an exception from an MCR or MRC access on page G6-4398.
0b000110  Trapped LDC or STC access                           ISS encoding for an exception from an LDC or STC instruction on page G6-4401.
0b000111  Advanced SIMD or floating-point functionality       ISS encoding for an exception from an access to SIMD or floating-point
          trapped by a HCPTR.{TASE, TCP10} control            functionality, resulting from HCPTR on page G6-4403.
0b001000  Trapped VMRS access, from ID group traps, that      ISS encoding for an exception from an MCR or MRC access on page G6-4398.
          is not reported using EC 0b000111                   This trap is not taken if the HCPTR settings trap the access.
0b001100  Trapped MRRC access with (coproc==0b1110)           ISS encoding for an exception from an MCRR or MRRC access on page G6-4400.
0b001110  Illegal exception return to AArch32 state           ISS encoding for an exception from an Illegal state or PC alignment fault on page G6-4408.
0b010001  Exception on SVC execution in AArch32 state         ISS encoding for an exception from HVC or SVC instruction execution on page G6-4404.
          routed to EL2
0b010010  HVC instruction execution in AArch32 state,         ISS encoding for an exception from HVC or SVC instruction execution on page G6-4404.
          when HVC is not disabled
0b010011  Trapped execution of SMC instruction in             ISS encoding for an exception from SMC instruction execution on page G6-4405.
          AArch32 state
0b100000  Prefetch Abort from a lower Exception level         ISS encoding for an exception from a Prefetch Abort on page G6-4406.
0b100001  Prefetch Abort taken without a change in            ISS encoding for an exception from a Prefetch Abort on page G6-4406.
          Exception level
0b100010  PC alignment exception.                             ISS encoding for an exception from an Illegal state or PC alignment fault on page G6-4408.
0b100100  Data Abort from a lower Exception level             ISS encoding for an exception from a Data Abort on page G6-4408.
0b100101  Data Abort taken without a change in Exception      ISS encoding for an exception from a Data Abort on page G6-4408.
          level
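
For diagnostic output, the EC encodings in Table G4-30 can be mapped to their exception class descriptions. The following C sketch is one possible way to do this. The function name is hypothetical, and returning NULL for a reserved encoding is a choice made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup of the Exception class text for an HSR.EC value.
 * Returns NULL for EC values that the table does not list, which are
 * reserved. */
static const char *hsr_ec_name(uint32_t ec)
{
    switch (ec) {
    case 0x00: return "Unknown reason";
    case 0x01: return "Trapped WFI or WFE instruction";
    case 0x03: return "Trapped MCR or MRC access with (coproc==0b1111)";
    case 0x04: return "Trapped MCRR or MRRC access with (coproc==0b1111)";
    case 0x05: return "Trapped MCR or MRC access with (coproc==0b1110)";
    case 0x06: return "Trapped LDC or STC access";
    case 0x07: return "Advanced SIMD or floating-point functionality trapped by HCPTR";
    case 0x08: return "Trapped VMRS access, from ID group traps";
    case 0x0C: return "Trapped MRRC access with (coproc==0b1110)";
    case 0x0E: return "Illegal exception return to AArch32 state";
    case 0x11: return "Exception on SVC execution in AArch32 state routed to EL2";
    case 0x12: return "HVC instruction execution in AArch32 state";
    case 0x13: return "Trapped execution of SMC instruction in AArch32 state";
    case 0x20: return "Prefetch Abort from a lower Exception level";
    case 0x21: return "Prefetch Abort taken without a change in Exception level";
    case 0x22: return "PC alignment exception";
    case 0x24: return "Data Abort from a lower Exception level";
    case 0x25: return "Data Abort taken without a change in Exception level";
    default:   return NULL;   /* reserved encoding */
    }
}
```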





All EC encodings not shown in Table G4-29 on page G4-4135 are reserved by ARM.

G4.12.6 Summary of register updates on exceptions taken to Hyp mode

For memory system faults that generate exceptions that are taken to Hyp mode, Table G4-31 shows the registers
affected by each fault. In this table:
° Yes indicates that the register is updated.
° UNK indicates that the fault makes the register value UNKNOWN.
° A null entry, -, indicates that the fault does not affect the register.

Note
For a list of the MMU faults see MMU faults in AArch32 state on page G4-4118.
Table G4-31 Effect of an exception taken to Hyp mode on the reporting registers

Fault                                                               HSR  HIFAR        HDFAR        HPFAR
Faults reported as Prefetch Abort exceptions:
MMU fault (a) at stage 1.                                           Yes  Yes          UNK          UNK
Translation or Access flag MMU fault (a) at stage 2.                Yes  Yes          UNK          Yes
Other (b) MMU fault (a) at stage 2.                                 Yes  Yes          UNK          UNK
Stage 2 MMU fault (a) on a stage 1 translation.                     Yes  Yes          UNK          Yes
Synchronous external abort on translation table walk.               Yes  Yes          UNK          UNK
Synchronous parity or ECC error on translation table walk.          Yes  Yes          UNK          UNK
Synchronous external abort.                                         Yes  IMP DEF (c)  UNK          UNK
Synchronous parity or ECC error on memory access.                   Yes  Yes          UNK          UNK
Faults reported as Data Abort exceptions:
MMU fault (a) at stage 1.                                           Yes  UNK          Yes          UNK
Translation or Access flag MMU fault (a) at stage 2.                Yes  UNK          Yes          Yes
Other (b) MMU fault (a) at stage 2.                                 Yes  UNK          Yes          UNK
Stage 2 MMU fault (a) on a stage 1 translation.                     Yes  UNK          Yes          Yes
Synchronous external abort on translation table walk.               Yes  UNK          Yes          UNK
Synchronous parity or ECC error on translation table walk.          Yes  UNK          Yes          UNK
Synchronous external abort.                                         Yes  UNK          IMP DEF (c)  UNK
Synchronous parity or ECC error on memory access.                   Yes  UNK          Yes          UNK
SError interrupt                                                    Yes  UNK          UNK          UNK
SError interrupt from a parity or ECC error on memory access.       Yes  UNK          UNK          UNK
Debug exception:
Breakpoint Instruction (d), generates a Prefetch Abort exception.   Yes  UNK          -            UNK
Debug exception routed to Hyp mode because HDCR.TDE is set to 1. Generates a Hyp Trap exception.
Breakpoint, Breakpoint Instruction, or Vector Catch                 Yes  UNK          -            UNK
Watchpoint                                                          Yes  -            Yes          UNK

a. For more information see Classification of MMU faults taken to Hyp mode.
b. MMU fault other than a Translation fault or an Access flag fault.
c. IMPLEMENTATION DEFINED. The FnV bit in the HSR.ISS field indicates whether the register holds a valid address. See
Fault address reporting on synchronous external aborts on page G4-4124.
d. All other debug exceptions are not permitted in Hyp mode.


Note 


Unlike Table G4-27 on page G4-4132, the Hyp mode fault reporting table does not include an entry for a fault on
an instruction cache maintenance instruction. That is because, when the fault is taken to Hyp mode, the reporting
indicates the cause of the fault, for example a Translation fault, and ISS.CM is set to 1 to indicate that the fault was
on a cache maintenance instruction, see ISS encoding for an exception from a Data Abort on page G6-4408.








Classification of MMU faults taken to Hyp mode 


This subsection gives more information about the MMU faults shown in Table G4-31 on page G4-4140. 





Note 
All MMU faults are synchronous. 





The table uses the following descriptions for MMU faults taken to Hyp mode: 


MMU fault at stage 1 
This is an MMU fault generated on a stage 1 translation performed in the Non-secure PL2 
translation regime. 


MMU fault at stage 2 


This is an MMU fault generated on a stage 2 translation performed in the Non-secure PL1&0 
translation regime. 


As the table shows, for the faults in this group: 
° Translation and Access flag faults update the HPFAR.
° Permission faults leave the HPFAR UNKNOWN.


MMU stage 2 fault on a stage 1 translation 


This is an MMU fault generated on the stage 2 translation of an address accessed in a stage 1 
translation table walk performed in the Non-secure PL1&0 translation regime. For more 
information about these faults see Stage 2 fault on a stage 1 translation table walk on page G4-4117. 


Figure G4-1 on page G4-4024 shows the different translation regimes and associated stages of translation.

G4.13 Address translation instructions


The System register encoding space includes encodings for instructions that either: 
° Translate a virtual address (VA) to a physical address (PA). 
° Translate a virtual address (VA) to an intermediate physical address (IPA). 


Address translation instructions, functional group on page G4-4204 summarizes these instructions. 


When using the Short-descriptor translation table format, all translations performed by these instructions take 
account of TEX remap when this is enabled, see Short-descriptor format memory region attributes, with TEX remap 
on page G4-4080. 


An address translation instruction that executes successfully returns the output address, a PA or an IPA, in the PAR. 
This is a 64-bit register, that can hold addresses of up to 40 bits. 
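
As an illustration, software might extract the output address from a 64-bit format PAR value as shown below. This C sketch rests on assumptions that are not stated in this section and are taken from the PAR register description: that bit[0] is the F flag, set to 1 when the translation aborted, and that on success bits[39:12] of the output address are returned in PAR[39:12]. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for a 64-bit format PAR value.
 * Assumptions: PAR[0] is F (1 = translation aborted) and PAR[39:12]
 * holds bits[39:12] of the output address when F is 0.
 * On success, writes the 4KB-aligned output address (a PA or an IPA)
 * to *page_addr and returns true. */
static bool par64_output_address(uint64_t par, uint64_t *page_addr)
{
    if ((par & 0x1u) != 0u)
        return false;                         /* fault reported, no address */

    *page_addr = par & 0x000000FFFFFFF000ull; /* keep bits[39:12] */
    return true;
}
```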


It is IMPLEMENTATION DEFINED whether the address translation instructions return the values held in a TLB or the 
result of a translation table walk. Therefore, ARM recommends that these instructions are not used at a time when 
the TLB entries might be different from the underlying translation tables held in memory. 


The following sections give more information about these instructions: 


° Address translation instruction naming and operation summary. 

° Encoding and availability of the address translation instructions on page G4-4144. 

° Determining the PAR format on page G4-4145. 

° Handling of faults and aborts during an address translation instruction on page G4-4146. 


G4.13.1 Address translation instruction naming and operation summary 


Some older documentation uses the original names for the address translation instructions that were included in the 
original ARMv7 documentation. Table G4-32 summarizes the instructions that are available in AArch32 state, and 
relates the old instruction names to the current names. 


Table G4-32 Naming of address translation instructions

Name                     Old name             Description
ATS1CPR, ATS1CPW,        V2PCWPR, V2PCWPW,    See ATS1Cxx, Address translation stage 1, current security state on
ATS1CUR, ATS1CUW         V2PCWUR, V2PCWUW     page G4-4143
ATS12NSOPR, ATS12NSOPW,  V2POWPR, V2POWPW,    See ATS12NSOxx, Address translation stages 1 and 2, Non-secure
ATS12NSOUR, ATS12NSOUW   V2POWUR, V2POWUW     state only on page G4-4143
ATS1HR, ATS1HW           Not applicable (a)   See ATS1Hx, Address translation stage 1, Hyp mode on
                                              page G4-4144

a. Instructions are part of EL2 and have no equivalent in the older descriptions.


In an implementation that does not include EL2, there is no distinction between stage 1 translations and stage 1 and 
2 combined translations. 


For the stage 1 current state and stages 1 and 2 Non-secure state only instructions, the meanings of the last two
letters of the names are:





PR PL1 mode, read operation. 

PW PL1 mode, write operation. 

UR User mode, read operation. 

UW User mode, write operation. 
Note

User mode can be described as the unprivileged mode. It is the only PL0 mode.

For the stage 1 Hyp mode instructions, the last letter of the instruction name is R for the read operation and W for
the write operation.


See also Encoding and availability of the address translation instructions on page G4-4144. 


ATS1Cxx, Address translation stage 1, current security state 


Any VMSAv8-32 implementation supports the ATS1Cxx instructions. They can be executed by any software 
executing at PL1 or higher, in either Security state. 


The ATS1Cxx instructions are ATS1CPR, ATS1CPW, ATS1CUR, and ATS1CUW. These instructions perform the 
address translations of the PL1&0 translation regime. 


In an implementation that includes EL2, when executed in Non-secure state, these instructions return the IPA that 
is the output address of the stage 1 translation. Figure G4-1 on page G4-4024 shows the different translation 
regimes. 


Note 


The Non-secure PL1 and PL0 modes have no visibility of the stage 2 address translations, that can be defined only
at PL2, and translate IPAs to PAs.








See Determining the PAR format on page G4-4145 for the format used when returning the result of these 
instructions. 


ATS12NSOxx, Address translation stages 1 and 2, Non-secure state only 


A VMSAv8-32 implementation supports the ATS12NSOxx instructions only if it includes EL2. In an
implementation that includes EL2, in AArch32 state, they can be executed:


° By software executing in Non-secure state at EL2. This means by software executing in Hyp mode.
° If the implementation includes EL3, when EL3 is using AArch32, by software executing in Secure state at
PL1.


The ATS12NSOxx instructions are ATS12NSOPR, ATS12NSOPW, ATS12NSOUR, and ATS12NSOUW. 


In an implementation that includes EL3, when EL3 is using AArch64 and EL1 is using AArch32, any execution of 
an ATS12NSOxx instruction at Secure EL1 is trapped as an exception that is taken to EL3. 


In an implementation that does not include EL2, but includes EL3, when EL3 is using AArch32 these instructions 
are not UNDEFINED but each instruction behaves in the same way as the equivalent ATS1Cxx instruction. 


If an implementation does not include EL2 and does not include EL3 then these instructions are CONSTRAINED 
UNPREDICTABLE, with the permitted behavior that the instructions are UNDEFINED, see Unallocated System register 
access instructions on page K1-5460. 


ARM deprecates use of these instructions from any Secure PL1 mode other than Monitor mode. 


In Secure state and in Non-secure Hyp mode these instructions perform the translations made by the Non-secure 
PL1&0 translation regime. 


These instructions always return the PA and final attributes generated by the translation. That is, for an 
implementation that includes EL2, they return: 





° The result of the two stages of address translation for the specified Non-secure input address.
° The memory attributes obtained by the combination of the stage 1 and stage 2 attributes.

Note


From Hyp mode, the ATS1Cxx and ATS12NSOxx instructions both return the results of address translations that 
would be performed in the Non-secure modes other than Hyp mode. The difference is: 





° The ATS1Cxx instructions return the Non-secure PL1 view of the associated address translation. That is, they 
return the IPA output address corresponding to the VA input address. 


° The ATS12NSOxx instructions return the EL2, or Hyp mode, view of the associated address translation. That 
is, they return the PA output address corresponding to the VA input address, generated by two stages of 
translation. 





See Determining the PAR format on page G4-4145 for the format used when returning the result of these 
instructions. 


ATS1Hx, Address translation stage 1, Hyp mode 


A VMSAv8-32 implementation supports the ATS1Hx instructions only if it includes EL2. They can be executed by:

° Software executing in Non-secure state at PL2. This means by software executing in Hyp mode.


° Software executing in Secure state in Monitor mode. 


The ATS1Hx instructions are ATS1HR and ATS1HW. In an implementation that includes EL3, these instructions 
are CONSTRAINED UNPREDICTABLE if executed in a Secure PL1 mode other than Monitor mode, see Hyp mode VA 
to PA address translation instructions on page K1-5476. 


If an implementation does not include EL2 then these instructions are CONSTRAINED UNPREDICTABLE, with the 
permitted behavior that the instructions are UNDEFINED, see Unallocated System register access instructions on 
page K1-5460. 


These instructions perform the translations made by the Non-secure EL2 translation regime. The instruction takes 
a VA input address and returns a PA output address. 


These instructions always return a result in a 64-bit format PAR. 


G4.13.2 Encoding and availability of the address translation instructions 


Software executing at PL0 never has any visibility of the address translation instructions, but software executing at
PL1 or higher can use the unprivileged address translation instructions to find the address translations used for
memory accesses by software executing at PL0 and PL1.


Note 


For information about translations when the stage of address translation is disabled see The effects of disabling 
address translation stages on VMSAv8-32 behavior on page G4-4031. 








Table G4-33 shows the encodings for the address translation instructions, and their availability in different 
implementations in different PE modes and states. 


Table G4-33 Address translation instructions in AArch32 state

opc1  CRm  opc2  Name        Type  Description

All VMSAv8-32 implementations, in all modes, at PL1 or higher, see ATS1Cxx, Address translation stage 1, current security state on
page G4-4143
0     c8   0     ATS1CPR     WO    PL1 stage 1 read translation, current state
           1     ATS1CPW     WO    PL1 stage 1 write translation, current state
           2     ATS1CUR     WO    Unprivileged stage 1 read translation, current state
           3     ATS1CUW     WO    Unprivileged stage 1 write translation, current state

Implementation includes EL2, in Non-secure Hyp mode and Secure PL1 modes, see ATS12NSOxx, Address translation stages 1 and 2,
Non-secure state only on page G4-4143
0     c8   4     ATS12NSOPR  WO    Non-secure PL1 stage 1 and 2 read translation
           5     ATS12NSOPW  WO    Non-secure PL1 stage 1 and 2 write translation
           6     ATS12NSOUR  WO    Non-secure unprivileged stage 1 and 2 read translation
           7     ATS12NSOUW  WO    Non-secure unprivileged stage 1 and 2 write translation

Implementation includes EL2, in Non-secure Hyp mode and Secure Monitor mode, see ATS1Hx, Address translation stage 1, Hyp mode
on page G4-4144
4     c8   0     ATS1HR      WO    Hyp mode stage 1 read translation
           1     ATS1HW      WO    Hyp mode stage 1 write translation





The result of an instruction is always returned in the PAR. The PAR is a RW register and: 


° In all implementations, the 32-bit format PAR is accessed using an MCR or MRC instruction with CRn set to c7,
CRm set to c4, and opc1 and opc2 both set to 0.


° The 64-bit format PAR is accessed using an MCRR or MRRC instruction with CRm set to c7, and opc1 set to 0.


Address translation instructions that are not available in a particular implementation are reserved and CONSTRAINED 
UNPREDICTABLE. For example: 


° In an implementation that does not include EL2, the encodings with an opc1 value of 4 are reserved and 
CONSTRAINED UNPREDICTABLE. These are the ATS1Hx operations. 


° In an implementation that does not include either EL2 or EL3, the encodings with opc2 values of 4-7 are 
reserved and CONSTRAINED UNPREDICTABLE. These are the ATS12NSOxx operations. 


The CONSTRAINED UNPREDICTABLE behavior of these encodings is that they are UNDEFINED, see Unallocated 
System register access instructions on page K1-5460. 
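
The encodings in Table G4-33 can be held in a small lookup table, for example by a disassembler or a trap handler. The following C sketch is illustrative only. The structure and function names are hypothetical, and the table records only the opc1, CRm, and opc2 values, with opc2 values 4 to 7 used for the ATS12NSOxx operations as the text above states.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one address translation instruction encoding. */
struct ats_encoding {
    const char *name;
    unsigned opc1;
    unsigned crm;    /* 8 represents c8 */
    unsigned opc2;
};

static const struct ats_encoding ats_table[] = {
    { "ATS1CPR",    0, 8, 0 }, { "ATS1CPW",    0, 8, 1 },
    { "ATS1CUR",    0, 8, 2 }, { "ATS1CUW",    0, 8, 3 },
    { "ATS12NSOPR", 0, 8, 4 }, { "ATS12NSOPW", 0, 8, 5 },
    { "ATS12NSOUR", 0, 8, 6 }, { "ATS12NSOUW", 0, 8, 7 },
    { "ATS1HR",     4, 8, 0 }, { "ATS1HW",     4, 8, 1 },
};

/* Returns the table entry for the named instruction, or NULL if the
 * name is not an address translation instruction in this table. */
static const struct ats_encoding *ats_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof ats_table / sizeof ats_table[0]; i++) {
        if (strcmp(ats_table[i].name, name) == 0)
            return &ats_table[i];
    }
    return NULL;
}
```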


G4.13.3 Determining the PAR format 


The PAR is a 64-bit register, that supports both 32-bit and 64-bit PAR formats. This section describes how the PAR 
format is determined, for returning a result from each of the groups of address translation instructions. The returned 
result might be the translated address, or might indicate a fault on the translation, see Handling of faults and aborts 
during an address translation instruction on page G4-4146. 


ATS1Cxx instructions 


Address translations for the current state. From modes other than Hyp mode: 


° TTBCR.EAE determines whether the result is returned using the 32-bit or the 64-bit PAR
format.


° If the implementation includes EL3, the translation performed is for the current security state
and, depending on that state:
— The Secure or Non-secure TTBCR.EAE determines the PAR format.
— The result is returned to the Secure or Non-secure instance of the PAR.


Instructions executed in Hyp mode always return a result to the Non-secure PAR, using the 64-bit 
format. 

ATS12NSOxx instructions


Address translations for the Non-secure PL1 and PL0 modes. These instructions return a result using
the 64-bit PAR format if at least one of the following is true:


° The Non-secure TTBCR.EAE bit is set to 1.
° The implementation includes EL2, and the value of HCR.VM is 1.


Otherwise, the instruction returns a result using the 32-bit PAR format. 


Instructions executed in a Secure PL1 mode return a result to the Secure PAR. Instructions executed 
in Hyp mode return a result to the Non-secure PAR. 


ATS1Hx instructions 


Address translations from Hyp mode. These instructions always return a result using the 64-bit PAR 
format. 


Instructions executed in Secure Monitor mode return a result to the Secure PAR. Instructions 
executed in Non-secure Hyp mode return a result to the Non-secure PAR. 
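

The format rules in this section can be summarized as a small decision function. The following Python sketch is illustrative only: the names Group and par_format_bits, and all parameter names, are assumptions made for this example and are not architectural names. It also assumes that a TTBCR.EAE value of 1 selects the 64-bit format for ATS1Cxx instructions, consistent with the rule stated for the ATS12NSOxx instructions.

```python
from enum import Enum


class Group(Enum):
    """Address translation instruction groups (names are illustrative)."""
    ATS1C = "ATS1Cxx"
    ATS12NSO = "ATS12NSOxx"
    ATS1H = "ATS1Hx"


def par_format_bits(group, *, in_hyp_mode=False, ttbcr_eae=0,
                    ns_ttbcr_eae=0, has_el2=False, hcr_vm=0):
    """Return 32 or 64, the PAR format used to return the result.

    ttbcr_eae is TTBCR.EAE for the current security state.
    ns_ttbcr_eae is the Non-secure TTBCR.EAE.
    """
    if group is Group.ATS1H:
        # ATS1Hx always uses the 64-bit format.
        return 64
    if group is Group.ATS1C:
        # From Hyp mode the result is always in the 64-bit format.
        if in_hyp_mode:
            return 64
        # Assumption: EAE == 1 selects the 64-bit format.
        return 64 if ttbcr_eae == 1 else 32
    if group is Group.ATS12NSO:
        if ns_ttbcr_eae == 1 or (has_el2 and hcr_vm == 1):
            return 64
        return 32
    raise ValueError("unknown instruction group")
```

For example, under these assumptions an ATS12NSOxx instruction with the Non-secure TTBCR.EAE bit set to 0 returns a 64-bit format result only when the implementation includes EL2 and HCR.VM is 1.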











G4.13.4 Handling of faults and aborts during an address translation instruction 

When a stage of address translation is enabled, any corresponding address translation instruction requires a
translation table lookup, and this might require a translation table walk. However, the input address for the
translation might be a faulting address, either because:

° The translation table entries used for the translation indicate a fault.

° A stage 2 fault or an external abort occurs on the required translation table walk.

VMSAv8-32 memory aborts on page G4-4110 describes the faults that might occur on a translation table walk in
AArch32 state.

How the fault is handled, and whether it generates an exception, depends on the cause of the fault, as described in: 

° MMU fault on an address translation instruction. 

° External abort during an address translation instruction on page G4-4147. 

° Stage 2 fault on a current state address translation instruction on page G4-4147. 

MMU fault on an address translation instruction 

In the following cases, an MMU fault on an address translation is reported in the PAR, and no abort is taken. This
applies:

° For a faulting address translation instruction executed in Hyp mode, or in a Secure PL1 mode.

° For a faulting address translation instruction executed in a Non-secure PL1 mode, for cases where the fault
would generate a stage 1 abort if it occurred on the equivalent load or store operation.

Using the PAR to report a fault on an address translation instruction on page G4-4147 gives more information about
how these faults are reported.

Note 

° The Domain fault encodings shown in Table G4-25 on page G4-4130 are used only for reporting a fault on 
an address translation instruction that uses the 64-bit PAR format. That is, they are used only in an 
implementation that includes EL2, and are used for reporting a Domain fault on either: 

— An ATS1Cxx operation from Hyp mode.
— An ATS12NSOxx operation when HCR.VM is set to 1. 
These encodings are never used for fault reporting in the DFSR, IFSR, or HSR. 

° For an address translation instruction executed in a Non-secure PL1 mode, for a fault that would generate a 
stage 2 abort if it occurred on the equivalent load or store operation, the stage 2 abort is generated as described 
in Stage 2 fault on a current state address translation instruction on page G4-4147. 

Using the PAR to report a fault on an address translation instruction


For a fault on an address translation instruction for which no abort is taken, the PAR is updated with the following 
information, to indicate the fault: 


° The fault code, that would normally be written to the Fault status register. The code used depends on the
current translation table format, as described in either:


— PL1 fault reporting with the Short-descriptor translation table format on page G4-4128.
— PL1 fault reporting with the Long-descriptor translation table format on page G4-4129.


See also the Note at the start of Determining the PAR format on page G4-4145 about the Domain fault 
encodings shown in Table G4-25 on page G4-4130. 


° A status bit, that indicates that the translation operation failed. 


The fault does not update any Fault Address Register. 


External abort during an address translation instruction 


As stated in External abort on a translation table walk on page G4-4120, an external abort on a translation table 
walk generates a Data Abort exception. The abort can be synchronous or asynchronous, and behaves as follows: 


Synchronous external abort on a translation table walk 


The fault status and fault address registers of the Security state to which the abort is taken are 
updated. The fault status register indicates the appropriate external abort on a Translation fault, and 
the fault address register indicates the input address for the translation. 


The PAR is UNKNOWN. 


Asynchronous external abort on a translation table walk 


The fault status register of the Security state to which the abort is taken is updated, to indicate the 
asynchronous external abort. No fault address registers are updated. 


The PAR is UNKNOWN. 


Stage 2 fault on a current state address translation instruction 


If the PE is in a Non-secure PL1 mode and performs one of the ATS1C** operations, then a fault in the stage 2 
translation of an address accessed in a stage 1 translation table lookup generates an exception. This is equivalent to
the case described in Stage 2 fault on a stage 1 translation table walk on page G4-4117. When this fault occurs on 
an ATS1C** address translation instruction: 


° A Hyp Trap exception is taken to Hyp mode. 
° The PAR is UNKNOWN. 
° The HSR indicates that: 
— The fault occurred on a translation table walk. 
— The operation that faulted was a cache maintenance instruction. 
° The HPFAR holds the IPA that faulted.
° The HDFAR holds the VA that the executing software supplied to the address translation instruction. 
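

The cases described in this section can be collected into one lookup. The following Python sketch is a simplified model under stated assumptions: the mode and fault labels are hypothetical strings chosen for this example, and the function only reports whether an exception is taken and whether the PAR holds the fault report or is UNKNOWN. It does not model the fault status or fault address register updates.

```python
from typing import NamedTuple, Optional

MODES = ("hyp", "secure_pl1", "nonsecure_pl1")
FAULTS = ("stage1_mmu_fault", "stage2_mmu_fault",
          "sync_external_abort", "async_external_abort")


class Outcome(NamedTuple):
    exception: Optional[str]  # None when no abort is taken
    par: str                  # "fault reported" or "UNKNOWN"


def at_fault_outcome(mode: str, fault: str) -> Outcome:
    """Model the outcome of a faulting address translation instruction."""
    if mode not in MODES or fault not in FAULTS:
        raise ValueError("unknown mode or fault label")
    # An external abort on the translation table walk always generates
    # a Data Abort exception, and the PAR is UNKNOWN.
    if fault in ("sync_external_abort", "async_external_abort"):
        return Outcome("Data Abort", "UNKNOWN")
    # From a Non-secure PL1 mode, a fault that would generate a stage 2
    # abort is taken to Hyp mode as a Hyp Trap exception.
    if mode == "nonsecure_pl1" and fault == "stage2_mmu_fault":
        return Outcome("Hyp Trap", "UNKNOWN")
    # Otherwise the MMU fault is reported in the PAR and no abort is taken.
    return Outcome(None, "fault reported")
```

A caller would typically check the exception field first, because the PAR contents are meaningful only when no exception is taken.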

G4.14 About the System registers for VMSAv8-32


The System registers and System instructions that are accessible in AArch32 state are almost all in the encoding 
space described in The AArch32 System register interface on page G1-3877. This section gives general information 
about these registers, which comprise: 


° Registers in the (coproc==0b1111) encoding space, that provide control and status information for the PE in 
Non-debug state. 


° Registers in the (coproc==0b1110) encoding space, including: 
— Debug registers. 
— Trace registers.


— Legacy execution environment registers. 


VMSAv8-32 organization of registers in the (coproc==0b1110) encoding space on page G4-4172 summarizes the
registers in the (coproc==0b1110) encoding space, and indicates where these registers are described, either in this
manual or in other architecture specifications.


VMSAv8-32 organization of registers in the (coproc==0b1111) encoding space on page G4-4175 summarizes the
registers in the (coproc==0b1111) encoding space, and indicates where in this manual these registers are described.


This section gives general information about the AArch32 System registers, and the conventions used in describing 
these registers. 


Note 


Many implementations include other interfaces to some System registers, for example a memory-mapped interface 
to some debug System registers. These are described in the appropriate sections of this manual. 








This section is organized as follows: 

° About AArch32 System register accesses. 

° General behavior of System registers on page G4-4151. 

° Classification of System registers on page G4-4154. 

° Synchronization of changes to AArch32 System registers on page G4-4163.

° Fixed values in register diagrams on page G4-4169.

° Principles of the ID scheme for fields in ID registers on page G4-4169. 


G4.14.1 About AArch32 System register accesses 


The following subsections give more information about accesses to the AArch32 System registers: 
° Ordering of reads of System registers. 

° Accessing 32-bit System registers on page G4-4149. 

° Accessing 64-bit System registers on page G4-4150.


Ordering of reads of System registers 


Reads of the System registers can occur out of order with respect to earlier instructions executed on the same PE, 
provided that the data dependencies between the instructions, specified in Synchronization of changes to AArch32 
System registers on page G4-4163, are met. 





Note 


In particular, System registers holding self-incrementing counts, for example the Performance Monitors counters or 
the Generic Timer counter or timers, can be read early. This means that, for example, if a memory communication 
is used to communicate a read of the Generic Timer counter, an ISB must be inserted between the read of the memory 
location used for this communication and the read of the Generic Timer counter if it is required that the Generic 
Timer counter returns a count value that is later than the memory communication. 

Accessing 32-bit System registers


Software accesses most 32-bit System registers using the generic MCR and MRC System register access instructions,
specifying some or all of the parameters {coproc, CRn, opc1, CRm, opc2}, where:


coproc Identifies the primary region of the System register encoding space. Takes one of the values:
p14 Encoded as 0b1110.
p15 Encoded as 0b1111.

CRn Takes a value in the range c0-c15, encoded as the corresponding 4-bit binary value, 0b0000-0b1111.


In the (coproc==0b1110) encoding space, the opc1 value identifies the System register functional
group, and CRn is the most significant identifier for the required register within that group.


In the (coproc==0b1111) encoding space, CRn is the most significant identifier for the required
register.


opc1 Takes a value in the range 0-7, encoded as its 3-bit binary value.


In the (coproc==0b1110) encoding space, the opc1 value identifies the System register functional
group, and can take the following values:


0 Debug System registers.
1 Trace System registers.
7 Legacy Jazelle System registers.


In the (coproc==0b1111) encoding space, opc1 can take any value in the range 0-7.


CRm Takes a value in the range c0-c15, encoded as the corresponding 4-bit binary value, 0b0000-0b1111.

opc2 Takes a value in the range 0-7, encoded as its 3-bit binary value.
opc2 is optional in the MCR and MRC instruction syntax, and if no value is specified the encoding
defaults to 0b000.

Rt A general-purpose register to hold a 32-bit value to transfer to or from the System register. Takes a
value in the range R0-R14, encoded as the corresponding 4-bit binary value, 0b0000-0b1110.


This means an MCR or MRC access to a specific 32-bit System register uses: 


° A unique combination of coproc, CRn, opc1, CRm, and opc2, to specify the required System register.
° A general-purpose register, Rt, for the transferred 32-bit value.

See also: 


° MCR on page F5-2804. 
° MRC on page F5-2826. 
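

As an illustration of how these parameters select a register, the following Python sketch packs them into an A32 MCR or MRC instruction word. The bit positions used are those of the A32 MCR and MRC encodings given in the instruction descriptions referenced above, and are not restated in this section. The function name encode_mcr_mrc and its argument names are hypothetical, and the default condition field of 0b1110 (always) is an assumption made for the example.

```python
def encode_mcr_mrc(read, coproc, opc1, rt, crn, crm, opc2=0, cond=0b1110):
    """Return the 32-bit A32 encoding of an MCR (read=False) or MRC (read=True)."""
    if coproc not in (14, 15):
        raise ValueError("coproc must be 14 (p14) or 15 (p15)")
    if not (0 <= opc1 <= 7 and 0 <= opc2 <= 7):
        raise ValueError("opc1 and opc2 must be in the range 0-7")
    if not (0 <= crn <= 15 and 0 <= crm <= 15):
        raise ValueError("CRn and CRm must be in the range c0-c15")
    if not 0 <= rt <= 14:
        raise ValueError("Rt must be in the range R0-R14")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21)
            | (int(bool(read)) << 20)      # L bit: 1 for MRC, 0 for MCR
            | (crn << 16) | (rt << 12) | (coproc << 8)
            | (opc2 << 5) | (1 << 4) | crm)


# MRC p15, 0, R0, c7, c4, 0 reads the 32-bit format PAR into R0.
par_read = encode_mcr_mrc(True, 15, 0, 0, 7, 4, 0)
print(hex(par_read))  # 0xee170f14
```

Omitting the opc2 argument gives the same result as passing 0, which mirrors the default described for the instruction syntax.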


A small number of AArch32 debug System registers are accessed using LDC or STC instructions. In these cases, the 
register to be accessed is identified in the instruction syntax by the use of p14, c5 where: 


p14 Identifies that the access is to the (coproc==0b1110) encoding space. 
c5 Identifies the target debug System register. 


See the instruction descriptions: 

° LDC (immediate) on page F5-2695.
° LDC (literal) on page F5-2697.

° STC on page F5-3032. 


The only uses of LDC and STC permitted in ARMv8-A are: 


° An LDC access to load data from memory to DBGDTRTXint, see LDC (immediate) on page F5-2695 and LDC
(literal) on page F5-2697. 


° An STC access to store data to memory from DBGDTRRXint, see STC on page F5-3032. 

A small number of AArch32 System registers are accessed using MRS, MSR, VMRS, or VMSR instructions. For more
information, see the appropriate register description and the following instruction descriptions:


° MRS on page F5-2830.

° MSR (immediate) on page F5-2840.
° MSR (register) on page F5-2842.

° VMRS on page F6-3525.

° VMSR on page F6-3528.





Note 
° For example: 
— The APSR, CPSR, and SPSR are accessed using MRS or MSR instructions. 
— The MVFR0, MVFR1, and MVFR2 are accessed using VMRS or VMSR instructions.


° In addition, the banked register forms of the MRS and MSR instructions can be used to access some System 
registers associated with PE modes other than the mode in which the PE is currently executing, see MRS 
(Banked register) on page F5-2832 and MSR (Banked register) on page F5-2836. 





Accessing 64-bit System registers 


Software accesses a 64-bit System register using the generic MCRR and MRRC System register access instructions,
specifying the parameters {coproc, CRm, opc1}, where:


coproc Identifies the primary region of the System register encoding space. Takes one of the values:
p14 Encoded as 0b1110.
p15 Encoded as 0b1111.

CRm Takes a value in the range c0-c15, encoded as the corresponding 4-bit binary value, 0b0000-0b1111.


In the (coproc==0b1110) encoding space, the opc1 value identifies the System register functional
group, and CRm is the most significant identifier for the required register within that group.


In the (coproc==0b1111) encoding space, CRm is the most significant identifier for the required
register.


opc1 Takes a value in the range 0-15, encoded as its 4-bit binary value.


In the (coproc==0b1110) encoding space, the opc1 value identifies the System register functional
group, and can take the following values:


0 Debug System registers.
1 Trace System registers.


In the (coproc==0b1111) encoding space, opc1 can take any value in the range 0-15.


Rt A general-purpose register to hold bits[31:0] of the value to transfer to or from the System register.
Takes a value in the range R0-R14, encoded as the corresponding 4-bit binary value, 0b0000-0b1110.


Rt2 A general-purpose register to hold bits[63:32] of the value to transfer to or from the System register.
Takes a value in the range R0-R14, encoded as the corresponding 4-bit binary value, 0b0000-0b1110.


This means an MCRR or MRRC access to a specific 64-bit System register uses:
° A unique combination of coproc, CRm and opc1, to specify the required 64-bit System register.


° Two general-purpose registers, each holding 32 bits of the value to transfer. 


This means a PE can access a 64-bit System register using: 
° An MCRR instruction to write to a System register, see MCRR on page F5-2806. 
° An MRRC instruction to read a System register, see MCRR on page F5-2806. 


When using an MCRR or MRRC instruction the System register access is 64-bit atomic. 


Some 64-bit registers also have an MCR and MRC encoding. The MCR and MRC encodings for these registers access the 
least significant 32 bits of the register. For example, to access the PAR, software can: 
° Use the following instructions to access all 64 bits of the register: 


MRRC p15, 0, <Rt>, <Rt2>, c7 ; Read 64-bit PAR into Rt (low word) and Rt2 (high word)
MCRR p15, 0, <Rt>, <Rt2>, c7 ; Write Rt (low word) and Rt2 (high word) to 64-bit PAR


° Use the following instructions to access the least-significant 32 bits of the register:


MRC p15, 0, <Rt>, c7, c4, 0 ; Read PAR[31:0] into Rt
MCR p15, 0, <Rt>, c7, c4, 0 ; Write Rt to PAR[31:0]
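

To make the 64-bit access form concrete, the following Python sketch packs the {coproc, CRm, opc1} parameters and the two general-purpose registers into an A32 MCRR or MRRC instruction word. The bit positions are those of the A32 MCRR and MRRC encodings given in the instruction descriptions, and are not restated in this section. The function name encode_mcrr_mrrc and its argument names are hypothetical, and the condition field of 0b1110 (always) is an assumption made for the example.

```python
def encode_mcrr_mrrc(read, coproc, opc1, rt, rt2, crm, cond=0b1110):
    """Return the 32-bit A32 encoding of an MCRR (read=False) or MRRC (read=True)."""
    if coproc not in (14, 15):
        raise ValueError("coproc must be 14 (p14) or 15 (p15)")
    if not 0 <= opc1 <= 15:
        raise ValueError("opc1 must be in the range 0-15")
    if not 0 <= crm <= 15:
        raise ValueError("CRm must be in the range c0-c15")
    if not (0 <= rt <= 14 and 0 <= rt2 <= 14):
        raise ValueError("Rt and Rt2 must be in the range R0-R14")
    return ((cond << 28) | (0b1100010 << 21)
            | (int(bool(read)) << 20)      # 1 for MRRC, 0 for MCRR
            | (rt2 << 16) | (rt << 12) | (coproc << 8)
            | (opc1 << 4) | crm)


# MRRC p15, 0, R0, R1, c7 reads the 64-bit PAR into R0 (low) and R1 (high).
print(hex(encode_mcrr_mrrc(True, 15, 0, 0, 1, 7)))   # 0xec510f07
# MCRR p15, 0, R0, R1, c7 writes R0 (low) and R1 (high) to the 64-bit PAR.
print(hex(encode_mcrr_mrrc(False, 15, 0, 0, 1, 7)))  # 0xec410f07
```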


G4.14.2 General behavior of System registers 


Except where indicated, System registers are 32-bits wide. As stated in About AArch32 System register accesses on 
page G4-4148, there are some 64-bit registers, and these include cases where software can access either a 32-bit 
view or a 64-bit view of a register. The register summaries, and the individual register descriptions, identify the 
64-bit registers and how they can be accessed. 


The following sections give information about the general behavior of these registers. Unless otherwise indicated, 
information applies to all AArch32 System registers: 


° Read-only bits in read/write registers.
° UNPREDICTABLE, CONSTRAINED UNPREDICTABLE, and UNDEFINED behavior for AArch32 System
register accesses.
° Read-only and write-only register encodings on page G4-4153.
° Reset behavior of AArch32 System registers on page G4-4153.


See also About AArch32 System register accesses on page G4-4148 and Fixed values in register diagrams on 
page G4-4169. 


Read-only bits in read/write registers 
Some read/write registers include bits that are read-only. These bits ignore writes. 


An example of this is the SCTLR.NMFI bit, SCTLR[27]. 


UNPREDICTABLE, CONSTRAINED UNPREDICTABLE, and UNDEFINED behavior for 
AArch32 System register accesses 


This section defines UNPREDICTABLE and UNDEFINED behaviors for accesses to System registers, including those
cases where the ARMv8 behavior is CONSTRAINED UNPREDICTABLE.


In AArch32 state the following operations are UNDEFINED: 


° All LDC and STC accesses, except for the LDC access to DBGDTRTXint and the STC access to DBGDTRRXint 
specified in Table G4-44 on page G4-4174. 


° All MCRR and MRRC operations to the (coproc==0b111x) encoding space, except for those explicitly defined as 
accessing 64-bit System registers specified in Table G4-43 on page G4-4173 and Table G4-45 on 
page G4-4179. 


Unless otherwise indicated in the individual register descriptions: 
° Reserved fields in registers are RES0.


° Assigning a reserved value to a field has a CONSTRAINED UNPREDICTABLE effect, see Reserved values in
System and memory-mapped registers and translation table entries on page K1-5477.


The following subsections give more information about UNPREDICTABLE, CONSTRAINED UNPREDICTABLE, and 
UNDEFINED behavior for accesses to the (coproc==0b111x) encoding space: 


° Accesses to unallocated encodings in the (coproc==0b111x) encoding space.
° Additional rules for MCR and MRC accesses to System registers on page G4-4152. 
° Effects of EL3 and EL2 on System register accesses on page G4-4152. 


Accesses to unallocated encodings in the (coproc==0b111x) encoding space 


In ARMv8-A, accesses to unallocated register encodings in the (coproc==0b111x) encoding space are UNDEFINED, 
see Unallocated System register access instructions on page K1-5460. 

Additional rules for MCR and MRC accesses to System registers


The following operations are CONSTRAINED UNPREDICTABLE for all encodings in the (coproc==0b111x) encoding 
space: 
° All MCR operations from the PC. 


° All MRC operations to APSR_nzcv, except for the (coproc==0b1110) MRC operation to APSR_nzcv from 
DBGDSCRint. 


The CONSTRAINED UNPREDICTABLE behavior of these operations is that they are UNDEFINED, see Unallocated 
System register access instructions on page K1-5460. 


For registers and operations that are accessible from a particular Privilege level, any attempt to access those registers 
from a lower Privilege level is UNDEFINED. 


Some individual registers can be made inaccessible by setting configuration bits, possibly including 
IMPLEMENTATION DEFINED configuration bits, to disable access to the register. The effects of the 
architecturally-defined configuration bits are defined individually in this manual. Unless explicitly stated otherwise 
in this manual, setting a configuration bit to disable access to a register results in the register becoming UNDEFINED 
for MRC and MCR accesses. 


See also Read-only and write-only register encodings on page G4-4153. 


Effects of EL3 and EL2 on System register accesses 


EL2 and EL3 introduce classes of System registers, described in Classification of System registers on 
page G4-4154. Some of these classes of register are either: 


° Accessible only from certain modes or states. 


° Accessible from certain modes or states only when configuration settings permit the access. 


Accesses to these registers that are not permitted are UNDEFINED, meaning execution of the register access 
instruction generates an Undefined Instruction exception. 


Note 


This section applies only to registers that are accessible from some modes and states. That is, it applies only to 
register access instructions using an encoding that, under some circumstances, would perform a valid register 
access. 








The following register classes restrict access in this way: 


Restricted access System registers 
This register class is defined in any implementation that includes EL3. 


Restricted access registers other than the NSACR are accessible only from Secure EL3 modes. All 
other accesses to these registers are UNDEFINED. 
The NSACR is a special case of a Restricted access register and: 
° The NSACR is:
— Read/write accessible from Secure PL1 modes.
— Read-only accessible from Non-secure PL2 and PL1 modes.
° All other accesses to the NSACR are UNDEFINED. 


For more information, including behavior when EL3 is using AArch64 or is not implemented, see 
Restricted access System registers on page G4-4156. 


Configurable access System registers 


This register class is defined in any implementation that includes EL3. 


Most Configurable access registers are accessible from Non-secure state only if control bits in the 
NSACR permit Non-secure access to the register. Otherwise, a Non-secure access to the register is 
UNDEFINED. 





For other Configurable access registers, control bits in the NSACR control the behavior of bits or
fields in the register when it is accessed from Non-secure state. That is, Non-secure accesses to the 
register are permitted, but the NSACR controls how they behave. The only architecturally-defined 
register of this type is the CPACR. 


For more information, see Configurable access System registers on page G4-4157. 


EL2-mode System registers 
This register class is defined only in an implementation that includes EL2. 


EL2-mode registers are accessible only from: 
° The Non-secure EL2 mode, Hyp mode. 
° Secure Monitor mode when SCR.NS is set to 1.


All other accesses to these registers are UNDEFINED. 
For more information, see Hyp mode read/write registers in the (coproc==0b1111) encoding space
on page G4-4157 and Hyp mode encodings for shared (coproc==0b1111) System registers on
page G4-4159.

EL2-mode write-only operations 
This register class is defined only in an implementation that includes EL2. 


EL2-mode write-only operations are accessible only from: 


° The Non-secure EL2 mode, Hyp mode. 


° Secure Monitor mode, regardless of the value of SCR.NS. 

Write accesses to these operations are: 

° CONSTRAINED UNPREDICTABLE in Secure EL3 modes other than Monitor mode. 
° UNDEFINED in Non-secure modes other than Hyp mode. 


For more information, see Hyp mode (coproc==0b1111) write-only System instructions on
page G4-4159.


In addition, in any implementation that includes EL3, when EL3 is using AArch32, if write access to a register is
disabled by the CP15SDISABLE signal then any MCR access to that register is UNDEFINED.
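

The access rules for the two EL2-mode classes differ only in how Secure modes are treated. The following Python sketch models that difference under stated assumptions: the function names and the mode labels "hyp", "monitor", and "other" are hypothetical, and the sketch ignores the effect of the other configuration controls described in this manual.

```python
def el2_register_access(mode: str, secure: bool, scr_ns: int) -> str:
    """Access result for an EL2-mode System register."""
    if mode == "hyp" and secure:
        raise ValueError("Hyp mode exists only in Non-secure state")
    if mode == "hyp":
        return "permitted"
    if mode == "monitor" and scr_ns == 1:
        return "permitted"
    return "UNDEFINED"


def el2_write_only_access(mode: str, secure: bool) -> str:
    """Result of a write to an EL2-mode write-only operation."""
    if mode == "hyp" and secure:
        raise ValueError("Hyp mode exists only in Non-secure state")
    if mode == "monitor":
        return "permitted"  # regardless of the value of SCR.NS
    if mode == "hyp":
        return "permitted"
    if secure:
        # Secure EL3 modes other than Monitor mode.
        return "CONSTRAINED UNPREDICTABLE"
    # Non-secure modes other than Hyp mode.
    return "UNDEFINED"
```

In this model, Monitor mode with SCR.NS set to 0 cannot access an EL2-mode register, but can still perform an EL2-mode write-only operation.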


Read-only and write-only register encodings 


Some System registers are read-only (RO) or write-only (WO). For example: 
° Most identification registers are read-only. 


° Most encodings that perform an operation, such as a cache maintenance instruction, are write-only. 


If a particular Privilege level defines a register to be: 


° RO, then any attempt to write to that register, at that Privilege level, is UNDEFINED. This means that any access 
to that register with L == 0 is UNDEFINED. 


° WO, then any attempt to read from that register, at that Privilege level, is UNDEFINED. This means that any
access to that register with L == 1 is UNDEFINED.
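

The RO and WO rules can be expressed as a check on the L bit of the access instruction. The following Python sketch is a minimal model: it assumes that L is bit[20] of the A32 MCR or MRC instruction word, which comes from the instruction encodings rather than from this section, and the function name and the "RO", "WO", and "RW" labels are hypothetical.

```python
def access_is_undefined(reg_type: str, instr_word: int) -> bool:
    """Return True if an MCR or MRC access is UNDEFINED under the RO and WO rules.

    reg_type is "RO", "WO", or "RW" at the current Privilege level.
    """
    l_bit = (instr_word >> 20) & 1  # assumed position: 0 for MCR, 1 for MRC
    if reg_type == "RO":
        return l_bit == 0  # a write to a read-only register
    if reg_type == "WO":
        return l_bit == 1  # a read of a write-only encoding
    if reg_type == "RW":
        return False
    raise ValueError("reg_type must be 'RO', 'WO', or 'RW'")
```

For example, the address translation instructions are WO, so an MRC instruction that uses one of their encodings is UNDEFINED in this model.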


For IMPLEMENTATION DEFINED encoding spaces, the treatment of the encodings is IMPLEMENTATION DEFINED. 


Note 


This section applies only to registers that this manual defines as RO or WO. It does not apply to registers for which 
other access permissions are explicitly defined. 








Reset behavior of AArch32 System registers 
Reset values apply only to RW registers and fields, however: 


° Some RO registers or fields, including feature ID registers and some status registers or register fields, always 
return a known value. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G4-4153 
1ID092916 Non-Confidential 


G4 The AArch32 Virtual Memory System Architecture 


° Some RW and RO registers or register fields return status information about the PE. Unless the register 
description indicates that the value is UNKNOWN on reset, a read of the register immediately after a reset 
returns valid information. 


° Some RW and RO registers and fields are aliases of other registers or fields. In these cases, the reset behavior 
of the aliased register or field determines the value returned by a read of the register immediately after a reset. 


° WO registers that only have an effect on writes do not have meaningful reset values. However, an access to
a WO register might affect underlying state, and that state might have a defined reset value.


° IMPLEMENTATION DEFINED registers have IMPLEMENTATION DEFINED reset behavior. 


After a reset, only a limited subset of the PE state is guaranteed to be set to defined values. Also, for debug and trace 
System registers, reset requirements must take account of different levels of reset. For more information about the 
reset behavior of System registers when the PE resets into an Exception Level that is using AArch32, see: 


° PE state on reset into AArch32 state on page G1-3869. 


° The appropriate Trace architecture specification, for the Trace System registers.


When the PE resets into an Exception Level that is using AArch64, PE state that relates to execution in AArch32 
state, including the System register values, is UNKNOWN. The only exception to this is state that applies to execution 
in both AArch64 state and AArch32 state and that has a defined reset value on the reset into AArch64 state. An 
example of such PE state is the EDPRSR.SR bit. 


For a PE reset into an Exception level that is using AArch32, the architecture defines which AArch32 System 
registers have a defined reset value, and when that defined reset value applies. The register descriptions include this 
information, and PE state on reset into AArch32 state on page G1-3869 summarizes these architectural 
requirements. Otherwise, RW registers reset to an architecturally UNKNOWN value. 





Note 


In an implementation that includes EL3, unless this manual explicitly states otherwise, only the Secure instance of 
a Banked register is reset to the defined value. This means that software must program the Non-secure instance of 
the register with the required values. Typically, this programming is part of the PE boot sequence. 





Pseudocode description of resetting System registers 


The AArch32.ResetControlRegisters() pseudocode function resets all System registers, and register fields, that have 
defined reset values, as described in this section and PE state on reset into AArch32 state on page G1-3869. 


Note 


For debug and trace System registers this function resets registers as defined for the appropriate level of reset. 











G4.14.3 Classification of System registers 
Features provided by EL3 and EL2 integrate with many features of the architecture. Therefore, the descriptions of 
the individual System registers include information about how these Exception levels affect the register. This 
section: 
° Summarizes how EL3 and EL2 affect the implementation of the System registers, and the classification of 
those registers. 
° Summarizes how EL3 controls access to the System registers. 
° Describes an EL3 signal that can control access to some registers in the (coproc==0b1111) encoding space. 
It contains the following subsections: 
° Banked System registers on page G4-4155. 
° Restricted access System registers on page G4-4156. 
° Configurable access System registers on page G4-4157. 
° EL2-mode System registers on page G4-4157. 
° Common System registers on page G4-4160. 
° The CP15SDISABLE input signal on page G4-4161.
° Access to registers from Monitor mode on page G4-4162.


Note 


EL3 defines the register classifications of Banked, Restricted access, Configurable, and Common. EL2 defines the 
EL2-mode classification. 








It is IMPLEMENTATION DEFINED whether each IMPLEMENTATION DEFINED register is Banked, Restricted access, 
Configurable, EL2-mode, or Common. 


Banked System registers 


In an implementation that includes EL3 using AArch32, some System registers are Banked. Banked System 
registers have two copies, one Secure and one Non-secure. The SCR.NS bit selects the Secure or Non-secure 
instance of the register. Table G4-34 shows which registers in the (coproc==0b1111) encoding space are Banked in 
this way, and the permitted access to each register. No registers in the (coproc==0b1110) encoding space are Banked. 


Table G4-34 Banked (coproc==0b1111) System registers

Banked register                                     Permitted accesses (a)

CSSELR, Cache Size Selection Register               Read/write only at EL1 or higher
SCTLR, System Control Register (b)                  Read/write only at EL1 or higher
ACTLR, Auxiliary Control Register                   Read/write only at EL1 or higher
ACTLR2, Auxiliary Control Register 2                Read/write only at EL1 or higher
TTBR0, Translation Table Base 0                     Read/write only at EL1 or higher
TTBR1, Translation Table Base 1                     Read/write only at EL1 or higher
TTBCR, Translation Table Base Control               Read/write only at EL1 or higher
DACR, Domain Access Control Register                Read/write only at EL1 or higher
DFSR, Data Fault Status Register                    Read/write only at EL1 or higher
IFSR, Instruction Fault Status Register             Read/write only at EL1 or higher
ADFSR, Auxiliary Data Fault Status Register         Read/write only at EL1 or higher
AIFSR, Auxiliary Instruction Fault Status Register  Read/write only at EL1 or higher
DFAR, Data Fault Address Register                   Read/write only at EL1 or higher
IFAR, Instruction Fault Address Register            Read/write only at EL1 or higher
PAR, Physical Address Register                      Read/write only at EL1 or higher
PRRR, Primary Region Remap Register                 Read/write only at EL1 or higher
NMRR, Normal Memory Remap Register                  Read/write only at EL1 or higher
VBAR, Vector Base Address Register                  Read/write only at EL1 or higher
CONTEXTIDR, Context ID Register                     Read/write only at EL1 or higher
TPIDRURW, User Read/Write Thread ID                 Read/write at all privilege levels, including EL0
TPIDRURO, User Read-only Thread ID                  Read-only at EL0
                                                    Read/write at EL1 or higher
TPIDRPRW, EL1 only Thread ID                        Read/write only at EL1 or higher

a. Any attempt to execute an access that is not permitted results in an Undefined Instruction exception.

b. Some bits are common to the Secure and the Non-secure copies of the register, see SCTLR, System
Control Register on page G6-4547.


A Banked System register can contain a mixture of:

• Fields that are Banked.

• Fields that are read-only in Non-secure PL1 or PL2 modes but read/write in the Secure state.

The System Control Register SCTLR is an example of a register that contains this mixture of fields.


The Secure copies of the Banked System registers are sometimes referred to as the Secure Banked System registers. 
The Non-secure copies of the Banked System registers are sometimes referred to as the Non-secure Banked System 
registers. 


Restricted access System registers 


In an implementation that includes EL3, some System registers are present only in the Secure security state. These 
are called Restricted access registers, and their read/write access permissions are: 


• In Non-secure state, software cannot modify Restricted access registers.

• For the NSACR, in Non-secure state:

  – Software running at PL1 or higher can read the register.

  – Unprivileged software, meaning software running at PL0, cannot read the register.

  This means that Non-secure software running at PL1 or higher can read the access permissions for System
  registers that have Configurable access.

  If EL3 is using AArch64 then any read of the NSACR from Non-secure EL2 using AArch32, or Non-secure
  EL1 using AArch32, returns the value 0x00000C00.

• For all other Restricted access registers, Non-secure software cannot read the register.


In an implementation that does not include EL3: 


• SDER is implemented only in Secure state.
• Any read of the NSACR returns the value 0x00000C00.
• All other accesses to Restricted access System registers are UNDEFINED.


Table G4-35 shows the Restricted access System registers in the (coproc==0b1111) encoding space. There are no 
Restricted access registers in the (coproc==0b1110) encoding space. 


Table G4-35 Restricted access (coproc==0b1111) System registers

Register                                     Permitted accesses (a)

SCR, Secure Configuration                    Read/write in Secure PL1 modes
SDCR, Secure Debug Configuration Register    Read/write in Secure PL1 modes
SDER, Secure Debug Enable                    Read/write in Secure PL1 modes
NSACR, Non-Secure Access Control             Read/write in Secure PL1 modes
                                             Read-only in Non-secure PL1 and PL2 modes
MVBAR, Monitor Vector Base Address           Read/write in Secure PL1 modes

a. Any attempt to execute an access that is not permitted results in an Undefined Instruction exception.


Configurable access System registers


Secure software can configure the access to some System registers. These registers are called Configurable access 
registers, and the control can be: 


• A bit in the control register determines whether the register is:

  – Accessible from Secure state only.

  – Accessible from both Secure and Non-secure states.

• A bit in the control register changes the accessibility of a register bit or field. For example, setting a bit in the
  control register might mean that a RW field behaves as RAZ/WI when accessed from Non-secure state.


Bits in the NSACR control access. 


In an AArch32 implementation that includes EL3: 
• There are no Configurable access System registers in the (coproc==0b1110) encoding space.
• The only required Configurable access register in the (coproc==0b1111) encoding space is the CPACR.
• The following Advanced SIMD and floating-point System registers are Configurable access:
  – Floating-point Status and Control Register, FPSCR.
  – Floating-point Exception register, FPEXC.
  – Floating-point System ID register, FPSID.
  – Media and VFP Feature Register 0, MVFR0.
  – Media and VFP Feature Register 1, MVFR1.
  – Media and VFP Feature Register 2, MVFR2.


EL2-mode System registers 


In an implementation that includes EL2, if EL2 can use AArch32, the implementation provides a number of 
registers for use in the EL2 mode, Hyp mode. As with other System register encodings, some of these register 
encodings provide write-only operations. When the implementation includes EL3 and EL3 is using AArch32, these 
registers are also accessible from Monitor mode when the value of SCR.NS is 1. 


The following subsections describe the EL2-mode registers: 

• Hyp mode read/write registers in the (coproc==0b1111) encoding space.
• Hyp mode encodings for shared (coproc==0b1111) System registers on page G4-4159.
• Hyp mode (coproc==0b1111) write-only System instructions on page G4-4159.


There are no EL2-mode registers in the (coproc==0b1110) encoding space. 


Hyp mode read/write registers in the (coproc==0b1111) encoding space 


These registers are implemented only in Non-secure state, and in Non-secure state they are accessible only from 
Hyp mode. 


Except for accesses to CNTVOFF in an implementation that includes EL3 but not EL2, the behavior of accesses to
these registers is as follows:

• In Secure state, the registers can be accessed from EL3 when SCR.NS is set to 1, see Access to registers from
  Monitor mode on page G4-4162.

• The following accesses are UNDEFINED:

  – Accesses from Non-secure PL1 modes.

  – Accesses in Secure state when SCR.NS is set to 0.

In an implementation that includes EL3 but not EL2, the behavior of accesses to CNTVOFF is as follows:

• Any access from Secure Monitor mode is CONSTRAINED UNPREDICTABLE, regardless of the value of SCR.NS.
  The CONSTRAINED UNPREDICTABLE behavior is that the access is UNDEFINED, see Unallocated System
  register access instructions on page K1-5460.

• All other accesses are UNDEFINED.
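
The access rules above can be summarized as a small decision function. The following C sketch is illustrative only and is not part of the architecture: the type and function names are hypothetical, the implementation is assumed to include EL3 using AArch32, and the result for a register other than CNTVOFF when EL2 is not implemented is an assumption of the sketch, because this section defines only the CNTVOFF case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum { MODE_HYP, MODE_MONITOR, MODE_NS_PL1, MODE_SECURE_PL1 } pe_mode_t;
typedef enum { ACCESS_PERMITTED, ACCESS_UNDEFINED } access_result_t;

/*
 * Outcome of an access to a Hyp mode read/write register.
 * scr_ns is the value of SCR.NS; it is only examined in Monitor mode.
 */
static access_result_t hyp_rw_reg_access(pe_mode_t mode, bool scr_ns,
                                         bool has_el2, bool is_cntvoff)
{
    assert((int)mode >= (int)MODE_HYP && (int)mode <= (int)MODE_SECURE_PL1);

    if (!has_el2) {
        /* CNTVOFF without EL2: every access is UNDEFINED, including the
         * CONSTRAINED UNPREDICTABLE Monitor mode case.
         * Assumption of this sketch: other Hyp registers are treated the
         * same way, since they are not implemented without EL2. */
        (void)is_cntvoff;
        return ACCESS_UNDEFINED;
    }

    switch (mode) {
    case MODE_HYP:
        return ACCESS_PERMITTED;              /* Non-secure, Hyp mode */
    case MODE_MONITOR:
        return scr_ns ? ACCESS_PERMITTED      /* Secure, SCR.NS == 1 */
                      : ACCESS_UNDEFINED;     /* Secure, SCR.NS == 0 */
    case MODE_NS_PL1:
        return ACCESS_UNDEFINED;              /* Non-secure PL1 modes */
    case MODE_SECURE_PL1:
    default:
        return ACCESS_UNDEFINED;              /* Secure state, SCR.NS == 0 */
    }
}
```

For example, under these assumptions `hyp_rw_reg_access(MODE_MONITOR, false, true, false)` yields `ACCESS_UNDEFINED`, matching the rule for accesses in Secure state when SCR.NS is 0.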
Note


Except for CNTVOFF, the Hyp mode registers are part of EL2, meaning they are implemented only if the
implementation includes EL2. However, conceptually, CNTVOFF is part of any implementation of the Generic
Timer, see Status of the CNTVOFF register on page G5-4223. This means the behavior of CNTVOFF in an
implementation that does not include EL2 is not covered by the general definition of the behavior of the Hyp mode
(coproc==0b1111) read/write registers.








Table G4-36 shows the Hyp mode read/write System registers, in the (coproc==0b1111) encoding space: 


Table G4-36 Hyp mode (coproc==0b1111) read/write System registers

Register       Width     Permitted accesses (a)

VPIDR          32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
VMPIDR         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HSCTLR         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HACTLR         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HACTLR2        32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HCR            32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HDCR           32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HCPTR          32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HSTR           32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HACR           32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HTCR           32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
VTCR           32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HTTBR          64-bit    Read/write. In Non-secure state, accessible only from Hyp mode
VTTBR          64-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HADFSR         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HAIFSR         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HSR            32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HPFAR          32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HMAIR0         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HMAIR1         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HAMAIR0        32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HAMAIR1        32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HVBAR          32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
HTPIDR         32-bit    Read/write. In Non-secure state, accessible only from Hyp mode
CNTVOFF (b)    64-bit    Read/write. In Non-secure state, accessible only from Hyp mode

a. Any attempt to execute an access that is not permitted results in an Undefined Instruction exception.

b. Implemented in any implementation of the Generic Timer. See also the Note earlier in this section.


Hyp mode encodings for shared (coproc==0b1111) System registers


Some Hyp mode registers share the Secure instance of an existing Banked register. In this case the implementation
includes an encoding for the register that is accessible only in Hyp mode, or in Monitor mode when SCR.NS is
set to 1.


For these registers, the following accesses are UNDEFINED:

• Accesses from Non-secure PL1 modes.

• Accesses in Secure state when SCR.NS is set to 0.


Table G4-37 lists the Hyp mode encodings for shared registers. 


Table G4-37 Hyp mode (coproc==0b1111) System register encodings for shared registers

Register    Permitted accesses (a)                                                                Shared register

HDFAR       Read/write. Accessible from Hyp mode, or from Monitor mode when SCR.NS is set to 1.    Secure DFAR
HIFAR       Read/write. Accessible from Hyp mode, or from Monitor mode when SCR.NS is set to 1.    Secure IFAR

a. Any attempt to execute an access that is not permitted results in an Undefined Instruction exception.


In Monitor mode, the Secure copies of these registers can be accessed either: 
• Using the DFAR or IFAR encoding with SCR.NS set to 0.
• Using the HDFAR or HIFAR encoding with SCR.NS set to 1.


However, between accessing a register using one alias and accessing the register using the other alias, a Context 
synchronization event is required to ensure the ordering of the accesses. 
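
The aliasing described above can be sketched as a mapping from the encoding used and the value of SCR.NS to the register copy that is reached in Monitor mode. This C fragment is illustrative only, with hypothetical names. It covers DFAR and HDFAR, and IFAR and HIFAR follow the same pattern. It assumes an implementation that includes EL2, and takes the DFAR, SCR.NS == 1 case from the general rule that SCR.NS selects the Banked copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum { ENC_DFAR, ENC_HDFAR } far_encoding_t;
typedef enum {
    TARGET_SECURE_DFAR,     /* Secure copy of DFAR, shared with HDFAR */
    TARGET_NONSECURE_DFAR,  /* Non-secure copy of DFAR */
    TARGET_UNDEFINED        /* access is UNDEFINED */
} far_target_t;

/* Register copy reached from Monitor mode for a given encoding and SCR.NS. */
static far_target_t monitor_far_target(far_encoding_t enc, bool scr_ns)
{
    assert(enc == ENC_DFAR || enc == ENC_HDFAR);

    if (enc == ENC_DFAR) {
        /* Banked register: SCR.NS selects the copy. */
        return scr_ns ? TARGET_NONSECURE_DFAR : TARGET_SECURE_DFAR;
    }
    /* HDFAR encoding: usable from Monitor mode only when SCR.NS == 1. */
    return scr_ns ? TARGET_SECURE_DFAR : TARGET_UNDEFINED;
}
```

Both `monitor_far_target(ENC_DFAR, false)` and `monitor_far_target(ENC_HDFAR, true)` name the same Secure copy, which is why a Context synchronization event is needed between accesses made through the two aliases.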


Hyp mode (coproc==0b1111) write-only System instructions 


Architecturally, these encodings are an extension of the Banked register encodings described in Banked System 
registers on page G4-4155, where: 


• The implementation does not implement the operation in Secure state.

• In Non-secure state, the operation is accessible only at EL2, that is, only from Hyp mode.

In Secure state:

• These instructions can be accessed from Monitor mode regardless of the value of SCR.NS, see Access to
  registers from Monitor mode on page G4-4162.

• Accesses to these instructions are CONSTRAINED UNPREDICTABLE if executed in a Secure mode other than
  Monitor mode, see Hyp mode TLB maintenance instructions on page K1-5476 and Hyp mode VA to PA
  address translation instructions on page K1-5476.


Accesses to these instructions are UNDEFINED if accessed from a Non-secure PL1 mode. 


Table G4-38 shows the EL2-mode write-only instructions in the (coproc==0b1111) encoding space: 


Table G4-38 Hyp mode (coproc==0b1111) write-only System instructions

Register         Width     Permitted accesses (a)

ATS1HR           32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
ATS1HW           32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
TLBIALLHIS       32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
TLBIMVAHIS       32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
TLBIALLNSNHIS    32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
TLBIALLH         32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
TLBIMVAH         32-bit    Write-only. Accessible only from Hyp mode and Monitor mode
TLBIALLNSNH      32-bit    Write-only. Accessible only from Hyp mode and Monitor mode

a. This section describes the behavior of write accesses that are not permitted. See also Read-only and write-only
   register encodings on page G4-4153.


For more information about these operations, see ATS1Hx, Address translation stage 1, Hyp mode on page G4-4144.


Common System registers 


Some System registers and operations are common to the Secure and Non-secure Security states. These are 
described as the Common access registers, or simply as the Common registers. These registers include: 


• Read-only registers that hold configuration information.

• Register encodings used for various memory system operations, rather than to access registers.

• The ISR.

• All System registers in the (coproc==0b1110) encoding space.


Table G4-39 shows the Common System registers in the (coproc==0b1111) encoding space. These registers are not
affected by whether EL3 is implemented.


Table G4-39 Common (coproc==0b1111) System registers

Register                                                  Permitted accesses (a)

MIDR, Main ID Register                                    Read-only, only at EL1 or higher
CTR, Cache Type Register                                  Read-only, only at EL1 or higher
TCMTR, TCM Type Register                                  Read-only, only at EL1 or higher
TLBTR, TLB Type Register (b)                              Read-only, only at EL1 or higher
MPIDR, Multiprocessor Affinity Register                   Read-only, only at EL1 or higher
REVIDR, Revision ID                                       Read-only, only at EL1 or higher
ID_PFR0, ID_PFR1, Processor Feature Registers             Read-only, only at EL1 or higher
ID_DFR0, Debug Feature Register 0                         Read-only, only at EL1 or higher
ID_AFR0, Auxiliary Feature Register 0                     Read-only, only at EL1 or higher
ID_MMFR0-ID_MMFR4, Memory Model Feature Registers         Read-only, only at EL1 or higher
ID_ISAR0-ID_ISAR5, Instruction Set Attribute Registers    Read-only, only at EL1 or higher
CCSIDR, Cache Size ID Register                            Read-only, only at EL1 or higher
CLIDR, Cache Level ID Register                            Read-only, only at EL1 or higher
AIDR, Auxiliary ID Register (b)                           Read-only, only at EL1 or higher
FCSEIDR, FCSE PID Register                                Read/write, only at EL1 or higher
Cache and branch predictor maintenance instructions       See Cache maintenance instructions, functional group on page G4-4201
Address translation instructions                          See Address translation instructions, functional group on page G4-4204
CP15DMB, CP15DSB, CP15ISB Data barrier operations         Write-only at all privilege levels, including EL0
TLB maintenance instructions                              Write-only, only at EL1 or higher, see TLB maintenance instructions,
                                                          functional group on page G4-4202
Performance monitors                                      See Access permissions on page D5-1871
ISR, Interrupt Status Register                            Read-only, only at EL1 or higher





a. Any attempt to execute an access that is not permitted results in an Undefined Instruction exception. 


b. Register or operation details are IMPLEMENTATION DEFINED. 


Secure System registers for the (coproc==0b1111) encoding space 


The Secure System registers in the (coproc==0b1111) encoding space comprise: 


• The Secure copies of the Banked System registers in the (coproc==0b1111) encoding space.
• The Restricted access System registers in the (coproc==0b1111) encoding space.
• The Configurable access System registers in the (coproc==0b1111) encoding space that are configured to be
  accessible only from Secure state.


In an implementation that includes EL3, the Non-secure System registers are the System registers other than the 
Secure System registers. 


The CP15SDISABLE input signal 
When EL3 is using AArch32, the implementation provides an input signal, CP15SDISABLE, that disables write
access to some of the Secure registers when asserted HIGH. The CP15SDISABLE signal has no effect on:


• Register accesses from AArch64 state.
• Register accesses from Secure EL1 when EL3 is using AArch64 and EL1 is using AArch32.


Note 


When EL3 is using AArch32, the interaction between CP15SDISABLE and any IMPLEMENTATION DEFINED
register is IMPLEMENTATION DEFINED.








Table G4-40 shows the registers and operations affected. 


Table G4-40 Secure registers affected by CP15SDISABLE

Register name                                                 Affected operation

SCTLR, System Control Register                                MCR p15, 0, <Rt>, c1, c0, 0
TTBR0, Translation Table Base Register 0                      MCR p15, 0, <Rt>, c2, c0, 0
TTBCR, Translation Table Base Control Register                MCR p15, 0, <Rt>, c2, c0, 2
DACR, Domain Access Control Register                          MCR p15, 0, <Rt>, c3, c0, 0
PRRR, Primary Region Remap Register                           MCR p15, 0, <Rt>, c10, c2, 0
NMRR, Normal Memory Remap Register                            MCR p15, 0, <Rt>, c10, c2, 1
AMAIR0, Auxiliary Memory Attribute Indirection Register 0     MCR p15, 0, <Rt>, c10, c3, 0
AMAIR1, Auxiliary Memory Attribute Indirection Register 1     MCR p15, 0, <Rt>, c10, c3, 1
VBAR, Vector Base Address Register                            MCR p15, 0, <Rt>, c12, c0, 0
MVBAR, Monitor Vector Base Address Register                   MCR p15, 0, <Rt>, c12, c0, 1
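
Each affected operation in the table is an A32 MCR instruction, so its operands map directly onto fields of the instruction word. The following C sketch is illustrative only. The function name is hypothetical, and the field layout is the standard A32 MCR encoding with the AL condition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode the A32 instruction
 *     MCR p<coproc>, <opc1>, R<rt>, c<crn>, c<crm>, <opc2>
 * with condition AL. Hypothetical helper, for illustration only.
 */
static uint32_t encode_mcr_a32(unsigned coproc, unsigned opc1, unsigned rt,
                               unsigned crn, unsigned crm, unsigned opc2)
{
    assert(coproc < 16 && opc1 < 8 && rt < 16);
    assert(crn < 16 && crm < 16 && opc2 < 8);

    return (0xEu << 28)                /* cond = AL               */
         | (0xEu << 24)                /* bits[27:24] = 0b1110    */
         | ((uint32_t)opc1 << 21)      /* opc1                    */
         | (0u << 20)                  /* bit[20] = 0 selects MCR */
         | ((uint32_t)crn << 16)       /* CRn                     */
         | ((uint32_t)rt << 12)        /* Rt                      */
         | ((uint32_t)coproc << 8)     /* coproc, 0b1111 for p15  */
         | ((uint32_t)opc2 << 5)       /* opc2                    */
         | (1u << 4)                   /* bit[4] = 1              */
         | (uint32_t)crm;              /* CRm                     */
}
```

With R0 as the source register, the SCTLR write `MCR p15, 0, R0, c1, c0, 0` encodes as `encode_mcr_a32(15, 0, 0, 1, 0, 0)`, which is 0xEE010F10.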





On a reset by the external system that resets the PE into EL3 using AArch32, the CP15SDISABLE input signal
must be taken LOW. This permits the Reset code to set up the configuration of EL3 features. When the input is
asserted HIGH, any attempt to write to the Secure registers shown in Table G4-40 on page G4-4161 results in an
Undefined Instruction exception.
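
As a sketch of the behavior described above, the following C fragment models whether a write is UNDEFINED because of CP15SDISABLE. It is illustrative only. The names are hypothetical, EL3 is assumed to be using AArch32, and the encodings listed are the standard (CRn, opc1, CRm, opc2) values for the ten registers named in Table G4-40.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
typedef struct { unsigned crn, opc1, crm, opc2; } cp15_enc_t;
typedef enum { WRITE_NOT_BLOCKED, WRITE_UNDEFINED } sdisable_result_t;

/* Registers whose Secure copy is locked by CP15SDISABLE. */
static const cp15_enc_t sdisable_regs[] = {
    { 1, 0, 0, 0},  /* SCTLR  */
    { 2, 0, 0, 0},  /* TTBR0  */
    { 2, 0, 0, 2},  /* TTBCR  */
    { 3, 0, 0, 0},  /* DACR   */
    {10, 0, 2, 0},  /* PRRR   */
    {10, 0, 2, 1},  /* NMRR   */
    {10, 0, 3, 0},  /* AMAIR0 */
    {10, 0, 3, 1},  /* AMAIR1 */
    {12, 0, 0, 0},  /* VBAR   */
    {12, 0, 0, 1},  /* MVBAR  */
};

/*
 * Effect of CP15SDISABLE on an MCR write. Reads, and writes that target
 * Non-secure registers, are never affected by the signal.
 */
static sdisable_result_t cp15sdisable_check(unsigned crn, unsigned opc1,
                                            unsigned crm, unsigned opc2,
                                            bool targets_secure_copy,
                                            bool cp15sdisable_high)
{
    assert(crn < 16 && opc1 < 8 && crm < 16 && opc2 < 8);

    if (!cp15sdisable_high || !targets_secure_copy)
        return WRITE_NOT_BLOCKED;

    for (size_t i = 0; i < sizeof sdisable_regs / sizeof sdisable_regs[0]; i++) {
        const cp15_enc_t *e = &sdisable_regs[i];
        if (e->crn == crn && e->opc1 == opc1 && e->crm == crm && e->opc2 == opc2)
            return WRITE_UNDEFINED;  /* Undefined Instruction exception */
    }
    return WRITE_NOT_BLOCKED;
}
```

The sketch answers only the CP15SDISABLE question. A write that it reports as not blocked is still subject to the other access rules in this section.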


The CP15SDISABLE input does not affect reading Secure registers, or reading or writing Non-secure registers. It
is IMPLEMENTATION DEFINED how the input is changed and when changes to this input are reflected in the PE, and
an implementation might not provide any mechanism for driving the CP15SDISABLE input HIGH. However, in
an implementation in which the CP15SDISABLE input can be driven HIGH, changes in the state of
CP15SDISABLE must be reflected as quickly as possible. Any change must occur before completion of an
Instruction Synchronization Barrier operation, issued after the change, is visible to the PE with respect to instruction
execution boundaries. Software must perform an Instruction Synchronization Barrier operation meeting the above
conditions to ensure all subsequent instructions are affected by the change to CP15SDISABLE.


When EL3 is using AArch32, use of CP15SDISABLE means key Secure features that are accessible only at PL1
can be locked in a known state. This provides an additional level of overall system security. ARM expects control
of CP15SDISABLE to reside in the system, in a block dedicated to security.


Access to registers from Monitor mode 


When the PE is in Monitor mode, the PE is in Secure state regardless of the value of the SCR.NS bit. In Monitor 
mode, the SCR.NS bit determines whether, for System registers in the (coproc==0b1111) encoding space, valid uses 
of the MRC, MCR, MRRC and MCRR instructions access the Secure Banked System registers or the Non-secure Banked 
System registers. That is, when: 


NS == 0    Common, Restricted access, and Secure Banked System registers are accessed by MRC, MCR, MRRC and
MCRR instructions that target the (coproc==0b1111) encoding space.


If the implementation includes EL2, the registers listed in Hyp mode read/write registers in the
(coproc==0b1111) encoding space on page G4-4157 and Hyp mode encodings for shared
(coproc==0b1111) System registers on page G4-4159 are not accessible, and any attempt to access
them generates an Undefined Instruction exception.


Note


The operations listed in Hyp mode (coproc==0b1111) write-only System instructions on 
page G4-4159 are accessible in Monitor mode regardless of the value of SCR.NS. 





System instructions in the (coproc==0b1111) encoding space use the Security state to determine all 
resources used, that is, all operations performed by these instructions are performed in Secure state. 


NS == 1    Common, Restricted access and Non-secure Banked System registers are accessed by MRC, MCR, MRRC
and MCRR instructions that target the (coproc==0b1111) encoding space. 


If the implementation includes EL2, all the registers and operations listed in the subsections of 
EL2-mode System registers on page G4-4157 are accessible, using the MRC, MCR, MRRC, or MCRR 
instructions required to access them from Hyp mode. 


System instructions in the (coproc==0b1111) encoding space use the Security state to determine all 
resources used, that is, all operations by these instructions are performed in Secure state. 


The Security state determines whether the Secure or Non-secure Banked registers determine the control state. 
Note


Where the contents of a register select the value accessed by an MRC or MCR access to a different register, then the
register that is used for selection is being used as control state. For example, CSSELR selects the current CCSIDR,
and therefore CSSELR is used as control state. Therefore, in Monitor mode:

• SCR.NS determines whether the Secure or Non-secure CSSELR is accessible.

• Because the PE is in Secure state, the Secure CSSELR selects the current CCSIDR.
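
The distinction between the copy that is accessed and the copy that is used as control state can be sketched in C. The fragment is illustrative only, with hypothetical type and function names, and models a single Banked register such as CSSELR while the PE is in Monitor mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one Banked System register, for illustration only. */
typedef struct {
    uint32_t secure;      /* Secure copy     */
    uint32_t non_secure;  /* Non-secure copy */
} banked_reg_t;

/* Copy reached by an MRC or MCR in Monitor mode: selected by SCR.NS. */
static uint32_t *monitor_accessed_copy(banked_reg_t *reg, bool scr_ns)
{
    assert(reg != NULL);
    return scr_ns ? &reg->non_secure : &reg->secure;
}

/*
 * Copy used as control state in Monitor mode. The PE is in Secure state
 * whatever the value of SCR.NS, so the Secure copy is always used.
 */
static const uint32_t *monitor_control_copy(const banked_reg_t *reg)
{
    assert(reg != NULL);
    return &reg->secure;
}
```

With SCR.NS set to 1, `monitor_accessed_copy` returns the Non-secure CSSELR while `monitor_control_copy` still returns the Secure CSSELR, so a read of CCSIDR is selected by a copy that the same code cannot currently access.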





G4.14.4 Synchronization of changes to AArch32 System registers 


In this section, this PE means the PE on which accesses are being synchronized. 


Note 


See Definitions of direct and indirect reads and writes and their side-effects on page G4-4167 for definitions of the 
terms direct write, direct read, indirect write, and indirect read. 








A direct write to a System register might become visible at any point after the change to the register, but without a 
Context synchronization event there is no guarantee that the change becomes visible. 


Any direct write to a System register is guaranteed not to affect any instruction that appears, in program order, before 
the instruction that performed the direct write, and any direct write to a System register must be synchronized before 
any instruction that appears after the direct write, in program order, can rely on the effect of that write. The only 
exceptions to this are: 


• All direct writes to the same register, using the same encoding, are guaranteed to occur in program order.


• All direct writes to a register are guaranteed to occur in program order relative to all direct reads of the same
  register using the same encoding.


• Any System register access that an ARM Architecture Specification or equivalent specification defines as not
  requiring synchronization.


• If an instruction that appears in program order before the direct write performs a memory access, such as a
  memory-mapped register access, that causes an indirect read or write to a register, that memory access is
  subject to the memory order model. In this case, if permitted by the memory order model, the instruction that
  appears in program order before the direct write can be affected by the direct write. For information about
  the memory order model, see Memory ordering on page E2-2332.


These rules mean that an instruction that writes to one of the address translation instructions described in Address 
translation instructions on page G4-4142 must be explicitly synchronized to guarantee that the result of the address 
translation instruction is visible in the PAR. 





Note 


In this case, the direct write to the encoding of the address translation instruction causes an indirect write to the PAR. 
Without a Context synchronization event after the direct write there is no guarantee that the indirect write to the PAR 
is visible. 





Conceptually, the explicit synchronization occurs as the first step of any Context synchronization event. This means 
that if the operation uses the state that had been changed but not synchronized before the operation occurred, the 
operation is guaranteed to use the state as if it had been synchronized. 
Note


• This explicit synchronization is applied as the first step of the execution of any instruction that causes the
  synchronization operation. This means it does not synchronize any effect of changes to the System registers
  that might affect the fetch and decode of the instructions that cause the operation, such as breakpoints or
  changes to translation tables.


• For a synchronous exception, the control state in use at the time the exception is generated determines the
  exception syndrome information, and this syndrome information is not changed by this synchronization at
  the start of taking the exception.





Except for the register reads listed in Registers with some architectural guarantee of ordering or observability on 
page G4-4166, if no Context synchronization event is performed, direct reads of System registers can occur in any 
order. 


Table G4-41 shows the synchronization requirement between two reads or writes that access the same System 
register. In the column headings, First and Second refer to: 


• Program order, for any read or write caused by the execution of an instruction by this PE, other than a read
  or write caused by a memory access made by that instruction.


• The order of arrival of asynchronous reads or writes made by this PE relative to the execution of instructions
  by this PE.


In addition: 


• For indirect reads or writes caused by an external agent, such as a debugger, the mechanism that determines
  the order of the reads or writes is defined by that external agent. The external agent can provide mechanisms
  that ensure that any read or write it makes arrives at the PE. These indirect reads and writes are asynchronous
  to software execution on the PE.


• For indirect reads or writes caused by memory-mapped reads or writes made by this PE, the ordering of the
  memory accesses is subject to the memory order model, including the effect of the memory type of the
  accessed memory address. This applies, for example, if this PE reads or writes one of its registers in a
  memory-mapped register interface.

  The mechanism for ensuring completion of these memory accesses, including ensuring the arrival of the
  asynchronous read or write at the PE, is defined by the system.

  Note

  Such accesses are likely to be given a Device memory attribute, but requiring this is outside the scope of the
  architecture.


• For indirect reads or writes caused by autonomous asynchronous events that are counted, for example events
  caused by the passage of time, the events are ordered so that:

  – Counts progress monotonically.

  – The events arrive at the PE in finite time and without undue delay.


Table G4-41 Synchronization requirements for updates to System registers

First read or write    Second read or write    Context synchronization event required

Direct read            Direct read             No
                       Direct write            No
                       Indirect read           No (a)
                       Indirect write          No, but see text in this section for exceptions
Direct write           Direct read             No
                       Direct write            No
                       Indirect read           Yes (a)
                       Indirect write          No, but see text in this section for exceptions
Indirect read          Direct read             No
                       Direct write            No
                       Indirect read           No
                       Indirect write          No
Indirect write         Direct read             Yes, but see text in this section for exceptions
                       Direct write            No, but see text in this section for exceptions
                       Indirect read           Yes, but see text in this section for exceptions
                       Indirect write          No, but see text in this section for exceptions





a. Although no synchronization is required between a Direct write and a Direct read, or between a Direct read and 
an Indirect write, this does not imply that a Direct read causes synchronization of a previous Direct write. This 
means that the sequence Direct write followed by Direct read followed by Indirect read, with no intervening 
context synchronization, does not guarantee that the Indirect read observes the result of the Direct write. 
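
The baseline entries of Table G4-41 can be expressed as a lookup. The following C sketch is illustrative only, with hypothetical names. It encodes only the Yes or No column of the table and does not model the exceptions that the table refers to the text for, or the sequence described in footnote a.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    DIRECT_READ,
    DIRECT_WRITE,
    INDIRECT_READ,
    INDIRECT_WRITE
} sysreg_access_t;

/*
 * Baseline answer from Table G4-41: is a Context synchronization event
 * required between a first and a second access to the same System register?
 */
static bool context_sync_required(sysreg_access_t first, sysreg_access_t second)
{
    /* Rows: first access. Columns: second access, in enum order. */
    static const bool required[4][4] = {
        /* Direct read    */ { false, false, false, false },
        /* Direct write   */ { false, false, true,  false },
        /* Indirect read  */ { false, false, false, false },
        /* Indirect write */ { true,  false, true,  false },
    };

    assert((int)first >= 0 && (int)first < 4);
    assert((int)second >= 0 && (int)second < 4);
    return required[first][second];
}
```

For example, `context_sync_required(DIRECT_WRITE, INDIRECT_READ)` is true, which is the common case of writing a control register and then relying on its new value.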


If the indirect write is to a register that Registers with some architectural guarantee of ordering or observability on 
page G4-4166 shows as having some guarantee of the visibility of an indirect write, synchronization might not be 
required. 


If a direct read or a direct write to a register is followed by an indirect write to that register that is caused by an 
external agent, or by an autonomous asynchronous event, or as a result of a memory-mapped write, then 
synchronization is required to guarantee the ordering of the indirect write relative to the direct read or direct write. 


If an indirect write caused by a direct write is followed by an indirect write caused by an external agent, or by an 
autonomous asynchronous event, or as a result of a memory-mapped write, then synchronization is required to 
guarantee the ordering of the two indirect writes. 


If a direct read causes an indirect write, synchronization is required to guarantee that the indirect write is visible to 
subsequent direct or indirect reads or writes. This synchronization must be performed after the direct read, before 
any subsequent direct or indirect read or write. 


If a direct write causes an indirect write, synchronization is required to guarantee that the indirect write is visible to 
subsequent direct or indirect reads or writes. This synchronization must be performed after the direct write, before 
any subsequent direct or indirect read or write. 


Note 


Where a register has more than one encoding, a direct write to the register using a particular encoding is not an
indirect write to the same register with a different encoding.








Where an indirect write is caused by the action of an external agent, such as a debugger, or by a memory-mapped 
read or write by the PE, then an indirect write by that agent to a register using a particular access mechanism, 
followed by an indirect read by that agent to the same register using the same access mechanism and address does 
not need synchronization. 


For information about the additional synchronization requirements for memory-mapped registers, see 
Synchronization requirements for AArch64 System registers on page D7-1889. 
To guarantee the visibility of changes to some registers, additional operations might be required before the Context
synchronization event. For such a register, the definition of the register identifies these additional requirements.


In this manual, unless the context indicates otherwise:

• Accessing a System register refers to a direct read or write of the register.

• Using a System register refers to an indirect read or write of the register.


Registers with some architectural guarantee of ordering or observability 


For the registers for which Table G4-42 shows that the ordering of direct reads is guaranteed, multiple direct reads 
of a single register, using the same encoding, occur in program order without any explicit ordering. 


For the registers for which Table G4-42 shows that some observability of indirect writes is guaranteed, an indirect 
write to the register caused by an external agent, an autonomous asynchronous event, or as a result of a 
memory-mapped write, is both: 


• Observable to direct reads of the register, in finite time, without explicit synchronization.


• Observable to subsequent indirect reads of the register without explicit synchronization.


These two sets of registers are similar, as Table G4-42 shows: 


Table G4-42 Registers with a guarantee of ordering or observability, VMSAv8-32 





Register                                    Ordering of     Observability of    Notes
                                            direct reads    indirect writes
ISR                                         Guaranteed      Guaranteed          Interrupt Status Register
DBGCLAIMCLR                                 Guaranteed      Guaranteed          Debug CLAIM registers
DBGCLAIMSET                                 -               Guaranteed
DBGDTRRXint                                 Guaranteed      Guaranteed          Debug Communication Channel registers
DBGDTRTXint                                 -               Guaranteed
The DCC flags in DBGDSCRint                 Guaranteed      Guaranteed
CNTPCT                                      Guaranteed      Guaranteed          Generic Timer registers
CNTP_TVAL                                   Guaranteed      Guaranteed
CNTVCT                                      Guaranteed      Guaranteed
CNTV_TVAL                                   Guaranteed      Guaranteed
CNTHP_TVAL                                  Guaranteed      Guaranteed
PMCCNTR                                     Guaranteed      Guaranteed          Performance Monitors Extension registers, if the
                                                                                implementation includes the extension
PMEVCNTR<n>                                 Guaranteed      Guaranteed
PMXEVCNTR                                   Guaranteed      Guaranteed
PMOVSSET                                    Guaranteed      Guaranteed
PMOVSR                                      Guaranteed      Guaranteed
EDSCR.PipeAdv and the DCC flags in EDSCR    -               Guaranteed          Fields of the External Debug Status and Control
                                                                                Register

In addition to the requirements shown in Table G4-42 on page G4-4166: 


° Indirect writes to the following registers as a result of memory-mapped writes, including accesses by external 
agents, are required to be observable to the indirect read made in determining the response to a subsequent 
memory-mapped access without explicit synchronization: 


—  OSLAR_EL1. OSLAR_EL1 is indirectly read to determine whether the subsequent access is 
permitted. 


Note 
OSLAR_EL1 maps to the AArch32 System register DBGOSLAR. 








—  EDLAR, if implemented. EDLAR is indirectly read to determine whether a subsequent write or 
side-effect of an access is ignored. 


Note 


This requirement is stricter than the general requirement for the observability of indirect writes. 








° When the PE is in Debug state, there are synchronization requirements for the Debug Communication 
Channel and Instruction Transfer registers. See DCC and ITR access in Debug state on page H4-4923. 


The possibility that direct reads can occur early, in the absence of context synchronization, described in Ordering 
of reads of System registers on page G4-4148, still applies to the registers listed in Table G4-42 on page G4-4166. 


Definitions of direct and indirect reads and writes and their side-effects 
Direct and indirect reads and writes are defined as follows: 


Direct read Is a read of a register, using an MRC, MRRC, or STC instruction, that the architecture permits for the 
current PE state. 


If a direct read of a register has a side-effect of changing the value of a register, the effect of a direct 
read on that register is defined to be an indirect write, and has the synchronization requirements of 
an indirect write. This means the indirect write is guaranteed to have occurred, and to be visible to 
subsequent direct or indirect reads and writes only if synchronization is performed after the direct 
read. 


Note 


The indirect write described here can affect either the register accessed by the direct read, or some 
other register. The synchronization requirement is the same in both cases. 





Direct write Is a write to a register, using an MCR, MCRR, or LDC instruction, that the architecture permits for the 
current PE state. 


In the following cases, the side-effect of the direct write is defined to be an indirect write of the 
affected register, and has the synchronization requirements of an indirect write: 


° If the direct write has a side-effect of changing the value of a register other than the register 
accessed by the direct write. 

° If the direct write has a side-effect of changing the value of the register accessed by the direct 
write, so that the value in that register might not be the value that the direct write wrote to the 
register. 


In both cases, this means that the indirect write is not guaranteed to be visible to subsequent direct 
or indirect reads and writes unless synchronization is performed after the direct write. 

Note 

° As an example of a direct write to a register having an effect that is an indirect write of that 
register, writing 1 to a PMCNTENCLR.Px bit is also an indirect write, because if the Px bit 
had the value 1 before the direct write, the side-effect of the write changes the value of that 
bit to 0. 

° The indirect write described here can affect either the register written to by the direct write, 
or some other register. The synchronization requirement is the same in both cases. 
For example, writing 1 to a PMCNTENCLR.Px bit that is set to 1 also changes the 
corresponding PMCNTENSET.Px bit from 1 to 0. This means that the direct write to the 
PMCNTENCLR defines indirect writes to both itself and to the PMCNTENSET. 
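The PMCNTENCLR and PMCNTENSET example can be pictured with a small software model. The sketch below is illustrative only: the structure and function names are hypothetical, and it assumes that both registers read back a single shared enable mask. It does not model when the indirect write becomes visible, which is the subject of the synchronization requirements above.

```c
#include <stdint.h>

/* Hypothetical model: PMCNTENSET and PMCNTENCLR are two views of one
 * underlying enable mask (an assumption made for illustration). */
typedef struct {
    uint32_t enable;
} pmu_enable_model;

/* Direct write to PMCNTENSET: each bit written as 1 sets the enable bit. */
static void write_pmcntenset(pmu_enable_model *m, uint32_t value)
{
    m->enable |= value;
}

/* Direct write to PMCNTENCLR: each bit written as 1 clears the enable bit.
 * The value read back from PMCNTENCLR is therefore not the value written,
 * and PMCNTENSET changes too: both effects are indirect writes. */
static void write_pmcntenclr(pmu_enable_model *m, uint32_t value)
{
    m->enable &= ~value;
}

/* Both registers read back the same enable mask in this model. */
static uint32_t read_pmcntenset(const pmu_enable_model *m) { return m->enable; }
static uint32_t read_pmcntenclr(const pmu_enable_model *m) { return m->enable; }
```

In this model, a write of 1 to a PMCNTENCLR bit that is currently 1 changes what both read functions return, which mirrors the two indirect writes named in the Note.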





Indirect read Is a use of the register by an instruction to establish the operating conditions for the instruction. 
Examples of operating conditions that might be determined by an indirect read are the translation 
table base address, or whether memory accesses are forced to be Non-cacheable. 


Indirect reads include situations where the value of one register determines what value is returned 
by a second register. This means that any read of the second register is an indirect read of the register 
that determines what value is returned. 


Indirect reads also include: 


° Reads of the System registers by external agents, such as debuggers, as described in Debug 
registers on page G6-4668. 


° Memory-mapped reads of the System registers made by the PE on which the System registers 
are implemented. 


Where an indirect read of a register has a side-effect of changing the value of a register, that change 
is defined to be an indirect write, and has the synchronization requirements of an indirect write. 


Indirect write Is an update to the value of a register as a consequence of either: 


° An exception, operation, or execution of an instruction that is not a direct write to that 
register. 
° The asynchronous operation of an external agent. 


This can include: 


° The passage of time, as seen in counters or timers, including performance counters. 
° The assertion of an interrupt. 
° A write from an external agent, such as a debugger. 


However, for some registers, the architecture gives some guarantee of visibility without any explicit 
synchronization, see Registers with some architectural guarantee of ordering or observability on 
page G4-4166. 


Note 


Taking an exception is a context-synchronizing operation. Therefore, any indirect write performed 
as part of an exception entry does not require additional synchronization. This includes the indirect 
writes to the registers that report the exception, as described in Exception reporting in a VMSAv8-32 
implementation on page G4-4123. 

G4.14.5 Fixed values in register diagrams 

In register diagrams, a fixed bit can be indicated by one of 0, (0), 1, or (1), with the following meanings: 

0 In any implementation: 
° The bit must read as 0. 
° Writes to the bit must be ignored. 
° Software: 
— Can rely on the bit reading as 0. 
— Must use an SBZP policy to write to the bit. 

(0) The bit should-be-zero. In AArch32 state there are a small number of cases where a bit is (0) in some 
contexts, and has a different defined behavior in other contexts. See RES0. 

1 In any implementation: 
° The bit must read as 1. 
° Writes to the bit must be ignored. 
° Software: 
— Can rely on the bit reading as 1. 
— Must use an SBOP policy to write to the bit. 

(1) The bit is should-be-one. In AArch32 state there are a small number of cases where a bit is (1) in 
some contexts, and has a different defined behavior in other contexts. See RES1. 

Note 
In register diagrams, (0) is a synonym for RES0, and (1) is a synonym for RES1, where RES0 and RES1 are defined in 
the Glossary. However, when used in an instruction encoding diagram, (0) and (1) have the narrower definition that 
behavior is UNPREDICTABLE or CONSTRAINED UNPREDICTABLE if either: 
° A bit marked as (0) has the value 1. 
° A bit marked as (1) has the value 0. 
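As an illustration of the instruction encoding rule in the Note, a decoder or assembler checker might test the (0) and (1) bits with two masks. This is a minimal sketch: the function name and both mask parameters are hypothetical, and a real tool would take the masks from the encoding diagram of each instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when every bit marked (0) is 0 and every bit marked (1) is 1.
 * sbz_mask has a 1 for each (0) bit, sbo_mask has a 1 for each (1) bit.
 * Both masks are assumptions supplied by the caller. */
static bool fixed_bits_ok(uint32_t encoding, uint32_t sbz_mask, uint32_t sbo_mask)
{
    bool sbz_ok = (encoding & sbz_mask) == 0;         /* no (0) bit is set   */
    bool sbo_ok = (encoding & sbo_mask) == sbo_mask;  /* every (1) bit is set */
    return sbz_ok && sbo_ok;
}
```

A false result corresponds to the case where behavior is UNPREDICTABLE or CONSTRAINED UNPREDICTABLE.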
G4.14.6 Principles of the ID scheme for fields in ID registers 

The ARM architecture specifies a number of ID registers that are characterized as comprising a set of 4-bit ID fields. 
Each ID field identifies the presence, and possibly the level of support for, a particular feature in an implementation 
of the architecture. These fields follow an architectural model that aids their use by software and provides future 
compatibility. This section describes that model. AArch32 ID registers to which this scheme applies on 
page G4-4170 identifies the set of ID registers that are accessible from AArch32 state. 
Note 
These fields are distinct from register fields that enumerate the number of resources, such as the number of 
breakpoints, watchpoints, or performance monitors, or the amount of memory. 
Note 
The presence of an ID register field for a feature does not imply that the feature is optional. 
To provide forward compatibility, software can rely on the features of these fields that are described in this section. 
The ID fields, which are either signed or unsigned, use increasing numerical values to indicate increases in 
functionality. Therefore, if a value of 0x1 indicates the presence of some instructions, then the value 0x2 will indicate 
the presence of those instructions plus some additional instructions or functionality. This means software can be 
written in the form: 
if (value >= number) { 
// do something that relies on the value of the feature 
} 
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G4-4169 
ID092916 Non-Confidential 




For ID fields where the value 0x0 defines that a feature is not present, the field holds an unsigned value. This covers 
the vast majority of such fields. 

In a few cases, the architecture has been changed to permit implementations to exclude a feature that has previously 
been required and for which no ID field has been defined. In these cases, a new ID field is defined and: 

° The field holds a signed value. 
° The field value 0xF indicates that the feature is not implemented. 
° The field value 0x0 indicates that the feature is implemented. 
° Software that depends on the feature can use the test: 

if (value >= 0) { 
    // Software features that depend on the presence of the hardware feature 
} 
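The unsigned and signed tests differ only in how the 4-bit field is interpreted. The following sketch shows one way software might extract a field and apply each test. The function names and the `shift` parameter are hypothetical, and the register value is assumed to have been read already, for example with an MRC instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 4-bit ID field whose least significant bit is at 'shift',
 * as an unsigned value in the range 0 to 15. */
static unsigned id_field_unsigned(uint32_t reg, unsigned shift)
{
    return (reg >> shift) & 0xFu;
}

/* Extract the same field as a signed 4-bit value in the range -8 to 7,
 * so that 0xF is interpreted as -1. */
static int id_field_signed(uint32_t reg, unsigned shift)
{
    int v = (int)((reg >> shift) & 0xFu);
    return (v >= 8) ? v - 16 : v;
}

/* Unsigned scheme: the test "if (value >= number)". */
static bool feature_at_least(uint32_t reg, unsigned shift, unsigned level)
{
    return id_field_unsigned(reg, shift) >= level;
}

/* Signed scheme: the test "if (value >= 0)"; 0xF means not implemented. */
static bool signed_feature_present(uint32_t reg, unsigned shift)
{
    return id_field_signed(reg, shift) >= 0;
}
```

Because 0xF reads as -1 under the signed interpretation, the same greater-than-or-equal comparison continues to work if later values of the field indicate increased functionality.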


In some cases, it has been decided retrospectively that the increase in functionality between two consecutive 
numerical values is too great, and it is desirable to permit an intermediate degree of functionality, and the means to 
discover this. This is done by the introduction of a fractional field that both: 


° Is referred to in the definition of the original field. 


° Applies only when the original field is at the lower value of the step. 


In principle a fractional field can be used for two different fractional steps, with different meanings associated with 
each of these steps. For this reason, a fractional field must be interpreted in the context of the field to which it relates 
and the value of that field. Example G4-6 shows the use of such a field. 


Example G4-6 Example of the use of a fractional field 


For a field describing some class of functionality: 
° The value 0x1 was defined as indicating that item A is present. 


° The value 0x2 was defined as indicating that items B and C are present, in addition to item A. 


Subsequently, it might be necessary to introduce a second ID field to indicate that A and B only are present. This 
new field is a fractional field, and might be defined as having the value 0x1 when A and B only are present. This 
fractional field is valid only when the original ID field has the value 0x1. 


This approach means that: 


° Software that depends on the test if (value >= 0x2) can rely on features A, B, and C being present. 
° Software that depends on the test if (value >= 0x1) can rely on feature A being present. 
° If new software needs to check only that features A and B are present then it can test: 

if (value >= 0x2 || (value == 0x1 && fractional_value >= 0x1)) { 
    // Software features that depend on A and B only 
} 


A fractional field uses the same approach of increasing numerical values indicating increasing functionality, and the 
fractional approach can also be applied recursively to fractional fields. 
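The three tests of Example G4-6 can be written as predicates. This is a sketch under the assumptions of that example only: `value` is the original ID field, `fractional_value` is the fractional field, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Feature A is present when the original field is at least 0x1. */
static bool has_a(unsigned value)
{
    return value >= 0x1;
}

/* Features A, B, and C are present when the original field is at least 0x2. */
static bool has_a_b_c(unsigned value)
{
    return value >= 0x2;
}

/* Features A and B are present either because the original field has reached
 * 0x2, or because it is exactly 0x1 and the fractional field is at least 0x1.
 * The fractional field is consulted only at the lower value of the step. */
static bool has_a_and_b(unsigned value, unsigned fractional_value)
{
    return value >= 0x2 || (value == 0x1 && fractional_value >= 0x1);
}
```

Note that `has_a_and_b(0, 1)` is false: the fractional field is not applicable when the original field is 0x0.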


Unused ID fields, and fractional fields that are not applicable, are RES0 to allow their future use when features, or 
fractional implementation options, are added. 

AArch32 ID registers to which this scheme applies 

This scheme applies to the following AArch32 system registers: 

° The Auxiliary Feature register ID_AFR0. 

° The Processor Feature registers ID_PFR0 and ID_PFR1. 

° The Debug Feature register ID_DFR0. 

° The Memory Model Feature registers ID_MMFR0, ID_MMFR1, ID_MMFR2, ID_MMFR3, and 
ID_MMFR4. 

° The Instruction Set Attribute registers ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, ID_ISAR4, and 
ID_ISAR5. 

° The Media and VFP Feature registers MVFR0, MVFR1, and MVFR2. 


Note 


Principles of the ID scheme for fields in ID registers on page D7-1893 includes information about the AArch64 
System registers and the memory-mapped registers to which this scheme applies. 

G4.15 VMSAv8-32 organization of registers in the (coproc==0b1110) encoding space 

The System registers in the (coproc==0b1110) encoding space provide a number of distinct control functions, 
covering: 

° Debug. 

° Trace. 

° Execution environment control, for identification of the trivial Jazelle implementation. 


Because these functions are distinct, the descriptions of these registers are distributed, as follows: 


° In this manual Debug registers on page G6-4668 describes the Debug registers. 


° The Embedded Trace Macrocell Architecture Specification describes the Trace registers. 


This section summarizes the allocation of the System registers in the (coproc==0b1110) encoding space between 
these different functions, and the register encodings in this space that are reserved. 


The 32-bit System register encodings are classified by the {opc1, CRn, opc2, CRm} values required to access them using 
an MCR or an MRC instruction. The 64-bit System register encodings are classified by the {opc1, CRm} values required 
to access them using an MCRR or an MRRC instruction. For the registers in the (coproc==0b1110) encoding space, the 
opc1 value determines the primary allocation of these registers, as follows: 

opc1==0 Debug registers. 
opc1==1 Trace registers. 
opc1==7 Jazelle registers. Jazelle registers are implemented as required for a trivial Jazelle implementation. 

Other opc1 values 
Reserved. 
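The primary allocation by opc1 can be expressed as a simple classification. The enumeration and function names below are hypothetical, and the sketch covers only the primary allocation, not whether a particular encoding within a group is allocated.

```c
/* Functional groups of the (coproc==0b1110) encoding space. */
typedef enum {
    CP14_DEBUG,
    CP14_TRACE,
    CP14_JAZELLE,
    CP14_RESERVED
} cp14_group;

/* Map the opc1 value of an access to its primary allocation. */
static cp14_group cp14_group_for_opc1(unsigned opc1)
{
    switch (opc1) {
    case 0:  return CP14_DEBUG;
    case 1:  return CP14_TRACE;
    case 7:  return CP14_JAZELLE;
    default: return CP14_RESERVED;  /* all other opc1 values */
    }
}
```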


Note 


Primary allocation of (coproc==0b1110) register function by opc1 value differs from the allocation of 
(coproc==0b1111) registers, where primary allocation is by CRn value for 32-bit register accesses, or CRm value for 
64-bit register accesses. 








For the Debug and Jazelle registers, Table G4-43 on page G4-4173 defines: 
° The {opc1, CRn, opc2, CRm} values used for accessing the 32-bit registers using the MRC and MCR instructions. 

° The {opc1, CRm} values used for accessing the 64-bit register using the MRRC instruction. 


Some Debug registers can also be accessed using the LDC and STC instructions. Table G4-44 on page G4-4174 defines 
the CRn values used for accessing the registers using these instructions. 


Note 
The only permitted uses of the LDC and STC instructions are: 
° An LDC access to load data from memory to DBGDTRTXint. 
° An STC access to store data to memory from DBGDTRRXint. 





In the LDC and STC syntax descriptions in this Manual, the required coproc value of p14 and CRn value of c5 are given 
explicitly. 

G4.15.1 Register access instruction arguments, (coproc==0b1110) registers 

Table G4-43 shows the MCR, MRC, and MRRC instruction arguments required for accesses to each register that can be 
visible in the System register interface in the (coproc==0b1110) encoding. 

Table G4-43 Mapping of (coproc==0b1110) MCR, MRC, and MRRC instruction arguments to System registers 

opc1  CRn     opc2  CRm         Name            Width   Description
0     c0      0     c0          DBGDIDR         32-bit  Debug ID, or unallocated [a]
                    c1          DBGDSCRint      32-bit  Debug Status and Control Register, Internal View
                    c2          DBGDCCINT       32-bit  DCC Interrupt Enable Register
                    c5          DBGDTRRXint     32-bit  Debug Data Transfer Register, Receive, Internal View
                    c5          DBGDTRTXint     32-bit  Debug Data Transfer Register, Transmit, Internal View
                    c6          -               32-bit  Legacy DBGWFAR, RES0
                    c7          DBGVCR          32-bit  Debug Vector Catch Register
              2     c0          DBGDTRRXext     32-bit  Debug Data Transfer Register, Receive, External View
                    c2          DBGDSCRext      32-bit  Debug Status and Control Register, External View
                    c3          DBGDTRTXext     32-bit  Debug Data Transfer Register, Transmit, External View
                    c6          DBGOSECCR       32-bit  Debug OS Lock Exception Catch Register
              4     c0-15 [b]   DBGBVR<n>       32-bit  Debug Breakpoint Value Registers, n = 0-15
              5     c0-15 [b]   DBGBCR<n>       32-bit  Debug Breakpoint Control Registers, n = 0-15
              6     c0-15 [b]   DBGWVR<n>       32-bit  Debug Watchpoint Value Registers, n = 0-15
              7     c0-15 [b]   DBGWCR<n>       32-bit  Debug Watchpoint Control Registers, n = 0-15
      c1      0     c0          DBGDRAR         32-bit  Debug ROM Address Register
      -       -     c1                          64-bit
      c1      1     c0-15 [b]   DBGBXVR<n>      32-bit  Debug Breakpoint Extended Value Registers, n = 0-15
              4     c0          DBGOSLAR        32-bit  Debug OS Lock Access Register
                    c1          DBGOSLSR        32-bit  Debug OS Lock Status Register
                    c3          DBGOSDLR        32-bit  Debug OS Double Lock Register
                    c4          DBGPRCR         32-bit  Debug Power Control Register
      c2      0     c0          DBGDSAR         32-bit  Debug Self Address Register, or unallocated [a]
      -       -     c2                          64-bit
      c4      0-3   c0-15       -                       IMPLEMENTATION DEFINED
      c7      6     c8          DBGCLAIMSET     32-bit  Debug CLAIM Tag Set register
                    c9          DBGCLAIMCLR     32-bit  Debug CLAIM Tag Clear register
                    c14         DBGAUTHSTATUS   32-bit  Debug Authentication Status register
              7     c0          DBGDEVID2       32-bit  Debug Device ID register 2
                    c1          DBGDEVID1       32-bit  Debug Device ID register 1
                    c2          DBGDEVID        32-bit  Debug Device ID register
1     c0-c7   0-7   c0-c15      -               32-bit  Reserved for OPTIONAL Trace extension
7     c0      0     c0          JIDR            32-bit  Jazelle ID Register [c]
      c1      0     c0          JOSCR           32-bit  Jazelle OS Control Register [c]
      c2      0     c0          JMCR            32-bit  Jazelle Main Configuration Register [c]
All other encodings             -               32-bit  Unallocated





a. If EL1 cannot use AArch32 this register is OPTIONAL and deprecated. See the register description for details. 


b. Not implemented breakpoint and watchpoint register access instructions are unallocated. If EL2 is not implemented or breakpoint n is not 
context-aware, DBGBXVR<n> is unallocated. CRm encodes <n>, the breakpoint or watchpoint number. 


c. Legacy register, see Legacy feature registers, functional group on page G4-4211. 


Table G4-44 shows the LDC and STC instruction arguments required for accesses to the Debug registers that can be 
accessed using these instructions. 


Table G4-44 Mapping of LDC and STC instruction arguments to System registers 














CRn   Instruction   Name          Width   Description
c5    LDC           DBGDTRTXint   32-bit  Debug Data Transfer Register, Transmit, Internal View
c5    STC           DBGDTRRXint   32-bit  Debug Data Transfer Register, Receive, Internal View
Note 





In the instruction syntax descriptions for the LDC and STC instructions, the required coproc and CRn values are given 
explicitly as coproc==p14, CRn==c5. 

G4.16 VMSAv8-32 organization of registers in the (coproc==0b1111) encoding space 

For 32-bit accesses to the System registers in the (coproc==0b1111) encoding space, the ordered set of parameters 
{CRn, opc1, CRm, opc2} determine the register order. Within this ordering, the CRn value originally provided a 
functional grouping of these registers. As the number of System registers has increased this ordering has become 
less appropriate. 


This document now: 


° Groups the System registers in the (coproc==0b1111) encoding space by functional group, see Functional 
grouping of VMSAv8-32 System registers on page G4-4193. 

° Describes all of the System registers for VMSAv8-32, in Chapter G6 AArch32 System Register Descriptions. 

° Gives additional information about the organization of the VMSAv8-32 System registers in the 
(coproc==0b1111) encoding space, in the remainder of this section. 


This section presents information as follows: 


Register ordering by {CRn, opc1, CRm, opc2} 

See: 
° System register summary for (coproc==0b1111) encodings by CRn value on page G4-4176. 

° Full list of VMSAv8-32 System registers in the (coproc==0b1111) encoding space on 
page G4-4178. 

Note 


The ordered listing of (coproc==0b1111) registers by the {CRn, opc1, CRm, opc2} encoding of the 32-bit 
registers is most likely to be useful to those implementing AArch32 state, and to those validating 
such implementations. However, otherwise, the grouping of registers by function is more logical. 





Views of the registers, that depend on the current state of the PE 


See AArch32 views of the System registers in the (coproc==0b1111) encoding space on 
page G4-4190. 

Note 


The different register views are particularly significant in implementations that include EL2. 





In addition, the indexes in Appendix K12 Registers Index include all of the System registers. 

G4.16.1 System register summary for (coproc==0b1111) encodings by CRn value 


Figure G4-21 summarizes the grouping of the System registers in the (coproc==0b1111) encoding space, for a 
VMSAv8-32 implementation, by the value of CRn used for a 32-bit access to the register. 


CRn   opc1          CRm            opc2        Registers
c0    {0-2, 4}      {c0-c2}        {0-7}       ID registers
c1    {0, 4}        {c0, c1}       {0-7}       System control registers
c2    {0, 4}        {c0, c1}       {0-2}       Memory system control registers
c3    0             c0             0           Memory system control registers
c4    0             c6             0           GIC System register *, Debug exception registers
c5    {0, 4}        {c0, c1}       {0, 1}      Memory system fault registers
c6    {0, 4}        c0             {0, 2, 4}   Memory system fault registers
c7    {0, 4}        Various        Various     Cache maintenance, address translations, legacy operations
c8    {0, 4}        Various        Various     TLB maintenance operations
c9    {0-7}         Various        {0-7}       Performance monitors
c10   {0-7}         Various        {0-7}       Memory mapping registers and TLB operations
c11   {0-7}         {c0-c8, c15}   {0-7}       Reserved for DMA operations for TCM access
c12   {0-2, 4, 6}   Various        {0, 1}      System control registers, GIC System registers *
c13   {0, 4}        c0             {0-4}       Process, Context, and Thread ID registers
c14   {0-7}         {c0-c15}       {0-7}       Generic Timer registers *, Performance Monitors registers *
c15   {0-7}         {c0-c15}       {0-7}       IMPLEMENTATION DEFINED registers

* If implemented
Figure G4-21 System register groupings for (coproc==0b1111), for 32-bit registers 





Note 


For the System registers in the (coproc==0b1111) encoding space, Figure G4-21 gives only an overview of the 
assigned encodings for 32-bit registers for each of the CRn values c0-c15. For more information, see: 


° The full list of registers in the (coproc==0b1111) encoding space, in Full list of VMSAv8-32 System registers 
in the (coproc==0b1111) encoding space on page G4-4178, for the definition of the assigned and unassigned 
encodings for that register. 


° The register definitions in Chapter G6 AArch32 System Register Descriptions for any dependencies on the 
implemented Exception levels. 





In general, System register accesses using an unallocated set of {CRn, opc1, CRm, opc2} values are UNDEFINED. 
Behavior of VMSAv8-32 32-bit System registers with (coproc==0b1111, CRn==c0) on page G4-4177 describes 
the only exceptions to this rule. 


The 32-bit System registers with (coproc==0b1111, CRn==c15), and the corresponding 64-bit System registers, are 
reserved for IMPLEMENTATION DEFINED registers. For more information see Reserved encodings in the VMSAv8-32 
System register (coproc==0b1111) space on page G4-4177. 


The HSTR.Tn trap on (coproc==0b1111) System registers 


As General trapping to Hyp mode of Non-secure EL0 and EL1 accesses to System registers in the 
(coproc==0b1111) encoding space on page G1-3908 describes, when the value of HSTR.Tn is 1, Non-secure PL1 
accesses to System registers in the (coproc==0b1111) encoding space using a CRn or CRm value that corresponds to 
the value of Tn are trapped to PL2, even if the encoding is UNDEFINED when the value of HSTR.Tn is 0. This applies: 
° For 32-bit register accesses when the value of CRn in the MCR or MRC instruction corresponds to Tn. 


° For 64-bit register accesses when the value of CRm in the MCRR or MRRC instruction corresponds to Tn. 


If there are matching System register encodings that are accessible from Non-secure PL0 then those accesses are 
also trapped to PL2 when the value of HSTR.Tn is 1. 
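The selection of the Tn bit can be sketched as follows. This is an illustrative model, not a definition of the trap: the function name is hypothetical, it assumes that Tn occupies bit n of the HSTR value passed in, and it assumes the caller has already established that the access is a Non-secure PL1 or PL0 access in the (coproc==0b1111) encoding space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when HSTR.Tn is 1 for the n selected by the access:
 *   32-bit access (MCR or MRC):    n is the CRn value.
 *   64-bit access (MCRR or MRRC):  n is the CRm value.
 * Assumption: Tn is bit n of 'hstr'. */
static bool hstr_traps_access(uint32_t hstr, bool is_64bit,
                              unsigned crn, unsigned crm)
{
    unsigned n = (is_64bit ? crm : crn) & 0xFu;
    return ((hstr >> n) & 1u) != 0;
}
```

For example, with only T7 set, a 32-bit access with CRn==c7 is selected for trapping whatever its CRm value, while a 32-bit access with CRn==c8 and CRm==c7 is not.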
Behavior of VMSAv8-32 32-bit System registers with (coproc==0b1111, CRn==c0) 


In the (coproc==0b1111) encoding space, the 32-bit System registers with (CRn==c0) provide device and feature 
identification. That is, they provide registers in the functional group described in Identification registers, functional 
group on page G4-4194. 


Table G4-45 on page G4-4179 shows all of the architecturally required System registers with {coproc==0b1111, 
CRn==c0}. The behavior of 32-bit System register encodings in this group that are not shown in the table, and 
encodings that are part of an unimplemented Exception level, depends on the value of opc1, and possibly on the 
value of CRm and opc2, as follows: 


opc1 == 0 All write accesses to the encodings are UNDEFINED. 
For read accesses: 
° The following encodings return an UNKNOWN value: 
— CRm==3, opc2=={0, 1, 2}. 
— CRm=={4, 6, 7}, opc2=={0, 1}. 
— CRm==5, opc2=={0, 1, 4, 5}. 
° All other encodings are RES0. 
opc1 > 0 All accesses to the encodings are UNDEFINED. 
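The rules above amount to a small decision function. The sketch below is illustrative: the type and function names are hypothetical, it reads the first case as opc1 == 0 (the complement of the opc1 > 0 case), and it applies only to encodings with CRn==c0 that are not allocated to a register in the table or that belong to an unimplemented Exception level.

```c
#include <stdbool.h>

typedef enum { C0_UNDEFINED, C0_UNKNOWN, C0_RES0 } c0_behavior;

/* Behavior of an access to an encoding in the (coproc==0b1111, CRn==c0)
 * group that is not an architecturally required register. */
static c0_behavior c0_unallocated_behavior(unsigned opc1, unsigned crm,
                                           unsigned opc2, bool is_write)
{
    if (opc1 > 0 || is_write)
        return C0_UNDEFINED;          /* all writes, and all opc1 > 0 accesses */

    switch (crm) {                    /* reads with opc1 == 0 */
    case 3:
        if (opc2 <= 2) return C0_UNKNOWN;
        break;
    case 4: case 6: case 7:
        if (opc2 <= 1) return C0_UNKNOWN;
        break;
    case 5:
        if (opc2 == 0 || opc2 == 1 || opc2 == 4 || opc2 == 5) return C0_UNKNOWN;
        break;
    default:
        break;
    }
    return C0_RES0;                   /* all other reads */
}
```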


See also Accesses to unallocated encodings in the (coproc==0b111x) encoding space on page G4-4151. 


Note 


Some of these registers were previously described as being part of the CPUID identification scheme, see The 
CPUID identification scheme on page G4-4195. 








Reserved encodings in the VMSAv8-32 System register (coproc==0b1111) space 


AArch32 state reserves a number of regions in the (coproc==0b1111) encoding space for IMPLEMENTATION 
DEFINED System registers. These reservations are defined in terms of the encoding of 32-bit accesses to the System 
register encoding space. That is, they are defined by the reserved 32-bit {CRn, opc1, CRm, opc2} encodings. 


In ARMv8, reserved encodings that do not have an IMPLEMENTATION DEFINED function are UNDEFINED. 
The following subsections give more information about these reserved encodings: 

° Reserved 32-bit encodings with {coproc==0b1111, CRn==c9}. 

° Reserved 32-bit encodings with {coproc==0b1111, CRn==c10} on page G4-4178. 

° Reserved 32-bit encodings with {coproc==0b1111, CRn==c11} on page G4-4178. 

° Reserved 32-bit encodings with {coproc==0b1111, CRn==c15} on page G4-4178. 


Reserved 32-bit encodings with {coproc==0b1111, CRn==c9} 


In the AArch32 encoding space, for 32-bit encodings with {coproc==0b1111, CRn==c9}, the following encodings 
are reserved for IMPLEMENTATION DEFINED purposes: 


° Encodings with {coproc==0b1111, CRn==c9, opc1=={0-7}, CRm=={c0-c2, c5-c8}} are reserved for 
IMPLEMENTATION DEFINED branch predictor, cache, and TCM operations. 


° Encodings with {coproc==0b1111, CRn==c9, opc1=={0-7}, CRm==c15} are reserved for IMPLEMENTATION 
DEFINED performance monitors. 

Note 


These are distinct from the OPTIONAL ARM Performance Monitors Extension, the registers for which use the 
encoding space {coproc==0b1111, CRn==c9, opc1=={0-7}, CRm=={c12-c14}}. 

Reserved 32-bit encodings with {coproc==0b1111, CRn==c10} 


In the AArch32 encoding space, for 32-bit encodings with {coproc==0b1111, CRn==c10}, the following encodings 
are reserved for IMPLEMENTATION DEFINED purposes: 


° Encodings with {coproc==0b1111, CRn==c10, opc1=={0-7}, CRm=={c0, c1, c4, c8}} are reserved for 
IMPLEMENTATION DEFINED TLB lockdown operations. 


Reserved 32-bit encodings with {coproc==0b1111, CRn==c11} 


In the AArch32 encoding space, for 32-bit encodings with {coproc==0b1111, CRn==c11}, the following encodings 
are reserved for IMPLEMENTATION DEFINED purposes: 


° Encodings with {coproc==0b1111, CRn==c11, opc1=={0-7}, CRm=={c0-c8, c15}} are reserved for 
IMPLEMENTATION DEFINED DMA operations for TCM access. 


In ARMv8, the remainder of the AArch32 {coproc==0b1111, CRn==c11} encoding space is UNDEFINED. 


Reserved 32-bit encodings with {coproc==0b1111, CRn==c15} 


ARMv8 reserves the AArch32 System register encodings with (coproc==0b1111, CRn==c15) for IMPLEMENTATION 
DEFINED purposes, and does not impose any restrictions on the use of these encodings. The documentation of 
the ARM implementation must describe fully any registers implemented in the {coproc==0b1111, CRn==c15} 
encoding space. Normally, for processor implementations by ARM, this information is included in the Technical 
Reference Manual for the processor. 


Typically, an implementation uses the {coproc==0b1111, CRn==c15} encodings to provide test features, and any 
required configuration options that are not covered by this Manual. 


This reservation means that the AArch32 64-bit encodings with {coproc==0b1111, CRn==c15} are also reserved for 
IMPLEMENTATION DEFINED purposes, without any restrictions on the use of these encodings. 
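The reserved regions described in these subsections can be summarized as a predicate on the CRn and CRm values of a 32-bit encoding. This is a sketch for illustration: the function name is hypothetical, and it assumes the reservations apply for every opc1 and opc2 value in the range 0-7, so those parameters are not inspected.

```c
#include <stdbool.h>

/* Returns true when a 32-bit (coproc==0b1111) encoding with the given CRn and
 * CRm values lies in a region reserved for IMPLEMENTATION DEFINED purposes. */
static bool cp15_impdef_reserved(unsigned crn, unsigned crm)
{
    switch (crn) {
    case 9:   /* branch predictor, cache, TCM operations; performance monitors */
        return crm <= 2 || (crm >= 5 && crm <= 8) || crm == 15;
    case 10:  /* TLB lockdown operations */
        return crm == 0 || crm == 1 || crm == 4 || crm == 8;
    case 11:  /* DMA operations for TCM access */
        return crm <= 8 || crm == 15;
    case 15:  /* entire CRn==c15 space */
        return true;
    default:
        return false;
    }
}
```

For example, CRn==c9 with CRm==c12 returns false, because that encoding space belongs to the OPTIONAL Performance Monitors Extension rather than to the IMPLEMENTATION DEFINED reservation.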


G4.16.2 Full list of VMSAv8-32 System registers in the (coproc==0b1111) encoding space 


Table G4-45 on page G4-4179 shows the System registers in the (coproc==0b1111) encoding space in VMSAv8-32, 
in the order of the {CRn, opc1, CRm, opc2} parameter values used in MCR or MRC accesses to the 32-bit registers: 


° For MCR or MRC accesses to the 32-bit registers, CRn is the primary identifier of the target System register for 
the access. This applies, also, to MCR or MRC instructions that provide 32-bit accesses to a single word of a 64-bit 
System register. 


° For MCRR or MRRC accesses to the 64-bit registers, CRm is the primary identifier of the target System register for 
the access. Table G4-45 on page G4-4179 orders the 64-bit registers with the 32-bit registers accessed using 
the same primary register identifier. For example, the 64-bit encoding of TTBR0, that is accessed with 
(CRm==c2), is listed with the 32-bit registers that are accessed with (CRn==c2). 
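To make the roles of the {CRn, opc1, CRm, opc2} parameters concrete, the following sketch packs them into an A32 MRC instruction word. It is an illustration rather than part of this section: the function name is hypothetical, the condition field is assumed to be 0b1110 (always), and the field positions are those of the A32 MRC encoding.

```c
#include <stdint.h>

/* Build the A32 encoding of: MRC p<coproc>, <opc1>, R<rt>, c<crn>, c<crm>, <opc2>
 * Field positions: opc1[23:21], CRn[19:16], Rt[15:12], coproc[11:8],
 * opc2[7:5], CRm[3:0]. The base value sets cond==0b1110, bits[27:24]==0b1110,
 * bit[20]==1 (read), and bit[4]==1. */
static uint32_t encode_mrc(unsigned coproc, unsigned opc1, unsigned rt,
                           unsigned crn, unsigned crm, unsigned opc2)
{
    return 0xEE100010u
         | ((uint32_t)(opc1   & 0x7u) << 21)
         | ((uint32_t)(crn    & 0xFu) << 16)
         | ((uint32_t)(rt     & 0xFu) << 12)
         | ((uint32_t)(coproc & 0xFu) << 8)
         | ((uint32_t)(opc2   & 0x7u) << 5)
         |  (uint32_t)(crm    & 0xFu);
}
```

With these assumptions, a read of MIDR into R0 (coproc 15, opc1 0, CRn c0, CRm c0, opc2 0) encodes as 0xEE100F10, and a read of MPIDR (opc2 5) as 0xEE100FB0.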
Table G4-45 VMSAv8-32 (coproc==0b1111) register summary, in MCR/MRC parameter order 

CRn  opc1  CRm  opc2          Name        Width   Description
c0   0     c0   0             MIDR        32-bit  Main ID Register
                1             CTR         32-bit  Cache Type Register
                2             TCMTR       32-bit  TCM Type Register
                3             TLBTR       32-bit  TLB Type Register
                4, 6 [a], 7   MIDR        32-bit  Aliases of Main ID Register
                5             MPIDR       32-bit  Multiprocessor Affinity Register
                6 [a]         REVIDR      32-bit  Revision ID Register
           c1   0             ID_PFR0     32-bit  Processor Feature Register 0
                1             ID_PFR1     32-bit  Processor Feature Register 1
                2             ID_DFR0     32-bit  Debug Feature Register 0
                3             ID_AFR0     32-bit  Auxiliary Feature Register 0
                4             ID_MMFR0    32-bit  Memory Model Feature Register 0
                5             ID_MMFR1    32-bit  Memory Model Feature Register 1
                6             ID_MMFR2    32-bit  Memory Model Feature Register 2
                7             ID_MMFR3    32-bit  Memory Model Feature Register 3
           c2   0             ID_ISAR0    32-bit  Instruction Set Attribute Register 0
                1             ID_ISAR1    32-bit  Instruction Set Attribute Register 1
                2             ID_ISAR2    32-bit  Instruction Set Attribute Register 2
                3             ID_ISAR3    32-bit  Instruction Set Attribute Register 3
                4             ID_ISAR4    32-bit  Instruction Set Attribute Register 4
                5             ID_ISAR5    32-bit  Instruction Set Attribute Register 5
                6             ID_MMFR4    32-bit  Memory Model Feature Register 4
     1     c0   0             CCSIDR      32-bit  Cache Size ID Registers
                1             CLIDR       32-bit  Cache Level ID Register
                7             AIDR        32-bit  Auxiliary ID Register
     2     c0   0             CSSELR      32-bit  Cache Size Selection Register
     4     c0   0             VPIDR [b]   32-bit  Virtualization Processor ID Register
                5             VMPIDR [b]  32-bit  Virtualization Multiprocessor ID Register





c1   0     c0     0          SCTLR             32-bit  System Control Register
                  1          ACTLR             32-bit  Auxiliary Control Register
                  2          CPACR             32-bit  Architectural Feature Access Control Register
                  3          ACTLR2            32-bit  Auxiliary Control Register 2
           c1     0          SCR[c]            32-bit  Secure Configuration Register
                  1          SDER[c]           32-bit  Secure Debug Enable Register
                  2          NSACR[c]          32-bit  Non-Secure Access Control Register
           c3     1          SDCR[d]           32-bit  Secure Debug Configuration Register
c1   4     c0     0          HSCTLR[b]         32-bit  Hyp System Control Register
                  1          HACTLR[b]         32-bit  Hyp Auxiliary Control Register
                  3          HACTLR2[b]        32-bit  Hyp Auxiliary Control Register 2
           c1     0          HCR[b]            32-bit  Hyp Configuration Register
                  1          HDCR[b]           32-bit  Hyp Debug Configuration Register
                  2          HCPTR[b]          32-bit  Hyp Architectural Feature Trap Register
                  3          HSTR[b]           32-bit  Hyp System Trap Register
                  4          HCR2[b, d]        32-bit  Hyp Configuration Register 2
                  7          HACR[b]           32-bit  Hyp Auxiliary Configuration Register
c2   0     c0     0          TTBR0             32-bit  Translation Table Base Register 0
-    0     c2     -          TTBR0             64-bit
c2   0     c0     1          TTBR1             32-bit  Translation Table Base Register 1
-    1     c2     -          TTBR1             64-bit
c2   0     c0     2          TTBCR             32-bit  Translation Table Base Control Register
     4     c0     2          HTCR[b]           32-bit  Hyp Translation Control Register
           c1     2          VTCR[b]           32-bit  Virtualization Translation Control Register
-    4     c2     -          HTTBR[b]          64-bit  Hyp Translation Table Base Register
-    6     c2     -          VTTBR[b]          64-bit  Virtualization Translation Table Base Register
c3   0     c0     0          DACR              32-bit  Domain Access Control Register
c4   0     c6     0          ICC_PMR[e]        32-bit  Interrupt Controller Interrupt Priority Mask Register
                             ICV_PMR[e]                Interrupt Controller Virtual Priority Mask Register
     3     c5     0          DSPSR[d]          32-bit  Debug Saved Program Status Register[f]
                  1          DLR[d]            32-bit  Debug Link Register[f]
c5   0     c0     0          DFSR              32-bit  Data Fault Status Register
                  1          IFSR              32-bit  Instruction Fault Status Register
           c1     0          ADFSR             32-bit  Auxiliary Data Fault Status Register
                  1          AIFSR             32-bit  Auxiliary Instruction Fault Status Register
     4     c1     0          HADFSR[b]         32-bit  Hyp Auxiliary Data Fault Syndrome Register
                  1          HAIFSR            32-bit  Hyp Auxiliary Instruction Fault Syndrome Register
           c2     0          HSR[b]            32-bit  Hyp Syndrome Register
c6   0     c0     0          DFAR              32-bit  Data Fault Address Register
                  2          IFAR              32-bit  Instruction Fault Address Register
     4     c0     0          HDFAR[b]          32-bit  Hyp Data Fault Address Register
                  2          HIFAR[b]          32-bit  Hyp Instruction Fault Address Register
                  4          HPFAR[b]          32-bit  Hyp IPA Fault Address Register
c7   0     c1     0          ICIALLUIS         32-bit  See Cache maintenance instructions, functional group on page G4-4201
                  6          BPIALLIS          32-bit
           c4     0          PAR               32-bit  Physical Address Register
-    0     c7     -          PAR               64-bit
c7   0     c5     0          ICIALLU           32-bit  See Cache maintenance instructions, functional group on page G4-4201
                  1          ICIMVAU           32-bit
                  4          CP15ISB           32-bit  See Memory barriers on page E2-2335
                  6          BPIALL            32-bit  See Cache maintenance instructions, functional group on page G4-4201
                  7          BPIMVA            32-bit
           c6     1          DCIMVAC           32-bit  See Cache maintenance instructions, functional group on page G4-4201
                  2          DCISW             32-bit
           c8     0          ATS1CPR           32-bit  See Address translation instructions on page G4-4142
                  1          ATS1CPW           32-bit
                  2          ATS1CUR           32-bit
                  3          ATS1CUW           32-bit
                  4          ATS12NSOPR[c]     32-bit
                  5          ATS12NSOPW[c]     32-bit
                  6          ATS12NSOUR[c]     32-bit
                  7          ATS12NSOUW[c]     32-bit
c7   0     c10    1          DCCMVAC           32-bit  See Cache maintenance instructions, functional group on page G4-4201
                  2          DCCSW             32-bit
                  4          CP15DSB           32-bit  See Memory barriers on page E2-2335
                  5          CP15DMB           32-bit
     0     c11    1          DCCMVAU           32-bit  See Cache maintenance instructions, functional group on page G4-4201
           c14    1          DCCIMVAC          32-bit  See Cache maintenance instructions, functional group on page G4-4201
                  2          DCCISW            32-bit
     4     c8     0          ATS1HR[b]         32-bit  See Address translation instructions on page G4-4142
                  1          ATS1HW[b]         32-bit
c8   0     c3     0          TLBIALLIS         32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  1          TLBIMVAIS         32-bit
                  2          TLBIASIDIS        32-bit
                  3          TLBIMVAAIS        32-bit
                  5          TLBIMVALIS[d]     32-bit
                  7          TLBIMVAALIS[d]    32-bit
           c5     0          ITLBIALL          32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  1          ITLBIMVA          32-bit
                  2          ITLBIASID         32-bit
           c6     0          DTLBIALL          32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  1          DTLBIMVA          32-bit
                  2          DTLBIASID         32-bit
           c7     0          TLBIALL           32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  1          TLBIMVA           32-bit
                  2          TLBIASID          32-bit
                  3          TLBIMVAA          32-bit
                  5          TLBIMVAL[d]       32-bit
                  7          TLBIMVAAL[d]      32-bit
     4     c0     1          TLBIIPAS2IS[d]    32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  5          TLBIIPAS2LIS[d]   32-bit
c8   4     c3     0          TLBIALLHIS[b]     32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  1          TLBIMVAHIS[b]     32-bit
                  4          TLBIALLNSNHIS[b]  32-bit
                  5          TLBIMVALHIS[d]    32-bit
     4     c4     1          TLBIIPAS2[d]      32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  5          TLBIIPAS2L[d]     32-bit
           c7     0          TLBIALLH[b]       32-bit  See The scope of TLB maintenance instructions on page G4-4101
                  1          TLBIMVAH[b]       32-bit
                  4          TLBIALLNSNH[b]    32-bit
                  5          TLBIMVALH[d]      32-bit
c9   0-7   c0-c2  0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED branch predictor, cache, and TCM operations
           c5-c8  0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED branch predictor, cache, and TCM operations
     0     c12    0          PMCR              32-bit  Performance Monitors Control Register
                  1          PMCNTENSET        32-bit  Performance Monitors Count Enable Set register
                  2          PMCNTENCLR        32-bit  Performance Monitors Count Enable Clear register
                  3          PMOVSR            32-bit  Performance Monitors Overflow Flag Status Register
                  4          PMSWINC           32-bit  Performance Monitors Software Increment register
                  5          PMSELR            32-bit  Performance Monitors Event Counter Selection Register
                  6          PMCEID0           32-bit  Performance Monitors Common Event Identification register 0
                  7          PMCEID1           32-bit  Performance Monitors Common Event Identification register 1
     0     c13    0          PMCCNTR           32-bit  Performance Monitors Cycle Count Register
-    0     c9     -          PMCCNTR_EL0       64-bit  Performance Monitors Cycle Count Register
c9   0     c13    1          PMXEVTYPER        32-bit  Performance Monitors Event Type Select Register
                  2          PMXEVCNTR         32-bit  Performance Monitors Event Count Register
     0     c14    0          PMUSERENR         32-bit  Performance Monitors User Enable Register
                  1          PMINTENSET        32-bit  Performance Monitors Interrupt Enable Set register
                  2          PMINTENCLR        32-bit  Performance Monitors Interrupt Enable Clear register
                  3          PMOVSSET[b]       32-bit  Performance Monitors Overflow Flag Status Set register
     0-7   c15    0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED performance monitors
c10  0     c0-c1  0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
           c2     0          PRRR[g]           32-bit  Primary Region Remap Register
                             MAIR0[g]          32-bit  Memory Attribute Indirection Register 0
                  1          NMRR[g]           32-bit  Normal Memory Remap Register
                             MAIR1[g]          32-bit  Memory Attribute Indirection Register 1
           c3     0          AMAIR0            32-bit  Auxiliary Memory Attribute Indirection Register 0
                  1          AMAIR1            32-bit  Auxiliary Memory Attribute Indirection Register 1
           c4, c8 0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
     1-3   c0, c1, c4, c8  0-7  -              32-bit  Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
     4     c0, c1 0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
           c2     0          HMAIR0[b]         32-bit  Hyp Memory Attribute Indirection Register 0
                  1          HMAIR1[b]         32-bit  Hyp Memory Attribute Indirection Register 1
           c3     0          HAMAIR0[b]        32-bit  Hyp Auxiliary Memory Attribute Indirection Register 0
                  1          HAMAIR1[b]        32-bit  Hyp Auxiliary Memory Attribute Indirection Register 1
           c4, c8 0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
     5-7   c0, c1, c4, c8  0-7  -              32-bit  Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
c11  0-7   c0-c8  0-7        -                 32-bit  Reserved for IMPLEMENTATION DEFINED DMA operations for TCM access.
           c15    0-7        -                 32-bit
-    0     c12    -          ICC_SGI1R[e]      64-bit  Interrupt Controller SGI group 1 Register
c12  0     c0     0          VBAR              32-bit  Vector Base Address Register
                  1          MVBAR[c]          32-bit  Monitor Vector Base Address Register
                             RVBAR             32-bit  Reset Vector Base Address Register
                  2          RMR (at EL1)[h]   32-bit  Reset Management Register, at EL1
                             RMR (at EL3)[h]   32-bit  Reset Management Register, at EL3
           c1     0          ISR[c]            32-bit  Interrupt Status Register
c12  0     c8     0          ICC_IAR0[e]       32-bit  Interrupt Controller Interrupt Acknowledge Register 0
                             ICV_IAR0[e]               Interrupt Controller Virtual Acknowledge Register 0
                  1          ICC_EOIR0[e]      32-bit  Interrupt Controller End Of Interrupt Register 0
                             ICV_EOIR0[e]              Interrupt Controller Virtual End Of Interrupt Register 0
                  2          ICC_HPPIR0[e]     32-bit  Interrupt Controller Highest Priority Pending Interrupt Register 0
                             ICV_HPPIR0[e]             Interrupt Controller Virtual Highest Priority Pending Interrupt Register 0
                  3          ICC_BPR0[e]       32-bit  Interrupt Controller Binary Point Register 0
                             ICV_BPR0[e]               Interrupt Controller Virtual Binary Point Register 0
                  4          ICC_AP0R0[e]      32-bit  Interrupt Controller Active Priorities Register (0,0)
                             ICV_AP0R0[e]              Interrupt Controller Virtual Active Priorities Register (0,0)
                  5          ICC_AP0R1[e]      32-bit  Interrupt Controller Active Priorities Register (0,1)
                             ICV_AP0R1[e]              Interrupt Controller Virtual Active Priorities Register (0,1)
                  6          ICC_AP0R2[e]      32-bit  Interrupt Controller Active Priorities Register (0,2)
                             ICV_AP0R2[e]              Interrupt Controller Virtual Active Priorities Register (0,2)
                  7          ICC_AP0R3[e]      32-bit  Interrupt Controller Active Priorities Register (0,3)
                             ICV_AP0R3[e]              Interrupt Controller Virtual Active Priorities Register (0,3)
           c9     0          ICC_AP1R0[e]      32-bit  Interrupt Controller Active Priorities Register (1,0)
                             ICV_AP1R0[e]              Interrupt Controller Virtual Active Priorities Register (1,0)
                  1          ICC_AP1R1[e]      32-bit  Interrupt Controller Active Priorities Register (1,1)
                             ICV_AP1R1[e]              Interrupt Controller Virtual Active Priorities Register (1,1)
                  2          ICC_AP1R2[e]      32-bit  Interrupt Controller Active Priorities Register (1,2)
                             ICV_AP1R2[e]              Interrupt Controller Virtual Active Priorities Register (1,2)
                  3          ICC_AP1R3[e]      32-bit  Interrupt Controller Active Priorities Register (1,3)
                             ICV_AP1R3[e]              Interrupt Controller Virtual Active Priorities Register (1,3)
           c11    1          ICC_DIR[e]        32-bit  Interrupt Controller Deactivate Interrupt Register
                             ICV_DIR[e]                Interrupt Controller Deactivate Virtual Interrupt Register
                  3          ICC_RPR[e]        32-bit  Interrupt Controller Running Priority Register
                             ICV_RPR[e]                Interrupt Controller Virtual Running Priority Register





c12  0     c12    0          ICC_IAR1[e]       32-bit  Interrupt Controller Interrupt Acknowledge Register 1
                             ICV_IAR1[e]               Interrupt Controller Virtual Interrupt Acknowledge Register 1
                  1          ICC_EOIR1[e]      32-bit  Interrupt Controller End Of Interrupt Register 1
                             ICV_EOIR1[e]              Interrupt Controller Virtual End Of Interrupt Register 1
                  2          ICC_HPPIR1[e]     32-bit  Interrupt Controller Highest Priority Pending Interrupt Register 1
                             ICV_HPPIR1[e]             Interrupt Controller Virtual Highest Priority Pending Interrupt Register 1
                  3          ICC_BPR1[e]       32-bit  Interrupt Controller Binary Point Register 1
                             ICV_BPR1[e]               Interrupt Controller Virtual Binary Point Register 1
                  4          ICC_CTLR[e]       32-bit  Interrupt Controller Control Register
                             ICV_CTLR[e]               Interrupt Controller Virtual Control Register
                  5          ICC_SRE[e]        32-bit  Interrupt Controller System Register Enable Register
                  6          ICC_IGRPEN0[e]    32-bit  Interrupt Controller Interrupt Group 0 Enable Register
                             ICV_IGRPEN0[e]            Interrupt Controller Virtual Interrupt Group 0 Enable Register
                  7          ICC_IGRPEN1[e]    32-bit  Interrupt Controller Interrupt Group 1 Enable Register
                             ICV_IGRPEN1[e]            Interrupt Controller Virtual Interrupt Group 1 Enable Register
-    1     c12    -          ICC_ASGI1R[e]     64-bit  Interrupt Controller Alias SGI group 1 Register
-    2     c12    -          ICC_SGI0R[e]      64-bit  Interrupt Controller SGI group 0 Register
c12  4     c0     0          HVBAR[b]          32-bit  Hyp Vector Base Address Register
                  2          HRMR[b, h]        32-bit  Hyp Reset Management Register
           c8     0          ICH_AP0R0[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (0,0)
                  1          ICH_AP0R1[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (0,1)
                  2          ICH_AP0R2[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (0,2)
                  3          ICH_AP0R3[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (0,3)
           c9     0          ICH_AP1R0[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (1,0)
                  1          ICH_AP1R1[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (1,1)
                  2          ICH_AP1R2[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (1,2)
                  3          ICH_AP1R3[e]      32-bit  Interrupt Controller Hyp Active Priorities Register (1,3)
                  5          ICC_HSRE[e]       32-bit  Interrupt Controller Hyp System Register Enable register
c12  4     c11    0          ICH_HCR[e]        32-bit  Interrupt Controller Hyp Control Register
                  1          ICH_VTR[e]        32-bit  Interrupt Controller VGIC Type Register
                  2          ICH_MISR[e]       32-bit  Interrupt Controller Maintenance Interrupt State Register
                  3          ICH_EISR[e]       32-bit  Interrupt Controller End of Interrupt Status Register
                  5          ICH_ELRSR[e]      32-bit  Interrupt Controller Empty List Register Status Register
                  7          ICH_VMCR[e]       32-bit  Interrupt Controller Virtual Machine Control Register
           c12    0-7        ICH_LR<n>, for n==0 to 7[e]    32-bit  Interrupt Controller List Registers 0 to 7
           c13    0-7        ICH_LR<n>, for n==8 to 15[e]   32-bit  Interrupt Controller List Registers 8 to 15
           c14    0-7        ICH_LRC<n>, for n==0 to 7[e]   32-bit  Interrupt Controller List Registers 0 to 7, continuation
           c15    0-7        ICH_LRC<n>, for n==8 to 15[e]  32-bit  Interrupt Controller List Registers 8 to 15, continuation
     6     c12    4          ICC_MCTLR[e]      32-bit  Interrupt Controller Monitor Control Register
                  5          ICC_MSRE[e]       32-bit  Interrupt Controller Monitor System Register Enable register
                  7          ICC_MGRPEN1[e]    32-bit  Interrupt Controller Monitor Interrupt Group 1 Enable register
c13  0     c0     0          FCSEIDR           32-bit  FCSE Process ID Register
                  1          CONTEXTIDR        32-bit  Context ID Register
                  2          TPIDRURW          32-bit  User Read/Write Thread ID Register
                  3          TPIDRURO          32-bit  User Read-Only Thread ID Register
                  4          TPIDRPRW          32-bit  EL1 only Thread ID Register
     4     c0     2          HTPIDR[b]         32-bit  Hyp Software Thread ID Register
-    0     c14    -          CNTPCT[i]         64-bit  Physical Count register
c14  0     c0     0          CNTFRQ[i]         32-bit  Counter Frequency register
           c1     0          CNTKCTL[i]        32-bit  Timer EL1 Control register
           c2     0          CNTP_TVAL[i]      32-bit  EL1 Physical TimerValue register
                  1          CNTP_CTL[i]       32-bit  EL1 Physical Timer Control register
           c3     0          CNTV_TVAL[i]      32-bit  Virtual TimerValue register
                  1          CNTV_CTL[i]       32-bit  Virtual Timer Control register
c14  0     c8     0-7        PMEVCNTR<n>, for n==0 to 7[d]     32-bit  Performance Monitors Event Count Registers, 0-7
           c9     0-7        PMEVCNTR<n>, for n==8 to 15[d]    32-bit  Performance Monitors Event Count Registers, 8-15
           c10    0-7        PMEVCNTR<n>, for n==16 to 23[d]   32-bit  Performance Monitors Event Count Registers, 16-23
           c11    0-6        PMEVCNTR<n>, for n==24 to 30[d]   32-bit  Performance Monitors Event Count Registers, 24-30
           c12    0-7        PMEVTYPER<n>, for n==0 to 7[d]    32-bit  Performance Monitors Event Type Registers, 0-7
           c13    0-7        PMEVTYPER<n>, for n==8 to 15[d]   32-bit  Performance Monitors Event Type Registers, 8-15
           c14    0-7        PMEVTYPER<n>, for n==16 to 23[d]  32-bit  Performance Monitors Event Type Registers, 16-23
           c15    0-6        PMEVTYPER<n>, for n==24 to 30[d]  32-bit  Performance Monitors Event Type Registers, 24-30
           c15    7          PMCCFILTR[d]      32-bit  Performance Monitors Cycle Count Filter Register
-    1     c14    -          CNTVCT[i]         64-bit  Virtual Count register
-    2     c14    -          CNTP_CVAL[i]      64-bit  EL1 Physical Timer CompareValue register
-    3     c14    -          CNTV_CVAL[i]      64-bit  Virtual Timer CompareValue register
-    4     c14    -          CNTVOFF[j]        64-bit  Virtual Offset register
c14  4     c1     0          CNTHCTL           32-bit  Timer EL2 Control register
           c2     0          CNTHP_TVAL        32-bit  EL2 Physical TimerValue register
                  1          CNTHP_CTL         32-bit  EL2 Physical Timer Control register
-    6     c14    -          CNTHP_CVAL        64-bit  EL2 Physical Timer CompareValue register
c15  0-7   c0-c15 0-7        -                 32-bit  See Reserved 32-bit encodings with {coproc==0b1111, CRn==c15} on page G4-4178





a. REVIDR is an optional register. If it is not implemented, the encoding with opc2 set to 6 is an alias of MIDR. 

b. Implemented only as part of EL2, when EL2 is using AArch32. Otherwise, encoding is unallocated and CONSTRAINED UNPREDICTABLE, see 
Accesses to unallocated encodings in the (coproc==0b111x) encoding space on page G4-4151. 

c. Implemented only as part of EL3, when EL3 is using AArch32. Otherwise, as described in Accesses to unallocated encodings in the 
(coproc==0b111x) encoding space on page G4-4151, encoding is unallocated and: 
* UNDEFINED, for the registers accessed using CRn set to c12. 
* CONSTRAINED UNPREDICTABLE, for the registers accessed using CRn values other than c12. 

d. Introduced in ARMv8. 

e. Introduced in ARMv8. Implemented only if the implementation includes the GIC System registers. For information see Generic Interrupt 
Controller System registers, functional groups on page G4-4207. As that subsection describes, each ICV_* register uses the same encoding 
as the corresponding ICC_* register. 

f. This register is only accessible in Debug state. 

g. When an implementation is using the Long descriptor translation table format these encodings access the MAIR0 and MAIR1 registers. 
Otherwise, they access the PRRR and NMRR. 

h. Introduced in ARMv8. Only one of RMR (at EL1), HRMR, and RMR (at EL3) is implemented, corresponding to the highest implemented 
Exception level, and the register is implemented only if that Exception level is using AArch32. 

i. Implemented as part of the Generic Timer. 

j. Implemented as RW as part of the Generic Timer on an implementation that includes EL2 and when EL2 is using AArch32. For more 
information see Status of the CNTVOFF register on page G5-4223. 







G4.16.3 AArch32 views of the System registers in the (coproc==0b1111) encoding space 


The following sections summarize the different software views of the System registers in the (coproc==0b1111) 
encoding space, for VMSAv8-32: 





• PL0 views of the System registers in the (coproc==0b1111) encoding space. 

• PL1 views of the System registers in the (coproc==0b1111) encoding space on page G4-4192. 

• Non-secure PL2 view of the System registers in the (coproc==0b1111) encoding space on page G4-4192. 

Note 


In previous versions of the ARM architecture, the behavior of accesses to the System registers in the 
(coproc==0b1111) encoding space was described in relation to the PE modes. In Secure state, the Exception level 
of the privileged PE modes depends on whether EL3 is using AArch32, or is using AArch64. This means it is 
simpler to describe the views of the registers in terms of privilege levels, PL0-PL2, rather than the Exception levels 
EL1-EL3. This is because each AArch32 PE mode is associated with a particular privilege level, but in Secure state 
its Exception level can depend on the EL3 Execution state. For more information see Security state, Exception 
levels, and AArch32 execution privilege on page G1-3792. 





PL0 views of the System registers in the (coproc==0b1111) encoding space 


Software executing at PL0, unprivileged, can access only a small subset of the System registers in the 
(coproc==0b1111) encoding space, as Table G4-46 shows. This table excludes possible PL0 access to the following 
registers: 


• The Performance Monitors Extension, see Possible PL0 access to the Performance Monitors Extension 
System registers on page G4-4191. 


• The Generic Timer registers, see Possible PL0 access to the Generic Timer System registers on 
page G4-4191. 


Table G4-46 System registers in the (coproc==0b1111) encoding space accessible from PL0 

Name      Access  Description                                                           Note
CP15ISB   WO      Memory barriers on page E2-2335                                       ARM deprecates use of these operations
CP15DSB   WO
CP15DMB   WO
TPIDRURW  RW      TPIDRURW, PL0 Read/Write Software Thread ID Register on page G6-4641  -
TPIDRURO  RO      TPIDRURO, PL0 Read-Only Software Thread ID Register on page G6-4639   RW at PL1

Possible PL0 access to the Performance Monitors Extension System registers 


In a VMSAv8-32 implementation that includes the Performance Monitors Extension, when using the System 
register interface to the Performance Monitors: 


• The PMUSERENR is RO from PL0. 


• When: 

- The value of PMUSERENR.EN is 1, PMCNTENSET, PMCNTENCLR, PMOVSR, PMSWINC, 
PMSELR, PMCEID0, PMCEID1, PMCCNTR, PMXEVTYPER, PMXEVCNTR, PMUSERENR, 
PMOVSSET, PMEVCNTR<n>, PMEVTYPER<n>, and PMCCFILTR are accessible by reads and 
writes from PL0. 

- The value of PMUSERENR.ER is 1, PMXEVCNTR and PMEVCNTR<n> are accessible by reads 
from PL0, and PMSELR is accessible by reads and writes from PL0. 

- The value of PMUSERENR.CR is 1, PMCCNTR is accessible by reads from PL0. 

- The value of PMUSERENR.SW is 1, PMSWINC is accessible by writes from PL0. 

In general, when the value of a PMUSERENR.{EN, ER, CR, SW} bit is 1, the enabled registers have the 
same access permissions from PL0 as they do from PL1. 


For more information, see: 

• Traps to EL1 of EL0 accesses to Performance Monitors registers on page D1-1570. 

• Chapter D5 The Performance Monitors Extension, in particular Access permissions on page D5-1871. 
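
The PMUSERENR rules listed above can be summarized as a small lookup. This is a sketch for illustration only: the function name pl0_pmu_access and its boolean parameters are hypothetical, and the result is an upper bound, because an enabled register never gains more access from PL0 than it has from PL1 (for example, a register that is RO at PL1 stays RO).

```python
# Registers that PMUSERENR.EN opens to PL0, as listed in the text above.
EN_REGS = frozenset({
    "PMCNTENSET", "PMCNTENCLR", "PMOVSR", "PMSWINC", "PMSELR",
    "PMCEID0", "PMCEID1", "PMCCNTR", "PMXEVTYPER", "PMXEVCNTR",
    "PMOVSSET", "PMEVCNTR<n>", "PMEVTYPER<n>", "PMCCFILTR",
})


def pl0_pmu_access(reg, en=False, er=False, cr=False, sw=False):
    """Return the access kinds ('r', 'w') that the PMUSERENR bits allow at PL0."""
    if reg == "PMUSERENR":
        return {"r"}                      # always RO from PL0
    access = set()
    if en and reg in EN_REGS:
        access |= {"r", "w"}              # capped by the PL1 permissions
    if er:
        if reg in ("PMXEVCNTR", "PMEVCNTR<n>"):
            access.add("r")
        if reg == "PMSELR":
            access |= {"r", "w"}
    if cr and reg == "PMCCNTR":
        access.add("r")
    if sw and reg == "PMSWINC":
        access.add("w")
    return access
```

For example, with only CR set the cycle counter is readable but nothing else is opened, and with no bits set only PMUSERENR itself can be read.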


Possible PL0 access to the Generic Timer System registers 

In a VMSAv8-32 implementation, when using the System register interface to the Generic Timer registers: 


• If CNTKCTL.PL0PCTEN is set to 1, then if the physical counter register CNTPCT is accessible from PL1 
it is also accessible from PL0. For more information see Accessing the physical counter on page D6-1880. 


• If CNTKCTL.PL0VCTEN is set to 1, the virtual counter register CNTVCT is accessible from PL0. For more 
information, see Accessing the virtual counter on page D6-1881. 


• If at least one of CNTKCTL.{PL0PCTEN, PL0VCTEN} is set to 1, the CNTFRQ register is RO from PL0. 


• If: 

- CNTKCTL.PL0PTEN is set to 1, the physical timer registers CNTP_CTL, CNTP_CVAL, and 
CNTP_TVAL are accessible from PL0. 

- CNTKCTL.PL0VTEN is set to 1, the virtual timer registers CNTV_CTL, CNTV_CVAL, and 
CNTV_TVAL are accessible from PL0. 


For more information, see Accessing the timer registers on page D6-1883. 
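
As a rough illustration of the CNTKCTL rules above, the following sketch returns the Generic Timer registers that become visible at PL0 for a given combination of enable bits. The function and parameter names are hypothetical, each boolean stands for one CNTKCTL enable bit, and cntpct_at_pl1 models the condition that CNTPCT must already be accessible from PL1.

```python
def pl0_timer_registers(pl0pcten=False, pl0vcten=False,
                        pl0pten=False, pl0vten=False,
                        cntpct_at_pl1=True):
    """Return the set of Generic Timer registers accessible from PL0."""
    regs = set()
    if pl0pcten and cntpct_at_pl1:
        regs.add("CNTPCT")                # physical counter
    if pl0vcten:
        regs.add("CNTVCT")                # virtual counter
    if pl0pcten or pl0vcten:
        regs.add("CNTFRQ")                # RO from PL0
    if pl0pten:
        regs |= {"CNTP_CTL", "CNTP_CVAL", "CNTP_TVAL"}
    if pl0vten:
        regs |= {"CNTV_CTL", "CNTV_CVAL", "CNTV_TVAL"}
    return regs
```

Note that enabling either counter also exposes CNTFRQ, while the two timer enables do not.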
PL1 views of the System registers in the (coproc==0b1111) encoding space 


Software executing at PL1 can access all System registers in the (coproc==0b1111) encoding space, with the 
following exceptions: 


Non-secure PL1 software 
EL3 restricts or prevents access to some registers by Non-secure PL1 software. In particular: 


° The Restricted access System registers are either not accessible to Non-secure PL1 software, 
or are read-only to Non-secure PL1 software, see Restricted access System registers on 
page G4-4156. 


° Configuration settings determine whether Non-secure PL1 software can access the 
Configurable access System registers, see Configurable access System registers on 
page G4-4157. 


The individual register descriptions identify these access restrictions. 


In an implementation that includes EL2, Non-secure PL1 software has no visibility of the PL2-mode 
registers summarized in Hyp mode read/write registers in the (coproc==0b1111) encoding space 
on page G4-4157. The individual register descriptions identify these registers as EL2-mode 
registers. 


Secure PL1 software 


In general, Secure PL1 software has access to all System registers. However: 


• When EL3 is using AArch32, the CP15SDISABLE signal disables write access to a number 
of Secure registers, see The CP15SDISABLE input signal on page G4-4161. 


• The PL2-mode registers are accessible from Secure state only if EL3 is using AArch32. 
When this is the case, Secure PL1 software can access these registers by moving into Monitor 
mode, and setting SCR.NS to 1. 


Hyp mode read/write registers in the (coproc==0b1111) encoding space on page G4-4157 
summarizes these registers. 


The individual register descriptions identify: 
• The registers affected by the CP15SDISABLE signal. 
• The PL2-mode registers. 


Non-secure PL2 view of the System registers in the (coproc==0b1111) encoding space 
Non-secure software executing at PL2 can access: 


• The registers that are accessible to Non-secure software executing at PL1, as defined in PL1 views of the 
System registers in the (coproc==0b1111) encoding space. Access permissions for these registers are 
identical to those for Non-secure software executing at EL1. 


• The PL2-mode registers summarized in Hyp mode read/write registers in the (coproc==0b1111) encoding 
space on page G4-4157, and described in Virtualization registers, functional group on page G4-4197. 
Functional grouping of VMSAv8-32 System registers 


This section describes how the System registers in a VMSAv8-32 implementation divide into functional groups. 


General system control registers on page G6-4231 describes each of these registers. 





Note 
Table G4-45 on page G4-4179 lists all of the VMSAv8-32 System registers in the (coproc==0b1111) 
encoding space, ordered by: 
1. The CRn primary register used when using a 32-bit access to the register. 


For 64-bit register accesses using an MCRR or MRRC instruction, the instruction arguments that identify 
the target register are {coproc, CRm, opc1}. The value of CRm determines where these registers appear in 
Table G4-45 on page G4-4179, so that these registers appear with the 32-bit registers accessed using 
that value for CRn. So, for example, the 64-bit access to TTBR0, that uses (CRm==c2), appears with the 
32-bit access to TTBR0, that uses (CRn==c2). 


2. The opc1 value used when accessing the register. 


3. For 32-bit registers, the {CRm, opc2} values used when accessing the register. 


The functional groups defined in this section mainly consist of the VMSAv8-32 System registers, but include 
some additional System registers. 


Some registers belong to more than one functional group. 
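
To make the identifying parameters in the note above concrete, the following sketch packs them into A32 MRC and MRRC instruction words. The helper names encode_mrc and encode_mrrc are hypothetical, the condition field is assumed to be AL (0b1110), and the choice of general-purpose registers is arbitrary. The bit positions follow the A32 encodings of MRC and MRRC.

```python
def encode_mrc(opc1, crn, crm, opc2, rt, coproc=0b1111, cond=0b1110):
    """A32 MRC word: cond 1110 opc1 1 CRn Rt coproc opc2 1 CRm."""
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (1 << 20)
            | (crn << 16) | (rt << 12) | (coproc << 8)
            | (opc2 << 5) | (1 << 4) | crm)


def encode_mrrc(opc1, crm, rt, rt2, coproc=0b1111, cond=0b1110):
    """A32 MRRC word: cond 1100 0101 Rt2 Rt coproc opc1 CRm."""
    return ((cond << 28) | (0b11000101 << 20) | (rt2 << 16) | (rt << 12)
            | (coproc << 8) | (opc1 << 4) | crm)


# 32-bit read of TTBR0: MRC p15, 0, r0, c2, c0, 0
ttbr0_32 = encode_mrc(opc1=0, crn=2, crm=0, opc2=0, rt=0)
# 64-bit read of TTBR0: MRRC p15, 0, r0, r1, c2
ttbr0_64 = encode_mrrc(opc1=0, crm=2, rt=0, rt2=1)
```

The 32-bit form carries CRn, opc1, CRm and opc2, whereas the 64-bit form has room only for opc1 and CRm, which is why CRm acts as the primary identifier for 64-bit accesses.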





For other related information see: 


• The AArch32 System register interface on page G1-3877 for general information about the access to the 
AArch32 System registers, including the main register access instructions MRC and MCR. 


• About the System registers for VMSAv8-32 on page G4-4148 for general information about the System 
registers for VMSAv8-32, including: 

- Their organization, both by the {coproc, CRn, opc1, CRm, opc2} parameters and by function. 
- Their general behavior. 
- The effect of not implementing some Exception levels on the registers. 
- Different views of the registers, that depend on the state of the PE. 
- Conventions used in describing the registers. 


The remainder of this chapter, and General system control registers on page G6-4231, assumes you are familiar with 
About the System registers for VMSAv8-32 on page G4-4148, and uses conventions and other information from that 
section without any explanation. 


Each of the following sections summarizes a functional group of VMSAv8-32 System registers: 


• Identification registers, functional group on page G4-4194. 
• General system control registers, functional group on page G4-4195. 
• Virtual memory control registers, functional group on page G4-4196. 
• Virtualization registers, functional group on page G4-4197. 
• Security registers, functional group on page G4-4199. 
• Exception and fault handling registers, functional group on page G4-4200. 
• Reset management registers, functional group on page G4-4201. 
• Thread and process ID registers, functional group on page G4-4201. 
• Cache maintenance instructions, functional group on page G4-4201. 
• TLB maintenance instructions, functional group on page G4-4202. 
• Address translation instructions, functional group on page G4-4204. 
• Lockdown, DMA, and TCM features, functional group on page G4-4205. 
• Performance Monitors Extension registers, functional group on page G4-4205. 
• Generic Timer registers, functional group on page G4-4207. 
• Generic Interrupt Controller System registers, functional groups on page G4-4207. 
• Legacy feature registers, functional group on page G4-4211. 
• IMPLEMENTATION DEFINED registers, functional group on page G4-4211. 
• Advanced SIMD and floating-point registers, functional group on page G4-4212. 
• Debug registers, functional group on page G4-4212. 


G4.17.1 Identification registers, functional group 


Table G4-47 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the 
Identification registers functional group. 


Table G4-47 Identification registers, VMSAv8-32

Name       CRn  opc1  CRm  opc2  Width   Type  Description
AIDR       c0   1     c0   7     32-bit  RO    Auxiliary ID Register
CCSIDR     c0   1     c0   0     32-bit  RO    Cache Size ID Registers
CLIDR      c0   1     c0   1     32-bit  RO    Cache Level ID Register
CSSELR     c0   2     c0   0     32-bit  RW    Cache Size Selection Register
CTR        c0   0     c0   1     32-bit  RO    Cache Type Register
ID_AFR0    c0   0     c1   3     32-bit  RO    Auxiliary Feature Register 0ᵃ
ID_DFR0    c0   0     c1   2     32-bit  RO    Debug Feature Register 0ᵃ
ID_ISAR0   c0   0     c2   0     32-bit  RO    Instruction Set Attribute Register 0ᵃ
ID_ISAR1   c0   0     c2   1     32-bit  RO    Instruction Set Attribute Register 1ᵃ
ID_ISAR2   c0   0     c2   2     32-bit  RO    Instruction Set Attribute Register 2ᵃ
ID_ISAR3   c0   0     c2   3     32-bit  RO    Instruction Set Attribute Register 3ᵃ
ID_ISAR4   c0   0     c2   4     32-bit  RO    Instruction Set Attribute Register 4ᵃ
ID_ISAR5   c0   0     c2   5     32-bit  RO    Instruction Set Attribute Register 5ᵃ
ID_MMFR0   c0   0     c1   4     32-bit  RO    Memory Model Feature Register 0ᵃ
ID_MMFR1   c0   0     c1   5     32-bit  RO    Memory Model Feature Register 1ᵃ
ID_MMFR2   c0   0     c1   6     32-bit  RO    Memory Model Feature Register 2ᵃ
ID_MMFR3   c0   0     c1   7     32-bit  RO    Memory Model Feature Register 3ᵃ
ID_MMFR4   c0   0     c2   6     32-bit  RO    Memory Model Feature Register 4ᵃ
ID_PFR0    c0   0     c1   0     32-bit  RO    Processor Feature Register 0ᵃ
ID_PFR1    c0   0     c1   1     32-bit  RO    Processor Feature Register 1ᵃ
MIDR       c0   0     c0   0     32-bit  RO    Main ID Register
MPIDR      c0   0     c0   5     32-bit  RO    Multiprocessor Affinity Register
REVIDR     c0   0     c0   6     32-bit  RO    Revision ID Register
TCMTR      c0   0     c0   2     32-bit  RO    TCM Type Register
TLBTR      c0   0     c0   3     32-bit  RO    TLB Type Register
VMPIDR     c0   4     c0   5     32-bit  RW    Virtualization Multiprocessor ID Register
VPIDR      c0   4     c0   0     32-bit  RW    Virtualization Processor ID Register

a. CPUID register, see The CPUID identification scheme.


The other registers in this group are the FPSID, MVFR0, MVFR1, and MVFR2.


The JIDR holds legacy identification information. 


The CPUID identification scheme 


The ID_* registers were originally called the CPUID identification scheme registers. A footnote to Table G4-47 on 
page G4-4194 identifies these registers. However, functionally, there is no value in separating these registers from 
the slightly larger Identification registers functional group. 
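
As an illustration of how the {coproc, CRn, opc1, CRm, opc2} parameters in Table G4-47 select a register, the following Python sketch packs them into the 32-bit A32 MRC (read) or MCR (write) instruction word. The function name and its defaults (Rt = r0, condition code AL) are assumptions made for this example only. The bit positions follow the A32 MRC and MCR instruction encodings.

```python
def encode_a32_mrc_mcr(crn, opc1, crm, opc2, rt=0, coproc=0b1111,
                       read=True, cond=0b1110):
    """Return the A32 instruction word for MRC (read=True) or MCR (read=False).

    Hypothetical helper for illustration; not part of any toolchain.
    """
    fields = (("cond", cond, 15), ("opc1", opc1, 7), ("crn", crn, 15),
              ("rt", rt, 15), ("coproc", coproc, 15), ("opc2", opc2, 7),
              ("crm", crm, 15))
    for name, value, limit in fields:
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value} is out of range 0..{limit}")
    return ((cond << 28)          # condition code, 0b1110 is AL
            | (0b1110 << 24)      # fixed bits [27:24]
            | (opc1 << 21)
            | (int(read) << 20)   # 1 selects MRC, 0 selects MCR
            | (crn << 16)
            | (rt << 12)
            | (coproc << 8)
            | (opc2 << 5)
            | (1 << 4)            # fixed bit [4]
            | crm)


# MIDR is {CRn=c0, opc1=0, CRm=c0, opc2=0}: MRC p15, 0, r0, c0, c0, 0
midr_read = encode_a32_mrc_mcr(crn=0, opc1=0, crm=0, opc2=0)
# MPIDR differs from MIDR only in opc2
mpidr_read = encode_a32_mrc_mcr(crn=0, opc1=0, crm=0, opc2=5)
print(hex(midr_read), hex(mpidr_read))
```

Passing read=False gives the corresponding MCR word, which is only meaningful for the RW entries in the table.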


G4.17.2 General system control registers, functional group 
Table G4-48 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the 
General system control registers functional group. 

Table G4-48 General system control registers, VMSAv8-32

Name      CRn  opc1  CRm  opc2  Width   Type  Description
ACTLR     c1   0     c0   1     32-bit  RW    Auxiliary Control Register
ACTLR2    c1   0     c0   3     32-bit  RW    Auxiliary Control Register 2
CPACR     c1   0     c0   2     32-bit  RW    Architectural Feature Access Control Register
HACTLR    c1   4     c0   1     32-bit  RW    Hyp Auxiliary Control Register
HACTLR2   c1   4     c0   3     32-bit  RW    Hyp Auxiliary Control Register 2
HSCTLR    c1   4     c0   0     32-bit  RW    Hyp System Control Register
SCTLR     c1   0     c0   0     32-bit  RW    System Control Register

The following sections summarize the System registers added by the corresponding Exception levels: 
• Security registers, functional group on page G4-4199.
• Virtualization registers, functional group on page G4-4197.
G4.17.3 Virtual memory control registers, functional group


Table G4-49 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the Virtual
memory control registers functional group.

Table G4-49 Virtual memory control registers

Name        CRn  opc1  CRm  opc2  Width   Type  Description
AMAIR0      c10  0     c3   0     32-bit  RW    Auxiliary Memory Attribute Indirection Register 0
AMAIR1      c10  0     c3   1     32-bit  RW    Auxiliary Memory Attribute Indirection Register 1
CONTEXTIDR  c13  0     c0   1     32-bit  RW    Context ID Register
DACR        c3   0     c0   0     32-bit  RW    Domain Access Control Register
HAMAIR0     c10  4     c3   0     32-bit  RW    Hyp Auxiliary Memory Attribute Indirection Register 0
HAMAIR1     c10  4     c3   1     32-bit  RW    Hyp Auxiliary Memory Attribute Indirection Register 1
HMAIR0      c10  4     c2   0     32-bit  RW    Hyp Memory Attribute Indirection Register 0
HMAIR1      c10  4     c2   1     32-bit  RW    Hyp Memory Attribute Indirection Register 1
HTCR        c2   4     c0   2     32-bit  RW    Hyp Translation Control Register
HTTBR       -    4     c2   -     64-bit  RW    Hyp Translation Table Base Register
MAIR0       c10  0     c2   0     32-bit  RW    Memory Attribute Indirection Register 0
MAIR1       c10  0     c2   1     32-bit  RW    Memory Attribute Indirection Register 1
NMRR        c10  0     c2   1     32-bit  RW    Normal Memory Remap Register
PRRR        c10  0     c2   0     32-bit  RW    Primary Region Remap Register
SCTLR       c1   0     c0   0     32-bit  RW    System Control Register
TTBCR       c2   0     c0   2     32-bit  RW    Translation Table Base Control Register
TTBR0       c2   0     c0   0     32-bit  RW    Translation Table Base Register 0
TTBR0       -    0     c2   -     64-bit  RW    Translation Table Base Register 0
TTBR1       c2   0     c0   1     32-bit  RW    Translation Table Base Register 1
TTBR1       -    1     c2   -     64-bit  RW    Translation Table Base Register 1
VTCR        c2   4     c1   2     32-bit  RW    Virtualization Translation Control Register
VTTBR       -    6     c2   -     64-bit  RW    Virtualization Translation Table Base Register

The ACTLR and, if implemented, ACTLR2, might provide additional virtual memory control. 
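
The 64-bit entries in Table G4-49 (HTTBR, VTTBR, and the 64-bit views of TTBR0 and TTBR1) list only opc1 and CRm, because 64-bit System registers are accessed with MRRC and MCRR, which have no CRn or opc2 field. The Python sketch below packs {coproc, opc1, CRm} into the A32 MRRC or MCRR instruction word. The function name and the default transfer registers (Rt = r0, Rt2 = r1) are assumptions for this example only.

```python
def encode_a32_mrrc_mcrr(opc1, crm, rt=0, rt2=1, coproc=0b1111,
                         read=True, cond=0b1110):
    """Return the A32 instruction word for MRRC (read=True) or MCRR.

    Hypothetical helper for illustration; not part of any toolchain.
    """
    fields = (("cond", cond, 15), ("rt2", rt2, 15), ("rt", rt, 15),
              ("coproc", coproc, 15), ("opc1", opc1, 15), ("crm", crm, 15))
    for name, value, limit in fields:
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value} is out of range 0..{limit}")
    return ((cond << 28)
            | (0b1100010 << 21)   # fixed bits [27:21]
            | (int(read) << 20)   # 1 selects MRRC, 0 selects MCRR
            | (rt2 << 16)         # register that transfers bits [63:32]
            | (rt << 12)          # register that transfers bits [31:0]
            | (coproc << 8)
            | (opc1 << 4)         # opc1 is a 4-bit field in MRRC and MCRR
            | crm)


# 64-bit TTBR0 is {opc1=0, CRm=c2}: MRRC p15, 0, r0, r1, c2
ttbr0_read = encode_a32_mrrc_mcrr(opc1=0, crm=2)
# VTTBR is {opc1=6, CRm=c2}
vttbr_read = encode_a32_mrrc_mcrr(opc1=6, crm=2)
print(hex(ttbr0_read), hex(vttbr_read))
```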
G4.17.4 Virtualization registers, functional group


Table G4-50 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the 
Virtualization registers functional group, excluding the System instructions in this group. 


Table G4-50 Virtualization registers, excluding System instructions

Name                   CRn  opc1  CRm  opc2  Width   Type  Description
CNTHCTL                c14  4     c1   0     32-bit  RW    Counter-timer Hyp Control register
CNTHP_CTL              c14  4     c2   1     32-bit  RW    Counter-timer Hyp Physical Timer Control register
CNTHP_CVAL             -    6     c14  -     64-bit  RW    Counter-timer Hyp Physical Compare Value register
CNTHP_TVAL             c14  4     c2   0     32-bit  RW    Counter-timer Hyp Physical Timer TimerValue register
CNTVOFF                -    4     c14  -     64-bit  RW    Counter-timer Virtual Offset register
HACR                   c1   4     c1   7     32-bit  RW    Hyp Auxiliary Configuration Register
HACTLR                 c1   4     c0   1     32-bit  RW    Hyp Auxiliary Control Register
HACTLR2                c1   4     c0   3     32-bit  RW    Hyp Auxiliary Control Register 2
HADFSR                 c5   4     c1   0     32-bit  RW    Hyp Auxiliary Data Fault Status Register
HAIFSR                 c5   4     c1   1     32-bit  RW    Hyp Auxiliary Instruction Fault Status Register
HAMAIR0                c10  4     c3   0     32-bit  RW    Hyp Auxiliary Memory Attribute Indirection Register 0
HAMAIR1                c10  4     c3   1     32-bit  RW    Hyp Auxiliary Memory Attribute Indirection Register 1
HCPTR                  c1   4     c1   2     32-bit  RW    Hyp Architectural Feature Trap Register
HCR                    c1   4     c1   0     32-bit  RW    Hyp Configuration Register
HCR2                   c1   4     c1   4     32-bit  RW    Hyp Configuration Register 2
HDCR                   c1   4     c1   1     32-bit  RW    Hyp Debug Configuration Register
HDFAR                  c6   4     c0   0     32-bit  RW    Hyp Data Fault Address Register
HIFAR                  c6   4     c0   2     32-bit  RW    Hyp Instruction Fault Address Register
HMAIR0                 c10  4     c2   0     32-bit  RW    Hyp Memory Attribute Indirection Register 0
HMAIR1                 c10  4     c2   1     32-bit  RW    Hyp Memory Attribute Indirection Register 1
HPFAR                  c6   4     c0   4     32-bit  RW    Hyp IPA Fault Address Register
HRMR                   c12  4     c0   2     32-bit  RW    Hyp Reset Management Register
HSCTLR                 c1   4     c0   0     32-bit  RW    Hyp System Control Register
HSR                    c5   4     c2   0     32-bit  RW    Hyp Syndrome Register
HSTR                   c1   4     c1   3     32-bit  RW    Hyp System Trap Register
HTCR                   c2   4     c0   2     32-bit  RW    Hyp Translation Control Register
HTPIDR                 c13  4     c0   2     32-bit  RW    Hyp Thread and Process ID Register
HTTBR                  -    4     c2   -     64-bit  RW    Hyp Translation Table Base Register
HVBAR                  c12  4     c0   0     32-bit  RW    Hyp Vector Base Address Register
ICC_HSREᵃ              c12  4     c9   5     32-bit  RW    Interrupt Controller Hyp System Register Enable register
ICH_AP0R0ᵃ             c12  4     c8   0     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (0,0)
ICH_AP0R1ᵃ             c12  4     c8   1     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (0,1)
ICH_AP0R2ᵃ             c12  4     c8   2     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (0,2)
ICH_AP0R3ᵃ             c12  4     c8   3     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (0,3)
ICH_AP1R0ᵃ             c12  4     c9   0     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (1,0)
ICH_AP1R1ᵃ             c12  4     c9   1     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (1,1)
ICH_AP1R2ᵃ             c12  4     c9   2     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (1,2)
ICH_AP1R3ᵃ             c12  4     c9   3     32-bit  RW    Interrupt Controller Hyp Active Priorities Register (1,3)
ICH_EISRᵃ              c12  4     c11  3     32-bit  RO    Interrupt Controller End of Interrupt Status Register
ICH_ELRSRᵃ             c12  4     c11  5     32-bit  RO    Interrupt Controller Empty List Register Status Register
ICH_HCRᵃ               c12  4     c11  0     32-bit  RW    Interrupt Controller Hyp Control Register
ICH_LR<n>, n==0-7ᵃ     c12  4     c12  0-7   32-bit  RW    Interrupt Controller List Registers, 0-7
ICH_LR<n>, n==8-15ᵃ    c12  4     c13  0-7   32-bit  RW    Interrupt Controller List Registers, 8-15
ICH_LRC<n>, n==0-7ᵃ    c12  4     c14  0-7   32-bit  RW    Interrupt Controller List Registers Continuation, 0-7
ICH_LRC<n>, n==8-15ᵃ   c12  4     c15  0-7   32-bit  RW    Interrupt Controller List Registers Continuation, 8-15
ICH_MISRᵃ              c12  4     c11  2     32-bit  RO    Interrupt Controller Maintenance Interrupt State Register
ICH_VMCRᵃ              c12  4     c11  7     32-bit  RW    Interrupt Controller Virtual Machine Control Register
ICH_VTRᵃ               c12  4     c11  1     32-bit  RO    Interrupt Controller VGIC Type Register
VMPIDR                 c0   4     c0   5     32-bit  RW    Virtualization Multiprocessor ID Register
VPIDR                  c0   4     c0   0     32-bit  RW    Virtualization Processor ID Register
VTCR                   c2   4     c1   2     32-bit  RW    Virtualization Translation Control Register
VTTBR                  -    6     c2   -     64-bit  RW    Virtualization Translation Table Base Register

a. For more information about these registers see Generic Interrupt Controller System registers, functional groups on page G4-4207. As that
section describes, each ICV_* register uses the same encoding as the corresponding ICC_* register.
Table G4-51 shows the Hyp mode System instructions, in the (coproc==0b1111) encoding space, that are part of this
functional group. See also Table G4-50 on page G4-4197.

Table G4-51 Hyp mode System instructions

Name             CRn  opc1  CRm  opc2  Width   Type  Description
ATS1HR           c7   4     c8   0     32-bit  WO    Address Translate Stage 1 Hyp mode Read
ATS1HW           c7   4     c8   1     32-bit  WO    Address Translate Stage 1 Hyp mode Write
TLBIALLHᵃ        c8   4     c7   0     32-bit  WO    Invalidate entire Hyp unified TLB
TLBIALLHISᵃ      c8   4     c3   0     32-bit  WO    Invalidate entire Hyp unified TLB
TLBIALLNSNHᵃ     c8   4     c7   4     32-bit  WO    Invalidate entire Non-secure Non-Hyp unified TLB
TLBIALLNSNHISᵃ   c8   4     c3   4     32-bit  WO    Invalidate entire Non-secure Non-Hyp unified TLB
TLBIIPAS2ᵃ       c8   4     c4   1     32-bit  WO    TLB Invalidate entry by IPA, Stage 2
TLBIIPAS2ISᵃ     c8   4     c0   1     32-bit  WO    TLB Invalidate entry by IPA, Stage 2, Inner Shareable
TLBIIPAS2Lᵃ      c8   4     c4   5     32-bit  WO    TLB Invalidate entry by IPA, Stage 2, Last level
TLBIIPAS2LISᵃ    c8   4     c0   5     32-bit  WO    TLB Invalidate entry by IPA, Stage 2, Last level, Inner Shareable
TLBIMVAHᵃ        c8   4     c7   1     32-bit  WO    Invalidate Hyp unified TLB by VA
TLBIMVAHISᵃ      c8   4     c3   1     32-bit  WO    Invalidate Hyp unified TLB by VA
TLBIMVALHᵃ       c8   4     c7   5     32-bit  WO    TLB Invalidate entry by MVA, Last level, Hyp mode
TLBIMVALHISᵃ     c8   4     c3   5     32-bit  WO    TLB Invalidate entry by MVA, Last level, Hyp mode, Inner Shareable

a. These links are to a summary of the operation, and The scope of TLB maintenance instructions on page G4-4101 describes the operation.


All the encodings shown in Table G4-50 on page G4-4197 and Table G4-51 are unallocated and CONSTRAINED
UNPREDICTABLE on an implementation that does not include EL2, see Accesses to unallocated encodings in the
(coproc==0b111x) encoding space on page G4-4151.


G4.17.5 Security registers, functional group 
Table G4-52 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the 
Security registers functional group. 

Table G4-52 Security registers

Name           CRn  opc1  CRm  opc2  Width   Type  Description
ICC_MCTLRᵃ     c12  6     c12  4     32-bit  RW    Interrupt Controller Monitor Control Register
ICC_MGRPEN1ᵃ   c12  6     c12  7     32-bit  RW    Interrupt Controller Monitor Interrupt Group 1 Enable register
ICC_MSREᵃ      c12  6     c12  5     32-bit  RW    Interrupt Controller Monitor System Register Enable register
MVBAR          c12  0     c0   1     32-bit  RW    Monitor Vector Base Address Register
NSACR          c1   0     c1   2     32-bit  RW    Non-Secure Access Control Register
RMR (at EL3)   c12  0     c0   2     32-bit  RW    Reset Management Register
SCR            c1   0     c1   0     32-bit  RW    Secure Configuration Register
SDER           c1   0     c1   1     32-bit  RW    Secure Debug Enable Register

a. For information about these registers see the ARM® Generic Interrupt Controller Architecture Specification, GIC architecture version 3.0
and version 4.0 (ARM IHI 0069).


All the encodings shown in Table G4-52 on page G4-4199 are unallocated and CONSTRAINED UNPREDICTABLE on
an implementation that does not include EL3, see Accesses to unallocated encodings in the (coproc==0b111x)
encoding space on page G4-4151.


G4.17.6 Exception and fault handling registers, functional group 
Table G4-53 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the 
Exception and fault handling registers functional group. 

Table G4-53 Exception and fault handling registers

Name     CRn  opc1  CRm  opc2  Width   Type  Description
ADFSR    c5   0     c1   0     32-bit  RW    Auxiliary Data Fault Status Register
AIFSR    c5   0     c1   1     32-bit  RW    Auxiliary Instruction Fault Status Register
DFAR     c6   0     c0   0     32-bit  RW    Data Fault Address Register
DFSR     c5   0     c0   0     32-bit  RW    Data Fault Status Register
HADFSR   c5   4     c1   0     32-bit  RW    Hyp Auxiliary Data Fault Status Register
HAIFSR   c5   4     c1   1     32-bit  RW    Hyp Auxiliary Instruction Fault Status Register
HDFAR    c6   4     c0   0     32-bit  RW    Hyp Data Fault Address Register
HIFAR    c6   4     c0   2     32-bit  RW    Hyp Instruction Fault Address Register
HPFAR    c6   4     c0   4     32-bit  RW    Hyp IPA Fault Address Register
HSR      c5   4     c2   0     32-bit  RW    Hyp Syndrome Register
HVBAR    c12  4     c0   0     32-bit  RW    Hyp Vector Base Address Register
IFAR     c6   0     c0   2     32-bit  RW    Instruction Fault Address Register
IFSR     c5   0     c0   1     32-bit  RW    Instruction Fault Status Register
ISR      c12  0     c1   0     32-bit  RO    Interrupt Status Register
MVBAR    c12  0     c0   1     32-bit  RW    Monitor Vector Base Address Register
RVBAR    c12  0     c0   1     32-bit  RO    Reset Vector Base Address Register
VBAR     c12  0     c0   0     32-bit  RW    Vector Base Address Register

The PE returns fault information using the fault status registers and the fault address registers. For details of how 
these registers are used see Exception reporting in a VMSAv8-32 implementation on page G4-4123. 
These registers also report information about debug exceptions. For more information see:

• Data Abort exceptions, taken to a PL1 mode on page G4-4125.
• Prefetch Abort exceptions, taken to a PL1 mode on page G4-4127.
• Reporting exceptions taken to Hyp mode on page G4-4133.





G4.17.7 Reset management registers, functional group 


Table G4-54 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the Reset 
management registers functional group. 


Table G4-54 Reset management registers

Name           CRn  opc1  CRm  opc2  Width   Type  Description
HRMR           c12  4     c0   2     32-bit  RW    Hyp Reset Management Register
RMR (at EL1)   c12  0     c0   2     32-bit  RW    Reset Management Register, EL1
RMR (at EL3)   c12  0     c0   2     32-bit  RW    Reset Management Register, EL3





G4.17.8 Thread and process ID registers, functional group 


Table G4-55 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the Thread 
and process ID registers functional group. 


Table G4-55 VMSAv8-32 Thread and process ID registers

Name       CRn  opc1  CRm  opc2  Width   Typeᵃ    Description
HTPIDRᵇ    c13  4     c0   2     32-bit  RW       Hyp Software Thread ID Register
TPIDRPRW   c13  0     c0   4     32-bit  RW       PL1 Software Thread ID Register
TPIDRURO   c13  0     c0   3     32-bit  RW, PL0  PL0 Read-Only Software Thread ID Register
TPIDRURW   c13  0     c0   2     32-bit  RW, PL0  PL0 Read/Write Software Thread ID Register

a. PL0 in a Type description indicates that the encoding is accessible by software executing at PL0. See the register description for more
information.

b. Implemented only as part of EL2. Otherwise, encoding is unallocated and CONSTRAINED UNPREDICTABLE, see Accesses to unallocated
encodings in the (coproc==0b111x) encoding space on page G4-4151.


G4.17.9 Cache maintenance instructions, functional group 


Table G4-56 shows the VMSAv8-32 System instructions in the (coproc==0b1111) encoding space that are in the 
Cache and branch predictor maintenance instructions functional group. 


Table G4-56 Cache and branch predictor maintenance instructions

Name        CRn  opc1  CRm  opc2  Width   Type  Description                                 Limitsᵃ
BPIALLᵇ     c7   0     c5   6     32-bit  WO    Branch predictor invalidate all             -
BPIALLISᵇ   c7   0     c1   6     32-bit  WO    Branch predictor invalidate all             IS
BPIMVAᵇ     c7   0     c5   7     32-bit  WO    Branch predictor invalidate by VA           -
DCCIMVACᵇ   c7   0     c14  1     32-bit  WO    Data cache clean and invalidate by VA       PoC
DCCISWᵇ     c7   0     c14  2     32-bit  WO    Data cache clean and invalidate by set/way  -
DCCMVACᵇ    c7   0     c10  1     32-bit  WO    Data cache clean by VA                      PoC
DCCMVAUᵇ    c7   0     c11  1     32-bit  WO    Data cache clean by VA                      PoU
DCCSWᵇ      c7   0     c10  2     32-bit  WO    Data cache clean by set/way                 -
DCIMVACᵇ    c7   0     c6   1     32-bit  WO    Data cache invalidate by VA                 PoC
DCISWᵇ      c7   0     c6   2     32-bit  WO    Data cache invalidate by set/way            -
ICIALLUᵇ    c7   0     c5   0     32-bit  WO    Instruction cache invalidate all            PoU
ICIALLUISᵇ  c7   0     c1   0     32-bit  WO    Instruction cache invalidate all            PoU, IS
ICIMVAUᵇ    c7   0     c5   1     32-bit  WO    Instruction cache invalidate by VA          PoU

a. PoU = to Point of Unification, PoC = to Point of Coherency, IS = Inner Shareable.

b. The links in this column are to a summary of the operation, see AArch32 cache and branch predictor maintenance instructions on
page G3-3999.


G4.17.10 TLB maintenance instructions, functional group 

Table G4-57 shows the VMSAv8-32 System instructions in the (coproc==0b1111) encoding space that are in the
TLB maintenance instructions functional group. The scope of TLB maintenance instructions on page G4-4101
describes the operations.

Table G4-57 TLB maintenance instructions

Nameᵃ          CRn  opc1  CRm  opc2  Width   Type  Description                                                         Limitsᵇ
DTLBIALLᶜ      c8   0     c6   0     32-bit  WO    Invalidate entire data TLB                                          -
DTLBIASIDᶜ     c8   0     c6   2     32-bit  WO    Invalidate data TLB by ASID                                         -
DTLBIMVAᶜ      c8   0     c6   1     32-bit  WO    Invalidate data TLB entry by VA                                     -
ITLBIALLᶜ      c8   0     c5   0     32-bit  WO    Invalidate entire instruction TLB                                   -
ITLBIASIDᶜ     c8   0     c5   2     32-bit  WO    Invalidate instruction TLB by ASID                                  -
ITLBIMVAᶜ      c8   0     c5   1     32-bit  WO    Invalidate instruction TLB by VA                                    -
TLBIALLᵈ       c8   0     c7   0     32-bit  WO    Invalidate entire unified TLB                                       -
TLBIALLH       c8   4     c7   0     32-bit  WO    Invalidate entire Hyp unified TLB                                   -
TLBIALLHIS     c8   4     c3   0     32-bit  WO    Invalidate entire Hyp unified TLB                                   IS
TLBIALLIS      c8   0     c3   0     32-bit  WO    Invalidate entire unified TLB                                       IS
TLBIALLNSNH    c8   4     c7   4     32-bit  WO    Invalidate entire Non-secure Non-Hyp unified TLB                    -
TLBIALLNSNHIS  c8   4     c3   4     32-bit  WO    Invalidate entire Non-secure Non-Hyp unified TLB                    IS
TLBIASID       c8   0     c7   2     32-bit  WO    Invalidate unified TLB by ASID                                      -
TLBIASIDIS     c8   0     c3   2     32-bit  WO    Invalidate unified TLB by ASID                                      IS
TLBIIPAS2      c8   4     c4   1     32-bit  WO    TLB Invalidate entry by IPA, Stage 2                                -
TLBIIPAS2IS    c8   4     c0   1     32-bit  WO    TLB Invalidate entry by IPA, Stage 2, Inner Shareable               IS
TLBIIPAS2L     c8   4     c4   5     32-bit  WO    TLB Invalidate entry by IPA, Stage 2, Last level                    -
TLBIIPAS2LIS   c8   4     c0   5     32-bit  WO    TLB Invalidate entry by IPA, Stage 2, Last level, Inner Shareable   IS
TLBIMVAA       c8   0     c7   3     32-bit  WO    Invalidate unified TLB by VA, all ASID                              -
TLBIMVAAIS     c8   0     c3   3     32-bit  WO    Invalidate unified TLB by VA, all ASID                              IS
TLBIMVAAL      c8   0     c7   7     32-bit  WO    TLB Invalidate entry by MVA, All ASID, Last level                   -
TLBIMVAALIS    c8   0     c3   7     32-bit  WO    TLB Invalidate entry by MVA, All ASID, Last level, Inner Shareable  IS
TLBIMVA        c8   0     c7   1     32-bit  WO    Invalidate unified TLB by VA                                        -
TLBIMVAH       c8   4     c7   1     32-bit  WO    Invalidate Hyp unified TLB by VA                                    -
TLBIMVAHIS     c8   4     c3   1     32-bit  WO    Invalidate Hyp unified TLB by VA                                    IS
TLBIMVAIS      c8   0     c3   1     32-bit  WO    Invalidate unified TLB by VA                                        IS
TLBIMVAL       c8   0     c7   5     32-bit  WO    TLB Invalidate entry by MVA, Last level                             -
TLBIMVALH      c8   4     c7   5     32-bit  WO    TLB Invalidate entry by MVA, Last level, Hyp mode                   -
TLBIMVALHIS    c8   4     c3   5     32-bit  WO    TLB Invalidate entry by MVA, Last level, Hyp mode, Inner Shareable  IS
TLBIMVALIS     c8   0     c3   5     32-bit  WO    TLB Invalidate entry by MVA, Last level                             IS

a. These links are to a summary of the operation, and The scope of TLB maintenance instructions on page G4-4101 describes the operation.

b. IS = Inner Shareable.

c. Deprecated. ARM deprecates use of operations that operate only on an Instruction TLB, or only on a Data TLB.

d. The mnemonics for the operations with CRm==c7, opc2=={0, 1, 2} were previously UTLBIALL, UTLBIMVA and UTLBIASID.
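
In Table G4-57, every Inner Shareable operation has the same opc1 and opc2 as its local counterpart and differs only in CRm: c3 instead of c7 for the unified TLB operations, and c0 instead of c4 for the stage 2 IPA operations. The Python sketch below captures that observation. It is an illustration read off the table rows, not a rule stated by the architecture, and the function and dictionary names are hypothetical.

```python
# CRm of a local operation -> CRm of its Inner Shareable counterpart,
# as read from the rows of Table G4-57 (CRn is c8 throughout).
LOCAL_TO_IS_CRM = {7: 3, 4: 0}


def inner_shareable_encoding(opc1, crm, opc2):
    """Map a local TLB operation (opc1, CRm, opc2) to its IS variant.

    Raises ValueError for CRm values with no IS counterpart in the
    table, for example the deprecated c5 and c6 operations.
    """
    if crm not in LOCAL_TO_IS_CRM:
        raise ValueError(f"no Inner Shareable variant listed for CRm=c{crm}")
    return (opc1, LOCAL_TO_IS_CRM[crm], opc2)


# TLBIALL {0, c7, 0} -> TLBIALLIS {0, c3, 0}
print(inner_shareable_encoding(0, 7, 0))
# TLBIIPAS2L {4, c4, 5} -> TLBIIPAS2LIS {4, c0, 5}
print(inner_shareable_encoding(4, 4, 5))
```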
G4.17.11 Address translation instructions, functional group


Table G4-58 shows the VMSAv8-32 System instructions in the (coproc==0b1111) encoding space that are in the 
Address translation instructions functional group. 


Table G4-58 Address translation instructions

Name            CRn  opc1  CRm  opc2  Width   Type  Description
ATS12NSOPRᵃ˒ᶜ   c7   0     c8   4     32-bit  WO    Stages 1 and 2 Non-secure only EL1 read
ATS12NSOPWᵃ˒ᶜ   c7   0     c8   5     32-bit  WO    Stages 1 and 2 Non-secure only EL1 write
ATS12NSOURᵃ˒ᶜ   c7   0     c8   6     32-bit  WO    Stages 1 and 2 Non-secure only unprivileged read
ATS12NSOUWᵃ˒ᶜ   c7   0     c8   7     32-bit  WO    Stages 1 and 2 Non-secure only unprivileged write
ATS1CPRᶜ        c7   0     c8   0     32-bit  WO    Stage 1 Current state EL1 read
ATS1CPWᶜ        c7   0     c8   1     32-bit  WO    Stage 1 Current state EL1 write
ATS1CURᶜ        c7   0     c8   2     32-bit  WO    Stage 1 Current state unprivileged read
ATS1CUWᶜ        c7   0     c8   3     32-bit  WO    Stage 1 Current state unprivileged write
ATS1HRᵇ˒ᶜ       c7   4     c8   0     32-bit  WO    Stage 1 Hyp mode read
ATS1HWᵇ˒ᶜ       c7   4     c8   1     32-bit  WO    Stage 1 Hyp mode write
PAR             c7   0     c4   0     32-bit  RW    Physical Address Register
                -    0     c7   -     64-bit  RW

a. Implemented only as part of EL3. Otherwise, encoding is unallocated and CONSTRAINED UNPREDICTABLE, see Accesses to unallocated
encodings in the (coproc==0b111x) encoding space on page G4-4151.

b. Implemented only as part of EL2. Otherwise, encoding is unallocated and CONSTRAINED UNPREDICTABLE, see Accesses to unallocated
encodings in the (coproc==0b111x) encoding space on page G4-4151.

c. These links are to a summary of the operation.


Address translation instructions on page G4-4142 describes these operations. 
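
For the eight operations in Table G4-58 that have opc1==0 and CRm==c8, the opc2 values follow a regular pattern: bit 0 distinguishes read from write, bit 1 distinguishes EL1 from unprivileged, and bit 2 distinguishes the stage 1 current-state operations from the stages 1 and 2 Non-secure only operations. The Python sketch below rebuilds the mnemonic from opc2 on that basis. The pattern is inferred from the table rows for illustration, and the function name is hypothetical.

```python
def ats_mnemonic(opc2):
    """Return the mnemonic for {CRn=c7, opc1=0, CRm=c8, opc2} in Table G4-58."""
    if not 0 <= opc2 <= 7:
        raise ValueError(f"opc2={opc2} is out of range 0..7")
    prefix = "ATS12NSO" if opc2 & 0b100 else "ATS1C"  # bit 2: stages 1 and 2
    privilege = "U" if opc2 & 0b010 else "P"          # bit 1: unprivileged
    access = "W" if opc2 & 0b001 else "R"             # bit 0: write
    return prefix + privilege + access


for value in range(8):
    print(value, ats_mnemonic(value))
```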
G4.17.12 Lockdown, DMA, and TCM features, functional group
Table G4-59 shows the VMSAv8-32 reserved encodings in the (coproc==0b1111) encoding space that provide the 
Lockdown, DMA, and TCM features registers functional group. 

Table G4-59 Lockdown, DMA, and TCM features, VMSAv8-32

Name                    CRn  opc1  CRm    Width   opc2  Type  Description
IMPLEMENTATION DEFINED  c9   0-7   c0-c2  32-bit  0-7   ᵃ     Reserved for IMPLEMENTATION DEFINED branch predictor, cache, and TCM operations.
                                   c5-c8  32-bit  0-7   ᵃ
                        c10  0-7   c0-c1  32-bit  0-7   ᵃ     Reserved for IMPLEMENTATION DEFINED TLB lockdown operations
                                   c4     32-bit  0-7   ᵃ
                                   c8     32-bit  0-7   ᵃ
                        c11  0-7   c0-c8  32-bit  0-7   ᵃ     Reserved for IMPLEMENTATION DEFINED DMA operations to and from TCMs
                                   c15    32-bit  0-7   ᵃ

a. Access depends on the register or operation, and is IMPLEMENTATION DEFINED.


G4.17.13 Performance Monitors Extension registers, functional group 
Table G4-60 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are the 
Performance Monitors Extension registers functional group. See also IMPLEMENTATION DEFINED performance 
monitors on page G4-4206. 

Table G4-60 Performance Monitors Extension registers

Name                            CRn  opc1  CRm  opc2  Width   Description
PMCCFILTR                       c14  0     c15  7     32-bit  Performance Monitors Cycle Count Filter Register
PMCCNTR                         c9   0     c13  0     32-bit  Performance Monitors Cycle Count Register
PMCEID0                         c9   0     c12  6     32-bit  Performance Monitors Common Event Identification register 0
PMCEID1                         c9   0     c12  7     32-bit  Performance Monitors Common Event Identification register 1
PMCNTENCLR                      c9   0     c12  2     32-bit  Performance Monitors Count Enable Clear register
PMCNTENSET                      c9   0     c12  1     32-bit  Performance Monitors Count Enable Set register
PMCR                            c9   0     c12  0     32-bit  Performance Monitors Control Register
PMEVCNTR<n>, for n==0 to 7      c14  0     c8   0-7   32-bit  Performance Monitors Event Count Registers, 0-7
PMEVCNTR<n>, for n==16 to 23    c14  0     c10  0-7   32-bit  Performance Monitors Event Count Registers, 16-23
PMEVCNTR<n>, for n==24 to 30    c14  0     c11  0-6   32-bit  Performance Monitors Event Count Registers, 24-30
PMEVCNTR<n>, for n==8 to 15     c14  0     c9   0-7   32-bit  Performance Monitors Event Count Registers, 8-15
PMEVTYPER<n>, for n==0 to 7     c14  0     c12  0-7   32-bit  Performance Monitors Event Type Registers, 0-7
PMEVTYPER<n>, for n==16 to 23   c14  0     c14  0-7   32-bit  Performance Monitors Event Type Registers, 16-23
PMEVTYPER<n>, for n==24 to 30   c14  0     c15  0-6   32-bit  Performance Monitors Event Type Registers, 24-30
PMEVTYPER<n>, for n==8 to 15    c14  0     c13  0-7   32-bit  Performance Monitors Event Type Registers, 8-15
PMINTENCLR                      c9   0     c14  2     32-bit  Performance Monitors Interrupt Enable Clear register
PMINTENSET                      c9   0     c14  1     32-bit  Performance Monitors Interrupt Enable Set register
PMOVSR                          c9   0     c12  3     32-bit  Performance Monitors Overflow Flag Status Register
PMOVSSET                        c9   0     c14  3     32-bit  Performance Monitors Overflow Flag Status Set register
PMSELR                          c9   0     c12  5     32-bit  Performance Monitors Event Counter Selection Register
PMSWINC                         c9   0     c12  4     32-bit  Performance Monitors Software Increment register
PMUSERENR                       c9   0     c14  0     32-bit  Performance Monitors User Enable Register
PMXEVCNTR                       c9   0     c13  2     32-bit  Performance Monitors Event Count Register
PMXEVTYPER                      c9   0     c13  1     32-bit  Performance Monitors Event Type Select Register
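
The PMEVCNTR<n> and PMEVTYPER<n> rows of Table G4-60 follow a regular pattern: the upper bits of n select CRm, starting from c8 for the event counters and from c12 for the event type registers, and the low three bits of n give opc2. The Python sketch below derives the encoding from n on that basis. It is an illustration inferred from the table rows, and the function and dictionary names are hypothetical.

```python
# First CRm value used by each register array in Table G4-60.
PMEV_BASE_CRM = {"PMEVCNTR": 8, "PMEVTYPER": 12}


def pmev_encoding(kind, n):
    """Return (CRn, opc1, CRm, opc2) for PMEVCNTR<n> or PMEVTYPER<n>."""
    if kind not in PMEV_BASE_CRM:
        raise ValueError(f"unknown register array {kind!r}")
    if not 0 <= n <= 30:
        raise ValueError(f"n={n} is out of range 0..30")
    crm = PMEV_BASE_CRM[kind] + (n >> 3)  # n[4:3] selects one of four CRm values
    opc2 = n & 0b111                      # n[2:0]
    return (14, 0, crm, opc2)


print(pmev_encoding("PMEVCNTR", 30))   # last row of the 24-30 group
print(pmev_encoding("PMEVTYPER", 17))  # falls in the 16-23 group
```

The upper bound of 30 matches the table: the encoding that n==31 would produce for PMEVTYPER, {c14, 0, c15, 7}, is the one the table assigns to PMCCFILTR.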





IMPLEMENTATION DEFINED performance monitors 


VMSAv8-32 reserves some additional System register encodings in the (coproc==0b1111) encoding space for 
optional additional IMPLEMENTATION DEFINED performance monitors. Table G4-61 shows the allocation of these 
encodings: 


Table G4-61 Performance Monitors System register encoding allocations

CRn  opc1  CRm      opc2  Name                                                                       Width   Type
c9   0-7   c12-c14  0-7   Performance Monitors Extension registers, see Table G4-60 on page G4-4205  32-bit  RW or ROᵃ
           c15      0-7   IMPLEMENTATION DEFINED                                                     32-bit  ᵇ

a. The table referenced in the Name entry shows the type of each of the OPTIONAL Performance Monitors Extension registers.

b. Access depends on the register or operation, and is IMPLEMENTATION DEFINED.
G4.17.14 Generic Timer registers, functional group

Table G4-62 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are the Generic
Timer registers functional group.

Table G4-62 Generic Timer registers

Name         CRn  opc1  CRm  opc2  Width   Typeᵃ  Description
CNTFRQ       c14  0     c0   0     32-bit  RW     Counter Frequency register
CNTHCTL      c14  4     c1   0     32-bit  RW     Timer EL2 Control register
CNTHP_CTL    c14  4     c2   1     32-bit  RW     EL2 Physical Timer Control register
CNTHP_CVAL   -    6     c14  -     64-bit  RW     EL2 Physical Timer CompareValue register
CNTHP_TVAL   c14  4     c2   0     32-bit  RW     EL2 Physical TimerValue register
CNTKCTL      c14  0     c1   0     32-bit  RW     Timer EL1 Control register
CNTP_CTL     c14  0     c2   1     32-bit  RW     EL1 Physical Timer Control register
CNTP_CVAL    -    2     c14  -     64-bit  RW     EL1 Physical Timer CompareValue register
CNTP_TVAL    c14  0     c2   0     32-bit  RW     EL1 Physical TimerValue register
CNTPCT       -    0     c14  -     64-bit  RO     Physical Count register
CNTV_CTL     c14  0     c3   1     32-bit  RW     Virtual Timer Control register
CNTV_CVAL    -    3     c14  -     64-bit  RW     Virtual Timer CompareValue register
CNTV_TVAL    c14  0     c3   0     32-bit  RW     Virtual TimerValue register
CNTVCT       -    1     c14  -     64-bit  RO     Virtual Count register
CNTVOFFᵇ     -    4     c14  -     64-bit  RW     Virtual Offset register

a. See the register descriptions for more information. Accessibility can depend on configuration settings as well as on the current Exception
level.

b. Implemented as RW as part of the Generic Timer on an implementation that includes EL2 and when EL2 is using AArch32. For more
information see Status of the CNTVOFF register on page G5-4223.


G4.17.15 Generic Interrupt Controller System registers, functional groups


From version 3.0 of the GIC architecture specification, the specification defines three groups of System registers, 
identified by the prefix of the register name: 


ICC_ GIC physical CPU interface System registers. 
ICH_ GIC virtual interface control System registers. 
ICV_ GIC Virtual CPU interface System registers. 


Note

These registers are in addition to the GIC memory-mapped register groups GICC_, GICD_, GICH_, GICR_,
GICV_, and GITS_.

In VMSAv8-32, the GIC System registers are all in the (coproc==0b1111) encoding space with (CRn==c12). The 
ICV_* registers have the same {CRn, opc1, CRm, opc2} encodings as the corresponding ICC_* registers. For these
encodings, GIC register configuration fields determine which register is accessed. 


For more information see the ARM® Generic Interrupt Controller Architecture Specification, GIC architecture 
version 3.0 and version 4.0 (ARM IHI 0069). 
Table G4-63 shows the functional groups of GIC System registers.


Table G4-63 Generic Interrupt Controller System registers

Name          CRn  opc1  CRm  opc2  Width   Type  Description
ICC_AP0R0ᵃ    c12  0     c8   4     32-bit  RW    Interrupt Controller Active Priorities Register (0,0)
ICV_AP0R0ᵃ                                        Interrupt Controller Virtual Active Priorities Register (0,0)
ICC_AP0R1ᵃ    c12  0     c8   5     32-bit  RW    Interrupt Controller Active Priorities Register (0,1)
ICV_AP0R1ᵃ                                        Interrupt Controller Virtual Active Priorities Register (0,1)
ICC_AP0R2ᵃ    c12  0     c8   6     32-bit  RW    Interrupt Controller Active Priorities Register (0,2)
ICV_AP0R2ᵃ                                        Interrupt Controller Virtual Active Priorities Register (0,2)
ICC_AP0R3ᵃ    c12  0     c8   7     32-bit  RW    Interrupt Controller Active Priorities Register (0,3)
ICV_AP0R3ᵃ                                        Interrupt Controller Virtual Active Priorities Register (0,3)
ICC_AP1R0ᵃ    c12  0     c9   0     32-bit  RW    Interrupt Controller Active Priorities Register (1,0)
ICV_AP1R0ᵃ                                        Interrupt Controller Virtual Active Priorities Register (1,0)
ICC_AP1R1ᵃ    c12  0     c9   1     32-bit  RW    Interrupt Controller Active Priorities Register (1,1)
ICV_AP1R1ᵃ                                        Interrupt Controller Virtual Active Priorities Register (1,1)
ICC_AP1R2ᵃ    c12  0     c9   2     32-bit  RW    Interrupt Controller Active Priorities Register (1,2)
ICV_AP1R2ᵃ                                        Interrupt Controller Virtual Active Priorities Register (1,2)
ICC_AP1R3ᵃ    c12  0     c9   3     32-bit  RW    Interrupt Controller Active Priorities Register (1,3)
ICV_AP1R3ᵃ                                        Interrupt Controller Virtual Active Priorities Register (1,3)
ICC_ASGI1R    -    1     c12  -     64-bit  WO    Interrupt Controller Alias Software Generated Interrupt group 1 Register
ICC_BPR0ᵃ     c12  0     c8   3     32-bit  RW    Interrupt Controller Binary Point Register 0
ICV_BPR0ᵃ                                         Interrupt Controller Virtual Binary Point Register 0
ICC_BPR1ᵃ     c12  0     c12  3     32-bit  RW    Interrupt Controller Binary Point Register 1
ICV_BPR1ᵃ                                         Interrupt Controller Virtual Binary Point Register 1
ICC_CTLRᵃ     c12  0     c12  4     32-bit  RW    Interrupt Controller Control Register
ICV_CTLRᵃ                                         Interrupt Controller Virtual Control Register
ICC_DIRᵃ      c12  0     c11  1     32-bit  WO    Interrupt Controller Deactivate Interrupt Register
ICV_DIRᵃ                                          Interrupt Controller Deactivate Virtual Interrupt Register
ICC_EOIR0ᵃ    c12  0     c8   1     32-bit  WO    Interrupt Controller End Of Interrupt Register 0
ICV_EOIR0ᵃ                                        Interrupt Controller Virtual End Of Interrupt Register 0
ICC_EOIR1ᵃ    c12  0     c12  1     32-bit  WO    Interrupt Controller End Of Interrupt Register 1
ICV_EOIR1ᵃ                                        Interrupt Controller Virtual End Of Interrupt Register 1
ICC_HPPIR0ᵃ   c12  0     c8   2     32-bit  RO    Interrupt Controller Highest Priority Pending Interrupt
Register 0 

ICV_HPPIRO@ Interrupt Controller Virtual Highest Priority Pending 
Interrupt Register 0 

ICC_HPPIR14 c12 0 c12 2 32-bit RO Interrupt Controller Highest Priority Pending Interrupt 
Register 1 

ICV_HPPIR14 Interrupt Controller Virtual Highest Priority Pending 
Interrupt Register 1 

ICC_HSRE c12 4 c9 5 32-bit RW Interrupt Controller Hyp System Register Enable 
register 

ICC_IARO@ c12 0 c8 0 32-bit RO Interrupt Controller Interrupt Acknowledge Register 0 

ICV_IARO@ Interrupt Controller Virtual Interrupt Acknowledge 
Register 0 

ICC_IAR14 c12 0 c12 0 32-bit RO Interrupt Controller Interrupt Acknowledge Register 1 

ICV_IAR1@ Interrupt Controller Virtual Interrupt Acknowledge 
Register 1 

ICC_IGRPENRO* = c12 0 c12 6 32-bit RW Interrupt Controller Interrupt Group 0 Enable register 

ICV_IGRPENRO2 Interrupt Controller Virtual Interrupt Group 0 Enable 
register 

ICC_IGRPENR14@ = cl2 0 c12 7 32-bit RW Interrupt Controller Interrupt Group 1 Enable register 

ICV_IGRPENR12 Interrupt Controller Virtual Interrupt Group 1 Enable 
register 

ICC_MCTLR c12 6 c12 4 32-bit RW Interrupt Controller Monitor Control Register 

ICC_MGRPEN1 cl2 6 c12 7 32-bit RW Interrupt Controller Monitor Interrupt Group 1 Enable 
register 

ICC_MSRE c12 6 c12 5 32-bit RW Interrupt Controller Monitor System Register Enable 
register 

ICC_PMR# c4 0 c6 0 32-bit RW Interrupt Controller Interrupt Priority Mask Register 

ICV_PMR2 Interrupt Controller Virtual Interrupt Priority Mask 
Register 

ICC_RPR&@ c12 0 cll 3 32-bit RO Interrupt Controller Running Priority Register 

ICV_RPR? Interrupt Controller Virtual Running Priority Register 

ICC_SGIOR - 2 c12 - 64-bit WO Interrupt Controller Software Generated Interrupt 
group 0 Register 

ICC_SGIIR - 0 c12 - 64-bit WO Interrupt Controller Software Generated Interrupt 
group | Register 

ICC_SRE c12 0 c12 -) 32-bit RW Interrupt Controller System Register Enable register 
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Table G4-63 Generic Interrupt Controller System registers (continued) 



























































Name CRn opci CRm_= opc2 Width Type _ Description 

ICH_APORO cl2 4 c8 0 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(0,0) 

ICH_APOR1 cl2 4 c8 1 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(0,1) 

ICH_APOR2 cl2 4 c8 2 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(0,2) 

ICH_APOR3 c12 4 c8 3 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(0,3) 

ICH_AP1IRO cl2 4 c9 0 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(1,0) 

ICH_APIR1 cl2 4 c9 i 32-bit RW Interrupt Controller Hyp Active Priorities Register 
d,1) 

ICH_AP1IR2 cl2 4 c9 2 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(1,2) 

ICH_AP1R3 cl2 4 c9 3 32-bit RW Interrupt Controller Hyp Active Priorities Register 
(1,3) 

ICH_EISR c12 4 cll 3 32-bit RO Interrupt Controller End of Interrupt Status Register 

ICH_ELRSR cl2 4 cll 5 32-bit RO Interrupt Controller Empty List Register Status 
Register 

ICH_HCR c12 4 cll 0 32-bit RW Interrupt Controller Hyp Control Register 

ICH_LR<n>, cl2 4 cl2 0-7 32-bit RW Interrupt Controller List Registers, 0-7 

n==0-7 

ICH_LR<n>, cl2 4 c13 0-7 32-bit RW Interrupt Controller List Registers, 8-15 

n==8-15 

ICH_LRC<n>, cl2 4 cl4 0-7 32-bit RW Interrupt Controller List Registers Continuation, 0-7 

n==0-7 

ICH_LRC<n>, cl2 4 cl5 0-7 32-bit RW Interrupt Controller List Registers Continuation, 8-15 

n==8-15 

ICH_MISR c12 4 cll 2 32-bit RO Interrupt Controller Maintenance Interrupt State 
Register 

ICH_VMCR c12 4 cll 7 32-bit RW Interrupt Controller Virtual Machine Control Register 

ICH_VTR c12 4 cll 1 32-bit RO Interrupt Controller VGIC Type Register 





a. As described in this section, these ICC_* and ICV_* registers use the same encodings. 
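The ICH_LR<n> and ICH_LRC<n> rows of Table G4-63 imply a simple mapping from the register index n to the CRm and opc2 fields. The following sketch expresses that mapping, reading the table so that opc2 is n modulo 8 within each group of eight registers. The function name ich_lr_encoding and the continuation parameter are hypothetical names used only for illustration.

```python
def ich_lr_encoding(n, continuation=False):
    """Return the {CRn, opc1, CRm, opc2} encoding of ICH_LR<n>, or of ICH_LRC<n> when continuation is True.

    Assumption for illustration: within each group of eight registers, opc2 counts
    from 0 to 7, so that n==8 maps to opc2==0 with the next CRm value.
    """
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0-15")
    first_crm = 14 if continuation else 12  # c14/c15 for ICH_LRC<n>, c12/c13 for ICH_LR<n>
    return {"CRn": 12, "opc1": 4, "CRm": first_crm + n // 8, "opc2": n % 8}


print(ich_lr_encoding(9))                      # ICH_LR9
print(ich_lr_encoding(15, continuation=True))  # ICH_LRC15
```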
Table G4-64 shows the VMSAv8-32 System registers in the (coproc==0b1111) encoding space that are in the Legacy
features registers functional group.

Table G4-64 Legacy features System registers in the (coproc==0b1111) encoding space

Name     CRn  opc1  CRm  opc2  Width   Type [a]  Description
CP15DMB  c7   0     c10  5     32-bit  WO, PL0   Memory barriers on page E2-2335
CP15DSB  c7   0     c10  4     32-bit  WO, PL0
CP15ISB  c7   0     c5   4     32-bit  WO, PL0
FCSEIDR  c13  0     c0   0     32-bit  RW [b]    FCSE Process ID Register

a. PL0 in a Type description indicates that the encoding is accessible by software executing at PL0. See the register
   description for more information.
b. In ARMv8, the PE does not implement the FCSEIDR, and therefore the register is RAZ/WI. See the register description
   for more information.


Table G4-65 shows the VMSAv8-32 System registers in the (coproc==0b1110) encoding space that are in the Legacy 
features registers functional group. 


Table G4-65 Legacy features registers in the (coproc==0b1110) encoding space

Name   CRn  opc1  CRm  opc2  Width   Type  Description
JIDR   c0   7     c0   0     32-bit  RO    Jazelle ID Register
JMCR   c2   7     c0   0     32-bit  RW    Jazelle Main Configuration Register
JOSCR  c1   7     c0   0     32-bit  RW    Jazelle OS Control Register





G4.17.17 IMPLEMENTATION DEFINED registers, functional group


VMSAv8-32 defines some encodings for registers with content that is entirely IMPLEMENTATION DEFINED. 
Table G4-66 shows these registers in the (coproc==0b1111) encoding space. 


Table G4-66 Architectural encodings for registers with IMPLEMENTATION DEFINED content

Name     CRn  opc1  CRm  opc2  Width   Type  Description
ACTLR    c1   0     c0   1     32-bit  RW    Auxiliary Control Register
ACTLR2   c1   0     c0   3     32-bit  RW    Auxiliary Control Register 2
ADFSR    c5   0     c1   0     32-bit  RW    Auxiliary Data Fault Status Register
AIDR     c0   1     c0   7     32-bit  RO    Auxiliary ID Register
AIFSR    c5   0     c1   1     32-bit  RW    Auxiliary Instruction Fault Status Register
AMAIR0   c10  0     c3   0     32-bit  RW    Auxiliary Memory Attribute Indirection Register 0
AMAIR1   c10  0     c3   1     32-bit  RW    Auxiliary Memory Attribute Indirection Register 1
HACR     c1   4     c1   7     32-bit  RW    Hyp Auxiliary Configuration Register
HACTLR   c1   4     c0   1     32-bit  RW    Hyp Auxiliary System Control Register
HACTLR2  c1   4     c0   3     32-bit  RW    Hyp Auxiliary System Control Register 2
HADFSR   c5   4     c1   0     32-bit  RW    Hyp Auxiliary Data Fault Status Register
HAIFSR   c5   4     c1   1     32-bit  RW    Hyp Auxiliary Instruction Fault Status Register
HAMAIR0  c10  4     c3   0     32-bit  RW    Hyp Auxiliary Memory Attribute Indirection Register 0
HAMAIR1  c10  4     c3   1     32-bit  RW    Hyp Auxiliary Memory Attribute Indirection Register 1
REVIDR   c0   0     c0   6     32-bit  RO    Revision ID Register
TCMTR    c0   0     c0   2     32-bit  RO    TCM Type Register





See also IMPLEMENTATION DEFINED performance monitors on page G4-4206. 


G4.17.18 Advanced SIMD and floating-point registers, functional group 


Table G4-67 shows the VMSAv8-32 System registers in the Advanced SIMD and floating-point registers functional
group. These registers are accessed using VMRS and VMSR instructions, see the register descriptions for more
information.


Table G4-67 Floating-point registers

Name   Width   Type    Description
FPEXC  32-bit  RW      Floating-Point Exception Control register
FPSCR  32-bit  RW      Floating-Point Status and Control Register
FPSID  32-bit  RW [a]  Floating-Point System ID register
MVFR0  32-bit  RO      Media and VFP Feature Register 0
MVFR1  32-bit  RO      Media and VFP Feature Register 1
MVFR2  32-bit  RO      Media and VFP Feature Register 2





a. When the FPSID is accessible, VMSR accesses to the FPSID are ignored. 


G4.17.19 Debug registers, functional group 


In AArch32 state, most Debug registers that are accessible through the System registers interface use
{coproc==0b1110, opc1==0} encodings. Table G4-68 shows these registers.


Table G4-68 System register {coproc == 0b1110, opc1 == 0} encodings of Debug registers

Name           CRn  opc2  CRm     Width   Type     Description
DBGAUTHSTATUS  c7   6     c14     32-bit  RO       Authentication Status
DBGBCR<n>      c0   5     c0-c15  32-bit  RW       Breakpoint Control
DBGBVR<n>      c0   4     c0-c15  32-bit  RW       Breakpoint Value
DBGBXVR<n>     c1   1     c0-c15  32-bit  RW       Breakpoint Extended Value
DBGCLAIMCLR    c7   6     c9      32-bit  RW       CLAIM Tag Clear
DBGCLAIMSET    c7   6     c8      32-bit  RW       CLAIM Tag Set
DBGDCCINT      c0   0     c2      32-bit  RW       Debug Communications Channel Interrupt Enable Register
DBGDEVID       c7   7     c2      32-bit  RO       Device ID 0
DBGDEVID1      c7   7     c1      32-bit  RO       Device ID 1
DBGDEVID2      c7   7     c0      32-bit  RO       Reserved, UNK
DBGDIDR        c0   0     c0      32-bit  RO       Debug ID
DBGDRAR        -    -     c1      64-bit  RO       Debug ROM Address
               c1   0     c0      32-bit
DBGDSAR        -    -     c2      64-bit  RO       Debug Self Address Offset
               c2   0     c0      32-bit
DBGDSCRext     c0   2     c2      32-bit  RW       Debug Status and Control external
DBGDSCRint     c0   0     c1      32-bit  RO       Debug Status and Control internal
DBGDTRRXext    c0   2     c0      32-bit  RW       Host to Target Data Transfer external
DBGDTRRXint    c0   0     c5      32-bit  RO       Host to Target Data Transfer internal
DBGDTRTXext    c0   2     c3      32-bit  RW       Target to Host Data Transfer external
DBGDTRTXint    c0   0     c5      32-bit  WO       Target to Host Data Transfer internal
DBGOSDLR       c1   4     c3      32-bit  RW       OS Double Lock
DBGOSECCR      c0   2     c6      32-bit  RW       OS Lock Exception Catch Control Register
DBGOSLAR       c1   4     c0      32-bit  WO       OS Lock Access
DBGOSLSR       c1   4     c1      32-bit  RO       OS Lock Status
DBGPRCR        c1   4     c4      32-bit  RW       Device Powerdown and Reset Control
DBGVCR         c0   0     c7      32-bit  RW       Vector Catch
DBGWCR<n>      c0   7     c0-c15  32-bit  RW       Watchpoint Control
DBGWFAR        c0   0     c6      32-bit  RW       Watchpoint Fault Address
DBGWVR<n>      c0   6     c0-c15  32-bit  RW       Watchpoint Value
-              c4   0-3   c0-c15  32-bit  IMP DEF  IMPLEMENTATION DEFINED
-              c7   2-3   c0-c15  32-bit  IMP DEF  Integration registers
+h c0 32-bit | IMP DEF 





In AArch32 state, some Debug registers are accessible through the System registers interface using
(coproc==0b1110) encodings. Table G4-69 on page G4-4214 shows these registers.
Table G4-69 System register (coproc==0b1110) encodings of Debug registers

Name   CRn  opc1  CRm  opc2  Width   Type  Description
DLR    c4   3     c5   1     32-bit  RW    Debug Link Register
DSPSR  c4   3     c5   0     32-bit  RW    Debug Saved Program Status Register
HDCR   c1   4     c1   1     32-bit  RW    Hyp Debug Control Register
SDCR   c1   0     c3   1     32-bit  RW    Secure Debug Configuration Register
SDER   c1   0     c1   1     32-bit  RW    Secure Debug Enable Register
G4.18 Pseudocode description of VMSAv8-32 memory system operations


This section contains a list of pseudocode functions describing VMSAv8-32 memory operations. The following
subsections describe the pseudocode functions:

• Alignment fault.
• Address translation.
• Domain checking.
• TLB operations.
• Translation table walk.
• Reporting syndrome information on page G4-4216.
• Memory access decode when TEX remap is enabled on page G4-4216.

See also the descriptions of pseudocode for general memory system operations in Pseudocode description of
general memory system instructions on page G3-4017.


G4.18.1 Alignment fault 


The AArch32.AlignmentFault() pseudocode function describes the generation of an Alignment fault Data Abort 
exception. 


See also Abort exceptions on page G3-4019. 


G4.18.2 Address translation 


The AArch32.TranslateAddress() and AArch32.FullTranslate() pseudocode functions describe a VMSAv8-32 
address translation. 


The AArch32.FullTranslate() function calls either:
• The function described in Address translation when the stage 1 address translation is disabled.
• One of the functions described in Translation table walk.

Stage 2 translation table walk on page G4-4216 describes the CheckS2Permission() and CombineS1S2Desc()
pseudocode functions.

Address translation when the stage 1 address translation is disabled

The AArch32.TranslateAddressS1Off() pseudocode function describes the address translation performed when the
stage 1 address translation is disabled.

G4.18.3 Domain checking 


The AArch32.CheckDomain() pseudocode function describes domain checking. 


G4.18.4 TLB operations 


The TLBRecord type represents the contents of a TLB entry: 


G4.18.5 Translation table walk 


Because of the complexity of a translation table walk, the following sections describe the different cases: 

• Translation table walk using the Short-descriptor translation table format for stage 1 on page G4-4216.
• Translation table walk using the Long-descriptor translation table format for stage 1 on page G4-4216.
• Stage 2 translation table walk on page G4-4216.





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G4-4215 
1ID092916 Non-Confidential 


Translation table walk using the Short-descriptor translation table format for stage 1


The AArch32.TranslationTableWalkSD() pseudocode function describes the translation table walk when the stage 1
translation tables use the Short-descriptor format. It calls the function described in Stage 2 translation table walk if
necessary.


The ShortConvertAttrsHints() pseudocode function converts the Normal memory cacheability attribute, from the
TTBR or the translation table TEX field, into the separate cacheability attribute and cache allocation hint defined
in a Long-descriptor translation table descriptor.


Translation table walk using the Long-descriptor translation table format for stage 1


The AArch32.TranslationTableWalkLD() pseudocode function describes the translation table walk when the stage 1
translation tables use the Long-descriptor format. It calls the function described in Stage 2 translation table walk if
necessary.


AArch32.TranslationTableWalkLD() calls the ConvertAttrsHints() pseudocode function that is defined in Translation
table walk using the Short-descriptor translation table format for stage 1.


The AArch32.S1AttrDecode() pseudocode function uses the MAIR0 and MAIR1 registers to decode the Attr[2:0]
value from a stage 1 translation table descriptor.


The S2AttrDecode() pseudocode function decodes the Attr[3:0] value from a stage 2 translation table descriptor.


Stage 2 translation table walk


In the Non-secure EL1&0 translation regime, a descriptor address returned by stage 1 lookup is in the IPA address
space, and must be mapped to a PA by a stage 2 translation. When EL2 is using AArch32, function
AArch32.SecondStageWalk() performs this translation, by calling the AArch32.SecondStageTranslate() function.
When called from AArch32.SecondStageWalk(), the AArch32.SecondStageTranslate() function performs a second
stage translation, from IPA to PA, of the supplied address, including checking that the access has read permission
at the second stage. If the access does not have second stage read permission it generates a second stage Permission
fault on the first stage translation table walk. The second stage translation might hit in a TLB, or might involve a
translation table walk, which will use the algorithm described in this section. Stage 2 translation tables always use
the Long-descriptor translation table format.


The AArch32.CheckPermission() pseudocode function checks the access permissions for the stage 1 translation.
The AArch32.CheckS2Permission() pseudocode function checks the access permissions for the stage 2 translation.


The CombineS1S2Desc() pseudocode function combines the stage 1 and stage 2 access descriptors:


G4.18.6 Reporting syndrome information


The AArch32.ReportHypEntry(), AArch32.ReportDataAbort(), and AArch32.ReportPrefetchAbort() pseudocode
functions write syndrome value information to the appropriate registers for the current mode.


G4.18.7 Memory access decode when TEX remap is enabled


When using the Short-descriptor translation table format, the function AArch32.RemappedTEXDecode() decodes the
texcb and S attributes derived from the translation tables when TEX remap is enabled. Short-descriptor format
memory region attributes, with TEX remap on page G4-4080 shows the interpretation of the arguments.
Chapter G5
The Generic Timer in AArch32 state


This chapter describes the implementation of the ARM Generic Timer as an extension to an ARMv8
implementation. It includes an overview of the AArch32 System register interface to an ARM Generic Timer.


It contains the following sections:
• About the Generic Timer in AArch32 state on page G5-4218.
• The AArch32 view of the Generic Timer on page G5-4222.


Chapter D6 The Generic Timer in AArch64 state describes the AArch64 view of the Generic Timer, including an 
additional timer that can be implemented in AArch64 state, and Chapter I1 System Level Implementation of the 
Generic Timer describes the system level implementation of the Generic Timer. 
G5.1 About the Generic Timer in AArch32 state


Figure G5-1 shows an example system-on-chip that uses the Generic Timer as a system timer. In this figure:

• This manual defines the architecture of the individual PEs in the multiprocessor blocks.

• The ARM Generic Interrupt Controller Architecture Specification defines a possible architecture for the
  interrupt controllers.

• Generic Timer functionality is distributed across multiple components.


Figure G5-1 Generic Timer example


The Generic Timer:


• Provides a system counter, that measures the passing of time in real-time.

  Note

  The Generic Timer can also provide other components at a system level, but Figure G5-1 does not show any
  such components.

• Supports virtual counters that measure the passing of virtual-time. That is, a virtual counter can measure the
  passing of time on a particular virtual machine.

• Provides timers, that can trigger events after a period of time has passed. The timers:
  - Can be used as count-up or as count-down timers.
  - Can operate in real-time or in virtual-time.


This chapter describes an instance of the Generic Timer component that Figure G5-1 shows as Timer_0 or Timer_1
within the Multiprocessor A or Multiprocessor B block. This component can be accessed from AArch64 state or
AArch32 state, and this chapter describes access from AArch32 state. Chapter D6 The Generic Timer in AArch64
state describes access to this component from AArch64 state.


Note 


The reset requirements of Generic Timer registers are more strict when they are accessed from AArch32 state than 
when they are accessed from AArch64 state. 
A Generic Timer implementation must also include a memory-mapped system component, see The full set of
Generic Timer components.


G5.1.1 The full set of Generic Timer components

Within a system that might include multiple PEs, a full set of Generic Timer components is as follows:


The system counter 


This provides a uniform view of system time, see The system counter on page G5-4220. Because 
this must be implemented at the system level, it is accessed through The system level 
memory-mapped implementation of the Generic Timer. However, during initialization, a status 
register in each implemented timer in the system must be programmed with the frequency of the 
system counter, so that software can read this frequency. 


PE implementations of the Generic Timer 
Each PE implementation of the Generic Timer provides the following components:

• A physical counter, that gives access to the count value of the system counter.

• A virtual counter, that gives access to virtual time. In AArch32 state, the CNTVOFF register
  defines the offset between physical time, as defined by the value of the system counter, and
  virtual time.

• A number of timers. In an implementation where all Exception levels are implemented and
  can use AArch32 state, the timers that are accessible from AArch32 state are:
  - A Secure PL1 physical timer.
  - A Non-secure EL1 physical timer.
  - An EL2 physical timer.
  - A virtual timer.

  Note

  The Secure PL1 physical timer uses the Secure banked instances of the CNTP_CTL,
  CNTP_CVAL, and CNTP_TVAL registers, and the Non-secure EL1 physical timer uses the
  Non-secure instances of the same registers.





The AArch32 view of the Generic Timer on page G5-4222 describes these components. 


The system level memory-mapped implementation of the Generic Timer 


The memory-mapped registers that control the components of the system level implementation of 
the Generic Timer are grouped into frames. The Generic Timer architecture defines the offset of 
each register within its frame, but the base address of each frame is IMPLEMENTATION DEFINED, and 
defined by the system. 


Each system level component has one or two register frames. The possible system level components 
are: 
The memory-mapped counter module, required

This module controls the system counter. It has two frames:
• A control frame, CNTControlBase.
• A status frame, CNTReadBase.


The memory-mapped timer control module, required

The system level implementation of the Generic Timer can provide up to eight timers,
and the memory-mapped timer control module identifies:
• Which timers are implemented.
• The features of each implemented timer.

This module has a single frame, CNTCTLBase.


Memory-mapped timers, optional

An implemented memory-mapped timer:
• Must provide a privileged view of the timer, in the CNTBaseN frame.
• Optionally provides an unprivileged view of the timer, in the CNTEL0BaseN frame.

N is the timer number, and the corresponding frame number, in the range 0-7.


Chapter I1 System Level Implementation of the Generic Timer describes these components.


G5.1.2 The system counter


The Generic Timer provides a system counter with the following specification:

Width       At least 56 bits wide.
            The value returned by any 64-bit read of the counter is zero-extended to 64 bits.

Frequency   Increments at a fixed frequency, typically in the range 1-50MHz.
            Can support one or more alternative operating modes in which it increments by larger amounts at a
            lower frequency, typically for power-saving.

Roll-over   Roll-over time of not less than 40 years.

Accuracy    ARM does not specify a required accuracy, but recommends that the counter does not gain or lose
            more than ten seconds in a 24-hour period.
            Use of lower-frequency modes must not affect the implemented accuracy.

Start-up    Starts operating from zero.
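The width, frequency, and roll-over figures in this specification are related by simple arithmetic, which the following sketch works through. The function names are hypothetical, and a year is assumed to be 365.25 days. The sketch is an illustration of the arithmetic only, not a statement of any additional architectural requirement.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumption: average Julian year


def rollover_years(width_bits, frequency_hz):
    """Time, in years, for a counter of the given width to wrap when counting at frequency_hz."""
    return (2 ** width_bits) / frequency_hz / SECONDS_PER_YEAR


def drift_ppm(seconds_per_day):
    """Express a gain or loss of seconds_per_day as parts per million."""
    return seconds_per_day / SECONDS_PER_DAY * 1e6


# A 56-bit counter at the top of the typical 1-50MHz range still exceeds 40 years.
print(rollover_years(56, 50_000_000))
# The recommended accuracy of ten seconds in 24 hours, in parts per million.
print(drift_ppm(10))
```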


The system counter, once configured and running, must provide a uniform view of system time. More precisely, it 
must be impossible for the following sequence of events to show system time going backwards: 


1. Device A reads the time from the system counter.
2. Device A communicates with another agent in the system, Device B.
3. After recognizing the communication from Device A, Device B reads the time from the system counter.

The system counter must be implemented in an always-on power domain.


To support lower-power operating modes, the counter can increment by larger amounts at a lower frequency. For
example, a 10MHz system counter might either increment:

• By 1 at 10MHz.
• By 500 at 20kHz, when the system lowers the clock frequency, to reduce power consumption.

In this case, the counter must support transitions between high-frequency, high-precision operation, and
lower-frequency, lower-precision operation, without any impact on the required accuracy of the counter.
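The two operating modes in this example advance the count at the same average rate, because the product of update frequency and increment is the same. The following sketch checks that arithmetic. The function names are hypothetical and the model ignores the moment at which a mode transition happens.

```python
def effective_rate_hz(update_hz, increment):
    """Average number of counts per second for a counter updated update_hz times per second."""
    return update_hz * increment


def count_after(seconds, update_hz, increment):
    """Count value reached after a whole number of seconds, starting from zero."""
    return seconds * update_hz * increment


high_precision = count_after(1, 10_000_000, 1)  # by 1 at 10MHz
low_power = count_after(1, 20_000, 500)         # by 500 at 20kHz
print(high_precision, low_power)
```

In the lower-frequency mode the count is only updated every 500 counts, which is the loss of precision that the text describes. The average rate, and therefore the accuracy, is unchanged.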


The CNTFRQ register is intended to hold a copy of the current clock frequency to allow fast reference to this
frequency by software running on the PE. For more information see Initializing and reading the system counter
frequency.


The mechanism by which the count from the system counter is distributed to system components is 
IMPLEMENTATION DEFINED, but each PE with a System register interface to the system counter must have a counter 
input that can capture each increment of the counter. 


Note 


So that the system counter can be clocked independently from the PE hardware, the count value might be distributed
using a Gray code sequence. Gray-count scheme for timer distribution scheme on page I1-5134 gives more
information about this possibility.








Initializing and reading the system counter frequency 


The CNTFRQ register must be programmed to the clock frequency of the system counter. Typically, this is done 
only during the system boot process, by using the System register interface to write the system counter frequency 
to the CNTFRQ register. Only software executing at the highest implemented Exception level can write to 
CNTFRQ. 
Note


The CNTFRQ register is UNKNOWN at reset, and therefore the counter frequency must be set as part of the system 
boot process. 








Software can read the CNTFRQ register, to determine the current system counter frequency, in the following states
and modes:

• Hyp mode.
• Secure PL1 modes and Non-secure EL1 modes.
• When CNTKCTL.PL0PCTEN is set to 1, Secure and Non-secure EL0 modes.
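Software that has read the frequency from CNTFRQ typically uses it to convert a difference between two counter reads into elapsed time. The following is a minimal sketch of that conversion. The function name ticks_to_ns is hypothetical, and the sketch assumes that the value read from CNTFRQ matches the actual frequency of the system counter.

```python
def ticks_to_ns(ticks, cntfrq_hz):
    """Convert a number of system counter ticks to nanoseconds, rounding down.

    cntfrq_hz is assumed to be the value that boot software programmed into CNTFRQ.
    """
    if cntfrq_hz <= 0:
        raise ValueError("CNTFRQ must hold a non-zero frequency before it can be used")
    return ticks * 1_000_000_000 // cntfrq_hz


# One tick of a 50MHz counter lasts 20ns.
print(ticks_to_ns(1, 50_000_000))
```

Integer arithmetic is used so that large counter values are not rounded by floating-point conversion.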


Memory-mapped controls of the system counter 


Some system counter controls are accessible only through the memory-mapped interface to the system counter. 
These controls are: 


• Enabling and disabling the counter.
• Setting the counter value.
• Changing the operating mode, to change the update frequency and increment value.
• Enabling Halt-on-debug, that a debugger can then use to suspend counting.


For descriptions of these controls, see Chapter I1 System Level Implementation of the Generic Timer. 
G5.2 The AArch32 view of the Generic Timer


The following sections describe the components and features of a PE implementation of the Generic Timer, as seen
from AArch32 state:
• The physical counter.
• The virtual counter.
• Event streams on page G5-4223.
• Timers on page G5-4224.


G5.2.1 The physical counter 


The PE includes a physical counter that contains the count value of the system counter. The CNTPCT register holds 
the current physical counter value. 


Accessing the physical counter 

Software with sufficient privilege can read CNTPCT using a 64-bit System register read.

CNTPCT:

• Is always accessible from Secure PL1 modes and from Non-secure Hyp mode.

• Is accessible from Non-secure EL1 modes when the value of CNTHCTL.PL1PCTEN is 1. When the value
  of CNTHCTL.PL1PCTEN is 0, any attempt to access CNTPCT from Non-secure EL1 modes is trapped to
  Hyp mode.

• Is accessible from Secure User mode when the value of CNTKCTL.PL0PCTEN is 1. When the value of
  CNTKCTL.PL0PCTEN is 0, any attempt to access CNTPCT generates an UNDEFINED exception.

• Is accessible from Non-secure User mode when the value of CNTHCTL.PL1PCTEN is 1 and the value of
  CNTKCTL.PL0PCTEN is 1. Otherwise:

  - When the value of CNTKCTL.PL0PCTEN is 0, any attempt to access CNTPCT from Non-secure
    User mode generates an UNDEFINED exception.

  - When the value of CNTKCTL.PL0PCTEN is 1 and the value of CNTHCTL.PL1PCTEN is 0, any
    attempt to access CNTPCT from Non-secure User mode is trapped to Hyp mode.
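The access rules above can be summarized as a small decision function. The following sketch is an informal model of those rules only. The function name, the mode strings, and the result strings are hypothetical, and the model does not cover cases that the list does not mention, such as implementations without EL2.

```python
def cntpct_access(secure, mode, pl1pcten, pl0pcten):
    """Model the outcome of a CNTPCT read, following the access rules listed above.

    mode is one of "hyp", "pl1" (Secure PL1 or Non-secure EL1 modes), or "user".
    pl1pcten models CNTHCTL.PL1PCTEN and pl0pcten models CNTKCTL.PL0PCTEN.
    Returns "ok", "undefined", or "hyp_trap".
    """
    if mode not in ("hyp", "pl1", "user"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "hyp":
        if secure:
            raise ValueError("Hyp mode is a Non-secure mode")
        return "ok"
    if secure:
        if mode == "pl1":
            return "ok"
        return "ok" if pl0pcten else "undefined"
    if mode == "pl1":
        return "ok" if pl1pcten else "hyp_trap"
    # Non-secure User mode: the PL0 control is checked before the Hyp trap control.
    if not pl0pcten:
        return "undefined"
    return "ok" if pl1pcten else "hyp_trap"


print(cntpct_access(secure=False, mode="user", pl1pcten=False, pl0pcten=True))
```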


Reads of CNTPCT can occur speculatively and out of order relative to other instructions executed on the same PE. 


For example, if a read from memory is used to obtain a signal from another agent that indicates that CNTPCT must 
be read, an ISB is used to ensure that the read of CNTPCT occurs after the signal has been read from memory, as 
shown in the following code sequence: 


loop                            ; polling for some communication to indicate a requirement to read the timer
    LDR R1, [R2]
    CMP R1, #1
    BNE loop
    ISB                         ; without this, the CNTPCT could be read before the memory location in [R2]
                                ; has had the value 1 written to it
    MRRC p15, 0, R0, R1, c14    ; 64-bit read of CNTPCT, low word in R0 and high word in R1


G5.2.2 The virtual counter 


An implementation of the Generic Timer always includes a virtual counter, that indicates virtual time. 


The virtual counter contains the value of the physical counter minus a 64-bit virtual offset. When executing in a
Non-secure EL1 or EL0 mode, the virtual offset value relates to the current virtual machine.
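The relationship between the physical count, the virtual offset, and the virtual count can be written as a single subtraction. The following sketch assumes that the subtraction wraps modulo 2^64, which is an assumption made here for illustration because the text only states that the offset is a 64-bit value. The function name virtual_count is hypothetical.

```python
MASK64 = (1 << 64) - 1


def virtual_count(physical_count, virtual_offset):
    """Return the virtual counter value for the given physical count and virtual offset.

    Assumption: the result is truncated to 64 bits, so a virtual offset larger than
    the physical count wraps around rather than producing a negative value.
    """
    return (physical_count - virtual_offset) & MASK64


print(virtual_count(1000, 400))
```

A hypervisor can use this relationship to hide from a virtual machine the time during which that virtual machine was not scheduled, by increasing the offset by that amount.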


The CNTVOFF register contains the virtual offset. CNTVOFF is only accessible:
• From Hyp mode.
• From Monitor mode only when SCR.NS is set to 1.


For more information see Status of the CNTVOFF register on page G5-4223.

The CNTVCT register holds the current virtual counter value.


Accessing the virtual counter 

Software with sufficient privilege can read CNTVCT using a 64-bit System register read. 

CNTVCT is always accessible from Secure PL1 modes and from Non-secure EL1 and EL2 modes.

In addition, when CNTKCTL.PL0VCTEN is set to 1, CNTVCT is accessible from EL0.

When CNTKCTL.PL0VCTEN is set to 0, any attempt to access CNTVCT from EL0 is UNDEFINED.

Reads of CNTVCT can occur speculatively and out of order relative to other instructions executed on the same PE. 


For example, if a read from memory is used to obtain a signal from another agent that indicates that CNTVCT must
be read, an ISB is used to ensure that the read of CNTVCT occurs after the signal has been read from memory, as
shown in the following code sequence:


loop                    ; polling for some communication to indicate a requirement to read the timer
    LDR R1, [R2]
    CMP R1, #1
    BNE loop
    ISB                 ; without this, the CNTVCT could be read before the memory location in [R2]
                        ; has had the value 1 written to it
    MRS R1, CNTVCT


Status of the CNTVOFF register 


All implementations of the Generic Timer include the virtual counter. Therefore, conceptually, all implementations 
include the CNTVOFF register that defines the virtual offset between the physical count and the virtual count. 
CNTVOFF is only accessible at EL2 or above. If EL2 is not implemented, the virtual counter uses a fixed virtual 
offset of zero. 


G5.2.3 Event streams 


Any implementation of the Generic Timer can use the system counter to generate one or more event streams, to 
generate periodic wake-up events as part of the mechanism described in Wait for Event mechanism and Send event 
on page D1-1599. 





Note
An event stream might be used:
• To impose a time-out on a Wait For Event polling loop.
• To safeguard against any programming error that means an expected event is not generated.





An event stream is configured by: 


• Selecting which bit, from the bottom 16 bits of a counter, triggers the event. This determines the frequency
  of the events in the stream.

• Selecting whether the event is generated on each 0 to 1 transition, or each 1 to 0 transition, of the selected
  counter bit.


The CNTKCTL.{EVNTEN, EVNTDIR, EVNTI} fields define an event stream that is generated from the virtual 
counter. 


In all implementations the CNTHCTL.{EVNTEN, EVNTDIR, EVNTI} fields define an event stream that is 
generated from the physical counter. 
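
As a rough illustration of the configuration described above, the following Python sketch models edge detection on a selected counter bit. The function name event_stream_ticks and its parameters are hypothetical and exist only for this example. EVNTI and EVNTDIR are modelled as plain integers, and the counter is stepped one count at a time. Because bit n of a counter changes value every 2^n counts, each transition direction occurs once every 2^(n+1) counts.

```python
def event_stream_ticks(start, count, evnti, evntdir):
    """Return the counter values at which an event would be generated.

    start, count: hypothetical range of counter values to step through.
    evnti:   index (0-15) of the counter bit that triggers the event.
    evntdir: 0 selects the 0-to-1 transition, 1 selects the 1-to-0 transition.
    """
    if not 0 <= evnti <= 15:
        raise ValueError("the trigger bit must be one of the bottom 16 bits")
    events = []
    previous = start
    for value in range(start + 1, start + count + 1):
        prev_bit = (previous >> evnti) & 1
        bit = (value >> evnti) & 1
        if evntdir == 0 and prev_bit == 0 and bit == 1:
            events.append(value)
        elif evntdir == 1 and prev_bit == 1 and bit == 0:
            events.append(value)
        previous = value  # remember the last sampled count
    return events


# Bit 2 selected: one event every 2**(2 + 1) = 8 counts in each direction.
rising = event_stream_ticks(start=0, count=32, evnti=2, evntdir=0)
falling = event_stream_ticks(start=0, count=32, evnti=2, evntdir=1)
```

In this model, selecting a higher bit halves the event frequency for each step up in bit position.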


The operation of an event stream is as follows: 


• The pseudocode variables PreviousCNTVCT and PreviousCNTPCT are initialized as:

    // Variables used for generation of the timer event stream.
    bits(64) PreviousCNTVCT = bits(64) UNKNOWN;
    bits(64) PreviousCNTPCT = bits(64) UNKNOWN;

• The pseudocode functions TestEventCNTV() and TestEventCNTP() are called on each cycle of the PE clock.

• The TestEventCNTx() pseudocode template defines the functions TestEventCNTV() and TestEventCNTP():

    // TestEventCNTx()
    // ===============
    // Template for the TestEventCNTV() and TestEventCNTP() functions
    // Describes operation when all Exception Levels are using AArch32:
    // - CNTxCT is CNTVCT or CNTPCT 64-bit count value
    // - CNTx_CTL is CNTV_CTL or CNTP_CTL Control register
    // - PreviousCNTxCT is PreviousCNTVCT or PreviousCNTPCT

    TestEventCNTx()
        if CNTx_CTL.EVNTEN == '1' then
            n = UInt(CNTx_CTL.EVNTI);
            SampleBit = CNTxCT<n>;
            PreviousBit = PreviousCNTxCT<n>;

            if CNTx_CTL.EVNTDIR == '0' then
                if PreviousBit == '0' && SampleBit == '1' then EventRegisterSet();
            else
                if PreviousBit == '1' && SampleBit == '0' then EventRegisterSet();
        PreviousCNTxCT = CNTxCT;

        return;


G5.2.4 Timers


In an implementation that includes EL3, in any implementation of the Generic Timer, the following timers are
accessible from AArch32 state, provided the appropriate Exception level can use AArch32:


• A Non-secure EL1 physical timer. A Non-secure EL1 control determines whether this register is accessible
  from Non-secure EL0.

• A Secure PL1 physical timer. This timer:
  - Is accessible from Secure EL1 using AArch32 when EL3 is using AArch64.
  - Is accessible from Secure EL3 when EL3 is using AArch32.
  A Secure PL1 control determines whether this register is accessible from Secure EL0.

• A Non-secure EL2 physical timer.

• A virtual timer.


The output of each implemented timer:

• Provides an output signal to the system.

• If the PE interfaces to a Generic Interrupt Controller (GIC), signals a Private Peripheral Interrupt (PPI) to
  that GIC. In a multiprocessor implementation, each PE must use the same interrupt number for each timer.


Each timer is implemented as three registers:

• A 64-bit CompareValue register, that provides a 64-bit unsigned upcounter.
• A 32-bit TimerValue register, that provides a 32-bit signed downcounter.
• A 32-bit Control register.

In all implementations, the AArch32 System registers for the EL1 (or PL1) physical timer are Banked, to provide
the Secure and Non-secure implementations of the timer. Table G5-1 shows the Timer registers.


Table G5-1 Timer registers summary for the Generic Timer

Timer register          Secure PL1/Non-secure EL1   EL2 physical timer   Virtual timer
                        physical timer
----------------------  --------------------------  -------------------  -------------
CompareValue register   CNTP_CVAL (a)               CNTHP_CVAL           CNTV_CVAL
TimerValue register     CNTP_TVAL (a)               CNTHP_TVAL           CNTV_TVAL
Control register        CNTP_CTL (a)                CNTHP_CTL            CNTV_CTL

a. In AArch32 state, these registers are Banked to provide the Non-secure EL1 physical timer and the Secure PL1
   physical timer.


The following sections describe:

• Accessing the timer registers
• Operation of the CompareValue views of the timers on page G5-4226
• Operation of the TimerValue views of the timers on page G5-4226.


Accessing the timer registers 
For each timer, all timer registers have the same access permissions, as follows: 


Secure PL1 and Non-secure EL1 physical timer 
The Secure PL1 physical timer is accessible from Secure PL1 modes. 


Non-secure software executing at EL2 controls access to the Non-secure EL1 physical timer 
from Non-secure EL1 modes. The Non-secure EL1 physical timer is accessible from 
Monitor mode when the value of SCR.NS is 1. 


When access from PL1 or EL1 modes is permitted, CNTKCTL.PL0PTEN determines
whether the registers are accessible from EL0. If an access is not permitted because
CNTKCTL.PL0PTEN is set to 0, an attempted access from EL0 is UNDEFINED.


In all implementations:

• Except for accesses from Monitor mode, accesses are to the registers for the current
  Security state.

• For accesses from Monitor mode, the value of SCR.NS determines whether accesses
  are to the Secure or the Non-secure registers.

  Note
  Monitor mode is present only when EL3 is using AArch32.

• The Non-secure registers are accessible from Hyp mode.

• CNTHCTL.PL1PCEN determines whether the Non-secure registers are accessible
  from Non-secure EL1 modes. If this bit is set to 1, to enable access from Non-secure
  EL1 modes, CNTKCTL.PL0PTEN determines whether the registers are accessible
  from Non-secure EL0.
  If an access is not permitted because CNTHCTL.PL1PCEN is set to 0, an attempted
  access from a Non-secure EL1 or EL0 mode generates a Hyp Trap exception.
  However, if CNTKCTL.PL0PTEN is set to 0, this control takes priority, and an
  attempted access from EL0 is UNDEFINED.


EL2 physical timer
Accessible from Hyp mode, and from Secure Monitor mode when SCR_EL3.NS is set to 1.


Virtual timer
Accessible from Secure PL1 modes and Non-secure EL1 modes, and from Hyp mode.

CNTKCTL.PL0VTEN determines whether the registers are accessible from EL0 modes. If
an access is not permitted because CNTKCTL.PL0VTEN is set to 0, an attempted access
from EL0 is UNDEFINED.
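
The interaction between the two controls on the Non-secure EL1 physical timer can be hard to follow in prose. The following Python sketch restates it for Non-secure accesses only, assuming that EL2 is implemented. The function name and the result strings are hypothetical, chosen for this illustration, and are not architectural terms.

```python
def ns_el1_physical_timer_access(level, pl1pcen, pl0pten):
    """Outcome of a Non-secure access to the Non-secure EL1 physical timer.

    level:   "EL0", "EL1" or "EL2" (Hyp mode), all Non-secure.
    pl1pcen: value of CNTHCTL.PL1PCEN (0 or 1).
    pl0pten: value of CNTKCTL.PL0PTEN (0 or 1).
    """
    if level == "EL2":
        # The Non-secure registers are accessible from Hyp mode.
        return "ACCESS"
    if level == "EL0" and pl0pten == 0:
        # CNTKCTL.PL0PTEN == 0 takes priority over the Hyp Trap.
        return "UNDEFINED"
    if level in ("EL0", "EL1"):
        return "ACCESS" if pl1pcen == 1 else "HYP_TRAP"
    raise ValueError("unknown level: " + repr(level))
```

For example, with both controls set to 0, an EL1 access is reported as a Hyp Trap but an EL0 access is reported as UNDEFINED.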


Operation of the CompareValue views of the timers 


The CompareValue view of a timer operates as a 64-bit upcounter. The timer condition is met when the appropriate
counter reaches the value programmed into its CompareValue register. When the timer condition is met, an interrupt
is generated if the interrupt is not masked in the corresponding timer control register, CNTP_CTL, CNTHP_CTL,
or CNTV_CTL. For CNTP_CTL, the interrupt is the same as the interrupt asserted by the Non-secure instance of
the AArch64 register CNTP_CTL_EL0.


The operation of this view of a timer is: 


TimerConditionMet = (((Counter[63:0] - Offset[63:0])[63:0] - CompareValue[63:0]) >= 0)








Where: 

TimerConditionMet Is TRUE if the timer condition for this counter is met, and FALSE otherwise. 

Counter The physical counter value, that can be read from the CNTPCT register. 

Note 
The virtual counter value, that can be read from the CNTVCT register, is the value: 
(Counter - Offset) 

Offset For a physical timer it is zero, and for the virtual timer it is the virtual offset, held in the 
CNTVOFF register. 

CompareValue The value of the appropriate CompareValue register, CNTP_CVAL, CNTHP_CVAL, or 
CNTV_CVAL. 


In this view of a timer, Counter, Offset, and CompareValue are all 64-bit unsigned values. 


Note 


This means that a timer with a CompareValue of, or close to, 0xFFFF_FFFF_FFFF_FFFF might never meet its timer
condition. However, there is no practical requirement to use values close to the counter wrap value.
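
A minimal Python sketch of the CompareValue comparison, under the assumption that the final subtraction and comparison are evaluated on unbounded integers after (Counter - Offset) has been truncated to 64 bits. The function name timer_condition_met is hypothetical.

```python
MASK64 = (1 << 64) - 1


def timer_condition_met(counter, offset, compare_value):
    """Evaluate the CompareValue view with 64-bit unsigned operands."""
    adjusted = (counter - offset) & MASK64        # (Counter - Offset)[63:0]
    return (adjusted - (compare_value & MASK64)) >= 0


# Physical timer (Offset is zero): met once the counter reaches CompareValue.
met_at_target = timer_condition_met(1000, 0, 1000)
met_before_target = timer_condition_met(999, 0, 1000)
# Virtual timer: the offset is subtracted before the comparison.
met_virtual = timer_condition_met(1000, 200, 900)
# A CompareValue at the wrap value is not reached one count earlier.
met_near_wrap = timer_condition_met(MASK64 - 1, 0, MASK64)
```

The last case illustrates the Note above: with CompareValue at the wrap value, there is only a single count at which the condition can be satisfied before the adjusted count wraps to zero.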








Operation of the TimerValue views of the timers 


The TimerValue view of a timer operates as a signed 32-bit downcounter. A TimerValue register is programmed 
with a count value. This value decrements on each increment of the appropriate counter, and the timer condition is 
met when the value reaches zero. When the timer condition is met, an interrupt is generated if the interrupt is not 
masked in the corresponding timer control register, CNTP_CTL, CNTHP_CTL, or CNTV_CTL. 


This view of a timer depends on the following behavior of accesses to TimerValue registers:
Reads      TimerValue = (CompareValue - (Counter - Offset))[31:0]
Writes     CompareValue = ((Counter - Offset)[63:0] + SignExtend(TimerValue))[63:0]


Where the arguments have the definitions used in Operation of the CompareValue views of the timers, and in
addition:


TimerValue The value of a TimerValue register, CNTP_TVAL, CNTHP_TVAL, or CNTV_TVAL. 
In this view of a timer, all values are signed, in standard two’s complement form. 


A read of a TimerValue register after the timer condition has been met indicates the time since the timer condition
was met.





Note
• Operation of the CompareValue views of the timers on page G5-4226 gives a strict definition of
  TimerConditionMet. However, provided that the TimerValue is not expected to wrap as a 32-bit signed value
  when decremented from 0x80000000, the TimerValue view can be used as giving an effect equivalent to:

  TimerConditionMet = (TimerValue <= 0)

• Programming TimerValue to a negative number with magnitude greater than (Counter - Offset) can lead to
  an arithmetic overflow that causes the CompareValue to be an extremely large positive value. This potentially
  delays meeting the timer condition for an extremely long period of time.
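
The TimerValue read and write behavior described above can be sketched in Python as follows. This is an illustrative model only: the helper names are hypothetical, and register values are represented as plain integers.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend_32(value):
    """Interpret the low 32 bits of value as a two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def write_timer_value(counter, offset, timer_value):
    """Return the CompareValue that results from writing TimerValue."""
    adjusted = (counter - offset) & MASK64
    return (adjusted + sign_extend_32(timer_value)) & MASK64


def read_timer_value(counter, offset, compare_value):
    """Return the 32-bit TimerValue that a read would observe."""
    adjusted = (counter - offset) & MASK64
    return (compare_value - adjusted) & MASK32


# Program a 100-count delay when the counter is at 1000.
cval = write_timer_value(1000, 0, 100)
remaining = read_timer_value(1040, 0, cval)
# Ten counts after the condition is met, the value read is negative.
elapsed = sign_extend_32(read_timer_value(1110, 0, cval))
# A negative TimerValue larger in magnitude than (Counter - Offset) wraps.
wrapped = write_timer_value(5, 0, -10 & MASK32)
```

The final line reproduces the overflow case in the Note: the resulting CompareValue is close to the 64-bit wrap value, so the timer condition is delayed for a very long time.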
Chapter G6
AArch32 System Register Descriptions 


This chapter describes each of the AArch32 System registers. 


It contains the following sections: 

• About the AArch32 System registers on page G6-4230.
• General system control registers on page G6-4231.
• Debug registers on page G6-4668.
• Performance Monitors registers on page G6-4758.


G6.1 About the AArch32 System registers 


For general information about the AArch32 System registers, see: 

• About the System registers for VMSAv8-32 on page G4-4148.
• VMSAv8-32 organization of registers in the (coproc==0b1110) encoding space on page G4-4172.
• VMSAv8-32 organization of registers in the (coproc==0b1111) encoding space on page G4-4175.
• Functional grouping of VMSAv8-32 System registers on page G4-4193.


The remainder of this chapter describes the AArch32 System registers, in the following sections: 
• General system control registers on page G6-4231.
• Debug registers on page G6-4668.
• Performance Monitors registers on page G6-4758.
• Generic Timer registers on page G6-4803.


G6.2 General system control registers 


This section lists the System registers in AArch32 state that are not part of one of the other listed groups.


G6.2.1 ACTLR, Auxiliary Control Register 


The ACTLR characteristics are: 


Purpose 


Provides IMPLEMENTATION DEFINED configuration and control options for execution at EL1 and
EL0.


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


ACTLR(S) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        -         -         -               RW

ACTLR(NS) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RW        RW        RW              -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

ACTLR is accessible as follows:

EL0   EL1   EL2 (NS)
-     RW    RW



Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HCR.TAC==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HCR_EL2.TACR==1, Non-secure accesses to this register from EL1 are trapped to EL2
  using AArch64.
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations
AArch32 System register ACTLR is architecturally mapped to AArch64 System register
ACTLR_EL1[31:0].

Some bits might define global configuration settings, and be common to the Secure and Non-secure
instances of the register.

RW fields in this register reset to architecturally UNKNOWN values.


Attributes
ACTLR is a 32-bit register.


Field descriptions 


The ACTLR bit assignments are: 


31 0


IMPLEMENTATION DEFINED


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the ACTLR:
To access the ACTLR:

    MRC p15,0,<Rt>,c1,c0,1 ; Read ACTLR into Rt
    MCR p15,0,<Rt>,c1,c0,1 ; Write Rt to ACTLR

Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0001   0000   001
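
To show how the encoding fields in the table above relate to an instruction word, the following Python sketch assembles the A32 encoding of an MRC or MCR instruction from those fields. The function name encode_cp_access is hypothetical. The sketch covers only the A32 encoding with a selectable condition field, which defaults to 0b1110 (always).

```python
def encode_cp_access(is_read, coproc, opc1, rt, crn, crm, opc2, cond=0b1110):
    """Assemble an A32 MRC (is_read=True) or MCR (is_read=False) word."""
    fields = {
        "cond": (cond, 4), "coproc": (coproc, 4), "opc1": (opc1, 3),
        "Rt": (rt, 4), "CRn": (crn, 4), "CRm": (crm, 4), "opc2": (opc2, 3),
    }
    for name, (value, width) in fields.items():
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (
        (cond << 28)
        | (0b1110 << 24)          # fixed bits [27:24]
        | (opc1 << 21)
        | (int(is_read) << 20)    # 1 for MRC, 0 for MCR
        | (crn << 16)
        | (rt << 12)
        | (coproc << 8)
        | (opc2 << 5)
        | (1 << 4)                # fixed bit [4]
        | crm
    )


# ACTLR: coproc=0b1111, opc1=0b000, CRn=0b0001, CRm=0b0000, opc2=0b001, Rt=R0.
read_actlr = encode_cp_access(True, 0b1111, 0b000, 0, 0b0001, 0b0000, 0b001)
write_actlr = encode_cp_access(False, 0b1111, 0b000, 0, 0b0001, 0b0000, 0b001)
```

The same helper applies to the other registers in this section by substituting the values from their encoding tables.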


G6.2.2 ACTLR2, Auxiliary Control Register 2 


The ACTLR2 characteristics are: 


Purpose 


Provides additional space to the ACTLR register to hold IMPLEMENTATION DEFINED trap
functionality for execution at EL1 and EL0.


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


ACTLR2(S) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        -         -         -               RW

ACTLR2(NS) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RW        RW        RW              -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

ACTLR2 is accessible as follows:

EL0   EL1   EL2 (NS)
-     RW    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HCR.TAC==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HCR_EL2.TACR==1, Non-secure accesses to this register from EL1 are trapped to EL2
  using AArch64.
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations
AArch32 System register ACTLR2 is architecturally mapped to AArch64 System register
ACTLR_EL1[63:32].

It is IMPLEMENTATION DEFINED whether this register is implemented, or whether it causes
UNDEFINED exceptions when accessed.

The implementation of this register can be detected by examining bits [7:4] of the
ID_MMFR4/ID_MMFR4_EL1 register.

RW fields in this register reset to architecturally UNKNOWN values.


Attributes
ACTLR2 is a 32-bit register.
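
As an illustration of the detection step mentioned for ACTLR2, the following Python sketch extracts bits [7:4] from a value read from ID_MMFR4. The function name is hypothetical, and the interpretation that a nonzero field means ACTLR2 is implemented is an assumption made for this sketch; the ID_MMFR4 register description gives the defined field values.

```python
def actlr2_implemented(id_mmfr4):
    """Return True if bits [7:4] of an ID_MMFR4 value are nonzero.

    Assumption for this sketch: a nonzero field indicates that ACTLR2
    is implemented, and zero indicates that it is not.
    """
    field = (id_mmfr4 >> 4) & 0xF   # bits [7:4]
    return field != 0
```

Software would typically perform this check before accessing ACTLR2, because an access to an unimplemented register can be UNDEFINED.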


Field descriptions 


The ACTLR2 bit assignments are: 


31 0


IMPLEMENTATION DEFINED


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the ACTLR2:
To access the ACTLR2:

    MRC p15,0,<Rt>,c1,c0,3 ; Read ACTLR2 into Rt
    MCR p15,0,<Rt>,c1,c0,3 ; Write Rt to ACTLR2

Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0001   0000   011


G6.2.3 ADFSR, Auxiliary Data Fault Status Register 
The ADFSR characteristics are: 


Purpose 
Provides additional IMPLEMENTATION DEFINED fault status information for Data Abort exceptions 
taken to EL1 modes, and EL3 modes when EL3 is implemented and is using AArch32. 

Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


ADFSR(S) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        -         -         -               RW

ADFSR(NS) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RW        RW        RW              -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

ADFSR is accessible as follows:

EL0   EL1   EL2 (NS)
-     RW    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register ADFSR is architecturally mapped to AArch64 System register
AFSR0_EL1.


RW fields in this register reset to architecturally UNKNOWN values.


Attributes 
ADFSR is a 32-bit register. 


Field descriptions 


The ADFSR bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the ADFSR:
To access the ADFSR:

    MRC p15,0,<Rt>,c5,c1,0 ; Read ADFSR into Rt
    MCR p15,0,<Rt>,c5,c1,0 ; Write Rt to ADFSR

Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0101   0001   000


G6.2.4 AIDR, Auxiliary ID Register


The AIDR characteristics are: 


Purpose 
Provides IMPLEMENTATION DEFINED identification information. 


The value of this register must be used in conjunction with the value of MIDR. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RO        RO        RO              RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0   EL1   EL2 (NS)
-     RO    RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 

AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 

page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID1==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TID1==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register AIDR is architecturally mapped to AArch64 System register AIDR_EL1. 


Attributes 
AIDR is a 32-bit register. 


Field descriptions 


The AIDR bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED.


Accessing the AIDR:
To access the AIDR:

    MRC p15,1,<Rt>,c0,c0,7 ; Read AIDR into Rt

Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     001    0000   0000   111


G6.2.5 AIFSR, Auxiliary Instruction Fault Status Register 
The AIFSR characteristics are: 


Purpose 
Provides additional IMPLEMENTATION DEFINED fault status information for Prefetch Abort 
exceptions taken to EL1 modes, and EL3 modes when EL3 is implemented and is using AArch32. 
Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


AIFSR(S) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        -         -         -               RW

AIFSR(NS) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RW        RW        RW              -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

AIFSR is accessible as follows:

EL0   EL1   EL2 (NS)
-     RW    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register AIFSR is architecturally mapped to AArch64 System register 
AFSR1_EL1. 


RW fields in this register reset to architecturally UNKNOWN values.


Attributes 
AIFSR is a 32-bit register. 


Field descriptions 


The AIFSR bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the AIFSR:
To access the AIFSR:

    MRC p15,0,<Rt>,c5,c1,1 ; Read AIFSR into Rt
    MCR p15,0,<Rt>,c5,c1,1 ; Write Rt to AIFSR

Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0101   0001   001


G6.2.6 AMAIR0, Auxiliary Memory Attribute Indirection Register 0
The AMAIR0 characteristics are:


Purpose
When using the Long-descriptor format translation tables for stage 1 translations, provides
IMPLEMENTATION DEFINED memory attributes for the memory regions specified by MAIR0.

Usage constraints


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


AMAIR0(S) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        -         -         -               RW

AMAIR0(NS) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RW        RW        RW              -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

AMAIR0 is accessible as follows:

EL0   EL1   EL2 (NS)
-     RW    RW

This register is RES0 in the following cases:

• When an implementation does not provide any IMPLEMENTATION DEFINED memory
  attributes.

• When the Long-descriptor translation table format is not used.

If EL3 is implemented and is using AArch32:

• AMAIR0(S) gives the value for memory accesses from Secure state.

• AMAIR0(NS) gives the value for memory accesses from Non-secure states other than Hyp
  mode.


Any IMPLEMENTATION DEFINED memory attributes are additional qualifiers for the memory
locations and must not change the architected behavior specified by MAIR0 and MAIR1.


In a typical implementation, AMAIR0 and AMAIR1 split into eight one-byte fields, corresponding
to the MAIRn.Attr<n> fields, but the architecture does not require them to do so.
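
The typical layout described above can be pictured with a short Python sketch. It assumes the common arrangement in which byte n of the register pair qualifies Attr<n>, with fields 0 to 3 in AMAIR0 and fields 4 to 7 in AMAIR1. Because the contents are IMPLEMENTATION DEFINED, this layout is only an assumption, and the function name split_amair is hypothetical.

```python
def split_amair(amair0, amair1):
    """Split AMAIR0 and AMAIR1 into eight one-byte fields.

    Returns a list whose element n is the byte assumed to qualify Attr<n>.
    """
    combined = ((amair1 & 0xFFFFFFFF) << 32) | (amair0 & 0xFFFFFFFF)
    return [(combined >> (8 * n)) & 0xFF for n in range(8)]


fields = split_amair(0x44332211, 0x88776655)
```

An implementation that does not follow this layout would need its own decoding, as the architecture places no requirement on the field structure.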


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register AMAIR0 is architecturally mapped to AArch64 System register
AMAIR_EL1[31:0].
When EL3 is using AArch32, write access to AMAIR0(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AMAIR0 is a 32-bit register.


Field descriptions


The AMAIR0 bit assignments are:


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the AMAIR0:
To access the AMAIR0:

    MRC p15,0,<Rt>,c10,c3,0 ; Read AMAIR0 into Rt
    MCR p15,0,<Rt>,c10,c3,0 ; Write Rt to AMAIR0

Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    1010   0011   000


G6.2.7 AMAIR1, Auxiliary Memory Attribute Indirection Register 1
The AMAIR1 characteristics are:


Purpose
When using the Long-descriptor format translation tables for stage 1 translations, provides
IMPLEMENTATION DEFINED memory attributes for the memory regions specified by MAIR1.

Usage constraints


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


AMAIR1(S) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        -         -         -               RW

AMAIR1(NS) is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RW        RW        RW              -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

AMAIR1 is accessible as follows:

EL0   EL1   EL2 (NS)
-     RW    RW

This register is RES0 in the following cases:

• When an implementation does not provide any IMPLEMENTATION DEFINED memory
  attributes.

• When the Long-descriptor translation table format is not used.

If EL3 is implemented and is using AArch32:

• AMAIR1(S) gives the value for memory accesses from Secure state.

• AMAIR1(NS) gives the value for memory accesses from Non-secure states other than Hyp
  mode.


Any IMPLEMENTATION DEFINED memory attributes are additional qualifiers for the memory
locations and must not change the architected behavior specified by MAIR0 and MAIR1.


In a typical implementation, AMAIR0 and AMAIR1 split into eight one-byte fields, corresponding
to the MAIRn.Attr<n> fields, but the architecture does not require them to do so.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register AMAIR1 is architecturally mapped to AArch64 System register 
AMAIR_EL1[63:32]. 
When EL3 is using AArch32, write access to AMAIR1(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
AMAIR1 is a 32-bit register. 


Field descriptions 


The AMAIR1 bit assignments are:


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the AMAIR1: 
To access the AMAIR1: 


MRC p15,0,<Rt>,c10,c3,1 ; Read AMAIR1 into Rt 
MCR p15,0,<Rt>,c10,c3,1 ; Write Rt to AMAIR1 


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   1010  0011  001
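As an illustration of how the encoding fields above map onto an instruction word, the following Python sketch packs coproc, opc1, CRn, CRm, and opc2 into an A32 MRC or MCR instruction. The helper name `encode_cp15_access`, the choice of R0 as Rt, and the use of the always (AL) condition code are assumptions made for the example. The bit positions follow the A32 MRC/MCR instruction encoding, which this register description does not reproduce.

```python
def encode_cp15_access(opc1, crn, crm, opc2, rt=0, read=True, coproc=15, cond=0xE):
    """Pack an A32 MRC (read=True) or MCR (read=False) instruction word.

    Hypothetical helper for illustration only; field widths are checked
    before the fields are placed into the 32-bit word.
    """
    fields = (
        ("opc1", opc1, 3), ("crn", crn, 4), ("crm", crm, 4), ("opc2", opc2, 3),
        ("rt", rt, 4), ("coproc", coproc, 4), ("cond", cond, 4),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")

    return (
        (cond << 28)          # condition code, 0b1110 = AL (assumed)
        | (0b1110 << 24)      # fixed bits for MCR/MRC
        | (opc1 << 21)
        | (int(read) << 20)   # 1 = MRC (read), 0 = MCR (write)
        | (crn << 16)
        | (rt << 12)
        | (coproc << 8)
        | (opc2 << 5)
        | (1 << 4)            # fixed bit for MCR/MRC
        | crm
    )


# AMAIR1: coproc=0b1111, opc1=0b000, CRn=0b1010, CRm=0b0011, opc2=0b001
amair1_read = encode_cp15_access(opc1=0b000, crn=0b1010, crm=0b0011, opc2=0b001, read=True)
amair1_write = encode_cp15_access(opc1=0b000, crn=0b1010, crm=0b0011, opc2=0b001, read=False)
print(f"MRC p15,0,r0,c10,c3,1 -> {amair1_read:#010x}")
print(f"MCR p15,0,r0,c10,c3,1 -> {amair1_write:#010x}")
```

The same helper can be applied to any other encoding table in this chapter by substituting the listed field values.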
G6.2.8 APSR, Application Program Status Register

The APSR characteristics are: 

Purpose 
Hold program status and control information. 

Usage constraints 
The APSR can be read using the MRS instruction and written using the MSR (immediate) or MSR 
(register) instructions. For more details on the instruction syntax, see PSTATE and banked register 
access instructions on page F1-2380. 

Traps and Enables 
There are no traps or enables affecting this register. 

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 

Attributes 
APSR is a 32-bit register. 

Field descriptions 

The APSR bit assignments are: 

31 30 29 28 27 26    20 19  16 15     5  4    3    0
N  Z  C  V  Q  RES0     GE     RES0      RES1 RES0

N, bit [31] 
Negative condition flag. Set to bit[31] of the result of the last flag-setting instruction. If the result is 
regarded as a two's complement signed integer, then N is set to 1 if the result was negative, and N 
is set to 0 if the result was positive or zero. 

Z, bit [30] 
Zero condition flag. Set to 1 if the result of the last flag-setting instruction was zero, and to 0 
otherwise. A result of zero often indicates an equal result from a comparison. 

C, bit [29] 
Carry condition flag. Set to 1 if the last flag-setting instruction resulted in a carry condition, for 
example an unsigned overflow on an addition. 

V, bit [28] 
Overflow condition flag. Set to 1 if the last flag-setting instruction resulted in an overflow condition, 
for example a signed overflow on an addition. 

Q, bit [27] 
Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some 
instructions. 

Bits [26:20]
Reserved, RES0.

GE, bits [19:16]
Greater than or Equal flags, for parallel addition and subtraction.

Bits [15:5]
Reserved, RES0.

Bit [4]
Reserved, RES1.

Bits [3:0]
Reserved, RES0.
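
The field positions above can be checked with a short decoding sketch. The function name `decode_apsr` and the sample value are assumptions made for the example; only the bit positions come from the field descriptions.

```python
def decode_apsr(value):
    """Split a 32-bit APSR value into the fields described above.

    Hypothetical helper for illustration; reserved bits are ignored.
    """
    return {
        "N": (value >> 31) & 1,     # Negative condition flag, bit [31]
        "Z": (value >> 30) & 1,     # Zero condition flag, bit [30]
        "C": (value >> 29) & 1,     # Carry condition flag, bit [29]
        "V": (value >> 28) & 1,     # Overflow condition flag, bit [28]
        "Q": (value >> 27) & 1,     # Cumulative saturation bit, bit [27]
        "GE": (value >> 16) & 0xF,  # Greater than or Equal flags, bits [19:16]
    }


# Sample value chosen for the example: Z and C set, all four GE flags set,
# and bit [4] set because it is RES1.
flags = decode_apsr(0x600F0010)
print(flags)
```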
G6.2.9 ATS12NSOPR, Address Translate Stages 1 and 2 Non-secure Only PL1 Read
The ATS12NSOPR characteristics are: 


Purpose 
Performs stage 1 and 2 address translations as defined for PL1 and the Non-secure state, with 
permissions as if reading from the given virtual address. 


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    WO

If EL3 is implemented and is using AArch64, execution of ATS12NSOPR in Secure EL1 using
AArch32 is trapped as an exception to EL3.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS12NSOPR is a 32-bit System instruction. 


Field descriptions 


The ATS12NSOPR input value bit assignments are: 


31 0 


Input address for translation 


Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 


This instruction takes a VA as input. The resulting address is the PA that is the output address of the 
stage 2 translation. 
Executing the ATS12NSOPR instruction:
The ATS12NSOPR instruction is executed as:
MCR p15,0,<Rt>,c7,c8,4 ; ATS12NSOPR operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  100
G6.2.10 ATS12NSOPW, Address Translate Stages 1 and 2 Non-secure Only PL1 Write
The ATS12NSOPW characteristics are:


Purpose 
Performs stage 1 and 2 address translations as defined for PL1 and the Non-secure state, with 
permissions as if writing to the given virtual address. 


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    WO

If EL3 is implemented and is using AArch64, execution of ATS12NSOPW in Secure EL1 using
AArch32 is trapped as an exception to EL3.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS12NSOPW is a 32-bit System instruction. 


Field descriptions 


The ATS12NSOPW input value bit assignments are: 


31 0 


Input address for translation 


Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 


This instruction takes a VA as input. The resulting address is the PA that is the output address of the 
stage 2 translation. 
Executing the ATS12NSOPW instruction:
The ATS12NSOPW instruction is executed as:
MCR p15,0,<Rt>,c7,c8,5 ; ATS12NSOPW operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  101
G6.2.11 ATS12NSOUR, Address Translate Stages 1 and 2 Non-secure Only Unprivileged Read
The ATS12NSOUR characteristics are:

Purpose
Performs stage 1 and 2 address translations as defined for PL0 and the Non-secure state, with
permissions as if reading from the given virtual address.


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    WO

If EL3 is implemented and is using AArch64, execution of ATS12NSOUR in Secure EL1 using
AArch32 is trapped as an exception to EL3.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS12NSOUR is a 32-bit System instruction. 


Field descriptions 


The ATS12NSOUR input value bit assignments are: 


31 0 


Input address for translation 


Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 


This instruction takes a VA as input. The resulting address is the PA that is the output address of the 
stage 2 translation. 
Executing the ATS12NSOUR instruction:
The ATS12NSOUR instruction is executed as:
MCR p15,0,<Rt>,c7,c8,6 ; ATS12NSOUR operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  110
G6.2.12 ATS12NSOUW, Address Translate Stages 1 and 2 Non-secure Only Unprivileged Write
The ATS12NSOUW characteristics are:

Purpose
Performs stage 1 and 2 address translations as defined for PL0 and the Non-secure state, with
permissions as if writing to the given virtual address.


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    WO

If EL3 is implemented and is using AArch64, execution of ATS12NSOUW in Secure EL1 using
AArch32 is trapped as an exception to EL3.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS12NSOUW is a 32-bit System instruction. 


Field descriptions 


The ATS12NSOUW input value bit assignments are: 


31 0 


Input address for translation 


Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 


This instruction takes a VA as input. The resulting address is the PA that is the output address of the 
stage 2 translation. 
Executing the ATS12NSOUW instruction:
The ATS12NSOUW instruction is executed as:
MCR p15,0,<Rt>,c7,c8,7 ; ATS12NSOUW operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  111
G6.2.13 ATS1CPR, Address Translate Stage 1 Current state PL1 Read
The ATS1CPR characteristics are:


Purpose 
Performs stage 1 address translation as defined for PL1 and the current Security state, with 
permissions as if reading from the given virtual address. 


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS1CPR is a 32-bit System instruction.

Field descriptions

The ATS1CPR input value bit assignments are:

31                             0
Input address for translation

Bits [31:0]
Input address for translation. The resulting address can be read from the PAR.

This instruction takes a VA as input. In an implementation that includes EL2, when executed in
Non-secure state, the resulting address is the IPA that is the output address of the stage 1 translation.
Otherwise, the resulting address is a PA.
Executing the ATS1CPR instruction:
The ATS1CPR instruction is executed as:
MCR p15,0,<Rt>,c7,c8,0 ; ATS1CPR operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  000
G6.2.14 ATS1CPW, Address Translate Stage 1 Current state PL1 Write
The ATS1CPW characteristics are:


Purpose 
Performs stage 1 address translation as defined for PL1 and the current Security state, with 
permissions as if writing to the given virtual address. 


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS1CPW is a 32-bit System instruction. 


Field descriptions 


The ATS1CPW input value bit assignments are: 





31 0 
Input address for translation 
Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 
This instruction takes a VA as input. In an implementation that includes EL2, when executed in 
Non-secure state, the resulting address is the IPA that is the output address of the stage 1 translation. 
Otherwise, the resulting address is a PA. 
Executing the ATS1CPW instruction:
The ATS1CPW instruction is executed as:
MCR p15,0,<Rt>,c7,c8,1 ; ATS1CPW operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  001
G6.2.15 ATS1CUR, Address Translate Stage 1 Current state Unprivileged Read
The ATS1CUR characteristics are:

Purpose
Performs stage 1 address translation as defined for PL0 and the current Security state, with
permissions as if reading from the given virtual address.


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS1CUR is a 32-bit System instruction. 


Field descriptions 


The ATS1CUR input value bit assignments are: 





31 0 
Input address for translation 
Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 
This instruction takes a VA as input. In an implementation that includes EL2, when executed in 
Non-secure state, the resulting address is the IPA that is the output address of the stage 1 translation. 
Otherwise, the resulting address is a PA. 
Executing the ATS1CUR instruction:
The ATS1CUR instruction is executed as:
MCR p15,0,<Rt>,c7,c8,2 ; ATS1CUR operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  010
G6.2.16 ATS1CUW, Address Translate Stage 1 Current state Unprivileged Write
The ATS1CUW characteristics are:

Purpose
Performs stage 1 address translation as defined for PL0 and the current Security state, with
permissions as if writing to the given virtual address.


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS1CUW is a 32-bit System instruction. 


Field descriptions 


The ATS1CUW input value bit assignments are: 





31 0 
Input address for translation 
Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 
This instruction takes a VA as input. In an implementation that includes EL2, when executed in 
Non-secure state, the resulting address is the IPA that is the output address of the stage 1 translation. 
Otherwise, the resulting address is a PA. 
Executing the ATS1CUW instruction:
The ATS1CUW instruction is executed as:
MCR p15,0,<Rt>,c7,c8,3 ; ATS1CUW operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  1000  011
G6.2.17 ATS1HR, Address Translate Stage 1 Hyp mode Read

The ATS1HR characteristics are:

Purpose
Performs stage 1 address translation as defined for PL2 and the Non-secure state, with permissions
as if reading from the given virtual address.


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,Mon)  EL3(SCR.NS=0,!Mon)





z WO WO WO WO-UNPREDICTABLE 





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    WO





If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode VA to PA address translation 
instructions on page K1-5476. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS1HR is a 32-bit System instruction. 


Field descriptions 


The ATS1HR input value bit assignments are: 


31 0 


Input address for translation 


Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 


This instruction takes a VA as input. The resulting address is the PA that is the output address of the
translation.
Executing the ATS1HR instruction:
The ATS1HR instruction is executed as:
MCR p15,4,<Rt>,c7,c8,0 ; ATS1HR operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   0111  1000  000
G6.2.18 ATS1HW, Address Translate Stage 1 Hyp mode Write

The ATS1HW characteristics are:

Purpose
Performs stage 1 address translation as defined for PL2 and the Non-secure state, with permissions
as if writing to the given virtual address.


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,Mon)  EL3(SCR.NS=0,!Mon)





z WO WO WO WO-UNPREDICTABLE 





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    WO





If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode VA to PA address translation 
instructions on page K1-5476. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ATS1HW is a 32-bit System instruction. 


Field descriptions 


The ATS1HW input value bit assignments are: 


31 0 


Input address for translation 


Bits [31:0] 
Input address for translation. The resulting address can be read from the PAR. 


This instruction takes a VA as input. The resulting address is the PA that is the output address of the
translation.
Executing the ATS1HW instruction:
The ATS1HW instruction is executed as:
MCR p15,4,<Rt>,c7,c8,1 ; ATS1HW operation

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   0111  1000  001
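
The address translation instructions in G6.2.9 to G6.2.18 share coproc, CRn, and CRm, and differ only in opc1 and opc2. The following Python sketch collects the encodings listed in those sections into a lookup table. The names `ATS_OPERATIONS` and `ats_operation` are assumptions made for the example, and the table covers only the ten instructions described in these sections.

```python
# (opc1, opc2) -> operation name, for coproc=0b1111, CRn=c7, CRm=c8.
# Values are taken from the encoding tables in G6.2.9 to G6.2.18.
ATS_OPERATIONS = {
    (0, 0): "ATS1CPR",
    (0, 1): "ATS1CPW",
    (0, 2): "ATS1CUR",
    (0, 3): "ATS1CUW",
    (0, 4): "ATS12NSOPR",
    (0, 5): "ATS12NSOPW",
    (0, 6): "ATS12NSOUR",
    (0, 7): "ATS12NSOUW",
    (4, 0): "ATS1HR",
    (4, 1): "ATS1HW",
}


def ats_operation(opc1, crn, crm, opc2):
    """Return the address translation operation name, or None if the
    encoding is not one of the ten instructions tabulated above."""
    if (crn, crm) != (7, 8):
        return None
    return ATS_OPERATIONS.get((opc1, opc2))


print(ats_operation(0, 7, 8, 4))  # MCR p15,0,<Rt>,c7,c8,4
print(ats_operation(4, 7, 8, 1))  # MCR p15,4,<Rt>,c7,c8,1
```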
G6.2.19 BPIALL, Branch Predictor Invalidate All
The BPIALL characteristics are: 


Purpose 


Invalidate all entries from branch predictors. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


In an implementation where the branch predictors are architecturally invisible, this instruction can 
execute as a NOP. 


Attributes 
BPIALL is a 32-bit System instruction. 


Field descriptions 


BPIALL ignores the value in the register specified by the instruction. Software does not have to write a value to the 
register before issuing this instruction. 


Executing the BPIALL instruction: 
The BPIALL instruction is executed as: 


MCR p15,0,<Rt>,c7,c5,6 ; BPIALL operation, ignoring the value in Rt 


The instruction is encoded in the System instruction encoding space as follows: 





coproc opc1 CRn CRm opc2 





1111 000 0111 0101 110 





The PE ignores the value of <Rt>. Software does not have to write a value to this register before issuing this 
instruction. 







G6.2.20 BPIALLIS, Branch Predictor Invalidate All, Inner Shareable 
The BPIALLIS characteristics are: 


Purpose 


Invalidate all entries from branch predictors Inner Shareable. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


In an implementation where the branch predictors are architecturally invisible, this instruction can 
execute as a NOP. 


Attributes 
BPIALLIS is a 32-bit System instruction. 


Field descriptions 


BPIALLIS ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 


Executing the BPIALLIS instruction: 
The BPIALLIS instruction is executed as: 
MCR p15,0,<Rt>,c7,c1,6 ; BPIALLIS operation, ignoring the value in Rt 


The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  0001  110





The PE ignores the value of <Rt>. Software does not have to write a value to this register before issuing this 
instruction. 
G6.2.21 BPIMVA, Branch Predictor Invalidate by VA
The BPIMVA characteristics are: 


Purpose 


Invalidate virtual address from branch predictors. 


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 
In an implementation where the branch predictors are architecturally invisible, this instruction can
execute as a NOP.


Attributes 
BPIMVA is a 32-bit System instruction. 


Field descriptions 


The BPIMVA input value bit assignments are: 





31 0 
Virtual address to use 
Bits [31:0] 
Virtual address to use. 

Executing the BPIMVA instruction: 

The BPIMVA instruction is executed as: 

MCR p15,0,<Rt>,c7,c5,7 ; BPIMVA operation 

The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0111  0101  111
G6.2.22 CCSIDR, Current Cache Size ID Register
The CCSIDR characteristics are: 
Purpose 
Provides information about the architecture of the currently selected cache. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    RO   RO

If CSSELR.Level indicates a cache that is not implemented, then on a read of the CCSIDR the 
behavior is CONSTRAINED UNPREDICTABLE, and can be one of the following: 
• The CCSIDR read is treated as NOP.
• The CCSIDR read is UNDEFINED.
• The CCSIDR read returns an UNKNOWN value.
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TID2==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TID2==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register CCSIDR is architecturally mapped to AArch64 System register 
CCSIDR_EL1. 
The implementation includes one CCSIDR for each cache that it can access. CSSELR and the 
Security state select which Cache Size ID Register is accessible. 
Attributes 
CCSIDR is a 32-bit register. 
Field descriptions 
The CCSIDR bit assignments are: 
31 28 27 13 12 3 2 0
UNKNOWN NumSets Associativity LineSize

UNKNOWN, bits [31:28] 
Reserved, UNKNOWN. 
NumSets, bits [27:13] 


(Number of sets in cache) - 1, therefore a value of 0 indicates 1 set in the cache. The number of sets 
does not have to be a power of 2. 


Associativity, bits [12:3] 


(Associativity of cache) - 1, therefore a value of 0 indicates an associativity of 1. The associativity 
does not have to be a power of 2. 


LineSize, bits [2:0] 
(Log2(Number of bytes in cache line)) - 4. For example:

For a line length of 16 bytes: Log2(16) = 4, LineSize entry = 0. This is the minimum line length.
For a line length of 32 bytes: Log2(32) = 5, LineSize entry = 1.





Note 


The parameters NumSets, Associativity, and LineSize in these registers define the architecturally visible parameters 
that are required for the cache maintenance by Set/Way instructions. They are not guaranteed to represent the actual 
microarchitectural features of a design. You cannot make any inference about the actual sizes of caches based on 
these parameters. 
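
As an illustration of the field encodings above, the following C sketch decodes a raw CCSIDR value into the architecturally visible parameters. The structure and function names are hypothetical and not part of the architecture. As the Note states, the results describe the parameters used for Set/Way maintenance, not necessarily the real cache dimensions.

```c
#include <stdint.h>

/* Hypothetical container for the decoded CCSIDR parameters. */
struct ccsidr_geometry {
    uint32_t num_sets;       /* NumSets field + 1 */
    uint32_t associativity;  /* Associativity field + 1 */
    uint32_t line_bytes;     /* 2^(LineSize field + 4) */
};

/* Decode a 32-bit CCSIDR value using the bit assignments given above. */
static struct ccsidr_geometry ccsidr_decode(uint32_t ccsidr)
{
    struct ccsidr_geometry g;
    g.num_sets      = ((ccsidr >> 13) & 0x7FFFu) + 1u;  /* bits [27:13] */
    g.associativity = ((ccsidr >> 3) & 0x3FFu) + 1u;    /* bits [12:3]  */
    g.line_bytes    = 1u << ((ccsidr & 0x7u) + 4u);     /* bits [2:0]   */
    return g;
}
```

A register value of zero therefore decodes to one set, an associativity of 1, and a 16-byte line, matching the minimum values described in the field descriptions.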





Accessing the CCSIDR:
To access the CCSIDR:
MRC p15,1,<Rt>,c0,c0,0 ; Read CCSIDR into Rt

Register access is encoded as follows:

coproc opc1 CRn CRm opc2
1111 001 0000 0000 000
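
The encoding fields in these tables can be placed into an A32 MRC instruction word. The sketch below assumes the standard A32 MRC layout (cond in bits [31:28], opc1 in [23:21], CRn in [19:16], Rt in [15:12], coproc in [11:8], opc2 in [7:5], CRm in [3:0]), which is described in the instruction set chapters and not in this section. The function name is hypothetical, and the condition is fixed to always (0b1110).

```c
#include <stdint.h>

/* Build an unconditional A32 MRC instruction word from its fields.
 * Assumed layout: cond=0b1110, bits [27:24]=0b1110, L=1 (bit 20), bit 4 = 1. */
static uint32_t a32_encode_mrc(unsigned coproc, unsigned opc1, unsigned rt,
                               unsigned crn, unsigned crm, unsigned opc2)
{
    return 0xEE100010u
         | ((uint32_t)(opc1 & 0x7u) << 21)
         | ((uint32_t)(crn & 0xFu) << 16)
         | ((uint32_t)(rt & 0xFu) << 12)
         | ((uint32_t)(coproc & 0xFu) << 8)
         | ((uint32_t)(opc2 & 0x7u) << 5)
         | (uint32_t)(crm & 0xFu);
}
```

For example, the CCSIDR read with Rt = R0 corresponds to `a32_encode_mrc(15, 1, 0, 0, 0, 0)`.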
G6.2.23 CLIDR, Cache Level ID Register
The CLIDR characteristics are: 


Purpose 


Identifies the type of cache, or caches, that are implemented at each level and can be managed using 
the architected cache maintenance instructions that operate by set/way, up to a maximum of seven 
levels. Also identifies the Level of Coherence (LoC) and Level of Unification (LoU) for the cache 
hierarchy. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - RO RO RO RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0 EL1 EL2 (NS)
- RO RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID2==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TID2==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CLIDR is architecturally mapped to AArch64 System register 
CLIDR_EL1. 


Attributes 
CLIDR is a 32-bit register. 


Field descriptions 


The CLIDR bit assignments are: 


31 30 29 27 26 24 23 21 20 18 17 15 14 12 11 9 8 6 5 3 2 0
ICB LoUU LoC LoUIS Ctype7 Ctype6 Ctype5 Ctype4 Ctype3 Ctype2 Ctype1










ICB, bits [31:30] 


Inner cache boundary. This field indicates the boundary for caching Inner Cacheable memory 
regions. 


The possible values are: 


00 Not disclosed by this mechanism. 

01 L1 cache is the highest Inner Cacheable level. 
10 L2 cache is the highest Inner Cacheable level. 
11 L3 cache is the highest Inner Cacheable level. 


LoUU, bits [29:27] 

Level of Unification Uniprocessor for the cache hierarchy. 
LoC, bits [26:24] 

Level of Coherence for the cache hierarchy. 
LoUIS, bits [23:21] 

Level of Unification Inner Shareable for the cache hierarchy. 
Ctype<n>, bits [3(n-1)+2:3(n-1)], for n = 1 to 7 


Cache Type fields. Indicate the type of cache that is implemented and can be managed using the
architected cache maintenance instructions that operate by set/way at each level, from Level 1 up to
a maximum of seven levels of cache hierarchy. Possible values of each field are:


000 No cache. 

001 Instruction cache only. 

010 Data cache only. 

011 Separate instruction and data caches. 
100 Unified cache. 


All other values are reserved. 


If software reads the Cache Type fields from Ctype1 upwards, once it has seen a value of 000, no
caches that can be managed using the architected cache maintenance instructions that operate by
set/way exist at further-out levels of the hierarchy. So, for example, if Ctype3 is the first Cache Type
field with a value of 000, the values of Ctype4 to Ctype7 must be ignored.
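
The rule for reading the Cache Type fields can be sketched in C as follows. The function names are hypothetical. The sketch extracts Ctype<n> from bits [3(n-1)+2:3(n-1)] and stops at the first field that reads as 000, ignoring any fields further out.

```c
#include <stdint.h>

/* Return the 3-bit Ctype<n> field of a CLIDR value, for n = 1 to 7. */
static uint32_t clidr_ctype(uint32_t clidr, unsigned n)
{
    return (clidr >> (3u * (n - 1u))) & 0x7u;
}

/* Count the cache levels reported before the first Ctype field of 0b000.
 * Fields beyond that point are ignored, as the text requires. */
static unsigned clidr_managed_levels(uint32_t clidr)
{
    unsigned n;
    for (n = 1; n <= 7; n++) {
        if (clidr_ctype(clidr, n) == 0u)
            break;
    }
    return n - 1u;
}
```

With Ctype1 = 011, Ctype2 = 100, and Ctype3 = 000, this reports two levels, even if a stale value is present in Ctype4.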


Accessing the CLIDR:
To access the CLIDR:
MRC p15,1,<Rt>,c0,c0,1 ; Read CLIDR into Rt

Register access is encoded as follows:

coproc opc1 CRn CRm opc2
1111 001 0000 0000 001
G6.2.24 CONTEXTIDR, Context ID Register
The CONTEXTIDR characteristics are: 


Purpose 


Identifies the current Process Identifier and, when using the Short-descriptor translation table 
format, the Address Space Identifier. 


The value of the whole of this register is called the Context ID and is used by: 
• The debug logic, for Linked and Unlinked Context ID matching.
• The trace logic, to identify the current process.


The significance of this register is for debug and trace use only. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


CONTEXTIDR(S) is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - - - RW

CONTEXTIDR(NS) is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - RW RW RW -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

CONTEXTIDR is accessible as follows:

EL0 EL1 EL2 (NS)
- RW RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If HSTR.T13==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T13==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations
AArch32 System register CONTEXTIDR is architecturally mapped to AArch64 System register 
CONTEXTIDR_EL1. 


The register format depends on whether address translation is using the Long-descriptor or the 
Short-descriptor translation table format. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CONTEXTIDR is a 32-bit register. 


Field descriptions 


The CONTEXTIDR bit assignments are: 


When TTBCR.EAE==0: 


31 8 7 0
PROCID ASID

PROCID, bits [31:8]
Process Identifier. This field must be programmed with a unique value that identifies the current
process.
ASID, bits [7:0] 


Address Space Identifier. This field is programmed with the value of the current ASID. 


When TTBCR.EAE==1: 


31 0
PROCID

PROCID, bits [31:0]
Process Identifier. This field must be programmed with a unique value that identifies the current
process.
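
For the TTBCR.EAE==0 format, packing and unpacking the two fields can be sketched in C as below. The helper names are hypothetical. When TTBCR.EAE==1 the whole register is PROCID, so no packing is needed.

```c
#include <stdint.h>

/* TTBCR.EAE == 0 layout: PROCID in bits [31:8], ASID in bits [7:0]. */
static uint32_t contextidr_pack_short(uint32_t procid, uint32_t asid)
{
    return ((procid & 0xFFFFFFu) << 8) | (asid & 0xFFu);
}

/* Extract PROCID, bits [31:8]. */
static uint32_t contextidr_procid_short(uint32_t value)
{
    return value >> 8;
}

/* Extract ASID, bits [7:0]. */
static uint32_t contextidr_asid_short(uint32_t value)
{
    return value & 0xFFu;
}
```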


Accessing the CONTEXTIDR: 
To access the CONTEXTIDR: 


MRC p15,0,<Rt>,c13,c0,1 ; Read CONTEXTIDR into Rt
MCR p15,0,<Rt>,c13,c0,1 ; Write Rt to CONTEXTIDR

Register access is encoded as follows:

coproc opc1 CRn CRm opc2
1111 000 1101 0000 001
G6.2.25 CP15DMB, Data Memory Barrier System instruction
The CP15DMB characteristics are: 


Purpose 
Performs a Data Memory Barrier. 
ARM deprecates any use of this operation, and strongly recommends that software use the DMB 
instruction instead. 

Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
WO WO WO WO WO WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this instruction can be
executed at the following exception levels:

EL0 EL1 EL2 (NS)
WO WO WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If SCTLR.CP15BEN==0, execution of this instruction at PL0 and PL1 is UNDEFINED.
• If SCTLR_EL1.CP15BEN==0, execution of this instruction at PL0 is UNDEFINED.
• If HSCTLR.CP15BEN==0, execution of this instruction at PL2 is UNDEFINED.
• If HSTR.T7==1, Non-secure execution of this instruction at EL0 and EL1 is trapped to Hyp
mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL0 and EL1 is trapped to
EL2.


Configurations 


There are no configuration notes. 


Attributes 
CP15DMB is a 32-bit System instruction. 


Field descriptions 

CP15DMB ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 

Executing the CP15DMB instruction: 

The CP15DMB instruction is executed as: 


MCR p15,0,<Rt>,c7,c10,5 ; CP15DMB operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc opc1 CRn CRm opc2
1111 000 0111 1010 101
G6.2.26 CP15DSB, Data Synchronization Barrier System instruction
The CP15DSB characteristics are:


Purpose 
Performs a Data Synchronization Barrier. 


ARM deprecates any use of this operation, and strongly recommends that software use the DSB 
instruction instead. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
WO WO WO WO WO WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this instruction can be
executed at the following exception levels:

EL0 EL1 EL2 (NS)
WO WO WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If SCTLR.CP15BEN==0, execution of this instruction at PL0 and PL1 is UNDEFINED.
• If SCTLR_EL1.CP15BEN==0, execution of this instruction at PL0 is UNDEFINED.
• If HSCTLR.CP15BEN==0, execution of this instruction at PL2 is UNDEFINED.
• If HSTR.T7==1, Non-secure execution of this instruction at EL0 and EL1 is trapped to Hyp
mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL0 and EL1 is trapped to
EL2.


Configurations 


There are no configuration notes. 


Attributes 
CP15DSB is a 32-bit System instruction. 


Field descriptions 

CP15DSB ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 

Executing the CP15DSB instruction: 

The CP15DSB instruction is executed as: 


MCR p15,0,<Rt>,c7,c10,4 ; CP15DSB operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc opc1 CRn CRm opc2
1111 000 0111 1010 100
G6.2.27 CP15ISB, Instruction Synchronization Barrier System instruction


The CP15ISB characteristics are: 


Purpose 
Performs an Instruction Synchronization Barrier. 


ARM deprecates any use of this operation, and strongly recommends that software use the ISB 
instruction instead. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following
exception levels:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
WO WO WO WO WO WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this instruction can be
executed at the following exception levels:

EL0 EL1 EL2 (NS)
WO WO WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If SCTLR.CP15BEN==0, execution of this instruction at PL0 and PL1 is UNDEFINED.
• If SCTLR_EL1.CP15BEN==0, execution of this instruction at PL0 is UNDEFINED.
• If HSCTLR.CP15BEN==0, execution of this instruction at PL2 is UNDEFINED.
• If HSTR.T7==1, Non-secure execution of this instruction at EL0 and EL1 is trapped to Hyp
mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL0 and EL1 is trapped to
EL2.


Configurations 


There are no configuration notes. 


Attributes 
CP15ISB is a 32-bit System instruction. 


Field descriptions 

CP15ISB ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 

Executing the CP15ISB instruction: 

The CP15ISB instruction is executed as: 


MCR p15,0,<Rt>,c7,c5,4 ; CP15ISB operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc opc1 CRn CRm opc2
1111 000 0111 0101 100
G6.2.28 CPACR, Architectural Feature Access Control Register
The CPACR characteristics are: 


Purpose 


Controls access to trace, and to Advanced SIMD and floating-point functionality from EL0, EL1,
and EL3.


In an implementation that includes EL2, the CPACR has no effect on instructions executed at EL2. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - RW RW RW RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0 EL1 EL2 (NS)
- RW RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCPTR.TCPAC==1, Non-secure accesses to this register from EL1 are trapped to Hyp
mode.
• If CPTR_EL2.TCPAC==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If CPTR_EL3.TCPAC==1, accesses to this register from EL1 and EL2 are trapped to EL3.
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CPACR is architecturally mapped to AArch64 System register 
CPACR_EL1. 


Bits in the NSACR control Non-secure access to the CPACR fields. See the field descriptions for 
more information. 
Note

In the register field descriptions, controls are described as applying at specified Privilege levels.
This is because, in Secure state, a PL1 control:

• Applies to execution in a Secure EL3 mode when EL3 is using AArch32.
• Applies to execution in a Secure EL1 mode when EL3 is using AArch64.

See Security state, Exception levels, and AArch32 execution privilege on page G1-3792.





Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch32. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 
Attributes
CPACR is a 32-bit register. 


Field descriptions 


The CPACR bit assignments are: 


31 30 29 28 27 24 23 22 21 20 19 0
ASEDIS RES0 TRCDIS RES0 cp11 cp10 RES0

ASEDIS, bit [31]
Disables PL0 and PL1 execution of Advanced SIMD instructions.
0 This control permits execution of Advanced SIMD instructions at PL0 and PL1.
1 All instruction encodings that are Advanced SIMD instruction encodings, but are not
also floating-point instruction encodings, are UNDEFINED at PL0 and PL1.

If the implementation does not include Advanced SIMD and floating-point functionality, this field
is RES0. Otherwise, it is IMPLEMENTATION DEFINED whether this field is implemented as a RW field.
If it is not implemented as a RW field, it is RAZ/WI.


If EL3 is implemented and is using AArch32, and the value of NSACR.NSASEDIS is 1, this field 
behaves as RAO/WI in Non-secure state, regardless of its actual value. This applies even if the field 
is implemented as RAZ/WI. 


For the list of instructions affected by this field, see Controls of Advanced SIMD operation that do 
not apply to floating-point operation on page E1-2306. 


See the description of CPACR.cp10 for a list of other controls that can disable or trap execution of 
Advanced SIMD instructions in AArch32 state. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.

Bits [30:29]

Reserved, RES0.


TRCDIS, bit [28]
Traps PL0 and PL1 System register accesses to all implemented trace registers to Undefined mode.
0 This control has no effect on PL0 and PL1 System register accesses to trace registers.
1 PL0 and PL1 System register accesses to all implemented trace registers are trapped to
Undefined mode.

If the implementation does not include a trace macrocell, or does not include a System register
interface to the trace macrocell registers, this field is RES0. Otherwise, it is IMPLEMENTATION
DEFINED whether this field is implemented as a RW field. If it is not implemented as a RW field, it
is RAZ/WI.


If EL3 is implemented and is using AArch32, and the value of NSACR.NSTRCDIS is 1, this field 
behaves as RAO/WI in Non-secure state, regardless of its actual value. This applies even if the field 
is implemented as RAZ/WI. 
Note

• The ETMv4 architecture does not permit EL0 to access the trace registers. If the
implementation includes an ETMv4 implementation, EL0 accesses to the trace registers are
UNDEFINED.
• The architecture does not provide traps on trace register accesses through the optional
memory-mapped external debug interface.





System register accesses to the trace registers can have side-effects. When a System register access 
is trapped, any side-effects that are normally associated with the access do not occur before the 
exception is taken. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


Bits [27:24] 


Reserved, RES0.


cp11, bits [23:22] 


The value of this field is ignored. If this field is programmed with a different value to the cp10 field 
then this field is UNKNOWN on a direct read of the CPACR. 


If the implementation does not include Advanced SIMD and floating-point functionality, this field
is RES0.


In Non-secure state, if EL3 is implemented and is using AArch32, when the value of NSACR.cp10 
is 0, this field behaves as RAZ/WI, regardless of its actual value. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 


cp10, bits [21:20] 


Defines the access rights for the floating-point and Advanced SIMD functionality. Possible values
of the field are:

00 PL0 and PL1 accesses to floating-point and Advanced SIMD registers or instructions
are UNDEFINED.
01 PL0 accesses to floating-point and Advanced SIMD registers or instructions are
UNDEFINED.
10 Reserved. The effect of programming this field to this value is CONSTRAINED
UNPREDICTABLE. See Reserved values in System and memory-mapped registers and
translation table entries on page K1-5477.
11 This control permits full access to the floating-point and Advanced SIMD functionality
from PL0 and PL1.

The floating-point and Advanced SIMD features controlled by these fields are:

• Execution of any floating-point or Advanced SIMD instruction.
• Any access to the Advanced SIMD and floating-point registers D0-D31 and their views as
S0-S31 and Q0-Q15.
• Any access to the FPSCR, FPSID, MVFR0, MVFR1, MVFR2, or FPEXC System registers.

Note

The CPACR has no effect on floating-point and Advanced SIMD accesses from PL2. These can be
disabled by the HCPTR.TCP10 field.

If the implementation does not include Advanced SIMD and floating-point functionality, this field
is RES0.


In Non-secure state, if EL3 is implemented and is using AArch32, when the value of NSACR.cp10 
is 0, this field behaves as RAZ/WI, regardless of its actual value. 
Execution of floating-point and Advanced SIMD instructions in AArch32 state can be disabled or
trapped by the following controls:

• CPACR.cp10, or, if executing at EL0, CPACR_EL1.FPEN.
• FPEXC.EN.
• If executing in Non-secure state:
— HCPTR.TCP10, or if EL2 is using AArch64, CPTR_EL2.TFP.
— NSACR.cp10, or if EL3 is using AArch64, CPTR_EL3.TFP.
• For Advanced SIMD instructions only:
— CPACR.ASEDIS.
— If executing in Non-secure state, HCPTR.TASE and NSACR.NSASEDIS.
See the descriptions of the controls for more information.


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 
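
The cp10 encodings can be illustrated with a short C sketch. The function name is hypothetical. The sketch looks only at CPACR.cp10 and ignores the other controls listed above (FPEXC.EN, HCPTR, NSACR, and so on). Treating the reserved value 0b10 as "not permitted" is a choice made for this sketch, because the architecture defines that case as CONSTRAINED UNPREDICTABLE.

```c
#include <stdint.h>
#include <stdbool.h>

/* Report whether CPACR.cp10 alone permits floating-point and Advanced SIMD
 * access at privilege level pl (0 for PL0, 1 for PL1). */
static bool cpacr_cp10_permits(uint32_t cpacr, unsigned pl)
{
    uint32_t cp10 = (cpacr >> 20) & 0x3u;   /* cp10, bits [21:20] */

    switch (cp10) {
    case 0x1u: return pl == 1u;   /* 0b01: PL0 accesses are UNDEFINED */
    case 0x3u: return pl <= 1u;   /* 0b11: full access from PL0 and PL1 */
    default:   return false;      /* 0b00: no access; 0b10: reserved */
    }
}
```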


Bits [19:0] 


Reserved, RES0.


Accessing the CPACR: 
To access the CPACR: 


MRC p15,0,<Rt>,c1,c0,2 ; Read CPACR into Rt
MCR p15,0,<Rt>,c1,c0,2 ; Write Rt to CPACR

Register access is encoded as follows:

coproc opc1 CRn CRm opc2
1111 000 0001 0000 010
G6.2.29 CPSR, Current Program Status Register
The CPSR characteristics are: 


Purpose 
Holds PE status and control information. 


Usage constraints 


The CPSR can be read using the MRS instruction and written using the MSR (immediate) or MSR 
(register) instructions. For more details on the instruction syntax, see PSTATE and banked register 
access instructions on page F1-2380. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 
CPSR is a 32-bit register. 


Field descriptions 


The CPSR bit assignments are: 


31 30 29 28 27 26 25 24 23 20 19 16 15 10 9 8 7 6 5 4 3 0
N Z C V Q IT[1:0] J RES0 GE IT[7:2] E A I F T RES1 M[3:0]


N, bit [31] 
Negative condition flag. Set to bit[31] of the result of the last flag-setting instruction. If the result is 
regarded as a two's complement signed integer, then N is set to 1 if the result was negative, and N 
is set to 0 if the result was positive or zero. 

Z, bit [30] 
Zero condition flag. Set to 1 if the result of the last flag-setting instruction was zero, and to 0 
otherwise. A result of zero often indicates an equal result from a comparison. 

C, bit [29] 
Carry condition flag. Set to 1 if the last flag-setting instruction resulted in a carry condition, for 
example an unsigned overflow on an addition. 

V, bit [28] 
Overflow condition flag. Set to 1 if the last flag-setting instruction resulted in an overflow condition, 
for example a signed overflow on an addition. 

Q, bit [27] 


Cumulative saturation bit. Set to 1 to indicate that overflow or saturation occurred in some 
instructions. 


IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 
J, bit [24]
RES0.

In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:20]

Reserved, RES0.


GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 
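
Because the IT field is split across two parts of the register, software that inspects a saved CPSR value has to reassemble it. A minimal C sketch, using a hypothetical function name, is:

```c
#include <stdint.h>

/* Reassemble the 8-bit IT field from a CPSR value. */
static uint32_t cpsr_it_state(uint32_t cpsr)
{
    uint32_t it_hi = (cpsr >> 10) & 0x3Fu;  /* IT[7:2], bits [15:10] */
    uint32_t it_lo = (cpsr >> 25) & 0x3u;   /* IT[1:0], bits [26:25] */
    return (it_hi << 2) | it_lo;
}
```

A result of zero corresponds to no IT block being active.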


E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation 
1 Big-endian operation. 
Instruction fetches ignore this bit. 


When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also 
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset. 


If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.

If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.

Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.


A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Indicates the AArch32 instruction set state. Possible values of this bit
are:
0 A32 state.
1 T32 state.

Bit [4]
Reserved, RES1.

M[3:0], bits [3:0]
Current PE mode. Possible values are:
M[3:0] Mode
0b0000 User
0b0001 FIQ
0b0010 IRQ
0b0011 Supervisor
0b0110 Monitor
0b0111 Abort
0b1010 Hyp
0b1011 Undefined
0b1111 System
Other values are reserved. 
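
The mode table above translates directly into a lookup. The following C sketch uses a hypothetical function name and returns the string "Reserved" for any M[3:0] value that the table does not list.

```c
#include <stdint.h>

/* Map CPSR.M[3:0] to the mode name given in the table above. */
static const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0xFu) {          /* M[3:0], bits [3:0] */
    case 0x0u: return "User";
    case 0x1u: return "FIQ";
    case 0x2u: return "IRQ";
    case 0x3u: return "Supervisor";
    case 0x6u: return "Monitor";
    case 0x7u: return "Abort";
    case 0xAu: return "Hyp";
    case 0xBu: return "Undefined";
    case 0xFu: return "System";
    default:   return "Reserved";
    }
}
```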
G6.2.30 CSSELR, Cache Size Selection Register
The CSSELR characteristics are: 


Purpose 
Selects the current Cache Size ID Register, CCSIDR, by specifying the required cache level and the 
cache type (either instruction or data cache). 

Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


CSSELR(S) is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - - - - RW

CSSELR(NS) is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - RW RW RW -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

CSSELR is accessible as follows:

EL0 EL1 EL2 (NS)
- RW RW





If CSSELR.Level is programmed to a cache level that is not implemented, then a read of CSSELR
is CONSTRAINED UNPREDICTABLE, and returns UNKNOWN values for CSSELR.{Level, InD}.
Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HCR_EL2.TID2==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If HSTR.T0==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T0==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register CSSELR is architecturally mapped to AArch64 System register 
CSSELR_EL1. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CSSELR is a 32-bit register. 
Field descriptions

The CSSELR bit assignments are:

31 4 3 1 0
RES0 Level InD

Bits [31:4]

Reserved, RES0.

Level, bits [3:1]

Cache level of required cache. Permitted values are:

000 Level 1 cache
001 Level 2 cache
010 Level 3 cache
011 Level 4 cache
100 Level 5 cache
101 Level 6 cache
110 Level 7 cache


All other values are reserved. 


If CSSELR.Level is programmed to a cache level that is not implemented, then the value for this 
field on a read of CSSELR is UNKNOWN. 


InD, bit [0] 


Instruction not Data bit. Permitted values are: 


0 Data or unified cache.
1 Instruction cache.


If CSSELR.Level is programmed to a cache level that is not implemented, then the value for this 
field on a read of CSSELR is UNKNOWN. 
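
To select a cache before reading the CCSIDR, software writes a value built from these two fields. The C sketch below shows only the value construction. The function name is hypothetical, and the caller passes a 1-based cache level (1 to 7), which the sketch converts to the 0-based Level encoding.

```c
#include <stdint.h>

/* Build a CSSELR value for cache level `level` (1 to 7).
 * ind = 0 selects the data or unified cache, ind = 1 the instruction cache. */
static uint32_t csselr_value(unsigned level, unsigned ind)
{
    uint32_t level_field = (uint32_t)(level - 1u) & 0x7u;  /* Level, bits [3:1] */
    return (level_field << 1) | (uint32_t)(ind & 0x1u);    /* InD, bit [0] */
}
```

The sketch does not check that the selected level is implemented. As described above, selecting an unimplemented level leaves the fields UNKNOWN on a subsequent read.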


Accessing the CSSELR: 


To access the CSSELR: 


MRC p15,2,<Rt>,c0,c0,0 ; Read CSSELR into Rt
MCR p15,2,<Rt>,c0,c0,0 ; Write Rt to CSSELR

Register access is encoded as follows:

coproc opc1 CRn CRm opc2
1111 010 0000 0000 000
G6.2.31 CTR, Cache Type Register


The CTR characteristics are: 


Purpose 


Provides information about the architecture of the caches. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0)
- - RO RO RO RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0 EL1 EL2 (NS)
- RO RO





Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID2==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TID2==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CTR is architecturally mapped to AArch64 System register CTR_EL0.


Attributes 
CTR is a 32-bit register. 


Field descriptions 


The CTR bit assignments are: 


31   30   28 27  24 23  20 19      16 15  14 13      4 3        0
RES1 | RES0 | CWG  | ERG  | DminLine | L1Ip | RES0    | IminLine

Bit [31]

Reserved, RES1.

Bits [30:28]

Reserved, RES0.


CWG, bits [27:24] 


Cache Writeback Granule. Log2 of the number of words of the maximum size of memory that can
be overwritten as a result of the eviction of a cache entry that has had a memory location in it
modified.

A value of 0b0000 indicates that this register does not provide Cache Writeback Granule information
and either:
• The architectural maximum of 512 words (2KB) must be assumed.
• The Cache Writeback Granule can be determined from maximum cache line size encoded in
  the Cache Size ID Registers.

Values greater than 0b1001 are reserved.


ERG, bits [23:20] 


Exclusives Reservation Granule. Log2 of the number of words of the maximum size of the
reservation granule that has been implemented for the Load-Exclusive and Store-Exclusive 
instructions. 


A value of 0b0000 indicates that this register does not provide Exclusives Reservation Granule 
information and the architectural maximum of 512 words (2KB) must be assumed. 


Values greater than 0b1001 are reserved.

DminLine, bits [19:16]

Log2 of the number of words in the smallest cache line of all the data caches and unified caches that
are controlled by the PE. 


L1Ip, bits [15:14] 


Level 1 instruction cache policy. Indicates the indexing and tagging policy for the L1 instruction
cache. Possible values of this field are: 


01 ASID-tagged Virtual Index, Virtual Tag (AIVIVT) 
10 Virtual Index, Physical Tag (VIPT) 
11 Physical Index, Physical Tag (PIPT) 


Other values are reserved. 


Bits [13:4] 


Reserved, RES0.


IminLine, bits [3:0] 


Log2 of the number of words in the smallest cache line of all the instruction caches that are
controlled by the PE. 
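
Because CWG, ERG, DminLine, and IminLine are all encoded as Log2 of a number of 4-byte words, turning them into byte sizes is a shift. The following C sketch decodes a raw CTR value under stated assumptions: the names `ctr_info` and `ctr_decode` are hypothetical, and for a CWG or ERG value of 0b0000 it simply assumes the architectural maximum of 2KB rather than consulting the Cache Size ID Registers.

```c
#include <stdint.h>

/* Hypothetical container for decoded CTR fields (sizes in bytes). */
struct ctr_info {
    uint32_t cwg_bytes;      /* Cache Writeback Granule               */
    uint32_t erg_bytes;      /* Exclusives Reservation Granule        */
    uint32_t dminline_bytes; /* smallest data or unified cache line   */
    uint32_t iminline_bytes; /* smallest instruction cache line       */
    unsigned l1ip;           /* raw L1Ip field, bits [15:14]          */
};

/* A field value n means 2^n words, and a word is 4 bytes. */
static inline uint32_t log2_words_to_bytes(uint32_t n)
{
    return 4u << n;
}

static inline struct ctr_info ctr_decode(uint32_t ctr)
{
    struct ctr_info info;
    uint32_t cwg = (ctr >> 24) & 0xFu; /* CWG, bits [27:24] */
    uint32_t erg = (ctr >> 20) & 0xFu; /* ERG, bits [23:20] */

    /* Assumption: 0b0000 is treated as the 512-word (2KB) maximum. */
    info.cwg_bytes      = cwg ? log2_words_to_bytes(cwg) : 2048u;
    info.erg_bytes      = erg ? log2_words_to_bytes(erg) : 2048u;
    info.dminline_bytes = log2_words_to_bytes((ctr >> 16) & 0xFu);
    info.iminline_bytes = log2_words_to_bytes(ctr & 0xFu);
    info.l1ip           = (ctr >> 14) & 0x3u;
    return info;
}
```

The sketch does not reject reserved field values, such as CWG or ERG values greater than 0b1001.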

Accessing the CTR: 

To access the CTR: 


MRC p15,0,<Rt>,c0,c0,1 ; Read CTR into Rt

Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0000  0000  001
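
The encoding fields in these tables map directly onto the A32 MRC and MCR instruction word. The C sketch below packs them into a 32-bit A32 opcode, assuming the standard A32 MRC/MCR bit positions and the always (AL) condition code. The function name `a32_sysreg_insn` is hypothetical, and T32 encodings are not covered.

```c
#include <stdint.h>

/* Hypothetical helper: build the A32 instruction word for
 *   MRC/MCR p<coproc>,<opc1>,<Rt>,c<CRn>,c<CRm>,<opc2>
 * is_read != 0 selects MRC (bit 20 = 1), otherwise MCR.
 * Assumes condition code AL (0b1110) in bits [31:28]. */
static inline uint32_t a32_sysreg_insn(int is_read, unsigned coproc,
                                       unsigned opc1, unsigned rt,
                                       unsigned crn, unsigned crm,
                                       unsigned opc2)
{
    return (0xEu << 28)                   /* cond = AL              */
         | (0xEu << 24)                   /* bits [27:24] = 0b1110  */
         | ((opc1 & 0x7u) << 21)          /* opc1                   */
         | ((is_read ? 1u : 0u) << 20)    /* 1 = MRC, 0 = MCR       */
         | ((crn & 0xFu) << 16)           /* CRn                    */
         | ((rt & 0xFu) << 12)            /* Rt                     */
         | ((coproc & 0xFu) << 8)         /* coproc                 */
         | ((opc2 & 0x7u) << 5)           /* opc2                   */
         | (1u << 4)                      /* bit [4] = 1            */
         | (crm & 0xFu);                  /* CRm                    */
}
```

With these assumptions, reading CTR into R0 (MRC p15,0,R0,c0,c0,1) corresponds to the word 0xEE100F30.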
G6.2.32 DACR, Domain Access Control Register

The DACR characteristics are:


Purpose 


Defines the access permission for each of the sixteen memory domains. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


DACR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





DACR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


DACR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T3==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T3==1, Non-secure accesses to this register from EL1 are trapped to EL2.
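
The trap rules listed above reduce to a small predicate: writes are controlled by TVM, reads by TRVM, and both by T3. The following C sketch models only that decision for a Non-secure EL1 access. The type `el2_trap_controls` and the function `dacr_el1_access_traps` are hypothetical names, the caller is assumed to have read the bits from HCR and HSTR or from HCR_EL2 and HSTR_EL2 as appropriate, and exception prioritization is not modeled.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the EL2 trap control bits relevant to DACR. */
struct el2_trap_controls {
    bool tvm;  /* HCR.TVM or HCR_EL2.TVM: trap writes    */
    bool trvm; /* HCR.TRVM or HCR_EL2.TRVM: trap reads   */
    bool t3;   /* HSTR.T3 or HSTR_EL2.T3: trap CRn == c3 */
};

/* Returns true if a Non-secure EL1 access to DACR is trapped to EL2
 * (or Hyp mode), ignoring prioritization against other exceptions. */
static inline bool dacr_el1_access_traps(struct el2_trap_controls c,
                                         bool is_write)
{
    if (c.t3)
        return true;                 /* traps reads and writes */
    return is_write ? c.tvm : c.trvm;
}
```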





Configurations
AArch32 System register DACR is architecturally mapped to AArch64 System register
DACR32_EL2.
When EL3 is using AArch32, write access to DACR(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.
This register has no function when TTBCR.EAE is set to 1, to select the Long-descriptor translation
table format.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes
DACR is a 32-bit register.


Field descriptions 


The DACR bit assignments are: 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
 D15 | D14 | D13 | D12 | D11 | D10 | D9  | D8  | D7  | D6  | D5 | D4| D3| D2| D1| D0

D<n>, bits [2n+1:2n], for n = 0 to 15


Domain n access permission, where n = 0 to 15. Permitted values are: 


00 No access. Any access to the domain generates a Domain fault. 
01 Client. Accesses are checked against the permission bits in the translation tables. 
11 Manager. Accesses are not checked against the permission bits in the translation tables. 


The value 10 is reserved. 
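
Since each domain occupies the two bits [2n+1:2n], updating one domain is a read-modify-write on a 2-bit slot. A minimal C sketch follows. The enum and function names are hypothetical, and the functions operate on a plain 32-bit value rather than accessing the register itself.

```c
#include <stdint.h>

/* Permitted D<n> encodings. The value 0b10 is reserved and omitted. */
enum dacr_perm {
    DACR_NO_ACCESS = 0x0, /* any access generates a Domain fault     */
    DACR_CLIENT    = 0x1, /* accesses checked against permissions    */
    DACR_MANAGER   = 0x3  /* accesses not checked against permissions */
};

/* Return a copy of dacr with domain n (0..15) set to perm. */
static inline uint32_t dacr_set_domain(uint32_t dacr, unsigned n,
                                       enum dacr_perm perm)
{
    unsigned shift = 2u * (n & 0xFu);   /* D<n> starts at bit 2n */
    dacr &= ~(0x3u << shift);           /* clear the old value   */
    dacr |= ((uint32_t)perm & 0x3u) << shift;
    return dacr;
}

/* Extract the raw 2-bit D<n> field for domain n (0..15). */
static inline unsigned dacr_get_domain(uint32_t dacr, unsigned n)
{
    return (dacr >> (2u * (n & 0xFu))) & 0x3u;
}
```

For example, a value of 0x55555555 places all sixteen domains in Client mode.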


Accessing the DACR: 


To access the DACR: 


MRC p15,0,<Rt>,c3,c0,0 ; Read DACR into Rt
MCR p15,0,<Rt>,c3,c0,0 ; Write Rt to DACR

Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0011  0000  000

G6.2.33 DCCIMVAC, Data Cache line Clean and Invalidate by VA to PoC
The DCCIMVAC characteristics are: 
Purpose 
Clean and Invalidate data or unified cache line by virtual address to PoC. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TPC==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TPC==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
AArch32 System instruction DCCIMVAC performs the same function as AArch64 System 
instruction DC CIVAC. 
Attributes 
DCCIMVAC is a 32-bit System instruction. 
Field descriptions 
The DCCIMVAC input value bit assignments are: 
31 0 
Virtual address to use 
Bits [31:0] 
Virtual address to use. 
Executing the DCCIMVAC instruction: 
The DCCIMVAC instruction is executed as:
MCR p15,0,<Rt>,c7,c14,1 ; DCCIMVAC operation


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  1110  001
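
A by-VA operation such as DCCIMVAC acts on one cache line per execution, so software that maintains a buffer typically walks it one line at a time. The C sketch below shows one way to structure that loop under stated assumptions: `cache_op_range` and `dccimvac_stub` are hypothetical names, the line size is passed in by the caller (for example, derived from CTR.DminLine), it must be a power of two, and `start + size` must not wrap. The stub only records the address so that the loop can be exercised on a host. On an AArch32 target it would be replaced by the MCR shown above.

```c
#include <stdint.h>

/* Type of a per-line maintenance operation taking a virtual address. */
typedef void (*cache_line_op)(uint32_t va);

/* Stand-in for the real operation. On an AArch32 target this could be:
 *   __asm__ volatile("mcr p15, 0, %0, c7, c14, 1" :: "r"(va) : "memory");
 */
static uint32_t stub_last_va;
static void dccimvac_stub(uint32_t va)
{
    stub_last_va = va;
}

/* Apply op to every cache line that overlaps [start, start + size).
 * Returns the number of lines visited. */
static inline unsigned cache_op_range(uint32_t start, uint32_t size,
                                      uint32_t line_bytes, cache_line_op op)
{
    unsigned count = 0;
    if (size == 0 || line_bytes == 0)
        return 0;

    uint32_t va   = start & ~(line_bytes - 1u);              /* first line */
    uint32_t last = (start + size - 1u) & ~(line_bytes - 1u); /* last line  */
    for (;;) {
        op(va);
        count++;
        if (va == last)
            break;
        va += line_bytes;
    }
    return count;
}
```

Comparing against the last line address, rather than using `va < end`, avoids an overflow when the range ends at the top of the 32-bit address space.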
G6.2.34 DCCISW, Data Cache line Clean and Invalidate by Set/Way
The DCCISW characteristics are: 
Purpose 
Clean and Invalidate data or unified cache line by set/way. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
If this instruction is executed with a set, way or level argument that is larger than the value supported 
by the implementation then the behavior is CONSTRAINED UNPREDICTABLE and one of the following 
occurs: 
• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
  — No cache lines.
  — A single arbitrary cache line.
  — Multiple arbitrary cache lines.
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TSW==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TSW==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
AArch32 System instruction DCCISW performs the same function as AArch64 System instruction 
DC CISW. 
Attributes 
DCCISW is a 32-bit System instruction. 
Field descriptions 
The DCCISW input value bit assignments are:

31                          4 3     1  0
|          SetWay           | Level | RES0 |

SetWay, bits [31:4]
Contains two fields:
• Way, bits[31:32-A], the number of the way to operate on.
• Set, bits[B-1:L], the number of the set to operate on.
Bits[L-1:4] are RES0.
A = Log2(ASSOCIATIVITY), L = Log2(LINELEN), B = (L + S), S = Log2(NSETS).


ASSOCIATIVITY, LINELEN (line length, in bytes), and NSETS (number of sets) have their usual 
meanings and are the values for the cache level being operated on. The values of A and S are 
rounded up to the next integer. 


Level, bits [3:1] 


Cache level to operate on, minus 1. For example, this field is 0 for operations on L1 cache, or 1 for 
operations on L2 cache. 


Bit [0] 


Reserved, RES0.
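
The SetWay and Level layout can be illustrated with a short C sketch that composes the input value for a set/way operation. The names `ceil_log2` and `setway_operand` are hypothetical, the cache geometry is assumed to be supplied by the caller, and `level` is taken as a 1-based cache level. The sketch does not check that the set, way, or level is within the range supported by the implementation.

```c
#include <stdint.h>

/* Log2 of x, rounded up to the next integer (x <= 2^31 assumed). */
static inline unsigned ceil_log2(uint32_t x)
{
    unsigned n = 0;
    while (n < 31 && (1u << n) < x)
        n++;
    return n;
}

/* Hypothetical helper: build the input value for DCCISW and the other
 * set/way operations.
 * level         - cache level, 1-based (the Level field holds level - 1)
 * way, set      - way and set numbers to operate on
 * associativity - number of ways of the cache level being operated on
 * linelen_bytes - line length in bytes of that cache level */
static inline uint32_t setway_operand(unsigned level, uint32_t way,
                                      uint32_t set, uint32_t associativity,
                                      uint32_t linelen_bytes)
{
    unsigned a = ceil_log2(associativity); /* A = Log2(ASSOCIATIVITY) */
    unsigned l = ceil_log2(linelen_bytes); /* L = Log2(LINELEN)       */
    uint32_t value = 0;

    if (a != 0)                            /* A == 0: no Way bits     */
        value |= way << (32u - a);         /* Way, bits [31:32-A]     */
    value |= set << l;                     /* Set, bits [B-1:L]       */
    value |= ((uint32_t)(level - 1u) & 0x7u) << 1; /* Level, [3:1]    */
    return value;                          /* bit [0] is RES0         */
}
```

For a 4-way cache with 64-byte lines, way 3 and set 5 at Level 1 give A = 2 and L = 6, so the value is 0xC0000140.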


Executing the DCCISW instruction: 
The DCCISW instruction is executed as: 
MCR p15,0,<Rt>,c7,c14,2 ; DCCISW operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  1110  010

G6.2.35 DCCMVAC, Data Cache line Clean by VA to PoC
The DCCMVAC characteristics are: 
Purpose 
Clean data or unified cache line by virtual address to PoC. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TPC==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TPC==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
AArch32 System instruction DCCMVAC performs the same function as AArch64 System 
instruction DC CVAC. 
Attributes 
DCCMVAC is a 32-bit System instruction. 
Field descriptions 
The DCCMVAC input value bit assignments are: 
31 0 
Virtual address to use 
Bits [31:0] 
Virtual address to use. 
Executing the DCCMVAC instruction: 
The DCCMVAC instruction is executed as:
MCR p15,0,<Rt>,c7,c10,1 ; DCCMVAC operation


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  1010  001

G6.2.36 DCCMVAU, Data Cache line Clean by VA to PoU
The DCCMVAU characteristics are: 
Purpose 
Clean data or unified cache line by virtual address to PoU. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TPU==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
AArch32 System instruction DCCMVAU performs the same function as AArch64 System 
instruction DC CVAU. 
Attributes 
DCCMVAU is a 32-bit System instruction. 
Field descriptions 
The DCCMVAU input value bit assignments are: 
31 0 
Virtual address to use 
Bits [31:0] 
Virtual address to use. 
Executing the DCCMVAU instruction: 
The DCCMVAU instruction is executed as:
MCR p15,0,<Rt>,c7,c11,1 ; DCCMVAU operation


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  1011  001

G6.2.37 DCCSW, Data Cache line Clean by Set/Way
The DCCSW characteristics are: 
Purpose 
Clean data or unified cache line by set/way. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
If this instruction is executed with a set, way or level argument that is larger than the value supported 
by the implementation then the behavior is CONSTRAINED UNPREDICTABLE and one of the following 
occurs: 
• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
  — No cache lines.
  — A single arbitrary cache line.
  — Multiple arbitrary cache lines.
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TSW==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TSW==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
AArch32 System instruction DCCSW performs the same function as AArch64 System instruction 
DC CSW. 
Attributes 
DCCSW is a 32-bit System instruction. 
Field descriptions 
The DCCSW input value bit assignments are:

31                          4 3     1  0
|          SetWay           | Level | RES0 |

SetWay, bits [31:4]
Contains two fields:
• Way, bits[31:32-A], the number of the way to operate on.
• Set, bits[B-1:L], the number of the set to operate on.
Bits[L-1:4] are RES0.
A = Log2(ASSOCIATIVITY), L = Log2(LINELEN), B = (L + S), S = Log2(NSETS).


ASSOCIATIVITY, LINELEN (line length, in bytes), and NSETS (number of sets) have their usual 
meanings and are the values for the cache level being operated on. The values of A and S are 
rounded up to the next integer. 


Level, bits [3:1] 


Cache level to operate on, minus 1. For example, this field is 0 for operations on L1 cache, or 1 for 
operations on L2 cache. 


Bit [0] 


Reserved, RES0.


Executing the DCCSW instruction: 
The DCCSW instruction is executed as: 
MCR p15,0,<Rt>,c7,c10,2 ; DCCSW operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  1010  010

G6.2.38 DCIMVAC, Data Cache line Invalidate by VA to PoC
The DCIMVAC characteristics are: 


Purpose 


Invalidate data or unified cache line by virtual address to PoC. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





It is IMPLEMENTATION DEFINED whether, when this instruction is executed, it can generate a 
watchpoint. If this instruction can generate a watchpoint this is prioritized in the same way as other 
watchpoints. 


At EL1, this instruction must be performed as DCCIMVAC if all of the following apply:
• EL2 is implemented and either:
  — EL2 is using AArch64 and the value of HCR_EL2.VM is 1.
  — EL2 is using AArch32 and the value of HCR.VM is 1.
• Execution is in Non-secure state, or EL3 is not implemented.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TPC==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TPC==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


AArch32 System instruction DCIMVAC performs the same function as AArch64 System 
instruction DC IVAC. 


Attributes 
DCIMVAC is a 32-bit System instruction. 


Field descriptions 


The DCIMVAC input value bit assignments are:
31 0
Virtual address to use 
Bits [31:0] 


Virtual address to use. 


Executing the DCIMVAC instruction: 
The DCIMVAC instruction is executed as: 
MCR p15,0,<Rt>,c7,c6,1 ; DCIMVAC operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  0110  001





It is IMPLEMENTATION DEFINED whether, when this instruction is executed, it can generate a watchpoint. If this 
instruction can generate a watchpoint this is prioritized in the same way as other watchpoints. 
G6.2.39 DCISW, Data Cache line Invalidate by Set/Way
The DCISW characteristics are: 


Purpose 


Invalidate data or unified cache line by set/way. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





At EL1, this operation must be performed as DCCISW if all of the following apply:
• EL2 is implemented and either:
  — EL2 is using AArch64 and the value of HCR_EL2.{SWIO, VM} is not {0, 0}.
  — EL2 is using AArch32 and the value of HCR.{SWIO, VM} is not {0, 0}.
• Execution is in Non-secure state, or EL3 is not implemented.
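
The condition above can be written as a simple predicate. The C sketch below is a model for illustration only: the function name `dcisw_performed_as_dccisw` is hypothetical, the SWIO and VM values are assumed to have been read from HCR_EL2 or HCR according to the Execution state that EL2 is using, and the result applies only to execution at EL1.

```c
#include <stdbool.h>

/* Returns true if a DCISW executed at EL1 must be performed as DCCISW.
 * swio, vm - values of HCR_EL2.{SWIO, VM} or HCR.{SWIO, VM} */
static inline bool dcisw_performed_as_dccisw(bool el2_implemented,
                                             bool swio, bool vm,
                                             bool nonsecure,
                                             bool el3_implemented)
{
    /* EL2 is implemented and {SWIO, VM} is not {0, 0}. */
    bool el2_condition = el2_implemented && (swio || vm);
    /* Execution is in Non-secure state, or EL3 is not implemented. */
    bool state_condition = nonsecure || !el3_implemented;
    return el2_condition && state_condition;
}
```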


If this instruction is executed with a set, way or level argument that is larger than the value supported
by the implementation then the behavior is CONSTRAINED UNPREDICTABLE and one of the following
occurs:
• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
  — No cache lines.
  — A single arbitrary cache line.
  — Multiple arbitrary cache lines.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TSW==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TSW==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2.





Configurations 
AArch32 System instruction DCISW performs the same function as AArch64 System instruction 
DC ISW. 
Attributes 
DCISW is a 32-bit System instruction. 


Field descriptions 


The DCISW input value bit assignments are: 


31                          4 3     1  0
|          SetWay           | Level | RES0 |

SetWay, bits [31:4]
Contains two fields:
• Way, bits[31:32-A], the number of the way to operate on.
• Set, bits[B-1:L], the number of the set to operate on.
Bits[L-1:4] are RES0.
A = Log2(ASSOCIATIVITY), L = Log2(LINELEN), B = (L + S), S = Log2(NSETS).


ASSOCIATIVITY, LINELEN (line length, in bytes), and NSETS (number of sets) have their usual
meanings and are the values for the cache level being operated on. The values of A and S are
rounded up to the next integer.

Level, bits [3:1]
Cache level to operate on, minus 1. For example, this field is 0 for operations on L1 cache, or 1 for
operations on L2 cache.

Bit [0]
Reserved, RES0.


Executing the DCISW instruction: 
The DCISW instruction is executed as: 
MCR p15,0,<Rt>,c7,c6,2 ; DCISW operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0111  0110  010

G6.2.40 DFAR, Data Fault Address Register
The DFAR characteristics are: 


Purpose 


Holds the virtual address of the faulting address that caused a synchronous Data Abort exception. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


DFAR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





DFAR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


DFAR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T6==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T6==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register DFAR(NS) is architecturally mapped to AArch64 System register 
FAR_EL1[31:0]. 


AArch32 System register DFAR(S) is architecturally mapped to AArch32 System register HDFAR
when EL2 is implemented.

AArch32 System register DFAR(S) is architecturally mapped to AArch64 System register
FAR_EL2[31:0] when EL2 is implemented.

RW fields in this register reset to architecturally UNKNOWN values.

Attributes


DFAR is a 32-bit register. 


Field descriptions 


The DFAR bit assignments are: 


31 0 
VA of faulting address of synchronous Data Abort exception 
Bits [31:0] 


VA of faulting address of synchronous Data Abort exception. 


Accessing the DFAR: 
To access the DFAR: 


MRC p15,0,<Rt>,c6,c0,0 ; Read DFAR into Rt
MCR p15,0,<Rt>,c6,c0,0 ; Write Rt to DFAR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0110  0000  000

G6.2.41 DFSR, Data Fault Status Register
The DFSR characteristics are: 


Purpose 


Holds status information about the last data fault. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


DFSR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





DFSR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


DFSR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register DFSR is architecturally mapped to AArch64 System register ESR_EL1. 
The current translation table format determines which format of the register is used. 


RW fields in this register reset to architecturally UNKNOWN values.

Attributes
DFSR is a 32-bit register.


Field descriptions 


The DFSR bit assignments are: 


When TTBCR.EAE==0: 


31          17 16    15 14  13   12    11    10      9      8    7      4 3       0
|    RES0    | FnV | RES0 | CM | ExT | WnR | FS[4] | LPAE | RES0 | Domain | FS[3:0] |
Bits [31:17]
Reserved, RES0.
FnV, bit [16]
FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a
translation table walk.
0 DFAR is valid.
1 DFAR is not valid, and holds an UNKNOWN value.
This field is only valid for a Synchronous external abort other than a Synchronous external abort on
a translation table walk. It is RES0 for all other Data Abort exceptions.
Bits [15:14]
Reserved, RES0.
CM, bit [13]
Cache maintenance fault. For synchronous faults, this bit indicates whether a cache maintenance
instruction generated the fault. The possible values of this bit are:
0 Abort not caused by execution of a cache maintenance instruction.
1 Abort caused by execution of a cache maintenance instruction.
On a synchronous Data Abort on a translation table walk, this bit is UNKNOWN.
On an asynchronous fault, this bit is UNKNOWN.
ExT, bit [12]
External abort type. This bit can be used to provide an IMPLEMENTATION DEFINED classification of
external aborts.
In an implementation that does not provide any classification of external aborts, this bit is RES0.
For aborts other than external aborts this bit always returns 0.
WnR, bit [11]

Write not Read bit. Indicates whether the abort was caused by a write or a read instruction. The
possible values of this bit are:

0    Abort caused by a read instruction.
1    Abort caused by a write instruction.


For faults on the cache maintenance and address translation System instructions in the 
(coproc==1111) encoding space this bit always returns a value of 1. 


FS[4], bit [10] 


See FS[3:0], bits [3:0] for description of the FS field. 


LPAE, bit [9] 


On taking a Data Abort exception, this bit is set as follows: 


0    Using the Short-descriptor translation table formats.
1    Using the Long-descriptor translation table formats.


Hardware does not interpret this bit to determine the behavior of the memory system, and therefore 
software can set this bit to 0 or 1 without affecting operation. 


Bit [8] 


Reserved, RES0.


Domain, bits [7:4] 


The domain of the fault address. 
ARM deprecates any use of this field, see The Domain field in the DFSR on page G4-4129. 


This field is UNKNOWN for certain faults where the DFSR is updated and reported using the 
Short-descriptor FSR encodings, see Table G4-28 on page G4-4133. 


FS[3:0], bits [3:0] 


Fault status bits. Interpreted with bit [10]. Possible values of FS[4:0] are: 








00001    Alignment fault
00010    Debug exception
00011    Access flag fault, level 1
00100    Fault on instruction cache maintenance
00101    Translation fault, level 1
00110    Access flag fault, level 2
00111    Translation fault, level 2
01000    Synchronous external abort, not on translation table walk
01001    Domain fault, level 1
01011    Domain fault, level 2
01100    Synchronous external abort, on translation table walk, level 1
01101    Permission fault, level 1
01110    Synchronous external abort, on translation table walk, level 2
01111    Permission fault, level 2
10000    TLB conflict abort
10100    IMPLEMENTATION DEFINED fault (Lockdown fault)
10101    IMPLEMENTATION DEFINED fault (Unsupported Exclusive access fault)
10110    SError interrupt
11000    SError interrupt, from a parity or ECC error on memory access
11001    Synchronous parity or ECC error on memory access, not on translation table walk
11100    Synchronous parity or ECC error on translation table walk, level 1
11110    Synchronous parity or ECC error on translation table walk, level 2


All other values are reserved. 


For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on a Short-descriptor translation table lookup on page G4-4129. 
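
The Short-descriptor field layout described above can be illustrated with a small decoder. This is a minimal sketch, not part of the architecture: the names SHORT_FS and decode_short_dfsr are hypothetical, and the input is assumed to be a 32-bit DFSR value already read with TTBCR.EAE==0. The encodings are copied from the FS[4:0] table.

```python
# Hypothetical helper: decode a DFSR value in the Short-descriptor format.
# The dictionary is copied from the FS[4:0] table; unlisted values are reserved.
SHORT_FS = {
    0b00001: "Alignment fault",
    0b00010: "Debug exception",
    0b00011: "Access flag fault, level 1",
    0b00100: "Fault on instruction cache maintenance",
    0b00101: "Translation fault, level 1",
    0b00110: "Access flag fault, level 2",
    0b00111: "Translation fault, level 2",
    0b01000: "Synchronous external abort, not on translation table walk",
    0b01001: "Domain fault, level 1",
    0b01011: "Domain fault, level 2",
    0b01100: "Synchronous external abort, on translation table walk, level 1",
    0b01101: "Permission fault, level 1",
    0b01110: "Synchronous external abort, on translation table walk, level 2",
    0b01111: "Permission fault, level 2",
    0b10000: "TLB conflict abort",
    0b10100: "IMPLEMENTATION DEFINED fault (Lockdown fault)",
    0b10101: "IMPLEMENTATION DEFINED fault (Unsupported Exclusive access fault)",
    0b10110: "SError interrupt",
    0b11000: "SError interrupt, from a parity or ECC error on memory access",
    0b11001: "Synchronous parity or ECC error on memory access, not on translation table walk",
    0b11100: "Synchronous parity or ECC error on translation table walk, level 1",
    0b11110: "Synchronous parity or ECC error on translation table walk, level 2",
}


def decode_short_dfsr(dfsr: int) -> dict:
    """Split a Short-descriptor DFSR value into its named fields."""
    # FS[4] lives in bit [10]; FS[3:0] lives in bits [3:0].
    fs = (((dfsr >> 10) & 1) << 4) | (dfsr & 0xF)
    return {
        "fs": fs,
        "fault": SHORT_FS.get(fs, "reserved"),
        "domain": (dfsr >> 4) & 0xF,  # deprecated, UNKNOWN for some faults
        "lpae": (dfsr >> 9) & 1,
        "wnr": (dfsr >> 11) & 1,
        "ext": (dfsr >> 12) & 1,
    }
```

Because FS[4] is not adjacent to FS[3:0], a decoder that masks only the low bits would confuse, for example, 00110 (Access flag fault, level 2) with 10110 (SError interrupt).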


When TTBCR.EAE==1: 


[31:17] RES0, [16] FnV, [15:14] RES0, [13] CM, [12] ExT, [11] WnR, [10] RES0, [9] LPAE, [8:6] RES0, [5:0] STATUS 


Bits [31:17] 


Reserved, RES0. 


FnV, bit [16] 


FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a 
translation table walk. 


0 DFAR is valid. 
1 DFAR is not valid, and holds an UNKNOWN value. 


This field is only valid for a Synchronous external abort other than a Synchronous external abort on 
a translation table walk. It is RES0 for all other Data Abort exceptions. 


Bits [15:14] 


Reserved, RES0. 


CM, bit [13] 


Cache maintenance fault. For synchronous faults, this bit indicates whether a cache maintenance 
instruction generated the fault. The possible values of this bit are: 


0 Abort not caused by execution of a cache maintenance instruction. 
1 Abort caused by execution of a cache maintenance instruction. 
On a synchronous Data Abort on a translation table walk, this bit is UNKNOWN. 


On an asynchronous fault, this bit is UNKNOWN. 


ExT, bit [12] 
External abort type. This bit can be used to provide an IMPLEMENTATION DEFINED classification of 
external aborts. 
In an implementation that does not provide any classification of external aborts, this bit is RES0. 
For aborts other than external aborts this bit always returns 0. 
WnR, bit [11] 


Write not Read bit. Indicates whether the abort was caused by a write or a read instruction. The 
possible values of this bit are: 


0 Abort caused by a read instruction. 
1 Abort caused by a write instruction. 
For faults on the cache maintenance and address translation System instructions in the 
(coproc==1111) encoding space this bit always returns a value of 1. 
Bit [10] 


Reserved, RES0. 


LPAE, bit [9] 
On taking a Data Abort exception, this bit is set as follows: 
0 Using the Short-descriptor translation table formats. 
1 Using the Long-descriptor translation table formats. 
Hardware does not interpret this bit to determine the behavior of the memory system, and therefore 
software can set this bit to 0 or 1 without affecting operation. 
Bits [8:6] 


Reserved, RES0. 


STATUS, bits [5:0] 
Fault status bits. Possible values of this field are: 
000000 Address size fault in TTBR0 or TTBR1 
000001 Address size fault, level 1 
000010 Address size fault, level 2 
000011 Address size fault, level 3 
000101 Translation fault, level 1 
000110 Translation fault, level 2 
000111 Translation fault, level 3 
001001 Access flag fault, level 1 
001010 Access flag fault, level 2 
001011 Access flag fault, level 3 
001101 Permission fault, level 1 
001110 Permission fault, level 2 
001111 Permission fault, level 3 
010000 Synchronous external abort, not on translation table walk 
010001 SError interrupt 
010101 Synchronous external abort, on translation table walk, level 1 
010110 Synchronous external abort, on translation table walk, level 2 
010111 Synchronous external abort, on translation table walk, level 3 
011000 Synchronous parity or ECC error on memory access, not on translation table walk 
011001 SError interrupt, from a parity or ECC error on memory access 
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1 
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2 
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3 
100001 Alignment fault 
100010 Debug exception 
110000 TLB conflict abort 
110100 IMPLEMENTATION DEFINED fault (Lockdown fault) 
110101 IMPLEMENTATION DEFINED fault (Unsupported Exclusive access fault) 


All other values are reserved. 


For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on a Long-descriptor translation table lookup on page G4-4131. 
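
For the MMU fault classes in the STATUS table, the upper four bits select the fault class and the lower two bits give the lookup level. The following is a minimal sketch of that observation, assuming a 32-bit DFSR value as input. The names uses_long_descriptor and decode_long_status are hypothetical, and the sketch deliberately covers only the four MMU fault classes, not the external abort, parity, alignment, or debug encodings.

```python
# Hypothetical helpers: pick the DFSR format and decode the MMU fault classes
# of the Long-descriptor STATUS field. Only STATUS[5:2] values 0000 to 0011
# are handled here; other encodings are reported as not covered.
MMU_CLASSES = {
    0b0000: "Address size fault",
    0b0001: "Translation fault",
    0b0010: "Access flag fault",
    0b0011: "Permission fault",
}


def uses_long_descriptor(dfsr: int) -> bool:
    """LPAE, bit [9], reports which format the DFSR value uses."""
    return bool((dfsr >> 9) & 1)


def decode_long_status(dfsr: int):
    """Return (fault class, lookup level or None) for a Long-descriptor DFSR."""
    status = dfsr & 0x3F
    fault_class, level = status >> 2, status & 0b11
    if fault_class not in MMU_CLASSES:
        return ("not an MMU fault class covered by this sketch", None)
    if level == 0:
        # Only the address size class defines a value with STATUS[1:0] == 00.
        if fault_class == 0b0000:
            return ("Address size fault in TTBR0 or TTBR1", None)
        return ("reserved", None)
    return (MMU_CLASSES[fault_class], level)
```

A handler would typically test uses_long_descriptor first, because the same bit positions carry FS[3:0] and the Domain field in the Short-descriptor format.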


Accessing the DFSR: 


To access the DFSR: 


MRC p15,0,<Rt>,c5,c0,0 ; Read DFSR into Rt 
MCR p15,0,<Rt>,c5,c0,0 ; Write Rt to DFSR 


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2 
1111 000 0101 0000 000 
G6.2.42 DTLBIALL, Data TLB Invalidate All 
The DTLBIALL characteristics are: 
Purpose 
Invalidate all data TLB entries for the PL1&0 translation regime, subject to the Security state and 
Privilege level at which the instruction is executed. 
If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 
For details of the scope of this instruction see TLBIALL. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - WO WO WO WO 
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0 EL1 EL2(NS) 
- WO WO 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
Configurations 
There are no configuration notes. 
Attributes 
DTLBIALL is a 32-bit System instruction. 
Field descriptions 
DTLBIALL ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 
Executing the DTLBIALL instruction: 
The DTLBIALL instruction is executed as: 
MCR p15,0,<Rt>,c8,c6,0 ; DTLBIALL operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows: 


coproc opc1 CRn CRm opc2 
1111 000 1000 0110 000 
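
The five fields in the encoding table can be packed into a 32-bit A32 instruction word. The sketch below assumes the standard A32 MCR/MRC layout (cond in bits [31:28], opc1 in [23:21], the read/write bit in [20], CRn in [19:16], Rt in [15:12], coproc in [11:8], opc2 in [7:5], CRm in [3:0]), which is defined in the instruction descriptions rather than in this section. The function name encode_cp_access is hypothetical.

```python
# Hypothetical helper: build the A32 MCR (write) or MRC (read) instruction word
# from the coproc/opc1/CRn/CRm/opc2 values given in an encoding table.
def encode_cp_access(coproc, opc1, crn, crm, opc2, rt, read=False, cond=0b1110):
    fields = (
        ("cond", cond, 4), ("opc1", opc1, 3), ("CRn", crn, 4),
        ("Rt", rt, 4), ("coproc", coproc, 4), ("opc2", opc2, 3), ("CRm", crm, 4),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (
        (cond << 28)
        | (0b1110 << 24)        # fixed bits [27:24]
        | (opc1 << 21)
        | (int(read) << 20)     # 1 selects MRC, 0 selects MCR
        | (crn << 16)
        | (rt << 12)
        | (coproc << 8)
        | (opc2 << 5)
        | (1 << 4)              # fixed bit [4]
        | crm
    )


# DTLBIALL with Rt = R0 and the always (AL) condition.
dtlbiall_word = encode_cp_access(0b1111, 0b000, 0b1000, 0b0110, 0b000, rt=0)
```

With these inputs the word is 0xEE080F16. Changing only opc2 to 010 or 001 gives the corresponding by-ASID and by-VA data TLB invalidate encodings.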
G6.2.43 DTLBIASID, Data TLB Invalidate by ASID match 


The DTLBIASID characteristics are: 


Purpose 


Invalidate data TLB entries for stage 1 of the PL1&0 translation regime that match the given ASID, 
subject to the Security state at which the instruction is executed. 


If this operation is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIASID. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - WO WO WO WO 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0 EL1 EL2(NS) 
- WO WO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


There are no configuration notes. 


Attributes 
DTLBIASID is a 32-bit System instruction. 


Field descriptions 


The DTLBIASID input value bit assignments are: 


[31:8] RES0, [7:0] ASID 


Bits [31:8] 


Reserved, RES0. 


ASID, bits [7:0] 
ASID value to match. Any TLB entries for non-global pages that match the ASID value will be 
affected by this operation. 


Executing the DTLBIASID instruction: 
The DTLBIASID instruction is executed as: 
MCR p15,0,<Rt>,c8,c6,2 ; DTLBIASID operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc opc1 CRn CRm opc2 
1111 000 1000 0110 010 
G6.2.44 DTLBIMVA, Data TLB Invalidate by VA 


The DTLBIMVA characteristics are: 


Purpose 


Invalidate data TLB entries for stage 1 of the PL1&0 translation regime that match the given VA 
and ASID, subject to the Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVA. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - WO WO WO WO 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0 EL1 EL2(NS) 
- WO WO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


There are no configuration notes. 


Attributes 
DTLBIMVA is a 32-bit System instruction. 


Field descriptions 


The DTLBIMVA input value bit assignments are: 


[31:12] VA, [11:8] RES0, [7:0] ASID 


VA, bits [31:12] 


Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected 
by this operation. 


Bits [11:8] 
Reserved, RES0. 


ASID, bits [7:0] 


ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by 
this operation. 


Global TLB entries that match the VA value will be affected by this operation, regardless of the 
value of the ASID field. 
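
The input value written through Rt combines the VA and ASID fields described above. A minimal sketch of composing it follows, assuming a 32-bit virtual address and an 8-bit ASID. The function name dtlbimva_operand is hypothetical.

```python
# Hypothetical helper: build the 32-bit input value for a by-VA TLB invalidate.
# VA occupies bits [31:12], bits [11:8] are RES0, and ASID occupies bits [7:0].
def dtlbimva_operand(va: int, asid: int) -> int:
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError("va must be a 32-bit address")
    if not 0 <= asid <= 0xFF:
        raise ValueError("asid must fit in 8 bits")
    # Dropping the low 12 bits of the address also leaves bits [11:8] as zero.
    return (va & 0xFFFFF000) | asid
```

For example, dtlbimva_operand(0x12345678, 0x42) yields 0x12345042. The page offset in the low 12 bits of the address is discarded because only bits [31:12] take part in the match.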

Executing the DTLBIMVA instruction: 

The DTLBIMVA instruction is executed as: 

MCR p15,0,<Rt>,c8,c6,1 ; DTLBIMVA operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc opc1 CRn CRm opc2 
1111 000 1000 0110 001 
G6.2.45 ELR_hyp, Exception Link Register (Hyp mode) 


The ELR_hyp characteristics are: 


Purpose 
When taking an exception to Hyp mode, holds the address to return to. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 





- - RW RW RW 





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 


as follows: 





EL0 EL1 EL2(NS) 





- - RW 





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
AArch32 System register ELR_hyp is architecturally mapped to AArch64 System register 
ELR_EL2. 
On a reset into an Exception level that is using AArch32, ELR_hyp is UNKNOWN. 


Attributes 
ELR_hyp is a 32-bit register. 


Field descriptions 


The ELR_hyp bit assignments are: 


[31:0] Return address 


Bits [31:0] 


Return address. 


Accessing the ELR_hyp: 
To access the ELR_hyp: 


MRS <Rd>, ELR_hyp ; Read ELR_hyp into Rd 
MSR ELR_hyp, <Rd> ; Write Rd to ELR_hyp 
Register access is encoded as follows: 


M M1 R 
1 1110 0 
G6.2.46 FCSEIDR, FCSE Process ID register 
The FCSEIDR characteristics are: 


Purpose 
Identifies whether the Fast Context Switch Extension (FCSE) is implemented. 


In ARMv8, the FCSE is not implemented, so this register is RAZ/WI. Software can access this 
register to determine that the implementation does not include the FCSE. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - RW RW RW RW 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0 EL1 EL2(NS) 
- RW RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T13==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T13==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 
FCSEIDR is a 32-bit register. 


Field descriptions 


The FCSEIDR bit assignments are: 


[31:0] RAZ/WI 


Bits [31:0] 


Reserved, RAZ/WI. Hardware must implement this as RAZ/WI. Software must not rely on this 
property as the behavior of reserved values might change in a future revision of the architecture. 


Accessing the FCSEIDR: 
To access the FCSEIDR: 
MRC p15,0,<Rt>,c13,c0,0 ; Read FCSEIDR into Rt 
MCR p15,0,<Rt>,c13,c0,0 ; Write Rt to FCSEIDR 


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2 
1111 000 1101 0000 000 
G6.2.47 FPEXC, Floating-Point Exception Control register 
The FPEXC characteristics are: 


Purpose 


Provides a global enable for the implemented Advanced SIMD and floating-point functionality, and 
reports floating-point status information. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
- - Config-RW Config-RW RW RW 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0 EL1 EL2(NS) 
- Config-RW Config-RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CPACR.cp10==00, accesses to this register from PL1 are UNDEFINED. 
• If NSACR.cp10==0, Non-secure accesses to this register from EL1 and EL2 are UNDEFINED. 
• If CPTR_EL2.TFP==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
• If CPTR_EL3.TFP==1, accesses to this register from EL1 and EL2 are trapped to EL3. 
• If HCPTR.TCP10==1, Non-secure accesses to this register from EL1 are trapped to Hyp 
mode. 
• If HCPTR.TCP10==1, Non-secure accesses to this register from EL2 are UNDEFINED. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register FPEXC is architecturally mapped to AArch64 System register 
FPEXC32_EL2. 


Implemented only if the implementation includes the Advanced SIMD and floating-point 
functionality. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch32. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
FPEXC is a 32-bit register. 


Field descriptions 


The FPEXC bit assignments are: 
[31] EX, [30] EN, [29] DEX, [28] FP2V, [27] VV, [26] TFV, [25:11] RES0, [10:8] VECITR, [7] IDF, [6:5] RES0, [4] IXF, [3] UFF, [2] OFF, [1] DZF, [0] IOF 


EX, bit [31] 
Exception bit. In ARMv8, this bit is RAZ/WI. 
EN, bit [30] 
Enables access to the Advanced SIMD and floating-point functionality from all Exception levels, 
except that setting this field to 0 does not disable the following: 
• VMSR accesses to the FPEXC or FPSID. 
• VMRS accesses from the FPEXC, FPSID, MVFR0, MVFR1, or MVFR2. 
0 Accesses to the FPSCR, and any of the SIMD and floating-point registers Q0-Q15, 
including their views as D0-D31 registers or S0-S31 registers, are UNDEFINED at all 
Exception levels. 
1 This control permits access to the Advanced SIMD and floating-point functionality at 
all Exception levels. 
Execution of floating-point and Advanced SIMD instructions in AArch32 state can be disabled or 
trapped by the following controls: 
• CPACR.cp10, or, if executing at EL0, CPACR_EL1.FPEN. 
• FPEXC.EN. 
• If executing in Non-secure state: 
— HCPTR.TCP10, or if EL2 is using AArch64, CPTR_EL2.TFP. 
— NSACR.cp10, or if EL3 is using AArch64, CPTR_EL3.TFP. 
• For Advanced SIMD instructions only: 
— CPACR.ASEDIS. 
— If executing in Non-secure state, HCPTR.TASE and NSACR.NSASEDIS. 
See the descriptions of the controls for more information. 
Note 
When executing at EL0 using AArch32 with EL1 using AArch64, the PE behaves as if the value of 
the FPEXC.EN bit is 1. 
When this register has an architecturally-defined reset value, this field resets to 0. 
DEX, bit [29] 


Defined synchronous exception on floating-point execution. 


This field identifies whether a synchronous exception generated by the attempted execution of an 
instruction was generated by an unallocated encoding. The instruction must be in the encoding space 
that is identified by the pseudocode function ExecutingCP10or11Instr() returning TRUE. This field 
also indicates whether the FPEXC.TFV field is valid. 
The meaning of this bit is: 


0 The exception was generated by the attempted execution of an unallocated instruction 
in the encoding space that is identified by the pseudocode function 
ExecutingCP10or11Instr(). If FPEXC.TFV is RW then FPEXC.TFV is invalid and 
UNKNOWN. If FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF} are RW then they are invalid 
and UNKNOWN. 


1 The exception was generated during the execution of an allocated encoding. 
FPEXC.TFV is valid and indicates the cause of the exception. 


On an exception that sets this bit to 1 the exception-handling routine must clear this bit to 0. 


On an implementation that both does not support trapping of floating-point exceptions and 
implements the FPSCR.{Stride, Len} fields as RAZ, this bit is RES0. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


FP2V, bit [28] 
FPINST2 instruction valid bit. In ARMv8, this bit is RES0. 


VV, bit [27] 
VECITR valid bit. In ARMv8, this bit is RES0. 


TFV, bit [26] 


Trapped Fault Valid bit. Valid only when the value of FPEXC.DEX is 1. When valid, it indicates the 
cause of the exception and therefore whether the FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF} bits are 
valid. 


0 The exception was caused by the execution of a floating-point VABS, VADD, VDIV, 
VFMA, VFMS, VFNMA, VFNMS, VMLA, VMLS, VMOV, VMUL, VNEG, 
VNMLA, VNMLS, VNMUL, VSQRT, or VSUB instruction when one or both of 
FPSCR.{Stride, Len} was non-zero. If the FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF} 
bits are RW then they are invalid and UNKNOWN. 


1 FPEXC.{IDF, IXF, UFF, OFF, DZF, IOF} indicate the presence of trapped 
floating-point exceptions that had occurred at the time of the exception. Bits are set for 
all trapped exceptions that had occurred at the time of the exception. 


This bit returns a status value and ignores writes. 
When the value of FPEXC.DEX is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


On an implementation that supports the trapping of floating-point exceptions and implements 
FPSCR.{Stride, Len} as RAZ, this bit is RAO/WI. 


Bits [25:11] 
Reserved, RES0. 


VECITR, bits [10:8] 


Vector iteration count. In ARMv8, this field is RES1. 





IDF, bit [7] 
Input Denormal trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, 
it indicates whether an Input Denormal exception occurred while FPSCR.IDE was 1: 
0 Input denormal exception has not occurred. 
1 Input denormal exception has occurred. 
Input Denormal exceptions can occur only when FPSCR.FZ is 1. 
This bit must be cleared to 0 by the exception-handling routine. 
When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


Bits [6:5] 


Reserved, RES0. 


IXF, bit [4] 


Inexact trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, it 
indicates whether an Inexact exception occurred while FPSCR.IXE was 1: 


0 Inexact exception has not occurred. 

1 Inexact exception has occurred. 

This bit must be cleared to 0 by the exception-handling routine. 

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


UFF, bit [3] 


Underflow trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, it 
indicates whether an Underflow exception occurred while FPSCR.UFE was 1: 


0 Underflow exception has not occurred. 

1 Underflow exception has occurred. 

Underflow trapped exceptions can occur only when FPSCR.FZ is 0. 

This bit must be cleared to 0 by the exception-handling routine. 

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


OFF, bit [2] 


Overflow trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, it 
indicates whether an Overflow exception occurred while FPSCR.OFE was 1: 


0 Overflow exception has not occurred. 

1 Overflow exception has occurred. 

This bit must be cleared to 0 by the exception-handling routine. 

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


DZF, bit [1] 


Divide-by-zero trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, 
it indicates whether a Divide-by-zero exception occurred while FPSCR.DZE was 1: 


0 Divide-by-zero exception has not occurred. 

1 Divide-by-zero exception has occurred. 

This bit must be cleared to 0 by the exception-handling routine. 

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


IOF, bit [0] 


Invalid Operation trapped exception bit. Valid only when the value of FPEXC.TFV is 1. When valid, 
it indicates whether an Invalid Operation exception occurred while FPSCR.IOE was 1: 


0 Invalid Operation exception has not occurred. 

1 Invalid Operation exception has occurred. 

This bit must be cleared to 0 by the exception-handling routine. 

When the value of FPEXC.TFV is 0 and this bit is RW, this bit is invalid and UNKNOWN. 


On an implementation that does not support the trapping of floating-point exceptions this bit is 
RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 
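
The validity rules in the field descriptions form a chain: TFV is meaningful only when DEX is 1, and the trapped exception bits are meaningful only when TFV is also 1. The sketch below illustrates one way an exception handler might apply that chain to a 32-bit FPEXC value. The names TRAP_BITS and decode_fpexc are hypothetical, and the sketch assumes an implementation in which these fields are RW.

```python
# Hypothetical helper: decode FPEXC while respecting the DEX -> TFV -> flags
# validity chain. Fields that are not valid are reported as None.
TRAP_BITS = {"IDF": 7, "IXF": 4, "UFF": 3, "OFF": 2, "DZF": 1, "IOF": 0}


def decode_fpexc(fpexc: int) -> dict:
    en = (fpexc >> 30) & 1
    dex = (fpexc >> 29) & 1
    # TFV is valid only when DEX is 1.
    tfv = (fpexc >> 26) & 1 if dex else None
    # The trapped exception bits are valid only when TFV is valid and 1.
    trapped = None
    if tfv == 1:
        trapped = [name for name, bit in TRAP_BITS.items() if (fpexc >> bit) & 1]
    return {"EN": en, "DEX": dex, "TFV": tfv, "trapped": trapped}
```

Returning None rather than 0 for an invalid field keeps a caller from treating an UNKNOWN bit as a report that no exception occurred.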


Accessing the FPEXC: 
To access the FPEXC: 


VMRS <Rt>, FPEXC ; Read FPEXC into Rt 
VMSR FPEXC, <Rt> ; Write Rt to FPEXC 


Register access is encoded as follows: 


spec_reg 


1000 
G6.2.48 FPSCR, Floating-Point Status and Control Register 
The FPSCR characteristics are: 


Purpose 


Provides floating-point system status information and control. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS) EL0(S) EL1(NS) EL2(NS) EL3(SCR.NS=1) EL3(SCR.NS=0) 
Config-RW Config-RW Config-RW Config-RW RW RW 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0 EL1 EL2(NS) 
Config-RW Config-RW Config-RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CPACR.cp10==00, accesses to this register from PL0 and PL1 are UNDEFINED. 
• If CPACR.cp10==01, accesses to this register from PL0 are UNDEFINED. 
• If NSACR.cp10==0, Non-secure accesses to this register from EL0, EL1, and EL2 are 
UNDEFINED. 
• If CPACR_EL1.FPEN==00, accesses to this register from PL0 are trapped to EL1. 
• If CPACR_EL1.FPEN==01, accesses to this register from PL0 are trapped to EL1. 
• If CPACR_EL1.FPEN==10, accesses to this register from PL0 are trapped to EL1. 
• If CPTR_EL2.TFP==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 
• If CPTR_EL3.TFP==1, accesses to this register from EL0, EL1, and EL2 are trapped to EL3. 
• If HCPTR.TCP10==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 
• If HCPTR.TCP10==1, Non-secure accesses to this register from EL2 are UNDEFINED. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
The named fields in this register map to the equivalent fields in the AArch64 FPCR and FPSR. 


It is IMPLEMENTATION DEFINED whether the Len and Stride fields can be programmed to non-zero 
values, which will cause some AArch32 floating-point instruction encodings to be UNDEFINED, or 
whether these fields are RAZ. 


Implemented only if the implementation includes the Advanced SIMD and floating-point 
functionality. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
FPSCR is a 32-bit register. 
Field descriptions 


The FPSCR bit assignments are: 


[31] N, [30] Z, [29] C, [28] V, [27] QC, [26] AHP, [25] DN, [24] FZ, [23:22] RMode, [21:20] Stride, [19] RES0, [18:16] Len, [15] IDE, [14:13] RES0, [12] IXE, [11] UFE, [10] OFE, [9] DZE, [8] IOE, [7] IDC, [6:5] RES0, [4] IXC, [3] UFC, [2] OFC, [1] DZC, [0] IOC 


N, bit [31] 
Negative condition flag. This is updated by floating-point comparison operations. 
Z, bit [30] 
Zero condition flag. This is updated by floating-point comparison operations. 
C, bit [29] 
Carry condition flag. This is updated by floating-point comparison operations. 
V, bit [28] 
Overflow condition flag. This is updated by floating-point comparison operations. 
QC, bit [27] 
Cumulative saturation bit, Advanced SIMD only. This bit is set to 1 to indicate that an Advanced 
SIMD integer operation has saturated since 0 was last written to this bit. 
AHP, bit [26] 
Alternative half-precision control bit: 
0 IEEE half-precision format selected. 
1 Alternative half-precision format selected. 
DN, bit [25] 
Default NaN mode control bit: 
0 NaN operands propagate through to the output of a floating-point operation. 
1 Any operation involving one or more NaNs returns the Default NaN. 
The value of this bit only controls scalar floating-point arithmetic. Advanced SIMD arithmetic 
always uses the Default NaN setting, regardless of the value of the DN bit. 
FZ, bit [24] 
Flush-to-zero mode control bit: 


0 Flush-to-zero mode disabled. Behavior of the floating-point system is fully compliant 
with the IEEE 754 standard. 


1 Flush-to-zero mode enabled. 


The value of this bit only controls scalar floating-point arithmetic. Advanced SIMD arithmetic 
always uses the Flush-to-zero setting, regardless of the value of the FZ bit. 


RMode, bits [23:22] 
Rounding Mode control field. The encoding of this field is: 


00 Round to Nearest (RN) mode 

01 Round towards Plus Infinity (RP) mode 

10 Round towards Minus Infinity (RM) mode 
11 Round towards Zero (RZ) mode. 


The specified rounding mode is used by almost all scalar floating-point instructions. Advanced 
SIMD arithmetic always uses the Round to Nearest setting, regardless of the value of the RMode 
bits. 
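
The RMode encoding can be read and updated with a read-modify-write on bits [23:22]. This is a minimal sketch operating on a plain 32-bit integer that stands in for an FPSCR value. The names RMODE_NAMES, get_rmode, and with_rmode are hypothetical.

```python
# Hypothetical helpers: read and replace the RMode field, bits [23:22].
RMODE_NAMES = {0b00: "RN", 0b01: "RP", 0b10: "RM", 0b11: "RZ"}
RMODE_SHIFT = 22
RMODE_MASK = 0b11 << RMODE_SHIFT


def get_rmode(fpscr: int) -> str:
    """Return the rounding mode abbreviation selected by an FPSCR value."""
    return RMODE_NAMES[(fpscr & RMODE_MASK) >> RMODE_SHIFT]


def with_rmode(fpscr: int, mode: str) -> int:
    """Return fpscr with only the RMode field replaced by the named mode."""
    encoding = {name: bits for bits, name in RMODE_NAMES.items()}[mode]
    return (fpscr & ~RMODE_MASK & 0xFFFFFFFF) | (encoding << RMODE_SHIFT)
```

Clearing the field before inserting the new value leaves every other control and status bit as it was, which matters because the same register also holds the condition flags and cumulative exception bits.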

Stride, bits [21:20] 
It is IMPLEMENTATION DEFINED whether this field is RW or RAZ. 


If this field is RW and is set to a value other than zero, some floating-point instruction encodings 
are UNDEFINED. The instruction pseudocode identifies these instructions. 


ARM strongly recommends that software never sets this field to a value other than zero. 


The value of this field is ignored when processing Advanced SIMD instructions. 


Bit [19] 


Reserved, RES0. 


Len, bits [18:16] 
It is IMPLEMENTATION DEFINED whether this field is RW or RAZ. 


If this field is RW and is set to a value other than zero, some floating-point instruction encodings 
are UNDEFINED. The instruction pseudocode identifies these instructions. 


ARM strongly recommends that software never sets this field to a value other than zero. 


The value of this field is ignored when processing Advanced SIMD instructions. 


IDE, bit [15] 
Input Denormal exception trap enable. Possible values are: 


0 Untrapped exception handling selected. If the floating-point exception occurs then the 
IDC bit is set to 1. 


1 Trapped exception handling selected. If the floating-point exception occurs, the PE does 
not update the IDC bit. The trap handling software can decide whether to set the IDC 
bit to 1. 


This bit is RW only if the implementation supports the trapping of floating-point exceptions. In an 
implementation that does not support floating-point exception trapping, this bit is RES0.


When this bit is RW, it applies only to floating-point operations. Advanced SIMD operations always 
use untrapped floating-point exception handling in AArch32 state. 


Bits [14:13] 
Reserved, RES0.
IXE, bit [12]
Inexact exception trap enable. Possible values are: 


0 Untrapped exception handling selected. If the floating-point exception occurs then the 
IXC bit is set to 1. 


1 Trapped exception handling selected. If the floating-point exception occurs, the PE does 
not update the IXC bit. The trap handling software can decide whether to set the IXC
bit to 1. 


This bit is RW only if the implementation supports the trapping of floating-point exceptions. In an 
implementation that does not support floating-point exception trapping, this bit is RES0.


When this bit is RW, it applies only to floating-point operations. Advanced SIMD operations always 
use untrapped floating-point exception handling in AArch32 state. 

UFE, bit [11] 
Underflow exception trap enable. Possible values are: 


0 Untrapped exception handling selected. If the floating-point exception occurs then the 
UFC bit is set to 1. 


1 Trapped exception handling selected. If the floating-point exception occurs, the PE does 
not update the UFC bit. The trap handling software can decide whether to set the UFC 
bit to 1. 


This bit is RW only if the implementation supports the trapping of floating-point exceptions. In an 
implementation that does not support floating-point exception trapping, this bit is RES0.


When this bit is RW, it applies only to floating-point operations. Advanced SIMD operations always 
use untrapped floating-point exception handling in AArch32 state. 

OFE, bit [10] 
Overflow exception trap enable. Possible values are: 


0 Untrapped exception handling selected. If the floating-point exception occurs then the 
OFC bit is set to 1. 


1 Trapped exception handling selected. If the floating-point exception occurs, the PE does 
not update the OFC bit. The trap handling software can decide whether to set the OFC 
bit to 1. 


This bit is RW only if the implementation supports the trapping of floating-point exceptions. In an 
implementation that does not support floating-point exception trapping, this bit is RES0.


When this bit is RW, it applies only to floating-point operations. Advanced SIMD operations always 
use untrapped floating-point exception handling in AArch32 state. 

DZE, bit [9] 
Division by Zero exception trap enable. Possible values are: 


0 Untrapped exception handling selected. If the floating-point exception occurs then the
DZC bit is set to 1. 


1 Trapped exception handling selected. If the floating-point exception occurs, the PE does 
not update the DZC bit. The trap handling software can decide whether to set the DZC 
bit to 1. 


This bit is RW only if the implementation supports the trapping of floating-point exceptions. In an 
implementation that does not support floating-point exception trapping, this bit is RES0.


When this bit is RW, it applies only to floating-point operations. Advanced SIMD operations always 
use untrapped floating-point exception handling in AArch32 state. 

IOE, bit [8] 
Invalid Operation exception trap enable. Possible values are: 


0 Untrapped exception handling selected. If the floating-point exception occurs then the 
IOC bit is set to 1. 
1 Trapped exception handling selected. If the floating-point exception occurs, the PE does
not update the IOC bit. The trap handling software can decide whether to set the IOC 
bit to 1. 


This bit is RW only if the implementation supports the trapping of floating-point exceptions. In an 
implementation that does not support floating-point exception trapping, this bit is RES0.


When this bit is RW, it applies only to floating-point operations. Advanced SIMD operations always 
use untrapped floating-point exception handling in AArch32 state. 


IDC, bit [7]
Input Denormal cumulative exception bit. This bit is set to 1 to indicate that the Input Denormal
exception has occurred since 0 was last written to this bit. 


How VFP instructions update this bit depends on the value of the IDE bit. 


Advanced SIMD instructions set this bit if the Input Denormal exception occurs in one or more of 
the floating-point calculations performed by the instruction, regardless of the value of the IDE bit. 


Bits [6:5]
Reserved, RES0.


IXC, bit [4]
Inexact cumulative exception bit. This bit is set to 1 to indicate that the Inexact exception has
occurred since 0 was last written to this bit.

How VFP instructions update this bit depends on the value of the IXE bit.

Advanced SIMD instructions set this bit if the Inexact exception occurs in one or more of the
floating-point calculations performed by the instruction, regardless of the value of the IXE bit.


UFC, bit [3]
Underflow cumulative exception bit. This bit is set to 1 to indicate that the Underflow exception has
occurred since 0 was last written to this bit. 


How VFP instructions update this bit depends on the value of the UFE bit. 


Advanced SIMD instructions set this bit if the Underflow exception occurs in one or more of the 
floating-point calculations performed by the instruction, regardless of the value of the UFE bit. 


OFC, bit [2]
Overflow cumulative exception bit. This bit is set to 1 to indicate that the Overflow exception has
occurred since 0 was last written to this bit. 


How VFP instructions update this bit depends on the value of the OFE bit. 


Advanced SIMD instructions set this bit if the Overflow exception occurs in one or more of the 
floating-point calculations performed by the instruction, regardless of the value of the OFE bit. 


DZC, bit [1]
Division by Zero cumulative exception bit. This bit is set to 1 to indicate that the Division by Zero
exception has occurred since 0 was last written to this bit. 


How VFP instructions update this bit depends on the value of the DZE bit. 


Advanced SIMD instructions set this bit if the Division by Zero exception occurs in one or more of 
the floating-point calculations performed by the instruction, regardless of the value of the DZE bit. 


IOC, bit [0]
Invalid Operation cumulative exception bit. This bit is set to 1 to indicate that the Invalid Operation
exception has occurred since 0 was last written to this bit. 


How VFP instructions update this bit depends on the value of the IOE bit. 


Advanced SIMD instructions set this bit if the Invalid Operation exception occurs in one or more of 
the floating-point calculations performed by the instruction, regardless of the value of the IOE bit. 
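
The pairing of each trap enable bit with its cumulative bit can be modelled as follows. This is an illustrative sketch only: record_exception is a hypothetical helper acting on an integer copy of the FPSCR, and it assumes the architectural FPSCR layout in which each cumulative bit (IOC, DZC, OFC, UFC, IXC, IDC at bits 0, 1, 2, 3, 4, and 7) sits eight bits below its trap enable (bits 8 to 12 and 15). It does not model what the trap handler does afterwards.

```python
# Cumulative exception bit positions; the matching trap enable is 8 bits higher.
CUMULATIVE_BIT = {"IO": 0, "DZ": 1, "OF": 2, "UF": 3, "IX": 4, "ID": 7}


def record_exception(fpscr: int, exc: str, advanced_simd: bool = False) -> tuple:
    """Model the effect of one floating-point exception on an FPSCR value.

    Returns (new_fpscr, trapped).
    - Trap enable 0, or an Advanced SIMD instruction: the cumulative bit is set.
    - Trap enable 1 for a scalar floating-point instruction: the exception is
      trapped and the cumulative bit is left for the handler to update.
    """
    flag = 1 << CUMULATIVE_BIT[exc]
    enable = flag << 8
    if not advanced_simd and (fpscr & enable):
        return fpscr, True
    return fpscr | flag, False
```

For example, with IXE set an Inexact result from a scalar operation leaves IXC unchanged, while the same result from an Advanced SIMD operation sets IXC.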
Accessing the FPSCR:
To access the FPSCR: 


VMRS <Rt>, FPSCR ; Read FPSCR into Rt 
VMSR FPSCR, <Rt> ; Write Rt to FPSCR 


Register access is encoded as follows: 


spec_reg 


0001 
G6.2.49 FPSID, Floating-Point System ID register
The FPSID characteristics are: 


Purpose 
Provides top-level information about the floating-point implementation. 


This register largely duplicates information held in the MIDR. ARM deprecates use of it. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)    EL2(NS)    EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       Config-RW  Config-RW  RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1        EL2(NS)
-    Config-RW  Config-RW





When access to this register is permitted, write accesses are ignored. 
This register largely duplicates information held in the MIDR. ARM deprecates use of it. 
Implemented only if the implementation includes the Advanced SIMD and floating-point 
functionality. 

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CPACR.cp10==00, accesses to this register from PL1 are UNDEFINED.
• If NSACR.cp10==0, Non-secure accesses to this register from EL1 and EL2 are UNDEFINED.
• If CPTR_EL2.TFP==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If CPTR_EL3.TFP==1, accesses to this register from EL1 and EL2 are trapped to EL3.
• If HCPTR.TCP10==1, Non-secure accesses to this register from EL1 are trapped to Hyp
mode.
• If HCPTR.TCP10==1, Non-secure accesses to this register from EL2 are UNDEFINED.
• If HCR.TID0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TID0==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Implemented only if the implementation includes the Advanced SIMD and floating-point
functionality.
Attributes 
FPSID is a 32-bit register. 
Field descriptions


The FPSID bit assignments are: 


31-24        23  22-16            15-8     7-4      3-0
Implementer  SW  Subarchitecture  PartNum  Variant  Revision


Implementer, bits [31:24] 
Implementer codes are the same as those used for the MIDR. 


For an implementation by ARM this field is 0x41, the ASCII code for A. 


SW, bit [23] 
Software bit. Defined values are: 
0 The implementation provides a hardware implementation of the floating-point
instructions. 
1 The implementation supports only software emulation of the floating-point instructions. 


In ARMv8-A the only permitted value is 0. 


Subarchitecture, bits [22:16] 
Subarchitecture version number. For an implementation by ARM, defined values are: 
0000000 VFPv1 architecture with an IMPLEMENTATION DEFINED subarchitecture.
0000001 VFPv2 architecture with Common VFP subarchitecture v1.


0000010 VFPv3 architecture, or later, with Common VFP subarchitecture v2. The VFP 
architecture version is indicated by the MVFR0 and MVFR1 registers.


0000011 VFPv3 architecture, or later, with Null subarchitecture. The entire floating-point
implementation is in hardware, and no software support code is required. The VFP
architecture version is indicated by the MVFR0 and MVFR1 registers. This value can
be used only by an implementation that does not support the trap enable bits in the 
FPSCR. 


0000100 VFPv3 architecture, or later, with Common VFP subarchitecture v3, and support for 
trap enable bits in FPSCR. The VFP architecture version is indicated by the MVFR0 and
MVFR1 registers.


For a subarchitecture designed by ARM the most significant bit of this field, register bit[22], is 0. 
Values with a most significant bit of 0 that are not listed here are reserved. 


When the subarchitecture designer is not ARM, the most significant bit of this field, register bit[22], 
must be 1. Each implementer must maintain its own list of subarchitectures it has designed, starting 
at subarchitecture version number 0x40. 


In ARMv8-A the permitted values are 0000011 and 0000100. 


PartNum, bits [15:8] 
An IMPLEMENTATION DEFINED part number for the floating-point implementation, assigned by the 
implementer. 

Variant, bits [7:4] 
An IMPLEMENTATION DEFINED variant number. Typically, this field distinguishes between different 
production variants of a single product. 

Revision, bits [3:0] 
An IMPLEMENTATION DEFINED revision number for the floating-point implementation. 
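
The field boundaries listed above can be applied to a value read from the FPSID as in the following sketch. The names Fpsid, decode_fpsid, and is_permitted_armv8a are hypothetical, and the sample value used in the usage note is invented for illustration; it does not describe any real implementation.

```python
from typing import NamedTuple


class Fpsid(NamedTuple):
    implementer: int      # bits [31:24]
    sw: int               # bit [23]
    subarchitecture: int  # bits [22:16]
    part_num: int         # bits [15:8]
    variant: int          # bits [7:4]
    revision: int         # bits [3:0]


def decode_fpsid(value: int) -> Fpsid:
    """Split a 32-bit FPSID value into its architected fields."""
    return Fpsid(
        implementer=(value >> 24) & 0xFF,
        sw=(value >> 23) & 0x1,
        subarchitecture=(value >> 16) & 0x7F,
        part_num=(value >> 8) & 0xFF,
        variant=(value >> 4) & 0xF,
        revision=value & 0xF,
    )


def is_permitted_armv8a(fields: Fpsid) -> bool:
    """Check the ARMv8-A constraints stated above: SW is 0 and the
    Subarchitecture field is 0000011 or 0000100."""
    return fields.sw == 0 and fields.subarchitecture in (0b0000011, 0b0000100)
```

For the invented value 0x41034021, decode_fpsid reports Implementer 0x41, SW 0, Subarchitecture 0000011, PartNum 0x40, Variant 2, and Revision 1.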
Accessing the FPSID:
To access the FPSID: 


VMRS <Rt>, FPSID ; Read FPSID into Rt 
VMSR FPSID, <Rt> ; Write Rt to FPSID 


Register access is encoded as follows: 


spec_reg 


0000 
G6.2.50 HACR, Hyp Auxiliary Configuration Register
The HACR characteristics are: 


Purpose 
Controls trapping to Hyp mode of IMPLEMENTATION DEFINED aspects of Non-secure EL1 or EL0
operation.


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 


as follows: 





EL0  EL1  EL2(NS)
-    -    RW





Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register HACR is architecturally mapped to AArch64 System register 
HACR_EL2. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HACR is a 32-bit register. 


Field descriptions 


The HACR bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the HACR: 


To access the HACR: 
MRC p15,4,<Rt>,c1,c1,7 ; Read HACR into Rt
MCR p15,4,<Rt>,c1,c1,7 ; Write Rt to HACR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0001  0001  111
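
The relationship between an encoding table of this form and the instruction word can be sketched as follows. The function encode_mrc_mcr is a hypothetical helper, and the bit positions it uses are those of the A32 MRC and MCR instruction encodings, which are defined in the instruction descriptions rather than in this section. The condition field defaults to 1110 (always).

```python
def encode_mrc_mcr(coproc: int, opc1: int, crn: int, crm: int, opc2: int,
                   rt: int, read: bool, cond: int = 0b1110) -> int:
    """Assemble the 32-bit A32 encoding of MRC (read=True) or MCR (read=False)."""
    widths = [("cond", cond, 4), ("coproc", coproc, 4), ("opc1", opc1, 3),
              ("CRn", crn, 4), ("CRm", crm, 4), ("opc2", opc2, 3), ("Rt", rt, 4)]
    for name, value, width in widths:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(read) << 20)
            | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
            | (1 << 4) | crm)


# HACR access from the table above: coproc=1111, opc1=100, CRn=0001, CRm=0001, opc2=111.
HACR_READ_R0 = encode_mrc_mcr(0b1111, 0b100, 0b0001, 0b0001, 0b111, rt=0, read=True)
HACR_WRITE_R0 = encode_mrc_mcr(0b1111, 0b100, 0b0001, 0b0001, 0b111, rt=0, read=False)
```

The same helper applies to the other registers in this section by substituting the values from their encoding tables.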
G6.2.51 HACTLR, Hyp Auxiliary Control Register
The HACTLR characteristics are: 
Purpose 
Controls IMPLEMENTATION DEFINED features of Hyp mode operation. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HACTLR is architecturally mapped to AArch64 System register 
ACTLR_EL2[31:0]. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HACTLR is a 32-bit register. 
Field descriptions 
The HACTLR bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HACTLR: 
To access the HACTLR: 
MRC p15,4,<Rt>,c1,c0,1 ; Read HACTLR into Rt
MCR p15,4,<Rt>,c1,c0,1 ; Write Rt to HACTLR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0001  0000  001
G6.2.52 HACTLR2, Hyp Auxiliary Control Register 2
The HACTLR2 characteristics are: 
Purpose 
Provides additional space to the HACTLR register to hold IMPLEMENTATION DEFINED trap 
functionality. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HACTLR2 is architecturally mapped to AArch64 System register 
ACTLR_EL2[63:32]. 
It is IMPLEMENTATION DEFINED whether this register is implemented, or whether it causes 
UNDEFINED exceptions when accessed. 
The implementation of this register can be detected by examining bits [7:4] of the 
ID_MMFR4/ID_MMFR4_EL1 register.
RW fields in this register reset to architecturally UNKNOWN values. 
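
The detection step described above amounts to extracting bits [7:4] of the ID register. The sketch below shows only that extraction. The function names are hypothetical, and the interpretation that a nonzero field value indicates that HACTLR2 is implemented is an assumption made for illustration; the ID_MMFR4 description gives the authoritative encoding.

```python
def id_mmfr4_bits_7_4(id_mmfr4: int) -> int:
    """Return bits [7:4] of an ID_MMFR4 (or ID_MMFR4_EL1) value."""
    return (id_mmfr4 >> 4) & 0xF


def hactlr2_implemented(id_mmfr4: int) -> bool:
    # Assumption: any nonzero value of bits [7:4] means HACTLR2 is implemented.
    return id_mmfr4_bits_7_4(id_mmfr4) != 0
```

Checking the field before the first access avoids the UNDEFINED exception that an unimplemented HACTLR2 generates.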
Attributes 
HACTLR2 is a 32-bit register. 
Field descriptions 
The HACTLR2 bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HACTLR2:
To access the HACTLR2: 


MRC p15,4,<Rt>,c1,c0,3 ; Read HACTLR2 into Rt
MCR p15,4,<Rt>,c1,c0,3 ; Write Rt to HACTLR2


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0001  0000  011
G6.2.53 HADFSR, Hyp Auxiliary Data Fault Status Register
The HADFSR characteristics are: 
Purpose 
Provides additional IMPLEMENTATION DEFINED syndrome information for Data Abort exceptions 
taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HADFSR is architecturally mapped to AArch64 System register 
AFSR0_EL2.
This is an optional register. An implementation that does not require this register can implement it
as RES0.
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HADFSR is a 32-bit register. 
Field descriptions 
The HADFSR bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HADFSR:
To access the HADFSR: 


MRC p15,4,<Rt>,c5,c1,0 ; Read HADFSR into Rt
MCR p15,4,<Rt>,c5,c1,0 ; Write Rt to HADFSR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0101  0001  000































G6.2.54 HAIFSR, Hyp Auxiliary Instruction Fault Status Register 
The HAIFSR characteristics are: 
Purpose 
Provides additional IMPLEMENTATION DEFINED syndrome information for Prefetch Abort 
exceptions taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HAIFSR is architecturally mapped to AArch64 System register 
AFSR1_EL2. 
This is an optional register. An implementation that does not require this register can implement it 
as RES0.
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HAIFSR is a 32-bit register. 
Field descriptions 
The HAIFSR bit assignments are: 
31 0 
IMPLEMENTATION DEFINED 
IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HAIFSR:
To access the HAIFSR: 


MRC p15,4,<Rt>,c5,c1,1 ; Read HAIFSR into Rt 
MCR p15,4,<Rt>,c5,c1,1 ; Write Rt to HAIFSR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0101  0001  001
G6.2.55 HAMAIR0, Hyp Auxiliary Memory Attribute Indirection Register 0
The HAMAIR0 characteristics are:


Purpose 
Provides IMPLEMENTATION DEFINED memory attributes for the memory attribute encodings defined 
by HMAIR0. These IMPLEMENTATION DEFINED attributes can only provide additional qualifiers for
the memory attribute encodings, and cannot change the memory attributes defined in HMAIR0.


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 


as follows: 





EL0  EL1  EL2(NS)
-    -    RW





If an implementation does not provide any IMPLEMENTATION DEFINED memory attributes, this 
register is RES0.


Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register HAMAIR0 is architecturally mapped to AArch64 System register
AMAIR_EL2[31:0].
If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HAMAIR0 is a 32-bit register.


Field descriptions 


The HAMAIR0 bit assignments are:


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HAMAIR0:
To access the HAMAIR0:


MRC p15,4,<Rt>,c10,c3,0 ; Read HAMAIR0 into Rt
MCR p15,4,<Rt>,c10,c3,0 ; Write Rt to HAMAIR0


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1010  0011  000
G6.2.56 HAMAIR1, Hyp Auxiliary Memory Attribute Indirection Register 1
The HAMAIR1 characteristics are:


Purpose 
Provides IMPLEMENTATION DEFINED memory attributes for the memory attribute encodings defined 
by HMAIR1. These IMPLEMENTATION DEFINED attributes can only provide additional qualifiers for 
the memory attribute encodings, and cannot change the memory attributes defined in HMAIR1.


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 


as follows: 





EL0  EL1  EL2(NS)
-    -    RW





If an implementation does not provide any IMPLEMENTATION DEFINED memory attributes, this 
register is RES0.


Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register HAMAIR1 is architecturally mapped to AArch64 System register
AMAIR_EL2[63:32].
If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 
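
The architectural mapping of HAMAIR0 and HAMAIR1 onto the two halves of AMAIR_EL2 can be illustrated with the following sketch. The helper names split_amair_el2 and join_hamair are hypothetical, and the code works on integer copies of the register values only.

```python
MASK32 = 0xFFFFFFFF


def split_amair_el2(amair_el2: int) -> tuple:
    """Return (HAMAIR0, HAMAIR1) as seen from AArch32 for a 64-bit AMAIR_EL2 value.

    HAMAIR0 maps to AMAIR_EL2[31:0] and HAMAIR1 maps to AMAIR_EL2[63:32].
    """
    return amair_el2 & MASK32, (amair_el2 >> 32) & MASK32


def join_hamair(hamair0: int, hamair1: int) -> int:
    """Return the 64-bit AMAIR_EL2 value formed by the two AArch32 registers."""
    return ((hamair1 & MASK32) << 32) | (hamair0 & MASK32)
```

A write to either AArch32 register therefore changes only its own half of the AArch64 view.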
Attributes 
HAMAIR1 is a 32-bit register. 


Field descriptions 


The HAMAIR1 bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the HAMAIR1:
To access the HAMAIR1:


MRC p15,4,<Rt>,c10,c3,1 ; Read HAMAIR1 into Rt
MCR p15,4,<Rt>,c10,c3,1 ; Write Rt to HAMAIR1 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1010  0011  001










G6.2.57 HCPTR, Hyp Architectural Feature Trap Register 
The HCPTR characteristics are: 


Purpose 
Controls: 


• Trapping to Hyp mode of Non-secure access, at EL1 or EL0, to trace, and to Advanced SIMD
and floating-point functionality.
• Hyp mode access to trace, and to Advanced SIMD and floating-point functionality.

Note
Accesses to this functionality:
• From Non-secure modes other than Hyp mode are also affected by settings in the CPACR and
NSACR.
• From Hyp mode are also affected by settings in the NSACR.


Exceptions generated by the CPACR and NSACR controls are higher priority than those generated 
by the HCPTR controls. 





Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CPTR_EL3.TCPAC==1, accesses to this register from EL2 are trapped to EL3.
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register HCPTR is architecturally mapped to AArch64 System register 
CPTR_EL2.
If EL2 is not implemented, this register is RES0 from EL3.
Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 

Attributes 


HCPTR is a 32-bit register. 
Field descriptions


The HCPTR bit assignments are: 


31     30-21  20   19-16  15    14    13-12  11     10     9-0
TCPAC  RES0   TTA  RES0   TASE  RES0  RES1   TCP11  TCP10  RES1
TCPAC, bit [31] 
Traps Non-secure EL1 accesses to the CPACR to Hyp mode. 
0 This control has no effect on Non-secure EL1 accesses to the CPACR.
1 Non-secure EL1 accesses to the CPACR are trapped to Hyp mode. 
Note
The CPACR is not accessible at EL0.
When this register has an architecturally-defined reset value, this field resets to 0. 
Bits [30:21] 
Reserved, RESO. 
TTA, bit [20] 
Traps Non-secure System register accesses to all implemented trace registers to Hyp mode. 
0 This control has no effect on Non-secure System register accesses to trace registers.
1 Any Non-secure System register access to an implemented trace register is trapped to 
Hyp mode, unless the access is trapped to EL1 by a CPACR or NSACR control, or the 
access is from Non-secure EL0 and the definition of the register in the appropriate trace
architecture specification indicates that the register is not accessible from EL0. A
trapped instruction generates:
• A Hyp Trap exception, if the exception is taken from Non-secure EL0 or EL1.
• An Undefined Instruction exception taken to Hyp mode, if the exception is taken
from Hyp mode.
If the implementation does not include a trace macrocell, or does not include a System register 
interface to the trace macrocell registers, it is IMPLEMENTATION DEFINED whether this bit: 
• Is RES0.
• Is RES1.
• Can be written from Hyp mode, and from Secure Monitor mode when SCR.NS is 1.
If EL3 is implemented and is using AArch32, and the value of NSACR.NSTRCDIS is 1, in 
Non-secure state this field behaves as RAO/WI, regardless of its actual value. 
Note
• The ETMv4 architecture does not permit EL0 to access the trace registers. If the
implementation includes an ETMv4 implementation, EL0 accesses to the trace registers are
UNDEFINED, and a resulting Undefined Instruction exception is higher priority than a
HCPTR.TTA Hyp Trap exception.
• The architecture does not provide traps on trace register accesses through the optional
memory-mapped external debug interface.





System register accesses to the trace registers can have side-effects. When a System register access 
is trapped, any side-effects that are normally associated with the access do not occur before the 
exception is taken. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 
Bits [19:16] 


Reserved, RES0. 


TASE, bit [15] 
Traps Non-secure execution of Advanced SIMD instructions to Hyp mode when the value of 
HCPTR.TCP10 is 0. 
0 This control has no effect on Non-secure execution of Advanced SIMD instructions. 


1 When the value of HCPTR.TCP10 is 0, any attempt to execute an Advanced SIMD 
instruction in Non-secure state is trapped to Hyp mode, unless it is trapped to EL1 by a 
CPACR or NSACR control. A trapped instruction generates: 


• A Hyp Trap exception, if the exception is taken from Non-secure EL0 or EL1. 
• An Undefined Instruction exception taken to Hyp mode, if the exception is taken 
from Hyp mode. 


When the value of HCPTR.TCP10 is 1, the value of this field is ignored. 


If the implementation does not include Advanced SIMD and floating-point functionality, this field 
is RES1. Otherwise, it is IMPLEMENTATION DEFINED whether this field is implemented as a RW field. 
If it is not implemented as a RW field, then it is RAZ/WI. 


If EL3 is implemented and is using AArch32, and the value of NSACR.NSASEDIS is 1, in 
Non-secure state this field behaves as RAO/WI, regardless of its actual value. This applies even if 
the field is implemented as RAZ/WI. 


For the list of instructions affected by this field, see Controls of Advanced SIMD operation that do 
not apply to floating-point operation on page E1-2306. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 

Bit [14] 
Reserved, RES0. 


Bits [13:12] 


Reserved, RES1. 


TCP11, bit [11] 


The value of this field is ignored. If this field is programmed with a different value to the TCP10 bit 
then this field is UNKNOWN on a direct read of the HCPTR. 


If the implementation does not include Advanced SIMD and floating-point functionality, this field 
is RES1. 


If EL3 is implemented and is using AArch32, and the value of NSACR.cp10 is 0, in Non-secure 
state this field behaves as RAO/WI, regardless of its actual value. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 
TCP10, bit [10] 
Trap Non-secure accesses to Advanced SIMD and floating-point functionality to Hyp mode: 

0 This control has no effect on Non-secure accesses to Advanced SIMD and 
floating-point functionality. 

1 Any attempted access to Advanced SIMD and floating-point functionality from 
Non-secure state is trapped to Hyp mode, unless it is trapped to EL1 by a CPACR or 
NSACR control. A trapped instruction generates: 

• A Hyp Trap exception, if the exception is taken from Non-secure EL0 or EL1. 
• An Undefined Instruction exception taken to Hyp mode, if the exception is taken 
from Hyp mode. 

The Advanced SIMD and floating-point features controlled by these fields are: 
• Execution of any floating-point or Advanced SIMD instruction. 
• Any access to the Advanced SIMD and floating-point registers D0-D31 and their views as 
S0-S31 and Q0-Q15. 
• Any access to the FPSCR, FPSID, MVFR0, MVFR1, MVFR2, or FPEXC System registers. 


If the implementation does not include Advanced SIMD and floating-point functionality, this field 
is RES1. 


If EL3 is implemented and is using AArch32, and the value of NSACR.cp10 is 0, in Non-secure 
state this field behaves as RAO/WI, regardless of its actual value. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 
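
The interaction between TASE and TCP10 described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the `is_advsimd_only` parameter are hypothetical, the sketch looks at HCPTR alone, and it ignores the CPACR and NSACR traps, the RAO/WI and RES1 cases, and the exact list of instructions that TASE covers.

```python
TCP10 = 1 << 10  # HCPTR.TCP10, bit [10]
TASE = 1 << 15   # HCPTR.TASE, bit [15]


def hcptr_traps_to_hyp(hcptr: int, is_advsimd_only: bool) -> bool:
    """Return True if HCPTR alone would trap a Non-secure access to Hyp mode.

    hcptr           -- raw 32-bit HCPTR value (assumed already read by the caller)
    is_advsimd_only -- hypothetical flag: True for an Advanced SIMD instruction
                       that TASE applies to, False for other floating-point or
                       register accesses
    """
    if hcptr & TCP10:
        # TCP10 == 1 traps all Advanced SIMD and floating-point accesses,
        # and the value of TASE is ignored.
        return True
    # TCP10 == 0: only Advanced SIMD instructions are trapped, and only
    # when TASE is 1.
    return is_advsimd_only and bool(hcptr & TASE)
```

With TCP10 clear and TASE set, a floating-point-only access is not trapped by this register, while an Advanced SIMD instruction is.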


Bits [9:0] 


Reserved, RES1. 


Accessing the HCPTR: 
To access the HCPTR: 


MRC p15,4,<Rt>,c1,c1,2 ; Read HCPTR into Rt 
MCR p15,4,<Rt>,c1,c1,2 ; Write Rt to HCPTR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2 
1111    100   0001  0001  010 
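
As an illustration of how these fields fit into an instruction word, the following sketch packs them into the A32 MRC/MCR encoding. The bit positions used here come from the general A32 MRC and MCR instruction encodings, not from this section, and the function name is hypothetical. The T32 encodings differ and are not covered.

```python
def encode_cp15_access(opc1: int, crn: int, crm: int, opc2: int,
                       rt: int, read: bool, cond: int = 0xE) -> int:
    """Pack an A32 MRC (read=True) or MCR (read=False) word for coproc p15.

    Assumed layout: cond[31:28] 1110[27:24] opc1[23:21] L[20] CRn[19:16]
    Rt[15:12] coproc[11:8] opc2[7:5] 1[4] CRm[3:0].
    """
    for name, value, limit in (("opc1", opc1, 7), ("opc2", opc2, 7),
                               ("crn", crn, 15), ("crm", crm, 15),
                               ("rt", rt, 15), ("cond", cond, 15)):
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value} out of range 0..{limit}")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(read) << 20)
            | (crn << 16) | (rt << 12) | (0b1111 << 8) | (opc2 << 5)
            | (1 << 4) | crm)


# HCPTR: coproc=1111, opc1=100, CRn=0001, CRm=0001, opc2=010, using R0.
hcptr_read = encode_cp15_access(opc1=4, crn=1, crm=1, opc2=2, rt=0, read=True)
hcptr_write = encode_cp15_access(opc1=4, crn=1, crm=1, opc2=2, rt=0, read=False)
```

The same helper applies to any register whose access is given as a coproc, opc1, CRn, CRm, opc2 tuple with coproc equal to 1111.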
G6.2.58 HCR, Hyp Configuration Register 
The HCR characteristics are: 


Purpose 


Provides configuration controls for virtualization, including defining whether various Non-secure 
operations are trapped to Hyp mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS) 
-    -    RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register HCR is architecturally mapped to AArch64 System register 
HCR_EL2[31:0]. 


If EL2 is not implemented, this register is RES0 from EL3. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 


Attributes 
HCR is a 32-bit register. 


Field descriptions 


The HCR bit assignments are: 
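[31] RES0, [30] TRVM, [29] HCD, [28] RES0, [27] TGE, [26] TVM, [25] TTLB, [24] TPU, 
[23] TPC, [22] TSW, [21] TAC, [20] TIDCP, [19] TSC, [18] TID3, [17] TID2, [16] TID1, 
[15] TID0, [14] TWE, [13] TWI, [12] DC, [11:10] BSU, [9] FB, [8] VA, [7] VI, 
[6] VF, [5] AMO, [4] IMO, [3] FMO, [2] PTW, [1] SWIO, [0] VM 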
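
The bit assignments can be captured in a small lookup table for decoding a raw HCR value. The sketch below is illustrative only: the table and function names are hypothetical, and the field positions are taken from the field descriptions in this section.

```python
# (least significant bit, width) for each named HCR field.
HCR_FIELDS = {
    "TRVM": (30, 1), "HCD": (29, 1), "TGE": (27, 1), "TVM": (26, 1),
    "TTLB": (25, 1), "TPU": (24, 1), "TPC": (23, 1), "TSW": (22, 1),
    "TAC": (21, 1), "TIDCP": (20, 1), "TSC": (19, 1), "TID3": (18, 1),
    "TID2": (17, 1), "TID1": (16, 1), "TID0": (15, 1), "TWE": (14, 1),
    "TWI": (13, 1), "DC": (12, 1), "BSU": (10, 2), "FB": (9, 1),
    "VA": (8, 1), "VI": (7, 1), "VF": (6, 1), "AMO": (5, 1),
    "IMO": (4, 1), "FMO": (3, 1), "PTW": (2, 1), "SWIO": (1, 1),
    "VM": (0, 1),
}


def decode_hcr(value: int) -> dict:
    """Split a 32-bit HCR value into its named fields (reserved bits omitted)."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in HCR_FIELDS.items()}


# Example: TGE set, BSU = 0b10 (Outer Shareable), VM set.
fields = decode_hcr((1 << 27) | (0b10 << 10) | 1)
```

Bits [31] and [28] are reserved and are deliberately left out of the table.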

Bit [31] 


Reserved, RES0. 


TRVM, bit [30] 


Trap Reads of Virtual Memory controls. Traps Non-secure EL1 reads of the virtual memory control 
registers to Hyp mode. The registers for which read accesses are trapped are as follows: 


SCTLR, TTBR0, TTBR1, TTBCR, DACR, DFSR, IFSR, DFAR, IFAR, ADFSR, AIFSR, PRRR, 
NMRR, MAIR0, MAIR1, AMAIR0, AMAIR1, CONTEXTIDR. 

0 This control has no effect on Non-secure EL1 read accesses to Virtual Memory controls. 
1 Non-secure EL1 read accesses to the specified Virtual Memory controls are trapped to 
Hyp mode. 

When this register has an architecturally-defined reset value, this field resets to 0. 


HCD, bit [29] 


HVC instruction disable. Disables Non-secure state execution of HVC instructions. 


0 HVC instruction execution is enabled at EL2 and Non-secure EL1. 

1 HVC instructions are UNDEFINED at EL2 and Non-secure EL1. The Undefined 
Instruction exception is taken to the Exception level at which the HVC instruction is 
executed. 

Note 

HVC instructions are always UNDEFINED at EL0. 

This bit is only implemented if EL3 is not implemented. Otherwise, it is RES0. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 


Bit [28] 


Reserved, RES0. 

TGE, bit [27] 
Trap General Exceptions, from Non-secure EL0. 
0 This control has no effect on execution at EL0. 

1 When the value of SCR.NS is 0, this control has no effect on execution at EL0. 
When the value of SCR.NS is 1, then: 

• All exceptions that would be routed to EL1 are routed to EL2. 
• The SCTLR.M bit is treated as being 0 for all purposes other than returning the 
result of a direct read of SCTLR. 
• The HCR.{FMO, IMO, AMO} bits are treated as being 1 for all purposes other 
than returning the result of a direct read of HCR. 
• All virtual interrupts are disabled. 
• Any IMPLEMENTATION DEFINED mechanisms for signaling virtual interrupts are 
disabled. 
• An exception return to EL1 is treated as an illegal exception return. 
• Monitor mode execution of an MSR or CPS instruction that changes CPSR.M to 
a Non-secure EL1 mode is an illegal change to PSTATE.M. For more information 
see Illegal changes to PSTATE.M on page G1-3809. 

Also, when HCR.TGE is 1: 

• If EL3 is using AArch32, an attempt to change from a Secure PL1 mode to a Non-secure EL1 
mode by changing SCR.NS from 0 to 1 results in SCR.NS remaining as 0. 
• The HDCR.{TDRA, TDOSA, TDA, TDE} bits are ignored and treated as being 1 other than 
for the purpose of a direct read of HDCR. 

In the following cases the field resets to 0: 
• The PE resets into EL3 with EL3 using AArch32. 
• The PE resets into EL2 with EL2 using AArch32. 
Otherwise, the field reset value is architecturally UNKNOWN. 


When this register has an architecturally-defined reset value, this field resets to 0. 
TVM, bit [26] 


Trap Virtual Memory controls. Traps Non-secure EL1 writes to the virtual memory control registers 
to Hyp mode. The registers for which write accesses are trapped are as follows: 


SCTLR, TTBR0, TTBR1, TTBCR, DACR, DFSR, IFSR, DFAR, IFAR, ADFSR, AIFSR, PRRR, 
NMRR, MAIR0, MAIR1, AMAIR0, AMAIR1, CONTEXTIDR. 

0 This control has no effect on Non-secure EL1 write accesses to EL1 virtual memory 
control registers. 

1 Non-secure EL1 write accesses to EL1 virtual memory control registers are trapped to 
Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 
TTLB, bit [25] 


Trap TLB maintenance instructions. Traps Non-secure EL1 execution of a TLBI instruction to Hyp 
mode. This applies to the following instructions: 


TLBIALLIS, TLBIMVAIS, TLBIASIDIS, TLBIMVAAIS, TLBIMVALIS, TLBIMVAALIS, 
ITLBIALL, ITLBIMVA, ITLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID, TLBIALL, 
TLBIMVA, TLBIASID, TLBIMVAA, TLBIMVAL, TLBIMVAAL 


0 This control has no effect on Non-secure EL1 accesses to TLB maintenance 
instructions. 
1 Non-secure EL1 accesses to TLB maintenance instructions are trapped to Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 
TPU, bit [24] 

Trap cache maintenance instructions that operate to the Point of Unification. Traps Non-secure EL1 
execution of those cache maintenance instructions to Hyp mode. This applies to the following 
instructions: 

ICIMVAU, ICIALLU, ICIALLUIS, DCCMVAU. 

Note 

An Undefined Instruction exception generated at EL0 is higher priority than this trap to EL2, and 
these instructions are always UNDEFINED at EL0. 

0 This control has no effect on the execution of cache maintenance instructions. 
1 Non-secure EL1 execution of the specified instructions is trapped to Hyp mode. 

When this register has an architecturally-defined reset value, this field resets to 0. 


TPC, bit [23] 


Trap data or unified cache maintenance instructions that operate to the Point of Coherency. Traps 
Non-secure EL1 execution of those cache maintenance instructions to Hyp mode. This applies to 
the following instructions: 


DCIMVAC, DCCIMVAC, DCCMVAC. 


Note 

An Undefined Instruction exception generated at EL0 is higher priority than this trap to EL2, and 
these instructions are always UNDEFINED at EL0. 





0 This control has no effect on the execution of cache maintenance instructions. 
1 Non-secure execution of the specified instructions is trapped to Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TSW, bit [22] 


Trap data or unified cache maintenance instructions that operate by Set/Way. Traps Non-secure EL1 
execution of those cache maintenance instructions by set/way to Hyp mode. This applies to the 
following instructions: 


DCISW, DCCSW, DCCISW. 


Note 

An Undefined Instruction exception generated at EL0 is higher priority than this trap to EL2, and 
these instructions are always UNDEFINED at EL0. 

0 This control has no effect on the execution of cache maintenance instructions. 
1 Non-secure execution of the specified instructions is trapped to Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TAC, bit [21] 


Trap Auxiliary Control Registers. Traps Non-secure EL1 accesses to the Auxiliary Control 
Registers to Hyp mode, from both Execution states. This applies to the following register accesses: 


ACTLR and, if implemented, ACTLR2. 


0 This control has no effect on Non-secure EL1 accesses to the Auxiliary Control 
Registers. 
1 Non-secure EL1 accesses to the specified registers are trapped to Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 
TIDCP, bit [20] 

Trap IMPLEMENTATION DEFINED functionality. Traps Non-secure EL1 accesses to the encodings for 
IMPLEMENTATION DEFINED System Registers to Hyp mode. 

MCR and MRC instructions accessing the following encodings: 
• All coproc==p15, CRn==c9, Opcode1 == {0-7}, CRm == {c0-c2, c5-c8}, opcode2 == {0-7}. 
• All coproc==p15, CRn==c10, Opcode1 == {0-7}, CRm == {c0, c1, c4, c8}, opcode2 == 
{0-7}. 
• All coproc==p15, CRn==c11, Opcode1 == {0-7}, CRm == {c0-c8, c15}, opcode2 == {0-7}. 

When HCR.TIDCP is set to 1, it is IMPLEMENTATION DEFINED whether any of this functionality 
accessed from Non-secure EL0 is trapped to Hyp mode. If it is not, it is UNDEFINED, and the PE takes 
an Undefined Instruction exception to Non-secure Undefined mode. 

0 This control has no effect on Non-secure EL1 and EL0 accesses to the System register 
encodings for IMPLEMENTATION DEFINED functionality. 

1 Non-secure EL1 accesses to the specified System register encodings for 
IMPLEMENTATION DEFINED functionality are trapped to Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 
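
The encoding ranges listed for TIDCP can be expressed as a membership test. The following is a minimal sketch, assuming the caller has already decoded the MCR or MRC fields into integers. The names are hypothetical, and the sketch only classifies the encoding. It does not model the EL0 behavior, which is IMPLEMENTATION DEFINED.

```python
# CRm values covered by HCR.TIDCP for each CRn, per the list in this section.
TIDCP_CRM_BY_CRN = {
    9: {0, 1, 2, 5, 6, 7, 8},            # c0-c2, c5-c8
    10: {0, 1, 4, 8},                     # c0, c1, c4, c8
    11: {0, 1, 2, 3, 4, 5, 6, 7, 8, 15},  # c0-c8, c15
}


def is_tidcp_encoding(coproc: int, opc1: int, crn: int, crm: int, opc2: int) -> bool:
    """Return True if an MCR/MRC encoding falls in the ranges trapped by TIDCP."""
    return (coproc == 15
            and 0 <= opc1 <= 7
            and 0 <= opc2 <= 7
            and crm in TIDCP_CRM_BY_CRN.get(crn, set()))
```

A hypervisor model could call this before deciding whether a Non-secure EL1 access generates a Hyp Trap exception when HCR.TIDCP is 1.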


TSC, bit [19] 
Trap SMC instructions. Traps Non-secure EL1 execution of SMC instructions to Hyp mode. 
0 This control has no effect on execution of SMC instructions. 

1 Any attempt to execute an SMC instruction at Non-secure EL1 is trapped to Hyp mode, 
regardless of the value of SCR.SCD. 

The ARMv8-A architecture permits, but does not require, this trap to apply to conditional SMC 
instructions that fail their condition code check, in the same way as with traps on other conditional 
instructions. 

Note 

• This trap is only implemented if the implementation includes EL3. 

• SMC instructions are always UNDEFINED at PL0. 

° This bit traps execution of the SMC instruction. It is not a routing control for the SMC 
exception. Hyp Trap exceptions and SMC exceptions have different preferred return 
addresses. 





When this register has an architecturally-defined reset value, this field resets to 0. 


TID3, bit [18] 
Trap ID group 3. Traps Non-secure EL1 reads of the following registers to Hyp mode: 


ID_PFR0, ID_PFR1, ID_DFR0, ID_AFR0, ID_MMFR0, ID_MMFR1, ID_MMFR2, ID_MMFR3, 
ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, ID_ISAR4, ID_ISAR5, MVFR0, MVFR1, 
MVFR2, and ID_MMFR4, except that if ID_MMFR4 is implemented as RAZ/WI then it is 
IMPLEMENTATION DEFINED whether accesses to ID_MMFR4 are trapped. 

Also an MRC access to any of the following encodings: 
• coproc==p15, opc1 == 0, CRn == c0, CRm == {c3-c7}, opc2 == {0,1}. 
• coproc==p15, opc1 == 0, CRn == c0, CRm == c3, opc2 == 2. 
• coproc==p15, opc1 == 0, CRn == c0, CRm == c5, opc2 == {4,5}. 

It is IMPLEMENTATION DEFINED whether this bit traps MRC accesses to the following encodings: 
• coproc==p15, opc1 == 0, CRn == c0, CRm == c2, opc2 == 7. 
• coproc==p15, opc1 == 0, CRn == c0, CRm == c3, opc2 == {3-7}. 
• coproc==p15, opc1 == 0, CRn == c0, CRm == {c4, c6, c7}, opc2 == {2-7}. 
• coproc==p15, opc1 == 0, CRn == c0, CRm == c5, opc2 == {2, 3, 6, 7}. 


0 This control has no effect on Non-secure EL1 reads of the ID group 3 registers. 
1 The specified Non-secure EL1 read accesses to ID group 3 registers are trapped to Hyp 
mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TID2, bit [17] 
Trap ID group 2. Traps the following register accesses to Hyp mode: 
• Non-secure EL1 and EL0 reads of the CTR, CCSIDR, CLIDR, and CSSELR. 
• Non-secure EL1 and EL0 writes to the CSSELR. 

0 This control has no effect on Non-secure EL1 and EL0 accesses to the ID group 2 
registers. 

1 The specified Non-secure EL1 and EL0 accesses to ID group 2 registers are trapped to 
Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TID1, bit [16] 
Trap ID group 1. Traps Non-secure EL1 reads of the following registers to Hyp mode: 
TCMTR, TLBTR, REVIDR, AIDR. 


0 This control has no effect on Non-secure EL1 reads of the ID group 1 registers. 
1 The specified Non-secure EL1 read accesses to ID group 1 registers are trapped to Hyp 
mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 








TID0, bit [15] 

Trap ID group 0. Traps the following register accesses to Hyp mode: 

• Non-secure EL1 reads of the JIDR and FPSID. 

• If the JIDR is RAZ from Non-secure EL0, Non-secure EL0 reads of the JIDR. 

Note 

• It is IMPLEMENTATION DEFINED whether the JIDR is RAZ or UNDEFINED at EL0. If it is 
UNDEFINED at EL0 then the Undefined Instruction exception takes precedence over this trap. 

• The FPSID is not accessible at EL0. 

• Writes to the FPSID are ignored, and not trapped by this control. 

0 This control has no effect on Non-secure EL1 reads of the ID group 0 registers. 

1 The specified Non-secure EL1 read accesses to ID group 0 registers are trapped to Hyp 
mode. 

When this register has an architecturally-defined reset value, this field resets to 0. 

TWE, bit [14] 

Traps Non-secure EL0 and EL1 execution of WFE instructions to Hyp mode: 

0 This control has no effect on the execution of WFE instructions at Non-secure EL0 or 
Non-secure EL1. 

1 Any attempt to execute a WFE instruction at Non-secure EL0 or EL1 is trapped to Hyp 
mode, if the instruction would otherwise have caused the PE to enter a low-power state, 
except that when the value of SCTLR.nTWE is 0, the trap of EL0 execution to 
Undefined mode takes precedence over this trap. 

The attempted execution of a conditional WFE instruction is only trapped if the instruction passes 
its condition code check. 
Note 

Since a WFE can complete at any time, even without a Wakeup event, the traps on WFE are not 
guaranteed to be taken, even if the WFE is executed when there is no Wakeup event. The only 
guarantee is that if the instruction does not complete in finite time in the absence of a Wakeup event, 
the trap will be taken. 

When this register has an architecturally-defined reset value, this field resets to 0. 

TWI, bit [13] 

Traps Non-secure EL0 and EL1 execution of WFI instructions to Hyp mode. 

0 This control has no effect on the execution of WFI instructions at Non-secure EL1 or 
Non-secure EL0. 

1 Any attempt to execute a WFI instruction at Non-secure EL0 or EL1 is trapped to Hyp 
mode, if the instruction would otherwise have caused the PE to enter a low-power state, 
except that when the value of SCTLR.nTWI is 0, the trap of EL0 execution to 
Undefined mode takes precedence over this trap. 

The attempted execution of a conditional WFI instruction is only trapped if the instruction passes 
its condition code check. 

Note 

Since a WFI can complete at any time, even without a Wakeup event, the traps on WFI are not 
guaranteed to be taken, even if the WFI is executed when there is no Wakeup event. The only 
guarantee is that if the instruction does not complete in finite time in the absence of a Wakeup event, 
the trap will be taken. 

When this register has an architecturally-defined reset value, this field resets to 0. 

DC, bit [12] 

Default Cacheability. 
0 This control has no effect on the Non-secure EL1&0 translation regime. 
1 In Non-secure state: 

• The SCTLR.M field behaves as 0 for all purposes other than a direct read of the 
value of the field. 

• The HCR.VM field behaves as 1 for all purposes other than a direct read of the 
value of the field. 

• The memory type produced by the first stage of the EL1&0 translation regime is 
Normal Non-Shareable, Inner Write-Back Read-Allocate Write-Allocate, Outer 
Write-Back Read-Allocate Write-Allocate. 

This field has no effect on the EL2 and EL3 translation regimes. 
This field is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, this field resets to 0. 
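
The effect of HCR.DC on the other two controls can be sketched as a pair of effective values. This is a simplified model with hypothetical names, assuming Non-secure state. It covers only the DC overrides described above and does not model HCR.TGE or the default memory type.

```python
HCR_VM = 1 << 0   # HCR.VM, bit [0]
HCR_DC = 1 << 12  # HCR.DC, bit [12]


def effective_translation_enables(hcr: int, sctlr_m: int) -> tuple:
    """Return (effective SCTLR.M, effective HCR.VM) for Non-secure EL1&0.

    A direct read of either register still returns the programmed value;
    only the behavior seen by the translation regime changes.
    """
    dc = bool(hcr & HCR_DC)
    # DC == 1: SCTLR.M behaves as 0, so stage 1 translation is disabled.
    stage1_enabled = bool(sctlr_m) and not dc
    # DC == 1: HCR.VM behaves as 1, so stage 2 translation is enabled.
    stage2_enabled = bool(hcr & HCR_VM) or dc
    return stage1_enabled, stage2_enabled
```

For example, a guest that has set SCTLR.M while the hypervisor has HCR.DC set to 1 runs with stage 1 disabled and stage 2 enabled.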


BSU, bits [11:10] 


Barrier Shareability upgrade. This field determines the minimum shareability domain that is applied 
to any barrier instruction executed from Non-secure EL1 or Non-secure ELO: 


00 No effect 


01 Inner Shareable 
10 Outer Shareable 
11 Full system 


This value is combined with the specified level of the barrier held in its instruction, using the same 
principles as combining the shareability attributes from two stages of address translation. 


When this register has an architecturally-defined reset value, this field resets to 0. 
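
One way to read the BSU rule is that the barrier takes effect over the wider of the two domains: the one encoded in the instruction and the minimum set by BSU. The sketch below models that reading only. The ordering of domains and the function name are assumptions made for illustration, not a statement of the architectural pseudocode.

```python
# Domains ordered from narrowest to widest; the index doubles as the BSU encoding,
# with BSU == 0b00 ("No effect") matching the narrowest entry.
DOMAINS = ["Non-shareable", "Inner Shareable", "Outer Shareable", "Full system"]


def upgraded_barrier_domain(barrier_domain: str, bsu: int) -> str:
    """Return the domain a Non-secure EL1/EL0 barrier applies to after the BSU upgrade."""
    if not 0 <= bsu <= 3:
        raise ValueError("BSU is a 2-bit field")
    return DOMAINS[max(DOMAINS.index(barrier_domain), bsu)]
```

Under this reading, BSU can widen a barrier but never narrow it, so a Full system barrier is unaffected by any BSU value.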
FB, bit [9] 

Force broadcast. Causes the following instructions to be broadcast within the Inner Shareable 
domain when executed from Non-secure EL1: 

BPIALL, TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID, 
ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, ICIALLU, TLBIMVAL, TLBIMVAAL. 

0 This field has no effect on the operation of the specified instructions. 

1 When one of the specified instructions is executed at Non-secure EL1, the instruction is 
broadcast within the Inner Shareable shareability domain. 

When this register has an architecturally-defined reset value, this field resets to 0. 


VA, bit [8] 
Virtual SError interrupt exception. 
0 This mechanism is not making a virtual SError interrupt pending. 
1 A virtual SError interrupt is pending because of this mechanism. 
The virtual SError interrupt is only enabled when the value of HCR.{TGE, AMO} is {0, 1}. 
The Guest OS cannot distinguish the virtual exception from the corresponding physical exception. 


When this register has an architecturally-defined reset value, this field resets to 0. 


VI, bit [7] 
Virtual IRQ exception. 
0 This mechanism is not making a virtual IRQ pending. 
1 A virtual IRQ is pending because of this mechanism. 
The virtual IRQ is only enabled when the value of HCR.{TGE, IMO} is {0, 1}. 
The Guest OS cannot distinguish the virtual exception from the corresponding physical exception. 


When this register has an architecturally-defined reset value, this field resets to 0. 


VF, bit [6] 
Virtual FIQ exception. 
0 This mechanism is not making a virtual FIQ pending. 
1 A virtual FIQ is pending because of this mechanism. 
The virtual FIQ is only enabled when the value of HCR.{TGE, FMO} is {0, 1}. 
The Guest OS cannot distinguish the virtual exception from the corresponding physical exception. 


When this register has an architecturally-defined reset value, this field resets to 0. 
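
The enable conditions for the three virtual interrupts follow the same pattern, which the sketch below makes explicit. It reports, for a raw HCR value, which virtual interrupts are both pending through the VA, VI, or VF bit and enabled by HCR.{TGE, xMO} being {0, 1}. The function name is hypothetical, and the model ignores the CPSR mask bits and any IMPLEMENTATION DEFINED signaling mechanism.

```python
HCR_TGE_BIT = 27

# name -> (pending bit, mask override bit)
VIRTUAL_INTERRUPTS = {
    "SError": (8, 5),  # VA, AMO
    "IRQ": (7, 4),     # VI, IMO
    "FIQ": (6, 3),     # VF, FMO
}


def signaled_virtual_interrupts(hcr: int) -> dict:
    """Map each virtual interrupt to True if HCR makes it pending and enabled."""
    tge = (hcr >> HCR_TGE_BIT) & 1
    result = {}
    for name, (pend_bit, override_bit) in VIRTUAL_INTERRUPTS.items():
        pending = (hcr >> pend_bit) & 1
        override = (hcr >> override_bit) & 1
        # Enabled only when HCR.{TGE, xMO} == {0, 1}.
        result[name] = bool(pending and tge == 0 and override == 1)
    return result
```

Setting HCR.TGE to 1 therefore suppresses all three, regardless of the pending and override bits.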


AMO, bit [5] 


SError interrupt Mask Override. When this bit is set to 1, it overrides the effect of CPSR.A, and 
enables virtual exception signaling by the VA bit. 


If the value of HCR.TGE is 0, then Virtual SError interrupts are enabled in the Non-secure state. 


If the value of HCR.TGE is 1, then in Non-secure state the HCR.AMO bit behaves as 1 for all 
purposes other than a direct read of the value of the bit. 

When this register has an architecturally-defined reset value, this field resets to 0. 


IMO, bit [4] 


IRQ Mask Override. When this bit is set to 1, it overrides the effect of CPSR.I, and enables virtual 
exception signaling by the VI bit. 


If the value of HCR.TGE is 0, then Virtual IRQ interrupts are enabled in the Non-secure state. 


If the value of HCR.TGE is 1, then in Non-secure state the HCR.IMO bit behaves as 1 for all 
purposes other than a direct read of the value of the bit. 


When this register has an architecturally-defined reset value, this field resets to 0. 
FMO, bit [3] 

FIQ Mask Override. When this bit is set to 1, it overrides the effect of CPSR.F, and enables virtual 
exception signaling by the VF bit. 

If the value of HCR.TGE is 0, then Virtual FIQ interrupts are enabled in the Non-secure state. 

If the value of HCR.TGE is 1, then in Non-secure state the HCR.FMO bit behaves as 1 for all 
purposes other than a direct read of the value of the bit. 

When this register has an architecturally-defined reset value, this field resets to 0. 

PTW, bit [2] 

Protected Table Walk. In the Non-secure PL1&0 translation regime, a translation table access made 
as part of a stage 1 translation table walk is subject to a stage 2 translation. The combining of the 
memory type attributes from the two stages of translation means the access might be made to a type 
of Device memory. If this occurs then the value of this bit determines the behavior: 

0 The translation table walk occurs as if it is to Normal Non-cacheable memory. This 
means it can be made speculatively. 

1 The memory access generates a stage 2 Permission fault. 

This field is permitted to be cached in a TLB. 

When this register has an architecturally-defined reset value, this field resets to 0. 

SWIO, bit [1] 

Set/Way Invalidation Override. Causes Non-secure EL1 execution of the data cache invalidate by 
set/way instructions to be treated as data cache clean and invalidate by set/way. 

0 This control has no effect on the operation of data cache invalidate by set/way 
instructions. 

1 Data cache invalidate by set/way instructions operate as data cache clean and invalidate 
by set/way. 

When this bit is set to 1, DCISW is executed as DCCISW. 

As a result of changes to the behavior of DCISW, this bit is redundant in ARMv8. This bit can be 
implemented as RES1. 

When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 

VM, bit [0] 

Virtualization enable. Enables stage 2 address translation for the Non-secure EL1&0 translation 
regime. Possible values of this bit are: 

0 Non-secure EL1&0 stage 2 address translation disabled. 
1 Non-secure EL1&0 stage 2 address translation enabled. 


If the HCR.DC bit is set to 1, then the behavior of the PE when executing in a Non-secure mode 
other than Hyp mode is consistent with HCR.VM being 1, regardless of the actual value of 
HCR.VM, other than the value returned by an explicit read of HCR.VM. 


When the value of this bit is 1, data cache invalidate instructions executed at Non-secure EL1 
operate as data cache clean and invalidate instructions. For the invalidate by set/way instruction this 
behavior applies regardless of the value of the HCR.SWIO bit. 


This bit is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the HCR: 


To access the HCR: 


MRC p15,4,<Rt>,c1,c1,0 ; Read HCR into Rt 
MCR p15,4,<Rt>,c1,c1,0 ; Write Rt to HCR 
Register access is encoded as follows: 

coproc  opc1  CRn   CRm   opc2 
1111    100   0001  0001  000 
G6.2.59 HCR2, Hyp Configuration Register 2 
The HCR2 characteristics are: 
Purpose 
Provides additional configuration controls for virtualization. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2 (NS) 
-    -    RW 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
Configurations 
AArch32 System register HCR2 is architecturally mapped to AArch64 System register 
HCR_EL2[63:32]. 
If EL2 is not implemented, this register is RES0 from EL3. 
Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 
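
Because HCR maps to HCR_EL2[31:0] and HCR2 maps to HCR_EL2[63:32], the two AArch32 registers can be viewed as halves of one 64-bit value. The helper names below are hypothetical and the sketch is purely arithmetic. It says nothing about reset values or access permissions.

```python
MASK32 = 0xFFFFFFFF


def hcr_el2_from_aarch32(hcr: int, hcr2: int) -> int:
    """Combine the AArch32 HCR and HCR2 values into the 64-bit HCR_EL2 view."""
    return ((hcr2 & MASK32) << 32) | (hcr & MASK32)


def aarch32_views(hcr_el2: int) -> tuple:
    """Split a 64-bit HCR_EL2 value into its (HCR, HCR2) AArch32 views."""
    return hcr_el2 & MASK32, (hcr_el2 >> 32) & MASK32


# HCR.VM set (bit 0) and HCR2.CD set (bit 0 of HCR2, so bit 32 of HCR_EL2).
combined = hcr_el2_from_aarch32(0x1, 0x1)
```

In this view, HCR2.CD at bit [0] and HCR2.ID at bit [1] correspond to bits [32] and [33] of the combined value.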
Attributes 
HCR2 is a 32-bit register. 
Field descriptions 
The HCR2 bit assignments are: 
31        7  6        5    2  1   0 
RES0         MIOCNCE  RES0    ID  CD 
Bits [31:7] 
Reserved, RES0. 

MIOCNCE, bit [6] 
Mismatched Inner/Outer Cacheable Non-Coherency Enable, for the Non-secure PL1&0 translation 
regime. 

0 For the Non-secure PL1&0 translation regime, for permitted accesses to a memory 
location that use a common definition of the Shareability and Cacheability of the 
location, there must be no loss of coherency if the Inner Cacheability attribute for those 
accesses differs from the Outer Cacheability attribute. 

1 For the Non-secure PL1&0 translation regime, for permitted accesses to a memory 


location that use a common definition of the Shareability and Cacheability of the 
location, there might be a loss of coherency if the Inner Cacheability attribute for those 
accesses differs from the Outer Cacheability attribute. 


For more information see Mismatched memory attributes on page E2-2352. 


The value of this field has no effect on translation regimes other than the Non-secure PL1&0 
translation regime. 


This field can be implemented as RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


Bits [5:2] 


Reserved, RES0. 


ID, bit [1] 


Stage 2 Instruction access cacheability disable. For the Non-secure PL1&0 translation regime, when 
HCR.VM==1, this control forces all stage 2 translations for instruction accesses to Normal memory 
to be Non-cacheable. 


0 This control has no effect on stage 2 of the Non-secure PL1&0 translation regime. 


1 For the Non-secure PL1&0 translation regime, forces all stage 2 translations for 
instruction accesses to Normal memory to be Non-cacheable. 


This bit has no effect on the EL2 translation regime. 


When this register has an architecturally-defined reset value, this field resets to 0. 


CD, bit [0] 


Stage 2 Data access cacheability disable. When HCR.VM==1, this forces all stage 2 translations for 
data accesses and translation table walks to Normal memory to be Non-cacheable for the 
Non-secure PL1&0 translation regime. 


0 This control has no effect on stage 2 of the Non-secure PL1&0 translation regime for 
data accesses and translation table walks. 


1 For the Non-secure PL1&0 translation regime, forces all stage 2 translations for data 
accesses and translation table walks to Normal memory to be Non-cacheable. 


This bit has no effect on the EL2 translation regime. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the HCR2: 
To access the HCR2: 


MRC p15,4,<Rt>,c1,c1,4 ; Read HCR2 into Rt 
MCR p15,4,<Rt>,c1,c1,4 ; Write Rt to HCR2 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2 
1111    100   0001  0001  100 
G6.2.60 HDFAR, Hyp Data Fault Address Register 
The HDFAR characteristics are: 
Purpose 
Holds the virtual address of the faulting address that caused a synchronous Data Abort exception 
that is taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    -    RW
Any execution in a Non-secure EL1 mode, or in Secure state, makes the HDFAR UNKNOWN. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
° If HSTR.T6==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T6==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
Configurations 
AArch32 System register HDFAR is architecturally mapped to AArch64 System register 
FAR_EL2[31:0]. 
AArch32 System register HDFAR is architecturally mapped to AArch32 System register DFAR (S) 
when EL2 is implemented. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HDFAR is a 32-bit register. 
Field descriptions 
The HDFAR bit assignments are: 
31 0 
VA of faulting address of synchronous Data Abort exception taken to Hyp mode 
Bits [31:0] 
VA of faulting address of synchronous Data Abort exception taken to Hyp mode. 
On a Prefetch Abort exception, this register is UNKNOWN.


Any execution in a Non-secure EL1 mode, or in Secure state, makes this register UNKNOWN. 


Accessing the HDFAR: 
To access the HDFAR: 


MRC p15,4,<Rt>,c6,c0,0 ; Read HDFAR into Rt
MCR p15,4,<Rt>,c6,c0,0 ; Write Rt to HDFAR


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   0110  0000  000
G6.2.61 HIFAR, Hyp Instruction Fault Address Register
The HIFAR characteristics are: 
Purpose 
Holds the virtual address of the faulting address that caused a synchronous Prefetch Abort exception 
that is taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    -    RW
Any execution in a Non-secure EL1 mode, or in Secure state, makes the HIFAR UNKNOWN. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
° If HSTR.T6==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T6==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HIFAR is architecturally mapped to AArch64 System register 
FAR_EL2[63:32]. 
AArch32 System register HIFAR is architecturally mapped to AArch32 System register IFAR (S)
when EL2 is implemented. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HIFAR is a 32-bit register. 
Field descriptions 
The HIFAR bit assignments are: 
31 0 
VA of faulting address of synchronous Prefetch Abort exception taken to Hyp mode 
Bits [31:0] 
VA of faulting address of synchronous Prefetch Abort exception taken to Hyp mode. 
On a Data Abort exception, this register is UNKNOWN.


Any execution in a Non-secure EL1 mode, or in Secure state, makes this register UNKNOWN. 


Accessing the HIFAR: 
To access the HIFAR: 


MRC p15,4,<Rt>,c6,c0,2 ; Read HIFAR into Rt
MCR p15,4,<Rt>,c6,c0,2 ; Write Rt to HIFAR


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   0110  0000  010
G6.2.62 HMAIR0, Hyp Memory Attribute Indirection Register 0
The HMAIR0 characteristics are:


Purpose 


Along with HMAIR1, provides the memory attribute encodings corresponding to the possible 
AttrIndx values in a Long-descriptor format translation table entry for stage 1 translations for 
memory accesses from Hyp mode. 


AttrIndx[2] indicates the HMAIR register to be used: 
° When AttrIndx[2] is 0, HMAIR0 is used.
° When AttrIndx[2] is 1, HMAIR1 is used. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    RW





AttrIndx[2], from the translation table descriptor, selects the appropriate HMAIR: setting 
AttrIndx[2] to 0 selects HMAIR0.
Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


° If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register HMAIR0 is architecturally mapped to AArch64 System register
MAIR_EL2[31:0]. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
HMAIR0 is a 32-bit register.


Field descriptions 


The HMAIR0 bit assignments are:

31     24 23     16 15      8 7       0
  Attr3     Attr2     Attr1     Attr0


Attr<n>, bits [8n+7:8n], for n = 0 to 3 


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation 
table entry, where: 


° AttrIndx[2:0] gives the value of <n> in Attr<n>. 
° AttrIndx[2] defines which MAIR to access. Attr7 to Attr4 are in MAIR1, and Attr3 to Attr0
are in MAIR0.


Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.


The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.
The R and W bits in some Attr<n> fields have the following meanings:

R or W   Meaning
0        No Allocate
1        Allocate
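
The Attr<n> encodings above can be exercised with a small decoder. The following is a minimal sketch, not part of the architecture definition: the function name decode_attr and the wording of the returned strings are hypothetical, and it assumes that in an xxRW nibble R is bit [1] and W is bit [0].

```python
def decode_attr(attr):
    """Decode one 8-bit Attr<n> field into a descriptive string."""
    if not 0 <= attr <= 0xFF:
        raise ValueError("Attr<n> is an 8-bit field")
    hi, lo = attr >> 4, attr & 0xF

    if hi == 0b0000:
        # Device memory: bits [3:0] select the Device type, other values UNPREDICTABLE.
        device = {0b0000: "Device-nGnRnE", 0b0100: "Device-nGnRE",
                  0b1000: "Device-nGRE", 0b1100: "Device-GRE"}
        return device.get(lo, "UNPREDICTABLE")

    def cacheability(nibble, level):
        if nibble == 0b0000:
            return None  # only reachable for the Inner nibble: UNPREDICTABLE
        if nibble == 0b0100:
            return f"{level} Non-cacheable"
        policy = {0b00: "Write-Through transient",
                  0b01: "Write-Back transient",
                  0b10: "Write-Through non-transient",
                  0b11: "Write-Back non-transient"}[nibble >> 2]
        r = "Allocate" if nibble & 0b10 else "No Allocate"
        w = "Allocate" if nibble & 0b01 else "No Allocate"
        return f"{level} {policy} (R={r}, W={w})"

    outer, inner = cacheability(hi, "Outer"), cacheability(lo, "Inner")
    if inner is None:
        return "UNPREDICTABLE"
    return f"Normal memory, {outer}, {inner}"
```

For example, decode_attr(0x44) describes Normal memory that is Non-cacheable at both the Outer and Inner levels, and decode_attr(0x04) describes Device-nGnRE memory.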





Accessing the HMAIR0:
To access the HMAIR0:


MRC p15,4,<Rt>,c10,c2,0 ; Read HMAIR0 into Rt
MCR p15,4,<Rt>,c10,c2,0 ; Write Rt to HMAIR0


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1010  0010  000
G6.2.63 HMAIR1, Hyp Memory Attribute Indirection Register 1
The HMAIR1 characteristics are:


Purpose 


Along with HMAIR0, provides the memory attribute encodings corresponding to the possible
AttrIndx values in a Long-descriptor format translation table entry for stage 1 translations for 
memory accesses from Hyp mode. 


AttrIndx[2] indicates the HMAIR register to be used: 
° When AttrIndx[2] is 0, HMAIR0 is used.
° When AttrIndx[2] is 1, HMAIR1 is used. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    RW





AttrIndx[2], from the translation table descriptor, selects the appropriate HMAIR: setting 
AttrIndx[2] to 1 selects HMAIR1. 
Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


° If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register HMAIR1 is architecturally mapped to AArch64 System register 
MAIR_EL2[63:32]. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
HMAIR1 is a 32-bit register. 


Field descriptions 


The HMAIR1 bit assignments are:
When TTBCR.EAE==1:

31     24 23     16 15      8 7       0
  Attr7     Attr6     Attr5     Attr4


Attr<n>, bits [8(n-4)+7:8(n-4)], for n = 4 to 7 


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation 
table entry, where: 


° AttrIndx[2:0] gives the value of <n> in Attr<n>. 
° AttrIndx[2] defines which MAIR to access. Attr7 to Attr4 are in MAIR1, and Attr3 to Attr0
are in MAIR0.
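
The selection rule in the list above can be sketched as follows. This is an illustrative model only: the function name select_attr is hypothetical, and the two register values are passed in as plain 32-bit integers.

```python
def select_attr(attr_indx, hmair0, hmair1):
    """Return the 8-bit Attr<n> field selected by a 3-bit AttrIndx value."""
    if not 0 <= attr_indx <= 7:
        raise ValueError("AttrIndx is a 3-bit field")
    # AttrIndx[2] picks the register: 0 selects HMAIR0, 1 selects HMAIR1.
    reg = hmair1 if attr_indx & 0b100 else hmair0
    # AttrIndx[1:0] picks one of the four 8-bit fields within that register.
    shift = 8 * (attr_indx & 0b011)
    return (reg >> shift) & 0xFF
```

With HMAIR0 holding 0x44FF0400 and HMAIR1 holding 0xBB00AA88, an AttrIndx value of 2 selects Attr2 (0xFF) from HMAIR0 and a value of 5 selects Attr5 (0xAA) from HMAIR1.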


Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.


The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.
The R and W bits in some Attr<n> fields have the following meanings:

R or W   Meaning
0        No Allocate
1        Allocate





Accessing the HMAIR1: 
To access the HMAIR1:


MRC p15,4,<Rt>,c10,c2,1 ; Read HMAIR1 into Rt 
MCR p15,4,<Rt>,c10,c2,1 ; Write Rt to HMAIR1 


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1010  0010  001
G6.2.64 HPFAR, Hyp IPA Fault Address Register
The HPFAR characteristics are: 
Purpose 
Holds the faulting IPA for some aborts on a stage 2 translation taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    -    RW
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
° If HSTR.T6==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
° If HSTR_EL2.T6==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
Configurations 
AArch32 System register HPFAR is architecturally mapped to AArch64 System register 
HPFAR_EL2[31:0]. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HPFAR is a 32-bit register. 
Field descriptions 
The HPFAR bit assignments are: 
31                      4 3     0
      FIPA[39:12]          RES0
Execution in any Non-secure mode other than Hyp mode makes this register UNKNOWN. 
FIPA[39:12], bits [31:4] 
Bits [39:12] of the faulting intermediate physical address. 
Bits [3:0] 
Reserved, RES0.
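
Because FIPA[39:12] is held in register bits [31:4], recovering the faulting IPA requires a shift in each direction. The following minimal sketch shows the arithmetic; the function name hpfar_to_ipa_page is hypothetical, and the result is page-aligned because IPA bits [11:0] are not held in this register.

```python
def hpfar_to_ipa_page(hpfar):
    """Return the 4KB-aligned faulting IPA encoded in a 32-bit HPFAR value."""
    fipa = (hpfar >> 4) & 0x0FFFFFFF   # FIPA[39:12] occupies register bits [31:4]
    return fipa << 12                   # place the field back at IPA bits [39:12]
```

For example, an HPFAR value of 0x00000010 corresponds to the IPA page at 0x1000.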
Accessing the HPFAR:
To access the HPFAR: 


MRC p15,4,<Rt>,c6,c0,4 ; Read HPFAR into Rt
MCR p15,4,<Rt>,c6,c0,4 ; Write Rt to HPFAR


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   0110  0000  100
G6.2.65 HRMR, Hyp Reset Management Register
The HRMR characteristics are: 
Purpose 
When this register is implemented: 
° A write to the register can request a Warm reset. 
° If EL2 can use AArch32 and AArch64, this register specifies the Execution state that the PE
boots into on a Warm reset. 
Usage constraints 
This register is accessible as follows: 
EL0  EL1  EL2(NS)
-    -    RW
However, see Configurations for information about whether the register is implemented. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. 
Subject to the prioritization rules: 
° If HSTR.T12==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
° If HSTR_EL2.T12==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HRMR is architecturally mapped to AArch64 System register RMR_EL2. 
Only implemented if EL2 is the highest implemented Exception level. In this case: 
° If EL2 can use AArch32 and AArch64 then this register must be implemented. 
° If EL2 cannot use AArch64 then it is IMPLEMENTATION DEFINED whether the register is 
implemented. 
When this register is not implemented its encoding is UNDEFINED. 
See the field descriptions for the reset values. These apply whenever the register is implemented. 
Attributes 
HRMR is a 32-bit register. 
Field descriptions 
The HRMR bit assignments are: 
31                      2  1    0
         RES0             RR  AA64
Bits [31:2] 
Reserved, RES0.
RR, bit [1]
Reset Request. Setting this bit to 1 requests a Warm reset.


This field resets to 0 on a Warm or Cold reset.


AA64, bit [0]
When EL2 can use AArch64, determines which Execution state the PE boots into after a Warm
reset: 


0 AArch32.
1 AArch64. 


On coming out of the Warm reset, execution starts at the IMPLEMENTATION DEFINED reset vector 
address of the specified Execution state. 


If EL2 cannot use AArch64 this bit is RAZ/WI. 


When implemented as an RW field, this field resets to 0 on a Cold reset. It is not affected by a Warm
reset. 
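
As a brief illustration of the two fields, the following sketch builds the value that software might write to HRMR to request a Warm reset. It assumes that RR is bit [1] and AA64 is bit [0], as the bit assignment diagram indicates; the constant and function names are hypothetical.

```python
HRMR_AA64 = 1 << 0   # assumed position: selects AArch64 (1) or AArch32 (0) after Warm reset
HRMR_RR = 1 << 1     # assumed position: Reset Request


def hrmr_warm_reset_value(to_aarch64):
    """Return an HRMR value that requests a Warm reset into the chosen Execution state."""
    return HRMR_RR | (HRMR_AA64 if to_aarch64 else 0)
```

All other bits are RES0 and are therefore left as zero in the returned value.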


Accessing the HRMR: 


To access the HRMR when EL2 is implemented and EL3 is not implemented:


MRC p15,4,<Rt>,c12,c0,2 ; Read HRMR into Rt
MCR p15,4,<Rt>,c12,c0,2 ; Write Rt to HRMR


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1100  0000  010
G6.2.66 HSCTLR, Hyp System Control Register
The HSCTLR characteristics are: 


Purpose 


Provides top level control of the system operation in Hyp mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


° If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
° If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register HSCTLR is architecturally mapped to AArch64 System register 
SCTLR_EL2. 


If EL2 is not implemented, this register is RES0 from EL3.


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32. Otherwise, RW fields in this register reset to architecturally 
UNKNOWN values. 


Attributes 
HSCTLR is a 32-bit register. 


Field descriptions 


The HSCTLR bit assignments are: 





G6-4388 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


G6 AArch32 System Register Descriptions 
G6.2 General system control registers 


Bits      Field
[31]      RES0
[30]      TE
[29:28]   RES1
[27:26]   RES0
[25]      EE
[24]      RES0
[23:22]   RES1
[21:20]   RES0
[19]      WXN
[18]      RES1
[17]      RES0
[16]      RES1
[15:13]   RES0
[12]      I
[11]      RES1
[10:9]    RES0
[8]       SED
[7]       ITD
[6]       RES0
[5]       CP15BEN
[4:3]     RES1
[2]       C
[1]       A
[0]       M

Bit [31] 

Reserved, RES0.
TE, bit [30] 
T32 Exception Enable. This bit controls whether exceptions to EL2 are taken to A32 or T32 state: 
0 Exceptions, including reset, taken to A32 state.
1 Exceptions, including reset, taken to T32 state. 
This field resets to a value that is architecturally UNKNOWN. 
Bits [29:28] 

Reserved, RES1. 
Bits [27:26] 

Reserved, RES0.
EE, bit [25] 

The value of the PSTATE.E bit on entry to Hyp mode, the endianness of stage 1 translation table
walks in the EL2 translation regime, and the endianness of stage 2 translation table walks in the
PL1&0 translation regime.

The possible values of this bit are:

0 Little-endian. PSTATE.E is cleared to 0 on entry to Hyp mode. Stage 1 translation table
walks in the EL2 translation regime, and stage 2 translation table walks in the PL1&0
translation regime are little-endian.

1 Big-endian. PSTATE.E is set to 1 on entry to Hyp mode. Stage 1 translation table walks
in the EL2 translation regime, and stage 2 translation table walks in the PL1&0
translation regime are big-endian.

If an implementation does not provide Big-endian support at Exception Levels higher than EL0, this
bit is RES0.

If an implementation does not provide Little-endian support at Exception Levels higher than EL0,
this bit is RES1.

When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to an IMPLEMENTATION DEFINED value.

Bit [24] 

Reserved, RES0.
Bits [23:22]


Reserved, RES1. 


Bits [21:20] 


Reserved, RES0.


WXN, bit [19] 


Write permission implies XN (Execute-never). For the EL2 translation regime, this bit can force all 
memory regions that are writable to be treated as XN. The possible values of this bit are: 


0 This control has no effect on memory access permissions. 


1 Any region that is writable in the EL2 translation regime is forced to XN for accesses 
from software executing at EL2. 


The WXN bit is permitted to be cached in a TLB. 


This field resets to a value that is architecturally UNKNOWN. 


Bit [18] 

Reserved, RES1.
Bit [17]

Reserved, RES0.
Bit [16] 


Reserved, RES1. 


Bits [15:13] 


Reserved, RES0.





I, bit [12] 
Instruction access Cacheability control, for accesses at EL2: 
0 All instruction access to Normal memory from EL2 are Non-cacheable for all levels of 
instruction and unified cache. 
If the value of HSCTLR.M is 0, instruction accesses from stage 1 of the EL2 translation 
regime are to Normal, Outer Shareable, Inner Non-cacheable, Outer Non-cacheable 
memory. 
1 All instruction access to Normal memory from EL2 can be cached at all levels of 
instruction and unified cache. 
If the value of HSCTLR.M is 0, instruction accesses from stage 1 of the EL2 translation 
regime are to Normal, Outer Shareable, Inner Write-Through, Outer Write-Through 
memory. 
This bit has no effect on the PL1&0 translation regime. 
When this register has an architecturally-defined reset value, this field resets to 0.
Bit [11] 
Reserved, RES1. 
Bits [10:9] 
Reserved, RES0.
SED, bit [8] 
SETEND instruction disable. Disables SETEND instructions at EL2. 
0 SETEND instruction execution is enabled at EL2.
1 SETEND instructions are UNDEFINED at EL2.
If the implementation does not support mixed-endian operation at EL2, this bit is RES1.
If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.


ITD, bit [7]
IT Disable. Disables some uses of IT instructions at EL2.
0 All IT instruction functionality is enabled at EL2.
1 Any attempt at EL2 to execute any of the following is UNDEFINED:
° All encodings of the IT instruction with hw1[3:0]!=1000.
° All encodings of the subsequent instruction with the following values for hw1:


11xxxxxxxxxxxxxx
    All 32-bit instructions, and the 16-bit instructions B, UDF, SVC,
    LDM, and STM.
1011xxxxxxxxxxxx
    All instructions in Miscellaneous 16-bit instructions on
    page F3-2442.
10100xxxxxxxxxxx
    ADD Rd, PC, #imm
01001xxxxxxxxxxx
    LDR Rd, [PC, #imm]
0100x1xxx1111xxx
    ADD Rdn, PC; CMP Rn, PC; MOV Rd, PC; BX PC; BLX PC.
010001xx1xxxx111
    ADD PC, Rm; CMP PC, Rm; MOV PC, Rm. This pattern also covers
    UNPREDICTABLE cases with BLX Rn.


These instructions are always UNDEFINED, regardless of whether they would pass or fail 
the condition code check that applies to them as a result of being in an IT block. 


It is IMPLEMENTATION DEFINED whether the IT instruction is treated as: 
° A 16-bit instruction, that can only be followed by another 16-bit instruction. 


° The first half of a 32-bit instruction. 


This means that, for the situations that are UNDEFINED, either the second 16-bit 
instruction or the 32-bit instruction is UNDEFINED. 


An implementation might vary dynamically as to whether IT is treated as a 16-bit 
instruction or the first half of a 32-bit instruction. 


If an instruction in an active IT block that would be disabled by this field sets this field to 1 then 
behavior is CONSTRAINED UNPREDICTABLE. For more information see Changes to an ITD control by 
an instruction in an IT block on page E1-2298. 


ITD is optional, but if it is implemented in the SCTLR then it must also be implemented in the 
HSCTLR. If it is not implemented then this bit is RAZ/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 
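
The hw1 bit patterns listed for the ITD field can be checked mechanically. The following sketch is one possible way to do so, assuming the six patterns read as 16-bit strings in which x is a don't-care bit; the names ITD_HW1_PATTERNS, matches, and affected_by_itd are hypothetical. It covers only the patterns for the instruction that follows the IT instruction, not the hw1[3:0] check on the IT instruction itself.

```python
# Patterns for the first halfword of the instruction following IT, most significant bit first.
ITD_HW1_PATTERNS = (
    "11xxxxxxxxxxxxxx",   # 32-bit instructions, and 16-bit B, UDF, SVC, LDM, STM
    "1011xxxxxxxxxxxx",   # Miscellaneous 16-bit instructions
    "10100xxxxxxxxxxx",   # ADD Rd, PC, #imm
    "01001xxxxxxxxxxx",   # LDR Rd, [PC, #imm]
    "0100x1xxx1111xxx",   # ADD Rdn, PC; CMP Rn, PC; MOV Rd, PC; BX PC; BLX PC
    "010001xx1xxxx111",   # ADD PC, Rm; CMP PC, Rm; MOV PC, Rm
)


def matches(pattern, hw1):
    """Return True if the 16-bit value hw1 matches a pattern of '0', '1' and 'x'."""
    for i, ch in enumerate(pattern):
        bit = (hw1 >> (15 - i)) & 1
        if ch != "x" and int(ch) != bit:
            return False
    return True


def affected_by_itd(hw1):
    """Return True if hw1 matches any pattern made UNDEFINED in an IT block when ITD is 1."""
    return any(matches(p, hw1) for p in ITD_HW1_PATTERNS)
```

For example, the halfword 0x4778 (BX PC) matches the fifth pattern, whereas 0x2001 (a 16-bit move of an immediate to R0) matches none of them.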


Bit [6]
Reserved, RES0.


CP15BEN, bit [5]


System instruction memory barrier enable. Enables accesses to the DMB, DSB, and ISB System 
instructions in the (coproc==1111) encoding space from EL2: 


0 EL2 execution of the CP15DMB, CP15DSB, and CP15ISB instructions is UNDEFINED.
1 EL2 execution of the CP15DMB, CP15DSB, and CP15ISB instructions is enabled. 


CP15BEN is optional, but if it is implemented in the SCTLR then it must also be implemented in 
the HSCTLR. If it is not implemented then this bit is RAO/WI. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 







Bits [4:3] 
Reserved, RES1. 
C, bit [2] 
Cacheability control, for data accesses at EL2: 
0 All data access to Normal memory from EL2, and all accesses to the EL2 translation
tables, are Non-cacheable for all levels of data and unified cache. 
1 All data access to Normal memory from EL2, and all accesses to the EL2 translation 
tables, can be cached at all levels of data and unified cache. 
This bit has no effect on the PL1&0 translation regime. 
When this register has an architecturally-defined reset value, this field resets to 0.
A, bit [1] 
Alignment check enable. This is the enable bit for Alignment fault checking at EL2: 
0 Alignment fault checking disabled when executing at EL2. 
Instructions that load or store one or more registers, other than load/store exclusive and 
load-acquire/store-release, do not check that the address being accessed is aligned to the 
size of the data element or data elements being accessed. 
1 Alignment fault checking enabled when executing at EL2. 
All instructions that load or store one or more registers have an alignment check that the 
address being accessed is aligned to the size of the data element or data elements being 
accessed. If this check fails it causes an Alignment fault, which is taken as a Data Abort 
exception. 
Load/store exclusive and load-acquire/store-release instructions have an alignment check regardless 
of the value of the A bit. 
This field resets to a value that is architecturally UNKNOWN. 
M, bit [0] 


MMU enable for EL2 stage 1 address translation. Possible values of this bit are:


0 EL2 stage 1 address translation disabled.
See the HSCTLR.I field for the behavior of instruction accesses to Normal memory.


1 EL2 stage 1 address translation enabled.


When this register has an architecturally-defined reset value, this field resets to 0. 
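
The field positions described above can be collected into a small decoding helper. The following is an illustrative sketch only: the names HSCTLR_FIELDS, HSCTLR_RES1, and decode_hsctlr are hypothetical, and the position of ITD at bit [7] is an assumption, because its field heading is missing from the text above.

```python
# Single-bit control fields and their bit positions, as given in the field descriptions.
HSCTLR_FIELDS = {"TE": 30, "EE": 25, "WXN": 19, "I": 12, "SED": 8,
                 "ITD": 7,            # assumed position
                 "CP15BEN": 5, "C": 2, "A": 1, "M": 0}

# Bits described as RES1: [29:28], [23:22], [18], [16], [11], [4:3].
HSCTLR_RES1 = sum(1 << b for b in (29, 28, 23, 22, 18, 16, 11, 4, 3))


def decode_hsctlr(value):
    """Return a dict mapping each named HSCTLR control bit to 0 or 1."""
    return {name: (value >> bit) & 1 for name, bit in HSCTLR_FIELDS.items()}


# Example value: RES1 bits set, with the I, C, and M controls enabled.
example = HSCTLR_RES1 | (1 << 12) | (1 << 2) | (1 << 0)
fields = decode_hsctlr(example)
```

Starting from the RES1 mask when composing a value to write keeps the reserved bits at the values this section specifies for them.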


Accessing the HSCTLR: 
To access the HSCTLR: 


MRC p15,4,<Rt>,c1,c0,0 ; Read HSCTLR into Rt
MCR p15,4,<Rt>,c1,c0,0 ; Write Rt to HSCTLR


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   0001  0000  000
G6.2.67 HSR, Hyp Syndrome Register
The HSR characteristics are: 
Purpose 
Holds syndrome information for an exception taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    -    RW
Execution in any Non-secure PE mode other than Hyp mode makes this register UNKNOWN. 
When an UNPREDICTABLE instruction is treated as UNDEFINED, and the exception is taken to EL2, 
the value of HSR is UNKNOWN. The value written to HSR must be consistent with a value that could 
be created as a result of an exception from the same Exception level that generated the exception as 
a result of a situation that is not UNPREDICTABLE at that Exception level, in order to avoid the 
possibility of a privilege violation. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
° If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
AArch32 System register HSR is architecturally mapped to AArch64 System register ESR_EL2. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HSR is a 32-bit register. 
Field descriptions 
The HSR bit assignments are: 
31    26 25 24                   0
   EC    IL          ISS
EC, bits [31:26]


Exception Class. Indicates the reason for the exception that this register holds information about. 
Possible values of this field are: 


EC == 000000 

Unknown reason. 

See ISS encoding for exceptions with an unknown reason. 
EC == 000001 

Trapped WFI or WFE instruction execution. 


Conditional WFE and WFI instructions that fail their condition code check do not cause 
an exception. 


See ISS encoding for an exception from a WFI or WFE instruction. 
EC == 000011 


Trapped MCR or MRC access with (coproc==1111) that is not reported using EC 
0b000000. 


See ISS encoding for an exception from an MCR or MRC access. 
EC == 000100 


Trapped MCRR or MRRC access with (coproc==1111) that is not reported using EC 
0b000000. 


See ISS encoding for an exception from an MCRR or MRRC access. 
EC == 000101 

Trapped MCR or MRC access with (coproc==1110). 

See ISS encoding for an exception from an MCR or MRC access. 
EC == 000110 

Trapped LDC or STC access. 

The only architected uses of these instructions are: 

° An STC to write data to memory from DBGDTRRXint. 

° An LDC to read data from memory to DBGDTRTXint. 

See ISS encoding for an exception from an LDC or STC instruction. 
EC == 000111 


Access to Advanced SIMD or floating-point functionality trapped by a HCPTR.{TASE, 
TCP10} control. 


Excludes exceptions generated because Advanced SIMD and floating-point are not 
implemented. These are reported with EC value 0b000000. 


See ISS encoding for an exception from an access to SIMD or floating-point 
functionality, resulting from HCPTR. 


EC == 001000 
Trapped VMRS access, from ID group trap, that is not reported using EC 0b000111. 
See ISS encoding for an exception from an MCR or MRC access. 
EC == 001100 
Trapped MRRC access with (coproc==1110). 
See ISS encoding for an exception from an MCRR or MRRC access. 
EC == 001110 
Illegal exception return to AArch32 state. 
See ISS encoding for an exception from an Illegal state or PC alignment fault. 
EC == 010001 
Exception on SVC instruction execution in AArch32 state routed to EL2. 
See ISS encoding for an exception from HVC or SVC instruction execution. 
EC == 010010 
HVC instruction execution in AArch32 state, when HVC is not disabled. 
See ISS encoding for an exception from HVC or SVC instruction execution.
EC == 010011 

Trapped execution of SMC instruction in AArch32 state. 

See ISS encoding for an exception from SMC instruction execution. 
EC == 100000 

Prefetch Abort from a lower Exception level. 

See ISS encoding for an exception from a Prefetch Abort. 
EC == 100001 

Prefetch Abort taken without a change in Exception level. 

See ISS encoding for an exception from a Prefetch Abort. 
EC == 100010 

PC alignment fault exception. 

See ISS encoding for an exception from an Illegal state or PC alignment fault. 
EC == 100100 

Data Abort from a lower Exception level. 

See ISS encoding for an exception from a Data Abort. 
EC == 100101 

Data Abort taken without a change in Exception level. 

See ISS encoding for an exception from a Data Abort. 


All other EC values are reserved by ARM, and: 


• Unused values in the range 0b000000 - 0b101100 (0x00 - 0x2C) are reserved for future use for
synchronous exceptions.


• Unused values in the range 0b101101 - 0b111111 (0x2D - 0x3F) are reserved for future use, and
might be used for synchronous or asynchronous exceptions.


The effect of programming this field to a reserved value is that behavior is CONSTRAINED 
UNPREDICTABLE, as described in Reserved values in System and memory-mapped registers and 
translation table entries on page K1-5492. 

IL, bit [25] 


Instruction length bit. Indicates the size of the instruction that has been trapped to Hyp mode. When 
this bit is valid, possible values of this bit are: 


0 16-bit instruction trapped. 

1 32-bit instruction trapped. 

This field is RES1 and not valid for the following cases: 

• When the EC field is 0b000000, indicating an exception with an unknown reason.

• Prefetch Aborts.

• Data Aborts that do not have valid ISS information, or for which the ISS is not valid.

• When the EC value is 0b001110, indicating an Illegal state exception.


Note

This is a change from the behavior in ARMv7, where the IL field is UNK/SBZP for the
corresponding cases.

The IL field is not valid and is UNKNOWN on an exception from a PC alignment fault. 


ISS, bits [24:0] 


Instruction Specific Syndrome. Architecturally, this field can be defined independently for each 
defined Exception class. However, in practice, some ISS encodings are used for more than one 
Exception class. 
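
As an illustration of the bit assignments above, the following minimal sketch splits a 32-bit HSR value into its EC, IL, and ISS fields and looks up the Exception Class description. The names `split_hsr` and `EC_NAMES` are hypothetical and are not part of any ARM-provided interface. The table lists only the EC values given in this section.

```python
# Illustrative sketch only: split_hsr and EC_NAMES are hypothetical names.
EC_NAMES = {
    0b000000: "Unknown reason",
    0b000001: "Trapped WFI or WFE instruction execution",
    0b000011: "Trapped MCR or MRC access (coproc==1111)",
    0b000100: "Trapped MCRR or MRRC access (coproc==1111)",
    0b000101: "Trapped MCR or MRC access (coproc==1110)",
    0b000110: "Trapped LDC or STC access",
    0b000111: "Advanced SIMD or floating-point access trapped by HCPTR",
    0b001000: "Trapped VMRS access, from ID group trap",
    0b001100: "Trapped MRRC access (coproc==1110)",
    0b001110: "Illegal exception return to AArch32 state",
    0b010001: "SVC instruction execution routed to EL2",
    0b010010: "HVC instruction execution",
    0b010011: "Trapped SMC instruction execution",
    0b100000: "Prefetch Abort from a lower Exception level",
    0b100001: "Prefetch Abort taken without a change in Exception level",
    0b100010: "PC alignment fault exception",
    0b100100: "Data Abort from a lower Exception level",
    0b100101: "Data Abort taken without a change in Exception level",
}


def split_hsr(hsr):
    """Split a 32-bit HSR value into EC [31:26], IL [25], and ISS [24:0]."""
    hsr &= 0xFFFFFFFF
    ec = (hsr >> 26) & 0x3F
    return {
        "EC": ec,
        "IL": (hsr >> 25) & 0x1,
        "ISS": hsr & 0x1FFFFFF,
        # Any EC value not listed in this section is treated as reserved.
        "reason": EC_NAMES.get(ec, "Reserved"),
    }
```

Software that handles the exception would typically select an ISS decoder based on the returned EC value, because the ISS layout differs between Exception classes.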


The following subsections describe each ISS format. 

ISS encoding for exceptions with an unknown reason


This encoding is used by: 


Unknown reason. 


The ISS encoding for these exceptions is: 


Bits [24:0]
Reserved, RES0.


This EC code is used for all exceptions that are not covered by any other EC value. This includes exceptions that 
are generated in the following situations: 


The attempted execution of an instruction bit pattern that has no allocated instruction in the current PE mode 
in the current Security state, including: 


— A read access using a System register encoding pattern that is not allocated for reads at the current 
Exception level and Security state. 


— A write access using a System register encoding pattern that is not allocated for writes at the current
Exception level and Security state.


— Instruction encodings for instructions not implemented in the implementation.


In Debug state, the attempted execution of an instruction bit pattern that is unallocated in Debug state.


In Non-debug state, the attempted execution of an instruction bit pattern that is unallocated in Non-debug 
state. 


The attempted execution of a short vector floating-point instruction. 


In an implementation that does not include Advanced SIMD and floating-point functionality, an attempted 
access to Advanced SIMD or floating-point functionality under conditions where that access would be 
permitted if that functionality was present. This includes the attempted execution of an Advanced SIMD or 
floating-point instruction, and attempted accesses to Advanced SIMD and floating-point System registers. 


An exception generated because of the value of one of the SCTLR.{ITD, SED, CP15BEN} control bits.


Attempted execution of: 

— An HVC instruction when disabled by HCR.HCD, SCR.HCE, or SCR_EL3.HCE. 

— An SMC instruction when disabled by SCR.SCD or SCR_EL3.SMD.

— An HLT instruction when disabled by EDSCR.HDE.

An exception generated because of the attempted execution of an MSR (Banked register) or MRS (Banked
register) instruction that would access a Banked register that is not accessible from the Security state and PE
mode at which the instruction was executed.


Note

An exception is generated only if the CONSTRAINED UNPREDICTABLE behavior of the instruction is that it is
UNDEFINED, see MSR/MRS Banked registers on page K1-5477.

Attempted execution, in Debug state, of: 


— A DCPS1 instruction in Non-secure state from EL0 when EL2 is using AArch32 and the value of
HCR.TGE is 1.

— A DCPS2 instruction at EL1 or EL0 when EL2 is not implemented, or when EL3 is using AArch32
and the value of SCR.NS is 0, or when EL3 is using AArch64 and the value of SCR_EL3.NS is 0.

— A DCPS3 instruction when EL3 is not implemented, or when the value of EDSCR.SDD is 1.


• In Debug state when the value of EDSCR.SDD is 1, the attempted execution at EL2, EL1, or EL0 of an
instruction that is configured to trap to EL3.


Undefined Instruction exception, when the value of HCR.TGE is 1 on page G1-3829 describes the configuration
settings for a trap that returns an HSR.EC value of 0b000000.


ISS encoding for an exception from a WFI or WFE instruction 
This encoding is used by: 


• Trapped WFI or WFE instruction execution.


Conditional WFE and WFI instructions that fail their condition code check do not cause an exception. 


The ISS encoding for these exceptions is: 


Bit [24] CV, bits [23:20] COND, bits [19:1] RES0, bit [0] TI.


CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid.
1 The COND field is valid.

When an A32 instruction is trapped, CV is set to 1.
When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or set to
0. See the description of the COND field for more information.
COND, bits [23:20]
The condition code for the trapped instruction.

When an A32 instruction is trapped, CV is set to 1 and:

• If the instruction is conditional, COND is set to the condition code field value from the
instruction.
• If the instruction is unconditional, COND is set to 0b1110.

A conditional A32 instruction that is known to pass its condition code check can be presented either:
• With COND set to 0b1110, the value for unconditional.
• With the COND value held in the instruction.

When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:

• CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the SPSR.IT
field to determine the condition, if any, of the T32 instruction.
• CV is set to 1 and COND is set to the condition code for the condition that applied to the
instruction.

For an implementation that, for both A32 and T32 instructions, takes an exception on a trapped
conditional instruction only if the instruction passes its condition code check, these definitions mean
that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND field is set to 0b1110,
or to the value of any condition that applied to the instruction.

Bits [19:1]
Reserved, RES0.
TI, bit [0]
Trapped instruction. Possible values of this bit are:
0 WFI trapped.
1 WFE trapped.


Traps to Hyp mode of Non-secure EL0 and EL1 execution of WFE and WFI instructions on page G1-3904 describes
the configuration settings for this trap.
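
The CV, COND, and TI fields of this encoding can be extracted as in the following minimal sketch. The function name `decode_wfx_iss` is hypothetical. The sketch reports COND as `None` when CV is 0, because the COND field is then not valid.

```python
def decode_wfx_iss(iss):
    """Decode the ISS for a trapped WFI or WFE (hypothetical helper)."""
    cv = (iss >> 24) & 0x1          # CV, bit [24]
    cond = (iss >> 20) & 0xF        # COND, bits [23:20]
    ti = iss & 0x1                  # TI, bit [0]
    return {
        "CV": cv,
        # COND holds an UNKNOWN value when CV is 0, so it is not reported.
        "COND": cond if cv else None,
        "instruction": "WFE" if ti else "WFI",
    }
```

When CV is 0 for a T32 instruction, software must examine the SPSR.IT field to find any condition that applied, as described for the COND field.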


ISS encoding for an exception from an MCR or MRC access 

This encoding is used by: 

• Trapped MCR or MRC access with (coproc==1111) that is not reported using EC 0b000000.
• Trapped MCR or MRC access with (coproc==1110).
• Trapped VMRS access, from ID group trap, that is not reported using EC 0b000111.


The ISS encoding for these exceptions is: 


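Bit [24] CV, bits [23:20] COND, bits [19:17] Opc2, bits [16:14] Opc1, bits [13:10] CRn, bit [9] RES0,
bits [8:5] Rt, bits [4:1] CRm, bit [0] Direction.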


CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid.
1 The COND field is valid.

When an A32 instruction is trapped, CV is set to 1.
When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or set to
0. See the description of the COND field for more information.
COND, bits [23:20]
The condition code for the trapped instruction.

When an A32 instruction is trapped, CV is set to 1 and:

• If the instruction is conditional, COND is set to the condition code field value from the
instruction.
• If the instruction is unconditional, COND is set to 0b1110.

A conditional A32 instruction that is known to pass its condition code check can be presented either:
• With COND set to 0b1110, the value for unconditional.
• With the COND value held in the instruction.

When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:

• CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the SPSR.IT
field to determine the condition, if any, of the T32 instruction.
• CV is set to 1 and COND is set to the condition code for the condition that applied to the
instruction.

For an implementation that, for both A32 and T32 instructions, takes an exception on a trapped
conditional instruction only if the instruction passes its condition code check, these definitions mean
that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND field is set to 0b1110,
or to the value of any condition that applied to the instruction.


Opc2, bits [19:17] 
The Opc2 value from the issued instruction. 


For a trapped VMRS access, holds the value 0b000. 


Opc1, bits [16:14]
The Opc1 value from the issued instruction.


For a trapped VMRS access, holds the value 0b111.


CRn, bits [13:10] 
The CRn value from the issued instruction. 


For a trapped VMRS access, holds the reg field from the VMRS instruction encoding. 


Bit [9]
Reserved, RES0.


Rt, bits [8:5]
The Rt value from the issued instruction, the general-purpose register used for the transfer.


CRm, bits [4:1]
The CRm value from the issued instruction.


For a trapped VMRS access, holds the value 0b0000. 


Direction, bit [0] 
Indicates the direction of the trapped instruction. The possible values of this bit are: 
0 Write to System register space. MCR instruction. 


1 Read from System register space. MRC or VMRS instruction. 
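
The field positions described above can be applied as in the following minimal sketch, which recovers the operands of a trapped MCR, MRC, or VMRS access from the ISS. The function name `decode_mcr_mrc_iss` and the `is_read` key are hypothetical, chosen for illustration.

```python
def decode_mcr_mrc_iss(iss):
    """Decode the ISS for a trapped MCR, MRC, or VMRS access (hypothetical helper)."""
    cv = (iss >> 24) & 0x1                        # CV, bit [24]
    return {
        "CV": cv,
        # COND, bits [23:20], is reported only when CV marks it as valid.
        "COND": ((iss >> 20) & 0xF) if cv else None,
        "Opc2": (iss >> 17) & 0x7,                # bits [19:17]
        "Opc1": (iss >> 14) & 0x7,                # bits [16:14]
        "CRn": (iss >> 10) & 0xF,                 # bits [13:10]
        "Rt": (iss >> 5) & 0xF,                   # bits [8:5]
        "CRm": (iss >> 1) & 0xF,                  # bits [4:1]
        # Direction, bit [0]: 1 is a read (MRC or VMRS), 0 is a write (MCR).
        "is_read": bool(iss & 0x1),
    }
```

For example, a trapped unconditional A32 read with Opc1 0, CRn 1, CRm 0, Opc2 0, and Rt 2 would decode with `CRn` equal to 1, `Rt` equal to 2, and `is_read` set. For a trapped VMRS access the Opc2, Opc1, and CRm fields hold the fixed values given above, and CRn holds the reg field.
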
The following sections describe configuration settings for traps that are reported using EC value 0b000011: 
• Traps to Hyp mode of Non-secure EL0 and EL1 accesses to the ID registers on page G1-3901.
• Traps to Hyp mode of Non-secure EL0 and EL1 accesses to lockdown, DMA, and TCM operations on
page G1-3900.
• Traps to Hyp mode of Non-secure EL1 execution of cache maintenance instructions on page G1-3899.
• Traps to Hyp mode of Non-secure EL1 execution of TLB maintenance instructions on page G1-3898.
• Traps to Hyp mode of Non-secure EL1 accesses to the Auxiliary Control Register on page G1-3899.
• Traps to Hyp mode of Non-secure EL0 and EL1 accesses to Performance Monitors registers on
page G1-3912.
• Traps to Hyp mode of Non-secure EL1 accesses to the CPACR on page G1-3906.
• Traps to Hyp mode of Non-secure EL1 accesses to virtual memory control registers on page G1-3897.
• General trapping to Hyp mode of Non-secure EL0 and EL1 accesses to System registers in the
(coproc==0b1111) encoding space on page G1-3908.


The following sections describe configuration settings for traps that are reported using EC value 0b000101: 
• ID group 0, Primary device identification registers on page G1-3902.
• Traps to Hyp mode of Non-secure System register accesses to trace registers on page G1-3907.
• Trapping Non-secure System register accesses to Debug ROM registers on page G1-3910.
• Trapping Non-secure System register accesses to powerdown debug registers on page G1-3910.
• Trapping general Non-secure System register accesses to debug registers on page G1-3911.


The following sections describe configuration settings for traps that are reported using EC value 0b001000:
• ID group 0, Primary device identification registers on page G1-3902.
• ID group 3, Detailed feature identification registers on page G1-3904.

ISS encoding for an exception from an MCRR or MRRC access 

This encoding is used by: 

• Trapped MCRR or MRRC access with (coproc==1111) that is not reported using EC 0b000000.
• Trapped MRRC access with (coproc==1110).


The ISS encoding for these exceptions is: 


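Bit [24] CV, bits [23:20] COND, bits [19:16] Opc1, bits [15:14] RES0, bits [13:10] Rt2, bit [9] RES0,
bits [8:5] Rt, bits [4:1] CRm, bit [0] Direction.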


CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid.
1 The COND field is valid.

When an A32 instruction is trapped, CV is set to 1.
When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or set to
0. See the description of the COND field for more information.
COND, bits [23:20]
The condition code for the trapped instruction.

When an A32 instruction is trapped, CV is set to 1 and:

• If the instruction is conditional, COND is set to the condition code field value from the
instruction.
• If the instruction is unconditional, COND is set to 0b1110.

A conditional A32 instruction that is known to pass its condition code check can be presented either:
• With COND set to 0b1110, the value for unconditional.
• With the COND value held in the instruction.

When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:

• CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the SPSR.IT
field to determine the condition, if any, of the T32 instruction.
• CV is set to 1 and COND is set to the condition code for the condition that applied to the
instruction.

For an implementation that, for both A32 and T32 instructions, takes an exception on a trapped
conditional instruction only if the instruction passes its condition code check, these definitions mean
that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND field is set to 0b1110,
or to the value of any condition that applied to the instruction.







Opc1, bits [19:16]
The Opc1 value from the issued instruction.
Bits [15:14]
Reserved, RES0.
Rt2, bits [13:10]
The Rt2 value from the issued instruction, the second general-purpose register used for the transfer.
Bit [9]
Reserved, RES0.
Rt, bits [8:5]
The Rt value from the issued instruction, the first general-purpose register used for the transfer.
CRm, bits [4:1]
The CRm value from the issued instruction.
Direction, bit [0]
Indicates the direction of the trapped instruction. The possible values of this bit are:
0 Write to System register space. MCRR instruction.
1 Read from System register space. MRRC instruction. 
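
The following minimal sketch applies the field positions above to a trapped MCRR or MRRC access. Compared with the MCR or MRC encoding, Opc1 is four bits wide and a second transfer register, Rt2, replaces CRn. The function name `decode_mcrr_mrrc_iss` and the `is_read` key are hypothetical.

```python
def decode_mcrr_mrrc_iss(iss):
    """Decode the ISS for a trapped MCRR or MRRC access (hypothetical helper)."""
    cv = (iss >> 24) & 0x1                        # CV, bit [24]
    return {
        "CV": cv,
        # COND, bits [23:20], is reported only when CV marks it as valid.
        "COND": ((iss >> 20) & 0xF) if cv else None,
        "Opc1": (iss >> 16) & 0xF,                # bits [19:16]
        "Rt2": (iss >> 10) & 0xF,                 # bits [13:10]
        "Rt": (iss >> 5) & 0xF,                   # bits [8:5]
        "CRm": (iss >> 1) & 0xF,                  # bits [4:1]
        # Direction, bit [0]: 1 is a read (MRRC), 0 is a write (MCRR).
        "is_read": bool(iss & 0x1),
    }
```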


The following sections describe configuration settings for traps that are reported using EC value 0b000100: 
• Traps to Hyp mode of Non-secure EL1 accesses to virtual memory control registers on page G1-3897.
• General trapping to Hyp mode of Non-secure EL0 and EL1 accesses to System registers in the
(coproc==0b1111) encoding space on page G1-3908.


The following sections describe configuration settings for traps that are reported using EC value 0b001100: 
° Traps to Hyp mode of Non-secure System register accesses to trace registers on page G1-3907. 


° Trapping Non-secure System register accesses to Debug ROM registers on page G1-3910. 


ISS encoding for an exception from an LDC or STC instruction 
This encoding is used by: 


• Trapped LDC or STC access.
The only architected uses of these instructions are:
— An STC to write data to memory from DBGDTRRXint.
— An LDC to read data from memory to DBGDTRTXint.


The ISS encoding for these exceptions is: 


Bit [24] CV, bits [23:20] COND, bits [19:12] imm8, bits [11:9] RES0, bits [8:5] Rn, bit [4] Offset,
bits [3:1] AM, bit [0] Direction.





CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid. 
1 The COND field is valid.

When an A32 instruction is trapped, CV is set to 1.
When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or set to
0. See the description of the COND field for more information.
COND, bits [23:20]
The condition code for the trapped instruction.

When an A32 instruction is trapped, CV is set to 1 and:

• If the instruction is conditional, COND is set to the condition code field value from the
instruction.
• If the instruction is unconditional, COND is set to 0b1110.

A conditional A32 instruction that is known to pass its condition code check can be presented either:
• With COND set to 0b1110, the value for unconditional.
• With the COND value held in the instruction.

When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:

• CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the SPSR.IT
field to determine the condition, if any, of the T32 instruction.
• CV is set to 1 and COND is set to the condition code for the condition that applied to the
instruction.

For an implementation that, for both A32 and T32 instructions, takes an exception on a trapped
conditional instruction only if the instruction passes its condition code check, these definitions mean
that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND field is set to 0b1110,
or to the value of any condition that applied to the instruction.

imm8, bits [19:12] 


The immediate value from the issued instruction. 
Bits [11:9]
Reserved, RES0.


Rn, bits [8:5] 


The Rn value from the issued instruction. Valid only when AM[2] is 0, indicating an immediate 
form of the LDC or STC instruction. 


When AM[2] is 1, indicating a literal form of the LDC or STC instruction, this field is UNKNOWN. 


Offset, bit [4] 
Indicates whether the offset is added or subtracted: 
0 Subtract offset.
1 Add offset. 


This bit corresponds to the U bit in the instruction encoding. 


AM, bits [3:1] 


Addressing mode. The permitted values of this field are: 


000 Immediate unindexed.
001 Immediate post-indexed.
010 Immediate offset.
011 Immediate pre-indexed.
100 Literal unindexed.
    LDC instruction in A32 instruction set only.
    For a trapped STC instruction or a trapped T32 LDC instruction this encoding is
    reserved.
110 Literal offset.
    LDC instruction only.
    For a trapped STC instruction, this encoding is reserved.


The values 0b101 and 0b111 are reserved. The effect of programming this field to a reserved value is
that behavior is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and
memory-mapped registers and translation table entries on page K1-5477.


Bit [2] in this subfield indicates the instruction form, immediate or literal. 

Bits [1:0] in this subfield correspond to the bits {P, W} in the instruction encoding. 
Direction, bit [0] 

Indicates the direction of the trapped instruction. The possible values of this bit are: 

0 Write to memory. STC instruction. 


1 Read from memory. LDC instruction. 


Trapping general Non-secure System register accesses to debug registers on page G1-3911 describes the 
configuration settings for the trap that is reported using EC value 0b000110. 
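
The following minimal sketch decodes this encoding, including the AM subfield. It assumes, as stated above, that Rn is meaningful only when AM[2] is 0, and reports it as `None` for the literal forms. The names `decode_ldc_stc_iss` and `AM_NAMES` are hypothetical.

```python
# Permitted AM values as listed above; 0b101 and 0b111 are reserved.
AM_NAMES = {
    0b000: "Immediate unindexed",
    0b001: "Immediate post-indexed",
    0b010: "Immediate offset",
    0b011: "Immediate pre-indexed",
    0b100: "Literal unindexed",
    0b110: "Literal offset",
}


def decode_ldc_stc_iss(iss):
    """Decode the ISS for a trapped LDC or STC access (hypothetical helper)."""
    am = (iss >> 1) & 0x7                         # AM, bits [3:1]
    literal = bool(am & 0b100)                    # AM[2]: 1 is the literal form
    return {
        "imm8": (iss >> 12) & 0xFF,               # bits [19:12]
        # Rn, bits [8:5], is UNKNOWN for the literal form, so it is not reported.
        "Rn": None if literal else (iss >> 5) & 0xF,
        "add_offset": bool((iss >> 4) & 0x1),     # Offset, bit [4] (the U bit)
        "AM": AM_NAMES.get(am, "Reserved"),
        # Direction, bit [0]: 1 is a read from memory (LDC), 0 is a write (STC).
        "instruction": "LDC" if iss & 0x1 else "STC",
    }
```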


ISS encoding for an exception from an access to SIMD or floating-point functionality, resulting 
from HCPTR 


This encoding is used by: 


° Access to Advanced SIMD or floating-point functionality trapped by a HCPTR.{TASE, TCP10} control. 


Excludes exceptions generated because Advanced SIMD and floating-point are not implemented. These are 
reported with EC value 0b000000. 


The ISS encoding for these exceptions is: 


Bit [24] CV, bits [23:20] COND, bits [19:6] RES0, bit [5] TA, bit [4] RES0, bits [3:0] coproc.


Excludes exceptions that occur because Advanced SIMD and floating-point functionality is not implemented, or 
because the value of HCR.TGE or HCR_EL2.TGE is 1. These are reported with EC value 0b000000. 


CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid.
1 The COND field is valid.

When an A32 instruction is trapped, CV is set to 1.
When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or set to
0. See the description of the COND field for more information.
COND, bits [23:20]
The condition code for the trapped instruction.

When an A32 instruction is trapped, CV is set to 1 and:

• If the instruction is conditional, COND is set to the condition code field value from the
instruction.
• If the instruction is unconditional, COND is set to 0b1110.

A conditional A32 instruction that is known to pass its condition code check can be presented either:
• With COND set to 0b1110, the value for unconditional.
• With the COND value held in the instruction.

When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:

• CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the SPSR.IT
field to determine the condition, if any, of the T32 instruction.
• CV is set to 1 and COND is set to the condition code for the condition that applied to the
instruction.

For an implementation that, for both A32 and T32 instructions, takes an exception on a trapped
conditional instruction only if the instruction passes its condition code check, these definitions mean
that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND field is set to 0b1110,
or to the value of any condition that applied to the instruction.

Bits [19:6]
Reserved, RES0.

TA, bit [5] 
Indicates trapped use of Advanced SIMD functionality. The possible values of this bit are: 
0 Exception was not caused by trapped use of Advanced SIMD functionality. 
1 Exception was caused by trapped use of Advanced SIMD functionality. 


Any use of an Advanced SIMD instruction that is not also a floating-point instruction that is trapped 
to Hyp mode because of a trap configured in the HCPTR sets this bit to 1. 


For a list of these instructions, see Controls of Advanced SIMD operation that do not apply to 
floating-point operation on page E1-2306. 


Bit [4]
Reserved, RES0.


coproc, bits [3:0]
When the TA field returns the value 1, this field returns the value 0b1010, otherwise this field is RES0.
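
A minimal sketch of extracting the TA and coproc fields from this encoding follows. The function name `decode_hcptr_iss` is hypothetical. The sketch returns the raw field values and does not interpret coproc beyond the description above.

```python
def decode_hcptr_iss(iss):
    """Decode the ISS for an HCPTR-trapped SIMD or floating-point access (hypothetical helper)."""
    cv = (iss >> 24) & 0x1                        # CV, bit [24]
    return {
        "CV": cv,
        # COND, bits [23:20], is reported only when CV marks it as valid.
        "COND": ((iss >> 20) & 0xF) if cv else None,
        "TA": (iss >> 5) & 0x1,                   # TA, bit [5]
        "coproc": iss & 0xF,                      # coproc, bits [3:0]
    }
```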


The following sections describe the configuration settings for the traps that are reported using EC value 0b000111: 


° General trapping to Hyp mode of Non-secure accesses to the SIMD and floating-point registers on 
page G1-3905. 


° Traps to Hyp mode of Non-secure accesses to Advanced SIMD functionality on page G1-3906. 


ISS encoding for an exception from HVC or SVC instruction execution 


This encoding is used by: 
° Exception on SVC instruction execution in AArch32 state routed to EL2. 
° HVC instruction execution in AArch32 state, when HVC is not disabled. 


The ISS encoding for these exceptions is: 


Bits [24:16] RES0, bits [15:0] imm16.


Bits [24:16]
Reserved, RES0.


imm16, bits [15:0] 
The value of the immediate field from the HVC or SVC instruction. 


For an HVC instruction, this is the value of the imm16 field of the issued instruction. 
For an SVC instruction:
• If the instruction is unconditional, then:
— For the T32 instruction, this field is zero-extended from the imm8 field of the
instruction.
— For the A32 instruction, this field is the bottom 16 bits of the imm24 field of the
instruction.
• If the instruction is conditional, this field is UNKNOWN.


The HVC instruction is unconditional, and a conditional SVC instruction generates an exception only if it passes its 
condition code check. Therefore, the syndrome information for these exceptions does not require conditionality 
information. 
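
The relationship between the instruction immediate and the imm16 field for an unconditional SVC can be sketched as follows. The function names `hvc_svc_imm16` and `expected_svc_imm16` are hypothetical. The sketch does not model the conditional SVC case, for which the field is UNKNOWN.

```python
def hvc_svc_imm16(iss):
    """Extract imm16, bits [15:0], from an HVC or SVC ISS (hypothetical helper)."""
    return iss & 0xFFFF


def expected_svc_imm16(imm, is_a32):
    """Value reported in imm16 for an unconditional SVC with immediate imm."""
    if is_a32:
        # A32: the bottom 16 bits of the imm24 field.
        return imm & 0xFFFF
    # T32: the imm8 field, zero-extended.
    return imm & 0xFF
```

For an A32 SVC with an imm24 value of 0xABCDEF, the top eight bits are not reported, so the imm16 field holds 0xCDEF.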


Supervisor Call exception, when the value of HCR.TGE is 1 on page G1-3829 describes the configuration settings
for the trap reported with EC value 0b010001.


ISS encoding for an exception from SMC instruction execution 
This encoding is used by: 
° Trapped execution of SMC instruction in AArch32 state. 


The ISS encoding for these exceptions is: 


Bit [24] CV, bits [23:20] COND, bit [19] CCKNOWNPASS, bits [18:0] RES0.

CV, bit [24] 
Condition code valid. Possible values of this bit are: 
0 The COND field is not valid. 
1 The COND field is valid. 


When an A32 instruction is trapped, CV is set to 1. 


When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether CV is set to 1 or set to
0. See the description of the COND field for more information.

This field is only valid if CCKNOWNPASS is 1, otherwise it is RES0.


COND, bits [23:20]
The condition code for the trapped instruction.

When an A32 instruction is trapped, CV is set to 1 and:

• If the instruction is conditional, COND is set to the condition code field value from the
instruction.
• If the instruction is unconditional, COND is set to 0b1110.

A conditional A32 instruction that is known to pass its condition code check can be presented either:
• With COND set to 0b1110, the value for unconditional.
• With the COND value held in the instruction.

When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:

• CV is set to 0 and COND is set to an UNKNOWN value. Software must examine the SPSR.IT
field to determine the condition, if any, of the T32 instruction.
• CV is set to 1 and COND is set to the condition code for the condition that applied to the
instruction.

For an implementation that, for both A32 and T32 instructions, takes an exception on a trapped
conditional instruction only if the instruction passes its condition code check, these definitions mean
that when CV is set to 1 it is IMPLEMENTATION DEFINED whether the COND field is set to 0b1110,
or to the value of any condition that applied to the instruction.

This field is only valid if CCKNOWNPASS is 1, otherwise it is RES0.
CCKNOWNPASS, bit [19]
Indicates whether the instruction might have failed its condition code check.
0 The instruction was unconditional, or was conditional and passed its condition code
check.
1 The instruction was conditional, and might have failed its condition code check.
Bits [18:0]
Reserved, RES0.
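
The dependency of CV and COND on CCKNOWNPASS can be sketched as follows. The function name `decode_smc_iss` is hypothetical. The sketch reports CV and COND as `None` when the descriptions above say they are not valid.

```python
def decode_smc_iss(iss):
    """Decode the ISS for a trapped SMC instruction (hypothetical helper)."""
    ccknownpass = (iss >> 19) & 0x1               # CCKNOWNPASS, bit [19]
    # CV and COND are only valid if CCKNOWNPASS is 1, otherwise they are RES0.
    cv = ((iss >> 24) & 0x1) if ccknownpass else None
    # COND is additionally only valid when CV is 1.
    cond = ((iss >> 20) & 0xF) if cv else None
    return {"CCKNOWNPASS": ccknownpass, "CV": cv, "COND": cond}
```

When CCKNOWNPASS is 1 the instruction might have failed its condition code check, so software would need to evaluate the reported condition before treating the SMC as executed.
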


Traps to Hyp mode of Non-secure EL1 execution of SMC instructions on page G1-3901 describes the configuration
settings for this trap, for instructions executed in Non-secure PL1 modes.


ISS encoding for an exception from a Prefetch Abort 
This encoding is used by: 

• Prefetch Abort from a lower Exception level.
• Prefetch Abort taken without a change in Exception level.


The ISS encoding for these exceptions is: 


Bits [24:11] RES0, bit [10] FnV, bit [9] EA, bit [8] RES0, bit [7] S1PTW, bit [6] RES0, bits [5:0] IFSC.

Bits [24:11]
Reserved, RES0.
FnV, bit [10]
FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a
translation table walk.
0 HIFAR is valid.
1 HIFAR is not valid, and holds an UNKNOWN value.
This field is only valid if the IFSC code is 010000. It is RES0 for all other aborts.
EA, bit [9]
External abort type. This bit can provide an IMPLEMENTATION DEFINED classification of external
aborts.
For any abort other than an External abort this bit returns a value of 0.
Bit [8]
Reserved, RES0.
S1PTW, bit [7]
For a stage 2 fault, indicates whether the fault was a stage 2 fault on an access made for a stage 1
translation table walk:
0 Fault not on a stage 2 translation for a stage 1 translation table walk.
1 Fault on the stage 2 translation of an access for a stage 1 translation table walk.
For any abort other than a stage 2 fault this bit is RES0.
Bit [6]
Reserved, RES0.


IFSC, bits [5:0] 
Instruction Fault Status Code. Possible values of this field are: 
000000 Address size fault, translation table base register
000001 Address size fault, level 1
000010 Address size fault, level 2
000011 Address size fault, level 3
000101 Translation fault, level 1
000110 Translation fault, level 2
000111 Translation fault, level 3
001001 Access flag fault, level 1
001010 Access flag fault, level 2
001011 Access flag fault, level 3
001101 Permission fault, level 1
001110 Permission fault, level 2
001111 Permission fault, level 3
010000 Synchronous external abort, not on translation table walk
011000 Synchronous parity or ECC error on memory access, not on translation table walk
010101 Synchronous external abort, on translation table walk, level 1
010110 Synchronous external abort, on translation table walk, level 2
010111 Synchronous external abort, on translation table walk, level 3
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3
100010 Debug exception, only when the EC value is 0b100001
110000 TLB conflict abort


All other values are reserved. The effect of programming this field to a reserved value is that 
behavior is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and 
memory-mapped registers and translation table entries on page K1-5477. 


For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on a Long-descriptor translation table lookup on page G4-4131. 
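
The following minimal sketch decodes the Prefetch Abort ISS and maps the IFSC value to the descriptions listed above. The names `decode_prefetch_abort_iss` and `IFSC_NAMES` are hypothetical. The sketch reports the validity of HIFAR as `None` unless the IFSC code is 0b010000, because FnV is only valid for that code.

```python
IFSC_NAMES = {
    0b000000: "Address size fault, translation table base register",
    0b000001: "Address size fault, level 1",
    0b000010: "Address size fault, level 2",
    0b000011: "Address size fault, level 3",
    0b000101: "Translation fault, level 1",
    0b000110: "Translation fault, level 2",
    0b000111: "Translation fault, level 3",
    0b001001: "Access flag fault, level 1",
    0b001010: "Access flag fault, level 2",
    0b001011: "Access flag fault, level 3",
    0b001101: "Permission fault, level 1",
    0b001110: "Permission fault, level 2",
    0b001111: "Permission fault, level 3",
    0b010000: "Synchronous external abort, not on translation table walk",
    0b011000: "Synchronous parity or ECC error on memory access, not on translation table walk",
    0b010101: "Synchronous external abort, on translation table walk, level 1",
    0b010110: "Synchronous external abort, on translation table walk, level 2",
    0b010111: "Synchronous external abort, on translation table walk, level 3",
    0b011101: "Synchronous parity or ECC error on memory access on translation table walk, level 1",
    0b011110: "Synchronous parity or ECC error on memory access on translation table walk, level 2",
    0b011111: "Synchronous parity or ECC error on memory access on translation table walk, level 3",
    0b100010: "Debug exception",
    0b110000: "TLB conflict abort",
}


def decode_prefetch_abort_iss(iss):
    """Decode the ISS for a Prefetch Abort (hypothetical helper)."""
    ifsc = iss & 0x3F                             # IFSC, bits [5:0]
    fnv = (iss >> 10) & 0x1                       # FnV, bit [10]
    return {
        "fault": IFSC_NAMES.get(ifsc, "Reserved"),
        # FnV is only valid when the IFSC code is 0b010000.
        "hifar_valid": (fnv == 0) if ifsc == 0b010000 else None,
        "EA": (iss >> 9) & 0x1,                   # EA, bit [9]
        "S1PTW": (iss >> 7) & 0x1,                # S1PTW, bit [7]
    }
```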


The following sections describe cases where Prefetch Abort exceptions can be routed to Hyp mode, generating 
exceptions that are reported in the HSR with EC value 0b100000: 


• Abort exceptions, when the value of HCR.TGE is 1 on page G1-3830.
• Routing debug exceptions to EL2 on page G1-3833.

ISS encoding for an exception from an Illegal state or PC alignment fault


This encoding is used by: 
• Illegal exception return to AArch32 state.
• PC alignment fault exception.


The ISS encoding for these exceptions is: 


Bits [24:0]
Reserved, RES0.


For more information about the Illegal state exception, see: 

° Illegal changes to PSTATE.M on page G1-3809. 

° Illegal return events from AArch32 state on page G1-3835. 
° Legal returns that set PSTATE.IL to 1 on page G1-3837. 

° The Illegal Execution state exception on page G1-3837. 


For more information about the PC alignment fault exception, see Branching to an unaligned PC on page K1-5458. 


ISS encoding for an exception from a Data Abort 


This encoding is used by: 
• Data Abort from a lower Exception level.
• Data Abort taken without a change in Exception level.


The ISS encoding for these exceptions is: 


[Bit assignment diagram. Recoverable field positions: ISV, bit [24]; SAS, bits [23:22]; SSE, bit [21]; WnR.]

ISV, bit [24] 


Instruction syndrome valid. Indicates whether the syndrome information in ISS[23:14] is valid. 





0 No valid instruction syndrome. ISS[23:14] are RES0. 
1 ISS[23:14] hold a valid instruction syndrome. 
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This bit is 0 for all faults except Data Aborts generated by stage 2 address translations for which all 
the following apply to the instruction that generated the Data Abort exception: 


• The instruction is an LDR, LDA, LDRT, LDRSH, LDRSHT, LDRH, LDAH, LDRHT, 
LDRSB, LDRSBT, LDRB, LDAB, LDRBT, STR, STL, STRT, STRH, STLH, STRHT, 
STRB, STLB, or STRBT instruction. 

• The instruction is not performing register writeback. 

• The instruction is not using the PC as a source or destination register. 


For these cases, ISV is UNKNOWN if the exception was generated in Debug state in memory access 
mode, as described in Data Aborts in Memory access mode on page H4-4914, and otherwise 
indicates whether ISS[23:14] hold a valid syndrome. 


— Note 


In the A32 instruction set, LDR*T and STR*T instructions always perform register writeback and 
therefore never return a valid instruction syndrome. 





ISV is set to 0 on a stage 2 abort on a stage 1 translation table lookup. 
It is IMPLEMENTATION DEFINED whether ISV is set to 1 or 0 on a Synchronous external abort on 
stage 2 translation table walk. 
SAS, bits [23:22] 

Syndrome Access Size. When ISV is 1, indicates the size of the access attempted by the faulting 
operation. 

00 Byte 
01 Halfword 
10 Word 
11 Doubleword 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 


SSE, bit [21] 


Syndrome Sign Extend. When ISV is 1, for a byte, halfword, or word load operation, indicates 
whether the data item must be sign extended. For these cases, the possible values of this bit are: 


0 Sign-extension not required. 
1 Data item must be sign-extended. 

For all other operations this bit is 0. 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 


Bit [20] 
Reserved, RES0. 


SRT, bits [19:16] 


Syndrome Register transfer. When ISV is 1, the register number of the Rt operand of the faulting 
instruction. 


This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

Bit [15] 

Reserved, RES0. 

AR, bit [14] 

Acquire/Release. When ISV is 1, the possible values of this bit are: 
0 Instruction did not have acquire/release semantics. 
1 Instruction did have acquire/release semantics. 

This field is UNKNOWN when the value of ISV is UNKNOWN. 
This field is RES0 when the value of ISV is 0. 

Bits [13:11] 

Reserved, RES0. 

FnV, bit [10] 

FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a 
translation table walk. 
0 HDFAR is valid. 
1 HDFAR is not valid, and holds an UNKNOWN value. 

This field is valid only if the DFSC code is 0b010000. It is RES0 for all other aborts. 

EA, bit [9] 

External abort type. This bit can provide an IMPLEMENTATION DEFINED classification of external 
aborts. 

For any abort other than an External abort this bit returns a value of 0. 

CM, bit [8] 

Cache maintenance. For a synchronous fault, identifies a fault that comes from a cache maintenance 
or address translation instruction. For synchronous faults, the possible values of this bit are: 
0 Fault not generated by a cache maintenance or address translation instruction. 
1 Fault generated by a cache maintenance or address translation instruction. 

S1PTW, bit [7] 

For a stage 2 fault, indicates whether the fault was a stage 2 fault on an access made for a stage 1 
translation table walk: 
0 Fault not on a stage 2 translation for a stage 1 translation table walk. 
1 Fault on the stage 2 translation of an access for a stage 1 translation table walk. 

For any abort other than a stage 2 fault this bit is RES0. 

WnR, bit [6] 

Write not Read. Indicates whether a synchronous abort was caused by a write instruction or a read 
instruction. The possible values of this bit are: 
0 Abort caused by a read instruction. 
1 Abort caused by a write instruction. 

For faults on cache maintenance and address translation instructions, this bit always returns a value 
of 1. 

For an asynchronous Data Abort exception this bit is UNKNOWN. 


DFSC, bits [5:0] 


Data Fault Status Code. Possible values of this field are: 
000000 Address size fault, translation table base register 
000001 Address size fault, level 1 
000010 Address size fault, level 2 
000011 Address size fault, level 3 
000101 Translation fault, level 1 
000110 Translation fault, level 2 
000111 Translation fault, level 3 
001001 Access flag fault, level 1 
001010 Access flag fault, level 2 
001011 Access flag fault, level 3 
001101 Permission fault, level 1 
001110 Permission fault, level 2 
001111 Permission fault, level 3 
010000 Synchronous external abort, not on translation table walk 
011000 Synchronous parity or ECC error on memory access, not on translation table walk 
010001 SError interrupt 
011001 SError interrupt from a parity or ECC error on memory access 
010101 Synchronous external abort, on translation table walk, level 1 
010110 Synchronous external abort, on translation table walk, level 2 
010111 Synchronous external abort, on translation table walk, level 3 
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1 
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2 
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3 
100001 Alignment fault 
100010 Debug exception, only when the EC value is 0b100100 
110000 TLB conflict abort 
110100 IMPLEMENTATION DEFINED fault (Lockdown fault) 
110101 IMPLEMENTATION DEFINED fault (Unsupported Exclusive access fault) 


All other values are reserved. The effect of programming this field to a reserved value is that 
behavior is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and 
memory-mapped registers and translation table entries on page K1-5477. 


For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on a Long-descriptor translation table lookup on page G4-4131. 
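
As a rough illustration of how software running in Hyp mode might unpack the Data Abort ISS encoding described above, the following C sketch splits an ISS value into its fields. The struct and function names are hypothetical and are not defined by the architecture, and the caller is assumed to have already extracted ISS[24:0] from the HSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of ISS[24:0] for a Data Abort. */
struct data_abort_iss {
    bool     isv;    /* bit [24]: ISS[23:14] hold a valid syndrome */
    unsigned sas;    /* bits [23:22]: access size                  */
    bool     sse;    /* bit [21]: sign extension required          */
    unsigned srt;    /* bits [19:16]: Rt register number           */
    bool     ar;     /* bit [14]: acquire/release semantics        */
    bool     fnv;    /* bit [10]: HDFAR not valid                  */
    bool     ea;     /* bit [9]: external abort type               */
    bool     cm;     /* bit [8]: cache maintenance                 */
    bool     s1ptw;  /* bit [7]: stage 2 fault on stage 1 walk     */
    bool     wnr;    /* bit [6]: write not read                    */
    unsigned dfsc;   /* bits [5:0]: Data Fault Status Code         */
};

static struct data_abort_iss decode_data_abort_iss(uint32_t iss)
{
    struct data_abort_iss d;
    d.isv   = (iss >> 24) & 1u;
    d.sas   = (iss >> 22) & 3u;
    d.sse   = (iss >> 21) & 1u;
    d.srt   = (iss >> 16) & 0xFu;
    d.ar    = (iss >> 14) & 1u;
    d.fnv   = (iss >> 10) & 1u;
    d.ea    = (iss >> 9) & 1u;
    d.cm    = (iss >> 8) & 1u;
    d.s1ptw = (iss >> 7) & 1u;
    d.wnr   = (iss >> 6) & 1u;
    d.dfsc  = iss & 0x3Fu;
    return d;
}

/* Access size in bytes for a SAS value: 00 byte, 01 halfword, 10 word, 11 doubleword. */
static unsigned sas_access_bytes(unsigned sas)
{
    return 1u << (sas & 3u);
}
```

In this sketch the SAS, SSE, SRT, and AR members carry meaning only when isv is true, mirroring the rule that ISS[23:14] are RES0 when ISV is 0.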


The following describe cases where Data Abort exceptions can be routed to Hyp mode, generating exceptions that 
are reported in the HSR with EC value 0b100100: 


° Abort exceptions, when the value of HCR.TGE is 1 on page G1-3830. 


° Routing debug exceptions to EL2 on page G1-3833. 


The following describe cases that can cause a Data Abort exception that is taken to Hyp mode, and reported in the 
HSR with EC value of 0b100000 or 0b100100: 


• Hyp mode control of Non-secure access permissions on page G4-4075. 


• Memory fault reporting in Hyp mode on page G4-4135. 


Accessing the HSR: 


To access the HSR: 


MRC p15,4,<Rt>,c5,c2,0 ; Read HSR into Rt 
MCR p15,4,<Rt>,c5,c2,0 ; Write Rt to HSR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    100   0101  0010  000 
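
To show how the fields in an encoding table of this kind fit into an instruction word, here is a minimal C sketch that assembles an A32 MRC or MCR instruction for the coproc == 1111 space. It assumes the A32 layout of cond, 1110, opc1, L, CRn, Rt, coproc, opc2, 1, CRm with the condition fixed to AL; that layout comes from the instruction descriptions rather than from this section, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Builds an unconditional A32 MRC (read) or MCR (write) for coproc 1111. */
static uint32_t a32_sysreg_insn(bool read, unsigned opc1, unsigned rt,
                                unsigned crn, unsigned crm, unsigned opc2)
{
    assert(opc1 < 8 && opc2 < 8 && rt < 16 && crn < 16 && crm < 16);
    return (UINT32_C(0xE) << 28)          /* cond = AL (assumed)        */
         | (UINT32_C(0xE) << 24)          /* bits [27:24] = 1110        */
         | ((uint32_t)opc1 << 21)
         | ((uint32_t)(read ? 1 : 0) << 20) /* L: 1 = MRC, 0 = MCR      */
         | ((uint32_t)crn << 16)
         | ((uint32_t)rt << 12)
         | (UINT32_C(0xF) << 8)           /* coproc = 1111              */
         | ((uint32_t)opc2 << 5)
         | (UINT32_C(1) << 4)
         | (uint32_t)crm;
}
```

With the HSR values from the table (opc1 = 100, CRn = 0101, CRm = 0010, opc2 = 000) and Rt = r0, the read form evaluates to 0xEE950F12 under these assumptions.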








G6-4412 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


G6 AArch32 System Register Descriptions 
G6.2 General system control registers 


G6.2.68 HSTR, Hyp System Trap Register 


The HSTR characteristics are: 


Purpose 


Controls trapping to Hyp mode of Non-secure accesses, at EL1 or lower, to the System register in 
the coproc == 1111 encoding space, by the CRn value used to access the register using MCR or MRC 
instruction. When the register is accessible using an MCRR or MRRC instruction, this is the CRm 
value used to access the register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    -    RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


° If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register HSTR is architecturally mapped to AArch64 System register HSTR_EL2. 
If EL2 is not implemented, this register is RES0 from EL3. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 


Attributes 


HSTR is a 32-bit register. 


Field descriptions 


The HSTR bit assignments are: 


Bits      Field 
[31:16]   RES0 
[15:0]    T15 to T0, where T<n> is bit [n] 


Bits [31:16] 


Reserved, RES0. 


T<n>, bit [n], for n = 0 to 15 
Fields T14 and T4 are RES0. 


The remaining fields control whether Non-secure EL0 and EL1 accesses, using MCR, MRC, 
MCRR, and MRRC instructions, to the System registers in the coproc == 1111 encoding space are 
trapped to Hyp mode: 

0 This control has no effect on Non-secure EL0 or EL1 accesses to System registers. 
1 Any Non-secure EL1 MCR, MRC access with coproc == 1111 and CRn == <n> is 
trapped to Hyp mode if the access is not UNDEFINED when the value of this field is 0. 


Any Non-secure EL1 MCRR, MRRC access with coproc == 1111 and CRm == <n> is 
trapped to Hyp mode if the access is not UNDEFINED when the value of this field is 0. 


For example, when HSTR.T7 is 1: 


• Any 32-bit access from a Non-secure EL1 mode, using an MCR or MRC instruction with 
coproc set to 1111 and <CRn> set to c7, and that is not UNDEFINED when HSTR.T7 is 0, is 
trapped to Hyp mode. 


• Any 64-bit access from a Non-secure EL1 mode, using an MCRR or MRRC instruction with 
coproc set to 1111 and <CRm> set to c7, and that is not UNDEFINED when HSTR.T7 is 0, is 
trapped to Hyp mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 
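
The T<n> selection rule can be sketched in C as a simple bit test. This is only a model of which control bit applies: it does not check the Security state, the Exception level, or whether the access would be UNDEFINED with the field at 0, and masking out T14 and T4 is a modeling choice that reflects their RES0 status. The names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* T14 and T4 are RES0, so this model never treats them as trap controls. */
#define HSTR_RES0_TRAP_BITS ((UINT32_C(1) << 14) | (UINT32_C(1) << 4))

/*
 * Returns true if HSTR.T<n> is set, where n is the CRn value of an
 * MCR/MRC access or the CRm value of an MCRR/MRRC access.
 */
static bool hstr_trap_bit_set(uint32_t hstr, unsigned n)
{
    assert(n < 16);
    uint32_t controls = hstr & UINT32_C(0xFFFF) & ~HSTR_RES0_TRAP_BITS;
    return ((controls >> n) & 1u) != 0;
}
```

For the HSTR.T7 example above, hstr_trap_bit_set(hstr, 7) covers both the MCR/MRC case with <CRn> set to c7 and the MCRR/MRRC case with <CRm> set to c7.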


Accessing the HSTR: 
To access the HSTR: 


MRC p15,4,<Rt>,c1,c1,3 ; Read HSTR into Rt 
MCR p15,4,<Rt>,c1,c1,3 ; Write Rt to HSTR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    100   0001  0001  011 


G6.2.69 HTCR, Hyp Translation Control Register 


The HTCR characteristics are: 


Purpose 
Controls translation table walks required for the stage 1 translation of memory accesses from Hyp 
mode, and holds cacheability and shareability information for the accesses. 


Used in conjunction with HTTBR, that defines the translation table base address for the translations. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    -    RW 





Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
AArch32 System register HTCR is architecturally mapped to AArch64 System register TCR_EL2. 


If EL2 is not implemented, this register is RES0 from EL3. 
RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
HTCR is a 32-bit register. 


Field descriptions 


The HTCR bit assignments are: 


Bits      Field 
[31]      RES1 
[30]      IMP DEF 
[29:24]   RES0 
[23]      RES1 
[22:14]   RES0 
[13:12]   SH0 
[11:10]   ORGN0 
[9:8]     IRGN0 
[7:3]     RES0 
[2:0]     T0SZ 


Bit [31] 
Reserved, RES1. 
IMP DEF, bit [30] 
IMPLEMENTATION DEFINED. 
Bits [29:24] 
Reserved, RES0. 
Bit [23] 
Reserved, RES1. 
Bits [22:14] 
Reserved, RES0. 


SH0, bits [13:12] 
Shareability attribute for memory associated with translation table walks using HTTBR. 
00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 


Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 


ORGN0, bits [11:10] 


Outer cacheability attribute for memory associated with translation table walks using HTTBR. 


00 Normal memory, Outer Non-cacheable 
01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 


IRGN0, bits [9:8] 


Inner cacheability attribute for memory associated with translation table walks using HTTBR. 


00 Normal memory, Inner Non-cacheable 
01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 
10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 


Bits [7:3] 
Reserved, RES0. 


T0SZ, bits [2:0] 


The size offset of the memory region addressed by HTTBR. The region size is 2^(32-T0SZ) bytes. 
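
As a small worked example of the T0SZ region-size rule, the following C sketch extracts HTCR.T0SZ from a register value and computes the region size in bytes. The function name is hypothetical, and the input is assumed to be a value previously read from HTCR.

```c
#include <stdint.h>

/* Region size in bytes addressed by HTTBR: 2^(32 - T0SZ), T0SZ = HTCR[2:0]. */
static uint64_t htcr_region_size_bytes(uint32_t htcr)
{
    unsigned t0sz = htcr & 0x7u;
    return UINT64_C(1) << (32u - t0sz);
}
```

A T0SZ value of 0 gives the full 4GB region, and each increment of T0SZ halves it.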


Accessing the HTCR: 
To access the HTCR: 


MRC p15,4,<Rt>,c2,c0,2 ; Read HTCR into Rt 
MCR p15,4,<Rt>,c2,c0,2 ; Write Rt to HTCR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    100   0010  0000  010 


G6.2.70 HTPIDR, Hyp Software Thread ID Register 


The HTPIDR characteristics are: 


Purpose 
Provides a location where software running in Hyp mode can store thread identifying information 
that is not visible to Non-secure software executing at EL0 or EL1, for hypervisor management 
purposes. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    -    RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T13==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T13==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
AArch32 System register HTPIDR is architecturally mapped to AArch64 System register 
TPIDR_EL2[31:0]. 


If EL2 is not implemented, this register is RES0 from EL3. 


The PE never updates this register. This means the register is always UNKNOWN on reset. 


Attributes 
HTPIDR is a 32-bit register. 


Field descriptions 


The HTPIDR bit assignments are: 


31 0 


Thread ID 


Bits [31:0] 
Thread ID. Thread identifying information stored by software running at this Exception level. 
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Accessing the HTPIDR: 
To access the HTPIDR: 


MRC p15,4,<Rt>,c13,c0,2 ; Read HTPIDR into Rt 
MCR p15,4,<Rt>,c13,c0,2 ; Write Rt to HTPIDR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    100   1101  0000  010 


G6.2.71 HTTBR, Hyp Translation Table Base Register 
The HTTBR characteristics are: 


Purpose 


Holds the base address of the translation table for the stage 1 translation of memory accesses from 
Hyp mode. 


Used in conjunction with the HTCR. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    -    RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


° If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
° If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register HTTBR is architecturally mapped to AArch64 System register 
TTBR0_EL2. 


If EL2 is not implemented, this register is RES0 from EL3. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
HTTBR is a 64-bit register. 


Field descriptions 


The HTTBR bit assignments are: 


63        48 47                      0 
RES0         BADDR 


Bits [63:48] 


Reserved, RES0. 


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that 
if bits[x-1:3] are not all zero, this is a misaligned translation table base address, with effects that are 
CONSTRAINED UNPREDICTABLE, and must be one of the following: 


• Register bits [x-1:3] are treated as if all the bits are zero. The value read back from these bits 
is either the value written or zero. 


• The result of the calculation of an address for a translation table walk using this register can 
be corrupted in those bits that are nonzero. 


x is determined from the value of HTCR.T0SZ as follows: 
• If HTCR.T0SZ is 0 or 1, x = 5 - HTCR.T0SZ. 
• If HTCR.T0SZ is greater than 1, x = 14 - HTCR.T0SZ. 


If bits[47:40] of the translation table base address are not zero, an Address size fault is generated. 
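
The rules for x and for the base address checks can be sketched in C as follows. The function names are hypothetical; the sketch only evaluates the conditions stated above and does not model either of the permitted CONSTRAINED UNPREDICTABLE behaviors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x as defined above, derived from HTCR.T0SZ. */
static unsigned httbr_x(unsigned t0sz)
{
    assert(t0sz <= 7);                 /* HTCR.T0SZ is a 3-bit field */
    return (t0sz <= 1) ? 5u - t0sz : 14u - t0sz;
}

/* True if any of bits [x-1:3] is nonzero, that is, the base is misaligned. */
static bool httbr_misaligned(uint64_t httbr, unsigned t0sz)
{
    uint64_t mask = ((UINT64_C(1) << httbr_x(t0sz)) - 1) & ~UINT64_C(0x7);
    return (httbr & mask) != 0;
}

/* True if bits [47:40] are nonzero, the stated Address size fault condition. */
static bool httbr_address_size_fault(uint64_t httbr)
{
    return ((httbr >> 40) & 0xFFu) != 0;
}
```

For example, with T0SZ equal to 2 the value of x is 12, so a base address of 0x80001000 passes the alignment check while 0x80000100 does not.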


Accessing the HTTBR: 
To access the HTTBR: 


MRRC p15,4,<Rt>,<Rt2>,c2 ; Read HTTBR[31:0] into Rt and HTTBR[63:32] into Rt2 
MCRR p15,4,<Rt>,<Rt2>,c2 ; Write Rt to HTTBR[31:0] and Rt2 to HTTBR[63:32] 


Register access is encoded as follows: 





coproc opc1 CRm 





1111 0100 0010 
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G6.2.72 HVBAR, Hyp Vector Base Address Register 
The HVBAR characteristics are: 
Purpose 
Holds the vector base address for any exception that is taken to Hyp mode. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       -        RW       RW             - 
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS) 
-    -    RW 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T12==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T12==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
Configurations 
AArch32 System register HVBAR is architecturally mapped to AArch64 System register 
VBAR_EL2[31:0]. 
If EL2 is not implemented, this register is RES0 from EL3. 
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
HVBAR is a 32-bit register. 
Field descriptions 
The HVBAR bit assignments are: 
31                          5 4    0 
Vector Base Address           RES0 
Bits [31:5] 
Vector Base Address. Bits[31:5] of the base address of the exception vectors for exceptions taken to 
this Exception level. Bits[4:0] of an exception vector are the exception offset. 
Bits [4:0] 
Reserved, RES0. 
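
As a minimal sketch of how the base address and the exception offset combine, the following C function forms a vector address from an HVBAR value and a 5-bit offset. The function name is hypothetical, and the offset values for particular exceptions are not given in this section, so the caller is assumed to supply them.

```c
#include <assert.h>
#include <stdint.h>

/* Vector address: HVBAR[31:5] supply bits [31:5], the offset supplies bits [4:0]. */
static uint32_t hyp_vector_address(uint32_t hvbar, uint32_t offset)
{
    assert(offset < 32);               /* the exception offset occupies bits [4:0] */
    return (hvbar & ~UINT32_C(0x1F)) | offset;
}
```

Masking bits [4:0] of the HVBAR value reflects that those bits are RES0 in the register.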
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Accessing the HVBAR: 
To access the HVBAR: 


MRC p15,4,<Rt>,c12,c0,0 ; Read HVBAR into Rt 
MCR p15,4,<Rt>,c12,c0,0 ; Write Rt to HVBAR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    100   1100  0000  000 


G6.2.73 ICIALLU, Instruction Cache Invalidate All to PoU 
The ICIALLU characteristics are: 


Purpose 
Invalidate all instruction caches to PoU. If branch predictors are architecturally visible, also flush 
branch predictors. 

Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       WO       WO       WO             WO 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    WO   WO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TPU==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


AArch32 System instruction ICIALLU performs the same function as AArch64 System instruction 
IC IALLU. 


Attributes 
ICIALLU is a 32-bit System instruction. 


Field descriptions 

ICIALLU ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 

Executing the ICIALLU instruction: 

The ICIALLU instruction is executed as: 


MCR p15,0,<Rt>,c7,c5,0 ; ICIALLU operation, ignoring the value in Rt 


The instruction is encoded in the System instruction encoding space as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    000   0111  0101  000 





The PE ignores the value of <Rt>. Software does not have to write a value to this register before issuing this 
instruction. 
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G6.2.74 ICIALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable 
The ICIALLUIS characteristics are: 


Purpose 
Invalidate all instruction caches Inner Shareable to PoU. If branch predictors are architecturally 
visible, also flush branch predictors. 

Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       WO       WO       WO             WO 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    WO   WO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TPU==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 


Configurations 


AArch32 System instruction ICIALLUIS performs the same function as AArch64 System 
instruction IC IALLUIS. 


Attributes 
ICIALLUIS is a 32-bit System instruction. 


Field descriptions 


ICIALLUIS ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 


Executing the ICIALLUIS instruction: 
The ICIALLUIS instruction is executed as: 


MCR p15,0,<Rt>,c7,c1,0 ; ICIALLUIS operation, ignoring the value in Rt 


The instruction is encoded in the System instruction encoding space as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    000   0111  0001  000 





The PE ignores the value of <Rt>. Software does not have to write a value to this register before issuing this 
instruction. 
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G6.2.75 ICIMVAU, Instruction Cache line Invalidate by VA to PoU 
The ICIMVAU characteristics are: 
Purpose 
Invalidate instruction cache line by virtual address to PoU. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       WO       WO       WO             WO 
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS) 
-    WO   WO 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TPU==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HCR_EL2.TPU==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
• If HSTR.T7==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode. 
• If HSTR_EL2.T7==1, Non-secure execution of this instruction at EL1 is trapped to EL2. 
Configurations 
AArch32 System instruction ICIMVAU performs the same function as AArch64 System instruction 
IC IVAU. 
Attributes 
ICIMVAU is a 32-bit System instruction. 
Field descriptions 
The ICIMVAU input value bit assignments are: 
31 0 
Virtual address to use 
Bits [31:0] 
Virtual address to use. 
Executing the ICIMVAU instruction: 
The ICIMVAU instruction is executed as: 
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MCR p15,0,<Rt>,c7,c5,1 ; ICIMVAU operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2 
1111    000   0111  0101  001 


G6.2.76 ID_AFR0, Auxiliary Feature Register 0 
The ID_AFR0 characteristics are: 


Purpose 
Provides information about the IMPLEMENTATION DEFINED features of the PE in AArch32. 
Must be interpreted with the Main ID Register, MIDR. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       RO       RO       RO             RO 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    RO   RO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp 
mode. 


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp 
mode. 


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_AFR0 is architecturally mapped to AArch64 System register 
ID_AFR0_EL1. 


Attributes 
ID_AFR0 is a 32-bit register. 


Field descriptions 


The ID_AFR0 bit assignments are: 


Bits      Field 
[31:16]   RES0 
[15:12]   IMP DEF 
[11:8]    IMP DEF 
[7:4]     IMP DEF 
[3:0]     IMP DEF 


Bits [31:16] 
Reserved, RES0. 


IMPLEMENTATION DEFINED, bits [15:12] 
IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [11:8] 
IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [7:4] 
IMPLEMENTATION DEFINED. 


IMPLEMENTATION DEFINED, bits [3:0] 
IMPLEMENTATION DEFINED. 


Accessing the ID_AFR0: 
To access the ID_AFR0: 
MRC p15,0,<Rt>,c0,c1,3 ; Read ID_AFR0 into Rt 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2 
1111    000   0000  0001  011 


G6.2.77 ID_DFR0, Debug Feature Register 0 
The ID_DFR0 characteristics are: 


Purpose 
Provides top level information about the debug system in AArch32. 
Must be interpreted with the Main ID Register, MIDR. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0) 
-        -       RO       RO       RO             RO 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 


EL0  EL1  EL2(NS) 
-    RO   RO 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


•   If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
    EL2.

•   If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
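
The access tables and trap conditions above can be summarized as a small decision function. This is a simplified sketch only. It assumes EL2 is implemented, ignores the exception prioritization rules referenced above, and treats the AArch32 and AArch64 forms of each control as one flag. The type and function names are hypothetical.

```c
#include <stdbool.h>

/* Outcome of a read of one of these ID registers (hypothetical names). */
typedef enum {
    ID_READ_OK,              /* "RO" in the access tables */
    ID_READ_TRAPPED,         /* trapped to Hyp mode or EL2 */
    ID_READ_NOT_ACCESSIBLE   /* "-" in the access tables */
} id_read_result;

/* Hypothetical snapshot of the controls named in the text. */
typedef struct {
    unsigned el;         /* Exception level making the read, 0 to 3 */
    bool     non_secure; /* true if the read is made in Non-secure state */
    bool     tid3;       /* HCR.TID3 or HCR_EL2.TID3 */
    bool     t0;         /* HSTR.T0 or HSTR_EL2.T0 */
} id_read_ctx;

static id_read_result classify_id_read(id_read_ctx c)
{
    if (c.el == 0)
        return ID_READ_NOT_ACCESSIBLE;
    /* Only Non-secure reads from EL1 are subject to these traps. */
    if (c.el == 1 && c.non_secure && (c.tid3 || c.t0))
        return ID_READ_TRAPPED;
    return ID_READ_OK;
}
```

With this sketch, a Non-secure EL1 read with TID3 set is classified as trapped, while the same read from EL2 is classified as a normal read-only access.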


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_DFR0 is architecturally mapped to AArch64 System register
ID_DFR0_EL1.


Attributes
ID_DFR0 is a 32-bit register.


Field descriptions 


The ID_DFR0 bit assignments are:

Bits    31:28   27:24     23:20      19:16     15:12    11:8      7:4       3:0
Field   RES0    PerfMon   MProfDbg   MMapTrc   CopTrc   MMapDbg   CopSDbg   CopDbg


Bits [31:28]


Reserved, RES0.


PerfMon, bits [27:24] 


Performance Monitors. Support for System registers-based ARM Performance Monitors Extension, 
using registers in the coproc == 1111 encoding space, for A and R profile processors. Defined values 
are: 


0000 Performance Monitors Extension System registers not implemented.
0001 Support for Performance Monitors Extension version 1 (PMUv1) System registers.
0010 Support for Performance Monitors Extension version 2 (PMUv2) System registers.
0011 Support for Performance Monitors Extension version 3 (PMUv3) System registers.
1111 IMPLEMENTATION DEFINED form of Performance Monitors System registers supported.
     PMUv3 not supported.

All other values are reserved.
In ARMv8-A the permitted values are 0000, 0011, and 1111.


In ARMv7, the value 0000 can mean that PMUv1 is implemented. PMUv1 is not permitted in an
ARMv8 implementation.


MProfDbg, bits [23:20] 


M Profile Debug. Support for memory-mapped debug model for M profile processors. Defined 
values are: 


0000 Not supported. 
0001 Support for M profile Debug architecture, with memory-mapped access. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


MMapTrc, bits [19:16]


Memory Mapped Trace. Support for memory-mapped trace model. Defined values are: 
0000 Not supported. 

0001 Support for ARM trace architecture, with memory-mapped access. 

All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0001. 


In the Trace registers, the ETMIDR gives more information about the implementation. 


CopTrc, bits [15:12]


Support for System registers-based trace model, using registers in the coproc == 1110 encoding 
space. Defined values are: 


0000 Not supported. 

0001 Support for ARM trace architecture, with System registers access. 
All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0001. 


In the Trace registers, the ETMIDR gives more information about the implementation. 


MMapDbg, bits [11:8] 


Memory Mapped Debug. Support for v7 memory-mapped debug model, for A and R profile 
processors. 


In ARMv8-A this field is RES0.
The optional memory map defined by ARMv8 is not compatible with ARMv7.


CopSDbg, bits [7:4]


Support for a System registers-based Secure debug model, using registers in the coproc == 1110
encoding space, for an A profile processor that includes EL3.


If EL3 is not implemented and the implemented Security state is Non-secure state, this field is RES0.
Otherwise, this field reads the same as bits [3:0].


CopDbg, bits [3:0] 


Support for System registers-based debug model, using registers in the coproc == 1110 encoding 
space, for A and R profile processors. Defined values are: 


0000 Not supported. 

0010 Support for ARMv6, v6 Debug architecture, with System registers access. 
0011 Support for ARMv6, v6.1 Debug architecture, with System registers access. 
0100 Support for ARMv7, v7 Debug architecture, with System registers access. 
0101 Support for ARMv7, v7.1 Debug architecture, with System registers access. 
0110 Support for ARMv8 debug architecture, with System registers access. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0110.
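
The field descriptions above can be applied to a raw ID_DFR0 value as follows. This is a minimal sketch. The helper names and the DFR0_* constants are hypothetical, and the sample value in the usage note is built from permitted field values purely for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Least significant bit of each ID_DFR0 field, from the field descriptions.
 * The constant names are hypothetical. */
enum {
    DFR0_COPDBG   = 0,
    DFR0_COPSDBG  = 4,
    DFR0_MMAPDBG  = 8,
    DFR0_COPTRC   = 12,
    DFR0_MMAPTRC  = 16,
    DFR0_MPROFDBG = 20,
    DFR0_PERFMON  = 24
};

/* Extract the 4-bit ID field whose least significant bit is at lsb. */
static unsigned id_field(uint32_t reg, unsigned lsb)
{
    return (reg >> lsb) & 0xFu;
}

/* PerfMon == 0011 means PMUv3 System registers. The comparison is exact
 * because 1111 means an IMPLEMENTATION DEFINED PMU, not PMUv3. */
static bool dfr0_has_pmuv3(uint32_t dfr0)
{
    return id_field(dfr0, DFR0_PERFMON) == 0x3u;
}

/* CopDbg == 0110 means the ARMv8 debug architecture with System
 * registers access. */
static bool dfr0_has_armv8_debug(uint32_t dfr0)
{
    return id_field(dfr0, DFR0_COPDBG) == 0x6u;
}
```

For the illustrative value 0x03010066, PerfMon is 0011, MMapTrc is 0001, and CopSDbg and CopDbg are both 0110, so both predicates return true.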


Accessing the ID_DFR0:
To access the ID_DFR0:
MRC p15,0,<Rt>,c0,c1,2 ; Read ID_DFR0 into Rt


Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0000   0001   010
G6.2.78 ID_ISAR0, Instruction Set Attribute Register 0
The ID_ISAR0 characteristics are:


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 
Must be interpreted with ID_ISAR1, ID_ISAR2, ID_ISAR3, ID_ISAR4, and ID_ISAR5.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RO        RO        RO              RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0   EL1   EL2(NS)
-     RO    RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


•   If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
    EL2.

•   If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_ISAR0 is architecturally mapped to AArch64 System register
ID_ISAR0_EL1.


Attributes
ID_ISAR0 is a 32-bit register.


Field descriptions 


The ID_ISAR0 bit assignments are:

Bits    31:28   27:24    23:20   19:16    15:12       11:8       7:4        3:0
Field   RES0    Divide   Debug   Coproc   CmpBranch   BitField   BitCount   Swap


Bits [31:28]


Reserved, RES0.


Divide, bits [27:24] 


Indicates the implemented Divide instructions. Defined values are: 


0000 None implemented. 
0001 Adds SDIV and UDIV in the T32 instruction set. 
0010 As for 0001, and adds SDIV and UDIV in the A32 instruction set. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


Debug, bits [23:20] 


Indicates the implemented Debug instructions. Defined values are: 
0000 None implemented. 

0001 Adds BKPT. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Coproc, bits [19:16] 


Indicates the implemented System register access instructions. Defined values are: 


0000 None implemented, except for instructions separately attributed by the architecture to 
provide access to AArch32 System registers and System instructions. 

0001 Adds generic CDP, LDC, MCR, MRC, and STC. 

0010 As for 0001, and adds generic CDP2, LDC2, MCR2, MRC2, and STC2. 

0011 As for 0010, and adds generic MCRR and MRRC. 

0100 As for 0011, and adds generic MCRR2 and MRRC2. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


CmpBranch, bits [15:12] 


Indicates the implemented combined Compare and Branch instructions in the T32 instruction set. 
Defined values are: 


0000 None implemented. 
0001 Adds CBNZ and CBZ. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


BitField, bits [11:8]


Indicates the implemented BitField instructions. Defined values are:
0000 None implemented.
0001 Adds BFC, BFI, SBFX, and UBFX.

All other values are reserved.

In ARMv8-A the only permitted value is 0001.


BitCount, bits [7:4]


Indicates the implemented Bit Counting instructions. Defined values are:
0000 None implemented.
0001 Adds CLZ.


All other values are reserved.


In ARMv8-A the only permitted value is 0001.


Swap, bits [3:0]


Indicates the implemented Swap instructions in the A32 instruction set. Defined values are: 
0000 None implemented. 

0001 Adds SWP and SWPB. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 
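
Because every ID_ISAR0 field has a single permitted value in ARMv8-A, the permitted values can be packed into one expected word. The following is a minimal sketch of such a check. The macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* The single ARMv8-A permitted value of each ID_ISAR0 field, packed by
 * bit position:
 *   Divide=0010 Debug=0001 Coproc=0000 CmpBranch=0001
 *   BitField=0001 BitCount=0001 Swap=0000
 * ID_ISAR0_ARMV8A is a hypothetical name. */
#define ID_ISAR0_ARMV8A 0x02101110u

/* True if all defined fields hold their ARMv8-A permitted values.
 * Bits [31:28] are RES0, so they are masked out rather than compared. */
static bool isar0_matches_armv8a(uint32_t isar0)
{
    return (isar0 & 0x0FFFFFFFu) == ID_ISAR0_ARMV8A;
}
```

A value with Swap set to 0001, such as 0x02101111, fails the check because SWP and SWPB are not permitted in ARMv8-A.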


Accessing the ID_ISAR0:
To access the ID_ISAR0:
MRC p15,0,<Rt>,c0,c2,0 ; Read ID_ISAR0 into Rt


Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0000   0010   000
G6.2.79 ID_ISAR1, Instruction Set Attribute Register 1
The ID_ISAR1 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 
Must be interpreted with ID_ISAR0, ID_ISAR2, ID_ISAR3, ID_ISAR4, and ID_ISAR5.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RO        RO        RO              RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0   EL1   EL2(NS)
-     RO    RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


•   If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
    EL2.

•   If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_ISAR1 is architecturally mapped to AArch64 System register 
ID_ISAR1_EL1. 


Attributes 
ID_ISAR1 is a 32-bit register. 


Field descriptions 


The ID_ISAR1 bit assignments are:

Bits    31:28     27:24       23:20       19:16    15:12    11:8        7:4      3:0
Field   Jazelle   Interwork   Immediate   IfThen   Extend   Except_AR   Except   Endian


Jazelle, bits [31:28]
Indicates the implemented Jazelle extension instructions. Defined values are: 
0000 No support for Jazelle. 


0001 Adds the BXJ instruction, and the J bit in the PSR. This setting might indicate a trivial 
implementation of the Jazelle extension. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Interwork, bits [27:24] 


Indicates the implemented Interworking instructions. Defined values are: 


0000 None implemented. 

0001 Adds the BX instruction, and the T bit in the PSR. 

0010 As for 0001, and adds the BLX instruction. PC loads have BX-like behavior. 

0011 As for 0010, and guarantees that data-processing instructions in the A32 instruction set
     with the PC as the destination and the S bit clear have BX-like behavior.
All other values are reserved. 


In ARMv8-A the only permitted value is 0011. 


Immediate, bits [23:20] 
Indicates the implemented data-processing instructions with long immediates. Defined values are: 
0000 None implemented. 
0001 Adds:
     •   The MOVT instruction.
     •   The MOV instruction encodings with zero-extended 16-bit immediates.
     •   The T32 ADD and SUB instruction encodings with zero-extended 12-bit
         immediates, and the other ADD, ADR, and SUB encodings cross-referenced by
         the pseudocode for those encodings.


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


IfThen, bits [19:16] 
Indicates the implemented If-Then instructions in the T32 instruction set. Defined values are: 
0000 None implemented. 
0001 Adds the IT instructions, and the IT bits in the PSRs. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Extend, bits [15:12] 


Indicates the implemented Extend instructions. Defined values are: 


0000 No scalar sign-extend or zero-extend instructions are implemented, where scalar 
instructions means non-Advanced SIMD instructions. 

0001 Adds the SXTB, SXTH, UXTB, and UXTH instructions. 

0010 As for 0001, and adds the SXTB16, SXTAB, SXTAB16, SXTAH, UXTB16, UXTAB,
     UXTAB16, and UXTAH instructions.
All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 
Except_AR, bits [11:8]


Indicates the implemented A and R profile exception-handling instructions. Defined values are: 


0000 None implemented. 
0001 Adds the SRS and RFE instructions, and the A and R profile forms of the CPS 
instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Except, bits [7:4] 


Indicates the implemented exception-handling instructions in the ARM instruction set. Defined
values are:

0000 Not implemented. This indicates that the User bank and Exception return forms of the
     LDM and STM instructions are not implemented.
0001 Adds the LDM (exception return), LDM (user registers), and STM (user registers)
     instruction versions.

All other values are reserved.
In ARMv8-A the only permitted value is 0001.


Endian, bits [3:0]
Indicates the implemented Endian instructions. Defined values are: 
0000 None implemented. 
0001 Adds the SETEND instruction, and the E bit in the PSRs. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


Accessing the ID_ISAR1:
To access the ID_ISAR1:
MRC p15,0,<Rt>,c0,c2,1 ; Read ID_ISAR1 into Rt


Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0000   0010   001
G6.2.80 ID_ISAR2, Instruction Set Attribute Register 2
The ID_ISAR2 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 
Must be interpreted with ID_ISAR0, ID_ISAR1, ID_ISAR3, ID_ISAR4, and ID_ISAR5.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RO        RO        RO              RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0   EL1   EL2(NS)
-     RO    RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


•   If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
    EL2.

•   If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_ISAR2 is architecturally mapped to AArch64 System register 
ID_ISAR2_EL1. 


Attributes 
ID_ISAR2 is a 32-bit register. 


Field descriptions 


The ID_ISAR2 bit assignments are:

Bits    31:28      27:24    23:20   19:16   15:12   11:8             7:4       3:0
Field   Reversal   PSR_AR   MultU   MultS   Mult    MultiAccessInt   MemHint   LoadStore


Reversal, bits [31:28] 


Indicates the implemented Reversal instructions. Defined values are: 


0000 None implemented. 
0001 Adds the REV, REV16, and REVSH instructions. 
0010 As for 0001, and adds the RBIT instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


PSR_AR, bits [27:24] 


Indicates the implemented A and R profile instructions to manipulate the PSR. Defined values are: 


0000 None implemented. 
0001 Adds the MRS and MSR instructions, and the exception return forms of data-processing 
instructions. 


All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 
The exception return forms of the data-processing instructions are: 


•   In the A32 instruction set, data-processing instructions with the PC as the destination and the
    S bit set. These instructions might be affected by the WithShifts attribute.

•   In the T32 instruction set, the SUBS PC,LR,#N instruction.


MultU, bits [23:20] 


Indicates the implemented advanced unsigned Multiply instructions. Defined values are: 


0000 None implemented. 
0001 Adds the UMULL and UMLAL instructions. 
0010 As for 0001, and adds the UMAAL instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


MultS, bits [19:16] 


Indicates the implemented advanced signed Multiply instructions. Defined values are: 


0000 None implemented. 
0001 Adds the SMULL and SMLAL instructions. 
0010 As for 0001, and adds the SMLABB, SMLABT, SMLALBB, SMLALBT, SMLALTB,
     SMLALTT, SMLATB, SMLATT, SMLAWB, SMLAWT, SMULBB, SMULBT,
     SMULTB, SMULTT, SMULWB, and SMULWT instructions. Also adds the Q bit in the
     PSRs.


0011 As for 0010, and adds the SMLAD, SMLADX, SMLALD, SMLALDX, SMLSD, 
SMLSDX, SMLSLD, SMLSLDX, SMMLA, SMMLAR, SMMLS, SMMLSR, 
SMMUL, SMMULR, SMUAD, SMUADX, SMUSD, and SMUSDX instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0011. 
Mult, bits [15:12]


Indicates the implemented additional Multiply instructions. Defined values are: 


0000 No additional instructions implemented. This means only MUL is implemented. 
0001 Adds the MLA instruction. 
0010 As for 0001, and adds the MLS instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


MultiAccessInt, bits [11:8] 


Indicates the support for interruptible multi-access instructions. Defined values are: 


0000 No support. This means the LDM and STM instructions are not interruptible. 
0001 LDM and STM instructions are restartable. 
0010 LDM and STM instructions are continuable. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


MemHint, bits [7:4] 


Indicates the implemented Memory Hint instructions. Defined values are: 


0000 None implemented. 

0001 Adds the PLD instruction. 

0010 Adds the PLD instruction. (0001 and 0010 have identical effects.) 
0011 As for 0001 (or 0010), and adds the PLI instruction. 

0100 As for 0011, and adds the PLDW instruction. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0100. 
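
The MemHint values are cumulative, with 0001 and 0010 having identical effects. The following is a minimal sketch of decoding the field into the individual hint instructions. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Which Memory Hint instructions a MemHint value indicates
 * (hypothetical type). */
typedef struct {
    bool pld;
    bool pli;
    bool pldw;
} mem_hints;

/* Decode ID_ISAR2.MemHint, bits [7:4].
 * Returns false, leaving *out untouched, for reserved values. */
static bool decode_memhint(uint32_t isar2, mem_hints *out)
{
    unsigned v = (isar2 >> 4) & 0xFu;

    if (v > 4u)
        return false;        /* all other values are reserved */

    out->pld  = (v >= 1u);   /* 0001 and 0010 both add PLD */
    out->pli  = (v >= 3u);   /* 0011 adds PLI */
    out->pldw = (v >= 4u);   /* 0100 adds PLDW */
    return true;
}
```

For the ARMv8-A permitted value 0100, all three flags are set. For 0010, only pld is set.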


LoadStore, bits [3:0] 


Indicates the implemented additional load/store instructions. Defined values are: 


0000 No additional load/store instructions implemented. 
0001 Adds the LDRD and STRD instructions. 
0010 As for 0001, and adds the Load Acquire (LDAB, LDAH, LDA, LDAEXB, LDAEXH,
     LDAEX, LDAEXD) and Store Release (STLB, STLH, STL, STLEXB, STLEXH,
     STLEX, STLEXD) instructions.


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


Accessing the ID_ISAR2:
To access the ID_ISAR2:
MRC p15,0,<Rt>,c0,c2,2 ; Read ID_ISAR2 into Rt


Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0000   0010   010
G6.2.81 ID_ISAR3, Instruction Set Attribute Register 3
The ID_ISAR3 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 
Must be interpreted with ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR4, and ID_ISAR5.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RO        RO        RO              RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0   EL1   EL2(NS)
-     RO    RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


•   If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
    EL2.

•   If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_ISAR3 is architecturally mapped to AArch64 System register 
ID_ISAR3_EL1. 


Attributes 
ID_ISAR3 is a 32-bit register. 


Field descriptions 


The ID_ISAR3 bit assignments are:

Bits    31:28   27:24     23:20     19:16       15:12       11:8   7:4    3:0
Field   T32EE   TrueNOP   T32Copy   TabBranch   SynchPrim   SVC    SIMD   Saturate


T32EE, bits [31:28]


Indicates the implemented T32EE instructions. Defined values are: 
0000 None implemented. 


0001 Adds the ENTERX and LEAVEX instructions, and modifies the load behavior to 
include null checking. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


TrueNOP, bits [27:24] 


Indicates the implemented true NOP instructions. Defined values are: 


0000 None implemented. This means there are no NOP instructions that do not have any 
register dependencies. 


0001 Adds true NOP instructions in both the T32 and A32 instruction sets. This also permits 
additional NOP-compatible hints. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


T32Copy, bits [23:20] 


Indicates the support for T32 non flag-setting MOV instructions. Defined values are: 


0000 Not supported. This means that in the T32 instruction set, encoding T1 of the MOV 
(register) instruction does not support a copy from a low register to a low register. 


0001 Adds support for T32 instruction set encoding T1 of the MOV (register) instruction, 
copying from a low register to a low register. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


TabBranch, bits [19:16] 


Indicates the implemented Table Branch instructions in the T32 instruction set. Defined values are: 
0000 None implemented. 

0001 Adds the TBB and TBH instructions. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


SynchPrim, bits [15:12] 


Used in conjunction with ID_ISAR4.SynchPrim_frac to indicate the implemented Synchronization 
Primitive instructions. Defined values are: 


0000 If SynchPrim_frac == 0000, no Synchronization Primitives implemented. 

0001 If SynchPrim_frac == 0000, adds the LDREX and STREX instructions.
If SynchPrim_frac == 0011, also adds the CLREX, LDREXB, STREXB, and STREXH 
instructions. 

0010 If SynchPrim_frac == 0000, as for [0001, 0011] and also adds the LDREXD and 
STREXD instructions. 


All other combinations of SynchPrim and SynchPrim_frac are reserved. 


In ARMv8-A the only permitted value is 0010. 
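
Because SynchPrim must be read together with ID_ISAR4.SynchPrim_frac, software typically decodes the pair as one unit. The following is a minimal sketch that maps the defined combinations to a support level and reports every other combination as reserved. The enum and function names are hypothetical.

```c
#include <stdint.h>

/* Support levels for the Synchronization Primitive instructions
 * (hypothetical names). Each level includes the ones before it. */
typedef enum {
    SYNC_RESERVED = -1, /* combination not defined by the architecture */
    SYNC_NONE,          /* no Synchronization Primitives implemented */
    SYNC_WORD,          /* LDREX and STREX */
    SYNC_BYTE_HALF,     /* adds CLREX and the byte and halfword exclusives */
    SYNC_DOUBLE         /* adds LDREXD and STREXD */
} sync_level;

static sync_level decode_synchprim(uint32_t isar3, uint32_t isar4)
{
    unsigned prim = (isar3 >> 12) & 0xFu; /* ID_ISAR3.SynchPrim, bits [15:12] */
    unsigned frac = (isar4 >> 20) & 0xFu; /* ID_ISAR4.SynchPrim_frac, bits [23:20] */

    if (prim == 0u && frac == 0u) return SYNC_NONE;
    if (prim == 1u && frac == 0u) return SYNC_WORD;
    if (prim == 1u && frac == 3u) return SYNC_BYTE_HALF;
    if (prim == 2u && frac == 0u) return SYNC_DOUBLE;
    return SYNC_RESERVED;
}
```

In ARMv8-A, where SynchPrim is 0010 and SynchPrim_frac is 0000, this sketch returns SYNC_DOUBLE.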


SVC, bits [11:8] 


Indicates the implemented SVC instructions. Defined values are: 
0000 Not implemented. 
0001 Adds the SVC instruction. 


All other values are reserved. 
In ARMv8-A the only permitted value is 0001.


SIMD, bits [7:4] 


Indicates the implemented SIMD instructions. Defined values are: 


0000 None implemented. 
0001 Adds the SSAT and USAT instructions, and the Q bit in the PSRs. 
0011 As for 0001, and adds the PKHBT, PKHTB, QADD16, QADD8, QASX, QSUB16,
     QSUB8, QSAX, SADD16, SADD8, SASX, SEL, SHADD16, SHADD8, SHASX,
     SHSUB16, SHSUB8, SHSAX, SSAT16, SSUB16, SSUB8, SSAX, SXTAB16,
     SXTB16, UADD16, UADD8, UASX, UHADD16, UHADD8, UHASX, UHSUB16,
     UHSUB8, UHSAX, UQADD16, UQADD8, UQASX, UQSUB16, UQSUB8, UQSAX,
     USAD8, USADA8, USAT16, USUB16, USUB8, USAX, UXTAB16, and UXTB16
     instructions. Also adds support for the GE[3:0] bits in the PSRs.


All other values are reserved. 
In ARMv8-A the only permitted value is 0011. 


The SIMD field relates only to implemented instructions that perform SIMD operations on the 
general-purpose registers. In an implementation that supports floating-point and Advanced SIMD 
instructions, MVFR0, MVFR1, and MVFR2 give information about the implemented Advanced
SIMD instructions. 


Saturate, bits [3:0] 


Indicates the implemented Saturate instructions. Defined values are: 


0000 None implemented. This means no non-Advanced SIMD saturate instructions are 
implemented. 
0001 Adds the QADD, QDADD, QDSUB, and QSUB instructions, and the Q bit in the PSRs. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Accessing the ID_ISAR3:
To access the ID_ISAR3:
MRC p15,0,<Rt>,c0,c2,3 ; Read ID_ISAR3 into Rt


Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0000   0010   011
G6.2.82 ID_ISAR4, Instruction Set Attribute Register 4
The ID_ISAR4 characteristics are: 


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 
Must be interpreted with ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, and ID_ISAR5.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)   EL0(S)   EL1(NS)   EL2(NS)   EL3(SCR.NS=1)   EL3(SCR.NS=0)
-         -        RO        RO        RO              RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0   EL1   EL2(NS)
-     RO    RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


•   If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
    EL2.

•   If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
    mode.

•   If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_ISAR4 is architecturally mapped to AArch64 System register 
ID_ISAR4_EL1. 


Attributes 
ID_ISAR4 is a 32-bit register. 


Field descriptions 


The ID_ISAR4 bit assignments are:

Bits    31:28      27:24   23:20            19:16     15:12   11:8        7:4          3:0
Field   SWP_frac   PSR_M   SynchPrim_frac   Barrier   SMC     Writeback   WithShifts   Unpriv


SWP_frac, bits [31:28] 


Indicates support for the memory system locking the bus for SWP or SWPB instructions. Defined
values are:

0000 SWP or SWPB instructions not implemented.
0001 SWP or SWPB implemented but only in a uniprocessor context. SWP and SWPB do not
     guarantee whether memory accesses from other masters can come between the load
     memory access and the store memory access of the SWP or SWPB.


All other values are reserved. This field is valid only if the ID_ISAR0.Swap_instrs field is 0000.
In ARMv8-A the only permitted value is 0000.


PSR_M, bits [27:24] 
Indicates the implemented M profile instructions to modify the PSRs. Defined values are: 
0000 None implemented. 
0001 Adds the M profile forms of the CPS, MRS, and MSR instructions. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


SynchPrim_frac, bits [23:20] 


Used in conjunction with ID_ISAR3.SynchPrim to indicate the implemented Synchronization 
Primitive instructions. Possible values are: 


0000 If SynchPrim == 0000, no Synchronization Primitives implemented. If SynchPrim ==
     0001, adds the LDREX and STREX instructions. If SynchPrim == 0010, also adds the
     CLREX, LDREXB, LDREXH, STREXB, STREXH, LDREXD, and STREXD
     instructions.


0011 If SynchPrim == 0001, adds the LDREX, STREX, CLREX, LDREXB, LDREXH, 
STREXB, and STREXH instructions. 


All other combinations of SynchPrim and SynchPrim_frac are reserved. 
In ARMv8-A the only permitted value is 0000. 
Barrier, bits [19:16] 


Indicates the implemented Barrier instructions in the A32 and T32 instruction sets. Defined values
are:

0000 None implemented. Barrier operations are provided only as System instructions in the
     (coproc==1111) encoding space.

0001 Adds the DMB, DSB, and ISB barrier instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


SMC, bits [15:12] 
Indicates the implemented SMC instructions. Defined values are: 
0000 None implemented. 
0001 Adds the SMC instruction. 


All other values are reserved. 
In ARMv8-A the only permitted value is 0001.


Writeback, bits [11:8]


Indicates the support for Writeback addressing modes. Defined values are: 


0000 Basic support. Only the LDM, STM, PUSH, POP, SRS, and RFE instructions support 
writeback addressing modes. These instructions support all of their writeback 
addressing modes. 


0001 Adds support for all of the writeback addressing modes. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


WithShifts, bits [7:4] 


Indicates the support for instructions with shifts. Defined values are: 


0000 Nonzero shifts supported only in MOV and shift instructions. 
0001 Adds support for shifts of loads and stores over the range LSL 0-3. 
0011 As for 0001, and adds support for other constant shift options, both on load/store and
     other instructions.
0100 As for 0011, and adds support for register-controlled shift options. 
All other values are reserved. 
In ARMv8-A the only permitted value is 0100. 
Unpriv, bits [3:0] 


Indicates the implemented unprivileged instructions. Defined values are: 


0000 None implemented. No T variant instructions are implemented. 
0001 Adds the LDRBT, LDRT, STRBT, and STRT instructions. 
0010 As for 0001, and adds the LDRHT, LDRSBT, LDRSHT, and STRHT instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


Accessing the ID_ISAR4:
To access the ID_ISAR4:
MRC p15,0,<Rt>,c0,c2,4 ; Read ID_ISAR4 into Rt


Register access is encoded as follows:

coproc   opc1   CRn    CRm    opc2
1111     000    0000   0010   100










G6.2.83 ID_ISAR5, Instruction Set Attribute Register 5


The ID_ISAR5 characteristics are:


Purpose 
Provides information about the instruction sets implemented by the PE in AArch32. 
Must be interpreted with ID_ISAR0, ID_ISAR1, ID_ISAR2, ID_ISAR3, and ID_ISAR4.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:


EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_ISAR5 is architecturally mapped to AArch64 System register
ID_ISAR5_EL1.


Attributes
ID_ISAR5 is a 32-bit register.


Field descriptions


The ID_ISAR5 bit assignments are:


31     20 19   16 15   12 11    8 7     4 3     0
RES0      CRC32   SHA2    SHA1    AES     SEVL


Bits [31:20]
Reserved, RES0.


CRC32, bits [19:16] 


Indicates whether CRC32 instructions are implemented in AArch32. 


0000 No CRC32 instructions implemented. 
0001 CRC32B, CRC32H, CRC32W, CRC32CB, CRC32CH, and CRC32CW instructions 
implemented. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SHA2, bits [15:12]
Indicates whether SHA2 instructions are implemented in AArch32.
0000 No SHA2 instructions implemented.
0001 SHA256H, SHA256H2, SHA256SU0, and SHA256SU1 implemented.
All other values are reserved.
In ARMv8-A the permitted values are 0000 and 0001.
SHA1, bits [11:8]
Indicates whether SHA1 instructions are implemented in AArch32.
0000 No SHA1 instructions implemented.
0001 SHA1C, SHA1P, SHA1M, SHA1H, SHA1SU0, and SHA1SU1 implemented.
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


AES, bits [7:4] 


Indicates whether AES instructions are implemented in AArch32. 


0000 No AES instructions implemented. 
0001 AESE, AESD, AESMC, and AESIMC implemented. 
0010 As for 0001, plus PMULL/PMULL2 instructions operating on 64-bit data quantities. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0010. 
SEVL, bits [3:0] 
Indicates whether the SEVL instruction is implemented in AArch32. 
0000 SEVL is implemented as a NOP. 
0001 SEVL is implemented as Send Event Local. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 
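

The ID_ISAR5 fields are all 4 bits wide, so a value read from the register can be split with shifts and masks. This is a minimal sketch in Python; `ID_ISAR5_LSB`, `decode_id_isar5`, and `has_64bit_polynomial_multiply` are hypothetical names, and the sample value is invented for illustration.

```python
# Illustrative sketch: least significant bit of each 4-bit field described above.
ID_ISAR5_LSB = {"CRC32": 16, "SHA2": 12, "SHA1": 8, "AES": 4, "SEVL": 0}


def decode_id_isar5(value):
    """Split a 32-bit ID_ISAR5 value into its named 4-bit fields.

    Bits [31:20] are RES0 and are ignored here.
    """
    return {name: (value >> lsb) & 0xF for name, lsb in ID_ISAR5_LSB.items()}


def has_64bit_polynomial_multiply(value):
    """True when the AES field is 0010, the value that adds the 64-bit
    polynomial multiply instructions to the AES instructions."""
    return decode_id_isar5(value)["AES"] == 0b0010


# Invented sample value: CRC32, SHA2, SHA1 = 0001, AES = 0010, SEVL = 0001.
fields = decode_id_isar5(0x00011121)
```

Software should compare each field against the defined values rather than testing individual bits, because all other values are reserved.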


Accessing the ID_ISAR5:
To access the ID_ISAR5:
MRC p15,0,<Rt>,c0,c2,5 ; Read ID_ISAR5 into Rt


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   0000  0010  101


G6.2.84 ID_MMFR0, Memory Model Feature Register 0
The ID_MMFR0 characteristics are:


Purpose 


Provides information about the implemented memory model and memory management support in 
AArch32. 


Must be interpreted with ID_MMFR1, ID_MMFR2, ID_MMFR3, and ID_MMFR4. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:


EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_MMFR0 is architecturally mapped to AArch64 System register
ID_MMFR0_EL1.


Attributes
ID_MMFR0 is a 32-bit register.


Field descriptions


The ID_MMFR0 bit assignments are:


31     28 27  24 23   20 19 16 15     12 11      8 7   4 3    0
InnerShr  FCSE   AuxReg  TCM   ShareLvl  OuterShr  PMSA  VMSA


InnerShr, bits [31:28] 


Innermost Shareability. Indicates the innermost shareability domain implemented. Defined values
are:


0000 Implemented as Non-cacheable.
0001 Implemented with hardware coherency support.
1111 Shareability ignored.


All other values are reserved.
In ARMv8 the permitted values are 0000, 0001, and 1111.


This field is valid only if the implementation supports two levels of shareability, as indicated by
ID_MMFR0.ShareLvl having the value 0001.


When ID_MMFR0.ShareLvl is zero, this field is UNK.
FCSE, bits [27:24] 
Indicates whether the implementation includes the FCSE. Defined values are: 
0000 Not supported. 
0001 Support for FCSE. 
All other values are reserved. 


In ARMv8 the only permitted value is 0000.


AuxReg, bits [23:20] 


Auxiliary Registers. Indicates support for Auxiliary registers. Defined values are: 


0000 None supported. 
0001 Support for Auxiliary Control Register only. 
0010 Support for Auxiliary Fault Status Registers (AIFSR and ADFSR) and Auxiliary
Control Register.
All other values are reserved.
In ARMv8 the only permitted value is 0010.


Note


Accesses to unimplemented Auxiliary registers are UNDEFINED.





TCM, bits [19:16] 
Indicates support for TCMs and associated DMAs. Defined values are: 


0000 Not supported. 

0001 Support is IMPLEMENTATION DEFINED. ARMv7 requires this setting. 
0010 Support for TCM only, ARMv6 implementation. 

0011 Support for TCM and DMA, ARMv6 implementation. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


ShareLvl, bits [15:12] 
Shareability Levels. Indicates the number of shareability levels implemented. Defined values are: 


0000 One level of shareability implemented. 
0001 Two levels of shareability implemented.
All other values are reserved.


In ARMv8 the only permitted value is 0001.


OuterShr, bits [11:8]


Outermost Shareability. Indicates the outermost shareability domain implemented. Defined values
are:


0000 Implemented as Non-cacheable.
0001 Implemented with hardware coherency support.
1111 Shareability ignored.


All other values are reserved.


In ARMv8 the permitted values are 0000, 0001, and 1111.


PMSA, bits [7:4]


Indicates support for a PMSA. Defined values are:


0000 Not supported.
0001 Support for IMPLEMENTATION DEFINED PMSA.
0010 Support for PMSAv6, with a Cache Type Register implemented.
0011 Support for PMSAv7, with support for memory subsections. ARMv7-R profile.


All other values are reserved.


In ARMv8-A the only permitted value is 0000.


VMSA, bits [3:0]


Indicates support for a VMSA. Defined values are:


0000 Not supported.
0001 Support for IMPLEMENTATION DEFINED VMSA.
0010 Support for VMSAv6, with Cache and TLB Type Registers implemented.
0011 Support for VMSAv7, with support for remapping and the Access flag. ARMv7-A
profile.
0100 As for 0011, and adds support for the PXN bit in the Short-descriptor translation table
format descriptors.
0101 As for 0100, and adds support for the Long-descriptor translation table format.
All other values are reserved.


In ARMv8-A the only permitted value is 0101.
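

The dependency between InnerShr and ShareLvl is easy to overlook when decoding ID_MMFR0. The following Python sketch, with the hypothetical names `ID_MMFR0_FIELDS` and `decode_id_mmfr0`, splits a register value into its eight 4-bit fields and reports InnerShr as `None` unless ShareLvl is 0001. The sample values in the usage lines are invented for illustration.

```python
# Illustrative sketch: field names in order from bits [3:0] up to bits [31:28].
ID_MMFR0_FIELDS = [
    "VMSA", "PMSA", "OuterShr", "ShareLvl", "TCM", "AuxReg", "FCSE", "InnerShr",
]


def decode_id_mmfr0(value):
    """Split a 32-bit ID_MMFR0 value into named 4-bit fields.

    InnerShr is valid only when ShareLvl is 0001 (two levels of
    shareability); otherwise it is returned as None.
    """
    fields = {
        name: (value >> (4 * index)) & 0xF
        for index, name in enumerate(ID_MMFR0_FIELDS)
    }
    if fields["ShareLvl"] != 0b0001:
        fields["InnerShr"] = None
    return fields


two_levels = decode_id_mmfr0(0x10201105)   # ShareLvl = 0001, InnerShr valid
one_level = decode_id_mmfr0(0x10200105)    # ShareLvl = 0000, InnerShr ignored
```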


Accessing the ID_MMFR0:


To access the ID_MMFR0:


MRC p15,0,<Rt>,c0,c1,4 ; Read ID_MMFR0 into Rt


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   0000  0001  100










G6.2.85 ID_MMFR1, Memory Model Feature Register 1 
The ID_MMFR1 characteristics are:


Purpose 


Provides information about the implemented memory model and memory management support in 
AArch32. 


Must be interpreted with ID_MMFR0, ID_MMFR2, ID_MMFR3, and ID_MMFR4.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:


EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_MMFR1 is architecturally mapped to AArch64 System register 
ID_MMFR1_EL1. 


Attributes 
ID_MMFR1 is a 32-bit register. 


Field descriptions 


The ID_MMFR1 bit assignments are: 


31  28 27     24 23  20 19  16 15    12 11     8 7      4 3      0
BPred  L1TstCln  L1Uni  L1Hvd  L1UniSW  L1HvdSW  L1UniVA  L1HvdVA


BPred, bits [31:28] 


Branch Predictor. Indicates branch predictor management requirements. Defined values are: 


0000 No branch predictor, or no MMU present. Implies a fixed MPU configuration.
0001 Branch predictor requires flushing on:
• Enabling or disabling a stage of address translation.
• Writing new data to instruction locations.
• Writing new mappings to the translation tables.
• Changes to the TTBR0, TTBR1, or TTBCR registers.
• Changes to the ContextID or ASID, or to the FCSE ProcessID if this is supported.
0010 Branch predictor requires flushing on:
• Enabling or disabling a stage of address translation.
• Writing new data to instruction locations.
• Writing new mappings to the translation tables.
• Any change to the TTBR0, TTBR1, or TTBCR registers without a change to the
corresponding ContextID or ASID, or FCSE ProcessID if this is supported.
0011 Branch predictor requires flushing only on writing new data to instruction locations.
0100 For execution correctness, branch predictor requires no flushing at any time.


All other values are reserved. 


In ARMv8-A the permitted values are 0010, 0011, or 0100. For values other than 0000 and 0100 the
ARM Architecture Reference Manual, or the product documentation, might give more information 
about the required maintenance. 
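

The BPred values can be summarized as a lookup from field value to the set of events that require a branch predictor flush. The Python below is a sketch only: the event labels and the names `BPRED_FLUSH_EVENTS` and `needs_branch_predictor_flush` are assumptions introduced for illustration, and the table paraphrases the value descriptions above.

```python
# Illustrative sketch: event labels are hypothetical shorthand for the
# conditions listed in the BPred value descriptions.
BPRED_FLUSH_EVENTS = {
    0b0000: set(),  # no branch predictor, or no MMU present
    0b0001: {
        "translation_enable_change",
        "instruction_write",
        "translation_table_write",
        "ttbr_or_ttbcr_change",
        "contextid_or_asid_change",
    },
    0b0010: {
        "translation_enable_change",
        "instruction_write",
        "translation_table_write",
        "ttbr_or_ttbcr_change_without_asid_change",
    },
    0b0011: {"instruction_write"},
    0b0100: set(),  # no flushing required for execution correctness
}


def needs_branch_predictor_flush(bpred, event):
    """Return True if the given event requires a flush for this BPred value."""
    if bpred not in BPRED_FLUSH_EVENTS:
        raise ValueError("reserved BPred value")
    return event in BPRED_FLUSH_EVENTS[bpred]
```

For example, with BPred equal to 0011 only `"instruction_write"` returns True, and with 0100 every event returns False.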


L1TstCln, bits [27:24] 


Level 1 cache Test and Clean. Indicates the supported Level 1 data cache test and clean operations, 
for Harvard or unified cache implementations. Defined values are: 


0000 None supported.
0001 Supported Level 1 data cache test and clean operations are:
• Test and clean data cache.
0010 As for 0001, and adds:
• Test, clean, and invalidate data cache.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1Uni, bits [23:20] 


Level 1 Unified cache. Indicates the supported entire Level 1 cache maintenance operations for a 
unified cache implementation. Defined values are: 





0000 None supported. 
0001 Supported entire Level 1 cache operations are:
• Invalidate cache, including branch predictor if appropriate.
• Invalidate branch predictor, if appropriate.
0010 As for 0001, and adds:
• Clean cache, using a recursive model that uses the cache dirty status bit.
• Clean and invalidate cache, using a recursive model that uses the cache dirty
status bit.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1Hvd, bits [19:16] 


Level 1 Harvard cache. Indicates the supported entire Level 1 cache maintenance operations for a 
Harvard cache implementation. Defined values are: 


0000 None supported. 
0001 Supported entire Level 1 cache operations are:
• Invalidate instruction cache, including branch predictor if appropriate.
• Invalidate branch predictor, if appropriate.
0010 As for 0001, and adds:
• Invalidate data cache.
• Invalidate data cache and instruction cache, including branch predictor if
appropriate.
0011 As for 0010, and adds:
• Clean data cache, using a recursive model that uses the cache dirty status bit.
• Clean and invalidate data cache, using a recursive model that uses the cache dirty
status bit.


All other values are reserved. 
In ARMv8-A the only permitted value is 0000. 
L1UniSW, bits [15:12] 


Level 1 Unified cache by Set/Way. Indicates the supported Level 1 cache line maintenance 
operations by set/way, for a unified cache implementation. Defined values are: 


0000 None supported. 

0001 Supported Level 1 unified cache line maintenance operations by set/way are:
• Clean cache line by set/way.
0010 As for 0001, and adds:
• Clean and invalidate cache line by set/way.
0011 As for 0010, and adds:
• Invalidate cache line by set/way.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1HvdSW, bits [11:8]


Level 1 Harvard cache by Set/Way. Indicates the supported Level 1 cache line maintenance 
operations by set/way, for a Harvard cache implementation. Defined values are: 


0000 None supported. 
0001 Supported Level 1 Harvard cache line maintenance operations by set/way are:
• Clean data cache line by set/way.
• Clean and invalidate data cache line by set/way.
0010 As for 0001, and adds:
• Invalidate data cache line by set/way.
0011 As for 0010, and adds:
• Invalidate instruction cache line by set/way.


All other values are reserved.


In ARMv8-A the only permitted value is 0000.


L1UniVA, bits [7:4] 


Level 1 Unified cache by Virtual Address. Indicates the supported Level 1 cache line maintenance 
operations by VA, for a unified cache implementation. Defined values are: 


0000 None supported. 

0001 Supported Level 1 unified cache line maintenance operations by VA are:
• Clean cache line by VA.
• Invalidate cache line by VA.
• Clean and invalidate cache line by VA.
0010 As for 0001, and adds:
• Invalidate branch predictor by VA, if branch predictor is implemented.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


L1HvdVA, bits [3:0] 


Level 1 Harvard cache by Virtual Address. Indicates the supported Level 1 cache line maintenance 
operations by VA, for a Harvard cache implementation. Defined values are: 


0000 None supported. 
0001 Supported Level 1 Harvard cache line maintenance operations by VA are:
• Clean data cache line by VA.
• Invalidate data cache line by VA.
• Clean and invalidate data cache line by VA.
• Clean instruction cache line by VA.
0010 As for 0001, and adds:
• Invalidate branch predictor by VA, if branch predictor is implemented.


All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


Accessing the ID_MMFR1:
To access the ID_MMFR1:
MRC p15,0,<Rt>,c0,c1,5 ; Read ID_MMFR1 into Rt


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   0000  0001  101


G6.2.86 ID_MMFR2, Memory Model Feature Register 2
The ID_MMFR2 characteristics are:


Purpose 


Provides information about the implemented memory model and memory management support in 
AArch32. 


Must be interpreted with ID_MMFR0, ID_MMFR1, ID_MMFR3, and ID_MMFR4.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:


EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_MMFR2 is architecturally mapped to AArch64 System register 
ID_MMFR2_EL1. 


Attributes 
ID_MMFR2 is a 32-bit register.


Field descriptions


The ID_MMFR2 bit assignments are:


31     28 27     24 23    20 19   16 15   12 11      8 7      4 3      0
HWAccFlg  WFIStall  MemBarr  UniTLB  HvdTLB  L1HvdRng  L1HvdBG  L1HvdFG


HWAccFlg, bits [31:28]


Hardware Access Flag. In earlier versions of the ARM Architecture, this field indicates support for 
a Hardware Access flag, as part of the VMSAv7 implementation. Defined values are: 


0000 Not supported. 
0001 Support for VMSAv7 Access flag, updated in hardware. 
All other values are reserved. 


In ARMv8 the only permitted value is 0000.


WFIStall, bits [27:24] 
Wait For Interrupt Stall. Indicates the support for Wait For Interrupt (WFI) stalling. Defined values
are:
0000 Not supported.
0001 Support for WFI stalling.


All other values are reserved.


In ARMv8 the permitted values are 0000 and 0001.


MemBarr, bits [23:20] 


Memory Barrier. Indicates the supported memory barrier System instructions in the (coproc==1111) 
encoding space: 


0000 None supported. 

0001 Supported memory barrier System instructions are:
• Data Synchronization Barrier (DSB).
0010 As for 0001, and adds:
• Instruction Synchronization Barrier (ISB).
• Data Memory Barrier (DMB).
All other values are reserved.
In ARMv8 the only permitted value is 0010.


ARM deprecates the use of these operations. ID_ISAR4.Barrier_instrs indicates the level of support 
for the preferred barrier instructions. 


UniTLB, bits [19:16] 


Unified TLB. Indicates the supported TLB maintenance operations, for a unified TLB 
implementation. Defined values are: 


0000 Not supported. 

0001 Supported unified TLB maintenance operations are:
• Invalidate all entries in the TLB.
• Invalidate TLB entry by VA.
0010 As for 0001, and adds:
• Invalidate TLB entries by ASID match.
0011 As for 0010, and adds:
• Invalidate instruction TLB and data TLB entries by VA All ASID. This is a
shared unified TLB operation.
0100 As for 0011, and adds:
• Invalidate Hyp mode unified TLB entry by VA.
• Invalidate entire Non-secure PL1&0 unified TLB.
• Invalidate entire Hyp mode unified TLB.
0101 As for 0100, and adds the following operations: TLBIMVALIS, TLBIMVAALIS,
TLBIMVALHIS, TLBIMVAL, TLBIMVAAL, TLBIMVALH.
0110 As for 0101, and adds the following operations: TLBIIPAS2IS, TLBIIPAS2LIS,
TLBIIPAS2, TLBIIPAS2L.
All other values are reserved. 


In ARMv8-A the only permitted value is 0110. 
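

Each UniTLB value is defined as the previous value plus additional operations, so the full set for a given value is the union of every increment up to it. The Python sketch below illustrates that cumulative reading; `UNITLB_INCREMENTS` and `unitlb_operations` are hypothetical names, and the descriptive strings paraphrase the list above.

```python
# Illustrative sketch: operations that each UniTLB value adds over the previous one.
UNITLB_INCREMENTS = {
    0b0001: [
        "invalidate all entries in the TLB",
        "invalidate TLB entry by VA",
    ],
    0b0010: ["invalidate TLB entries by ASID match"],
    0b0011: ["invalidate instruction TLB and data TLB entries by VA All ASID"],
    0b0100: [
        "invalidate Hyp mode unified TLB entry by VA",
        "invalidate entire Non-secure PL1&0 unified TLB",
        "invalidate entire Hyp mode unified TLB",
    ],
    0b0101: [
        "TLBIMVALIS", "TLBIMVAALIS", "TLBIMVALHIS",
        "TLBIMVAL", "TLBIMVAAL", "TLBIMVALH",
    ],
    0b0110: ["TLBIIPAS2IS", "TLBIIPAS2LIS", "TLBIIPAS2", "TLBIIPAS2L"],
}


def unitlb_operations(level):
    """Return every operation implied by a UniTLB field value.

    0000 means not supported and yields an empty list; values above 0110
    are reserved and raise ValueError.
    """
    if level != 0 and level not in UNITLB_INCREMENTS:
        raise ValueError("reserved UniTLB value")
    operations = []
    for step in range(1, level + 1):
        operations.extend(UNITLB_INCREMENTS[step])
    return operations
```

With the ARMv8-A value 0110, the result contains all 17 operations, including the four TLBIIPAS2 variants that 0101 does not provide.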


HvdTLB, bits [15:12] 


If the Unified TLB field (UniTLB, bits [19:16]) is not 0000, then the meaning of this field is 
IMPLEMENTATION DEFINED. ARM deprecates the use of this field by software. 


L1HvdRng, bits [11:8] 


Level 1 Harvard cache Range. Indicates the supported Level 1 cache maintenance range operations, 
for a Harvard cache implementation. Defined values are: 


0000 Not supported. 

0001 Supported Level 1 Harvard cache maintenance range operations are:
• Invalidate data cache range by VA.
• Invalidate instruction cache range by VA.
• Clean data cache range by VA.
• Clean and invalidate data cache range by VA.
All other values are reserved.


In ARMv8 the only permitted value is 0000.


L1HvdBG, bits [7:4] 


Level 1 Harvard cache Background fetch. Indicates the supported Level 1 cache background fetch 
operations, for a Harvard cache implementation. When supported, background fetch operations are 
non-blocking operations. Defined values are: 


0000 Not supported. 
0001 Supported Level 1 Harvard cache background fetch operations are:
• Fetch instruction cache range by VA.
• Fetch data cache range by VA.
All other values are reserved.


In ARMv8 the only permitted value is 0000.


L1HvdFG, bits [3:0]


Level 1 Harvard cache Foreground fetch. Indicates the supported Level 1 cache foreground fetch 
operations, for a Harvard cache implementation. When supported, foreground fetch operations are 
blocking operations. Defined values are: 


0000 Not supported. 
0001 Supported Level 1 Harvard cache foreground fetch operations are:
• Fetch instruction cache range by VA.
• Fetch data cache range by VA.
All other values are reserved.


In ARMv8 the only permitted value is 0000.


Accessing the ID_MMFR2:
To access the ID_MMFR2:
MRC p15,0,<Rt>,c0,c1,6 ; Read ID_MMFR2 into Rt


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   0000  0001  110


G6.2.87 ID_MMFR3, Memory Model Feature Register 3
The ID_MMFR3 characteristics are: 


Purpose 


Provides information about the implemented memory model and memory management support in 
AArch32. 


Must be interpreted with ID_MMFR0, ID_MMFR1, ID_MMFR2, and ID_MMFR4.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:


EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_MMFR3 is architecturally mapped to AArch64 System register 
ID_MMFR3_EL1. 


Attributes 
ID_MMFR3 is a 32-bit register.


Field descriptions


The ID_MMFR3 bit assignments are:


31     28 27   24 23    20 19 16 15      12 11     8 7       4 3       0
Supersec  CMemSz  CohWalk  RES0  MaintBcst  BPMaint  CMaintSW  CMaintVA





Supersec, bits [31:28] 


Supersections. On a VMSA implementation, indicates whether Supersections are supported. 
Defined values are: 


0000 Supersections supported. 
1111 Supersections not supported. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 1111. 


CMemSz, bits [27:24]


Cached Memory Size. Indicates the physical memory size supported by the caches. Defined values
are:
0000 4GB, corresponding to a 32-bit physical address range.
0001 64GB, corresponding to a 36-bit physical address range.
0010 1TB or more, corresponding to a 40-bit or larger physical address range.


All other values are reserved.


In ARMv8-A the permitted values are 0000, 0001, and 0010.
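

The relationship between each CMemSz value and its address range is a power of two: 4GB is 2^32 bytes, 64GB is 2^36 bytes, and 1TB is 2^40 bytes. A minimal Python sketch follows, with the hypothetical names `CMEMSZ_PA_BITS` and `cached_memory_bytes`. For 0010 the result is a lower bound, because that value means 40 bits or more.

```python
# Illustrative sketch: physical address bits implied by each defined CMemSz value.
CMEMSZ_PA_BITS = {
    0b0000: 32,  # 4GB
    0b0001: 36,  # 64GB
    0b0010: 40,  # 1TB or more (lower bound)
}


def cached_memory_bytes(cmemsz):
    """Return the cached physical memory size in bytes for a CMemSz value.

    For 0010 this is the minimum size. Reserved values raise ValueError.
    """
    if cmemsz not in CMEMSZ_PA_BITS:
        raise ValueError("reserved CMemSz value")
    return 1 << CMEMSZ_PA_BITS[cmemsz]
```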


CohWalk, bits [23:20] 


Coherent Walk. Indicates whether Translation table updates require a clean to the point of 
unification. Defined values are: 


0000 Updates to the translation tables require a clean to the point of unification to ensure 
visibility by subsequent translation table walks. 


0001 Updates to the translation tables do not require a clean to the point of unification to 
ensure visibility by subsequent translation table walks. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Bits [19:16] 


Reserved, RES0.


MaintBcst, bits [15:12] 


Maintenance Broadcast. Indicates whether Cache, TLB, and branch predictor operations are 
broadcast. Defined values are: 


0000 Cache, TLB, and branch predictor operations only affect local structures. 


0001 Cache and branch predictor operations affect structures according to shareability and 
defined behavior of instructions. TLB operations only affect local structures. 


0010 Cache, TLB, and branch predictor operations affect structures according to shareability 
and defined behavior of instructions. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


BPMaint, bits [11:8] 


Branch Predictor Maintenance. Indicates the supported branch predictor maintenance operations in 
an implementation with hierarchical cache maintenance operations. Defined values are: 


0000 None supported. 
0001 Supported branch predictor maintenance operations are:
• Invalidate all branch predictors.
0010 As for 0001, and adds:
• Invalidate branch predictors by VA.
All other values are reserved. 


In ARMv8-A the only permitted value is 0010. 


CMaintSW, bits [7:4] 


Cache Maintenance by Set/Way. Indicates the supported cache maintenance operations by set/way, 
in an implementation with hierarchical caches. Defined values are: 


0000 None supported. 

0001 Supported hierarchical cache maintenance instructions by set/way are:
• Invalidate data cache by set/way.
• Clean data cache by set/way.
• Clean and invalidate data cache by set/way.


All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 
In a unified cache implementation, the data cache maintenance operations apply to the unified 
caches. 
CMaintVA, bits [3:0] 


Cache Maintenance by Virtual Address. Indicates the supported cache maintenance operations by 
VA, in an implementation with hierarchical caches. Defined values are: 


0000 None supported. 

0001 Supported hierarchical cache maintenance operations by VA are:
• Invalidate data cache by VA.
• Clean data cache by VA.
• Clean and invalidate data cache by VA.
• Invalidate instruction cache by VA.
• Invalidate all instruction cache entries.


All other values are reserved. 
In ARMv8-A the only permitted value is 0001. 


In a unified cache implementation, data cache maintenance operations apply to the unified caches, 
and the instruction cache maintenance instructions are not implemented. 

Accessing the ID_MMFR3: 

To access the ID_MMFR3:

MRC p15,0,<Rt>,c0,c1,7 ; Read ID_MMFR3 into Rt


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   0000  0001  111


G6.2.88 ID_MMFR4, Memory Model Feature Register 4
The ID_MMFR4 characteristics are: 


Purpose 


Provides information about the implemented memory model and memory management support in 
AArch32. 


Must be interpreted with ID_MMFR0, ID_MMFR1, ID_MMFR2, and ID_MMFR3.


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:


EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If EL2 is using AArch32 and HCR.TID3==1, then:


— If the register is not RAZ/WI then Non-secure EL1 read accesses to the register are
trapped to Hyp mode.


— Otherwise, it is IMPLEMENTATION DEFINED whether Non-secure EL1 read accesses to
this register are trapped to Hyp mode.


• If EL2 is using AArch64 and HCR_EL2.TID3==1, then:


— If the register is not RAZ/WI then Non-secure EL1 read accesses to the register are
trapped to EL2.


— Otherwise, it is IMPLEMENTATION DEFINED whether Non-secure EL1 read accesses to
this register are trapped to EL2.


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.


• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
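

The TID3 rules for ID_MMFR4 have three outcomes, which the following Python sketch makes explicit. It models only the two TID3 bullets, not the HSTR traps or exception prioritization, and `tid3_read_outcome` is a hypothetical name. The same logic applies whether EL2 is using AArch32 (HCR.TID3, trap to Hyp mode) or AArch64 (HCR_EL2.TID3, trap to EL2).

```python
def tid3_read_outcome(tid3_set, register_is_raz_wi):
    """Classify a Non-secure EL1 read of ID_MMFR4 under the TID3 rules.

    tid3_set: value of HCR.TID3 or HCR_EL2.TID3, as a bool.
    register_is_raz_wi: True when the implementation makes the register RAZ/WI.
    Returns one of three descriptive strings.
    """
    if not tid3_set:
        return "not trapped by TID3"
    if not register_is_raz_wi:
        return "trapped"
    # TID3 is set but the register is RAZ/WI: the architecture leaves
    # the choice to the implementation.
    return "IMPLEMENTATION DEFINED"
```

A hypervisor that sets TID3 therefore cannot rely on seeing a trap for this register on every implementation.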


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_MMFR4 is architecturally mapped to AArch64 System register 
ID_MMFR4_EL1. 


In an implementation that does not include ACTLR2 and HACTLR2 this register is RAZ. 


Attributes 
ID_MMFR4 is a 32-bit register. 
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Field descriptions 


The ID_MMFR4 bit assignments are: 


31-8   7-4   3-0
RAZ    AC2   RAZ

Bits [31:8] 


Reserved, RAZ. 


AC2, bits [7:4] 
Indicates the extension of the ACTLR and HACTLR registers using ACTLR2 and HACTLR2. 


Defined values are: 
0000 ACTLR2 and HACTLR2 are not implemented. 


0001 ACTLR2 and HACTLR2 are implemented. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


Bits [3:0] 
Reserved, RAZ. 
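
As a sketch of how software might interpret a value read from this register, the following Python fragment extracts the AC2 field and maps it to the defined values listed above. The function name and the returned dictionary layout are hypothetical and are not part of the architecture.

```python
def decode_id_mmfr4(value: int) -> dict:
    """Split a raw 32-bit ID_MMFR4 value into the AC2 field and its meaning."""
    value &= 0xFFFFFFFF
    ac2 = (value >> 4) & 0xF  # AC2, bits [7:4]
    meanings = {
        0b0000: "ACTLR2 and HACTLR2 are not implemented",
        0b0001: "ACTLR2 and HACTLR2 are implemented",
    }
    # Any other encoding is reserved.
    return {"AC2": ac2, "AC2_meaning": meanings.get(ac2, "reserved")}
```

For example, a raw value of 0x00000010 decodes to AC2 == 0001, which indicates that ACTLR2 and HACTLR2 are implemented.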


Accessing the ID_MMFR4: 
To access the ID_MMFR4:
MRC p15,0,<Rt>,c0,c2,6 ; Read ID_MMFR4 into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0010  110
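
The encoding fields in the table above are the operand fields of the A32 MRC instruction. The following Python sketch shows how they combine into a 32-bit A32 instruction word. The bit positions used are those of the general A32 MRC encoding, which is an assumption brought in from outside this section, and the function name is hypothetical.

```python
def encode_a32_mrc(coproc: int, opc1: int, crn: int, crm: int, opc2: int,
                   rt: int, cond: int = 0b1110) -> int:
    """Assemble the A32 word for MRC p<coproc>,<opc1>,<Rt>,c<CRn>,c<CRm>,<opc2>."""
    limits = {"coproc": (coproc, 15), "opc1": (opc1, 7), "CRn": (crn, 15),
              "CRm": (crm, 15), "opc2": (opc2, 7), "Rt": (rt, 15), "cond": (cond, 15)}
    for name, (val, top) in limits.items():
        if not 0 <= val <= top:
            raise ValueError(f"{name} out of range: {val}")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (1 << 20)  # bit 20 set: MRC (read)
            | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
            | (1 << 4) | crm)


# ID_MMFR4: coproc=0b1111, opc1=0b000, CRn=0b0000, CRm=0b0010, opc2=0b110, Rt=R0
word = encode_a32_mrc(0b1111, 0b000, 0b0000, 0b0010, 0b110, rt=0)
```

With Rt set to R0 and the always condition, `word` is 0xEE100FD2.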
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G6.2.89 


ID_PFR0, Processor Feature Register 0

The ID_PFR0 characteristics are:


Purpose 
Gives top-level information about the instruction sets supported by the PE in AArch32. 
Must be interpreted with ID_PFR1. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_PFR0 is architecturally mapped to AArch64 System register
ID_PFR0_EL1.


Attributes 
ID_PFR0 is a 32-bit register.


Field descriptions 


The ID_PFR0 bit assignments are:

31-16  15-12   11-8    7-4     3-0
RES0   State3  State2  State1  State0

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Bits [31:16] 
Reserved, RES0.


State3, bits [15:12] 
T32EE instruction set support. Defined values are: 
0000 Not implemented. 
0001 T32EE instruction set implemented. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


State2, bits [11:8] 


Jazelle extension support. Defined values are: 


0000 Not implemented. 
0001 Jazelle extension implemented, without clearing of JOSCR.CV on exception entry. 
0010 Jazelle extension implemented, with clearing of JOSCR.CV on exception entry. 


All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


State1, bits [7:4]


T32 instruction set support. Defined values are: 


0000 T32 instruction set not implemented. 
0001 T32 encodings before the introduction of Thumb-2 technology implemented:
• All instructions are 16-bit.
• A BL or BLX is a pair of 16-bit instructions.
• 32-bit instructions other than BL and BLX cannot be encoded.


0011 T32 encodings after the introduction of Thumb-2 technology implemented, for all 
16-bit and 32-bit T32 basic instructions. 


All other values are reserved. 

In ARMv8-A the only permitted value is 0011. 
State0, bits [3:0] 

A32 instruction set support. Defined values are: 

0000 A32 instruction set not implemented. 

0001 A32 instruction set implemented. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 
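
The four State fields can be extracted with simple shifts and masks. The Python sketch below decodes a raw ID_PFR0 value and reports any field that differs from the ARMv8-A permitted values given above. All names in it are hypothetical and chosen for illustration.

```python
# Least significant bit of each 4-bit field, from the field descriptions above.
ID_PFR0_FIELDS = {"State3": 12, "State2": 8, "State1": 4, "State0": 0}

# The only value each field is permitted to hold in ARMv8-A.
ARMV8A_PERMITTED = {"State3": 0b0000, "State2": 0b0001,
                    "State1": 0b0011, "State0": 0b0001}


def decode_id_pfr0(value: int) -> dict:
    """Return each State field of a raw ID_PFR0 value as an integer."""
    return {name: (value >> lsb) & 0xF for name, lsb in ID_PFR0_FIELDS.items()}


def fields_outside_armv8a(value: int) -> list:
    """List the fields whose value is not the ARMv8-A permitted value."""
    fields = decode_id_pfr0(value)
    return [name for name, v in fields.items() if v != ARMV8A_PERMITTED[name]]
```

A value of 0x00000131 holds the permitted value in every field, so `fields_outside_armv8a(0x131)` returns an empty list.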


Accessing the ID_PFR0:
To access the ID_PFR0:
MRC p15,0,<Rt>,c0,c1,0 ; Read ID_PFR0 into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0001  000
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G6.2.90 ID_PFR1, Processor Feature Register 1 
The ID_PFR1 characteristics are:


Purpose 
Gives information about the AArch32 programmers' model. 
Must be interpreted with ID_PFR0.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register ID_PFR1 is architecturally mapped to AArch64 System register 
ID_PFR1_EL1. 


Attributes 
ID_PFR1 is a 32-bit register. 


Field descriptions 


The ID_PFR1 bit assignments are:

31-28  27-24      23-20     19-16     15-12           11-8      7-4       3-0
GIC    Virt_frac  Sec_frac  GenTimer  Virtualization  MProgMod  Security  ProgMod





GIC, bits [31:28] 
System register GIC CPU interface. Defined values are: 
0000 No System register interface to the GIC CPU interface is supported. 
0001 System register interface to versions 3.0 and 4.0 of the GIC CPU interface is supported. 


All other values are reserved. 


Virt_frac, bits [27:24] 


Virtualization fractional field. When the Virtualization field is 0000, determines the support for 
features from the ARMv7 Virtualization Extensions. Defined values are: 


0000 No features from the ARMv7 Virtualization Extensions are implemented. 
0001 The following features of the ARMv7 Virtualization Extensions are implemented: 
• The SCR.SIF bit, if EL3 is implemented.
• The modifications to the SCR.AW and SCR.FW bits described in the
  Virtualization Extensions, if EL3 is implemented.
• The MSR (Banked register) and MRS (Banked register) instructions.
• The ERET instruction.
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
This field is only valid when the value of ID_PFR1.Virtualization is 0, otherwise it holds the value 
0000. 


— Note 


The ID_ISAR registers do not identify whether the instructions added by the ARMv7 Virtualization 
Extensions are implemented. 





Sec_frac, bits [23:20] 


Security fractional field. When the Security field is 0000, determines the support for features from 
the ARMv7 Security Extensions. Defined values are: 


0000 No features from the ARMv7 Security Extensions are implemented. 

0001 The following features from the ARMv7 Security Extensions are implemented: 
• The VBAR register.
• The TTBCR.PD0 and TTBCR.PD1 bits.


0010 As for 0001, plus the ability to access Secure or Non-secure physical memory is 
supported. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000, 0001, and 0010.
This field is only valid when the value of ID_PFR1.Security is 0, otherwise it holds the value 0000. 


GenTimer, bits [19:16] 
Generic Timer support. Defined values are: 


0000 Not implemented. 
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0001 Generic Timer implemented. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 


Virtualization, bits [15:12] 


Virtualization support. Defined values are: 
0000 EL2, Hyp mode, and the HVC instruction not implemented. 


0001 EL2, Hyp mode, the HVC instruction, and all the features described by Virt_frac == 
0001 implemented. 


All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0001. 

In an implementation that includes EL2, if EL2 cannot use AArch32 but EL1 can use AArch32 then 
this field has the value 0001. 

— Note 

The ID_ISARs do not identify whether the HVC instruction is implemented. 





MProgMod, bits [11:8] 


M profile programmers' model support. Defined values are: 
0000 Not supported. 

0010 Support for two-stack programmers' model. 

All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


Security, bits [7:4] 


Security support. Defined values are: 


0000 EL3, Monitor mode, and the SMC instruction not implemented. 

0001 EL3, Monitor mode, the SMC instruction, and all the features described by Sec_frac == 
0001 implemented. 

0010 As for 0001, and adds the ability to set the NSACR.RFR bit. Not permitted in ARMv8
as the NSACR.RFR bit is RES0.
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 


In an implementation that includes EL3, if EL3 cannot use AArch32 but EL1 can use AArch32 then 
this field has the value 0001. 


ProgMod, bits [3:0] 


Support for the standard programmers’ model for ARMv4 and later. Model must support User, FIQ,
IRQ, Supervisor, Abort, Undefined, and System modes. Defined values are: 


0000 Not supported. 
0001 Supported. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0001. 
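
The relationship between the Virtualization field and the Virt_frac field can be sketched in Python as follows. The field positions come from the descriptions above. The function names are hypothetical, and the sketch assumes that a Virtualization value of 0001 implies every feature that Virt_frac == 0001 describes, as stated in the Virtualization field description.

```python
# Least significant bit of each 4-bit field, from the field descriptions above.
ID_PFR1_FIELDS = {
    "GIC": 28, "Virt_frac": 24, "Sec_frac": 20, "GenTimer": 16,
    "Virtualization": 12, "MProgMod": 8, "Security": 4, "ProgMod": 0,
}


def decode_id_pfr1(value: int) -> dict:
    """Return each field of a raw ID_PFR1 value as an integer."""
    return {name: (value >> lsb) & 0xF for name, lsb in ID_PFR1_FIELDS.items()}


def has_virt_frac_features(value: int) -> bool:
    """True if the features listed under Virt_frac == 0001 are implemented."""
    f = decode_id_pfr1(value)
    if f["Virtualization"] == 0b0001:
        return True  # full EL2 support includes the fractional features
    # Virt_frac is only valid when Virtualization is 0000.
    return f["Virtualization"] == 0b0000 and f["Virt_frac"] == 0b0001
```

The same pattern applies to the Security and Sec_frac fields.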


Accessing the ID_PFR1: 


To access the ID_PFR1: 


MRC p15,0,<Rt>,c0,c1,1 ; Read ID_PFR1 into Rt
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Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0001  001
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G6.2.91 IFAR, Instruction Fault Address Register 
The IFAR characteristics are: 


Purpose 
Holds the virtual address of the faulting address that caused a synchronous Prefetch Abort 
exception. 

Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


IFAR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





IFAR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


IFAR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T6==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T6==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register IFAR(NS) is architecturally mapped to AArch64 System register 
FAR_EL1[63:32]. 


AArch32 System register IFAR(S) is architecturally mapped to AArch32 System register HIFAR 
when EL2 is implemented. 
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AArch32 System register IFAR(S) is architecturally mapped to AArch64 System register 
FAR_EL2[63:32] when EL2 is implemented. 


RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 


IFAR is a 32-bit register. 


Field descriptions 


The IFAR bit assignments are: 


31 0 
VA of faulting address of synchronous Prefetch Abort exception 
Bits [31:0] 


VA of faulting address of synchronous Prefetch Abort exception. 


Accessing the IFAR: 
To access the IFAR: 


MRC p15,0,<Rt>,c6,c0,2 ; Read IFAR into Rt
MCR p15,0,<Rt>,c6,c0,2 ; Write Rt to IFAR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0110  0000  010
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G6.2.92 IFSR, Instruction Fault Status Register 
The IFSR characteristics are: 


Purpose 


Holds status information about the last instruction fault. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


IFSR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





IFSR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


IFSR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T5==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T5==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register IFSR is architecturally mapped to AArch64 System register 
IFSR32_EL2. 


The current translation table format determines which format of the register is used. 


RW fields in this register reset to architecturally UNKNOWN values. 
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Attributes 
IFSR is a 32-bit register. 


Field descriptions 


The IFSR bit assignments are: 


When TTBCR.EAE==0: 


31-17  16   15-13  12   11    10     9     8-4   3-0
RES0   FnV  RES0   ExT  RES0  FS[4]  LPAE  RES0  FS[3:0]


Bits [31:17] 
Reserved, RES0.


FnV, bit [16] 


FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a 
translation table walk. 


0 IFAR is valid.
1 IFAR is not valid, and holds an UNKNOWN value.
This field is only valid for a Synchronous external abort other than a Synchronous external abort on
a translation table walk. It is RES0 for all other Prefetch Abort exceptions.

Bits [15:13]
Reserved, RES0.


ExT, bit [12] 


External abort type. This bit can be used to provide an IMPLEMENTATION DEFINED classification of 
external aborts. 


In an implementation that does not provide any classification of external aborts, this bit is RES0.


For aborts other than external aborts this bit always returns 0. 





Bit [11] 
Reserved, RES0.

FS[4], bit [10] 
See FS[3:0], bits [3:0] for description of the FS field. 

LPAE, bit [9] 
On taking a Data Abort exception, this bit is set as follows: 
0 Using the Short-descriptor translation table formats.
1 Using the Long-descriptor translation table formats. 
Hardware does not interpret this bit to determine the behavior of the memory system, and therefore 
software can set this bit to 0 or 1 without affecting operation. 

Bits [8:4] 
Reserved, RES0.
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FS[3:0], bits [3:0] 
Fault status bits. Interpreted with bit [10]. Possible values of FS[4:0] are: 
00001 PC alignment fault 
00010 Debug exception 
00011 Access flag fault, level 1 





00101 Translation fault, level 1 

00110 Access flag fault, level 2 

00111 Translation fault, level 2 

01000 Synchronous external abort, not on translation table walk 

01001 Domain fault, level 1 

01011 Domain fault, level 2 

01100 Synchronous external abort, on translation table walk, level 1 
01101 Permission fault, level 1

01110 Synchronous external abort, on translation table walk, level 2 
01111 Permission fault, level 2

10000 TLB conflict abort 

10100 IMPLEMENTATION DEFINED fault (Lockdown fault) 

11001 Synchronous parity or ECC error on memory access, not on translation table walk 
11100 Synchronous parity or ECC error on translation table walk, level 1 
11110 Synchronous parity or ECC error on translation table walk, level 2 


All other values are reserved. 
For more information about the lookup level associated with a fault, see The level associated with
MMU faults on a Short-descriptor translation table lookup on page G4-4129.
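
Because the fault status is split across bit [10] and bits [3:0] in this format, software has to reassemble FS[4:0] before looking up the fault type. The following Python sketch shows one way to do this for the TTBCR.EAE==0 format. The function and table names are hypothetical, and the table entries are copied from the list above.

```python
SHORT_DESCRIPTOR_FS = {
    0b00001: "PC alignment fault",
    0b00010: "Debug exception",
    0b00011: "Access flag fault, level 1",
    0b00101: "Translation fault, level 1",
    0b00110: "Access flag fault, level 2",
    0b00111: "Translation fault, level 2",
    0b01000: "Synchronous external abort, not on translation table walk",
    0b01001: "Domain fault, level 1",
    0b01011: "Domain fault, level 2",
    0b01100: "Synchronous external abort, on translation table walk, level 1",
    0b01101: "Permission fault, level 1",
    0b01110: "Synchronous external abort, on translation table walk, level 2",
    0b01111: "Permission fault, level 2",
    0b10000: "TLB conflict abort",
    0b10100: "IMPLEMENTATION DEFINED fault (Lockdown fault)",
    0b11001: "Synchronous parity or ECC error on memory access, not on translation table walk",
    0b11100: "Synchronous parity or ECC error on translation table walk, level 1",
    0b11110: "Synchronous parity or ECC error on translation table walk, level 2",
}


def decode_ifsr_short(ifsr: int) -> dict:
    """Decode an IFSR value in the Short-descriptor (TTBCR.EAE==0) format."""
    fs = (((ifsr >> 10) & 1) << 4) | (ifsr & 0xF)  # FS[4] is bit [10], FS[3:0] is bits [3:0]
    return {
        "FnV": (ifsr >> 16) & 1,
        "ExT": (ifsr >> 12) & 1,
        "LPAE": (ifsr >> 9) & 1,
        "FS": fs,
        "fault": SHORT_DESCRIPTOR_FS.get(fs, "reserved"),
    }
```

For example, an IFSR value of 0x00000005 decodes to a level 1 Translation fault, and 0x00000400 decodes to a TLB conflict abort.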


When TTBCR.EAE==1: 


31-17  16   15-13  12   11-10  9     8-6   5-0
RES0   FnV  RES0   ExT  RES0   LPAE  RES0  STATUS


Bits [31:17] 
Reserved, RES0.


FnV, bit [16] 


FAR not Valid, for a Synchronous external abort other than a Synchronous external abort on a 
translation table walk. 


0 IFAR is valid.
1 IFAR is not valid, and holds an UNKNOWN value.

This field is only valid for a Synchronous external abort other than a Synchronous external abort on
a translation table walk. It is RES0 for all other Prefetch Abort exceptions.


Bits [15:13] 


Reserved, RES0.
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ExT, bit [12] 


External abort type. This bit can be used to provide an IMPLEMENTATION DEFINED classification of 
external aborts. 


In an implementation that does not provide any classification of external aborts, this bit is RES0.


For aborts other than external aborts this bit always returns 0. 


Bits [11:10] 


Reserved, RES0.


LPAE, bit [9] 
On taking a Data Abort exception, this bit is set as follows: 
0 Using the Short-descriptor translation table formats. 
1 Using the Long-descriptor translation table formats. 
Hardware does not interpret this bit to determine the behavior of the memory system, and therefore 
software can set this bit to 0 or 1 without affecting operation. 
Bits [8:6] 


Reserved, RES0.


STATUS, bits [5:0] 
Fault status bits. Possible values of this field are: 
000000 Address size fault in TTBRO or TTBR1 
000001 Address size fault, level 1 
000010 Address size fault, level 2 
000011 Address size fault, level 3 
000101 Translation fault, level 1 
000110 Translation fault, level 2 
000111 Translation fault, level 3 
001001 Access flag fault, level 1 
001010 Access flag fault, level 2 
001011 Access flag fault, level 3 
001101 Permission fault, level 1 
001110 Permission fault, level 2 
001111 Permission fault, level 3 





010000 Synchronous external abort, not on translation table walk 

010101 Synchronous external abort, on translation table walk, level 1 

010110 Synchronous external abort, on translation table walk, level 2 

010111 Synchronous external abort, on translation table walk, level 3

011000 Synchronous parity or ECC error on memory access, not on translation table walk 
011101 Synchronous parity or ECC error on memory access on translation table walk, level 1 
011110 Synchronous parity or ECC error on memory access on translation table walk, level 2 
011111 Synchronous parity or ECC error on memory access on translation table walk, level 3 


100001 PC alignment fault 
100010 Debug exception 
110000 TLB conflict abort 


All other values are reserved. 
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For more information about the lookup level associated with a fault, see The level associated with 
MMU faults on a Long-descriptor translation table lookup on page G4-4131. 
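
In the Long-descriptor format, most STATUS encodings follow a regular pattern in which bits [5:2] select the fault class and bits [1:0] give the lookup level. The Python sketch below builds the full table from that pattern and decodes an IFSR value in the TTBCR.EAE==1 format. The names are hypothetical, and the regular structure is an observation about the values listed above rather than a statement made by the architecture.

```python
def build_long_status_table() -> dict:
    """Build the STATUS lookup table for the Long-descriptor IFSR format."""
    table = {0b000000: "Address size fault in TTBR0 or TTBR1"}
    per_level = {
        0b0000: "Address size fault",
        0b0001: "Translation fault",
        0b0010: "Access flag fault",
        0b0011: "Permission fault",
        0b0101: "Synchronous external abort, on translation table walk",
        0b0111: "Synchronous parity or ECC error on memory access on translation table walk",
    }
    for prefix, name in per_level.items():
        for level in (1, 2, 3):
            table[(prefix << 2) | level] = f"{name}, level {level}"
    table.update({
        0b010000: "Synchronous external abort, not on translation table walk",
        0b011000: "Synchronous parity or ECC error on memory access, not on translation table walk",
        0b100001: "PC alignment fault",
        0b100010: "Debug exception",
        0b110000: "TLB conflict abort",
    })
    return table


LONG_STATUS = build_long_status_table()


def decode_ifsr_long(ifsr: int) -> dict:
    """Decode an IFSR value in the Long-descriptor (TTBCR.EAE==1) format."""
    status = ifsr & 0x3F  # STATUS, bits [5:0]
    return {
        "FnV": (ifsr >> 16) & 1,
        "ExT": (ifsr >> 12) & 1,
        "LPAE": (ifsr >> 9) & 1,
        "STATUS": status,
        "fault": LONG_STATUS.get(status, "reserved"),
    }
```

For example, an IFSR value of 0x00000207 has LPAE set and decodes to a level 3 Translation fault.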

Accessing the IFSR: 

To access the IFSR: 


MRC p15,0,<Rt>,c5,c0,1 ; Read IFSR into Rt
MCR p15,0,<Rt>,c5,c0,1 ; Write Rt to IFSR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0101  0000  001
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G6.2.93 ISR, Interrupt Status Register 
The ISR characteristics are: 
Purpose 
Shows whether any IRQ, FIQ, or external abort is pending. In an implementation that includes EL2, 
when the register is accessed from Non-secure EL1, a pending interrupt might be a physical 
interrupt or a virtual interrupt, and the architecture does not provide any mechanism that software 
executing at Non-secure EL1 can use to determine whether a pending interrupt is physical or virtual.
For all other accesses, any indicated interrupt must be a physical interrupt. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    RO   RO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T12==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T12==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register ISR is architecturally mapped to AArch64 System register ISR_EL1. 
Attributes 
ISR is a 32-bit register. 
Field descriptions 
The ISR bit assignments are: 
31-9  8  7  6  5-0
RES0  A  I  F  RES0
Bits [31:9]
Reserved, RES0.
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A, bit [8] 
Asynchronous external abort pending bit: 
0 No pending asynchronous external abort.
1 An asynchronous external abort is pending. 
I, bit [7] 
IRQ pending bit. Indicates whether an IRQ interrupt is pending: 
0 No pending IRQ. 
1 An IRQ interrupt is pending. 
F, bit [6]
FIQ pending bit. Indicates whether an FIQ interrupt is pending:
0 No pending FIQ.
1 An FIQ interrupt is pending. 
Bits [5:0] 


Reserved, RES0.
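
As an illustration of the three pending bits, the following Python sketch decodes a raw ISR value. The function name is hypothetical. Note that, as described in the Purpose section, a reader at Non-secure EL1 cannot tell from these bits whether a pending interrupt is physical or virtual.

```python
def decode_isr(isr: int) -> dict:
    """Return the pending state of the A, I, and F bits of a raw ISR value."""
    return {
        "A": bool(isr & (1 << 8)),  # asynchronous external abort pending
        "I": bool(isr & (1 << 7)),  # IRQ pending
        "F": bool(isr & (1 << 6)),  # FIQ pending
    }
```

A value of 0x00000080 therefore reports only a pending IRQ.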


Accessing the ISR: 
To access the ISR: 
MRC p15,0,<Rt>,c12,c1,0 ; Read ISR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1100  0001  000
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G6.2.94 ITLBIALL, Instruction TLB Invalidate All 
The ITLBIALL characteristics are: 


Purpose 


Invalidate all instruction TLB entries for the PL1&0 translation regime, subject to the Privilege 
level and Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIALL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ITLBIALL is a 32-bit System instruction. 


Field descriptions 


ITLBIALL ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 


Executing the ITLBIALL instruction: 
The ITLBIALL instruction is executed as: 


MCR p15,0,<Rt>,c8,c5,0 ; ITLBIALL operation, ignoring the value in Rt 







The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0101  000
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G6.2.95 ITLBIASID, Instruction TLB Invalidate by ASID match 
The ITLBIASID characteristics are: 
Purpose 
Invalidate instruction TLB entries for stage 1 of the PL1&0 translation regime that match the given 
ASID, subject to the Security state at which the instruction is executed. 
If this operation is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 
For details of the scope of this instruction see TLBIASID. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
There are no configuration notes. 
Attributes 
ITLBIASID is a 32-bit System instruction. 
Field descriptions 
The ITLBIASID input value bit assignments are: 
31-8  7-0
RES0  ASID
Bits [31:8]
Reserved, RES0.
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ASID, bits [7:0] 
ASID value to match. Any TLB entries for non-global pages that match the ASID values will be 
affected by this operation. 
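
The input value written to Rt for this operation carries only the ASID in bits [7:0]. A minimal Python sketch of building that value follows. The function name is hypothetical, and the range check reflects the 8-bit ASID field described above.

```python
def itlbiasid_operand(asid: int) -> int:
    """Build the Rt input value for ITLBIASID from an 8-bit ASID."""
    if not 0 <= asid <= 0xFF:
        raise ValueError(f"ASID does not fit in bits [7:0]: {asid:#x}")
    return asid  # bits [31:8] are RES0, so they are left as zero
```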


Executing the ITLBIASID instruction: 
The ITLBIASID instruction is executed as: 
MCR p15,0,<Rt>,c8,c5,2 ; ITLBIASID operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0101  010
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ITLBIMVA, Instruction TLB Invalidate by VA 


The ITLBIMVA characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime that match the given VA
and ASID, subject to the Security state at which the instruction is executed.


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVA. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
ITLBIMVA is a 32-bit System instruction. 


Field descriptions 


The ITLBIMVA input value bit assignments are: 


31-12  11-8  7-0
VA     RES0  ASID

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VA, bits [31:12] 


Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected 
by this operation. 


Bits [11:8] 
Reserved, RES0.


ASID, bits [7:0] 


ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by 
this operation. 


Global TLB entries that match the VA value will be affected by this operation, regardless of the 
value of the ASID field. 
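
The input value for this operation packs the VA and the ASID into one 32-bit word. The Python sketch below builds that word from the field positions given above. The function name is hypothetical, and the low 12 bits of the address are discarded because only VA bits [31:12] are used for matching.

```python
def itlbimva_operand(va: int, asid: int) -> int:
    """Build the Rt input value for ITLBIMVA from a virtual address and an ASID."""
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError(f"VA is not a 32-bit address: {va:#x}")
    if not 0 <= asid <= 0xFF:
        raise ValueError(f"ASID does not fit in bits [7:0]: {asid:#x}")
    # VA occupies bits [31:12], bits [11:8] are RES0, ASID occupies bits [7:0].
    return (va & 0xFFFFF000) | asid
```

For example, `itlbimva_operand(0x12345678, 0x42)` yields 0x12345042.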

Executing the ITLBIMVA instruction: 

The ITLBIMVA instruction is executed as: 

MCR p15,0,<Rt>,c8,c5,1 ; ITLBIMVA operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0101  001
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JIDR, Jazelle ID Register 


The JIDR characteristics are: 


Purpose
A Jazelle register, which identifies the Jazelle architecture version.


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
IMPDEF   IMPDEF  RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0     EL1  EL2(NS)
IMPDEF  RO   RO





For accesses from EL0 it is IMPLEMENTATION DEFINED whether the register is RO or UNDEFINED.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HCR.TID0==1, Non-secure read accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If HCR_EL2.TID0==1, Non-secure read accesses to this register from EL0 and EL1 are
  trapped to EL2.


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes
JIDR is a 32-bit register.


Field descriptions 


The JIDR bit assignments are: 





31                0
RAZ

Bits [31:0]
Reserved, RAZ.

Accessing the JIDR:
To access the JIDR:

MRC p14,7,<Rt>,c0,c0,0 ; Read JIDR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    111   0000  0000  000
JMCR, Jazelle Main Configuration Register


The JMCR characteristics are: 


Purpose 
A Jazelle register, which provides control of the Jazelle extension. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
IMPDEF   IMPDEF  RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0     EL1  EL2 (NS)
IMPDEF  RW   RW

For accesses from EL0 it is IMPLEMENTATION DEFINED whether the register is RW or UNDEFINED.


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 


JMCR is a 32-bit register. 


Field descriptions 


The JMCR bit assignments are: 


31 0 
Bits [31:0] 


Reserved, RAZ/WI. 


Accessing the JMCR: 
To access the JMCR: 


MRC p14,7,<Rt>,c2,c0,0 ; Read JMCR into Rt
MCR p14,7,<Rt>,c2,c0,0 ; Write Rt to JMCR

Register access is encoded as follows:





coproc  opc1  CRn   CRm   opc2
1110    111   0010  0000  000
JOSCR, Jazelle OS Control Register


The JOSCR characteristics are: 


Purpose 
A Jazelle register, which provides operating system control of the Jazelle Extension. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
IMPDEF   IMPDEF  RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0     EL1  EL2 (NS)
IMPDEF  RW   RW

For accesses from EL0 it is IMPLEMENTATION DEFINED whether the register is RW or UNDEFINED.


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 
JOSCR is a 32-bit register. 


Field descriptions 


The JOSCR bit assignments are: 


31 0 
Bits [31:0] 


Reserved, RAZ/WI. 


Accessing the JOSCR: 
To access the JOSCR: 


MRC p14,7,<Rt>,c1,c0,0 ; Read JOSCR into Rt
MCR p14,7,<Rt>,c1,c0,0 ; Write Rt to JOSCR

Register access is encoded as follows:





coproc  opc1  CRn   CRm   opc2
1110    111   0001  0000  000
G6.2.100 MAIR0, Memory Attribute Indirection Register 0


The MAIR0 characteristics are:


Purpose 


Along with MAIR1, provides the memory attribute encodings corresponding to the possible 
AttrIndx values in a Long-descriptor format translation table entry for stage 1 translations. 


AttrIndx[2] indicates the MAIR register to be used: 
• When AttrIndx[2] is 0, MAIR0 is used.
• When AttrIndx[2] is 1, MAIR1 is used.


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


MAIR0(S) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW

MAIR0(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

MAIR0 is accessible as follows:

EL0  EL1  EL2 (NS)
-    RW   RW

MAIR0 and PRRR are the same register, with a different view depending on the value of
TTBCR.EAE:

• When it is set to 0, the register is as described in PRRR.
• When it is set to 1, the register is as described in MAIR0.

AttrIndx[2], from the translation table descriptor, selects the appropriate MAIR: setting AttrIndx[2]
to 0 selects MAIR0.

In an implementation that includes EL3:

• MAIR0(S) gives the value for memory accesses from Secure state.
• MAIR0(NS) gives the value for memory accesses from Non-secure states other than Hyp
  mode.

Traps and Enables


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations


AArch32 System register MAIR0 is architecturally mapped to AArch64 System register
MAIR_EL1[31:0].


MAIR0 and PRRR are the same register, with a different view depending on the value of
TTBCR.EAE:

• When it is set to 0, the register is as described in PRRR.
• When it is set to 1, the register is as described in MAIR0.


When EL3 is using AArch32, write access to MAIR0(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


RW fields in this register reset to architecturally UNKNOWN values.


Attributes


MAIR0 is a 32-bit register.


Field descriptions 


The MAIR0 bit assignments are:


When TTBCR.EAE==1:

31    24 23    16 15     8 7      0
Attr3    Attr2    Attr1    Attr0


Attr<n>, bits [8n+7:8n], for n = 0 to 3


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation
table entry, where:

• AttrIndx[2:0] gives the value of <n> in Attr<n>.
• AttrIndx[2] defines which MAIR to access. Attr7 to Attr4 are in MAIR1, and Attr3 to Attr0
  are in MAIR0.


Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.

The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.


The R and W bits in some Attr<n> fields have the following meanings: 





R or W  Meaning
0       No Allocate
1       Allocate
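
The Attr<n> encodings in the tables above can be illustrated with a short decoding sketch. This is not architectural pseudocode. The function names `attr_for_index` and `decode_attr`, and the output strings, are hypothetical; the sketch only assumes the bit assignments and encodings listed above.

```python
# Attr<n>[3:0] values that select a Device memory type when Attr<n>[7:4] is 0000.
DEVICE_TYPES = {
    0b0000: "Device-nGnRnE",
    0b0100: "Device-nGnRE",
    0b1000: "Device-nGRE",
    0b1100: "Device-GRE",
}

CACHE_POLICIES = (
    "Write-Through transient",      # 00RW, RW not 00
    "Write-Back transient",         # 01RW, RW not 00
    "Write-Through non-transient",  # 10RW
    "Write-Back non-transient",     # 11RW
)


def attr_for_index(attr_indx: int, mair0: int, mair1: int) -> int:
    """Return the 8-bit Attr<n> field selected by AttrIndx[2:0]."""
    reg = mair1 if attr_indx & 0b100 else mair0  # AttrIndx[2] selects the MAIR
    return (reg >> (8 * (attr_indx & 0b11))) & 0xFF


def _normal(nibble: int, level: str) -> str:
    """Describe a nonzero 4-bit Normal memory encoding for 'Outer' or 'Inner'."""
    policy, r, w = nibble >> 2, (nibble >> 1) & 1, nibble & 1
    if nibble == 0b0100:
        return f"{level} Non-cacheable"
    return f"{level} {CACHE_POLICIES[policy]} (R={r}, W={w})"


def decode_attr(attr: int) -> str:
    """Describe one Attr<n> byte using the two tables above."""
    outer, inner = (attr >> 4) & 0xF, attr & 0xF
    if outer == 0:
        return DEVICE_TYPES.get(inner, "UNPREDICTABLE")
    if inner == 0:
        return "UNPREDICTABLE"
    return f"Normal, {_normal(outer, 'Outer')}, {_normal(inner, 'Inner')}"
```

For example, with a hypothetical MAIR0 value of 0x04FF4400, `attr_for_index(1, 0x04FF4400, 0)` selects Attr1 (0x44), which `decode_attr` reports as Normal memory that is Non-cacheable at both the Outer and Inner levels.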





Accessing the MAIR0:


To access the MAIR0 when TTBCR.EAE==1:


MRC p15,0,<Rt>,c10,c2,0 ; Read MAIR0 into Rt
MCR p15,0,<Rt>,c10,c2,0 ; Write Rt to MAIR0


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1010  0010  000
G6.2.101 MAIR1, Memory Attribute Indirection Register 1

The MAIR1 characteristics are:

Purpose

Along with MAIR0, provides the memory attribute encodings corresponding to the possible
AttrIndx values in a Long-descriptor format translation table entry for stage 1 translations.

AttrIndx[2] indicates the MAIR register to be used:
• When AttrIndx[2] is 0, MAIR0 is used.
• When AttrIndx[2] is 1, MAIR1 is used.

Usage constraints 
If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 

MAIR1(S) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW

MAIR1(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

MAIR1 is accessible as follows:

EL0  EL1  EL2 (NS)
-    RW   RW

MAIR1 and NMRR are the same register, with a different view depending on the value of
TTBCR.EAE:

• When it is set to 0, the register is as described in NMRR.
• When it is set to 1, the register is as described in MAIR1.

AttrIndx[2], from the translation table descriptor, selects the appropriate MAIR: setting AttrIndx[2]
to 1 selects MAIR1.

In an implementation that includes EL3:

• MAIR1(S) gives the value for memory accesses from Secure state.
• MAIR1(NS) gives the value for memory accesses from Non-secure states other than Hyp
  mode.

Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 

• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
  EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations


AArch32 System register MAIR1 is architecturally mapped to AArch64 System register
MAIR_EL1[63:32].


MAIR1 and NMRR are the same register, with a different view depending on the value of
TTBCR.EAE:

• When it is set to 0, the register is as described in NMRR.
• When it is set to 1, the register is as described in MAIR1.


When EL3 is using AArch32, write access to MAIR1(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


RW fields in this register reset to architecturally UNKNOWN values.


Attributes


MAIR1 is a 32-bit register.


Field descriptions 


The MAIR1 bit assignments are:


When TTBCR.EAE==1:

31    24 23    16 15     8 7      0
Attr7    Attr6    Attr5    Attr4


Attr<n>, bits [8(n-4)+7:8(n-4)], for n = 4 to 7


The memory attribute encoding for an AttrIndx[2:0] entry in a Long descriptor format translation
table entry, where:

• AttrIndx[2:0] gives the value of <n> in Attr<n>.
• AttrIndx[2] defines which MAIR to access. Attr7 to Attr4 are in MAIR1, and Attr3 to Attr0
  are in MAIR0.


Bits [7:4] are encoded as follows:

Attr<n>[7:4]      Meaning
0000              Device memory. See encoding of Attr<n>[3:0] for the type of Device memory.
00RW, RW not 00   Normal Memory, Outer Write-Through transient
0100              Normal Memory, Outer Non-cacheable
01RW, RW not 00   Normal Memory, Outer Write-Back transient
10RW              Normal Memory, Outer Write-Through non-transient
11RW              Normal Memory, Outer Write-Back non-transient

R = Outer Read-Allocate policy, W = Outer Write-Allocate policy.


The meaning of bits [3:0] depends on the value of bits [7:4]:

Attr<n>[3:0]      Meaning when Attr<n>[7:4] is 0000   Meaning when Attr<n>[7:4] is not 0000
0000              Device-nGnRnE memory                UNPREDICTABLE
00RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through transient
0100              Device-nGnRE memory                 Normal memory, Inner Non-cacheable
01RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back transient
1000              Device-nGRE memory                  Normal Memory, Inner Write-Through non-transient (RW=00)
10RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Through non-transient
1100              Device-GRE memory                   Normal Memory, Inner Write-Back non-transient (RW=00)
11RW, RW not 00   UNPREDICTABLE                       Normal Memory, Inner Write-Back non-transient

R = Inner Read-Allocate policy, W = Inner Write-Allocate policy.


The R and W bits in some Attr<n> fields have the following meanings:

R or W  Meaning
0       No Allocate
1       Allocate


Accessing the MAIR1:


To access the MAIR1 when TTBCR.EAE==1:


MRC p15,0,<Rt>,c10,c2,1 ; Read MAIR1 into Rt
MCR p15,0,<Rt>,c10,c2,1 ; Write Rt to MAIR1


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   1010  0010  001
G6.2.102 MIDR, Main ID Register


The MIDR characteristics are: 


Purpose 


Provides identification information for the PE, including an implementer code for the device and a 
device ID number. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register MIDR is architecturally mapped to AArch64 System register
MIDR_EL1.


AArch32 System register MIDR is architecturally mapped to External register MIDR_EL1. 


Some fields of the MIDR are IMPLEMENTATION DEFINED. For details of the values of these fields for 
a particular ARMv8 implementation, and any implementation-specific significance of these values, 
see the product documentation. 


Attributes 
MIDR is a 32-bit register. 


Field descriptions 


The MIDR bit assignments are: 


31        24 23    20 19         16 15      4 3       0
Implementer  Variant  Architecture  PartNum   Revision


Implementer, bits [31:24]


The Implementer code. This field must hold an implementer code that has been assigned by ARM. 
Assigned codes include the following: 





Hex representation  ASCII representation  Implementer
0x41                A                     ARM Limited
0x42                B                     Broadcom Corporation
0x43                C                     Cavium Inc.
0x44                D                     Digital Equipment Corporation
0x49                I                     Infineon Technologies AG
0x4D                M                     Motorola or Freescale Semiconductor Inc.
0x4E                N                     NVIDIA Corporation
0x50                P                     Applied Micro Circuits Corporation
0x51                Q                     Qualcomm Inc.
0x56                V                     Marvell International Ltd.
0x69                i                     Intel Corporation





ARM can assign codes that are not published in this manual. All values not assigned by ARM are 
reserved and must not be used. 


Variant, bits [23:20] 


An IMPLEMENTATION DEFINED variant number. Typically, this field is used to distinguish between 
different product variants, or major revisions of a product. 


Architecture, bits [19:16] 


The permitted values of this field are:

0001  ARMv4
0010  ARMv4T
0011  ARMv5 (obsolete)
0100  ARMv5T
0101  ARMv5TE
0110  ARMv5TEJ
0111  ARMv6
1111  Architectural features are individually identified in the ID_* registers, see Identification
      registers, functional group on page G4-4194.


All other values are reserved. 


PartNum, bits [15:4] 


An IMPLEMENTATION DEFINED primary part number for the device. 


On processors implemented by ARM, if the top four bits of the primary part number are 0x0 or 0x7,
the variant and architecture are encoded differently.


Revision, bits [3:0] 


An IMPLEMENTATION DEFINED revision number for the device. 
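
The field layout above can be illustrated with a short sketch that splits a 32-bit MIDR value into its fields. The names `Midr`, `decode_midr`, and `IMPLEMENTERS` are hypothetical, the implementer table is abbreviated, and the sample value is chosen only to exercise the bit positions.

```python
from typing import NamedTuple

# A subset of the assigned implementer codes listed above.
IMPLEMENTERS = {
    0x41: "ARM Limited",
    0x42: "Broadcom Corporation",
    0x43: "Cavium Inc.",
    0x4E: "NVIDIA Corporation",
    0x51: "Qualcomm Inc.",
}


class Midr(NamedTuple):
    implementer: int   # bits [31:24]
    variant: int       # bits [23:20]
    architecture: int  # bits [19:16]
    part_num: int      # bits [15:4]
    revision: int      # bits [3:0]


def decode_midr(value: int) -> Midr:
    """Split a 32-bit MIDR value into its five fields."""
    return Midr(
        implementer=(value >> 24) & 0xFF,
        variant=(value >> 20) & 0xF,
        architecture=(value >> 16) & 0xF,
        part_num=(value >> 4) & 0xFFF,
        revision=value & 0xF,
    )


midr = decode_midr(0x410FD034)  # sample value, for illustration only
name = IMPLEMENTERS.get(midr.implementer, "unlisted implementer code")
```

An Architecture field of 0b1111 in the decoded result indicates that architectural features are identified individually in the ID_* registers.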
Accessing the MIDR:
To access the MIDR:
MRC p15,0,<Rt>,c0,c0,0 ; Read MIDR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0000  000
G6.2.103 MPIDR, Multiprocessor Affinity Register


The MPIDR characteristics are: 


Purpose 
In a multiprocessor system, provides an additional PE identification mechanism for scheduling 
purposes. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register MPIDR is architecturally mapped to AArch64 System register 
MPIDR_EL1. 


The assigned value of the MPIDR.{Aff2, Aff1, Aff0} or MPIDR_EL1.{Aff3, Aff2, Aff1, Aff0} set
of fields of each PE must be unique within the system as a whole.


In a uniprocessor system ARM recommends that each Aff<n> field of this register returns a value 
of 0. 


Attributes 
MPIDR is a 32-bit register. 


Field descriptions 


The MPIDR bit assignments are: 


31 30 29    25 24 23    16 15     8 7      0
M  U  RES0     MT Aff2     Aff1     Aff0


M, bit [31]


Indicates whether this implementation includes the functionality introduced by the ARMv7 
Multiprocessing Extensions. The possible values of this bit are: 


0 This implementation does not include the ARMv7 Multiprocessing Extensions
functionality.


1 This implementation includes the ARMv7 Multiprocessing Extensions functionality.


In ARMv8 this bit is RES1.


U, bit [30] 


Indicates a Uniprocessor system, as distinct from PE 0 in a multiprocessor system. The possible 
values of this bit are: 


0 Processor is part of a multiprocessor system.
1 Processor is part of a uniprocessor system.


Bits [29:25]


Reserved, RES0.


MT, bit [24]
Indicates whether the lowest level of affinity consists of logical PEs that are implemented using a
multithreading type approach. The possible values of this bit are:
0 Performance of PEs at the lowest affinity level is largely independent.
1 Performance of PEs at the lowest affinity level is very interdependent.


Aff2, bits [23:16] 


Affinity level 2. The least significant affinity level field, for this PE in the system. 


Aff1, bits [15:8]
Affinity level 1. The intermediate affinity level field, for this PE in the system. 


Aff0, bits [7:0] 
Affinity level 0. The most significant affinity level field, for this PE in the system. 
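
The field layout above can be illustrated with a short sketch that extracts the MPIDR fields from a 32-bit value. The function name `decode_mpidr` and the sample value are hypothetical and serve only to show the bit positions.

```python
def decode_mpidr(value: int) -> dict:
    """Extract the MPIDR fields from a 32-bit register value."""
    return {
        "M": (value >> 31) & 0x1,      # bit [31], RES1 in ARMv8
        "U": (value >> 30) & 0x1,      # bit [30], uniprocessor indication
        "MT": (value >> 24) & 0x1,     # bit [24], multithreading indication
        "Aff2": (value >> 16) & 0xFF,  # bits [23:16]
        "Aff1": (value >> 8) & 0xFF,   # bits [15:8]
        "Aff0": value & 0xFF,          # bits [7:0]
    }


fields = decode_mpidr(0x80000102)  # sample value, for illustration only
affinity = (fields["Aff2"], fields["Aff1"], fields["Aff0"])
```

Because the {Aff2, Aff1, Aff0} set must be unique for each PE in the system, a tuple such as `affinity` can serve as a per-PE key in scheduling software.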


Accessing the MPIDR:
To access the MPIDR:
MRC p15,0,<Rt>,c0,c0,5 ; Read MPIDR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0000  101
G6.2.104 MVBAR, Monitor Vector Base Address Register
The MVBAR characteristics are: 


Purpose 


When EL3 is implemented and can use AArch32, holds the vector base address for any exception 
that is taken to Monitor mode. 


Secure software must program the MVBAR with the required initial value as part of the PE boot 
sequence. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(NS)  EL1(S)  EL2(NS)
-    -        Trap    -

If EL3 is implemented and is using AArch64, any read or write to MVBAR from Secure EL1 using
AArch32 is trapped as an exception to EL3.

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T12==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T12==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
This register is only accessible in Secure state. 


It is IMPLEMENTATION DEFINED whether MVBAR[0] has a fixed value and ignored writes, or takes 
the last value written to it. 


Write access to MVBAR is disabled when the CP15SDISABLE signal is asserted HIGH.


On a reset into EL3 using AArch32, the reset value of MVBAR is an IMPLEMENTATION DEFINED
choice between:

• MVBAR[31:5] = an IMPLEMENTATION DEFINED value, which might be UNKNOWN.
  MVBAR[4:1] = RES0.
  MVBAR[0] = 0.

• MVBAR[31:1] = an IMPLEMENTATION DEFINED value that is bits[31:1] of the AArch32 reset
  address.
  MVBAR[0] = 1.


Attributes 
MVBAR is a 32-bit register. 
Field descriptions


The MVBAR bit assignments are: 


When programmed with a vector base address: 


31                  5 4        0
Vector Base Address   Reserved

Bits [31:5]


Vector Base Address. Bits[31:5] of the base address of the exception vectors for exceptions taken to 
this Exception level. Bits[4:0] of an exception vector are the exception offset. 


Reserved, bits [4:0] 


Reserved, see Configurations. 
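
The relationship between the vector base address and the exception offset can be illustrated with a short sketch. The function name `monitor_vector_address` is hypothetical, and the offset is supplied by the caller; the offset assigned to each exception is defined elsewhere in the manual and is not assumed here.

```python
def monitor_vector_address(mvbar: int, offset: int) -> int:
    """Combine MVBAR[31:5] with a 5-bit exception offset.

    Hypothetical helper: bits [31:5] come from MVBAR and bits [4:0]
    are the exception offset, as described above.
    """
    if not 0 <= offset <= 0x1F:
        raise ValueError("offset must fit in bits [4:0]")
    # Discard MVBAR[4:0], which are reserved, before adding the offset.
    return (mvbar & 0xFFFFFFE0) | offset


address = monitor_vector_address(0x80001003, 0x08)
```

Masking bits [4:0] of the register value means that the result does not depend on whether MVBAR[0] reads as 0 or 1.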


Accessing the MVBAR: 
To access the MVBAR: 


MRC p15,0,<Rt>,c12,c0,1 ; Read MVBAR into Rt
MCR p15,0,<Rt>,c12,c0,1 ; Write Rt to MVBAR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1100  0000  001
G6.2.105 MVFR0, Media and VFP Feature Register 0
The MVFR0 characteristics are:


Purpose 


Describes the features provided by the AArch32 Advanced SIMD and Floating-point 
implementation. 


Must be interpreted with MVFR1 and MVFR2. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)    EL2(NS)    EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       Config-RO  Config-RO  RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1        EL2 (NS)
-    Config-RO  Config-RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If CPACR.cp10==00, read accesses to this register from PL1 are UNDEFINED.
• If NSACR.cp10==0, Non-secure read accesses to this register from EL1 and EL2 are
  UNDEFINED.
• If CPTR_EL2.TFP==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If CPTR_EL3.TFP==1, read accesses to this register from EL1 and EL2 are trapped to EL3.
• If HCPTR.TCP10==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCPTR.TCP10==1, Non-secure read accesses to this register from EL2 are UNDEFINED.
• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register MVFR0 is architecturally mapped to AArch64 System register
MVFR0_EL1.


Implemented only if the implementation includes Advanced SIMD and floating-point instructions. 


Attributes 
MVFR0 is a 32-bit register.


Field descriptions


The MVFR0 bit assignments are:


31    28 27    24 23   20 19     16 15   12 11  8 7   4 3      0
FPRound  FPShVec  FPSqrt  FPDivide  FPTrap  FPDP  FPSP  SIMDReg


FPRound, bits [31:28] 


Floating-Point Rounding modes. Indicates whether the floating-point implementation provides 
support for rounding modes. Defined values are: 


0000 Not implemented, or only Round to Nearest mode supported, except that Round towards 
Zero mode is supported for VCVT instructions that always use that rounding mode 
regardless of the FPSCR setting. 


0001 All rounding modes supported. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


FPShVec, bits [27:24] 


Short Vectors. Indicates whether the floating-point implementation provides support for the use of 
short vectors. Defined values are: 


0000 Short vectors not supported. 
0001 Short vector operation supported. 
All other values are reserved. 


In ARMv8-A the only permitted value is 0000. 


FPSqrt, bits [23:20]


Square Root. Indicates whether the floating-point implementation provides support for the ARMv6
VFP square root operations. Defined values are:


0000 Not supported in hardware.
0001 Supported.
All other values are reserved.


In ARMv8-A the permitted values are 0000 and 0001.


The VSQRT.F32 instruction also requires the single-precision floating-point attribute, bits [7:4],
and the VSQRT.F64 instruction also requires the double-precision floating-point attribute, bits
[11:8].


FPDivide, bits [19:16] 


Indicates whether the floating-point implementation provides support for VFP divide operations. 
Defined values are: 


0000 Not supported in hardware. 

0001 Supported. 

All other values are reserved. 

In ARMv8-A the permitted values are 0000 and 0001. 


The VDIV.F32 instruction also requires the single-precision floating-point attribute, bits [7:4], and 
the VDIV.F64 instruction also requires the double-precision floating-point attribute, bits [11:8]. 
FPTrap, bits [15:12]


Floating Point Exception Trapping. Indicates whether the floating-point implementation provides 
support for exception trapping. Defined values are: 


0000 Not supported. 
0001 Supported. 
All other values are reserved. 


A value of 0001 indicates that, when the corresponding trap is enabled, a floating-point exception 
generates an exception. 


FPDP, bits [11:8] 


Double Precision. Indicates whether the floating-point implementation provides support for 
double-precision operations. Defined values are: 


0000 Not supported in hardware. 
0001 Supported, VFPv2. 
0010 Supported, VFPv3, VFPv4, or ARMv8. VFPv3 and ARMv8 add an instruction to load
a double-precision floating-point constant, and conversions between double-precision
and fixed-point values.


All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0010. 


A value of 0b0001 or 0b0010 indicates support for all VFP double-precision instructions in the 
supported version of VFP, except that, in addition to this field being nonzero: 


• VSQRT.F64 is only available if the Square root field is 0001.
• VDIV.F64 is only available if the Divide field is 0001.
• Conversion between double-precision and single-precision is only available if the
  single-precision field is nonzero.


FPSP, bits [7:4] 


Single Precision. Indicates whether the floating-point implementation provides support for 
single-precision operations. Defined values are: 


0000 Not supported in hardware. 

0001 Supported, VFPv2. 

0010 Supported, VFPv3 or VFPv4. VFPv3 adds an instruction to load a single-precision 
floating-point constant, and conversions between single-precision and fixed-point 
values. 


All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0010. 


A value of 0b0001 or 0b0010 indicates support for all VFP single-precision instructions in the 
supported version of VFP, except that, in addition to this field being nonzero: 


• VSQRT.F32 is only available if the Square root field is 0001.
• VDIV.F32 is only available if the Divide field is 0001.
• Conversion between double-precision and single-precision is only available if the
  double-precision field is nonzero.


SIMDReg, bits [3:0] 


Advanced SIMD registers. Indicates whether the Advanced SIMD and floating-point 
implementation provides support for the Advanced SIMD and floating-point register bank. Defined 
values are: 


0000 The implementation has no Advanced SIMD and floating-point support. 


0001 The implementation includes floating-point support with 16 x 64-bit registers. 
0010 The implementation includes Advanced SIMD and floating-point support with 32 x
64-bit registers.


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0010. 
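
The field descriptions above, including the cross-field conditions on VSQRT and VDIV, can be illustrated with a short decoding sketch. The names `MVFR0_FIELDS`, `decode_mvfr0`, `has_vsqrt_f64`, and `has_vdiv_f32` are hypothetical, and the sample value is chosen only to exercise the field positions.

```python
# (field name, least significant bit) for each 4-bit MVFR0 field.
MVFR0_FIELDS = (
    ("FPRound", 28),
    ("FPShVec", 24),
    ("FPSqrt", 20),
    ("FPDivide", 16),
    ("FPTrap", 12),
    ("FPDP", 8),
    ("FPSP", 4),
    ("SIMDReg", 0),
)


def decode_mvfr0(value: int) -> dict:
    """Split a 32-bit MVFR0 value into its eight 4-bit fields."""
    return {name: (value >> lsb) & 0xF for name, lsb in MVFR0_FIELDS}


def has_vsqrt_f64(fields: dict) -> bool:
    """VSQRT.F64 needs FPSqrt == 0001 and a nonzero FPDP field."""
    return fields["FPSqrt"] == 0b0001 and fields["FPDP"] != 0


def has_vdiv_f32(fields: dict) -> bool:
    """VDIV.F32 needs FPDivide == 0001 and a nonzero FPSP field."""
    return fields["FPDivide"] == 0b0001 and fields["FPSP"] != 0


fields = decode_mvfr0(0x10110222)  # sample value, for illustration only
```

The two predicate functions show why MVFR0 fields must be read together: a nonzero precision field alone does not imply that the square root or divide instructions are available.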


Accessing the MVFR0:
To access the MVFR0:
VMRS <Rt>, MVFR0 ; Read MVFR0 into Rt


Register access is encoded as follows:

spec_reg
0111
G6.2.106 MVFR1, Media and VFP Feature Register 1
The MVFR1 characteristics are: 


Purpose 


Describes the features provided by the AArch32 Advanced SIMD and Floating-point 
implementation. 


Must be interpreted with MVFR0 and MVFR2.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)    EL2(NS)    EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       Config-RO  Config-RO  RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1        EL2(NS)
-    Config-RO  Config-RO

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If CPACR.cp10==00, read accesses to this register from PL1 are UNDEFINED.
• If NSACR.cp10==0, Non-secure read accesses to this register from EL1 and EL2 are
UNDEFINED.
• If CPTR_EL2.TFP==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If CPTR_EL3.TFP==1, read accesses to this register from EL1 and EL2 are trapped to EL3.
• If HCPTR.TCP10==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCPTR.TCP10==1, Non-secure read accesses to this register from EL2 are UNDEFINED.
• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register MVFR1 is architecturally mapped to AArch64 System register 
MVFR1_EL1. 


Implemented only if the implementation includes Advanced SIMD and floating-point instructions. 


Attributes 
MVFR1 is a 32-bit register. 
Field descriptions

The MVFR1 bit assignments are:

Bits     Field
[31:28]  SIMDFMAC
[27:24]  FPHP
[23:20]  SIMDHP
[19:16]  SIMDSP
[15:12]  SIMDInt
[11:8]   SIMDLS
[7:4]    FPDNaN
[3:0]    FPFtZ


SIMDFMAC, bits [31:28] 


Advanced SIMD Fused Multiply-Accumulate. Indicates whether the Advanced SIMD 
implementation provides fused multiply accumulate instructions. Defined values are: 




0000 Not implemented. 
0001 Implemented. 
All other values are reserved. 
In ARMv8-A the permitted values are 0000 and 0001. 
The Advanced SIMD and floating-point implementations must provide the same level of support 
for these instructions. 
FPHP, bits [27:24] 


Floating Point Half Precision. Indicates whether the floating-point implementation provides 
half-precision floating-point conversion instructions. Defined values are: 


0000 Not implemented. 
0001 Instructions to convert between half-precision and single-precision implemented. 


0010 As for 0b0001, and also instructions to convert between half-precision and 
double-precision implemented. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0010. 


SIMDHP, bits [23:20]


Advanced SIMD Half Precision. Indicates whether the Advanced SIMD and floating-point 
implementation provides half-precision floating-point conversion instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. This value is permitted only if the SIMDSP field is 0001. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SIMDSP, bits [19:16] 


Advanced SIMD Single Precision. Indicates whether the Advanced SIMD and floating-point 
implementation provides single-precision floating-point instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. This value is permitted only if the SIMDInt field is 0001. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SIMDInt, bits [15:12] 


Advanced SIMD Integer. Indicates whether the Advanced SIMD and floating-point implementation 
provides integer instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented.
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


SIMDLS, bits [11:8] 


Advanced SIMD Load/Store. Indicates whether the Advanced SIMD and floating-point 
implementation provides load/store instructions. Defined values are: 


0000 Not implemented. 
0001 Implemented. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


FPDNaN, bits [7:4] 


Default NaN mode. Indicates whether the floating-point implementation provides support only for 
the Default NaN mode. Defined values are: 


0000 Not implemented, or hardware supports only the Default NaN mode. 
0001 Hardware supports propagation of NaN values. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 


FPFtZ, bits [3:0]


Flush to Zero mode. Indicates whether the floating-point implementation provides support only for 
the Flush-to-Zero mode of operation. Defined values are: 


0000 Not implemented, or hardware supports only the Flush-to-Zero mode of operation. 
0001 Hardware supports full denormalized number arithmetic. 
All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0001. 
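
The field layout above lends itself to a table-driven decoder. The following Python sketch splits an MVFR1 value into its eight 4-bit fields and checks the two dependencies stated in the SIMDHP and SIMDSP descriptions. The names MVFR1_FIELDS, decode_mvfr1, and dependency_violations are hypothetical, chosen only for this illustration.

```python
# (msb, lsb) for each field, as listed in the MVFR1 field descriptions.
MVFR1_FIELDS = {
    "SIMDFMAC": (31, 28),
    "FPHP": (27, 24),
    "SIMDHP": (23, 20),
    "SIMDSP": (19, 16),
    "SIMDInt": (15, 12),
    "SIMDLS": (11, 8),
    "FPDNaN": (7, 4),
    "FPFtZ": (3, 0),
}

def decode_mvfr1(mvfr1):
    """Return a dict mapping each MVFR1 field name to its raw value."""
    return {
        name: (mvfr1 >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, (msb, lsb) in MVFR1_FIELDS.items()
    }

def dependency_violations(fields):
    """List violations of the 'permitted only if' rules for SIMDHP and SIMDSP."""
    violations = []
    if fields["SIMDHP"] == 0b0001 and fields["SIMDSP"] != 0b0001:
        violations.append("SIMDHP is 0001 but SIMDSP is not 0001")
    if fields["SIMDSP"] == 0b0001 and fields["SIMDInt"] != 0b0001:
        violations.append("SIMDSP is 0001 but SIMDInt is not 0001")
    return violations
```

A value such as 0x12111111 decodes to FPHP = 0b0010 with every other field 0b0001, and reports no violations.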


Accessing the MVFR1:
To access the MVFR1:
VMRS <Rt>, MVFR1 ; Read MVFR1 into Rt

Register access is encoded as follows:

spec_reg
0110
G6.2.107 MVFR2, Media and VFP Feature Register 2
The MVFR2 characteristics are: 


Purpose 


Describes the features provided by the AArch32 Advanced SIMD and Floating-point 
implementation. 


Must be interpreted with MVFR0 and MVFR1.
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page G4-4169. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)    EL2(NS)    EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       Config-RO  Config-RO  RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1        EL2(NS)
-    Config-RO  Config-RO

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If CPACR.cp10==00, read accesses to this register from PL1 are UNDEFINED.
• If NSACR.cp10==0, Non-secure read accesses to this register from EL1 and EL2 are
UNDEFINED.
• If CPTR_EL2.TFP==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If CPTR_EL3.TFP==1, read accesses to this register from EL1 and EL2 are trapped to EL3.
• If HCPTR.TCP10==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCPTR.TCP10==1, Non-secure read accesses to this register from EL2 are UNDEFINED.
• If HCR.TID3==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TID3==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register MVFR2 is architecturally mapped to AArch64 System register 
MVFR2_EL1. 


Implemented only if the implementation includes Advanced SIMD and floating-point instructions. 


Attributes 
MVFR2 is a 32-bit register.







Field descriptions 


The MVFR2 bit assignments are: 


Bits    Field
[31:8]  RES0
[7:4]   FPMisc
[3:0]   SIMDMisc

Bits [31:8]

Reserved, RES0.


FPMisc, bits [7:4] 


Indicates whether the floating-point implementation provides support for miscellaneous VFP
features.

0000 Not implemented, or no support for miscellaneous features. 

0001 Support for Floating-point selection. 

0010 As 0001, and Floating-point Conversion to Integer with Directed Rounding modes. 
0011 As 0010, and Floating-point Round to Integral Floating-point. 

0100 As 0011, and Floating-point MaxNum and MinNum. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0100. 


SIMDMisc, bits [3:0]

Indicates whether the Advanced SIMD implementation provides support for miscellaneous
Advanced SIMD features.

0000 Not implemented, or no support for miscellaneous features. 

0001 Floating-point Conversion to Integer with Directed Rounding modes. 
0010 As 0001, and Floating-point Round to Integral Floating-point. 

0011 As 0010, and Floating-point MaxNum and MinNum. 


All other values are reserved. 


In ARMv8-A the permitted values are 0000 and 0011. 
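
Both MVFR2 fields are cumulative: each defined value adds one feature to those indicated by the value below it. A minimal Python sketch of that interpretation follows. The function name mvfr2_features and the list names are hypothetical; the feature strings are taken from the value descriptions above.

```python
# Features in the order in which successive FPMisc values add them.
FPMISC_FEATURES = [
    "Floating-point selection",
    "Floating-point Conversion to Integer with Directed Rounding modes",
    "Floating-point Round to Integral Floating-point",
    "Floating-point MaxNum and MinNum",
]
# Features in the order in which successive SIMDMisc values add them.
SIMDMISC_FEATURES = [
    "Floating-point Conversion to Integer with Directed Rounding modes",
    "Floating-point Round to Integral Floating-point",
    "Floating-point MaxNum and MinNum",
]

def mvfr2_features(mvfr2):
    """Return (floating-point features, Advanced SIMD features) for an MVFR2 value."""
    fpmisc = (mvfr2 >> 4) & 0xF   # FPMisc, bits [7:4]
    simdmisc = mvfr2 & 0xF        # SIMDMisc, bits [3:0]
    if fpmisc > 0b0100 or simdmisc > 0b0011:
        raise ValueError("reserved FPMisc or SIMDMisc encoding")
    # A field value of k means the first k features in the list are supported.
    return FPMISC_FEATURES[:fpmisc], SIMDMISC_FEATURES[:simdmisc]
```

With the ARMv8-A permitted maximum values, FPMisc = 0100 and SIMDMisc = 0011, the function returns every feature in both lists.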


Accessing the MVFR2: 
To access the MVFR2: 
VMRS <Rt>, MVFR2 ; Read MVFR2 into Rt 


Register access is encoded as follows: 


spec_reg 


0101 
G6.2.108 NMRR, Normal Memory Remap Register


The NMRR characteristics are: 


Purpose 


Provides additional mapping controls for memory regions that are mapped as Normal memory by 
their entry in the PRRR. 


Used in conjunction with the PRRR. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


NMRR(S) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW

NMRR(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


NMRR is accessible as follows:

EL0  EL1  EL2(NS)
-    RW   RW

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.
• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.
• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.

Configurations

AArch32 System register NMRR is architecturally mapped to AArch64 System register
MAIR_EL1[63:32].
MAIR1 and NMRR are the same register, with a different view depending on the value of
TTBCR.EAE:

• When it is set to 0, the register is as described in NMRR.
• When it is set to 1, the register is as described in MAIR1.


When EL3 is using AArch32, write access to NMRR(S) is disabled when the CPISSDISABLE 
signal is asserted HIGH. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
NMRR is a 32-bit register. 


Field descriptions 


The NMRR bit assignments are: 


When TTBCR.EAE==0: 


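Bits     Field    Bits     Field
[31:30]  OR7      [15:14]  IR7
[29:28]  OR6      [13:12]  IR6
[27:26]  OR5      [11:10]  IR5
[25:24]  OR4      [9:8]    IR4
[23:22]  OR3      [7:6]    IR3
[21:20]  OR2      [5:4]    IR2
[19:18]  OR1      [3:2]    IR1
[17:16]  OR0      [1:0]    IR0
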
OR<n>, bits [2n+17:2n+16], for n = 0 to 7 


Outer Cacheable property mapping for memory attributes n, if the region is mapped as Normal 
memory by the PRRR.TR<n> entry. n is the value of the TEX[0], C, and B bits concatenated. The 
possible values of this field are: 


00 Region is Non-cacheable. 

01 Region is Write-Back, Write-Allocate. 

10 Region is Write-Through, no Write-Allocate. 
11 Region is Write-Back, no Write-Allocate. 


The meaning of the field with n = 6 is IMPLEMENTATION DEFINED and might differ from the meaning 
given here. This is because the meaning of the attribute combination {TEX[0] = 1, C = 1, B = 0} is
IMPLEMENTATION DEFINED. 


IR<n>, bits [2n+1:2n], for n = 0 to 7


Inner Cacheable property mapping for memory attributes n, if the region is mapped as Normal 
memory by the PRRR.TR<n> entry. n is the value of the TEX[0], C, and B bits concatenated. The 
possible values of this field are: 


00 Region is Non-cacheable. 

01 Region is Write-Back, Write-Allocate. 

10 Region is Write-Through, no Write-Allocate. 
11 Region is Write-Back, no Write-Allocate. 


The meaning of the field with n = 6 is IMPLEMENTATION DEFINED and might differ from the meaning 
given here. This is because the meaning of the attribute combination {TEX[0] = 1, C = 1, B = 0} is
IMPLEMENTATION DEFINED. 
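
The bit-position formulas for OR<n> and IR<n> can be applied directly in software. The following Python sketch returns the Outer and Inner Cacheable mappings for a given n. The function name nmrr_attributes is hypothetical, and the sketch does not model the IMPLEMENTATION DEFINED meaning that is permitted for n = 6.

```python
# Encodings shared by the OR<n> and IR<n> fields.
CACHEABILITY = {
    0b00: "Non-cacheable",
    0b01: "Write-Back, Write-Allocate",
    0b10: "Write-Through, no Write-Allocate",
    0b11: "Write-Back, no Write-Allocate",
}

def nmrr_attributes(nmrr, n):
    """Return (outer, inner) cacheability for memory attributes n, 0 <= n <= 7.

    n is the value of the TEX[0], C, and B bits concatenated.
    """
    if not 0 <= n <= 7:
        raise ValueError("n must be in the range 0 to 7")
    outer = (nmrr >> (2 * n + 16)) & 0b11  # OR<n>, bits [2n+17:2n+16]
    inner = (nmrr >> (2 * n)) & 0b11       # IR<n>, bits [2n+1:2n]
    return CACHEABILITY[outer], CACHEABILITY[inner]
```

For example, with OR3 = 01 and IR3 = 11 the NMRR value is 0x004000C0, and nmrr_attributes(0x004000C0, 3) reports Write-Back, Write-Allocate for the Outer mapping and Write-Back, no Write-Allocate for the Inner mapping.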


Accessing the NMRR: 


To access the NMRR when TTBCR.EAE==0:

MRC p15,0,<Rt>,c10,c2,1 ; Read NMRR into Rt
MCR p15,0,<Rt>,c10,c2,1 ; Write Rt to NMRR

Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   1010  0010  001
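
To show how the encoding values above map onto an instruction word, the following Python sketch assembles the A32 encoding of an MRC instruction. The function name encode_mrc_a32 is hypothetical. The field positions used are those of the A32 MRC instruction encoding, which is described elsewhere in this manual and is not restated in this section, so treat them as an assumption of the sketch. The T32 encoding is not covered.

```python
def encode_mrc_a32(coproc, opc1, crn, crm, opc2, rt, cond=0b1110):
    """Assemble an A32 MRC instruction word.

    Assumed layout: cond[31:28] 1110[27:24] opc1[23:21] 1[20] CRn[19:16]
    Rt[15:12] coproc[11:8] opc2[7:5] 1[4] CRm[3:0].
    cond defaults to 0b1110, the 'always' condition.
    """
    return (
        (cond << 28)
        | (0b1110 << 24)
        | (opc1 << 21)
        | (1 << 20)      # bit 20 set selects MRC (read) rather than MCR (write)
        | (crn << 16)
        | (rt << 12)
        | (coproc << 8)
        | (opc2 << 5)
        | (1 << 4)
        | crm
    )

# Read NMRR into R0, using the encoding values from the table above.
nmrr_read = encode_mrc_a32(coproc=0b1111, opc1=0b000, crn=0b1010,
                           crm=0b0010, opc2=0b001, rt=0)
```

Under these assumptions nmrr_read is 0xEE1A0F32.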
G6.2.109 NSACR, Non-Secure Access Control Register
The NSACR characteristics are: 


Purpose 


When EL3 is implemented and can use AArch32, defines the Non-secure access permissions to 
Trace, Advanced SIMD and floating-point functionality. Also includes IMPLEMENTATION DEFINED 
bits that can define Non-secure access permissions for IMPLEMENTATION DEFINED functionality. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(NS)  EL2(NS)
-    RO       RO

If EL3 is implemented and is using AArch64 then:

• Any read of the NSACR from Non-secure EL2 using AArch32 or Non-secure EL1 using
AArch32 returns a value of 0x00000C00.
• Any read or write to NSACR from Secure EL1 using AArch32 is trapped as an exception to
EL3.

If EL3 is not implemented, then any read of the NSACR from EL2 using AArch32 or from EL1
using AArch32 returns a value of 0x00000C00.

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T1==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T1==1, Non-secure read accesses to this register from EL1 are trapped to EL2.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Note
In AArch64 state, the NSACR controls are replaced by controls in CPTR_EL3.





Some or all RW fields of this register have defined reset values. These apply whenever the register 
is accessible. This means they apply when the PE resets into EL3 using AArch32. 


Attributes 
NSACR is a 32-bit register. 


Field descriptions 


The NSACR bit assignments are: 
Bits     Field
[31:21]  RES0
[20]     NSTRCDIS
[19]     RES0
[18:16]  IMP DEF
[15]     NSASEDIS
[14:12]  RES0
[11]     cp11
[10]     cp10
[9:0]    RES0

Bits [31:21]

Reserved, RES0.


NSTRCDIS, bit [20] 
Disables Non-secure System register accesses to all implemented trace registers. 
0 This control has no effect on:
• System register access to implemented trace registers.
• The behavior of CPACR.TRCDIS and HCPTR.TTA.

1 Non-secure System register accesses to all implemented trace registers are disabled,
meaning:
• CPACR.TRCDIS behaves as RAO/WI in Non-secure state, regardless of its
actual value.
• HCPTR.TTA behaves as RAO/WI, regardless of its actual value.


The implementation of this field must correspond to the implementation of the CPACR.TRCDIS 
field: 


• If CPACR.TRCDIS is RAZ/WI, this field is RAZ/WI.
• If CPACR.TRCDIS is RW, this field is RW.

Note

• The ETMv4 architecture does not permit EL0 to access the trace registers. If the
implementation includes an ETMv4 implementation, EL0 accesses to the trace registers are
UNDEFINED.
• The architecture does not provide Non-secure access controls on trace register accesses
through the optional memory-mapped external debug interface.

System register accesses to the trace registers can have side-effects. When a System register access
is trapped, any side-effects that are normally associated with the access do not occur before the
exception is taken.

When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.

Bit [19]
Reserved, RES0.


IMPLEMENTATION DEFINED, bits [18:16] 


IMPLEMENTATION DEFINED. 


NSASEDIS, bit [15]
Disables Non-secure access to the Advanced SIMD functionality.
0 This control has no effect on:
• Non-secure access to Advanced SIMD functionality.
• The behavior of CPACR.ASEDIS and HCPTR.TASE.

1 Non-secure access to the Advanced SIMD functionality is disabled, meaning:
• CPACR.ASEDIS behaves as RAO/WI in Non-secure state, regardless of its
actual value.
• HCPTR.TASE behaves as RAO/WI, regardless of its actual value.

The implementation of this field must correspond to the implementation of the CPACR.ASEDIS
field:

• If CPACR.ASEDIS is RES0, this field is RES0. If the implementation does not include
Advanced SIMD and floating-point functionality, this field is RES0.
• If CPACR.ASEDIS is RAZ/WI, this field is RAZ/WI.
• If CPACR.ASEDIS is RW, this field is RW.

When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.

Bits [14:12]

Reserved, RES0.

cp11, bit [11]

The value of this field is ignored. If this field is programmed with a different value to the cp10 field
then this field is UNKNOWN on a direct read of the NSACR.

If the implementation does not include Advanced SIMD and floating-point functionality, this field
is RES0.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.

cp10, bit [10]

Enable Non-secure access to the Advanced SIMD and floating-point features. Possible values of the
fields are:

0 Advanced SIMD and floating-point features can be accessed only from Secure state.
Any attempt to access this functionality from Non-secure state is UNDEFINED.

When the PE is in Non-secure state:
• The CPACR.{cp11, cp10} fields ignore writes and read as 0b00, access denied.
• The HCPTR.{TCP11, TCP10} fields behave as RAO/WI, regardless of their
actual values.

1 Advanced SIMD and floating-point features can be accessed from both Security states.

If Non-secure access to the Advanced SIMD and floating-point functionality is enabled, the CPACR
must be checked to determine the level of access that is permitted.

The Advanced SIMD and floating-point features controlled by these fields are:
• Execution of any floating-point or Advanced SIMD instruction.
• Any access to the Advanced SIMD and floating-point registers D0-D31 and their views as
S0-S31 and Q0-Q15.
• Any access to the FPSCR, FPSID, MVFR0, MVFR1, MVFR2, or FPEXC System registers.

If the implementation does not include Advanced SIMD and floating-point functionality, this field
is RES0.

If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.

Bits [9:0]

Reserved, RES0.
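
As a worked example of the NSACR layout, the following Python sketch extracts the four named single-bit fields and applies the cp10 rule. The function names decode_nsacr and nonsecure_fp_simd_enabled are hypothetical. The sketch ignores cp11 when deciding access, because its value is described above as ignored, and it does not model the further CPACR check that applies when Non-secure access is enabled.

```python
def decode_nsacr(nsacr):
    """Return the named single-bit NSACR fields as a dict of 0 or 1 values."""
    return {
        "NSTRCDIS": (nsacr >> 20) & 1,  # bit [20]
        "NSASEDIS": (nsacr >> 15) & 1,  # bit [15]
        "cp11": (nsacr >> 11) & 1,      # bit [11], value is ignored
        "cp10": (nsacr >> 10) & 1,      # bit [10]
    }

def nonsecure_fp_simd_enabled(nsacr):
    """True if cp10 permits Advanced SIMD and floating-point use from Non-secure state."""
    return decode_nsacr(nsacr)["cp10"] == 1
```

The fixed read value 0x00000C00, returned when EL3 is not implemented or is using AArch64, decodes to cp11 = 1 and cp10 = 1 with both disable bits clear.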


Accessing the NSACR: 


To access the NSACR: 


MRC p15,0,<Rt>,c1,c1,2 ; Read NSACR into Rt
MCR p15,0,<Rt>,c1,c1,2 ; Write Rt to NSACR

Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0001  0001  010
G6.2.110 PAR, Physical Address Register
The PAR characteristics are: 


Purpose 
Returns the output address (OA) from an address translation instruction that executed successfully, 
or fault information if the instruction did not execute successfully. 

Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


PAR(S) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW

PAR(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -

If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of
this register.

PAR is accessible as follows:

EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T7==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T7==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
AArch32 System register PAR is architecturally mapped to AArch64 System register PAR_EL1. 
The PAR returns a 32-bit value: 
• When the PE is not in Hyp mode and is using the Short-descriptor translation table format.
• When the PE is in Hyp mode for the ATS12NSOxx instructions when the value of HCR.VM
is 0 and the value of TTBCR.EAE is 0.

In these cases, PAR[63:32] is RES0.

Otherwise, the PAR returns a 64-bit value. This means it returns a 64-bit value in the following
cases:

• When using the Long-descriptor translation table format.
• If the stage 1 address translation is disabled and TTBCR.EAE is set to 1.
• In an implementation that includes EL2, for the result of an ATS1Cxx instruction performed
from Hyp mode.

For PL1&0 stage 1 translations, TTBCR.EAE selects the translation table format.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 


PAR is a 64-bit register that can also be accessed as a 32-bit value. If it is accessed as a 32-bit 
register, accesses read and write bits[31:0] and do not modify bits[63:32]. 


The Configurations section specifies the cases where each PAR format is used. 


Field descriptions 


The PAR bit assignments are: 


For all register layouts: 


F, bit [0] 
Indicates whether the instruction performed a successful address translation. 
0 Address translation completed successfully.
1 Address translation aborted. 


When the instruction returned a 32-bit value to the PAR, PAR.F==0:

Bits     Field
[31:12]  PA
[11]     LPAE
[10]     NOS
[9]      NS
[8]      IMP DEF
[7]      SH
[6:4]    Inner[2:0]
[3:2]    Outer[1:0]
[1]      SS
[0]      F


This section describes the register value returned by the successful execution of an Address translation instruction. 
Software might subsequently write a different value to the register, and that write does not affect the operation of 
the PE. 


On a successful conversion, the PAR can return a value that indicates the resulting attributes, rather than the values 
that appear in the translation table descriptors. More precisely: 


• Memory attribute fields are permitted to report the resulting attributes, as determined by any permitted
implementation choices and any applicable configuration bits, instead of reporting the values that appear in
the translation table descriptors. This applies to the NOS, SH, Inner, and Outer fields.
• See the NS bit description for constraints on the value it returns.


PA, bits [31:12] 


Output address. The output address (OA) corresponding to the supplied input address. This field 
returns address bits[31:12]. 





LPAE, bit [11]
When updating the PAR with the result of the translation operation, this bit is set as follows:
0 Short-descriptor translation table format used. This means the PAR returned a 32-bit
value.

NOS, bit [10]

Not Outer Shareable. When the returned value of PAR.SH is 1, indicates the Shareability attribute
for the physical memory region:

0 Memory region is Outer Shareable.
1 Memory region is Inner Shareable.

When the returned value of PAR.SH is 0 the value returned to this field is UNKNOWN.
The value returned in this field can be the resulting attribute, as determined by any permitted
implementation choices and any applicable configuration bits, instead of the value that appears in
the translation table descriptor.

NS, bit [9] 
Non-secure. The NS attribute for a translation table entry from a Secure translation regime. 


For a result from a Secure translation regime, this bit reflects the Security state of the physical 
address space of the translation. This means it reflects the effect of the NSTable bits of earlier levels 
of the translation table walk if those NSTable bits have an effect on the translation. 


For a result from a Non-secure translation regime, this bit is UNKNOWN. 


IMP DEF, bit [8] 


IMPLEMENTATION DEFINED. 


SH, bit [7] 
Shareability. Indicates whether the physical memory region is Non-shareable: 
0 Memory is Non-shareable.


1 Memory is shareable, and PAR.NOS indicates whether the region is Outer Shareable or 
Inner Shareable. 


The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 


Inner[2:0], bits [6:4] 


Inner cacheability attribute for the region. Permitted values are: 


000 Non-cacheable. 

001 Device-nGnRnE. 

011 Device-nGnRE. 

101 Write-Back, Write-Allocate. 
110 Write-Through. 

111 Write-Back, no Write-Allocate. 


The values 010 and 100 are reserved. 


The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 


Outer[1:0], bits [3:2] 


Outer cacheability attribute for the region. Permitted values are: 


00 Non-cacheable. 

01 Write-Back, Write-Allocate.

10 Write-Through, no Write-Allocate. 
11 Write-Back, no Write-Allocate. 


The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 
SS, bit [1]
Supersection. Used to indicate if the result is a Supersection:
0 Result is not a Supersection. PAR[31:12] contains OA[31:12].
1 Result is a Supersection, and:
• PAR[31:24] contains OA[31:24].
• PAR[23:16] contains OA[39:32].
• PAR[15:12] contains 0b0000.
If an implementation supports less than 40 bits of physical address, the bits in the PAR
field that correspond to physical address bits that are not implemented are UNKNOWN.

F, bit [0]
Indicates whether the instruction performed a successful address translation.
0 Address translation completed successfully.
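
The SS bit changes how the PA field must be read, which is easy to get wrong. The following Python sketch recovers the output address bits from a 32-bit PAR value returned by a successful translation, for both the Supersection and non-Supersection cases. The function name par32_output_address is hypothetical. The sketch returns only the address bits that the PAR supplies; the remaining low-order bits come from the input address and are not handled here.

```python
def par32_output_address(par):
    """Return the output address bits held in a 32-bit format PAR value.

    Raises ValueError if the value reports a fault (F == 1) or was written
    in the 64-bit format (LPAE == 1).
    """
    if par & 1:
        raise ValueError("PAR.F is 1: address translation aborted")
    if (par >> 11) & 1:
        raise ValueError("PAR.LPAE is 1: value uses the 64-bit format")
    if (par >> 1) & 1:
        # Supersection: PAR[31:24] holds OA[31:24], PAR[23:16] holds OA[39:32].
        oa_31_24 = (par >> 24) & 0xFF
        oa_39_32 = (par >> 16) & 0xFF
        return (oa_39_32 << 32) | (oa_31_24 << 24)
    # Not a Supersection: PAR[31:12] holds OA[31:12].
    return par & 0xFFFFF000
```

For a Supersection result of 0x80010002, PAR[31:24] is 0x80 and PAR[23:16] is 0x01, so the function returns 0x180000000.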


When the instruction returned a 32-bit value to the PAR, PAR.F==1:

Bits     Field
[31:16]  IMP DEF
[15:12]  RES0
[11]     LPAE
[10:7]   RES0
[6:1]    FS
[0]      F



This section describes the register value returned by a fault on the execution of an Address translation instruction. 
Software might subsequently write a different value to the register, and that write does not affect the operation of 
the PE. 
IMP DEF, bits [31:16] 

IMPLEMENTATION DEFINED. 


Bits [15:12]
Reserved, RES0.

LPAE, bit [11]
When updating the PAR with the result of the translation operation, this bit is set as follows:
0 Short-descriptor translation table format used. This means the PAR returned a 32-bit
value.

Bits [10:7]
Reserved, RES0.


FS, bits [6:1] 
Fault status bits. Bits [12,10,3:0] from the DFSR, indicating the source of the abort. 





F, bit [0] 
Indicates whether the instruction performed a successful address translation. 
1 Address translation aborted.
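
The FS field packs six DFSR bits into PAR[6:1]. The following Python sketch extracts FS from a faulting 32-bit PAR value and spreads it back to the DFSR bit positions. The function names are hypothetical, and the sketch assumes that the listed bits [12,10,3:0] are packed most significant first, so that FS[5] corresponds to DFSR[12], FS[4] to DFSR[10], and FS[3:0] to DFSR[3:0].

```python
def par32_fault_status(par):
    """Return the 6-bit FS field, PAR[6:1], from a faulting 32-bit PAR value."""
    if not par & 1:
        raise ValueError("PAR.F is 0: address translation completed successfully")
    return (par >> 1) & 0x3F

def fs_to_dfsr_bits(fs):
    """Place FS[5], FS[4], and FS[3:0] at DFSR bits [12], [10], and [3:0].

    Assumes most-significant-first packing of the listed DFSR bits.
    """
    return (((fs >> 5) & 1) << 12) | (((fs >> 4) & 1) << 10) | (fs & 0xF)
```

The result of fs_to_dfsr_bits can then be compared against DFSR values using the same masks that fault-handling code already applies to the DFSR.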
When the instruction returned a 64-bit value to the PAR, PAR.F==0:

Bits     Field
[63:56]  ATTR
[55:40]  RES0
[39:12]  PA
[11]     LPAE
[10]     IMP DEF
[9]      NS
[8:7]    SH
[6:1]    RES0
[0]      F

This section describes the register value returned by the successful execution of an Address translation instruction.
Software might subsequently write a different value to the register, and that write does not affect the operation of
the PE.

On a successful conversion, the PAR can return a value that indicates the resulting attributes, rather than the values 
that appear in the translation table descriptors. More precisely: 


• Memory attribute fields are permitted to report the resulting attributes, as determined by any permitted
implementation choices and any applicable configuration bits, instead of reporting the values that appear in
the translation table descriptors. This applies to the ATTR and SH fields.
• See the NS bit description for constraints on the value it returns.


ATTR, bits [63:56] 


Memory attributes for the returned output address. This field uses the same encoding as the Attr<n> 
fields in MAIR0 and MAIR1.


The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 


Bits [55:40] 
Reserved, RES0.
PA, bits [39:12] 


Output address. The output address (OA) corresponding to the supplied input address. This field 
returns address bits[39:12]. 


LPAE, bit [11] 
When updating the PAR with the result of the translation operation, this bit is set as follows: 
1 Long-descriptor translation table format used. This means the PAR returned a 64-bit 
value. 
IMP DEF, bit [10] 


IMPLEMENTATION DEFINED. 


NS, bit [9] 
Non-secure. The NS attribute for a translation table entry from a Secure translation regime. 


For a result from a Secure translation regime, this bit reflects the Security state of the physical 
address space of the translation. This means it reflects the effect of the NSTable bits of earlier levels 
of the translation table walk if those NSTable bits have an effect on the translation. 


For a result from a Non-secure translation regime, this bit is UNKNOWN. 
SH, bits [8:7] 


Shareability attribute, for the returned output address. Permitted values are: 





00 Non-shareable.
10 Outer Shareable.
11 Inner Shareable.

The value 01 is reserved.

Note

This field returns the value 10 for:
• Any type of Device memory.
• Normal memory with both Inner Non-cacheable and Outer Non-cacheable attributes.





The value returned in this field can be the resulting attribute, as determined by any permitted 
implementation choices and any applicable configuration bits, instead of the value that appears in 
the translation table descriptor. 

Bits [6:1]
Reserved, RES0.

F, bit [0]
Indicates whether the instruction performed a successful address translation.


0 Address translation completed successfully.


When the instruction returned a 64-bit value to the PAR, PAR.F==1: 


[Bit assignment diagram, 64-bit PAR format when PAR.F==1: IMP DEF [63:56], IMP DEF [55:52], IMP DEF [51:48], RES0 [47:12], LPAE [11], RES0 [10], FSTAGE [9], S2WLK [8], RES0 [7], FST [6:1], F [0].]



This section describes the register value returned by a fault on the execution of an Address translation instruction. 
Software might subsequently write a different value to the register, and that write does not affect the operation of 
the PE. 


IMP DEF, bits [63:56] 
IMPLEMENTATION DEFINED. 


IMP DEF, bits [55:52] 
IMPLEMENTATION DEFINED. 





IMP DEF, bits [51:48] 
IMPLEMENTATION DEFINED. 
Bits [47:12]
Reserved, RES0.
LPAE, bit [11] 
When updating the PAR with the result of the translation operation, this bit is set as follows: 





1 Long-descriptor translation table format used. This means the PAR returned a 64-bit 
value. 
Bit [10]
Reserved, RES0.

FSTAGE, bit [9]
Indicates the translation stage at which the translation aborted: 
0 Translation aborted because of a fault in the stage 1 translation. 
1 Translation aborted because of a fault in the stage 2 translation. 
S2WLK, bit [8] 


If this bit is set to 1, it indicates the translation aborted because of a stage 2 fault during a stage 1
translation table walk.

Bit [7]
Reserved, RES0.


FST, bits [6:1] 


Fault status field. Values are as in the DFSR.STATUS and IFSR.STATUS fields when using the
Long-descriptor translation table format.


F, bit [0]
Indicates whether the instruction performed a successful address translation. 


1 Address translation aborted. 
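
The two 64-bit PAR layouts described above can be told apart by the F bit. The following is a minimal sketch, in Python, of how software might split a 64-bit PAR value into its fields. The function name `decode_par64` and the dictionary keys are illustrative assumptions, not part of the architecture, and the sketch covers only the 64-bit (LPAE==1) format described in this section.

```python
def decode_par64(par):
    """Decode a 64-bit PAR value (Long-descriptor format, LPAE == 1).

    Hypothetical helper: field positions follow the field descriptions
    in this section; the name and return shape are assumptions.
    """
    fields = {"F": par & 1, "LPAE": (par >> 11) & 1}
    if fields["F"] == 0:
        # Successful translation: address and attribute fields are valid.
        fields["PA"] = par & (0xFFFFFFF << 12)  # output address bits [39:12], left in place
        fields["SH"] = (par >> 7) & 0b11         # 00 Non-shareable, 10 Outer, 11 Inner
        fields["NS"] = (par >> 9) & 1            # UNKNOWN for a Non-secure translation regime
        fields["ATTR"] = (par >> 56) & 0xFF      # same encoding as the MAIRn Attr<n> fields
    else:
        # Aborted translation: fault information fields are valid.
        fields["FST"] = (par >> 1) & 0x3F        # fault status, bits [6:1]
        fields["S2WLK"] = (par >> 8) & 1         # stage 2 fault during a stage 1 walk
        fields["FSTAGE"] = (par >> 9) & 1        # 0 = stage 1 fault, 1 = stage 2 fault
    return fields
```

A caller would typically test `fields["F"]` first and only then use either the address fields or the fault fields, because the same bit positions carry different meanings in the two layouts.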


Accessing the PAR: 
To access the PAR when accessing as a 32-bit register: 


MRC p15,0,<Rt>,c7,c4,0 ; Read PAR[31:0] into Rt
MCR p15,0,<Rt>,c7,c4,0 ; Write Rt to PAR[31:0]. PAR[63:32] are unchanged 


Register access is encoded as follows: 

















coproc opc1 CRn CRm opc2
1111 000 0111 0100 000 
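
The encoding tables in this chapter give the coproc, opc1, CRn, CRm, and opc2 values for each register. As a sketch of how those values combine into an instruction word, the Python function below assembles an A32 MRC instruction. The placement of the fields within the 32-bit word is taken from the A32 encoding of MRC, which is described outside this section, and the function name `encode_mrc` is a hypothetical helper.

```python
def encode_mrc(rt, coproc, opc1, crn, crm, opc2, cond=0xE):
    """Return the A32 encoding of MRC<c> coproc, opc1, Rt, CRn, CRm, opc2.

    Assumed layout (from the A32 MRC encoding, not from this section):
    cond[31:28] 1110[27:24] opc1[23:21] 1[20] CRn[19:16] Rt[15:12]
    coproc[11:8] opc2[7:5] 1[4] CRm[3:0]. cond defaults to AL (always).
    """
    for value, width in ((rt, 4), (coproc, 4), (opc1, 3), (crn, 4),
                         (crm, 4), (opc2, 3), (cond, 4)):
        if not 0 <= value < (1 << width):
            raise ValueError("field value out of range")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (1 << 20) |
            (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5) |
            (1 << 4) | crm)


# Read PAR[31:0] into R0: coproc=1111, opc1=000, CRn=0111, CRm=0100, opc2=000.
par_read = encode_mrc(rt=0, coproc=0b1111, opc1=0b000, crn=0b0111, crm=0b0100, opc2=0b000)
```

The same helper applies to any register in this chapter that is read with MRC, by substituting the values from that register's encoding table.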
To access the PAR when accessing as a 64-bit register:

MRRC p15,0,<Rt>,<Rt2>,c7 ; Read PAR[31:0] into Rt and PAR[63:32] into Rt2
MCRR p15,0,<Rt>,<Rt2>,c7 ; Write Rt to PAR[31:0] and Rt2 to PAR[63:32]

Register access is encoded as follows:

coproc opc1 CRm
1111 0000 0111

G6.2.111 PRRR, Primary Region Remap Register


The PRRR characteristics are: 


Purpose 


Controls the top level mapping of the TEX[0], C, and B memory region attributes. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


PRRR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





PRRR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


PRRR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.

• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HSTR.T10==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T10==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register PRRR is architecturally mapped to AArch64 System register 
MAIR_EL1[31:0]. 


MAIR0 and PRRR are the same register, with a different view depending on the value of
TTBCR.EAE:

• When it is set to 0, the register is as described in PRRR.
• When it is set to 1, the register is as described in MAIR0.


When EL3 is using AArch32, write access to PRRR(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
PRRR is a 32-bit register. 


Field descriptions 


The PRRR bit assignments are: 


When TTBCR.EAE==0: 


[Bit assignment diagram: NOS7 to NOS0 [31:24], RES0 [23:20], NS1 [19], NS0 [18], DS1 [17], DS0 [16], TR7 to TR0 [15:0].]


NOS<n>, bit [n+24], for n = 0 to 7 


Not Outer Shareable. NOS<n> is the Outer Shareable property for memory attributes n, if the region 
is mapped as Normal memory that is not Inner Non-cacheable, Outer Non-cacheable, and the 
appropriate PRRR.{NS0, NS1} field identifies the region as shareable. n is the value of the
concatenation of the {TEX[0], C, B} bits from the translation table descriptor. The possible values 
of each NOS<n> field other than NOS6 are: 


0 Memory region is Outer Shareable. 
1 Memory region is Inner Shareable. 

The value of this bit is ignored if the region is:
• Device memory.
• Normal memory that is at least one of:
— Inner Non-cacheable, Outer Non-cacheable.
— Identified by the appropriate PRRR.{NS0, NS1} field as Non-shareable.


The meaning of the NOS6 field is IMPLEMENTATION DEFINED. 


Bits [23:20]

Reserved, RES0.

NS1, bit [19]


Mapping of S = 1 attribute for Normal memory regions. This field is used in determining the 
Shareability of a memory region that is mapped to Normal memory and both: 


• Is not Inner Non-cacheable, Outer Non-cacheable.
• Has the S bit in the translation table descriptor set to 1.

The possible values of this bit are:

0 Region is Non-shareable.
1 Region is shareable. The value of the appropriate PRRR.NOS<n> field determines
whether the region is Inner Shareable or Outer Shareable.

NS0, bit [18]


Mapping of S = 0 attribute for Normal memory regions. This field is used in determining the 
Shareability of a memory region that is mapped to Normal memory and both: 


• Is not Inner Non-cacheable, Outer Non-cacheable.
• Has the S bit in the translation table descriptor set to 0.


The possible values of this bit are: 


0 Region is Non-shareable. 
1 Region is shareable. The value of the appropriate PRRR.NOS<n> field determines 
whether the region is Inner Shareable or Outer Shareable. 

DS1, bit [17]

Mapping of S = 1 attribute for Device memory. In ARMv8, all types of Device memory are Outer
Shareable, and therefore this bit is RES1.

DS0, bit [16]

Mapping of S = 0 attribute for Device memory. In ARMv8, all types of Device memory are Outer
Shareable, and therefore this bit is RES1.


TR<n>, bits [2n+1:2n], for n = 0 to 7 


TR<n> is the primary TEX mapping for memory attributes n, and defines the mapped memory type 
for a region with attributes n. n is the value of the concatenation of the {TEX[0], C, B} bits from the 
translation table descriptor. The possible values for each field other than TR6 are: 


00 Device-nGnRnE memory 
01 Device-nGnRE memory 
10 Normal memory 


The value 11 is reserved. The effect of programming a field to 11 is CONSTRAINED UNPREDICTABLE, 
see Reserved values in System and memory-mapped registers and translation table entries on 
page K1-5477. 


The meaning of the TR6 field is IMPLEMENTATION DEFINED. 
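
The PRRR field descriptions above amount to a lookup keyed by n, the concatenation of {TEX[0], C, B}. The Python sketch below illustrates that lookup under stated assumptions: the function name `prrr_lookup`, the returned dictionary, and the `non_cacheable` argument are hypothetical. The caller is assumed to know from elsewhere whether the region is Inner Non-cacheable, Outer Non-cacheable, because that property is not held in PRRR.

```python
MEMORY_TYPES = {0b00: "Device-nGnRnE", 0b01: "Device-nGnRE", 0b10: "Normal"}


def prrr_lookup(prrr, tex0, c, b, s, non_cacheable=False):
    """Resolve memory type and Shareability for one descriptor (sketch).

    tex0, c, b, s are the TEX[0], C, B and S bits of the translation
    table descriptor. non_cacheable is an assumed input that is True
    when the region is Inner Non-cacheable, Outer Non-cacheable.
    """
    n = (tex0 << 2) | (c << 1) | b
    if n == 6:
        # TR6 and NOS6 have IMPLEMENTATION DEFINED meanings.
        return {"n": n, "type": "IMPLEMENTATION DEFINED", "shareability": None}
    tr = (prrr >> (2 * n)) & 0b11            # TR<n>, bits [2n+1:2n]
    if tr == 0b11:
        # Reserved value, CONSTRAINED UNPREDICTABLE.
        return {"n": n, "type": "reserved", "shareability": None}
    mem_type = MEMORY_TYPES[tr]
    if mem_type != "Normal":
        share = "Outer Shareable"            # all Device memory types in ARMv8
    elif non_cacheable:
        share = None                         # NS<s> and NOS<n> are ignored in this case
    else:
        ns = (prrr >> (19 if s else 18)) & 1  # NS1 for S == 1, NS0 for S == 0
        nos = (prrr >> (n + 24)) & 1          # NOS<n>, bit [n+24]
        if ns == 0:
            share = "Non-shareable"
        else:
            share = "Inner Shareable" if nos else "Outer Shareable"
    return {"n": n, "type": mem_type, "shareability": share}
```

Returning `None` for the Shareability of Normal Non-cacheable memory is a design choice of this sketch: the PRRR fields do not determine it, so the helper leaves that decision to the caller.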


Accessing the PRRR: 
To access the PRRR when TTBCR.EAE==0: 


MRC p15,0,<Rt>,c10,c2,0 ; Read PRRR into Rt
MCR p15,0,<Rt>,c10,c2,0 ; Write Rt to PRRR


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2
1111 000 1010 0010 000

G6.2.112 REVIDR, Revision ID Register
The REVIDR characteristics are: 


Purpose 
Provides implementation-specific minor revision information. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID1==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TID1==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register REVIDR is architecturally mapped to AArch64 System register
REVIDR_EL1.

If REVIDR has the same value as MIDR, then its contents have no significance.


Attributes 
REVIDR is a 32-bit register. 


Field descriptions 


The REVIDR bit assignments are: 


[Bit assignment diagram: IMPLEMENTATION DEFINED [31:0].]

IMPLEMENTATION DEFINED, bits [31:0]

IMPLEMENTATION DEFINED.


Accessing the REVIDR: 
To access the REVIDR: 
MRC p15,0,<Rt>,c0,c0,6 ; Read REVIDR into Rt


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2
1111 000 0000 0000 110

G6.2.113 RMR (at EL1), Reset Management Register
The RMR (at EL1) characteristics are: 
Purpose 
When this register is implemented: 
• A write to the register can request a Warm reset.
• If EL1 can use AArch32 and AArch64, the register specifies the Execution state that the PE
boots into on a Warm reset.
Usage constraints 
This register is accessible as follows: 
EL0 EL1
- RW 
However, see Configurations for information about whether the register is implemented. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register RMR (at EL1) is architecturally mapped to AArch64 System register 
RMR_EL1. 
Only implemented if EL1 is the highest implemented Exception level. In this case: 
• If EL1 can use AArch32 and AArch64 then this register must be implemented.
• If EL1 cannot use AArch64 then it is IMPLEMENTATION DEFINED whether the register is
implemented.
When this register is not implemented its encoding is UNDEFINED at EL1. 
See the field descriptions for the reset values. These apply whenever the register is implemented. 
Attributes 
RMR (at EL1) is a 32-bit register. 
Field descriptions 
The RMR (at EL1) bit assignments are: 
[Bit assignment diagram: RES0 [31:2], RR [1], AA64 [0].]

Bits [31:2]
Reserved, RES0.

RR, bit [1]
Reset Request. Setting this bit to 1 requests a Warm reset.
This field resets to 0 on a Warm or Cold reset.

AA64, bit [0]
When EL1 can use AArch64, determines which Execution state the PE boots into after a Warm
reset: 


0 AArch32. 
1 AArch64. 


On coming out of the Warm reset, execution starts at the IMPLEMENTATION DEFINED reset vector 
address of the specified Execution state. 


If EL1 cannot use AArch64 this bit is RAZ/WI. 


When implemented as an RW field, this field resets to 0 on a Cold reset. It is not affected by a Warm
reset. 
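
Because bits [31:2] are RES0, a Warm reset request is fully described by the RR and AA64 bits. The Python sketch below shows how software might build the value to write to RMR. The constant and function names are illustrative assumptions, and the sketch does not model whether AA64 is implemented as RW or is RAZ/WI on a given implementation.

```python
RMR_AA64 = 1 << 0  # Execution state after Warm reset: 0 = AArch32, 1 = AArch64
RMR_RR = 1 << 1    # Reset Request: writing 1 requests a Warm reset


def rmr_warm_reset_value(to_aarch64):
    """Return the RMR value that requests a Warm reset (hypothetical helper).

    Bits [31:2] are RES0, so they are left as 0.
    """
    value = RMR_RR
    if to_aarch64:
        value |= RMR_AA64
    return value


def rmr_decode(value):
    """Split an RMR value into its two defined fields."""
    return {"RR": (value >> 1) & 1, "AA64": value & 1}
```

The returned value would then be written with the MCR instruction given in the access description for this register.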


Accessing the RMR (at EL1): 


To access the RMR (at EL1) when EL2 and EL3 are not implemented:


MRC p15,0,<Rt>,c12,c0,2 ; Read RMR into Rt
MCR p15,0,<Rt>,c12,c0,2 ; Write Rt to RMR


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2
1111 000 1100 0000 010

G6.2.114 RMR (at EL3), Reset Management Register
The RMR (at EL3) characteristics are: 


Purpose 
When this register is implemented: 
• A write to the register can request a Warm reset.
• If EL3 can use AArch32 and AArch64, this register specifies the Execution state that the PE
boots into on a Warm reset.
Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        RW             RW





However, see Configurations for information about whether the register is implemented. 
ARM deprecates accessing this register from any PE mode other than Monitor mode. 


If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)








The AArch32 view of the RMR register is subject to the CP15SDISABLE control, which prevents
writing to this register when the CP15SDISABLE signal is asserted.

Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
This register is only accessible in Secure state. 


AArch32 System register RMR (at EL3) is architecturally mapped to AArch64 System register 
RMR_EL3. 


Only implemented if EL3 is the highest implemented Exception level. In this case:

• If EL3 can use AArch32 and AArch64 then this register must be implemented.

• If EL3 cannot use AArch64 then it is IMPLEMENTATION DEFINED whether the register is
implemented.


When this register is not implemented its encoding is UNDEFINED at EL3. 


See the field descriptions for the reset values. These apply whenever the register is implemented. 


Attributes 
RMR (at EL3) is a 32-bit register. 


Field descriptions 


The RMR (at EL3) bit assignments are:

[Bit assignment diagram: RES0 [31:2], RR [1], AA64 [0].]

Bits [31:2]

Reserved, RES0.

RR, bit [1]

Reset Request. Setting this bit to 1 requests a Warm reset.

This field resets to 0 on a Warm or Cold reset.

AA64, bit [0]

When EL3 can use AArch64, determines which Execution state the PE boots into after a Warm
reset:


0 AArch32. 
1 AArch64. 


On coming out of the Warm reset, execution starts at the IMPLEMENTATION DEFINED reset vector 
address of the specified Execution state. 


If EL3 cannot use AArch64 this bit is RAZ/WI. 


When implemented as an RW field, this field resets to 0 on a Cold reset. It is not affected by a Warm
reset. 


Accessing the RMR (at EL3): 


To access the RMR (at EL3) when EL3 is implemented:


MRC p15,0,<Rt>,c12,c0,2 ; Read RMR into Rt
MCR p15,0,<Rt>,c12,c0,2 ; Write Rt to RMR


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2
1111 000 1100 0000 010

G6.2.115 RVBAR, Reset Vector Base Address Register
The RVBAR characteristics are: 
Purpose 
If EL3 is not implemented, contains the IMPLEMENTATION DEFINED address that execution starts 
from after reset when executing in AArch32 state. 
Usage constraints 
If EL1 is the highest exception level implemented and is using AArch32, this register is accessible
as follows:

EL0  EL1
-    RO

If EL2 is the highest exception level implemented and is using AArch32, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    -    RO

This register can only be read at the highest Exception level implemented. It is UNDEFINED at all 
other Exception levels. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T12==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T12==1, Non-secure accesses to this register from EL1 are trapped to EL2.
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
This register is only implemented if the highest Exception level implemented is capable of using 
AArch32, and is not EL3. 
Attributes 
RVBAR is a 32-bit register. 
Field descriptions 
The RVBAR bit assignments are: 
[Bit assignment diagram: Reset Address[31:1] in bits [31:1], RES1 in bit [0].]

Bits [31:1]


Reset Address[31:1]. Bits [31:1] of the IMPLEMENTATION DEFINED address that execution starts 
from after reset when executing in 32-bit state. 


Bit [0] 


Reserved, RES1. 


Accessing the RVBAR: 
To access the RVBAR: 
MRC p15,0,<Rt>,c12,c0,1 ; Read RVBAR into Rt


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2
1111 000 1100 0000 001

G6.2.116 SCR, Secure Configuration Register
The SCR characteristics are: 


Purpose 


When EL3 is implemented and can use AArch32, defines the configuration of the current Security 
state. It specifies: 


• The Security state, either Secure or Non-secure.
• What mode the PE branches to if an IRQ, FIQ, or External Abort occurs.
• Whether the CPSR.F or CPSR.A bits can be modified when SCR.NS==1.


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)
-    -        Trap    -





If EL3 is implemented and is using AArch64, any read or write to SCR from Secure EL1 using 
AArch32 is trapped as an exception to EL3. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:


• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 
This register is only accessible in Secure state. 


AArch32 System register SCR can be mapped to AArch64 System register SCR_EL3, but this is 
not architecturally mandated. 


Some or all RW fields of this register have defined reset values. These apply whenever the register 
is accessible. This means they apply when the PE resets into EL3 using AArch32. 


Attributes 
SCR is a 32-bit register. 


Field descriptions 


The SCR bit assignments are:

[Bit assignment diagram: RES0 [31:16], RES0 [15], RES0 [14], TWE [13], TWI [12], RES0 [11:10], SIF [9], HCE [8], SCD [7], nET [6], AW [5], FW [4], EA [3], FIQ [2], IRQ [1], NS [0].]

Bits [31:16]
Reserved, RES0.

Bit [15]
Reserved, RES0.

Bit [14]
Reserved, RES0.

TWE, bit [13]
Traps WFE instructions to Monitor mode.
0 This control has no effect on the execution of WFE instructions.
1 Any attempt to execute a WFE instruction in any mode other than Monitor mode is
trapped to Monitor mode, if the instruction would otherwise have caused the PE to enter
a low-power state and the attempted execution does not generate an exception that is
taken to EL1 or EL2.


Any exception that is taken to EL1 or to EL2 has priority over this trap. 


The attempted execution of a conditional WFE instruction is only trapped if the instruction passes 
its condition code check. 


Note

Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.





When this register has an architecturally-defined reset value, this field resets to 0. 


TWI, bit [12] 
Traps WFI instructions to Monitor mode. 


0 This control has no effect on the execution of WFI instructions.
1 Any attempt to execute a WFI instruction in any mode other than Monitor mode is
trapped to Monitor mode, if the instruction would otherwise have caused the PE to enter 
a low-power state and the attempted execution does not generate an exception that is 
taken to EL1 or EL2. 


Any exception that is taken to EL1 or to EL2 has priority over this trap. 


The attempted execution of a conditional WFI instruction is only trapped if the instruction passes 
its condition code check. 


Note

Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.





When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [11:10]

Reserved, RES0.








SIF, bit [9] 
Secure instruction fetch. When the PE is in Secure state, this bit disables instruction fetch from 
Non-secure memory. The possible values for this bit are: 
0 Secure state instruction fetches from Non-secure memory are permitted.
1 Secure state instruction fetches from Non-secure memory are not permitted. 
This bit is permitted to be cached in a TLB. 
When this register has an architecturally-defined reset value, this field resets to 0. 
HCE, bit [8] 
Hypervisor Call instruction enable. Enables EL2 and Non-secure EL1 execution of HVC 
instructions. 
0 HVC instructions are: 
• UNDEFINED at Non-secure EL1. The Undefined Instruction exception is taken
from PL1 to PL1.
• UNPREDICTABLE at EL2. Behavior is one of the following:
— The instruction is UNDEFINED. 
— The instruction executes as a NOP. 
1 HVC instructions are enabled at EL2 and Non-secure EL1. 
Note
HVC instructions are always UNDEFINED at EL0 and in Secure state.

If EL2 is not implemented, this bit is RES0 and HVC is UNDEFINED.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.
SCD, bit [7] 
Secure Monitor Call disable. Disables SMC instructions. 
0 SMC instructions are enabled. 
1 In Non-secure state, SMC instructions are UNDEFINED. The Undefined Instruction 
exception is taken from the current Exception level to the current Exception level. 
In Secure state, behavior is one of the following:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.

Note
SMC instructions are always UNDEFINED at PL0.





When this register has an architecturally-defined reset value, this field resets to 0. 





nET, bit [6] 
Not Early Termination. This bit disables early termination. The possible values of this bit are: 
0 Early termination permitted. Execution time of data operations can depend on the data 
values. 
1 Disable early termination. The number of cycles required for data operations is forced 
to be independent of the data values. 
This IMPLEMENTATION DEFINED mechanism can disable data dependent timing optimizations from 
multiplies and data operations. It can provide system support against information leakage that might 
be exploited by timing correlation types of attack. 
On implementations that do not support early termination or do not support disabling early 
termination, this bit is RES0.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.
AW, bit [5] 
When the value of SCR.EA is 1 and the value of HCR.AMO is 0, this bit controls whether CPSR.A
masks an external abort taken from Non-secure state, and the possible values of this bit are:
0 External aborts taken from Non-secure state are not masked by CPSR.A, and are taken
to EL3.
External aborts taken from Secure state are masked by CPSR.A.
1 External aborts taken from either Security state are masked by CPSR.A. When CPSR.A
is 0, the abort is taken to EL3.
When SCR.EA is 0 or HCR.AMO is 1, this bit has no effect. 
When this register has an architecturally-defined reset value, this field resets to 0. 
FW, bit [4] 
When the value of SCR.FIQ is 1 and the value of HCR.FMO is 0, this bit controls whether CPSR.F 
masks an FIQ interrupt taken from Non-secure state, and the possible values of this bit are: 
0 An FIQ taken from Non-secure state is not masked by CPSR.F, and is taken to EL3.
An FIQ taken from Secure state is masked by CPSR.F. 
1 An FIQ taken from either Security state is masked by CPSR.F. When CPSR.F is 0, the 
FIQ is taken to EL3. 
When SCR.FIQ is 0 or HCR.FMO is 1, this bit has no effect. 
When this register has an architecturally-defined reset value, this field resets to 0. 
EA, bit [3] 
External Abort handler. This bit controls which mode takes external aborts. The possible values of 
this bit are: 
0 External aborts taken to Abort mode. 
1 External aborts taken to Monitor mode. 
When this register has an architecturally-defined reset value, this field resets to 0. 
FIQ, bit [2] 
FIQ handler. This bit controls which mode takes FIQ exceptions. The possible values of this bit are: 
0 FIQs taken to FIQ mode.
1 FIQs taken to Monitor mode.


When this register has an architecturally-defined reset value, this field resets to 0. 


IRQ, bit [1] 
IRQ handler. This bit controls which mode takes IRQ exceptions. The possible values of this bit are: 
0 IRQs taken to IRQ mode.
1 IRQs taken to Monitor mode. 
When this register has an architecturally-defined reset value, this field resets to 0. 
NS, bit [0] 
Non-secure bit. Except when the PE is in Monitor mode, this bit determines the Security state of the 
PE: 
0 PE is in Secure state.
1 PE is in Non-secure state. 


If the HCR.TGE bit is set, an attempt to change from a Secure PL1 mode to a Non-secure EL1 mode 
by changing the SCR.NS bit from 0 to 1 results in the SCR.NS bit remaining as 0. 


When this register has an architecturally-defined reset value, this field resets to 0. 
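
The single-bit SCR fields described above can be collected into a small table of bit positions. The Python sketch below decodes an SCR value and evaluates the FW rule for FIQs taken from Non-secure state. The names `SCR_BITS`, `decode_scr`, and `cpsr_f_masks_nonsecure_fiq` are hypothetical, and the value of HCR.FMO is assumed to be supplied by the caller.

```python
# Bit positions of the SCR fields described in this section.
SCR_BITS = {
    "NS": 0, "IRQ": 1, "FIQ": 2, "EA": 3, "FW": 4, "AW": 5, "nET": 6,
    "SCD": 7, "HCE": 8, "SIF": 9, "TWI": 12, "TWE": 13,
}


def decode_scr(scr):
    """Return a dict mapping each SCR field name to its bit value."""
    return {name: (scr >> bit) & 1 for name, bit in SCR_BITS.items()}


def cpsr_f_masks_nonsecure_fiq(scr, hcr_fmo):
    """Apply the FW rule to an FIQ taken from Non-secure state (sketch).

    Returns None when FW has no effect, that is when SCR.FIQ is 0 or
    HCR.FMO is 1. Otherwise returns True if CPSR.F masks the FIQ.
    """
    fields = decode_scr(scr)
    if fields["FIQ"] == 0 or hcr_fmo == 1:
        return None
    return fields["FW"] == 1
```

The AW bit follows the same pattern for external aborts, with SCR.EA, HCR.AMO, and CPSR.A in place of SCR.FIQ, HCR.FMO, and CPSR.F.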


Accessing the SCR: 
To access the SCR: 


MRC p15,0,<Rt>,c1,c1,0 ; Read SCR into Rt
MCR p15,0,<Rt>,c1,c1,0 ; Write Rt to SCR


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2
1111 000 0001 0001 000

G6.2.117 SCTLR, System Control Register


The SCTLR characteristics are: 


Purpose 


Provides the top level control of the system, including its memory system. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


SCTLR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





SCTLR(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


SCTLR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.

• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register SCTLR is architecturally mapped to AArch64 System register
SCTLR_EL1.


When EL3 is using AArch32, write access to SCTLR(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


Some bits in the register are read-only. These bits relate to non-configurable features of an 
implementation, and are provided for compatibility with previous versions of the architecture. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch32. If the PE resets into EL3 using AArch32 they apply 
only to the Secure instance of the register. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
SCTLR is a 32-bit register. 


Field descriptions 


The SCTLR bit assignments are: 


[Bit assignment diagram, bits 31 to 0. Field labels recoverable from the diagram include TE, AFE, TRE, EE, UWXN, WXN, nTWE, nTWI, SED, ITD, CP15BEN, UNK, and reserved RES0 and RES1 fields. See the field descriptions for bit positions.]

Bit [31]
Reserved, RES0.
TE, bit [30] 
T32 Exception Enable. This bit controls whether exceptions to an Exception Level that is executing 
at PL1 are taken to A32 or T32 state: 
0 Exceptions, including reset, taken to A32 state.
1 Exceptions, including reset, taken to T32 state. 
When this register has an architecturally-defined reset value, this field resets to an IMPLEMENTATION 
DEFINED choice between: 
• 0.
• A value determined by an input configuration signal.
AFE, bit [29] 
Access Flag Enable. When using the Short-descriptor translation table format for the PL1&0 
translation regime, this bit enables use of the AP[0] bit in the translation descriptors as the Access 
flag, and restricts access permissions in the translation descriptors to the simplified model. The 
possible values of this bit are: 
0 In the translation table descriptors, AP[0] is an access permissions bit. The full range of
access permissions is supported. No Access flag is implemented.
1 In the translation table descriptors, AP[0] is the Access flag. Only the simplified model
for access permissions is supported. 


When using the Long-descriptor translation table format, the VMSA behaves as if this bit is set to 
1, regardless of the value of this bit. 


The AFE bit is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TRE, bit [28] 


TEX remap enable. This bit enables remapping of the TEX[2:1] bits in the PL1&0 translation 
regime for use as two translation table bits that can be managed by the operating system. Enabling 
this remapping also changes the scheme used to describe the memory region attributes in the 
VMSA. The possible values of this bit are: 


0 TEX remap disabled. TEX[2:0] are used, with the C and B bits, to describe the memory 
region attributes. 


1 TEX remap enabled. TEX[2:1] are reassigned for use as bits managed by the operating
system. The TEX[0], C, and B bits are used to describe the memory region attributes,
with the MMU remap registers.


When the value of TTBCR.EAE is 1, this bit is RES1.
The TRE bit is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [27:26] 


Reserved, RES0.


EE, bit [25] 


The value of the PSTATE.E bit on branch to an exception vector or coming out of reset, and the
endianness of stage 1 translation table walks in the PL1&0 translation regime.


The possible values of this bit are:


0 Little-endian. PSTATE.E is cleared to 0 on taking an exception or coming out of reset.
Stage 1 translation table walks in the PL1&0 translation regime are little-endian.


1 Big-endian. PSTATE.E is set to 1 on taking an exception or coming out of reset.
Stage 1 translation table walks in the PL1&0 translation regime are big-endian.


If an implementation does not provide Big-endian support for data accesses at Exception levels
higher than EL0, this bit is RES0.


If an implementation does not provide Little-endian support for data accesses at Exception levels
higher than EL0, this bit is RES1.


When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to an IMPLEMENTATION DEFINED choice between:


• 0.

• A value determined by an input configuration signal.
Bit [24]

Reserved, RES0.
Bits [23:22]

Reserved, RES1.
Bit [21]

Reserved, RES0.
UWXN, bit [20]


Unprivileged write permission implies PL1 XN (Execute-never). This bit can force all memory
regions that are writable at PL0 to be treated as XN for accesses from software executing at PL1.
The possible values of this bit are:


0 This control has no effect on memory access permissions.
1 Any region that is writable at PL0 is forced to XN for accesses from software executing
at PL1.


The UWXN bit is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, this field resets to 0. 


WXN, bit [19] 


Write permission implies XN (Execute-never). For the PL1&0 translation regime, this bit can force 
all memory regions that are writable to be treated as XN. The possible values of this bit are: 


0 This control has no effect on memory access permissions.


1 Any region that is writable in the PL1&0 translation regime is forced to XN for accesses
from software executing at PL1 or PL0.


The WXN bit is permitted to be cached in a TLB. 


When this register has an architecturally-defined reset value, this field resets to 0. 
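
The combined effect of the UWXN and WXN controls on instruction fetches can be sketched in C. This is a simplified model, not the architectural pseudocode: the `region_perms` structure and the `effective_xn` function are hypothetical names introduced for illustration, and the sketch assumes that "writable" for WXN is evaluated at the privilege level making the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_WXN  (UINT32_C(1) << 19) /* WXN, bit [19] */
#define SCTLR_UWXN (UINT32_C(1) << 20) /* UWXN, bit [20] */

/* Hypothetical summary of one region's permissions; not an architectural type. */
struct region_perms {
    bool xn;           /* Execute-never already set in the descriptor */
    bool writable_pl1; /* region is writable from PL1 */
    bool writable_pl0; /* region is writable from PL0 */
};

/* Returns true if an instruction fetch from privilege level pl (0 or 1)
 * must treat the region as Execute-never. */
static bool effective_xn(uint32_t sctlr, struct region_perms p, int pl)
{
    assert(pl == 0 || pl == 1);
    /* Assumption: WXN looks at write permission for the accessing level. */
    bool writable = (pl == 1) ? p.writable_pl1 : p.writable_pl0;

    if (p.xn)
        return true;
    if ((sctlr & SCTLR_WXN) && writable)
        return true; /* WXN: write permission implies XN */
    if ((sctlr & SCTLR_UWXN) && pl == 1 && p.writable_pl0)
        return true; /* UWXN: writable at PL0 implies XN at PL1 */
    return false;
}
```

With both controls at 0 the function returns the descriptor's own XN value, which matches the "no effect on memory access permissions" wording for the 0 encodings.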


nTWE, bit [18] 
Traps EL0 execution of WFE instructions to Undefined mode.


0 Any attempt to execute a WFE instruction at EL0 is trapped to Undefined mode, if the
instruction would otherwise have caused the PE to enter a low-power state.


1 This control has no effect on the EL0 execution of WFE instructions.

The attempted execution of a conditional WFE instruction is only trapped if the instruction passes
its condition code check.

Note

Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.


When this register has an architecturally-defined reset value, this field resets to 1.


Bit [17]


Reserved, RES0.


nTWI, bit [16] 
Traps EL0 execution of WFI instructions to Undefined mode.


0 Any attempt to execute a WFI instruction at EL0 is trapped to Undefined mode, if the
instruction would otherwise have caused the PE to enter a low-power state.


1 This control has no effect on the EL0 execution of WFI instructions.

The attempted execution of a conditional WFI instruction is only trapped if the instruction passes
its condition code check.

Note

Since a WFE or WFI can complete at any time, even without a Wakeup event, the traps on WFE or
WFI are not guaranteed to be taken, even if the WFE or WFI is executed when there is no Wakeup
event. The only guarantee is that if the instruction does not complete in finite time in the absence of
a Wakeup event, the trap will be taken.


When this register has an architecturally-defined reset value, this field resets to 1.
Bits [15:14]
Reserved, RES0.





V, bit [13] 
Vectors bit. This bit selects the base address of the exception vectors for exceptions taken to a PE 
mode other than Monitor mode or Hyp mode: 
0 Normal exception vectors. Base address is held in VBAR. 
1 High exception vectors (Hivecs), base address 0xFFFF0000. This base address cannot be
remapped.
When this register has an architecturally-defined reset value, this field resets to an IMPLEMENTATION
DEFINED choice between:
• 0.
• A value determined by an input configuration signal.
I, bit [12] 
Instruction access Cacheability control, for accesses at EL1 and EL0:
0 All instruction access to Normal memory from PL1 and PL0 are Non-cacheable for all
levels of instruction and unified cache.
If the value of SCTLR.M is 0, instruction accesses from stage 1 of the PL1&0
translation regime are to Normal, Outer Shareable, Inner Non-cacheable, Outer
Non-cacheable memory.
1 All instruction access to Normal memory from PL1 and PL0 can be cached at all levels
of instruction and unified cache.
If the value of SCTLR.M is 0, instruction accesses from stage 1 of the PL1&0
translation regime are to Normal, Outer Shareable, Inner Write-Through, Outer
Write-Through memory.
Instruction accesses to Normal memory from Non-secure EL1 and Non-secure EL0 are Cacheable
regardless of the value of the SCTLR.I bit if either:
• EL2 is using AArch32 and the value of HCR.DC is 1.
• EL2 is using AArch64 and the value of HCR_EL2.DC is 1.
When this register has an architecturally-defined reset value, this field resets to 0.
Bit [11]
Reserved, RES1.
Bits [10:9]
Reserved, RES0.
SED, bit [8] 
SETEND instruction disable. Disables SETEND instructions at PL0 and PL1.
0 SETEND instruction execution is enabled at PL0 and PL1.
1 SETEND instructions are UNDEFINED at PL0 and PL1.
If the implementation does not support mixed-endian operation at any Exception level, this bit is
RES1.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.
ITD, bit [7]
IT Disable. Disables some uses of IT instructions at PL1 and PL0.
0 All IT instruction functionality is enabled at PL1 and PL0.
1 Any attempt at PL1 or PL0 to execute any of the following is UNDEFINED:
• All encodings of the IT instruction with hw1[3:0]!=1000.
• All encodings of the subsequent instruction with the following values for hw1:
11xxxxxxxxxxxxxx
All 32-bit instructions, and the 16-bit instructions B, UDF, SVC,
LDM, and STM.
1011xxxxxxxxxxxx
All instructions in Miscellaneous 16-bit instructions on
page F3-2442.
10100xxxxxxxxxxx
ADD Rd, PC, #imm
01001xxxxxxxxxxx
LDR Rd, [PC, #imm]
0100x1xxx1111xxx
ADD Rdn, PC; CMP Rn, PC; MOV Rd, PC; BX PC; BLX PC.
010001xx1xxxx111
ADD PC, Rm; CMP PC, Rm; MOV PC, Rm. This pattern also covers
UNPREDICTABLE cases with BLX Rn.
These instructions are always UNDEFINED, regardless of whether they would pass or fail
the condition code check that applies to them as a result of being in an IT block.
It is IMPLEMENTATION DEFINED whether the IT instruction is treated as:
• A 16-bit instruction, that can only be followed by another 16-bit instruction.
• The first half of a 32-bit instruction.
This means that, for the situations that are UNDEFINED, either the second 16-bit
instruction or the 32-bit instruction is UNDEFINED.
An implementation might vary dynamically as to whether IT is treated as a 16-bit
instruction or the first half of a 32-bit instruction.
If an instruction in an active IT block that would be disabled by this field sets this field to 1 then
behavior is CONSTRAINED UNPREDICTABLE. For more information see Changes to an ITD control by
an instruction in an IT block on page E1-2298.
ITD is optional, but if it is implemented in the SCTLR then it must also be implemented in the
SCTLR_EL1. If it is not implemented then this bit is RAZ/WI.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 0.
UNK, bit [6]
Writes to this bit are IGNORED. Reads of this bit return an UNKNOWN value.
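
The hw1 patterns listed for ITD, bit [7] can be checked mechanically. The following C sketch matches a 16-bit first halfword against those patterns, treating each 'x' as a don't-care bit. The function names are hypothetical, and the sketch covers only the "subsequent instruction" patterns, not the restriction on the IT instruction itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hw1 patterns from the ITD description; 'x' marks a don't-care bit. */
static const char *const itd_patterns[] = {
    "11xxxxxxxxxxxxxx", /* 32-bit instructions; 16-bit B, UDF, SVC, LDM, STM */
    "1011xxxxxxxxxxxx", /* Miscellaneous 16-bit instructions */
    "10100xxxxxxxxxxx", /* ADD Rd, PC, #imm */
    "01001xxxxxxxxxxx", /* LDR Rd, [PC, #imm] */
    "0100x1xxx1111xxx", /* ADD Rdn, PC; CMP Rn, PC; MOV Rd, PC; BX PC; BLX PC */
    "010001xx1xxxx111", /* ADD PC, Rm; CMP PC, Rm; MOV PC, Rm */
};

/* Matches hw1 against a 16-character pattern of '0', '1', and 'x',
 * where the first character corresponds to bit 15. */
static bool match_pattern(uint16_t hw1, const char *pattern)
{
    assert(strlen(pattern) == 16);
    for (int bit = 15; bit >= 0; bit--) {
        char c = pattern[15 - bit];
        if (c == 'x')
            continue;
        unsigned want = (c == '1') ? 1u : 0u;
        if (((unsigned)(hw1 >> bit) & 1u) != want)
            return false;
    }
    return true;
}

/* Returns true if hw1 matches any pattern that is UNDEFINED inside an
 * IT block when SCTLR.ITD is 1. */
static bool itd_undefined_in_it_block(uint16_t hw1)
{
    size_t count = sizeof itd_patterns / sizeof itd_patterns[0];
    for (size_t i = 0; i < count; i++) {
        if (match_pattern(hw1, itd_patterns[i]))
            return true;
    }
    return false;
}
```

Keeping the patterns as strings makes the table easy to compare against the list above; a production decoder would more likely precompute a mask and value pair for each pattern.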


CP15BEN, bit [5]


System instruction memory barrier enable. Enables accesses to the DMB, DSB, and ISB System
instructions in the (coproc==1111) encoding space from PL1 and PL0:


0 PL0 and PL1 execution of the CP15DMB, CP15DSB, and CP15ISB instructions is
UNDEFINED.

1 PL0 and PL1 execution of the CP15DMB, CP15DSB, and CP15ISB instructions is
enabled.


CP15BEN is optional, but if it is implemented in the SCTLR then it must also be implemented in
the SCTLR_EL1. If it is not implemented then this bit is RAO/WI.


When this register has an architecturally-defined reset value, if this field is implemented as an RW
field, it resets to 1.


Bits [4:3]


Reserved, RES1.
C, bit [2]
Cacheability control, for data accesses at EL1 and EL0:
0 All data access to Normal memory from PL1 and PL0, and all accesses to the PL1&0
stage 1 translation tables, are Non-cacheable for all levels of data and unified cache.
1 All data access to Normal memory from PL1 and PL0, and all accesses to the PL1&0
stage 1 translation tables, can be cached at all levels of data and unified cache.
The PE ignores SCTLR.C for Non-secure state and data accesses to Normal memory from EL1 and
EL0 are Cacheable if either:
• EL2 is using AArch32 and the value of HCR.DC is 1.
• EL2 is using AArch64 and the value of HCR_EL2.DC is 1.
When this register has an architecturally-defined reset value, this field resets to 0.
A, bit [1] 
Alignment check enable. This is the enable bit for Alignment fault checking at PL1 and PL0:
0 Alignment fault checking disabled when executing at PL1 or PL0.
Instructions that load or store one or more registers, other than load/store exclusive and
load-acquire/store-release, do not check that the address being accessed is aligned to the
size of the data element(s) being accessed.
1 Alignment fault checking enabled when executing at PL1 or PL0.
All instructions that load or store one or more registers have an alignment check that the 
address being accessed is aligned to the size of the data element(s) being accessed. If 
this check fails it causes an Alignment fault, which is taken as a Data Abort exception. 
Load/store exclusive and load-acquire/store-release instructions have an alignment check regardless 
of the value of the A bit. 
When this register has an architecturally-defined reset value, this field resets to 0. 
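
The alignment rule for the A bit can be expressed as a small predicate. The sketch below is illustrative only: `alignment_fault` and its `always_checked` parameter are hypothetical names, where `always_checked` stands for the load/store exclusive and load-acquire/store-release instructions that are checked regardless of the A bit. It assumes the element size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_A (UINT32_C(1) << 1) /* A, bit [1] */

/* True if address is a multiple of element_size (a power of two). */
static bool is_aligned(uint32_t address, uint32_t element_size)
{
    assert(element_size != 0 && (element_size & (element_size - 1)) == 0);
    return (address & (element_size - 1)) == 0;
}

/* True if the access would fail the alignment check described for SCTLR.A.
 * always_checked models instructions that are checked even when A is 0. */
static bool alignment_fault(uint32_t sctlr, uint32_t address,
                            uint32_t element_size, bool always_checked)
{
    bool checked = always_checked || (sctlr & SCTLR_A) != 0;
    return checked && !is_aligned(address, element_size);
}
```

For example, a 4-byte access to 0x1002 passes when A is 0 and the instruction is an ordinary load, but fails the check when A is 1 or when the instruction is a load-exclusive.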
M, bit [0] 


MMU enable for EL1 and EL0 stage 1 address translation. Possible values of this bit are:
0 EL1 and EL0 stage 1 address translation disabled.
See the SCTLR.I field for the behavior of instruction accesses to Normal memory.
1 EL1 and EL0 stage 1 address translation enabled.


In the Non-secure state the PE behaves as if the value of the SCTLR.M field is 0 for all purposes
other than returning the value of a direct read of the field if either:


• EL2 is using AArch32 and the value of HCR.{DC, TGE} is not {0, 0}.
• EL2 is using AArch64 and the value of HCR_EL2.{DC, TGE} is not {0, 0}.


When this register has an architecturally-defined reset value, this field resets to 0.


Accessing the SCTLR: 
To access the SCTLR: 


MRC p15,0,<Rt>,c1,c0,0 ; Read SCTLR into Rt
MCR p15,0,<Rt>,c1,c0,0 ; Write Rt to SCTLR
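
A read-modify-write of the SCTLR built on the two instructions above might look like the following C sketch. The macro and function names are hypothetical. Forcing the RES1 bits to 1 and the RES0 bits to 0 in the new value is one possible software policy, assumed here for illustration. The inline assembly uses GCC-style syntax and is compiled only for AArch32 targets; the value computation is plain C.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

#define SCTLR_M BIT(0)  /* MMU enable */
#define SCTLR_C BIT(2)  /* data cacheability control */
#define SCTLR_I BIT(12) /* instruction cacheability control */

/* Reserved bits, taken from the SCTLR field descriptions. */
#define SCTLR_RES1 (BIT(23) | BIT(22) | BIT(11) | BIT(4) | BIT(3))
#define SCTLR_RES0 (BIT(31) | BIT(27) | BIT(26) | BIT(24) | BIT(21) | \
                    BIT(17) | BIT(15) | BIT(14) | BIT(10) | BIT(9))

/* Computes a new SCTLR value with M, C, and I set and the reserved bits
 * written with their should-be values (an assumed software policy). */
static uint32_t sctlr_with_mmu_and_caches(uint32_t sctlr)
{
    assert((SCTLR_RES0 & SCTLR_RES1) == 0);
    sctlr |= SCTLR_M | SCTLR_C | SCTLR_I;
    sctlr |= SCTLR_RES1;
    sctlr &= ~SCTLR_RES0;
    return sctlr;
}

#if defined(__arm__) && defined(__GNUC__)
/* MRC p15,0,<Rt>,c1,c0,0 */
static inline uint32_t read_sctlr(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(value));
    return value;
}

/* MCR p15,0,<Rt>,c1,c0,0, followed by an ISB so that later instructions
 * see the new setting. */
static inline void write_sctlr(uint32_t value)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0\n\tisb" : : "r"(value) : "memory");
}
#endif
```

On a target, the sequence would be `write_sctlr(sctlr_with_mmu_and_caches(read_sctlr()));`, executed at PL1 after the translation tables and caches have been prepared.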


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0001  0000  000

G6.2.118 SPSR, Saved Program Status Register


The SPSR characteristics are: 


Purpose 


Holds the saved process state for the current mode. 


Usage constraints 


The SPSR can be read using the MRS instruction and written using the MSR (immediate) or MSR 
(register) instructions. For more details on the instruction syntax, see PSTATE and banked register 
access instructions on page F1-2380. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 
SPSR is a 32-bit register. 


Field descriptions 


The SPSR bit assignments are: 


Bits   31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field  N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to the current mode, and copied to CPSR.N on 
executing an exception return operation in the current mode. 

Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to the current mode, and copied to CPSR.Z on 
executing an exception return operation in the current mode. 

C, bit [29] 
Set to the value of CPSR.C on taking an exception to the current mode, and copied to CPSR.C on 
executing an exception return operation in the current mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to the current mode, and copied to CPSR.V on 
executing an exception return operation in the current mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to the current mode, and copied to CPSR.Q on 
executing an exception return operation in the current mode. 


IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 


RES0.
In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.


Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 
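
Because the IT field is split across two parts of the register, software has to reassemble it before interpreting it. The C sketch below shows one way to do this; the function names are hypothetical. The mapping from the position of the least significant 1 in IT[4:0] to an instruction count (bit 0 for four instructions remaining, up to bit 3 for one) follows the T32 IT encoding and is stated here as background rather than taken from this section.

```c
#include <assert.h>
#include <stdint.h>

/* Reassembles IT[7:0] from IT[7:2] (bits [15:10]) and IT[1:0] (bits [26:25]). */
static uint8_t spsr_it_state(uint32_t spsr)
{
    uint32_t it_lo = (spsr >> 25) & 0x3u;  /* IT[1:0] */
    uint32_t it_hi = (spsr >> 10) & 0x3Fu; /* IT[7:2] */
    return (uint8_t)((it_hi << 2) | it_lo);
}

/* IT[7:5]: the top 3 bits of the base condition code. */
static unsigned it_base_condition(uint8_t it)
{
    return (unsigned)it >> 5;
}

/* Number of instructions still to be executed conditionally, derived from
 * the position of the least significant 1 in IT[4:0]. Returns 0 when no
 * IT block is active. */
static unsigned it_block_remaining(uint8_t it)
{
    unsigned low5 = it & 0x1Fu;
    unsigned pos = 0;

    if (low5 == 0)
        return 0;
    while (((low5 >> pos) & 1u) == 0)
        pos++;
    assert(pos <= 4); /* low5 is 5 bits wide */
    return 4u - pos;
}
```

For a saved value with IT[7:0] equal to 0b00011111, the base condition is 0b000 and four conditional instructions remain.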


E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation 
1 Big-endian operation. 
Instruction fetches ignore this bit. 


When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also 
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset. 


If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.


If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.


Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.


A, bit [8] 
SError interrupt mask bit. The possible values of this bit are: 
0 Exception not masked. 
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]


T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:


0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are: 
1 Exception taken from AArch32. 
M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]   Mode
0b0000   User
0b0001   FIQ
0b0010   IRQ
0b0011   Supervisor
0b0110   Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111   Abort
0b1010   Hyp
0b1011   Undefined
0b1111   System

Other values are reserved.

G6.2.119 SPSR_abt, Saved Program Status Register (Abort mode)


The SPSR_abt characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to Abort mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS,!ABT)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, !ABT)
-        -       RW            RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(!ABT)  EL2 (NS)
-    RW         RW

Using MRS (banked register) and MSR (banked register) instructions, at PL1 this register is only 
accessible from PE modes other than Abort mode. In Abort mode, it is accessible as the current 
SPSR. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register SPSR_abt is architecturally mapped to AArch64 System register 
SPSR_abt. 
Attributes 


SPSR_abt is a 32-bit register. 


Field descriptions 


The SPSR_abt bit assignments are: 


Bits   31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field  N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Abort mode, and copied to CPSR.N on 
executing an exception return operation in Abort mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Abort mode, and copied to CPSR.Z on 
executing an exception return operation in Abort mode. 
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C, bit [29] 
Set to the value of CPSR.C on taking an exception to Abort mode, and copied to CPSR.C on 
executing an exception return operation in Abort mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Abort mode, and copied to CPSR.V on 
executing an exception return operation in Abort mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to Abort mode, and copied to CPSR.Q on 
executing an exception return operation in Abort mode. 

IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation 
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are: 
1 Exception taken from AArch32. 
M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]   Mode
0b0000   User
0b0001   FIQ
0b0010   IRQ
0b0011   Supervisor
0b0110   Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111   Abort
0b1011   Undefined
0b1111   System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped
registers and translation table entries on page K1-5477.
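
The SPSR field layout lends itself to a small decoder. The following C sketch extracts the condition flags, the interrupt mask bits, the T bit, and the mode field from a saved 32-bit value. The `spsr_fields` structure and both function names are hypothetical. The mode names follow the M[3:0] encodings listed for the SPSR registers, including Hyp (0b1010) from the SPSR table, and reserved encodings yield NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of an AArch32 SPSR value. */
struct spsr_fields {
    bool n, z, c, v, q; /* bits [31:27] */
    bool a_masked;      /* A, bit [8]: SError interrupt masked */
    bool i_masked;      /* I, bit [7]: IRQ masked */
    bool f_masked;      /* F, bit [6]: FIQ masked */
    bool t32;           /* T, bit [5]: exception taken from T32 state */
    unsigned mode;      /* M[3:0], bits [3:0] */
};

static struct spsr_fields spsr_decode(uint32_t spsr)
{
    struct spsr_fields f;
    f.n = (spsr >> 31) & 1u;
    f.z = (spsr >> 30) & 1u;
    f.c = (spsr >> 29) & 1u;
    f.v = (spsr >> 28) & 1u;
    f.q = (spsr >> 27) & 1u;
    f.a_masked = (spsr >> 8) & 1u;
    f.i_masked = (spsr >> 7) & 1u;
    f.f_masked = (spsr >> 6) & 1u;
    f.t32 = (spsr >> 5) & 1u;
    f.mode = spsr & 0xFu;
    return f;
}

/* Returns the mode name for an M[3:0] value, or NULL if it is reserved. */
static const char *aarch32_mode_name(unsigned mode)
{
    assert(mode <= 0xFu);
    switch (mode) {
    case 0x0: return "User";
    case 0x1: return "FIQ";
    case 0x2: return "IRQ";
    case 0x3: return "Supervisor";
    case 0x6: return "Monitor";
    case 0x7: return "Abort";
    case 0xA: return "Hyp";
    case 0xB: return "Undefined";
    case 0xF: return "System";
    default:  return NULL;
    }
}
```

A caller would typically read the register with MRS, pass the value to `spsr_decode`, and treat a NULL mode name as a reserved encoding.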


Accessing the SPSR_abt: 


To access the SPSR_abt: 


MRS <Rd>, SPSR_abt ; Read SPSR_abt into Rd
MSR SPSR_abt, <Rd> ; Write Rd to SPSR_abt

Register access is encoded as follows:

M1  M     R
1   0100  1

G6.2.120 SPSR_fiq, Saved Program Status Register (FIQ mode)
The SPSR_fiq characteristics are: 


Purpose 
Holds the saved process state when an exception is taken to FIQ mode. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS,!FIQ)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, !FIQ)
-        -       RW            RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(!FIQ)  EL2 (NS)
-    RW         RW





Using MRS (banked register) and MSR (banked register) instructions, at PL1 this register is only 
accessible from PE modes other than FIQ mode. In FIQ mode, it is accessible as the current SPSR. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register SPSR_fiq is architecturally mapped to AArch64 System register 
SPSR_fiq. 


Attributes 
SPSR_fiq is a 32-bit register. 


Field descriptions 


The SPSR_fiq bit assignments are: 


Bits   31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field  N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to FIQ mode, and copied to CPSR.N on 
executing an exception return operation in FIQ mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to FIQ mode, and copied to CPSR.Z on executing 
an exception return operation in FIQ mode. 
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C, bit [29] 
Set to the value of CPSR.C on taking an exception to FIQ mode, and copied to CPSR.C on executing 
an exception return operation in FIQ mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to FIQ mode, and copied to CPSR.V on 
executing an exception return operation in FIQ mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to FIQ mode, and copied to CPSR.Q on 
executing an exception return operation in FIQ mode. 

IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation 
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are: 
1 Exception taken from AArch32. 
M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]   Mode
0b0000   User
0b0001   FIQ
0b0010   IRQ
0b0011   Supervisor
0b0110   Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111   Abort
0b1011   Undefined
0b1111   System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped
registers and translation table entries on page K1-5477.


Accessing the SPSR_fiq: 


To access the SPSR_fiq: 


MRS <Rd>, SPSR_fiq ; Read SPSR_fiq into Rd
MSR SPSR_fiq, <Rd> ; Write Rd to SPSR_fiq

Register access is encoded as follows:

M1  M     R
0   1110  1

G6.2.121 SPSR_hyp, Saved Program Status Register (Hyp mode)
The SPSR_hyp characteristics are: 


Purpose 
Holds the saved process state when an exception is taken to Hyp mode. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, Mon)





7 - RW RW 





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 


as follows: 





EL0  EL1  EL2 (NS)








Using MRS (banked register) and MSR (banked register) instructions, this register is only 
accessible from Monitor mode. In Hyp mode, this register is accessible as the current SPSR. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register SPSR_hyp is architecturally mapped to AArch64 System register 
SPSR_EL2. 


Attributes 


SPSR_hyp is a 32-bit register. 


Field descriptions 


The SPSR_hyp bit assignments are: 


Bits   31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field  N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Hyp mode, and copied to CPSR.N on 
executing an exception return operation in Hyp mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Hyp mode, and copied to CPSR.Z on executing 
an exception return operation in Hyp mode. 
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C, bit [29] 
Set to the value of CPSR.C on taking an exception to Hyp mode, and copied to CPSR.C on 
executing an exception return operation in Hyp mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Hyp mode, and copied to CPSR.V on 
executing an exception return operation in Hyp mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to Hyp mode, and copied to CPSR.Q on 
executing an exception return operation in Hyp mode. 

IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
  condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
  conditionally executed, by the position of the least significant 1 in this field. It also encodes
  the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 
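
As an illustration of how the two parts of the IT field fit together, the following Python sketch reassembles IT[7:0] from a saved SPSR value and splits it as described above. The function names are hypothetical. The mapping from the least significant 1 to a count of remaining instructions is an assumption, based on an IT block holding at most four instructions.

```python
def it_state(spsr: int) -> int:
    """Reassemble IT[7:0] from SPSR bits [15:10] (IT[7:2]) and [26:25] (IT[1:0])."""
    it_7_2 = (spsr >> 10) & 0x3F
    it_1_0 = (spsr >> 25) & 0x3
    return (it_7_2 << 2) | it_1_0


def it_base_condition(it: int) -> int:
    """Return IT[7:5], the top 3 bits of the first condition code."""
    return (it >> 5) & 0x7


def it_remaining(it: int) -> int:
    """Instructions still to execute in the IT block (0 when no block is active).

    Assumption: a block holds at most four instructions, so the least
    significant 1 of IT[3:0] sits at bit 0 when four instructions remain
    and at bit 3 when one remains.
    """
    low = it & 0xF
    if low == 0:
        return 0
    lsb_index = (low & -low).bit_length() - 1
    return 4 - lsb_index
```

A saved value with all IT bits clear gives `it_remaining(...) == 0`, which matches the statement that the field is 0b00000000 when no IT block is active.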





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0]


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0111  Abort
0b1010  Hyp
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 
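
The field positions listed above can be collected into a small lookup table. The following Python sketch is illustrative only: the names `SPSR_FIELDS` and `decode_spsr` are hypothetical, and the IT field is left out because it is split across two bit ranges.

```python
# Bit positions taken from the field descriptions above: name -> (lsb, width).
SPSR_FIELDS = {
    "N": (31, 1), "Z": (30, 1), "C": (29, 1), "V": (28, 1), "Q": (27, 1),
    "IL": (20, 1), "GE": (16, 4), "E": (9, 1),
    "A": (8, 1), "I": (7, 1), "F": (6, 1), "T": (5, 1),
    "M4": (4, 1), "M": (0, 4),
}


def decode_spsr(value: int) -> dict:
    """Split a 32-bit SPSR value into its named fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in SPSR_FIELDS.items()}
```

For example, `decode_spsr(0x600001D3)` reports Z and C set, the A, I and F masks set, T clear (A32 state), and M[3:0] equal to 0b0011 (Supervisor).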


Accessing the SPSR_hyp: 


To access the SPSR_hyp: 


MRS <Rd>, SPSR_hyp ; Read SPSR_hyp into Rd 
MSR SPSR_hyp, <Rd> ; Write Rd to SPSR_hyp 
Register access is encoded as follows:

R  M     M1
1  1110  1
G6.2.122 SPSR_irq, Saved Program Status Register (IRQ mode)
The SPSR_irq characteristics are: 


Purpose 
Holds the saved process state when an exception is taken to IRQ mode. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS,!IRQ)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,!IRQ)
-        -       RW            RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(!IRQ)  EL2(NS)
-    RW         RW





Using MRS (banked register) and MSR (banked register) instructions, at PL1 this register is only 
accessible from PE modes other than IRQ mode. In IRQ mode, it is accessible as the current SPSR. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register SPSR_irq is architecturally mapped to AArch64 System register 
SPSR_irq. 


Attributes 
SPSR_irq is a 32-bit register. 


Field descriptions 


The SPSR_irq bit assignments are: 


Bits:  31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field: N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to IRQ mode, and copied to CPSR.N on 
executing an exception return operation in IRQ mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to IRQ mode, and copied to CPSR.Z on executing 
an exception return operation in IRQ mode. 
C, bit [29]
Set to the value of CPSR.C on taking an exception to IRQ mode, and copied to CPSR.C on 
executing an exception return operation in IRQ mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to IRQ mode, and copied to CPSR.V on 
executing an exception return operation in IRQ mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to IRQ mode, and copied to CPSR.Q on 
executing an exception return operation in IRQ mode. 

IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
  condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
  conditionally executed, by the position of the least significant 1 in this field. It also encodes
  the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
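
The effect of the E bit on a data access can be pictured with a short Python sketch. It models only the byte order applied to a 4-byte load and nothing else about PE behavior. The function name `load_word` is hypothetical.

```python
def load_word(data: bytes, e_bit: int) -> int:
    """Interpret 4 bytes of memory as a word.

    e_bit == 0 selects little-endian operation, e_bit == 1 big-endian.
    """
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(data, "big" if e_bit else "little")
```

With the bytes 0x78, 0x56, 0x34, 0x12 at ascending addresses, the sketch yields 0x12345678 when E is 0 and 0x78563412 when E is 1.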
A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0110  Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111  Abort
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 
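
A minimal sketch of how software might name the mode recorded in M[3:0] of SPSR_irq, using only the encodings listed in the table above. The names `SPSR_IRQ_MODES` and `mode_name` are hypothetical, and every value not listed is reported as reserved.

```python
# M[3:0] encodings as listed in the table above for SPSR_irq.
SPSR_IRQ_MODES = {
    0b0000: "User",
    0b0001: "FIQ",
    0b0010: "IRQ",
    0b0011: "Supervisor",
    0b0110: "Monitor",
    0b0111: "Abort",
    0b1011: "Undefined",
    0b1111: "System",
}


def mode_name(spsr: int) -> str:
    """Return the mode named by M[3:0], or 'reserved' for any other value."""
    return SPSR_IRQ_MODES.get(spsr & 0xF, "reserved")
```

Reporting unlisted values explicitly, rather than guessing, reflects the statement that programming a Reserved value gives CONSTRAINED UNPREDICTABLE behavior.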


Accessing the SPSR_irq: 
To access the SPSR_irq: 


MRS <Rd>, SPSR_irq ; Read SPSR_irq into Rd 
MSR SPSR_irq, <Rd> ; Write Rd to SPSR_irq
Register access is encoded as follows:

R  M     M1
1  0000  1
G6.2.123 SPSR_mon, Saved Program Status Register (Monitor mode)
The SPSR_mon characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to Monitor mode. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,!Mon)
-        -       -        -        -              RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    -

Using MRS (banked register) and MSR (banked register) instructions, this register is only
accessible from EL3 modes other than Monitor mode. In Monitor mode, it is accessible as the
current SPSR.


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
This register is only accessible in Secure state. 


AArch32 System register SPSR_mon can be mapped to AArch64 System register SPSR_EL3, but 
this is not architecturally mandated. 


Attributes 
SPSR_mon is a 32-bit register. 


Field descriptions 


The SPSR_mon bit assignments are: 


Bits:  31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field: N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Monitor mode, and copied to CPSR.N on 
executing an exception return operation in Monitor mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Monitor mode, and copied to CPSR.Z on 
executing an exception return operation in Monitor mode. 
C, bit [29]
Set to the value of CPSR.C on taking an exception to Monitor mode, and copied to CPSR.C on 
executing an exception return operation in Monitor mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Monitor mode, and copied to CPSR.V on 
executing an exception return operation in Monitor mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to Monitor mode, and copied to CPSR.Q on 
executing an exception return operation in Monitor mode. 


IT[1:0], bits [26:25] 

IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 
J, bit [24] 

RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
  condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
  conditionally executed, by the position of the least significant 1 in this field. It also encodes
  the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.

A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.

T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.

M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.

M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0110  Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111  Abort
0b1010  Hyp
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 


Accessing the SPSR_mon: 
To access the SPSR_mon: 


MRS <Rd>, SPSR_mon ; Read SPSR_mon into Rd 
MSR SPSR_mon, <Rd> ; Write Rd to SPSR_mon


Register access is encoded as follows: 


R  M     M1
1  1100  1
G6.2.124 SPSR_svc, Saved Program Status Register (Supervisor mode)


The SPSR_svc characteristics are:


Purpose 


Holds the saved process state when an exception is taken to Supervisor mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS,!SVC)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,!SVC)
-        -       RW            RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(!SVC)  EL2(NS)
-    RW         RW

Using MRS (banked register) and MSR (banked register) instructions, at PL1 this register is only 
accessible from PE modes other than Supervisor mode. In Supervisor mode, it is accessible as the 
current SPSR. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register SPSR_svc is architecturally mapped to AArch64 System register 
SPSR_EL1. 
Attributes 


SPSR_svc is a 32-bit register.


Field descriptions 


The SPSR_svc bit assignments are: 


Bits:  31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field: N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Supervisor mode, and copied to CPSR.N on 
executing an exception return operation in Supervisor mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Supervisor mode, and copied to CPSR.Z on 
executing an exception return operation in Supervisor mode. 
C, bit [29]
Set to the value of CPSR.C on taking an exception to Supervisor mode, and copied to CPSR.C on 
executing an exception return operation in Supervisor mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Supervisor mode, and copied to CPSR.V on 
executing an exception return operation in Supervisor mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to Supervisor mode, and copied to CPSR.Q on 
executing an exception return operation in Supervisor mode. 


IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
  condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
  conditionally executed, by the position of the least significant 1 in this field. It also encodes
  the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0110  Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111  Abort
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 


Accessing the SPSR_svc:
To access the SPSR_svc:


MRS <Rd>, SPSR_svc ; Read SPSR_svc into Rd 
MSR SPSR_svc, <Rd> ; Write Rd to SPSR_svc 
Register access is encoded as follows:

R  M     M1
1  0010  1
G6.2.125 SPSR_und, Saved Program Status Register (Undefined mode)


The SPSR_und characteristics are: 


Purpose 


Holds the saved process state when an exception is taken to Undefined mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS,!UND)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,!UND)
-        -       RW            RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1(!UND)  EL2(NS)
-    RW         RW

Using MRS (banked register) and MSR (banked register) instructions, at PL1 this register is only 
accessible from PE modes other than Undefined mode. In Undefined mode, it is accessible as the 
current SPSR. 
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register SPSR_und is architecturally mapped to AArch64 System register 
SPSR_und. 
Attributes 


SPSR_und is a 32-bit register. 


Field descriptions 


The SPSR_und bit assignments are: 


Bits:  31 30 29 28 27 26:25   24 23:21 20 19:16 15:10   9 8 7 6 5 4    3:0
Field: N  Z  C  V  Q  IT[1:0] J  RES0  IL GE    IT[7:2] E A I F T M[4] M[3:0]





N, bit [31] 
Set to the value of CPSR.N on taking an exception to Undefined mode, and copied to CPSR.N on 
executing an exception return operation in Undefined mode. 
Z, bit [30] 
Set to the value of CPSR.Z on taking an exception to Undefined mode, and copied to CPSR.Z on 
executing an exception return operation in Undefined mode. 
C, bit [29]
Set to the value of CPSR.C on taking an exception to Undefined mode, and copied to CPSR.C on 
executing an exception return operation in Undefined mode. 

V, bit [28] 
Set to the value of CPSR.V on taking an exception to Undefined mode, and copied to CPSR.V on 
executing an exception return operation in Undefined mode. 

Q, bit [27] 


Set to the value of CPSR.Q on taking an exception to Undefined mode, and copied to CPSR.Q on 
executing an exception return operation in Undefined mode. 

IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.


In previous versions of the architecture, the {J, T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:21]


Reserved, RES0.


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before the exception was 
taken. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


• IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the
  condition code specified by the first condition field of the IT instruction.

• IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be
  conditionally executed, by the position of the least significant 1 in this field. It also encodes
  the value of the least significant bit of the condition code for each instruction in the block.


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
A, bit [8]
SError interrupt mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
I, bit [7]
IRQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5]
T32 Instruction set state bit. Determines the AArch32 instruction set state that the exception was
taken from. Possible values of this bit are:
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that the exception was taken from. Possible values of this bit are:
1 Exception taken from AArch32.
M[3:0], bits [3:0] 


AArch32 mode that an exception was taken from. The possible values are: 





M[3:0]  Mode
0b0000  User
0b0001  FIQ
0b0010  IRQ
0b0011  Supervisor
0b0110  Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111  Abort
0b1011  Undefined
0b1111  System

Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 


Accessing the SPSR_und: 


To access the SPSR_und: 


MRS <Rd>, SPSR_und ; Read SPSR_und into Rd 
MSR SPSR_und, <Rd> ; Write Rd to SPSR_und 
Register access is encoded as follows:

R  M     M1
1  0110  1
G6.2.126 TCMTR, TCM Type Register
The TCMTR characteristics are: 


Purpose 
Provides information about the implementation of the TCM. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2(NS)
-    RO   RO





Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TID1==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.

• If HCR_EL2.TID1==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.

• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.

• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 


TCMTR is a 32-bit register. 


Field descriptions 


The TCMTR bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 
Accessing the TCMTR:
To access the TCMTR:
MRC p15,0,<Rt>,c0,c0,2 ; Read TCMTR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0000  010
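
The binary fields in the encoding table correspond directly to the operands of the MRC instruction shown above. The following Python sketch makes that correspondence explicit. The helper name `sysreg_operands` is hypothetical, and the sketch formats the operand text only; it does not assemble an instruction.

```python
def sysreg_operands(coproc: str, opc1: str, crn: str, crm: str, opc2: str) -> str:
    """Turn the binary encoding fields into the MRC/MCR operand list."""
    return "p{},{},<Rt>,c{},c{},{}".format(
        int(coproc, 2), int(opc1, 2), int(crn, 2), int(crm, 2), int(opc2, 2))
```

For the TCMTR row, `"MRC " + sysreg_operands("1111", "000", "0000", "0000", "010")` gives `MRC p15,0,<Rt>,c0,c0,2`.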
G6.2.127 TLBIALL, TLB Invalidate All
The TLBIALL characteristics are: 


Purpose 


Invalidate all TLB entries for the PL1&0 translation regime, subject to the Privilege level and 
Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIALL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
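
The four trap conditions above can be summarized in a short Python sketch. It is a simplification, not the architectural pseudocode: it ignores the exception prioritization rules, and it assumes that the HCR and HSTR controls apply when EL2 is using AArch32 and the HCR_EL2 and HSTR_EL2 controls when EL2 is using AArch64. The function name and parameters are hypothetical.

```python
from typing import Optional


def tlbiall_trap_target(el2_is_aarch32: bool, ttlb: int, t8: int) -> Optional[str]:
    """Where Non-secure EL1 execution of TLBIALL is trapped, or None if it is not.

    ttlb and t8 stand for HCR.TTLB and HSTR.T8 when EL2 is using AArch32
    (an assumption of this sketch), and for HCR_EL2.TTLB and HSTR_EL2.T8
    when EL2 is using AArch64.
    """
    if not (ttlb or t8):
        return None
    return "Hyp mode" if el2_is_aarch32 else "EL2"
```

Either control bit on its own is enough to cause the trap, which is why the sketch combines them with a logical OR.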


Configurations 


There are no configuration notes. 


Attributes 
TLBIALL is a 32-bit System instruction. 


Field descriptions 


TLBIALL ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 


Executing the TLBIALL instruction: 
The TLBIALL instruction is executed as: 


MCR p15,0,<Rt>,c8,c7,0 ; TLBIALL operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   1000  0111  000
G6.2.128 TLBIALLH, TLB Invalidate All, Hyp mode
The TLBIALLH characteristics are: 
Purpose 


Invalidate all TLB entries for the PL2 translation regime. 


For details of the scope of this instruction see TLBIALLH. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,Mon)  EL3(SCR.NS=0,!Mon)
-        -       -        WO       WO             WO                 WO-UNPREDICTABLE





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    WO





If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIALLH is a 32-bit System instruction. 


Field descriptions 


TLBIALLH ignores the value in the register specified by the instruction. Software does not have to write a value to 
the register before issuing this instruction. 


Executing the TLBIALLH instruction: 
The TLBIALLH instruction is executed as: 


MCR p15,4,<Rt>,c8,c7,0 ; TLBIALLH operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1000  0111  000
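
As an illustration of how the fields above combine, the following is a minimal sketch, in C, that assembles the 32-bit A32 encoding of an MCR instruction from its coproc, opc1, CRn, CRm, and opc2 fields. The function name a32_mcr_encoding and its parameter names are assumptions made for this example and are not defined by the architecture. The sketch assumes the standard A32 MCR field positions, with the condition field in bits [31:28] and Rt in bits [15:12].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: assemble the A32 encoding of an MCR instruction.
 * Assumed field positions (A32 MCR):
 *   cond[31:28] 1110[27:24] opc1[23:21] 0[20] CRn[19:16] Rt[15:12]
 *   coproc[11:8] opc2[7:5] 1[4] CRm[3:0]
 */
static uint32_t a32_mcr_encoding(uint32_t cond, uint32_t coproc, uint32_t opc1,
                                 uint32_t rt, uint32_t crn, uint32_t crm,
                                 uint32_t opc2)
{
    /* Each argument must fit in the width of its field. */
    assert(cond <= 0xFu && coproc <= 0xFu && opc1 <= 0x7u);
    assert(rt <= 0xFu && crn <= 0xFu && crm <= 0xFu && opc2 <= 0x7u);

    return (cond << 28) | (0xEu << 24) | (opc1 << 21) | (crn << 16) |
           (rt << 12) | (coproc << 8) | (opc2 << 5) | (1u << 4) | crm;
}
```

With the always condition (0b1110) and Rt set to R0, passing coproc=0b1111, opc1=0b100, CRn=0b1000, CRm=0b0111, and opc2=0b000 gives the encoding of MCR p15,4,R0,c8,c7,0, the TLBIALLH operation.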
G6.2.129 TLBIALLHIS, TLB Invalidate All, Hyp mode, Inner Shareable
The TLBIALLHIS characteristics are: 


Purpose 


Invalidate all TLB entries for the PL2 translation regime, on all PEs in the same Inner Shareable 
domain. 


For details of the scope of this instruction see TLBIALLH. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,Mon)  EL3(SCR.NS=0,!Mon)
-        -       -        WO       WO             WO                 WO-UNPREDICTABLE





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    WO





If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIALLHIS is a 32-bit System instruction. 


Field descriptions 


TLBIALLHIS ignores the value in the register specified by the instruction. Software does not have to write a value 
to the register before issuing this instruction. 


Executing the TLBIALLHIS instruction: 
The TLBIALLHIS instruction is executed as: 


MCR p15,4,<Rt>,c8,c3,0 ; TLBIALLHIS operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1000  0011  000
G6.2.130 TLBIALLIS, TLB Invalidate All, Inner Shareable
The TLBIALLIS characteristics are: 


Purpose 


Invalidate all TLB entries for the PL1&0 translation regime, on all PEs in the same Inner Shareable 
domain, subject to the Privilege level and Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIALL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
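
The trap conditions listed above can be summarized as a single predicate. The following C sketch is illustrative only: the structure tlbi_trap_state and the function tlbiallis_trapped_to_el2 are hypothetical names introduced for this example, and the sketch does not model exception prioritization or which of the AArch32 or AArch64 control registers applies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state that the trap rules refer to. */
struct tlbi_trap_state {
    bool non_secure;  /* PE is executing in Non-secure state */
    unsigned el;      /* current Exception level, 0 to 3 */
    bool ttlb;        /* HCR.TTLB, or HCR_EL2.TTLB if EL2 is using AArch64 */
    bool t8;          /* HSTR.T8, or HSTR_EL2.T8 if EL2 is using AArch64 */
};

/* Returns true if, under the listed rules, execution of TLBIALLIS is
 * trapped to Hyp mode or EL2. Prioritization is not modeled. */
static bool tlbiallis_trapped_to_el2(const struct tlbi_trap_state *s)
{
    assert(s->el <= 3u);
    return s->non_secure && s->el == 1u && (s->ttlb || s->t8);
}
```

Either control bit is sufficient to cause the trap, and neither has any effect on execution at EL2, at EL3, or in Secure state.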


Configurations 


There are no configuration notes. 


Attributes 
TLBIALLIS is a 32-bit System instruction. 


Field descriptions 


TLBIALLIS ignores the value in the register specified by the instruction. Software does not have to write a value 
to the register before issuing this instruction. 


Executing the TLBIALLIS instruction: 
The TLBIALLIS instruction is executed as: 


MCR p15,0,<Rt>,c8,c3,0 ; TLBIALLIS operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   1000  0011  000
G6.2.131 TLBIALLNSNH, TLB Invalidate All, Non-Secure Non-Hyp
The TLBIALLNSNH characteristics are: 
Purpose 


Invalidate all TLB entries for the Non-secure PL1&0 translation regime. 


For details of the scope of this instruction see TLBIALLNSNH. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,Mon)  EL3(SCR.NS=0,!Mon)
-        -       -        WO       WO             WO                 WO-UNPREDICTABLE





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    WO





If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIALLNSNH is a 32-bit System instruction. 


Field descriptions 


TLBIALLNSNH ignores the value in the register specified by the instruction. Software does not have to write a 
value to the register before issuing this instruction. 


Executing the TLBIALLNSNH instruction: 
The TLBIALLNSNH instruction is executed as: 


MCR p15,4,<Rt>,c8,c7,4 ; TLBIALLNSNH operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1000  0111  100
G6.2.132 TLBIALLNSNHIS, TLB Invalidate All, Non-Secure Non-Hyp, Inner Shareable
The TLBIALLNSNHIS characteristics are: 


Purpose 


Invalidate all TLB entries for the Non-secure PL1&0 translation regime, on all PEs in the same 
Inner Shareable domain. 


For details of the scope of this instruction see TLBIALLNSNH. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0,Mon)  EL3(SCR.NS=0,!Mon)
-        -       -        WO       WO             WO                 WO-UNPREDICTABLE





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    WO





If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIALLNSNHIS is a 32-bit System instruction. 


Field descriptions 

TLBIALLNSNHIS ignores the value in the register specified by the instruction. Software does not have to write a 
value to the register before issuing this instruction. 

Executing the TLBIALLNSNHIS instruction: 

The TLBIALLNSNHIS instruction is executed as: 


MCR p15,4,<Rt>,c8,c3,4 ; TLBIALLNSNHIS operation, ignoring the value in Rt 
The instruction is encoded in the System instruction encoding space as follows:

coproc  opc1  CRn   CRm   opc2
1111    100   1000  0011  100
G6.2.133 TLBIASID, TLB Invalidate by ASID match
The TLBIASID characteristics are: 
Purpose 
Invalidate TLB entries for stage 1 of the PL1&0 translation regime that match the given ASID, 
subject to the Security state at which the instruction is executed. 
If this operation is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 
For details of the scope of this instruction see TLBIASID. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
There are no configuration notes. 
Attributes 
TLBIASID is a 32-bit System instruction. 
Field descriptions 
The TLBIASID input value bit assignments are: 
31                 8 7        0
        RES0           ASID
Bits [31:8]
Reserved, RES0.


ASID, bits [7:0] 
ASID value to match. Any TLB entries for non-global pages that match the ASID values will be 
affected by this operation. 


Executing the TLBIASID instruction: 
The TLBIASID instruction is executed as: 
MCR p15,0,<Rt>,c8,c7,2 ; TLBIASID operation 
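
As a minimal sketch of the bit assignments described above, the following C function builds the value that software would place in Rt before executing the TLBIASID instruction. The name tlbiasid_operand is hypothetical and is used only for this example. The sketch assumes an 8-bit ASID and writes the RES0 bits [31:8] as zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the TLBIASID input value.
 * ASID occupies bits [7:0]; bits [31:8] are RES0 and are written as 0. */
static uint32_t tlbiasid_operand(uint32_t asid)
{
    assert(asid <= 0xFFu);  /* the ASID field is 8 bits wide */
    return asid & 0xFFu;
}
```

The same value layout applies to TLBIASIDIS, which differs only in the scope of the invalidation.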


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0111  010
G6.2.134 TLBIASIDIS, TLB Invalidate by ASID match, Inner Shareable
The TLBIASIDIS characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime that match the given ASID, on 
all PEs in the same Inner Shareable domain, subject to the Security state at which the instruction is 
executed. 


If this operation is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIASID. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIASIDIS is a 32-bit System instruction. 


Field descriptions 


The TLBIASIDIS input value bit assignments are: 


31                 8 7        0
        RES0           ASID







Bits [31:8]
Reserved, RES0.


ASID, bits [7:0]
ASID value to match. Any TLB entries for non-global pages that match the ASID values will be
affected by this operation.


Executing the TLBIASIDIS instruction: 
The TLBIASIDIS instruction is executed as: 
MCR p15,0,<Rt>,c8,c3,2 ; TLBIASIDIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0011  010
G6.2.135 TLBIIPAS2, TLB Invalidate by Intermediate Physical Address, Stage 2
The TLBIIPAS2 characteristics are:
Purpose 
Invalidate TLB entries for stage 2 of the Non-secure PL1&0 translation regime that match the given 
IPA and the current VMID. 
For details of the scope of this instruction see TLBIIPAS2. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    WO
This instruction is a NOP when executed in Monitor mode with SCR.NS==0, and is 
UNPREDICTABLE when executed in any AArch32 Secure privileged mode other than Monitor mode. 
This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
This System instruction is not implemented in architecture versions before ARMv8.
Attributes 
TLBIIPAS2 is a 32-bit System instruction. 
Field descriptions 
The TLBIIPAS2 input value bit assignments are: 
31   28 27                   0
  RES0        IPA[39:12]
Bits [31:28]
Reserved, RES0.


IPA[39:12], bits [27:0] 
Bits[39:12] of the intermediate physical address to match. 


Executing the TLBIIPAS2 instruction: 
The TLBIIPAS2 instruction is executed as: 
MCR p15,4,<Rt>,c8,c4,1 ; TLBIIPAS2 operation 
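
The input value packs bits [39:12] of a 40-bit IPA into a 32-bit register. The following C sketch shows that packing under the bit assignments given above. The name tlbiipas2_operand is hypothetical and is used only for this example. The low 12 bits of the address are discarded, and the RES0 bits [31:28] are written as zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the TLBIIPAS2 input value from a 40-bit IPA.
 * IPA[39:12] occupies bits [27:0]; bits [31:28] are RES0 and written as 0. */
static uint32_t tlbiipas2_operand(uint64_t ipa)
{
    assert((ipa >> 40) == 0);  /* the IPA must fit in 40 bits */
    return (uint32_t)((ipa >> 12) & 0x0FFFFFFFu);
}
```

The same value layout applies to TLBIIPAS2IS, TLBIIPAS2L, and TLBIIPAS2LIS.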


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0100  001
G6.2.136 TLBIIPAS2IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Inner Shareable
The TLBIIPAS2IS characteristics are: 
Purpose 
Invalidate TLB entries for stage 2 of the Non-secure PL1&0 translation regime that match the given 
IPA and the current VMID, on all PEs in the same Inner Shareable domain. 
For details of the scope of this instruction see TLBIIPAS2. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    WO
This instruction is a NOP when executed in Monitor mode with SCR.NS==0, and is 
UNPREDICTABLE when executed in any AArch32 Secure privileged mode other than Monitor mode. 
This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
This System instruction is not implemented in architecture versions before ARMv8.
Attributes 
TLBIIPAS2IS is a 32-bit System instruction.
Field descriptions 
The TLBIIPAS2IS input value bit assignments are: 
31   28 27                   0
  RES0        IPA[39:12]
Bits [31:28]
Reserved, RES0.
IPA[39:12], bits [27:0]
Bits[39:12] of the intermediate physical address to match.


Executing the TLBIIPAS2IS instruction:
The TLBIIPAS2IS instruction is executed as:
MCR p15,4,<Rt>,c8,c0,1 ; TLBIIPAS2IS operation


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0000  001
G6.2.137 TLBIIPAS2L, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level
The TLBIIPAS2L characteristics are: 
Purpose 
Invalidate TLB entries for stage 2 of the Non-secure PL1&0 translation regime, for the last level of 
translation, that match the given IPA and the current VMID. 
For details of the scope of this instruction see TLBIIPAS2. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    -    WO
This instruction is a NOP when executed in Monitor mode with SCR.NS==0, and is 
UNPREDICTABLE when executed in any AArch32 Secure privileged mode other than Monitor mode. 
This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
This System instruction is not implemented in architecture versions before ARMv8.
Attributes 
TLBIIPAS2L is a 32-bit System instruction.
Field descriptions 
The TLBIIPAS2L input value bit assignments are: 
31   28 27                   0
  RES0        IPA[39:12]
Bits [31:28]
Reserved, RES0.
IPA[39:12], bits [27:0]
Bits[39:12] of the intermediate physical address to match. 


Executing the TLBIIPAS2L instruction: 
The TLBIIPAS2L instruction is executed as: 
MCR p15,4,<Rt>,c8,c4,5 ; TLBIIPAS2L operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0100  101
G6.2.138 TLBIIPAS2LIS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, Inner Shareable


The TLBIIPAS2LIS characteristics are: 


Purpose 


Invalidate TLB entries for stage 2 of the Non-secure PL1&0 translation regime, for the last level of
translation, that match the given IPA and the current VMID, on all PEs in the same Inner Shareable
domain.


For details of the scope of this instruction see TLBIIPAS2L. 


Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    WO





This instruction is a NOP when executed in Monitor mode with SCR.NS==0, and is 
UNPREDICTABLE when executed in any AArch32 Secure privileged mode other than Monitor mode. 


This instruction must apply to structures that contain only stage 2 translation information, but does 
not need to apply to structures that contain combined stage 1 and stage 2 translation information. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


This System instruction is not implemented in architecture versions before ARMv8.


Attributes 
TLBIIPAS2LIS is a 32-bit System instruction.


Field descriptions 


The TLBIIPAS2LIS input value bit assignments are: 


31   28 27                   0
  RES0        IPA[39:12]
Bits [31:28]
Reserved, RES0.


IPA[39:12], bits [27:0] 
Bits[39:12] of the intermediate physical address to match. 


Executing the TLBIIPAS2LIS instruction: 
The TLBIIPAS2LIS instruction is executed as: 
MCR p15,4,<Rt>,c8,c0,5 ; TLBIIPAS2LIS operation


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0000  101
G6.2.139 TLBIMVA, TLB Invalidate by VA
The TLBIMVA characteristics are: 
Purpose 
Invalidate PL1&0 regime stage 1 and 2 TLB entries for the given VA and ASID, the current VMID, 
and the current Security state. 
If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 
For details of the scope of this instruction see TLBIMVA. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
There are no configuration notes. 
Attributes 
TLBIMVA is a 32-bit System instruction. 
Field descriptions 
The TLBIMVA input value bit assignments are: 
31 12 11 8 7 0 
VA, bits [31:12]
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.


Bits [11:8]
Reserved, RES0.


ASID, bits [7:0] 


ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by 
this operation. 


Global TLB entries that match the VA value will be affected by this operation, regardless of the 
value of the ASID field. 

Executing the TLBIMVA instruction: 

The TLBIMVA instruction is executed as: 

MCR p15,0,<Rt>,c8,c7,1 ; TLBIMVA operation 
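
The TLBIMVA input value combines a page-aligned virtual address with an ASID. The following C sketch shows one way to form it under the field descriptions above. The name tlbimva_operand is hypothetical and is used only for this example. The sketch assumes a 32-bit VA and an 8-bit ASID, and writes the RES0 bits [11:8] as zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the TLBIMVA input value.
 * VA[31:12] occupies bits [31:12], bits [11:8] are RES0 and written as 0,
 * and ASID occupies bits [7:0]. */
static uint32_t tlbimva_operand(uint32_t va, uint32_t asid)
{
    assert(asid <= 0xFFu);  /* the ASID field is 8 bits wide */
    return (va & 0xFFFFF000u) | (asid & 0xFFu);
}
```

Bits [11:0] of the virtual address do not take part in the match, so the helper discards them before inserting the ASID.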


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0111  001
G6.2.140 TLBIMVAA, TLB Invalidate by VA, All ASID
The TLBIMVAA characteristics are: 
Purpose 
Invalidate TLB entries for stage 1 of the PL1&0 translation regime, for any ASID, that match the 
given VA, subject to the Security state at which the instruction is executed. 
If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 
For details of the scope of this instruction see TLBIMVAA. 
Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:
EL0  EL1  EL2(NS)
-    WO   WO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
Configurations 
There are no configuration notes. 
Attributes 
TLBIMVAA is a 32-bit System instruction. 
Field descriptions 
The TLBIMVAA input value bit assignments are: 
31 12 11 0 
VA, bits [31:12]
Virtual address to match. Any unlocked TLB entries that match the VA will be affected by this
operation, regardless of the ASID.


Bits [11:0]
Reserved, RES0.


Executing the TLBIMVAA instruction: 
The TLBIMVAA instruction is executed as: 
MCR p15,0,<Rt>,c8,c7,3 ; TLBIMVAA operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0111  011
G6.2.141 TLBIMVAAIS, TLB Invalidate by VA, All ASID, Inner Shareable
The TLBIMVAAIS characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime, for any ASID, that match the 
given VA, on all PEs in the same Inner Shareable domain, subject to the Security state at which the 
instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVAA. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIMVAAIS is a 32-bit System instruction. 


Field descriptions 


The TLBIMVAAIS input value bit assignments are: 


31 12 11 0 
VA, bits [31:12]
Virtual address to match. Any unlocked TLB entries that match the VA will be affected by this
operation, regardless of the ASID.


Bits [11:0]
Reserved, RES0.


Executing the TLBIMVAAIS instruction: 
The TLBIMVAAIS instruction is executed as: 
MCR p15,0,<Rt>,c8,c3,3 ; TLBIMVAAIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0011  011
G6.2.142 TLBIMVAAL, TLB Invalidate by VA, All ASID, Last level


The TLBIMVAAL characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime, for the last level of translation, 
for any ASID, that match the given VA, subject to the Security state at which the instruction is 
executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVAAL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
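The trap conditions listed above can be modelled roughly as follows for the case where EL2 is using AArch32. This is a sketch under stated assumptions: the function name is hypothetical, the bit positions of HCR.TTLB (bit [25]) and HSTR.T8 (bit [8]) come from the HCR and HSTR register descriptions rather than from this section, and exception prioritization is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_TTLB (UINT32_C(1) << 25)  /* assumed: HCR.TTLB is bit [25] */
#define HSTR_T8  (UINT32_C(1) << 8)   /* assumed: HSTR.T8 is bit [8]   */

/* Hypothetical model: true when execution of a PL1&0 TLB maintenance
 * instruction (CRn == c8) is trapped to Hyp mode. */
static inline bool tlbi_el1_trapped_to_hyp(bool non_secure, unsigned el,
                                           uint32_t hcr, uint32_t hstr)
{
    if (!non_secure || el != 1)
        return false;  /* the listed traps apply only to Non-secure EL1 */
    return (hcr & HCR_TTLB) != 0 || (hstr & HSTR_T8) != 0;
}
```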


Configurations 


This System instruction is not implemented in architecture versions before ARMv8.


Attributes 
TLBIMVAAL is a 32-bit System instruction. 


Field descriptions 


The TLBIMVAAL input value bit assignments are: 


31            12 11           0
       VA            RES0

VA, bits [31:12]
Virtual address to match. Any unlocked TLB entries that match the VA will be affected by this
operation, regardless of the ASID.

Bits [11:0]
Reserved, RES0.


Executing the TLBIMVAAL instruction: 
The TLBIMVAAL instruction is executed as: 
MCR p15,0,<Rt>,c8,c7,7 ; TLBIMVAAL operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0111  111

G6.2.143 TLBIMVAALIS, TLB Invalidate by VA, All ASID, Last level, Inner Shareable
The TLBIMVAALIS characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime, for the last level of translation, 
for any ASID, that match the given VA, on all PEs in the same Inner Shareable domain, subject to 
the Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVAAL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


This System instruction is not implemented in architecture versions before ARMv8.


Attributes 
TLBIMVAALIS is a 32-bit System instruction. 


Field descriptions 


The TLBIMVAALIS input value bit assignments are: 


31            12 11           0
       VA            RES0

VA, bits [31:12]
Virtual address to match. Any unlocked TLB entries that match the VA will be affected by this
operation, regardless of the ASID.

Bits [11:0]
Reserved, RES0.


Executing the TLBIMVAALIS instruction: 
The TLBIMVAALIS instruction is executed as: 
MCR p15,0,<Rt>,c8,c3,7 ; TLBIMVAALIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0011  111

G6.2.144 TLBIMVAH, TLB Invalidate by VA, Hyp mode

The TLBIMVAH characteristics are: 

Purpose 
Invalidate TLB entries for the Non-secure PL2 translation regime that match the given VA. 
For details of the scope of this instruction see TLBIMVAH. 

Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, Mon)  EL3(SCR.NS=0, !Mon)
-        -       -        WO       WO             WO                  UNPREDICTABLE
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2(NS)
-    -    WO

If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 

Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 
There are no configuration notes. 

Attributes 
TLBIMVAH is a 32-bit System instruction. 

Field descriptions 

The TLBIMVAH input value bit assignments are: 

31 12 11 0 

VA, bits [31:12] 

Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:0]
Reserved, RES0.


Executing the TLBIMVAH instruction: 
The TLBIMVAH instruction is executed as: 
MCR p15,4,<Rt>,c8,c7,1 ; TLBIMVAH operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0111  001

G6.2.145 TLBIMVAHIS, TLB Invalidate by VA, Hyp mode, Inner Shareable

The TLBIMVAHIS characteristics are: 

Purpose 
Invalidate TLB entries for the Non-secure PL2 translation regime that match the given VA, on all 
PEs in the same Inner Shareable domain. 
For details of the scope of this instruction see TLBIMVAH. 

Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, Mon)  EL3(SCR.NS=0, !Mon)
-        -       -        WO       WO             WO                  UNPREDICTABLE
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2(NS)
-    -    WO

If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 

Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 
There are no configuration notes. 

Attributes 
TLBIMVAHIS is a 32-bit System instruction. 

Field descriptions 

The TLBIMVAHIS input value bit assignments are: 

31 12 11 0 

VA, bits [31:12] 
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:0]
Reserved, RES0.


Executing the TLBIMVAHIS instruction: 
The TLBIMVAHIS instruction is executed as: 
MCR p15,4,<Rt>,c8,c3,1 ; TLBIMVAHIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0011  001

G6.2.146 TLBIMVAIS, TLB Invalidate by VA, Inner Shareable


The TLBIMVAIS characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime that match the given VA and 
ASID, on all PEs in the same Inner Shareable domain, subject to the Security state at which the 
instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVA. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


There are no configuration notes. 


Attributes 
TLBIMVAIS is a 32-bit System instruction. 


Field descriptions 


The TLBIMVAIS input value bit assignments are: 


31            12 11    8 7        0
       VA          RES0     ASID

VA, bits [31:12]
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:8]
Reserved, RES0.


ASID, bits [7:0] 


ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by 
this operation. 


Global TLB entries that match the VA value will be affected by this operation, regardless of the 
value of the ASID field. 
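The VA and ASID fields described above can be packed into the <Rt> value along these lines. This is a minimal sketch; the helper name tlbi_va_asid_operand is hypothetical and not defined by the architecture.

```c
#include <stdint.h>

/* Hypothetical helper: form the <Rt> value for a by-VA, by-ASID TLB
 * invalidate such as TLBIMVAIS. VA occupies bits [31:12], bits [11:8]
 * are RES0 and written as zero, and the 8-bit ASID occupies bits [7:0]. */
static inline uint32_t tlbi_va_asid_operand(uint32_t va, uint8_t asid)
{
    return (va & 0xFFFFF000u) | (uint32_t)asid;
}
```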

Executing the TLBIMVAIS instruction: 

The TLBIMVAIS instruction is executed as: 

MCR p15,0,<Rt>,c8,c3,1 ; TLBIMVAIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0011  001
G6.2.147 TLBIMVAL, TLB Invalidate by VA, Last level


The TLBIMVAL characteristics are: 


Purpose 


Invalidate TLB entries for the stage 1 PL1&0 translation regime, for the last level of translation, that 
match the given VA and ASID, subject to the Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVAL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


This System instruction is not implemented in architecture versions before ARMv8.


Attributes 
TLBIMVAL is a 32-bit System instruction. 


Field descriptions 


The TLBIMVAL input value bit assignments are: 


31            12 11    8 7        0
       VA          RES0     ASID

VA, bits [31:12]
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:8]
Reserved, RES0.


ASID, bits [7:0] 


ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by 
this operation. 


Global TLB entries that match the VA value will be affected by this operation, regardless of the 
value of the ASID field. 

Executing the TLBIMVAL instruction: 

The TLBIMVAL instruction is executed as: 

MCR p15,0,<Rt>,c8,c7,5 ; TLBIMVAL operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0111  101

G6.2.148 TLBIMVALH, TLB Invalidate by VA, Last level, Hyp mode

The TLBIMVALH characteristics are: 

Purpose 
Invalidate TLB entries for the Non-secure PL2 translation regime, for the last level of translation, 
that match the given VA. 
For details of the scope of this instruction see TLBIMVALH. 

Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, Mon)  EL3(SCR.NS=0, !Mon)
-        -       -        WO       WO             WO                  UNPREDICTABLE
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2(NS)
-    -    WO

If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 

Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 
This System instruction is not implemented in architecture versions before ARMv8. 

Attributes 
TLBIMVALH is a 32-bit System instruction. 

Field descriptions 

The TLBIMVALH input value bit assignments are: 

31 12 11 0 

VA, bits [31:12] 
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:0]
Reserved, RES0.


Executing the TLBIMVALH instruction: 
The TLBIMVALH instruction is executed as: 
MCR p15,4,<Rt>,c8,c7,5 ; TLBIMVALH operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0111  101

G6.2.149 TLBIMVALHIS, TLB Invalidate by VA, Last level, Hyp mode, Inner Shareable

The TLBIMVALHIS characteristics are: 

Purpose 
Invalidate TLB entries for the Non-secure PL2 translation regime, for the last level of translation, 
that match the given VA, on all PEs in the same Inner Shareable domain. 
For details of the scope of this instruction see TLBIMVALH. 

Usage constraints 
If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0, Mon)  EL3(SCR.NS=0, !Mon)
-        -       -        WO       WO             WO                  UNPREDICTABLE
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2(NS)
-    -    WO

If this operation is executed in a Secure privileged mode other than Monitor mode, then the behavior 
is CONSTRAINED UNPREDICTABLE. For more information see Hyp mode TLB maintenance 
instructions on page K1-5476. 

Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.

Configurations 
This System instruction is not implemented in architecture versions before ARMv8. 

Attributes 
TLBIMVALHIS is a 32-bit System instruction. 

Field descriptions 

The TLBIMVALHIS input value bit assignments are: 

31 12 11 0 

VA, bits [31:12] 
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:0]
Reserved, RES0.


Executing the TLBIMVALHIS instruction: 
The TLBIMVALHIS instruction is executed as: 
MCR p15,4,<Rt>,c8,c3,5 ; TLBIMVALHIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1000  0011  101

G6.2.150 TLBIMVALIS, TLB Invalidate by VA, Last level, Inner Shareable
The TLBIMVALIS characteristics are: 


Purpose 


Invalidate TLB entries for stage 1 of the PL1&0 translation regime, for the last level of translation, 
that match the given VA and ASID, on all PEs in the same Inner Shareable domain, subject to the 
Security state at which the instruction is executed. 


If this instruction is executed at Secure EL1 in AArch32 when EL3 is using AArch64, it only affects 
TLB entries related to the Secure EL1 translation regime. 


For details of the scope of this instruction see TLBIMVAL. 


Usage constraints 


If EL3 is implemented and is using AArch32, this instruction can be executed at the following 
exception levels: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:


• If HCR.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HCR_EL2.TTLB==1, Non-secure execution of this instruction at EL1 is trapped to EL2.
• If HSTR.T8==1, Non-secure execution of this instruction at EL1 is trapped to Hyp mode.
• If HSTR_EL2.T8==1, Non-secure execution of this instruction at EL1 is trapped to EL2.


Configurations 


This System instruction is not implemented in architecture versions before ARMv8. 


Attributes 
TLBIMVALIS is a 32-bit System instruction. 


Field descriptions 


The TLBIMVALIS input value bit assignments are: 


31            12 11    8 7        0
       VA          RES0     ASID

VA, bits [31:12]
Virtual address to match. Any TLB entries that match the ASID value and VA value will be affected
by this operation.

Bits [11:8]
Reserved, RES0.


ASID, bits [7:0] 


ASID value to match. Any TLB entries that match the ASID value and VA value will be affected by 
this operation. 


Global TLB entries that match the VA value will be affected by this operation, regardless of the 
value of the ASID field. 

Executing the TLBIMVALIS instruction: 

The TLBIMVALIS instruction is executed as: 

MCR p15,0,<Rt>,c8,c3,5 ; TLBIMVALIS operation 


The instruction is encoded in the System instruction encoding space as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1000  0011  101

G6.2.151 TLBTR, TLB Type Register
The TLBTR characteristics are: 
Purpose 
Provides information about the TLB implementation. The register must define whether the 
implementation provides separate instruction and data TLBs, or a unified TLB. Normally, the 
IMPLEMENTATION DEFINED information in this register includes the number of lockable entries in 
the TLB. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    RO   RO
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If HCR.TID1==1, Non-secure read accesses to this register from EL1 are trapped to Hyp mode.
• If HCR_EL2.TID1==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
• If HSTR.T0==1, Non-secure read accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T0==1, Non-secure read accesses to this register from EL1 are trapped to EL2.
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
Attributes 
TLBTR is a 32-bit register. 
Field descriptions 
The TLBTR bit assignments are: 
31                         1 0
   IMPLEMENTATION DEFINED   nU

IMPLEMENTATION DEFINED, bits [31:1]
IMPLEMENTATION DEFINED.

nU, bit [0]
Not Unified TLB. Indicates whether the implementation has a unified TLB:
0 Unified TLB.
1 Separate Instruction and Data TLBs.
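The field layout above can be decoded from a TLBTR value as in the following sketch. The function names are hypothetical; the meaning of bits [31:1] is IMPLEMENTATION DEFINED, so the second helper only extracts the raw field without interpreting it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: nU, bit [0]. Returns true for separate
 * Instruction and Data TLBs, false for a unified TLB. */
static inline bool tlbtr_has_separate_tlbs(uint32_t tlbtr)
{
    return (tlbtr & 1u) != 0;
}

/* Hypothetical helper: raw IMPLEMENTATION DEFINED field, bits [31:1]. */
static inline uint32_t tlbtr_impdef_field(uint32_t tlbtr)
{
    return tlbtr >> 1;
}
```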


Accessing the TLBTR: 
To access the TLBTR: 
MRC p15,0,<Rt>,c0,c0,3 ; Read TLBTR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0000  0000  011

G6.2.152 TPIDRPRW, PL1 Software Thread ID Register
The TPIDRPRW characteristics are: 


Purpose 
Provides a location where software executing at EL1 or higher can store thread identifying 
information that is not visible to software executing at EL0, for OS management purposes.
Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


TPIDRPRW(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





TPIDRPRW(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


TPIDRPRW is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T13==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T13==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register TPIDRPRW is architecturally mapped to AArch64 System register 
TPIDR_EL1[31:0]. 


The PE never updates this register. This means the register is always UNKNOWN on reset. 


Attributes 
TPIDRPRW is a 32-bit register. 


Field descriptions 


The TPIDRPRW bit assignments are: 







31 0 
Thread ID 
Bits [31:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDRPRW: 
To access the TPIDRPRW: 


MRC p15,0,<Rt>,c13,c0,4 ; Read TPIDRPRW into Rt
MCR p15,0,<Rt>,c13,c0,4 ; Write Rt to TPIDRPRW 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1101  0000  100

G6.2.153 TPIDRURO, PL0 Read-Only Software Thread ID Register
The TPIDRURO characteristics are: 


Purpose 
Provides a location where software executing at EL1 or higher can store thread identifying 
information that is visible to software executing at EL0, for OS management purposes.
Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


TPIDRURO(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        RO      -        -        -              RW





TPIDRURO(NS) is accessible as follows:





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RO       -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


TPIDRURO is accessible as follows: 





EL0  EL1  EL2(NS)
RO   RW   RW
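The single-instance access table above can be expressed as a small lookup, shown here as an illustrative sketch. The type access_t and the function tpidruro_access are hypothetical names, and the model covers only the case where EL3 is not implemented or is using AArch64.

```c
/* Hypothetical access classification for a System register. */
typedef enum { ACC_NONE, ACC_RO, ACC_RW } access_t;

/* Hypothetical model of the single-instance TPIDRURO access table:
 * EL0 is read-only, EL1 and Non-secure EL2 are read/write. */
static inline access_t tpidruro_access(unsigned el)
{
    switch (el) {
    case 0:
        return ACC_RO;
    case 1:
    case 2:
        return ACC_RW;
    default:
        return ACC_NONE;  /* other Exception levels are not in this table */
    }
}
```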





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T13==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp mode.
• If HSTR_EL2.T13==1, Non-secure accesses to this register from EL0 and EL1 are trapped to EL2.


Configurations 


AArch32 System register TPIDRURO is architecturally mapped to AArch64 System register 
TPIDRRO_EL0[31:0].


The PE never updates this register. This means the register is always UNKNOWN on reset. 


Attributes 
TPIDRURO is a 32-bit register. 


Field descriptions 


The TPIDRURO bit assignments are:

31 0
Thread ID 
Bits [31:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDRURO: 
To access the TPIDRURO: 


MRC p15,0,<Rt>,c13,c0,3 ; Read TPIDRURO into Rt
MCR p15,0,<Rt>,c13,c0,3 ; Write Rt to TPIDRURO 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1101  0000  011

G6.2.154 TPIDRURW, PL0 Read/Write Software Thread ID Register
The TPIDRURW characteristics are: 


Purpose 
Provides a location where software executing at EL0 can store thread identifying information, for
OS management purposes. 

Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


TPIDRURW(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        RW      -        -        -              RW





TPIDRURW(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RW       -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


TPIDRURW is accessible as follows: 





EL0  EL1  EL2(NS)
RW   RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T13==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp mode.
• If HSTR_EL2.T13==1, Non-secure accesses to this register from EL0 and EL1 are trapped to EL2.


Configurations 


AArch32 System register TPIDRURW is architecturally mapped to AArch64 System register 
TPIDR_EL0[31:0].


The PE never updates this register. This means the register is always UNKNOWN on reset. 


Attributes 
TPIDRURW is a 32-bit register. 


Field descriptions 


The TPIDRURW bit assignments are:

31 0
Thread ID 
Bits [31:0] 


Thread ID. Thread identifying information stored by software running at this Exception level. 


Accessing the TPIDRURW: 
To access the TPIDRURW: 


MRC p15,0,<Rt>,c13,c0,2 ; Read TPIDRURW into Rt
MCR p15,0,<Rt>,c13,c0,2 ; Write Rt to TPIDRURW


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1101  0000  010

G6.2.155 TTBCR, Translation Table Base Control Register


The TTBCR characteristics are: 


Purpose 


Determines which of the Translation Table Base Registers defines the base address for a translation 
table walk required for the stage 1 translation of a memory access from any mode other than Hyp 
mode. Also controls the translation table format and, when using the Long-descriptor translation 
table format, holds cacheability and shareability information. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


TTBCR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





TTBCR(NS) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


TTBCR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.

• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2.
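
As an illustration only, the trap conditions listed above can be modeled as a small predicate. The sketch assumes a Non-secure EL1 access with EL2 implemented, takes the control bits as booleans rather than reading HCR or HSTR, and does not model the exception prioritization rules referenced above. The function and parameter names are hypothetical.

```python
def ttbcr_el1_access_trapped(is_write: bool, tvm: bool, trvm: bool, t2: bool) -> bool:
    """Return True if a Non-secure EL1 access to TTBCR is trapped to EL2 or Hyp mode.

    tvm, trvm and t2 stand for HCR.TVM, HCR.TRVM and HSTR.T2 (or their
    HCR_EL2 / HSTR_EL2 equivalents). Prioritization is not modeled.
    """
    if t2:
        # HSTR.T2 traps both read and write accesses.
        return True
    # TVM covers writes only, TRVM covers reads only.
    return tvm if is_write else trvm
```

With only TVM set, for example, a write is reported as trapped and a read is not.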


Configurations 


AArch32 System register TTBCR is architecturally mapped to AArch64 System register 
TCR_EL1[31:0]. 


The current translation table format determines which format of the register is used.

When EL3 is using AArch32, write access to TTBCR(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


Some RW fields of this register have defined reset values. These apply only if the PE resets into an 
Exception level that is using AArch32. If the PE resets into EL3 using AArch32 then: 


• The EAE bit resets to 0 in both the Secure and the Non-secure instances of the register.


• Other reset values apply only to the Secure instance of the register.


Attributes 
TTBCR is a 32-bit register. 


Field descriptions 


The TTBCR bit assignments are: 


For all register layouts: 


EAE, bit [31] 
Extended Address Enable. The meanings of the possible values of this bit are: 
0 Use the VMSAv8-32 translation system with the Short-descriptor translation table
format. 
1 Use the VMSAv8-32 translation system with the Long-descriptor translation table 
format. 


When TTBCR.EAE==0: 


31   30      6  5    4    3     2  0
EAE  RES0       PD1  PD0  RES0  N


EAE, bit [31] 
Extended Address Enable. The meanings of the possible values of this bit are: 


0 Use the VMSAv8-32 translation system with the Short-descriptor translation table 
format. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [30:6] 


Reserved, RES0.


PD1, bit [5] 


Translation table walk disable for translations using TTBR1. This bit controls whether a translation 
table walk is performed on a TLB miss, for an address that is translated using TTBR1. The encoding 
of this bit is: 


0 Perform translation table walks using TTBR1.


1 A TLB miss on an address that is translated using TTBR1 generates a Translation fault. 
No translation table walk is performed. 


When this register has an architecturally-defined reset value, this field resets to 0.


PD0, bit [4]


Translation table walk disable for translations using TTBR0. This bit controls whether a translation
table walk is performed on a TLB miss for an address that is translated using TTBR0. The encoding
of this bit is:


0 Perform translation table walks using TTBR0.


1 A TLB miss on an address that is translated using TTBR0 generates a Translation fault.
No translation table walk is performed.


When this register has an architecturally-defined reset value, this field resets to 0. 


Bit [3] 


Reserved, RES0.


N, bits [2:0] 


Indicate the width of the base address held in TTBR0. In TTBR0, the base address field is
bits[31:14-N]. The value of N also determines:


• Whether TTBR0 or TTBR1 is used as the base address for translation table walks.

• The size of the translation table pointed to by TTBR0.

N can take any value from 0 to 7, that is, from 0b000 to 0b111.

When N has its reset value of 0, the translation table base is compatible with ARMv5 and ARMv6. 


When this register has an architecturally-defined reset value, this field resets to 0. 
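
The effect of N can be sketched in a few lines. The base address mask follows directly from the bits[31:14-N] statement above. The table size and the TTBR0/TTBR1 selection rule are not spelled out in this section and are stated here as assumptions based on the Short-descriptor rules described elsewhere in the manual: the TTBR0 table shrinks from 16KB by a factor of two per increment of N, and TTBR1 is used when any of the top N bits of the 32-bit input address is nonzero. All function names are hypothetical.

```python
def ttbr0_base_mask(n: int) -> int:
    """Mask selecting TTBR0 bits [31:14-N], the translation table base address."""
    if not 0 <= n <= 7:
        raise ValueError("TTBCR.N is a 3-bit field")
    low = 14 - n
    return (0xFFFFFFFF >> low) << low


def ttbr0_table_size(n: int) -> int:
    """Assumed size in bytes of the table pointed to by TTBR0 (16KB when N is 0)."""
    if not 0 <= n <= 7:
        raise ValueError("TTBCR.N is a 3-bit field")
    return 1 << (14 - n)


def selects_ttbr1(va: int, n: int) -> bool:
    """Assumed selection rule: TTBR1 is used when VA bits [31:32-N] are not all zero."""
    if not 0 <= n <= 7:
        raise ValueError("TTBCR.N is a 3-bit field")
    if n == 0:
        return False  # with N == 0, TTBR0 covers the whole address space
    return ((va & 0xFFFFFFFF) >> (32 - n)) != 0
```

For N equal to 1, for example, this model sends addresses below 0x80000000 to TTBR0 and the rest to TTBR1.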


When TTBCR.EAE==1: 


Bits     Field
[31]     EAE
[30]     IMP DEF
[29:28]  SH1
[27:26]  ORGN1
[25:24]  IRGN1
[23]     EPD1
[22]     A1
[21:19]  RES0
[18:16]  T1SZ
[15:14]  RES0
[13:12]  SH0
[11:10]  ORGN0
[9:8]    IRGN0
[7]      EPD0
[6:3]    RES0
[2:0]    T0SZ


EAE, bit [31] 
Extended Address Enable. The meanings of the possible values of this bit are: 


1 Use the VMSAv8-32 translation system with the Long-descriptor translation table
format. 


When this register has an architecturally-defined reset value, this field resets to 0. 


IMP DEF, bit [30] 
IMPLEMENTATION DEFINED. 


When this register has an architecturally-defined reset value, this field resets to 0. 


SH1, bits [29:28] 


Shareability attribute for memory associated with translation table walks using TTBR1. Defined 
values are: 


00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable


Other values are reserved. The effect of programming this field to a Reserved value is that behavior
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped
registers and translation table entries on page K1-5477.


When this register has an architecturally-defined reset value, this field resets to 0.


ORGN1, bits [27:26]


Outer cacheability attribute for memory associated with translation table walks using TTBR1. 


00 Normal memory, Outer Non-cacheable 

01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 


When this register has an architecturally-defined reset value, this field resets to 0. 


IRGN1, bits [25:24]


Inner cacheability attribute for memory associated with translation table walks using TTBR1.


00 Normal memory, Inner Non-cacheable

01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable

10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable


When this register has an architecturally-defined reset value, this field resets to 0.


EPD1, bit [23]


Translation table walk disable for translations using TTBR1. This bit controls whether a translation
table walk is performed on a TLB miss, for an address that is translated using TTBR1. The encoding
of this bit is:


0 Perform translation table walks using TTBR1.


1 A TLB miss on an address that is translated using TTBR1 generates a Translation fault.
No translation table walk is performed.


When this register has an architecturally-defined reset value, this field resets to 0.


A1, bit [22]


Selects whether TTBR0 or TTBR1 defines the ASID. The encoding of this bit is:
0 TTBR0.ASID defines the ASID.
1 TTBR1.ASID defines the ASID.


When this register has an architecturally-defined reset value, this field resets to 0.


Bits [21:19]


Reserved, RES0.


T1SZ, bits [18:16]


See Selecting between TTBR0 and TTBR1, VMSAv8-32 Long-descriptor translation table format on
page G4-4057 for how TTBCR.{T1SZ, T0SZ} determine the input address ranges and memory
region sizes translated using TTBR0 and TTBR1.


When this register has an architecturally-defined reset value, this field resets to 0.


Bits [15:14]


Reserved, RES0.


SH0, bits [13:12]


Shareability attribute for memory associated with translation table walks using TTBR0.


00 Non-shareable
10 Outer Shareable
11 Inner Shareable


Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 


When this register has an architecturally-defined reset value, this field resets to 0. 


ORGN0, bits [11:10]


Outer cacheability attribute for memory associated with translation table walks using TTBR0.


00 Normal memory, Outer Non-cacheable 

01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 


When this register has an architecturally-defined reset value, this field resets to 0. 


IRGN0, bits [9:8]


Inner cacheability attribute for memory associated with translation table walks using TTBR0.


00 Normal memory, Inner Non-cacheable 

01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 


When this register has an architecturally-defined reset value, this field resets to 0. 


EPD0, bit [7]


Translation table walk disable for translations using TTBR0. This bit controls whether a translation
table walk is performed on a TLB miss, for an address that is translated using TTBR0. The encoding
of this bit is:


0 Perform translation table walks using TTBR0.


1 A TLB miss on an address that is translated using TTBR0 generates a Translation fault.
No translation table walk is performed.


When this register has an architecturally-defined reset value, this field resets to 0.


Bits [6:3]


Reserved, RES0.


T0SZ, bits [2:0]


See Selecting between TTBR0 and TTBR1, VMSAv8-32 Long-descriptor translation table format on
page G4-4057 for how TTBCR.{T1SZ, T0SZ} determine the input address ranges and memory
region sizes translated using TTBR0 and TTBR1.


When this register has an architecturally-defined reset value, this field resets to 0. 
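
The Long-descriptor layout described above lends itself to a table-driven decoder. The following is a minimal sketch, not a reference implementation: the field names and bit positions are taken from the field descriptions in this section, while the helper names `bits` and `decode_ttbcr_long` are hypothetical. Reserved and IMPLEMENTATION DEFINED bits are not decoded.

```python
# (high bit, low bit) for each named field of TTBCR when TTBCR.EAE == 1.
LONG_FIELDS = {
    "EAE": (31, 31), "SH1": (29, 28), "ORGN1": (27, 26), "IRGN1": (25, 24),
    "EPD1": (23, 23), "A1": (22, 22), "T1SZ": (18, 16), "SH0": (13, 12),
    "ORGN0": (11, 10), "IRGN0": (9, 8), "EPD0": (7, 7), "T0SZ": (2, 0),
}


def bits(value: int, hi: int, lo: int) -> int:
    """Extract value[hi:lo] as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_ttbcr_long(value: int) -> dict:
    """Split a 32-bit TTBCR value into its Long-descriptor format fields."""
    if bits(value, 31, 31) != 1:
        raise ValueError("TTBCR.EAE is 0: the Short-descriptor layout applies")
    return {name: bits(value, hi, lo) for name, (hi, lo) in LONG_FIELDS.items()}
```

A value with SH1 or SH0 equal to 0b01 decodes without error here, although the text above defines that encoding as reserved.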


Accessing the TTBCR: 
To access the TTBCR: 


MRC p15,0,<Rt>,c2,c0,2 ; Read TTBCR into Rt
MCR p15,0,<Rt>,c2,c0,2 ; Write Rt to TTBCR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   0010  0000  010

G6.2.156 TTBR0, Translation Table Base Register 0
The TTBR0 characteristics are:


Purpose 


Holds the base address of translation table 0, and information about the memory it occupies. This is 
one of the translation tables for the stage 1 translation of memory accesses from modes other than 
Hyp mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


TTBR0(S) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





TTBR0(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


TTBR0 is accessible as follows:

EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.

• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2.





Configurations 
AArch32 System register TTBR0 is architecturally mapped to AArch64 System register
TTBR0_EL1.

TTBCR.EAE determines which TTBR0 format is used:
EAE==0 32-bit format is used. TTBR0[63:32] are ignored.
EAE==1 64-bit format is used.


When EL3 is using AArch32, write access to TTBR0(S) is disabled when the CP15SDISABLE
signal is asserted HIGH.


Used in conjunction with the TTBCR. When the 64-bit TTBR0 format is used, cacheability and
shareability information is held in the TTBCR, not in TTBR0.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 


TTBR0 is a 64-bit register that can also be accessed as a 32-bit value. If it is accessed as a 32-bit
register, accesses read and write bits [31:0] and do not modify bits [63:32]. 
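
The 32-bit access rule can be pictured with a small software model. This is a sketch only: the class name `Ttbr64` and its methods are hypothetical and stand in for the MRC/MCR accesses to the register, and the model says nothing about how hardware implements the register.

```python
class Ttbr64:
    """Toy model of a 64-bit register that also supports 32-bit accesses."""

    MASK64 = 0xFFFFFFFFFFFFFFFF
    LOW32 = 0x00000000FFFFFFFF
    HIGH32 = 0xFFFFFFFF00000000

    def __init__(self, value: int = 0):
        self.value = value & self.MASK64

    def read32(self) -> int:
        # A 32-bit read returns bits [31:0].
        return self.value & self.LOW32

    def write32(self, rt: int) -> None:
        # A 32-bit write replaces bits [31:0] and leaves bits [63:32] unchanged.
        self.value = (self.value & self.HIGH32) | (rt & self.LOW32)
```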


Field descriptions 


The TTBR0 bit assignments are:


When TTBCR.EAE==0: 





31        7  6        5    4 3  2    1  0
TTB0         IRGN[0]  NOS  RGN  IMP  S  IRGN[1]

TTB0, bits [31:7]


Translation table base address, bits[31:x], where x is 14-(TTBCR.N). Register bits [x-1:7] are RES0,
with the additional requirement that if these bits are not all zero, this is a misaligned translation table 
base address, with effects that are CONSTRAINED UNPREDICTABLE, and must be one of the following: 


• Register bits [x-1:7] are treated as if all the bits are zero. The value read back from these bits
is either the value written or zero.


• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.


IRGN[0], bit [6] 
See IRGN[1] below for description of the IRGN field. 


NOS, bit [5] 


Not Outer Shareable. When the value of TTBR0.S is 1, indicates whether the memory associated
with a translation table walk is Inner Shareable or Outer Shareable:


0 Memory is Outer Shareable.
1 Memory is Inner Shareable.


This bit is ignored when the value of TTBR0.S is 0.


RGN, bits [4:3] 


Region bits. Indicates the Outer cacheability attributes for the memory associated with the 
translation table walks: 





00 Normal memory, Outer Non-cacheable.
01 Normal memory, Outer Write-Back Write-Allocate Cacheable.
10 Normal memory, Outer Write-Through Cacheable.
11 Normal memory, Outer Write-Back no Write-Allocate Cacheable.


IMP, bit [2]


The effect of this bit is IMPLEMENTATION DEFINED. If the translation table implementation does not
include any IMPLEMENTATION DEFINED features this bit is UNK/SBZP.


S, bit [1]


Shareable. Indicates whether the memory associated with the translation table walks is
Non-shareable:


0 Memory is Non-shareable.
1 Memory is shareable. The TTBR0.NOS field indicates whether the memory is Inner
Shareable or Outer Shareable.


IRGN[1], bit [0]


Inner region bits. Indicates the Inner Cacheability attributes for the memory associated with the 
translation table walks. The possible values of IRGN[1:0] are: 


00 Normal memory, Inner Non-cacheable. 

01 Normal memory, Inner Write-Back Write-Allocate Cacheable. 

10 Normal memory, Inner Write-Through Cacheable. 

11 Normal memory, Inner Write-Back no Write-Allocate Cacheable. 


The encoding of the IRGN bits is counter-intuitive, with register bit[6] being IRGN[0] and register 
bit[0] being IRGN[1]. This encoding is chosen to give a consistent encoding of memory region 
types and to ensure that software written for ARMv7 without the Multiprocessing Extensions can 
run unmodified on an implementation that includes the functionality introduced by the ARMv7 
Multiprocessing Extensions. 
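
Because IRGN is split across register bits [0] and [6], software that decodes the 32-bit format has to reassemble it. The sketch below shows one way to do that and to combine S and NOS into a single shareability result, using only the bit positions given in the field descriptions above. The function name and the returned dictionary keys are hypothetical.

```python
def decode_ttbr_short_attrs(ttbr: int) -> dict:
    """Decode the memory attribute bits [6:0] of a 32-bit format TTBR value."""
    irgn1 = ttbr & 1           # register bit [0] is IRGN[1]
    irgn0 = (ttbr >> 6) & 1    # register bit [6] is IRGN[0]
    irgn = (irgn1 << 1) | irgn0
    rgn = (ttbr >> 3) & 0b11   # RGN, bits [4:3]
    s = (ttbr >> 1) & 1        # S, bit [1]
    nos = (ttbr >> 5) & 1      # NOS, bit [5], ignored when S is 0
    if s == 0:
        shareability = "Non-shareable"
    elif nos == 0:
        shareability = "Outer Shareable"
    else:
        shareability = "Inner Shareable"
    return {"irgn": irgn, "rgn": rgn, "shareability": shareability}
```

For a value whose low bits are 0x6A, this yields IRGN 0b01, RGN 0b01, and Inner Shareable.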


When TTBCR.EAE==1: 


63      56 55     48 47              0
RES0       ASID      BADDR


Bits [63:56] 


Reserved, RES0.


ASID, bits [55:48] 


An ASID for the translation table base address. The TTBCR.A1 field selects either TTBR0.ASID
or TTBR1.ASID.


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that
if bits[x-1:3] are not all zero, this is a misaligned translation table base address, with effects that are
CONSTRAINED UNPREDICTABLE, and must be one of the following:


• Register bits [x-1:3] are treated as if all the bits are zero. The value read back from these bits
is either the value written or zero.


• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.


x is determined from the value of TTBCR.T0SZ as follows:
• If TTBCR.T0SZ is 0 or 1, x = 5 - TTBCR.T0SZ.
• If TTBCR.T0SZ is greater than 1, x = 14 - TTBCR.T0SZ.

If bits[47:40] of the translation table base address are not zero, an Address size fault is generated.
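
The rules for x, for misalignment, and for the Address size fault can be combined into a short check. This is a sketch under the rules stated above, with hypothetical function names. It models the first of the two permitted misalignment behaviors (the low bits treated as zero) and only reports, rather than raises, the fault condition.

```python
def baddr_low_bit(tsz: int) -> int:
    """Return x, the lowest bit of the base address, for a 3-bit T0SZ or T1SZ value."""
    if not 0 <= tsz <= 7:
        raise ValueError("TxSZ is a 3-bit field")
    return 5 - tsz if tsz <= 1 else 14 - tsz


def split_baddr(ttbr: int, tsz: int):
    """Return (base, misaligned, size_fault) for a 64-bit format TTBR value."""
    x = baddr_low_bit(tsz)
    field = ttbr & ((1 << 48) - 1)                # BADDR, bits [47:0]
    base = (field >> x) << x                      # bits [47:x], low bits taken as zero
    misaligned = ((field & ((1 << x) - 1)) >> 3) != 0   # any of bits [x-1:3] set
    size_fault = (field >> 40) != 0               # any of bits [47:40] set
    return base, misaligned, size_fault
```

With T0SZ equal to 2, x is 12, so the base address is 4KB aligned.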


Accessing the TTBR0:
To access the TTBR0 when accessing as a 32-bit register:


MRC p15,0,<Rt>,c2,c0,0 ; Read TTBR0[31:0] into Rt
MCR p15,0,<Rt>,c2,c0,0 ; Write Rt to TTBR0[31:0]. TTBR0[63:32] are unchanged


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0010  0000  000
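
To show how the coproc, opc1, CRn, CRm, and opc2 values above end up in an instruction word, the following sketch assembles the A32 encoding of an MRC or MCR access to CP15. The bit layout used here (cond in [31:28], 0b1110 in [27:24], opc1 in [23:21], the read/write bit in [20], CRn in [19:16], Rt in [15:12], coproc in [11:8], opc2 in [7:5], 1 in [4], CRm in [3:0]) is not given in this section and is an assumption based on the A32 instruction descriptions elsewhere in the manual. The function name is hypothetical.

```python
def encode_cp15_transfer(opc1: int, crn: int, crm: int, opc2: int,
                         rt: int, read: bool, cond: int = 0xE) -> int:
    """Assemble an A32 MRC (read=True) or MCR (read=False) word for coprocessor 15."""
    fields = {"opc1": (opc1, 3), "CRn": (crn, 4), "CRm": (crm, 4),
              "opc2": (opc2, 3), "Rt": (rt, 4), "cond": (cond, 4)}
    for name, (value, width) in fields.items():
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(read) << 20)
            | (crn << 16) | (rt << 12) | (0b1111 << 8) | (opc2 << 5)
            | (1 << 4) | crm)
```

For the TTBR0 32-bit access with Rt equal to R0, `hex(encode_cp15_transfer(0, 2, 0, 0, 0, True))` gives `0xee120f10` under this layout.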
To access the TTBR0 when accessing as a 64-bit register:

MRRC p15,0,<Rt>,<Rt2>,c2 ; Read TTBR0[31:0] into Rt and TTBR0[63:32] into Rt2
MCRR p15,0,<Rt>,<Rt2>,c2 ; Write Rt to TTBR0[31:0] and Rt2 to TTBR0[63:32]

Register access is encoded as follows:

coproc  opc1  CRm
1111    0000  0010

G6.2.157 TTBR1, Translation Table Base Register 1
The TTBR1 characteristics are:


Purpose 


Holds the base address of translation table 1, and information about the memory it occupies. This is 
one of the translation tables for the stage 1 translation of memory accesses from modes other than 
Hyp mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


TTBR1(S) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





TTBR1(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


TTBR1 is accessible as follows:

EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HCR.TVM==1, Non-secure write accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TVM==1, Non-secure write accesses to this register from EL1 are trapped to
EL2.

• If HCR.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
mode.

• If HCR_EL2.TRVM==1, Non-secure read accesses to this register from EL1 are trapped to
EL2.

• If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2.





Configurations 
AArch32 System register TTBR1 is architecturally mapped to AArch64 System register
TTBR1_EL1.

TTBCR.EAE determines which TTBR1 format is used:
EAE==0 32-bit format is used. TTBR1[63:32] are ignored.
EAE==1 64-bit format is used.


Used in conjunction with the TTBCR. When the 64-bit TTBR1 format is used, cacheability and 
shareability information is held in the TTBCR, not in TTBR1. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 


TTBR1 is a 64-bit register that can also be accessed as a 32-bit value. If it is accessed as a 32-bit 
register, accesses read and write bits [31:0] and do not modify bits [63:32]. 


Field descriptions 


The TTBR1 bit assignments are: 


When TTBCR.EAE==0: 


31        7  6        5    4 3  2    1  0
TTB1         IRGN[0]  NOS  RGN  IMP  S  IRGN[1]


TTB1, bits [31:7]


Translation table base address, bits[31:14]. Register bits [13:7] are RES0, with the additional
requirement that if these bits are not all zero, this is a misaligned translation table base address, with 
effects that are CONSTRAINED UNPREDICTABLE, and must be one of the following: 


• Register bits [13:7] are treated as if all the bits are zero. The value read back from these bits
is either the value written or zero.


• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.


IRGN[0], bit [6]
See IRGN[1] below for description of the IRGN field. 


NOS, bit [5] 


Not Outer Shareable. When the value of TTBR1.S is 1, indicates whether the memory associated 
with a translation table walk is Inner Shareable or Outer Shareable: 


0 Memory is Outer Shareable. 
1 Memory is Inner Shareable. 


This bit is ignored when the value of TTBR1.S is 0. 


RGN, bits [4:3] 
Region bits. Indicates the Outer cacheability attributes for the memory associated with the 
translation table walks: 





00 Normal memory, Outer Non-cacheable. 
01 Normal memory, Outer Write-Back Write-Allocate Cacheable. 
10 Normal memory, Outer Write-Through Cacheable.
11 Normal memory, Outer Write-Back no Write-Allocate Cacheable.


IMP, bit [2] 


The effect of this bit is IMPLEMENTATION DEFINED. If the translation table implementation does not 
include any IMPLEMENTATION DEFINED features this bit is UNK/SBZP. 


S, bit [1] 


Shareable. Indicates whether the memory associated with the translation table walks is 
Non-shareable: 


0 Memory is Non-shareable. 
1 Memory is shareable. The TTBR1.NOS field indicates whether the memory is Inner 
Shareable or Outer Shareable. 
IRGN[1], bit [0] 


Inner region bits. Indicates the Inner Cacheability attributes for the memory associated with the 
translation table walks. The possible values of IRGN[1:0] are: 


00 Normal memory, Inner Non-cacheable. 

01 Normal memory, Inner Write-Back Write-Allocate Cacheable. 

10 Normal memory, Inner Write-Through Cacheable. 

11 Normal memory, Inner Write-Back no Write-Allocate Cacheable. 


The encoding of the IRGN bits is counter-intuitive, with register bit[6] being IRGN[0] and register 
bit[0] being IRGN[1]. This encoding is chosen to give a consistent encoding of memory region 
types and to ensure that software written for ARMv7 without the Multiprocessing Extensions can 
run unmodified on an implementation that includes the functionality introduced by the ARMv7 
Multiprocessing Extensions. 


When TTBCR.EAE==1: 


63      56 55     48 47              0
RES0       ASID      BADDR


Bits [63:56]


Reserved, RES0.


ASID, bits [55:48] 
An ASID for the translation table base address. The TTBCR.A1 field selects either TTBRO.ASID 
or TTBR1.ASID. 

BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that
if bits[x-1:3] are not all zero, this is a misaligned translation table base address, with effects that are
CONSTRAINED UNPREDICTABLE, and must be one of the following:


• Register bits [x-1:3] are treated as if all the bits are zero. The value read back from these bits
is either the value written or zero.


• The result of the calculation of an address for a translation table walk using this register can
be corrupted in those bits that are nonzero.


x is determined from the value of TTBCR.T1SZ as follows:
• If TTBCR.T1SZ is 0 or 1, x = 5 - TTBCR.T1SZ.
• If TTBCR.T1SZ is greater than 1, x = 14 - TTBCR.T1SZ.


If bits[47:40] of the translation table base address are not zero, an Address size fault is generated.


Accessing the TTBR1:
To access the TTBR1 when accessing as a 32-bit register:


MRC p15,0,<Rt>,c2,c0,1 ; Read TTBR1[31:0] into Rt
MCR p15,0,<Rt>,c2,c0,1 ; Write Rt to TTBR1[31:0]. TTBR1[63:32] are unchanged


Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1111    000   0010  0000  001

To access the TTBR1 when accessing as a 64-bit register:

MRRC p15,1,<Rt>,<Rt2>,c2 ; Read TTBR1[31:0] into Rt and TTBR1[63:32] into Rt2
MCRR p15,1,<Rt>,<Rt2>,c2 ; Write Rt to TTBR1[31:0] and Rt2 to TTBR1[63:32]

Register access is encoded as follows:

coproc  opc1  CRm
1111    0001  0010

G6.2.158 VBAR, Vector Base Address Register
The VBAR characteristics are: 


Purpose 


When high exception vectors are not selected, holds the vector base address for exceptions that are 
not taken to Monitor mode or to Hyp mode. 


Software must program VBAR(NS) with the required initial value as part of the PE boot sequence. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


VBAR(S) is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        -              RW





VBAR(NS) is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


VBAR is accessible as follows: 





EL0  EL1  EL2(NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T12==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T12==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register VBAR is architecturally mapped to AArch64 System register 
VBAR_EL1[31:0]. 

When EL3 is using AArch32, write access to VBAR(S) is disabled when the CP15SDISABLE
signal is asserted HIGH. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch32. If the PE resets into EL3 using AArch32 they apply 
only to the Secure instance of the register. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
VBAR is a 32-bit register.


Field descriptions


The VBAR bit assignments are:


31                      5 4     0
Vector Base Address       RES0

Bits [31:5] 
Vector Base Address. Bits[31:5] of the base address of the exception vectors for exceptions taken to 
this Exception level. Bits[4:0] of an exception vector are the exception offset. 
When this register has an architecturally-defined reset value, this field resets to an IMPLEMENTATION 
DEFINED value. 

Bits [4:0] 


Reserved, RES0.
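
The way VBAR[31:5] combines with the exception offset can be sketched as follows. The offsets in the table are not listed in this section. They are an assumption taken from the AArch32 exception vector layout described in the exception model chapter, and the dictionary keys and function name are hypothetical.

```python
# Assumed AArch32 vector offsets for exceptions that use VBAR.
VECTOR_OFFSETS = {
    "undefined": 0x04,
    "svc": 0x08,
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "irq": 0x18,
    "fiq": 0x1C,
}


def vector_address(vbar: int, exception: str) -> int:
    """Combine VBAR[31:5] with the 5-bit offset of the named exception."""
    base = vbar & 0xFFFFFFE0        # bits [4:0] of VBAR are RES0
    return base | VECTOR_OFFSETS[exception]
```

Because bits [4:0] are masked off, a VBAR value of 0x8000001F produces the same vector addresses as 0x80000000 in this model.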


Accessing the VBAR: 
To access the VBAR: 


MRC p15,0,<Rt>,c12,c0,0 ; Read VBAR into Rt
MCR p15,0,<Rt>,c12,c0,0 ; Write Rt to VBAR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1100  0000  000

G6.2.159 VMPIDR, Virtualization Multiprocessor ID Register


The VMPIDR characteristics are: 


Purpose 
Holds the value of the Virtualization Multiprocessor ID. This is the value returned by Non-secure 
EL1 reads of MPIDR. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T0==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If HSTR_EL2.T0==1, Non-secure accesses to this register from EL1 are trapped to EL2.


Configurations 


AArch32 System register VMPIDR is architecturally mapped to AArch64 System register 
VMPIDR_EL2[31:0]. 


If EL2 is not implemented but EL3 is implemented, this register takes the value of the MPIDR. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 


Attributes 
VMPIDR is a 32-bit register. 


Field descriptions 


The VMPIDR bit assignments are: 


31  30  29    25  24  23    16 15     8 7      0
M   U   RES0      MT  Aff2     Aff1     Aff0


M, bit [31]


Indicates whether this implementation includes the functionality introduced by the ARMv7 
Multiprocessing Extensions. The possible values of this bit are: 


0 This implementation does not include the ARMv7 Multiprocessing Extensions
functionality. 


1 This implementation includes the ARMv7 Multiprocessing Extensions functionality. 


In ARMv8 this bit is RES1.


U, bit [30] 


Indicates a Uniprocessor system, as distinct from PE 0 in a multiprocessor system. The possible 
values of this bit are: 


0 Processor is part of a multiprocessor system.
1 Processor is part of a uniprocessor system.

When this register has an architecturally-defined reset value, this field resets to the value of
MPIDR.U.


Bits [29:25]


Reserved, RES0.


MT, bit [24] 


Indicates whether the lowest level of affinity consists of logical PEs that are implemented using a 
multithreading type approach. The possible values of this bit are: 


0 Performance of PEs at the lowest affinity level is largely independent.
1 Performance of PEs at the lowest affinity level is very interdependent.

When this register has an architecturally-defined reset value, this field resets to the value of
MPIDR.MT.


Aff2, bits [23:16]
Affinity level 2. The least significant affinity level field, for this PE in the system. 
When this register has an architecturally-defined reset value, this field resets to the value of 
MPIDR.Aff2. 
Aff1, bits [15:8] 
Affinity level 1. The intermediate affinity level field, for this PE in the system. 
When this register has an architecturally-defined reset value, this field resets to the value of 
MPIDR.Aff1. 
Aff0, bits [7:0] 
Affinity level 0. The most significant affinity level field, for this PE in the system. 


When this register has an architecturally-defined reset value, this field resets to the value of 
MPIDR.Aff0. 
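
A hypervisor that presents a virtual MPIDR to a guest typically needs to pack and unpack these fields. The following is a minimal sketch of the unpacking, using the bit positions from the field descriptions above. The function name and the dictionary keys are hypothetical.

```python
def decode_vmpidr(value: int) -> dict:
    """Split a 32-bit VMPIDR value into the fields described above."""
    return {
        "M": (value >> 31) & 1,        # RES1 in ARMv8
        "U": (value >> 30) & 1,        # 1 indicates a uniprocessor system
        "MT": (value >> 24) & 1,       # multithreading type approach
        "Aff2": (value >> 16) & 0xFF,
        "Aff1": (value >> 8) & 0xFF,
        "Aff0": value & 0xFF,
    }
```

A value of 0x80000103, for example, decodes to M set, U and MT clear, Aff1 equal to 1, and Aff0 equal to 3.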

Accessing the VMPIDR: 

To access the VMPIDR: 


MRC p15,4,<Rt>,c0,c0,5 ; Read VMPIDR into Rt
MCR p15,4,<Rt>,c0,c0,5 ; Write Rt to VMPIDR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0000  0000  101

G6.2.160 VPIDR, Virtualization Processor ID Register


The VPIDR characteristics are: 


Purpose 


Holds the value of the Virtualization Processor ID. This is the value returned by Non-secure EL1 
reads of MIDR. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
-    -    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T0==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T0==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 


AArch32 System register VPIDR is architecturally mapped to AArch64 System register 
VPIDR_EL2. 


If EL2 is not implemented but EL3 is implemented, this register takes the value of the MIDR. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 


Attributes 
VPIDR is a 32-bit register. 


Field descriptions 


The VPIDR bit assignments are: 


31        24 23    20 19         16 15      4 3       0
Implementer  Variant  Architecture  PartNum   Revision


Implementer, bits [31:24] 


The Implementer code. This field must hold an implementer code that has been assigned by ARM. 
Assigned codes include the following: 





Hex representation  ASCII representation  Implementer
0x41                A                     ARM Limited
0x42                B                     Broadcom Corporation
0x43                C                     Cavium Inc.
0x44                D                     Digital Equipment Corporation
0x49                I                     Infineon Technologies AG
0x4D                M                     Motorola or Freescale Semiconductor Inc.
0x4E                N                     NVIDIA Corporation
0x50                P                     Applied Micro Circuits Corporation
0x51                Q                     Qualcomm Inc.
0x56                V                     Marvell International Ltd.
0x69                i                     Intel Corporation





ARM can assign codes that are not published in this manual. All values not assigned by ARM are 
reserved and must not be used. 


When this register has an architecturally-defined reset value, this field resets to the value of 
MIDR.Implementer. 
Variant, bits [23:20] 


An IMPLEMENTATION DEFINED variant number. Typically, this field is used to distinguish between 
different product variants, or major revisions of a product. 


When this register has an architecturally-defined reset value, this field resets to the value of 
MIDR.Variant. 


Architecture, bits [19:16] 


The permitted values of this field are: 





0001 ARMv4 
0010 ARMv4T 
0011 ARMv5 (obsolete) 
0100 ARMv5T 
0101 ARMv5TE 
0110 ARMv5TEJ 
0111 ARMv6 
1111 Architectural features are individually identified in the ID_* registers, see Identification 
registers, functional group on page G4-4194. 

All other values are reserved. 
When this register has an architecturally-defined reset value, this field resets to the value of 
MIDR.Architecture. 
PartNum, bits [15:4] 


An IMPLEMENTATION DEFINED primary part number for the device. 

On processors implemented by ARM, if the top four bits of the primary part number are 0x0 or 0x7, 
the variant and architecture are encoded differently. 


When this register has an architecturally-defined reset value, this field resets to the value of 
MIDR.PartNum. 


Revision, bits [3:0] 
An IMPLEMENTATION DEFINED revision number for the device. 


When this register has an architecturally-defined reset value, this field resets to the value of 
MIDR.Revision. 
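
As an illustration of the field boundaries described above, the sketch below splits a 32-bit VPIDR value into its five fields. The type and function names are hypothetical, and the sample value used in the note that follows is arbitrary rather than taken from this manual.

```c
#include <stdint.h>

/* Decoded view of a VPIDR value. Names are illustrative only. */
typedef struct {
    unsigned implementer;   /* bits [31:24] */
    unsigned variant;       /* bits [23:20] */
    unsigned architecture;  /* bits [19:16] */
    unsigned partnum;       /* bits [15:4] */
    unsigned revision;      /* bits [3:0] */
} vpidr_fields;

/* Split a raw 32-bit VPIDR value into its documented fields. */
static vpidr_fields vpidr_decode(uint32_t v)
{
    vpidr_fields f;
    f.implementer  = (v >> 24) & 0xFFu;
    f.variant      = (v >> 20) & 0xFu;
    f.architecture = (v >> 16) & 0xFu;
    f.partnum      = (v >> 4) & 0xFFFu;
    f.revision     = v & 0xFu;
    return f;
}
```

For example, a value of 0x410FD034 would decode as Implementer 0x41 (ASCII 'A'), Variant 0x0, Architecture 0b1111, PartNum 0xD03, and Revision 0x4.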


Accessing the VPIDR: 


To access the VPIDR: 


MRC p15,4,<Rt>,c0,c0,0 ; Read VPIDR into Rt 
MCR p15,4,<Rt>,c0,c0,0 ; Write Rt to VPIDR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0000  0000  000

G6.2.161 VTCR, Virtualization Translation Control Register 
The VTCR characteristics are: 


Purpose 


Controls the translation table walks required for the stage 2 translation of memory accesses from 
Non-secure modes other than Hyp mode, and holds cacheability and shareability information for the 
accesses. 

Used in conjunction with VTTBR, that defines the translation table base address for the translations. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
-    -    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
AArch32 System register VTCR is architecturally mapped to AArch64 System register 
VTCR_EL2. 


If EL2 is not implemented, this register is RES0 from EL3. 


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
VTCR is a 32-bit register. 


Field descriptions 


The VTCR bit assignments are: 


31    30  14  13 12  11 10  9   8  7  6  5     4  3   0
RES1  RES0    SH0    ORGN0  IRGN0  SL0   RES0  S  T0SZ





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G6-4663 
ID092916 Non-Confidential 


G6 AArch32 System Register Descriptions 
G6.2 General system control registers 


Bit [31] 

Reserved, RES1. 
Bits [30:14] 

Reserved, RES0. 
SH0, bits [13:12] 


Shareability attribute for memory associated with translation table walks using VTTBR. 


00 Non-shareable 
10 Outer Shareable 
11 Inner Shareable 


Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 


ORGN0, bits [11:10] 


Outer cacheability attribute for memory associated with translation table walks using VTTBR. 


00 Normal memory, Outer Non-cacheable 

01 Normal memory, Outer Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Outer Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Outer Write-Back Read-Allocate No Write-Allocate Cacheable 


IRGN0, bits [9:8] 


Inner cacheability attribute for memory associated with translation table walks using VTTBR. 


00 Normal memory, Inner Non-cacheable 

01 Normal memory, Inner Write-Back Read-Allocate Write-Allocate Cacheable 

10 Normal memory, Inner Write-Through Read-Allocate No Write-Allocate Cacheable 
11 Normal memory, Inner Write-Back Read-Allocate No Write-Allocate Cacheable 


SL0, bits [7:6] 
Starting level for translation table walks using VTTBR. 
00 Start at level 2 
01 Start at level 1 


All other values are reserved. If this field is programmed to a reserved value, or to a value that is not 
consistent with the programming of T0SZ, then a stage 2 level 1 Translation fault is generated. 


Bit [5] 
Reserved, RES0. 
S, bit [4] 


Sign extension bit. This bit must be programmed to the value of T0SZ[3]. If it is not, then the 
behavior is CONSTRAINED UNPREDICTABLE and the stage 2 T0SZ value is treated as an UNKNOWN 
value, see Misprogramming VTCR.S on page K1-5474. 


T0SZ, bits [3:0] 
The size offset of the memory region addressed by VTTBR. The region size is 2^(32-T0SZ) bytes. 
This field holds a four-bit signed integer value, meaning it supports values from -8 to 7. 


— Note 


This is different from the other translation control registers, where TnSZ holds a three-bit unsigned 
integer, supporting values from 0 to 7. 

If this field is programmed to a value that is not consistent with the programming of SL0 then a stage 
2 level 1 Translation fault is generated. 
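
The signed interpretation of T0SZ and the resulting region size can be worked through in code. The following is a sketch under stated assumptions: the function names are hypothetical, and the code does not check that T0SZ is consistent with SL0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend VTCR.T0SZ, bits [3:0], to an int in the range -8 to 7. */
static int vtcr_t0sz(uint32_t vtcr)
{
    int raw = (int)(vtcr & 0xFu);
    return (raw & 0x8) ? raw - 16 : raw;
}

/* VTCR.S, bit [4], must be programmed to the value of T0SZ[3]. */
static bool vtcr_s_consistent(uint32_t vtcr)
{
    return ((vtcr >> 4) & 1u) == ((vtcr >> 3) & 1u);
}

/* Region size in bytes, 2^(32 - T0SZ). The exponent is between 25 and 40,
 * so the result always fits in 64 bits. */
static uint64_t vtcr_region_size(uint32_t vtcr)
{
    return (uint64_t)1 << (32 - vtcr_t0sz(vtcr));
}
```

With T0SZ programmed to 0b1000 (-8) and S set to 1, the region size is 2^40 bytes. With T0SZ programmed to 0, it is 2^32 bytes.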

Accessing the VTCR: 

To access the VTCR: 


MRC p15,4,<Rt>,c2,c1,2 ; Read VTCR into Rt 
MCR p15,4,<Rt>,c2,c1,2 ; Write Rt to VTCR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   0010  0001  010

G6.2.162 VTTBR, Virtualization Translation Table Base Register 
The VTTBR characteristics are: 


Purpose 


Holds the base address of the translation table for the stage 2 translation of memory accesses from 
Non-secure modes other than Hyp mode. 


Used in conjunction with the VTCR. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
-    -    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T2==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T2==1, Non-secure accesses to this register from EL1 are trapped to EL2. 

Configurations 


AArch32 System register VTTBR is architecturally mapped to AArch64 System register 
VTTBR_EL2. 


If EL2 is not implemented, this register is RES0 from EL3. 


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 


Attributes 
VTTBR is a 64-bit register. 


Field descriptions 


The VTTBR bit assignments are: 


63    56 55    48 47      0
RES0     VMID     BADDR


Bits [63:56] 


Reserved, RES0. 

VMID, bits [55:48] 


The VMID for the translation table. 


When this register has an architecturally-defined reset value, this field resets to 0. 


BADDR, bits [47:0] 


Translation table base address, bits[47:x]. Bits [x-1:0] are RES0, with the additional requirement that 
if bits[x-1:3] are not all zero, this is a misaligned translation table base address, with effects that are 
CONSTRAINED UNPREDICTABLE, and must be one of the following: 


• Register bits [x-1:3] are treated as if all the bits are zero. The value read back from these bits 
is either the value written or zero. 


• The result of the calculation of an address for a translation table walk using this register can 
be corrupted in those bits that are nonzero. 


x is determined from the value of VTCR.SL0 and VTCR.T0SZ as follows: 

• If VTCR.SL0 is 00, meaning that lookup starts at level 2, then x is 14 - VTCR.T0SZ. 

• If VTCR.SL0 is 01, meaning that lookup starts at level 1, then x is 5 - VTCR.T0SZ. 

• If VTCR.SL0 is either 10 or 11 then a stage 2 level 1 Translation fault is generated. 

If bits[47:40] of the translation table base address are not zero, an Address size fault is generated. 
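
The rules for x and the two BADDR checks can be sketched as follows. The function names are hypothetical, T0SZ is passed in already sign-extended, and the code does not validate that the SL0 and T0SZ combination is otherwise consistent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute x from VTCR.SL0 and the signed VTCR.T0SZ value.
 * Returns false for SL0 values 10 and 11, for which the text above says a
 * stage 2 level 1 Translation fault is generated. */
static bool vttbr_x(unsigned sl0, int t0sz, int *x)
{
    switch (sl0 & 0x3u) {
    case 0: *x = 14 - t0sz; return true;  /* lookup starts at level 2 */
    case 1: *x = 5 - t0sz;  return true;  /* lookup starts at level 1 */
    default: return false;
    }
}

/* True when bits [x-1:3] of the register are all zero, that is, when the
 * translation table base address is not misaligned. */
static bool vttbr_baddr_aligned(uint64_t vttbr, int x)
{
    if (x <= 3)
        return true;  /* the range [x-1:3] is empty */
    uint64_t mask = (((uint64_t)1 << x) - 1) & ~(uint64_t)0x7;
    return (vttbr & mask) == 0;
}

/* True when bits [47:40] are zero, so no Address size fault is expected. */
static bool vttbr_baddr_in_range(uint64_t vttbr)
{
    return ((vttbr >> 40) & 0xFFu) == 0;
}
```

For SL0 = 00 and T0SZ = 0, x is 14, so a base address must have bits [13:3] clear to be aligned.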


This field resets to a value that is architecturally UNKNOWN. 


Accessing the VTTBR: 


To access the VTTBR: 


MRRC p15,6,<Rt>,<Rt2>,c2 ; Read VTTBR[31:0] into Rt and VTTBR[63:32] into Rt2 
MCRR p15,6,<Rt>,<Rt2>,c2 ; Write Rt to VTTBR[31:0] and Rt2 to VTTBR[63:32] 


Register access is encoded as follows: 





coproc  opc1  CRm
1111    0110  0010

G6.3 Debug registers 


This section lists the Debug System registers in AArch32 state, in alphabetic order. 

G6.3.1 DBGAUTHSTATUS, Debug Authentication Status register 
The DBGAUTHSTATUS characteristics are: 


Purpose 
Provides information about the state of the IMPLEMENTATION DEFINED authentication interface for 
debug. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure read accesses to this register from EL1 are trapped to Hyp 
mode. 


• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL1 are trapped to 
EL2. 


• If MDCR_EL3.TDA==1, read accesses to this register from EL1 and EL2 are trapped to 
EL3. 

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGAUTHSTATUS is architecturally mapped to AArch64 System 
register DBGAUTHSTATUS_EL1. 


AArch32 System register DBGAUTHSTATUS is architecturally mapped to External register 
DBGAUTHSTATUS_EL1. 


This register is required in all implementations. 


Attributes 
DBGAUTHSTATUS is a 32-bit register. 


Field descriptions 


The DBGAUTHSTATUS bit assignments are: 

31    8 7    6 5   4 3     2 1    0
RES0    SNID   SID   NSNID   NSID


Bits [31:8] 
Reserved, RES0. 
SNID, bits [7:6] 


Secure non-invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is 
Non-secure state. 

10 Implemented and disabled. ExternalSecureNoninvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalSecureNoninvasiveDebugEnabled() == TRUE. 


Other values are reserved. 


SID, bits [5:4] 


Secure invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is 
Non-secure state. 

10 Implemented and disabled. ExternalSecureInvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalSecureInvasiveDebugEnabled() == TRUE. 


Other values are reserved. 


NSNID, bits [3:2] 


Non-secure non-invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is Secure 
state. 

10 Implemented and disabled. ExternalNoninvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalNoninvasiveDebugEnabled() == TRUE. 


Other values are reserved. 


NSID, bits [1:0] 


Non-secure invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is Secure 
state. 

10 Implemented and disabled. ExternalInvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalInvasiveDebugEnabled() == TRUE. 


Other values are reserved. 
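
All four fields share the same two-bit encoding, so a single decoder can serve each of them. The sketch below assumes hypothetical enum and function names and treats the 01 encoding as reserved, as stated above.

```c
#include <stdint.h>

/* Meaning of one two-bit DBGAUTHSTATUS field. Names are illustrative only. */
typedef enum {
    DBG_AUTH_NOT_IMPLEMENTED,  /* 00 */
    DBG_AUTH_DISABLED,         /* 10 */
    DBG_AUTH_ENABLED,          /* 11 */
    DBG_AUTH_RESERVED          /* 01 */
} dbg_auth_state;

/* Decode the two least significant bits of 'field'. */
static dbg_auth_state dbg_auth_decode(uint32_t field)
{
    switch (field & 0x3u) {
    case 0x0: return DBG_AUTH_NOT_IMPLEMENTED;
    case 0x2: return DBG_AUTH_DISABLED;
    case 0x3: return DBG_AUTH_ENABLED;
    default:  return DBG_AUTH_RESERVED;
    }
}

/* Per-field accessors for a raw DBGAUTHSTATUS value. */
static dbg_auth_state dbgauthstatus_snid(uint32_t v)  { return dbg_auth_decode(v >> 6); }
static dbg_auth_state dbgauthstatus_sid(uint32_t v)   { return dbg_auth_decode(v >> 4); }
static dbg_auth_state dbgauthstatus_nsnid(uint32_t v) { return dbg_auth_decode(v >> 2); }
static dbg_auth_state dbgauthstatus_nsid(uint32_t v)  { return dbg_auth_decode(v); }
```

A value of 0xAF, for instance, reports both Secure debug fields as implemented and disabled, and both Non-secure debug fields as implemented and enabled.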


Accessing the DBGAUTHSTATUS: 
To access the DBGAUTHSTATUS: 


MRC p14,0,<Rt>,c7,c14,6 ; Read DBGAUTHSTATUS into Rt 

Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0111  1110  110

G6.3.2 DBGBCR<n>, Debug Breakpoint Control Registers, n = 0 - 15 
The DBGBCR<n> characteristics are: 


Purpose 


Holds control information for a breakpoint. Forms breakpoint n together with value register 
DBGBVR<n>. If EL2 is implemented and this breakpoint supports Context matching, 
DBGBVR<n> can be associated with a Breakpoint Extended Value Register DBGBXVR<n> for 
VMID matching. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2 (NS)
-    RW   RW

Traps and Enables 
For a description of the prioritization of any generated exceptions, see: 
• Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816 
for exceptions taken to AArch32 state. 
• Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 for 
exceptions taken to AArch64 state. 
• Software Access debug event on page H3-4903 for accesses to this register taken to Debug 
state. 

Subject to the prioritization rules: 

• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 

• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2. 

• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3. 

• If EDSCR.TDA==1, DBGOSLSR.OSLK==0, and halting is allowed, EL1 and EL2 accesses 
to this register generate a Software Access debug event. 

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGBCR<n> is architecturally mapped to AArch64 System register 
DBGBCR<n>_EL1. 


AArch32 System register DBGBCR<n> is architecturally mapped to External register 
DBGBCR<n>_EL1. 


If breakpoint n is not implemented then this register is unallocated. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGBCR<n> is a 32-bit register. 

Field descriptions 
The DBGBCR<n> bit assignments are: 

31  24 23 20 19 16 15 14 13   12  9 8   5 4   3 2   1 0
RES0   BT    LBN   SSC   HMC  RES0  BAS   RES0  PMC   E


When the E field is zero, all the other fields in the register are ignored. 
Bits [31:24] 
Reserved, RES0. 


BT, bits [23:20] 
Breakpoint Type. Possible values are: 





0000 Unlinked instruction address match. 
0001 Linked instruction address match. 
0010 Unlinked context ID match. 
0011 Linked context ID match. 
0100 Unlinked instruction address mismatch. 
0101 Linked instruction address mismatch. 
1000 Unlinked VMID match. 
1001 Linked VMID match. 
1010 Unlinked VMID and context ID match. 
1011 Linked VMID and context ID match. 


The field breaks down as follows: 
• BT[3:1]: Base type. 

000 Match address. DBGBVR<n> is the address of an instruction. 
010 Mismatch address. DBGBVR<n> is the address of an instruction to be stepped. 
001 Match context ID. DBGBVR<n> is a context ID. 
100 Match VMID. DBGBXVR<n>.VMID is a VMID. 
101 Match VMID and context ID. DBGBVR<n> is a context ID, and DBGBXVR<n>.VMID is a VMID. 

• BT[0]: Enable linking. 


All other values are reserved. Constraints on breakpoint programming mean other values are 
reserved under some conditions. For more information, including the effect of programming this 
field to a reserved value, see Reserved DBGBCR<n>.BT values on page G2-3955. 


This field resets to a value that is architecturally UNKNOWN. 


LBN, bits [19:16] 


Linked breakpoint number. For Linked address matching breakpoints, this specifies the index of the 
Context-matching breakpoint linked to. 


For all other breakpoint types this field is ignored and reads of the register return an UNKNOWN 
value. 


This field is ignored when the value of DBGBCR<n>.E is 0. 


This field resets to a value that is architecturally UNKNOWN. 

SSC, bits [15:14] 


Security state control. Determines the Security states under which a Breakpoint debug event for 
breakpoint n is generated. This field must be interpreted along with the HMC and PMC fields, and 
there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more 
information, including the effect of programming the fields to a reserved set of values, see Usage 
constraints on page G2-3955. 


This field resets to a value that is architecturally UNKNOWN. 


HMC, bit [13] 


Higher mode control. Determines the debug perspective for deciding when a Breakpoint debug 
event for breakpoint n is generated. This field must be interpreted along with the SSC and PMC 
fields, and there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more 
information see the SSC, bits [15:14] description. 


This field resets to a value that is architecturally UNKNOWN. 


Bits [12:9] 


Reserved, RES0. 


BAS, bits [8:5] 


Byte address select. Defines which half-words an address-matching breakpoint matches, regardless 
of the instruction set and Execution state. 


The permitted values depend on the breakpoint type. 


For Address match breakpoints, the permitted values are: 





BAS   Match instruction at  Constraint for debuggers
0011  DBGBVR<n>             Use for T32 instructions.
1100  DBGBVR<n>+2           Use for T32 instructions.
1111  DBGBVR<n>             Use for A32 instructions.





All other values are reserved. For more information, see Reserved DBGBCR<n>.{SSC, HMC, 
PMC} values on page G2-3956. 


For more information on using the BAS field in Address Match breakpoints, see Using the BAS field 
in Address Match breakpoints on page G2-3949. 


For Address mismatch breakpoints in an AArch32 stage 1 translation regime, the permitted values 
are: 





BAS   Step instruction at  Constraint for debuggers
0000  -                    Use for a match anywhere breakpoint.
0011  DBGBVR<n>            Use for stepping T32 instructions.
1100  DBGBVR<n>+2          Use for stepping T32 instructions.
1111  DBGBVR<n>            Use for stepping A32 instructions.





All other values are reserved. For more information, see Reserved DBGBCR<n>.{SSC, HMC, 
PMC} values on page G2-3956. 


For more information on using the BAS field in address mismatch breakpoints, see Using the BAS 
field in Address Match breakpoints on page G2-3949. 


For Context matching breakpoints, this field is RES1 and ignored. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 

Bits [4:3] 
Reserved, RES0. 
PMC, bits [2:1] 


Privilege mode control. Determines the Exception level or levels at which a Breakpoint debug event 
for breakpoint n is generated. This field must be interpreted along with the SSC and HMC fields, 
and there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more 
information see the DBGBCR<n>.SSC description. 


This field resets to a value that is architecturally UNKNOWN. 
E, bit [0] 

Enable breakpoint DBGBVR<n>. Possible values are: 

0 Breakpoint disabled. 

1 Breakpoint enabled. 


This field resets to a value that is architecturally UNKNOWN. 
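
To show how the fields described above pack into one register value, the sketch below assembles a DBGBCR<n> value from its individual fields. The function name is hypothetical, and the code only places bits: it does not check the constraints on the {HMC, SSC, PMC} combination or on reserved BT and BAS values.

```c
#include <stdint.h>

/* Pack the DBGBCR<n> fields into a 32-bit value. Reserved bits are left 0.
 * No validity checking of the field combination is performed. */
static uint32_t dbgbcr_make(unsigned bt, unsigned lbn, unsigned ssc,
                            unsigned hmc, unsigned bas, unsigned pmc,
                            unsigned e)
{
    return ((uint32_t)(bt  & 0xFu) << 20)   /* BT,  bits [23:20] */
         | ((uint32_t)(lbn & 0xFu) << 16)   /* LBN, bits [19:16] */
         | ((uint32_t)(ssc & 0x3u) << 14)   /* SSC, bits [15:14] */
         | ((uint32_t)(hmc & 0x1u) << 13)   /* HMC, bit [13] */
         | ((uint32_t)(bas & 0xFu) << 5)    /* BAS, bits [8:5] */
         | ((uint32_t)(pmc & 0x3u) << 1)    /* PMC, bits [2:1] */
         | (uint32_t)(e & 0x1u);            /* E,   bit [0] */
}
```

For example, an unlinked instruction address match (BT 0000) with BAS 1111, PMC 11, and E set packs to 0x000001E7.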


Accessing the DBGBCR<n>: 
To access the DBGBCR<n>: 


MRC p14,0,<Rt>,c0,<CRm>,5 ; Read DBGBCR<n> into Rt, where n is in the range 0 to 15 
MCR p14,0,<Rt>,c0,<CRm>,5 ; Write Rt to DBGBCR<n>, where n is in the range 0 to 15 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm     opc2
1110    000   0000  n<3:0>  101
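
The encoding table gives the operand values, with CRm carrying the breakpoint number n. As a hedged illustration, the sketch below places those values into a 32-bit A32 MRC or MCR instruction word. The instruction bit layout (condition in bits [31:28], opc1 in [23:21], the read/write bit at [20], CRn in [19:16], Rt in [15:12], coproc in [11:8], opc2 in [7:5], CRm in [3:0]) is not given in this section and is an assumption to check against the MRC and MCR instruction descriptions. The function names are hypothetical, and the condition is fixed to "always".

```c
#include <stdbool.h>
#include <stdint.h>

/* Build an unconditional A32 MRC (is_read) or MCR (!is_read) instruction
 * word. The bit layout is an assumption, see the accompanying text. */
static uint32_t a32_mrc_mcr(bool is_read, unsigned coproc, unsigned opc1,
                            unsigned rt, unsigned crn, unsigned crm,
                            unsigned opc2)
{
    return 0xEE000010u
         | ((uint32_t)(opc1 & 0x7u) << 21)
         | (is_read ? (uint32_t)1 << 20 : 0u)
         | ((uint32_t)(crn & 0xFu) << 16)
         | ((uint32_t)(rt & 0xFu) << 12)
         | ((uint32_t)(coproc & 0xFu) << 8)
         | ((uint32_t)(opc2 & 0x7u) << 5)
         | (uint32_t)(crm & 0xFu);
}

/* MRC p14,0,<Rt>,c0,<CRm>,5 with CRm = n: read DBGBCR<n> into Rt. */
static uint32_t a32_read_dbgbcr(unsigned n, unsigned rt)
{
    return a32_mrc_mcr(true, 14, 0, rt, 0, n, 5);
}
```

Changing opc2 from 5 to 4 in the same helper would address DBGBVR<n> instead, following the encoding table for that register.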
G6.3.3 DBGBVR<n>, Debug Breakpoint Value Registers, n = 0 - 15 
The DBGBVR<n> characteristics are: 


Purpose 


Holds a value for use in breakpoint matching, either the virtual address of an instruction or a context 
ID. Forms breakpoint n together with control register DBGBCR<n>. If EL2 is implemented and this 
breakpoint supports Context matching, DBGBVR<n> can be associated with a Breakpoint 
Extended Value Register DBGBXVR<n> for VMID matching. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2 (NS)
-    RW   RW

Traps and Enables 
For a description of the prioritization of any generated exceptions, see: 
• Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816 
for exceptions taken to AArch32 state. 
• Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 for 
exceptions taken to AArch64 state. 
• Software Access debug event on page H3-4903 for accesses to this register taken to Debug 
state. 

Subject to the prioritization rules: 
• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3. 
• If EDSCR.TDA==1, DBGOSLSR.OSLK==0, and halting is allowed, EL1 and EL2 accesses 
to this register generate a Software Access debug event. 

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register DBGBVR<n> is architecturally mapped to AArch64 System register 
DBGBVR<n>_EL1[31:0]. 


— Note 
Writes to DBGBVR<n> do not modify DBGBVR<n>_EL1[63:32]. 





AArch32 System register DBGBVR<n> is architecturally mapped to External register 
DBGBVR<n>_EL1[31:0]. 


If breakpoint n is not implemented then this register is unallocated. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 
Attributes 
How this register is interpreted depends on the value of DBGBCR<n>.BT. 


• When DBGBCR<n>.BT is 0b0x0x, this register holds a virtual address. 
• When DBGBCR<n>.BT is 0b001x or 0b101x, this register holds a Context ID. 


For other values of DBGBCR<n>.BT, this register is RES0. 
Some breakpoints might not support Context ID comparison. For more information, see the 
description of the DBGDIDR.CTX_CMPs field. 
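
The BT patterns above reduce to two mask tests. The following sketch, with hypothetical enum and function names, classifies what DBGBVR<n> holds for a given DBGBCR<n>.BT value. It does not check whether the BT value itself is reserved or whether the breakpoint supports Context ID comparison.

```c
/* What DBGBVR<n> holds for a given DBGBCR<n>.BT value. Names are illustrative. */
typedef enum {
    DBGBVR_VIRTUAL_ADDRESS,  /* BT is 0b0x0x */
    DBGBVR_CONTEXT_ID,       /* BT is 0b001x or 0b101x */
    DBGBVR_RES0              /* any other BT value */
} dbgbvr_kind;

static dbgbvr_kind dbgbvr_kind_for_bt(unsigned bt)
{
    bt &= 0xFu;
    if ((bt & 0xAu) == 0x0u)   /* bits 3 and 1 both clear: 0b0x0x */
        return DBGBVR_VIRTUAL_ADDRESS;
    if ((bt & 0x6u) == 0x2u)   /* bit 2 clear, bit 1 set: 0bx01x */
        return DBGBVR_CONTEXT_ID;
    return DBGBVR_RES0;
}
```

BT values 0b100x (VMID match only) fall into the RES0 case for DBGBVR<n>, since the VMID is held in DBGBXVR<n>.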


Field descriptions 


The DBGBVR<n> bit assignments are: 


When DBGBCR<n>.BT==0b0x0x: 

31        2 1    0
VA[31:2]    RES0


VA[31:2], bits [31:2] 
Bits[31:2] of the address value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 


Bits [1:0] 
Reserved, RES0. 


When DBGBCR<n>.BT==0b001x: 

31         0
ContextID


ContextID, bits [31:0] 
Context ID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 


When DBGBCR<n>.BT==0b101x and EL2 implemented: 

31         0
ContextID


ContextID, bits [31:0] 
Context ID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 

Accessing the DBGBVR<n>: 
To access the DBGBVR<n>: 


MRC p14,0,<Rt>,c0,<CRm>,4 ; Read DBGBVR<n> into Rt, where n is in the range 0 to 15 
MCR p14,0,<Rt>,c0,<CRm>,4 ; Write Rt to DBGBVR<n>, where n is in the range 0 to 15 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm     opc2
1110    000   0000  n<3:0>  100

G6.3.4 DBGBXVR<n>, Debug Breakpoint Extended Value Registers, n = 0 - 15 
The DBGBXVR<n> characteristics are: 


Purpose 


Holds a value for use in breakpoint matching, to support VMID matching. Used in conjunction with 
a control register DBGBCR<n> and a value register DBGBVR<n>, where EL2 is implemented and 
breakpoint n supports Context matching. 


Usage constraints 


DBGBXVR<n> is implemented only if the implementation includes EL2 and the breakpoint 
supports Context matching. Otherwise it is unallocated. 


When DBGBXVR<n> is implemented, if EL3 is implemented and is using AArch32, this register 
is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





When DBGBXVR<n> is implemented, if EL3 is not implemented or EL3 is implemented and is 
using AArch64, this register is accessible as follows: 





EL0  EL1  EL2 (NS)
-    RW   RW

Traps and Enables 

For a description of the prioritization of any generated exceptions, see: 

• Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816 
for exceptions taken to AArch32 state. 

• Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 for 
exceptions taken to AArch64 state. 

• Software Access debug event on page H3-4903 for accesses to this register taken to Debug 
state. 

Subject to the prioritization rules: 
• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3. 
• If EDSCR.TDA==1, DBGOSLSR.OSLK==0, and halting is allowed, EL1 and EL2 accesses 
to this register generate a Software Access debug event. 

Configurations 


When DBGBXVR<n> is implemented, there is one instance of the register that is used in both 
Secure and Non-secure states. 


AArch32 System register DBGBXVR<n> is architecturally mapped to AArch64 System register 
DBGBVR<n>_EL1[63:32]. 


— Note 
Writes to DBGBXVR<n> do not modify DBGBVR<n>_EL1[31:0]. 





AArch32 System register DBGBXVR<n> is architecturally mapped to External register 
DBGBVR<n>_EL1[63:32]. 

This register is unallocated in any of the following cases: 


• Breakpoint n is not implemented. 
• Breakpoint n does not support Context matching. 
• EL2 is not implemented. 


For more information, see the description of the DBGDIDR.CTX_CMPs field. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 


When DBGBCR<n>.BT is 0b10xx, this register holds a VMID. For other values of 
DBGBCR<n>.BT, this register is RES0. 


Field descriptions 


The DBGBXVR<n> bit assignments are: 


When DBGBCR<n>.BT==0b10xx and EL2 implemented: 

31    8 7    0
RES0    VMID


Bits [31:8] 


Reserved, RES0. 


VMID, bits [7:0] 
VMID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the DBGBXVR<n>: 
To access the DBGBXVR<n>: 


MRC p14,0,<Rt>,c1,<CRm>,1 ; Read DBGBXVR<n> into Rt, where n is in the range 0 to 15 
MCR p14,0,<Rt>,c1,<CRm>,1 ; Write Rt to DBGBXVR<n>, where n is in the range 0 to 15 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm     opc2
1110    000   0001  n<3:0>  001

G6.3.5 DBGCLAIMCLR, Debug Claim Tag Clear register 
The DBGCLAIMCLR characteristics are: 
Purpose 


Used by software to read the values of the CLAIM tag bits, and to clear these bits to 0. 
The architecture does not define any functionality for the CLAIM tag bits. 


— Note 


CLAIM tags are typically used for communication between the debugger and target software. 





Used in conjunction with the DBGCLAIMSET register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





ELO EL1_ EL2 (NS) 





- RW RW 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGCLAIMCLR is architecturally mapped to AArch64 System register 
DBGCLAIMCLR_EL1. 


AArch32 System register DBGCLAIMCLR is architecturally mapped to External register 
DBGCLAIMCLR_EL1. 


An implementation must include 8 CLAIM tag bits. 


This register is in the Cold reset domain. See the CLAIM field description for the effect of a Cold 
reset on the value returned by this register. This register is not affected by a Warm reset. 


Attributes 
DBGCLAIMCLR is a 32-bit register. 


Field descriptions 


The DBGCLAIMCLR bit assignments are: 
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31 8 7 0 
RAZ/SBZ CLAIM 
Bits [31:8] 


Reserved, RAZ/SBZ. Software can rely on these bits reading as zero, and must use a should-be-zero 
policy on writes. Implementations must ignore writes. 


CLAIM, bits [7:0] 
Read or clear CLAIM tag bits. Reading this field returns the current value of the CLAIM tag bits. 


Writing a 1 to one of these bits clears the corresponding CLAIM tag bit to 0. This is an indirect write
to the CLAIM tag bits. A single write operation can clear multiple CLAIM tag bits to 0. 


Writing 0 to one of these bits has no effect. 
A cold reset clears the CLAIM tag bits to 0. 


Accessing the DBGCLAIMCLR: 
To access the DBGCLAIMCLR: 


MRC p14,0,<Rt>,c7,c9,6 ; Read DBGCLAIMCLR into Rt 
MCR p14,0,<Rt>,c7,c9,6 ; Write Rt to DBGCLAIMCLR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0111  1001  110
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G6.3.6 DBGCLAIMSET, Debug Claim Tag Set register 
The DBGCLAIMSET characteristics are: 
Purpose 


Used by software to set the CLAIM tag bits to 1. 
The architecture does not define any functionality for the CLAIM tag bits. 


— Note 


CLAIM tags are typically used for communication between the debugger and target software. 





Used in conjunction with the DBGCLAIMCLR register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGCLAIMSET is architecturally mapped to AArch64 System register 
DBGCLAIMSET_EL1. 


AArch32 System register DBGCLAIMSET is architecturally mapped to External register 
DBGCLAIMSET_EL1. 


An implementation must include 8 CLAIM tag bits. 


Attributes 
DBGCLAIMSET is a 32-bit register. 


Field descriptions 


The DBGCLAIMSET bit assignments are: 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G6-4683 
1ID092916 Non-Confidential 


G6 AArch32 System Register Descriptions 


G6.3 Debug registers 
31 8 7 0 
RAZ/SBZ CLAIM 
Bits [31:8] 


Reserved, RAZ/SBZ. Software can rely on these bits reading as zero, and must use a should-be-zero 
policy on writes. Implementations must ignore writes. 


CLAIM, bits [7:0] 
Set CLAIM tag bits. RAO. 


Writing a 1 to one of these bits sets the corresponding CLAIM tag bit to 1. This is an indirect write 
to the CLAIM tag bits. A single write operation can set multiple CLAIM tag bits to 1. 


Writing 0 to one of these bits has no effect. 
A cold reset clears the CLAIM tag bits to 0. 
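
The set and clear semantics of the CLAIM fields in DBGCLAIMSET and DBGCLAIMCLR can be illustrated with a small software model. This is a sketch only, assuming the 8 CLAIM tag bits are held in one byte. The type claim_tags_t and all function names are hypothetical and do not come from the architecture.

```c
#include <stdint.h>

/* Hypothetical model of the 8 CLAIM tag bits shared by both registers. */
typedef struct {
    uint8_t claim;
} claim_tags_t;

/* A Cold reset clears the CLAIM tag bits to 0. */
static void claim_cold_reset(claim_tags_t *t)
{
    t->claim = 0;
}

/* DBGCLAIMSET write: each 1 in bits [7:0] sets the matching tag; 0 has no effect.
 * Bits [31:8] are ignored. */
static void dbgclaimset_write(claim_tags_t *t, uint32_t value)
{
    t->claim |= (uint8_t)(value & 0xFFu);
}

/* DBGCLAIMSET read: CLAIM is RAO and bits [31:8] read as zero. */
static uint32_t dbgclaimset_read(const claim_tags_t *t)
{
    (void)t;
    return 0xFFu;
}

/* DBGCLAIMCLR write: each 1 in bits [7:0] clears the matching tag; 0 has no effect. */
static void dbgclaimclr_write(claim_tags_t *t, uint32_t value)
{
    t->claim &= (uint8_t)~(value & 0xFFu);
}

/* DBGCLAIMCLR read: returns the current value of the CLAIM tag bits. */
static uint32_t dbgclaimclr_read(const claim_tags_t *t)
{
    return t->claim;
}
```

In this model, software reads the current tags only through DBGCLAIMCLR. A read of DBGCLAIMSET returns ones in bits [7:0], consistent with the requirement that an implementation includes 8 CLAIM tag bits.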


Accessing the DBGCLAIMSET: 
To access the DBGCLAIMSET: 


MRC p14,0,<Rt>,c7,c8,6 ; Read DBGCLAIMSET into Rt 
MCR p14,0,<Rt>,c7,c8,6 ; Write Rt to DBGCLAIMSET 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0111  1000  110
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G6.3.7 DBGDCCINT, DCC Interrupt Enable Register 
The DBGDCCINT characteristics are: 


Purpose 


Enables interrupt requests to be signaled based on the DCC status flags. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGDCCINT is architecturally mapped to AArch64 System register 
MDCCINT_EL1. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
DBGDCCINT is a 32-bit register. 


Field descriptions 


The DBGDCCINT bit assignments are: 


31    30  29  28                      0
RES0  RX  TX            RES0
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Bit [31] 


Reserved, RES0.


RX, bit [30] 


DCC interrupt request enable control for DTRRX. Enables a common COMMIRQ interrupt request 
to be signaled based on the DCC status flags. 


0 No interrupt request generated by DTRRX.
1 Interrupt request will be generated on RXfull == 1.


If legacy COMMRX and COMMTX signals are implemented, then these are not affected by the 
value of this bit. 


When this register has an architecturally-defined reset value, this field resets to 0. 
TX, bit [29] 


DCC interrupt request enable control for DTRTX. Enables a common COMMIRQ interrupt request
to be signaled based on the DCC status flags. 


0 No interrupt request generated by DTRTX. 
1 Interrupt request will be generated on TXfull == 0. 


If legacy COMMRX and COMMTX signals are implemented, then these are not affected by the 
value of this bit. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [28:0] 


Reserved, RES0.
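
The RX and TX enable bits described above combine with the DCC status flags to produce the common COMMIRQ request. The following C sketch models that combination under the assumption that the two conditions are simply ORed together. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBGDCCINT_RX (1u << 30)  /* enable: request when RXfull == 1 */
#define DBGDCCINT_TX (1u << 29)  /* enable: request when TXfull == 0 */

/* Returns true when the modelled COMMIRQ interrupt request would be signaled.
 * rxfull and txfull stand for the DCC status flags. */
static bool commirq_requested(uint32_t dbgdccint, bool rxfull, bool txfull)
{
    bool rx_request = ((dbgdccint & DBGDCCINT_RX) != 0) && rxfull;
    bool tx_request = ((dbgdccint & DBGDCCINT_TX) != 0) && !txfull;
    return rx_request || tx_request;
}
```

Note the asymmetry: the RX condition fires when data is waiting to be read, while the TX condition fires when the transmit register has room. Any legacy COMMRX and COMMTX signals are outside this model, because they are not affected by these bits.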


Accessing the DBGDCCINT: 
To access the DBGDCCINT: 


MRC p14,0,<Rt>,c0,c2,0 ; Read DBGDCCINT into Rt
MCR p14,0,<Rt>,c0,c2,0 ; Write Rt to DBGDCCINT


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0010  000
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G6.3.8 DBGDEVID, Debug Device ID register 0 
The DBGDEVID characteristics are: 


Purpose 
Adds to the information given by the DBGDIDR by describing other features of the debug 
implementation. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:


• If HDCR.TDA==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL1 and EL2 are trapped to
  EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


This register is required in all implementations. 


Attributes 
DBGDEVID is a 32-bit register. 


Field descriptions 


The DBGDEVID bit assignments are: 


31     28 27     24 23       20 19      16 15        12 11        8 7         4 3       0
CIDMask   AuxRegs   DoubleLock  VirtExtns  VectorCatch  BPAddrMask  WPAddrMask  PCSample
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CIDMask, bits [31:28] 


Indicates the level of support for the Context ID matching breakpoint masking capability. Permitted 
values of this field are: 


0000 Context ID masking is not implemented. 

0001 Context ID masking is implemented. 

All other values are reserved. The value of this for ARMv8 is 0000. 
AuxRegs, bits [27:24] 

Indicates support for Auxiliary registers. Permitted values for this field are: 

0000 None supported. 

0001 Support for External Debug Auxiliary Control Register, EDACR. 


All other values are reserved. 


DoubleLock, bits [23:20] 
Indicates the presence of the DBGOSDLR, OS Double Lock Register. Permitted values of this field
are:
0000 The DBGOSDLR is not present.
0001 The DBGOSDLR is present.


All other values are reserved. The value of this for ARMv8 is 0001. 


VirtExtns, bits [19:16] 
Indicates whether EL2 is implemented. Permitted values of this field are: 
0000 EL2 is not implemented. 
0001 EL2 is implemented. 


All other values are reserved. 


VectorCatch, bits [15:12] 
Defines the form of Vector Catch exception implemented. Permitted values of this field are: 
0000 Address matching Vector Catch exception implemented. 
0001 Exception matching Vector Catch exception implemented. 


All other values are reserved. 


BPAddrMask, bits [11:8] 


Indicates the level of support for the instruction address matching breakpoint masking capability. 
Permitted values of this field are: 


0000 Breakpoint address masking might be implemented. If not implemented, 
DBGBCR<n>[28:24] is RAZ/WI. 

0001 Breakpoint address masking is implemented. 

1111 Breakpoint address masking is not implemented. DBGBCR<n>[28:24] is RES0.


All other values are reserved. The value of this for ARMv8 is 1111. 


WPAddrMask, bits [7:4] 


Indicates the level of support for the data address matching watchpoint masking capability. 
Permitted values of this field are: 


0000 Watchpoint address masking might be implemented. If not implemented,
DBGWCR<n>.MASK (Address mask) is RAZ/WI.

0001 Watchpoint address masking is implemented.

1111 Watchpoint address masking is not implemented. DBGWCR<n>.MASK (Address
mask) is RES0.


All other values are reserved. The value of this for ARMv8 is 0001. 
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PCSample, bits [3:0] 


Indicates the level of PC Sample-based profiling support using external debug registers 40 through
43. Permitted values of this field in ARMv8 are:


0000 Architecture-defined form of PC Sample-based profiling not implemented. 

0010 EDPCSR and EDCIDSR are implemented (only permitted if EL3 and EL2 are not 
implemented). 

0011 EDPCSR, EDCIDSR, and EDVIDSR are implemented. 


All other values are reserved. 
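
Because every DBGDEVID field is a 4-bit value on a nibble boundary, decoding the register is a sequence of shifts and masks. The following C sketch shows one way to do this. The structure dbgdevid_fields_t and the function names are hypothetical, and the sketch does not check for reserved values.

```c
#include <stdint.h>

/* Hypothetical container for the eight 4-bit DBGDEVID fields. */
typedef struct {
    unsigned cid_mask;      /* bits [31:28] */
    unsigned aux_regs;      /* bits [27:24] */
    unsigned double_lock;   /* bits [23:20] */
    unsigned virt_extns;    /* bits [19:16] */
    unsigned vector_catch;  /* bits [15:12] */
    unsigned bp_addr_mask;  /* bits [11:8]  */
    unsigned wp_addr_mask;  /* bits [7:4]   */
    unsigned pc_sample;     /* bits [3:0]   */
} dbgdevid_fields_t;

/* Extract the 4-bit field whose least significant bit is at position lsb. */
static unsigned nibble(uint32_t value, unsigned lsb)
{
    return (unsigned)((value >> lsb) & 0xFu);
}

static dbgdevid_fields_t decode_dbgdevid(uint32_t value)
{
    dbgdevid_fields_t f;
    f.cid_mask     = nibble(value, 28);
    f.aux_regs     = nibble(value, 24);
    f.double_lock  = nibble(value, 20);
    f.virt_extns   = nibble(value, 16);
    f.vector_catch = nibble(value, 12);
    f.bp_addr_mask = nibble(value, 8);
    f.wp_addr_mask = nibble(value, 4);
    f.pc_sample    = nibble(value, 0);
    return f;
}
```

As an illustration, the made-up value 0x00111F13 decodes to DoubleLock 0001, VirtExtns 0001, VectorCatch 0001, BPAddrMask 1111, WPAddrMask 0001, and PCSample 0011.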


Accessing the DBGDEVID: 
To access the DBGDEVID: 
MRC p14,0,<Rt>,c7,c2,7 ; Read DBGDEVID into Rt 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0111  0010  111
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G6.3.9 DBGDEVID1, Debug Device ID register 1 


The DBGDEVID1 characteristics are: 


Purpose 


Adds to the information given by the DBGDIDR by describing other features of the debug 
implementation. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL1 and EL2 are trapped to
  EL3.


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


This register is required in all implementations. 


Attributes 
DBGDEVID1 is a 32-bit register. 


Field descriptions 


The DBGDEVID1 bit assignments are: 


31                    4 3          0
         RES0           PCSROffset

Bits [31:4]

Reserved, RES0.
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PCSROffset, bits [3:0] 


This field indicates the offset applied to PC samples returned by reads of EDPCSR. Permitted values
of this field in ARMv8 are:


0000 EDPCSR not implemented. 

0010 EDPCSR implemented. Samples have no offset applied and do not sample the 
instruction set state in AArch32 state. 
— Note 


In ARMv7, a PCSROffset value of 0000 has an alternative meaning that EDPCSR is
implemented and returns values that have an offset applied and indicate the Instruction
set state. This implementation option is not permitted in ARMv8.





Accessing the DBGDEVID1: 
To access the DBGDEVID1: 
MRC p14,0,<Rt>,c7,c1,7 ; Read DBGDEVID1 into Rt 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0111  0001  111










G6.3.10 DBGDEVID2, Debug Device ID register 2 
The DBGDEVID2 characteristics are: 


Purpose 
Reserved for future descriptions of features of the debug implementation. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL1 are trapped to
  EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL1 and EL2 are trapped to
  EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 
DBGDEVID2 is a 32-bit register. 


Field descriptions 


The DBGDEVID2 bit assignments are: 


31                0
       RES0

Bits [31:0]
Reserved, RES0.


Accessing the DBGDEVID2: 
To access the DBGDEVID2: 
MRC p14,0,<Rt>,c7,c0,7 ; Read DBGDEVID2 into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0111  0000  111
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G6.3.11 DBGDIDR, Debug ID Register 
The DBGDIDR characteristics are: 


Purpose 


Specifies which version of the Debug architecture is implemented, and some features of the debug 
implementation. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2 (NS)
Config-RO  RO   RO

ARM deprecates any access to this register from EL0.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If DBGDSCRext.UDCCdis==1, read accesses to this register from EL0 are trapped to
  Undefined mode.
• If HDCR.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL0, EL1, and EL2 are trapped
  to EL3.
• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


If EL1 cannot use AArch32 then the implementation of this register is OPTIONAL and deprecated. 


Attributes 
DBGDIDR is a 32-bit register. 


Field descriptions 


The DBGDIDR bit assignments are: 
31  28 27  24 23     20 19    16 15    14         13    12      11       0
WRPs   BRPs   CTX_CMPs  Version  RES1  nSUHD_imp  RES0  SE_imp    RES0


WRPs, bits [31:28] 
The number of watchpoints implemented, minus 1. 


Permitted values of this field are from 0b0001 for 2 implemented watchpoints, to 0b1111 for 16 
implemented watchpoints. 


The value of 0b0000 is reserved. 
If AArch64 is implemented, this field has the same value as ID_AA64DFR0_EL1.WRPs.


BRPs, bits [27:24] 
The number of breakpoints implemented, minus 1. 


Permitted values of this field are from 0b0001 for 2 implemented breakpoints, to 0b1111 for 16
implemented breakpoints.

The value of 0b0000 is reserved.
If AArch64 is implemented, this field has the same value as ID_AA64DFR0_EL1.BRPs.


CTX_CMPs, bits [23:20] 
The number of breakpoints that can be used for Context matching, minus 1. 


Permitted values of this field are from 0b0000 for 1 Context matching breakpoint, to 0b1111 for 16 
Context matching breakpoints. 


The Context matching breakpoints must be the highest addressed breakpoints. For example, if six 
breakpoints are implemented and two are Context matching breakpoints, they must be breakpoints 
4 and 5. 


If AArch64 is implemented, this field has the same value as ID_AA64DFR0_EL1.CTX_CMPs.


Version, bits [19:16] 


The Debug architecture version. Defined values are: 


0001 ARMv6, v6 Debug architecture.

0010 ARMv6, v6.1 Debug architecture.

0011 ARMv7, v7 Debug architecture, with baseline System registers in the (coproc==1110)
encoding space implemented.

0100 ARMv7, v7 Debug architecture, with all System registers in the (coproc==1110)
encoding space implemented.

0101 ARMv7, v7.1 Debug architecture.

0110 ARMv8, v8 Debug architecture.





All other values are reserved. 


Bit [15] 
Reserved, RES1. 

nSUHD_imp, bit [14] 
In ARMv7-A, was Secure User Halting Debug not implemented. 
The value of this bit must match the value of the SE_imp bit. 







Bit [13]
Reserved, RES0.

SE_imp, bit [12]
EL3 implemented. The meanings of the values of this bit are:
0 EL3 not implemented.
1 EL3 implemented.
The value of this bit must match the value of the nSUHD_imp bit.


Bits [11:0] 


Reserved, RES0.
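
The WRPs, BRPs, and CTX_CMPs fields each hold a count minus 1, and the Context matching breakpoints are the highest numbered ones. The following C sketch, with hypothetical names throughout, decodes a DBGDIDR value into counts and derives the index of the first Context matching breakpoint. It assumes the value read is architecturally valid, so that the Context matching count does not exceed the breakpoint count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the information held in DBGDIDR. */
typedef struct {
    unsigned watchpoints;           /* WRPs + 1 */
    unsigned breakpoints;           /* BRPs + 1 */
    unsigned ctx_breakpoints;       /* CTX_CMPs + 1 */
    unsigned first_ctx_breakpoint;  /* lowest Context matching breakpoint number */
    unsigned version;               /* Version, bits [19:16] */
    bool     el3_implemented;       /* SE_imp, bit [12] */
} dbgdidr_info_t;

static dbgdidr_info_t decode_dbgdidr(uint32_t value)
{
    dbgdidr_info_t info;
    info.watchpoints     = (unsigned)((value >> 28) & 0xFu) + 1u;
    info.breakpoints     = (unsigned)((value >> 24) & 0xFu) + 1u;
    info.ctx_breakpoints = (unsigned)((value >> 20) & 0xFu) + 1u;
    /* Context matching breakpoints occupy the highest breakpoint numbers.
     * Assumes ctx_breakpoints <= breakpoints. */
    info.first_ctx_breakpoint = info.breakpoints - info.ctx_breakpoints;
    info.version         = (unsigned)((value >> 16) & 0xFu);
    info.el3_implemented = ((value >> 12) & 1u) != 0;
    return info;
}
```

With BRPs == 0b0101 and CTX_CMPs == 0b0001, the sketch reports six breakpoints, two of them Context matching, starting at breakpoint 4. This matches the example given in the CTX_CMPs description.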


Accessing the DBGDIDR: 
To access the DBGDIDR: 
MRC p14,0,<Rt>,c0,c0,0 ; Read DBGDIDR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0000  000
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G6.3.12 DBGDRAR, Debug ROM Address Register 
The DBGDRAR characteristics are: 


Purpose 


Defines the base physical address of a 4KB-aligned memory-mapped debug component, usually a 
ROM table that locates and describes the memory-mapped debug components in the system. 
ARMv8 deprecates any use of this register.


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2 (NS)
Config-RO  RO   RO





ARMv8 deprecates any use of this register. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If DBGDSCRext.UDCCdis==1, read accesses to this register from EL0 are trapped to
  Undefined mode.
• If HDCR.TDRA==1, Non-secure read accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If MDCR_EL2.TDRA==1, Non-secure read accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGDRAR is architecturally mapped to AArch64 System register
MDRAR_EL1.


If EL1 cannot use AArch32 then the implementation of this register is OPTIONAL and deprecated. 


Attributes 


DBGDRAR is a 64-bit register that can also be accessed as a 32-bit value. If it is accessed as a 32-bit 
register, bits [31:0] are read. 


Field descriptions 


The DBGDRAR bit assignments are: 


When accessing as a 32-bit register: 


31              12 11      2 1     0
  ROMADDR[31:12]     RES0    Valid

ROMADDR[31:12], bits [31:12]
Bits[31:12] of the ROM table physical address. Bits [11:0] of the address are zero. 


In an implementation that includes EL3, ROMADDR is an address in Non-secure memory. It is 
IMPLEMENTATION DEFINED whether the ROM table is also accessible in Secure memory. 


Bits [11:2]
Reserved, RES0.

Valid, bits [1:0] 
This field indicates whether the ROM Table address is valid. The permitted values of this field are: 
00 ROM Table address is not valid. Software must ignore ROMADDR. 
11 ROM Table address is valid. 


Other values are reserved. 


When accessing as a 64-bit register: 


63      48 47              12 11      2 1     0
   RES0      ROMADDR[47:12]     RES0    Valid

Bits [63:48]

Reserved, RES0.

ROMADDR[47:12], bits [47:12]
Bits[47:12] of the ROM table physical address.

If the physical address size in bits (PAsize) is less than 48 then the register bits corresponding to
ROMADDR[47:PAsize] are RES0.


Bits [11:0] of the ROM table physical address are zero. 


ARM strongly recommends that bits ROMADDR[(PAsize-1):32] are zero in any system that 
supports AArch32 at the highest implemented Exception level. 


In an implementation that includes EL3, ROMADDR is an address in Non-secure memory. It is 
IMPLEMENTATION DEFINED whether the ROM table is also accessible in Secure memory. 


Bits [11:2]
Reserved, RES0.

Valid, bits [1:0] 
This field indicates whether the ROM Table address is valid. The permitted values of this field are: 
00 ROM Table address is not valid. Software must ignore ROMADDR. 
11 ROM Table address is valid. 


Other values are reserved. 
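
Software that still reads this deprecated register must check the Valid field before using ROMADDR. The following C sketch shows that check for the 64-bit view. The function name dbgdrar_rom_base is hypothetical, and treating the reserved Valid encodings as not valid is an assumption of the sketch rather than a statement of the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the ROM table physical base address from a 64-bit DBGDRAR value.
 * Returns false, leaving *base untouched, unless Valid == 0b11.
 * Reserved Valid encodings (0b01, 0b10) are treated as not valid (assumption). */
static bool dbgdrar_rom_base(uint64_t dbgdrar, uint64_t *base)
{
    if ((dbgdrar & 0x3u) != 0x3u) {
        return false;               /* software must ignore ROMADDR */
    }
    /* ROMADDR[47:12] supplies address bits [47:12]; bits [11:0] are zero. */
    *base = dbgdrar & 0x0000FFFFFFFFF000ull;
    return true;
}
```

The mask keeps only bits [47:12], so the RES0 bits [63:48] and [11:2] and the Valid field never leak into the returned address.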
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Accessing the DBGDRAR: 
To access the DBGDRAR when accessing as a 32-bit register: 
MRC p14,0,<Rt>,c1,c0,0 ; Read DBGDRAR[31:0] into Rt

Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1110    000   0001  0000  000

To access the DBGDRAR when accessing as a 64-bit register:

MRRC p14,0,<Rt>,<Rt2>,c1 ; Read DBGDRAR[31:0] into Rt and DBGDRAR[63:32] into Rt2

Register access is encoded as follows:

coproc  opc1  CRm
1110    0000  0001
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G6.3.13 DBGDSAR, Debug Self Address Register 
The DBGDSAR characteristics are: 


Purpose 


In earlier versions of the ARM Architecture, this register defines the offset from the base address 
defined in DBGDRAR of the physical base address of the debug registers for the PE. ARMv8 
deprecates any use of this register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2 (NS)
Config-RO  RO   RO





ARMv8 deprecates any use of this register. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If DBGDSCRext.UDCCdis==1, read accesses to this register from EL0 are trapped to
  Undefined mode.
• If HDCR.TDRA==1, Non-secure read accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If MDCR_EL2.TDRA==1, Non-secure read accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL0, EL1, and EL2 are trapped
  to EL3.
• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


If EL1 cannot use AArch32 then the implementation of this register is OPTIONAL and deprecated. 


Attributes 


DBGDSAR is a 64-bit register that can also be accessed as a 32-bit value. If it is accessed as a 32-bit 
register, bits [31:0] are read. 


Field descriptions 


The DBGDSAR bit assignments are: 







When accessing as a 32-bit register: 


31                0
      Offset


Offset, bits [31:0] 
This register value is RAZ. 


When accessing as a 64-bit register:

63                                0
              Offset

Offset, bits [63:0]
This register value is RAZ.

Accessing the DBGDSAR:
To access the DBGDSAR when accessing as a 32-bit register:

MRC p14,0,<Rt>,c2,c0,0 ; Read DBGDSAR[31:0] into Rt

Register access is encoded as follows:

coproc  opc1  CRn   CRm   opc2
1110    000   0010  0000  000

To access the DBGDSAR when accessing as a 64-bit register:

MRRC p14,0,<Rt>,<Rt2>,c2 ; Read DBGDSAR[31:0] into Rt and DBGDSAR[63:32] into Rt2

Register access is encoded as follows:

coproc  opc1  CRm
1110    0000  0010
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G6.3.14 DBGDSCRext, Debug Status and Control Register, External View 
The DBGDSCRext characteristics are:


Purpose 


Main control register for the debug implementation. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGDSCRext is architecturally mapped to AArch64 System register 
MDSCR_EL1. 


This register is required in all implementations. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
DBGDSCRext is a 32-bit register. 


Field descriptions 


The DBGDSCRext bit assignments are: 
Bits      Field
[31]      RES0
[30]      RXfull
[29]      TXfull
[28]      RES0
[27]      RXO
[26]      TXU
[25:24]   RES0
[23:22]   INTdis
[21]      TDA
[20]      RES0
[19]      NS
[18]      SPNIDdis
[17]      SPIDdis
[16]      RES0
[15]      MDBGen
[14]      HDE
[13]      RES0
[12]      UDCCdis
[11:7]    RES0
[6]       ERR
[5:2]     MOE
[1:0]     RES0


Bit [31] 


Reserved, RES0.


RXfull, bit [30] 
DTRRX full. Used for save/restore of EDSCR.RXfull.


When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 


When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.RXfull. 


ARM deprecates use of this bit other than for save/restore. Use DBGDSCRint to access the DTRRX 
full status. 


The architected behavior of this field determines the value it returns after a reset. 


TXfull, bit [29] 
DTRTX full. Used for save/restore of EDSCR.TXfull. 


When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 


When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of 
EDSCR.TXfull. 


ARM deprecates use of this bit other than for save/restore. Use DBGDSCRint to access the DTRTX
full status. 


The architected behavior of this field determines the value it returns after a reset. 





Bit [28] 
Reserved, RES0.

RXO, bit [27] 
Used for save/restore of EDSCR.RXO. 
When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 
When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.RXO. 
The architected behavior of this field determines the value it returns after a reset. 

TXU, bit [26]
Used for save/restore of EDSCR.TXU. 


When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 


When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.TXU. 


The architected behavior of this field determines the value it returns after a reset. 


Bits [25:24] 


Reserved, RES0.


INTdis, bits [23:22] 
Used for save/restore of EDSCR.INTdis. 


When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this field is RO, and software must treat 
it as UNK/SBZP. 


When DBGOSLSR.OSLK == 1 (the OS lock is locked), this field is RW and holds the value of 
EDSCR.INTdis. 


The architected behavior of this field determines the value it returns after a reset. 


TDA, bit [21] 
Used for save/restore of EDSCR.TDA. 


When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 


When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of 
EDSCR.TDA. 


The architected behavior of this field determines the value it returns after a reset. 


Bits [20:19] 
Reserved, RES0.


NS, bit [18] 
Non-secure status. Returns the inverse of IsSecure(). This bit is RO. 


ARM deprecates use of this field. 


SPNIDdis, bit [17] 
Secure privileged profiling disabled status bit. This bit is RO. Permitted values are:
0     Profiling allowed in Secure privileged modes.
1     Profiling prohibited in Secure privileged modes.
Profiling is allowed in Secure privileged modes if either of the following is true:
• EL3 is using AArch32 and SDCR.SPME is set to 1.
• EL3 is using AArch64 and MDCR_EL3.SPME is set to 1.
This bit is RES0 if EL3 is not implemented.
ARM deprecates use of this field.


SPIDdis, bit [16] 


Secure privileged AArch32 invasive self-hosted debug disabled status bit. This bit is RO and 
depends on the value of SDCR.SPD and the pseudocode function 
AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled(). Permitted values are: 


0     Self-hosted debug enabled in Secure privileged AArch32 modes.
1     Self-hosted debug disabled in Secure privileged AArch32 modes.

This bit reads as 1 if any of the following is true and reads as 0 otherwise:
• SDCR.SPD has the value 10.
• SDCR.SPD has the value 00 and
  AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled() returns FALSE.


ARM deprecates use of this field. 
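
The read value of SPIDdis described above can be sketched as a small C function. This is only an illustrative model: the function name `spiddis_value` is hypothetical, and the result of AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled() is assumed to be supplied by the caller as a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the DBGDSCRext.SPIDdis read value.
 * spd:           the 2-bit SDCR.SPD field (0b00 to 0b11).
 * debug_enabled: assumed result of the pseudocode function
 *                AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled().
 */
static unsigned spiddis_value(unsigned spd, bool debug_enabled)
{
    if (spd == 0x2u)                     /* SDCR.SPD has the value 10 */
        return 1u;
    if (spd == 0x0u && !debug_enabled)   /* SDCR.SPD is 00 and the function returns FALSE */
        return 1u;
    return 0u;                           /* reads as 0 otherwise */
}
```

For example, `spiddis_value(0x0u, true)` evaluates to 0, because neither of the two listed conditions holds.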


MDBGen, bit [15] 
Monitor debug events enable. Enable Breakpoint, Watchpoint, and Vector Catch exceptions. 
0 Breakpoint, Watchpoint, and Vector Catch exceptions disabled.
1 Breakpoint, Watchpoint, and Vector Catch exceptions enabled. 


When this register has an architecturally-defined reset value, this field resets to 0. 


HDE, bit [14] 
Used for save/restore of EDSCR.HDE. 
When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 
When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of
EDSCR.HDE. 
The architected behavior of this field determines the value it returns after a reset. 

Bit [13] 


Reserved, RES0.


UDCCdis, bit [12] 
Traps EL0 accesses to the DCC registers to Undefined mode.
0     EL0 accesses to the DCC registers are not trapped to Undefined mode.
1     EL0 accesses to the DBGDSCRint, DBGDTRRXint, DBGDTRTXint, DBGDIDR,
      DBGDSAR, and DBGDRAR are trapped to Undefined mode.

Note

All accesses to these registers are trapped, including LDC and STC accesses to DBGDTRTXint and
DBGDTRRXint, and MRRC accesses to DBGDSAR and DBGDRAR.

Traps of EL0 accesses to the DBGDTRRXint and DBGDTRTXint are ignored in Debug state.


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [11:7] 


Reserved, RES0.


ERR, bit [6] 
Used for save/restore of EDSCR.ERR. 


When DBGOSLSR.OSLK == 0 (the OS lock is unlocked), this bit is RO, and software must treat it 
as UNK/SBZP. 


When DBGOSLSR.OSLK == 1 (the OS lock is locked), this bit is RW and holds the value of 
EDSCR.ERR. 


The architected behavior of this field determines the value it returns after a reset. 
MOE, bits [5:2] 


Method of Entry for debug exception. When a debug exception is taken to an Exception level using 
AArch32, this field is set to indicate the event that caused the exception: 





0001  Breakpoint
0011  Software breakpoint (BKPT) instruction
0101  Vector catch
1010  Watchpoint

This field resets to a value that is architecturally UNKNOWN. 
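
As an illustration of the MOE encoding listed above, the following C sketch extracts bits [5:2] from a DBGDSCRext value and maps the result to the event names in the table. The helper names `dbgdscr_moe` and `dbgdscr_moe_name` are hypothetical, and values that the table does not list are reported generically rather than interpreted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Extract the MOE field, bits [5:2], from a 32-bit DBGDSCRext value. */
static uint32_t dbgdscr_moe(uint32_t dscr)
{
    return (dscr >> 2) & 0xFu;
}

/* Map an MOE value to the event name given in the field description. */
static const char *dbgdscr_moe_name(uint32_t moe)
{
    switch (moe) {
    case 0x1: return "Breakpoint";                              /* 0001 */
    case 0x3: return "Software breakpoint (BKPT) instruction";  /* 0011 */
    case 0x5: return "Vector catch";                            /* 0101 */
    case 0xA: return "Watchpoint";                              /* 1010 */
    default:  return "Value not listed in the MOE table";
    }
}
```

A debug exception handler could call `dbgdscr_moe_name(dbgdscr_moe(value))` on a value previously read with the MRC instruction for this register.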
Bits [1:0] 


Reserved, RES0.


Accessing the DBGDSCRext: 
To access the DBGDSCRext: 


MRC p14,0,<Rt>,c0,c2,2 ; Read DBGDSCRext into Rt
MCR p14,0,<Rt>,c0,c2,2 ; Write Rt to DBGDSCRext


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0010  010
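
To show how the encoding fields in these tables relate to an instruction word, the sketch below assembles the A32 encoding of an MRC or MCR instruction from coproc, opc1, CRn, CRm, and opc2. It assumes the standard A32 MRC/MCR field layout and a condition field of 0b1110 (always); the function name `a32_cp_reg_xfer` is hypothetical and not taken from this manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the A32 encoding of MRC (is_read = true) or MCR (is_read = false).
 * Assumed layout: cond[31:28] 1110[27:24] opc1[23:21] L[20] CRn[19:16]
 *                 Rt[15:12] coproc[11:8] opc2[7:5] 1[4] CRm[3:0].
 */
static uint32_t a32_cp_reg_xfer(bool is_read, uint32_t coproc, uint32_t opc1,
                                uint32_t rt, uint32_t crn, uint32_t crm,
                                uint32_t opc2)
{
    return (UINT32_C(0xE) << 28)           /* cond = 0b1110, always (assumption) */
         | (UINT32_C(0xE) << 24)           /* fixed bits [27:24] */
         | ((opc1 & 0x7u) << 21)
         | ((is_read ? 1u : 0u) << 20)     /* L = 1 for MRC, 0 for MCR */
         | ((crn & 0xFu) << 16)
         | ((rt & 0xFu) << 12)
         | ((coproc & 0xFu) << 8)
         | ((opc2 & 0x7u) << 5)
         | (1u << 4)
         | (crm & 0xFu);
}
```

With the DBGDSCRext values (coproc 0b1110, opc1 0b000, CRn 0b0000, CRm 0b0010, opc2 0b010) and Rt = R0, the read form evaluates to 0xEE100E52 and the write form to 0xEE000E52.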

G6.3.15 DBGDSCRint, Debug Status and Control Register, Internal View


The DBGDSCRint characteristics are: 


Purpose 


Main control register for the debug implementation. This is an internal, read-only view. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2 (NS)
Config-RO  RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If DBGDSCRext.UDCCdis==1, read accesses to this register from EL0 are trapped to
  Undefined mode.
• If HDCR.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL0, EL1, and EL2 are trapped
  to EL3.
• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.


Configurations

There is one instance of this register that is used in both Secure and Non-secure states.

This register is required in all implementations.

DBGDSCRint.{NS, SPNIDdis, SPIDdis, MDBGen, UDCCdis, MOE} are UNKNOWN when the
register is accessed at EL0. However, although these values are not accessible at EL0 by instructions
that are neither UNPREDICTABLE nor return UNKNOWN values, it is permissible for an
implementation to return the values of DBGDSCRext.{NS, SPNIDdis, SPIDdis, MDBGen,
UDCCdis, MOE} for these fields at EL0.

It is also permissible for an implementation to return the same values as defined for a read of
DBGDSCRint at EL1 or above. (This is the case even if the implementation does not support
AArch32 at EL1 or above.)

Attributes

DBGDSCRint is a 32-bit register.

Field descriptions

The DBGDSCRint bit assignments are:

Bits     Field
[31]     RES0
[30]     RXfull
[29]     TXfull
[28:19]  RES0
[18]     NS
[17]     SPNIDdis
[16]     SPIDdis
[15]     MDBGen
[14:13]  RES0
[12]     UDCCdis
[11:6]   RES0
[5:2]    MOE
[1:0]    RES0


Bit [31]

Reserved, RES0.

RXfull, bit [30]

DTRRX full. Read-only view of the equivalent bit in the EDSCR.

TXfull, bit [29]

DTRTX full. Read-only view of the equivalent bit in the EDSCR.

Bits [28:19]

Reserved, RES0.

NS, bit [18]

Non-secure status.

Read-only view of the equivalent bit in the DBGDSCRext. ARM deprecates use of this field.

SPNIDdis, bit [17]

Secure privileged non-invasive debug disable.

Read-only view of the equivalent bit in the DBGDSCRext. ARM deprecates use of this field.

SPIDdis, bit [16]

Secure privileged invasive debug disable.

Read-only view of the equivalent bit in the DBGDSCRext. ARM deprecates use of this field.

MDBGen, bit [15]

Monitor debug events enable.

Read-only view of the equivalent bit in the DBGDSCRext.

Bits [14:13]

Reserved, RES0.

UDCCdis, bit [12]

User mode access to Debug Communications Channel disable.

Read-only view of the equivalent bit in the DBGDSCRext. ARM deprecates use of this field.

Bits [11:6]

Reserved, RES0.

MOE, bits [5:2]

Method of Entry for debug exception. When a debug exception is taken to an Exception level using
AArch32, this field is set to indicate the event that caused the exception:

0001  Breakpoint
0011  Software breakpoint (BKPT) instruction
0101  Vector catch
1010  Watchpoint

Read-only view of the equivalent field in the DBGDSCRext.


Bits [1:0] 


Reserved, RES0.


Accessing the DBGDSCRint: 


To access the DBGDSCRint: 


MRC p14,0,<Rt>,c0,c1,0 ; Read DBGDSCRint into Rt, where Rt can be R0-R14 or APSR_nzcv. The last form
writes bits[31:28] of the transferred value to the N, Z, C and V condition flags and is specified by
setting the Rt field of the encoding to 0b1111.
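
The APSR_nzcv form described above can be illustrated with a short C model of which DBGDSCRint bits land in which condition flags. This is a sketch only: the structure and function names are hypothetical, and it assumes the flag order N, Z, C, V corresponds to bits 31, 30, 29, and 28 of the transferred value, so that RXfull appears in Z and TXfull appears in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four condition flags. */
struct nzcv {
    bool n, z, c, v;
};

/* Model of MRC p14,0,APSR_nzcv,c0,c1,0: bits [31:28] of DBGDSCRint go to N, Z, C, V. */
static struct nzcv nzcv_from_dbgdscrint(uint32_t dscr)
{
    struct nzcv f;
    f.n = ((dscr >> 31) & 1u) != 0;  /* bit [31], RES0 */
    f.z = ((dscr >> 30) & 1u) != 0;  /* bit [30], RXfull */
    f.c = ((dscr >> 29) & 1u) != 0;  /* bit [29], TXfull */
    f.v = ((dscr >> 28) & 1u) != 0;  /* bit [28], RES0 */
    return f;
}
```

Under this model, software polling the Debug Communications Channel could branch on Z to test RXfull, or on C to test TXfull, without using a general-purpose register.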


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0001  000

G6.3.16 DBGDTRRXext, Debug OS Lock Data Transfer Register, Receive, External View
The DBGDTRRXext characteristics are: 


Purpose 
Used for save/restore of DBGDTRRXint. It is a component of the Debug Communications Channel. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





ARM deprecates reads and writes of DBGDTRRXext through the System register interface when 
the OS lock is unlocked. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGDTRRXext is architecturally mapped to AArch64 System register 
OSDTRRX_EL1. 


Attributes 
DBGDTRRXext is a 32-bit register. 


Field descriptions 


The DBGDTRRXext bit assignments are: 





Bits     Field
[31:0]   Update DTRRX without side-effect

Bits [31:0]
Update DTRRX without side-effect.
Writes to this register update the value in DTRRX and do not change RXfull.

Reads of this register return the last value written to DTRRX and do not change RXfull.


For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug 
Communication Channel and Instruction Transfer Register. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the DBGDTRRXext: 
To access the DBGDTRRXext: 


MRC p14,0,<Rt>,c0,c0,2 ; Read DBGDTRRXext into Rt
MCR p14,0,<Rt>,c0,c0,2 ; Write Rt to DBGDTRRXext


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0000  010


G6.3.17 DBGDTRRXint, Debug Data Transfer Register, Receive


The DBGDTRRXint characteristics are: 


Purpose 


Transfers data from an external debugger to the PE. For example, it is used by a debugger 
transferring commands and data to a debug target. It is a component of the Debug Communications 
Channel. 


Usage constraints

If EL3 is implemented and is using AArch32, this register is accessible as follows:

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2 (NS)
Config-RO  RO   RO

Traps and Enables

For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules, in
Non-debug state:

• If DBGDSCRext.UDCCdis==1, read accesses to this register from EL0 are trapped to
  Undefined mode.
• If HDCR.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure read accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDCR_EL3.TDA==1, read accesses to this register from EL0, EL1, and EL2 are trapped
  to EL3.
• If MDSCR_EL1.TDCC==1, read accesses to this register from EL0 are trapped to EL1.

Note

Read accesses to this register are not trapped in Debug state.

Configurations
There is one instance of this register that is used in both Secure and Non-secure states.

AArch32 System register DBGDTRRXint is architecturally mapped to AArch64 System register
DBGDTRRX_EL0.

AArch32 System register DBGDTRRXint is architecturally mapped to External register
DBGDTRRX_EL0.

This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to
architecturally UNKNOWN values. The register is not affected by a Warm reset.

Attributes
DBGDTRRXint is a 32-bit register.

Field descriptions

The DBGDTRRXint bit assignments are:

Bits     Field
[31:0]   Update DTRRX

Bits [31:0]
Update DTRRX.
If RXfull is set to 1, then reads of this register return the last value written to DTRRX and clear
RXfull to 0.
For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug 
Communication Channel and Instruction Transfer Register. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the DBGDTRRXint:
To access the DBGDTRRXint:
MRC p14,0,<Rt>,c0,c5,0 ; Read DBGDTRRXint into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0101  000

G6.3.18 DBGDTRTXext, Debug OS Lock Data Transfer Register, Transmit
The DBGDTRTXext characteristics are: 


Purpose 
Used for save/restore of DBGDTRTXint. It is a component of the Debug Communication Channel. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





ARM deprecates reads and writes of DBGDTRTXext through the System register interface when 
the OS lock is unlocked. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGDTRTXext is architecturally mapped to AArch64 System register 
OSDTRTX_EL1. 


Attributes 
DBGDTRTXext is a 32-bit register.


Field descriptions 


The DBGDTRTXext bit assignments are: 





Bits     Field
[31:0]   Return DTRTX without side-effect

Bits [31:0]
Return DTRTX without side-effect.
Reads of this register return the value in DTRTX and do not change TXfull.

Writes of this register update the value in DTRTX and do not change TXfull.


For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug 
Communication Channel and Instruction Transfer Register. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the DBGDTRTXext: 
To access the DBGDTRTXext: 


MRC p14,0,<Rt>,c0,c3,2 ; Read DBGDTRTXext into Rt
MCR p14,0,<Rt>,c0,c3,2 ; Write Rt to DBGDTRTXext


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0011  010

G6.3.19 DBGDTRTXint, Debug Data Transfer Register, Transmit
The DBGDTRTXint characteristics are: 


Purpose 


Transfers data from the PE to an external debugger. For example, it is used by a debug target to 
transfer data to the debugger. It is a component of the Debug Communication Channel. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-WO  Config-WO  WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0        EL1  EL2 (NS)
Config-WO  WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules, in 
Non-debug state: 


• If DBGDSCRext.UDCCdis==1, write accesses to this register from EL0 are trapped to
  Undefined mode.
• If HDCR.TDA==1, Non-secure write accesses to this register from EL0 and EL1 are trapped
  to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure write accesses to this register from EL0 and EL1 are
  trapped to EL2.
• If MDCR_EL3.TDA==1, write accesses to this register from EL0, EL1, and EL2 are trapped
  to EL3.
• If MDSCR_EL1.TDCC==1, write accesses to this register from EL0 are trapped to EL1.

Note

Write accesses to this register are not trapped in Debug state.





Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGDTRTXint is architecturally mapped to AArch64 System register
DBGDTRTX_EL0.

AArch32 System register DBGDTRTXint is architecturally mapped to External register
DBGDTRTX_EL0.


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGDTRTXint is a 32-bit register. 

Field descriptions

The DBGDTRTXint bit assignments are:

Bits     Field
[31:0]   Return DTRTX


Bits [31:0] 
Return DTRTX. 
If TXfull is set to 0, then writes of this register update the value in DTRTX and set TXfull to 1. 


For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug 
Communication Channel and Instruction Transfer Register. 


This field resets to a value that is architecturally UNKNOWN. 
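
The difference between the internal and external views of the data transfer registers can be sketched as a small C model. The structure and function names below are hypothetical. The model covers only the cases this section states: a DBGDTRRXint read when RXfull is 1, a DBGDTRTXint write when TXfull is 0, and the side-effect-free external views. For the other cases it simply reports failure, which is an assumption of the sketch and not architected behavior; see Chapter H4 for the full behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the DCC state visible through these registers. */
struct dcc_model {
    uint32_t dtrrx;
    uint32_t dtrtx;
    bool rxfull;
    bool txfull;
};

/* DBGDTRRXint read: if RXfull is 1, return DTRRX and clear RXfull to 0. */
static bool dcc_read_rxint(struct dcc_model *d, uint32_t *out)
{
    if (!d->rxfull)
        return false;   /* case not described here; the sketch returns no data */
    *out = d->dtrrx;
    d->rxfull = false;
    return true;
}

/* DBGDTRTXint write: if TXfull is 0, update DTRTX and set TXfull to 1. */
static bool dcc_write_txint(struct dcc_model *d, uint32_t value)
{
    if (d->txfull)
        return false;   /* case not described here; the sketch rejects the write */
    d->dtrtx = value;
    d->txfull = true;
    return true;
}

/* DBGDTRRXext write and DBGDTRTXext read: no change to RXfull or TXfull. */
static void dcc_write_rxext(struct dcc_model *d, uint32_t value)
{
    d->dtrrx = value;
}

static uint32_t dcc_read_txext(const struct dcc_model *d)
{
    return d->dtrtx;
}
```

The external views exist so that an operating system can save and restore DTRRX and DTRTX while the OS lock is locked without disturbing the full flags, which is why the model keeps the flag updates out of the `ext` helpers.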


Accessing the DBGDTRTXint: 


To access the DBGDTRTXint: 


MCR p14,0,<Rt>,c0,c5,0 ; Write Rt to DBGDTRTXint


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0101  000

G6.3.20 DBGOSDLR, Debug OS Double Lock Register
The DBGOSDLR characteristics are: 


Purpose 


Locks out the external debug interface. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDOSA==1, Non-secure accesses to this register from EL1 are trapped to Hyp
  mode.
• If MDCR_EL2.TDOSA==1, Non-secure accesses to this register from EL1 are trapped to
  EL2.
• If MDCR_EL3.TDOSA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGOSDLR is architecturally mapped to AArch64 System register 
OSDLR_EL1. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
DBGOSDLR is a 32-bit register. 


Field descriptions 


The DBGOSDLR bit assignments are: 

Bits     Field
[31:1]   RES0
[0]      DLK

Bits [31:1]
Reserved, RES0.

DLK, bit [0]
OS Double Lock control bit. Possible values are:
0     OS Double Lock unlocked.
1     OS Double Lock locked, if DBGPRCR.CORENPDRQ (Core no powerdown request)
      bit is set to 0 and the PE is in Non-debug state.


When this register has an architecturally-defined reset value, this field resets to 0. 
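
The condition under which DLK takes effect can be written as a short predicate. The following C sketch is illustrative only: the function name `os_double_lock_locked` is hypothetical, the register values are assumed to have been read by the caller, and the Debug state indication is passed in as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the OS Double Lock is locked according to the DLK
 * description: DBGOSDLR.DLK (bit [0]) is 1, DBGPRCR.CORENPDRQ (bit [0])
 * is 0, and the PE is in Non-debug state.
 */
static bool os_double_lock_locked(uint32_t dbgosdlr, uint32_t dbgprcr,
                                  bool in_debug_state)
{
    bool dlk = (dbgosdlr & 1u) != 0;
    bool corenpdrq = (dbgprcr & 1u) != 0;
    return dlk && !corenpdrq && !in_debug_state;
}
```

For example, `os_double_lock_locked(1u, 1u, false)` is false, because setting CORENPDRQ to 1 prevents DLK from locking the OS Double Lock.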


Accessing the DBGOSDLR: 
To access the DBGOSDLR: 


MRC p14,0,<Rt>,c1,c3,4 ; Read DBGOSDLR into Rt 
MCR p14,0,<Rt>,c1,c3,4 ; Write Rt to DBGOSDLR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0001  0011  100

G6.3.21 DBGOSECCR, Debug OS Lock Exception Catch Control Register


The DBGOSECCR characteristics are: 


Purpose 


Provides a mechanism for an operating system to access the contents of EDECCR that are otherwise 
invisible to software, so it can save/restore the contents of EDECCR over powerdown on behalf of 
the external debugger. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGOSECCR is architecturally mapped to AArch64 System register 
OSECCR_EL1. 


AArch32 System register DBGOSECCR is architecturally mapped to External register EDECCR. 
If OSLSR.OSLK == 0 then DBGOSECCR returns an UNKNOWN value on reads and ignores writes. 


Attributes 
DBGOSECCR is a 32-bit register. 


Field descriptions 


The DBGOSECCR bit assignments are: 


When OSLSR.OSLK==1: 





Bits     Field
[31:0]   EDECCR

EDECCR, bits [31:0]


Used for save/restore to EDECCR over powerdown. 


Accessing the DBGOSECCR: 
To access the DBGOSECCR: 


MRC p14,0,<Rt>,c0,c6,2 ; Read DBGOSECCR into Rt
MCR p14,0,<Rt>,c0,c6,2 ; Write Rt to DBGOSECCR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0000  0110  010

G6.3.22 DBGOSLAR, Debug OS Lock Access Register


The DBGOSLAR characteristics are: 


Purpose 


Provides a lock for the debug registers. The OS lock also disables some Software debug events. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       WO       WO       WO             WO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    WO   WO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:

• If HDCR.TDOSA==1, Non-secure write accesses to this register from EL1 are trapped to
  Hyp mode.
• If MDCR_EL2.TDOSA==1, Non-secure write accesses to this register from EL1 are trapped
  to EL2 using AArch64.
• If MDCR_EL3.TDOSA==1, write accesses to this register from EL1 and EL2 are trapped to
  EL3 using AArch64.


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGOSLAR is architecturally mapped to AArch64 System register 
OSLAR_EL1. 


AArch32 System register DBGOSLAR is architecturally mapped to External register 
OSLAR_EL1. 


Attributes 
DBGOSLAR is a 32-bit register. 


Field descriptions 


The DBGOSLAR bit assignments are: 


Bits     Field
[31:0]   OS Lock Access

Bits [31:0]

OS Lock Access. Writing the value 0xC5ACCE55 to the DBGOSLAR sets the OS lock to 1. Writing
any other value sets the OS lock to 0.


Use DBGOSLSR.OSLK to check the current status of the lock. 
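
The effect of a DBGOSLAR write on the OS lock can be modeled in a few lines of C. This is a minimal sketch: the macro `OS_LOCK_KEY` and the function `os_lock_after_write` are hypothetical names, and the function only models the resulting lock state, not the register write itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Key value from the OS Lock Access field description. */
#define OS_LOCK_KEY UINT32_C(0xC5ACCE55)

/*
 * Returns the OS lock state after writing 'value' to DBGOSLAR:
 * true (locked) for the key value, false (unlocked) for any other value.
 */
static bool os_lock_after_write(uint32_t value)
{
    return value == OS_LOCK_KEY;
}
```

Note that, unlike a toggle, the new lock state depends only on the written value, so writing 0 is sufficient to unlock.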


Accessing the DBGOSLAR: 
To access the DBGOSLAR: 
MCR p14,0,<Rt>,c1,c0,4 ; Write Rt to DBGOSLAR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0001  0000  100

G6.3.23 DBGOSLSR, Debug OS Lock Status Register
The DBGOSLSR characteristics are: 


Purpose 


Provides status information for the OS lock. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RO       RO       RO             RO

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDOSA==1, Non-secure read accesses to this register from EL1 are trapped to Hyp
  mode.
• If MDCR_EL2.TDOSA==1, Non-secure read accesses to this register from EL1 are trapped
  to EL2.
• If MDCR_EL3.TDOSA==1, read accesses to this register from EL1 and EL2 are trapped to
  EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGOSLSR is architecturally mapped to AArch64 System register 
OSLSR_EL1. 


The OS lock status is also visible in the external debug interface through EDPRSR. 


This register is in the Cold reset domain. Some or all RW fields of this register have defined reset 
values. On a Cold reset these apply only if the PE resets into an Exception level that is using 
AArch32. Otherwise, on a Cold reset RW fields in this register reset to architecturally UNKNOWN 
values. The register is not affected by a Warm reset. 


Attributes 
DBGOSLSR is a 32-bit register. 


Field descriptions 


The DBGOSLSR bit assignments are: 

Bits     Field
[31:4]   RES0
[3]      OSLM[1]
[2]      nTT
[1]      OSLK
[0]      OSLM[0]

Bits [31:4]
Reserved, RES0.


OSLM[1], bit [3] 
See below for description of the OSLM field. 


nTT, bit [2] 
Not 32-bit access. This bit is always RAZ. It indicates that a 32-bit access is needed to write the key 
to the OS Lock Access Register. 
OSLK, bit [1] 
OS Lock Status. The possible values are: 
0 OS lock unlocked.
1 OS lock locked. 


The OS lock is locked and unlocked by writing to the OS Lock Access Register. 


When this register has an architecturally-defined reset value, this field resets to 1. 


OSLM[0], bit [0]


OS lock model implemented. Identifies the form of OS save and restore mechanism implemented.
In ARMv8 these bits are as follows:


10 OS lock implemented. DBGOSSRR not implemented. 


All other values are reserved. 
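
Because OSLM is split across bits [3] and [0], software has to reassemble it after reading the register. The C sketch below shows one way to do that, together with an OSLK test; the helper names `dbgoslsr_oslm` and `dbgoslsr_oslk` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reassemble the 2-bit OSLM field: OSLM[1] is bit [3], OSLM[0] is bit [0]. */
static unsigned dbgoslsr_oslm(uint32_t oslsr)
{
    unsigned oslm1 = (unsigned)((oslsr >> 3) & 1u);
    unsigned oslm0 = (unsigned)(oslsr & 1u);
    return (oslm1 << 1) | oslm0;
}

/* OSLK is bit [1]: true when the OS lock is locked. */
static bool dbgoslsr_oslk(uint32_t oslsr)
{
    return ((oslsr >> 1) & 1u) != 0;
}
```

With the ARMv8 value OSLM == 0b10 and the OS lock locked, a read of the register would have bits [3] and [1] set, that is 0xA, given that nTT is RAZ and bits [31:4] are RES0.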


Accessing the DBGOSLSR: 
To access the DBGOSLSR: 
MRC p14,0,<Rt>,c1,c1,4 ; Read DBGOSLSR into Rt


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1110    000   0001  0001  100

G6.3.24 DBGPRCR, Debug Power Control Register
The DBGPRCR characteristics are: 


Purpose 


Controls behavior of the PE on powerdown request. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:

EL0  EL1  EL2 (NS)
-    RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDOSA==1, Non-secure accesses to this register from EL1 are trapped to Hyp
mode.


• If MDCR_EL2.TDOSA==1, Non-secure accesses to this register from EL1 are trapped to
EL2.


• If MDCR_EL3.TDOSA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGPRCR is architecturally mapped to AArch64 System register 
DBGPRCR_EL1. 


Bit [0] of this register is mapped to EDPRCR.CORENPDRQ, bit [0] of the external view of this 
register. 


The other bits in these registers are not mapped to each other. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGPRCR is a 32-bit register. 


Field descriptions 


The DBGPRCR bit assignments are: 
Bits    [31:1]  [0]
Field   RES0    CORENPDRQ


Bits [31:1]


Reserved, RES0.


CORENPDRQ, bit [0] 


Core no powerdown request. Requests emulation of powerdown. Possible values of this bit are:
0 If the system responds to a powerdown request, it powers down the Core power domain.


1 If the system responds to a powerdown request, it does not power down the Core power
domain, but instead emulates a powerdown of that domain.


Writes to this bit are permitted regardless of the state of the IMPLEMENTATION DEFINED 
authentication interface. This means that a debugger can request Core no powerdown regardless of 
whether invasive debug is permitted. 


It is IMPLEMENTATION DEFINED whether this bit is reset to the value of EDPRCR.COREPURQ on 
exit from an IMPLEMENTATION DEFINED software-visible retention state. 


Accessing the DBGPRCR: 


To access the DBGPRCR: 


MRC p14,0,<Rt>,c1,c4,4 ; Read DBGPRCR into Rt 
MCR p14,0,<Rt>,c1,c4,4 ; Write Rt to DBGPRCR 


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2





1110 000 0001 0100 100 
G6.3.25 DBGVCR, Debug Vector Catch Register
The DBGVCR characteristics are: 


Purpose 


Controls Vector Catch debug events. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0   EL1   EL2 (NS)
-     RW    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGVCR is architecturally mapped to AArch64 System register 
DBGVCR32_EL2. 


This register is required in all implementations. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
DBGVCR is a 32-bit register. 


Field descriptions 


The DBGVCR bit assignments are: 
When EL3 implemented and using AArch32:


Bits    31   30   29    28   27   26   25   [24:16]  15  14  13    12  11  10  [9:8]  7   6   5     4   3   2   1   0
Field   NSF  NSI  RES0  NSD  NSP  NSS  NSU  RES0     MF  MI  RES0  MD  MP  MS  RES0   SF  SI  RES0  SD  SP  SS  SU  RES0

NSF, bit [31] 


FIQ vector catch enable in Non-secure state. 
The exception vector offset is 0x1C.


This field resets to a value that is architecturally UNKNOWN. 


NSI, bit [30] 

IRQ vector catch enable in Non-secure state. 

The exception vector offset is 0x18. 

This field resets to a value that is architecturally UNKNOWN. 
Bit [29] 

Reserved, RES0.
NSD, bit [28] 


Data Abort vector catch enable in Non-secure state. 
The exception vector offset is 0x10. 


This field resets to a value that is architecturally UNKNOWN. 


NSP, bit [27] 

Prefetch Abort vector catch enable in Non-secure state. 

The exception vector offset is 0x0C.

This field resets to a value that is architecturally UNKNOWN. 
NSS, bit [26] 

Supervisor Call (SVC) vector catch enable in Non-secure state. 

The exception vector offset is 0x08. 

This field resets to a value that is architecturally UNKNOWN. 
NSU, bit [25] 

Undefined Instruction vector catch enable in Non-secure state. 

The exception vector offset is 0x04. 

This field resets to a value that is architecturally UNKNOWN. 
Bits [24:16] 


Reserved, RES0.


MF, bit [15]
FIQ vector catch enable in Monitor mode.
The exception vector offset is 0x1C.


This field resets to a value that is architecturally UNKNOWN. 


MI, bit [14] 
IRQ vector catch enable in Monitor mode. 
The exception vector offset is 0x18. 


This field resets to a value that is architecturally UNKNOWN. 


Bit [13] 


Reserved, RES0.


MD, bit [12] 
Data Abort vector catch enable in Monitor mode. 
The exception vector offset is 0x10. 


This field resets to a value that is architecturally UNKNOWN. 


MP, bit [11] 
Prefetch Abort vector catch enable in Monitor mode. 
The exception vector offset is 0x0C.


This field resets to a value that is architecturally UNKNOWN. 


MS, bit [10] 
Secure Monitor Call (SMC) vector catch enable in Monitor mode. 
The exception vector offset is 0x08. 


This field resets to a value that is architecturally UNKNOWN. 


Bits [9:8] 


Reserved, RES0.


SF, bit [7] 
FIQ vector catch enable in Secure state. 
The exception vector offset is 0x1C.


This field resets to a value that is architecturally UNKNOWN. 


SI, bit [6] 
IRQ vector catch enable in Secure state. 
The exception vector offset is 0x18. 


This field resets to a value that is architecturally UNKNOWN. 
Bit [5] 

Reserved, RES0.
SD, bit [4] 

Data Abort vector catch enable in Secure state. 

The exception vector offset is 0x10. 

This field resets to a value that is architecturally UNKNOWN. 
SP, bit [3] 

Prefetch Abort vector catch enable in Secure state. 


The exception vector offset is 0x0C.

This field resets to a value that is architecturally UNKNOWN.
SS, bit [2] 
Supervisor Call (SVC) vector catch enable in Secure state. 
The exception vector offset is 0x08. 
This field resets to a value that is architecturally UNKNOWN. 
SU, bit [1] 
Undefined Instruction vector catch enable in Secure state. 
The exception vector offset is 0x04. 
This field resets to a value that is architecturally UNKNOWN. 
Bit [0] 
Reserved, RES0.


When EL3 implemented and using AArch64:


Bits    31   30   29    28   27   26   25   [24:8]  7   6   5     4   3   2   1   0
Field   NSF  NSI  RES0  NSD  NSP  NSS  NSU  RES0    SF  SI  RES0  SD  SP  SS  SU  RES0


NSF, bit [31] 
FIQ vector catch enable in Non-secure state. 
The exception vector offset is 0x1C.


This field resets to a value that is architecturally UNKNOWN. 


NSI, bit [30] 

IRQ vector catch enable in Non-secure state. 

The exception vector offset is 0x18. 

This field resets to a value that is architecturally UNKNOWN. 
Bit [29] 

Reserved, RES0.
NSD, bit [28] 

Data Abort vector catch enable in Non-secure state. 

The exception vector offset is 0x10. 

This field resets to a value that is architecturally UNKNOWN. 
NSP, bit [27] 

Prefetch Abort vector catch enable in Non-secure state. 

The exception vector offset is 0x0C.


This field resets to a value that is architecturally UNKNOWN. 
NSS, bit [26]
Supervisor Call (SVC) vector catch enable in Non-secure state.
The exception vector offset is 0x08.


This field resets to a value that is architecturally UNKNOWN.


NSU, bit [25]
Undefined Instruction vector catch enable in Non-secure state.
The exception vector offset is 0x04.


This field resets to a value that is architecturally UNKNOWN.


Bits [24:8]
Reserved, RES0.


SF, bit [7]
FIQ vector catch enable in Secure state.
The exception vector offset is 0x1C.


This field resets to a value that is architecturally UNKNOWN.


SI, bit [6]
IRQ vector catch enable in Secure state.
The exception vector offset is 0x18.


This field resets to a value that is architecturally UNKNOWN.


Bit [5]
Reserved, RES0.


SD, bit [4]
Data Abort vector catch enable in Secure state.
The exception vector offset is 0x10.


This field resets to a value that is architecturally UNKNOWN.


SP, bit [3]
Prefetch Abort vector catch enable in Secure state.
The exception vector offset is 0x0C.


This field resets to a value that is architecturally UNKNOWN.


SS, bit [2]
Supervisor Call (SVC) vector catch enable in Secure state.
The exception vector offset is 0x08.


This field resets to a value that is architecturally UNKNOWN.


SU, bit [1]
Undefined Instruction vector catch enable in Secure state.
The exception vector offset is 0x04.


This field resets to a value that is architecturally UNKNOWN.


Bit [0]
Reserved, RES0.
When EL3 not implemented:


Bits [31:8] 

Reserved, RES0.
F, bit [7] 

FIQ vector catch enable. 

The exception vector offset is 0x1C.

This field resets to a value that is architecturally UNKNOWN. 
I, bit [6] 

IRQ vector catch enable. 

The exception vector offset is 0x18. 

This field resets to a value that is architecturally UNKNOWN. 
Bit [5] 

Reserved, RES0.
D, bit [4] 

Data Abort vector catch enable. 

The exception vector offset is 0x10. 

This field resets to a value that is architecturally UNKNOWN. 
P, bit [3] 

Prefetch Abort vector catch enable. 

The exception vector offset is 0x0C.

This field resets to a value that is architecturally UNKNOWN. 
S, bit [2] 

Supervisor Call (SVC) vector catch enable. 

The exception vector offset is 0x08. 

This field resets to a value that is architecturally UNKNOWN. 
U, bit [1] 

Undefined Instruction vector catch enable. 

The exception vector offset is 0x04. 

This field resets to a value that is architecturally UNKNOWN. 
Bit [0] 


Reserved, RES0.
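
The three DBGVCR layouts share one pattern: the Secure (or only) fields occupy bits [7:1], the Non-secure fields sit 24 bits higher, and the Monitor mode fields sit 8 bits higher with no Undefined Instruction entry. The following Python sketch builds a DBGVCR value from field names under that reading of the field descriptions. The helper names (`field_bits`, `dbgvcr_value`, `vector_offset`) and the `el3` selector strings are assumptions made for illustration, not architectural names.

```python
# Bit positions when EL3 is not implemented, and the vector offset of each type.
_BASE_BITS = {"U": 1, "S": 2, "P": 3, "D": 4, "I": 6, "F": 7}
VECTOR_OFFSETS = {"U": 0x04, "S": 0x08, "P": 0x0C, "D": 0x10, "I": 0x18, "F": 0x1C}


def field_bits(el3: str = "none") -> dict:
    """Map field name to bit position; el3 is 'none', 'aarch64' or 'aarch32'."""
    if el3 == "none":
        return dict(_BASE_BITS)
    if el3 not in ("aarch64", "aarch32"):
        raise ValueError(f"unknown el3 configuration: {el3!r}")
    bits = {"S" + k: v for k, v in _BASE_BITS.items()}               # SU..SF
    bits.update({"NS" + k: v + 24 for k, v in _BASE_BITS.items()})   # NSU..NSF
    if el3 == "aarch32":
        # Monitor mode fields MS, MP, MD, MI, MF; bits [9:8] stay reserved.
        bits.update({"M" + k: v + 8 for k, v in _BASE_BITS.items() if k != "U"})
    return bits


def dbgvcr_value(names, el3: str = "none") -> int:
    """OR together the enable bits for the named fields."""
    bits = field_bits(el3)
    value = 0
    for name in names:
        if name not in bits:
            raise ValueError(f"{name} is not a DBGVCR field when el3={el3!r}")
        value |= 1 << bits[name]
    return value


def vector_offset(name: str) -> int:
    """Exception vector offset for a field; the last letter names the vector."""
    return VECTOR_OFFSETS[name[-1]]
```

Asking for a Monitor mode field such as MS with `el3="aarch64"` raises `ValueError`, because those bits are reserved in that layout.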


Accessing the DBGVCR: 


To access the DBGVCR: 
MRC p14,0,<Rt>,c0,c7,0 ; Read DBGVCR into Rt
MCR p14,0,<Rt>,c0,c7,0 ; Write Rt to DBGVCR


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2





1110 000 0000 0111 000 
G6.3.26 DBGWCR<n>, Debug Watchpoint Control Registers, n = 0 - 15
The DBGWCR<n> characteristics are: 


Purpose 


Holds control information for a watchpoint. Forms watchpoint n together with value register 
DBGWVR<n>. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0   EL1   EL2 (NS)
-     RW    RW


Traps and Enables
For a description of the prioritization of any generated exceptions, see:

• Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816
for exceptions taken to AArch32 state.

• Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 for
exceptions taken to AArch64 state.

• Software Access debug event on page H3-4903 for accesses to this register taken to Debug
state.

Subject to the prioritization rules:

• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.

• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.

• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.

• If EDSCR.TDA==1, DBGOSLSR.OSLK==0, and halting is allowed, EL1 and EL2 accesses
to this register generate a Software Access debug event.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGWCR<n> is architecturally mapped to AArch64 System register 
DBGWCR<n>_EL1. 


AArch32 System register DBGWCR<n> is architecturally mapped to External register 
DBGWCR<n>_EL1. 


If breakpoint n is not implemented then this register is unallocated. 
This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 
Attributes 
DBGWCR<n> is a 32-bit register. 
Field descriptions


The DBGWCR<n> bit assignments are:


Bits    [31:29]  [28:24]  [23:21]  20  [19:16]  [15:14]  13   [12:5]  [4:3]  [2:1]  0
Field   RES0     MASK     RES0     WT  LBN      SSC      HMC  BAS     LSC    PAC    E


When the E field is zero, all the other fields in the register are ignored.


Bits [31:29]


Reserved, RES0.


MASK, bits [28:24]


Address mask. Only objects up to 2GB can be watched using a single mask.
00000 No mask.
00001 Reserved.
00010 Reserved.
If programmed with a reserved value, a watchpoint must behave as if either:


• MASK has been programmed with a defined value, which might be 0 (no mask), other than
for a direct read of DBGWCRn_EL1.


• The watchpoint is disabled.


Software must not rely on this property because the behavior of reserved values might change in a
future revision of the architecture.


Other values mask the corresponding number of address bits, from 0b00011 masking 3 address bits
(0x00000007 mask for address) to 0b11111 masking 31 address bits (0x7FFFFFFF mask for address).


This field resets to a value that is architecturally UNKNOWN.


Bits [23:21]


Reserved, RES0.


WT, bit [20]


Watchpoint type. Possible values are:
0 Unlinked data address match.
1 Linked data address match.


This field resets to a value that is architecturally UNKNOWN. 
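
The MASK encoding described for bits [28:24] maps a field value m to an address mask with the low m bits set, which gives 0x00000007 for 0b00011 and 0x7FFFFFFF for 0b11111. A minimal Python sketch of that mapping follows. The function name `wcr_address_mask` is hypothetical, and rejecting the two reserved encodings with an exception is a choice made for this sketch rather than architectural behavior.

```python
def wcr_address_mask(mask_field: int) -> int:
    """Return the address mask selected by a DBGWCR<n>.MASK value."""
    if not 0 <= mask_field <= 0b11111:
        raise ValueError("MASK is a 5-bit field")
    if mask_field in (0b00001, 0b00010):
        # Reserved encodings: software must not rely on their behavior.
        raise ValueError("reserved MASK encoding")
    # 0 selects no mask; any other value masks that many low address bits.
    return (1 << mask_field) - 1
```

A debugger could AND an address with the complement of this mask before comparing it with the watchpoint value.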


LBN, bits [19:16] 


Linked breakpoint number. For Linked data address watchpoints, this specifies the index of the 
Context-matching breakpoint linked to. 


This field resets to a value that is architecturally UNKNOWN. 


SSC, bits [15:14] 


Security state control. Determines the Security states under which a Watchpoint debug event for 
watchpoint n is generated. This field must be interpreted along with the HMC and PAC fields, see 
Execution conditions for which a watchpoint generates Watchpoint exceptions on page G2-3964. 


This field resets to a value that is architecturally UNKNOWN. 
HMC, bit [13]


Higher mode control. Determines the debug perspective for deciding when a Watchpoint debug 
event for watchpoint n is generated. This field must be interpreted along with the SSC and PAC 
fields, see Execution conditions for which a watchpoint generates Watchpoint exceptions on 
page G2-3964. 


This field resets to a value that is architecturally UNKNOWN. 
BAS, bits [12:5] 


Byte address select. Each bit of this field selects whether a byte from within the word or
double-word addressed by DBGWVR<n> is being watched.


BAS        Description
xxxxxxx1   Match byte at DBGWVR<n>
xxxxxx1x   Match byte at DBGWVR<n>+1
xxxxx1xx   Match byte at DBGWVR<n>+2
xxxx1xxx   Match byte at DBGWVR<n>+3


In cases where DBGWVR<n> addresses a double-word:


BAS        Description, if DBGWVR<n>[2] == 0
xxx1xxxx   Match byte at DBGWVR<n>+4
xx1xxxxx   Match byte at DBGWVR<n>+5
x1xxxxxx   Match byte at DBGWVR<n>+6
1xxxxxxx   Match byte at DBGWVR<n>+7


If DBGWVR<n>[2] == 1, only BAS[3:0] are used and BAS[7:4] are ignored. ARM deprecates
setting DBGWVR<n>[2] == 1.


The valid values for BAS are non-zero binary numbers all of whose set bits are contiguous. All other
values are reserved and must not be used by software. See Reserved DBGWCR<n>.BAS values on
page G2-3972.


This field resets to a value that is architecturally UNKNOWN. 
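
The validity rule for BAS (non-zero, with all set bits contiguous) can be checked with two bit operations: shift out the trailing zeros, then confirm that what remains is a run of ones. The following Python sketch illustrates this. The names `bas_is_valid` and `bas_watched_offsets` are hypothetical helpers, not part of any ARM-provided interface.

```python
def bas_is_valid(bas: int) -> bool:
    """True if bas is a non-zero 8-bit value whose set bits are contiguous."""
    if not 0 < bas <= 0xFF:
        return False
    lowest_set = (bas & -bas).bit_length() - 1   # index of the lowest set bit
    run = bas >> lowest_set                      # drop the trailing zeros
    # A contiguous run of ones has the form 2**k - 1, so run & (run + 1) == 0.
    return run & (run + 1) == 0


def bas_watched_offsets(bas: int) -> list:
    """Byte offsets from DBGWVR<n> selected by each set BAS bit."""
    return [i for i in range(8) if (bas >> i) & 1]
```

For instance, 0b00111100 is valid and selects offsets 2 to 5, while 0b00000101 is a reserved value because its set bits are not adjacent.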


LSC, bits [4:3] 


Load/store control. This field enables watchpoint matching on the type of access being made. 
Possible values of this field are: 


01 Match instructions that load from a watchpointed address.
10 Match instructions that store to a watchpointed address. 
11 Match instructions that load from or store to a watchpointed address. 


All other values are reserved, but must behave as if the watchpoint is disabled. Software must not 
rely on this property as the behavior of reserved values might change in a future revision of the 
architecture. 


This field resets to a value that is architecturally UNKNOWN. 
PAC, bits [2:1]


Privilege of access control. Determines the Exception level or levels at which a Watchpoint debug
event for watchpoint n is generated. This field must be interpreted along with the SSC and HMC
fields, see Execution conditions for which a watchpoint generates Watchpoint exceptions on
page G2-3964.


This field resets to a value that is architecturally UNKNOWN. 
E, bit [0] 

Enable watchpoint n. Possible values are: 

0 Watchpoint disabled. 

1 Watchpoint enabled. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the DBGWCR<n>: 
To access the DBGWCR<n>: 


MRC p14,0,<Rt>,c0,<CRm>,7 ; Read DBGWCR<n> into Rt, where n is in the range 0 to 15
MCR p14,0,<Rt>,c0,<CRm>,7 ; Write Rt to DBGWCR<n>, where n is in the range 0 to 15


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2





1110 000 0000 n<3:0> 111
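
The encoding row above fixes every field except CRm, which carries the watchpoint number n. As an illustration, the Python sketch below packs those fields into a 32-bit instruction word, assuming the standard A32 MRC/MCR layout (cond, 0b1110, opc1, the read/write bit, CRn, Rt, coproc, opc2, a fixed 1, CRm) and the always condition 0xE. The function names are hypothetical, and the T32 encoding is not covered.

```python
def encode_cp_access(coproc, opc1, crn, crm, opc2, rt, read, cond=0xE):
    """Return the 32-bit A32 encoding of MRC (read=True) or MCR (read=False)."""
    fields = (("cond", cond, 4), ("coproc", coproc, 4), ("opc1", opc1, 3),
              ("CRn", crn, 4), ("CRm", crm, 4), ("opc2", opc2, 3), ("Rt", rt, 4))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(read) << 20)
            | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
            | (1 << 4) | crm)


def dbgwcr_access(n, rt, read):
    """Encoding for DBGWCR<n>: coproc=0b1110, opc1=0, CRn=0, CRm=n, opc2=0b111."""
    return encode_cp_access(coproc=14, opc1=0, crn=0, crm=n, opc2=7, rt=rt, read=read)
```

Passing n outside the range 0 to 15 raises `ValueError`, since CRm is only 4 bits wide.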
G6.3.27 DBGWFAR, Debug Watchpoint Fault Address Register


The DBGWFAR characteristics are:


Purpose 


Previously returned information about the address of the instruction that accessed a watchpointed
address. This register is now deprecated and is RES0.


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0   EL1   EL2 (NS)
-     RW    RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules:


• If HDCR.TDA==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


Attributes 
DBGWFAR is a 32-bit register.


Field descriptions 


The DBGWFAR bit assignments are: 


Bits    [31:0]
Field   RES0


Bits [31:0]


Reserved, RES0.


Accessing the DBGWFAR: 
To access the DBGWFAR: 


MRC p14,0,<Rt>,c0,c6,0 ; Read DBGWFAR into Rt
MCR p14,0,<Rt>,c0,c6,0 ; Write Rt to DBGWFAR


Register access is encoded as follows:





coproc opc1 CRn CRm opc2





1110 000 0000 0110 000 










G6.3.28 DBGWVR<n>, Debug Watchpoint Value Registers, n = 0 - 15 
The DBGWVR<n> characteristics are: 


Purpose 
Holds a data address value for use in watchpoint matching. Forms watchpoint n together with 
control register DBGWCR<n>. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0   EL1   EL2 (NS)
-     RW    RW


Traps and Enables
For a description of the prioritization of any generated exceptions, see:

• Synchronous exception prioritization for exceptions taken to AArch32 state on page G1-3816
for exceptions taken to AArch32 state.

• Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548 for
exceptions taken to AArch64 state.

• Software Access debug event on page H3-4903 for accesses to this register taken to Debug
state.

Subject to the prioritization rules:

• If MDCR_EL2.TDA==1, Non-secure accesses to this register from EL1 are trapped to EL2.

• If MDCR_EL3.TDA==1, accesses to this register from EL1 and EL2 are trapped to EL3.

• If EDSCR.TDA==1, DBGOSLSR.OSLK==0, and halting is allowed, EL1 and EL2 accesses
to this register generate a Software Access debug event.

Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DBGWVR<n> is architecturally mapped to AArch64 System register 
DBGWVR<n>_EL1[31:0]. 


AArch32 System register DBGWVR<n> is architecturally mapped to External register 
DBGWVR<n>_EL1[31:0]. 


If breakpoint n is not implemented then this register is unallocated. 


This register is in the Cold reset domain. On a Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. The register is not affected by a Warm reset. 


Attributes 
DBGWVR<n> is a 32-bit register. 


Field descriptions 


The DBGWVR<n> bit assignments are: 







Bits    [31:2]    [1:0]
Field   VA[31:2]  RES0


VA, bits [31:2] 
Bits[31:2] of the address value for comparison. 
ARM deprecates setting DBGWVR<n>[2] == 1. 


This field resets to a value that is architecturally UNKNOWN. 
Bits [1:0] 


Reserved, RES0.


Accessing the DBGWVR<n>: 
To access the DBGWVR<n>: 


MRC p14,0,<Rt>,c0,<CRm>,6 ; Read DBGWVR<n> into Rt, where n is in the range 0 to 15
MCR p14,0,<Rt>,c0,<CRm>,6 ; Write Rt to DBGWVR<n>, where n is in the range 0 to 15


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2





1110 000 0000 n<3:0> 110
G6.3.29 DLR, Debug Link Register


The DLR characteristics are: 


Purpose 
In Debug state, holds the address to restart from. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RW       RW      RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible
as follows:


EL0   EL1   EL2 (NS)
RW    RW    RW





Access to this register is from Debug state only. During normal execution this register is
unallocated.


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DLR is architecturally mapped to AArch64 System register 
DLR_EL0[31:0].


Attributes 


DLR is a 32-bit register. 


Field descriptions 


The DLR bit assignments are: 


31 0 
Restart address 
Bits [31:0] 


Restart address. 


Accessing the DLR: 
To access the DLR: 


MRC p15,3,<Rt>,c4,c5,1 ; Read DLR into Rt 
MCR p15,3,<Rt>,c4,c5,1 ; Write Rt to DLR 

Register access is encoded as follows:





coproc opc1 CRn CRm opc2





1111 011 0100 0101 001 
G6.3.30 DSPSR, Debug Saved Program Status Register


The DSPSR characteristics are: 


Purpose 
Holds the saved process state on entry to Debug state. 


Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RW       RW      RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0   EL1   EL2 (NS)
RW    RW    RW





Access to this register is from Debug state only. During normal execution this register is 
unallocated. 


Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register DSPSR is architecturally mapped to AArch64 System register 
DSPSR_EL0.


Attributes 
DSPSR is a 32-bit register. 


Field descriptions 


The DSPSR bit assignments are: 


When entering Debug state from AArch32 and exiting Debug state to AArch32: 


Bits    31  30  29  28  27  [26:25]  24  [23:22]  21  20  [19:16]  [15:10]  9  8  7  6  5  4     [3:0]
Field   N   Z   C   V   Q   IT[1:0]  J   RES0     SS  IL  GE       IT[7:2]  E  A  I  F  T  M[4]  M[3:0]








N, bit [31] 
Set to the value of CPSR.N on entering Debug state, and copied to CPSR.N on exiting Debug state. 
Z, bit [30] 
Set to the value of CPSR.Z on entering Debug state, and copied to CPSR.Z on exiting Debug state. 
C, bit [29]
Set to the value of CPSR.C on entering Debug state, and copied to CPSR.C on exiting Debug state. 
V, bit [28] 
Set to the value of CPSR.V on entering Debug state, and copied to CPSR.V on exiting Debug state. 
Q, bit [27] 


Set to the value of CPSR.Q on entering Debug state, and copied to CPSR.Q on exiting Debug state. 


IT[1:0], bits [26:25] 
IT block state bits for the T32 IT (If-Then) instruction. See IT[7:2] for explanation of this field. 


J, bit [24] 
RES0.
In previous versions of the architecture, the {J,T} bits determined the AArch32 Instruction set state.
ARMv8 does not support either Jazelle state or T32EE state, and the T bit determines the Instruction
set state.

Bits [23:22] 


Reserved, RES0.


SS, bit [21] 
Software step. Shows the value of PSTATE.SS immediately before Debug state was entered. 


IL, bit [20] 
Illegal Execution state bit. Shows the value of PSTATE.IL immediately before Debug state was 
entered. 

GE, bits [19:16] 


Greater than or Equal flags, for parallel addition and subtraction. 


IT[7:2], bits [15:10] 
IT block state bits for the T32 IT (If-Then) instruction. This field must be interpreted in two parts. 


° IT[7:5] holds the base condition for the IT block. The base condition is the top 3 bits of the 
condition code specified by the first condition field of the IT instruction. 


° IT[4:0] encodes the size of the IT block, which is the number of instructions that are to be 
conditionally executed, by the position of the least significant 1 in this field. It also encodes 
the value of the least significant bit of the condition code for each instruction in the block. 


The IT field is 0b00000000 when no IT block is active. 





E, bit [9] 
Endianness state bit. Controls the load and store endianness for data accesses: 
0 Little-endian operation.
1 Big-endian operation.
Instruction fetches ignore this bit.
When the reset value of the SCTLR.EE bit is defined by a configuration input signal, that value also
applies to the CPSR.E bit on reset, and therefore applies to software execution from reset.
If an implementation does not provide Big-endian support, this bit is RES0. If it does not provide
Little-endian support, this bit is RES1.
If an implementation provides Big-endian support but only at EL0, this bit is RES0 for an exception
return to any Exception level other than EL0.
Likewise, if it provides Little-endian support only at EL0, this bit is RES1 for an exception return to
any Exception level other than EL0.
A, bit [8]
SError interrupt mask bit. The possible values of this bit are: 
0 Exception not masked. 
1 Exception masked. 
I, bit [7] 
IRQ mask bit. The possible values of this bit are: 
0 Exception not masked. 
1 Exception masked. 
F, bit [6]
FIQ mask bit. The possible values of this bit are:
0 Exception not masked.
1 Exception masked.
T, bit [5] 
T32 Instruction set state bit. Determines the AArch32 instruction set state that the Debug state entry 
was taken from. Possible values of this bit are: 
0 Taken from A32 state.
1 Taken from T32 state.
M[4], bit [4]
Execution state that Debug state was entered from. Possible values of this bit are: 
1 Exception taken from AArch32. 
M[3:0], bits [3:0] 


AArch32 mode that Debug state was entered from. The possible values are: 





M[3:0]   Mode
0b0000   User
0b0001   FIQ
0b0010   IRQ
0b0011   Supervisor
0b0110   Monitor (only valid in Secure state, if EL3 is implemented and can use AArch32)
0b0111   Abort
0b1010   Hyp
0b1011   Undefined
0b1111   System





Other values are reserved. The effect of programming this field to a Reserved value is that behavior 
is CONSTRAINED UNPREDICTABLE, as described in Reserved values in System and memory-mapped 
registers and translation table entries on page K1-5477. 
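
As an illustration of the AArch32 layout above, the Python sketch below decodes a raw DSPSR value. It rejoins the split IT field (IT[7:2] from bits [15:10], IT[1:0] from bits [26:25]) and looks up the mode name from M[3:0]. The function name `decode_dspsr` and the returned dictionary keys are assumptions for this example. Reserved M[3:0] values are reported as `None` here, which is a choice of the sketch and says nothing about hardware behavior.

```python
AARCH32_MODES = {
    0b0000: "User", 0b0001: "FIQ", 0b0010: "IRQ", 0b0011: "Supervisor",
    0b0110: "Monitor", 0b0111: "Abort", 0b1010: "Hyp",
    0b1011: "Undefined", 0b1111: "System",
}


def decode_dspsr(value: int) -> dict:
    """Decode a DSPSR value saved on Debug state entry from AArch32."""
    value &= 0xFFFFFFFF

    def bit(n):
        return (value >> n) & 1

    # IT[7:2] lives in bits [15:10]; IT[1:0] lives in bits [26:25].
    it = (((value >> 10) & 0x3F) << 2) | ((value >> 25) & 0x3)
    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28), "Q": bit(27),
        "IT": it,
        "SS": bit(21), "IL": bit(20),
        "GE": (value >> 16) & 0xF,
        "E": bit(9), "A": bit(8), "I": bit(7), "F": bit(6), "T": bit(5),
        "M4": bit(4),
        "mode": AARCH32_MODES.get(value & 0xF),  # None for reserved encodings
    }
```

A value of 0x000001D3, for example, decodes as Supervisor mode in A32 state with the A, I, and F mask bits all set.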

Accessing the DSPSR: 

To access the DSPSR: 


MRC p15,3,<Rt>,c4,c5,0 ; Read DSPSR into Rt
MCR p15,3,<Rt>,c4,c5,0 ; Write Rt to DSPSR


Register access is encoded as follows: 





coproc opc1 CRn CRm opc2





1111 011 0100 0101 000
G6.3.31 HDCR, Hyp Debug Control Register
The HDCR characteristics are: 


Purpose 


Controls the trapping to Hyp mode of Non-secure accesses, at EL1 or lower, to functions provided 
by the debug and trace architectures and the Performance Monitors Extension. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0   EL1   EL2 (NS)
-     -     RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode.
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2.
• If MDCR_EL3.TDA==1, accesses to this register from EL2 are trapped to EL3 using
AArch64.

Configurations 


AArch32 System register HDCR is architecturally mapped to AArch64 System register 
MDCR_EL2. 


If EL2 is not implemented, this register is RESO from EL3. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
HDCR is a 32-bit register. 


Field descriptions 


The HDCR bit assignments are: 
Bits   31:12  11    10     9    8    7     6    5      4:0
Field  RES0   TDRA  TDOSA  TDA  TDE  HPME  TPM  TPMCR  HPMN


Bits [31:12] 
Reserved, RES0. 


TDRA, bit [11] 


Trap Debug ROM Address register access. Traps Non-secure EL0 and EL1 System register accesses 
to the Debug ROM registers to Hyp mode. 


0 Non-secure EL0 and EL1 System register accesses to the Debug ROM registers are not 
trapped to Hyp mode. 

1 Non-secure EL0 and EL1 System register accesses to the DBGDRAR or DBGDSAR 
are trapped to Hyp mode. 

If HCR.TGE or HDCR.TDE is 1, behavior is as if this bit is 1 other than for the purpose of a direct 
read. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TDOSA, bit [10] 


Trap debug OS-related register access. Traps Non-secure EL1 System register accesses to the 
powerdown debug registers to Hyp mode. 


0 Non-secure EL1 System register accesses to the powerdown debug registers are not 
trapped to Hyp mode. 

1 Non-secure EL1 System register accesses to the powerdown debug registers are trapped 
to Hyp mode. 


The registers for which accesses are trapped are as follows: 

• DBGOSLSR, DBGOSLAR, DBGOSDLR, and the DBGPRCR. 

• Any IMPLEMENTATION DEFINED register with similar functionality that the implementation 
specifies as trapped by this bit. 


Note 

These registers are not accessible at EL0. 


If HCR.TGE or HDCR.TDE is 1, behavior is as if this bit is 1 other than for the purpose of a direct 
read. 


When this register has an architecturally-defined reset value, this field resets to 0. 





TDA, bit [9] 
Trap debug access. Traps Non-secure EL0 and EL1 System register accesses to those debug System 
registers in the (coproc==1110) encoding space that are not trapped by either of the following: 
• HDCR.TDRA. 
• HDCR.TDOSA. 

0 Has no effect on System register accesses to the debug registers. 

1 Non-secure EL0 or EL1 System register accesses to the debug registers, other than the 
registers trapped by HDCR.TDRA and HDCR.TDOSA, are trapped to Hyp mode. 


Traps of AArch32 accesses to DBGDTRRXint and DBGDTRTXint are ignored in Debug state. 


If HCR.TGE or HDCR.TDE is 1, behavior is as if this bit is 1 other than for the purpose of a direct 
read. 


When this register has an architecturally-defined reset value, this field resets to 0. 


TDE, bit [8] 
Trap Debug exceptions. The possible values of this bit are: 


0 This control has no effect on the routing of debug exceptions, and has no effect on 
Non-secure accesses to debug registers. 


1 In Non-secure state: 
• Debug exceptions generated at EL1 or EL0 are routed to EL2. 
• The HDCR.{TDRA, TDOSA, TDA} fields are treated as being 1 for all purposes 
other than returning the result of a direct read of the register. 


When HCR.TGE == 1, the PE behaves as if the value of this field is 1 for all purposes other than 
returning the value of a direct read of the register. 


When this register has an architecturally-defined reset value, this field resets to 0. 


HPME, bit [7] 
Hypervisor Performance Monitors Enable. The possible values of this bit are: 
0 Hyp mode Performance Monitors disabled. 
1 Hyp mode Performance Monitors enabled. 


When the value of this bit is 1, the Performance Monitors counters that are reserved for use from 
Hyp mode or Secure state are enabled. For more information see the description of the HPMN field. 


If the Performance Monitors Extension is not implemented, this field is RES0. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 








TPM, bit [6] 
Trap Performance Monitors accesses. Traps Non-secure EL0 and EL1 accesses to all Performance 
Monitors registers to Hyp mode. 

0 Non-secure EL0 and EL1 accesses to all Performance Monitors registers are not trapped 
to Hyp mode. 

1 Non-secure EL0 and EL1 accesses to all Performance Monitors registers are trapped to 
Hyp mode. 

Note 

EL2 does not provide traps on Performance Monitor register accesses through the optional 
memory-mapped external debug interface. 

If the Performance Monitors Extension is not implemented, this field is RES0. 

When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 


TPMCR, bit [5] 
Trap PMCR accesses. Traps Non-secure EL0 and EL1 accesses to the PMCR to Hyp mode. 

0 Non-secure EL0 and EL1 accesses to the PMCR are not trapped to Hyp mode. 

1 Non-secure EL0 and EL1 accesses to the PMCR are trapped to Hyp mode. 

Note 

EL2 does not provide traps on Performance Monitor register accesses through the optional 
memory-mapped external debug interface. 

If the Performance Monitors Extension is not implemented, this field is RES0. 

When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 


HPMN, bits [4:0] 


Defines the number of Performance Monitors counters that are accessible from Non-secure EL1 
modes, and from Non-secure EL0 modes if unprivileged access is enabled. 


If the Performance Monitors Extension is not implemented, this field is RES0. 


In Non-secure state, HPMN divides the Performance Monitors counters as follows. If software is 
accessing Performance Monitors counter n then, in Non-secure state: 


• If n is in the range 0<=n<HPMN, the counter is accessible from EL1 and EL2, and from EL0 
if unprivileged access to the counters is enabled. PMCR.E enables the operation of counters 
in this range. 

• If n is in the range HPMN<=n<PMCR.N, the counter is accessible only from EL2 and from 
Secure state. HDCR.HPME enables the operation of counters in this range. 


If this field is set to 0, or to a value larger than PMCR.N, then the following CONSTRAINED 
UNPREDICTABLE behavior applies: 


• The value returned by a direct read of HDCR.HPMN is UNKNOWN. 
• Either: 
- An UNKNOWN number of counters are reserved for EL2 use. That is, the PE behaves 
as if HDCR.HPMN is set to an UNKNOWN non-zero value less than PMCR.N. 
- All counters are reserved for EL2 use, meaning no counters are accessible from 
Non-secure EL1 and Non-secure EL0. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to the value of PMCR.N. 
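
The counter partitioning described for HPMN can be illustrated with a short sketch. The Python below is only an illustration of the two ranges given above. The function name `classify_counter` and its return shape are hypothetical, and the CONSTRAINED UNPREDICTABLE cases (HPMN equal to 0 or larger than PMCR.N) are deliberately rejected rather than modeled.

```python
def classify_counter(n, hpmn, pmcr_n):
    """Classify event counter n in Non-secure state.

    Returns (accessible_below_el2, enable_control):
      accessible_below_el2 -- True if the counter is in the range 0 <= n < HPMN,
                              so it is accessible from EL1 (and from EL0 when
                              unprivileged access is enabled)
      enable_control       -- name of the bit that enables counters in that range
    """
    if not 0 <= n < pmcr_n:
        raise ValueError("counter index is outside 0..PMCR.N-1")
    if hpmn == 0 or hpmn > pmcr_n:
        # The text makes this CONSTRAINED UNPREDICTABLE; this sketch does not model it.
        raise ValueError("HPMN value gives CONSTRAINED UNPREDICTABLE behavior")
    if n < hpmn:
        return (True, "PMCR.E")
    return (False, "HDCR.HPME")
```

With six implemented counters and HPMN set to 4, counters 0 to 3 fall in the first range and counters 4 and 5 are reserved for EL2 and Secure state.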


Accessing the HDCR: 
To access the HDCR: 


MRC p15,4,<Rt>,c1,c1,1 ; Read HDCR into Rt 
MCR p15,4,<Rt>,c1,c1,1 ; Write Rt to HDCR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2
1111    100   0001  0001  001
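
Several HDCR fields interact: the TDRA, TDOSA, and TDA descriptions state that the PE behaves as if the bit is 1 when HCR.TGE or HDCR.TDE is 1, and the TDE description states that TDE is treated as 1 when HCR.TGE is 1. The following Python is a hedged sketch of that override logic for everything other than a direct read. The function name `effective_hdcr_traps` is hypothetical, and TDE is taken to be bit [8], the position between TDA, bit [9], and TPM, bit [6], in the register layout.

```python
def effective_hdcr_traps(hdcr, hcr_tge):
    """Return the effective trap controls implied by an HDCR value and HCR.TGE.

    hdcr    -- 32-bit value as returned by a direct read of HDCR
    hcr_tge -- True if HCR.TGE is 1
    The result describes behavior, not what a direct read would return.
    """
    tdra = bool((hdcr >> 11) & 1)
    tdosa = bool((hdcr >> 10) & 1)
    tda = bool((hdcr >> 9) & 1)
    tde = bool((hdcr >> 8) & 1)   # assumed bit position, see lead-in

    # HCR.TGE == 1 makes the PE behave as if TDE is 1.
    effective_tde = tde or hcr_tge
    # TGE or TDE being 1 makes the PE behave as if TDRA, TDOSA, and TDA are 1.
    return {
        "TDRA": tdra or effective_tde,
        "TDOSA": tdosa or effective_tde,
        "TDA": tda or effective_tde,
        "TDE": effective_tde,
    }
```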
G6.3.32 SDCR, Secure Debug Control Register 
The SDCR characteristics are: 


Purpose 


When EL3 is implemented and can use AArch32, controls debug and performance monitors 
functionality in Secure state. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)
-    -        Trap    -





If EL3 is implemented and is using AArch64, any read or write to SDCR from Secure EL1 using 
AArch32 is trapped as an exception to EL3. 
Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
This register is only accessible in Secure state. 


AArch32 System register SDCR can be mapped to AArch64 System register MDCR_EL3, but this 
is not architecturally mandated. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
SDCR is a 32-bit register. 


Field descriptions 


The SDCR bit assignments are: 
Bits   31:22  21     20    19:18  17    16    15:14  13:0
Field  RES0   EPMAD  EDAD  RES0   SPME  RES0  SPD    RES0


Bits [31:22] 
Reserved, RES0. 


EPMAD, bit [21] 


External debug interface Performance Monitors registers disable. This disables access to these 
registers by an external debugger: 


0 Access to Performance Monitors registers from external debugger is permitted. 


1 Access to Performance Monitors registers from external debugger is disabled, unless 
overridden by the IMPLEMENTATION DEFINED authentication interface. 


If the Performance Monitors Extension is not implemented or does not support external debug 
interface accesses, this bit is RES0. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 


EDAD, bit [20] 


External debug interface breakpoint and watchpoint register access disable. This disables access to 
these registers by an external debugger: 


0 Access to breakpoint and watchpoint registers from external debugger is permitted. 


1 Access to breakpoint and watchpoint registers from external debugger is disabled, 
unless overridden by the IMPLEMENTATION DEFINED authentication interface. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [19:18] 
Reserved, RES0. 


SPME, bit [17] 
Secure Performance Monitors enable. This allows event counting in Secure state: 


0 Event counting prohibited in Secure state, unless overridden by the IMPLEMENTATION 
DEFINED authentication interface. 


1 Event counting allowed in Secure state. 

If the Performance Monitors Extension is not implemented, this field is RES0. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bit [16] 
Reserved, RES0. 


SPD, bits [15:14] 


AArch32 Secure privileged debug. Enables or disables debug exceptions from Secure state, other 
than Breakpoint Instruction exceptions. Valid values for this field are: 


00 Legacy mode. Debug exceptions from Secure EL1 are enabled by the authentication 
interface. 
10 Secure privileged debug disabled. Debug exceptions from Secure EL1 are disabled. 
11 Secure privileged debug enabled. Debug exceptions from Secure EL1 are enabled. 


Other values are reserved, and have the CONSTRAINED UNPREDICTABLE behavior that they must 
have the same behavior as 0b00. Software must not rely on this property as the behavior of reserved 
values might change in a future revision of the architecture. 


If debug exceptions from Secure EL1 are enabled, then debug exceptions from Secure EL0 are also 
enabled. 


Otherwise, debug exceptions from Secure EL0 are enabled only if SDER32_EL3.SUIDEN == 1. 


Ignored in Non-secure state. Debug exceptions from Breakpoint Instruction exceptions are always 
enabled. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bits [13:0] 
Reserved, RES0. 
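
The SPD rules above combine three inputs: the SPD value, the authentication interface (in Legacy mode), and the SUIDEN bit for Secure EL0. As a rough sketch under those stated rules, the Python below computes whether debug exceptions other than Breakpoint Instruction exceptions are enabled from Secure EL1 and Secure EL0. The function name and the `auth_enabled` parameter, which stands in for the result of the authentication interface, are hypothetical.

```python
def secure_debug_exceptions_enabled(spd, auth_enabled, suiden):
    """Return (el1_enabled, el0_enabled) for debug exceptions from Secure state.

    spd          -- SDCR.SPD, a 2-bit value
    auth_enabled -- True if the authentication interface enables Secure debug
                    (only consulted in Legacy mode); hypothetical parameter
    suiden       -- True if SUIDEN == 1
    Breakpoint Instruction exceptions are always enabled and are not modeled.
    """
    if not 0 <= spd <= 0b11:
        raise ValueError("SPD is a 2-bit field")
    if spd == 0b10:
        el1 = False               # Secure privileged debug disabled
    elif spd == 0b11:
        el1 = True                # Secure privileged debug enabled
    else:
        # 0b00 is Legacy mode; the reserved value 0b01 behaves as 0b00.
        el1 = bool(auth_enabled)
    # Secure EL0 follows Secure EL1 when enabled, otherwise depends on SUIDEN.
    el0 = el1 or bool(suiden)
    return (el1, el0)
```

As the text notes, software must not rely on the reserved value behaving as 0b00, so the `else` branch is descriptive only.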


Accessing the SDCR: 


To access the SDCR: 


MRC p15,0,<Rt>,c1,c3,1 ; Read SDCR into Rt 
MCR p15,0,<Rt>,c1,c3,1 ; Write Rt to SDCR 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2
1111    000   0001  0011  001










G6.3.33 SDER, Secure Debug Enable Register 
The SDER characteristics are: 


Purpose 


Controls invasive and non-invasive debug in the Secure EL0 mode. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        -        RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1(NS)  EL1(S)  EL2(NS)
-    -        RW      -





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T1==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T1==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
This register is only accessible in Secure state. 


AArch32 System register SDER is architecturally mapped to AArch64 System register 
SDER32_EL3. 


If EL3 is not implemented and EL1 supports AArch32, SDER is implemented only if the 
implemented Security state is Secure state. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
SDER is a 32-bit register. 


Field descriptions 


The SDER bit assignments are: 
Bits   31:2  1        0
Field  RES0  SUNIDEN  SUIDEN


Bits [31:2] 
Reserved, RES0. 


SUNIDEN, bit [1] 
Secure User Non-Invasive Debug Enable: 


0 Performance Monitors event counting prohibited in Secure EL0 unless allowed by 
MDCR_EL3.SPME, SDCR.SPME, or the IMPLEMENTATION DEFINED authentication 
interface ExternalSecureNoninvasiveDebugEnabled(). 


1 Performance Monitors event counting allowed in Secure EL0. 

When this register has an architecturally-defined reset value, this field resets to 0. 
SUIDEN, bit [0] 

Secure User Invasive Debug Enable: 


0 Debug exceptions other than Breakpoint Instruction exceptions from Secure EL0 are 
disabled, unless enabled by MDCR_EL3.SPD32 or SDCR.SPD. 


1 Debug exceptions from Secure EL0 are enabled. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the SDER: 
To access the SDER: 


MRC p15,0,<Rt>,c1,c1,1 ; Read SDER into Rt 
MCR p15,0,<Rt>,c1,c1,1 ; Write Rt to SDER 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2
1111    000   0001  0001  001
G6.4 Performance Monitors registers 


This section lists the Performance Monitors registers in AArch32. 
G6.4.1 PMCCFILTR, Performance Monitors Cycle Count Filter Register 
The PMCCFILTR characteristics are: 


Purpose 


Determines the modes in which the Cycle Counter, PMCCNTR, increments. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2(NS)
Config-RW  RW   RW





PMCCFILTR can also be accessed by using PMXEVTYPER with PMSELR.SEL set to 0b11111. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode. 

• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCCFILTR is architecturally mapped to AArch64 System register 
PMCCFILTR_EL0. 


AArch32 System register PMCCFILTR is architecturally mapped to External register 
PMCCFILTR_EL0. 


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
PMCCFILTR is a 32-bit register. 


Field descriptions 


The PMCCFILTR bit assignments are: 
Bits   31  30  29   28   27   26:0
Field  P   U   NSK  NSU  NSH  RES0

P, bit [31] 
Privileged filtering bit. Controls counting in EL1. If EL3 is implemented, then counting in 
Non-secure EL1 is further controlled by the NSK bit. The possible values of this bit are: 
0 Count cycles in EL1. 
1 Do not count cycles in EL1. 
When this register has an architecturally-defined reset value, this field resets to 0. 

U, bit [30] 
User filtering bit. Controls counting in EL0. If EL3 is implemented, then counting in Non-secure 
EL0 is further controlled by the NSU bit. The possible values of this bit are: 
0 Count cycles in EL0. 
1 Do not count cycles in EL0. 
When this register has an architecturally-defined reset value, this field resets to 0. 

NSK, bit [29] 
Non-secure EL1 (kernel) modes filtering bit. Controls counting in Non-secure EL1. If EL3 is not 
implemented, this bit is RES0. 
If the value of this bit is equal to the value of P, cycles in Non-secure EL1 are counted. 
Otherwise, cycles in Non-secure EL1 are not counted. 
When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 

NSU, bit [28] 
Non-secure EL0 (Unprivileged) filtering. Controls counting in Non-secure EL0. If EL3 is not 
implemented, this bit is RES0. 
If the value of this bit is equal to the value of U, cycles in Non-secure EL0 are counted. 
Otherwise, cycles in Non-secure EL0 are not counted. 
When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 

NSH, bit [27] 
Non-secure EL2 (Hyp mode) filtering bit. Controls counting in Non-secure EL2. If EL2 is not 
implemented, this bit is RES0. 
0 Do not count cycles in EL2. 
1 Count cycles in EL2. 
When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 

Bits [26:0] 
Reserved, RES0. 
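
The filter bits above combine per Exception level and Security state. The following Python is a minimal sketch of that combination for EL0, EL1, and EL2 only, written directly from the field descriptions. The function name `cycle_counted` is hypothetical, and the sketch assumes EL3 and EL2 are implemented so that NSK, NSU, and NSH are not RES0.

```python
def cycle_counted(pmccfiltr, el, secure):
    """Return True if PMCCNTR counts cycles at the given level and state.

    pmccfiltr -- 32-bit PMCCFILTR value
    el        -- Exception level: 0, 1, or 2
    secure    -- True for Secure state (ignored for EL2, which is Non-secure here)
    """
    p = (pmccfiltr >> 31) & 1
    u = (pmccfiltr >> 30) & 1
    nsk = (pmccfiltr >> 29) & 1
    nsu = (pmccfiltr >> 28) & 1
    nsh = (pmccfiltr >> 27) & 1

    if el == 0:
        # Secure EL0 is controlled by U alone; Non-secure EL0 counts when NSU == U.
        return u == 0 if secure else nsu == u
    if el == 1:
        # Secure EL1 is controlled by P alone; Non-secure EL1 counts when NSK == P.
        return p == 0 if secure else nsk == p
    if el == 2:
        # NSH == 1 enables counting in Non-secure EL2.
        return nsh == 1
    raise ValueError("this sketch only covers EL0, EL1, and EL2")
```

For instance, setting P and NSK together excludes Secure EL1 cycles while still counting Non-secure EL1 cycles.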
Accessing the PMCCFILTR: 
To access the PMCCFILTR: 


MRC p15,0,<Rt>,c14,c15,7 ; Read PMCCFILTR into Rt 
MCR p15,0,<Rt>,c14,c15,7 ; Write Rt to PMCCFILTR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1110  1111  111





PMCCFILTR can also be accessed by using PMXEVTYPER with PMSELR.SEL set to 0b11111. 
G6.4.2 PMCCNTR, Performance Monitors Cycle Count Register 
The PMCCNTR characteristics are: 


Purpose 


Holds the value of the processor Cycle Counter, CCNT, that counts processor clock cycles. See Time 
as measured by the Performance Monitors cycle counter on page D5-1835 for more information. 


PMCCFILTR determines the modes and states in which the PMCCNTR can increment. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2(NS)
Config-RW  RW   RW





The PMCR.{LC, D} bits configure whether PMCCNTR increments every clock cycle, or once 
every 64 clock cycles. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp 
mode. 

• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
EL2. 

• If PMUSERENR.CR==0, and PMUSERENR.EN==0, read accesses to this register from 
EL0 are trapped to Undefined mode. 

• If PMUSERENR_EL0.CR==0, and PMUSERENR_EL0.EN==0, read accesses to this 
register from EL0 are trapped to EL1. 

• If PMUSERENR.EN==0, write accesses to this register from EL0 are trapped to Undefined 
mode. 

• If PMUSERENR_EL0.EN==0, write accesses to this register from EL0 are trapped to EL1. 





Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCCNTR is architecturally mapped to AArch64 System register 
PMCCNTR_EL0. 


AArch32 System register PMCCNTR is architecturally mapped to External register 
PMCCNTR_EL0. 


All counters are subject to any changes in clock frequency, including clock stopping caused by the 
WFI and WFE instructions. This means that it is CONSTRAINED UNPREDICTABLE whether or not 
PMCCNTR continues to increment when clocks are stopped by WFI and WFE instructions. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 


PMCCNTR is a 64-bit register that can also be accessed as a 32-bit value. If it is accessed as a 32-bit 
register, accesses read and write bits [31:0] and do not modify bits [63:32]. 


Field descriptions 


The PMCCNTR bit assignments are: 


When accessing as a 32-bit register: 


31 0 


CCNT 


CCNT, bits [31:0] 
Cycle count. Depending on the values of PMCR.{LC,D}, this field increments in one of the 
following ways: 


• Every processor clock cycle. 
• Every 64th processor clock cycle. 

Writing 1 to PMCR.C sets this field to 0. 


When accessing as a 64-bit register: 


63 0 


CCNT 


CCNT, bits [63:0] 
Cycle count. Depending on the values of PMCR.{LC,D}, this field increments in one of the 
following ways: 


• Every processor clock cycle. 
• Every 64th processor clock cycle. 


Writing 1 to PMCR.C sets this field to 0. 


Accessing the PMCCNTR: 
To access the PMCCNTR when accessing as a 32-bit register: 


MRC p15,0,<Rt>,c9,c13,0 ; Read PMCCNTR[31:0] into Rt 
MCR p15,0,<Rt>,c9,c13,0 ; Write Rt to PMCCNTR[31:0]. PMCCNTR[63:32] are unchanged 


Register access is encoded as follows: 


coproc  opc1  CRn   CRm   opc2
1111    000   1001  1101  000


To access the PMCCNTR when accessing as a 64-bit register: 


MRRC p15,0,<Rt>,<Rt2>,c9 ; Read PMCCNTR[31:0] into Rt and PMCCNTR[63:32] into Rt2 
MCRR p15,0,<Rt>,<Rt2>,c9 ; Write Rt to PMCCNTR[31:0] and Rt2 to PMCCNTR[63:32] 


Register access is encoded as follows: 


coproc  opc1  CRm
1111    0000  1001
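
The difference between the 32-bit and 64-bit access forms can be shown with a small model. The Python below is a sketch only: it treats PMCCNTR as a 64-bit integer and mirrors the statements that a 32-bit write leaves bits [63:32] unchanged and that the 64-bit forms split the value across Rt (low word) and Rt2 (high word). All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def read32(pmccntr):
    """Model of the 32-bit read form: return PMCCNTR[31:0]."""
    return pmccntr & MASK32


def write32(pmccntr, rt):
    """Model of the 32-bit write form: replace bits [31:0], keep bits [63:32]."""
    return (pmccntr & MASK64 & ~MASK32) | (rt & MASK32)


def read64(pmccntr):
    """Model of the 64-bit read form: return (Rt, Rt2) = (low word, high word)."""
    return (pmccntr & MASK32, (pmccntr >> 32) & MASK32)


def write64(rt, rt2):
    """Model of the 64-bit write form: build the counter value from Rt and Rt2."""
    return ((rt2 & MASK32) << 32) | (rt & MASK32)
```

A 32-bit write to a counter holding `0x1122334455667788` with `0xAABBCCDD` therefore leaves `0x11223344` in the upper word.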
G6.4.3 PMCEID0, Performance Monitors Common Event Identification register 0 
The PMCEID0 characteristics are: 


Purpose 


Defines which common architectural and common microarchitectural feature events in the range 
0x00 to 0x01F are implemented. If a particular bit is set to 1, then the event for that bit is 
implemented. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2(NS)
Config-RO  RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure read accesses to this register from EL0 and EL1 are trapped 
to Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure read accesses to this register from EL0 and EL1 are 
trapped to EL2. 

• If MDCR_EL3.TPM==1, read accesses to this register from EL0, EL1, and EL2 are trapped 
to EL3. 

• If HSTR.T9==1, Non-secure read accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If HSTR_EL2.T9==1, Non-secure read accesses to this register from EL0 and EL1 are 
trapped to EL2. 

• If PMUSERENR.EN==0, read accesses to this register from EL0 are trapped to Undefined 
mode. 

• If PMUSERENR_EL0.EN==0, read accesses to this register from EL0 are trapped to EL1. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCEID0 is architecturally mapped to AArch64 System register 
PMCEID0_EL0[31:0]. 


AArch32 System register PMCEID0 is architecturally mapped to External register 
PMCEID0[31:0]. 


Attributes 
PMCEID0 is a 32-bit register. 
Field descriptions 


The PMCEID0 bit assignments are: 


Bits   31:0
Field  ID[31:0]


ID[31:0], bits [31:0] 
PMCEID0[n] maps to event n. For a list of event numbers and descriptions, see Events, event 
numbers, and mnemonics on page D5-1848. 


For each bit: 


0 The common event is not implemented. 


1 The common event is implemented. 

Bits that map to reserved event numbers are reserved to identify events that might be defined in 
future revisions to the architecture. 


Events that do not require additional features in the PMU can be defined retrospectively, meaning 
that they can be implemented as part of a PMUv3 implementation. 


Accessing the PMCEID0: 
To access the PMCEID0: 

MRC p15,0,<Rt>,c9,c12,6 ; Read PMCEID0 into Rt 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  110
G6.4.4 PMCEID1, Performance Monitors Common Event Identification register 1 
The PMCEID1 characteristics are: 


Purpose 


Defines which common architectural and common microarchitectural feature events in the range 
0x20 to 0x03F are implemented. If a particular bit is set to 1, then the event for that bit is 
implemented. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2(NS)
Config-RO  RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure read accesses to this register from EL0 and EL1 are trapped 
to Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure read accesses to this register from EL0 and EL1 are 
trapped to EL2. 

• If MDCR_EL3.TPM==1, read accesses to this register from EL0, EL1, and EL2 are trapped 
to EL3. 

• If HSTR.T9==1, Non-secure read accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If HSTR_EL2.T9==1, Non-secure read accesses to this register from EL0 and EL1 are 
trapped to EL2. 

• If PMUSERENR.EN==0, read accesses to this register from EL0 are trapped to Undefined 
mode. 

• If PMUSERENR_EL0.EN==0, read accesses to this register from EL0 are trapped to EL1. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCEID1 is architecturally mapped to AArch64 System register 
PMCEID1_EL0[31:0]. 


AArch32 System register PMCEID1 is architecturally mapped to External register 
PMCEID1[31:0]. 


Attributes 
PMCEID1 is a 32-bit register. 
Field descriptions 


The PMCEID1 bit assignments are: 


Bits   31:0
Field  ID[63:32]


ID[63:32], bits [31:0] 
PMCEID1[n] maps to event (n + 32). For a list of event numbers and descriptions, see Events, event 
numbers, and mnemonics on page D5-1848. 


For each bit: 


0 The common event is not implemented. 


1 The common event is implemented. 
Bits that map to reserved event numbers are reserved to identify events that might be defined in 
future revisions to the architecture. 


Events that do not require additional features in the PMU can be defined retrospectively, meaning 
that they can be implemented as part of a PMUv3 implementation. 
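
The bit-to-event mapping of the two identification registers (PMCEID0[n] maps to event n, PMCEID1[n] maps to event n + 32) lends itself to a simple decode. The Python below is an illustrative sketch, not an architected routine. The function name `implemented_common_events` is hypothetical and the inputs are assumed to be the raw 32-bit values read from each register.

```python
def implemented_common_events(pmceid0, pmceid1):
    """Return the sorted list of common event numbers reported as implemented.

    pmceid0 -- 32-bit value read from PMCEID0 (bit n set means event n)
    pmceid1 -- 32-bit value read from PMCEID1 (bit n set means event n + 32)
    """
    events = []
    for n in range(32):
        if (pmceid0 >> n) & 1:
            events.append(n)
    for n in range(32):
        if (pmceid1 >> n) & 1:
            events.append(n + 32)
    return events
```

Bits that map to reserved event numbers are reported like any other set bit, so a caller would still need the event list referenced above to interpret the result.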


Accessing the PMCEID1: 
To access the PMCEID1: 
MRC p15,0,<Rt>,c9,c12,7 ; Read PMCEID1 into Rt 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  111
G6.4.5 PMCNTENCLR, Performance Monitors Count Enable Clear register 
The PMCNTENCLR characteristics are: 


Purpose 


Disables the Cycle Count Register, PMCCNTR, and any implemented event counters 
PMEVCNTR<n>. Reading this register shows which counters are enabled. 


PMCNTENCLR is used in conjunction with the PMCNTENSET register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2(NS)
Config-RW  RW   RW





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN or 
MDCR_EL2.HPMN can change the behavior of accesses to PMCNTENCLR. See the description 
of the P<n> bit. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp 
mode. 

• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
EL2. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode. 

• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCNTENCLR is architecturally mapped to AArch64 System register 
PMCNTENCLR_EL0. 


AArch32 System register PMCNTENCLR is architecturally mapped to External register 
PMCNTENCLR_EL0. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
PMCNTENCLR is a 32-bit register. 
Field descriptions 


The PMCNTENCLR bit assignments are: 


C, bit [31] 
PMCCNTR disable bit. Disables the cycle counter register. Possible values are: 
0 When read, means the cycle counter is disabled. When written, has no effect. 
1 When read, means the cycle counter is enabled. When written, disables the cycle 
counter. 


P<n>, bit [n], for n = 0 to 30 
Event counter disable bit for PMEVCNTR<n>. 


Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in 
MDCR_EL2.HPMN if EL2 is using AArch64 or in HDCR.HPMN if EL2 is using AArch32. 
Otherwise, N is the value in PMCR.N. 


Possible values of each bit are: 


0 When read, means that PMEVCNTR<n> is disabled. When written, has no effect. 
1 When read, means that PMEVCNTR<n> is enabled. When written, disables 
PMEVCNTR<n>. 


Accessing the PMCNTENCLR: 
To access the PMCNTENCLR: 


MRC p15,0,<Rt>,c9,c12,2 ; Read PMCNTENCLR into Rt 
MCR p15,0,<Rt>,c9,c12,2 ; Write Rt to PMCNTENCLR 


Register access is encoded as follows: 

coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  010
G6.4.6 PMCNTENSET, Performance Monitors Count Enable Set register 
The PMCNTENSET characteristics are: 


Purpose 


Enables the Cycle Count Register, PMCCNTR, and any implemented event counters 
PMEVCNTR<n>. Reading this register shows which counters are enabled. 


PMCNTENSET is used in conjunction with the PMCNTENCLR register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0        EL1  EL2 (NS)
Config-RW  RW   RW





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN can change 
the behavior of accesses to PMCNTENSET. See the description of the P<n> bit. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp 
mode. 

• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
EL2. 

• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode. 

• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCNTENSET is architecturally mapped to AArch64 System register 
PMCNTENSET_EL0. 


AArch32 System register PMCNTENSET is architecturally mapped to External register 
PMCNTENSET_EL0. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 
Attributes 
PMCNTENSET is a 32-bit register. 


Field descriptions 


The PMCNTENSET bit assignments are: 


C, bit [31] 
PMCCNTR enable bit. Enables the cycle counter register. Possible values are: 
0 When read, means the cycle counter is disabled. When written, has no effect. 
1 When read, means the cycle counter is enabled. When written, enables the cycle 
counter. 


P<n>, bit [n], for n = 0 to 30 
Event counter enable bit for PMEVCNTR<n>. 


Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in 
MDCR_EL2.HPMN if EL2 is using AArch64 or in HDCR.HPMN if EL2 is using AArch32. 
Otherwise, N is the value in PMCR.N. 


Possible values of each bit are: 


0 When read, means that PMEVCNTR<n> is disabled. When written, has no effect. 
1 When read, means that PMEVCNTR<n> is enabled. When written, enables 
PMEVCNTR<n>. 
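
The paired set and clear behavior of PMCNTENSET and PMCNTENCLR can be modeled as two write paths onto one shared enable state. The following Python sketch is a toy model, not an implementation of the hardware: the class name CounterEnables and its methods are hypothetical, the state is assumed to start at zero (the architecture leaves the reset value UNKNOWN), and N is passed in directly rather than derived from PMCR.N or HPMN.

```python
class CounterEnables:
    """Toy model of the shared enable state behind PMCNTENSET and PMCNTENCLR."""

    def __init__(self, num_counters: int):
        if not 0 <= num_counters <= 31:
            raise ValueError("N must be in the range 0 to 31")
        # Bit 31 is C (cycle counter); bits [N-1:0] are the P<n> bits.
        # Bits [30:N] are RAZ/WI, so they are left out of the writable mask.
        self.mask = (1 << 31) | ((1 << num_counters) - 1)
        self.enabled = 0  # assumption: start disabled; real reset value is UNKNOWN

    def write_set(self, value: int) -> None:
        # Writing 1 to a bit of PMCNTENSET enables the counter; 0 has no effect.
        self.enabled |= value & self.mask

    def write_clr(self, value: int) -> None:
        # Writing 1 to a bit of PMCNTENCLR disables the counter; 0 has no effect.
        self.enabled &= ~(value & self.mask)

    def read(self) -> int:
        # Reads of either register show which counters are enabled.
        return self.enabled
```

Because a write of 0 to any bit has no effect, software can enable or disable one counter without a read-modify-write sequence.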


Accessing the PMCNTENSET: 
To access the PMCNTENSET: 


MRC p15,0,<Rt>,c9,c12,1 ; Read PMCNTENSET into Rt 
MCR p15,0,<Rt>,c9,c12,1 ; Write Rt to PMCNTENSET 


Register access is encoded as follows: 

coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  001
G6.4.7 PMCR, Performance Monitors Control Register 


The PMCR characteristics are: 


Purpose 


Provides details of the Performance Monitors implementation, including the number of counters 
implemented, and configures and controls the counters. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0        EL1  EL2 (NS)
Config-RW  RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If HDCR.TPMCR==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPMCR==1, Non-secure accesses to this register from EL0 and EL1 are 
trapped to EL2. 

• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp 
mode. 

• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
EL2. 

• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode. 

• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1. 


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMCR is architecturally mapped to AArch64 System register 
PMCR_EL0. 


AArch32 System register PMCR[6:0] is architecturally mapped to External register 
PMCR_EL0[6:0]. 
This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 

Attributes 


PMCR is a 32-bit register. 


Field descriptions 


The PMCR bit assignments are: 


IMP [31:24] | IDCODE [23:16] | N [15:11] | RES0 [10:7] | LC [6] | DP [5] | X [4] | D [3] | C [2] | P [1] | E [0] 


IMP, bits [31:24] 
Implementer code. This field is RO with an IMPLEMENTATION DEFINED value. 


The implementer codes are allocated by ARM. Values have the same interpretation as bits [31:24] 
of the MIDR. 


IDCODE, bits [23:16] 
Identification code. This field is RO with an IMPLEMENTATION DEFINED value. 
Each implementer must maintain a list of identification codes that is specific to the implementer. A 
specific implementation is identified by the combination of the implementer code and the 
identification code. 


N, bits [15:11] 


Number of event counters. A RO field that indicates the number of event counters implemented. A value of 
0b00000 in this field indicates that only the Cycle Count Register PMCCNTR is implemented. 


The value of this field is the number of event counters implemented. This value is in the range of 
0b00000, in which case only the PMCCNTR is implemented, to 0b11111, which indicates that the 
PMCCNTR and 31 event counters are implemented. 


In an implementation that includes EL2, reads of this field from Non-secure EL1 and Non-secure 
EL0 return the value of HDCR.HPMN if EL2 is using AArch32, or the value of 
MDCR_EL2.HPMN if EL2 is using AArch64. 





Bits [10:7] 
Reserved, RES0. 

LC, bit [6] 
Long cycle counter enable. Determines which PMCCNTR bit generates an overflow recorded by 
PMOVSR[31]. 
0 Cycle counter overflow on increment that changes PMCCNTR[31] from 1 to 0. 
1 Cycle counter overflow on increment that changes PMCCNTR[63] from 1 to 0. 
ARM deprecates use of PMCR.LC = 0. 
This field resets to a value that is architecturally UNKNOWN. 
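
The effect of PMCR.LC on overflow detection can be illustrated with a short Python sketch. The function name cycle_counter_overflowed is hypothetical, and the two counter values are assumed to be 64-bit snapshots of PMCCNTR taken immediately before and after a single increment.

```python
def cycle_counter_overflowed(before: int, after: int, lc: int) -> bool:
    """Report whether an increment from `before` to `after` is an overflow.

    With LC == 0 the overflow is a 1-to-0 change of PMCCNTR[31];
    with LC == 1 it is a 1-to-0 change of PMCCNTR[63].
    """
    bit = 63 if lc else 31
    was_set = (before >> bit) & 1
    now_set = (after >> bit) & 1
    return was_set == 1 and now_set == 0
```

The same increment can therefore be an overflow under LC == 0 and not under LC == 1, which is one reason to program LC consistently before relying on PMOVSR[31].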

DP, bit [5] 
Disable cycle counter when event counting is prohibited. The possible values of this bit are: 
0 PMCCNTR, if enabled, counts when event counting is prohibited. 
1 PMCCNTR does not count when event counting is prohibited. 



Counting events is never prohibited in Non-secure state. However, there are some restrictions on 
counting events in Secure state. For more information about the interaction between the 
Performance Monitors and EL3, see Interaction with EL3 on page D5-1841. 


If EL3 is not implemented, this field is RES0, otherwise it is an RW field. 
When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 

X, bit [4] 


Enable export of events in an IMPLEMENTATION DEFINED event stream. The possible values of this 
bit are: 


0 Do not export events. 
1 Export events where not prohibited. 


This field enables the exporting of events over an event bus to another device, for example to an 
OPTIONAL trace macrocell. If the implementation does not include such an event bus then this field 
is RAZ/WI, otherwise it is an RW field. 


In an implementation that includes an event bus, no events are exported when counting is prohibited. 


This field does not affect the generation of Performance Monitors overflow interrupt requests or 
signaling to a cross-trigger interface (CTI) that can be implemented as signals exported from the PE. 


When this register has an architecturally-defined reset value, if this field is implemented as an RW 
field, it resets to 0. 
D, bit [3] 
Clock divider. The possible values of this bit are: 
0 When enabled, PMCCNTR counts every clock cycle. 
1 When enabled, PMCCNTR counts once every 64 clock cycles. 
This bit is RW. 
If PMCR.LC == 1, this bit is ignored and the cycle counter counts every clock cycle. 
ARM deprecates use of PMCR.D = 1. 


When this register has an architecturally-defined reset value, this field resets to 0. 


C, bit [2] 
Cycle counter reset. This bit is WO. The effects of writing to this bit are: 
0 No action. 
1 Reset PMCCNTR to zero. 
This bit is always RAZ. 
Resetting PMCCNTR does not clear the PMCCNTR overflow bit to 0. 


P, bit [1] 
Event counter reset. This bit is WO. The effects of writing to this bit are: 
0 No action. 
1 Reset all event counters accessible in the current EL, not including PMCCNTR, to zero. 
This bit is always RAZ. 


In Non-secure EL0 and EL1, if EL2 is implemented, a write of 1 to this bit does not reset event 
counters that HDCR.HPMN or MDCR_EL2.HPMN reserves for EL2 use. 


In EL2 and EL3, a write of 1 to this bit resets all the event counters. 
Resetting the event counters does not clear any overflow bits to 0. 
E, bit [0] 
Enable. The possible values of this bit are: 
0 All counters that are accessible at Non-secure EL1, including PMCCNTR, are disabled. 
1 All counters that are accessible at Non-secure EL1 are enabled by PMCNTENSET. 
This bit is RW. 


If EL2 is implemented, this bit does not affect the operation of event counters that HDCR.HPMN 
or MDCR_EL2.HPMN reserves for EL2 use. 


When this register has an architecturally-defined reset value, this field resets to 0. 
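
The PMCR field layout described above can be unpacked with simple shifts and masks. The Python sketch below is illustrative only: decode_pmcr is a hypothetical helper name, and the value is assumed to have been obtained from a read of PMCR. The reserved bits [10:7] are not returned.

```python
def decode_pmcr(value: int) -> dict:
    """Split a 32-bit PMCR value into its named fields."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("PMCR is a 32-bit register")

    def bits(hi: int, lo: int) -> int:
        # Extract the inclusive bit range [hi:lo].
        return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "IMP": bits(31, 24),     # implementer code
        "IDCODE": bits(23, 16),  # identification code
        "N": bits(15, 11),       # number of event counters
        "LC": bits(6, 6),        # long cycle counter enable
        "DP": bits(5, 5),        # disable cycle counter when counting prohibited
        "X": bits(4, 4),         # export enable
        "D": bits(3, 3),         # clock divider
        "C": bits(2, 2),         # cycle counter reset (WO, reads as zero)
        "P": bits(1, 1),         # event counter reset (WO, reads as zero)
        "E": bits(0, 0),         # enable
    }
```

Note that from Non-secure EL1 and EL0 on an implementation that includes EL2, the N field read back reflects HPMN rather than the number of implemented counters.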


Accessing the PMCR: 
To access the PMCR: 


MRC p15,0,<Rt>,c9,c12,0 ; Read PMCR into Rt 
MCR p15,0,<Rt>,c9,c12,0 ; Write Rt to PMCR 


Register access is encoded as follows: 

coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  000
G6.4.8 PMEVCNTR<n>, Performance Monitors Event Count Registers, n = 0 - 30 
The PMEVCNTR<n> characteristics are: 


Purpose 


Holds event counter n, which counts events, where n is 0 to 30. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0        EL1  EL2 (NS)
Config-RW  RW   RW





This register can be read at EL0 when PMUSERENR.EN or PMUSERENR.ER is set to 1, and can 
be written at EL0 when PMUSERENR.EN is set to 1. 


PMEVCNTR<n> can also be accessed by using PMXEVCNTR with PMSELR.SEL set to n. 


If <n> is greater than or equal to the number of accessible counters, reads and writes of 
PMEVCNTR<n> are CONSTRAINED UNPREDICTABLE, and the following behaviors are permitted: 


• Accesses to the register are UNDEFINED. 
• Accesses to the register behave as RAZ/WI. 
• Accesses to the register execute as a NOP. 

• In Non-secure state, for an access from PL1 or a permitted access from PL0, if 
PMSELR.SEL, or PMSELR_EL0.SEL if EL1 is using AArch64, is greater than or equal to 
the number of accessible counters but is less than the number of implemented counters, the 
register access is trapped to EL2. Accesses from PL0 are permitted when: 

- EL1 is using AArch32 and the value of PMUSERENR.EN is 1. 
- EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1. 

Note 
In an implementation that includes EL2, in Non-secure state at EL0 and EL1: 
• If EL2 is using AArch32, HDCR.HPMN identifies the number of accessible counters. 
• If EL2 is using AArch64, MDCR_EL2.HPMN identifies the number of accessible counters. 

Otherwise, the number of accessible counters is the number of implemented counters. 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If PMUSERENR.EN==0, and PMUSERENR.ER==0, read accesses to this register from 
EL0 are trapped to Undefined mode. 

• If PMUSERENR.EN==0, write accesses to this register from EL0 are trapped to Undefined 
mode. 

• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.ER==0, read accesses to this 
register from EL0 are trapped to EL1. 

• If PMUSERENR_EL0.EN==0, write accesses to this register from EL0 are trapped to EL1. 





Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMEVCNTR<n> is architecturally mapped to AArch64 System register 
PMEVCNTR<n>_EL0. 


AArch32 System register PMEVCNTR<n> is architecturally mapped to External register 
PMEVCNTR<n>_EL0. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
PMEVCNTR<n> is a 32-bit register. 


Field descriptions 


The PMEVCNTR<n> bit assignments are: 


Bits [31:0] 


Event counter n. Value of event counter n, where n is the number of this register and is a number 
from 0 to 30. 


Accessing the PMEVCNTR<n>: 
To access the PMEVCNTR<n>: 


MRC p15,0,<Rt>,c14,<CRm>,<opc2> ; Read PMEVCNTR<n> into Rt, where n is in the range 0 to 30 
MCR p15,0,<Rt>,c14,<CRm>,<opc2> ; Write Rt to PMEVCNTR<n>, where n is in the range 0 to 30 


Register access is encoded as follows: 

coproc  opc1  CRn   CRm        opc2
1111    000   1110  10:n<4:3>  n<2:0>





PMEVCNTR<n> can also be accessed by using PMXEVCNTR with PMSELR.SEL set to the value of <n>. 
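
The way the counter number n is split across the CRm and opc2 fields can be shown with a small Python sketch. The helper names pmevcntr_encoding and pmevcntr_read_asm are hypothetical, and the default register r0 is only an example choice for <Rt>.

```python
def pmevcntr_encoding(n: int) -> tuple:
    """Return (CRm, opc2) for an access to PMEVCNTR<n>."""
    if not 0 <= n <= 30:
        raise ValueError("n must be in the range 0 to 30")
    crm = 0b1000 | (n >> 3)  # CRm = 0b10:n<4:3>
    opc2 = n & 0b111         # opc2 = n<2:0>
    return crm, opc2


def pmevcntr_read_asm(n: int, rt: str = "r0") -> str:
    """Format the MRC instruction that reads PMEVCNTR<n> into rt."""
    crm, opc2 = pmevcntr_encoding(n)
    return f"MRC p15,0,{rt},c14,c{crm},{opc2}"
```

For instance, n = 9 gives CRm = c9 and opc2 = 1, so counters 8 to 15 all share CRm c9 and differ only in opc2.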
G6.4.9 PMEVTYPER<n>, Performance Monitors Event Type Registers, n = 0 - 30 
The PMEVTYPER<n> characteristics are: 


Purpose 


Configures event counter n, where n is 0 to 30. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 

EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0        EL1  EL2 (NS)
Config-RW  RW   RW





This register is accessible at EL0 when PMUSERENR.EN is set to 1. 

PMEVTYPER<n> can also be accessed by using PMXEVTYPER with PMSELR.SEL set to n. 


If <n> is greater than or equal to the number of accessible counters, reads and writes of 
PMEVTYPER<n> are CONSTRAINED UNPREDICTABLE, and the following behaviors are permitted: 


• Accesses to the register are UNDEFINED. 
• Accesses to the register behave as RAZ/WI. 
• Accesses to the register execute as a NOP. 

• In Non-secure state, for an access from PL1 or a permitted access from PL0, if 
PMSELR.SEL, or PMSELR_EL0.SEL if EL1 is using AArch64, is greater than or equal to 
the number of accessible counters but is less than the number of implemented counters, the 
register access is trapped to EL2. Accesses from PL0 are permitted when: 

- EL1 is using AArch32 and the value of PMUSERENR.EN is 1. 
- EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1. 

Note 
In an implementation that includes EL2, in Non-secure state at EL0 and EL1: 
• If EL2 is using AArch32, HDCR.HPMN identifies the number of accessible counters. 
• If EL2 is using AArch64, MDCR_EL2.HPMN identifies the number of accessible counters. 

Otherwise, the number of accessible counters is the number of implemented counters. 





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to 
Hyp mode. 

• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped 
to EL2. 

• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to 
EL3. 

• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode. 

• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMEVTYPER<n> is architecturally mapped to AArch64 System register 
PMEVTYPER<n>_EL0. 


AArch32 System register PMEVTYPER<n> is architecturally mapped to External register 
PMEVTYPER<n>_EL0. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
PMEVTYPER<n> is a 32-bit register. 


Field descriptions 


The PMEVTYPER<n> bit assignments are: 


P [31] | U [30] | NSK [29] | NSU [28] | NSH [27] | RES0 [26] | MT [25] | RES0 [24:10] | evtCount [9:0] 

P, bit [31] 
Privileged filtering bit. Controls counting in EL1. If EL3 is implemented, then counting in 
Non-secure EL1 is further controlled by the NSK bit. The possible values of this bit are: 
0 Count events in EL1. 
1 Do not count events in EL1. 

U, bit [30] 
User filtering bit. Controls counting in EL0. If EL3 is implemented, then counting in Non-secure 
EL0 is further controlled by the NSU bit. The possible values of this bit are: 
0 Count events in EL0. 
1 Do not count events in EL0. 

NSK, bit [29] 
Non-secure EL1 (kernel) modes filtering bit. Controls counting in Non-secure EL1. If EL3 is not 
implemented, this bit is RES0. 
If the value of this bit is equal to the value of P, events in Non-secure EL1 are counted. 
Otherwise, events in Non-secure EL1 are not counted. 

NSU, bit [28] 
Non-secure EL0 (Unprivileged) filtering. Controls counting in Non-secure EL0. If EL3 is not 
implemented, this bit is RES0. 
If the value of this bit is equal to the value of U, events in Non-secure EL0 are counted. 
Otherwise, events in Non-secure EL0 are not counted. 

NSH, bit [27] 
Non-secure EL2 (Hyp mode) filtering bit. Controls counting in Non-secure EL2. If EL2 is not 
implemented, this bit is RES0. 
0 Do not count events in EL2. 
1 Count events in EL2. 

Bit [26] 
Reserved, RES0. 

MT, bit [25] 
Multithreading. When the implementation is multi-threaded, the valid values for this bit are: 
0 Count events only on controlling PE. 
1 Count events from any PE with the same affinity at level 1 and above as this PE. 

When the implementation is not multi-threaded, this bit is RES0. 


Note 

• An implementation is described as multi-threaded when the lowest level of affinity consists 
of logical PEs that are implemented using a multi-threading type approach. That is, the 
performance of PEs at the lowest affinity level is highly interdependent. On such an 
implementation, the value of MPIDR_EL1.MT, when read at the highest implemented 
Exception level, is 1. 

• Events from a different thread of a multithreaded implementation are not Attributable to the 
thread counting the event. 





Bits [24:10] 
Reserved, RES0. 

evtCount, bits [9:0] 
Event to count. The event number of the event that is counted by event counter PMEVCNTR<n>. 
Software must program this field with an event that is supported by the PE being programmed. 


There are three ranges of event numbers: 


• Event numbers in the range 0x000 to 0x03F are common architectural and microarchitectural 
events. 
• Event numbers in the range 0x040 to 0x0BF are ARM recommended common architectural and 
microarchitectural events. 
• Event numbers in the range 0x0C0 to 0x3FF are IMPLEMENTATION DEFINED events. 
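
The three event number ranges listed above can be expressed as a simple classifier. This Python sketch is illustrative: the function name classify_event and the returned label strings are hypothetical, and the function says nothing about whether a given PE actually supports the event.

```python
def classify_event(evt: int) -> str:
    """Name the architectural range that a 10-bit evtCount value falls in."""
    if not 0 <= evt <= 0x3FF:
        raise ValueError("evtCount is a 10-bit field")
    if evt <= 0x03F:
        return "common"
    if evt <= 0x0BF:
        return "ARM recommended common"
    return "IMPLEMENTATION DEFINED"
```

Only the first range is reported by the common event identification registers, so software would typically combine a check like this with a read of those registers.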


If evtCount is programmed to an event that is reserved or not supported by the PE, the behavior 
depends on the event type: 


• For the range 0x000 to 0x03F, no events are counted, and the value returned by a direct or 
external read of the evtCount field is the value written to the field. 

• For IMPLEMENTATION DEFINED events, it is UNPREDICTABLE what event, if any, is counted, 
and the value returned by a direct or external read of the evtCount field is UNKNOWN. 

Note 

UNPREDICTABLE means the event must not expose privileged information. 





ARM recommends that the behavior across a family of implementations is defined such that if a 
given implementation does not include an event from a set of common IMPLEMENTATION DEFINED 
events, then no event is counted and the value read back on evtCount is the value written. 
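
The interaction of the P, U, NSK, NSU, and NSH filter bits can be sketched as a predicate over the Exception level and Security state. This Python sketch is a simplified model under stated assumptions: the function name event_counted is hypothetical, both EL2 and EL3 are assumed to be implemented so that none of the filter bits is RES0, counting at EL3 and the MT bit are not modeled, and counting is assumed not to be prohibited for other reasons.

```python
def event_counted(typer: int, el: int, non_secure: bool) -> bool:
    """Decide whether a PMEVTYPER<n> value permits counting at EL0, EL1, or EL2."""
    p = (typer >> 31) & 1
    u = (typer >> 30) & 1
    nsk = (typer >> 29) & 1
    nsu = (typer >> 28) & 1
    nsh = (typer >> 27) & 1

    if el == 2:
        # NSH == 1 enables counting in Non-secure EL2.
        return nsh == 1
    if el == 1:
        # Non-secure EL1 is counted when NSK equals P; Secure EL1 when P == 0.
        return nsk == p if non_secure else p == 0
    if el == 0:
        # Non-secure EL0 is counted when NSU equals U; Secure EL0 when U == 0.
        return nsu == u if non_secure else u == 0
    raise ValueError("only EL0, EL1, and EL2 are modeled in this sketch")
```

One consequence of the equality rule is that setting both P and NSK counts events in Non-secure EL1 while excluding Secure EL1.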
Accessing the PMEVTYPER<n>: 
To access the PMEVTYPER<n>: 


MRC p15,0,<Rt>,c14,<CRm>,<opc2> ; Read PMEVTYPER<n> into Rt, where n is in the range 0 to 30 
MCR p15,0,<Rt>,c14,<CRm>,<opc2> ; Write Rt to PMEVTYPER<n>, where n is in the range 0 to 30 


Register access is encoded as follows: 

coproc  opc1  CRn   CRm        opc2
1111    000   1110  11:n<4:3>  n<2:0>


G6.4.10 PMINTENCLR, Performance Monitors Interrupt Enable Clear register 
The PMINTENCLR characteristics are: 


Purpose 


Disables the generation of interrupt requests on overflows from the Cycle Count Register, 
PMCCNTR, and the event counters PMEVCNTR<n>. Reading the register shows which overflow 
interrupt requests are enabled. 


PMINTENCLR is used in conjunction with the PMINTENSET register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2 (NS)
-    RW   RW





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN can change 
the behavior of accesses to PMINTENCLR. See the description of the P<n> bit. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
• If MDCR_EL3.TPM==1, accesses to this register from EL1 and EL2 are trapped to EL3. 
• If HSTR.T9==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMINTENCLR is architecturally mapped to AArch64 System register 
PMINTENCLR_EL1. 


AArch32 System register PMINTENCLR is architecturally mapped to External register 
PMINTENCLR_EL1. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
PMINTENCLR is a 32-bit register. 


Field descriptions 


The PMINTENCLR bit assignments are: 
C, bit [31] 
PMCCNTR overflow interrupt request disable bit. Possible values are: 
0 When read, means the cycle counter overflow interrupt request is disabled. When 
written, has no effect. 
1 When read, means the cycle counter overflow interrupt request is enabled. When 
written, disables the cycle count overflow interrupt request. 


P<n>, bit [n], for n = 0 to 30 


Event counter overflow interrupt request disable bit for PMEVCNTR<n>. 


When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in HDCR.HPMN. 
Otherwise, N is the value in PMCR.N. 

Bits [30:N] are RAZ/WI. 

Possible values are: 
0 When read, means that the PMEVCNTR<n> event counter interrupt request is disabled. 
When written, has no effect. 
1 When read, means that the PMEVCNTR<n> event counter interrupt request is enabled. 
When written, disables the PMEVCNTR<n> interrupt request. 

Accessing the PMINTENCLR: 

To access the PMINTENCLR: 


MRC p15,0,<Rt>,c9,c14,2 ; Read PMINTENCLR into Rt 
MCR p15,0,<Rt>,c9,c14,2 ; Write Rt to PMINTENCLR 


Register access is encoded as follows: 

coproc  opc1  CRn   CRm   opc2
1111    000   1001  1110  010
G6.4.11 PMINTENSET, Performance Monitors Interrupt Enable Set register 
The PMINTENSET characteristics are: 


Purpose 


Enables the generation of interrupt requests on overflows from the Cycle Count Register, 
PMCCNTR, and the event counters PMEVCNTR<n>. Reading the register shows which overflow 
interrupt requests are enabled. 


PMINTENSET is used in conjunction with the PMINTENCLR register. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 

EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW

If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 

EL0  EL1  EL2 (NS)
-    RW   RW





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN can change 
the behavior of accesses to PMINTENSET. See the description of the P<n> bit. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL1 are trapped to EL2. 
• If MDCR_EL3.TPM==1, accesses to this register from EL1 and EL2 are trapped to EL3. 
• If HSTR.T9==1, Non-secure accesses to this register from EL1 are trapped to Hyp mode. 
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL1 are trapped to EL2. 


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMINTENSET is architecturally mapped to AArch64 System register 
PMINTENSET_EL1. 


AArch32 System register PMINTENSET is architecturally mapped to External register 
PMINTENSET_EL1. 


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
PMINTENSET is a 32-bit register. 


Field descriptions 


The PMINTENSET bit assignments are: 
31 30 0
C, bit [31] 


PMCCNTR overflow interrupt request enable bit. Possible values are: 


0 When read, means the cycle counter overflow interrupt request is disabled. When
written, has no effect.

1 When read, means the cycle counter overflow interrupt request is enabled. When
written, enables the cycle counter overflow interrupt request.


P<n>, bit [n], for n = 0 to 30 


Event counter overflow interrupt request enable bit for PMEVCNTR<n>. 


When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in HDCR.HPMN.
Otherwise, N is the value in PMCR.N. 


Bits [30:N] are RAZ/WI. 


Possible values are: 


0 When read, means that the PMEVCNTR<n> event counter interrupt request is disabled.
When written, has no effect.

1 When read, means that the PMEVCNTR<n> event counter interrupt request is enabled.
When written, enables the PMEVCNTR<n> interrupt request.
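
The write-one-to-set behavior and the RAZ/WI treatment of bits [30:N] described above can be modelled in a few lines. This Python sketch is a behavioral illustration only, not architectural pseudocode. The names `writable_mask`, `pmintenset_write`, and `pmintenclr_write`, and the parameter `n_counters` (standing in for N), are assumptions made for this example, and the PMINTENCLR behavior is assumed to be the write-one-to-clear counterpart of this register.

```python
C_BIT = 1 << 31  # C, bit [31]: PMCCNTR overflow interrupt request enable


def writable_mask(n_counters: int) -> int:
    """Bits that are not RAZ/WI: C plus P<0>..P<N-1>."""
    if not 0 <= n_counters <= 31:
        raise ValueError("N must be in the range 0..31")
    return C_BIT | ((1 << n_counters) - 1)


def pmintenset_write(enables: int, value: int, n_counters: int) -> int:
    """Model a write to PMINTENSET: 1 enables the request, 0 has no effect."""
    return (enables | (value & writable_mask(n_counters))) & 0xFFFFFFFF


def pmintenclr_write(enables: int, value: int, n_counters: int) -> int:
    """Assumed counterpart: 1 disables the request, 0 has no effect."""
    return enables & ~(value & writable_mask(n_counters)) & 0xFFFFFFFF
```

With four accessible counters, writing all ones to the modelled PMINTENSET leaves only bits [3:0] and bit [31] set, because the remaining bits ignore writes.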

Accessing the PMINTENSET: 

To access the PMINTENSET: 


MRC p15,0,<Rt>,c9,c14,1 ; Read PMINTENSET into Rt 
MCR p15,0,<Rt>,c9,c14,1 ; Write Rt to PMINTENSET 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1110  001
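
For illustration, the fields in the encoding table above can be packed into a 32-bit A32 instruction word. This sketch assumes the standard A32 MRC/MCR field layout (cond, opc1, L, CRn, Rt, coproc, opc2, CRm) and defaults to the always (AL) condition. The helper name `encode_cp_access` is hypothetical.

```python
def encode_cp_access(read: bool, rt: int, coproc: int, opc1: int,
                     crn: int, crm: int, opc2: int, cond: int = 0xE) -> int:
    """Pack an A32 MRC (read=True) or MCR (read=False) instruction word."""
    for name, val, limit in (("rt", rt, 16), ("coproc", coproc, 16),
                             ("opc1", opc1, 8), ("crn", crn, 16),
                             ("crm", crm, 16), ("opc2", opc2, 8),
                             ("cond", cond, 16)):
        if not 0 <= val < limit:
            raise ValueError(f"{name} out of range")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21)
            | (int(read) << 20)   # L bit: 1 for MRC, 0 for MCR
            | (crn << 16) | (rt << 12) | (coproc << 8)
            | (opc2 << 5) | (1 << 4) | crm)


# MRC p15,0,r0,c9,c14,1 : read PMINTENSET into r0
PMINTENSET_READ = encode_cp_access(True, 0, 15, 0, 9, 14, 1)   # 0xEE190F3E
# MCR p15,0,r0,c9,c14,1 : write r0 to PMINTENSET
PMINTENSET_WRITE = encode_cp_access(False, 0, 15, 0, 9, 14, 1)  # 0xEE090F3E
```

The same helper covers the other registers in this section by substituting the CRm and opc2 values from their encoding tables.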
G6.4.12 PMOVSR, Performance Monitors Overflow Flag Status Register
The PMOVSR characteristics are: 


Purpose 
Contains the state of the overflow bit for the Cycle Count Register, PMCCNTR, and each of the 
implemented event counters PMEVCNTR<n>. Writing to this register clears these bits. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN or
MDCR_EL2.HPMN can change the behavior of accesses to PMOVSR. See the description of the 
P<n> bit. 

Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
EL2.
• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMOVSR is architecturally mapped to AArch64 System register
PMOVSCLR_EL0.

AArch32 System register PMOVSR is architecturally mapped to External register
PMOVSCLR_EL0.


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 

Attributes
PMOVSR is a 32-bit register. 


Field descriptions 


The PMOVSR bit assignments are: 


31 30 0 
P<n>, bit [n] 
C, bit [31] 
PMCCNTR overflow bit. Possible values are: 
0 When read, means the cycle counter has not overflowed. When written, has no effect. 
1 When read, means the cycle counter has overflowed. When written, clears the overflow
bit to 0.
PMCR.LC controls whether an overflow is detected from PMCCNTR[31] or from PMCCNTR[63].


P<n>, bit [n], for n = 0 to 30 
Event counter overflow clear bit for PMEVCNTR<n>. 


Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN if EL2 is using AArch64 or in HDCR.HPMN if EL2 is using AArch32. 
Otherwise, N is the value in PMCR.N. 


Possible values of each bit are: 


0 When read, means that PMEVCNTR<n> has not overflowed. When written, has no
effect.
1 When read, means that PMEVCNTR<n> has overflowed. When written, clears the
PMEVCNTR<n> overflow bit to 0.
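
A common use of the write-one-to-clear behavior above is to read the flags, handle each overflowed counter, and then write the value that was read back to clear only those flags. The following Python sketch models that sequence. The function names and the `n_counters` parameter (standing in for N) are illustrative assumptions, not architectural names.

```python
CYCLE_COUNTER = 31  # index of the C bit, which reports PMCCNTR overflow


def valid_mask(n_counters: int) -> int:
    """C plus P<0>..P<N-1>; bits [30:N] are RAZ/WI."""
    return (1 << CYCLE_COUNTER) | ((1 << n_counters) - 1)


def overflowed(flags: int, n_counters: int) -> list:
    """Bit positions whose overflow flag reads as 1 (31 means PMCCNTR)."""
    flags &= valid_mask(n_counters)
    return [n for n in range(32) if (flags >> n) & 1]


def pmovsr_write(flags: int, value: int, n_counters: int) -> int:
    """Model a write to PMOVSR: 1 clears the flag, 0 has no effect."""
    return flags & ~(value & valid_mask(n_counters)) & 0xFFFFFFFF
```

Writing back exactly the value that was read avoids clearing a flag that was set after the read, which a write of all ones would do.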


Accessing the PMOVSR: 
To access the PMOVSR: 


MRC p15,0,<Rt>,c9,c12,3 ; Read PMOVSR into Rt 
MCR p15,0,<Rt>,c9,c12,3 ; Write Rt to PMOVSR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  011
G6.4.13 PMOVSSET, Performance Monitors Overflow Flag Status Set register
The PMOVSSET characteristics are: 


Purpose 
Sets the state of the overflow bit for the Cycle Count Register, PMCCNTR, and each of the 
implemented event counters PMEVCNTR<n>. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN or
MDCR_EL2.HPMN can change the behavior of accesses to PMOVSSET. See the description of the 
P<n> bit. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
EL2.
• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMOVSSET is architecturally mapped to AArch64 System register
PMOVSSET_EL0.

AArch32 System register PMOVSSET is architecturally mapped to External register
PMOVSSET_EL0.


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 

Attributes
PMOVSSET is a 32-bit register. 


Field descriptions 


The PMOVSSET bit assignments are: 


31 30 0 
P<n>, bit [n] 
C, bit [31] 
PMCCNTR overflow bit. Possible values are: 
0 When read, means the cycle counter has not overflowed. When written, has no effect. 
1 When read, means the cycle counter has overflowed. When written, sets the overflow
bit to 1.


P<n>, bit [n], for n = 0 to 30 
Event counter overflow set bit for PMEVCNTR<n>. 


Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN if EL2 is using AArch64 or in HDCR.HPMN if EL2 is using AArch32. 
Otherwise, N is the value in PMCR.N. 


Possible values are: 


0 When read, means that PMEVCNTR<n> has not overflowed. When written, has no
effect.
1 When read, means that PMEVCNTR<n> has overflowed. When written, sets the
PMEVCNTR<n> overflow bit to 1.


Accessing the PMOVSSET: 
To access the PMOVSSET: 


MRC p15,0,<Rt>,c9,c14,3 ; Read PMOVSSET into Rt 
MCR p15,0,<Rt>,c9,c14,3 ; Write Rt to PMOVSSET 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1110  011
G6.4.14 PMSELR, Performance Monitors Event Counter Selection Register


The PMSELR characteristics are: 


Purpose 


Selects the current event counter PMEVCNTR<n> or the cycle counter, CCNT. 


PMSELR is used in conjunction with PMXEVTYPER to determine the event that increments a 
selected event counter, and the modes and states in which the selected counter increments. 


It is also used in conjunction with PMXEVCNTR, to determine the value of a selected event 
counter. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
EL2.
• If PMUSERENR.EN==0, and PMUSERENR.ER==0, accesses to this register from EL0 are
trapped to Undefined mode.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.ER==0, accesses to this register
from EL0 are trapped to EL1.


Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMSELR is architecturally mapped to AArch64 System register
PMSELR_EL0.


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 

Attributes


PMSELR is a 32-bit register. 


Field descriptions 


The PMSELR bit assignments are: 


31        5 4    0
RES0        SEL

Bits [31:5]

Reserved, RES0.


SEL, bits [4:0] 


Selects event counter, PMEVCNTR<n>, where n is the value held in this field. This value identifies
which event counter is accessed when a subsequent access to PMXEVTYPER or PMXEVCNTR
occurs.
This field can take any value from 0 (0b00000) to (PMCR.N)-1, or 31 (0b11111).
When PMSELR.SEL is 0b11111 it selects the cycle counter and: 


• A read of the PMXEVTYPER returns the value of PMCCFILTR.
• A write of the PMXEVTYPER writes to PMCCFILTR.
• A read or write of PMXEVCNTR has CONSTRAINED UNPREDICTABLE effects, that can be one
of the following:
– Access to PMXEVCNTR is UNDEFINED.
– Access to PMXEVCNTR behaves as a NOP.
– Access to PMXEVCNTR behaves as if the register is RAZ/WI.
– Access to PMXEVCNTR behaves as if the PMSELR.SEL field contains an UNKNOWN
value.

If this field is set to a value greater than or equal to the number of implemented counters, but not
equal to 31, the results of access to PMXEVTYPER or PMXEVCNTR are CONSTRAINED
UNPREDICTABLE, and can be one of the following:
• Access to PMXEVTYPER or PMXEVCNTR is UNDEFINED.
• Access to PMXEVTYPER or PMXEVCNTR behaves as a NOP.
• Access to PMXEVTYPER or PMXEVCNTR behaves as if the register is RAZ/WI.
• Access to PMXEVTYPER or PMXEVCNTR behaves as if the PMSELR.SEL field contains
an UNKNOWN value.
• Access to PMXEVTYPER or PMXEVCNTR behaves as if the PMSELR.SEL field contains
0b11111.


Direct reads of this field return an UNKNOWN value. 
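
The three cases described for SEL can be summarised as a small classifier. This is a sketch for illustration. The function name `classify_pmselr` and the returned labels are assumptions, and `n_counters` stands in for the number of implemented counters, PMCR.N.

```python
CYCLE_COUNTER_SEL = 0b11111  # SEL value that selects the cycle counter


def classify_pmselr(pmselr: int, n_counters: int) -> str:
    """Classify the SEL field, bits [4:0]; bits [31:5] are RES0 and ignored."""
    sel = pmselr & 0x1F
    if sel == CYCLE_COUNTER_SEL:
        return "cycle"        # PMXEVTYPER accesses PMCCFILTR
    if sel < n_counters:
        return "event"        # selects PMEVCNTR<sel>
    return "constrained-unpredictable"
```

For an implementation with six event counters, SEL values 0 to 5 select an event counter, 31 selects the cycle counter, and 6 to 30 fall into the CONSTRAINED UNPREDICTABLE case.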


Accessing the PMSELR: 


To access the PMSELR: 


MRC p15,0,<Rt>,c9,c12,5 ; Read PMSELR into Rt 
MCR p15,0,<Rt>,c9,c12,5 ; Write Rt to PMSELR 

Register access is encoded as follows:





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  101
G6.4.15 PMSWINC, Performance Monitors Software Increment register
The PMSWINC characteristics are: 


Purpose 
Increments a counter that is configured to count the Software increment event, event 0x00. For more 
information, see SW_INCR. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-WO  Config-WO  WO       WO       WO             WO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-WO  WO   WO





If EL2 is implemented, in Non-secure EL1 and EL0 modes, the value of HDCR.HPMN or
MDCR_EL2.HPMN can change the behavior of accesses to PMSWINC. See the description of the 
P<n> bit. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure write accesses to this register from EL0 and EL1 are trapped
to Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure write accesses to this register from EL0 and EL1 are
trapped to EL2.
• If MDCR_EL3.TPM==1, write accesses to this register from EL0, EL1, and EL2 are trapped
to EL3.
• If HSTR.T9==1, Non-secure write accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If HSTR_EL2.T9==1, Non-secure write accesses to this register from EL0 and EL1 are
trapped to EL2.
• If PMUSERENR.EN==0, and PMUSERENR.SW==0, write accesses to this register from
EL0 are trapped to Undefined mode.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.SW==0, write accesses to this
register from EL0 are trapped to EL1.

Configurations
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMSWINC is architecturally mapped to AArch64 System register
PMSWINC_EL0.

AArch32 System register PMSWINC is architecturally mapped to External register
PMSWINC_EL0.

Attributes
PMSWINC is a 32-bit register. 


Field descriptions 


The PMSWINC bit assignments are: 


31    30              0
RES0  P<n>, bit [n]

Bit [31]
Reserved, RES0.


P<n>, bit [n], for n = 0 to 30 
Event counter software increment bit for PMEVCNTR<n>. 


Bits [30:N] are RAZ/WI. When EL2 is implemented, in Non-secure EL1 and EL0, N is the value in
MDCR_EL2.HPMN if EL2 is using AArch64 or in HDCR.HPMN if EL2 is using AArch32. 
Otherwise, N is the value in PMCR.N. 


The effects of writing to this bit are: 
0 No action. The write to this bit is ignored.


1 If PMEVCNTR<n> is enabled and configured to count the software increment event, 
increments PMEVCNTR<n> by 1. If PMEVCNTR<n> is disabled, or not configured to 
count the software increment event, the write to this bit is ignored. 
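
The effect of a write to PMSWINC can be sketched as a pure function over a snapshot of the counter state. This is an illustrative model under stated assumptions: the three lists, the function name `pmswinc_write`, and the use of the list length to stand in for N are all hypothetical, and overflow flag updates are not modelled.

```python
SW_INCR = 0x00  # event number of the Software increment event


def pmswinc_write(counters, event_types, enabled, value):
    """Return new counter values after writing `value` to PMSWINC.

    counters, event_types and enabled are parallel lists, one entry per
    accessible event counter. Bits at or above len(counters) are ignored.
    """
    if len(counters) > 31:
        raise ValueError("at most 31 event counters")
    result = list(counters)
    for n in range(len(counters)):
        selected = (value >> n) & 1
        if selected and enabled[n] and event_types[n] == SW_INCR:
            result[n] = (result[n] + 1) & 0xFFFFFFFF  # 32-bit wrap
    return result
```

Only counters that are both enabled and configured for event 0x00 change. A set bit for any other counter is ignored, as the field description states.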


Accessing the PMSWINC: 
To access the PMSWINC: 
MCR p15,0,<Rt>,c9,c12,4 ; Write Rt to PMSWINC 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1100  100
G6.4.16 PMUSERENR, Performance Monitors User Enable Register
The PMUSERENR characteristics are: 


Purpose 


Enables or disables User mode access to the Performance Monitors. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
RO       RO      RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
RO   RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
EL2.

Configurations
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMUSERENR is architecturally mapped to AArch64 System register
PMUSERENR_EL0.


This register is in the Warm reset domain. Some or all RW fields of this register have defined reset 
values. On a Warm or Cold reset these apply only if the PE resets into an Exception level that is 
using AArch32. Otherwise, on a Warm or Cold reset RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
PMUSERENR is a 32-bit register. 


Field descriptions 


The PMUSERENR bit assignments are: 
31        4  3   2   1   0
RES0         ER  CR  SW  EN

Bits [31:4]
Reserved, RES0.
ER, bit [3] 
Event counter read trap control: 
0 PL0 reads of the PMXEVCNTR and PMEVCNTR<n>, and PL0 read/write access to
the PMSELR, are trapped to Undefined mode if PMUSERENR.EN is also 0.
1 PL0 reads of the PMXEVCNTR and PMEVCNTR<n>, and PL0 read/write access to
the PMSELR, are not trapped to Undefined mode.
When this register has an architecturally-defined reset value, this field resets to 0.
CR, bit [2]
Cycle counter read trap control:
0 PL0 reads of the PMCCNTR are trapped to Undefined mode if PMUSERENR.EN is
also 0.
1 PL0 reads of the PMCCNTR are not trapped to Undefined mode.
When this register has an architecturally-defined reset value, this field resets to 0.
SW, bit [1]
Software increment write trap control:
0 PL0 writes to the PMSWINC are trapped to Undefined mode if PMUSERENR.EN is
also 0.
1 PL0 writes to the PMSWINC are not trapped to Undefined mode.
When this register has an architecturally-defined reset value, this field resets to 0.
EN, bit [0]
Traps PL0 accesses to the Performance Monitors registers to Undefined mode:
0 PL0 accesses to the Performance Monitors registers are trapped to Undefined mode,
unless enabled by one of PMUSERENR.{ER, CR, SW}.
1 PL0 accesses to the Performance Monitors registers are not trapped to Undefined mode.
Software can access all PMU registers at PL0.
Note
• The PMUSERENR register is always RO at EL0 and not trapped by this bit.
• PL0 cannot read or write PMINTENSET and PMINTENCLR.
When this register has an architecturally-defined reset value, this field resets to 0. 
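
The interaction of EN with ER, CR, and SW can be expressed as a single decision function. This sketch models only the controls in this register, not the EL2 or EL3 traps, and it excludes the registers covered by the Note above. The function name `pl0_access_trapped` and the register-name strings are illustrative assumptions.

```python
def pl0_access_trapped(pmuserenr: int, reg: str, is_write: bool) -> bool:
    """Return True if PMUSERENR causes this PL0 access to be trapped."""
    if reg in ("PMUSERENR", "PMINTENSET", "PMINTENCLR"):
        raise ValueError(f"{reg} is not governed by these fields")
    en = pmuserenr & 1          # EN, bit [0]
    sw = (pmuserenr >> 1) & 1   # SW, bit [1]
    cr = (pmuserenr >> 2) & 1   # CR, bit [2]
    er = (pmuserenr >> 3) & 1   # ER, bit [3]
    if en:
        return False            # EN == 1: no PL0 access is trapped
    if reg == "PMSWINC" and is_write:
        return not sw
    if reg == "PMCCNTR" and not is_write:
        return not cr
    if reg in ("PMXEVCNTR", "PMEVCNTR") and not is_write:
        return not er
    if reg == "PMSELR":
        return not er           # ER covers both reads and writes of PMSELR
    return True                 # everything else needs EN == 1
```

For example, setting only CR lets PL0 code read the cycle counter while every other Performance Monitors access, including a write to PMCCNTR, remains trapped.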
Accessing the PMUSERENR: 
To access the PMUSERENR: 
MRC p15,0,<Rt>,c9,c14,0 ; Read PMUSERENR into Rt
MCR p15,0,<Rt>,c9,c14,0 ; Write Rt to PMUSERENR


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1110  000
G6.4.17 PMXEVCNTR, Performance Monitors Selected Event Count Register
The PMXEVCNTR characteristics are: 


Purpose 
Reads or writes the value of the selected event counter, PMEVCNTR<n>. PMSELR.SEL 
determines which event counter is selected. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





If PMSELR.SEL is greater than or equal to the number of accessible counters then reads and writes 
of PMXEVCNTR are CONSTRAINED UNPREDICTABLE, and the following behaviors are permitted: 


• Accesses to the register are UNDEFINED.
• Accesses to the register behave as RAZ/WI.
• Accesses to the register execute as a NOP.
• Accesses to the register behave as if PMSELR.SEL has an UNKNOWN value less than the
number of counters accessible at the current Exception level and Security state.
• Accesses to the register behave as if PMSELR.SEL is 31.
• In Non-secure state, for an access from PL1 or a permitted access from PL0, if
PMSELR.SEL, or PMSELR_EL0.SEL if EL1 is using AArch64, is greater than or equal to
the number of accessible counters but is less than the number of implemented counters, the
register access is trapped to EL2. Accesses from PL0 are permitted when:
– EL1 is using AArch32 and the value of PMUSERENR.EN is 1.
– EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1.

Note
In an implementation that includes EL2, in Non-secure state at EL0 and EL1:
• If EL2 is using AArch32, HDCR.HPMN identifies the number of accessible counters.
• If EL2 is using AArch64, MDCR_EL2.HPMN identifies the number of accessible counters.
Otherwise, the number of accessible counters is the number of implemented counters.
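
The rule in the Note for the number of accessible counters, and the condition under which an access can be trapped to EL2, can be sketched as follows. The function and parameter names are assumptions made for this example: `hpmn` stands for HDCR.HPMN or MDCR_EL2.HPMN, whichever applies, and `implemented` for the number of implemented counters.

```python
def accessible_counters(implemented: int, el2_implemented: bool,
                        non_secure: bool, el: int, hpmn: int) -> int:
    """Number of event counters accessible at the given Exception level."""
    if el2_implemented and non_secure and el in (0, 1):
        return hpmn
    return implemented


def may_trap_to_el2(sel: int, accessible: int, implemented: int,
                    non_secure: bool, el: int) -> bool:
    """True when SEL names a counter that exists but is reserved for EL2."""
    return bool(non_secure and el in (0, 1)
                and accessible <= sel < implemented)
```

With six implemented counters and HPMN set to 4, Non-secure EL1 sees four counters, and a SEL value of 4 or 5 falls into the range where the trap to EL2 is one of the listed outcomes.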





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
EL2.
• If PMUSERENR.EN==0, and PMUSERENR.ER==0, read accesses to this register from
EL0 are trapped to Undefined mode.
• If PMUSERENR.EN==0, write accesses to this register from EL0 are trapped to Undefined
mode.
• If PMUSERENR_EL0.EN==0, and PMUSERENR_EL0.ER==0, read accesses to this
register from EL0 are trapped to EL1.
• If PMUSERENR_EL0.EN==0, write accesses to this register from EL0 are trapped to EL1.





Configurations 


There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register PMXEVCNTR is architecturally mapped to AArch64 System register
PMXEVCNTR_EL0.


This register is in the Warm reset domain. On a Warm or Cold reset RW fields in this register reset 
to architecturally UNKNOWN values. 


Attributes 
PMXEVCNTR is a 32-bit register. 


Field descriptions 


The PMXEVCNTR bit assignments are: 


31 0 


PMEVCNTR<n> 


PMEVCNTR<n>, bits [31:0] 
Value of the selected event counter, PMEVCNTR<n>, where n is the value stored in PMSELR.SEL. 


Accessing the PMXEVCNTR: 
To access the PMXEVCNTR: 


MRC p15,0,<Rt>,c9,c13,2 ; Read PMXEVCNTR into Rt 
MCR p15,0,<Rt>,c9,c13,2 ; Write Rt to PMXEVCNTR 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1101  010
G6.4.18 PMXEVTYPER, Performance Monitors Selected Event Type Register
The PMXEVTYPER characteristics are: 


Purpose 
When PMSELR.SEL selects an event counter, this accesses a PMEVTYPER<n> register. When 
PMSELR.SEL selects the cycle counter, this accesses PMCCFILTR. 

Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





If PMSELR.SEL is greater than or equal to the number of accessible counters then reads and writes 
of PMXEVTYPER are CONSTRAINED UNPREDICTABLE, and the following behaviors are permitted: 


• Accesses to the register are UNDEFINED.
• Accesses to the register behave as RAZ/WI.
• Accesses to the register execute as a NOP.
• Accesses to the register behave as if PMSELR.SEL has an UNKNOWN value less than the
number of counters accessible at the current Exception level and Security state.
• Accesses to the register behave as if PMSELR.SEL is 31.
• In Non-secure state, for an access from PL1 or a permitted access from PL0, if
PMSELR.SEL, or PMSELR_EL0.SEL if EL1 is using AArch64, is greater than or equal to
the number of accessible counters but is less than the number of implemented counters, the
register access is trapped to EL2. Accesses from PL0 are permitted when:
– EL1 is using AArch32 and the value of PMUSERENR.EN is 1.
– EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1.

Note
In an implementation that includes EL2, in Non-secure state at EL0 and EL1:
• If EL2 is using AArch32, HDCR.HPMN identifies the number of accessible counters.
• If EL2 is using AArch64, MDCR_EL2.HPMN identifies the number of accessible counters.
Otherwise, the number of accessible counters is the number of implemented counters.





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If HDCR.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
Hyp mode.
• If MDCR_EL2.TPM==1, Non-secure accesses to this register from EL0 and EL1 are trapped
to EL2.
• If MDCR_EL3.TPM==1, accesses to this register from EL0, EL1, and EL2 are trapped to
EL3.
• If HSTR.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to Hyp
mode.
• If HSTR_EL2.T9==1, Non-secure accesses to this register from EL0 and EL1 are trapped to
EL2.
• If PMUSERENR.EN==0, accesses to this register from EL0 are trapped to Undefined mode.
• If PMUSERENR_EL0.EN==0, accesses to this register from EL0 are trapped to EL1.

Configurations

There is one instance of this register that is used in both Secure and Non-secure states.


AArch32 System register PMXEVTYPER is architecturally mapped to AArch64 System register
PMXEVTYPER_EL0.

When the value of PMSELR.SEL is 31, to select the cycle counter, RW fields in this register have
defined reset values that apply only when the PE resets into an Exception level that is using
AArch32. See PMCCFILTR for the reset values.


Otherwise, RW fields in this register reset to IMPLEMENTATION DEFINED values that might be 
UNKNOWN. This applies whenever PMSELR.SEL selects an event counter. 


Attributes 


PMXEVTYPER is a 32-bit register. 


Field descriptions 


The PMXEVTYPER bit assignments are: 


31                                0
Event type register or PMCCFILTR


Bits [31:0] 


Event type register or PMCCFILTR. 
When PMSELR.SEL == 31, this register accesses PMCCFILTR. 
Otherwise, this register accesses PMEVTYPER<n> where n is the value in PMSELR.SEL. 


Accessing the PMXEVTYPER: 


To access the PMXEVTYPER: 


MRC p15,0,<Rt>,c9,c13,1 ; Read PMXEVTYPER into Rt 
MCR p15,0,<Rt>,c9,c13,1 ; Write Rt to PMXEVTYPER 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1001  1101  001
G6.5 Generic Timer registers


This section lists the Generic Timer registers in AArch32. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. G6-4803 
ID092916 Non-Confidential




G6.5.1 CNTFRQ, Counter-timer Frequency register 
The CNTFRQ characteristics are: 


Purpose 


This register is provided so that software can discover the frequency of the system counter. It must 
be programmed with this value as part of system initialization. The value of the register is not 
interpreted by hardware. 


Usage constraints 


If EL1 is the highest exception level implemented and is using AArch32, this register is accessible
as follows: 





EL0        EL1
Config-RO  RW





If EL2 is the highest exception level implemented and is using AArch32, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RO  RO   RW





If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  RO       RO       RW             RW





If EL3 is implemented and is using AArch64, this register is accessible as follows: 





EL0        EL1  EL2 (NS)
Config-RO  RO   RO





Can only be written at the highest Exception level implemented. For example, if EL3 is the highest 
implemented Exception level, CNTFRQ can only be written at EL3. 


If EL3 is using AArch64, write accesses to CNTFRQ from Secure EL1 modes are UNDEFINED. 


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTKCTL.PL0PCTEN==0, and CNTKCTL.PL0VCTEN==0, read accesses to this
register from EL0 are trapped to Undefined mode.
• If CNTKCTL_EL1.EL0PCTEN==0, and CNTKCTL_EL1.EL0VCTEN==0, read accesses
to this register from EL0 are trapped to EL1.

Configurations
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CNTFRQ is architecturally mapped to AArch64 System register
CNTFRQ_EL0.

RW fields in this register reset to architecturally UNKNOWN values.
Attributes 
CNTFRQ is a 32-bit register. 


Field descriptions 


The CNTFRQ bit assignments are: 


31 0 
Clock frequency 
Bits [31:0] 


Clock frequency. Indicates the system counter clock frequency, in Hz. 
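
Because the field holds the counter frequency in Hz, software can use it to convert between counter ticks and elapsed time. The following Python sketch shows the arithmetic. The function names and the 24 MHz figure in the usage note are assumptions chosen for illustration, and integer division truncates toward zero.

```python
NS_PER_SECOND = 1_000_000_000


def ticks_to_ns(ticks: int, cntfrq_hz: int) -> int:
    """Convert a system counter tick count to nanoseconds."""
    if cntfrq_hz <= 0:
        raise ValueError("CNTFRQ must hold a non-zero frequency")
    return ticks * NS_PER_SECOND // cntfrq_hz


def ns_to_ticks(ns: int, cntfrq_hz: int) -> int:
    """Convert a duration in nanoseconds to system counter ticks."""
    if cntfrq_hz <= 0:
        raise ValueError("CNTFRQ must hold a non-zero frequency")
    return ns * cntfrq_hz // NS_PER_SECOND
```

With an assumed CNTFRQ value of 24000000, one millisecond corresponds to 24000 ticks. Because hardware does not interpret the register, the result is only as accurate as the value that was programmed during system initialization.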


Accessing the CNTFRQ: 
To access the CNTFRQ: 


MRC p15,0,<Rt>,c14,c0,0 ; Read CNTFRQ into Rt
MCR p15,0,<Rt>,c14,c0,0 ; Write Rt to CNTFRQ


Register access is encoded as follows: 





coproc opci1 CRn CRm_= opc2 





1111 000 1110 0000 000 
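
The coproc, opc1, CRn, CRm, and opc2 values in these encoding tables are the fields of the A32 MRC and MCR instructions. As a hedged illustration, the sketch below packs them into a 32-bit instruction word using the A32 MRC/MCR field positions. The helper name, the choice of r0 for Rt, and the always (0xE) condition code are assumptions made for the example.

```python
def encode_mrc_mcr(coproc, opc1, crn, crm, opc2, rt, read, cond=0xE):
    """Return the 32-bit A32 encoding of an MRC (read=True) or MCR (read=False)."""
    fields = (
        ("cond", cond, 4), ("opc1", opc1, 3), ("CRn", crn, 4), ("Rt", rt, 4),
        ("coproc", coproc, 4), ("opc2", opc2, 3), ("CRm", crm, 4),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (
        (cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(bool(read)) << 20)
        | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5) | (1 << 4) | crm
    )


# CNTFRQ field values as listed in the encoding table; Rt = r0 is arbitrary.
CNTFRQ = dict(coproc=0b1111, opc1=0b000, crn=0b1110, crm=0b0000, opc2=0b000)
read_word = encode_mrc_mcr(rt=0, read=True, **CNTFRQ)    # MRC p15,0,r0,c14,c0,0
write_word = encode_mrc_mcr(rt=0, read=False, **CNTFRQ)  # MCR p15,0,r0,c14,c0,0
```

The same helper applies to the other 32-bit registers in this section by substituting the values from their encoding tables.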
G6.5.2 CNTHCTL, Counter-timer Hyp Control register
The CNTHCTL characteristics are: 
Purpose 
Controls the generation of an event stream from the physical counter, and access from Non-secure 
EL1 modes to the physical counter and the Non-secure EL1 physical timer. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2 (NS)
-    -    RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch32 System register CNTHCTL is architecturally mapped to AArch64 System register
CNTHCTL_EL2.
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
CNTHCTL is a 32-bit register. 
Field descriptions 
The CNTHCTL bit assignments are: 
Bits     Field
[31:8]   RES0
[7:4]    EVNTI
[3]      EVNTDIR
[2]      EVNTEN
[1]      PL1PCEN
[0]      PL1PCTEN
Bits [31:8]
Reserved, RES0.
EVNTI, bits [7:4] 
Selects which bit (0 to 15) of the counter register CNTPCT is the trigger for the event stream 
generated from that counter, when that stream is enabled. 
EVNTDIR, bit [3]


Controls which transition of the counter register CNTPCT trigger bit, defined by EVNTI, generates
an event when the event stream is enabled:


0 A 0 to 1 transition of the trigger bit triggers an event.
1 A 1 to 0 transition of the trigger bit triggers an event.
EVNTEN, bit [2] 


Enables the generation of an event stream from the counter register CNTPCT: 


0 Disables the event stream. 
1 Enables the event stream. 
PL1PCEN, bit [1]
Traps Non-secure EL0 and EL1 accesses to the physical timer registers to Hyp mode.
0 Non-secure EL0 and EL1 accesses to the CNTP_CTL, CNTP_CVAL, and
CNTP_TVAL are trapped to Hyp mode.
1 Non-secure EL0 and EL1 accesses to the CNTP_CTL, CNTP_CVAL, and
CNTP_TVAL are not trapped to Hyp mode.


If EL3 is implemented and EL2 is not implemented, behavior is as if this bit is 1 other than for the
purpose of a direct read.


PL1PCTEN, bit [0]
Traps Non-secure EL0 and EL1 accesses to the physical counter register to Hyp mode.
0 Non-secure EL0 and EL1 accesses to the CNTPCT are trapped to Hyp mode.
1 Non-secure EL0 and EL1 accesses to the CNTPCT are not trapped to Hyp mode.


If EL3 is implemented and EL2 is not implemented, behavior is as if this bit is 1 other than for the 
purpose of a direct read. 
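
The CNTHCTL field layout can be illustrated with a short decoding sketch. The type and function names are hypothetical. The event period calculation rests on the assumption that bit n of a free-running counter makes a given transition (0 to 1, or 1 to 0) once every 2^(n+1) counts.

```python
from typing import NamedTuple, Optional


class Cnthctl(NamedTuple):
    evnti: int      # bits [7:4], trigger bit index into CNTPCT
    evntdir: int    # bit [3], 0 = 0-to-1 transition, 1 = 1-to-0 transition
    evnten: int     # bit [2], event stream enable
    pl1pcen: int    # bit [1], 0 = timer register accesses trapped to Hyp mode
    pl1pcten: int   # bit [0], 0 = CNTPCT accesses trapped to Hyp mode


def decode_cnthctl(value: int) -> Cnthctl:
    """Split a 32-bit CNTHCTL value into its defined fields."""
    return Cnthctl(
        evnti=(value >> 4) & 0xF,
        evntdir=(value >> 3) & 1,
        evnten=(value >> 2) & 1,
        pl1pcen=(value >> 1) & 1,
        pl1pcten=value & 1,
    )


def event_period_ticks(fields: Cnthctl) -> Optional[int]:
    """Counter ticks between events, or None when the stream is disabled."""
    if not fields.evnten:
        return None
    return 1 << (fields.evnti + 1)
```

For example, a value of 0x57 selects trigger bit 5 with the stream enabled, which under the stated assumption yields one event every 64 counter ticks.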

Accessing the CNTHCTL: 

To access the CNTHCTL: 


MRC p15,4,<Rt>,c14,c1,0 ; Read CNTHCTL into Rt
MCR p15,4,<Rt>,c14,c1,0 ; Write Rt to CNTHCTL


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    100   1110  0001  000
G6.5.3 CNTHP_CTL, Counter-timer Hyp Physical Timer Control register
The CNTHP_CTL characteristics are: 


Purpose 


Control register for the Hyp mode physical timer. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
-    -    RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch32 System register CNTHP_CTL is architecturally mapped to AArch64 System register
CNTHP_CTL_EL2.


If EL2 is not implemented, this register is RES0 from EL3.


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into EL2 with EL2 using AArch32, or into EL3 with EL3 using AArch32. Otherwise, RW fields in 
this register reset to architecturally UNKNOWN values. 


Attributes 
CNTHP_CTL is a 32-bit register. 


Field descriptions 


The CNTHP_CTL bit assignments are: 


Bits     Field
[31:3]   RES0
[2]      ISTATUS
[1]      IMASK
[0]      ENABLE


Bits [31:3]
Reserved, RES0.
ISTATUS, bit [2] 
The status of the timer. This bit indicates whether the timer condition is met: 
0 Timer condition is not met. 
1 Timer condition is met.


When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met. 
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the 
value of IMASK is 0 then the timer interrupt is asserted. 


When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN. 


For more information, see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.


This bit is read-only. 


This field resets to a value that is architecturally UNKNOWN. 


IMASK, bit [1] 
Timer interrupt mask bit. Permitted values are: 
0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit. 


For more information, see the description of the ISTATUS bit. 


This field resets to a value that is architecturally UNKNOWN. 


ENABLE, bit [0] 
Enables the timer. Permitted values are: 
0 Timer disabled.
1 Timer enabled. 


Setting this bit to 0 disables the timer output signal, but the timer value accessible from 
CNTHP_TVAL continues to count down. 


— Note 


Disabling the output signal might be a power-saving option. 





When this register has an architecturally-defined reset value, this field resets to 0. 
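
The interaction of ENABLE, IMASK, and ISTATUS described above can be summarized as a small predicate. This is a sketch of the stated rules only. The constant and function names are illustrative, and the value passed in is assumed to be a snapshot of a timer control register such as CNTHP_CTL.

```python
ENABLE = 1 << 0    # bit [0]
IMASK = 1 << 1     # bit [1]
ISTATUS = 1 << 2   # bit [2]


def timer_interrupt_asserted(ctl: int) -> bool:
    """True when the timer control value implies an asserted timer interrupt."""
    if not ctl & ENABLE:
        # The timer output signal is disabled and ISTATUS is UNKNOWN,
        # so the ISTATUS bit is deliberately ignored here.
        return False
    # ISTATUS takes no account of IMASK; the mask is applied separately.
    return bool(ctl & ISTATUS) and not ctl & IMASK
```

The predicate returns True only for the combination ENABLE=1, ISTATUS=1, IMASK=0.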


Accessing the CNTHP_CTL: 
To access the CNTHP_CTL: 


MRC p15,4,<Rt>,c14,c2,1 ; Read CNTHP_CTL into Rt 
MCR p15,4,<Rt>,c14,c2,1 ; Write Rt to CNTHP_CTL 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    100   1110  0010  001
G6.5.4 CNTHP_CVAL, Counter-timer Hyp Physical CompareValue register
The CNTHP_CVAL characteristics are: 


Purpose 


Holds the compare value for the Hyp mode physical timer. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2(NS)
-    -    RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch32 System register CNTHP_CVAL is architecturally mapped to AArch64 System register 
CNTHP_CVAL_EL2. 


If EL2 is not implemented, this register is RES0 from EL3.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CNTHP_CVAL is a 64-bit register. 


Field descriptions 


The CNTHP_CVAL bit assignments are: 


63 0 


EL2 physical timer compare value 





Bits [63:0] 


EL2 physical timer compare value. 


Accessing the CNTHP_CVAL: 
To access the CNTHP_CVAL: 


MRRC p15,6,<Rt>,<Rt2>,c14 ; Read CNTHP_CVAL[31:0] into Rt and CNTHP_CVAL[63:32] into Rt2 
MCRR p15,6,<Rt>,<Rt2>,c14 ; Write Rt to CNTHP_CVAL[31:0] and Rt2 to CNTHP_CVAL[63:32] 
Register access is encoded as follows:


coproc  opc1  CRm
1111    0110  1110
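
A 64-bit register such as CNTHP_CVAL is transferred as two 32-bit halves, with Rt carrying bits [31:0] and Rt2 carrying bits [63:32]. The sketch below illustrates the halves and the A32 MRRC/MCRR instruction word built from the table values. The helper names, the register choices r0 and r1, and the always (0xE) condition code are assumptions for the example.

```python
MASK32 = 0xFFFFFFFF


def combine_halves(rt: int, rt2: int) -> int:
    """Rebuild a 64-bit value from Rt (bits [31:0]) and Rt2 (bits [63:32])."""
    return ((rt2 & MASK32) << 32) | (rt & MASK32)


def split_halves(value: int) -> tuple:
    """Return (Rt, Rt2) for a 64-bit value to be written with MCRR."""
    return value & MASK32, (value >> 32) & MASK32


def encode_mrrc_mcrr(coproc, opc1, crm, rt, rt2, read, cond=0xE):
    """Return the A32 encoding of an MRRC (read=True) or MCRR (read=False)."""
    for name, value in (("cond", cond), ("opc1", opc1), ("CRm", crm),
                        ("Rt", rt), ("Rt2", rt2), ("coproc", coproc)):
        if not 0 <= value < 16:
            raise ValueError(f"{name} does not fit in 4 bits")
    return (
        (cond << 28) | (0b1100010 << 21) | (int(bool(read)) << 20)
        | (rt2 << 16) | (rt << 12) | (coproc << 8) | (opc1 << 4) | crm
    )


# MRRC p15,6,r0,r1,c14 reads CNTHP_CVAL into r0 (low half) and r1 (high half).
cnthp_cval_read = encode_mrrc_mcrr(0b1111, 0b0110, 0b1110, rt=0, rt2=1, read=True)
```

Reading the two halves is a single instruction, so combine_halves only reassembles the register pair and does not need to guard against a carry between halves.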
G6.5.5 CNTHP_TVAL, Counter-timer Hyp Physical Timer TimerValue register
The CNTHP_TVAL characteristics are: 
Purpose 
Holds the timer value for the Hyp mode physical timer. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       -        RW       RW             -
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    -    RW
Where the value of CNTHP_CTL.ENABLE is 0:
• A write to CNTHP_TVAL updates the register.
• The value held in CNTHP_TVAL continues to decrement.
• A read of CNTHP_TVAL returns an UNKNOWN value.
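
The decrementing behavior follows from the way the TimerValue view relates to the CompareValue and the counter. The sketch below models that relationship as it is commonly summarized for the Generic Timer: a read returns the low 32 bits of the compare value minus the count, and a write sets the compare value to the count plus the sign-extended timer value. It covers only the enabled case, the function names are hypothetical, and the referenced timer operation sections remain the authoritative definition.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & (1 << 31) else value


def read_tval(cval: int, count: int) -> int:
    """TimerValue view: low 32 bits of (CompareValue - count)."""
    return (cval - count) & MASK32


def write_tval(tval: int, count: int) -> int:
    """Return the 64-bit CompareValue that results from writing tval."""
    return (count + sign_extend32(tval)) & MASK64
```

Writing 100 when the count is 1000 gives a compare value of 1100; a later read at count 1040 returns 60, and once the count passes the compare value the view wraps to a negative 32-bit quantity.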
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
AArch32 System register CNTHP_TVAL is architecturally mapped to AArch64 System register 
CNTHP_TVAL_EL2. 
If EL2 is not implemented, this register is RES0 from EL3.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
CNTHP_TVAL is a 32-bit register. 
Field descriptions 
The CNTHP_TVAL bit assignments are: 
31 0 
EL2 physical timer value 
Bits [31:0] 
EL2 physical timer value. 
Accessing the CNTHP_TVAL: 
To access the CNTHP_TVAL: 
MRC p15,4,<Rt>,c14,c2,0 ; Read CNTHP_TVAL into Rt
MCR p15,4,<Rt>,c14,c2,0 ; Write Rt to CNTHP_TVAL


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    100   1110  0010  000
G6.5.6 CNTKCTL, Counter-timer Kernel Control register
The CNTKCTL characteristics are: 
Purpose 
Controls the generation of an event stream from the virtual counter, and access from EL0 modes to
the physical counter, virtual counter, EL1 physical timers, and the virtual timer.
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)  EL0(S)  EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        -       RW       RW       RW             RW
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0  EL1  EL2(NS)
-    RW   RW
Traps and Enables 
There are no traps or enables affecting this register. 
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register CNTKCTL is architecturally mapped to AArch64 System register 
CNTKCTL_EL1. 
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
CNTKCTL is a 32-bit register. 
Field descriptions 
The CNTKCTL bit assignments are: 
Bits     Field
[31:10]  RES0
[9]      PL0PTEN
[8]      PL0VTEN
[7:4]    EVNTI
[3]      EVNTDIR
[2]      EVNTEN
[1]      PL0VCTEN
[0]      PL0PCTEN
Bits [31:10]
Reserved, RES0.
PL0PTEN, bit [9]


Traps PL0 accesses to the physical timer registers to Undefined mode.


0 PL0 accesses to the CNTP_CTL, CNTP_CVAL, and CNTP_TVAL registers are trapped
to Undefined mode.
1 PL0 accesses to the CNTP_CTL, CNTP_CVAL, and CNTP_TVAL registers are not
trapped to Undefined mode.


PL0VTEN, bit [8]


Traps PL0 accesses to the virtual timer registers to Undefined mode.


0 PL0 accesses to the CNTV_CTL, CNTV_CVAL, and CNTV_TVAL registers are
trapped to Undefined mode.
1 PL0 accesses to the CNTV_CTL, CNTV_CVAL, and CNTV_TVAL registers are not
trapped to Undefined mode.


EVNTI, bits [7:4]
Selects which bit (0 to 15) of the counter register CNTVCT is the trigger for the event stream
generated from that counter, when that stream is enabled.

EVNTDIR, bit [3]


Controls which transition of the counter register CNTVCT trigger bit, defined by EVNTI, generates
an event when the event stream is enabled:


0 A 0 to 1 transition of the trigger bit triggers an event.
1 A 1 to 0 transition of the trigger bit triggers an event.
EVNTEN, bit [2]


Enables the generation of an event stream from the counter register CNTVCT:


0 Disables the event stream.
1 Enables the event stream.
PL0VCTEN, bit [1]


Traps PL0 accesses to the frequency register and virtual counter register to Undefined mode.


0 PL0 accesses to the CNTVCT are trapped to Undefined mode.


PL0 accesses to the CNTFRQ register are trapped to Undefined mode, if
CNTKCTL.PL0PCTEN is also 0.


1 PL0 accesses to the CNTFRQ and CNTVCT are not trapped to Undefined mode.


PL0PCTEN, bit [0]
Traps PL0 accesses to the frequency register and physical counter register to Undefined mode.


0 PL0 accesses to the CNTPCT are trapped to Undefined mode.


PL0 accesses to the CNTFRQ register are trapped to Undefined mode, if
CNTKCTL.PL0VCTEN is also 0.


1 PL0 accesses to the CNTFRQ and CNTPCT are not trapped to Undefined mode.
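
The PL0 trap controls above can be condensed into a lookup that reports whether CNTKCTL alone causes a PL0 access to trap. This is an illustrative sketch with a hypothetical function name. It models only the CNTKCTL bits and ignores exception prioritization and any traps configured through CNTHCTL.

```python
PHYSICAL_TIMER_REGS = ("CNTP_CTL", "CNTP_CVAL", "CNTP_TVAL")
VIRTUAL_TIMER_REGS = ("CNTV_CTL", "CNTV_CVAL", "CNTV_TVAL")


def pl0_access_traps(register: str, cntkctl: int) -> bool:
    """True if a PL0 access to the named register is trapped by CNTKCTL."""
    pl0pcten = cntkctl & 1          # bit [0]
    pl0vcten = (cntkctl >> 1) & 1   # bit [1]
    pl0vten = (cntkctl >> 8) & 1    # bit [8]
    pl0pten = (cntkctl >> 9) & 1    # bit [9]

    if register == "CNTFRQ":
        # CNTFRQ is trapped only when both counter enables are 0.
        return not pl0pcten and not pl0vcten
    if register == "CNTPCT":
        return not pl0pcten
    if register == "CNTVCT":
        return not pl0vcten
    if register in PHYSICAL_TIMER_REGS:
        return not pl0pten
    if register in VIRTUAL_TIMER_REGS:
        return not pl0vten
    raise ValueError(f"{register} is not controlled by CNTKCTL")
```

For instance, with only PL0VCTEN set (a value of 0x2), CNTVCT and CNTFRQ are readable from PL0 while CNTPCT and all timer registers still trap.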


Accessing the CNTKCTL: 
To access the CNTKCTL: 


MRC p15,0,<Rt>,c14,c1,0 ; Read CNTKCTL into Rt
MCR p15,0,<Rt>,c14,c1,0 ; Write Rt to CNTKCTL


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   1110  0001  000
G6.5.7 CNTP_CTL, Counter-timer Physical Timer Control register
The CNTP_CTL characteristics are: 
Purpose 
Control register for the EL1 physical timer. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


CNTP_CTL(S) is accessible as follows: 





EL0(NS)  EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        Config-RW  -        -        -              RW





CNTP_CTL(NS) is accessible as follows: 





EL0(NS)    EL0(S)  EL1(NS)    EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  -       Config-RW  RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


CNTP_CTL is accessible as follows: 





EL0        EL1        EL2 (NS)
Config-RW  Config-RW  RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTHCTL.PL1PCEN==0, Non-secure accesses to this register from EL0 and EL1 are
trapped to Hyp mode.


• If CNTHCTL_EL2.EL1PCEN==0, Non-secure accesses to this register from EL1 are
trapped to EL2.


• If CNTHCTL_EL2.EL1PCEN==0, and CNTKCTL_EL1.EL0PTEN==1, Non-secure
accesses to this register from EL0 are trapped to EL2.


• If CNTKCTL.PL0PTEN==0, accesses to this register from EL0 are trapped to Undefined
mode.


• If CNTKCTL_EL1.EL0PTEN==0, accesses to this register from EL0 are trapped to EL1.


Configurations 


AArch32 System register CNTP_CTL is architecturally mapped to AArch64 System register
CNTP_CTL_EL0.


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch32. If the PE resets into EL3 using AArch32 they apply 
only to the Secure instance of the register. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 
Attributes


CNTP_CTL is a 32-bit register. 


Field descriptions 


The CNTP_CTL bit assignments are: 


Bits     Field
[31:3]   RES0
[2]      ISTATUS
[1]      IMASK
[0]      ENABLE


Bits [31:3]


Reserved, RES0.


ISTATUS, bit [2] 


The status of the timer. This bit indicates whether the timer condition is met: 
0 Timer condition is not met. 
1 Timer condition is met. 


When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met. 
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the 
value of IMASK is 0 then the timer interrupt is asserted. 


When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN. 


For more information, see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.


This bit is read-only. 
This field resets to a value that is architecturally UNKNOWN. 


IMASK, bit [1] 


Timer interrupt mask bit. Permitted values are: 

0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit. 

For more information, see the description of the ISTATUS bit. 


This field resets to a value that is architecturally UNKNOWN. 


ENABLE, bit [0] 


Enables the timer. Permitted values are: 

0 Timer disabled. 

1 Timer enabled. 

Setting this bit to 0 disables the timer output signal, but the timer value accessible from
CNTP_TVAL continues to count down.


—— Note 
Disabling the output signal might be a power-saving option. 





When this register has an architecturally-defined reset value, this field resets to 0. 
Accessing the CNTP_CTL:
To access the CNTP_CTL: 


MRC p15,0,<Rt>,c14,c2,1 ; Read CNTP_CTL into Rt 
MCR p15,0,<Rt>,c14,c2,1 ; Write Rt to CNTP_CTL 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1110  0010  001
G6.5.8 CNTP_CVAL, Counter-timer Physical Timer CompareValue register
The CNTP_CVAL characteristics are: 
Purpose 
Holds the compare value for the EL1 physical timer. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


CNTP_CVAL(S) is accessible as follows: 





EL0(NS)  EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        Config-RW  -        -        -              RW





CNTP_CVAL(NS) is accessible as follows: 





EL0(NS)    EL0(S)  EL1(NS)    EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  -       Config-RW  RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


CNTP_CVAL is accessible as follows: 





EL0        EL1        EL2 (NS)
Config-RW  Config-RW  RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTHCTL.PL1PCEN==0, Non-secure accesses to this register from EL0 and EL1 are
trapped to Hyp mode.


• If CNTHCTL_EL2.EL1PCEN==0, Non-secure accesses to this register from EL1 are
trapped to EL2.


• If CNTHCTL_EL2.EL1PCEN==0, and CNTKCTL_EL1.EL0PTEN==1, Non-secure
accesses to this register from EL0 are trapped to EL2.


• If CNTKCTL.PL0PTEN==0, accesses to this register from EL0 are trapped to Undefined
mode.


• If CNTKCTL_EL1.EL0PTEN==0, accesses to this register from EL0 are trapped to EL1.








Configurations 


AArch32 System register CNTP_CVAL is architecturally mapped to AArch64 System register
CNTP_CVAL_EL0.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CNTP_CVAL is a 64-bit register. 
Field descriptions


The CNTP_CVAL bit assignments are: 


EL1 physical timer compare value 


Bits [63:0] 


EL1 physical timer compare value.


Accessing the CNTP_CVAL: 
To access the CNTP_CVAL: 


MRRC p15,2,<Rt>,<Rt2>,c14 ; Read CNTP_CVAL[31:0] into Rt and CNTP_CVAL[63:32] into Rt2 
MCRR p15,2,<Rt>,<Rt2>,c14 ; Write Rt to CNTP_CVAL[31:0] and Rt2 to CNTP_CVAL[63:32] 


Register access is encoded as follows:


coproc  opc1  CRm
1111    0010  1110
G6.5.9 CNTP_TVAL, Counter-timer Physical Timer TimerValue register
The CNTP_TVAL characteristics are: 
Purpose 
Holds the timer value for the EL1 physical timer. This provides a 32-bit downcounter. 


Usage constraints 


If EL3 is implemented and is using AArch32, there are separate Secure and Non-secure instances 
of this register. 


CNTP_TVAL(S) is accessible as follows: 





EL0(NS)  EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
-        Config-RW  -        -        -              RW





CNTP_TVAL(NS) is accessible as follows: 





EL0(NS)    EL0(S)  EL1(NS)    EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  -       Config-RW  RW       RW             -





If EL3 is not implemented, or is implemented and is using AArch64, there is a single instance of 
this register. 


CNTP_TVAL is accessible as follows: 





EL0        EL1        EL2 (NS)
Config-RW  Config-RW  RW





Where the value of CNTP_CTL.ENABLE is 0: 
• A write to CNTP_TVAL updates the register.
• The value held in CNTP_TVAL continues to decrement.
• A read of CNTP_TVAL returns an UNKNOWN value.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTHCTL.PL1PCEN==0, Non-secure accesses to this register from EL0 and EL1 are
trapped to Hyp mode.


• If CNTHCTL_EL2.EL1PCEN==0, Non-secure accesses to this register from EL1 are
trapped to EL2.


• If CNTHCTL_EL2.EL1PCEN==0, and CNTKCTL_EL1.EL0PTEN==1, Non-secure
accesses to this register from EL0 are trapped to EL2.


• If CNTKCTL.PL0PTEN==0, accesses to this register from EL0 are trapped to Undefined
mode.


• If CNTKCTL_EL1.EL0PTEN==0, accesses to this register from EL0 are trapped to EL1.








Configurations 


AArch32 System register CNTP_TVAL is architecturally mapped to AArch64 System register
CNTP_TVAL_EL0.


RW fields in this register reset to architecturally UNKNOWN values.
Attributes 
CNTP_TVAL is a 32-bit register. 


Field descriptions 


The CNTP_TVAL bit assignments are: 


31 0 
EL1 physical timer value 
Bits [31:0] 


EL1 physical timer value.


Accessing the CNTP_TVAL: 
To access the CNTP_TVAL: 


MRC p15,0,<Rt>,c14,c2,0 ; Read CNTP_TVAL into Rt
MCR p15,0,<Rt>,c14,c2,0 ; Write Rt to CNTP_TVAL


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   1110  0010  000
G6.5.10 CNTPCT, Counter-timer Physical Count register
The CNTPCT characteristics are: 


Purpose 


Holds the 64-bit physical count value. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)    EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RO  Config-RO  Config-RO  RO       RO             RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1        EL2 (NS)
Config-RO  Config-RO  RO





When CNTKCTL.EL0PCTEN is set to 1, if CNTPCT is accessible from EL1 in the current Security
state then it is also accessible from EL0 in that Security state.


Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTHCTL.PL1PCTEN==0, Non-secure read accesses to this register from EL0 and EL1
are trapped to Hyp mode.


• If CNTHCTL_EL2.EL1PCTEN==0, Non-secure read accesses to this register from EL1 are
trapped to EL2.


• If CNTHCTL_EL2.EL1PCTEN==0, and CNTKCTL_EL1.EL0PCTEN==1, Non-secure
read accesses to this register from EL0 are trapped to EL2.


• If CNTKCTL.PL0PCTEN==0, read accesses to this register from EL0 are trapped to
Undefined mode.


• If CNTKCTL_EL1.EL0PCTEN==0, read accesses to this register from EL0 are trapped to
EL1.





Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CNTPCT is architecturally mapped to AArch64 System register
CNTPCT_EL0.


Attributes 
CNTPCT is a 64-bit register. 


Field descriptions 


The CNTPCT bit assignments are: 
63 0


Physical count value 


Bits [63:0] 


Physical count value. 


Accessing the CNTPCT: 
To access the CNTPCT: 
MRRC p15,0,<Rt>,<Rt2>,c14 ; Read CNTPCT[31:0] into Rt and CNTPCT[63:32] into Rt2 


Register access is encoded as follows: 





coproc  opc1  CRm
1111    0000  1110
G6.5.11 CNTV_CTL, Counter-timer Virtual Timer Control register
The CNTV_CTL characteristics are: 


Purpose 


Control register for the virtual timer. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTKCTL.PL0VTEN==0, accesses to this register from EL0 are trapped to Undefined
mode.


• If CNTKCTL_EL1.EL0VTEN==0, accesses to this register from EL0 are trapped to EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CNTV_CTL is architecturally mapped to AArch64 System register
CNTV_CTL_EL0.


Some or all RW fields of this register have defined reset values. These apply only if the PE resets 
into an Exception level that is using AArch32. Otherwise, RW fields in this register reset to 
architecturally UNKNOWN values. 


Attributes 
CNTV_CTL is a 32-bit register. 


Field descriptions 


The CNTV_CTL bit assignments are: 


Bits     Field
[31:3]   RES0
[2]      ISTATUS
[1]      IMASK
[0]      ENABLE


Bits [31:3]
Reserved, RES0.
ISTATUS, bit [2]
The status of the timer. This bit indicates whether the timer condition is met:
0 Timer condition is not met.
1 Timer condition is met. 


When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met. 
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the 
value of IMASK is 0 then the timer interrupt is asserted. 


When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN. 


For more information, see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.


This bit is read-only. 


This field resets to a value that is architecturally UNKNOWN. 


IMASK, bit [1] 
Timer interrupt mask bit. Permitted values are: 
0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit. 
For more information, see the description of the ISTATUS bit. 
This field resets to a value that is architecturally UNKNOWN. 
ENABLE, bit [0] 
Enables the timer. Permitted values are: 
0 Timer disabled.
1 Timer enabled. 


Setting this bit to 0 disables the timer output signal, but the timer value accessible from 
CNTV_TVAL continues to count down. 


— Note 


Disabling the output signal might be a power-saving option. 





When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the CNTV_CTL: 
To access the CNTV_CTL: 


MRC p15,0,<Rt>,c14,c3,1 ; Read CNTV_CTL into Rt 
MCR p15,0,<Rt>,c14,c3,1 ; Write Rt to CNTV_CTL 


Register access is encoded as follows: 





coproc  opc1  CRn   CRm   opc2
1111    000   1110  0011  001
G6.5.12 CNTV_CVAL, Counter-timer Virtual Timer CompareValue register
The CNTV_CVAL characteristics are: 


Purpose 


Holds the compare value for the virtual timer. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RW  RW   RW





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTKCTL.PL0VTEN==0, accesses to this register from EL0 are trapped to Undefined
mode.


• If CNTKCTL_EL1.EL0VTEN==0, accesses to this register from EL0 are trapped to EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CNTV_CVAL is architecturally mapped to AArch64 System register
CNTV_CVAL_EL0.


RW fields in this register reset to architecturally UNKNOWN values. 


Attributes 
CNTV_CVAL is a 64-bit register. 


Field descriptions 


The CNTV_CVAL bit assignments are: 


63 0 


Virtual timer compare value 


Bits [63:0] 


Virtual timer compare value. 


Accessing the CNTV_CVAL: 


To access the CNTV_CVAL: 
MRRC p15,3,<Rt>,<Rt2>,c14 ; Read CNTV_CVAL[31:0] into Rt and CNTV_CVAL[63:32] into Rt2
MCRR p15,3,<Rt>,<Rt2>,c14 ; Write Rt to CNTV_CVAL[31:0] and Rt2 to CNTV_CVAL[63:32]


Register access is encoded as follows:


coproc  opc1  CRm
1111    0011  1110
G6.5.13 CNTV_TVAL, Counter-timer Virtual Timer TimerValue register
The CNTV_TVAL characteristics are: 
Purpose 
Holds the timer value for the virtual timer. 
Usage constraints 
If EL3 is implemented and is using AArch32, this register is accessible as follows: 
EL0(NS)    EL0(S)     EL1(NS)  EL2(NS)  EL3(SCR.NS=1)  EL3(SCR.NS=0)
Config-RW  Config-RW  RW       RW       RW             RW
If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 
EL0        EL1  EL2 (NS)
Config-RW  RW   RW
Where the value of CNTV_CTL.ENABLE is 0:
• A write to CNTV_TVAL updates the register.
• The value held in CNTV_TVAL continues to decrement.
• A read of CNTV_TVAL returns an UNKNOWN value.
Traps and Enables 
For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 
• If CNTKCTL.PL0VTEN==0, accesses to this register from EL0 are trapped to Undefined
mode.
• If CNTKCTL_EL1.EL0VTEN==0, accesses to this register from EL0 are trapped to EL1.
Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 
AArch32 System register CNTV_TVAL is architecturally mapped to AArch64 System register
CNTV_TVAL_EL0.
RW fields in this register reset to architecturally UNKNOWN values. 
Attributes 
CNTV_TVAL is a 32-bit register. 
Field descriptions 
The CNTV_TVAL bit assignments are: 
31 0 
Bits [31:0]


Virtual timer value. 


Accessing the CNTV_TVAL: 
To access the CNTV_TVAL: 


MRC p15,0,<Rt>,c14,c3,0 ; Read CNTV_TVAL into Rt
MCR p15,0,<Rt>,c14,c3,0 ; Write Rt to CNTV_TVAL


Register access is encoded as follows:


coproc  opc1  CRn   CRm   opc2
1111    000   1110  0011  000
G6.5.14 CNTVCT, Counter-timer Virtual Count register
The CNTVCT characteristics are: 


Purpose 


Holds the 64-bit virtual count value. The virtual count value is equal to the physical count value 
visible in CNTPCT minus the virtual offset visible in CNTVOFF. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0 (NS)   EL0 (S)    EL1 (NS)  EL2 (NS)  EL3 (SCR.NS=1)  EL3 (SCR.NS=0)
Config-RO  Config-RO  RO        RO        RO              RO





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0        EL1  EL2 (NS)
Config-RO  RO   RO





Traps and Enables 


For a description of the prioritization of any generated exceptions, see Synchronous exception 
prioritization for exceptions taken to AArch32 state on page G1-3816 for exceptions taken to 
AArch32 state, and Synchronous exception prioritization for exceptions taken to AArch64 on 
page D1-1548 for exceptions taken to AArch64 state. Subject to the prioritization rules: 


• If CNTKCTL.PL0VCTEN==0, read accesses to this register from EL0 are trapped to
Undefined mode.

• If CNTKCTL_EL1.EL0VCTEN==0, read accesses to this register from EL0 are trapped to
EL1.


Configurations 
There is one instance of this register that is used in both Secure and Non-secure states. 


AArch32 System register CNTVCT is architecturally mapped to AArch64 System register
CNTVCT_EL0.

The virtual count value is equal to the physical count value visible in CNTPCT minus the virtual
offset visible in CNTVOFF.

When EL2 is not implemented, CNTVOFF is RES0, and the value of this register is the same as the
value of CNTPCT.
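
The relationship between the three counter registers can be sketched as follows. This is a minimal Python model, not architecture pseudocode. It assumes that the subtraction wraps modulo 2^64, as is usual for fixed-width register arithmetic, and the function and parameter names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def virtual_count(cntpct, cntvoff, el2_implemented=True):
    """Return the CNTVCT value as CNTPCT minus CNTVOFF, truncated to 64 bits.

    When EL2 is not implemented, CNTVOFF is RES0, so the offset is treated
    as zero and the result equals CNTPCT.
    Assumption: the subtraction wraps modulo 2**64.
    """
    offset = cntvoff if el2_implemented else 0
    return (cntpct - offset) & MASK64


print(virtual_count(1000, 400))                          # 600
print(virtual_count(1000, 400, el2_implemented=False))   # 1000
```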


Attributes 
CNTVCT is a 64-bit register. 


Field descriptions 


The CNTVCT bit assignments are: 


63 0 


Virtual count value
Bits [63:0]


Virtual count value. 


Accessing the CNTVCT: 
To access the CNTVCT:
MRRC p15,1,<Rt>,<Rt2>,c14 ; Read CNTVCT[31:0] into Rt and CNTVCT[63:32] into Rt2 


Register access is encoded as follows: 





coproc  opc1  CRm
1111    0001  1110
G6.5.15 CNTVOFF, Counter-timer Virtual Offset register
The CNTVOFF characteristics are: 


Purpose 


Holds the 64-bit virtual offset. This is the offset between the physical count value visible in 
CNTPCT and the virtual count value visible in CNTVCT. 


Usage constraints 


If EL3 is implemented and is using AArch32, this register is accessible as follows: 





EL0 (NS)  EL0 (S)  EL1 (NS)  EL2 (NS)  EL3 (SCR.NS=1)  EL3 (SCR.NS=0)
-         -        -         RW        RW              -





If EL3 is not implemented or EL3 is implemented and is using AArch64, this register is accessible 
as follows: 





EL0  EL1  EL2 (NS)
-    -    RW





Traps and Enables 


There are no traps or enables affecting this register. 


Configurations 


AArch32 System register CNTVOFF is architecturally mapped to AArch64 System register
CNTVOFF_EL2.


If EL2 is not implemented, this register is RES0 from EL3 and the virtual counter uses a fixed virtual
offset of zero.


When EL2 is implemented and can use AArch32, on a reset into an Exception level that is using 
AArch32 this register resets to an IMPLEMENTATION DEFINED value that might be UNKNOWN. 


Attributes 
CNTVOFF is a 64-bit register. 


Field descriptions 


The CNTVOFF bit assignments are: 


63 0 


Virtual offset 


Bits [63:0] 
Virtual offset. 


Accessing the CNTVOFF: 
To access the CNTVOFF: 


MRRC p15,4,<Rt>,<Rt2>,c14 ; Read CNTVOFF[31:0] into Rt and CNTVOFF[63:32] into Rt2 
MCRR p15,4,<Rt>,<Rt2>,c14 ; Write Rt to CNTVOFF[31:0] and Rt2 to CNTVOFF[63:32]

Register access is encoded as follows:

coproc  opc1  CRm
1111    0100  1110
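
The MRRC and MCRR forms above transfer the 64-bit register through a pair of 32-bit general-purpose registers, with Rt carrying bits [31:0] and Rt2 carrying bits [63:32]. The following Python sketch shows how a debugger or emulator might split and rejoin such a value. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split_for_mcrr(value64):
    """Return (Rt, Rt2) for an MCRR write: Rt holds bits [31:0], Rt2 bits [63:32]."""
    return value64 & MASK32, (value64 >> 32) & MASK32


def join_from_mrrc(rt, rt2):
    """Rebuild the 64-bit value from an MRRC read: Rt is the low word, Rt2 the high word."""
    return ((rt2 & MASK32) << 32) | (rt & MASK32)


rt, rt2 = split_for_mcrr(0x1122334455667788)
print(hex(rt), hex(rt2))                 # 0x55667788 0x11223344
print(hex(join_from_mrrc(rt, rt2)))      # 0x1122334455667788
```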
Chapter H1
About External Debug

This chapter gives an overview of ARMv8 external debug, and specifies the required debug authentication. It
contains the following sections:

• Introduction to external debug on page H1-4840.
• External debug on page H1-4841.
• Required debug authentication on page H1-4842.

Note
For information about self-hosted debug, see Chapter D2 AArch64 Self-hosted Debug and Chapter G2 AArch32
Self-hosted Debug.

H1.1 Introduction to external debug 
ARMv8 supports both:

Self-hosted debug
The PE itself hosts a debugger. That is, developers developing software to run on the PE use
debugger software running on the same PE.

External debug
The debugger is external to the PE. The debugging might be either on-chip, for example in a second
PE, or off-chip, for example a JTAG debugger. External debug is particularly useful for:

• Hardware bring-up. That is, debugging during development when a system is first powered
up and not all of the software functionality is available.

• PEs that are deeply embedded inside systems.

To support external debug, the ARM architecture defines required features that are called external
debug features.
H1.1.1 Definition of a debugger in the context of external debug 


When the description of external debug in this Part of the manual describes a debugger as controlling external debug,
this debugger might be a second on-chip PE or an off-chip device such as a JTAG debugger.
H1.2 External debug


Debug events allow an external debugger to halt the PE. ARMv8 provides the following debug events: 


• Halting Step debug events on page H3-4886:
— The debugger can use this resource to make the PE step through code one line at a time.
• Halt Instruction debug event on page H3-4896:
— This might occur when software executes the Halting breakpoint instruction, HLT.
• Exception Catch debug event on page H3-4897:
— This can be programmed to occur on all entries to a given Exception level.
• External Debug Request debug event on page H3-4900:
— An embedded cross-trigger can signal this debug event.
• OS Unlock Catch debug event on page H3-4901:
— This might occur when the state of the OS Lock changes from locked to unlocked.
• Reset Catch debug events on page H3-4902:
— This might occur when the PE exits reset state.
• Software Access debug event on page H3-4903:
— This can be programmed to occur when software tries to access the Breakpoint Value registers, the
Breakpoint Control registers, the Watchpoint Value registers, or the Watchpoint Control registers. It
causes a trap to Debug state.


Breakpoints and watchpoints can also halt the PE. 


When the PE is in Debug state: 


• It stops executing instructions from the location indicated by the program counter, and is instead controlled
through the external debug interface.

• The Instruction Transfer Register, ITR, passes instructions to the PE to execute in Debug state:
— The ITR contains a single register, EDITR, and associated flow-control flags.

• The Debug Communications Channel, DCC, passes data between the PE and the debugger:
— The DCC includes the data transfer registers, DTRRX and DTRTX, and associated flow-control flags.
— Although the DCC is an essential part of Debug state operation, it can also be used in Non-debug state.


The PE cannot service any interrupts in Debug state. 


Chapter H2 Debug State describes Debug state in more detail. 
H1.3 Required debug authentication
Any implementation must provide the debug authentication defined in this section, that controls:
• Whether the PE can halt.
• Whether non-invasive debug is permitted.
• Some legacy aspects of the AArch32 self-hosted debug model.


The pseudocode functions shown in the following table, and the conditions that follow that table, define the 
architectural requirements for debug authentication. 


Table H1-1 Debug authentication functions

Pseudocode function                                        Description
ExternalSecureNoninvasiveDebugEnabled()                    Returns TRUE if Secure non-invasive debug is enabled.
AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled()   Returns TRUE if Secure invasive self-hosted debug is enabled in AArch32 state.
ExternalSecureInvasiveDebugEnabled()                       Returns TRUE if Secure invasive debug is enabled.
ExternalNoninvasiveDebugEnabled()                          Returns TRUE if Non-secure non-invasive debug is enabled.
ExternalInvasiveDebugEnabled()                             Returns TRUE if Non-secure invasive debug is enabled.





The following conditions always apply: 

• If ExternalInvasiveDebugEnabled() is FALSE then ExternalSecureInvasiveDebugEnabled() is FALSE.
• If ExternalNoninvasiveDebugEnabled() is FALSE then ExternalSecureNoninvasiveDebugEnabled() is FALSE.
• If ExternalInvasiveDebugEnabled() is TRUE then ExternalNoninvasiveDebugEnabled() is TRUE.
• If ExternalSecureInvasiveDebugEnabled() is TRUE then ExternalSecureNoninvasiveDebugEnabled() is TRUE.


ARM recommends the use of the interface described in Recommended authentication interface on page K2-5498 to 
provide this debug authentication. 
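
The four conditions that always apply can be checked mechanically. The following Python sketch is illustrative and is not the architecture pseudocode. It takes the boolean results of the four external authentication functions as plain arguments, and the function name authentication_is_consistent is hypothetical.

```python
def authentication_is_consistent(invasive, noninvasive,
                                 secure_invasive, secure_noninvasive):
    """Check the four conditions that always apply to debug authentication.

    Each argument is the boolean result of one pseudocode function:
      invasive           -> ExternalInvasiveDebugEnabled()
      noninvasive        -> ExternalNoninvasiveDebugEnabled()
      secure_invasive    -> ExternalSecureInvasiveDebugEnabled()
      secure_noninvasive -> ExternalSecureNoninvasiveDebugEnabled()
    """
    def implies(a, b):
        return (not a) or b

    return (
        implies(not invasive, not secure_invasive)
        and implies(not noninvasive, not secure_noninvasive)
        and implies(invasive, noninvasive)
        and implies(secure_invasive, secure_noninvasive)
    )


print(authentication_is_consistent(True, True, False, True))    # True
print(authentication_is_consistent(False, True, True, True))    # False
```

The second call fails because Secure invasive debug cannot be enabled when Non-secure invasive debug is disabled.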
Chapter H2
Debug State

This chapter describes Debug state. It contains the following sections:

• About Debug state on page H2-4844.
• Halting the PE on debug events on page H2-4845.
• Entering Debug state on page H2-4852.
• Behavior in Debug state on page H2-4855.
• Exiting Debug state on page H2-4880.

Note
Table K12-1 on page K12-5660 disambiguates the general register references used in this chapter.

H2.1 About Debug state 


In external debug, debug events allow an external debugger to halt the PE. The PE then enters Debug state. When 
the PE is in Debug state: 


• It stops executing instructions from the location indicated by the program counter, and is instead controlled
through the external debug interface.

• The Instruction Transfer Register, ITR, passes instructions to the PE to execute in Debug state.

• The Debug Communications Channel, DCC, passes data between the PE and the debugger.


The PE cannot service any interrupts in Debug state. 

H2.2 Halting the PE on debug events 


For details of debug events, see Introduction to Halting debug events on page H3-4884 and Breakpoint and 
Watchpoint debug events on page H2-4846. 


On a debug event, the PE must do one of the following: 


• Enter Debug state.
• Pend the debug event.
• Generate a debug exception.
• Ignore the debug event.


This behavior depends on both: 


• Whether halting is allowed by the current state of the debug authentication interface. See Halting allowed
and halting prohibited.

• The type of debug event and the programming of the debug control registers.
— See Halting debug events on page H2-4846 for all Halting debug events.
— See Breakpoint and Watchpoint debug events on page H2-4846 for Breakpoint and Watchpoint debug
events.


See also Other debug exceptions on page H2-4847. 


This means that behavior can be CONSTRAINED UNPREDICTABLE if the conditions change. See Synchronization and 
Halting debug events on page H3-4904. 


Summary of debug events and possible outcomes on page H3-4884 summarizes the possible outcomes of each type 
of debug event. 


H2.2.1 Halting allowed and halting prohibited 
Halting can be either allowed or prohibited: 
• Halting is always prohibited in Debug state.
• Halting is always prohibited when DoubleLockStatus() == TRUE.
— This means that OS Double Lock is locked.
• Halting is also controlled by the IMPLEMENTATION DEFINED authentication interface, and is prohibited when
either:
— The PE is in Non-secure state and ExternalInvasiveDebugEnabled() == FALSE.
— The PE is in Secure state and ExternalSecureInvasiveDebugEnabled() == FALSE.
Note
See Appendix K2 Recommended External Debug Interface for more information on these functions.

• Otherwise, halting is allowed.

See:
• Pseudocode description of Halting on debug events on page H2-4851.
• Required debug authentication on page H1-4842.
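
The rules in this section can be summarized as a single decision function. The following Python sketch is an illustration of the rules listed above, not a copy of the architecture pseudocode. The function signature and parameter names are assumptions made for this example.

```python
def halting_allowed(in_debug_state, double_lock_status, secure,
                    ext_invasive_enabled, ext_secure_invasive_enabled):
    """Return True if halting is allowed, following the rules in this section.

    in_debug_state              -> the PE is already in Debug state
    double_lock_status          -> result of DoubleLockStatus()
    secure                      -> the PE is in Secure state
    ext_invasive_enabled        -> result of ExternalInvasiveDebugEnabled()
    ext_secure_invasive_enabled -> result of ExternalSecureInvasiveDebugEnabled()
    """
    # Halting is always prohibited in Debug state or when the
    # OS Double Lock is locked.
    if in_debug_state or double_lock_status:
        return False
    # Otherwise the authentication interface decides, per Security state.
    if secure:
        return ext_secure_invasive_enabled
    return ext_invasive_enabled


# Non-secure state with invasive debug enabled: halting is allowed.
print(halting_allowed(False, False, False, True, False))   # True
# Secure state without Secure invasive debug: halting is prohibited.
print(halting_allowed(False, False, True, True, False))    # False
```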

H2.2.2 Halting debug events 
When a Halting debug event is generated, it causes entry to Debug state if both: 
• Halting is allowed. See Halting allowed and halting prohibited on page H2-4845.
• The Halting debug event is either:
— A Halt Instruction debug event when Halting debug is enabled. This means that EDSCR.HDE == 1.
— Not a Halt Instruction debug event.

Note
— A Halt Instruction debug event is the only Halting debug event that relies on EDSCR.HDE == 1. This
is to prevent malicious code from causing an entry to Debug state. EDSCR.HDE == 0 on a Cold reset.
— Halting on Breakpoint and Watchpoint Software debug events is also controlled by EDSCR.HDE. See
Breakpoint and Watchpoint debug events.
— EDSCR.HDE can be written by software when the OS Lock is locked. Privileged code can use
SDCR.TDOSA and HDCR.TDOSA to trap writes to these registers.





If a Halting debug event does not generate entry to Debug state because the conditions listed in this section do not 
hold, then: 


• If the Halting debug event is a Halt Instruction debug event, the instruction that generated the Halting debug
event is treated as UNDEFINED.

• If the Halting debug event is an Exception Catch debug event or a Software Access debug event, it is ignored.

• In all other cases the Halting debug event is pended, meaning that:
— The pending Halting debug event is recorded in EDESR.
— The pending Halting debug event is taken when halting is allowed. See Pending Halting debug events
on page H3-4905.


Pending Halting debug events are discarded by a Cold reset. The debugger can also force a pending event to be 
dropped by writing to EDESR. Summary of actions from debug events on page H2-4849 summarizes the possible 
outcome for each type of Debug event. 


Note 


Halting debug events never generate Debug exceptions. 








H2.2.3 Breakpoint and Watchpoint debug events 


A breakpoint or watchpoint generates an entry to Debug state if all of the following conditions hold: 
• Halting debug is enabled, that is EDSCR.HDE == 1.
• Halting is allowed. See Halting allowed and halting prohibited on page H2-4845.
• The OS Lock is unlocked, that is OSLSR.OSLK == 0.


The Address Mismatch breakpoint type is reserved when all of these conditions are met. 


MDSCR_EL1.MDE or DBGDSCRext.MDBGen is ignored when determining whether to enter Debug state. A 
breakpoint or watchpoint that generates entry to Debug state is a Breakpoint or Watchpoint debug event and does 
not generate a debug exception. 


A breakpoint or watchpoint that does not generate an entry to Debug state either: 





• Generates a Breakpoint or Watchpoint exception.
• Is ignored.

Note 


EDSCR.HDE is ignored when determining whether to generate a debug exception. The debug exception is 
suppressed only if the PE enters Debug state. This means that the use of Halting debug mode in Non-secure state 
does not affect the Exception model in Secure state. 








See Chapter D2 AArch64 Self-hosted Debug, Chapter G2 AArch32 Self-hosted Debug, and the Note in Other debug 
exceptions. 


H2.2.4 Other debug exceptions 


The following events never generate entry to Debug state: 


• Breakpoint Instruction exceptions.
• Software Step exceptions.
• Vector Catch exceptions.


The behavior of these events is unchanged when Halting debug mode is enabled, that is when EDSCR.HDE == 1. 
This means that these events can do one of the following: 


• They can generate a debug exception.
• They can be ignored.


For additional information, see Chapter D2 AArch64 Self-hosted Debug and Chapter G2 AArch32 Self-hosted 
Debug. 


H2.2.5 Debug state entry and debug event prioritization 


The architecture does not define when asynchronous Halting debug events are taken, and therefore the prioritization 
of asynchronous debug events is IMPLEMENTATION DEFINED. 


Synchronous Halting debug events do have a priority order. 


The following are synchronous Halting debug events: 


• Halting Step debug event.
• Halt Instruction debug event.
• Exception Catch debug event.
• Software Access debug event.
• Reset Catch debug event.


Each of these synchronous Halting debug events is treated as a synchronous exception generated by an instruction, 
or by the taking of an exception or reset. That is, the synchronous Halting debug event must be taken before any 
subsequent instructions are executed. Reset Catch debug events must be taken before the PE executes the instruction 
at the reset vector. 





Note 


Reset Catch and Exception Catch debug events can be generated asynchronously, because they can result from an 
asynchronous exception. However, if halting is allowed after the asynchronous exception has been processed, the 
Reset Catch or Exception Catch debug event is taken synchronously. 





The Halting Step debug event is generated by the instruction after the stepped instruction. Therefore, if the stepped 
instruction generates any other synchronous exceptions or debug events, these are taken first. 


OS Unlock Catch debug events are always pended and taken asynchronously. 
Halting Step debug events and Reset Catch debug events might be pended and taken asynchronously at a later time. 


The following list shows how the events are prioritized, with -2 being the highest priority. 

Note 


The priority numbering is the same as the numbering for AArch64 synchronous exception priorities listed in 
Synchronous exception types, routing and priorities on page D1-1547. The Debug events in this section with a 
negative priority are a higher priority than any synchronous exception. The debug events with fractional priorities 
have a priority between two or more exceptions. 








The priority for synchronous debug events is as follows: 


-2       Reset Catch debug event. See Reset Catch debug events on page H3-4902.
         This debug event has a higher priority than the synchronous exceptions listed in Synchronous
         exception types, routing and priorities on page D1-1547.

-1       Exception Catch debug event. See Exception Catch debug event on page H3-4897.
         This debug event can be assigned one of two priorities. When it has a priority of -1, it has a higher
         priority than the synchronous exceptions listed in the Exception model. See Exception Catch debug
         event on page H3-4897.

0        Halting Step debug event. See Halting Step debug events on page H3-4886.
         This debug event has a higher priority than the synchronous exceptions listed in the Exception
         model.

1        Software Step debug event. See Software Step exceptions on page D2-1673.

1.5      Exception Catch debug event. See Exception Catch debug event on page H3-4897.
         This debug event can be assigned one of two priorities, -1 or 1.5. See Exception Catch debug event
         on page H3-4897.

2-3      These events are not debug events.

4        Breakpoint exception or debug event or Address Matching Vector Catch exception. See Breakpoint
         exceptions on page D2-1641, and Vector Catch exceptions on page G2-3975.
         These two debug events have the same priority.

5-13     These events are not debug events.

14       Halt Instruction debug event. See Halt Instruction debug event on page H3-4896.

15-19    These events are not debug events.

19.5     Software Access debug event. See Software Access debug event on page H3-4903.

20-21    These events are not debug events.

22       Watchpoint exception or debug event. See Watchpoint exceptions on page D2-1657 for exceptions
         taken from AArch64 state, or Watchpoint exceptions on page G2-3961 for exceptions taken from
         AArch32 state.

23       This event is not a debug event.


For Reset Catch debug events and Halting Step debug events the priorities listed in this section only apply when 
halting is allowed at the time the event is generated. This means that the event is taken synchronously and not 
pended. 


The prioritization of asynchronous Halting debug events, including pending Halting debug events taken 
asynchronously, is IMPLEMENTATION DEFINED. See Taking Halting debug events asynchronously on page H3-4905. 


For more information on the prioritization of exceptions see Synchronous exception types, routing and priorities on 
page D1-1547. 
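
The priority numbering in this section can be used directly as a sort key, because a lower number means a higher priority. The following Python sketch is illustrative only. The dictionary keys are hypothetical labels invented for this example, and the two possible priorities of the Exception Catch debug event are modelled as two separate entries.

```python
# Priorities of the synchronous debug events listed in this section.
# A lower number means a higher priority. Key names are hypothetical.
SYNC_DEBUG_EVENT_PRIORITY = {
    "reset_catch": -2,
    "exception_catch_high": -1,
    "halting_step": 0,
    "software_step": 1,
    "exception_catch_low": 1.5,
    "breakpoint_or_address_matching_vector_catch": 4,
    "halt_instruction": 14,
    "software_access": 19.5,
    "watchpoint": 22,
}


def highest_priority(events):
    """Return the event from `events` that has the highest priority."""
    return min(events, key=SYNC_DEBUG_EVENT_PRIORITY.__getitem__)


print(highest_priority(["watchpoint", "halt_instruction"]))   # halt_instruction
```

This model covers only synchronous debug events taken when halting is allowed. The prioritization of asynchronous and pended events is IMPLEMENTATION DEFINED and is not represented.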

Breakpoint debug events and Vector Catch exception 


An Address Matching Vector Catch exception has the same priority as a Breakpoint debug event. See Synchronous 
exception prioritization for exceptions taken to AArch64 on page D1-1548. 


The prioritization of these events is unchanged even if the breakpoint generates entry to Debug state instead of a 
Breakpoint exception. This means that if a single instruction generates both an Address Matching Vector Catch 
exception and a Breakpoint debug event, there is a CONSTRAINED UNPREDICTABLE choice of: 


• The PE entering Debug state due to the Breakpoint debug event.
• A Vector Catch exception.

This only applies if all of the following are true:
• Halting debug is enabled.
• Halting is allowed.
• The OS Lock is unlocked.


An Exception Trapping Vector Catch exception must be generated immediately following the exception that 
generated it. This means that it does not appear in the priority table. 


H2.2.6 Imprecise entry to Debug state 


Entry to Debug state is normally precise, meaning that the PE cannot enter Debug state if it can neither complete 
nor abandon all currently executing instructions and leave the PE in a precise state. 


A debugger can write a value of 1 to EDRCR.CBRRQ to allow imprecise entry to Debug state. An External Debug 
Request debug event must be pending before writing 1 to this bit. Support for this feature is OPTIONAL and it is 
IMPLEMENTATION DEFINED when it is effective at forcing entry to Debug state. 


The PE ignores writes to this bit if either: 
• External debugging is not enabled, meaning ExternalInvasiveDebugEnabled() == FALSE.
• Secure external debugging is not enabled, meaning ExternalSecureInvasiveDebugEnabled() == FALSE, and
either:
— EL3 is not implemented and the implemented Security state is Secure state.
— EL3 is implemented.


Example H2-1 shows how entry to Debug state can be forced. 


Example H2-1 Forcing entry to Debug state 


The debugger pends an External Debug Request debug event through the CTI to halt a program that has stopped 
responding. However, the memory system is not responding and a memory access instruction cannot complete. This 
means that Debug state cannot be entered precisely. The debugger writes a value of 1 to EDRCR.CBRRQ. The PE 
cancels all outstanding memory accesses and enters Debug state. As some instructions might not have completed 
correctly, entry to Debug state is imprecise. 


When Debug state is entered imprecisely, all memory access instructions executed through the ITR have 
UNPREDICTABLE behavior. The value of all registers is UNKNOWN, but might be useful for diagnostic purposes. 
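
The conditions under which the PE ignores a write to EDRCR.CBRRQ, listed earlier in this section, can be sketched as a predicate. This Python example is an illustration under stated assumptions: the function and parameter names are hypothetical, and implemented_state_is_secure is only meaningful when EL3 is not implemented.

```python
def cbrrq_write_ignored(ext_invasive_enabled, ext_secure_invasive_enabled,
                        el3_implemented, implemented_state_is_secure):
    """Return True if the PE ignores a write to EDRCR.CBRRQ.

    ext_invasive_enabled        -> result of ExternalInvasiveDebugEnabled()
    ext_secure_invasive_enabled -> result of ExternalSecureInvasiveDebugEnabled()
    el3_implemented             -> EL3 is implemented
    implemented_state_is_secure -> with no EL3, the single implemented
                                   Security state is Secure state
    """
    # External debugging is not enabled.
    if not ext_invasive_enabled:
        return True
    # Secure external debugging is not enabled, and the PE either
    # implements EL3 or implements only Secure state.
    if not ext_secure_invasive_enabled:
        return el3_implemented or implemented_state_is_secure
    return False


print(cbrrq_write_ignored(True, False, True, False))    # True
print(cbrrq_write_ignored(True, False, False, False))   # False
```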


H2.2.7 Summary of actions from debug events 
Table H2-1 shows the Software and Halting debug events. In Table H2-1 the columns have the following meaning:

Debug event type
This means the type of debug event where:
Other software Means one of:
• Software Step exceptions on page D2-1673.
• Breakpoint Instruction exceptions on page D2-1639.
• Vector Catch exceptions on page D2-1672 for AArch64 state or
Vector Catch exceptions on page G2-3975 for AArch32 state.
Other Halting Means one of the following:
• Halting Step debug events on page H3-4886.
• External Debug Request debug event on page H3-4900.
• Reset Catch debug events on page H3-4902.
• OS Unlock Catch debug event on page H3-4901.
Other debug events are referred to explicitly.

Authentication
This means halting is allowed by the IMPLEMENTATION DEFINED external authentication interface.
It is the result of one of the following pseudocode functions:
In Secure state ExternalSecureInvasiveDebugEnabled().
In Non-secure state ExternalInvasiveDebugEnabled().

DLK
This indicates whether the OS Double Lock is locked, DoubleLockStatus() == TRUE.

OSLK
This is the value of OSLSR.OSLK. It indicates whether the OS Lock is locked.

HDE
This is the value of EDSCR.HDE. It indicates whether Halting debug is enabled.

The letter X in Table H2-1 indicates that the value can be either 0 or 1.


Table H2-1 Debug authentication for external debug

Debug event type                      Authentication  DLK    OSLK  HDE  Behavior
Other software                        X               X      X     X    Handled by the Exception model
Breakpoint or Watchpoint debug event  X               TRUE   X     X    Handled by the Exception model (ignored)
                                      X               FALSE  1     X    Handled by the Exception model (ignored)
                                      FALSE           FALSE  0     X    Handled by the Exception model
                                      TRUE            FALSE  0     0    Handled by the Exception model
                                      TRUE            FALSE  0     1    Entry to Debug state
Halt Instruction debug event          FALSE           X      X     X    UNDEFINED
                                      TRUE            TRUE   X     X    UNDEFINED
                                      TRUE            FALSE  X     0    UNDEFINED
                                      TRUE            FALSE  X     1    Entry to Debug state
Exception Catch debug event           FALSE           X      X     X    Ignored
                                      TRUE            TRUE   X     X    Ignored
                                      TRUE            FALSE  X     X    Entry to Debug state
Software Access debug event           FALSE           X      X     X    Ignored
                                      TRUE            TRUE   X     X    Ignored
                                      TRUE            FALSE  1     X    Ignored
                                      TRUE            FALSE  0     X    Entry to Debug state
Other Halting                         FALSE           X      X     X    Debug event is pended
                                      TRUE            TRUE   X     X    Debug event is pended
                                      TRUE            FALSE  X     X    Entry to Debug state
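
The rows of Table H2-1 can be expressed as a lookup function. The following Python sketch is a transcription of the table for illustration, not the architecture pseudocode. The event type labels and the function name are hypothetical, Authentication and DLK are modelled as booleans, and OSLK and HDE as the integers 0 or 1.

```python
def debug_event_behavior(event_type, authentication, dlk, oslk, hde):
    """Return the Behavior column of Table H2-1 for one combination of inputs.

    event_type is one of the hypothetical labels: 'other_software',
    'breakpoint_watchpoint', 'halt_instruction', 'exception_catch',
    'software_access', 'other_halting'.
    """
    if event_type == "other_software":
        return "Handled by the Exception model"
    if event_type == "breakpoint_watchpoint":
        if dlk or oslk:
            return "Handled by the Exception model (ignored)"
        if authentication and hde:
            return "Entry to Debug state"
        return "Handled by the Exception model"

    # For the Halting debug events, halting needs authentication
    # and an unlocked OS Double Lock.
    halting_ok = authentication and not dlk
    if event_type == "halt_instruction":
        return "Entry to Debug state" if halting_ok and hde else "UNDEFINED"
    if event_type == "exception_catch":
        return "Entry to Debug state" if halting_ok else "Ignored"
    if event_type == "software_access":
        return "Entry to Debug state" if halting_ok and not oslk else "Ignored"
    if event_type == "other_halting":
        return "Entry to Debug state" if halting_ok else "Debug event is pended"
    raise ValueError("unknown event type: " + event_type)


print(debug_event_behavior("halt_instruction", True, False, 0, 1))
print(debug_event_behavior("other_halting", False, False, 0, 0))
```

The table treats the PE as not already being in Debug state, so that case is not modelled here.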
H2.2.8 Pseudocode description of Halting on debug events 
The Halted(), Restarting(), HaltingAllowed(), and HaltOnBreakpointOrWatchpoint() functions are described in the
ARMv8 pseudocode.
H2.3 Entering Debug state

On entry to Debug state, the preferred restart address and PSTATE are saved in DLR and DSPSR. The PE remains
in the mode and security state from which it entered Debug state.

If EDRCR.CBRRQ has a value of 0, entry to Debug state is precise. If EDRCR.CBRRQ has a value of 1, then
imprecise entry to Debug state is permitted.

If a Watchpoint debug event causes an entry to Debug state, the address of the access that generated the Watchpoint
debug event is recorded in EDWAR.

See:

• Determining the memory location that caused a Watchpoint exception on page D2-1665 for a debug event
taken from AArch64 state.

• Determining the memory location that caused a Watchpoint exception on page G2-3969 for a debug event
taken from AArch32 state.

Other than the effect on PSTATE and EDSCR, entry to Debug state is not a Context synchronization event. The
effects of entry to Debug state on PSTATE and EDSCR are synchronized.

H2.3.1 Entering Debug state from AArch32 state


When entering Debug state from AArch32 state, the PE remains in AArch32 state. In AArch32 Debug state the PE 
executes T32 instructions, regardless of the value of PSTATE.T before entering Debug state. 


To allow the debugger to determine the state of the PE, the current Execution state for all four Exception levels can 
be read from EDSCR.RW, and the current Exception level can be read from EDSCR.EL. 


The current endianness state, PSTATE.E, is unchanged on entry to Debug state. 


Note 
• If EL1 is using AArch32 state, the current endianness state can differ from that indicated by SCTLR.EE.

• If EL2 is using AArch32 state, the current endianness state can differ from that indicated by HSCTLR.EE.

• On entry to Debug state from AArch32 state, PSTATE.SS is copied to DSPSR.SS, even though the PE
remains in AArch32 state.





See also Effect of entering Debug state on PSTATE on page H2-4853. 


H2.3.2 Effect of Debug state entry on DLR and DSPSR 


DLR is set to the preferred restart address for the debug event, that depends on the event type. The value of PSTATE 
is saved in DSPSR. For entry to Debug state from AArch32 state, the values saved in DSPSR.IT are always correct 
for the preferred restart address. 


For synchronous Halting debug events, the preferred restart address is the address of the instruction that generated 
the debug event. 


For asynchronous Halting debug events, including pending Halting debug events taken asynchronously, the 
preferred restart address is the address of the first instruction that must be executed on exit from Debug state. 


This means that: 


• For Breakpoint and Watchpoint debug events, the preferred restart address is the same as the preferred return
address for a debug exception, as described in Chapter D2 AArch64 Self-hosted Debug and Chapter G2
AArch32 Self-hosted Debug.

• For Halt Instruction debug events DLR is set to the address of the HLT instruction and DSPSR.IT is correct
for the HLT instruction.

• For Software Access debug events, DLR is set to the address of the accessing instruction and DSPSR.IT is
correct for this instruction.

• For Halting Step debug events taken synchronously, DLR and DSPSR are set as the ELR and SPSR would
be set for a Software Step exception. This is usually the address of, and PSTATE for, the instruction after the
one that was stepped.

• For Exception Catch debug events:

— If the debug event is generated on taking an exception to a trapped Exception level, the DLR is set to
the address of the exception vector the PE would have started fetching from. This is UNKNOWN if the
VBAR for the Exception level has never been initialized. The DSPSR records the value of PSTATE
after taking the exception. The Exception Catch occurs after the SPSR and the Link register are set,
and the debugger can use these registers to determine where in the application program the exception
occurred.

Note
Depending on the target Exception level and Execution state for the exception, the Link register is one
of ELR_EL1, ELR_EL2, ELR_EL3, ELR_hyp, or LR (R14).

— If the debug event is generated on an exception return to a trapped Exception level, the DLR is set to
the target address of the exception return and the DSPSR records the value of PSTATE after the
exception return.

• Reset Catch debug events taken synchronously behave like Exception Catch debug events.

• For Reset Catch debug events and Exception Catch debug events generated on reset to a trapped Exception
level, the DLR is set to the reset address and the DSPSR records the reset value of PSTATE.

• For pending Halting debug events and External Debug Request debug events, DLR is set to the address of
the first instruction that must be executed on exit from Debug state and DSPSR.IT is correct for this
instruction. See Pending Halting debug events on page H3-4905.


Normally DLR is aligned according to the instruction set state indicated in DSPSR. However, a debug event might 
be taken at a point where the PC is not aligned. 


H2.3.3 Effect of Debug state entry on System registers, the Event register, and exclusive monitors 


Entering Debug state has no effect on System registers other than DLR and DSPSR. In particular, ESRs, FARs, and 
FSRs are not updated on entering Debug state. SCR is unchanged, even when entering Debug state from EL3. 


Entering Debug state has no architecturally-defined effect on the Event Register and exclusive monitors. 


Note 


Entry to Debug state might set the Event Register or clear the exclusive monitors, or both. However, this is not a 
requirement, and debuggers must not rely on any implementation specific behavior. 








Unless otherwise described in this reference manual, instructions executed in Debug state have their 
architecturally-defined effects on the System registers, the Event register, and exclusive monitors. 


H2.3.4 Effect of entering Debug state on PSTATE 


The effect of an entry to Debug state on PSTATE is described in Entering Debug state on page H2-4852 and 
Entering Debug state from AArch32 state on page H2-4852. 


PSTATE.{E, M, nRW, EL, SP} are unchanged on entry to Debug state.
PSTATE.IL is cleared to 0 on entry to Debug state, after being saved in DSPSR_EL0.


The other PSTATE fields are ignored and not observable in Debug state:
° PSTATE.{N, Z, C, V, Q, GE} are unchanged.
° PSTATE.{IT, T, SS, D, A, I, F} are set to UNKNOWN values, after being saved in DSPSR_EL0.
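
The field groups above can be sketched as a small model. This is an illustrative sketch only, not the architectural pseudocode: the function name `enter_debug_state`, the dict representation of PSTATE, and the use of `None` to stand for an UNKNOWN value are all assumptions made for the example.

```python
# Illustrative model of the PSTATE field groups listed above.
# Names and data layout are hypothetical, chosen for this example only.
SAVED_THEN_CLEARED = {"IL"}
SAVED_THEN_UNKNOWN = {"IT", "T", "SS", "D", "A", "I", "F"}

UNKNOWN = None  # stand-in for an architecturally UNKNOWN value


def enter_debug_state(pstate):
    """Return (saved_copy, new_pstate) for a dict of PSTATE field values.

    saved_copy models the value recorded in DSPSR at entry.
    """
    saved_copy = dict(pstate)
    new_pstate = {}
    for field, value in pstate.items():
        if field in SAVED_THEN_CLEARED:
            new_pstate[field] = 0
        elif field in SAVED_THEN_UNKNOWN:
            new_pstate[field] = UNKNOWN
        else:
            # E, M, nRW, EL, SP and N, Z, C, V, Q, GE are unchanged.
            new_pstate[field] = value
    return saved_copy, new_pstate
```

A debugger-side model like this is mainly useful as a reminder that only the saved copy, not the live PSTATE, can be trusted for IT, T, SS, D, A, I, and F after entry.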
For more information see Process state (PSTATE) in Debug state on page H2-4855.


H2.3.5 Entering Debug state during loads and stores


The PE can enter Debug state during instructions that perform a sequence of memory accesses, as opposed to a 
single single-copy atomic access, because of a Watchpoint debug event. The effect of entering Debug state on such 
an instruction is the same as taking a Data Abort exception during such an instruction. 


In addition, when executing in AArch64 state, the PE can enter Debug state during instructions that perform a 
sequence of memory accesses because of an External Debug Request debug event. The effect of entering Debug 
state on such an instruction is the same as taking an interrupt exception during such an instruction. 


This applies to all memory types. 


H2.3.6 Entering Debug state and Software Step


When Software Step is active, a debug event that causes entry to Debug state behaves like an exception taken to an 
Exception level above the debug target Exception level. That is: 


° If the instruction that is stepped generates a synchronous debug event that causes entry to Debug state, or an 
asynchronous debug event is taken before the step completes, the PE enters Debug state with DSPSR.SS set 
to 1.


° A pending Halting debug event or an asynchronous debug event can be taken after the step has completed. 
In this case the PE enters Debug state with DSPSR.SS set to 0.


In addition: 


° If the instruction that is stepped generates an exception trapped by an Exception Catch debug event, the PE 
enters Debug state at the exception vector with DSPSR.SS set to 0. This is because PSTATE.SS is set to 0 by 
taking the exception.


° If the PE is reset, PSTATE.SS is reset to 0. If the following debug events are enabled, the PE enters Debug 
state with DSPSR.SS set to 0:
— Reset Catch debug event at the reset Exception level.
— Exception Catch debug event at the reset Exception level.
— Halting Step debug event.


° If Halting Step is also active, then Halting Step and Software Step operate in parallel and can both become 
active-pending. In this case Halting Step has a higher priority than Software Step. This means that the PE 
enters Debug state and DSPSR.SS is set to 0.
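
The cases above reduce to a small lookup. The following is a minimal sketch, assuming a hypothetical `Entry` enumeration that names each case from the text; it is not taken from the architectural pseudocode.

```python
from enum import Enum, auto


class Entry(Enum):
    """Hypothetical names for the Debug state entry cases described above."""
    SYNC_EVENT_FROM_STEPPED_INSTRUCTION = auto()
    ASYNC_EVENT_BEFORE_STEP_COMPLETES = auto()
    EVENT_AFTER_STEP_COMPLETES = auto()
    EXCEPTION_CATCH_ON_STEPPED_INSTRUCTION = auto()
    AFTER_RESET = auto()
    HALTING_STEP_AND_SOFTWARE_STEP_PENDING = auto()


# Only the two "step not yet complete" cases record DSPSR.SS == 1.
_SS_SET = {
    Entry.SYNC_EVENT_FROM_STEPPED_INSTRUCTION,
    Entry.ASYNC_EVENT_BEFORE_STEP_COMPLETES,
}


def dspsr_ss_on_entry(entry):
    """Return the DSPSR.SS value recorded when Software Step is active."""
    return 1 if entry in _SS_SET else 0
```

For example, `dspsr_ss_on_entry(Entry.AFTER_RESET)` returns 0, matching the reset case in the list.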


H2.3.7 Pseudocode description of entering Debug state


The DebugHalt constants are described in shared/debug/halting/DebugHalt on page J1-5383 in the ARMv8 
pseudocode. The UpdateEDSCRFields() and Halt() functions are described in Chapter J1 ARMv8 Pseudocode.


H2.4 Behavior in Debug state

Instructions are executed in Debug state from the Instruction Transfer Register, ITR. The debugger controls which 
instructions are executed in Debug state by writing the instructions to the External Debug Instruction Transfer 
register, EDITR. The Execution state of the PE determines which instruction set is executed:

° If the PE is in AArch64 state it executes A64 instructions.

° If the PE is in AArch32 state it executes T32 instructions:

— For a 32-bit T32 instruction, EDITR[15:0] specifies the first halfword and EDITR[31:16] specifies the 
second halfword.

— For a 16-bit T32 instruction, EDITR[15:0] contains the instruction and EDITR[31:16] is ignored. All 
16-bit T32 instructions are UNPREDICTABLE in Debug state.

The PE does not execute A32 instructions in Debug state. 
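
The halfword placement for T32 can be sketched as follows. This is a hedged example: the helper names `is_32bit_t32` and `pack_t32_for_editr` are hypothetical, and the check for a 32-bit encoding relies on the standard T32 rule that a first halfword whose bits [15:11] are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction.

```python
def is_32bit_t32(first_halfword):
    """True if the first halfword starts a 32-bit T32 encoding."""
    return (first_halfword >> 11) in (0b11101, 0b11110, 0b11111)


def pack_t32_for_editr(first_halfword, second_halfword):
    """Pack a 32-bit T32 instruction into a 32-bit EDITR value.

    EDITR[15:0] holds the first halfword, EDITR[31:16] the second.
    """
    for halfword in (first_halfword, second_halfword):
        if not 0 <= halfword <= 0xFFFF:
            raise ValueError("halfword out of range")
    if not is_32bit_t32(first_halfword):
        # 16-bit T32 instructions are UNPREDICTABLE in Debug state,
        # so this sketch refuses to pack them.
        raise ValueError("not a 32-bit T32 encoding")
    return (second_halfword << 16) | first_halfword
```

As a usage example, the T32 ERET encoding (halfwords 0xF3DE and 0x8F00) packs to 0x8F00F3DE.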

Some instructions are available only in Debug state. See Debug state operations, DCPS, DRPS, MRS, MSR on 

page H2-4870. In Non-debug state these instructions are UNDEFINED. 

The following sections describe behavior in Debug state: 

° Process state (PSTATE) in Debug state. 

° Executing instructions in Debug state. 

° Decode tables on page H2-4866. 

° Security in Debug state on page H2-4869. 

° Privilege in Debug state on page H2-4870. 

. Debug state operations, DCPS, DRPS, MRS, MSR on page H2-4870. 

° Exceptions in Debug state on page H2-4874. 

° Accessing registers in Debug state on page H2-4876. 

° Accessing memory in Debug state on page H2-4879. 

This section specifies the CONSTRAINED UNPREDICTABLE behaviors that apply in Debug state, but see Changing the 

value of EDECR.SS when not in Debug state on page H3-4893 for a change in Non-debug state that causes 

CONSTRAINED UNPREDICTABLE behavior. 

H2.4.1 Process state (PSTATE) in Debug state 

PSTATE.{N, Z, C, V, Q, GE, IT, T, SS, D, A, I, F} are all ignored in Debug state: 

. There are no conditional instructions in Debug state. 

° In AArch32 state, the PE only executes T32 instructions and PSTATE.IT is ignored. 

. Asynchronous exceptions and debug events are ignored. 

° Software step is inactive. 

Instructions executed in Debug state indirectly read PSTATE.{IL, E, M, nRW, EL, SP} as they would in Non-debug 
state.

H2.4.2 Executing instructions in Debug state 

The instructions executed in Debug state must be either A64 instructions or T32 instructions, depending on the 

current Execution state. 

Each instruction falls into one of the following groups: 

° Debug state instructions. These are instructions that are changed in Debug state. See A64 instructions that 
are changed in Debug state on page H2-4856 and T32 instructions that are changed in Debug state on 
page H2-4861. 

° Instructions that are unchanged in Debug state. See A64 instructions that are unchanged in Debug state on 
page H2-4856 and T32 instructions that are unchanged in Debug state on page H2-4861.

° Instructions that are UNPREDICTABLE or CONSTRAINED UNPREDICTABLE in Debug state. See A64 instructions 
that are CONSTRAINED UNPREDICTABLE in Debug state on page H2-4858 and T32 instructions that are 
CONSTRAINED UNPREDICTABLE in Debug state on page H2-4863.


All T32 instructions are treated as unconditional, regardless of PSTATE.IT. See Process state (PSTATE) in Debug 
state on page H2-4855.


If EDSCR.SDD == 1 then an instruction executed in Non-secure state cannot cause entry into Secure state. See 
Security in Debug state on page H2-4869.


Executing A64 instructions in Debug state 


The following sections describe the behavior of the A64 instructions in Debug state: 

° A64 instructions that are changed in Debug state. 

° A64 instructions that are unchanged in Debug state. 

° A64 instructions that are CONSTRAINED UNPREDICTABLE in Debug state on page H2-4858. 


A64 instructions that are changed in Debug state 


The following A64 instructions are defined in Debug state, but are UNDEFINED in Non-debug state: 
° DCPS.


Note
DCPS can be UNDEFINED in certain conditions in Debug state. See DCPS<n> on page H2-4870.


° DRPS.
° MRS (DLR_EL0), MRS (DSPSR_EL0), MSR (DLR_EL0), MSR (DSPSR_EL0).


For more information see Debug state operations, DCPS, DRPS, MRS, MSR on page H2-4870. 


A64 instructions that are unchanged in Debug state 


The following list shows the instructions that are unchanged in Debug state: 


Any instruction that is UNDEFINED in Non-debug state
This list of instructions excludes:
° Any instruction listed in A64 instructions that are changed in Debug state.
° Any instruction listed in A64 instructions that are CONSTRAINED UNPREDICTABLE in 
Debug state on page H2-4858 that is UNDEFINED because an enable or disable bit is not RES0 
or RES1.


Instructions that move System or Special-purpose registers to or from a general-purpose register
This list of instructions:


° Includes the instructions to transfer a general-purpose register to or from the DTR, which can 
be executed at any Exception level.


° Excludes PSTATE access instructions.


These instructions are:
° MRS <special_reg>, MSR <special_reg>.


Note
This does not include NZCV, DAIF, DAIFSet, DAIFClr, SPSel, and CurrentEL.


° MRS <system_reg>, MSR <system_reg>.
Floating-point moves between a SIMD&FP register and a general-purpose register


These instructions are:


° FMOV (between a general-purpose register and a single-precision register).
° FMOV (between a general-purpose register and a double-precision register).
° FMOV (between a general-purpose register and a SIMD element).


SIMD moves between a SIMD&FP register and a general-purpose register 


These instructions are: 


° INS (from a general-purpose register to a SIMD element). 

° UMOV (from a SIMD element to a general-purpose register). 
Barriers

These instructions are:

° ISB.

° DSB.

° DMB.


Memory access instructions at various access sizes 

The following constraints apply: 

° General-purpose registers only.

° One of the following addressing modes:
— Unscaled (9-bit signed) immediate offset.
— Immediate (9-bit signed) post-indexed.
— Immediate (9-bit signed) pre-indexed.
— Unprivileged (9-bit signed).

° Not literal.

° One of the following types:
— (Single) register.
— Exclusive.
— Exclusive pair.
— Acquire/Release.
— Acquire/Release Exclusive.
— Acquire/Release Exclusive pair.

° 32-bit and 64-bit target register variants.


These instructions are:

° LDR, LDRB, LDRH, LDRSB, LDRSH, LDRSW (immediate, not literal).
° LDUR, LDURB, LDURH, LDURSB, LDURSH, LDURSW (immediate).
° LDTR, LDTRB, LDTRH, LDTRSB, LDTRSH, LDTRSW (immediate).
° LDAR, LDARB, LDARH, LDXR, LDXRB, LDXRH, LDAXR, LDAXRB, LDAXRH.
° LDXP, LDAXP.
° STR, STRB, STRH (immediate).
° STUR, STURB, STURH (immediate).
° STTR, STTRB, STTRH (immediate).
° STLR, STLRB, STLRH, STXR, STXRB, STXRH, STLXR, STLXRB, STLXRH.
° STXP, STLXP.
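
As an illustration of the unscaled 9-bit signed immediate form, the sketch below builds the A64 encoding of a 64-bit LDUR. The function name `encode_ldur_x` is hypothetical; the base opcode 0xF8400000 and the field positions (imm9 in bits [20:12], Rn in bits [9:5], Rt in bits [4:0]) are the standard A64 LDUR encoding, used here as an example of an instruction a debugger might write to EDITR.

```python
def encode_ldur_x(rt, rn, imm9):
    """Encode LDUR Xt, [Xn|SP, #imm9] as a 32-bit A64 instruction word."""
    if not -256 <= imm9 <= 255:
        raise ValueError("offset does not fit a 9-bit signed immediate")
    if not (0 <= rt <= 31 and 0 <= rn <= 31):
        raise ValueError("register number out of range")
    # Two's-complement truncation to 9 bits, placed in bits [20:12].
    return 0xF8400000 | ((imm9 & 0x1FF) << 12) | (rn << 5) | rt
```

For example, `encode_ldur_x(0, 1, -8)` gives the word for `LDUR X0, [X1, #-8]`. Offsets outside -256 to 255 are rejected because no listed addressing mode can express them.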


Move immediate to general-purpose register 


These instructions are: 
° MOVZ, MOVN, MOVK (immediate). 


° MOV (between a general-purpose register and the stack pointer). 
System instructions, Send Event, NOP, and Clear Exclusive


In this context, the System instructions are the Cache maintenance instructions, TLB maintenance 
instructions, and the Address translation instructions.


These instructions are:


° IC.

° DC.

° TLBI.

° AT.

° SEV, SEVL.

° NOP.

° CLREX.


A64 instructions that are CONSTRAINED UNPREDICTABLE in Debug state 


This subsection describes all instructions not listed in either:
° A64 instructions that are changed in Debug state on page H2-4856.
° A64 instructions that are unchanged in Debug state on page H2-4856.


These instructions are CONSTRAINED UNPREDICTABLE in Debug state. In general, the permissible behaviors are: 


° The instruction is UNDEFINED.

° The instruction executes as a NOP.

° If the instruction reads the PC or PSTATE, it uses an UNKNOWN value.

° If the instruction modifies the PC or PSTATE, other than by advancing the PC to the sequentially next 
instruction, it sets DLR_EL0 and DSPSR_EL0 to UNKNOWN values.

° If the instruction is similar to a Debug state instruction, it executes as that Debug state instruction.

° The instruction has the same behavior as in Non-debug state.


The following list shows the permissible behaviors for A64 instructions in Debug state. An instruction might appear 
multiple times in the list, in which case the choice of permissible behaviors is any of those listed. An example of 
this is CCMP.

Exception-generating instructions 


These instructions are: 


° SVC.
° HVC.
° SMC.
° BRK.
° HLT.


These instructions behave in one of the following ways: 
° They are UNDEFINED.

° They execute as a NOP.

° SVC behaves as DCPS1.

° HVC behaves as DCPS2.

° SMC behaves as DCPS3.

° They generate the exception that the instruction would generate in Non-debug state. The 
exception is taken as described in Exceptions in Debug state on page H2-4874.


Note


SMC must not generate a Secure Monitor Call exception from Non-secure state if 
EDSCR.SDD is set to 1.
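
The permitted behaviors for the A64 exception-generating instructions can be tabulated in a few lines. This is a sketch under stated assumptions: the function name, the behavior labels, and the `non_secure` and `sdd` parameters are hypothetical, and the conditions under which DCPS<n> itself is UNDEFINED are not modeled.

```python
# Hypothetical labels for the permitted behaviors listed above.
DCPS_ALIAS = {"SVC": "DCPS1", "HVC": "DCPS2", "SMC": "DCPS3"}
EXCEPTION_GENERATING = ("SVC", "HVC", "SMC", "BRK", "HLT")


def permitted_behaviors(mnemonic, non_secure=False, sdd=0):
    """List the behaviors an implementation may choose for one mnemonic."""
    if mnemonic not in EXCEPTION_GENERATING:
        raise ValueError("not an exception-generating instruction")
    behaviors = ["UNDEFINED", "NOP"]
    if mnemonic in DCPS_ALIAS:
        behaviors.append(DCPS_ALIAS[mnemonic])
    # SMC must not generate a Secure Monitor Call exception from
    # Non-secure state if EDSCR.SDD is set to 1.
    if not (mnemonic == "SMC" and non_secure and sdd == 1):
        behaviors.append("EXCEPTION")
    return behaviors
```

Because the choice among these behaviors is CONSTRAINED UNPREDICTABLE, a debugger should treat the returned list as everything it must tolerate, not as something it can select from.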
Instructions that explicitly write to the PC (branches)


These instructions are:
° B, B.cond, BL, BLR, BR, CBZ, CBNZ, RET, TBZ, TBNZ.


These instructions behave in one of the following ways:

° They are UNDEFINED.

° They execute as a NOP.

° They execute as in Non-debug state without branching and set DSPSR_EL0 and DLR_EL0 
to UNKNOWN values.


Exception return and related instructions


These instructions are: 
° ERET.


These instructions behave in one of the following ways:
° They are UNDEFINED.
° They execute as a NOP.


° They execute as in Non-debug state without branching. They set DSPSR_EL0 and DLR_EL0 
to UNKNOWN values, and either:


— Execute the DRPS operation instead of performing an exception return, using 
UNKNOWN SPSR values.


— Do not change the Exception level.


Instructions that explicitly modify PSTATE, other than DCPS and DRPS 
These instructions are: 


° MSR DAIFSet (immediate), MSR DAIFClr (immediate), MSR SPSel (immediate).
° MSR NZCV (register), MSR DAIF (register), MSR SPSel (register).


These instructions behave in one of the following ways:


° They are UNDEFINED.

° They execute as a NOP.

° They execute as in Non-debug state, setting DSPSR_EL0 and DLR_EL0 to UNKNOWN 
values.


Instructions that request entry to a low-power state 


These instructions are: 


° WFE, WFI. 

These instructions behave in one of the following ways:

° They are UNDEFINED.

° They execute as a NOP.

° They generate a synchronous exception if the corresponding instruction would be trapped in 
Non-debug state. See Configurable instruction enables and disables, and trap controls on 
page D1-1562.

° A WFE instruction clears the Event register if it is set.

Note


This means that these instructions must not suspend execution.


Instructions that read the PC


These instructions are: 

° LDR (literal), LDRSW (literal).
° ADR, ADRP.
° PRFM (literal).


These instructions behave in one of the following ways: 


° They are UNDEFINED. 
° They execute as a NOP. 
° They execute as in Non-debug state, using an UNKNOWN value for the PC operand. 


Instructions that read the PSTATE.{N, Z, C, V} or other PSTATE fields 


These instructions are: 

° CSEL, CSINC, CSINV, CSNEG, CCMN, CCMP, FCSEL, FCCMP, FCCMPE.
° ADC, ADCS, SBC, SBCS.
° MRS NZCV, MRS DAIF, MRS SPSel, MRS CurrentEL.


These instructions behave in one of the following ways: 


° They are UNDEFINED. 
° They execute as a NOP. 
° They execute as in Non-debug state: 


— For the conditional operations and those using the PSTATE.C flag as an input, these 
instructions use an UNKNOWN value for the condition flag. 


— For the MRS instruction, they return an UNKNOWN value. 


Instructions that explicitly modify PSTATE.{N, Z, C, V, Q, GE}


These instructions are: 

° ADDS, SUBS, ADCS, SBCS, ANDS, BICS, CCMN, CCMP. 

° FCMP, FCMPE, FCCMP, FCCMPE. 

° MSR NZCV (register). 

These instructions behave in one of the following ways: 
° They are UNDEFINED.

° They execute as a NOP.

° They execute as in Non-debug state, setting DSPSR_EL0 and DLR_EL0 to UNKNOWN 
values.


All other instructions 


These instructions behave in one of the following ways: 


° They are UNDEFINED. 

° They execute as a NOP. 

° They have the same behavior as in Non-debug state.

Note


This includes instructions defined as UNPREDICTABLE or CONSTRAINED UNPREDICTABLE in 
Non-debug state. These instructions are UNPREDICTABLE or CONSTRAINED UNPREDICTABLE in 
Debug state.


Executing T32 instructions in Debug state


The following sections describe the behavior of the T32 instructions in Debug state: 

° T32 instructions that are changed in Debug state. 

° T32 instructions that are unchanged in Debug state. 

° T32 instructions that are CONSTRAINED UNPREDICTABLE in Debug state on page H2-4863. 


T32 instructions that are changed in Debug state 


The following T32 instructions are defined in Debug state, but are UNDEFINED in Non-debug state: 
° DCPS.


Note
DCPS can be UNDEFINED in certain conditions in Debug state. See DCPS<n> on page H2-4870.


° MRC p15,3,<Rt>,c4,c5,0 (DSPSR).
° MCR p15,3,<Rt>,c4,c5,0 (DSPSR).
° MRC p15,3,<Rt>,c4,c5,1 (DLR).
° MCR p15,3,<Rt>,c4,c5,1 (DLR).


In addition, ERET executes the DRPS operation in Debug state. 


For more information see Debug state operations, DCPS, DRPS, MRS, MSR on page H2-4870. 


T32 instructions that are unchanged in Debug state 


The following list shows the instructions that are unchanged in Debug state. Any T32 instruction that uses the PC 
or APSR.{N, Z, C, V} as the source or destination register is not included in the list. Moreover, the list only includes 
the 32-bit T32 encodings.

Any instruction that is UNDEFINED in Non-debug state 

The list of instructions:

° Excludes any instruction listed in T32 instructions that are changed in Debug state.

° Excludes any instruction listed in T32 instructions that are CONSTRAINED 
UNPREDICTABLE in Debug state on page H2-4863 that is UNDEFINED because an enable or 
disable bit is not RES0 or RES1.

Instructions that move System or Special-purpose registers to or from a general-purpose register 

The list of instructions: 


° Includes the instructions to transfer a general-purpose register to or from the DTR, which can 
be executed at any Exception level. 


° Excludes APSR and CPSR access instructions. 

° Excludes instructions for accessing banked registers for the current mode.


These instructions are:

° MRS <banked_reg>, MSR <banked_reg>.


Note


This does not apply to cases which are UNPREDICTABLE or CONSTRAINED UNPREDICTABLE in 
Non-debug state in the current mode.


° MRC, MCR.


Note


This includes all allocated System registers in the (coproc==0b111x) encoding space other 
than an MRC move to APSR_nzcv.


° MRS SPSR, MSR SPSR.
° VMRS <vfp_system_reg>, VMSR <vfp_system_reg>.


Note


This includes all allocated Advanced SIMD and floating-point System registers, other than 
a VMRS move to APSR_nzcv.





Floating-point moves between a SIMD&FP register and a general-purpose register 
These instructions are: 
° VMOV (between a general-purpose register and a single-precision register). 


° VMOV (between a general-purpose register and a doubleword floating-point register). 


SIMD moves between a SIMD&FP register and a general-purpose register 


These instructions are: 


° VMOV (between a general-purpose register and a scalar). 
Barriers

These instructions are:

° ISB.

° DSB.

° DMB.


Memory access instructions at various access sizes 

The following constraints apply: 

° General-purpose registers only.

° One of the following addressing modes:
— Immediate (8-bit or 12-bit) offset.
— Immediate (8-bit) post-indexed.
— Immediate (8-bit) pre-indexed.
— Unprivileged (8-bit).

° Not literal.

° One of the following types:
— (Single) register.
— Dual.
— Exclusive.
— Exclusive doubleword.
— Acquire/Release.
— Acquire/Release Exclusive.
— Acquire/Release Exclusive doubleword.


These instructions are:
° LDR.W, LDRB.W, LDRH.W, LDRD, LDRSB.W, LDRSH.W (immediate, not literal).
° LDRT, LDRBT, LDRHT, LDRSBT, LDRSHT (immediate).
° LDREX, LDREXB, LDREXH, LDA, LDAB, LDAH, LDAEX, LDAEXB, LDAEXH.
° LDREXD, LDAEXD.
° STR.W, STRB.W, STRH.W, STRD (immediate).
° STRT, STRBT, STRHT (immediate).
° STREX, STREXB, STREXH, STL, STLB, STLH, STLEX, STLEXB, STLEXH.
° STREXD, STLEXD.





Move to general-purpose register 


These instructions are: 
° MOVW, MOVT (immediate). 
System instructions, Send Event, NOP, and Clear Exclusive


The System instructions are Cache maintenance instructions, TLB maintenance instructions, and 
Address translation instructions. These are encoded in the (coproc==0b1111) System register 
encoding space.


These instructions are:
° ICIALLU, ICIALLUIS, ICIMVAU.
° DCCIMVAC, DCCISW, DCCMVAC, DCCMVAU, DCCSW, DCIMVAC, DCISW.


° TLBIALL, TLBIALLH, TLBIALLHIS, TLBIALLIS, TLBIALLNSNH, 
TLBIALLNSNHIS, TLBIASID, TLBIASIDIS, TLBIIPAS2, TLBIIPAS2IS, TLBIIPAS2L, 
TLBIIPAS2LIS, TLBIMVA, TLBIMVAA, TLBIMVAAIS, TLBIMVAAL, TLBIMVAALIS, 
TLBIMVAH, TLBIMVAHIS, TLBIMVAIS, TLBIMVAL, TLBIMVALH, TLBIMVALHIS, 
TLBIMVALIS.


° ATS12NSOPR, ATS12NSOPW, ATS12NSOUR, ATS12NSOUW, ATS1CPR, ATS1CPW, 
ATS1CUR, ATS1CUW, ATS1HR, ATS1HW.


° BPIALL, BPIALLIS, BPIMVA.
° SEV.W, SEVL.W.

° NOP.W.

° CLREX.


T32 instructions that are CONSTRAINED UNPREDICTABLE in Debug state 


This subsection describes all instructions not listed in either:
° T32 instructions that are changed in Debug state on page H2-4861.
° T32 instructions that are unchanged in Debug state on page H2-4861.


These instructions are CONSTRAINED UNPREDICTABLE in Debug state. In general, the permissible behaviors are: 


° The instruction generates an Undefined Instruction exception.

° The instruction executes as a NOP.

° If the instruction reads the PC or PSTATE, it uses an UNKNOWN value.

° If the instruction modifies the PC or PSTATE, other than by advancing the PC to the sequentially next 
instruction, it sets DLR and DSPSR to UNKNOWN values.

° If the instruction is similar to a Debug state instruction, it executes as that Debug state instruction.

° The instruction has the same behavior as in Non-debug state.


The following list shows the permissible behaviors for T32 instructions in Debug state. An instruction might appear 
multiple times in the list, in which case the choice of permissible behaviors is any of those listed.


Exception-generating instructions 


These instructions are: 


° SVC.
° HVC.
° SMC.
° UDF.
° BKPT.
° HLT.


These instructions behave in one of the following ways:
° They are UNDEFINED.
° They execute as a NOP.
° SVC behaves as DCPS1.
° HVC behaves as DCPS2.
° SMC behaves as DCPS3.


° They generate the exception the instruction would generate in Non-debug state. The 
exception is taken as described in Exceptions in Debug state on page H2-4874.


Note


SMC must not generate a Secure Monitor Call exception from Non-secure state if 
EDSCR.SDD is set to 1.





Instructions that explicitly write to the PC (branches) 


These instructions are: 
° B, B (conditional), CBZ, CBNZ, BL.
° BX, BLX (register or immediate).
° BXJ, TBB, TBH.
° MOV pc and related instructions.
° LDR pc, LDM (with a register list that includes the PC), POP (with a register list that includes the PC).


These instructions behave in one of the following ways: 

° They are UNDEFINED. 

° They execute as a NOP. 

° They execute as in Non-debug state without branching and set DSPSR and DLR to 
UNKNOWN values.


Exception return and related instructions, other than ERET


These instructions are:
° SRS, RFE, SUBS pc, lr, and related instructions.


These instructions behave in one of the following ways:
° They are UNDEFINED.
° They execute as a NOP.


° They execute as in Non-debug state without branching, setting DSPSR_EL0 and DLR_EL0 
to UNKNOWN values, and either:


— Execute the DRPS operation instead of performing an exception return, using 
UNKNOWN SPSR values.


— Do not change the Exception level or PE mode.


Instructions that explicitly modify PSTATE, other than DCPS and ERET 


These instructions are: 
° CPS, SETEND, IT.
° MSR APSR, CPSR (register or immediate).


These instructions behave in one of the following ways:
° They are UNDEFINED.
° They execute as a NOP.
° They execute as in Non-debug state, setting DSPSR_EL0 and DLR_EL0 to UNKNOWN 
values.


Instructions that request entry to a low-power state


These instructions are: 
° WFE, WFI.


These instructions behave in one of the following ways:


° They are UNDEFINED.
° They execute as a NOP.
° They generate a synchronous exception if the corresponding instruction would be trapped in 
Non-debug state. See Configurable instruction enables and disables, and trap controls on 
page G1-3885.


° A WFE instruction is permitted to clear the Event register if it is set.


Note


This means that these instructions must not suspend execution.





Instructions that read the PC


These instructions are:
° LDR (literal), LDRB (literal), LDRH (literal), LDRSB (literal), LDRSH (literal).
° ADR, ADRL, ADRH.
° PLD (literal), PLI (literal).


These instructions behave in one of the following ways: 


° They are UNDEFINED. 
° They execute as a NOP. 
° They execute as in Non-debug state using an UNKNOWN value for the PC operand. 


Instructions that read PSTATE.{N, Z, C, V} or other PSTATE fields 


These instructions are: 


° SEL, VSEL.
° ADC, SBC, all instructions with an RRX shift.
° MRS CPSR.


These instructions behave in one of the following ways: 


° They are UNDEFINED. 
° They execute as a NOP. 
° They execute as in Non-debug state: 


— For the conditional operations and those using the PSTATE.C flag as an input, these 
instructions use an UNKNOWN value for the condition flag. 


— For the MRS instruction, they return an UNKNOWN value.


Instructions that explicitly modify PSTATE.{N, Z, C, V, Q, GE} 


These instructions are: 

° CMP, TST, TEQ, CMN.

° <opc>S.

° MRC p14,0,APSR_nzcv,c0,c1,0 (DBGDSCRint).

° MSR CPSR, MSR APSR (register or immediate).

° VMRS APSR_nzcv, FPSCR.

° QADD, QDADD, QSUB, QDSUB.

° SMLABB, SMLABT, SMLATB, SMLATT, SMLAD, SMLAWB, SMLAWT, SMLSD, SMUAD.
° SSAT, SSAT16, USAT, USAT16.

° SADD, SADD8, SADD16, SASX, SSAX, SSUB, SSUB8, SSUB16.
° UADD, UADD8, UADD16, UASX, USAX, USUB, USUB8, USUB16.


These instructions behave in one of the following ways: 





° They are UNDEFINED.
° They execute as a NOP.
° They execute as in Non-debug state, setting DSPSR_EL0 and DLR_EL0 to UNKNOWN 
values.


All other instructions


These instructions behave in one of the following ways: 


° They are UNDEFINED. 

° They execute as a NOP. 

° They have the same behavior as in Non-debug state.

Note


This includes instructions defined as UNPREDICTABLE or CONSTRAINED UNPREDICTABLE in 
Non-debug state. These instructions are CONSTRAINED UNPREDICTABLE in Debug state. This 
includes some T32 instructions that specify R15 as a destination or source register, such as: 


MOV.W R15, #<uimm16>
LDREX R15, [Rn] 


Appendix K1 Architectural Constraints on UNPREDICTABLE behaviors describes the 
CONSTRAINED UNPREDICTABLE behavior for these instructions. In Debug state these CONSTRAINED 
UNPREDICTABLE choices are further restricted: 


° Instructions that specify R15 as a destination register:


— Are not permitted to branch, because the architecture does not define a branch 
operation in Debug state. 


— Might set DLR and DSPSR to UNKNOWN values. 
— Might have any of the other permitted behaviors. 


° Instructions that specify R15 as a source operand: 
— Cannot use PC + offset, because there is no architecturally-defined PC in Debug state. 


— Might have any of the other permitted behaviors, including using an UNKNOWN value. 





H2.4.3 Decode tables 


The syntax in the tables is defined as follows: 
1 The bit has a fixed value of 1. 
0 The bit has a fixed value of 0. 


!= The field has any value other than the value or values specified. The field might be an encoding field 
in the instruction whose value is supplied by the debugger. 


Note 


The instruction encodings in Chapter C6 A64 Base Instruction Descriptions and Chapter F5 T32 and A32 Base 
Instruction Set Instruction Descriptions might show these bits as (0) or (1). A debugger must set these bits to 0 or 
1, as appropriate. 








Any other value indicates an encoding field in the instruction whose value is supplied by the debugger. Some values 
might be reserved or UNDEFINED, in which case the instruction is UNDEFINED or CONSTRAINED UNPREDICTABLE in 
Debug state, as it is in Non-debug state. 


For more information about the instruction encodings, see:
° Chapter C6 A64 Base Instruction Descriptions.
° Chapter F5 T32 and A32 Base Instruction Set Instruction Descriptions.


For information about the syntax used in Table H2-2 on page H2-4867, Table H2-3 on page H2-4867, Table H2-4 
on page H2-4868, and Table H2-5 on page H2-4869, see: 


° Common syntax terms on page C1-123. 
° Assembler symbols on page F2-2404. 
Table H2-2 shows the A64 instructions that are modified in Debug state. For details of how these are packed in 
EDITR, see EDITR, External Debug Instruction Transfer Register on page H9-5036.


Table H2-2 Modified A64 instructions in Debug state

Bits [31:0]                                   Description
1101 0100 1010 0000 0000 0000 0000 00 !=00    DCPS<opt>
1101 0101 00 L 1 1011 0100 0101 000 Rt        MRS|MSR accessing DSPSR_EL0
1101 0101 00 L 1 1011 0100 0101 001 Rt        MRS|MSR accessing DLR_EL0
1101 0110 1011 1111 0000 0011 1110 0000       DRPS

In this table, L is bit[21], Rt is bits[4:0], and the != field of DCPS<opt> is bits[1:0].
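
The rows of Table H2-2 can be checked mechanically with mask-and-compare tests. The sketch below is illustrative: the function name `decode_changed_a64` and its return strings are assumptions, while the constants follow directly from the fixed bits in the table.

```python
L_BIT = 1 << 21   # MRS when set, MSR when clear
RT_MASK = 0x1F    # Rt in bits [4:0]


def decode_changed_a64(word):
    """Return a description if word matches a Table H2-2 row, else None."""
    if (word & 0xFFFFFFFC) == 0xD4A00000 and (word & 0x3) != 0:
        return "DCPS%d" % (word & 0x3)
    if word == 0xD6BF03E0:
        return "DRPS"
    # Clear the variable fields (L and Rt) before comparing fixed bits.
    fixed = word & ~(L_BIT | RT_MASK) & 0xFFFFFFFF
    if fixed == 0xD51B4500:
        name = "DSPSR_EL0"
    elif fixed == 0xD51B4520:
        name = "DLR_EL0"
    else:
        return None
    op = "MRS" if word & L_BIT else "MSR"
    return "%s %s, Rt=%d" % (op, name, word & RT_MASK)
```

For instance, 0xD53B4520 decodes as an MRS of DLR_EL0 into register 0, and 0xD4A00000 is rejected because the DCPS row requires bits[1:0] to be nonzero.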











Table H2-3 shows the T32 instructions that are modified in Debug state, with the first halfword on the left side and 
the second halfword on the right side. For details of how these are packed in EDITR, see EDITR, External Debug 
Instruction Transfer Register on page H9-5036.


Table H2-3 Modified T32 instructions in Debug state

First halfword [15:0]    Second halfword [15:0]     Description
1110 1110 011 L 0100     !=1111 1111 0001 0101      MRC|MCR accessing DSPSR
1110 1110 011 L 0100     !=1111 1111 0011 0101      MRC|MCR accessing DLR
1111 0011 1101 1110      1000 1111 0000 0000        ERET
1111 0111 1000 1111      1000 0000 0000 00 !=00     DCPS<opt>
Table H2-4 lists the A64 instructions that are unchanged in Debug state.


Table H2-4 A64 instructions that are unchanged in Debug state

Bits [31:0]                                      Description
sf 001 0001 0000 0000 0000 00 11111 Rd           MOV <Rn>,SP
sf 001 0001 0000 0000 0000 00 Rn 11111           MOV SP,<Rn>
sf !=01 1 0010 1 hw imm16 Rd                     MOVN, MOVK, MOVZ
1101 0101 0000 0011 0010 0000 0001 1111          NOP
1101 0101 0000 0011 0010 0000 10 L 1 1111        SEV, SEVL
1101 0101 0000 0011 0011 0000 0101 1111          CLREX
1101 0101 0000 0011 0011 option 1 !=11 1 1111    DSB, DMB, ISB
1101 0101 0000 1 op1 CRn CRm op2 Rt              IC, DC, TLBI, AT
1101 0101 00 L 1 0 op1 CRn CRm op2 Rt            MRS|MSR accessing System register
1101 0101 00 L 1 1 op1 !=0100 CRm op2 Rt         MRS|MSR accessing System register
1101 0101 00 L 1 1 op1 0100 !=0010 op2 Rt        MRS|MSR accessing Special-purpose register
size 00 1000 o2 L 0 Rs o0 Rt2 Rn Rt              LD(A|X|AX)R{B|H}, ST(L|X|LX)R{B|H}
size 00 1000 o2 L 1 Rs o0 Rt2 Rn Rt              LD{A}XP, ST{L}XP
!=11 11 1000 opc 0 imm9 00 Rn Rt                 LDUR{B|H|SB|SH|SW}, STUR{B|H}
11 11 1000 !=10 0 imm9 00 Rn Rt                  LDUR, STUR
size 11 1000 opc 0 imm9 10 Rn Rt                 LDTR{B|H|SB|SH|SW}, STTR{B|H}
size 11 1000 opc 0 imm9 P 1 Rn Rt                LDR{B|H|SB|SH|SW}, STR{B|H}
0100 1110 000 imm5 0001 11 Rn Rd                 INS <Vd>.<Ts>[<index>],<R><n>
0 Q 00 1110 000 imm5 0011 11 Rn Rd               UMOV <R><d>,<Vn>.<Ts>[<index>]
0001 1110 0010 011 op 0000 00 Rn Rd              FMOV <Sd>,<Wn>, FMOV <Wd>,<Sn>
1001 1110 0110 011 op 0000 00 Rn Rd              FMOV <Dd>,<Xn>, FMOV <Xd>,<Dn>
1001 1110 1010 111 op 0000 00 Rn Rd              FMOV <Vd>.D[1],<Xn>, FMOV <Xd>,<Vn>.D[1]
Table H2-5 lists the T32 instructions that are unchanged in Debug state. It shows the T32 instructions with the first
halfword on the left side and the second halfword on the right side.


Table H2-5 T32 instructions that are unchanged in Debug state

First halfword [15:0]                  Second halfword [15:0]         Description
1110 1100 010 op !=1111                !=1111 1011 00 M 1 Vm          VMOV <Dm>,<Rt>,<Rt2>, VMOV <Rt>,<Rt2>,<Dm>
1110 1110 000 op Vn                    !=1111 1010 N 001 0000         VMOV <Sn>,<Rt>, VMOV <Rt>,<Sn>
1110 1110 0 opc 0 Vd                   !=1111 1011 D opc2 1 0000      VMOV.<size> <Dd>[<x>],<Rt>
1110 1110 U opc 1 Vn                   !=1111 1011 D opc2 1 0000      VMOV.<dt> <Rt>,<Dd>[<x>]
1110 1110 111 op reg                   !=1111 1010 0001 0000          VMRS, VMSR
1110 1100 010 op !=1111                !=1111 111 cp opc1 CRm         MCRR|MRRC accessing System registers
1110 1110 opc1 op CRn                  !=1111 111 cp opc2 1 CRm       MCR|MRC accessing System registers
1110 1000 010 L !=1111                 !=1111 Rd imm8                 LDREX, STREX
1110 1000 110 L !=1111                 !=1111 Rt2 01 !=10 Rd          LDREX(B|H|D), STREX(B|H|D)
1110 1000 110 L !=1111                 !=1111 Rt2 1 op3 Rd            LDA{EX}{B|H|D}, STL{EX}{B|H|D}
1110 100 (!=0x10, !=xx0x) L !=1111     !=1111 !=1111 imm8             LDRD, STRD
1111 0 i 10 T 100 imm4                 0 imm3 !=1111 imm8             MOVW, MOVT
1111 0011 100 R !=1111                 1000 M1 001 M 0000             MSR <spec_reg><mode>,<Rn>
1111 0011 1001 !=1111                  1000 1111 0000 0000            MSR SPSR,<Rn>
1111 0011 1010 1111                    1000 0000 0000 0000            NOP.W
1111 0011 1010 1111                    1000 0000 0000 010L            SEV.W, SEVL.W
1111 0011 1011 1111                    1000 1111 0010 1111            CLREX
1111 0011 1011 1111                    1000 1111 01 !=11 option       DSB, DMB, ISB
1111 0011 111 R M1                     1000 !=1111 001 M 0000         MRS <Rd>,<spec_reg><mode>
1111 0011 1111 1111                    1000 !=1111 0000 0000          MRS <Rd>,SPSR
1111 1000 1 !=11 0 !=1111              !=1111 imm12                   STR{B|H}.W (12-bit immediate)
1111 1000 0 !=11 0 !=1111              !=1111 1 !=000 imm8            STR{B|H}{T} (8-bit immediate)
1111 100 S 1 !=11 1 !=1111             !=1111 imm12                   LDR{SB|SH|B|H}.W (12-bit immediate)
1111 100 S 0 !=11 1 !=1111             !=1111 1 !=000 imm8            LDR{SB|SH|B|H}{T} (8-bit immediate)


H2.4.4 Security in Debug state 


If EL3 is implemented or the implemented Security state is Secure state, security in Debug state is governed by the 
Secure debug disabled flag, EDSCR.SDD. 


On entry to Debug state 


If entering in Secure state, EDSCR.SDD is set to 0. Otherwise EDSCR.SDD is set to the inverse of
ExternalSecureInvasiveDebugEnabled(). That is:


° If ExternalSecureInvasiveDebugEnabled() == TRUE, EDSCR.SDD is set to 0.
° If ExternalSecureInvasiveDebugEnabled() == FALSE, EDSCR.SDD is set to 1.
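
The rule for EDSCR.SDD on entry to Debug state can be sketched as a small function. This is an illustrative model only: the function name and its boolean parameters are hypothetical, and the second parameter stands in for the result of the ExternalSecureInvasiveDebugEnabled() pseudocode function.

```python
def sdd_on_debug_entry(entering_secure, secure_invasive_debug_enabled):
    """Model the value written to EDSCR.SDD on entry to Debug state.

    entering_secure: True if the PE enters Debug state in Secure state.
    secure_invasive_debug_enabled: assumed result of
        ExternalSecureInvasiveDebugEnabled() at the time of entry.
    """
    if entering_secure:
        return 0
    # Otherwise SDD is the inverse of the authentication result.
    return 0 if secure_invasive_debug_enabled else 1
```

The sketch does not model the period, described in the following Note, during which the authentication signals have changed but no Context synchronization event has occurred.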


— Note 


Normally, if ExternalSecureInvasiveDebugEnabled() == FALSE then halting is prohibited and it is
not possible to enter Debug state from Secure state. However, because changes to the authentication
signals require a Context synchronization event to guarantee their effect, there is a period during
which the PE might halt even though the authentication signals prohibit halting.


In Debug state


The value of EDSCR.SDD does not change, even if ExternalSecureInvasiveDebugEnabled()
changes.


— Note

° DBGAUTHSTATUS_EL1.{SNID, SID, NSNID, NSID} are not frozen in Debug state.

° If EDSCR.SDD is set to 1 in Debug state, then there is no means to enter Secure state from
Non-secure state. In this case it is impossible for the PE to be in Secure state. This is a general
principle of behavior in Debug state.





In Non-debug state 


EDSCR.SDD returns the inverse of ExternalSecureInvasiveDebugEnabled(). If the authentication
signals that control ExternalSecureInvasiveDebugEnabled() change, a Context synchronization event
is required to guarantee their effect.


— Note 

° In Non-debug state, EDSCR.SDD is unaffected by the Security state of the PE. 

° A Context synchronization event is also required to guarantee that changes in the 
authentication signals are visible in DBGAUTHSTATUS_EL1.{SNID, SID, NSNID, 
NSID}. 





If EL3 is not implemented and the implemented Security state is Non-secure state, EDSCR.SDD is RES1.


H2.4.5 Privilege in Debug state 


The only additional privileges offered to Debug state are:
° The privilege to execute Debug state operations, DCPS, DRPS, MRS, MSR.


° The privilege to execute DTR access instructions regardless of the Exception level and traps.


The DTR access instructions can be executed at any Exception level, including EL0, regardless of any control
register settings that might force these instructions to be UNDEFINED or trapped in Non-debug state. These
instructions are:


° The MRS and MSR instructions that access DBGDTR_EL0, DBGDTRTX_EL0, and DBGDTRRX_EL0 in
AArch64 state.


° The MRC and MCR instructions that access DBGDTRTXint and DBGDTRRXint in AArch32 state.


All other instructions operate with the privilege determined by the current Exception level and Security state. This
applies to all Special-purpose and System register accesses, memory accesses, and UNDEFINED instructions, and
includes generating exceptions when the System registers trap or disable an instruction.


H2.4.6 Debug state operations, DCPS, DRPS, MRS, MSR 


ARMv8 defines operations to change between Exception levels in Debug state. These operations can also change
the mode at the current Exception level.
DCPS<n> 


Executing a DCPS<n> instruction in Debug state moves the PE to a higher Exception level or to a specific mode at 
the current Exception level. 


If the DCPS<n> instruction is executed in AArch32 state and the target Exception level is using AArch64: 
. The current instruction set switches from T32 to A64. 


. The effect on registers that are not visible or only partially visible in AArch32 state is the same as for system 
calls in Non-debug state. See Execution state on page D1-1501. 







Otherwise, the instruction set state does not change. 


If the target Exception level is the same as the current Exception level, then the PE does not change Exception level. 
However, the PE might change mode. 


The effect on endianness is the same as for exceptions and exception returns in Non-debug state: 


° In AArch64 state the current endianness is determined by the value of SCTLR_ELx.EE for the target 
Exception level. 


° In AArch32 state the current endianness is determined by the value of SCTLR.EE or HSCTLR.EE for the 
target Exception level. 


The DCPS<n> instructions are: 


In AArch64 state 
. DCPS1 
. DCPS2 
. DCPS3 


In AArch32 state, in the T32 instruction set only 


° DCPS1 variant 
° DCPS2 variant 
° DCPS3 variant 


The DCPS instructions are UNDEFINED in Non-debug state. 


Table H2-6 shows the target of the instruction. In Table H2-6 the entries have the following meaning: 


EL1h/Svc This means that the target is:
° EL1h if EL1 is using AArch64.
° EL1 and Supervisor mode if EL1 is using AArch32.


EL2h/Hyp This means that the target is:
° EL2h if EL2 is using AArch64.
° EL2 and Hyp mode if EL2 is using AArch32.


EL3h/Monitor This means that the target is:
° EL3h if EL3 is using AArch64.
° EL3 and Monitor mode if EL3 is using AArch32.


Svc Secure Supervisor mode, in EL3 using AArch32.


Monitor Secure Monitor mode, in EL3 using AArch32.


Table H2-6 Target for DCPS instructions in Debug state 














Instruction   Target when DCPS instruction executed at stated Exception level:
              EL0            EL1            EL2            EL3 (AArch64)   EL3 (AArch32)
DCPS1         EL1h/Svc       EL1h/Svc       EL2h/Hyp       EL3h            Svc, clears SCR.NS to 0
DCPS2         EL2h/Hyp       EL2h/Hyp       EL2h/Hyp       EL3h            UNDEFINED
DCPS3         EL3h/Monitor   EL3h/Monitor   EL3h/Monitor   EL3h            Monitor, clears SCR.NS to 0





In AArch32 Monitor mode, DCPS1 and DCPS3 clear SCR.NS to 0. 


Note
In AArch64 state, at EL3, DCPS<n> does not change SCR_EL3.NS.








However:
° DCPS1 is UNDEFINED at EL0 in Non-secure state if either:
—  EL2 is implemented and using AArch64 and HCR_EL2.TGE == 1.
—  EL2 is implemented and using AArch32 and HCR.TGE == 1.
° DCPS2 is UNDEFINED at all Exception levels if EL2 is not implemented.
° DCPS2 is UNDEFINED at the following Exception levels if EL2 is implemented:
—  At EL0 and EL1 in Secure state.
—  At EL3 if EL3 is using AArch32.
° DCPS3 is UNDEFINED at all Exception levels if either:
—  EDSCR.SDD == 1.
—  EL3 is not implemented.


Note
The references to DCPS1, DCPS2, and DCPS3 in this section link to the descriptions of the instructions in the A64
instruction set. The DCPS<n> instructions are also defined in the T32 instruction set, see DCPS1, DCPS2, DCPS3 on
page F5-2657. These instructions are not defined in the A32 instruction set, because A32 instructions cannot be
executed in Debug state.
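
The combination of Table H2-6 and the UNDEFINED conditions listed above can be sketched as a lookup. This is a simplified illustration, not the DCPSInstruction() pseudocode: the function name, the keyword parameters, and the string results are hypothetical, and the side effect on SCR.NS is not modeled.

```python
UNDEFINED = "UNDEFINED"

# Columns: EL0, EL1, EL2, EL3 (AArch64), EL3 (AArch32), as in Table H2-6.
TARGETS = {
    1: ("EL1h/Svc", "EL1h/Svc", "EL2h/Hyp", "EL3h", "Svc"),
    2: ("EL2h/Hyp", "EL2h/Hyp", "EL2h/Hyp", "EL3h", UNDEFINED),
    3: ("EL3h/Monitor", "EL3h/Monitor", "EL3h/Monitor", "EL3h", "Monitor"),
}


def dcps_target(n, el, *, secure=False, el3_aarch32=False,
                have_el2=True, have_el3=True, tge=False, sdd=0):
    """Return the target of DCPS<n> executed at Exception level el (0 to 3)."""
    # DCPS1: UNDEFINED at Non-secure EL0 when EL2 is implemented and TGE == 1.
    if n == 1 and el == 0 and not secure and have_el2 and tge:
        return UNDEFINED
    if n == 2:
        if not have_el2:
            return UNDEFINED
        if secure and el in (0, 1):
            return UNDEFINED
        if el == 3 and el3_aarch32:
            return UNDEFINED
    # DCPS3: UNDEFINED everywhere if EDSCR.SDD == 1 or EL3 is absent.
    if n == 3 and (sdd == 1 or not have_el3):
        return UNDEFINED
    column = el if el < 3 else (4 if el3_aarch32 else 3)
    return TARGETS[n][column]
```

For example, `dcps_target(3, 2, sdd=1)` returns `"UNDEFINED"`, which reflects that a debugger cannot move to EL3 when Secure debug is disabled.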





On executing a DCPS instruction: 


° If the target Exception level is using AArch64: 
—  ELR_ELx of the target Exception level becomes UNKNOWN. 
—  SPSR_ELx of the target Exception level becomes UNKNOWN. 
—  ESR_ELx of the target Exception level becomes UNKNOWN. 
— DLR_EL0 and DSPSR_EL0 become UNKNOWN.


° If the target Exception level is using AArch32, DLR and DSPSR become UNKNOWN and:
— If the target Exception level is EL1 or EL3, the LR and SPSR of the target mode become UNKNOWN. 
— If the target Exception level is EL2, then ELR_hyp, SPSR_hyp, and HSR become UNKNOWN. 


If the target Exception level is using AArch32, and the target Exception level is EL1 or EL3, the LR and SPSR of 
the target mode become UNKNOWN. 


The DCPSInstruction() function is described in Chapter J1 ARMv8 Pseudocode. 


DRPS 


Executing the DRPS operation in Debug state moves the PE to a lower Exception level, or to another PE mode at 
the current Exception level, by copying the current SPSR to PSTATE. 


If DRPS is executed in AArch64 state and the target Exception level is using AArch32: 
° The current instruction set switches from A64 to T32. 


. The effect on registers that are not visible or only partially visible in AArch32 state is the same as for 
exception returns in Non-debug state. See Execution state on page D1-1501. 


Otherwise the instruction set state does not change. 


If the target Exception level is the same as the current Exception level, then the PE does not change Exception level. 
However, the PE might change mode. 
The effect on endianness is the same as for exceptions and exception returns in Non-debug state:


° If targeting an Exception level using AArch64, current endianness is set according to SCTLR_ELx.EE, or
SCTLR_EL1.E0E for the target Exception level.


° If targeting an Exception level using AArch32, current endianness is set by SPSR.E as appropriate.


The DRPS instructions are:


In AArch64 state
° DRPS


In AArch32 state, in the T32 instruction set only
° ERET


If the SPSR specifies an illegal exception return, then PSTATE.{M, nRW, EL, SP} are unchanged and PSTATE.IL
is set to 1. For further information on illegal exception returns, see Illegal return events from AArch64 state on
page D1-1537.


PSTATE.{N, Z, C, V, Q, GE, IT, T, SS, D, A, I, F} are ignored in Debug state. This means that the effect of the DRPS
operation on these fields is to set them to an UNKNOWN value that might be the value from the SPSR. For more
information see Process state (PSTATE) in Debug state on page H2-4855.


All other PSTATE fields are copied from SPSR.


DRPS is UNDEFINED at EL0 and in Non-debug state.


Note 


Unlike an exception return, the DRPS operation has no architecturally-defined effect on the Event Register and 
exclusive monitors. DRPS might set the Event Register or clear the exclusive monitors, or both, but this is not a 
requirement and debuggers must not rely on any implementation specific behavior. 








On executing a DRPS instruction: 


° If the target Exception level is using AArch64: 
—  DLR_EL0 and DSPSR_EL0 become UNKNOWN.


° If the target Exception level is using AArch32: 
— DLR and DSPSR become UNKNOWN. 


The DRPSInstruction() function is described in Chapter J1 ARMv8 Pseudocode. 


MRS and MSR


The other Debug state instructions are used to read or write DLR_EL0 and DSPSR_EL0.
These instructions are:


In AArch64 state
° MRS
° MSR (register)

MRS <Xt>, DLR_EL0 ; Copy DLR_EL0 to <Xt>
MRS <Xt>, DSPSR_EL0 ; Copy DSPSR_EL0 to <Xt>
MSR DLR_EL0, <Xt> ; Copy <Xt> to DLR_EL0
MSR DSPSR_EL0, <Xt> ; Copy <Xt> to DSPSR_EL0


In AArch32 state
° MRC
° MCR


These instructions can be executed at any Exception level when in Debug state, including EL0. They are UNDEFINED
in Non-debug state.

















H2.4.7 Exceptions in Debug state 

The following sections describe how exceptions are handled in Debug state: 

° Generating exceptions when in Debug state. 

° Taking exceptions when in Debug state. 

. Reset in Debug state on page H2-4876. 

Generating exceptions when in Debug state 

In Debug state: 

° Instruction Abort exceptions cannot happen because instructions are not fetched from memory.

° Interrupts, including SError and virtual interrupts, are ignored and remain pending:

— The pending interrupt remains visible in ISR.

° Debug exceptions and debug events are ignored.

° SCR.EA is treated as if it were set to 0, regardless of its actual state, other than for the purpose of reading the
bit.

° Any attempt to execute an instruction bit pattern that is an allocated instruction at the current Exception level,
but is listed in Executing instructions in Debug state on page H2-4855 as UNDEFINED in Debug state,
generates an exception that is taken to the current Exception level, or to EL1 if executing at EL0.

Note
If the exception is taken to an Exception level that is using AArch32 then it is taken as an Undefined
Instruction exception.

The priority and syndrome for these exceptions are the same as for executing an encoding that does not have
an allocated instruction.

° Instructions executed at EL2, EL1, and EL0 that are configured by EL3 control registers to trap to EL3:

— When the value of EDSCR.SDD is 0, generate the appropriate trap exception that is taken to EL3.
— When the value of EDSCR.SDD is 1, are treated as UNDEFINED and generate an exception that is taken
to the current Exception level, or to EL1 if the instruction is executed at EL0. If the exception is taken
to an Exception level that is using AArch32 it is taken as an Undefined Instruction exception.
If the exception is taken to an Exception level using AArch64 or to AArch32 Hyp mode, then it is
reported with an EC value of 0x00.

Otherwise configurable traps, enables, and disables for instructions are unaffected by Debug state, and
executing an affected instruction generates the appropriate exception.

Otherwise, synchronous exceptions, including Data Aborts, are generated as they would be in Non-debug state and
taken to the appropriate Exception level in Debug state.

Note

If EDSCR.SDD == 1 then an exception from Non-secure state is never taken to Secure state. See Security in Debug
state on page H2-4869.
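
The effect of EDSCR.SDD on instructions that EL3 control registers configure to trap to EL3 can be sketched as follows. This is an illustrative model only: the function name and the returned tuple are hypothetical, and the syndrome details are reduced to a label.

```python
def route_el3_trap(current_el, sdd):
    """Return (target_el, kind) for an instruction executed in Debug state at
    EL0, EL1, or EL2 that EL3 control registers configure to trap to EL3.

    kind is "trap" when the configured trap is taken to EL3, and "undefined"
    when the instruction is instead treated as UNDEFINED.
    """
    if current_el not in (0, 1, 2):
        raise ValueError("only EL0, EL1, and EL2 are covered by this rule")
    if sdd == 0:
        return (3, "trap")
    # SDD == 1: taken to the current Exception level, or to EL1 from EL0.
    return (max(current_el, 1), "undefined")
```

With `sdd == 1` the result never names EL3, which matches the Note that an exception from Non-secure state is never taken to Secure state in that case.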

Taking exceptions when in Debug state 

When the PE is in Debug state, all exceptions are synchronous. When an exception is generated, it is taken to Debug 

state. This means that: 

° The target Exception level is as defined for the exception in Non-debug state.
° If the target Exception level is using AArch32 then the target PE mode is as defined for the exception in
Non-debug state.


° The exception syndrome is reported as defined for the exception in Non-debug state, except for the case
described in Data Aborts in Memory access mode on page H4-4914 for which the reporting requirements are
relaxed.


The exception syndrome is reported using the syndrome register or registers for the target Exception level.
In AArch64 state, these are ESR_ELx and FAR_ELx. In AArch32 state, these are DFSR, DFAR, HSR,
HDFAR, and HPFAR. For example:


— If a Data Abort exception is taken to Abort mode at EL1 or EL3 and the exception is taken from
AArch32 state and using the Short-descriptor translation table format, the DFSR reports the exception
using the Short-descriptor format fault encoding. For exceptions other than Data Abort exceptions
taken to Abort mode, DFSR is not updated.


— If an instruction is trapped to an Exception level using AArch64 due to a configurable trap, disable, or
enable, the exception code reported is the same as it would be in Non-debug state.


The effect on auxiliary syndrome registers, such as AFSR, is IMPLEMENTATION DEFINED. 


Note 


Generally, the AArch32 Fault Address Registers (FARs) and Fault Status Registers (FSRs) are not described 
as syndrome registers, although the term is appropriate to their function. 








° The PE remains in Debug state and changes to the target mode.
° If EL3 is using AArch32 and the exception is taken from Monitor mode, SCR.NS is cleared to 0.


° If the exception is taken to an Exception level using AArch32, the PE continues to execute T32 instructions,
regardless of the TE bit in the System register for the target Exception level.


° The endianness switches to that indicated by the EE bit of the System register for the target Exception level.
° The SPSR for the target Exception level or mode is corrupted and becomes UNKNOWN.
° If the target Exception level is using AArch64, ELR_ELx for the target Exception level becomes UNKNOWN.


° If the target Exception level is EL2 using AArch32, ELR_hyp becomes UNKNOWN.


° If the target Exception level is EL1 or EL3 using AArch32, LR_<mode> for the target mode becomes
UNKNOWN.


° DLR and DSPSR become UNKNOWN.
° The cumulative error flag, EDSCR.ERR, is set to 1. See Cumulative error flag on page H4-4918.
° PSTATE.IL is cleared to 0.


° PSTATE.{IT, T, SS, D, A, I, F} are set to UNKNOWN values, and PSTATE.{N, Z, C, V, Q, GE} are unchanged.
However, these fields are ignored and are not observable in Debug state. For more information see Process
state (PSTATE) in Debug state on page H2-4855.


The debugger must save any state that can be corrupted by an exception before executing an instruction that might 
generate another exception. 


Pseudocode description of taking exceptions in Debug state 


The pseudocode function AArch64.TakeException() shows the behavior when the PE takes an exception to an
Exception level using AArch64 in Non-debug state. In Debug state, this is replaced with the function
AArch64.TakeExceptionInDebugState().


The pseudocode functions AArch32.EnterMode(), AArch32.EnterHypMode(), and AArch32.EnterMonitorMode() show
the behavior when the PE takes an exception to an Exception level using AArch32 in Non-debug state. In Debug
state:


° AArch32.EnterMode() is replaced with the function AArch32.EnterModeInDebugState().
° AArch32.EnterHypMode() is replaced with the function AArch32.EnterHypModeInDebugState().
° AArch32.EnterMonitorMode() is replaced with the function AArch32.EnterMonitorModeInDebugState().


Reset in Debug state 


If the PE is reset when in Debug state, it exits Debug state and enters Non-debug reset state. When the PE is in reset 
state, EDSCR.STATUS == 0b000010 and writes to EDITR are ignored. 


Note 


If EDECR.RCE == 1, meaning that a Reset Catch debug event is programmed, and if halting is allowed on exiting 
reset state, then on exiting reset state the PE halts and re-enters Debug state. See Reset Catch debug events on 
page H3-4902. All PE registers have taken their reset values, which might be UNKNOWN. 








H2.4.8 Accessing registers in Debug state 


Register accesses are unchanged in Debug state. The view of each register is determined by either the current 
Exception level or the mode, or both, and accesses might be disabled or trapped by controls at a higher Exception 
level. 


General-purpose register access, other than AArch64 state SP access 


A single general-purpose register can be read by issuing an MSR instruction through the ITR to write DBGDTR_EL0
in AArch64 state, or an MCR instruction through the ITR to write DBGDTRTXint in AArch32 state. The debugger
can then read the DTR register or registers through the external debug interface. The reverse sequence writes to a
general-purpose register.


Figure H2-1 on page H2-4877 shows the reading and writing of general-purpose registers, other than SP, in Debug 
state in AArch64 state. 
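
The read and write sequences for AArch64 state can be sketched with a toy model of the PE side of the DTR registers. Everything here is an illustrative assumption rather than a real debugger API: the `ToyPE` class, the helper names, and the readiness check (TXfull or RXfull equal to 0 and ITE equal to 1) are hypothetical, and a real debugger would access EDITR, DBGDTRRX_EL0, and DBGDTRTX_EL0 through the external debug interface.

```python
from dataclasses import dataclass, field

MSR_DBGDTR_EL0 = 0xD5130400  # MSR DBGDTR_EL0, X0; Rt is bits [4:0]
MRS_DBGDTR_EL0 = 0xD5330400  # MRS X0, DBGDTR_EL0; Rt is bits [4:0]
MASK32 = 0xFFFFFFFF


@dataclass
class ToyPE:
    """Toy PE in Debug state: X0-X30, the two DTR halves, and the flags."""
    x: list = field(default_factory=lambda: [0] * 31)
    dtrrx: int = 0
    dtrtx: int = 0
    txfull: bool = False
    rxfull: bool = False
    ite: bool = True

    def write_editr(self, insn):
        """Execute one instruction written to EDITR (two encodings only)."""
        rt = insn & 0x1F
        if (insn & ~0x1F) == MSR_DBGDTR_EL0:
            self.dtrrx = (self.x[rt] >> 32) & MASK32
            self.dtrtx = self.x[rt] & MASK32
            self.txfull = True
        elif (insn & ~0x1F) == MRS_DBGDTR_EL0:
            self.x[rt] = (self.dtrtx << 32) | self.dtrrx
            self.rxfull = False
        else:
            raise ValueError("encoding not modeled")


def debugger_read_x(pe, n):
    """Read Xn: issue MSR DBGDTR_EL0, Xn, then read both DTR halves."""
    if pe.txfull or not pe.ite:
        raise RuntimeError("PE not ready")  # a real debugger would poll
    pe.write_editr(MSR_DBGDTR_EL0 | n)
    value = (pe.dtrrx << 32) | pe.dtrtx
    pe.txfull = False  # the external read of DBGDTRTX clears TXfull
    return value


def debugger_write_x(pe, n, value):
    """Write Xn: load both DTR halves, then issue MRS Xn, DBGDTR_EL0."""
    if pe.rxfull or not pe.ite:
        raise RuntimeError("PE not ready")  # a real debugger would poll
    pe.dtrtx = (value >> 32) & MASK32
    pe.dtrrx = value & MASK32
    pe.rxfull = True  # the external write of DBGDTRRX sets RXfull
    pe.write_editr(MRS_DBGDTR_EL0 | n)
```

A round trip such as `debugger_write_x(pe, 3, v)` followed by `debugger_read_x(pe, 3)` returns `v` and leaves both flags clear, which is the property a debugger relies on when it saves and restores a scratch register.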







The figure contains two flowcharts.

Writing Xn:
1. START. Check RXfull and ITE. On No, repeat the check. On Yes, continue.
2. DBGDTRTX = D[63:32] and DBGDTRRX = D[31:0]. Sets RXfull to 1.
3. EDITR = MRS Xn, DBGDTR_EL0. Clears RXfull to 0.
4. DONE. Xn = D[63:0].

Reading Xn:
1. START. Check TXfull and ITE. On No, repeat the check. On Yes, continue.
2. EDITR = MSR DBGDTR_EL0, Xn. Sets TXfull to 1.
3. D[63:32] = DBGDTRRX and D[31:0] = DBGDTRTX. Clears TXfull to 0.
4. DONE. D[63:0] = Xn.


Figure H2-1 Reading and writing general-purpose registers, other than SP, in Debug state in AArch64 state


Figure H2-2 shows the reading and writing of general-purpose registers in Debug state in AArch32 state.

The figure contains two flowcharts.

Writing Rn:
1. START. Check RXfull and ITE. On No, repeat the check. On Yes, continue.
2. DBGDTRRX = W[31:0]. Sets RXfull to 1.
3. EDITR = MRC p14, 0, Rn, c0, c5, 0. Clears RXfull to 0.
4. DONE. Rn = W[31:0].

Reading Rn:
1. START. Check TXfull and ITE. On No, repeat the check. On Yes, continue.
2. EDITR = MCR p14, 0, Rn, c0, c5, 0. Sets TXfull to 1.
3. W[31:0] = DBGDTRTX. Clears TXfull to 0.
4. DONE. W[31:0] = Rn.


Figure H2-2 Reading and writing general-purpose registers in Debug state in AArch32 state


SIMD and floating-point register, System register, and AArch64 state SP accesses 


To read a SIMD and floating-point register or a System register, the debugger must first copy the value into a 
general-purpose register using: 


° An FMOV instruction in AArch64 state or a VMOV instruction in AArch32 state for floating-point transfers to 
SIMD and FP registers. 

° A UMOV instruction in AArch64 state or a VMOV instruction in AArch32 state for SIMD transfers to SIMD and 
FP registers. 

° An MRS instruction in AArch64 state or an MRC instruction in AArch32 state for System registers. 


° A MOV Xd, SP instruction for the SP register in AArch64 state. 


The debugger can then read out the particular general-purpose register. The reverse sequence writes a register. 


PC and PSTATE access 


The debugger reads the program counter and PSTATE of the process being debugged through the DLR_EL0 and
DSPSR_EL0 System registers. The actual values of PC and PSTATE cannot be directly observed in Debug state:


° Instructions that are used for direct reads and writes of PC and PSTATE in Non-debug state are UNDEFINED
in Debug state.


° On taking an exception, ELR_ELx and SPSR_ELx at the target Exception level are UNKNOWN. They do not
record the PC and PSTATE.


PSTATE.{IL, E, M, nRW, EL, SP} are indirectly read by instructions executed in Debug state, but all other PSTATE
fields are ignored and cannot be observed. See also:


° Process state (PSTATE) in Debug state on page H2-4855.
° Executing instructions in Debug state on page H2-4855.
° Exceptions in Debug state on page H2-4874.


H2.4.9 Accessing memory in Debug state 
How the PE accesses memory is unchanged in Debug state. This includes: 


° The operation of the MMU, including address translation, tagged address handling, access permissions, 
memory attribute determination, and the operation of any TLBs. 


° The operation of any caches and coherency mechanisms. 
° Alignment support. 

. Endianness support. 

° The Memory order model. 


Simple memory transfers 


Simple memory accesses can be performed in Debug state by issuing memory access instructions through the ITR 
and passing data through the DTR registers. Executing instructions in Debug state on page H2-4855 lists the 
memory access instructions that are supported in Debug state. 


Bulk memory transfers 


Memory access mode can accelerate bulk memory transfers in Debug state. See DCC and ITR access modes on 
page H4-4912. 


H2.5 Exiting Debug state


The PE exits Debug state when it receives a Restart request trigger event. If EDSCR.ITE == 0 the behavior of any
instruction issued through the ITR in Normal access mode or an operation issued by a DTR access in Memory access
mode that has not completed execution is CONSTRAINED UNPREDICTABLE, and must do one of the following:


° It must complete execution in Debug state before the PE executes the restart sequence.
° It must complete execution in Non-debug state after the PE executes the restart sequence.
° It must be abandoned. This means that the instruction does not execute. Any registers or memory accessed
by the instruction are left in an UNKNOWN state.


Note


° Implementations can set EDSCR.ITE to 1 to indicate that further instructions can be accepted by ITR before
the previous instructions have completed. If any previous instruction has not completed and EDSCR.ITE ==
1, then the PE must complete these instructions in Debug state before executing the restart sequence.
EDSCR.ITE == 0 indicates that the PE is not ready to restart.


° Before issuing a restart request, a debugger must observe that any instruction issued through EDITR that
might generate a synchronous exception is complete. It can do this by observing the completion of a later
instruction, as synchronous exceptions must occur in program order. For example, a debugger can observe
that an instruction that reads or writes a DTR register is complete because of its effect on the
EDSCR.{TXfull, RXfull} flags.





On exiting Debug state, the PE sets the program counter to the address in DLR, where:


° If exiting to AArch32 state:
—  Bits[31:1] of the PC are set to the value of bits[31:1] of DLR.


—  Bit[0] of the PC is set to a CONSTRAINED UNPREDICTABLE choice of 0 or the value of bit[0] in DLR.


° If exiting to AArch64 state:


—  Bits[63:56] of DLR_EL0 might be ignored as part of tagged address handling. See Address tagging in
AArch64 state on page D4-1724.


—  Otherwise the PC is set from DLR_EL0.


Note
Bits[63:32] of DLR_EL0 are ignored when exiting to AArch32 state.
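
The way the program counter is derived from DLR on exit can be sketched as follows. This is an illustrative model under stated assumptions: the function name and the `keep_bit0` parameter are hypothetical, `keep_bit0` selects between the two CONSTRAINED UNPREDICTABLE choices for bit[0], and tagged address handling in AArch64 state is not modeled.

```python
def pc_on_debug_exit(dlr, to_aarch32, keep_bit0=False):
    """Return the PC value formed from DLR when the PE exits Debug state.

    dlr: value of DLR_EL0 (64 bits) or DLR (32 bits).
    to_aarch32: True if the PE exits to AArch32 state.
    keep_bit0: for AArch32 only, True models an implementation that copies
        DLR bit[0] to the PC, False models one that forces bit[0] to 0.
    """
    if to_aarch32:
        pc = dlr & 0xFFFFFFFE          # bits[31:1]; bits[63:32] are ignored
        if keep_bit0:
            pc |= dlr & 1
        return pc
    # AArch64: the PC is set from DLR_EL0 (address tagging not modeled).
    return dlr & 0xFFFFFFFFFFFFFFFF
```

Because bit[0] is a CONSTRAINED UNPREDICTABLE choice, a debugger that wants portable behavior would be expected to write DLR with bit[0] already clear when resuming into AArch32 state.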








Exit from Debug state can give rise to a PC alignment fault exception when the program counter is used. Unlike an 
exception return, this might also happen when returning to AArch32 state. For more information, see PC alignment 
checking on page D1-1515. 


On exiting Debug state, PSTATE is set from DSPSR in the same way that an exception return sets PSTATE from 
SPSR_ELx: 


° The same illegal exception return checks that apply to an exception return also apply to exiting Debug state. 
If the return from Debug state is an illegal exception return then the effect on PSTATE and the PC is the same 
as for any other illegal exception return. See Exception return on page D1-1536 and Exception return to an 
Exception level using AArch32 on page G1-3834. 


° The checks on the PSTATE.IT bits that apply to exiting Debug state into AArch32 state are the same as those 
that apply to an exception return. See Appendix K1 Architectural Constraints on UNPREDICTABLE 
behaviors. 


° PSTATE.SS is copied from DSPSR.SS if all of the following hold:
—  MDSCR_EL1.SS == 1.
—  The debug target Exception level is using AArch64.
—  Software step exceptions from the restart Exception level are enabled.


Otherwise PSTATE.SS is set to 0.


Note
Unlike a return using ERET, PSTATE.SS must be restored from DSPSR.SS because otherwise it is UNKNOWN.


However, if OSDLR.DLK == 1 and DBGPRCR.CORENPDRQ == 0, meaning the OS Double Lock is
locked in Non-debug state and therefore Software Step exceptions are disabled, but otherwise Software Step
exceptions would be enabled from the restart Exception level, it is CONSTRAINED UNPREDICTABLE whether
PSTATE.SS is copied from DSPSR.SS.
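
The rule for PSTATE.SS on exit can be sketched as a single function. The function name and parameters are hypothetical, and the sketch deliberately leaves out the CONSTRAINED UNPREDICTABLE case that arises when the OS Double Lock is locked.

```python
def pstate_ss_on_debug_exit(dspsr_ss, mdscr_ss, target_is_aarch64,
                            step_enabled_from_restart_el):
    """Return the PSTATE.SS value set on exit from Debug state.

    dspsr_ss: value of DSPSR.SS (0 or 1).
    mdscr_ss: value of MDSCR_EL1.SS (0 or 1).
    target_is_aarch64: True if the debug target Exception level uses AArch64.
    step_enabled_from_restart_el: True if Software Step exceptions from the
        restart Exception level are enabled.
    """
    if mdscr_ss == 1 and target_is_aarch64 and step_enabled_from_restart_el:
        return dspsr_ss
    return 0
```

All three conditions must hold for the saved bit to be restored, so clearing any one of them forces PSTATE.SS to 0 regardless of DSPSR.SS.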


Note 





One important difference between Debug state exit and an exception return is that the PE can exit Debug state
at EL0. Despite this, the behavior of an exit from Debug state is similar to an exception return. For example,
PSTATE.{D, A, I, F} is updated regardless of the value of SCTLR_EL1.UMA.


Exit from Debug state has no architecturally-defined effect on the Event Register and exclusive monitors. An 
exit from Debug state might set the Event Register or clear the exclusive monitors, or both, but this is not a 
requirement and debuggers must not rely on any implementation specific behavior. 





The ExitDebugState() function is described in Chapter J1 ARMv8 Pseudocode. 







Chapter H3
Halting Debug Events


This chapter describes a particular class of debug events. It contains the following sections:


° Introduction to Halting debug events on page H3-4884.
° Halting Step debug events on page H3-4886.
° Halt Instruction debug event on page H3-4896.
° Exception Catch debug event on page H3-4897.
° External Debug Request debug event on page H3-4900.
° OS Unlock Catch debug event on page H3-4901.
° Reset Catch debug events on page H3-4902.
° Software Access debug event on page H3-4903.
° Synchronization and Halting debug events on page H3-4904.


Note
Table K12-1 on page K12-5660 disambiguates the general register references used in this chapter.
H3.1 Introduction to Halting debug events 


External debug defines Halting debug events. The following Halting debug events are available in ARMv8: 
• Halting Step debug events on page H3-4886. 
• Halt Instruction debug event on page H3-4896. 
• Exception Catch debug event on page H3-4897. 
• External Debug Request debug event on page H3-4900. 
• OS Unlock Catch debug event on page H3-4901. 
• Reset Catch debug events on page H3-4902. 
• Software Access debug event on page H3-4903. 


If halting is allowed, a Halting debug event halts the PE. The PE enters Debug state. 


In addition, breakpoints and watchpoints might halt the PE if halting is allowed. See Breakpoint and Watchpoint 
debug events on page H2-4846. Because breakpoints and watchpoints can generate an exception or halt the PE, 
Breakpoint and Watchpoint debug events are not classified as Halting debug events. 


For a definition of Debug state, see Chapter H2 Debug State. For a definition of halting allowed, see Halting allowed 
and halting prohibited on page H2-4845. 


Debug state entry and debug event prioritization on page H2-4847 describes the behavior when multiple debug 
events are generated by an instruction. 


See also Synchronization and Halting debug events on page H3-4904. 


Table H3-1 shows the behavior of Breakpoint, Watchpoint, and Halting debug events. 


Table H3-1 Summary of debug events and possible outcomes 

                                                     PE behavior when halting is: 
Debug event type                                     Allowed   Prohibited 
Breakpoint and Watchpoint debug events on            Halt      See Table D2-1 on page D2-1628 
page H2-4846                                                   and Table G2-1 on page G2-3925 
Halt Instruction debug event on page H3-4896         Halt      UNDEFINED 
Software Access debug event on page H3-4903          Halt      Ignored 
Exception Catch debug event on page H3-4897          Halt      Ignored 
Halting Step debug events on page H3-4886            Halt      Pended 
External Debug Request debug event on page H3-4900   Halt      Pended 
Reset Catch debug events on page H3-4902             Halt      Pended 
OS Unlock Catch debug event on page H3-4901          Pended    Pended 
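
As an illustration only, the outcomes in Table H3-1 for the Halting debug events can be expressed as a lookup. This is a minimal sketch, not architectural pseudocode: the dictionary keys and the function name `debug_event_outcome` are hypothetical names chosen for this example, and the Breakpoint and Watchpoint row is omitted because its prohibited-case behavior is defined in other tables.

```python
# Outcomes transcribed from Table H3-1 as (halting allowed, halting prohibited).
# The keys are illustrative names, not architectural identifiers.
OUTCOMES = {
    "halt_instruction": ("Halt", "UNDEFINED"),
    "software_access": ("Halt", "Ignored"),
    "exception_catch": ("Halt", "Ignored"),
    "halting_step": ("Halt", "Pended"),
    "external_debug_request": ("Halt", "Pended"),
    "reset_catch": ("Halt", "Pended"),
    "os_unlock_catch": ("Pended", "Pended"),
}


def debug_event_outcome(event: str, halting_allowed: bool) -> str:
    """Return the Table H3-1 outcome for a Halting debug event."""
    allowed, prohibited = OUTCOMES[event]
    return allowed if halting_allowed else prohibited
```

For example, `debug_event_outcome("halting_step", False)` selects the Prohibited column of the Halting Step row.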





Table H3-2 shows where the pseudocode for each Halting debug event type is located. 


Table H3-2 Pseudocode description of Halting debug events 

Halting debug event type                             Pseudocode 
Halt Instruction debug event on page H3-4896         HLT on page C6-529 for AArch64 and HLT on page F5-2675 for AArch32 
Software Access debug event on page H3-4903          Pseudocode description of Software Access debug event on page H3-4903 
Exception Catch debug event on page H3-4897          Pseudocode description of Exception Catch debug events on page H3-4899 
Halting Step debug events on page H3-4886            Pseudocode description of Halting Step debug events on page H3-4895 
External Debug Request debug event on page H3-4900   Pseudocode description of External Debug Request debug events on page H3-4900 
Reset Catch debug events on page H3-4902             Pseudocode description of Reset Catch debug event on page H3-4902 
OS Unlock Catch debug event on page H3-4901          Pseudocode description of OS Unlock Catch debug event on page H3-4901 
H3.2 Halting Step debug events 


Halting Step is a debug resource that a debugger can use to make the PE step through code one instruction at a time. 
This section describes the Halting Step debug events. It is divided into the following sections: 


• Overview of a Halting Step debug event. 
• The Halting Step state machine. 
• Using Halting Step on page H3-4889. 
• Detailed Halting Step state machine behavior on page H3-4889. 
• Synchronization and the Halting Step state machine on page H3-4892. 
• Stepping T32 IT instructions on page H3-4893. 
• Disabling interrupts while stepping on page H3-4894. 
• Syndrome information on Halting Step on page H3-4894. 
• Pseudocode description of Halting Step debug events on page H3-4895. 


The architecture describes the behavior as a simple Halting Step state machine. See The Halting Step state machine. 


H3.2.1 Overview of a Halting Step debug event 


The behavior of Halting Step is defined by a state machine, shown in Figure H3-1 on page H3-4888. A Halting Step 
debug event executes a single instruction and then returns control to the debugger. When the debugger software 
wants to execute a Halting Step: 


1. With the PE in Debug state, the debugger activates Halting Step. 
2. The debugger signals the PE to exit Debug state and return to the instruction that is to be stepped. 
3. The PE executes that single instruction. 
4. The PE enters Debug state before executing the next instruction. 


However, an exception might be generated while the instruction is being stepped. That is either: 
• A synchronous exception generated by the instruction being stepped. 
• An asynchronous exception taken before or after the instruction being stepped. 


Halting Step has its own enable control bit, EDECR.SS, and its own pending status bit, EDESR.SS. 


Note 


Because the Halting Step state machine states occur as a result of normal PE operation, the states can be described 
as both: 


• PE states. 
• Halting Step states. 





H3.2.2 The Halting Step state machine 
The state machine states are: 


Inactive Halting Step is inactive. No Halting Step debug events can be generated, therefore execution is not 
affected by Halting Step. The PE is in this state whenever either of the following is true: 


• Halting Step is disabled. That is, EDECR.SS is set to 0 and EDESR.SS is set to 0. 
• Halting is prohibited. See Halting the PE on debug events on page H2-4845. 


In Figure H3-1 on page H3-4888 this state is shown in red. 


Active-not-pending 


Halting Step is enabled and active. This is the state in which the PE steps an instruction. EDECR.SS 
== 1 and EDESR.SS == 0. Software must not set EDECR.SS to 1 unless the PE is in Debug state, 
otherwise behavior is CONSTRAINED UNPREDICTABLE, as described in Changing the value of 
EDECR.SS when not in Debug state on page H3-4893. 


In Figure H3-1 on page H3-4888 this state is shown in green. 







Active-pending 


Halting Step is enabled and active. The step has completed, and the PE enters Debug state. 
EDESR.SS == 1. 


In Figure H3-1 on page H3-4888 this state is shown in green. 


Whenever Halting Step is enabled and active, whether the state machine is in the active-not-pending state or in the 
active-pending state depends on EDESR.SS. Halting Step state machine states on page H3-4889 shows this. 


In the simple sequential execution of the program the PE executes the Halting Step state machine, as follows: 


1. Initially, Halting Step is inactive. 

2. After exiting Debug state, Halting Step is active-not-pending. 

3. The PE executes an instruction and Halting Step is active-pending. 

4. The pending Debug state entry is taken on the next instruction and the step is complete. 


Exceptions and other changes to the PE context can interrupt this sequence. 
Figure H3-1 shows a Halting Step state machine. 

[Figure H3-1 is a state diagram that is not reproduced here. The states shown in the figure are: 
• Inactive, EDECR.SS=0, Debug state. Halting Step is disabled. 
• Inactive, EDECR.SS=1, Debug state. Halting Step is enabled. 
• Active-not-pending, EDECR.SS=1, EDESR.SS=0, halting allowed. 
• Inactive, EDECR.SS=1, EDESR.SS=0, halting prohibited. 
• Active-pending, EDECR.SS=1, EDESR.SS=1, halting allowed. 
• Inactive, EDECR.SS=1, EDESR.SS=1, halting prohibited. 
• Inactive, EDECR.SS=1, EDESR.SS=1, Debug state. 
The transition labels shown in the figure are: Write 1 to EDECR.SS (debugger activation), Debug state exit 
(EDESR.SS is set to 0 by the exit from Debug state), Debug state exit with halting prohibited, Execution within 
Secure state, Return to Non-secure state, Exception other than SMC to Secure state where halting is prohibited, 
Asynchronous exception, Step completed (see notes a, b, and c), and Debug state entry.] 

a. Step completed occurs when: 
   • A debug event, other than a Halting Step debug event, causes entry into Debug state. 

b. Step completed occurs when: 
   • An instruction is executed without taking an exception. 
   • An exception is taken to a state where halting is allowed. 
   • A reset. 

c. Step completed occurs when: 
   • An SMC exception is taken to Secure state where halting is prohibited. 

Figure H3-1 Halting Step state machine 
Note 





Figure H3-1 on page H3-4888 only describes state transitions to and from the inactive state by exit from Debug 
state, executing an exception return, or taking an exception. Other changes to the PE context, including writes to 
registers such as EDECR and OSDLR and changes to the authentication interface can also cause changes to the 
Halting Step state machine. These can lead to UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behavior. See 
Synchronization and the Halting Step state machine on page H3-4892. 





The following bits control the state machine, as shown in Table H3-3: 


• EDECR.SS. This is the Halting Step enable bit. 
  Note 
  — The EDECR value is preserved over powerdown, meaning that the step active state is maintained over 
    a powerdown event. 
  — A debugger must only change the value of EDECR.SS when the PE is in Debug state, otherwise 
    behavior is CONSTRAINED UNPREDICTABLE as described in Changing the value of EDECR.SS when not 
    in Debug state on page H3-4893. 
• EDESR.SS. 


Table H3-3 shows the Halting Step state machine states. The letter X in a register column means that the relevant 
bit can be set to either zero or one. 


Table H3-3 Halting Step state machine states 

Halting      EDECR.SS   EDESR.SS   Halting Step state 
Prohibited   X          X          Inactive 
Allowed      0          0          Inactive 
Allowed      1          0          Active-not-pending 
Allowed      X          1          Active-pending 
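
The decode in Table H3-3 can be written as a small function. This is an illustrative sketch rather than architectural pseudocode. The function name `halting_step_state` and its parameter names are hypothetical, and the caller is assumed to supply whether halting is currently allowed.

```python
def halting_step_state(halting_allowed: bool, edecr_ss: int, edesr_ss: int) -> str:
    """Decode the Halting Step state from the three inputs of Table H3-3."""
    if not halting_allowed:
        # Row 1: halting prohibited, both bits are don't-care.
        return "inactive"
    if edesr_ss == 1:
        # Row 4: EDESR.SS == 1, EDECR.SS is don't-care.
        return "active-pending"
    if edecr_ss == 1:
        # Row 3: enabled, step not yet completed.
        return "active-not-pending"
    # Row 2: halting allowed, both bits are 0.
    return "inactive"
```

The order of the checks mirrors the don't-care entries in the table: halting prohibited overrides both bits, and EDESR.SS == 1 overrides EDECR.SS.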





H3.2.3 Using Halting Step 

To step a single instruction the PE must be in Debug state: 

1. The debugger sets EDECR.SS to 1 to enable Halting Step. 
2. The debugger signals the PE to exit Debug state with DLR set to the address of the instruction being stepped. 
   The PE clears EDESR.SS to 0 and the Halting Step state machine enters the active-not-pending state. 
3. The PE executes the instruction being stepped. 
   If an exception is taken to a state where halting is prohibited, then EDESR.SS is always correct for the 
   preferred return address of the exception. 
4. The PE enters Debug state before executing the next instruction and the step is complete. 
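
The sequence above can be sketched as a toy model. It is a simplification under stated assumptions: halting is allowed throughout, no exceptions or other debug events occur, and the class `ToyPE` and its method names are hypothetical names introduced for this example. It does not model DLR, the external debug interface, or any real debugger API.

```python
class ToyPE:
    """Toy model of a single Halting Step, assuming halting is always allowed."""

    def __init__(self) -> None:
        self.edecr_ss = 0     # Halting Step enable bit
        self.edesr_ss = 0     # Halting Step pending bit
        self.halted = True    # the PE starts in Debug state
        self.executed = 0     # count of instructions executed in Non-debug state

    def debugger_enable_step(self) -> None:
        # Step 1: EDECR.SS must only be changed while the PE is in Debug state.
        if not self.halted:
            raise RuntimeError("EDECR.SS must only be changed in Debug state")
        self.edecr_ss = 1

    def debugger_restart(self) -> None:
        # Step 2: exit from Debug state clears EDESR.SS.
        self.edesr_ss = 0
        self.halted = False

    def tick(self) -> bool:
        """Advance by one instruction slot. Return True if an instruction executed."""
        if self.halted:
            return False
        if self.edesr_ss == 1:
            # Step 4: the pending step halts the PE before the next instruction.
            self.halted = True
            return False
        # Step 3: the instruction executes, and the step becomes pending.
        self.executed += 1
        if self.edecr_ss == 1:
            self.edesr_ss = 1
        return True
```

In this model, calling `debugger_enable_step()`, `debugger_restart()`, and then `tick()` twice executes exactly one instruction and leaves the PE halted with EDESR.SS set to 1.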


H3.2.4 Detailed Halting Step state machine behavior 


The behavior of the Halting Step state machine is described in the following sections: 


• Entering the active-not-pending state on page H3-4890. 
• PE behavior in the active-not-pending state on page H3-4890. 
• Entering the active-pending state on page H3-4891. 
• PE behavior in the inactive state when in Non-debug state on page H3-4892. 
• PE behavior in Debug state on page H3-4892. 







Entering the active-not-pending state 
The PE enters the active-not-pending state: 
• By exiting Debug state with EDECR.SS == 1. 
• By an exception return from a state where halting is prohibited to a state where halting is allowed with 
  EDECR.SS == 1 and EDESR.SS == 0. 
• As described in Synchronization and the Halting Step state machine on page H3-4892. 


PE behavior in the active-not-pending state 
When the PE is in the active-not-pending state it does one of the following: 


• It executes one instruction and does one of the following: 
  — Completes it without generating a synchronous exception. 
  — Generates a synchronous exception. 
  — Generates a debug event that causes entry to Debug state. 
• It takes an asynchronous exception without executing any instruction. 
• It takes an asynchronous debug event into Debug state. 


If no exception or debug event is generated 


If no exception or debug event is generated the PE sets EDESR.SS to 1. This means that the Halting Step state 
machine advances to the active-pending state. 


If an exception or debug event is generated 


The PE sets EDESR.SS according to all of the following: 


• The type of exception. 
• The target Exception level of the exception. 
• If the exception is taken to Secure state, whether halting is prohibited in Secure state. 
  — This is determined by the result of ExternalSecureInvasiveDebugEnabled(). 

If an exception or debug event is generated, the PE sets EDESR.SS to 1 if one of the following applies: 

• A synchronous exception is generated by the instruction and one of the following applies: 
  — The exception is taken to EL1 or EL2. 
  — The exception is taken to EL3, it is not an SMC exception, and ExternalSecureInvasiveDebugEnabled() 
    == TRUE. 
  — The exception is an SMC exception. 
• An asynchronous exception is generated before executing an instruction and this is either: 
  — Taken to EL1 or EL2. 
  — Taken to EL3 and ExternalSecureInvasiveDebugEnabled() == TRUE. 
• A PE reset occurs. 

Otherwise EDESR.SS is unchanged. This happens when: 

• No instruction is executed because either: 
  — An asynchronous exception is taken to EL3 and ExternalSecureInvasiveDebugEnabled() == FALSE. 
  — An asynchronous debug event causes entry to Debug state. 
• An instruction is executed and either: 
  — Generates a synchronous exception other than an SMC exception which is taken to EL3, and 
    ExternalSecureInvasiveDebugEnabled() == FALSE. 
  — Generates a synchronous debug event and causes entry to Debug state. 


If halting is prohibited after taking the exception or debug event, then the Halting Step state machine advances to 
the inactive state. Otherwise the Halting Step state machine advances to the active-pending state. 


Note 


The underlying criteria for the value of EDESR.SS on an exception are: 





• Whether halting is allowed at the target of the exception. If halting is allowed, the PE must step into the 
  exception. If halting is prohibited, the PE must step over the exception. 
• Whether the preferred return address of the exception is the instruction itself or the next instruction, if the PE 
  steps over the exception. 





Table H3-4 shows the behavior of the active-not-pending state. The letter X indicates that 
ExternalSecureInvasiveDebugEnabled() can be either TRUE or FALSE. 


Table H3-4 Summary of active-not-pending state behavior 

Event                          Target Exception   ExternalSecureInvasiveDebugEnabled()   Value written to 
                               level                                                     EDESR.SS 
No exception or debug event    Not applicable     X                                      1 
SMC exception                  EL3                X                                      1 
Reset                          Highest            X                                      1 
Exception, other than SMC      EL1                X                                      1 
exception                      EL2                X                                      1 
                               EL3                TRUE                                   1 
                               EL3                FALSE                                  Unchanged 
Debug event                    Debug state        X                                      Unchanged 
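
As a cross-check of Table H3-4, the value written to EDESR.SS in the active-not-pending state can be sketched as a function. This is an illustrative model, not the architectural pseudocode. The function name `edesr_ss_after_step`, the event strings, and the parameter `secure_invasive_debug` (standing in for the result of ExternalSecureInvasiveDebugEnabled()) are hypothetical names for this example.

```python
from typing import Optional


def edesr_ss_after_step(event: str,
                        target_el: Optional[int] = None,
                        secure_invasive_debug: bool = False) -> str:
    """Return "1" or "unchanged" following the rows of Table H3-4.

    event is one of: "none", "smc", "reset", "exception", "debug_event".
    target_el is only used when event == "exception".
    """
    if event in ("none", "smc", "reset"):
        # ExternalSecureInvasiveDebugEnabled() is don't-care for these rows.
        return "1"
    if event == "debug_event":
        # Entry to Debug state never updates EDESR.SS.
        return "unchanged"
    if event == "exception":
        if target_el in (1, 2):
            return "1"
        if target_el == 3:
            # Step into EL3 only if halting is allowed there.
            return "1" if secure_invasive_debug else "unchanged"
    raise ValueError("unsupported event or target Exception level")
```

The only row where the authentication result matters is an exception, other than an SMC exception, taken to EL3.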
Entering the active-pending state 

The PE enters the active-pending state by one of the following: 

• From the active-not-pending state by: 
  — Executing an instruction without taking an exception. 
  — Taking an exception so that the PE remains in a state where halting is allowed. 
• An exception return from a state where halting is prohibited when EDESR.SS == 1. 
  Note 
  That is, an exception return from Secure state with ExternalSecureInvasiveDebugEnabled() == FALSE to 
  Non-secure state with ExternalInvasiveDebugEnabled() == TRUE. 
• A reset when the value of EDECR.SS == 1, regardless of the state the PE was in before the reset occurred. 
• Following the description in Synchronization and the Halting Step state machine on page H3-4892. 

When the PE is in the active-pending state, it enters Debug state before executing an instruction. However, if 
ExternalSecureInvasiveDebugEnabled() == FALSE, the architecture does not define the prioritization of this Debug 
state entry with respect to any pending asynchronous exception that is taken to EL3, if the PE is in Non-secure state. 

If an exception is prioritized over the halt, then EDESR.SS is unchanged. On return from the exception the Halting 
Step state machine re-enters the active-pending state. 


The entry into Debug state has higher priority than all other types of debug event and exception, including all other 
asynchronous exceptions. 





Note 


This means that it is possible to step a reentrant exception in the exception vector table. 





For more information on the prioritization of debug events, see Debug state entry and debug event prioritization on 
page H2-4847. 


PE behavior in the inactive state when in Non-debug state 


EDESR.SS is not updated by the execution of an instruction or the taking of an exception when Halting Step is 
inactive. This means that EDESR.SS is not changed by an exception handled in a state where halting is prohibited. 


On return to a state where halting is allowed, the Halting Step state machine is restored either to the active-pending 
state or the active-not-pending state, depending on the value of EDESR.SS. The return to a state where halting is 
allowed is normally by an exception return, which is a Context synchronization event. 


See also Synchronization and the Halting Step state machine. 


PE behavior in Debug state 


Halting Step is inactive in Debug state because halting is prohibited, see Halting allowed and halting prohibited on 
page H2-4845. 


Entry to Debug state does not change EDESR.SS. 


EDESR.SS is cleared to 0 on exiting Debug state as the result of a restart request. If EDECR.SS == 1, Halting Step 
enters the active-not-pending state. 


Note 


This means that EDESR.SS is never cleared to 0 by the execution of an instruction in Debug state, or by taking an 
exception when in Debug state as described in PE behavior in the active-not-pending state on page H3-4890, 
because the Halting Step state machine is not in the active-not-pending state. EDESR.SS can be cleared by a write 
to EDESR, see the register description. 








However, if the PE exits Debug state as the result of a PE reset and EDECR.SS == 1, then Halting Step immediately 
enters the active-pending state, as EDESR.SS is set to the value of EDECR.SS. 


H3.2.5 Synchronization and the Halting Step state machine 
The Halting Step state machine also changes state if: 
• Halting becomes allowed or prohibited other than by exit from Debug state, an exception return, or taking an 
  exception. This means that halting becomes allowed or prohibited because: 
  — The security state changes without an exception return. See State and mode changes without explicit 
    context synchronization events on page G2-3984. 
  — The external authentication interface changes. 
  — The OS Double Lock status, DoubleLockStatus(), changes. 
• A write to EDECR when the PE is in Non-debug state changes the value of EDECR.SS. 
  Note 
  Behavior is CONSTRAINED UNPREDICTABLE if the value of EDECR.SS is changed when the PE is in 
  Non-debug state, see Changing the value of EDECR.SS when not in Debug state on page H3-4893. 
• A write to EDESR when the PE is in Non-debug state clears EDESR.SS to 0. 


These operations are guaranteed to take effect only after a Context synchronization event. If the instruction being 
stepped generates a Context synchronization event, then the PE might use the old or new state. 


The PE must perform the required behavior of the new state before or immediately following the next Context 
synchronization event, but it is not required to do so immediately. This means that the PE can perform the required 
behavior of the old state before the next Context synchronization event. This is illustrated in Example H3-1 and 
Example H3-2. 


Example H3-1 Synchronization requirements 1 


EDECR.SS is set to 1 in Debug state, requesting the active-not-pending state on exit from Debug state. On exit from 
Debug state the PE immediately takes an exception to Secure state. ExternalSecureInvasiveDebugEnabled() == 
FALSE, meaning that halting is prohibited in Secure state. The PE does not step any instructions but executes the 
software in Secure state as normal. EDESR.SS remains set to 0. If ExternalSecureInvasiveDebugEnabled() 
subsequently becomes TRUE, meaning that halting is now allowed, the PE must perform the required behavior of 
the active-not-pending state before or immediately following the next Context synchronization event, but it is not 
required to do so immediately. 


Example H3-2 Synchronization requirements 2 


EDECR.SS is set to 1 in Debug state. On exit from Debug state the PE executes an MSR instruction that sets 
OSDLR_EL1.DLK to 1 and DoubleLockStatus() becomes TRUE. This change requires a Context synchronization 
event to guarantee its effect, meaning it is CONSTRAINED UNPREDICTABLE whether: 


• Halting is allowed: 
  — The PE enters Debug state on the next instruction. 
• Halting is prohibited: 
  — The PE does not enter Debug state. 


The value in EDESR.SS depends on whether halting was allowed or prohibited when the write to 
OSDLR_EL1.DLK completed, and so it might be 0 or 1. If a second MSR instruction clears OSDLR_EL1.DLK to 0, 
the PE must perform the required behavior of the state indicated by EDESR.SS before or immediately following the 
next Context synchronization event, but it is not required to do so immediately. 


See also Synchronization and Halting debug events on page H3-4904. 


Changing the value of EDECR.SS when not in Debug state 


If software changes the value of EDECR.SS when the PE is not in Debug state then behavior is CONSTRAINED 
UNPREDICTABLE, and one or more of the following behaviors occurs: 


• The value of EDECR.SS becomes UNKNOWN. 
• The state of the Halting Step state machine becomes UNKNOWN. 
• On a reset of the PE, the value of EDECR.SS and the state of the Halting Step state machine are UNKNOWN. 


H3.2.6 Stepping T32 IT instructions 


In an implementation that supports the ITD control, the architecture permits a combination of one T32 IT instruction 
and another 16-bit T32 instruction to be treated as a single 32-bit instruction when the value of the ITD field that 
applies to the current Exception level is 1. 


For the purpose of stepping an item, it is IMPLEMENTATION DEFINED whether: 


• The PE considers such a pair of instructions to be one instruction. 
• The PE considers such a pair of instructions to be two instructions. 


It is IMPLEMENTATION DEFINED whether this behavior depends on the value of the applicable ITD bit. For example: 


• The debug logic might consider such a pair of instructions as one instruction, regardless of the state of the 
  applicable ITD field. 
• The debug logic might consider such a pair of instructions as two instructions, regardless of the state of the 
  applicable ITD field. 
• The debug logic might consider such a pair of instructions as one instruction when the value of the applicable 
  ITD field is 1, and as two instructions when the value of the ITD field is 0. 

An implementation that does not support the ITD control behaves as if the value of the ITD field is 0. 


The ITD control fields are: 
HSCTLR.ITD      Applies to execution at EL2 when EL2 is using AArch32. 
SCTLR.ITD       Applies to execution at EL0 or EL1 when EL1 is using AArch32. 
SCTLR_EL1.ITD   Applies to execution at EL0 using AArch32 when EL1 is using AArch64. 
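
The selection of the applicable ITD field can be sketched as follows. This is an illustrative helper only: the function name `applicable_itd_field` and its parameters are hypothetical, and it assumes that the PE is executing in AArch32 state at the given Exception level. Combinations that the list above does not cover return None.

```python
from typing import Optional


def applicable_itd_field(el: int, el1_aarch32: bool,
                         el2_aarch32: bool = False) -> Optional[str]:
    """Name the ITD field that applies to AArch32 execution at Exception level el."""
    if el == 2:
        # HSCTLR.ITD applies only when EL2 is using AArch32.
        return "HSCTLR.ITD" if el2_aarch32 else None
    if el in (0, 1) and el1_aarch32:
        return "SCTLR.ITD"
    if el == 0 and not el1_aarch32:
        # AArch32 EL0 under an AArch64 EL1.
        return "SCTLR_EL1.ITD"
    return None
```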





H3.2.7 Disabling interrupts while stepping 
When using Halting Step, the sequence of entering Debug state, interacting with the debugger, and then exiting 
Debug state for each instruction reduces the rate at which the PE executes instructions. However, the rate at which 
certain interrupts, such as timer interrupts, are generated might be fixed by the system. This means it might be 
necessary to disable interrupts while using Halting Step by setting EDSCR.INTdis, to allow the code being 
debugged to make forward progress. 
H3.2.8 Syndrome information on Halting Step 
Three EDSCR.STATUS encodings record different scenarios for entering Debug state on a Halting Step debug 
event: 
Halting Step, normal 
An instruction other than a Load-Exclusive instruction was stepped. 
Halting Step, exclusive 
A Load-Exclusive instruction was stepped. 
Halting Step, no syndrome 
The syndrome data is not available. 
If the PE enters Debug state due to a Halting Step debug event immediately after stepping an instruction in the 
active-not-pending state, EDSCR.STATUS is set to either: 
• Halting Step, normal, if the stepped instruction was not a Load-Exclusive instruction. 
• Halting Step, exclusive, if the stepped instruction was a Load-Exclusive instruction. 
If the stepped instruction was a conditional Load-Exclusive instruction that failed its condition code test, 
EDSCR.STATUS is set to a CONSTRAINED UNPREDICTABLE choice of Halting Step, normal, or Halting Step, 
exclusive. 
Otherwise the PE enters Debug state without stepping an instruction. This means that the Halting Step state machine 
enters the active-pending state directly from the inactive state, without going through active-not-pending state. In 
this case, EDSCR.STATUS is set to Halting Step, no syndrome. This happens when: 
• The PE enters directly into the active-pending state on an exception return to Non-secure state from EL3 
  when halting is prohibited in Secure state. 
• The active-pending state is entered for other reasons. See Synchronization and the Halting Step state machine 
  on page H3-4892. 

In addition, EDSCR.STATUS is set to one of a CONSTRAINED UNPREDICTABLE choice if: 

• The instruction being stepped generated a synchronous exception, or a pending asynchronous exception was 
  taken before the instruction was executed. 
  In this case EDSCR.STATUS is set to a CONSTRAINED UNPREDICTABLE choice of: 
  — Halting Step, no syndrome, or Halting Step, normal, if the stepped instruction was not a 
    Load-Exclusive instruction. 
  — Halting Step, no syndrome, or Halting Step, exclusive, if the stepped instruction was a Load-Exclusive 
    instruction. 
• The instruction that was stepped was an exception return instruction or an ISB. Because these instructions are 
  not Load-Exclusive instructions, EDSCR.STATUS is set to a CONSTRAINED UNPREDICTABLE choice of 
  Halting Step, no syndrome or Halting Step, normal. 
• The PE enters directly into the active-pending state on reset because EDECR.SS is set to 1. EDSCR.STATUS 
  is set to a CONSTRAINED UNPREDICTABLE choice of Halting Step, no syndrome or Halting Step, normal. 


In all cases, if EDSCR.STATUS is not set to Halting Step, no syndrome, then it must indicate whether the stepped 
instruction was a Load-Exclusive instruction by setting EDSCR.STATUS to Halting Step, normal or Halting Step, 
exclusive. 


Note 





An implementation that always sets EDSCR.STATUS to Halting Step, no syndrome is not compliant. 
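
The syndrome rules in this section can be summarized as the set of EDSCR.STATUS values that an implementation is permitted to report in each scenario. The following is a minimal sketch under stated assumptions: the function name `permitted_step_status` and the scenario strings are hypothetical labels for the cases described above, and the returned strings are the names of the encodings, not their numeric values.

```python
NORMAL = "Halting Step, normal"
EXCLUSIVE = "Halting Step, exclusive"
NO_SYNDROME = "Halting Step, no syndrome"


def permitted_step_status(scenario: str, load_exclusive: bool = False,
                          condition_failed: bool = False) -> set:
    """Return the EDSCR.STATUS values permitted for a Halting Step scenario.

    scenario is one of:
      "stepped"                 - an instruction was stepped without an exception
      "not_stepped"             - active-pending entered directly from inactive
      "exception"               - synchronous or asynchronous exception on the step
      "exception_return_or_isb" - the stepped instruction was an exception return or ISB
      "reset"                   - active-pending entered on reset with EDECR.SS == 1
    """
    if scenario == "stepped":
        if load_exclusive and condition_failed:
            # Conditional Load-Exclusive that failed its condition code test.
            return {NORMAL, EXCLUSIVE}
        return {EXCLUSIVE} if load_exclusive else {NORMAL}
    if scenario == "not_stepped":
        return {NO_SYNDROME}
    if scenario == "exception":
        return {NO_SYNDROME, EXCLUSIVE if load_exclusive else NORMAL}
    if scenario in ("exception_return_or_isb", "reset"):
        return {NO_SYNDROME, NORMAL}
    raise ValueError("unknown scenario")
```

A set with more than one member corresponds to a CONSTRAINED UNPREDICTABLE choice between those values.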





H3.2.9 Pseudocode description of Halting Step debug events 


There are two pseudocode functions for Halting Step debug events: 


• RunHaltingStep(). This is called after an instruction has executed and any exception generated by the 
  instruction is taken. It is also called after taking a reset before executing any instructions. That is, reset is 
  treated like an asynchronous exception, even if EDECR.RCE == 1. RunHaltingStep() affects the next 
  instruction. 
• CheckHaltingStep(). This is called before the next instruction is executed. If a step is pending, it generates the 
  debug event. 
H3.3 Halt Instruction debug event 


A Halt Instruction debug event is generated when EDSCR.HDE == 1, halting is allowed, and the PE executes the 
Halt instruction, HLT. 


The pseudocode for Halt Instruction debug events is described in HLT on page C6-529 for A64 and HLT on 
page F5-2675 for A32 and T32. 


HLT never generates a debug exception. It is treated as UNDEFINED if EDSCR.HDE == 0, or if halting is prohibited. 


Note 


A debugger can replace a program instruction with a Halt instruction to generate a Halt Instruction debug event. 
Debuggers that use the HLT instruction must be aware of the ARMv8-A rules for concurrent modification of 
executable code, CMODX. The rules for concurrent modification and execution of instructions do not allow one 
thread of execution or an external debugger to replace an instruction with an HLT instruction when these same 
instructions are being executed by a different thread of execution. See Concurrent modification and execution of 
instructions on page B2-83. 








The T32 HLT instruction is unconditionally executed inside an IT block, even when it is treated as UNDEFINED. The 
A32 HLT instruction is CONSTRAINED UNPREDICTABLE if the condition code field is not 0b1110, with the set of 
behaviors the same as for BKPT. See Appendix K1 Architectural Constraints on UNPREDICTABLE behaviors. 


Note 


The HLT instruction is part of the external debug solution for ARMv8-A. As such, the presence of the HLT instruction 
is not indicated in the ID registers. In particular, the AArch32 System register ID_ISAR0.Debug does not indicate 
the presence of the HLT instruction. 








H3.3.1 HLT instructions as the first instruction in a T32 IT block 


In an implementation that supports the ITD control, the architecture permits a combination of one T32 IT instruction 
and certain other 16-bit T32 instructions to be treated as a single 32-bit instruction when the value of the ITD field 
that applies to the current Exception level is 1. 


The T32 HLT instruction cannot be combined with an IT instruction in this way. In an implementation that supports 
the ITD control, if the first instruction in an IT block is an HLT instruction, then the behavior of the instruction 
depends on the value of the applicable ITD field: 


• If the value of the ITD field is 1, then the combination is treated as UNDEFINED and an Undefined Instruction 
  exception is generated either by the IT instruction or by the HLT instruction. 
• If the value of the ITD field is 0, then the HLT instruction is unconditionally executed. 

An implementation that does not support the ITD control behaves as if the value of the ITD field is 0. 


To set a Halt Instruction debug event on the first instruction of an IT block, debuggers must replace the IT 
instruction with an HLT instruction to ensure consistent behavior. 


The ITD control fields are: 
HSCTLR.ITD      Applies to execution at EL2 when EL2 is using AArch32. 
SCTLR.ITD       Applies to execution at EL0 or EL1 when EL1 is using AArch32. 
SCTLR_EL1.ITD   Applies to execution at EL0 using AArch32 when EL1 is using AArch64. 


Note 


An HLT instruction is always unconditional, even within an IT block. 
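
The rules in this section for the HLT instruction can be combined into a short sketch. It is illustrative only: the function names `hlt_outcome` and `hlt_first_in_it_block` are hypothetical, `hde` stands for the value of EDSCR.HDE, and `itd` stands for the value of the applicable ITD field, which is treated as 0 when the ITD control is not supported.

```python
def hlt_outcome(hde: int, halting_allowed: bool) -> str:
    """Outcome of executing HLT: a Halt Instruction debug event or UNDEFINED."""
    if hde == 1 and halting_allowed:
        return "Halt"
    # HLT never generates a debug exception.
    return "UNDEFINED"


def hlt_first_in_it_block(itd: int, hde: int, halting_allowed: bool) -> str:
    """Outcome when a T32 HLT is the first instruction in an IT block."""
    if itd == 1:
        # The IT and HLT combination is treated as UNDEFINED.
        return "UNDEFINED"
    # With ITD == 0 the HLT executes unconditionally, so the usual rule applies.
    return hlt_outcome(hde, halting_allowed)
```

The sketch shows why replacing the IT instruction itself with HLT gives consistent behavior: the result then depends only on `hlt_outcome`, not on the ITD field.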
H3.4 Exception Catch debug event 

Exception Catch debug events: 

• Are generated when the corresponding bit in the Exception Catch Control Register, EDECCR, is set to 1 on 
  all entries to a given Exception level. This means: 
  — Exceptions taken to the Exception level. 
  — Exception returns to the Exception level. 
• Are taken synchronously, after entry to the Exception level. 
• Ignore the Execution state of the target Exception level. 
• Are ignored if halting is prohibited. 


The EDECCR contains two fields: 
° One field for Non-secure state. 


° One field for Secure state. 


Each field contains one bit for each Exception level in that state. Bits corresponding to Exception levels that are not 
implemented are RES0. See EDECCR, External Debug Exception Catch Control Register on page H9-5028. 





Note 
° EDECCR does not replace DBGVCR: 
— DBGVCR is retained in AArch32 state for backwards compatibility. 
— DBGVCR is ignored in AArch64 state and never generates entries to Debug state. 
— DBGVCR cannot be accessed by the external debug interface. 
° EDECCR is visible to software only when the OS Lock is locked, as OSECCR_EL1 by System register 
instructions in AArch64 state, and as DBGOSECCR by System register access instructions in AArch32 state. 
This allows software to save and restore it over a powerdown. 
° Exception Catch debug events are not disabled when the OS Lock is locked. 





For an Exception Catch debug event generated after taking an exception to a trapped Exception level: 


° The PE must not fetch instructions from the vector address before entering Debug state, if address translation 
is disabled in the translation regime at the target Exception level. 

° On entering Debug state: 
— The current Exception level is the target Exception level of the exception. 
— The ELR, SPSR, ESR, and other syndrome registers contain information about the exception. 


— DLR contains the exception vector address. 





H3.4.1 Prioritization of Exception Catch debug events 
The following rules define the prioritization of Exception Catch debug events: 
° It is IMPLEMENTATION DEFINED whether Exception Catch debug events are higher or lower priority than each 
of Software Step exceptions and Halting Step debug events. 
° Exception Catch debug events are higher priority than all synchronous exceptions other than Software Step 
exceptions. 
° Exception Catch debug events are lower priority than Reset Catch debug events. 
Note 


As described in Synchronous exception prioritization for exceptions taken to AArch64 on page D1-1548, an 
exception trapping form of a Vector Catch debug event might generate a second debug exception as part of the 
exception entry, before the Exception Catch debug event is taken. See Vector Catch exceptions on page D2-1672 or 
Vector Catch exceptions on page G2-3975. 








A second unmasked asynchronous exception can be taken before the PE enters Debug state. If this second exception 
does not generate an Exception Catch debug event, the exception handler executed at the higher Exception level 
later returns to the trapped Exception level, causing the Exception Catch debug event to be generated again. 


See also Debug state entry and debug event prioritization on page H2-4847. 











H3.4.2 CONSTRAINED UNPREDICTABLE generation of Exception Catch debug events 

When the PE is executing code at a given Exception level and the corresponding EDECCR bit is 1, it is 

CONSTRAINED UNPREDICTABLE whether an Exception Catch debug event is generated. 

Note 

It is possible to generate Exception Catch debug events: 

° As a trap on all instruction fetches from the trapped Exception level as part of an instruction fetch. 

° On entry to the Exception level, as described in Detailed Halting Step state machine behavior on 
page H3-4889. 

This is similar to the implementation options allowed for Vector Catch debug events. The architecture does not 

require that the event is generated following an ISB operation executed at the Exception level. 

Examples of this are: 

° If the debugger writes to EDECCR so that the current Exception level is trapped. 

° If the OS restore code writes to OSECCR so that the current Exception level is trapped. 

° If the code executing in AArch32 state changes the Exception level or security state other than by an 
exception return, and the target Exception level is trapped. See State and mode changes without explicit 
context synchronization events on page G2-3984. 

H3.4.3 Examples of Exception Catch debug events 

If EDECCR == 0x20, meaning that the Exception Catch debug event is enabled for Non-secure EL1, then the 

following exceptions generate Exception Catch debug events: 

° An exception taken from Non-secure EL0 to Non-secure EL1. 

° An exception return from EL2 to Non-secure EL1. 

° An exception return from EL3 to Non-secure EL1. 

For example, on taking a Data Abort exception from Non-secure EL0 to Non-secure EL1, using AArch64: 

° ELR_EL1 and SPSR_EL1 are written with the preferred return address and PE state for a return to EL0. 

° ESR_EL1 and FAR_EL1 are written with the syndrome information for the exception. 

° DLR_EL0 is set to VBAR_EL1 + 0x400, the synchronous exception vector. 

° DSPSR_EL0 is written with the PE state for an exit to EL1. 

The following do not generate Exception Catch debug events: 

° An exception taken from Non-secure EL0 to EL2 or EL3. 

° An exception return from EL2 to Non-secure EL0. 

° An exception taken from Secure EL0 to Secure EL1. 

° An exception return from EL3 to Secure EL1. 
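
The examples in this section can be checked with a small model of the EDECCR test. The following Python sketch assumes that the Secure field occupies bits[3:0] and the Non-secure field occupies bits[7:4], with one bit per Exception level in ascending order. This assumption is consistent with the value 0x20 selecting Non-secure EL1, but the register description is the authoritative source. The function names are hypothetical, and the sketch is not the CheckExceptionCatch() pseudocode.

```python
def edeccr_bit(target_el: int, secure: bool) -> int:
    """Return the assumed EDECCR bit mask for an Exception level and Security state."""
    if not 0 <= target_el <= 3:
        raise ValueError("Exception level must be 0..3")
    # Assumed layout: bits[3:0] Secure EL0..EL3, bits[7:4] Non-secure EL0..EL3.
    return 1 << (target_el + (0 if secure else 4))


def exception_catch_generated(edeccr: int, target_el: int, secure: bool,
                              halting_allowed: bool = True) -> bool:
    """True if entry to the target Exception level generates an Exception Catch."""
    if not halting_allowed:
        # Exception Catch debug events are ignored if halting is prohibited.
        return False
    # Exceptions taken to, and exception returns to, the level are treated alike.
    return bool(edeccr & edeccr_bit(target_el, secure))
```

For example, with EDECCR == 0x20, `exception_catch_generated(0x20, 1, secure=False)` is true for any entry to Non-secure EL1, whether by taking an exception or by an exception return.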
H3.4.4 Pseudocode description of Exception Catch debug events 


The pseudocode function CheckExceptionCatch() is described in Chapter J1 ARMv8 Pseudocode. 





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. H3-4899 
1ID092916 Non-Confidential 



H3.5 External Debug Request debug event 
External Debug Request debug events are asynchronous debug events. 


An External Debug Request debug event is generated when signaled by the embedded cross-trigger. See Chapter H5 
The Embedded Cross-Trigger Interface. 


Note 


ARMv8-A requires the implementation of an embedded cross-trigger. 








An implementation might also support IMPLEMENTATION DEFINED ways of generating an External Debug Request 
debug event. 


If an External Debug Request debug event is being asserted at the point where a reset is taken, then the PE enters 
Debug state before it completes execution of the first instruction following the reset, provided that the state into 
which the PE resets allows halting. 


H3.5.1 Pseudocode description of External Debug Request debug events 


The ExternalDebugRequest() function is described in Chapter J1 ARMv8 Pseudocode. 
H3.6 OS Unlock Catch debug event 


An OS Unlock Catch debug event is generated when EDECR.OSUCE == 1 and the state of the OS Lock changes 
from locked to unlocked. 


When the OS Unlock Catch debug event is generated, it is recorded by setting EDESR.OSUC to 1, meaning it 
immediately becomes pending. However, it is not guaranteed to be taken immediately. See Synchronization and 
Halting debug events on page H3-4904. 


OS Unlock Catch debug events are not generated if the OS Lock is unlocked when the PE is in Debug state. See also: 
° Debug behavior when the OS Lock is unlocked on page H6-4953. 

° EDECR, External Debug Execution Control Register on page H9-5030. 

° EDESR, External Debug Event Status Register on page H9-5032. 


EDESR.OSUC is cleared to 0 on a Warm reset and on exiting Debug state. 
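
The rules for setting and clearing EDESR.OSUC can be sketched as a small state model. The following Python class is illustrative only. The class, its attributes, and its methods are hypothetical names, the model assumes that the OS Lock starts locked, and it does not model when the pending event is taken.

```python
class OSUnlockCatchModel:
    """Minimal model of how EDESR.OSUC becomes pending and is cleared."""

    def __init__(self, osuce: int):
        self.osuce = osuce       # EDECR.OSUCE
        self.os_locked = True    # assumption: the OS Lock starts locked
        self.osuc = 0            # EDESR.OSUC, 1 means the event is pending

    def set_os_lock(self, locked: bool, in_debug_state: bool = False) -> None:
        # The event is generated only on a locked-to-unlocked transition,
        # with OSUCE == 1, and not while the PE is in Debug state.
        became_unlocked = self.os_locked and not locked
        self.os_locked = locked
        if became_unlocked and self.osuce == 1 and not in_debug_state:
            self.osuc = 1

    def warm_reset(self) -> None:
        self.osuc = 0            # cleared to 0 on a Warm reset

    def exit_debug_state(self) -> None:
        self.osuc = 0            # cleared to 0 on exiting Debug state
```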





H3.6.1 Using the OS Unlock Catch debug event 
If the debugger attempts to access a debug register when the Core power domain is completely off or in a 
low-power state in which the Core power domain registers cannot be accessed, and that access returns an error, then 
the debugger must retry the access. However, if the Core power domain is regularly powered down, this can lead to 
unreliable debugger behavior. 
The debugger can program a Reset Catch debug event to halt the PE when it has powered up, and can program the 
debug registers from Debug state. However, if the PE boot software restores the debug registers, as described in 
Debug OS Save and Restore sequences on page H6-4951, then newly written values are overwritten by the restore 
sequence. 
The debugger can program an OS Unlock Catch debug event to halt the PE after the restore sequence has completed, 
and program the debug registers from Debug state. 

H3.6.2 Pseudocode description of OS Unlock Catch debug event 
The CheckOSUnlockCatch() function is called when the OS Lock is unlocked. 
The CheckPendingOSUnlockCatch() function is called before an instruction is executed. If an OS Unlock Catch is 
pending, it generates the debug event. 
H3.7 Reset Catch debug events 


A Reset Catch debug event is generated when EDECR.RCE == 1 and the PE exits reset state. When the Reset Catch 
debug event is generated, it is recorded by setting EDESR.RC to 1. 


If halting is allowed when the event is generated, the Reset Catch debug event is taken immediately and 
synchronously. On entering Debug state, DLR has the address of the reset vector. The PE must not fetch any 
instructions from memory. 


Otherwise, the Reset Catch debug event is pended and taken when halting is allowed. See also: 
° Synchronization and Halting debug events on page H3-4904. 

° EDECR, External Debug Execution Control Register on page H9-5030. 

° EDESR, External Debug Event Status Register on page H9-5032. 


This means that EDESR.RC is set to the value of EDECR.RCE on a Warm reset. EDESR.RC is cleared to 0 on 
exiting Debug state. 
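
The behavior on exit from reset can be summarized as follows. This Python sketch is illustrative only. The function name and the returned action strings are hypothetical, and the sketch is not the CheckResetCatch() pseudocode.

```python
def reset_catch_on_reset_exit(rce: int, halting_allowed: bool):
    """Return (EDESR.RC, action) when the PE exits reset state."""
    # EDESR.RC takes the value of EDECR.RCE on the reset.
    rc = 1 if rce == 1 else 0
    if rc == 0:
        return rc, "no debug event"
    if halting_allowed:
        # Taken immediately and synchronously, before any instruction is
        # fetched. DLR then holds the address of the reset vector.
        return rc, "enter Debug state"
    # Otherwise the event stays pending until halting is allowed.
    return rc, "pending"
```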


H3.7.1 Pseudocode description of Reset Catch debug event 


The CheckResetCatch() function is called after reset before executing any instruction. 


The CheckPendingResetCatch() function is called before an instruction is executed. If a Reset Catch is pending, it 
generates the Reset Catch debug event. 
H3.8 Software Access debug event 
When EDSCR.TDA == 1, software accesses to the following debug AArch64 and AArch32 System 
registers cause a trap to Debug state: 
° The Breakpoint Value Registers, DBGBVR. 
° The Breakpoint Control Registers, DBGBCR. 
° The Watchpoint Value Registers, DBGWVR. 
° The Watchpoint Control Registers, DBGWCR. 
However, EDSCR.TDA is ignored if one of: 
° The value of OSLSR.OSLK == 1, meaning that the OS Lock is locked. 
° Halting is prohibited. See Halting allowed and halting prohibited on page H2-4845. 
° The register access generates an exception. 
Note 
° DBGPRCR.CORENPDRQ (Core No-powerdown Request), DCC registers, and CLAIM tag bits are also 
shared, but are deliberately excluded from this list. 
° The only accesses that generate a trap are: 
— Accesses to System registers in AArch64 state. 
— Accesses to System registers in the (coproc==0b1110) encoding space in AArch32 state. 
Accesses to the external debug interface by a PE are not trapped. 
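
The conditions under which a software access traps to Debug state can be written as a single predicate. The following Python sketch is illustrative only. The function name, its parameters, and the register-group strings are hypothetical, and the sketch is not the CheckSoftwareAccessToDebugRegisters() pseudocode.

```python
# Register groups listed in this section as trapped when EDSCR.TDA == 1.
TRAPPED_GROUPS = {"DBGBVR", "DBGBCR", "DBGWVR", "DBGWCR"}


def software_access_traps(register_group: str, tda: int, oslk: int,
                          halting_allowed: bool,
                          access_generates_exception: bool) -> bool:
    """True if a System register access causes a trap to Debug state."""
    if register_group not in TRAPPED_GROUPS or tda != 1:
        return False
    # EDSCR.TDA is ignored if the OS Lock is locked, if halting is
    # prohibited, or if the register access generates an exception.
    if oslk == 1 or not halting_allowed or access_generates_exception:
        return False
    return True
```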
H3.8.1 Pseudocode description of Software Access debug event 
The CheckSoftwareAccessToDebugRegisters() function is described in Chapter J1 ARMv8 Pseudocode. 
H3.9 Synchronization and Halting debug events 


The behavior of external debug depends on: 

° Indirect reads of: 
— External debug registers. 
— System registers, including system debug registers. 
—  Special-purpose registers. 


° The state of the external authentication interface. 


This means that any change to these registers or the external authentication interface requires explicit 
synchronization by a Context synchronization event before the change takes effect. This ensures that for instructions 
appearing in program order after the change, the change affects the following: 


° The generation and behavior of Software debug events. See Synchronization and debug exceptions on 
page D2-1687 for exceptions taken from AArch64 state, or Synchronization and debug exceptions on 
page G2-3983 for exceptions taken from AArch32 state. 


° The generation of all Halting debug events. 

° Taking a pending Halting debug event or other asynchronous Debug event. See: 
— Pending Halting debug events on page H3-4905. 
— Taking Halting debug events asynchronously on page H3-4905. 


° The behavior of the Halting Step state machine. See Synchronization and the Halting Step state machine on 
page H3-4892. 


If there is an instruction between the change and the Context synchronization event, it is CONSTRAINED 
UNPREDICTABLE whether the PE uses the old state or the new state. 


For some registers, all read and write accesses that update the register occur in program order, without any 
additional synchronization, but others require an explicit Context synchronization event. For more information on 
the synchronization of register updates see: 


° Synchronization requirements for AArch64 System registers on page D7-1889. 
° Synchronization of changes to the external debug registers on page H8-4964. 
° State and mode changes without explicit context synchronization events on page G2-3984. 


A change on the external authentication interface is typically asynchronous to software and can happen without a 
Context synchronization event. 


External Debug Request debug events must be taken in finite time, without requiring the synchronization of any 
necessary change to the external authentication interface. 


If an unmasked External Debug Request debug event was pending but is changed to not pending before it is taken, 
then the architecture permits the External Debug Request debug event to be taken, but does not require this to 
happen. If the External Debug Request debug event is taken then it must be taken before the first Context 
synchronization event after the External Debug Request debug event was changed to not pending. 


Example H3-3 shows an example of the synchronization requirements. 


Example H3-3 Synchronization requirements 


Secure software locks up in a tight loop, so it executes indefinitely without any synchronization operations. An 
External debug request must be able to break the PE out of that loop. This is a requirement even if DBGEN or 
SPIDEN or both are LOW on entry to the loop, meaning that halting is prohibited, and are only asserted HIGH later. 
H3.9.1 Pending Halting debug events 


A pending Halting debug event is taken when halting becomes allowed. Halting can become allowed without a 
Context synchronization event if: 


° The PE enters Non-secure state with ExternalInvasiveDebugEnabled() == TRUE, and this is not the result of 
an exception return. See State and mode changes without explicit context synchronization events on 
page G2-3984. 


° A change on the external authentication interface means halting becomes allowed in the current state. 


° The OS Double Lock status, DoubleLockStatus() becomes FALSE. For example, this can happen when 
software clears OSDLR.DLK to 0 or sets DBGPRCR.CORENPDRQ to 1. 


° The debug event is an OS Unlock Catch debug event. OS Unlock Catch debug events are generated in a 
pending state, rather than taken synchronously. 


In these cases a pending Halting debug event is taken asynchronously. 


H3.9.2 Taking Halting debug events asynchronously 
The ARM architecture does not define when Halting debug events that are taken asynchronously are taken. 


Any Halting debug event that is observed as pending in the EDESR before a Context synchronization event, or an 
External Debug Request debug event that is asserted before a Context synchronization event, is taken and the PE 
enters Debug state before the first instruction following the Context synchronization event completes its execution. 
This is only possible if halting is allowed after completion of the Context synchronization event. 


If the first instruction after the Context synchronization event generates a synchronous exception, or an 

asynchronous exception is also pending, then the architecture does not define the order in which the debug event 

and the exception or exceptions are taken, unless both: 

° A Halting Step debug event is pending. EDESR.SS == 1. 

° The Context synchronization event is an exception return from a state where halting is prohibited to a state 
where halting is allowed. 


Note 


This applies to an exception return from Secure state with ExternalSecureInvasiveDebugEnabled() == FALSE 
to Non-secure state with ExternalInvasiveDebugEnabled() == TRUE. 








In this case the order in which the debug events are handled is specified to avoid a double-step. See Entering the 
active-pending state on page H3-4891. 


An External Debug Request debug event that is being asserted when the PE comes out of reset is taken, and the PE 
enters Debug state before the first instruction after the reset completes its execution, provided that halting is allowed 
when the PE exits reset state. 





Note 


These rules are based on the rules that apply to taking asynchronous exceptions. See Asynchronous exception types, 
routing, masking and priorities on page D1-1555. 
Chapter H4 
The Debug Communication Channel and Instruction Transfer Register 


This chapter describes communication between a debugger and the implemented debug logic, using the Debug 
Communications Channel (DCC) and the Instruction Transfer Register (ITR), and associated control flags. It 
contains the following sections: 


° Introduction on page H4-4908. 

° DCC and ITR registers on page H4-4909. 

° DCC and ITR access modes on page H4-4912. 

° Flow control of the DCC and ITR registers on page H4-4916. 

° Synchronization of DCC and ITR accesses on page H4-4919. 

° Interrupt-driven use of the DCC on page H4-4924. 

° Pseudocode description of the operation of the DCC and ITR registers on page H4-4925. 


Note 


Where necessary Table K12-1 on page K12-5660 disambiguates the general register references used in this chapter. 
H4.1 Introduction 


The Debug Communications Channel, DCC, is a channel for passing data between the PE and an external agent, 
such as a debugger. The DCC provides a communications channel between: 


° An external debugger, described as the debug host. 
° The debug implementation on the PE, described as the debug target. 
The DCC can be used: 


° As a 32-bit full-duplex channel. 
° As a 64-bit half-duplex channel. 


The DCC is an essential part of Debug state operation and can also be used in Non-debug state. 
The Instruction Transfer Register, ITR, passes instructions to the PE to execute in Debug state. 


The PE includes flow-control mechanisms for both the DCC and ITR. 
H4.2 DCC and ITR registers 
The DCC comprises data transfer registers, the DTRs, and associated flow-control flags. The data transfer registers 
are DTRRX and DTRTX. 
The ITR comprises a single register, EDITR, and associated flow-control flags. 
In AArch64 state, software can access the data transfer registers as: 
° A receive and transmit pair for 32-bit full duplex operation: 
— The write-only DBGDTRTX_EL0 register to transmit data. 
— The read-only DBGDTRRX_EL0 register to receive data. 
° A single 64-bit read/write register, DBGDTR_EL0, for 64-bit half-duplex operation. 
° The read/write OSDTRTX_EL1 and OSDTRRX_EL1 registers for save and restore. 
In AArch32 state, software can only access the data transfer registers as: 
° A receive and transmit pair, for 32-bit full duplex operation: 
— The write-only DBGDTRTXint register to transmit data. 
— The read-only DBGDTRRXint register to receive data. 
° The read/write DBGDTRTXext and DBGDTRRXext registers for save and restore. 
The data transfer registers are also accessible by the external debug interface as a pair of 32-bit registers, 
DBGDTRRX_EL0 and DBGDTRTX_EL0. Both registers are read/write, allowing both 32-bit full-duplex and 
64-bit half-duplex operation. 
The DCC flow-control flags are EDSCR.{RXfull, TXfull, RXO, TXU}: 
° The RXfull and TXfull ready flags are used for flow-control and are visible to software in the debug System 
registers in DCCSR. 
° The RX overrun flag, RXO, and the TX underrun flag, TXU, report flow-control errors. 
° The flow-control flags are also accessible by software in DSCR as simple read/write bits, for saving and 
restoring over a powerdown when the OS Lock is locked. 
° The flow-control flags are accessible from the external debug interface in EDSCR. 
Figure H4-1 on page H4-4910 shows the System register and external debug interface views of the EDSCR and 
DTR registers in both AArch64 state and AArch32 state. These figures do not include the save and restore views. 
Figure H4-1 System register and external debug interface views of EDSCR and DTR registers, Normal access mode 


EDITR and the ITR flow-control flags, EDSCR.{ITE, ITO} are accessible only by the external debug interface: 


° The EDITR specifies an instruction to execute in Debug state. 
° The ITR empty flag, ITE, is used for flow-control. 
° The ITR overrun flag, ITO, reports flow-control errors. 
Figure H4-2 External debug interface views of EDSCR and EDITR registers, Normal access mode 
The sticky overflow flag, EDSCR.ERR, is used by both the DCC and ITR to report flow-control errors. 
To save and restore the DCC registers for an external debugger over powerdown, software uses: 


° The MDSCR_EL1, OSDTRTX_EL1, and OSDTRRX_EL1 registers in AArch64 state. 
° The DBGDSCRext, DBGDTRTXext, and DBGDTRRXext registers in AArch32 state. 


Note 


There is no save and restore mechanism for the ITR registers as the ITR is only used in Debug state. 
Figure H4-3 System register views of EDSCR and DTR registers for save and restore 
H4.3 DCC and ITR access modes 


The DCC and ITR support two access modes: 

DCC and ITR access mode, links to description    Applies when: 
Normal access mode                               EDSCR.MA == 0 or the PE is in Non-debug state 
Memory access mode on page H4-4913               EDSCR.MA == 1 and the PE is in Debug state 
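
The selection between the two access modes reduces to one condition. The following Python sketch is illustrative only, and the function name is hypothetical.

```python
def dcc_itr_access_mode(ma: int, in_debug_state: bool) -> str:
    """Return the access mode that applies for a given EDSCR.MA and PE state."""
    # Memory access mode requires both EDSCR.MA == 1 and Debug state.
    if ma == 1 and in_debug_state:
        return "Memory access mode"
    # EDSCR.MA == 0, or the PE is in Non-debug state.
    return "Normal access mode"
```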





H4.3.1 Normal access mode 


The Normal access mode allows use of the DCC as a communications channel between target and host. It also 
allows the use of the ITR for issuing instructions to the PE in Debug state. 


In Normal access mode, if there is no overrun or underrun, the following occurs: 


For accesses by software: 


Direct writes to DBGDTRTX update the value in DTRTX and indirectly write 1 to TXfull. 
Direct reads from DBGDTRRX return the value in DTRRX and indirectly write 0 to RXfull. 

In AArch64 state, direct writes to DBGDTR_EL0 update both DTRTX and DTRRX, 
indirectly write 1 to TXfull, and do not change RXfull: 

— DTRTX is set from bits[31:0] of the transfer register. 
— DTRRX is set from bits[63:32] of the transfer register. 

In AArch64 state, direct reads from DBGDTR_EL0 return the concatenation of DTRRX and 
DTRTX, indirectly write 0 to RXfull, and do not change TXfull: 

— Bits[31:0] of the transfer register are set from DTRRX. 
— Bits[63:32] of the transfer register are set from DTRTX. 

Note 
For DBGDTR_EL0, the word order is reversed for reads with respect to writes. 





Software reads TXfull and RXfull using DCCSR. 


For accesses by the external debug interface: 


Writes to EDITR trigger the instruction to be executed if the PE is in Debug state: 
— If the PE is in AArch64 state, this is an A64 instruction. 
— If the PE is in AArch32 state, this is a T32 instruction. The T32 instruction is a pair of 
halfwords where the first halfword is taken from the lower 16 bits, and the second 
halfword is taken from the upper 16 bits. 

Reads of DBGDTRTX_EL0 return the value in DTRTX and indirectly write 0 to TXfull. 
Writes to DBGDTRTX_EL0 update the value in DTRTX and do not change TXfull. 
Reads of DBGDTRRX_EL0 return the value in DTRRX and do not change RXfull. 
Writes to DBGDTRRX_EL0 update the value in DTRRX and indirectly write 1 to RXfull. 


TXfull and RXfull are visible to the external debug interface in EDSCR. 


The PE detects overrun and underrun by the external debug interface, and records errors in 
EDSCR.{TXU, RXO, ITO, ERR}. See Flow control of the DCC and ITR registers on 
page H4-4916. 


See also Synchronization of DCC and ITR accesses on page H4-4919. 
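
The Normal access mode rules for the DTR registers, without overrun or underrun, can be modeled directly. The following Python sketch is illustrative only. The class and method names are hypothetical, and the model ignores error flags, the ITR, and synchronization.

```python
MASK32 = 0xFFFFFFFF


class NormalModeDCC:
    """Minimal model of DTRTX, DTRRX, TXfull and RXfull in Normal access mode."""

    def __init__(self):
        self.dtrtx = 0
        self.dtrrx = 0
        self.txfull = 0
        self.rxfull = 0

    # Accesses by software.
    def sw_write_dbgdtrtx(self, value: int) -> None:
        self.dtrtx = value & MASK32
        self.txfull = 1                       # indirect write of 1 to TXfull

    def sw_read_dbgdtrrx(self) -> int:
        self.rxfull = 0                       # indirect write of 0 to RXfull
        return self.dtrrx

    def sw_write_dbgdtr64(self, value: int) -> None:
        self.dtrtx = value & MASK32           # bits[31:0]  -> DTRTX
        self.dtrrx = (value >> 32) & MASK32   # bits[63:32] -> DTRRX
        self.txfull = 1                       # RXfull is not changed

    def sw_read_dbgdtr64(self) -> int:
        self.rxfull = 0                       # TXfull is not changed
        # bits[31:0] come from DTRRX, bits[63:32] come from DTRTX.
        return (self.dtrtx << 32) | self.dtrrx

    # Accesses by the external debug interface.
    def ext_read_dbgdtrtx(self) -> int:
        self.txfull = 0
        return self.dtrtx

    def ext_write_dbgdtrtx(self, value: int) -> None:
        self.dtrtx = value & MASK32           # TXfull is not changed

    def ext_read_dbgdtrrx(self) -> int:
        return self.dtrrx                     # RXfull is not changed

    def ext_write_dbgdtrrx(self, value: int) -> None:
        self.dtrrx = value & MASK32
        self.rxfull = 1
```

In this model, a 64-bit software write followed by a 64-bit software read returns the two words swapped, which is the reversed word order described in the Note above.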
H4.3.2 Memory access mode 


When the PE is in Debug state, Memory access mode can be selected to accelerate word-aligned block reads or 
writes of memory by an external debugger. Memory access mode can only be enabled in Debug state, and no 
instructions can be issued directly by the debugger when in Memory access mode. 


If there is no overrun or underrun when in Memory access mode, an access by the external debug interface results 
in the following: 


° External reads from DBGDTRTX_EL0 cause: 
1. The existing value in DTRTX to be returned. This clears EDSCR.TXfull to 0. 
2. The equivalent of LDR W1,[X0],#4, if in AArch64 state, or LDR R1,[R0],#4, if in AArch32 state, to be 
executed. 
3. The equivalent of the MSR DBGDTRTX_EL0,X1 instruction, if in AArch64 state, or the MCR 
p14,0,R1,c0,c5,0 instruction, if in AArch32 state, to be executed. 
4. EDSCR.{TXfull, ITE} to be set to {1,1}, and X1 or R1 to be set to an UNKNOWN value. 
° External writes to DBGDTRRX_EL0 cause: 
1. The value in DTRRX to be updated. This sets EDSCR.RXfull to 1. 
2. The equivalent of the instruction MRS X1,DBGDTRRX_EL0, if in AArch64 state, or MRC p14,0,R1,c0,c5,0, if 
in AArch32 state, to be executed. 
3. The equivalent of the instruction STR W1,[X0],#4, if in AArch64 state, or STR R1,[R0],#4, if in AArch32 
state, to be executed. 
4. EDSCR.{RXfull, ITE} to be set to {0,1}, and X1 or R1 to be set to an UNKNOWN value. 
° External reads from DBGDTRRX_EL0 return the last value written to DTRRX. 


° External writes to EDITR generate an overrun error. 
During these accesses, EDSCR.{TXfull, RXfull, ITE} are used for flow control. 


The architecture does not define precisely when these flags are set or cleared by the sequence of operations outlined 
in this section. For example, in the case of an external write to DBGDTRRX_EL0, in AArch64 state, RXfull might 
be cleared after step 2, or it might not be cleared until after step 3, as an implementation is free to fuse these steps 
into a single operation. The architecture does require that the flags are set as at step 4 when the PE is ready to accept 
a further read or write without causing an overrun error or an underrun error. 


The process outlined in this section represents a simple sequential execution model of Memory access mode. An 
implementation is free to pipeline, buffer, and re-order instructions and transactions, as long as the following remain 
true: 

° Data items are transferred into and out of the DTR in order and without loss of data, other than as a result of 
an overrun or an underrun. 

° Data Aborts occur in order. 

° The constraints of the memory type are met. 

° In the list describing External reads from DBGDTRTX_EL0: 
— The MSR equivalent operation at step 3 of the sequence reads the value loaded by step 2. 
— If the list is performed in a loop, for all but the first iteration of this list, step 1 returns 
the value written by the MSR equivalent operation at the previous iteration of step 3. 

° In the list describing External writes to DBGDTRRX_EL0: 
— The MRS equivalent operation at step 2 of the sequence returns the value written at step 1. 
— The STR equivalent at step 3 of the sequence writes the value read at step 2. 

° If the PE cannot accept a read or write, as applicable, during the sequence, then the flags are updated to 
indicate an overrun or underrun. 


See Flow control of the DCC and ITR registers on page H4-4916 for more information on overrun and underrun. 
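
The simple sequential execution model of Memory access mode can be sketched as follows. This Python model is illustrative only. The class and method names are hypothetical, memory is modeled as a dictionary of word-aligned addresses, and the initial flag values assume that the PE is already ready to accept an external access. It does not model overrun, underrun, Data Aborts, or unaligned addresses.

```python
MASK32 = 0xFFFFFFFF


class MemoryAccessModeModel:
    """Sequential model of external DTR accesses in Memory access mode."""

    def __init__(self, memory: dict, base_address: int):
        self.mem = memory            # word-aligned address -> 32-bit word
        self.x0 = base_address       # X0 (or R0) holds the current address
        self.dtrtx = 0
        self.dtrrx = 0
        # Assumed starting point: ready for either a read or a write.
        self.txfull, self.rxfull, self.ite = 1, 0, 1

    def ext_read_dbgdtrtx(self) -> int:
        value = self.dtrtx           # step 1: return the existing DTRTX value
        self.txfull = 0
        w1 = self.mem[self.x0]       # step 2: LDR W1,[X0],#4 equivalent
        self.x0 += 4
        self.dtrtx = w1 & MASK32     # step 3: MSR DBGDTRTX_EL0,X1 equivalent
        self.txfull, self.ite = 1, 1  # step 4
        return value

    def ext_write_dbgdtrrx(self, value: int) -> None:
        self.dtrrx = value & MASK32  # step 1: update DTRRX
        self.rxfull = 1
        w1 = self.dtrrx              # step 2: MRS X1,DBGDTRRX_EL0 equivalent
        self.mem[self.x0] = w1       # step 3: STR W1,[X0],#4 equivalent
        self.x0 += 4
        self.rxfull, self.ite = 0, 1  # step 4
```

In this model the first external read returns whatever DTRTX held before the block read started, and each later read returns the word loaded by the previous iteration. This matches the loop behavior described in the list above.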
Ordering, access sizes and effect on exclusive monitors 


For the purposes of memory ordering, access sizes, and effect on the exclusive monitor, accesses in Memory access 
mode are consistent with Load/Store word instructions executed by the PE. 


The simple sequential access model of Memory access mode, as stated in Memory access mode on page H4-4913, 
must also be ordered with respect to instructions executed as a result of explicit writes to EDITR in Normal access 
mode, both before and after accesses to the DTR registers in Memory access mode. 


Data Aborts in Memory access mode 
If a memory access generates a Data Abort, then: 


° The Data Abort exception is taken. See Exceptions in Debug state on page H2-4874: 
— This means EDSCR.ERR is set to 1, see Cumulative error flag on page H4-4918. 
— If the Data Abort occurs on stage 2 of an address translation, then the values returned in the ISV field 
and in bits[23:14] of the ISS are UNKNOWN. 

If this Data Abort is taken to EL2 using AArch64, the ISS is returned by ESR_EL2. ISS encoding for 
an exception from a Data Abort on page D7-1955 describes the usual encoding of this ISS. 

If EL2 is using AArch32 and this Data Abort is taken to Hyp mode, the ISS is returned by HSR. ISS 
encoding for an exception from a Data Abort on page G6-4408 describes the usual encoding of this 
ISS. 
° Register R0 retains the address that generated the abort. 
° Register R1 is set to an UNKNOWN value. 
° EDSCR.TXfull, for a load, or EDSCR.RXfull, for a store, is set to an UNKNOWN value. 
° DTRTX, for a load, or DTRRX, for a store, is set to an UNKNOWN value. 
° EDSCR.ITE is set to 1. 


Illegal Execution state exception 


If PSTATE.IL is set to 1 when EDSCR.MA == 1, then on an external write access to DBGDTRRX_EL0 or an 
external read from DBGDTRTX_EL0, it is CONSTRAINED UNPREDICTABLE whether the PE: 


° Takes an Illegal Execution state exception without performing any operations. In this case: 
— EDSCR.ERR is set to 1, see Cumulative error flag on page H4-4918. 
— Register R0 is unchanged. 
— Register R1 is set to an UNKNOWN value. 
— EDSCR.TXfull or EDSCR.RXfull, as applicable, is set to an UNKNOWN value. 
— DTRTX or DTRRX, as applicable, is set to an UNKNOWN value. 
— EDSCR.ITE is set to 1. 
See also Exceptions in Debug state on page H2-4874. 
° Ignores PSTATE.IL. 


Note 


The typical usage model for Memory access mode involves executing instructions in Normal access mode to set up 
X0 before setting EDSCR.MA to 1. These instructions generate an Illegal Execution state exception if PSTATE.IL is 
set to 1. 
Alignment constraints 
If the address in R0 is not aligned to a multiple of four, the behavior is as follows: 


° For each external DTR access a CONSTRAINED UNPREDICTABLE choice of: 

1. The PE makes an unaligned memory access to R0. If alignment checking is enabled for the memory 
access, this generates an Alignment fault. 

2. The PE makes a memory access to Align(X[0],4) in AArch64 state, or Align(R[0],4) in AArch32 
state. 

3. The PE generates an Alignment fault, regardless of whether alignment checking is enabled. 

4. The PE does nothing. 

° Following each memory access, if there is no Data Abort, R0 is updated with an UNKNOWN value. 
° For external writes to DBGDTRRX_EL0, if the PE writes to memory, an UNKNOWN value is written. 
° For external reads of DBGDTRTX_EL0 an UNKNOWN value is returned. 
° The RXfull and TXfull flags are left in an UNKNOWN state, meaning that a DBGDTRTX_EL0 read can trigger 
a TX underrun, and a DBGDTRRX_EL0 write can trigger an RX overrun. 


H4.3.3 Memory-mapped accesses to the DCC and ITR 


Writes to the flags in EDSCR by external debug interface accesses to the DCC and the ITR registers are indirect 
writes, because they are a side-effect of the access. The indirect write might not occur for a memory-mapped access 
to the external debug interface. For more information, see Register access permissions for memory-mapped 
accesses on page H8-4968. 
H4.4 Flow control of the DCC and ITR registers 

This sub-section describes the flow-control of the DCC and ITR registers: 

° Ready flags. 

° Buffering writes to EDITR. 

° Overrun and underrun flags on page H4-4917. 

° Cumulative error flag on page H4-4918. 

H4.4.1 Ready flags 

In Normal access mode: 

° For the DTR registers there are two ready flags: 

— EDSCR.RXfull == 1 indicates that DBGDTRRX_EL0 contains a valid value that has been written by 
the external debugger and not yet read by software running on the target. 

— EDSCR.TXfull == 1 indicates that DBGDTRTX_EL0 contains a valid value that has been written by 
software running on the target and not yet read by an external debugger. 

° For the ITR register there is a single ready flag: 

—  EDSCR.ITE == 1 indicates that the PE is ready to accept an instruction to the ITR. 
Note 
The architecture permits a PE to continue to accept and buffer instructions when previous instructions 
have not completed their architecturally defined behavior, as long as those instructions are discarded 
if EDSCR.ERR is set, either by an underrun or overrun or by any of the other error conditions 
described in this architecture, such as an instruction generating an abort. 

In Memory access mode: 

° EDSCR.{RXfull, ITE} == {0,1} indicates that DBGDTRRX_EL0 is empty and the PE is ready to accept a 
word external write to DBGDTRRX_EL0. 

° EDSCR.{TXfull, ITE} == {1,1} indicates that DBGDTRTX_EL0 is full and the PE is ready to accept a word 
external read from DBGDTRTX_EL0. 

All other values indicate that the PE is not ready, and result in a DTR overrun or underrun error, an ITR overrun 
error, or both, as defined in Overrun and underrun flags on page H4-4917. 

EDSCR.{ITE, RXfull, TXfull} shows the status of the ITR and DCC registers. It does not indicate whether a 
read or write cannot be accepted for another reason, for example because EDSCR.ERR is set or the OPTIONAL 
Software Lock is locked for memory-mapped accesses (EDLSR.SLK == 1). 

H4.4.2 Buffering writes to EDITR 

The architecture permits a PE to continue to accept and buffer instructions when previous instructions have 
not completed their architecturally defined behavior, provided that: 

° Those instructions are discarded if EDSCR.ERR is set to 1, either by an underrun or an overrun, or by any 
other error conditions described in this architecture, such as an instruction generating an abort. 

° The PE maintains the simple sequential execution model with the order of instructions determined by the 
order in which the PE accepts the EDITR writes. In particular, the buffered instructions must be executed in 
the Execution state consistent with a simple sequential execution of the instructions, even if one of the 
previous instructions is a state changing operation, such as DCPS or DRPS. 





H4.4.3 Overrun and underrun flags 


Each of the ready flags has a corresponding overrun or a corresponding underrun flag. These are sticky status flags 
that are set if the register is accessed using the external debug interface when the corresponding ready flag is not in 
the ready state. 


If the PE is in Debug state and Memory access mode, the corresponding error flag is also set if the PE is not ready 
to accept an operation because a previous load or store is still in progress. The sticky status flag remains set until 
cleared by writing 1 to EDRCR.CSE. 


Note 


The architecture permits a PE to continue to accept and buffer data to write to memory in Memory access mode. 








Table H4-1 shows DCC and ITR ready flags and the overrun and underrun flags associated with them. 


Table H4-1 DCC and ITR ready flags and the associated overrun/underrun flags 





External debug interface access   Overrun/Underrun condition                                                   Flag
Write DBGDTRRX_EL0                EDSCR.RXfull == '1' || (Halted() && EDSCR.MA == '1' && EDSCR.ITE == '0')     RXO
Read DBGDTRTX_EL0                 EDSCR.TXfull == '0' || (Halted() && EDSCR.MA == '1' && EDSCR.ITE == '0')     TXU
Write EDITR                       Halted() && (EDSCR.ITE == '0' || EDSCR.MA == '1')                            ITO





When an overrun or underrun flag is set to 1, the cumulative error flag, EDSCR.ERR, described in Cumulative error 
flag on page H4-4918, is also set to 1. 


In the event of an external write to DBGDTRRX_EL0 or EDITR generating an overrun, or an external read from 
DBGDTRTX_EL0 generating an underrun: 


° For a write, the written value is ignored. 
° For a read, an UNKNOWN value is returned. 
° EDSCR.TXfull, EDSCR.RXfull or EDSCR.ITE, as applicable, are not updated. 


There is no overrun or underrun detection on external reads of DBGDTRRX_EL0 or external writes of 
DBGDTRTX_EL0. 
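
The conditions in Table H4-1 can be restated as a small predicate. The following is a minimal sketch, not architectural pseudocode: the function name error_flag and the access labels "write_dtrrx", "read_dtrtx", and "write_editr" are hypothetical, and the flags are modeled as plain booleans rather than EDSCR bit fields.

```python
from typing import Optional


def error_flag(access: str, rxfull: bool, txfull: bool, ite: bool,
               ma: bool, halted: bool) -> Optional[str]:
    """Return the sticky flag an external access would set, or None if it is accepted.

    The access labels are hypothetical names for the three rows of Table H4-1.
    """
    # Term shared by both DTR rows: Debug state, Memory access mode, PE not ready.
    busy_in_memory_mode = halted and ma and not ite
    if access == "write_dtrrx":
        return "RXO" if (rxfull or busy_in_memory_mode) else None
    if access == "read_dtrtx":
        return "TXU" if ((not txfull) or busy_in_memory_mode) else None
    if access == "write_editr":
        return "ITO" if (halted and ((not ite) or ma)) else None
    raise ValueError(f"unknown access: {access}")
```

A debugger model could call this before applying an access, and set EDSCR.ERR whenever the result is not None.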


There is no overrun or underrun detection of direct reads and direct writes of the DTR System registers by software: 
° If RXfull == 0, a direct read of DBGDTRRX or DBGDTR_EL0 returns UNKNOWN. 


° If TXfull == 1, a direct write of: 
— DBGDTRTX sets DTRTX to UNKNOWN. 
— DBGDTR_EL0 sets DTRRX and DTRTX to UNKNOWN. 


See DCC accesses in Non-debug state on page H4-4920 for more information. 


Accessing 64-bit data 


In AArch64 state, a software access to the DBGDTR_EL0 register and an external debugger access to both 
DBGDTRRX_EL0 and DBGDTRTX_EL0 can perform a 64-bit half-duplex operation. 


However, there is only overrun and underrun detection on one of the external debug registers. That is: 


° If software directly writes a 64-bit value to DBGDTR_EL0, only TXfull is set to 1, meaning: 
— A subsequent external write to DBGDTRRX_EL0 would not be detected as an overrun. 


— If the external debugger reads DBGDTRTX_EL0 first, software might observe 
MDCCSR_EL0.TXfull == 0 and send a second value before the external debugger reads 
DBGDTRRX_EL0, leading to an undetected overrun. 
° On external writes to both DBGDTRRX_EL0 and DBGDTRTX_EL0 only RXfull is set to 1, meaning: 
— A subsequent direct write of DBGDTRTX_EL0 would not be detected as an overrun. 


— If the external debugger writes to DBGDTRRX_EL0 first, software might observe 
MDCCSR_EL0.RXfull == 1 and read a full 64-bit value, before the external debugger writes to 
DBGDTRTX_EL0, leading to an undetected underrun. 


To avoid this, debuggers need to be aware of the data size used by software for transfers and ensure that 64-bit data 
is read or written in the correct order. If the PE is in Non-debug state, this order is as follows: 


° The external debugger must check EDSCR.{RXfull, TXfull} before each transfer. 


° To receive a 64-bit value from the target, the external debugger must read DBGDTRRX_EL0 before reading 
DBGDTRTX_EL0. 


° To send a 64-bit value to the target, the external debugger must write to DBGDTRTX_EL0 before writing 
DBGDTRRX_EL0. 


Because three accesses are required to transfer 64 bits of data, 64-bit transfers are not recommended for regular 
communication between host and target. The use of underrun and overrun detection means that only one access is 
required for 32 bits of data when using 32-bit transfers. 
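
The ordering rules above for Non-debug state can be sketched with a toy model. Everything here is hypothetical: DccModel and its methods are not an ARM interface, and the split of a 64-bit value across DTRRX and DTRTX (high word in DTRRX on a target write, high word taken from DTRTX on a target read) is an assumption made for the example. An underrun or overrun is modeled as a Python exception rather than as a sticky flag.

```python
MASK32 = 0xFFFFFFFF


class DccModel:
    """Toy model of the DTR registers and ready flags (not an ARM API)."""

    def __init__(self):
        self.dtrrx = 0
        self.dtrtx = 0
        self.rxfull = False
        self.txfull = False

    # Target (software) side: 64-bit accesses to DBGDTR_EL0.
    def target_write_dbgdtr_el0(self, value):
        # Assumed layout: high word to DTRRX, low word to DTRTX.
        self.dtrrx = (value >> 32) & MASK32
        self.dtrtx = value & MASK32
        self.txfull = True  # only TXfull is set

    def target_read_dbgdtr_el0(self):
        # Assumed layout: high word from DTRTX, low word from DTRRX.
        self.rxfull = False
        return (self.dtrtx << 32) | self.dtrrx

    # External debugger side: 32-bit accesses.
    def ext_read_dtrrx(self):
        return self.dtrrx  # no underrun detection, no flag update

    def ext_read_dtrtx(self):
        if not self.txfull:
            raise RuntimeError("TX underrun")
        self.txfull = False
        return self.dtrtx

    def ext_write_dtrtx(self, word):
        self.dtrtx = word & MASK32  # no overrun detection, no flag update

    def ext_write_dtrrx(self, word):
        if self.rxfull:
            raise RuntimeError("RX overrun")
        self.dtrrx = word & MASK32
        self.rxfull = True


def debugger_receive64(dcc):
    """Check the flag, then read DTRRX before DTRTX; three accesses in total."""
    if not dcc.txfull:
        return None
    high = dcc.ext_read_dtrrx()
    low = dcc.ext_read_dtrtx()  # clears TXfull last
    return (high << 32) | low


def debugger_send64(dcc, value):
    """Check the flag, then write DTRTX before DTRRX; three accesses in total."""
    if dcc.rxfull:
        return False
    dcc.ext_write_dtrtx((value >> 32) & MASK32)
    dcc.ext_write_dtrrx(value & MASK32)  # sets RXfull last
    return True
```

In both directions the access that updates the ready flag comes last, so software on the target cannot observe the flag change before both halves have been transferred.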


In Debug state, the debugger controls the instructions executed by the PE, so these limitations do not apply. 64-bit 
transfers provide a means to transfer a 64-bit general register between the host and the target in Debug state. 


H4.4.4 Cumulative error flag 


The cumulative error flag, EDSCR.ERR, is set to 1: 
° On taking an exception from Debug state. 


° On any signaled overrun or underrun in the DCC or ITR. 

When EDSCR.ERR == 1: 

° External reads of DBGDTRTX_EL0 do not have any side-effects. 
° External writes to DBGDTRRX_EL0 are ignored. 

° External writes to EDITR are ignored. 


° No further instructions can be issued in Debug state. This includes any instructions previously accepted as 
external writes to EDITR that occur in program order after the instruction or access that caused the error. 


This allows a debugger to stream data, or, in Debug state, instructions, to the target without having to: 
° Check EDSCR.{RXfull, TXfull, ITE} before each access. 
° Check EDSCR.{ITO, RXO, TXU} following each access, for overrun or underrun. 


° Check PSTATE or other syndrome registers, or both, for an exception following each instruction executed in 
Debug state that might generate a synchronous exception. 


The cumulative error flag remains set until cleared by writing 1 to EDRCR.CSE. See EDRCR, External Debug 
Reserve Control Register on page H9-5062. 


For overruns and underruns, EDSCR.{ITO, RXO, TXU} record the error type. 
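
The streaming usage that the cumulative error flag permits can be sketched as follows. This is a simplified, hypothetical model of one direction of the DCC: RxChannel, stream, and the target_drains schedule are illustrative names, and only the RXO and ERR behavior described in this section is modeled.

```python
class RxChannel:
    """Hypothetical model of DBGDTRRX_EL0 with the RXfull, RXO and ERR flags."""

    def __init__(self):
        self.rxfull = False
        self.rxo = False
        self.err = False
        self.received = []  # words the target has consumed
        self._dtrrx = 0

    def external_write(self, word):
        if self.err:
            return  # external writes are ignored while ERR == 1
        if self.rxfull:
            self.rxo = True  # overrun: sticky flag and cumulative error flag set
            self.err = True
            return  # the written value is ignored
        self._dtrrx = word
        self.rxfull = True

    def target_read(self):
        if self.rxfull:
            self.received.append(self._dtrrx)
            self.rxfull = False

    def clear_sticky_errors(self):
        # Models the effect of writing 1 to EDRCR.CSE.
        self.rxo = False
        self.err = False


def stream(channel, words, target_drains):
    """Write every word without checking flags in between; check ERR once at the end.

    target_drains[i] says whether the target reads the DTR after write i.
    Returns True if the whole stream was accepted.
    """
    for word, drains in zip(words, target_drains):
        channel.external_write(word)
        if drains:
            channel.target_read()
    return not channel.err
```

If stream() returns False, the debugger in this model knows that every word after the first overrun was dropped, so it can clear the error and resend from that point.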


Pseudocode description of clearing the error flag 


The ClearStickyErrors() pseudocode function is described in Chapter J1 ARMv8 Pseudocode. 
H4.5 Synchronization of DCC and ITR accesses 


In addition to the standard synchronization requirements for register accesses, the following subsections describe 
additional requirements that apply for the DCC and ITR registers: 


° Summary of System register accesses to the DCC. 


° DCC accesses in Non-debug state on page H4-4920. 


In these sections, accesses by the external debug interface are referred to as external reads and external writes. 
Accesses to System registers are referred to as direct reads, direct writes, indirect reads, and indirect writes. 


Note 


In Synchronization requirements for AArch64 System registers on page D7-1889 external reads and external writes 
are described as forms of indirect access. This whole section uses more explicit terminology. 








The DTR registers and the DCC flags, TXfull and RXfull, form a communication channel, with one end operating 
asynchronously to the other. Implementations must respect the ordering of accesses to these registers in order to 
maintain the correct behavior of the channel. 


External reads of, and external writes to, DBGDTRRX_EL0 and DBGDTRTX_EL0 are asynchronous to direct 
reads of, and direct writes to, DBGDTRRX, DBGDTRTX, and in AArch64 state DBGDTR_EL0, made by software 
using System register access instructions. The direct reads and direct writes indirectly write to the DCC flags. The 
external reads and external writes indirectly read the DCC flags to check for underrun and overrun. 


Throughout this section: 
DCC flags Means any or all of the following: 
° The EDSCR.{RXfull, TXfull} ready flags. 
° The EDSCR.RXO overrun flag. 
° The EDSCR.TXU underrun flag. 
° The EDSCR.ERR cumulative error flag. 
ITR flags Means any or all of the following: 
° The EDSCR.ITE ready flag. 
° The EDSCR.ITO overrun flag. 
° The EDSCR.ERR cumulative error flag. 


H4.5.1 Summary of System register accesses to the DCC 


System register accesses to the DTR registers are direct reads and writes of those registers, as shown in Table H4-2 
on page H4-4920. Several of these instructions access the same registers using different encodings. 


DBGDTRRX and DBGDTRTX have the same encoding, with the exception of the read/write bits, but access 
different registers. The ARMv8 architecture governs the order of these instructions, 
as described in Synchronization requirements for AArch64 System registers on page D7-1889. For more details, see 
the description of the individual register in the relevant chapter, Chapter D7 AArch64 System Register Descriptions 
or Chapter G6 AArch32 System Register Descriptions. 


Table H4-2 on page H4-4920 shows a summary of System register accesses to the DCC. 
Table H4-2 Summary of System register accesses to the DCC 


Operation    OS Lock    AArch64 (MRS/MSR)   AArch32 (MRC/MCR)   Description
Read         -          DBGDTRRX_EL0        DBGDTRRXint         Direct read of DTRRX
                                                                Indirect write to the DCC flags
Write        -          DBGDTRTX_EL0        DBGDTRTXint         Direct write of DTRTX
                                                                Indirect write to the DCC flags
Read/write   -          DBGDTR_EL0          -                   Direct read/write of both DTRRX and DTRTX
                                                                Indirect write to the DCC flags
Read         -          MDCCSR_EL0          DBGDSCRint          Direct read of the DCC flags
Read/write   -          OSDTRRX_EL1         DBGDTRRXext         Direct read/write of DTRRX
Read/write   -          OSDTRTX_EL1         DBGDTRTXext         Direct read/write of DTRTX
Read         Unlocked   MDSCR_EL1           DBGDSCRext          Direct read of DCC flags
Read/write   Locked     MDSCR_EL1           DBGDSCRext          Direct read/write of DCC flags





H4.5.2 DCC accesses in Non-debug state 


In Non-debug state DCC accesses are as described in Normal access mode on page H4-4912: 


° If a direct read of DCCSR returns RXfull == 1, then a following direct read of DBGDTRRX, or in AArch64 
state of DBGDTR_EL0, returns valid data and indirectly writes 0 to DCCSR.RXfull as a side-effect. 


° If a direct read of DCCSR returns TXfull == 0, then a following direct write to DBGDTRTX, or in AArch64 
state to DBGDTR_EL0, writes the intended value, and indirectly writes 1 to DCCSR.TXfull as a side-effect. 


No Context synchronization event is required between these two instructions. Overrun and underrun detection 
prevents intervening external reads and external writes affecting the outcome of the second instruction. 


The indirect write to the DCC flags as part of the DTR access instruction is made atomically with the DTR access. 


Because a direct read of DBGDTRRX is an indirect write to DCCSR.RXfull, it must occur in program order with 
respect to the direct read of DCCSR, meaning it must not return a speculative value for DTRRX that predates the 
RXfull flag returned by the read of DCCSR. The direct write to DBGDTRTX must not be executed speculatively. 


Direct reads of DBGDTRRX, or in AArch64 state DBGDTR_EL0, and DCCSR, must occur in program order with 
respect to other direct reads of the same register using the same encoding. 


The following accesses have an implied order within the atomic access: 


° In the simple sequential execution of the program the indirect write of the DCC flags occurs immediately 
after the direct DTR access. 


Note 


For an access to DBGDTR_EL0, this means the indirect write happens after both DBGDTRRX_EL0 and 
DBGDTRTX_EL0 have been accessed. 
° In the simple sequential execution model, for an external read of DBGDTRTX_EL0 or an external write of 
DBGDTRRX_EL0: 


— The check of the DCC flags for overrun or underrun occurs immediately before the access. 
— If there is no underrun or overrun, the update of the DCC flags occurs immediately after the access. 


— If there is underrun or overrun, the update of the DCC underrun or overrun flags occurs immediately 
after the access. 


All observers must observe the same order for accesses. 


Note 


These requirements do not create order where order does not otherwise exist. They apply only to ordered accesses. 








Without explicit synchronization following external writes and external reads: 


° The value written by the external write to DBGDTRRX_EL0 that does not overrun, must be observable to 
direct reads of DBGDTRRX and DBGDTR_EL0 in finite time. 
° The DCC flags that are updated as a side-effect of the external write or external read must be observable: 
— To subsequent external reads of EDSCR. 
— To subsequent external reads of DBGDTRTX_EL0 when checking for underrun. 
— To subsequent external writes to DBGDTRRX_EL0 when checking for overrun. 
— To direct reads of DCCSR in finite time. 
However, explicit synchronization is required to guarantee that a direct read of DCCSR returns up-to-date DCC 
flags. This means that if a signal is received from another agent that indicates that DCCSR must be read, an ISB is 
required to ensure that the direct read of DCCSR occurs after the signal has been received. This also synchronizes 
the value in DBGDTRRX, if applicable. However, if that signal is an interrupt exception triggered by COMMIRQ, 
COMMTX, or COMMRX, the exception entry is sufficient synchronization. See Synchronization of DCC 
interrupt request signals on page H4-4923. 


Explicit synchronization is required following a direct read or direct write: 
° To ensure that a value directly written to DBGDTRTX is observable to external reads of DBGDTRTX_EL0. 


° To ensure that a value directly written to DBGDTR_EL0 is observable to external reads of 
DBGDTRTX_EL0 and DBGDTRRX_EL0. 


° To guarantee that the indirect writes to the DCC flags that were a side-effect of the direct read or direct write 
have occurred, and therefore that the updated values are: 
— Observable to external reads of EDSCR. 
— Observable to external reads of DBGDTRTX_EL0 when checking for underrun. 
— Observable to external writes of DBGDTRRX_EL0 when checking for overrun. 
— Returned by a following direct read of DCCSR. 


See also Memory-mapped accesses to the DCC and ITR on page H4-4915 and Synchronization of changes to the 
external debug registers on page H8-4964. 


Note 


These ordering rules mean that software: 





° Must not read DBGDTRRX without first checking DCCSR.RXfull or if the previously-read value of 
DCCSR.RXfull is 0. 


It is not sufficient to read both registers and then later decide whether to discard the read value, as there might 
be an intervening write from the external debug interface. 
° Must not write DBGDTRTX without first checking DCCSR.TXfull or if the previously-read value of 
DCCSR.TXfull is 1. 


The write to DBGDTRTX overwrites the value in DTRTX, and the external debugger might or might not 
have read this value. 


° Must ensure there is an explicit Context synchronization event following a DTR access, even if not 
immediately returning to read DCCSR again. This synchronization operation can be an exception return. 
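
The software rules in this Note can be sketched as a pair of polling helpers. This is a hypothetical model only: TargetDcc stands in for the System register accesses, context_sync() stands in for an ISB or an exception return, and none of these names come from the architecture.

```python
class TargetDcc:
    """Hypothetical stand-in for software's view of DCCSR, DBGDTRRX and DBGDTRTX."""

    def __init__(self):
        self.rxfull = False
        self.txfull = False
        self.dtrrx = 0
        self.dtrtx = 0
        self.sync_events = 0

    def read_dccsr(self):
        return {"RXfull": self.rxfull, "TXfull": self.txfull}

    def read_dbgdtrrx(self):
        value = self.dtrrx
        self.rxfull = False  # indirect write of 0 to RXfull
        return value

    def write_dbgdtrtx(self, word):
        self.dtrtx = word & 0xFFFFFFFF
        self.txfull = True  # indirect write of 1 to TXfull

    def context_sync(self):
        self.sync_events += 1  # stands in for an ISB or an exception return


def try_receive(dcc):
    """Read DBGDTRRX only after observing RXfull == 1."""
    if not dcc.read_dccsr()["RXfull"]:
        return None
    value = dcc.read_dbgdtrrx()
    dcc.context_sync()  # synchronize after the DTR access
    return value


def try_send(dcc, word):
    """Write DBGDTRTX only after observing TXfull == 0."""
    if dcc.read_dccsr()["TXfull"]:
        return False
    dcc.write_dbgdtrtx(word)
    dcc.context_sync()  # synchronize after the DTR access
    return True
```

The helpers check the flag first and never read or write the DTR speculatively and then discard the result, which is the pattern the Note rules out.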





Derived requirements 


The rules for DCC accesses in Non-debug state are as follows: 


° Following a direct read of DBGDTRRX when RXfull is 1: 


— If an external write to DBGDTRRX checks the RXfull flag for overrun and observes that the value of 
RXfull is 0, the value returned by the previous direct read must not be affected by the external write. 


— If an external read of EDSCR returns an RXfull value of 0, then the value returned by the previous direct 
read must not be affected by a following external write to DBGDTRRX, and the following external 
write does not overrun. 


° Following a direct read of DBGDTR_EL0, when RXfull is 1: 


— If an external write to DBGDTRRX checks the RXfull flag for overrun and observes that the value of 
RXfull is 0, the value returned by the previous direct read must not be affected by the external write 
nor by a following direct write to DBGDTRTX. 


— If an external read of EDSCR returns an RXfull value of 0, then the value returned by the previous direct 
read must not be affected by subsequent external writes to DBGDTRRX and DBGDTRTX in any 
order, and the following external write of DBGDTRRX does not overrun. 


° Following a direct write to DBGDTRTX, when TXfull is 0: 


— If an external read of DBGDTRTX checks the TXfull flag for underrun and observes that the value of 
TXfull is 1, the value returned by the external read must be the value written by the previous direct 
write. 


— If an external read of EDSCR returns a TXfull value of 1, then the value returned by a following 
external read of DBGDTRTX must be the value written by the previous direct write, and the 
subsequent external read does not underrun. 


° Following a direct write to DBGDTR_EL0, when TXfull is 0: 


— If an external read of DBGDTRTX checks the TXfull flag for underrun and observes that the value of 
TXfull is 1, the values returned by the external read and by a subsequent external read of DBGDTRRX 
must be the values written by the previous direct write. 


— If an external read of EDSCR returns a TXfull value of 1, then the values returned by subsequent 
external reads of DBGDTRRX and DBGDTRTX, in any order, must be the values written by the 
previous direct write, and the subsequent external read of DBGDTRTX does not underrun. 


° Following an external read of DBGDTRTX that does not underrun, if a direct read of DCCSR returns a 
TXfull value of 0, then the value returned by the external read must not be affected by a following direct write 
to DBGDTRTX. 


° Following a first external read of DBGDTRRX and a following second external read of DBGDTRTX that does 
not underrun, if a direct read of DCCSR returns a TXfull value of 0, then the values returned by the external 
reads must not be affected by a following direct write to DBGDTR_EL0. 


° Following an external write to DBGDTRRX that does not overrun, if a direct read of DCCSR returns an 
RXfull value of 1, then the value returned by a following direct read of DBGDTRRX or DBGDTR_EL0 must 
be the value written by the previous external write. 
° Following a first external write to DBGDTRTX and a following second external write to DBGDTRRX that 
does not overrun, if a direct read of DCCSR returns an RXfull value of 1, then the value returned by a 
subsequent direct read of DBGDTR_EL0 must be the values written by the previous external writes. 


H4.5.3 Synchronization of DCC interrupt request signals 


Following an external read or external write access to the DTR registers, the interrupt request signals, COMMIRQ, 
COMMTX, and COMMRX, must be updated in finite time without explicit synchronization. 


The updated values must be observable to a direct read of DCCSR or DBGDTRRX, or a direct write of 
DBGDTRTX, executed after taking an interrupt exception generated by the interrupt request. 


Following a direct read of DBGDTRRX or a direct write to DBGDTRTX, software must execute a Context 
synchronization event to guarantee the interrupt request signals have been updated in finite time. This 
synchronization operation can be an exception return. 


H4.5.4 DCC and ITR access in Debug state 


In Debug state, stricter observability rules apply for instructions issued through the ITR, to maintain communication 
between a debugger and the PE, without requiring excessive explicit synchronization. 


In Normal access mode, without explicit synchronization: 


° A direct read or direct write of the DTR registers by an instruction written to EDITR must be observable to 
an external write or an external read in finite time: 


— A direct read of DBGDTRRX must be observable to an external write of DBGDTRRX_EL0. 

— A direct read of DBGDTR_EL0 must be observable to an external write of DBGDTRRX_EL0 and 
DBGDTRTX_EL0. 

— A direct write of DBGDTRTX must be observable to an external read of DBGDTRTX_EL0. 

— A direct write of DBGDTR_EL0 must be observable to an external read of DBGDTRRX_EL0 and 
DBGDTRTX_EL0. 


This includes the indirect write to the DCC flags that occurs atomically with the access as described in DCC 
accesses in Non-debug state on page H4-4920. 


The subsequent external write or external read must observe either the old or the new values of both the DTR 
contents and DCC flags. If the old values are observed, this typically results in overrun or underrun, assuming 
the old values of the DCC flags indicate an overrun or underrun condition, as would normally be the case. 


This means the debugger can observe the direct read or direct write without explicit synchronization and 
without explicitly testing the DCC flags in EDSCR, because it can rely on overrun and underrun tests. 


° External reads of DBGDTRTX_EL0 that do not underrun and external writes to DBGDTRRX_EL0 that do 
not overrun must be observable to an instruction subsequently written to EDITR on completion of the first 
external access. This includes the indirect write to the DCC flags. 


This means that without explicit synchronization and without the need to first check the DCC flags in 
DCCSR: 


— If the instruction is a direct read of DBGDTRRX, it observes the external write. 
— If the instruction is a direct write of DBGDTRTX, it observes the external read. 


° Writes to EDITR that do not overrun commit an instruction for execution immediately. The instruction must 
complete execution in finite time without requiring any further operation by the debugger. 


° After an external write to the EDITR, the ITR flags that are updated as a side effect of that write must be 
observable by: 
— An external read of the EDSCR that follows the external write to the EDITR. 


— When checking for overrun, another external write to the EDITR that follows the original external 
write to the EDITR. 


In Memory access mode, these requirements shift to the instructions implicitly executed by external reads and 
external writes of the DTR registers, as described in Memory access mode on page H4-4913. 
H4.6 Interrupt-driven use of the DCC 


ARM recommends implementations provide a level-sensitive DCC interrupt request through the IMPLEMENTATION 
DEFINED interrupt controller as a private peripheral interrupt for the originating PE. 





Note 


° In addition to the connection to the interrupt controller, ARM also recommends COMMIRQ, COMMTX, and 
COMMRX signals that might be implemented for use by any legacy system peripherals. 


° GICv3 reserves a private peripheral interrupt number for the COMMIRQ interrupt. 





The DCCINT register provides a first level of interrupt masking within the PE, meaning only a single interrupt 
source, COMMIRQ, is needed at the interrupt controller. 


See also Synchronization of DCC interrupt request signals on page H4-4923. 
H4.7 Pseudocode description of the operation of the DCC and ITR registers 


The basic operation of the DCC and ITR registers is shown by the following pseudocode functions. These functions 
do not cover the behavior when OSLSR.OSLK == 1, meaning that the OS lock is locked: 


° DBGDTR_EL0[]. 

° DBGDTRRX_EL0[]. 

° DBGDTRTX_EL0[]. 

° EDITR[]. 

° CheckForDCCInterrupts(). 


For the definition of the DTR Registers, see shared/debug/dccanditr/DTR on page J1-5381. 
Chapter H5 


The Embedded Cross-Trigger Interface 


This chapter describes the embedded cross-trigger interface. It contains the following sections: 


° About the Embedded Cross-Trigger (ECT) on page H5-4928. 

° Basic operation on the ECT on page H5-4930. 

° Cross-triggers on a PE in an ARMv8 implementation on page H5-4934. 
° Description and allocation of CTI triggers on page H5-4935. 

° CTI registers programmers’ model on page H5-4939. 

° Examples on page H5-4940. 
H5.1 About the Embedded Cross-Trigger (ECT) 
The Embedded Cross-Trigger, ECT, allows a debugger to: 
° Send trigger events to a PE. For example, this might be done to halt the PE. 


° Send a trigger event to one or more PEs when a trigger event occurs on another PE. For example, this might 
be done to halt all PEs when one individual PE halts. 


Figure H5-1 shows the logical structure of an ECT. 
Figure H5-1 Structure of an embedded cross-trigger 


The ECT can deliver many types of trigger events, which are described in the following sections: 


° Debug request trigger event on page H5-4936. 


° Restart request trigger event on page H5-4936. 

° Cross-halt trigger event on page H5-4936. 

° Performance Monitors overflow trigger event on page H5-4937. 
° Generic trace external input trigger events on page H5-4937. 

° Generic trace external output trigger events on page H5-4937. 


° Generic CTI interrupt trigger event on page H5-4937. 


An ARMv8-A implementation must: 
° Include a cross-trigger interface, CTI. 


° Implement at least the input and output triggers defined in this architecture. 
See Cross-triggers on a PE in an ARMv8 implementation on page H5-4934. 


In addition, ARM recommends that the cross-trigger includes: 
° The ability to route trigger events between Trace macrocells: 
These typically have advanced event triggering logic. 


° An output trigger to the interrupt controller. 





Note 


The ECT and CTI must only signal trigger events for external debugging. They must not route software events, such 
as interrupts. For example, the Performance Monitors overflow input trigger is provided to allow entry to Debug 
state on a counter overflow, and the output trigger to the interrupt controller is provided to generally allow events 
from the external debug sub-system to be routed to a software agent. However, the combination of the two must not 
be used as a mechanism to route Performance Monitors overflows to an interrupt controller. 
H5.1.1 Implementation with a CoreSight CTI 


For details of the recommended connections in an ARMv8-A implementation, see Appendix K2 Recommended 
External Debug Interface. See also CoreSight™ SoC Technical Reference Manual. 
H5.2 Basic operation on the ECT 


The ECT comprises a Cross-Trigger Matrix, CTM, and one Cross-Trigger Interface, CTI, for each PE. The CTM 
passes events between the CTI blocks over channels. The CTM can have a maximum of 32 channels. 


The main interfaces of the cross-trigger interface, CTI, are: 
° The input triggers: 
— These are trigger event inputs from the PE to the CTI. 
° The output triggers: 
— These are trigger event outputs from the CTI to the PE. 
° The input channels: 
— These are channel event inputs from the cross-trigger matrix, CTM, to the CTI. 
° The output channels: 
— These are channel event outputs from the CTI to the CTM. 


Each CTI block has: 


° Up to 32 input triggers that come from the PE: 


— The input triggers are numbered 0-31. 


° Up to 32 output triggers that go to the PE: 


— The output triggers are numbered 0-31. 


Figure H5-2 on page H5-4931 shows the logical internal structure of a CTI. 
Figure H5-2 Structure of a cross-trigger interface 


Note 
° The number of triggers is IMPLEMENTATION DEFINED. Figure H5-2 shows eight input and eight output 
triggers. 
° The number of channels is IMPLEMENTATION DEFINED. Figure H5-2 shows four channels. 
° In Figure H5-2 the input channel gate function is a CTIv2 feature. 





When the CTI receives an input trigger event, this generates channel events on one or more internal channels, 
according to the mapping function defined by the Input trigger—output channel mapping registers, CTIINEN<n>. 


The CTI also contains an application trigger and channel pulse to allow a debugger to create channel events directly 
on internal channels by writing to the CTI control registers. 
Channel events on each internal channel are passed to a corresponding output channel that is controlled by a channel 
gate. The channel gate can block propagation of channel events from an internal channel to an output channel. 


The output channels from a CTI are combined, using a logical OR function, with the output channels from all other 
CTIs to form the input channels on other CTIs. The input channels of this CTI are the logical OR of the output 
channels on all other CTIs. This is the cross-trigger matrix, CTM. Therefore, the number of input channels must 
equal the number of output channels. 
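
The OR combination described above can be sketched in a few lines of Python. This is an illustrative model only, not part of the architecture: `ctm_input_channels` is a hypothetical helper, and the output channels of each CTI are assumed to be represented as an integer bitmask with one bit per channel.

```python
def ctm_input_channels(output_masks):
    """Return, for each CTI, the OR of the output channels of all other CTIs.

    output_masks[i] is a bitmask of the output channels driven by CTI i
    (an assumed representation, one bit per channel).
    """
    input_masks = []
    for i in range(len(output_masks)):
        combined = 0
        for j, mask in enumerate(output_masks):
            if j != i:  # a CTI's own outputs do not feed its own inputs
                combined |= mask
        input_masks.append(combined)
    return input_masks


# Three CTIs: CTI 0 drives channel 2, CTI 1 drives nothing, CTI 2 drives channel 0.
example_inputs = ctm_input_channels([0b0100, 0b0000, 0b0001])
```

In this example, CTI 1 sees both channel 0 and channel 2 active, while CTI 0 and CTI 2 each see only the channel driven by the other.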





Note 


The number of input triggers and output triggers is not required to be the same. 





The internal channels form an internal cross-trigger matrix within the CTI. This delivers events directly from the 
input triggers to the output triggers. Therefore the number of internal channels is the same as the number of input 
and output channels on the external CTM, and there is a direct mapping between the two. 


Channel events received on each input channel are passed to the corresponding internal channel. It is 
IMPLEMENTATION DEFINED whether the cross-trigger gate also blocks propagation of channel events from input 
channels to internal channels. 


When the CTI receives a channel event on an internal channel, this generates trigger events on one or more output 
triggers, according to the mapping function defined by the Input channel to output trigger mapping registers, 
CTIOUTEN<n>. 
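
The mapping between triggers and channels can be pictured with a toy software model. The following sketch is an assumption-laden illustration, not a description of any implementation: `CtiModel` and its attribute names are hypothetical, each mapping register is held as a channel bitmask, and the IMPLEMENTATION DEFINED gating of input channels is not modeled.

```python
class CtiModel:
    """Toy model of the trigger/channel mapping inside one CTI (names are illustrative)."""

    def __init__(self, num_triggers=8):
        self.inen = [0] * num_triggers   # CTIINEN<n>: channels raised by input trigger n
        self.outen = [0] * num_triggers  # CTIOUTEN<n>: channels that assert output trigger n
        self.gate = 0                    # channel gate: channels allowed out to the CTM

    def output_triggers(self, channels):
        """Output triggers asserted by events on the given internal channels."""
        return [n for n, mask in enumerate(self.outen) if mask & channels]

    def input_trigger_event(self, n):
        """Return (output triggers asserted, channel events passed to the CTM)."""
        channels = self.inen[n]
        return self.output_triggers(channels), channels & self.gate

    def input_channel_event(self, channels):
        """Channel events arriving from the CTM reach the internal channels."""
        return self.output_triggers(channels)


cti = CtiModel()
cti.inen[1] = 0b1000    # input trigger 1 -> channel 3
cti.outen[0] = 0b1000   # channel 3 -> output trigger 0
cti.gate = 0b0000       # keep channel 3 local to this CTI
local_result = cti.input_trigger_event(1)
```

With the gate closed, the event on input trigger 1 asserts output trigger 0 inside the same CTI but produces no channel event on the CTM.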


The CTI contains the input and output trigger interfaces to the PE and the interface of the cross-trigger matrix. The 
architecture does not define the signal protocol used on the trigger interfaces, and: 


° It is IMPLEMENTATION DEFINED whether the CTI supports multicycle input trigger events. 
° It is IMPLEMENTATION DEFINED whether the CTM supports multicycle channel events. 


See Multicycle events. 


However, an output trigger is asserted until acknowledged. The output trigger can be: 
° Self-acknowledging. This means that no further action is required from the debugger. 
° Acknowledged by the debugger writing 1 to the corresponding bit of CTIINTACK. 


The time taken to propagate a trigger event from the first PE, through its CTI, across the CTM to another CTI, and 
thereby to a second PE is IMPLEMENTATION DEFINED. 





Note 


ARM recommends that this path is not longer than the shortest software communication path between those PEs. 
This is because if the first PE halts, the Cross-halt trigger event can propagate through the ECT and halt the second 
PE without causing software on the second PE to malfunction because the first PE is in Debug state and is not 
responding. 





H5.2.1 Multicycle events 


A multicycle event is one with a continuous state that might persist over many cycles, as opposed to a discrete event. 
A typical implementation of a multicycle event is a level-based signal interface, whereas a discrete event might be 
implemented as a pulse signal or message. 


CTI support for multicycle trigger events is IMPLEMENTATION DEFINED. Use of multicycle trigger events is 
deprecated. Of the architecturally defined input trigger events, the Performance Monitors overflow trigger event and 
Generic trace external output trigger events can be multicycle input triggers. 


CTM support for multicycle channel events is IMPLEMENTATION DEFINED. A CTM that does not support multicycle 
channel events cannot propagate a multicycle trigger event between CTIs. 





Note 


A full ECT might comprise a mix of CTIs, some of which can support multicycle trigger events. In bridging these 
components, multicycle channel events become single channel events at the boundary between the CTIs. 
An ECT that supports multicycle trigger events 


When an ECT supports multicycle trigger events, an input trigger event to the CTI continuously asserts channel 
events on all output channels mapped to it until either: 


° The input trigger event is removed. 


° The channel mapping function is disabled. 


This means that an input trigger that is asserted for multiple cycles causes any channels that are mapped to it to 
become active for multiple cycles. Consequently, any output triggers mapped from that channel are asserted for 
multiple cycles. 


Note 


The output trigger remains asserted for at least as long as the channel remains active. This means that even if the 
output trigger is acknowledged, it remains asserted until the channel deactivates. 








The CTI does not guarantee that these events have precisely the same duration, as the triggers and channels can cross 
between clock domains. 


CTIAPPSET and CTIAPPCLEAR can set a channel active for multiple cycles. CTIAPPPULSE generates a single 
channel event. CTICHINSTATUS and CTICHOUTSTATUS can report whether a channel is active. 


An ECT that does not support multicycle trigger events 


When an ECT does not support multicycle trigger events, an input trigger event to the CTI generates a single 
channel event on all output channels mapped to it, regardless of how long the input trigger event is asserted. 


This means that an input trigger event that is asserted for multiple cycles generates a single channel event on any 
channels mapped to it. Consequently any self-acknowledging output triggers mapped from those channels are single 
trigger events. 





Note 


A single event is typically a single cycle, but there is no guarantee that this is always the case. 





CTIAPPSET and CTIAPPCLEAR can only generate a single channel event. CTIAPPPULSE generates a single 
channel event. If the ECT does not support multicycle channel events, use of CTIAPPSET and CTIAPPCLEAR is 
deprecated, and the debugger must only use CTIAPPPULSE. CTICHINSTATUS and CTICHOUTSTATUS must 
be treated as UNKNOWN. 
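
The difference between the two cases can be illustrated by modeling an input trigger as a list of per-cycle levels. This is a simplified sketch under stated assumptions: `channel_events` is a hypothetical helper, the mapping function stays enabled throughout, clock domain crossings are ignored, and a single event is modeled as exactly one cycle even though the architecture does not guarantee that.

```python
def channel_events(trigger_levels, multicycle):
    """Return the per-cycle channel activity caused by an input trigger.

    trigger_levels is a list of 0/1 samples of the input trigger, one per cycle.
    """
    if multicycle:
        # The channel is active for as long as the input trigger is asserted.
        return [1 if level else 0 for level in trigger_levels]
    # Otherwise each assertion produces one channel event, however long it lasts.
    events = []
    previous = 0
    for level in trigger_levels:
        events.append(1 if level and not previous else 0)
        previous = level
    return events


held = [0, 1, 1, 1, 0, 1]
with_multicycle = channel_events(held, multicycle=True)
without_multicycle = channel_events(held, multicycle=False)
```

Here `with_multicycle` follows the input trigger cycle by cycle, while `without_multicycle` contains one event for each of the two assertions.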
H5.3 Cross-triggers on a PE in an ARMv8 implementation 


An ARMv8 PE must include a cross-trigger interface, and the implementation must include at least the input and 
output triggers defined in this architecture. The number of channels in the cross-trigger matrix is IMPLEMENTATION 
DEFINED, but there must be a minimum of three. Software can read CTIDEVID.NUMCHAN to discover the number 
of implemented channels. 
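
As a rough illustration of the discovery step, the following sketch extracts the channel count from a CTIDEVID value. The field position used here, bits [21:16], is an assumption that is not stated in this section and must be checked against the CTIDEVID register description. `num_channels` is a hypothetical helper.

```python
NUMCHAN_SHIFT = 16   # assumed field position, bits [21:16]; verify against CTIDEVID
NUMCHAN_MASK = 0x3F


def num_channels(ctidevid):
    """Return the number of implemented channels reported by a CTIDEVID value."""
    count = (ctidevid >> NUMCHAN_SHIFT) & NUMCHAN_MASK
    if count < 3:
        # This section requires a minimum of three channels.
        raise ValueError("CTIDEVID.NUMCHAN reports fewer than three channels")
    return count


example_count = num_channels(4 << NUMCHAN_SHIFT)
```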


The CTM must connect to all PEs in the same Inner Shareability domain as the ARMv8-A PE, but can also connect 
to additional PEs. ARM strongly recommends that the CTM connects all PEs implementing a CTI in the system. 
This includes ARMv7-A PEs and other PEs that can be connected using a CoreSight CTI module. 


Note 


In a uniprocessor system the CTM is OPTIONAL. The CTM might be connected to other CTI modules for non-PEs, such 
as triggers for system visibility components. ARM recommends that the CTM is implemented. 








Any CTI connected to a PE that is not an ARMv8-A PE must implement at least: 


° The Debug request trigger event. 
° The Restart trigger event. 
° The Cross-halt trigger event. 


For more information about the CTI, see the CoreSight™ SoC Technical Reference Manual. ARMv8-A refines the 
generic CTI by defining roles for each of the implemented input and output triggers. 
H5.4 Description and allocation of CTI triggers 


Table H5-1 shows the output trigger events defined by the architecture and the related trigger numbers. 


Table H5-1 Allocation of CTI output trigger events 

Number  Source  Destination      Event description 
0       CTI     PE               Debug request trigger event on page H5-4936 
1       CTI     PE               Restart request trigger event on page H5-4936 
2       CTI     IRQ controller   Generic CTI interrupt trigger event on page H5-4937 
3       -       -                Reserved 
4-7     CTI     Trace macrocell  OPTIONAL Generic trace external input trigger events on page H5-4937 

Note 





Output triggers from the CTI are inputs to other blocks. 





Table H5-2 shows the input trigger events defined by the architecture and the related trigger numbers. 


Table H5-2 Allocation of CTI input trigger events 

Number  Source           Destination  Event description 
0       PE               CTI          Cross-halt trigger event on page H5-4936 
1       PE               CTI          Performance Monitors overflow trigger event on page H5-4937 
2-3     -                -            Reserved 
4-7     Trace macrocell  CTI          OPTIONAL Generic trace external output trigger events on page H5-4937 

Note 





Input triggers to the CTI are outputs from other blocks. 





Table H5-1 and Table H5-2 show the minimum set of trigger events defined by the architecture. However: 


° The Generic trace external input and output trigger events are only required if the OPTIONAL Trace macrocell 
is implemented. If the OPTIONAL Trace macrocell is not implemented, these trigger events are reserved. 


° Support for the generic CTI interrupt trigger event is IMPLEMENTATION DEFINED because details of interrupt 
handling in the system, including any interrupt controllers, are IMPLEMENTATION DEFINED. Details regarding 
how the CTI interrupt is connected to an interrupt controller and its allocated interrupt number lie outside the 
scope of the architecture. ARM strongly recommends that implementations provide a means to generate 
interrupts based on external debug events. 


° The other trigger events are required by the architecture. 


An ARMv8-A implementation can extend the CTI with additional triggers. These start with the number eight. 
H5.4.1 Debug request trigger event 


This is an output trigger event from the CTI, and an input trigger event to the PE, asserted by the CTI to force the 
PE into Debug state. The trigger event is asserted until acknowledged by the debugger. The debugger acknowledges 
the trigger event by writing 1 to CTIINTACK[0]. 


Note 


A debugger must poll CTITRIGOUTSTATUS[0] until it reads as 0, to confirm that the output trigger has been 
deasserted before generating any event that must be ordered after the write to CTIINTACK, such as a write to 
CTIAPPPULSE to activate another trigger. 








If the PE is already in Debug state, the PE ignores the trigger event, but the CTI continues to assert it until it is 
removed by the debugger. See also External Debug Request debug event on page H3-4900. 
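
The acknowledge-then-poll sequence from the Note above can be sketched as follows. This is a minimal illustration, not a debugger implementation: `FakeCti` is a hypothetical stand-in for the memory-mapped CTI in which the output trigger deasserts a few reads after the acknowledge, and `ack_debug_request` and `max_polls` are names invented for the example.

```python
class FakeCti:
    """Stand-in for a CTI; output trigger 0 deasserts a few reads after the ack."""

    def __init__(self, polls_until_clear=3):
        self.trigout = 0b1           # Debug request (output trigger 0) is asserted
        self._delay = polls_until_clear
        self._countdown = None       # None until CTIINTACK[0] has been written

    def write(self, name, value):
        if name == "CTIINTACK" and value & 0b1:
            self._countdown = self._delay

    def read(self, name):
        if name != "CTITRIGOUTSTATUS":
            raise KeyError(name)
        if self._countdown is not None:
            if self._countdown == 0:
                self.trigout &= ~0b1
            else:
                self._countdown -= 1
        return self.trigout


def ack_debug_request(cti, max_polls=100):
    """Write 1 to CTIINTACK[0], then poll CTITRIGOUTSTATUS[0] until it reads as 0."""
    cti.write("CTIINTACK", 1 << 0)
    for _ in range(max_polls):
        if (cti.read("CTITRIGOUTSTATUS") & 0b1) == 0:
            return True   # safe to generate events ordered after the acknowledge
    return False          # still asserted; the caller decides how to recover


acknowledged = ack_debug_request(FakeCti())
```

Bounding the number of polls is a design choice of this sketch, so that a trigger that never deasserts does not hang the caller.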


H5.4.2 Restart request trigger event 


This is an output trigger event from the CTI, and an input trigger event to the PE, asserted by the CTI to request the 
PE to exit Debug state. If the PE is in Non-debug state, the request is ignored by the PE. 


If a Restart request trigger event is received at or about the same time as the PE enters Debug state, it is 
CONSTRAINED UNPREDICTABLE whether: 


. The request is ignored by the PE. In this case the PE enters Debug state and remains in Debug state. 


° The PE enters Debug state and then immediately restarts. 


Debuggers must program the CTI to send Restart request trigger events only to PEs that are halted. To enable the 
PE to disambiguate discrete Restart request trigger events, after sending a Restart request trigger event, the debugger 
must confirm that the PE has restarted and halted before sending another Restart request trigger event. Debuggers 
can use EDPRSR.{SDR, HALTED} to determine the Execution state of the PE. 


Note 


Before generating a Restart request trigger event for a PE, a debugger must ensure any Debug request trigger event 
targeting that PE is cleared. Debug request trigger event describes how to do this. 








The trigger event is self-acknowledging, meaning that the debugger requires no further action to remove the trigger 
event. The trigger event is acknowledged even if the request is ignored by the PE. See also Exiting Debug state on 
page H2-4880. 


H5.4.3 Cross-halt trigger event 


This is an input trigger event to the CTI, and an output trigger event from the PE, asserted by a PE when it is entering 
Debug state. 





Note 


To reduce the latency of halting, ARM recommends that an implementation issues the Cross-halt trigger event early 
in the committed process of entering Debug state. This means that there is no requirement to wait until all aspects 
of entry to Debug state have completed before issuing the trigger event. Speculative emission of Cross-halt trigger 
events is not allowed. The Cross-halt trigger event must not be issued early enough for a subsequent Debug request 
trigger event, that might be derived from the Cross-halt trigger event, to be recorded in the EDSCR.STATUS field. 
This applies to Debug request trigger events that are acting as inputs to the PE. 


Performance Monitors overflow trigger event 


This is an input trigger event to the CTI, and an output trigger event from the PE, asserted each time the PE asserts 
a new Performance Monitors counter overflow interrupt request. See Chapter D5 The Performance Monitors 
Extension. 


If the CTI supports multicycle trigger events, then the trigger event remains asserted until the overflow is cleared 
by a write to PMOVSCLR_EL0. Otherwise, the trigger event is asserted when the value of PMOVSCLR_EL0 
changes from zero to a non-zero value. 





Note 


° This does not replace the recommended connection of Performance Monitors overflow trigger event to an 
interrupt controller. Software must be able to program an interrupt on Performance Monitors overflow 
without programming the CTI. 


° Events can be counted when ExternalNoninvasiveDebugEnabled()==FALSE, and, in Secure state, when 
ExternalSecureNoninvasiveDebugEnabled()==FALSE. Secure software must be aware that overflow trigger 
events are nevertheless visible to the CTI. 





Generic trace external input trigger events 


These are output trigger events from the CTI, and input trigger events to the OPTIONAL Trace macrocell, that are 
used in conjunction with the Generic trace external output trigger events to pass trigger events between: 


° The PE and the OPTIONAL Trace macrocell. 


° The OPTIONAL Trace macrocell and any other component attached to the CTM, including other Trace 
macrocells. 


There are four Generic trace external input trigger events. 


The trigger events are self-acknowledging. This means that the debugger does not have to take any further action 
to remove the events. 


Generic trace external output trigger events 


These are input trigger events to the CTI, and output trigger events from the OPTIONAL Trace macrocell, used in 
conjunction with the Generic trace external input trigger events to pass trigger events between: 


° The PE and the OPTIONAL Trace macrocell. 


° The OPTIONAL Trace macrocell and any other component attached to the CTM, including other Trace 
macrocells. 


There are four Generic trace external output trigger events. 


Generic CTI interrupt trigger event 


This is an output trigger event from the CTI, and an input to an IMPLEMENTATION DEFINED interrupt controller, and 
can transfer trigger events from the PE, Trace macrocell, or any other component attached to the CTI and CTM to 
software as an interrupt. The Generic CTI interrupt trigger event must be connected to the interrupt controller as an 
interrupt that can target the originating PE. 


Note 


° ARM recommends that the Generic CTI interrupt trigger event is a private peripheral interrupt, but 
implementations might instead make this trigger event available as a shared peripheral interrupt or a local 
peripheral interrupt. 





° GICv3 reserves a private peripheral interrupt number for this interrupt. 


It is IMPLEMENTATION DEFINED whether this trigger event is: 


° Self-acknowledging. This means that the debugger is not required to take any further action, and that the 
interrupt controller must treat the trigger event as a pulse or edge-sensitive interrupt. 


° Acknowledged by the debugger. The debugger acknowledges the trigger event by writing 1 to 
CTIINTACK[2]. This means that the interrupt controller must treat the trigger event as a level-sensitive 
interrupt. 


ARM recommends that the Generic CTI interrupt trigger event is a self-acknowledging trigger event. 
H5.5 CTI registers programmers’ model 
The CTI registers programmers’ model is described in Chapter H8 About the External Debug Registers. The 
following sections contain information specific to the CTI: 
° External debug register resets on page H8-4981. 
° External debug interface register access permissions on page H8-4970. 
° Cross-trigger interface registers on page H8-4979. 
° The individual register descriptions in Cross-Trigger Interface registers on page H9-5076. 
See also Memory-mapped accesses to the external debug interface on page H8-4968. 
H5.5.1 CTI reset 
An External Debug reset resets the CTI. See External debug register resets on page H8-4981 for details of CTI 
register resets. All CTI output triggers and output channels are deasserted on an External Debug reset. 
Note 
An indirect read of an output trigger might not observe the deasserted state until the processor is Cold reset. For 
more information, see Synchronization of changes to the external debug registers on page H8-4964. 
H5.5.2 CTI authentication 
The CTI ignores the state of the IMPLEMENTATION DEFINED authentication interface. This means that: 
° CTITRIGINSTATUS shows the status of the input triggers and CTICHINSTATUS shows the status of the 
input channels, regardless of the value of ExternalNoninvasiveDebugEnabled(). 
Note 
The PE does not generate the Cross-halt trigger event and the Trace macrocell does not generate Generic trace 
external output trigger events when ExternalNoninvasiveDebugEnabled()==FALSE. However, the PE can 
generate Performance Monitors overflow trigger events. 
° The CTI can generate external triggers regardless of the value of 
ExternalInvasiveDebugEnabled(). 
Note 
The PE ignores Debug request and Restart request trigger events when 
ExternalInvasiveDebugEnabled()==FALSE. The Trace macrocell ignores Generic trace external input trigger 
events when ExternalNoninvasiveDebugEnabled()==FALSE. The behavior of Generic CTI interrupt requests 
is part of the IMPLEMENTATION DEFINED handling of these interrupts, but it is permissible for an interrupt 
controller to receive these requests even when ExternalInvasiveDebugEnabled()==FALSE. 
H5.6 Examples 


The CTI is fully programmable and allows for flexible cross-triggering of events within a PE and between PEs in a 
multiprocessor system. For example: 


° The Cross-halt trigger event and the Debug request trigger event can be used for cross-triggering in a 
multiprocessor system. 


° The Cross-halt trigger event and the Generic CTI interrupt trigger event can be used for event-driven debugging 
in a multiprocessor system. 


° The Performance Monitors overflow trigger event and the Debug request trigger event can force entry to 
Debug state on overflow of a Performance Monitors event counter, for event-driven profiling. 


Note 


This does not replace the recommended connection of Performance Monitors overflow trigger events to an 
interrupt controller. Software must be able to program an interrupt on Performance Monitors overflow 
without programming the CTI. ARM recommends that the Performance Monitors overflow signal is directly 
available as a local interrupt source. 








° The Generic trace external input and Generic trace external output trigger events can pass trace events into 
and out of the event logic of the Trace macrocell. They can do this: 


— To pass trace events between Trace macrocells. 


— In conjunction with the Performance Monitors overflow trigger event, to couple the Performance 
Monitors to the Trace macrocell. 


— In conjunction with the Debug request trigger event, to trigger entry to Debug state on a trace event. 


— In conjunction with other CTIs, to signal a trace trigger event onto a CoreSight trace interconnect. 


The following sections describe some examples in more detail: 
° Halting a single PE. 
° Halting all PEs in a group when any one PE halts on page H5-4941. 


° Synchronously restarting a group of PEs on page H5-4941. 
° Halting a single PE on Performance Monitors overflow on page H5-4942. 


Example H5-1 Halting a single PE 


To halt a single PE, set: 
1. CTIGATE[0] to 0, so that the CTI does not pass channel events on internal channel 0 to the CTM. 


2. CTIOUTEN0[0] to 1, so that the CTI generates a Debug request trigger event in response to a channel event 
on channel 0. 


Note 


The Debug request trigger event is output trigger 0, meaning it is controlled by the instance of CTIOUTEN<n> for 
which <n> is 0. 








3. CTIAPPPULSE[0] to 1, to generate a channel event on channel 0. 


When the PE has entered Debug state, clear the Debug request trigger event by writing 1 to CTIINTACK[0], before 
restarting the PE. 
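
The sequence in Example H5-1 can be written down as an ordered list of register writes. The following sketch is illustrative only: the registers are held in a plain dict rather than accessed through the external debug interface, `halt_single_pe` and `clear_debug_request` are hypothetical helpers, and CTIGATE and CTIOUTEN0 are assumed to be readable so that other bits can be preserved.

```python
def halt_single_pe(regs, writes):
    """Apply the three settings of Example H5-1 to one CTI.

    regs   - dict of current register values (an assumed software copy)
    writes - list that records (register, value) writes, in order
    """
    def write(name, value):
        regs[name] = value
        writes.append((name, value))

    # 1. CTIGATE[0] = 0: do not pass channel 0 events to the CTM.
    write("CTIGATE", regs.get("CTIGATE", 0) & ~(1 << 0))
    # 2. CTIOUTEN0[0] = 1: channel 0 generates the Debug request trigger event.
    write("CTIOUTEN0", regs.get("CTIOUTEN0", 0) | (1 << 0))
    # 3. CTIAPPPULSE[0] = 1: generate a channel event on channel 0.
    writes.append(("CTIAPPPULSE", 1 << 0))


def clear_debug_request(writes):
    """After the PE has entered Debug state, acknowledge output trigger 0."""
    writes.append(("CTIINTACK", 1 << 0))


regs = {"CTIGATE": 0b1111, "CTIOUTEN0": 0}
log = []
halt_single_pe(regs, log)
clear_debug_request(log)
```

The read-modify-write in steps 1 and 2 is a choice of this sketch, made so that mappings for other channels are left untouched.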
Example H5-2 Halting all PEs in a group when any one PE halts 


To program a group of PEs so that when one PE in the group halts, all of the PEs in that group halt, set the following 
registers for each PE in the group: 


1. CTIGATE[2] to 1, so that each CTI passes channel events on internal channel 2 to the CTM. 


2. CTIINEN0[2] to 1, so that each CTI generates a channel event on channel 2 in response to a Cross-halt trigger 
event. 


3. CTIOUTEN0[2] to 1, so that each CTI generates a Debug request trigger event in response to a channel 
event on channel 2. 
Note 


The Cross-halt trigger event is input trigger 0, and the Debug request trigger event is output trigger 0, meaning 
they are controlled by the instances of CTIINEN<n> and CTIOUTEN<n>, respectively, for which <n> is 0. 








When a PE has halted, clear the Debug request trigger event by writing 1 to CTIINTACK[0], before 
restarting the PE. 
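
The effect of the settings in Example H5-2 can be checked with a small simulation. This is a sketch under simplifying assumptions: each CTI is a dict of register values, the CTM is modeled as delivering the gated channels of the halting PE to every other CTI, and `new_cti`, `configure_group_halt`, and `debug_requests_after_halt` are hypothetical helpers.

```python
CHANNEL = 2  # the channel number used in Example H5-2


def new_cti():
    """Register values for one CTI, modeled as a plain dict (illustrative only)."""
    return {"CTIGATE": 0, "CTIINEN0": 0, "CTIOUTEN0": 0}


def configure_group_halt(cti):
    cti["CTIGATE"] |= 1 << CHANNEL    # 1. pass channel 2 to the CTM
    cti["CTIINEN0"] |= 1 << CHANNEL   # 2. Cross-halt (input trigger 0) -> channel 2
    cti["CTIOUTEN0"] |= 1 << CHANNEL  # 3. channel 2 -> Debug request (output trigger 0)


def debug_requests_after_halt(ctis, halting_pe):
    """Return the indices of the PEs whose CTI asserts a Debug request trigger event."""
    source = ctis[halting_pe]
    internal = source["CTIINEN0"]            # channels raised by the Cross-halt event
    on_ctm = internal & source["CTIGATE"]    # channels the gate lets out to the CTM
    requested = []
    for index, cti in enumerate(ctis):
        channels = internal if index == halting_pe else on_ctm
        if cti["CTIOUTEN0"] & channels:
            requested.append(index)
    return requested


group = [new_cti() for _ in range(3)]
for cti in group:
    configure_group_halt(cti)
outsider = new_cti()  # a PE that is not part of the group
halted = debug_requests_after_halt(group + [outsider], halting_pe=1)
```

The halting PE appears in the result as well, because its own CTI also asserts the Debug request trigger event. That PE ignores the event because it is already in Debug state, but the debugger still has to acknowledge it.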


Example H5-3 Synchronously restarting a group of PEs 


To restart a group of PEs, for each PE in the group: 


1. If the PE was halted because of a Debug request trigger event, the debugger must ensure the trigger event is 
deasserted. It can do this by: 


a. Writing 1 to CTIINTACK[0] to clear the Debug request trigger event. 


b. Polling CTITRIGOUTSTATUS[0], until it reads as 0, to confirm that the trigger event has been 
deasserted. 


2. Set CTIGATE[1] to 1, so that each CTI passes channel events on internal channel 1 to the CTM. 


3. Set CTIOUTEN1[1] to 1, so that each CTI generates a Restart request trigger event in response to a channel 
event on channel 1. 


Note 
This example must use the instance of CTIOUTEN<n> for which <n> is 1. 


Set CTIAPPPULSE[1] to 1 on any one PE in the group, to generate a channel event on channel 1. 
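
The restart sequence in Example H5-3 can be sketched as follows. This is an illustration under stated assumptions, not a debugger implementation: `FakeCti` is a hypothetical stand-in in which the acknowledge takes effect immediately, and `prepare_restart`, `restart_group`, and `max_polls` are names invented for the example.

```python
class FakeCti:
    """Stand-in for one CTI; a write to CTIINTACK deasserts the trigger at once."""

    def __init__(self):
        self.regs = {"CTIGATE": 0, "CTIOUTEN1": 0, "CTITRIGOUTSTATUS": 0b1}
        self.pulses = []

    def read(self, name):
        return self.regs[name]

    def write(self, name, value):
        if name == "CTIINTACK":
            self.regs["CTITRIGOUTSTATUS"] &= ~value
        elif name == "CTIAPPPULSE":
            self.pulses.append(value)
        else:
            self.regs[name] = value


def prepare_restart(cti, max_polls=100):
    # Clear any Debug request trigger event and confirm it is deasserted.
    cti.write("CTIINTACK", 1 << 0)
    for _ in range(max_polls):
        if (cti.read("CTITRIGOUTSTATUS") & 0b1) == 0:
            break
    else:
        raise TimeoutError("Debug request trigger event still asserted")
    # Pass channel 1 to the CTM and map it to the Restart request trigger event.
    cti.write("CTIGATE", cti.read("CTIGATE") | (1 << 1))
    cti.write("CTIOUTEN1", cti.read("CTIOUTEN1") | (1 << 1))


def restart_group(ctis):
    for cti in ctis:
        prepare_restart(cti)
    # One channel event on channel 1, from any one CTI, reaches the whole group.
    ctis[0].write("CTIAPPPULSE", 1 << 1)


group = [FakeCti() for _ in range(3)]
restart_group(group)
```

All CTIs are prepared before the pulse is sent, so that no PE in the group misses the single channel event.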
Example H5-4 Halting a single PE on Performance Monitors overflow 
To halt a single PE on a Performance Monitors overflow set: 
1. CTIGATE[3] to 0, so that the CTI does not pass channel events on internal channel 3 to the CTM. 
2. CTIINEN1[3] to 1, so that the CTI generates a channel event on channel 3 in response to a Performance 
Monitors overflow trigger event. 
Note 
This step of this example must use the instance of CTIINEN<n> for which <n> is 1. 
3. CTIOUTEN0[3] to 1, so that the CTI generates a Debug request trigger event in response to a channel event 
on channel 3. 
Note 
This step of this example must use the instance of CTIOUTEN<n> for which <n> is 0. 
When the PE has entered Debug state, clear the Debug request trigger event by writing 1 to CTIINTACK[0], before 
restarting the PE. Clear the overflow status by writing to PMOVSCLR_EL0. 
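
The settings in Example H5-4 can be checked in the same style. The following sketch is illustrative only: the CTI registers are held in a plain dict, and `configure_pmu_halt` and `on_pmu_overflow` are hypothetical helpers that model only the mapping functions described in this chapter.

```python
CHANNEL = 3  # the channel number used in Example H5-4


def configure_pmu_halt(regs):
    # 1. CTIGATE[3] = 0: keep channel 3 events inside this CTI.
    regs["CTIGATE"] = regs.get("CTIGATE", 0) & ~(1 << CHANNEL)
    # 2. CTIINEN1[3] = 1: Performance Monitors overflow (input trigger 1) -> channel 3.
    regs["CTIINEN1"] = regs.get("CTIINEN1", 0) | (1 << CHANNEL)
    # 3. CTIOUTEN0[3] = 1: channel 3 -> Debug request (output trigger 0).
    regs["CTIOUTEN0"] = regs.get("CTIOUTEN0", 0) | (1 << CHANNEL)


def on_pmu_overflow(regs):
    """Return (debug_request_asserted, channels_seen_by_ctm) for input trigger 1."""
    channels = regs.get("CTIINEN1", 0)
    debug_request = bool(channels & regs.get("CTIOUTEN0", 0))
    return debug_request, channels & regs.get("CTIGATE", 0)


regs = {"CTIGATE": 0b1111}
configure_pmu_halt(regs)
result = on_pmu_overflow(regs)
```

In this model the overflow halts the local PE, and no channel event reaches the CTM, so other PEs are unaffected.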
Chapter H6 


Debug Reset and Powerdown Support 


This chapter describes the reset and powerdown support in the Debug architecture. It contains the following 
sections: 


° About Debug over powerdown on page H6-4944. 
° Power domains and debug on page H6-4945. 
° Core power domain power states on page H6-4946. 
° Emulating low-power states on page H6-4949. 
° Debug OS Save and Restore sequences on page H6-4951. 
° Reset and debug on page H6-4955. 


Note 





Where necessary, Table K12-1 on page K12-5660 disambiguates the general register references used in this chapter. 
H6.1 About Debug over powerdown 


Debug over powerdown is a facility for an operating system to save and restore the PE state on behalf of a 
self-hosted or external debugger or both. 


For external debug over powerdown, the architecture defines the OS Lock, OS Double Lock, and the logical split 
of the hardware on which a PE executes into the Core power domain and the Debug power domain. See: 


° Power domains and debug on page H6-4945. 


° Core power domain power states on page H6-4946. 
H6.2 Power domains and debug 


The external debug component of ARMv8-A has two logical power domains, each with its own reset: 


° The debug power domain contains the interface between the PE and the external debugger, and is powered 
up whenever an external debugger is connected to the SoC. It remains powered up while the external 
debugger is connected. Registers in this domain are reset by an external debug reset. 


° The core power domain contains the rest of the PE, and is allowed to power up and power down 
independently of the Debug power domain. 


The core power domain contains several types of registers: 


° Non-debug logic refers to all registers and logic that are not associated with debug. 

° Self-hosted debug logic refers to registers and logic associated solely with the self-hosted debug aspects of 
the architecture. 

° Shared debug logic refers to registers and logic associated with both the self-hosted and external debug 
aspects of the architecture. 

° External debug logic refers to registers and logic associated solely with the external debug aspects of the 
architecture. 





Note 


° The model of two logical power domains has an impact on the reset and access permission requirements of 
the debug programmers’ model. 


° The power domains are described as logical because the architecture defines the requirements but does not 
require two physical power domains. Any power domain split that meets the requirements of the 
programmers’ model is a valid implementation. 





Figure H6-1 shows the recommended power domain split. The signals DBGPWRUPREQ, DBGNOPWRDWN, 
and DBGPWRDUP shown in Figure H6-1 provide an interface between the power controller and the PE debug 
logic that is in the debug power domain. They are part of the recommended interface. See Appendix K2 
Recommended External Debug Interface. 
Figure H6-1 Recommended power domain split between core and debug power domains 


H6.3 Core power domain power states 


The ARM architecture does not define the power states of the PE as these are not normally visible to software. 
However, they are visible to the external debugger. The Debug architecture uses a four logical power states model 
for the core power domain. The four logical power states are as follows: 


Normal The core power domain is fully powered up and the debug registers are accessible. 


Standby The core power domain is on, but there are measures to reduce energy consumption. In a typical 
implementation, the PE enters standby by executing a WFI or WFE instruction, and exits on a wake-up 
event. There can be other IMPLEMENTATION DEFINED measures the OS can take to enter standby. 


The PE preserves the PE state, including the debug logic state. Changing from standby to normal 
operation does not involve a reset of the PE. 


Standby is the least invasive OS energy saving state. Standby implies only that the PE is unavailable 
and does not clear any debug settings. For standby, the Debug architecture requires only the 
following: 


° An External Debug Request debug event is a wake-up event when halting is allowed. This 
means that the PE must exit standby to handle the debug event. If the PE executed a WFE or a 
WFI instruction to enter standby, then it retires that instruction. 


° If the external debug interface is accessed, the PE must respond to that access. ARM 
recommends that, if the PE executed a WFI or WFE instruction to enter standby, then it does not 
retire that instruction. 


Standby is transparent, meaning that to software and to an external debugger it is indistinguishable 
from normal operation. 


Retention The OS takes some measures, including IMPLEMENTATION DEFINED code sequences and registers, 
to reduce energy consumption. The PE state, including debug settings, is preserved in low-power 
structures, allowing the core power domain to be at least partially turned off. 


Changing from low-power retention to normal operation does not involve a reset of the PE. The 
saved PE state is restored on changing from low-power retention state to normal operation. If 
software has to use an IMPLEMENTATION DEFINED code sequence before entering, or after leaving, 
a retention state, this is referred to as a software-visible retention state. It is IMPLEMENTATION 
DEFINED whether the value of DBGPRCR.CORENPDRQ is set to the value of 
EDPRCR.COREPURQ on leaving the software-visible retention state. 


External Debug Request debug events stay pending and debug registers in the core power domain 
cannot be accessed. 
Note 


° This model of retention does not include implementations where the PE exits the state in 
response to a debug register access. From the Debug architecture perspective, 
implementations like this are forms of standby. 





Powerdown The OS takes some measures to reduce energy consumption by turning the core power domain off. 
These measures must include the OS saving any PE state, including the debug settings, that must be 
preserved over powerdown. Changing from powerdown to normal operation must include: 


° A Cold reset of the PE after the power level has been restored. 
° The OS restoring the saved PE state. 


External Debug Request debug events stay pending and debug registers in the core power domain 
cannot be accessed. 


An implementation might support enabling and disabling threads, either dynamically or once at reset time. Threads 
that are disabled in this way must appear to the external debugger as either: 


° Powered off, meaning they are either: 
— In a powerdown state. 
— In a retention state. 

° Held in Reset state. 


The Debug architecture uses a simpler two states model for the Debug power domain. The two states are: 
Off The debug power domain is turned off. 


On The debug power domain is turned on. 


The available power states, including the cross-product of core power domain and debug power domain power 
states, are IMPLEMENTATION DEFINED. Implementations are not required to implement all of these states and might 
include additional states. These additional states must appear to the debugger as one of the logical power states 
defined by this model. The control of power states is IMPLEMENTATION DEFINED. 





Note 


As a result, it is IMPLEMENTATION DEFINED whether it is possible for the debug power domain to be on when the 
core power domain is off. 





EDPRSR.{DLK, SPD, PU} and the Core power domain 


EDPRSR.{DLK, SPD, PU} describe whether registers in the Core power domain can be accessed, and whether their 
state has been lost since the last time the register was read. Table H6-1 shows how to interpret the values of 
EDPRSR.{DLK, SPD, PU}. 


Table H6-1 Interpretation of the EDPRSR.{DLK, SPD, PU} bits

EDPRSR                  Core power domain
DLK      SPD      PU    Power      Accesses  State lost  Notes
0        0        1     On         OK        No          -
0        1        1     On         OK        Yes         SPD is cleared to 0 following the read.
1        UNKNOWN  1     On         Error     Not known   OS double-lock is locked. Software locks the
                                                         OS double-lock before removing power.
UNKNOWN  1        0     Off        Error     Yes         A Cold reset will be asserted on exiting
UNKNOWN  0        0     Not known  Error     Not known   powerdown state, but not on exiting
                                                         low-power retention state.
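
The interpretation in Table H6-1 can be sketched as a small decoder that a debugger might apply after reading EDPRSR. This is a minimal illustration, not part of the architecture: the function name, the `CoreDomainStatus` type, and the use of `None` for an UNKNOWN field or a "Not known" result are all assumptions made for the example. It takes already-extracted field values, so it does not depend on the bit positions of the fields.

```python
from typing import NamedTuple, Optional


class CoreDomainStatus(NamedTuple):
    power: str                   # "On", "Off", or "Not known"
    accessible: bool             # True if Core power domain registers can be accessed
    state_lost: Optional[bool]   # None means "Not known"


def interpret_edprsr(dlk: Optional[int], spd: Optional[int], pu: int) -> CoreDomainStatus:
    """Decode EDPRSR.{DLK, SPD, PU}. None stands for an UNKNOWN field value."""
    if pu == 1:
        if dlk == 1:
            # OS double-lock is locked: accesses return errors.
            return CoreDomainStatus("On", False, None)
        # Powered up and unlocked: SPD reports whether state was lost.
        return CoreDomainStatus("On", True, None if spd is None else spd == 1)
    # PU == 0: DLK is UNKNOWN and accesses return errors.
    if spd == 1:
        return CoreDomainStatus("Off", False, True)
    return CoreDomainStatus("Not known", False, None)
```

A caller would typically retry or wait when `accessible` is false, and re-program the debug registers when `state_lost` is true.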





EDPRSR.SPD when the Core domain is in either retention or powerdown state 


When the Core power domain is in either the retention or powerdown state, EDPRSR.SPD is not cleared following 
a read of EDPRSR and it is IMPLEMENTATION DEFINED whether: 


° EDPRSR.SPD shows whether the state of the debug registers in the Core power domain has been lost since
the last time that EDPRSR was read. This means that:

— When the Core power domain is in the powerdown state, EDPRSR.SPD is RAO, which indicates that
the state of the debug registers has been lost.

— When the Core power domain is in the retention state, EDPRSR.SPD indicates whether the state of the
debug registers was lost before the Core power domain entered retention state.

° EDPRSR.SPD is RAZ, and:

— On leaving the powerdown state, EDPRSR.SPD is set to 1, which indicates that the state of the debug
registers has been lost.

— On leaving the retention state, EDPRSR.SPD reverts to the value it had on entering the retention state.


Note 
ARM recommends that an implementation makes EDPRSR.SPD fixed as RAO when in the powerdown state,
particularly if it does not support a low-power retention state.





H6.3.3 EDPRSR.{DLK, R} and reset state 


EDPRSR.R is UNKNOWN when EDPRSR.DLK is set to 1. OSDLR_EL1.DLK is cleared to 0 by a reset. If the Core 
power domain is powered up and entered reset state with the OS double-lock locked, it is CONSTRAINED 
UNPREDICTABLE whether a read of EDPRSR while the PE is in reset state returns: 


° EDPRSR.{DLK, R, PU} == {1, UNKNOWN, 1} indicating that the OS double-lock is locked. 
° EDPRSR.{DLK, R, PU} == {0, 1, 1} indicating that the PE is in reset state. 


° EDPRSR.{DLK, R, PU} == {UNKNOWN, UNKNOWN, 0} indicating that the registers in the Core power 
domain cannot be accessed because the OS double-lock is locked. 


If the PE was powered up and the OS double-lock was unlocked when the PE was reset, then EDPRSR.{DLK, R, 
PU} reads as {0, 1, 1} while the PE is in reset state. 


On leaving reset state, EDPRSR.{DLK, R} reads as {0, 0}.

H6.4 Emulating low-power states


The control registers DBGPRCR.CORENPDRQ and EDPRCR.COREPURQ provide an interface between the 
power controller and the PE. They typically map directly to signals in the recommended external debug interface. 
With this interface the external debugger can request the power controller to: 


° Emulate states where the core power domain is completely off or in a low-power state where the core power
domain registers cannot be accessed. This simplifies the requirements on software by sacrificing entirely 
realistic behavior. 


° Restore full power to the core power domain. 


EDPRSR.{SPD, PU} indicates the core power domain power state. For more information see:

° DBGPRCR_EL1, Debug Power Control Register on page D7-2166 and DBGPRCR, Debug Power Control
Register on page G6-4726.

° EDPRCR, External Debug Power/Reset Control Register on page H9-5051.

° EDPRSR, External Debug Processor Status Register on page H9-5054.

° Appendix K2 Recommended External Debug Interface.


The measures to emulate powerdown are IMPLEMENTATION DEFINED. The ability of the debugger to access the state 
of the PE and the system might be limited as a result of the measures adopted. 


In an emulated powerdown state, the debugger must be able to access all debug registers in both the debug power 
domain and the core power domain as if the core power domain were on. That is, the debugger must be able to read 
and write to such registers without receiving errors. This allows an external debugger to debug the powerup 
sequence. To stop the OS Double Lock preventing access to debug registers when powerdown is being emulated, 
DoubleLockStatus() == FALSE when DBGPRCR.CORENPDRQ == 1. 


Otherwise, the behavior of the PE in emulated powerdown must be similar to that in a real powerdown state. In 
particular, the PE must not respond to other system stimuli, such as interrupts. 


Example H6-1 and Example H6-2 are examples of two approaches to emulating powerdown. 


Example H6-1 An example of emulating powerdown 


The PE is held in Standby state, isolated from any system stimuli. It is IMPLEMENTATION DEFINED whether the PE 
can respond to debug stimuli such as an External Debug Request debug event. 


If the PE can enter Debug state, then the external debugger is able to use the ITR to execute instructions, such as 
loads and stores. This causes the external debugger to interact with the system. If the external debugger restarts the 
PE, the PE leaves Standby state and restarts fetching instructions from memory. 


Example H6-2 Another example of emulating powerdown 


The PE is held in Warm reset. This limits the ability of an external debugger to access the resources of the PE. For 
example, the PE cannot be put into Debug state. 


On exit from emulated powerdown the PE is reset. However, the debug registers that are only reset by a Cold reset 
must not be reset. Typically this means that a Warm reset is substituted for the Cold reset.

Note

° Warm reset and Cold reset have different effects apart from resetting the debug registers. In particular,
RMR_ELx is reset by a Cold reset and controls the reset state on a Warm reset. This means that if a Cold reset
is substituted by a Warm reset, the behavior of the reset code might be different.

° The timing effects of powering down are typically not factored in the powerdown emulation. Examples of
these timing effects are clock and voltage stabilization.

° Emulation does not model the state lost during powerdown, meaning that it might mask errors in the state
storage and recovery routines.

H6.5 Debug OS Save and Restore sequences


In ARMv8-A, the following registers provide the OS Save and Restore mechanism:

° The OS Lock Access Register, OSLAR, locks the OS Lock to restrict access to debug registers before starting
an OS Save sequence, and unlocks the OS Lock after an OS Restore sequence.

° The OS Lock Status Register, OSLSR, shows the status of the OS Lock.

° The External Debug Execution Control Register, EDECR, can be configured to generate a debug event when
the OS Lock is unlocked.

° The OS Double Lock Register, OSDLR, locks out an external debug interface entirely. This is only used
immediately before a powerdown sequence.

See also:

° Reset and debug on page H6-4955
° Appendix K8 Example OS Save and Restore Sequences


H6.5.1 Debug registers to save over powerdown 


Table H6-2 shows the different requirements for self-hosted debug over powerdown and external debug over 
powerdown: 


° The column labeled Self-hosted lists registers that software must preserve over powerdown so that it can
support self-hosted debug over powerdown. This does not require use of the OS Save and Restore
mechanism.

° The column labeled External lists registers that software must preserve over powerdown so that it can support
external debug over powerdown. This requires use of the OS Save and Restore mechanism:

— Some external debug registers are not normally accessible to software executing on the PE. Additional
debug registers are provided that give software the required access to save and restore these external
debug registers when OSLSR.OSLK is locked. These registers include OSECCR, OSDTRRX, and
OSDTRTX.

Some registers might only be present in some implementations, or might not be accessible at all Exception levels
or in Non-secure state. DBGVCR32_EL2 and SDER32_EL3 are only required to support AArch32.


Table H6-2 does not include registers for the OPTIONAL Trace and Performance Monitor extensions. 


Table H6-2 Debug registers to save over powerdown

Register in AArch64 state  Register in AArch32 state  Self-hosted  External
MDSCR_EL1                  DBGDSCRext                 Yes          Yes [a]
DBGBVR<n>_EL1              DBGBVR<n>                  Yes          Yes
DBGBCR<n>_EL1              DBGBCR<n>                  Yes          Yes
DBGWVR<n>_EL1              DBGWVR<n>                  Yes          Yes
DBGWCR<n>_EL1              DBGWCR<n>                  Yes          Yes
DBGVCR32_EL2               DBGVCR                     Yes          -
MDCR_EL2                   HDCR                       Yes          -
SDER32_EL3                 SDER                       Yes          -
MDCR_EL3                   SDCR                       Yes [b]      -
MDCCINT_EL1                DBGDCCINT                  -            Yes [b]
DBGCLAIMSET_EL1,           DBGCLAIMSET,               -            Yes [c]
DBGCLAIMCLR_EL1            DBGCLAIMCLR
OSECCR_EL1                 DBGOSECCR                  -            Yes [a]
OSDTRRX_EL1                DBGDTRRXext                -            Yes
OSDTRTX_EL1                DBGDTRTXext                -            Yes

a. The OS Lock must be locked to save and restore for external debug. When the OS Lock is locked, DSCR is part of the
software save and restore mechanism for external debug. It provides a mechanism for an operating system to access some
fields of EDSCR that are otherwise read-only or not visible to software. This allows the operating system to save and
restore these settings over a powerdown for the external debugger.
b. This register is new in ARMv8-A. Sequences written for ARMv7 do not preserve the register over powerdown.
c. Read DBGCLAIMCLR to save, write DBGCLAIMSET to restore.

H6.5.2 OS Save sequence

To preserve the debug logic state over a powerdown, the state must be saved to nonvolatile storage. This means the
OS Save sequence must:

1. Lock the OS Lock by:
° Writing the key value 0xC5ACCE55 to the DBGOSLAR in AArch32 state.
° Writing 1 to OSLAR_EL1.OSLK in AArch64 state.
2. Execute an ISB instruction.
3. Walk through the debug registers listed in Debug registers to save over powerdown on page H6-4951 and
save the values to the nonvolatile storage.

Before removing power from the core power domain, software must:

1. Lock the OS Double Lock by writing 1 to OSDLR_EL1.DLK.
2. Execute a Context synchronization event.
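
The ordering of the OS Save sequence can be illustrated with a small model. This is a sketch only: real software performs these steps with System register writes and an ISB instruction, whereas here `read_reg`, `write_reg`, and `isb` are hypothetical callbacks and `FakePE` is a toy stand-in that records the order of operations. Register names are passed as strings for illustration, and the example register list is deliberately incomplete.

```python
from typing import Callable, Dict, Iterable


def os_save(read_reg: Callable[[str], int],
            write_reg: Callable[[str, int], None],
            isb: Callable[[], None],
            registers: Iterable[str],
            aarch64: bool = True) -> Dict[str, int]:
    """Steps 1 to 3: lock the OS Lock, synchronize, then save each register."""
    if aarch64:
        write_reg("OSLAR_EL1", 1)           # OSLK = 1 locks the OS Lock
    else:
        write_reg("DBGOSLAR", 0xC5ACCE55)   # the key value locks the OS Lock
    isb()
    return {name: read_reg(name) for name in registers}


def prepare_for_power_removal(write_reg: Callable[[str, int], None],
                              isb: Callable[[], None]) -> None:
    """Lock the OS Double Lock, then synchronize the update."""
    write_reg("OSDLR_EL1", 1)               # DLK = 1
    isb()                                   # ISB used as the Context synchronization event


class FakePE:
    """Toy register file that logs every operation (illustrative only)."""

    def __init__(self, regs: Dict[str, int]) -> None:
        self.regs = dict(regs)
        self.log = []

    def read(self, name: str) -> int:
        self.log.append(("read", name))
        return self.regs[name]

    def write(self, name: str, value: int) -> None:
        self.log.append(("write", name, value))
        self.regs[name] = value

    def isb(self) -> None:
        self.log.append(("isb",))


pe = FakePE({"MDSCR_EL1": 0x8000, "DBGBVR0_EL1": 0x1234, "DBGBCR0_EL1": 0x5})
saved = os_save(pe.read, pe.write, pe.isb,
                ["MDSCR_EL1", "DBGBVR0_EL1", "DBGBCR0_EL1"])
prepare_for_power_removal(pe.write, pe.isb)
```

The log makes the required ordering visible: the lock write and ISB come before any register is read, and the OS Double Lock write is followed by its own synchronization before power can be removed.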

H6.5.3 OS Restore sequence

After a powerdown, the OS Restore sequence must perform the following steps to restore the debug logic state from
the nonvolatile storage:

1. Lock the OS Lock, as described in OS Save sequence. The OS Lock is generally locked by the Cold reset,
but this step ensures that it is locked.
2. Execute an ISB instruction.
3. To ensure that, if an external debugger clears the OS Lock before the end of this sequence, no debug
exceptions are generated:
° Write 0 to MDSCR_EL1 if executing in AArch64 state.
° Write 0 to DBGDSCRext if executing in AArch32 state.
4. Walk through the debug registers listed in Debug registers to save over powerdown on page H6-4951, and
restore the values from the nonvolatile storage. The last register to be restored must be:
° MDSCR_EL1 if executing in AArch64 state.
° DBGDSCRext if executing in AArch32 state.
5. Execute an ISB instruction.
6. Unlock the OS Lock by:
° Writing any non-key value to DBGOSLAR if executing in AArch32 state.
° Writing 0 to OSLAR_EL1.OSLK if executing in AArch64 state.
7. Execute a Context synchronization event.
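
The seven restore steps can be modeled in the same spirit. The following is a minimal sketch under stated assumptions: `write_reg` and `isb` are hypothetical callbacks standing in for System register writes and the ISB instruction, and `saved` is a dictionary keyed by the name of the register to write. The point of the example is the ordering, in particular that MDSCR_EL1 (or DBGDSCRext) is zeroed first and restored last.

```python
from typing import Callable, Dict


def os_restore(write_reg: Callable[[str, int], None],
               isb: Callable[[], None],
               saved: Dict[str, int],
               aarch64: bool = True) -> None:
    if aarch64:
        lock_reg, lock_value, unlock_value, dscr = "OSLAR_EL1", 1, 0, "MDSCR_EL1"
    else:
        # 0 is one example of a non-key value for DBGOSLAR.
        lock_reg, lock_value, unlock_value, dscr = "DBGOSLAR", 0xC5ACCE55, 0, "DBGDSCRext"

    write_reg(lock_reg, lock_value)      # 1. lock the OS Lock
    isb()                                # 2. synchronize
    write_reg(dscr, 0)                   # 3. no debug exceptions if unlocked early
    for name, value in saved.items():    # 4. restore everything except the DSCR...
        if name != dscr:
            write_reg(name, value)
    write_reg(dscr, saved[dscr])         #    ...which must be restored last
    isb()                                # 5. synchronize
    write_reg(lock_reg, unlock_value)    # 6. unlock the OS Lock
    isb()                                # 7. Context synchronization event


log = []
saved = {"MDSCR_EL1": 0x8000, "DBGBVR0_EL1": 0x1234, "DBGBCR0_EL1": 0x5}
os_restore(lambda name, value: log.append((name, value)),
           lambda: log.append("isb"),
           saved)
```

For the claim tag registers, the saved value is read from DBGCLAIMCLR but written back through DBGCLAIMSET, so a real save routine would store it under the register it intends to write on restore.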


Note 


The OS Restore sequence overwrites the debug registers with the values that were saved. If there are valid values 
in these registers immediately before the restore sequence, then those values are lost. 








H6.5.4 Debug behavior when the OS Lock is locked 


The main purpose of the OS Lock is to prevent updates to debug registers during an OS Save or OS Restore 
operation. The OS Lock is locked on a Cold reset. 


When the OS Lock is locked: 


° Access to debug registers through the System register interface is mainly unchanged except that: 
— Certain registers are read and written without side-effects. 
— Fields in DSCR and OSECCR that are normally read-only become read/write. 


This allows the state to be saved or restored. For more information, see the relevant register description in 
Chapter H9 External Debug Register Descriptions. 


° Access to debug registers by the external debug interface is restricted to prevent an external debugger 
modifying the registers that are being saved or restored. For more information see External debug interface 
register access permissions summary on page H8-4972. 


° Debug exceptions, other than Breakpoint Instruction exceptions, are not generated.
° Breakpoint and Watchpoint debug events are not generated. The OS Lock has no effect on Breakpoint
Instruction exceptions and other debug events.

H6.5.5 Debug behavior when the OS Lock is unlocked

When the OS Lock is unlocked, an OS Unlock Catch debug event is generated if EDECR.OUCE is set to 1. See
OS Unlock Catch debug event on page H3-4901.

H6.5.6 Debug behavior when the OS Double Lock is locked


The OS Double Lock is locked immediately before a powerdown sequence. The OS Double Lock ensures that it is 
safe to remove core power by forcing the debug interfaces to be quiescent. 


When DoubleLockStatus() == TRUE: 


° The external debug interface only has restricted access to the debug registers, so that it is quiescent before
removing power. See External debug interface register access permissions summary on page H8-4972.

° Debug exceptions, other than Breakpoint Instruction exceptions, are not generated.
° Halting is prohibited. See Halting allowed and halting prohibited on page H2-4845.

Note
Pending Halting debug events might be lost when core power is removed.

° No asynchronous debug events are WFI or WFE wake-up events.


Software must synchronize the update to OSDLR before it indicates to the system that core power can be removed. 
The interface between the PE and its power controller is IMPLEMENTATION DEFINED.
Typically software indicates that core power can be removed by entering the Wait For Interrupt state. This means
that software must explicitly synchronize the OSDLR update before issuing the WFI instruction.


OSDLR.DLK is ignored and DoubleLockStatus() == FALSE if either: 
° The PE is in Debug state. 
° DBGPRCR.CORENPDRQ is set to 1. 
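
The two override conditions above reduce to a single predicate. The following sketch models DoubleLockStatus() as a pure function of three inputs; the Python function name and its parameters are illustrative stand-ins for OSDLR.DLK, DBGPRCR.CORENPDRQ, and whether the PE is in Debug state.

```python
def double_lock_status(dlk: int, corenpdrq: int, in_debug_state: bool) -> bool:
    """Return True only when the OS Double Lock is effectively locked."""
    # DLK is ignored while powerdown is being emulated or the PE is in Debug state.
    return dlk == 1 and corenpdrq == 0 and not in_debug_state
```

This makes the consequence described in the text easy to see: an external debugger that sets CORENPDRQ to 1 forces the result to FALSE regardless of DLK.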





Note 


It is possible to enter Debug state with OSDLR.DLK set to 1. This is because a Context synchronization event is 
required to ensure the OS Double Lock is locked, meaning that Debug state might be entered before the OSDLR 
update is synchronized. 





Because OSDLR.DLK is ignored when DBGPRCR.CORENPDRQ is set to 1, and an external debugger can write 
to DBGPRCR.CORENPDRQ, software must not rely on using the OS Double Lock to disable debug exceptions or 
to prohibit halting, or both. ARM deprecates use of the OS Double Lock for these purposes, and instead 
recommends that software: 


° Uses the OS Lock to disable debug exceptions during save or restore sequences. 


° Uses the debug authentication interface to prohibit halting and external debug access to debug registers at 
times other than immediately prior to removing power. 


As the purpose of the OS Double Lock is to ensure that it is safe to remove core power, it is important to avoid race 
conditions that defeat this purpose. ARM recommends that: 


° Once the write to OSDLR.DLK has been synchronized by a Context synchronization event and 
DoubleLockStatus() == TRUE, a PE must: 


— Not allow a debug event generated before the Context synchronization event to cause an entry to 
Debug state or act as a wake-up event for a WFI or WFE instruction after the Context synchronization 
event has completed. 


— Complete any external debug access started before the Context synchronization event by the time the 
Context synchronization event completes. 


Note 


A debug register access might be in progress when software sets OSDLR.DLK to 1. An 
implementation must not permit the synchronization of locking the OS Double Lock to stall 
indefinitely while waiting for that access to complete. This means that any debug register access that 
is in progress when software sets OSDLR.DLK to 1 must complete or return an error in finite time. 








° If a write to DBGPRCR or EDPRCR made when OSDLR.DLK == 1 changes DBGPRCR.CORENPDRQ or
EDPRCR.CORENPDRQ from 1 to 0, meaning DoubleLockStatus() changes from FALSE to TRUE, then
before signaling to the system that the CORENPDRQ field has been cleared and emulation of powerdown is
no longer requested, meaning the system can remove core power, the PE must ensure that all the requirements
for DoubleLockStatus() == TRUE listed in this section are met.


In the standard OS Save sequence, the OS Lock is locked before the OS Double Lock is locked. This means that 
writes to CORENPDRQ are ignored by the time the OS Double Lock is locked. However, if DoubleLockStatus() == 
FALSE, an external debugger can clear the OS Lock at any time, and then write to EDPRCR.

H6.6 Reset and debug


All registers in the Core power domain are reset either by a Warm reset or a Cold reset, as described in Reset on 
page D1-1517, including external debug logic registers. 


All registers in the Debug power domain are reset by an External Debug reset. 


Figure H6-2 shows this reset scheme. The following three reset signals are an example implementation of the reset 
scheme: 


° CORERESET, which must be asserted for a Warm reset. 
° CPUPORESET, which must be asserted for a Cold reset. 
° PRESETDBG, which must be asserted for an External Debug reset. 


As shown in the figure, the external debug logic is split between the Debug power domain and the Core power 
domain. 


[Figure H6-2: block diagram of the power and reset domains.
Debug power domain: part of the external debug logic, reset by External debug reset (PRESETDBG).
Core power domain: part of the external debug logic, including the external debug registers, reset by Cold
reset (CPUPORESET); and the non-debug logic, self-hosted debug logic, and shared debug logic, reset by Warm
reset (CORERESET OR CPUPORESET).]

Figure H6-2 Power and reset domains

For more information about power domains and power states, see Power domains and debug on page H6-4945.

When power is first applied to the Debug power domain, PRESETDBG must be asserted.

When power is first applied to the Core power domain, CPUPORESET must be asserted.


Note 


In this scheme, logic in the Warm reset domain is reset by asserting either CORERESET or CPUPORESET. This 
implies a particular implementation style that permits these approaches. 








CPUPORESET is not normally asserted on moving from a low-power state, where power has not been removed, 
to a full-power state. This can occur, for example, on exiting a low-power retention state. See also Emulating 
low-power states on page H6-4949 and EDPRSR, External Debug Processor Status Register on page H9-5054. 


H6.6.1 External debug interface accesses to registers in reset

If a reset signal is asserted and the external debug interface:


° Writes a register, or indirectly writes a register or register field as a side-effect of an access: 


— Then, if the register or register field is reset by that reset signal, it is CONSTRAINED UNPREDICTABLE 
whether the register or register field takes the reset value or the value written. The reset value might 
be UNKNOWN. 


— Otherwise the register or register field takes the value that is written.

° Reads a register, or indirectly reads a register or register field, as part of an access:

— Then, if the register or register field is reset by that reset signal, the value returned is UNKNOWN.

— Otherwise, the value of the register or register field is returned.


It is IMPLEMENTATION DEFINED whether any register can be accessed when External Debug reset is being asserted. 
The result of these accesses is IMPLEMENTATION DEFINED.

Chapter H7
The PC Sample-based Profiling Extension 


This chapter describes the PC Sample-based Profiling Extension, that is an OPTIONAL extension to the ARMv8 
architecture. The extension provides a non-invasive external debug component. 


Note 


This form of the PC Sample-based Profiling Extension is OPTIONAL. ARM recommends that if the EDPCSR is not 
implemented then an alternative IMPLEMENTATION DEFINED form of PC Sample-based profiling is implemented. 








It contains the following section: 
° Sample-based profiling of the PC on page H7-4958.

H7.1 Sample-based profiling of the PC


The PC Sample-based Profiling Extension is an OPTIONAL extension to the architecture. It provides a mechanism 
for coarse-grained profiling of software executing on the PE by an external debugger, without changing the behavior 
of that software. The following sections describe this extension: 


° The implemented PC Sample-based Profiling registers.
° Reads of the EDPCSRs.
° Reads of the EDVIDSR on page H7-4959.
° Accuracy of sampling on page H7-4959.
° PC Sample-based Profiling and security on page H7-4960.
° Pseudocode description of PC Sample-based Profiling on page H7-4960.


H7.1.1 The implemented PC Sample-based Profiling registers 


An implementation that includes the PC Sample-based Profiling Extension implements the following external 
debug registers: 


° EDPCSR is a 64-bit read-only register that contains a sampled program counter value. As external debug
register accesses are atomic only at word granularity, EDPCSR is split into two registers: EDPCSRhi and
EDPCSRlo. See Reads of the EDPCSRs.

° EDCIDSR is a read-only register that contains the sampled value of CONTEXTIDR_EL1 indirectly read on
reading EDPCSRlo.

Note
If EL3 is implemented and using AArch32, then CONTEXTIDR is a Banked register and EDCIDSR samples
the current Banked copy of CONTEXTIDR.

° EDVIDSR is a read-only register that contains sampled values captured on reading EDPCSRlo. If neither EL3
nor EL2 is implemented and the implementation includes the PC Sample-based Profiling Extension then
EDVIDSR can be implemented as a fixed value register.


H7.1.2 Reads of the EDPCSRs 


A read of the EDPCSRlo normally has the side-effect of indirectly writing to EDCIDSR, EDVIDSR, and
EDPCSRhi. When EDPCSRlo is read, the bottom 32 bits of a program counter sample are returned. The top 32 bits
are captured in EDPCSRhi and can be read later. However:


° If the PE is in Debug state, or PC Sample-based Profiling is prohibited, EDPCSRlo reads as 0xFFFFFFFF and
EDCIDSR, EDVIDSR, and EDPCSRhi become UNKNOWN. See PC Sample-based Profiling and security on
page H7-4960.

° If the PE is in Reset state, the sampled value is UNKNOWN, and EDCIDSR, EDVIDSR, and EDPCSRhi
become UNKNOWN.

° If no instruction has been retired since the PE left Reset state, Debug state, or a state where PC Sample-based
Profiling is prohibited, the sampled value is UNKNOWN and EDCIDSR, EDVIDSR, and EDPCSRhi become
UNKNOWN.

° The indirect writes to EDCIDSR, EDVIDSR, and EDPCSRhi might not occur for a memory-mapped access
to the external debug interface. For more information, see Memory-mapped accesses to the external debug
interface on page H8-4968.


The value written to EDCIDSR is an indirect read of CONTEXTIDR_EL1. This means that it is CONSTRAINED
UNPREDICTABLE whether EDCIDSR is set to the original or new value if a read of the EDPCSRlo samples:

° An instruction that writes to CONTEXTIDR_EL1.
° The next Context synchronization event.
° Any instruction executed between these two instructions.

Note

In the ARMv7 PC Sample-based Profiling Extension, an offset was applied to the sampled program counter value
and this offset and the instruction set state were indicated in bits [1:0] of the sampled value. In the ARMv8 PC
Sample-based Profiling Extension, the sampled value is the address of an instruction that has executed, with no
offset and no indication of the instruction set state.








H7.1.3 Reads of the EDVIDSR

A read of the EDVIDSR returns sampled values captured on reading EDPCSRlo, where:

° EDVIDSR.NS indicates the Security state associated with the most recent EDPCSR sample.

° EDVIDSR.E2 indicates whether the most recent EDPCSR sample was associated with EL2. If
EDVIDSR.NS == 0, this bit is 0.

° EDVIDSR.E3 indicates whether the most recent EDPCSR sample was associated with EL3 using AArch64.
If EDVIDSR.NS == 1 or the PE was in AArch32 state when EDPCSRlo was read, this bit is 0.

° EDVIDSR.HV indicates whether EDPCSRhi is valid, that is, bits [63:32] of the most recent program counter
sample are nonzero.

Note
— EDVIDSR.HV == 1 does not mean that EDPCSRhi != 0. EDVIDSR.HV == 0 is a hint that EDPCSRhi
does not need to be read.
— Tools must take care to avoid skewing sampled data by over-sampling code for which
EDVIDSR.HV == 0.

° EDVIDSR.VMID indicates the value of the VTTBR_EL2.VMID register associated with the most recent
EDPCSRlo sample. If EDVIDSR.NS == 0 or EDVIDSR.E2 == 1, this field is RAZ.

If EL2 is not implemented, EDVIDSR.E2 and EDVIDSR.VMID are RES0.

If EL3 is not implemented and EL2 is implemented, EDVIDSR.E3 is RES0, and EDVIDSR.NS is RES1.

If EL3 and EL2 are not implemented, it is IMPLEMENTATION DEFINED whether:

° EDVIDSR is not implemented. 


° EDVIDSR is implemented and the value of EDVIDSR.NS is the effective value of SCR.NS. 
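
The relationships between the EDVIDSR fields can be expressed as a consistency check that a tool might run on each captured sample. This is an illustrative sketch only: the function name is hypothetical, and it takes already-extracted field values rather than a raw register value, so it makes no assumption about the bit positions of the fields.

```python
def edvidsr_fields_consistent(ns: int, e2: int, e3: int, vmid: int) -> bool:
    """Check the field constraints stated for EDVIDSR.{NS, E2, E3, VMID}."""
    if ns == 0 and e2 != 0:
        return False          # E2 is 0 when the sample is from Secure state
    if ns == 1 and e3 != 0:
        return False          # E3 is 0 when the sample is from Non-secure state
    if (ns == 0 or e2 == 1) and vmid != 0:
        return False          # VMID is RAZ for Secure or EL2 samples
    return True
```

A sample that fails the check most likely indicates a decoding error in the tool rather than a property of the sampled code.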


H7.1.4 Accuracy of sampling 


PC Sample-based Profiling is provided as a mechanism for tools to populate a statistical model of the performance 
of software executing on the PE. The statistical data returned by random sampling of EDPCSR, EDCIDSR, and 
EDVIDSR must allow such statistical modeling. 


It must be possible to sample references to branch targets. It is IMPLEMENTATION DEFINED whether references to 
other instructions can be sampled. The branch target for a conditional branch instruction that fails its condition 
check is the instruction that follows the conditional branch instruction. The branch target for an exception is the 
exception vector address. 


Under normal operating conditions, the whole sample, EDPCSR, EDCIDSR, and EDVIDSR, must reference an 
instruction that was committed for execution, including its context, and must not reference instructions that are 
fetched but not committed for execution. 


To keep the implementation and validation cost low, a reasonable degree of inaccuracy in the sampled data is 
acceptable. ARM does not define a reasonable degree of inaccuracy but recommends the following guidelines: 


° In exceptional circumstances, such as a change in Security state or other boundary condition, it is acceptable 
for the sample to represent an instruction that was not committed for execution.

° Under very unusual non-repeating pathological cases the sample can represent an instruction that was not
committed for execution. These cases are likely to occur as a result of asynchronous exceptions, such as
interrupts, where a systematic error in sampling is very unlikely.


See also Non-invasive behavior on page D5-1836. 


H7.1.5 PC Sample-based Profiling and security 


PC Sample-based Profiling is a non-invasive external debug component, which is controlled by an 
IMPLEMENTATION DEFINED authentication interface. PC Sample-based Profiling is prohibited unless both: 


° Allowed by the IMPLEMENTATION DEFINED authentication interface ExternalNoninvasiveDebugEnabled(). 


° Any one of:
— Executing in Non-secure state.
— EL3 is not implemented.
— EL3 is implemented, executing in Secure state, and allowed by the IMPLEMENTATION DEFINED
authentication interface ExternalSecureNoninvasiveDebugEnabled().
— EL3 is implemented, EL3 or EL1 is using AArch32, executing at EL0 in Secure state, and
SDER32_EL3.SUNIDEN == 1.
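
The conditions above can be transcribed directly into a predicate. The sketch below is not the architectural pseudocode; it is a hypothetical Python rendering in which each parameter stands for one of the conditions named in the list, so that the "both" and "any one of" structure is easy to check.

```python
def pc_sampling_allowed(noninvasive_enabled: bool,
                        secure_noninvasive_enabled: bool,
                        have_el3: bool,
                        secure: bool,
                        aarch32_el3_or_el1: bool,
                        at_el0: bool,
                        suniden: int) -> bool:
    """Return True if PC Sample-based Profiling is not prohibited."""
    return noninvasive_enabled and any((
        not secure,                                         # Non-secure state
        not have_el3,                                       # EL3 not implemented
        have_el3 and secure and secure_noninvasive_enabled,
        have_el3 and aarch32_el3_or_el1 and at_el0 and secure and suniden == 1,
    ))
```

When the result is false, a read of EDPCSRlo returns 0xFFFFFFFF, as described in Reads of the EDPCSRs.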


The pseudocode function ExternalNoninvasiveDebugAllowed() indicates when PC Sample-based Profiling is
allowed.


The state of the IMPLEMENTATION DEFINED authentication interface is visible through DBGAUTHSTATUS_EL1.


See also Appendix K2 Recommended External Debug Interface. 


H7.1.6 Pseudocode description of PC Sample-based Profiling 


PC Sample-based Profiling is described by the pseudocode functions: 
° CreatePCSample(), which populates a variable of type PCSample.
° EDPCSRlo[], which writes a PC sample to the EDPCSR and associated registers.

Chapter H8
About the External Debug Registers 


This chapter provides some additional information about the external debug registers. It contains the following
sections:

° Relationship between external debug and System registers on page H8-4962.
° Supported access sizes on page H8-4963.
° Synchronization of changes to the external debug registers on page H8-4964.
° Memory-mapped accesses to the external debug interface on page H8-4968.
° External debug interface register access permissions on page H8-4970.
° External debug interface registers on page H8-4974.
° Cross-trigger interface registers on page H8-4979.
° External debug register resets on page H8-4981.


Note 


Where necessary Table K12-1 on page K12-5660 disambiguates the general register references used in this chapter.

H8.1 Relationship between external debug and System registers


Table H8-1 shows the relationship between external debug registers and System registers. Where no relationship 
exists, the registers are not listed. 


Table H8-1 Equivalence between external debug and System registers

                         System register
External debug register  AArch64               AArch32        Notes
DBGDTRRX_EL0             DBGDTRRX_EL0          DBGDTRRXint    See also Summary of System register
DBGDTRTX_EL0             DBGDTRTX_EL0          DBGDTRTXint    accesses to the DCC on page H4-4919
OSLAR_EL1                OSLAR_EL1             DBGOSLAR       -
DBGBVR<n>_EL1[31:0]      DBGBVR<n>_EL1[31:0]   DBGBVR<n>      -
DBGBVR<n>_EL1[63:32]     DBGBVR<n>_EL1[63:32]  DBGBXVR<n>
DBGBCR<n>_EL1            DBGBCR<n>_EL1         DBGBCR<n>      -
DBGWVR<n>_EL1[31:0]      DBGWVR<n>_EL1[31:0]   DBGWVR<n>      -
DBGWVR<n>_EL1[63:32]     DBGWVR<n>_EL1[63:32]  -
DBGWCR<n>_EL1            DBGWCR<n>_EL1         DBGWCR<n>      -
DBGCLAIMSET_EL1          DBGCLAIMSET_EL1       DBGCLAIMSET    -
DBGCLAIMCLR_EL1          DBGCLAIMCLR_EL1       DBGCLAIMCLR    -
DBGAUTHSTATUS_EL1        DBGAUTHSTATUS_EL1     DBGAUTHSTATUS  Read-only
EDSCR                    MDSCR_EL1             DBGDSCRext     Only some fields map
EDECCR                   OSECCR_EL1            DBGOSECCR      Applies when the OS Lock is locked.
MIDR_EL1                 MIDR_EL1              MIDR           Read-only copies of Processor ID Registers
EDDEVAFF0                MPIDR_EL1[31:0] [a]   MPIDR          Read-only copies of system ID registers
EDDEVAFF1                MPIDR_EL1[63:32]      -

a. This is a word of a 64-bit register.


In addition: 
° EDSCR.{TXfull, RXfull} are read-only aliases for DCCSR.{TXfull, RXfull}. 
° EDPRCR.CORENPDRQ is a read/write alias for DBGPRCR.CORENPDRQ. 
° EDPRSR.OSLK is a read-only alias for OSLSR.OSLK. 
° EDPRSR.DLK is a read-only function of OSDLR.DLK.

H8.2 Supported access sizes

The memory access sizes supported by any peripheral are IMPLEMENTATION DEFINED by the peripheral. For accesses
to the debug registers, Performance Monitor registers, and CTI registers, implementations must support: 


° Word-aligned 32-bit accesses to access 32-bit registers or either half of a 64-bit register mapped to a 
doubleword-aligned pair of adjacent 32-bit locations. 


° Doubleword-aligned 64-bit accesses to access 64-bit registers mapped to a doubleword-aligned pair of 
adjacent 32-bit locations. 


Note 


This means that a system implementing the debug registers using a 32-bit bus, such as an AMBA APB3, with
a wider system interconnect must implement a bridge between the system and the debug bus that can split
64-bit accesses.








All registers are only single-copy atomic at word granularity. This means that for 64-bit accesses to a 64-bit register, 
the system might generate a pair of 32-bit accesses. The order in which the two halves are accessed is not specified. 
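
The following Python sketch is illustrative only and is not part of the architecture. It shows how a bridge might split one 64-bit write into the pair of 32-bit word writes described above. The function names are hypothetical. The placement of bits[31:0] at the lower offset follows the register map in this chapter, for example EDWAR[31:0] at 0x030 and EDWAR[63:32] at 0x034.

```python
def split_doubleword(offset, value):
    """Split one 64-bit register write into two 32-bit word writes.

    Returns a list of (offset, word) pairs. The list order is arbitrary,
    because the order in which the two halves are accessed is not specified.
    """
    if offset % 8 != 0:
        raise ValueError("a 64-bit register occupies a doubleword-aligned pair")
    low = value & 0xFFFFFFFF
    high = (value >> 32) & 0xFFFFFFFF
    return [(offset, low), (offset + 4, high)]


def join_words(low, high):
    """Rebuild the 64-bit value from its two 32-bit halves."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)
```

Because each half is only single-copy atomic at word granularity, a reader that joins two halves must not assume that both halves come from the same update of the register.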


The following accesses are not supported: 


° Byte. 

° Halfword. 

° Unaligned word. These accesses are not word single-copy atomic. 

° Unaligned doubleword. These accesses are not doubleword single-copy atomic. 

° Doubleword accesses to a pair of 32-bit locations that are not a doubleword-aligned pair forming a 64-bit 
register. 


° Quadword or higher. 
° Exclusive accesses. 


For each of these access types, it is CONSTRAINED UNPREDICTABLE whether: 


° The access generates an external abort or not. 
° The defined side-effects of a read occur or not. A read returns UNKNOWN data. 
° A write is ignored or sets the accessed register or registers to UNKNOWN. 
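
The supported and unsupported access types listed above can be summarized as a single predicate. The following Python sketch is illustrative only. The function name and the `regs64` parameter, a set of offsets at which a 64-bit register starts, are assumptions made for the example.

```python
def access_is_supported(offset, size, regs64=frozenset(), exclusive=False):
    """Return True if implementations must support this access.

    offset    -- byte offset of the access within the register block
    size      -- access size in bytes
    regs64    -- hypothetical set of base offsets of 64-bit registers
    exclusive -- True for an Exclusive access
    """
    if exclusive:
        return False                  # Exclusive accesses are not supported
    if size == 4:
        return offset % 4 == 0        # word-aligned 32-bit access
    if size == 8:
        # Must be doubleword-aligned and must cover a pair of 32-bit
        # locations that together form one 64-bit register.
        return offset % 8 == 0 and offset in regs64
    return False                      # byte, halfword, quadword or higher
```

A False result means the behavior is CONSTRAINED UNPREDICTABLE as described above, not that the access is guaranteed to fail.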


For accesses from the external debug interface, the size of an access is determined by the interface. For an access 
from an ADIv5-compliant Memory Access Port, MEM-AP, this is specified by the MEM-AP CSW register. 


See Access sizes for memory-mapped accesses on page H8-4969. 


H8.3 Synchronization of changes to the external debug registers 
This section describes the synchronization requirements for the external debug interface. 
For more information on how these requirements affect debug, see: 


° Synchronization and debug exceptions on page D2-1687 for exceptions taken from AArch64 state, or 
Synchronization and debug exceptions on page G2-3983 for exceptions taken from AArch32 state. 


° Synchronization and Halting debug events on page H3-4904. 
° Synchronization of DCC and ITR accesses on page H4-4919. 


This section refers to accesses from the external debug interface as external reads and external writes. It refers to 
accesses to System registers as direct reads, direct writes, indirect reads, and indirect writes. 


Note 


Synchronization requirements for AArch64 System registers on page D7-1889 defines direct read, direct write, 
indirect read, and indirect write, and classifies external reads as indirect reads, and external writes as indirect writes. 








Writes to the same register are serialized, meaning they are observed in the same order by all observers, although 
some observers might not observe all of the writes. With the exception of DBGBCR<n>_EL1, DBGBVR<n>_EL1, 
DBGWCR<n>_EL1, and DBGWVR<n>_EL1, external writes to different registers are not necessarily observed in 
the same order by all observers as the order in which they complete. 


Synchronization of DCC and ITR accesses on page H4-4919 describes the synchronization requirements for the 
DCC and ITR. 


Changes to the IMPLEMENTATION DEFINED authentication interface are external writes to the authentication status 
registers by the master of the authentication interface. See Synchronization and the authentication interface on 
page H8-4965. 


Explicit synchronization is not required for an external read or an external write by an external agent to be 
observable to a following external read or external write by that agent to the same register using the same address, 
and so is never required for registers that are accessible only in the external debug interface. 


Some registers are guaranteed to be observable to all observers in finite time, without explicit synchronization. For 
more information, see Synchronization requirements for AArch64 System registers on page D7-1889 or 
Synchronization of changes to AArch32 System registers on page G4-4163. Otherwise, explicit synchronization is 
normally required following an external write to any register for that write to be observable by: 


° A direct access. 
° An indirect read by an instruction. 
° An external read of the register using a different address. 


This means that an external write by an external agent is guaranteed to have an effect on subsequent instructions 
executed by the PE only if all of the following are true: 


° The write has completed. 
° The PE has executed a Context synchronization event. 
° The Context synchronization event was executed after the write completed. 
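
The three conditions above reduce to a simple ordering check. The following Python sketch is illustrative only. It assumes hypothetical timestamps on a single global timeline, with `None` meaning that the write has not yet completed.

```python
def external_write_guaranteed_visible(write_completed_at, cse_times):
    """Return True if the write is guaranteed to affect later instructions.

    write_completed_at -- hypothetical time at which the external write
                          completed, or None if it has not completed
    cse_times          -- hypothetical times at which the PE executed a
                          Context synchronization event
    """
    if write_completed_at is None:
        return False
    # At least one Context synchronization event must follow completion.
    return any(t > write_completed_at for t in cse_times)
```

A False result does not mean that the write has no effect. It means only that the effect is not guaranteed.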


The order and synchronization of direct reads and direct writes of System registers is defined by Synchronization 
requirements for AArch64 System registers on page D7-1889. 


The external agent must be able to guarantee completion of a write, for example by: 


° Marking the memory as Device-nGnRnE and executing a DSB barrier, if the system supports this property. 
° Reading back the value written. 
° Some guaranteed property of the connection between the PE and the external agent. 
Note 


For an external Debug Access Port, this is an IMPLEMENTATION DEFINED property. For a CoreSight system 
using APB-AP to access a debug APB, a write is guaranteed to complete before the APB-AP allows a second 
APB transaction to complete. 








The external agent and PE can guarantee ordering by, for example, passing messages in an ordered way with respect 
to the external write and the Context synchronization event, and relying on the memory ordering rules provided by 
the memory model. 


External reads and external writes complete in the order in which they arrive at the PE. For accesses to different 
register locations the external agent must create this order by: 


° Marking the memory as Device-nGnRnE or Device-nGnRE. 


° Using the appropriate memory barriers. 
° Some guaranteed property of the connection between the PE and the external agent. 
Note 





For an external Debug Access Port, this is an IMPLEMENTATION DEFINED property. For a CoreSight system 
using APB-AP to access a debug APB, accesses complete in order. 





However, the external agent cannot force synchronization of completed writes without halting the PE. Executing an 
ISB instruction, either in Debug state or in Non-debug state, and exiting from Debug state forces synchronization. 
If the PE is in Debug state, executing an ISB instruction is guaranteed to explicitly synchronize any external reads, 
external writes, and changes to the authentication interface that are ordered before the external write to EDITR. 


For any given observer, external writes to the following register groups are guaranteed to be observable in the same 
order in which they complete: 


° The breakpoint registers, DBGBCR<n>_EL1 and DBGBVR<n>_EL1. 
° The watchpoint registers, DBGWCR<n>_EL1 and DBGWVR<n>_EL1. 


This guarantee only applies to external writes to registers within one of these groups. There is no guarantee 
regarding the ordering of the observability of external writes within these groups with respect to external writes to 
other registers, for example EDSCR, or between breakpoints and watchpoints, including watchpoints linked to context 
matching breakpoints. 





Note 


This means that a debugger can rely on the external writes to be observed in the same order in which they complete. 
It does not mean that a debugger can rely on the external writes being observed in finite time. 





In a simple sequential execution an indirect write that occurs as a side-effect of an access happens atomically with 
the access, meaning no other accesses are allowed between the register access and its side-effect. 


If two or more interfaces simultaneously access a register, the behavior must be as if the accesses occurred 
atomically and in any order. This is described in Examples of the synchronization of changes to the external debug 
registers on page H8-4966. 


Some registers have the property that for certain bits a write of 0 is ignored and a write of 1 has an effect. This means 
that simultaneous writes must be merged. Registers that have this property and support both external debug and 
System register access include DBGCLAIMSET_EL1, DBGCLAIMCLR_EL1, PMCR_EL0.{C,P}, 
PMOVSSET_EL0, PMOVSCLR_EL0, PMCNTENSET_EL0, PMCNTENCLR_EL0, PMINTENSET_EL1, 
PMINTENCLR_EL1, and PMSWINC_EL0. This last register is OPTIONAL and deprecated in the external debug 
interface. 


H8.3.1 Synchronization and the authentication interface 


Changes to the authentication interface are indirect writes to the Authentication Status registers by the master of the 
authentication interface. For each of these Authentication Status registers, it is IMPLEMENTATION DEFINED whether 
a change on the authentication interface is guaranteed to be observable to an external debug interface read of the 
register only after a Context synchronization event or in finite time. 
For DBGAUTHSTATUS_EL1, a change on the authentication interface is guaranteed to be observable to a System 
register read of DBGAUTHSTATUS_EL1 only after a Context synchronization event. 





H8.3.2 Examples of the synchronization of changes to the external debug registers 
Example H8-1, Example H8-2, and Example H8-3 show the synchronization of changes to the external debug 
registers. 
Example H8-1 Order of synchronization of Breakpoint and Watchpoint register writes 
Initially DBGBVR<n>_EL1 is 0x8000 and DBGBCR<n>_EL1 is 0x0181. This means that a breakpoint is enabled 
on the halfword T32 instruction at address 0x8002. 
A sequence of external writes occurs in the following order: 
1. 0x0000 is written to DBGBCR<n>_EL1, disabling the breakpoint. 
2. 0x9000 is written to DBGBVR<n>_EL1[31:0]. 
3. 0x0061 is written to DBGBCR<n>_EL1, enabling a breakpoint on the halfword at address 0x9000. 
The external writes must be observable to indirect reads in the same order as the external writes complete. This 
means that at no point is there a breakpoint enabled on either of the halfwords at address 0x8000 and 0x9002. 
Similarly a breakpoint or watchpoint must be disabled: 
° If both halves of a 64-bit address have to be updated. 
° If any of the DBGBCR<n>_EL1 or DBGWCR<n>_EL1 fields are modified at the same time as updating the 
address. 
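
The disable, update, enable sequence of Example H8-1 can be sketched as follows. This Python fragment is illustrative only. The `write` callback and the register name strings are hypothetical stand-ins for whatever mechanism a debugger uses to issue external writes that complete in order. Bit[0] of DBGBCR<n>_EL1 is the breakpoint enable bit, E.

```python
def reprogram_breakpoint(write, value, control):
    """Move a breakpoint without exposing a mixed old/new configuration."""
    write("DBGBCR", 0x0000)                              # 1. disable
    write("DBGBVR[31:0]", value & 0xFFFFFFFF)            # 2. update address
    write("DBGBVR[63:32]", (value >> 32) & 0xFFFFFFFF)   #    both halves
    write("DBGBCR", control)                             # 3. enable


def trace_states(initial, value, control):
    """Return the register state after each write, starting with `initial`."""
    regs = dict(initial)
    states = [dict(regs)]

    def write(name, data):
        regs[name] = data
        states.append(dict(regs))

    reprogram_breakpoint(write, value, control)
    return states
```

With the values from the example, the only states in which the breakpoint is enabled are the initial state and the final state, which is the property the ordering guarantee provides.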
Example H8-2 Simultaneous accesses to DTR registers 
Initially EDSCR.{TXfull, TXU, ERR} are 0. Then: 
° 0x0DCCDA7A is directly written to DBGDTRTX_EL0 by an MSR instruction. 
° DBGDTRTX_EL0 is indirectly read by the external debug interface. 
These accesses might happen at the same time and in any order. 
If the direct write of 0x0DCCDA7A to DBGDTRTX_EL0 is handled first, then: 
° The external debug interface read of DBGDTRTX_EL0 clears EDSCR.TXfull to 0. 
° EDSCR.{TXU, ERR} are unchanged. 
° The external debug interface read returns 0x0DCCDA7A. 
If the indirect read of DBGDTRTX_EL0 by the external debug interface is handled first, then: 
° The external debug interface read of DBGDTRTX_EL0 causes an underrun and as a result EDSCR.{TXU, 
ERR} are both set to 1. 
° The external debug interface returns an UNKNOWN value. 
° Writing 0x0DCCDA7A to DBGDTRTX_EL0 sets DTRTX to 0x0DCCDA7A and EDSCR.TXfull to 1. 
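
The two outcomes of Example H8-2 can be reproduced with a small state model. This Python sketch is illustrative only. The class and method names are hypothetical, `None` stands in for an UNKNOWN value, and the model covers only the starting conditions of the example, where EDSCR.{TXfull, TXU, ERR} are all 0.

```python
class DccTxModel:
    """Toy model of DTRTX and the EDSCR.{TXfull, TXU, ERR} flags."""

    def __init__(self):
        self.dtrtx = 0
        self.txfull = 0
        self.txu = 0
        self.err = 0

    def direct_write(self, value):
        """Direct write to DBGDTRTX_EL0 by an MSR instruction."""
        self.dtrtx = value
        self.txfull = 1

    def external_read(self):
        """External debug interface read of DBGDTRTX_EL0."""
        if self.txfull:
            self.txfull = 0          # reading empties the register
            return self.dtrtx
        self.txu = 1                 # underrun: nothing to read
        self.err = 1
        return None                  # UNKNOWN value
```

Handling the two accesses in either order gives exactly one of the two outcomes described in the example. No third outcome is possible, because each access is atomic with its side-effect.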
Example H8-3 Simultaneous writes to CLAIM registers 
Initially all CLAIM tag bits are 0. Then: 
° 0x01 is written to DBGCLAIMSET_EL1 by a direct write, followed by an explicit Context synchronization 
event. 
° 0x02 is written to DBGCLAIMSET_EL1 by an external write. 
These events might happen at the same time and in either order. 
After this: 
° DBGCLAIMCLR_EL1 is read by a direct read. 
° DBGCLAIMCLR_EL1 is read by an external read. 


In this case, a direct read can return either 0x01 or 0x03, and the external read can return either 0x02 or 0x03. 


The only permitted final result for the CLAIM tags is the value 0x03, because this would be the result regardless of 
whether 0x01 or 0x02 is written first. This is because the external write is guaranteed to be observable to a direct read 
in finite time. See Synchronization requirements for AArch64 System registers on page D7-1889. 


It is not possible for a direct read to return 0x01 and the external read to return 0x02, because the writes to 
DBGCLAIMCLR_EL1 are serialized. 


In the following scenario, there is only one permitted result. Both observers observe the value 0x03, and then, at the 
same time, two writes occur: 


° 0x04 is written to DBGCLAIMSET_EL1 by a direct write, followed by an explicit Context synchronization 
event. 


° 0x01 is written to DBGCLAIMCLR_EL1 by an external write. 


In this case the only permitted final result for the CLAIM tags is the value 0x06. 
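
The final values in Example H8-3 follow from the set and clear semantics of the CLAIM registers, in which a written 0 bit is ignored and a written 1 bit has an effect. The following Python sketch is illustrative only and uses hypothetical helper names. It applies the simultaneous writes in every possible order and collects the resulting CLAIM tag values.

```python
from itertools import permutations


def apply_claim_writes(tags, writes):
    """Apply (register, value) writes to the CLAIM tags in the given order."""
    for register, value in writes:
        if register == "DBGCLAIMSET_EL1":
            tags |= value            # bits written as 1 are set
        elif register == "DBGCLAIMCLR_EL1":
            tags &= ~value           # bits written as 1 are cleared
        else:
            raise ValueError("not a CLAIM register: " + register)
    return tags


def possible_final_values(tags, writes):
    """Return the set of final tag values over all orderings of the writes."""
    return {apply_claim_writes(tags, order) for order in permutations(writes)}
```

For both scenarios in the example the returned set has a single member, 0x03 and 0x06 respectively, which is why only one final result is permitted.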
H8.4 Memory-mapped accesses to the external debug interface 


Support for memory-mapped access to the external debug interface is OPTIONAL. 


If the external debug interface is CoreSight compliant, then an OPTIONAL Software Lock can be implemented for 
memory-mapped accesses to each component. 


The Software Lock is OPTIONAL and deprecated. If it is not implemented, the behavior is as if it is unlocked. The 
Software Locks are controlled by EDLSR and EDLAR, PMLSR and PMLAR, and CTILSR and CTILAR. See 
Management registers and CoreSight compliance on page K2-5499. 


With the exception of these registers and the effect of the Software Lock, the behavior of the memory-mapped 
accesses is the same as for other accesses to the external debug interface. 


Note 


The recommended memory-mapped accesses to the external debug interface are not compatible with the 
memory-mapped interface defined in ARMv7. In particular: 





° The memory map is different. 


° Memory-mapped accesses do not behave differently to Debug Access Port accesses when 
OSLSR.OSLK == 1, meaning that the OS Lock is locked. 








H8.4.1 Register access permissions for memory-mapped accesses 

It is IMPLEMENTATION DEFINED whether unprivileged memory-mapped accesses are allowed. Privileged software 
is responsible for controlling memory-mapped accesses using the MMU. 

If memory-mapped accesses are made through an ADIv5 interface, the Debug Access Port can block the access 
using DBGSWENABLE. This is outside the scope of the ARMv8-A architecture. See ARM® Debug Interface 
Architecture Specification ADIv5.0 to ADIv5.2. 

Effect of the OPTIONAL Software Lock on memory-mapped access 

For memory-mapped accesses, if other controls permit access to a register, the OPTIONAL Software Lock is 
implemented, and EDLSR.SLK, PMLSR.SLK, or CTILSR.SLK is set to 1, meaning the Software Lock is locked, 
then with the exception of the LAR itself: 

° If other controls permit access to a register, then writes are ignored. That is: 

— Read/write (RW) registers become read-only, writes ignored (RO/WI). 
— Write-only (WO) registers become writes ignored (WI). 

° Reads and writes have no side-effects. A side-effect is where a direct read or a direct write of a register creates 
an indirect write of the same or another register. When the Software Lock is locked, the indirect write does 
not occur. 

° Writes to EDLAR, PMLAR, and CTILAR are unaffected. 

This behavior must also apply to all IMPLEMENTATION DEFINED registers. 

For example, if EDLSR.SLK is set to 1: 

° EDSCR.{TXfull, TXU, ERR} are unchanged by a memory-mapped read from DBGDTRTX_EL0. 

° EDSCR.{RXfull, RXO, ERR} are unchanged by a memory-mapped write to DBGDTRRX_EL0 that is 
ignored. 

° EDSCR.{ITE, ITO, ERR} are unchanged by a memory-mapped write to EDITR that is ignored. 

° OSLSR.OSLK is unchanged by a memory-mapped write to OSLAR_EL1 that is ignored. 

° EDPCSR[63:32], EDCIDSR, and EDVIDSR are unchanged by a memory-mapped read from 
EDPCSR[31:0]. 


Note 
Updating EDVIDSR, EDCIDSR, and EDPCSRhi are side-effects of reading EDPCSRlo, such that these 
registers contain the matching context for EDPCSRlo. The process that updates EDPCSRlo with PC samples 
is not a side-effect of the access. Reads of EDPCSRlo made when the Software Lock is locked can be used 
to profile software. 








° EDPRSR.{SDR, SPMAD, SDAD, SR, SPD} are unchanged by a memory-mapped read from EDPRSR. 


° EDPRSR.SDAD is not set if an error response is returned due to a memory-mapped read or write of any 
debug register as the result of the value of the EDAD field. 


° The CLAIM tags are unchanged by memory-mapped writes to DBGCLAIMSET_EL1 and 
DBGCLAIMCLR_EL1 which are ignored. 


Similarly, if PMLSR.SLK is set to 1, then EDPRSR.SPMAD is not set if an error response is returned to a 
memory-mapped read or write of any Performance Monitors register due to the value of the EPMAD field. 
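
The effect of the Software Lock on a memory-mapped access can be summarized as a small decision function. This Python sketch is illustrative only. The function name, parameters, and result keys are hypothetical, and the sketch assumes that all other controls already permit the access.

```python
def software_lock_effect(slk_locked, is_write, targets_lar=False):
    """Describe what a permitted memory-mapped access is allowed to do.

    slk_locked  -- True if the relevant SLK field (EDLSR.SLK, PMLSR.SLK,
                   or CTILSR.SLK) is 1
    is_write    -- True for a write, False for a read
    targets_lar -- True if the access is to EDLAR, PMLAR, or CTILAR
    """
    if targets_lar or not slk_locked:
        # Normal behavior: writes update the register and the defined
        # side-effects of the access occur.
        return {"write_applied": is_write, "side_effects": True}
    # Locked: writes are ignored and neither reads nor writes have
    # side-effects. Reads still return the register value.
    return {"write_applied": False, "side_effects": False}
```

For example, with the lock set, a memory-mapped read of DBGDTRTX_EL0 corresponds to `software_lock_effect(True, False)`, so EDSCR.{TXfull, TXU, ERR} stay unchanged.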


Behavior of a not permitted memory-mapped access 


Where the architecture requires that an external debug interface access generates an error response, a 
memory-mapped access must also generate an error response. However, it is IMPLEMENTATION DEFINED how the 
error response is handled, as this depends on the system. 


ARM recommends that the error is returned as either: 


° A synchronous external Data Abort. 
° An SError interrupt. 
H8.4.2 Synchronization of memory-mapped accesses to external debug registers 


The synchronization requirements for memory-mapped accesses to the external debug interface are described in 
Synchronization of changes to the external debug registers on page H8-4964. 


The synchronization requirements between different routes to the external debug interface, that is, between Debug 
Access Port accesses and memory-mapped accesses are IMPLEMENTATION DEFINED. 


H8.4.3 Access sizes for memory-mapped accesses 


For memory-mapped accesses from a PE that complies with an ARM architecture, the single-copy atomicity rules 
for the instruction, the type of instruction, and the type of memory accessed, determine the size of the access made 
by an instruction. Example H8-4 shows this. 


Example H8-4 Access sizes for memory-mapped accesses 


Two Load Doubleword instructions made to consecutive doubleword-aligned locations generate a pair of 
single-copy atomic doubleword reads. However, if the accesses are made to Normal memory or Device-GRE 
memory they might appear as a single quadword access that is not supported by the peripheral. 


ARMv8 does not require the size of each element accessed by a multi-register load or store instruction to be 
identifiable by the memory system beyond the PE. Any memory-mapped access to a debug register is defined to be 
beyond the PE. 


Software must use a Device-nGRE or stronger memory-type and use only single register load and store instructions 
to create memory accesses that are supported by the peripheral. For more information, see Memory types and 
attributes on page B2-94. 
H8.5 External debug interface register access permissions 


Some external accesses to debug registers and Performance Monitor registers are not permitted and return an error 
response if: 


° The Core power domain is powered down or is in low-power state where the registers cannot be accessed. 
° OSLSR.OSLK == 1. The OS Lock is locked. 
° DoubleLockStatus() == TRUE. The OS Double Lock is locked. 
° Access by the external debug interface is disabled by the authentication interface or secure monitor. 


Not all registers are affected in all of these cases. For details, see External debug interface register access 
permissions summary on page H8-4972. 





Note 
OSLSR.OSLK is visible through EDPRSR. 














H8.5.1 External debug over powerdown and locks 
Accessing registers using the external debug interface is not possible when the Debug power domain is off. In this 
case all accesses return an error. 
External accesses to debug and Performance Monitors registers in the Core power domain are not permitted and 
return an error response if: 
° The Core power domain is off or in low-power state where the registers cannot be accessed. 
° OSLSR.OSLK == 1, meaning that the OS Lock is locked. This allows software to prevent external debugger 
modification of the registers while it saves and restores them over powerdown. 
° DoubleLockStatus() == TRUE. This means that the OS Double Lock is locked. The OS Double Lock ensures 
that it is safe to remove Core power by forcing the debug interface to be quiescent. 
It is IMPLEMENTATION DEFINED whether the ID registers that describe the PE to the debugger are in the Debug power 
domain or the Core power domain. 
Note 
This applies only to the MIDR_EL1, EDPFR, EDDFR, and EDAA32PFR registers. It does not include the 
CoreSight ID registers in the management register address range. 
The OS lock condition does not apply to the following debug registers: 
° OSLAR_EL1. This means that an external debugger can override this lock. 
° EDESR. This means that an external debugger can program a debug event for when software unlocks the OS 
lock. See OS Unlock Catch debug event on page H3-4901. 
° The ID registers that describe the PE to the debugger. 
See also Debug registers to save over powerdown on page H6-4951. 
H8.5.2 External access disabled 
Accesses are further controlled by the external authentication interface. An untrusted external debugger cannot 
program the breakpoint and watchpoint registers to generate spurious debug exceptions. If external invasive 
debugging is not enabled, these external accesses to the registers are disabled. If EL3 is implemented, then SDCR 
provides additional external access disable controls for those registers if Secure external invasive debugging is 
disabled. 
The disable applies to: 
° DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 on page H9-4993. 
° DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15 on page H9-4990. 
° DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15 on page H9-5007. 
° DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15 on page H9-5004. 


It is IMPLEMENTATION DEFINED whether the disable applies to OSLAR_EL1. 
The external debug interface cannot access these registers if either: 
° External debugging is not enabled, meaning ExternalInvasiveDebugEnabled() == FALSE. 


° Secure external debugging is not enabled, meaning ExternalSecureInvasiveDebugEnabled() == FALSE, and 
any of the following: 


—  EL3 is not implemented and the implemented Security state is Secure state. 
—  EL3 is implemented and SDCR.EDAD == 1. 


The AllowExternalDebugAccess() pseudocode function describes these accessibility rules. 


PEs might also provide an OPTIONAL external debug interface to the Performance Monitor registers. The 
authentication interface and SDCR provide similar external access disable controls for those registers. 


The external debug interface cannot access the Performance Monitor registers if either: 
° External non-invasive debug is not enabled. ExternalNoninvasiveDebugEnabled() == FALSE. 


° Secure external non-invasive debugging is not enabled, ExternalSecureNoninvasiveDebugEnabled() == 
FALSE, and any of: 


—  EL3 is not implemented and the implemented Security state is Secure state. 


—  EL3 is implemented and SDCR.EPMAD == 1. 


The AllowExternalPMUAccess() pseudocode function describes these accessibility rules. 
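
The accessibility rules for the debug registers and for the Performance Monitor registers share one shape. The following Python sketch is derived from the prose above, not from the architecture pseudocode. It is illustrative only, and the function and parameter names are hypothetical.

```python
def allow_external_access(enabled, secure_enabled, have_el3,
                          secure_only, access_disable):
    """Return True if the external debug interface can access the registers.

    enabled        -- ExternalInvasiveDebugEnabled() for debug registers, or
                      ExternalNoninvasiveDebugEnabled() for the PMU
    secure_enabled -- the corresponding Secure enable function result
    have_el3       -- True if EL3 is implemented
    secure_only    -- True if EL3 is not implemented and the implemented
                      Security state is Secure state
    access_disable -- SDCR.EDAD for debug registers, SDCR.EPMAD for the PMU
    """
    if not enabled:
        return False
    if secure_enabled:
        return True
    if have_el3:
        return access_disable == 0
    return not secure_only
```

Passing the invasive enables and SDCR.EDAD gives the debug register rule. Passing the non-invasive enables and SDCR.EPMAD gives the Performance Monitor rule.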


Note 


° ARM recommends that secure software that is not making use of debug hardware does not lock out the 
external debug interface. 





° ARMv8-A does not provide the equivalent control over access to Trace extension registers. 





H8.5.3 Behavior of a not permitted access 


For an external debug interface access by a Debug Access Port, the Debug Access Port receives the error response 
and must signal this to the external debugger. For an ADIv5 implementation of a Debug Access Port, the error sets 
a sticky error flag in the Debug Access Port that the debugger can poll, and that suppresses further accesses until it 
is explicitly cleared. 


When an error is returned because external access is disabled, and this is the highest priority error condition, a sticky 
error flag in EDPRSR is indirectly written to 1 as a side-effect of the access: 


° For a debug register access when AllowExternalDebugAccess() == FALSE, EDPRSR.SDAD is indirectly 
written to 1. 


° For a Performance Monitor register access when AllowExternalPMUAccess() == FALSE, EDPRSR.SPMAD is 
indirectly written to 1. 


The indirect write might not occur for a memory-mapped access to the external debug interface. For more 
information, see Register access permissions for memory-mapped accesses on page H8-4968. 


If no error is returned, or the error is returned because of a higher priority error condition, the flag in EDPRSR is 
unchanged. 


See also Behavior of a not permitted memory-mapped access on page H8-4969. 


For more information, see ARM® Debug Interface Architecture Specification. 
H8.5.4 Trapping software access to debug registers 


When EDSCR.TDA == 1, software accesses to the breakpoint and watchpoint registers generate a Halting debug 
event and entry to Debug state. For more information see Software Access debug event on page H3-4903. 


H8.5.5 External debug interface register access permissions summary 
For accesses to: 
° IMPLEMENTATION DEFINED registers, see IMPLEMENTATION DEFINED registers. 
° OPTIONAL registers for CoreSight compliance, see OPTIONAL CoreSight management registers. 


° Reserved, unallocated, or unimplemented registers, writes to read-only registers, and reads of write-only 
registers, see Reserved and unallocated registers. 


For all other external debug interface, CTI, and Performance Monitor registers, Table H8-3 on page H8-4977, 
Table H8-4 on page H8-4979 and Table I2-1 on page I2-5138, show the response of the PE to accesses by the 
external debug interface. 


H8.5.6 IMPLEMENTATION DEFINED registers 


For debug registers, Performance Monitors registers, and CTI registers, IMPLEMENTATION DEFINED register access 
permissions are IMPLEMENTATION DEFINED. The power domain in which these registers are implemented is also 
IMPLEMENTATION DEFINED. 


If OPTIONAL memory-mapped access to the external debug interface is supported, there are additional constraints 
on memory-mapped accesses to registers. These constraints must also apply to IMPLEMENTATION DEFINED registers. 
In particular, if the OPTIONAL Software Lock is locked, writes are ignored and accesses have no side-effects. For 
more information see Register access permissions for memory-mapped accesses on page H8-4968. 


H8.5.7 OPTIONAL CoreSight management registers 


Compliance with CoreSight architecture requires additional registers in the range 0xF00 - 0xFFC that are always 
accessible. See Management registers and CoreSight compliance on page K2-5499. 


H8.5.8 Reserved and unallocated registers 
The following information relates to certain types of reserved accesses: 
° Reads and writes of unallocated locations. These accesses are reserved for the architecture. 


° Reads and writes of locations for features that are not implemented, including: 
— OPTIONAL features that are not implemented. 
— Breakpoints and watchpoints that are not implemented. 
— Performance Monitors counters that are not implemented. 
— CTI triggers that are not implemented. 


These accesses are reserved. 
° Reads of WO locations. These accesses are reserved for the architecture. 
° Writes to RO locations. These accesses are reserved for the architecture. 


Reserved accesses are normally RAZ/WI. However, software must not rely on this property as the behavior of reserved 
values might change in a future revision of the architecture. 
Note 





Reads of WO and writes to RO refers to the default access permissions for a register. For example, when the SLK 
field is set, meaning that the relevant registers become RO, a memory-mapped write to a RW register is ignored, 
and not treated as a reserved access. 





The following reserved registers are RES0 in all conditions, other than when debug power is off: 


° If the implementation is CoreSight architecture compliant, all reserved registers in the range 0xF00 - 0xFFC. 
See Management register access permissions on page K2-5500. 


° All reserved CTI registers. 


Otherwise, the architecture defines that: 


1. If debug power is off, all register accesses, including reserved accesses, return an error. 

2. For reserved debug registers and Performance Monitors registers, the response is a CONSTRAINED 
UNPREDICTABLE choice of error or RES0, when any of the following hold: 


Off The Core power domain is either completely off or in a low-power state in which the Core power 
domain registers cannot be accessed. 

DLK DoubleLockStatus() == TRUE. The OS Double Lock is locked. 

OSLK OSLSR.OSLK == 1. The OS Lock is locked. 


In addition, for reserved debug registers in the address ranges 0x400 - 0x4FC and 0x800 - 0x8FC, the response 
is a CONSTRAINED UNPREDICTABLE choice of error or RES0 when conditions 1 or 2 do not apply and: 


EDAD AllowExternalDebugAccess() == FALSE. External debug is disabled. 

Note 


See also Behavior of a not permitted access on page H8-4971. 








In addition, for reserved Performance Monitors registers in the address ranges 0x000 - 0x0FC and 0x400 - 
0x47C, the response is a CONSTRAINED UNPREDICTABLE choice of error or RES0 when conditions 1 or 2 do not 
apply and: 


EPMAD AllowExternalPMUAccess() == FALSE. External Performance Monitor access is disabled. 





Note 


See also Behavior of a not permitted access on page H8-4971. 
H8.6 External debug interface registers 

The external debug interface register map is described by: 

° Performance Monitors external register views on page I3-5143. 

° Cross-trigger interface registers on page H8-4979. 

° Table H8-2. 

Table H8-2 External debug interface register map

Offset        Mnemonic                       Register, or additional information
0x020         EDESR                          EDESR, External Debug Event Status Register on page H9-5032
0x024         EDECR                          EDECR, External Debug Execution Control Register on page H9-5030
0x030         EDWAR[31:0]                    EDWAR, External Debug Watchpoint Address Register on page H9-5071
0x034         EDWAR[63:32]
0x080         DBGDTRRX_EL0                   Chapter H4 The Debug Communication Channel and Instruction Transfer Register
0x084         EDITR                          EDITR, External Debug Instruction Transfer Register on page H9-5036
0x088         EDSCR                          EDSCR, External Debug Status and Control Register on page H9-5064
0x08C         DBGDTRTX_EL0                   Chapter H4 The Debug Communication Channel and Instruction Transfer Register
0x090         EDRCR                          EDRCR, External Debug Reserve Control Register on page H9-5062
0x094         EDACR                          EDACR, External Debug Auxiliary Control Register on page H9-5011
0x098         EDECCR                         EDECCR, External Debug Exception Catch Control Register on page H9-5028
0x0A0         EDPCSRlo (a)                   EDPCSR, External Debug Program Counter Sample Register on page H9-5041
0x0A4         EDCIDSR (a)                    EDCIDSR, External Debug Context ID Sample Register on page H9-5016
0x0A8         EDVIDSR (a)                    EDVIDSR, External Debug Virtual Context Sample Register on page H9-5069
0x0AC         EDPCSRhi (a)                   EDPCSR, External Debug Program Counter Sample Register on page H9-5041
0x0300        OSLAR_EL1                      OSLAR_EL1, OS Lock Access Register on page H9-5075
0x0310        EDPRCR                         EDPRCR, External Debug Power/Reset Control Register on page H9-5051
0x0314        EDPRSR                         EDPRSR, External Debug Processor Status Register on page H9-5054
0x0400+16×n   DBGBVR<n>_EL1[31:0] (b, c)     DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 on page H9-4993
0x0404+16×n   DBGBVR<n>_EL1[63:32] (b, c)
0x0408+16×n   DBGBCR<n>_EL1                  DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15 on page H9-4990
0x800+16×n    DBGWVR<n>_EL1[31:0] (b, c)     DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15 on page H9-5007
0x804+16×n    DBGWVR<n>_EL1[63:32] (b, c)
0x808+16×n    DBGWCR<n>_EL1 (c)              DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15 on page H9-5004
0xC00-0xCFC   IMPLEMENTATION DEFINED
0xD00         MIDR_EL1                       Main ID register
0xD04-0xD1C                                  Reserved, RES0
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Table H8-2 External debug interface register map (continued) 
































Offset        Mnemonic                       Register, or additional information
0xD20         EDPFR[31:0]                    External Debug Processor Feature Register 0
0xD24         EDPFR[63:32]
0xD28         EDDFR[31:0]                    External Debug Feature Register 0
0xD2C         EDDFR[63:32]
0xD30         Reserved, see next column      Previously defined as Instruction Set Attribute Register 0 bits[31:0]. Behavior is:
                                             Bits[31:20] RES0.
                                             Bits[19:4] UNKNOWN.
                                             Bits[3:0] RES0.
0xD34         RES0                           Previously defined as Instruction Set Attribute Register 0 bits[63:32]
0xD38         UNKNOWN                        Previously defined as Memory Model Feature Register 0
0xD3C         RES0
0xD40-0xDFC   RES0                           Reserved, RES0












































0xD60         EDAA32PFR[31:0]                External Debug AArch32 Processor Feature Register
0xD64         EDAA32PFR[63:32]               External Debug AArch32 Processor Feature Register
0xE80-0xEFC   IMPLEMENTATION DEFINED         -
0xF00-0xF8C   Management registers           Management registers and CoreSight compliance on page K2-5499
0xFA0         DBGCLAIMSET_EL1                DBGCLAIMSET_EL1, Debug Claim Tag Set register on page H9-4998
0xFA4         DBGCLAIMCLR_EL1                DBGCLAIMCLR_EL1, Debug Claim Tag Clear register on page H9-4996
0xFA8         EDDEVAFF0                      EDDEVAFF0, External Debug Device Affinity register 0 on page H9-5017
0xFAC         EDDEVAFF1                      EDDEVAFF1, External Debug Device Affinity register 1 on page H9-5018
0xFB0-0xFB4   Management registers           Management registers and CoreSight compliance on page K2-5499
0xFB8         DBGAUTHSTATUS_EL1              DBGAUTHSTATUS_EL1, Debug Authentication Status register on page H9-4988
0xFC0         EDDEVID2                       EDDEVID2, External Debug Device ID register 2 on page H9-5024
0xFC4         EDDEVID1                       EDDEVID1, External Debug Device ID register 1 on page H9-5023
0xFC8         EDDEVID                        EDDEVID, External Debug Device ID register 0 on page H9-5021
0xFD0-0xFFC   Management registers           Management registers and CoreSight compliance on page K2-5499





a. Implemented only if the OPTIONAL PC Sample-based Profiling Extension is implemented. 


b. A 64-bit register mapped to a pair of 32-bit locations. Doubleword accesses to this register are not guaranteed to be 64-bit single copy atomic. 
See Supported access sizes on page H8-4963. Software must ensure a breakpoint or watchpoint is disabled before altering the value register. 


c. Implemented breakpoints and watchpoints only. n is the breakpoint or the watchpoint number. 





Note 


All other locations are reserved. 
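
The indexed entries in Table H8-2 can be turned into concrete offsets with a little arithmetic. The following is a minimal sketch, not part of the architecture: the constant and function names are hypothetical, and only the base offsets and the 16-byte stride are taken from the table. Because footnote b says that doubleword accesses are not guaranteed to be 64-bit single-copy atomic, the sketch also shows a 64-bit value being split into the two 32-bit words that map to the [31:0] and [63:32] locations.

```python
# Base offsets and stride from Table H8-2. All names here are illustrative.
BREAKPOINT_BASE = 0x400   # DBGBVR<n>_EL1[31:0] for n = 0
WATCHPOINT_BASE = 0x800   # DBGWVR<n>_EL1[31:0] for n = 0
STRIDE = 16               # each breakpoint or watchpoint occupies 16 bytes


def _check_index(n):
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0-15")


def value_reg_offsets(base, n):
    """Return the offsets of the [31:0] and [63:32] halves of value register n."""
    _check_index(n)
    low = base + STRIDE * n
    return low, low + 4


def control_reg_offset(base, n):
    """Return the offset of control register n (DBGBCR<n> or DBGWCR<n>)."""
    _check_index(n)
    return base + 8 + STRIDE * n


def split_doubleword(value):
    """Split a 64-bit value into (bits[31:0], bits[63:32])."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF
```

A debugger following footnote b would disable the breakpoint or watchpoint through its control register before writing either half of the value register.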
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H8.6.1 Access permissions for the External debug interface registers 


Table H8-3 on page H8-4977 shows the access permissions for the external debug interface registers in an 
ARMv8-A Debug implementation. The terms are defined as follows: 


Domain      This describes the power domain in which the register is logically implemented. Registers described
            as implemented in the Core power domain might be implemented in the Debug power domain, as
            long as they exhibit the required behavior.

Conditions  This lists the conditions under which the access is attempted.

            To determine the access permissions for a register, read these columns from left to right, and stop at
            the first column which lists the condition as being true.


The conditions are: 


Off EDPRSR.PU == 0. The Core power domain is completely off, or in a low-power state. In
these cases the Core power domain registers cannot be accessed.

Note

If debug power is off, then all external debug interface accesses return an error.





DLK DoubleLockStatus() == TRUE. The OS Double Lock is locked. 
OSLK OSLSR.OSLK == 1. The OS Lock is locked. 


EDAD AllowExternalDebugAccess() == FALSE. External debug access is disabled. See also 
Behavior of a not permitted access on page H8-4971. 


EPMAD AllowExternalPMUAccess() == FALSE. Access to the external Performance Monitors is 
disabled. See also Behavior of a not permitted access on page H8-4971. 


SLK This provides the modified default access permissions for OPTIONAL memory-mapped 
accesses to the external debug interface if the OPTIONAL Software Lock is locked. See 
Register access permissions for memory-mapped accesses on page H8-4968. For all 
other accesses, this column is ignored. 


Default This provides the default access permissions, if there are no conditions that prevent 
access to the register. 


The access permissions are: 


- This means that the default access permission applies. See the Default column, or the SLK column,
if applicable.





RO This means that the register or field is read-only, and: 
° Unless the register description states otherwise, a RO field in an RW register ignores writes. 
° Where the SLK control makes a RW register RO, the register ignores writes. 

RW This means that the register or field is read/write. Individual fields within the register might be RO 
or WO. See the relevant register description for details. 

RC This means that a read of the register bit clears the field to 0. 

WO This means that the register or field is write-only. Unless the register description states otherwise, a
WO field in a RW register returns an UNKNOWN value on a read of the register.

WI This means that the register or field ignores writes. 

IMP DEF This means that the access permissions are IMPLEMENTATION DEFINED. 
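
The column-reading rule described above can be sketched as a small lookup. This is an illustration only, under a literal reading of the rule: stop at the first condition that is true, and treat "-" as "the default applies". The function name, the dictionary layout, and the `software_locked` parameter are assumptions made for this sketch; the EDESR row values are copied from Table H8-3.

```python
# Condition columns in priority order, left to right, as in Table H8-3.
CONDITIONS = ("Off", "DLK", "OSLK", "EDAD")


def resolve_access(row, true_conditions, software_locked=False):
    """Return the access permission for one table row.

    row:             mapping of column name to table entry, for example "Error" or "-"
    true_conditions: set of condition names that currently hold
    software_locked: True for a memory-mapped access with the Software Lock locked
    """
    entry = "-"
    for name in CONDITIONS:
        if name in true_conditions:
            entry = row[name]      # stop at the first condition that is true
            break
    if entry != "-":
        return entry
    # "-" means the default applies: the SLK column if applicable, else Default.
    if software_locked and row["SLK"] != "-":
        return row["SLK"]
    return row["Default"]


# EDESR row from Table H8-3.
EDESR = {"Off": "Error", "DLK": "Error", "OSLK": "-", "EDAD": "-",
         "Default": "RW", "SLK": "RO"}
```

For example, with only the OS Lock locked, an EDESR access resolves to the default RW permission, whereas with the Core power domain off it resolves to Error.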
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If OPTIONAL memory-mapped access to the external debug interface is supported, there might be additional 
constraints on memory-mapped accesses. See Register access permissions for memory-mapped accesses on 


page H8-4968. 


Table H8-3 Access permissions for the external debug interface registers 





Conditions (priority from left to right) 






















































































Offset        Register                    Domain    Off           DLK           OSLK      EDAD          Default   SLK
0x020         EDESR                       Core      Error         Error         -         -             RW        RO
0x024         EDECR                       Debug     -             -             -         -             RW        RO
0x030         EDWAR[31:0]                 Core      Error         Error         Error     -             RO        -
0x034         EDWAR[63:32]
0x080         DBGDTRRX_EL0                Core      Error         Error         Error     -             RW        RO
0x084         EDITR                       Core      Error         Error         Error     -             WO        WI
0x088         EDSCR                       Core      Error         Error         Error     -             RW        RO
0x08C         DBGDTRTX_EL0                Core      Error         Error         Error     -             RW        RO
0x090         EDRCR                       Core      Error         Error         Error     -             WO        WI
0x094         EDACR                       IMP DEF   IMP DEF       IMP DEF       IMP DEF   -             RW        RO
0x098         EDECCR                      Core      Error         Error         Error     -             RW        RO
0x0A0         EDPCSR[31:0] (a)            Core      Error         Error         Error     -             RO        RO
0x0A4         EDCIDSR (a)                 Core      Error         Error         Error     -             RO        RO
0x0A8         EDVIDSR (a)                 Core      Error         Error         Error     -             RO        RO
0x0AC         EDPCSR[63:32] (a)           Core      Error         Error         Error     -             RO        RO
0x0300        OSLAR_EL1                   Core      Error         Error         -         IMP DEF (b)   WO        WI
0x0310 (c)    EDPRCR                      See register field descriptions for information
0x0314 (d)    EDPRSR                      See register field descriptions for information
0x0400+16×n   DBGBVR<n>_EL1[31:0] (e)     Core      Error         Error         Error     Error         RW        RO
0x0404+16×n   DBGBVR<n>_EL1[63:32] (e)    Core      Error         Error         Error     Error         RW        RO
0x0408+16×n   DBGBCR<n>_EL1 (e)           Core      Error         Error         Error     Error         RW        RO
0x800+16×n    DBGWVR<n>_EL1[31:0] (e)     Core      Error         Error         Error     Error         RW        RO
0x804+16×n    DBGWVR<n>_EL1[63:32] (e)    Core      Error         Error         Error     Error         RW        RO
0x808+16×n    DBGWCR<n>_EL1 (e)           Core      Error         Error         Error     Error         RW        RO
0xD00         MIDR_EL1                    IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
0xD20         EDPFR[31:0]                 IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
0xD24         EDPFR[63:32]                IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
0xD28         EDDFR[31:0]                 IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
0xD2C         EDDFR[63:32]                IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
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Table H8-3 Access permissions for the external debug interface registers (continued) 





Conditions (priority from left to right) 
































Offset        Register                    Domain    Off           DLK           OSLK      EDAD          Default   SLK
0xD60         EDAA32PFR[31:0]             IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
0xD64         EDAA32PFR[63:32]            IMP DEF   IMP DEF (f)   IMP DEF (f)   -         -             RO        RO
0xFA0         DBGCLAIMSET_EL1             Core      Error         Error         Error     -             RW        RO
0xFA4         DBGCLAIMCLR_EL1             Core      Error         Error         Error     -             RW        RO
0xFA8         EDDEVAFF0                   Debug     -             -             -         -             RO        RO
0xFAC         EDDEVAFF1                   Debug     -             -             -         -             RO        RO
0xFB8         DBGAUTHSTATUS_EL1           Debug     -             -             -         -             RO        RO
0xFC0         EDDEVID2                    Debug     -             -             -         -             RO        RO
0xFC4         EDDEVID1                    Debug     -             -             -         -             RO        RO
0xFC8         EDDEVID                     Debug     -             -             -         -             RO        RO





a. Implemented only if the PC Sample-based profiling Extension is implemented. 
b. It is IMPLEMENTATION DEFINED whether an error is returned. See External access disabled on page H8-4970. If no error is returned, 
the access is permitted. 
c. Some control bits are in the Core power domain. These bits ignore writes when Core power domain registers cannot be accessed as shown. 


d. Some status bits are fetched from the Core power domain. These bits read UNKNOWN when Core power domain registers cannot be accessed 
as shown. 


e. Implemented breakpoints and watchpoints only. n is the breakpoint or watchpoint number. 


f. It is IMPLEMENTATION DEFINED whether an error is returned. See External debug over powerdown and locks on page H8-4970. If no error 
is returned, the access is permitted. 


For the reset values for the external debug interface registers, see Table H8-6 on page H8-4981. 
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H8.7 Cross-trigger interface registers 
The embedded Cross-trigger Interface, CTI, is located within its own block of the external debug memory map. 
There must be one such block for each PE. 
If the CTI of a PE does not implement the CTIDEVAFF0 or CTIDEVAFF1 registers it must be located 64KB above 
the debug registers in the external debug interface. 
Table H8-4 shows the CTI register map. 
Table H8-4 Cross-trigger interface map

Offset          Mnemonic                 Location of further details
0x000           CTICONTROL               CTICONTROL, CTI Control register on page H9-5091
0x010           CTIINTACK                CTIINTACK, CTI Output Trigger Acknowledge register on page H9-5104
0x014           CTIAPPSET                CTIAPPSET, CTI Application Trigger Set register on page H9-5080
0x018           CTIAPPCLEAR              CTIAPPCLEAR, CTI Application Trigger Clear register on page H9-5078
0x01C           CTIAPPPULSE              CTIAPPPULSE, CTI Application Pulse register on page H9-5079
0x020+4×n       CTIINEN<n> (a)           CTIINEN<n>, CTI Input Trigger to Output Channel Enable registers, n = 0 - 31 on page H9-5103
0x0A0+4×n       CTIOUTEN<n> (a)          CTIOUTEN<n>, CTI Input Channel to Output Trigger Enable registers, n = 0 - 31 on page H9-5111
0x130           CTITRIGINSTATUS          CTITRIGINSTATUS, CTI Trigger In Status register on page H9-5117
0x134           CTITRIGOUTSTATUS         CTITRIGOUTSTATUS, CTI Trigger Out Status register on page H9-5118
0x138           CTICHINSTATUS            CTICHINSTATUS, CTI Channel In Status register on page H9-5083
0x13C           CTICHOUTSTATUS           CTICHOUTSTATUS, CTI Channel Out Status register on page H9-5084
0x140           CTIGATE                  CTIGATE, CTI Channel Gate Enable register on page H9-5102
0x144           ASICCTL                  ASICCTL, CTI External Multiplexer Control register on page H9-5077
0xE80 - 0xEFC   IMPLEMENTATION DEFINED   IMPLEMENTATION DEFINED. See Management registers and CoreSight compliance on page K2-5499
0xF00 - 0xFBC   Management registers     Management registers and CoreSight compliance on page K2-5499
0xFC0           CTIDEVID2                CTIDEVID2, CTI Device ID register 2 on page H9-5100
0xFC4           CTIDEVID1                CTIDEVID1, CTI Device ID register 1 on page H9-5099
0xFC8           CTIDEVID                 CTIDEVID, CTI Device ID register 0 on page H9-5097
0xFD0 - 0xFFC   Management registers     Management registers and CoreSight compliance on page K2-5499





a. Implemented triggers, including triggers that are not connected, only. n is the trigger number. 
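
As a rough illustration of the indexed entries in Table H8-4 and of the 64KB placement rule stated at the start of this section, the following sketch computes CTI register offsets. The function names are hypothetical, and the base address used in the usage note is an arbitrary example, not a value defined by the architecture.

```python
CTI_BLOCK_DISTANCE = 64 * 1024   # 64KB above the debug registers of the same PE


def default_cti_base(debug_base):
    """Base of the CTI block when CTIDEVAFF0 and CTIDEVAFF1 are not implemented."""
    return debug_base + CTI_BLOCK_DISTANCE


def _check_trigger(n):
    if not 0 <= n <= 31:
        raise ValueError("trigger number must be in the range 0-31")


def ctiinen_offset(n):
    """Offset of CTIINEN<n>, from the 0x020+4xn entry in Table H8-4."""
    _check_trigger(n)
    return 0x020 + 4 * n


def ctiouten_offset(n):
    """Offset of CTIOUTEN<n>, from the 0x0A0+4xn entry in Table H8-4."""
    _check_trigger(n)
    return 0x0A0 + 4 * n
```

With an assumed debug block at 0x80010000, `default_cti_base` gives 0x80020000, and CTIOUTEN<2> would then be at that base plus `ctiouten_offset(2)`. Only implemented triggers have registers, so a real tool would also need to know how many triggers the CTI implements.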
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Table H8-5 shows the access permissions for the CTI registers in an ARMv8-A Debug implementation. For a 
definition of the terms used, see External debug interface registers on page H8-4974. 


Table H8-5 Access permissions for the CTI registers 





Conditions (priority from left to right) 


















































Offset      Register           Domain   Off   DLK   OSLK   EDAD   Default   SLK
0x000       CTICONTROL         Debug    -     -     -      -      RW        RO
0x010       CTIINTACK          Debug    -     -     -      -      WO        WI
0x014       CTIAPPSET          Debug    -     -     -      -      RW        RO
0x018       CTIAPPCLEAR        Debug    -     -     -      -      WO        WI
0x01C       CTIAPPPULSE        Debug    -     -     -      -      WO        WI
0x020+4×n   CTIINEN<n> (a)     Debug    -     -     -      -      RW        RO
0x0A0+4×n   CTIOUTEN<n>        Debug    -     -     -      -      RW        RO
0x130       CTITRIGINSTATUS    Debug    -     -     -      -      RO        RO
0x134       CTITRIGOUTSTATUS   Debug    -     -     -      -      RO        RO
0x138       CTICHINSTATUS      Debug    -     -     -      -      RO        RO
0x13C       CTICHOUTSTATUS     Debug    -     -     -      -      RO        RO
0x140       CTIGATE            Debug    -     -     -      -      RW        RO
0xFC0       CTIDEVID2          Debug    -     -     -      -      RO        RO
0xFC4       CTIDEVID1          Debug    -     -     -      -      RO        RO
0xFC8       CTIDEVID           Debug    -     -     -      -      RO        RO





a. Implemented triggers only (including triggers that are not connected). n is the trigger number. 


For the reset values of the CTI registers, see Table H8-7 on page H8-4983. 
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External debug register resets 


Each register or field has a defined reset domain: 


• Registers and fields in the Warm reset domain are also reset by a Cold reset and unchanged by an External
Debug reset that is not coincident with a Cold reset or a Warm reset.

• Registers and fields in the Cold reset domain are unchanged by a Warm reset or an External Debug reset that
is not coincident with a Cold reset.

• Registers and fields in the External Debug reset domain are unchanged by a Cold reset or a Warm reset that
is not coincident with an External Debug reset.
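
The three rules above reduce to a small predicate. The sketch below is illustrative only: the function name and the string labels for the resets are assumptions, and it models nothing beyond which asserted resets affect a register in a given reset domain.

```python
def domain_is_reset(domain, asserted_resets):
    """Return True if a register in `domain` is reset by the asserted resets.

    domain:          "Warm", "Cold", or "External Debug"
    asserted_resets: set of resets asserted together, using the same labels
    """
    if domain == "Warm":
        # The Warm reset domain is reset by a Warm reset and also by a Cold reset.
        return bool(asserted_resets & {"Warm", "Cold"})
    if domain == "Cold":
        return "Cold" in asserted_resets
    if domain == "External Debug":
        return "External Debug" in asserted_resets
    raise ValueError("unknown reset domain: " + repr(domain))
```

For example, an External Debug reset on its own leaves Warm and Cold domain registers unchanged, while a coincident Cold reset and External Debug reset affects all three domains.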


A reset might change the value of a register. Specific rules apply to the observability of registers in the External 
Debug reset domain by indirect reads from the Core power domain when an External Debug reset is asserted without 
a coincident Cold reset. For more information, see Synchronization of changes to the external debug registers on 
page H8-4964. 


Table H8-6 and Table H8-7 on page H8-4983 show the external debug register and CTI register resets. For other 
debug registers and Performance Monitors registers, see Management register resets on page K2-5504 and Power 
domains and Performance Monitors registers reset on page I2-5138. 





Note 


By reference to Figure H6-2 on page H6-4955 the power domain can be deduced from the reset domain. Table K2-7 
on page K2-5504 also shows reset power domains. 





Table H8-6 and Table H8-7 on page H8-4983 do not include: 


• Read-only identification registers, such as Processor ID Registers and PMCFGR, that have a fixed value from
reset.

• Read-only status registers, such as EDSCR.RW, that are evaluated each time the register is read and that have
no meaningful reset value.

• Write-only registers, such as EDRCR, that only have an effect on writes, and have no meaningful reset value.

• Read/write registers, such as breakpoint and watchpoint registers, and EDPRCR.CORENPDRQ, that alias
other registers. The reset values are described by the descriptions of those other registers.

• IMPLEMENTATION DEFINED registers. The reset values and reset domains of these registers are also
IMPLEMENTATION DEFINED and might be UNKNOWN.


All other fields in the registers are set to an IMPLEMENTATION DEFINED value, that can be UNKNOWN. The register 
is in the specified reset domain. 


Note 


An IMPLEMENTATION DEFINED reset value, which can be UNKNOWN, means that hardware is not required to reset 
the register on the specified reset, but software must not rely on the register being preserved over reset. 








Table H8-6 Summary of external debug register resets, debug registers 

















Register   Reset domain     Field      Value       Description
EDESR      Warm             SS         EDECR.SS    Halting Step debug event pending
                            RC         EDECR.RCE   Reset Catch debug event pending
                            OSUC       0           OS Unlock Catch debug event pending
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H8.8 External debug register resets 


Table H8-6 Summary of external debug register resets, debug registers (continued) 













































































Register   Reset domain     Field          Value   Description
EDECR      External debug   SS             0       Halting Step debug event enable
                            RCE            0       Reset Catch debug event enable
                            OSUCE          0       OS Unlock Catch debug event enable
EDWAR      Cold             -              -       All fields
EDSCR      Cold             RXfull         0       DTRRX register full
                            TXfull         0       DTRTX register full
                            RXO            0       DTRRX overrun
                            TXU            0       DTRTX underrun
                            INTdis         0       Interrupt disable
                            TDA            0       Trap debug register accesses to Debug state
                            MA             0       Memory access mode in Debug state
                            HDE            0       Halting debug mode enable
                            ERR            0       Cumulative error flag
EDECCR     Cold             NSE[2:1]       0b00    Coarse-grained Non-secure Exception Catch
                            SE[3,1]        0b00    Coarse-grained Secure Exception Catch
EDPCSR     Cold             -              -       All fields
EDCIDSR    Cold             -              -       All fields
EDVIDSR    Cold             -              -       All fields
EDPRCR     External debug   COREPURQ (a)   0       Core powerup request
EDPRSR     Warm             SDR            -       Sticky debug restart
           Cold             SPMAD          0       Sticky EPMAD error
                            SDAD           0       Sticky EDAD error
           Warm             SR             1       Sticky reset status
           Cold             SPD            1       Sticky powerdown status





a. On a Cold reset into AArch64 state, DBGPRCR_EL1.CORENPDRQ resets to the value of EDPRCR.COREPURQ.
On a Cold reset into AArch32 state, DBGPRCR.CORENPDRQ resets to the value of EDPRCR.COREPURQ. If an
External Debug reset and a Cold reset coincide, both EDPRCR.COREPURQ and the CORENPDRQ field of the
appropriate System register are reset to 0.
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Table H8-7 Summary of external debug register resets, CTI registers 





























Register      Reset domain             Field   Value                    Description
CTICONTROL    External debug           GLBEN   0                        CTI global enable
CTIAPPSET     External debug           -       -                        All fields
CTIINEN<n>    External debug           -       -                        All fields
CTIOUTEN<n>   External debug           -       -                        All fields
CTIGATE       External debug           -       -                        All fields
ASICCTL       IMPLEMENTATION DEFINED   -       IMPLEMENTATION DEFINED   All of register
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Chapter H9 
External Debug Register Descriptions 


This chapter provides a description of the external debug registers. 


It contains the following sections: 

• About the debug registers on page H9-4986.
• External debug registers on page H9-4987.
• Cross-Trigger Interface registers on page H9-5076.
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H9.1 About the debug registers 


The following sections describe the registers that are accessible through the external debug interface: 
° External debug registers on page H9-4987. 
° Cross-Trigger Interface registers on page H9-5076. 





H9-4986 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 




H9.2 External debug registers 


This section describes the debug registers that are accessible through the external debug interface and are used for 
external debug. 


This section lists the registers that are accessible through the external debug interface. 
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H9.2.1 DBGAUTHSTATUS_EL1, Debug Authentication Status register 
The DBGAUTHSTATUS_EL1 characteristics are:


Purpose 
Provides information about the state of the IMPLEMENTATION DEFINED authentication interface for 
debug. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 


External register DBGAUTHSTATUS_EL1 is architecturally mapped to AArch64 System register 
DBGAUTHSTATUS_EL1. 


External register DBGAUTHSTATUS_EL1 is architecturally mapped to AArch32 System register
DBGAUTHSTATUS.


DBGAUTHSTATUS_EL1 is in the Debug power domain. 


Attributes 
DBGAUTHSTATUS_EL1 is a 32-bit register. 


Field descriptions 


The DBGAUTHSTATUS_EL1 bit assignments are:

Bits      Field
[31:8]    RES0
[7:6]     SNID
[5:4]     SID
[3:2]     NSNID
[1:0]     NSID


Bits [31:8] 


Reserved, RESO. 


SNID, bits [7:6] 


Secure non-invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is 
Non-secure state. 

10 Implemented and disabled. ExternalSecureNoninvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalSecureNoninvasiveDebugEnabled() == TRUE. 


Other values are reserved. 


SID, bits [5:4] 


Secure invasive debug. Possible values of this field are: 





00 Not implemented. EL3 is not implemented and the implemented Security state is 
Non-secure state. 
10 Implemented and disabled. ExternalSecureInvasiveDebugEnabled() == FALSE. 
11 Implemented and enabled. ExternalSecureInvasiveDebugEnabled() == TRUE. 
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Other values are reserved. 


NSNID, bits [3:2] 


Non-secure non-invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is Secure 
state. 

10 Implemented and disabled. ExternalNoninvasiveDebugEnabled() == FALSE. 

11 Implemented and enabled. ExternalNoninvasiveDebugEnabled() == TRUE. 


Other values are reserved. 


NSID, bits [1:0] 


Non-secure invasive debug. Possible values of this field are: 


00 Not implemented. EL3 is not implemented and the implemented Security state is Secure 
state. 

10 Implemented and disabled. ExternalInvasiveDebugEnabled() == FALSE.

11 Implemented and enabled. ExternalInvasiveDebugEnabled() == TRUE. 


Other values are reserved. 
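
To make the field layout concrete, the following sketch decodes a DBGAUTHSTATUS_EL1 value into its four 2-bit fields. The function name and the returned strings are illustrative choices; the bit positions and encodings are the ones listed in the field descriptions above.

```python
# Encodings shared by SNID, SID, NSNID, and NSID. The value 0b01 is reserved.
_ENCODINGS = {
    0b00: "not implemented",
    0b10: "implemented and disabled",
    0b11: "implemented and enabled",
}

# Least significant bit position of each field.
_FIELD_SHIFTS = {"SNID": 6, "SID": 4, "NSNID": 2, "NSID": 0}


def decode_dbgauthstatus(value):
    """Decode a 32-bit DBGAUTHSTATUS_EL1 value into a dict of field states."""
    return {
        name: _ENCODINGS.get((value >> shift) & 0b11, "reserved")
        for name, shift in _FIELD_SHIFTS.items()
    }
```

For example, the value 0xAF decodes as both Secure fields implemented and disabled, and both Non-secure fields implemented and enabled.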


Accessing the DBGAUTHSTATUS_EL1: 


DBGAUTHSTATUS_EL1 can be accessed through the external debug interface:

Component Offset
Debug 0xFB8
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H9.2.2 


DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15


The DBGBCR<n>_EL1 characteristics are:


Purpose 


Holds control information for a breakpoint. Forms breakpoint n together with value register 
DBGBVR<n>_EL1. 


Usage constraints 


This register is accessible as follows:

Off     DLK     OSLK    EDAD    SLK   Default
Error   Error   Error   Error   RO    RW


Configurations


External register DBGBCR<n>_EL1 is architecturally mapped to AArch64 System register
DBGBCR<n>_EL1.


External register DBGBCR<n>_EL1 is architecturally mapped to AArch32 System register
DBGBCR<n>.


DBGBCR<n>_EL1 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply only on a Cold reset. The register is not affected by a Warm reset
and is not affected by an External debug reset.


If breakpoint n is not implemented then this register is unallocated. 


Attributes 


DBGBCR<n>_EL1 is a 32-bit register.


Field descriptions 


The DBGBCR<n>_EL1 bit assignments are:

Bits      Field
[31:24]   RES0
[23:20]   BT
[19:16]   LBN
[15:14]   SSC
[13]      HMC
[12:9]    RES0
[8:5]     BAS
[4:3]     RES0
[2:1]     PMC
[0]       E


When the E field is zero, all the other fields in the register are ignored. 


Bits [31:24] 


Reserved, RESO. 


BT, bits [23:20] 


Breakpoint Type. Possible values are: 


0000 Unlinked instruction address match.
0001 Linked instruction address match.
0010 Unlinked context ID match.
0011 Linked context ID match.
0100 Unlinked instruction address mismatch.
0101 Linked instruction address mismatch.
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1000 Unlinked VMID match. 

1001 Linked VMID match. 

1010 Unlinked VMID and context ID match. 
1011 Linked VMID and context ID match. 


The field breaks down as follows: 


• BT[3:1]: Base type.

000 Match address. DBGBVR<n>_EL1 is the address of an instruction.
010 Mismatch address. DBGBVR<n>_EL1 is the address of an instruction to be stepped.
001 Match context ID. DBGBVR<n>_EL1.ContextID is a context ID.
100 Match VMID. DBGBVR<n>_EL1.VMID is a VMID.
101 Match VMID and context ID. DBGBVR<n>_EL1.ContextID is a context ID,
and DBGBVR<n>_EL1.VMID is a VMID.

• BT[0]: Enable linking.


All other values are reserved. Constraints on breakpoint programming mean other values are
reserved under some conditions. For more information, including the effect of programming this
field to a reserved value, see Reserved DBGBCR<n>_EL1.BT values on page D2-1652.


This field resets to a value that is architecturally UNKNOWN. 
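
The split of BT into a base type and a linking bit can be illustrated with a short decoder. This is a sketch only: the function name and the descriptive strings are assumptions, and it reports an encoding outside the listed values as reserved without modelling the further constraints referenced above.

```python
# BT[3:1] base types listed in the field description.
_BASE_TYPES = {
    0b000: "address match",
    0b010: "address mismatch",
    0b001: "context ID match",
    0b100: "VMID match",
    0b101: "VMID and context ID match",
}


def decode_bt(bt):
    """Decode a 4-bit BT value into (base type, linked), or None if reserved."""
    if not 0 <= bt <= 0b1111:
        raise ValueError("BT is a 4-bit field")
    base = _BASE_TYPES.get(bt >> 1)   # BT[3:1] selects the base type
    if base is None:
        return None
    return base, bool(bt & 1)         # BT[0] enables linking
```

For instance, 0b1011 decodes as a linked VMID and context ID match, and 0b0110 is not one of the listed encodings.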


LBN, bits [19:16] 


Linked breakpoint number. For Linked address matching breakpoints, this specifies the index of the 
Context-matching breakpoint linked to. 


For all other breakpoint types this field is ignored and reads of the register return an UNKNOWN 
value. 


This field is ignored when the value of DBGBCR<n>_EL1.E is 0. 


This field resets to a value that is architecturally UNKNOWN. 


SSC, bits [15:14]


Security state control. Determines the Security states under which a Breakpoint debug event for
breakpoint n is generated. This field must be interpreted along with the HMC and PMC fields, and
there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more
information, including the effect of programming the fields to a reserved set of values, see Reserved
DBGBCR<n>_EL1.{SSC, HMC, PMC} values on page D2-1652.


This field resets to a value that is architecturally UNKNOWN.


HMC, bit [13]


Higher mode control. Determines the debug perspective for deciding when a Breakpoint debug
event for breakpoint n is generated. This field must be interpreted along with the SSC and PMC
fields, and there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more
information see the DBGBCR<n>_EL1.SSC description.


This field resets to a value that is architecturally UNKNOWN.


Bits [12:9]


Reserved, RES0.


BAS, bits [8:5]


Byte address select. Defines which half-words an address-matching breakpoint matches, regardless
of the instruction set and Execution state. In an AArch64-only implementation, this field is reserved,
RES1.


The permitted values depend on the breakpoint type.
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Bits [4:3] 


For Address match breakpoints in either AArch32 or AArch64 state, the permitted values are: 





BAS    Match instruction at   Constraint for debuggers
0011   DBGBVR<n>_EL1          Use for T32 instructions.
1100   DBGBVR<n>_EL1+2        Use for T32 instructions.
1111   DBGBVR<n>_EL1          Use for A64 and A32 instructions.





All other values are reserved. 
For more information, see Using the BAS field in Address Match breakpoints on page G2-3949. 


For Address mismatch breakpoints in an AArch32 stage 1 translation regime, the permitted values 
are: 





BAS    Step instruction at    Constraint for debuggers
0000   -                      Use for a match anywhere breakpoint.
0011   DBGBVR<n>_EL1          Use for stepping T32 instructions.
1100   DBGBVR<n>_EL1+2        Use for stepping T32 instructions.
1111   DBGBVR<n>_EL1          Use for stepping A64 and A32 instructions.





For more information, see Using the BAS field in Address Match breakpoints on page G2-3949. 
For Context matching breakpoints, this field is RES1 and ignored. 


If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN. 


Bits [4:3] are reserved, RES0.


PMC, bits [2:1]


Privilege mode control. Determines the Exception level or levels at which a Breakpoint debug event
for breakpoint n is generated. This field must be interpreted along with the SSC and HMC fields,
and there are constraints on the permitted values of the {HMC, SSC, PMC} fields. For more
information see the DBGBCR<n>_EL1.SSC description.


This field resets to a value that is architecturally UNKNOWN.


E, bit [0]


Enable breakpoint DBGBVR<n>_EL1. Possible values are:
0 Breakpoint disabled.
1 Breakpoint enabled.


This field resets to a value that is architecturally UNKNOWN.
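
Taken together, the field positions described in this section allow a DBGBCR<n>_EL1 value to be assembled from its parts. The sketch below is illustrative: the function name is hypothetical, it checks only that each value fits its field, and it does not check the constraints on {HMC, SSC, PMC} or on reserved BT and BAS values. The reserved bits [31:24], [12:9], and [4:3] are left as zero.

```python
def pack_dbgbcr(bt, lbn, ssc, hmc, bas, pmc, e):
    """Assemble a 32-bit DBGBCR<n>_EL1 value from its fields."""
    fields = (
        (bt, 20, 4),    # BT,  bits [23:20]
        (lbn, 16, 4),   # LBN, bits [19:16]
        (ssc, 14, 2),   # SSC, bits [15:14]
        (hmc, 13, 1),   # HMC, bit [13]
        (bas, 5, 4),    # BAS, bits [8:5]
        (pmc, 1, 2),    # PMC, bits [2:1]
        (e, 0, 1),      # E,   bit [0]
    )
    value = 0
    for field_value, shift, width in fields:
        if not 0 <= field_value < (1 << width):
            raise ValueError("field value does not fit in %d bits" % width)
        value |= field_value << shift
    return value
```

The arguments in any call are the caller's responsibility. For example, `pack_dbgbcr(bt=0, lbn=0, ssc=0, hmc=0, bas=0b1111, pmc=0b11, e=1)` only exercises the bit positions and is not a recommended configuration.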


Accessing the DBGBCR<n>_EL1: 


DBGBCR<n>_EL1 can be accessed through the external debug interface: 





Component Offset
Debug 0x408 + 16n
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H9.2.3 DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 
The DBGBVR<n>_EL1 characteristics are:


Purpose 


Holds a virtual address, or a VMID and/or a context ID, for use in breakpoint matching. Forms 
breakpoint n together with control register DBGBCR<n>_EL1. 


Usage constraints 


This register is accessible as follows: 





Off     DLK     OSLK    EDAD    SLK   Default
Error   Error   Error   Error   RO    RW





Configurations 


External register DBGBVR<n>_EL1 is architecturally mapped to AArch64 System register
DBGBVR<n>_EL1.


External register DBGBVR<n>_EL1[31:0] is architecturally mapped to AArch32 System register 
DBGBVR<n>. 


If the breakpoint is context-aware and EL2 is implemented, then External register 
DBGBVR<n>_EL1[63:32] is architecturally mapped to AArch32 System register DBGBXVR<n>. 
Otherwise there is no External register access to DBGBVR<n>_EL1[63:32] from AArch32 state. 


DBGBVR<n>_EL1 is in the Core power domain. RW fields in this register reset to architecturally 
UNKNOWN values. These apply only on a Cold reset. The register is not affected by a Warm reset 
and is not affected by an External debug reset. 


If breakpoint n is not implemented then this register is unallocated. 


Attributes 
How this register is interpreted depends on the value of DBGBCR<n>_EL1.BT.
• When DBGBCR<n>_EL1.BT is 0b0x0x, this register holds a virtual address.
• When DBGBCR<n>_EL1.BT is 0b001x, this register holds a Context ID.
• When DBGBCR<n>_EL1.BT is 0b100x, this register holds a VMID.
• When DBGBCR<n>_EL1.BT is 0b101x, this register holds a VMID and a Context ID.
For other values of DBGBCR<n>_EL1.BT, this register is RES0.


Field descriptions 


The DBGBVR<n>_EL1 bit assignments are:


When DBGBCR<n>_EL1.BT==0b0x0x:

Bits      Field
[63:49]   RESS
[48:2]    VA
[1:0]     RES0


RESS, bits [63:49] 


Reserved, Sign extended. Software must treat this field as RESO if bit[48] is 0 or RESO, and as RES1 
if bit[48] is 1. 
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Hardware always ignores the value of these bits and it is IMPLEMENTATION DEFINED whether: 


° The bits are hardwired to a copy of bit [48], meaning writes to these bits are ignored, and 
reads to the bits always return the hardwired value. 


° The value in those bits can be written, and reads will return the last value written. The value 
held in those bits is ignored by hardware. 


This field resets to a value that is architecturally UNKNOWN. 


VA, bits [48:2] 


If the address is being matched in an AArch64 stage 1 translation regime, this field contains 
bits[48:2] of the address for comparison. 


If the address is being matched in an AArch32 stage 1 translation regime, the first 16 bits of this 
field are RESO, and the rest of the field contains bits[31:2] of the address for comparison. 


This field resets to a value that is architecturally UNKNOWN. 
Bits [1:0] 


Reserved, RESO. 
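
The address layout above can be sketched as a small encoder for the AArch64 case. The function name is hypothetical and the code is an illustration only. It keeps bits[48:2] of the address, clears bits[1:0], and fills bits[63:49] with copies of bit[48], which is how software is told to treat the RESS field.

```python
def encode_va_breakpoint(address):
    """Build a 64-bit DBGBVR<n>_EL1 value for an address-matching breakpoint."""
    value = address & ((1 << 49) - 1)   # keep bits [48:0]
    value &= ~0b11                      # bits [1:0] are reserved, written as 0
    if value & (1 << 48):
        # RESS, bits [63:49]: 15 bits that copy bit [48].
        value |= ((1 << 15) - 1) << 49
    return value
```

The resulting 64-bit value would then be written as two 32-bit words to the [31:0] and [63:32] locations of the register.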


When DBGBCR<n>_EL1.BT==0b001x:


63 32 31 0 


RESO ContextID 


Bits [63:32] 


Reserved, RESO. 


ContextID, bits [31:0] 
Context ID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 


When DBGBCR<n>_EL1.BT==0b100x and EL2 implemented: 


63 40 39 32 31 0 


RESO VMID RESO 


Bits [63:40] 
Reserved, RESO. 
VMID, bits [39:32] 
VMID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 


Bits [31:0] 


Reserved, RESO. 
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When DBGBCRe<n>_EL1.BT==0b101x and EL2 implemented: 


Bits      Field
[63:40]   RES0
[39:32]   VMID
[31:0]    ContextID


Bits [63:40] 
Reserved, RESO. 
VMID, bits [39:32] 
VMID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 


ContextID, bits [31:0] 
Context ID value for comparison. 


This field resets to a value that is architecturally UNKNOWN. 
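
For the combined VMID and context ID format, a value can be packed as in the following sketch. The function name is an assumption made for illustration; the field positions are taken from the bit assignments above.

```python
def pack_vmid_contextid(vmid, context_id):
    """Build a DBGBVR<n>_EL1 value holding VMID [39:32] and ContextID [31:0]."""
    if not 0 <= vmid <= 0xFF:
        raise ValueError("VMID is an 8-bit field")
    if not 0 <= context_id <= 0xFFFFFFFF:
        raise ValueError("ContextID is a 32-bit field")
    return (vmid << 32) | context_id
```

With this layout the low word of the result is the context ID and the high word carries the VMID in its bottom 8 bits, so the two values land in the [31:0] and [63:32] locations respectively.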


Accessing the DBGBVR<n>_EL1: 


DBGBVR<n>_EL1[31:0] can be accessed through the external debug interface:

Component Offset
Debug 0x400 + 16n
DBGBVR<n>_EL1[63:32] can be accessed through the external debug interface: 
Component Offset 
Debug 0x404 + 16n 
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H9.2.4 DBGCLAIMCLR_EL1, Debug Claim Tag Clear register 
The DBGCLAIMCLR_EL1 characteristics are: 
Purpose 


Used by software to read the values of the CLAIM tag bits, and to clear these bits to 0. 
The architecture does not define any functionality for the CLAIM tag bits. 


— Note 


CLAIM tags are typically used for communication between the debugger and target software. 





Used in conjunction with the DBGCLAIMSET_EL1 register.


Usage constraints 


This register is accessible as follows: 





Off     DLK     OSLK    SLK   Default
Error   Error   Error   RO    RW





Configurations 


External register DBGCLAIMCLR_EL1 is architecturally mapped to AArch64 System register 
DBGCLAIMCLR_EL1. 


External register DBGCLAIMCLR_EL1 is architecturally mapped to AArch32 System register 
DBGCLAIMCLR. 


DBGCLAIMCLR_EL1 is in the Core power domain. 


See the CLAIM field description for the effect of a Cold reset on the value returned by this register. 
This register is not affected by a Warm reset, and is not affected by an External debug reset. 


An implementation must include 8 CLAIM tag bits. 


Attributes 
DBGCLAIMCLR_EL1 is a 32-bit register.


Field descriptions 


The DBGCLAIMCLR_EL1 bit assignments are: 


31 8 7 0 
RAZ/SBZ CLAIM 
Bits [31:8] 


Reserved, RAZ/SBZ. Software can rely on these bits reading as zero, and must use a should-be-zero 
policy on writes. Implementations must ignore writes. 

CLAIM, bits [7:0] 
Read or clear CLAIM tag bits. Reading this field returns the current value of the CLAIM tag bits. 


Writing a 1 to one of these bits clears the corresponding CLAIM tag bit to 0. This is an indirect write
to the CLAIM tag bits. A single write operation can clear multiple CLAIM tag bits to 0. 


Writing 0 to one of these bits has no effect. 
A cold reset clears the CLAIM tag bits to 0. 
Accessing the DBGCLAIMCLR_EL1:


DBGCLAIMCLR_EL1 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFA4
H9.2.5 DBGCLAIMSET_EL1, Debug Claim Tag Set register

The DBGCLAIMSET_EL1 characteristics are: 

Purpose 
Used by software to set the CLAIM tag bits to 1. 
The architecture does not define any functionality for the CLAIM tag bits. 
—— Note 
CLAIM tags are typically used for communication between the debugger and target software. 
Used in conjunction with the DBGCLAIMCLR_EL1 register.

Usage constraints 
This register is accessible as follows: 

Off     DLK     OSLK    SLK   Default
Error   Error   Error   RO    RW

Configurations 
External register DBGCLAIMSET_EL1 is architecturally mapped to AArch64 System register 
DBGCLAIMSET_EL1. 
External register DBGCLAIMSET_EL1 is architecturally mapped to AArch32 System register 
DBGCLAIMSET. 
DBGCLAIMSET_EL1 is in the Core power domain. 
An implementation must include 8 CLAIM tag bits. 

Attributes 
DBGCLAIMSET_EL1 is a 32-bit register. 

Field descriptions 

The DBGCLAIMSET_EL1 bit assignments are: 

31 8 7 0 

RAZ/SBZ CLAIM 

Bits [31:8] 
Reserved, RAZ/SBZ. Software can rely on these bits reading as zero, and must use a should-be-zero 
policy on writes. Implementations must ignore writes. 

CLAIM, bits [7:0] 
Set CLAIM tag bits. RAO. 
Writing a 1 to one of these bits sets the corresponding CLAIM tag bit to 1. This is an indirect write 
to the CLAIM tag bits. A single write operation can set multiple CLAIM tag bits to 1. 
Writing 0 to one of these bits has no effect. 
A cold reset clears the CLAIM tag bits to 0. 
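
The set and clear semantics described for DBGCLAIMSET_EL1 and DBGCLAIMCLR_EL1 can be modelled in a few lines of Python. This is an illustrative sketch only: the `ClaimTags` class and its method names are hypothetical, and it models just the 8 CLAIM tag bits, not the access permissions in the usage constraints.

```python
class ClaimTags:
    """Toy model of the 8 CLAIM tag bits (all names are illustrative)."""

    def __init__(self):
        self.tags = 0  # a Cold reset clears the CLAIM tag bits to 0

    def write_claimset(self, value):
        # Writing 1 sets the corresponding tag; writing 0 has no effect.
        # Bits [31:8] of the written value are ignored.
        self.tags |= value & 0xFF

    def read_claimset(self):
        # The CLAIM field of DBGCLAIMSET_EL1 is RAO; bits [31:8] are RAZ.
        return 0xFF

    def write_claimclr(self, value):
        # Writing 1 clears the corresponding tag; writing 0 has no effect.
        self.tags &= ~value & 0xFF

    def read_claimclr(self):
        # Reading DBGCLAIMCLR_EL1 returns the current CLAIM tag bits.
        return self.tags
```

A single write can change several tags at once, which is why both write methods operate on the whole 8-bit field rather than on one bit.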
Accessing the DBGCLAIMSET_EL1:


DBGCLAIMSET_EL1 can be accessed through the external debug interface:





Component Offset 





Debug 0xFA0
H9.2.6 DBGDTRRX_EL0, Debug Data Transfer Register, Receive

The DBGDTRRX_EL0 characteristics are:


Purpose 
Transfers data from an external debugger to the PE. For example, it is used by a debugger 
transferring commands and data to a debug target. It is a component of the Debug Communications 
Channel. 

Usage constraints 


This register is accessible as follows: 





Off     DLK     OSLK    SLK   Default
Error   Error   Error   RO    RW





If EDSCR.ITE == 0 when the PE exits Debug state on receiving a Restart request trigger event, the 
behavior of any operation issued by a DTR access in memory access mode that has not completed 
execution is CONSTRAINED UNPREDICTABLE, and must do one of the following: 


• It must complete execution in Debug state before the PE executes the restart sequence.
• It must complete execution in Non-debug state before the PE executes the restart sequence.
• It must be abandoned. This means that the instruction does not execute. Any registers or
memory accessed by the instruction are left in an UNKNOWN state.
Configurations 


External register DBGDTRRX_EL0 is architecturally mapped to AArch64 System register
DBGDTRRX_EL0.

External register DBGDTRRX_EL0 is architecturally mapped to AArch32 System register
DBGDTRRXint.

DBGDTRRX_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply only on a Cold reset. The register is not affected by a Warm reset
and is not affected by an External debug reset.

Attributes
DBGDTRRX_EL0 is a 32-bit register.


Field descriptions 


The DBGDTRRX_EL0 bit assignments are:


31 0 


Update DTRRX 


Bits [31:0] 
Update DTRRX. 
If RXfull is set to 0, then writes to this register update the value in DTRRX and set RXfull to 1.
Reads of this register return the last value written to DTRRX and do not change RXfull.


For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug 
Communication Channel and Instruction Transfer Register. 


This field resets to a value that is architecturally UNKNOWN. 
Accessing the DBGDTRRX_EL0:

DBGDTRRX_EL0 can be accessed through the external debug interface:





Component Offset 





Debug 0x080 
H9.2.7 DBGDTRTX_EL0, Debug Data Transfer Register, Transmit

The DBGDTRTX_EL0 characteristics are:

Purpose 
Transfers data from the PE to an external debugger. For example, it is used by a debug target to 
transfer data to the debugger. It is a component of the Debug Communication Channel. 

Usage constraints 
This register is accessible as follows: 

Off     DLK     OSLK    SLK   Default
Error   Error   Error   RO    RW
If EDSCR.ITE == 0 when the PE exits Debug state on receiving a Restart request trigger event, the 
behavior of any operation issued by a DTR access in memory access mode that has not completed 
execution is CONSTRAINED UNPREDICTABLE, and must do one of the following: 
• It must complete execution in Debug state before the PE executes the restart sequence.
• It must complete execution in Non-debug state before the PE executes the restart sequence.
• It must be abandoned. This means that the instruction does not execute. Any registers or
memory accessed by the instruction are left in an UNKNOWN state.

Configurations 
External register DBGDTRTX_EL0 is architecturally mapped to AArch64 System register
DBGDTRTX_EL0.
External register DBGDTRTX_EL0 is architecturally mapped to AArch32 System register
DBGDTRTXint.
DBGDTRTX_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply only on a Cold reset. The register is not affected by a Warm reset
and is not affected by an External debug reset.

Attributes
DBGDTRTX_EL0 is a 32-bit register.

Field descriptions 

The DBGDTRTX_EL0 bit assignments are:

31 0 

Return DTRTX 

Bits [31:0] 
Return DTRTX. 
If TXfull is set to 1, then reads of this register return the value in DTRTX and clear TXfull to 0. 
Writes of this register update the value in DTRTX and do not change TXfull.
For the full behavior of the Debug Communications Channel, see Chapter H4 The Debug 
Communication Channel and Instruction Transfer Register. 
This field resets to a value that is architecturally UNKNOWN. 
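
The RXfull and TXfull handshakes described for DBGDTRRX_EL0 and DBGDTRTX_EL0 can be sketched as a small Python model of the external debug interface view. The class `DebugCommsChannel` and its method names are hypothetical. The sketch starts both data registers at 0 purely for convenience (the architectural reset value is UNKNOWN), and the cases that this section does not describe (a write while RXfull is 1, a read while TXfull is 0) are deliberately left unmodelled.

```python
class DebugCommsChannel:
    """Toy model of the external-interface view of DTRRX and DTRTX."""

    def __init__(self):
        self.dtrrx = 0        # assumption: real reset value is UNKNOWN
        self.dtrtx = 0        # assumption: real reset value is UNKNOWN
        self.rxfull = False
        self.txfull = False

    def write_dbgdtrrx(self, value):
        # If RXfull is 0, the write updates DTRRX and sets RXfull to 1.
        if self.rxfull:
            return False      # assumption: RXfull == 1 case is not modelled
        self.dtrrx = value & 0xFFFFFFFF
        self.rxfull = True
        return True

    def read_dbgdtrrx(self):
        # Returns the last value written; RXfull is unchanged.
        return self.dtrrx

    def write_dbgdtrtx(self, value):
        # Updates DTRTX; TXfull is unchanged.
        self.dtrtx = value & 0xFFFFFFFF

    def read_dbgdtrtx(self):
        # If TXfull is 1, the read returns DTRTX and clears TXfull to 0.
        if not self.txfull:
            return None       # assumption: TXfull == 0 case is not modelled
        self.txfull = False
        return self.dtrtx
```

For the complete behavior, including the cases left out here, Chapter H4 remains the reference.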
Accessing the DBGDTRTX_EL0:

DBGDTRTX_EL0 can be accessed through the external debug interface:





Component Offset 





Debug 0x08C
H9.2.8 DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15
The DBGWCR<n>_EL1 characteristics are:


Purpose 


Holds control information for a watchpoint. Forms watchpoint n together with value register 
DBGWVR<n>_EL1. 


Usage constraints 


This register is accessible as follows: 





Off     DLK     OSLK    EDAD    SLK   Default
Error   Error   Error   Error   RO    RW





Configurations 


External register DBGWCR<n>_EL1 is architecturally mapped to AArch64 System register
DBGWCR<n>_EL1.

External register DBGWCR<n>_EL1 is architecturally mapped to AArch32 System register
DBGWCR<n>. 


DBGWCR<n>_EL1 is in the Core power domain. RW fields in this register reset to architecturally 
UNKNOWN values. These apply only on a Cold reset. The register is not affected by a Warm reset 
and is not affected by an External debug reset. 


If watchpoint n is not implemented then this register is unallocated.


Attributes 
DBGWCR<n>_EL1 is a 32-bit register.


Field descriptions 


The DBGWCR<n>_EL1 bit assignments are: 


31 29 28 24 23 21 20 19 16 15 14 13 12 5 4 3 2 1 0

RES0 MASK RES0 WT LBN SSC HMC BAS LSC PAC E


When the E field is zero, all the other fields in the register are ignored. 


Bits [31:29] 


Reserved, RES0.


MASK, bits [28:24] 
Address mask. Only objects up to 2GB can be watched using a single mask. 
00000 No mask. 
00001 Reserved. 
00010 Reserved. 


If programmed with a reserved value, a watchpoint must behave as if either: 





• MASK has been programmed with a defined value, which might be 0 (no mask), other than
for a direct read of DBGWCR<n>_EL1.
• The watchpoint is disabled.

Software must not rely on this property because the behavior of reserved values might change in a
future revision of the architecture.

Other values mask the corresponding number of address bits, from 0b00011 masking 3 address bits
(0x00000007 mask for address) to 0b11111 masking 31 address bits (0x7FFFFFFF mask for address).


This field resets to a value that is architecturally UNKNOWN. 
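
The relationship between the MASK field and the address mask it implies can be written as a one-line formula. The Python sketch below is illustrative; the function name `watchpoint_address_mask` is hypothetical, and rejecting the reserved encodings with an exception is a choice made for the example, not architectural behavior.

```python
def watchpoint_address_mask(mask_field):
    """Return the address mask implied by a 5-bit MASK field value."""
    if not 0 <= mask_field <= 0b11111:
        raise ValueError("MASK is a 5-bit field")
    if mask_field == 0:
        return 0  # 0b00000: no mask
    if mask_field in (0b00001, 0b00010):
        raise ValueError("reserved MASK encoding")
    # A MASK value of m masks the m least significant address bits.
    return (1 << mask_field) - 1
```

The two end points quoted in the text follow directly: a MASK of 0b00011 yields 0x00000007 and a MASK of 0b11111 yields 0x7FFFFFFF.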
Bits [23:21] 


Reserved, RES0.


WT, bit [20] 
Watchpoint type. Possible values are: 
0 Unlinked data address match. 
1 Linked data address match. 


This field resets to a value that is architecturally UNKNOWN. 


LBN, bits [19:16] 


Linked breakpoint number. For Linked data address watchpoints, this specifies the index of the 
Context-matching breakpoint linked to. 


This field resets to a value that is architecturally UNKNOWN. 


SSC, bits [15:14] 


Security state control. Determines the Security states under which a Watchpoint debug event for 
watchpoint n is generated. This field must be interpreted along with the HMC and PAC fields, see 
Execution conditions for which a watchpoint generates Watchpoint exceptions on page D2-1660. 


This field resets to a value that is architecturally UNKNOWN. 


HMC, bit [13] 


Higher mode control. Determines the debug perspective for deciding when a Watchpoint debug 
event for watchpoint n is generated. This field must be interpreted along with the SSC and PAC 
fields, see Execution conditions for which a watchpoint generates Watchpoint exceptions on 
page D2-1660. 


This field resets to a value that is architecturally UNKNOWN. 
BAS, bits [12:5] 


Byte address select. Each bit of this field selects whether a byte from within the word or 
double-word addressed by DBGWVR<n>_EL1 is being watched. 





BAS        Description
xxxxxxx1   Match byte at DBGWVR<n>_EL1
xxxxxx1x   Match byte at DBGWVR<n>_EL1+1
xxxxx1xx   Match byte at DBGWVR<n>_EL1+2
xxxx1xxx   Match byte at DBGWVR<n>_EL1+3
In cases where DBGWVR<n>_EL1 addresses a double-word:

BAS        Description, if DBGWVR<n>_EL1[2] == 0
xxx1xxxx   Match byte at DBGWVR<n>_EL1+4
xx1xxxxx   Match byte at DBGWVR<n>_EL1+5
x1xxxxxx   Match byte at DBGWVR<n>_EL1+6
1xxxxxxx   Match byte at DBGWVR<n>_EL1+7

If DBGWVR<n>_EL1[2] == 1, only BAS[3:0] is used. ARM deprecates setting
DBGWVR<n>_EL1[2] == 1.

The valid values for BAS are non-zero binary numbers all of whose set bits are contiguous. All other
values are reserved and must not be used by software. See Reserved DBGWCR<n>.BAS values on
page G2-3972.

This field resets to a value that is architecturally UNKNOWN.

LSC, bits [4:3]

Load/store control. This field enables watchpoint matching on the type of access being made.
Possible values of this field are:

01 Match instructions that load from a watchpointed address.
10 Match instructions that store to a watchpointed address.
11 Match instructions that load from or store to a watchpointed address.

All other values are reserved, but must behave as if the watchpoint is disabled. Software must not
rely on this property as the behavior of reserved values might change in a future revision of the
architecture.

This field resets to a value that is architecturally UNKNOWN.

PAC, bits [2:1]

Privilege of access control. Determines the Exception level or levels at which a Watchpoint debug
event for watchpoint n is generated. This field must be interpreted along with the SSC and HMC
fields, see Execution conditions for which a watchpoint generates Watchpoint exceptions on
page D2-1660.

This field resets to a value that is architecturally UNKNOWN.

E, bit [0]

Enable watchpoint n. Possible values are:
0 Watchpoint disabled.
1 Watchpoint enabled.

This field resets to a value that is architecturally UNKNOWN.
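
The field positions listed above, together with the BAS contiguity rule, can be combined into a small encoder. The following Python sketch is illustrative only: `is_valid_bas` and `encode_dbgwcr` are hypothetical names, the default argument values are arbitrary, and rejecting reserved BAS and LSC values with an exception is a choice made for the example.

```python
def is_valid_bas(bas):
    """True if bas is a non-zero 8-bit value whose set bits are contiguous."""
    if not 0 < bas <= 0xFF:
        return False
    lowest = bas & -bas  # isolate the lowest set bit
    # Adding the lowest set bit carries through a contiguous run of ones,
    # leaving no bit in common with the original value.
    return ((bas + lowest) & bas) == 0


def encode_dbgwcr(mask=0, wt=0, lbn=0, ssc=0, hmc=0, bas=0xFF, lsc=0b11, pac=0, e=0):
    """Pack the DBGWCR<n>_EL1 fields into a 32-bit value (RES0 bits stay 0)."""
    if not is_valid_bas(bas):
        raise ValueError("BAS must be non-zero with contiguous set bits")
    if lsc not in (0b01, 0b10, 0b11):
        raise ValueError("reserved LSC encoding")
    # (value, shift, width) for each field, taken from the bit assignments.
    fields = (
        (mask, 24, 5), (wt, 20, 1), (lbn, 16, 4), (ssc, 14, 2), (hmc, 13, 1),
        (bas, 5, 8), (lsc, 3, 2), (pac, 1, 2), (e, 0, 1),
    )
    value = 0
    for field, shift, width in fields:
        if not 0 <= field < (1 << width):
            raise ValueError("field value does not fit its width")
        value |= field << shift
    return value
```

As a usage example, `encode_dbgwcr(bas=0x0F, lsc=0b10, pac=0b11, e=1)` watches stores to the four bytes of a word and evaluates to `0x1F7`.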


Accessing the DBGWCR<n>_EL1: 


DBGWCR<n>_EL1 can be accessed through the external debug interface: 





Component Offset 





Debug 0x808 + 16n 
H9.2.9 DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15
The DBGWVR<n>_EL1 characteristics are:


Purpose 


Holds a data address value for use in watchpoint matching. Forms watchpoint n together with 
control register DBGWCR<n>_EL1. 


Usage constraints 


This register is accessible as follows: 





Off     DLK     OSLK    EDAD    SLK   Default
Error   Error   Error   Error   RO    RW





Configurations 


External register DBGWVR<n>_EL1 is architecturally mapped to AArch64 System register
DBGWVR<n>_EL1. 


External register DBGWVR<n>_EL1[31:0] is architecturally mapped to AArch32 System register 
DBGWVR<n>. 


DBGWVR<n>_EL1 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply only on a Cold reset. The register is not affected by a Warm reset 
and is not affected by an External debug reset. 


If watchpoint n is not implemented then this register is unallocated.


Attributes 
DBGWVR<n>_EL1 is a 64-bit register. 


Field descriptions 


The DBGWVR<n>_EL1 bit assignments are: 


63 49 48 2 1 0

RESS VA RES0


RESS, bits [63:49] 


Reserved, Sign extended. Hardware and software must treat this field as RES0 if bit[48] is 0 or RES0,
and as RES1 if bit[48] is 1.

Hardware always ignores the value of these bits and it is IMPLEMENTATION DEFINED whether:

• The bits are hardwired to a copy of bit [48], meaning writes to these bits are ignored, and
reads of the bits always return the hardwired value.

• The value in those bits can be written, and reads will return the last value written. The value
held in those bits is ignored by hardware.


This field resets to a value that is architecturally UNKNOWN. 
VA, bits [48:2] 

Bits[48:2] of the address value for comparison. 

ARM deprecates setting DBGWVR<n>_EL1[2] == 1. 


This field resets to a value that is architecturally UNKNOWN. 
Bits [1:0]

Reserved, RES0.
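
The RESS rule (bits [63:49] follow bit [48]) and the RES0 low bits suggest a canonical form that software might write to this register. The Python sketch below is illustrative: `canonical_dbgwvr` is a hypothetical helper, and because hardware ignores the RESS bits, producing the sign-extended form is a convention for the example rather than a requirement stated beyond the RES0/RES1 treatment above.

```python
def canonical_dbgwvr(value):
    """Return value with RESS set to a copy of bit 48 and bits [1:0] cleared."""
    va_bits = value & ((1 << 49) - 1) & ~0x3   # keep bits [48:2] only
    if va_bits & (1 << 48):
        va_bits |= ((1 << 15) - 1) << 49       # RESS treated as RES1
    return va_bits
```

For instance, `canonical_dbgwvr(0x0001000000001003)` has bit 48 set, so the result is `0xFFFF000000001000`.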


Accessing the DBGWVR<n>_EL1: 


DBGWVR<n>_EL1[31:0] can be accessed through the external debug interface: 





Component Offset 





Debug 0x800 + 16n 





DBGWVR<n>_EL1[63:32] can be accessed through the external debug interface: 





Component Offset 





Debug 0x804 + 16n 
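
Because the 64-bit DBGWVR<n>_EL1 is exposed as two 32-bit locations, a debugger has to split a value before writing it. The Python sketch below computes the offsets given in the tables above, plus the DBGWCR<n>_EL1 offset from H9.2.8; the function names are hypothetical and the sketch performs no actual bus access.

```python
def dbgwvr_accesses(n, value):
    """Return [(offset, word), ...] for the two halves of DBGWVR<n>_EL1."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0 to 15")
    base = 0x800 + 16 * n
    low = value & 0xFFFFFFFF            # DBGWVR<n>_EL1[31:0]
    high = (value >> 32) & 0xFFFFFFFF   # DBGWVR<n>_EL1[63:32]
    return [(base, low), (base + 4, high)]


def dbgwcr_offset(n):
    """Return the external debug interface offset of DBGWCR<n>_EL1."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0 to 15")
    return 0x808 + 16 * n
```

For watchpoint 1, the value registers sit at 0x810 and 0x814 and the control register at 0x818.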
H9.2.10 EDAA32PFR, External Debug AArch32 Processor Feature Register
The EDAA32PFR characteristics are: 
Purpose 
Provides information about implemented PE features. 
For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 
Usage constraints 
This register is accessible as follows: 
Off      DLK      Default
IMPDEF   IMPDEF   RO
Configurations 
It is IMPLEMENTATION DEFINED whether EDAA32PFR is implemented in the Core power domain 
or in the Debug power domain. 
EDAA32PFR is only accessible in an AArch32-only implementation. If AArch64 is supported at
any Exception level, EDAA32PFR is RES0.
Attributes
EDAA32PFR is a 64-bit register.
Field descriptions
The EDAA32PFR bit assignments are:
63 16 15 12 11 8 7 4 3 0
RES0 EL3 EL2 PMSA VMSA
Bits [63:16]
Reserved, RES0.
EL3, bits [15:12] 
AArch32 EL3 Exception level handling. Defined values are: 
0000 EL3 is not implemented. 
0001 EL3 can be executed in AArch32 state only. 
When the value of EDPFR.EL3 is non-zero, this field must be 0000. 
All other values are reserved. 
— Note 
EDPFR.{EL1, EL0} indicate whether EL1 and EL0 can only be executed in AArch32 state.
EL2, bits [11:8] 
AArch32 EL2 Exception level handling. Defined values are: 
0000 EL2 is not implemented. 
0001 EL2 can be executed in AArch32 state only. 
When the value of EDPFR.EL2 is non-zero, this field must be 0000. 
All other values are reserved.


— Note 
EDPFR.{EL1, EL0} indicate whether EL1 and EL0 can only be executed in AArch32 state.





PMSA, bits [7:4] 

Indicates support for a PMSA. Defined values are: 

0000 PMSA not supported. 

0100 Support for an ARMv8-R PMSAv8-32. 

All other values are reserved. In ARMv8-A, the only permitted value is 0000. 
VMSA, bits [3:0] 


Indicates support for a VMSA. When the PMSA field is nonzero, determines support for a VMSA.
When the PMSA field is 0000, VMSA is supported. Defined values are: 


0000 VMSA not supported. 
All other values are reserved. In ARMv8-A, the only permitted value is 0000. 


Accessing the EDAA32PFR: 


EDAA32PFR can be accessed through the external debug interface:





Component Offset 





Debug 0xD60
H9.2.11 EDACR, External Debug Auxiliary Control Register
The EDACR characteristics are: 


Purpose 


Allows implementations to support IMPLEMENTATION DEFINED controls. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK SLK Default 





IMPDEF IMPDEF IMPDEF RO RW 





Configurations 


It is IMPLEMENTATION DEFINED whether EDACR is implemented in the Core power domain or in 
the Debug power domain. RW fields in this register reset to architecturally UNKNOWN values, and: 


• The register is not affected by a Warm reset.

• If the register is implemented in the Core power domain the reset values apply on a Cold
reset, and the register is not affected by an External debug reset.

• If the register is implemented in the Debug power domain the reset values apply on an
External debug reset, and the register is not affected by a Cold reset.


Changing this register from its reset value causes IMPLEMENTATION DEFINED behavior, including 
possible deviation from the architecturally-defined behavior. 


If the EDACR contains any control bits that must be preserved over power down, then these bits 
must be accessible by the external debug interface when OSLSR_EL1.OSLK == 1 (OS lock is
locked) and when the Core is powered off. 


Attributes 
EDACR is a 32-bit register. 


Field descriptions 


The EDACR bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the EDACR: 


EDACR can be accessed through the external debug interface: 





Component Offset 





Debug 0x94 
H9.2.12 EDCIDR0, External Debug Component Identification Register 0

The EDCIDR0 characteristics are:


Purpose 
Provides information to identify an external debug component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDCIDR0 is in the Debug power domain.
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDCIDR0 is a 32-bit register.

Field descriptions

The EDCIDR0 bit assignments are:

31 8 7 0

RES0 PRMBL_0

Bits [31:8]
Reserved, RES0.

PRMBL_0, bits [7:0]
Preamble. Must read as 0x0D.

Accessing the EDCIDR0:

EDCIDR0 can be accessed through the external debug interface:





Component Offset 





Debug 0xFF0
H9.2.13 EDCIDR1, External Debug Component Identification Register 1
The EDCIDR1 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDCIDR1 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDCIDR1 is a 32-bit register. 


Field descriptions 


The EDCIDR1 bit assignments are: 


31 8 7 4 3 0 
RES0 CLASS PRMBL_1
Bits [31:8]

Reserved, RES0.


CLASS, bits [7:4] 


Component class. Reads as 0x9, debug component. 
PRMBL_1, bits [3:0] 
Preamble. RAZ. 


Accessing the EDCIDR1: 


EDCIDR1 can be accessed through the external debug interface:





Component Offset 





Debug 0xFF4








ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. H9-5013 
1ID092916 Non-Confidential 


H9 External Debug Register Descriptions 
H9.2 External debug registers 


H9.2.14 EDCIDR2, External Debug Component Identification Register 2


The EDCIDR2 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDCIDR2 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDCIDR2 is a 32-bit register. 


Field descriptions 


The EDCIDR2 bit assignments are: 


31 8 7 0 


RES0 PRMBL_2

Bits [31:8]
Reserved, RES0.

PRMBL_2, bits [7:0]
Preamble. Must read as 0x05. 


Accessing the EDCIDR2: 


EDCIDR2 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFF8
H9.2.15 EDCIDR3, External Debug Component Identification Register 3
The EDCIDR3 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Component Identification scheme on page K2-5507. 
Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDCIDR3 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDCIDR3 is a 32-bit register. 


Field descriptions 


The EDCIDR3 bit assignments are: 


31 8 7 0 
RES0 PRMBL_3
Bits [31:8]

Reserved, RES0.
PRMBL_3, bits [7:0]

Preamble. Must read as 0xB1.


Accessing the EDCIDR3: 


EDCIDR3 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFFC
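
Taken together, EDCIDR0 to EDCIDR3 give a debugger a fixed signature to look for at offsets 0xFF0 to 0xFFC. The Python sketch below checks that signature. It is illustrative only: `read32` is an assumed caller-supplied function that returns the 32-bit value at a given offset within the debug component, and `has_debug_component_id` is a hypothetical name.

```python
# Expected low byte of each Component ID register, keyed by offset.
# EDCIDR1 combines CLASS (0x9) in bits [7:4] with PRMBL_1 (RAZ) in bits [3:0].
EXPECTED_CIDR = {
    0xFF0: 0x0D,  # EDCIDR0.PRMBL_0
    0xFF4: 0x90,  # EDCIDR1.CLASS : EDCIDR1.PRMBL_1
    0xFF8: 0x05,  # EDCIDR2.PRMBL_2
    0xFFC: 0xB1,  # EDCIDR3.PRMBL_3
}


def has_debug_component_id(read32):
    """Return True if all four Component ID registers read as expected."""
    for offset, expected in EXPECTED_CIDR.items():
        value = read32(offset)
        # Bits [31:8] are RES0, so only the low byte is compared.
        if (value & 0xFF) != expected:
            return False
    return True
```

Since these registers are OPTIONAL, a failed check might mean the registers are not implemented rather than that the component is of a different class.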
H9.2.16 EDCIDSR, External Debug Context ID Sample Register
The EDCIDSR characteristics are: 
Purpose 
Contains the sampled value of the Context ID, captured on reading the low half of EDPCSR. 
Usage constraints 
This register is accessible as follows: 
Off DLK OSLK Default 
Error Error Error RO 
Configurations 
EDCIDSR is in the Core power domain. RW fields in this register reset to architecturally UNKNOWN 
values. These apply only on a Cold reset. The register is not affected by a Warm reset and is not 
affected by an External debug reset. 
Implemented only if the OPTIONAL PC Sample-based Profiling Extension is implemented. 
Attributes 
EDCIDSR is a 32-bit register. 
Field descriptions 
The EDCIDSR bit assignments are: 
31 0 
CONTEXTIDR 
CONTEXTIDR, bits [31:0] 
The sampled value of CONTEXTIDR_EL1, captured on reading the low half of EDPCSR. 
If EL3 is implemented and using AArch32 then CONTEXTIDR is a Banked register, and 
EDCIDSR samples the current Banked copy of CONTEXTIDR. 
This field resets to a value that is architecturally UNKNOWN. 
Accessing the EDCIDSR: 
EDCIDSR can be accessed through the external debug interface: 
Component Offset 
Debug 0x0A4
H9.2.17 EDDEVAFF0, External Debug Device Affinity register 0
The EDDEVAFF0 characteristics are:


Purpose 


Copy of the low half of the PE MPIDR_EL1 register that allows a debugger to determine which PE
in a multiprocessor system the external debug component relates to. 


Usage constraints 
This register is accessible as follows: 
Default 


RO 


Configurations 
EDDEVAFF0 is in the Debug power domain.


Implementation of this register is OPTIONAL. 


Attributes 
EDDEVAFF0 is a 32-bit register.

Field descriptions

The EDDEVAFF0 bit assignments are:

31 0
MPIDR_EL1 low half
Bits [31:0]

MPIDR_EL1 low half. Read-only copy of the low half of MPIDR_EL1, as seen from the highest
implemented Exception level.

Accessing the EDDEVAFF0:

EDDEVAFF0 can be accessed through the external debug interface:





Component Offset 





Debug 0xFA8
H9.2.18 EDDEVAFF1, External Debug Device Affinity register 1
The EDDEVAFF1 characteristics are:


Purpose 


Copy of the high half of the PE MPIDR_EL1 register that allows a debugger to determine which PE
in a multiprocessor system the external debug component relates to. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDDEVAFF1 is in the Debug power domain. 


Implementation of this register is OPTIONAL. 


Attributes 
EDDEVAFF1 is a 32-bit register. 


Field descriptions 


The EDDEVAFF1 bit assignments are: 


31 0 
MPIDR_EL1 high half 
Bits [31:0] 


MPIDR_EL1 high half. Read-only copy of the high half of MPIDR_EL1, as seen from the highest
implemented Exception level. 


Accessing the EDDEVAFF1: 


EDDEVAFF1 can be accessed through the external debug interface:





Component Offset 





Debug 0xFAC
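
Since EDDEVAFF0 and EDDEVAFF1 hold the low and high halves of MPIDR_EL1, a debugger can rebuild the 64-bit value from two 32-bit reads. A minimal Python sketch, with the hypothetical helper name `mpidr_from_eddevaff`:

```python
def mpidr_from_eddevaff(eddevaff0, eddevaff1):
    """Combine the EDDEVAFF0 (low) and EDDEVAFF1 (high) words into MPIDR_EL1."""
    low = eddevaff0 & 0xFFFFFFFF
    high = eddevaff1 & 0xFFFFFFFF
    return (high << 32) | low
```

For example, reads of 0x80000100 from offset 0xFA8 and 0x00000001 from offset 0xFAC combine to 0x0000000180000100.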
H9.2.19 EDDEVARCH, External Debug Device Architecture register

The EDDEVARCH characteristics are:


Purpose 


Identifies the programmers' model architecture of the external debug component. 
Usage constraints 
This register is accessible as follows: 
Default 


RO 


Configurations 
EDDEVARCH is in the Debug power domain. 


Implementation of this register is OPTIONAL. 


Attributes 
EDDEVARCH is a 32-bit register. 


Field descriptions 


The EDDEVARCH bit assignments are: 


31 21 20 19 16 15 0

ARCHITECT PRESENT REVISION ARCHID


ARCHITECT, bits [31:21] 
Defines the architecture of the component. For debug, this is ARM Limited. 
Bits [31:28] are the JEP106 continuation code, 0x4. 
Bits [27:21] are the JEP106 ID code, 0x3B. 


PRESENT, bit [20] 
When set to 1, indicates that the DEVARCH is present. 
This field is 1 in ARMv8. 


REVISION, bits [19:16] 
Defines the architecture revision. For architectures defined by ARM this is the minor revision. 
For debug, the revision defined by ARMv8 is 0x0.


All other values are reserved. 


ARCHID, bits [15:0] 


Defines this part to be an ARMv8 debug component. For architectures defined by ARM this is 
further subdivided. 


For debug: 
° Bits [15:12] are the architecture version, 0x6. 


° Bits [11:0] are the architecture part number, 0xA15. 
This corresponds to the ARMv8 debug architecture version.
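
The EDDEVARCH field boundaries above can be turned into a simple decoder. The Python sketch below is illustrative; `decode_eddevarch` and the dictionary keys are hypothetical names chosen for the example.

```python
def decode_eddevarch(value):
    """Split a 32-bit EDDEVARCH value into its documented fields."""
    return {
        "jep106_continuation": (value >> 28) & 0xF,  # ARCHITECT bits [31:28]
        "jep106_id": (value >> 21) & 0x7F,           # ARCHITECT bits [27:21]
        "present": (value >> 20) & 0x1,              # PRESENT, bit [20]
        "revision": (value >> 16) & 0xF,             # REVISION, bits [19:16]
        "arch_version": (value >> 12) & 0xF,         # ARCHID bits [15:12]
        "arch_part": value & 0xFFF,                  # ARCHID bits [11:0]
    }
```

Assembling the values given in the field descriptions (0x4, 0x3B, 1, 0x0, 0x6, 0xA15) produces 0x47706A15, which the decoder splits back into the same six values.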


Accessing the EDDEVARCH: 


EDDEVARCH can be accessed through the external debug interface: 





Component Offset 





Debug 0xFBC
H9.2.20 EDDEVID, External Debug Device ID register 0

The EDDEVID characteristics are:


Purpose 
Provides extra information for external debuggers about features of the debug implementation. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDDEVID is in the Debug power domain. 


Attributes 
EDDEVID is a 32-bit register. 


Field descriptions 


The EDDEVID bit assignments are: 


31 28 27 24 23 4 3 0

RES0 AuxRegs RES0 PCSample

Bits [31:28]
Reserved, RES0.


AuxRegs, bits [27:24] 
Indicates support for Auxiliary registers. Permitted values for this field are: 
0000 None supported. 
0001 Support for External Debug Auxiliary Control Register, EDACR. 


All other values are reserved. 


Bits [23:4] 
Reserved, RES0.


PCSample, bits [3:0] 
Indicates the level of PC Sample-based profiling support using external debug registers 40 through 
43. Permitted values of this field in ARMv8 are:


0000 Architecture-defined form of PC Sample-based profiling not implemented. 


0010 EDPCSR and EDCIDSR are implemented (only permitted if EL3 and EL2 are not 
implemented). 


0011 EDPCSR, EDCIDSR, and EDVIDSR are implemented. 


All other values are reserved. 
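
A debugger that wants to know whether EDACR and the PC Sample-based profiling registers are present might decode EDDEVID as follows. This Python sketch is illustrative only; the function name `decode_eddevid` and the descriptive strings are made up for the example, while the encodings are the ones listed above.

```python
AUXREGS = {
    0b0000: "none",
    0b0001: "EDACR",
}

PCSAMPLE = {
    0b0000: "not implemented",
    0b0010: "EDPCSR, EDCIDSR",
    0b0011: "EDPCSR, EDCIDSR, EDVIDSR",
}


def decode_eddevid(value):
    """Return (AuxRegs, PCSample) descriptions for a 32-bit EDDEVID value."""
    auxregs = (value >> 24) & 0xF   # AuxRegs, bits [27:24]
    pcsample = value & 0xF          # PCSample, bits [3:0]
    return (
        AUXREGS.get(auxregs, "reserved"),
        PCSAMPLE.get(pcsample, "reserved"),
    )
```

Any encoding not listed in the field descriptions is reported as reserved rather than guessed at.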
Accessing the EDDEVID:


EDDEVID can be accessed through the external debug interface: 





Component Offset 





Debug 0xFC8
H9.2.21 EDDEVID1, External Debug Device ID register 1
The EDDEVID1 characteristics are: 


Purpose 


Provides extra information for external debuggers about features of the debug implementation. 
Usage constraints 
This register is accessible as follows: 
Default 


RO 


Configurations 
EDDEVID1 is in the Debug power domain. 


Attributes 
EDDEVID1 is a 32-bit register. 


Field descriptions 


The EDDEVID1 bit assignments are: 


31 4 3 0 
RES0 PCSROffset
Bits [31:4]

Reserved, RES0.


PCSROffset, bits [3:0] 


This field indicates the offset applied to PC samples returned by reads of EDPCSR. Permitted values 
of this field in ARMv8 are:


0000 EDPCSR not implemented. 


0010 EDPCSR implemented, and samples have no offset applied and do not sample the 
instruction set state in AArch32 state. 


Accessing the EDDEVID1: 


EDDEVID1 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFC4
H9.2.22 EDDEVID2, External Debug Device ID register 2
The EDDEVID2 characteristics are: 


Purpose 


Reserved for future descriptions of features of the debug implementation. 
Usage constraints 
This register is accessible as follows: 
Default 


RO 


Configurations 
EDDEVID2 is in the Debug power domain. 


Attributes 
EDDEVID2 is a 32-bit register.


Field descriptions 


The EDDEVID2 bit assignments are: 


31 0 
Bits [31:0] 


Reserved, RES0.


Accessing the EDDEVID2: 


EDDEVID2 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFC0








H9.2.23 EDDEVTYPE, External Debug Device Type register


The EDDEVTYPE characteristics are: 


Purpose 
Indicates to a debugger that this component is part of a PE's debug logic.


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDDEVTYPE is in the Debug power domain. 


Implementation of this register is OPTIONAL. 


Attributes 
EDDEVTYPE is a 32-bit register. 


Field descriptions 


The EDDEVTYPE bit assignments are: 


31 8 7 4 3 0


RES0 SUB MAJOR


Bits [31:8]
Reserved, RES0.


SUB, bits [7:4] 


Subtype. Must read as 0x1 to indicate this is a component within a PE.
MAJOR, bits [3:0] 


Major type. Must read as 0x5 to indicate this is a debug logic component. 


Accessing the EDDEVTYPE: 


EDDEVTYPE can be accessed through the external debug interface: 





Component Offset 





Debug 0xFCC
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H9.2.24 EDDFR, External Debug Feature Register 
The EDDFR characteristics are: 


Purpose 
Provides top level information about the debug system. 


Note
Debuggers must use EDDEVARCH to determine the Debug architecture version. 





For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 


Usage constraints 


This register is accessible as follows: 





Off DLK Default 





IMPDEF IMPDEF RO 





Configurations 


It is IMPLEMENTATION DEFINED whether EDDFR is implemented in the Core power domain or in 
the Debug power domain. 


Attributes 
EDDFR is a 64-bit register.


Field descriptions 


The EDDFR bit assignments are: 


63 32 31 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0


RES0 CTX_CMPs RES0 WRPs RES0 BRPs PMUVer TraceVer UNKNOWN


Bits [63:32]
Reserved, RES0.
CTX_CMPs, bits [31:28] 


Number of breakpoints that are context-aware, minus 1. These are the highest numbered 
breakpoints. 


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64DFR0_EL1.CTX_CMPs.


Bits [27:24]
Reserved, RES0.
WRPs, bits [23:20]
Number of watchpoints, minus 1. The value of 0b0000 is reserved.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64DFR0_EL1.WRPs.


Bits [19:16]
Reserved, RES0.
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BRPs, bits [15:12] 


Number of breakpoints, minus 1. The value of 0b0000 is reserved.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64DFR0_EL1.BRPs.


PMUVer, bits [11:8] 


Performance Monitors Extension version. Indicates whether the System register interface to the
Performance Monitors Extension is implemented. Defined values are:


0000 Performance Monitors Extension System registers not implemented.
0001 Performance Monitors Extension System registers implemented, PMUv3.
1111 IMPLEMENTATION DEFINED form of performance monitors supported, PMUv3 not
supported.


All other values are reserved.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64DFR0_EL1.PMUVer.


TraceVer, bits [7:4] 


Trace support. Indicates whether the System register interface to a trace macrocell is implemented.
Defined values are:
0000 Trace macrocell System registers not implemented.
0001 Trace macrocell System registers implemented.


All other values are reserved.


A value of 0b0000 only indicates that no System register interface to a trace macrocell is
implemented. A trace macrocell might nevertheless be implemented without a System register
interface.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64DFR0_EL1.TraceVer.


UNKNOWN, bits [3:0] 


Reserved, UNKNOWN. 
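
The "minus 1" encoding of the breakpoint and watchpoint counts is easy to get wrong, so the following minimal C sketch shows one way a debugger might decode EDDFR from its two 32-bit halves. The type `struct eddfr_info` and the functions `eddfr_field` and `eddfr_decode` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of EDDFR. Counts have the "minus 1" bias already removed. */
struct eddfr_info {
    unsigned ctx_cmps;  /* number of context-aware breakpoints */
    unsigned wrps;      /* number of watchpoints */
    unsigned brps;      /* number of breakpoints */
    unsigned pmuver;    /* raw PMUVer field */
    unsigned tracever;  /* raw TraceVer field */
};

/* Extract a field of the given width starting at bit lsb. */
static inline unsigned eddfr_field(uint64_t value, unsigned lsb, unsigned width)
{
    assert(width > 0 && width < 64 && lsb + width <= 64);
    return (unsigned)((value >> lsb) & ((UINT64_C(1) << width) - 1));
}

/* lo is EDDFR[31:0], hi is EDDFR[63:32]. */
static inline struct eddfr_info eddfr_decode(uint32_t lo, uint32_t hi)
{
    uint64_t eddfr = ((uint64_t)hi << 32) | lo;
    struct eddfr_info info;

    info.ctx_cmps = eddfr_field(eddfr, 28, 4) + 1;
    info.wrps     = eddfr_field(eddfr, 20, 4) + 1;
    info.brps     = eddfr_field(eddfr, 12, 4) + 1;
    info.pmuver   = eddfr_field(eddfr, 8, 4);
    info.tracever = eddfr_field(eddfr, 4, 4);
    return info;
}
```

Bits [3:0] are UNKNOWN and are deliberately not decoded. The sketch does not reject the reserved value 0b0000 for WRPs or BRPs; a caller that needs strict checking would add that test.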


Accessing the EDDFR: 


EDDFR[31:0] can be accessed through the external debug interface:

Component Offset
Debug 0xD28

EDDFR[63:32] can be accessed through the external debug interface:

Component Offset
Debug 0xD2C
H9.2.25 EDECCR, External Debug Exception Catch Control Register


The EDECCR characteristics are: 


Purpose 


Controls Exception Catch debug events. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK SLK Default





Error Error Error RO RW 





Configurations


External register EDECCR is architecturally mapped to AArch64 System register OSECCR_EL1.
External register EDECCR is architecturally mapped to AArch32 System register DBGOSECCR.


EDECCR is in the Core power domain. Some or all RW fields of this register have defined reset
values. These apply only on a Cold reset. The register is not affected by a Warm reset and is not
affected by an External debug reset.


Attributes


EDECCR is a 32-bit register.


Field descriptions 


The EDECCR bit assignments are: 


31 8 7 4 3 0


RES0 NSE SE


Bits [31:8]


Reserved, RES0.


NSE, bits [7:4]


Coarse-grained Non-secure exception catch. If EL3 and EL2 are not implemented and the PE
behaves as if SCR_EL3.NS is set to 0, this field is reserved, RES0. Otherwise, possible values for
this field are:


0000 Exception Catch debug event disabled for Non-secure Exception levels. 
0010 Exception Catch debug event enabled for Non-secure EL1. 

0100 Exception Catch debug event enabled for Non-secure EL2. 

0110 Exception Catch debug event enabled for Non-secure EL1 and EL2. 


All other values are reserved. A value that enables an Exception Catch debug event for an Exception 
level that is not implemented is reserved. If this field is programmed with a reserved value then: 


° The PE behaves as if it is programmed with a defined value, other than for a read of
EDECCR.
° The value returned for NSE by a read of EDECCR is UNKNOWN.


When this register has an architecturally-defined reset value, this field resets to 0. 
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SE, bits [3:0] 
Coarse-grained Secure exception catch. If EL3 is not implemented and the PE behaves as if
SCR_EL3.NS is set to 1, this field is reserved, RES0. Otherwise, possible values for this field are:


0000 Exception Catch debug event disabled for Secure Exception levels. 
0010 Exception Catch debug event enabled for Secure EL1. 

1000 Exception Catch debug event enabled for Secure EL3. 

1010 Exception Catch debug event enabled for Secure EL1 and EL3. 


All other values are reserved. A value that enables an Exception Catch debug event for an Exception 
level that is not implemented is reserved. If this field is programmed with a reserved value then: 


° The PE behaves as if it is programmed with a defined value, other than for a read of 
EDECCR. 
° The value returned for SE by a read of EDECCR is UNKNOWN. 


When this register has an architecturally-defined reset value, this field resets to 0. 
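
The defined NSE and SE encodings are combinations of one bit per Exception level. The following minimal C sketch builds an EDECCR value from them. The constant names and `edeccr_encode` are hypothetical. The sketch rejects only encodings that are reserved for every implementation, and cannot know which Exception levels a particular PE implements.

```c
#include <assert.h>
#include <stdint.h>

/* Per-Exception-level catch bits, as they appear within a 4-bit NSE or SE field. */
enum {
    EDECCR_CATCH_EL1 = 0x2,  /* 0b0010 */
    EDECCR_CATCH_EL2 = 0x4,  /* 0b0100, Non-secure only */
    EDECCR_CATCH_EL3 = 0x8   /* 0b1000, Secure only */
};

/* Compose EDECCR from the Secure (SE) and Non-secure (NSE) field values. */
static inline uint32_t edeccr_encode(unsigned se, unsigned nse)
{
    assert((se & ~0xAu) == 0);   /* SE may enable EL1 and/or EL3 only */
    assert((nse & ~0x6u) == 0);  /* NSE may enable EL1 and/or EL2 only */
    return ((uint32_t)nse << 4) | (uint32_t)se;
}
```

For example, enabling the catch for Secure EL1 and EL3 together with Non-secure EL1 and EL2 gives SE = 0b1010 and NSE = 0b0110.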


Accessing the EDECCR: 


EDECCR can be accessed through the external debug interface: 





Component Offset 





Debug 0x098 










H9.2.26 EDECR, External Debug Execution Control Register 


The EDECR characteristics are: 


Purpose 


Controls Halting debug events. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RW 





Configurations


EDECR is in the Debug power domain. Some or all RW fields of this register have defined reset
values. These apply only on an External debug reset. The register is not affected by a Warm reset
and is not affected by a Cold reset.


Attributes


EDECR is a 32-bit register.


Field descriptions 


The EDECR bit assignments are: 


31 3 2 1 0


RES0 SS RCE OSUCE


Bits [31:3]


Reserved, RES0.


SS, bit [2]


Halting step enable. Possible values of this field are:
0 Halting step debug event disabled.
1 Halting step debug event enabled.


If the value of EDECR.SS is changed when the PE is in Non-debug state, behavior is CONSTRAINED
UNPREDICTABLE as described in Changing the value of EDECR.SS when not in Debug state on
page H3-4893.


When this register has an architecturally-defined reset value, this field resets to 0.


RCE, bit [1]


Reset Catch enable. Possible values of this field are:
0 Reset Catch debug event disabled.
1 Reset Catch debug event enabled.


When this register has an architecturally-defined reset value, this field resets to 0.
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OSUCE, bit [0] 
OS Unlock Catch enabled. Possible values of this field are: 
0 OS Unlock Catch debug event disabled. 
1 OS Unlock Catch debug event enabled. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Accessing the EDECR: 


EDECR can be accessed through the external debug interface: 





Component Offset 





Debug 0x024
H9.2.27 EDESR, External Debug Event Status Register


The EDESR characteristics are: 


Purpose 
Indicates the status of internally pending Halting debug events. 


Usage constraints 


This register is accessible as follows: 





Off DLK SLK Default 





Error Error RO RW 





If a request to clear a pending Halting debug event is received at or about the time when halting 
becomes allowed, it is CONSTRAINED UNPREDICTABLE whether the event is taken. 


If Core power is removed while a Halting debug event is pending, it is lost. However, it might 
become pending again when the Core is powered back on and Cold reset. 


Configurations 
EDESR is in the Core power domain. Some or all RW fields of this register have defined reset 
values. These apply on a Warm or Cold reset. The register is not affected by an External debug reset. 
Attributes 
EDESR is a 32-bit register. 


Field descriptions 


The EDESR bit assignments are: 


31 3 2 1 0


RES0 SS RC OSUC


Bits [31:3]
Reserved, RES0.
SS, bit [2] 
Halting step debug event pending. Possible values of this field are: 
0 Reading this means that a Halting step debug event is not pending. Writing this means 
no action. 
1 Reading this means that a Halting step debug event is pending. Writing this clears the 
pending Halting step debug event. 
When this register has an architecturally-defined reset value, this field resets to the value of 
EDECR.SS. 
RC, bit [1] 
Reset Catch debug event pending. Possible values of this field are: 
0 Reading this means that a Reset Catch debug event is not pending. Writing this means
no action.
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1 Reading this means that a Reset Catch debug event is pending. Writing this clears the 
pending Reset Catch debug event. 


When this register has an architecturally-defined reset value, this field resets to the value of 
EDECR.RCE. 

OSUC, bit [0] 
OS Unlock Catch debug event pending. Possible values of this field are: 


0 Reading this means that an OS Unlock Catch debug event is not pending. Writing this
means no action.


1 Reading this means that an OS Unlock Catch debug event is pending. Writing this clears 
the pending OS Unlock Catch debug event. 


When this register has an architecturally-defined reset value, this field resets to 0. 
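
Each EDESR bit follows a write-one-to-clear pattern: writing 1 clears the pending event and writing 0 has no effect. The following C sketch models that behavior on a plain value. The macro names and `edesr_after_write` are hypothetical, and the model ignores the CONSTRAINED UNPREDICTABLE case where a clear request arrives at about the time halting becomes allowed.

```c
#include <assert.h>
#include <stdint.h>

#define EDESR_OSUC       (1u << 0)  /* OS Unlock Catch debug event pending */
#define EDESR_RC         (1u << 1)  /* Reset Catch debug event pending */
#define EDESR_SS         (1u << 2)  /* Halting step debug event pending */
#define EDESR_EVENT_MASK (EDESR_SS | EDESR_RC | EDESR_OSUC)

/* Return the pending-event state after a write of `written` to EDESR. */
static inline uint32_t edesr_after_write(uint32_t pending, uint32_t written)
{
    assert((pending & ~EDESR_EVENT_MASK) == 0);  /* bits [31:3] are RES0 */
    return pending & ~(written & EDESR_EVENT_MASK);
}
```

Because a written 0 means no action, a debugger can clear a single event by writing only that event's bit, without first reading the register.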


Accessing the EDESR: 


EDESR can be accessed through the external debug interface: 





Component Offset 





Debug 0x020 
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H9.2.28 EDITCTRL, External Debug Integration mode Control register 
The EDITCTRL characteristics are: 
Purpose 
Enables the external debug to switch from its default mode into integration mode, where test 
software can control directly the inputs and outputs of the PE, for integration testing or topology 
detection. 
Usage constraints 
This register is accessible as follows: 
Off DLK OSLK Default 
IMPDEF IMPDEF IMPDEF RW 
Configurations 
It is IMPLEMENTATION DEFINED whether EDITCTRL is implemented in the Core power domain or 
in the Debug power domain. Some or all RW fields of this register have defined reset values, and: 
° The register is not affected by a Warm reset. 
. If the register is implemented in the Core power domain the reset values apply on a Cold 
reset, and the register is not affected by an External debug reset. 
° If the register is implemented in the Debug power domain the reset values apply on an 
External debug reset, and the register is not affected by a Cold reset. 
Implementation of this register is OPTIONAL. 
Attributes 
EDITCTRL is a 32-bit register. 
Field descriptions 
The EDITCTRL bit assignments are: 
31 1 0
RES0 IME
Bits [31:1]
Reserved, RES0.
IME, bit [0] 
Integration mode enable. When IME == 1, the device reverts to an integration mode to enable 
integration testing or topology detection. The integration mode behavior is IMPLEMENTATION 
DEFINED. 
0 Normal operation. 
1 Integration mode enabled. 
When this register has an architecturally-defined reset value, this field resets to 0. 
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Accessing the EDITCTRL: 


EDITCTRL can be accessed through the external debug interface: 





Component Offset 





Debug 0xF00
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H9.2.29 EDITR, External Debug Instruction Transfer Register 
The EDITR characteristics are: 


Purpose 
Used in Debug state for passing instructions to the PE for execution. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK SLK Default 





Error Error Error WI WO 





If EDSCR.ITE == 0 when the PE exits Debug state on receiving a Restart request trigger event, the 
behavior of any instruction issued through the ITR in Normal access mode that has not completed 
execution is CONSTRAINED UNPREDICTABLE, and must do one of the following: 


° It must complete execution in Debug state before the PE executes the restart sequence. 
° It must complete execution in Non-debug state before the PE executes the restart sequence. 
° It must be abandoned. This means that the instruction does not execute. Any registers or
memory accessed by the instruction are left in an UNKNOWN state.


EDITR ignores writes if the PE is in Non-debug state. 


Configurations 
EDITR is in the Core power domain. 


Attributes 
EDITR is a 32-bit register. 


Field descriptions 


The EDITR bit assignments are: 


When in AArch32 state: 


31 16 15 0


T32Second T32First 


T32Second, bits [31:16] 
Second halfword of the T32 instruction to be executed on the PE. When EDITR contains a 16-bit
T32 instruction, this field is ignored. For more information see Behavior in Debug state on
page H2-4855.


T32First, bits [15:0] 
First halfword of the T32 instruction to be executed on the PE. 





When in AArch64 state: 
31 0 
A64 instruction to be executed on the PE 
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Bits [31:0] 


A64 instruction to be executed on the PE. 
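
In AArch32 state the halfword order matters: the first halfword of a T32 instruction goes in bits [15:0] and the second in bits [31:16]. The following C sketch packs a T32 instruction into the value written to EDITR. The function names are hypothetical. Leaving T32Second as zero for a 16-bit instruction is a choice made for this sketch, since that field is ignored in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A T32 instruction is 32 bits wide when bits [15:11] of its first halfword
 * are 0b11101, 0b11110, or 0b11111. Otherwise it is a 16-bit instruction. */
static inline bool editr_t32_is_32bit(uint16_t first)
{
    return (first >> 11) >= 0x1Du;
}

/* Pack a 32-bit T32 instruction: T32First in [15:0], T32Second in [31:16]. */
static inline uint32_t editr_pack_t32(uint16_t first, uint16_t second)
{
    assert(editr_t32_is_32bit(first));
    return ((uint32_t)second << 16) | first;
}

/* Pack a 16-bit T32 instruction. T32Second is ignored, so it is left as zero. */
static inline uint32_t editr_pack_t16(uint16_t insn)
{
    assert(!editr_t32_is_32bit(insn));
    return insn;
}
```

For an A64 instruction no packing is needed, because the 32-bit instruction word is written to EDITR unchanged.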


Accessing the EDITR: 


EDITR can be accessed through the external debug interface: 





Component Offset 





Debug 0x084
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H9.2.30 EDLAR, External Debug Lock Access Register 

The EDLAR characteristics are: 

Purpose 
Allows or disallows access to the external debug registers through a memory-mapped interface. 

Usage constraints 
This register is accessible as follows: 

Default 
WO 

Configurations 
EDLAR is in the Debug power domain. 
If OPTIONAL memory-mapped access to the external debug interface is supported then an OPTIONAL 
Software Lock can be implemented as part of CoreSight compliance. 
EDLAR ignores writes if the Software Lock is not implemented and ignores writes for other 
accesses to the external debug interface. 
The Software Lock provides a lock to prevent memory-mapped writes to the debug registers. Use 
of this lock mechanism reduces the risk of accidental damage to the contents of the debug registers. 
It does not, and cannot, prevent all accidental or malicious damage. 
Software uses EDLAR to set or clear the lock, and EDLSR to check the current status of the lock. 

Attributes 
EDLAR is a 32-bit register. 

Field descriptions 

The EDLAR bit assignments are: 

31 0 

KEY 

KEY, bits [31:0] 
Lock Access control. Writing the key value 0xC5ACCE55 to this field unlocks the lock, enabling write
accesses to this component's registers through a memory-mapped interface.
Writing any other value to this register locks the lock, disabling write accesses to this component's
registers through a memory-mapped interface.

Accessing the EDLAR: 

EDLAR can be accessed through a memory-mapped access to the external debug interface: 

Component Offset 
Debug 0xFB0
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H9.2.31 EDLSR, External Debug Lock Status Register 
The EDLSR characteristics are: 


Purpose 


Indicates the current status of the software lock for external debug registers. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 


EDLSR is in the Debug power domain. Some or all RW fields of this register have defined reset 
values. These apply only on an External debug reset. The register is not affected by a Warm reset 
and is not affected by a Cold reset. 


If OPTIONAL memory-mapped access to the external debug interface is supported then an OPTIONAL 
Software Lock can be implemented as part of CoreSight compliance. 


EDLSR is RAZ if the Software Lock is not implemented and is RAZ for other accesses to the 
external debug interface. 


The Software Lock provides a lock to prevent memory-mapped writes to the debug registers. Use 
of this lock mechanism reduces the risk of accidental damage to the contents of the debug registers. 
It does not, and cannot, prevent all accidental or malicious damage. 


Software uses EDLAR to set or clear the lock, and EDLSR to check the current status of the lock. 


Attributes 
EDLSR is a 32-bit register. 


Field descriptions 


The EDLSR bit assignments are: 


31 3 2 1 0


RES0 nTT SLK SLI


Bits [31:3]
Reserved, RES0.

nTT, bit [2] 
Not thirty-two bit access required. RAZ. 

SLK, bit [1] 
Software Lock status for this component. For an access to LSR that is not a memory-mapped access,
or when the Software Lock is not implemented, this field is RES0.
For memory-mapped accesses when the Software Lock is implemented, possible values of this field
are:
0 Lock clear. Writes are permitted to this component's registers.
1 Lock set. Writes to this component's registers are ignored, and reads have no side
effects.
When this register has an architecturally-defined reset value, this field resets to 1.
SLI, bit [0] 


Software Lock implemented. For an access to LSR that is not a memory-mapped access, this field 
is RAZ. For memory-mapped accesses, the value of this field is IMPLEMENTATION DEFINED. 
Permitted values are: 


0 Software Lock not implemented or not memory-mapped access.


1 Software Lock implemented and memory-mapped access. 
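
The interaction between EDLAR and EDLSR for memory-mapped accesses can be pictured as a small state machine. The following C sketch models it under stated assumptions: `struct sw_lock`, `edlar_write`, and `edlsr_read` are hypothetical names, the model covers memory-mapped accesses only, and nTT is always returned as zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EDLAR_KEY 0xC5ACCE55u
#define EDLSR_SLI (1u << 0)  /* Software Lock implemented */
#define EDLSR_SLK (1u << 1)  /* Software Lock set */

/* Model of the OPTIONAL Software Lock as seen by memory-mapped accesses. */
struct sw_lock {
    bool implemented;
    bool locked;  /* SLK resets to 1, so a model should start with locked = true */
};

/* A write of the key unlocks; a write of any other value locks. */
static inline void edlar_write(struct sw_lock *lock, uint32_t value)
{
    assert(lock != NULL);
    if (!lock->implemented)
        return;  /* EDLAR ignores writes if the lock is not implemented */
    lock->locked = (value != EDLAR_KEY);
}

/* Value returned by a memory-mapped read of EDLSR. */
static inline uint32_t edlsr_read(const struct sw_lock *lock)
{
    assert(lock != NULL);
    if (!lock->implemented)
        return 0u;  /* EDLSR is RAZ if the lock is not implemented */
    return EDLSR_SLI | (lock->locked ? EDLSR_SLK : 0u);
}
```

Software would typically write the key, confirm through EDLSR that SLK reads as 0, perform its register writes, and then write any other value to set the lock again.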


Accessing the EDLSR: 


EDLSR can be accessed through a memory-mapped access to the external debug interface: 





Component Offset 





Debug 0xFB4
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H9.2.32 EDPCSR, External Debug Program Counter Sample Register 
The EDPCSR characteristics are: 


Purpose 


Holds a sampled instruction address value. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK SLK Default





Error Error Error RO RO 





Configurations 


EDPCSR is in the Core power domain. RW fields in this register reset to architecturally UNKNOWN 
values. These apply only on a Cold reset. The register is not affected by a Warm reset and is not 
affected by an External debug reset. 


Implemented only if the OPTIONAL PC Sample-based Profiling Extension is implemented. 


Attributes 
EDPCSR is a 64-bit register. 


Field descriptions 


The EDPCSR bit assignments are: 


63 32 31 0 


PC Sample high word, EDPCSRhi PC Sample low word, EDPCSRlo





Bits [63:32] 


PC Sample high word, EDPCSRhi. If EDVIDSR.HV == 0 then this field is RAZ, otherwise bits 
[63:32] of the sampled instruction address value. 


This field resets to a value that is architecturally UNKNOWN. 





Bits [31:0] 

PC Sample low word, EDPCSRlo. Bits [31:0] of the sampled instruction address value. Reading
EDPCSRlo has the side-effect of updating EDCIDSR, EDVIDSR, and EDPCSRhi. However:

° If the PE is in Debug state, or PC Sample-based profiling is prohibited, EDPCSRlo reads as
0xFFFFFFFF and EDCIDSR, EDVIDSR, and EDPCSRhi become UNKNOWN.

° If the PE is in Reset state, the sampled value is UNKNOWN and EDCIDSR, EDVIDSR and
EDPCSRhi become UNKNOWN.

° If no instruction has been retired since the PE left Reset state, Debug state, or a state where
Non-invasive debug is not permitted, the sampled value is UNKNOWN and EDCIDSR,
EDVIDSR, and EDPCSRhi become UNKNOWN.

° For a read of EDPCSRlo from the memory-mapped interface, if EDLSR.SLK == 1, meaning
the Software Lock is locked, then the access has no side-effects. That is, EDCIDSR,
EDVIDSR, and EDPCSRhi are unchanged.

This field resets to a value that is architecturally UNKNOWN. 
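
Because a read of EDPCSRlo is what updates EDPCSRhi, the low word has to be read first when assembling a 64-bit sample. The following C sketch shows that ordering. It assumes the debug component is visible as an array of 32-bit words at a hypothetical `base` pointer, uses the offsets 0x0A0 and 0x0AC given for this register, and introduces the hypothetical helpers `edpcsr_reg_read` and `edpcsr_sample`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EDPCSRLO_OFFSET 0x0A0u  /* EDPCSR[31:0] */
#define EDPCSRHI_OFFSET 0x0ACu  /* EDPCSR[63:32] */

/* Read a 32-bit register at a byte offset from the component base. */
static inline uint32_t edpcsr_reg_read(const volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}

/* Take one PC sample. Returns false when EDPCSRlo reads as 0xFFFFFFFF,
 * which is the value returned in Debug state or when profiling is prohibited. */
static inline bool edpcsr_sample(const volatile uint32_t *base, uint64_t *pc)
{
    assert(base != NULL && pc != NULL);

    /* Low word first: this read is what updates EDPCSRhi. */
    uint32_t lo = edpcsr_reg_read(base, EDPCSRLO_OFFSET);
    if (lo == 0xFFFFFFFFu)
        return false;

    uint32_t hi = edpcsr_reg_read(base, EDPCSRHI_OFFSET);
    *pc = ((uint64_t)hi << 32) | lo;
    return true;
}
```

The sketch cannot detect the cases where the sampled value is UNKNOWN, for example when no instruction has been retired since the PE left Reset state, so a caller has to account for those separately.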
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Accessing the EDPCSR: 


EDPCSR[31:0] can be accessed through the external debug interface: 





Component Offset 





Debug 0x0A0





EDPCSR[63:32] can be accessed through the external debug interface: 





Component Offset 





Debug 0x0AC
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H9.2.33 EDPFR, External Debug Processor Feature Register 
The EDPFR characteristics are: 
Purpose 


Provides information about implemented PE features. 


For general information about the interpretation of the ID registers see Principles of the ID scheme 
for fields in ID registers on page D7-1893. 


Usage constraints 


This register is accessible as follows: 





Off DLK Default 





IMPDEF IMPDEF RO 





Configurations 
It is IMPLEMENTATION DEFINED whether EDPFR is implemented in the Core power domain or in the 
Debug power domain. 

Attributes 
EDPFR is a 64-bit register.


Field descriptions 


The EDPFR bit assignments are: 


63 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0


RES0 GIC AdvSIMD FP EL3 EL2 EL1 EL0


Bits [63:28]


Reserved, RES0.


GIC, bits [27:24] 
System register GIC interface support. Defined values are: 
0000 No System register interface to the GIC is supported. 
0001 System register interface to versions 3.0 and 4.0 of the GIC CPU interface is supported. 
All other values are reserved. 
In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64PFR0_EL1.GIC.
AdvSIMD, bits [23:20] 
Advanced SIMD. Defined values are: 
0000 Advanced SIMD is implemented. 
1111 Advanced SIMD is not implemented. 
All other values are reserved. 


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this 
field returns the value of ID_AA64PFR0_EL1.AdvSIMD.







FP, bits [19:16] 


Floating-point. Defined values are: 

0000 Floating-point is implemented. 
1111 Floating-point is not implemented. 
All other values are reserved. 


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64PFR0_EL1.FP.


EL3, bits [15:12] 


AArch64 EL3 Exception level handling. Defined values are: 


0000 EL3 is not implemented or cannot be executed in AArch64 state. 
0001 EL3 can be executed in AArch64 state only. 
0010 EL3 can be executed in either AArch64 or AArch32 state. 


When the value of EDAA32PFR.EL3 is non-zero, this field must be 0000. 
All other values are reserved. 


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this 
field returns the value of ID_AA64PFR0_EL1.EL3.


EL2, bits [11:8]


AArch64 EL2 Exception level handling. Defined values are:


0000 EL2 is not implemented or cannot be executed in AArch64 state.
0001 EL2 can be executed in AArch64 state only.
0010 EL2 can be executed in either AArch64 or AArch32 state.


When the value of EDAA32PFR.EL2 is non-zero, this field must be 0000.
All other values are reserved.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64PFR0_EL1.EL2.


EL1, bits [7:4]


AArch64 EL1 Exception level handling. Defined values are:


0000 EL1 can be executed in AArch32 state only.
0001 EL1 can be executed in AArch64 state only.
0010 EL1 can be executed in either AArch64 or AArch32 state.


All other values are reserved.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64PFR0_EL1.EL1.


EL0, bits [3:0]


AArch64 EL0 Exception level handling. Defined values are:


0000 EL0 can be executed in AArch32 state only.
0001 EL0 can be executed in AArch64 state only.
0010 EL0 can be executed in either AArch64 or AArch32 state.


All other values are reserved.


In an ARMv8-A implementation that supports AArch64 state in at least one Exception level, this
field returns the value of ID_AA64PFR0_EL1.EL0.
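
The four Exception level fields share one layout, with the field for ELn at bits [4n+3:4n], and the FP and AdvSIMD fields use 0b0000 to mean "implemented". The following C sketch decodes them from a 64-bit EDPFR value. All function names are hypothetical and were introduced for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Raw 4-bit field for Exception level `el` (0 to 3), at bits [4*el+3:4*el]. */
static inline unsigned edpfr_el_field(uint64_t edpfr, unsigned el)
{
    assert(el <= 3);
    return (unsigned)((edpfr >> (4u * el)) & 0xFu);
}

/* True when the Exception level can be executed in AArch64 state,
 * that is, when its field is 0b0001 or 0b0010. */
static inline bool edpfr_el_supports_aarch64(uint64_t edpfr, unsigned el)
{
    unsigned field = edpfr_el_field(edpfr, el);
    return field == 0x1u || field == 0x2u;
}

/* FP, bits [19:16]: 0b0000 means floating-point is implemented. */
static inline bool edpfr_has_fp(uint64_t edpfr)
{
    return ((edpfr >> 16) & 0xFu) == 0x0u;
}

/* AdvSIMD, bits [23:20]: 0b0000 means Advanced SIMD is implemented. */
static inline bool edpfr_has_advsimd(uint64_t edpfr)
{
    return ((edpfr >> 20) & 0xFu) == 0x0u;
}
```

The sketch treats reserved encodings as "not supported", which is an assumption of the sketch. It also does not distinguish the two meanings of 0b0000 for EL2 and EL3.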
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Accessing the EDPFR: 


EDPFR[31:0] can be accessed through the external debug interface: 





Component Offset 





Debug 0xD20





EDPFR[63:32] can be accessed through the external debug interface: 





Component Offset 





Debug 0xD24
H9.2.34 EDPIDR0, External Debug Peripheral Identification Register 0
The EDPIDR0 characteristics are:
Purpose 
Provides information to identify an external debug component. 
For more information see About the Peripheral identification scheme on page K2-5504. 
Usage constraints 
This register is accessible as follows: 
Default 
RO 
Configurations 
EDPIDR0 is in the Debug power domain.
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
EDPIDR0 is a 32-bit register. 
Field descriptions 
The EDPIDR0 bit assignments are:
31 8 7 0
RES0 PART_0
Bits [31:8]
Reserved, RES0.
PART_0, bits [7:0]
Part number, least significant byte.
Accessing the EDPIDR0:
EDPIDR0 can be accessed through the external debug interface:
Component Offset
Debug 0xFE0
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H9.2.35 EDPIDR1, External Debug Peripheral Identification Register 1 
The EDPIDR1 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Peripheral identification scheme on page K2-5504. 
Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDPIDR1 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDPIDR1 is a 32-bit register. 


Field descriptions 


The EDPIDR1 bit assignments are: 


31 8 7 4 3 0
RES0 DES_0 PART_1
Bits [31:8]


Reserved, RES0.


DES_0, bits [7:4]
Designer, least significant nibble of JEP106 ID code. For ARM Limited, this field is 0b1011.


PART_1, bits [3:0] 


Part number, most significant nibble. 


Accessing the EDPIDR1: 


EDPIDR1 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFE4
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H9.2.36 EDPIDR2, External Debug Peripheral Identification Register 2 
The EDPIDR2 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 
Configurations 
EDPIDR2 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
EDPIDR2 is a 32-bit register. 
Field descriptions 
The EDPIDR2 bit assignments are: 
31 8 7 4 3 2 0
RES0 REVISION JEDEC DES_1
Bits [31:8]


Reserved, RES0.


REVISION, bits [7:4] 


Part major revision. Parts can also use this field to extend Part number to 16 bits.


JEDEC, bit [3] 
RAO. Indicates a JEP106 identity code is used. 


DES_1, bits [2:0] 
Designer, most significant bits of JEP106 ID code. For ARM Limited, this field is 0b011.


Accessing the EDPIDR2: 


EDPIDR2 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFE8
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H9.2.37 EDPIDR3, External Debug Peripheral Identification Register 3 
The EDPIDR3 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDPIDR3 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDPIDR3 is a 32-bit register. 


Field descriptions 


The EDPIDR3 bit assignments are: 


31 8 7 4 3 0
RES0 REVAND CMOD
Bits [31:8]


Reserved, RES0.


REVAND, bits [7:4] 
Part minor revision. Parts using EDPIDR2.REVISION as an extension to the Part number must use 
this field as a major revision number. 


CMOD, bits [3:0] 


Customer modified. Indicates someone other than the Designer has modified the component. 


Accessing the EDPIDR3: 


EDPIDR3 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFEC
H9.2.38 EDPIDR4, External Debug Peripheral Identification Register 4


The EDPIDR4 characteristics are: 


Purpose 
Provides information to identify an external debug component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 
EDPIDR4 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
EDPIDR4 is a 32-bit register. 


Field descriptions 


The EDPIDR4 bit assignments are: 


31 8 7 4 3 0


RES0 SIZE DES_2


Bits [31:8]
Reserved, RES0.


SIZE, bits [7:4] 
Size of the component. RAZ. Log2 of the number of 4KB pages from the start of the component to
the end of the component ID registers.


DES_2, bits [3:0] 
Designer, JEP106 continuation code, least significant nibble. For ARM Limited, this field is 0b0100. 
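
The part number and the JEP106 designer code are each split across several of the Peripheral Identification Registers. The following C sketch reassembles them from EDPIDR0, EDPIDR1, EDPIDR2, and EDPIDR4. The names `struct edpidr_id` and `edpidr_decode` are hypothetical, and the sketch assumes that REVISION is not being used to extend the part number.

```c
#include <assert.h>
#include <stdint.h>

/* Identification fields reassembled from the EDPIDRn registers. */
struct edpidr_id {
    unsigned part;         /* 12-bit part number: PART_1:PART_0 */
    unsigned jep106_id;    /* 7-bit JEP106 ID code: DES_1:DES_0 */
    unsigned jep106_cont;  /* JEP106 continuation code: DES_2 */
    unsigned revision;     /* EDPIDR2.REVISION */
};

static inline struct edpidr_id edpidr_decode(uint32_t pidr0, uint32_t pidr1,
                                             uint32_t pidr2, uint32_t pidr4)
{
    struct edpidr_id id;

    /* EDPIDR2.JEDEC is RAO: a JEP106 identity code is used. */
    assert((pidr2 & (1u << 3)) != 0);

    id.part        = ((pidr1 & 0xFu) << 8) | (pidr0 & 0xFFu);
    id.jep106_id   = ((pidr2 & 0x7u) << 4) | ((pidr1 >> 4) & 0xFu);
    id.jep106_cont = pidr4 & 0xFu;
    id.revision    = (pidr2 >> 4) & 0xFu;
    return id;
}
```

With the ARM Limited values given in the field descriptions (DES_0 = 0b1011, DES_1 = 0b011, DES_2 = 0b0100), the reassembled ID code is 0x3B with continuation code 0x4.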


Accessing the EDPIDR4: 


EDPIDR4 can be accessed through the external debug interface: 





Component Offset 





Debug 0xFD0
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H9.2.39 EDPRCR, External Debug Power/Reset Control Register 
The EDPRCR characteristics are: 


Purpose 


Controls the PE functionality related to powerup, reset, and powerdown. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RW 





On permitted accesses to the register, other access controls affect the behavior of some fields. See 
the field descriptions for more information. 
Configurations 


EDPRCR contains fields that are in the Core power domain and fields that are in the Debug power 
domain. 


For RW fields see the field description for a description of the behavior of the field on a reset that 
applies to its power domain. However: 


° Fields that are in the Core power domain are not affected by a Warm reset and are not affected
by an External debug reset.


° Fields that are in the Debug power domain reset to their defined reset values on an External 
debug reset, and are not affected by a Warm reset and are not affected by a Cold reset. 


CORENPDRQ is the only field that is mapped between the EDPRCR and DBGPRCR and 
DBGPRCR_EL1. 


Attributes 
EDPRCR is a 32-bit register. 


Field descriptions 


The EDPRCR bit assignments are: 


31        4 3          2      1      0
RES0        COREPURQ   RES0   CWRR   CORENPDRQ


Bits [31:4]
Reserved, RES0.
COREPURQ, bit [3] 


Core powerup request. Allows a debugger to request that the power controller power up the core,
enabling access to the debug register in the Core power domain. The actions on writing to this bit
are:
0 Do not request power up of the Core power domain.
1 Request power up of the Core power domain, and emulation of powerdown.


In an implementation that includes the recommended external debug interface, this bit drives the
DBGPWRUPREQ signal.


Typically, this request is passed to an external power controller. This means that whether a request
causes power up is dependent on the IMPLEMENTATION DEFINED nature of the system.


This field is in the Debug power domain and can be read and written when the Core power domain
is powered off. On an External debug reset this field resets to 0.


The power controller must not allow the Core power domain to switch off while this bit is 1.
When this register has an architecturally-defined reset value, this field resets to 0.


This table summarizes the effect of the register access controls on the behavior of this field:


SLK Default
RO RW


Access permissions for the External debug interface registers on page H8-4976 describes the
conditions shown in this table. These conditions are prioritized, with the leftmost condition having
the highest priority and priority decreasing from left to right.


Bit [2]
Reserved, RES0.


CWRR, bit [1]
Warm reset request. Write-only bit that reads as zero. The actions on writing to this bit are:
0 No action.
1 Request Warm reset.


The PE ignores writes to this bit if any of the following are true: 


• ExternalInvasiveDebugEnabled() == FALSE, EL3 is not implemented, and the implemented
Security state is Non-secure state.


• ExternalSecureInvasiveDebugEnabled() == FALSE and one of the following is true:
— EL3 is implemented.
— The implemented Security state is Secure state.


• The Core power domain is either off or in a low-power state where the Core power domain
registers cannot be accessed.


• DoubleLockStatus() == TRUE (OS Double Lock is set).
• OSLSR.OSLK == 1 (OS lock is locked).
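
The conditions listed above can be combined into a single predicate. The following Python sketch is one possible reading of that list. The function and parameter names are hypothetical, and each parameter stands in for the corresponding pseudocode function or register state.

```python
def cwrr_write_ignored(*, invasive_debug_enabled: bool,
                       secure_invasive_debug_enabled: bool,
                       el3_implemented: bool,
                       implemented_state_secure: bool,
                       core_accessible: bool,
                       double_lock: bool,
                       os_lock: bool) -> bool:
    """Return True if a write to EDPRCR.CWRR is ignored.

    implemented_state_secure is only meaningful when EL3 is not
    implemented (assumption made for this sketch).
    """
    # ExternalInvasiveDebugEnabled() == FALSE on a Non-secure-only PE.
    if (not invasive_debug_enabled and not el3_implemented
            and not implemented_state_secure):
        return True
    # ExternalSecureInvasiveDebugEnabled() == FALSE with EL3 or Secure-only.
    if not secure_invasive_debug_enabled and (
            el3_implemented or implemented_state_secure):
        return True
    # Core power domain off or in a low-power state.
    if not core_accessible:
        return True
    # OS Double Lock set, or OS lock locked.
    return double_lock or os_lock
```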


In an implementation that includes the recommended external debug interface, this bit drives the 
DBGRSTREQ signal. 


Whether a write to this bit initiates a Warm reset, and the extent of that reset, is IMPLEMENTATION
DEFINED, but the effect must be one of:


• The request is ignored.
• Only this PE is Warm reset.
• This PE and other components of the system, possibly including other PEs, are Warm reset.


Note


Although the ARM architecture permits the first option from the above list, ARM recommends that 
either of the other options is implemented. 





When this register has an architecturally-defined reset value, this field resets to 0. 
This table summarizes the effect of the register access controls on the behavior of this field:





Off DLK OSLK SLK Default 





WI WI WI WI WO 





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 


CORENPDRQ, bit [0] 
Core no powerdown request. Requests emulation of powerdown. Possible values of this bit are:
0 If the system responds to a powerdown request, it powers down the Core power domain.
1 If the system responds to a powerdown request, it does not power down the Core power
domain, but instead emulates a powerdown of that domain.
This bit is UNKNOWN, and the PE ignores writes to this bit, if any of the following are true:


• The Core power domain is either off or in a low-power state where the Core power domain
registers cannot be accessed.


• DoubleLockStatus() == TRUE (OS Double Lock is set).
• OSLSR.OSLK == 1 (OS lock is locked).


Permitted accesses to this field map to the DBGPRCR.CORENPDRQ and 
DBGPRCR_EL1.CORENPDRQ fields. 


This field is in the Core reset domain. See the descriptions of the DBGPRCR.CORENPDRQ and 
DBGPRCR_EL1.CORENPDRQ fields for information about the effect of a Cold reset on the value 
returned by a permitted read of this field. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK OSLK SLK Default 





WI WI WI RO RW





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 


Accessing the EDPRCR: 


EDPRCR can be accessed through the external debug interface: 





Component Offset 





Debug 0x310 
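
As a rough illustration of the bit assignments above, the following Python sketch builds a value to write to EDPRCR. The constant and function names are hypothetical. Only the bit positions and the 0x310 offset come from this section.

```python
EDPRCR_OFFSET = 0x310        # Offset within the Debug component

EDPRCR_COREPURQ = 1 << 3     # Core powerup request
EDPRCR_CWRR = 1 << 1         # Warm reset request (write-only)
EDPRCR_CORENPDRQ = 1 << 0    # Core no powerdown request


def edprcr_value(*, corepurq: bool = False, cwrr: bool = False,
                 corenpdrq: bool = False) -> int:
    """Build an EDPRCR write value, leaving all RES0 bits as zero."""
    value = 0
    if corepurq:
        value |= EDPRCR_COREPURQ
    if cwrr:
        value |= EDPRCR_CWRR
    if corenpdrq:
        value |= EDPRCR_CORENPDRQ
    return value


# Request powerup and emulation of powerdown, without a Warm reset.
keep_powered = edprcr_value(corepurq=True, corenpdrq=True)
```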
H9.2.40 EDPRSR, External Debug Processor Status Register
The EDPRSR characteristics are: 


Purpose 


Holds information about the reset and powerdown state of the PE. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





On permitted accesses to the register, other access controls affect the behavior of some fields. See 
the field descriptions for more information. 


If the Core power domain is powered up (EDPRSR.PU == 1), then following a read of EDPRSR: 
• If DoubleLockStatus() == FALSE, then:
— EDPRSR.{SDR, SPMAD, SDAD, SPD} are cleared to 0.


— EDPRSR.SR is cleared to 0 if the non-debug logic of the PE is not in reset state
(EDPRSR.R == 0).


• Otherwise it is CONSTRAINED UNPREDICTABLE whether or not this clearing occurs.


If the Core power domain is powered down (EDPRSR.PU == 0), then:


• EDPRSR.{SDR, SPMAD, SDAD, SR} are all UNKNOWN, and are either reset or restored on
being powered up.


• EDPRSR.SPD is not cleared following a read of EDPRSR. See the SPD bit description for
more information.


The clearing of bits is an indirect write to EDPRSR. 
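
The read side effects described above can be modeled with a small Python sketch. This is a simplified model under stated assumptions, not a description of any implementation: the function name and the dictionary representation are hypothetical, the CONSTRAINED UNPREDICTABLE case is modeled as "no clearing", and UNKNOWN values are left unchanged.

```python
STICKY_CLEARED_ON_READ = ("SDR", "SPMAD", "SDAD", "SPD")


def read_edprsr(state: dict, double_lock: bool) -> dict:
    """Return the field values seen by a read, then apply the
    indirect write that the read causes to the stored state."""
    seen = dict(state)  # snapshot returned to the debugger
    if state["PU"] == 1 and not double_lock:
        for name in STICKY_CLEARED_ON_READ:
            state[name] = 0
        # SR clears only if the non-debug logic is not in reset state.
        if state["R"] == 0:
            state["SR"] = 0
    # PU == 0, or double lock set: this model clears nothing (assumption).
    return seen


state = {"SDR": 1, "SPMAD": 0, "SDAD": 1, "SPD": 1, "SR": 1, "R": 0, "PU": 1}
first = read_edprsr(state, double_lock=False)
second = read_edprsr(state, double_lock=False)
```

In this model the first read reports the sticky bits as set and the second read reports them as clear, which is why a debugger that needs these values has to record them on the first read.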


Configurations 


EDPRSR contains fields that are in the Core power domain and fields that are in the Debug power 
domain. 


Some of the fields in the Core power domain are in the Cold reset domain and others are in the Warm 
reset domain. See the field descriptions for more information. However: 


• Fields that are in the Cold reset domain are not affected by a Warm reset and are not affected
by an External debug reset.


• Fields in the Warm reset domain are also reset by a Cold reset but are not affected by an
External debug reset.


• Fields in the Debug power domain are not affected by a Warm reset and are not affected by
a Cold reset.


Attributes 
EDPRSR is a 32-bit register. 


Field descriptions 


The EDPRSR bit assignments are: 
31     12  11   10     9      8     7     6    5     4       3   2  1    0
RES0       SDR  SPMAD  EPMAD  SDAD  EDAD  DLK  OSLK  HALTED  SR  R  SPD  PU


Bits [31:12]
Reserved, RES0.


SDR, bit [11] 
Sticky debug restart. Set to 1 when the PE exits Debug state. 
This bit is UNKNOWN on reads if any of the following are true: 
• DoubleLockStatus() == TRUE (EDPRSR.DLK == 1). The OS double-lock is locked.
• EDPRSR.R == 1. The PE is in Reset state.
• EDPRSR.PU == 0. The Core power domain is powered down.


Otherwise permitted values are:


0 The PE has not restarted since EDPRSR was last read.
1 The PE has restarted since EDPRSR was last read.

Note


If a reset occurs when the PE is in Debug state, the PE exits Debug state. SDR is UNKNOWN on Warm 
reset, meaning a debugger must also use the SR bit to determine whether the PE has left Debug state. 





If DoubleLockStatus() == FALSE then following a read of EDPRSR, this bit clears to 0. Otherwise 
it is CONSTRAINED UNPREDICTABLE whether or not this clearing occurs. 


This field is in the Core power domain and the Warm reset domain. On a Warm or Cold reset it resets 
to an UNKNOWN value. 


This field resets to a value that is architecturally UNKNOWN. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK SLK Default 





UNK UNK RO RC





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 


SPMAD, bit [10] 


Sticky EPMAD error. Set to 1 if an external debug interface access to a Performance Monitors 
register returns an error because AllowExternalPMUAccess() == FALSE. 
This bit is UNKNOWN on reads if any of the following are true:
• Any of EDPRSR.{DLK, OSLK, R} is 1.
• EDPRSR.PU is 0.

Otherwise permitted values are:


0 No accesses to the external Performance Monitors registers have failed since EDPRSR
was last read.


1 At least one access to the external Performance Monitors registers has failed since
EDPRSR was last read.


If DoubleLockStatus() == FALSE then following a read of EDPRSR, this bit clears to 0. Otherwise 
it is CONSTRAINED UNPREDICTABLE whether or not this clearing occurs. 


The write to SPMAD is an indirect write to EDPRSR that is a side effect of the access. The indirect 
write might not occur for a memory-mapped access to the external debug interface. 


This field is in the Core power domain and the Cold reset domain. On a Cold reset it resets to 0. 
When this register has an architecturally-defined reset value, this field resets to 0. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK OSLK SLK Default





UNK UNK UNK RO RC 





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 
EPMAD, bit [9] 
External Performance Monitors access disable status. 
This bit is UNKNOWN on reads if any of the following are true:
• Any of EDPRSR.{DLK, OSLK, R} is 1.
• EDPRSR.PU is 0.


Otherwise permitted values are: 


0 External Performance Monitors access enabled. AllowExternalPMUAccess() == 
TRUE. 

1 External Performance Monitors access disabled. AllowExternalPMUAccess() == 
FALSE. 


If external performance monitors access is not implemented, EPMAD is RAO. 
This field is in the Core power domain. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK OSLK EPMAD Default 





UNK UNK UNK RAO RO 





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 


SDAD, bit [8] 


Sticky EDAD error. Set to 1 if an external debug interface access to a debug register returns an error 
because AllowExternalDebugAccess() == FALSE. 
This bit is UNKNOWN on reads if any of the following are true:
• Either of EDPRSR.{DLK, R} is 1.
• EDPRSR.PU is 0.
• EDPRSR.OSLK is 1 and external debug writes to OSLAR_EL1 do not return an error when
AllowExternalDebugAccess() == FALSE.


Otherwise permitted values are:


0 No accesses to the external debug registers have failed since EDPRSR was last read.
1 At least one access to the external debug registers has failed since EDPRSR was last
read.


If DoubleLockStatus() == FALSE then following a read of EDPRSR, this bit clears to 0. Otherwise 
it is CONSTRAINED UNPREDICTABLE whether or not this clearing occurs. 


The write to SDAD is an indirect write to EDPRSR that is a side effect of the access. The indirect 
write might not occur for a memory-mapped access to the external debug interface. 


This field is in the Core power domain and the Cold reset domain. On a Cold reset it resets to 0. 
When this register has an architecturally-defined reset value, this field resets to 0. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK OSLK SLK Default
UNK UNK See text RO RC





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 
EDAD, bit [7] 
External debug access disable status. 
This bit is UNKNOWN on reads if any of the following are true: 
• Either of EDPRSR.{DLK, R} is 1.
• EDPRSR.PU is 0.
• EDPRSR.OSLK is 1 and external debug writes to OSLAR_EL1 do not return an error when
AllowExternalDebugAccess() == FALSE.


Otherwise permitted values are:

0 External debug access enabled. AllowExternalDebugAccess() == TRUE.
1 External debug access disabled. AllowExternalDebugAccess() == FALSE.
This field is in the Core power domain. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK OSLK EDAD Default
UNK UNK See text RAO RAZ





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 





DLK, bit [6] 
OS Double Lock status bit. Returns the result of the pseudocode function DoubleLockStatus(). 
This bit is UNKNOWN on reads if EDPRSR.PU is 0. 
Otherwise reads as zero if any of the following are true, that is when DoubleLockStatus() ==
FALSE:


• OSDLR_EL1.DLK == 0.
• DBGPRCR_EL1.CORENPDRQ == 1.
• The PE is in Debug state.


If the Core power domain is powered up and DoubleLockStatus() == TRUE, it is IMPLEMENTATION
DEFINED whether:


• EDPRSR.PU reads as 1, EDPRSR.DLK reads as 1, and EDPRSR.SPD is UNKNOWN.
• EDPRSR.PU reads as 0, EDPRSR.DLK is UNKNOWN, and EDPRSR.SPD reads as 0.


If the Core power domain is powered up and entered reset state with the OS double-lock locked, this
bit has a CONSTRAINED UNPREDICTABLE value. For more information see EDPRSR.{DLK, R} and
reset state on page H6-4948.


EDPRSR.{DLK, SPD, PU} describe whether registers in the Core power domain can be accessed,
and whether their state has been lost since the last time the register was read. For more information,
see EDPRSR.{DLK, SPD, PU} and the Core power domain on page H6-4947.


This field is in the Core power domain.


This table summarizes the effect of the register access controls on the behavior of this field:


Off DLK Default
UNK See text RAZ


Access permissions for the External debug interface registers on page H8-4976 describes the
conditions shown in this table. These conditions are prioritized, with the leftmost condition having
the highest priority and priority decreasing from left to right.


OSLK, bit [5]
OS lock status bit.

This bit is UNKNOWN on reads if any of the following are true:

• Either of EDPRSR.{DLK, R} is 1.

• EDPRSR.PU is 0.

A read of this bit returns the value of OSLSR_EL1.OSLK. 
This field is in the Core power domain. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK OSLK Default 





UNK UNK RAO RAZ 





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 


HALTED, bit [4] 


Halted status bit. 

This bit is UNKNOWN on reads if EDPRSR.PU is 0. 
Otherwise permitted values are: 

0 PE is in Non-debug state. 

1 PE is in Debug state. 
Because the OS Double Lock is never set when the PE is in Debug state, this bit is always RAZ
when EDPRSR.DLK is set to 1.


This field is in the Core power domain. 


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK Default 





UNK See text RO





Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 














SR, bit [3] 
Sticky core reset status bit. 
This bit is UNKNOWN on reads if either: 
• EDPRSR.DLK is 1.
• EDPRSR.PU is 0.
Otherwise permitted values are:
0 The non-debug logic of the PE is not in reset state and has not been reset since the last
time EDPRSR was read.
1 The non-debug logic of the PE is in reset state or has been reset since the last time
EDPRSR was read.
If DoubleLockStatus() == FALSE then following a read of EDPRSR, this bit clears to 0 if the 
non-debug logic of the PE is not in reset state. Otherwise it is CONSTRAINED UNPREDICTABLE 
whether or not this clearing occurs. 
This field is in the Core power domain and the Warm reset domain. On a Warm or Cold reset it resets 
to 1. 
When this register has an architecturally-defined reset value, this field resets to 1. 
This table summarizes the effect of the register access controls on the behavior of this field: 
Off DLK SLK Default 
UNK UNK RO RC
Access permissions for the External debug interface registers on page H8-4976 describes the 
conditions shown in this table. These conditions are prioritized, with the leftmost condition having 
the highest priority and priority decreasing from left to right. 
R, bit [2] 
PE reset status bit. 
This bit is UNKNOWN on reads if either: 
• EDPRSR.DLK is 1.
• EDPRSR.PU is 0.
Otherwise permitted values are: 
0 The non-debug logic of the PE is not in reset state. 
1 The non-debug logic of the PE is in reset state. 
If the Core power domain is powered up and entered reset state with the OS double-lock locked, this
bit has a CONSTRAINED UNPREDICTABLE value. For more information see EDPRSR.{DLK, R} and
reset state on page H6-4948.


This field is in the Core power domain.


This table summarizes the effect of the register access controls on the behavior of this field:


Off DLK Default
UNK UNK RO


Access permissions for the External debug interface registers on page H8-4976 describes the
conditions shown in this table. These conditions are prioritized, with the leftmost condition having
the highest priority and priority decreasing from left to right.


SPD, bit [1]
Sticky core powerdown status bit.
This bit is UNKNOWN on reads if both EDPRSR.DLK and EDPRSR.PU are 1.
Otherwise, permitted values are:


0 If EDPRSR.PU is 0, it is not known whether the state of the debug registers in the Core
power domain is lost.
If EDPRSR.PU is 1, the state of the debug registers in the Core power domain has not
been lost.

1 The state of the debug registers in the Core power domain has been lost.


If DoubleLockStatus() == FALSE and the PE is not in the powerdown state, then following a read
of EDPRSR, this bit clears to 0.


If DoubleLockStatus() == TRUE and the PE is not in the powerdown state, it is CONSTRAINED
UNPREDICTABLE whether or not this clearing occurs.


When the value of EDPRSR.PU is 0, indicating that the Core power domain is in either retention or
powerdown state, EDPRSR.SPD is not cleared following a read of EDPRSR. For the
IMPLEMENTATION DEFINED behavior see EDPRSR.SPD when the Core domain is in either retention
or powerdown state on page H6-4947.


EDPRSR.{DLK, SPD, PU} describe whether registers in the Core power domain can be accessed,
and whether their state has been lost since the last time the register was read. For more information,
see EDPRSR.{DLK, SPD, PU} and the Core power domain on page H6-4947.


This field is in the Core power domain and the Cold reset domain. On a Cold reset it resets to 1.
When this register has an architecturally-defined reset value, this field resets to 1.


This table summarizes the effect of the register access controls on the behavior of this field:


Off DLK SLK Default
RO UNK RO RC


Access permissions for the External debug interface registers on page H8-4976 describes the
conditions shown in this table. These conditions are prioritized, with the leftmost condition having
the highest priority and priority decreasing from left to right.


PU, bit [0]
Core powerup status bit. Indicates whether the Core power domain debug registers can be accessed.


The value of EDPRSR.PU is IMPLEMENTATION DEFINED when the Core power domain is 
powered-up and OS double-lock is locked. See the description of DLK for more information. 


Otherwise, permitted values are: 


0 Core is in a low-power or powerdown state where the debug registers cannot be 
accessed. 


1 Core is in a powerup state where the debug registers can be accessed. 
If the Core power domain is powered up and entered reset state with the OS double-lock locked, this
bit has a CONSTRAINED UNPREDICTABLE value. For more information see EDPRSR.{DLK, R} and
reset state on page H6-4948.

EDPRSR.{DLK, SPD, PU} describe whether registers in the Core power domain can be accessed,
and whether their state has been lost since the last time the register was read. For more information,
see EDPRSR.{DLK, SPD, PU} and the Core power domain on page H6-4947.


This table summarizes the effect of the register access controls on the behavior of this field: 





Off DLK Default 





RAZ See text RAO





Access permissions for the External debug interface registers on page H8-4976 describes the
conditions shown in this table. These conditions are prioritized, with the leftmost condition having
the highest priority and priority decreasing from left to right.


Accessing the EDPRSR: 


EDPRSR can be accessed through the external debug interface: 





Component Offset 





Debug 0x314 
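
To tie the field descriptions together, the following Python sketch decodes a raw EDPRSR value and interprets SPD together with PU and DLK, as the SPD description requires. The names are hypothetical, and `None` is used in this sketch to mean "not known".

```python
EDPRSR_BITS = {
    "SDR": 11, "SPMAD": 10, "EPMAD": 9, "SDAD": 8, "EDAD": 7, "DLK": 6,
    "OSLK": 5, "HALTED": 4, "SR": 3, "R": 2, "SPD": 1, "PU": 0,
}


def decode_edprsr(value: int) -> dict:
    """Extract every single-bit EDPRSR field from a raw register value."""
    return {name: (value >> bit) & 1 for name, bit in EDPRSR_BITS.items()}


def core_state_lost(fields: dict):
    """Interpret SPD. Returns True, False, or None when it is not known."""
    if fields["DLK"] == 1 and fields["PU"] == 1:
        return None            # SPD is UNKNOWN on this read
    if fields["SPD"] == 1:
        return True            # debug register state has been lost
    # SPD == 0 is only conclusive when the core is powered up.
    return False if fields["PU"] == 1 else None


fields = decode_edprsr(0x11)   # hypothetical value: HALTED and PU set
```

Many of these fields are UNKNOWN when PU is 0 or DLK is 1, so a real tool would check those two bits before trusting the rest of the decoded value.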
H9.2.41 EDRCR, External Debug Reserve Control Register
The EDRCR characteristics are: 
Purpose 
This register is used to allow imprecise entry to Debug state and clear sticky bits in EDSCR. 
Usage constraints 
This register is accessible as follows: 
Off DLK OSLK SLK Default
Error Error Error WI WO
Configurations 
EDRCR is in the Core power domain. 
Attributes 
EDRCR is a 32-bit register. 
Field descriptions 
The EDRCR bit assignments are: 
31        5 4       3      2     1    0
RES0        CBRRQ   CSPA   CSE   RES0
Bits [31:5]
Reserved, RES0.
CBRRQ, bit [4] 
Allow imprecise entry to Debug state. The actions on writing to this bit are: 
0 No action.
1 Allow imprecise entry to Debug state, for example by canceling pending bus accesses. 
Setting this bit to 1 allows a debugger to request imprecise entry to Debug state. An External Debug 
Request debug event must be pending before the debugger sets this bit to 1. 
This feature is optional. If this feature is not implemented, writes to this bit are ignored. 
CSPA, bit [3] 
Clear Sticky Pipeline Advance. This bit is used to clear the EDSCR.PipeAdv bit to 0. The actions 
on writing to this bit are: 
0 No action. 
1 Clear the EDSCR.PipeAdv bit to 0.
CSE, bit [2]
Clear Sticky Error. Used to clear the EDSCR cumulative error bits to 0. The actions on writing to
this bit are:
0 No action.
1 Clear the EDSCR.{TXU, RXO, ERR} bits, and, if the PE is in Debug state, the 
EDSCR.ITO bit, to 0. 
Bits [1:0] 


Reserved, RES0.


Accessing the EDRCR: 


EDRCR can be accessed through the external debug interface: 





Component Offset 





Debug 0x090 
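
The effect of the CSPA and CSE bits on EDSCR can be sketched as a pure function. This Python model is illustrative only: the names are hypothetical, the EDSCR bit positions are taken from the EDSCR field descriptions, and CBRRQ is not modeled.

```python
EDSCR_ITO = 1 << 28
EDSCR_RXO = 1 << 27
EDSCR_TXU = 1 << 26
EDSCR_PIPEADV = 1 << 25
EDSCR_ERR = 1 << 6

EDRCR_CSPA = 1 << 3
EDRCR_CSE = 1 << 2


def apply_edrcr_write(edscr: int, edrcr_write: int,
                      in_debug_state: bool) -> int:
    """Return the EDSCR value after a write of edrcr_write to EDRCR."""
    if edrcr_write & EDRCR_CSPA:
        edscr &= ~EDSCR_PIPEADV                 # clear sticky pipeline advance
    if edrcr_write & EDRCR_CSE:
        edscr &= ~(EDSCR_TXU | EDSCR_RXO | EDSCR_ERR)
        if in_debug_state:
            edscr &= ~EDSCR_ITO                 # ITO cleared only in Debug state
    return edscr & 0xFFFFFFFF


before = EDSCR_ITO | EDSCR_RXO | EDSCR_ERR | EDSCR_PIPEADV
after = apply_edrcr_write(before, EDRCR_CSE, in_debug_state=False)
```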
H9.2.42 EDSCR, External Debug Status and Control Register
The EDSCR characteristics are: 


Purpose 


Main control register for the debug implementation. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK SLK Default





Error Error Error RO RW 





Configurations 


EDSCR is in the Core power domain. Some or all RW fields of this register have defined reset 
values. These apply only on a Cold reset. The register is not affected by a Warm reset and is not 
affected by an External debug reset. 


Attributes 
EDSCR is a 32-bit register. 


Field descriptions 


The EDSCR bit assignments are: 


Bits     Field
[31]     RES0
[30]     RXfull
[29]     TXfull
[28]     ITO
[27]     RXO
[26]     TXU
[25]     PipeAdv
[24]     ITE
[23:22]  INTdis
[21]     TDA
[20]     MA
[19]     RES0
[18]     NS
[17]     RES0
[16]     SDD
[15]     RES0
[14]     HDE
[13:10]  RW
[9:8]    EL
[7]      A
[6]      ERR
[5:0]    STATUS


Bit [31]
Reserved, RES0.
RXfull, bit [30]
DTRRX full. This bit is RO.


When this register has an architecturally-defined reset value, this field resets to 0. 


TXfull, bit [29] 
DTRTX full. This bit is RO.
When this register has an architecturally-defined reset value, this field resets to 0.


ITO, bit [28] 
ITR overrun. This bit is RO. 
If the PE is in Non-debug state, this bit is UNKNOWN. ITO is set to 0 on entry to Debug state. 
RXO, bit [27] 
DTRRX overrun. This bit is RO.
When this register has an architecturally-defined reset value, this field resets to 0. 
TXU, bit [26] 
DTRTX underrun. This bit is RO. 
When this register has an architecturally-defined reset value, this field resets to 0. 
PipeAdv, bit [25]
Pipeline advance. This bit is RO. Set to 1 every time the PE pipeline retires one or more instructions. 
Cleared to 0 by a write to EDRCR.CSPA. 
The architecture does not define precisely when this bit is set to 1. It requires only that this happen 
periodically in Non-debug state to indicate that software execution is progressing. 
ITE, bit [24] 


ITR empty. This bit is RO. 
If the PE is in Non-debug state, this bit is UNKNOWN. It is always valid in Debug state. 


INTdis, bits [23:22] 


Interrupt disable. Disables taking interrupts (including virtual interrupts and System Error
interrupts) in Non-debug state.


If ExternalInvasiveDebugEnabled() == FALSE, the value of this field is ignored.
If ExternalInvasiveDebugEnabled() == TRUE, the possible values of this field are:


00 Do not disable interrupts.
01 Disable interrupts taken to Non-secure EL1.
10 Disable interrupts taken only to Non-secure EL1 and Non-secure EL2. If external secure
invasive debug is enabled, also disable interrupts taken to Secure EL1.
11 Disable interrupts taken only to Non-secure EL1 and Non-secure EL2. If external secure
invasive debug is enabled, also disable all other interrupts.


The value of INTdis does not affect whether an interrupt is a WFI wake-up event, but can mask an 
interrupt as a WFE wake-up event. 


If EL3 and EL2 are not implemented, the values 0b01 and 0b10 are reserved. If programmed with a
reserved value the PE behaves as if INTdis has been programmed with a defined value, other than 
for a direct read of EDSCR, and the value returned by a read of EDSCR.INTdis is UNKNOWN. 


When this register has an architecturally-defined reset value, this field resets to 0. 





TDA, bit [21] 
Traps accesses to the following Debug System registers: 
• AArch64: DBGBCR<n>_EL1, DBGBVR<n>_EL1, DBGWCR<n>_EL1,
DBGWVR<n>_EL1.
• AArch32: DBGBCR<n>, DBGBVR<n>, DBGBXVR<n>, DBGWCR<n>, DBGWVR<n>.
The possible values of this field are:
0 Accesses to Debug System registers do not generate a Software Access debug event.
1 Accesses to Debug System registers generate a Software Access debug event, if 
OSLSR.OSLK is 0 and if halting is allowed. 
When this register has an architecturally-defined reset value, this field resets to 0. 
MA, bit [20]


Memory access mode. Controls use of memory-access mode for accessing ITR and the DCC. This 
bit is ignored if in Non-debug state and set to zero on entry to Debug state. 


Possible values of this field are: 
0 Normal access mode.
1 Memory access mode. 


When this register has an architecturally-defined reset value, this field resets to 0. 


Bit [19] 
Reserved, RES0.
NS, bit [18]
Non-secure status. Read-only. When in Debug state, gives the current Security state:
0 Secure state, IsSecure() == TRUE.
1 Non-secure state, IsSecure() == FALSE. 
In Non-debug state, this bit is UNKNOWN. 
Bit [17] 


Reserved, RES0.


SDD, bit [16] 
Secure debug disabled. This bit is RO. 
On entry to Debug state: 
• If entering in Secure state, SDD is set to 0.


• If entering in Non-secure state, SDD is set to the inverse of
ExternalSecureInvasiveDebugEnabled().


In Debug state, the value of the SDD bit does not change, even if
ExternalSecureInvasiveDebugEnabled() changes.


In Non-debug state:


• SDD returns the inverse of ExternalSecureInvasiveDebugEnabled(). If the authentication
signals that control ExternalSecureInvasiveDebugEnabled() change, a context
synchronization event is required to guarantee their effect.


• This bit is unaffected by the Security state of the PE.
If EL3 is not implemented and the implementation is Non-secure, this bit is RES1.
Bit [15]
Reserved, RES0.
HDE, bit [14]
Halting debug enable. The possible values of this field are:
0 Halting disabled for Breakpoint, Watchpoint and Halt Instruction debug events.
1 Halting enabled for Breakpoint, Watchpoint and Halt Instruction debug events. 


When this register has an architecturally-defined reset value, this field resets to 0. 
RW, bits [13:10]


Exception level Execution state status. Read-only. In Debug state, each bit gives the current
Execution state of each EL:


RW    Meaning
1111  All Exception levels are using AArch64.
1110  EL0 is using AArch32. All other Exception levels are using AArch64.
110x  EL0 and EL1 are using AArch32. All other Exception levels are using AArch64. Never seen
      if EL2 is not implemented in the current Security state.
10xx  EL0, EL1, and, if implemented in the current Security state, EL2 are using AArch32. All
      other Exception levels are using AArch64.
0xxx  All Exception levels are using AArch32.


However:
• The value of 1110 is only permitted at EL0.
• The values 110x are not permitted if either:
— EL2 is not implemented.
— EL3 is implemented and SCR_EL3.NS/SCR.NS == 0.
• The values 10xx are not permitted if EL3 is not implemented.


In Non-debug state, this field is RAO.


EL, bits [9:8]
Exception level. Read-only. In Debug state, this gives the current EL of the PE.
In Non-debug state, this field is RAZ.


A, bit [7]
System Error interrupt pending. Read-only. In Debug state, indicates whether an SError interrupt is
pending:


• If HCR_EL2.{AMO, TGE} == {1, 0} and in Non-secure EL0 or EL1, a virtual SError
interrupt.
• Otherwise, a physical SError interrupt.


0 No SError interrupt pending.
1 SError interrupt pending.


A debugger can read EDSCR to check whether an SError interrupt is pending without having to
execute further instructions. A pending SError might indicate data from target memory is corrupted.


UNKNOWN in Non-debug state.


ERR, bit [6]
Cumulative error flag. This field is RO. It is set to 1 following exceptions in Debug state and on any
signaled overrun or underrun on the DTR or EDITR.


When this register has an architecturally-defined reset value, this field resets to 0.


STATUS, bits [5:0] 


Debug status flags. This field is RO. 

The possible values of this field are: 

000010 PE is in Non-debug state.
000001 PE is restarting, exiting Debug state.
000111 Breakpoint.
010011 External debug request.
011011 Halting step, normal.
011111 Halting step, exclusive.
100011 OS Unlock Catch.
100111 Reset Catch.
101011 Watchpoint.
101111 HLT instruction.
110011 Software access to debug register.
110111 Exception Catch.
111011 Halting step, no syndrome.
All other values of STATUS are reserved. 
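
The field layout above lends itself to a small debugger-side decoder. The following is a minimal sketch in Python, not part of the architecture: the names `decode_edscr`, `EDSCR_STATUS`, and `aarch32_exception_levels` are hypothetical, only the fields RW, EL, A, ERR, and STATUS are decoded, and the RW helper ignores whether EL2 or EL3 is actually implemented.

```python
# Hypothetical helper names; field positions follow the EDSCR description above.
EDSCR_STATUS = {
    0b000010: "PE is in Non-debug state",
    0b000001: "PE is restarting, exiting Debug state",
    0b000111: "Breakpoint",
    0b010011: "External debug request",
    0b011011: "Halting step, normal",
    0b011111: "Halting step, exclusive",
    0b100011: "OS Unlock Catch",
    0b100111: "Reset Catch",
    0b101011: "Watchpoint",
    0b101111: "HLT instruction",
    0b110011: "Software access to debug register",
    0b110111: "Exception Catch",
    0b111011: "Halting step, no syndrome",
}


def decode_edscr(value):
    """Split a raw 32-bit EDSCR value into the fields in bits [13:0]."""
    status = value & 0x3F                  # STATUS, bits [5:0]
    return {
        "RW": (value >> 10) & 0xF,         # bits [13:10]
        "EL": (value >> 8) & 0x3,          # bits [9:8]
        "A": (value >> 7) & 0x1,           # bit [7]
        "ERR": (value >> 6) & 0x1,         # bit [6]
        "STATUS": status,
        "status_text": EDSCR_STATUS.get(status, "reserved"),
    }


def aarch32_exception_levels(rw):
    """Return the EL numbers that the RW table reports as using AArch32.

    The highest clear bit of RW decides the answer (0xxx, 10xx, 110x, 1110).
    Only meaningful in Debug state; unimplemented ELs are not filtered out.
    """
    for el in (3, 2, 1, 0):
        if not (rw >> el) & 1:
            return list(range(el + 1))
    return []
```

A caller would pass the value read from offset 0x088 of the Debug component and should treat `RW`, `EL`, and `A` as meaningful only when `STATUS` shows that the PE is in Debug state.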


Accessing the EDSCR: 


EDSCR can be accessed through the external debug interface: 





Component Offset
Debug 0x088

H9.2.43 EDVIDSR, External Debug Virtual Context Sample Register
The EDVIDSR characteristics are: 


Purpose 


Contains sampled values captured on reading EDPCSR. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK Default 





Error Error Error RO 





Configurations 
EDVIDSR is in the Core power domain. 
Fields in this register reset to architecturally UNKNOWN values. 


Required only if the OPTIONAL PC Sample-based Profiling Extension is implemented. 


Attributes 
EDVIDSR is a 32-bit register. 


Field descriptions 


The EDVIDSR bit assignments are: 


31 30 29 28 27 8 7 0
NS E2 E3 HV RES0 VMID


NS, bit [31] 


Non-secure state sample. Indicates the Security state associated with the most recent EDPCSR 
sample. 


If EL3 is not implemented, this bit has a fixed read-only value. 


This field resets to a value that is architecturally UNKNOWN. 


E2, bit [30]

Exception level 2 status sample. Indicates whether the most recent EDPCSR sample was associated
with EL2. If EDVIDSR.NS == 0, this bit is 0.

If EL2 is not implemented, this bit is RES0.

This field resets to a value that is architecturally UNKNOWN.

E3, bit [29]

Exception level 3 status sample. Indicates whether the most recent EDPCSR sample was associated
with AArch64 EL3. If EDVIDSR.NS == 1 or the PE was in AArch32 state when EDPCSR was read,
this bit is 0.

If EL3 is not implemented, this bit is RES0.

This field resets to a value that is architecturally UNKNOWN.

HV, bit [28]

EDPCSR high half valid. Indicates whether bits [63:32] of the most recent EDPCSR sample are
valid. If EDVIDSR.HV == 0, the value of EDPCSR[63:32] is RAZ.

This field resets to a value that is architecturally UNKNOWN.

Bits [27:8]

Reserved, RES0.

VMID, bits [7:0]

VMID sample. The value of VTTBR_EL2.VMID associated with the most recent EDPCSR sample.
If EDVIDSR.NS == 0 or EDVIDSR.E2 == 1, this field is RAZ.

If EL2 is not implemented, this field is RES0.


This field resets to a value that is architecturally UNKNOWN. 
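
To show how the NS, E2, E3, HV, and VMID fields combine, the sketch below decodes a raw EDVIDSR value and summarizes the sampled context. It is an illustration only: `decode_edvidsr` and `describe_sample` are hypothetical names, and the summary strings are a simplification that does not account for unimplemented Exception levels.

```python
def decode_edvidsr(value):
    """Split a raw 32-bit EDVIDSR value into its fields."""
    return {
        "NS": (value >> 31) & 1,    # bit [31]
        "E2": (value >> 30) & 1,    # bit [30]
        "E3": (value >> 29) & 1,    # bit [29]
        "HV": (value >> 28) & 1,    # bit [28]
        "VMID": value & 0xFF,       # bits [7:0]
    }


def describe_sample(value):
    """Return a short, informal description of the most recent sample."""
    f = decode_edvidsr(value)
    if f["NS"] == 0:
        # E3 is only set for a sample associated with AArch64 EL3.
        if f["E3"]:
            return "Secure, AArch64 EL3"
        return "Secure, not AArch64 EL3"
    if f["E2"]:
        # VMID is RAZ when the sample is associated with EL2.
        return "Non-secure EL2"
    return "Non-secure EL0/EL1, VMID 0x%02X" % f["VMID"]
```

Because HV reports whether EDPCSR[63:32] is valid, a profiler would normally check `decode_edvidsr(value)["HV"]` before using the upper half of the sampled program counter.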


Accessing the EDVIDSR: 


EDVIDSR can be accessed through the external debug interface: 





Component Offset
Debug 0x0A8

H9.2.44 EDWAR, External Debug Watchpoint Address Register
The EDWAR characteristics are: 


Purpose 


Returns the virtual data address being accessed when a Watchpoint Debug Event was triggered. 


Usage constraints 


This register is accessible as follows: 





Off DLK OSLK Default 





Error Error Error RO 





Configurations 
EDWAR is in the Core power domain. RW fields in this register reset to architecturally UNKNOWN 
values. These apply only on a Cold reset. The register is not affected by a Warm reset and is not 
affected by an External debug reset. 

Attributes 


EDWAR is a 64-bit register. 


Field descriptions 


The EDWAR bit assignments are: 


63 0 


Watchpoint address 


Bits [63:0] 


Watchpoint address. The data virtual address being accessed when a Watchpoint Debug Event was 
triggered and caused entry to Debug state. This address must be within a naturally-aligned block of 
memory of power-of-two size no larger than the DC ZVA block size. 


The value of this register is UNKNOWN if the PE is in Non-debug state, or if Debug state was entered 
other than for a Watchpoint debug event. 


The value of EDWAR[63:32] is UNKNOWN if Debug state was entered for a Watchpoint debug event 
taken from AArch32 state. 


The EDWAR is subject to the same alignment rules as the reporting of a watchpointed address in 
the FAR. See Determining the memory location that caused a Watchpoint exception on 
page D2-1665.


This field resets to a value that is architecturally UNKNOWN. 


Accessing the EDWAR: 


EDWAR[31:0] can be accessed through the external debug interface: 





Component Offset
Debug 0x030

EDWAR[63:32] can be accessed through the external debug interface:





Component Offset
Debug 0x034
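
Because EDWAR is exposed as two 32-bit words, a debugger has to assemble the 64-bit address itself. The sketch below shows one way to do that, assuming a hypothetical `read32(offset)` callback that returns the 32-bit value at an offset within the Debug component; the `from_aarch32` flag is also an assumption of this example, standing in for whatever the debugger already knows about the state the watchpoint was taken from.

```python
EDWAR_LO_OFFSET = 0x030   # EDWAR[31:0]
EDWAR_HI_OFFSET = 0x034   # EDWAR[63:32]


def read_edwar(read32, from_aarch32=False):
    """Assemble the watchpoint address from the two EDWAR words.

    read32 is a caller-supplied function (hypothetical) that performs a
    32-bit external debug interface read at the given offset.
    """
    lo = read32(EDWAR_LO_OFFSET) & 0xFFFFFFFF
    if from_aarch32:
        # EDWAR[63:32] is UNKNOWN for a watchpoint taken from AArch32
        # state, so the upper word is not read at all.
        return lo
    hi = read32(EDWAR_HI_OFFSET) & 0xFFFFFFFF
    return (hi << 32) | lo
```

The result is only meaningful when Debug state was entered for a Watchpoint debug event; in any other case the register value is UNKNOWN.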
H9.2.45 MIDR_EL1, Main ID Register
The MIDR_EL1 characteristics are: 


Purpose 
Provides identification information for the PE, including an implementer code for the device and a 
device ID number. 

Usage constraints 


This register is accessible as follows: 





Off DLK Default
IMPDEF IMPDEF RO





Configurations 
External register MIDR_EL1 is architecturally mapped to AArch64 System register MIDR_EL1. 
External register MIDR_EL1 is architecturally mapped to AArch32 System register MIDR. 


It is IMPLEMENTATION DEFINED whether MIDR_EL1 is implemented in the Core power domain or
in the Debug power domain.


Attributes 
MIDR_EL1 is a 32-bit register. 


Field descriptions 


The MIDR_EL1 bit assignments are: 


31 24 23 20 19 16 15 4 3 0
Implementer Variant Architecture PartNum Revision


Implementer, bits [31:24] 


The Implementer code. This field must hold an implementer code that has been assigned by ARM. 
Assigned codes include the following: 





Hex representation ASCII representation Implementer
0x41 A ARM Limited
0x42 B Broadcom Corporation
0x43 C Cavium Inc.
0x44 D Digital Equipment Corporation
0x49 I Infineon Technologies AG
0x4D M Motorola or Freescale Semiconductor Inc.
0x4E N NVIDIA Corporation
0x50 P Applied Micro Circuits Corporation
0x51 Q Qualcomm Inc.
0x56 V Marvell International Ltd.
0x69 i Intel Corporation





ARM can assign codes that are not published in this manual. All values not assigned by ARM are 
reserved and must not be used. 


Variant, bits [23:20] 


An IMPLEMENTATION DEFINED variant number. Typically, this field is used to distinguish between 
different product variants, or major revisions of a product. 


Architecture, bits [19:16] 


The permitted values of this field are: 


0001 ARMv4
0010 ARMv4T
0011 ARMv5 (obsolete)
0100 ARMv5T
0101 ARMv5TE
0110 ARMv5TEJ
0111 ARMv6
1111 Architectural features are individually identified in the ID_* registers, see Identification
registers, functional group on page G4-4194.

All other values are reserved.


PartNum, bits [15:4] 


An IMPLEMENTATION DEFINED primary part number for the device. 


On processors implemented by ARM, if the top four bits of the primary part number are 0x0 or 0x7,
the variant and architecture are encoded differently.


Revision, bits [3:0] 


An IMPLEMENTATION DEFINED revision number for the device. 
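
The five MIDR_EL1 fields can be pulled apart with simple shifts and masks. The following is a minimal sketch: `decode_midr` and `IMPLEMENTERS` are hypothetical names, the implementer table holds only the codes published in this description, and the sample value in the usage note is made up for illustration.

```python
# Implementer codes listed in this register description; other codes can be
# assigned by ARM without being published here.
IMPLEMENTERS = {
    0x41: "ARM Limited",
    0x42: "Broadcom Corporation",
    0x43: "Cavium Inc.",
    0x44: "Digital Equipment Corporation",
    0x49: "Infineon Technologies AG",
    0x4D: "Motorola or Freescale Semiconductor Inc.",
    0x4E: "NVIDIA Corporation",
    0x50: "Applied Micro Circuits Corporation",
    0x51: "Qualcomm Inc.",
    0x56: "Marvell International Ltd.",
    0x69: "Intel Corporation",
}


def decode_midr(value):
    """Split a raw 32-bit MIDR_EL1 value into its fields."""
    implementer = (value >> 24) & 0xFF
    return {
        "Implementer": implementer,                  # bits [31:24]
        "implementer_name": IMPLEMENTERS.get(implementer, "unknown"),
        "Variant": (value >> 20) & 0xF,              # bits [23:20]
        "Architecture": (value >> 16) & 0xF,         # bits [19:16]
        "PartNum": (value >> 4) & 0xFFF,             # bits [15:4]
        "Revision": value & 0xF,                     # bits [3:0]
    }
```

For example, `decode_midr(0x410FD034)` yields Implementer 0x41, Variant 0x0, Architecture 0xF, PartNum 0xD03, and Revision 0x4. An Architecture value of 0xF means the features have to be identified from the ID_* registers.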


Accessing the MIDR_EL1: 


MIDR_EL1 can be accessed through the external debug interface:





Component Offset
Debug 0xD00

H9.2.46 OSLAR_EL1, OS Lock Access Register
The OSLAR_EL1 characteristics are: 


Purpose 
Used to lock or unlock the OS lock. 


Usage constraints 


This register is accessible as follows: 





Off DLK EDAD SLK Default 





Error Error IMPDEF WI WO 





Configurations 
External register OSLAR_EL1 is architecturally mapped to AArch64 System register 
OSLAR_EL1. 
External register OSLAR_EL1 is architecturally mapped to AArch32 System register 
DBGOSLAR. 


OSLAR_EL1 is in the Core power domain. 


Attributes 
OSLAR_EL1 is a 32-bit register. 


Field descriptions 


The OSLAR_EL1 bit assignments are: 


31 1 0
RES0 OSLK

Bits [31:1]
Reserved, RES0.


OSLK, bit [0] 
On writes to OSLAR_EL1, bit[0] is copied to the OS lock. 
Use EDPRSR.OSLK to check the current status of the lock. 


Accessing the OSLAR_EL1: 
OSLAR_EL1 can be accessed through the external debug interface:





Component Offset
Debug 0x300

H9.3 Cross-Trigger Interface registers

This section lists the Cross-Trigger Interface registers.

H9.3.1 ASICCTL, CTI External Multiplexer Control register
The ASICCTL characteristics are: 


Purpose 


Can be used to provide IMPLEMENTATION DEFINED controls for the CTI. For example, the register 
might be used to control multiplexors for additional IMPLEMENTATION DEFINED triggers. The 
IMPLEMENTATION DEFINED controls provided by this register might modify the architecturally 
defined behavior of the CTI. 


— Note 


The architecturally-defined triggers must not be multiplexed. 





Usage constraints 


This register is accessible as follows: 





Off DLK OSLK EDAD SLK Default
IMPDEF IMPDEF IMPDEF IMPDEF RO IMPDEF





Configurations 


It is IMPLEMENTATION DEFINED whether ASICCTL is implemented in the Core power domain or in 
the Debug power domain. 


If it is implemented in the Core power domain then it is IMPLEMENTATION DEFINED whether it is in 
the Cold reset domain or the Warm reset domain. 


This register must reset to a value that supports the architecturally-defined behavior of the CTI. 
Changing the value of the register from its reset value causes IMPLEMENTATION DEFINED behavior 
that might differ from the architecturally-defined behavior of the CTI. 


Other than the requirements listed in this register description, all aspects of the reset behavior of the 
ASICCTL are IMPLEMENTATION DEFINED. 


Attributes 
ASICCTL is a 32-bit register. 


Field descriptions 


The ASICCTL bit assignments are: 


31 0 


IMPLEMENTATION DEFINED 


IMPLEMENTATION DEFINED, bits [31:0] 
IMPLEMENTATION DEFINED. 


Accessing the ASICCTL: 


ASICCTL can be accessed through the external debug interface: 





Component Offset
CTI 0x144

H9.3.2 CTIAPPCLEAR, CTI Application Trigger Clear register


The CTIAPPCLEAR characteristics are: 


Purpose 


Clears bits of the Application Trigger register. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO WO 





Configurations 
CTIAPPCLEAR is in the Debug power domain. 


Attributes 
CTIAPPCLEAR is a 32-bit register. 


Field descriptions 


The CTIAPPCLEAR bit assignments are: 


31 0 


APPCLEAR<x>, bit [x]


APPCLEAR<x>, bit [x], for x = 0 to 31 
Application trigger <x> disable. 


Bits [31:N] are RAZ/WI. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 


Writing to this bit has the following effect: 
0 No effect. 


1 Clear corresponding bit in CTIAPPTRIG to 0 and clear the corresponding channel 
event. 


If the ECT does not support multicycle channel events, use of CTIAPPCLEAR is deprecated and 
the debugger must only use CTIAPPPULSE. 


Accessing the CTIAPPCLEAR: 


CTIAPPCLEAR can be accessed through the external debug interface: 





Component Offset 





CTI 0x018

H9.3.3 CTIAPPPULSE, CTI Application Pulse register
The CTIAPPPULSE characteristics are: 


Purpose 


Causes event pulses to be generated on ECT channels. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO WO 





It is CONSTRAINED UNPREDICTABLE whether a write to CTIAPPPULSE generates an event on a 
channel if CTICONTROL.GLBEN is 0. 

Configurations 
CTIAPPPULSE is in the Debug power domain. 


Attributes 
CTIAPPPULSE is a 32-bit register. 


Field descriptions 


The CTIAPPPULSE bit assignments are: 


31 0 


APPPULSE<x>, bit [x] 


APPPULSE<x>, bit [x], for x = 0 to 31 
Generate event pulse on ECT channel <x>. 


Bits [31:N] are RAZ/WI. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 


Writing to this bit has the following effect: 
0 No effect. 


1 Channel <x> event pulse generated. 


— Note 


° The CTIAPPPULSE operation does not affect the state of the Application Trigger register, 
CTIAPPTRIG. If the channel is active, either because of an earlier event or from the 
application trigger, then the value written to CTIAPPPULSE might have no effect. 


° Multiple pulse events that occur close together might be merged into a single pulse event. 
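
Since bits [31:N] of CTIAPPPULSE are RAZ/WI, a debugger normally builds the write value from a list of channel numbers and checks them against the implemented channel count. The sketch below illustrates that bookkeeping; `channel_mask` and `trigger_write_value` are hypothetical names, and `numchan` is assumed to have been read from CTIDEVID.NUMCHAN beforehand. The same value format applies to CTIAPPSET and CTIAPPCLEAR.

```python
def channel_mask(numchan):
    """Mask covering the implemented ECT channel bits [N-1:0]."""
    if not 0 <= numchan <= 32:
        raise ValueError("NUMCHAN out of range")
    return (1 << numchan) - 1


def trigger_write_value(channels, numchan):
    """Build a one-bit-per-channel write value for the given channels.

    Raises ValueError for a channel that is not implemented, because a
    write to that bit would be silently ignored.
    """
    value = 0
    for ch in channels:
        if not 0 <= ch < numchan:
            raise ValueError("channel %d is not implemented" % ch)
        value |= 1 << ch
    return value
```

With four channels implemented, `trigger_write_value([0, 2], 4)` gives 0b0101, which would request a pulse on channels 0 and 2 when written to CTIAPPPULSE.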





Accessing the CTIAPPPULSE: 


CTIAPPPULSE can be accessed through the external debug interface: 





Component Offset 





CTI 0x01C

H9.3.4 CTIAPPSET, CTI Application Trigger Set register

The CTIAPPSET characteristics are: 

Purpose 
Sets bits of the Application Trigger register. 

Usage constraints 
This register is accessible as follows: 

SLK Default 
RO RW 

Configurations 
CTIAPPSET is in the Debug power domain. RW fields in this register reset to architecturally 
UNKNOWN values. These apply only on an External debug reset. The register is not affected by a 
Warm reset and is not affected by a Cold reset. 

Attributes 
CTIAPPSET is a 32-bit register. 

Field descriptions 

The CTIAPPSET bit assignments are: 

31 0 

APPSET<x>, bit [x] 

APPSET<x>, bit [x], for x = 0 to 31 
Application trigger <x> enable. 
Bits [31:N] are RAZ/WI. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 
Possible values of this bit are: 
0 Reading this means the application trigger is inactive. Writing this has no effect.
1 Reading this means the application trigger is active. Writing this sets the corresponding
bit in CTIAPPTRIG to 1 and generates a channel event.

If the ECT does not support multicycle channel events, use of CTIAPPSET is deprecated and the 
debugger must only use CTIAPPPULSE. 
This field resets to a value that is architecturally UNKNOWN. 

Accessing the CTIAPPSET: 

CTIAPPSET can be accessed through the external debug interface: 
Component Offset 
CTI 0x014

H9.3.5 CTIAUTHSTATUS, CTI Authentication Status register
The CTIAUTHSTATUS characteristics are: 
Purpose 
Provides information about the state of the IMPLEMENTATION DEFINED authentication interface for
CTI.
Usage constraints 
This register is accessible as follows: 
SLK Default 
RO RO 
Configurations 
CTIAUTHSTATUS is in the Debug power domain. 
This register is OPTIONAL, and is required for CoreSight compliance. 
Attributes 
CTIAUTHSTATUS is a 32-bit register. 
Field descriptions 
The CTIAUTHSTATUS bit assignments are: 
31 8 7 4 3 2 1 0
RES0 RAZ NSNID NSID
Bits [31:8]
Reserved, RES0.
Bits [7:4] 
Reserved, RAZ. 
NSNID, bits [3:2] 
If EL3 is not implemented and the implemented Security state is Secure state, holds the same value 
as DBGAUTHSTATUS_EL1.SNID. 
Otherwise, holds the same value as DBGAUTHSTATUS_EL1.NSNID. 
NSID, bits [1:0] 
If EL3 is not implemented and the implemented Security state is Secure state, holds the same value 
as DBGAUTHSTATUS_EL1.SID. 
Otherwise, holds the same value as DBGAUTHSTATUS_EL1.NSID.

Accessing the CTIAUTHSTATUS:


CTIAUTHSTATUS can be accessed through the external debug interface: 





Component Offset 





CTI 0xFB8

H9.3.6 CTICHINSTATUS, CTI Channel In Status register


The CTICHINSTATUS characteristics are: 


Purpose 
Provides the raw status of the ECT channel inputs to the CTI. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTICHINSTATUS is in the Debug power domain. 


Attributes 
CTICHINSTATUS is a 32-bit register. 


Field descriptions 


The CTICHINSTATUS bit assignments are: 


31 0 


CHIN<n>, bit [n] 


CHIN<n>, bit [n], for n = 0 to 31 
Input channel <n> status. 


Bits [31:N] are RAZ. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 


Possible values of this bit are: 
0 Input channel <n> is inactive.
1 Input channel <n> is active. 


If the ECT channels do not support multicycle events then it is IMPLEMENTATION DEFINED whether 
an input channel can be observed as active. 


Accessing the CTICHINSTATUS: 


CTICHINSTATUS can be accessed through the external debug interface: 





Component Offset 





CTI 0x138

H9.3.7 CTICHOUTSTATUS, CTI Channel Out Status register


The CTICHOUTSTATUS characteristics are: 


Purpose 
Provides the status of the ECT channel outputs from the CTI. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTICHOUTSTATUS is in the Debug power domain. 


Attributes 
CTICHOUTSTATUS is a 32-bit register. 


Field descriptions 


The CTICHOUTSTATUS bit assignments are: 


31 0 


CHOUT<n>, bit [n] 


CHOUT<n>, bit [n], for n = 0 to 31 
Output channel <n> status. 


Bits [31:N] are RAZ. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 


Possible values of this bit are: 

0 Output channel <n> is inactive.
1 Output channel <n> is active.

If the ECT channels do not support multicycle events then it is IMPLEMENTATION DEFINED whether
an input channel can be observed as active.


—— Note 
The value in CTICHOUTSTATUS is after gating by the channel gate. For more information, see 
CTIGATE. 





Accessing the CTICHOUTSTATUS: 


CTICHOUTSTATUS can be accessed through the external debug interface: 





Component Offset 





CTI 0x13C

H9.3.8 CTICIDR0, CTI Component Identification Register 0
The CTICIDR0 characteristics are:


Purpose 
Provides information to identify a CTI component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTICIDR0 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.
Attributes
CTICIDR0 is a 32-bit register.
Field descriptions
The CTICIDR0 bit assignments are:
31 8 7 0
RES0 PRMBL_0

Bits [31:8]
Reserved, RES0.

PRMBL_0, bits [7:0]

Preamble. Must read as 0x0D.

Accessing the CTICIDR0:

CTICIDR0 can be accessed through the external debug interface:

Component Offset
CTI 0xFF0

H9.3.9 CTICIDR1, CTI Component Identification Register 1
The CTICIDR1 characteristics are: 


Purpose 
Provides information to identify a CTI component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTICIDR1 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
CTICIDR1 is a 32-bit register. 
Field descriptions 
The CTICIDR1 bit assignments are: 
31 8 7 4 3 0
RES0 CLASS PRMBL_1

Bits [31:8]
Reserved, RES0.

CLASS, bits [7:4]

Component class. Reads as 0x9, debug component.

PRMBL_1, bits [3:0]
Preamble. RAZ.

Accessing the CTICIDR1:

CTICIDR1 can be accessed through the external debug interface:

Component Offset
CTI 0xFF4

H9.3.10 CTICIDR2, CTI Component Identification Register 2
The CTICIDR2 characteristics are: 


Purpose 
Provides information to identify a CTI component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTICIDR2 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
CTICIDR2 is a 32-bit register. 
Field descriptions 
The CTICIDR2 bit assignments are: 
31 8 7 0
RES0 PRMBL_2

Bits [31:8]
Reserved, RES0.


PRMBL_2, bits [7:0] 


Preamble. Must read as 0x05. 


Accessing the CTICIDR2: 


CTICIDR2 can be accessed through the external debug interface: 





Component Offset 





CTI 0xFF8

H9.3.11 CTICIDR3, CTI Component Identification Register 3


The CTICIDR3 characteristics are: 


Purpose 
Provides information to identify a CTI component. 


For more information see About the Component Identification scheme on page K2-5507. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTICIDR3 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
CTICIDR3 is a 32-bit register. 


Field descriptions 


The CTICIDR3 bit assignments are: 


31 8 7 0 


RES0 PRMBL_3

Bits [31:8]
Reserved, RES0.

PRMBL_3, bits [7:0]
Preamble. Must read as 0xB1.
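
Taken together, the four CTICIDR registers give a fixed pattern that a debugger can check before trusting the rest of the CTI register map. The sketch below is an illustration under the values stated in the CTICIDR0 to CTICIDR3 descriptions (preamble bytes 0x0D, 0x0, 0x05, 0xB1 and CLASS 0x9); the function name `is_debug_component` is hypothetical.

```python
def is_debug_component(cidr0, cidr1, cidr2, cidr3):
    """Check the component identification pattern of a CTI.

    Each argument is the raw 32-bit value read from CTICIDR0..CTICIDR3.
    Only bits [7:0] carry information; bits [31:8] are RES0 and are
    masked off rather than checked.
    """
    prmbl_0 = cidr0 & 0xFF
    component_class = (cidr1 >> 4) & 0xF   # CLASS, bits [7:4]
    prmbl_1 = cidr1 & 0xF                  # PRMBL_1, bits [3:0], RAZ
    prmbl_2 = cidr2 & 0xFF
    prmbl_3 = cidr3 & 0xFF
    return (
        prmbl_0 == 0x0D
        and component_class == 0x9
        and prmbl_1 == 0x0
        and prmbl_2 == 0x05
        and prmbl_3 == 0xB1
    )
```

The registers are OPTIONAL, so a failed check on an implementation without CoreSight compliance does not by itself mean that no CTI is present.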


Accessing the CTICIDR3: 


CTICIDR3 can be accessed through the external debug interface: 





Component Offset 





CTI 0xFFC

H9.3.12 CTICLAIMCLR, CTI Claim Tag Clear register


The CTICLAIMCLR characteristics are: 


Purpose 
Used by software to read the values of the CLAIM bits, and to clear these bits to 0. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RW 





Configurations 
CTICLAIMCLR is in the Debug power domain. 


See the CLAIM field description for the effect of an External debug reset on the value returned by 
this register. This register is not affected by a Warm reset, and is not affected by a Cold reset. 


Implementation of this register is OPTIONAL. 


Attributes 
CTICLAIMCLR is a 32-bit register. 


Field descriptions 


The CTICLAIMCLR bit assignments are: 


31 0 


CLAIM[X], bit [x] 


CLAIM[x], bit [x], for x = 0 to 31 
CLAIM tag clear bit. 


For values of x greater than or equal to the IMPLEMENTATION DEFINED number of CLAIM tags, this 
bit is RAZ/SBZ. Software can rely on these bits reading as zero, and must use a Should-Be-Zero 
policy on writes. Implementations must ignore writes. 


For other values of x, reads return the value of CLAIM[x] and the behavior on writes is: 
0 No action. 

1 Indirectly clear CLAIM[x] to 0. 

A single write to CTICLAIMCLR can clear multiple tags to 0. 

An External Debug reset clears the CLAIM tag bits to 0. 


Accessing the CTICLAIMCLR: 


CTICLAIMCLR can be accessed through the external debug interface: 





Component Offset 





CTI 0xFA4








ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. H9-5089 
1ID092916 Non-Confidential 


H9 External Debug Register Descriptions 
H9.3 Cross-Trigger Interface registers 


H9.3.13 CTICLAIMSET, CTI Claim Tag Set register 


The CTICLAIMSET characteristics are: 


Purpose 
Used by software to set CLAIM bits to 1. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RW 
Configurations 
CTICLAIMSET is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
Attributes 
CTICLAIMSET is a 32-bit register. 
Field descriptions 
The CTICLAIMSET bit assignments are: 
31 0 


CLAIM[X], bit [x] 


CLAIM[x], bit [x], for x = 0 to 31 
CLAIM tag set bit. 


For values of x greater than or equal to the IMPLEMENTATION DEFINED number of CLAIM tags, this 
bit is RAZ/SBZ. Software can rely on these bits reading as zero, and must use a Should-Be-Zero 
policy on writes. Implementations must ignore writes. 


For other values of x, the bit is RAO and the behavior on writes is: 
0 No action. 

1 Indirectly set CLAIM[x] tag to 1. 

A single write to CTICLAIMSET can set multiple tags to 1. 

An External Debug reset clears the CLAIM tag bits to 0. 
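
The set and clear registers act on one shared group of CLAIM tags, which is easiest to see in a small software model. The class below is a sketch of the behavior described for CTICLAIMSET and CTICLAIMCLR, not a description of any implementation; the class and method names are hypothetical, and `num_tags` stands for the IMPLEMENTATION DEFINED number of CLAIM tags.

```python
class ClaimTags:
    """Behavioral model of the CLAIM tags behind CTICLAIMSET/CTICLAIMCLR."""

    def __init__(self, num_tags):
        if not 0 <= num_tags <= 32:
            raise ValueError("num_tags out of range")
        self.mask = (1 << num_tags) - 1   # implemented tag bits
        self.tags = 0

    def write_claimset(self, value):
        # Writing 1 sets the tag; writes to unimplemented bits are ignored.
        self.tags |= value & self.mask

    def read_claimset(self):
        # Implemented bits are RAO, unimplemented bits are RAZ.
        return self.mask

    def write_claimclr(self, value):
        # Writing 1 clears the tag; writing 0 has no action.
        self.tags &= ~(value & self.mask)

    def read_claimclr(self):
        # Reads return the current value of the CLAIM tags.
        return self.tags

    def external_debug_reset(self):
        self.tags = 0
```

A tool that wants to claim the CTI would write its tag bit through `write_claimset` and then confirm it with `read_claimclr`; reading CTICLAIMSET only reveals how many tags exist.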


Accessing the CTICLAIMSET: 


CTICLAIMSET can be accessed through the external debug interface: 





Component Offset 





CTI 0xFA0

H9.3.14 CTICONTROL, CTI Control register


The CTICONTROL characteristics are: 


Purpose 
Controls whether the CTI is enabled. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RW 





Configurations 
CTICONTROL is in the Debug power domain. Some or all RW fields of this register have defined 
reset values. These apply only on an External debug reset. The register is not affected by a Warm 
reset and is not affected by a Cold reset. 

Attributes 
CTICONTROL is a 32-bit register. 


Field descriptions 


The CTICONTROL bit assignments are: 


31 1 0
RES0 GLBEN

Bits [31:1]
Reserved, RES0.


GLBEN, bit [0] 
Enables or disables the CTI mapping functions. Possible values of this field are: 
0 CTI mapping functions disabled.
1 CTI mapping functions enabled. 


When the mapping functions are disabled, no new events are signaled on either output triggers or 
output channels. If a previously asserted output trigger has not been acknowledged, it remains 
asserted after the mapping functions are disabled. All output triggers are disabled by CTI reset. 


If the ECT supports multicycle channel events any existing output channel events will be 
terminated. 


When this register has an architecturally-defined reset value, this field resets to 0.

Accessing the CTICONTROL:


CTICONTROL can be accessed through the external debug interface: 





Component Offset 





CTI 0x000

H9.3.15 CTIDEVAFF0, CTI Device Affinity register 0

The CTIDEVAFF0 characteristics are:


Purpose 


Copy of the low half of the PE MPIDR_EL1 register that allows a debugger to determine which PE
in a multiprocessor system the CTI component relates to. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTIDEVAFF0 is in the Debug power domain.

Implementation of this register is OPTIONAL.

Attributes
CTIDEVAFF0 is a 32-bit register.

Field descriptions

The CTIDEVAFF0 bit assignments are:

31 0
MPIDR_EL1 low half

Bits [31:0]
MPIDR_EL1 low half. Read-only copy of the low half of MPIDR_EL1, as seen from the highest
implemented Exception level.

Accessing the CTIDEVAFF0:

CTIDEVAFF0 can be accessed through the external debug interface:





Component Offset 





CTI 0xFA8










H9.3.16 CTIDEVAFF1, CTI Device Affinity register 1 
The CTIDEVAFF1 characteristics are: 


Purpose 


Copy of the high half of the PE MPIDR_EL1 register that allows a debugger to determine which PE
in a multiprocessor system the CTI component relates to. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTIDEVAFF1 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
Attributes 
CTIDEVAFF1 is a 32-bit register. 
Field descriptions 
The CTIDEVAFF1 bit assignments are: 
31 0 
MPIDR_EL1 high half 


Bits [31:0] 
MPIDR_EL1 high half. Read-only copy of the high half of MPIDR_EL1, as seen from the highest
implemented Exception level. 


Accessing the CTIDEVAFF1: 


CTIDEVAFF1 can be accessed through the external debug interface: 





Component Offset 





CTI 0xFAC

H9.3.17 CTIDEVARCH, CTI Device Architecture register


The CTIDEVARCH characteristics are: 


Purpose 


Identifies the programmers' model architecture of the CTI component. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTIDEVARCH is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
Attributes 
CTIDEVARCH is a 32-bit register. 
Field descriptions 
The CTIDEVARCH bit assignments are: 
31 21 20 19 16 15 0
ARCHITECT PRESENT REVISION ARCHID


ARCHITECT, bits [31:21] 
Defines the architecture of the component. For CTI, this is ARM Limited. 
Bits [31:28] are the JEP106 continuation code, 0x4. 
Bits [27:21] are the JEP106 ID code, 0x3B. 


PRESENT, bit [20] 
When set to 1, indicates that the DEVARCH is present. 
This field is 1 in ARMv8. 


REVISION, bits [19:16] 
Defines the architecture revision. For architectures defined by ARM this is the minor revision. 
For CTI, the revision defined by ARMv8 is 0x9.


All other values are reserved. 


ARCHID, bits [15:0] 


Defines this part to be an ARMv8 debug component. For architectures defined by ARM this is 
further subdivided. 


For CTI:
• Bits [15:12] are the architecture version, 0x1.
• Bits [11:0] are the architecture part number, 0xA14.

This corresponds to CTI architecture version CTIv2.


Accessing the CTIDEVARCH: 


CTIDEVARCH can be accessed through the external debug interface: 





Component Offset 





CTI 0xFBC
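
A debugger can use CTIDEVARCH to confirm that a component really is a CTI before programming it. The sketch below decodes the fields and checks the values given in the field descriptions (JEP106 continuation code 0x4, ID code 0x3B, architecture version 0x1, part number 0xA14). The function names are hypothetical, and REVISION is decoded but deliberately not checked.

```python
def decode_ctidevarch(value):
    """Split a raw 32-bit CTIDEVARCH value into its fields."""
    architect = (value >> 21) & 0x7FF              # ARCHITECT, bits [31:21]
    return {
        "jep106_continuation": (architect >> 7) & 0xF,   # bits [31:28]
        "jep106_id": architect & 0x7F,                   # bits [27:21]
        "PRESENT": (value >> 20) & 0x1,                  # bit [20]
        "REVISION": (value >> 16) & 0xF,                 # bits [19:16]
        "arch_version": (value >> 12) & 0xF,             # ARCHID[15:12]
        "arch_part": value & 0xFFF,                      # ARCHID[11:0]
    }


def looks_like_cti(value):
    """True if the ARCHITECT, PRESENT, and ARCHID fields match a CTI."""
    f = decode_ctidevarch(value)
    return (
        f["PRESENT"] == 1
        and f["jep106_continuation"] == 0x4
        and f["jep106_id"] == 0x3B
        and f["arch_version"] == 0x1
        and f["arch_part"] == 0xA14
    )
```

In the test values used with this sketch the REVISION bits are simply left at zero for illustration; that is not a statement about the value a real CTI returns.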
H9.3.18 CTIDEVID, CTI Device ID register 0
The CTIDEVID characteristics are: 


Purpose 


Describes the CTI component to the debugger. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTIDEVID is in the Debug power domain. 


Attributes 
CTIDEVID is a 32-bit register. 


Field descriptions 


The CTIDEVID bit assignments are: 


31 26 25 24 23 22 21 16 15 14 13 8 7 5 4 0
RES0 INOUT RES0 NUMCHAN RES0 NUMTRIG RES0 EXTMUXNUM

Bits [31:26]
Reserved, RES0.

INOUT, bits [25:24]

Input/output options. Indicates presence of the input gate. If the CTM is not implemented, this field
is RAZ.
00 CTIGATE does not mask propagation of input events from external channels.
01 CTIGATE masks propagation of input events from external channels.

All other values are reserved.

Bits [23:22]
Reserved, RES0.

NUMCHAN, bits [21:16]
Number of ECT channels implemented. IMPLEMENTATION DEFINED. For ARMv8, valid values are:
000011 3 channels (0..2) implemented.
000100 4 channels (0..3) implemented.
000101 5 channels (0..4) implemented.
000110 6 channels (0..5) implemented.
and so on up to 100000, 32 channels (0..31) implemented.

All other values are reserved.

Bits [15:14]
Reserved, RES0.


NUMTRIG, bits [13:8] 


Number of triggers implemented. IMPLEMENTATION DEFINED. This is one more than the index of the 
largest trigger, rather than the actual number of triggers. 


For ARMv8, valid values are: 

000011 Up to 3 triggers (0..2) implemented. 

001000 Up to 8 triggers (0..7) implemented. 

001001 Up to 9 triggers (0..8) implemented. 

001010 Up to 10 triggers (0..9) implemented. 

and so on up to 100000, 32 triggers (0..31) implemented. 


All other values are reserved. If the PE contains a Trace extension, this field must be at least 0b001000.
There is no guarantee that any of the implemented triggers, including the highest numbered, are
connected to any components.

Bits [7:5]
Reserved, RES0.


EXTMUXNUM, bits [4:0] 


Number of multiplexors available on triggers. This value is used in conjunction with External 
Control register, ASICCTL. 
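
The field positions above can be extracted with shifts and masks. The following is a minimal sketch, not part of the architecture: the function name and the sample register value are hypothetical, chosen only to illustrate the layout.

```python
def decode_ctidevid(value):
    """Split a 32-bit CTIDEVID value into its named fields."""
    return {
        "INOUT": (value >> 24) & 0x3,      # bits [25:24]
        "NUMCHAN": (value >> 16) & 0x3F,   # bits [21:16]
        "NUMTRIG": (value >> 8) & 0x3F,    # bits [13:8]
        "EXTMUXNUM": value & 0x1F,         # bits [4:0]
    }


# Hypothetical readback: INOUT == 0b01, 4 channels, 8 triggers, no multiplexors.
fields = decode_ctidevid(0x01040800)
print(fields)
```

Reserved bits are ignored by the masks, so the helper does not depend on them reading as zero.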


Accessing the CTIDEVID: 


CTIDEVID can be accessed through the external debug interface: 





Component Offset
CTI 0xFC8


CTIDEVID1, CTI Device ID register 1


The CTIDEVID1 characteristics are: 


Purpose 


Reserved for future information about the CTI component to the debugger. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTIDEVID1 is in the Debug power domain. 


Attributes 
CTIDEVID1 is a 32-bit register. 


Field descriptions 


The CTIDEVID1 bit assignments are: 


31      0
RES0


Bits [31:0]
Reserved, RES0.


Accessing the CTIDEVID1: 


CTIDEVID1 can be accessed through the external debug interface: 





Component Offset
CTI 0xFC4


H9.3.20 CTIDEVID2, CTI Device ID register 2
The CTIDEVID2 characteristics are: 


Purpose 


Reserved for future information about the CTI component to the debugger. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTIDEVID2 is in the Debug power domain. 
Attributes 
CTIDEVID2 is a 32-bit register. 
Field descriptions 
The CTIDEVID2 bit assignments are: 
31      0
RES0


Bits [31:0]
Reserved, RES0.


Accessing the CTIDEVID2: 


CTIDEVID2 can be accessed through the external debug interface: 





Component Offset
CTI 0xFC0


CTIDEVTYPE, CTI Device Type register


The CTIDEVTYPE characteristics are: 


Purpose 


Indicates to a debugger that this component is part of a PE's cross-trigger interface.


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTIDEVTYPE is in the Debug power domain. 


Implementation of this register is OPTIONAL. 


Attributes 
CTIDEVTYPE is a 32-bit register. 


Field descriptions 


The CTIDEVTYPE bit assignments are: 


31         8 7    4 3      0
RES0         SUB    MAJOR


Bits [31:8]
Reserved, RES0.


SUB, bits [7:4]
Subtype. Must read as 0x1 to indicate this is a component within a PE.


MAJOR, bits [3:0]
Major type. Must read as 0x4 to indicate this is a cross-trigger component.


Accessing the CTIDEVTYPE: 


CTIDEVTYPE can be accessed through the external debug interface: 





Component Offset
CTI 0xFCC


H9.3.22 CTIGATE, CTI Channel Gate Enable register
The CTIGATE characteristics are: 
Purpose 
Determines whether events on channels propagate through the CTM to other ECT components, or 
from the CTM into the CTI. 
Usage constraints 
This register is accessible as follows: 
SLK Default 
RO RW 
Configurations 
CTIGATE is in the Debug power domain. RW fields in this register reset to architecturally 
UNKNOWN values. These apply only on an External debug reset. The register is not affected by a 
Warm reset and is not affected by a Cold reset. 
Attributes 
CTIGATE is a 32-bit register. 
Field descriptions 
The CTIGATE bit assignments are: 
31                0
GATE<x>, bit [x]

GATE<x>, bit [x], for x = 0 to 31
Channel <x> gate enable. 
Bits [31:N] are RAZ/WI. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 
Possible values of this bit are: 
0 Disable output and, if CTIDEVID.INOUT == 0b01, input channel <x> propagation. 
1 Enable output and, if CTIDEVID.INOUT == 0b01, input channel <x> propagation. 
If GATE[x] is set to 0, no new events will be propagated to the ECT, and if the ECT supports 
multicycle channel events any existing output channel events will be terminated. 
This field resets to a value that is architecturally UNKNOWN. 
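
Because the field resets to an UNKNOWN value and bits [31:N] are RAZ/WI, a debugger would typically write the whole register before relying on it. A minimal sketch of building such a value, assuming N has already been read from CTIDEVID.NUMCHAN; both helper names are hypothetical:

```python
def implemented_mask(n):
    """Mask covering bits [n-1:0]; bits [31:n] are RAZ/WI when n < 32."""
    if not 0 <= n <= 32:
        raise ValueError("n must be in the range 0..32")
    return (1 << n) - 1


def ctigate_value(enabled_channels, numchan):
    """Build a CTIGATE value that enables only the listed channels."""
    value = 0
    for x in enabled_channels:
        if not 0 <= x < numchan:
            raise ValueError(f"channel {x} is not implemented")
        value |= 1 << x
    return value & implemented_mask(numchan)


# Enable propagation for channels 0 and 2 on a CTI with 4 channels.
print(bin(ctigate_value([0, 2], 4)))
```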
Accessing the CTIGATE: 
CTIGATE can be accessed through the external debug interface: 
Component Offset 
CTI 0x140


H9.3.23 CTIINEN<n>, CTI Input Trigger to Output Channel Enable registers, n = 0 - 31
The CTIINEN<n> characteristics are: 


Purpose 


Enables the signaling of an event on output channels when input trigger event n is received by the 
CTI. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RW 





Configurations 


CTIINEN<n> is in the Debug power domain. RW fields in this register reset to architecturally 
UNKNOWN values. These apply only on an External debug reset. The register is not affected by a 
Warm reset and is not affected by a Cold reset. 


If input trigger n is not connected, the behavior of CTIINEN<n> is IMPLEMENTATION DEFINED. 


Attributes 
CTIINEN<n> is a 32-bit register. 


Field descriptions 


The CTIINEN<n> bit assignments are: 


31 0 


INEN<x>, bit [x] 


INEN<x>, bit [x], for x = 0 to 31 
Input trigger <n> to output channel <x> enable. 


Bits [31:N] are RAZ/WI. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 


Possible values of this bit are: 
0 Input trigger <n> will not generate an event on output channel <x>. 
1 Input trigger <n> will generate an event on output channel <x>. 


This field resets to a value that is architecturally UNKNOWN. 


Accessing the CTIINEN<n>: 


CTIINEN<n> can be accessed through the external debug interface: 





Component Offset 





CTI 0x020 + 4n


H9.3.24 CTIINTACK, CTI Output Trigger Acknowledge register


The CTIINTACK characteristics are: 


Purpose 


Can be used to deactivate the output triggers. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO WO 





A debugger must read CTITRIGOUTSTATUS to confirm that the output trigger has been 
acknowledged before generating any event that must be ordered after the write to CTIINTACK, 
such as a write to CTIAPPPULSE to activate another trigger. 
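
The write-then-confirm sequence described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive debugger implementation: `read32` and `write32` stand for whatever register accessors the debugger provides, `FakeCti` is a toy model invented here so the example runs, and the polling limit is arbitrary. The register offsets are those given in this section.

```python
CTIINTACK = 0x010
CTITRIGOUTSTATUS = 0x134


def acknowledge_trigger(read32, write32, n, max_polls=100):
    """Write ACK<n>, then poll TROUT<n> until the trigger reads as inactive."""
    write32(CTIINTACK, 1 << n)
    for _ in range(max_polls):
        if not (read32(CTITRIGOUTSTATUS) >> n) & 1:
            return True    # acknowledged: ordered events may now be generated
    return False           # still active after max_polls reads


class FakeCti:
    """Toy stand-in: an acknowledged trigger clears on the second status read."""

    def __init__(self, active):
        self.trigout = active
        self.pending = {}

    def write32(self, offset, value):
        if offset == CTIINTACK:
            for n in range(32):
                if (value >> n) & 1 and (self.trigout >> n) & 1:
                    self.pending[n] = 2

    def read32(self, offset):
        assert offset == CTITRIGOUTSTATUS
        for n in list(self.pending):
            self.pending[n] -= 1
            if self.pending[n] == 0:
                self.trigout &= ~(1 << n)
                del self.pending[n]
        return self.trigout


cti = FakeCti(active=0b0001)
done = acknowledge_trigger(cti.read32, cti.write32, 0)
print(done)
```

Only after the helper reports success would the debugger go on to, for example, write CTIAPPPULSE.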

Configurations 
CTIINTACK is in the Debug power domain. 


Attributes 
CTIINTACK is a 32-bit register. 


Field descriptions 


The CTIINTACK bit assignments are: 


31 0 


ACK<n>, bit [n] 


ACK<n>, bit [n], for n = 0 to 31 
Acknowledge for output trigger <n>. 


Bits [31:N] are RAZ/WI. N is the number of CTI triggers implemented as defined by the 
CTIDEVID.NUMTRIG field. 


If any of the following is true, writes to ACK<n> are ignored: 

° n >= CTIDEVID.NUMTRIG, the number of implemented triggers. 

° Output trigger n is not active. 

° The channel mapping function output, as controlled by CTIOUTEN<n>, is still active. 


Otherwise, if any of the following are true, it is IMPLEMENTATION DEFINED whether writes to 
ACK<n> are ignored: 


° Output trigger n is not implemented. 

° Output trigger n is not connected. 

° Output trigger n is self-acknowledging and does not require software acknowledge. 
Otherwise, the behavior on writes to ACK<n> is as follows:
0 No effect.
1 Deactivate the trigger.


Accessing the CTIINTACK:


CTIINTACK can be accessed through the external debug interface: 





Component Offset 





CTI 0x010


H9.3.25 CTIITCTRL, CTI Integration mode Control register

The CTIITCTRL characteristics are: 

Purpose 
Enables the CTI to switch from its default mode into integration mode, where test software can 
control directly the inputs and outputs of the PE, for integration testing or topology detection. 

Usage constraints 
This register is accessible as follows: 

Default 
RW 

Configurations 
It is IMPLEMENTATION DEFINED whether CTIITCTRL is implemented in the Core power domain or 
in the Debug power domain. Some or all RW fields of this register have defined reset values, and: 
° The register is not affected by a Warm reset. 
. If the register is implemented in the Core power domain the reset values apply on a Cold 

reset, and the register is not affected by an External debug reset. 
° If the register is implemented in the Debug power domain the reset values apply on an 
External debug reset, and the register is not affected by a Cold reset. 

Implementation of this register is OPTIONAL. 

Attributes 
CTIITCTRL is a 32-bit register. 

Field descriptions 

The CTIITCTRL bit assignments are:

31         1 0
RES0         IME

Bits [31:1]
Reserved, RES0.

IME, bit [0] 
Integration mode enable. When IME == 1, the device reverts to an integration mode to enable 
integration testing or topology detection. The integration mode behavior is IMPLEMENTATION 
DEFINED. 
0 Normal operation. 
1 Integration mode enabled. 
When this register has an architecturally-defined reset value, this field resets to 0.

Accessing the CTIITCTRL:


CTIITCTRL can be accessed through the external debug interface: 





Component Offset
CTI 0xF00


H9.3.26 CTILAR, CTI Lock Access Register

The CTILAR characteristics are: 

Purpose 
Allows or disallows access to the CTI registers through a memory-mapped interface. 

Usage constraints 
This register is accessible as follows: 

Default 
WO 

Configurations 
CTILAR is in the Debug power domain. 
If OPTIONAL memory-mapped access to the external debug interface is supported then an OPTIONAL 
Software Lock can be implemented as part of CoreSight compliance. 
CTILAR ignores writes if the Software lock is not implemented and ignores writes for other 
accesses to the external debug interface. 
The Software Lock provides a lock to prevent memory-mapped writes to the Cross-Trigger 
Interface registers. Use of this lock mechanism reduces the risk of accidental damage to the contents 
of the Cross-Trigger Interface registers. It does not, and cannot, prevent all accidental or malicious 
damage. 
Software uses CTILAR to set or clear the lock, and CTILSR to check the current status of the lock. 

Attributes 
CTILAR is a 32-bit register. 

Field descriptions 

The CTILAR bit assignments are: 

31 0 

KEY 

KEY, bits [31:0] 
Lock Access control. Writing the key value 0xC5ACCE55 to this field unlocks the lock, enabling write
accesses to this component's registers through a memory-mapped interface.
Writing any other value to this register locks the lock, disabling write accesses to this component's
registers through a memory-mapped interface.

Accessing the CTILAR: 

CTILAR can be accessed through a memory-mapped access to the external debug interface: 

Component Offset 
CTI 0xFB0


H9.3.27 CTILSR, CTI Lock Status Register
The CTILSR characteristics are: 


Purpose 


Indicates the current status of the Software Lock for CTI registers. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 


CTILSR is in the Debug power domain. Some or all RW fields of this register have defined reset 
values. These apply only on an External debug reset. The register is not affected by a Warm reset 
and is not affected by a Cold reset. 


If OPTIONAL memory-mapped access to the external debug interface is supported then an OPTIONAL 
Software Lock can be implemented as part of CoreSight compliance. 


CTILSR is RAZ if the Software Lock is not implemented and is RAZ for other accesses to the 
external debug interface. 


The Software Lock provides a lock to prevent memory-mapped writes to the Cross-Trigger 
Interface registers. Use of this lock mechanism reduces the risk of accidental damage to the contents 
of the Cross-Trigger Interface registers. It does not, and cannot, prevent all accidental or malicious 
damage. 


Software uses CTILAR to set or clear the lock, and CTILSR to check the current status of the lock. 


Attributes 
CTILSR is a 32-bit register. 


Field descriptions 


The CTILSR bit assignments are: 


31         3 2     1     0
RES0         nTT   SLK   SLI


Bits [31:3]
Reserved, RES0.

nTT, bit [2] 
Not thirty-two bit access required. RAZ. 

SLK, bit [1] 
Software Lock status for this component. For an access to LSR that is not a memory-mapped access,
or when the Software Lock is not implemented, this field is RES0.
For memory-mapped accesses when the Software Lock is implemented, possible values of this field
are:
0 Lock clear. Writes are permitted to this component's registers.
1 Lock set. Writes to this component's registers are ignored, and reads have no side
effects.
When this register has an architecturally-defined reset value, this field resets to 1.
SLI, bit [0] 


Software Lock implemented. For an access to LSR that is not a memory-mapped access, this field 
is RAZ. For memory-mapped accesses, the value of this field is IMPLEMENTATION DEFINED. 
Permitted values are: 


0 Software Lock not implemented or not memory-mapped access.


1 Software Lock implemented and memory-mapped access. 
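
Taken together, CTILSR and CTILAR suggest a check-then-unlock sequence for memory-mapped software. The following is a hedged sketch only: `read32` and `write32` are assumed accessors, `FakeLock` is a toy model invented so the example runs, and the offsets are those listed for CTILAR (0xFB0) and CTILSR (0xFB4).

```python
CTILAR = 0xFB0
CTILSR = 0xFB4
UNLOCK_KEY = 0xC5ACCE55


def decode_ctilsr(value):
    """Return the nTT, SLK and SLI bits of a CTILSR value."""
    return {"nTT": (value >> 2) & 1, "SLK": (value >> 1) & 1, "SLI": value & 1}


def unlock_if_needed(read32, write32):
    """Clear the Software Lock if it is implemented and set.

    Returns True when writes are permitted afterwards.
    """
    status = decode_ctilsr(read32(CTILSR))
    if not status["SLI"]:
        return True                      # no Software Lock on this access path
    if status["SLK"]:
        write32(CTILAR, UNLOCK_KEY)      # any other value would set the lock
        status = decode_ctilsr(read32(CTILSR))
    return status["SLK"] == 0


class FakeLock:
    """Toy model of the Software Lock seen through a memory-mapped interface."""

    def __init__(self):
        self.locked = True               # SLK resets to 1

    def read32(self, offset):
        return (0b01 | (int(self.locked) << 1)) if offset == CTILSR else 0

    def write32(self, offset, value):
        if offset == CTILAR:
            self.locked = value != UNLOCK_KEY


lock = FakeLock()
print(unlock_if_needed(lock.read32, lock.write32))
```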


Accessing the CTILSR: 


CTILSR can be accessed through a memory-mapped access to the external debug interface: 





Component Offset
CTI 0xFB4


H9.3.28 CTIOUTEN<n>, CTI Input Channel to Output Trigger Enable registers, n = 0 - 31
The CTIOUTEN<n> characteristics are: 


Purpose 


Defines which input channels generate output trigger n. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RW 





Configurations 


CTIOUTEN<n> is in the Debug power domain. RW fields in this register reset to architecturally 
UNKNOWN values. These apply only on an External debug reset. The register is not affected by a 
Warm reset and is not affected by a Cold reset. 


If output trigger n is not connected, the behavior of CTIOUTEN<n> is IMPLEMENTATION DEFINED. 


Attributes 
CTIOUTEN<n> is a 32-bit register. 


Field descriptions 


The CTIOUTEN<n> bit assignments are: 


31                 0
OUTEN<x>, bit [x]


OUTEN<x>, bit [x], for x = 0 to 31 
Input channel <x> to output trigger <n> enable. 


Bits [31:N] are RAZ/WI. N is the number of ECT channels implemented as defined by the 
CTIDEVID.NUMCHAN field. 


Possible values of this bit are: 
0 An event on input channel <x> will not cause output trigger <n> to be asserted.
1 An event on input channel <x> will cause output trigger <n> to be asserted. 


This field resets to a value that is architecturally UNKNOWN. 
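
CTIINEN<n> and CTIOUTEN<n> together map an input trigger to an output trigger through a channel. The sketch below lists the register writes for one such mapping, using the documented offsets 0x020 + 4n and 0x0A0 + 4n. The helper names are hypothetical, and the sketch assumes each register is written in full, which replaces any previously enabled channels. It leaves CTIGATE untouched, so whether the event also propagates through the CTM is not addressed here.

```python
def ctiinen_offset(n):
    """Offset of CTIINEN<n> within the CTI."""
    return 0x020 + 4 * n


def ctiouten_offset(n):
    """Offset of CTIOUTEN<n> within the CTI."""
    return 0x0A0 + 4 * n


def route_writes(in_trigger, channel, out_trigger, numchan, numtrig):
    """List the (offset, value) writes mapping in_trigger -> channel -> out_trigger."""
    if not 0 <= channel < numchan:
        raise ValueError("channel is not implemented")
    if not (0 <= in_trigger < numtrig and 0 <= out_trigger < numtrig):
        raise ValueError("trigger index is out of range")
    bit = 1 << channel
    return [
        (ctiinen_offset(in_trigger), bit),     # INEN<channel> in CTIINEN<in_trigger>
        (ctiouten_offset(out_trigger), bit),   # OUTEN<channel> in CTIOUTEN<out_trigger>
    ]


# Input trigger 1 -> channel 2 -> output trigger 0, on a 4-channel, 8-trigger CTI.
for offset, value in route_writes(1, 2, 0, numchan=4, numtrig=8):
    print(hex(offset), bin(value))
```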


Accessing the CTIOUTEN<n>: 


CTIOUTEN<n> can be accessed through the external debug interface: 





Component Offset
CTI 0x0A0 + 4n


H9.3.29 CTIPIDR0, CTI Peripheral Identification Register 0
The CTIPIDR0 characteristics are:


Purpose 
Provides information to identify a CTI component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations
CTIPIDR0 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.


Attributes
CTIPIDR0 is a 32-bit register.


Field descriptions


The CTIPIDR0 bit assignments are:


31         8 7       0
RES0         PART_0


Bits [31:8]
Reserved, RES0.


PART_0, bits [7:0]
Part number, least significant byte.


Accessing the CTIPIDR0:


CTIPIDR0 can be accessed through the external debug interface:


Component Offset
CTI 0xFE0


H9.3.30 CTIPIDR1, CTI Peripheral Identification Register 1
The CTIPIDR1 characteristics are: 
Purpose 


Provides information to identify a CTI component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTIPIDR1 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
CTIPIDR1 is a 32-bit register. 
Field descriptions 
The CTIPIDR1 bit assignments are: 
31         8 7      4 3       0
RES0         DES_0    PART_1


Bits [31:8]
Reserved, RES0.


DES_0, bits [7:4]
Designer, least significant nibble of JEP106 ID code. For ARM Limited, this field is 0b1011.


PART_1, bits [3:0] 


Part number, most significant nibble. 


Accessing the CTIPIDR1: 


CTIPIDR1 can be accessed through the external debug interface:


Component Offset
CTI 0xFE4


H9.3.31 CTIPIDR2, CTI Peripheral Identification Register 2
The CTIPIDR2 characteristics are: 
Purpose 
Provides information to identify a CTI component. 
For more information see About the Peripheral identification scheme on page K2-5504. 
Usage constraints 
This register is accessible as follows: 
SLK Default 
RO RO 
Configurations 
CTIPIDR2 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
CTIPIDR2 is a 32-bit register. 
Field descriptions 
The CTIPIDR2 bit assignments are: 
31         8 7         4 3       2      0
RES0         REVISION    JEDEC   DES_1
Bits [31:8]
Reserved, RES0.
REVISION, bits [7:4]
Part major revision. Parts can also use this field to extend Part number to 16 bits.
JEDEC, bit [3]
RAO. Indicates a JEP106 identity code is used.
DES_1, bits [2:0]
Designer, most significant bits of JEP106 ID code. For ARM Limited, this field is 0b011.
Accessing the CTIPIDR2: 
CTIPIDR2 can be accessed through the external debug interface: 
Component Offset 
CTI 0xFE8


H9.3.32 CTIPIDR3, CTI Peripheral Identification Register 3
The CTIPIDR3 characteristics are: 


Purpose 
Provides information to identify a CTI component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTIPIDR3 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 
This register is required for CoreSight compliance. 
Attributes 
CTIPIDR3 is a 32-bit register. 
Field descriptions 
The CTIPIDR3 bit assignments are: 
31         8 7       4 3     0
RES0         REVAND    CMOD


Bits [31:8]
Reserved, RES0.


REVAND, bits [7:4] 
Part minor revision. Parts using CTIPIDR2.REVISION as an extension to the Part number must use 
this field as a major revision number. 


CMOD, bits [3:0] 


Customer modified. Indicates someone other than the Designer has modified the component. 


Accessing the CTIPIDR3: 


CTIPIDR3 can be accessed through the external debug interface: 





Component Offset
CTI 0xFEC


H9.3.33 CTIPIDR4, CTI Peripheral Identification Register 4


The CTIPIDR4 characteristics are: 


Purpose 
Provides information to identify a CTI component. 


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints 


This register is accessible as follows: 





SLK Default 





RO RO 





Configurations 
CTIPIDR4 is in the Debug power domain. 
Implementation of this register is OPTIONAL. 


This register is required for CoreSight compliance. 


Attributes 
CTIPIDR4 is a 32-bit register. 


Field descriptions 


The CTIPIDR4 bit assignments are: 


31         8 7     4 3      0
RES0         SIZE    DES_2


Bits [31:8]
Reserved, RES0.


SIZE, bits [7:4]
Size of the component. RAZ. Log2 of the number of 4KB pages from the start of the component to
the end of the component ID registers.


DES_2, bits [3:0] 
Designer, JEP106 continuation code, least significant nibble. For ARM Limited, this field is 0b0100. 
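
The fields of CTIPIDR0 to CTIPIDR4 combine into a part number and a JEP106 designer code. A minimal sketch of that combination follows. The function name is hypothetical and the part number in the sample values is made up; only the ARM Limited designer fields (DES_0 == 0b1011, DES_1 == 0b011, DES_2 == 0b0100) come from the field descriptions.

```python
def decode_pidr(pidr0, pidr1, pidr2, pidr3, pidr4):
    """Combine CTIPIDR0-CTIPIDR4 values into identification fields."""
    part = ((pidr1 & 0xF) << 8) | (pidr0 & 0xFF)            # PART_1:PART_0
    des_id = ((pidr2 & 0x7) << 4) | ((pidr1 >> 4) & 0xF)    # DES_1:DES_0
    return {
        "part": part,
        "jep106_id": des_id,
        "jep106_continuation": pidr4 & 0xF,   # DES_2
        "jep106_used": (pidr2 >> 3) & 1,      # JEDEC, RAO
        "revision": (pidr2 >> 4) & 0xF,       # REVISION
        "revand": (pidr3 >> 4) & 0xF,         # REVAND
        "cmod": pidr3 & 0xF,                  # CMOD
    }


# Hypothetical component: made-up part number 0x123, designer fields for ARM Limited.
ident = decode_pidr(0x23, 0xB1, 0x0B, 0x00, 0x04)
print(hex(ident["part"]), hex(ident["jep106_id"]), ident["jep106_continuation"])
```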


Accessing the CTIPIDR4: 


CTIPIDR4 can be accessed through the external debug interface: 





Component Offset
CTI 0xFD0


H9.3.34 CTITRIGINSTATUS, CTI Trigger In Status register
The CTITRIGINSTATUS characteristics are: 


Purpose 


Provides the status of the trigger inputs. 


Usage constraints 


This register is accessible as follows: 











SLK Default 
RO RO 
Configurations 
CTITRIGINSTATUS is in the Debug power domain. 
Attributes 
CTITRIGINSTATUS is a 32-bit register. 
Field descriptions 
The CTITRIGINSTATUS bit assignments are: 
31                0
TRIN<n>, bit [n]


TRIN<n>, bit [n], for n = 0 to 31 
Trigger input <n> status. 


Bits [31:N] are RAZ. N is the number of CTI triggers implemented as defined by the 
CTIDEVID.NUMTRIG field. 


Possible values of this bit are: 

0 Input trigger n is inactive. 

1 Input trigger n is active. 

Not implemented and not-connected input triggers are always inactive. 


It is IMPLEMENTATION DEFINED whether an input trigger that does not support multicycle events can 
be observed as active. 


Accessing the CTITRIGINSTATUS: 


CTITRIGINSTATUS can be accessed through the external debug interface: 





Component Offset 





CTI 0x130


H9.3.35 CTITRIGOUTSTATUS, CTI Trigger Out Status register
The CTITRIGOUTSTATUS characteristics are: 
Purpose 
Provides the raw status of the trigger outputs, after processing by any IMPLEMENTATION DEFINED 
trigger interface logic. For output triggers that are self-acknowledging, this is only meaningful if the 
CTI implements multicycle channel events. 
Usage constraints 
This register is accessible as follows: 
SLK Default 
RO RO 
Configurations 
CTITRIGOUTSTATUS is in the Debug power domain. 
Attributes 
CTITRIGOUTSTATUS is a 32-bit register. 
Field descriptions 
The CTITRIGOUTSTATUS bit assignments are: 
31 0 
TROUT<n>, bit [n] 
TROUT<n>, bit [n], for n = 0 to 31 
Trigger output <n> status. 
Bits [31:N] are RAZ, where N is the value of the CTIDEVID.NUMTRIG field. 
If n<N, and output trigger <n> is implemented and connected, and either the trigger is not 
self-acknowledging or the CTI implements multicycle channel events, then permitted values for 
TROUT<n> are: 
0 Output trigger n is inactive. 
1 Output trigger n is active. 
Otherwise when n < N it is IMPLEMENTATION DEFINED whether TROUT<n> behaves as described 
here or is RAZ. 
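
A status value can be reduced to a list of active trigger indices, ignoring bits [31:N]. The following sketch assumes N has been read from CTIDEVID.NUMTRIG; the function name is hypothetical. It applies equally to CTITRIGINSTATUS, which has the same layout.

```python
def active_triggers(status, numtrig):
    """Return the indices n < numtrig whose status bit reads as 1."""
    if not 0 <= numtrig <= 32:
        raise ValueError("numtrig must be in the range 0..32")
    return [n for n in range(numtrig) if (status >> n) & 1]


# With 8 triggers, only bits [7:0] are considered.
print(active_triggers(0b10000101, 8))
```

A bit that reads as 0 does not by itself prove the trigger is inactive, because for triggers that are not connected, or are self-acknowledging without multicycle channel events, TROUT<n> may be RAZ.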
Accessing the CTITRIGOUTSTATUS: 
CTITRIGOUTSTATUS can be accessed through the external debug interface: 
Component Offset 
CTI 0x134


Chapter I1
System Level Implementation of the Generic Timer


This chapter defines the system level implementation of the Generic Timer. It contains the following sections: 





• About the Generic Timer specification on page I1-5122.
• Memory-mapped counter module on page I1-5124.
• Memory-mapped timer components on page I1-5128.
• Providing a complete set of counter and timer features on page I1-5132.
• Gray-count scheme for timer distribution scheme on page I1-5134.


Note
• Generic Timer memory-mapped register descriptions on page I3-5199 describes the System level Generic
Timer registers. These registers are memory-mapped.
• Chapter D6 The Generic Timer in AArch64 state gives a general description of the AArch64 state view of the
Generic Timer, and describes the AArch64 System register interface to the Generic Timer.
• Chapter G5 The Generic Timer in AArch32 state gives a general description of the AArch32 state view of the
Generic Timer, and describes the AArch32 System register interface to the Generic Timer.


I1.1 About the Generic Timer specification


Chapter D6 The Generic Timer in AArch64 state describes the ARM Generic Timer and its implementation as seen 
from AArch64 state. Chapter G5 The Generic Timer in AArch32 state describes the ARM Generic Timer and its
implementation as seen from AArch32 state. These chapters include the definition of the low-latency System 
register interface to the Generic Timer. However, the ARM Generic Timer architecture requires the inclusion of a 
memory-mapped component, the memory-mapped counter module, to control the generation of the Count value 
used by the Generic Timers. 


In addition, the Generic Timer architecture defines the architecture of an optional memory-mapped timer 
component. This gives a standardized way of providing timers for programmable system components other than PEs 
that implement the ARM architecture. 


The full set of Generic Timer components on page D6-1877 summarizes these components as seen from AArch64 
state, and The full set of Generic Timer components on page G5-4219 summarizes them as seen from AArch32 state.
The system level components of the Generic Timer summarizes the system level components. 


Registers in the system level implementation of the Generic Timer 


Registers that control components of the system level implementation of the Generic Timer are grouped into frames. 
This specification defines the registers in each frame, and their offsets within the frame. The system defines the 
position of each frame in the memory map. This means the base address of each frame is IMPLEMENTATION
DEFINED.


Note 


The final twelve words of the first or only 4KB block of a register memory frame are an ID block.








Power and reset domains for the system level implementation of the Generic Timer 


The power and reset domains of the system level implementation of the Generic Timer are IMPLEMENTATION 
DEFINED as part of the system implementation. These domains can be outside the PE power and reset domains 
defined by the remainder of this manual. 


The ARM architecture requires that the CNTCR.{FCREQ, EN} and CNTSR.FCACK fields reset to 0. These reset
values apply only on powerup of the power domain in which the registers are implemented or a reset of the reset 
domain in which they are implemented. 


Every other register, or register field, of a system level implementation of the Generic Timer resets to a value that 
is architecturally UNKNOWN if it has a meaningful reset value. This applies on powerup of the power domain in 
which the register is implemented, and on a reset of the reset domain in which it is implemented. 


The system level components of the Generic Timer 


Each system level component has one or two register frames. The possible system level components are: 


The memory-mapped counter module, required 
This module controls the system counter. It has two frames: 
° A control frame, CNTControlBase. 
° A status frame, CNTReadBase. 


Memory-mapped counter module on page I1-5124 describes this component.


The memory-mapped timer control module, required 
The system level implementation of the Generic Timer can provide up to eight timers, and the 
memory-mapped timer control module identifies: 
° Which timers are implemented. 


° The features of each implemented timer. 


This module has a single frame, CNTCTLBase.
The CNTCTLBase frame on page I1-5129 describes this frame.


Memory-mapped timers, optional
An implemented memory-mapped timer:
• Must provide a privileged view of the timer, in the CNTBaseN frame.
• Optionally provides an unprivileged view of the timer in the CNTEL0BaseN frame.
N is the timer number, and the corresponding frame number, in the range 0-7.


The CNTBaseN and CNTEL0BaseN frames on page I1-5130 describes these frames.


I1.2 Memory-mapped counter module


The memory-mapped counter module provides top-level control of the system counter. The CNTControlBase frame 
holds the registers for the memory-mapped counter, and provides: 
• A RW control register CNTCR, that provides:
— An enable bit for the system counter.
— An enable bit for Halt-on-Debug. When this is enabled, if the debug halt signal into the system counter
is asserted, it halts the system counter. Otherwise, the system counter ignores the state of this halt
signal. For more information about Halt-on-Debug, contact ARM.
— A field that can be written to request a change to the update frequency of the system counter, with a
corresponding change to the increment made at each update. This mechanism means that, for example,
if the update frequency is halved, the increment at each update is doubled.
For more information see Control of counter operating frequency and increment on page I1-5125.
Writes to this register are rare. In a system that supports two Security states, this register is writable only by
Secure writes.
• A RO status register, CNTSR, that provides:
— A bit that indicates whether the system counter is halted because of an asserted Halt-on-Debug signal.
— A field that indicates the current update frequency of the system counter. This field can be polled to
determine when a requested change to the update frequency has been made.
• Two contiguous 32-bit RW registers that hold the current system counter value, CNTCV. If the system
supports 64-bit atomic accesses, these two registers must be accessible by such accesses.
The system counter must be disabled before writing to these registers, otherwise the effect of the write is
UNPREDICTABLE.
Writes to these registers are rare. In a system that supports two Security states, these registers are writable
only by Secure writes.
• A Frequency modes table of one or more 32-bit entries, where:
— The first entry in the table defines the base frequency of the system counter. This is the maximum
frequency at which the counter updates.
— Each subsequent entry in the table defines an alternative frequency of the system counter, that must be
an exact divisor of the base frequency.
A 32-bit zero entry immediately follows the last table entry.
This table can be RO or RW. For more information, see The Frequency modes table on page I1-5125.


In addition, the CNTReadBase frame includes a read-only copy of the system counter value, CNTCV, as two
contiguous 32-bit RO registers. If the system supports 64-bit atomic accesses, these two registers must be accessible
by such accesses.


Counter module control and status register summary on page I1-5126 describes CNTReadBase and
CNTControlBase memory maps, and the registers in each frame.


I1.2.1 Control of counter operating frequency and increment


The system counter has a fixed base frequency, and must maintain the required counter accuracy, meaning ARM 
recommends that it does not gain or lose more than ten seconds in a 24-hour period, see The system counter on 
page D6-1878. However, the counter can increment at a lower frequency than the base frequency, using a 
correspondingly larger increment. For example, it can increment by four at a quarter of the base frequency. Any 
lower-frequency operation, and any switching between operating frequencies, must not reduce the accuracy of the 
counter. 


Control of the system counter frequency and increment is provided only through the memory-mapped counter 
module. The following sections describe this control: 


° The Frequency modes table. 


° Changing the system counter and increment. 


The Frequency modes table 

The Frequency modes table starts at offset 0x20 in the CNTControlBase frame.

Table entries are 32 bits, and each entry specifies a system counter update frequency, in Hz.
The first entry in the table specifies the base frequency of the system counter. 


When the system timer is operating at a lower frequency than the base frequency, the increment applied at each 
counter update is given by: 


increment = (base_frequency) / (selected_frequency) 


A 32-bit word of zero value marks the end of the table. That is, the word of memory immediately after the last entry 
in the table must be zero. 


The only required entry in the table is the entry for the base frequency. 
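
The table rules above (base frequency first, exact divisors afterwards, a zero word as the end marker) can be checked mechanically. The following is a minimal sketch, assuming the table words have already been read from the CNTControlBase frame into a list; the function names and the sample frequencies are hypothetical.

```python
def parse_frequency_table(words):
    """Return the table entries, in Hz, that precede the zero-word end marker."""
    entries = []
    for word in words:
        if word == 0:
            break
        entries.append(word)
    else:
        raise ValueError("no zero-word end marker found")
    if not entries:
        raise ValueError("the base frequency entry is required")
    base = entries[0]
    for freq in entries[1:]:
        if base % freq:
            raise ValueError(f"{freq} Hz is not an exact divisor of {base} Hz")
    return entries


def increments(entries):
    """increment = base_frequency / selected_frequency, for every entry."""
    base = entries[0]
    return [base // freq for freq in entries]


# Hypothetical table: 24 MHz base, 6 MHz alternative, then the end marker.
table = parse_frequency_table([24_000_000, 6_000_000, 0, 0xDEAD])
print(table, increments(table))
```

Words after the end marker are ignored, which matches the rule that the first zero word terminates the table.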


Typically, the Frequency modes table is in RO memory. However, a system implementation might use RW memory 
for the table, and initialize the table entries as part of its startup sequence. Therefore, the CNTControlBase memory 
map shows the table region as RO or RW. 


ARM strongly recommends that the Frequency modes table is not updated once the system is running. 


The architecture can support up to 1004 entries in the Frequency modes table, including the zero-word end marker, 
and the maximum number of entries is IMPLEMENTATION DEFINED, up to this limit. 





Note 
• ARM considers it likely that implementations will require significantly fewer entries than the architectural 
  limit. 
• In the CNTControlBase frame, the offset range 0x0C0-0x0FC can be used for IMPLEMENTATION DEFINED 
  registers. If any registers are defined in this space then the Frequency modes table cannot extend beyond 
  offset 0x0B8, with a zero word at offset 0x0BC. This means the maximum number of entries in the table is 40, 
  including the zero-word end marker. 
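
Both limits follow from the frame layout: the table starts at offset 0x20, each entry is four bytes, and the table must stop before the next allocated region. The following Python sketch reproduces the arithmetic. The constant and function names are assumptions for this example.

```python
TABLE_START = 0x020      # first Frequency modes table entry
ENTRY_BYTES = 4          # each entry is a 32-bit word
ID_REGISTERS = 0xFD0     # CounterID<n> registers start here
IMPDEF_START = 0x0C0     # optional IMPLEMENTATION DEFINED register space


def max_table_words(first_offset_after_table: int) -> int:
    """Number of 32-bit words, including the zero-word end marker, that fit."""
    return (first_offset_after_table - TABLE_START) // ENTRY_BYTES


print(max_table_words(ID_REGISTERS))   # architectural limit
print(max_table_words(IMPDEF_START))   # limit when 0x0C0-0x0FC is in use
```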





Changing the system counter and increment 


The value of the CNTCR.FCREQ field specifies which entry in the Frequency modes table specifies the system 
counter update frequency. 


Changing the value of CNTCR.FCREQ requests a change to the system counter update frequency. To ensure the 
frequency change does not affect the overall accuracy of the counter, a change is made as follows: 





• When changing from a higher frequency to a lower frequency, the counter: 
  1. Continues running at the higher frequency until the count reaches an integer multiple of the required 
     lower frequency. 
  2. Switches to operating at the lower frequency. 

• When changing from a lower frequency to a higher frequency, the counter: 
  1. Waits until the end of the current lower-frequency cycle. 
  2. Makes the counter increment required for operation at that lower frequency. 
  3. Switches to operating at the higher frequency. 


When the frequency has changed, CNTSR is updated to indicate the new frequency. Therefore, a system component 
that is waiting for a frequency change can poll CNTSR to detect the change. 
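
The switching rules can be illustrated with a small behavioral model in Python. This is a sketch under stated assumptions, not a description of any implementation: time is measured in base-frequency ticks, a frequency is represented by its increment, and the condition for switching to a lower frequency is interpreted as the count being a whole multiple of the new, larger increment. The function name and the request format are hypothetical.

```python
def simulate_counter(requests, total_ticks):
    """Model the counter over total_ticks base-frequency ticks.

    requests maps a tick number to the increment requested at that tick,
    standing in for a write to CNTCR.FCREQ. Returns (tick, count) pairs,
    one for each counter update.
    """
    count = 0
    increment = 1            # start at the base frequency
    pending = None           # requested increment not yet applied
    next_update = increment
    updates = []
    for tick in range(1, total_ticks + 1):
        if tick in requests:
            pending = requests[tick]
        if tick != next_update:
            continue
        # End of the current cycle: apply the increment for the current mode.
        count += increment
        updates.append((tick, count))
        if pending is not None:
            if pending <= increment:
                # Lower to higher frequency: the low-frequency cycle has ended
                # and its increment has been applied, so switch now.
                increment, pending = pending, None
            elif count % pending == 0:
                # Higher to lower frequency: switch only once the count is
                # aligned to the larger increment (assumed interpretation).
                increment, pending = pending, None
        next_update = tick + increment
    return updates


# Request increment 4 at tick 3, then return to the base frequency at tick 20.
for tick, count in simulate_counter({3: 4, 20: 1}, 24):
    print(tick, count)
```

In this model the count always equals the number of elapsed base-frequency ticks at each update, which is the sense in which the switch does not reduce the accuracy of the counter.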


I1.2.2 Counter module control and status register summary 


The Counter module control and status registers are memory-mapped registers in the following register memory 
frames: 


• A control frame, with base address CNTControlBase. 
• A status frame, with base address CNTReadBase. 

Each of these register memory frames is at least 4KB in size, or is at least the size of the memory protection granule 
if this granule size is larger than 4KB. Similarly, each base address must be aligned to 4KB, or to the memory 
protection granule if that is larger than 4KB. 

Note 
• The memory protection granule is either 4KB, 16KB, or 64KB. 
• Each frame of a memory-mapped Generic Timer takes the name of its base address. 





In each register memory frame, the memory at offset 0xFD0-0xFFF is reserved for twelve 32-bit IMPLEMENTATION 
DEFINED ID registers, see the CounterID<n> register descriptions for more information. 


The counter is assumed to be little-endian. 


In an implementation that supports Secure and Non-secure memory spaces, CNTControlBase is implemented only 
in the Secure memory space. 


Table I1-1 shows the CNTControlBase control registers, in order of their offsets from the CNTControlBase base 
address, for an implementation that includes registers in the IMPLEMENTATION DEFINED register space 0x0C0-0x0FC, and 
also has fewer than 39 CNTFID<n> registers. The Frequency modes table on page I1-5125 describes how this 
memory map differs if more CNTFID<n> registers are implemented. 

Generic Timer memory-mapped register descriptions on page I3-5199 describes each of these registers. 


Table I1-1 CNTControlBase memory map 

Offset            Name          Type                    Description 
0x000             CNTCR         RW                      Counter Control Register. 
0x004             CNTSR         RO                      Counter Status Register. 
0x008             CNTCV[31:0]   RW                      Counter Count Value register. 
0x00C             CNTCV[63:32]  RW 
0x010-0x01C       -             RES0                    Reserved. 
0x020             CNTFID0       RO or RW                Frequency modes table, and end marker. 
0x020+4n          CNTFID<n>     RO or RW                For more information see The Frequency 
0x024+4n          -             RO or RW, RAZ           modes table on page I1-5125. 
(0x028+4n)-0x0BC  -             RO, RES0                Reserved. 
0x0C0-0x0FC       -             IMPLEMENTATION DEFINED  Reserved for IMPLEMENTATION DEFINED registers. 
0x100-0xFCC       -             RO, RES0                Reserved. 
0xFD0-0xFFC       CounterID<n>  RO                      Counter ID registers 0-11. 





Table I1-2 shows the CNTReadBase control registers, in order of their offsets from the CNTReadBase base address. 
Generic Timer memory-mapped register descriptions on page I3-5199 describes each of these registers. 

Table I1-2 CNTReadBase memory map 

Offset       Name          Type  Description 
0x000        CNTCV[31:0]   RO    Counter Count Value register 
0x004        CNTCV[63:32]  RO 
0x008-0xFCC  -             RES0  Reserved 
0xFD0-0xFFC  CounterID<n>  RO    Counter ID registers 0-11 
I1.3 Memory-mapped timer components 


This part of the ARM Generic Timer specification defines an optional memory-mapped timer component. This can 
be implemented as part of any programmable system component that does not incorporate a System register mapped 
ARM Generic Timer, to provide that system component with the timer functionality of an ARM Generic Timer. 


The memory map consists of up to 8 timer frames. Each timer frame: 
• Provides its own set of timers and associated interrupts. 
• Is in its own memory protection region, which: 
  - Has a system-defined size of 4KB, 16KB, or 64KB. 
  - Starts at an address that is aligned to 4KB. 

Note 
The start address 4KB alignment requirement applies regardless of the memory protection region size. 








The base address of a frame is CNTBaseN, where N numbers from 0 up to a maximum permitted value of 7. 

For each implemented CNTBaseN frame the system can optionally provide an unprivileged view of the frame, 
described as the EL0 view of the frame. The base address of this second view of the CNTBaseN frame is 
CNTEL0BaseN. 

Note 
In the naming of the registers associated with a CNTBaseN or CNTEL0BaseN frame, the value of N is represented 
as <n>, for example CNTACR<n>. 

If a CNTEL0BaseN frame is implemented: 
• All registers visible in CNTBaseN are visible, except for CNTVOFF and CNTEL0ACR. 
• The offsets of all visible registers are the same as their offsets in the CNTBaseN frame. 

In addition to the implemented CNTBaseN and CNTEL0BaseN frames, the system must provide a single control 
frame at base address CNTCTLBase. 

The memory protection region and alignment requirements for the CNTEL0BaseN and CNTCTLBase frames are 
the same as the requirements for the CNTBaseN frames. 

The system defines the position of each frame in the memory map. This means the value of each of the CNTBaseN, 
CNTEL0BaseN, and CNTCTLBase base addresses is IMPLEMENTATION DEFINED. 

The memory-mapped timers are assumed to be little-endian. 

The following sections describe the implementation of a memory-mapped view of the counter and timer: 
• The CNTCTLBase frame on page I1-5129. 
• The CNTBaseN and CNTEL0BaseN frames on page I1-5130. 
• Providing a complete set of counter and timer features on page I1-5132. 
I1.3.1 The CNTCTLBase frame 

The CNTCTLBase frame contains: 
• An identification register for the features of the memory-mapped counter and timer implementation. 
• Access controls for each CNTBaseN frame. 
• A virtual offset register for frames that implement a virtual timer. 

Table I1-3 shows the CNTCTLBase registers, in order of their offsets from the CNTCTLBase base address. 

Note 
The CNTFRQ and CNTVOFF registers are also implemented in a System register interface to the Generic Timer. 








Generic Timer memory-mapped register descriptions on page I3-5199 describes each of these registers. 

Table I1-3 CNTCTLBase memory map 

Offset       Register            Type  Securityᵃ      Description 
0x000        CNTFRQᵇ             RW    Secure         Counter Frequency register. 
0x004        CNTNSAR             RW    Secure         Counter Non-Secure Access register. 
0x008        CNTTIDR             RO    Both           Counter Timer ID register. 
0x00C-0x03F  -                   RES0  -              Reserved. 
0x040+4Nᶜ    CNTACR<n>           RW    Configurableᵈ  Counter Access Control register N. 
0x060-0x07F  -                   RES0  -              Reserved. 
0x080+8Nᶜ    CNTVOFF<n>[31:0]ᵇ   RWᵉ   Configurableᵈ  Virtual Offset register N. IMPLEMENTATION DEFINED 
0x084+8Nᶜ    CNTVOFF<n>[63:32]   RWᵉ                  whether this register is RW or RAZ. Optional in the 
                                                      CNTCTLBase memory map. 
0x0C0-0x0FC  -                   RES0  -              Reserved. 
0x100-0x7FC  -                   -     -              IMPLEMENTATION DEFINED. 
0x800-0xFBC  -                   RES0  -              Reserved. 
0xFC0-0xFCF  -                   -     -              IMPLEMENTATION DEFINED. 
0xFD0-0xFFC  CounterID<n>        RO    Both           Counter ID registers 0-11. 

a. Access security requirement in an implementation that supports two Security states. In an implementation that does not support multiple 
   Security states all registers are accessible as shown in the Type column. 

b. These registers are also defined in the System register interface to the Generic Timer, and therefore are also described in Generic Timer 
   registers on page D7-2255 and Generic Timer registers on page G6-4803. The bit assignments of the registers are identical in the System 
   register interface and in the memory-mapped system level interface. 

c. Implemented for each value of N from 0 to 7 for which a CNTBaseN frame is implemented. 

d. The CNTNSAR determines the Non-secure accessibility of the CNTACR<n>s and the CNTVOFF<n>s in the CNTCTLBase frame. For 
   more information, see the register descriptions. 

e. Address is reserved, RAZ/WI if register not implemented. 


All implementations of the Generic Timer include the virtual counter. Therefore, conceptually, all implementations 
include the CNTVOFF register that defines the virtual offset between the physical count and the virtual count. If a 
memory-mapped Generic Timer component does not distinguish between real time and virtual time then it can 
implement CNTVOFF as RAZ/WI. Otherwise CNTVOFF is a RW register, and ARM strongly recommends that 
the system only permits access to CNTVOFF from EL2 or higher. 
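
The per-frame register offsets in the CNTCTLBase frame follow directly from the 0x040+4N and 0x080+8N expressions in Table I1-3. The following Python sketch computes them. The function names are assumptions for this example, and the check on N reflects the limit of 8 timer frames.

```python
MAX_FRAMES = 8   # N ranges from 0 to 7


def _check_frame(n: int) -> None:
    if not 0 <= n < MAX_FRAMES:
        raise ValueError("frame number N must be in the range 0 to 7")


def cntacr_offset(n: int) -> int:
    """Offset of CNTACR<n> from CNTCTLBase."""
    _check_frame(n)
    return 0x040 + 4 * n


def cntvoff_offsets(n: int) -> tuple:
    """Offsets of CNTVOFF<n>[31:0] and CNTVOFF<n>[63:32] from CNTCTLBase."""
    _check_frame(n)
    low = 0x080 + 8 * n
    return low, low + 4


for n in range(MAX_FRAMES):
    low, high = cntvoff_offsets(n)
    print(n, hex(cntacr_offset(n)), hex(low), hex(high))
```

For N = 7 this gives 0x05C for CNTACR<7> and 0x0B8 and 0x0BC for the two halves of CNTVOFF<7>, so both register banks stay below the reserved regions that follow them in the table.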

I1.3.2 The CNTBaseN and CNTEL0BaseN frames 

Table I1-4 shows the CNTBaseN registers, in order of their offsets from the CNTBaseN base address. Whether a 
frame includes a virtual timer is IMPLEMENTATION DEFINED. If it does not then memory at offsets 0x030-0x03C is 
RAZ/WI. Except for CNTEL0ACR and the CounterID<n> registers, equivalent registers are also implemented in a 
System register interface to the timer component of a Generic Timer. 

Generic Timer memory-mapped register descriptions on page I3-5199 describes each of these registers. 

Table I1-4 CNTBaseN memory map 

Offset       Register            Type  Description 
0x000        CNTPCT[31:0]ᵃ       RO    Physical Count register. 
0x004        CNTPCT[63:32]ᵃ      RO 
0x008        CNTVCT[31:0]ᵃ       RO    Virtual Count register. 
0x00C        CNTVCT[63:32]ᵃ      RO 
0x010        CNTFRQᵃ             ROᶜ   Counter Frequency register. 
0x014        CNTEL0ACR           RWᵇ   Counter EL0 Access Control Register, optional in the CNTBaseN memory 
                                       map. 
0x018        CNTVOFF[31:0]ᵃ      ROᶜ   Virtual Offset register. If CNTVOFF<n> in the CNTCTLBase frame is a 
0x01C        CNTVOFF[63:32]ᵃ     ROᶜ   RW register, a read of this register returns the value of that register. 
                                       Otherwise is RAZ. 
0x020        CNTP_CVAL[31:0]ᵃ    RW    Physical Timer Compare Value register. 
0x024        CNTP_CVAL[63:32]ᵃ   RW 
0x028        CNTP_TVALᵃ          RW    Physical TimerValue register. 
0x02C        CNTP_CTLᵃ           RW    Physical Timer Control register. 
0x030        CNTV_CVAL[31:0]ᵃ    RWᵇ   Virtual Timer Compare Value register, optional in the CNTBaseN memory 
0x034        CNTV_CVAL[63:32]ᵃ   RWᵇ   map. 
0x038        CNTV_TVALᵃ          RWᵇ   Virtual TimerValue register, optional in the CNTBaseN memory map. 
0x03C        CNTV_CTLᵃ           RWᵇ   Virtual Timer Control register, optional in the CNTBaseN memory map. 
0x040-0xFCF  -                   RES0  Reserved. 
0xFD0-0xFFC  CounterID<n>        RO    Counter ID registers 0-11. 

a. These registers are also defined in the System register interface to the Generic Timer, and therefore are also described in Generic Timer 
   registers on page D7-2255 and Generic Timer registers on page G6-4803. The bit assignments of the registers are identical in the System 
   register interface and in the memory-mapped system level interface. 

b. Address is reserved, RAZ/WI if register not implemented. 

c. The CNTCTLBase frame includes a RW view of this register. 

CNTTIDR controls whether the CNTBaseN and CNTEL0BaseN frames are implemented. 
The CNTEL0BaseN frame 

For any value of N, the layout of the registers in the CNTEL0BaseN frame is identical to the CNTBaseN frame, 
except that, in the CNTEL0BaseN frame: 

• CNTVOFF is never visible, and the memory at 0x018-0x01C is RAZ/WI. 
• CNTEL0ACR is never visible, and the memory at 0x014 is RAZ/WI. 
• If implemented in the CNTBaseN frame, CNTEL0ACR controls whether CNTPCT, CNTVCT, CNTFRQ, 
  the EL1 Physical Timer, and the Virtual Timer registers are visible in the CNTEL0BaseN frame. 

  If CNTEL0ACR is not implemented then these registers are not visible in the CNTEL0BaseN frame, and 
  their addresses in that frame are RAZ/WI. 


If an implementation supports 64-bit atomic accesses, then CNTPCT, CNTVCT, CNTVOFF, CNTP_CVAL, and 
CNTV_CVAL must be accessible as atomic 64-bit values. 
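
Where only 32-bit accesses are available, software commonly reads a 64-bit count as two halves and retries if the upper half changed between reads. This technique is not described in this section, it is shown here only as a sketch of why an atomic 64-bit access is preferable. The Python below models it against a hypothetical stand-in for a CNTBaseN frame. The class, the callback signature, and the function name are assumptions for this example. The two offsets are those of CNTPCT in Table I1-4.

```python
CNTPCT_LO = 0x000   # CNTPCT[31:0]
CNTPCT_HI = 0x004   # CNTPCT[63:32]


def read_counter64(read32):
    """Read a 64-bit count using 32-bit reads, retrying if the high word changes."""
    while True:
        high = read32(CNTPCT_HI)
        low = read32(CNTPCT_LO)
        if read32(CNTPCT_HI) == high:
            # The high word was stable across the low-word read, so the
            # two halves belong to the same 64-bit value.
            return (high << 32) | low


class FakeFrame:
    """Hypothetical frame model: the count advances by one on every 32-bit read."""

    def __init__(self, start):
        self.count = start

    def read32(self, offset):
        word = (self.count >> 32) if offset == CNTPCT_HI else self.count
        self.count += 1
        return word & 0xFFFFFFFF


# Start just below a carry into the high word, the case a naive read gets wrong.
frame = FakeFrame(0x0000_0000_FFFF_FFFF)
print(hex(read_counter64(frame.read32)))
```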
I1.4 Providing a complete set of counter and timer features 


Using the general model for implementing a memory-mapped interface to the Generic Timer described in this 
section, the feature set of a System registers counter and timer, in an implementation that includes EL2 and EL3, 
can be implemented using the following set of timer frames: 


• A CNTCTLBase control frame. 
• The following CNTBaseN timer frames: 

  Frame 0   Accessible from Non-secure state, with second view and virtual capability. This provides the 
            Non-secure EL1&0 timers. 

  Frame 1   Accessible from Non-secure state, with no second view and no virtual capability. This provides 
            the Non-secure EL2 timers. 

  Frame 2   Accessible only from Secure state, with a second view but no virtual capability. This provides the 
            Secure PL1&0 timers, meaning: 
            • Compared to a PE where EL3 is using AArch32, it provides the only Secure state timer. 
            • Compared to a PE where EL3 is using AArch64, it provides the Secure EL1&0 timer. 

  Frame 3   Accessible only from Secure state, with no second view and no virtual capability. This provides the 
            Secure EL3 timers. 

            Note 
            This frame is not required for a memory-mapped timer that provides only the feature set of a PE 
            for which EL3 is using AArch32. 








In this implementation, the full set of implemented frames, and their configuration in the memory map, is as follows: 


CNTCTLBase 
          The control frame. This frame is located in both Secure and Non-secure physical memory, and: 
          • In the Secure EL1&0 translation regime, this frame is accessible only at EL1. 
          • In the Non-secure EL2 translation regime, this frame is accessible. 
          • In the Non-secure EL1&0 translation regime, this frame is not accessible. 

CNTBase0  The first view of the Non-secure EL1&0 timers. This frame is located only in Non-secure physical 
          memory, and: 
          • In the Secure EL1&0 translation regime, this frame is accessible only at EL1. 
          • In the Non-secure EL2 translation regime, this frame is accessible. 
          • In the Non-secure EL1&0 translation regime, this frame is accessible only at EL1. 

CNTEL0Base0 
          The second view of CNTBase0, meaning it is the EL0 view of the Non-secure EL1&0 timers. This 
          frame is located only in Non-secure physical memory, and: 
          • In the Secure EL1&0 translation regime, the architecture permits this frame to be accessible 
            at EL1, or at EL1 and EL0, but does not require either of these options. 
          • In the Non-secure EL2 translation regime, this frame is accessible. 
          • In the Non-secure EL1&0 translation regime, this frame is accessible at EL1 and EL0. 

CNTBase1  The first and only view of the Non-secure EL2 timers. This frame is located only in Non-secure 
          physical memory, and: 
          • When EL3 is using AArch64: 
            - In the Secure EL1&0 translation regime, this frame is accessible only at EL1. 
            - In the Secure EL3 translation regime, this frame is accessible. 
          • When EL3 is using AArch32, in the Secure PL1&0 translation regime, this frame is 
            accessible only at PL1 (EL3). 
          • In the Non-secure EL2 translation regime, this frame is accessible. 
          • In the Non-secure EL1&0 translation regime, this frame is not accessible. 
CNTBase2  The first view of the Secure EL1&0, or PL1&0, timers. 

          Note 
          In AArch64 state, these timers are always called the Secure EL1&0 timers. In AArch32 state they 
          are usually called the Secure PL1&0 timers because, in AArch32 Secure state, whether some of the 
          PE modes map to EL1 or to EL3 depends on whether EL3 is using AArch64 or is using AArch32, 
          see Security state, Exception levels, and AArch32 execution privilege on page G1-3792. 

          This frame is located only in Secure physical memory, and: 
          • When EL3 is using AArch64: 
            - In the Secure EL1&0 translation regime, this frame is accessible only at EL1. 
            - In the Secure EL3 translation regime, this frame is accessible. 
          • When EL3 is using AArch32, in the Secure PL1&0 translation regime, this frame is 
            accessible only at PL1 (EL3). 
          • Because the frame is in Secure memory, it is not accessible in any Non-secure translation 
            regime. 

CNTEL0Base2 
          The second view of CNTBase2, meaning it is the EL0 view of the Secure EL1&0, or PL1&0, timers. 

          Note 
          See the Note in the description of the CNTBase2 frame for more information about the naming of 
          these timers. 

          This frame is located only in Secure physical memory, and: 
          • When EL3 is using AArch64: 
            - In the Secure EL1&0 translation regime, this frame is accessible at EL1 and EL0. 
            - In the Secure EL3 translation regime, this frame is accessible. 
          • When EL3 is using AArch32, in the Secure PL1&0 translation regime, this frame is 
            accessible at PL1 (EL3) and EL0. 
          • Because the frame is in Secure memory, it is not accessible in any Non-secure translation 
            regime. 

CNTBase3  The first and only view of the EL3 timers. This frame is located only in Secure physical memory, 
          and: 
          • When EL3 is using AArch64: 
            - In the Secure EL1&0 translation regime, this frame is not accessible. 
            - In the Secure EL3 translation regime, this frame is accessible. 
          • When EL3 is using AArch32, this frame is not accessible. 
          • Because the frame is in Secure memory, it is not accessible in any Non-secure translation 
            regime. 

Note 
About the Virtual Memory System Architecture (VMSA) on page D4-1722 describes the VMSAv8-64 translation 
regimes, and About VMSAv8-32 on page G4-4022 describes the VMSAv8-32 translation regimes. 
I1.5 Gray-count scheme for timer distribution scheme 


The distribution of the Counter value using a Gray-code provides a relatively simple mechanism to avoid any danger 
of the count being sampled with an intermediate value even if the clocking is asynchronous. It has a further 
advantage that the distribution is relatively low power, since only one bit changes on the main distribution wires for 
each clock tick. 


A suitable Gray-coding scheme can be achieved with the following logic: 
    Gray[N]  = Count[N] 
    Gray[i]  = (XOR(Gray[N:i+1])) XOR Count[i]    for N-1 >= i >= 0 
    Count[i] = XOR(Gray[N:i])                     for N >= i >= 0 


This is for an N+1 bit counter, where Count is a conventional binary count value, and Gray is the corresponding 
Gray count value. 
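
The same conversions can be written compactly on whole integers. Because XOR(Gray[N:i+1]) equals Count[i+1], the encoding reduces to XORing the count with itself shifted right by one bit, and the decoding is a running XOR over the Gray bits from the top down. The Python sketch below uses function names chosen for this example and checks the property that makes the scheme useful, namely that consecutive counts differ in exactly one Gray bit.

```python
def binary_to_gray(count: int) -> int:
    """Gray[i] = Count[i+1] XOR Count[i], with Gray[N] = Count[N]."""
    return count ^ (count >> 1)


def gray_to_binary(gray: int) -> int:
    """Count[i] = XOR of Gray[N:i], computed as a running XOR of right shifts."""
    count = 0
    while gray:
        count ^= gray
        gray >>= 1
    return count


# Walk a 4-bit counter: every step changes exactly one bit of the Gray value.
for value in range(16):
    gray = binary_to_gray(value)
    print(f"{value:2d} {value:04b} {gray:04b}")
```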


Note 


This scheme has the advantage of being relatively simple to switch, in either direction, between operating with 
low-frequency and low-precision, and operating with high-frequency and high-precision. To achieve this, the ratio 
of the frequencies must be 2^n, where n is an integer. A switch-over can occur only on the 2^(n+1) boundary to avoid 
losing the Gray-coding property on a switch-over. 
Chapter I2 
Recommended External Interface to the Performance Monitors 

This chapter describes the recommended external interface to the Performance Monitors. It contains the following 
section: 

• About the external interface to the Performance Monitors registers on page I2-5136. 

Note 
Performance Monitors external register descriptions on page I3-5145 describes the external view of the 
Performance Monitors registers. 
I2.1 About the external interface to the Performance Monitors registers 

ARM recommends that: 

• An implementation provides the OPTIONAL External debug interface to the Performance Monitors registers. 

  Note 
  A debugger can use this interface to access counters in the Performance Monitors. 

• The implementation includes the OPTIONAL support for memory-mapped access to the External debug 
  interface. 

  Note 
  - Software running on any PE in a system can use this interface to access counters in the Performance 
    Monitors. 
  - Privileged software should use the MMU to control access to this interface. 

• The external debug interface is implemented as defined in Appendix K2 Recommended External Debug 
  Interface. 

External Performance Monitors registers summary on page I3-5143 gives the memory map of these registers. 

The following sections describe the memory-mapped views of the Performance Monitors registers: 
• Differences in the external views of the Performance Monitors registers. 
• Synchronization of changes to the memory-mapped views on page I2-5137. 
• Access permissions for external views of the Performance Monitors on page I2-5137. 

In this section, unless the context explicitly indicates otherwise, any reference to a memory-mapped view applies 
equally to a register view using: 
• An access through an external debug interface. 
• A memory-mapped access. 

I2.1.1 Differences in the external views of the Performance Monitors registers 


An external view of the Performance Monitors registers accesses the same registers as the System registers interface 
described in Performance Monitors Extension registers on page D5-1871, except that: 


1. The PMSELR is accessible only in the System registers interface. 

2. The following registers are accessible only in external views: 
   • PMCFGR 
   • PMDEVAFF0 
   • PMCIDR1 
   • PMCIDR2 
   • PMCIDR3 

   Performance Monitors external register descriptions on page I3-5145 describes these registers. 

3. The following controls do not affect the external views: 
   • PMSELR. 
   • PMUSERENR. 
   • HDCR.{TPM, TPMCR, HPMN}. 

   Instead, see the register descriptions in Chapter I3 External System Control Register Descriptions. 


I2.1.2 Synchronization of changes to the memory-mapped views 

If a Performance Monitor is visible in both a System register view and an external view, and is accessed simultaneously 
through these two mechanisms, the behavior must be as if the accesses occurred atomically in any order. For more 
information, see Synchronization of changes to the external debug registers on page H8-4964. 

I2.1.3 Access permissions for external views of the Performance Monitors 


For more information, see External debug interface register access permissions on page H8-4970. 


Table I2-1 on page I2-5138 shows the access permissions for the Performance Monitors registers in a v8 Debug 
implementation. This table uses the following terms: 

DLK       When the OS Double Lock is locked, DoubleLockStatus() == TRUE, accesses to some registers 
          produce an error. Applies to both interfaces. 

EPMAD     When AllowExternalPMUAccess() == FALSE, external debug access is disabled. See also Behavior 
          of a not permitted memory-mapped access on page H8-4969. 

Error     Indicates that the access gives an error response. 

Default   This shows the default access permissions, if none of the conditions in this list prevent access to the 
          register. 

Off       When EDPRSR.PU == 0, the Core power domain is completely off, or in a low-power state where 
          the Core power domain registers cannot be accessed. 

          Note 
          If debug power is off, then all external debug interface accesses return an error. 

OSLK      When the OS Lock is locked, OSLAR_EL1.OSLK == 1, accesses to some registers produce an 
          error. This column shows the effect of this control on accesses using the external debug interface. 

SLK       This indicates the modified default access permissions for OPTIONAL memory-mapped accesses to 
          the external debug interface if the OPTIONAL Software Lock is locked. See Register access 
          permissions for memory-mapped accesses on page H8-4968. 

          For all other accesses, this column is ignored. 

-         Indicates that the control has no effect on the behavior of the access: 
          • If no other control affects the behavior, the Default access behavior applies. 
          • However, another control might determine the behavior. 


Table I2-1 Access permissions for the Performance Monitors registers 

Offset       Register              Domain  Off    DLK    OSLK   EPMAD  Default  SLK 
0x000+8×n    PMEVCNTR<n>_EL0ᵃ      Core    Error  Error  Error  Error  RW       RO 
0x0F8        PMCCNTR_EL0[31:0]     Core    Error  Error  Error  Error  RW       RO 
0x0FC        PMCCNTR_EL0[63:32]    Core    Error  Error  Error  Error  RW       RO 
0x400+4×n    PMEVTYPER<n>_EL0ᵃ     Core    Error  Error  Error  Error  RW       RO 
0x47C        PMCCFILTR_EL0         Core    Error  Error  Error  Error  RW       RO 
0x600-0x6FC  -                     -       Access is IMPLEMENTATION DEFINED 
0xA00-0xBFC  -                     -       Access is IMPLEMENTATION DEFINED 
0xC00        PMCNTENSET_EL0        Core    Error  Error  Error  Error  RW       RO 
0xC20        PMCNTENCLR_EL0        Core    Error  Error  Error  Error  RW       RO 
0xC40        PMINTENSET_EL1        Core    Error  Error  Error  Error  RW       RO 
0xC60        PMINTENCLR_EL1        Core    Error  Error  Error  Error  RW       RO 
0xC80        PMOVSCLR_EL0          Core    Error  Error  Error  Error  RW       RO 
0xCA0        PMSWINC_EL0ᵇ          Core    Error  Error  Error  Error  WO       WI 
0xCC0        PMOVSSET_EL0          Core    Error  Error  Error  Error  RW       RO 
0xD80-0xDFC  -                     -       Access is IMPLEMENTATION DEFINED 
0xE00        PMCFGR                Core    Error  Error  Error  Error  RO       RO 
0xE04        PMCR_EL0              Core    Error  Error  Error  Error  RW       RO 
0xE20        PMCEID0               Core    Error  Error  Error  Error  RO       RO 
0xE24        PMCEID1               Core    Error  Error  Error  Error  RO       RO 
0xE80-0xEFC  Integration registers -       Access is IMPLEMENTATION DEFINED 
0xF00-0xFFC  Management registers and CoreSight compliance on page K2-5499 

a. Implemented counters only. n is the counter number. 
b. Only if the OPTIONAL PMSWINC_EL0 register is implemented in the external debug interface. 
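
The two indexed rows of the table can be turned into offset calculations. The Python sketch below does this for the event counter and event type registers. The function names are assumptions for this example, and the limit of 31 event counters is inferred from the layout, since slot 31 of each bank is occupied by PMCCNTR_EL0 (0x0F8) and PMCCFILTR_EL0 (0x47C) respectively.

```python
MAX_EVENT_COUNTERS = 31   # inferred: slot 31 holds the cycle counter registers


def _check_counter(n: int) -> None:
    if not 0 <= n < MAX_EVENT_COUNTERS:
        raise ValueError("event counter number n must be in the range 0 to 30")


def pmevcntr_offset(n: int) -> int:
    """Offset of PMEVCNTR<n>_EL0 in the external view: 0x000 + 8 x n."""
    _check_counter(n)
    return 0x000 + 8 * n


def pmevtyper_offset(n: int) -> int:
    """Offset of PMEVTYPER<n>_EL0 in the external view: 0x400 + 4 x n."""
    _check_counter(n)
    return 0x400 + 4 * n


print(hex(pmevcntr_offset(30)), hex(pmevtyper_offset(30)))
```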


I2.1.4 Power domains and Performance Monitors registers reset 


For ARMv8-A implementations, ARM recommends that Performance Monitors are implemented as part of the core 
power domain, not as part of a separate debug power domain. There is no interface to access the Performance 
Monitors registers when the core power domain is powered down. 


A Warm or Cold reset sets the Performance Monitors registers to their reset values. An External Debug reset does 
not change the values of the Performance Monitors registers. 


For more information about the reset scheme recommended for a v8 Debug implementation see Chapter H6 Debug 
Reset and Powerdown Support. 
Table I2-2 shows the Performance Monitors register resets for writable register fields. The column headings use the 
following terms: 

64   This is the architectural reset value when resetting into AArch64 state. 

32   This is the architectural reset value when resetting into AArch32 state. 

-    This indicates an IMPLEMENTATION DEFINED reset value on the specified reset. This might be 
     UNKNOWN. 

Note 
This table does not include: 
• Read-only identification registers and fields that have a fixed value. In this case, the reset value is that fixed 
  value. An example of this is PMCR_EL0.N. 
• Write-only registers and fields that only have an effect on writes. These do not have a reset value. An example 
  of this is PMSWINC_EL0. 
• IMPLEMENTATION DEFINED registers. In this case, the reset domains are IMPLEMENTATION DEFINED. The reset 
  values are IMPLEMENTATION DEFINED and might be UNKNOWN. 





Table I2-2 Performance Monitors System register resets 

Register           Domain  Field    64  32    Description 
PMCR_EL0           Warm    DP       -   0     Disable PMCCNTR_EL0 when prohibited 
                           X        -   0     Export enable 
                           D        -   0     Clock divider 
                           E        0   0     Performance Monitors enable 
PMCNTENSET_EL0     Warm    -        -   -     All fields in register 
PMCNTENCLR_EL0 
PMOVSSET_EL0       Warm    -        -   -     All fields in register 
PMOVSCLR_EL0 
PMSELR_EL0         Warm    SEL      -   -     Selected event counter 
PMCCNTR_EL0        Warm    -        -   -     All fields in register 
PMEVTYPER<n>_EL0   Warm    -        -   -     All fields in register 
PMCCFILTR_EL0      Warm    [31:26]  -   0x00  PMCCNTR_EL0 filtering controls 
PMEVCNTR<n>_EL0    Warm    -        -   -     All fields in register 
PMUSERENR_EL0      Warm    ER       -   0     Enable counter read access in EL0 
                           CR       -   0     Enable PMCCNTR_EL0 read access in EL0 
                           SW       -   0     Enable PMSWINC_EL0 write access in EL0 
                           EN       -   0     Enable Performance Monitors access in EL0 
PMINTENSET_EL1     Warm    -        -   -     All fields in register 
PMINTENCLR_EL1 
Chapter I3 
External System Control Register Descriptions 

This chapter describes the external system control registers, excluding the External debug registers that are described 
in Chapter H9 External Debug Register Descriptions. It contains the following sections: 

• About the external system control register descriptions on page I3-5142. 
• External Performance Monitors registers summary on page I3-5143. 
• Performance Monitors external register descriptions on page I3-5145. 
• Generic Timer memory-mapped registers overview on page I3-5198. 
• Generic Timer memory-mapped register descriptions on page I3-5199. 
I3.1 About the external system control register descriptions 

This chapter describes the external system control registers other than the external debug registers. That is, it 
describes: 

An external view of the Performance Monitors registers 
          ARM recommends that implementations provide access to the Performance Monitors registers 
          through the OPTIONAL External debug interface, and provide the OPTIONAL memory-mapped 
          interface to this interface: 
          • External Performance Monitors registers summary on page I3-5143 lists the registers that 
            are accessible in this view of the Performance Monitors, and describes their memory map. 
          • Performance Monitors external register descriptions on page I3-5145 describes each of the 
            memory-mapped registers. 
          • Chapter I2 Recommended External Interface to the Performance Monitors describes the 
            recommended interface to these registers. 

          Note 
          Chapter D5 The Performance Monitors Extension describes the Performance Monitors. The 
          following sections describe the System register interfaces to the Performance Monitors: 
          • Performance Monitors registers on page D7-2215, for accesses from an Exception level that 
            is using AArch64. 
          • Performance Monitors registers on page G6-4758, for accesses from an Exception level that 
            is using AArch32. 

The registers for the system level Generic Timer component 
          Any implementation that includes the Generic Timer must include the memory-mapped system 
          level component described in Chapter I1 System Level Implementation of the Generic Timer. In this 
          chapter: 
          • Generic Timer memory-mapped registers overview on page I3-5198 gives an overview of the 
            registers, referring to Chapter I1 for more information. 
          • Generic Timer memory-mapped register descriptions on page I3-5199 describes each of the 
            memory-mapped registers. 

          Note 
          Chapter D6 The Generic Timer in AArch64 state describes the Generic Timer component that is 
          accessible using the System registers. The following sections describe the System register interfaces 
          to that component: 
          • Generic Timer registers on page D7-2255, for accesses from an Exception level that is using 
            AArch64. 
          • Generic Timer registers on page G6-4803, for accesses from an Exception level that is using 
            AArch32. 

Note 
Chapter H9 External Debug Register Descriptions describes the external debug registers. 








I3.2 External Performance Monitors registers summary

When an implementation provides access to the Performance Monitors registers through the External debug
interface, that interface provides access to:
• Performance Monitors System registers.
• A read-only configuration register, PMCFGR.
• The OPTIONAL CoreSight registers for the Performance Monitors, if they are implemented.

The locations of the registers are defined as offsets from a system-defined base address. Performance Monitors
external register views defines this memory map.

I3.2.1 Performance Monitors external register views

Table I3-1 shows the external view of the Performance Monitors registers. All other entries are reserved.

Note
• Counters that are reserved because HDCR.HPMN has been changed from its reset value remain visible in
  any external view.
• The registers that relate to an implemented event counter, PMNx, are PMEVCNTR<n> and
  PMEVTYPER<n>.
• The mapping of the Performance Monitors Event Counter Registers, at offsets 0x000-0x0F4, has changed
  compared to the mappings of the equivalent registers in ARMv7.

Each entry in the Name column links to the register description in Performance Monitors external register
descriptions on page I3-5145, and:
• If the System register? column of the table shows that the register is a System register, the memory-mapped
  interface provides a view of the System register described in:
  - Performance Monitors registers on page D7-2215, for the AArch64 System register.
  - Performance Monitors registers on page G6-4758, for the AArch32 System register.
• Otherwise, the register is accessible only using the external interface.
Table I3-1 Performance Monitors external register views

Name                 Type  Description                                           System register?  Offset
PMEVCNTR<n>_EL0      RW    Performance Monitors Event Counter Register           Yes               0x000+8n
PMCCNTR_EL0[31:0]    RW    Performance Monitors Cycle Counter Register (a)       Yes               0x0F8
PMCCNTR_EL0[63:32]                                                                                 0x0FC
PMEVTYPER<n>_EL0     RW    Performance Monitors Event Type and Filter Register   Yes               0x400+4n
PMCCFILTR_EL0        RW    Performance Monitors Cycle Counter Filter Register    Yes               0x47C
-                    -     IMPLEMENTATION DEFINED                                -                 0x600-0x6FC
-                    -     IMPLEMENTATION DEFINED                                -                 0xA00-0xBFC
PMCNTENSET_EL0       RW    Performance Monitors Count Enable Set register        Yes               0xC00
PMCNTENCLR_EL0       RW    Performance Monitors Count Enable Clear register      Yes               0xC20
PMINTENSET_EL1       RW    Performance Monitors Interrupt Enable Set register    Yes               0xC40
PMINTENCLR_EL1       RW    Performance Monitors Interrupt Enable Clear register  Yes               0xC60
Table I3-1 Performance Monitors external register views (continued)

Name                 Type  Description                                                    System register?  Offset
PMOVSCLR_EL0         RW    Performance Monitors Overflow Flag Status Clear register       Yes               0xC80
PMSWINC_EL0          WO    Performance Monitors Software Increment register               Yes               0xCA0
PMOVSSET_EL0         RW    Performance Monitors Overflow Flag Status Set register         Yes               0xCC0
-                    -     IMPLEMENTATION DEFINED                                         -                 0xD80-0xDFC
PMCFGR               RO    Performance Monitors Configuration Register                    No                0xE00
PMCR_EL0             RW    Performance Monitors Control Register                          Yes               0xE04
PMCEID0              RO    Performance Monitors Common Event Identification register 0    Yes               0xE20
PMCEID1              RO    Performance Monitors Common Event Identification register 1    Yes               0xE24
-                    -     IMPLEMENTATION DEFINED                                         -                 0xE80-0xEFC
PMITCTRL (b)         RW    Integration Model Control registers                            No                0xF00
PMDEVAFF0 (b)        RO    Device Affinity registers                                      No                0xFA8
PMDEVAFF1 (b)        RO                                                                                     0xFAC
PMLAR (b, c)         WO    Lock Access register                                           No                0xFB0
PMLSR (b, c)         RO    Lock Status register                                           No                0xFB4
PMAUTHSTATUS (b)     RO    Authentication Status register                                 No                0xFB8
PMDEVARCH (b)        RO    Device Architecture register                                   No                0xFBC
PMDEVTYPE (b)        RO    Device Type register                                           No                0xFCC
PMPIDR4 (b)          RO    Peripheral ID registers                                        No                0xFD0
PMPIDR0 (b)          RO                                                                                     0xFE0
PMPIDR1 (b)          RO                                                                                     0xFE4
PMPIDR2 (b)          RO                                                                                     0xFE8
PMPIDR3 (b)          RO                                                                                     0xFEC
PMCIDR0 (b)          RO    Component ID registers                                         No                0xFF0
PMCIDR1 (b)          RO                                                                                     0xFF4
PMCIDR2 (b)          RO                                                                                     0xFF8
PMCIDR3 (b)          RO                                                                                     0xFFC








a. The interface must support at least single-copy atomic 32-bit accesses. If single-copy atomic 64-bit access to the registers is not possible, 
software must use a high-low-high read access to read the counter value if the counter is enabled. 


b. CoreSight interface registers, see Management registers and CoreSight compliance on page K2-5499. 


c. The Software lock registers are defined as part of CoreSight compliance, but their contents depend on the type of access that is made and 
whether the OPTIONAL Software lock is implemented. See the register description for details. 
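
As an illustration of the offset formulas in Table I3-1 and of the high-low-high read that footnote a requires when only 32-bit accesses are single-copy atomic, the following is a minimal sketch. The `read32` callable is a hypothetical accessor that returns the 32-bit value at a given offset from the system-defined base address; it is an assumption of this example and is not defined by the architecture.

```python
from typing import Callable

PMCCNTR_LO = 0x0F8  # PMCCNTR_EL0[31:0]
PMCCNTR_HI = 0x0FC  # PMCCNTR_EL0[63:32]


def pmevcntr_offset(n: int) -> int:
    """Offset of PMEVCNTR<n>_EL0 in the external view (0x000 + 8n)."""
    if not 0 <= n <= 30:
        raise ValueError("event counter index must be in the range 0..30")
    return 0x000 + 8 * n


def pmevtyper_offset(n: int) -> int:
    """Offset of PMEVTYPER<n>_EL0 in the external view (0x400 + 4n)."""
    if not 0 <= n <= 30:
        raise ValueError("event counter index must be in the range 0..30")
    return 0x400 + 4 * n


def read_cycle_counter(read32: Callable[[int], int]) -> int:
    """High-low-high read of the 64-bit cycle counter using 32-bit accesses.

    read32 is a hypothetical accessor supplied by the caller.
    """
    while True:
        hi1 = read32(PMCCNTR_HI)
        lo = read32(PMCCNTR_LO)
        hi2 = read32(PMCCNTR_HI)
        if hi1 == hi2:
            # The high word did not change, so lo belongs to the same value.
            return (hi2 << 32) | lo
```

If the two reads of the high word differ, the low word might have wrapped between the accesses, so the sketch retries the whole sequence.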
I3.3 Performance Monitors external register descriptions


This section describes the external view of the Performance Monitors registers. External Performance Monitors 
registers summary on page I3-5143 lists these registers in offset order. 
I3.3.1 PMAUTHSTATUS, Performance Monitors Authentication Status register

The PMAUTHSTATUS characteristics are:

Purpose
Provides information about the state of the IMPLEMENTATION DEFINED authentication interface for
Performance Monitors.

Usage constraints
This register is accessible as follows:

SLK  Default
RO   RO

Configurations
PMAUTHSTATUS is in the Debug power domain.
This register is OPTIONAL, and is required for CoreSight compliance. ARM recommends that this
register is implemented.

Attributes
PMAUTHSTATUS is a 32-bit register.

Field descriptions
The PMAUTHSTATUS bit assignments are:

Bits   [31:8]  [7:6]  [5:4]  [3:2]  [1:0]
Field  RES0    SNID   SID    NSNID  NSID

Bits [31:8]
Reserved, RES0.
SNID, bits [7:6]
Holds the same value as DBGAUTHSTATUS_EL1.SNID.
SID, bits [5:4]
Holds the same value as DBGAUTHSTATUS_EL1.SID.
NSNID, bits [3:2]
Holds the same value as DBGAUTHSTATUS_EL1.NSNID.
NSID, bits [1:0]
Holds the same value as DBGAUTHSTATUS_EL1.NSID.

Accessing the PMAUTHSTATUS:
PMAUTHSTATUS can be accessed through the external debug interface:

Component  Offset
PMU        0xFB8
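
The field layout above can be unpacked with simple shifts and masks. The following is a minimal sketch; the function name and the dictionary result are choices made for this example only, and the meaning of each 2-bit value is given by the DBGAUTHSTATUS_EL1 description, not by this code.

```python
def decode_pmauthstatus(value: int) -> dict:
    """Split a 32-bit PMAUTHSTATUS value into its four 2-bit fields."""
    return {
        "SNID": (value >> 6) & 0x3,   # bits [7:6]
        "SID": (value >> 4) & 0x3,    # bits [5:4]
        "NSNID": (value >> 2) & 0x3,  # bits [3:2]
        "NSID": value & 0x3,          # bits [1:0]
    }
```

Bits [31:8] are RES0, so the sketch ignores them instead of checking that they read as zero.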
I3.3.2 PMCCFILTR_EL0, Performance Monitors Cycle Counter Filter Register

The PMCCFILTR_EL0 characteristics are:

Purpose
Determines the modes in which the Cycle Counter, PMCCNTR_EL0, increments.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RW

Configurations
External register PMCCFILTR_EL0 is architecturally mapped to AArch64 System register
PMCCFILTR_EL0.
External register PMCCFILTR_EL0 is architecturally mapped to AArch32 System register
PMCCFILTR.
PMCCFILTR_EL0 is in the Core power domain.
On a Warm or Cold reset RW fields in this register reset:
• To architecturally UNKNOWN values if the reset is to an Exception level that is using
  AArch64.
• To 0 if the reset is to an Exception level that is using AArch32.
The register is not affected by an External debug reset.

Attributes
PMCCFILTR_EL0 is a 32-bit register.

Field descriptions
The PMCCFILTR_EL0 bit assignments are:

Bits   [31]  [30]  [29]  [28]  [27]  [26]  [25:0]
Field  P     U     NSK   NSU   NSH   M     RES0

P, bit [31]
Privileged filtering bit. Controls counting in EL1. If EL3 is implemented, then counting in
Non-secure EL1 is further controlled by the NSK bit. The possible values of this bit are:
0 Count cycles in EL1.
1 Do not count cycles in EL1.

U, bit [30]
User filtering bit. Controls counting in EL0. If EL3 is implemented, then counting in Non-secure
EL0 is further controlled by the NSU bit. The possible values of this bit are:
0 Count cycles in EL0.
1 Do not count cycles in EL0.

NSK, bit [29]
Non-secure EL1 (kernel) modes filtering bit. Controls counting in Non-secure EL1. If EL3 is not
implemented, this bit is RES0.
If the value of this bit is equal to the value of P, cycles in Non-secure EL1 are counted.
Otherwise, cycles in Non-secure EL1 are not counted.

NSU, bit [28]
Non-secure EL0 (Unprivileged) filtering bit. Controls counting in Non-secure EL0. If EL3 is not
implemented, this bit is RES0.
If the value of this bit is equal to the value of U, cycles in Non-secure EL0 are counted.
Otherwise, cycles in Non-secure EL0 are not counted.

NSH, bit [27]
Non-secure EL2 (Hypervisor) filtering bit. Controls counting in Non-secure EL2. If EL2 is not
implemented, this bit is RES0.
0 Do not count cycles in EL2.
1 Count cycles in EL2.

M, bit [26]
Secure EL3 filtering bit. If EL3 is not implemented, this bit is RES0.
If the value of this bit is equal to the value of P, cycles in Secure EL3 are counted.
Otherwise, cycles in Secure EL3 are not counted.
Most applications can ignore this field and set its value to 0.

Note
This field is not visible in the AArch32 PMCCFILTR System register.

Bits [25:0]
Reserved, RES0.

Accessing the PMCCFILTR_EL0:
PMCCFILTR_EL0 can be accessed through the external debug interface:

Component  Offset
PMU        0x47C
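
The filter rules described for the P, U, NSK, NSU, NSH, and M bits can be summarized as a small predicate. The following is a minimal sketch that assumes both EL2 and EL3 are implemented; it models only the filter bits and does not model whether the counter is enabled or whether counting is prohibited. The function name and parameters are chosen for this example.

```python
def counts_cycles(pmccfiltr: int, el: int, secure: bool) -> bool:
    """Return True if the filter bits permit cycle counting at the given
    Exception level and Security state (assumes EL2 and EL3 are implemented)."""
    p = (pmccfiltr >> 31) & 1
    u = (pmccfiltr >> 30) & 1
    nsk = (pmccfiltr >> 29) & 1
    nsu = (pmccfiltr >> 28) & 1
    nsh = (pmccfiltr >> 27) & 1
    m = (pmccfiltr >> 26) & 1
    if el == 0:
        # Secure EL0 follows U; Non-secure EL0 counts when NSU equals U.
        return (u == 0) if secure else (nsu == u)
    if el == 1:
        # Secure EL1 follows P; Non-secure EL1 counts when NSK equals P.
        return (p == 0) if secure else (nsk == p)
    if el == 2:
        return nsh == 1
    if el == 3:
        return m == p
    raise ValueError("el must be in the range 0..3")
```

For example, setting P and NSK together counts cycles in Non-secure EL1 but not in Secure EL1, which is the usual way to restrict counting to one Security state.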
I3.3.3 PMCCNTR_EL0, Performance Monitors Cycle Counter

The PMCCNTR_EL0 characteristics are:

Purpose
Holds the value of the processor Cycle Counter, CCNT, that counts processor clock cycles. See Time
as measured by the Performance Monitors cycle counter on page D5-1835 for more information.
PMCCFILTR_EL0 determines the modes and states in which the PMCCNTR_EL0 can increment.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RW

Configurations
External register PMCCNTR_EL0 is architecturally mapped to AArch64 System register
PMCCNTR_EL0.
External register PMCCNTR_EL0 is architecturally mapped to AArch32 System register
PMCCNTR.
PMCCNTR_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMCCNTR_EL0 is a 64-bit register.

Field descriptions
The PMCCNTR_EL0 bit assignments are:

Bits   [63:0]
Field  CCNT

CCNT, bits [63:0]
Cycle count. Depending on the values of PMCR_EL0.{LC,D}, the cycle count increments in one
of the following ways:
• Every processor clock cycle.
• Every 64th processor clock cycle.
Writing 1 to PMCR_EL0.C sets this field to 0.
This field resets to a value that is architecturally UNKNOWN.

Accessing the PMCCNTR_EL0:
PMCCNTR_EL0[31:0] can be accessed through the external debug interface:

Component  Offset
PMU        0x0F8

PMCCNTR_EL0[63:32] can be accessed through the external debug interface:

Component  Offset
PMU        0x0FC
I3.3.4 PMCEID0, Performance Monitors Common Event Identification register 0

The PMCEID0 characteristics are:

Purpose
Defines which common architectural and common microarchitectural feature events in the range
0x00 to 0x01F are implemented. If a particular bit is set to 1, then the event for that bit is
implemented.

Note
This view of the register has previously been called PMCEID0_EL0.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RO

Configurations
External register PMCEID0 is architecturally mapped to AArch64 System register
PMCEID0_EL0[31:0].
External register PMCEID0[31:0] is architecturally mapped to AArch32 System register
PMCEID0.
PMCEID0 is in the Core power domain.

Attributes
PMCEID0 is a 32-bit register.

Field descriptions
The PMCEID0 bit assignments are:

Bits   [31:0]
Field  ID[31:0]

ID[31:0], bits [31:0]
PMCEID0[n] maps to event n. For a list of event numbers and descriptions, see Events, event
numbers, and mnemonics on page D5-1848.
For each bit:
0 The common event is not implemented.
1 The common event is implemented.
Bits that map to reserved event numbers are reserved to identify events that might be defined in
future revisions to the architecture.
Events that do not require additional features in the PMU can be defined retrospectively, meaning
that they can be implemented as part of a PMUv3 implementation.

Accessing the PMCEID0:
PMCEID0 can be accessed through the external debug interface:

Component  Offset
PMU        0xE20
I3.3.5 PMCEID1, Performance Monitors Common Event Identification register 1

The PMCEID1 characteristics are:

Purpose
Defines which common architectural and common microarchitectural feature events in the range
0x20 to 0x03F are implemented. If a particular bit is set to 1, then the event for that bit is
implemented.

Note
This view of the register has previously been called PMCEID1_EL0.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RO

Configurations
External register PMCEID1 is architecturally mapped to AArch64 System register
PMCEID1_EL0[31:0].
External register PMCEID1[31:0] is architecturally mapped to AArch32 System register
PMCEID1.
PMCEID1 is in the Core power domain.

Attributes
PMCEID1 is a 32-bit register.

Field descriptions
The PMCEID1 bit assignments are:

Bits   [31:0]
Field  ID[63:32]

ID[63:32], bits [31:0]
PMCEID1[n] maps to event (n + 32). For a list of event numbers and descriptions, see Events, event
numbers, and mnemonics on page D5-1848.
For each bit:
0 The common event is not implemented.
1 The common event is implemented.
Bits that map to reserved event numbers are reserved to identify events that might be defined in
future revisions to the architecture.
Events that do not require additional features in the PMU can be defined retrospectively, meaning
that they can be implemented as part of a PMUv3 implementation.

Accessing the PMCEID1:
PMCEID1 can be accessed through the external debug interface:

Component  Offset
PMU        0xE24
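
Because bit n of the first identification register maps to event n and bit n of the second maps to event (n + 32), a debugger can test whether a common event is implemented with one lookup. The following is a minimal sketch; the function name is chosen for this example, and the two register values are assumed to have been read already from offsets 0xE20 and 0xE24.

```python
def common_event_implemented(event: int, pmceid0: int, pmceid1: int) -> bool:
    """Return True if the common event number is reported as implemented."""
    if not 0 <= event <= 0x3F:
        raise ValueError("only events 0x00..0x3F are covered by these registers")
    if event < 32:
        return bool((pmceid0 >> event) & 1)
    # Events 32..63 are reported in the second register, bit (event - 32).
    return bool((pmceid1 >> (event - 32)) & 1)
```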
I3.3.6 PMCFGR, Performance Monitors Configuration Register

The PMCFGR characteristics are:

Purpose
Contains PMU-specific configuration data.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RO

Configurations
PMCFGR is in the Core power domain.

Attributes
PMCFGR is a 32-bit register.

Field descriptions
The PMCFGR bit assignments are:

Bits   [31:28]  [27:20]  [19]  [18]  [17]  [16]  [15]  [14]  [13:8]  [7:0]
Field  NCG      RES0     UEN   WT    NA    EX    CCD   CC    SIZE    N

NCG, bits [31:28]
This feature is not supported, so this field is RAZ.

Bits [27:20]
Reserved, RES0.

UEN, bit [19]
User-mode Enable Register supported. PMUSERENR_EL0 is not visible in the external debug
interface, so this bit is RAZ.

WT, bit [18]
This feature is not supported, so this bit is RAZ.

NA, bit [17]
This feature is not supported, so this bit is RAZ.

EX, bit [16]
Export supported. Value is IMPLEMENTATION DEFINED.
0 PMCR_EL0.X is RES0.
1 PMCR_EL0.X is read/write.

CCD, bit [15]
Cycle counter has prescale. This is RES1 if AArch32 is supported at any EL, and RAZ otherwise.
0 PMCR_EL0.D is RES0.
1 PMCR_EL0.D is read/write.

CC, bit [14]
Dedicated cycle counter (counter 31) supported. This bit is RAO.

SIZE, bits [13:8]
Size of counters. This field determines the spacing of counters in the memory-map.
In ARMv8 the counters are at doubleword-aligned addresses, and the largest counter is 64-bits, so
this field is 0b111111.

N, bits [7:0]
Number of counters implemented in addition to the cycle counter, PMCCNTR_EL0. The maximum
number of event counters is 31.
0b00000000 Only PMCCNTR_EL0 implemented.
0b00000001 PMCCNTR_EL0 plus one event counter implemented.
and so on up to 0b00011111, which indicates PMCCNTR_EL0 and 31 event counters implemented.

Accessing the PMCFGR:
PMCFGR can be accessed through the external debug interface:

Component  Offset
PMU        0xE00
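
An external agent typically reads this register once to discover how many event counters exist and which optional PMCR_EL0 controls are writable. The following is a minimal sketch of that decode; the function name and the returned keys are choices made for this example, and only the fields that are not fixed as RAZ are extracted.

```python
def decode_pmcfgr(value: int) -> dict:
    """Extract the informative fields of a 32-bit PMCFGR value."""
    return {
        "EX": bool((value >> 16) & 1),   # PMCR_EL0.X is read/write when set
        "CCD": bool((value >> 15) & 1),  # PMCR_EL0.D is read/write when set
        "CC": bool((value >> 14) & 1),   # dedicated cycle counter supported
        "SIZE": (value >> 8) & 0x3F,     # counter size field, bits [13:8]
        "N": value & 0xFF,               # number of event counters, bits [7:0]
    }
```

The N field is also the value that bounds the RAZ/WI bits in the count enable registers, so it is usually cached by the caller.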
I3.3.7 PMCIDR0, Performance Monitors Component Identification Register 0

The PMCIDR0 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.
For more information see About the Component Identification scheme on page K2-5507.

Usage constraints
This register is accessible as follows:

SLK  Default
RO   RO

Configurations
PMCIDR0 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMCIDR0 is a 32-bit register.

Field descriptions
The PMCIDR0 bit assignments are:

Bits   [31:8]  [7:0]
Field  RES0    PRMBL_0

Bits [31:8]
Reserved, RES0.

PRMBL_0, bits [7:0]
Preamble. Must read as 0x0D.

Accessing the PMCIDR0:
PMCIDR0 can be accessed through the external debug interface:

Component  Offset
PMU        0xFF0
I3.3.8 PMCIDR1, Performance Monitors Component Identification Register 1

The PMCIDR1 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.
For more information see About the Component Identification scheme on page K2-5507.

Usage constraints
This register is accessible as follows:

SLK  Default
RO   RO

Configurations
PMCIDR1 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMCIDR1 is a 32-bit register.

Field descriptions
The PMCIDR1 bit assignments are:

Bits   [31:8]  [7:4]  [3:0]
Field  RES0    CLASS  PRMBL_1

Bits [31:8]
Reserved, RES0.

CLASS, bits [7:4]
Component class. Reads as 0x9, debug component.

PRMBL_1, bits [3:0]
Preamble. RAZ.

Accessing the PMCIDR1:
PMCIDR1 can be accessed through the external debug interface:

Component  Offset
PMU        0xFF4
I3.3.9 PMCIDR2, Performance Monitors Component Identification Register 2

The PMCIDR2 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.
For more information see About the Component Identification scheme on page K2-5507.

Usage constraints
This register is accessible as follows:

SLK  Default
RO   RO

Configurations
PMCIDR2 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMCIDR2 is a 32-bit register.

Field descriptions
The PMCIDR2 bit assignments are:

Bits   [31:8]  [7:0]
Field  RES0    PRMBL_2

Bits [31:8]
Reserved, RES0.

PRMBL_2, bits [7:0]
Preamble. Must read as 0x05.

Accessing the PMCIDR2:
PMCIDR2 can be accessed through the external debug interface:

Component  Offset
PMU        0xFF8
I3.3.10 PMCIDR3, Performance Monitors Component Identification Register 3

The PMCIDR3 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.
For more information see About the Component Identification scheme on page K2-5507.

Usage constraints
This register is accessible as follows:

SLK  Default
RO   RO

Configurations
PMCIDR3 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMCIDR3 is a 32-bit register.

Field descriptions
The PMCIDR3 bit assignments are:

Bits   [31:8]  [7:0]
Field  RES0    PRMBL_3

Bits [31:8]
Reserved, RES0.

PRMBL_3, bits [7:0]
Preamble. Must read as 0xB1.

Accessing the PMCIDR3:
PMCIDR3 can be accessed through the external debug interface:

Component  Offset
PMU        0xFFC
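
Taken together, the four Component Identification registers give a fixed signature that a debugger can check before it touches the rest of the register map. The following is a minimal sketch of that check; the function name is chosen for this example, and the four values are assumed to have been read from offsets 0xFF0 to 0xFFC.

```python
def is_debug_component(pmcidr0: int, pmcidr1: int, pmcidr2: int, pmcidr3: int) -> bool:
    """Check the preamble bytes and the component class of PMCIDR0-PMCIDR3."""
    # Bits [31:8] of each register are RES0, so only the low byte is compared.
    return (
        (pmcidr0 & 0xFF) == 0x0D          # PRMBL_0
        and (pmcidr1 & 0xFF) == 0x90      # CLASS = 0x9, PRMBL_1 = 0x0
        and (pmcidr2 & 0xFF) == 0x05      # PRMBL_2
        and (pmcidr3 & 0xFF) == 0xB1      # PRMBL_3
    )
```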
I3.3.11 PMCNTENCLR_EL0, Performance Monitors Count Enable Clear register

The PMCNTENCLR_EL0 characteristics are:

Purpose
Disables the Cycle Count Register, PMCCNTR_EL0, and any implemented event counters
PMEVCNTR<n>. Reading this register shows which counters are enabled.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RW

Configurations
External register PMCNTENCLR_EL0 is architecturally mapped to AArch64 System register
PMCNTENCLR_EL0.
External register PMCNTENCLR_EL0 is architecturally mapped to AArch32 System register
PMCNTENCLR.
PMCNTENCLR_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMCNTENCLR_EL0 is a 32-bit register.

Field descriptions
The PMCNTENCLR_EL0 bit assignments are:

Bits   [31]  [30:0]
Field  C     P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 disable bit. Disables the cycle counter register. Possible values are:
0 When read, means the cycle counter is disabled. When written, has no effect.
1 When read, means the cycle counter is enabled. When written, disables the cycle
  counter.
This field resets to a value that is architecturally UNKNOWN.

P<n>, bit [n], for n = 0 to 30
Event counter disable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. N is the value in PMCFGR.N.
Possible values of each bit are:
0 When read, means that PMEVCNTR<n>_EL0 is disabled. When written, has no effect.
1 When read, means that PMEVCNTR<n>_EL0 is enabled. When written, disables
  PMEVCNTR<n>_EL0.
This field resets to a value that is architecturally UNKNOWN.

Accessing the PMCNTENCLR_EL0:
PMCNTENCLR_EL0 can be accessed through the external debug interface:

Component  Offset
PMU        0xC20
I3.3.12 PMCNTENSET_EL0, Performance Monitors Count Enable Set register

The PMCNTENSET_EL0 characteristics are:

Purpose
Enables the Cycle Count Register, PMCCNTR_EL0, and any implemented event counters
PMEVCNTR<n>. Reading this register shows which counters are enabled.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RW

Configurations
External register PMCNTENSET_EL0 is architecturally mapped to AArch64 System register
PMCNTENSET_EL0.
External register PMCNTENSET_EL0 is architecturally mapped to AArch32 System register
PMCNTENSET.
PMCNTENSET_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMCNTENSET_EL0 is a 32-bit register.

Field descriptions
The PMCNTENSET_EL0 bit assignments are:

Bits   [31]  [30:0]
Field  C     P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 enable bit. Enables the cycle counter register. Possible values are:
0 When read, means the cycle counter is disabled. When written, has no effect.
1 When read, means the cycle counter is enabled. When written, enables the cycle
  counter.
This field resets to a value that is architecturally UNKNOWN.

P<n>, bit [n], for n = 0 to 30
Event counter enable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. N is the value in PMCFGR.N.
Possible values of each bit are:
0 When read, means that PMEVCNTR<n>_EL0 is disabled. When written, has no effect.
1 When read, means that PMEVCNTR<n>_EL0 event counter is enabled. When written,
  enables PMEVCNTR<n>_EL0.
This field resets to a value that is architecturally UNKNOWN.

Accessing the PMCNTENSET_EL0:
PMCNTENSET_EL0 can be accessed through the external debug interface:

Component  Offset
PMU        0xC00
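
The Set and Clear registers are two views of the same enable state: writing 1 to a bit in the Set view enables a counter, writing 1 to the same bit in the Clear view disables it, writing 0 has no effect, and either view reads back the current enables. The following is a minimal software model of that behavior, written only to illustrate the semantics. The class name is hypothetical, and the model starts with all counters disabled as a simplifying assumption, whereas the architectural reset value is UNKNOWN.

```python
class CountEnableModel:
    """Software model of the shared PMCNTENSET_EL0 / PMCNTENCLR_EL0 state."""

    def __init__(self, n_event_counters: int) -> None:
        if not 0 <= n_event_counters <= 31:
            raise ValueError("PMCFGR.N must be in the range 0..31")
        # Bit 31 is C; bits [N-1:0] are P<n>; bits [30:N] are RAZ/WI.
        self.writable = (1 << 31) | ((1 << n_event_counters) - 1)
        self.enabled = 0  # assumption: start disabled (reset value is UNKNOWN)

    def write_set(self, value: int) -> None:
        """Write to the Set view: 1 enables, 0 has no effect."""
        self.enabled |= value & self.writable

    def write_clr(self, value: int) -> None:
        """Write to the Clear view: 1 disables, 0 has no effect."""
        self.enabled &= ~value

    def read(self) -> int:
        """Either view reads back the current enable bits."""
        return self.enabled
```

Because a write of 0 never changes a bit, an agent can enable or disable one counter without first reading the register.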
I3.3.13 PMCR_EL0, Performance Monitors Control Register

The PMCR_EL0 characteristics are:

Purpose
Provides details of the Performance Monitors implementation, including the number of counters
implemented, and configures and controls the counters.

Usage constraints
This register is accessible as follows:

Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  RO   RW

Configurations
External register PMCR_EL0[6:0] is architecturally mapped to AArch32 System register
PMCR[6:0].
External register PMCR_EL0[6:0] is architecturally mapped to AArch64 System register
PMCR_EL0[6:0].
PMCR_EL0 is in the Core power domain. Some or all RW fields of this register have defined reset
values. These apply on a Warm or Cold reset. The register is not affected by an External debug reset.
This register is only partially mapped to the internal PMCR System register. An external agent must
use other means to discover the information held in PMCR[31:11], such as accessing PMCFGR and
the ID registers.

Attributes
PMCR_EL0 is a 32-bit register.

Field descriptions
The PMCR_EL0 bit assignments are:

Bits   [31:11]  [10:7]  [6]  [5]  [4]  [3]  [2]  [1]  [0]
Field  RAZ/WI   RES0    LC   DP   X    D    C    P    E

Bits [31:11]
RAZ/WI. Hardware must implement this field as RAZ/WI. Software must not rely on the register
reading as zero, and must use a read-modify-write sequence to write to the register.

Bits [10:7]
Reserved, RES0.

LC, bit [6]
Long cycle counter enable. Determines which PMCCNTR_EL0 bit generates an overflow recorded
by PMOVSR[31].
0 Cycle counter overflow on increment that changes PMCCNTR_EL0[31] from 1 to 0.
1 Cycle counter overflow on increment that changes PMCCNTR_EL0[63] from 1 to 0.
ARM deprecates use of PMCR_EL0.LC = 0.
In an AArch64-only implementation, this field is RES1.
If this field is implemented as an RW field, it resets to a value that is architecturally UNKNOWN.

DP, bit [5]
Disable cycle counter when event counting is prohibited. The possible values of this bit are:
0 PMCCNTR_EL0, if enabled, counts when event counting is prohibited.
1 PMCCNTR_EL0 does not count when event counting is prohibited.
Counting events is never prohibited in Non-secure state. However, there are some restrictions on
counting events in Secure state. For more information about the interaction between the
Performance Monitors and EL3, see Interaction with EL3 on page D5-1841.
If EL3 is not implemented, this field is RES0, otherwise it is an RW field.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field it resets to:
• A value that is architecturally UNKNOWN if the reset is into an Exception level that is using
  AArch64.
• 0 if the reset is into an Exception level that is using AArch32.

X, bit [4]
Enable export of events in an IMPLEMENTATION DEFINED event stream. The possible values of this
bit are:
0 Do not export events.
1 Export events where not prohibited.
This field enables the exporting of events over an event bus to another device, for example to an
OPTIONAL trace macrocell. If the implementation does not include such an event bus then this field
is RAZ/WI, otherwise it is an RW field.
In an implementation that includes an event bus, no events are exported when counting is prohibited.
This field does not affect the generation of Performance Monitors overflow interrupt requests or
signaling to a cross-trigger interface (CTI) that can be implemented as signals exported from the PE.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field it resets to:
• A value that is architecturally UNKNOWN if the reset is into an Exception level that is using
  AArch64.
• 0 if the reset is into an Exception level that is using AArch32.

D, bit [3]
Clock divider. The possible values of this bit are:
0 When enabled, PMCCNTR_EL0 counts every clock cycle.
1 When enabled, PMCCNTR_EL0 counts once every 64 clock cycles.
In an AArch64-only implementation this field is RES0, otherwise it is an RW field. If
PMCR_EL0.LC == 1, this bit is ignored and the cycle counter counts every clock cycle.
ARM deprecates use of PMCR_EL0.D = 1.
When this register has an architecturally-defined reset value, if this field is implemented as an RW
field it resets to:
• A value that is architecturally UNKNOWN if the reset is into an Exception level that is using
  AArch64.
• 0 if the reset is into an Exception level that is using AArch32.

C, bit [2]
Cycle counter reset. This bit is WO. The effects of writing to this bit are:
0 No action.
1 Reset PMCCNTR_EL0 to zero.
This bit is always RAZ.
Resetting PMCCNTR_EL0 does not clear the PMCCNTR_EL0 overflow bit to 0.

P, bit [1]
Event counter reset. This bit is WO. The effects of writing to this bit are:
0 No action.
1 Reset all event counters, not including PMCCNTR_EL0, to zero.
This bit is always RAZ.
Resetting the event counters does not clear any overflow bits to 0.

E, bit [0]
Enable. The possible values of this bit are:
0 All counters, including PMCCNTR_EL0, are disabled.
1 All counters are enabled by PMCNTENSET_EL0.
This bit is RW.
When this register has an architecturally-defined reset value, this field resets to 0.

Accessing the PMCR_EL0:
PMCR_EL0 can be accessed through the external debug interface:

Component  Offset
PMU        0xE04
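
Because software must not rely on bits [31:11] reading as zero and must use a read-modify-write sequence, a new control value is best computed from the value just read. The following is a minimal sketch of the "modify" step for starting a measurement; the constant names, the function name, and the chosen policy (set E and LC, clear D, optionally reset the counters) are assumptions of this example and are not required by the architecture.

```python
PMCR_E = 1 << 0   # enable
PMCR_P = 1 << 1   # event counter reset (WO)
PMCR_C = 1 << 2   # cycle counter reset (WO)
PMCR_D = 1 << 3   # clock divider (use of D = 1 is deprecated)
PMCR_LC = 1 << 6  # long cycle counter enable (use of LC = 0 is deprecated)


def pmcr_start_value(current: int, reset_counters: bool = True) -> int:
    """Compute the value to write back to PMCR_EL0 from the value just read."""
    value = current | PMCR_E | PMCR_LC
    value &= ~PMCR_D
    if reset_counters:
        # P and C are write-only and always read as zero, so set them explicitly.
        value |= PMCR_P | PMCR_C
    return value & 0xFFFFFFFF
```

All bits that the sketch does not name are written back unchanged, which is the point of the read-modify-write sequence.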
I3.3.14 PMDEVAFF0, Performance Monitors Device Affinity register 0

The PMDEVAFF0 characteristics are:

Purpose
Copy of the low half of the PE MPIDR_EL1 register that allows a debugger to determine which PE
in a multiprocessor system the Performance Monitor component relates to.

Usage constraints
This register is accessible as follows:

SLK  Default
RO   RO

Configurations
PMDEVAFF0 is in the Debug power domain.
Implementation of this register is OPTIONAL.

Attributes
PMDEVAFF0 is a 32-bit register.

Field descriptions
The PMDEVAFF0 bit assignments are:

Bits   [31:0]
Field  MPIDR_EL1 low half

Bits [31:0]
MPIDR_EL1 low half. Read-only copy of the low half of MPIDR_EL1, as seen from the highest
implemented Exception level.

Accessing the PMDEVAFF0:
PMDEVAFF0 can be accessed through the external debug interface:

Component  Offset
PMU        0xFA8
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13.3.15 PMDEVAFF1, Performance Monitors Device Affinity register 1 
The PMDEVAFF1 characteristics are: 


Purpose 


Copy of the high half of the PE MPIDR_EL] register that allows a debugger to determine which PE 
in a multiprocessor system the Performance Monitor component relates to. 


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMDEVAFF1 is in the Debug power domain.
Implementation of this register is OPTIONAL.

Attributes
PMDEVAFF1 is a 32-bit register.

Field descriptions
The PMDEVAFF1 bit assignments are:

31                    0
MPIDR_EL1 high half

Bits [31:0]
MPIDR_EL1 high half. Read-only copy of the high half of MPIDR_EL1, as seen from the highest
implemented Exception level.
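
A debugger can rebuild the 64-bit MPIDR_EL1 value from the two affinity registers. The sketch below assumes both values have already been read; the function names are hypothetical, and the Aff0 to Aff3 bit positions are taken from the MPIDR_EL1 definition rather than from this section.

```python
def mpidr_from_pmdevaff(pmdevaff0: int, pmdevaff1: int) -> int:
    """Combine the low half (PMDEVAFF0) and high half (PMDEVAFF1)."""
    return ((pmdevaff1 & 0xFFFFFFFF) << 32) | (pmdevaff0 & 0xFFFFFFFF)


def affinity_fields(mpidr: int) -> dict:
    """Extract the affinity levels (positions assumed from MPIDR_EL1)."""
    return {
        "Aff0": mpidr & 0xFF,          # bits [7:0]
        "Aff1": (mpidr >> 8) & 0xFF,   # bits [15:8]
        "Aff2": (mpidr >> 16) & 0xFF,  # bits [23:16]
        "Aff3": (mpidr >> 32) & 0xFF,  # bits [39:32]
    }
```

Two PMU components that report the same affinity values relate to the same PE.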


Accessing the PMDEVAFF1:

PMDEVAFF1 can be accessed through the external debug interface:

Component    Offset
PMU          0xFAC

13.3.16 PMDEVARCH, Performance Monitors Device Architecture register

The PMDEVARCH characteristics are:

Purpose
Identifies the programmers' model architecture of the Performance Monitor component.


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMDEVARCH is in the Debug power domain.
Implementation of this register is OPTIONAL.

Attributes
PMDEVARCH is a 32-bit register.

Field descriptions
The PMDEVARCH bit assignments are:

31         21   20        19       16   15      0
ARCHITECT       PRESENT   REVISION      ARCHID

ARCHITECT, bits [31:21]
Defines the architecture of the component. For Performance Monitors, this is ARM Limited.
Bits [31:28] are the JEP106 continuation code, 0x4.
Bits [27:21] are the JEP106 ID code, 0x3B.

PRESENT, bit [20]
When set to 1, indicates that the DEVARCH is present.
This field is 1 in ARMv8.

REVISION, bits [19:16]
Defines the architecture revision. For architectures defined by ARM this is the minor revision.
For Performance Monitors, the revision defined by ARMv8 is 0x0.
All other values are reserved.

ARCHID, bits [15:0]
Defines this part to be an ARMv8 debug component. For architectures defined by ARM this is
further subdivided.
For Performance Monitors:
• Bits [15:12] are the architecture version, 0x2.
• Bits [11:0] are the architecture part number, 0xA16.
This corresponds to Performance Monitors architecture version PMUv3.

Note
The PMUv3 memory-mapped programmers' model can be used by devices other than ARMv8
processors. Software must determine whether the PMU is attached to an ARMv8 processor by using
the PMDEVAFF0 and PMDEVAFF1 registers to discover the affinity of the PMU to any ARMv8
processors.
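
The PMDEVARCH field layout can be unpacked with shifts and masks. This is an illustrative sketch: the function names are hypothetical, and the sample value in the usage note is simply the fields listed above packed together.

```python
def decode_pmdevarch(value: int) -> dict:
    """Split a 32-bit PMDEVARCH value into the fields described above."""
    return {
        "jep106_continuation": (value >> 28) & 0xF,  # ARCHITECT[31:28]
        "jep106_id": (value >> 21) & 0x7F,           # ARCHITECT[27:21]
        "present": (value >> 20) & 0x1,              # PRESENT
        "revision": (value >> 16) & 0xF,             # REVISION
        "arch_version": (value >> 12) & 0xF,         # ARCHID[15:12]
        "arch_part": value & 0xFFF,                  # ARCHID[11:0]
    }


def is_pmuv3(value: int) -> bool:
    """True when PRESENT is set and ARCHID matches the PMUv3 encoding."""
    f = decode_pmdevarch(value)
    return f["present"] == 1 and f["arch_version"] == 0x2 and f["arch_part"] == 0xA16
```

Packing continuation code 0x4, ID code 0x3B, PRESENT 1, REVISION 0x0, and ARCHID 0x2A16 gives 0x47702A16, which this sketch classifies as PMUv3.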





Accessing the PMDEVARCH:

PMDEVARCH can be accessed through the external debug interface:

Component    Offset
PMU          0xFBC

13.3.17 PMDEVTYPE, Performance Monitors Device Type register

The PMDEVTYPE characteristics are:

Purpose
Indicates to a debugger that this component is part of a PE's performance monitor interface.


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMDEVTYPE is in the Debug power domain.
Implementation of this register is OPTIONAL.

Attributes
PMDEVTYPE is a 32-bit register.

Field descriptions
The PMDEVTYPE bit assignments are:

31        8   7     4   3       0
RES0          SUB       MAJOR

Bits [31:8]
Reserved, RES0.

SUB, bits [7:4]
Subtype. Must read as 0x1 to indicate this is a component within a PE.

MAJOR, bits [3:0]
Major type. Must read as 0x6 to indicate this is a performance monitor component.
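
A debugger can check the two required field values as follows. This is a small sketch with hypothetical function names.

```python
def decode_pmdevtype(value: int) -> dict:
    """Return the SUB and MAJOR fields of a PMDEVTYPE value."""
    return {"sub": (value >> 4) & 0xF, "major": value & 0xF}


def is_pe_performance_monitor(value: int) -> bool:
    """True when SUB is 0x1 (within a PE) and MAJOR is 0x6 (performance monitor)."""
    f = decode_pmdevtype(value)
    return f["sub"] == 0x1 and f["major"] == 0x6
```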


Accessing the PMDEVTYPE:

PMDEVTYPE can be accessed through the external debug interface:

Component    Offset
PMU          0xFCC

13.3.18 PMEVCNTR<n>_EL0, Performance Monitors Event Count Registers, n = 0 - 30

The PMEVCNTR<n>_EL0 characteristics are:

Purpose
Holds event counter n, which counts events, where n is 0 to 30.

Usage constraints
This register is accessible as follows:

Off      DLK      OSLK     EPMAD    SLK    Default
Error    Error    Error    Error    RO     RW

External accesses to the performance monitors ignore PMUSERENR_EL0 and, if implemented,
MDCR_EL2.{TPM, TPMCR, HPMN} and MDCR_EL3.TPM. This means that all counters are
accessible regardless of the current EL or privilege of the access.

Configurations
External register PMEVCNTR<n>_EL0 is architecturally mapped to AArch64 System register
PMEVCNTR<n>_EL0.
External register PMEVCNTR<n>_EL0 is architecturally mapped to AArch32 System register
PMEVCNTR<n>.
PMEVCNTR<n>_EL0 is in the Core power domain. RW fields in this register reset to
architecturally UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected
by an External debug reset.

Attributes
PMEVCNTR<n>_EL0 is a 32-bit register.

Field descriptions
The PMEVCNTR<n>_EL0 bit assignments are:

31                 0
Event counter n

Bits [31:0]
Event counter n. Value of event counter n, where n is the number of this register and is a number
from 0 to 30.
This field resets to a value that is architecturally UNKNOWN.

Accessing the PMEVCNTR<n>_EL0:

PMEVCNTR<n>_EL0 can be accessed through the external debug interface:

Component    Offset
PMU          0x000 + 8n
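
The offset formula above spaces the counters 8 bytes apart. A minimal sketch of the calculation, using a hypothetical helper name and constants:

```python
PMEVCNTR_BASE = 0x000   # offset of PMEVCNTR0_EL0 within the PMU component
PMEVCNTR_STRIDE = 8     # each counter is placed 8 bytes after the previous one


def pmevcntr_offset(n: int) -> int:
    """Return the PMU component offset of PMEVCNTR<n>_EL0."""
    if not 0 <= n <= 30:
        raise ValueError("n must be in the range 0 to 30")
    return PMEVCNTR_BASE + PMEVCNTR_STRIDE * n
```

With this formula, counter 30 is at offset 0x0F0.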


13.3.19 PMEVTYPER<n>_EL0, Performance Monitors Event Type Registers, n = 0 - 30 


The PMEVTYPER<n>_EL0 characteristics are:

Purpose
Configures event counter n, where n is 0 to 30.

Usage constraints
This register is accessible as follows:

Off      DLK      OSLK     EPMAD    SLK    Default
Error    Error    Error    Error    RO     RW

Configurations
External register PMEVTYPER<n>_EL0 is architecturally mapped to AArch64 System register
PMEVTYPER<n>_EL0.
External register PMEVTYPER<n>_EL0 is architecturally mapped to AArch32 System register
PMEVTYPER<n>.
PMEVTYPER<n>_EL0 is in the Core power domain. RW fields in this register reset to
architecturally UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected
by an External debug reset.

Attributes
PMEVTYPER<n>_EL0 is a 32-bit register.

Field descriptions
The PMEVTYPER<n>_EL0 bit assignments are:

31   30   29    28    27    26   25   24      10   9          0
P    U    NSK   NSU   NSH   M    MT   RES0         evtCount


P, bit [31]
Privileged filtering bit. Controls counting in EL1. If EL3 is implemented, then counting in
Non-secure EL1 is further controlled by the NSK bit. The possible values of this bit are:
0       Count events in EL1.
1       Do not count events in EL1.
This field resets to a value that is architecturally UNKNOWN.

U, bit [30]
User filtering bit. Controls counting in EL0. If EL3 is implemented, then counting in Non-secure
EL0 is further controlled by the NSU bit. The possible values of this bit are:
0       Count events in EL0.
1       Do not count events in EL0.
This field resets to a value that is architecturally UNKNOWN.

NSK, bit [29]
Non-secure EL1 (kernel) modes filtering bit. Controls counting in Non-secure EL1. If EL3 is not
implemented, this bit is RES0.
If the value of this bit is equal to the value of P, events in Non-secure EL1 are counted.
Otherwise, events in Non-secure EL1 are not counted.
This field resets to a value that is architecturally UNKNOWN.

NSU, bit [28]
Non-secure EL0 (Unprivileged) filtering. Controls counting in Non-secure EL0. If EL3 is not
implemented, this bit is RES0.
If the value of this bit is equal to the value of U, events in Non-secure EL0 are counted.
Otherwise, events in Non-secure EL0 are not counted.
This field resets to a value that is architecturally UNKNOWN.

NSH, bit [27]
Non-secure EL2 (Hypervisor) filtering. Controls counting in Non-secure EL2. If EL2 is not
implemented, this bit is RES0.
0       Do not count events in EL2.
1       Count events in EL2.
This field resets to a value that is architecturally UNKNOWN.


M, bit [26]
Secure EL3 filtering bit. If EL3 is not implemented, this bit is RES0.
If the value of this bit is equal to the value of P, cycles in Secure EL3 are counted.
Otherwise, cycles in Secure EL3 are not counted.
Most applications can ignore this field and set its value to 0.

Note
This field is not visible in the AArch32 PMEVTYPER System register.

This field resets to a value that is architecturally UNKNOWN.

MT, bit [25]
Multithreading. When the implementation is multi-threaded, the valid values for this bit are:
0       Count events only on controlling PE.
1       Count events from any PE with the same affinity at level 1 and above as this PE.
When the implementation is not multi-threaded, this bit is RES0.

Note
• An implementation is described as multi-threaded when the lowest level of affinity consists
  of logical PEs that are implemented using a multi-threading type approach. That is, the
  performance of PEs at the lowest affinity level is highly interdependent. On such an
  implementation, the value of MPIDR_EL1.MT, when read at the highest implemented
  Exception level, is 1.
• Events from a different thread of a multithreaded implementation are not Attributable to the
  thread counting the event.

This field resets to a value that is architecturally UNKNOWN.


Bits [24:10]
Reserved, RES0.

evtCount, bits [9:0]
Event to count. The event number of the event that is counted by event counter
PMEVCNTR<n>_EL0.
Software must program this field with an event that is supported by the PE being programmed.
There are three ranges of event numbers:
• Event numbers in the range 0x000 to 0x03F are common architectural and microarchitectural
  events.
• Event numbers in the range 0x040 to 0x0BF are ARM recommended common architectural and
  microarchitectural events.
• Event numbers in the range 0x0C0 to 0x3FF are IMPLEMENTATION DEFINED events.
If evtCount is programmed to an event that is reserved or not supported by the PE, the behavior
depends on the event type:
• For the range 0x000 to 0x03F, no events are counted, and the value returned by a direct or
  external read of the evtCount field is the value written to the field.
• For IMPLEMENTATION DEFINED events, it is UNPREDICTABLE what event, if any, is counted,
  and the value returned by a direct or external read of the evtCount field is UNKNOWN.

Note
UNPREDICTABLE means the event must not expose privileged information.

ARM recommends that the behavior across a family of implementations is defined such that if a
given implementation does not include an event from a set of common IMPLEMENTATION DEFINED
events, then no event is counted and the value read back on evtCount is the value written.
This field resets to a value that is architecturally UNKNOWN.
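
The filter bits and the event number ranges can be read back from a register value as sketched below. This is a literal reading of the field descriptions, assuming EL2 and EL3 are both implemented; the function names and dictionary keys are hypothetical, and the M and MT bits are left out for brevity.

```python
def decode_pmevtyper(value: int) -> dict:
    """Interpret the filter bits of a PMEVTYPER<n>_EL0 value."""
    def bit(n: int) -> int:
        return (value >> n) & 1

    p, u = bit(31), bit(30)
    return {
        "el1": p == 0,                  # P: 0 means count events in EL1
        "el0": u == 0,                  # U: 0 means count events in EL0
        "nonsecure_el1": bit(29) == p,  # NSK equal to P means counted
        "nonsecure_el0": bit(28) == u,  # NSU equal to U means counted
        "nonsecure_el2": bit(27) == 1,  # NSH: 1 means count events in EL2
        "evt_count": value & 0x3FF,     # evtCount, bits [9:0]
    }


def event_range(evt_count: int) -> str:
    """Classify an event number into one of the three documented ranges."""
    if not 0 <= evt_count <= 0x3FF:
        raise ValueError("evtCount is a 10-bit field")
    if evt_count <= 0x03F:
        return "common"
    if evt_count <= 0x0BF:
        return "ARM recommended"
    return "IMPLEMENTATION DEFINED"
```

For example, a value with P and NSK both set to 1 does not count in EL1 under the P control, but the equality rule means events in Non-secure EL1 are still counted.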


Accessing the PMEVTYPER<n>_EL0:

PMEVTYPER<n>_EL0 can be accessed through the external debug interface:

Component    Offset
PMU          0x400 + 4n

13.3.20 PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register

The PMINTENCLR_EL1 characteristics are:

Purpose
Disables the generation of interrupt requests on overflows from the Cycle Count Register,
PMCCNTR_EL0, and the event counters PMEVCNTR<n>_EL0. Reading the register shows which
overflow interrupt requests are enabled.

Usage constraints
This register is accessible as follows:

Off      DLK      OSLK     EPMAD    SLK    Default
Error    Error    Error    Error    RO     RW

Configurations
External register PMINTENCLR_EL1 is architecturally mapped to AArch64 System register
PMINTENCLR_EL1.
External register PMINTENCLR_EL1 is architecturally mapped to AArch32 System register
PMINTENCLR.
PMINTENCLR_EL1 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMINTENCLR_EL1 is a 32-bit register.

Field descriptions
The PMINTENCLR_EL1 bit assignments are:

31   30              0
C    P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 overflow interrupt request disable bit. Possible values are:
0       When read, means the cycle counter overflow interrupt request is disabled. When
        written, has no effect.
1       When read, means the cycle counter overflow interrupt request is enabled. When
        written, disables the cycle count overflow interrupt request.
This field resets to a value that is architecturally UNKNOWN.

P<n>, bit [n], for n = 0 to 30
Event counter overflow interrupt request disable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. N is the value in PMCFGR.N.
Possible values are:
0       When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
        disabled. When written, has no effect.
1       When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
        enabled. When written, disables the PMEVCNTR<n>_EL0 interrupt request.
This field resets to a value that is architecturally UNKNOWN.


Accessing the PMINTENCLR_EL1:

PMINTENCLR_EL1 can be accessed through the external debug interface:

Component    Offset
PMU          0xC60

13.3.21 PMINTENSET_EL1, Performance Monitors Interrupt Enable Set register

The PMINTENSET_EL1 characteristics are:

Purpose
Enables the generation of interrupt requests on overflows from the Cycle Count Register,
PMCCNTR_EL0, and the event counters PMEVCNTR<n>_EL0. Reading the register shows which
overflow interrupt requests are enabled.

Usage constraints
This register is accessible as follows:

Off      DLK      OSLK     EPMAD    SLK    Default
Error    Error    Error    Error    RO     RW

Configurations
External register PMINTENSET_EL1 is architecturally mapped to AArch64 System register
PMINTENSET_EL1.
External register PMINTENSET_EL1 is architecturally mapped to AArch32 System register
PMINTENSET.
PMINTENSET_EL1 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMINTENSET_EL1 is a 32-bit register.

Field descriptions
The PMINTENSET_EL1 bit assignments are:

31   30              0
C    P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 overflow interrupt request enable bit. Possible values are:
0       When read, means the cycle counter overflow interrupt request is disabled. When
        written, has no effect.
1       When read, means the cycle counter overflow interrupt request is enabled. When
        written, enables the cycle count overflow interrupt request.
This field resets to a value that is architecturally UNKNOWN.

P<n>, bit [n], for n = 0 to 30
Event counter overflow interrupt request enable bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. N is the value in PMCFGR.N.
Possible values are:
0       When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
        disabled. When written, has no effect.
1       When read, means that the PMEVCNTR<n>_EL0 event counter interrupt request is
        enabled. When written, enables the PMEVCNTR<n>_EL0 interrupt request.
This field resets to a value that is architecturally UNKNOWN.


Accessing the PMINTENSET_EL1:

PMINTENSET_EL1 can be accessed through the external debug interface:

Component    Offset
PMU          0xC40
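
PMINTENSET_EL1 and PMINTENCLR_EL1 are two views of the same enable state: writing 1 to a bit sets or clears that enable, writing 0 has no effect, and either register reads back the current enables. The toy model below illustrates that behavior. The class name is hypothetical, and starting from zero is an assumption made for illustration, because the architectural reset value is UNKNOWN.

```python
class InterruptEnableModel:
    """Toy model of the shared overflow interrupt enable state."""

    def __init__(self, num_counters: int):
        # Bit 31 is C. Bits [N-1:0] are the implemented event counters,
        # where N stands for PMCFGR.N. Bits [30:N] are RAZ/WI.
        self.valid_mask = (1 << 31) | ((1 << num_counters) - 1)
        self.enabled = 0  # assumption: real hardware resets to UNKNOWN

    def write_set(self, value: int) -> None:
        """Model a write to PMINTENSET_EL1: 1 bits enable, 0 bits are ignored."""
        self.enabled |= value & self.valid_mask

    def write_clr(self, value: int) -> None:
        """Model a write to PMINTENCLR_EL1: 1 bits disable, 0 bits are ignored."""
        self.enabled &= ~(value & self.valid_mask)

    def read(self) -> int:
        """Both registers read back the same enable bits."""
        return self.enabled
```

The separate set and clear registers let software change one enable without a read-modify-write sequence.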
13.3.22 PMITCTRL, Performance Monitors Integration mode Control register
The PMITCTRL characteristics are: 
Purpose 
Enables the Performance Monitors to switch from default mode into integration mode, where test 
software can control directly the inputs and outputs of the PE, for integration testing or topology 
detection. 

Usage constraints
This register is accessible as follows:

Off       DLK       OSLK      Default
IMPDEF    IMPDEF    IMPDEF    RW

Configurations
It is IMPLEMENTATION DEFINED whether PMITCTRL is implemented in the Core power domain or
in the Debug power domain. Some or all RW fields of this register have defined reset values, and:
• The register is not affected by a Warm reset.
• If the register is implemented in the Core power domain the reset values apply on a Cold
  reset, and the register is not affected by an External debug reset.
• If the register is implemented in the Debug power domain the reset values apply on an
  External debug reset, and the register is not affected by a Cold reset.
Implementation of this register is OPTIONAL.

Attributes
PMITCTRL is a 32-bit register.

Field descriptions
The PMITCTRL bit assignments are:

31        1   0
RES0          IME

Bits [31:1]
Reserved, RES0.

IME, bit [0]
Integration mode enable. When IME == 1, the device reverts to an integration mode to enable
integration testing or topology detection. The integration mode behavior is IMPLEMENTATION
DEFINED.
0       Normal operation.
1       Integration mode enabled.
When this register has an architecturally-defined reset value, this field resets to 0.

Accessing the PMITCTRL:

PMITCTRL can be accessed through the external debug interface:

Component    Offset
PMU          0xF00

13.3.23 PMLAR, Performance Monitors Lock Access Register

The PMLAR characteristics are:

Purpose
Allows or disallows access to the Performance Monitors registers through a memory-mapped
interface.

Usage constraints
This register is accessible as follows:

Default
WO

Configurations
PMLAR is in the Debug power domain.
If OPTIONAL memory-mapped access to the external debug interface is supported then an OPTIONAL
Software Lock can be implemented as part of CoreSight compliance.
PMLAR ignores writes if the Software Lock is not implemented and ignores writes for other
accesses to the external debug interface.
The Software Lock provides a lock to prevent memory-mapped writes to the Performance Monitors
registers. Use of this lock mechanism reduces the risk of accidental damage to the contents of the
Performance Monitors registers. It does not, and cannot, prevent all accidental or malicious damage.
Software uses PMLAR to set or clear the lock, and PMLSR to check the current status of the lock.

Attributes
PMLAR is a 32-bit register.

Field descriptions
The PMLAR bit assignments are:

31     0
KEY

KEY, bits [31:0]
Lock Access control. Writing the key value 0xC5ACCE55 to this field unlocks the lock, enabling write
accesses to this component's registers through a memory-mapped interface.
Writing any other value to this register locks the lock, disabling write accesses to this component's
registers through a memory-mapped interface.

Accessing the PMLAR:

PMLAR can be accessed through a memory-mapped access to the external debug interface:

Component    Offset
PMU          0xFB0

13.3.24 PMLSR, Performance Monitors Lock Status Register

The PMLSR characteristics are:


Purpose 


Indicates the current status of the software lock for Performance Monitors registers. 


Usage constraints 


This register is accessible as follows: 


Default 


RO 


Configurations 


PMLSR is in the Debug power domain. Some or all RW fields of this register have defined reset 
values. These apply only on an External debug reset. The register is not affected by a Warm reset 
and is not affected by a Cold reset. 


If OPTIONAL memory-mapped access to the external debug interface is supported then an OPTIONAL 
Software Lock can be implemented as part of CoreSight compliance. 


PMLSR is RAZ if the Software Lock is not implemented and is RAZ for other accesses to the 
external debug interface. 


The Software Lock provides a lock to prevent memory-mapped writes to the Performance Monitors 
registers. Use of this lock mechanism reduces the risk of accidental damage to the contents of the 
Performance Monitors registers. It does not, and cannot, prevent all accidental or malicious damage. 


Software uses PMLAR to set or clear the lock, and PMLSR to check the current status of the lock. 


Attributes 
PMLSR is a 32-bit register. 


Field descriptions 


The PMLSR bit assignments are:

31        3   2     1     0
RES0          nTT   SLK   SLI

Bits [31:3]
Reserved, RES0.

nTT, bit [2]
Not thirty-two bit access required. RAZ.

SLK, bit [1]
Software Lock status for this component. For an access to LSR that is not a memory-mapped access,
or when the Software Lock is not implemented, this field is RES0.
For memory-mapped accesses when the software lock is implemented, possible values of this field
are:
0       Lock clear. Writes are permitted to this component's registers.
1       Lock set. Writes to this component's registers are ignored, and reads have no side
        effects.
When this register has an architecturally-defined reset value, this field resets to 1.

SLI, bit [0]
Software Lock implemented. For an access to LSR that is not a memory-mapped access, this field
is RAZ. For memory-mapped accesses, the value of this field is IMPLEMENTATION DEFINED.
Permitted values are:
0       Software Lock not implemented or not memory-mapped access.
1       Software Lock implemented and memory-mapped access.
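
The interaction between PMLAR and PMLSR for memory-mapped accesses can be modeled as below. This is a sketch with a hypothetical class name. It covers only the memory-mapped view and assumes SLK starts at its documented reset value of 1.

```python
PMLAR_KEY = 0xC5ACCE55  # key value that unlocks the Software Lock


class SoftwareLockModel:
    """Toy model of the Software Lock as seen by memory-mapped accesses."""

    def __init__(self, implemented: bool = True):
        self.implemented = implemented
        self.locked = implemented  # SLK resets to 1 when the lock is implemented

    def write_pmlar(self, value: int) -> None:
        """The key unlocks, any other value locks, and writes are ignored if unimplemented."""
        if self.implemented:
            self.locked = (value & 0xFFFFFFFF) != PMLAR_KEY

    def read_pmlsr(self) -> int:
        """Return nTT (always 0), SLK in bit [1], and SLI in bit [0]."""
        if not self.implemented:
            return 0  # PMLSR is RAZ when the Software Lock is not implemented
        return (int(self.locked) << 1) | 1
```

A typical memory-mapped client would write the key to PMLAR, confirm that PMLSR.SLK reads 0, perform its register writes, then write any other value to PMLAR to set the lock again.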


Accessing the PMLSR:

PMLSR can be accessed through a memory-mapped access to the external debug interface:

Component    Offset
PMU          0xFB4

13.3.25 PMOVSCLR_EL0, Performance Monitors Overflow Flag Status Clear register

The PMOVSCLR_EL0 characteristics are:

Purpose
Contains the state of the overflow bit for the Cycle Count Register, PMCCNTR_EL0, and each of
the implemented event counters PMEVCNTR<n>. Writing to this register clears these bits.

Usage constraints
This register is accessible as follows:

Off      DLK      OSLK     EPMAD    SLK    Default
Error    Error    Error    Error    RO     RW

Configurations
External register PMOVSCLR_EL0 is architecturally mapped to AArch64 System register
PMOVSCLR_EL0.
External register PMOVSCLR_EL0 is architecturally mapped to AArch32 System register
PMOVSR.
PMOVSCLR_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMOVSCLR_EL0 is a 32-bit register.

Field descriptions
The PMOVSCLR_EL0 bit assignments are:

31   30              0
C    P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 overflow bit. Possible values are:
0       When read, means the cycle counter has not overflowed. When written, has no effect.
1       When read, means the cycle counter has overflowed. When written, clears the overflow
        bit to 0.
PMCR_EL0.LC controls whether an overflow is detected from PMCCNTR_EL0[31] or from
PMCCNTR_EL0[63].
This field resets to a value that is architecturally UNKNOWN.

P<n>, bit [n], for n = 0 to 30
Event counter overflow clear bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. N is the value in PMCFGR.N.
Possible values of each bit are:
0       When read, means that PMEVCNTR<n>_EL0 has not overflowed. When written, has
        no effect.
1       When read, means that PMEVCNTR<n>_EL0 has overflowed. When written, clears
        the PMEVCNTR<n>_EL0 overflow bit to 0.
This field resets to a value that is architecturally UNKNOWN.


Accessing the PMOVSCLR_EL0:

PMOVSCLR_EL0 can be accessed through the external debug interface:

Component    Offset
PMU          0xC80










13.3.26 PMOVSSET_EL0, Performance Monitors Overflow Flag Status Set register

The PMOVSSET_EL0 characteristics are:

Purpose
Sets the state of the overflow bit for the Cycle Count Register, PMCCNTR_EL0, and each of the
implemented event counters PMEVCNTR<n>.

Usage constraints
This register is accessible as follows:

Off      DLK      OSLK     EPMAD    SLK    Default
Error    Error    Error    Error    RO     RW

Configurations
External register PMOVSSET_EL0 is architecturally mapped to AArch64 System register
PMOVSSET_EL0.
External register PMOVSSET_EL0 is architecturally mapped to AArch32 System register
PMOVSSET.
PMOVSSET_EL0 is in the Core power domain. RW fields in this register reset to architecturally
UNKNOWN values. These apply on a Warm or Cold reset. The register is not affected by an External
debug reset.

Attributes
PMOVSSET_EL0 is a 32-bit register.

Field descriptions
The PMOVSSET_EL0 bit assignments are:

31   30              0
C    P<n>, bit [n]

C, bit [31]
PMCCNTR_EL0 overflow bit. Possible values are:
0       When read, means the cycle counter has not overflowed. When written, has no effect.
1       When read, means the cycle counter has overflowed. When written, sets the overflow
        bit to 1.
This field resets to a value that is architecturally UNKNOWN.

P<n>, bit [n], for n = 0 to 30
Event counter overflow set bit for PMEVCNTR<n>_EL0.
Bits [30:N] are RAZ/WI. N is the value in PMCFGR.N.
Possible values are:
0       When read, means that PMEVCNTR<n>_EL0 has not overflowed. When written, has
        no effect.
1       When read, means that PMEVCNTR<n>_EL0 has overflowed. When written, sets the
        PMEVCNTR<n>_EL0 overflow bit to 1.
This field resets to a value that is architecturally UNKNOWN.
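
A debugger typically reads the overflow status, works out which counters overflowed, then writes the same bits to PMOVSCLR_EL0 to acknowledge them. The sketch below decodes a status value and models the effect of writes to the clear and set registers. The function names are hypothetical, and num_counters stands for the value of PMCFGR.N.

```python
def decode_overflow_status(status: int, num_counters: int):
    """Return (cycle_counter_overflowed, list of overflowed event counters)."""
    cycle = bool((status >> 31) & 1)                                   # C, bit [31]
    events = [n for n in range(num_counters) if (status >> n) & 1]     # P<n>
    return cycle, events


def after_pmovsclr_write(status: int, value: int) -> int:
    """Writing 1 clears the matching overflow bit; writing 0 has no effect."""
    return status & ~value & 0xFFFFFFFF


def after_pmovsset_write(status: int, value: int, num_counters: int) -> int:
    """Writing 1 sets the matching overflow bit; bits [30:N] ignore writes."""
    valid = (1 << 31) | ((1 << num_counters) - 1)
    return (status | (value & valid)) & 0xFFFFFFFF
```

Writing back only the bits that were read as 1 avoids clearing an overflow that occurs between the read and the write.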
Accessing the PMOVSSET_EL0:


PMOVSSET_EL0 can be accessed through the external debug interface:

Component    Offset
PMU          0xCC0

13.3.27 PMPIDR0, Performance Monitors Peripheral Identification Register 0

The PMPIDR0 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.


For more information see About the Peripheral identification scheme on page K2-5504. 


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMPIDR0 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMPIDR0 is a 32-bit register.

Field descriptions
The PMPIDR0 bit assignments are:

31        8   7        0
RES0          PART_0

Bits [31:8]
Reserved, RES0.

PART_0, bits [7:0]
Part number, least significant byte.


Accessing the PMPIDR0:

PMPIDR0 can be accessed through the external debug interface:

Component    Offset
PMU          0xFE0

13.3.28 PMPIDR1, Performance Monitors Peripheral Identification Register 1

The PMPIDR1 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.

For more information see About the Peripheral identification scheme on page K2-5504.


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMPIDR1 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMPIDR1 is a 32-bit register.

Field descriptions
The PMPIDR1 bit assignments are:

31        8   7       4   3        0
RES0          DES_0       PART_1

Bits [31:8]
Reserved, RES0.

DES_0, bits [7:4]
Designer, least significant nibble of JEP106 ID code. For ARM Limited, this field is 0b1011.

PART_1, bits [3:0]
Part number, most significant nibble.


Accessing the PMPIDR1:

PMPIDR1 can be accessed through the external debug interface:

Component    Offset
PMU          0xFE4

13.3.29 PMPIDR2, Performance Monitors Peripheral Identification Register 2

The PMPIDR2 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.

For more information see About the Peripheral identification scheme on page K2-5504.


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMPIDR2 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMPIDR2 is a 32-bit register.

Field descriptions
The PMPIDR2 bit assignments are:

31        8   7          4   3       2       0
RES0          REVISION       JEDEC   DES_1

Bits [31:8]
Reserved, RES0.

REVISION, bits [7:4]
Part major revision. Parts can also use this field to extend Part number to 16-bits.

JEDEC, bit [3]
RAO. Indicates a JEP106 identity code is used.

DES_1, bits [2:0]
Designer, most significant bits of JEP106 ID code. For ARM Limited, this field is 0b011.


Accessing the PMPIDR2:

PMPIDR2 can be accessed through the external debug interface:

Component    Offset
PMU          0xFE8

13.3.30 PMPIDR3, Performance Monitors Peripheral Identification Register 3

The PMPIDR3 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.

For more information see About the Peripheral identification scheme on page K2-5504.


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMPIDR3 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMPIDR3 is a 32-bit register.

Field descriptions
The PMPIDR3 bit assignments are:

31        8   7        4   3      0
RES0          REVAND       CMOD

Bits [31:8]
Reserved, RES0.

REVAND, bits [7:4]
Part minor revision. Parts using PMPIDR2.REVISION as an extension to the Part number must use
this field as a major revision number.

CMOD, bits [3:0]
Customer modified. Indicates someone other than the Designer has modified the component.


Accessing the PMPIDR3:

PMPIDR3 can be accessed through the external debug interface:

Component    Offset
PMU          0xFEC

13.3.31 PMPIDR4, Performance Monitors Peripheral Identification Register 4

The PMPIDR4 characteristics are:

Purpose
Provides information to identify a Performance Monitor component.

For more information see About the Peripheral identification scheme on page K2-5504.


Usage constraints
This register is accessible as follows:

SLK    Default
RO     RO

Configurations
PMPIDR4 is in the Debug power domain.
Implementation of this register is OPTIONAL.
This register is required for CoreSight compliance.

Attributes
PMPIDR4 is a 32-bit register.

Field descriptions
The PMPIDR4 bit assignments are:

31        8   7      4   3       0
RES0          SIZE       DES_2

Bits [31:8]
Reserved, RES0.

SIZE, bits [7:4]
Size of the component. RAZ. Log2 of the number of 4KB pages from the start of the component to
the end of the component ID registers.

DES_2, bits [3:0]
Designer, JEP106 continuation code, least significant nibble. For ARM Limited, this field is 0b0100.
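
The Peripheral Identification registers split the part number and designer code across several registers. The sketch below reassembles them from PMPIDR0 to PMPIDR4 using the field positions given in these sections. The function name is hypothetical, and the part number used in the usage note is an invented sample value, not a real part.

```python
def decode_peripheral_id(pidr0: int, pidr1: int, pidr2: int,
                         pidr3: int, pidr4: int) -> dict:
    """Combine the PMPIDR0 to PMPIDR4 fields into identification values."""
    des_0 = (pidr1 >> 4) & 0xF  # low nibble of the JEP106 ID code
    des_1 = pidr2 & 0x7         # upper three bits of the JEP106 ID code
    return {
        "part": ((pidr1 & 0xF) << 8) | (pidr0 & 0xFF),  # PART_1:PART_0
        "jep106_id": (des_1 << 4) | des_0,
        "jep106_continuation": pidr4 & 0xF,             # DES_2
        "jedec": (pidr2 >> 3) & 0x1,
        "revision": (pidr2 >> 4) & 0xF,
        "revand": (pidr3 >> 4) & 0xF,
        "cmod": pidr3 & 0xF,
        "size": (pidr4 >> 4) & 0xF,
    }
```

With the ARM Limited values listed above (DES_0 0b1011, DES_1 0b011, DES_2 0b0100), the result is JEP106 ID code 0x3B with continuation code 0x4, matching the ARCHITECT field of PMDEVARCH. For example, hypothetical register values 0xD3, 0xB9, 0x0B, 0x00, and 0x04 decode to part number 0x9D3.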


Accessing the PMPIDR4:

PMPIDR4 can be accessed through the external debug interface:

Component    Offset
PMU          0xFD0










13.3.32 PMSWINC_EL0, Performance Monitors Software Increment register

The PMSWINC_EL0 characteristics are:

Purpose
Increments a counter that is configured to count the Software increment event, event 0x00. For more
information, see SW_INCR.

Usage constraints 


This register is accessible as follows: 





Off    DLK    OSLK   EPMAD  SLK  Default
Error  Error  Error  Error  WI   WO





Configurations 


External register PMSWINC_EL0 is architecturally mapped to AArch64 System register
PMSWINC_EL0.


External register PMSWINC_EL0 is architecturally mapped to AArch32 System register
PMSWINC.


PMSWINC_EL0 is in the Core power domain.
Implementation of this register is OPTIONAL. 
If this register is implemented, use of it is deprecated. 


If 1 is written to bit [n] from the external debug interface, it is CONSTRAINED UNPREDICTABLE 
whether or not a SW_INCR event is created for counter n. This is consistent with not implementing 
the register in the external debug interface. 


Attributes 
PMSWINC_EL0 is a 32-bit register.


Field descriptions


The PMSWINC_EL0 bit assignments are:


31    30              0
RES0  P<n>, bit [n]


Bit [31]
Reserved, RES0.


P<n>, bit [n], for n = 0 to 30
Event counter software increment bit for PMEVCNTR<n>_EL0.
P<n> is WI if n >= PMCR_EL0.N, the number of implemented counters.


Otherwise, the effects of writing to this bit are:

0 No action. The write to this bit is ignored.
1 It is CONSTRAINED UNPREDICTABLE whether a SW_INCR event is generated for event 
counter n. 
Accessing the PMSWINC_EL0:


PMSWINC_EL0 can be accessed through the external debug interface:





Component  Offset
PMU        0xCA0
13.4 Generic Timer memory-mapped registers overview


The Generic Timer memory-mapped registers are implemented as multiple register frames, with each register frame 
having its own base address, as follows: 


• A single CNTCTLBase register frame, at base address CNTCTLBase.
• Between one and seven CNTBaseN register frames, each with its own base address CNTBaseN.
• For each CNTBaseN register frame, if required, a CNTEL0BaseN register frame, at base address
CNTEL0BaseN, that provides an EL0 view of the CNTBaseN register frame.


For more information, see:
• Memory-mapped timer components on page I1-5128.
• The CNTBaseN and CNTEL0BaseN frames on page I1-5130. This section includes the memory map of the
CNTBaseN and CNTEL0BaseN register frames.
• The CNTCTLBase frame on page I1-5129. This section includes the memory map of the CNTCTLBase
register frame.
• Providing a complete set of counter and timer features on page I1-5132.
13.5 Generic Timer memory-mapped register descriptions


This section describes the Generic Timer registers. Generic Timer memory-mapped registers overview on 
page I3-5198 gives an overview of these registers, and includes links to their memory maps. 
13.5.1 CNTACR<n>, Counter-timer Access Control Registers, n = 0 - 7
The CNTACR<n> characteristics are: 


Purpose 


Provides top-level access controls for the elements of a timer frame. CNTACR<n> provides the 
controls for frame CNTBaseN. 


In addition to the CNTACR<n> control: 
• CNTNSAR controls whether CNTACR<n> is accessible by Non-secure accesses.
• If frame CNTEL0BaseN is implemented, the CNTEL0ACR in frame CNTBaseN provides
additional control of accesses to frame CNTEL0BaseN.
Usage constraints 


This register is accessible as follows: 


Default 


RW 


In a system that implements both Secure and Non-secure states: 
• CNTACR<n> is always accessible by Secure accesses.
• CNTNSAR.NS<n> determines whether CNTACR<n> is accessible by Non-secure accesses.


Configurations 
The power domain of CNTACR<n> is IMPLEMENTATION DEFINED. 


On a reset of the reset domain in which it is implemented, RW fields in this register reset to 
UNKNOWN values. The register is not affected by a reset of any other reset domain. For more 
information see Power and reset domains for the system level implementation of the Generic Timer 
on page I1-5122.


Implemented only if the value of CNTTIDR.Frame<n> is 1. 


An implementation of the counters might not provide configurable access to some or all of the 
features. In this case, the associated field in the CNTACR<n> register is: 


• RAZ/WI if access is always denied.
• RAO/WI if access is always permitted.


Attributes 
CNTACR<n> is a 32-bit register. 


Field descriptions 


The CNTACR<n> bit assignments are: 


Bits    Field
[31:6]  RES0
[5]     RWPT
[4]     RWVT
[3]     RVOFF
[2]     RFRQ
[1]     RVCT
[0]     RPCT


Bits [31:6]


Reserved, RES0.


RWPT, bit [5] 


Read/write access to the EL1 Physical Timer registers CNTP_CVAL, CNTP_TVAL, and
CNTP_CTL, in frame <n>. The possible values of this bit are:
0 No access to the EL1 Physical Timer registers in frame <n>. The registers are RES0.
1 Read/write access to the EL1 Physical Timer registers in frame <n>.


RWVT, bit [4]


Read/write access to the Virtual Timer registers CNTV_CVAL, CNTV_TVAL, and CNTV_CTL, in
frame <n>. The possible values of this bit are:
0 No access to the Virtual Timer registers in frame <n>. The registers are RES0.
1 Read/write access to the Virtual Timer registers in frame <n>.


RVOFF, bit [3]


Read-only access to CNTVOFF, in frame <n>. The possible values of this bit are:
0 No access to CNTVOFF in frame <n>. The register is RES0.
1 Read-only access to CNTVOFF in frame <n>.


RFRQ, bit [2]


Read-only access to CNTFRQ, in frame <n>. The possible values of this bit are:
0 No access to CNTFRQ in frame <n>. The register is RES0.
1 Read-only access to CNTFRQ in frame <n>.


RVCT, bit [1]


Read-only access to CNTVCT, in frame <n>. The possible values of this bit are:
0 No access to CNTVCT in frame <n>. The register is RES0.
1 Read-only access to CNTVCT in frame <n>.


RPCT, bit [0]


Read-only access to CNTPCT, in frame <n>. The possible values of this bit are:
0 No access to CNTPCT in frame <n>. The register is RES0.
1 Read-only access to CNTPCT in frame <n>.


Accessing the CNTACR<n>: 


CNTACR<n> can be accessed through its memory-mapped interface: 














Component Frame Offset 
Timer CNTCTLBase 0x040+4n 
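
As an illustration only, the CNTACR<n> offset formula and bit assignments could be captured in C roughly as follows. This is a minimal sketch: the macro and function names are hypothetical and are not defined by the architecture. Only the offset 0x040+4n and the bit positions are taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* CNTACR<n> bit positions, taken from the field descriptions above.
   The macro names themselves are illustrative. */
#define CNTACR_RPCT   (1u << 0)  /* read-only access to CNTPCT */
#define CNTACR_RVCT   (1u << 1)  /* read-only access to CNTVCT */
#define CNTACR_RFRQ   (1u << 2)  /* read-only access to CNTFRQ */
#define CNTACR_RVOFF  (1u << 3)  /* read-only access to CNTVOFF */
#define CNTACR_RWVT   (1u << 4)  /* read/write access to the Virtual Timer registers */
#define CNTACR_RWPT   (1u << 5)  /* read/write access to the EL1 Physical Timer registers */

/* Offset of CNTACR<n> within the CNTCTLBase frame: 0x040 + 4n, for n = 0 to 7. */
static inline uint32_t cntacr_offset(unsigned n)
{
    assert(n <= 7);
    return 0x040u + 4u * n;
}

/* Hypothetical helper: nonzero if every access bit in 'mask' is set in 'cntacr'. */
static inline int cntacr_allows(uint32_t cntacr, uint32_t mask)
{
    return (cntacr & mask) == mask;
}
```

For example, cntacr_allows(value, CNTACR_RWPT | CNTACR_RPCT) tests whether a frame exposes both the physical timer registers and the physical count.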
13.5.2 CNTCR, Counter Control Register
The CNTCR characteristics are: 
Purpose 
Enables the counter, controls the counter frequency setting, and controls counter behavior during 
debug. 
Usage constraints 
This register is accessible as follows: 
Default 
RW 
In a system that implements both Secure and Non-secure states, this register is only writable by 
Secure accesses. 
Configurations 
The power domain of CNTCR is IMPLEMENTATION DEFINED. 
Some or all fields in this register have defined reset values. These apply only on a reset of the reset 
domain in which the register is implemented. The register is not affected by a reset of any other reset 
domain. For more information see Power and reset domains for the system level implementation of 
the Generic Timer on page I1-5122.
Attributes 
CNTCR is a 32-bit register. 
Field descriptions 
The CNTCR bit assignments are: 
Bits     Field
[31:18]  RES0
[17:8]   FCREQ
[7:2]    RES0
[1]      HDBG
[0]      EN
Bits [31:18]
Reserved, RES0.
FCREQ, bits [17:8] 
Frequency change request. Indicates the number of the entry in the Frequency modes table to select. 
Selecting an unimplemented entry, or an entry that contains 0, has no effect on the counter. 
The maximum number of entries in the Frequency modes table is IMPLEMENTATION DEFINED up to 
a maximum of 1004 entries, see The Frequency modes table on page I1-5125. An implementation 
is only required to implement an FCREQ field that can hold values from 0 to the highest supported 
Frequency modes table entry. Any unrequired most-significant bits of FCREQ can be implemented 
as RES0.
When this register has an architecturally-defined reset value, this field resets to 0.
Bits [7:2]
Reserved, RES0.
HDBG, bit [1]
Halt-on-debug. Controls whether a Halt-on-debug signal halts the system counter:
0 System counter ignores Halt-on-debug.
1 Asserted Halt-on-debug signal halts system counter update.


This field resets to a value that is architecturally UNKNOWN.


EN, bit [0]
Enables the counter:
0 System counter disabled.
1 System counter enabled.


When this register has an architecturally-defined reset value, this field resets to 0.


Accessing the CNTCR: 


CNTCR can be accessed through its memory-mapped interface: 














Component Frame Offset 
Timer CNTControlBase  0x000 
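
The CNTCR field layout can be sketched in C as below. This is an illustrative sketch, not part of the architecture: the macro names and the cntcr_value() helper are assumptions, and only the bit positions of EN, HDBG, and FCREQ come from the field descriptions above.

```c
#include <assert.h>
#include <stdint.h>

#define CNTCR_EN           (1u << 0)                      /* EN, bit [0] */
#define CNTCR_HDBG         (1u << 1)                      /* HDBG, bit [1] */
#define CNTCR_FCREQ_SHIFT  8
#define CNTCR_FCREQ_MASK   (0x3FFu << CNTCR_FCREQ_SHIFT)  /* FCREQ, bits [17:8] */

/* Hypothetical helper: build a CNTCR value from its three fields.
   'fcreq' is the number of the Frequency modes table entry to select. */
static inline uint32_t cntcr_value(unsigned fcreq, int halt_on_debug, int enable)
{
    assert(fcreq <= 0x3FFu);  /* FCREQ is a 10-bit field */
    uint32_t v = ((uint32_t)fcreq << CNTCR_FCREQ_SHIFT) & CNTCR_FCREQ_MASK;
    if (halt_on_debug)
        v |= CNTCR_HDBG;
    if (enable)
        v |= CNTCR_EN;
    return v;
}
```

The reserved bits are left as zero in the value that is built, consistent with their RES0 definition.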
13.5.3 CNTCV, Counter Count Value register
The CNTCV characteristics are: 


Purpose 


Indicates the current count value. 


Usage constraints 


This register is accessible as follows: 





Default
RW in CNTControlBase, RO in CNTReadBase

Frame           Accessibility
CNTControlBase  RW
CNTReadBase     RO





A write to CNTCV must be visible in the CNTPCT register of each running processor in a finite 
time. 


For the writable copy of the register: 
• If the counter is enabled, the effect of writing to the register is UNKNOWN.
• If the system implements both Secure and Non-secure states, the register is writable only by
Secure accesses.


In an implementation that supports 64-bit atomic memory accesses, this register must be accessible 
using a 64-bit atomic access. 


Configurations 
The power domain of CNTCV is IMPLEMENTATION DEFINED. 


On a reset of the reset domain in which an RW instance of this register is implemented, RW fields 
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset 
domain. For more information see Power and reset domains for the system level implementation of 
the Generic Timer on page I1-5122.


Attributes 
CNTCV is a 64-bit register. 


Field descriptions 


The CNTCV bit assignments are: 


63 0 


CountValue 


CountValue, bits [63:0] 


Indicates the counter value. 
Accessing the CNTCV:


CNTCV[31:0] can be accessed through its memory-mapped interface:





























Component  Frame           Offset
Timer      CNTControlBase  0x008
Timer      CNTReadBase     0x000

CNTCV[63:32] can be accessed through its memory-mapped interface:

Component  Frame           Offset
Timer      CNTControlBase  0x00C
Timer      CNTReadBase     0x004
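
Where 64-bit atomic accesses are not available, software commonly reads a 64-bit counter as two 32-bit words and retries if the upper word changed between reads. The following is a minimal sketch of that general technique applied to the CNTReadBase offsets above. The function name and the word-index macros are hypothetical, and the retry loop is a widely used software convention rather than something this section specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word indices of CNTCV in the CNTReadBase frame: byte offsets 0x000 and 0x004. */
#define CNTCV_LO_WORD 0u
#define CNTCV_HI_WORD 1u

/* Read the 64-bit count using two 32-bit accesses. If the upper word changes
   between the first and last read, the lower word wrapped, so read again. */
static inline uint64_t cntcv_read(const volatile uint32_t *read_base)
{
    assert(read_base != NULL);
    uint32_t hi, lo, hi2;
    do {
        hi  = read_base[CNTCV_HI_WORD];
        lo  = read_base[CNTCV_LO_WORD];
        hi2 = read_base[CNTCV_HI_WORD];
    } while (hi != hi2);
    return ((uint64_t)hi << 32) | lo;
}
```

Here read_base is assumed to point at a mapping of the CNTReadBase frame. An implementation that supports 64-bit atomic accesses can use a single 64-bit load instead.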
13.5.4 CNTEL0ACR, Counter-timer EL0 Access Control Register
The CNTEL0ACR characteristics are:
Purpose
An implementation of CNTEL0ACR in the frame at CNTBaseN controls whether the CNTPCT,
CNTVCT, CNTFRQ, EL1 Physical Timer, and Virtual Timer registers are visible in the frame at
CNTEL0BaseN.
Usage constraints
This register is accessible as follows:
Default
RW
If CNTEL0ACR is implemented, it is implemented as a RW register in the CNTBaseN frame when
the value of bit[0] of CNTTIDR.Frame<n> is 1, otherwise the encoding is RES0.
If CNTEL0ACR is not implemented:
• The register location is RAZ/WI.
• The registers CNTFRQ, CNTP_CTL, CNTP_CVAL, CNTP_TVAL, CNTPCT,
CNTV_CTL, CNTV_CVAL, CNTV_TVAL, and CNTVCT are not visible in the
corresponding CNTEL0BaseN frame.
Configurations
The power domain of CNTEL0ACR is IMPLEMENTATION DEFINED.
On a reset of the reset domain in which it is implemented, RW fields in this register reset to
UNKNOWN values. The register is not affected by a reset of any other reset domain. For more
information see Power and reset domains for the system level implementation of the Generic Timer
on page I1-5122.
Implementation of this register is OPTIONAL.
Attributes
CNTEL0ACR is a 32-bit register.
Field descriptions
The CNTEL0ACR bit assignments are:
Bits     Field
[31:10]  RES0
[9]      EL0PTEN
[8]      EL0VTEN
[7:2]    RES0
[1]      EL0VCTEN
[0]      EL0PCTEN
Bits [31:10]
Reserved, RES0.
EL0PTEN, bit [9]


Second view read/write access control for the EL1 Physical Timer registers. This bit controls
whether the CNTP_CVAL, CNTP_TVAL, and CNTP_CTL registers in the current CNTBaseN
frame are also accessible in the corresponding CNTEL0BaseN frame. The possible values of this
bit are:
0 No access. Registers are RES0 in the second view.
1 Access permitted. If the registers are accessible in the current frame then they are
accessible in the second view.


EL0VTEN, bit [8]


Second view read/write access control for the Virtual Timer registers. This bit controls whether the
CNTV_CVAL, CNTV_TVAL, and CNTV_CTL registers in the current CNTBaseN frame are also
accessible in the corresponding CNTEL0BaseN frame. The possible values of this bit are:
0 No access. Registers are RES0 in the second view.
1 Access permitted. If the registers are accessible in the current frame then they are
accessible in the second view.


The definition of this bit means that, if the Virtual Timer registers are not implemented in the current
CNTBaseN frame, then the Virtual Timer register addresses are RES0 in the corresponding
CNTEL0BaseN frame, regardless of the value of this bit.


Bits [7:2]
Reserved, RES0.


EL0VCTEN, bit [1]
Second view read access control for CNTVCT and CNTFRQ. The possible values of this bit are:
0 CNTVCT is not visible in the second view.
If EL0PCTEN is set to 0, CNTFRQ is not visible in the second view.
1 Access permitted. If CNTVCT and CNTFRQ are visible in the current frame then they
are visible in the second view.


EL0PCTEN, bit [0]
Second view read access control for CNTPCT and CNTFRQ. The possible values of this bit are:
0 CNTPCT is not visible in the second view.
If EL0VCTEN is set to 0, CNTFRQ is not visible in the second view.
1 Access permitted. If CNTPCT and CNTFRQ are visible in the current frame then they
are visible in the second view.


Accessing the CNTEL0ACR:


CNTEL0ACR can be accessed through its memory-mapped interface:

Component  Frame     Offset
Timer      CNTBaseN  0x014
13.5.5 CNTFID0, Counter Frequency ID

The CNTFID0 characteristics are: 

Purpose 
Indicates the base frequency of the system counter. 

Usage constraints 
This register is accessible as follows: 

Default 
RO or RW 

Configurations 
The power domain of CNTFID0 is IMPLEMENTATION DEFINED.
If this register is implemented as an RW register, on a reset of the reset domain in which it is
implemented, RW fields in this register reset to UNKNOWN values. The register is not affected by a
reset of any other reset domain. For more information see Power and reset domains for the system
level implementation of the Generic Timer on page I1-5122.
The possible frequencies for the system counter are stored in the Frequency modes table as 32-bit
words starting with the base frequency, CNTFID0, see The Frequency modes table on page I1-5125.
The final entry in the Frequency modes table must be followed by a 32-bit word of zero value, to 
mark the end of the table. 
Typically, the Frequency modes table will be in read-only memory. However, a system 
implementation might use read/write memory for the table, and initialize the table entries as part of 
its start-up sequence. 
If the Frequency modes table is in read/write memory, ARM strongly recommends that the table is 
not updated once the system is running. 

Attributes 
CNTFID0 is a 32-bit register. 

Field descriptions 

The CNTFID0 bit assignments are:

31         0
Frequency

Frequency, bits [31:0]
The base frequency of the system counter, in Hz.

Accessing the CNTFID0:

CNTFID0 can be accessed through its memory-mapped interface: 

Component Frame Offset 
Timer CNTControlBase 0x020 
13.5.6 CNTFID<n>, Counter Frequency IDs, n > 0
The CNTFID<n> characteristics are: 
Purpose 
Indicates alternative system counter update frequencies. 
Usage constraints 
This register is accessible as follows: 
Default 
RO or RW 
Configurations 
The power domain of CNTFID<n> is IMPLEMENTATION DEFINED. 
If this register is implemented as an RW register, on a reset of the reset domain in which it is 
implemented, RW fields in this register reset to UNKNOWN values. The register is not affected by a 
reset of any other reset domain. For more information see Power and reset domains for the system 
level implementation of the Generic Timer on page I1-5122.
The possible frequencies for the system counter are stored in the Frequency modes table as 32-bit
words starting with the base frequency, CNTFID0, see The Frequency modes table on page I1-5125.
The number of CNTFID<n> registers is IMPLEMENTATION DEFINED, and the only required
CNTFID<n> register is CNTFID0.
The final entry in the Frequency modes table must be followed by a 32-bit word of zero value, to
mark the end of the table.
Typically, the Frequency modes table will be in read-only memory. However, a system
implementation might use read/write memory for the table, and initialize the table entries as part of
its start-up sequence.
If the Frequency modes table is in read/write memory, ARM strongly recommends that the table is
not updated once the system is running.
Attributes 
CNTFID<n> is a 32-bit register. 
Field descriptions 
The CNTFID<n> bit assignments are: 
31         0
Frequency
Frequency, bits [31:0] 
A system counter update frequency, in Hz. Must be an exact divisor of the base frequency. ARM 
strongly recommends that all frequency values in the Frequency modes table are integer 
power-of-two divisors of the base frequency. 
When the system timer is operating at a lower frequency than the base frequency, the increment 
applied at each counter update is given by: 
increment = (base frequency) / (selected frequency) 
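
As a worked illustration of the increment formula and the zero-terminated Frequency modes table, consider the following C sketch. The function names and the example frequencies are assumptions made for illustration. Only the zero-word terminator, the exact-divisor requirement, and the increment formula come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the entries in a Frequency modes table. The table is a sequence of
   32-bit words and its final entry is followed by a word of zero value. */
static inline size_t freq_table_entries(const uint32_t *table)
{
    assert(table != NULL);
    size_t n = 0;
    while (table[n] != 0)
        n++;
    return n;
}

/* increment = (base frequency) / (selected frequency).
   Returns 0 if the selected frequency is not an exact divisor of the base
   frequency, which the field description does not permit. */
static inline uint32_t counter_increment(uint32_t base_hz, uint32_t selected_hz)
{
    assert(selected_hz != 0);
    if (base_hz % selected_hz != 0)
        return 0;
    return base_hz / selected_hz;
}
```

For a hypothetical table { 24000000, 12000000, 750000, 0 }, the table has three entries, and selecting the 750000 Hz entry gives an increment of 32 at each counter update. This matches the recommendation that entries are integer power-of-two divisors of the base frequency.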
Accessing the CNTFID<n>:


CNTFID<n> can be accessed through its memory-mapped interface: 














Component  Frame           Offset
Timer      CNTControlBase  0x020+4n
13.5.7 CNTFRQ, Counter-timer Frequency


The CNTFRQ characteristics are:


Purpose
This register is provided so that software can discover the frequency of the system counter. It must
be programmed with this value as part of system initialization. The value of the register is not
interpreted by hardware.


Usage constraints
This register is accessible as follows:

Frame        Accessibility
CNTCTLBase   RW
CNTBaseN     Config-RO
CNTEL0BaseN  Config-RO





For the CNTBaseN and CNTEL0BaseN frames:
• A RO copy of CNTFRQ is implemented in the CNTBaseN frame when both:
- CNTACR<n>.RFRQ is 1.
- Bit[0] of CNTTIDR.Frame<n> is 1.
Otherwise the encoding in CNTBaseN is RES0.
• When CNTFRQ is RO in the CNTBaseN frame, it is also RO in the CNTEL0BaseN frame
if bit[2] of CNTTIDR.Frame<n> is 1 and either:
- The value of CNTEL0ACR.EL0VCTEN is 1.
- The value of CNTEL0ACR.EL0PCTEN is 1.
Otherwise the CNTFRQ encoding in CNTEL0BaseN frame is RES0.
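
The visibility conditions above can be expressed as a small predicate. The following C sketch is illustrative only: the function names are hypothetical, and the inputs are assumed to be raw values read from CNTTIDR, CNTACR<n>, and CNTEL0ACR for the same frame n.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 4-bit Frame<n> field, bits [4n+3:4n], from a CNTTIDR value. */
static inline uint32_t cnttidr_frame(uint32_t cnttidr, unsigned n)
{
    assert(n <= 7);
    return (cnttidr >> (4u * n)) & 0xFu;
}

/* CNTFRQ is RO in CNTBaseN when CNTACR<n>.RFRQ (bit [2]) is 1 and
   bit[0] of CNTTIDR.Frame<n> is 1. */
static inline int cntfrq_in_base(uint32_t cnttidr, uint32_t cntacr, unsigned n)
{
    return ((cntacr >> 2) & 1u) && (cnttidr_frame(cnttidr, n) & 1u);
}

/* CNTFRQ is also RO in CNTEL0BaseN when bit[2] of CNTTIDR.Frame<n> is 1 and
   either EL0VCTEN (bit [1]) or EL0PCTEN (bit [0]) of CNTEL0ACR is 1. */
static inline int cntfrq_in_el0_base(uint32_t cnttidr, uint32_t cntacr,
                                     uint32_t cntel0acr, unsigned n)
{
    return cntfrq_in_base(cnttidr, cntacr, n)
        && ((cnttidr_frame(cnttidr, n) >> 2) & 1u)
        && ((cntel0acr & 0x3u) != 0);
}
```

When either predicate returns zero, the corresponding CNTFRQ encoding is RES0 in that frame.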


In a system that implements both Secure and Non-secure states, this register is only accessible by
Secure accesses.


Configurations 
The power domain of CNTFRQ is IMPLEMENTATION DEFINED. 

On a reset of the reset domain in which an RW instance of this register is implemented, RW fields 
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset 
domain. For more information see Power and reset domains for the system level implementation of 
the Generic Timer on page I1-5122.


Attributes 


CNTFRQ is a 32-bit register. 


Field descriptions 


The CNTFRQ bit assignments are:

31               0
Clock frequency


Bits [31:0]
Clock frequency. Indicates the system counter clock frequency, in Hz.
Accessing the CNTFRQ:


CNTFRQ can be accessed through its memory-mapped interface: 




















Component  Frame        Offset
Timer      CNTBaseN     0x010
Timer      CNTEL0BaseN  0x010
Timer      CNTCTLBase   0x000
13.5.8 CNTNSAR, Counter-timer Non-secure Access Register
The CNTNSAR characteristics are:
Purpose
Provides the highest-level control of whether frames CNTBaseN and CNTEL0BaseN are accessible
by Non-secure accesses.
Usage constraints 
This register is accessible as follows: 
Default 
RW 
In a system that implements both Secure and Non-secure states, this register is only accessible by 
Secure accesses. 
Configurations 
The power domain of CNTNSAR is IMPLEMENTATION DEFINED. 
On a reset of the reset domain in which it is implemented, RW fields in this register reset to 
UNKNOWN values. The register is not affected by a reset of any other reset domain. For more 
information see Power and reset domains for the system level implementation of the Generic Timer 
on page I1-5122. 
Attributes 
CNTNSAR is a 32-bit register. 
Field descriptions 
The CNTNSAR bit assignments are: 
Bits    Field
[31:8]  RES0
[7]     NS7
[6]     NS6
[5]     NS5
[4]     NS4
[3]     NS3
[2]     NS2
[1]     NS1
[0]     NS0
Bits [31:8]
Reserved, RES0.
NS<n>, bit [n], for n = 0 to 7
Non-secure access to frame n. The possible values of this bit are:
0 Secure access only. Behaves as RES0 to Non-secure accesses.
1 Secure and Non-secure accesses permitted.
This bit also determines whether, in the CNTCTLBase frame, CNTACR<n> and CNTVOFF<n> are
accessible to Non-secure accesses.
If frame CNTBase<n>:
• Is not implemented, then NS<n> is RES0.
• Is not Configurable access, and is accessible only by Secure accesses, then NS<n> is RES0.
• Is not Configurable access, and is accessible by both Secure and Non-secure accesses, then
NS<n> is RES1.


Accessing the CNTNSAR: 


CNTNSAR can be accessed through its memory-mapped interface: 














Component Frame Offset 
Timer CNTCTLBase  0x004 
13.5.9 CNTP_CTL, Counter-timer Physical Timer Control


The CNTP_CTL characteristics are:


Purpose
Control register for the physical timer.


Usage constraints
This register is accessible as follows:

Frame        Accessibility
CNTBaseN     Config-RW
CNTEL0BaseN  Config-RW

CNTP_CTL is implemented as a RW register in the CNTBaseN frame when both:
• The value of CNTACR<n>.RWPT is 1.
• Bit[0] of CNTTIDR.Frame<n> is 1.
Otherwise the encoding in the CNTBaseN frame is RES0.


When CNTP_CTL is implemented as a RW register in the CNTBaseN frame, it is also implemented
as a RW register in the CNTEL0BaseN frame if both:
• The value of CNTEL0ACR.EL0PTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.
Otherwise the CNTP_CTL register in the CNTEL0BaseN frame is RES0.


Configurations
The power domain of CNTP_CTL is IMPLEMENTATION DEFINED.


On a reset of the reset domain in which an RW instance of this register is implemented, RW fields
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset
domain. For more information see Power and reset domains for the system level implementation of
the Generic Timer on page I1-5122.


Attributes
CNTP_CTL is a 32-bit register.


Field descriptions


The CNTP_CTL bit assignments are:

Bits    Field
[31:3]  RES0
[2]     ISTATUS
[1]     IMASK
[0]     ENABLE


Bits [31:3]
Reserved, RES0.
ISTATUS, bit [2]
The status of the timer. This bit indicates whether the timer condition is met: 
0 Timer condition is not met. 
1 Timer condition is met. 


When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met. 
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the 
value of IMASK is 0 then the timer interrupt is asserted. 


When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN. 


For more information see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.


This bit is read-only. 


IMASK, bit [1] 
Timer interrupt mask bit. Permitted values are: 
0 Timer interrupt is not masked by the IMASK bit. 
1 Timer interrupt is masked by the IMASK bit. 


For more information, see the description of the ISTATUS bit. 


ENABLE, bit [0] 
Enables the timer. Permitted values are: 
0 Timer disabled.
1 Timer enabled. 


Setting this bit to 0 disables the timer output signal, but the timer value accessible from 
CNTP_TVAL continues to count down. 


Note


Disabling the output signal might be a power-saving option. 





Accessing the CNTP_CTL: 


CNTP_CTL can be accessed through its memory-mapped interface: 

















Component  Frame        Offset
Timer      CNTBaseN     0x02C
Timer      CNTEL0BaseN  0x02C
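
The interaction of ENABLE, IMASK, and ISTATUS can be summarized in a short C sketch. The macro and function names below are hypothetical. The logic follows the field descriptions above: the interrupt is asserted when ISTATUS is 1 and IMASK is 0, and ISTATUS is only meaningful when ENABLE is 1.

```c
#include <assert.h>
#include <stdint.h>

#define CNTP_CTL_ENABLE   (1u << 0)  /* ENABLE, bit [0] */
#define CNTP_CTL_IMASK    (1u << 1)  /* IMASK, bit [1] */
#define CNTP_CTL_ISTATUS  (1u << 2)  /* ISTATUS, bit [2]; read-only, UNKNOWN when ENABLE is 0 */

/* Nonzero when a CNTP_CTL value indicates an asserted timer interrupt:
   ENABLE is 1, ISTATUS is 1, and IMASK is 0. Bits [31:3] are ignored. */
static inline int cntp_ctl_irq_asserted(uint32_t ctl)
{
    const uint32_t fields = CNTP_CTL_ENABLE | CNTP_CTL_IMASK | CNTP_CTL_ISTATUS;
    return (ctl & fields) == (CNTP_CTL_ENABLE | CNTP_CTL_ISTATUS);
}
```

ENABLE is checked explicitly because ISTATUS is UNKNOWN while the timer is disabled, so a set ISTATUS bit alone is not evidence of a met timer condition.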
13.5.10 CNTP_CVAL, Counter-timer Physical Timer CompareValue
The CNTP_CVAL characteristics are: 


Purpose 


Holds the 64-bit compare value for the EL1 physical timer. 


Usage constraints 


This register is accessible as follows: 





Frame        Accessibility
CNTBaseN     Config-RW
CNTEL0BaseN  Config-RW

CNTP_CVAL is implemented as a RW register in the CNTBaseN frame when both:
• The value of CNTACR<n>.RWPT is 1.
• Bit[0] of CNTTIDR.Frame<n> is 1.
Otherwise the encoding in the CNTBaseN frame is RES0.


When CNTP_CVAL is implemented as a RW register in the CNTBaseN frame, it is also
implemented as a RW register in the CNTEL0BaseN frame if both:
• The value of CNTEL0ACR.EL0PTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.
Otherwise the CNTP_CVAL register in the CNTEL0BaseN frame is RES0.
If the implementation supports 64-bit atomic accesses, then the CNTP_CVAL register must be 
accessible as an atomic 64-bit value. 
Configurations 
The power domain of CNTP_CVAL is IMPLEMENTATION DEFINED. 


On a reset of the reset domain in which an RW instance of this register is implemented, RW fields 
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset 
domain. For more information see Power and reset domains for the system level implementation of 
the Generic Timer on page I1-5122.


Attributes 
CNTP_CVAL is a 64-bit register. 


Field descriptions 


The CNTP_CVAL bit assignments are:

63                                0
EL1 physical timer compare value


Bits [63:0]
EL1 physical timer compare value.
Accessing the CNTP_CVAL:


CNTP_CVAL[31:0] can be accessed through its memory-mapped interface: 











Component  Frame        Offset
Timer      CNTBaseN     0x020
Timer      CNTEL0BaseN  0x020





CNTP_CVAL[63:32] can be accessed through its memory-mapped interface: 

















Component  Frame        Offset
Timer      CNTBaseN     0x024
Timer      CNTEL0BaseN  0x024
13.5.11 CNTP_TVAL, Counter-timer Physical Timer TimerValue


The CNTP_TVAL characteristics are:


Purpose
Holds the timer value for the EL1 physical timer. This provides a 32-bit downcounter.


Usage constraints
This register is accessible as follows:

Frame        Accessibility
CNTBaseN     Config-RW
CNTEL0BaseN  Config-RW

CNTP_TVAL is implemented as a RW register in the CNTBaseN frame when both:
• The value of CNTACR<n>.RWPT is 1.
• Bit[0] of CNTTIDR.Frame<n> is 1.
Otherwise the encoding in the CNTBaseN frame is RES0.


When CNTP_TVAL is implemented as a RW register in the CNTBaseN frame, it is also
implemented as RW in the CNTEL0BaseN frame if both:
• The value of CNTEL0ACR.EL0PTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.
Otherwise the CNTP_TVAL register in the CNTEL0BaseN frame is RES0.


Configurations
The power domain of CNTP_TVAL is IMPLEMENTATION DEFINED.


On a reset of the reset domain in which an RW instance of this register is implemented, RW fields
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset
domain. For more information see Power and reset domains for the system level implementation of
the Generic Timer on page I1-5122.


Attributes
CNTP_TVAL is a 32-bit register.


Field descriptions


The CNTP_TVAL bit assignments are:

31                        0
EL1 physical timer value


Bits [31:0]
EL1 physical timer value.
Accessing the CNTP_TVAL:


CNTP_TVAL can be accessed through its memory-mapped interface: 

















Component  Frame        Offset
Timer      CNTBaseN     0x028
Timer      CNTEL0BaseN  0x028
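
The relationship between the TimerValue and CompareValue views is defined in the timer operation sections referenced from the ISTATUS description, not in this register description. As a rough sketch, assuming the usual definition in which a read of the TimerValue returns the low 32 bits of (CompareValue - count) and a write sets CompareValue to the count plus the sign-extended 32-bit value, the two views could be modeled in C as follows. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* TimerValue view on a read: the low 32 bits of (CompareValue - count). */
static inline uint32_t tval_from_cval(uint64_t cval, uint64_t count)
{
    return (uint32_t)(cval - count);
}

/* TimerValue view on a write: CompareValue = count + SignExtend(TimerValue).
   The sign extension is done with explicit bit operations so that the result
   does not depend on implementation-defined signed conversions. */
static inline uint64_t cval_from_tval(uint32_t tval, uint64_t count)
{
    uint64_t ext = tval;
    if (tval & 0x80000000u)
        ext |= 0xFFFFFFFF00000000ull;
    return count + ext;
}
```

With a count of 900 and a CompareValue of 1000 the TimerValue reads as 100. Once the count passes the CompareValue the TimerValue reads as a negative two's complement number, which is how a 32-bit downcounter appears to software.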
13.5.12 CNTPCT, Counter-timer Physical Count
The CNTPCT characteristics are: 


Purpose 


Holds the 64-bit physical count value. 


Usage constraints 


This register is accessible as follows: 





Frame        Accessibility
CNTBaseN     Config-RO
CNTEL0BaseN  Config-RO

CNTPCT is implemented as a RO register in the CNTBaseN frame when both:
• The value of CNTACR<n>.RPCT is 1.
• Bit[0] of CNTTIDR.Frame<n> is 1.
Otherwise the encoding in the CNTBaseN frame is RES0.


When CNTPCT is implemented as a RO register in the CNTBaseN frame, it is also implemented as
a RO register in the CNTEL0BaseN frame if both:
• The value of CNTEL0ACR.EL0PCTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.
Otherwise the CNTPCT register in the CNTEL0BaseN frame is RES0.


If the implementation supports 64-bit atomic accesses, then the CNTPCT register must be 
accessible as an atomic 64-bit value. 


Configurations 
The power domain of CNTPCT is IMPLEMENTATION DEFINED. 


For more information see Power and reset domains for the system level implementation of the 
Generic Timer on page I1-5122.


Attributes 
CNTPCT is a 64-bit register. 


Field descriptions 


The CNTPCT bit assignments are: 


63 0 


Physical count value 


Bits [63:0] 


Physical count value. 
Accessing the CNTPCT:


CNTPCT[31:0] can be accessed through its memory-mapped interface: 











Component  Frame        Offset
Timer      CNTBaseN     0x000
Timer      CNTEL0BaseN  0x000





CNTPCT[63:32] can be accessed through its memory-mapped interface: 

















Component  Frame        Offset
Timer      CNTBaseN     0x004
Timer      CNTEL0BaseN  0x004
13.5.13 CNTSR, Counter Status Register
The CNTSR characteristics are: 
Purpose 
Provides counter frequency status information. 
Usage constraints 
This register is accessible as follows: 
Default 
RO 
Configurations 
The power domain of CNTSR is IMPLEMENTATION DEFINED. 
Some or all fields in this register have defined reset values. These apply only on a reset of the reset 
domain in which the register is implemented. The register is not affected by a reset of any other reset 
domain. For more information see Power and reset domains for the system level implementation of 
the Generic Timer on page I1-5122.
Attributes 
CNTSR is a 32-bit register. 
Field descriptions 
The CNTSR bit assignments are: 
Bits    Field
[31:8]  FCACK
[7:2]   RES0
[1]     DBGH
[0]     RES0
FCACK, bits [31:8]
Frequency change acknowledge. Indicates the currently selected entry in the Frequency modes
table, see The Frequency modes table on page I1-5125.
When this register has an architecturally-defined reset value, this field resets to 0.
Bits [7:2]
Reserved, RES0.
DBGH, bit [1]
Indicates whether the counter is halted because the Halt-on-Debug signal is asserted:
0 Counter is not halted.
1 Counter is halted.
This field resets to a value that is architecturally UNKNOWN.
Bit [0]
Reserved, RES0.
Accessing the CNTSR:


CNTSR can be accessed through its memory-mapped interface: 














Component Frame Offset 
Timer CNTControlBase  0x004 
13.5.14 CNTTIDR, Counter-timer Timer ID Register
The CNTTIDR characteristics are: 


Purpose 


Indicates the implemented timers in the memory map, and their features. For each value of N from 
0 to 7 it indicates whether: 


° Frame CNTBaseN is a view of an implemented timer. 

° Frame CNTBaseN has a second view, CNTELOBaseN. 

° Frame CNTBaseN has a virtual timer capability. 
Usage constraints 


This register is accessible as follows: 


Default 


RO 


This register is accessible by both Secure and Non-secure accesses. 


Configurations 
The power domain of CNTTIDR is IMPLEMENTATION DEFINED. 


For more information see Power and reset domains for the system level implementation of the 
Generic Timer on page 11-5122. 


Attributes 
CNTTIDR is a 32-bit register. 


Field descriptions 


The CNTTIDR bit assignments are: 


28 27 24 23 20 19 1615 12 11 


Frame<n>, bits [4n+3:4n], for n = 0 to7 
A 4-bit field indicating the features of frame CNTBase<n>. 
Bit[3] of the field is RESO. 


Bit[2] indicates whether frame CNTBase<n> has a second view, CNTELOBase<n>. The possible 
values of this bit are: 





Bit[2] Meaning 





0 Frame <n> does not have a second view. CNTELOBase<n> is RESO. 





1 Frame <n> has a second view, CNTELOBase<n>. 





If bit[O] is 0, bit[2] is RESO. 
Bit[1] indicates whether both: 


° Frame CNTBase<n> implements the virtual timer registers CNTV_CVAL, CNTV_TVAL, 
and CNTV_CTL. 
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. This CNTCTLBase frame implements the virtual timer offset register CNTVOFF<n>. 


The possible values of bit[1] are: 





Bit[1] Meaning 





0 Frame <n> does not have virtual capability. The virtual time and offset registers are RESO. 





1 Frame <n> has virtual capability. The virtual time and offset registers are implemented. 





If bit{0] is 0, bit{1] is RESO. 


Bit[0] indicates whether frame CNTBase<n> is implemented. The possible values of this bit are: 





Bit[0] Meaning 





0 Frame <n> not implemented. All registers associated with the frame are RESO. 





1 Frame <n> is implemented. 
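The Frame<n> encoding described above can be illustrated with a short Python sketch. This is an illustration only, not part of the architecture: the names `FrameInfo` and `decode_cnttidr` are hypothetical, and the sketch assumes that software has already read the 32-bit CNTTIDR value from offset 0x008 of the CNTCTLBase frame.

```python
from typing import List, NamedTuple


class FrameInfo(NamedTuple):
    implemented: bool      # Frame<n> bit[0]
    virtual_capable: bool  # Frame<n> bit[1]
    has_el0_view: bool     # Frame<n> bit[2]


def decode_cnttidr(value: int) -> List[FrameInfo]:
    """Split a 32-bit CNTTIDR value into eight Frame<n> descriptors."""
    frames = []
    for n in range(8):
        field = (value >> (4 * n)) & 0xF  # Frame<n> occupies bits [4n+3:4n]
        implemented = bool(field & 0b001)
        # Bits [2:1] are RES0 when bit[0] is 0, so they are only honoured
        # for an implemented frame. Bit[3] is RES0 and is ignored.
        virtual_capable = implemented and bool(field & 0b010)
        has_el0_view = implemented and bool(field & 0b100)
        frames.append(FrameInfo(implemented, virtual_capable, has_el0_view))
    return frames
```

For example, a value of 0x37 would describe frame 0 as implemented with virtual capability and a CNTEL0Base view, frame 1 as implemented with virtual capability only, and frames 2 to 7 as not implemented.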





Accessing the CNTTIDR:

CNTTIDR can be accessed through its memory-mapped interface:

Component   Frame        Offset
Timer       CNTCTLBase   0x008
I3.5.15 CNTV_CTL, Counter-timer Virtual Timer Control
The CNTV_CTL characteristics are:
Purpose
Control register for the virtual timer.
Usage constraints
This register is accessible as follows:

Frame         Accessibility
CNTBaseN      Config-RW
CNTEL0BaseN   Config-RW

If CNTV_CTL is implemented, it is implemented as a RW register in the CNTBaseN frame when
both:
• The value of CNTACR<n>.RWVT is 1.
• Bits[1:0] of CNTTIDR.Frame<n> are 1.
Otherwise the encoding in the CNTBaseN frame is RES0.
When CNTV_CTL is implemented as a RW register in the CNTBaseN frame, it is also implemented
as a RW register in the CNTEL0BaseN frame if both:
• The value of CNTEL0ACR.EL0VTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.
Otherwise the CNTV_CTL register in the CNTEL0BaseN frame is RES0.
If CNTV_CTL is not implemented, the register location is RAZ/WI.
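The accessibility rules above can be summarized as a small decision function. The following Python sketch is illustrative only. The function name and parameters are hypothetical, and it reads "Bits[1:0] of CNTTIDR.Frame<n> are 1" as meaning that both bits are set, which is an interpretation rather than something this section states explicitly.

```python
from typing import Tuple


def cntv_ctl_access(frame_field: int, rwvt: int, el0vten: int,
                    register_implemented: bool = True) -> Tuple[str, str]:
    """Return the (CNTBaseN, CNTEL0BaseN) behavior of CNTV_CTL.

    frame_field is the 4-bit CNTTIDR.Frame<n> value, rwvt is
    CNTACR<n>.RWVT and el0vten is CNTEL0ACR.EL0VTEN.
    """
    if not register_implemented:
        # The register is OPTIONAL; an unimplemented location is RAZ/WI.
        return ("RAZ/WI", "RAZ/WI")
    # Assumption: "Bits[1:0] are 1" means both bits are set.
    base_rw = rwvt == 1 and (frame_field & 0b011) == 0b011
    # The EL0 view can only be RW when the CNTBaseN view is RW.
    el0_rw = base_rw and el0vten == 1 and (frame_field & 0b100) != 0
    return ("RW" if base_rw else "RES0", "RW" if el0_rw else "RES0")
```

The same conditions are stated for CNTV_CVAL and CNTV_TVAL, so the sketch would apply to those registers unchanged.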
Configurations
The power domain of CNTV_CTL is IMPLEMENTATION DEFINED.
On a reset of the reset domain in which an RW instance of this register is implemented, RW fields
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset
domain. For more information see Power and reset domains for the system level implementation of
the Generic Timer on page I1-5122.
Implementation of this register is OPTIONAL.
Attributes
CNTV_CTL is a 32-bit register.
Field descriptions
The CNTV_CTL bit assignments are:

Bits    31:3    2         1       0
Field   RES0    ISTATUS   IMASK   ENABLE

Bits [31:3]
Reserved, RES0.

ISTATUS, bit [2]
The status of the timer. This bit indicates whether the timer condition is met:
0 Timer condition is not met.
1 Timer condition is met.

When the value of the ENABLE bit is 1, ISTATUS indicates whether the timer condition is met.
ISTATUS takes no account of the value of the IMASK bit. If the value of ISTATUS is 1 and the
value of IMASK is 0 then the timer interrupt is asserted.

When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN.

For more information see Operation of the CompareValue views of the timers on page D6-1884 and
Operation of the TimerValue views of the timers on page D6-1885.

This bit is read-only.

IMASK, bit [1]
Timer interrupt mask bit. Permitted values are:
0 Timer interrupt is not masked by the IMASK bit.
1 Timer interrupt is masked by the IMASK bit.

For more information, see the description of the ISTATUS bit.

ENABLE, bit [0]
Enables the timer. Permitted values are:
0 Timer disabled.
1 Timer enabled.

Setting this bit to 0 disables the timer output signal, but the timer value accessible from
CNTV_TVAL continues to count down.

Note
Disabling the output signal might be a power-saving option.
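As an illustration of how the three CNTV_CTL fields combine, the following Python sketch decodes a register value and derives whether the timer interrupt would be asserted. The function names are hypothetical. The sketch treats the output as deasserted whenever ENABLE is 0, because ISTATUS is UNKNOWN and the timer output signal is disabled in that case.

```python
from typing import Dict


def decode_cntv_ctl(value: int) -> Dict[str, int]:
    """Extract the ENABLE, IMASK and ISTATUS fields; bits [31:3] are RES0."""
    return {
        "ENABLE": value & 1,          # bit [0]
        "IMASK": (value >> 1) & 1,    # bit [1]
        "ISTATUS": (value >> 2) & 1,  # bit [2], read-only
    }


def timer_interrupt_asserted(value: int) -> bool:
    """True when ISTATUS is 1 and IMASK is 0 on an enabled timer."""
    fields = decode_cntv_ctl(value)
    if fields["ENABLE"] == 0:
        # ISTATUS is UNKNOWN when ENABLE is 0, and the output is disabled.
        return False
    return fields["ISTATUS"] == 1 and fields["IMASK"] == 0
```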





Accessing the CNTV_CTL:

CNTV_CTL can be accessed through its memory-mapped interface:

Component   Frame         Offset
Timer       CNTBaseN      0x03C
Timer       CNTEL0BaseN   0x03C
I3.5.16 CNTV_CVAL, Counter-timer Virtual Timer CompareValue
The CNTV_CVAL characteristics are:

Purpose

Holds the 64-bit compare value for the virtual timer.

Usage constraints

This register is accessible as follows:

Frame         Accessibility
CNTBaseN      Config-RW
CNTEL0BaseN   Config-RW

If CNTV_CVAL is implemented, it is implemented as a RW register in the CNTBaseN frame when
both:

• The value of CNTACR<n>.RWVT is 1.
• Bits[1:0] of CNTTIDR.Frame<n> are 1.

Otherwise the encoding in the CNTBaseN frame is RES0.

When CNTV_CVAL is implemented as a RW register in the CNTBaseN frame, it is also
implemented as a RW register in the CNTEL0BaseN frame if both:

• The value of CNTEL0ACR.EL0VTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.

Otherwise the CNTV_CVAL register in the CNTEL0BaseN frame is RES0.
If CNTV_CVAL is not implemented, the register location is RAZ/WI.
If the implementation supports 64-bit atomic accesses, then the CNTV_CVAL register must be
accessible as an atomic 64-bit value.

Configurations
The power domain of CNTV_CVAL is IMPLEMENTATION DEFINED.

On a reset of the reset domain in which an RW instance of this register is implemented, RW fields
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset
domain. For more information see Power and reset domains for the system level implementation of
the Generic Timer on page I1-5122.

Implementation of this register is OPTIONAL.

Attributes
CNTV_CVAL is a 64-bit register.

Field descriptions

The CNTV_CVAL bit assignments are:

Bits    63:0
Field   Virtual timer compare value

Bits [63:0]
Virtual timer compare value.
Accessing the CNTV_CVAL:

CNTV_CVAL[31:0] can be accessed through its memory-mapped interface:

Component   Frame         Offset
Timer       CNTBaseN      0x030
Timer       CNTEL0BaseN   0x030

CNTV_CVAL[63:32] can be accessed through its memory-mapped interface:

Component   Frame         Offset
Timer       CNTBaseN      0x034
Timer       CNTEL0BaseN   0x034
I3.5.17 CNTV_TVAL, Counter-timer Virtual Timer TimerValue
The CNTV_TVAL characteristics are:

Purpose

Holds the timer value for the virtual timer.

Usage constraints

This register is accessible as follows:

Frame         Accessibility
CNTBaseN      Config-RW
CNTEL0BaseN   Config-RW

If CNTV_TVAL is implemented, it is implemented as a RW register in the CNTBaseN frame when
both:

• The value of CNTACR<n>.RWVT is 1.
• Bits[1:0] of CNTTIDR.Frame<n> are 1.

Otherwise the encoding in the CNTBaseN frame is RES0.

When CNTV_TVAL is implemented as a RW register in the CNTBaseN frame, it is also
implemented as a RW register in the CNTEL0BaseN frame if both:

• The value of CNTEL0ACR.EL0VTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.

Otherwise the CNTV_TVAL register in the CNTEL0BaseN frame is RES0.
If CNTV_TVAL is not implemented, the register location is RAZ/WI.

Configurations
The power domain of CNTV_TVAL is IMPLEMENTATION DEFINED.

On a reset of the reset domain in which an RW instance of this register is implemented, RW fields
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset
domain. For more information see Power and reset domains for the system level implementation of
the Generic Timer on page I1-5122.

Implementation of this register is OPTIONAL.

Attributes
CNTV_TVAL is a 32-bit register.

Field descriptions

The CNTV_TVAL bit assignments are:

Bits    31:0
Field   Virtual timer value

Bits [31:0]
Virtual timer value.
Accessing the CNTV_TVAL:

CNTV_TVAL can be accessed through its memory-mapped interface:

Component   Frame         Offset
Timer       CNTBaseN      0x038
Timer       CNTEL0BaseN   0x038
I3.5.18 CNTVCT, Counter-timer Virtual Count
The CNTVCT characteristics are:

Purpose

Holds the 64-bit virtual count value.

Usage constraints

This register is accessible as follows:

Frame         Accessibility
CNTBaseN      Config-RO
CNTEL0BaseN   Config-RO

CNTVCT is implemented as a RO register in the CNTBaseN frame when both:

• The value of CNTACR<n>.RVCT is 1.
• Bit[0] of CNTTIDR.Frame<n> is 1.

Otherwise the encoding in the CNTBaseN frame is RES0.

When CNTVCT is implemented as a RO register in the CNTBaseN frame, it is also implemented
as a RO register in the CNTEL0BaseN frame if both:

• The value of CNTEL0ACR.EL0VCTEN is 1.
• Bit[2] of CNTTIDR.Frame<n> is 1.

Otherwise the CNTVCT register in the CNTEL0BaseN frame is RES0.

If the implementation supports 64-bit atomic accesses, then the CNTVCT register must be
accessible as an atomic 64-bit value.

Configurations
The power domain of CNTVCT is IMPLEMENTATION DEFINED.

For more information see Power and reset domains for the system level implementation of the
Generic Timer on page I1-5122.

Attributes
CNTVCT is a 64-bit register.

Field descriptions

The CNTVCT bit assignments are:

Bits    63:0
Field   Virtual count value

Bits [63:0]
Virtual count value.
Accessing the CNTVCT:

CNTVCT[31:0] can be accessed through its memory-mapped interface:

Component   Frame         Offset
Timer       CNTBaseN      0x008
Timer       CNTEL0BaseN   0x008

CNTVCT[63:32] can be accessed through its memory-mapped interface:

Component   Frame         Offset
Timer       CNTBaseN      0x00C
Timer       CNTEL0BaseN   0x00C
I3.5.19 CNTVOFF{<n>}, Counter-timer Virtual Offset
The CNTVOFF{<n>} characteristics are:

Purpose
Holds the 64-bit virtual offset.

Usage constraints

This register is accessible as follows:

Frame         Mnemonic                     Accessibility
CNTCTLBase    CNTVOFF<n>, n = 1 to 7       Config-RW
CNTBaseN      CNTVOFF                      Config-RO
CNTEL0BaseN   CNTVOFF                      Config-RO

In an implementation that supports 64-bit atomic accesses, the CNTVOFF{<n>} register must
be accessible as an atomic 64-bit value.

For each implemented CNTBaseN frame, it is IMPLEMENTATION DEFINED whether the timer
distinguishes between real time and virtual time.

If the frame<n> timer supports this distinction and bits[1:0] of CNTTIDR.Frame<n> are 1,
CNTVOFF<n> is implemented as a RW register in the CNTCTLBase frame. When bit[1] of
CNTTIDR.Frame<n> is 0, the encoding in the CNTCTLBase frame is RES0.

In a system that implements both Secure and Non-secure states:

• CNTVOFF<n> is always accessible by Secure accesses.
• CNTNSAR.NS<n> determines whether CNTVOFF<n> is accessible by Non-secure
  accesses.

If CNTVOFF<n> is not implemented, the register location is RAZ/WI.

CNTVOFF is implemented as a RO register in the CNTBaseN frame when the value of
CNTACR<n>.RVOFF is 1, otherwise the encoding in the CNTBaseN frame is RES0.

If CNTVOFF is implemented as a RO register in the CNTBaseN frame, and bit[2] of
CNTTIDR.Frame<n> is 1, CNTVOFF is also implemented as a RO register in the CNTEL0BaseN
frame. When CNTVOFF is not implemented in the CNTBaseN frame, or when bit[2] of
CNTTIDR.Frame<n> is 0, the encoding in CNTEL0BaseN is RES0.

Configurations
The power domain of CNTVOFF is IMPLEMENTATION DEFINED.

On a reset of the reset domain in which an RW instance of this register is implemented, RW fields
in the register reset to UNKNOWN values. The register is not affected by a reset of any other reset
domain. For more information see Power and reset domains for the system level implementation of
the Generic Timer on page I1-5122.

Implementation of CNTVOFF<n> is OPTIONAL.
CNTACR<n>.RVOFF enables access to this register for frame CNTBase<n>.

Attributes
CNTVOFF{<n>} is a 64-bit register.

Field descriptions

The CNTVOFF{<n>} bit assignments are:

Bits    63:0
Field   Virtual offset

Bits [63:0]
Virtual offset.


Accessing the CNTVOFF{<n>}:

CNTVOFF{<n>} can be accessed through the memory-mapped interface:

Component   Mnemonic             Frame        Offset
Timer       CNTVOFF<n>[31:0]     CNTCTLBase   0x080 + 8n
Timer       CNTVOFF<n>[63:32]    CNTCTLBase   0x084 + 8n
Timer       CNTVOFF[31:0]        CNTBaseN     0x018
Timer       CNTVOFF[63:32]       CNTBaseN     0x01C
I3.5.20 CounterID<n>, Counter ID registers, n = 0 - 11
The CounterID<n> characteristics are:

Purpose

IMPLEMENTATION DEFINED identification registers 0 to 11 for the memory-mapped Generic Timer.

Usage constraints

This register is accessible as follows:

Default
RO

These registers are accessible by both Secure and Non-secure accesses.

Configurations
The power domain of CounterID<n> is IMPLEMENTATION DEFINED.

For more information see Power and reset domains for the system level implementation of the
Generic Timer on page I1-5122.

These registers are implemented independently in each of the frames accessed through the different
memory maps.

If the implementation of the Counter ID registers requires an architecture version, the value for this
version of the ARM Generic Timer is version 0.

The Counter ID registers can be implemented as a set of CoreSight ID registers, comprising
Peripheral ID Registers and Component ID Registers. An implementation of these registers for the
Generic Timer must use a Component class value of 0xF.

Attributes
CounterID<n> is a 32-bit register.

Field descriptions

The CounterID<n> bit assignments are:

Bits    31:0
Field   IMPLEMENTATION DEFINED

IMPLEMENTATION DEFINED, bits [31:0]
IMPLEMENTATION DEFINED.

Accessing the CounterID<n>:

CounterID<n> can be accessed through its memory-mapped interface:

Component   Frame            Offset
Timer       CNTControlBase   0xFD0 + 4n
Timer       CNTReadBase      0xFD0 + 4n
Timer       CNTBaseN         0xFD0 + 4n
Timer       CNTEL0BaseN      0xFD0 + 4n
Timer       CNTCTLBase       0xFD0 + 4n
Chapter J1
ARMv8 Pseudocode

This chapter contains pseudocode that describes many features of the ARMv8 architecture. It contains the following
sections:

• Pseudocode for AArch64 operations on page J1-5242.
• Pseudocode for AArch32 operation on page J1-5302.
• Shared pseudocode on page J1-5374.
J1.1 Pseudocode for AArch64 operations

This section holds the pseudocode for execution in AArch64 state. Functions that are listed in this section are
identified as AArch64.FunctionName. Some of these functions have an equivalent AArch32 function,
AArch32.FunctionName. This section is organized by functional groups, with the functional groups being indicated by
hierarchical path names, for example aarch64/debug/breakpoint.

The top-level sections of the AArch64 pseudocode hierarchy are:

aarch64/debug.
aarch64/exceptions on page J1-5248.
aarch64/functions on page J1-5265.
aarch64/instrs on page J1-5279.
aarch64/translation on page J1-5287.

J1.1.1 aarch64/debug

This section includes the following pseudocode functions:

aarch64/debug/breakpoint/AArch64.BreakpointMatch.
aarch64/debug/breakpoint/AArch64.BreakpointValueMatch on page J1-5243.
aarch64/debug/breakpoint/AArch64.StateMatch on page J1-5244.
aarch64/debug/enables/AArch64.GenerateDebugExceptions on page J1-5245.
aarch64/debug/enables/AArch64.GenerateDebugExceptionsFrom on page J1-5245.
aarch64/debug/pmu/AArch64.CheckForPMUOverflow on page J1-5245.
aarch64/debug/pmu/AArch64.CountEvents on page J1-5246.
aarch64/debug/takeexceptiondbg/AArch64.TakeExceptionInDebugState on page J1-5246.
aarch64/debug/watchpoint/AArch64.WatchpointByteMatch on page J1-5247.
aarch64/debug/watchpoint/AArch64.WatchpointMatch on page J1-5247.

aarch64/debug/breakpoint/AArch64.BreakpointMatch

// AArch64.BreakpointMatch()
// =========================
// Breakpoint matching in an AArch64 translation regime.

boolean AArch64.BreakpointMatch(integer n, bits(64) vaddress, integer size)
    assert !ELUsingAArch32(S1TranslationRegime());
    assert n <= UInt(ID_AA64DFR0_EL1.BRPs);

    enabled = DBGBCR_EL1[n].E == '1';
    ispriv = PSTATE.EL != EL0;
    linked = DBGBCR_EL1[n].BT == '0x01';
    isbreakpnt = TRUE;
    linked_to = FALSE;

    state_match = AArch64.StateMatch(DBGBCR_EL1[n].SSC, DBGBCR_EL1[n].HMC, DBGBCR_EL1[n].PMC,
                                     linked, DBGBCR_EL1[n].LBN, isbreakpnt, ispriv);
    value_match = AArch64.BreakpointValueMatch(n, vaddress, linked_to);

    if HaveAnyAArch32() && size == 4 then                 // Check second halfword
        // If the breakpoint address and BAS of an Address breakpoint match the address of the
        // second halfword of an instruction, but not the address of the first halfword, it is
        // CONSTRAINED UNPREDICTABLE whether or not this breakpoint generates a Breakpoint debug
        // event.
        match_i = AArch64.BreakpointValueMatch(n, vaddress + 2, linked_to);
        if !value_match && match_i then
            value_match = ConstrainUnpredictableBool();

    if vaddress<1> == '1' && DBGBCR_EL1[n].BAS == '1111' then
        // The above notwithstanding, if DBGBCR_EL1[n].BAS == '1111', then it is CONSTRAINED
        // UNPREDICTABLE whether or not a Breakpoint debug event is generated for an instruction
        // at the address DBGBVR_EL1[n]+2.
        if value_match then value_match = ConstrainUnpredictableBool();

    match = value_match && state_match && enabled;

    return match;


aarch64/debug/breakpoint/AArch64.BreakpointValueMatch

// AArch64.BreakpointValueMatch()
// ==============================

boolean AArch64.BreakpointValueMatch(integer n, bits(64) vaddress, boolean linked_to)

    // "n" is the identity of the breakpoint unit to match against
    // "vaddress" is the current instruction address, ignored if linked_to is TRUE and for Context
    // matching breakpoints.
    // "linked_to" is TRUE if this is a call from StateMatch for linking.

    // If a non-existent breakpoint then it is CONSTRAINED UNPREDICTABLE whether this gives
    // no match or the breakpoint is mapped to another UNKNOWN implemented breakpoint.
    if n > UInt(ID_AA64DFR0_EL1.BRPs) then
        (c, n) = ConstrainUnpredictableInteger(0, UInt(ID_AA64DFR0_EL1.BRPs));
        assert c IN {Constraint_DISABLED, Constraint_UNKNOWN};
        if c == Constraint_DISABLED then return FALSE;

    // If this breakpoint is not enabled, it cannot generate a match. (This could also happen on a
    // call from StateMatch for linking.)
    if DBGBCR_EL1[n].E == '0' then return FALSE;

    context_aware = (n >= UInt(ID_AA64DFR0_EL1.BRPs) - UInt(ID_AA64DFR0_EL1.CTX_CMPs));

    // If BT is set to a reserved type, behaves either as disabled or as a not-reserved type.
    type = DBGBCR_EL1[n].BT;

    if (type == 'x1xx' ||                                 // Reserved
          (type != '0x0x' && !context_aware) ||           // Context matching
          (type == '1xxx' && !HaveEL(EL2))) then          // EL2 extension
        (c, type) = ConstrainUnpredictableBits();
        assert c IN {Constraint_DISABLED, Constraint_UNKNOWN};
        if c == Constraint_DISABLED then return FALSE;
        // Otherwise the value returned by ConstrainUnpredictableBits must be a not-reserved value

    // Determine what to compare against.
    match_addr = (type == '0x0x');
    match_vmid = (type == '10xx');
    match_cid = (type == 'x01x');
    linked = (type == 'xxx1');

    // If this is a call from StateMatch, return FALSE if the breakpoint is not programmed for a
    // VMID and/or context ID match, or if not context-aware. The above assertions mean that the
    // code can just test for match_addr == TRUE to confirm all these things.
    if linked_to && (!linked || match_addr) then return FALSE;

    // If called from BreakpointMatch return FALSE for Linked context ID and/or VMID matches.
    if !linked_to && linked && !match_addr then return FALSE;

    // Do the comparison.
    if match_addr then
        byte = UInt(vaddress<1:0>);
        if HaveAnyAArch32() then
            // T32 instructions can be executed at EL0 in an AArch64 translation regime.
            assert byte IN {0,2};                     // "vaddress" is halfword aligned.
            byte_select_match = (DBGBCR_EL1[n].BAS<byte> == '1');
        else
            assert byte == 0;                         // "vaddress" is word aligned
            byte_select_match = TRUE;                 // DBGBCR_EL1[n].BAS<byte> is RES1
        top = AddrTop(vaddress, PSTATE.EL);
        BVR_match = vaddress<top:2> == DBGBVR_EL1[n]<top:2> && byte_select_match;
    elsif match_cid then
        BVR_match = (PSTATE.EL IN {EL0,EL1} && CONTEXTIDR_EL1 == DBGBVR_EL1[n]<31:0>);
    if match_vmid then
        BXVR_match = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                      VTTBR_EL2.VMID == DBGBVR_EL1[n]<39:32>);

    bvr_match_valid = (match_addr || match_cid);
    bxvr_match_valid = match_vmid;

    match = (!bxvr_match_valid || BXVR_match) && (!bvr_match_valid || BVR_match);

    return match;
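The comparisons of the BT field against patterns such as '0x0x' use 'x' as a don't-care position. The following Python sketch illustrates how those four comparisons decode a breakpoint type. It is an illustration only: the function names are hypothetical, bitstrings are represented as text with the most significant bit first, and reserved types are assumed to have been resolved already.

```python
from typing import Dict


def bits_match(value: str, pattern: str) -> bool:
    """Compare a bitstring against a pattern where 'x' matches either bit."""
    if len(value) != len(pattern):
        raise ValueError("value and pattern must have the same width")
    return all(p == "x" or p == v for v, p in zip(value, pattern))


def decode_breakpoint_type(bt: str) -> Dict[str, bool]:
    """Mirror the four comparisons made against DBGBCR_EL1[n].BT."""
    return {
        "match_addr": bits_match(bt, "0x0x"),  # address match
        "match_vmid": bits_match(bt, "10xx"),  # VMID match
        "match_cid": bits_match(bt, "x01x"),   # context ID match
        "linked": bits_match(bt, "xxx1"),      # linked breakpoint type
    }
```

For instance, a BT value of '1011' would decode as a linked VMID and context ID match, with no address comparison.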


aarch64/debug/breakpoint/AArch64.StateMatch

// AArch64.StateMatch()
// ====================
// Determine whether a breakpoint or watchpoint is enabled in the current mode and state.

boolean AArch64.StateMatch(bits(2) SSC, bit HMC, bits(2) PxC, boolean linked, bits(4) LBN,
                           boolean isbreakpnt, boolean ispriv)
    // "SSC", "HMC", "PxC" are the control fields from the DBGBCR[n] or DBGWCR[n] register.
    // "linked" is TRUE if this is a linked breakpoint/watchpoint type.
    // "LBN" is the linked breakpoint number from the DBGBCR[n] or DBGWCR[n] register.
    // "isbreakpnt" is TRUE for breakpoints, FALSE for watchpoints.
    // "ispriv" is valid for watchpoints, and selects between privileged and unprivileged accesses.

    // If parameters are set to a reserved type, behaves as either disabled or a defined type
    if ((HMC:SSC:PxC) IN {'011xx','100x0','101x0','11010','11101','1111x'} ||       // Reserved
          (HMC == '0' && PxC == '00' && (!isbreakpnt || !HaveAArch32EL(EL1))) ||    // Usr/Svc/Sys
          (SSC IN {'01','10'} && !HaveEL(EL3)) ||                                   // No EL3
          (HMC:SSC != '000' && HMC:SSC != '111' && !HaveEL(EL3) && !HaveEL(EL2)) || // No EL3/EL2
          (HMC:SSC:PxC == '11100' && !HaveEL(EL2))) then                            // No EL2
        (c, <HMC,SSC,PxC>) = ConstrainUnpredictableBits();
        assert c IN {Constraint_DISABLED, Constraint_UNKNOWN};
        if c == Constraint_DISABLED then return FALSE;
        // Otherwise the value returned by ConstrainUnpredictableBits must be a not-reserved value

    EL3_match = HaveEL(EL3) && HMC == '1' && SSC<0> == '0';
    EL2_match = HaveEL(EL2) && HMC == '1';
    EL1_match = PxC<0> == '1';
    EL0_match = PxC<1> == '1';

    case PSTATE.EL of
        when EL3  priv_match = EL3_match;
        when EL2  priv_match = EL2_match;
        when EL1  priv_match = if ispriv || isbreakpnt then EL1_match else EL0_match;
        when EL0  priv_match = EL0_match;

    case SSC of
        when '00'  security_state_match = TRUE;           // Both
        when '01'  security_state_match = !IsSecure();    // Non-secure only
        when '10'  security_state_match = IsSecure();     // Secure only
        when '11'  security_state_match = TRUE;           // Both

    if linked then
        // "LBN" must be an enabled context-aware breakpoint unit. If it is not context-aware then
        // it is CONSTRAINED UNPREDICTABLE whether this gives no match, or LBN is mapped to some
        // UNKNOWN breakpoint that is context-aware.
        lbn = UInt(LBN);
        first_ctx_cmp = (UInt(ID_AA64DFR0_EL1.BRPs) - UInt(ID_AA64DFR0_EL1.CTX_CMPs));
        last_ctx_cmp = UInt(ID_AA64DFR0_EL1.BRPs);
        if (lbn < first_ctx_cmp || lbn > last_ctx_cmp) then
            (c, lbn) = ConstrainUnpredictableInteger(first_ctx_cmp, last_ctx_cmp);
            assert c IN {Constraint_DISABLED, Constraint_NONE, Constraint_UNKNOWN};
            case c of
                when Constraint_DISABLED  return FALSE;   // Disabled
                when Constraint_NONE      linked = FALSE; // No linking
                // Otherwise ConstrainUnpredictableInteger returned a context-aware breakpoint

    if linked then
        vaddress = bits(64) UNKNOWN;
        linked_to = TRUE;
        linked_match = AArch64.BreakpointValueMatch(lbn, vaddress, linked_to);

    return priv_match && security_state_match && (!linked || linked_match);


aarch64/debug/enables/AArch64.GenerateDebugExceptions

// AArch64.GenerateDebugExceptions()
// =================================

boolean AArch64.GenerateDebugExceptions()
    return AArch64.GenerateDebugExceptionsFrom(PSTATE.EL, IsSecure(), PSTATE.D);

aarch64/debug/enables/AArch64.GenerateDebugExceptionsFrom

// AArch64.GenerateDebugExceptionsFrom()
// =====================================

boolean AArch64.GenerateDebugExceptionsFrom(bits(2) from, boolean secure, bit mask)

    if OSLSR_EL1.OSLK == '1' || DoubleLockStatus() || Halted() then
        return FALSE;

    route_to_el2 = HaveEL(EL2) && !secure && (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1');
    if HaveEL(EL3) && secure then
        enabled = MDCR_EL3.SDD == '0' && from != EL3;
    else
        enabled = TRUE;

    target = if route_to_el2 then EL2 else EL1;
    if from == target then
        enabled = enabled && MDSCR_EL1.KDE == '1' && mask == '0';

    return enabled;
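To make the decision flow of AArch64.GenerateDebugExceptionsFrom easier to experiment with, the following Python sketch mirrors the function using plain values in place of System register fields. It is a model for illustration, not a definitive implementation. The `DebugState` class and its field names are hypothetical, Exception levels are represented as the integers 0 to 3, and single-bit fields are represented as 0 or 1.

```python
from dataclasses import dataclass

EL0, EL1, EL2, EL3 = range(4)


@dataclass
class DebugState:
    have_el2: bool = True
    have_el3: bool = True
    oslk: int = 0              # OSLSR_EL1.OSLK
    double_lock: bool = False  # DoubleLockStatus()
    halted: bool = False       # Halted()
    tge: int = 0               # HCR_EL2.TGE
    tde: int = 0               # MDCR_EL2.TDE
    sdd: int = 0               # MDCR_EL3.SDD
    kde: int = 0               # MDSCR_EL1.KDE


def generate_debug_exceptions_from(s: DebugState, from_el: int,
                                   secure: bool, mask: int) -> bool:
    # Debug exceptions are never generated while the OS Lock or
    # OS Double Lock is set, or while the PE is in Debug state.
    if s.oslk == 1 or s.double_lock or s.halted:
        return False
    route_to_el2 = s.have_el2 and not secure and (s.tge == 1 or s.tde == 1)
    if s.have_el3 and secure:
        enabled = s.sdd == 0 and from_el != EL3
    else:
        enabled = True
    target = EL2 if route_to_el2 else EL1
    # An exception taken to the level it came from additionally needs
    # MDSCR_EL1.KDE set and the PSTATE.D mask clear.
    if from_el == target:
        enabled = enabled and s.kde == 1 and mask == 0
    return enabled
```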


aarch64/debug/pmu/AArch64.CheckForPMUOverflow

// AArch64.CheckForPMUOverflow()
// =============================
// Signal Performance Monitors overflow IRQ and CTI overflow events

boolean AArch64.CheckForPMUOverflow()

    pmuirq = (PMCR_EL0.E == '1' && PMINTENSET_EL1<31> == '1' && PMOVSSET_EL0<31> == '1');
    for n = 0 to UInt(PMCR_EL0.N) - 1
        if HaveEL(EL2) then
            E = (if n < UInt(MDCR_EL2.HPMN) then PMCR_EL0.E else MDCR_EL2.HPME);
        else
            E = PMCR_EL0.E;
        if E == '1' && PMINTENSET_EL1<n> == '1' && PMOVSSET_EL0<n> == '1' then pmuirq = TRUE;

    SetInterruptRequestLevel(InterruptID_PMUIRQ, if pmuirq then HIGH else LOW);
    CTI_SetEventLevel(CrossTriggerIn_PMUOverflow, if pmuirq then HIGH else LOW);

    // The request remains set until the condition is cleared. (For example, an interrupt handler
    // or cross-triggered event handler clears the overflow status flag by writing to PMOVSCLR_EL0.)

    return pmuirq;
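The overflow check can be modelled with ordinary integers to see which combinations of enable, interrupt-enable and overflow bits raise the request. The Python sketch below is illustrative only. The function name and parameters are hypothetical, and it returns the computed request level without modelling the interrupt or cross-trigger signalling.

```python
def check_for_pmu_overflow(pmcr_e: int, pmcr_n: int, pmintenset: int,
                           pmovsset: int, have_el2: bool = False,
                           hpmn: int = 0, hpme: int = 0) -> bool:
    """Return the overflow interrupt request level.

    pmcr_e and pmcr_n model PMCR_EL0.E and PMCR_EL0.N, pmintenset and
    pmovsset model PMINTENSET_EL1 and PMOVSSET_EL0, and hpmn and hpme
    model MDCR_EL2.HPMN and MDCR_EL2.HPME.
    """
    def bit(value: int, index: int) -> int:
        return (value >> index) & 1

    # Bit 31 corresponds to the cycle counter.
    pmuirq = pmcr_e == 1 and bit(pmintenset, 31) == 1 and bit(pmovsset, 31) == 1
    for n in range(pmcr_n):
        if have_el2:
            # Counters at or above HPMN are enabled by MDCR_EL2.HPME.
            enable = pmcr_e if n < hpmn else hpme
        else:
            enable = pmcr_e
        if enable == 1 and bit(pmintenset, n) == 1 and bit(pmovsset, n) == 1:
            pmuirq = True
    return pmuirq
```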
aarch64/debug/pmu/AArch64.CountEvents

// AArch64.CountEvents()
// =====================
// Return TRUE if counter "n" should count its event. For the cycle counter, n == 31.

boolean AArch64.CountEvents(integer n)
    assert (n == 31 || n < UInt(PMCR_EL0.N));

    // Event counting is disabled in Debug state
    debug = Halted();

    // In Non-secure state, some counters are reserved for EL2
    if HaveEL(EL2) then
        E = (if n < UInt(MDCR_EL2.HPMN) || n == 31 then PMCR_EL0.E else MDCR_EL2.HPME);
    else
        E = PMCR_EL0.E;
    enabled = (E == '1' && PMCNTENSET_EL0<n> == '1');

    if !IsSecure() then
        prohibited = FALSE;
    else
        // Event counting in Secure state is prohibited unless any one of:
        // * EL3 is not implemented
        // * EL3 is using AArch64 and MDCR_EL3.SPME == 1
        prohibited = (HaveEL(EL3) && MDCR_EL3.SPME == '0');

    // The IMPLEMENTATION DEFINED authentication interface might override software controls
    if ExternalSecureNoninvasiveDebugEnabled() then prohibited = FALSE;

    // For the cycle counter, PMCR_EL0.DP enables counting when otherwise prohibited
    if prohibited && n == 31 then prohibited = (PMCR_EL0.DP == '1');

    // Event counting can be filtered by the {P, U, NSK, NSU, NSH, M} bits
    filter = (if n == 31 then PMCCFILTR_EL0<31:26> else PMEVTYPER_EL0[n]<31:26>);

    M = if !HaveEL(EL3) then '0' else (filter<5> EOR filter<0>);
    H = if !HaveEL(EL2) then '0' else filter<1>;
    P = filter<5>;  U = filter<4>;
    if !IsSecure() && HaveEL(EL3) then
        P = P EOR filter<3>;  U = U EOR filter<2>;

    case PSTATE.EL of
        when EL0  filtered = U == '1';
        when EL1  filtered = P == '1';
        when EL2  filtered = H == '0';
        when EL3  filtered = M == '1';

    return !debug && enabled && !prohibited && !filtered;
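The filtering step at the end of AArch64.CountEvents combines six filter bits with the current Exception level and Security state. The Python sketch below models only that step, for illustration. The function name is hypothetical, the filter is passed as a 6-bit integer holding register bits [31:26], so that bit 5 is P, bit 4 is U, bit 3 is NSK, bit 2 is NSU, bit 1 is NSH and bit 0 is M, and Exception levels are the integers 0 to 3.

```python
def event_filtered(filt: int, el: int, secure: bool,
                   have_el2: bool = True, have_el3: bool = True) -> bool:
    """Return True if counting is filtered out at the given Exception level."""
    def bit(index: int) -> int:
        return (filt >> index) & 1

    m = 0 if not have_el3 else bit(5) ^ bit(0)
    h = 0 if not have_el2 else bit(1)
    p, u = bit(5), bit(4)
    if not secure and have_el3:
        # In Non-secure state NSK and NSU invert the sense of P and U.
        p ^= bit(3)
        u ^= bit(2)
    if el == 0:
        return u == 1
    if el == 1:
        return p == 1
    if el == 2:
        return h == 0  # EL2 events are only counted when NSH is set
    return m == 1
```

With an all-zero filter, events at EL0, EL1 and EL3 are counted and events at EL2 are filtered out. Setting both P and NSK filters EL1 in Secure state only.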


aarch64/debug/takeexceptiondbg/AArch64.TakeExceptionInDebugState

// AArch64.TakeExceptionInDebugState()
// ===================================
// Take an exception in Debug state to an Exception Level using AArch64.

AArch64.TakeExceptionInDebugState(bits(2) target_el, ExceptionRecord exception)
    assert HaveEL(target_el) && !ELUsingAArch32(target_el) && UInt(target_el) >= UInt(PSTATE.EL);

    // If coming from AArch32 state, the top parts of the X[] registers might be set to zero
    from_32 = UsingAArch32();
    if from_32 then AArch64.MaybeZeroRegisterUppers();

    AArch64.ReportException(exception, target_el);

    PSTATE.EL = target_el;  PSTATE.nRW = '0';  PSTATE.SP = '1';

    SPSR[] = bits(32) UNKNOWN;
    ELR[] = bits(64) UNKNOWN;

    // PSTATE.{SS,D,A,I,F} are not observable and ignored in Debug state, so behave as if UNKNOWN.
    PSTATE.<SS,D,A,I,F> = bits(5) UNKNOWN;
    DLR_EL0 = bits(64) UNKNOWN;
    DSPSR_EL0 = bits(32) UNKNOWN;

    PSTATE.IL = '0';
    if from_32 then                                 // Coming from AArch32
        PSTATE.IT = '00000000';  PSTATE.T = '0';    // PSTATE.J is RES0
    EDSCR.ERR = '1';
    UpdateEDSCRFields();                            // Update EDSCR processor state flags.
    EndOfInstruction();


aarch64/debug/watchpoint/AArch64.WatchpointByteMatch

// AArch64.WatchpointByteMatch()
// =============================

boolean AArch64.WatchpointByteMatch(integer n, bits(64) vaddress)

    top = AddrTop(vaddress, PSTATE.EL);
    bottom = if DBGWVR_EL1[n]<2> == '1' then 2 else 3;            // Word or doubleword
    byte_select_match = (DBGWCR_EL1[n].BAS<UInt(vaddress<bottom-1:0>)> != '0');
    mask = UInt(DBGWCR_EL1[n].MASK);

    // If DBGWCR_EL1[n].MASK is non-zero value and DBGWCR_EL1[n].BAS is not set to '11111111', or
    // DBGWCR_EL1[n].BAS specifies a non-contiguous set of bytes behavior is CONSTRAINED
    // UNPREDICTABLE.
    if mask > 0 && !IsOnes(DBGWCR_EL1[n].BAS) then
        byte_select_match = ConstrainUnpredictableBool();
    else
        LSB = (DBGWCR_EL1[n].BAS AND NOT(DBGWCR_EL1[n].BAS - 1));  MSB = (DBGWCR_EL1[n].BAS + LSB);
        if !IsZero(MSB AND (MSB - 1)) then                        // Not contiguous
            byte_select_match = ConstrainUnpredictableBool();
            bottom = 3;                                           // For the whole doubleword

    // If the address mask is set to a reserved value, the behavior is CONSTRAINED UNPREDICTABLE.
    if mask > 0 && mask <= 2 then
        (c, mask) = ConstrainUnpredictableInteger(3, 31);
        assert c IN {Constraint_DISABLED, Constraint_NONE, Constraint_UNKNOWN};
        case c of
            when Constraint_DISABLED  return FALSE;               // Disabled
            when Constraint_NONE      mask = 0;                   // No masking
            // Otherwise the value returned by ConstrainUnpredictableInteger is a not-reserved value

    if mask > bottom then
        WVR_match = (vaddress<top:mask> == DBGWVR_EL1[n]<top:mask>);
        // If masked bits of DBGWVR_EL1[n] are not zero, the behavior is CONSTRAINED UNPREDICTABLE.
        if WVR_match && !IsZero(DBGWVR_EL1[n]<mask-1:bottom>) then
            WVR_match = ConstrainUnpredictableBool();
    else
        WVR_match = vaddress<top:bottom> == DBGWVR_EL1[n]<top:bottom>;

    return WVR_match && byte_select_match;
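The LSB and MSB computation in AArch64.WatchpointByteMatch is a compact test for whether the set bits of BAS form one contiguous run. Isolating the lowest set bit and adding it to BAS carries through a contiguous run and leaves at most one bit set, so a result with more than one bit set means the byte selection has a gap. The Python sketch below reproduces the test on an 8-bit value. The function name is hypothetical, and the explicit masking models the fixed-width arithmetic of bits(8).

```python
def bas_is_contiguous(bas: int) -> bool:
    """Return True if the set bits of an 8-bit BAS value are contiguous."""
    bas &= 0xFF
    lsb = bas & ~(bas - 1) & 0xFF  # isolate the lowest set bit
    msb = (bas + lsb) & 0xFF       # the carry clears a contiguous run
    # A contiguous run leaves zero or a single set bit in msb.
    return ((msb & (msb - 1)) & 0xFF) == 0
```

For example, 0b00111100 passes the test, while 0b01011000 does not because bit 5 separates two groups of selected bytes.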


aarch64/debug/watchpoint/AArch64.WatchpointMatch

// AArch64.WatchpointMatch()
// =========================
// Watchpoint matching in an AArch64 translation regime.

boolean AArch64.WatchpointMatch(integer n, bits(64) vaddress, integer size, boolean ispriv,
                                boolean iswrite)
    assert !ELUsingAArch32(S1TranslationRegime());
    assert n <= UInt(ID_AA64DFR0_EL1.WRPs);

    // "ispriv" is FALSE for LDTR/STTR instructions executed at EL1 and all
    // load/stores at EL0, TRUE for all other load/stores. "iswrite" is TRUE for stores, FALSE for
    // loads.
    enabled = DBGWCR_EL1[n].E == '1';
    linked = DBGWCR_EL1[n].WT == '1';
    isbreakpnt = FALSE;

    state_match = AArch64.StateMatch(DBGWCR_EL1[n].SSC, DBGWCR_EL1[n].HMC, DBGWCR_EL1[n].PAC,
                                     linked, DBGWCR_EL1[n].LBN, isbreakpnt, ispriv);

    ls_match = (DBGWCR_EL1[n].LSC<(if iswrite then 1 else 0)> == '1');

    value_match = FALSE;
    for byte = 0 to size - 1
        value_match = value_match || AArch64.WatchpointByteMatch(n, vaddress + byte);

    return value_match && state_match && ls_match && enabled;

J1.1.2 aarch64/exceptions


This section includes the following pseudocode functions: 


• aarch64/exceptions/aborts/AArch64.Abort.
• aarch64/exceptions/aborts/AArch64.AbortSyndrome.
• aarch64/exceptions/aborts/AArch64.CheckPCAlignment.
• aarch64/exceptions/aborts/AArch64.DataAbort.
• aarch64/exceptions/aborts/AArch64.InstructionAbort.
• aarch64/exceptions/aborts/AArch64.PCAlignmentFault.
• aarch64/exceptions/aborts/AArch64.SPAlignmentFault.
• aarch64/exceptions/asynch/AArch64.TakePhysicalFIQException.
• aarch64/exceptions/asynch/AArch64.TakePhysicalIRQException.
• aarch64/exceptions/asynch/AArch64.TakePhysicalSErrorException.
• aarch64/exceptions/asynch/AArch64.TakeVirtualFIQException.
• aarch64/exceptions/asynch/AArch64.TakeVirtualIRQException.
• aarch64/exceptions/asynch/AArch64.TakeVirtualSErrorException.
• aarch64/exceptions/debug/AArch64.BreakpointException.
• aarch64/exceptions/debug/AArch64.SoftwareBreakpoint.
• aarch64/exceptions/debug/AArch64.SoftwareStepException.
• aarch64/exceptions/debug/AArch64.VectorCatchException.
• aarch64/exceptions/debug/AArch64.WatchpointException.
• aarch64/exceptions/exceptions/AArch64.ExceptionClass.
• aarch64/exceptions/exceptions/AArch64.ReportException.
• aarch64/exceptions/exceptions/AArch64.ResetControlRegisters.
• aarch64/exceptions/exceptions/AArch64.TakeReset.
• aarch64/exceptions/ieeefp/AArch64.FPTrappedException.
• aarch64/exceptions/syscalls/AArch64.CallHypervisor.
• aarch64/exceptions/syscalls/AArch64.CallSecureMonitor.
• aarch64/exceptions/syscalls/AArch64.CallSupervisor.
• aarch64/exceptions/takeexception/AArch64.TakeException.
• aarch64/exceptions/traps/AArch64.AArch32SystemAccessTrap.
• aarch64/exceptions/traps/AArch64.AArch32SystemAccessTrapSyndrome.
• aarch64/exceptions/traps/AArch64.AdvSIMDFPAccessTrap.
• aarch64/exceptions/traps/AArch64.CheckAArch32SystemAccess.
• aarch64/exceptions/traps/AArch64.CheckAArch32SystemAccessTraps.
• aarch64/exceptions/traps/AArch64.CheckCP15InstrCoarseTraps.
• aarch64/exceptions/traps/AArch64.CheckFPAdvSIMDEnabled.
• aarch64/exceptions/traps/AArch64.CheckFPAdvSIMDTrap.
• aarch64/exceptions/traps/AArch64.CheckForSMCTrap.
• aarch64/exceptions/traps/AArch64.CheckForWFxTrap.
• aarch64/exceptions/traps/AArch64.CheckIllegalState.
• aarch64/exceptions/traps/AArch64.MonitorModeTrap.
• aarch64/exceptions/traps/AArch64.SystemRegisterTrap.
• aarch64/exceptions/traps/AArch64.UndefinedFault.
• aarch64/exceptions/traps/AArch64.WFxTrap.
• aarch64/exceptions/traps/CheckFPAdvSIMDEnabled64.


aarch64/exceptions/aborts/AArch64.Abort 


// AArch64.Abort()
// ===============
// Abort and Debug exception handling in an AArch64 translation regime.

AArch64.Abort(bits(64) vaddress, FaultRecord fault)

    if IsDebugException(fault) then
        if fault.acctype == AccType_IFETCH then
            if UsingAArch32() && fault.debugmoe == DebugException_VectorCatch then
                AArch64.VectorCatchException(fault);
            else
                AArch64.BreakpointException(fault);
        else
            AArch64.WatchpointException(vaddress, fault);
    elsif fault.acctype == AccType_IFETCH then
        AArch64.InstructionAbort(vaddress, fault);
    else
        AArch64.DataAbort(vaddress, fault);


aarch64/exceptions/aborts/AArch64.AbortSyndrome 
// AArch64.AbortSyndrome()
// =======================
// Creates an exception syndrome record for Abort and Watchpoint exceptions
// from an AArch64 translation regime.

ExceptionRecord AArch64.AbortSyndrome(Exception type, FaultRecord fault, bits(64) vaddress)

    exception = ExceptionSyndrome(type);

    d_side = type IN {Exception_DataAbort, Exception_Watchpoint};

    exception.syndrome = AArch64.FaultSyndrome(d_side, fault);
    exception.vaddress = ZeroExtend(vaddress);
    if IPAValid(fault) then
        exception.ipavalid = TRUE;
        exception.ipaddress = fault.ipaddress;
    else
        exception.ipavalid = FALSE;

    return exception;


aarch64/exceptions/aborts/AArch64.CheckPCAlignment 


// AArch64.CheckPCAlignment()
// ==========================

AArch64.CheckPCAlignment()

    bits(64) pc = ThisInstrAddr();
    if pc<1:0> != '00' then
        AArch64.PCAlignmentFault();


aarch64/exceptions/aborts/AArch64.DataAbort 


// AArch64.DataAbort()
// ===================

AArch64.DataAbort(bits(64) vaddress, FaultRecord fault)

    route_to_el3 = HaveEL(EL3) && SCR_EL3.EA == '1' && IsExternalAbort(fault);
    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || IsSecondStage(fault)));

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = AArch64.AbortSyndrome(Exception_DataAbort, fault, vaddress);

    if PSTATE.EL == EL3 || route_to_el3 then
        AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);
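The routing decision in AArch64.DataAbort (shared in shape by AArch64.InstructionAbort) reduces to choosing a target Exception level from a handful of booleans. The Python sketch below is an illustration only: the function name and parameter names are hypothetical stand-ins for HaveEL(), IsSecure(), SCR_EL3.EA, HCR_EL2.TGE, IsExternalAbort() and IsSecondStage(), and Exception levels are modelled as the integers 0 to 3.

```python
def abort_target_el(current_el: int, have_el3: bool, have_el2: bool, secure: bool,
                    scr_ea: bool, hcr_tge: bool, external: bool,
                    second_stage: bool) -> int:
    """Return the Exception level an abort is taken to, following the routing above."""
    route_to_el3 = have_el3 and scr_ea and external
    route_to_el2 = (have_el2 and not secure and current_el in (0, 1)
                    and (hcr_tge or second_stage))
    if current_el == 3 or route_to_el3:
        return 3
    if current_el == 2 or route_to_el2:
        return 2
    return 1


# A stage 2 fault taken from Non-secure EL0 goes to EL2 in this model.
print(abort_target_el(0, True, True, False, False, False, False, True))
```

Note that the EL3 check comes first, so an external abort with the EA routing bit set is taken to EL3 even when the EL2 routing condition also holds.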


aarch64/exceptions/aborts/AArch64.InstructionAbort 


// AArch64.InstructionAbort()
// ==========================

AArch64.InstructionAbort(bits(64) vaddress, FaultRecord fault)

    route_to_el3 = HaveEL(EL3) && SCR_EL3.EA == '1' && IsExternalAbort(fault);
    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || IsSecondStage(fault)));

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = AArch64.AbortSyndrome(Exception_InstructionAbort, fault, vaddress);

    if PSTATE.EL == EL3 || route_to_el3 then
        AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/aborts/AArch64.PCAlignmentFault 


// AArch64.PCAlignmentFault()
// ==========================
// Called on unaligned program counter in AArch64 state.

AArch64.PCAlignmentFault()

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_PCAlignment);
    exception.vaddress = ThisInstrAddr();

    if UInt(PSTATE.EL) > UInt(EL1) then
        AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
    elsif HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1' then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/aborts/AArch64.SPAlignmentFault


// AArch64.SPAlignmentFault()
// ==========================
// Called on an unaligned stack pointer in AArch64 state.

AArch64.SPAlignmentFault()

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_SPAlignment);

    if UInt(PSTATE.EL) > UInt(EL1) then
        AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
    elsif HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1' then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/asynch/AArch64.TakePhysicalFIQException


// AArch64.TakePhysicalFIQException()
// ==================================

AArch64.TakePhysicalFIQException()

    route_to_el3 = HaveEL(EL3) && SCR_EL3.FIQ == '1';
    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || HCR_EL2.FMO == '1'));
    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x100;
    exception = ExceptionSyndrome(Exception_FIQ);

    if route_to_el3 then
        AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_el2 then
        assert PSTATE.EL != EL3;
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        assert PSTATE.EL IN {EL0,EL1};
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/asynch/AArch64.TakePhysicalIRQException


// AArch64.TakePhysicalIRQException()
// ==================================
// Take an enabled physical IRQ exception.

AArch64.TakePhysicalIRQException()

    route_to_el3 = HaveEL(EL3) && SCR_EL3.IRQ == '1';
    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || HCR_EL2.IMO == '1'));
    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x80;

    exception = ExceptionSyndrome(Exception_IRQ);

    if route_to_el3 then
        AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_el2 then
        assert PSTATE.EL != EL3;
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        assert PSTATE.EL IN {EL0,EL1};
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/asynch/AArch64.TakePhysicalSErrorException


// AArch64.TakePhysicalSErrorException()
// =====================================

AArch64.TakePhysicalSErrorException(boolean syndrome_valid, bits(24) syndrome)

    route_to_el3 = HaveEL(EL3) && SCR_EL3.EA == '1';
    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || HCR_EL2.AMO == '1'));
    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x180;

    exception = ExceptionSyndrome(Exception_SError);
    if syndrome_valid then
        exception.syndrome<24> = '1';
        exception.syndrome<23:0> = syndrome;
    else
        exception.syndrome<24> = '0';

    ClearPendingPhysicalSError();

    if PSTATE.EL == EL3 || route_to_el3 then
        AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/asynch/AArch64.TakeVirtualFIQException


// AArch64.TakeVirtualFIQException()
// =================================

AArch64.TakeVirtualFIQException()
    assert HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1};
    assert HCR_EL2.TGE == '0' && HCR_EL2.FMO == '1';  // Virtual FIQ enabled if TGE==0 and FMO==1

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x100;

    exception = ExceptionSyndrome(Exception_FIQ);

    AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/asynch/AArch64.TakeVirtualIRQException


// AArch64.TakeVirtualIRQException()
// =================================

AArch64.TakeVirtualIRQException()
    assert HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1};
    assert HCR_EL2.TGE == '0' && HCR_EL2.IMO == '1';  // Virtual IRQ enabled if TGE==0 and IMO==1

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x80;

    exception = ExceptionSyndrome(Exception_IRQ);

    AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/asynch/AArch64.TakeVirtualSErrorException


// AArch64.TakeVirtualSErrorException()
// ====================================

AArch64.TakeVirtualSErrorException(boolean syndrome_valid, bits(24) syndrome)

    assert HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1};
    assert HCR_EL2.TGE == '0' && HCR_EL2.AMO == '1';  // Virtual SError enabled if TGE==0 and AMO==1

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x180;

    exception = ExceptionSyndrome(Exception_SError);
    if syndrome_valid then
        exception.syndrome<24> = '1';
        exception.syndrome<23:0> = syndrome;
    else
        exception.syndrome<24> = '0';

    HCR_EL2.VSE = '0';
    AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/debug/AArch64.BreakpointException


// AArch64.BreakpointException()
// =============================

AArch64.BreakpointException(FaultRecord fault)
    assert PSTATE.EL != EL3;

    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1'));

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    vaddress = bits(64) UNKNOWN;
    exception = AArch64.AbortSyndrome(Exception_Breakpoint, fault, vaddress);

    if PSTATE.EL == EL2 || route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/debug/AArch64.SoftwareBreakpoint 


// AArch64.SoftwareBreakpoint()
// ============================

AArch64.SoftwareBreakpoint(bits(16) immediate)

    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1'));

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_SoftwareBreakpoint);
    exception.syndrome<15:0> = immediate;

    if UInt(PSTATE.EL) > UInt(EL1) then
        AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
    elsif route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/debug/AArch64.SoftwareStepException 


// AArch64.SoftwareStepException()
// ===============================

AArch64.SoftwareStepException()
    assert PSTATE.EL != EL3;

    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1'));

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_SoftwareStep);
    if SoftwareStep_DidNotStep() then
        exception.syndrome<24> = '0';
    else
        exception.syndrome<24> = '1';
        exception.syndrome<6> = if SoftwareStep_SteppedEX() then '1' else '0';

    if PSTATE.EL == EL2 || route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);

aarch64/exceptions/debug/AArch64.VectorCatchException 
// AArch64.VectorCatchException()
// ==============================
// Vector Catch taken from EL0 or EL1 to EL2. This can only be called when debug exceptions are
// being routed to EL2, as Vector Catch is a legacy debug event.

AArch64.VectorCatchException(FaultRecord fault)
    assert PSTATE.EL != EL2;
    assert HaveEL(EL2) && !IsSecure() && (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1');

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    vaddress = bits(64) UNKNOWN;
    exception = AArch64.AbortSyndrome(Exception_VectorCatch, fault, vaddress);

    AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/debug/AArch64.WatchpointException 


// AArch64.WatchpointException()
// =============================

AArch64.WatchpointException(bits(64) vaddress, FaultRecord fault)
    assert PSTATE.EL != EL3;

    route_to_el2 = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1'));

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = AArch64.AbortSyndrome(Exception_Watchpoint, fault, vaddress);

    if PSTATE.EL == EL2 || route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/exceptions/AArch64.ExceptionClass

// AArch64.ExceptionClass()
// ========================
// Return the Exception Class and Instruction Length fields to be reported in ESR

(integer,bit) AArch64.ExceptionClass(Exception type, bits(2) target_el)

    il = if ThisInstrLength() == 32 then '1' else '0';
    from_32 = UsingAArch32();
    assert from_32 || il == '1';            // AArch64 instructions always 32-bit

    case type of
        when Exception_Uncategorized        ec = 0x00; il = '1';
        when Exception_WFxTrap              ec = 0x01;
        when Exception_CP15RTTrap           ec = 0x03;           assert from_32;
        when Exception_CP15RRTTrap          ec = 0x04;           assert from_32;
        when Exception_CP14RTTrap           ec = 0x05;           assert from_32;
        when Exception_CP14DTTrap           ec = 0x06;           assert from_32;
        when Exception_AdvSIMDFPAccessTrap  ec = 0x07;
        when Exception_FPIDTrap             ec = 0x08;
        when Exception_CP14RRTTrap          ec = 0x0C;           assert from_32;
        when Exception_IllegalState         ec = 0x0E; il = '1';
        when Exception_SupervisorCall       ec = 0x11;
        when Exception_HypervisorCall       ec = 0x12;
        when Exception_MonitorCall          ec = 0x13;
        when Exception_SystemRegisterTrap   ec = 0x18;           assert !from_32;
        when Exception_InstructionAbort     ec = 0x20; il = '1';
        when Exception_PCAlignment          ec = 0x22; il = '1';
        when Exception_DataAbort            ec = 0x24;
        when Exception_SPAlignment          ec = 0x26; il = '1'; assert !from_32;
        when Exception_FPTrappedException   ec = 0x28;
        when Exception_SError               ec = 0x2F; il = '1';
        when Exception_Breakpoint           ec = 0x30; il = '1';
        when Exception_SoftwareStep         ec = 0x32; il = '1';
        when Exception_Watchpoint           ec = 0x34; il = '1';
        when Exception_SoftwareBreakpoint   ec = 0x38;
        when Exception_VectorCatch          ec = 0x3A; il = '1'; assert from_32;
        otherwise                           Unreachable();

    if ec IN {0x20,0x24,0x30,0x32,0x34} && target_el == PSTATE.EL then
        ec = ec + 1;

    if ec IN {0x11,0x12,0x13,0x28,0x38} && !from_32 then
        ec = ec + 4;

    return (ec,il);
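The two adjustments at the end of AArch64.ExceptionClass can be hard to follow in isolation: some base codes gain 1 when the exception is taken without a change of Exception level, and some gain 4 when the exception comes from AArch64 state. A minimal Python sketch of just that step follows; the function name `adjust_ec` and its boolean parameters are hypothetical, and the base codes are copied from the case table above.

```python
SAME_EL_CODES = {0x20, 0x24, 0x30, 0x32, 0x34}   # aborts and debug exceptions
AARCH64_CODES = {0x11, 0x12, 0x13, 0x28, 0x38}   # SVC, HVC, SMC, FP trap, BKPT/BRK


def adjust_ec(base_ec: int, same_el: bool, from_32: bool) -> int:
    """Apply the same-EL (+1) and from-AArch64 (+4) adjustments to a base EC value."""
    ec = base_ec
    if ec in SAME_EL_CODES and same_el:
        ec += 1
    if ec in AARCH64_CODES and not from_32:
        ec += 4
    return ec


print(hex(adjust_ec(0x24, same_el=True, from_32=False)))   # Data Abort, same EL
print(hex(adjust_ec(0x11, same_el=False, from_32=False)))  # SVC from AArch64
```

The two sets are disjoint, and adding 1 to a same-EL code never lands in the second set, so the order of the two checks does not affect the result.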


aarch64/exceptions/exceptions/AArch64.ReportException 
// AArch64.ReportException()
// =========================
// Report syndrome information for exception taken to AArch64 state.

AArch64.ReportException(ExceptionRecord exception, bits(2) target_el)

    Exception type = exception.type;

    (ec,il) = AArch64.ExceptionClass(type, target_el);
    iss = exception.syndrome;

    // IL is not valid for Data Abort exceptions without valid instruction syndrome information
    if ec IN {0x24,0x25} && iss<24> == '0' then
        il = '1';

    ESR[target_el] = ec<5:0>:il:iss;

    if type IN {Exception_InstructionAbort, Exception_PCAlignment, Exception_DataAbort,
                Exception_Watchpoint} then
        FAR[target_el] = exception.vaddress;
    else
        FAR[target_el] = bits(64) UNKNOWN;

    if target_el == EL2 then
        if exception.ipavalid then
            HPFAR_EL2<39:4> = exception.ipaddress<47:12>;
        else
            HPFAR_EL2<39:4> = bits(36) UNKNOWN;

    return;
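The concatenation `ec<5:0>:il:iss` packs three fields into one syndrome value. The Python sketch below shows the packing and its inverse, assuming that `iss` is 25 bits wide (suggested by the use of `syndrome<24>` elsewhere in this section), which gives a 32-bit result. The function names are hypothetical.

```python
def compose_esr(ec: int, il: int, iss: int) -> int:
    """Pack EC (6 bits), IL (1 bit) and ISS (25 bits) into a 32-bit syndrome value."""
    assert 0 <= ec < (1 << 6) and il in (0, 1) and 0 <= iss < (1 << 25)
    return (ec << 26) | (il << 25) | iss


def split_esr(esr: int) -> tuple:
    """Inverse of compose_esr: return (ec, il, iss)."""
    return (esr >> 26) & 0x3F, (esr >> 25) & 0x1, esr & 0x1FFFFFF


esr = compose_esr(0x25, 1, 0x45)
print(hex(esr), split_esr(esr))
```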


aarch64/exceptions/exceptions/AArch64.ResetControlRegisters


// Resets System registers and memory-mapped control registers that have architecturally-defined 
// reset values to those values. 
AArch64.ResetControlRegisters(boolean cold_reset); 


aarch64/exceptions/exceptions/AArch64.TakeReset 


// AArch64.TakeReset()
// ===================
// Reset into AArch64 state

AArch64.TakeReset(boolean cold_reset)
    assert !HighestELUsingAArch32();

    // Enter the highest implemented Exception level in AArch64 state
    PSTATE.nRW = '0';
    if HaveEL(EL3) then
        PSTATE.EL = EL3;
    elsif HaveEL(EL2) then
        PSTATE.EL = EL2;
    else
        PSTATE.EL = EL1;

    // Reset the system registers and other system components
    AArch64.ResetControlRegisters(cold_reset);

    // Reset all other PSTATE fields
    PSTATE.SP = '1';                  // Select stack pointer
    PSTATE.<D,A,I,F> = '1111';        // All asynchronous exceptions masked
    PSTATE.SS = '0';                  // Clear software step bit
    PSTATE.IL = '0';                  // Clear Illegal Execution state bit

    // All registers, bits and fields not reset by the above pseudocode or by the BranchTo() call
    // below are UNKNOWN bitstrings after reset. In particular, the return information registers
    // ELR_ELx and SPSR_ELx have UNKNOWN values, so that it
    // is impossible to return from a reset in an architecturally defined way.
    AArch64.ResetGeneralRegisters();
    AArch64.ResetSIMDFPRegisters();
    AArch64.ResetSpecialRegisters();
    ResetExternalDebugRegisters(cold_reset);

    bits(64) rv;                      // IMPLEMENTATION DEFINED reset vector
    if HaveEL(EL3) then
        rv = RVBAR_EL3;
    elsif HaveEL(EL2) then
        rv = RVBAR_EL2;
    else
        rv = RVBAR_EL1;

    // The reset vector must be correctly aligned
    assert IsZero(rv<63:PAMax()>) && IsZero(rv<1:0>);

    BranchTo(rv, BranchType_UNKNOWN);


aarch64/exceptions/ieeefp/AArch64.FPTrappedException 


// AArch64.FPTrappedException()
// ============================

AArch64.FPTrappedException(boolean is_ase, integer element, bits(8) accumulated_exceptions)
    exception = ExceptionSyndrome(Exception_FPTrappedException);
    exception.syndrome<23> = '1';                                   // TFV
    if is_ase then exception.syndrome<10:8> = element<2:0>;         // VECITR
    exception.syndrome<7,4:0> = accumulated_exceptions<7,4:0>;      // IDF,IXF,UFF,OFF,DZF,IOF

    route_to_el2 = HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1';

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    if UInt(PSTATE.EL) > UInt(EL1) then
        AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
    elsif route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);
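The syndrome built by AArch64.FPTrappedException uses the non-contiguous slice `<7,4:0>`, which selects bit 7 together with bits 4 to 0. A rough Python sketch of the same packing is shown below; the function name is hypothetical, and fields that the pseudocode leaves untouched are assumed to be zero here.

```python
def fp_trapped_syndrome(is_ase: bool, element: int, accumulated: int) -> int:
    """Build the syndrome bits set by the pseudocode above (other bits assumed zero)."""
    syndrome = 1 << 23                      # bit 23 is always set
    if is_ase:
        syndrome |= (element & 0x7) << 8    # bits 10:8 take element<2:0>
    syndrome |= accumulated & 0x9F          # <7,4:0>: mask 0b1001_1111
    return syndrome


print(hex(fp_trapped_syndrome(False, 0, 0xFF)))
print(hex(fp_trapped_syndrome(True, 3, 0x01)))
```

Bits 6 and 5 of the accumulated exception byte are dropped by the mask, matching the slice in the pseudocode.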


aarch64/exceptions/syscalls/AArch64.CallHypervisor 
// AArch64.CallHypervisor()
// ========================
// Performs a HVC call

AArch64.CallHypervisor(bits(16) immediate)
    assert HaveEL(EL2);

    if UsingAArch32() then AArch32.ITAdvance();
    SSAdvance();
    bits(64) preferred_exception_return = NextInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_HypervisorCall);
    exception.syndrome<15:0> = immediate;

    if PSTATE.EL == EL3 then
        AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/syscalls/AArch64.CallSecureMonitor 


// AArch64.CallSecureMonitor()
// ===========================

AArch64.CallSecureMonitor(bits(16) immediate)
    assert HaveEL(EL3) && !ELUsingAArch32(EL3);

    if UsingAArch32() then AArch32.ITAdvance();
    SSAdvance();

    bits(64) preferred_exception_return = NextInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_MonitorCall);
    exception.syndrome<15:0> = immediate;

    AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/syscalls/AArch64.CallSupervisor 
// AArch64.CallSupervisor()
// ========================
// Calls the Supervisor

AArch64.CallSupervisor(bits(16) immediate)

    if UsingAArch32() then AArch32.ITAdvance();
    SSAdvance();
    route_to_el2 = HaveEL(EL2) && !IsSecure() && PSTATE.EL == EL0 && HCR_EL2.TGE == '1';

    bits(64) preferred_exception_return = NextInstrAddr();
    vect_offset = 0x0;

    exception = ExceptionSyndrome(Exception_SupervisorCall);
    exception.syndrome<15:0> = immediate;

    if UInt(PSTATE.EL) > UInt(EL1) then
        AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
    elsif route_to_el2 then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/takeexception/AArch64.TakeException
// AArch64.TakeException()
// =======================
// Take an exception to an Exception Level using AArch64.

AArch64.TakeException(bits(2) target_el, ExceptionRecord exception,
                      bits(64) preferred_exception_return, integer vect_offset)
    assert HaveEL(target_el) && !ELUsingAArch32(target_el) && UInt(target_el) >= UInt(PSTATE.EL);

    // If coming from AArch32 state, the top parts of the X[] registers might be set to zero
    from_32 = UsingAArch32();
    if from_32 then AArch64.MaybeZeroRegisterUppers();

    if UInt(target_el) > UInt(PSTATE.EL) then
        boolean lower_32;
        if target_el == EL3 then
            if !IsSecure() && HaveEL(EL2) then
                lower_32 = ELUsingAArch32(EL2);
            else
                lower_32 = ELUsingAArch32(EL1);
        else
            lower_32 = ELUsingAArch32(target_el - 1);
        vect_offset = vect_offset + (if lower_32 then 0x600 else 0x400);

    elsif PSTATE.SP == '1' then
        vect_offset = vect_offset + 0x200;

    spsr = GetPSRFromPSTATE();

    if !(exception.type IN {Exception_IRQ, Exception_FIQ}) then
        AArch64.ReportException(exception, target_el);

    PSTATE.EL = target_el;  PSTATE.nRW = '0';  PSTATE.SP = '1';

    SPSR[] = spsr;
    ELR[] = preferred_exception_return;

    PSTATE.SS = '0';
    PSTATE.<D,A,I,F> = '1111';
    PSTATE.IL = '0';
    if from_32 then                                 // Coming from AArch32
        PSTATE.IT = '00000000';  PSTATE.T = '0';    // PSTATE.J is RES0

    BranchTo(VBAR[] + vect_offset, BranchType_EXCEPTION);

    EndOfInstruction();
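The vector offset passed to BranchTo() combines the per-exception-type offset supplied by the caller (0x0, 0x80, 0x100 or 0x180 in the functions in this section) with a block offset chosen by AArch64.TakeException. A minimal Python sketch of that selection is given below; the function name and parameters are hypothetical, with Exception levels modelled as integers and `lower_is_aarch32` standing in for the `lower_32` computation.

```python
def final_vector_offset(type_offset: int, target_el: int, current_el: int,
                        lower_is_aarch32: bool, sp_sel: int) -> int:
    """Return the offset from the vector base address for an exception entry."""
    if target_el > current_el:
        # Taken from a lower Exception level: block depends on its Execution state.
        return type_offset + (0x600 if lower_is_aarch32 else 0x400)
    if sp_sel == 1:
        # Same Exception level, using the dedicated stack pointer.
        return type_offset + 0x200
    return type_offset


# An IRQ (type offset 0x80) taken from AArch64 EL0 to EL1.
print(hex(final_vector_offset(0x80, 1, 0, False, 0)))
```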


aarch64/exceptions/traps/AArch64.AArch32SystemAccessTrap 
// AArch64.AArch32SystemAccessTrap()
// =================================
// Trapped AArch32 System register access other than due to CPTR_EL2 or CPACR_EL1.

AArch64.AArch32SystemAccessTrap(bits(2) target_el, bits(32) aarch32_instr)
    assert HaveEL(target_el) && target_el != EL0 && UInt(target_el) >= UInt(PSTATE.EL);

    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    exception = AArch64.AArch32SystemAccessTrapSyndrome(aarch32_instr);

    if target_el == EL1 && HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1' then
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        AArch64.TakeException(target_el, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/traps/AArch64.AArch32SystemAccessTrapSyndrome 


// AArch64.AArch32SystemAccessTrapSyndrome()
// =========================================
// Return the syndrome information for traps on AArch32 MCR, MCRR, MRC, MRRC, and VMRS instructions,
// other than traps that are due to HCPTR or CPACR.

ExceptionRecord AArch64.AArch32SystemAccessTrapSyndrome(bits(32) instr)

    ExceptionRecord exception;
    cpnum = UInt(instr<11:8>);

    bits(20) iss = Zeros();
    if instr<27:24> == '1110' && instr<4> == '1' && instr<31:28> != '1111' then
        // MRC/MCR
        case cpnum of
            when 10    exception = ExceptionSyndrome(Exception_FPIDTrap);
            when 14    exception = ExceptionSyndrome(Exception_CP14RTTrap);
            when 15    exception = ExceptionSyndrome(Exception_CP15RTTrap);
            otherwise  Unreachable();
        iss<19:17> = instr<7:5>;         // opc2
        iss<16:14> = instr<23:21>;       // opc1
        iss<13:10> = instr<19:16>;       // CRn
        iss<9:5>   = LookUpRIndex(UInt(instr<15:12>), PSTATE.M)<4:0>;  // Rt
        iss<4:1>   = instr<3:0>;         // CRm
    elsif instr<27:21> == '1100010' && instr<31:28> != '1111' then
        // MRRC/MCRR
        case cpnum of
            when 14    exception = ExceptionSyndrome(Exception_CP14RRTTrap);
            when 15    exception = ExceptionSyndrome(Exception_CP15RRTTrap);
            otherwise  Unreachable();
        iss<19:16> = instr<7:4>;         // opc1
        iss<14:10> = LookUpRIndex(UInt(instr<19:16>), PSTATE.M)<4:0>;  // Rt2
        iss<9:5>   = LookUpRIndex(UInt(instr<15:12>), PSTATE.M)<4:0>;  // Rt
        iss<4:1>   = instr<3:0>;         // CRm
    elsif instr<27:25> == '110' && instr<31:28> != '1111' then
        // LDC/STC
        assert cpnum == 14;
        exception = ExceptionSyndrome(Exception_CP14DTTrap);
        iss<19:12> = instr<7:0>;         // imm8
        iss<4>     = instr<23>;          // U
        iss<2:1>   = instr<24,21>;       // P,W
        if instr<19:16> == '1111' then   // Literal addressing
            iss<9:5> = bits(5) UNKNOWN;
            iss<3>   = '1';
        else
            iss<9:5> = LookUpRIndex(UInt(instr<19:16>), PSTATE.M)<4:0>;  // Rn
            iss<3>   = '0';
    else
        Unreachable();
    iss<0> = instr<20>;                  // Direction

    exception.syndrome<24:20> = ConditionSyndrome();
    exception.syndrome<19:0>  = iss;

    return exception;
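As an illustration of the MRC/MCR branch above, the following Python sketch extracts the same instruction fields and packs them into a 20-bit ISS value. It is a simplification under a stated assumption: the call to LookUpRIndex() is not modelled, and the 4-bit Rt field is used directly as the register index, which only holds when no register banking applies. The function name `mrc_mcr_iss` is hypothetical.

```python
def mrc_mcr_iss(instr: int) -> int:
    """Pack opc2, opc1, CRn, Rt, CRm and direction from an A32 MRC/MCR encoding."""
    opc2 = (instr >> 5) & 0x7
    opc1 = (instr >> 21) & 0x7
    crn = (instr >> 16) & 0xF
    rt = (instr >> 12) & 0xF        # assumption: used as-is, LookUpRIndex() not modelled
    crm = instr & 0xF
    direction = (instr >> 20) & 0x1
    return (opc2 << 17) | (opc1 << 14) | (crn << 10) | (rt << 5) | (crm << 1) | direction


# MRC p15, 0, r0, c1, c0, 0 is encoded as 0xEE110F10.
print(hex(mrc_mcr_iss(0xEE110F10)))
```

For a read, bit 20 of the instruction is 1, so the direction bit in the packed value is set.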


aarch64/exceptions/traps/AArch64.AdvSIMDFPAccessTrap 


// AArch64.AdvSIMDFPAccessTrap()
// =============================
// Trapped access to Advanced SIMD or FP registers due to CPACR[].

AArch64.AdvSIMDFPAccessTrap(bits(2) target_el)
    bits(64) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0;

    route_to_el2 = (target_el == EL1 && HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1');

    if route_to_el2 then
        exception = ExceptionSyndrome(Exception_Uncategorized);
        AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
    else
        exception = ExceptionSyndrome(Exception_AdvSIMDFPAccessTrap);
        exception.syndrome<24:20> = ConditionSyndrome();
        AArch64.TakeException(target_el, exception, preferred_exception_return, vect_offset);

    return;


aarch64/exceptions/traps/AArch64.CheckAArch32SystemAccess 


// AArch64.CheckAArch32SystemAccess() 


// Check AArch32 System register access instruction for enables and disables 


AArch64.CheckAArch32SystemAccess(bits(32) instr) 


cp_num = UInt(instr<11:8>); 
assert cp_num IN {14,15}; 


// Decode the AArch32 System register access instruction 
if instr<31:28> != '1111' && instr<27:24> == '1110' && instr<4> == '1' then // MRC/MCR 
cprt = TRUE; cpdt = FALSE; nreg = 1; 
opcl = UInt(instr<23:21>); 
opc2 = UInt(instr<7:5>); 
CRn = UInt(instr<19:16>); 
CRm = UInt(instr<3:0>); 
elsif instr<31:28> != '1111' && instr<27:21> == '1100010' then // MRRC/MCRR 
cprt = TRUE; cpdt = FALSE; nreg = 2; 
opcl = UInt(instr<7:4>); 
CRm = UInt(instr<3:@>); 
elsif instr<31:28> != '1111' && instr<27:25> == '110' && instr<22> == '@' then // LDC/STC 
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cprt = FALSE; cpdt = TRUE; nreg = 0; 
opcl = 0; 
CRn = UInt(instr<15:12>); 

else 


allocated = FALSE; 


// 
// Coarse-grain decode into CP14 or CP15 encoding space. Each of the CPxxxInstrDecode functions 
// returns TRUE if the instruction is allocated at the current Exception level, FALSE otherwise. 
if cp_num == 14 then 
// LDC and STC only supported for c5 in CP14 encoding space 
if cpdt && CRn != 5 then 
allocated = FALSE; 
else 
// Coarse-grained decode of CP14 based on opcl field 
case opcl of 
when 0 allocated = CP14DebugInstrDecode(instr) ; 


when 1 allocated = CP14TraceInstrDecode(instr); 
when 7 allocated = CP14JazelleInstrDecode(instr) ; // JIDR only 
otherwise allocated = FALSE; // All other values are unallocated 


elsif cp_num == 15 then 
// LDC and STC not supported in CP15 encoding space 
if !cprt then 
allocated = FALSE; 
else 
allocated = CP15InstrDecode(instr); 


// Coarse-grain traps to EL2 have a higher priority than exceptions generated because 
// the access instruction is UNDEFINED 
if AArch64.CheckCP15InstrCoarseTraps(CRn, nreg, CRm) then 
// For a coarse-grain trap, if it is IMPLEMENTATION DEFINED whether an access from 
// Non-secure User mode is UNDEFINED when the trap is disabled, then it is 
// IMPLEMENTATION DEFINED whether the same access is UNDEFINED or generates a trap 
// when the trap is enabled. 
if PSTATE.EL == EL0 && !IsSecure() && !allocated then
if boolean IMPLEMENTATION_DEFINED "UNDEF unallocated CP15 access at NS EL0" then
UNDEFINED; 
AArch64.AArch32SystemAccessTrap(EL2, instr); 


else 
allocated = FALSE; 


if !allocated then 
UNDEFINED; 


// If the instruction is not UNDEFINED, it might be disabled or trapped to a higher EL. 
AArch64.CheckAArch32SystemAccessTraps(instr);


return; 
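
The field extraction at the start of AArch64.CheckAArch32SystemAccess() can be sketched in Python as follows. This is an illustrative sketch only, not part of the architecture pseudocode: the function name decode_aarch32_sysreg_access and the dictionary layout are assumptions made for the example, and it models only the decode step, not the CP14/CP15 allocation checks or the traps.

```python
def decode_aarch32_sysreg_access(instr: int) -> dict:
    """Split a 32-bit A32 instruction word into the fields named in the pseudocode.

    Hypothetical helper: only the bit positions are taken from the pseudocode.
    """
    def bits(hi: int, lo: int) -> int:
        # Equivalent of UInt(instr<hi:lo>)
        return (instr >> lo) & ((1 << (hi - lo + 1)) - 1)

    conditional = bits(31, 28) != 0b1111
    fields = {"cp_num": bits(11, 8), "kind": None}
    if conditional and bits(27, 24) == 0b1110 and bits(4, 4) == 1:
        # MRC/MCR: one transfer register
        fields.update(kind="MRC/MCR", nreg=1, opc1=bits(23, 21),
                      opc2=bits(7, 5), CRn=bits(19, 16), CRm=bits(3, 0))
    elif conditional and bits(27, 21) == 0b1100010:
        # MRRC/MCRR: two transfer registers
        fields.update(kind="MRRC/MCRR", nreg=2, opc1=bits(7, 4), CRm=bits(3, 0))
    elif conditional and bits(27, 25) == 0b110 and bits(22, 22) == 0:
        # LDC/STC
        fields.update(kind="LDC/STC", nreg=0, opc1=0, CRn=bits(15, 12))
    return fields
```

For example, the word 0xEE110F10 (MRC p15, 0, r0, c1, c0, 0) decodes as an MRC/MCR access with cp_num 15, CRn 1 and CRm 0.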


aarch64/exceptions/traps/AArch64.CheckAArch32SystemAccessTraps


// Check for configurable disables or traps to a higher EL of an AArch32 System register access. 
AArch64.CheckAArch32SystemAccessTraps(bits(32) instr); 


aarch64/exceptions/traps/AArch64.CheckCP15InstrCoarseTraps 
// AArch64.CheckCP15InstrCoarseTraps() 
// Check for coarse-grained AArch32 CP15 traps in HSTR_EL2 and HCR_EL2. 
boolean AArch64.CheckCP15InstrCoarseTraps(integer CRn, integer nreg, integer CRm) 


// Check for coarse-grained Hyp traps 
if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} then
// Check for MCR, MRC, MCRR and MRRC disabled by HSTR_EL2<CRn/CRm>
major = if nreg == 1 then CRn else CRm; 
if !(major IN {4,14}) && HSTR_EL2<major> == '1' then 

return TRUE; 


// Check for MRC and MCR disabled by HCR_EL2.TIDCP 
if (HCR_EL2.TIDCP == '1' && nreg == 1 && 
((CRn == 9 && CRm IN {0,1,2, 5,6,7,8 }) || 
(CRn == 10 && CRm IN {0,1, 4, 8 }) ||
(CRn == 11 && CRm IN {0,1,2,3,4,5,6,7,8,15}))) then 
return TRUE; 


return FALSE; 
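
The HSTR_EL2 part of this check can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name coarse_hstr_trap is hypothetical, HSTR_EL2 is passed in as a plain integer, and the preconditions (EL2 implemented, Non-secure state, executing at EL0 or EL1) are assumed to hold already. The HCR_EL2.TIDCP check is not modelled.

```python
def coarse_hstr_trap(hstr_el2: int, nreg: int, crn: int, crm: int) -> bool:
    """Return True if HSTR_EL2 traps the access (hypothetical helper).

    MRC/MCR (nreg == 1) select the trap bit by CRn; MRRC/MCRR select it by CRm.
    Bits 4 and 14 are excluded, as in the pseudocode.
    """
    major = crn if nreg == 1 else crm
    if major in (4, 14):
        return False
    return ((hstr_el2 >> major) & 1) == 1
```

With only HSTR_EL2 bit 1 set, an MRC with CRn == 1 is trapped, while an MRRC with CRm == 2 is not.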


aarch64/exceptions/traps/AArch64.CheckFPAdvSIMDEnabled 
// AArch64.CheckFPAdvSIMDEnabled()
// Check against CPACR[]

AArch64.CheckFPAdvSIMDEnabled()
if PSTATE.EL IN {EL0, EL1} then

// Check if access disabled in CPACR_EL1 

case CPACR[].FPEN of
when 'x0' disabled = TRUE;
when '01' disabled = PSTATE.EL == EL0;
when '11' disabled = FALSE;
if disabled then AArch64.AdvSIMDFPAccessTrap(EL1);

AArch64.CheckFPAdvSIMDTrap(); // Also check against CPTR_EL2 and CPTR_EL3
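
The CPACR[].FPEN case statement above can be read as a small truth table. The following Python sketch illustrates it. The function name fpen_disables_access is an assumption for the example, and the sketch covers only the EL0/EL1 check, not the later CPTR_EL2 and CPTR_EL3 checks.

```python
def fpen_disables_access(fpen: int, el: int) -> bool:
    """Return True if a 2-bit FPEN value disables SIMD/FP access at EL0 or EL1.

    Hypothetical helper: 'x0' disables at both levels, '01' disables at EL0 only,
    '11' enables at both levels.
    """
    assert 0 <= fpen <= 0b11 and el in (0, 1)
    if (fpen & 0b01) == 0:      # 'x0'
        return True
    if fpen == 0b01:            # '01'
        return el == 0
    return False                # '11'
```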


aarch64/exceptions/traps/AArch64.CheckFPAdvSIMDTrap 
// AArch64.CheckFPAdvSIMDTrap()
// Check against CPTR_EL2 and CPTR_EL3. 
AArch64.CheckFPAdvSIMDTrap() 
if HaveEL(EL2) && !IsSecure() then 
// Check if access disabled in CPTR_EL2 
if CPTR_EL2.TFP == '1' then AArch64.AdvSIMDFPAccessTrap(EL2); 
if HaveEL(EL3) then 
// Check if access disabled in CPTR_EL3 
if CPTR_EL3.TFP == '1' then AArch64.AdvSIMDFPAccessTrap(EL3); 


return; 


aarch64/exceptions/traps/AArch64.CheckForSMCTrap 
// AArch64.CheckForSMCTrap() 
// Check for trap on SMC instruction 
AArch64.CheckForSMCTrap(bits(16) imm) 


route_to_el2 = HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} && HCR_EL2.TSC == '1';
if route_to_el2 then 

bits(64) preferred_exception_return = ThisInstrAddr(); 

vect_offset = 0x0; 

exception = ExceptionSyndrome(Exception_MonitorCall);
exception.syndrome<15:0> = imm;
AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/traps/AArch64.CheckForWFxTrap


// AArch64.CheckForWFxTrap() 
// =========================
// Check for trap on WFE or WFI instruction 


AArch64.CheckForWFxTrap(bits(2) target_el, boolean is_wfe) 
assert HaveEL(target_el); 


case target_el of 
when EL1 trap = (if is_wfe then SCTLR[].nTWE else SCTLR[].nTWI) == '0'; 
when EL2 trap = (if is_wfe then HCR_EL2.TWE else HCR_EL2.TWI) == '1';
when EL3 trap = (if is_wfe then SCR_EL3.TWE else SCR_EL3.TWI) == '1'; 
if trap then 
AArch64.WFxTrap(target_el, is_wfe); 
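
The selection of the controlling bit in AArch64.CheckForWFxTrap() can be sketched in Python as below. This is an illustrative sketch: the function name wfx_trap_enabled and the use of a dictionary keyed by strings such as "SCTLR.nTWE" are assumptions made for the example. Note the inverted polarity at EL1, where the nTWE and nTWI bits trap when they are 0.

```python
def wfx_trap_enabled(target_el: int, is_wfe: bool, ctrl: dict) -> bool:
    """Return True if WFE/WFI is trapped to target_el (hypothetical helper).

    ctrl maps control-bit names to 0 or 1, for example {"HCR_EL2.TWI": 1}.
    """
    if target_el == 1:
        # EL1 controls are "not trap" bits: 0 means trap
        return ctrl["SCTLR.nTWE" if is_wfe else "SCTLR.nTWI"] == 0
    if target_el == 2:
        return ctrl["HCR_EL2.TWE" if is_wfe else "HCR_EL2.TWI"] == 1
    if target_el == 3:
        return ctrl["SCR_EL3.TWE" if is_wfe else "SCR_EL3.TWI"] == 1
    raise ValueError("target_el must be 1, 2 or 3")
```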


aarch64/exceptions/traps/AArch64.CheckIllegalState

// AArch64.CheckIllegalState()
// Check PSTATE.IL bit and generate Illegal Execution state exception if set.
AArch64.CheckIllegalState()


if PSTATE.IL == '1' then 
route_to_el2 = HaveEL(EL2) && !IsSecure() && PSTATE.EL == EL0 && HCR_EL2.TGE == '1';

bits(64) preferred_exception_return = ThisInstrAddr();
vect_offset = 0x0;

exception = ExceptionSyndrome(Exception_IllegalState);
if UInt(PSTATE.EL) > UInt(EL1) then
AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
elsif route_to_el2 then
AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
else
AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/traps/AArch64.MonitorModeTrap 

// AArch64.MonitorModeTrap() 

// Trapped use of Monitor mode features in a Secure EL1 AArch32 mode 
AArch64.MonitorModeTrap() 


bits(64) preferred_exception_return = ThisInstrAddr(); 
vect_offset = 0x0; 


exception = ExceptionSyndrome(Exception_Uncategorized);

AArch64.TakeException(EL3, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/traps/AArch64.SystemRegisterTrap 
// AArch64.SystemRegisterTrap() 
// Trapped system register access other than due to CPTR_EL2 and CPACR_EL1 
AArch64.SystemRegisterTrap(bits(2) target_el, bits(2) op0, bits(3) op2, bits(3) op1, bits(4) crn,
bits(5) rt, bits(4) crm, bit dir)


assert UInt(target_el) >= UInt(PSTATE.EL); 


bits(64) preferred_exception_return = ThisInstrAddr(); 
vect_offset = 0x0;

exception = ExceptionSyndrome(Exception_SystemRegisterTrap);
exception.syndrome<21:20> = op0;
exception.syndrome<19:17> = op2;
exception.syndrome<16:14> = op1;
exception.syndrome<13:10> = crn;
exception.syndrome<9:5> = rt;
exception.syndrome<4:1> = crm;
exception.syndrome<0> = dir;

if target_el == EL1 && HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1' then
AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
else
AArch64.TakeException(target_el, exception, preferred_exception_return, vect_offset);
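
The syndrome layout built by AArch64.SystemRegisterTrap() can be sketched in Python as a bit-packing function. This is a minimal sketch: the function name sysreg_trap_syndrome is an assumption, and only the field positions are taken from the pseudocode above.

```python
def sysreg_trap_syndrome(op0: int, op2: int, op1: int, crn: int,
                         rt: int, crm: int, direction: int) -> int:
    """Pack the trapped-access fields into syndrome bits <21:0> (hypothetical helper)."""
    # Field widths follow the bits(N) types in the pseudocode signature.
    assert 0 <= op0 < 4 and 0 <= op2 < 8 and 0 <= op1 < 8
    assert 0 <= crn < 16 and 0 <= rt < 32 and 0 <= crm < 16
    assert direction in (0, 1)
    return ((op0 << 20) | (op2 << 17) | (op1 << 14) | (crn << 10)
            | (rt << 5) | (crm << 1) | direction)
```

For example, op0 = 3, crn = 1 and dir = 1, with all other fields zero, gives the value 0x300401.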


aarch64/exceptions/traps/AArch64.UndefinedFault 


// AArch64.UndefinedFault()
// ========================

AArch64.UndefinedFault()
route_to_el2 = HaveEL(EL2) && !IsSecure() && PSTATE.EL == EL0 && HCR_EL2.TGE == '1';


bits(64) preferred_exception_return = ThisInstrAddr(); 
vect_offset = 0x0; 


exception = ExceptionSyndrome(Exception_Uncategorized);

if UInt(PSTATE.EL) > UInt(EL1) then
AArch64.TakeException(PSTATE.EL, exception, preferred_exception_return, vect_offset);
elsif route_to_el2 then
AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
else
AArch64.TakeException(EL1, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/traps/AArch64.WFxTrap 


// AArch64.WFxTrap()
// =================


AArch64.WFxTrap(bits(2) target_el, boolean is_wfe) 
assert UInt(target_el) > UInt(PSTATE.EL); 


bits(64) preferred_exception_return = ThisInstrAddr(); 
vect_offset = 0x0; 


exception = ExceptionSyndrome(Exception_WFxTrap);
exception.syndrome<24:20> = ConditionSyndrome();
exception.syndrome<0> = if is_wfe then '1' else '0';

if target_el == EL1 && HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1' then
AArch64.TakeException(EL2, exception, preferred_exception_return, vect_offset);
else
AArch64.TakeException(target_el, exception, preferred_exception_return, vect_offset);


aarch64/exceptions/traps/CheckFPAdvSIMDEnabled64

// CheckFPAdvSIMDEnabled64()
// =========================
// AArch64 instruction wrapper

CheckFPAdvSIMDEnabled64()
AArch64.CheckFPAdvSIMDEnabled();


J1.1 Pseudocode for AArch64 operations


J1.1.3 aarch64/functions 


This section includes the following pseudocode functions: 


aarch64/functions/aborts/AArch64.CreateFaultRecord.
aarch64/functions/aborts/AArch64.FaultSyndrome on page J1-5266.
aarch64/functions/exclusive/AArch64.ExclusiveMonitorsPass on page J1-5266. 
aarch64/functions/exclusive/AArch64.IsExclusiveVA on page J1-5267. 
aarch64/functions/exclusive/AArch64.MarkExclusiveVA on page J1-5267. 
aarch64/functions/exclusive/AArch64.SetExclusiveMonitors on page J1-5267. 
aarch64/functions/fusedrstep/FPRSqrtStepFused on page J1-5268.
aarch64/functions/fusedrstep/FPRecipStepFused on page J1-5268.
aarch64/functions/memory/AArch64.CheckAlignment on page J1-5269.
aarch64/functions/memory/AArch64.MemSingle on page J1-5269. 
aarch64/functions/memory/CheckSPAlignment on page J1-5270. 
aarch64/functions/memory/Mem on page J1-5270. 
aarch64/functions/registers/AArch64.MaybeZeroRegisterUppers on page J1-5271. 
aarch64/functions/registers/AArch64.ResetGeneralRegisters on page J1-5271. 
aarch64/functions/registers/AArch64.ResetSIMDFPRegisters on page J1-5272.
aarch64/functions/registers/AArch64.ResetSpecialRegisters on page J1-5272. 
aarch64/functions/registers/AArch64.ResetSystemRegisters on page J1-5272. 
aarch64/functions/registers/PC on page J1-5272. 
aarch64/functions/registers/SP on page J1-5273. 

aarch64/functions/registers/V on page J1-5273. 
aarch64/functions/registers/Vpart on page J1-5273. 
aarch64/functions/registers/X on page J1-5274. 
aarch64/functions/sysregisters/CNTKCTL on page J1-5274. 
aarch64/functions/sysregisters/CNTKCTLType on page J1-5274. 
aarch64/functions/sysregisters/CPACR on page J1-5275. 
aarch64/functions/sysregisters/CPACRType on page J1-5275. 
aarch64/functions/sysregisters/ELR on page J1-5275. 
aarch64/functions/sysregisters/ESR on page J1-5275. 
aarch64/functions/sysregisters/ESRType on page J1-5276. 
aarch64/functions/sysregisters/FAR on page J1-5276. 
aarch64/functions/sysregisters/MAIR on page J1-5277. 
aarch64/functions/sysregisters/MAIRType on page J1-5277. 
aarch64/functions/sysregisters/SCTLR on page J1-5277. 
aarch64/functions/sysregisters/SCTLRType on page J1-5277. 
aarch64/functions/sysregisters/VBAR on page J1-5277. 
aarch64/functions/system/AArch64.CheckAdvSIMDFPSystemRegisterTraps on page J1-5278.
aarch64/functions/system/AArch64.CheckSystemAccess on page J1-5278.
aarch64/functions/system/AArch64.CheckSystemRegisterTraps on page J1-5279.
aarch64/functions/system/AArch64.CheckUnallocatedSystemAccess on page J1-5279.
aarch64/functions/system/AArch64.SysInstr on page J1-5279.
aarch64/functions/system/AArch64.SysInstrWithResult on page J1-5279.
aarch64/functions/system/AArch64.SysRegRead on page J1-5279.
aarch64/functions/system/AArch64.SysRegWrite on page J1-5279.


aarch64/functions/aborts/AArch64.CreateFaultRecord 


// AArch64.CreateFaultRecord()
// ===========================

FaultRecord AArch64.CreateFaultRecord(Fault type, bits(48) ipaddress,
integer level, AccType acctype, boolean write, bit extflag, 
boolean secondstage, boolean s2fslwalk) 


FaultRecord fault;
fault.type = type;
fault.domain = bits(4) UNKNOWN; // Not used from AArch64
fault.debugmoe = bits(4) UNKNOWN; // Not used from AArch64
fault.ipaddress = ipaddress;
fault.level = level;
fault.acctype = acctype;
fault.write = write;
fault.extflag = extflag;
fault.secondstage = secondstage;
fault.s2fs1walk = s2fs1walk;

return fault;


aarch64/functions/aborts/AArch64.FaultSyndrome 


// AArch64.FaultSyndrome()
// =======================

// Creates an exception syndrome value for Abort and Watchpoint exceptions taken to 
// an Exception Level using AArch64. 


bits(25) AArch64.FaultSyndrome(boolean d_side, FaultRecord fault) 
assert fault.type != Fault_None; 


bits(25) iss = Zeros(); 
if d_side then 
if IsSecondStage(fault) && !fault.s2fslwalk then iss<24:14> = LSInstructionSyndrome(); 
if fault.acctype IN {AccType_DC, AccType_IC, AccType_AT} then 
iss<8> = '1'; iss<6> = '1'; 
else 
iss<6> = if fault.write then '1' else '0';
if IsExternalAbort(fault) then iss<9> = fault.extflag;
iss<7> = if fault.s2fslwalk then '1' else '0';
iss<5:0> = EncodeLDFSC(fault.type, fault.level);


return iss; 


aarch64/functions/exclusive/AArch64.ExclusiveMonitorsPass 


// AArch64.ExclusiveMonitorsPass()
// ===============================


// Return TRUE if the Exclusive Monitors for the current PE include all of the addresses 
// associated with the virtual address region of size bytes starting at address. 
// The immediately following memory write must be to the same addresses. 


boolean AArch64.ExclusiveMonitorsPass(bits(64) address, integer size) 


// It is IMPLEMENTATION DEFINED whether the detection of memory aborts happens
// before or after the check on the local Exclusive Monitor. As a result a failure
// of the local monitor can occur on some implementations even if the memory
// access would give a memory abort.


acctype = AccType_ATOMIC; 
iswrite = TRUE; 
aligned = (address == Align(address, size)); 


if !aligned then 
secondstage = FALSE; 
AArch64.Abort(address, AArch64.AlignmentFault(acctype, iswrite, secondstage)); 


passed = AArch64.IsExclusiveVA(address, ProcessorID(), size);
if !passed then
return FALSE;

memaddrdesc = AArch64.TranslateAddress(address, acctype, iswrite, aligned, size);

// Check for aborts or debug exceptions
if IsFault(memaddrdesc) then
AArch64.Abort(address, memaddrdesc.fault);

passed = IsExclusiveLocal(memaddrdesc.paddress, ProcessorID(), size);

if passed then
ClearExclusiveLocal(ProcessorID());
if memaddrdesc.memattrs.shareable then
passed = IsExclusiveGlobal(memaddrdesc.paddress, ProcessorID(), size);


return passed; 


aarch64/functions/exclusive/AArch64.IsExclusiveVA 


// An optional IMPLEMENTATION DEFINED test for an exclusive access to a virtual 
// address region of size bytes starting at address. 

// 

// It is permitted (but not required) for this function to return FALSE and 

// cause a store exclusive to fail if the virtual address region is not 

// totally included within the region recorded by MarkExclusiveVA(). 


// 

// It is always safe to return TRUE which will check the physical address only. 
boolean AArch64.IsExclusiveVA(bits(64) address, integer processorid, integer size); 


aarch64/functions/exclusive/AArch64.MarkExclusiveVA

// Optionally record an exclusive access to the virtual address region of size bytes 
// starting at address for processorid. 

AArch64.MarkExclusiveVA(bits(64) address, integer processorid, integer size); 


aarch64/functions/exclusive/AArch64.SetExclusiveMonitors 


// AArch64.SetExclusiveMonitors()
// ==============================


// Sets the Exclusive Monitors for the current PE to record the addresses associated 
// with the virtual address region of size bytes starting at address. 


AArch64.SetExclusiveMonitors(bits(64) address, integer size) 


acctype = AccType_ATOMIC; 

iswrite = FALSE;

aligned = (address != Align(address, size)); 

memaddrdesc = AArch64.TranslateAddress(address, acctype, iswrite, aligned, size); 


// Check for aborts or debug exceptions 
if IsFault(memaddrdesc) then 
return; 


if memaddrdesc.memattrs.shareable then 
MarkExclusiveGlobal(memaddrdesc.paddress, ProcessorID(), size); 


MarkExclusiveLocal(memaddrdesc.paddress, ProcessorID(), size); 


AArch64.MarkExclusiveVA(address, ProcessorID(), size);


aarch64/functions/fusedrstep/FPRSqrtStepFused

// FPRSqrtStepFused()
// ==================

bits(N) FPRSqrtStepFused(bits(N) op1, bits(N) op2)
assert N IN {32, 64};
bits(N) result;
op1 = FPNeg(op1);
(type1,sign1,value1) = FPUnpack(op1, FPCR);
(type2,sign2,value2) = FPUnpack(op2, FPCR);
(done, result) = FPProcessNaNs(type1, type2, op1, op2, FPCR);
if !done then
inf1 = (type1 == FPType_Infinity);
inf2 = (type2 == FPType_Infinity);
zero1 = (type1 == FPType_Zero);
zero2 = (type2 == FPType_Zero);
if (inf1 && zero2) || (zero1 && inf2) then
result = FPOnePointFive('0');
elsif inf1 || inf2 then
result = FPInfinity(sign1 EOR sign2);
else
// Fully fused multiply-add and halve
result_value = (3.0 + (value1 * value2)) / 2.0;
if result_value == 0.0 then
// Sign of exact zero result depends on rounding mode
sign = if FPRoundingMode(FPCR) == FPRounding_NEGINF then '1' else '0';
result = FPZero(sign);
else
result = FPRound(result_value, FPCR);
return result;


aarch64/functions/fusedrstep/FPRecipStepFused 


// FPRecipStepFused()
// ==================

bits(N) FPRecipStepFused(bits(N) op1, bits(N) op2)
assert N IN {32, 64};
bits(N) result;
op1 = FPNeg(op1);
(type1,sign1,value1) = FPUnpack(op1, FPCR);
(type2,sign2,value2) = FPUnpack(op2, FPCR);
(done, result) = FPProcessNaNs(type1, type2, op1, op2, FPCR);
if !done then
inf1 = (type1 == FPType_Infinity);
inf2 = (type2 == FPType_Infinity);
zero1 = (type1 == FPType_Zero);
zero2 = (type2 == FPType_Zero);
if (inf1 && zero2) || (zero1 && inf2) then
result = FPTwo('0');
elsif inf1 || inf2 then
result = FPInfinity(sign1 EOR sign2);
else
// Fully fused multiply-add
result_value = 2.0 + (value1 * value2);
if result_value == 0.0 then
// Sign of exact zero result depends on rounding mode
sign = if FPRoundingMode(FPCR) == FPRounding_NEGINF then '1' else '0';
result = FPZero(sign);
else
result = FPRound(result_value, FPCR);
return result;
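
The arithmetic of FPRSqrtStepFused() and FPRecipStepFused() can be sketched in Python for double-precision inputs as below. This is an illustrative sketch under several assumptions, not an equivalent of the pseudocode: the function names are hypothetical, Fraction arithmetic stands in for the unrounded fused computation, the single rounding is round-to-nearest (so an exact zero result is +0.0), NaN inputs simply return NaN, and FPCR controls such as flush-to-zero and default NaN are not modelled.

```python
import math
from fractions import Fraction

def _fused_step(op1: float, op2: float, addend: int, halve: bool) -> float:
    a = -op1                                    # op1 = FPNeg(op1)
    if math.isnan(a) or math.isnan(op2):
        return math.nan                         # NaN processing is not modelled
    inf1, inf2 = math.isinf(a), math.isinf(op2)
    zero1, zero2 = (a == 0.0), (op2 == 0.0)
    if (inf1 and zero2) or (zero1 and inf2):
        return 1.5 if halve else 2.0            # FPOnePointFive('0') or FPTwo('0')
    if inf1 or inf2:
        negative = (math.copysign(1.0, a) < 0) != (math.copysign(1.0, op2) < 0)
        return -math.inf if negative else math.inf
    exact = Fraction(addend) + Fraction(a) * Fraction(op2)   # no intermediate rounding
    if halve:
        exact /= 2
    return float(exact)                         # one rounding, to nearest

def fp_rsqrt_step_fused(op1: float, op2: float) -> float:
    return _fused_step(op1, op2, 3, True)       # (3.0 - op1*op2) / 2.0

def fp_recip_step_fused(op1: float, op2: float) -> float:
    return _fused_step(op1, op2, 2, False)      # 2.0 - op1*op2
```

For example, fp_recip_step_fused(2.0, 0.25) evaluates 2.0 - 0.5 and returns 1.5. Results too large for a double raise OverflowError in this sketch instead of rounding to infinity.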


aarch64/functions/memory/AArch64.CheckAlignment


// AArch64.CheckAlignment() 


boolean AArch64.CheckAlignment(bits(64) address, integer alignment, AccType acctype, 
boolean iswrite) 


aligned = (address == Align(address, alignment)); 
check = (acctype == AccType_ATOMIC || acctype == AccType_ORDERED || SCTLR[].A == '1'); 


if check && !aligned then 
secondstage = FALSE; 
AArch64.Abort(address, AArch64.AlignmentFault(acctype, iswrite, secondstage)); 


return aligned; 
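
The alignment test used by AArch64.CheckAlignment() can be sketched in Python as below. This is a minimal sketch: the names align, check_alignment and AlignmentFault are assumptions for the example, the must_check flag stands in for the acctype and SCTLR[].A condition, and raising an exception stands in for AArch64.Abort().

```python
class AlignmentFault(Exception):
    """Stand-in for the Alignment fault taken through AArch64.Abort()."""

def align(x: int, n: int) -> int:
    # Round x down to a multiple of n, matching Align(x, n)
    return n * (x // n)

def check_alignment(address: int, alignment: int, must_check: bool) -> bool:
    aligned = (address == align(address, alignment))
    if must_check and not aligned:
        raise AlignmentFault(hex(address))
    return aligned
```

When must_check is False, the function reports the misalignment to the caller through the return value instead of faulting.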


aarch64/functions/memory/AArch64.MemSingle 
// AArch64.MemSingle[] - non-assignment (read) form 
// Perform an atomic, little-endian read of 'size' bytes. 


bits(size*8) AArch64.MemSingle[bits(64) address, integer size, AccType acctype, boolean wasaligned]
assert size IN {1, 2, 4, 8, 16}; 
assert address == Align(address, size); 


AddressDescriptor memaddrdesc; 
bits(size*8) value;
iswrite = FALSE;


// MMU or MPU 
memaddrdesc = AArch64.TranslateAddress(address, acctype, iswrite, wasaligned, size); 
// Check for aborts or debug exceptions 
if IsFault(memaddrdesc) then 
AArch64.Abort(address, memaddrdesc.fault);


// Memory array access 
value = _Mem[memaddrdesc, size, acctype]; 
return value; 


// AArch64.MemSingle[] - assignment (write) form 
// Perform an atomic, little-endian write of 'size' bytes. 


AArch64.MemSingle[bits(64) address, integer size, AccType acctype, boolean wasaligned] = bits(size*8) value

assert size IN {1, 2, 4, 8, 16}; 

assert address == Align(address, size); 


AddressDescriptor memaddrdesc; 
iswrite = TRUE; 


// MMU or MPU 
memaddrdesc = AArch64.TranslateAddress(address, acctype, iswrite, wasaligned, size); 


// Check for aborts or debug exceptions 
if IsFault(memaddrdesc) then 
AArch64.Abort(address, memaddrdesc.fault);


// Effect on exclusives 
if memaddrdesc.memattrs.shareable then 
ClearExclusiveByAddress(memaddrdesc.paddress, ProcessorID(), size); 


// Memory array access 
_Mem[memaddrdesc, size, acctype] = value; 
return;


aarch64/functions/memory/CheckSPAlignment 


// CheckSPAlignment()
// ==================
// Check correct stack pointer alignment for AArch64 state.

CheckSPAlignment()


bits(64) sp = SP[]; 


if PSTATE.EL == EL0 then
stack_align_check = (SCTLR[].SA0 != '0');
else
stack_align_check = (SCTLR[].SA != '0');


if stack_align_check && sp != Align(sp, 16) then 
AArch64.SPAlignmentFault(); 


return; 


aarch64/functions/memory/Mem 


// Mem[] - non-assignment (read) form
// ==================================
// Perform a read of 'size' bytes. The access byte order is reversed for a big-endian access.
// Instruction fetches would call AArch64.MemSingle directly.

bits(size*8) Mem[bits(64) address, integer size, AccType acctype]
assert size IN {1, 2, 4, 8, 16};
bits(size*8) value;

integer i; 

boolean iswrite = FALSE; 


aligned = AArch64.CheckAlignment(address, size, acctype, iswrite); 
if size != 16 || !(acctype IN {AccType_VEC, AccType_VECSTREAM}) then 
atomic = aligned; 
else 
// 128-bit SIMD&FP loads are treated as a pair of 64-bit single-copy atomic accesses 
// 64-bit aligned. 
atomic = address == Align(address, 8); 


if !atomic then 
assert size > 1; 
value<7:0> = AArch64.MemSingle[address, 1, acctype, aligned]; 


// For subsequent bytes it is CONSTRAINED UNPREDICTABLE whether an unaligned Device memory 
// access will generate an Alignment Fault, as to get this far means the first byte did 
// not, so we must be changing to a new translation page. 
if !aligned then 
c = ConstrainUnpredictable(); 
assert c IN {Constraint_FAULT, Constraint_NONE}; 
if c == Constraint_NONE then aligned = TRUE; 


for i = 1 to size-1 
value<8*i+7:8*i> = AArch64.MemSingle[address+i, 1, acctype, aligned];
elsif size == 16 && acctype IN {AccType_VEC, AccType_VECSTREAM} then 
value<63:0> = AArch64.MemSingle[address, 8, acctype, aligned]; 
value<127:64> = AArch64.MemSingle[address+8, 8, acctype, aligned]; 
else 
value = AArch64.MemSingle[address, size, acctype, aligned]; 


if BigEndian() then 
value = BigEndianReverse(value);
return value; 


// Mem[] - assignment (write) form
// ===============================
// Perform a write of 'size' bytes. The byte order is reversed for a big-endian access.

Mem[bits(64) address, integer size, AccType acctype] = bits(size*8) value
integer i; 


boolean iswrite = TRUE; 


if BigEndian() then 
value = BigEndianReverse(value);


aligned = AArch64.CheckAlignment(address, size, acctype, iswrite); 


if size != 16 || !(acctype IN {AccType_VEC, AccType_VECSTREAM}) then 
atomic = aligned; 
else 
// 128-bit SIMD&FP stores are treated as a pair of 64-bit single-copy atomic accesses 
// 64-bit aligned. 
atomic = address == Align(address, 8); 


if !atomic then 
assert size > 1; 


AArch64.MemSingle[address, 1, acctype, aligned] = value<7:0>; 


// For subsequent bytes it is CONSTRAINED UNPREDICTABLE whether an unaligned Device memory 
// access will generate an Alignment Fault, as to get this far means the first byte did 

// not, so we must be changing to a new translation page. 

if !aligned then 


c = ConstrainUnpredictable(); 
assert c IN {Constraint_FAULT, Constraint_NONE}; 
if c == Constraint_NONE then aligned = TRUE; 


for i = 1 to size-1 


AArch64.MemSingle[address+i, 1, acctype, aligned] = value<8*i+7:8*i>;
elsif size == 16 && acctype IN {AccType_VEC, AccType_VECSTREAM} then 
AArch64.MemSingle[address, 8, acctype, aligned] = value<63:0>; 


AArch64.MemSingle[address+8, 8, acctype, aligned] = value<127:64>; 
else 


AArch64.MemSingle[address, size, acctype, aligned] = value; 
return; 
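
The byte-by-byte path of Mem[] and the big-endian reversal can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the function names mem_read and mem_write are hypothetical, memory is a flat bytearray with no translation, and faults, atomicity and exclusive monitors are not modelled.

```python
def mem_read(memory: bytearray, address: int, size: int, big_endian: bool) -> int:
    """Assemble 'size' bytes little-endian, then reverse them for a big-endian access."""
    value = 0
    for i in range(size):
        value |= memory[address + i] << (8 * i)      # value<8*i+7:8*i>
    if big_endian:
        value = int.from_bytes(value.to_bytes(size, "little"), "big")
    return value

def mem_write(memory: bytearray, address: int, size: int, value: int,
              big_endian: bool) -> None:
    """Reverse the bytes first for a big-endian access, then store them in order."""
    if big_endian:
        value = int.from_bytes(value.to_bytes(size, "little"), "big")
    for i in range(size):
        memory[address + i] = (value >> (8 * i)) & 0xFF
```

With the bytes 0x22, 0x33, 0x44, 0x55 stored at ascending addresses, a 4-byte little-endian read returns 0x55443322 and a big-endian read returns 0x22334455.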


aarch64/functions/registers/AArch64.MaybeZeroRegisterUppers 

// AArch64.MaybeZeroRegisterUppers()
// =================================

// On taking an exception to AArch64 from AArch32, it is CONSTRAINED UNPREDICTABLE whether the top 
// 32 bits of registers visible at any lower Exception level using AArch32 are set to zero. 


AArch64.MaybeZeroRegisterUppers() 


assert UsingAArch32(); // Always called from AArch32 state before entering AArch64 state 


if PSTATE.EL == EL0 && !ELUsingAArch32(EL1) then
first = 0; last = 14; include_R15 = FALSE;
elsif PSTATE.EL IN {EL0,EL1} && HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
first = 0; last = 30; include_R15 = FALSE; 
else 


first = 0; last = 30; include_R15 = TRUE; 


for n = first to last 


if (n != 15 || include_R15) && ConstrainUnpredictableBool() then
_R[n]<63:32> = Zeros(); 


return; 


aarch64/functions/registers/AArch64.ResetGeneralRegisters 


// AArch64.ResetGeneralRegisters()
// ===============================

AArch64.ResetGeneralRegisters()

for i = 0 to 30
X[i] = bits(64) UNKNOWN;


return; 


aarch64/functions/registers/AArch64.ResetSIMDFPRegisters 


// AArch64.ResetSIMDFPRegisters()
// ==============================

AArch64.ResetSIMDFPRegisters()

for i = 0 to 31
V[i] = bits(128) UNKNOWN;


return; 


aarch64/functions/registers/AArch64.ResetSpecialRegisters 


// AArch64.ResetSpecialRegisters()
// ===============================

AArch64.ResetSpecialRegisters()

// AArch64 special registers
SP_EL0 = bits(64) UNKNOWN;
SP_EL1 = bits(64) UNKNOWN;
SPSR_EL1 = bits(32) UNKNOWN;
ELR_EL1 = bits(64) UNKNOWN;
if HaveEL(EL2) then
SP_EL2 = bits(64) UNKNOWN;
SPSR_EL2 = bits(32) UNKNOWN;
ELR_EL2 = bits(64) UNKNOWN;
if HaveEL(EL3) then
SP_EL3 = bits(64) UNKNOWN;
SPSR_EL3 = bits(32) UNKNOWN;
ELR_EL3 = bits(64) UNKNOWN;

// AArch32 special registers that are not architecturally mapped to AArch64 registers
if HaveAArch32EL(EL1) then
SPSR_fiq = bits(32) UNKNOWN;
SPSR_irq = bits(32) UNKNOWN;
SPSR_abt = bits(32) UNKNOWN;
SPSR_und = bits(32) UNKNOWN;

// External debug special registers
DLR_EL0 = bits(64) UNKNOWN;
DSPSR_EL0 = bits(32) UNKNOWN;


return; 


aarch64/functions/registers/AArch64.ResetSystemRegisters 


AArch64.ResetSystemRegisters(boolean cold_reset) ; 


aarch64/functions/registers/PC 


// PC - non-assignment form
// ========================


// Read program counter. 


bits(64) PC[] 
return _PC;


aarch64/functions/registers/SP
// SP[] - assignment form 
// Write to stack pointer from either a 32-bit or a 64-bit value. 


SP[] = bits(width) value 
assert width IN {32,64}; 
if PSTATE.SP == '0' then
SP_EL0 = ZeroExtend(value);
else
case PSTATE.EL of
when EL0 SP_EL0 = ZeroExtend(value);
when EL1 SP_EL1 = ZeroExtend(value); 
when EL2 SP_EL2 = ZeroExtend(value); 
when EL3 SP_EL3 = ZeroExtend(value); 
return; 


// Read stack pointer with implicit slice of 8, 16, 32 or 64 bits. 


bits(width) SP[] 
assert width IN {8,16,32,64}; 
if PSTATE.SP == '0' then
return SP_EL0<width-1:0>;
else
case PSTATE.EL of
when EL0 return SP_EL0<width-1:0>;
when EL1 return SP_EL1<width-1:0>; 
when EL2 return SP_EL2<width-1:0>; 
when EL3 return SP_EL3<width-1:0>; 


aarch64/functions/registers/V 
// V[] - assignment form 


// Write to SIMD&FP register with implicit extension from 
// 8, 16, 32, 64 or 128 bits. 


V[integer n] = bits(width) value 
assert n >= 0 && n <= 31;
assert width IN {8,16,32,64,128}; 
_V[n] = ZeroExtend(value); 
return; 


// V[] - non-assignment form 


// Read from SIMD&FP register with implicit slice of 8, 16 
// 32, 64 or 128 bits. 


bits(width) V[integer n] 
assert n >= 0 && n <= 31;
assert width IN {8,16,32,64,128}; 
return _V[n]<width-1:0>; 


aarch64/functions/registers/Vpart 

// Vpart[] - non-assignment form
// Reads a 128-bit SIMD&FP register in up to two parts:
// part 0 returns the bottom 8, 16, 32 or 64 bits of the register;
// part 1 returns only the top 64 bits of the register.

bits(width) Vpart[integer n, integer part]
assert n >= 0 && n <= 31;
assert part IN {0, 1};
if part == 0 then
assert width IN {8,16,32,64}; 
return _V[n]<width-1:0>; 
else 
assert width == 64; 
return _V[n]<127:64>; 


// Vpart[] - assignment form
// Write a 128-bit SIMD&FP register in up to two parts:
// part 0 zero extends a 8, 16, 32, or 64-bit value to fill the whole register;
// part 1 inserts a 64-bit value into the top 64 bits of the register.

Vpart[integer n, integer part] = bits(width) value
assert n >= 0 && n <= 31;
assert part IN {0, 1};
if part == 0 then
assert width IN {8,16,32,64};
_V[n] = ZeroExtend(value);
else 
assert width == 64; 
_V[n]<127:64> = value<63:0>; 
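
The two Vpart[] forms can be sketched in Python on a single 128-bit register value as below. This is a minimal sketch: the function names vpart_read and vpart_write are assumptions, and the register is modelled as a plain Python integer instead of an entry in the _V[] array.

```python
MASK64 = (1 << 64) - 1

def vpart_read(v: int, part: int, width: int) -> int:
    """Read part 0 (low 'width' bits) or part 1 (top 64 bits) of a 128-bit value."""
    assert part in (0, 1)
    if part == 0:
        assert width in (8, 16, 32, 64)
        return v & ((1 << width) - 1)
    assert width == 64
    return (v >> 64) & MASK64

def vpart_write(v: int, part: int, value: int, width: int) -> int:
    """Return the new 128-bit register value after a Vpart[] write."""
    assert part in (0, 1)
    if part == 0:
        assert width in (8, 16, 32, 64)
        return value & ((1 << width) - 1)        # zero-extends over the whole register
    assert width == 64
    return (v & MASK64) | ((value & MASK64) << 64)   # low 64 bits are preserved
```

A part 0 write clears the upper bits of the register, while a part 1 write leaves the low 64 bits unchanged.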


aarch64/functions/registers/X 
// X[] - assignment form
// Write to general-purpose register from either a 32-bit or a 64-bit value.
X[integer n] = bits(width) value
assert n >= 0 && n <= 31;
assert width IN {32,64};
if n != 31 then
_R[n] = ZeroExtend(value);


return; 


// X[] - non-assignment form


// Read from general-purpose register with implicit slice of 8, 16, 32 or 64 bits. 


bits(width) X[integer n] 
assert n >= 0 && n <= 31; 
assert width IN {8,16,32,64}; 
if n != 31 then 
return _R[n]<width-1:0>; 
else 
return Zeros(width); 
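
The X[] accessors can be sketched in Python as below. This is an illustrative sketch: the class name GeneralRegisters and its method names are assumptions for the example. It shows the two properties the pseudocode encodes: a 32-bit write zero-extends into the full 64-bit register, and register number 31 reads as zero and ignores writes.

```python
class GeneralRegisters:
    """Hypothetical model of the _R[] array behind the X[] accessors."""

    def __init__(self) -> None:
        self._r = [0] * 31                 # X0..X30; index 31 has no storage

    def write(self, n: int, value: int, width: int) -> None:
        assert 0 <= n <= 31 and width in (32, 64)
        if n != 31:
            self._r[n] = value & ((1 << width) - 1)   # ZeroExtend(value)

    def read(self, n: int, width: int) -> int:
        assert 0 <= n <= 31 and width in (8, 16, 32, 64)
        if n == 31:
            return 0                       # Zeros(width)
        return self._r[n] & ((1 << width) - 1)
```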


aarch64/functions/sysregisters/CNTKCTL 


// CNTKCTL[] - non-assignment form
// ===============================


CNTKCTLType CNTKCTL[] 
return CNTKCTL_EL1; 


aarch64/functions/sysregisters/CNTKCTLType


type CNTKCTLType; 





Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. 
Non-Confidential 


ARM DDI 0487A.k_iss10775
ID092916


J1 ARMv8 Pseudocode


aarch64/functions/sysregisters/CPACR 


// CPACR[] - non-assignment form
// =============================


CPACRType CPACR[] 
return CPACR_EL1; 


aarch64/functions/sysregisters/CPACRType 


type CPACRType; 


aarch64/functions/sysregisters/ELR 


// ELR[] - non-assignment form
// ===========================


bits(64) ELR[bits(2) el] 

bits(64) r; 

case el of 
when EL1 r = ELR_EL1; 
when EL2 r = ELR_EL2; 
when EL3 r = ELR_EL3; 
otherwise Unreachable(); 

return r; 


// ELR[] - non-assignment form
// ===========================

bits(64) ELR[]
assert PSTATE.EL != EL0;
return ELR[PSTATE.EL]; 


// ELR[] - assignment form
// =======================

ELR[bits(2) el] = bits(64) value
bits(64) r = value;
case el of
when EL1 ELR_EL1 = r;
when EL2 ELR_EL2 = r;
when EL3 ELR_EL3 = r;
otherwise Unreachable();
return;


// ELR[] - assignment form
// =======================

ELR[] = bits(64) value
assert PSTATE.EL != EL0;
ELR[PSTATE.EL] = value;
return;


aarch64/functions/sysregisters/ESR 


// ESR[] - non-assignment form
// ===========================

ESRType ESR[bits(2) regime] 
bits(32) r; 
case regime of 
when EL1 r = ESR_EL1; 


when EL2 r = ESR_EL2; 

when EL3 r = ESR_EL3; 

otherwise Unreachable(); 
return r;


// ESR[] - non-assignment form
// ===========================


ESRType ESR[] 
return ESR[S1TranslationRegime()]; 


// ESR[] - assignment form
// =======================

ESR[bits(2) regime] = ESRType value
bits(32) r = value;
case regime of
when EL1 ESR_EL1 = r;
when EL2 ESR_EL2 = r;
when EL3 ESR_EL3 = r;
otherwise Unreachable();
return;


// ESR[] - assignment form
// =======================


ESR[] = ESRType value 
ESR[S1TranslationRegime()] = value; 


aarch64/functions/sysregisters/ESRType 


type ESRType; 


aarch64/functions/sysregisters/FAR 


// FAR[] - non-assignment form
// ===========================


bits(64) FAR[bits(2) regime] 

bits(64) r; 

case regime of 
when EL1 r = FAR_EL1; 
when EL2 r = FAR_EL2; 
when EL3 r = FAR_EL3; 
otherwise Unreachable(); 

return r; 


// FAR[] - non-assignment form
// ===========================


bits(64) FAR[] 
return FAR[S1TranslationRegime()]; 


// FAR[] - assignment form
// =======================

FAR[bits(2) regime] = bits(64) value
bits(64) r = value;
case regime of
when EL1 FAR_EL1 = r;
when EL2 FAR_EL2 = r;
when EL3 FAR_EL3 = r;
otherwise Unreachable();
return;


// FAR[] - assignment form
// =======================







FAR[] = bits(64) value 
FAR[S1TranslationRegime()] = value; 
return; 


aarch64/functions/sysregisters/MAIR 


// MAIR[] - non-assignment form
// ============================

MAIRType MAIR[bits(2) regime] 
bits(64) r; 
case regime of 
when EL1 r = MAIR_EL1; 


when EL2 r = MAIR_EL2;

when EL3 r = MAIR_EL3; 

otherwise Unreachable(); 
return r; 


// MAIR[] - non-assignment form
// ============================


MAIRType MAIR[] 
return MAIR[S1TranslationRegime()]; 


aarch64/functions/sysregisters/MAIRType 


type MAIRType; 


aarch64/functions/sysregisters/SCTLR 


// SCTLR[] - non-assignment form
// =============================

SCTLRType SCTLR[bits(2) regime] 
bits(32) r; 
case regime of 
when EL1 r = SCTLR_EL1; 


when EL2_ r = SCTLR_EL2; 

when EL3 r = SCTLR_EL3; 

otherwise Unreachable(); 
return r; 


// SCTLR[] - non-assignment form 


SCTLRType SCTLR[] 
return SCTLR[S1TranslationRegime()]; 


aarch64/functions/sysregisters/SCTLRType 


type SCTLRType; 


aarch64/functions/sysregisters/VBAR 


// VBAR[] - non-assignment form
// ============================

bits(64) VBAR[bits(2) regime]
    bits(64) r;
    case regime of
        when EL1  r = VBAR_EL1;
        when EL2  r = VBAR_EL2;
        when EL3  r = VBAR_EL3;
        otherwise Unreachable();
    return r;

// VBAR[] - non-assignment form
// ============================

bits(64) VBAR[]
    return VBAR[S1TranslationRegime()];


aarch64/functions/system/AArch64.CheckAdvSIMDFPSystemRegisterTraps 


// Checks if an AArch64 MSR, MRS or SYS instruction on a SIMD or floating-point
// register is trapped under the current configuration. Returns a boolean which
// is TRUE if trapping occurs, plus a binary value that specifies the Exception
// level trapped to.
(boolean, bits(2)) AArch64.CheckAdvSIMDFPSystemRegisterTraps(bits(2) op0, bits(3) op1, bits(4) crn,
                                                             bits(4) crm, bits(3) op2, bit read);


aarch64/functions/system/AArch64.CheckSystemAccess 


// AArch64.CheckSystemAccess()
// ===========================

AArch64.CheckSystemAccess(bits(2) op0, bits(3) op1, bits(4) crn, bits(4) crm, bits(3) op2, bits(5) rt,
                          bit read)
    // Checks if an AArch64 MSR, MRS or SYS instruction is UNALLOCATED or trapped at the current
    // exception level, security state and configuration, based on the opcode's encoding.
    boolean unallocated = FALSE;
    boolean need_secure = FALSE;
    bits(2) min_EL;

    // Check for traps by HCR_EL2.TIDCP
    if HaveEL(EL2) && !IsSecure() && HCR_EL2.TIDCP == '1' && op0 == 'x1' && crn == '1x11' then
        // At Non-secure EL0, it is IMPLEMENTATION_DEFINED whether attempts to execute system
        // register access instructions with reserved encodings are trapped to EL2 or UNDEFINED
        rcs_el0_trap = boolean IMPLEMENTATION_DEFINED "Reserved Control Space EL0 Trapped";
        if PSTATE.EL == EL0 && rcs_el0_trap then
            AArch64.SystemRegisterTrap(EL2, op0, op2, op1, crn, rt, crm, read);
        elsif PSTATE.EL == EL1 then
            AArch64.SystemRegisterTrap(EL2, op0, op2, op1, crn, rt, crm, read);

    // Check for unallocated encodings
    case op1 of
        when '00x', '010'
            min_EL = EL1;
        when '011'
            min_EL = EL0;
        when '100'
            min_EL = EL2;
        when '101'
            UnallocatedEncoding();
        when '110'
            min_EL = EL3;
        when '111'
            min_EL = EL1;
            need_secure = TRUE;

    if UInt(PSTATE.EL) < UInt(min_EL) then
        UnallocatedEncoding();
    elsif need_secure && !IsSecure() then
        UnallocatedEncoding();
    elsif AArch64.CheckUnallocatedSystemAccess(op0, op1, crn, crm, op2, read) then
        UnallocatedEncoding();

    // Check for traps on accesses to SIMD or floating-point registers
    (take_trap, target_el) = AArch64.CheckAdvSIMDFPSystemRegisterTraps(op0, op1, crn, crm, op2);
    if take_trap then
        AArch64.AdvSIMDFPAccessTrap(target_el);

    // Check for traps on access to all other system registers
    (take_trap, target_el) = AArch64.CheckSystemRegisterTraps(op0, op1, crn, crm, op2, read);
    if take_trap then
        AArch64.SystemRegisterTrap(target_el, op0, op2, op1, crn, rt, crm, read);
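
The op1-based part of the unallocated check in AArch64.CheckSystemAccess() can be sketched in Python. This is only an illustration, assuming Exception levels are modelled as the integers 0 to 3. The helper names `min_el_for_op1` and `unallocated_by_op1` are hypothetical and do not appear in the architecture pseudocode. The sketch covers neither the HCR_EL2.TIDCP trap nor the later trap checks.

```python
from typing import Optional, Tuple


def min_el_for_op1(op1: int) -> Tuple[Optional[int], bool]:
    """Map a 3-bit op1 value to (min_EL, need_secure).

    min_EL is None for the op1 value that is always unallocated ('101').
    """
    if op1 in (0b000, 0b001, 0b010):   # '00x', '010'
        return 1, False
    if op1 == 0b011:
        return 0, False
    if op1 == 0b100:
        return 2, False
    if op1 == 0b101:
        return None, False
    if op1 == 0b110:
        return 3, False
    return 1, True                      # '111': EL1 or higher, Secure state only


def unallocated_by_op1(op1: int, current_el: int, is_secure: bool) -> bool:
    """Return True if the op1 field alone makes the access unallocated."""
    min_el, need_secure = min_el_for_op1(op1)
    if min_el is None:
        return True
    if current_el < min_el:
        return True
    return need_secure and not is_secure
```

For example, an encoding with op1 == '100' is treated as unallocated when executed at EL1, because its minimum Exception level is EL2.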


aarch64/functions/system/AArch64.CheckSystemRegisterTraps


// Checks if an AArch64 MSR, MRS or SYS instruction on a system register is trapped
// under the current configuration. Returns a boolean which is TRUE if trapping
// occurs, plus a binary value that specifies the Exception level trapped to.
(boolean, bits(2)) AArch64.CheckSystemRegisterTraps(bits(2) op0, bits(3) op1, bits(4) crn, bits(4) crm,
                                                    bits(3) op2, bit read);


aarch64/functions/system/AArch64.CheckUnallocatedSystemAccess


// Checks if an AArch64 MSR, MRS or SYS instruction is unallocated under the current
// configuration.
boolean AArch64.CheckUnallocatedSystemAccess(bits(2) op0, bits(3) op1, bits(4) crn, bits(4) crm,
                                             bits(3) op2, bit read);


aarch64/functions/system/AArch64.SysInstr


// Execute a system instruction with write (source operand).
AArch64.SysInstr(integer op0, integer op1, integer crn, integer crm, integer op2, bits(64) val);


aarch64/functions/system/AArch64.SysInstrWithResult


// Execute a system instruction with read (result operand).
// Returns the result of the instruction.
bits(64) AArch64.SysInstrWithResult(integer op0, integer op1, integer crn, integer crm, integer op2);


aarch64/functions/system/AArch64.SysRegRead


// Read from a system register and return the contents of the register.
bits(64) System_Get(integer op0, integer op1, integer crn, integer crm, integer op2);


aarch64/functions/system/AArch64.SysRegWrite


// Write to a system register.
AArch64.SysRegWrite(integer op0, integer op1, integer crn, integer crm, integer op2, bits(64) val);





J1.1.4 aarch64/instrs


This section includes the following pseudocode functions:

• aarch64/instrs/branch/eret/AArch64.ExceptionReturn.
• aarch64/instrs/countop/CountOp.
• aarch64/instrs/extendreg/DecodeRegExtend.
• aarch64/instrs/extendreg/ExtendReg.
• aarch64/instrs/extendreg/ExtendType.
• aarch64/instrs/float/arithmetic/max-min/fpmaxminop/FPMaxMinOp.
• aarch64/instrs/float/arithmetic/unary/fpunaryop/FPUnaryOp.
• aarch64/instrs/float/convert/fpconvop/FPConvOp.
• aarch64/instrs/integer/bitfield/bfxpreferred/BFXPreferred.
• aarch64/instrs/integer/bitmasks/DecodeBitMasks.
• aarch64/instrs/integer/ins-ext/insert/movewide/movewideop/MoveWideOp.
• aarch64/instrs/integer/logical/movwpreferred/MoveWidePreferred.
• aarch64/instrs/integer/shiftreg/DecodeShift.
• aarch64/instrs/integer/shiftreg/ShiftReg.
• aarch64/instrs/integer/shiftreg/ShiftType.
• aarch64/instrs/logicalop/LogicalOp.
• aarch64/instrs/memory/memop/MemOp.
• aarch64/instrs/memory/prefetch/Prefetch.
• aarch64/instrs/system/barriers/barrierop/MemBarrierOp.
• aarch64/instrs/system/hints/syshintop/SystemHintOp.
• aarch64/instrs/system/register/cpsr/pstatefield/PSTATEField.
• aarch64/instrs/system/sysops/sysop/SysOp.
• aarch64/instrs/system/sysops/sysop/SystemOp.
• aarch64/instrs/vector/arithmetic/binary/uniform/logical/bsl-eor/vbitop/VBitOp.
• aarch64/instrs/vector/arithmetic/unary/cmp/compareop/CompareOp.
• aarch64/instrs/vector/crypto/enabled/CheckCryptoEnabled64.
• aarch64/instrs/vector/logical/immediateop/ImmediateOp.
• aarch64/instrs/vector/reduce/reduceop/Reduce.
• aarch64/instrs/vector/reduce/reduceop/ReduceOp.


aarch64/instrs/branch/eret/AArch64.ExceptionReturn 


// AArch64.ExceptionReturn()
// =========================

AArch64.ExceptionReturn(bits(64) new_pc, bits(32) spsr)

    // Attempts to change to an illegal state will invoke the Illegal Execution state mechanism
    SetPSTATEFromPSR(spsr);
    ClearExclusiveLocal(ProcessorID());
    EventRegisterSet();

    if spsr<4> == '1' then
        // For an attempt to change to AArch32 state, align PC[1:0] according
        // to the target instruction set state. If the exception return is illegal,
        // it is IMPLEMENTATION DEFINED whether this alignment takes place.
        align_pc = boolean IMPLEMENTATION_DEFINED "Align PC on illegal exception return";
        if PSTATE.IL == '0' || align_pc then
            if spsr<5> == '1' then              // T32
                new_pc = Align(new_pc, 2);
            else                                // A32
                new_pc = Align(new_pc, 4);

        // If the return was illegal, the 32 MSBs of the target PC might be zeroed
        if PSTATE.IL == '1' && ConstrainUnpredictableBool() then
            new_pc<63:32> = Zeros();

    if UsingAArch32() then
        // 32 most significant bits are ignored
        BranchTo(new_pc<31:0>, BranchType_UNKNOWN);
    else
        // For an illegal exception return it is IMPLEMENTATION DEFINED whether the return is
        // to the Exception level indicated by the SPSR, or to the Exception level
        // in which the exception return was executed.
        el_from_spsr = boolean IMPLEMENTATION_DEFINED "EL from SPSR on illegal exception return";
        target_el = PSTATE.EL;
        if PSTATE.IL == '1' && el_from_spsr then
            (-, target_el) = ELFromSPSR(spsr);
        new_pc = BranchAddr(new_pc, target_el);
        BranchToAddr(new_pc, BranchType_ERET);
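
The PC alignment step for a legal return to AArch32 state can be sketched in Python. This is a simplified illustration that assumes the return is legal (PSTATE.IL == '0'), so the IMPLEMENTATION DEFINED and CONSTRAINED UNPREDICTABLE choices in the pseudocode do not arise. The function name `align_return_pc` is hypothetical.

```python
def align_return_pc(new_pc: int, spsr: int) -> int:
    """Align the return address according to the SPSR for a legal exception return.

    spsr bit 4 set means the target is AArch32; spsr bit 5 then selects
    T32 (halfword alignment) or A32 (word alignment).
    """
    if (spsr >> 4) & 1:
        if (spsr >> 5) & 1:
            return new_pc & ~0x1   # T32: Align(new_pc, 2)
        return new_pc & ~0x3       # A32: Align(new_pc, 4)
    return new_pc                  # AArch64 target: no alignment applied here
```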


aarch64/instrs/countop/CountOp 


enumeration CountOp {CountOp_CLZ, CountOp_CLS, CountOp_CNT};


aarch64/instrs/extendreg/DecodeRegExtend


// DecodeRegExtend()
// =================
// Decode a register extension option

ExtendType DecodeRegExtend(bits(3) op)
    case op of
        when '000' return ExtendType_UXTB;
        when '001' return ExtendType_UXTH;
        when '010' return ExtendType_UXTW;
        when '011' return ExtendType_UXTX;
        when '100' return ExtendType_SXTB;
        when '101' return ExtendType_SXTH;
        when '110' return ExtendType_SXTW;
        when '111' return ExtendType_SXTX;





aarch64/instrs/extendreg/ExtendReg 


// ExtendReg()
// ===========
// Perform a register extension and shift

bits(N) ExtendReg(integer reg, ExtendType type, integer shift)
    assert shift >= 0 && shift <= 4;
    bits(N) val = X[reg];
    boolean unsigned;
    integer len;

    case type of
        when ExtendType_SXTB unsigned = FALSE; len = 8;
        when ExtendType_SXTH unsigned = FALSE; len = 16;
        when ExtendType_SXTW unsigned = FALSE; len = 32;
        when ExtendType_SXTX unsigned = FALSE; len = 64;
        when ExtendType_UXTB unsigned = TRUE;  len = 8;
        when ExtendType_UXTH unsigned = TRUE;  len = 16;
        when ExtendType_UXTW unsigned = TRUE;  len = 32;
        when ExtendType_UXTX unsigned = TRUE;  len = 64;

    // Note the extended width of the intermediate value and
    // that sign extension occurs from bit <len+shift-1>, not
    // from bit <len-1>. This is equivalent to the instruction
    //   [SU]BFIZ Rtmp, Rreg, #shift, #len
    // It may also be seen as a sign/zero extend followed by a shift:
    //   LSL(Extend(val<len-1:0>, N, unsigned), shift);

    len = Min(len, N - shift);
    return Extend(val<len-1:0> : Zeros(shift), N, unsigned);
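
The extend-then-shift behaviour of ExtendReg() can be sketched in Python with plain integers. This is an illustration under stated assumptions: the register value is passed in directly instead of being read through X[reg], the extension is described by a length and an `unsigned` flag instead of an ExtendType, and the function name `extend_reg` is hypothetical.

```python
def extend_reg(val: int, length: int, unsigned: bool, shift: int, n: int = 64) -> int:
    """Extend the low `length` bits of val, shift left by `shift`, return an n-bit result."""
    assert 0 <= shift <= 4
    assert n in (32, 64)
    length = min(length, n - shift)            # len = Min(len, N - shift)
    field = val & ((1 << length) - 1)          # val<len-1:0>
    shifted = field << shift                   # val<len-1:0> : Zeros(shift)
    width = length + shift
    if not unsigned and (shifted >> (width - 1)) & 1:
        shifted -= 1 << width                  # sign-extend from bit <len+shift-1>
    return shifted & ((1 << n) - 1)            # truncate to n bits
```

For instance, an SXTB extension of 0x80 with a shift of 2 yields 0xFFFFFFFFFFFFFE00 in a 64-bit result, which is the value -128 shifted left by two.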


aarch64/instrs/extendreg/ExtendType


enumeration ExtendType {ExtendType_SXTB, ExtendType_SXTH, ExtendType_SXTW, ExtendType_SXTX,
                        ExtendType_UXTB, ExtendType_UXTH, ExtendType_UXTW, ExtendType_UXTX};


aarch64/instrs/float/arithmetic/max-min/fpmaxminop/FPMaxMinOp


enumeration FPMaxMinOp {FPMaxMinOp_MAX, FPMaxMinOp_MIN,
                        FPMaxMinOp_MAXNUM, FPMaxMinOp_MINNUM};


aarch64/instrs/float/arithmetic/unary/fpunaryop/FPUnaryOp


enumeration FPUnaryOp {FPUnaryOp_ABS, FPUnaryOp_MOV,
                       FPUnaryOp_NEG, FPUnaryOp_SQRT};


aarch64/instrs/float/convert/fpconvop/FPConvOp


enumeration FPConvOp {FPConvOp_CVT_FtoI, FPConvOp_CVT_ItoF, 
FPConvOp_MOV_FtoI, FPConvOp_MOV_ItoF}; 


aarch64/instrs/integer/bitfield/bfxpreferred/BFXPreferred


// BFXPreferred()
// ==============
//
// Return TRUE if UBFX or SBFX is the preferred disassembly of a
// UBFM or SBFM bitfield instruction. Must exclude more specific
// aliases UBFIZ, SBFIZ, UXT[BH], SXT[BHW], LSL, LSR and ASR.

boolean BFXPreferred(bit sf, bit uns, bits(6) imms, bits(6) immr)
    integer S = UInt(imms);
    integer R = UInt(immr);

    // must not match UBFIZ/SBFIZ alias
    if UInt(imms) < UInt(immr) then
        return FALSE;

    // must not match LSR/ASR/LSL alias (imms == 31 or 63)
    if imms == sf:'11111' then
        return FALSE;

    // must not match UXTx/SXTx alias
    if immr == '000000' then
        // must not match 32-bit UXT[BH] or SXT[BH]
        if sf == '0' && imms IN {'000111', '001111'} then
            return FALSE;
        // must not match 64-bit SXT[BHW]
        if sf:uns == '10' && imms IN {'000111', '001111', '011111'} then
            return FALSE;

    // must be UBFX/SBFX alias
    return TRUE;
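
The alias-selection test in BFXPreferred() translates directly to Python. The sketch below is an illustration only; it assumes the bit fields are passed as small non-negative integers, and the name `bfx_preferred` is hypothetical.

```python
def bfx_preferred(sf: int, uns: int, imms: int, immr: int) -> bool:
    """Return True if UBFX/SBFX is the preferred disassembly of a UBFM/SBFM encoding."""
    # must not match UBFIZ/SBFIZ alias
    if imms < immr:
        return False
    # must not match LSR/ASR/LSL alias (imms == 31 or 63)
    if imms == ((sf << 5) | 0b11111):
        return False
    # must not match UXTx/SXTx alias
    if immr == 0:
        # 32-bit UXT[BH] or SXT[BH]
        if sf == 0 and imms in (0b000111, 0b001111):
            return False
        # 64-bit SXT[BHW]
        if (sf, uns) == (1, 0) and imms in (0b000111, 0b001111, 0b011111):
            return False
    return True
```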


aarch64/instrs/integer/bitmasks/DecodeBitMasks 


// DecodeBitMasks()
// ================

// Decode AArch64 bitfield and logical immediate masks which use a similar encoding structure

(bits(M), bits(M)) DecodeBitMasks(bit immN, bits(6) imms, bits(6) immr, boolean immediate)
    bits(M) tmask, wmask;
    bits(6) levels;

    // Compute log2 of element size
    // 2^len must be in range [2, M]
    len = HighestSetBit(immN:NOT(imms));
    if len < 1 then ReservedValue();
    assert M >= (1 << len);

    // Determine S, R and S - R parameters
    levels = ZeroExtend(Ones(len), 6);

    // For logical immediates an all-ones value of S is reserved
    // since it would generate a useless all-ones result (many times)
    if immediate && (imms AND levels) == levels then
        ReservedValue();

    S = UInt(imms AND levels);
    R = UInt(immr AND levels);
    diff = S - R;    // 6-bit subtract with borrow

    esize = 1 << len;
    d = UInt(diff<len-1:0>);
    welem = ZeroExtend(Ones(S + 1), esize);
    telem = ZeroExtend(Ones(d + 1), esize);
    wmask = Replicate(ROR(welem, R));
    tmask = Replicate(telem);
    return (wmask, tmask);
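
The mask construction in DecodeBitMasks() can be followed step by step in a Python sketch. This is an illustration under assumptions: masks are returned as Python integers, reserved encodings raise `ValueError` in place of ReservedValue(), and the names `replicate` and `decode_bit_masks` are hypothetical.

```python
def replicate(elem: int, esize: int, m: int) -> int:
    """Repeat an esize-bit element until it fills m bits."""
    out = 0
    for i in range(m // esize):
        out |= elem << (i * esize)
    return out


def decode_bit_masks(imm_n: int, imms: int, immr: int, immediate: bool, m: int = 64):
    """Return (wmask, tmask) for the given immN:imms:immr fields."""
    # len = HighestSetBit(immN:NOT(imms)); bit_length() - 1 is -1 when no bit is set
    length = ((imm_n << 6) | (~imms & 0x3F)).bit_length() - 1
    if length < 1:
        raise ValueError("reserved value")
    esize = 1 << length
    assert m >= esize

    levels = esize - 1                       # ZeroExtend(Ones(len), 6)
    if immediate and (imms & levels) == levels:
        raise ValueError("reserved value")   # all-ones S is reserved for logical immediates

    s = imms & levels
    r = immr & levels
    d = (s - r) & levels                     # diff<len-1:0>

    welem = (1 << (s + 1)) - 1               # Ones(S + 1)
    telem = (1 << (d + 1)) - 1               # Ones(d + 1)
    emask = (1 << esize) - 1
    rotated = ((welem >> r) | (welem << (esize - r))) & emask   # ROR(welem, R)
    return replicate(rotated, esize, m), replicate(telem, esize, m)
```

As a usage example, `decode_bit_masks(0, 0b111100, 0, True)` selects a 2-bit element containing a single set bit, which replicates to the 64-bit pattern 0x5555555555555555.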


aarch64/instrs/integer/ins-ext/insert/movewide/movewideop/MoveWideOp 


enumeration MoveWideOp {MoveWideOp_N, MoveWideOp_Z, MoveWideOp_K}; 


aarch64/instrs/integer/logical/movwpreferred/MoveWidePreferred 


// MoveWidePreferred()
// ===================
//
// Return TRUE if a bitmask immediate encoding would generate an immediate
// value that could also be represented by a single MOVZ or MOVN instruction.
// Used as a condition for the preferred MOV<-ORR alias.

boolean MoveWidePreferred(bit sf, bit immN, bits(6) imms, bits(6) immr)
    integer S = UInt(imms);
    integer R = UInt(immr);
    integer width = if sf == '1' then 64 else 32;

    // element size must equal total immediate size
    if sf == '1' && immN:imms != '1xxxxxx' then
        return FALSE;
    if sf == '0' && immN:imms != '00xxxxx' then
        return FALSE;

    // for MOVZ must contain no more than 16 ones
    if S < 16 then
        // ones must not span halfword boundary when rotated
        return (-R MOD 16) <= (15 - S);

    // for MOVN must contain no more than 16 zeros
    if S >= width - 15 then
        // zeros must not span halfword boundary when rotated
        return (R MOD 16) <= (S - (width - 15));

    return FALSE;
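
The halfword-boundary test in MoveWidePreferred() can be tried out with a short Python sketch. It is an illustration only, assuming the fields are passed as integers; the name `move_wide_preferred` is hypothetical. Python's `%` operator returns a non-negative result for a positive divisor, which matches the use of MOD here.

```python
def move_wide_preferred(sf: int, imm_n: int, imms: int, immr: int) -> bool:
    """Return True if the bitmask immediate could be produced by one MOVZ or MOVN."""
    s, r = imms, immr
    width = 64 if sf == 1 else 32

    # element size must equal total immediate size
    if sf == 1 and imm_n != 1:                                # immN:imms != '1xxxxxx'
        return False
    if sf == 0 and not (imm_n == 0 and (imms >> 5) == 0):     # immN:imms != '00xxxxx'
        return False

    # for MOVZ must contain no more than 16 ones
    if s < 16:
        # ones must not span halfword boundary when rotated
        return (-r % 16) <= (15 - s)

    # for MOVN must contain no more than 16 zeros
    if s >= width - 15:
        # zeros must not span halfword boundary when rotated
        return (r % 16) <= (s - (width - 15))

    return False
```

For example, eight ones rotated right by 60 (the value 0xFF0) stay inside one halfword, so the function returns True, while a rotation by 52 (the value 0xFF000) crosses a halfword boundary and returns False.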


aarch64/instrs/integer/shiftreg/DecodeShift


// DecodeShift()
// =============
// Decode shift encodings

ShiftType DecodeShift(bits(2) op)
    case op of
        when '00' return ShiftType_LSL;
        when '01' return ShiftType_LSR;
        when '10' return ShiftType_ASR;
        when '11' return ShiftType_ROR;


aarch64/instrs/integer/shiftreg/ShiftReg


// ShiftReg()
// ==========
// Perform shift of a register operand

bits(N) ShiftReg(integer reg, ShiftType type, integer amount)
    bits(N) result = X[reg];
    case type of
        when ShiftType_LSL result = LSL(result, amount);
        when ShiftType_LSR result = LSR(result, amount);
        when ShiftType_ASR result = ASR(result, amount);
        when ShiftType_ROR result = ROR(result, amount);
    return result;
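
The four shift types selected by DecodeShift() and applied by ShiftReg() can be modelled on fixed-width integers in Python. This is a sketch under assumptions: the register value is passed in directly, shift types are represented by the strings "LSL", "LSR", "ASR" and "ROR", and the names `decode_shift` and `shift_reg` are hypothetical.

```python
SHIFT_TYPES = ("LSL", "LSR", "ASR", "ROR")   # indexed by the 2-bit op field


def decode_shift(op: int) -> str:
    """Map the 2-bit shift encoding to a shift type name."""
    return SHIFT_TYPES[op & 0b11]


def shift_reg(value: int, shift_type: str, amount: int, n: int = 64) -> int:
    """Apply the shift to an n-bit value and return an n-bit result."""
    mask = (1 << n) - 1
    value &= mask
    if shift_type == "LSL":
        return (value << amount) & mask
    if shift_type == "LSR":
        return value >> amount
    if shift_type == "ASR":
        # reinterpret as signed so that >> replicates the sign bit
        signed = value - (1 << n) if value >> (n - 1) else value
        return (signed >> amount) & mask
    if shift_type == "ROR":
        amount %= n
        return ((value >> amount) | (value << (n - amount))) & mask
    raise ValueError("unknown shift type")
```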


aarch64/instrs/integer/shiftreg/ShiftType 


enumeration ShiftType {ShiftType_LSL, ShiftType_LSR, ShiftType_ASR, ShiftType_ROR}; 


aarch64/instrs/logicalop/LogicalOp 


enumeration LogicalOp {LogicalOp_AND, LogicalOp_EOR, LogicalOp_ORR};


aarch64/instrs/memory/memop/MemOp 


enumeration MemOp {MemOp_LOAD, MemOp_STORE, MemOp_PREFETCH}; 


aarch64/instrs/memory/prefetch/Prefetch 


// Prefetch()
// ==========

// Decode and execute the prefetch hint on ADDRESS specified by PRFOP

Prefetch(bits(64) address, bits(5) prfop)
    PrefetchHint hint;
    integer target;
    boolean stream;

    case prfop<4:3> of
        when '00' hint = Prefetch_READ;         // PLD: prefetch for load
        when '01' hint = Prefetch_EXEC;         // PLI: preload instructions
        when '10' hint = Prefetch_WRITE;        // PST: prepare for store
        when '11' return;                       // unallocated hint
    target = UInt(prfop<2:1>);                  // target cache level
    stream = (prfop<0> != '0');                 // streaming (non-temporal)
    Hint_Prefetch(address, hint, target, stream);
    return;
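
The field split performed by Prefetch() can be shown with a small Python decoder. This sketch only decodes the 5-bit prfop value and does not model Hint_Prefetch(). The function name `decode_prfop` and the string hint names are hypothetical.

```python
def decode_prfop(prfop: int):
    """Decode a 5-bit prfop value into (hint, target, stream).

    Returns None for the unallocated hint encoding (prfop<4:3> == '11').
    """
    kind = (prfop >> 3) & 0b11
    if kind == 0b11:
        return None                       # unallocated hint: nothing to do
    hint = ("READ", "EXEC", "WRITE")[kind]
    target = (prfop >> 1) & 0b11          # target cache level
    stream = (prfop & 1) != 0             # streaming (non-temporal)
    return hint, target, stream
```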


aarch64/instrs/system/barriers/barrierop/MemBarrierOp 


enumeration MemBarrierOp {MemBarrierOp_DSB, MemBarrierOp_DMB, MemBarrierOp_ISB}; 


aarch64/instrs/system/hints/syshintop/SystemHintOp 


enumeration SystemHintOp { 
SystemHintOp_NOP, 
SystemHintOp_YIELD, 
SystemHintOp_WFE, 
SystemHintOp_WFI, 
SystemHintOp_SEV, 
SystemHintOp_SEVL, 

}; 


aarch64/instrs/system/register/cpsr/pstatefield/PSTATEField 


enumeration PSTATEField {PSTATEField_DAIFSet, PSTATEField_DAIFClr,
                         PSTATEField_SP};


aarch64/instrs/system/sysops/sysop/SysOp


// SysOp()
// =======

SystemOp SysOp(bits(3) op1, bits(4) CRn, bits(4) CRm, bits(3) op2)
    case op1:CRn:CRm:op2 of
        when '000 0111 1000 000' return Sys_AT;     // S1E1R
        when '100 0111 1000 000' return Sys_AT;     // S1E2R
        when '110 0111 1000 000' return Sys_AT;     // S1E3R
        when '000 0111 1000 001' return Sys_AT;     // S1E1W
        when '100 0111 1000 001' return Sys_AT;     // S1E2W
        when '110 0111 1000 001' return Sys_AT;     // S1E3W
        when '000 0111 1000 010' return Sys_AT;     // S1E0R
        when '000 0111 1000 011' return Sys_AT;     // S1E0W
        when '100 0111 1000 100' return Sys_AT;     // S12E1R
        when '100 0111 1000 101' return Sys_AT;     // S12E1W
        when '100 0111 1000 110' return Sys_AT;     // S12E0R
        when '100 0111 1000 111' return Sys_AT;     // S12E0W
        when '011 0111 0100 001' return Sys_DC;     // ZVA
        when '000 0111 0110 001' return Sys_DC;     // IVAC
        when '000 0111 0110 010' return Sys_DC;     // ISW
        when '011 0111 1010 001' return Sys_DC;     // CVAC
        when '000 0111 1010 010' return Sys_DC;     // CSW
        when '011 0111 1011 001' return Sys_DC;     // CVAU
        when '011 0111 1110 001' return Sys_DC;     // CIVAC
        when '000 0111 1110 010' return Sys_DC;     // CISW
        when '000 0111 0001 000' return Sys_IC;     // IALLUIS
        when '000 0111 0101 000' return Sys_IC;     // IALLU
        when '011 0111 0101 001' return Sys_IC;     // IVAU
        when '100 1000 0000 001' return Sys_TLBI;   // IPAS2E1IS
        when '100 1000 0000 101' return Sys_TLBI;   // IPAS2LE1IS
        when '000 1000 0011 000' return Sys_TLBI;   // VMALLE1IS
        when '100 1000 0011 000' return Sys_TLBI;   // ALLE2IS
        when '110 1000 0011 000' return Sys_TLBI;   // ALLE3IS
        when '000 1000 0011 001' return Sys_TLBI;   // VAE1IS
        when '100 1000 0011 001' return Sys_TLBI;   // VAE2IS
        when '110 1000 0011 001' return Sys_TLBI;   // VAE3IS
        when '000 1000 0011 010' return Sys_TLBI;   // ASIDE1IS
        when '000 1000 0011 011' return Sys_TLBI;   // VAAE1IS
        when '100 1000 0011 100' return Sys_TLBI;   // ALLE1IS
        when '000 1000 0011 101' return Sys_TLBI;   // VALE1IS
        when '100 1000 0011 101' return Sys_TLBI;   // VALE2IS
        when '110 1000 0011 101' return Sys_TLBI;   // VALE3IS
        when '100 1000 0011 110' return Sys_TLBI;   // VMALLS12E1IS
        when '000 1000 0011 111' return Sys_TLBI;   // VAALE1IS
        when '100 1000 0100 001' return Sys_TLBI;   // IPAS2E1
        when '100 1000 0100 101' return Sys_TLBI;   // IPAS2LE1
        when '000 1000 0111 000' return Sys_TLBI;   // VMALLE1
        when '100 1000 0111 000' return Sys_TLBI;   // ALLE2
        when '110 1000 0111 000' return Sys_TLBI;   // ALLE3
        when '000 1000 0111 001' return Sys_TLBI;   // VAE1
        when '100 1000 0111 001' return Sys_TLBI;   // VAE2
        when '110 1000 0111 001' return Sys_TLBI;   // VAE3
        when '000 1000 0111 010' return Sys_TLBI;   // ASIDE1
        when '000 1000 0111 011' return Sys_TLBI;   // VAAE1
        when '100 1000 0111 100' return Sys_TLBI;   // ALLE1
        when '000 1000 0111 101' return Sys_TLBI;   // VALE1
        when '100 1000 0111 101' return Sys_TLBI;   // VALE2
        when '110 1000 0111 101' return Sys_TLBI;   // VALE3
        when '100 1000 0111 110' return Sys_TLBI;   // VMALLS12E1
        when '000 1000 0111 111' return Sys_TLBI;   // VAALE1
    return Sys_SYS;


aarch64/instrs/system/sysops/sysop/SystemOp 


enumeration SystemOp {Sys_AT, Sys_DC, Sys_IC, Sys_TLBI, Sys_SYS};


aarch64/instrs/vector/arithmetic/binary/uniform/logical/bsl-eor/vbitop/VBitOp


enumeration VBitOp {VBitOp_VBIF, VBitOp_VBIT, VBitOp_VBSL, VBitOp_VEOR}; 


aarch64/instrs/vector/arithmetic/unary/cmp/compareop/CompareOp 


enumeration CompareOp {CompareOp_GT, CompareOp_GE, CompareOp_EQ, 
CompareOp_LE, CompareOp_LT}; 


aarch64/instrs/vector/crypto/enabled/CheckCryptoEnabled64 


// CheckCryptoEnabled64()
// ======================

CheckCryptoEnabled64()
    AArch64.CheckFPAdvSIMDEnabled();
    return;


aarch64/instrs/vector/logical/immediateop/ImmediateOp 


enumeration ImmediateOp {ImmediateOp_MOVI, ImmediateOp_MVNI, 
ImmediateOp_ORR, ImmediateOp_BIC}; 


aarch64/instrs/vector/reduce/reduceop/Reduce 


// Reduce()
// ========

bits(esize) Reduce(ReduceOp op, bits(N) input, integer esize)
    integer half;
    bits(esize) hi;
    bits(esize) lo;
    bits(esize) result;

    if N == esize then
        return input;

    half = N DIV 2;
    hi = Reduce(op, input<N-1:half>, esize);
    lo = Reduce(op, input<half-1:0>, esize);

    case op of
        when ReduceOp_FMINNUM
            result = FPMinNum(lo, hi, FPCR);
        when ReduceOp_FMAXNUM
            result = FPMaxNum(lo, hi, FPCR);
        when ReduceOp_FMIN
            result = FPMin(lo, hi, FPCR);
        when ReduceOp_FMAX
            result = FPMax(lo, hi, FPCR);
        when ReduceOp_FADD
            result = FPAdd(lo, hi, FPCR);
        when ReduceOp_ADD
            result = lo + hi;

    return result;
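
The recursive halving in Reduce() can be sketched in Python for integer lanes. This is an illustration under assumptions: the vector is a packed integer of n bits, the operation is supplied as a two-argument callable, and the floating-point cases (which depend on FPCR) are not modelled. The name `reduce_lanes` is hypothetical.

```python
def reduce_lanes(op, value: int, n: int, esize: int) -> int:
    """Combine all esize-bit lanes of an n-bit value pairwise, as a balanced tree."""
    mask = (1 << esize) - 1
    if n == esize:
        return value & mask
    half = n // 2
    hi = reduce_lanes(op, value >> half, half, esize)               # input<N-1:half>
    lo = reduce_lanes(op, value & ((1 << half) - 1), half, esize)   # input<half-1:0>
    return op(lo, hi) & mask                                        # result is esize bits wide
```

With four 8-bit lanes 0x01, 0x02, 0x03 and 0xFF packed into 0xFF030201, an additive reduction wraps to 0x05. The tree shape matters for the floating-point operations in the pseudocode, because the order of pairing affects rounding.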


aarch64/instrs/vector/reduce/reduceop/ReduceOp 


enumeration ReduceOp {ReduceOp_FMINNUM, ReduceOp_FMAXNUM, 
ReduceOp_FMIN, ReduceOp_FMAX, 
ReduceOp_FADD, ReduceOp_ADD}; 







aarch64/translation 


This section includes the following pseudocode functions:

• aarch64/translation/attrs/AArch64.InstructionDevice.
• aarch64/translation/attrs/AArch64.S1AttrDecode.
• aarch64/translation/attrs/AArch64.TranslateAddressS1Off.
• aarch64/translation/checks/AArch64.CheckPermission.
• aarch64/translation/checks/AArch64.CheckS2Permission.
• aarch64/translation/debug/AArch64.CheckBreakpoint.
• aarch64/translation/debug/AArch64.CheckDebug.
• aarch64/translation/debug/AArch64.CheckWatchpoint.
• aarch64/translation/faults/AArch64.AccessFlagFault.
• aarch64/translation/faults/AArch64.AddressSizeFault.
• aarch64/translation/faults/AArch64.AlignmentFault.
• aarch64/translation/faults/AArch64.AsynchExternalAbort.
• aarch64/translation/faults/AArch64.DebugFault.
• aarch64/translation/faults/AArch64.NoFault.
• aarch64/translation/faults/AArch64.PermissionFault.
• aarch64/translation/faults/AArch64.TranslationFault.
• aarch64/translation/translation/AArch64.FirstStageTranslate.
• aarch64/translation/translation/AArch64.FullTranslate.
• aarch64/translation/translation/AArch64.SecondStageTranslate.
• aarch64/translation/translation/AArch64.SecondStageWalk.
• aarch64/translation/translation/AArch64.TranslateAddress.
• aarch64/translation/walk/AArch64.TranslationTableWalk.


aarch64/translation/attrs/AArch64.InstructionDevice 


// AArch64.InstructionDevice()
// ===========================
// Instruction fetches from memory marked as Device but not execute-never might generate a
// Permission Fault but are otherwise treated as if from Normal Non-cacheable memory.

AddressDescriptor AArch64.InstructionDevice(AddressDescriptor addrdesc, bits(64) vaddress,
                                            bits(48) ipaddress, integer level,
                                            AccType acctype, boolean iswrite, boolean secondstage,
                                            boolean s2fs1walk)

    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_FAULT};

    if c == Constraint_FAULT then
        addrdesc.fault = AArch64.PermissionFault(ipaddress, level, acctype, iswrite,
                                                 secondstage, s2fs1walk);
    else
        addrdesc.memattrs.type = MemType_Normal;
        addrdesc.memattrs.inner.attrs = MemAttr_NC;
        addrdesc.memattrs.inner.hints = MemHint_No;
        addrdesc.memattrs.outer = addrdesc.memattrs.inner;
        addrdesc.memattrs = MemAttrDefaults(addrdesc.memattrs);

    return addrdesc;


aarch64/translation/attrs/AArch64.S1AttrDecode 


// AArch64.S1AttrDecode()
// ======================
// Converts the Stage 1 attribute fields, using the MAIR, to orthogonal
// attributes and hints.

MemoryAttributes AArch64.S1AttrDecode(bits(2) SH, bits(3) attr, AccType acctype)
    MemoryAttributes memattrs;

    mair = MAIR[];
    index = 8 * UInt(attr);
    attrfield = mair<index+7:index>;

    if ((attrfield<7:4> != '0000' && attrfield<3:0> == '0000') ||
        (attrfield<7:4> == '0000' && attrfield<3:0> != 'xx00')) then
        // Reserved, maps to an allocated value
        (-, attrfield) = ConstrainUnpredictableBits();

    if attrfield<7:4> == '0000' then            // Device
        memattrs.type = MemType_Device;
        case attrfield<3:0> of
            when '0000' memattrs.device = DeviceType_nGnRnE;
            when '0100' memattrs.device = DeviceType_nGnRE;
            when '1000' memattrs.device = DeviceType_nGRE;
            when '1100' memattrs.device = DeviceType_GRE;
            otherwise   Unreachable();          // Reserved, handled above

    elsif attrfield<3:0> != '0000' then         // Normal
        memattrs.type = MemType_Normal;
        memattrs.outer = LongConvertAttrsHints(attrfield<7:4>, acctype);
        memattrs.inner = LongConvertAttrsHints(attrfield<3:0>, acctype);
        memattrs.shareable = SH<1> == '1';
        memattrs.outershareable = SH == '10';
    else
        Unreachable();                          // Reserved, handled above

    return MemAttrDefaults(memattrs);
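
The attribute-field selection and the Device/Normal split in AArch64.S1AttrDecode() can be sketched in Python. This is an illustration under assumptions: MAIR is passed in as a 64-bit integer, reserved encodings are reported as "Reserved" instead of being mapped to an allocated value through ConstrainUnpredictableBits(), and the Normal case returns the raw outer and inner nibbles without converting them to attributes and hints. The name `decode_mair_attr` is hypothetical.

```python
DEVICE_TYPES = {0b0000: "nGnRnE", 0b0100: "nGnRE", 0b1000: "nGRE", 0b1100: "GRE"}


def decode_mair_attr(mair: int, attr_index: int):
    """Classify the 8-bit MAIR field selected by a 3-bit attribute index."""
    field = (mair >> (8 * attr_index)) & 0xFF      # mair<index+7:index>
    outer, inner = field >> 4, field & 0xF
    if outer == 0:                                 # Device memory
        if inner not in DEVICE_TYPES:
            return ("Reserved", None)
        return ("Device", DEVICE_TYPES[inner])
    if inner == 0:                                 # outer != 0, inner == 0 is reserved
        return ("Reserved", None)
    return ("Normal", (outer, inner))
```

For a MAIR value of 0x000000FF440C0400, index 0 decodes as Device-nGnRnE, index 2 as Device-GRE, and index 4 as Normal memory with both nibbles equal to 0xF.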


aarch64/translation/attrs/AArch64.TranslateAddressS1Off


// AArch64.TranslateAddressS1Off()
// ===============================
// Called for stage 1 translations when translation is disabled to supply a default translation.
// Note that there are additional constraints on instruction prefetching that are not described in
// this pseudocode.

TLBRecord AArch64.TranslateAddressS1Off(bits(64) vaddress, AccType acctype, boolean iswrite)
    assert !ELUsingAArch32(S1TranslationRegime());

    TLBRecord result;

    Top = AddrTop(vaddress, PSTATE.EL);
    if !IsZero(vaddress<Top:PAMax()>) then
        level = 0;
        ipaddress = bits(48) UNKNOWN;
        secondstage = FALSE;
        s2fs1walk = FALSE;
        result.addrdesc.fault = AArch64.AddressSizeFault(ipaddress, level, acctype,
                                                         iswrite, secondstage, s2fs1walk);
        return result;

    default_cacheable = (HasS2Translation() && HCR_EL2.DC == '1');

    if default_cacheable then
        // Use default cacheable settings
        result.addrdesc.memattrs.type = MemType_Normal;
        result.addrdesc.memattrs.inner.attrs = MemAttr_WB;      // Write-back
        result.addrdesc.memattrs.inner.hints = MemHint_RWA;
        result.addrdesc.memattrs.shareable = FALSE;
        result.addrdesc.memattrs.outershareable = FALSE;
    elsif acctype != AccType_IFETCH then
        // Treat data as Device
        result.addrdesc.memattrs.type = MemType_Device;
        result.addrdesc.memattrs.device = DeviceType_nGnRnE;
        result.addrdesc.memattrs.inner = MemAttrHints UNKNOWN;
    else
        // Instruction cacheability controlled by SCTLR_ELx.I
        cacheable = SCTLR[].I == '1';
        result.addrdesc.memattrs.type = MemType_Normal;
        if cacheable then
            result.addrdesc.memattrs.inner.attrs = MemAttr_WT;
            result.addrdesc.memattrs.inner.hints = MemHint_RA;
        else
            result.addrdesc.memattrs.inner.attrs = MemAttr_NC;
            result.addrdesc.memattrs.inner.hints = MemHint_No;
        result.addrdesc.memattrs.shareable = TRUE;
        result.addrdesc.memattrs.outershareable = TRUE;

    result.addrdesc.memattrs.outer = result.addrdesc.memattrs.inner;
    result.addrdesc.memattrs = MemAttrDefaults(result.addrdesc.memattrs);

    result.perms.ap = bits(3) UNKNOWN;
    result.perms.xn = '0';
    result.perms.pxn = '0';

    result.nG = bit UNKNOWN;
    result.contiguous = boolean UNKNOWN;
    result.domain = bits(4) UNKNOWN;
    result.level = integer UNKNOWN;
    result.blocksize = integer UNKNOWN;
    result.addrdesc.paddress.physicaladdress = vaddress<47:0>;
    result.addrdesc.paddress.NS = if IsSecure() then '0' else '1';
    result.addrdesc.fault = AArch64.NoFault();

    return result;


aarch64/translation/checks/AArch64.CheckPermission 


// AArch64.CheckPermission()
// =========================
// Function used for permission checking from AArch64 stage 1 translations

FaultRecord AArch64.CheckPermission(Permissions perms, bits(64) vaddress, integer level,
                                    bit NS, AccType acctype, boolean iswrite)

    assert !ELUsingAArch32(S1TranslationRegime());

    wxn = SCTLR[].WXN == '1';

    if PSTATE.EL IN {EL0,EL1} then
        priv_r = TRUE;
        priv_w = perms.ap<2> == '0';
        user_r = perms.ap<1> == '1';
        user_w = perms.ap<2:1> == '01';

        if PSTATE.EL == EL0 then
            ispriv = FALSE;
        else
            ispriv = (acctype != AccType_UNPRIV);

        user_xn = perms.xn == '1' || (user_w && wxn);
        priv_xn = perms.pxn == '1' || (priv_w && wxn) || user_w;

        if ispriv then
            (r, w, xn) = (priv_r, priv_w, priv_xn);
        else
            (r, w, xn) = (user_r, user_w, user_xn);
    else
        // Access from EL2 or EL3
        r = TRUE;
        w = perms.ap<2> == '0';
        xn = perms.xn == '1' || (w && wxn);

    // Restriction on Secure instruction fetch
    if HaveEL(EL3) && IsSecure() && NS == '1' && SCR_EL3.SIF == '1' then
        xn = TRUE;

    if acctype == AccType_IFETCH then
        fail = xn;
        failedread = TRUE;
    elsif iswrite then
        fail = !w;
        failedread = FALSE;
    else
        fail = !r;
        failedread = TRUE;

    if fail then
        secondstage = FALSE;
        s2fs1walk = FALSE;
        ipaddress = bits(48) UNKNOWN;
        return AArch64.PermissionFault(ipaddress, level, acctype,
                                       !failedread, secondstage, s2fs1walk);
    else
        return AArch64.NoFault();
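
The way AArch64.CheckPermission() derives read, write and execute-never permissions from the AP, XN and PXN fields can be sketched in Python. This is an illustration under assumptions: the Exception level is an integer, the fields are passed as plain values, and the Secure instruction fetch restriction (SCR_EL3.SIF) is left out. The name `stage1_permissions` is hypothetical.

```python
def stage1_permissions(ap: int, xn: bool, pxn: bool, el: int, wxn: bool,
                       unpriv_access: bool = False):
    """Return (r, w, xn) for a stage 1 access, following the AP[2:1] decoding."""
    ap2 = (ap >> 2) & 1
    ap1 = (ap >> 1) & 1
    if el in (0, 1):
        priv_r = True
        priv_w = ap2 == 0
        user_r = ap1 == 1
        user_w = ap2 == 0 and ap1 == 1           # perms.ap<2:1> == '01'
        ispriv = el != 0 and not unpriv_access   # EL1 unprivileged accesses use user rights
        user_xn = xn or (user_w and wxn)
        priv_xn = pxn or (priv_w and wxn) or user_w
        if ispriv:
            return priv_r, priv_w, priv_xn
        return user_r, user_w, user_xn
    # Access from EL2 or EL3
    w = ap2 == 0
    return True, w, xn or (w and wxn)
```

One consequence visible in the sketch is that a page writable from EL0 is never executable from EL1, because `user_w` forces `priv_xn` to be true.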


aarch64/translation/checks/AArch64.CheckS2Permission 


// AArch64.CheckS2Permission()
// ===========================
// Function used for permission checking from AArch64 stage 2 translations

FaultRecord AArch64.CheckS2Permission(Permissions perms, bits(64) vaddress, bits(48) ipaddress,
                                      integer level, AccType acctype, boolean iswrite,
                                      boolean s2fs1walk)

    assert HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) && HasS2Translation();

    r = perms.ap<1> == '1';
    w = perms.ap<2> == '1';
    xn = perms.xn == '1';

    // Stage 1 walk is checked as a read, regardless of the original type
    if acctype == AccType_IFETCH && !s2fs1walk then
        fail = xn;
        failedread = TRUE;
    elsif iswrite && !s2fs1walk then
        fail = !w;
        failedread = FALSE;
    else
        fail = !r;
        failedread = !iswrite;

    if fail then
        domain = bits(4) UNKNOWN;
        secondstage = TRUE;
        return AArch64.PermissionFault(ipaddress, level, acctype,
                                       !failedread, secondstage, s2fs1walk);
    else
        return AArch64.NoFault();
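
The stage 2 decision in AArch64.CheckS2Permission() is simpler than the stage 1 case and can be sketched in a few lines of Python. This illustration returns only whether the access fails; fault record construction is not modelled, and the name `stage2_access_fails` is hypothetical.

```python
def stage2_access_fails(ap: int, xn: bool, is_ifetch: bool, is_write: bool,
                        s2fs1walk: bool) -> bool:
    """Return True if the stage 2 permissions deny the access."""
    r = (ap >> 1) & 1 == 1      # perms.ap<1>
    w = (ap >> 2) & 1 == 1      # perms.ap<2>
    # A stage 1 table walk is checked as a read, regardless of the original access type
    if is_ifetch and not s2fs1walk:
        return xn
    if is_write and not s2fs1walk:
        return not w
    return not r
```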


aarch64/translation/debug/AArch64.CheckBreakpoint 


// AArch64.CheckBreakpoint()
// =========================
// Called before executing the instruction of length "size" bytes at "vaddress" in an AArch64
// translation regime.
// The breakpoint can in fact be evaluated well ahead of execution, for example, at instruction
// fetch. This is the simple sequential execution of the program.

FaultRecord AArch64.CheckBreakpoint(bits(64) vaddress, integer size)
    assert !ELUsingAArch32(S1TranslationRegime());
    assert (UsingAArch32() && size IN {2,4}) || size == 4;

    match = FALSE;

    for i = 0 to UInt(ID_AA64DFR0_EL1.BRPs)
        match_i = AArch64.BreakpointMatch(i, vaddress, size);
        match = match || match_i;

    if match && HaltOnBreakpointOrWatchpoint() then
        reason = DebugHalt_Breakpoint;
        Halt(reason);
    elsif match && MDSCR_EL1.MDE == '1' && AArch64.GenerateDebugExceptions() then
        acctype = AccType_IFETCH;
        iswrite = FALSE;
        return AArch64.DebugFault(acctype, iswrite);
    else
        return AArch64.NoFault();


aarch64/translation/debug/AArch64.CheckDebug


// AArch64.CheckDebug()
// ====================
// Called on each access to check for a debug exception or entry to Debug state.

FaultRecord AArch64.CheckDebug(bits(64) vaddress, AccType acctype, boolean iswrite, integer size)

    FaultRecord fault = AArch64.NoFault();

    d_side = (acctype != AccType_IFETCH);
    generate_exception = AArch64.GenerateDebugExceptions() && MDSCR_EL1.MDE == '1';
    halt = HaltOnBreakpointOrWatchpoint();

    if generate_exception || halt then
        if d_side then
            fault = AArch64.CheckWatchpoint(vaddress, acctype, iswrite, size);
        else
            fault = AArch64.CheckBreakpoint(vaddress, size);

    return fault;


aarch64/translation/debug/AArch64.CheckWatchpoint 


// AArch64.CheckWatchpoint()
// Called before accessing the memory location of "size" bytes at "address".

FaultRecord AArch64.CheckWatchpoint(bits(64) vaddress, AccType acctype,
                                    boolean iswrite, integer size)
    assert !ELUsingAArch32(S1TranslationRegime());

    match = FALSE;
    ispriv = PSTATE.EL != EL0 && !(PSTATE.EL == EL1 && acctype == AccType_UNPRIV);

    for i = 0 to UInt(ID_AA64DFR0_EL1.WRPs)
        match = match || AArch64.WatchpointMatch(i, vaddress, size, ispriv, iswrite);

    if match && HaltOnBreakpointOrWatchpoint() then
        reason = DebugHalt_Watchpoint;
        Halt(reason);
    elsif match && MDSCR_EL1.MDE == '1' && AArch64.GenerateDebugExceptions() then
        return AArch64.DebugFault(acctype, iswrite);
    else
        return AArch64.NoFault();


aarch64/translation/faults/AArch64.AccessFlagFault 


// AArch64.AccessFlagFault()
// =========================

FaultRecord AArch64.AccessFlagFault(bits(48) ipaddress, integer level,
                                    AccType acctype, boolean iswrite, boolean secondstage,
                                    boolean s2fs1walk)

    extflag = bit UNKNOWN;
    return AArch64.CreateFaultRecord(Fault_AccessFlag, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.AddressSizeFault 


// AArch64.AddressSizeFault()
// ==========================

FaultRecord AArch64.AddressSizeFault(bits(48) ipaddress, integer level,
                                     AccType acctype, boolean iswrite, boolean secondstage,
                                     boolean s2fs1walk)

    extflag = bit UNKNOWN;
    return AArch64.CreateFaultRecord(Fault_AddressSize, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.AlignmentFault 


// AArch64.AlignmentFault()
// ========================

FaultRecord AArch64.AlignmentFault(AccType acctype, boolean iswrite, boolean secondstage)

    ipaddress = bits(48) UNKNOWN;
    level = integer UNKNOWN;
    extflag = bit UNKNOWN;
    s2fs1walk = boolean UNKNOWN;

    return AArch64.CreateFaultRecord(Fault_Alignment, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.AsynchExternalAbort 


// AArch64.AsynchExternalAbort()
// Wrapper function for asynchronous external aborts
FaultRecord AArch64.AsynchExternalAbort(boolean parity, bit extflag)

    type = if parity then Fault_AsyncParity else Fault_AsyncExternal;
    ipaddress = bits(48) UNKNOWN;
    level = integer UNKNOWN;
    acctype = AccType_NORMAL;
    iswrite = boolean UNKNOWN;
    secondstage = FALSE;
    s2fs1walk = FALSE;

    return AArch64.CreateFaultRecord(type, ipaddress, level, acctype, iswrite, extflag,
                                     secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.DebugFault


// AArch64.DebugFault()
// ====================

FaultRecord AArch64.DebugFault(AccType acctype, boolean iswrite)

    ipaddress = bits(48) UNKNOWN;
    level = integer UNKNOWN;
    extflag = bit UNKNOWN;
    secondstage = FALSE;
    s2fs1walk = FALSE;

    return AArch64.CreateFaultRecord(Fault_Debug, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.NoFault 


// AArch64.NoFault()
// =================

FaultRecord AArch64.NoFault()

    ipaddress = bits(48) UNKNOWN;
    level = integer UNKNOWN;
    acctype = AccType_NORMAL;
    iswrite = boolean UNKNOWN;
    extflag = bit UNKNOWN;
    secondstage = FALSE;
    s2fs1walk = FALSE;

    return AArch64.CreateFaultRecord(Fault_None, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.PermissionFault


// AArch64.PermissionFault()
// =========================

FaultRecord AArch64.PermissionFault(bits(48) ipaddress, integer level,
                                    AccType acctype, boolean iswrite, boolean secondstage,
                                    boolean s2fs1walk)
    extflag = bit UNKNOWN;

    return AArch64.CreateFaultRecord(Fault_Permission, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/faults/AArch64.TranslationFault 


// AArch64.TranslationFault()
// ==========================

FaultRecord AArch64.TranslationFault(bits(48) ipaddress, integer level,
                                     AccType acctype, boolean iswrite, boolean secondstage,
                                     boolean s2fs1walk)
    extflag = bit UNKNOWN;
    return AArch64.CreateFaultRecord(Fault_Translation, ipaddress, level, acctype, iswrite,
                                     extflag, secondstage, s2fs1walk);


aarch64/translation/translation/AArch64.FirstStageTranslate 


// AArch64.FirstStageTranslate()
// Perform a stage 1 translation walk. The function used by Address Translation operations is
// similar except it uses the translation regime specified for the instruction.

AddressDescriptor AArch64.FirstStageTranslate(bits(64) vaddress, AccType acctype, boolean iswrite,
                                              boolean wasaligned, integer size)

    if HasS2Translation() then
        s1_enabled = HCR_EL2.TGE == '0' && HCR_EL2.DC == '0' && SCTLR_EL1.M == '1';
    else
        s1_enabled = SCTLR[].M == '1';

    ipaddress = bits(48) UNKNOWN;
    secondstage = FALSE;
    s2fs1walk = FALSE;

    if s1_enabled then                         // First stage enabled
        S1 = AArch64.TranslationTableWalk(ipaddress, vaddress, acctype, iswrite, secondstage,
                                          s2fs1walk, size);
        permissioncheck = TRUE;
    else
        S1 = AArch64.TranslateAddressS1Off(vaddress, acctype, iswrite);
        permissioncheck = FALSE;

    // Check for unaligned data accesses to Device memory
    if ((!wasaligned && acctype != AccType_IFETCH) || (acctype == AccType_DCZVA))
        && S1.addrdesc.memattrs.type == MemType_Device && !IsFault(S1.addrdesc) then
        S1.addrdesc.fault = AArch64.AlignmentFault(acctype, iswrite, secondstage);
    if !IsFault(S1.addrdesc) && permissioncheck then
        S1.addrdesc.fault = AArch64.CheckPermission(S1.perms, vaddress, S1.level,
                                                    S1.addrdesc.paddress.NS,
                                                    acctype, iswrite);

    // Check for instruction fetches from Device memory not marked as execute-never. If there has
    // not been a Permission Fault then the memory is not marked execute-never.
    if (!IsFault(S1.addrdesc) && S1.addrdesc.memattrs.type == MemType_Device &&
        acctype == AccType_IFETCH) then
        S1.addrdesc = AArch64.InstructionDevice(S1.addrdesc, vaddress, ipaddress, S1.level,
                                                acctype, iswrite,
                                                secondstage, s2fs1walk);

    return S1.addrdesc;


aarch64/translation/translation/AArch64.FullTranslate 
// AArch64.FullTranslate()
// Perform both stage 1 and stage 2 translation walks for the current translation regime. The
// function used by Address Translation operations is similar except it uses the translation
// regime specified for the instruction.

AddressDescriptor AArch64.FullTranslate(bits(64) vaddress, AccType acctype, boolean iswrite,
                                        boolean wasaligned, integer size)

    // First Stage Translation
    S1 = AArch64.FirstStageTranslate(vaddress, acctype, iswrite, wasaligned, size);
    if !IsFault(S1) && HasS2Translation() then
        s2fs1walk = FALSE;
        result = AArch64.SecondStageTranslate(S1, vaddress, acctype, iswrite, wasaligned, s2fs1walk,
                                              size);
    else
        result = S1;
    return result;
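The control flow of AArch64.FullTranslate() can be modelled with a short Python sketch. Everything here is illustrative: the Desc class, the "none" fault marker and the stage callables are assumptions standing in for AddressDescriptor, Fault_None and the two stage functions, and no real translation is performed.

```python
from dataclasses import dataclass


@dataclass
class Desc:
    """Hypothetical stand-in for AddressDescriptor."""
    address: int
    fault: str = "none"  # "none" plays the role of Fault_None


def full_translate(vaddress, stage1, stage2, has_s2):
    """Run stage 1, then stage 2 only if stage 1 did not fault and stage 2 applies."""
    s1 = stage1(vaddress)
    if s1.fault == "none" and has_s2:
        return stage2(s1)
    return s1
```

A stage 1 fault is returned unchanged, so the stage 2 callable is never invoked for a faulting stage 1 result.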
aarch64/translation/translation/AArch64.SecondStageTranslate


// AArch64.SecondStageTranslate()
// ==============================
// Perform a stage 2 translation walk. The function used by Address Translation operations is
// similar except it uses the translation regime specified for the instruction.

AddressDescriptor AArch64.SecondStageTranslate(AddressDescriptor S1, bits(64) vaddress,
                                               AccType acctype, boolean iswrite, boolean wasaligned,
                                               boolean s2fs1walk, integer size)
    assert HasS2Translation();

    s2_enabled = HCR_EL2.VM == '1' || HCR_EL2.DC == '1';
    secondstage = TRUE;

    if s2_enabled then                        // Second stage enabled
        ipaddress = S1.paddress.physicaladdress<47:0>;

        S2 = AArch64.TranslationTableWalk(ipaddress, vaddress, acctype, iswrite, secondstage,
                                          s2fs1walk, size);

        // Check for unaligned data accesses to Device memory
        if ((!wasaligned && acctype != AccType_IFETCH) || (acctype == AccType_DCZVA))
            && S2.addrdesc.memattrs.type == MemType_Device && !IsFault(S2.addrdesc) then
            S2.addrdesc.fault = AArch64.AlignmentFault(acctype, iswrite, secondstage);

        if !IsFault(S2.addrdesc) then
            S2.addrdesc.fault = AArch64.CheckS2Permission(S2.perms, vaddress, ipaddress, S2.level,
                                                          acctype, iswrite, s2fs1walk);

        // Check for instruction fetches from Device memory not marked as execute-never. As there
        // has not been a Permission Fault then the memory is not marked execute-never.
        if (!s2fs1walk && !IsFault(S2.addrdesc) && S2.addrdesc.memattrs.type == MemType_Device &&
            acctype == AccType_IFETCH) then
            S2.addrdesc = AArch64.InstructionDevice(S2.addrdesc, vaddress, ipaddress, S2.level,
                                                    acctype, iswrite,
                                                    secondstage, s2fs1walk);

        // Check for protected table walk
        if (s2fs1walk && !IsFault(S2.addrdesc) && HCR_EL2.PTW == '1' &&
            S2.addrdesc.memattrs.type == MemType_Device) then
            S2.addrdesc.fault = AArch64.PermissionFault(ipaddress, S2.level, acctype,
                                                        iswrite, secondstage, s2fs1walk);

        result = CombineS1S2Desc(S1, S2.addrdesc);
    else
        result = S1;

    return result;


aarch64/translation/translation/AArch64.SecondStageWalk 
// AArch64.SecondStageWalk()
// Perform a stage 2 translation on a stage 1 translation page table walk access.

AddressDescriptor AArch64.SecondStageWalk(AddressDescriptor S1, bits(64) vaddress, AccType acctype,
                                          boolean iswrite, integer size)

    assert HasS2Translation();

    s2fs1walk = TRUE;
    wasaligned = TRUE;
    return AArch64.SecondStageTranslate(S1, vaddress, acctype, iswrite, wasaligned, s2fs1walk,
                                        size);


aarch64/translation/translation/AArch64.TranslateAddress


// AArch64.TranslateAddress()
// Main entry point for translating an address

AddressDescriptor AArch64.TranslateAddress(bits(64) vaddress, AccType acctype, boolean iswrite,
                                           boolean wasaligned, integer size)

    result = AArch64.FullTranslate(vaddress, acctype, iswrite, wasaligned, size);

    if !(acctype IN {AccType_PTW, AccType_IC, AccType_AT}) && !IsFault(result) then
        result.fault = AArch64.CheckDebug(vaddress, acctype, iswrite, size);

    // Update virtual address for abort functions
    result.vaddress = ZeroExtend(vaddress);

    return result;


aarch64/translation/walk/AArch64.TranslationTableWalk


// AArch64.TranslationTableWalk()
// Returns a result of a translation table walk
//
// Implementations might cache information from memory in any number of non-coherent TLB
// caching structures, and so avoid memory accesses that have been expressed in this
// pseudocode. The use of such TLBs is not expressed in this pseudocode.

TLBRecord AArch64.TranslationTableWalk(bits(48) ipaddress, bits(64) vaddress,
                                       AccType acctype, boolean iswrite, boolean secondstage,
                                       boolean s2fs1walk, integer size)
    if !secondstage then
        assert !ELUsingAArch32(S1TranslationRegime());
    else
        assert HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) && HasS2Translation();

    TLBRecord result;
    AddressDescriptor descaddr;
    bits(64) baseregister;
    bits(64) inputaddr;        // Input Address is 'vaddress' for stage 1, 'ipaddress' for stage 2

    descaddr.memattrs.type = MemType_Normal;

    // Derived parameters for the page table walk:
    //  grainsize = Log2(Size of Table)         - Size of Table is 4KB, 16KB or 64KB in AArch64
    //  stride = Log2(Address per Level)        - Bits of address consumed at each level
    //  firstblocklevel = First level where a block entry is allowed
    //  ps = Physical Address size as encoded in TCR_EL1.IPS or TCR_ELx/VTCR_EL2.PS
    //  inputsize = Log2(Size of Input Address) - Input Address size in bits
    //  level = Level to start walk from
    // This means that the number of levels after start level = 3-level


    if !secondstage then
        // First stage translation
        inputaddr = ZeroExtend(vaddress);
        top = AddrTop(inputaddr, PSTATE.EL);
        if PSTATE.EL == EL3 then
            largegrain = TCR_EL3.TG0 == '01';
            midgrain = TCR_EL3.TG0 == '10';
            inputsize = 64 - UInt(TCR_EL3.T0SZ);
            if inputsize > 48 then
                c = ConstrainUnpredictable();
                assert c IN {Constraint_FORCE, Constraint_FAULT};
                if c == Constraint_FORCE then inputsize = 48;
            if inputsize < 25 then
                c = ConstrainUnpredictable();
                assert c IN {Constraint_FORCE, Constraint_FAULT};
                if c == Constraint_FORCE then inputsize = 25;
            ps = TCR_EL3.PS;
            basefound = inputsize >= 25 && inputsize <= 48 && IsZero(inputaddr<top:inputsize>);
            disabled = FALSE;
            baseregister = TTBR0_EL3;
            descaddr.memattrs = WalkAttrDecode(TCR_EL3.SH0, TCR_EL3.ORGN0, TCR_EL3.IRGN0, secondstage);
            reversedescriptors = SCTLR_EL3.EE == '1';
            lookupsecure = TRUE;
            singlepriv = TRUE;
        elsif PSTATE.EL == EL2 then
            inputsize = 64 - UInt(TCR_EL2.T0SZ);
            largegrain = TCR_EL2.TG0 == '01';
            midgrain = TCR_EL2.TG0 == '10';
            if inputsize > 48 then
                c = ConstrainUnpredictable();
                assert c IN {Constraint_FORCE, Constraint_FAULT};
                if c == Constraint_FORCE then inputsize = 48;
            if inputsize < 25 then
                c = ConstrainUnpredictable();
                assert c IN {Constraint_FORCE, Constraint_FAULT};
                if c == Constraint_FORCE then inputsize = 25;
            ps = TCR_EL2.PS;
            basefound = inputsize >= 25 && inputsize <= 48 && IsZero(inputaddr<top:inputsize>);
            disabled = FALSE;
            baseregister = TTBR0_EL2;
            descaddr.memattrs = WalkAttrDecode(TCR_EL2.SH0, TCR_EL2.ORGN0, TCR_EL2.IRGN0, secondstage);
            reversedescriptors = SCTLR_EL2.EE == '1';
            lookupsecure = FALSE;
            singlepriv = TRUE;
        else
            if inputaddr<top> == '0' then
                inputsize = 64 - UInt(TCR_EL1.T0SZ);
                largegrain = TCR_EL1.TG0 == '01';
                midgrain = TCR_EL1.TG0 == '10';
                if inputsize > 48 then
                    c = ConstrainUnpredictable();
                    assert c IN {Constraint_FORCE, Constraint_FAULT};
                    if c == Constraint_FORCE then inputsize = 48;
                if inputsize < 25 then
                    c = ConstrainUnpredictable();
                    assert c IN {Constraint_FORCE, Constraint_FAULT};
                    if c == Constraint_FORCE then inputsize = 25;
                basefound = inputsize >= 25 && inputsize <= 48 && IsZero(inputaddr<top:inputsize>);
                disabled = TCR_EL1.EPD0 == '1';
                baseregister = TTBR0_EL1;
                descaddr.memattrs = WalkAttrDecode(TCR_EL1.SH0, TCR_EL1.ORGN0, TCR_EL1.IRGN0,
                                                   secondstage);
            else
                inputsize = 64 - UInt(TCR_EL1.T1SZ);
                largegrain = TCR_EL1.TG1 == '11';       // TG1 and TG0 encodings differ
                midgrain = TCR_EL1.TG1 == '01';
                if inputsize > 48 then
                    c = ConstrainUnpredictable();
                    assert c IN {Constraint_FORCE, Constraint_FAULT};
                    if c == Constraint_FORCE then inputsize = 48;
                if inputsize < 25 then
                    c = ConstrainUnpredictable();
                    assert c IN {Constraint_FORCE, Constraint_FAULT};
                    if c == Constraint_FORCE then inputsize = 25;
                basefound = inputsize >= 25 && inputsize <= 48 && IsOnes(inputaddr<top:inputsize>);
                disabled = TCR_EL1.EPD1 == '1';
                baseregister = TTBR1_EL1;
                descaddr.memattrs = WalkAttrDecode(TCR_EL1.SH1, TCR_EL1.ORGN1, TCR_EL1.IRGN1,
                                                   secondstage);

            ps = TCR_EL1.IPS;
            reversedescriptors = SCTLR_EL1.EE == '1';
            lookupsecure = IsSecure();
            singlepriv = FALSE;

        if largegrain then
            grainsize = 16;                 // Log2(64KB page size)
            firstblocklevel = 2;            // Largest block is 512MB (2^29 bytes)
        elsif midgrain then
            grainsize = 14;                 // Log2(16KB page size)
            firstblocklevel = 2;            // Largest block is 32MB (2^25 bytes)
        else // Small grain
            grainsize = 12;                 // Log2(4KB page size)
            firstblocklevel = 1;            // Largest block is 1GB (2^30 bytes)
        stride = grainsize - 3;             // Log2(page size / 8 bytes)
        // The starting level is the number of strides needed to consume the input address
        level = 4 - RoundUp(Real(inputsize - grainsize) / Real(stride));

    else
        // Second stage translation
        inputaddr = ZeroExtend(ipaddress);
        inputsize = 64 - UInt(VTCR_EL2.T0SZ);
        largegrain = VTCR_EL2.TG0 == '01';
        midgrain = VTCR_EL2.TG0 == '10';
        if inputsize > 48 then
            c = ConstrainUnpredictable();
            assert c IN {Constraint_FORCE, Constraint_FAULT};
            if c == Constraint_FORCE then inputsize = 48;
        if inputsize < 25 then
            c = ConstrainUnpredictable();
            assert c IN {Constraint_FORCE, Constraint_FAULT};
            if c == Constraint_FORCE then inputsize = 25;
        ps = VTCR_EL2.PS;
        basefound = inputsize >= 25 && inputsize <= 48 && IsZero(inputaddr<63:inputsize>);
        disabled = FALSE;
        baseregister = VTTBR_EL2;
        descaddr.memattrs = WalkAttrDecode(VTCR_EL2.IRGN0, VTCR_EL2.ORGN0, VTCR_EL2.SH0, secondstage);
        reversedescriptors = SCTLR_EL2.EE == '1';
        lookupsecure = FALSE;
        singlepriv = TRUE;

        startlevel = UInt(VTCR_EL2.SL0);
        if largegrain then
            grainsize = 16;                 // Log2(64KB page size)
            level = 3 - startlevel;
            firstblocklevel = 2;            // Largest block is 512MB (2^29 bytes)
        elsif midgrain then
            grainsize = 14;                 // Log2(16KB page size)
            level = 3 - startlevel;
            firstblocklevel = 2;            // Largest block is 32MB (2^25 bytes)
        else // Small grain
            grainsize = 12;                 // Log2(4KB page size)
            level = 2 - startlevel;
            firstblocklevel = 1;            // Largest block is 1GB (2^30 bytes)
        stride = grainsize - 3;             // Log2(page size / 8 bytes)

        // Limits on IPA controls based on implemented PA size. Level 0 is only
        // supported by small grain translations
        if largegrain then                  // 64KB pages
            // Level 1 only supported if implemented PA size is greater than 2^42 bytes
            if level == 0 || (level == 1 && PAMax() <= 42) then basefound = FALSE;
        elsif midgrain then                 // 16KB pages
            // Level 1 only supported if implemented PA size is greater than 2^40 bytes
            if level == 0 || (level == 1 && PAMax() <= 40) then basefound = FALSE;
        else                                // Small grain, 4KB pages
            // Level 0 only supported if implemented PA size is greater than 2^42 bytes
            if level < 0 || (level == 0 && PAMax() <= 42) then basefound = FALSE;

        // If the inputsize exceeds the PAMax value, the behavior is CONSTRAINED UNPREDICTABLE
        inputsizecheck = inputsize;
        if inputsize > PAMax() && (!ELUsingAArch32(EL1) || inputsize > 40) then
            case ConstrainUnpredictable() of
                when Constraint_FORCE
                    // Restrict the inputsize to the PAMax value
                    inputsize = PAMax();
                    inputsizecheck = PAMax();
                when Constraint_FORCENOSLCHECK
                    // As FORCE, except use the configured inputsize in the size checks below
                    inputsize = PAMax();
                when Constraint_FAULT
                    // Generate a translation fault
                    basefound = FALSE;
                otherwise
                    Unreachable();

        // Number of entries in the starting level table =
        //     (Size of Input Address)/((Address per level)^(Num levels remaining)*(Size of Table))
        startsizecheck = inputsizecheck - ((3 - level)*stride + grainsize); // Log2(Num of entries)

        // Check for starting level table with fewer than 2 entries or longer than 16 pages.
        // Lower bound check is:  startsizecheck < Log2(2 entries)
        // Upper bound check is:  startsizecheck > Log2(pagesize/8*16)
        if startsizecheck < 1 || startsizecheck > stride + 4 then basefound = FALSE;


    if !basefound || disabled then
        level = 0;           // AArch32 reports this as a level 1 fault
        result.addrdesc.fault = AArch64.TranslationFault(ipaddress, level, acctype, iswrite,
                                                         secondstage, s2fs1walk);
        return result;

    case ps of
        when '000'  outputsize = 32;
        when '001'  outputsize = 36;
        when '010'  outputsize = 40;
        when '011'  outputsize = 42;
        when '100'  outputsize = 44;
        when '101'  outputsize = 48;
        otherwise   outputsize = 48;

    if outputsize > PAMax() then outputsize = PAMax();

    if outputsize < 48 && !IsZero(baseregister<47:outputsize>) then
        level = 0;
        result.addrdesc.fault = AArch64.AddressSizeFault(ipaddress, level, acctype, iswrite,
                                                         secondstage, s2fs1walk);
        return result;

    // Bottom bound of the Base address is:
    //     Log2(8 bytes per entry)+Log2(Number of entries in starting level table)
    // Number of entries in starting level table =
    //     (Size of Input Address)/((Address per level)^(Num levels remaining)*(Size of Table))
    baselowerbound = 3 + inputsize - ((3-level)*stride + grainsize);  // Log2(Num of entries*8)
    baseaddress = baseregister<47:baselowerbound>:Zeros(baselowerbound);

    ns_table = if lookupsecure then '0' else '1';
    ap_table = '00';
    xn_table = '0';
    pxn_table = '0';

    addrselecttop = inputsize - 1;


    repeat
        addrselectbottom = (3-level)*stride + grainsize;

        bits(48) index = ZeroExtend(inputaddr<addrselecttop:addrselectbottom>:'000');
        descaddr.paddress.physicaladdress = baseaddress OR index;
        descaddr.paddress.NS = ns_table;

        // If there are two stages of translation, then the first stage table walk addresses
        // are themselves subject to translation
        if secondstage || !HasS2Translation() then
            descaddr2 = descaddr;
        else
            descaddr2 = AArch64.SecondStageWalk(descaddr, vaddress, acctype, iswrite, 8);
            // Check for a fault on the stage 2 walk
            if IsFault(descaddr2) then
                result.addrdesc.fault = descaddr2.fault;
                return result;

        // Update virtual address for abort functions
        descaddr2.vaddress = ZeroExtend(vaddress);

        desc = _Mem[descaddr2, 8, AccType_PTW];
        if reversedescriptors then desc = BigEndianReverse(desc);

        if desc<0> == '0' || (desc<1:0> == '01' && level == 3) then
            // Fault (00), Reserved (10), or Block (01) at level 3
            result.addrdesc.fault = AArch64.TranslationFault(ipaddress, level, acctype,
                                                             iswrite, secondstage, s2fs1walk);
            return result;

        // Valid Block, Page, or Table entry
        if desc<1:0> == '01' || level == 3 then                 // Block (01) or Page (11)
            blocktranslate = TRUE;
        else                                                    // Table (11)
            if outputsize != 48 && !IsZero(desc<47:outputsize>) then
                result.addrdesc.fault = AArch64.AddressSizeFault(ipaddress, level, acctype,
                                                                 iswrite, secondstage, s2fs1walk);
                return result;

            baseaddress = desc<47:grainsize>:Zeros(grainsize);
            if !secondstage then
                // Unpack the upper and lower table attributes
                ns_table    = ns_table    OR desc<63>;
                ap_table<1> = ap_table<1> OR desc<62>;          // read-only
                xn_table    = xn_table    OR desc<60>;
                // pxn_table and ap_table[0] apply only in EL1&0 translation regimes
                if !singlepriv then
                    ap_table<0> = ap_table<0> OR desc<61>;      // privileged
                    pxn_table   = pxn_table   OR desc<59>;

            level = level + 1;
            addrselecttop = addrselectbottom - 1;
            blocktranslate = FALSE;
    until blocktranslate;


    // Check block size is supported at this level
    if level < firstblocklevel then
        result.addrdesc.fault = AArch64.TranslationFault(ipaddress, level, acctype,
                                                         iswrite, secondstage, s2fs1walk);
        return result;

    // Check for misprogramming of the contiguous bit
    if largegrain then
        contiguousbitcheck = level == 2 && inputsize < 34;
    elsif midgrain then
        contiguousbitcheck = level == 2 && inputsize < 30;
    else
        contiguousbitcheck = level == 1 && inputsize < 34;

    if contiguousbitcheck && desc<52> == '1' then
        if boolean IMPLEMENTATION_DEFINED "Translation fault on misprogrammed contiguous bit" then
            result.addrdesc.fault = AArch64.TranslationFault(ipaddress, level, acctype,
                                                             iswrite, secondstage, s2fs1walk);
            return result;

    // Check the output address is inside the supported range
    if outputsize != 48 && !IsZero(desc<47:outputsize>) then
        result.addrdesc.fault = AArch64.AddressSizeFault(ipaddress, level, acctype,
                                                         iswrite, secondstage, s2fs1walk);
        return result;

    // Unpack the descriptor into address and upper and lower block attributes
    outputaddress = desc<47:addrselectbottom>:inputaddr<addrselectbottom-1:0>;
    // Check the access flag
    if desc<10> == '0' then
        result.addrdesc.fault = AArch64.AccessFlagFault(ipaddress, level, acctype,
                                                        iswrite, secondstage, s2fs1walk);
        return result;
    xn = desc<54>;
    pxn = desc<53>;
    contiguousbit = desc<52>;
    nG = desc<11>;
    sh = desc<9:8>;
    ap = desc<7:6>:'1';
    memattr = desc<5:2>;                                      // AttrIndx and NS bit in stage 1

    result.domain = bits(4) UNKNOWN;                          // Domains not used
    result.level = level;
    result.blocksize = 2^((3-level)*stride + grainsize);

    // Stage 1 translation regimes also inherit attributes from the tables
    if !secondstage then
        result.perms.xn      = xn OR xn_table;
        result.perms.ap<2>   = ap<2> OR ap_table<1>;          // Force read-only
        // PXN, nG and AP[1] apply only in EL1&0 stage 1 translation regimes
        if !singlepriv then
            result.perms.ap<1> = ap<1> AND NOT(ap_table<0>);  // Force privileged only
            result.perms.pxn   = pxn OR pxn_table;
            // Pages from Non-secure tables are marked non-global in Secure EL1&0
            if IsSecure() then
                result.nG = nG OR ns_table;
            else
                result.nG = nG;
        else
            result.perms.ap<1> = '1';
            result.perms.pxn   = '0';
            result.nG          = '0';
        result.perms.ap<0>   = '1';
        result.addrdesc.memattrs = AArch64.S1AttrDecode(sh, memattr<2:0>, acctype);
        result.addrdesc.paddress.NS = memattr<3> OR ns_table;
    else
        result.perms.ap<2:1> = ap<2:1>;
        result.perms.ap<0>   = '1';
        result.perms.xn      = xn;
        result.perms.pxn     = '0';
        result.nG            = '0';
        result.addrdesc.memattrs = S2AttrDecode(sh, memattr, acctype);
        result.addrdesc.paddress.NS = '1';

    result.addrdesc.paddress.physicaladdress = outputaddress;
    result.addrdesc.fault = AArch64.NoFault();
    result.contiguous = contiguousbit == '1';

    return result;
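The arithmetic behind the stage 1 walk parameters can be checked with a small Python sketch. It covers only the derived values grainsize, stride, the starting level, the per-level table index and the block size. The function names and the granule labels are assumptions made for illustration, and none of the fault or CONSTRAINED UNPREDICTABLE handling is modelled.

```python
# Log2 of the translation granule, as assigned to grainsize in the pseudocode
GRAIN_LOG2 = {"4KB": 12, "16KB": 14, "64KB": 16}


def stage1_walk_params(inputsize, granule):
    """Return (grainsize, stride, starting level) for a stage 1 walk."""
    grainsize = GRAIN_LOG2[granule]
    stride = grainsize - 3  # each 8-byte descriptor resolves `stride` address bits per level
    # Integer form of 4 - RoundUp((inputsize - grainsize) / stride)
    level = 4 - -(-(inputsize - grainsize) // stride)
    return grainsize, stride, level


def table_index(inputaddr, level, addrselecttop, grainsize, stride):
    """Byte offset of the descriptor selected at `level` (the field followed by '000')."""
    addrselectbottom = (3 - level) * stride + grainsize
    width = addrselecttop - addrselectbottom + 1
    field = (inputaddr >> addrselectbottom) & ((1 << width) - 1)
    return field << 3  # 8 bytes per descriptor


def block_size(level, grainsize, stride):
    """Size in bytes mapped by a block or page entry at `level`."""
    return 1 << ((3 - level) * stride + grainsize)
```

For example, a 48-bit input address with the 4KB granule starts the walk at level 0, while a 39-bit input address starts at level 1. The block sizes for the first block level of each granule agree with the 1GB, 32MB and 512MB figures given in the pseudocode comments.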
















J1.2 Pseudocode for AArch32 operation

This section holds the pseudocode for execution in AArch32 state. Functions that are listed in this section are
identified as AArch32.FunctionName. Some of these functions have an equivalent AArch64 function,
AArch64.FunctionName. This section is organized by functional groups, with the functional groups being indicated by
hierarchical path names, for example aarch32/debug/breakpoint.

Note
Many AArch32 pseudocode functions have not been updated to show the constraints on the ARMv7
UNPREDICTABLE behaviors that are described in Appendix K1 Architectural Constraints on UNPREDICTABLE
behaviors. Where AArch32 pseudocode shows something to be UNPREDICTABLE, check Appendix K1 for possible
constraints on the permitted behavior.

The top-level sections of the AArch32 pseudocode hierarchy are:
• aarch32/debug.
• aarch32/exceptions on page J1-5310.
• aarch32/functions on page J1-5327.
• aarch32/translation on page J1-5354.

J1.2.1 aarch32/debug

This section includes the following pseudocode functions:
• aarch32/debug/VCRMatch/AArch32.VCRMatch.
• aarch32/debug/authentication/AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled on page J1-5303.
• aarch32/debug/breakpoint/AArch32.BreakpointMatch on page J1-5303.
• aarch32/debug/breakpoint/AArch32.BreakpointValueMatch on page J1-5304.
• aarch32/debug/breakpoint/AArch32.StateMatch on page J1-5305.
• aarch32/debug/enables/AArch32.GenerateDebugExceptions on page J1-5306.
• aarch32/debug/enables/AArch32.GenerateDebugExceptionsFrom on page J1-5306.
• aarch32/debug/pmu/AArch32.CheckForPMUOverflow on page J1-5306.
• aarch32/debug/pmu/AArch32.CountEvents on page J1-5307.
• aarch32/debug/takeexceptiondbg/AArch32.EnterHypModeInDebugState on page J1-5308.
• aarch32/debug/takeexceptiondbg/AArch32.EnterModeInDebugState on page J1-5308.
• aarch32/debug/takeexceptiondbg/AArch32.EnterMonitorModeInDebugState on page J1-5308.
• aarch32/debug/watchpoint/AArch32.WatchpointByteMatch on page J1-5309.
• aarch32/debug/watchpoint/AArch32.WatchpointMatch on page J1-5309.
aarch32/debug/VCRMatch/AArch32.VCRMatch 
// AArch32.VCRMatch()
// ==================

boolean AArch32.VCRMatch(bits(32) vaddress)

    if UsingAArch32() && ELUsingAArch32(EL1) && IsZero(vaddress<1:0>) && PSTATE.EL != EL2 then
        // Each bit position in this string corresponds to a bit in DBGVCR and an exception vector.
        match_word = Zeros(32);

        if vaddress<31:5> == ExcVectorBase()<31:5> then
            if HaveEL(EL3) && !IsSecure() then
                match_word<UInt(vaddress<4:2>) + 24> = '1';     // Non-secure vectors
            else
                match_word<UInt(vaddress<4:2>) + 0> = '1';      // Secure vectors (or no EL3)

        if HaveEL(EL3) && ELUsingAArch32(EL3) && IsSecure() && vaddress<31:5> == MVBAR<31:5> then
            match_word<UInt(vaddress<4:2>) + 8> = '1';          // Monitor vectors

        // Mask out bits not corresponding to vectors.
        if !HaveEL(EL3) then
            mask = '00000000':'00000000':'00000000':'11011110'; // DBGVCR[31:8] are RES0
        elsif !ELUsingAArch32(EL3) then
            mask = '11011110':'00000000':'00000000':'11011110'; // DBGVCR[15:8] are RES0
        else
            mask = '11011110':'00000000':'11011100':'11011110';

        match_word = match_word AND DBGVCR AND mask;
        match = !IsZero(match_word);

        // Check for UNPREDICTABLE case - match on Prefetch Abort and Data Abort vectors
        if !IsZero(match_word<28:27,12:11,4:3>) && DebugTarget() == PSTATE.EL then
            match = ConstrainUnpredictableBool();
    else
        match = FALSE;

    return match;
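The mapping from a vector address to a DBGVCR bit position can be illustrated with a small Python sketch. It models only the ExcVectorBase() comparison and the Secure or Non-secure offset. The function name and parameters are assumptions, and the Monitor vector case, the mask and the DBGVCR value itself are left out.

```python
def vcr_bit(vaddress, vector_base, non_secure_with_el3):
    """Return the DBGVCR bit index selected by `vaddress`, or None if nothing is selected.

    non_secure_with_el3 corresponds to HaveEL(EL3) && !IsSecure() in the pseudocode.
    """
    if vaddress & 0x3:
        return None  # vaddress<1:0> must be zero
    if (vaddress >> 5) != (vector_base >> 5):
        return None  # not within the 32-byte vector table
    slot = (vaddress >> 2) & 0x7  # UInt(vaddress<4:2>)
    return slot + (24 if non_secure_with_el3 else 0)
```

With a vector base of zero, address 0x0C selects bit 3 and address 0x10 in the Non-secure case selects bit 28. Both positions fall in the bit ranges that the pseudocode tests for the Prefetch Abort and Data Abort UNPREDICTABLE case.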


aarch32/debug/authentication/AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled


// AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled()
// ========================================================

boolean AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled()
    // In the recommended interface, SelfHostedSecurePrivilegedInvasiveDebugEnabled returns
    // the state of the (DBGEN AND SPIDEN) signal.
    if !HaveEL(EL3) && !IsSecure() then return FALSE;
    return DBGEN == HIGH && SPIDEN == HIGH;


aarch32/debug/breakpoint/AArch32.BreakpointMatch 


// AArch32.BreakpointMatch()
// Breakpoint matching in an AArch32 translation regime.

(boolean, boolean) AArch32.BreakpointMatch(integer n, bits(32) vaddress, integer size)
    assert ELUsingAArch32(S1TranslationRegime());
    assert n <= UInt(DBGDIDR.BRPs);

    enabled = DBGBCR[n].E == '1';
    ispriv = PSTATE.EL != EL0;
    linked = DBGBCR[n].BT == '0x01';
    isbreakpnt = TRUE;
    linked_to = FALSE;

    state_match = AArch32.StateMatch(DBGBCR[n].SSC, DBGBCR[n].HMC, DBGBCR[n].PMC,
                                     linked, DBGBCR[n].LBN, isbreakpnt, ispriv);
    (value_match, value_mismatch) = AArch32.BreakpointValueMatch(n, vaddress, linked_to);

    if size == 4 then                 // Check second halfword
        // If the breakpoint address and BAS of an Address breakpoint match the address of the
        // second halfword of an instruction, but not the address of the first halfword, it is
        // CONSTRAINED UNPREDICTABLE whether or not this breakpoint generates a Breakpoint debug
        // event.
        (match_i, mismatch_i) = AArch32.BreakpointValueMatch(n, vaddress + 2, linked_to);
        if !value_match && match_i then
            value_match = ConstrainUnpredictableBool();
        if value_mismatch && !mismatch_i then
            value_mismatch = ConstrainUnpredictableBool();

    if vaddress<1> == '1' && DBGBCR[n].BAS == '1111' then
        // The above notwithstanding, if DBGBCR[n].BAS == '1111', then it is CONSTRAINED
        // UNPREDICTABLE whether or not a Breakpoint debug event is generated for an instruction
        // at the address DBGBVR[n]+2.
        if value_match then value_match = ConstrainUnpredictableBool();
        if !value_mismatch then value_mismatch = ConstrainUnpredictableBool();

    match = value_match && state_match && enabled;
    mismatch = value_mismatch && state_match && enabled;

    return (match, mismatch);


aarch32/debug/breakpoint/AArch32.BreakpointValueMatch

// AArch32.BreakpointValueMatch()
// ==============================
// The first result is whether an Address Match or Context breakpoint is programmed on the
// instruction at "address". The second result is whether an Address Mismatch breakpoint is
// programmed on the instruction, that is, whether the instruction should be stepped.

(boolean, boolean) AArch32.BreakpointValueMatch(integer n, bits(32) vaddress, boolean linked_to)

    // "n" is the identity of the breakpoint unit to match against
    // "vaddress" is the current instruction address, ignored if linked_to is TRUE and for Context
    //   matching breakpoints.
    // "linked_to" is TRUE if this is a call from StateMatch for linking.

    // If a non-existent breakpoint then it is CONSTRAINED UNPREDICTABLE whether this gives
    // no match or the breakpoint is mapped to another UNKNOWN implemented breakpoint.
    if n > UInt(DBGDIDR.BRPs) then
        (c, n) = ConstrainUnpredictableInteger(0, UInt(DBGDIDR.BRPs));
        assert c IN {Constraint_DISABLED, Constraint_UNKNOWN};
        if c == Constraint_DISABLED then return (FALSE, FALSE);

    // If this breakpoint is not enabled, it cannot generate a match. (This could also happen on a
    // call from StateMatch for linking.)
    if DBGBCR[n].E == '0' then return (FALSE, FALSE);

    context_aware = (n >= UInt(DBGDIDR.BRPs) - UInt(DBGDIDR.CTX_CMPs));

    // If BT is set to a reserved type, behaves either as disabled or as a not-reserved type.
    type = DBGBCR[n].BT;

    if (type IN {'011x','11xx'} ||                                // Reserved
          (type == '010x' && HaltOnBreakpointOrWatchpoint()) ||   // Address mismatch
          (type != '0x0x' && !context_aware) ||                   // Context matching
          (type == '1xxx' && !HaveEL(EL2))) then                  // EL2 extension
        (c, type) = ConstrainUnpredictableBits();
        assert c IN {Constraint_DISABLED, Constraint_UNKNOWN};
        if c == Constraint_DISABLED then return (FALSE, FALSE);
        // Otherwise the value returned by ConstrainUnpredictableBits must be a not-reserved value

    // Determine what to compare against.
    match_addr = (type == '0x0x');
    mismatch   = (type == '010x');
    match_vmid = (type == '10xx');
    match_cid  = (type == 'xx1x');
    linked     = (type == 'xxx1');

    // If this is a call from StateMatch, return FALSE if the breakpoint is not programmed for a
    // VMID and/or context ID match, or if not context-aware. The above assertions mean that the
    // code can just test for match_addr == TRUE to confirm all these things.
    if linked_to && (!linked || match_addr) then return (FALSE, FALSE);

    // If called from BreakpointMatch return FALSE for Linked context ID and/or VMID matches.
    if !linked_to && linked && !match_addr then return (FALSE, FALSE);

    // Do the comparison.
    if match_addr then
        byte = UInt(vaddress<1:0>);
        assert byte IN {0,2};                     // "vaddress" is halfword aligned.
        byte_select_match = (DBGBCR[n].BAS<byte> == '1');
        BVR_match = vaddress<31:2> == DBGBVR[n]<31:2> && byte_select_match;
    elsif match_cid then
        BVR_match = (PSTATE.EL != EL2 && CONTEXTIDR == DBGBVR[n]<31:0>);
    if match_vmid then
        vmid = (if ELUsingAArch32(EL2) then VTTBR.VMID else VTTBR_EL2.VMID);
        BXVR_match = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                      vmid == DBGBXVR[n]<7:0>);

    bvr_match_valid = (match_addr || match_cid);
    bxvr_match_valid = match_vmid;

    match = (!bxvr_match_valid || BXVR_match) && (!bvr_match_valid || BVR_match);

    return (match && !mismatch, !match && mismatch);
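
The decode of DBGBCR[n].BT into match_addr, mismatch, match_vmid, match_cid, and linked relies on bitstring patterns in which 'x' matches either bit value. The following Python sketch is not part of the architecture pseudocode. It only illustrates that pattern matching for a not-reserved BT value, and the names bits_match and decode_breakpoint_type are hypothetical helpers introduced for this example.

```python
def bits_match(value: str, pattern: str) -> bool:
    """Return True if bitstring `value` matches `pattern`; 'x' matches either bit."""
    if len(value) != len(pattern):
        return False
    return all(p in ("x", v) for v, p in zip(value, pattern))


def decode_breakpoint_type(bt: str) -> dict:
    """Decode a 4-bit, not-reserved BT value (MSB first) into comparison flags.

    Hypothetical helper: mirrors the five pattern tests in the pseudocode and
    does not model reserved encodings or CONSTRAINED UNPREDICTABLE behavior.
    """
    return {
        "match_addr": bits_match(bt, "0x0x"),
        "mismatch": bits_match(bt, "010x"),
        "match_vmid": bits_match(bt, "10xx"),
        "match_cid": bits_match(bt, "xx1x"),
        "linked": bits_match(bt, "xxx1"),
    }


if __name__ == "__main__":
    # A linked context ID match type: context ID comparison with the linked bit set.
    print(decode_breakpoint_type("0011"))
```

Under these assumptions, a BT value of '0100' sets both match_addr and mismatch, which is how the pseudocode distinguishes an Address Mismatch breakpoint from a plain Address Match breakpoint.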


aarch32/debug/breakpoint/AArch32.StateMatch 


// AArch32.StateMatch()
// ====================
// Determine whether a breakpoint or watchpoint is enabled in the current mode and state.

boolean AArch32.StateMatch(bits(2) SSC, bit HMC, bits(2) PxC, boolean linked, bits(4) LBN,
                           boolean isbreakpnt, boolean ispriv)
    // "SSC", "HMC", "PxC" are the control fields from the DBGBCR[n] or DBGWCR[n] register.
    // "linked" is TRUE if this is a linked breakpoint/watchpoint type.
    // "LBN" is the linked breakpoint number from the DBGBCR[n] or DBGWCR[n] register.
    // "isbreakpnt" is TRUE for breakpoints, FALSE for watchpoints.
    // "ispriv" is valid for watchpoints, and selects between privileged and unprivileged accesses.

    // If parameters are set to a reserved type, behaves as either disabled or a defined type
    if ((HMC:SSC:PxC) IN {'011xx','100x0','101x0','11010','11101','1111x'} ||       // Reserved
          (HMC == '0' && PxC == '00' && !isbreakpnt) ||                             // Usr/Svc/Sys
          (SSC IN {'01','10'} && !HaveEL(EL3)) ||                                   // No EL3
          (HMC:SSC:PxC == '11000' && ELUsingAArch32(EL3)) ||                        // AArch64 only
          (HMC:SSC != '000' && HMC:SSC != '111' && !HaveEL(EL3) && !HaveEL(EL2)) || // No EL3/EL2
          (HMC:SSC:PxC == '11100' && !HaveEL(EL2))) then                            // No EL2
        (c, <HMC,SSC,PxC>) = ConstrainUnpredictableBits();
        assert c IN {Constraint_DISABLED, Constraint_UNKNOWN};
        if c == Constraint_DISABLED then return FALSE;
        // Otherwise the value returned by ConstrainUnpredictableBits must be a not-reserved value

    PL2_match = HaveEL(EL2) && HMC == '1';
    PL1_match = PxC<0> == '1';
    PL0_match = PxC<1> == '1';
    SSU_match = isbreakpnt && HMC == '0' && PxC == '00' && SSC != '11';

    if SSU_match then
        priv_match = PSTATE.M IN {M32_User,M32_Svc,M32_System};
    else
        case PSTATE.EL of
            when EL3, EL1  priv_match = if ispriv then PL1_match else PL0_match;
            when EL2       priv_match = PL2_match;
            when EL0       priv_match = PL0_match;

    case SSC of
        when '00'  security_state_match = TRUE;         // Both
        when '01'  security_state_match = !IsSecure();  // Non-secure only
        when '10'  security_state_match = IsSecure();   // Secure only
        when '11'  security_state_match = TRUE;         // Both

    if linked then
        // "LBN" must be an enabled context-aware breakpoint unit. If it is not context-aware then
        // it is CONSTRAINED UNPREDICTABLE whether this gives no match, or LBN is mapped to some
        // UNKNOWN breakpoint that is context-aware.
        lbn = UInt(LBN);
        first_ctx_cmp = (UInt(DBGDIDR.BRPs) - UInt(DBGDIDR.CTX_CMPs));
        last_ctx_cmp = UInt(DBGDIDR.BRPs);
        if (lbn < first_ctx_cmp || lbn > last_ctx_cmp) then
            (c, lbn) = ConstrainUnpredictableInteger(first_ctx_cmp, last_ctx_cmp);
            assert c IN {Constraint_DISABLED, Constraint_NONE, Constraint_UNKNOWN};
            case c of
                when Constraint_DISABLED  return FALSE;      // Disabled
                when Constraint_NONE      linked = FALSE;    // No linking
                // Otherwise ConstrainUnpredictableInteger returned a context-aware breakpoint

    if linked then
        vaddress = bits(32) UNKNOWN;
        linked_to = TRUE;
        (linked_match,-) = AArch32.BreakpointValueMatch(lbn, vaddress, linked_to);

    return priv_match && security_state_match && (!linked || linked_match);


aarch32/debug/enables/AArch32.GenerateDebugExceptions 


// AArch32.GenerateDebugExceptions()
// =================================

boolean AArch32.GenerateDebugExceptions()
    return AArch32.GenerateDebugExceptionsFrom(PSTATE.EL, IsSecure());


aarch32/debug/enables/AArch32.GenerateDebugExceptionsFrom 


// AArch32.GenerateDebugExceptionsFrom()
// =====================================

boolean AArch32.GenerateDebugExceptionsFrom(bits(2) from, boolean secure)

    if from == EL0 && !ELStateUsingAArch32(EL1, secure) then
        mask = bit UNKNOWN;                          // PSTATE.D mask, unused for EL0 case
        return AArch64.GenerateDebugExceptionsFrom(from, secure, mask);

    if DBGOSLSR.OSLK == '1' || DoubleLockStatus() || Halted() then
        return FALSE;

    if HaveEL(EL3) && secure then
        spd = (if ELUsingAArch32(EL3) then SDCR.SPD else MDCR_EL3.SPD32);
        if spd<1> == '1' then
            enabled = spd<0> == '1';
        else
            // SPD == 0b01 is reserved, but behaves the same as 0b00.
            enabled = AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled();
        if from == EL0 then enabled = enabled || SDER.SUIDEN == '1';
    else
        enabled = from != EL2;

    return enabled;


aarch32/debug/pmu/AArch32.CheckForPMUOverflow 


// AArch32.CheckForPMUOverflow()
// =============================
// Signal Performance Monitors overflow IRQ and CTI overflow events

boolean AArch32.CheckForPMUOverflow()

    if !ELUsingAArch32(EL1) then return AArch64.CheckForPMUOverflow();
    pmuirq = (PMCR.E == '1' && PMINTENSET<31> == '1' && PMOVSSET<31> == '1');
    for n = 0 to UInt(PMCR.N) - 1
        if HaveEL(EL2) then
            hpmn = (if !ELUsingAArch32(EL2) then MDCR_EL2.HPMN else HDCR.HPMN);
            hpme = (if !ELUsingAArch32(EL2) then MDCR_EL2.HPME else HDCR.HPME);
            E = (if n < UInt(hpmn) then PMCR.E else hpme);
        else
            E = PMCR.E;
        if E == '1' && PMINTENSET<n> == '1' && PMOVSSET<n> == '1' then pmuirq = TRUE;

    SetInterruptRequestLevel(InterruptID_PMUIRQ, if pmuirq then HIGH else LOW);
    CTI_SetEventLevel(CrossTriggerIn_PMUOverflow, if pmuirq then HIGH else LOW);

    // The request remains set until the condition is cleared. (For example, an interrupt handler
    // or cross-triggered event handler clears the overflow status flag by writing to PMOVSCLR_EL0.)

    return pmuirq;
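
The overflow condition computed by AArch32.CheckForPMUOverflow() can be exercised in isolation. The Python sketch below is illustrative only and is not part of the architecture pseudocode. It models the register fields as plain integers, omits the AArch64 redirection and the interrupt and CTI signalling, and uses the hypothetical names bit and pmu_overflow_irq.

```python
def bit(value: int, n: int) -> int:
    """Return bit n of an integer register image."""
    return (value >> n) & 1


def pmu_overflow_irq(pmcr_e: int, pmcr_n: int, pmintenset: int, pmovsset: int,
                     have_el2: bool = False, hpmn: int = 0, hpme: int = 0) -> bool:
    """Sketch of the pmuirq computation, under the assumptions stated above.

    pmcr_e, hpme: single-bit enables; pmcr_n: number of event counters;
    hpmn: first counter reserved for EL2 when EL2 is implemented.
    """
    # Cycle counter (bit 31) is always controlled by PMCR.E.
    irq = pmcr_e == 1 and bit(pmintenset, 31) == 1 and bit(pmovsset, 31) == 1
    for n in range(pmcr_n):
        if have_el2:
            enable = pmcr_e if n < hpmn else hpme
        else:
            enable = pmcr_e
        if enable == 1 and bit(pmintenset, n) == 1 and bit(pmovsset, n) == 1:
            irq = True
    return irq


if __name__ == "__main__":
    # Counter 1 has overflowed with its interrupt enabled and PMCR.E set.
    print(pmu_overflow_irq(pmcr_e=1, pmcr_n=4, pmintenset=0b0010, pmovsset=0b0010))
```

The sketch shows that, when EL2 is implemented, counters numbered HPMN and above are gated by HPME rather than by PMCR.E.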


aarch32/debug/pmu/AArch32.CountEvents 


// AArch32.CountEvents()
// =====================
// Return TRUE if counter "n" should count its event. For the cycle counter, n == 31.

boolean AArch32.CountEvents(integer n)
    assert (n == 31 || n < UInt(PMCR.N));

    if !ELUsingAArch32(EL1) then return AArch64.CountEvents(n);

    // Event counting is disabled in Debug state
    debug = Halted();

    // In Non-secure state, some counters are reserved for EL2
    if HaveEL(EL2) then
        hpmn = (if !ELUsingAArch32(EL2) then MDCR_EL2.HPMN else HDCR.HPMN);
        hpme = (if !ELUsingAArch32(EL2) then MDCR_EL2.HPME else HDCR.HPME);
        E = (if n < UInt(hpmn) || n == 31 then PMCR.E else hpme);
    else
        E = PMCR.E;
    enabled = (E == '1' && PMCNTENSET<n> == '1');

    if !IsSecure() then
        prohibited = FALSE;
    else
        // Event counting in Secure state is prohibited unless any one of:
        // * EL3 is not implemented
        // * EL3 is using AArch64 and MDCR_EL3.SPME == 1
        // * EL3 is using AArch32 and SDCR.SPME == 1
        // * Executing at EL0, and SDER.SUNIDEN == 1.
        spme = (if ELUsingAArch32(EL3) then SDCR.SPME else MDCR_EL3.SPME);
        prohibited = (HaveEL(EL3) && spme == '0' && (PSTATE.EL != EL0 || SDER.SUNIDEN == '0'));

    // The IMPLEMENTATION DEFINED authentication interface might override software controls
    if ExternalSecureNoninvasiveDebugEnabled() then prohibited = FALSE;

    // For the cycle counter, PMCR_EL0.DP enables counting when otherwise prohibited
    if prohibited && n == 31 then prohibited = (PMCR.DP == '1');

    // Event counting can be filtered by the {P, U, NSK, NSU, NSH} bits
    filter = (if n == 31 then PMCCFILTR<31:27> else PMEVTYPER[n]<31:27>);

    H = if !HaveEL(EL2) then '0' else filter<0>;
    P = filter<4>;  U = filter<3>;
    if !IsSecure() && HaveEL(EL3) then
        P = P EOR filter<2>;  U = U EOR filter<1>;

    case PSTATE.EL of
        when EL0      filtered = U == '1';
        when EL1,EL3  filtered = P == '1';
        when EL2      filtered = H == '0';

    return !debug && enabled && !prohibited && !filtered;
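
The {P, U, NSK, NSU, NSH} filtering step at the end of AArch32.CountEvents() can be illustrated on its own. The following Python sketch is not architecture pseudocode. It assumes the five filter bits are passed as an integer holding bits <31:27> of the filter register, with P in bit 4 and NSH in bit 0, and the name event_filtered is hypothetical.

```python
def event_filtered(filter_bits: int, el: int, secure: bool,
                   have_el2: bool, have_el3: bool) -> bool:
    """Return True if counting is filtered out at Exception level `el` (0..3).

    Sketch only: `filter_bits` is the 5-bit field {P, U, NSK, NSU, NSH},
    most significant bit first, as extracted by the pseudocode.
    """
    nsh = filter_bits & 1
    nsu = (filter_bits >> 1) & 1
    nsk = (filter_bits >> 2) & 1
    u = (filter_bits >> 3) & 1
    p = (filter_bits >> 4) & 1

    h = nsh if have_el2 else 0
    if not secure and have_el3:
        # In Non-secure state with EL3 implemented, NSK and NSU invert P and U.
        p ^= nsk
        u ^= nsu

    if el == 0:
        return u == 1
    if el in (1, 3):
        return p == 1
    if el == 2:
        return h == 0
    raise ValueError("el must be in the range 0..3")


if __name__ == "__main__":
    # U and NSU both set: EL0 events are filtered in Secure state only.
    print(event_filtered(0b01010, el=0, secure=True, have_el2=True, have_el3=True))
    print(event_filtered(0b01010, el=0, secure=False, have_el2=True, have_el3=True))
```

Note the inverted sense for EL2: counting at EL2 is filtered unless the NSH bit is set, whereas counting at the other Exception levels is filtered when the effective P or U bit is set.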

aarch32/debug/takeexceptiondbg/AArch32.EnterHypModeInDebugState

// AArch32.EnterHypModeInDebugState()
// ==================================
// Take an exception in Debug state to Hyp mode.

AArch32.EnterHypModeInDebugState(ExceptionRecord exception)
    assert HaveEL(EL2) && !IsSecure() && ELUsingAArch32(EL2);

    AArch32.ReportHypEntry(exception);
    AArch32.WriteMode(M32_Hyp);
    SPSR[] = bits(32) UNKNOWN;
    ELR_hyp = bits(32) UNKNOWN;
    // In Debug state, the PE always executes T32 instructions when in AArch32 state, and
    // PSTATE.{SS,A,I,F} are not observable so behave as UNKNOWN.
    PSTATE.T = '1';                             // PSTATE.J is RES0
    PSTATE.<SS,A,I,F> = bits(4) UNKNOWN;
    DLR = bits(32) UNKNOWN;
    DSPSR = bits(32) UNKNOWN;
    PSTATE.E = HSCTLR.EE;
    PSTATE.IL = '0';
    PSTATE.IT = '00000000';
    EDSCR.ERR = '1';
    UpdateEDSCRFields();
    EndOfInstruction();


aarch32/debug/takeexceptiondbg/AArch32.EnterModeInDebugState

// AArch32.EnterModeInDebugState()
// ===============================
// Take an exception in Debug state to a mode other than Monitor and Hyp mode.

AArch32.EnterModeInDebugState(bits(5) target_mode)
    assert ELUsingAArch32(EL1) && PSTATE.EL != EL2;

    if PSTATE.M == M32_Monitor then SCR.NS = '0';
    AArch32.WriteMode(target_mode);
    SPSR[] = bits(32) UNKNOWN;
    R[14] = bits(32) UNKNOWN;
    // In Debug state, the PE always executes T32 instructions when in AArch32 state, and
    // PSTATE.{SS,A,I,F} are not observable so behave as UNKNOWN.
    PSTATE.T = '1';                             // PSTATE.J is RES0
    PSTATE.<SS,A,I,F> = bits(4) UNKNOWN;
    DLR = bits(32) UNKNOWN;
    DSPSR = bits(32) UNKNOWN;
    PSTATE.E = SCTLR.EE;
    PSTATE.IL = '0';
    PSTATE.IT = '00000000';
    EDSCR.ERR = '1';
    UpdateEDSCRFields();                        // Update EDSCR processor state flags.
    EndOfInstruction();





aarch32/debug/takeexceptiondbg/AArch32.EnterMonitorModeInDebugState

// AArch32.EnterMonitorModeInDebugState()
// ======================================
// Take an exception in Debug state to Monitor mode.

AArch32.EnterMonitorModeInDebugState()
    assert HaveEL(EL3) && ELUsingAArch32(EL3);

    if PSTATE.M == M32_Monitor then SCR.NS = '0';
    AArch32.WriteMode(M32_Monitor);
    SPSR[] = bits(32) UNKNOWN;
    R[14] = bits(32) UNKNOWN;
    // In Debug state, the PE always executes T32 instructions when in AArch32 state, and
    // PSTATE.{SS,A,I,F} are not observable so behave as UNKNOWN.
    PSTATE.T = '1';                             // PSTATE.J is RES0
    PSTATE.<SS,A,I,F> = bits(4) UNKNOWN;
    DLR = bits(32) UNKNOWN;
    DSPSR = bits(32) UNKNOWN;
    PSTATE.E = SCTLR.EE;
    PSTATE.IL = '0';
    PSTATE.IT = '00000000';
    EDSCR.ERR = '1';
    UpdateEDSCRFields();                        // Update EDSCR processor state flags.
    EndOfInstruction();


aarch32/debug/watchpoint/AArch32.WatchpointByteMatch 


// AArch32.WatchpointByteMatch()
// =============================

boolean AArch32.WatchpointByteMatch(integer n, bits(32) vaddress)

    bottom = if DBGWVR[n]<2> == '1' then 2 else 3;            // Word or doubleword
    byte_select_match = (DBGWCR[n].BAS<UInt(vaddress<bottom-1:0>)> != '0');
    mask = UInt(DBGWCR[n].MASK);

    // If DBGWCR[n].MASK is non-zero value and DBGWCR[n].BAS is not set to '11111111', or
    // DBGWCR[n].BAS specifies a non-contiguous set of bytes behavior is CONSTRAINED
    // UNPREDICTABLE.
    if mask > 0 && !IsOnes(DBGWCR[n].BAS) then
        byte_select_match = ConstrainUnpredictableBool();
    else
        LSB = (DBGWCR[n].BAS AND NOT(DBGWCR[n].BAS - 1));  MSB = (DBGWCR[n].BAS + LSB);
        if !IsZero(MSB AND (MSB - 1)) then                     // Not contiguous
            byte_select_match = ConstrainUnpredictableBool();
            bottom = 3;                                        // For the whole doubleword

    // If the address mask is set to a reserved value, the behavior is CONSTRAINED UNPREDICTABLE.
    if mask > 0 && mask <= 2 then
        (c, mask) = ConstrainUnpredictableInteger(3, 31);
        assert c IN {Constraint_DISABLED, Constraint_NONE, Constraint_UNKNOWN};
        case c of
            when Constraint_DISABLED  return FALSE;            // Disabled
            when Constraint_NONE      mask = 0;                // No masking
            // Otherwise the value returned by ConstrainUnpredictableInteger is a not-reserved value

    if mask > bottom then
        WVR_match = (vaddress<31:mask> == DBGWVR[n]<31:mask>);
        // If masked bits of DBGWVR_EL1[n] are not zero, the behavior is CONSTRAINED UNPREDICTABLE.
        if WVR_match && !IsZero(DBGWVR[n]<mask-1:bottom>) then
            WVR_match = ConstrainUnpredictableBool();
    else
        WVR_match = vaddress<31:bottom> == DBGWVR[n]<31:bottom>;

    return WVR_match && byte_select_match;
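
The LSB/MSB computation in AArch32.WatchpointByteMatch() tests whether the set bits of DBGWCR[n].BAS form one contiguous run. Adding the lowest set bit to the value carries through a contiguous run and leaves at most one bit set. The Python sketch below illustrates that check for an 8-bit field. It is not architecture pseudocode, and the name bas_is_contiguous is a hypothetical helper.

```python
def bas_is_contiguous(bas: int) -> bool:
    """Return True if the set bits of an 8-bit byte-address-select value are contiguous.

    Mirrors the pseudocode: LSB = BAS AND NOT(BAS - 1); MSB = BAS + LSB;
    the selection is contiguous when MSB has at most one bit set.
    Arithmetic is truncated to 8 bits to match a bits(8) field.
    """
    bas &= 0xFF
    lsb = bas & ~(bas - 1) & 0xFF      # isolate the lowest set bit
    msb = (bas + lsb) & 0xFF           # carry propagates through a contiguous run
    return (msb & ((msb - 1) & 0xFF)) == 0


if __name__ == "__main__":
    for value in (0b00111100, 0b00101100, 0b11111111):
        print(f"{value:08b}", bas_is_contiguous(value))
```

For example, 0b00111100 gives MSB = 0b01000000, which is a single bit, while 0b00101100 gives MSB = 0b00110000, which has two bits set and is therefore reported as non-contiguous.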


aarch32/debug/watchpoint/AArch32.WatchpointMatch 
// AArch32.WatchpointMatch()
// =========================
// Watchpoint matching in an AArch32 translation regime.

boolean AArch32.WatchpointMatch(integer n, bits(32) vaddress, integer size, boolean ispriv,
                                boolean iswrite)
    assert ELUsingAArch32(S1TranslationRegime());
    assert n <= UInt(DBGDIDR.WRPs);

    // "ispriv" is FALSE for LDRT/STRT instructions executed at EL1 and all
    // load/stores at EL0, TRUE for all other load/stores. "iswrite" is TRUE for stores, FALSE for
    // loads.
    enabled = DBGWCR[n].E == '1';
    linked = DBGWCR[n].WT == '1';
    isbreakpnt = FALSE;

    state_match = AArch32.StateMatch(DBGWCR[n].SSC, DBGWCR[n].HMC, DBGWCR[n].PAC,
                                     linked, DBGWCR[n].LBN, isbreakpnt, ispriv);

    ls_match = (DBGWCR[n].LSC<(if iswrite then 1 else 0)> == '1');

    value_match = FALSE;
    for byte = 0 to size - 1
        value_match = value_match || AArch32.WatchpointByteMatch(n, vaddress + byte);

    return value_match && state_match && ls_match && enabled;


J1.2.2 aarch32/exceptions


This section includes the following pseudocode functions: 


• aarch32/exceptions/aborts/AArch32.Abort on page J1-5311.
• aarch32/exceptions/aborts/AArch32.AbortSyndrome on page J1-5311.
• aarch32/exceptions/aborts/AArch32.CheckPCAlignment on page J1-5311.
• aarch32/exceptions/aborts/AArch32.ReportDataAbort on page J1-5312.
• aarch32/exceptions/aborts/AArch32.ReportPrefetchAbort on page J1-5312.
• aarch32/exceptions/aborts/AArch32.TakeDataAbortException on page J1-5313.
• aarch32/exceptions/aborts/AArch32.TakePrefetchAbortException on page J1-5313.
• aarch32/exceptions/asynch/AArch32.TakePhysicalFIQException on page J1-5314.
• aarch32/exceptions/asynch/AArch32.TakePhysicalIRQException on page J1-5314.
• aarch32/exceptions/asynch/AArch32.TakePhysicalSErrorException on page J1-5315.
• aarch32/exceptions/asynch/AArch32.TakeVirtualFIQException on page J1-5315.
• aarch32/exceptions/asynch/AArch32.TakeVirtualIRQException on page J1-5316.
• aarch32/exceptions/asynch/AArch32.TakeVirtualSErrorException on page J1-5316.
• aarch32/exceptions/debug/AArch32.SoftwareBreakpoint on page J1-5317.
• aarch32/exceptions/debug/DebugException on page J1-5317.
• aarch32/exceptions/exceptions/AArch32.ExceptionClass on page J1-5317.
• aarch32/exceptions/exceptions/AArch32.GeneralExceptionsToAArch64 on page J1-5317.
• aarch32/exceptions/exceptions/AArch32.ReportHypEntry on page J1-5318.
• aarch32/exceptions/exceptions/AArch32.ResetControlRegisters on page J1-5318.
• aarch32/exceptions/exceptions/AArch32.TakeReset on page J1-5318.
• aarch32/exceptions/exceptions/ExcVectorBase on page J1-5319.
• aarch32/exceptions/ieeefp/AArch32.FPTrappedException on page J1-5319.
• aarch32/exceptions/syscalls/AArch32.CallHypervisor on page J1-5319.
• aarch32/exceptions/syscalls/AArch32.CallSupervisor on page J1-5320.
• aarch32/exceptions/syscalls/AArch32.TakeHVCException on page J1-5320.
• aarch32/exceptions/syscalls/AArch32.TakeSMCException on page J1-5320.
• aarch32/exceptions/syscalls/AArch32.TakeSVCException on page J1-5320.
• aarch32/exceptions/takeexception/AArch32.EnterHypMode on page J1-5321.
• aarch32/exceptions/takeexception/AArch32.EnterMode on page J1-5321.
• aarch32/exceptions/takeexception/AArch32.EnterMonitorMode on page J1-5322.
• aarch32/exceptions/traps/AArch32.AArch32SystemAccessTrap on page J1-5322.
• aarch32/exceptions/traps/AArch32.AArch32SystemAccessTrapSyndrome on page J1-5322.
• aarch32/exceptions/traps/AArch32.CheckAdvSIMDOrFPEnabled on page J1-5323.
• aarch32/exceptions/traps/AArch32.CheckFPAdvSIMDTrap on page J1-5324.
• aarch32/exceptions/traps/AArch32.CheckForSMCTrap on page J1-5324.
• aarch32/exceptions/traps/AArch32.CheckForWFxTrap on page J1-5325.
• aarch32/exceptions/traps/AArch32.CheckITEnabled on page J1-5325.
• aarch32/exceptions/traps/AArch32.CheckIllegalState on page J1-5325.
• aarch32/exceptions/traps/AArch32.CheckSETENDEnabled on page J1-5326.
• aarch32/exceptions/traps/AArch32.TakeHypTrapException on page J1-5326.
• aarch32/exceptions/traps/AArch32.TakeMonitorTrapException on page J1-5326.
• aarch32/exceptions/traps/AArch32.TakeUndefInstrException on page J1-5327.
• aarch32/exceptions/traps/AArch32.UndefinedFault on page J1-5327.


aarch32/exceptions/aborts/AArch32.Abort 


// AArch32.Abort()
// ===============
// Abort and Debug exception handling in an AArch32 translation regime.

AArch32.Abort(bits(32) vaddress, FaultRecord fault)

    // Check if routed to AArch64 state
    route_to_aarch64 = PSTATE.EL == EL0 && !ELUsingAArch32(EL1);

    if !route_to_aarch64 && HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
        route_to_aarch64 = (HCR_EL2.TGE == '1' || IsSecondStage(fault) ||
                            (IsDebugException(fault) && MDCR_EL2.TDE == '1'));

    if !route_to_aarch64 && HaveEL(EL3) && !ELUsingAArch32(EL3) then
        route_to_aarch64 = SCR_EL3.EA == '1' && IsExternalAbort(fault);

    if route_to_aarch64 then
        AArch64.Abort(ZeroExtend(vaddress), fault);
    elsif fault.acctype == AccType_IFETCH then
        AArch32.TakePrefetchAbortException(vaddress, fault);
    else
        AArch32.TakeDataAbortException(vaddress, fault);


aarch32/exceptions/aborts/AArch32.AbortSyndrome 

// AArch32.AbortSyndrome()
// =======================
// Creates an exception syndrome record for Abort exceptions taken to Hyp mode
// from an AArch32 translation regime.

ExceptionRecord AArch32.AbortSyndrome(Exception type, FaultRecord fault, bits(32) vaddress)

    exception = ExceptionSyndrome(type);

    d_side = type == Exception_DataAbort;

    exception.syndrome = AArch32.FaultSyndrome(d_side, fault);
    exception.vaddress = ZeroExtend(vaddress);
    if IPAValid(fault) then
        exception.ipavalid = TRUE;
        exception.ipaddress = ZeroExtend(fault.ipaddress);
    else
        exception.ipavalid = FALSE;

    return exception;


aarch32/exceptions/aborts/AArch32.CheckPCAlignment 


// AArch32.CheckPCAlignment()
// ==========================

AArch32.CheckPCAlignment()

    bits(32) pc = ThisInstrAddr();
    if (CurrentInstrSet() == InstrSet_A32 && pc<1> == '1') || pc<0> == '1' then
        if AArch32.GeneralExceptionsToAArch64() then AArch64.PCAlignmentFault();

        // Generate an Alignment fault Prefetch Abort exception
        vaddress = pc;
        acctype = AccType_IFETCH;
        iswrite = FALSE;
        secondstage = FALSE;
        AArch32.Abort(vaddress, AArch32.AlignmentFault(acctype, iswrite, secondstage));


aarch32/exceptions/aborts/AArch32.ReportDataAbort 


// AArch32.ReportDataAbort()
// =========================
// Report syndrome information for aborts taken to modes other than Hyp mode.

AArch32.ReportDataAbort(boolean route_to_monitor, FaultRecord fault, bits(32) vaddress)

    // The encoding used in the IFSR or DFSR can be Long-descriptor format or Short-descriptor
    // format. Normally, the current translation table format determines the format. For an abort
    // from Non-secure state to Monitor mode, the IFSR or DFSR uses the Long-descriptor format if
    // any of the following applies:
    // * The Secure TTBCR.EAE is set to 1.
    // * The abort is synchronous and either:
    //   - It is taken from Hyp mode.
    //   - It is taken from EL1 or EL0, and the Non-secure TTBCR.EAE is set to 1.
    long_format = FALSE;
    if route_to_monitor && !IsSecure() then
        long_format = TTBCR_S.EAE == '1';
        if !IsSErrorInterrupt(fault) && !long_format then
            long_format = PSTATE.EL == EL2 || TTBCR.EAE == '1';
    else
        long_format = TTBCR.EAE == '1';

    d_side = TRUE;
    if long_format then
        syndrome = AArch32.FaultStatusLD(d_side, fault);
    else
        syndrome = AArch32.FaultStatusSD(d_side, fault);

    if fault.acctype == AccType_IC then
        if (!long_format &&
            boolean IMPLEMENTATION_DEFINED "Report I-cache maintenance fault in IFSR") then
            i_syndrome = syndrome;
            syndrome<10,3:0> = EncodeSDFSC(Fault_ICacheMaint, 1);
        else
            i_syndrome = bits(32) UNKNOWN;
        if route_to_monitor then
            IFSR_S = i_syndrome;
        else
            IFSR = i_syndrome;

    if route_to_monitor then
        DFSR_S = syndrome;
        DFAR_S = vaddress;
    else
        DFSR = syndrome;
        DFAR = vaddress;

    return;
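
The format selection at the start of AArch32.ReportDataAbort() can be summarised as a small decision function. The Python sketch below is illustrative only and is not architecture pseudocode. It replaces the register reads and predicates with boolean parameters, and the name data_abort_uses_long_format and its parameter names are hypothetical.

```python
def data_abort_uses_long_format(route_to_monitor: bool, secure: bool,
                                secure_ttbcr_eae: bool, current_ttbcr_eae: bool,
                                taken_from_el2: bool, is_serror: bool) -> bool:
    """Return True if the DFSR is written in Long-descriptor format.

    Sketch under stated assumptions:
      secure_ttbcr_eae  models TTBCR_S.EAE == '1'
      current_ttbcr_eae models TTBCR.EAE == '1' for the current Security state
      taken_from_el2    models PSTATE.EL == EL2
      is_serror         models IsSErrorInterrupt(fault)
    """
    if route_to_monitor and not secure:
        long_format = secure_ttbcr_eae
        # A synchronous abort can still select Long-descriptor format.
        if not is_serror and not long_format:
            long_format = taken_from_el2 or current_ttbcr_eae
        return long_format
    return current_ttbcr_eae


if __name__ == "__main__":
    # Synchronous abort from Non-secure Hyp mode routed to Monitor mode.
    print(data_abort_uses_long_format(True, False, False, False, True, False))
```

The sketch makes one property explicit: for an SError interrupt routed from Non-secure state to Monitor mode, only the Secure TTBCR.EAE setting is considered.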


aarch32/exceptions/aborts/AArch32.ReportPrefetchAbort 


// AArch32.ReportPrefetchAbort()
// =============================
// Report syndrome information for aborts taken to modes other than Hyp mode.

AArch32.ReportPrefetchAbort(boolean route_to_monitor, FaultRecord fault, bits(32) vaddress)
    // The encoding used in the IFSR can be Long-descriptor format or Short-descriptor format.
    // Normally, the current translation table format determines the format. For an abort from
    // Non-secure state to Monitor mode, the IFSR uses the Long-descriptor format if any of the
    // following applies:
    // * The Secure TTBCR.EAE is set to 1.
    // * It is taken from Hyp mode.
    // * It is taken from EL1 or EL0, and the Non-secure TTBCR.EAE is set to 1.
    long_format = FALSE;
    if route_to_monitor && !IsSecure() then
        long_format = TTBCR_S.EAE == '1' || PSTATE.EL == EL2 || TTBCR.EAE == '1';
    else
        long_format = TTBCR.EAE == '1';

    d_side = FALSE;
    if long_format then
        fsr = AArch32.FaultStatusLD(d_side, fault);
    else
        fsr = AArch32.FaultStatusSD(d_side, fault);

    if route_to_monitor then
        IFSR_S = fsr;
        IFAR_S = vaddress;
    else
        IFSR = fsr;
        IFAR = vaddress;

    return;


aarch32/exceptions/aborts/AArch32.TakeDataAbortException

// AArch32.TakeDataAbortException()
// ================================

AArch32.TakeDataAbortException(bits(32) vaddress, FaultRecord fault)
    route_to_monitor = HaveEL(EL3) && SCR.EA == '1' && IsExternalAbort(fault);
    route_to_hyp = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR.TGE == '1' || IsSecondStage(fault) ||
                     (IsDebugException(fault) && HDCR.TDE == '1')));
    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x10;
    lr_offset = 8;

    if IsDebugException(fault) then DBGDSCRext.MOE = fault.debugmoe;
    if route_to_monitor then
        AArch32.ReportDataAbort(route_to_monitor, fault, vaddress);
        AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_hyp then
        exception = AArch32.AbortSyndrome(Exception_DataAbort, fault, vaddress);
        if PSTATE.EL == EL2 then
            AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
        else
            AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);
    else
        AArch32.ReportDataAbort(route_to_monitor, fault, vaddress);
        AArch32.EnterMode(M32_Abort, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/aborts/AArch32.TakePrefetchAbortException

// AArch32.TakePrefetchAbortException()
// ====================================

AArch32.TakePrefetchAbortException(bits(32) vaddress, FaultRecord fault)
    route_to_monitor = HaveEL(EL3) && SCR.EA == '1' && IsExternalAbort(fault);
    route_to_hyp = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR.TGE == '1' || IsSecondStage(fault) ||
                     (IsDebugException(fault) && HDCR.TDE == '1')));

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x0C;
    lr_offset = 4;

    if IsDebugException(fault) then DBGDSCRext.MOE = fault.debugmoe;
    if route_to_monitor then
        AArch32.ReportPrefetchAbort(route_to_monitor, fault, vaddress);
        AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_hyp then
        if fault.type == Fault_Alignment then             // PC Alignment fault
            exception = ExceptionSyndrome(Exception_PCAlignment);
            exception.vaddress = ThisInstrAddr();
        else
            exception = AArch32.AbortSyndrome(Exception_InstructionAbort, fault, vaddress);
        if PSTATE.EL == EL2 then
            AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
        else
            AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);
    else
        AArch32.ReportPrefetchAbort(route_to_monitor, fault, vaddress);
        AArch32.EnterMode(M32_Abort, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/asynch/AArch32.TakePhysicalFIQException

// AArch32.TakePhysicalFIQException()
// ==================================

AArch32.TakePhysicalFIQException()

    // Check if routed to AArch64 state
    route_to_aarch64 = PSTATE.EL == EL0 && !ELUsingAArch32(EL1);
    if !route_to_aarch64 && HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
        route_to_aarch64 = HCR_EL2.TGE == '1' || HCR_EL2.FMO == '1';

    if !route_to_aarch64 && HaveEL(EL3) && !ELUsingAArch32(EL3) then
        route_to_aarch64 = SCR_EL3.FIQ == '1';

    if route_to_aarch64 then AArch64.TakePhysicalFIQException();
    route_to_monitor = HaveEL(EL3) && SCR.FIQ == '1';
    route_to_hyp = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR.TGE == '1' || HCR.FMO == '1'));
    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x1C;
    lr_offset = 4;
    if route_to_monitor then
        AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_hyp then
        exception = ExceptionSyndrome(Exception_FIQ);
        AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
    else
        AArch32.EnterMode(M32_FIQ, preferred_exception_return, lr_offset, vect_offset);





aarch32/exceptions/asynch/AArch32.TakePhysicalIRQException

// AArch32.TakePhysicalIRQException()
// ==================================
// Take an enabled physical IRQ exception.

AArch32.TakePhysicalIRQException()

    // Check if routed to AArch64 state
    route_to_aarch64 = PSTATE.EL == EL0 && !ELUsingAArch32(EL1);
    if !route_to_aarch64 && HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
        route_to_aarch64 = HCR_EL2.TGE == '1' || HCR_EL2.IMO == '1';
    if !route_to_aarch64 && HaveEL(EL3) && !ELUsingAArch32(EL3) then
        route_to_aarch64 = SCR_EL3.IRQ == '1';

    if route_to_aarch64 then AArch64.TakePhysicalIRQException();

    route_to_monitor = HaveEL(EL3) && SCR.IRQ == '1';
    route_to_hyp = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR.TGE == '1' || HCR.IMO == '1'));
    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x18;
    lr_offset = 4;
    if route_to_monitor then
        AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_hyp then
        exception = ExceptionSyndrome(Exception_IRQ);
        AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
    else
        AArch32.EnterMode(M32_IRQ, preferred_exception_return, lr_offset, vect_offset);
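
The AArch32 routing decision for a physical IRQ follows a fixed priority: Monitor mode first, then Hyp mode, then IRQ mode. The Python sketch below illustrates that priority for the case where the exception is not routed to AArch64 state. It is not architecture pseudocode. Register fields are modelled as booleans, and the name route_physical_irq and its return strings are hypothetical.

```python
def route_physical_irq(el: int, secure: bool, have_el2: bool, have_el3: bool,
                       scr_irq: bool, hcr_tge: bool, hcr_imo: bool) -> str:
    """Return the target mode for a physical IRQ taken in AArch32 state.

    Sketch under stated assumptions: all Exception levels use AArch32, so the
    route_to_aarch64 checks in the pseudocode are not modelled here.
    """
    route_to_monitor = have_el3 and scr_irq
    route_to_hyp = (have_el2 and not secure and el in (0, 1)
                    and (hcr_tge or hcr_imo))
    if route_to_monitor:
        return "monitor"
    if el == 2 or route_to_hyp:
        return "hyp"
    return "irq"


if __name__ == "__main__":
    # Non-secure EL1 with HCR.IMO set and no Monitor routing.
    print(route_physical_irq(el=1, secure=False, have_el2=True, have_el3=True,
                             scr_irq=False, hcr_tge=False, hcr_imo=True))
```

The same priority order appears in the FIQ and SError functions in this section, with SCR.FIQ or SCR.EA and HCR.FMO or HCR.AMO taking the place of SCR.IRQ and HCR.IMO.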


aarch32/exceptions/asynch/AArch32.TakePhysicalSErrorException 


// AArch32.TakePhysicalSErrorException()
// =====================================

AArch32.TakePhysicalSErrorException(boolean parity, bit extflag,
                                    boolean syndrome_valid, bits(24) full_syndrome)

    // Check if routed to AArch64 state
    route_to_aarch64 = PSTATE.EL == EL0 && !ELUsingAArch32(EL1);

    if !route_to_aarch64 && HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
        route_to_aarch64 = HCR_EL2.TGE == '1' || HCR_EL2.AMO == '1';
    if !route_to_aarch64 && HaveEL(EL3) && !ELUsingAArch32(EL3) then
        route_to_aarch64 = SCR_EL3.EA == '1';

    if route_to_aarch64 then
        AArch64.TakePhysicalSErrorException(syndrome_valid, full_syndrome);

    route_to_monitor = HaveEL(EL3) && SCR.EA == '1';
    route_to_hyp = (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} &&
                    (HCR.TGE == '1' || HCR.AMO == '1'));
    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x10;
    lr_offset = 8;

    fault = AArch32.AsynchExternalAbort(parity, extflag);
    vaddress = bits(32) UNKNOWN;
    if route_to_monitor then
        AArch32.ReportDataAbort(route_to_monitor, fault, vaddress);
        AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);
    elsif PSTATE.EL == EL2 || route_to_hyp then
        exception = AArch32.AbortSyndrome(Exception_DataAbort, fault, vaddress);
        if PSTATE.EL == EL2 then
            AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
        else
            AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);
    else
        AArch32.ReportDataAbort(route_to_monitor, fault, vaddress);
        AArch32.EnterMode(M32_Abort, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/asynch/AArch32.TakeVirtualFIQException

// AArch32.TakeVirtualFIQException()
// =================================

AArch32.TakeVirtualFIQException()
    assert HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1};
    if ELUsingAArch32(EL2) then  // Virtual FIQ enabled if TGE==0 and FMO==1
        assert HCR.TGE == '0' && HCR.FMO == '1';
    else
        assert HCR_EL2.TGE == '0' && HCR_EL2.FMO == '1';
    // Check if routed to AArch64 state
    if PSTATE.EL == EL0 && !ELUsingAArch32(EL1) then AArch64.TakeVirtualFIQException();

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x1C;
    lr_offset = 4;

    AArch32.EnterMode(M32_FIQ, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/asynch/AArch32.TakeVirtualIRQException

// AArch32.TakeVirtualIRQException()

AArch32.TakeVirtualIRQException()
    assert HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1};

    if ELUsingAArch32(EL2) then  // Virtual IRQs enabled if TGE==0 and IMO==1
        assert HCR.TGE == '0' && HCR.IMO == '1';
    else
        assert HCR_EL2.TGE == '0' && HCR_EL2.IMO == '1';

    // Check if routed to AArch64 state
    if PSTATE.EL == EL0 && !ELUsingAArch32(EL1) then AArch64.TakeVirtualIRQException();

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x18;
    lr_offset = 4;

    AArch32.EnterMode(M32_IRQ, preferred_exception_return, lr_offset, vect_offset);
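
The virtual FIQ, IRQ and SError handlers in this section each finish by calling AArch32.EnterMode(), and differ only in the target mode, the vector offset and the link register offset. The following Python sketch tabulates those values as they appear in the pseudocode and applies them the way AArch32.EnterMode() does. The table name, the function name and the string mode labels are illustrative assumptions and are not part of the architecture pseudocode.

```python
# Values copied from the AArch32.TakeVirtual*Exception() pseudocode.
# VIRTUAL_ENTRY and entry_addresses() are hypothetical names used for illustration only.
VIRTUAL_ENTRY = {
    "FIQ":    ("M32_FIQ",   0x1C, 4),
    "IRQ":    ("M32_IRQ",   0x18, 4),
    "SError": ("M32_Abort", 0x10, 8),
}

def entry_addresses(kind, preferred_exception_return, vector_base):
    """Return (target mode, value written to R14, branch target) for a virtual exception."""
    mode, vect_offset, lr_offset = VIRTUAL_ENTRY[kind]
    lr = (preferred_exception_return + lr_offset) & 0xFFFFFFFF   # R[14] is 32 bits wide
    target = (vector_base + vect_offset) & 0xFFFFFFFF            # ExcVectorBase() + vect_offset
    return mode, lr, target
```

For example, a virtual IRQ taken with a preferred return address of 0x8000 and a vector base of 0 writes 0x8004 to R14 and branches to 0x18.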


aarch32/exceptions/asynch/AArch32.TakeVirtualSErrorException

// AArch32.TakeVirtualSErrorException()

AArch32.TakeVirtualSErrorException(bit extflag, boolean syndrome_valid, bits(24) full_syndrome)
    assert HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1};
    if ELUsingAArch32(EL2) then  // Virtual SError enabled if TGE==0 and AMO==1
        assert HCR.TGE == '0' && HCR.AMO == '1';
    else
        assert HCR_EL2.TGE == '0' && HCR_EL2.AMO == '1';
    // Check if routed to AArch64 state
    if PSTATE.EL == EL0 && !ELUsingAArch32(EL1) then AArch64.TakeVirtualSErrorException(syndrome_valid,
                                                                                       full_syndrome);

    route_to_monitor = FALSE;

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x10;
    lr_offset = 8;

    vaddress = bits(32) UNKNOWN;
    parity = FALSE;
    fault = AArch32.AsynchExternalAbort(parity, extflag);

    if ELUsingAArch32(EL2) then HCR.VA = '0'; else HCR_EL2.VSE = '0';
    AArch32.ReportDataAbort(route_to_monitor, fault, vaddress);
    AArch32.EnterMode(M32_Abort, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/debug/AArch32.SoftwareBreakpoint

// AArch32.SoftwareBreakpoint()
// ============================

AArch32.SoftwareBreakpoint(bits(16) immediate)

    if (HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) &&
        (HCR_EL2.TGE == '1' || MDCR_EL2.TDE == '1')) || !ELUsingAArch32(EL1) then
        AArch64.SoftwareBreakpoint(immediate);
    vaddress = bits(32) UNKNOWN;
    acctype = AccType_IFETCH;           // Take as a Prefetch Abort
    iswrite = FALSE;
    entry = DebugException_BKPT;

    fault = AArch32.DebugFault(acctype, iswrite, entry);
    AArch32.Abort(vaddress, fault);


aarch32/exceptions/debug/DebugException

constant bits(4) DebugException_Breakpoint  = '0001';
constant bits(4) DebugException_BKPT        = '0011';
constant bits(4) DebugException_VectorCatch = '0101';
constant bits(4) DebugException_Watchpoint  = '1010';


aarch32/exceptions/exceptions/AArch32.ExceptionClass

// AArch32.ExceptionClass()
// ========================
// Return the Exception Class and Instruction Length fields to be reported in HSR

(integer,bit) AArch32.ExceptionClass(Exception type)
    il = if ThisInstrLength() == 32 then '1' else '0';

    case type of
        when Exception_Uncategorized        ec = 0x00; il = '1';
        when Exception_WFxTrap              ec = 0x01;
        when Exception_CP15RTTrap           ec = 0x03;
        when Exception_CP15RRTTrap          ec = 0x04;
        when Exception_CP14RTTrap           ec = 0x05;
        when Exception_CP14DTTrap           ec = 0x06;
        when Exception_AdvSIMDFPAccessTrap  ec = 0x07;
        when Exception_FPIDTrap             ec = 0x08;
        when Exception_CP14RRTTrap          ec = 0x0C;
        when Exception_IllegalState         ec = 0x0E; il = '1';
        when Exception_SupervisorCall       ec = 0x11;
        when Exception_HypervisorCall       ec = 0x12;
        when Exception_MonitorCall          ec = 0x13;
        when Exception_InstructionAbort     ec = 0x20; il = '1';
        when Exception_PCAlignment          ec = 0x22; il = '1';
        when Exception_DataAbort            ec = 0x24;
        when Exception_FPTrappedException   ec = 0x28;
        otherwise                           Unreachable();

    if ec IN {0x20,0x24} && PSTATE.EL == EL2 then
        ec = ec + 1;

    return (ec,il);

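The Exception Class selection in AArch32.ExceptionClass() can be sketched as a table lookup followed by two adjustments. This Python sketch is illustrative only: the dictionary layout, the string keys and the function name are assumptions, and it assumes that the four cases that assign IL explicitly set it to '1'.

```python
# EC values transcribed from AArch32.ExceptionClass(); names here are hypothetical.
EC_TABLE = {
    "Uncategorized": 0x00, "WFxTrap": 0x01, "CP15RTTrap": 0x03, "CP15RRTTrap": 0x04,
    "CP14RTTrap": 0x05, "CP14DTTrap": 0x06, "AdvSIMDFPAccessTrap": 0x07,
    "FPIDTrap": 0x08, "CP14RRTTrap": 0x0C, "IllegalState": 0x0E,
    "SupervisorCall": 0x11, "HypervisorCall": 0x12, "MonitorCall": 0x13,
    "InstructionAbort": 0x20, "PCAlignment": 0x22, "DataAbort": 0x24,
    "FPTrappedException": 0x28,
}
# Assumption: these are the cases where the listing overrides IL with '1'.
FORCE_IL = {"Uncategorized", "IllegalState", "InstructionAbort", "PCAlignment"}

def exception_class(exc_type, instr_length_bits, current_el_is_el2):
    """Return (ec, il) for an exception reported in HSR."""
    il = 1 if instr_length_bits == 32 else 0
    ec = EC_TABLE[exc_type]
    if exc_type in FORCE_IL:
        il = 1
    # Aborts taken from EL2 use the "same Exception level" encodings 0x21 and 0x25.
    if ec in (0x20, 0x24) and current_el_is_el2:
        ec += 1
    return ec, il
```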

aarch32/exceptions/exceptions/AArch32.GeneralExceptionsToAArch64

// AArch32.GeneralExceptionsToAArch64()
// Returns TRUE if exceptions normally routed to EL1 are being handled at an Exception
// level using AArch64, because either EL1 is using AArch64 or TGE is in force and EL2
// is using AArch64.

boolean AArch32.GeneralExceptionsToAArch64()
    return ((PSTATE.EL == EL0 && !ELUsingAArch32(EL1)) ||
            (HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) && HCR_EL2.TGE == '1'));

aarch32/exceptions/exceptions/AArch32.ReportHypEntry

// AArch32.ReportHypEntry()
// Report syndrome information to Hyp mode registers.

AArch32.ReportHypEntry(ExceptionRecord exception)

    Exception type = exception.type;

    (ec,il) = AArch32.ExceptionClass(type);
    iss = exception.syndrome;

    // IL is not valid for Data Abort exceptions without valid instruction syndrome information
    if ec IN {0x24,0x25} && iss<24> == '0' then
        il = '1';

    HSR = ec<5:0>:il:iss;

    if type IN {Exception_InstructionAbort, Exception_PCAlignment} then
        HIFAR = exception.vaddress<31:0>;
        HDFAR = bits(32) UNKNOWN;
    elsif type == Exception_DataAbort then
        HIFAR = bits(32) UNKNOWN;
        HDFAR = exception.vaddress<31:0>;

    if exception.ipavalid then
        HPFAR<31:4> = exception.ipaddress<39:12>;
    else
        HPFAR<31:4> = bits(28) UNKNOWN;

    return;
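
The HSR value written by AArch32.ReportHypEntry() is the concatenation ec<5:0>:il:iss, with IL forced to 1 for Data Abort classes that carry no valid instruction syndrome. A minimal Python sketch of that packing follows; the function name is an assumption made for illustration.

```python
def pack_hsr(ec, il, iss):
    """Concatenate ec<5:0>:il:iss into a 32-bit HSR value; iss is 25 bits wide."""
    # IL is not valid for Data Aborts without valid instruction syndrome (iss<24> == 0).
    if ec in (0x24, 0x25) and ((iss >> 24) & 1) == 0:
        il = 1
    return ((ec & 0x3F) << 26) | ((il & 1) << 25) | (iss & 0x1FFFFFF)
```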


aarch32/exceptions/exceptions/AArch32.ResetControlRegisters

// Resets System registers and memory-mapped control registers that have architecturally-defined
// reset values to those values.
AArch32.ResetControlRegisters(boolean cold_reset);


aarch32/exceptions/exceptions/AArch32.TakeReset

// AArch32.TakeReset()
// Reset into AArch32 state

AArch32.TakeReset(boolean cold_reset)
    assert HighestELUsingAArch32();

    // Enter the highest implemented Exception level in AArch32 state
    if HaveEL(EL3) then
        AArch32.WriteMode(M32_Svc);
        SCR.NS = '0';                     // Secure state
    elsif HaveEL(EL2) then
        AArch32.WriteMode(M32_Hyp);
    else
        AArch32.WriteMode(M32_Svc);

    // Reset the CP14 and CP15 registers and other system components
    AArch32.ResetControlRegisters(cold_reset);
    FPEXC.EN = '0';

    // Reset all other PSTATE fields, including instruction set and endianness according to the
    // SCTLR values produced by the above call to ResetControlRegisters()
    PSTATE.<A,I,F> = '111';       // All asynchronous exceptions masked
    PSTATE.IT = '00000000';       // IT block state reset
    PSTATE.T = SCTLR.TE;          // Instruction set: TE=0: A32, TE=1: T32. PSTATE.J is RES0.
    PSTATE.E = SCTLR.EE;          // Endianness: EE=0: little-endian, EE=1: big-endian
    PSTATE.IL = '0';              // Clear Illegal Execution state bit

    // All registers, bits and fields not reset by the above pseudocode or by the BranchTo() call
    // below are UNKNOWN bitstrings after reset. In particular, the return information registers
    // R14 or ELR_hyp and SPSR have UNKNOWN values, so that it
    // is impossible to return from a reset in an architecturally defined way.
    AArch32.ResetGeneralRegisters();
    AArch32.ResetSIMDFPRegisters();
    AArch32.ResetSpecialRegisters();
    ResetExternalDebugRegisters(cold_reset);

    bits(32) rv;                      // IMPLEMENTATION DEFINED reset vector
    if HaveEL(EL3) then
        if MVBAR<0> == '1' then           // Reset vector in MVBAR
            rv = MVBAR<31:1>:'0';
        else
            rv = bits(32) IMPLEMENTATION_DEFINED "reset vector address";
    else
        rv = RVBAR<31:1>:'0';

    // The reset vector must be correctly aligned
    assert rv<0> == '0' && (PSTATE.T == '1' || rv<1> == '0');

    BranchTo(rv, BranchType_UNKNOWN);
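
The alignment assertion at the end of AArch32.TakeReset() requires halfword alignment when the PE resets into T32 state and word alignment when it resets into A32 state. A short Python sketch of that check, using a hypothetical function name:

```python
def reset_vector_is_aligned(rv, pstate_t):
    """Mirror the final assertion in AArch32.TakeReset().

    rv is the 32-bit reset vector address; pstate_t is 1 for T32 and 0 for A32.
    """
    bit0_clear = (rv & 0b01) == 0
    bit1_clear = (rv & 0b10) == 0
    return bit0_clear and (pstate_t == 1 or bit1_clear)
```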


aarch32/exceptions/exceptions/ExcVectorBase

// ExcVectorBase()

bits(32) ExcVectorBase()
    if SCTLR.V == '1' then  // Hivecs selected, base = 0xFFFF0000
        return Ones(16):Zeros(16);
    else
        return VBAR;


aarch32/exceptions/ieeefp/AArch32.FPTrappedException

// AArch32.FPTrappedException()
// ============================

AArch32.FPTrappedException(bits(8) accumulated_exceptions)
    if AArch32.GeneralExceptionsToAArch64() then
        is_ase = FALSE;
        element = 0;
        AArch64.FPTrappedException(is_ase, element, accumulated_exceptions);
    FPEXC.DEX = '1';
    FPEXC.TFV = '1';
    FPEXC<7,4:0> = accumulated_exceptions<7,4:0>;          // IDF,IXF,UFF,OFF,DZF,IOF

    AArch32.TakeUndefInstrException();


aarch32/exceptions/syscalls/AArch32.CallHypervisor

// AArch32.CallHypervisor()
// Performs a HVC call

AArch32.CallHypervisor(bits(16) immediate)
    assert HaveEL(EL2);

    if !ELUsingAArch32(EL2) then
        AArch64.CallHypervisor(immediate);
    else
        AArch32.TakeHVCException(immediate);


aarch32/exceptions/syscalls/AArch32.CallSupervisor

// AArch32.CallSupervisor()
// Calls the Supervisor

AArch32.CallSupervisor(bits(16) immediate)

    if AArch32.CurrentCond() != '1110' then
        immediate = bits(16) UNKNOWN;
    if AArch32.GeneralExceptionsToAArch64() then
        AArch64.CallSupervisor(immediate);
    else
        AArch32.TakeSVCException(immediate);


aarch32/exceptions/syscalls/AArch32.TakeHVCException

// AArch32.TakeHVCException()
// ==========================

AArch32.TakeHVCException(bits(16) immediate)
    assert HaveEL(EL2) && ELUsingAArch32(EL2);

    AArch32.ITAdvance();
    SSAdvance();
    bits(32) preferred_exception_return = NextInstrAddr();
    vect_offset = 0x08;

    exception = ExceptionSyndrome(Exception_HypervisorCall);
    exception.syndrome<15:0> = immediate;

    if PSTATE.EL == EL2 then
        AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
    else
        AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);


aarch32/exceptions/syscalls/AArch32.TakeSMCException

// AArch32.TakeSMCException()
// ==========================

AArch32.TakeSMCException()
    assert HaveEL(EL3) && ELUsingAArch32(EL3);

    AArch32.ITAdvance();
    SSAdvance();

    bits(32) preferred_exception_return = NextInstrAddr();
    vect_offset = 0x08;
    lr_offset = 0;

    AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/syscalls/AArch32.TakeSVCException

// AArch32.TakeSVCException()
// ==========================

AArch32.TakeSVCException(bits(16) immediate)

    AArch32.ITAdvance();
    SSAdvance();
    route_to_hyp = HaveEL(EL2) && !IsSecure() && PSTATE.EL == EL0 && HCR.TGE == '1';

    bits(32) preferred_exception_return = NextInstrAddr();
    vect_offset = 0x08;
    lr_offset = 0;

    if PSTATE.EL == EL2 || route_to_hyp then
        exception = ExceptionSyndrome(Exception_SupervisorCall);
        exception.syndrome<15:0> = immediate;
        if PSTATE.EL == EL2 then
            AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
        else
            AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);
    else
        AArch32.EnterMode(M32_Svc, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/takeexception/AArch32.EnterHypMode

// AArch32.EnterHypMode()
// Take an exception to Hyp mode.

AArch32.EnterHypMode(ExceptionRecord exception, bits(32) preferred_exception_return,
                     integer vect_offset)
    assert HaveEL(EL2) && !IsSecure() && ELUsingAArch32(EL2);

    spsr = GetPSRFromPSTATE();
    if !(exception.type IN {Exception_IRQ, Exception_FIQ}) then
        AArch32.ReportHypEntry(exception);
    AArch32.WriteMode(M32_Hyp);
    SPSR[] = spsr;
    ELR_hyp = preferred_exception_return;
    PSTATE.T = HSCTLR.TE;                       // PSTATE.J is RES0
    PSTATE.SS = '0';
    if !HaveEL(EL3) || SCR_GEN[].EA == '0' then PSTATE.A = '1';
    if !HaveEL(EL3) || SCR_GEN[].IRQ == '0' then PSTATE.I = '1';
    if !HaveEL(EL3) || SCR_GEN[].FIQ == '0' then PSTATE.F = '1';
    PSTATE.E = HSCTLR.EE;
    PSTATE.IL = '0';
    PSTATE.IT = '00000000';
    BranchTo(HVBAR + vect_offset, BranchType_UNKNOWN);

    EndOfInstruction();


aarch32/exceptions/takeexception/AArch32.EnterMode

// AArch32.EnterMode()
// Take an exception to a mode other than Monitor and Hyp mode.

AArch32.EnterMode(bits(5) target_mode, bits(32) preferred_exception_return, integer lr_offset,
                  integer vect_offset)
    assert ELUsingAArch32(EL1) && PSTATE.EL != EL2;

    spsr = GetPSRFromPSTATE();
    if PSTATE.M == M32_Monitor then SCR.NS = '0';
    AArch32.WriteMode(target_mode);
    SPSR[] = spsr;
    R[14] = preferred_exception_return + lr_offset;
    PSTATE.T = SCTLR.TE;                        // PSTATE.J is RES0
    PSTATE.SS = '0';
    if target_mode == M32_FIQ then
        PSTATE.<A,I,F> = '111';
    elsif target_mode IN {M32_Abort, M32_IRQ} then
        PSTATE.<A,I> = '11';
    else
        PSTATE.I = '1';
    PSTATE.E = SCTLR.EE;
    PSTATE.IL = '0';
    PSTATE.IT = '00000000';
    BranchTo(ExcVectorBase() + vect_offset, BranchType_UNKNOWN);
    EndOfInstruction();
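
Which interrupt mask bits AArch32.EnterMode() sets depends only on the target mode. The rule can be sketched in Python as follows; the function name and the string mode labels are assumptions used for illustration.

```python
def masks_set_by_enter_mode(target_mode):
    """Return the PSTATE mask bits that AArch32.EnterMode() sets to 1 on entry."""
    if target_mode == "M32_FIQ":
        return {"A", "I", "F"}
    if target_mode in ("M32_Abort", "M32_IRQ"):
        return {"A", "I"}
    return {"I"}   # Svc, Undef and any other mode entered through this function
```

Mask bits that are not in the returned set keep the values they had before the exception was taken.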


aarch32/exceptions/takeexception/AArch32.EnterMonitorMode

// AArch32.EnterMonitorMode()
// Take an exception to Monitor mode.

AArch32.EnterMonitorMode(bits(32) preferred_exception_return, integer lr_offset,
                         integer vect_offset)
    assert HaveEL(EL3) && ELUsingAArch32(EL3);
    spsr = GetPSRFromPSTATE();
    if PSTATE.M == M32_Monitor then SCR.NS = '0';
    AArch32.WriteMode(M32_Monitor);
    SPSR[] = spsr;
    R[14] = preferred_exception_return + lr_offset;
    PSTATE.T = SCTLR.TE;                        // PSTATE.J is RES0
    PSTATE.SS = '0';
    PSTATE.<A,I,F> = '111';
    PSTATE.E = SCTLR.EE;
    PSTATE.IL = '0';
    PSTATE.IT = '00000000';
    BranchTo(MVBAR + vect_offset, BranchType_UNKNOWN);
    EndOfInstruction();


aarch32/exceptions/traps/AArch32.AArch32SystemAccessTrap

// AArch32.AArch32SystemAccessTrap()
// =================================
// Trapped AArch32 System register access other than due to CPTR_EL2 or CPACR_EL1.

AArch32.AArch32SystemAccessTrap(bits(2) target_el, bits(32) instr)
    assert HaveEL(target_el) && target_el != EL0 && UInt(target_el) >= UInt(PSTATE.EL);

    if !ELUsingAArch32(target_el) || AArch32.GeneralExceptionsToAArch64() then
        AArch64.AArch32SystemAccessTrap(target_el, instr);

    assert target_el IN {EL1,EL2};

    if target_el == EL2 then
        exception = AArch32.AArch32SystemAccessTrapSyndrome(instr);
        AArch32.TakeHypTrapException(exception);
    else
        AArch32.TakeUndefInstrException();


aarch32/exceptions/traps/AArch32.AArch32SystemAccessTrapSyndrome

// AArch32.AArch32SystemAccessTrapSyndrome()
// Return the syndrome information for traps on AArch32 MCR, MCRR, MRC, MRRC, and VMRS instructions,
// other than traps that are due to HCPTR or CPACR.

ExceptionRecord AArch32.AArch32SystemAccessTrapSyndrome(bits(32) instr)

    ExceptionRecord exception;
    cpnum = UInt(instr<11:8>);

    bits(20) iss = Zeros();







    if instr<27:24> == '1110' && instr<4> == '1' && instr<31:28> != '1111' then
        // MRC/MCR
        case cpnum of
            when 10    exception = ExceptionSyndrome(Exception_FPIDTrap);
            when 14    exception = ExceptionSyndrome(Exception_CP14RTTrap);
            when 15    exception = ExceptionSyndrome(Exception_CP15RTTrap);
            otherwise  Unreachable();
        iss<19:17> = instr<7:5>;      // opc2
        iss<16:14> = instr<23:21>;    // opc1
        iss<13:10> = instr<19:16>;    // CRn
        iss<8:5>   = instr<15:12>;    // Rt
        iss<4:1>   = instr<3:0>;      // CRm
    elsif instr<27:21> == '1100010' && instr<31:28> != '1111' then
        // MRRC/MCRR
        case cpnum of
            when 14    exception = ExceptionSyndrome(Exception_CP14RRTTrap);
            when 15    exception = ExceptionSyndrome(Exception_CP15RRTTrap);
            otherwise  Unreachable();
        iss<19:16> = instr<7:4>;      // opc1
        iss<13:10> = instr<19:16>;    // Rt2
        iss<8:5>   = instr<15:12>;    // Rt
        iss<4:1>   = instr<3:0>;      // CRm
    elsif instr<27:25> == '110' && instr<31:28> != '1111' then
        // LDC/STC
        assert cpnum == 14;
        exception = ExceptionSyndrome(Exception_CP14DTTrap);
        iss<19:12> = instr<7:0>;      // imm8
        iss<4>     = instr<23>;       // U
        iss<2:1>   = instr<24,21>;    // PW
        if instr<19:16> == '1111' then    // Literal addressing
            iss<8:5> = bits(4) UNKNOWN;
            iss<3>   = '1';
        else
            iss<8:5> = instr<19:16>;  // Rn
            iss<3>   = '0';
    else
        Unreachable();
    iss<0> = instr<20>;               // Direction

    exception.syndrome<24:20> = ConditionSyndrome();
    exception.syndrome<19:0>  = iss;

    return exception;
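
For the MRC/MCR case, the ISS field assignments in AArch32.AArch32SystemAccessTrapSyndrome() amount to moving instruction fields into fixed positions of a 20-bit value. The following Python sketch covers only that case; the helper names are assumptions, and the condition field in syndrome<24:20> is not modelled.

```python
def field(value, hi, lo):
    """Extract bits <hi:lo> of value, like the pseudocode slice value<hi:lo>."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

def mcr_mrc_iss(instr):
    """Build the 20-bit ISS for a trapped MRC or MCR instruction encoding."""
    iss = 0
    iss |= field(instr, 7, 5) << 17     # opc2 -> iss<19:17>
    iss |= field(instr, 23, 21) << 14   # opc1 -> iss<16:14>
    iss |= field(instr, 19, 16) << 10   # CRn  -> iss<13:10>
    iss |= field(instr, 15, 12) << 5    # Rt   -> iss<8:5>
    iss |= field(instr, 3, 0) << 1      # CRm  -> iss<4:1>
    iss |= field(instr, 20, 20)         # Direction -> iss<0>
    return iss
```

As a usage example, the A32 encoding 0xEE110F10 (MRC p15, 0, R0, c1, c0, 0) has CRn = 1 and the direction bit set, so the sketch yields 0x401.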


aarch32/exceptions/traps/AArch32.CheckAdvSIMDOrFPEnabled

// AArch32.CheckAdvSIMDOrFPEnabled()
// Check against CPACR, FPEXC, HCPTR, NSACR, and CPTR_EL3.

AArch32.CheckAdvSIMDOrFPEnabled(boolean fpexc_check, boolean advsimd)
    if PSTATE.EL == EL0 && !ELUsingAArch32(EL1) then
        AArch64.CheckFPAdvSIMDEnabled();
    else
        cpacr_asedis = CPACR.ASEDIS;
        cpacr_cp10 = CPACR.cp10;

        if HaveEL(EL3) && ELUsingAArch32(EL3) && !IsSecure() then
            // Check if access disabled in NSACR
            if NSACR.NSASEDIS == '1' then cpacr_asedis = '1';
            if NSACR.cp10 == '0' then cpacr_cp10 = '00';

        if PSTATE.EL != EL2 then
            // Check if Advanced SIMD disabled in CPACR
            if advsimd && cpacr_asedis == '1' then UNDEFINED;

            // Check if access disabled in CPACR
            case cpacr_cp10 of
                when 'x0' disabled = TRUE;
                when '01' disabled = PSTATE.EL == EL0;
                when '11' disabled = FALSE;
            if disabled then UNDEFINED;

        // If required, check FPEXC enabled bit.
        if fpexc_check && FPEXC.EN == '0' then UNDEFINED;

        AArch32.CheckFPAdvSIMDTrap(advsimd);    // Also check against HCPTR and CPTR_EL3
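
The case statement on cpacr_cp10 in AArch32.CheckAdvSIMDOrFPEnabled() decodes a 2-bit access-control field. A minimal Python sketch of that decode is shown below; the function name and the integer representation of the field are assumptions.

```python
def cp10_access_disabled(cpacr_cp10, pstate_el):
    """Decode the effective CPACR.cp10 value (0..3) for execution at EL0 or EL1."""
    if (cpacr_cp10 & 0b01) == 0:    # 'x0': access disabled at EL0 and EL1
        return True
    if cpacr_cp10 == 0b01:          # '01': access permitted at EL1 only
        return pstate_el == 0
    return False                    # '11': access permitted at EL0 and EL1
```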


aarch32/exceptions/traps/AArch32.CheckFPAdvSIMDTrap

// AArch32.CheckFPAdvSIMDTrap()
// Check against CPTR_EL2 and CPTR_EL3.

AArch32.CheckFPAdvSIMDTrap(boolean advsimd)
    if HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
        AArch64.CheckFPAdvSIMDTrap();
    else
        if HaveEL(EL2) && !IsSecure() then
            hcptr_tase = HCPTR.TASE;
            hcptr_cp10 = HCPTR.TCP10;

            if HaveEL(EL3) && ELUsingAArch32(EL3) && !IsSecure() then
                // Check if access disabled in NSACR
                if NSACR.NSASEDIS == '1' then hcptr_tase = '1';
                if NSACR.cp10 == '0' then hcptr_cp10 = '1';

            // Check if access disabled in HCPTR
            if (advsimd && hcptr_tase == '1') || hcptr_cp10 == '1' then
                exception = ExceptionSyndrome(Exception_AdvSIMDFPAccessTrap);
                exception.syndrome<24:20> = ConditionSyndrome();

                if advsimd then
                    exception.syndrome<5> = '1';
                else
                    exception.syndrome<5> = '0';
                    exception.syndrome<3:0> = '1010';         // coproc field, always 0xA

                if PSTATE.EL == EL2 then
                    AArch32.TakeUndefInstrException(exception);
                else
                    AArch32.TakeHypTrapException(exception);

        if HaveEL(EL3) && !ELUsingAArch32(EL3) then
            // Check if access disabled in CPTR_EL3
            if CPTR_EL3.TFP == '1' then AArch64.AdvSIMDFPAccessTrap(EL3);
    return;


aarch32/exceptions/traps/AArch32.CheckForSMCTrap

// AArch32.CheckForSMCTrap()
// Check for trap on SMC instruction

AArch32.CheckForSMCTrap()

    if HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) then
        AArch64.CheckForSMCTrap(Zeros(16));
    else
        route_to_hyp = HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} && HCR.TSC == '1';
        if route_to_hyp then
            exception = ExceptionSyndrome(Exception_MonitorCall);
            AArch32.TakeHypTrapException(exception);


aarch32/exceptions/traps/AArch32.CheckForWFxTrap

// AArch32.CheckForWFxTrap()
// =========================
// Check for trap on WFE or WFI instruction

AArch32.CheckForWFxTrap(bits(2) target_el, boolean is_wfe)
    assert HaveEL(target_el);

    // Check for routing to AArch64
    if !ELUsingAArch32(target_el) then
        AArch64.CheckForWFxTrap(target_el, is_wfe);
        return;
    case target_el of
        when EL1 trap = (if is_wfe then SCTLR.nTWE else SCTLR.nTWI) == '0';
        when EL2 trap = (if is_wfe then HCR.TWE else HCR.TWI) == '1';
        when EL3 trap = (if is_wfe then SCR.TWE else SCR.TWI) == '1';
    if trap then
        if (target_el == EL1 && HaveEL(EL2) && !IsSecure() && !ELUsingAArch32(EL2) &&
            HCR_EL2.TGE == '1') then
            AArch64.WFxTrap(target_el, is_wfe);
        if target_el == EL3 then
            AArch32.TakeMonitorTrapException();
        elsif target_el == EL2 then
            exception = ExceptionSyndrome(Exception_WFxTrap);
            exception.syndrome<24:20> = ConditionSyndrome();
            exception.syndrome<0> = if is_wfe then '1' else '0';
            AArch32.TakeHypTrapException(exception);
        else
            AArch32.TakeUndefInstrException();
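
The trap-enable test in AArch32.CheckForWFxTrap() uses a negative-polarity control at EL1 (SCTLR.nTWE and SCTLR.nTWI trap when 0) and positive-polarity controls at EL2 and EL3. The following Python sketch shows that selection; the function name and the dictionary keys used to model register fields are hypothetical.

```python
def wfx_trap_enabled(target_el, is_wfe, regs):
    """Return True if WFE or WFI is trapped to target_el.

    regs maps illustrative keys such as 'SCTLR.nTWE' or 'HCR.TWI' to 0 or 1.
    """
    suffix = "TWE" if is_wfe else "TWI"
    if target_el == 1:
        return regs["SCTLR.n" + suffix] == 0    # trap when the nTWx bit is clear
    if target_el == 2:
        return regs["HCR." + suffix] == 1
    if target_el == 3:
        return regs["SCR." + suffix] == 1
    raise ValueError("target_el must be 1, 2 or 3")
```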


aarch32/exceptions/traps/AArch32.CheckITEnabled

// AArch32.CheckITEnabled()
// Check whether the T32 IT instruction is disabled.

AArch32.CheckITEnabled(bits(4) mask)

    if PSTATE.EL == EL2 then
        it_disabled = HSCTLR.ITD;
    else
        it_disabled = (if ELUsingAArch32(EL1) then SCTLR.ITD else SCTLR[].ITD);
    if it_disabled == '1' then
        if mask != '1000' then UNDEFINED;

        // Otherwise whether the IT block is allowed depends on hw1 of the next instruction.
        next_instr = AArch32.MemSingle[NextInstrAddr(), 2, AccType_IFETCH, TRUE];

        if next_instr IN {'11xxxxxxxxxxxxxx', '1011xxxxxxxxxxxx', '10100xxxxxxxxxxx',
                          '01001xxxxxxxxxxx', '010001xxx1111xxx', '010001xx1xxxx111'} then
            // It is IMPLEMENTATION DEFINED whether the Undefined Instruction exception is
            // taken on the IT instruction or the next instruction. This is not reflected in
            // the pseudocode, which always takes the exception on the IT instruction. This
            // also does not take into account cases where the next instruction is UNPREDICTABLE.
            UNDEFINED;

    return;
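
The IN test on next_instr in AArch32.CheckITEnabled() compares the first halfword of the following instruction against bit patterns in which 'x' is a don't-care position. A Python sketch of that matching is given below. The helper names are assumptions, and the patterns are those of the listing as read here.

```python
# Patterns from the AArch32.CheckITEnabled() listing; 'x' is a don't-care bit.
# matches_pattern() and it_block_disallowed() are hypothetical helper names.
DISALLOWED_HW1 = (
    "11xxxxxxxxxxxxxx", "1011xxxxxxxxxxxx", "10100xxxxxxxxxxx",
    "01001xxxxxxxxxxx", "010001xxx1111xxx", "010001xx1xxxx111",
)

def matches_pattern(pattern, halfword):
    """Compare a 16-bit value against a pattern, most significant bit first."""
    bits = format(halfword & 0xFFFF, "016b")
    return all(p == "x" or p == b for p, b in zip(pattern, bits))

def it_block_disallowed(next_hw1):
    """True if the halfword after the IT instruction makes the IT block UNDEFINED when ITD is 1."""
    return any(matches_pattern(p, next_hw1) for p in DISALLOWED_HW1)
```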


aarch32/exceptions/traps/AArch32.CheckIllegalState

// AArch32.CheckIllegalState()
// Check PSTATE.IL bit and generate Illegal Execution state exception if set.

AArch32.CheckIllegalState()
    if AArch32.GeneralExceptionsToAArch64() then
        AArch64.CheckIllegalState();
    elsif PSTATE.IL == '1' then
        route_to_hyp = HaveEL(EL2) && !IsSecure() && PSTATE.EL == EL0 && HCR.TGE == '1';

        bits(32) preferred_exception_return = ThisInstrAddr();
        vect_offset = 0x04;

        if PSTATE.EL == EL2 || route_to_hyp then
            exception = ExceptionSyndrome(Exception_IllegalState);
            if PSTATE.EL == EL2 then
                AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
            else
                AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);
        else
            AArch32.TakeUndefInstrException();


aarch32/exceptions/traps/AArch32.CheckSETENDEnabled

// AArch32.CheckSETENDEnabled()
// Check whether the AArch32 SETEND instruction is disabled.

AArch32.CheckSETENDEnabled()

    if PSTATE.EL == EL2 then
        setend_disabled = HSCTLR.SED;
    else
        setend_disabled = (if ELUsingAArch32(EL1) then SCTLR.SED else SCTLR[].SED);
    if setend_disabled == '1' then
        UNDEFINED;
    return;


aarch32/exceptions/traps/AArch32.TakeHypTrapException

// AArch32.TakeHypTrapException()
// ==============================
// Exceptions routed to Hyp mode as a Hyp Trap exception.

AArch32.TakeHypTrapException(ExceptionRecord exception)
    assert HaveEL(EL2) && !IsSecure() && ELUsingAArch32(EL2);

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x14;

    AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);


aarch32/exceptions/traps/AArch32.TakeMonitorTrapException

// AArch32.TakeMonitorTrapException()
// Exceptions routed to Monitor mode as a Monitor Trap exception.

AArch32.TakeMonitorTrapException()
    assert HaveEL(EL3) && ELUsingAArch32(EL3);

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x04;
    lr_offset = if CurrentInstrSet() == InstrSet_A32 then 4 else 2;

    AArch32.EnterMonitorMode(preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/traps/AArch32.TakeUndefInstrException

// AArch32.TakeUndefInstrException()
// =================================

AArch32.TakeUndefInstrException()
    exception = ExceptionSyndrome(Exception_Uncategorized);
    AArch32.TakeUndefInstrException(exception);

// AArch32.TakeUndefInstrException()
// =================================

AArch32.TakeUndefInstrException(ExceptionRecord exception)
    route_to_hyp = HaveEL(EL2) && !IsSecure() && PSTATE.EL == EL0 && HCR.TGE == '1';

    bits(32) preferred_exception_return = ThisInstrAddr();
    vect_offset = 0x04;
    lr_offset = if CurrentInstrSet() == InstrSet_A32 then 4 else 2;

    if PSTATE.EL == EL2 then
        AArch32.EnterHypMode(exception, preferred_exception_return, vect_offset);
    elsif route_to_hyp then
        AArch32.EnterHypMode(exception, preferred_exception_return, 0x14);
    else
        AArch32.EnterMode(M32_Undef, preferred_exception_return, lr_offset, vect_offset);


aarch32/exceptions/traps/AArch32.UndefinedFault

// AArch32.UndefinedFault()

AArch32.UndefinedFault()

    if AArch32.GeneralExceptionsToAArch64() then AArch64.UndefinedFault();
    AArch32.TakeUndefInstrException();





J1.2.3 aarch32/functions

This section includes the following pseudocode functions:

• aarch32/functions/aborts/AArch32.CreateFaultRecord on page J1-5329.
• aarch32/functions/aborts/AArch32.DomainValid on page J1-5329.
• aarch32/functions/aborts/AArch32.FaultStatusLD on page J1-5330.
• aarch32/functions/aborts/AArch32.FaultStatusSD on page J1-5330.
• aarch32/functions/aborts/AArch32.FaultSyndrome on page J1-5330.
• aarch32/functions/aborts/EncodeSDFSC on page J1-5331.
• aarch32/functions/common/A32ExpandImm on page J1-5332.
• aarch32/functions/common/A32ExpandImm_C on page J1-5332.
• aarch32/functions/common/DecodeImmShift on page J1-5332.
• aarch32/functions/common/DecodeRegShift on page J1-5332.
• aarch32/functions/common/RRX on page J1-5332.
• aarch32/functions/common/RRX_C on page J1-5333.
• aarch32/functions/common/SRType on page J1-5333.
• aarch32/functions/common/Shift on page J1-5333.
• aarch32/functions/common/Shift_C on page J1-5333.
• aarch32/functions/common/T32ExpandImm on page J1-5333.
• aarch32/functions/common/T32ExpandImm_C on page J1-5334.
• aarch32/functions/coproc/AArch32.CheckCP15InstrCoarseTraps on page J1-5334.
• aarch32/functions/coproc/AArch32.CheckSystemAccess on page J1-5334.
• aarch32/functions/coproc/AArch32.CheckSystemAccessTraps on page J1-5335.
• aarch32/functions/coproc/CP14DebugInstrDecode on page J1-5336.
• aarch32/functions/coproc/CP14JazelleInstrDecode on page J1-5336.
• aarch32/functions/coproc/CP14TraceInstrDecode on page J1-5336.
• aarch32/functions/coproc/CP15InstrDecode on page J1-5336.
• aarch32/functions/exclusive/AArch32.ExclusiveMonitorsPass on page J1-5336.
• aarch32/functions/exclusive/AArch32.IsExclusiveVA on page J1-5337.
• aarch32/functions/exclusive/AArch32.MarkExclusiveVA on page J1-5337.
• aarch32/functions/exclusive/AArch32.SetExclusiveMonitors on page J1-5337.
• aarch32/functions/float/CheckAdvSIMDEnabled on page J1-5337.
• aarch32/functions/float/CheckAdvSIMDOrVFPEnabled on page J1-5338.
• aarch32/functions/float/CheckCryptoEnabled32 on page J1-5338.
• aarch32/functions/float/CheckVFPEnabled on page J1-5338.
• aarch32/functions/float/FPHalvedSub on page J1-5338.
• aarch32/functions/float/FPRSqrtStep on page J1-5339.
• aarch32/functions/float/FPRecipStep on page J1-5339.
• aarch32/functions/float/StandardFPSCRValue on page J1-5339.
• aarch32/functions/memory/AArch32.CheckAlignment on page J1-5339.
• aarch32/functions/memory/AArch32.MemSingle on page J1-5340.
• aarch32/functions/memory/Hint_PreloadData on page J1-5341.
• aarch32/functions/memory/Hint_PreloadDataForWrite on page J1-5341.
• aarch32/functions/memory/Hint_PreloadInstr on page J1-5341.
• aarch32/functions/memory/MemA on page J1-5341.
• aarch32/functions/memory/MemO on page J1-5341.
• aarch32/functions/memory/MemU on page J1-5341.
• aarch32/functions/memory/MemU_unpriv on page J1-5342.
• aarch32/functions/memory/Mem_with_type on page J1-5342.
• aarch32/functions/registers/AArch32.ResetGeneralRegisters on page J1-5343.
• aarch32/functions/registers/AArch32.ResetSIMDFPRegisters on page J1-5343.
• aarch32/functions/registers/AArch32.ResetSpecialRegisters on page J1-5343.
• aarch32/functions/registers/AArch32.ResetSystemRegisters on page J1-5344.
• aarch32/functions/registers/ALUExceptionReturn on page J1-5344.
• aarch32/functions/registers/ALUWritePC on page J1-5344.
• aarch32/functions/registers/BXWritePC on page J1-5344.
• aarch32/functions/registers/BranchWritePC on page J1-5344.
• aarch32/functions/registers/D on page J1-5345.
• aarch32/functions/registers/Din on page J1-5345.
• aarch32/functions/registers/LR on page J1-5345.
• aarch32/functions/registers/LoadWritePC on page J1-5345.
• aarch32/functions/registers/LookUpRIndex on page J1-5345.
• aarch32/functions/registers/Monitor_mode_registers on page J1-5346.
• aarch32/functions/registers/PC on page J1-5346.
• aarch32/functions/registers/PCStoreValue on page J1-5346.
• aarch32/functions/registers/Q on page J1-5346.
• aarch32/functions/registers/Qin on page J1-5346.
• aarch32/functions/registers/R on page J1-5347.
• aarch32/functions/registers/RBankSelect on page J1-5347.
• aarch32/functions/registers/Rmode on page J1-5347.
• aarch32/functions/registers/S on page J1-5348.
• aarch32/functions/registers/SP on page J1-5348.
• aarch32/functions/registers/_Dclone on page J1-5348.
• aarch32/functions/system/AArch32.ExceptionReturn on page J1-5348.
• aarch32/functions/system/AArch32.ExecutingCP10or11Instr on page J1-5349.
• aarch32/functions/system/AArch32.ExecutingLSMInstr on page J1-5349.
• aarch32/functions/system/AArch32.ITAdvance on page J1-5349.
• aarch32/functions/system/AArch32.SysRegRead on page J1-5349.
• aarch32/functions/system/AArch32.SysRegRead64 on page J1-5350.
• aarch32/functions/system/AArch32.SysRegReadCanWriteAPSR on page J1-5350.
• aarch32/functions/system/AArch32.SysRegWrite on page J1-5350.
• aarch32/functions/system/AArch32.SysRegWrite64 on page J1-5350.
• aarch32/functions/system/AArch32.WriteMode on page J1-5350.
• aarch32/functions/system/AArch32.WriteModeByInstr on page J1-5350.
• aarch32/functions/system/BadMode on page J1-5351.
• aarch32/functions/system/BankedRegisterAccessValid on page J1-5351.
• aarch32/functions/system/CPSRWriteByInstr on page J1-5352.
• aarch32/functions/system/ConditionPassed on page J1-5352.
• aarch32/functions/system/CurrentCond on page J1-5352.
• aarch32/functions/system/InITBlock on page J1-5352.
• aarch32/functions/system/LastInITBlock on page J1-5353.
• aarch32/functions/system/SPSRWriteByInstr on page J1-5353.
• aarch32/functions/system/SPSRaccessValid on page J1-5353.
• aarch32/functions/system/SelectInstrSet on page J1-5354.
• aarch32/functions/v6simd/Sat on page J1-5354.
• aarch32/functions/v6simd/SignedSat on page J1-5354.
• aarch32/functions/v6simd/UnsignedSat on page J1-5354.


aarch32/functions/aborts/AArch32.CreateFaultRecord

// AArch32.CreateFaultRecord()

FaultRecord AArch32.CreateFaultRecord(Fault type, bits(40) ipaddress, bits(4) domain,
                                      integer level, AccType acctype, boolean write, bit extflag,
                                      bits(4) debugmoe, boolean secondstage, boolean s2fs1walk)

    FaultRecord fault;
    fault.type = type;
    if (type != Fault_None && PSTATE.EL != EL2 && TTBCR.EAE == '0' && !secondstage && !s2fs1walk &&
        AArch32.DomainValid(type, level)) then
        fault.domain = domain;
    else
        fault.domain = bits(4) UNKNOWN;
    fault.debugmoe = debugmoe;
    fault.ipaddress = ZeroExtend(ipaddress);
    fault.level = level;
    fault.acctype = acctype;
    fault.write = write;
    fault.extflag = extflag;
    fault.secondstage = secondstage;
    fault.s2fs1walk = s2fs1walk;

    return fault;


aarch32/functions/aborts/AArch32.DomainValid

// AArch32.DomainValid()
// Returns TRUE if the Domain is valid for a Short-descriptor translation scheme.

boolean AArch32.DomainValid(Fault type, integer level)
    assert type != Fault_None;

    case type of
        when Fault_Domain
            return TRUE;
        when Fault_Translation, Fault_AccessFlag, Fault_SyncExternalOnWalk, Fault_SyncParityOnWalk
            return level == 2;
        otherwise
            return FALSE;
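
AArch32.DomainValid() reports the domain field as valid for Domain faults at any level, and for translation, access flag and table-walk external or parity faults only at level 2. A Python sketch of the same decision follows; the function name and the string fault labels are assumptions.

```python
# Fault kinds for which the domain is valid only when the fault is reported at level 2.
LEVEL2_ONLY = {"Translation", "AccessFlag", "SyncExternalOnWalk", "SyncParityOnWalk"}

def domain_valid(fault_type, level):
    """Return True if the domain field is valid for a Short-descriptor fault."""
    assert fault_type != "None"
    if fault_type == "Domain":
        return True
    if fault_type in LEVEL2_ONLY:
        return level == 2
    return False
```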


aarch32/functions/aborts/AArch32.FaultStatusLD

// AArch32.FaultStatusLD()
// =======================
// Creates an exception fault status value for Abort and Watchpoint exceptions taken
// to Abort mode using AArch32 and Long-descriptor format.

bits(32) AArch32.FaultStatusLD(boolean d_side, FaultRecord fault)
    assert fault.type != Fault_None;

    bits(32) fsr = Zeros();
    if d_side then
        if fault.acctype IN {AccType_DC, AccType_IC, AccType_AT} then
            fsr<13> = '1'; fsr<11> = '1';
        else
            fsr<11> = if fault.write then '1' else '0';
    if IsExternalAbort(fault) then fsr<12> = fault.extflag;
    fsr<9> = '1';
    fsr<5:0> = EncodeLDFSC(fault.type, fault.level);

    return fsr;


aarch32/functions/aborts/AArch32.FaultStatusSD

// AArch32.FaultStatusSD()
// =======================
// Creates an exception fault status value for Abort and Watchpoint exceptions taken
// to Abort mode using AArch32 and Short-descriptor format.

bits(32) AArch32.FaultStatusSD(boolean d_side, FaultRecord fault)
    assert fault.type != Fault_None;

    bits(32) fsr = Zeros();
    if d_side then
        if fault.acctype IN {AccType_DC, AccType_IC, AccType_AT} then
            fsr<13> = '1'; fsr<11> = '1';
        else
            fsr<11> = if fault.write then '1' else '0';
    if IsExternalAbort(fault) then fsr<12> = fault.extflag;
    fsr<9> = '0';
    fsr<10,3:0> = EncodeSDFSC(fault.type, fault.level);
    if d_side then
        fsr<7:4> = fault.domain;               // Domain field (data fault only)

    return fsr;
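
The split field assignment fsr<10,3:0> above places the top bit of the 5-bit status code in bit 10 and the low four bits in bits 3:0. The following Python fragment is a minimal sketch of that packing, not part of the architecture pseudocode. The function name and its parameters are illustrative assumptions, and the caller is assumed to supply the 5-bit code that EncodeSDFSC() would return.

```python
def pack_short_descriptor_fsr(fsc, d_side, write=False, cache_maintenance=False,
                              domain=0, extflag=None):
    """Pack a 5-bit Short-descriptor status code into a 32-bit FSR value.

    Hypothetical helper: mirrors the bit assignments of AArch32.FaultStatusSD().
    extflag is None unless the fault is an external abort.
    """
    assert 0 <= fsc < 32 and 0 <= domain < 16
    fsr = 0
    if d_side:
        if cache_maintenance:
            fsr |= (1 << 13) | (1 << 11)   # fsr<13> and fsr<11> both set
        elif write:
            fsr |= 1 << 11                 # fsr<11> set for a write
    if extflag is not None:
        fsr |= (extflag & 1) << 12         # fsr<12>, external aborts only
    # fsr<9> stays 0 to indicate the Short-descriptor format
    fsr |= ((fsc >> 4) & 1) << 10          # status code bit 4 goes to fsr<10>
    fsr |= fsc & 0xF                       # status code bits 3:0 go to fsr<3:0>
    if d_side:
        fsr |= domain << 4                 # Domain field (data fault only)
    return fsr
```

For example, a level 1 Translation fault ('00101') on a data write in domain 3 packs to 0x835 under these assumptions.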


aarch32/functions/aborts/AArch32.FaultSyndrome

// AArch32.FaultSyndrome()
// =======================
// Creates an exception syndrome value for Abort and Watchpoint exceptions taken to
// AArch32 Hyp mode.

bits(25) AArch32.FaultSyndrome(boolean d_side, FaultRecord fault)
    assert fault.type != Fault_None;

    bits(25) iss = Zeros();
    if d_side then
        if IsSecondStage(fault) && !fault.s2fs1walk then iss<24:14> = LSInstructionSyndrome();
        if fault.acctype IN {AccType_DC, AccType_IC, AccType_AT} then
            iss<8> = '1'; iss<6> = '1';
        else
            iss<6> = if fault.write then '1' else '0';
    if IsExternalAbort(fault) then iss<9> = fault.extflag;
    iss<7> = if fault.s2fs1walk then '1' else '0';
    iss<5:0> = EncodeLDFSC(fault.type, fault.level);

    return iss;


aarch32/functions/aborts/EncodeSDFSC

// EncodeSDFSC()
// =============
// Function that gives the Short-descriptor FSR code for different types of Fault

bits(5) EncodeSDFSC(Fault type, integer level)

    bits(5) result;
    case type of
        when Fault_AccessFlag
            assert level IN {1,2};
            result = if level == 1 then '00011' else '00110';
        when Fault_Alignment
            result = '00001';
        when Fault_Permission
            assert level IN {1,2};
            result = if level == 1 then '01101' else '01111';
        when Fault_Domain
            assert level IN {1,2};
            result = if level == 1 then '01001' else '01011';
        when Fault_Translation
            assert level IN {1,2};
            result = if level == 1 then '00101' else '00111';
        when Fault_SyncExternal
            result = '01000';
        when Fault_SyncExternalOnWalk
            assert level IN {1,2};
            result = if level == 1 then '01100' else '01110';
        when Fault_SyncParity
            result = '11001';
        when Fault_SyncParityOnWalk
            assert level IN {1,2};
            result = if level == 1 then '11100' else '11110';
        when Fault_AsyncParity
            result = '11000';
        when Fault_AsyncExternal
            result = '10110';
        when Fault_Debug
            result = '00010';
        when Fault_TLBConflict
            result = '10000';
        when Fault_Lockdown
            result = '10100';                  // IMPLEMENTATION DEFINED
        when Fault_Exclusive
            result = '10101';                  // IMPLEMENTATION DEFINED
        when Fault_ICacheMaint
            result = '00100';
        otherwise
            Unreachable();

    return result;


aarch32/functions/common/A32ExpandImm

// A32ExpandImm()
// ==============

bits(32) A32ExpandImm(bits(12) imm12)

    // PSTATE.C argument to following function call does not affect the imm32 result.
    (imm32, -) = A32ExpandImm_C(imm12, PSTATE.C);

    return imm32;


aarch32/functions/common/A32ExpandImm_C

// A32ExpandImm_C()
// ================

(bits(32), bit) A32ExpandImm_C(bits(12) imm12, bit carry_in)

    unrotated_value = ZeroExtend(imm12<7:0>, 32);
    (imm32, carry_out) = Shift_C(unrotated_value, SRType_ROR, 2*UInt(imm12<11:8>), carry_in);

    return (imm32, carry_out);
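
As an illustration of the expansion above, the following Python fragment is a minimal sketch that models the 8-bit value rotated right by twice the 4-bit rotation field. The function name and the use of plain integers for bitstrings are assumptions made for this example only.

```python
def a32_expand_imm_c(imm12, carry_in):
    """Expand a 12-bit A32 modified immediate into (imm32, carry_out).

    Hypothetical model: integers stand in for the bits(12) and bits(32) values.
    """
    assert 0 <= imm12 < (1 << 12)
    unrotated = imm12 & 0xFF                 # imm12<7:0>, zero-extended
    amount = 2 * ((imm12 >> 8) & 0xF)        # 2*UInt(imm12<11:8>)
    if amount == 0:
        return unrotated, carry_in           # no shift: carry is unchanged
    result = ((unrotated >> amount) | (unrotated << (32 - amount))) & 0xFFFFFFFF
    return result, result >> 31              # carry-out is bit 31 of the result
```

With a rotation field of zero the carry flag passes through unchanged, which is why the PSTATE.C argument does not affect the imm32 result.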


aarch32/functions/common/DecodeImmShift

// DecodeImmShift()
// ================

(SRType, integer) DecodeImmShift(bits(2) type, bits(5) imm5)

    case type of
        when '00'
            shift_t = SRType_LSL; shift_n = UInt(imm5);
        when '01'
            shift_t = SRType_LSR; shift_n = if imm5 == '00000' then 32 else UInt(imm5);
        when '10'
            shift_t = SRType_ASR; shift_n = if imm5 == '00000' then 32 else UInt(imm5);
        when '11'
            if imm5 == '00000' then
                shift_t = SRType_RRX; shift_n = 1;
            else
                shift_t = SRType_ROR; shift_n = UInt(imm5);

    return (shift_t, shift_n);
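
The decode above can be sketched in Python as follows. This is an illustrative model only: the function name is hypothetical and the shift types are represented as strings rather than SRType values.

```python
def decode_imm_shift(type_bits, imm5):
    """Decode a 2-bit shift type and 5-bit immediate into (shift type, amount)."""
    assert 0 <= type_bits < 4 and 0 <= imm5 < 32
    if type_bits == 0b00:
        return "LSL", imm5
    if type_bits == 0b01:
        return "LSR", 32 if imm5 == 0 else imm5   # an encoding of 0 means LSR #32
    if type_bits == 0b10:
        return "ASR", 32 if imm5 == 0 else imm5   # an encoding of 0 means ASR #32
    # type '11': an immediate of 0 selects RRX, otherwise ROR
    return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
```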


aarch32/functions/common/DecodeRegShift

// DecodeRegShift()
// ================

SRType DecodeRegShift(bits(2) type)
    case type of
        when '00'  shift_t = SRType_LSL;
        when '01'  shift_t = SRType_LSR;
        when '10'  shift_t = SRType_ASR;
        when '11'  shift_t = SRType_ROR;
    return shift_t;


aarch32/functions/common/RRX

// RRX()
// =====

bits(N) RRX(bits(N) x, bit carry_in)
    (result, -) = RRX_C(x, carry_in);
    return result;


aarch32/functions/common/RRX_C

// RRX_C()
// =======

(bits(N), bit) RRX_C(bits(N) x, bit carry_in)
    result = carry_in : x<N-1:1>;
    carry_out = x<0>;
    return (result, carry_out);


aarch32/functions/common/SRType 


enumeration SRType {SRType_LSL, SRType_LSR, SRType_ASR, SRType_ROR, SRType_RRX}; 


aarch32/functions/common/Shift

// Shift()
// =======

bits(N) Shift(bits(N) value, SRType type, integer amount, bit carry_in)
    (result, -) = Shift_C(value, type, amount, carry_in);
    return result;


aarch32/functions/common/Shift_C

// Shift_C()
// =========

(bits(N), bit) Shift_C(bits(N) value, SRType type, integer amount, bit carry_in)
    assert !(type == SRType_RRX && amount != 1);

    if amount == 0 then
        (result, carry_out) = (value, carry_in);
    else
        case type of
            when SRType_LSL
                (result, carry_out) = LSL_C(value, amount);
            when SRType_LSR
                (result, carry_out) = LSR_C(value, amount);
            when SRType_ASR
                (result, carry_out) = ASR_C(value, amount);
            when SRType_ROR
                (result, carry_out) = ROR_C(value, amount);
            when SRType_RRX
                (result, carry_out) = RRX_C(value, carry_in);

    return (result, carry_out);
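
The following Python fragment is a minimal sketch of the behavior of Shift_C() for a fixed-width value. It is a model under stated assumptions: integers stand in for bitstrings, shift types are strings, and the per-type carry-out rules follow the shared LSL_C(), LSR_C(), ASR_C() and ROR_C() functions as they are generally defined elsewhere in the pseudocode.

```python
def shift_c(value, sr_type, amount, carry_in, width=32):
    """Return (result, carry_out) for a shifter operand (hypothetical model)."""
    assert not (sr_type == "RRX" and amount != 1)
    mask = (1 << width) - 1
    value &= mask
    if amount == 0:
        return value, carry_in                      # no shift: carry unchanged
    if sr_type == "LSL":
        extended = value << amount
        return extended & mask, (extended >> width) & 1
    if sr_type == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if sr_type == "ASR":
        # Interpret the value as signed so that >> replicates the sign bit
        signed = value - (1 << width) if value >> (width - 1) else value
        return (signed >> amount) & mask, (signed >> (amount - 1)) & 1
    if sr_type == "ROR":
        m = amount % width
        result = ((value >> m) | (value << (width - m))) & mask
        return result, result >> (width - 1)
    if sr_type == "RRX":
        # Carry-in becomes the top bit, bit 0 becomes the carry-out
        return (carry_in << (width - 1)) | (value >> 1), value & 1
    raise ValueError(sr_type)
```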


aarch32/functions/common/T32ExpandImm

// T32ExpandImm()
// ==============

bits(32) T32ExpandImm(bits(12) imm12)

    // PSTATE.C argument to following function call does not affect the imm32 result.
    (imm32, -) = T32ExpandImm_C(imm12, PSTATE.C);

    return imm32;





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. J1-5333 
1ID092916 Non-Confidential 


J1 ARMv8 Pseudocode 
J1.2 Pseudocode for AArch32 operation 


aarch32/functions/common/T32ExpandImm_C

// T32ExpandImm_C()
// ================

(bits(32), bit) T32ExpandImm_C(bits(12) imm12, bit carry_in)

    if imm12<11:10> == '00' then
        case imm12<9:8> of
            when '00'
                imm32 = ZeroExtend(imm12<7:0>, 32);
            when '01'
                imm32 = '00000000' : imm12<7:0> : '00000000' : imm12<7:0>;
            when '10'
                imm32 = imm12<7:0> : '00000000' : imm12<7:0> : '00000000';
            when '11'
                imm32 = imm12<7:0> : imm12<7:0> : imm12<7:0> : imm12<7:0>;
        carry_out = carry_in;
    else
        unrotated_value = ZeroExtend('1':imm12<6:0>, 32);
        (imm32, carry_out) = ROR_C(unrotated_value, UInt(imm12<11:7>));

    return (imm32, carry_out);
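
The two encodings handled above, byte replication and a rotated 8-bit value with an implied leading 1, can be sketched in Python as follows. The function name is hypothetical and integers stand in for bitstrings.

```python
def t32_expand_imm_c(imm12, carry_in):
    """Expand a 12-bit T32 modified immediate into (imm32, carry_out)."""
    assert 0 <= imm12 < (1 << 12)
    byte = imm12 & 0xFF
    if (imm12 >> 10) == 0:
        pattern = (imm12 >> 8) & 0x3
        if pattern == 0b00:
            imm32 = byte                          # 0x000000XY
        elif pattern == 0b01:
            imm32 = (byte << 16) | byte           # 0x00XY00XY
        elif pattern == 0b10:
            imm32 = (byte << 24) | (byte << 8)    # 0xXY00XY00
        else:
            imm32 = byte * 0x01010101             # 0xXYXYXYXY
        return imm32, carry_in
    unrotated = 0x80 | (imm12 & 0x7F)             # '1' : imm12<6:0>
    shift = (imm12 >> 7) & 0x1F                   # 8..31 in this branch
    imm32 = ((unrotated >> shift) | (unrotated << (32 - shift))) & 0xFFFFFFFF
    return imm32, imm32 >> 31                     # ROR_C carry-out is bit 31
```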


aarch32/functions/coproc/AArch32.CheckCP15InstrCoarseTraps

// AArch32.CheckCP15InstrCoarseTraps()
// ===================================
// Check for coarse-grained CP15 traps in HSTR and HCR.

boolean AArch32.CheckCP15InstrCoarseTraps(integer CRn, integer nreg, integer CRm)

    // Check for coarse-grained Hyp traps
    if HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1} then
        if PSTATE.EL == EL0 && !ELUsingAArch32(EL2) then
            return AArch64.CheckCP15InstrCoarseTraps(CRn, nreg, CRm);

        // Check for MCR, MRC, MCRR and MRRC disabled by HSTR<CRn/CRm>
        major = if nreg == 1 then CRn else CRm;
        if !(major IN {4,14}) && HSTR<major> == '1' then
            return TRUE;

        // Check for MRC and MCR disabled by HCR.TIDCP
        if (HCR.TIDCP == '1' && nreg == 1 &&
            ((CRn == 9  && CRm IN {0,1,2,    5,6,7,8   }) ||
             (CRn == 10 && CRm IN {0,1,    4,      8   }) ||
             (CRn == 11 && CRm IN {0,1,2,3,4,5,6,7,8,15}))) then
            return TRUE;

    return FALSE;
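
The HSTR and HCR.TIDCP checks above can be sketched in Python as follows. This is a simplified model: it assumes the access already comes from Non-secure EL0 or EL1 with EL2 implemented and using AArch32, and it takes the HSTR value and the TIDCP bit as plain arguments. The function and table names are illustrative.

```python
# CRm values covered by HCR.TIDCP for each CRn, as listed in the pseudocode
TIDCP_CRM = {
    9:  {0, 1, 2, 5, 6, 7, 8},
    10: {0, 1, 4, 8},
    11: {0, 1, 2, 3, 4, 5, 6, 7, 8, 15},
}

def cp15_coarse_trap(crn, nreg, crm, hstr, tidcp):
    """Return True if the access is trapped by HSTR or HCR.TIDCP (sketch)."""
    # MCR/MRC select the HSTR bit by CRn, MCRR/MRRC select it by CRm
    major = crn if nreg == 1 else crm
    if major not in (4, 14) and (hstr >> major) & 1:
        return True
    # HCR.TIDCP applies only to MRC and MCR (nreg == 1)
    if tidcp and nreg == 1 and crm in TIDCP_CRM.get(crn, ()):
        return True
    return False
```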


aarch32/functions/coproc/AArch32.CheckSystemAccess

// AArch32.CheckSystemAccess()
// ===========================
// Check System register access instruction for enables and disables

AArch32.CheckSystemAccess(integer cp_num, bits(32) instr)
    assert cp_num == UInt(instr<11:8>) && (cp_num IN {14,15});

    if PSTATE.EL == EL0 && !ELUsingAArch32(EL1) then
        AArch64.CheckAArch32SystemAccess(instr);
        return;

    // Decode the AArch32 System register access instruction
    if instr<31:28> != '1111' && instr<27:24> == '1110' && instr<4> == '1' then      // MRC/MCR
        cprt = TRUE; cpdt = FALSE; nreg = 1;
        opc1 = UInt(instr<23:21>);
        opc2 = UInt(instr<7:5>);
        CRn  = UInt(instr<19:16>);
        CRm  = UInt(instr<3:0>);
    elsif instr<31:28> != '1111' && instr<27:21> == '1100010' then                   // MRRC/MCRR
        cprt = TRUE; cpdt = FALSE; nreg = 2;
        opc1 = UInt(instr<7:4>);
        CRm  = UInt(instr<3:0>);
    elsif instr<31:28> != '1111' && instr<27:25> == '110' && instr<22> == '0' then   // LDC/STC
        cprt = FALSE; cpdt = TRUE; nreg = 0;
        opc1 = 0;
        CRn  = UInt(instr<15:12>);
    else
        allocated = FALSE;

    //
    // Coarse-grain decode into CP14 or CP15 encoding space. Each of the CPxxxInstrDecode functions
    // returns TRUE if the instruction is allocated at the current Exception level, FALSE otherwise.
    if cp_num == 14 then
        // LDC and STC only supported for c5 in CP14 encoding space
        if cpdt && CRn != 5 then
            allocated = FALSE;
        else
            // Coarse-grained decode of CP14 based on opc1 field
            case opc1 of
                when 0     allocated = CP14DebugInstrDecode(instr);
                when 1     allocated = CP14TraceInstrDecode(instr);
                when 7     allocated = CP14JazelleInstrDecode(instr);    // JIDR only
                otherwise  allocated = FALSE;          // All other values are unallocated

    elsif cp_num == 15 then
        // LDC and STC not supported in CP15 encoding space
        if !cprt then
            allocated = FALSE;
        else
            allocated = CP15InstrDecode(instr);

            // Coarse-grain traps to EL2 have a higher priority than exceptions generated because
            // the access instruction is UNDEFINED
            if AArch32.CheckCP15InstrCoarseTraps(CRn, nreg, CRm) then
                // For a coarse-grain trap, if it is IMPLEMENTATION DEFINED whether an access from
                // Non-secure User mode is UNDEFINED when the trap is disabled, then it is
                // IMPLEMENTATION DEFINED whether the same access is UNDEFINED or generates a trap
                // when the trap is enabled.
                if PSTATE.EL == EL0 && !IsSecure() && !allocated then
                    if boolean IMPLEMENTATION_DEFINED "UNDEF unallocated CP15 access at NS EL0" then
                        UNDEFINED;
                AArch32.AArch32SystemAccessTrap(EL2, instr);

    else
        allocated = FALSE;

    if !allocated then
        UNDEFINED;

    // If the instruction is not UNDEFINED, it might be disabled or trapped to a higher EL.
    AArch32.CheckSystemAccessTraps(instr);

    return;
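
The instruction-class decode at the start of AArch32.CheckSystemAccess() can be sketched in Python as follows. This is an illustrative model of the field extraction only. The function name and the dictionary layout are assumptions, and the allocation and trap checks are not modeled.

```python
def decode_system_access(instr):
    """Classify an A32 System register access and extract its fields (sketch).

    Returns None when the word matches none of the three instruction classes.
    """
    if (instr >> 28) == 0b1111:                       # instr<31:28> == '1111'
        return None
    if ((instr >> 24) & 0xF) == 0b1110 and ((instr >> 4) & 1) == 1:
        return {"kind": "MRC/MCR", "nreg": 1,
                "opc1": (instr >> 21) & 0x7, "opc2": (instr >> 5) & 0x7,
                "CRn": (instr >> 16) & 0xF, "CRm": instr & 0xF}
    if ((instr >> 21) & 0x7F) == 0b1100010:
        return {"kind": "MRRC/MCRR", "nreg": 2,
                "opc1": (instr >> 4) & 0xF, "CRm": instr & 0xF}
    if ((instr >> 25) & 0x7) == 0b110 and ((instr >> 22) & 1) == 0:
        return {"kind": "LDC/STC", "nreg": 0,
                "opc1": 0, "CRn": (instr >> 12) & 0xF}
    return None
```

For example, the word 0xEE110F10 decodes as an MRC/MCR access with opc1 = 0, CRn = 1, CRm = 0 and opc2 = 0.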


aarch32/functions/coproc/AArch32.CheckSystemAccessTraps

// Check for configurable disables or traps to a higher EL of a System register access.
AArch32.CheckSystemAccessTraps(bits(32) instr);


aarch32/functions/coproc/CP14DebugInstrDecode

// Decodes an accepted access to a debug System register in the CP14 encoding space.
// Returns TRUE if the instruction is allocated at the current Exception level, FALSE otherwise.
boolean CP14DebugInstrDecode(bits(32) instr);


aarch32/functions/coproc/CP14JazelleInstrDecode

// Decodes an accepted access to a Jazelle System register in the CP14 encoding space.
// Returns TRUE if the instruction is allocated at the current Exception level, FALSE otherwise.
boolean CP14JazelleInstrDecode(bits(32) instr);


aarch32/functions/coproc/CP14TraceInstrDecode

// Decodes an accepted access to a trace System register in the CP14 encoding space.
// Returns TRUE if the instruction is allocated at the current Exception level, FALSE otherwise.
boolean CP14TraceInstrDecode(bits(32) instr);


aarch32/functions/coproc/CP15InstrDecode

// Decodes an accepted access to a System register in the CP15 encoding space.
// Returns TRUE if the instruction is allocated at the current Exception level, FALSE otherwise.
boolean CP15InstrDecode(bits(32) instr);


aarch32/functions/exclusive/AArch32.ExclusiveMonitorsPass

// AArch32.ExclusiveMonitorsPass()
// ===============================
// Return TRUE if the Exclusive Monitors for the current PE include all of the addresses
// associated with the virtual address region of size bytes starting at address.
// The immediately following memory write must be to the same addresses.

boolean AArch32.ExclusiveMonitorsPass(bits(32) address, integer size)

    // It is IMPLEMENTATION DEFINED whether the detection of memory aborts happens
    // before or after the check on the local Exclusive Monitor. As a result a failure
    // of the local monitor can occur on some implementations even if the memory
    // access would give a memory abort.

    acctype = AccType_ATOMIC;
    iswrite = TRUE;
    aligned = (address == Align(address, size));

    if !aligned then
        secondstage = FALSE;
        AArch32.Abort(address, AArch32.AlignmentFault(acctype, iswrite, secondstage));

    passed = AArch32.IsExclusiveVA(address, ProcessorID(), size);
    if !passed then
        return FALSE;
    memaddrdesc = AArch32.TranslateAddress(address, acctype, iswrite, aligned, size);

    // Check for aborts or debug exceptions
    if IsFault(memaddrdesc) then
        AArch32.Abort(address, memaddrdesc.fault);

    passed = IsExclusiveLocal(memaddrdesc.paddress, ProcessorID(), size);
    if passed then
        ClearExclusiveLocal(ProcessorID());
        if memaddrdesc.memattrs.shareable then
            passed = IsExclusiveGlobal(memaddrdesc.paddress, ProcessorID(), size);

    return passed;


aarch32/functions/exclusive/AArch32.IsExclusiveVA

// An optional IMPLEMENTATION DEFINED test for an exclusive access to a virtual
// address region of size bytes starting at address.
//
// It is permitted (but not required) for this function to return FALSE and
// cause a store exclusive to fail if the virtual address region is not
// totally included within the region recorded by MarkExclusiveVA().
//
// It is always safe to return TRUE which will check the physical address only.
boolean AArch32.IsExclusiveVA(bits(32) address, integer processorid, integer size);


aarch32/functions/exclusive/AArch32.MarkExclusiveVA

// Optionally record an exclusive access to the virtual address region of size bytes
// starting at address for processorid.
AArch32.MarkExclusiveVA(bits(32) address, integer processorid, integer size);


aarch32/functions/exclusive/AArch32.SetExclusiveMonitors

// AArch32.SetExclusiveMonitors()
// ==============================
// Sets the Exclusive Monitors for the current PE to record the addresses associated
// with the virtual address region of size bytes starting at address.

AArch32.SetExclusiveMonitors(bits(32) address, integer size)

    acctype = AccType_ATOMIC;
    iswrite = FALSE;
    aligned = (address != Align(address, size));

    memaddrdesc = AArch32.TranslateAddress(address, acctype, iswrite, aligned, size);

    // Check for aborts or debug exceptions
    if IsFault(memaddrdesc) then
        return;

    if memaddrdesc.memattrs.shareable then
        MarkExclusiveGlobal(memaddrdesc.paddress, ProcessorID(), size);

    MarkExclusiveLocal(memaddrdesc.paddress, ProcessorID(), size);

    AArch32.MarkExclusiveVA(address, ProcessorID(), size);
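
The interaction between setting the monitors for a load-exclusive and testing them for a store-exclusive can be illustrated with a toy local monitor. The following Python fragment is a sketch under strong assumptions: it models only a single local monitor that records one physical region, it ignores address translation, the global monitor and the optional virtual address check, and all class and function names are hypothetical.

```python
class LocalMonitor:
    """Toy local Exclusive Monitor that records at most one address region."""

    def __init__(self):
        self.region = None

    def mark(self, paddress, size):
        # Corresponds loosely to MarkExclusiveLocal()
        self.region = (paddress, size)

    def is_exclusive(self, paddress, size):
        # Corresponds loosely to IsExclusiveLocal(): region must be covered
        if self.region is None:
            return False
        base, length = self.region
        return base <= paddress and paddress + size <= base + length

    def clear(self):
        # Corresponds loosely to ClearExclusiveLocal()
        self.region = None


def store_exclusive_passes(monitor, paddress, size):
    """Return True if a store-exclusive to the region would be allowed."""
    if paddress % size != 0:
        raise ValueError("unaligned exclusive access")   # stands in for an Alignment fault
    passed = monitor.is_exclusive(paddress, size)
    if passed:
        monitor.clear()       # a passing check consumes the monitor state
    return passed
```

In this model a second store-exclusive without an intervening load-exclusive fails, because the first passing check clears the monitor.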


aarch32/functions/float/CheckAdvSIMDEnabled

// CheckAdvSIMDEnabled()
// =====================

CheckAdvSIMDEnabled()

    fpexc_check = TRUE;
    advsimd = TRUE;

    AArch32.CheckAdvSIMDOrFPEnabled(fpexc_check, advsimd);
    // Return from CheckAdvSIMDOrFPEnabled() occurs only if Advanced SIMD access is permitted

    // Make temporary copy of D registers
    // _Dclone[] is used as input data for instruction pseudocode
    for i = 0 to 31
        _Dclone[i] = D[i];

    return;


aarch32/functions/float/CheckAdvSIMDOrVFPEnabled

// CheckAdvSIMDOrVFPEnabled()
// ==========================

CheckAdvSIMDOrVFPEnabled(boolean include_fpexc_check, boolean advsimd)
    AArch32.CheckAdvSIMDOrFPEnabled(include_fpexc_check, advsimd);
    // Return from CheckAdvSIMDOrFPEnabled() occurs only if VFP access is permitted
    return;


aarch32/functions/float/CheckCryptoEnabled32

// CheckCryptoEnabled32()
// ======================

CheckCryptoEnabled32()
    CheckAdvSIMDEnabled();
    // Return from CheckAdvSIMDEnabled() occurs only if access is permitted
    return;


aarch32/functions/float/CheckVFPEnabled

// CheckVFPEnabled()
// =================

CheckVFPEnabled(boolean include_fpexc_check)
    advsimd = FALSE;
    AArch32.CheckAdvSIMDOrFPEnabled(include_fpexc_check, advsimd);
    // Return from CheckAdvSIMDOrFPEnabled() occurs only if VFP access is permitted
    return;


aarch32/functions/float/FPHalvedSub

// FPHalvedSub()
// =============

bits(N) FPHalvedSub(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    rounding = FPRoundingMode(fpcr);
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done, result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if inf1 && inf2 && sign1 == sign2 then
            result = FPDefaultNaN();
            FPProcessException(FPExc_InvalidOp, fpcr);
        elsif (inf1 && sign1 == '0') || (inf2 && sign2 == '1') then
            result = FPInfinity('0');
        elsif (inf1 && sign1 == '1') || (inf2 && sign2 == '0') then
            result = FPInfinity('1');
        elsif zero1 && zero2 && sign1 != sign2 then
            result = FPZero(sign1);
        else
            result_value = (value1 - value2) / 2.0;
            if result_value == 0.0 then // Sign of exact zero result depends on rounding mode
                result_sign = if rounding == FPRounding_NEGINF then '1' else '0';
                result = FPZero(result_sign);
            else
                result = FPRound(result_value, fpcr);
    return result;


aarch32/functions/float/FPRSqrtStep

// FPRSqrtStep()
// =============

bits(N) FPRSqrtStep(bits(N) op1, bits(N) op2)
    assert N == 32;
    FPCRType fpcr = StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done, result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        bits(N) product;
        if (inf1 && zero2) || (zero1 && inf2) then
            product = FPZero('0');
        else
            product = FPMul(op1, op2, fpcr);
        bits(N) three = FPThree('0');
        result = FPHalvedSub(three, product, fpcr);
    return result;


aarch32/functions/float/FPRecipStep

// FPRecipStep()
// =============

bits(N) FPRecipStep(bits(N) op1, bits(N) op2)
    assert N == 32;
    FPCRType fpcr = StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done, result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        bits(N) product;
        if (inf1 && zero2) || (zero1 && inf2) then
            product = FPZero('0');
        else
            product = FPMul(op1, op2, fpcr);
        bits(N) two = FPTwo('0');
        result = FPSub(two, product, fpcr);
    return result;
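
The step functions compute 2 - op1*op2 and (3 - op1*op2)/2, which are the correction factors of Newton-Raphson iterations for a reciprocal and a reciprocal square root. The following Python fragment is a minimal numerical sketch of that use. It relies on Python double-precision floats rather than the single-precision arithmetic of the pseudocode, it does not model NaN processing, rounding modes or exception flags, and the refine_* helper names are hypothetical.

```python
import math

def fp_recip_step(op1, op2):
    """2.0 - op1*op2, with infinity*zero treated as a zero product."""
    if (math.isinf(op1) and op2 == 0.0) or (op1 == 0.0 and math.isinf(op2)):
        product = 0.0
    else:
        product = op1 * op2
    return 2.0 - product

def fp_rsqrt_step(op1, op2):
    """(3.0 - op1*op2) / 2.0, with infinity*zero treated as a zero product."""
    if (math.isinf(op1) and op2 == 0.0) or (op1 == 0.0 and math.isinf(op2)):
        product = 0.0
    else:
        product = op1 * op2
    return (3.0 - product) / 2.0

def refine_reciprocal(d, estimate, steps):
    """Improve an estimate of 1/d: x <- x * (2 - d*x)."""
    x = estimate
    for _ in range(steps):
        x = x * fp_recip_step(d, x)
    return x

def refine_rsqrt(d, estimate, steps):
    """Improve an estimate of 1/sqrt(d): x <- x * (3 - d*x*x) / 2."""
    x = estimate
    for _ in range(steps):
        x = x * fp_rsqrt_step(d * x, x)
    return x
```

Each iteration roughly doubles the number of correct bits, so a coarse initial estimate converges in a few steps.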


aarch32/functions/float/StandardFPSCRValue

// StandardFPSCRValue()
// ====================

FPCRType StandardFPSCRValue()
    return '00000' : FPSCR.AHP : '11000000000000000000000000';
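
The concatenation above yields a 32-bit value in which bit 26 carries FPSCR.AHP, bits 25 and 24 are set, and all other bits are zero. A minimal Python sketch of that layout follows. The function name is hypothetical and the AHP bit is passed in as an argument.

```python
def standard_fpscr_value(ahp):
    """Build the 32-bit value '00000' : AHP : '11' followed by 24 zero bits."""
    assert ahp in (0, 1)
    return (ahp << 26) | (0b11 << 24)
```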


aarch32/functions/memory/AArch32.CheckAlignment

// AArch32.CheckAlignment()
// ========================

boolean AArch32.CheckAlignment(bits(32) address, integer alignment, AccType acctype,
                               boolean iswrite)

    if PSTATE.EL == EL0 && !ELUsingAArch32(S1TranslationRegime()) then
        A = SCTLR[].A; // Use AArch64 register, when higher Exception level is using AArch64
    elsif PSTATE.EL == EL2 then
        A = HSCTLR.A;
    else
        A = SCTLR.A;
    aligned = (address == Align(address, alignment));

    // AccType_VEC is used for SIMD element alignment checks only
    check = (acctype == AccType_ATOMIC || acctype == AccType_ORDERED || acctype == AccType_VEC ||
             A == '1');

    if check && !aligned then
        secondstage = FALSE;
        AArch32.Abort(address, AArch32.AlignmentFault(acctype, iswrite, secondstage));

    return aligned;
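
The alignment test above can be sketched in Python as follows. This is an illustrative model: the access-type test is reduced to a single boolean, the selected control register A bit is passed in directly, and a Python exception stands in for the Alignment fault. All names are assumptions.

```python
class AlignmentFault(Exception):
    """Stands in for the abort generated on a checked unaligned access."""

def align(address, alignment):
    """Round address down to a multiple of alignment."""
    return alignment * (address // alignment)

def check_alignment(address, alignment, always_checked, a_bit):
    """Return True if aligned; raise if unaligned and checking applies.

    always_checked models the atomic, ordered and SIMD element access types.
    """
    aligned = address == align(address, alignment)
    if (always_checked or a_bit) and not aligned:
        raise AlignmentFault(hex(address))
    return aligned
```

When checking does not apply, an unaligned access returns False, which the caller uses to split the access into single bytes.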


aarch32/functions/memory/AArch32.MemSingle

// AArch32.MemSingle[] - non-assignment (read) form
// Perform an atomic, little-endian read of 'size' bytes.

bits(size*8) AArch32.MemSingle[bits(32) address, integer size, AccType acctype, boolean wasaligned]
    assert size IN {1, 2, 4, 8, 16};
    assert address == Align(address, size);

    AddressDescriptor memaddrdesc;
    bits(size*8) value;
    iswrite = FALSE;

    // MMU or MPU
    memaddrdesc = AArch32.TranslateAddress(address, acctype, iswrite, wasaligned, size);
    // Check for aborts or debug exceptions
    if IsFault(memaddrdesc) then
        AArch32.Abort(address, memaddrdesc.fault);

    // Memory array access
    value = _Mem[memaddrdesc, size, acctype];
    return value;

// AArch32.MemSingle[] - assignment (write) form
// Perform an atomic, little-endian write of 'size' bytes.

AArch32.MemSingle[bits(32) address, integer size, AccType acctype, boolean wasaligned] = bits(size*8) value
    assert size IN {1, 2, 4, 8, 16};
    assert address == Align(address, size);

    AddressDescriptor memaddrdesc;
    iswrite = TRUE;

    // MMU or MPU
    memaddrdesc = AArch32.TranslateAddress(address, acctype, iswrite, wasaligned, size);

    // Check for aborts or debug exceptions
    if IsFault(memaddrdesc) then
        AArch32.Abort(address, memaddrdesc.fault);

    // Effect on exclusives
    if memaddrdesc.memattrs.shareable then
        ClearExclusiveByAddress(memaddrdesc.paddress, ProcessorID(), size);

    // Memory array access
    _Mem[memaddrdesc, size, acctype] = value;
    return;


aarch32/functions/memory/Hint_PreloadData

Hint_PreloadData(bits(32) address);


aarch32/functions/memory/Hint_PreloadDataForWrite

Hint_PreloadDataForWrite(bits(32) address);


aarch32/functions/memory/Hint_PreloadInstr

Hint_PreloadInstr(bits(32) address);


aarch32/functions/memory/MemA

// MemA[] - non-assignment form
// ============================

bits(8*size) MemA[bits(32) address, integer size]
    acctype = AccType_ATOMIC;
    return Mem_with_type[address, size, acctype];

// MemA[] - assignment form
// ========================

MemA[bits(32) address, integer size] = bits(8*size) value
    acctype = AccType_ATOMIC;
    Mem_with_type[address, size, acctype] = value;
    return;


aarch32/functions/memory/MemO

// MemO[] - non-assignment form
// ============================

bits(8*size) MemO[bits(32) address, integer size]
    acctype = AccType_ORDERED;
    return Mem_with_type[address, size, acctype];

// MemO[] - assignment form
// ========================

MemO[bits(32) address, integer size] = bits(8*size) value
    acctype = AccType_ORDERED;
    Mem_with_type[address, size, acctype] = value;
    return;


aarch32/functions/memory/MemU

// MemU[] - non-assignment form
// ============================

bits(8*size) MemU[bits(32) address, integer size]
    acctype = AccType_NORMAL;
    return Mem_with_type[address, size, acctype];

// MemU[] - assignment form
// ========================

MemU[bits(32) address, integer size] = bits(8*size) value
    acctype = AccType_NORMAL;
    Mem_with_type[address, size, acctype] = value;
    return;


aarch32/functions/memory/MemU_unpriv

// MemU_unpriv[] - non-assignment form
// ===================================

bits(8*size) MemU_unpriv[bits(32) address, integer size]
    acctype = AccType_UNPRIV;
    return Mem_with_type[address, size, acctype];

// MemU_unpriv[] - assignment form
// ===============================

MemU_unpriv[bits(32) address, integer size] = bits(8*size) value
    acctype = AccType_UNPRIV;
    Mem_with_type[address, size, acctype] = value;
    return;


aarch32/functions/memory/Mem_with_type

// Mem_with_type[] - non-assignment (read) form
// Perform a read of 'size' bytes. The access byte order is reversed for a big-endian access.
// Instruction fetches would call AArch32.MemSingle directly.

bits(size*8) Mem_with_type[bits(32) address, integer size, AccType acctype]
    assert size IN {1, 2, 4, 8, 16};
    bits(size*8) value;
    integer i;
    boolean iswrite = FALSE;

    aligned = AArch32.CheckAlignment(address, size, acctype, iswrite);
    if !aligned then
        assert size > 1;
        value<7:0> = AArch32.MemSingle[address, 1, acctype, aligned];

        // For subsequent bytes it is CONSTRAINED UNPREDICTABLE whether an unaligned Device memory
        // access will generate an Alignment Fault, as to get this far means the first byte did
        // not, so we must be changing to a new translation page.
        c = ConstrainUnpredictable();
        assert c IN {Constraint_FAULT, Constraint_NONE};
        if c == Constraint_NONE then aligned = TRUE;

        for i = 1 to size-1
            value<8*i+7:8*i> = AArch32.MemSingle[address+i, 1, acctype, aligned];
    else
        value = AArch32.MemSingle[address, size, acctype, aligned];

    if BigEndian() then
        value = BigEndianReverse(value);
    return value;

// Mem_with_type[] - assignment (write) form
// Perform a write of 'size' bytes. The byte order is reversed for a big-endian access.

Mem_with_type[bits(32) address, integer size, AccType acctype] = bits(size*8) value
    integer i;
    boolean iswrite = TRUE;

    if BigEndian() then
        value = BigEndianReverse(value);

    aligned = AArch32.CheckAlignment(address, size, acctype, iswrite);
    if !aligned then
        assert size > 1;
        AArch32.MemSingle[address, 1, acctype, aligned] = value<7:0>;

        // For subsequent bytes it is CONSTRAINED UNPREDICTABLE whether an unaligned Device memory
        // access will generate an Alignment Fault, as to get this far means the first byte did
        // not, so we must be changing to a new translation page.
        c = ConstrainUnpredictable();
        assert c IN {Constraint_FAULT, Constraint_NONE};
        if c == Constraint_NONE then aligned = TRUE;

        for i = 1 to size-1
            AArch32.MemSingle[address+i, 1, acctype, aligned] = value<8*i+7:8*i>;
    else
        AArch32.MemSingle[address, size, acctype, aligned] = value;
    return;
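
The byte-by-byte path and the big-endian reversal above can be illustrated with a small Python model. This sketch assumes a flat bytearray as memory and omits translation, alignment checking and faults. The function names are hypothetical.

```python
def big_endian_reverse(value, size):
    """Reverse the byte order of a size-byte integer value."""
    return int.from_bytes(value.to_bytes(size, "little"), "big")

def mem_read(memory, address, size, big_endian=False):
    """Assemble a value from single-byte reads, lowest address first."""
    value = 0
    for i in range(size):
        value |= memory[address + i] << (8 * i)      # value<8*i+7:8*i>
    if big_endian:
        value = big_endian_reverse(value, size)
    return value

def mem_write(memory, address, size, value, big_endian=False):
    """Store a value using single-byte writes, lowest address first."""
    if big_endian:
        value = big_endian_reverse(value, size)
    for i in range(size):
        memory[address + i] = (value >> (8 * i)) & 0xFF
```

A value written and read back with the same endianness setting is unchanged; reading it with the opposite setting returns the byte-reversed value.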


aarch32/functions/registers/AArch32.ResetGeneralRegisters

// AArch32.ResetGeneralRegisters()
// ===============================

AArch32.ResetGeneralRegisters()

    for i = 0 to 7
        R[i] = bits(32) UNKNOWN;
    for i = 8 to 12
        Rmode[i, M32_User] = bits(32) UNKNOWN;
        Rmode[i, M32_FIQ] = bits(32) UNKNOWN;
    if HaveEL(EL2) then Rmode[13, M32_Hyp] = bits(32) UNKNOWN; // No R14_hyp
    for i = 13 to 14
        Rmode[i, M32_User] = bits(32) UNKNOWN;
        Rmode[i, M32_FIQ] = bits(32) UNKNOWN;
        Rmode[i, M32_IRQ] = bits(32) UNKNOWN;
        Rmode[i, M32_Svc] = bits(32) UNKNOWN;
        Rmode[i, M32_Abort] = bits(32) UNKNOWN;
        Rmode[i, M32_Undef] = bits(32) UNKNOWN;
        if HaveEL(EL3) then Rmode[i, M32_Monitor] = bits(32) UNKNOWN;

    return;


aarch32/functions/registers/AArch32.ResetSIMDFPRegisters

// AArch32.ResetSIMDFPRegisters()
// ==============================

AArch32.ResetSIMDFPRegisters()

    for i = 0 to 15
        Q[i] = bits(128) UNKNOWN;

    return;


aarch32/functions/registers/AArch32.ResetSpecialRegisters

// AArch32.ResetSpecialRegisters()
// ===============================

AArch32.ResetSpecialRegisters()

    // AArch32 special registers
    SPSR_fiq = bits(32) UNKNOWN;
    SPSR_irq = bits(32) UNKNOWN;
    SPSR_svc = bits(32) UNKNOWN;
    SPSR_abt = bits(32) UNKNOWN;
    SPSR_und = bits(32) UNKNOWN;
    if HaveEL(EL2) then
        SPSR_hyp = bits(32) UNKNOWN;
        ELR_hyp = bits(32) UNKNOWN;
    if HaveEL(EL3) then
        SPSR_mon = bits(32) UNKNOWN;

    // External debug special registers
    DLR = bits(32) UNKNOWN;
    DSPSR = bits(32) UNKNOWN;

    return;


aarch32/functions/registers/AArch32.ResetSystemRegisters 


AArch32.ResetSystemRegisters(boolean cold_reset) ; 


aarch32/functions/registers/ALUExceptionReturn

// ALUExceptionReturn()
// ====================

ALUExceptionReturn(bits(32) address)
    if PSTATE.EL == EL2 then
        UNDEFINED;
    elsif PSTATE.M IN {M32_User,M32_System} then
        UNPREDICTABLE; // UNDEFINED or NOP
    else
        AArch32.ExceptionReturn(address, SPSR[]);


aarch32/functions/registers/ALUWritePC

// ALUWritePC()
// ============

ALUWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_A32 then
        BXWritePC(address);
    else
        BranchWritePC(address);


aarch32/functions/registers/BXWritePC

// BXWritePC()
// ===========

BXWritePC(bits(32) address)
    if address<0> == '1' then
        SelectInstrSet(InstrSet_T32);
        address<0> = '0';
    else
        SelectInstrSet(InstrSet_A32);
        // For branches to an unaligned PC counter in A32 state, the processor takes the branch
        // and does one of:
        // * Forces the address to be aligned
        // * Leaves the PC unaligned, meaning the target generates a PC Alignment fault.
        if address<1> == '1' && ConstrainUnpredictableBool() then
            address<1> = '0';
    BranchTo(address, BranchType_UNKNOWN);
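
The interworking behavior above, where bit 0 of the target address selects the instruction set, can be sketched in Python as follows. The function name is hypothetical, instruction sets are represented as strings, and the CONSTRAINED UNPREDICTABLE choice is exposed as a parameter.

```python
def bx_write_pc(address, force_align=True):
    """Return (instruction set, branch target) for an interworking branch.

    force_align models the CONSTRAINED UNPREDICTABLE choice for an A32 target
    with bit 1 set: either clear the bit or leave the target unaligned.
    """
    address &= 0xFFFFFFFF
    if address & 1:
        return "T32", address & ~1      # bit 0 selects T32 and is then cleared
    if (address & 2) and force_align:
        address &= ~2                   # force word alignment of the A32 target
    return "A32", address
```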


aarch32/functions/registers/BranchWritePC

// BranchWritePC()
// ===============

BranchWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_A32 then
        address<1:0> = '00';
    else
        address<0> = '0';
    BranchTo(address, BranchType_UNKNOWN);


aarch32/functions/registers/D

// D[] - non-assignment form
// =========================

bits(64) D[integer n]
    assert n >= 0 && n <= 31;
    base = (n MOD 2) * 64;
    return _V[n DIV 2]<base+63:base>;

// D[] - assignment form
// =====================

D[integer n] = bits(64) value
    assert n >= 0 && n <= 31;
    base = (n MOD 2) * 64;
    _V[n DIV 2]<base+63:base> = value;
    return;


aarch32/functions/registers/Din

// Din[] - non-assignment form
// ===========================

bits(64) Din[integer n]
    assert n >= 0 && n <= 31;
    return _Dclone[n];
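
The D[] accessors map each doubleword register onto one half of a quadword register: D[2n] is the low half and D[2n+1] the high half of the same 128-bit value. The following Python fragment is a minimal sketch of that overlay. The class name is hypothetical and the register file is modeled as sixteen 128-bit integers.

```python
MASK64 = (1 << 64) - 1

class SimdRegisterFile:
    """Toy model of the D/Q register overlay."""

    def __init__(self):
        self.v = [0] * 16                 # sixteen 128-bit registers

    def read_d(self, n):
        assert 0 <= n <= 31
        base = (n % 2) * 64               # low or high half of the quadword
        return (self.v[n // 2] >> base) & MASK64

    def write_d(self, n, value):
        assert 0 <= n <= 31
        base = (n % 2) * 64
        cleared = self.v[n // 2] & ~(MASK64 << base)
        self.v[n // 2] = cleared | ((value & MASK64) << base)

    def read_q(self, n):
        assert 0 <= n <= 15
        return self.v[n]
```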


aarch32/functions/registers/LR 


// LR - assignment form 


LR = bits(32) value 
R[14] = value; 
return; 


// LR - non-assignment form 


bits(32) LR 
return R[14]; 


aarch32/functions/registers/LoadWritePC 


// LoadWritePC() 


LoadWritePC(bits(32) address) 
BXWritePC(address); 


aarch32/functions/registers/LookUpRIndex

// LookUpRIndex()
// ==============

integer LookUpRIndex(integer n, bits(5) mode)
    assert n >= 0 && n <= 14;

    case n of  // Select index by mode:      usr fiq irq svc abt und hyp
        when 8     result = RBankSelect(mode,  8, 24,  8,  8,  8,  8,  8);
        when 9     result = RBankSelect(mode,  9, 25,  9,  9,  9,  9,  9);
        when 10    result = RBankSelect(mode, 10, 26, 10, 10, 10, 10, 10);
        when 11    result = RBankSelect(mode, 11, 27, 11, 11, 11, 11, 11);
        when 12    result = RBankSelect(mode, 12, 28, 12, 12, 12, 12, 12);
        when 13    result = RBankSelect(mode, 13, 29, 17, 19, 21, 23, 15);
        when 14    result = RBankSelect(mode, 14, 30, 16, 18, 20, 22, 14);
        otherwise  result = n;

    return result;
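
The banked register lookup above amounts to a table indexed by register number and mode. The following Python fragment is a sketch of that table. Mode names are short strings chosen for this example, System mode is treated as sharing the User mode registers as RBankSelect() does, and Monitor mode is not modeled.

```python
MODES = ("usr", "fiq", "irq", "svc", "abt", "und", "hyp")

# Index into the underlying register array for R8-R14, one column per mode
BANKED = {
    8:  (8, 24, 8, 8, 8, 8, 8),
    9:  (9, 25, 9, 9, 9, 9, 9),
    10: (10, 26, 10, 10, 10, 10, 10),
    11: (11, 27, 11, 11, 11, 11, 11),
    12: (12, 28, 12, 12, 12, 12, 12),
    13: (13, 29, 17, 19, 21, 23, 15),
    14: (14, 30, 16, 18, 20, 22, 14),
}

def look_up_r_index(n, mode):
    """Return the underlying register index for Rn in the given mode."""
    assert 0 <= n <= 14
    if mode == "sys":
        mode = "usr"                      # System mode uses User mode registers
    if n not in BANKED:
        return n                          # R0-R7 are never banked
    return BANKED[n][MODES.index(mode)]
```

The table shows that R8 to R12 are banked only for FIQ mode, that R13 has a separate copy in every listed mode, and that Hyp mode shares R14 with User mode.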


aarch32/functions/registers/Monitor_mode_registers 
bits(32) SP_mon; 
bits(32) LR_mon; 

aarch32/functions/registers/PC

// PC - non-assignment form
// ========================

bits(32) PC
    return R[15]; // This includes the offset from AArch32 state


aarch32/functions/registers/PCStoreValue

// PCStoreValue()
// ==============

bits(32) PCStoreValue()
    // This function returns the PC value. On architecture versions before ARMv7, it
    // is permitted to instead return PC+4, provided it does so consistently. It is
    // used only to describe A32 instructions, so it returns the address of the current
    // instruction plus 8 (normally) or 12 (when the alternative is permitted).
    return PC;


aarch32/functions/registers/Q

// Q[] - non-assignment form
// =========================

bits(128) Q[integer n]
    assert n >= 0 && n <= 15;
    return _V[n];

// Q[] - assignment form
// =====================

Q[integer n] = bits(128) value
    assert n >= 0 && n <= 15;
    _V[n] = value;
    return;


aarch32/functions/registers/Qin

// Qin[] - non-assignment form
// ===========================

bits(128) Qin[integer n]
    assert n >= 0 && n <= 15;
    return Din[2*n+1]:Din[2*n];


aarch32/functions/registers/R

// R[] - assignment form
// =====================

R[integer n] = bits(32) value
    Rmode[n, PSTATE.M] = value;
    return;

// R[] - non-assignment form
// =========================

bits(32) R[integer n]
    if n == 15 then
        offset = (if CurrentInstrSet() == InstrSet_A32 then 8 else 4);
        return _PC<31:0> + offset;
    else
        return Rmode[n, PSTATE.M];


aarch32/functions/registers/RBankSelect 


// RBankSelect()
// =============


integer RBankSelect(bits(5) mode, integer usr, integer fiq, integer irq,
integer svc, integer abt, integer und, integer hyp)


case mode of 
when M32_User result = usr; // User mode 
when M32_FIQ result = fiq; // FIQ mode 
when M32_IRQ result = irq; // IRQ mode 
when M32_Svc result = svc; // Supervisor mode 
when M32_Abort result = abt; // Abort mode 
when M32_Hyp result = hyp; // Hyp mode 
when M32_Undef result = und; // Undefined mode 
when M32_System result = usr; // System mode uses User mode registers 
otherwise Unreachable(); // Monitor mode 








return result; 
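
The banked-register selection can be sketched in Python as follows. This is an illustrative model only: the `M32_*` constants use the AArch32 PSTATE.M mode encodings, which are not defined in this section and are assumed here, and the helper names `r_bank_select` and `look_up_r_index` are hypothetical counterparts of RBankSelect() and LookUpRIndex().

```python
# Assumed AArch32 mode encodings (PSTATE.M values); not defined in this section.
M32_USER, M32_FIQ, M32_IRQ, M32_SVC = 0b10000, 0b10001, 0b10010, 0b10011
M32_ABORT, M32_HYP, M32_UNDEF, M32_SYSTEM = 0b10111, 0b11010, 0b11011, 0b11111


def r_bank_select(mode, usr, fiq, irq, svc, abt, und, hyp):
    """Pick the per-mode value, mirroring RBankSelect()."""
    table = {
        M32_USER: usr, M32_FIQ: fiq, M32_IRQ: irq, M32_SVC: svc,
        M32_ABORT: abt, M32_HYP: hyp, M32_UNDEF: und,
        M32_SYSTEM: usr,  # System mode uses the User mode registers
    }
    if mode not in table:
        raise ValueError("mode not handled here (Monitor mode or invalid)")
    return table[mode]


def look_up_r_index(n, mode):
    """Map AArch32 register number n (0..14) in 'mode' to an index into _R."""
    assert 0 <= n <= 14
    if 8 <= n <= 12:   # R8-R12 are banked only for FIQ mode
        return r_bank_select(mode, n, n + 16, n, n, n, n, n)
    if n == 13:        # SP is banked per mode
        return r_bank_select(mode, 13, 29, 17, 19, 21, 23, 15)
    if n == 14:        # LR is banked per mode, except Hyp, which shares the User LR
        return r_bank_select(mode, 14, 30, 16, 18, 20, 22, 14)
    return n           # R0-R7 are never banked


if __name__ == "__main__":
    print(look_up_r_index(13, M32_SVC))
```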


aarch32/functions/registers/Rmode 


// Rmode[] - non-assignment form
// =============================

bits(32) Rmode[integer n, bits(5) mode]
    assert n >= 0 && n <= 14;

    // Check for attempted use of Monitor mode in Non-secure state.
    if !IsSecure() then assert mode != M32_Monitor;
    assert !BadMode(mode);

    if mode == M32_Monitor then
        if n == 13 then return SP_mon;
        elsif n == 14 then return LR_mon;
        else return _R[n]<31:0>;
    else
        return _R[LookUpRIndex(n, mode)]<31:0>;


// Rmode[] - assignment form
// =========================

Rmode[integer n, bits(5) mode] = bits(32) value
    assert n >= 0 && n <= 14;

    // Check for attempted use of Monitor mode in Non-secure state.
    if !IsSecure() then assert mode != M32_Monitor;
    assert !BadMode(mode);

    if mode == M32_Monitor then
        if n == 13 then SP_mon = value;
        elsif n == 14 then LR_mon = value;
        else _R[n]<31:0> = value;
    else
        // It is CONSTRAINED UNPREDICTABLE whether the upper 32 bits of the X
        // register are unchanged or set to zero. This is also tested for on
        // exception entry, as this applies to all AArch32 registers.
        if !HighestELUsingAArch32() && ConstrainUnpredictableBool() then
            _R[LookUpRIndex(n, mode)] = ZeroExtend(value);
        else
            _R[LookUpRIndex(n, mode)]<31:0> = value;

    return;


aarch32/functions/registers/S 
// S[] - non-assignment form

bits(32) S[integer n]
    assert n >= 0 && n <= 31;
    base = (n MOD 4) * 32;
    return _V[n DIV 4]<base+31:base>;


// S[] - assignment form

S[integer n] = bits(32) value
    assert n >= 0 && n <= 31;
    base = (n MOD 4) * 32;
    _V[n DIV 4]<base+31:base> = value;
    return;
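
The S register view can be illustrated with a short Python sketch. It models `_V` as a list of 128-bit integers, which is an assumption made for illustration only; `s_read` and `s_write` are hypothetical names for the two accessor forms.

```python
MASK32 = 0xFFFFFFFF


def s_read(v, n):
    """Read S<n> as the 32-bit lane (n MOD 4) of vector register n DIV 4."""
    assert 0 <= n <= 31
    base = (n % 4) * 32
    return (v[n // 4] >> base) & MASK32


def s_write(v, n, value):
    """Write S<n>, leaving the other three 32-bit lanes of the register unchanged."""
    assert 0 <= n <= 31
    base = (n % 4) * 32
    lane_mask = MASK32 << base
    v[n // 4] = (v[n // 4] & ~lane_mask) | ((value & MASK32) << base)


if __name__ == "__main__":
    regs = [0] * 32
    s_write(regs, 5, 0xDEADBEEF)   # S5 lives in bits <63:32> of register 1
    print(hex(regs[1]), hex(s_read(regs, 5)))
```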


aarch32/functions/registers/SP 


// SP - assignment form
// ====================

SP = bits(32) value
    R[13] = value;
    return;


// SP - non-assignment form
// ========================

bits(32) SP
    return R[13];


aarch32/functions/registers/_Dclone


array bits(64) _Dclone[0..31];


aarch32/functions/system/AArch32.ExceptionReturn 


// AArch32.ExceptionReturn() 


AArch32.ExceptionReturn(bits(32) new_pc, bits(32) spsr) 


    // Attempts to change to an illegal mode or state will invoke the Illegal Execution state
    // mechanism
    SetPSTATEFromPSR(spsr);
    ClearExclusiveLocal(ProcessorID());
    EventRegisterSet();

    // Align PC[1:0] according to the target instruction set state
    if spsr<5> == '1' then      // T32
        new_pc = Align(new_pc, 2);
    else                        // A32
        new_pc = Align(new_pc, 4);

    BranchTo(new_pc, BranchType_UNKNOWN);


aarch32/functions/system/AArch32.ExecutingCP10or11Instr 


// AArch32.ExecutingCP10or11Instr() 
// ================================


boolean AArch32.ExecutingCP10or11Instr() 
instr = ThisInstr(); 
instr_set = CurrentInstrSet(); 
assert instr_set IN {InstrSet_A32, InstrSet_T32}; 


if instr_set == InstrSet_A32 then 
return ((instr<27:24> == '1110' || instr<27:25> == '110') && instr<11:8> == '101x');
else // InstrSet_T32 
return (instr<31:28> == '111x' && (instr<27:24> == '1110' || instr<27:25> == '110') && 
instr<11:8> == '101x'); 


aarch32/functions/system/AArch32.ExecutingLSMInstr 


// AArch32.ExecutingLSMInstr() 





// Returns TRUE if processor is executing a Load/Store Multiple instruction 


boolean AArch32.ExecutingLSMInstr() 
instr = ThisInstr(); 
instr_set = CurrentInstrSet(); 
assert instr_set IN {InstrSet_A32, InstrSet_T32}; 


if instr_set == InstrSet_A32 then 
return (instr<28+:4> != '1111' && instr<25+:3> == '100'); 
else // InstrSet_T32 
if ThisInstrLength() == 16 then 
return (instr<12+:4> == '1100'); 
else 
return (instr<25+:7> == '1110100' && instr<22> == '0');


aarch32/functions/system/AArch32.ITAdvance 


// AArch32.ITAdvance()


AArch32.ITAdvance()
    if PSTATE.IT<2:0> == '000' then
        PSTATE.IT = '00000000';
    else
        PSTATE.IT<4:0> = LSL(PSTATE.IT<4:0>, 1);
    return;


aarch32/functions/system/AArch32.SysRegRead 


// Read from a 32-bit AArch32 System register and return the register's contents. 
bits(32) AArch32.SysRegRead(integer cp_num, bits(32) instr);


aarch32/functions/system/AArch32.SysRegRead64


// Read from a 64-bit AArch32 System register and return the register's contents. 
bits(64) AArch32.SysRegRead64(integer cp_num, bits(32) instr); 


aarch32/functions/system/AArch32.SysRegReadCanWriteAPSR 
// AArch32.SysRegReadCanWriteAPSR()
// Determines whether the AArch32 System register read instruction can write to APSR flags.

boolean AArch32.SysRegReadCanWriteAPSR(integer cp_num, bits(32) instr)
    assert UsingAArch32();
    assert (cp_num IN {14,15});
    assert cp_num == UInt(instr<11:8>);

    opc1 = UInt(instr<23:21>);
    opc2 = UInt(instr<7:5>);
    CRn  = UInt(instr<19:16>);
    CRm  = UInt(instr<3:0>);

    if cp_num == 14 && opc1 == 0 && CRn == 0 && CRm == 1 && opc2 == 0 then // DBGDSCRint
        return TRUE;

    return FALSE;


aarch32/functions/system/AArch32.SysRegWrite 


// Write to a 32-bit AArch32 System register. 
AArch32.SysRegWrite(integer cp_num, bits(32) instr, bits(32) val); 


aarch32/functions/system/AArch32.SysRegWrite64 


// Write to a 64-bit AArch32 System register. 
AArch32.SysRegWrite64(integer cp_num, bits(32) instr, bits(64) val);


aarch32/functions/system/AArch32.WriteMode 


// AArch32.WriteMode()
// ===================
// Function for dealing with writes to PSTATE.M from AArch32 state only.
// This ensures that PSTATE.EL and PSTATE.SP are always valid.

AArch32.WriteMode(bits(5) mode)
    (valid,el) = ELFromM32(mode);
    assert valid;
    PSTATE.M = mode;
    PSTATE.EL = el;
    PSTATE.nRW = '1';
    PSTATE.SP = (if mode IN {M32_User,M32_System} then '0' else '1');
    return;


aarch32/functions/system/AArch32.WriteModeByInstr 


// AArch32.WriteModeByInstr()
// ==========================
// Function for dealing with writes to PSTATE.M from an AArch32 instruction, and ensuring that
// illegal state changes are correctly flagged in PSTATE.IL.

AArch32.WriteModeByInstr(bits(5) mode)
    (valid,el) = ELFromM32(mode);

    // 'valid' is set to FALSE if 'mode' is invalid for this implementation or the current value
    // of SCR.NS/SCR_EL3.NS. Additionally, it is illegal for an instruction to write 'mode' to
    // PSTATE.EL if it would result in any of:
    // * A change to a mode that would cause entry to a higher Exception level.
    if UInt(el) > UInt(PSTATE.EL) then
        valid = FALSE;

    // * A change to or from Hyp mode.
    if (PSTATE.M == M32_Hyp || mode == M32_Hyp) && PSTATE.M != mode then
        valid = FALSE;

    // * When EL2 is implemented, the value of HCR.TGE is '1', a change to a Non-secure EL1 mode.
    if PSTATE.M == M32_Monitor && HaveEL(EL2) && el == EL1 && SCR.NS == '1' && HCR.TGE == '1' then
        valid = FALSE;

    if !valid then
        PSTATE.IL = '1';
    else
        AArch32.WriteMode(mode);


aarch32/functions/system/BadMode 


// BadMode() 


boolean BadMode(bits(5) mode) 
// Return TRUE if 'mode' encodes a mode that is not valid for this implementation 
case mode of 
when M32_Monitor 
valid = HaveAArch32EL(EL3); 
when M32_Hyp 
valid = HaveAArch32EL(EL2); 
when M32_FIQ, M32_IRQ, M32_Svc, M32_Abort, M32_Undef, M32_System 
// If EL3 is implemented and using AArch32, then these modes are EL3 modes in Secure 
// state, and EL1 modes in Non-secure state. If EL3 is not implemented or is using 
// AArch64, then these modes are EL1 modes. 
// Therefore it is sufficient to test this implementation supports EL1 using AArch32. 
valid = HaveAArch32EL(EL1); 
when M32_User 
valid = HaveAArch32EL(EL0);
otherwise 
valid = FALSE; // Passed an illegal mode value 
return !valid; 


aarch32/functions/system/BankedRegisterAccessValid 
// BankedRegisterAccessValid()
// Checks for MRS (Banked register) or MSR (Banked register) accesses to registers
// other than the SPSRs that are invalid. This includes ELR_hyp accesses.

BankedRegisterAccessValid(bits(5) SYSm, bits(5) mode)

    case SYSm of
        when '000xx', '00100'                          // R8_usr to R12_usr
            if mode != M32_FIQ then UNPREDICTABLE;

        when '00101'                                   // SP_usr
            if mode == M32_System then UNPREDICTABLE;

        when '00110'                                   // LR_usr
            if mode IN {M32_Hyp,M32_System} then UNPREDICTABLE;

        when '010xx', '0110x', '01110'                 // R8_fiq to R12_fiq, SP_fiq, LR_fiq
            if mode == M32_FIQ then UNPREDICTABLE;

        when '1000x'                                   // LR_irq, SP_irq
            if mode == M32_IRQ then UNPREDICTABLE;

        when '1001x'                                   // LR_svc, SP_svc
            if mode == M32_Svc then UNPREDICTABLE;

        when '1010x'                                   // LR_abt, SP_abt
            if mode == M32_Abort then UNPREDICTABLE;







        when '1011x'                                   // LR_und, SP_und
            if mode == M32_Undef then UNPREDICTABLE;

        when '1110x'                                   // LR_mon, SP_mon
            if !HaveEL(EL3) || !IsSecure() || mode == M32_Monitor then UNPREDICTABLE;

        when '11110'                                   // ELR_hyp, only from Monitor or Hyp mode
            if !HaveEL(EL2) || !(mode IN {M32_Monitor,M32_Hyp}) then UNPREDICTABLE;

        when '11111'                                   // SP_hyp, only from Monitor mode
            if !HaveEL(EL2) || mode != M32_Monitor then UNPREDICTABLE;

        otherwise
            UNPREDICTABLE;

    return;


aarch32/functions/system/CPSRWriteByInstr


// CPSRWriteByInstr()
// Update PSTATE.<N,Z,C,V,Q,GE,E,A,I,F,M> from a CPSR value written by an MSR instruction.

CPSRWriteByInstr(bits(32) value, bits(4) bytemask)
    privileged = PSTATE.EL != EL0;              // PSTATE.<A,I,F,M> are not writable at EL0

    // Write PSTATE from 'value', ignoring bytes masked by 'bytemask'
    if bytemask<3> == '1' then
        PSTATE.<N,Z,C,V,Q> = value<31:27>;
        // Bits <26:24> are ignored

    if bytemask<2> == '1' then
        // Bits <23:20> are RES0
        PSTATE.GE = value<19:16>;

    if bytemask<1> == '1' then
        // Bits <15:10> are RES0
        PSTATE.E = value<9>;                    // PSTATE.E is writable at EL0
        if privileged then
            PSTATE.A = value<8>;

    if bytemask<0> == '1' then
        if privileged then
            PSTATE.<I,F> = value<7:6>;
            // Bit <5> is RES0
            // AArch32.WriteModeByInstr() sets PSTATE.IL to 1 if this is an illegal mode change.
            AArch32.WriteModeByInstr(value<4:0>);

    return;


aarch32/functions/system/ConditionPassed 


// ConditionPassed()
// =================

boolean ConditionPassed()
    return ConditionHolds(AArch32.CurrentCond());


aarch32/functions/system/CurrentCond


bits(4) AArch32.CurrentCond(); 


aarch32/functions/system/InITBlock 


// InITBlock()
// ===========

boolean InITBlock()
    if CurrentInstrSet() == InstrSet_T32 then
        return PSTATE.IT<3:0> != '0000';
    else
        return FALSE;


aarch32/functions/system/LastInITBlock 


// LastInITBlock()
// ===============

boolean LastInITBlock()
    return (PSTATE.IT<3:0> == '1000');
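
How the IT state evolves across a block can be illustrated with a small Python sketch of InITBlock(), LastInITBlock(), and AArch32.ITAdvance() from earlier in this section. It treats PSTATE.IT as a plain 8-bit integer and ignores the instruction-set check in InITBlock(); the function names are hypothetical.

```python
def in_it_block(it):
    """IT<3:0> != '0000' means an IT block is active."""
    return (it & 0b1111) != 0


def last_in_it_block(it):
    """IT<3:0> == '1000' marks the final instruction of the block."""
    return (it & 0b1111) == 0b1000


def it_advance(it):
    """Shift IT<4:0> left by one, or clear IT when IT<2:0> is zero."""
    if (it & 0b111) == 0:
        return 0
    return (it & 0b11100000) | ((it << 1) & 0b00011111)


if __name__ == "__main__":
    it = 0b00000100          # two-instruction block, base condition 0000
    while in_it_block(it):
        print(format(it, "08b"), "last" if last_in_it_block(it) else "")
        it = it_advance(it)
```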


aarch32/functions/system/SPSRWriteByInstr 


// SPSRWriteByInstr()
// ==================

SPSRWriteByInstr(bits(32) value, bits(4) bytemask)

    new_spsr = SPSR[];

    if bytemask<3> == '1' then
        new_spsr<31:24> = value<31:24>;  // N,Z,C,V,Q flags, IT[1:0],J bits

    if bytemask<2> == '1' then
        new_spsr<23:16> = value<23:16>;  // IL bit, GE[3:0] flags

    if bytemask<1> == '1' then
        new_spsr<15:8> = value<15:8>;    // IT[7:2] bits, E bit, A interrupt mask

    if bytemask<0> == '1' then
        new_spsr<7:0> = value<7:0>;      // I,F interrupt masks, T bit, Mode bits

    SPSR[] = new_spsr;                   // UNPREDICTABLE if User or System mode
    return;
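
The byte-lane merge performed by SPSRWriteByInstr() can be sketched in Python as below. This is an illustration only: the function name is hypothetical, the SPSR is passed in and returned as a plain integer, and the UNPREDICTABLE User/System mode case is not modeled.

```python
def spsr_write_by_instr(spsr, value, bytemask):
    """Return the new SPSR: byte k of 'value' replaces byte k of 'spsr' when bytemask<k> is 1."""
    new_spsr = spsr
    for byte in range(4):
        if (bytemask >> byte) & 1:
            lane = 0xFF << (8 * byte)
            new_spsr = (new_spsr & ~lane) | (value & lane)
    return new_spsr & 0xFFFFFFFF


if __name__ == "__main__":
    # Update only the flags byte (bit 3) and the control byte (bit 0).
    print(hex(spsr_write_by_instr(0x11223344, 0xAABBCCDD, 0b1001)))
```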


aarch32/functions/system/SPSRaccessValid 
// SPSRaccessValid()
// Checks for MRS (Banked register) or MSR (Banked register) accesses to the SPSRs
// that are UNPREDICTABLE

SPSRaccessValid(bits(5) SYSm, bits(5) mode)
    case SYSm of
        when '01110'                                                   // SPSR_fiq
            if mode == M32_FIQ then UNPREDICTABLE;

        when '10000'                                                   // SPSR_irq
            if mode == M32_IRQ then UNPREDICTABLE;

        when '10010'                                                   // SPSR_svc
            if mode == M32_Svc then UNPREDICTABLE;

        when '10100'                                                   // SPSR_abt
            if mode == M32_Abort then UNPREDICTABLE;

        when '10110'                                                   // SPSR_und
            if mode == M32_Undef then UNPREDICTABLE;

        when '11100'                                                   // SPSR_mon
            if !HaveEL(EL3) || mode == M32_Monitor || !IsSecure() then UNPREDICTABLE;

        when '11110'                                                   // SPSR_hyp
            if !HaveEL(EL2) || mode != M32_Monitor then UNPREDICTABLE;

        otherwise
            UNPREDICTABLE;

    return;


aarch32/functions/system/SelectInstrSet


// SelectInstrSet() 
// ================


SelectInstrSet(InstrSet iset) 
assert CurrentInstrSet() IN {InstrSet_A32, InstrSet_T32}; 
assert iset IN {InstrSet_A32, InstrSet_T32}; 


PSTATE.T = if iset == InstrSet_A32 then '0' else '1'; 


return; 


aarch32/functions/v6simd/Sat 


// Sat() 
// =====


bits(N) Sat(integer i, integer N, boolean unsigned) 
result = if unsigned then UnsignedSat(i, N) else SignedSat(i, N); 
return result; 


aarch32/functions/v6simd/SignedSat 


// SignedSat() 
// ===========


bits(N) SignedSat(integer i, integer N) 
(result, -) = SignedSatQ(i, N); 
return result; 


aarch32/functions/v6simd/UnsignedSat 


// UnsignedSat() 
// =============


bits(N) UnsignedSat(integer i, integer N) 
(result, -) = UnsignedSatQ(i, N); 
return result; 
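
The saturation helpers can be illustrated with the following Python sketch. SignedSatQ() and UnsignedSatQ() are not defined in this section, so their behavior here (clamp to the N-bit range and report whether clamping occurred) is an assumption, and all function names are hypothetical. Results are returned as N-bit patterns held in unsigned Python integers.

```python
def signed_sat_q(i, n):
    """Clamp i to the signed n-bit range; return (n-bit pattern, saturated?)."""
    hi, lo = (1 << (n - 1)) - 1, -(1 << (n - 1))
    clamped = max(lo, min(hi, i))
    return clamped & ((1 << n) - 1), clamped != i


def unsigned_sat_q(i, n):
    """Clamp i to the unsigned n-bit range; return (n-bit pattern, saturated?)."""
    clamped = max(0, min((1 << n) - 1, i))
    return clamped, clamped != i


def sat(i, n, unsigned):
    """Counterpart of Sat(): discard the saturation flag."""
    result, _ = unsigned_sat_q(i, n) if unsigned else signed_sat_q(i, n)
    return result


if __name__ == "__main__":
    print(hex(sat(200, 8, False)), hex(sat(-200, 8, False)), hex(sat(300, 8, True)))
```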


J1.2.4 aarch32/translation 


This section includes the following pseudocode functions:

• aarch32/translation/attrs/AArch32.DefaultTEXDecode on page J1-5355.
• aarch32/translation/attrs/AArch32.InstructionDevice on page J1-5356.
• aarch32/translation/attrs/AArch32.RemappedTEXDecode on page J1-5356.
• aarch32/translation/attrs/AArch32.S1AttrDecode on page J1-5357.
• aarch32/translation/attrs/AArch32.TranslateAddressS1Off on page J1-5357.
• aarch32/translation/checks/AArch32.CheckDomain on page J1-5358.
• aarch32/translation/checks/AArch32.CheckPermission on page J1-5359.
• aarch32/translation/checks/AArch32.CheckS2Permission on page J1-5360.
• aarch32/translation/debug/AArch32.CheckBreakpoint on page J1-5360.
• aarch32/translation/debug/AArch32.CheckDebug on page J1-5361.
• aarch32/translation/debug/AArch32.CheckVectorCatch on page J1-5361.
• aarch32/translation/debug/AArch32.CheckWatchpoint on page J1-5362.
• aarch32/translation/faults/AArch32.AccessFlagFault on page J1-5362.
• aarch32/translation/faults/AArch32.AddressSizeFault on page J1-5362.
• aarch32/translation/faults/AArch32.AlignmentFault on page J1-5362.
• aarch32/translation/faults/AArch32.AsynchExternalAbort on page J1-5363.
• aarch32/translation/faults/AArch32.DebugFault on page J1-5363.
• aarch32/translation/faults/AArch32.DomainFault on page J1-5363.
• aarch32/translation/faults/AArch32.NoFault on page J1-5363.
• aarch32/translation/faults/AArch32.PermissionFault on page J1-5364.
• aarch32/translation/faults/AArch32.TranslationFault on page J1-5364.
• aarch32/translation/translation/AArch32.FirstStageTranslate on page J1-5364.
• aarch32/translation/translation/AArch32.FullTranslate on page J1-5365.
• aarch32/translation/translation/AArch32.SecondStageTranslate on page J1-5365.
• aarch32/translation/translation/AArch32.SecondStageWalk on page J1-5366.
• aarch32/translation/translation/AArch32.TranslateAddress on page J1-5367.
• aarch32/translation/walk/AArch32.TranslationTableWalkLD on page J1-5367.
• aarch32/translation/walk/AArch32.TranslationTableWalkSD on page J1-5370.
• aarch32/translation/walk/RemapRegsHaveResetValues on page J1-5373.


aarch32/translation/attrs/AArch32.DefaultTEXDecode 


// AArch32.DefaultTEXDecode()

MemoryAttributes AArch32.DefaultTEXDecode(bits(3) TEX, bit C, bit B, bit S, AccType acctype)
    MemoryAttributes memattrs;

    // Reserved values map to allocated values
    if (TEX == '001' && C:B == '01') || (TEX == '010' && C:B != '00') || TEX == '011' then
        bits(5) texcb;
        (-, texcb) = ConstrainUnpredictableBits();
        TEX = texcb<4:2>; C = texcb<1>; B = texcb<0>;

    case TEX:C:B of
        when '00000'
            // Device-nGnRnE
            memattrs.type = MemType_Device;
            memattrs.device = DeviceType_nGnRnE;
        when '00001', '01000'
            // Device-nGnRE
            memattrs.type = MemType_Device;
            memattrs.device = DeviceType_nGnRE;
        when '00010', '00011', '00100'
            // Write-back or Write-through Read allocate, or Non-cacheable
            memattrs.type = MemType_Normal;
            memattrs.inner = ShortConvertAttrsHints(C:B, acctype, FALSE);
            memattrs.outer = ShortConvertAttrsHints(C:B, acctype, FALSE);
            memattrs.shareable = (S == '1');
        when '00110'
            memattrs = MemoryAttributes IMPLEMENTATION_DEFINED;
        when '00111'
            // Write-back Read and Write allocate
            memattrs.type = MemType_Normal;
            memattrs.inner = ShortConvertAttrsHints('01', acctype, FALSE);
            memattrs.outer = ShortConvertAttrsHints('01', acctype, FALSE);
            memattrs.shareable = (S == '1');
        when '1xxxx'
            // Cacheable, TEX<1:0> = Outer attrs, {C,B} = Inner attrs
            memattrs.type = MemType_Normal;
            memattrs.inner = ShortConvertAttrsHints(C:B, acctype, FALSE);
            memattrs.outer = ShortConvertAttrsHints(TEX<1:0>, acctype, FALSE);
            memattrs.shareable = (S == '1');
        otherwise
            // Reserved, handled above
            Unreachable();

    // transient bits are not supported in this format
    memattrs.inner.transient = FALSE;
    memattrs.outer.transient = FALSE;

    // distinction between inner and outer shareable is not supported in this format
    memattrs.outershareable = memattrs.shareable;

    return MemAttrDefaults(memattrs);


aarch32/translation/attrs/AArch32.InstructionDevice 


// AArch32.InstructionDevice()
// ===========================
// Instruction fetches from memory marked as Device but not execute-never might generate a
// Permission Fault but are otherwise treated as if from Normal Non-cacheable memory.

AddressDescriptor AArch32.InstructionDevice(AddressDescriptor addrdesc, bits(32) vaddress,
                                            bits(40) ipaddress, integer level, bits(4) domain,
                                            AccType acctype, boolean iswrite, boolean secondstage,
                                            boolean s2fs1walk)

    c = ConstrainUnpredictable();
    assert c IN {Constraint_NONE, Constraint_FAULT};

    if c == Constraint_FAULT then
        addrdesc.fault = AArch32.PermissionFault(ipaddress, domain, level, acctype, iswrite,
                                                 secondstage, s2fs1walk);
    else
        addrdesc.memattrs.type = MemType_Normal;
        addrdesc.memattrs.inner.attrs = MemAttr_NC;
        addrdesc.memattrs.inner.hints = MemHint_No;
        addrdesc.memattrs.outer = addrdesc.memattrs.inner;
        addrdesc.memattrs = MemAttrDefaults(addrdesc.memattrs);

    return addrdesc;


aarch32/translation/attrs/AArch32.RemappedTEXDecode 


// AArch32.RemappedTEXDecode()

MemoryAttributes AArch32.RemappedTEXDecode(bits(3) TEX, bit C, bit B, bit S, AccType acctype)
    MemoryAttributes memattrs;

    region = UInt(TEX<0>:C:B);         // TEX<2:1> are ignored in this mapping scheme
    if region == 6 then
        memattrs = MemoryAttributes IMPLEMENTATION_DEFINED;
    else
        base = 2 * region;
        attrfield = PRRR<base+1:base>;

        if attrfield == '11' then      // Reserved, maps to allocated value
            (-, attrfield) = ConstrainUnpredictableBits();

        case attrfield of
            when '00'                  // Device-nGnRnE
                memattrs.type = MemType_Device;
                memattrs.device = DeviceType_nGnRnE;
            when '01'                  // Device-nGnRE
                memattrs.type = MemType_Device;
                memattrs.device = DeviceType_nGnRE;
            when '10'
                memattrs.type = MemType_Normal;
                memattrs.inner = ShortConvertAttrsHints(NMRR<base+1:base>, acctype, FALSE);
                memattrs.outer = ShortConvertAttrsHints(NMRR<base+17:base+16>, acctype, FALSE);
                s_bit = if S == '0' then PRRR.NS0 else PRRR.NS1;
                memattrs.shareable = (s_bit == '1');
                memattrs.outershareable = (s_bit == '1' && PRRR<region+24> == '0');
            when '11'
                Unreachable();

    // transient bits are not supported in this format
    memattrs.inner.transient = FALSE;
    memattrs.outer.transient = FALSE;

    return MemAttrDefaults(memattrs);


aarch32/translation/attrs/AArch32.S1AttrDecode 
// AArch32.S1AttrDecode()
// Converts the Stage 1 attribute fields, using the MAIR, to orthogonal
// attributes and hints.

MemoryAttributes AArch32.S1AttrDecode(bits(2) SH, bits(3) attr, AccType acctype)
    MemoryAttributes memattrs;

    if PSTATE.EL == EL2 then
        mair = HMAIR1:HMAIR0;
    else
        mair = MAIR1:MAIR0;
    index = 8 * UInt(attr);
    attrfield = mair<index+7:index>;

    if ((attrfield<7:4> != '0000' && attrfield<3:0> == '0000') ||
        (attrfield<7:4> == '0000' && attrfield<3:0> != 'xx00')) then
        // Reserved, maps to an allocated value
        (-, attrfield) = ConstrainUnpredictableBits();

    if attrfield<7:4> == '0000' then            // Device
        memattrs.type = MemType_Device;
        case attrfield<3:0> of
            when '0000' memattrs.device = DeviceType_nGnRnE;
            when '0100' memattrs.device = DeviceType_nGnRE;
            when '1000' memattrs.device = DeviceType_nGRE;
            when '1100' memattrs.device = DeviceType_GRE;
            otherwise Unreachable();            // Reserved, handled above

    elsif attrfield<3:0> != '0000' then         // Normal
        memattrs.type = MemType_Normal;
        memattrs.outer = LongConvertAttrsHints(attrfield<7:4>, acctype);
        memattrs.inner = LongConvertAttrsHints(attrfield<3:0>, acctype);
        memattrs.shareable = SH<1> == '1';
        memattrs.outershareable = SH == '10';
    else
        Unreachable();                          // Reserved, handled above

    return MemAttrDefaults(memattrs);
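
The attribute-field selection and classification can be sketched in Python as follows. The sketch is illustrative only: `decode_attr_field` is a hypothetical name, `mair` stands for the 64-bit concatenation MAIR1:MAIR0 (or HMAIR1:HMAIR0), reserved encodings are reported rather than remapped, and the conversion of the Normal memory nibbles into attributes and hints is not modeled.

```python
DEVICE_TYPES = {0b0000: "nGnRnE", 0b0100: "nGnRE", 0b1000: "nGRE", 0b1100: "GRE"}


def decode_attr_field(mair, attr):
    """Classify attribute field 'attr' (0..7) of the 64-bit MAIR value."""
    assert 0 <= attr <= 7
    index = 8 * attr
    field = (mair >> index) & 0xFF
    outer, inner = field >> 4, field & 0xF

    # Reserved: non-zero high nibble with zero low nibble, or Device with low bits != '00'.
    if (outer != 0 and inner == 0) or (outer == 0 and (inner & 0b11) != 0):
        return ("reserved", None)
    if outer == 0:
        return ("device", DEVICE_TYPES[inner])
    return ("normal", (outer, inner))


if __name__ == "__main__":
    mair = 0x0140FF440400
    for attr in range(6):
        print(attr, decode_attr_field(mair, attr))
```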


aarch32/translation/attrs/AArch32.TranslateAddressS1Off


// AArch32.TranslateAddressS1Off()
// ===============================
// Called for stage 1 translations when translation is disabled to supply a default translation.
// Note that there are additional constraints on instruction prefetching that are not described in
// this pseudocode.

TLBRecord AArch32.TranslateAddressS1Off(bits(32) vaddress, AccType acctype, boolean iswrite)
    assert ELUsingAArch32(S1TranslationRegime());

    TLBRecord result;

    default_cacheable = (HasS2Translation() &&
                         ((if ELUsingAArch32(EL2) then HCR.DC else HCR_EL2.DC) == '1'));

    if default_cacheable then
        // Use default cacheable settings
        result.addrdesc.memattrs.type = MemType_Normal;
        result.addrdesc.memattrs.inner.attrs = MemAttr_WB;      // Write-back
        result.addrdesc.memattrs.inner.hints = MemHint_RWA;
        result.addrdesc.memattrs.shareable = FALSE;
        result.addrdesc.memattrs.outershareable = FALSE;
    elsif acctype != AccType_IFETCH then
        // Treat data as Device
        result.addrdesc.memattrs.type = MemType_Device;
        result.addrdesc.memattrs.device = DeviceType_nGnRnE;
        result.addrdesc.memattrs.inner = MemAttrHints UNKNOWN;
    else
        // Instruction cacheability controlled by SCTLR/HSCTLR.I
        if PSTATE.EL == EL2 then
            cacheable = HSCTLR.I == '1';
        else
            cacheable = SCTLR.I == '1';
        result.addrdesc.memattrs.type = MemType_Normal;
        if cacheable then
            result.addrdesc.memattrs.inner.attrs = MemAttr_WT;
            result.addrdesc.memattrs.inner.hints = MemHint_RA;
        else
            result.addrdesc.memattrs.inner.attrs = MemAttr_NC;
            result.addrdesc.memattrs.inner.hints = MemHint_No;
        result.addrdesc.memattrs.shareable = TRUE;
        result.addrdesc.memattrs.outershareable = TRUE;

    result.addrdesc.memattrs.outer = result.addrdesc.memattrs.inner;
    result.addrdesc.memattrs = MemAttrDefaults(result.addrdesc.memattrs);
    result.perms.ap = bits(3) UNKNOWN;
    result.perms.xn = '0';
    result.perms.pxn = '0';

    result.nG = bit UNKNOWN;
    result.contiguous = boolean UNKNOWN;
    result.domain = bits(4) UNKNOWN;
    result.level = integer UNKNOWN;
    result.blocksize = integer UNKNOWN;
    result.addrdesc.paddress.physicaladdress = ZeroExtend(vaddress);
    result.addrdesc.paddress.NS = if IsSecure() then '0' else '1';
    result.addrdesc.fault = AArch32.NoFault();

    return result;


aarch32/translation/checks/AArch32.CheckDomain


// AArch32.CheckDomain()
// =====================

(boolean, FaultRecord) AArch32.CheckDomain(bits(4) domain, bits(32) vaddress, integer level,
                                           AccType acctype, boolean iswrite)

    index = 2 * UInt(domain);
    attrfield = DACR<index+1:index>;

    if attrfield == '10' then          // Reserved, maps to an allocated value
        (-, attrfield) = ConstrainUnpredictableBits();

    if attrfield == '00' then
        fault = AArch32.DomainFault(domain, level, acctype, iswrite);
    else
        fault = AArch32.NoFault();

    permissioncheck = (attrfield == '01');

    return (permissioncheck, fault);
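
The domain check reduces to a 2-bit field lookup in DACR, which the following Python sketch illustrates. The function name and the boolean return shape are hypothetical; the reserved encoding '10' is CONSTRAINED UNPREDICTABLE in the pseudocode and is simply rejected here.

```python
def check_domain(dacr, domain):
    """Return (permission_check, domain_fault) for the 2-bit DACR field of 'domain'."""
    assert 0 <= domain <= 15
    index = 2 * domain
    field = (dacr >> index) & 0b11
    if field == 0b10:
        raise ValueError("reserved DACR field: behaviour is CONSTRAINED UNPREDICTABLE")
    domain_fault = field == 0b00       # no access: Domain fault
    permission_check = field == 0b01   # only this encoding requests a permission check
    return permission_check, domain_fault


if __name__ == "__main__":
    dacr = 0x0000000D                  # domain 0 = '01', domain 1 = '11', others '00'
    for d in range(3):
        print(d, check_domain(dacr, d))
```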


aarch32/translation/checks/AArch32.CheckPermission 


// AArch32.CheckPermission()
// Function used for permission checking from AArch32 stage 1 translations

FaultRecord AArch32.CheckPermission(Permissions perms, bits(32) vaddress, integer level,
                                    bits(4) domain, bit NS, AccType acctype, boolean iswrite)
    assert ELUsingAArch32(S1TranslationRegime());

    if PSTATE.EL != EL2 then
        wxn = SCTLR.WXN == '1';
        if TTBCR.EAE == '1' || SCTLR.AFE == '1' || perms.ap<0> == '1' then
            priv_r = TRUE;
            priv_w = perms.ap<2> == '0';
            user_r = perms.ap<1> == '1';
            user_w = perms.ap<2:1> == '01';
        else
            priv_r = perms.ap<2:1> != '00';
            priv_w = perms.ap<2:1> == '01';
            user_r = perms.ap<1> == '1';
            user_w = FALSE;
        uwxn = SCTLR.UWXN == '1';

        if PSTATE.EL == EL0 then
            ispriv = FALSE;
        else
            ispriv = (acctype != AccType_UNPRIV);

        user_xn = !user_r || perms.xn == '1' || (user_w && wxn);
        priv_xn = (!priv_r || perms.xn == '1' || perms.pxn == '1' ||
                   (priv_w && wxn) || (user_w && uwxn));

        if ispriv then
            (r, w, xn) = (priv_r, priv_w, priv_xn);
        else
            (r, w, xn) = (user_r, user_w, user_xn);
    else
        // Access from EL2
        wxn = HSCTLR.WXN == '1';
        r = TRUE;
        w = perms.ap<2> == '0';
        xn = perms.xn == '1' || (w && wxn);

    // Restriction on Secure instruction fetch
    if HaveEL(EL3) && IsSecure() && NS == '1' then
        secure_instr_fetch = if ELUsingAArch32(EL3) then SCR.SIF else SCR_EL3.SIF;
        if secure_instr_fetch == '1' then xn = TRUE;

    if acctype == AccType_IFETCH then
        fail = xn;
        failedread = TRUE;
    elsif iswrite && !IsSecure() && PSTATE.EL == EL1 && (acctype != AccType_DC) then
        fail = !w;
        failedread = FALSE;
    else
        fail = !r;
        failedread = TRUE;

    if fail then
        secondstage = FALSE;
        s2fs1walk = FALSE;
        ipaddress = bits(40) UNKNOWN;
        return AArch32.PermissionFault(ipaddress, domain, level, acctype,
                                       !failedread, secondstage, s2fs1walk);
    else
        return AArch32.NoFault();
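
The AP decoding at EL0 and EL1 in the function above can be illustrated with a short Python sketch. The name `decode_ap` and the `simplified` flag are hypothetical; `simplified` stands for the condition TTBCR.EAE == '1' || SCTLR.AFE == '1', and the execute-never and EL2 paths are not modeled.

```python
def decode_ap(ap, simplified):
    """Return (priv_r, priv_w, user_r, user_w) for a 3-bit AP value."""
    ap2 = (ap >> 2) & 1          # ap<2>
    ap1 = (ap >> 1) & 1          # ap<1>
    ap21 = (ap >> 1) & 0b11      # ap<2:1>
    if simplified or (ap & 1) == 1:
        priv_r = True
        priv_w = ap2 == 0
        user_r = ap1 == 1
        user_w = ap21 == 0b01
    else:
        priv_r = ap21 != 0b00
        priv_w = ap21 == 0b01
        user_r = ap1 == 1
        user_w = False
    return priv_r, priv_w, user_r, user_w


if __name__ == "__main__":
    for ap in range(8):
        print(format(ap, "03b"), decode_ap(ap, simplified=False))
```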


aarch32/translation/checks/AArch32.CheckS2Permission 


// AArch32.CheckS2Permission()
// Function used for permission checking from AArch32 stage 2 translations

FaultRecord AArch32.CheckS2Permission(Permissions perms, bits(32) vaddress, bits(40) ipaddress,
                                      integer level, AccType acctype, boolean iswrite,
                                      boolean s2fs1walk)

    assert HaveEL(EL2) && !IsSecure() && ELUsingAArch32(EL2) && HasS2Translation();

    r = perms.ap<1> == '1';
    w = perms.ap<2> == '1';
    xn = !r || perms.xn == '1';

    // Stage 1 walk is checked as a read, regardless of the original type
    if acctype == AccType_IFETCH && !s2fs1walk then
        fail = xn;
        failedread = TRUE;
    elsif iswrite && !s2fs1walk then
        fail = !w;
        failedread = FALSE;
    else
        fail = !r;
        failedread = !iswrite;

    if fail then
        domain = bits(4) UNKNOWN;
        secondstage = TRUE;
        return AArch32.PermissionFault(ipaddress, domain, level, acctype,
                                       !failedread, secondstage, s2fs1walk);
    else
        return AArch32.NoFault();
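
The stage 2 decision can be condensed into the following Python sketch, offered as an illustration only. The function name and boolean parameters are hypothetical; `ap` is the 3-bit perms.ap value in which, per the pseudocode above, bit 1 grants read access and bit 2 grants write access.

```python
def s2_access_fails(ap, xn, is_ifetch, is_write, s2fs1walk):
    """Return True when the stage 2 permission check would report a fault."""
    r = ((ap >> 1) & 1) == 1
    w = ((ap >> 2) & 1) == 1
    exec_never = (not r) or xn
    # A stage 1 table walk is checked as a read, whatever the original access was.
    if is_ifetch and not s2fs1walk:
        return exec_never
    if is_write and not s2fs1walk:
        return not w
    return not r


if __name__ == "__main__":
    # Read-only mapping: a data write fails, a data read succeeds.
    print(s2_access_fails(0b010, False, False, True, False))
    print(s2_access_fails(0b010, False, False, False, False))
```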


aarch32/translation/debug/AArch32.CheckBreakpoint 
// AArch32.CheckBreakpoint()
// Called before executing the instruction of length "size" bytes at "vaddress" in an AArch32
// translation regime.
// The breakpoint can in fact be evaluated well ahead of execution, for example, at instruction
// fetch. This is the simple sequential execution of the program.

FaultRecord AArch32.CheckBreakpoint(bits(32) vaddress, integer size)
    assert ELUsingAArch32(S1TranslationRegime());
    assert size IN {2,4};

    match = FALSE;
    mismatch = FALSE;

    for i = 0 to UInt(DBGDIDR.BRPs)
        (match_i, mismatch_i) = AArch32.BreakpointMatch(i, vaddress, size);
        match = match || match_i;
        mismatch = mismatch || mismatch_i;

    if match && HaltOnBreakpointOrWatchpoint() then
        reason = DebugHalt_Breakpoint;
        Halt(reason);
    elsif (match || mismatch) && DBGDSCRext.MDBGen == '1' && AArch32.GenerateDebugExceptions() then
        acctype = AccType_IFETCH;
        iswrite = FALSE;
        debugmoe = DebugException_Breakpoint;
        return AArch32.DebugFault(acctype, iswrite, debugmoe);
    else
        return AArch32.NoFault();


aarch32/translation/debug/AArch32.CheckDebug 


// AArch32.CheckDebug() 


// Called on each access to check for a debug exception or entry to Debug state. 
FaultRecord AArch32.CheckDebug(bits(32) vaddress, AccType acctype, boolean iswrite, integer size) 
FaultRecord fault = AArch32.NoFault(); 


d_side = (acctype != AccType_IFETCH); 

generate_exception = AArch32.GenerateDebugExceptions() && DBGDSCRext.MDBGen == '1'; 

halt = HaltOnBreakpointOrWatchpoint();

// Relative priority of Vector Catch and Breakpoint exceptions not defined in the architecture 
vector_catch_first = ConstrainUnpredictableBool(); 


if !d_side && vector_catch_first && generate_exception then 
fault = AArch32.CheckVectorCatch(vaddress, size); 


if fault.type == Fault_None && (generate_exception || halt) then 
if d_side then 
fault = AArch32.CheckWatchpoint(vaddress, acctype, iswrite, size); 
else 
fault = AArch32.CheckBreakpoint(vaddress, size); 


if fault.type == Fault_None && !d_side && !vector_catch_first && generate_exception then 
return AArch32.CheckVectorCatch(vaddress, size); 


return fault; 


aarch32/translation/debug/AArch32.CheckVectorCatch 


// AArch32.CheckVectorCatch() 

// ==========================

// Called before executing the instruction of length "size" bytes at "vaddress" in an AArch32 
// translation regime. 

// Vector Catch can in fact be evaluated well ahead of execution, for example, at instruction 
// fetch. This is the simple sequential execution of the program. 


FaultRecord AArch32.CheckVectorCatch(bits(32) vaddress, integer size) 
assert ELUsingAArch32(S1TranslationRegime()); 


match = AArch32.VCRMatch(vaddress); 
if size == 4 && !match && AArch32.VCRMatch(vaddress + 2) then 
match = ConstrainUnpredictableBool(); 


if match && DBGDSCRext.MDBGen == '1' && AArch32.GenerateDebugExceptions() then 
acctype = AccType_IFETCH; 
iswrite = FALSE; 
debugmoe = DebugException_VectorCatch; 
return AArch32.DebugFault(acctype, iswrite, debugmoe);
else
return AArch32.NoFault();


aarch32/translation/debug/AArch32.CheckWatchpoint


// AArch32.CheckWatchpoint()
// =========================
// Called before accessing the memory location of "size" bytes at "address".

FaultRecord AArch32.CheckWatchpoint(bits(32) vaddress, AccType acctype,
                                    boolean iswrite, integer size)
    assert ELUsingAArch32(S1TranslationRegime());

    match = FALSE;
    ispriv = PSTATE.EL != EL0 && !(PSTATE.EL == EL1 && acctype == AccType_UNPRIV);

    for i = 0 to UInt(DBGDIDR.WRPs)
        match = match || AArch32.WatchpointMatch(i, vaddress, size, ispriv, iswrite);

    if match && HaltOnBreakpointOrWatchpoint() then
        reason = DebugHalt_Watchpoint;
        Halt(reason);
    elsif match && DBGDSCRext.MDBGen == '1' && AArch32.GenerateDebugExceptions() then
        debugmoe = DebugException_Watchpoint;
        return AArch32.DebugFault(acctype, iswrite, debugmoe);
    else
        return AArch32.NoFault();


aarch32/translation/faults/AArch32.AccessFlagFault 


// AArch32.AccessFlagFault()
// =========================

FaultRecord AArch32.AccessFlagFault(bits(40) ipaddress, bits(4) domain, integer level,
                                    AccType acctype, boolean iswrite, boolean secondstage,
                                    boolean s2fs1walk)

    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    return AArch32.CreateFaultRecord(Fault_AccessFlag, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fs1walk);


aarch32/translation/faults/AArch32.AddressSizeFault 


// AArch32.AddressSizeFault()
// ==========================

FaultRecord AArch32.AddressSizeFault(bits(40) ipaddress, bits(4) domain, integer level,
                                     AccType acctype, boolean iswrite, boolean secondstage,
                                     boolean s2fslwalk)

    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    return AArch32.CreateFaultRecord(Fault_AddressSize, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);


aarch32/translation/faults/AArch32.AlignmentFault 


// AArch32.AlignmentFault()
// ========================

FaultRecord AArch32.AlignmentFault(AccType acctype, boolean iswrite, boolean secondstage)

    ipaddress = bits(40) UNKNOWN;
    domain = bits(4) UNKNOWN;
    level = integer UNKNOWN;
    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    s2fslwalk = boolean UNKNOWN;

    return AArch32.CreateFaultRecord(Fault_Alignment, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);


aarch32/translation/faults/AArch32.AsynchExternalAbort 


// AArch32.AsynchExternalAbort()
// =============================
// Wrapper function for asynchronous external aborts

FaultRecord AArch32.AsynchExternalAbort(boolean parity, bit extflag)

    type = if parity then Fault_AsyncParity else Fault_AsyncExternal;
    ipaddress = bits(40) UNKNOWN;
    domain = bits(4) UNKNOWN;
    level = integer UNKNOWN;
    acctype = AccType_NORMAL;
    iswrite = boolean UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    secondstage = FALSE;
    s2fslwalk = FALSE;

    return AArch32.CreateFaultRecord(type, ipaddress, domain, level, acctype, iswrite, extflag,
                                     debugmoe, secondstage, s2fslwalk);


aarch32/translation/faults/AArch32.DebugFault 


// AArch32.DebugFault()
// ====================

FaultRecord AArch32.DebugFault(AccType acctype, boolean iswrite, bits(4) debugmoe)

    ipaddress = bits(40) UNKNOWN;
    domain = bits(4) UNKNOWN;
    level = integer UNKNOWN;
    extflag = bit UNKNOWN;
    secondstage = FALSE;
    s2fslwalk = FALSE;

    return AArch32.CreateFaultRecord(Fault_Debug, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);
aarch32/translation/faults/AArch32.DomainFault 


// AArch32.DomainFault()
// =====================

FaultRecord AArch32.DomainFault(bits(4) domain, integer level, AccType acctype, boolean iswrite)

    ipaddress = bits(40) UNKNOWN;
    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    secondstage = FALSE;
    s2fslwalk = FALSE;

    return AArch32.CreateFaultRecord(Fault_Domain, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);


aarch32/translation/faults/AArch32.NoFault 


// AArch32.NoFault()
// =================

FaultRecord AArch32.NoFault()

    ipaddress = bits(40) UNKNOWN;
    domain = bits(4) UNKNOWN;
    level = integer UNKNOWN;
    acctype = AccType_NORMAL;
    iswrite = boolean UNKNOWN;
    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    secondstage = FALSE;
    s2fslwalk = FALSE;

    return AArch32.CreateFaultRecord(Fault_None, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);


aarch32/translation/faults/AArch32.PermissionFault 


// AArch32.PermissionFault()
// =========================

FaultRecord AArch32.PermissionFault(bits(40) ipaddress, bits(4) domain, integer level,
                                    AccType acctype, boolean iswrite, boolean secondstage,
                                    boolean s2fslwalk)

    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    return AArch32.CreateFaultRecord(Fault_Permission, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);


aarch32/translation/faults/AArch32.TranslationFault 


// AArch32.TranslationFault()
// ==========================

FaultRecord AArch32.TranslationFault(bits(40) ipaddress, bits(4) domain, integer level,
                                     AccType acctype, boolean iswrite, boolean secondstage,
                                     boolean s2fslwalk)

    extflag = bit UNKNOWN;
    debugmoe = bits(4) UNKNOWN;
    return AArch32.CreateFaultRecord(Fault_Translation, ipaddress, domain, level, acctype, iswrite,
                                     extflag, debugmoe, secondstage, s2fslwalk);


aarch32/translation/translation/AArch32.FirstStageTranslate 


// AArch32.FirstStageTranslate()
// =============================
// Perform a stage 1 translation walk. The function used by Address Translation operations is
// similar except it uses the translation regime specified for the instruction.

AddressDescriptor AArch32.FirstStageTranslate(bits(32) vaddress, AccType acctype, boolean iswrite,
                                              boolean wasaligned, integer size)

    if PSTATE.EL == EL2 then
        s1_enabled = HSCTLR.M == '1';
    elsif HaveEL(EL2) && !IsSecure() then
        tge = (if ELUsingAArch32(EL2) then HCR.TGE else HCR_EL2.TGE);
        dc = (if ELUsingAArch32(EL2) then HCR.DC else HCR_EL2.DC);
        s1_enabled = tge == '0' && dc == '0' && SCTLR.M == '1';
    else
        dc = (if ELUsingAArch32(EL2) then HCR.DC else HCR_EL2.DC);
        s1_enabled = dc == '0' && SCTLR.M == '1';

    ipaddress = bits(40) UNKNOWN;
    secondstage = FALSE;
    s2fslwalk = FALSE;

    if s1_enabled then                         // First stage enabled
        use_long_descriptor_format = PSTATE.EL == EL2 || TTBCR.EAE == '1';
        if use_long_descriptor_format then
            S1 = AArch32.TranslationTableWalkLD(ipaddress, vaddress, acctype, iswrite, secondstage,
                                                s2fslwalk, size);
            permissioncheck = TRUE;  domaincheck = FALSE;
        else
            S1 = AArch32.TranslationTableWalkSD(vaddress, acctype, iswrite, size);
            permissioncheck = TRUE;  domaincheck = TRUE;
    else
        S1 = AArch32.TranslateAddressS1Off(vaddress, acctype, iswrite);
        permissioncheck = FALSE;  domaincheck = FALSE;

    // Check for unaligned data accesses to Device memory
    if ((!wasaligned && acctype != AccType_IFETCH) || (acctype == AccType_DCZVA))
        && S1.addrdesc.memattrs.type == MemType_Device && !IsFault(S1.addrdesc) then
        S1.addrdesc.fault = AArch32.AlignmentFault(acctype, iswrite, secondstage);
    if !IsFault(S1.addrdesc) && domaincheck then
        (permissioncheck, abort) = AArch32.CheckDomain(S1.domain, vaddress, S1.level, acctype,
                                                       iswrite);
        S1.addrdesc.fault = abort;

    if !IsFault(S1.addrdesc) && permissioncheck then
        S1.addrdesc.fault = AArch32.CheckPermission(S1.perms, vaddress, S1.level,
                                                    S1.domain, S1.addrdesc.paddress.NS,
                                                    acctype, iswrite);

    // Check for instruction fetches from Device memory not marked as execute-never. If there has
    // not been a Permission Fault then the memory is not marked execute-never.
    if (!IsFault(S1.addrdesc) && S1.addrdesc.memattrs.type == MemType_Device &&
        acctype == AccType_IFETCH) then
        S1.addrdesc = AArch32.InstructionDevice(S1.addrdesc, vaddress, ipaddress, S1.level,
                                                S1.domain, acctype, iswrite,
                                                secondstage, s2fslwalk);

    return S1.addrdesc;


aarch32/translation/translation/AArch32.FullTranslate 


// AArch32.FullTranslate()
// =======================
// Perform both stage 1 and stage 2 translation walks for the current translation regime. The
// function used by Address Translation operations is similar except it uses the translation
// regime specified for the instruction.

AddressDescriptor AArch32.FullTranslate(bits(32) vaddress, AccType acctype, boolean iswrite,
                                        boolean wasaligned, integer size)

    // First Stage Translation
    S1 = AArch32.FirstStageTranslate(vaddress, acctype, iswrite, wasaligned, size);
    if !IsFault(S1) && HasS2Translation() then
        s2fslwalk = FALSE;
        result = AArch32.SecondStageTranslate(S1, vaddress, acctype, iswrite, wasaligned, s2fslwalk,
                                              size);
    else
        result = S1;

    return result;


aarch32/translation/translation/AArch32.SecondStageTranslate 


// AArch32.SecondStageTranslate()
// ==============================
// Perform a stage 2 translation walk. The function used by Address Translation operations is
// similar except it uses the translation regime specified for the instruction.

AddressDescriptor AArch32.SecondStageTranslate(AddressDescriptor S1, bits(32) vaddress,
                                               AccType acctype, boolean iswrite, boolean wasaligned,
                                               boolean s2fslwalk, integer size)
    assert HasS2Translation();
    assert IsZero(S1.paddress.physicaladdress<47:40>);
    if !ELUsingAArch32(EL2) then
        return AArch64.SecondStageTranslate(S1, ZeroExtend(vaddress, 64), acctype, iswrite,
                                            wasaligned, s2fslwalk, size);

    s2_enabled = HCR.VM == '1' || HCR.DC == '1';
    secondstage = TRUE;

    if s2_enabled then                         // Second stage enabled
        ipaddress = S1.paddress.physicaladdress<39:0>;

        S2 = AArch32.TranslationTableWalkLD(ipaddress, vaddress, acctype, iswrite, secondstage,
                                            s2fslwalk, size);

        // Check for unaligned data accesses to Device memory
        if ((!wasaligned && acctype != AccType_IFETCH) || (acctype == AccType_DCZVA))
            && S2.addrdesc.memattrs.type == MemType_Device && !IsFault(S2.addrdesc) then
            S2.addrdesc.fault = AArch32.AlignmentFault(acctype, iswrite, secondstage);

        if !IsFault(S2.addrdesc) then
            S2.addrdesc.fault = AArch32.CheckS2Permission(S2.perms, vaddress, ipaddress, S2.level,
                                                          acctype, iswrite, s2fslwalk);

        // Check for instruction fetches from Device memory not marked as execute-never. As there
        // has not been a Permission Fault then the memory is not marked execute-never.
        if (!s2fslwalk && !IsFault(S2.addrdesc) && S2.addrdesc.memattrs.type == MemType_Device &&
            acctype == AccType_IFETCH) then
            domain = bits(4) UNKNOWN;
            S2.addrdesc = AArch32.InstructionDevice(S2.addrdesc, vaddress, ipaddress, S2.level,
                                                    domain, acctype, iswrite,
                                                    secondstage, s2fslwalk);

        // Check for protected table walk
        if (s2fslwalk && !IsFault(S2.addrdesc) && HCR.PTW == '1' &&
            S2.addrdesc.memattrs.type == MemType_Device) then
            domain = bits(4) UNKNOWN;
            S2.addrdesc.fault = AArch32.PermissionFault(ipaddress, domain, S2.level, acctype,
                                                        iswrite, secondstage, s2fslwalk);

        result = CombineS1S2Desc(S1, S2.addrdesc);
    else
        result = S1;

    return result;


aarch32/translation/translation/AArch32.SecondStageWalk 


// AArch32.SecondStageWalk()
// =========================
// Perform a stage 2 translation on a stage 1 translation page table walk access.

AddressDescriptor AArch32.SecondStageWalk(AddressDescriptor S1, bits(32) vaddress, AccType acctype,
                                          boolean iswrite, integer size)

    assert HasS2Translation();

    s2fslwalk = TRUE;
    wasaligned = TRUE;
    return AArch32.SecondStageTranslate(S1, vaddress, acctype, iswrite, wasaligned, s2fslwalk,
                                        size);
aarch32/translation/translation/AArch32.TranslateAddress


// AArch32.TranslateAddress()
// ==========================
// Main entry point for translating an address

AddressDescriptor AArch32.TranslateAddress(bits(32) vaddress, AccType acctype, boolean iswrite,
                                           boolean wasaligned, integer size)

    if !ELUsingAArch32(S1TranslationRegime()) then
        return AArch64.TranslateAddress(ZeroExtend(vaddress, 64), acctype, iswrite, wasaligned,
                                        size);
    result = AArch32.FullTranslate(vaddress, acctype, iswrite, wasaligned, size);

    if !(acctype IN {AccType_PTW, AccType_IC, AccType_AT}) && !IsFault(result) then
        result.fault = AArch32.CheckDebug(vaddress, acctype, iswrite, size);

    // Update virtual address for abort functions
    result.vaddress = ZeroExtend(vaddress);

    return result;


aarch32/translation/walk/AArch32.TranslationTableWalkLD 


// AArch32.TranslationTableWalkLD()
// ================================
// Returns a result of a translation table walk using the Long-descriptor format
//
// Implementations might cache information from memory in any number of non-coherent TLB
// caching structures, and so avoid memory accesses that have been expressed in this
// pseudocode. The use of such TLBs is not expressed in this pseudocode.

TLBRecord AArch32.TranslationTableWalkLD(bits(40) ipaddress, bits(32) vaddress,
                                         AccType acctype, boolean iswrite, boolean secondstage,
                                         boolean s2fslwalk, integer size)
    if !secondstage then
        assert ELUsingAArch32(S1TranslationRegime());
    else
        assert HaveEL(EL2) && !IsSecure() && ELUsingAArch32(EL2) && HasS2Translation();

    TLBRecord result;
    AddressDescriptor descaddr;
    bits(64) baseregister;
    bits(40) inputaddr;        // Input Address is 'vaddress' for stage 1, 'ipaddress' for stage 2

    domain = bits(4) UNKNOWN;

    descaddr.memattrs.type = MemType_Normal;

    // Fixed parameters for the page table walk:
    //  grainsize = Log2(Size of Table)         - Size of Table is 4KB in AArch32
    //  stride = Log2(Address per Level)        - Bits of address consumed at each level
    constant integer grainsize = 12;                    // Log2(4KB page size)
    constant integer stride = grainsize - 3;            // Log2(page size / 8 bytes)

    // Derived parameters for the page table walk:
    //  inputsize = Log2(Size of Input Address) - Input Address size in bits
    //  level = Level to start walk from
    // This means that the number of levels after start level = 3-level


    if !secondstage then
        // First stage translation
        inputaddr = ZeroExtend(vaddress);
        if PSTATE.EL == EL2 then
            inputsize = 32 - UInt(HTCR.T0SZ);
            basefound = inputsize == 32 || IsZero(inputaddr<31:inputsize>);
            disabled = FALSE;
            baseregister = HTTBR;
            descaddr.memattrs = WalkAttrDecode(HTCR.SH0, HTCR.ORGN0, HTCR.IRGN0, secondstage);
            reversedescriptors = HSCTLR.EE == '1';
            lookupsecure = FALSE;
            singlepriv = TRUE;
        else
            basefound = FALSE;
            disabled = FALSE;
            t0size = UInt(TTBCR.T0SZ);
            if t0size == 0 || IsZero(inputaddr<31:(32-t0size)>) then
                inputsize = 32 - t0size;
                basefound = TRUE;
                disabled = TTBCR.EPD0 == '1';
                baseregister = TTBR0;
                descaddr.memattrs = WalkAttrDecode(TTBCR.SH0, TTBCR.ORGN0, TTBCR.IRGN0, secondstage);
            t1size = UInt(TTBCR.T1SZ);
            if (t1size == 0 && !basefound) || (t1size > 0 && IsOnes(inputaddr<31:(32-t1size)>)) then
                inputsize = 32 - t1size;
                basefound = TRUE;
                disabled = TTBCR.EPD1 == '1';
                baseregister = TTBR1;
                descaddr.memattrs = WalkAttrDecode(TTBCR.SH1, TTBCR.ORGN1, TTBCR.IRGN1, secondstage);
            reversedescriptors = SCTLR.EE == '1';
            lookupsecure = IsSecure();
            singlepriv = FALSE;
        // The starting level is the number of strides needed to consume the input address
        level = 4 - RoundUp(Real(inputsize - grainsize) / Real(stride));





    else
        // Second stage translation
        inputaddr = ipaddress;
        inputsize = 32 - SInt(VTCR.T0SZ);
        // VTCR.S must match VTCR.T0SZ[3]
        if VTCR.S != VTCR.T0SZ<3> then
            (-, inputsize) = ConstrainUnpredictableInteger(32-7, 32+8);
        basefound = inputsize == 40 || IsZero(inputaddr<39:inputsize>);
        disabled = FALSE;
        baseregister = VTTBR;
        descaddr.memattrs = WalkAttrDecode(VTCR.IRGN0, VTCR.ORGN0, VTCR.SH0, secondstage);
        reversedescriptors = HSCTLR.EE == '1';
        lookupsecure = FALSE;
        singlepriv = TRUE;

        startlevel = UInt(VTCR.SL0);
        level = 2 - startlevel;
        if level <= 0 then basefound = FALSE;

        // Number of entries in the starting level table =
        //     (Size of Input Address)/((Address per level)^(Num levels remaining)*(Size of Table))
        startsizecheck = inputsize - ((3 - level)*stride + grainsize); // Log2(Num of entries)

        // Check for starting level table with fewer than 2 entries or longer than 16 pages.
        // Lower bound check is:  startsizecheck < Log2(2 entries)
        //   That is, VTCR.SL0 == '00' and SInt(VTCR.T0SZ) > 1, Size of Input Address < 2^31 bytes
        // Upper bound check is:  startsizecheck > Log2(pagesize/8*16)
        //   That is, VTCR.SL0 == '01' and SInt(VTCR.T0SZ) < -2, Size of Input Address > 2^34 bytes
        if startsizecheck < 1 || startsizecheck > stride + 4 then basefound = FALSE;


    if !basefound || disabled then
        level = 1;           // AArch64 reports this as a level 0 fault
        result.addrdesc.fault = AArch32.TranslationFault(ipaddress, domain, level, acctype, iswrite,
                                                         secondstage, s2fslwalk);
        return result;

    if !IsZero(baseregister<47:40>) then
        level = 0;
        result.addrdesc.fault = AArch32.AddressSizeFault(ipaddress, domain, level, acctype, iswrite,
                                                         secondstage, s2fslwalk);
        return result;

    // Bottom bound of the Base address is:
    //     Log2(8 bytes per entry)+Log2(Number of entries in starting level table)
    // Number of entries in starting level table =
    //     (Size of Input Address)/((Address per level)^(Num levels remaining)*(Size of Table))
    baselowerbound = 3 + inputsize - ((3-level)*stride + grainsize);  // Log2(Num of entries*8)
    baseaddress = baseregister<39:baselowerbound>:Zeros(baselowerbound);

    ns_table = if lookupsecure then '0' else '1';
    ap_table = '00';
    xn_table = '0';
    pxn_table = '0';

    addrselecttop = inputsize - 1;

    repeat
        addrselectbottom = (3-level)*stride + grainsize;

        bits(40) index = ZeroExtend(inputaddr<addrselecttop:addrselectbottom>:'000');
        descaddr.paddress.physicaladdress = ZeroExtend(baseaddress OR index);
        descaddr.paddress.NS = ns_table;

        // If there are two stages of translation, then the first stage table walk addresses
        // are themselves subject to translation
        if secondstage || !HasS2Translation() then
            descaddr2 = descaddr;
        else
            descaddr2 = AArch32.SecondStageWalk(descaddr, vaddress, acctype, iswrite, 8);
            // Check for a fault on the stage 2 walk
            if IsFault(descaddr2) then
                result.addrdesc.fault = descaddr2.fault;
                return result;

        // Update virtual address for abort functions
        descaddr2.vaddress = ZeroExtend(vaddress);

        desc = _Mem[descaddr2, 8, AccType_PTW];
        if reversedescriptors then desc = BigEndianReverse(desc);

        if desc<0> == '0' || (desc<1:0> == '01' && level == 3) then
            // Fault (00), Reserved (10), or Block (01) at level 3
            result.addrdesc.fault = AArch32.TranslationFault(ipaddress, domain, level, acctype,
                                                             iswrite, secondstage, s2fslwalk);
            return result;

        // Valid Block, Page, or Table entry
        if desc<1:0> == '01' || level == 3 then                 // Block (01) or Page (11)
            blocktranslate = TRUE;
        else                                                    // Table (11)
            if !IsZero(desc<47:40>) then
                result.addrdesc.fault = AArch32.AddressSizeFault(ipaddress, domain, level, acctype,
                                                                 iswrite, secondstage, s2fslwalk);
                return result;

            baseaddress = desc<39:grainsize>:Zeros(grainsize);
            if !secondstage then
                // Unpack the upper and lower table attributes
                ns_table    = ns_table    OR desc<63>;
                ap_table<1> = ap_table<1> OR desc<62>;       // read-only
                xn_table    = xn_table    OR desc<60>;
                // pxn_table and ap_table[0] apply only in EL1&0 translation regimes
                if !singlepriv then
                    ap_table<0> = ap_table<0> OR desc<61>;   // privileged
                    pxn_table = pxn_table OR desc<59>;

            level = level + 1;
            addrselecttop = addrselectbottom - 1;
            blocktranslate = FALSE;
    until blocktranslate;


    // Check the output address is inside the supported range
    if !IsZero(desc<47:40>) then
        result.addrdesc.fault = AArch32.AddressSizeFault(ipaddress, domain, level, acctype,
                                                         iswrite, secondstage, s2fslwalk);
        return result;

    // Unpack the descriptor into address and upper and lower block attributes
    outputaddress = desc<39:addrselectbottom>:inputaddr<addrselectbottom-1:0>;

    // Check the access flag
    if desc<10> == '0' then
        result.addrdesc.fault = AArch32.AccessFlagFault(ipaddress, domain, level, acctype,
                                                        iswrite, secondstage, s2fslwalk);
        return result;

    xn = desc<54>;
    pxn = desc<53>;
    contiguousbit = desc<52>;
    nG = desc<11>;
    sh = desc<9:8>;
    ap = desc<7:6>:'1';
    memattr = desc<5:2>;                       // AttrIndx and NS bit in stage 1

    result.domain = bits(4) UNKNOWN;           // Domains not used
    result.level = level;
    result.blocksize = 2^((3-level)*stride + grainsize);

    // Stage 1 translation regimes also inherit attributes from the tables
    if !secondstage then
        result.perms.xn = xn OR xn_table;
        result.perms.ap<2> = ap<2> OR ap_table<1>;          // Force read-only
        // PXN, nG and AP[1] apply only in EL1&0 stage 1 translation regimes
        if !singlepriv then
            result.perms.ap<1> = ap<1> AND NOT(ap_table<0>);    // Force privileged only
            result.perms.pxn = pxn OR pxn_table;
            // Pages from Non-secure tables are marked non-global in Secure EL1&0
            if IsSecure() then
                result.nG = nG OR ns_table;
            else
                result.nG = nG;
        else
            result.perms.ap<1> = '1';
            result.perms.pxn = '0';
            result.nG = '0';
        result.perms.ap<0> = '1';
        result.addrdesc.memattrs = AArch32.S1AttrDecode(sh, memattr<2:0>, acctype);
        result.addrdesc.paddress.NS = memattr<3> OR ns_table;
    else
        result.perms.ap<2:1> = ap<2:1>;
        result.perms.ap<0> = '1';
        result.perms.xn = xn;
        result.perms.pxn = '0';
        result.nG = '0';
        result.addrdesc.memattrs = S2AttrDecode(sh, memattr, acctype);
        result.addrdesc.paddress.NS = '1';

    result.addrdesc.paddress.physicaladdress = ZeroExtend(outputaddress);
    result.addrdesc.fault = AArch32.NoFault();
    result.contiguous = contiguousbit == '1';

    return result;
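
The arithmetic that the Long-descriptor walk uses for the stage 1 starting level, the descriptor index at each level, and the resulting block size can be worked through with a short Python sketch. It is an illustration only, not part of the architecture pseudocode: the function names are hypothetical, addresses are plain integers, and the constants simply mirror grainsize = 12 and stride = 9 from the listing above.

```python
GRAINSIZE = 12            # Log2(4KB), as in the pseudocode
STRIDE = GRAINSIZE - 3    # 9 address bits resolved per level


def stage1_start_level(inputsize: int) -> int:
    """level = 4 - RoundUp((inputsize - grainsize) / stride), using integer ceiling division."""
    steps = -(-(inputsize - GRAINSIZE) // STRIDE)
    return 4 - steps


def descriptor_offset(inputaddr: int, level: int, addrselecttop: int) -> int:
    """Byte offset into the table: inputaddr<addrselecttop:addrselectbottom>:'000'."""
    addrselectbottom = (3 - level) * STRIDE + GRAINSIZE
    width = addrselecttop - addrselectbottom + 1
    field = (inputaddr >> addrselectbottom) & ((1 << width) - 1)
    return field << 3     # each descriptor is 8 bytes


def block_size(level: int) -> int:
    """result.blocksize = 2^((3-level)*stride + grainsize)."""
    return 1 << ((3 - level) * STRIDE + GRAINSIZE)
```

With a 32-bit input address the walk starts at level 1, where only bits [31:30] select the descriptor. With a 30-bit input address it starts at level 2, where each entry maps a 2MB block.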


aarch32/translation/walk/AArch32.TranslationTableWalkSD 


// AArch32.TranslationTableWalkSD()
// ================================
// Returns a result of a translation table walk using the Short-descriptor format
//
// Implementations might cache information from memory in any number of non-coherent TLB
// caching structures, and so avoid memory accesses that have been expressed in this
// pseudocode. The use of such TLBs is not expressed in this pseudocode.

TLBRecord AArch32.TranslationTableWalkSD(bits(32) vaddress, AccType acctype, boolean iswrite,
                                         integer size)
    assert ELUsingAArch32(S1TranslationRegime());

    // This is only called when address translation is enabled
    TLBRecord         result;
    AddressDescriptor l1descaddr;
    AddressDescriptor l2descaddr;
    bits(40)          outputaddress;

    // Variables for Abort functions
    ipaddress = bits(40) UNKNOWN;
    secondstage = FALSE;
    s2fslwalk = FALSE;

    // Default setting of the domain
    domain = bits(4) UNKNOWN;

    // Determine correct Translation Table Base Register to use.
    bits(64) ttbr;
    n = UInt(TTBCR.N);
    if n == 0 || IsZero(vaddress<31:(32-n)>) then
        ttbr = TTBR0;
        disabled = (TTBCR.PD0 == '1');
    else
        ttbr = TTBR1;
        disabled = (TTBCR.PD1 == '1');
        n = 0;  // TTBR1 translation always works like N=0 TTBR0 translation

    // Check this Translation Table Base Register is not disabled.
    if disabled then
        level = 1;
        result.addrdesc.fault = AArch32.TranslationFault(ipaddress, domain, level, acctype, iswrite,
                                                         secondstage, s2fslwalk);
        return result;


    // Obtain descriptor from initial lookup.
    l1descaddr.paddress.physicaladdress = ZeroExtend(ttbr<31:14-n>:vaddress<31-n:20>:'00');
    l1descaddr.paddress.NS = if IsSecure() then '0' else '1';
    IRGN = ttbr<0>:ttbr<6>;             // TTBR.IRGN
    RGN = ttbr<4:3>;                    // TTBR.RGN
    SH = ttbr<1>:ttbr<5>;               // TTBR.S:TTBR.NOS
    l1descaddr.memattrs = WalkAttrDecode(SH, RGN, IRGN, secondstage);

    if !HaveEL(EL2) || IsSecure() then
        // if only 1 stage of translation
        l1descaddr2 = l1descaddr;
    else
        l1descaddr2 = AArch32.SecondStageWalk(l1descaddr, vaddress, acctype, iswrite, 4);
        // Check for a fault on the stage 2 walk
        if IsFault(l1descaddr2) then
            result.addrdesc.fault = l1descaddr2.fault;
            return result;

    // Update virtual address for abort functions
    l1descaddr2.vaddress = ZeroExtend(vaddress);

    l1desc = _Mem[l1descaddr2, 4, AccType_PTW];
    if SCTLR.EE == '1' then l1desc = BigEndianReverse(l1desc);

    // Process descriptor from initial lookup.
    case l1desc<1:0> of





        when '00'                                              // Fault, Reserved
            level = 1;
            result.addrdesc.fault = AArch32.TranslationFault(ipaddress, domain, level, acctype,
                                                             iswrite, secondstage, s2fslwalk);
            return result;

        when '01'                                              // Large page or Small page
            domain = l1desc<8:5>;
            level = 2;
            pxn = l1desc<2>;
            NS = l1desc<3>;

            // Obtain descriptor from level 2 lookup.
            l2descaddr.paddress.physicaladdress = ZeroExtend(l1desc<31:10>:vaddress<19:12>:'00');
            l2descaddr.paddress.NS = if IsSecure() then '0' else '1';
            l2descaddr.memattrs = l1descaddr.memattrs;

            if !HaveEL(EL2) || IsSecure() then
                // if only 1 stage of translation
                l2descaddr2 = l2descaddr;
            else
                l2descaddr2 = AArch32.SecondStageWalk(l2descaddr, vaddress, acctype, iswrite, 4);
                // Check for a fault on the stage 2 walk
                if IsFault(l2descaddr2) then
                    result.addrdesc.fault = l2descaddr2.fault;
                    return result;

            // Update virtual address for abort functions
            l2descaddr2.vaddress = ZeroExtend(vaddress);

            l2desc = _Mem[l2descaddr2, 4, AccType_PTW];
            if SCTLR.EE == '1' then l2desc = BigEndianReverse(l2desc);

            // Process descriptor from level 2 lookup.
            if l2desc<1:0> == '00' then
                result.addrdesc.fault = AArch32.TranslationFault(ipaddress, domain, level, acctype,
                                                                 iswrite, secondstage, s2fslwalk);
                return result;

            nG = l2desc<11>;
            S = l2desc<10>;
            ap = l2desc<9,5:4>;

            if SCTLR.AFE == '1' && l2desc<4> == '0' then
                // Hardware access to the Access Flag is not supported in ARMv8
                result.addrdesc.fault = AArch32.AccessFlagFault(ipaddress, domain, level, acctype,
                                                                iswrite, secondstage, s2fslwalk);
                return result;

            if l2desc<1> == '0' then                           // Large page
                xn = l2desc<15>;
                tex = l2desc<14:12>;
                c = l2desc<3>;
                b = l2desc<2>;
                blocksize = 64;
                outputaddress = ZeroExtend(l2desc<31:16>:vaddress<15:0>);
            else                                               // Small page
                tex = l2desc<8:6>;
                c = l2desc<3>;
                b = l2desc<2>;
                xn = l2desc<0>;
                blocksize = 4;
                outputaddress = ZeroExtend(l2desc<31:12>:vaddress<11:0>);


        when '1x'                                              // Section or Supersection
            NS = l1desc<19>;
            nG = l1desc<17>;
            S = l1desc<16>;
            ap = l1desc<15,11:10>;
            tex = l1desc<14:12>;
            xn = l1desc<4>;
            c = l1desc<3>;
            b = l1desc<2>;
            pxn = l1desc<0>;
            level = 1;

            if SCTLR.AFE == '1' && l1desc<10> == '0' then
                // Hardware management of the Access Flag is not supported in ARMv8
                result.addrdesc.fault = AArch32.AccessFlagFault(ipaddress, domain, level, acctype,
                                                                iswrite, secondstage, s2fslwalk);
                return result;

            if l1desc<18> == '0' then                          // Section
                domain = l1desc<8:5>;
                blocksize = 1024;
                outputaddress = ZeroExtend(l1desc<31:20>:vaddress<19:0>);
            else                                               // Supersection
                domain = '0000';
                blocksize = 16384;
                outputaddress = l1desc<8:5>:l1desc<23:20>:l1desc<31:24>:vaddress<23:0>;


    // Decode the TEX, C, B and S bits to produce the TLBRecord's memory attributes
    if SCTLR.TRE == '0' then
        if RemapRegsHaveResetValues() then
            result.addrdesc.memattrs = AArch32.DefaultTEXDecode(tex, c, b, S, acctype);
        else
            result.addrdesc.memattrs = MemoryAttributes IMPLEMENTATION_DEFINED;
    else
        result.addrdesc.memattrs = AArch32.RemappedTEXDecode(tex, c, b, S, acctype);

    // Set the rest of the TLBRecord, try to add it to the TLB, and return it.
    result.perms.ap = ap;
    result.perms.xn = xn;
    result.perms.pxn = pxn;
    result.nG = nG;
    result.domain = domain;
    result.level = level;
    result.blocksize = blocksize;
    result.addrdesc.paddress.physicaladdress = ZeroExtend(outputaddress);
    result.addrdesc.paddress.NS = if IsSecure() then NS else '1';
    result.addrdesc.fault = AArch32.NoFault();

    return result;
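
The bit concatenations that the Short-descriptor walk uses to form descriptor addresses and the Small page output address can be illustrated in Python. This is a sketch under stated assumptions, not part of the architecture pseudocode: the function names are hypothetical, registers and descriptors are modelled as plain 32-bit integers, and `n` is the value of TTBCR.N already selected by the walk (0 when TTBR1 is used).

```python
def l1_descriptor_address(ttbr: int, vaddress: int, n: int) -> int:
    """ttbr<31:14-n> : vaddress<31-n:20> : '00'"""
    base = ttbr & 0xFFFFFFFF & ~((1 << (14 - n)) - 1)
    index = (vaddress >> 20) & ((1 << (12 - n)) - 1)
    return base | (index << 2)      # level 1 descriptors are 4 bytes each


def l2_descriptor_address(l1desc: int, vaddress: int) -> int:
    """l1desc<31:10> : vaddress<19:12> : '00'"""
    return (l1desc & 0xFFFFFC00) | (((vaddress >> 12) & 0xFF) << 2)


def small_page_output_address(l2desc: int, vaddress: int) -> int:
    """l2desc<31:12> : vaddress<11:0>"""
    return (l2desc & 0xFFFFF000) | (vaddress & 0xFFF)
```

A larger TTBCR.N shrinks the level 1 index field and relaxes the alignment of the table base correspondingly, which is why both masks depend on `n`.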


aarch32/translation/walk/RemapRegsHaveResetValues 


boolean RemapRegsHaveResetValues(); 
J1.3 Shared pseudocode


This section holds the pseudocode that is common to execution in AArch64 state and in AArch32 state. Functions 
listed in this section are identified only by a FunctionName, without an AArch64. or AArch32. prefix. This section is 
organized by functional groups, with the functional groups being indicated by hierarchical path names, for example 
shared/debug/DebugTarget. 


Note

Currently, the following pseudocode functions in this group reference C program functions, that are not linked from
the pseudocode, to describe their operation:

UnsignedRecipEstimate()
        Calls the C function recip_estimate(). This function is defined in Floating-point reciprocal
        estimate and step on page E1-2308.

UnsignedRSqrtEstimate()
        Calls the C function recip_sqrt_estimate(). This function is defined in Floating-point reciprocal
        square root estimate and step on page E1-2309.





The top-level sections of the shared pseudocode hierarchy are: 
• shared/debug.
• shared/exceptions on page J1-5391.
• shared/functions on page J1-5393.
• shared/translation on page J1-5447.





J1.3.1 shared/debug 
This section includes the following pseudocode functions: 
• shared/debug/ClearStickyErrors/ClearStickyErrors on page J1-5375.
• shared/debug/DebugTarget/DebugTarget on page J1-5375.
• shared/debug/DebugTarget/DebugTargetFrom on page J1-5376.
• shared/debug/DoubleLockStatus/DoubleLockStatus on page J1-5376.
• shared/debug/FindWatchpoint/FindWatchpoint on page J1-5376.
• shared/debug/authentication/AllowExternalDebugAccess on page J1-5376.
• shared/debug/authentication/AllowExternalPMUAccess on page J1-5377.
• shared/debug/authentication/Debug_authentication on page J1-5377.
• shared/debug/authentication/ExternalInvasiveDebugEnabled on page J1-5377.
• shared/debug/authentication/ExternalNoninvasiveDebugAllowed on page J1-5377.
• shared/debug/authentication/ExternalNoninvasiveDebugEnabled on page J1-5377.
• shared/debug/authentication/ExternalSecureInvasiveDebugEnabled on page J1-5378.
• shared/debug/authentication/ExternalSecureNoninvasiveDebugEnabled on page J1-5378.
• shared/debug/cti/CTI_SetEventLevel on page J1-5378.
• shared/debug/cti/CTI_SignalEvent on page J1-5378.
• shared/debug/cti/CrossTrigger on page J1-5378.
• shared/debug/dccanditr/CheckForDCCInterrupts on page J1-5378.
• shared/debug/dccanditr/DBGDTRRX_EL0 on page J1-5379.
• shared/debug/dccanditr/DBGDTRTX_EL0 on page J1-5380.
• shared/debug/dccanditr/DBGDTR_EL0 on page J1-5381.
• shared/debug/dccanditr/DTR on page J1-5381.
• shared/debug/dccanditr/EDITR on page J1-5381.
• shared/debug/halting/DCPSInstruction on page J1-5382.
• shared/debug/halting/DRPSInstruction on page J1-5383.
• shared/debug/halting/DebugHalt on page J1-5383.
• shared/debug/halting/DisableITRAndResumeInstructionPrefetch on page J1-5383.
• shared/debug/halting/ExecuteA64 on page J1-5383.
• shared/debug/halting/ExecuteT32 on page J1-5383.
• shared/debug/halting/ExitDebugState on page J1-5383.
• shared/debug/halting/Halt on page J1-5384.
• shared/debug/halting/HaltOnBreakpointOrWatchpoint on page J1-5385.
• shared/debug/halting/Halted on page J1-5385.
• shared/debug/halting/HaltingAllowed on page J1-5385.
• shared/debug/halting/Restarting on page J1-5385.
• shared/debug/halting/StopInstructionPrefetchAndEnableITR on page J1-5385.
• shared/debug/halting/UpdateEDSCRFields on page J1-5385.
• shared/debug/haltingevents/CheckExceptionCatch on page J1-5386.
• shared/debug/haltingevents/CheckHaltingStep on page J1-5386.
• shared/debug/haltingevents/CheckOSUnlockCatch on page J1-5386.
• shared/debug/haltingevents/CheckPendingOSUnlockCatch on page J1-5387.
• shared/debug/haltingevents/CheckPendingResetCatch on page J1-5387.
• shared/debug/haltingevents/CheckResetCatch on page J1-5387.
• shared/debug/haltingevents/CheckSoftwareAccessToDebugRegisters on page J1-5387.
• shared/debug/haltingevents/ExternalDebugRequest on page J1-5387.
• shared/debug/haltingevents/HaltingStep_DidNotStep on page J1-5387.
• shared/debug/haltingevents/HaltingStep_SteppedEX on page J1-5388.
• shared/debug/haltingevents/RunHaltingStep on page J1-5388.
• shared/debug/interrupts/ExternalDebugInterruptsDisabled on page J1-5388.
• shared/debug/interrupts/InterruptID on page J1-5388.
• shared/debug/interrupts/SetInterruptRequestLevel on page J1-5388.
• shared/debug/samplebasedprofiling/CreatePCSample on page J1-5389.
• shared/debug/samplebasedprofiling/EDPCSRlo on page J1-5389.
• shared/debug/samplebasedprofiling/PCSample on page J1-5389.
• shared/debug/softwarestep/CheckSoftwareStep on page J1-5390.
• shared/debug/softwarestep/DebugExceptionReturnSS on page J1-5390.
• shared/debug/softwarestep/SSAdvance on page J1-5390.
• shared/debug/softwarestep/SoftwareStep_DidNotStep on page J1-5391.
• shared/debug/softwarestep/SoftwareStep_SteppedEX on page J1-5391.


shared/debug/ClearStickyErrors/ClearStickyErrors

// ClearStickyErrors()
// ===================

ClearStickyErrors()
    EDSCR.TXU = '0';            // Clear TX underrun flag
    EDSCR.RXO = '0';            // Clear RX overrun flag
    if Halted() then            // in Debug state
        EDSCR.ITO = '0';        // Clear ITR overrun flag
    EDSCR.ERR = '0';            // Clear cumulative error flag
    return;


shared/debug/DebugTarget/DebugTarget 

// DebugTarget()
// =============
// Returns the debug exception target Exception level






bits(2) DebugTarget()
    secure = IsSecure();
    return DebugTargetFrom(secure);


shared/debug/DebugTarget/DebugTargetFrom

// DebugTargetFrom()
// =================

bits(2) DebugTargetFrom(boolean secure)
    if HaveEL(EL2) && !secure then
        if ELUsingAArch32(EL2) then
            route_to_el2 = (HDCR.TDE == '1' || HCR.TGE == '1');
        else
            route_to_el2 = (MDCR_EL2.TDE == '1' || HCR_EL2.TGE == '1');
    else
        route_to_el2 = FALSE;

    if route_to_el2 then
        target = EL2;
    elsif HaveEL(EL3) && HighestELUsingAArch32() && secure then
        target = EL3;
    else
        target = EL1;

    return target;
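
The routing decision in DebugTargetFrom() can be tried out with a small Python model. This is an illustrative sketch only, not part of the architecture pseudocode: the `RoutingState` class and its field names are hypothetical, and the AArch32 and AArch64 register views (HDCR/MDCR_EL2, HCR/HCR_EL2) are assumed to be folded into single `tde` and `tge` flags.

```python
from dataclasses import dataclass

EL1, EL2, EL3 = 1, 2, 3


@dataclass
class RoutingState:
    """Hypothetical snapshot of the inputs that DebugTargetFrom() consults."""
    have_el2: bool            # HaveEL(EL2)
    have_el3: bool            # HaveEL(EL3)
    highest_el_aarch32: bool  # HighestELUsingAArch32()
    tde: bool                 # HDCR.TDE or MDCR_EL2.TDE, whichever view applies
    tge: bool                 # HCR.TGE or HCR_EL2.TGE, whichever view applies


def debug_target_from(state: RoutingState, secure: bool) -> int:
    """Return the Exception level that debug exceptions are routed to."""
    # Routing to EL2 is only considered in Non-secure state with EL2 implemented.
    route_to_el2 = state.have_el2 and not secure and (state.tde or state.tge)
    if route_to_el2:
        return EL2
    # Secure debug exceptions go to EL3 only when the highest EL uses AArch32.
    if state.have_el3 and state.highest_el_aarch32 and secure:
        return EL3
    return EL1
```

For example, with EL2 implemented and TDE set, a Non-secure caller is routed to EL2, while a Secure caller on the same configuration falls through to EL1 or EL3.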


shared/debug/DoubleLockStatus/DoubleLockStatus

// DoubleLockStatus()
// ==================
// Returns the state of the OS Double Lock.
//    FALSE if OSDLR_EL1.DLK == 0 or DBGPRCR_EL1.CORENPDRQ == 1 or the PE is in Debug state.
//    TRUE if OSDLR_EL1.DLK == 1 and DBGPRCR_EL1.CORENPDRQ == 0 and the PE is in Non-debug state.

boolean DoubleLockStatus()
    if ELUsingAArch32(EL1) then
        return DBGOSDLR.DLK == '1' && DBGPRCR.CORENPDRQ == '0' && !Halted();
    else
        return OSDLR_EL1.DLK == '1' && DBGPRCR_EL1.CORENPDRQ == '0' && !Halted();


shared/debug/FindWatchpoint/FindWatchpoint

// FindWatchpoint()
// ================

integer FindWatchpoint()
    address = FAR[];
    base = Align(address, ZVAGranuleSize());
    limit = base + ZVAGranuleSize();

    repeat
        for i = 0 to UInt(ID_AA64DFR0_EL1.WRPs)
            if WatchpointByteMatch(i, address) then     // Candidate found
                return i;
        address = address + 1;
        if address == limit then address = base;        // Wrap round, as this must be a DC ZVA
    while address != FAR[];
    return -1;                                          // No candidate found (should not happen)


shared/debug/authentication/AllowExternalDebugAccess

// AllowExternalDebugAccess()
// ==========================
// Returns the status of EDPRSR.EDAD.

boolean AllowExternalDebugAccess()
    // The access may also be subject to OS lock, power-down, etc.
    if ExternalInvasiveDebugEnabled() then
        if ExternalSecureInvasiveDebugEnabled() then
            return TRUE;
        elsif HaveEL(EL3) then
            return (if ELUsingAArch32(EL3) then SDCR.EDAD else MDCR_EL3.EDAD) == '0';
        else
            return !IsSecure();
    else
        return FALSE;

shared/debug/authentication/AllowExternalPMUAccess

// AllowExternalPMUAccess()
// ========================
// Returns the status of EDPRSR.EPMAD.

boolean AllowExternalPMUAccess()
    // The access may also be subject to OS lock, power-down, etc.
    if ExternalNoninvasiveDebugEnabled() then
        if ExternalSecureNoninvasiveDebugEnabled() then
            return TRUE;
        elsif HaveEL(EL3) then
            return (if ELUsingAArch32(EL3) then SDCR.EPMAD else MDCR_EL3.EPMAD) == '0';
        else
            return !IsSecure();
    else
        return FALSE;


shared/debug/authentication/Debug_authentication 


signal DBGEN; 
signal NIDEN; 
signal SPIDEN; 
signal SPNIDEN; 


shared/debug/authentication/ExternalInvasiveDebugEnabled

// ExternalInvasiveDebugEnabled()
// ==============================

boolean ExternalInvasiveDebugEnabled()
    // In the recommended interface, ExternalInvasiveDebugEnabled returns the state of the DBGEN
    // signal.
    return DBGEN == HIGH;

shared/debug/authentication/ExternalNoninvasiveDebugAllowed

// ExternalNoninvasiveDebugAllowed()
// =================================

boolean ExternalNoninvasiveDebugAllowed()
    // Return TRUE if Trace and Sample-based profiling are allowed
    return (ExternalNoninvasiveDebugEnabled() &&
            (!IsSecure() || ExternalSecureNoninvasiveDebugEnabled() ||
             (ELUsingAArch32(EL1) && PSTATE.EL == EL0 && SDER.SUNIDEN == '1')));

shared/debug/authentication/ExternalNoninvasiveDebugEnabled

// ExternalNoninvasiveDebugEnabled()
// =================================

boolean ExternalNoninvasiveDebugEnabled()
    // In the recommended interface, ExternalNoninvasiveDebugEnabled returns the state of the (DBGEN
    // OR NIDEN) signal.
    return ExternalInvasiveDebugEnabled() || NIDEN == HIGH;

shared/debug/authentication/ExternalSecureInvasiveDebugEnabled

// ExternalSecureInvasiveDebugEnabled()
// ====================================

boolean ExternalSecureInvasiveDebugEnabled()
    // In the recommended interface, ExternalSecureInvasiveDebugEnabled returns the state of the
    // (DBGEN AND SPIDEN) signal.
    // CoreSight allows asserting SPIDEN without also asserting DBGEN, but this is not recommended.
    if !HaveEL(EL3) && !IsSecure() then return FALSE;
    return ExternalInvasiveDebugEnabled() && SPIDEN == HIGH;

shared/debug/authentication/ExternalSecureNoninvasiveDebugEnabled

// ExternalSecureNoninvasiveDebugEnabled()
// =======================================

boolean ExternalSecureNoninvasiveDebugEnabled()
    // In the recommended interface, ExternalSecureNoninvasiveDebugEnabled returns the state of the
    // (DBGEN OR NIDEN) AND (SPIDEN OR SPNIDEN) signal.
    if !HaveEL(EL3) && !IsSecure() then return FALSE;
    return ExternalNoninvasiveDebugEnabled() && (SPIDEN == HIGH || SPNIDEN == HIGH);
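
The four authentication predicates above reduce to simple Boolean combinations of the DBGEN, NIDEN, SPIDEN and SPNIDEN signals in the recommended interface. The following Python sketch restates them for experimentation. It is not architecture pseudocode: `AuthSignals` and the function names are hypothetical, and `have_el3` and `is_secure` are assumed to be supplied by the caller in place of HaveEL(EL3) and IsSecure().

```python
from dataclasses import dataclass


@dataclass
class AuthSignals:
    """Hypothetical holder for the authentication signals (True means HIGH)."""
    dbgen: bool
    niden: bool
    spiden: bool
    spniden: bool


def invasive_enabled(s: AuthSignals) -> bool:
    # DBGEN
    return s.dbgen


def noninvasive_enabled(s: AuthSignals) -> bool:
    # DBGEN OR NIDEN
    return invasive_enabled(s) or s.niden


def secure_invasive_enabled(s: AuthSignals, have_el3: bool, is_secure: bool) -> bool:
    # Without EL3, a Non-secure PE has no Secure state to debug.
    if not have_el3 and not is_secure:
        return False
    # DBGEN AND SPIDEN
    return invasive_enabled(s) and s.spiden


def secure_noninvasive_enabled(s: AuthSignals, have_el3: bool, is_secure: bool) -> bool:
    if not have_el3 and not is_secure:
        return False
    # (DBGEN OR NIDEN) AND (SPIDEN OR SPNIDEN)
    return noninvasive_enabled(s) and (s.spiden or s.spniden)
```

Note that asserting SPIDEN alone enables nothing in this model, which matches the comment that asserting SPIDEN without DBGEN is allowed but not recommended.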


shared/debug/cti/CTI_SetEventLevel 


// Set a Cross Trigger multi-cycle input event trigger to the specified level. 
CTI_SetEventLevel(CrossTriggerIn id, signal level); 


shared/debug/cti/CTI_SignalEvent 


// Signal a discrete event on a Cross Trigger input event trigger. 
CTI_SignalEvent(CrossTriggerIn id); 


shared/debug/cti/CrossTrigger 


enumeration CrossTriggerOut {CrossTriggerOut_DebugRequest, CrossTriggerOut_RestartRequest,
                             CrossTriggerOut_IRQ,          CrossTriggerOut_RSVD3,
                             CrossTriggerOut_TraceExtIn0,  CrossTriggerOut_TraceExtIn1,
                             CrossTriggerOut_TraceExtIn2,  CrossTriggerOut_TraceExtIn3};

enumeration CrossTriggerIn  {CrossTriggerIn_CrossHalt,     CrossTriggerIn_PMUOverflow,
                             CrossTriggerIn_RSVD2,         CrossTriggerIn_RSVD3,
                             CrossTriggerIn_TraceExtOut0,  CrossTriggerIn_TraceExtOut1,
                             CrossTriggerIn_TraceExtOut2,  CrossTriggerIn_TraceExtOut3};


shared/debug/dccanditr/CheckForDCCInterrupts

// CheckForDCCInterrupts()
// =======================

CheckForDCCInterrupts()
    commrx = (EDSCR.RXfull == '1');
    commtx = (EDSCR.TXfull == '0');

    // COMMRX and COMMTX support is optional and not recommended for new designs.
    // SetInterruptRequestLevel(InterruptID_COMMRX, if commrx then HIGH else LOW);
    // SetInterruptRequestLevel(InterruptID_COMMTX, if commtx then HIGH else LOW);

    // The value to be driven onto the common COMMIRQ signal.
    if ELUsingAArch32(EL1) then
        commirq = ((commrx && DBGDCCINT.RX == '1') ||
                   (commtx && DBGDCCINT.TX == '1'));
    else
        commirq = ((commrx && MDCCINT_EL1.RX == '1') ||
                   (commtx && MDCCINT_EL1.TX == '1'));
    SetInterruptRequestLevel(InterruptID_COMMIRQ, if commirq then HIGH else LOW);

    return;


shared/debug/dccanditr/DBGDTRRX_EL0

// DBGDTRRX_EL0[] (external write)
// ===============================
// Called on writes to debug register 0x08C.

DBGDTRRX_EL0[boolean memory_mapped] = bits(32) value

    if EDPRSR<6:5,0> != '001' then                      // Check DLK, OSLK and PU bits
        IMPLEMENTATION_DEFINED "signal slave-generated error";
        return;

    if EDSCR.ERR == '1' then return;                    // Error flag set: ignore write

    // The Software lock is OPTIONAL.
    if memory_mapped && EDLSR.SLK == '1' then return;   // Software lock locked: ignore write

    if EDSCR.RXfull == '1' || (Halted() && EDSCR.MA == '1' && EDSCR.ITE == '0') then
        EDSCR.RXO = '1';  EDSCR.ERR = '1';              // Overrun condition: ignore write
        return;

    EDSCR.RXfull = '1';
    DTRRX = value;

    if Halted() && EDSCR.MA == '1' then
        EDSCR.ITE = '0';                                // See comments in EDITR[] (external write)

        if !UsingAArch32() then
            ExecuteA64(0xD5330501<31:0>);               // A64 "MRS X1,DBGDTRRX_EL0"
            ExecuteA64(0xB8004401<31:0>);               // A64 "STR W1,[X0],#4"
            X[1] = bits(64) UNKNOWN;
        else
            ExecuteT32(0xEE10<15:0> /*hw1*/, 0x1E15<15:0> /*hw2*/);  // T32 "MRS R1,DBGDTRRXint"
            ExecuteT32(0xF840<15:0> /*hw1*/, 0x1B04<15:0> /*hw2*/);  // T32 "STR R1,[R0],#4"
            R[1] = bits(32) UNKNOWN;
        // If the store aborts, the Data Abort exception is taken and EDSCR.ERR is set to 1
        if EDSCR.ERR == '1' then
            EDSCR.RXfull = bit UNKNOWN;
            DBGDTRRX_EL0 = bits(32) UNKNOWN;
        else
            // "MRS X1,DBGDTRRX_EL0" calls DBGDTR_EL0[] (read) which clears RXfull.
            assert EDSCR.RXfull == '0';

        EDSCR.ITE = '1';                                // See comments in EDITR[] (external write)
    return;

// DBGDTRRX_EL0[] (external read)
// ==============================

bits(32) DBGDTRRX_EL0[boolean memory_mapped]
    return DTRRX;

shared/debug/dccanditr/DBGDTRTX_EL0

// DBGDTRTX_EL0[] (external read)
// ==============================
// Called on reads of debug register 0x80.

bits(32) DBGDTRTX_EL0[boolean memory_mapped]

    if EDPRSR<6:5,0> != '001' then                      // Check DLK, OSLK and PU bits
        IMPLEMENTATION_DEFINED "signal slave-generated error";
        return bits(32) UNKNOWN;

    underrun = EDSCR.TXfull == '0' || (Halted() && EDSCR.MA == '1' && EDSCR.ITE == '0');
    value = if underrun then bits(32) UNKNOWN else DTRTX;

    if EDSCR.ERR == '1' then return value;              // Error flag set: no side-effects

    // The Software lock is OPTIONAL.
    if memory_mapped && EDLSR.SLK == '1' then           // Software lock locked: no side-effects
        return value;

    if underrun then
        EDSCR.TXU = '1';  EDSCR.ERR = '1';              // Underrun condition: block side-effects
        return value;                                   // Return UNKNOWN

    EDSCR.TXfull = '0';
    if Halted() && EDSCR.MA == '1' then
        EDSCR.ITE = '0';                                // See comments in EDITR[] (external write)

        if !UsingAArch32() then
            ExecuteA64(0xB8404401<31:0>);               // A64 "LDR W1,[X0],#4"
        else
            ExecuteT32(0xF850<15:0> /*hw1*/, 0x1B04<15:0> /*hw2*/);      // T32 "LDR R1,[R0],#4"
        // If the load aborts, the Data Abort exception is taken and EDSCR.ERR is set to 1
        if EDSCR.ERR == '1' then
            EDSCR.TXfull = bit UNKNOWN;
            DBGDTRTX_EL0 = bits(32) UNKNOWN;
        else
            if !UsingAArch32() then
                ExecuteA64(0xD5130501<31:0>);           // A64 "MSR DBGDTRTX_EL0,X1"
            else
                ExecuteT32(0xEE00<15:0> /*hw1*/, 0x1E15<15:0> /*hw2*/);  // T32 "MSR DBGDTRTXint,R1"
            // "MSR DBGDTRTX_EL0,X1" calls DBGDTR_EL0[] (write) which sets TXfull.
            assert EDSCR.TXfull == '1';
        if !UsingAArch32() then
            X[1] = bits(64) UNKNOWN;
        else
            R[1] = bits(32) UNKNOWN;
        EDSCR.ITE = '1';                                // See comments in EDITR[] (external write)

    return value;

// DBGDTRTX_EL0[] (external write)
// ===============================

DBGDTRTX_EL0[boolean memory_mapped] = bits(32) value
    // The Software lock is OPTIONAL.
    if memory_mapped && EDLSR.SLK == '1' then return;   // Software lock locked: ignore write
    DTRTX = value;
    return;

shared/debug/dccanditr/DBGDTR_EL0

// DBGDTR_EL0[] (write)
// ====================
// System register writes to DBGDTR_EL0, DBGDTRTX_EL0 (AArch64) and DBGDTRTXint (AArch32)

DBGDTR_EL0[] = bits(N) value
    // For MSR DBGDTRTX_EL0,<Rt>  N=32, value=X[t]<31:0>, X[t]<63:32> is ignored
    // For MSR DBGDTR_EL0,<Xt>    N=64, value=X[t]<63:0>
    assert N IN {32,64};
    if EDSCR.TXfull == '1' then
        value = bits(N) UNKNOWN;
    // On a 64-bit write, implement a half-duplex channel
    if N == 64 then DTRRX = value<63:32>;
    DTRTX = value<31:0>;        // 32-bit or 64-bit write
    EDSCR.TXfull = '1';
    return;

// DBGDTR_EL0[] (read)
// ===================
// System register reads of DBGDTR_EL0, DBGDTRRX_EL0 (AArch64) and DBGDTRRXint (AArch32)

bits(N) DBGDTR_EL0[]
    // For MRS <Rt>,DBGDTRTX_EL0  N=32, X[t]=Zeros(32):result
    // For MRS <Xt>,DBGDTR_EL0    N=64, X[t]=result
    assert N IN {32,64};
    bits(N) result;
    if EDSCR.RXfull == '0' then
        result = bits(N) UNKNOWN;
    else
        // On a 64-bit read, implement a half-duplex channel
        // NOTE: the word order is reversed on reads with regards to writes
        if N == 64 then result<63:32> = DTRTX;
        result<31:0> = DTRRX;
    EDSCR.RXfull = '0';
    return result;
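
The half-duplex behavior of 64-bit accesses to DBGDTR_EL0, including the reversed word order between writes and reads, is easier to see in a small executable model. The Python sketch below is illustrative only: the class and method names are hypothetical, UNKNOWN is modeled as `None`, and a result with any UNKNOWN half is treated as wholly UNKNOWN, which is a simplification.

```python
UNKNOWN = None  # stand-in for an architecturally UNKNOWN value
MASK32 = 0xFFFFFFFF


class DebugCommsChannel:
    """Hypothetical model of the DTRRX/DTRTX pair as seen by System register accesses."""

    def __init__(self):
        self.dtrrx = 0
        self.dtrtx = 0
        self.rxfull = False
        self.txfull = False

    def sysreg_write(self, value, n):
        """Model of DBGDTR_EL0[] (write) for N = 32 or 64."""
        assert n in (32, 64)
        if self.txfull:
            # Writing while TXfull is set makes the written value UNKNOWN.
            if n == 64:
                self.dtrrx = UNKNOWN
            self.dtrtx = UNKNOWN
        else:
            if n == 64:
                self.dtrrx = (value >> 32) & MASK32  # upper word goes to DTRRX
            self.dtrtx = value & MASK32              # lower word goes to DTRTX
        self.txfull = True

    def sysreg_read(self, n):
        """Model of DBGDTR_EL0[] (read) for N = 32 or 64."""
        assert n in (32, 64)
        result = UNKNOWN
        if self.rxfull:
            low = self.dtrrx                          # lower word comes from DTRRX
            high = self.dtrtx if n == 64 else 0       # upper word comes from DTRTX
            if low is not UNKNOWN and high is not UNKNOWN:
                result = (high << 32) | low
        self.rxfull = False
        return result
```

A 64-bit write of 0x1122334455667788 leaves 0x11223344 in DTRRX and 0x55667788 in DTRTX, so a subsequent 64-bit read with RXfull set returns the two words swapped.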


shared/debug/dccanditr/DTR 


bits(32) DTRRX; 
bits(32) DTRTX; 


shared/debug/dccanditr/EDITR

// EDITR[] (external write)
// ========================
// Called on writes to debug register 0x084.

EDITR[boolean memory_mapped] = bits(32) value

    if EDPRSR<6:5,0> != '001' then                      // Check DLK, OSLK and PU bits
        IMPLEMENTATION_DEFINED "signal slave-generated error";
        return;

    if EDSCR.ERR == '1' then return;                    // Error flag set: ignore write

    // The Software lock is OPTIONAL.
    if memory_mapped && EDLSR.SLK == '1' then return;   // Software lock locked: ignore write

    if !Halted() then return;                           // Non-debug state: ignore write

    if EDSCR.ITE == '0' || EDSCR.MA == '1' then
        EDSCR.ITO = '1';  EDSCR.ERR = '1';              // Overrun condition: block write
        return;

    // ITE indicates whether the processor is ready to accept another instruction; the processor
    // may support multiple outstanding instructions. Unlike the "InstrCompl" flag in [v7A] there
    // is no indication that the pipeline is empty (all instructions have completed). In this
    // pseudocode, the assumption is that only one instruction can be executed at a time,
    // meaning ITE acts like "InstrCompl".
    EDSCR.ITE = '0';

    if !UsingAArch32() then
        ExecuteA64(value);
    else
        ExecuteT32(value<15:0> /*hw1*/, value<31:16> /*hw2*/);

    EDSCR.ITE = '1';

    return;


shared/debug/halting/DCPSInstruction

// DCPSInstruction()
// =================
// Operation of the DCPS instruction in Debug state

DCPSInstruction(bits(2) target_el)

    SynchronizeContext();

    case target_el of
        when EL1
            if PSTATE.EL == EL2 || (PSTATE.EL == EL3 && !UsingAArch32()) then handle_el = PSTATE.EL;
            elsif HaveEL(EL2) && !IsSecure() && HCR_EL2.TGE == '1' then UndefinedFault();
            else handle_el = EL1;

        when EL2
            if !HaveEL(EL2) then UndefinedFault();
            elsif PSTATE.EL == EL3 && !UsingAArch32() then handle_el = EL3;
            elsif IsSecure() then UndefinedFault();
            else handle_el = EL2;

        when EL3
            if EDSCR.SDD == '1' || !HaveEL(EL3) then UndefinedFault();
            handle_el = EL3;

        otherwise
            Unreachable();

    if ELUsingAArch32(handle_el) then
        if PSTATE.M == M32_Monitor then SCR.NS = '0';
        assert UsingAArch32();                  // Cannot move from AArch64 to AArch32
        case handle_el of
            when EL1  AArch32.WriteMode(M32_Svc);
            when EL2  AArch32.WriteMode(M32_Hyp);
            when EL3  AArch32.WriteMode(M32_Monitor);
        if handle_el == EL2 then
            ELR_hyp = bits(32) UNKNOWN;  HSR = bits(32) UNKNOWN;
        else
            LR = bits(32) UNKNOWN;
        SPSR[] = bits(32) UNKNOWN;
        PSTATE.E = SCTLR[].EE;
        DLR = bits(32) UNKNOWN;  DSPSR = bits(32) UNKNOWN;

    else                                        // Targeting AArch64
        if UsingAArch32() then
            AArch64.MaybeZeroRegisterUppers();
        PSTATE.nRW = '0';  PSTATE.SP = '1';  PSTATE.EL = handle_el;
        ELR[] = bits(64) UNKNOWN;  SPSR[] = bits(32) UNKNOWN;  ESR[] = bits(32) UNKNOWN;
        DLR_EL0 = bits(64) UNKNOWN;  DSPSR_EL0 = bits(32) UNKNOWN;

    UpdateEDSCRFields();                        // Update EDSCR PE state flags.
    return;


shared/debug/halting/DRPSInstruction

// DRPSInstruction()
// =================
// Operation of the A64 DRPS and T32 ERET instructions in Debug state

DRPSInstruction()

    SynchronizeContext();
    SetPSTATEFromPSR(SPSR[]);

    // PSTATE.{N,Z,C,V,Q,GE,SS,D,A,I,F} are not observable and ignored in Debug state, so
    // behave as if UNKNOWN.
    if UsingAArch32() then
        PSTATE.<N,Z,C,V,Q,GE,SS,A,I,F> = bits(13) UNKNOWN;
        // In AArch32, all instructions are T32 and unconditional.
        PSTATE.IT = '00000000';  PSTATE.T = '1';        // PSTATE.J is RES0
        DLR = bits(32) UNKNOWN;  DSPSR = bits(32) UNKNOWN;
    else
        PSTATE.<N,Z,C,V,SS,D,A,I,F> = bits(9) UNKNOWN;
        DLR_EL0 = bits(64) UNKNOWN;  DSPSR_EL0 = bits(32) UNKNOWN;

    UpdateEDSCRFields();                                // Update EDSCR PE state flags.

    return;


shared/debug/halting/DebugHalt 





constant bits(6) DebugHalt_Breakpoint = '000111';
constant bits(6) DebugHalt_EDBGRQ = '010011';
constant bits(6) DebugHalt_Step_Normal = '011011';
constant bits(6) DebugHalt_Step_Exclusive = '011111';
constant bits(6) DebugHalt_OSUnlockCatch = '100011'; 
constant bits(6) DebugHalt_ResetCatch = '100111'; 
constant bits(6) DebugHalt_Watchpoint = '101011'; 
constant bits(6) DebugHalt_HaltInstruction = '101111'; 
constant bits(6) DebugHalt_SoftwareAccess = '110011'; 
constant bits(6) DebugHalt_ExceptionCatch = '110111'; 
constant bits(6) DebugHalt_Step_NoSyndrome = '111011'; 


shared/debug/halting/DisableITRAndResumeInstructionPrefetch

DisableITRAndResumeInstructionPrefetch();

shared/debug/halting/ExecuteA64

// Execute an A64 instruction in Debug state.
ExecuteA64(bits(32) instr);

shared/debug/halting/ExecuteT32

// Execute a T32 instruction in Debug state.
ExecuteT32(bits(16) hw1, bits(16) hw2);


shared/debug/halting/ExitDebugState

// ExitDebugState()
// ================

ExitDebugState()
    assert Halted();
    SynchronizeContext();

    // Although EDSCR.STATUS signals that the PE is restarting, debuggers must use EDPRSR.SDR to
    // detect that the PE has restarted.
    EDSCR.STATUS = '000001';                            // Signal restarting
    EDESR<2:0> = '000';                                 // Clear any pending Halting debug events

    bits(64) new_pc;
    bits(32) spsr;

    if UsingAArch32() then
        new_pc = ZeroExtend(DLR);
        spsr = DSPSR;
    else
        new_pc = DLR_EL0;
        spsr = DSPSR_EL0;
    // If this is an illegal return, SetPSTATEFromPSR() will set PSTATE.IL.
    SetPSTATEFromPSR(spsr);                             // Can update privileged bits, even at EL0.

    if UsingAArch32() then
        if ConstrainUnpredictableBool() then new_pc<0> = '0';
        BranchTo(new_pc<31:0>, BranchType_UNKNOWN);     // AArch32 branch
    else
        // If targeting AArch32 then possibly zero the 32 most significant bits of the target PC
        if spsr<4> == '1' && ConstrainUnpredictableBool() then
            new_pc<63:32> = Zeros();
        BranchTo(new_pc, BranchType_DBGEXIT);           // A type of branch that is never predicted

    (EDSCR.STATUS,EDPRSR.SDR) = ('000010','1');         // Atomically signal restarted
    UpdateEDSCRFields();                                // Stop signalling PE state.
    DisableITRAndResumeInstructionPrefetch();

    return;


shared/debug/halting/Halt

// Halt()
// ======

Halt(bits(6) reason)

    CTI_SignalEvent(CrossTriggerIn_CrossHalt);  // Trigger other cores to halt

    if UsingAArch32() then
        DLR = ThisInstrAddr();
        DSPSR = GetPSRFromPSTATE();
        DSPSR.SS = PSTATE.SS;                   // Always save PSTATE.SS
    else
        DLR_EL0 = ThisInstrAddr();
        DSPSR_EL0 = GetPSRFromPSTATE();
        DSPSR_EL0.SS = PSTATE.SS;               // Always save PSTATE.SS

    EDSCR.ITE = '1';  EDSCR.ITO = '0';
    if IsSecure() then
        EDSCR.SDD = '0';                        // If entered in Secure state, allow debug
    elsif HaveEL(EL3) then
        EDSCR.SDD = (if ExternalSecureInvasiveDebugEnabled() then '0' else '1');
    else
        assert EDSCR.SDD == '1';                // Otherwise EDSCR.SDD is RES1
    EDSCR.MA = '0';

    // PSTATE.{SS,D,A,I,F} are not observable and ignored in Debug state, so behave as if
    // UNKNOWN. PSTATE.{N,Z,C,V,Q,GE} are also not observable, but since these are not changed on
    // exception entry, this function also leaves them unchanged. PSTATE.{E,M,nRW,EL,SP} are
    // unchanged. PSTATE.IL is set to 0.
    if UsingAArch32() then
        PSTATE.<SS,A,I,F> = bits(4) UNKNOWN;
        // In AArch32, all instructions are T32 and unconditional.
        PSTATE.IT = '00000000';  PSTATE.T = '1';        // PSTATE.J is RES0
    else
        PSTATE.<SS,D,A,I,F> = bits(5) UNKNOWN;
    PSTATE.IL = '0';

    StopInstructionPrefetchAndEnableITR();
    EDSCR.STATUS = reason;                      // Signal entered Debug state
    UpdateEDSCRFields();                        // Update EDSCR PE state flags.

    return;


shared/debug/halting/HaltOnBreakpointOrWatchpoint

// HaltOnBreakpointOrWatchpoint()
// ==============================
// Returns TRUE if the Breakpoint and Watchpoint debug events should be considered for Debug
// state entry, FALSE if they should be considered for a debug exception.

boolean HaltOnBreakpointOrWatchpoint()
    return HaltingAllowed() && EDSCR.HDE == '1' && OSLSR_EL1.OSLK == '0';

shared/debug/halting/Halted

// Halted()
// ========

boolean Halted()
    return !(EDSCR.STATUS IN {'000001', '000010'});     // Halted

shared/debug/halting/HaltingAllowed

// HaltingAllowed()
// ================
// Returns TRUE if halting is currently allowed, FALSE if halting is prohibited.

boolean HaltingAllowed()
    if Halted() || DoubleLockStatus() then
        return FALSE;
    elsif IsSecure() then
        return ExternalSecureInvasiveDebugEnabled();
    else
        return ExternalInvasiveDebugEnabled();

shared/debug/halting/Restarting

// Restarting()
// ============

boolean Restarting()
    return EDSCR.STATUS == '000001';                    // Restarting

shared/debug/halting/StopInstructionPrefetchAndEnableITR

StopInstructionPrefetchAndEnableITR();


shared/debug/halting/UpdateEDSCRFields

// UpdateEDSCRFields()
// ===================
// Update EDSCR PE state fields

UpdateEDSCRFields()

    if !Halted() then
        EDSCR.EL = '00';
        EDSCR.NS = bit UNKNOWN;
        EDSCR.RW = '1111';
    else
        EDSCR.EL = PSTATE.EL;
        EDSCR.NS = if IsSecure() then '0' else '1';

        bits(4) RW;
        RW<1> = (if ELUsingAArch32(EL1) then '0' else '1');
        if PSTATE.EL != EL0 then
            RW<0> = RW<1>;
        else
            RW<0> = (if UsingAArch32() then '0' else '1');
        if !HaveEL(EL2) || (HaveEL(EL3) && SCR_GEN[].NS == '0') then
            RW<2> = RW<1>;
        else
            RW<2> = (if ELUsingAArch32(EL2) then '0' else '1');
        if !HaveEL(EL3) then
            RW<3> = RW<2>;
        else
            RW<3> = (if ELUsingAArch32(EL3) then '0' else '1');

        // The least-significant bits of EDSCR.RW are UNKNOWN if any higher EL is using AArch32.
        if RW<3> == '0' then RW<2:0> = bits(3) UNKNOWN;
        elsif RW<2> == '0' then RW<1:0> = bits(2) UNKNOWN;
        elsif RW<1> == '0' then RW<0> = bit UNKNOWN;
        EDSCR.RW = RW;
    return;
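
The construction of EDSCR.RW in Debug state can be followed step by step with a small Python model. This is a sketch under stated assumptions, not architecture pseudocode: the function name and parameters are hypothetical, the result is a list indexed by Exception level (index 0 is RW<0>), and UNKNOWN bits are modeled as `None`.

```python
def edscr_rw(current_el, using_aarch32, el1_aarch32, el2_aarch32, el3_aarch32,
             have_el2, have_el3, scr_ns):
    """Return [RW<0>, RW<1>, RW<2>, RW<3>] for a halted PE; None marks UNKNOWN."""
    rw = [None, None, None, None]

    # RW<1> reflects the Execution state of EL1.
    rw[1] = 0 if el1_aarch32 else 1

    # RW<0> follows RW<1> unless the PE is halted at EL0.
    if current_el != 0:
        rw[0] = rw[1]
    else:
        rw[0] = 0 if using_aarch32 else 1

    # RW<2> follows RW<1> when EL2 is absent or not usable from Secure state.
    if not have_el2 or (have_el3 and scr_ns == 0):
        rw[2] = rw[1]
    else:
        rw[2] = 0 if el2_aarch32 else 1

    # RW<3> follows RW<2> when EL3 is absent.
    if not have_el3:
        rw[3] = rw[2]
    else:
        rw[3] = 0 if el3_aarch32 else 1

    # Lower bits are UNKNOWN when any higher Exception level uses AArch32.
    if rw[3] == 0:
        rw[0:3] = [None, None, None]
    elif rw[2] == 0:
        rw[0:2] = [None, None]
    elif rw[1] == 0:
        rw[0] = None
    return rw
```

For instance, an AArch32 application halted at EL0 under AArch64 EL1, EL2 and EL3 gives RW<3:0> = '1110', while an AArch32 EL1 under AArch64 EL2 and EL3 gives RW<3:1> = '110' with RW<0> UNKNOWN.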


shared/debug/haltingevents/CheckExceptionCatch

// CheckExceptionCatch()
// =====================
// Check whether an Exception Catch debug event is set on the current Exception level

CheckExceptionCatch()
    // Called after taking an exception, that is, such that IsSecure() and PSTATE.EL are correct
    // for the exception target.
    base = if IsSecure() then 0 else 4;
    if HaltingAllowed() && EDECCR<UInt(PSTATE.EL) + base> == '1' then
        Halt(DebugHalt_ExceptionCatch);

shared/debug/haltingevents/CheckHaltingStep

// CheckHaltingStep()
// ==================
// Check whether EDESR.SS has been set by Halting Step

CheckHaltingStep()
    if HaltingAllowed() && EDESR.SS == '1' then
        // The STATUS code depends on how we arrived at the state where EDESR.SS == 1.
        if HaltingStep_DidNotStep() then
            Halt(DebugHalt_Step_NoSyndrome);
        elsif HaltingStep_SteppedEX() then
            Halt(DebugHalt_Step_Exclusive);
        else
            Halt(DebugHalt_Step_Normal);


shared/debug/haltingevents/CheckOSUnlockCatch

// CheckOSUnlockCatch()
// ====================
// Called on unlocking the OS Lock to pend an OS Unlock Catch debug event

CheckOSUnlockCatch()
    if EDECR.OSUCE == '1' && !Halted() then EDESR.OSUC = '1';

shared/debug/haltingevents/CheckPendingOSUnlockCatch

// CheckPendingOSUnlockCatch()
// ===========================
// Check whether EDESR.OSUC has been set by an OS Unlock Catch debug event

CheckPendingOSUnlockCatch()
    if HaltingAllowed() && EDESR.OSUC == '1' then
        Halt(DebugHalt_OSUnlockCatch);

shared/debug/haltingevents/CheckPendingResetCatch

// CheckPendingResetCatch()
// ========================
// Check whether EDESR.RC has been set by a Reset Catch debug event

CheckPendingResetCatch()
    if HaltingAllowed() && EDESR.RC == '1' then
        Halt(DebugHalt_ResetCatch);

shared/debug/haltingevents/CheckResetCatch

// CheckResetCatch()
// =================
// Called after reset

CheckResetCatch()
    if EDECR.RCE == '1' then
        EDESR.RC = '1';
        // If halting is allowed then halt immediately
        if HaltingAllowed() then Halt(DebugHalt_ResetCatch);

shared/debug/haltingevents/CheckSoftwareAccessToDebugRegisters

// CheckSoftwareAccessToDebugRegisters()
// =====================================
// Check for access to Breakpoint and Watchpoint registers.

CheckSoftwareAccessToDebugRegisters()
    os_lock = (if ELUsingAArch32(EL1) then DBGOSLSR.OSLK else OSLSR_EL1.OSLK);
    if HaltingAllowed() && EDSCR.TDA == '1' && os_lock == '0' then
        Halt(DebugHalt_SoftwareAccess);

shared/debug/haltingevents/ExternalDebugRequest

// ExternalDebugRequest()
// ======================

ExternalDebugRequest()
    if HaltingAllowed() then
        Halt(DebugHalt_EDBGRQ);
    // Otherwise the CTI continues to assert the debug request until it is taken.

shared/debug/haltingevents/HaltingStep_DidNotStep

// Returns TRUE if the previously executed instruction was executed in the inactive state, that is,
// if it was not itself stepped.
boolean HaltingStep_DidNotStep();

shared/debug/haltingevents/HaltingStep_SteppedEX

// Returns TRUE if the previously executed instruction was a Load-Exclusive class instruction
// executed in the active-not-pending state.
boolean HaltingStep_SteppedEX();


shared/debug/haltingevents/RunHaltingStep

// RunHaltingStep()
// ================

RunHaltingStep(boolean exception_generated, bits(2) exception_target, boolean syscall,
               boolean reset)
    // "exception_generated" is TRUE if the previous instruction generated a synchronous exception
    // or was cancelled by an asynchronous exception.
    //
    // if "exception_generated" is TRUE then "exception_target" is the target of the exception, and
    // "syscall" is TRUE if the exception is a synchronous exception where the preferred return
    // address is the instruction following that which generated the exception.
    //
    // "reset" is TRUE if exiting reset state into the highest EL.

    if reset then assert !Halted();             // Cannot come out of reset halted
    active = EDECR.SS == '1' && !Halted();

    if active && reset then                     // Coming out of reset with EDECR.SS set.
        EDESR.SS = '1';
    elsif active && HaltingAllowed() then
        if exception_generated && exception_target == EL3 then
            advance = syscall || ExternalSecureInvasiveDebugEnabled();
        else
            advance = TRUE;
        if advance then EDESR.SS = '1';

    return;


shared/debug/interrupts/ExternalDebugInterruptsDisabled

// ExternalDebugInterruptsDisabled()
// =================================
// Determine whether EDSCR disables interrupts routed to 'target'

boolean ExternalDebugInterruptsDisabled(bits(2) target)
    case target of
        when EL3
            int_dis = (EDSCR.INTdis == '11' && ExternalSecureInvasiveDebugEnabled());
        when EL2
            int_dis = (EDSCR.INTdis == '1x' && ExternalInvasiveDebugEnabled());
        when EL1
            if IsSecure() then
                int_dis = (EDSCR.INTdis == '1x' && ExternalSecureInvasiveDebugEnabled());
            else
                int_dis = (EDSCR.INTdis != '00' && ExternalInvasiveDebugEnabled());
    return int_dis;

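
The way EDSCR.INTdis is decoded per target Exception level can be summarized in a short Python sketch. It is offered as an illustration only: the function name and parameters are hypothetical, the '1x' bit pattern is assumed to mean that bit 1 of INTdis is set, and the two authentication results are passed in as plain Booleans.

```python
def external_debug_interrupts_disabled(target_el, intdis, is_secure,
                                       invasive_enabled, secure_invasive_enabled):
    """Return True if interrupts routed to target_el are disabled by EDSCR.INTdis."""
    bit1_set = (intdis & 0b10) != 0  # matches the '1x' pattern
    if target_el == 3:
        return intdis == 0b11 and secure_invasive_enabled
    if target_el == 2:
        return bit1_set and invasive_enabled
    if target_el == 1:
        if is_secure:
            return bit1_set and secure_invasive_enabled
        return intdis != 0b00 and invasive_enabled
    raise ValueError("target_el must be 1, 2 or 3")
```

With INTdis set to '01' and invasive debug enabled, only interrupts targeting Non-secure EL1 are disabled in this model; EL2 and Secure EL1 need bit 1 set, and EL3 needs '11'.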


shared/debug/interrupts/InterruptID

enumeration InterruptID {InterruptID_PMUIRQ, InterruptID_COMMIRQ, InterruptID_CTIIRQ,
                         InterruptID_COMMRX, InterruptID_COMMTX};

shared/debug/interrupts/SetInterruptRequestLevel

// Set a level-sensitive interrupt to the specified level.
SetInterruptRequestLevel(InterruptID id, signal level);

shared/debug/samplebasedprofiling/CreatePCSample

// CreatePCSample()
// ================

CreatePCSample()
    // In a simple sequential execution of the program, CreatePCSample is executed each time the PE
    // executes an instruction that can be sampled. An implementation is not constrained such that
    // reads of EDPCSRlo return the current values of PC, etc.

    pc_sample.valid = ExternalNoninvasiveDebugAllowed() && !Halted();
    pc_sample.pc = ThisInstrAddr();
    pc_sample.el = PSTATE.EL;
    pc_sample.rw = if UsingAArch32() then '0' else '1';
    pc_sample.ns = if IsSecure() then '0' else '1';
    pc_sample.contextidr = if ELUsingAArch32(EL1) then CONTEXTIDR else CONTEXTIDR_EL1;
    if HaveEL(EL2) && !IsSecure() then
        pc_sample.vmid = if ELUsingAArch32(EL2) then VTTBR.VMID else VTTBR_EL2.VMID;
    return;


shared/debug/samplebasedprofiling/EDPCSRlo

// EDPCSRlo[] (read)
// =================

bits(32) EDPCSRlo[boolean memory_mapped]

    if EDPRSR<6:5,0> != '001' then                      // Check DLK, OSLK and PU bits
        IMPLEMENTATION_DEFINED "signal slave-generated error";
        return bits(32) UNKNOWN;

    // The Software lock is OPTIONAL.
    update = !memory_mapped || EDLSR.SLK == '0';        // Software locked: no side-effects

    if pc_sample.valid then
        sample = pc_sample.pc<31:0>;
        if update then
            EDPCSRhi = (if pc_sample.rw == '0' then Zeros(32) else pc_sample.pc<63:32>);
            EDCIDSR = pc_sample.contextidr;
            EDVIDSR.VMID = (if HaveEL(EL2) && pc_sample.ns == '1' && pc_sample.el IN {EL1,EL0}
                            then pc_sample.vmid else Zeros(8));
            EDVIDSR.NS = pc_sample.ns;
            EDVIDSR.E2 = (if pc_sample.el == EL2 then '1' else '0');
            EDVIDSR.E3 = (if pc_sample.el == EL3 then '1' else '0') AND pc_sample.rw;
            // The conditions for setting HV are not specified if PCSRhi is zero.
            // An example implementation may be "pc_sample.rw".
            EDVIDSR.HV = (if !IsZero(EDPCSRhi) then '1' else bit IMPLEMENTATION_DEFINED "0 or 1");
    else
        sample = Ones(32);
        if update then
            EDPCSRhi = bits(32) UNKNOWN;
            EDCIDSR = bits(32) UNKNOWN;
            EDVIDSR = (bits(4) UNKNOWN):Zeros(20):(bits(8) UNKNOWN);

    return sample;


shared/debug/samplebasedprofiling/PCSample

type PCSample is (
    boolean valid,
    bits(64) pc,
    bits(2) el,
    bit rw,
    bit ns,
    bits(32) contextidr,
    bits(8) vmid
)

PCSample pc_sample;


shared/debug/softwarestep/CheckSoftwareStep

// CheckSoftwareStep()
// ===================
// Take a Software Step exception if in the active-pending state

CheckSoftwareStep()

    // Other self-hosted debug functions will call AArch32.GenerateDebugExceptions() if called from
    // AArch32 state. However, because Software Step is only active when the debug target Exception
    // level is using AArch64, CheckSoftwareStep only calls AArch64.GenerateDebugExceptions().
    if !ELUsingAArch32(DebugTarget()) && AArch64.GenerateDebugExceptions() then
        if MDSCR_EL1.SS == '1' && PSTATE.SS == '0' then
            AArch64.SoftwareStepException();

shared/debug/softwarestep/DebugExceptionReturnSS

// DebugExceptionReturnSS()
// ========================
// Returns value to write to PSTATE.SS on an exception return or Debug state exit.

bit DebugExceptionReturnSS(bits(32) spsr)
    assert Halted() || Restarting() || PSTATE.EL != EL0;

    SS_bit = '0';

    if MDSCR_EL1.SS == '1' then
        if Restarting() then
            enabled_at_source = FALSE;
        elsif UsingAArch32() then
            enabled_at_source = AArch32.GenerateDebugExceptions();
        else
            enabled_at_source = AArch64.GenerateDebugExceptions();

        if IllegalExceptionReturn(spsr) then
            dest = PSTATE.EL;
        else
            (valid, dest) = ELFromSPSR(spsr);  assert valid;

        secure = IsSecureBelowEL3() || dest == EL3;

        if ELUsingAArch32(dest) then
            enabled_at_dest = AArch32.GenerateDebugExceptionsFrom(dest, secure);
        else
            mask = spsr<9>;
            enabled_at_dest = AArch64.GenerateDebugExceptionsFrom(dest, secure, mask);

        ELd = DebugTargetFrom(secure);
        if !ELUsingAArch32(ELd) && !enabled_at_source && enabled_at_dest then
            SS_bit = spsr<21>;
    return SS_bit;

shared/debug/softwarestep/SSAdvance

// SSAdvance()
// ===========
// Advance the Software Step state machine.

SSAdvance()

    // A simpler implementation of this function just clears PSTATE.SS to zero regardless of the
    // current Software Step state machine. However, this check is made to illustrate that the
    // processor only needs to consider advancing the state machine from the active-not-pending
    // state.
    target = DebugTarget();
    step_enabled = !ELUsingAArch32(target) && MDSCR_EL1.SS == '1';
    active_not_pending = step_enabled && PSTATE.SS == '1';

    if active_not_pending then PSTATE.SS = '0';

    return;
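
The Software Step states named in these functions (inactive, active-not-pending, active-pending) and the transition made by SSAdvance() can be modeled in a few lines of Python. This is a simplified sketch: the names are hypothetical, and the state is assumed to depend only on the debug target Execution state, MDSCR_EL1.SS and PSTATE.SS, ignoring whether debug exceptions are currently enabled.

```python
from enum import Enum


class StepState(Enum):
    INACTIVE = "inactive"
    ACTIVE_NOT_PENDING = "active-not-pending"
    ACTIVE_PENDING = "active-pending"


def step_state(target_is_aarch64: bool, mdscr_ss: bool, pstate_ss: int) -> StepState:
    """Classify the Software Step state from the inputs SSAdvance() examines."""
    step_enabled = target_is_aarch64 and mdscr_ss
    if not step_enabled:
        return StepState.INACTIVE
    return StepState.ACTIVE_NOT_PENDING if pstate_ss == 1 else StepState.ACTIVE_PENDING


def ss_advance(target_is_aarch64: bool, mdscr_ss: bool, pstate_ss: int) -> int:
    """Return the new PSTATE.SS value after one instruction has been stepped."""
    if step_state(target_is_aarch64, mdscr_ss, pstate_ss) is StepState.ACTIVE_NOT_PENDING:
        return 0  # move to active-pending
    return pstate_ss  # any other state is left unchanged
```

Stepping from active-not-pending clears PSTATE.SS, which is the condition CheckSoftwareStep() then uses to take the Software Step exception.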


shared/debug/softwarestep/SoftwareStep_DidNotStep 

// Returns TRUE if the previously executed instruction was executed in the inactive state, that is, 
// if it was not itself stepped. 

boolean SoftwareStep_DidNotStep(); 
shared/debug/softwarestep/SoftwareStep_SteppedEX 

// Returns TRUE if the previously executed instruction was a Load-Exclusive class instruction
// executed in the active-not-pending state.

boolean SoftwareStep_SteppedEX();

J1.3.2 shared/exceptions 


This section includes the following pseudocode functions: 


• shared/exceptions/exceptions/ConditionSyndrome.
• shared/exceptions/exceptions/Exception on page J1-5392.
• shared/exceptions/exceptions/ExceptionRecord on page J1-5392.
• shared/exceptions/exceptions/ExceptionSyndrome on page J1-5392.
• shared/exceptions/traps/ReservedValue on page J1-5393.
• shared/exceptions/traps/UnallocatedEncoding on page J1-5393.


shared/exceptions/exceptions/ConditionSyndrome 
// ConditionSyndrome()
// Return CV and COND fields of instruction syndrome

bits(5) ConditionSyndrome()

    bits(5) syndrome;

    if UsingAArch32() then
        cond = AArch32.CurrentCond();
        if PSTATE.T == '0' then             // A32
            syndrome<4> = '1';
            // A conditional A32 instruction that is known to pass its condition code check
            // can be presented either with COND set to 0xE, the value for unconditional, or
            // the COND value held in the instruction.
            if ConditionHolds(cond) && ConstrainUnpredictableBool() then
                syndrome<3:0> = '1110';
            else
                syndrome<3:0> = cond;
        else                                // T32
            // When a T32 instruction is trapped, it is IMPLEMENTATION DEFINED whether:
            //  * CV set to 0 and COND is set to an UNKNOWN value
            //  * CV set to 1 and COND is set to the condition code for the condition that
            //    applied to the instruction.
            if boolean IMPLEMENTATION_DEFINED "Condition valid for trapped T32" then
                syndrome<4> = '1';
                syndrome<3:0> = cond;
            else
                syndrome<4> = '0';
                syndrome<3:0> = bits(4) UNKNOWN;
    else
        syndrome<4> = '1';
        syndrome<3:0> = '1110';

    return syndrome;


shared/exceptions/exceptions/Exception 


enumeration Exception {Exception_Uncategorized,        // Uncategorized or unknown reason
                       Exception_WFxTrap,              // Trapped WFI or WFE instruction
                       Exception_CP15RTTrap,           // Trapped AArch32 MCR or MRC access to CP15
                       Exception_CP15RRTTrap,          // Trapped AArch32 MCRR or MRRC access to CP15
                       Exception_CP14RTTrap,           // Trapped AArch32 MCR or MRC access to CP14
                       Exception_CP14DTTrap,           // Trapped AArch32 LDC or STC access to CP14
                       Exception_AdvSIMDFPAccessTrap,  // HCPTR-trapped access to SIMD or FP
                       Exception_FPIDTrap,             // Trapped access to SIMD or FP ID register
                       // Trapped BXJ instruction not supported in ARMv8
                       Exception_CP14RRTTrap,          // Trapped MRRC access to CP14 from AArch32
                       Exception_IllegalState,         // Illegal Execution state
                       Exception_SupervisorCall,       // Supervisor Call
                       Exception_HypervisorCall,       // Hypervisor Call
                       Exception_MonitorCall,          // Monitor Call or Trapped SMC instruction
                       Exception_SystemRegisterTrap,   // Trapped MRS or MSR system register access
                       Exception_InstructionAbort,     // Instruction Abort or Prefetch Abort
                       Exception_PCAlignment,          // PC alignment fault
                       Exception_DataAbort,            // Data Abort
                       Exception_SPAlignment,          // SP alignment fault
                       Exception_FPTrappedException,   // IEEE trapped FP exception
                       Exception_SError,               // SError interrupt
                       Exception_Breakpoint,           // (Hardware) Breakpoint
                       Exception_SoftwareStep,         // Software Step
                       Exception_Watchpoint,           // Watchpoint
                       Exception_SoftwareBreakpoint,   // Software Breakpoint Instruction
                       Exception_VectorCatch,          // AArch32 Vector Catch
                       Exception_IRQ,                  // IRQ interrupt
                       Exception_FIQ};                 // FIQ interrupt

shared/exceptions/exceptions/ExceptionRecord 


type ExceptionRecord is (Exception type,            // Exception class
                         bits(25) syndrome,         // Syndrome record
                         bits(64) vaddress,         // Virtual fault address
                         boolean ipavalid,          // Physical fault address is valid
                         bits(48) ipaddress)        // Physical fault address for second stage faults


shared/exceptions/exceptions/ExceptionSyndrome 


// ExceptionSyndrome()
// Return a blank exception syndrome record for an exception of the given type.

ExceptionRecord ExceptionSyndrome(Exception type)

    ExceptionRecord r;

    r.type = type;

    // Initialize all other fields
    r.syndrome = Zeros();
    r.vaddress = Zeros();
    r.ipavalid = FALSE;
    r.ipaddress = Zeros();

    return r;

shared/exceptions/traps/ReservedValue

// ReservedValue()
// ===============

ReservedValue()

    if UsingAArch32() && !AArch32.GeneralExceptionsToAArch64() then
        AArch32.TakeUndefInstrException();
    else
        AArch64.UndefinedFault();


shared/exceptions/traps/UnallocatedEncoding 


// UnallocatedEncoding()
// =====================

UnallocatedEncoding()
    if UsingAArch32() && AArch32.ExecutingCP10or11Instr() then
        FPEXC.DEX = '0';
    if UsingAArch32() && !AArch32.GeneralExceptionsToAArch64() then
        AArch32.TakeUndefInstrException();
    else
        AArch64.UndefinedFault();


J1.3.3 shared/functions 


This section includes the following pseudocode functions: 

• shared/functions/aborts/EncodeLDFSC on page J1-5397.
• shared/functions/aborts/IPAValid on page J1-5398.
• shared/functions/aborts/IsAsyncAbort on page J1-5398.
• shared/functions/aborts/IsDebugException on page J1-5398.
• shared/functions/aborts/IsExternalAbort on page J1-5398.
• shared/functions/aborts/IsExternalSyncAbort on page J1-5399.
• shared/functions/aborts/IsFault on page J1-5399.
• shared/functions/aborts/IsSErrorInterrupt on page J1-5399.
• shared/functions/aborts/IsSecondStage on page J1-5399.
• shared/functions/aborts/LSInstructionSyndrome on page J1-5400.
• shared/functions/common/ASR on page J1-5400.
• shared/functions/common/ASR_C on page J1-5400.
• shared/functions/common/Abs on page J1-5400.
• shared/functions/common/Align on page J1-5400.
• shared/functions/common/BitCount on page J1-5401.
• shared/functions/common/CountLeadingSignBits on page J1-5401.
• shared/functions/common/CountLeadingZeroBits on page J1-5401.
• shared/functions/common/Elem on page J1-5401.
• shared/functions/common/Extend on page J1-5401.
• shared/functions/common/HighestSetBit on page J1-5402.
• shared/functions/common/Int on page J1-5402.
• shared/functions/common/IsOnes on page J1-5402.
• shared/functions/common/IsZero on page J1-5402.
• shared/functions/common/IsZeroBit on page J1-5402.
• shared/functions/common/LSL on page J1-5402.
• shared/functions/common/LSL_C on page J1-5403.
• shared/functions/common/LSR on page J1-5403.
• shared/functions/common/LSR_C on page J1-5403.
• shared/functions/common/LowestSetBit on page J1-5403.
• shared/functions/common/Max on page J1-5403.
• shared/functions/common/Min on page J1-5404.
• shared/functions/common/NOT on page J1-5404.
• shared/functions/common/Ones on page J1-5404.
• shared/functions/common/ROR on page J1-5404.
• shared/functions/common/ROR_C on page J1-5404.
• shared/functions/common/Replicate on page J1-5404.
• shared/functions/common/RoundDown on page J1-5405.
• shared/functions/common/RoundTowardsZero on page J1-5405.
• shared/functions/common/RoundUp on page J1-5405.
• shared/functions/common/SInt on page J1-5405.
• shared/functions/common/SignExtend on page J1-5405.
• shared/functions/common/UInt on page J1-5405.
• shared/functions/common/ZeroExtend on page J1-5405.
• shared/functions/common/Zeros on page J1-5406.
• shared/functions/crc/BitReverse on page J1-5406.
• shared/functions/crc/HaveCRCExt on page J1-5406.
• shared/functions/crc/Poly32Mod2 on page J1-5406.
• shared/functions/crypto/AESInvMixColumns on page J1-5406.
• shared/functions/crypto/AESInvShiftRows on page J1-5407.
• shared/functions/crypto/AESInvSubBytes on page J1-5407.
• shared/functions/crypto/AESMixColumns on page J1-5407.
• shared/functions/crypto/AESShiftRows on page J1-5407.
• shared/functions/crypto/AESSubBytes on page J1-5407.
• shared/functions/crypto/HaveCryptoExt on page J1-5407.
• shared/functions/crypto/ROL on page J1-5407.
• shared/functions/crypto/SHA256hash on page J1-5407.
• shared/functions/crypto/SHAchoose on page J1-5407.
• shared/functions/crypto/SHAhashSIGMA0 on page J1-5408.
• shared/functions/crypto/SHAhashSIGMA1 on page J1-5408.
• shared/functions/crypto/SHAmajority on page J1-5408.
• shared/functions/crypto/SHAparity on page J1-5408.
• shared/functions/exclusive/ClearExclusiveByAddress on page J1-5408.
• shared/functions/exclusive/ClearExclusiveLocal on page J1-5408.
• shared/functions/exclusive/ClearExclusiveMonitors on page J1-5408.
• shared/functions/exclusive/ExclusiveMonitorsStatus on page J1-5408.
• shared/functions/exclusive/IsExclusiveGlobal on page J1-5409.
• shared/functions/exclusive/IsExclusiveLocal on page J1-5409.
• shared/functions/exclusive/MarkExclusiveGlobal on page J1-5409.
• shared/functions/exclusive/MarkExclusiveLocal on page J1-5409.
• shared/functions/exclusive/ProcessorID on page J1-5409.
• shared/functions/float/fixedtofp/FixedToFP on page J1-5409.
• shared/functions/float/fpabs/FPAbs on page J1-5409.
• shared/functions/float/fpadd/FPAdd on page J1-5410.
• shared/functions/float/fpcompare/FPCompare on page J1-5410.
• shared/functions/float/fpcompareeq/FPCompareEQ on page J1-5410.
• shared/functions/float/fpcomparege/FPCompareGE on page J1-5411.
• shared/functions/float/fpcomparegt/FPCompareGT on page J1-5411.
• shared/functions/float/fpconvert/FPConvert on page J1-5411.
• shared/functions/float/fpconvertnan/FPConvertNaN on page J1-5412.
• shared/functions/float/fpcrtype/FPCRType on page J1-5412.
• shared/functions/float/fpdecoderm/FPDecodeRM on page J1-5412.
• shared/functions/float/fpdecoderounding/FPDecodeRounding on page J1-5413.
• shared/functions/float/fpdefaultnan/FPDefaultNaN on page J1-5413.
• shared/functions/float/fpdiv/FPDiv on page J1-5413.
• shared/functions/float/fpexc/FPExc on page J1-5413.
• shared/functions/float/fpinfinity/FPInfinity on page J1-5414.
• shared/functions/float/fpmax/FPMax on page J1-5414.
• shared/functions/float/fpmaxnormal/FPMaxNormal on page J1-5414.
• shared/functions/float/fpmaxnum/FPMaxNum on page J1-5414.
• shared/functions/float/fpmin/FPMin on page J1-5415.
• shared/functions/float/fpminnum/FPMinNum on page J1-5415.
• shared/functions/float/fpmul/FPMul on page J1-5415.
• shared/functions/float/fpmuladd/FPMulAdd on page J1-5416.
• shared/functions/float/fpmulx/FPMulX on page J1-5417.
• shared/functions/float/fpneg/FPNeg on page J1-5417.
• shared/functions/float/fponepointfive/FPOnePointFive on page J1-5417.
• shared/functions/float/fpprocessexception/FPProcessException on page J1-5417.
• shared/functions/float/fpprocessnan/FPProcessNaN on page J1-5418.
• shared/functions/float/fpprocessnans/FPProcessNaNs on page J1-5418.
• shared/functions/float/fpprocessnans3/FPProcessNaNs3 on page J1-5419.
• shared/functions/float/fprecipestimate/FPRecipEstimate on page J1-5419.
• shared/functions/float/fprecpx/FPRecpX on page J1-5420.
• shared/functions/float/fpround/FPRound on page J1-5421.
• shared/functions/float/fprounding/FPRounding on page J1-5423.
• shared/functions/float/fproundingmode/FPRoundingMode on page J1-5423.
• shared/functions/float/fproundint/FPRoundInt on page J1-5423.
• shared/functions/float/fprsqrtestimate/FPRSqrtEstimate on page J1-5424.
• shared/functions/float/fpsqrt/FPSqrt on page J1-5425.
• shared/functions/float/fpsub/FPSub on page J1-5425.
• shared/functions/float/fpthree/FPThree on page J1-5425.
• shared/functions/float/fptofixed/FPToFixed on page J1-5426.
• shared/functions/float/fptwo/FPTwo on page J1-5426.
• shared/functions/float/fptype/FPType on page J1-5427.
• shared/functions/float/fpunpack/FPUnpack on page J1-5427.
• shared/functions/float/fpzero/FPZero on page J1-5428.
• shared/functions/float/vfpexpandimm/VFPExpandImm on page J1-5428.
• shared/functions/integer/AddWithCarry on page J1-5428.
• shared/functions/memory/AccType on page J1-5429.
• shared/functions/memory/AddrTop on page J1-5429.
• shared/functions/memory/AddressDescriptor on page J1-5429.
• shared/functions/memory/Allocation on page J1-5429.
• shared/functions/memory/BigEndian on page J1-5429.
• shared/functions/memory/BigEndianReverse on page J1-5430.
• shared/functions/memory/BranchAddr on page J1-5430.
• shared/functions/memory/Cacheability on page J1-5430.
• shared/functions/memory/DataMemoryBarrier on page J1-5430.
• shared/functions/memory/DataSynchronizationBarrier on page J1-5430.
• shared/functions/memory/DeviceType on page J1-5430.
• shared/functions/memory/Fault on page J1-5430.
• shared/functions/memory/FaultRecord on page J1-5431.
• shared/functions/memory/FullAddress on page J1-5431.
• shared/functions/memory/Hint_Prefetch on page J1-5431.
• shared/functions/memory/MBReqDomain on page J1-5431.
• shared/functions/memory/MBReqTypes on page J1-5431.
• shared/functions/memory/MemAttrHints on page J1-5431.
• shared/functions/memory/MemType on page J1-5432.
• shared/functions/memory/MemoryAttributes on page J1-5432.
• shared/functions/memory/Permissions on page J1-5432.
• shared/functions/memory/PrefetchHint on page J1-5432.
• shared/functions/memory/TLBRecord on page J1-5432.
• shared/functions/memory/_Mem on page J1-5432.
• shared/functions/registers/BranchTo on page J1-5433.
• shared/functions/registers/BranchToAddr on page J1-5433.
• shared/functions/registers/BranchType on page J1-5433.
• shared/functions/registers/Hint_Branch on page J1-5433.
• shared/functions/registers/NextInstrAddr on page J1-5433.
• shared/functions/registers/ResetExternalDebugRegisters on page J1-5433.
• shared/functions/registers/ThisInstrAddr on page J1-5433.
• shared/functions/registers/_PC on page J1-5434.
• shared/functions/registers/_R on page J1-5434.
• shared/functions/registers/_V on page J1-5434.
• shared/functions/sysregisters/SPSR on page J1-5434.
• shared/functions/system/ArchVersion on page J1-5435.
• shared/functions/system/ClearEventRegister on page J1-5435.
• shared/functions/system/ClearPendingPhysicalSError on page J1-5435.
• shared/functions/system/ConditionHolds on page J1-5435.
• shared/functions/system/CurrentInstrSet on page J1-5435.
• shared/functions/system/CurrentPL on page J1-5435.
• shared/functions/system/EL0 on page J1-5436.
• shared/functions/system/ELFromM32 on page J1-5436.
• shared/functions/system/ELFromSPSR on page J1-5436.
• shared/functions/system/ELStateUsingAArch32 on page J1-5437.
• shared/functions/system/ELStateUsingAArch32K on page J1-5437.
• shared/functions/system/ELUsingAArch32 on page J1-5437.
• shared/functions/system/ELUsingAArch32K on page J1-5437.
• shared/functions/system/EndOfInstruction on page J1-5438.
• shared/functions/system/EventRegisterSet on page J1-5438.
• shared/functions/system/EventRegistered on page J1-5438.
• shared/functions/system/GetPSRFromPSTATE on page J1-5438.
• shared/functions/system/HasArchVersion on page J1-5438.
• shared/functions/system/HaveAArch32EL on page J1-5438.
• shared/functions/system/HaveAnyAArch32 on page J1-5439.
• shared/functions/system/HaveAnyAArch64 on page J1-5439.
• shared/functions/system/HaveEL on page J1-5439.
• shared/functions/system/HighestEL on page J1-5439.
• shared/functions/system/HighestELUsingAArch32 on page J1-5439.
• shared/functions/system/Hint_Yield on page J1-5440.
• shared/functions/system/IllegalExceptionReturn on page J1-5440.
• shared/functions/system/InstrSet on page J1-5440.
• shared/functions/system/InstructionSynchronizationBarrier on page J1-5440.
• shared/functions/system/InterruptPending on page J1-5440.
• shared/functions/system/IsSecure on page J1-5440.
• shared/functions/system/IsSecureBelowEL3 on page J1-5441.
• shared/functions/system/Mode_Bits on page J1-5441.
• shared/functions/system/PLOfEL on page J1-5441.
• shared/functions/system/PSTATE on page J1-5441.
• shared/functions/system/PrivilegeLevel on page J1-5441.

• shared/functions/system/ProcState on page J1-5441.
• shared/functions/system/RestoredITBits on page J1-5442.
• shared/functions/system/SCRType on page J1-5442.
• shared/functions/system/SCR_GEN on page J1-5442.
• shared/functions/system/SErrorPending on page J1-5443.
• shared/functions/system/SendEvent on page J1-5443.
• shared/functions/system/SetPSTATEFromPSR on page J1-5443.
• shared/functions/system/SynchronizeContext on page J1-5443.
• shared/functions/system/TakeUnmaskedPhysicalSErrorInterrupts on page J1-5443.
• shared/functions/system/TakeUnmaskedSErrorInterrupts on page J1-5443.
• shared/functions/system/ThisInstr on page J1-5444.
• shared/functions/system/ThisInstrLength on page J1-5444.
• shared/functions/system/Unreachable on page J1-5444.
• shared/functions/system/UsingAArch32 on page J1-5444.
• shared/functions/system/WaitForEvent on page J1-5444.
• shared/functions/system/WaitForInterrupt on page J1-5444.
• shared/functions/unpredictable/ConstrainUnpredictable on page J1-5444.
• shared/functions/unpredictable/ConstrainUnpredictableBits on page J1-5444.
• shared/functions/unpredictable/ConstrainUnpredictableBool on page J1-5444.
• shared/functions/unpredictable/ConstrainUnpredictableInteger on page J1-5445.
• shared/functions/unpredictable/Constraint on page J1-5445.
• shared/functions/vector/AdvSIMDExpandImm on page J1-5445.
• shared/functions/vector/PolynomialMult on page J1-5446.
• shared/functions/vector/SatQ on page J1-5446.
• shared/functions/vector/SignedSatQ on page J1-5446.
• shared/functions/vector/UnsignedRSqrtEstimate on page J1-5446.
• shared/functions/vector/UnsignedRecipEstimate on page J1-5447.
• shared/functions/vector/UnsignedSatQ on page J1-5447.


shared/functions/aborts/EncodeLDFSC 
// EncodeLDFSC()
// Function that gives the Long-descriptor FSC code for types of Fault

bits(6) EncodeLDFSC(Fault type, integer level)

    bits(6) result;
    case type of
        when Fault_AddressSize         result = '0000':level<1:0>; assert level IN {0,1,2,3};
        when Fault_AccessFlag          result = '0010':level<1:0>; assert level IN {1,2,3};
        when Fault_Permission          result = '0011':level<1:0>; assert level IN {1,2,3};
        when Fault_Translation         result = '0001':level<1:0>; assert level IN {0,1,2,3};
        when Fault_SyncExternal        result = '010000';
        when Fault_SyncExternalOnWalk  result = '0101':level<1:0>; assert level IN {0,1,2,3};
        when Fault_SyncParity          result = '011000';
        when Fault_SyncParityOnWalk    result = '0111':level<1:0>; assert level IN {0,1,2,3};
        when Fault_AsyncParity         result = '011001';
        when Fault_AsyncExternal       result = '010001';
        when Fault_Alignment           result = '100001';
        when Fault_Debug               result = '100010';
        when Fault_TLBConflict         result = '110000';
        when Fault_Lockdown            result = '110100';  // IMPLEMENTATION DEFINED
        when Fault_Exclusive           result = '110101';  // IMPLEMENTATION DEFINED
        otherwise                      Unreachable();

    return result;
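
The encoding performed by EncodeLDFSC() can be sketched in Python as follows. This is an illustrative model only: the function name `encode_ldfsc`, the two lookup tables, and the use of strings for fault types are assumptions made for the example, and only the bit patterns are taken from the pseudocode above.

```python
# Hypothetical Python model of EncodeLDFSC(); names are illustrative only.

# Faults whose code is a 4-bit prefix concatenated with level<1:0>,
# together with the set of translation levels the pseudocode asserts.
_LEVEL_CODES = {
    "AddressSize":        (0b0000, {0, 1, 2, 3}),
    "Translation":        (0b0001, {0, 1, 2, 3}),
    "AccessFlag":         (0b0010, {1, 2, 3}),
    "Permission":         (0b0011, {1, 2, 3}),
    "SyncExternalOnWalk": (0b0101, {0, 1, 2, 3}),
    "SyncParityOnWalk":   (0b0111, {0, 1, 2, 3}),
}

# Faults with a fixed 6-bit code that does not depend on the level.
_FIXED_CODES = {
    "SyncExternal":  0b010000,
    "AsyncExternal": 0b010001,
    "SyncParity":    0b011000,
    "AsyncParity":   0b011001,
    "Alignment":     0b100001,
    "Debug":         0b100010,
    "TLBConflict":   0b110000,
    "Lockdown":      0b110100,  # IMPLEMENTATION DEFINED
    "Exclusive":     0b110101,  # IMPLEMENTATION DEFINED
}


def encode_ldfsc(fault: str, level: int = 0) -> int:
    """Return the 6-bit Long-descriptor fault status code as an integer."""
    if fault in _LEVEL_CODES:
        prefix, allowed_levels = _LEVEL_CODES[fault]
        assert level in allowed_levels
        return (prefix << 2) | (level & 0b11)
    # A KeyError here stands in for Unreachable() in the pseudocode.
    return _FIXED_CODES[fault]
```

For example, `encode_ldfsc("Translation", 2)` concatenates the prefix `0001` with level `10`.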


shared/functions/aborts/IPAValid 
// IPAValid()
// Return TRUE if the IPA is reported for the abort

boolean IPAValid(FaultRecord fault)
    assert fault.type != Fault_None;

    if fault.s2fs1walk then
        return fault.type IN {Fault_AccessFlag, Fault_Permission, Fault_Translation,
                              Fault_AddressSize};
    elsif fault.secondstage then
        return fault.type IN {Fault_AccessFlag, Fault_Translation, Fault_AddressSize};
    else
        return FALSE;

shared/functions/aborts/IsAsyncAbort 
// IsAsyncAbort()
// Returns TRUE if the abort currently being processed is an asynchronous abort, and FALSE
// otherwise.

boolean IsAsyncAbort(Fault type)
    assert type != Fault_None;

    return (type IN {Fault_AsyncExternal, Fault_AsyncParity});

// IsAsyncAbort()
// ==============

boolean IsAsyncAbort(FaultRecord fault)
    return IsAsyncAbort(fault.type);

shared/functions/aborts/IsDebugException 


// IsDebugException()
// ==================

boolean IsDebugException(FaultRecord fault)
    assert fault.type != Fault_None;
    return fault.type == Fault_Debug;


shared/functions/aborts/IsExternalAbort 
// IsExternalAbort()
// Returns TRUE if the abort currently being processed is an external abort and FALSE otherwise.

boolean IsExternalAbort(Fault type)
    assert type != Fault_None;
    return (type IN {Fault_SyncExternal, Fault_SyncParity, Fault_SyncExternalOnWalk,
                     Fault_SyncParityOnWalk,
                     Fault_AsyncExternal, Fault_AsyncParity});

// IsExternalAbort()
// =================

boolean IsExternalAbort(FaultRecord fault)
    return IsExternalAbort(fault.type);

shared/functions/aborts/IsExternalSyncAbort 


// IsExternalSyncAbort()
// Returns TRUE if the abort currently being processed is an external synchronous abort and FALSE
// otherwise.

boolean IsExternalSyncAbort(Fault type)
    assert type != Fault_None;

    return (type IN {Fault_SyncExternal, Fault_SyncParity, Fault_SyncExternalOnWalk,
                     Fault_SyncParityOnWalk});

// IsExternalSyncAbort()
// =====================

boolean IsExternalSyncAbort(FaultRecord fault)
    return IsExternalSyncAbort(fault.type);

shared/functions/aborts/IsFault 
// IsFault()
// Return TRUE if a fault is associated with an address descriptor

boolean IsFault(AddressDescriptor addrdesc)
    return addrdesc.fault.type != Fault_None;

shared/functions/aborts/IsSErrorInterrupt

// IsSErrorInterrupt()
// Returns TRUE if the abort currently being processed is an SError interrupt, and FALSE
// otherwise.

boolean IsSErrorInterrupt(Fault type)
    assert type != Fault_None;

    return (type IN {Fault_AsyncExternal, Fault_AsyncParity});

// IsSErrorInterrupt()
// ===================

boolean IsSErrorInterrupt(FaultRecord fault)
    return IsSErrorInterrupt(fault.type);

shared/functions/aborts/IsSecondStage 


// IsSecondStage()
// ===============

boolean IsSecondStage(FaultRecord fault)
    assert fault.type != Fault_None;

    return fault.secondstage;


shared/functions/aborts/LSInstructionSyndrome 


bits(11) LSInstructionSyndrome();


shared/functions/common/ASR 


// ASR()
// =====

bits(N) ASR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = ASR_C(x, shift);
    return result;


shared/functions/common/ASR_C 


// ASR_C()
// =======

(bits(N), bit) ASR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = SignExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);
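
As an illustration of how ASR_C() derives both the shifted value and the carry-out from a sign-extended intermediate, the following Python sketch models an N-bit bitstring as a non-negative integer plus an explicit width. The function name `asr_c` and the `(x, n, shift)` parameter layout are assumptions made for this example.

```python
def asr_c(x: int, n: int, shift: int) -> tuple[int, int]:
    """Arithmetic shift right of an n-bit value, returning (result, carry_out)."""
    assert shift > 0
    assert 0 <= x < (1 << n)

    # SignExtend(x, shift+N): replicate the sign bit into the upper 'shift' bits.
    sign = (x >> (n - 1)) & 1
    extended = x | ((((1 << shift) - 1) << n) if sign else 0)

    # result = extended_x<shift+N-1:shift>
    result = (extended >> shift) & ((1 << n) - 1)
    # carry_out = extended_x<shift-1>, the last bit shifted out
    carry_out = (extended >> (shift - 1)) & 1
    return result, carry_out
```

Because the intermediate is `shift+N` bits wide, the sketch also behaves sensibly when `shift` is greater than or equal to `n`: the result is then all copies of the sign bit.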


shared/functions/common/Abs 


// Abs()
// =====

integer Abs(integer x)
    return if x >= 0 then x else -x;

// Abs()
// =====

real Abs(real x)
    return if x >= 0.0 then x else -x;

shared/functions/common/Align 


// Align()
// =======

integer Align(integer x, integer y)
    return y * (x DIV y);

// Align()
// =======

bits(N) Align(bits(N) x, integer y)
    return Align(UInt(x), y)<N-1:0>;

shared/functions/common/BitCount

// BitCount()
// ==========

integer BitCount(bits(N) x)
    integer result = 0;
    for i = 0 to N-1
        if x<i> == '1' then
            result = result + 1;
    return result;


shared/functions/common/CountLeadingSignBits 


// CountLeadingSignBits()
// ======================

integer CountLeadingSignBits(bits(N) x)
    return CountLeadingZeroBits(x<N-1:1> EOR x<N-2:0>);


shared/functions/common/CountLeadingZeroBits 


// CountLeadingZeroBits()
// ======================

integer CountLeadingZeroBits(bits(N) x)
    return N - 1 - HighestSetBit(x);
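
The relationship between CountLeadingSignBits(), CountLeadingZeroBits(), and HighestSetBit() (defined later in this section) can be checked with a small Python model. The function names and the convention of passing the width `n` explicitly are assumptions of this sketch.

```python
def highest_set_bit(x: int, n: int) -> int:
    """Index of the most significant set bit of an n-bit value, or -1 if x is zero."""
    for i in range(n - 1, -1, -1):
        if (x >> i) & 1:
            return i
    return -1


def count_leading_zero_bits(x: int, n: int) -> int:
    return n - 1 - highest_set_bit(x, n)


def count_leading_sign_bits(x: int, n: int) -> int:
    # x<N-1:1> EOR x<N-2:0> is an (N-1)-bit value with a 1 wherever two
    # adjacent bits of x differ, so its leading zeros count the copies of
    # the sign bit that follow the sign bit itself.
    mask = (1 << (n - 1)) - 1
    return count_leading_zero_bits(((x >> 1) ^ x) & mask, n - 1)
```

Note that the sign-bit count excludes the sign bit itself, so an all-zeros or all-ones 8-bit value yields 7, not 8.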


shared/functions/common/Elem 


// Elem[] - non-assignment form
// ============================

bits(size) Elem[bits(N) vector, integer e, integer size]
    assert e >= 0 && (e+1)*size <= N;
    return vector<e*size+size-1 : e*size>;

// Elem[] - non-assignment form
// ============================

bits(size) Elem[bits(N) vector, integer e]
    return Elem[vector, e, size];

// Elem[] - assignment form
// ========================

Elem[bits(N) &vector, integer e, integer size] = bits(size) value
    assert e >= 0 && (e+1)*size <= N;
    vector<(e+1)*size-1:e*size> = value;
    return;

// Elem[] - assignment form
// ========================

Elem[bits(N) &vector, integer e] = bits(size) value
    Elem[vector, e, size] = value;
    return;
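
A minimal Python sketch of the two explicit-size Elem[] forms is shown below. It treats the vector as an integer of `n` bits; the names `elem_get` and `elem_set`, and returning a new integer instead of assigning through a reference, are assumptions of this example.

```python
def elem_get(vector: int, e: int, size: int, n: int) -> int:
    """Read element e of width 'size' bits from an n-bit vector."""
    assert e >= 0 and (e + 1) * size <= n
    return (vector >> (e * size)) & ((1 << size) - 1)


def elem_set(vector: int, e: int, size: int, n: int, value: int) -> int:
    """Return a copy of the n-bit vector with element e replaced by 'value'."""
    assert e >= 0 and (e + 1) * size <= n
    mask = ((1 << size) - 1) << (e * size)
    return (vector & ~mask) | ((value << (e * size)) & mask)
```

Element 0 occupies the least significant bits, matching the `<e*size+size-1 : e*size>` slice in the pseudocode.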


shared/functions/common/Extend 


// Extend()
// ========

bits(N) Extend(bits(M) x, integer N, boolean unsigned)
    return if unsigned then ZeroExtend(x, N) else SignExtend(x, N);

// Extend()
// ========

bits(N) Extend(bits(M) x, boolean unsigned)
    return Extend(x, N, unsigned);


shared/functions/common/HighestSetBit 


// HighestSetBit()
// ===============

integer HighestSetBit(bits(N) x)
    for i = N-1 downto 0
        if x<i> == '1' then return i;
    return -1;


shared/functions/common/Int 


// Int()
// =====

integer Int(bits(N) x, boolean unsigned)
    result = if unsigned then UInt(x) else SInt(x);
    return result;


shared/functions/common/IsOnes 


// IsOnes()
// ========

boolean IsOnes(bits(N) x)
    return x == Ones(N);


shared/functions/common/IsZero

// IsZero()
// ========

boolean IsZero(bits(N) x)
    return x == Zeros(N);


shared/functions/common/IsZeroBit

// IsZeroBit()
// ===========

bit IsZeroBit(bits(N) x)
    return if IsZero(x) then '1' else '0';


shared/functions/common/LSL 


// LSL()
// =====

bits(N) LSL(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSL_C(x, shift);
    return result;

shared/functions/common/LSL_C

// LSL_C()
// =======

(bits(N), bit) LSL_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = x : Zeros(shift);
    result = extended_x<N-1:0>;
    carry_out = extended_x<N>;
    return (result, carry_out);


shared/functions/common/LSR 


// LSR()
// =====

bits(N) LSR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSR_C(x, shift);
    return result;


shared/functions/common/LSR_C 


// LSR_C()
// =======

(bits(N), bit) LSR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = ZeroExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);


shared/functions/common/LowestSetBit 


// LowestSetBit()
// ==============

integer LowestSetBit(bits(N) x)
    for i = 0 to N-1
        if x<i> == '1' then return i;
    return N;


shared/functions/common/Max 


// Max()
// =====

integer Max(integer a, integer b)
    return if a >= b then a else b;

// Max()
// =====

real Max(real a, real b)
    return if a >= b then a else b;

shared/functions/common/Min

// Min()
// =====

integer Min(integer a, integer b)
    return if a <= b then a else b;

// Min()
// =====

real Min(real a, real b)
    return if a <= b then a else b;


shared/functions/common/NOT 


bits(N) NOT(bits(N) x); 


shared/functions/common/Ones 


// Ones()
// ======

bits(N) Ones(integer N)
    return Replicate('1',N);

// Ones()
// ======

bits(N) Ones()
    return Ones(N);


shared/functions/common/ROR 


// ROR()
// =====

bits(N) ROR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = ROR_C(x, shift);
    return result;


shared/functions/common/ROR_C 


// ROR_C()
// =======

(bits(N), bit) ROR_C(bits(N) x, integer shift)
    assert shift != 0;
    m = shift MOD N;
    result = LSR(x,m) OR LSL(x,N-m);
    carry_out = result<N-1>;
    return (result, carry_out);
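
The rotate in ROR_C() combines a logical right shift by `m` with a logical left shift by `N-m`, and takes the carry-out from the top bit of the result. A hedged Python sketch, using an integer plus explicit width `n` and the assumed name `ror_c`, is:

```python
def ror_c(x: int, n: int, shift: int) -> tuple[int, int]:
    """Rotate an n-bit value right by 'shift', returning (result, carry_out)."""
    assert shift != 0
    assert 0 <= x < (1 << n)
    mask = (1 << n) - 1
    m = shift % n
    # LSR(x, m) OR LSL(x, N-m), truncated to n bits.
    result = ((x >> m) | (x << (n - m))) & mask
    # The carry-out is the bit rotated into the most significant position.
    carry_out = (result >> (n - 1)) & 1
    return result, carry_out
```

When `shift` is a non-zero multiple of `n`, `m` is zero and the value is returned unchanged, with the carry-out equal to its top bit.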


shared/functions/common/Replicate 


// Replicate()
// ===========

bits(N) Replicate(bits(M) x)
    assert N MOD M == 0;
    return Replicate(x, N DIV M);

bits(M*N) Replicate(bits(M) x, integer N);


shared/functions/common/RoundDown 


integer RoundDown(real x); 


shared/functions/common/RoundTowardsZero 


// RoundTowardsZero()
// ==================

integer RoundTowardsZero(real x)
    return if x == 0.0 then 0 else if x >= 0.0 then RoundDown(x) else RoundUp(x);


shared/functions/common/RoundUp 


integer RoundUp(real x); 


shared/functions/common/SInt 


// SInt()
// ======

integer SInt(bits(N) x)
    result = 0;
    for i = 0 to N-1
        if x<i> == '1' then result = result + 2^i;
    if x<N-1> == '1' then result = result - 2^N;
    return result;
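
The two's complement interpretation used by SInt() can be illustrated with a direct Python transliteration, shown together with the unsigned sum that UInt() computes. The names `uint` and `sint`, and passing the width `n` explicitly, are assumptions of this sketch.

```python
def uint(x: int, n: int) -> int:
    """Unsigned value of the low n bits of x, summed bit by bit."""
    result = 0
    for i in range(n):
        if (x >> i) & 1:
            result += 2 ** i
    return result


def sint(x: int, n: int) -> int:
    """Two's complement value of the low n bits of x."""
    result = uint(x, n)
    # If the sign bit is set, subtract 2^N to obtain the negative value.
    if (x >> (n - 1)) & 1:
        result -= 2 ** n
    return result
```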


shared/functions/common/SignExtend 


// SignExtend()
// ============

bits(N) SignExtend(bits(M) x, integer N)
    assert N >= M;
    return Replicate(x<M-1>, N-M) : x;

// SignExtend()
// ============

bits(N) SignExtend(bits(M) x)
    return SignExtend(x, N);


shared/functions/common/UInt 


// UInt()
// ======

integer UInt(bits(N) x)
    result = 0;
    for i = 0 to N-1
        if x<i> == '1' then result = result + 2^i;
    return result;


shared/functions/common/ZeroExtend 


// ZeroExtend()
// ============

bits(N) ZeroExtend(bits(M) x, integer N)
    assert N >= M;
    return Zeros(N-M) : x;

// ZeroExtend()
// ============

bits(N) ZeroExtend(bits(M) x)
    return ZeroExtend(x, N);
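
For comparison, SignExtend() and ZeroExtend() differ only in what is concatenated above the original `M` bits. A small Python sketch, with the assumed names `sign_extend` and `zero_extend` and explicit source width `m` and target width `n`, is:

```python
def zero_extend(x: int, m: int, n: int) -> int:
    """Zeros(N-M) : x, for an m-bit value widened to n bits."""
    assert n >= m
    return x & ((1 << m) - 1)


def sign_extend(x: int, m: int, n: int) -> int:
    """Replicate(x<M-1>, N-M) : x, for an m-bit value widened to n bits."""
    assert n >= m
    x &= (1 << m) - 1
    if (x >> (m - 1)) & 1:
        # Fill bits n-1 down to m with copies of the sign bit.
        x |= ((1 << (n - m)) - 1) << m
    return x
```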
shared/functions/common/Zeros 


// Zeros()
// =======

bits(N) Zeros(integer N)
    return Replicate('0',N);

// Zeros()
// =======

bits(N) Zeros()
    return Zeros(N);


shared/functions/crc/BitReverse 


// BitReverse()
// ============

bits(N) BitReverse(bits(N) data)
    bits(N) result;
    for i = 0 to N-1
        result<N-i-1> = data<i>;
    return result;
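
The bit-by-bit copy in BitReverse() can be sketched in Python as follows, modelling the bitstring as an integer with an explicit width `n`. The name `bit_reverse` is an assumption of this example.

```python
def bit_reverse(data: int, n: int) -> int:
    """Reverse the order of the low n bits of data."""
    result = 0
    for i in range(n):
        # result<N-i-1> = data<i>
        if (data >> i) & 1:
            result |= 1 << (n - i - 1)
    return result
```

Applying the function twice with the same width returns the original value.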


shared/functions/crc/HaveCRCExt 


// HaveCRCExt()
// ============

boolean HaveCRCExt()
    return boolean IMPLEMENTATION_DEFINED "Have CRC extension";
shared/functions/crc/Poly32Mod2 


// Poly32Mod2()
// ============

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation

bits(32) Poly32Mod2(bits(N) data, bits(32) poly)
    assert N > 32;
    for i = N-1 downto 32
        if data<i> == '1' then
            data<i-1:0> = data<i-1:0> EOR poly:Zeros(i-32);
    return data<31:0>;
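
The reduction loop in Poly32Mod2() can be modelled in Python as below. This is a sketch under the assumption that `data` is an integer holding an `n`-bit bitstring and `poly` is a 32-bit integer; the name `poly32mod2` is illustrative.

```python
def poly32mod2(data: int, n: int, poly: int) -> int:
    """Reduce an n-bit polynomial over {0,1} modulo a 32-bit polynomial."""
    assert n > 32
    assert 0 <= poly < (1 << 32)
    for i in range(n - 1, 31, -1):
        if (data >> i) & 1:
            # poly:Zeros(i-32) is an i-bit value, so this XOR only changes
            # data<i-1:0>; bit i itself is left as-is and ignored afterwards.
            data ^= poly << (i - 32)
    return data & 0xFFFFFFFF
```

Input bits below position 32 pass through unchanged when no higher bit is set, since the loop body then never executes.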


shared/functions/crypto/AESInvMixColumns


bits(128) AESInvMixColumns(bits(128) op);

shared/functions/crypto/AESInvShiftRows


bits(128) AESInvShiftRows(bits(128) op); 


shared/functions/crypto/AESInvSubBytes 


bits(128) AESInvSubBytes(bits(128) op); 


shared/functions/crypto/AESMixColumns 


bits(128) AESMixColumns(bits(128) op);


shared/functions/crypto/AESShiftRows 


bits(128) AESShiftRows(bits(128) op); 


shared/functions/crypto/AESSubBytes 


bits(128) AESSubBytes(bits(128) op); 


shared/functions/crypto/HaveCryptoExt 


boolean HaveCryptoExt(); 


shared/functions/crypto/ROL 


// ROL()
// =====

bits(N) ROL(bits(N) x, integer shift)
    assert shift >= 0 && shift <= N;
    if (shift == 0) then
        return x;
    return ROR(x, N-shift);


shared/functions/crypto/SHA256hash 


// SHA256hash()
// ============

bits(128) SHA256hash(bits(128) X, bits(128) Y, bits(128) W, boolean part1)
    bits(32) chs, maj, t;

    for e = 0 to 3
        chs = SHAchoose(Y<31:0>, Y<63:32>, Y<95:64>);
        maj = SHAmajority(X<31:0>, X<63:32>, X<95:64>);
        t = Y<127:96> + SHAhashSIGMA1(Y<31:0>) + chs + Elem[W, e, 32];
        X<127:96> = t + X<127:96>;
        Y<127:96> = t + SHAhashSIGMA0(X<31:0>) + maj;
        <Y, X> = ROL(Y : X, 32);
    return (if part1 then X else Y);


shared/functions/crypto/SHAchoose 


// SHAchoose()
// ===========

bits(32) SHAchoose(bits(32) x, bits(32) y, bits(32) z)
    return (((y EOR z) AND x) EOR z);

shared/functions/crypto/SHAhashSIGMA0

// SHAhashSIGMA0()
// ===============

bits(32) SHAhashSIGMA0(bits(32) x)
    return ROR(x, 2) EOR ROR(x, 13) EOR ROR(x, 22);

shared/functions/crypto/SHAhashSIGMA1

// SHAhashSIGMA1()
// ===============

bits(32) SHAhashSIGMA1(bits(32) x)
    return ROR(x, 6) EOR ROR(x, 11) EOR ROR(x, 25);

shared/functions/crypto/SHAmajority 


// SHAmajority()
// =============

bits(32) SHAmajority(bits(32) x, bits(32) y, bits(32) z)
    return ((x AND y) OR ((x OR y) AND z));

shared/functions/crypto/SHAparity 


// SHAparity() 
(== 


bits(32) SHAparity(bits(32) x, bits(32) y, bits(32) z) 
return (x EOR y EOR z); 
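
The SHA helper functions use reduced Boolean forms of the usual choose and majority functions. The following Python sketch restates them on 32-bit integers so that the equivalence with the textbook forms can be checked. The function names and `MASK32` are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def ror32(x: int, r: int) -> int:
    """Rotate a 32-bit value right by r bits (0 < r < 32)."""
    return ((x >> r) | (x << (32 - r))) & MASK32


def sha_choose(x: int, y: int, z: int) -> int:
    # Same result as (x AND y) OR (NOT x AND z), with one fewer operation.
    return (((y ^ z) & x) ^ z) & MASK32


def sha_majority(x: int, y: int, z: int) -> int:
    # Same result as (x AND y) EOR (x AND z) EOR (y AND z).
    return ((x & y) | ((x | y) & z)) & MASK32


def sha_parity(x: int, y: int, z: int) -> int:
    return (x ^ y ^ z) & MASK32


def sha_hash_sigma0(x: int) -> int:
    return ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22)


def sha_hash_sigma1(x: int) -> int:
    return ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25)
```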


shared/functions/exclusive/ClearExclusiveByAddress 


// Clear the global Exclusive Monitors for all PEs EXCEPT processorid if they
// record any part of the physical address region of size bytes starting at paddress.
// It is IMPLEMENTATION DEFINED whether the global Exclusive Monitor for processorid 
// is also cleared if it records any part of the address region. 
ClearExclusiveByAddress(FullAddress paddress, integer processorid, integer size); 


shared/functions/exclusive/ClearExclusiveLocal 
// Clear the local Exclusive Monitor for the specified processorid. 
ClearExclusiveLocal(integer processorid); 


shared/functions/exclusive/ClearExclusiveMonitors 


// ClearExclusiveMonitors()
// ========================
// Clear the local Exclusive Monitor for the executing PE.

ClearExclusiveMonitors()
    ClearExclusiveLocal(ProcessorID());


shared/functions/exclusive/ExclusiveMonitorsStatus 


// Returns '0' to indicate success if the last memory write by this PE was to
// the same physical address region endorsed by ExclusiveMonitorsPass().
// Returns '1' to indicate failure if address translation resulted in a different
// physical address.
bit ExclusiveMonitorsStatus();


shared/functions/exclusive/IsExclusiveGlobal


// Return TRUE if the global Exclusive Monitor for processorid includes all of 
// the physical address region of size bytes starting at paddress. 
boolean IsExclusiveGlobal(FullAddress paddress, integer processorid, integer size); 


shared/functions/exclusive/IsExclusiveLocal 


// Return TRUE if the local Exclusive Monitor for processorid includes all of 
// the physical address region of size bytes starting at paddress. 
boolean IsExclusiveLocal(FullAddress paddress, integer processorid, integer size); 


shared/functions/exclusive/MarkExclusiveGlobal 


// Record the physical address region of size bytes starting at paddress in 
// the global Exclusive Monitor for processorid. 
MarkExclusiveGlobal(FullAddress paddress, integer processorid, integer size); 


shared/functions/exclusive/MarkExclusiveLocal 


// Record the physical address region of size bytes starting at paddress in 
// the local Exclusive Monitor for processorid. 
MarkExclusiveLocal(FullAddress paddress, integer processorid, integer size); 


shared/functions/exclusive/ProcessorID


// Return the ID of the currently executing PE. 
integer ProcessorID(); 


shared/functions/float/fixedtofp/FixedToFP 


// FixedToFP()
// ===========

// Convert M-bit fixed point OP with FBITS fractional bits to
// N-bit precision floating point, controlled by UNSIGNED and ROUNDING.

bits(N) FixedToFP(bits(M) op, integer fbits, boolean unsigned, FPCRType fpcr, FPRounding rounding)
    assert N IN {32,64};
    assert M IN {32,64};
    bits(N) result;
    assert fbits >= 0;
    assert rounding != FPRounding_ODD;

    // Correct signed-ness
    int_operand = Int(op, unsigned);

    // Scale by fractional bits and generate a real value
    real_operand = Real(int_operand) / 2.0^fbits;

    if real_operand == 0.0 then
        result = FPZero('0');
    else
        result = FPRound(real_operand, fpcr, rounding);

    return result;
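
The first two steps of FixedToFP(), sign correction and scaling by 2^fbits, can be illustrated with exact rational arithmetic. This is a minimal Python sketch under the assumption that the operand is passed as a non-negative integer holding the raw M-bit pattern. The name `fixed_to_real` is hypothetical, and the final rounding step is not modelled here.

```python
from fractions import Fraction


def fixed_to_real(op: int, m: int, fbits: int, unsigned: bool) -> Fraction:
    """Interpret the m-bit pattern op as fixed point with fbits fractional bits."""
    assert m in (32, 64) and fbits >= 0 and 0 <= op < (1 << m)
    # Correct signed-ness: reinterpret the top bit as a sign for signed operands.
    if not unsigned and (op >> (m - 1)) & 1:
        op -= 1 << m
    # Scale by the fractional bits to obtain the exact real value.
    return Fraction(op, 1 << fbits)
```

For example, the 32-bit pattern 0x00000180 with 8 fractional bits represents 1.5.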


shared/functions/float/fpabs/FPAbs 





// FPAbs()
// =======

bits(N) FPAbs(bits(N) op)
    assert N IN {32,64};
    return '0' : op<N-2:0>;


shared/functions/float/fpadd/FPAdd 


// FPAdd()
// =======

bits(N) FPAdd(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    rounding = FPRoundingMode(fpcr);
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity);  inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero);     zero2 = (type2 == FPType_Zero);
        if inf1 && inf2 && sign1 == NOT(sign2) then
            result = FPDefaultNaN();
            FPProcessException(FPExc_InvalidOp, fpcr);
        elsif (inf1 && sign1 == '0') || (inf2 && sign2 == '0') then
            result = FPInfinity('0');
        elsif (inf1 && sign1 == '1') || (inf2 && sign2 == '1') then
            result = FPInfinity('1');
        elsif zero1 && zero2 && sign1 == sign2 then
            result = FPZero(sign1);
        else
            result_value = value1 + value2;
            if result_value == 0.0 then  // Sign of exact zero result depends on rounding mode
                result_sign = if rounding == FPRounding_NEGINF then '1' else '0';
                result = FPZero(result_sign);
            else
                result = FPRound(result_value, fpcr, rounding);
    return result;





shared/functions/float/fpcompare/FPCompare 


// FPCompare()
// ===========

bits(4) FPCompare(bits(N) op1, bits(N) op2, boolean signal_nans, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = '0011';
        if type1==FPType_SNaN || type2==FPType_SNaN || signal_nans then
            FPProcessException(FPExc_InvalidOp, fpcr);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        if value1 == value2 then
            result = '0110';
        elsif value1 < value2 then
            result = '1000';
        else  // value1 > value2
            result = '0010';
    return result;
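
The four-bit result of FPCompare() can be illustrated on host floating-point values. This Python sketch returns the same bit strings for the unordered, equal, less-than, and greater-than cases. The name `fp_compare` is an assumption, and exception signaling and flush-to-zero are deliberately not modelled.

```python
import math


def fp_compare(a: float, b: float) -> str:
    """Return the 4-bit comparison result as a string of '0'/'1' characters."""
    if math.isnan(a) or math.isnan(b):
        return "0011"  # unordered
    if a == b:
        return "0110"  # equal (note that -0.0 == +0.0)
    if a < b:
        return "1000"  # less than
    return "0010"      # greater than
```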


shared/functions/float/fpcompareeq/FPCompareEQ


// FPCompareEQ()
// =============

boolean FPCompareEQ(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = FALSE;
        if type1==FPType_SNaN || type2==FPType_SNaN then
            FPProcessException(FPExc_InvalidOp, fpcr);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        result = (value1 == value2);
    return result;


shared/functions/float/fpcomparege/FPCompareGE


// FPCompareGE()
// =============

boolean FPCompareGE(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = FALSE;
        FPProcessException(FPExc_InvalidOp, fpcr);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        result = (value1 >= value2);
    return result;


shared/functions/float/fpcomparegt/FPCompareGT 


// FPCompareGT()
// =============

boolean FPCompareGT(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = FALSE;
        FPProcessException(FPExc_InvalidOp, fpcr);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        result = (value1 > value2);
    return result;


shared/functions/float/fpconvert/FPConvert 


// FPConvert()
// ===========

// Convert floating point OP with N-bit precision to M-bit precision,
// with rounding controlled by ROUNDING.

bits(M) FPConvert(bits(N) op, FPCRType fpcr, FPRounding rounding)
    assert M IN {16,32,64};
    assert N IN {16,32,64};
    bits(M) result;

    // Unpack floating-point operand optionally with flush-to-zero.
    (type,sign,value) = FPUnpack(op, fpcr);

    alt_hp = (M == 16) && (fpcr.AHP == '1');

    if type == FPType_SNaN || type == FPType_QNaN then
        if alt_hp then
            result = FPZero(sign);
        elsif fpcr.DN == '1' then
            result = FPDefaultNaN();
        else
            result = FPConvertNaN(op);
        if type == FPType_SNaN || alt_hp then
            FPProcessException(FPExc_InvalidOp, fpcr);
    elsif type == FPType_Infinity then
        if alt_hp then
            result = sign:Ones(M-1);
            FPProcessException(FPExc_InvalidOp, fpcr);
        else
            result = FPInfinity(sign);
    elsif type == FPType_Zero then
        result = FPZero(sign);
    else
        result = FPRound(value, fpcr, rounding);

    return result;


// FPConvert()
// ===========

bits(M) FPConvert(bits(N) op, FPCRType fpcr)
    return FPConvert(op, fpcr, FPRoundingMode(fpcr));


shared/functions/float/fpconvertnan/FPConvertNaN 


// FPConvertNaN()
// ==============

// Converts a NaN of one floating-point type to another

bits(M) FPConvertNaN(bits(N) op)
    assert N IN {16,32,64};
    assert M IN {16,32,64};
    bits(M) result;
    bits(51) frac;

    sign = op<N-1>;

    // Unpack payload from input NaN
    case N of
        when 64 frac = op<50:0>;
        when 32 frac = op<21:0>:Zeros(29);
        when 16 frac = op<8:0>:Zeros(42);

    // Repack payload into output NaN, while
    // converting an SNaN to a QNaN.
    case M of
        when 64 result = sign:Ones(M-52):frac;
        when 32 result = sign:Ones(M-23):frac<50:29>;
        when 16 result = sign:Ones(M-10):frac<50:42>;

    return result;
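
The payload repacking in FPConvertNaN() can be followed bit by bit. This Python sketch mirrors the two case statements on integer bit patterns. The name `fp_convert_nan` and the explicit width arguments `n` and `m` are assumptions made for illustration, and the caller is assumed to pass a NaN encoding.

```python
def fp_convert_nan(op: int, n: int, m: int) -> int:
    """Convert the n-bit NaN pattern op to an m-bit quiet NaN pattern."""
    assert n in (16, 32, 64) and m in (16, 32, 64)
    sign = (op >> (n - 1)) & 1

    # Unpack the payload below the quiet bit into a 51-bit, left-aligned field.
    if n == 64:
        frac = op & ((1 << 51) - 1)
    elif n == 32:
        frac = (op & ((1 << 22) - 1)) << 29
    else:
        frac = (op & ((1 << 9) - 1)) << 42

    # Repack: the run of ones covers the exponent and the quiet bit.
    if m == 64:
        ones, payload, width = 12, frac, 51
    elif m == 32:
        ones, payload, width = 9, frac >> 29, 22
    else:
        ones, payload, width = 6, frac >> 42, 9
    return (sign << (m - 1)) | (((1 << ones) - 1) << width) | payload
```

Because the quiet bit is always written as '1', a signaling NaN input produces a quiet NaN output with the same payload bits where they fit.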


shared/functions/float/fpcrtype/FPCRType 


type FPCRType; 


shared/functions/float/fpdecoderm/FPDecodeRM 


// FPDecodeRM()
// ============

// Decode most common AArch32 floating-point rounding encoding.

FPRounding FPDecodeRM(bits(2) rm)
    case rm of
        when '00' return FPRounding_TIEAWAY; // A
        when '01' return FPRounding_TIEEVEN; // N
        when '10' return FPRounding_POSINF;  // P
        when '11' return FPRounding_NEGINF;  // M


shared/functions/float/fpdecoderounding/FPDecodeRounding 


// FPDecodeRounding()
// ==================

// Decode floating-point rounding mode and common AArch64 encoding.

FPRounding FPDecodeRounding(bits(2) rmode)
    case rmode of
        when '00' return FPRounding_TIEEVEN; // N
        when '01' return FPRounding_POSINF;  // P
        when '10' return FPRounding_NEGINF;  // M
        when '11' return FPRounding_ZERO;    // Z


shared/functions/float/fpdefaultnan/FPDefaultNaN 


// FPDefaultNaN()
// ==============

bits(N) FPDefaultNaN()
    assert N IN {16,32,64};
    constant integer E = (if N == 16 then 5 elsif N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    sign = '0';
    exp  = Ones(E);
    frac = '1':Zeros(F-1);
    return sign : exp : frac;


shared/functions/float/fpdiv/FPDiv


// FPDiv()
// =======

bits(N) FPDiv(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity);
        inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero);
        zero2 = (type2 == FPType_Zero);
        if (inf1 && inf2) || (zero1 && zero2) then
            result = FPDefaultNaN();
            FPProcessException(FPExc_InvalidOp, fpcr);
        elsif inf1 || zero2 then
            result = FPInfinity(sign1 EOR sign2);
            if !inf1 then FPProcessException(FPExc_DivideByZero, fpcr);
        elsif zero1 || inf2 then
            result = FPZero(sign1 EOR sign2);
        else
            result = FPRound(value1/value2, fpcr);
    return result;


shared/functions/float/fpexc/FPExc 


enumeration FPExc {FPExc_InvalidOp, FPExc_DivideByZero, FPExc_Overflow,
                   FPExc_Underflow, FPExc_Inexact, FPExc_InputDenorm};


shared/functions/float/fpinfinity/FPInfinity


// FPInfinity()
// ============

bits(N) FPInfinity(bit sign)
    assert N IN {16,32,64};
    constant integer E = (if N == 16 then 5 elsif N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    exp  = Ones(E);
    frac = Zeros(F);
    return sign : exp : frac;


shared/functions/float/fpmax/FPMax 


// FPMax()
// =======

bits(N) FPMax(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        if value1 > value2 then
            (type,sign,value) = (type1,sign1,value1);
        else
            (type,sign,value) = (type2,sign2,value2);
        if type == FPType_Infinity then
            result = FPInfinity(sign);
        elsif type == FPType_Zero then
            sign = sign1 AND sign2;  // Use most positive sign
            result = FPZero(sign);
        else
            result = FPRound(value, fpcr);
    return result;


shared/functions/float/fpmaxnormal/FPMaxNormal 


// FPMaxNormal()
// =============

bits(N) FPMaxNormal(bit sign)
    assert N IN {16,32,64};
    constant integer E = (if N == 16 then 5 elsif N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    exp  = Ones(E-1):'0';
    frac = Ones(F);
    return sign : exp : frac;
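
FPDefaultNaN(), FPInfinity(), and FPMaxNormal() all assemble a value from sign, exponent, and fraction fields whose widths depend only on N. The following Python sketch builds the same bit patterns as integers. The function names and the helper `_fields` are assumptions made for illustration.

```python
def _fields(n: int) -> tuple[int, int]:
    """Return (exponent bits E, fraction bits F) for an n-bit format."""
    e = {16: 5, 32: 8, 64: 11}[n]
    return e, n - e - 1


def fp_infinity(n: int, sign: int) -> int:
    e, f = _fields(n)
    return (sign << (n - 1)) | (((1 << e) - 1) << f)


def fp_default_nan(n: int) -> int:
    e, f = _fields(n)
    # Sign '0', exponent all ones, fraction '1' followed by zeros.
    return (((1 << e) - 1) << f) | (1 << (f - 1))


def fp_max_normal(n: int, sign: int) -> int:
    e, f = _fields(n)
    # Exponent Ones(E-1):'0', fraction all ones.
    return (sign << (n - 1)) | (((1 << e) - 2) << f) | ((1 << f) - 1)
```

The 64-bit patterns can be checked against the host's own double-precision values with `struct.unpack("<d", struct.pack("<Q", bits))`.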


shared/functions/float/fpmaxnum/FPMaxNum 


// FPMaxNum()
// ==========

bits(N) FPMaxNum(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,-,-) = FPUnpack(op1, fpcr);
    (type2,-,-) = FPUnpack(op2, fpcr);

    // treat a single quiet-NaN as -Infinity
    if type1 == FPType_QNaN && type2 != FPType_QNaN then
        op1 = FPInfinity('1');
    elsif type1 != FPType_QNaN && type2 == FPType_QNaN then
        op2 = FPInfinity('1');

    return FPMax(op1, op2, fpcr);


shared/functions/float/fpmin/FPMin 


// FPMin()
// =======

bits(N) FPMin(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        if value1 < value2 then
            (type,sign,value) = (type1,sign1,value1);
        else
            (type,sign,value) = (type2,sign2,value2);
        if type == FPType_Infinity then
            result = FPInfinity(sign);
        elsif type == FPType_Zero then
            sign = sign1 OR sign2;  // Use most negative sign
            result = FPZero(sign);
        else
            result = FPRound(value, fpcr);
    return result;


shared/functions/float/fpminnum/FPMinNum 


// FPMinNum()
// ==========

bits(N) FPMinNum(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,-,-) = FPUnpack(op1, fpcr);
    (type2,-,-) = FPUnpack(op2, fpcr);

    // Treat a single quiet-NaN as +Infinity
    if type1 == FPType_QNaN && type2 != FPType_QNaN then
        op1 = FPInfinity('0');
    elsif type1 != FPType_QNaN && type2 == FPType_QNaN then
        op2 = FPInfinity('0');

    return FPMin(op1, op2, fpcr);


shared/functions/float/fpmul/FPMul 


// FPMul()
// =======

bits(N) FPMul(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity);
        inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero);
        zero2 = (type2 == FPType_Zero);
        if (inf1 && zero2) || (zero1 && inf2) then
            result = FPDefaultNaN();
            FPProcessException(FPExc_InvalidOp, fpcr);
        elsif inf1 || inf2 then
            result = FPInfinity(sign1 EOR sign2);
        elsif zero1 || zero2 then
            result = FPZero(sign1 EOR sign2);
        else
            result = FPRound(value1*value2, fpcr);
    return result;


shared/functions/float/fpmuladd/FPMulAdd


// FPMulAdd()
// ==========

// Calculates addend + op1*op2 with a single rounding.

bits(N) FPMulAdd(bits(N) addend, bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    rounding = FPRoundingMode(fpcr);
    (typeA,signA,valueA) = FPUnpack(addend, fpcr);
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    inf1 = (type1 == FPType_Infinity); zero1 = (type1 == FPType_Zero);
    inf2 = (type2 == FPType_Infinity); zero2 = (type2 == FPType_Zero);
    (done,result) = FPProcessNaNs3(typeA, type1, type2, addend, op1, op2, fpcr);

    if typeA == FPType_QNaN && ((inf1 && zero2) || (zero1 && inf2)) then
        result = FPDefaultNaN();
        FPProcessException(FPExc_InvalidOp, fpcr);

    if !done then
        infA = (typeA == FPType_Infinity); zeroA = (typeA == FPType_Zero);

        // Determine sign and type product will have if it does not cause an Invalid
        // Operation.
        signP = sign1 EOR sign2;
        infP  = inf1 || inf2;
        zeroP = zero1 || zero2;

        // Non SNaN-generated Invalid Operation cases are multiplies of zero by infinity and
        // additions of opposite-signed infinities.
        if (inf1 && zero2) || (zero1 && inf2) || (infA && infP && signA != signP) then
            result = FPDefaultNaN();
            FPProcessException(FPExc_InvalidOp, fpcr);

        // Other cases involving infinities produce an infinity of the same sign.
        elsif (infA && signA == '0') || (infP && signP == '0') then
            result = FPInfinity('0');
        elsif (infA && signA == '1') || (infP && signP == '1') then
            result = FPInfinity('1');

        // Cases where the result is exactly zero and its sign is not determined by the
        // rounding mode are additions of same-signed zeros.
        elsif zeroA && zeroP && signA == signP then
            result = FPZero(signA);

        // Otherwise calculate numerical result and round it.
        else
            result_value = valueA + (value1 * value2);
            if result_value == 0.0 then  // Sign of exact zero result depends on rounding mode
                result_sign = if rounding == FPRounding_NEGINF then '1' else '0';
                result = FPZero(result_sign);
            else
                result = FPRound(result_value, fpcr);

    return result;


shared/functions/float/fpmulx/FPMulX


// FPMulX()
// ========

bits(N) FPMulX(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    bits(N) result;
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity);
        inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero);
        zero2 = (type2 == FPType_Zero);
        if (inf1 && zero2) || (zero1 && inf2) then
            result = FPTwo(sign1 EOR sign2);
        elsif inf1 || inf2 then
            result = FPInfinity(sign1 EOR sign2);
        elsif zero1 || zero2 then
            result = FPZero(sign1 EOR sign2);
        else
            result = FPRound(value1*value2, fpcr);
    return result;


shared/functions/float/fpneg/FPNeg


// FPNeg()
// =======

bits(N) FPNeg(bits(N) op)
    assert N IN {32,64};
    return NOT(op<N-1>) : op<N-2:0>;


shared/functions/float/fponepointfive/FPOnePointFive 


// FPOnePointFive()
// ================

bits(N) FPOnePointFive(bit sign)
    assert N IN {32,64};
    constant integer E = (if N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    exp  = '0':Ones(E-1);
    frac = '1':Zeros(F-1);
    return sign : exp : frac;


shared/functions/float/fpprocessexception/FPProcessException 


// FPProcessException()
// ====================
//
// The 'fpcr' argument supplies FPCR control bits. Status information is
// updated directly in the FPSR where appropriate.

FPProcessException(FPExc exception, FPCRType fpcr)
    // Determine the cumulative exception bit number
    case exception of
        when FPExc_InvalidOp     cumul = 0;
        when FPExc_DivideByZero  cumul = 1;
        when FPExc_Overflow      cumul = 2;
        when FPExc_Underflow     cumul = 3;
        when FPExc_Inexact       cumul = 4;
        when FPExc_InputDenorm   cumul = 7;
    enable = cumul + 8;
    if fpcr<enable> == '1' then
        // Trapping of the exception enabled.
        // It is IMPLEMENTATION DEFINED whether the enable bit may be set at all, and
        // if so then how exceptions may be accumulated before calling FPTrapException()
        IMPLEMENTATION_DEFINED "floating-point trap handling";
    elsif UsingAArch32() then
        // Set the cumulative exception bit
        FPSCR<cumul> = '1';
    else
        // Set the cumulative exception bit
        FPSR<cumul> = '1';
    return;


shared/functions/float/fpprocessnan/FPProcessNaN 


// FPProcessNaN()
// ==============

bits(N) FPProcessNaN(FPType type, bits(N) op, FPCRType fpcr)
    assert N IN {32,64};
    assert type IN {FPType_QNaN, FPType_SNaN};

    case N of
        when 32 topfrac = 22;
        when 64 topfrac = 51;

    result = op;
    if type == FPType_SNaN then
        result<topfrac> = '1';
        FPProcessException(FPExc_InvalidOp, fpcr);
    if fpcr.DN == '1' then  // DefaultNaN requested
        result = FPDefaultNaN();
    return result;


shared/functions/float/fpprocessnans/FPProcessNaNs


// FPProcessNaNs()
// ===============
//
// The boolean part of the return value says whether a NaN has been found and
// processed. The bits(N) part is only relevant if it has and supplies the
// result of the operation.
//
// The 'fpcr' argument supplies FPCR control bits. Status information is
// updated directly in the FPSR where appropriate.

(boolean, bits(N)) FPProcessNaNs(FPType type1, FPType type2,
                                 bits(N) op1, bits(N) op2,
                                 FPCRType fpcr)
    assert N IN {32,64};
    if type1 == FPType_SNaN then
        done = TRUE;  result = FPProcessNaN(type1, op1, fpcr);
    elsif type2 == FPType_SNaN then
        done = TRUE;  result = FPProcessNaN(type2, op2, fpcr);
    elsif type1 == FPType_QNaN then
        done = TRUE;  result = FPProcessNaN(type1, op1, fpcr);
    elsif type2 == FPType_QNaN then
        done = TRUE;  result = FPProcessNaN(type2, op2, fpcr);
    else
        done = FALSE;  result = Zeros();  // 'Don't care' result
    return (done, result);
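
The NaN handling above has two parts: quieting a signaling NaN by setting the top fraction bit, and giving signaling NaNs priority over quiet NaNs when two operands are examined. The following Python sketch models both on integer bit patterns. All names, the `default_nan` flag standing in for FPCR.DN, and the returned `invalid_op` flag standing in for the call to FPProcessException() are assumptions made for illustration.

```python
def classify(op: int, n: int) -> str:
    """Return 'snan', 'qnan', or 'other' for an n-bit pattern (n is 32 or 64)."""
    e = 8 if n == 32 else 11
    f = n - e - 1
    exp = (op >> f) & ((1 << e) - 1)
    frac = op & ((1 << f) - 1)
    if exp != (1 << e) - 1 or frac == 0:
        return "other"
    return "qnan" if (frac >> (f - 1)) & 1 else "snan"


def fp_process_nan(op: int, n: int, default_nan: bool = False) -> tuple[int, bool]:
    """Return (result bits, invalid_op) for a NaN operand."""
    topfrac = {32: 22, 64: 51}[n]
    invalid_op = classify(op, n) == "snan"
    result = op | (1 << topfrac)          # quiet the NaN; no change for a QNaN
    if default_nan:
        e = 8 if n == 32 else 11
        result = (((1 << e) - 1) << (n - e - 1)) | (1 << topfrac)
    return result, invalid_op


def fp_process_nans(op1: int, op2: int, n: int, default_nan: bool = False) -> tuple[bool, int]:
    """Return (done, result); signaling NaNs are considered before quiet NaNs."""
    kinds = (classify(op1, n), classify(op2, n))
    for wanted in ("snan", "qnan"):
        for kind, op in zip(kinds, (op1, op2)):
            if kind == wanted:
                return True, fp_process_nan(op, n, default_nan)[0]
    return False, 0
```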


shared/functions/float/fpprocessnans3/FPProcessNaNs3


// FPProcessNaNs3()
// ================
//
// The boolean part of the return value says whether a NaN has been found and
// processed. The bits(N) part is only relevant if it has and supplies the
// result of the operation.
//
// The 'fpcr' argument supplies FPCR control bits. Status information is
// updated directly in the FPSR where appropriate.

(boolean, bits(N)) FPProcessNaNs3(FPType type1, FPType type2, FPType type3,
                                  bits(N) op1, bits(N) op2, bits(N) op3,
                                  FPCRType fpcr)
    assert N IN {32,64};
    if type1 == FPType_SNaN then
        done = TRUE;  result = FPProcessNaN(type1, op1, fpcr);
    elsif type2 == FPType_SNaN then
        done = TRUE;  result = FPProcessNaN(type2, op2, fpcr);
    elsif type3 == FPType_SNaN then
        done = TRUE;  result = FPProcessNaN(type3, op3, fpcr);
    elsif type1 == FPType_QNaN then
        done = TRUE;  result = FPProcessNaN(type1, op1, fpcr);
    elsif type2 == FPType_QNaN then
        done = TRUE;  result = FPProcessNaN(type2, op2, fpcr);
    elsif type3 == FPType_QNaN then
        done = TRUE;  result = FPProcessNaN(type3, op3, fpcr);
    else
        done = FALSE;  result = Zeros();  // 'Don't care' result
    return (done, result);


shared/functions/float/fprecipestimate/FPRecipEstimate 


// FPRecipEstimate()
// =================

bits(N) FPRecipEstimate(bits(N) operand, FPCRType fpcr)
    assert N IN {32,64};
    (type,sign,value) = FPUnpack(operand, fpcr);
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, operand, fpcr);
    elsif type == FPType_Infinity then
        result = FPZero(sign);
    elsif type == FPType_Zero then
        result = FPInfinity(sign);
        FPProcessException(FPExc_DivideByZero, fpcr);
    elsif (
            (N == 32 && Abs(value) < 2.0^-128) ||
            (N == 64 && Abs(value) < 2.0^-1024)
          ) then
        case FPRoundingMode(fpcr) of
            when FPRounding_TIEEVEN
                overflow_to_inf = TRUE;
            when FPRounding_POSINF
                overflow_to_inf = (sign == '0');
            when FPRounding_NEGINF
                overflow_to_inf = (sign == '1');
            when FPRounding_ZERO
                overflow_to_inf = FALSE;
        result = if overflow_to_inf then FPInfinity(sign) else FPMaxNormal(sign);
        FPProcessException(FPExc_Overflow, fpcr);
        FPProcessException(FPExc_Inexact, fpcr);
    elsif fpcr.FZ == '1'
          && (
               (N == 32 && Abs(value) >= 2.0^126) ||
               (N == 64 && Abs(value) >= 2.0^1022)
             ) then
        // Result flushed to zero of correct sign
        result = FPZero(sign);
        if UsingAArch32() then
            FPSCR.UFC = '1';
        else
            FPSR.UFC = '1';
    else
        // Scale to a double-precision value in the range 0.5 <= x < 1.0, and
        // calculate result exponent. Scaled value has copied sign bit,
        // exponent = 1022 = double-precision biased version of -1,
        // fraction = original fraction extended with zeros.
        case N of
            when 32
                fraction = operand<22:0> : Zeros(29);
                exp = UInt(operand<30:23>);
            when 64
                fraction = operand<51:0>;
                exp = UInt(operand<62:52>);

        if exp == 0 then
            if fraction<51> == 0 then
                exp = -1;
                fraction = fraction<49:0>:'00';
            else
                fraction = fraction<50:0>:'0';

        scaled = '0' : '01111111110' : fraction<51:44> : Zeros(44);

        case N of
            when 32 result_exp =  253 - exp;  // In range 253-254 = -1 to 253+1 = 254
            when 64 result_exp = 2045 - exp;  // In range 2045-2046 = -1 to 2045+1 = 2046

        // Call C function to get reciprocal estimate of scaled value.
        // Input is rounded down to a multiple of 1/512.
        estimate = recip_estimate(scaled);

        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256.
        // Convert to scaled single-precision result with copied sign bit and high-order
        // fraction bits, and exponent calculated above.

        fraction = estimate<51:0>;
        if result_exp == 0 then
            fraction = '1' : fraction<51:1>;
        elsif result_exp == -1 then
            fraction = '01' : fraction<51:2>;
            result_exp = 0;

        case N of
            when 32 result = sign : result_exp<N-25:0> : fraction<51:29>;
            when 64 result = sign : result_exp<N-54:0> : fraction<51:0>;

    return result;


shared/functions/float/fprecpx/FPRecpX


// FPRecpX()
// =========

bits(N) FPRecpX(bits(N) op, FPCRType fpcr)
    assert N IN {32,64};

    case N of
        when 32 esize = 8;
        when 64 esize = 11;

    bits(N) result;
    bits(esize) exp;
    bits(esize) max_exp;
    bits(N-esize-1) frac = Zeros();

    case N of
        when 32 exp = op<23+esize-1:23>;
        when 64 exp = op<52+esize-1:52>;

    max_exp = Ones(esize) - 1;

    (type,sign,value) = FPUnpack(op, fpcr);
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, op, fpcr);
    else
        if IsZero(exp) then  // Zero and denormals
            result = sign:max_exp:frac;
        else  // Infinities and normals
            result = sign:NOT(exp):frac;

    return result;
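
FPRecpX() works purely on the exponent field: it inverts the exponent bits and clears the fraction, with zero and denormal inputs mapped to the largest finite exponent. The following Python sketch shows the non-NaN path on integer bit patterns. The name `fp_recpx` is an assumption, and NaN inputs are rejected here because their handling belongs to FPProcessNaN().

```python
def fp_recpx(op: int, n: int) -> int:
    """Reciprocal exponent of the n-bit pattern op (non-NaN inputs only)."""
    assert n in (32, 64)
    esize = 8 if n == 32 else 11
    fbits = n - esize - 1
    exp_mask = (1 << esize) - 1

    sign = (op >> (n - 1)) & 1
    exp = (op >> fbits) & exp_mask
    frac_in = op & ((1 << fbits) - 1)
    assert not (exp == exp_mask and frac_in != 0), "NaN inputs are not modelled"

    if exp == 0:                      # zero and denormals
        new_exp = exp_mask - 1        # Ones(esize) - 1
    else:                             # infinities and normals
        new_exp = ~exp & exp_mask     # NOT(exp)
    return (sign << (n - 1)) | (new_exp << fbits)
```

For a single-precision 1.0 (0x3F800000) the exponent field 0x7F becomes 0x80, giving the pattern for 2.0.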


shared/functions/float/fpround/FPRound 


// FPRound()
// =========

// Convert a real number OP into an N-bit floating-point value using the
// supplied rounding mode RMODE.

bits(N) FPRound(real op, FPCRType fpcr, FPRounding rounding)
    assert N IN {16,32,64};
    assert op != 0.0;
    assert rounding != FPRounding_TIEAWAY;
    bits(N) result;

    // Obtain format parameters - minimum exponent, numbers of exponent and fraction bits.
    if N == 16 then
        minimum_exp = -14;  E = 5;  F = 10;
    elsif N == 32 then
        minimum_exp = -126;  E = 8;  F = 23;
    else  // N == 64
        minimum_exp = -1022;  E = 11;  F = 52;

    // Split value into sign, unrounded mantissa and exponent.
    if op < 0.0 then
        sign = '1';  mantissa = -op;
    else
        sign = '0';  mantissa = op;
    exponent = 0;
    while mantissa < 1.0 do
        mantissa = mantissa * 2.0;  exponent = exponent - 1;
    while mantissa >= 2.0 do
        mantissa = mantissa / 2.0;  exponent = exponent + 1;

    // Deal with flush-to-zero.
    if fpcr.FZ == '1' && N != 16 && exponent < minimum_exp then
        // Flush-to-zero never generates a trapped exception
        if UsingAArch32() then
            FPSCR.UFC = '1';
        else
            FPSR.UFC = '1';
        return FPZero(sign);

    // Start creating the exponent value for the result. Start by biasing the actual exponent
    // so that the minimum exponent becomes 1, lower values 0 (indicating possible underflow).
    biased_exp = Max(exponent - minimum_exp + 1, 0);
    if biased_exp == 0 then mantissa = mantissa / 2.0^(minimum_exp - exponent);

    // Get the unrounded mantissa as an integer, and the "units in last place" rounding error.
    int_mant = RoundDown(mantissa * 2.0^F);  // < 2.0^F if biased_exp == 0, >= 2.0^F if not
    error = mantissa * 2.0^F - Real(int_mant);

    // Underflow occurs if exponent is too small before rounding, and result is inexact or
    // the Underflow exception is trapped.
    if biased_exp == 0 && (error != 0.0 || fpcr.UFE == '1') then
        FPProcessException(FPExc_Underflow, fpcr);

    // Round result according to rounding mode.
    case rounding of
        when FPRounding_TIEEVEN
            round_up = (error > 0.5 || (error == 0.5 && int_mant<0> == '1'));
            overflow_to_inf = TRUE;
        when FPRounding_POSINF
            round_up = (error != 0.0 && sign == '0');
            overflow_to_inf = (sign == '0');
        when FPRounding_NEGINF
            round_up = (error != 0.0 && sign == '1');
            overflow_to_inf = (sign == '1');
        when FPRounding_ZERO, FPRounding_ODD
            round_up = FALSE;
            overflow_to_inf = FALSE;

    if round_up then
        int_mant = int_mant + 1;
        if int_mant == 2^F then      // Rounded up from denormalized to normalized
            biased_exp = 1;
        if int_mant == 2^(F+1) then  // Rounded up to next exponent
            biased_exp = biased_exp + 1;  int_mant = int_mant DIV 2;

    // Handle rounding to odd aka Von Neumann rounding
    if error != 0.0 && rounding == FPRounding_ODD then
        int_mant<0> = '1';

    // Deal with overflow and generate result.
    if N != 16 || fpcr.AHP == '0' then  // Single, double or IEEE half precision
        if biased_exp >= 2^E - 1 then
            result = if overflow_to_inf then FPInfinity(sign) else FPMaxNormal(sign);
            FPProcessException(FPExc_Overflow, fpcr);
            error = 1.0;  // Ensure that an Inexact exception occurs
        else
            result = sign : biased_exp<N-F-2:0> : int_mant<F-1:0>;
    else                                // Alternative half precision
        if biased_exp >= 2^E then
            result = sign : Ones(N-1);
            FPProcessException(FPExc_InvalidOp, fpcr);
            error = 0.0;  // Ensure that an Inexact exception does not occur
        else
            result = sign : biased_exp<N-F-2:0> : int_mant<F-1:0>;

    // Deal with Inexact exception.
    if error != 0.0 then
        FPProcessException(FPExc_Inexact, fpcr);

    return result;


// FPRound()
// =========

bits(N) FPRound(real op, FPCRType fpcr)
    return FPRound(op, fpcr, FPRoundingMode(fpcr));
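
The core of FPRound(), normalizing to a mantissa in [1, 2), biasing the exponent, and rounding an integer mantissa, can be reproduced with exact rational arithmetic. The following Python sketch covers the IEEE formats with the four directed and nearest modes only. The name `fp_round`, the string rounding-mode names, and the omission of flush-to-zero, alternative half precision, round-to-odd, and all exception flags are simplifying assumptions.

```python
import math
from fractions import Fraction

# (minimum exponent, exponent bits E, fraction bits F) per format width.
FORMATS = {16: (-14, 5, 10), 32: (-126, 8, 23), 64: (-1022, 11, 52)}


def fp_round(op, n: int = 32, rounding: str = "TIEEVEN") -> int:
    """Round a nonzero exact value to an n-bit IEEE bit pattern."""
    op = Fraction(op)
    assert op != 0
    minimum_exp, e_bits, f_bits = FORMATS[n]

    # Split value into sign, unrounded mantissa and exponent.
    sign = 1 if op < 0 else 0
    mantissa, exponent = abs(op), 0
    while mantissa < 1:
        mantissa *= 2
        exponent -= 1
    while mantissa >= 2:
        mantissa /= 2
        exponent += 1

    # Bias so that the minimum exponent becomes 1 and lower values 0.
    biased_exp = max(exponent - minimum_exp + 1, 0)
    if biased_exp == 0:
        mantissa /= Fraction(2) ** (minimum_exp - exponent)

    scaled = mantissa * 2 ** f_bits
    int_mant = math.floor(scaled)
    error = scaled - int_mant
    half = Fraction(1, 2)

    if rounding == "TIEEVEN":
        round_up = error > half or (error == half and int_mant & 1 == 1)
        overflow_to_inf = True
    elif rounding == "POSINF":
        round_up = error != 0 and sign == 0
        overflow_to_inf = sign == 0
    elif rounding == "NEGINF":
        round_up = error != 0 and sign == 1
        overflow_to_inf = sign == 1
    else:  # "ZERO"
        round_up = False
        overflow_to_inf = False

    if round_up:
        int_mant += 1
        if int_mant == 2 ** f_bits:        # denormal rounded up to normal
            biased_exp = 1
        if int_mant == 2 ** (f_bits + 1):  # rounded up to next exponent
            biased_exp += 1
            int_mant //= 2

    if biased_exp >= 2 ** e_bits - 1:      # overflow
        if overflow_to_inf:
            exp_field, frac_field = 2 ** e_bits - 1, 0
        else:
            exp_field, frac_field = 2 ** e_bits - 2, 2 ** f_bits - 1
    else:
        exp_field, frac_field = biased_exp, int_mant & (2 ** f_bits - 1)
    return (sign << (n - 1)) | (exp_field << f_bits) | frac_field
```

With round-to-nearest-even, the 32-bit result for a host double `x` can be compared against `struct.pack("<f", x)`, which performs the same conversion on typical platforms.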


shared/functions/float/fprounding/FPRounding


enumeration FPRounding {FPRounding_TIEEVEN, FPRounding_POSINF,
                        FPRounding_NEGINF, FPRounding_ZERO,
                        FPRounding_TIEAWAY, FPRounding_ODD};


shared/functions/float/fproundingmode/FPRoundingMode 


// FPRoundingMode()
// ================

// Return the current floating-point rounding mode.

FPRounding FPRoundingMode(FPCRType fpcr)
    return FPDecodeRounding(fpcr.RMode);


shared/functions/float/fproundint/FPRoundInt


// FPRoundInt()
// ============

// Round OP to nearest integral floating point value using rounding mode ROUNDING.
// If EXACT is TRUE, set FPSR.IXC if result is not numerically equal to OP.

bits(N) FPRoundInt(bits(N) op, FPCRType fpcr, FPRounding rounding, boolean exact)
    assert rounding != FPRounding_ODD;
    assert N IN {32,64};

    // Unpack using FPCR to determine if subnormals are flushed-to-zero
    (type,sign,value) = FPUnpack(op, fpcr);

    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, op, fpcr);
    elsif type == FPType_Infinity then
        result = FPInfinity(sign);
    elsif type == FPType_Zero then
        result = FPZero(sign);
    else
        // extract integer component
        int_result = RoundDown(value);
        error = value - Real(int_result);

        // Determine whether supplied rounding mode requires an increment
        case rounding of
            when FPRounding_TIEEVEN
                round_up = (error > 0.5 || (error == 0.5 && int_result<0> == '1'));
            when FPRounding_POSINF
                round_up = (error != 0.0);
            when FPRounding_NEGINF
                round_up = FALSE;
            when FPRounding_ZERO
                round_up = (error != 0.0 && int_result < 0);
            when FPRounding_TIEAWAY
                round_up = (error > 0.5 || (error == 0.5 && int_result >= 0));

        if round_up then int_result = int_result + 1;

        // Convert integer value into an equivalent real value
        real_result = Real(int_result);

        // Re-encode as a floating-point value, result is always exact
        if real_result == 0.0 then
            result = FPZero(sign);
        else
            result = FPRound(real_result, fpcr, FPRounding_ZERO);

        // Generate inexact exceptions
        if error != 0.0 && exact then
            FPProcessException(FPExc_Inexact, fpcr);

    return result;
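
The increment decision in FPRoundInt() always starts from the value rounded down, so each rounding mode reduces to a test on the non-negative error term. The following Python sketch applies the same tests to exact rationals. The name `round_to_integral` and the string mode names are assumptions made for illustration, and NaN, infinity, and zero operands are not modelled.

```python
import math
from fractions import Fraction


def round_to_integral(value, rounding: str) -> int:
    """Round an exact value to an integer using the FPRoundInt() decision rules."""
    value = Fraction(value)
    int_result = math.floor(value)       # RoundDown(value)
    error = value - int_result           # always in the range [0, 1)
    half = Fraction(1, 2)

    if rounding == "TIEEVEN":
        round_up = error > half or (error == half and int_result % 2 == 1)
    elif rounding == "POSINF":
        round_up = error != 0
    elif rounding == "NEGINF":
        round_up = False
    elif rounding == "ZERO":
        round_up = error != 0 and int_result < 0
    elif rounding == "TIEAWAY":
        round_up = error > half or (error == half and int_result >= 0)
    else:
        raise ValueError(f"unsupported rounding mode: {rounding}")

    return int_result + 1 if round_up else int_result
```

Because the starting point is the floor, rounding toward zero needs an increment only for negative non-integral values.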


shared/functions/float/fprsqrtestimate/FPRSqrtEstimate 


// FPRSqrtEstimate()
// =================

bits(N) FPRSqrtEstimate(bits(N) operand, FPCRType fpcr)
    assert N IN {32,64};
    (type,sign,value) = FPUnpack(operand, fpcr);
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, operand, fpcr);
    elsif type == FPType_Zero then
        result = FPInfinity(sign);
        FPProcessException(FPExc_DivideByZero, fpcr);
    elsif sign == '1' then
        result = FPDefaultNaN();
        FPProcessException(FPExc_InvalidOp, fpcr);
    elsif type == FPType_Infinity then
        result = FPZero('0');
    else
        // Scale to a double-precision value in the range 0.25 <= x < 1.0, with the
        // evenness or oddness of the exponent unchanged, and calculate result exponent.
        // Scaled value has copied sign bit, exponent = 1022 or 1021 = double-precision
        // biased version of -1 or -2, fraction = original fraction extended with zeros.
        case N of
            when 32
                fraction = operand<22:0> : Zeros(29);
                exp = UInt(operand<30:23>);
            when 64
                fraction = operand<51:0>;
                exp = UInt(operand<62:52>);

        if exp == 0 then
            while fraction<51> == 0 do
                fraction = fraction<50:0> : '0';
                exp = exp - 1;
            fraction = fraction<50:0> : '0';

        if exp<0> == '0' then
            scaled = '0' : '01111111110' : fraction<51:44> : Zeros(44);
        else
            scaled = '0' : '01111111101' : fraction<51:44> : Zeros(44);

        case N of
            when 32 result_exp = ( 380 - exp) DIV 2;
            when 64 result_exp = (3068 - exp) DIV 2;

        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_sqrt_estimate(scaled);

        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256.
        // Convert to scaled single-precision result with copied sign bit and high-order
        // fraction bits, and exponent calculated above.
        case N of
            when 32 result = '0' : result_exp<N-25:0> : estimate<51:29>;
            when 64 result = '0' : result_exp<N-54:0> : estimate<51:0>;
    return result;


shared/functions/float/fpsqrt/FPSqrt


// FPSqrt()
// ========


bits(N) FPSqrt(bits(N) op, FPCRType fpcr)
    assert N IN {32,64};
    (type,sign,value) = FPUnpack(op, fpcr);
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, op, fpcr);
    elsif type == FPType_Zero then
        result = FPZero(sign);
    elsif type == FPType_Infinity && sign == '0' then
        result = FPInfinity(sign);
    elsif sign == '1' then
        result = FPDefaultNaN();
        FPProcessException(FPExc_InvalidOp, fpcr);
    else
        result = FPRound(Sqrt(value), fpcr);
    return result;


shared/functions/float/fpsub/FPSub 


// FPSub()
// =======


bits(N) FPSub(bits(N) op1, bits(N) op2, FPCRType fpcr)
    assert N IN {32,64};
    rounding = FPRoundingMode(fpcr);
    (type1,sign1,value1) = FPUnpack(op1, fpcr);
    (type2,sign2,value2) = FPUnpack(op2, fpcr);
    (done, result) = FPProcessNaNs(type1, type2, op1, op2, fpcr);
    if !done then
        inf1 = (type1 == FPType_Infinity);
        inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero);
        zero2 = (type2 == FPType_Zero);
        if inf1 && inf2 && sign1 == sign2 then
            result = FPDefaultNaN();
            FPProcessException(FPExc_InvalidOp, fpcr);
        elsif (inf1 && sign1 == '0') || (inf2 && sign2 == '1') then
            result = FPInfinity('0');
        elsif (inf1 && sign1 == '1') || (inf2 && sign2 == '0') then
            result = FPInfinity('1');
        elsif zero1 && zero2 && sign1 == NOT(sign2) then
            result = FPZero(sign1);
        else
            result_value = value1 - value2;
            if result_value == 0.0 then  // Sign of exact zero result depends on rounding mode
                result_sign = if rounding == FPRounding_NEGINF then '1' else '0';
                result = FPZero(result_sign);
            else
                result = FPRound(result_value, fpcr, rounding);
    return result;


shared/functions/float/fpthree/FPThree 


// FPThree()
// =========


bits(N) FPThree(bit sign)
    assert N IN {32,64};
    constant integer E = (if N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    exp  = '1':Zeros(E-1);
    frac = '1':Zeros(F-1);
    return sign : exp : frac;


shared/functions/float/fptofixed/FPToFixed 


// FPToFixed()
// ===========


// Convert N-bit precision floating point OP to M-bit fixed point with
// FBITS fractional bits, controlled by UNSIGNED and ROUNDING.


bits(M) FPToFixed(bits(N) op, integer fbits, boolean unsigned, FPCRType fpcr, FPRounding rounding)
    assert N IN {32,64};
    assert M IN {32,64};
    assert fbits >= 0;
    assert rounding != FPRounding_ODD;

    // Unpack using fpcr to determine if subnormals are flushed-to-zero
    (type,sign,value) = FPUnpack(op, fpcr);

    // If NaN, set cumulative flag or take exception
    if type == FPType_SNaN || type == FPType_QNaN then
        FPProcessException(FPExc_InvalidOp, fpcr);

    // Scale by fractional bits and produce integer rounded towards minus-infinity
    value = value * 2.0^fbits;
    int_result = RoundDown(value);
    error = value - Real(int_result);

    // Determine whether supplied rounding mode requires an increment
    case rounding of
        when FPRounding_TIEEVEN
            round_up = (error > 0.5 || (error == 0.5 && int_result<0> == '1'));
        when FPRounding_POSINF
            round_up = (error != 0.0);
        when FPRounding_NEGINF
            round_up = FALSE;
        when FPRounding_ZERO
            round_up = (error != 0.0 && int_result < 0);
        when FPRounding_TIEAWAY
            round_up = (error > 0.5 || (error == 0.5 && int_result >= 0));

    if round_up then int_result = int_result + 1;

    // Generate saturated result and exceptions
    (result, overflow) = SatQ(int_result, M, unsigned);
    if overflow then
        FPProcessException(FPExc_InvalidOp, fpcr);
    elsif error != 0.0 then
        FPProcessException(FPExc_Inexact, fpcr);

    return result;
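The scale, round, and saturate sequence in FPToFixed() can be illustrated with a short Python sketch. This is an assumption-laden model rather than the architectural definition: only the FPRounding_ZERO case is implemented, the input is an exact rational instead of an unpacked floating-point value, NaN handling is omitted, and the helper names sat_q and to_fixed_rz and the string exception labels are invented for the example.

```python
import math
from fractions import Fraction


def sat_q(i: int, m: int, unsigned: bool):
    """Saturate integer i to an m-bit range; return (m-bit pattern, saturated flag)."""
    lo, hi = (0, 2**m - 1) if unsigned else (-(2 ** (m - 1)), 2 ** (m - 1) - 1)
    clamped = min(max(i, lo), hi)
    return clamped & (2**m - 1), clamped != i


def to_fixed_rz(value: Fraction, fbits: int, m: int, unsigned: bool):
    """Convert to fixed point with round-toward-zero; return (bits, exception or None)."""
    scaled = Fraction(value) * 2**fbits      # value * 2.0^fbits
    int_result = math.floor(scaled)          # RoundDown
    error = scaled - int_result
    if error != 0 and int_result < 0:        # FPRounding_ZERO increment rule
        int_result += 1
    result, overflow = sat_q(int_result, m, unsigned)
    if overflow:
        exc = "InvalidOp"                    # saturation reports Invalid Operation
    elif error != 0:
        exc = "Inexact"
    else:
        exc = None
    return result, exc
```

Note that, as in the pseudocode, an out-of-range conversion reports Invalid Operation rather than Inexact, even when the scaled value also had a fractional part.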


shared/functions/float/fptwo/FPTwo


// FPTwo()
// =======


bits(N) FPTwo(bit sign)
    assert N IN {32,64};
    constant integer E = (if N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    exp  = '1':Zeros(E-1);
    frac = Zeros(F);
    return sign : exp : frac;


shared/functions/float/fptype/FPType


enumeration FPType {FPType_Nonzero, FPType_Zero, FPType_Infinity, 
FPType_QNaN, FPType_SNaN}; 


shared/functions/float/fpunpack/FPUnpack 


// FPUnpack()


// Unpack a floating-point number into its type, sign bit and the real number
// that it represents. The real number result has the correct sign for numbers
// and infinities, is very large in magnitude for infinities, and is 0.0 for
// NaNs. (These values are chosen to simplify the description of comparisons
// and conversions.)
//
// The 'fpcr' argument supplies FPCR control bits. Status information is
// updated directly in the FPSR where appropriate.


(FPType, bit, real) FPUnpack(bits(N) fpval, FPCRType fpcr)
    assert N IN {16,32,64};

    if N == 16 then
        sign   = fpval<15>;
        exp16  = fpval<14:10>;
        frac16 = fpval<9:0>;
        if IsZero(exp16) then
            // Produce zero if value is zero
            if IsZero(frac16) then
                type = FPType_Zero;  value = 0.0;
            else
                type = FPType_Nonzero;  value = 2.0^-14 * (Real(UInt(frac16)) * 2.0^-10);
        elsif IsOnes(exp16) && fpcr.AHP == '0' then  // Infinity or NaN in IEEE format
            if IsZero(frac16) then
                type = FPType_Infinity;  value = 2.0^1000000;
            else
                type = if frac16<9> == '1' then FPType_QNaN else FPType_SNaN;
                value = 0.0;
        else
            type = FPType_Nonzero;
            value = 2.0^(UInt(exp16)-15) * (1.0 + Real(UInt(frac16)) * 2.0^-10);

    elsif N == 32 then
        sign   = fpval<31>;
        exp32  = fpval<30:23>;
        frac32 = fpval<22:0>;
        if IsZero(exp32) then
            // Produce zero if value is zero or flush-to-zero is selected.
            if IsZero(frac32) || fpcr.FZ == '1' then
                type = FPType_Zero;  value = 0.0;
                if !IsZero(frac32) then  // Denormalized input flushed to zero
                    FPProcessException(FPExc_InputDenorm, fpcr);
            else
                type = FPType_Nonzero;  value = 2.0^-126 * (Real(UInt(frac32)) * 2.0^-23);
        elsif IsOnes(exp32) then
            if IsZero(frac32) then
                type = FPType_Infinity;  value = 2.0^1000000;
            else
                type = if frac32<22> == '1' then FPType_QNaN else FPType_SNaN;
                value = 0.0;
        else
            type = FPType_Nonzero;
            value = 2.0^(UInt(exp32)-127) * (1.0 + Real(UInt(frac32)) * 2.0^-23);

    else // N == 64
        sign   = fpval<63>;
        exp64  = fpval<62:52>;
        frac64 = fpval<51:0>;
        if IsZero(exp64) then
            // Produce zero if value is zero or flush-to-zero is selected.
            if IsZero(frac64) || fpcr.FZ == '1' then
                type = FPType_Zero;  value = 0.0;
                if !IsZero(frac64) then  // Denormalized input flushed to zero
                    FPProcessException(FPExc_InputDenorm, fpcr);
            else
                type = FPType_Nonzero;  value = 2.0^-1022 * (Real(UInt(frac64)) * 2.0^-52);
        elsif IsOnes(exp64) then
            if IsZero(frac64) then
                type = FPType_Infinity;  value = 2.0^1000000;
            else
                type = if frac64<51> == '1' then FPType_QNaN else FPType_SNaN;
                value = 0.0;
        else
            type = FPType_Nonzero;
            value = 2.0^(UInt(exp64)-1023) * (1.0 + Real(UInt(frac64)) * 2.0^-52);

    if sign == '1' then value = -value;
    return (type, sign, value);
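As an illustration of the single-precision branch of FPUnpack(), the following Python sketch decodes a 32-bit pattern into a type label, sign bit and exact rational value. It is a simplified model under stated assumptions: the function name fp_unpack32, the string type labels and the fz parameter (standing in for FPCR.FZ) are hypothetical, infinities are represented by None instead of the pseudocode's very large real number, and the Input Denormal exception is not modelled.

```python
from fractions import Fraction


def fp_unpack32(bits: int, fz: bool = False):
    """Decode a 32-bit IEEE 754 pattern into (type, sign, value)."""
    sign = (bits >> 31) & 1
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF

    if exp == 0:
        if frac == 0 or fz:
            # Zero, or a denormal flushed to zero when fz is set
            kind, value = "Zero", Fraction(0)
        else:
            # Denormal: 2^-126 * (frac * 2^-23)
            kind, value = "Nonzero", Fraction(frac, 2**23) * Fraction(1, 2**126)
    elif exp == 0xFF:
        if frac == 0:
            kind, value = "Infinity", None   # pseudocode uses 2.0^1000000 here
        else:
            kind = "QNaN" if (frac >> 22) & 1 else "SNaN"
            value = Fraction(0)
    else:
        # Normal: 2^(exp-127) * (1 + frac * 2^-23)
        kind = "Nonzero"
        value = Fraction(2) ** (exp - 127) * (1 + Fraction(frac, 2**23))

    if sign and value is not None:
        value = -value
    return kind, sign, value
```

For example, the pattern 0x3FC00000 has a biased exponent of 127 and only the top fraction bit set, so it decodes to the value 3/2.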


shared/functions/float/fpzero/FPZero 


// FPZero()
// ========


bits(N) FPZero(bit sign)
    assert N IN {16,32,64};
    constant integer E = (if N == 16 then 5 elsif N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    exp  = Zeros(E);
    frac = Zeros(F);
    return sign : exp : frac;


shared/functions/float/vfpexpandimm/VFPExpandImm


// VFPExpandImm()
// ==============


bits(N) VFPExpandImm(bits(8) imm8)
    assert N IN {32,64};
    constant integer E = (if N == 32 then 8 else 11);
    constant integer F = N - E - 1;
    sign = imm8<7>;
    exp  = NOT(imm8<6>):Replicate(imm8<6>,E-3):imm8<5:4>;
    frac = imm8<3:0>:Zeros(F-4);
    return sign : exp : frac;
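The bit concatenation in VFPExpandImm() can be checked with a small Python sketch that builds the same fields using shifts and masks. The function name vfp_expand_imm and the use of plain integers for bit vectors are assumptions made for illustration only.

```python
def vfp_expand_imm(imm8: int, n: int = 32) -> int:
    """Expand an 8-bit immediate into an n-bit floating-point bit pattern."""
    assert n in (32, 64)
    e = 8 if n == 32 else 11
    f = n - e - 1

    sign = (imm8 >> 7) & 1
    b6 = (imm8 >> 6) & 1

    # exp = NOT(imm8<6>) : Replicate(imm8<6>, E-3) : imm8<5:4>
    replicated = ((1 << (e - 3)) - 1) if b6 else 0
    exp = ((b6 ^ 1) << (e - 1)) | (replicated << 2) | ((imm8 >> 4) & 0x3)

    # frac = imm8<3:0> : Zeros(F-4)
    frac = (imm8 & 0xF) << (f - 4)

    return (sign << (n - 1)) | (exp << f) | frac
```

With imm8 = 0x70 the exponent field becomes the bias value (127 for N == 32), so the result is the encoding of 1.0; clearing bit 6, as in imm8 = 0x00, sets only the top exponent bit and yields 2.0.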


shared/functions/integer/AddWithCarry 
// AddWithCarry()
// Integer addition with carry input, returning result and NZCV flags


(bits(N), bits(4)) AddWithCarry(bits(N) x, bits(N) y, bit carry_in)
    integer unsigned_sum = UInt(x) + UInt(y) + UInt(carry_in);
    integer signed_sum = SInt(x) + SInt(y) + UInt(carry_in);
    bits(N) result = unsigned_sum<N-1:0>; // same value as signed_sum<N-1:0>
    bit n = result<N-1>;
    bit z = if IsZero(result) then '1' else '0';
    bit c = if UInt(result) == unsigned_sum then '0' else '1';
    bit v = if SInt(result) == signed_sum then '0' else '1';
    return (result, n:z:c:v);
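AddWithCarry() derives the carry and overflow flags by comparing the truncated result against the unbounded unsigned and signed sums. A minimal Python sketch of that idea follows; the function name add_with_carry, the integer representation of bit vectors and the packing of NZCV into a 4-bit integer are illustrative assumptions.

```python
def add_with_carry(x: int, y: int, carry_in: int, n: int = 32):
    """Return (result, nzcv) for an n-bit add with carry, as in AddWithCarry()."""
    mask = (1 << n) - 1

    def sint(v: int) -> int:
        # Interpret an n-bit pattern as a two's complement signed integer
        return v - (1 << n) if (v >> (n - 1)) & 1 else v

    unsigned_sum = x + y + carry_in
    signed_sum = sint(x) + sint(y) + carry_in
    result = unsigned_sum & mask

    nf = (result >> (n - 1)) & 1
    zf = int(result == 0)
    cf = int(result != unsigned_sum)         # unsigned overflow
    vf = int(sint(result) != signed_sum)     # signed overflow
    return result, (nf << 3) | (zf << 2) | (cf << 1) | vf
```

A subtraction x - y can be modelled as add_with_carry(x, ~y & mask, 1), which is why a subtraction without borrow leaves the carry flag set.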
shared/functions/memory/AccType


enumeration AccType {AccType_NORMAL, AccType_VEC,        // Normal loads and stores
                     AccType_STREAM, AccType_VECSTREAM,  // Streaming loads and stores
                     AccType_ATOMIC,                     // Atomic loads and stores
                     AccType_ORDERED,                    // Load-Acquire and Store-Release
                     AccType_UNPRIV,                     // Load and store unprivileged
                     AccType_IFETCH,                     // Instruction fetch
                     AccType_PTW,                        // Page table walk
                     // Other operations
                     AccType_DC,                         // Data cache maintenance
                     AccType_IC,                         // Instruction cache maintenance
                     AccType_DCZVA,                      // Memory access type specifically for DCZVA instruction
                     AccType_AT};                        // Address translation


shared/functions/memory/AddrTop 


// AddrTop()
// =========


integer AddrTop(bits(64) address, bits(2) el)
    // Return the MSB number of a virtual address in the current stage 1 translation
    // regime. If EL1 is using AArch64 then addresses from EL0 using AArch32
    // are zero-extended to 64 bits.
    el_regime = S1TranslationRegime(el);
    if UsingAArch32() && !(el == EL0 && !ELUsingAArch32(el_regime)) then
        // AArch32 translation regime.
        return 31;
    else
        // AArch64 translation regime.
        case el_regime of
            when EL1
                tbi = if address<55> == '1' then TCR_EL1.TBI1 else TCR_EL1.TBI0;
            when EL2
                tbi = TCR_EL2.TBI;
            when EL3
                tbi = TCR_EL3.TBI;
        return (if tbi == '1' then 55 else 63);


shared/functions/memory/AddressDescriptor 


type AddressDescriptor is (
    FaultRecord      fault,      // fault.type indicates whether the address is valid
    MemoryAttributes memattrs,
    FullAddress      paddress,
    bits(64)         vaddress
)


shared/functions/memory/Allocation 


constant bits(2) MemHint_No = '00'; // No Read-Allocate, No Write-Allocate 
constant bits(2) MemHint_WA = '01'; // No Read-Allocate, Write-Allocate
constant bits(2) MemHint_RA = '10'; // Read-Allocate, No Write-Allocate 
constant bits(2) MemHint_RWA = '11'; // Read-Allocate, Write-Allocate 


shared/functions/memory/BigEndian 


// BigEndian()
// ===========


boolean BigEndian()
    boolean bigend;
    if UsingAArch32() then
        bigend = (PSTATE.E != '0');
    elsif PSTATE.EL == EL0 then
        bigend = (SCTLR[].E0E != '0');
    else
        bigend = (SCTLR[].EE != '0');
    return bigend;


shared/functions/memory/BigEndianReverse 


// BigEndianReverse()


bits(width) BigEndianReverse(bits(width) value)
    assert width IN {8, 16, 32, 64, 128};
    integer half = width DIV 2;
    if width == 8 then return value;
    return BigEndianReverse(value<half-1:0>) : BigEndianReverse(value<width-1:half>);
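BigEndianReverse() reverses byte order by recursively swapping halves until single bytes remain. The Python sketch below models the same recursion on integers; the function name big_endian_reverse and the explicit width argument are assumptions, since Python integers carry no bit width.

```python
def big_endian_reverse(value: int, width: int) -> int:
    """Reverse the byte order of a width-bit value by recursive half swapping."""
    assert width in (8, 16, 32, 64, 128)
    if width == 8:
        return value & 0xFF
    half = width // 2
    half_mask = (1 << half) - 1
    low = value & half_mask              # value<half-1:0>
    high = (value >> half) & half_mask   # value<width-1:half>
    # Concatenation places the reversed low half in the upper bits
    return (big_endian_reverse(low, half) << half) | big_endian_reverse(high, half)
```

The result agrees with converting the value to bytes in one byte order and reading it back in the other, for example int.from_bytes(v.to_bytes(4, "big"), "little") for a 32-bit value.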


shared/functions/memory/BranchAddr 


// BranchAddr()
// Return the virtual address with tag bits removed for storing to the program counter.


bits(64) BranchAddr(bits(64) vaddress, bits(2) el)
    assert !UsingAArch32();
    msbit = AddrTop(vaddress, el);
    if msbit == 63 then
        return vaddress;
    elsif el IN {EL0, EL1} && vaddress<msbit> == '1' then
        return SignExtend(vaddress<msbit:0>);
    else
        return ZeroExtend(vaddress<msbit:0>);
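The tag removal performed by BranchAddr() can be sketched in Python as follows. Because AddrTop() depends on System register state, this illustrative model takes msbit as an explicit parameter; the function name branch_addr and the integer encoding of the Exception level are assumptions.

```python
def branch_addr(vaddress: int, el: int, msbit: int) -> int:
    """Remove tag bits above msbit from a 64-bit virtual address."""
    mask64 = (1 << 64) - 1
    if msbit == 63:
        return vaddress & mask64             # no tag bits in use
    low_mask = (1 << (msbit + 1)) - 1
    low = vaddress & low_mask                # vaddress<msbit:0>
    if el in (0, 1) and (vaddress >> msbit) & 1:
        return low | (mask64 & ~low_mask)    # SignExtend: upper region at EL0/EL1
    return low                               # ZeroExtend
```

With msbit = 55, an EL0 or EL1 address whose bit 55 is set has bits 63:56 forced to ones, whereas at EL2 or EL3 the same bits are cleared.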


shared/functions/memory/Cacheability 
constant bits(2) MemAttr_NC = '00'; // Non-cacheable 


constant bits(2) MemAttr_WT = '10'; // Write-through 
constant bits(2) MemAttr_WB = '11'; // Write-back 


shared/functions/memory/DataMemoryBarrier 


DataMemoryBarrier(MBReqDomain domain, MBReqTypes types); 


shared/functions/memory/DataSynchronizationBarrier 


DataSynchronizationBarrier(MBReqDomain domain, MBReqTypes types); 


shared/functions/memory/DeviceType 


enumeration DeviceType {DeviceType_GRE, DeviceType_nGRE, DeviceType_nGnRE, DeviceType_nGnRnE}; 


shared/functions/memory/Fault 
enumeration Fault {Fault_None,
                   Fault_AccessFlag,
                   Fault_Alignment,
                   Fault_Background,
                   Fault_Domain,
                   Fault_Permission,
                   Fault_Translation,
                   Fault_AddressSize,
                   Fault_SyncExternal,
                   Fault_SyncExternalOnWalk,
                   Fault_SyncParity,
                   Fault_SyncParityOnWalk,
                   Fault_AsyncParity,
                   Fault_AsyncExternal,
                   Fault_Debug,
                   Fault_TLBConflict,
                   Fault_Lockdown,
                   Fault_Exclusive,
                   Fault_ICacheMaint};





shared/functions/memory/FaultRecord 


type FaultRecord is (Fault    type,         // Fault Status
                     AccType  acctype,      // Type of access that faulted
                     bits(48) ipaddress,    // Intermediate physical address
                     boolean  s2fs1walk,    // Is on a Stage 1 page table walk
                     boolean  write,        // TRUE for a write, FALSE for a read
                     integer  level,        // For translation, access flag and permission faults
                     bit      extflag,      // IMPLEMENTATION DEFINED syndrome for external aborts
                     boolean  secondstage,  // Is a Stage 2 abort
                     bits(4)  domain,       // Domain number, AArch32 only
                     bits(4)  debugmoe)     // Debug method of entry, from AArch32 only


shared/functions/memory/FullAddress 


type FullAddress is (
    bits(48) physicaladdress,
    bit      NS                  // '0' = Secure, '1' = Non-secure
)


shared/functions/memory/Hint_Prefetch 


// Signals the memory system that memory accesses of type HINT to or from the specified address are
// likely in the near future. The memory system may take some action to speed up the memory accesses
// when they do occur, such as pre-loading the specified address into one or more caches as
// indicated by the innermost cache level target (0=L1, 1=L2, etc) and non-temporal hint stream.
// Any or all prefetch hints may be treated as a NOP. A prefetch hint must not cause a synchronous
// abort due to alignment or translation faults and the like. Its only effect on software visible
// state should be on caches and TLBs associated with address, which must be accessible by reads,
// writes or execution as defined in the translation regime of the current Exception level.
// It is guaranteed not to access Device memory.
// A Prefetch_EXEC hint must not result in an access that could not be performed by a speculative
// instruction fetch, therefore if all associated MMUs are disabled, then it cannot access any
// memory location that cannot be accessed by instruction fetches.

Hint_Prefetch(bits(64) address, PrefetchHint hint, integer target, boolean stream);


shared/functions/memory/MBReqDomain 
enumeration MBReqDomain {MBReqDomain_Nonshareable, MBReqDomain_InnerShareable, 
MBReqDomain_OuterShareable, MBReqDomain_FullSystem}; 


shared/functions/memory/MBReqTypes 


enumeration MBReqTypes {MBReqTypes_Reads, MBReqTypes_Writes, MBReqTypes_All};


shared/functions/memory/MemAttrHints 


type MemAttrHints is (
    bits(2) attrs,  // The possible encodings for each attributes field are as below
    bits(2) hints,  // The possible encodings for the hints are below
    boolean transient
)


shared/functions/memory/MemType
enumeration MemType {MemType_Normal, MemType_Device}; 


shared/functions/memory/MemoryAttributes 


type MemoryAttributes is (
    MemType      type,
    DeviceType   device,     // For Device memory types
    MemAttrHints inner,      // Inner hints and attributes
    MemAttrHints outer,      // Outer hints and attributes
    boolean      shareable,
    boolean      outershareable
)


shared/functions/memory/Permissions 
type Permissions is (
    bits(3) ap,   // Access permission bits
    bit     xn,   // Execute-never bit
    bit     pxn   // Privileged execute-never bit
)


shared/functions/memory/PrefetchHint 
enumeration PrefetchHint {Prefetch_READ, Prefetch_WRITE, Prefetch_EXEC}; 


shared/functions/memory/TLBRecord 


type TLBRecord is (
    Permissions       perms,
    bit               nG,          // '0' = Global, '1' = not Global
    bits(4)           domain,      // AArch32 only
    boolean           contiguous,  // Contiguous bit from page table
    integer           level,       // In AArch32 Short-descriptor format, indicates Section/Page
    integer           blocksize,   // Describes size of memory translated in KBytes
    AddressDescriptor addrdesc
)


shared/functions/memory/_Mem 


// These two _Mem[] accessors are the hardware operations which perform
// single-copy atomic, aligned, little-endian memory accesses of size
// bytes from/to the underlying physical memory array of bytes.
//
// The functions address the array using desc.paddress which supplies:
//
// * A 48-bit physical address
// * A single NS bit to select between Secure and Non-secure parts of
//   the array.
//
// The acctype parameter describes the access type: normal, exclusive,
// ordered, streaming, etc.

bits(8*size) _Mem[AddressDescriptor desc, integer size, AccType acctype];

_Mem[AddressDescriptor desc, integer size, AccType acctype] = bits(8*size) value;


shared/functions/registers/BranchTo


// BranchTo()
// ==========


// Set program counter to a new address, which may include a tag in the top eight bits,
// with a branch reason hint for possible use by hardware fetching the next instruction.


BranchTo(bits(N) target, BranchType branch_type)
    Hint_Branch(branch_type);
    if N == 32 then
        assert UsingAArch32();
        _PC = ZeroExtend(target);
    else
        assert N == 64 && !UsingAArch32();
        _PC = BranchAddr(target<63:0>, PSTATE.EL);
    return;


shared/functions/registers/BranchToAddr 


// BranchToAddr()
// ==============


// Set program counter to a new address, which does not include a tag in the top eight bits,
// with a branch reason hint for possible use by hardware fetching the next instruction.


BranchToAddr(bits(N) target, BranchType branch_type)
    Hint_Branch(branch_type);
    if N == 32 then
        assert UsingAArch32();
        _PC = ZeroExtend(target);
    else
        assert N == 64 && !UsingAArch32();
        _PC = target<63:0>;
    return;


shared/functions/registers/BranchType 
enumeration BranchType {BranchType_CALL, BranchType_ERET, BranchType_DBGEXIT,
                        BranchType_RET, BranchType_JMP, BranchType_EXCEPTION,
                        BranchType_UNKNOWN};


shared/functions/registers/Hint_Branch


// Report the hint passed to BranchTo() and BranchToAddr(), for consideration when processing
// the next instruction.
Hint_Branch(BranchType hint);


shared/functions/registers/NextInstrAddr


// Return address of the next instruction.
bits(N) NextInstrAddr();


shared/functions/registers/ResetExternalDebugRegisters


// Reset the External Debug registers in the Core power domain.
ResetExternalDebugRegisters(boolean cold_reset);


shared/functions/registers/ThisInstrAddr


// ThisInstrAddr()


// Return address of the current instruction.
bits(N) ThisInstrAddr()
    assert N == 64 || (N == 32 && UsingAArch32());
    return _PC<N-1:0>;


shared/functions/registers/_PC 


bits(64) _PC; 


shared/functions/registers/_R 


array bits(64) _R[0..30];


shared/functions/registers/_V 


array bits(128) _V[0..31];


shared/functions/sysregisters/SPSR 


// SPSR[] - non-assignment form
// ============================


bits(32) SPSR[]
    bits(32) result;
    if UsingAArch32() then
        case PSTATE.M of
            when M32_FIQ      result = SPSR_fiq;
            when M32_IRQ      result = SPSR_irq;
            when M32_Svc      result = SPSR_svc;
            when M32_Monitor  result = SPSR_mon;
            when M32_Abort    result = SPSR_abt;
            when M32_Hyp      result = SPSR_hyp;
            when M32_Undef    result = SPSR_und;
            otherwise         Unreachable();
    else
        case PSTATE.EL of
            when EL1   result = SPSR_EL1;
            when EL2   result = SPSR_EL2;
            when EL3   result = SPSR_EL3;
            otherwise  Unreachable();
    return result;


// SPSR[] - assignment form
// ========================


SPSR[] = bits(32) value
    if UsingAArch32() then
        case PSTATE.M of
            when M32_FIQ      SPSR_fiq = value;
            when M32_IRQ      SPSR_irq = value;
            when M32_Svc      SPSR_svc = value;
            when M32_Monitor  SPSR_mon = value;
            when M32_Abort    SPSR_abt = value;
            when M32_Hyp      SPSR_hyp = value;
            when M32_Undef    SPSR_und = value;
            otherwise         Unreachable();
    else
        case PSTATE.EL of
            when EL1   SPSR_EL1 = value;
            when EL2   SPSR_EL2 = value;
            when EL3   SPSR_EL3 = value;
            otherwise  Unreachable();
    return;


shared/functions/system/ArchVersion 


enumeration ArchVersion { 
ARMv8p0, 
}; 
shared/functions/system/ClearEventRegister 


ClearEventRegister(); 


shared/functions/system/ClearPendingPhysicalSError 
// Clear a pending physical SError interrupt 
ClearPendingPhysicalSError(); 
shared/functions/system/ConditionHolds 


// ConditionHolds()
// ================


// Return TRUE iff COND currently holds
boolean ConditionHolds(bits(4) cond)
    // Evaluate base condition.
    case cond<3:1> of
        when '000' result = (PSTATE.Z == '1');                          // EQ or NE
        when '001' result = (PSTATE.C == '1');                          // CS or CC
        when '010' result = (PSTATE.N == '1');                          // MI or PL
        when '011' result = (PSTATE.V == '1');                          // VS or VC
        when '100' result = (PSTATE.C == '1' && PSTATE.Z == '0');       // HI or LS
        when '101' result = (PSTATE.N == PSTATE.V);                     // GE or LT
        when '110' result = (PSTATE.N == PSTATE.V && PSTATE.Z == '0');  // GT or LE
        when '111' result = TRUE;                                       // AL

    // Condition flag values in the set '111x' indicate always true
    // Otherwise, invert condition if necessary.
    if cond<0> == '1' && cond != '1111' then
        result = !result;

    return result;
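ConditionHolds() evaluates one of eight base conditions selected by cond<3:1> and then inverts the result when cond<0> is set, except for the encoding '1111'. The following Python sketch mirrors that structure; the function name condition_holds and the passing of the N, Z, C and V flags as integer arguments (rather than reading PSTATE) are assumptions made for illustration.

```python
def condition_holds(cond: int, n: int, z: int, c: int, v: int) -> bool:
    """Evaluate a 4-bit condition code against the given N, Z, C, V flag values."""
    base = (cond >> 1) & 0b111
    if base == 0b000:
        result = z == 1                  # EQ or NE
    elif base == 0b001:
        result = c == 1                  # CS or CC
    elif base == 0b010:
        result = n == 1                  # MI or PL
    elif base == 0b011:
        result = v == 1                  # VS or VC
    elif base == 0b100:
        result = c == 1 and z == 0       # HI or LS
    elif base == 0b101:
        result = n == v                  # GE or LT
    elif base == 0b110:
        result = n == v and z == 0       # GT or LE
    else:
        result = True                    # AL

    # '1111' is also "always"; every other odd encoding inverts the base condition
    if (cond & 1) == 1 and cond != 0b1111:
        result = not result
    return result
```

For example, LE (0b1101) is the inversion of GT, so it holds when Z is set even if N equals V.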


shared/functions/system/CurrentInstrSet


// CurrentInstrSet()
// =================


InstrSet CurrentInstrSet()
    if UsingAArch32() then
        result = if PSTATE.T == '0' then InstrSet_A32 else InstrSet_T32;
        // PSTATE.J is RES0. Implementation of T32EE or Jazelle state not permitted.
    else
        result = InstrSet_A64;
    return result;


shared/functions/system/CurrentPL 


// CurrentPL()
// ===========


PrivilegeLevel CurrentPL()
    return PLOfEL(PSTATE.EL);


shared/functions/system/EL0


constant bits(2) EL3 = '11';
constant bits(2) EL2 = '10';
constant bits(2) EL1 = '01';
constant bits(2) EL0 = '00';


shared/functions/system/ELFromM32 


// ELFromM32()
// ===========


(boolean, bits(2)) ELFromM32(bits(5) mode)
    // Convert an AArch32 mode encoding to an Exception level.
    // Returns (valid,EL):
    //   'valid' is TRUE if 'mode<4:0>' encodes a mode that is both valid for this implementation
    //           and the current value of SCR.NS/SCR_EL3.NS.
    //   'EL'    is the Exception level decoded from 'mode'.
    bits(2) el;
    boolean valid = !BadMode(mode); // Check for modes that are not valid for this implementation
    case mode of
        when M32_Monitor
            el = EL3;
        when M32_Hyp
            el = EL2;
            valid = valid && (!HaveEL(EL3) || SCR_GEN[].NS == '1');
        when M32_FIQ, M32_IRQ, M32_Svc, M32_Abort, M32_Undef, M32_System
            // If EL3 is implemented and using AArch32, then these modes are EL3 modes in Secure
            // state, and EL1 modes in Non-secure state. If EL3 is not implemented or is using
            // AArch64, then these modes are EL1 modes.
            el = (if HaveEL(EL3) && HighestELUsingAArch32() && SCR.NS == '0' then EL3 else EL1);
        when M32_User
            el = EL0;
        otherwise
            valid = FALSE; // Passed an illegal mode value
    if !valid then el = bits(2) UNKNOWN;
    return (valid, el);


shared/functions/system/ELFromSPSR 


// ELFromSPSR()
// ============


// Convert an SPSR value encoding to an Exception level.
// Returns (valid,EL):
//   'valid' is TRUE if 'spsr<4:0>' encodes a valid mode for the current state.
//   'EL'    is the Exception level decoded from 'spsr'.


(boolean,bits(2)) ELFromSPSR(bits(32) spsr)
    if spsr<4> == '0' then                      // AArch64 state
        el = spsr<3:2>;
        if HighestELUsingAArch32() then         // No AArch64 support
            valid = FALSE;
        elsif !HaveEL(el) then                  // Exception level not implemented
            valid = FALSE;
        elsif spsr<1> == '1' then               // M[1] must be 0
            valid = FALSE;
        elsif el == EL0 && spsr<0> == '1' then  // for EL0, M[0] must be 0
            valid = FALSE;
        elsif el == EL2 && HaveEL(EL3) && SCR_EL3.NS == '0' then
            valid = FALSE;                      // EL2 only valid in Non-secure state
        else
            valid = TRUE;
    elsif !HaveAnyAArch32() then                // AArch32 not supported
        valid = FALSE;
    else                                        // AArch32 state
        (valid, el) = ELFromM32(spsr<4:0>);
    if !valid then el = bits(2) UNKNOWN;
    return (valid,el);


shared/functions/system/ELStateUsingAArch32 


// ELStateUsingAArch32()
// =====================


boolean ELStateUsingAArch32(bits(2) el, boolean secure)
    // See ELStateUsingAArch32K() for description. Must only be called in circumstances where
    // result is valid (typically, that means 'el IN {EL1,EL2,EL3}').
    (known, aarch32) = ELStateUsingAArch32K(el, secure);
    assert known;
    return aarch32;


shared/functions/system/ELStateUsingAArch32K 


// ELStateUsingAArch32K()
// ======================


(boolean, boolean) ELStateUsingAArch32K(bits(2) el, boolean secure)
    // Returns (known, aarch32):
    //   'known'   is FALSE for EL0 if the current Exception level is not EL0 and EL1 is
    //             using AArch64, since it cannot determine the state of EL0; TRUE otherwise.
    //   'aarch32' is TRUE if the specified Exception level is using AArch32; FALSE otherwise.
    boolean aarch32;
    known = TRUE;
    if !HaveAArch32EL(el) then
        aarch32 = FALSE;                      // Exception level is using AArch64
    elsif HighestELUsingAArch32() then
        aarch32 = TRUE;                       // All levels are using AArch32
    else
        aarch32_below_el3 = HaveEL(EL3) && SCR_EL3.RW == '0';
        aarch32_at_el1 = aarch32_below_el3 || (HaveEL(EL2) && !secure && HCR_EL2.RW == '0');
        if el == EL0 && !aarch32_at_el1 then  // Only know if EL0 using AArch32 from PSTATE
            if PSTATE.EL == EL0 then
                aarch32 = PSTATE.nRW == '1';  // EL0 controlled by PSTATE
            else
                known = FALSE;                // EL0 state is UNKNOWN
        else
            aarch32 = (aarch32_below_el3 && el != EL3) || (aarch32_at_el1 && el IN {EL1,EL0});
    if !known then aarch32 = boolean UNKNOWN;
    return (known, aarch32);


shared/functions/system/ELUsingAArch32 


// ELUsingAArch32()
// ================


boolean ELUsingAArch32(bits(2) el)
    return ELStateUsingAArch32(el, IsSecureBelowEL3());


shared/functions/system/ELUsingAArch32K


// ELUsingAArch32K()
// =================


(boolean, boolean) ELUsingAArch32K(bits(2) el)
    return ELStateUsingAArch32K(el, IsSecureBelowEL3());


shared/functions/system/EndOfInstruction


// Terminate processing of the current instruction. 
EndOfInstruction(); 


shared/functions/system/EventRegisterSet 


// Set the local event register in this PE. 
EventRegisterSet(); 


shared/functions/system/EventRegistered 


boolean EventRegistered(); 


shared/functions/system/GetPSRFromPSTATE 


// GetPSRFromPSTATE()
// ==================


// Return a PSR value which represents the current PSTATE


bits(32) GetPSRFromPSTATE()
    bits(32) spsr = Zeros();
    spsr<31:28> = PSTATE.<N,Z,C,V>;
    spsr<21>    = PSTATE.SS;
    spsr<20>    = PSTATE.IL;
    if PSTATE.nRW == '1' then                // AArch32 state
        spsr<27>    = PSTATE.Q;
        spsr<26:25> = PSTATE.IT<1:0>;
        spsr<19:16> = PSTATE.GE;
        spsr<15:10> = PSTATE.IT<7:2>;
        spsr<9>     = PSTATE.E;
        spsr<8:6>   = PSTATE.<A,I,F>;        // No PSTATE.D in AArch32 state
        spsr<5>     = PSTATE.T;
        assert PSTATE.M<4> == PSTATE.nRW;    // bit [4] is the discriminator
        spsr<4:0>   = PSTATE.M;
    else                                     // AArch64 state
        spsr<9:6>   = PSTATE.<D,A,I,F>;
        spsr<4>     = PSTATE.nRW;
        spsr<3:2>   = PSTATE.EL;
        spsr<0>     = PSTATE.SP;
    return spsr;


shared/functions/system/HasArchVersion 


// HasArchVersion()
// ================


// Return TRUE if the implemented architecture includes the extensions defined in the specified
// architecture version.


boolean HasArchVersion(ArchVersion version)
    return version == ARMv8p0 || boolean IMPLEMENTATION_DEFINED;


shared/functions/system/HaveAArch32EL


// HaveAArch32EL()
// ===============


boolean HaveAArch32EL(bits(2) el)
    // Return TRUE if Exception level 'el' supports AArch32 in this implementation
    if !HaveEL(el) then
        return FALSE;  // The Exception level is not implemented
    elsif !HaveAnyAArch32() then
        return FALSE;  // No Exception level can use AArch32
    elsif HighestELUsingAArch32() then
        return TRUE;   // All Exception levels are using AArch32
    elsif el == HighestEL() then
        return FALSE;  // The highest Exception level is using AArch64
    elsif el == EL0 then
        return TRUE;   // EL0 must support using AArch32 if any AArch32
    return boolean IMPLEMENTATION_DEFINED;


shared/functions/system/HaveAnyAArch32


// HaveAnyAArch32()
// Return TRUE if AArch32 state is supported at any Exception level
boolean HaveAnyAArch32()
    return boolean IMPLEMENTATION_DEFINED;


shared/functions/system/HaveAnyAArch64


// HaveAnyAArch64()
// Return TRUE if AArch64 state is supported at any Exception level
boolean HaveAnyAArch64()
    return !HighestELUsingAArch32();


shared/functions/system/HaveEL


// HaveEL()
// Return TRUE if Exception level 'el' is supported
boolean HaveEL(bits(2) el)
    if el IN {EL1,EL0} then
        return TRUE;  // EL1 and EL0 must exist
    return boolean IMPLEMENTATION_DEFINED;


shared/functions/system/HighestEL


// HighestEL()
// Returns the highest implemented Exception level.
bits(2) HighestEL()
    if HaveEL(EL3) then
        return EL3;
    elsif HaveEL(EL2) then
        return EL2;
    else
        return EL1;


shared/functions/system/HighestELUsingAArch32


// HighestELUsingAArch32()
// Return TRUE if configured to boot into AArch32 operation
boolean HighestELUsingAArch32()
    if !HaveAnyAArch32() then return FALSE;
    return boolean IMPLEMENTATION_DEFINED;  // e.g. CFG32SIGNAL == HIGH


shared/functions/system/Hint_Yield


Hint_Yield(); 


shared/functions/system/IllegalExceptionReturn 


// IllegalExceptionReturn()
// ========================


boolean IllegalExceptionReturn(bits(32) spsr)

    // Check for return:
    //   * To an unimplemented Exception level.
    //   * To EL2 in Secure state.
    //   * To EL0 using AArch64 state, with SPSR.M[0]==1.
    //   * To AArch64 state with SPSR.M[1]==1.
    //   * To AArch32 state with an illegal value of SPSR.M.
    (valid, target) = ELFromSPSR(spsr);
    if !valid then return TRUE;

    // Check for return to higher Exception level
    if UInt(target) > UInt(PSTATE.EL) then return TRUE;

    spsr_mode_is_aarch32 = (spsr<4> == '1');

    // Check for return:
    //   * To EL1, EL2 or EL3 with register width specified in the SPSR different from the
    //     Execution state used in the Exception level being returned to, as determined by
    //     the SCR_EL3.RW or HCR_EL2.RW bits, or as configured from reset.
    //   * To EL0 using AArch64 state when EL1 is using AArch32 state as determined by the
    //     SCR_EL3.RW or HCR_EL2.RW bits or as configured from reset.
    //   * To AArch64 state from AArch32 state (should be caught by above)
    (known, target_el_is_aarch32) = ELUsingAArch32K(target);
    assert known || (target == EL0 && !ELUsingAArch32(EL1));
    if known && spsr_mode_is_aarch32 != target_el_is_aarch32 then return TRUE;

    // Check for illegal return from AArch32 to AArch64
    if UsingAArch32() && !spsr_mode_is_aarch32 then return TRUE;

    // Check for return to EL1 in Non-secure state when HCR.TGE is set
    if HaveEL(EL2) && target == EL1 && !IsSecureBelowEL3() && HCR_EL2.TGE == '1' then return TRUE;
    return FALSE;


shared/functions/system/InstrSet 


enumeration InstrSet {InstrSet_A64, InstrSet_A32, InstrSet_T32}; 


shared/functions/system/InstructionSynchronizationBarrier 


InstructionSynchronizationBarrier(); 


shared/functions/system/InterruptPending 


boolean InterruptPending(); 


shared/functions/system/IsSecure 


// IsSecure()
// ==========

boolean IsSecure()
    // Return TRUE if current Exception level is in Secure state.
    if HaveEL(EL3) && !UsingAArch32() && PSTATE.EL == EL3 then
        return TRUE;
    elsif HaveEL(EL3) && UsingAArch32() && PSTATE.M == M32_Monitor then
        return TRUE;
    return IsSecureBelowEL3();


shared/functions/system/IsSecureBelowEL3 


// IsSecureBelowEL3()
// ==================
// Return TRUE if an Exception level below EL3 is in Secure state
// or would be following an exception return to that level.
//
// Differs from IsSecure in that it ignores the current EL or Mode
// in considering security state.
// That is, if at AArch64 EL3 or in AArch32 Monitor mode, whether an
// exception return would pass to Secure or Non-secure state.

boolean IsSecureBelowEL3()
    if HaveEL(EL3) then
        return SCR_GEN[].NS == '0';
    elsif HaveEL(EL2) then
        return FALSE;
    else
        // TRUE if processor is Secure or FALSE if Non-secure;
        return boolean IMPLEMENTATION_DEFINED;


shared/functions/system/Mode_Bits 


constant bits(5) M32_User    = '10000';
constant bits(5) M32_FIQ     = '10001';
constant bits(5) M32_IRQ     = '10010';
constant bits(5) M32_Svc     = '10011';
constant bits(5) M32_Monitor = '10110';
constant bits(5) M32_Abort   = '10111';
constant bits(5) M32_Hyp     = '11010';
constant bits(5) M32_Undef   = '11011';
constant bits(5) M32_System  = '11111';


shared/functions/system/PLOfEL 


// PLOfEL()
// ========

PrivilegeLevel PLOfEL(bits(2) el)
    case el of
        when EL3  return if HighestELUsingAArch32() then PL1 else PL3;
        when EL2  return PL2;
        when EL1  return PL1;
        when EL0  return PL0;
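
The mapping performed by PLOfEL can be sketched in Python as follows. This is an illustrative sketch only: the integer encoding of the Exception levels, the string names for the privilege levels, and the highest_el_using_aarch32 parameter (standing in for HighestELUsingAArch32()) are assumptions and are not part of the architecture pseudocode.

```python
# Assumed integer encoding of the Exception levels, for illustration only.
EL0, EL1, EL2, EL3 = 0, 1, 2, 3


def pl_of_el(el: int, highest_el_using_aarch32: bool) -> str:
    """Return the privilege level name that corresponds to an Exception level."""
    if el == EL3:
        # EL3 maps to PL1 when the highest Exception level uses AArch32,
        # and to PL3 otherwise.
        return "PL1" if highest_el_using_aarch32 else "PL3"
    return {EL2: "PL2", EL1: "PL1", EL0: "PL0"}[el]
```

For example, pl_of_el(EL3, True) gives "PL1", whereas pl_of_el(EL3, False) gives "PL3".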


shared/functions/system/PSTATE 


ProcState PSTATE; 


shared/functions/system/PrivilegeLevel 


enumeration PrivilegeLevel {PL3, PL2, PL1, PL0};


shared/functions/system/ProcState 


type ProcState is (
    bits (1) N,        // Negative condition flag
    bits (1) Z,        // Zero condition flag
    bits (1) C,        // Carry condition flag
    bits (1) V,        // oVerflow condition flag
    bits (1) D,        // Debug mask bit                     [AArch64 only]
    bits (1) A,        // SError interrupt mask bit
    bits (1) I,        // IRQ mask bit
    bits (1) F,        // FIQ mask bit
    bits (1) SS,       // Software step bit
    bits (1) IL,       // Illegal Execution state bit
    bits (2) EL,       // Exception Level
    bits (1) nRW,      // not Register Width: 0=64, 1=32
    bits (1) SP,       // Stack pointer select: 0=SP0, 1=SPx [AArch64 only]
    bits (1) Q,        // Cumulative saturation flag         [AArch32 only]
    bits (4) GE,       // Greater than or Equal flags        [AArch32 only]
    bits (8) IT,       // If-then bits, RES0 in CPSR         [AArch32 only]
    bits (1) J,        // J bit, RES0                        [AArch32 only, RES0 in SPSR and CPSR]
    bits (1) T,        // T32 bit, RES0 in CPSR              [AArch32 only]
    bits (1) E,        // Endianness bit                     [AArch32 only]
    bits (5) M         // Mode field                         [AArch32 only]
)





shared/functions/system/RestoredITBits 


// RestoredITBits()
// ================
// Get the value of PSTATE.IT to be restored on this exception return.

bits(8) RestoredITBits(bits(32) spsr)
    it = spsr<15:10,26:25>;

    // When PSTATE.IL is set, it is CONSTRAINED UNPREDICTABLE whether the IT bits are each set
    // to zero or copied from the SPSR.
    if PSTATE.IL == '1' then
        if ConstrainUnpredictableBool() then return '00000000';
        else return it;

    // The IT bits are forced to zero when they are set to a reserved value.
    if !IsZero(it<7:4>) && IsZero(it<3:0>) then
        return '00000000';

    // The IT bits are forced to zero when returning to A32 state, or when returning to an EL
    // with the ITD bit set to 1, and the IT bits are describing a multi-instruction block.
    itd = if PSTATE.EL == EL2 then HSCTLR.ITD else SCTLR.ITD;
    if (spsr<5> == '0' && !IsZero(it)) || (itd == '1' && !IsZero(it<2:0>)) then
        return '00000000';
    else
        return it;
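
The bit manipulation in RestoredITBits can be sketched in Python as follows. This is a hedged illustration, not the architecture definition: the parameters il and itd stand in for PSTATE.IL and the applicable ITD control bit, and choose_zero is a hypothetical stand-in for the result of ConstrainUnpredictableBool().

```python
def restored_it_bits(spsr: int, il: bool, itd: bool, choose_zero: bool = True) -> int:
    """Return the 8-bit IT value to restore from a 32-bit SPSR value."""
    # it<7:2> comes from spsr<15:10>, it<1:0> comes from spsr<26:25>.
    it = (((spsr >> 10) & 0x3F) << 2) | ((spsr >> 25) & 0x3)

    # With IL set, either zero or the SPSR copy is permitted (assumed choice).
    if il:
        return 0 if choose_zero else it

    # Reserved value: non-zero base condition with an all-zero mask.
    if (it >> 4) != 0 and (it & 0xF) == 0:
        return 0

    t_bit = (spsr >> 5) & 1
    # Forced to zero when returning to A32 state, or when ITD is set and the
    # IT bits describe a multi-instruction block.
    if (t_bit == 0 and it != 0) or (itd and (it & 0x7) != 0):
        return 0
    return it
```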


shared/functions/system/SCRType 


type SCRType; 


shared/functions/system/SCR_GEN 


// SCR_GEN[]
// =========

SCRType SCR_GEN[]
    // AArch32 secure & AArch64 EL3 registers are not architecturally mapped
    assert HaveEL(EL3);
    bits(32) r;
    if HighestELUsingAArch32() then
        r = SCR;
    else
        r = SCR_EL3;
    return r;
shared/functions/system/SErrorPending
// Return TRUE if a physical SError interrupt is pending; that is, if ISR_EL1.A == 
boolean SErrorPending(); 

shared/functions/system/SendEvent 


// Signal an event to all PEs. 
SendEvent(); 


shared/functions/system/SetPSTATEFromPSR 
// SetPSTATEFromPSR()
// ==================
// Set PSTATE based on a PSR value

SetPSTATEFromPSR(bits(32) spsr)
    SynchronizeContext();
    PSTATE.SS = DebugExceptionReturnSS(spsr);
    if IllegalExceptionReturn(spsr) then
        PSTATE.IL = '1';
    else
        // State that is reinstated only on a legal exception return
        PSTATE.IL = spsr<20>;
        if spsr<4> == '1' then                    // AArch32 state
            AArch32.WriteMode(spsr<4:0>);         // Sets PSTATE.EL correctly
        else                                      // AArch64 state
            PSTATE.nRW = '0';
            PSTATE.EL  = spsr<3:2>;
            PSTATE.SP  = spsr<0>;

    // If PSTATE.IL is set and returning to AArch32 state, it is CONSTRAINED UNPREDICTABLE whether
    // the T bit is set to zero or copied from SPSR.
    if PSTATE.IL == '1' && PSTATE.nRW == '1' then
        if ConstrainUnpredictableBool() then spsr<5> = '0';

    // State that is reinstated regardless of illegal exception return
    PSTATE.<N,Z,C,V> = spsr<31:28>;
    if PSTATE.nRW == '1' then                     // AArch32 state
        PSTATE.Q         = spsr<27>;
        PSTATE.IT        = RestoredITBits(spsr);
        PSTATE.GE        = spsr<19:16>;
        PSTATE.E         = spsr<9>;
        PSTATE.<A,I,F>   = spsr<8:6>;             // No PSTATE.D in AArch32 state
        PSTATE.T         = spsr<5>;               // PSTATE.J is RES0
    else                                          // AArch64 state
        PSTATE.<D,A,I,F> = spsr<9:6>;             // No PSTATE.<Q,IT,GE,E,T> in AArch64 state
    return;
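
The field extraction that SetPSTATEFromPSR performs for a legal return to AArch64 state can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it covers only the AArch64 path, it omits the illegal-return check, the software step handling, and context synchronization, and the function name and dictionary layout are hypothetical.

```python
def unpack_aarch64_spsr(spsr: int) -> dict:
    """Unpack the PSTATE fields held in an AArch64-format SPSR value."""
    # spsr<4> == 1 would select the AArch32 path, which this sketch omits.
    assert (spsr >> 4) & 1 == 0, "sketch handles AArch64 SPSR values only"

    def bit(n: int) -> int:
        return (spsr >> n) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "IL": bit(20),
        "D": bit(9), "A": bit(8), "I": bit(7), "F": bit(6),
        "nRW": 0,                  # returning to AArch64 state
        "EL": (spsr >> 2) & 0x3,   # spsr<3:2>
        "SP": bit(0),              # spsr<0>
    }
```

For example, the value 0x600003C5 unpacks to EL = 1 and SP = 1 with Z, C, D, A, I, and F all set.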


shared/functions/system/SynchronizeContext 


SynchronizeContext(); 


shared/functions/system/TakeUnmaskedPhysicalSErrorInterrupts

// Take any pending unmasked physical SError interrupt
TakeUnmaskedPhysicalSErrorInterrupts(boolean iesb_req);

shared/functions/system/TakeUnmaskedSErrorInterrupts

// Take any pending unmasked physical SError interrupt or unmasked virtual SError
// interrupt.
TakeUnmaskedSErrorInterrupts();

shared/functions/system/ThisInstr


bits(32) ThisInstr(); 


shared/functions/system/ThisInstrLength 


integer ThisInstrLength(); 


shared/functions/system/Unreachable 
Unreachable()
    assert FALSE;

shared/functions/system/UsingAArch32

// UsingAArch32()
// ==============
// Return TRUE if the current Exception level is using AArch32, FALSE if using AArch64.

boolean UsingAArch32()
    boolean aarch32 = (PSTATE.nRW == '1');
    if !HaveAnyAArch32() then assert !aarch32;
    if HighestELUsingAArch32() then assert aarch32;
    return aarch32;


shared/functions/system/WaitForEvent 


WaitForEvent(); 


shared/functions/system/WaitForInterrupt


WaitForInterrupt(); 


shared/functions/unpredictable/ConstrainUnpredictable 


// Return the appropriate Constraint result to control the caller's behavior. The return value 
// is IMPLEMENTATION DEFINED within a permitted list for each UNPREDICTABLE case. 

// (The permitted list is determined by an assert or case statement at the call site.) 
Constraint ConstrainUnpredictable(); 


shared/functions/unpredictable/ConstrainUnpredictableBits 
// This is a variant of ConstrainUnpredictable for when the result can be Constraint_UNKNOWN. 
// If the result is Constraint_UNKNOWN then the function also returns UNKNOWN value, but that 
// value is always an allocated value; that is, one for which the behavior is not itself 
// CONSTRAINED. 
(Constraint,bits(width)) ConstrainUnpredictableBits(); 


shared/functions/unpredictable/ConstrainUnpredictableBool 


// ConstrainUnpredictableBool()
// ============================

// This is a simple wrapper function for cases where the constrained result is either TRUE or FALSE.
boolean ConstrainUnpredictableBool()
    c = ConstrainUnpredictable();
    assert c IN {Constraint_TRUE, Constraint_FALSE};
    return (c == Constraint_TRUE);
shared/functions/unpredictable/ConstrainUnpredictableInteger


// This is a variant of ConstrainUnpredictable for when the result can be Constraint_UNKNOWN. If 
// the result is Constraint_UNKNOWN then the function also returns an UNKNOWN value in the range 
// low to high, inclusive. 

(Constraint, integer) ConstrainUnpredictableInteger(integer low, integer high); 


shared/functions/unpredictable/Constraint 


enumeration Constraint {// General: 
Constraint_NONE, Constraint_UNKNOWN, 
Constraint_UNDEF, Constraint_NOP, 
Constraint_TRUE, Constraint_FALSE, 
Constraint_DISABLED, 
Constraint_UNCOND, Constraint_COND, Constraint_ADDITIONAL_DECODE, 
// Load-store: 
Constraint_WBSUPPRESS, Constraint_FAULT, 
// IPA too large 
Constraint_FORCE, Constraint_FORCENOSLCHECK}; 


shared/functions/vector/AdvSIMDExpandImm

// AdvSIMDExpandImm()
// ==================

bits(64) AdvSIMDExpandImm(bit op, bits(4) cmode, bits(8) imm8)
    case cmode<3:1> of
        when '000'
            imm64 = Replicate(Zeros(24):imm8, 2);
        when '001'
            imm64 = Replicate(Zeros(16):imm8:Zeros(8), 2);
        when '010'
            imm64 = Replicate(Zeros(8):imm8:Zeros(16), 2);
        when '011'
            imm64 = Replicate(imm8:Zeros(24), 2);
        when '100'
            imm64 = Replicate(Zeros(8):imm8, 4);
        when '101'
            imm64 = Replicate(imm8:Zeros(8), 4);
        when '110'
            if cmode<0> == '0' then
                imm64 = Replicate(Zeros(16):imm8:Ones(8), 2);
            else
                imm64 = Replicate(Zeros(8):imm8:Ones(16), 2);
        when '111'
            if cmode<0> == '0' && op == '0' then
                imm64 = Replicate(imm8, 8);
            if cmode<0> == '0' && op == '1' then
                imm8a = Replicate(imm8<7>, 8); imm8b = Replicate(imm8<6>, 8);
                imm8c = Replicate(imm8<5>, 8); imm8d = Replicate(imm8<4>, 8);
                imm8e = Replicate(imm8<3>, 8); imm8f = Replicate(imm8<2>, 8);
                imm8g = Replicate(imm8<1>, 8); imm8h = Replicate(imm8<0>, 8);
                imm64 = imm8a:imm8b:imm8c:imm8d:imm8e:imm8f:imm8g:imm8h;
            if cmode<0> == '1' && op == '0' then
                imm32 = imm8<7>:NOT(imm8<6>):Replicate(imm8<6>,5):imm8<5:0>:Zeros(19);
                imm64 = Replicate(imm32, 2);
            if cmode<0> == '1' && op == '1' then
                if UsingAArch32() then ReservedEncoding();
                imm64 = imm8<7>:NOT(imm8<6>):Replicate(imm8<6>,8):imm8<5:0>:Zeros(48);

    return imm64;
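
The expansion performed by AdvSIMDExpandImm can be sketched in Python as follows, which may help when checking individual encodings by hand. This is an illustrative sketch under stated assumptions: the arguments are plain integers, the helper replicate() is hypothetical, and the AArch32 reserved-encoding check for cmode == '1111' with op == '1' is not modelled.

```python
def replicate(value: int, width: int, count: int) -> int:
    """Concatenate `count` copies of a `width`-bit value."""
    out = 0
    for _ in range(count):
        out = (out << width) | value
    return out


def adv_simd_expand_imm(op: int, cmode: int, imm8: int) -> int:
    """Expand an 8-bit immediate to 64 bits according to op and cmode."""
    hi, lo = (cmode >> 1) & 0x7, cmode & 1
    if hi <= 3:                      # 32-bit elements, imm8 shifted by 0/8/16/24
        return replicate(imm8 << (8 * hi), 32, 2)
    if hi in (4, 5):                 # 16-bit elements, imm8 shifted by 0/8
        return replicate(imm8 << (8 * (hi - 4)), 16, 4)
    if hi == 6:                      # 32-bit elements with trailing ones
        if lo == 0:
            return replicate((imm8 << 8) | 0xFF, 32, 2)
        return replicate((imm8 << 16) | 0xFFFF, 32, 2)
    if lo == 0 and op == 0:          # byte replicated eight times
        return replicate(imm8, 8, 8)
    if lo == 0 and op == 1:          # each imm8 bit expands to a full byte
        out = 0
        for n in range(7, -1, -1):
            out = (out << 8) | (0xFF if (imm8 >> n) & 1 else 0x00)
        return out
    b7, b6, low6 = (imm8 >> 7) & 1, (imm8 >> 6) & 1, imm8 & 0x3F
    if op == 0:                      # single-precision floating-point pattern
        imm32 = (b7 << 31) | ((b6 ^ 1) << 30) | (replicate(b6, 1, 5) << 25) | (low6 << 19)
        return replicate(imm32, 32, 2)
    # Double-precision floating-point pattern (reserved in AArch32, not checked here).
    return (b7 << 63) | ((b6 ^ 1) << 62) | (replicate(b6, 1, 8) << 54) | (low6 << 48)
```

For example, imm8 = 0x70 with cmode = '1111' expands to the single-precision value 1.0 in both words when op is 0, and to the double-precision value 1.0 when op is 1.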


shared/functions/vector/PolynomialMult 


// PolynomialMult()
// ================

bits(M+N) PolynomialMult(bits(M) op1, bits(N) op2)
    result = Zeros(M+N);
    extended_op2 = ZeroExtend(op2, M+N);
    for i=0 to M-1
        if op1<i> == '1' then
            result = result EOR LSL(extended_op2, i);
    return result;
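
The carry-less multiplication in PolynomialMult can be sketched in Python as follows. This is an illustrative sketch: the operands are plain non-negative integers, and the explicit width parameter m (the number of bits in op1) is an assumption made because Python integers carry no width.

```python
def polynomial_mult(op1: int, op2: int, m: int) -> int:
    """Multiply two polynomials over GF(2), given as bit patterns."""
    result = 0
    for i in range(m):
        # Each set bit of op1 contributes a shifted copy of op2,
        # combined with exclusive OR instead of addition.
        if (op1 >> i) & 1:
            result ^= op2 << i
    return result
```

For example, polynomial_mult(0b11, 0b11, 2) gives 0b101, because (x + 1)(x + 1) = x^2 + 1 when coefficients are taken modulo 2.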


shared/functions/vector/SatQ 


// SatQ()
// ======

(bits(N), boolean) SatQ(integer i, integer N, boolean unsigned)
    (result, sat) = if unsigned then UnsignedSatQ(i, N) else SignedSatQ(i, N);
    return (result, sat);


shared/functions/vector/SignedSatQ 


// SignedSatQ()
// ============

(bits(N), boolean) SignedSatQ(integer i, integer N)
    if i > 2^(N-1) - 1 then
        result = 2^(N-1) - 1;  saturated = TRUE;
    elsif i < -(2^(N-1)) then
        result = -(2^(N-1));  saturated = TRUE;
    else
        result = i;  saturated = FALSE;
    return (result<N-1:0>, saturated);


shared/functions/vector/UnsignedRSqrtEstimate 


// UnsignedRSqrtEstimate()
// =======================

bits(N) UnsignedRSqrtEstimate(bits(N) operand)
    assert N == 32;
    if operand<N-1:N-2> == '00' then  // Operands <= 0x3FFFFFFF produce 0xFFFFFFFF
        result = Ones(N);
    else
        // Generate double-precision value = operand * 2^(-32). This has zero sign bit, with:
        //     exponent = 1022 or 1021 = double-precision representation of 2^(-1) or 2^(-2)
        //     fraction taken from operand, excluding its most significant one or two bits.
        if operand<N-1> == '1' then
            dp_operand = '0 01111111110' : operand<N-2:0> : Zeros(64-(N-1)-12);
        else  // operand<N-1:N-2> == '01'
            dp_operand = '0 01111111101' : operand<N-3:0> : Zeros(64-(N-2)-12);

        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_sqrt_estimate(dp_operand);

        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256.
        // Multiply by 2^31 and convert to an unsigned integer - this just involves
        // concatenating the implicit units bit with the top N-1 fraction bits.
        bits(N-1) frac = estimate<51:51-(N-1)+1>;
        result = '1' : frac;

    return result;


shared/functions/vector/UnsignedRecipEstimate 


// UnsignedRecipEstimate()
// =======================

bits(N) UnsignedRecipEstimate(bits(N) operand)
    assert N == 32;
    if operand<N-1> == '0' then  // Operands <= 0x7FFFFFFF produce 0xFFFFFFFF
        result = Ones(N);
    else
        // Generate double-precision value = operand * 2^(-32). This has zero sign bit, with:
        //     exponent = 1022 = double-precision representation of 2^(-1)
        //     fraction taken from operand, excluding its most significant bit.
        dp_operand = '0 01111111110' : operand<N-2:0> : Zeros(64-(N-1)-12);

        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_estimate(dp_operand);

        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256.
        // Multiply by 2^31 and convert to an unsigned integer - this just involves
        // concatenating the implicit units bit with the top N-1 fraction bits.
        bits(N-1) frac = estimate<51:51-(N-1)+1>;
        result = '1' : frac;

    return result;


shared/functions/vector/UnsignedSatQ 


// UnsignedSatQ()
// ==============

(bits(N), boolean) UnsignedSatQ(integer i, integer N)
    if i > 2^N - 1 then
        result = 2^N - 1;  saturated = TRUE;
    elsif i < 0 then
        result = 0;  saturated = TRUE;
    else
        result = i;  saturated = FALSE;
    return (result<N-1:0>, saturated);
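
The signed and unsigned saturation functions can be sketched together in Python as follows. This is an illustrative sketch: the returned value is the N-bit pattern as a non-negative integer (two's complement for negative signed results), which is an assumed way of modelling bits(N).

```python
def signed_sat_q(i: int, n: int) -> tuple[int, bool]:
    """Saturate i to the signed n-bit range and return (bit pattern, saturated)."""
    hi, lo = (1 << (n - 1)) - 1, -(1 << (n - 1))
    result = min(max(i, lo), hi)
    # Mask to n bits so that negative values appear in two's complement form.
    return result & ((1 << n) - 1), result != i


def unsigned_sat_q(i: int, n: int) -> tuple[int, bool]:
    """Saturate i to the unsigned n-bit range and return (bit pattern, saturated)."""
    result = min(max(i, 0), (1 << n) - 1)
    return result, result != i


def sat_q(i: int, n: int, unsigned: bool) -> tuple[int, bool]:
    """Select the unsigned or signed variant, as SatQ does."""
    return unsigned_sat_q(i, n) if unsigned else signed_sat_q(i, n)
```

For example, signed_sat_q(200, 8) gives (0x7F, True), and unsigned_sat_q(-5, 8) gives (0, True).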





J1.3.4 shared/translation 

This section includes the following pseudocode functions: 

• shared/translation/attrs/CombineS1S2AttrHints on page J1-5448.
• shared/translation/attrs/CombineS1S2Desc on page J1-5448.
• shared/translation/attrs/CombineS1S2Device on page J1-5448.
• shared/translation/attrs/LongConvertAttrsHints on page J1-5449.
• shared/translation/attrs/MemAttrDefaults on page J1-5449.
• shared/translation/attrs/S1CacheDisabled on page J1-5450.
• shared/translation/attrs/S2AttrDecode on page J1-5450.
• shared/translation/attrs/S2CacheDisabled on page J1-5450.
• shared/translation/attrs/S2ConvertAttrsHints on page J1-5450.
• shared/translation/attrs/ShortConvertAttrsHints on page J1-5451.
• shared/translation/attrs/WalkAttrDecode on page J1-5451.
• shared/translation/translation/HasS2Translation on page J1-5452.
• shared/translation/translation/PAMax on page J1-5452.
• shared/translation/translation/S1TranslationRegime on page J1-5452.
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. J1-5447 
ID092916 Non-Confidential 




shared/translation/attrs/CombineS1S2AttrHints 


// CombineS1S2AttrHints()
// ======================

MemAttrHints CombineS1S2AttrHints(MemAttrHints s1desc, MemAttrHints s2desc)
    MemAttrHints result;

    if s2desc.attrs == '01' || s1desc.attrs == '01' then
        result.attrs = bits(2) UNKNOWN;   // Reserved
    elsif s2desc.attrs == MemAttr_NC || s1desc.attrs == MemAttr_NC then
        result.attrs = MemAttr_NC;        // Non-cacheable
    elsif s2desc.attrs == MemAttr_WT || s1desc.attrs == MemAttr_WT then
        result.attrs = MemAttr_WT;        // Write-through
    else
        result.attrs = MemAttr_WB;        // Write-back

    result.hints = s1desc.hints;
    result.transient = s1desc.transient;

    return result;
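
The selection rule in CombineS1S2AttrHints, where the less cacheable of the two stages wins, can be sketched in Python as follows. This is an illustrative sketch: the 2-bit encodings shown for the MemAttr constants are assumptions stated for the example, and None is used as a hypothetical stand-in for an UNKNOWN result.

```python
# Assumed 2-bit encodings, for illustration only.
MEMATTR_NC, MEMATTR_RESERVED, MEMATTR_WT, MEMATTR_WB = 0b00, 0b01, 0b10, 0b11


def combine_s1s2_attrs(s1_attrs: int, s2_attrs: int):
    """Combine stage 1 and stage 2 cacheability attributes."""
    stages = (s1_attrs, s2_attrs)
    if MEMATTR_RESERVED in stages:
        return None                # UNKNOWN in the pseudocode
    if MEMATTR_NC in stages:
        return MEMATTR_NC          # Non-cacheable overrides everything else
    if MEMATTR_WT in stages:
        return MEMATTR_WT          # Write-through overrides Write-back
    return MEMATTR_WB              # both stages are Write-back
```

The hints and the transient flag are not combined: the pseudocode takes both from the stage 1 descriptor.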


shared/translation/attrs/CombineS1S2Desc 
// CombineS1S2Desc()
// =================
// Combines the address descriptors from stage 1 and stage 2

AddressDescriptor CombineS1S2Desc(AddressDescriptor s1desc, AddressDescriptor s2desc)
    AddressDescriptor result;
    result.paddress = s2desc.paddress;

    if IsFault(s1desc) || IsFault(s2desc) then
        result = if IsFault(s1desc) then s1desc else s2desc;
    elsif s2desc.memattrs.type == MemType_Device || s1desc.memattrs.type == MemType_Device then
        result.memattrs.type = MemType_Device;
        if s1desc.memattrs.type == MemType_Normal then
            result.memattrs.device = s2desc.memattrs.device;
        elsif s2desc.memattrs.type == MemType_Normal then
            result.memattrs.device = s1desc.memattrs.device;
        else                    // Both Device
            result.memattrs.device = CombineS1S2Device(s1desc.memattrs.device,
                                                       s2desc.memattrs.device);
    else                        // Both Normal
        result.memattrs.type = MemType_Normal;
        result.memattrs.device = DeviceType UNKNOWN;
        result.memattrs.inner = CombineS1S2AttrHints(s1desc.memattrs.inner, s2desc.memattrs.inner);
        result.memattrs.outer = CombineS1S2AttrHints(s1desc.memattrs.outer, s2desc.memattrs.outer);
        result.memattrs.shareable = (s1desc.memattrs.shareable || s2desc.memattrs.shareable);
        result.memattrs.outershareable = (s1desc.memattrs.outershareable ||
                                          s2desc.memattrs.outershareable);

    result.memattrs = MemAttrDefaults(result.memattrs);

    return result;


shared/translation/attrs/CombineS1S2Device 
// CombineS1S2Device()
// ===================
// Combines device types from stage 1 and stage 2

DeviceType CombineS1S2Device(DeviceType s1device, DeviceType s2device)

    if s2device == DeviceType_nGnRnE || s1device == DeviceType_nGnRnE then
        result = DeviceType_nGnRnE;
    elsif s2device == DeviceType_nGnRE || s1device == DeviceType_nGnRE then
        result = DeviceType_nGnRE;
    elsif s2device == DeviceType_nGRE || s1device == DeviceType_nGRE then
        result = DeviceType_nGRE;
    else
        result = DeviceType_GRE;

    return result;
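
The priority chain in CombineS1S2Device picks the more restrictive of the two Device types. A minimal Python sketch follows, assuming an enumeration whose numeric values are chosen here only to express that ordering; the values are not architectural encodings.

```python
from enum import IntEnum


class DeviceType(IntEnum):
    """Device memory types, ordered from most to least restrictive (assumed values)."""
    nGnRnE = 0
    nGnRE = 1
    nGRE = 2
    GRE = 3


def combine_s1s2_device(s1device: DeviceType, s2device: DeviceType) -> DeviceType:
    """Return the more restrictive of the stage 1 and stage 2 Device types."""
    # With the ordering above, the if/elsif chain reduces to a minimum.
    return min(s1device, s2device)
```

For example, combining DeviceType.GRE with DeviceType.nGnRE gives DeviceType.nGnRE, whichever stage supplies which value.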


shared/translation/attrs/LongConvertAttrsHints

// LongConvertAttrsHints()
// =======================
// Convert the long attribute fields for Normal memory as used in the MAIR fields
// to orthogonal attributes and hints

MemAttrHints LongConvertAttrsHints(bits(4) attrfield, AccType acctype)
    assert !IsZero(attrfield);
    MemAttrHints result;
    if S1CacheDisabled(acctype) then          // Force Non-cacheable
        result.attrs = MemAttr_NC;
        result.hints = MemHint_No;
    else
        if attrfield<3:2> == '00' then        // Write-through transient
            result.attrs = MemAttr_WT;
            result.hints = attrfield<1:0>;
            result.transient = TRUE;
        elsif attrfield<3:0> == '0100' then   // Non-cacheable (no allocate)
            result.attrs = MemAttr_NC;
            result.hints = MemHint_No;
            result.transient = FALSE;
        elsif attrfield<3:2> == '01' then     // Write-back transient
            result.attrs = MemAttr_WB;
            result.hints = attrfield<1:0>;
            result.transient = TRUE;
        else                                  // Write-through/Write-back non-transient
            result.attrs = attrfield<3:2>;
            result.hints = attrfield<1:0>;
            result.transient = FALSE;

    return result;
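
The decoding of a 4-bit MAIR attribute field can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the result is returned as a tuple of an attribute name, the 2-bit allocation hints, and the transient flag; the cache_disabled parameter stands in for S1CacheDisabled(acctype); and None marks the transient flag that the pseudocode leaves unassigned on the cache-disabled path.

```python
def long_convert_attrs_hints(attrfield: int, cache_disabled: bool = False):
    """Decode a non-zero 4-bit MAIR field into (attrs, hints, transient)."""
    assert 0 < attrfield <= 0xF
    if cache_disabled:
        return ("NC", 0b00, None)          # forced Non-cacheable, no allocate

    top, low = attrfield >> 2, attrfield & 0x3
    if top == 0b00:
        return ("WT", low, True)           # Write-through transient
    if attrfield == 0b0100:
        return ("NC", 0b00, False)         # Non-cacheable (no allocate)
    if top == 0b01:
        return ("WB", low, True)           # Write-back transient
    # top is '10' (Write-through) or '11' (Write-back), non-transient.
    return ("WT" if top == 0b10 else "WB", low, False)
```

For example, the field value 0b1111 decodes to ("WB", 0b11, False), and 0b0100 decodes to ("NC", 0b00, False).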


shared/translation/attrs/MemAttrDefaults 


// MemAttrDefaults()
// =================
// Supply default values for memory attributes, including overriding the shareability attributes
// for Device and Non-cacheable memory types.

MemoryAttributes MemAttrDefaults(MemoryAttributes memattrs)

    if memattrs.type == MemType_Device then
        memattrs.inner = MemAttrHints UNKNOWN;
        memattrs.outer = MemAttrHints UNKNOWN;
        memattrs.shareable = TRUE;
        memattrs.outershareable = TRUE;
    else
        memattrs.device = DeviceType UNKNOWN;
        if memattrs.inner.attrs == MemAttr_NC && memattrs.outer.attrs == MemAttr_NC then
            memattrs.shareable = TRUE;
            memattrs.outershareable = TRUE;

    return memattrs;


shared/translation/attrs/S1CacheDisabled 


// S1CacheDisabled()
// =================

boolean S1CacheDisabled(AccType acctype)
    if ELUsingAArch32(S1TranslationRegime()) then
        if PSTATE.EL == EL2 then
            enable = if acctype == AccType_IFETCH then HSCTLR.I else HSCTLR.C;
        else
            enable = if acctype == AccType_IFETCH then SCTLR.I else SCTLR.C;
    else
        enable = if acctype == AccType_IFETCH then SCTLR[].I else SCTLR[].C;
    return enable == '0';


shared/translation/attrs/S2AttrDecode 


// S2AttrDecode()
// ==============
// Converts the Stage 2 attribute fields into orthogonal attributes and hints

MemoryAttributes S2AttrDecode(bits(2) SH, bits(4) attr, AccType acctype)
    MemoryAttributes memattrs;

    if attr<3:2> == '00' then                    // Device
        memattrs.type = MemType_Device;
        case attr<1:0> of
            when '00'  memattrs.device = DeviceType_nGnRnE;
            when '01'  memattrs.device = DeviceType_nGnRE;
            when '10'  memattrs.device = DeviceType_nGRE;
            when '11'  memattrs.device = DeviceType_GRE;

    elsif attr<1:0> != '00' then                 // Normal
        memattrs.type = MemType_Normal;
        memattrs.outer = S2ConvertAttrsHints(attr<3:2>, acctype);
        memattrs.inner = S2ConvertAttrsHints(attr<1:0>, acctype);
        memattrs.shareable = SH<1> == '1';
        memattrs.outershareable = SH == '10';

    else
        memattrs = MemoryAttributes UNKNOWN;     // Reserved

    return MemAttrDefaults(memattrs);
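
The top-level decode in S2AttrDecode can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the result is a plain dictionary, None stands in for the UNKNOWN reserved case, and both the cache-disable override applied by S2ConvertAttrsHints and the final MemAttrDefaults step are omitted.

```python
DEVICE_TYPES = ("nGnRnE", "nGnRE", "nGRE", "GRE")      # indexed by attr<1:0>
NORMAL_ATTRS = {0b01: "NC", 0b10: "WT", 0b11: "WB"}    # stage 2 2-bit fields


def s2_attr_decode(sh: int, attr: int):
    """Decode the stage 2 SH and 4-bit attr fields (simplified sketch)."""
    hi, lo = (attr >> 2) & 0x3, attr & 0x3
    if hi == 0b00:
        # Device memory: attr<1:0> selects the Device type.
        return {"type": "Device", "device": DEVICE_TYPES[lo]}
    if lo != 0b00:
        # Normal memory: attr<3:2> is the outer field, attr<1:0> the inner field.
        return {
            "type": "Normal",
            "outer": NORMAL_ATTRS[hi],
            "inner": NORMAL_ATTRS[lo],
            "shareable": bool(sh & 0b10),
            "outershareable": sh == 0b10,
        }
    return None                                        # reserved encoding
```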


shared/translation/attrs/S2CacheDisabled 


// S2CacheDisabled()
// =================

boolean S2CacheDisabled(AccType acctype)
    if ELUsingAArch32(EL2) then
        disable = if acctype == AccType_IFETCH then HCR2.ID else HCR2.CD;
    else
        disable = if acctype == AccType_IFETCH then HCR_EL2.ID else HCR_EL2.CD;

    return disable == '1';


shared/translation/attrs/S2ConvertAttrsHints 


// S2ConvertAttrsHints()
// =====================
// Converts the attribute fields for Normal memory as used in stage 2
// descriptors to orthogonal attributes and hints

MemAttrHints S2ConvertAttrsHints(bits(2) attr, AccType acctype)
    assert !IsZero(attr);
    MemAttrHints result;

    if S2CacheDisabled(acctype) then        // Force Non-cacheable
        result.attrs = MemAttr_NC;
        result.hints = MemHint_No;
    else
        case attr of
            when '01'                       // Non-cacheable (no allocate)
                result.attrs = MemAttr_NC;
                result.hints = MemHint_No;
            when '10'                       // Write-through
                result.attrs = MemAttr_WT;
                result.hints = MemHint_RWA;
            when '11'                       // Write-back
                result.attrs = MemAttr_WB;
                result.hints = MemHint_RWA;

    result.transient = FALSE;

    return result;


shared/translation/attrs/ShortConvertAttrsHints 
// ShortConvertAttrsHints()
// ========================
// Converts the short attribute fields for Normal memory as used in the TTBR and
// TEX fields to orthogonal attributes and hints

MemAttrHints ShortConvertAttrsHints(bits(2) RGN, AccType acctype, boolean secondstage)
    MemAttrHints result;

    if (!secondstage && S1CacheDisabled(acctype)) || (secondstage && S2CacheDisabled(acctype)) then
        // Force Non-cacheable
        result.attrs = MemAttr_NC;
        result.hints = MemHint_No;
    else
        case RGN of
            when '00'                       // Non-cacheable (no allocate)
                result.attrs = MemAttr_NC;
                result.hints = MemHint_No;
            when '01'                       // Write-back, Read and Write allocate
                result.attrs = MemAttr_WB;
                result.hints = MemHint_RWA;
            when '10'                       // Write-through, Read allocate
                result.attrs = MemAttr_WT;
                result.hints = MemHint_RA;
            when '11'                       // Write-back, Read allocate
                result.attrs = MemAttr_WB;
                result.hints = MemHint_RA;

    result.transient = FALSE;

    return result;


shared/translation/attrs/WalkAttrDecode 


// WalkAttrDecode()
// ================

MemoryAttributes WalkAttrDecode(bits(2) SH, bits(2) ORGN, bits(2) IRGN, boolean secondstage)

    MemoryAttributes memattrs;

    AccType acctype = AccType_NORMAL;

    memattrs.type = MemType_Normal;
    memattrs.inner = ShortConvertAttrsHints(IRGN, acctype, secondstage);
    memattrs.outer = ShortConvertAttrsHints(ORGN, acctype, secondstage);
    memattrs.shareable = SH<1> == '1';
    memattrs.outershareable = SH == '10';

    return MemAttrDefaults(memattrs);


shared/translation/translation/HasS2Translation 


// HasS2Translation() 


// Returns TRUE if stage 2 translation is present for the current translation regime 


boolean HasS2Translation() 
    return (HaveEL(EL2) && !IsSecure() && PSTATE.EL IN {EL0,EL1});


shared/translation/translation/PAMax 
// PAMax() 


// Returns the IMPLEMENTATION DEFINED upper limit on the physical address 
// size for this processor, as log2(). 


integer PAMax() 
return integer IMPLEMENTATION_DEFINED "Maximum Physical Address Size"; 


shared/translation/translation/S1TranslationRegime 
// S1TranslationRegime()
// =====================
// Stage 1 translation regime for the given Exception level

bits(2) S1TranslationRegime(bits(2) el)
    if el != EL0 then
        return el;
    elsif HaveEL(EL3) && ELUsingAArch32(EL3) && SCR.NS == '0' then
        return EL3;
    else
        return EL1;

// S1TranslationRegime()
// =====================
// Returns the Exception level controlling the current Stage 1 translation regime. For the most
// part this is unused in code because the system register accessors (SCTLR[], etc.) implicitly
// return the correct value.

bits(2) S1TranslationRegime()
    return S1TranslationRegime(PSTATE.EL);
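
The regime selection in S1TranslationRegime can be sketched in Python as follows. This is an illustrative sketch: Exception levels are modelled as the integers 0 to 3, and the parameters have_el3, el3_using_aarch32, and scr_ns are assumed stand-ins for HaveEL(EL3), ELUsingAArch32(EL3), and SCR.NS.

```python
def s1_translation_regime(el: int, have_el3: bool,
                          el3_using_aarch32: bool, scr_ns: int) -> int:
    """Return the Exception level that controls stage 1 translation for `el`."""
    if el != 0:
        # EL1, EL2 and EL3 each use their own translation regime.
        return el
    if have_el3 and el3_using_aarch32 and scr_ns == 0:
        # Secure EL0 when EL3 is using AArch32 is controlled from EL3.
        return 3
    # In all other cases EL0 shares the EL1 regime.
    return 1
```

For example, s1_translation_regime(0, True, True, 0) gives 3, whereas s1_translation_regime(0, True, False, 0) gives 1.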
Appendix K1
Architectural Constraints on UNPREDICTABLE behaviors


This chapter describes the architectural constraints on UNPREDICTABLE behaviors in the ARMv8 architecture. It 
contains the following sections: 


• AArch32 CONSTRAINED UNPREDICTABLE behaviors on page K1-5456.
• AArch64 CONSTRAINED UNPREDICTABLE behaviors on page K1-5479.





K1.1 AArch32 CONSTRAINED UNPREDICTABLE behaviors 

ARMv8 defines architecturally-required constraints on many behaviors that are UNPREDICTABLE in ARMv7. The
following sections define those constraints:

• Overview of the constraints on ARMv7 UNPREDICTABLE behaviors on page K1-5457.
• Using R13 on page K1-5457.
• Using R15 on page K1-5457.
• Branching into an IT block on page K1-5458.
• Branching to an unaligned PC on page K1-5458.
• Loads and Stores to unaligned locations on page K1-5458.
• CONSTRAINED UNPREDICTABLE behavior associated with IT instructions and PSTATE.IT on
  page K1-5458.
• Unallocated System register access instructions on page K1-5460.
• SBZ or SBO fields in T32 and A32 instructions on page K1-5460.
• UNPREDICTABLE cases in immediate constants in T32 data-processing instructions on page K1-5460.
• UNPREDICTABLE cases in immediate constants in Advanced SIMD instructions on page K1-5461.
• CONSTRAINED UNPREDICTABLE behaviors due to caching of control or data values on page K1-5461.
• CONSTRAINED UNPREDICTABLE behavior due to inadequate context synchronization on page K1-5461.
• Translation Table Base Address alignment on page K1-5462.
• Handling of System register control fields for Advanced SIMD and floating-point operation on
  page K1-5462.
• The Performance Monitors Extension on page K1-5462.
• Syndrome register handling for CONSTRAINED UNPREDICTABLE instructions treated as UNDEFINED
  on page K1-5464.
• Out of range virtual address on page K1-5464.
• Instruction fetches from Device memory on page K1-5465.
• Multi-access instructions that load the PC from Device memory on page K1-5465.
• Programming CSSELR.Level for a cache level that is not implemented on page K1-5465.
• Crossing a page boundary with different memory types or Shareability attributes on page K1-5465.
• Crossing a 4KB boundary with a Device access on page K1-5466.
• UNPREDICTABLE behaviors with Load-Exclusive/Store-Exclusive pairs on page K1-5466.
• CONSTRAINED UNPREDICTABLE behavior for A32 memory hints, Advanced SIMD instructions, and
  miscellaneous instructions on page K1-5467.
• Out of range values of the Set/Way/Index fields in cache maintenance instructions on page K1-5467.
• CONSTRAINED UNPREDICTABLE behavior for A32 and T32 system instructions in the base instruction
  set on page K1-5467.
• CONSTRAINED UNPREDICTABLE behavior, A32 and T32 Advanced SIMD and floating-point instructions
  on page K1-5470.



• CONSTRAINED UNPREDICTABLE behaviors associated with the VTCR on page K1-5474.
• CONSTRAINED UNPREDICTABLE behavior of EL2 features on page K1-5474.
• Reserved values in System and memory-mapped registers and translation table entries on page K1-5477.
• CONSTRAINED UNPREDICTABLE behavior in Debug state on page K1-5478.


K1.1.1 Overview of the constraints on ARMv7 UNPREDICTABLE behaviors 


The term UNPREDICTABLE describes a number of cases where the architecture has a feature that software must not 
use. For execution in AArch32 state, where previous versions of the architecture define behavior as 
UNPREDICTABLE, the ARMv8-A architecture specifies a narrow range of permitted behaviors. This range is the 
range of CONSTRAINED UNPREDICTABLE behavior. All implementations that are compliant with the architecture 
must follow the CONSTRAINED UNPREDICTABLE behavior. 





Note

Software designed to be compatible with the ARMv8-A architecture must not rely on these CONSTRAINED
UNPREDICTABLE cases.





K1.1.2 Using R13 


In prior versions of the architecture, the use of R13 as a named register specifier was described as UNPREDICTABLE 
in the pseudocode. In the ARMv8-A architecture, the use of R13 as a named register specifier is not 
UNPREDICTABLE, unless this is specifically stated, and R13 can be used in the regular form. Bits[1:0] of R13 are not 
treated as RES0, but can hold any values programmed into them.


K1.1.3 Using R15 


All uses of R15 as a named register specifier for a source register that are described as CONSTRAINED 
UNPREDICTABLE in the pseudocode or in other places in this Manual must do one of the following: 


• Cause the instruction to be treated as UNDEFINED.
• Cause the instruction to execute as a NOP.
• Read or return an UNKNOWN value for the source register specified as R15.


All uses of R15 as a named register specifier for a destination register that are described as CONSTRAINED 
UNPREDICTABLE in the pseudocode or in other places in this reference manual must do one of the following: 


• Cause the instruction to be treated as UNDEFINED.
• Cause the instruction to execute as a NOP.
• Ignore the write.
• Branch to an UNKNOWN location in either A32 or T32 state.


The choice between these behaviors might in some implementations vary from instruction to instruction, or between 
different instances of the same instruction. 


Instructions that are CONSTRAINED UNPREDICTABLE when the base register is R15 and the instruction specifies a 
writeback of the base register, are treated as having R15 as both a source register and a destination register. 


For instructions that have two destination registers, for example LDRD, MRRC, and many of the multiply instructions, 
if Rt, Rt2, RdLo, or RdHi is R15, then the other destination register of the pair is UNKNOWN, if the CONSTRAINED 
UNPREDICTABLE behavior for the write to R15 is either to ignore the write or to branch to an UNKNOWN location. 


For instructions that affect any or all of PSTATE.{N, Z, C, V}, PSTATE.Q, and PSTATE.GE when the register 
specifier is not R15, any flags affected by an instruction that is CONSTRAINED UNPREDICTABLE when the register 
specifier is R15 become UNKNOWN.


In addition, for MRC instructions that use R15 as the destination register descriptor, and therefore target APSR_nzcv 
where these are described as being CONSTRAINED UNPREDICTABLE, PSTATE.{N, Z, C, V} becomes UNKNOWN. 
K1.1.4 Branching into an IT block
Branching into an IT block leads to CONSTRAINED UNPREDICTABLE behavior. Execution starts from the address
determined by the branch, but each instruction in the IT block is treated in one of the following ways:
° Executed as if it were not in an IT block. This means that it is executed unconditionally. 
° Executed as if it had passed its Condition code check within an IT block. 
° Executed as a NOP. That is, it behaves as if it had failed the Condition code check. 
K1.1.5 Branching to an unaligned PC 
In A32 state, when branching to an address that is not word aligned and is defined to be CONSTRAINED 
UNPREDICTABLE, one of the following behaviors must occur: 
° The unaligned location is forced to be aligned.
° The unaligned address generates an exception on the first instruction using the unaligned PC value.
If that instruction is executed at EL0 and either of the following applies, the exception is taken to EL2:
— EL2 is using AArch32 and the value of HCR.TGE is 1.
— EL2 is using AArch64 and the value of HCR_EL2.TGE is 1.
If the instruction is executed at EL0 when the applicable TGE bit is 0 the exception is taken to EL1.
If the instruction is executed at an Exception level that is higher than EL0 the exception is taken to the
Exception level at which the instruction was executed. 
In all cases, the exception is generated only if the first instruction using the unaligned PC value is 
architecturally executed. 
If the exception that results from a branch to an unaligned PC value: 
° Is taken to an Exception level that is using AArch64, it is reported as a PC alignment fault exception, see 
Exception from an Illegal Execution state, PC alignment fault, or SP alignment fault on page D1-1530. 
° Is taken to an Exception level that is using AArch32, it is reported as a Prefetch Abort exception, see Prefetch 
Abort exception reporting a PC alignment fault exception on page G1-3857. 
Note 
Because bit[0] is used for interworking, it is impossible to specify a branch to A32 state when the bottom bit of the 
target address is 1. Therefore the bottom bit of IFAR, HIFAR, or FAR_ELx is 0 for all these cases. 
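The two permitted behaviors, and the routing of the resulting exception, can be sketched as a small model. This is an illustrative sketch only, not architectural pseudocode. The function names and the use of plain integers for Exception levels are assumptions made for the example.

```python
def a32_target_is_unaligned(address: int) -> bool:
    """An A32 branch target is unaligned if it is not word aligned."""
    return (address & 0b11) != 0


def force_word_aligned(address: int) -> int:
    """First permitted behavior: the unaligned location is forced to be aligned."""
    return address & ~0b11 & 0xFFFFFFFF


def alignment_exception_target_el(current_el: int, tge: bool) -> int:
    """Second permitted behavior: the Exception level that takes the exception.

    tge models the applicable TGE bit (HCR.TGE or HCR_EL2.TGE).
    """
    if current_el == 0:
        return 2 if tge else 1
    # Above EL0 the exception is taken to the level that executed the instruction.
    return current_el
```

For example, an unaligned branch executed at EL0 with the applicable TGE bit set is routed to EL2, while the same branch at EL1 is taken to EL1.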
K1.1.6 Loads and Stores to unaligned locations 
Some unaligned loads and stores in the ARMv7 architecture are described as UNPREDICTABLE. These are defined in 
the ARMv8-A architecture to do one of the following: 
° Take an alignment fault.
° Perform the specified load or store to the unaligned memory location.
K1.1.7 CONSTRAINED UNPREDICTABLE behavior associated with IT instructions and PSTATE.IT 
A number of instructions in the architecture are described as being CONSTRAINED UNPREDICTABLE either: 
° Anywhere within an IT block. 
° As an instruction within an IT block, other than the last instruction within an IT block.
Unless otherwise stated in this manual, when these instructions are committed for execution, one of the following 
occurs: 
° An UNDEFINED exception results. 
° The instructions are executed as if they had passed the Condition code check. 
° The instructions execute as NOPs. This means that they behave as if they had failed the Condition code check. 
The behavior might in some implementations vary from instruction to instruction, or between different instances of
the same instruction. 


Many instructions that are CONSTRAINED UNPREDICTABLE in an IT block are branch instructions or other 
non-sequential instructions that change the PC. Where these instructions are not treated as UNDEFINED within an IT 
block, the remaining iterations of the PSTATE.IT state machine must be treated in one of the following ways: 


° PSTATE.IT is cleared to 0. 


° PSTATE.IT advances for either a sequential or a nonsequential change of the PC in the same way as it does 
for instructions that are not CONSTRAINED UNPREDICTABLE that cause a sequential change of the PC. 


Note 


This does not apply to an instruction that is the last instruction in an IT block. 








The instructions addressed by the updated PC must do one of the following: 


° Execute as if they had passed the Condition code check for the remaining iterations of the PSTATE.IT state 
machine. 
° Execute as NOPs. That is, they behave as if they had failed the Condition code check for the remaining
iterations of the PSTATE.IT state machine.


° Execute as if they were unconditional, or, if the instructions are part of another IT block, in accordance with 
the behavior described in Branching into an IT block on page K1-5458. 


The behavior might in some implementations vary from instruction to instruction, or between different instances of 
the same instruction. 


For exception returns or Debug state exits that cause PSTATE.IT to be set to a reserved value in T32 state or that 
return to A32 state with a nonzero value in PSTATE.IT, the PSTATE.IT bits are forced to ‘00000000’. The reserved 
values are: 


PSTATE.IT[7:4] != ‘0000’ && PSTATE.IT[3:0] == ‘0000’
PSTATE.IT[2:0] != ‘000’ when SCTLR/SCTLR_EL1.ITD == ‘1’
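The two reserved-value conditions can be expressed as a short predicate. The following is a minimal sketch, not architectural pseudocode. The function name is hypothetical, and the second condition is read as IT[2:0] being nonzero when the ITD control is 1.

```python
def pstate_it_is_reserved(it: int, itd: int) -> bool:
    """Return True if the 8-bit PSTATE.IT value is one of the reserved values.

    it  -- candidate PSTATE.IT value
    itd -- value of the SCTLR/SCTLR_EL1 ITD control (0 or 1)
    """
    it &= 0xFF
    # IT[7:4] != '0000' && IT[3:0] == '0000'
    if (it >> 4) != 0 and (it & 0xF) == 0:
        return True
    # IT[2:0] != '000' when ITD == '1'
    if itd == 1 and (it & 0b111) != 0:
        return True
    return False
```

In this model, a value for which the predicate returns True is the case where the PSTATE.IT bits are forced to ‘00000000’.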


Exception returns or Debug state exits that set PSTATE.IT to a non-reserved value in T32 state can occur when the 
flow of execution returns to a point: 


° Outside an IT block, but with the PSTATE.IT bits set to a value other than ‘00000000’. 


° Inside an IT block, but with a different value of the PSTATE.IT bits than if the IT block had been executed 
without an exception return or Debug state exit. 


In this case the instructions at the target of the exception return or Debug state exit must do one of the following: 


° Execute as if they passed the Condition code check for the remaining iterations of the PSTATE.IT state 
machine. 
° Execute as NOPs. That is, they behave as if they failed the Condition code check for the remaining iterations
of the PSTATE.IT state machine.


° Execute as if they were unconditional, or as if the instruction were part of another IT block, in accordance 
with the behavior in Branching into an IT block on page K1-5458. 


The remaining iterations of the PSTATE.IT state machine must behave in one of the following ways: 
° The PSTATE.IT state machine advances as if it were in an IT block.

° The PSTATE.IT bits are ignored.

° The PSTATE.IT bits are forced to ‘00000000’.
K1.1.8 Unallocated System register access instructions


In ARMv8-A, accesses to unallocated System register encodings are UNDEFINED. 


This includes: 


° Reads using encodings that are defined as WO.
° Writes using encodings that are defined as RO.
° MCR or MRC accesses using a set of {coproc, CRn, opc1, CRm, opc2} values that the ARMv7 architecture defined
as UNPREDICTABLE.


° Accesses to System registers in the (coproc==b111x) encoding space that the ARMv7 architecture defined
as UNPREDICTABLE when particular functionality was not implemented, when an ARMv8 implementation
does not include the Exception level that provides that functionality.


K1.1.9 SBZ or SBO fields T32 and A32 in instructions


Many of the A32 and T32 instructions have (0) or (1) in the instruction decode to indicate should-be-zero, SBZ, or
should-be-one, SBO. If an instruction is executed with a bit pattern in which these fields do not have their
should-be values, one of the following must occur:


° The instruction is UNDEFINED.

° The instruction executes as a NOP.

° The instruction operates as if the bit had the should-be value.

° Any destination registers of the instruction become UNKNOWN.


The exceptions to this rule are: 

° LDM, LDMIA, LDMFD on page F5-2699.
° LDMDB, LDMEA on page F5-2709.
° LDR (literal) on page F5-2718.
° LDRB (literal) on page F5-2727.
° LDRD (immediate) on page F5-2735.
° LDRD (register) on page F5-2741.
° LDRD (literal) on page F5-2738.
° LDRH (literal) on page F5-2755.
° LDRSB (literal) on page F5-2766.
° LDRSH (literal) on page F5-2777.
° POP on page F5-2880.
° PUSH on page F5-2886.
° SDIV on page F5-2962.
° STM, STMIA, STMEA on page F5-3049.
° STMDB, STMFD on page F5-3057.
° UDIV on page F5-3164.


K1.1.10 UNPREDICTABLE cases in immediate constants in T32 data-processing instructions


The description of immediate constants in T32 data-processing instructions, in Modified immediate constants in
T32 instructions on page F2-2420, includes constant values that were UNPREDICTABLE in ARMv7. Instruction
encodings on page F2-2402 describes 32-bit T32 instructions as {hw1, hw2}, where hw1 is the left-hand halfword
in the 32-bit encoding diagram for the instruction. The UNPREDICTABLE cases are those where both:


° hw2[7:0] == 0b00000000.

° hw1[10] == 0 and one of:
— hw2[14:12] == 0b001.
— hw2[14:12] == 0b010.
— hw2[14:12] == 0b011.
In ARMv8 the CONSTRAINED UNPREDICTABLE behavior is that these encodings produce the value 0b0000000.
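The conditions that identify these encodings can be checked mechanically. The following is a minimal sketch, assuming the instruction is supplied as its two halfwords hw1 and hw2. The function name is hypothetical and is not part of the architecture.

```python
def t32_modified_imm_was_unpredictable(hw1: int, hw2: int) -> bool:
    """Return True for the modified-immediate encodings that ARMv7 made UNPREDICTABLE.

    Both conditions must hold:
      hw2[7:0] is all zeros, and
      hw1[10] is 0 with hw2[14:12] one of 0b001, 0b010, 0b011.
    """
    low_byte_zero = (hw2 & 0xFF) == 0
    bit10_zero = ((hw1 >> 10) & 1) == 0
    field_14_12 = (hw2 >> 12) & 0b111
    return low_byte_zero and bit10_zero and field_14_12 in (0b001, 0b010, 0b011)
```

A decoder model could use this predicate to substitute a zero constant for the affected encodings, in line with the ARMv8 behavior described above.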


K1.1.11 UNPREDICTABLE cases in immediate constants in Advanced SIMD instructions 


The description of immediate constants in Modified immediate constants in T32 and A32 Advanced SIMD
instructions on page F2-2423 includes constant values that were UNPREDICTABLE in ARMv7. The UNPREDICTABLE
cases are those where:


° The bits that the encoding diagram shows as abcdefgh are all 0.
In the A32 encoding these are bits[24, 18:16, 3:0]. In the T32 encoding they are bits {hw1[12, 2:0], hw2[3:0]}.


° The bits that the encoding diagram shows as cmode[3:1] are one of {0b001, 0b010, 0b011, 0b101, 0b110}.
In the A32 encoding these are bits[11:9]. In the T32 encoding they are bits hw2[11:9].


In ARMv8 the CONSTRAINED UNPREDICTABLE behavior is that these encodings produce an immediate constant 
value of zero. 
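For the A32 encoding, the two conditions can be sketched as a predicate on the 32-bit instruction word. This is an illustrative sketch only. It reads the immediate bits as bit[24], bits[18:16], and bits[3:0], which is an assumption about the encoding diagram, and the function and constant names are hypothetical.

```python
# cmode[3:1] values for which an all-zero immediate was UNPREDICTABLE in ARMv7.
CU_CMODE_3_1 = {0b001, 0b010, 0b011, 0b101, 0b110}


def a32_simd_imm_was_unpredictable(instr: int) -> bool:
    """Return True if an A32 Advanced SIMD modified-immediate encoding is affected."""
    a = (instr >> 24) & 1          # bit[24]
    bcd = (instr >> 16) & 0b111    # bits[18:16]
    efgh = instr & 0b1111          # bits[3:0]
    imm_bits_zero = (a, bcd, efgh) == (0, 0, 0)
    cmode_3_1 = (instr >> 9) & 0b111   # bits[11:9]
    return imm_bits_zero and cmode_3_1 in CU_CMODE_3_1
```

Only the fields named in the text are examined. The remaining instruction bits are ignored by this sketch.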


K1.1.12 CONSTRAINED UNPREDICTABLE behaviors due to caching of control or data values 


The ARM architecture allows copies of control values or data values to be cached in a cache or TLB. This can lead 
to CONSTRAINED UNPREDICTABLE behavior if the cache or TLB has not been correctly invalidated following a 
change of the control or data values. 


Unless explicitly stated otherwise, the behavior of the PE is consistent with: 


° The old data or control value. 

° The new data or control value. 

° An amalgamation of the old and new data or control values.

Note

This rule applies where inadequate invalidation of the TLB might cause multiple hits within the TLB. In this
situation, a failure to invalidate the TLB by code running at a given Privilege level must not give access to regions
of memory with permissions or attributes that could not otherwise be accessed at that Privilege level.





As an alternative to this CONSTRAINED UNPREDICTABLE behavior, an implementation that detects multiple hits within a
TLB might generate an exception, reporting the exception using the TLB Conflict fault code, see TLB conflict
aborts on page G4-4091.


The choice between the behaviors might, in some implementations, vary for each use of a control or data value. 


K1.1.13 CONSTRAINED UNPREDICTABLE behavior due to inadequate context synchronization 


The ARM architecture requires that changes to System registers must be synchronized before they take effect. This 
can lead to CONSTRAINED UNPREDICTABLE behavior if the synchronization has not been performed. 


In these cases, the behavior of the PE is consistent with the unsynchronized control value being either the old value 
or the new value. 


Where multiple control values are updated but not yet synchronized, each control value might independently be the 
old value or the new value. 


In addition, where the unsynchronized control value applies to different areas of functionality, or what an 
implementation has constructed as different areas of functionality, those areas might independently treat the control 
value as being either the old value or the new value. 


The choice between these behaviors might, in some implementations, vary for each use of a control value. 
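Because each unsynchronized control value might independently be seen as old or new, the set of states that software must tolerate grows with the number of pending updates. The following sketch enumerates those combinations. It is a model for reasoning only, and the function name and the dictionary layout are assumptions made for the example.

```python
from itertools import product


def possible_observed_values(updates: dict) -> list:
    """Enumerate every old/new combination for a set of unsynchronized controls.

    updates maps a control name to a tuple (old_value, new_value).
    """
    names = list(updates)
    combos = product(*(updates[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]
```

With two pending updates there are four combinations, and with n pending updates there are 2**n. This is one reason to perform the required synchronization before relying on a new value.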
K1.1.14 Translation Table Base Address alignment
A misaligned Translation Table Base Address can occur if:

° The VMSAv8-32 Short-descriptor translation table format is enabled and TTBR0[13-N:7], which is defined
to be RES0, contains one or more nonzero values.

° The VMSAv8-32 Long-descriptor translation table format is enabled, and TTBR0[x-1:3], TTBR1[x-1:3],
HTTBR[x-1:3], or VTTBR[x-1:3], which are defined to be RES0, contain one or more nonzero values.

In the event of a misaligned Translation Table Base Address, one of the following behaviors must occur:

° The field that is defined to be RES0 is treated as if all bits were zero:
— The value that is read back might be the value written or it might be zero. 

° The calculation of an address for a translation table walk using that register might be corrupted in those bits 
that are nonzero. 
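For the Short-descriptor case, software can check a candidate register value before writing it. The following is a minimal sketch, assuming that N is the TTBCR.N value in the range 0 to 7. The function name is hypothetical.

```python
def ttbr0_short_descriptor_misaligned(ttbr0: int, n: int) -> bool:
    """Return True if TTBR0[13-N:7] contains any nonzero bit."""
    high = 13 - n
    if high < 7:
        # For N == 7 the bit range [13-N:7] is empty, so nothing can be misaligned.
        return False
    width = high - 7 + 1
    mask = ((1 << width) - 1) << 7
    return (ttbr0 & mask) != 0
```

The Long-descriptor case could be modelled in the same way with the bit range [x-1:3], where x depends on the translation table configuration.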

K1.1.15 Handling of System register control fields for Advanced SIMD and floating-point operation 
For historical reasons described in Background to the System register interface on page G1-3879, each of the 
CPACR, HCPTR, and NSACR has a pair of control fields that were defined to have identical functionality for 
controlling Advanced SIMD and floating-point operation. These fields are: 

° CPACR.{cp10, cp11}.
° HCPTR.{TCP10, TCP11}.
° NSACR.{cp10, cp11}.
The architecture requires that both fields in one of these pairs are programmed to the same value. If this is not done,
then the CONSTRAINED UNPREDICTABLE behavior is the same as if the cp11, or TCP11, control field was equal to the
cp10, or TCP10, field in all respects other than the value read back by a direct read of the register.
After a register write that writes different values to the two fields of a pair, a direct read of the register might return
an UNKNOWN value for the cp11 or TCP11 field.

Note 
This means that, when different values are written to the {cp10, cp11} fields in a single register, the architecture 
permits but does not require that a read of that register returns the value written to the cp11 field. 
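The rule that the cp10 field determines the behavior can be sketched for CPACR as follows. This is a hedged illustration, assuming that cp10 occupies bits[21:20] and cp11 occupies bits[23:22] of the register. The function names are hypothetical.

```python
def cpacr_fields(cpacr: int) -> tuple:
    """Extract (cp10, cp11) from a CPACR value, assuming bits[21:20] and bits[23:22]."""
    cp10 = (cpacr >> 20) & 0b11
    cp11 = (cpacr >> 22) & 0b11
    return cp10, cp11


def pair_is_consistent(cpacr: int) -> bool:
    """True if both fields of the pair hold the same value, as the architecture requires."""
    cp10, cp11 = cpacr_fields(cpacr)
    return cp10 == cp11


def effective_access_control(cpacr: int) -> int:
    """Access control value that governs behavior: the cp10 field in all cases."""
    cp10, _ = cpacr_fields(cpacr)
    return cp10
```

The sketch does not model the value returned by a direct read of the register, because the cp11 field of that value might be UNKNOWN after a mismatched write.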
CONSTRAINED UNPREDICTABLE CPACR and NSACR settings 
If CPACR.cp<n> contains the encoding ‘10’, then one of the following behaviors must occur: 
° The encoding maps onto any of the allocated values, but otherwise does not cause UNPREDICTABLE behavior. 
° The encoding causes effects that could be achieved by a combination of more than one of the allocated
encodings.

Note
In ARMv7, CPACR had a D32DIS bit, and NSACR had an NSD32DIS bit. There is no CPACR.D32DIS or
NSACR.NSD32DIS in ARMv8-A, and the corresponding bits in the two registers are RES0.

K1.1.16 The Performance Monitors Extension 
The following subsections describe CONSTRAINED UNPREDICTABLE behaviors when accessing the Performance 
Monitors Extension in AArch32 state: 

° CONSTRAINED UNPREDICTABLE accesses to PMXEVTYPER or PMXEVCNTR on page K1-5463.
° CONSTRAINED UNPREDICTABLE accesses to PMEVCNTR<n> and PMEVTYPER<n> on
page K1-5464.
° CONSTRAINED UNPREDICTABLE behavior caused by HDCR.HPMN on page K1-5464.

CONSTRAINED UNPREDICTABLE accesses to PMXEVTYPER or PMXEVCNTR
The following cases can cause CONSTRAINED UNPREDICTABLE behavior: 


° If PMSELR.SEL is not equal to 31, and PMSELR.SEL is greater than or equal to PMCR.N, and the PE is 
executing in Secure state or in Non-secure Hyp mode. 


° If PMSELR.SEL is not 31, and PMSELR.SEL is greater than or equal to HDCR.HPMN, and the PE is 
executing in a Non-secure mode other than Hyp mode. 


In these CONSTRAINED UNPREDICTABLE cases, one of the following behaviors must occur:

° Accesses to PMXEVTYPER or PMXEVCNTR from that mode are UNDEFINED.

° Accesses to PMXEVTYPER or PMXEVCNTR from that mode behave as RAZ/WI.

° Accesses to PMXEVTYPER or PMXEVCNTR from that mode execute as NOPs.

° Accesses to PMXEVTYPER or PMXEVCNTR from that mode behave as if PMSELR.SEL contains an
UNKNOWN value that is less than the number of counters accessible at the current Exception level and
Security state.

° Accesses to PMXEVTYPER or PMXEVCNTR behave as if PMSELR.SEL is 31.

° In Non-secure state, for an access to PMXEVTYPER or PMXEVCNTR from PL1 or a permitted access from
PL0, if the counter is implemented but not accessible at the current Exception level, the register access is
trapped to EL2. Accesses from PL0 are permitted when:

— EL1 is using AArch32 and the value of PMUSERENR.EN is 1.
— EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1.


If PMSELR.SEL is equal to 31, then one of the following behaviors must occur: 
° Accesses to PMXEVCNTR are UNDEFINED. 

° Accesses to PMXEVCNTR behave as RAZ/WI. 

° Accesses to PMXEVCNTR execute as NOPs. 


° Accesses to PMXEVCNTR behave as if PMSELR.SEL contains an UNKNOWN value that is less than the 
number of counters accessible at the current Exception level and Security state. 


° In Non-secure state, for an access to PMXEVCNTR from PL1 or a permitted access from PL0, if the counter
is implemented but not accessible at the current Exception level, the register access is trapped to EL2.
Accesses from PL0 are permitted when:

— EL1 is using AArch32 and the value of PMUSERENR.EN is 1.
— EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1.


Note


In an implementation that includes EL2, in Non-secure state at PL0 and PL1, HDCR.HPMN, or
MDCR_EL2.HPMN, identifies the number of accessible counters. Otherwise, the number of accessible counters is
the number of implemented counters.
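The two cases in which the selected counter is out of range when PMSELR.SEL is not 31 can be sketched as a single predicate. This is an illustrative model only. The function name and the boolean parameters describing the current state are assumptions made for the example.

```python
def selected_counter_out_of_range(sel: int, pmcr_n: int, hpmn: int,
                                  secure: bool, hyp_mode: bool) -> bool:
    """Return True for the PMSELR.SEL != 31 cases described in this subsection.

    sel      -- value of PMSELR.SEL
    pmcr_n   -- value of PMCR.N
    hpmn     -- value of HDCR.HPMN
    secure   -- True if the PE is executing in Secure state
    hyp_mode -- True if the PE is executing in Non-secure Hyp mode
    """
    if sel == 31:
        # SEL == 31 is handled by a separate rule for PMXEVCNTR.
        return False
    if secure or hyp_mode:
        return sel >= pmcr_n
    # Non-secure mode other than Hyp mode.
    return sel >= hpmn
```

The predicate identifies only when the behavior is CONSTRAINED UNPREDICTABLE. Which of the permitted behaviors then occurs is a choice made by the implementation.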
CONSTRAINED UNPREDICTABLE accesses to PMEVCNTR<n> and PMEVTYPER<n>


If <n> is greater than the number of counters available in the current Exception level and state, reads and writes of 
PMEVCNTR<n> and PMEVTYPER<n> are CONSTRAINED UNPREDICTABLE, and the following behaviors are 
permitted: 


° Accesses to the register are UNDEFINED. 
° Accesses to the register behave as RAZ/WI. 
° Accesses to the register execute as a NOP. 


° In Non-secure state, for an access to PMEVCNTR<n> or PMEVTYPER<n> from PL1 or a permitted access
from PL0, if the counter is implemented but not accessible at the current Exception level, the register access
is trapped to EL2. Accesses from PL0 are permitted when:

— EL1 is using AArch32 and the value of PMUSERENR.EN is 1.
— EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1.


Note


In an implementation that includes EL2, in Non-secure state at PL0 and PL1, HDCR.HPMN, or
MDCR_EL2.HPMN, identifies the number of accessible counters. Otherwise, the number of accessible counters is
the number of implemented counters.








CONSTRAINED UNPREDICTABLE behavior caused by HDCR.HPMN 
If HDCR.HPMN is set to 0 or to a value greater than PMCR.N, then the CONSTRAINED UNPREDICTABLE behavior is: 
° The value returned by a direct read of HDCR.HPMN is UNKNOWN.

° Either:

— An UNKNOWN number of counters are reserved for EL2 use. That is, the PE behaves as if
HDCR.HPMN is set to an UNKNOWN non-zero value less than PMCR.N.

— All counters are reserved for EL2 use, meaning no counters are accessible from Non-secure EL1 and
Non-secure EL0.


K1.1.17 Syndrome register handling for CONSTRAINED UNPREDICTABLE instructions treated as UNDEFINED


When a CONSTRAINED UNPREDICTABLE instruction is treated as UNDEFINED, this generates an exception: 
. If this exception is taken to an Exception level that is using AArch64 then ESR_ELx is UNKNOWN. 
. If this exception is taken to EL2 and EL2 is using AArch32, then the HSR is UNKNOWN. 


Note 


The value written to ESR or HSR must be consistent with a value that could be created as the result of an exception 
from the same Exception level that generated the exception, but resulted from a situation that is not CONSTRAINED 
UNPREDICTABLE at that Exception level. This is to avoid a possible privilege violation. 








K1.1.18 Out of range virtual address


If the PE executes an instruction for which the instruction address, size, and alignment mean it contains the bytes
0xFFFF FFFF and 0x0000 0000, then the bytes that wrap around and appear to be from 0x0000 0000 onwards come from
an UNKNOWN address.


If the PE executes a load or store instruction for which the computed address, total access size, and alignment mean
it accesses bytes 0xFFFF FFFF and 0x0000 0000, then the bytes that wrap around and appear to be from 0x0000 0000
onwards come from an UNKNOWN address.
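Whether an access wraps around the top of the 32-bit address space follows from its start address and size. The following is a minimal sketch, with hypothetical function names, that software such as a simulator or a static checker could use to detect the condition.

```python
ADDRESS_SPACE_SIZE = 1 << 32  # 32-bit virtual address space


def wrapped_byte_count(address: int, size: int) -> int:
    """Number of bytes of the access that fall beyond 0xFFFF FFFF."""
    return max(0, address + size - ADDRESS_SPACE_SIZE)


def access_wraps(address: int, size: int) -> bool:
    """True if the access covers both 0xFFFF FFFF and 0x0000 0000."""
    return wrapped_byte_count(address, size) > 0
```

For example, a 4-byte access at 0xFFFF FFFE wraps with two bytes beyond the top of the address space, while a 4-byte access at 0xFFFF FFFC does not wrap.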
K1.1.19 Instruction fetches from Device memory
Instruction fetches from Device memory are CONSTRAINED UNPREDICTABLE. 
If a location in memory has the Device attribute and is not marked as execute-never, then an implementation might 
perform speculative instruction accesses to this memory location when address translation is enabled. 
If a branch causes the program counter to point to a location in memory with the Device attribute that is not marked 
as execute-never for the current Exception level for instruction fetches, then an implementation must perform one 
of the following behaviors: 
° It treats the instruction fetch as if it were to a memory location with the Normal, Non-cacheable attribute. 
° It generates a Permission fault. 
K1.1.20 Multi-access instructions that load the PC from Device memory 
Multi-access instructions that load the PC from Device memory when address translation is enabled are 
UNPREDICTABLE in AArch32 state. In the ARMv8-A architecture in AArch32 state an implementation must perform 
one of the following behaviors: 
° It loads the PC from the memory location as if the memory location had the Normal Non-cacheable attribute. 
° It generates a Permission fault.
K1.1.21 Programming CSSELR.Level for a cache level that is not implemented 
If CSSELR.Level is programmed to a cache level that is not implemented, then a read of CSSELR returns an 
UNKNOWN value in CSSELR.Level. 
If the CSSELR.Level is programmed to a cache level that is not implemented, then on a read of CCSIDR an 
implementation must perform one of the following behaviors: 
° The CCSIDR read executes as a NOP. 
° The CCSIDR read is UNDEFINED. 
° The CCSIDR read returns an UNKNOWN value. 
K1.1.22 Crossing a page boundary with different memory types or Shareability attributes 
A memory access from a load or store instruction that crosses a page boundary to a memory location that has a 
different memory type or Shareability attribute results in CONSTRAINED UNPREDICTABLE behavior. In this case, the 
implementation must perform one of the following behaviors: 
° All memory accesses generated by the instruction use the memory type and Shareability attributes associated
with the first address accessed by the instruction.
° All memory accesses generated by the instruction use the memory type and Shareability attributes associated
with the last address accessed by the instruction.
° Each memory access generated by the instruction uses the memory type and Shareability attribute associated
with its own address.
° The instruction generates an alignment fault caused by the memory type.
For the Non-secure PL1&0 translation regime:
— If the stage 1 translation causes the mismatch then the resulting exception is taken to PL1.
— If the stage 2 translation causes the mismatch then the resulting exception is taken to PL2.
— If both stages of translation cause the mismatch then the resulting exception can be taken to either PL1
or PL2.
° The instruction executes as a NOP.
K1.1.23 Crossing a 4KB boundary with a Device access
A memory access from a load or store instruction to Device memory that crosses a 4KB boundary results in 
CONSTRAINED UNPREDICTABLE behavior. In this case, the implementation must perform one of the following 
behaviors: 
° All memory accesses generated by the instruction are performed as if the presence of the boundary had no
effect on the memory accesses.
° All memory accesses generated by the instruction are performed as if the presence of the boundary had no
effect on the memory accesses, except that there is no guarantee of ordering between memory accesses.
° The instruction generates an Alignment fault caused by the memory type.
For the Non-secure PL1&0 translation regime:
— If the stage 1 translation causes the boundary to be crossed then the resulting exception is taken to PL1.
— If the stage 2 translation causes the boundary to be crossed then the resulting exception is taken to PL2.
— If both stages of translation cause the boundary to be crossed then the resulting exception can be taken
to either PL1 or PL2.
° The instruction executes as a NOP.
Note 
The boundary referred to is between two Device memory regions that are both 4KB in size and aligned to 4KB.
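Whether a given access crosses a 4KB boundary depends only on its first and last byte addresses. The following sketch shows one way to test for this. The function name is hypothetical, and the check does not determine whether the memory on either side has the Device attribute.

```python
def crosses_4kb_boundary(address: int, size: int) -> bool:
    """True if the first and last bytes of the access lie in different 4KB regions."""
    first_region = address >> 12
    last_region = (address + size - 1) >> 12
    return first_region != last_region
```

For example, a 4-byte access at 0xFFE crosses the boundary at 0x1000, while a 4-byte access at 0xFFC does not.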
K1.1.24 UNPREDICTABLE behaviors with Load-Exclusive/Store-Exclusive pairs 
Load-Exclusive and Store-Exclusive instruction usage restrictions on page E2-2362 defines a 
Load-Exclusive/Store-Exclusive pair, and identifies various CONSTRAINED UNPREDICTABLE behaviors associated 
with using Load-Exclusive/Store-Exclusive pairs. These cases were UNPREDICTABLE in ARMv7. In summary, these
cases are:
° The target virtual address of a StoreExcl instruction is different from the virtual address of the preceding
LoadExcl instruction in the same thread of execution.
° The transaction size of a StoreExcl instruction is different from the transaction size of the preceding LoadExcl
instruction in the same thread of execution.
° The memory attributes for a StoreExcl instruction are different from the memory attributes for the preceding
LoadExcl instruction in the same thread of execution, either:
— Because the translation of the accessed address changes between the LoadExcl instruction and the
StoreExcl instruction.
— Because the LoadExcl instruction and the StoreExcl instruction use different virtual addresses, with
different attributes, that point to the same physical address.
In addition, the effect of a data or unified cache invalidate, clean, or clean and invalidate instruction on a local or 
global exclusive monitor that is in the Exclusive Access state is CONSTRAINED UNPREDICTABLE. 
See the descriptions in Load-Exclusive and Store-Exclusive instruction usage restrictions on page E2-2362 for the 
permitted behavior in each of these cases, and any constraints that might apply to whether the case is CONSTRAINED 
UNPREDICTABLE. 
Note 
Additional CONSTRAINED UNPREDICTABLE cases can apply to Load-Exclusive and Store-Exclusive instructions, see 
CONSTRAINED UNPREDICTABLE behavior for A32 and T32 system instructions in the base instruction set on 
page K1-5467. 
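The three summarized mismatch conditions can be modelled as a comparison between the two halves of a pair. This is a sketch for illustration only. The ExclusiveAccess type, its field names, and the string labels for memory attributes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExclusiveAccess:
    """One half of a Load-Exclusive/Store-Exclusive pair (hypothetical model)."""
    virtual_address: int
    size: int          # transaction size in bytes
    attributes: str    # label for the memory attributes, for example "Normal-WB"


def pair_mismatches(load: ExclusiveAccess, store: ExclusiveAccess) -> list:
    """List which of the summarized conditions a LoadExcl/StoreExcl pair meets."""
    problems = []
    if store.virtual_address != load.virtual_address:
        problems.append("address")
    if store.size != load.size:
        problems.append("size")
    if store.attributes != load.attributes:
        problems.append("attributes")
    return problems
```

An empty list means that none of the three summarized conditions applies. The sketch does not model the effect of cache maintenance instructions on the exclusive monitors.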
K1.1.25 CONSTRAINED UNPREDICTABLE behavior for A32 memory hints, Advanced SIMD
instructions, and miscellaneous instructions 


In the A32 instruction set, a number of memory hints, Advanced SIMD instructions, and miscellaneous A32 
instructions can result in CONSTRAINED UNPREDICTABLE behavior. 


The top level encoding of these instructions, as given in the ARM® Architecture Reference Manual, ARMv7-A and 
ARMv7-R edition, is: 


[Encoding diagram: instruction bits[31:0], including the op1, op2, and Rn fields.]

For instructions in this encoding space that ARMv7 shows as UNPREDICTABLE: 


° If the instruction does not have an encoding with op1 = 101x001, op2 = -, and Rn = 1111, then the CONSTRAINED
UNPREDICTABLE behavior is that an implementation must treat the encodings in one of the following ways: 


— The encoding is UNDEFINED. 


— The instruction executes as a NOP. 


° For an instruction with op1 = 101x001, op2 = -, and Rn = 1111, PLD (literal) on page F5-2871 describes the
behavior. This encoding is subject to the rules outlined in SBZ or SBO fields T32 and A32 in instructions on 
page K1-5460. 


Note 


This Manual restructures the description of the encoding of these A32 instructions. A future release of this Manual 
will add the CONSTRAINED UNPREDICTABLE behavior to the relevant instruction descriptions. 








K1.1.26 Out of range values of the Set/Way/Index fields in cache maintenance instructions 


In the cache maintenance by set/way instructions DCCISW, DCCSW, and DCISW, if any set/way/index argument
is larger than the value supported by the implementation, then the behavior is CONSTRAINED UNPREDICTABLE and 
one of the following occurs: 


° The instruction is UNDEFINED. 

° The instruction performs cache maintenance on one of:
— No cache lines.
— A single arbitrary cache line.
— Multiple arbitrary cache lines.


Note 


This CONSTRAINED UNPREDICTABLE behavior applies, also, to the A64 cache maintenance by set/way instructions 
DC CISW, DC CSW, and DC ISW. 








K1.1.27 CONSTRAINED UNPREDICTABLE behavior for A32 and T32 system instructions in the base 
instruction set 


This section lists the CONSTRAINED UNPREDICTABLE behavior for the different A32 and T32 system instructions. 


Note 


If an instruction can result in CONSTRAINED UNPREDICTABLE behavior that is not specific to that particular 
instruction, see the relevant section in this appendix for a description of the CONSTRAINED UNPREDICTABLE 
behavior. 
SRS (T32)


For a description of this instruction and the encoding, see SRS, SRSDA, SRSDB, SRSIA, SRSIB on page F5-3018. 


CONSTRAINED UNPREDICTABLE behavior 
For all encodings: 


° If the instruction specifies an illegal mode field, then one of the following behaviors must occur:
— The instruction is UNDEFINED.
— The instruction executes as a NOP.
— R13 of the current mode is used.
— The store occurs to an UNKNOWN address, and if the instruction specifies writeback, any
general-purpose register that can be accessed without privilege violation from the current Exception
level becomes UNKNOWN.


SRS (A32) 


For a description of this instruction and the encoding, see SRS, SRSDA, SRSDB, SRSIA, SRSIB on page F5-3018. 


CONSTRAINED UNPREDICTABLE behavior 
For all encodings: 


° If the instruction specifies an illegal mode field, then one of the following behaviors must occur:
— The instruction is UNDEFINED.
— The instruction executes as a NOP.
— R13 of the current mode is used.
— The store occurs to an UNKNOWN address, and if the instruction specifies writeback, any
general-purpose register that can be accessed without privilege violation from the current Exception
level becomes UNKNOWN.


SUBS PC, LR and related instructions (T32) 


For a description of this instruction and the encoding, see the exception return form of SUB, SUBS (immediate) on 
page F5-3114. 


CONSTRAINED UNPREDICTABLE behavior 
For all encodings: 


° If this instruction is executed in User mode or in System mode, then one of the following behaviors must 
occur: 


— The instruction is UNDEFINED. 


— The instruction executes as a NOP. 


° If the instruction transfers an illegal mode encoding to PSTATE.M, then this invokes the illegal exception 
return. 


Note 


An illegal mode encoding is either an unallocated mode encoding or one that is not accessible at the current 
Exception level. 
For this encoding:


° If hw2[3:0] are not 0b1110, and the instruction is executed when not in Hyp mode, System mode, or User mode,
then one of the following behaviors must occur:

— The instruction is UNDEFINED.
— The instruction is treated as a NOP.
— The instruction is treated as if hw2[3:0] are 0b1110.
— The program counter is set using the value in the register specified by hw2[3:0].


SUBS PC, LR and related instructions (A32)


For a description of this instruction and the encoding, see the exception return forms of MOV, MOVS (register) on 
page F5-2815 and SUB, SUBS (immediate) on page F5-3114. 


CONSTRAINED UNPREDICTABLE behavior 


For all encodings: 


If this instruction is executed in User mode or in System mode, then one of the following behaviors must 
occur: 


— The instruction is UNDEFINED. 


— The instruction executes as a NOP. 


If the instruction transfers an illegal mode encoding to PSTATE.M, then this invokes the illegal exception 
return. 





Note 


An illegal mode encoding is either an unallocated mode encoding or one that is not accessible at the current 
Exception level. 
K1.1.28 CONSTRAINED UNPREDICTABLE behavior, A32 and T32 Advanced SIMD and floating-point
instructions 


This section lists the CONSTRAINED UNPREDICTABLE behavior for the different A32 and T32 Advanced SIMD and 
floating-point instructions listed in Alphabetical list of floating-point and Advanced SIMD instructions on 
page F6-3234. 





Note 


° The pseudocode used in this section to describe cases that can result in CONSTRAINED UNPREDICTABLE
behavior does not necessarily match the encoding specific pseudocode for a specific instruction. 


° If an instruction can result in CONSTRAINED UNPREDICTABLE behavior that is not specific to that particular 
instruction, see the relevant section in this appendix for a description of the CONSTRAINED UNPREDICTABLE 
behavior. 





VCVT (between floating-point and fixed-point) 


For a description of this instruction and the encoding, see VCVT (between floating-point and fixed-point, 
floating-point) on page F6-3364. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD1 (multiple single elements) 
For a description of this instruction and the encoding, see VLD1 (multiple single elements) on page F6-3424.


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD1 (single element to all lanes) 
For a description of this instruction and the encoding, see VLD1 (single element to all lanes) on page F6-3421.


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD2 (multiple 2-element structures) 
For a description of this instruction and the encoding, see VLD2 (multiple 2-element structures) on page F6-3435. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD2 (single 2-element structure to one lane) 


For a description of this instruction and the encoding, see VLD2 (single 2-element structure to one lane) on 
page F6-3428. 





K1-5470 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. ARM DDI 0487A.k _iss10775 
Non-Confidential ID092916 


Appendix K1 Architectural Constraints on UNPREDICTABLE behaviors 
K1.1 AArch32 CONSTRAINED UNPREDICTABLE behaviors 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD2 (single 2-element structure to all lanes) 


For a description of this instruction and the encoding, see VLD2 (single 2-element structure to all lanes) on 
page F6-3432. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD3 (multiple 3-element structures) 
For a description of this instruction and the encoding, see VLD3 (multiple 3-element structures) on page F6-3445. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD3 (single 3-element structure to one lane) 


For a description of this instruction and the encoding, see VLD3 (single 3-element structure to one lane) on 
page F6-3438. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD3 (single 3-element structure to all lanes) 


For a description of this instruction and the encoding, see VLD3 (single 3-element structure to all lanes) on 
page F6-3442. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD4 (multiple 4-element structures) 
For a description of this instruction and the encoding, see VLD4 (multiple 4-element structures) on page F6-3455. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD4 (single 4-element structure to one lane) 


For a description of this instruction and the encoding, see VLD4 (single 4-element structure to one lane) on 
page F6-3448. 
If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLD4 (single 4-element structure to all lanes) 


For a description of this instruction and the encoding, see VLD4 (single 4-element structure to all lanes) on 
page F6-3452. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VLDM 
For a description of this instruction and the encoding, see VLDM, VLDMDB, VLDMIA on page F6-3458. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VMOV (between two general-purpose registers and two single-precision registers) 


For a description of this instruction and the encoding, see VMOV (between two general-purpose registers and two 
single-precision registers) on page F6-3518. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VMOV (between two general-purpose registers and a doubleword floating-point 
register) 


For a description of this instruction and the encoding, see VMOV (between two general-purpose registers and a 
doubleword floating-point register) on page F6-3503. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST1 (multiple single elements) 
For a description of this instruction and the encoding, see VST1 (multiple single elements) on page F6-3719.


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST2 (multiple 2-element structures) 


For a description of this instruction and the encoding, see VST2 (multiple 2-element structures) on page F6-3727. 
If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST2 (single 2-element structure from one lane) 


For a description of this instruction and the encoding, see VST2 (single 2-element structure from one lane) on 
page F6-3723. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST3 (multiple 3-element structures) 
For a description of this instruction and the encoding, see VST3 (multiple 3-element structures) on page F6-3734. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST3 (single 3-element structure from one lane) 


For a description of this instruction and the encoding, see VST3 (single 3-element structure from one lane) on 
page F6-3730. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST4 (multiple 4-element structures) 
For a description of this instruction and the encoding, see VST4 (multiple 4-element structures) on page F6-3741. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VST4 (single 4-element structure from one lane) 


For a description of this instruction and the encoding, see VST4 (single 4-element structure from one lane) on 
page F6-3737. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 


VSTM 
For a description of this instruction and the encoding, see VSTM, VSTMDB, VSTMIA on page F6-3744. 


If this instruction is not UNDEFINED, then whether it is affected by traps or enables relating to the use of the 
SIMD&FP registers when it is CONSTRAINED UNPREDICTABLE, is IMPLEMENTATION DEFINED. The implementation 
must ensure that the CONSTRAINED UNPREDICTABLE behavior does not corrupt registers that are not accessible at the 
current Exception level by instructions that are not CONSTRAINED UNPREDICTABLE. 
K1.1.29 CONSTRAINED UNPREDICTABLE behaviors associated with the VTCR


The following subsections describe the CONSTRAINED UNPREDICTABLE behavior associated with programming the
VTCR:


° Misprogramming VTCR.S.
° Misprogramming VTCR.{SL0, T0SZ}.


Misprogramming VTCR.S


VTCR.S must be programmed to the value of T0SZ[3], or the effect is CONSTRAINED UNPREDICTABLE. For the
ARMv8-A architecture, if VTCR.S is not programmed correctly, then the VTCR.T0SZ value is treated as an
UNKNOWN value.


Note


The CONSTRAINED UNPREDICTABLE behavior described in Misprogramming VTCR.{SL0, T0SZ} means the
UNKNOWN VTCR.T0SZ value might generate a Translation fault.


Misprogramming VTCR.{SL0, T0SZ}


If the stage 2 input address size, as programmed in VTCR.T0SZ, is out of range with respect to the starting level,
as programmed in the VTCR.SL0 field, or the VTCR.SL0 field is programmed to a reserved value, then at the time
of a translation walk that uses the stage 2 translation, a stage 2 level 1 Translation fault is generated.


K1.1.30 CONSTRAINED UNPREDICTABLE behavior of EL2 features


The following sections describe CONSTRAINED UNPREDICTABLE behavior that can occur in an implementation that 
includes EL2 where EL2 can use AArch32: 


° ERET in User mode or System mode. 

° Accessing Hyp mode from outside Hyp mode. 

° Modifying PSTATE.M when in Hyp mode on page K1-5475.

° Use of Hyp mode in Secure state on page K1-5475. 

° Execution of Load/Store unprivileged instructions in Hyp mode on page K1-5475.
° Exception return to Hyp mode on page K1-5475. 

° Accessing registers that cannot be accessed using MSR/MRS instructions on page K1-5475. 
° Memory type handling on page K1-5476. 

° Hyp mode TLB maintenance instructions on page K1-5476. 

° Hyp mode VA to PA address translation instructions on page K1-5476. 

° Stage 1 default memory type on page K1-5476. 

° Trapping of general exceptions to Hyp mode on page K1-5476. 

° Prevention of rootkits using Hyp mode or Secure state on page K1-5477. 

° HVC on page K1-5477. 

° MSR/MRS Banked registers on page K1-5477.


ERET in User mode or System mode 


If ERET is executed in User mode or System mode, it behaves as described in SUBS PC, LR and related instructions 
(T32) on page K1-5468. 


Accessing Hyp mode from outside Hyp mode 


Attempting to change into Hyp mode or out of Hyp mode using the MSR or CPS instruction invokes the ARMv8 illegal
exception return by not changing the mode, and setting PSTATE.IL to 1. 


SRS using the Hyp mode SP from Non-secure modes other than Hyp mode, or from Secure state, is handled as 
described in SRS (T32) on page K1-5468 and SRS (A32) on page K1-5468. 
Modifying PSTATE.M when in Hyp mode


Attempting to change into Hyp mode or out of Hyp mode using the MSR or CPS instruction invokes the ARMv8 illegal
exception return by not changing the mode, and setting PSTATE.IL to 1. 


SRS using the Hyp mode SP from Non-secure modes other than Hyp mode, or from Secure state, is handled as 
described in SRS (T32) on page K1-5468 and SRS (A32) on page K1-5468. 


Use of Hyp mode in Secure state 


Attempting to change into Hyp mode or out of Hyp mode using the MSR or CPS instruction invokes the ARMv8 illegal
exception return by not changing the mode, and setting PSTATE.IL to 1. 


SRS using the Hyp mode SP from Non-secure modes other than Hyp mode, or from Secure state, is handled as 
described in SRS (T32) on page K1-5468 and SRS (A32) on page K1-5468. 


Execution of Load/Store unprivileged instructions in Hyp mode 


If LDRT, LDRSHT, LDRHT, LDRSBT, LDRBT, STRT, STRHT or STRBT are executed in Hyp mode, then one of the following 
behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as a NOP. 


° The instruction performs the equivalent, corresponding LDR, LDRSH, LDRH, LDRSB, LDRB, STR, STRH or STRB 
instruction in Hyp mode. 
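As an illustration only, the three permitted outcomes above can be sketched as a small lookup. The mnemonic pairs come from the two lists in the text; the function name and the outcome strings are hypothetical and are not architectural pseudocode.

```python
# Unprivileged load/store mnemonics paired with the corresponding
# instruction named in the text above.
UNPRIVILEGED_TO_NORMAL = {
    "LDRT": "LDR", "LDRSHT": "LDRSH", "LDRHT": "LDRH", "LDRSBT": "LDRSB",
    "LDRBT": "LDRB", "STRT": "STR", "STRHT": "STRH", "STRBT": "STRB",
}


def permitted_hyp_behaviors(mnemonic):
    """List the permitted outcomes when `mnemonic` is executed in Hyp mode.

    Hypothetical helper: an implementation chooses exactly one of these.
    """
    equivalent = UNPRIVILEGED_TO_NORMAL[mnemonic.upper()]
    return ["UNDEFINED", "NOP", "executes as " + equivalent]
```

Software cannot tell which of the listed outcomes a given implementation chooses, so portable code should not execute these instructions in Hyp mode.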


Exception return to Hyp mode 


Exception returns to Hyp mode when SCR.NS == 0 or from a Non-secure PL1 mode invokes the ARMv8 illegal 
exception return. 


Accessing registers that cannot be accessed using MSR/MRS instructions 
The following MSR and MRS instructions can lead to CONSTRAINED UNPREDICTABLE behavior: 


MSR <Rm>_<mode>, <Rn> 
MSR SPSR_<mode>, <Rn> 
MSR ELR_hyp, <Rn> 
MRS <Rn>, <Rm>_<mode> 
MRS <Rn>, SPSR_<mode> 
MRS <Rn>, ELR_hyp 


If these instructions are executed in either Secure or Non-secure User mode, then one of the following behaviors 
must occur: 
° The instruction is UNDEFINED. 


° The instruction executes as a NOP. 


If the MSR and MRS instructions attempt to access a register that cannot be legally accessed, then one of the following
behaviors must occur: 


° The instruction is UNDEFINED. 

° The instruction executes as a NOP. 

° For MRS instructions, the destination general-purpose register becomes UNKNOWN. 

° For MSR instructions, if the register specified could be accessed from the current mode by other mechanisms,
then this register is UNKNOWN. Otherwise the instruction executes as a NOP.
Memory type handling


If the attributes for a memory location after combining stage 1 and stage 2 of a translation regime are Normal Inner
Non-cacheable, Outer Non-cacheable, then the shareability attribute after combining the two stages of translation
is Outer Shareable.
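The rule above can be sketched as a single override applied after the two stages have been combined. This is a minimal illustration; the function name, parameter names, and string values are assumptions made for the example.

```python
def combined_shareability(memory_type, inner, outer, shareability):
    """Apply the Normal Non-cacheable override described above.

    All arguments describe the attributes after combining stage 1 and
    stage 2. The string encodings are hypothetical.
    """
    if (memory_type == "Normal"
            and inner == "Non-cacheable"
            and outer == "Non-cacheable"):
        return "Outer Shareable"
    # Any other combination keeps the shareability it already had.
    return shareability
```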


Hyp mode TLB maintenance instructions 


If a TLBIMVAH, TLBIMVALH, TLBIMVAHIS, TLBIMVALHIS, TLBIALLH, TLBIALLHIS, TLBIALLNSNH,
TLBIALLNSNHIS, TLBIIPAS2, TLBIIPAS2L, TLBIIPAS2IS, or TLBIIPAS2LIS instruction is executed in a
Secure Privileged mode other than Monitor mode, then one of the following behaviors must occur:


° The instruction is UNDEFINED. 
° The instruction is treated as a NOP. 
° The instruction executes as if it had been executed in Monitor mode.


For more information about these instructions see The scope of TLB maintenance instructions on page G4-4101. 


Hyp mode VA to PA address translation instructions 


If an ATS1HR or ATS1HW instruction is executed in a Secure Privileged mode other than Monitor mode, then one 
of the following behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction is treated as a NOP. 
° The instruction executes as if it had been executed in Monitor mode.


For more information about these instructions see Address translation instruction naming and operation summary 
on page G4-4142. 


Stage 1 default memory type 


If HCR.DC == 1, then the behavior of the PE when executing in a Non-secure mode other than Hyp mode is 
consistent with: 


° SCTLR.M == 0, regardless of the actual value of SCTLR.M, other than for the value returned by an explicit 
read of SCTLR.M. 


° HCR.VM == 1, regardless of the actual value of HCR.VM, other than for an explicit read of this bit. 
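A minimal sketch of the two overrides above, assuming the stored bits are passed in as integers. The function name and the dictionary keys are hypothetical. The result models the values the PE behaves as if it had; explicit reads still return the stored values.

```python
def effective_stage1_controls(hcr_dc, sctlr_m, hcr_vm, non_secure, hyp_mode):
    """Return the control values the PE acts on, per the HCR.DC rule above."""
    if hcr_dc == 1 and non_secure and not hyp_mode:
        # Stage 1 behaves as disabled and stage 2 behaves as enabled.
        return {"SCTLR.M": 0, "HCR.VM": 1}
    return {"SCTLR.M": sctlr_m, "HCR.VM": hcr_vm}
```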


Trapping of general exceptions to Hyp mode 


Attempting to perform an exception return to a Non-secure PL1 mode when HCR.TGE == 1 invokes an illegal 
exception return. 


Attempting to change from Monitor mode to a Non-secure PL1 mode when HCR.TGE == 1 by executing a CPS or 
MSR instruction generates an Illegal Execution state exception, by not changing the mode, and setting PSTATE.IL 
to 1.


When EL3 is using AArch32, attempting to change from a Secure PL1 mode to a Non-secure PL1 mode when 
HCR.TGE is set, by changing SCR.NS from 0 to 1, results in no change of SCR.NS.


Because taking an exception into Non-secure PL1 modes leads to a CONSTRAINED UNPREDICTABLE situation, the
following additional properties apply when HCR.TGE == 1: 


° All exceptions that would be routed to EL1 are routed to EL2. 
° Non-secure SCTLR.M is treated as being 0, regardless of its actual value, other than for an explicit read of
this bit.


° HCR.FMO, HCR.IMO, and HCR.AMO are treated as being 1, regardless of their actual value, other than for 
an explicit read of these bits. 
° All virtual interrupts are disabled.


° Any IMPLEMENTATION DEFINED mechanisms for signalling virtual interrupts are disabled. 
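The additional properties listed above can be sketched as follows. This is an illustrative model only: the function names and dictionary keys are hypothetical, and the disabling of virtual interrupts is not modelled.

```python
def effective_view_with_tge(hcr_tge, sctlr_m, hcr_fmo, hcr_imo, hcr_amo):
    """Values the PE behaves as if it had when HCR.TGE is considered.

    Explicit reads of these bits still return the stored values.
    """
    if hcr_tge != 1:
        return {"SCTLR.M": sctlr_m, "HCR.FMO": hcr_fmo,
                "HCR.IMO": hcr_imo, "HCR.AMO": hcr_amo}
    return {"SCTLR.M": 0, "HCR.FMO": 1, "HCR.IMO": 1, "HCR.AMO": 1}


def routed_exception_level(default_target_el, hcr_tge):
    """Exceptions that would be routed to EL1 go to EL2 when HCR.TGE == 1."""
    if hcr_tge == 1 and default_target_el == 1:
        return 2
    return default_target_el
```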


Prevention of rootkits using Hyp mode or Secure state 


If an HVC instruction is executed in Hyp mode when SCR.HCE == 0, then one of the following behaviors must 
occur: 


° The instruction is UNDEFINED. 


° The instruction executes as a NOP. 


If an SMC instruction is executed in a Secure privileged mode when SCR.SCD == 1, then one of the following 
behaviors must occur: 


° The instruction is UNDEFINED. 
° The instruction executes as a NOP.


HVC


For a description of this instruction and the encoding, see HVC on page F5-2677. 


For the A1 encoding, if the cond field != 0b1110, then one of the following behaviors must occur:


° The instruction is UNDEFINED. 

° The instruction executes as a NOP. 

° The instruction executes unconditionally. 
° The instruction executes conditionally. 


MSR/MRS Banked registers 


Encoding and use of Banked register transfer instructions on page F5-3228 identifies cases where attempted 
execution of an MRS (Banked register) or MSR (Banked register) was UNPREDICTABLE in ARMv7 and becomes 
CONSTRAINED UNPREDICTABLE in ARMv8. This includes cases where:


° The target register specified by the {R, SYSm} fields of the instruction encoding is not accessible from the PE 
mode in which the instruction was executed, see Usage restrictions on the Banked register transfer 
instructions on page F5-3229. 


° The instruction was executed specifying unallocated {R, SYSm} field values, see Encoding the register
argument in the Banked register transfer instructions on page F5-3230. 


If one of these encodings for the MSR/MRS (Banked register) instructions is executed, then one of the following 
behaviors must occur: 





° The instruction is UNDEFINED.
° The instruction executes as a NOP.
° An allocated MSR/MRS (Banked register) instruction is executed.


K1.1.31 Reserved values in System and memory-mapped registers and translation table entries


Unless otherwise stated, all unallocated or reserved values of fields with allocated values within the AArch32
System registers, memory-mapped registers, and translation table entries behave in one of the following ways:


° The encoding maps onto any of the allocated values, but otherwise does not cause CONSTRAINED
UNPREDICTABLE behavior.
° The encoding causes effects that could be achieved by a combination of more than one of the allocated
encodings.
° The encoding causes the field to have no functional effect.
Note


These constraints are identical to those for the equivalent AArch64 definitions, as given in Reserved values in 
System and memory-mapped registers and translation table entries on page K1-5492. 








K1.1.32 CONSTRAINED UNPREDICTABLE behavior in Debug state 


Behavior in Debug state on page H2-4855 of this manual describes the CONSTRAINED UNPREDICTABLE behaviors 
that are specifically associated with Debug state. 
K1.2 AArch64 CONSTRAINED UNPREDICTABLE behaviors


This section contains the following subsections:

° Overview of the constraints on AArch64 UNPREDICTABLE behaviors. 

° SBZ or SBO fields in A64 instructions. 

° CONSTRAINED UNPREDICTABLE behaviors due to caching of control or data values on page K1-5480. 
° CONSTRAINED UNPREDICTABLE behavior due to inadequate context synchronization on page K1-5480. 
° Translation table base address alignment on page K1-5480.

° The Performance Monitors Extension on page K1-5480. 


° Syndrome register handling for CONSTRAINED UNPREDICTABLE instructions treated as UNDEFINED
on page K1-5481. 


° Out of range virtual address on page K1-5481.


° Instruction fetches from Device memory on page K1-5482. 

° Programming the CSSELR_EL1.Level for a cache level that is not implemented on page K1-5482. 
° Crossing a page boundary with different memory types or Shareability attributes on page K1-5482. 
° Crossing a peripheral boundary with a Device access on page K1-5483. 


° CONSTRAINED UNPREDICTABLE behaviors with Load-Exclusive/Store-Exclusive pairs on 
page K1-5483. 


° CONSTRAINED UNPREDICTABLE behavior for A64 instructions on page K1-5484. 

° Out of range values of the Set/Way/Index fields in cache maintenance instructions on page K1-5492. 

° Reserved values in System and memory-mapped registers and translation table entries on page K1-5492.
° CONSTRAINED UNPREDICTABLE behavior in Debug state on page K1-5492.


K1.2.1 Overview of the constraints on AArch64 UNPREDICTABLE behaviors 


The term UNPREDICTABLE describes a number of cases where the architecture has a feature that software must not 
use. For execution in AArch64 state, the ARMv8-A architecture specifies a narrow range of permitted behaviors. 
This range is the range of CONSTRAINED UNPREDICTABLE behavior. All implementations that are compliant with the 
architecture must follow the CONSTRAINED UNPREDICTABLE behavior. 





Note 
Software designed to be compatible with the ARMv8-A architecture must not rely on these CONSTRAINED 
UNPREDICTABLE cases being handled in any way other than those listed under the heading CONSTRAINED 
UNPREDICTABLE. 





K1.2.2 SBZ or SBO fields in A64 instructions 


Some A64 instructions have (0) or (1) in the instruction decode to indicate should-be-zero, SBZ, or should-be-one, 
SBO, as described in Fixed values in AArch64 instruction and System register descriptions on page C2-137. Except 
for specific cases identified in CONSTRAINED UNPREDICTABLE behaviors with Load-Exclusive/Store-Exclusive 
pairs on page K1-5483, if the instruction bit pattern of an instruction is executed with these fields not having the 
should-be values, one of the following must occur:





° The instruction is UNDEFINED.
° The instruction executes as a NOP.
° The instruction operates as if the bit had the should-be value.
° Any destination registers of the instruction become UNKNOWN.
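As a rough sketch, a decoder or test tool might detect the condition above by comparing an encoding against per-instruction masks. The masks and function names here are hypothetical; real mask values would come from the individual instruction encodings.

```python
PERMITTED_WHEN_VIOLATED = (
    "UNDEFINED",
    "NOP",
    "operates as if the bits had the should-be values",
    "destination registers become UNKNOWN",
)


def should_be_fields_ok(encoding, sbz_mask, sbo_mask):
    """True if every (0) bit is 0 and every (1) bit is 1 in `encoding`."""
    return (encoding & sbz_mask) == 0 and (encoding & sbo_mask) == sbo_mask


def permitted_outcomes(encoding, sbz_mask, sbo_mask):
    """Return the outcomes the text above allows for this bit pattern."""
    if should_be_fields_ok(encoding, sbz_mask, sbo_mask):
        return ("executes as specified",)
    return PERMITTED_WHEN_VIOLATED
```

The exceptions for Load-Exclusive/Store-Exclusive pairs mentioned in the text are not modelled here.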
K1.2.3 CONSTRAINED UNPREDICTABLE behaviors due to caching of control or data values


The ARM architecture allows copies of control values or data values to be cached in a cache or TLB. This can lead
to UNPREDICTABLE behavior if the cache or TLB has not been correctly invalidated following a change of the control
or data values.


Unless explicitly stated otherwise, the behavior of the PE is consistent with:


° The old data or control value.
° The new data or control value.
° An amalgamation of the control or data values.


Note


This rule applies where inadequate invalidation of the TLB might cause multiple hits within the TLB. In this
situation, a failure to invalidate the TLB by code running at a given Privilege level must not make it possible to
access regions of memory with permissions or attributes that could not be accessed at that Privilege level.


Alternatively to this CONSTRAINED UNPREDICTABLE behavior, an implementation detecting multiple hits within a
TLB might generate an exception, reporting the exception using the TLB conflict fault code, see TLB conflict aborts
on page D4-1814.


The choice between the behaviors might, in some implementations, vary for each use of a control or data value.
K1.2.4 CONSTRAINED UNPREDICTABLE behavior due to inadequate context synchronization


The ARM architecture requires that changes to System registers must be synchronized before they take effect. This
can lead to UNPREDICTABLE behavior if the synchronization has not been performed.


In these cases, the behavior of the PE is consistent with the unsynchronized control value being either the old value
or the new value.


Where multiple control values are updated but not yet synchronized, each control value might independently be the
old value or the new value.


In addition, where the unsynchronized control value applies to different areas of functionality, or what an
implementation has constructed as different areas of functionality, those areas might independently treat the control
value as being either the old value or the new value.


The choice between these behaviors might, in some implementations, vary for each use of a control value.
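To make the independence of multiple unsynchronized updates concrete, the following sketch enumerates every combination of old and new values that the PE behavior might be consistent with. The control names are placeholders, not real registers.

```python
from itertools import product


def possible_views(updates):
    """Enumerate the old/new combinations for unsynchronized controls.

    `updates` maps a control name to an (old_value, new_value) pair.
    Each control is chosen independently, so n controls give 2**n views.
    """
    names = list(updates)
    return [dict(zip(names, combo))
            for combo in product(*(updates[name] for name in names))]
```

For example, `possible_views({"CTRL_A": (0, 1), "CTRL_B": (0, 1)})` yields four views, including the mixed view in which `CTRL_A` is still old and `CTRL_B` is already new. Because the choice might vary for each use of a control value, software should perform the required synchronization rather than reason about a particular view.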
K1.2.5 Translation table base address alignment 
Field [x-1:0] in TTBR0_EL1, TTBR1_EL1, TTBR0_EL2, VTTBR_EL2, or TTBR0_EL3 is RES0. If this field does
not have a value of 0, this might result in a misaligned translation table base address. In this case, one of the
following behaviors must occur:
• The field that is defined to be RES0 is treated as if all the bits had a value of 0:
— The value read back might be the value written or it might be 0.
• The calculation of an address for a translation table walk using those registers might be corrupted in those
bits that are nonzero.
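Software can avoid this case by checking the RES0 field before it writes a base address. The following C fragment is an illustrative sketch only and is not part of the architecture. The function names are hypothetical, and the value of x is supplied by the caller because it is defined elsewhere in this manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask covering bits [x-1:0]. Assumes x is in the range 0..63. */
static inline uint64_t ttbr_res0_mask(unsigned x)
{
    return (x == 0) ? 0 : ((UINT64_C(1) << x) - 1);
}

/* True if bits [x-1:0] of the candidate base address are all zero,
 * that is, the RES0 field would be written as 0. */
static inline bool ttbr_base_is_aligned(uint64_t base, unsigned x)
{
    return (base & ttbr_res0_mask(x)) == 0;
}
```

A caller might reject or round down any base address for which ttbr_base_is_aligned() returns false, rather than relying on either of the permitted behaviors.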
K1.2.6 The Performance Monitors Extension 
The following cases can cause CONSTRAINED UNPREDICTABLE behavior: 
• If PMSELR_EL0.SEL is not equal to 31, and PMSELR_EL0.SEL is greater than or equal to PMCR_EL0.N,
and the PE is executing in an Exception level in Secure state or in EL2.
• If PMSELR_EL0.SEL is not 31, and PMSELR_EL0.SEL is greater than or equal to MDCR_EL2.HPMN,
and the PE is executing in an Exception level in Non-secure state other than in EL2.
In these cases, one of the following behaviors must occur:
• Accesses to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 from that state are UNDEFINED.
• Accesses to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 from that state behave as RAZ/WI.
• Accesses to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 from that state execute as NOPs.
• Accesses to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 from that state behave as if PMSELR_EL0.SEL
contains an UNKNOWN value that is less than the number of counters accessible at the current Exception level
and Security state.
• Accesses to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 from that state behave as if PMSELR_EL0.SEL
is 31.
• In Non-secure state, for an access to PMXEVTYPER_EL0 or PMXEVCNTR_EL0 from EL1 or a permitted
access from EL0, if the counter is implemented but not accessible at the current Exception level, the register
access is trapped to EL2. Accesses from EL0 are permitted when:
— EL1 is using AArch32 and the value of PMUSERENR.EN is 1.
— EL1 is using AArch64 and the value of PMUSERENR_EL0.EN is 1.
If PMSELR_EL0.SEL is equal to 31, then one of the following behaviors must occur:
• Accesses to PMXEVCNTR_EL0 are UNDEFINED.
• Accesses to PMXEVCNTR_EL0 behave as RAZ/WI.
• Accesses to PMXEVCNTR_EL0 execute as NOPs.
• Accesses to PMXEVCNTR_EL0 behave as if PMSELR_EL0.SEL contains an UNKNOWN value that is less
than the number of counters accessible at the current Exception level and Security state.
If MDCR_EL2.HPMN is set to 0, or to a value larger than PMCR_EL0.N, then the following CONSTRAINED
UNPREDICTABLE behavior applies:
• The value returned by a direct read of MDCR_EL2.HPMN is UNKNOWN.
• Either:
— An UNKNOWN number of counters are reserved for EL2 use. That is, the PE behaves as if
MDCR_EL2.HPMN is set to an UNKNOWN non-zero value less than PMCR_EL0.N.
— All counters are reserved for EL2 use, meaning no counters are accessible from Non-secure EL1 and
Non-secure EL0.
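The conditions in this section can be expressed as predicates that software evaluates before it accesses PMXEVTYPER_EL0 or PMXEVCNTR_EL0. The following C fragment is a minimal sketch under stated assumptions: the structure and function names are hypothetical, the field values are assumed to have been read already, and the sketch assumes that EL2 is implemented so that MDCR_EL2.HPMN exists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the fields that this section refers to. */
struct pmu_view {
    unsigned pmcr_n;      /* PMCR_EL0.N */
    unsigned hpmn;        /* MDCR_EL2.HPMN */
    bool secure_or_el2;   /* PE is executing in Secure state or in EL2 */
};

/* Limit that PMSELR_EL0.SEL is compared against in the first two cases. */
static unsigned pmu_sel_limit(const struct pmu_view *v)
{
    return v->secure_or_el2 ? v->pmcr_n : v->hpmn;
}

/* PMXEVTYPER_EL0: SEL == 31 is not one of the listed cases. */
static bool pmxevtyper_sel_is_predictable(const struct pmu_view *v, unsigned sel)
{
    return sel == 31 || sel < pmu_sel_limit(v);
}

/* PMXEVCNTR_EL0: SEL == 31 is itself a CONSTRAINED UNPREDICTABLE case. */
static bool pmxevcntr_sel_is_predictable(const struct pmu_view *v, unsigned sel)
{
    return sel != 31 && sel < pmu_sel_limit(v);
}

/* MDCR_EL2.HPMN must be nonzero and must not exceed PMCR_EL0.N. */
static bool hpmn_is_valid(unsigned hpmn, unsigned pmcr_n)
{
    return hpmn != 0 && hpmn <= pmcr_n;
}
```

The sketch only identifies the cases listed above. It does not model which of the permitted behaviors a particular implementation chooses.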


K1.2.7 Syndrome register handling for CONSTRAINED UNPREDICTABLE instructions treated as UNDEFINED


When a CONSTRAINED UNPREDICTABLE instruction is treated as UNDEFINED, ESR_ELx is UNKNOWN. 


Note
The value written to ESR_ELx must be consistent with a value that could be created as the result of an exception
from the same Exception level that generated the exception, but was the result of a situation that is not CONSTRAINED
UNPREDICTABLE at that Exception level. This is to avoid a possible privilege violation.





K1.2.8 Out of range virtual address 


Because of program counter alignment constraints, it is impossible for a PE to fetch an A64 instruction that includes
the bytes at virtual address 0xFFFF FFFF FFFF FFFF and 0x0000 0000 0000 0000.
If the PE executes a load or store instruction with tagged addressing disabled in the current translation regime, and
where the computed virtual address, total access size, and alignment mean that it accesses the bytes at
0xFFFF FFFF FFFF FFFF and 0x0000 0000 0000 0000, then the bytes that appear to be from 0x0000 0000 0000 0000
onwards are accessed at an UNKNOWN address.
If the PE executes a load or store instruction with tagged addressing enabled in the current translation regime, and
where the computed address, total access size, and alignment mean that it accesses the bytes at
0xFFFF FFFF FFFF FFFF and 0x0000 0000 0000 0000, then the bytes that appear to be from 0x0000 0000 0000 0000
onwards are accessed at an UNKNOWN address and the tags associated with the address also become UNKNOWN.
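Whether an access reaches both of these bytes depends only on its start address and total size. The following C fragment is an illustrative sketch, with a hypothetical function name, of a check for an access that wraps past the top of the 64-bit address space. It operates on the computed address as given and does not model tagged addressing.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an access of `size` bytes starting at `va` includes both the byte at
 * 0xFFFF FFFF FFFF FFFF and the byte at 0x0000 0000 0000 0000. */
static bool access_wraps_va_space(uint64_t va, uint64_t size)
{
    if (size == 0) {
        return false;
    }
    /* Equivalent to va + size - 1 > UINT64_MAX, written without overflow. */
    return va > UINT64_MAX - (size - 1);
}
```

For example, a 16-byte access at 0xFFFF FFFF FFFF FFF8 wraps, but an 8-byte access at the same address ends at the last byte of the address space and does not.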





K1.2.9 Instruction fetches from Device memory 
Instruction fetches from Device memory are CONSTRAINED UNPREDICTABLE. 
If a location in memory has the Device attribute and is not marked as execute-never, then an implementation might 
perform speculative instruction accesses to this memory location at times when address translation is enabled. 
If a branch causes the program counter to point to an area of memory with the Device attribute that is not marked 
as execute-never for the current Exception level for instruction fetches, then an implementation must perform one 
of the following behaviors: 
• It treats the instruction fetch as if it were to a memory location with the Normal, Non-cacheable attribute.
• It generates a Permission fault.
K1.2.10 Programming the CSSELR_EL1.Level for a cache level that is not implemented 
If the CSSELR_EL1.Level is programmed to a cache level that is not implemented, then a read of CSSELR_EL1 
returns an UNKNOWN value in CSSELR_EL1.Level. 
If the CSSELR_EL1.Level is programmed to a cache level that is not implemented, then on a read of CCSIDR_EL1 
an implementation must perform one of the following behaviors: 
• The CCSIDR_EL1 read executes as a NOP.
• The CCSIDR_EL1 read is UNDEFINED.
• The CCSIDR_EL1 read returns an UNKNOWN value.
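Software can avoid this case by checking that a cache level is implemented before it programs CSSELR_EL1.Level. The following C fragment is a sketch only. The function names are hypothetical. The field layouts it uses are assumptions that must be checked against the register descriptions: three-bit Ctype fields in CLIDR_EL1 starting at bit 0, with 0b000 meaning no cache, and CSSELR_EL1.Level in bits [3:1] holding the cache level minus 1. The sketch ignores the InD bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ctype field for cache level 1..7, assuming 3 bits per level from bit 0 of CLIDR_EL1. */
static unsigned clidr_ctype(uint64_t clidr, unsigned level)
{
    return (unsigned)((clidr >> (3u * (level - 1u))) & 0x7u);
}

/* True if the CLIDR_EL1 value reports any cache at `level` (assumes 0b000 means no cache). */
static bool cache_level_is_implemented(uint64_t clidr, unsigned level)
{
    return level >= 1u && level <= 7u && clidr_ctype(clidr, level) != 0u;
}

/* CSSELR_EL1 value that selects `level` with InD == 0, assuming Level is bits [3:1]. */
static uint64_t csselr_for_level(unsigned level)
{
    return (uint64_t)(level - 1u) << 1;
}
```

A caller would read CLIDR_EL1, call cache_level_is_implemented(), and write the value from csselr_for_level() only if the check succeeds.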
K1.2.11 Crossing a page boundary with different memory types or Shareability attributes 
A memory access from a load or store instruction that crosses a page boundary to a memory location that has a 
different memory type or Shareability attribute results in CONSTRAINED UNPREDICTABLE behavior. In this case, the 
implementation must perform one of the following behaviors: 
• All memory accesses generated by the instruction use the memory type and Shareability attributes associated
with the first address accessed by the instruction.
• All memory accesses generated by the instruction use the memory type and Shareability attributes associated
with the last address accessed by the instruction.
• Each memory access generated by the instruction uses the memory type and Shareability attribute associated
with its own address.
• The instruction generates an Alignment fault caused by the memory type.
For the Non-secure EL1&0 translation regime:
— If the stage 1 translation generated the mismatch then the resulting exception is taken to EL1.
— If the stage 2 translation generated the mismatch then the resulting exception is taken to EL2.
— If both stages of translation generate the mismatch then the exception can be taken to either EL1 or
EL2.
• The instruction executes as a NOP.
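The precondition for this case is that a single access spans two pages. The following C fragment is an illustrative sketch, with a hypothetical function name, of that boundary check. It assumes that the caller supplies a power-of-two page or granule size, and it does not compare the memory attributes of the two pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the bytes [va, va + size - 1] do not all lie in the same naturally aligned
 * region of `granule` bytes. `granule` must be a nonzero power of two. */
static bool access_crosses_boundary(uint64_t va, uint64_t size, uint64_t granule)
{
    if (size == 0) {
        return false;
    }
    uint64_t first = va & ~(granule - 1);
    uint64_t last = (va + size - 1) & ~(granule - 1);
    return first != last;
}
```

A 16-byte access at offset 0xFF8 in a 4KB page crosses into the next page, but the same access at offset 0xFF0 does not.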
K1.2.12 Crossing a peripheral boundary with a Device access
Performing memory accesses from one load or store instruction to Device memory that crosses a boundary
corresponding to the smallest translation granule size of the implementation causes CONSTRAINED UNPREDICTABLE
behavior. In this case, the implementation performs one of the following behaviors:
• All memory accesses generated by the instruction are performed as if the boundary has no effect on the
memory accesses.
• All memory accesses generated by the instruction are performed as if the boundary has no effect on the
memory accesses except that there is no guarantee of ordering between memory accesses.
• The instruction generates an Alignment fault caused by the memory type.
For the Non-secure EL1&0 translation regime:
— If the stage 1 translation causes the boundary to be crossed then the resulting exception is taken to EL1.
— If the stage 2 translation causes the boundary to be crossed then the resulting exception is taken to EL2.
— If both stages of translation cause the boundary to be crossed then the resulting exception can be taken
to either EL1 or EL2.
• The instruction executes as a NOP.
Note
The boundary referred to is between two Device memory regions that are both:
• Of the size of the smallest implemented translation granule.
• Aligned to the size of the smallest implemented translation granule.





K1.2.13 CONSTRAINED UNPREDICTABLE behaviors with Load-Exclusive/Store-Exclusive pairs
Load-Exclusive and Store-Exclusive instruction usage restrictions on page B2-115 defines a
Load-Exclusive/Store-Exclusive pair, and identifies various CONSTRAINED UNPREDICTABLE behaviors associated
with using Load-Exclusive/Store-Exclusive pairs. In summary, these cases are:
• The target virtual address of a StoreExcl instruction is different from the virtual address of the preceding
LoadExcl instruction in the same thread of execution.
• The transaction size of a StoreExcl instruction is different from the transaction size of the preceding LoadExcl
instruction in the same thread of execution.
• The StoreExcl instruction accesses a different number of registers than the preceding LoadExcl instruction in
the same thread of execution.
• The memory attributes for a StoreExcl instruction are different from the memory attributes for the preceding
LoadExcl instruction in the same thread of execution, either:
— Because the translation of the accessed address changes between the LoadExcl instruction and the
StoreExcl instruction.
— Because the LoadExcl instruction and the StoreExcl instruction use different virtual addresses, with
different attributes, that point to the same physical address.
In addition, the effect of a data or unified cache invalidate, clean, or clean and invalidate instruction on a local or
global exclusive monitor that is in the Exclusive Access state is CONSTRAINED UNPREDICTABLE.
See the descriptions in Load-Exclusive and Store-Exclusive instruction usage restrictions on page B2-115 for the
permitted behavior in each of these cases, and any constraints that might apply to whether the case is CONSTRAINED
UNPREDICTABLE.
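A tool that checks instruction sequences, for example a simulator or a static checker, might compare a LoadExcl access with the StoreExcl access that follows it. The following C fragment is a sketch of such a comparison. Every name in it is hypothetical: the record type is not an architectural structure, and the memory attributes are treated as an opaque value supplied by the caller.

```c
#include <stdint.h>

/* Hypothetical record of one exclusive access. */
struct excl_access {
    uint64_t va;          /* target virtual address */
    unsigned size_bytes;  /* transaction size */
    unsigned num_regs;    /* number of registers transferred */
    unsigned mem_attrs;   /* opaque encoding of the memory attributes */
};

/* One flag per summarized case. */
enum {
    EXCL_MISMATCH_ADDR  = 1u << 0,
    EXCL_MISMATCH_SIZE  = 1u << 1,
    EXCL_MISMATCH_REGS  = 1u << 2,
    EXCL_MISMATCH_ATTRS = 1u << 3
};

/* Returns the set of summarized cases that apply to a LoadExcl/StoreExcl pair,
 * or 0 if none of them applies. */
static unsigned excl_pair_mismatches(const struct excl_access *ld,
                                     const struct excl_access *st)
{
    unsigned m = 0;
    if (ld->va != st->va)                 m |= EXCL_MISMATCH_ADDR;
    if (ld->size_bytes != st->size_bytes) m |= EXCL_MISMATCH_SIZE;
    if (ld->num_regs != st->num_regs)     m |= EXCL_MISMATCH_REGS;
    if (ld->mem_attrs != st->mem_attrs)   m |= EXCL_MISMATCH_ATTRS;
    return m;
}
```

A nonzero result only indicates that one of the summarized cases might apply. The referenced section defines the permitted behavior and any further constraints.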
K1.2.14 CONSTRAINED UNPREDICTABLE behavior for A64 instructions
This section lists the CONSTRAINED UNPREDICTABLE behavior for the different A64 instructions listed in Chapter C6
A64 Base Instruction Descriptions and Chapter C7 A64 Advanced SIMD and Floating-point Instruction
Descriptions.

LDAXP
For a description of this instruction and the encoding, see LDAXP on page C6-536.
CONSTRAINED UNPREDICTABLE behavior
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value.

LDNP
For a description of this instruction and the encoding, see LDNP on page C6-542.
CONSTRAINED UNPREDICTABLE behavior
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value.

LDNP (SIMD&FP)
For a description of this instruction and the encoding, see LDNP (SIMD&FP) on page C7-1088.
CONSTRAINED UNPREDICTABLE behavior
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value.

LDP
For a description of this instruction and the encoding, see LDP on page C6-544.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and (t == n || t2 == n)
&& n != 31, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs all of the loads using the specified addressing mode, and the register loaded is set
to an UNKNOWN value.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.
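The two LDP conditions combine into a single predicate over the register numbers. The following C fragment is an illustrative sketch with a hypothetical function name. The wback argument is assumed to be true when the encoding specifies pre-indexed or post-indexed addressing, and t, t2, and n are assumed to be the decoded register numbers used in the conditions above.

```c
#include <stdbool.h>

/* True if an LDP encoding matches either condition listed for LDP. */
static bool ldp_is_constrained_unpredictable(bool wback, unsigned t,
                                             unsigned t2, unsigned n)
{
    /* Writeback form where a destination register is also the base register.
     * n == 31 is excluded by the condition in the text. */
    bool base_overlap = wback && (t == n || t2 == n) && n != 31;

    /* Both destination registers are the same register. */
    bool same_dest = (t == t2);

    return base_overlap || same_dest;
}
```

An assembler or disassembler might use a predicate of this kind to warn about an encoding before it is executed.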





LDP (SIMD&FP)
For a description of this instruction and the encoding, see LDP (SIMD&FP) on page C7-1090.
CONSTRAINED UNPREDICTABLE behavior
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value.

LDPSW
For a description of this instruction and the encoding, see LDPSW on page C6-547.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and (t == n || t2 == n)
&& n != 31, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs all of the loads using the specified addressing mode, and the register loaded is set
to an UNKNOWN value.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.








LDR (immediate)
For a description of this instruction and the encoding, see LDR (immediate) on page C6-550.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.

LDRB (immediate)
For a description of this instruction and the encoding, see LDRB (immediate) on page C6-557.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.

LDRH (immediate)
For a description of this instruction and the encoding, see LDRH (immediate) on page C6-561.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.








LDRSB (immediate)
For a description of this instruction and the encoding, see LDRSB (immediate) on page C6-565.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.

LDRSH (immediate)
For a description of this instruction and the encoding, see LDRSH (immediate) on page C6-570.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.

LDRSW (immediate)
For a description of this instruction and the encoding, see LDRSW (immediate) on page C6-575.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the load using the specified addressing mode, and the base register is set to an
UNKNOWN value. In addition, if an exception occurs during such an instruction, the base register might be
corrupted so that the instruction cannot be repeated.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.
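For the single-register immediate forms, the condition n == t && n != 31 can be evaluated directly from an instruction word. The following C fragment is a sketch under stated assumptions and is not a decoder. The function names are hypothetical. The caller is assumed to have already classified the word as one of the integer load/store register (immediate) forms listed in this section. The field positions used, Rt in bits [4:0], Rn in bits [9:5], and bit 10 set for the pre-indexed and post-indexed forms when bits [25:24] are 0b00 and bit 21 is 0, are assumptions that must be checked against the encoding diagrams in Chapter C6.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the word uses a pre-indexed or post-indexed (writeback) form.
 * Assumed layout: bits [25:24] == 0b00, bit 21 == 0, and bits [11:10] == 0b01
 * (post-indexed) or 0b11 (pre-indexed), so bit 10 == 1 selects writeback. */
static bool ldst_imm_has_writeback(uint32_t insn)
{
    return ((insn >> 24) & 0x3u) == 0u &&
           ((insn >> 21) & 0x1u) == 0u &&
           ((insn >> 10) & 0x1u) == 1u;
}

/* True if the writeback form names the same register as Rt and Rn, and Rn is not 31. */
static bool ldst_imm_wb_is_constrained_unpredictable(uint32_t insn)
{
    unsigned n = (insn >> 5) & 0x1fu;  /* assumed position of Rn */
    unsigned t = insn & 0x1fu;         /* assumed position of Rt */
    return ldst_imm_has_writeback(insn) && n == t && n != 31u;
}
```

Under these assumptions, the word 0xF8408400, read as a post-indexed LDR with Rt and Rn both 0, is reported, and 0xF8408420, with Rn equal to 1, is not.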








LDXP
For a description of this instruction and the encoding, see LDXP on page C6-598.
CONSTRAINED UNPREDICTABLE behavior
If t == t2, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a load using the specified addressing mode, and the base register is set to an
UNKNOWN value.

STP
For a description of this instruction and the encoding, see STP on page C6-694.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and (t == n || t2 == n)
&& n != 31, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a store using the specified addressing mode but the value stored is UNKNOWN.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.





STLXP
For a description of this instruction and the encoding, see STLXP on page C6-683.
CONSTRAINED UNPREDICTABLE behavior
If s == t || (s == t2), then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.

STLXR
For a description of this instruction and the encoding, see STLXR on page C6-686.
CONSTRAINED UNPREDICTABLE behavior
If s == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.
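The Store-Exclusive conditions depend only on the status register s, the data registers t and t2, and the base register n. The following C fragment is an illustrative sketch with hypothetical names. The field positions it uses to extract the register numbers, Rs in bits [20:16], Rt2 in bits [14:10], Rn in bits [9:5], and Rt in bits [4:0], are assumptions that must be checked against the encoding diagrams in Chapter C6.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the register numbers named in the conditions. */
struct stx_regs {
    unsigned s;   /* status register */
    unsigned t;   /* first data register */
    unsigned t2;  /* second data register, pair forms only */
    unsigned n;   /* base register */
};

/* Extracts the register fields, assuming the layout stated in the lead-in. */
static struct stx_regs stx_decode_regs(uint32_t insn)
{
    struct stx_regs r;
    r.s  = (insn >> 16) & 0x1fu;
    r.t2 = (insn >> 10) & 0x1fu;
    r.n  = (insn >> 5)  & 0x1fu;
    r.t  = insn & 0x1fu;
    return r;
}

/* Pair forms: s == t || s == t2, or s == n && n != 31. */
static bool stx_pair_is_constrained_unpredictable(const struct stx_regs *r)
{
    return r->s == r->t || r->s == r->t2 || (r->s == r->n && r->n != 31u);
}

/* Single-register forms: s == t, or s == n && n != 31. */
static bool stx_single_is_constrained_unpredictable(const struct stx_regs *r)
{
    return r->s == r->t || (r->s == r->n && r->n != 31u);
}
```

The same two predicates cover every pair form and single-register form listed in this section, because the listed conditions are identical across those instructions.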
STLXRB
For a description of this instruction and the encoding, see STLXRB on page C6-688.
CONSTRAINED UNPREDICTABLE behavior
If s == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.

STLXRH
For a description of this instruction and the encoding, see STLXRH on page C6-690.
CONSTRAINED UNPREDICTABLE behavior
If s == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.


STR (immediate)
For a description of this instruction and the encoding, see STR (immediate) on page C6-697.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a store using the specified addressing mode but the value stored is UNKNOWN.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.

STRB (immediate)
For a description of this instruction and the encoding, see STRB (immediate) on page C6-702.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a store using the specified addressing mode but the value stored is UNKNOWN.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.

STRH (immediate)
For a description of this instruction and the encoding, see STRH (immediate) on page C6-706.
CONSTRAINED UNPREDICTABLE behavior
If the instruction encoding specifies pre-indexed addressing or post-indexed addressing, and n == t && n != 31, then
one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs a store using the specified addressing mode but the value stored is UNKNOWN.
Note
Pre-indexed addressing and post-indexed addressing imply writeback.





STXP
For a description of this instruction and the encoding, see STXP on page C6-717.
CONSTRAINED UNPREDICTABLE behavior
If s == t || (s == t2), then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.

STXR
For a description of this instruction and the encoding, see STXR on page C6-720.
CONSTRAINED UNPREDICTABLE behavior
If s == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.

STXRB
For a description of this instruction and the encoding, see STXRB on page C6-722.
CONSTRAINED UNPREDICTABLE behavior
If s == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.

STXRH
For a description of this instruction and the encoding, see STXRH on page C6-724.
CONSTRAINED UNPREDICTABLE behavior
If s == t, then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to the specified address, but the value stored is UNKNOWN.
If s == n && n != 31 then one of the following behaviors must occur:
• The instruction is UNDEFINED.
• The instruction executes as a NOP.
• The instruction performs the store to an UNKNOWN address.
K1.2.15 Out of range values of the Set/Way/Index fields in cache maintenance instructions
In the cache maintenance by set/way instructions DC CISW, DC CSW, and DC ISW, if any set/way/index argument
is larger than the value supported by the implementation, then the behavior is CONSTRAINED UNPREDICTABLE and
one of the following occurs:
• The instruction is UNDEFINED.
• The instruction performs cache maintenance on one of:
— No cache lines.
— A single arbitrary cache line.
— Multiple arbitrary cache lines.
Note
This CONSTRAINED UNPREDICTABLE behavior also applies to the AArch32 cache maintenance by set/way
instructions DCCISW, DCCSW, and DCISW.
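Software that iterates over sets and ways can avoid this case by checking each argument against the cache geometry before it issues the instruction. The following C fragment is a sketch only. The structure and function names are hypothetical, cache levels are indexed from 0 for convenience, and the number of sets and ways at each level is assumed to have been obtained already, for example from the cache identification registers.

```c
#include <stdbool.h>

/* Hypothetical description of one implemented cache level. */
struct cache_level_geometry {
    unsigned num_sets;
    unsigned num_ways;
};

/* True if (level, set, way) names a line that exists in the described hierarchy.
 * `levels` has `num_levels` entries, and `level` is a 0-based index into it. */
static bool set_way_in_range(const struct cache_level_geometry *levels,
                             unsigned num_levels,
                             unsigned level, unsigned set, unsigned way)
{
    if (level >= num_levels) {
        return false;
    }
    return set < levels[level].num_sets && way < levels[level].num_ways;
}
```

A maintenance loop bounded by num_sets and num_ways for each level satisfies this check by construction. The explicit predicate is most useful where the arguments come from another source.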





K1.2.16 Reserved values in System and memory-mapped registers and translation table entries
Unless otherwise stated in this manual, all unallocated or reserved values of fields with allocated values within
AArch64 System registers, memory-mapped registers, and translation table entries behave in one of the following
ways:
• The unallocated value maps onto any of the allocated values, but otherwise does not cause CONSTRAINED
UNPREDICTABLE behavior.
• The unallocated value causes effects that could be achieved by a combination of more than one of the
allocated values.
• The unallocated value causes the field to have no functional effect.
Note
These constraints are identical to those for the equivalent AArch32 definitions, as given in Reserved values in
System and memory-mapped registers and translation table entries on page K1-5477.








K1.2.17 CONSTRAINED UNPREDICTABLE behavior in Debug state
Behavior in Debug state on page H2-4855 of this manual describes the CONSTRAINED UNPREDICTABLE behaviors
that are specifically associated with Debug state.
Recommended External Debug Interface
This appendix describes the recommended external debug interface. It contains the following sections:
• About the recommended external debug interface on page K2-5494.
• PMUEVENT bus on page K2-5497.
• Recommended authentication interface on page K2-5498.
• Management registers and CoreSight compliance on page K2-5499.
Note
This recommended external debug interface specification is not part of the ARM architecture specification.
Implementers and users of the ARMv8 architecture must not consider this appendix as a requirement of the
architecture. It is included as an appendix to this manual only:
• As reference material for users of ARM products that implement this interface.
• As an example of how an external debug interface might be implemented.
The inclusion of this appendix is no indication of whether any ARM products might, or might not, implement this
external debug interface. For details of the implemented external debug interface you must always see the
appropriate product documentation.
K2.1 About the recommended external debug interface
See the Note on the first page of this appendix for information about the architectural status of this recommended 
debug interface. 
This specification provides a recommended external debug interface for ARMv8 to define a standard set of 
connections for validation environments. In general, the connection between components, such as between the PE 
and Trace extension, is not described here, although the table does include the signals for the CTI connection. 
Table K2-1 shows the signals in the recommended interface. 
Table K2-1 Recommended debug interface signals 
Name               Direction  Description                                 Notes
DBGEN              In         External debug enable                       -
SPIDEN             In         Secure privileged external debug enable     -
                              Secure privileged self-hosted debug enable  Only in Secure AArch32 modes when enabled by MDCR_EL3.SPD32
NIDEN              In         External profiling and trace enable         -
SPNIDEN            In         Secure external profiling and trace enable  -
EDBGRQ             In         External halt request                       Provided for legacy connections only.
DBGACK             Out        External halt request acknowledge           Provided for legacy connections only.
COMMIRQ            Out        DCC interrupt                               Interface to an interrupt controller. See Interrupt-driven use of the DCC on
                                                                          page H4-4924 and the pseudocode for function CheckForDCCInterrupts().
PMUIRQ             Out        Performance Monitor overflow                Interface to an interrupt controller. See Behavior on overflow on page D5-1838.
COMMRX             Out        DTRRX is full                               Provided for legacy connection to an interrupt controller only. See
COMMTX             Out        DTRTX is empty                              Interrupt-driven use of the DCC on page H4-4924 and the pseudocode for
                                                                          function CheckForDCCInterrupts().
PMUEVENT[n:0]      Out        Performance Monitors event bus              See PMUEVENT bus on page K2-5497.
DBGNOPWRDWN        Out        Core no powerdown request                   Interface to a power controller. See DBGPRCR_EL1.CORENPDRQ.
DBGPWRUPREQ        Out        Core powerup request                        Interface to a power controller. See EDPRCR.COREPURQ.
DBGRSTREQ          Out        Warm reset request                          Interface to a power controller. See EDPRCR.CWRR.
DBGBUSCANCELREQ    Out        Allow asynchronous entry to Debug state     Extension to the bus interface. See EDRCR.CBRRQ.
DBGPWRDUP          In         Core powerup status                         Interface to a power controller. See EDPRSR.PU.
DBGROMADDR[n:12]   In         MDRAR_EL1.ROMADDR                           n depends on the size of the physical address space.
Table K2-1 Recommended debug interface signals (continued)
Name               Direction  Description                                 Notes
DBGROMADDRV        In         MDRAR_EL1.Valid                             -
PRESETDBG          In         External debug reset                        -
CPUPORESET         In         Cold reset                                  -
CORERESET          In         Warm reset                                  -
PSELDBG            In         Debug APB slave port                        For details see AMBA APB3. ARM recommends a single slave port for all
PENABLEDBG         In                                                     integrated debug components.
PRDATADBG[31:0]    Out                                                    memory-mapped and DAP accesses:
PWDATADBG[31:0]    In                                                     0  Memory-mapped access.
PADDRDBG[n:2] (a)  In                                                     1  DAP access.
PREADYDBG          Out
PSLVERRDBG         Out
PCLKDBG            In
PCLKENDBG          In
CTICHIN            In         Asynchronous CoreSight channel interface    For details, see CoreSight™ v1.0 Architecture Specification. The ACK
CTICHOUTACK        In                                                     signals are not required if the channel interface is synchronous.
CTICHOUT           Out
CTICHINACK         Out
CTIIRQ             Out        CTI interrupt, see Description and          Implements a handshake for an edge-sensitive interrupt.
CTIIRQACK          In         allocation of CTI triggers on page H5-4935
ATDATA[nx8-1:0]    Out        AMBA 4 ATB interface                        For details, see AMBA 4 ATB Protocol Specification, ATBv1.0 and ATBv1.1.
ATBYTES[n-1:0]     Out                                                    Only available if the OPTIONAL Trace extension is implemented.
ATID[6:0]          Out
ATREADY            In
ATVALID            Out
AFREADY            Out
AFVALID            Out
SYNCREQ            In
ATCLK              In
ATCLKEN            In
ATRESET            In
a. The value of n depends on the size of the address space occupied by the Debug port.
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Figure K2-1 shows the recommended debug interface. 
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Figure K2-1 Recommended external debug interface, including the APB3 slave port 
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K2.2 PMUEVENT bus 


The PMUEVENT bus exports Performance Monitor events from the PE to an on-chip agent. ARM recommends 
that it has the following characteristics: 


• The bus is synchronous. 

• The width of the bus is IMPLEMENTATION DEFINED. 

• It is IMPLEMENTATION DEFINED which events are exported on the bus. 

• Each exported event occupies a contiguous sub-field of the bus. ARM recommends that the sub-fields of the 
bus are occupied in the same order as the event numbers. 

• If the event can only occur once per cycle, it occupies a single bit. If the event can occur more than once per 
cycle, it is IMPLEMENTATION DEFINED how the event is encoded. The encoding depends on constraints such 
as the designated use of the event bus and the number of pins available. For example, the event can be 
encoded: 

- As a count, using a plain binary number. This is the most useful encoding when exporting to an 
external counter. It is not a useful encoding for exporting to a Trace extension external input. 

- As a count, using thermometer encoding. This is the most useful encoding when exporting to a Trace 
extension. 

- Using a single bit encoding to indicate whether the event count is zero or nonzero. This is useful for 
exporting to an activity monitor where the number of pins is constrained. 


If a Trace extension is implemented, the PMUEVENT bus is normally connected to the Trace extension using the 
external inputs. TRCEXTINSELR multiplexes a wide PMUEVENT bus to a narrow set of inputs. An external 
PMUEVENT bus might also be provided. For more information, contact ARM. 
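
The three example encodings for an event that can occur more than once per cycle can be illustrated with a short sketch. This is not architectural pseudocode: the function names and the `width` parameter (the number of bus bits allocated to the event's sub-field) are assumptions made for illustration, and the actual encoding remains IMPLEMENTATION DEFINED.

```python
def encode_binary(count: int, width: int) -> int:
    """Plain binary count: the sub-field carries the count itself."""
    if not 0 <= count < (1 << width):
        raise ValueError("count does not fit in the sub-field")
    return count


def encode_thermometer(count: int, width: int) -> int:
    """Thermometer code: the lowest `count` bits of the sub-field are set."""
    if not 0 <= count <= width:
        raise ValueError("count exceeds the sub-field width")
    return (1 << count) - 1


def encode_nonzero(count: int) -> int:
    """Single bit: indicates only whether the count is zero or nonzero."""
    if count < 0:
        raise ValueError("count must not be negative")
    return 1 if count > 0 else 0
```

With a 4-bit sub-field, a count of 3 is 0b0011 in plain binary and 0b0111 in thermometer encoding. A thermometer-coded sub-field of n bits can represent counts only up to n, whereas a binary sub-field of n bits can represent counts up to 2^n - 1.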





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. K2-5497 
ID092916 Non-Confidential 


Appendix K2 Recommended External Debug Interface 


K2.3 Recommended authentication interface 


An implementation of the ARMv8 architecture must support debug authentication described in Required debug 
authentication on page H1-4842. 


The details of the debug authentication interface are IMPLEMENTATION DEFINED, but ARM recommends the use of 
the CoreSight interface, which includes the following signals for external debug authentication: 


• DBGEN. 
• SPIDEN. 
• NIDEN. 
• SPNIDEN. 


ARM recommends an interface in which DBGEN and SPIDEN are also used for self-hosted Secure debug 
authentication if either: 


• EL3 is using AArch32 and SDCR.SPD == 0b00. 
• Secure EL1 is using AArch32 and MDCR_EL3.SPD32 == 0b00. 


If EL3 is not implemented and the PE is in Non-secure state, SPIDEN and SPNIDEN are not implemented, and the 
PE behaves as if these signals were tied LOW. 


If EL3 is not implemented and the PE is in Secure state, SPIDEN is usually connected to DBGEN and SPNIDEN 
is connected to NIDEN, but this is not required. The recommended interface is defined as if all four signals are 
implemented. 


How the authentication signals are driven is IMPLEMENTATION DEFINED. For example, the signals might be 
hard-wired, connected to fuses, or to an authentication module. The architecture permits PEs within a cluster to have 
independent authentication interfaces, but this is not required. ARM recommends that any Trace extension has the 
same authentication interface as the PE it is connected to. 


Table K2-2 shows the debug authentication pseudocode functions and the recommended implementations. 


Table K2-2 Recommended implementation of debug enable pseudocode functions 

Pseudocode function Description Implementation 
AArch32.SelfHostedSecurePrivilegedInvasiveDebugEnabled() Secure invasive self-hosted debug enabled in AArch32 state (legacy) (DBGEN AND SPIDEN) 
ExternalSecureNoninvasiveDebugEnabled() Secure non-invasive debug enabled (DBGEN OR NIDEN) AND (SPIDEN OR SPNIDEN) 
ExternalSecureInvasiveDebugEnabled() Secure invasive debug enabled (DBGEN AND SPIDEN) 
ExternalNoninvasiveDebugEnabled() Non-secure non-invasive debug enabled (DBGEN OR NIDEN) 
ExternalInvasiveDebugEnabled() Non-secure invasive debug enabled DBGEN 








The Debug_authentication() pseudocode function on shared/debug on page J1-5374 defines the authentication 
signals DBGEN, SPIDEN, NIDEN and SPNIDEN. 
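
The recommended implementations in Table K2-2 are simple Boolean functions of the four authentication signals. The following sketch restates them in Python for illustration only. The `AuthSignals` class and the snake_case function names are assumptions introduced here. They are not the architectural pseudocode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthSignals:
    """Sampled levels of the four authentication signals (True = HIGH)."""
    dbgen: bool
    spiden: bool
    niden: bool
    spniden: bool


def external_invasive_debug_enabled(s: AuthSignals) -> bool:
    # Non-secure invasive debug: DBGEN
    return s.dbgen


def external_noninvasive_debug_enabled(s: AuthSignals) -> bool:
    # Non-secure non-invasive debug: (DBGEN OR NIDEN)
    return s.dbgen or s.niden


def external_secure_invasive_debug_enabled(s: AuthSignals) -> bool:
    # Secure invasive debug: (DBGEN AND SPIDEN)
    return s.dbgen and s.spiden


def external_secure_noninvasive_debug_enabled(s: AuthSignals) -> bool:
    # Secure non-invasive debug: (DBGEN OR NIDEN) AND (SPIDEN OR SPNIDEN)
    return (s.dbgen or s.niden) and (s.spiden or s.spniden)


def aarch32_self_hosted_secure_privileged_invasive_debug_enabled(s: AuthSignals) -> bool:
    # Legacy Secure self-hosted debug in AArch32 state: (DBGEN AND SPIDEN)
    return s.dbgen and s.spiden
```

For example, with only NIDEN and SPNIDEN HIGH, Secure and Non-secure non-invasive debug are enabled while both forms of invasive debug are disabled.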
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K2.4 Management registers and CoreSight compliance 
The CoreSight architecture requires the implementation of a set of management registers that occupy the memory 
map from 0xF00 upwards in each of the debug components. 
CoreSight compliance and complete implementation of the management registers is OPTIONAL, but ARM 
recommends that the registers are implemented. 
The CoreSight architecture specification recommends that any integration test registers are implemented starting 
from 0xEFC downwards. Each of the debug components has an IMPLEMENTATION DEFINED region from 0xE80 to 
0xEFC for this purpose. 
K2.4.1 CoreSight interface register map 
Table K2-3 shows the external management register maps for the following registers: 
ED These are the external debug registers. 
CTI These are the Cross-trigger interface registers. 
PMU These are the Performance Monitors registers. 
Table K2-3 CoreSight interface register map 
Offset ED CTI PMU Name 
0xF00 EDITCTRL CTIITCTRL PMITCTRL Integration Model Control registers 
0xF04-0xF9C - - - Reserved, RES0 
0xFA0 DBGCLAIMSET_EL1(a) CTICLAIMSET(b) - CLAIM Tag Set registers 
0xFA4 DBGCLAIMCLR_EL1(a) CTICLAIMCLR(b) - CLAIM Tag Clear registers 
0xFA8 EDDEVAFF0(a) CTIDEVAFF0(c) PMDEVAFF0 Device Affinity registers 
0xFAC EDDEVAFF1(a) CTIDEVAFF1(c) PMDEVAFF1 
0xFB0 EDLAR(d) CTILAR(d) PMLAR(d) Lock Access register 
0xFB4 EDLSR(d) CTILSR(d) PMLSR(d) Lock Status register 
0xFB8 DBGAUTHSTATUS_EL1(a) CTIAUTHSTATUS PMAUTHSTATUS Authentication Status register 
0xFBC EDDEVARCH CTIDEVARCH PMDEVARCH Device Architecture register 
0xFC0 EDDEVID2(a) CTIDEVID2(a) - Device ID register 
0xFC4 EDDEVID1(a) CTIDEVID1(a) - 
0xFC8 EDDEVID(a) CTIDEVID(a) - 
0xFCC EDDEVTYPE CTIDEVTYPE PMDEVTYPE Device Type register 
0xFD0 EDPIDR4 CTIPIDR4 PMPIDR4 Peripheral ID registers 
0xFD4-0xFDC - - - Reserved, RES0 
0xFE0 EDPIDR0 CTIPIDR0 PMPIDR0 Peripheral ID registers 
0xFE4 EDPIDR1 CTIPIDR1 PMPIDR1 
0xFE8 EDPIDR2 CTIPIDR2 PMPIDR2 
0xFEC EDPIDR3 CTIPIDR3 PMPIDR3 
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Table K2-3 Coresight interface register map (continued) 




















Offset ED CTI PMU Name 
0xFF0 EDCIDR0 CTICIDR0 PMCIDR0 Component ID registers 
0xFF4 EDCIDR1 CTICIDR1 PMCIDR1 
0xFF8 EDCIDR2 CTICIDR2 PMCIDR2 
0xFFC EDCIDR3 CTICIDR3 PMCIDR3 

a. This register must always be implemented, regardless of whether the component is CoreSight compliant. 

b. If implemented, the number of CLAIM bits is IMPLEMENTATION DEFINED and can be discovered by reading CLAIMSET. 

c. If the CTI implements CTIv1, this register is not implemented. See the register description for details. 

d. The Software lock registers are defined as part of CoreSight compliance, but their contents depend on the type of access that is made and 
whether the OPTIONAL Software lock is implemented. See the register description for details. 
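
As an illustration of how a debugger might use the register map in Table K2-3, the following sketch maps a component and offset to a register mnemonic. The function name `register_name` and the dictionary layout are assumptions made for this example. The sketch ignores the footnotes and does not model whether a register is actually implemented.

```python
# Mnemonic prefix used by each component for the uniformly named registers.
_PREFIX = {"ED": "ED", "CTI": "CTI", "PMU": "PM"}

# Offsets at which all three components use <prefix><suffix>.
_COMMON = {
    0xF00: "ITCTRL", 0xFA8: "DEVAFF0", 0xFAC: "DEVAFF1",
    0xFB0: "LAR", 0xFB4: "LSR", 0xFBC: "DEVARCH", 0xFCC: "DEVTYPE",
    0xFD0: "PIDR4", 0xFE0: "PIDR0", 0xFE4: "PIDR1", 0xFE8: "PIDR2",
    0xFEC: "PIDR3", 0xFF0: "CIDR0", 0xFF4: "CIDR1", 0xFF8: "CIDR2",
    0xFFC: "CIDR3",
}

# Offsets whose mnemonic is irregular or that exist only for some components.
_SPECIAL = {
    ("ED", 0xFA0): "DBGCLAIMSET_EL1", ("CTI", 0xFA0): "CTICLAIMSET",
    ("ED", 0xFA4): "DBGCLAIMCLR_EL1", ("CTI", 0xFA4): "CTICLAIMCLR",
    ("ED", 0xFB8): "DBGAUTHSTATUS_EL1", ("CTI", 0xFB8): "CTIAUTHSTATUS",
    ("PMU", 0xFB8): "PMAUTHSTATUS",
    ("ED", 0xFC0): "EDDEVID2", ("CTI", 0xFC0): "CTIDEVID2",
    ("ED", 0xFC4): "EDDEVID1", ("CTI", 0xFC4): "CTIDEVID1",
    ("ED", 0xFC8): "EDDEVID", ("CTI", 0xFC8): "CTIDEVID",
}


def register_name(component: str, offset: int):
    """Return the mnemonic at `offset` for 'ED', 'CTI' or 'PMU', or None."""
    if component not in _PREFIX:
        raise ValueError("unknown component: " + component)
    if (component, offset) in _SPECIAL:
        return _SPECIAL[(component, offset)]
    if offset in _COMMON:
        return _PREFIX[component] + _COMMON[offset]
    return None  # reserved, or no register for this component
```

For example, `register_name("CTI", 0xFB0)` yields `CTILAR`, and `register_name("PMU", 0xFA0)` yields `None` because the PMU has no CLAIM Tag Set register in the table.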


K2.4.2 Management register access permissions 


Access to the OPTIONAL Integration Control register (ITCTRL) is IMPLEMENTATION DEFINED. 
If the Debug power domain is off, all register accesses return an error. 


Otherwise, Table K2-4 on page K2-5501, Table K2-5 on page K2-5502, and Table K2-6 on page K2-5503 show the 
response to accesses by the external debug interface to the CoreSight management registers. For definitions of the 
terms used in the tables, see External debug interface register access permissions summary on page H8-4972. 


Note 
Access to the CoreSight management registers is not affected by the values of EDAD and EPMAD. 








Table K2-4 on page K2-5501, Table K2-5 on page K2-5502, and Table K2-6 on page K2-5503 include reserved 
management registers, because the CoreSight architecture requires that these registers are always RES0. The 
description in Reserved and unallocated registers on page H8-4972 does not apply to reserved management 
registers if the implementation is CoreSight compliant. 


If OPTIONAL memory-mapped access to the external debug interface is supported, there are additional constraints 
on memory-mapped accesses. See Register access permissions for memory-mapped accesses on page H8-4968. 


The terms in Table K2-4 on page K2-5501, Table K2-5 on page K2-5502, and Table K2-6 on page K2-5503 are 
defined as follows: 


Domain This describes the power domain in which the register is logically implemented. Registers described 
as implemented in the Core power domain might be implemented in the Debug power domain, as 
long as they exhibit the required behavior. 

Conditions This lists the conditions under which the access is attempted. 


To determine the access permissions for a register, read these columns from left to right, and stop at 
the first column which lists the condition as being true. 


The conditions are: 


Off EDPRSR.PU == 0. The Core power domain is completely off, or in low-power state. In 
these cases the Core power domain registers cannot be accessed. 


Note 


If debug power is off, then all external debug interface accesses return an error. 





DLK DoubleLockStatus() == TRUE. The OS Double Lock is locked. 
OSLK OSLSR.OSLK == 1. The OS Lock is locked. 
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Default This provides the default access permissions, if there are no conditions that prevent access to the 
register. 
SLK This provides the modified default access permissions for OPTIONAL memory-mapped accesses to 


the external debug interface if the OPTIONAL Software Lock is locked. See Register access 
permissions for memory-mapped accesses on page H8-4968. For all other accesses, this column is 
ignored. 


The access permissions are: 


- This means that the default access permission applies. See the Default column, or the SLK column, 


if applicable. 
RO This means that the register or field is read-only. 
RW This means that the register or field is read/write. Individual fields within the register might be RO. 


See the relevant register description for details. 


RC This means that the bit clears to 0 after a read. 

(SE) This means that accesses to this register have indirect write side effects. A side effect occurs when 
a direct read or a direct write of a register creates an indirect write to the same register or to another 
register. 

WO This means that the register or field is write-only. 

WI This means that the register or field ignores writes. 

IMP DEF This means that the access permissions are IMPLEMENTATION DEFINED. 
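
The rules above for reading the access permission tables can be sketched as a small lookup. This is an illustrative interpretation, not architectural pseudocode: the function name `resolve_permission`, the dictionary representation of a table row, and the `software_lock_applies` flag (true only for an OPTIONAL memory-mapped access made while the OPTIONAL Software Lock is locked) are assumptions introduced here.

```python
CONDITION_COLUMNS = ("Off", "DLK", "OSLK")  # priority, left to right


def resolve_permission(row: dict, active: set, software_lock_applies: bool = False) -> str:
    """Return the access permission for one table row.

    row    -- maps 'Off', 'DLK', 'OSLK', 'Default' and 'SLK' to the table entries
    active -- names of the conditions that are currently true
    """
    # Stop at the first condition column, left to right, whose condition is true.
    for column in CONDITION_COLUMNS:
        if column in active:
            if row[column] != "-":
                return row[column]
            break  # '-' means that the default access permission applies
    # The SLK column modifies the default only when the Software Lock applies.
    if software_lock_applies and row["SLK"] != "-":
        return row["SLK"]
    return row["Default"]


# Row for DBGCLAIMSET_EL1, as listed in Table K2-4.
DBGCLAIMSET_EL1 = {"Off": "Error", "DLK": "Error", "OSLK": "Error",
                   "Default": "RW (SE)", "SLK": "RO"}
```

With this row, an access while the OS Lock is locked resolves to `Error`, an access with no condition true resolves to `RW (SE)`, and a memory-mapped access with the Software Lock locked resolves to `RO`.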


Table K2-4 External debug interface access permissions, CoreSight registers (debug) 

Conditions (priority left to right): Off, DLK, OSLK 

Offset Register Domain Off DLK OSLK Default SLK 
0xF00 EDITCTRL IMP DEF IMPLEMENTATION DEFINED IMP DEF RO/WI 
0xF04-0xF8C Reserved Debug - - - RES0 - 
0xFA0 DBGCLAIMSET_EL1 Core Error Error Error RW (SE) RO 
0xFA4 DBGCLAIMCLR_EL1 Core Error Error Error RW (SE) RO 
0xFA8 EDDEVAFF0 Debug - - - RO - 
0xFAC EDDEVAFF1 Debug - - - RO - 
0xFB0 EDLAR Debug - - - WO (SE) - 
0xFB4 EDLSR Debug - - - RO - 
0xFB8 DBGAUTHSTATUS_EL1 Debug - - - RO - 
0xFBC EDDEVARCH Debug - - - RO - 
0xFC0 EDDEVID2 Debug - - - RO - 
0xFC4 EDDEVID1 Debug - - - RO - 
0xFC8 EDDEVID Debug - - - RO - 
0xFCC EDDEVTYPE Debug - - - RO - 
0xFD0 EDPIDR4 Debug - - - RO - 
0xFD4-0xFDC Reserved Debug - - - RES0 - 
0xFE0 EDPIDR0 Debug - - - RO - 
0xFE4 EDPIDR1 Debug - - - RO - 
0xFE8 EDPIDR2 Debug - - - RO - 
0xFEC EDPIDR3 Debug - - - RO - 
0xFF0 EDCIDR0 Debug - - - RO - 
0xFF4 EDCIDR1 Debug - - - RO - 
0xFF8 EDCIDR2 Debug - - - RO - 
0xFFC EDCIDR3 Debug - - - RO - 





Table K2-5 External debug interface access permissions, CoreSight registers (CTI) 

Conditions (priority left to right): Off, DLK, OSLK 

Offset Register Domain Off DLK OSLK Default SLK 
0xF00 CTIITCTRL IMP DEF IMPLEMENTATION DEFINED IMP DEF RO/WI 
0xF04-0xF8C Reserved Debug - - - RES0 - 
0xFA0 CTICLAIMSET Debug - - - RW (SE) RO 
0xFA4 CTICLAIMCLR Debug - - - RW (SE) RO 
0xFA8 CTIDEVAFF0 Debug - - - RO - 
0xFAC CTIDEVAFF1 Debug - - - RO - 
0xFB0 CTILAR Debug - - - WO (SE) - 
0xFB4 CTILSR Debug - - - RO - 
0xFB8 CTIAUTHSTATUS Debug - - - RO - 
0xFBC CTIDEVARCH Debug - - - RO - 
0xFC0 CTIDEVID2 Debug - - - RO - 
0xFC4 CTIDEVID1 Debug - - - RO - 
0xFC8 CTIDEVID Debug - - - RO - 
0xFCC CTIDEVTYPE Debug - - - RO - 
0xFD0 CTIPIDR4 Debug - - - RO - 
0xFD4-0xFDC Reserved Debug - - - RES0 - 
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Table K2-5 External debug interface access permissions, CoreSight registers (CTI) (continued) 




























































































Conditions (priority left to right): Off, DLK, OSLK 
Offset Register Domain Off DLK OSLK Default SLK 
0xFE0 CTIPIDR0 Debug - - - RO - 
0xFE4 CTIPIDR1 Debug - - - RO - 
0xFE8 CTIPIDR2 Debug - - - RO - 
0xFEC CTIPIDR3 Debug - - - RO - 
0xFF0 CTICIDR0 Debug - - - RO - 
0xFF4 CTICIDR1 Debug - - - RO - 
0xFF8 CTICIDR2 Debug - - - RO - 
0xFFC CTICIDR3 Debug - - - RO - 
Table K2-6 External debug interface access permissions, CoreSight registers (PMU) 
Conditions (priority left to right): Off, DLK, OSLK 
Offset Register Domain Off DLK OSLK Default SLK 
0xF00 PMITCTRL IMP DEF IMPLEMENTATION DEFINED IMP DEF RO/WI 
0xF04-0xFA4 Reserved Debug - - - RES0 - 
0xFA8 PMDEVAFF0 Debug - - - RO - 
0xFAC PMDEVAFF1 Debug - - - RO - 
0xFB0 PMLAR Debug - - - WO (SE) - 
0xFB4 PMLSR Debug - - - RO - 
0xFB8 PMAUTHSTATUS Debug - - - RO - 
0xFBC PMDEVARCH Debug - - - RO - 
0xFC0-0xFC8 Reserved Debug - - - RES0 - 
0xFCC PMDEVTYPE Debug - - - RO - 
0xFD0 PMPIDR4 Debug - - - RO - 
0xFD4-0xFDC Reserved Debug - - - RES0 - 
0xFE0 PMPIDR0 Debug - - - RO - 
0xFE4 PMPIDR1 Debug - - - RO - 
0xFE8 PMPIDR2 Debug - - - RO - 
0xFEC PMPIDR3 Debug - - - RO - 
0xFF0 PMCIDR0 Debug - - - RO - 
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Table K2-6 External debug interface access permissions, CoreSight registers (PMU) (continued) 





Conditions (priority left to right): Off, DLK, OSLK 
Offset Register Domain Off DLK OSLK Default SLK 
0xFF4 PMCIDR1 Debug - - - RO - 
0xFF8 PMCIDR2 Debug - - - RO - 
0xFFC PMCIDR3 Debug - - - RO - 
K2.4.3 Management register resets 


Table K2-7 shows the management register resets. This table does not include: 


• Read-only identification registers that have a fixed value from reset. These registers include those with the 
DEVAFFn, DEVARCH, DEVID{n}, DEVTYPE, PIDRn, and CIDRn mnemonics. 


• Registers that have the AUTHSTATUS mnemonic. This is a read-only status register that reflects the status 
outside of the reset domain of the register. 


• Registers that have the LAR mnemonic. These are write-only registers that only have an effect on writes. 


All other fields in the management registers are reset to an IMPLEMENTATION DEFINED value which can be 
UNKNOWN. The registers are in the reset domain specified in the table. 


Table K2-7 shows a summary of the management register resets. 


Table K2-7 Management register resets 

Register Reset domain Field Value Description 
CTIITCTRL, EDITCTRL, PMITCTRL IMPLEMENTATION DEFINED IME 0 Integration mode enable 
DBGCLAIMCLR_EL1 Cold reset CLAIM 0x00 CLAIM tags 
CTICLAIMCLR External debug CLAIM 0x00000000 CLAIM tags 
CTILSR(a), EDLSR(a), PMLSR(a) External debug SLK 1 Software Lock 

a. Only if the OPTIONAL Software Lock is implemented. 


K2.4.4 About the Peripheral identification scheme 


The Peripheral Identification scheme provides the standard information required by all components that conform to 
the ARM® Debug Interface Architecture Specification, ADIv5.0 to ADIv5.2, which implements the CoreSight 
identification scheme. The Peripheral ID Registers identify a peripheral in a particular namespace. For more 
information, see the ARM® CoreSight™ v2.0 Architecture Specification. 

Table K2-8 on page K2-5505 lists the Peripheral ID Registers that make up the Peripheral Identification scheme for 
each component. 
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Table K2-8 Peripheral Identification Registers 





























Register offset Description Reference: External Debug CTI Performance Monitors 
0xFD0 Peripheral ID4 EDPIDR4 CTIPIDR4 PMPIDR4 
0xFD4 Reserved for Peripheral ID5 - - - 
0xFD8 Reserved for Peripheral ID6 - - - 
0xFDC Reserved for Peripheral ID7 - - - 
0xFE0 Peripheral ID0 EDPIDR0 CTIPIDR0 PMPIDR0 
0xFE4 Peripheral ID1 EDPIDR1 CTIPIDR1 PMPIDR1 
0xFE8 Peripheral ID2 EDPIDR2 CTIPIDR2 PMPIDR2 
0xFEC Peripheral ID3 EDPIDR3 CTIPIDR3 PMPIDR3 





Figure K2-2 shows the register field allocation scheme for the Peripheral ID Registers. 


Bits [31:8] RES0 
Bits [7:0] Peripheral ID data 


Figure K2-2 Peripheral ID register format 


Software can consider the eight Peripheral ID Registers as defining a single 64-bit Peripheral ID, as shown in 
Figure K2-3. 


Actual Peripheral ID Register fields, bits [7:0] of each register, mapped to the conceptual 64-bit Peripheral ID: 

EDPIDR7 bits [63:56] 
EDPIDR6 bits [55:48] 
EDPIDR5 bits [47:40] 
EDPIDR4 bits [39:32] 
EDPIDR3 bits [31:24] 
EDPIDR2 bits [23:16] 
EDPIDR1 bits [15:8] 
EDPIDR0 bits [7:0] 

Figure K2-3 Mapping between Peripheral ID Registers and a 64-bit Peripheral ID Value 

Figure K2-4 shows the fields in the 64-bit Peripheral ID value, and includes the field values for fields that: 

• Have fixed values, including the bits that are reserved. 

• Have fixed values in an implementation that is designed by ARM. 


For more information about the fields and their values see Table K2-9 on page K2-5506. 


Conceptual 64-bit Peripheral ID, fields from most significant to least significant: 

Reserved, RES0 
4KB count 
JEP106 Continuation code 
RevAnd 
Customer modified 
Revision 
Uses JEP106 ID code 
JEP106 ID code 
Part number 

Figure K2-4 Peripheral ID fields, with values for an implementation designed by ARM 
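
The following sketch shows one way software might assemble the conceptual 64-bit Peripheral ID and split it into fields. It is an illustration under stated assumptions: the function names are hypothetical, the bit positions are derived by packing the fields from least significant to most significant with the sizes given in Table K2-9, and the part number 0xD03 in the usage note is an arbitrary example value, not one taken from this manual.

```python
# (field name, width in bits), least significant field first.
FIELDS = [
    ("part_number", 12),
    ("jep106_id_code", 7),
    ("uses_jep106", 1),
    ("revision", 4),
    ("customer_modified", 4),
    ("revand", 4),
    ("jep106_continuation_code", 4),
    ("size_4kb_count_log2", 4),
]


def peripheral_id(pidr: list) -> int:
    """Combine PIDR0..PIDR7 (list index = register number) into a 64-bit value.

    Only bits [7:0] of each register are used; bits [31:8] are RES0.
    """
    if len(pidr) != 8:
        raise ValueError("expected eight Peripheral ID register values")
    value = 0
    for n, reg in enumerate(pidr):
        value |= (reg & 0xFF) << (8 * n)
    return value


def decode_peripheral_id(pid64: int) -> dict:
    """Split the 64-bit Peripheral ID into its named fields."""
    fields = {}
    shift = 0
    for name, width in FIELDS:
        fields[name] = (pid64 >> shift) & ((1 << width) - 1)
        shift += width
    return fields
```

For example, register values 0x03, 0xBD, 0x0B, 0x00, 0x04 for PIDR0 to PIDR4, with PIDR5 to PIDR7 reading as zero, decode to part number 0xD03, JEP106 ID code 0x3B, and continuation code 0x4, which are the designer values this section gives for ARM.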
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Table K2-9 shows the fields in the Peripheral ID. 


Table K2-9 Fields in the Peripheral Identification Registers 

Name Size Description Registers 
4KB count 4 bits Log2 of the number of 4KB blocks occupied by the implementation. EDPIDR4, CTIPIDR4, PMPIDR4 
JEP106 code 4+7 bits Identifies the designer of the implementation. This value consists of a 4-bit continuation code, also described as the bank number, and a 7-bit identification code. For implementations designed by ARM, the continuation code is 0x4, indicating bank 5, and the identity code is 0x3B. EDPIDR1, EDPIDR2, EDPIDR4, CTIPIDR1, CTIPIDR2, CTIPIDR4, PMPIDR1, PMPIDR2, PMPIDR4 
RevAnd 4 bits Manufacturing revision number. Indicates a late modification to the implementation, usually as a result of an Engineering Change Order (ECO). This field starts at 0x0 and is incremented by the integrated circuit manufacturer on metal fixes. EDPIDR3, CTIPIDR3, PMPIDR3 
Customer modified 4 bits Indicates an endorsed modification to the implementation. If the system designer cannot modify the implementation supplied by the implementation designer then this field is RES0. EDPIDR3, CTIPIDR3, PMPIDR3 
Revision 4 bits Revision number for the implementation. Starts at 0x0 and increments by 1 at both major and minor revisions. EDPIDR2, CTIPIDR2, PMPIDR2 
Uses JEP106 ID code 1 bit This bit is set to 1 when a JEP106 identification code is used. This bit must be 1 on all ARMv8 implementations. EDPIDR2, CTIPIDR2, PMPIDR2 
Part number 12 bits Part number for the implementation. Each organization designing to the ARM Debug architecture specification keeps its own part number list. EDPIDR0, EDPIDR1, CTIPIDR0, CTIPIDR1, PMPIDR0, PMPIDR1 





A component is identified uniquely by the combination of the following fields: 

• JEP106 continuation code. 
• JEP106 identity code. 
• Part number. 
• Revision. 
• Customer Modified. 
• RevAnd. 


For components with a Component class of 0x9, Debug component, indicated by the Component Identification 
Registers, multiple components can have the same Part number, provided each component has a different CoreSight 
Device type. However, ARM strongly recommends that each device has a unique Part number. For more 
information: 


• About the Component Identification Registers, see About the Component Identification scheme on 
page K2-5507. 

• About the CoreSight Device type, see EDDEVTYPE, CTIDEVTYPE, or PMDEVTYPE. 

• About CoreSight components and their identification, see the ARM® Debug Interface Architecture 
Specification, ADIv5.0 to ADIv5.2. 
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Allocating revisions and part numbers 


Within the Peripheral Identification registers, the allocation of major and minor revisions, part numbers, and 
customer-modified fields is IMPLEMENTATION DEFINED, with the following restrictions: 

• The REVISION field must increase monotonically with revisions. 

Note 
ARM recommends that the REVISION field is updated for each update to the RTL, regardless of whether 
this is a major or minor update. 

• The REVAND field should increase monotonically with revisions. 

Note 
ARM recommends that the REVAND field is used only for post-release changes, for example those due to 
engineering change order (ECO) fixes related to the debug component of the processor. 

• The PART field must have a degree of uniqueness: 

- Two component designs can have the same part number so long as they are sub-components of the 
same part and the programmers’ model for the part has the means to disambiguate sub-components. 

- Otherwise, two component designs must have unique part numbers. 


The DEVARCH (if implemented) or DEVTYPE (otherwise) register provides the means to disambiguate 
sub-components of the Debug Architecture. 


A ROM table has no DEVTYPE or DEVARCH register. However, if it is the only CLASS 0x1 component in a 
processor cluster, it can still be disambiguated. 


Multiple instances of the same component design have the same part number. 


K2.4.5 About the Component Identification scheme 


The Component Identification Registers identify the processor as an ARM Debug Interface v5 component. For more 
information, see the ARM® Debug Interface Architecture Specification, ADIv5.0 to ADIv5.2 and the ARM® 
CoreSight™ v2.0 Architecture Specification. 


The Component Identification Registers occupy the last four words of the 4KB block of debug registers. 


Table K2-10 Component Identification Registers 

Register offset Description External debug CTI Performance Monitors 
0xFF0 Component ID0 EDCIDR0 CTICIDR0 PMCIDR0 
0xFF4 Component ID1 EDCIDR1 CTICIDR1 PMCIDR1 
0xFF8 Component ID2 EDCIDR2 CTICIDR2 PMCIDR2 
0xFFC Component ID3 EDCIDR3 CTICIDR3 PMCIDR3 





Figure K2-5 shows the register field allocation scheme for the Component ID Registers. 


Bits [31:8] RES0 
Bits [7:0] Component ID Data 


Figure K2-5 Component ID Register format 


Software can consider the four Component ID Registers as defining a single 32-bit Component ID, as shown in 
Figure K2-6 on page K2-5508. 
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Actual ComponentiD register fields 


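Bits [7:0] of each register map to the conceptual 32-bit Component ID as follows: 

EDCIDR3 bits [31:24] 
EDCIDR2 bits [23:16] 
EDCIDR1 bits [15:8] 
EDCIDR0 bits [7:0] 

Fields of the conceptual 32-bit Component ID: 

Bits [31:16] Preamble 
Bits [15:12] Component class 
Bits [11:0] Preamble 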


Figure K2-6 Mapping between Component ID Registers and a 32-bit Component ID Value 
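
As an illustration, software might assemble the 32-bit Component ID and extract the Component class as sketched below. The function names are hypothetical. The preamble constant 0xB105000D is the value defined by the CoreSight architecture, not a value stated in this appendix, so treat it as an assumption to check against that specification.

```python
# Expected preamble bits with the Component class field masked out
# (assumed from the CoreSight architecture, not from this appendix).
EXPECTED_PREAMBLE = 0xB105000D


def component_id(cidr: list) -> int:
    """Combine CIDR0..CIDR3 (list index = register number) into a 32-bit value."""
    if len(cidr) != 4:
        raise ValueError("expected four Component ID register values")
    value = 0
    for n, reg in enumerate(cidr):
        value |= (reg & 0xFF) << (8 * n)  # only bits [7:0] carry ID data
    return value


def component_class(cid32: int) -> int:
    """Component class, bits [15:12] of the conceptual Component ID."""
    return (cid32 >> 12) & 0xF


def preamble_is_valid(cid32: int) -> bool:
    """Check the preamble in bits [31:16] and [11:0]."""
    return (cid32 & 0xFFFF0FFF) == EXPECTED_PREAMBLE
```

For a Debug component, which has Component class 0x9, register values 0x0D, 0x90, 0x05, and 0xB1 for CIDR0 to CIDR3 combine to 0xB105900D.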
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Recommendations for Performance Monitors Event 
Numbers for IMPLEMENTATION DEFINED Events 


This appendix describes the ARM recommendations for the use of the IMPLEMENTATION DEFINED event numbers. 
It contains the following sections: 


• ARM recommendations for IMPLEMENTATION DEFINED event numbers on page K3-5510. 


• Summary of events for exceptions taken to an Exception level using AArch64 on page K3-5524. 
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K3.1 


ARM recommendations for IMPLEMENTATION DEFINED event numbers 


These are the ARM recommendations for the use of the IMPLEMENTATION DEFINED event numbers. ARM does not 
define these events as rigorously as those in the architectural and microarchitectural event lists, and an 
implementation might: 


• Modify the definition of an event to better correspond to the implementation. 


• Not use some, or many, of these event numbers. 


Table K3-1 lists the PMU IMPLEMENTATION DEFINED event numbers in event number order. 


Table K3-1 PMU IMPLEMENTATION DEFINED event numbers 

Event number Event mnemonic Description 
0x040 L1D_CACHE_RD Attributable Level 1 data cache access, read 
0x041 L1D_CACHE_WR Attributable Level 1 data cache access, write 
0x042 L1D_CACHE_REFILL_RD(a) Attributable Level 1 data cache refill, read 
0x043 L1D_CACHE_REFILL_WR(a) Attributable Level 1 data cache refill, write 
0x044 L1D_CACHE_REFILL_INNER Attributable Level 1 data cache refill, inner 
0x045 L1D_CACHE_REFILL_OUTER Attributable Level 1 data cache refill, outer 
0x046 L1D_CACHE_WB_VICTIM Attributable Level 1 data cache Write-Back, victim 
0x047 L1D_CACHE_WB_CLEAN Level 1 data cache Write-Back, cleaning and coherency 
0x048 L1D_CACHE_INVAL Attributable Level 1 data cache invalidate 
0x049-0x04B - Reserved 
0x04C L1D_TLB_REFILL_RD(a) Attributable Level 1 data TLB refill, read 
0x04D L1D_TLB_REFILL_WR(a) Attributable Level 1 data TLB refill, write 
0x04E L1D_TLB_RD Attributable Level 1 data or unified TLB access, read 
0x04F L1D_TLB_WR Attributable Level 1 data or unified TLB access, write 
0x050 L2D_CACHE_RD Attributable Level 2 data cache access, read 
0x051 L2D_CACHE_WR Attributable Level 2 data cache access, write 
0x052 L2D_CACHE_REFILL_RD(a) Attributable Level 2 data cache refill, read 
0x053 L2D_CACHE_REFILL_WR(a) Attributable Level 2 data cache refill, write 
0x054-0x055 - Reserved 
0x056 L2D_CACHE_WB_VICTIM Attributable Level 2 data cache Write-Back, victim 
0x057 L2D_CACHE_WB_CLEAN Level 2 data cache Write-Back, cleaning and coherency 
0x058 L2D_CACHE_INVAL Attributable Level 2 data cache invalidate 
0x059-0x05B - Reserved 
0x05C L2D_TLB_REFILL_RD(a) Attributable Level 2 data or unified TLB refill, read 
0x05D L2D_TLB_REFILL_WR(a) Attributable Level 2 data or unified TLB refill, write 
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Table K3-1 PMU IMPLEMENTATION DEFINED event numbers (continued) 





Event number Event mnemonic Description 
0x05E L2D_TLB_RD Attributable Level 2 data or unified TLB access, read 
0x05F L2D_TLB_WR Attributable Level 2 data or unified TLB access, write 
0x060 BUS_ACCESS_RD Bus access, read 
0x061 BUS_ACCESS_WR Bus access, write 
0x062 BUS_ACCESS_SHARED Bus access, Normal, Cacheable, Shareable 
0x063 BUS_ACCESS_NOT_SHARED Bus access, not Normal, Cacheable, Shareable 
0x064 BUS_ACCESS_NORMAL Bus access, normal 
0x065 BUS_ACCESS_PERIPH Bus access, peripheral 
0x066 MEM_ACCESS_RD Data memory access, read 
0x067 MEM_ACCESS_WR Data memory access, write 
0x068 UNALIGNED_LD_SPEC Unaligned access, read 
0x069 UNALIGNED_ST_SPEC Unaligned access, write 
0x06A UNALIGNED_LDST_SPEC Unaligned access 
0x06B - Reserved 
0x06C LDREX_SPEC Exclusive operation speculatively executed, LDREX or LDX 
0x06D STREX_PASS_SPEC Exclusive operation speculatively executed, STREX or STX pass 
0x06E STREX_FAIL_SPEC Exclusive operation speculatively executed, STREX or STX fail 
0x06F STREX_SPEC Exclusive operation speculatively executed, STREX or STX 
0x070 LD_SPEC Operation speculatively executed, load 
0x071 ST_SPEC Operation speculatively executed, store 
0x072 LDST_SPEC Operation speculatively executed, load or store 
0x073 DP_SPEC Operation speculatively executed, integer data processing 
0x074 ASE_SPEC Operation speculatively executed, Advanced SIMD instruction 
0x075 VFP_SPEC Operation speculatively executed, floating-point instruction 
0x076 PC_WRITE_SPEC Operation speculatively executed, software change of the PC 
0x077 CRYPTO_SPEC Operation speculatively executed, Cryptographic instruction 
0x078 BR_IMMED_SPEC Branch speculatively executed, immediate branch 
0x079 BR_RETURN_SPEC Branch speculatively executed, procedure return 
0x07A BR_INDIRECT_SPEC Branch speculatively executed, indirect branch 
0x07B - Reserved 
0x07C ISB_SPEC Barrier speculatively executed, ISB 
0x07D DSB_SPEC Barrier speculatively executed, DSB 
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Table K3-1 PMU IMPLEMENTATION DEFINED event numbers (continued) 





Event number Event mnemonic Description 
0x07E DMB_SPEC Barrier speculatively executed, DMB 
0x07F-0x080 - Reserved 
0x081 EXC_UNDEF Exception taken, Other synchronous 
0x082 EXC_SVC Exception taken, Supervisor Call 
0x083 EXC_PABORT Exception taken, Instruction Abort 
0x084 EXC_DABORT Exception taken, Data Abort and SError 
0x085 - Reserved 
0x086 EXC_IRQ Exception taken, IRQ 
0x087 EXC_FIQ Exception taken, FIQ 
0x088 EXC_SMC Exception taken, Secure Monitor Call 
0x089 - Reserved 
0x08A EXC_HVC Exception taken, Hypervisor Call 
0x08B EXC_TRAP_PABORT Exception taken, Instruction Abort not taken locally (b) 
0x08C EXC_TRAP_DABORT Exception taken, Data Abort or SError not taken locally (b) 
0x08D EXC_TRAP_OTHER Exception taken, Other traps not taken locally (b) 
0x08E EXC_TRAP_IRQ Exception taken, IRQ not taken locally (b) 
0x08F EXC_TRAP_FIQ Exception taken, FIQ not taken locally (b) 
0x090 RC_LD_SPEC Release consistency operation speculatively executed, Load-Acquire 
0x091 RC_ST_SPEC Release consistency operation speculatively executed, Store-Release 
0x092-0x09F - Reserved 
0x0A0 L3D_CACHE_RD Attributable Level 3 data or unified cache access, read 
0x0A1 L3D_CACHE_WR Attributable Level 3 data or unified cache access, write 
0x0A2 L3D_CACHE_REFILL_RD(a) Attributable Level 3 data or unified cache refill, read 
0x0A3 L3D_CACHE_REFILL_WR(a) Attributable Level 3 data or unified cache refill, write 
0x0A4-0x0A5 - Reserved 
0x0A6 L3D_CACHE_WB_VICTIM Attributable Level 3 data or unified cache Write-Back, victim 
0x0A7 L3D_CACHE_WB_CLEAN Attributable Level 3 data or unified cache Write-Back, cache clean 
0x0A8 L3D_CACHE_INVAL Attributable Level 3 data or unified cache access, invalidate 

a. For more information, see Relationship between REFILL events and associated access events on page K3-5523. 


b. In these definitions, an exception that is taken locally means an exception that is taken to the default Exception level, and is not routed 
to another Exception level. See Exception levels on page D1-1498 for more information. 
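
The rows of this part of Table K3-1 can be held in a simple lookup. The following Python sketch is illustrative only: `RECOMMENDED_EVENTS`, `RESERVED_RANGES`, and `describe_event` are hypothetical names that are not defined by the architecture, and the data covers only event numbers 0x072 to 0x0A8 as listed in the table rows above.

```python
# Illustrative lookup for the part of Table K3-1 shown above (0x072 to 0x0A8).
# All names here are hypothetical and exist only for this sketch.
RECOMMENDED_EVENTS = {
    0x072: "LDST_SPEC",
    0x073: "DP_SPEC",
    0x074: "ASE_SPEC",
    0x075: "VFP_SPEC",
    0x076: "PC_WRITE_SPEC",
    0x077: "CRYPTO_SPEC",
    0x078: "BR_IMMED_SPEC",
    0x079: "BR_RETURN_SPEC",
    0x07A: "BR_INDIRECT_SPEC",
    0x07C: "ISB_SPEC",
    0x07D: "DSB_SPEC",
    0x07E: "DMB_SPEC",
    0x081: "EXC_UNDEF",
    0x082: "EXC_SVC",
    0x083: "EXC_PABORT",
    0x084: "EXC_DABORT",
    0x086: "EXC_IRQ",
    0x087: "EXC_FIQ",
    0x088: "EXC_SMC",
    0x08A: "EXC_HVC",
    0x08B: "EXC_TRAP_PABORT",
    0x08C: "EXC_TRAP_DABORT",
    0x08D: "EXC_TRAP_OTHER",
    0x08E: "EXC_TRAP_IRQ",
    0x08F: "EXC_TRAP_FIQ",
    0x090: "RC_LD_SPEC",
    0x091: "RC_ST_SPEC",
    0x0A0: "L3D_CACHE_RD",
    0x0A1: "L3D_CACHE_WR",
    0x0A2: "L3D_CACHE_REFILL_RD",
    0x0A3: "L3D_CACHE_REFILL_WR",
    0x0A6: "L3D_CACHE_WB_VICTIM",
    0x0A7: "L3D_CACHE_WB_CLEAN",
    0x0A8: "L3D_CACHE_INVAL",
}

# Inclusive ranges that the table rows above mark as Reserved.
RESERVED_RANGES = [
    (0x07B, 0x07B), (0x07F, 0x080), (0x085, 0x085),
    (0x089, 0x089), (0x092, 0x09F), (0x0A4, 0x0A5),
]


def describe_event(number):
    """Return the mnemonic, "Reserved", or None if outside this table extract."""
    if number in RECOMMENDED_EVENTS:
        return RECOMMENDED_EVENTS[number]
    for low, high in RESERVED_RANGES:
        if low <= number <= high:
            return "Reserved"
    return None


print(describe_event(0x08E))
```

Because these event numbers are IMPLEMENTATION DEFINED recommendations, a tool using such a lookup would still need to confirm which events a particular implementation provides.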


0x040, L1D_CACHE_RD, Attributable Level 1 data cache access, read

This event is similar to Level 1 data cache access, L1D_CACHE, but the counter counts only
memory-read operations that access at least the Level 1 data or unified cache.

0x041, L1D_CACHE_WR, Attributable Level 1 data cache access, write

This event is similar to Level 1 data cache access, L1D_CACHE, but the counter counts only
memory-write operations that access at least the Level 1 data or unified cache.


The counter counts DC ZVA as a store instruction. 


0x042, L1D_CACHE_REFILL_RD, Attributable Level 1 data cache refill, read

This event is similar to Level 1 data cache refill, L1D_CACHE_REFILL, but the counter counts
only memory-read operations that cause a refill of at least the Level 1 data or unified cache.

See also Relationship between REFILL events and associated access events on page K3-5523.


0x043, L1D_CACHE_REFILL_WR, Attributable Level 1 data cache refill, write

This event is similar to Level 1 data cache refill, L1D_CACHE_REFILL, but the counter counts
only memory-write operations that cause a refill of at least the Level 1 data or unified cache.

The counter counts DC ZVA as a store instruction.

See also Relationship between REFILL events and associated access events on page K3-5523.
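
The read and write variants let software look at refills separately for each direction. The following Python sketch shows one way that might be done. It is an assumption-laden illustration: `refill_ratio` is a hypothetical helper, the sample counts are made up, and how far a refill count can be compared with an access count depends on Relationship between REFILL events and associated access events.

```python
def refill_ratio(refills, accesses):
    """Return refills / accesses, or None when no accesses were counted."""
    if accesses == 0:
        return None
    return refills / accesses


# Made-up sample counts, keyed by the event mnemonics defined above.
sample = {
    "L1D_CACHE_RD": 2000,
    "L1D_CACHE_REFILL_RD": 50,
    "L1D_CACHE_WR": 1000,
    "L1D_CACHE_REFILL_WR": 10,
}

read_ratio = refill_ratio(sample["L1D_CACHE_REFILL_RD"], sample["L1D_CACHE_RD"])
write_ratio = refill_ratio(sample["L1D_CACHE_REFILL_WR"], sample["L1D_CACHE_WR"])
print(read_ratio, write_ratio)
```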


0x044, L1D_CACHE_REFILL_INNER, Attributable Level 1 data cache refill, inner

This event is similar to Level 1 data cache refill, L1D_CACHE_REFILL, but the counter counts
only memory-read and memory-write operations that generate refills satisfied by transfer from
another cache inside the immediate cluster.

Note

The boundary between inner and outer is IMPLEMENTATION DEFINED, and it is not necessarily linked
to other similar boundaries, such as the boundary between Inner Cacheable and Outer Cacheable or
the boundary between Inner Shareable and Outer Shareable.





0x045, L1D_CACHE_REFILL_OUTER, Attributable Level 1 data cache refill, outer

This event is similar to Level 1 data cache refill, L1D_CACHE_REFILL, but the counter counts
only memory-read and memory-write operations that generate refills satisfied from outside of the
immediate cluster.





0x046, L1D_CACHE_WB_VICTIM, Attributable Level 1 data cache Write-Back, victim


This event is similar to Level 1 data cache Write-Back, L1D_CACHE_WB, but the counter counts 
only Write-Backs that are a result of the line being allocated for an access made by the PE. 


Write-Backs caused by the execution of a cache maintenance instruction are not counted. 


It is IMPLEMENTATION DEFINED whether a write of a whole cache line that is not the result of the 
eviction of a line from the cache is counted. For example, this might occur if the PE detects 
streaming writes to memory and does not allocate lines to the cache, or as the result of a DC ZVA. 


0x047, L1D_CACHE_WB_CLEAN, Level 1 data cache Write-Back, cleaning and coherency

This event is similar to Attributable Level 1 data cache Write-Back, L1D_CACHE_WB, but the
counter counts only Write-Backs that are a result of a coherency operation made by another PE or
are caused by the execution of a cache maintenance instruction. Whether Write-Backs caused by the
execution of a cache maintenance instruction are counted is IMPLEMENTATION DEFINED.

If a coherency request from a requestor outside the PE results in a Write-Back, it is an Unattributable
event.

Note

The transfer of a dirty cache line from the Level 1 data cache of this PE to the Level 1 data cache of
another PE due to a hardware coherency operation is not counted unless the dirty cache line is also
written back to a Level 2 cache or memory.

0x048, L1D_CACHE_INVAL, Attributable Level 1 data cache invalidate

The counter counts each invalidation of a cache line in the Level 1 data or unified cache.

The counter does not count events:

• If a cache refill invalidates a line.
• For locally-executed cache maintenance instructions that operate by set/way.

If a coherency request from a requestor outside the PE results in a Write-Back, it is an Unattributable
event.

0x04C, L1D_TLB_REFILL_RD, Attributable Level 1 data TLB refill, read

This event is similar to Level 1 data TLB refill, L1D_TLB_REFILL, but the counter counts only
memory-read operations that cause a data TLB refill of at least the Level 1 data or unified TLB.

See also Relationship between REFILL events and associated access events on page K3-5523.

0x04D, L1D_TLB_REFILL_WR, Attributable Level 1 data TLB refill, write

This event is similar to Level 1 data TLB refill, L1D_TLB_REFILL, but the counter counts only
memory-write operations that cause a data TLB refill of at least the Level 1 data or unified TLB.

The counter counts DC ZVA as a store instruction.

See also Relationship between REFILL events and associated access events on page K3-5523.


0x04E, L1D_TLB_RD, Attributable Level 1 data or unified TLB access, read

This event is similar to Level 1 data or unified TLB access, L1D_TLB, but the counter counts only
memory-read operations that cause a TLB access to at least the Level 1 data or unified TLB.

0x04F, L1D_TLB_WR, Attributable Level 1 data or unified TLB access, write

This event is similar to Level 1 data or unified TLB access, L1D_TLB, but the counter counts only
memory-write operations that cause a TLB access to at least the Level 1 data or unified TLB.

0x050, L2D_CACHE_RD, Attributable Level 2 data cache access, read

This event is similar to Attributable Level 2 data cache access, L2D_CACHE, but the counter
counts only memory-read operations that access at least the Level 2 data or unified cache.

0x051, L2D_CACHE_WR, Attributable Level 2 data cache access, write

This event is similar to Attributable Level 2 data cache access, L2D_CACHE, but the counter
counts only memory-write operations that access at least the Level 2 data or unified cache.


The counter counts DC ZVA as a store instruction. 


0x052, L2D_CACHE_REFILL_RD, Attributable Level 2 data cache refill, read

This event is similar to Attributable Level 2 data cache refill, L2D_CACHE_REFILL, but the
counter counts only memory-read operations that cause a refill of at least the Level 2 data or unified
cache.

See also Relationship between REFILL events and associated access events on page K3-5523.

0x053, L2D_CACHE_REFILL_WR, Attributable Level 2 data cache refill, write

This event is similar to Attributable Level 2 data cache refill, L2D_CACHE_REFILL, but the
counter counts only memory-write operations that cause a refill of at least the Level 2 data or unified
cache.

The counter counts DC ZVA as a store instruction.

See also Relationship between REFILL events and associated access events on page K3-5523.


0x056, L2D_CACHE_WB_VICTIM, Attributable Level 2 data cache Write-Back, victim


This event is similar to Attributable Level 2 data cache Write-Back, L2D_CACHE_WB, but the 
counter counts only Write-Backs that are a result of the line being allocated for an access made by 
the PE. 


Write-Backs caused by the execution of a cache maintenance instruction are not counted.

It is IMPLEMENTATION DEFINED whether a write of a whole cache line that is not the result of the
eviction of a line from the cache is counted. For example, this might occur if the PE detects 
streaming writes to memory and does not allocate lines to the cache, or as the result of a DC ZVA. 


0x057, L2D_CACHE_WB_CLEAN, Level 2 data cache Write-Back, cleaning and coherency

This event is similar to Attributable Level 2 data cache Write-Back, L2D_CACHE_WB, but the
counter counts only Write-Backs that are a result of a coherency operation made by another PE or
are caused by the execution of a cache maintenance instruction. Whether Write-Backs caused by the
execution of a cache maintenance instruction are counted is IMPLEMENTATION DEFINED.

Note


The transfer of a dirty cache line from the Level 2 data cache of this PE to the Level 2 data cache of 
another PE due to a hardware coherency operation is not counted unless the dirty cache line is also 
written back to a Level 3 cache or memory. 





If a coherency request from a requestor outside the PE results in a Write-Back, it is an Unattributable 
event. 


0x058, L2D_CACHE_INVAL, Attributable Level 2 data cache invalidate

The counter counts each invalidation of a cache line in the Level 2 data or unified cache.

The counter does not count events:

• If a cache refill invalidates a line.
• For locally-executed cache maintenance instructions that operate by set/way.

Note


Software that uses this event must know whether the Level 2 data cache is shared with other PEs. 
This event does not follow the general rule of Level 2 data cache events of only counting events that 
directly affect this PE. 





If a coherency request from a requestor outside the PE results in a Write-Back, it is an Unattributable 
event. 


0x05C, L2D_TLB_REFILL_RD, Attributable Level 2 data or unified TLB refill, read 


This event is similar to Attributable Level 2 data or unified TLB refill, L2D_TLB_REFILL, but the
counter counts only Attributable memory read operations that cause a TLB refill of at least the Level
2 data or unified TLB. See also Relationship between REFILL events and associated access events
on page K3-5523.

0x05D, L2D_TLB_REFILL_WR, Attributable Level 2 data or unified TLB refill, write

This event is similar to Attributable Level 2 data or unified TLB refill, L2D_TLB_REFILL, but the
counter counts only Attributable memory write operations that cause a TLB refill of at least the
Level 2 data or unified TLB. See also Relationship between REFILL events and associated access
events on page K3-5523.


0x05E, L2D_TLB_RD, Attributable Level 2 data or unified TLB access, read


This event is similar to Attributable Level 2 data or unified TLB access, L2D_TLB, but the counter 
counts only Attributable memory read operations that cause a TLB access to at least the Level 2 data 
or unified TLB. 


0x05F, L2D_TLB_WR, Attributable Level 2 data or unified TLB access, write


This event is similar to Attributable Level 2 data or unified TLB access, L2D_TLB, but the counter 
counts only Attributable memory write operations that cause a TLB access to at least the Level 2 
data or unified TLB. 


0x060, BUS_ACCESS_RD, Bus access, read 


This event is similar to Bus access, BUS_ACCESS, but the counter counts only memory-read 
operations that access outside the boundary of the PE and its closely-coupled caches.

0x061, BUS_ACCESS_WR, Bus access, write


This event is similar to Bus access, BUS_ACCESS, but the counter counts only memory-write 
operations that access outside the boundary of the PE and its closely-coupled caches. 


0x062, BUS_ACCESS_SHARED, Bus access, Normal, Cacheable, Shareable 


This event is similar to Bus access, BUS_ACCESS, but the counter counts only memory-read and 
memory-write operations that make Normal, Cacheable, Shareable accesses outside the boundary 
of the PE and its closely-coupled caches. 


Note


It is IMPLEMENTATION DEFINED how the PE translates the attributes from the translation table entry 
for a region to the attributes on the bus. 


In particular, a region of memory designated as Normal, Cacheable, Inner Shareable, Not Outer 
Shareable by a translation table entry, might be marked as either shareable or Non-shareable at the 
boundary of the PE and its closely-coupled caches. This depends on where the IMPLEMENTATION 
DEFINED boundary lies, between Inner and Outer Shareable. 


If the Inner Shareable extends beyond the PE boundary, and the bus indicates the distinction 
between Inner and Outer Shareable, then either is counted as shareable for the purposes of defining 
this event. 





0x063, BUS_ACCESS_NOT_SHARED, Bus access, not Normal, Cacheable, Shareable 


This event is similar to Bus access, BUS_ACCESS, but the counter counts only memory-read and 
memory-write operations that make accesses outside the boundary of the PE and its closely-coupled 
caches that are not Normal, Cacheable, Shareable. For example, the counter counts accesses marked
as:

• Normal, Cacheable, Non-shareable.
• Normal, Non-cacheable.
• Device.

Note


It is IMPLEMENTATION DEFINED how the PE translates the attributes from the translation table
entry for a region to the attributes on the bus.


In particular, a region of memory designated as Normal, Cacheable, Inner Shareable, Not Outer 
Shareable by a translation table entry, might be marked as either shareable or Non-shareable at the 
boundary of the PE and its closely-coupled caches. This depends on where the IMPLEMENTATION 
DEFINED boundary lies, between Inner and Outer Shareable. 


If the Inner Shareable extends beyond the PE boundary, and the bus indicates the distinction 
between Inner and Outer Shareable, then either is counted as shareable for the purposes of defining 
this event. 





0x064, BUS_ACCESS_NORMAL, Bus access, normal 


This event is similar to Bus access, BUS_ACCESS, but the counter counts only memory-read and 
memory-write operations that make Normal accesses outside the boundary of the PE and its 
closely-coupled caches. For example, the counter counts Normal, Cacheable and Normal, 
Non-cacheable accesses but does not count Device accesses. 


0x065, BUS_ACCESS_PERIPH, Bus access, peripheral 


This event is similar to Bus access, BUS_ACCESS, but the counter counts only memory-read and 
memory-write operations that make Device accesses outside the boundary of the PE and its 
closely-coupled caches. 
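
The bus access events partition accesses by direction and by the memory attributes seen at the boundary of the PE and its closely-coupled caches. The following Python sketch shows which of the events defined above a single access would contribute to. It is illustrative only: `bus_access_events` and its parameters are hypothetical, the attributes are assumed to be those presented on the bus (which, as the Notes say, are derived from the translation table attributes in an IMPLEMENTATION DEFINED way), and the sketch says nothing about how many times an access is counted.

```python
def bus_access_events(is_read, memory_type, cacheable=False, shareable=False):
    """List the bus access events that one access contributes to.

    memory_type is "Normal" or "Device"; cacheable and shareable describe
    the attributes as seen on the bus, not in the translation table entry.
    """
    if memory_type not in ("Normal", "Device"):
        raise ValueError("memory_type must be 'Normal' or 'Device'")

    events = ["BUS_ACCESS", "BUS_ACCESS_RD" if is_read else "BUS_ACCESS_WR"]

    # Normal, Cacheable, Shareable accesses count as shared; all others do not.
    if memory_type == "Normal" and cacheable and shareable:
        events.append("BUS_ACCESS_SHARED")
    else:
        events.append("BUS_ACCESS_NOT_SHARED")

    # Normal accesses versus Device (peripheral) accesses.
    if memory_type == "Normal":
        events.append("BUS_ACCESS_NORMAL")
    else:
        events.append("BUS_ACCESS_PERIPH")
    return events


print(bus_access_events(True, "Normal", cacheable=True, shareable=True))
print(bus_access_events(False, "Device"))
```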


0x066, MEM_ACCESS_RD, Data memory access, read 


This event is similar to Data memory access, MEM_ACCESS, but the counter counts only 
memory-read operations that the PE made.

0x067, MEM_ACCESS_WR, Data memory access, write

This event is similar to Data memory access, MEM_ACCESS, but the counter counts only
memory-write operations made by the PE.

0x068, UNALIGNED_LD_SPEC, Unaligned access, read


This event is similar to Data memory access, MEM_ACCESS, but the counter counts only 
unaligned memory-read operations that the PE made. It also counts unaligned accesses if they are 
subsequently transposed into multiple aligned accesses. 


0x069, UNALIGNED_ST_SPEC, Unaligned access, write

This event is similar to Data memory access, MEM_ACCESS, but the counter counts only
unaligned memory-write operations that the PE made. It also counts unaligned accesses if they are
subsequently transposed into multiple aligned accesses.


0x06A, UNALIGNED_LDST_SPEC, Unaligned access

This event is similar to Data memory access, MEM_ACCESS, but the counter counts only
unaligned memory-read operations and unaligned memory-write operations that the PE made. It
also counts unaligned accesses if they are subsequently transposed into multiple aligned accesses.
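
The unaligned access events count an unaligned access even when it is later carried out as several aligned accesses. The following Python sketch illustrates that idea under simplifying assumptions: an access is treated as unaligned when its address is not a multiple of its size, and `split_into_aligned` shows only one possible decomposition into naturally aligned pieces. Both function names are hypothetical, and how a PE actually splits an access is not described here.

```python
def is_unaligned(address, size):
    """True if the address is not a multiple of the access size in bytes."""
    return address % size != 0


def split_into_aligned(address, size):
    """Split [address, address + size) into naturally aligned power-of-two chunks.

    Returns a list of (chunk_address, chunk_size) pairs. This is one possible
    decomposition, chosen for illustration only.
    """
    chunks = []
    end = address + size
    while address < end:
        chunk = 1
        # Grow the chunk while it still fits and the address stays aligned to it.
        while chunk * 2 <= end - address and address % (chunk * 2) == 0:
            chunk *= 2
        chunks.append((address, chunk))
        address += chunk
    return chunks


# A 4-byte access at 0x1002 is unaligned. Even though this sketch splits it
# into two aligned halfword accesses, the event counts the unaligned access.
print(is_unaligned(0x1002, 4), split_into_aligned(0x1002, 4))
```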
0x06C, LDREX_SPEC, Exclusive operation speculatively executed, Load-Exclusive

The counter counts Load-Exclusive instructions speculatively executed.


The definition of speculatively executed is IMPLEMENTATION DEFINED. 


0x06D, STREX_PASS_SPEC, Exclusive operation speculatively executed, Store-Exclusive pass

The counter counts Store-Exclusive instructions speculatively executed that completed a write.

The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for
the LDREX_SPEC event.

0x06E, STREX_FAIL_SPEC, Exclusive operation speculatively executed, Store-Exclusive fail 


The counter counts Store-Exclusive instructions speculatively executed that fail to complete a write. 
It is within the IMPLEMENTATION DEFINED definition of speculatively executed whether this 
includes conditional instructions that fail the condition code check. 


The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for 
the LDREX_SPEC event. 

0x06F, STREX_SPEC, Exclusive operation speculatively executed, Store-Exclusive 
The counter counts Store-Exclusive instructions speculatively executed. 


The definition of speculatively executed is IMPLEMENTATION DEFINED but it must be the same as for 
the LDREX_SPEC event. 


ARM recommends that this event is implemented if it is not possible to implement the exclusive 
operation speculatively executed, Store-Exclusive pass, and exclusive operation speculatively 
executed, Store-Exclusive fail, events with the same degree of speculation as the LDREX_SPEC 
event. 
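
Where both the pass and fail events are implemented, software can derive a Store-Exclusive failure rate from them. The following Python sketch is a minimal illustration: `store_exclusive_fail_rate` is a hypothetical helper, and it assumes that the STREX_PASS_SPEC and STREX_FAIL_SPEC counts were gathered over the same interval with the same definition of speculatively executed.

```python
def store_exclusive_fail_rate(pass_count, fail_count):
    """Fraction of counted Store-Exclusive instructions that failed to write.

    Returns None if neither event was counted.
    """
    total = pass_count + fail_count
    if total == 0:
        return None
    return fail_count / total


# Made-up counts for STREX_PASS_SPEC and STREX_FAIL_SPEC.
print(store_exclusive_fail_rate(90, 10))
```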


0x070, LD_SPEC, Operation speculatively executed, load 


This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only 
memory-reading instructions, as defined by the LD_RETIRED event. 


The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for 
the INST_SPEC event.

0x071, ST_SPEC, Operation speculatively executed, store


This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only 
memory-writing instructions, as defined by the ST_RETIRED event. 


The counter counts DC ZVA as a store operation.

The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for
the INST_SPEC event.

0x072, LDST_SPEC, Operation speculatively executed, load or store 


This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only 
memory-reading instructions and memory-writing instructions, as defined by the LD_RETIRED 
and ST_RETIRED events. 


The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for 
the INST_SPEC event.

0x073, DP_SPEC, Operation speculatively executed, integer data processing


This event is similar to Operation speculatively executed, INST_SPEC, but counts only integer 
data-processing instructions. It counts the following operations that operate on the general-purpose 
registers: 


• In AArch64 state, Data processing - immediate on page C3-158 and Data processing -
register on page C3-163.
• In AArch32 state, Data-processing instructions on page F1-2372.

This includes MOV and MVN operations.

This event also counts the following miscellaneous instructions:

• In AArch64 state, System register instructions on page C3-144, System instructions on
page C3-144, and Hint instructions on page C3-145.
• In AArch32 state, PSTATE and banked register access instructions on page F1-2380, Banked
register access instructions on page F1-2380, Miscellaneous instructions on page F1-2385,
other than ISB and preloads, and System register access instructions on page F1-2387, other
than LDC and STC instructions.


If the preload instructions PRFM, PLD, PLDW, and PLI, do not count as memory-reading instructions 
then they must count as integer data-processing instructions. 


If ISBs do not count as software change of the PC then they must count as integer data-processing 
instructions. 


The definition of speculatively executed is IMPLEMENTATION DEFINED, but must be the same as for 
the INST_SPEC event. 


It is IMPLEMENTATION DEFINED whether the following instructions are counted as integer 
data-processing operations, SIMD operations, or floating-point operations, but ARM recommends 
that the instructions are all counted as integer data-processing operations: 


• For AArch64 state, from the A64 floating-point convert to integer class, operations that move
a value between a general-purpose register and a SIMD and floating-point register without
type conversion:

    - FMOV (general).

• For AArch64 state, from the SIMD Move group, operations that move a value between a
general-purpose register and an element or elements in a SIMD and floating-point register:

    - DUP (general).
    - SMOV.
    - UMOV.
    - INS (general).

• For AArch32 state:

    - VDUP (general-purpose register) and all VMOV instructions that transfer data between a
    general-purpose register and a SIMD and floating-point register.
    - VMRS.
    - VMSR.


0x074, ASE_SPEC, Operation speculatively executed, Advanced SIMD 


This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only 
Advanced SIMD data-processing instructions, see: 


• For AArch64 state, the SIMD operations listed in Data processing - SIMD and floating-point
on page C3-171.
• For AArch32 state, Advanced SIMD data-processing instructions on page F1-2391.


This includes all operations that operate on the SIMD and floating-point registers, except those that 
are counted as: 


• Integer data-processing operations.
• Floating-point data-processing operations.
• Memory-reading operations.
• Memory-writing operations.
• Cryptographic operations other than PMULL, in AArch64 state.
• VMULL, in AArch32 state.


Advanced SIMD scalar operations are counted as Advanced SIMD operations, including those 
which operate on floating-point values. In AArch64 state, PMULL, and in AArch32 state, VMULL are 
counted as Advanced SIMD operations. 


The definition of speculatively executed is IMPLEMENTATION DEFINED, but must be the same as for 
the INST_SPEC event.

0x075, VFP_SPEC, Operation speculatively executed, floating-point


This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only 
floating-point data-processing instructions, see: 


• In AArch64 state, only the scalar floating-point operations listed in Data processing - SIMD
and floating-point on page C3-171.

Note

This event does not count the SIMD floating-point operations listed in Data processing -
SIMD and floating-point on page C3-171.

• In AArch32 state, Floating-point data-processing instructions on page F1-2399.


This includes all operations that operate on the SIMD and floating-point registers as floating-point 
values, except for SIMD scalar operations and those that are counted as one of: 


• Integer data processing.
• Memory-reading operations.
• Memory-writing operations.


The following instructions that take both an integer register and a floating-point register argument 
and perform a type conversion (to/from integer or to/from fixed-point), are counted as floating-point 
data-processing operations: 


• In AArch64 state, FCVT{<mode>}, UCVTF, and SCVTF.
• In AArch32 state, VCVT<mode> (floating-point), VCVT, VCVTT, and VCVTB.


The definition of speculatively executed is IMPLEMENTATION DEFINED, but must be the same as for 
the INST_SPEC event.

0x076, PC_WRITE_SPEC, Operation speculatively executed, software change of the PC

This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only
software changes of the PC, as defined by the Instruction architecturally executed, condition code
check pass, software change of the PC event, see Common event numbers on page D5-1852.


The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for 
the INST_SPEC event. 


See also PC_WRITE_RETIRED. 


0x077, CRYPTO_SPEC, Operation speculatively executed, Cryptographic instruction 


This event is similar to Operation speculatively executed, INST_SPEC, but the counter counts only 
Cryptographic instructions, except PMULL and VMULL, see The Cryptographic Extension on 
page C3-189. 


The definition of speculatively executed is IMPLEMENTATION DEFINED but must be the same as for 
the INST_SPEC event. 
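
The special cases called out in the DP_SPEC, ASE_SPEC, VFP_SPEC, and CRYPTO_SPEC definitions can be summarized as a small lookup. The following Python sketch covers only the A64 instructions that those definitions name explicitly. `SPECIAL_CASES` and `recommended_event` are hypothetical names, the `(scalar, integer)` qualifiers are an assumption used here to select the forms that take a general-purpose register, and the DP_SPEC entries reflect an ARM recommendation for behavior that is IMPLEMENTATION DEFINED.

```python
# A64 instructions that the definitions above single out, mapped to the
# *_SPEC event they are described as counting toward. Hypothetical names.
SPECIAL_CASES = {
    # Moves between general-purpose and SIMD and floating-point registers:
    # ARM recommends counting these as integer data processing.
    "FMOV (general)": "DP_SPEC",
    "DUP (general)": "DP_SPEC",
    "SMOV": "DP_SPEC",
    "UMOV": "DP_SPEC",
    "INS (general)": "DP_SPEC",
    # Conversions between an integer register and a floating-point register.
    "SCVTF (scalar, integer)": "VFP_SPEC",
    "UCVTF (scalar, integer)": "VFP_SPEC",
    # PMULL is counted as Advanced SIMD, not as a Cryptographic instruction.
    "PMULL": "ASE_SPEC",
}


def recommended_event(instruction):
    """Return the event for a listed special case, or None if not listed."""
    return SPECIAL_CASES.get(instruction)


print(recommended_event("PMULL"))
```

A fuller classifier would need the complete instruction class tables that the definitions reference, which this sketch does not attempt to reproduce.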

0x078, BR_IMMED_SPEC, Branch speculatively executed, immediate branch


The counter counts immediate branch instructions speculatively executed, as defined by the
Instruction architecturally executed, immediate branch event, see Common event numbers on
page D5-1852.

The definition of speculatively executed is IMPLEMENTATION DEFINED.

See also BR_IMMED_RETIRED.


0x079, BR_RETURN_SPEC, Branch speculatively executed, procedure return 


The counter counts procedure return instructions speculatively executed, as defined by the
BR_RETURN_RETIRED event.

The definition of speculatively executed is IMPLEMENTATION DEFINED.

See also BR_RETURN_RETIRED.


0x07A, BR_INDIRECT_SPEC, Branch speculatively executed, indirect branch 


The counter counts indirect branch instructions speculatively executed. This includes software 
change of the PC other than exception-generating instructions and immediate branch instructions. 


The definition of speculatively executed is IMPLEMENTATION DEFINED. 


0x07C, ISB_SPEC, Barrier speculatively executed, ISB 


The counter counts Instruction Synchronization Barrier instructions speculatively executed, 
including CP15ISB. 


The definition of speculatively executed is IMPLEMENTATION DEFINED. 


0x07D, DSB_SPEC, Barrier speculatively executed, DSB 


The counter counts Data Synchronization Barrier instructions speculatively executed, including
CP15DSB.


The definition of speculatively executed is IMPLEMENTATION DEFINED. 


0x07E, DMB_SPEC, Barrier speculatively executed, DMB 


The counter counts Data Memory Barrier instructions speculatively executed, including CP15DMB.
It does not include the implied barrier operations of load/store operations with release consistency
semantics.


The definition of speculatively executed is IMPLEMENTATION DEFINED. 


0x081, EXC_UNDEF, Exception taken, other synchronous 


This event is similar to Exception taken, EXC_TAKEN, but the counter counts only synchronous 
exceptions that are not counted by the other Exception taken events. This event counts only 
exceptions taken locally.

0x082, EXC_SVC, Exception taken, Supervisor Call
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Supervisor Call 
exceptions. This event counts only exceptions taken locally. 

0x083, EXC_PABORT, Exception taken, Instruction Abort 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Instruction 
Abort exceptions. This event counts only exceptions taken locally. 

0x084, EXC_DABORT, Exception taken, Data Abort or SError 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Data Abort or 
SError interrupt exceptions. The counter counts only exceptions taken locally. 

0x086, EXC_IRQ, Exception taken, IRQ 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only IRQ 
exceptions. The counter counts only exceptions taken locally, including Virtual IRQ exceptions. 

0x087, EXC_FIQ, Exception taken, FIQ 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only FIQ exceptions. 
The counter counts only exceptions taken locally, including Virtual FIQ exceptions. 

0x088, EXC_SMC, Exception taken, Secure Monitor Call 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Secure Monitor 
Call exceptions. The counter does not increment on SMC instructions trapped as a Hyp Trap 
exception. 


0x08A, EXC_HVC, Exception taken, Hypervisor Call
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Hypervisor Call
exceptions. The counter counts both Hypervisor Call exceptions taken locally in the hypervisor
and those taken as an exception from Non-secure EL1.

0x08B, EXC_TRAP_PABORT, Exception taken, Instruction Abort not taken locally 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Instruction 
Abort exceptions not taken locally. 

0x08C, EXC_TRAP_DABORT, Exception taken, Data Abort or SError not taken locally 
This event is similar to Exception taken, EXC_TAKEN, but the counter counts only Data Abort or 
SError interrupt exceptions not taken locally. 

0x08D, EXC_TRAP_OTHER, Exception taken, other traps not taken locally 


This event is similar to Exception taken, EXC_TAKEN, but the counter counts only those traps that 
are not counted as: 


• Exception taken, Hypervisor Call.
• Exception taken, Instruction Abort not taken locally.
• Exception taken, Data Abort or SError not taken locally.
• Exception taken, IRQ not taken locally.
• Exception taken, FIQ not taken locally.


0x08E, EXC_TRAP_IRQ, Exception taken, IRQ not taken locally 


This event is similar to Exception taken, EXC_TAKEN, but the counter counts only IRQ exceptions 
not taken locally. 


0x08F, EXC_TRAP_FIQ, Exception taken, FIQ not taken locally 


This event is similar to Exception taken, EXC_TAKEN, but the counter counts only FIQ exceptions 
not taken locally. 
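
The exception events split on whether the exception is taken locally. The following Python sketch shows how the definitions above might select an event for a single exception. It is illustrative only: `exception_event` and the `kind` strings are hypothetical, every exception is also an EXC_TAKEN event, and the sketch does not model an SMC instruction trapped as a Hyp Trap exception, which EXC_SMC does not count.

```python
# Events for exceptions taken locally, keyed by a hypothetical kind string.
LOCAL_EVENTS = {
    "other_sync": "EXC_UNDEF",
    "svc": "EXC_SVC",
    "pabort": "EXC_PABORT",
    "dabort_or_serror": "EXC_DABORT",
    "irq": "EXC_IRQ",
    "fiq": "EXC_FIQ",
}

# Events for exceptions not taken locally that have a dedicated event.
TRAP_EVENTS = {
    "pabort": "EXC_TRAP_PABORT",
    "dabort_or_serror": "EXC_TRAP_DABORT",
    "irq": "EXC_TRAP_IRQ",
    "fiq": "EXC_TRAP_FIQ",
}


def exception_event(kind, taken_locally):
    """Pick the recommended event for one exception, per the text above."""
    # Hypervisor Call is counted whether or not it is taken locally.
    if kind == "hvc":
        return "EXC_HVC"
    if kind == "smc":
        return "EXC_SMC"
    if taken_locally:
        return LOCAL_EVENTS[kind]
    # Anything without a dedicated "not taken locally" event is an other trap.
    return TRAP_EVENTS.get(kind, "EXC_TRAP_OTHER")


print(exception_event("irq", True), exception_event("irq", False))
```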

0x090, RC_LD_SPEC, Release consistency operation speculatively executed, Load-Acquire
The counter counts Load-Acquire operations that are speculatively executed. The definition of 
speculatively executed is IMPLEMENTATION DEFINED. 

0x091, RC_ST_SPEC, Release consistency operation speculatively executed, Store-Release 
The counter counts Store-Release operations that are speculatively executed. The definition of 
speculatively executed is IMPLEMENTATION DEFINED. 

0x0A0, L3D_CACHE_RD, Attributable Level 3 data or unified cache access, read 


This event is similar to Attributable Level 3 data or unified cache access, L3D_CACHE, but the 
counter counts only attributable memory read operations that cause a cache access to at least the 
Level 3 data or unified cache. 


0x0A1, L3D_CACHE_WR, Attributable Level 3 data or unified cache access, write 


This event is similar to Attributable Level 3 data or unified cache access, L3D_CACHE, but the 
counter counts only attributable memory write operations that cause a cache access to at least the 
Level 3 data or unified cache. 


0x0A2, L3D_CACHE_REFILL_RD, Attributable Level 3 data or unified cache refill, read

This event is similar to Attributable Level 3 data or unified cache refill, L3D_CACHE_REFILL,
but the counter counts only attributable memory read operations that cause a refill of at least the
Level 3 data or unified cache from outside the Level 3 cache. See also Relationship between REFILL
events and associated access events on page K3-5523.

0x0A3, L3D_CACHE_REFILL_WR, Attributable Level 3 data or unified cache refill, write

This event is similar to Attributable Level 3 data or unified cache refill, L3D_CACHE_REFILL,
but the counter counts only attributable memory write operations that cause a refill of at least the
Level 3 data or unified cache from outside the Level 3 cache. See also Relationship between REFILL
events and associated access events on page K3-5523.





0x0A6, L3D_CACHE_WB_VICTIM, Attributable Level 3 data or unified cache Write-Back, victim 


This event is similar to Attributable Level 3 data cache Write-Back, L3D_CACHE_WB, but the 
counter counts only Write-Backs that are a result of the line being allocated for an access made by 
the PE. 


Write-Backs caused by the execution of a cache maintenance instruction are not counted. 


It is IMPLEMENTATION DEFINED whether a write of a whole cache line that is not the result of the 
eviction of a line from the cache is counted. For example, this might occur if the PE detects 
streaming writes to memory and does not allocate lines to the cache, or as the result of a DC ZVA. 


0x0A7, L3D_CACHE_WB_CLEAN, Level 3 data or unified cache Write-Back, cache clean 


This event is similar to Attributable Level 3 data cache Write-Back, L3D_CACHE_WB, but the 
counter counts only Write-Backs that are a result of a coherency operation made by another PE or 
are caused by the execution of a cache maintenance instruction. Whether Write-Backs that are 
caused by the execution of a cache maintenance instruction are counted is IMPLEMENTATION 
DEFINED. 


— Note 


The transfer of a dirty cache line from the Level 3 data cache of this PE to the Level 3 data cache of 
another PE due to a hardware coherency operation is not counted unless the dirty cache line is also 
written back to a Level 3 cache or memory. 





If a coherency request from a requestor outside the PE results in a Write-Back, it is an Unattributable
event.


0x0A8, L3D_CACHE_INVAL, Attributable Level 3 data or unified cache access, invalidate 


The counter counts each invalidation of a cache line in the Level 3 data or unified cache. 

The counter does not count events:

• If a cache refill invalidates a line.
• For locally-executed cache maintenance instructions that operate by set/way.

Note

Software that uses this event must know whether the Level 3 data cache is shared with other PEs.
This event does not follow the general rule of Level 3 data cache events of only counting
Attributable events.





K3.1.1 Relationship between REFILL events and associated access events. 


CACHE_REFILL and TLB_REFILL events count the refills for accesses that are counted by the corresponding 
CACHE or TLB event. Table K3-2 shows this correspondence. 


Table K3-2 Relationship between REFILL events and associated access events

REFILL event                 Access event          Ratio REFILL/Access
0x042 L1D_CACHE_REFILL_RD    0x040 L1D_CACHE_RD    Attributable Level 1 cache refill rate, read
0x043 L1D_CACHE_REFILL_WR    0x041 L1D_CACHE_WR    Attributable Level 1 cache refill rate, write
0x04C L1D_TLB_REFILL_RD      0x04E L1D_TLB_RD      Attributable Level 1 TLB refill rate, read
0x04D L1D_TLB_REFILL_WR      0x04F L1D_TLB_WR      Attributable Level 1 TLB refill rate, write
0x052 L2D_CACHE_REFILL_RD    0x050 L2D_CACHE_RD    Attributable Level 2 data cache refill rate, read
0x053 L2D_CACHE_REFILL_WR    0x051 L2D_CACHE_WR    Attributable Level 2 data cache refill rate, write
0x05C L2D_TLB_REFILL_RD      0x05E L2D_TLB_RD      Attributable Level 2 data TLB refill rate, read
0x05D L2D_TLB_REFILL_WR      0x05F L2D_TLB_WR      Attributable Level 2 data TLB refill rate, write
0x0A2 L3D_CACHE_REFILL_RD    0x0A0 L3D_CACHE_RD    Attributable Level 3 data cache refill rate, read
0x0A3 L3D_CACHE_REFILL_WR    0x0A1 L3D_CACHE_WR    Attributable Level 3 data cache refill rate, write
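
As a minimal sketch of how the Ratio REFILL/Access column might be computed from raw counter values, the following Python fragment pairs each REFILL event with its access event as listed in Table K3-2. The name refill_rate and the counts dictionary are illustrative assumptions and are not part of the architecture. How the counter values are read from the PMU is outside the scope of this sketch.

```python
# Event numbers come from Table K3-2. The helper itself is hypothetical.
REFILL_TO_ACCESS = {
    0x042: 0x040,  # L1D_CACHE_REFILL_RD -> L1D_CACHE_RD
    0x043: 0x041,  # L1D_CACHE_REFILL_WR -> L1D_CACHE_WR
    0x04C: 0x04E,  # L1D_TLB_REFILL_RD   -> L1D_TLB_RD
    0x04D: 0x04F,  # L1D_TLB_REFILL_WR   -> L1D_TLB_WR
    0x052: 0x050,  # L2D_CACHE_REFILL_RD -> L2D_CACHE_RD
    0x053: 0x051,  # L2D_CACHE_REFILL_WR -> L2D_CACHE_WR
    0x05C: 0x05E,  # L2D_TLB_REFILL_RD   -> L2D_TLB_RD
    0x05D: 0x05F,  # L2D_TLB_REFILL_WR   -> L2D_TLB_WR
    0x0A2: 0x0A0,  # L3D_CACHE_REFILL_RD -> L3D_CACHE_RD
    0x0A3: 0x0A1,  # L3D_CACHE_REFILL_WR -> L3D_CACHE_WR
}


def refill_rate(counts, refill_event):
    """Return REFILL/Access for one event pair, or None if no accesses were counted.

    counts maps an event number to the value sampled from its counter (assumed input).
    """
    access_event = REFILL_TO_ACCESS[refill_event]
    accesses = counts.get(access_event, 0)
    if accesses == 0:
        return None
    return counts.get(refill_event, 0) / accesses
```

Returning None when no accesses were counted avoids a division by zero and leaves the choice of how to report an idle interval to the caller.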

K3.2 Summary of events for exceptions taken to an Exception level using AArch64 
Table K3-3 shows the events for exceptions taken to an Exception level using AArch64. 
Table K3-3 Events for exceptions taken to an Exception level using AArch64

                                                              Event number and classification for exception taken to
ESR.EC  Description                                           EL1, or the current Exception level  EL2 or EL3, from below
0x00    Unknown or uncategorized                              0x081, Other synchronous             0x08D, Other traps not taken locally
0x01    WFE/WFI traps                                         0x081, Other synchronous             0x08D, Other traps not taken locally
0x03    AArch32 MCR/MRC traps on (coproc==0b1111) accesses    0x081, Other synchronous             0x08D, Other traps not taken locally
0x04    AArch32 MCRR/MRRC traps on (coproc==0b1111) accesses  0x081, Other synchronous             0x08D, Other traps not taken locally
0x05    AArch32 MCR/MRC traps on (coproc==0b1110) accesses    0x081, Other synchronous             0x08D, Other traps not taken locally
0x06    AArch32 LDC/STC traps on (coproc==0b1110) accesses    0x081, Other synchronous             0x08D, Other traps not taken locally
0x07    Advanced SIMD or FP traps                             0x081, Other synchronous             0x08D, Other traps not taken locally
0x08    AArch32 MVFR* and FPSID traps                         -                                    0x08D, Other traps not taken locally
0x0C    AArch32 MCRR/MRRC traps on (coproc==0b1110) accesses  0x081, Other synchronous             0x08D, Other traps not taken locally
0x0E    Illegal instruction set state                         0x081, Other synchronous             0x08D, Other traps not taken locally
0x11    AArch32 SVC                                           0x082, Supervisor Call               0x08D, Other traps not taken locally
0x12    AArch32 HVC that is not disabled                      -                                    0x08A, Hypervisor Call
0x13    AArch32 SMC that is not disabled, to EL2              -                                    0x08D, Other traps not taken locally
        AArch32 SMC that is not disabled, to EL3              -                                    0x088, Secure Monitor Call
0x15    AArch64 SVC                                           0x082, Supervisor Call               0x08D, Other traps not taken locally
0x16    AArch64 HVC that is not disabled                      0x08A, Hypervisor Call               0x08A, Hypervisor Call
0x17    AArch64 SMC that is not disabled, to EL2              -                                    0x08D, Other traps not taken locally
        AArch64 SMC that is not disabled, to EL3              0x088, Secure Monitor Call           0x088, Secure Monitor Call
0x18    AArch64 MSR, MRS and system instruction traps         0x081, Other synchronous             0x08D, Other traps not taken locally
0x20    Instruction Abort from below                          0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
0x21    Instruction Abort from current Exception level        0x083, Instruction Abort             -
0x22    PC alignment                                          0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
0x24    Data Abort from below                                 0x084, Data Abort or SError          0x08C, Data Abort or SError not taken locally
0x25    Data Abort from current Exception level               0x084, Data Abort or SError          -
0x26    SP alignment fault exception                          0x084, Data Abort or SError          0x08C, Data Abort or SError not taken locally
0x28    AArch32 FP exception                                  0x081, Other synchronous             0x08D, Other traps not taken locally
0x2C    AArch64 FP exception                                  0x081, Other synchronous             0x08D, Other traps not taken locally
0x2F    SError interrupt                                      0x084, Data Abort or SError          0x08C, Data Abort or SError not taken locally
0x30    Breakpoint from below                                 0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
0x31    Breakpoint from current Exception level               0x083, Instruction Abort             -
0x32    Software step from below                              0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
0x33    Software step from current Exception level            0x083, Instruction Abort             -
0x34    Watchpoint from below                                 0x084, Data Abort or SError          0x08C, Data Abort or SError not taken locally
0x35    Watchpoint from current Exception level               0x084, Data Abort or SError          -
0x38    AArch32 BKPT instruction                              0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
0x3A    AArch32 Vector Catch debug event                      0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
0x3C    AArch64 BRK instruction                               0x083, Instruction Abort             0x08B, Instruction Abort not taken locally
-       IRQ interrupt                                         0x086, IRQ                           0x08E, IRQ not taken locally
-       FIQ interrupt                                         0x087, FIQ                           0x08F, FIQ not taken locally





Note 





In these definitions, an exception that is taken locally means an exception that is taken to the default Exception level, 
and is not routed to another Exception level. See Exception levels on page D1-1498 for more information. 
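
The table lookup can be sketched in a few lines of Python. The fragment below encodes only a subset of the Table K3-3 rows and is a hedged illustration: the names EC_TO_PMU_EVENT and pmu_event_for are hypothetical, and the flag taken_locally is assumed to be supplied by the caller according to the definition in the Note above.

```python
# Subset of Table K3-3. Each entry is (event when taken locally,
# event when taken to EL2 or EL3 from below); None stands for "-" in the table.
EC_TO_PMU_EVENT = {
    0x00: (0x081, 0x08D),  # Unknown or uncategorized
    0x01: (0x081, 0x08D),  # WFE/WFI traps
    0x15: (0x082, 0x08D),  # AArch64 SVC
    0x16: (0x08A, 0x08A),  # AArch64 HVC that is not disabled
    0x20: (0x083, 0x08B),  # Instruction Abort from below
    0x21: (0x083, None),   # Instruction Abort from current Exception level
    0x24: (0x084, 0x08C),  # Data Abort from below
    0x25: (0x084, None),   # Data Abort from current Exception level
    0x2F: (0x084, 0x08C),  # SError interrupt
    0x3C: (0x083, 0x08B),  # AArch64 BRK instruction
}


def pmu_event_for(ec, taken_locally):
    """Return the event number for an ESR.EC value, or None if the table shows '-'."""
    local_event, routed_event = EC_TO_PMU_EVENT[ec]
    return local_event if taken_locally else routed_event
```

For example, pmu_event_for(0x24, False) selects the "not taken locally" classification for a Data Abort from below.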
Appendix K4
Recommendations for reporting memory attributes 
on an interconnect 


This appendix describes the ARM recommendations for reporting the memory attributes that are assigned by the 
PE. It contains the following section: 


• ARM recommendations for reporting memory attributes on an interconnect on page K4-5528.

K4.1 ARM recommendations for reporting memory attributes on an interconnect 


The ARM architecture defines the architectural interface between software and the PE hardware. This means the
mechanisms by which different memory type and Cacheability attributes are presented on an interface to an
interconnect fabric such as AMBA® AXI are, strictly, outside the scope of the architecture. This appendix describes
an approach for the interface between a PE implementation and an interconnect fabric that ARM strongly
recommends, but these recommendations do not form part of the ARMv8 architecture.


K4.1.1 Effect of microarchitectural choices on memory attributes 


Implementations of the ARM architecture permit considerable variability in the presentation of memory attributes 
on the interconnect fabric, particularly in cases where the PE implementation does not provide optimized support 
for a memory type. For example, an implementation might treat Write-Through locations as Non-cacheable at some
level of cache, because functionally this is consistent with the definition of Write-Through, but for the particular 
implementation the performance trade-off does not merit the hardware directly providing Write-Through capability. 
However, in such implementations, the assigned memory attributes are not changed by the microarchitectural 
choices. The microarchitecture simply implements different ways of handling some memory attributes. 


Therefore, ARM strongly recommends that where any or all of the following memory attributes are presented on
the interface between a PE and an interconnect fabric, the attributes that are presented are completely consistent
with the attributes defined by the translation system:

• The memory type, Normal or Device.
• The Early write acknowledgement attribute.
• The ordering requirements.
• The Shareability.
• The Cacheability, including where practicable, the allocation hints.


Effect when memory accesses are forced to be Non-cacheable 


ARM also strongly recommends that the effects of forcing accesses to Normal memory to be Non-cacheable, as 
described in Enabling and disabling the caching of memory accesses on page D3-1696 for AArch64 and in 
Enabling and disabling the caching of memory accesses in AArch32 state on page G3-3993 for AArch32, are 
reflected on the interconnect by the memory type and attributes used for memory transactions generated while the 
cache is disabled. 
Appendix K5
ARMv8 Changes to the T32 and A32 Instruction Sets


This appendix summarizes the changes that ARMv8 makes to the T32 and A32 instruction sets. It contains the 
following sections: 


• The A32 and T32 instruction sets on page K5-5530.
• Partial deprecation of IT on page K5-5531.
• New A32 and T32 Load-Acquire/Store-Release instructions on page K5-5532.
• New A32 and T32 scalar floating-point instructions on page K5-5533.
• New A32 and T32 Advanced SIMD floating-point instructions on page K5-5536.
• New A32 and T32 instructions provided by the Cryptographic Extension on page K5-5538.
• New A32 and T32 System instructions on page K5-5539.
• CRC32 instructions on page K5-5541.

K5.1 The A32 and T32 instruction sets 


This chapter describes the changes that ARMv8-A makes to the T32 and A32 instruction sets, compared to an
ARMv7-A implementation that includes all of the following extensions:


• Multiprocessing Extensions.
• Large Physical Address Extension.
• Virtualization Extensions.
• Security Extensions.
• VFPv4.
• Advanced SIMDv2.


These ARMv8 changes are not affected by whether the ARMv8-A implementation includes either or both of EL2
and EL3. 


ARMv8-A obsoletes the A32 SWP and SWPB instructions. 


ARM deprecates any use of the following instructions. The first two of these deprecations are made for performance
reasons. In ARMv8-A, privileged software can disable this functionality, see the descriptions of the
SCTLR.{CP15BEN, ITD, SED} and HSCTLR.{CP15BEN, ITD, SED} bits:

• The A32 and T32 System instruction memory barriers CP15DSB, CP15ISB, and CP15DMB, that are in the
  (coproc==0b1111) System register encoding space.
• A subset of T32 IT instruction functionality, as described in Partial deprecation of IT on page K5-5531.
• The A32 and T32 SETEND instruction.


ARMv8-A adds new A32 and T32 instructions to align with some of the features introduced in the A64 instruction 
set. These are described in: 


• Partial deprecation of IT on page K5-5531.
• New A32 and T32 Load-Acquire/Store-Release instructions on page K5-5532.
• New A32 and T32 scalar floating-point instructions on page K5-5533.
• New A32 and T32 Advanced SIMD floating-point instructions on page K5-5536.
• New A32 and T32 instructions provided by the Cryptographic Extension on page K5-5538.
• New A32 and T32 System instructions on page K5-5539.


Note 


The existing A32 and T32 assembler syntax is unchanged from ARMv7 UAL. Where the syntax term <c> is used 
in this chapter, it represents a standard ARM condition code. Mnemonics that do not include <c> cannot be
conditionally executed. 

K5.2 Partial deprecation of IT 


ARMv8-A deprecates some uses of the T32 IT instruction, for performance reasons. All uses of IT that apply to 
instructions other than a single subsequent 16-bit instruction from a restricted set are deprecated, as are explicit 
references to the PC within that single 16-bit instruction. This permits the non-deprecated forms of IT and 
subsequent instructions to be treated as a single 32-bit conditional instruction. Table K5-1 shows the restricted set 
of 16-bit instructions that are not deprecated when used in conjunction with IT. 


Table K5-1 Non-deprecated IT 16-bit conditional instructions

Permitted 16-bit instructions  Class               Notes
MOV, MVN                       Move                Deprecated when Rm or Rd is the PC
LDR, LDRB, LDRH, LDRSB, LDRSH  Load                Deprecated for PC-relative load literal forms
STR, STRB, STRH                Store               -
ADD, ADC, RSB, SBC, SUB        Add/Subtract        Deprecated for ADD SP,SP,#imm, SUB SP,SP,#imm, and when Rm, Rdn, or Rdm is the PC
CMP, CMN                       Compare             Deprecated when Rm or Rn is the PC
MUL                            Multiply            -
ASR, LSL, LSR, ROR             Shift               -
AND, BIC, EOR, ORR, TST        Logical             -
BX, BLX                        Branch to register  Deprecated when Rm is the PC

The full ARMv7 IT instruction functionality remains available in order to execute legacy T32 code. It is
IMPLEMENTATION DEFINED whether an ARMv8 implementation provides an ITD control, that software can use to
disable the deprecated uses of the IT instruction. In an implementation that includes the ITD control, setting an ITD
field to 1 disables the deprecated uses of the IT instruction, making those uses of the IT instruction UNDEFINED. The
ITD control fields are:

HSCTLR.ITD     When EL2 is using AArch32, makes execution of the deprecated uses of IT UNDEFINED at EL2.
SCTLR.ITD      When EL1 is using AArch32, makes execution of the deprecated uses of IT UNDEFINED at EL0
               and EL1.
SCTLR_EL1.ITD  When EL1 is using AArch64, makes execution of the deprecated uses of IT UNDEFINED at EL0
               when EL0 is using AArch32.
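
The deprecation rule can be approximated by a small classifier. The Python fragment below is a sketch under stated assumptions: the mnemonic set is taken from Table K5-1, while the function it_use_is_deprecated and the tuple layout describing each instruction are hypothetical. The SP-specific ADD and SUB forms in the Notes column are not modelled.

```python
# Mnemonics from Table K5-1. Everything else here is an illustrative helper.
PERMITTED_16BIT = {
    "MOV", "MVN",
    "LDR", "LDRB", "LDRH", "LDRSB", "LDRSH",
    "STR", "STRB", "STRH",
    "ADD", "ADC", "RSB", "SBC", "SUB",
    "CMP", "CMN",
    "MUL",
    "ASR", "LSL", "LSR", "ROR",
    "AND", "BIC", "EOR", "ORR", "TST",
    "BX", "BLX",
}


def it_use_is_deprecated(block):
    """Classify one IT block.

    block is a list of (mnemonic, is_16bit, references_pc) tuples, one per
    instruction made conditional by the IT instruction (assumed input format).
    """
    if len(block) != 1:
        return True  # only a single subsequent instruction is non-deprecated
    mnemonic, is_16bit, references_pc = block[0]
    if not is_16bit or references_pc:
        return True  # 32-bit encodings and explicit PC references are deprecated
    return mnemonic not in PERMITTED_16BIT
```

A tool such as a linter for legacy T32 code could use a check of this shape to flag IT blocks that become UNDEFINED when an ITD field is set to 1.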
ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. K5-5531 


ID092916


Non-Confidential 



K5.3 New A32 and T32 Load-Acquire/Store-Release instructions 


The addresses accessed by the new Load-Acquire/Store-Release instructions must be naturally aligned. For LDAEXD
and STLEXD the address must be aligned to 8 bytes. An unaligned address causes an alignment fault. For more
information about the ordering of Load-Acquire/Store-Release, see Load-Acquire, Store-Release on page E2-2338.
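
The alignment requirement can be illustrated with a short Python check. This is a sketch, not architectural pseudocode: the access sizes assume the usual ARM sizes of 1 byte, 2 bytes for a halfword, 4 bytes for a word, and 8 bytes for a doubleword, and the names ACCESS_SIZE and is_naturally_aligned are hypothetical.

```python
# Access size in bytes for each new Load-Acquire/Store-Release mnemonic.
ACCESS_SIZE = {
    "LDAB": 1, "LDAH": 2, "LDA": 4,
    "STLB": 1, "STLH": 2, "STL": 4,
    "LDAEXB": 1, "LDAEXH": 2, "LDAEX": 4, "LDAEXD": 8,
    "STLEXB": 1, "STLEXH": 2, "STLEX": 4, "STLEXD": 8,
}


def is_naturally_aligned(mnemonic, address):
    """Return True if address is a multiple of the access size for mnemonic."""
    return address % ACCESS_SIZE[mnemonic] == 0
```

An address for which this check returns False corresponds to the case that the text describes as causing an alignment fault.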


K5.3.1 A32 and T32 Load-Acquire/Store-Release (non-exclusive) instructions 


Table K5-2 shows the new A32 and T32 Load-Acquire/Store-Release (non-exclusive) instructions. 


Table K5-2 A32 and T32 Load-Acquire/Store-Release (non-exclusive) instructions

Mnemonic  Instruction             See
LDA       Load-Acquire Word       LDA on page F5-2683
LDAB      Load-Acquire Byte       LDAB on page F5-2684
LDAH      Load-Acquire Halfword   LDAH on page F5-2693
STL       Store-Release Word      STL on page F5-3035
STLB      Store-Release Byte      STLB on page F5-3036
STLH      Store-Release Halfword  STLH on page F5-3048





K5.3.2 A32 and T32 Load-Acquire/Store-Release Exclusive instructions 


Table K5-3 shows the new A32 and T32 Load-Acquire/Store-Release Exclusive instructions. 


Table K5-3 A32 and T32 Load-Acquire/Store-Release Exclusive instructions

Mnemonic  Instruction                       See
LDAEX     Load-Acquire Exclusive Word       LDAEX on page F5-2685
LDAEXB    Load-Acquire Exclusive Byte       LDAEXB on page F5-2687
LDAEXD    Load-Acquire Exclusive Double     LDAEXD on page F5-2689
LDAEXH    Load-Acquire Exclusive Halfword   LDAEXH on page F5-2691
STLEX     Store-Release Exclusive Word      STLEX on page F5-3037
STLEXB    Store-Release Exclusive Byte      STLEXB on page F5-3040
STLEXD    Store-Release Exclusive Double    STLEXD on page F5-3042
STLEXH    Store-Release Exclusive Halfword  STLEXH on page F5-3045
K5.4 New A32 and T32 scalar floating-point instructions


This section describes the new A32 and T32 scalar floating-point instructions. It contains the following subsections: 
• A32 and T32 floating-point conditional select.
• A32 and T32 floating-point minimum and maximum numeric.
• A32 and T32 floating-point to integer conversion.
• A32 and T32 floating-point conversion between half-precision and double-precision on page K5-5534.
• A32 and T32 floating-point round to integer on page K5-5534.


K5.4.1 A32 and T32 floating-point conditional select 


The new VSEL instruction conditionally copies one of its two source registers to the destination register. For A32 it
provides an alternative to a pair of conditional VMOV instructions, while for T32 it compensates for the partial
deprecation of the IT instruction described in Partial deprecation of IT on page K5-5531, since it does not require an
IT prefix.


Table K5-4 shows the A32 and T32 floating-point conditional select instructions.


Table K5-4 A32 and T32 Conditional select

Mnemonic  Instruction         See
VSEL      Conditional select  VSELEQ, VSELGE, VSELGT, VSELVS on page F6-3690

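
The select operation can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, the flag arguments n, z, and v stand for the N, Z, and V condition flags, and the four conditions are the standard ARM definitions of EQ, GE, GT, and VS that correspond to the VSELEQ, VSELGE, VSELGT, and VSELVS forms listed in Table K5-4.

```python
def condition_holds(cond, n, z, v):
    """Evaluate one of the four VSEL conditions from the N, Z and V flags (0 or 1)."""
    if cond == "EQ":
        return z == 1
    if cond == "GE":
        return n == v
    if cond == "GT":
        return z == 0 and n == v
    if cond == "VS":
        return v == 1
    raise ValueError("VSEL supports only EQ, GE, GT and VS")


def vsel(cond, first_source, second_source, n, z, v):
    """Return the first source if the condition holds, otherwise the second."""
    return first_source if condition_holds(cond, n, z, v) else second_source
```

Because the condition is part of the instruction itself, a single VSEL replaces the two conditional moves that would otherwise need an IT prefix in T32.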
K5.4.2 A32 and T32 floating-point minimum and maximum numeric 


The new VMAXNM and VMINNM instructions implement the maxNum(x,y) and minNum(x,y) operations defined by the
IEEE 754-2008 standard. They return the numerical operand when one operand is numerical and the other is a quiet
NaN, but otherwise the result is identical to VFP VMAX and VMIN. These instructions cannot be conditionally executed.


Table K5-5 shows the A32 and T32 floating-point minNum and maxNum instructions. 


Table K5-5 A32 and T32 floating-point minNum and maxNum instructions

Mnemonic  Instruction                       See
VMAXNM    Single-precision maxNum (scalar)  VMAXNM on page F6-3471
VMINNM    Double-precision minNum (scalar)  VMINNM on page F6-3478
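
The NaN handling described above can be sketched in Python. This is a hedged illustration, not the architectural definition: Python floats do not distinguish quiet from signaling NaNs, so every NaN is treated as quiet, and the treatment of signed zeros is not modelled. The names max_num and min_num are hypothetical.

```python
import math


def max_num(x, y):
    """Return the larger operand, preferring a number over a NaN."""
    if math.isnan(x):
        return y  # also covers the case where both operands are NaN
    if math.isnan(y):
        return x
    return max(x, y)


def min_num(x, y):
    """Return the smaller operand, preferring a number over a NaN."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)
```

When both operands are NaN the sketch returns a NaN, which is consistent with there being no numerical operand to prefer.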





K5.4.3 A32 and T32 floating-point to integer conversion 


These new instructions extend the ARMv7 VFP VCVT instructions by providing four additional explicit rounding 
modes. The instruction syntax is VCVTr, where r selects the rounding mode as follows: 


N Round to Nearest. 

A Round to Nearest with Ties to Away. 
P Round towards Plus Infinity. 

M Round towards Minus Infinity. 


The rounding modes are defined by the IEEE 754 standards, see Floating-point standards, and terminology on
page A1-48.


These instructions cannot be conditionally executed. 

Table K5-6 shows the A32 and T32 FP to integer conversion instructions. 


Table K5-6 A32 and T32 floating-point to integer conversion instructions

Mnemonic  Instruction                                                                See
VCVTA     Convert floating-point to integer with Round to Nearest with Ties to Away  VCVTA (floating-point) on page F6-3369
VCVTM     Convert floating-point to integer with Round towards Minus Infinity        VCVTM (floating-point) on page F6-3376
VCVTN     Convert floating-point to integer with Round to Nearest                    VCVTN (floating-point) on page F6-3380
VCVTP     Convert floating-point to integer with Round towards Plus Infinity         VCVTP (floating-point) on page F6-3384
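
The four explicit rounding modes can be compared with a short Python sketch. It is illustrative only: the function name vcvt is hypothetical, Round to Nearest is assumed to resolve ties to even, and NaN inputs, infinities, and saturation to the destination integer width are not modelled.

```python
import math


def vcvt(value, mode):
    """Convert a finite float to an integer using rounding mode N, A, P or M."""
    if mode == "N":
        return round(value)  # Python rounds exact ties to the even integer
    if mode == "A":
        magnitude = abs(value)
        whole = math.floor(magnitude)
        if magnitude - whole >= 0.5:
            whole += 1  # ties move away from zero
        return -whole if value < 0 else whole
    if mode == "P":
        return math.ceil(value)
    if mode == "M":
        return math.floor(value)
    raise ValueError("mode must be one of N, A, P, M")
```

The value 2.5 separates the modes clearly: N gives 2, A gives 3, P gives 3, and M gives 2.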





K5.4.4   A32 and T32 floating-point conversion between half-precision and double-precision


The VFP VCVTT and VCVTB instructions are extended to permit direct conversion between half-precision and
double-precision floating-point as a single operation, preventing double rounding errors.


Table K5-7 shows the A32 and T32 instructions to convert between half-precision and double-precision
floating-point values.


Table K5-7 A32 and T32 floating-point precision conversion

Mnemonic  Instruction                                                See
VCVTB     Floating-point convert half-precision to double-precision  VCVTB on page F6-3371
VCVTT     Floating-point convert double-precision to half-precision  VCVTT on page F6-3389


K5.4.5   A32 and T32 floating-point round to integer


The new round to integer instructions round a floating-point value to the nearest integer floating-point value of the 
same size. The floating-point exceptions that can be raised by these instructions are the Invalid operation, for a 
signaling NaN input, or Input Denormal, for a denormal input when Flush-to-zero mode is enabled. For VRINTX only,
an Inexact exception can also be raised if the result is numeric and does not have the same numerical value as the source.
A zero input gives a zero result with the same sign, an infinite input gives an infinite result with the same sign, and 
a NaN is propagated as for normal floating-point arithmetic. 


The instruction syntax is VRINTr, where r selects the rounding mode as follows:

N Round to Nearest.
A Round to Nearest with Ties to Away.
P Round towards Plus Infinity.
M Round towards Minus Infinity.
Z Round towards Zero.
R FPSCR rounding mode.
X FPSCR rounding mode, signaling inexactness.


When r is R, X, or Z, the round to integer instruction can be conditionally executed. The remaining round to integer
instructions must be unconditional.


Table K5-8 on page K5-5535 shows the A32 and T32 round to integer instructions. 


Table K5-8 A32 and T32 floating-point round to integer instructions

Mnemonic  Instruction                                                                        See

Round to integer instructions that can be conditionally executed
VRINTR    Round floating-point to integer using FPSCR rounding mode                          VRINTR on page F6-3662
VRINTX    Round floating-point to integer using FPSCR rounding mode, signaling inexactness   VRINTX (floating-point) on page F6-3666
VRINTZ    Round floating-point to integer towards Zero                                       VRINTZ (floating-point) on page F6-3670

Round to integer instructions that must be unconditional
VRINTA    Round floating-point to integer to Nearest with Ties to Away                       VRINTA (floating-point) on page F6-3648
VRINTM    Round floating-point to integer towards Minus Infinity                             VRINTM (floating-point) on page F6-3652
VRINTN    Round floating-point to integer to Nearest with Ties to Even                       VRINTN (floating-point) on page F6-3656
VRINTP    Round floating-point to integer towards Plus Infinity                              VRINTP (floating-point) on page F6-3660
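
The handling of special values by the round to integer instructions can be sketched in Python. This is a hedged illustration: the function name vrint is hypothetical, only the N, P, M, and Z modes are modelled, and the sketch assumes that a result that rounds to zero keeps the sign of the source. The second return value reports whether the result differs from the source, which is the condition under which VRINTX raises Inexact.

```python
import math


def vrint(value, mode):
    """Round a float to an integral float; return (result, differs_from_source)."""
    # Zeros, infinities and NaNs pass through with their sign unchanged.
    if math.isnan(value) or math.isinf(value) or value == 0.0:
        return value, False
    if mode == "N":
        rounded = round(value)  # ties to even
    elif mode == "P":
        rounded = math.ceil(value)
    elif mode == "M":
        rounded = math.floor(value)
    elif mode == "Z":
        rounded = math.trunc(value)
    else:
        raise ValueError("mode must be one of N, P, M, Z")
    # copysign keeps a negative sign on a result that rounds to zero (assumption).
    result = math.copysign(float(rounded), value)
    return result, result != value
```

Unlike the conversion to integer, the result here stays in floating-point format, so an infinite input remains representable and is returned unchanged.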

K5.5 New A32 and T32 Advanced SIMD floating-point instructions 
The AArch32 Advanced SIMD instructions support only single-precision, 32-bit, floating-point data types, with 
fixed operating modes of Round to Nearest, Default NaN, and Flush-to-Zero. However, they are extended by the 
addition of the instructions described in the following subsections: 
• A32 and T32 floating-point minimum and maximum numeric.
• A32 and T32 floating-point conversion.
• A32 and T32 floating-point round to integral.
K5.5.1 A32 and T32 floating-point minimum and maximum numeric 
Vector forms of the new VMAXNM and VMINNM instructions are described in A32 and T32 floating-point minimum and 
maximum numeric on page K5-5533. 
Table K5-9 shows the A32 and T32 floating-point minNum/maxNum instructions. 
Table K5-9 A32 and T32 floating-point minNum/maxNum instructions 
Mnemonic Instruction See 
VMAXNM Single-precision maxNum (vector) VMAXNM on page F6-3471 
VMINNM Double-precision minNum (vector) VMINNM on page F6-3478 
K5.5.2 A32 and T32 floating-point conversion 
Vector forms of the floating-point to integer conversion instructions are described in A32 and T32 floating-point to 
integer conversion on page K5-5533. The instruction syntax is VCVTr, where r selects the rounding mode as follows: 
N Round to Nearest. 
A Round to Nearest with Ties to Away. 
P Round towards Plus Infinity. 
M Round towards Minus Infinity. 
The rounding modes are defined by the IEEE 754 standards, see Floating-point standards, and terminology on
page A1-48.

Table K5-10 shows the A32 and T32 floating-point conversion instructions.

Table K5-10 A32 and T32 floating-point conversion instructions

Mnemonic  Instruction                                                                       See
VCVTA     Vector Convert floating-point to integer with Round to Nearest with Ties to Away  VCVTA (Advanced SIMD) on page F6-3367
VCVTM     Vector Convert floating-point to integer with Round towards Minus Infinity        VCVTM (Advanced SIMD) on page F6-3374
VCVTN     Vector Convert floating-point to integer with Round to Nearest                    VCVTN (Advanced SIMD) on page F6-3378
VCVTP     Vector Convert floating-point to integer with Round towards Plus Infinity         VCVTP (Advanced SIMD) on page F6-3382

K5.5.3 A32 and T32 floating-point round to integral 
Vector forms of the floating-point rounding instructions are described in A32 and T32 floating-point round to 
integer on page K5-5534. The instruction syntax is VRINTr, where r selects the rounding mode as follows: 
N Round to Nearest.
A Round to Nearest with Ties to Away.
P Round towards Plus Infinity.
M Round towards Minus Infinity.
Z Round towards Zero.
X Round to Nearest, signaling inexactness.


The rounding modes are defined by the IEEE 754 standards, see Floating-point standards, and terminology on
page A1-48.


Table K5-11 shows the A32 and T32 floating-point round to integral instructions. 


Table K5-11 A32 and T32 SIMD floating-point round to integral instructions

Mnemonic  Instruction                                                                  See
VRINTA    Vector Round floating-point to integer towards Nearest with Ties to Away     VRINTA (Advanced SIMD) on page F6-3646
VRINTM    Vector Round floating-point to integer towards Minus Infinity                VRINTM (Advanced SIMD) on page F6-3650
VRINTN    Vector Round floating-point to integer to Nearest                            VRINTN (Advanced SIMD) on page F6-3654
VRINTP    Vector Round floating-point to integer towards Plus Infinity                 VRINTP (Advanced SIMD) on page F6-3658
VRINTX    Vector Round floating-point to integer to Nearest, signaling inexactness     VRINTX (Advanced SIMD) on page F6-3664
VRINTZ    Vector Round floating-point to integer towards Zero                          VRINTZ (Advanced SIMD) on page F6-3668
K5.6 New A32 and T32 instructions provided by the Cryptographic Extension
The optional Cryptographic Extension instructions use the SIMD and floating-point register file. For more 
information see: 

• Announcing the Advanced Encryption Standard.
• The Galois/Counter Mode of Operation.
• Announcing the Secure Hash Standard.

Table K5-12 shows the A32 and T32 Cryptographic Extension instructions. 

Table K5-12 A32 and T32 Cryptographic Extension instructions

Mnemonic   Instruction                                       See
AESD       AES single round decryption                       AESD on page F6-3235
AESE       AES single round encryption                       AESE on page F6-3237
AESIMC     AES inverse mix columns                           AESIMC on page F6-3239
AESMC      AES mix columns                                   AESMC on page F6-3240
SHA1C      SHA1 hash update accelerator, choose              SHA1C on page F6-3248
SHA1M      SHA1 hash update accelerator, majority            SHA1M on page F6-3251
SHA1P      SHA1 hash update accelerator, parity              SHA1P on page F6-3253
SHA1H      SHA1 hash update accelerator, rotate left by 30   SHA1H on page F6-3250
SHA1SU0    SHA1 schedule update accelerator, first part      SHA1SU0 on page F6-3255
SHA1SU1    SHA1 schedule update accelerator, second part     SHA1SU1 on page F6-3257
SHA256H    SHA256 hash update accelerator                    SHA256H on page F6-3259
SHA256H2   SHA256 hash update accelerator upper part         SHA256H2 on page F6-3260
SHA256SU0  SHA256 schedule update accelerator, first part    SHA256SU0 on page F6-3261
SHA256SU1  SHA256 schedule update accelerator, second part   SHA256SU1 on page F6-3263
VMULL      Polynomial multiply long, 64x64 to 128-bit        VMULL (integer and polynomial) on page F6-3537
K5.7 New A32 and T32 System instructions


This section describes the new System instructions. It contains the following subsections:

• External Debug.
• Barriers and hints.
• TLB Maintenance.

K5.7.1 External Debug 


Table K5-13 shows the new External Debug support instructions. 


Table K5-13 External Debug support instructions

Mnemonic    Instructions                                  Notes
DCPS1       Debug switch to EL1, valid in Debug state only  -
DCPS2       Debug switch to EL2, valid in Debug state only  -
DCPS3       Debug switch to EL3, valid in Debug state only  -
HLT #uimm6  Halt instruction                                Enters Debug state if allowed, with a 6-bit payload in uimm6, otherwise treated as UNDEFINED





K5.7.2 Barriers and hints 


There are new A32 and T32 barrier options and hint instructions. 


Table K5-14 shows the new A32 and T32 barrier instructions. 


Table K5-14 Additional barrier instructions

Mnemonic                       Notes
DMB {ISHLD, OSHLD, NSHLD, LD}  Data Memory Barrier is extended to support the new Load-Load/Store options
DSB {ISHLD, OSHLD, NSHLD, LD}  Data Synchronization Barrier is extended to support the new Load-Load/Store options
SEVL                           Send Event Locally without the requirement to affect other processors, for example to prime a
                               wait-loop that starts with a WFE instruction





K5.7.3 TLB Maintenance 


TLB maintenance instructions that are only required to apply to the last level translation table walk of the first stage 
of translation are added to A32 and T32 as shown in Table K5-15. For more information see Translation Lookaside 
Buffers (TLBs) on page G4-4089 and TLB maintenance requirements on page G4-4093. 


Table K5-15 Additional A32 and T32 TLB maintenance instructions

Mnemonic       Relation to existing A32/T32 operation
TLBIMVALIS     Related to the TLBIMVAIS operation
TLBIMVAALIS    Related to the TLBIMVAAIS operation
TLBIMVALHIS    Related to the TLBIMVAHIS operation
TLBIMVAL       Related to the TLBIMVA operation
TLBIMVAAL      Related to the TLBIMVAA operation
TLBIMVALH      Related to the TLBIMVAH operation

A32 and T32 include TLB maintenance instructions that must apply to individual entries from stage 2 TLB
structures, that hold IPA to PA translations. These are consistent with the A64 TLBI system instructions described in
New A32 and T32 System instructions on page K5-5539. The relation between the A32 and T32 instructions and the
A64 instructions is shown in Table K5-16.

Table K5-16 Relation of A32 TLB maintenance instructions to A64 instructions

T32 and A32 instruction    Relation to A64 instruction
TLBIIPAS2IS                Equivalent to IPAS2E1IS
TLBIIPAS2LIS               Equivalent to IPAS2LE1IS
                           Related to existing A32/T32 TLBIIPAS2IS operation
TLBIIPAS2                  Equivalent to the A64 IPAS2E1 operation
TLBIIPAS2L                 Equivalent to IPAS2LE1 operation
                           Related to existing A32/T32 TLBIIPAS2 operation

Note 


These new system operations are accessed using the MCR instruction or, if implemented, by an assembler using the
SYS mnemonic followed by the TLBI operation name.

K5.8 CRC32 instructions

ARMv8 introduces CRC32 instructions to the T32 and A32 instruction sets. Table K5-17 shows these instructions:

Table K5-17 CRC32 instructions

Mnemonic    Instructions                         See
CRC32       CRC32 using polynomial 0x04C11DB7    CRC32 on page F5-2650
CRC32C      CRC32 using polynomial 0x1EDC6F41    CRC32C on page F5-2653

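The two polynomials in Table K5-17 can be exercised with a small software model. The following is a minimal sketch, not the architectural pseudocode: it assumes the usual bit-reflected accumulation order, and the names reflect32, crc_update, crc32_ieee, and crc32c are hypothetical helpers introduced only for this illustration. The initial value and final inversion are software conventions applied around the accumulation step, and are also assumptions here.

```python
import zlib


def reflect32(value):
    """Reverse the bit order of a 32-bit value."""
    return int(format(value & 0xFFFFFFFF, "032b")[::-1], 2)


def crc_update(acc, data, poly):
    """Fold the bytes of data into the 32-bit accumulator acc.

    poly is the polynomial as written in Table K5-17. The bit-reflected
    processing order is an assumption of this sketch.
    """
    rpoly = reflect32(poly)
    for byte in data:
        acc ^= byte
        for _ in range(8):
            acc = (acc >> 1) ^ (rpoly if acc & 1 else 0)
    return acc & 0xFFFFFFFF


def crc32_ieee(data):
    """CRC-32 with polynomial 0x04C11DB7, using the common init/final inversion."""
    return crc_update(0xFFFFFFFF, data, 0x04C11DB7) ^ 0xFFFFFFFF


def crc32c(data):
    """CRC-32C with polynomial 0x1EDC6F41, using the common init/final inversion."""
    return crc_update(0xFFFFFFFF, data, 0x1EDC6F41) ^ 0xFFFFFFFF


if __name__ == "__main__":
    message = b"123456789"
    # Cross-check the first polynomial against the standard library.
    print(hex(crc32_ieee(message)), hex(zlib.crc32(message)))
    print(hex(crc32c(message)))
```

Splitting the accumulation step from the inversion convention keeps the sketch close to how an accumulate-style instruction would be used in a loop, with the caller owning the initial and final values.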
Appendix K6
Legacy Instruction Syntax for AArch32 Instruction Sets

This appendix describes the legacy instruction syntax in the ARM instruction sets, and its Unified Assembler
Language (UAL) equivalents. It contains the following section:

• Legacy Instruction Syntax on page K6-5544.

K6.1 Legacy Instruction Syntax


Early versions of the ARM Architecture defined an assembly language for A32 (ARM) instructions, and a separate 
assembly language for T32 (Thumb) instructions. UAL is based on the A32 assembly language, with some changes 
to the instruction syntax. The appendix describes those changes. The pre-UAL mnemonics are compatible with 
UAL, and might be supported by an assembler. 


The original T32 assembly language is not compatible with UAL, and is not described in the manual. 


K6.1.1 Pre-UAL instruction syntax for the A32 base instructions 


Table K6-1 lists the syntax for the A32 base instructions that changed when UAL was introduced.

Table K6-1 Pre-UAL instruction syntax for the A32 base instructions

Pre-UAL syntax        UAL equivalent    See
ADC<c>S               ADCS<c>           ADC, ADCS (immediate) on page F5-2561,
                                        ADC, ADCS (register) on page F5-2563,
                                        ADC, ADCS (register-shifted register) on page F5-2567
ADD<c>S               ADDS<c>           ADD, ADDS (immediate) on page F5-2569,
                                        ADD, ADDS (register) on page F5-2573,
                                        ADD, ADDS (register-shifted register) on page F5-2577,
                                        ADD, ADDS (SP plus immediate) on page F5-2579,
                                        ADD, ADDS (SP plus register) on page F5-2582
AND<c>S               ANDS<c>           AND, ANDS (immediate) on page F5-2591,
                                        AND, ANDS (register) on page F5-2593,
                                        AND, ANDS (register-shifted register) on page F5-2597
BIC<c>S               BICS<c>           BIC, BICS (immediate) on page F5-2614,
                                        BIC, BICS (register) on page F5-2616,
                                        BIC, BICS (register-shifted register) on page F5-2619
EOR<c>S               EORS<c>           EOR, EORS (immediate) on page F5-2665,
                                        EOR, EORS (register) on page F5-2667,
                                        EOR, EORS (register-shifted register) on page F5-2671
LDC<c>L               LDCL<c>           LDC (immediate) on page F5-2695,
                                        LDC (literal) on page F5-2697
LDM<c>IA, LDM<c>FD    LDM<c>            LDM, LDMIA, LDMFD on page F5-2699
LDM<c>DA, LDM<c>FA    LDMDA<c>          LDMDA, LDMFA on page F5-2707
LDM<c>DB, LDM<c>EA    LDMDB<c>          LDMDB, LDMEA on page F5-2709
LDM<c>IB, LDM<c>ED    LDMIB<c>          LDMIB, LDMED on page F5-2712
LDR<c>B               LDRB<c>           LDRB (immediate) on page F5-2724,
                                        LDRB (literal) on page F5-2727,
                                        LDRB (register) on page F5-2729
LDR<c>BT              LDRBT<c>          LDRBT on page F5-2732
LDR<c>D               LDRD<c>           LDRD (immediate) on page F5-2735,
                                        LDRD (literal) on page F5-2738,
                                        LDRD (register) on page F5-2741
LDR<c>H               LDRH<c>           LDRH (immediate) on page F5-2751,
                                        LDRH (literal) on page F5-2755,
                                        LDRH (register) on page F5-2757
LDR<c>SB              LDRSB<c>          LDRSB (immediate) on page F5-2763,
                                        LDRSB (literal) on page F5-2766,
                                        LDRSB (register) on page F5-2768
LDR<c>SH              LDRSH<c>          LDRSH (immediate) on page F5-2774,
                                        LDRSH (literal) on page F5-2777,
                                        LDRSH (register) on page F5-2779
LDR<c>T               LDRT<c>           LDRT on page F5-2785
MLA<c>S               MLAS<c>           MLA, MLAS on page F5-2808
LSLS <Rd>, <Rn>, #0   MOVS <Rd>, <Rm>   MOV, MOVS (immediate) on page F5-2812,
                                        MOV, MOVS (register) on page F5-2815
MOV<c>S               MOVS<c>
MUL<c>S               MULS<c>           MUL, MULS on page F5-2845
MVN<c>S               MVNS<c>           MVN, MVNS (immediate) on page F5-2847,
                                        MVN, MVNS (register) on page F5-2849,
                                        MVN, MVNS (register-shifted register) on page F5-2852
ORR<c>S               ORRS<c>           ORR, ORRS (immediate) on page F5-2859,
                                        ORR, ORRS (register) on page F5-2861,
                                        ORR, ORRS (register-shifted register) on page F5-2865
QADDSUBX              QASX              QASX on page F5-2896
QSUBADDX              QSAX              QSAX on page F5-2902
RSB<c>S               RSBS<c>           RSB, RSBS (immediate) on page F5-2933,
                                        RSB, RSBS (register) on page F5-2936,
                                        RSB, RSBS (register-shifted register) on page F5-2939
RSC<c>S               RSCS<c>           RSC, RSCS (immediate) on page F5-2941,
                                        RSC, RSCS (register) on page F5-2943,
                                        RSC, RSCS (register-shifted register) on page F5-2945
SADDSUBX              SASX              SASX on page F5-2951
SBC<c>S               SBCS<c>           SBC, SBCS (immediate) on page F5-2953,
                                        SBC, SBCS (register),
                                        SBC, SBCS (register-shifted register) on page F5-2958
SHADDSUBX             SHASX             SHASX on page F5-2975
SHSUBADDX             SHSAX             SHSAX on page F5-2977
SMI<c>                SMC<c>            SMC on page F5-2983
SMLAL<c>S             SMLALS<c>         SMLAL, SMLALS on page F5-2989
SMULL<c>S             SMULLS<c>         SMULL, SMULLS on page F5-3012
SSUBADDX<c>           SSAX<c>           SSAX on page F5-3026
STC<c>L               STCL<c>           STC on page F5-3032
STM<c>EA, STM<c>IA    STM<c>            STM, STMIA, STMEA on page F5-3049
STM<c>DA, STM<c>ED    STMDA<c>          STMDA, STMED on page F5-3055
STM<c>DB, STM<c>FD    STMDB<c>          STMDB, STMFD on page F5-3057
STM<c>IB, STM<c>FA    STMIB<c>          STMIB, STMFA on page F5-3060
STR<c>B               STRB<c>           STRB (immediate) on page F5-3069,
                                        STRB (register) on page F5-3073
STR<c>BT              STRBT<c>          STRBT on page F5-3076
STR<c>D               STRD<c>           STRD (immediate) on page F5-3080,
                                        STRD (register) on page F5-3084
STR<c>H               STRH<c>           STRH (immediate) on page F5-3098,
                                        STRH (register) on page F5-3102
STR<c>T               STRT<c>           STRT on page F5-3109
SUB<c>S               SUBS<c>           SUB, SUBS (immediate) on page F5-3114,
                                        SUB, SUBS (register) on page F5-3118,
                                        SUB, SUBS (register-shifted register) on page F5-3121,
                                        SUB, SUBS (SP minus immediate) on page F5-3123,
                                        SUB, SUBS (SP minus register) on page F5-3126
SWI                   SVC               SVC on page F5-3129
UADDSUBX              UASX              UASX on page F5-3158
UHADDSUBX             UHASX             UHASX on page F5-3170
UHSUBADDX             UHSAX             UHSAX on page F5-3172
UMLAL<c>S             UMLALS<c>         UMLAL, UMLALS on page F5-3180
UMULL<c>S             UMULLS<c>         UMULL, UMULLS on page F5-3182
UQADDSUBX             UQASX             UQASX on page F5-3188
UQSUBADDX             UQSAX             UQSAX on page F5-3190
USUBADDX              USAX              USAX on page F5-3204
UEXT8                 UXTB              UXTB on page F5-3216
UEXT16                UXTH              UXTH on page F5-3220

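Many rows of Table K6-1 follow one pattern: in the pre-UAL form a suffix such as S, B, or L follows the condition code, and in UAL it precedes it. The following is a minimal sketch of that reordering for a subset of the rows. The function name to_ual and the rule table are hypothetical, introduced only for this illustration, and the sketch does not cover the renamed mnemonics or the load/store multiple addressing-mode aliases.

```python
import re

# Condition code mnemonics, including the HS and LO synonyms.
CONDITIONS = "EQ|NE|CS|HS|CC|LO|MI|PL|VS|VC|HI|LS|GE|LT|GT|LE|AL"

# (suffixes, base mnemonics) pairs taken from rows of Table K6-1 in which
# the suffix follows <c> in the pre-UAL form. Illustrative subset only.
SUFFIX_RULES = [
    ("S", "ADC|ADD|AND|BIC|EOR|MLA|MOV|MUL|MVN|ORR|RSB|RSC|SBC|SUB"
          "|SMLAL|SMULL|UMLAL|UMULL"),
    ("L", "LDC|STC"),
    ("BT|B|D|H|SB|SH|T", "LDR"),
    ("BT|B|D|H|T", "STR"),
]


def to_ual(mnemonic):
    """Move a trailing suffix in front of the condition code, if a rule applies."""
    name = mnemonic.upper()
    for suffixes, bases in SUFFIX_RULES:
        match = re.fullmatch(f"({bases})({CONDITIONS})({suffixes})", name)
        if match:
            base, cond, suffix = match.groups()
            return base + suffix + cond
    # No condition code in the middle, or not a covered row: leave unchanged.
    return name


if __name__ == "__main__":
    for legacy in ("ADDEQS", "LDRNEB", "LDRGESH", "STCLTL", "ADDS"):
        print(legacy, "->", to_ual(legacy))
```

A mnemonic with no condition code, such as ADDS, is identical in both syntaxes, so the sketch returns it unchanged.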
K6.1.2 Pre-UAL instruction syntax for the A32 floating-point instructions


Table K6-2 lists the syntax for A32 floating-point instructions that changed when UAL was introduced.

Table K6-2 Pre-UAL instruction syntax for A32 floating-point instructions

Pre-UAL syntax          UAL equivalent            See
FABSD                   VABS.F64                  VABS on page F6-3275
FABSS                   VABS.F32
FADDD                   VADD.F64                  VADD (floating-point) on page F6-3286
FADDS                   VADD.F32
FCMPEZD                 VCMPE.F64                 VCMPE on page F6-3345
FCMPEZS                 VCMPE.F32
FCMPZD                  VCMP.F64                  VCMP on page F6-3342
FCMPZS                  VCMP.F32
FCONSTD <Dd>, #<imm8>   VMOV.F64 <Dd>, #<fpimm>   VMOV (immediate) on page F6-3505
FCONSTS <Sd>, #<imm8>   VMOV.F32 <Sd>, #<fpimm>   For more information, see FCONST on page K6-5550.
FCPYD                   VMOV.F64                  VMOV (register) on page F6-3508
FCPYS                   VMOV.F32
FCVTDS                  VCVT.F64.F32              VCVT (between double-precision and single-precision) on
FCVTSD                  VCVT.F32.F64              page F6-3350
FDIVD                   VDIV.F64                  VDIV on page F6-3392
FDIVS                   VDIV.F32
FLDD                    VLDR.F64                  VLDR (immediate) on page F6-3463
                                                  VLDR (literal) on page F6-3465
FLDMD, FLDMIAD          VLDM.F64                  VLDM, VLDMDB, VLDMIA on page F6-3458
FLDMS                   VLDM.F32
FLDS                    VLDR.F32                  VLDR (immediate) on page F6-3463
                                                  VLDR (literal) on page F6-3465
FMACD                   VMLA.F64                  VMLA (floating-point) on page F6-3481
FMACS                   VMLA.F32
FMDHR <Dd>, <Rt>        VMOV <Dd[1]>, <Rt>        VMOV (general-purpose register to scalar) on page F6-3512
FMDLR <Dd>, <Rt>        VMOV <Dd[0]>, <Rt>
FMDRR                   VMOV                      VMOV (between two general-purpose registers and a doubleword
                                                  floating-point register) on page F6-3503
FMRDH <Rt>, <Dd>        VMOV <Rt>, <Dd[1]>        VMOV (scalar to general-purpose register) on page F6-3516
FMRDL <Rt>, <Dd>        VMOV <Rt>, <Dd[0]>
FMRRD                   VMOV                      VMOV (between two general-purpose registers and a doubleword
                                                  floating-point register) on page F6-3503
FMRRS                   VMOV                      VMOV (between two general-purpose registers and two
                                                  single-precision registers) on page F6-3518
FMRS                    VMOV                      VMOV (between general-purpose register and single-precision) on
                                                  page F6-3514
FMRX                    VMRS                      VMRS on page F6-3525
FMSCD                   VNMLS.F64                 VNMLS on page F6-3550
FMSCS                   VNMLS.F32
FMSR                    VMOV                      VMOV (between general-purpose register and single-precision) on
                                                  page F6-3514
FMSRR                   VMOV                      VMOV (between two general-purpose registers and two
                                                  single-precision registers) on page F6-3518
FMSTAT                  VMRS APSR_nzcv, FPSCR     VMRS on page F6-3525
FMULD                   VMUL.F64                  VMUL (floating-point) on page F6-3530
FMULS                   VMUL.F32
FMXR                    VMSR                      VMSR on page F6-3528
FNEGD                   VNEG.F64                  VNEG on page F6-3545
FNEGS                   VNEG.F32
FNMACD                  VMLS.F64                  VNMLS on page F6-3550
FNMACS                  VMLS.F32
FNMSCD                  VNMLA.F64                 VNMLA on page F6-3548
FNMSCS                  VNMLA.F32
FNMULD                  VNMUL.F64                 VNMUL on page F6-3552
FNMULS                  VNMUL.F32
FSHTOD                  VCVT.F64.S16              VCVT (between floating-point and fixed-point, floating-point) on
FSHTOS                  VCVT.F32.S16              page F6-3364
FSITOD                  VCVT.F64.S32              VCVT (between floating-point and integer, Advanced SIMD) on
FSITOS                  VCVT.F32.S32              page F6-3354, VCVTR on page F6-3386
FSLTOD                  VCVT.F64.S32              VCVT (between floating-point and fixed-point, floating-point) on
FSLTOS                  VCVT.F32.S32              page F6-3364
FSQRTD                  VSQRT.F64                 VSQRT on page F6-3710
FSQRTS                  VSQRT.F32
FSTD                    VSTR                      VSTR on page F6-3749
FSTMD, FSTMIAD          VSTM.F64                  VSTM, VSTMDB, VSTMIA on page F6-3744
FSTMS                   VSTM.F32
FSTS                    VSTR                      VSTR on page F6-3749
FSUBD                   VSUB.F64                  VSUB (floating-point) on page F6-3751
FSUBS                   VSUB.F32
FTOSHD                  VCVT.S16.F64              VCVT (between floating-point and fixed-point, floating-point) on
FTOSHS                  VCVT.S16.F32              page F6-3364
FTOSID                  VCVT.S32.F64              VCVT (between floating-point and integer, Advanced SIMD) on
FTOSIS                  VCVT.S32.F32              page F6-3354
FTOSIZD                 VCVTR.S32.F64             VCVTR on page F6-3386
FTOSIZS                 VCVTR.S32.F32
FTOSLD                  VCVT.S32.F64              VCVT (between floating-point and fixed-point, floating-point) on
FTOSLS                  VCVT.S32.F32              page F6-3364
FTOUHD                  VCVT.U16.F64
FTOUHS                  VCVT.U16.F32
FTOUID                  VCVT.U32.F64              VCVT (between floating-point and integer, Advanced SIMD) on
FTOUIS                  VCVT.U32.F32              page F6-3354
FTOUIZD                 VCVTR.U32.F64             VCVTR on page F6-3386
FTOUIZS                 VCVTR.U32.F32
FTOULD                  VCVT.U32.F64              VCVT (between floating-point and fixed-point, floating-point) on
FTOULS                  VCVT.U32.F32              page F6-3364
FUHTOD                  VCVT.F64.U16
FUHTOS                  VCVT.F32.U16
FUITOD                  VCVT.F64.U32              VCVT (between floating-point and integer, Advanced SIMD) on
FUITOS                  VCVT.F32.U32              page F6-3354
FULTOD                  VCVT.F64.U32              VCVT (between floating-point and fixed-point, floating-point) on
FULTOS                  VCVT.F32.U32              page F6-3364


K6.1.3 FCONST 
The syntax of FCONST is:

FCONST<dest>{<c>} <Fd>, #<imm8>

where:

<dest>    Specifies the destination data type. It must be one of:
          S    Single-precision floating-point.
          D    Double-precision floating-point.

<c>       This is an optional field. It specifies the condition under which the instruction is executed. See
          Conditional execution on page F2-2407 for the range of available conditions and their encoding. If
          <c> is omitted, it defaults to always (AL).

<Fd>      Specifies the destination register. It must be one of:
          <Dd>    64-bit name of the SIMD&FP destination register.
          <Sd>    32-bit name of the SIMD&FP destination register.

<imm8>    Specifies the immediate value used to generate the floating-point constant.

FCONSTD{<c>} <Dd>, #<imm8> maps to VMOV.F64 <Dd>, #<fpimm>

FCONSTS{<c>} <Sd>, #<imm8> maps to VMOV.F32 <Sd>, #<fpimm>

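This section does not spell out how <imm8> becomes the constant <fpimm>. The following is a minimal sketch of the single-precision case, assuming the expansion that the VFPExpandImm() pseudocode function describes for VMOV (immediate): the sign is imm8[7], the exponent is NOT(imm8[6]) followed by five copies of imm8[6] and then imm8[5:4], and the fraction is imm8[3:0] followed by zeros. The function name expand_imm8_f32 is hypothetical; the instruction description remains the authoritative definition.

```python
import struct


def expand_imm8_f32(imm8):
    """Expand an 8-bit FCONSTS/VMOV.F32 immediate to a Python float.

    Assumed layout: sign = imm8[7], exponent = NOT(imm8[6]) : imm8[6] x 5 :
    imm8[5:4], fraction = imm8[3:0] followed by 19 zero bits.
    """
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must be an 8-bit value")
    sign = (imm8 >> 7) & 1
    bit6 = (imm8 >> 6) & 1
    exponent = ((bit6 ^ 1) << 7) | ((0b11111 if bit6 else 0) << 2) | ((imm8 >> 4) & 0b11)
    fraction = (imm8 & 0xF) << 19
    word = (sign << 31) | (exponent << 23) | fraction
    # Reinterpret the 32-bit pattern as an IEEE 754 single-precision value.
    return struct.unpack("<f", struct.pack("<I", word))[0]


if __name__ == "__main__":
    for imm8 in (0x70, 0x00, 0x60, 0x78, 0xF0, 0x3F):
        print(f"#0x{imm8:02X} -> {expand_imm8_f32(imm8)}")
```

Under this assumption an assembler that accepts FCONSTS s0, #0x70 would emit the same encoding as VMOV.F32 s0, #1.0.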
Appendix K7
Address Translation Examples

This appendix gives examples of address translations using the translation regimes described in Chapter D4 The
AArch64 Virtual Memory System Architecture and Chapter G4 The AArch32 Virtual Memory System Architecture.
It contains the following sections:

• AArch64 Address translation examples on page K7-5552.
• AArch32 Address translation examples on page K7-5565.


Note 


This chapter gives examples of translation table lookups for the ARMv8 address translation stages. It does not 
define any part of the address translation mechanism. If any information in this appendix appears to contradict the 
information in Chapter D4 The AArch64 Virtual Memory System Architecture or Chapter G4 The AArch32 Virtual 
Memory System Architecture then the information in Chapter D4 or Chapter G4 must be taken as the definition of 
the required behavior. 

K7.1 AArch64 Address translation examples 

Figure D4-1 on page D4-1727 shows the VMSAv8 address translation stages that are controlled by an Exception
level that is using AArch64. The VMSAv8-64 address translation system on page D4-1726 describes the
VMSAv8-64 address translation scheme. This section gives examples of the use of that scheme, for common
translation requirements.

System registers relevant to MMU operation on page D4-1730 specifies the relevant registers, including the TCR
and TTBR, or TTBRs, for each stage of address translation.

For any stage of translation, a TCR.TnSZ field indicates the supported input address size. For a stage of address
translation controlled from an Exception level using AArch64, the supported input address size is 2^(64-TnSZ).
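As a quick illustration of the TnSZ relationship, the following minimal sketch converts a TnSZ value to the input address width in bits and the size of the address range in bytes. The function names are hypothetical, and the sketch checks only that the value fits a 6-bit field; it does not check which TnSZ values a particular stage or granule permits.

```python
def input_address_bits(tnsz):
    """Return the input address width, in bits, for a TCR.TnSZ value."""
    if not 0 <= tnsz <= 63:
        raise ValueError("TnSZ is treated here as a 6-bit field")
    return 64 - tnsz


def input_address_size(tnsz):
    """Return the size of the supported input address range, in bytes."""
    return 1 << input_address_bits(tnsz)


if __name__ == "__main__":
    for tnsz in (16, 25, 32):
        bits = input_address_bits(tnsz)
        # With (n+1) address bits, the input address is IA[n:0].
        print(f"TnSZ={tnsz}: {bits}-bit input address, IA[{bits - 1}:0]")
```

For example, TnSZ = 16 gives a 48-bit input address, IA[47:0], which is the largest size used in the examples in this appendix.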

This section describes:

• Performing the initial lookup, for an address for which the initial lookup is either:

  — At the highest lookup level used for the appropriate translation granule size.

  — Because of the concatenation of translation tables at the initial lookup level, one level down from the
    highest level used for the translation granule size.

  These descriptions take account of the following cases:

  — The IA size is smaller than the largest size for the translation level, see Reduced IA width on
    page D4-1740.

  — For a stage 2 translation, translation tables are concatenated, to move the initial lookup level down by
    one level, see Concatenated translation tables on page D4-1741.

  For examples of performing the initial lookup, see Examples of performing the initial lookup.

• The full translation flow for resolving a page of memory. These examples describe resolving the largest IA
  size supported by the initial lookup level. For these examples, see Full translation flows for VMSAv8-64
  address translation on page K7-5559.

K7.1.1 Examples of performing the initial lookup 

The address ranges used for the initial translation table lookup depend on the translation granule, as described in:

• Performing the initial lookup using the 4KB translation granule.
• Performing the initial lookup using the 16KB granule on page K7-5555.
• Performing the initial lookup using the 64KB translation granule on page K7-5557.

Performing the initial lookup using the 4KB translation granule

This subsection describes examples of the initial lookup when using the 4KB translation granule that Table D4-12
on page D4-1746 shows as starting at level 0 or at level 1. It includes those stage 2 translations where concatenation
of translation tables is required for the lookup to start at level 1. This means that it gives specific examples of the
mechanisms described in The VMSAv8-64 address translation system on page D4-1726.

Note

For stage 2 translations, the same principles apply to an initial lookup that Table D4-12 on page D4-1746 shows as
starting at level 1. In this case, for some IA sizes concatenation of translation tables means the lookup can, instead,
start at level 2.

The following subsections describe these examples of the initial lookup:

• Initial lookup at level 0, 4KB translation granule on page K7-5553.
• Initial lookup at level 1, 4KB translation granule on page K7-5553.

In all cases, for a stage 2 translation, the VTCR_EL2.SL0 field must indicate the required initial lookup level, and
this level must be consistent with the value of the VTCR_EL2.T0SZ field, see Overview of stage 2 translations,
4KB granule on page D4-1746.

Initial lookup at level 0, 4KB translation granule


This subsection describes initial lookups with an input address width of (n+1) bits, meaning the input address is
IA[n:0]. As Table D4-12 on page D4-1746 shows, a stage 1 or stage 2 initial lookup at level 0 is required when
39 ≤ n ≤ 47. For these lookups:

• TTBR[47:(n-35)] specify the translation table base address.
• Bits[n:39] of the input address are bits[(n-36):3] of the descriptor offset in the translation table.

Note

This means that, when the input address width is less than 48 bits:
• The size of the translation table is reduced.
• More low-order bits of the TTBR are required to specify the translation table base address.
• Fewer input address bits are used to specify the descriptor offset in the translation table.

For example, if the input address width is 46 bits:
• The translation table size is 1KB.
• TTBR bits[47:10] specify the translation table base address.
• Input address bits[45:39] specify bits[9:3] of the descriptor offset.





Figure K7-1 shows this lookup. 


[Figure: bit fields of the input address, of the TTBR (Register-defined, Translation table base address[47:x], RES0*), and of the descriptor address†]

Supported input address range is IA[y:0], 4 ≤ x ≤ 12, y = x + 35. When y is 47 the field marked ‡ is absent.
† For a Non-secure EL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-1 Initial lookup for VMSAv8-64 using the 4KB granule, starting at level 0
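The level 0 lookup with the 4KB granule can be sketched as follows. This is a minimal illustration of the bit-field arithmetic only, assuming 8-byte descriptors (implied by the offset starting at bit 3). The helper names bits and level0_descriptor_address_4kb, and the register and address values, are hypothetical.

```python
def bits(value, hi, lo):
    """Return value[hi:lo] as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def level0_descriptor_address_4kb(ttbr, ia, n):
    """Descriptor address for an initial lookup at level 0, 4KB granule.

    n is the most significant input address bit, so the input address is
    IA[n:0] and 39 <= n <= 47.
    """
    if not 39 <= n <= 47:
        raise ValueError("a level 0 initial lookup requires 39 <= n <= 47")
    x = n - 35                        # lowest TTBR bit of the base address
    base = bits(ttbr, 47, x) << x     # TTBR[47:(n-35)]
    index = bits(ia, n, 39)           # IA[n:39]
    return base | (index << 3)        # index supplies offset bits[(n-36):3]


if __name__ == "__main__":
    # 46-bit input address: n = 45, so the table occupies 2^10 bytes (1KB).
    n = 45
    ttbr = 0x8000_0400                # hypothetical base, aligned to 1KB
    ia = (5 << 39) | 0x123            # hypothetical address selecting entry 5
    print(hex(level0_descriptor_address_4kb(ttbr, ia, n)))
    print("table size:", 1 << (n - 35), "bytes")
```

The same structure applies to the other initial lookup cases in this section: only the lowest input address bit resolved at the level, and therefore the value of x, changes.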


Initial lookup at level 1, 4KB translation granule

This subsection describes initial lookups with an input address width of (n+1) bits, meaning the input address is
IA[n:0].

For a stage 1 or stage 2 initial lookup at level 1, without use of concatenated translation tables

As Table D4-12 on page D4-1746 shows, this applies to IA[n:0], where 30 ≤ n ≤ 38. For these lookups:

• There is a single translation table at this level.
• TTBR[47:(n-26)] specify the translation table base address.
• Bits[n:30] of the input address are bits[(n-27):3] of the descriptor offset in the translation table.

Figure K7-2 on page K7-5554 shows this lookup.

[Figure: bit fields of the input address (UNK/SBZP above bit y), of the TTBR (Register-defined, Translation table base address[47:x], RES0*), and of the descriptor address†]

Supported input address range is IA[y:0], 4 ≤ x ≤ 12, y = x + 26.
† For a Non-secure EL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-2 Initial lookup for VMSAv8-64 using the 4KB granule, starting at level 1, without concatenation


For a stage 2 initial lookup at level 1, with concatenated translation tables

As Table D4-12 on page D4-1746 shows, this applies to IA[n:0], where 39 ≤ n ≤ 42. For these lookups:

• There are 2^(n-38) concatenated translation tables at this level.
• These concatenated translation tables must be aligned to 2^(n-38)×4KB. This means TTBR[(n-27):12] must be zero.
• TTBR[47:(n-26)] specify the base address of the block of concatenated translation tables.
• Bits[n:30] of the input address are bits[(n-27):3] of the descriptor offset from the base address
  of the block of concatenated translation tables.

Figure K7-3 shows this lookup.


[Figure: bit fields of the input address, of the TTBR (Register-defined, Translation table base address[47:x], the field marked †, RES0*), and of the descriptor address]

Supported input address range is IPA[y:0], 4 ≤ x ≤ 16, y = x + 26. The field marked † must be zero.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-3 Initial lookup for VMSAv8-64 using the 4KB granule, starting at level 1, with concatenation
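The concatenation rules for the 4KB granule can be checked with a short calculation. The following minimal sketch assumes 8-byte descriptors and uses hypothetical function names and a hypothetical base register value. It rejects a base whose must-be-zero bits are set, which is a choice made for this illustration and not a statement of how hardware treats that case.

```python
GRANULE_4KB = 4096


def concatenated_tables_4kb_level1(n):
    """Number of concatenated level 1 tables for a stage 2 IPA[n:0]."""
    if not 39 <= n <= 42:
        raise ValueError("concatenation at level 1 applies for 39 <= n <= 42")
    return 1 << (n - 38)


def level1_descriptor_address_4kb(ttbr, ipa, n):
    """Descriptor address for a stage 2 initial lookup at level 1, 4KB granule."""
    count = concatenated_tables_4kb_level1(n)
    # TTBR[(n-27):12] must be zero so the block is aligned to count x 4KB.
    if (ttbr >> 12) & (count - 1):
        raise ValueError("block of concatenated tables is not aligned")
    x = n - 26
    base = ((ttbr >> x) & ((1 << (48 - x)) - 1)) << x     # TTBR[47:(n-26)]
    index = (ipa >> 30) & ((1 << (n - 29)) - 1)           # IPA[n:30]
    return base + (index << 3)                            # offset bits[(n-27):3]


if __name__ == "__main__":
    n = 39                                # 40-bit IPA: two concatenated tables
    ttbr = 0x4000_2000                    # hypothetical base, aligned to 8KB
    ipa = 0x200 << 30                     # first entry of the second table
    print(concatenated_tables_4kb_level1(n), "tables,",
          concatenated_tables_4kb_level1(n) * GRANULE_4KB, "byte alignment")
    print(hex(level1_descriptor_address_4kb(ttbr, ipa, n)))
```

Because the tables are contiguous, the lookup needs no separate table-selection step: the extra high-order input address bits simply extend the offset past the end of the first 4KB table.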


Performing the initial lookup using the 16KB granule 


This subsection describes examples of the initial lookup when using the 16KB translation granule that Table D4-15 
on page D4-1750 shows as starting at level 0 or at level 1. It includes those stage 2 translations where concatenation 
of translation tables is required for the lookup to start at level 1. This means that it gives specific examples of the 
mechanisms described in The VMSAv8-64 address translation system on page D4-1726. 


Note 


For stage 2 translations, the same principles apply to an initial lookup that Table D4-15 on page D4-1750 shows as 
starting at level 1. In this case, for some IA sizes concatenation of translation tables means the lookup can, instead, 
start at level 2. 








The following subsections describe these examples of the initial lookup:

• Initial lookup at level 0, 16KB translation granule.
• Initial lookup at level 1, 16KB translation granule on page K7-5556.

In all cases, for a stage 2 translation, the VTCR_EL2.SL0 field must indicate the required initial lookup level, and
this level must be consistent with the value of the VTCR_EL2.T0SZ field, see Overview of stage 2 translations,
16KB granule on page D4-1750.


Initial lookup at level 0, 16KB translation granule 


This subsection describes initial lookups with an input address width of (n+1) bits, meaning the input address is
IA[n:0]. As Table D4-14 on page D4-1749 shows, the only case where an address translation using the 16KB
granule starts at level 0 is a stage 1 translation of a 48-bit input address, IA[47:0]. For this lookup:

• The required translation table has only two entries, meaning its size is 16 bytes, and it must be aligned to 16
  bytes.
• TTBR[47:4] specify the translation table base address.
• Bit[47] of the input address is bit[3] of the descriptor offset in the translation table.

Figure K7-4 shows this lookup.


[Figure: bit fields of the input address, of the TTBR (Register-defined, Translation table base address[47:4], the field marked ‡), and of the descriptor address†]

Supported input address range is IA[47:0]. The field marked ‡ is RES0*.
† For a Non-secure EL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-4 Initial lookup for VMSAv8-64 using the 16KB granule, starting at level 0

Initial lookup at level 1, 16KB translation granule


This subsection describes initial lookups with an input address width of (n+1) bits, meaning the input address is
IA[n:0].

For a stage 1 or stage 2 initial lookup at level 1, without use of concatenated translation tables

As Table D4-15 on page D4-1750 shows, this applies to IA[n:0], where 36 ≤ n ≤ 46. For these lookups:

• There is a single translation table at this level.
• TTBR[47:(n-32)] specify the translation table base address.
• Bits[n:36] of the input address are bits[(n-33):3] of the descriptor offset in the translation table.

Figure K7-5 shows this lookup.


[Figure: bit fields of the input address (UNK/SBZP above bit y), of the TTBR (Register-defined, Translation table base address[47:x], RES0*), and of the descriptor address†]

Supported input address range is IA[y:0], 4 ≤ x ≤ 14, y = x + 32.
† For a Non-secure EL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-5 Initial lookup for VMSAv8-64 using the 16KB granule, starting at level 1, without concatenation


For a stage 2 initial lookup at level 1, with concatenated translation tables

As Table D4-15 on page D4-1750 shows, the only case where an address translation using the 16KB
granule starts at level 1 because of concatenation of translation tables is a stage 2 translation of a
48-bit input address, IA[47:0]. For this lookup:

• There are two concatenated translation tables at this level.
• These concatenated translation tables must be aligned to 2×16KB. This means TTBR[14]
  must be zero.
• TTBR[47:15] specify the base address of the block of two concatenated translation tables.
• Bits[47:36] of the input address are bits[14:3] of the descriptor offset from the base address
  of the block of concatenated translation tables.

Figure K7-6 on page K7-5557 shows this lookup.

[Figure: bit fields of the input address, of the TTBR (Register-defined, Translation table base address[47:15], the bit marked †, RES0*), and of the descriptor address]

Supported input address range is IPA[47:0]. The bit marked † must be zero.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-6 Initial lookup for VMSAv8-64 using the 16KB granule, starting at level 1, with concatenation


Performing the initial lookup using the 64KB translation granule 


This subsection describes examples of the initial lookup when using the 64KB translation granule that Table D4-18 
on page D4-1754 shows as starting at level 1 or at level 2. It includes those stage 2 translations where concatenation 
of translation tables is required for the lookup to start at level 2. This means that it gives specific examples of the 
mechanisms described in The VMSAv8-64 address translation system on page D4-1726. 


Note 
For stage 2 translations, the same principles apply to an initial lookup that Table D4-18 on page D4-1754 shows as 
starting at level 2. In this case, for some IA sizes concatenation of translation tables means the lookup can, instead, 
start at level 3. 








The following subsections describe these examples of the initial lookup:

• Initial lookup at level 1, 64KB translation granule.
• Initial lookup at level 2, 64KB translation granule on page K7-5558.

In all cases, for a stage 2 translation, the VTCR_EL2.SL0 field must indicate the required initial lookup level, and
this level must be consistent with the value of the VTCR_EL2.T0SZ field, see Overview of stage 2 translations,
64KB granule on page D4-1754.


Initial lookup at level 1, 64KB translation granule 


This subsection describes initial lookups with an input address width of (n+1) bits, meaning the input address is
IA[n:0]. As Table D4-18 on page D4-1754 shows, a stage 1 or stage 2 initial lookup at level 1 is required when
42 ≤ n ≤ 47. For these lookups:

• The size of the translation table is 2^(n-38) bytes. This means the size of the translation table, at this level, is
  always less than the granule size. The address of this translation table must align to the size of the table.
• Bits[n:42] of the input address are bits[(n-39):3] of the descriptor offset in the translation table.
• Bits[47:(n-38)] of the TTBR specify the translation table base address.

Figure K7-7 on page K7-5558 shows this lookup.

[Figure: bit fields of the input address, of the TTBR (Register-defined, Translation table base address[47:x], RES0*), and of the descriptor address†]

Supported input address range is IA[y:0], 4 ≤ x ≤ 9, y = x + 38. When y is 47 the field marked ‡ is absent.
† For a Non-secure EL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-7 Initial lookup for VMSAv8-64 using the 64KB granule, starting at level 1
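The size of the level 1 table for the 64KB granule follows directly from the number of input address bits that index it. The following minimal sketch derives the entry count and table size from the index field IA[n:42], assuming 8-byte descriptors (implied by the offset starting at bit 3). The function names and the example register value are hypothetical.

```python
def level1_table_64kb(n):
    """Return (entries, size_in_bytes) of the level 1 table for IA[n:0]."""
    if not 42 <= n <= 47:
        raise ValueError("a level 1 initial lookup requires 42 <= n <= 47")
    entries = 1 << (n - 41)        # IA[n:42] holds (n - 41) index bits
    return entries, entries * 8    # each descriptor occupies 8 bytes


def level1_descriptor_address_64kb(ttbr, ia, n):
    """Descriptor address for an initial lookup at level 1, 64KB granule."""
    entries, size = level1_table_64kb(n)
    # The table is aligned to its own size, so the low bits of the base are zero.
    base = (ttbr & ((1 << 48) - 1)) & ~(size - 1)
    index = (ia >> 42) & (entries - 1)
    return base | (index << 3)


if __name__ == "__main__":
    for n in (42, 45, 47):
        entries, size = level1_table_64kb(n)
        print(f"n={n}: {entries} entries, {size} bytes")
    # Hypothetical 48-bit example: base aligned to 512 bytes, entry 0x2A.
    print(hex(level1_descriptor_address_64kb(0x9000_0200, 0x2A << 42, 47)))
```

Even for the largest input address, the table holds only 64 descriptors, which is why it is always far smaller than the 64KB granule.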


Initial lookup at level 2, 64KB translation granule 


This subsection describes initial lookups with an input address width of (n+1) bits, meaning the input address is
IA[n:0].

For a stage 1 or stage 2 initial lookup at level 2, without the use of concatenated translation tables

As Table D4-18 on page D4-1754 shows, this applies to IA[n:0], where 29 ≤ n ≤ 41. For these lookups:

• There is a single translation table at this level.
• TTBR[47:(n-25)] specify the translation table base address.
• Bits[n:29] of the input address are bits[(n-26):3] of the descriptor offset in the translation table.

Figure K7-8 shows this lookup.




















[Figure: bit fields of the input address (UNK/SBZP above bit y), of the TTBR (Register-defined, Base address[47:x]), and of the descriptor address†]

Supported input address range is IA[y:0], 4 ≤ x ≤ 16, y = x + 25.
† For a Non-secure EL1&0 stage 1 translation, the IPA of the descriptor. Otherwise, the PA of the descriptor.
* Field has additional properties to the default RES0 definition, see the register description for more information.

Figure K7-8 Initial lookup for VMSAv8-64 using the 64KB granule, starting at level 2, without concatenation

For a stage 2 initial lookup at level 2, with concatenated translation tables 

As Table D4-18 on page D4-1754 shows, this applies to IA[n:0], where 42 ≤ n ≤ 45. For these 
lookups: 

• There are 2^(n-41) concatenated translation tables at this level. 
• These concatenated translation tables must be aligned to 2^(n-41) × 64KB. This means 
TTBR[(n-26):16] must be zero. 
• TTBR[47:(n-25)] specify the base address of the block of translation tables. 
• Bits[n:42] of the input address are bits[(n-26):16] of the descriptor offset from the base 
address of the block of translation tables. 

Figure K7-9 shows this lookup. 
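
The concatenated case can be sketched in the same way. The function name and the choice to raise `ValueError` for a misaligned base are assumptions made for illustration, not architectural behavior.

```python
GRANULE_BYTES = 64 * 1024


def concatenated_level2_descriptor_address(ttbr: int, ipa: int, n: int):
    """Return (descriptor address, number of concatenated tables).

    Hypothetical helper for a stage 2 initial lookup at level 2 with the
    64KB granule, where the input address is IPA[n:0] and 42 <= n <= 45.
    """
    if not 42 <= n <= 45:
        raise ValueError("concatenation at level 2 requires 42 <= n <= 45")
    tables = 1 << (n - 41)
    x = n - 25
    # TTBR[(n-26):16] must be zero so that the block of tables is aligned
    # to (number of tables) x 64KB.
    if (ttbr >> 16) & ((1 << (x - 16)) - 1):
        raise ValueError("TTBR[(n-26):16] must be zero")
    base = ttbr & ((1 << 48) - 1) & ~((1 << x) - 1)   # TTBR[47:x]
    # IPA[n:29] forms offset bits[(n-26):3]. Its upper part, IPA[n:42],
    # lands in bits[(n-26):16] and selects one of the concatenated tables.
    offset = ((ipa >> 29) & ((1 << (n - 28)) - 1)) << 3
    return base | offset, tables
```

With n equal to 42 there are two tables, and IPA[42] selects between them by adding 64KB to the descriptor offset.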


[Figure K7-9: the input address IPA[y:0], with the bits above y UNK/SBZP, the TTBR holding Base address[47:x], and the resulting descriptor address.]

Supported input address range is IPA[y:0], 17 ≤ x ≤ 20, y = x + 25. 
* Field has additional properties to the default RES0 definition, see the register description for more information. 

Figure K7-9 Initial lookup for VMSAv8-64 using the 64KB granule, starting at level 2, with concatenation 

K7.1.2 Full translation flows for VMSAv8-64 address translation 


In a translation table walk, only the first lookup uses the translation table base address from the appropriate TTBR. 
Subsequent lookups use a combination of address information from: 

• The table descriptor read in the previous lookup. 
• The input address. 

This section describes example full translation flows, from the initial lookup to the address of a memory page. The 
example flows: 

• Resolve the maximum-sized IA range supported by the initial lookup level. 
• Do not have any concatenation of translation tables. 
• Cover only the 4KB and the 64KB translation granules. 

Examples of performing the initial lookup on page K7-5552 described how either reducing the IA range or 
concatenating translation tables affects the initial lookup. 

Note 

Reducing the IA range or concatenating translation tables affects only the initial lookup. 

The following sections describe full VMSAv8-64 translation flows, down to an entry for a memory page: 

• The address and properties fields shown in the translation flows on page K7-5560. 
• Full translation flow using the 4KB granule and starting at level 0 on page K7-5561. 
• Full translation flow using the 4KB granule and starting at level 1 on page K7-5562. 
• Full translation flow using the 64KB granule and starting at level 1 on page K7-5563. 
• Full translation flow using the 64KB granule and starting at level 2 on page K7-5564. 
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The address and properties fields shown in the translation flows 


For the Non-secure EL1&0 stage 1 translation: 
° Any descriptor address is the IPA of the required descriptor. 
° The final output address is the IPA of the block or page. 


In these cases, an EL1&0 stage 2 translation is performed to translate the IPA to the required PA. 


For all other translations, the final output address is the PA of the block or page, and any descriptor address is the 
PA of the descriptor. 


Properties indicates register or translation table fields that return information, other than address information, about 
the translation or the targeted memory region. For more information see Memory attribute fields in the VMSAv8-64 
translation table format descriptors on page D4-1778. 
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Full translation flow using the 4KB granule and starting at level 0 


Figure K7-10 shows the complete translation flow for a stage 1 translation table walk for a 48-bit input address. This 
lookup must start with a level 0 lookup. For more information about the fields shown in the figure see The address 
and properties fields shown in the translation flows on page K7-5560. 

For details of Properties fields, see the register or descriptor description. 
* Field has additional properties to the default RES0 definition, see the register description for more information. 
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Figure K7-10 Complete stage 1 translation of a 48-bit address using the 4KB translation granule 
If the level 1 lookup or level 2 lookup returns a block descriptor then the translation table walk completes at that 
level. 

Figure K7-10 on page K7-5561 shows a stage 1 translation. The only difference for a stage 2 translation is that 
bits[63:58] of the Table descriptors are SBZ. 
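
The following sketch models a walk of this kind in software. It is illustrative only: memory is modelled as a dictionary from descriptor address to descriptor value, Properties fields are ignored, the initial table is assumed to be a full 4KB table, and the function name `walk_4k` is hypothetical. It relies on the VMSAv8-64 descriptor encoding in which bits[1:0] of 0b11 mark a table descriptor, or a page descriptor at level 3, and 0b01 marks a block descriptor.

```python
ADDRESS_MASK = ((1 << 48) - 1) & ~0xFFF      # descriptor bits[47:12]


def walk_4k(memory: dict, ttbr: int, ia: int, start_level: int = 0) -> int:
    """Translate `ia` with the 4KB granule and return the output address."""
    table = ttbr & ADDRESS_MASK              # only the first lookup uses the TTBR
    for level in range(start_level, 4):
        shift = 39 - 9 * level               # 39, 30, 21, 12
        index = (ia >> shift) & 0x1FF        # 9 input address bits per level
        descriptor = memory[table + index * 8]
        kind = descriptor & 0b11
        if kind == 0b11 and level < 3:
            # Table descriptor: supplies the base address for the next lookup.
            table = descriptor & ADDRESS_MASK
            continue
        is_page = kind == 0b11 and level == 3
        is_block = kind == 0b01 and level in (1, 2)
        if is_page or is_block:
            # The walk completes here. The low bits come from the input address.
            region = (1 << shift) - 1
            return (descriptor & ADDRESS_MASK & ~region) | (ia & region)
        raise LookupError("Translation fault at level %d" % level)
```

Calling `walk_4k` with `start_level=1` models a 39-bit input address, where the same process begins at level 1.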


Full translation flow using the 4KB granule and starting at level 1 


Figure K7-11 shows the complete translation flow for a stage 1 translation table walk for a 39-bit input address. This 
lookup must start with a level 1 lookup. For more information about the fields shown in the figure see The address 
and properties fields shown in the translation flows on page K7-5560. 

For details of Properties fields, see the register or descriptor description. 
* Field has additional properties to the default RES0 definition, see the register description for more information. 
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Figure K7-11 Complete stage 1 translation of a 39-bit address using the 4KB translation granule 


If the level 1 lookup or the level 2 lookup returns a block descriptor then the translation table walk completes at that 
level. 

Figure K7-11 on page K7-5562 shows a stage 1 translation. The only difference for a stage 2 translation is that 
bits[63:58] of the Table descriptors are SBZ. 

Comparing this translation with the translation for a 48-bit address, shown in Figure K7-10 on page K7-5561, shows 
how the translation for the 39-bit address starts the same lookup process one level later. 


Full translation flow using the 64KB granule and starting at level 1 


Figure K7-12 shows the complete translation flow for a stage 1 translation table walk for a 48-bit 
input address. This lookup must start with a level 1 lookup. For more information about the fields shown in the 
figure see The address and properties fields shown in the translation flows on page K7-5560. 

* Field has additional properties to the default RES0 definition, see the register description for more information. 

Figure K7-12 Complete stage 1 translation of a 48-bit address using the 64KB translation granule 

If the level 2 lookup returns a block descriptor then the translation table walk completes at that level. 


Figure K7-12 shows a stage 1 translation. The only difference for a stage 2 translation is that bits[63:58] of the Table 
descriptors are SBZ. 


The level 1 lookup resolves only 6 bits of the input address. As described in Performing the initial lookup using the 
64KB translation granule on page K7-5557, this means: 

• The translation table size for this level is only 512 bytes. 
• The required translation table alignment for this level is 512 bytes. 
• The Base address field in the TTBR is extended, at the low-order end, to be bits[47:9]. 
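
As an illustration of where the 512-byte figure comes from, the sketch below splits a 48-bit input address into its per-level indexes for the 64KB granule and derives each table size from the number of bits resolved. The function names are hypothetical.

```python
DESCRIPTOR_BYTES = 8


def split_ia_64k(ia: int):
    """Split a 48-bit input address for a 64KB-granule walk from level 1."""
    level1 = (ia >> 42) & 0x3F        # IA[47:42], 6 bits
    level2 = (ia >> 29) & 0x1FFF      # IA[41:29], 13 bits
    level3 = (ia >> 16) & 0x1FFF      # IA[28:16], 13 bits
    offset = ia & 0xFFFF              # IA[15:0], offset within the 64KB page
    return level1, level2, level3, offset


def table_bytes(index_bits: int) -> int:
    """Table size when a lookup resolves `index_bits` bits of the address."""
    return (1 << index_bits) * DESCRIPTOR_BYTES
```

A lookup that resolves 6 bits needs 64 descriptors, or 512 bytes, whereas a lookup that resolves 13 bits needs a full 64KB table.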


Full translation flow using the 64KB granule and starting at level 2 


Figure K7-13 shows the complete translation flow for a stage 1 translation table walk for a 42-bit 
input address. This lookup must start with a level 2 lookup. For more information about the fields shown in the 
figure see The address and properties fields shown in the translation flows on page K7-5560. 

* Field has additional properties to the default RES0 definition, see the register description for more information. 

Figure K7-13 Complete stage 1 translation of a 42-bit address using the 64KB translation granule 

If the level 2 lookup returns a block descriptor then the translation table walk completes at that level. 


Figure K7-13 shows a stage 1 translation. The only difference for a stage 2 translation is that bits[63:58] of the Table 
descriptors are SBZ. 


Comparing this translation with the translation for a 48-bit address, shown in Figure K7-12 on page K7-5563, 
shows: 

• The translation for the 42-bit address starts the same lookup process one level later. 
° Because the initial lookup resolves 13 bits of address: 
— The translation table size for this level is 64KB. 
— The required translation table alignment for this level is 64KB. 
— The Base address field in the TTBR is bits[47:16]. 
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K7.2 AArch32 Address translation examples 
The following sections give address translation examples for the VMSAv8-32 address translation formats: 

• Address translation examples using the VMSAv8-32 Short descriptor translation table format. 
• Address translation examples using the VMSAv8-32 Long descriptor translation table format on 
page K7-5570. 

K7.2.1 Address translation examples using the VMSAv8-32 Short descriptor translation table format 
VMSAv8-32 Short-descriptor translation table format descriptors on page G4-4041 describes the memory section 
and page option for a single VMSAv8-32 address translation. The following sections show the full translation flow 
for each of these options: 

° Translation flow for a Supersection on page K7-5566. 

° Translation flow for a Section on page K7-5567. 

° Translation flow for a Large page on page K7-5568. 

° Translation flow for a Small page on page K7-5569. 

The address and Properties fields shown in the translation flows on page K7-5570 summarizes the information 
returned by the lookup. 

Translation flow for a Supersection 


Figure K7-14 shows the complete translation flow for a Supersection. For more information about the fields shown 
in this figure see The address and Properties fields shown in the translation flows on page K7-5570. 
† This field is absent if N is 0. 
BA = Base address. 
For a translation based on TTBR0, N is the value of TTBCR.N. 
For a translation based on TTBR1, N is 0. 
For details of Properties fields, see the register or descriptor description. 


Figure K7-14 VMSAv8-32 Short-descriptor Supersection address translation 


Note 
Figure K7-14 shows how, when the input address, the VA, addresses a Supersection, the top four bits of the 
Supersection index bits of the address overlap the bottom four bits of the Table index bits. For more information, 
see Additional requirements for Short-descriptor format translation tables on page G4-4044. 


Translation flow for a Section 


Figure K7-15 shows the complete translation flow for a Section. For more information about the fields shown in 
this figure see The address and Properties fields shown in the translation flows on page K7-5570. 
† This field is absent if N is 0. 
For a translation based on TTBR0, N is the value of TTBCR.N. 
For a translation based on TTBR1, N is 0. 
For details of Properties fields, see the register or descriptor description. 


Figure K7-15 VMSAv8-32 Short-descriptor Section address translation 
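
The Section flow can be sketched as follows. The function names are hypothetical, Properties fields are ignored, and the sketch assumes the Short-descriptor layout in which the level 1 table is indexed by VA[31-N:20] and a Section descriptor supplies output address bits[31:20].

```python
def level1_descriptor_address(ttbr: int, n: int, va: int) -> int:
    """Address of the level 1 descriptor for `va`.

    `n` is TTBCR.N for a translation based on TTBR0, and 0 for TTBR1.
    """
    base = ttbr & 0xFFFFFFFF & ~((1 << (14 - n)) - 1)   # TTBR[31:14-N]
    index = (va >> 20) & ((1 << (12 - n)) - 1)          # VA[31-N:20]
    return base | (index << 2)                          # 4-byte descriptors


def section_output_address(descriptor: int, va: int) -> int:
    """Combine the Section base address with the Section index VA[19:0]."""
    return (descriptor & 0xFFF00000) | (va & 0x000FFFFF)
```

A Section maps 1MB, so the descriptor contributes the top 12 bits of the output address and the input address contributes the remaining 20.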


Translation flow for a Large page 


Figure K7-16 shows the complete translation flow for a Large page. For more information about the fields shown 
in this figure see The address and Properties fields shown in the translation flows on page K7-5570. 


Figure K7-16 VMSAv8-32 Short-descriptor Large page address translation 


Note 


Figure K7-16 shows how, when the input address, the VA, addresses a Large page, the top four bits of the page index 
bits of the address overlap the bottom four bits of the level 2 table index bits. For more information, see Additional 
requirements for Short-descriptor format translation tables on page G4-4044. 
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Translation flow for a Small page 


Figure K7-17 shows the complete translation flow for a Small page. For more information about the fields shown 
in this figure see The address and Properties fields shown in the translation flows on page K7-5570. 
† This field is absent if N is 0. 
L1 = Level 1, L2 = Level 2. 
For a translation based on TTBR0, N is the value of TTBCR.N. 
For a translation based on TTBR1, N is 0. 
For details of Properties fields, see the register or descriptor description. 


Figure K7-17 VMSAv8-32 Short-descriptor Small page address translation 
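
The second lookup of a page translation can be sketched as follows. The function names are hypothetical and Properties fields are ignored. The sketch assumes the Short-descriptor layout in which a level 1 Page table descriptor holds the level 2 table base address in bits[31:10] and the level 2 table is indexed by VA[19:12].

```python
def level2_descriptor_address(level1_descriptor: int, va: int) -> int:
    """Address of the level 2 descriptor for `va`."""
    base = level1_descriptor & 0xFFFFFC00     # Page table base address[31:10]
    index = (va >> 12) & 0xFF                 # VA[19:12]
    return base | (index << 2)                # 4-byte descriptors


def small_page_output_address(level2_descriptor: int, va: int) -> int:
    """Small page: 4KB, so the page index is VA[11:0]."""
    return (level2_descriptor & 0xFFFFF000) | (va & 0x00000FFF)


def large_page_output_address(level2_descriptor: int, va: int) -> int:
    """Large page: 64KB, so the page index is VA[15:0]."""
    return (level2_descriptor & 0xFFFF0000) | (va & 0x0000FFFF)
```

Because the Large page index VA[15:0] shares VA[15:12] with the level 2 table index, a Large page occupies 16 consecutive level 2 entries.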


The address and Properties fields shown in the translation flows 


For the Non-secure PL1&0 stage 1 translation tables: 
° Any descriptor address is the IPA of the required descriptor. 
° The final output address is the IPA of the Section, Supersection, Large page, or Small page. 


In these cases, a PL1&0 stage 2 translation is performed to translate the IPA to the required PA. 
Otherwise, the address is the PA of the descriptor, Section, Supersection, Large page, or Small page. 


Properties indicates register or translation table fields that return information, other than address information, about 
the translation or the targeted memory region. For more information see Information returned by a translation table 
lookup on page G4-4036, and the description of the register or translation table descriptor. 


For translations using the Short-descriptor translation table format, VMSAv8-32 Short-descriptor translation table 
format descriptors on page G4-4041 describes the descriptor formats. 


K7.2.2 Address translation examples using the VMSAv8-32 Long descriptor translation table format 


As described in Translation table walks, when using the VMSAv8-32 Long-descriptor translation table format on 
page G4-4063, in a translation table walk, only the first lookup uses the translation table base address from the 
appropriate TTBR. Subsequent lookups use a combination of address information from: 


. The table descriptor read in the previous lookup. 
. The input address. 


The following sections give examples of full VMSAv8-32 Long-descriptor format address translation flows, down 
to an entry for a 4KB page: 


° Full translation flow, starting at level 1 lookup on page K7-5571. 
° Full translation flow, starting at level 2 lookup on page K7-5572. 


The address and Properties fields shown in the translation flows summarizes the information returned by the 
lookup. 


Full translation flow, starting at level 1 lookup 


Figure K7-18 shows the complete translation flow for a VMSAv8-32 Long-descriptor stage 1 translation table walk 
that starts with a level 1 lookup. For more information about the fields shown in the figure see The address and 
Properties fields shown in the translation flows on page K7-5570. 

[Figure K7-18 elements, in order: Input address, TTBR, Descriptor address, Level 1 table descriptor, Descriptor address, Level 2 table descriptor, Descriptor address, Level 3 page descriptor.]

For details of Properties fields, see the register or descriptor description. 
‡ See the lookup description for more information about bits[47:40] of the TTBR and descriptors. 
Figure K7-18 Complete VMSAv8-32 Long-descriptor format stage 1 translation, starting at level 1 


If the level 1 lookup or the level 2 lookup returns a block descriptor then the translation table walk completes at that 
level. 

If bits[47:40] of the TTBR or the descriptor are not zero then the lookup will generate an Address size fault, see 
Address size fault on page G4-4119. 
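
The bits[47:40] check, and the way a full 32-bit input address divides between the three lookups, can be sketched as follows. The exception class and function names are hypothetical, and the split assumes a stage 1 walk over the complete 32-bit input address range with 4KB pages.

```python
class AddressSizeFault(Exception):
    """Models the fault taken when bits[47:40] of an address are not zero."""


def check_address_size(value: int) -> int:
    """Return `value` unchanged, or fault if any of bits[47:40] is set."""
    if (value >> 40) & 0xFF:
        raise AddressSizeFault(hex(value))
    return value


def split_va_long_descriptor(va: int):
    """Indexes for a 32-bit input address in a walk that starts at level 1."""
    level1 = (va >> 30) & 0x3         # VA[31:30]
    level2 = (va >> 21) & 0x1FF       # VA[29:21]
    level3 = (va >> 12) & 0x1FF       # VA[20:12]
    offset = va & 0xFFF               # VA[11:0]
    return level1, level2, level3, offset
```

A software model of the walk would apply `check_address_size` to the TTBR base address and to each descriptor address field before using it.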


A stage 2 translation that starts at a level 1 lookup differs from the translation shown in Figure K7-18 on 
page K7-5571 only as follows: 

• The possible values of n are 4-13, to support an input address of between 31 and 40 bits. 
• The descriptor and output addresses are always PAs. 


Full translation flow, starting at level 2 lookup 


Figure K7-19 shows the complete translation flow for a stage 1 VMSAv8-32 Long-descriptor translation table walk 
that starts at a level 2 lookup. For more information about the fields shown in the figure see The address and 
Properties fields shown in the translation flows on page K7-5570. 

For details of Properties fields, see the register or descriptor description. 
‡ See the lookup description for more information about bits[47:40] of the TTBR and descriptors. 
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Figure K7-19 Complete VMSAv8-32 Long-descriptor format stage 1 translation, starting at level 2 


If the level 2 lookup returns a block descriptor then the translation table walk completes at that level. 

If bits[47:40] of the TTBR or the descriptor are not zero then the lookup will generate an Address size fault, see 
Address size fault on page G4-4119. 

A stage 2 translation that starts at a level 2 lookup differs from the translation shown in Figure K7-19 only as 
follows: 

• The possible values of n are 7-16, to support an input address of up to 34 bits. 
• The descriptor and output addresses are always PAs. 
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The address and Properties fields shown in the translation flows 


For the Non-secure PL1&0 stage 1 translation: 
° Any descriptor address is the IPA of the required descriptor. 
° The final output address is the IPA of the block or page. 


In these cases, a PL1&0 stage 2 translation is performed to translate the IPA to the required PA. 


For all other translations, the final output address is the PA of the block or page, and any descriptor address is the 
PA of the descriptor. 


Properties indicates register or translation table fields that return information, other than address information, about 
the translation or the targeted memory region. For more information see Information returned by a translation table 
lookup on page G4-4036, and the description of the register or translation table descriptor. 


For translations using the Long-descriptor translation table format, VMSAv8-32 Long-descriptor translation table 
format descriptors on page G4-4050 describes the descriptor formats. 
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Appendix K8 
Example OS Save and Restore Sequences 


This appendix provides possible OS Save and Restore sequences for a v8 Debug implementation. It contains the 
following sections: 


° Save Debug registers on page K8-5576. 
° Restore Debug registers on page K8-5578. 


K8.1 Save Debug registers 
This section shows how to save the registers that are used by an external debugger. 
; On entry, X0 points to a block to save the debug registers in. 
; Returns the pointer beyond the block and corrupts X1-X3 
SaveDebugRegisters 
    ; (1) Set OS lock. 
    MOV     X2,#1                       ; Set the OS lock. In AArch64 state, the OS lock 
    MSR     OSLAR_EL1,X2                ; is writable via OSLAR. 
    ISB                                 ; Context synchronization event 
    ; (2) Walk through the registers, saving them 
    MRS     X1,OSDTRRX_EL1              ; Read DTRRX 
    MRS     X2,OSDTRTX_EL1              ; Read DTRTX 
    STP     W1,W2,[X0],#8               ; Save { DTRRX, DTRTX } 
    MRS     X1,OSECCR_EL1               ; Read ECCR 
    MRS     X2,MDSCR_EL1                ; Read DSCR 
    STP     W1,W2,[X0],#8               ; Save { ECCR, DSCR } 
  [ AARCH32_SUPPORTED 
    MRS     X1,DBGVCR32_EL2             ; Read DBGVCR 
    MRS     X2,DBGCLAIMCLR_EL1          ; Read CLAIM - note, have to read via CLAIMCLR 
    STP     W1,W2,[X0],#8               ; Save { VCR, CLAIM } 
  ] 
    ;; Macros for saving off a "register pair" 
    ;; $WB is W for watchpoint, B for breakpoint 
    ;; $num is the pair's number 
    ;; X0 contains the pointer to the save block 
    ;; W1 contains the max index 
    MACRO 
    SaveRP  $WB,$num,$exit 
    MRS     X3,DBG$WB.VR$num._EL1       ; Read DBGxVRn 
    STR     X3,[X0],#8                  ; Save { xVRn } 
    MRS     X3,DBG$WB.CR$num._EL1       ; Read DBGxCRn 
    STR     W3,[X0],#4                  ; Save { xCRn }. 
  [ $num >= 1 :LAND: $num < 15 
    CMP     W1,#$num                    ; Stop after the highest implemented pair 
    BEQ     $exit 
  ] 
    MEND 
    ; (3) Breakpoints 
    MRS     X1,ID_AA64DFR0_EL1 
    UBFX    W1,W1,#12,#4                ; Extract BRPs field 
    MACRO 
    SaveBRP $num                        ; Save a Breakpoint Register Pair 
    SaveRP  B,$num,SaveDebugRegisters_Watchpoints 
    MEND 
    SaveBRP 0 
    SaveBRP 1 
    SaveBRP 2 
    ;; and so on to... 
    SaveBRP 15 
SaveDebugRegisters_Watchpoints 
    ; (4) Watchpoints 
    MRS     X1,ID_AA64DFR0_EL1          ; Read DBGDIDR 
    UBFX    W1,W1,#20,#4                ; Extract WRPs field 
    MACRO 
    SaveWRP $num                        ; Save a Watchpoint Register Pair 
    SaveRP  W,$num,SaveDebugRegisters_Exit 
    MEND 
    SaveWRP 0 
    SaveWRP 1 
    SaveWRP 2 
    ;; and so on to... 
    SaveWRP 15 
SaveDebugRegisters_Exit 
    ; (5) Return the pointer to first word not read. This pointer is already in X0, so 
    ; all that is needed is to return from this function. The OS double-lock (OSDLR_EL1.DLK) is 
    ; locked later, just before the final entry to WFI state. 
    RET 
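
The amount of memory that the save sequence writes can be estimated from the same ID register fields that the sequence extracts. The sketch below is illustrative only: the function names are hypothetical, and the sizes are taken from the stores in the listing, which are 8 bytes for each pair of 32-bit registers, and 8 bytes plus 4 bytes for each value and control register pair.

```python
def debug_register_counts(id_aa64dfr0: int):
    """Return (breakpoints, watchpoints) from an ID_AA64DFR0_EL1 value.

    Mirrors UBFX #12,#4 and UBFX #20,#4. Each field holds the number of
    register pairs minus one.
    """
    brps = (id_aa64dfr0 >> 12) & 0xF
    wrps = (id_aa64dfr0 >> 20) & 0xF
    return brps + 1, wrps + 1


def save_block_bytes(id_aa64dfr0: int, aarch32_supported: bool) -> int:
    """Bytes written through X0 by the save sequence (an estimate)."""
    breakpoints, watchpoints = debug_register_counts(id_aa64dfr0)
    header = 8 + 8                        # { DTRRX, DTRTX } and { ECCR, DSCR }
    if aarch32_supported:
        header += 8                       # { VCR, CLAIM }
    per_pair = 8 + 4                      # 64-bit xVRn and 32-bit xCRn
    return header + per_pair * (breakpoints + watchpoints)
```

An OS could use a calculation of this kind to size the block that X0 points to before calling the save routine.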





ARM DDI 0487A.k_iss10775 Copyright © 2013-2016 ARM Limited or its affiliates. All rights reserved. K8-5577 
1ID092916 Non-Confidential 
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K8.2 Restore Debug registers 


This section shows how to restore the registers that are used by an external debugger. 


; On entry, X0 points to a block of saved debug registers. 
; Returns the pointer beyond the block and corrupts X1-X3, X12. 
RestoreDebugRegisters 
    ; (1) Lock OS lock. The lock will already be set, but this write is included to ensure it 
    ; is locked. 
    MOV     X2,#1                       ; Lock the OS lock. In AArch64 state, the OS lock 
    MSR     OSLAR_EL1,X2                ; is writable via OSLAR. 
    ISB                                 ; Context synchronization event 
    MSR     MDSCR_EL1,XZR               ; Initialize MDSCR_EL1 
    ; (2) Walk through the registers, restoring them 
    LDP     W1,W2,[X0],#8               ; Read { DTRRX, DTRTX } 
    MSR     OSDTRRX_EL1,X1              ; Restore DTRRX 
    MSR     OSDTRTX_EL1,X2              ; Restore DTRTX 
    LDP     W1,W12,[X0],#8              ; Read { ECCR, DSCR }. DSCR stays in X12 until the 
                                        ; exit path, because the macros below reuse X3. 
    MSR     OSECCR_EL1,X1               ; Restore ECCR 
  [ AARCH32_SUPPORTED 
    LDP     W1,W2,[X0],#8               ; Read { VCR, CLAIM } 
    MSR     DBGVCR32_EL2,X1             ; Restore DBGVCR 
    MSR     DBGCLAIMSET_EL1,X2          ; Restore CLAIM - note, writes CLAIMSET 
  ] 
    ;; Macro for restoring a "register pair" 
    MACRO 
    RestoreRP $WB,$num,$exit 
    LDR     X3,[X0],#8                  ; Read { xVRn } 
    MSR     DBG$WB.VR$num._EL1,X3       ; Restore DBGxVRn 
    LDR     W3,[X0],#4                  ; Read { xCRn } 
    MSR     DBG$WB.CR$num._EL1,X3       ; Restore DBGxCRn 
  [ $num >= 1 :LAND: $num < 15 
    CMP     W1,#$num                    ; Stop after the highest implemented pair 
    BEQ     $exit 
  ] 
    MEND 
    ; (3) Breakpoints 
    MRS     X1,ID_AA64DFR0_EL1 
    UBFX    W1,W1,#12,#4                ; Extract BRPs field 
    MACRO 
    RestoreBRP $num                     ; Restore a Breakpoint Register Pair 
    RestoreRP B,$num,RestoreDebugRegisters_Watchpoints 
    MEND 
    RestoreBRP 0 
    RestoreBRP 1 
    RestoreBRP 2 
    ;; and so on until ... 
    RestoreBRP 15 
RestoreDebugRegisters_Watchpoints 
    ; (4) Watchpoints 
    MRS     X1,ID_AA64DFR0_EL1          ; Read DBGDIDR 
    UBFX    W1,W1,#20,#4                ; Extract WRPs field 
    MACRO 
    RestoreWRP $num                     ; Restore a Watchpoint Register Pair 
    RestoreRP W,$num,RestoreDebugRegisters_Exit 
    MEND 
    RestoreWRP 0 
    RestoreWRP 1 
    RestoreWRP 2 
    ;; and so on until ... 
    RestoreWRP 15 
RestoreDebugRegisters_Exit 
    MSR     MDSCR_EL1,X12               ; Restore DSCR 
    ; (5) Clear the OS lock. 
    ISB 
    MOV     X2,#0                       ; Clear the OS lock. In AArch64 state, the OS lock 
    MSR     OSLAR_EL1,X2                ; is writable via OSLAR. 
    ; (6) A final ISB guarantees the restored register values are visible to subsequent 
    ; instructions. 
    ISB 
    ; (7) Return the pointer to first word not read. This pointer is already in X0, so 
    ; all that is needed is to return from this function. 
    RET 
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Appendix K9 
Recommended Upload and Download Processes for 
External Debug 


This appendix contains the following section: 


° Using memory access mode in AArch64 state on page K9-5582. 





Note 


This description is not part of the ARM architecture specification. It is included here as supplementary information, 
for the convenience of developers and users who might find this information useful. 
K9.1 Using memory access mode in AArch64 state 


Figure K9-1 and Figure K9-2 on page K9-5583 show the processes for using memory access mode to implement a 
download (external host to target) and an upload (target to external host). 


To transfer n words of data: 
° The download sequence needs n+6 accesses by the external debug interface. 


° The upload sequence needs n+8 accesses by the external debug interface. 


In both cases, in the innermost loop the debugger can make an external access to a DTR without polling EDSCR 
after each write as underrun and overrun detection prevent failure. Normally external accesses from the debugger 
are outpaced by the memory accesses of the PE, making underruns and overruns unlikely. If this is not the case, the 
EDSCR.ERR flag is set to 1. This is checked once at the end of the sequence, although a debugger can check it more 
often, for example once for each page. If the EDSCR.ERR flag is set to 1 because of overrun or underrun, the 
debugger can restart. The address to restart from is frozen in X0. EDSCR.ERR might also be set because of a Data 
abort. 


If underruns and overruns are common, the debugger can pace itself accordingly. 


Note 

• The base address must be a multiple of 4. 
• The order of the writes that set up the address does not matter in Debug state. 


[Figure K9-1 flowchart, AArch64, write D[n] to A: 1. DBGDTRTX = A[63:32]. 2. DBGDTRRX = A[31:0]. 3. EDITR = "MRS X0,DBGDTR_EL0". 4. Set EDSCR.MA, set i = 0. Loop: DBGDTRRX = D[i], which issues a store through the ITR and sets ERR to 1 if there is an overrun or abort. 5. EDSCR.MA = 0. A final check leads either to the end of the sequence or to error recovery.]

Figure K9-1 Fast code download in AArch64 state (external host to target) 
In Figure K9-1 on page K9-5582, the sequence for the fast code download is as follows: 

1. Setup. From the external debug interface: 
   a. Write address [31:0] to DBGDTRRX_EL0. 
   b. Write address [63:32] to DBGDTRTX_EL0. 
   c. Write MRS X0, DBGDTR_EL0 to EDITR. The PE executes this instruction. 
   d. Set EDSCR.MA to 1. 

2. Loop n times. From the external debug interface: 
   a. Write to DBGDTRRX_EL0. The PE reads the word from DTRRX and stores it to memory. It 
      increments X0 by 4. 

3. Epilogue. From the external debug interface: 
   a. Clear EDSCR.MA to 0. 
   b. Read EDSCR to check for overruns or Data Aborts during download. 
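
The download steps above can be sketched as a list of external debug interface accesses, which makes the n+6 access count easy to confirm. This is a model only: the function name and the tuple layout are hypothetical, and no real debug interface is driven.

```python
def download_accesses(address: int, words: list) -> list:
    """List the external accesses that download `words` to `address`."""
    if address % 4:
        raise ValueError("the base address must be a multiple of 4")
    accesses = [
        # 1. Setup.
        ("write", "DBGDTRRX_EL0", address & 0xFFFFFFFF),
        ("write", "DBGDTRTX_EL0", (address >> 32) & 0xFFFFFFFF),
        ("write", "EDITR", "MRS X0, DBGDTR_EL0"),
        ("write", "EDSCR.MA", 1),
    ]
    # 2. Loop: one write for each word, with no polling of EDSCR.
    accesses += [("write", "DBGDTRRX_EL0", word) for word in words]
    # 3. Epilogue: leave memory access mode, then check EDSCR.ERR once.
    accesses += [("write", "EDSCR.MA", 0), ("read", "EDSCR", None)]
    return accesses
```

For three words the list has nine entries: four for setup, three for data, and two for the epilogue.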


[Figure K9-2 flowchart, AArch64, read D[n] from A: 1. DBGDTRTX = A[63:32]. 2. DBGDTRRX = A[31:0]. 3. EDITR = "MRS X0,DBGDTR_EL0". 4. EDITR = "MSR DBGDTR_EL0,X0" (sets TXfull to 1). 5. EDSCR.MA = 1, set i = 0. 6. Discard DBGDTRTX, which issues a load through the ITR and sets ERR to 1 in the case of an underrun or abort. Loop: D[i-1] = DBGDTRTX, which issues a load through the ITR and sets ERR to 1 if there is an underrun or abort. 7. EDSCR.MA = 0. 8. D[n-1] = DBGDTRTX, which sets ERR to 1 if there is an underrun. A final check leads either to the end of the sequence or to error recovery.]

Figure K9-2 Fast data upload in AArch64 state (target to external host) 


In Figure K9-2 on page K9-5583, the sequence for the fast data upload is as follows: 

1. Setup. From the external debug interface: 
   a. Write address [31:0] to DBGDTRRX_EL0. 
   b. Write address [63:32] to DBGDTRTX_EL0. 
   c. Write MRS X0, DBGDTR_EL0 to EDITR. 
   d. Write MSR DBGDTR_EL0, X0 to EDITR. This dummy operation ensures EDSCR.TXfull == 1. 
   e. Set EDSCR.MA to 1. 
   f. Read DBGDTRTX_EL0 and discard the value. The PE returns the previous DTR value, loads the first 
      word, and writes it to DTR. It increments X0 by 4. 

2. Loop n-1 times. From the external debug interface: 
   a. Read DBGDTRTX_EL0. The PE returns the previous DTRTX value, loads a new word, and writes it 
      to DTRTX. It increments X0 by 4. 

3. Epilogue. From the external debug interface: 
   a. Clear EDSCR.MA to 0. 
   b. Read DBGDTRTX_EL0 for the nth value. 
   c. Read EDSCR to check for underruns, overruns or Data Aborts during upload. 
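
The upload steps above can be modelled in the same way, as a list of external debug interface accesses, to confirm the n+8 access count. The function name and tuple layout are hypothetical, and no real debug interface is driven.

```python
def upload_accesses(address: int, n: int) -> list:
    """List the external accesses that upload `n` words from `address`."""
    if address % 4:
        raise ValueError("the base address must be a multiple of 4")
    if n < 1:
        raise ValueError("at least one word must be transferred")
    accesses = [
        # 1. Setup, ending with a read whose value is discarded.
        ("write", "DBGDTRRX_EL0", address & 0xFFFFFFFF),
        ("write", "DBGDTRTX_EL0", (address >> 32) & 0xFFFFFFFF),
        ("write", "EDITR", "MRS X0, DBGDTR_EL0"),
        ("write", "EDITR", "MSR DBGDTR_EL0, X0"),
        ("write", "EDSCR.MA", 1),
        ("read", "DBGDTRTX_EL0", "discard"),
    ]
    # 2. Loop n-1 times: each read returns one word and triggers the next load.
    accesses += [("read", "DBGDTRTX_EL0", i) for i in range(n - 1)]
    # 3. Epilogue: leave memory access mode, read the last word, check EDSCR.
    accesses += [
        ("write", "EDSCR.MA", 0),
        ("read", "DBGDTRTX_EL0", n - 1),
        ("read", "EDSCR", "check ERR"),
    ]
    return accesses
```

The last word is read only after EDSCR.MA is cleared, so that the final read does not trigger a further load.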
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Appendix K10 
Barrier Litmus Tests 


This appendix gives examples of the use of the barrier instructions provided by the ARMv8 architecture. It contains 
the following sections: 


• Introduction on page K10-5586. 
• Load-Acquire, Store-Release and barriers on page K10-5589. 
• Load-Acquire Exclusive, Store-Release Exclusive and barriers on page K10-5596. 
• Using a mailbox to send an interrupt on page K10-5601. 
• Cache and TLB maintenance instructions and barriers on page K10-5602. 
• ARMv7 compatible approaches for ordering, using DMB and DSB barriers on page K10-5614. 


Note 


This information is not part of the ARM architecture specification. It is included here as supplementary information, 
for the convenience of developers and users who might require this information. 
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K10.1 Introduction 


The exact rules for the insertion of barriers into code sequences are a very complicated subject, and this appendix 
describes many of the corner cases and behaviors that are possible in an implementation of the ARMv8 architecture. 


This appendix is to help programmers, hardware design engineers, and validation engineers understand the need for 
the different kinds of barriers. 


K10.1.1 Overview of memory consistency 


Early generations of microprocessors were relatively simple processing engines that executed each instruction in 
program order. In such processors, the effective behavior was that each instruction was executed in its entirety 
before a subsequent instruction started to be executed. This behavior is sometimes referred to as the Sequential 
Execution Model (SEM), and in this Manual it is described as Simple sequential execution of the program. 


In later processor generations, the needs to increase processor performance, both in terms of the frequency of 
operation and the number of instructions executed each cycle, mean that such a simple form of execution is 
abandoned. Many techniques, such as pipelining, write buffering, caching, speculation, and out-of-order execution, 
are introduced to provide improved performance. 


For general purpose PEs, such as ARM, these microarchitectural innovations are largely hidden from the 
programmer by a number of microarchitectural techniques. These techniques ensure that, within an individual PE, 
the behavior of the PE largely remains the same as the SEM. There are some exceptions to this where explicit 
synchronization is required. In the ARM architecture, these are limited to cases such as: 


• Synchronization of changes to the instruction stream. 
• Synchronization of changes to System registers. 

In both these cases, the ISB instruction provides the necessary synchronization. 


While the effect of ordering is largely hidden from the programmer within a single PE, the microarchitectural 
innovations have a profound impact on the ordering of memory accesses. Write buffering, speculation, and cache 
coherency protocols, in particular, can all mean that the order in which memory accesses occur, as seen by an 
external observer, differs significantly from the order of accesses that would appear in the SEM. This is usually 
invisible in a uniprocessor environment, but the effect becomes much more significant when multiple PEs are trying 
to communicate with memory. In reality, these effects are often only significant at particular synchronization 
boundaries between the different threads of execution. 


The problems that arise from memory ordering considerations are sometimes described as the problem of memory 
consistency. Processor architectures have adopted one or more memory consistency models, or memory models, that 
describe the permitted limits of the memory re-ordering that can be performed by an implementation of the 
architecture. The comparison and categorization of these has generated significant research and comment in 
academic circles, and ARM recommends the Memory Consistency Models for Shared Memory-Multiprocessors 
paper as an excellent detailed treatment of this subject. 


This appendix does not reproduce such a work, but instead concentrates on some cases that demonstrate the features 
of the weakly-ordered memory model of the ARM architecture from ARMv6. In particular, the examples show how 
the use of the DMB and DSB memory barrier instructions can provide the necessary safeguards to limit memory 
ordering effects at the required synchronization points. 


K10.1.2 Barrier operation definitions 


The following reference, or provide, definitions of terms used in this appendix: 


DMB See Data Memory Barrier (DMB) on page B2-88. 
DSB See Data Synchronization Barrier (DSB) on page B2-89. 
ISB See Instruction Synchronization Barrier (ISB) on page B2-88. 


Observer, Completion 


See Observability and completion on page B2-84. 


Program order 


The order of instructions as they appear in an assembly language program. This appendix does not 
attempt to describe or define the legal transformations from a program written in a higher level 
programming language, such as C or C++, into the machine language that can then be disassembled 
to give an equivalent assembly language program. Such transformations are a function of the 
semantics of the higher level language and the capabilities and options on the compiler. 


K10.1.3 Conventions 


Many of the examples are written in a stylized extension to ARM assembler, to avoid confusing the examples with 
unnecessary code sequences. 


AArch32 


The construct WAIT([Rx]==1) describes the following sequence: 


loop 
LDR R12, [Rx] 
CMP R12, #1 
BNE loop 


Also, the construct WAIT_ACQ([Rx]==1) describes the following sequence: 


loop 

LDA R12, [Rx] ; load acquire ensures it is ordered before subsequent loads/stores 
CMP R12, #1 

BNE loop 


R12 is chosen as an arbitrary temporary register that is not in use. It is named to permit the generation of a false 
dependency to ensure ordering. 


AArch64 


The construct WAIT([Xx]==1) describes the following sequence: 


loop 
LDR W12, [Xx] 
CMP W12, #1 
B.NE loop 


Also, the construct WAIT_ACQ([Xx]==1) describes the following sequence: 


loop 

LDAR W12, [Xx] ; load acquire ensures it is ordered before subsequent loads/stores 
CMP W12, #1 

B.NE loop 


For each example, a code sequence is preceded by an identifier of the observer running it: 


• P0, P1...Px refer to caching coherent PEs that implement the ARMv8 architecture, and are in the same 
shareability domain. 

• E0, E1...Ex refer to non-caching observers, that do not participate in the coherency protocol, but execute 
ARM instructions and have a weakly-ordered memory model. This does not preclude these observers being 
different objects, such as DMA engines or other system masters. 


These observers are unsynchronized other than as required by the documented code sequence. 


Note 


Throughout this appendix, ARM instruction and instruction refer to instructions from the A64, A32, or T32 
instruction set, provided by ARMv8 implementations. 


Results are expressed in terms of <agent>:<register>, such as P0:R5. The results can be described as: 


Permissible This does not imply that the results expressed are required or are the only possible results. 
In most cases they are results that would not be possible under a sequentially consistent 
running of the code sequences on the agents involved. In general terms, this means that these 
results might be unexpected to anyone unfamiliar with memory consistency issues. 


Not permissible Results that the architecture expressly forbids. 


Required Results that the architecture expressly requires. 


The examples omit the required shareability domain arguments of DMB and DSB instructions. The arguments are 
assumed to be selected appropriately for the shareability domains of the observers. 


In AArch32 state, where the barrier function in the litmus test can be achieved by a DMB ST, that is a barrier to stores 
only, this is shown by the use of DMB [ST]. This indicates that the ST qualifier can be omitted without affecting the 
result of the test. In some implementations DMB ST is faster than DMB. 


For AArch64 code, the shareability domain of the DMB or DSB must be included. This is shown in this manual using 
the notation DMB <domain> and DSB <domain> respectively. 


Except where otherwise stated, other conventions are: 


• All memory initializes to 0. 
• R0 and W0 contain the value 1. 
• R1 - R4 and W1 - W4 contain arbitrary independent addresses that initialize to the same value on all PEs. 


The addresses held in these registers are shareable and: 
— The addresses held in R1 and R2 are in Write-Back Cacheable Normal memory. 
— The address held in R3 is in Write-Through Cacheable Normal memory. 


— The address held in R4 is in Non-cacheable Normal memory. 


• R5 - R8 and W5 - W8 contain: 
— When used with an STR instruction, 0x55, 0x66, 0x77, and 0x88 respectively. 
— When used with an LDR instruction, the value 0. 


• R11 and W11 contain a new instruction or new translation table entry, as appropriate, and R10 contains the 
virtual address and the ASID, for use in this change of translation table entry. 

• Memory locations are Normal memory locations unless otherwise stated. 


The examples use mnemonics for the cache maintenance and TLB maintenance instructions. The following tables 
describe the mnemonics: 





• Cache maintenance instructions, functional group on page G4-4201. 
• TLB maintenance instructions, functional group on page G4-4202. 
K10.2 Load-Acquire, Store-Release and barriers 


The Load-Acquire and Store-Release instructions are described in Load-Acquire, Store-Release on page B2-90. 


The following sections show that most of the examples in sections Simple ordering and barrier cases on 
page K10-5614 and Load-Exclusive, Store-Exclusive and barriers on page K10-5620 can be achieved using the 
Load-Acquire and Store-Release instructions without the need for additional barriers. 


K10.2.1 Message passing 


The following sections describe: 
. Resolving weakly-ordered message passing by using Acquire and Release. 
. Resolving message passing by the use of Store-Release and address dependency on page K10-5590. 


Resolving weakly-ordered message passing by using Acquire and Release 


The message passing problem described in Weakly-ordered message passing problem on page K10-5614 can be 
solved by the use of Load-Acquire and Store-Release instructions when accessing the communications flag: 


AArch32 
P1 
    STR R5, [R1]          ; sets new data 
    STL R0, [R2]          ; sends flag indicating data ready, which is ordered after the STR 
P2 
    WAIT_ACQ([R2]==1)     ; waits on flag 
    LDR R5, [R1] 

AArch64 
P1 
    STR W5, [X1]          ; sets new data 
    STLR W0, [X2]         ; sends flag indicating data ready, which is ordered after the STR 
P2 
    WAIT_ACQ([X2]==1)     ; waits on flag 
    LDR W5, [X1] 


This ensures the observed order of both the reads and the writes allows transfer of data such that the result 
P2:R5==0x55 is guaranteed. 
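
The same pattern can be sketched in a higher level language. The following C++ example is an illustration only and is not part of the architecture: it assumes that the compiler maps a memory_order_release store and a memory_order_acquire load onto Store-Release and Load-Acquire instructions, or onto an equivalent barrier sequence. The function name run_message_passing and the variables data and flag are hypothetical stand-ins for the locations addressed by X1 and X2.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Runs one producer (P1) and one consumer (P2) and returns the value that
// the consumer read from the data location.
std::uint32_t run_message_passing() {
    std::uint32_t data = 0;              // plain data, stands in for [X1]
    std::atomic<std::uint32_t> flag{0};  // communications flag, stands in for [X2]
    std::uint32_t observed = 0;

    std::thread p1([&] {
        data = 0x55;                               // like STR W5, [X1]
        flag.store(1, std::memory_order_release);  // like STLR W0, [X2]
    });
    std::thread p2([&] {
        // like WAIT_ACQ([X2]==1): spin with an acquire load until the flag is seen
        while (flag.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        observed = data;                           // like LDR W5, [X1]
    });

    p1.join();
    p2.join();
    return observed;
}
```

Because the acquire load that reads 1 synchronizes with the release store, the later plain read of data is ordered after the write of 0x55, which mirrors the guaranteed result P2:R5==0x55.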


This approach also works with multiple observers, in a way that further observers use the same sequence as P2 uses: 





AArch32 
P3 
WAIT_ACQ( [R2]==1) ; waits on flag 
LDR R5, [R1] 
AArch64 
P3 
WAIT_ACQ( [X2]==1) ; waits on flag 
LDR W5, [X1] 


Resolving message passing by the use of Store-Release and address dependency 


The lack of ordering of stores discussed in Message passing with multiple observers on page K10-5616 can be 
resolved by the use of Store-Release for the store of the valid flag by P1, even when the observers are using an 
address dependency: 


AArch32 
P1 
    STR R5, [R1]          ; sets new data 
    STL R0, [R2]          ; sends flag indicating data ready using a Store-Release 
P2 
    WAIT([R2]==1) 
    AND R12, R12, #0      ; R12 is the destination of LDR in the WAIT macro 
    LDR R5, [R1, R12]     ; the load has an address dependency on R12 
                          ; and so is ordered after the flag has been seen 

AArch64 
P1 
    STR W5, [X1]          ; sets new data 
    STLR W0, [X2]         ; sends flag indicating data ready using a Store-Release 
P2 
    WAIT([X2]==1) 
    AND W12, W12, WZR     ; W12 is the destination of LDR in the WAIT macro 
    LDR W5, [X1, X12]     ; the load has an address dependency on W12 
                          ; and so is ordered after the flag has been seen 


This ensures the observed order of the writes allows transfer of data such that P2:R5 and P3:R5 contain the same 
value of 0x55. 


This approach also works with multiple observers, in a way that further observers use the same sequence as P2 uses: 


AArch32 
P3 
    WAIT([R2]==1) 
    AND R12, R12, #0      ; R12 is the destination of LDR in the WAIT macro 
    LDR R5, [R1, R12]     ; the load has an address dependency on R12 
                          ; and so is ordered after the flag has been seen 

AArch64 
P3 
    WAIT([X2]==1) 
    AND W12, W12, WZR     ; W12 is the destination of LDR in the WAIT macro 
    LDR W5, [X1, X12]     ; the load has an address dependency on W12 
                          ; and so is ordered after the flag has been seen 
K10.2.2 Address dependency with object construction 


When accessing an object-oriented data structure, the address dependency rule means that barriers are not required, 
even when initializing the object. A Store-Release can be used to ensure the order of the update of the base address: 


AArch32 
P1 
    STR R5, [R1, #offset]     ; sets new data in a field 
    STL R1, [R2]              ; updates base address 
P2 
    LDR R1, [R2]              ; reads base address 
    CMP R1, #0                ; checks if it is valid 
    BEQ null_trap 
    LDR R5, [R1, #offset]     ; uses base address to read field 

AArch64 
P1 
    STR W5, [X1, #offset]     ; sets new data in a field 
    STLR X1, [X2]             ; updates base address 
P2 
    LDR X1, [X2]              ; reads base address 
    CMP X1, #0                ; checks if it is valid 
    B.EQ null_trap 
    LDR W5, [X1, #offset]     ; uses base address to read field 


It is required that P2:R5==0x55 if the null_trap is not taken. This avoids P2 observing a partially constructed object 
from P1. Significantly, P2 does not need a barrier to ensure this behavior. 


The read of the base address in P2 could be a Load-Acquire, but it is not necessary in this case. 
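
As an illustration only, the object publication pattern can be sketched in C++. This sketch makes two assumptions that differ from the sequences above: the reader uses memory_order_acquire, because C++ does not give a plain load any ordering from an address dependency (memory_order_consume was designed for this case, but acquire is always sufficient), and the reader polls until the base address is non-null instead of branching to null_trap. The names Object, field, and run_publish_object are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

struct Object {
    std::uint32_t field;  // stands in for the location at [X1, #offset]
};

// P1 constructs an object and publishes its base address with a release
// store. P2 reads the base address and then reads the field through it.
std::uint32_t run_publish_object() {
    Object obj{0};
    std::atomic<Object*> base{nullptr};  // stands in for [X2]
    std::uint32_t observed = 0;

    std::thread p1([&] {
        obj.field = 0x55;                              // sets new data in a field
        base.store(&obj, std::memory_order_release);   // updates base address
    });
    std::thread p2([&] {
        Object* p = nullptr;
        // Poll until a valid (non-null) base address is seen.
        while ((p = base.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();
        }
        observed = p->field;                           // uses base address to read field
    });

    p1.join();
    p2.join();
    return observed;
}
```

The reader never observes a partially constructed object: once it sees the published base address, the field read returns 0x55.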


K10.2.3 Causal consistency issues with multiple observers 


The causal consistency problem discussed in Causal consistency issues with multiple observers on page K10-5617 can 
be addressed by the use of a Store-Release, as this requires that the store is multi-copy atomic when it is observed 
by a Load-Acquire. In addition, a Store-Release has an effect on the observation order of any stores observed by the 
observer executing the Store-Release. 


The following sequences guarantee causal consistency: 
• Using multi-copy atomicity of the Store-Release when observed by Load-Acquire. 
• Using ordering property of Store-Release on stores observed by the PE on page K10-5592. 


Using multi-copy atomicity of the Store-Release when observed by Load-Acquire 





AArch32 
P1 
    STL R0, [R2]          ; sets new data 
                          ; this is made multi-copy atomic 
P2 
    WAIT_ACQ([R2]==1)     ; waits to see new data from P1 
    STR R0, [R3]          ; sends flag 
                          ; must be after the new data has been seen by P2, 
                          ; as stores must not be speculative 
                          ; this does not need to be a Store-Release, 
                          ; though it could be a Store-Release 
P3 
    WAIT([R3]==1)         ; waits for P2 flag 
                          ; this does not need to be a WAIT_ACQ, although 
                          ; it could be a WAIT_ACQ (at which point the dependency is not needed) 
    AND R12, R12, #0      ; dependency to ensure order (only needed for a WAIT, not WAIT_ACQ) 
    ADD R12, R2, R12      ; creating a dependency 
    LDA R0, [R12]         ; reads P1 data using a Load-Acquire 

AArch64 
P1 
    STLR W0, [X2]         ; sets new data 
                          ; this is made multi-copy atomic 
P2 
    WAIT_ACQ([X2]==1)     ; waits to see new data from P1 
    STR W0, [X3]          ; sends flag 
                          ; must be after the new data has been seen by P2, 
                          ; as stores must not be speculative 
                          ; this does not need to be a Store-Release, 
                          ; though it could be a Store-Release 
P3 
    WAIT([X3]==1)         ; waits for P2 flag 
                          ; this does not need to be a WAIT_ACQ, though 
                          ; it could be a WAIT_ACQ (at which point the dependency is not needed) 
    AND W12, W12, WZR     ; dependency to ensure order (only needed for a WAIT, not WAIT_ACQ) 
    ADD X12, X2, X12      ; creating a dependency 
    LDAR W0, [X12]        ; reads P1 data using a Load-Acquire 


In this case, P3:R0 == 0 is not permissible. P3 is guaranteed to see the store from P1 if P2 has seen the store from 
P1 using a Load-Acquire. 


Using ordering property of Store-Release on stores observed by the PE 


AArch32 
P1 
    STR R0, [R2]          ; sets new data 
                          ; this does not have to be a Store-Release, though it 
                          ; could be a Store-Release 
P2 
    WAIT([R2]==1)         ; waits to see new data from P1 
                          ; this does not need to be a WAIT_ACQ, though it could be a WAIT_ACQ 
    STL R0, [R3]          ; sends flag 
                          ; this must be after the new data has been seen by P2 
                          ; as stores must not be speculative, 
                          ; as a Store-Release, this orders the P1 store 
P3 
    WAIT([R3]==1)         ; waits for P2 flag 
                          ; this does not need to be a WAIT_ACQ, although it could be a WAIT_ACQ 
                          ; (at which point the dependency is not needed) 
    AND R12, R12, #0      ; dependency to ensure order (only needed for a WAIT, not WAIT_ACQ) 
    LDR R0, [R2, R12]     ; reads P1 data 


AArch64 
P1 
    STR W0, [X2]          ; sets new data 
                          ; this does not have to be a Store-Release, though it 
                          ; could be a Store-Release 
P2 
    WAIT([X2]==1)         ; waits to see new data from P1 
                          ; this does not need to be a WAIT_ACQ, though it could be a WAIT_ACQ 
    STLR W0, [X3]         ; sends flag 
                          ; this must be after the new data has been seen by P2 
                          ; as stores must not be speculative, 
                          ; as a Store-Release, this orders the P1 store 
P3 
    WAIT([X3]==1)         ; waits for P2 flag 
                          ; this does not need to be a WAIT_ACQ, though it could be a WAIT_ACQ 
                          ; at which point the dependency is not needed 
    AND W12, W12, WZR     ; dependency to ensure order 
                          ; only needed for a WAIT, not WAIT_ACQ 
    LDR W0, [X2, X12]     ; reads P1 data 


In this case, P3:R0 == 0 is not permissible. P3 is guaranteed to see the store from P1 if P2 has seen the store from 
P1 before it does the Store-Release that is then seen by P3. 


Note 
The use of dependency by P3 could be replaced by a Load-Acquire. 
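
The three-observer chain can also be sketched in C++, as an illustration only. This sketch assumes the variant in which P3 uses an acquire load in place of the dependency, and it assumes that the compiler implements the release store and the acquire load with Store-Release and Load-Acquire instructions or an equivalent sequence. The function name run_causal_chain and the variables data and flag are hypothetical stand-ins for the locations addressed by X2 and X3.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// P1 writes data, P2 waits for the data and then sends a flag with release
// semantics, and P3 waits for the flag and then reads the data.
// Returns the value of data that P3 observed.
int run_causal_chain() {
    std::atomic<int> data{0};  // stands in for [X2]
    std::atomic<int> flag{0};  // stands in for [X3]
    int observed = -1;

    std::thread p1([&] {
        data.store(1, std::memory_order_relaxed);    // does not need to be a release
    });
    std::thread p2([&] {
        while (data.load(std::memory_order_relaxed) != 1) {
            std::this_thread::yield();               // waits to see new data from P1
        }
        flag.store(1, std::memory_order_release);    // sends flag, orders the P1 store
    });
    std::thread p3([&] {
        while (flag.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();               // waits for P2 flag
        }
        observed = data.load(std::memory_order_relaxed);  // reads P1 data
    });

    p1.join();
    p2.join();
    p3.join();
    return observed;
}
```

In the C++ model the read by P3 happens after the read by P2 that returned 1, so it cannot return the older value 0. This corresponds to P3:R0 == 0 being not permissible.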








K10.2.4 Multiple observers of writes to multiple locations 


The ARM weakly consistent memory model means that different observers can observe writes to different locations 
in different orders as was shown in Multiple observers of writes to multiple locations on page K10-5617, but the use 
of Load-Acquire and Store-Release can resolve this. In this case, the loads by P3 and P4 must be Load-Acquire in 
order to ensure the perceived multi-copy atomicity of the stores: 


AArch32 
P1 
    STL R0, [R1]          ; sets new data 
P2 
    STL R0, [R2]          ; sets new data 
P3 
    LDA R10, [R2]         ; reads P2 data before P1 
    LDA R9, [R1] 
    BIC R9, R10, R9       ; R9 <- R10 & ~R9 
                          ; R9 contains 1 if read from [R2] is observed to be 1 and 
                          ; read from [R1] is observed to be 0 
P4 
    LDA R9, [R1] 
    LDA R10, [R2] 
    BIC R9, R9, R10       ; R9 <- R9 & ~R10 
                          ; R9 contains 1 if read from [R2] is observed to be 0 and 
                          ; read from [R1] is observed to be 1 
AArch64 
P1 
    STLR W0, [X1]         ; sets new data 
P2 
    STLR W0, [X2]         ; sets new data 
P3 
    LDAR W10, [X2]        ; reads P2 data before P1 
    LDAR W9, [X1] 
    BIC W9, W10, W9       ; W9 <- W10 & ~W9 
                          ; W9 contains 1 if read from [X2] is observed to be 1 and 
                          ; read from [X1] is observed to be 0 
P4 
    LDAR W9, [X1] 
    LDAR W10, [X2] 
    BIC W9, W9, W10       ; W9 <- W9 & ~W10 
                          ; W9 contains 1 if read from [X2] is observed to be 0 and 
                          ; read from [X1] is observed to be 1 


In this case, the result P3:R9==1 and P4:R9==1 is not permissible, as the stores from P1 and P2 are multi-copy 
atomic when read by Load-Acquire. 


Therefore, if P3 gets R10==1, then we know that the P3 load of R9 can only be observed after we know that P4 has 
also observed the P2 store to [R2]. Similarly, if the P4 load of R9 returns 1, and the P3 load of R9 returns 0, then 
the P3 load must have occurred before the P4 load. 


Therefore, if the P3 load of R10 returns 1 and the P3 load of R9 returns 0, then we know that if the P4 load of R9 
returns 1, it must have happened after P4 has observed the P2 store to [R2], so the P4 load of R10 must return 1. 


This shows that, of the 4 possible values for {P3:R9, P4:R9}, the use of these instructions makes the result {1,1} 
impossible. 
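
The four-observer test can be sketched in C++, as an illustration only. In the C++ memory model, memory_order_acquire and memory_order_release alone do not forbid the {1,1} result, so this sketch uses memory_order_seq_cst, which compilers targeting AArch64 commonly implement with LDAR and STLR; that mapping is an assumption about the toolchain, not a requirement of this manual. The names forbidden_outcome_once, count_forbidden, x, and y are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs one round of the test. x and y stand in for the locations addressed
// by X1 and X2. Returns true if both readers report the {1,1} result.
bool forbidden_outcome_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int p3_r9 = 0;
    int p4_r9 = 0;

    std::thread p1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread p2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread p3([&] {
        const int r10 = y.load(std::memory_order_seq_cst);  // reads P2 data before P1
        const int r9 = x.load(std::memory_order_seq_cst);
        p3_r9 = r10 & ~r9;  // 1 if y was seen as 1 and x was seen as 0
    });
    std::thread p4([&] {
        const int r9 = x.load(std::memory_order_seq_cst);
        const int r10 = y.load(std::memory_order_seq_cst);
        p4_r9 = r9 & ~r10;  // 1 if x was seen as 1 and y was seen as 0
    });

    p1.join();
    p2.join();
    p3.join();
    p4.join();
    return p3_r9 == 1 && p4_r9 == 1;
}

// Repeats the test and counts how often the {1,1} result appears.
int count_forbidden(int rounds) {
    int hits = 0;
    for (int i = 0; i < rounds; ++i) {
        if (forbidden_outcome_once()) {
            ++hits;
        }
    }
    return hits;
}
```

Repeated runs can only fail to find the result; they cannot prove that it is impossible. The guarantee comes from the ordering rules, and the count serves as a sanity check.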


WFE and WFI and barriers 


The Wait For Event and Wait For Interrupt instructions permit the PE to suspend execution and enter a low-power 
state. An explicit DSB barrier instruction is required if it is necessary to ensure memory accesses made before the WFI 
or WFE are visible to other observers, unless some other mechanism has ensured this visibility. Examples of other 
mechanisms that would guarantee the required visibility are the DMB described in Posting a store before polling for 
acknowledgement on page K10-5619, or a dependency on a load. 


The following example requires the DSB to ensure that the store is visible: 
AArch32 
P1 
    STR R0, [R2] 
    DSB 
Loop 
    WFI 
    B Loop 

AArch64 
P1 
    STR W0, [X2] 
    DSB <domain> 
Loop 
    WFI 
    B Loop 


This requirement is unchanged in ARMv8 by the presence of Load-Acquire or Store-Release. 
K10.3 Load-Acquire Exclusive, Store-Release Exclusive and barriers 


The ARMv8 architecture adds acquire and release semantics to the Load-Exclusive and Store-Exclusive 
instructions, so that they can provide acquire ordering, release ordering, or both. 

The Load-Exclusive instruction can be specified to have acquire semantics, and the Store-Exclusive instruction can 
be specified to have release semantics. These can be arbitrarily combined to allow the atomic update created by a 
successful Load-Exclusive and Store-Exclusive pair to have any of: 

• No Ordering semantics (using LDREX and STREX). 
• Acquire only semantics (using LDAEX and STREX). 
• Release only semantics (using LDREX and STLEX). 
• Sequentially consistent semantics (using LDAEX and STLEX). 

In addition, the ARMv8 specification requires that the clearing of a global monitor generates an event for the PE 
associated with the global monitor. This can simplify the use of WFE, by removing the need for a DSB barrier and 
SEV instruction. 


K10.3.1 Acquiring a lock 


A common use of Load-Exclusive and Store-Exclusive instructions is to claim a lock to permit entry into a critical 
region. This is typically performed by testing a lock variable that indicates 0 for a free lock and some other value, 
commonly 1 or an identifier of the process holding the lock, for a taken lock. 


For a critical region, the requirement on taking a lock is usually for acquire semantics, while the clearing of a lock 
requires release semantics: 

AArch32 
Px 
    PLDW [R1]                 ; preload into cache in unique state 
Loop 
    LDAEX R5, [R1]            ; read lock with acquire 
    CMP R5, #0                ; check if 0 
    STREXEQ R5, R0, [R1]      ; attempt to store new value 
    CMPEQ R5, #0              ; test if store succeeded 
    BNE Loop                  ; retry if not 
    ; loads and stores in the critical region can now be performed 

AArch64 
Px 
    PRFM PSTL1KEEP, [X1]      ; preload into cache in unique state 
Loop 
    LDAXR W5, [X1]            ; read lock with acquire 
    CBNZ W5, Loop             ; check if 0 
    STXR W5, W0, [X1]         ; attempt to store new value 
    CBNZ W5, Loop             ; test if store succeeded and retry if not 
    ; loads and stores in the critical region can now be performed 


The acquire associated with the load is sufficient to ensure the required ordering in a lock situation. The 
Store-Exclusive will fail (and so be retried) if there is a store to the location being monitored between the 
Load-Exclusive and the Store-Exclusive. 
K10.3.2 Releasing a lock 


The converse operation of releasing a lock does not require the use of Load-Exclusive and Store-Exclusive 
instructions, because only a single observer is able to write to the lock. However, often it is necessary for any 
observer to observe any memory updates, or any values that are loaded into memory, before they observe the release 
of the lock. Therefore, the lock release needs release semantics: 


AArch32 
Px 
    ; loads and stores in the critical region 
    MOV R0, #0 
    STL R0, [R1]              ; clear the lock with release semantics 

AArch64 
Px 
    ; loads and stores in the critical region 
    STLR WZR, [X1]            ; clear the lock with release semantics 
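
As an illustration only, the acquire and release halves of a simple lock can be sketched together in C++. The sketch assumes that the compiler turns the compare-exchange into a Load-Acquire Exclusive and Store-Exclusive loop, or into an equivalent atomic instruction, and turns the release store into a Store-Release. The class SpinLock and the function run_spinlock_counter are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

class SpinLock {
    std::atomic<std::uint32_t> lock_{0};  // 0 = free, 1 = taken
public:
    void acquire() {
        std::uint32_t expected = 0;
        // Retry until the lock is seen free and the store of 1 succeeds.
        // Acquire ordering applies only when the exchange succeeds.
        while (!lock_.compare_exchange_weak(expected, 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
            expected = 0;  // a failed exchange overwrites expected
            std::this_thread::yield();
        }
    }
    void release() {
        // Only the holder writes here, so no exclusive access is needed.
        lock_.store(0, std::memory_order_release);
    }
};

// Each thread increments a plain counter inside the critical region.
std::uint64_t run_spinlock_counter(int threads, int iterations) {
    SpinLock lock;
    std::uint64_t counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.acquire();
                ++counter;  // loads and stores in the critical region
                lock.release();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

If either ordering were missing, accesses in the critical region could be observed outside it and increments could be lost. With both in place, the final count equals threads multiplied by iterations.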


K10.3.3 Ticket locks 


Ticket locks are common in more advanced systems, because they avoid a rush by many PEs to take a lock when it 
becomes free. A ticket lock fixes the order of the users of the critical section at the time that each one requests 
the lock, which avoids the starvation that can occur with a simple contention based spin lock. 


A ticket lock allocates each thread a ticket number when it first requests the lock, and then compares that number 
with the current number for the lock. If they are the same, then the critical section can be entered. Otherwise the 
thread waits until the current number is equal to the ticket number for that thread. 


The reading of the current number of the lock needs acquire semantics for the lock to be acquired. 


Note 


The code in this section is little-endian code, as it views the combined current and next values as a single combined 
quantity. The addresses of the current and next ticket values need to be adjusted for a big-endian system. 








This is shown in the implementation below: 
AArch32 
Px 
    ; R1 holds two 16 bit quantities 
    ; the lower halfword holds the current ticket number 
    ; the higher halfword holds the next ticket number 

    PLDW [R1]                 ; preload into cache in unique state 
Loop1 
    LDAEX R5, [R1]            ; read current and next 
    ADD R5, R5, #0x10000      ; increment the next number 
    STREX R6, R5, [R1]        ; and update the value 
    CMP R6, #0                ; did the exclusive pass 
    BNE Loop1                 ; retry if not 
    CMP R5, R5, ROR #16       ; is the current ticket ours 
    BEQ block_start 
Loop2 
    LDAH R6, [R1]             ; read current value 
    CMP R6, R5, LSR #16       ; compare it with our allocated ticket 
    BNE Loop2                 ; retry (spin) if it is not the same 
block_start 


AArch64 
Px 
    ; X1 holds 2 16 bit quantities 
    ; the lower halfword holds the current ticket number 
    ; the higher halfword holds the next ticket number 

    PRFM PSTL1KEEP, [X1]      ; preload into cache in unique state 
Loop1 
    LDAXR W5, [X1]            ; read current and next 
    ADD W5, W5, #0x10000      ; increment the next number 
    STXR W6, W5, [X1]         ; and update the value 
    CBNZ W6, Loop1            ; did the exclusive pass - retry if not 
    AND W6, W5, #0xFFFF 
    CMP W6, W5, LSR #16       ; is the current ticket ours 
    B.EQ block_start 
Loop2 
    LDARH W6, [X1]            ; read current value 
    CMP W6, W5, LSR #16       ; compare it with our allocated ticket 
    B.NE Loop2                ; retry (spin) if it is not the same 
block_start 


Releasing the ticket lock simply involves incrementing the current ticket number, that is still assumed to be in R6 
or W6, and doing a Store-Release: 

AArch32 
    ADD R6, R6, #1 
    STLH R6, [R1] 

AArch64 
    ADD W6, W6, #1 
    STLRH W6, [X1] 
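
The ticket lock idea can be sketched in C++, as an illustration only. This sketch departs from the sequences above in one labeled way: it keeps the next and current ticket numbers in two separate 16-bit atomic variables instead of packing them into one word, which avoids the endianness consideration. It assumes that the compiler implements the acquire load and release store with Load-Acquire and Store-Release instructions or an equivalent sequence. The names TicketLock and run_ticket_counter are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

class TicketLock {
    std::atomic<std::uint16_t> next_{0};     // next ticket number to hand out
    std::atomic<std::uint16_t> current_{0};  // ticket number now being served
public:
    void acquire() {
        // Take a ticket. The increment itself needs atomicity, not ordering.
        const std::uint16_t mine = next_.fetch_add(1, std::memory_order_relaxed);
        // Reading the current number needs acquire semantics.
        while (current_.load(std::memory_order_acquire) != mine) {
            std::this_thread::yield();  // spin until our ticket is served
        }
    }
    void release() {
        // Only the holder updates current_, so a plain read-modify-write
        // followed by a release store is sufficient.
        const std::uint16_t served = current_.load(std::memory_order_relaxed);
        current_.store(static_cast<std::uint16_t>(served + 1),
                       std::memory_order_release);
    }
};

// Each thread increments a plain counter inside the critical section.
std::uint64_t run_ticket_counter(int threads, int iterations) {
    TicketLock lock;
    std::uint64_t counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.acquire();
                ++counter;
                lock.release();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Threads enter the critical section in the order in which they took their tickets, so no thread can be starved by later arrivals.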


K10.3.4 Use of Wait For Event (WFE) and Send Event (SEV) with locks 


The ARMv8 architecture can use the Wait For Event mechanism to minimise the energy cost of polling variables 
by putting the PE into a low power state, suspending execution, until an asynchronous exception or an explicit event 
is seen by that PE. In ARMv8, the event can be generated as a result of clearing the global monitor, so removing the 
need for a DSB barrier or an explicit send event message. 


This can be used with simple locks or with ticket locks. 


Simple lock 


The following is an example of lock acquire code using WFE: 





AArch32 
Px 
    PLDW [R1]                 ; preload into cache in unique state 
Loop 
    LDAEX R5, [R1]            ; read lock with acquire 
    CMP R5, #0                ; check if 0 
    WFENE                     ; sleep if the lock is held 
    STREXEQ R5, R0, [R1]      ; attempt to store new value 
    CMPEQ R5, #0              ; test if store succeeded 
    BNE Loop                  ; retry if not 
AArch64 
Px 
    SEVL                      ; invalidates the WFE on the first loop iteration 
    PRFM PSTL1KEEP, [X1]      ; allocate into cache in unique state 
Loop 
    WFE 
    LDAXR W5, [X1]            ; read lock with acquire 
    CBNZ W5, Loop             ; check if 0 
    STXR W5, W0, [X1]         ; attempt to store new value 
    CBNZ W5, Loop             ; test if store succeeded and retry if not 
    ; loads and stores in the critical region can now be performed 


And the following is an example of lock release code: 


AArch32 
Px 
    ; loads and stores in the critical region 
    MOV R0, #0 
    STL R0, [R1]              ; clear the lock 

AArch64 
Px 
    ; loads and stores in the critical region 
    STLR WZR, [X1]            ; clear the lock 


Ticket lock 


In the Ticket lock case, the Load-Exclusive instruction can be used to move the monitor into the exclusive state for 
the express purpose of creating an event when the monitor changes state: 


AArch32 
Px 
    ; R1 holds 2 16 bit quantities 
    ; the lower halfword holds the current ticket number 
    ; the higher halfword holds the next ticket number 

    PLDW [R1]                 ; preload into cache in unique state 
Loop1 
    LDAEX R5, [R1]            ; read current and next 
    ADD R5, R5, #0x10000      ; increment the next number 
    STREX R6, R5, [R1]        ; and update the value 
    CMP R6, #0                ; did the exclusive pass 
    BNE Loop1                 ; retry if not 
    CMP R5, R5, ROR #16       ; is the current ticket ours 
    BEQ block_start 
    SEVL 
Loop2 
    WFE                       ; wait if there has not been a change to the count since last 
                              ; read 
    LDAEXH R6, [R1]           ; check the current count 
    CMP R6, R5, LSR #16       ; check if it is equal 
    BNE Loop2 
block_start 


AArch64 
Px 
    ; X1 holds 2 16 bit quantities 
    ; the lower halfword holds the current ticket number 
    ; the higher halfword holds the next ticket number 

    PRFM PSTL1KEEP, [X1]      ; preload into cache in unique state 
Loop1 
    LDAXR W5, [X1]            ; read current and next 
    ADD W5, W5, #0x10000      ; increment the next number 
    STXR W6, W5, [X1]         ; and update the value 
    CBNZ W6, Loop1            ; did the exclusive pass - retry if not 
    AND W6, W5, #0xFFFF 
    CMP W6, W5, LSR #16       ; is the current ticket ours 
    B.EQ block_start 
    SEVL 
Loop2 
    WFE 
    LDAXRH W6, [X1]           ; read current value 
    CMP W6, W5, LSR #16       ; compare it with our allocated ticket 
    B.NE Loop2                ; retry (spin) if it is not the same 
block_start 
K10.4 Using a mailbox to send an interrupt 

In some message passing systems, it is common for one observer to update memory and then notify a second 
observer of the update by sending an interrupt, using a mailbox. 
Although a memory access might be made to initiate the sending of the mailbox interrupt, a DSB instruction is 
required to ensure the completion of previous memory accesses. 
Therefore, the following sequence is required to ensure that P2 observes the updated value: 
AArch32 
P1 
    STR R5, [R1]              ; Message stored to shared memory location 
    DSB ST 
    STR R0, [R4]              ; R4 contains the address of a mailbox 
P2 
    ; interrupt service routine 
    LDR R5, [R1] 

AArch64 
P1 
    STR W5, [X1]              ; Message stored to shared memory location 
    DSB ST 
    STR W0, [X4]              ; X4 contains the address of a mailbox 
P2 
    ; interrupt service routine 
    LDR W5, [X1] 
K10.5 Cache and TLB maintenance instructions and barriers 


The following sections describe the use of barriers with cache and TLB maintenance instructions: 


• Data cache maintenance instructions. 
• Instruction cache maintenance instructions on page K10-5606. 
• TLB maintenance instructions and barriers on page K10-5609. 

K10.5.1 Data cache maintenance instructions 


The following sections describe the use of barriers with data cache maintenance instructions: 

• Message passing to non-caching observers. 
• Multiprocessing message passing to non-caching observers on page K10-5603. 
• Invalidating DMA buffers, non-functional example on page K10-5604. 
• Invalidating DMA buffers, functional example with single PE on page K10-5604. 
• Invalidating DMA buffers, functional example with multiple coherent PEs on page K10-5605. 


Message passing to non-caching observers 


The ARMv8 architecture requires the use of DMB instructions to ensure the ordering of data cache maintenance 
instructions and their effects. The Load-Acquire and Store-Release instructions have no effect on cache 
maintenance instructions. This means the following message passing approaches can be used when communicating 
between caching observers and non-caching observers: 


AArch32 
P1 
    STR R5, [R1]              ; updates data (assumed to be in P1 cache) 
    DCCMVAC R1                ; cleans cache to point of coherency 
    DMB                       ; ensures effects of the clean will be observed before the 
                              ; flag is set 
    STR R0, [R4]              ; sends flag to external agent (Non-cacheable location) 
E1 
    WAIT_ACQ([R4] == 1)       ; waits for the flag (with order) 
    LDR R5, [R1]              ; reads the data 

AArch64 
P1 
    STR W5, [X1]              ; updates data (assumed to be in P1 cache) 
    DC CVAC, X1               ; cleans cache to point of coherency 
    DMB ISH                   ; ensures effects of the clean will be observed before the 
                              ; flag is set 
    STR W0, [X4]              ; sends flag to external agent (Non-cacheable location) 
E1 
    WAIT_ACQ([X4] == 1)       ; waits for the flag (with order) 
    LDR W5, [X1]              ; reads the data 


In this example, it is required that E1:R5==0x55. 
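
As a rough illustration of why the clean must take effect before the flag is sent, the following C sketch models a single write-back cache line in front of one word of memory. It is a sequential toy model under stated assumptions, not an implementation of the architecture: the names `toy_system`, `clean_to_poc`, and `message_pass_to_noncaching` are hypothetical, the DMB is not modelled because the steps run strictly in program order, and the only property shown is that a non-caching observer reads memory and not the P1 cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one write-back cache line in front of one word. */
typedef struct {
    uint32_t memory;      /* what the non-caching observer E1 reads */
    uint32_t cache_data;  /* P1's cached copy of the same location */
    bool cache_valid;
    bool cache_dirty;
} toy_system;

/* Models the P1 store hitting in (or allocating into) the P1 cache. */
static void p1_store(toy_system *s, uint32_t value) {
    assert(s != NULL);
    s->cache_data = value;
    s->cache_valid = true;
    s->cache_dirty = true;
}

/* Models the effect of a clean to the point of coherency:
 * a dirty line is written back to memory and becomes clean. */
static void clean_to_poc(toy_system *s) {
    assert(s != NULL);
    if (s->cache_valid && s->cache_dirty) {
        s->memory = s->cache_data;
        s->cache_dirty = false;
    }
}

/* E1 does not look in the P1 cache; it reads memory directly. */
static uint32_t e1_load(const toy_system *s) {
    assert(s != NULL);
    return s->memory;
}

/* Runs the message-passing steps in order and returns what E1 reads.
 * The flag store and the wait on the flag are implied by the ordering
 * of the calls in this sequential model. */
uint32_t message_pass_to_noncaching(bool clean_before_flag) {
    toy_system s = {0};
    p1_store(&s, 0x55);
    if (clean_before_flag) {
        clean_to_poc(&s);
    }
    return e1_load(&s);
}
```

In this model, `message_pass_to_noncaching(true)` returns the new data, whereas `message_pass_to_noncaching(false)` returns the stale value that was still in memory.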
Multiprocessing message passing to non-caching observers


The broadcast nature of the cache maintenance instructions, combined with the properties of barriers, means that the
message passing principle for non-caching observers is:


AArch32
P1
STR R5, [R1] ; updates data (assumed to be in P1 cache)
STL R0, [R2] ; sends a flag for P2 (ordered by the store release)
P2
WAIT ([R2] == 1) ; waits for P1 flag
DMB ; ensures cache clean is observed after P1 flag is observed
DCCMVAC R1 ; cleans cache to point of coherency - will clean P1 cache
DMB ; ensures effects of the clean will be observed before the
; flag to E1 is set
STR R0, [R4] ; sends flag to E1
E1
WAIT_ACQ ([R4] == 1) ; waits for P2 flag (ordered)
LDR R5, [R1] ; reads data

AArch64
P1
STR W5, [X1] ; updates data (assumed to be in P1 cache)
STLR W0, [X2] ; sends a flag for P2 (ordered)
P2
WAIT ([X2] == 1) ; waits for P1 flag
DMB SY ; ensures cache clean is observed after P1 flag is observed
DC CVAC, X1 ; cleans cache to point of coherency, will clean P1 cache
DMB SY ; ensures effects of the clean will be observed before the
; flag to E1 is set
STR W0, [X4] ; sends flag to E1
E1
WAIT_ACQ ([X4] == 1) ; waits for P2 flag
LDR W5, [X1] ; reads data


In this example, it is required that E1:R5==0x55. The clean operation executed by P2 affects the data location in the 
P1 cache. The cast-out from the P1 cache is guaranteed to be observed before P2 updates [R4]. 


Note 


The cache maintenance instructions are not ordered by the Load-Acquire and Store-Release instructions. 
Invalidating DMA buffers, non-functional example


The basic scheme for communicating with an external observer that is a process that passes data in to a Cacheable
memory region must take account of the architectural requirement that regions with a Normal Cacheable attribute
can be allocated into a cache at any time, for example as a result of speculation. The following example shows this
possibility:

AArch32
P1
DCIMVAC R1 ; ensures cache is not dirty. A clean operation could be used
; but as the DMA will subsequently overwrite this region an
; invalidate operation is sufficient and usually more efficient
DMB ; ensures cache invalidation is observed before the next store
; is observed
STR R0, [R3] ; sends flag to external agent
WAIT_ACQ ([R4]==1) ; waits for a different flag from an external agent
LDR R5, [R1]
E1
WAIT ([R3] == 1) ; waits for flag
STR R5, [R1] ; stores new data
STL R0, [R4] ; sends a flag

AArch64
P1
DC IVAC, X1 ; ensures cache is not dirty. A clean operation could be used
; but as the DMA will subsequently overwrite this region an
; invalidate operation is sufficient and usually more efficient
DMB SY ; ensures cache invalidation is observed before the next store
; is observed
STR W0, [X3] ; sends flag to external agent
WAIT_ACQ ([X4]==1) ; waits for a different flag from an external agent
LDR W5, [X1]
E1
WAIT ([X3] == 1) ; waits for flag
STR W5, [X1] ; stores new data
STLR W0, [X4] ; sends a flag


If a speculative access occurs, there is no guarantee that the cache line containing [R1] is not brought back into the
cache after the cache invalidation, but before [R1] is written by E1. Therefore, the result P1:R5=0 is permissible.


Invalidating DMA buffers, functional example with single PE 


AArch32
P1
DCIMVAC R1 ; ensures cache is not dirty. A clean operation could be used
; but as the DMA will subsequently overwrite this region an
; invalidate operation is sufficient and usually more efficient
DMB ; ensures cache invalidation is observed before the next store
; is observed
STR R0, [R3] ; sends flag to external agent
WAIT ([R4]==1) ; waits for a different flag from an external agent
DMB ; ensures that cache invalidate is observed after the flag
; from external agent is observed
DCIMVAC R1 ; ensures cache discards stale copies before use
LDR R5, [R1]
E1
WAIT ([R3] == 1) ; waits for flag
STR R5, [R1] ; stores new data
STL R0, [R4] ; sends a flag

AArch64
P1
DC IVAC, X1 ; ensures cache is not dirty. A clean operation could be used
; but as the DMA will subsequently overwrite this region an
; invalidate operation is sufficient and usually more efficient
DMB SY ; ensures cache invalidation is observed before the next store
; is observed
STR W0, [X3] ; sends flag to external agent
WAIT ([X4]==1) ; waits for a different flag from an external agent
DMB SY ; ensures that cache invalidate is observed after the flag
; from external agent is observed
DC IVAC, X1 ; ensures cache discards stale copies before use
LDR W5, [X1]
E1
WAIT ([X3] == 1) ; waits for flag
STR W5, [X1] ; stores new data
STLR W0, [X4] ; sends a flag


In this example, the result P1:R5 == 0x55 is required. Including a cache invalidation after the store by E1 to [R1] is
observed ensures that the line is fetched from external memory after it has been updated.
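
The difference between the non-functional and functional sequences can be sketched with a small sequential C model. This is only an illustration under stated assumptions: the names `dma_model`, `fill_line`, and `dma_receive` are hypothetical, the model forces the worst-case speculative linefill between the first invalidation and the DMA write, and barriers are not modelled because the steps run in program order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of one DMA buffer word and one P1 cache line. */
typedef struct {
    uint32_t memory;     /* buffer word in external memory */
    uint32_t line_data;  /* P1's cached copy */
    bool line_valid;
} dma_model;

/* Models a data cache invalidate by address: the cached copy is dropped. */
static void invalidate_line(dma_model *m) {
    assert(m != NULL);
    m->line_valid = false;
}

/* Models a linefill, either speculative or caused by a demand load. */
static void fill_line(dma_model *m) {
    assert(m != NULL);
    m->line_data = m->memory;
    m->line_valid = true;
}

/* The external agent writes memory directly and does not update the cache. */
static void dma_write(dma_model *m, uint32_t value) {
    assert(m != NULL);
    m->memory = value;
}

/* A P1 load hits in the cache if the line is valid, otherwise it refills. */
static uint32_t p1_load(dma_model *m) {
    assert(m != NULL);
    if (!m->line_valid) {
        fill_line(m);
    }
    return m->line_data;
}

/* Returns the value P1 reads from the buffer after the DMA transfer. */
uint32_t dma_receive(bool invalidate_after_dma) {
    dma_model m = { .memory = 0, .line_data = 0, .line_valid = true };
    invalidate_line(&m);       /* first invalidation, before requesting data */
    fill_line(&m);             /* worst case: speculation re-allocates the stale line */
    dma_write(&m, 0x55);       /* E1 stores the new data to memory */
    if (invalidate_after_dma) {
        invalidate_line(&m);   /* second invalidation, after the flag from E1 */
    }
    return p1_load(&m);
}
```

With the second invalidation, `dma_receive(true)` returns the new data. Without it, `dma_receive(false)` returns the stale value, which corresponds to the permitted P1:R5=0 result of the non-functional example.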


Invalidating DMA buffers, functional example with multiple coherent PEs 


The broadcasting of cache maintenance instructions, and the use of DMB instructions to ensure their observability, 
means that the previous example extends naturally to a multiprocessor system. Typically this requires a transfer of 
ownership of the region that the external observer is updating. 


AArch32
P0
(Use data from [R1], potentially using [R1] as scratch space)
STL R0, [R2] ; signals release of [R1]
WAIT_ACQ ([R2] == 0) ; waits for new value from DMA
LDR R5, [R1]
P1
WAIT ([R2] == 1) ; waits for release of [R1] by P0
DCIMVAC R1 ; ensures caches are not dirty, an invalidate is sufficient
DMB
STR R0, [R3] ; requests new data for [R1]
WAIT ([R4] == 1) ; waits for new data
DMB
DCIMVAC R1 ; ensures caches discard stale copies before use
DMB
MOV R0, #0
STR R0, [R2] ; signals availability of new [R1]
E1
WAIT ([R3] == 1) ; waits for new data request
STR R5, [R1] ; sends new [R1]
DMB [ST]
STR R0, [R4] ; indicates that new data is available to P1

AArch64
P0
(Use data from [X1], potentially using [X1] as scratch space)
STLR W0, [X2] ; signals release of [X1]
WAIT_ACQ ([X2] == 0) ; waits for new value from DMA
LDR W5, [X1]
P1
WAIT ([X2] == 1) ; waits for release of [X1] by P0
DC IVAC, X1 ; ensures caches are not dirty, an invalidate is sufficient
DMB SY
STR W0, [X3] ; requests new data for [X1]
WAIT ([X4] == 1) ; waits for new data
DMB SY
DC IVAC, X1 ; ensures caches discard stale copies before use
DMB SY
STR WZR, [X2] ; signals availability of new [X1]
E1
WAIT ([X3] == 1) ; waits for new data request
STR W5, [X1] ; sends new [X1]
STR W0, [X4] ; indicates new data is available to P1


In this example, the result P0:R5 == 0x55 is required. The DMB issued by P1 after the first data cache invalidation
ensures that the effect of the cache invalidation on P0 is seen by E1 before the store by E1 to [R1]. The DMB issued by
P1 after the second data cache invalidation ensures that its effects are seen before the store of 0 to the semaphore
location in [R2].


K10.5.2 Instruction cache maintenance instructions


The following sections describe the use of barriers with instruction cache maintenance instructions: 
• Ensuring the visibility of updates to instructions for a uniprocessor.
• Ensuring the visibility of updates to instructions for a multiprocessor.


Ensuring the visibility of updates to instructions for a uniprocessor 


On a single PE, the agent that causes instruction fetches, or instruction cache linefills, is a separate memory system 
observer from the agent that causes data accesses. Therefore, any operations to invalidate the instruction cache can 
rely only on seeing updates to memory that are complete. This must be ensured by the use of a DSB instruction. 


Also, instruction cache maintenance instructions are only guaranteed to complete after the execution of a DSB, and 
an ISB is required to discard any instructions that might have been prefetched before the instruction cache 
invalidation completed. Therefore, on a uniprocessor, to ensure the visibility of an update to code and to branch to 
it, the following sequence is required: 


AArch32
P1
STR R11, [R1] ; R11 contains a new instruction to be stored in program memory
DCCMVAU R1 ; clean to PoU makes the new instruction visible to the instruction cache
DSB
ICIMVAU R1 ; ensures instruction cache/branch predictor discards stale data
BPIMVA R1
DSB ; ensures completion of the invalidation
ISB ; ensures instruction fetch path sees new instruction cache state
BX R1


In AArch64 state, the branch predictor maintenance is not required.

AArch64
P1
STR W11, [X1] ; W11 contains a new instruction to be stored in program memory
DC CVAU, X1 ; clean to PoU makes the new instruction visible to instruction cache
DSB ISH
IC IVAU, X1 ; ensures instruction cache/branch predictor discards stale data
DSB ISH ; ensures completion of the invalidation
ISB ; ensures instruction fetch path sees new instruction cache state
BR X1





Note 


Where the changes to the instructions span multiple cache lines, then the data cache and instruction cache 
maintenance instructions can be duplicated to cover each of the lines to be cleaned and to be invalidated. 





Ensuring the visibility of updates to instructions for a multiprocessor 


The ARMv8 architecture requires a PE that executes an instruction cache maintenance instruction to execute a DSB
instruction to ensure completion of the maintenance operation. This ensures that the cache maintenance instruction
is complete on all PEs in the Inner Shareable shareability domain.


An ISB is not broadcast, and so does not affect other PEs. This means that any other PE must perform its own ISB 
synchronization after it knows that the update is visible, if it is necessary to ensure its synchronization with the 
update. The following example shows how this might be done: 


AArch32
P1
STR R11, [R1] ; R11 contains a new instruction to be stored in program memory
DCCMVAU R1 ; clean to PoU makes the new instruction visible to the instruction cache
DSB ; ensures completion of the clean on all PEs
ICIMVAU R1 ; ensures instruction cache discards stale data
BPIMVA R1 ; ensures branch predictor discards stale data
DSB ; ensures completion of the instruction cache and branch predictor
; invalidation on all PEs
STR R0, [R2] ; sets flag to signal completion
ISB ; synchronizes context on this PE
BX R1 ; branches to new code
P2-Px
WAIT ([R2] == 1) ; waits for flag signalling completion
ISB ; synchronizes context on this PE
BX R1 ; branches to new code

AArch64
P1
STR X11, [X1] ; X11 contains a new instruction to be stored in program memory
DC CVAU, X1 ; clean to PoU makes the new instruction visible to the instruction cache
DSB ISH ; ensures completion of the clean on all PEs
IC IVAU, X1 ; ensures instruction cache/branch predictor discards stale data
DSB ISH ; ensures completion of the instruction cache/branch predictor
; invalidation on all PEs
STR W0, [X2] ; sets flag to signal completion
ISB ; synchronizes context on this PE
BR X1 ; branches to new code
P2-Px
WAIT ([X2] == 1) ; waits for flag signalling completion
ISB ; synchronizes context on this PE
BR X1 ; branches to new code
Nonfunctional approach


The following sequence does not have the same effect, because a DSB is not required to complete the instruction 
cache maintenance instructions that other PEs issue: 


AArch32
P1
STR R11, [R1] ; R11 contains a new instruction to be stored in program memory
DCCMVAU R1 ; clean to PoU makes the new instruction visible to the instruction cache
DSB ; ensures completion of the clean on all PEs
ICIMVAU R1 ; ensures instruction cache discards stale data
BPIMVA R1 ; ensures branch predictor discards stale data
DMB ; ensures ordering of the store after the invalidation
; DOES NOT guarantee completion of instruction cache/branch
; predictor on other PEs
STR R0, [R2] ; sets flag to signal completion
DSB ; ensures completion of the invalidation on all PEs
ISB ; synchronizes context on this PE
BX R1 ; branches to new code
P2-Px
WAIT ([R2] == 1) ; waits for flag signalling completion
DSB ; this DSB does not guarantee completion of P1
; ICIMVAU/BPIMVA
ISB
BX R1

AArch64
P1
STR W11, [X1] ; W11 contains a new instruction to be stored in program memory
DC CVAU, X1 ; clean to PoU makes the new instruction visible to instruction cache
DSB ISH ; ensures completion of the clean on all PEs
IC IVAU, X1 ; ensures instruction cache/branch predictor discards stale data
DMB ISH ; ensures ordering of the store after the invalidation
; DOES NOT guarantee completion of instruction cache/branch
; predictor on other PEs
STR W0, [X2] ; sets flag to signal completion
DSB ISH ; ensures completion of the invalidation on all PEs
ISB ; synchronizes context on this PE
BR X1 ; branches to new code
P2-Px
WAIT ([X2] == 1) ; waits for flag signalling completion
DSB ISH ; this DSB does not guarantee completion of P1
; IC IVAU
ISB
BR X1


In this example, P2...Px might not see the updated region of code at R1.
K10.5.3 TLB maintenance instructions and barriers


The following sections describe the use of barriers with TLB maintenance instructions: 

• Ensuring the visibility of updates to translation tables for a uniprocessor.
• Ensuring the visibility of updates to translation tables for a multiprocessor.
• Paging memory in and out.
• Using break-before-make when updating translation table entries.


Ensuring the visibility of updates to translation tables for a uniprocessor 


On a single PE, the agent that causes translation table walks is a separate memory system observer from the agent 
that causes data accesses. Therefore, any operations to invalidate the TLB can only rely on seeing updates to 
memory that are complete. This must be ensured by the use of a DSB instruction. 


The ARMv8 architecture requires that translation table walks look in the data or unified caches at L1, so such
systems do not require data cache cleaning.


After the translation tables update, any old copies of entries that might be held in the TLBs must be invalidated. This 
operation is only guaranteed to affect all instructions, including instruction fetches and data accesses, after the 
execution of a DSB and an ISB. Therefore, the code for updating a translation table entry is: 


AArch32
P1
STR R11, [R1] ; updates the translation table entry
DSB ; ensures visibility of the update to translation table walks
TLBIMVA R10
BPIALL
DSB ; ensures completion of the BP and TLB invalidation
ISB ; synchronises context on this PE
; new translation table entry can be relied upon at this point and all accesses
; generated by this observer using
; the old mapping have been completed

AArch64
P1
STR X11, [X1] ; updates the translation table entry
DSB ISH ; ensures visibility of the update to translation table walks
TLBI VAE1, X10 ; assumes execution at EL1
DSB ISH ; ensures completion of the TLB invalidation
ISB ; synchronises context on this PE
; new translation table entry can be relied upon at this point and all accesses
; generated by this observer using
; the old mapping have been completed


Importantly, by the end of this sequence, all accesses that used the old translation table mappings have been 
observed by all observers. 


An example of this is where a translation table entry is marked as invalid. Such a system must provide a mechanism 
to ensure that any access to a region of memory being marked as invalid has completed before any action is taken 
as a result of the region being marked as invalid. 
Ensuring the visibility of updates to translation tables for a multiprocessor


The same code sequence can be used in a multiprocessing system. The ARMv8 architecture requires a PE that 
executes a TLB maintenance instruction to execute a DSB instruction to ensure completion of the maintenance 
operation. This ensures that the TLB maintenance instruction is complete on all PEs in the Inner Shareable 
shareability domain. 


The completion of a DSB that completes a TLB maintenance instruction ensures that all accesses that used the old 
mapping have completed. 


AArch32
P1
STR R11, [R1] ; updates the translation table entry
DSB ; ensures visibility of the update to translation table walks
TLBIMVAIS R10
BPIALLIS
DSB ; ensures completion of the BP and TLB invalidation
ISB ; Note ISB is not broadcast and must be executed locally
; on other PEs
; new translation table entry can be relied upon at this point and all accesses
; generated by any observers affected by the broadcast TLBIMVAIS operation using
; the old mapping have been completed

AArch64
P1
STR X11, [X1] ; updates the translation table entry
DSB ISH ; ensures visibility of the update to translation table walks
TLBI VAE1IS, X10
DSB ISH ; ensures completion of the TLB invalidation
ISB ; Note ISB is not broadcast and must be executed locally
; on other PEs
; new translation table entry can be relied upon at this point and all accesses
; generated by any observers affected by the broadcast TLBI VAE1IS operation using
; the old mapping have been completed


The completion of the TLB maintenance instruction is guaranteed only by the execution of a DSB by the observer 
that performed the TLB maintenance instruction. The execution of a DSB by a different observer does not have this 
effect, even if the DSB is known to be executed after the TLB maintenance instruction is observed by that different 
observer. 


Paging memory in and out 


In a multiprocessor system there is a requirement to ensure the visibility of translation table updates when paging 
regions of memory into RAM from a backing store. This might, or might not, also involve paging existing locations 
in memory from RAM to a backing store. In such situations, the operating system selects one or more pages of 
memory that might be in use but are suitable to discard, with or without copying to a backing store, depending on 
whether or not the region of memory is writable. Disabling the translation table mappings for a page, and ensuring 
the visibility of that update to the translation tables, prevents agents accessing the page. 


For this reason, it is important that the DSB that is performed after the TLB invalidation ensures that no other updates 
to memory using those mappings are possible. 


An example sequence for the paging out of an updated region of memory, and the subsequent paging in of memory, 
is as follows: 





AArch32
P1
STR R11, [R1] ; updates the translation table for the region being paged out
DSB ; ensures visibility of the update to translation table walks
TLBIMVAIS R10 ; invalidates the old entry
DSB ; ensures completion of the invalidation on all PEs
ISB ; ensures visibility of the invalidation
BL SaveMemoryPageToBackingStore
BL LoadMemoryFromBackingStore
DSB ; ensures completion of the memory transfer (this could be part of
; LoadMemoryFromBackingStore)
ICIALLUIS ; also invalidates the branch predictor
STR R9, [R1] ; creates a new translation table entry with a new mapping
DSB ; ensures completion of the instruction cache
; and branch predictor invalidation
; AND ensures visibility of the new translation table mapping
ISB ; ensures synchronisation of this instruction stream

AArch64
P1
STR X11, [X1] ; updates the translation table for the region being paged out
DSB ISH ; ensures visibility of the update to translation table walks
TLBI VAE1IS, X10 ; invalidates the old entry
DSB ISH ; ensures completion of the invalidation on all PEs
ISB ; ensures visibility of the invalidation
BL SaveMemoryPageToBackingStore
BL LoadMemoryFromBackingStore
DSB ISH ; ensures completion of the memory transfer (this could be part of
; LoadMemoryFromBackingStore)
IC IALLUIS ; also invalidates the branch predictor
STR X9, [X1] ; creates a new translation table entry with a new mapping
DSB ISH ; ensures completion of the instruction cache
; and branch predictor invalidation
; AND ensures visibility of the new translation table mapping
ISB ; ensures synchronisation of this instruction stream


This example assumes the memory copies are performed by an observer that is coherent with the caches of PE P1. 
This observer might be P1 itself, using a specific paging mapping. For clarity, the example omits the functional 
descriptions of SaveMemoryPageToBackingStore and LoadMemoryFromBackingStore. LoadMemoryFromBackingStore is 
required to ensure that the memory updates that it makes are visible to instruction fetches. 


In this example, the use of ICIALLUIS in AArch32 state and IC IALLUIS in AArch64 state to invalidate the entire 
instruction cache is a simplification, that might not be optimal for performance. An alternative approach involves 
invalidating all of the lines in the caches using ICIMVAU in AArch32 state and IC IVAU operations in AArch64 
state. This invalidation must be done when the mapping used for the ICIMVAU and IC IVAU operations is valid 
but not executable. 


Using break-before-make when updating translation table entries 


The ARM Architecture requires that reads to the same location are observed in order, and since application level 
software relies on this behavior, the operating system needs to maintain this illusion when it is changing a virtual to 
physical address mapping for a location, as is the case with copy on write or other memory management techniques. 
This illusion can be maintained provided that the software uses a break-before-make sequence when updating 
translation table entries whenever multiple threads of execution can use the same translation tables and the change 
to the translation entries involves any of: 


• Changing the memory type.
• Changing the cacheability attributes.
• Changing the output address (OA), if the OA of at least one of the old translation table entry and the new
translation table entry is writable.
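
The three conditions in this list can be read as a simple predicate. The following C sketch is only an illustration of that reading: the `tt_entry_attrs` type, its field encodings, and the function name are hypothetical, and the predicate covers only the three cases listed here, not every case described elsewhere in the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the properties of a translation table entry
 * that matter for the three conditions listed above. */
typedef struct {
    unsigned memory_type;     /* assumed encoding, for example Normal or Device */
    unsigned cacheability;    /* assumed encoding of the cacheability attributes */
    uint64_t output_address;  /* OA of the mapping */
    bool writable;            /* true if the mapping permits writes */
} tt_entry_attrs;

/* Returns true if changing old_e into new_e matches one of the three
 * listed conditions, so that a break-before-make sequence is needed. */
bool needs_break_before_make(const tt_entry_attrs *old_e,
                             const tt_entry_attrs *new_e) {
    assert(old_e != NULL && new_e != NULL);
    if (old_e->memory_type != new_e->memory_type) {
        return true;   /* changing the memory type */
    }
    if (old_e->cacheability != new_e->cacheability) {
        return true;   /* changing the cacheability attributes */
    }
    if (old_e->output_address != new_e->output_address &&
        (old_e->writable || new_e->writable)) {
        return true;   /* changing the OA when either entry is writable */
    }
    return false;
}
```

For example, a copy-on-write update that moves a read-only mapping to a new, writable physical location changes the OA with a writable new entry, so the predicate returns true.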


The architecture requires use of a break-before-make sequence in these situations, see Using break-before-make
when updating translation table entries on page D4-1816 for more information. However, if software did not use a
break-before-make approach, an implementation might give a result that would occur if the two reads to the same
virtual address did not occur in program order. An example of such an occurrence would be an implementation of
copy-on-write, where one PE is performing two reads to the same virtual address at the same time as a second PE,
running code associated with the operating system, is copying the data from one physical location that is mapped to
by that virtual address, where the page was mapped as read-only, to a different physical location which will be
mapped as read-write.


If the operating system changed the address mapping without going through an invalid entry, then it would be 
possible for a third PE to perform a write to the location that would be seen by the first load by the first PE, and not 
seen by the second load by the same PE. 


The required break-before-make code sequence in this case is: 

AArch32
P1
; R1, R2 contain an invalid translation table entry (that is, one with bit[0] == 0)
; R3 contains the address of the translation table entry
; R4 contains the Virtual Address and ASID of the VA being remapped
; R5, R6 contain the new valid translation table entry
STRD R1, R2, [R3] ; stores invalid entry
DSB ISH ; ensures visibility of the update to translation table walks
TLBIMVAIS R4 ; invalidates the old entry
DSB ISH ; ensures completion of the invalidation on all PEs
ICIALLUIS ; also invalidates the branch predictor
STRD R5, R6, [R3] ; stores new mapping
DSB ISH ; ensures visibility of the update to translation table walks
ISB ; ensures synchronisation of this instruction stream


Note


This example shows an update to an entry in a translation table that is using the long-descriptor format.


AArch64
P1
; X1 contains an invalid translation table entry (that is, one with bit[0] == 0)
; X2 contains the address of the translation table entry
; X3 contains the Virtual Address and ASID of the VA being remapped
; X4 contains the new valid translation table entry
STR X1, [X2] ; stores invalid entry
DSB ISH ; ensures visibility of the update to translation table walks
TLBI VAE1IS, X3 ; invalidates the old entry
DSB ISH ; ensures completion of the invalidation on all PEs
IC IALLUIS ; also invalidates the branch predictor
STR X4, [X2] ; stores new mapping
DSB ISH ; ensures visibility of the update to translation table walks
ISB ; ensures synchronisation of this instruction stream


If this sequence is correctly followed, then the architecture guarantees that the loads to a virtual address being 
remapped will be seen in the correct order. 


The instruction cache maintenance is only required if the mapping from input address to output address has been 
changed as part of the change of the translation table entries, and the memory being moved is executable. In this 
example, the use of ICIALLUIS in AArch32 state and IC IALLUIS in AArch64 state to invalidate the entire instruction 
cache is a simplification, that might not be optimal for performance. An alternative approach involves invalidating 
all of the lines in the caches using ICIMVAU in AArch32 state, and IC IVAU in AArch64 state. This invalidation must
be done when the mapping used for the ICIMVAU and IC IVAU operations is valid but not executable. 
K10.5.4 Ordering of Memory-mapped device control with payloads


With a Memory-mapped peripheral, such as a DMA, which can also access memory for its own use, it is common 
to have control or status registers which are Memory-mapped. These registers need to be accessed in an ordered 
manner with respect to the data that the Memory-mapped peripheral is handling. 


Two simple examples of this are: 


• When a processing element is writing a buffer of data, and then writing to a control register in the DMA
peripheral to start that peripheral to access the buffer of data.

• When a DMA peripheral has written to a buffer of data in memory, and the processing element is reading a
status register to determine that the DMA transfer has completed, and then is reading the data.


For the case of the processing element writing a buffer of data before starting the DMA peripheral, the ordering
requirements between the stores to the data buffer and the store to the Memory-mapped control register of the DMA
peripheral can be met by the insertion of a DSB <domain> instruction between these sets of accesses, as this ensures
the global observation of the stores before the DMA is started. This is shown by the following code:


AArch32
P1
STR R5, [R2] ; data written to the data buffer
DSB
STR R0, [R4] ; R4 contains the address of the DMA control register

AArch64
P1
STR W5, [X2] ; data written to the data buffer
DSB <domain>
STR W0, [X4] ; X4 contains the address of the DMA control register


For the case of the DMA peripheral writing the data buffer and then setting a status register when those stores are
complete (and so globally observed), and then having this status register polled by the processing element before the
processing element reads the data buffer, the processing element must insert a DSB <domain> between the load that
reads the status register and the read of the buffer. A DMB, or load-acquire, is not sufficient as this problem is not
solely concerned with observation order, since the polling read is actually a read of a status register at a slave, not
the polling of a data value that has been written by an observer.


For this case, the code is therefore:

AArch32
P1
WAIT ([R4] == 1) ; R4 contains the address of the status register,
; and the value '1' indicates completion of the DMA transfer
DSB
LDR R5, [R2] ; reads data from the data buffer

AArch64
P1
WAIT ([X4] == 1) ; X4 contains the address of the status register,
; and the value '1' indicates completion of the DMA transfer
DSB <domain>
LDR W5, [X2] ; reads data from the data buffer
K10.6 ARMv7 compatible approaches for ordering, using DMB and DSB barriers
The following sections describe the ARMv7 compatible approaches for ordering, using DMB and DSB barriers: 
• Simple ordering and barrier cases.
• Load-Exclusive, Store-Exclusive and barriers.
• Using a mailbox to send an interrupt.
• Cache and TLB maintenance instructions and barriers.

K10.6.1 Simple ordering and barrier cases 
ARM implements a weakly consistent memory model for Normal memory. In general terms, this means that the 
order of memory accesses observed by other observers might not be the order that appears in the program, for either 
loads or stores. 

This section includes examples of this. 
Simple weakly consistent ordering example
P1
STR R5, [R1]
LDR R6, [R2]
P2
STR R6, [R2]
LDR R5, [R1]
In the absence of barriers, the result of P1: R6=0, P2: R5=0 is permissible.
Message passing 
The following sections describe: 
• Weakly-ordered message passing problem.
• Message passing with multiple observers.
Weakly-ordered message passing problem
P1
STR R5, [R1] ; sets new data
STR R0, [R2] ; sends flag indicating data ready
P2
WAIT([R2]==1) ; waits on flag
LDR R5, [R1] ; reads new data
In the absence of barriers, an end result of P2: R5=0 is permissible.
Resolving by the addition of barriers 
The addition of barriers, to ensure the observed order of the reads and the writes, ensures that data is transferred so 
that the result P2:R5==0x55 is guaranteed, as follows: 
P1
STR R5, [R1] ; sets new data
DMB [ST] ; ensures all observers observe data before the flag
STR R0, [R2] ; sends flag indicating data ready
P2
WAIT([R2]==1) ; waits on flag
DMB ; ensures that the load of data is after the flag has been observed
LDR R5, [R1]
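
For readers more familiar with C, the same pattern can be sketched with C11 atomics and POSIX threads. This is an analogy, not a translation of the sequence above: the fences below play the role of the two DMB instructions but are not claimed to compile to them, a release fence is not identical to DMB [ST], and the names `mp_data`, `mp_flag`, and `run_message_passing` are hypothetical. The sketch assumes a toolchain with `<stdatomic.h>` and `<pthread.h>`, and might need `-pthread` when linking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t mp_data;     /* plays the role of the data at [R1] */
static atomic_uint mp_flag;  /* plays the role of the flag at [R2] */

/* P1: writes the data, then the flag, with a fence between them. */
static void *mp_producer(void *arg) {
    (void)arg;
    mp_data = 0x55;                                   /* sets new data */
    atomic_thread_fence(memory_order_release);        /* orders data before flag */
    atomic_store_explicit(&mp_flag, 1, memory_order_relaxed);
    return NULL;
}

/* P2: waits on the flag, then reads the data, with a fence between them. */
static void *mp_consumer(void *arg) {
    uint32_t *out = arg;
    assert(out != NULL);
    while (atomic_load_explicit(&mp_flag, memory_order_relaxed) != 1) {
        /* spin until the flag is observed */
    }
    atomic_thread_fence(memory_order_acquire);        /* orders flag before data */
    *out = mp_data;                                   /* reads new data */
    return NULL;
}

/* Runs one round of message passing and returns the value P2 read.
 * Returns 0 if the consumer thread could not be created. */
uint32_t run_message_passing(void) {
    pthread_t consumer;
    uint32_t seen = 0;
    mp_data = 0;
    atomic_store(&mp_flag, 0);
    if (pthread_create(&consumer, NULL, mp_consumer, &seen) != 0) {
        return 0;
    }
    mp_producer(NULL);            /* the calling thread acts as P1 */
    pthread_join(consumer, NULL);
    return seen;
}
```

Because the release fence precedes the flag store and the acquire fence follows the load that observes the flag, the C11 memory model guarantees that the consumer reads 0x55. Removing either fence would make the read of `mp_data` a data race.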


Resolving by the use of barriers and address dependency 


There is a rule within the ARM architecture that: 


• Where the value returned by a read is used for computation of the virtual address of a subsequent read or
write, then these two memory accesses are observed in program order.


Where the value returned by a read is used for computation of the virtual address of a subsequent read or 
write, this is called an address dependency. An address dependency exists even if the value returned by the 
first read has no effect on the virtual address. This might occur if the value returned is masked off before it 
is used, or if it confirms a predicted address value that it might have changed. 


This restriction applies only when the data value returned by a read is used as a data value to calculate the 
address of a subsequent read or write. It does not apply if the data value returned by a read determines the 
condition flags values, and the values of the flags are used for condition code evaluation to determine the 
address of a subsequent read, either through conditional execution or the evaluation of a branch. This is called 
a control dependency. 


Where both a control and address dependency exist, the ordering behavior is consistent with the address
dependency.
Table K10-1 shows examples of address dependencies, control dependencies, and an address and control 
dependency. 
Table K10-1 Dependency examples

Address dependency                  Control dependency                    Address and control dependency [a]
(a)             (b)                 (c)               (d)                 (e)
LDR r1, [r0]    LDR r1, [r0]        LDR r1, [r0]      LDR r1, [r0]        LDR r1, [r0]
LDR r2, [r1]    AND r1, r1, #0      CMP r1, #55       CMP r1, #55         CMP r1, #0
                LDR r2, [r3, r1]    LDRNE r2, [r3]    MOVNE r4, #22       LDRNE r2, [r1]
                                                      LDR r2, [r3, r4]

a. The address dependency takes priority.


This means that the data transfer example of Weakly-ordered message passing problem on page K10-5614 can also 
be satisfied as shown in the following example: 


P1
    STR R5, [R1]         ; sets new data
    DMB [ST]             ; ensures all observers observe data before the flag
    STR R0, [R2]         ; sends flag indicating data ready
P2
    WAIT([R2]==1)
    AND R12, R12, #0     ; R12 is destination of LDR in WAIT macro
    LDR R5, [R1, R12]    ; the load has an address dependency on R12
                         ; and so is ordered after the flag has been seen


The load of R5 by P2 is ordered with respect to the load from [R2] because there is an address dependency using
R12. P1 uses a DMB to ensure that P2 does not observe the write of [R2] before the write of [R1].

Message passing with multiple observers


Where the ordering of Normal memory accesses is not resolved by the use of barriers or dependencies, then different 
observers might observe the accesses in a different order, as shown in the following example: 





P1
    STR R5, [R1]         ; sets new data
    STR R0, [R2]         ; sends flag indicating data ready
P2
    WAIT([R2]==1)
    AND R12, R12, #0     ; R12 is destination of LDR in WAIT macro
    LDR R5, [R1, R12]    ; the load has an address dependency on R12
                         ; and so is ordered after the flag has been seen
P3
    WAIT([R2]==1)
    AND R12, R12, #0     ; R12 is destination of LDR in WAIT macro
    LDR R5, [R1, R12]    ; the load has an address dependency on R12
                         ; and so is ordered after the flag has been seen


In this case, it is permissible for P2:R5 and P3:R5 to contain different values, because there is no order guaranteed
between the two stores performed by P1.


Resolving by the addition of barriers 


The addition of a barrier by P1, as shown in the following example, ensures the observed order of the writes, 
transferring data so that P2:R5 and P3:R5 both contain the value 0x55: 





P1
    STR R5, [R1]         ; sets new data
    DMB [ST]             ; ensures all observers observe data before the flag
    STR R0, [R2]         ; sends flag indicating data ready
P2
    WAIT([R2]==1)
    AND R12, R12, #0     ; R12 is the destination of LDR in WAIT macro
    LDR R5, [R1, R12]    ; the load has an address dependency on R12
                         ; and so is ordered after the flag has been seen
P3
    WAIT([R2]==1)
    AND R12, R12, #0     ; R12 is the destination of LDR in WAIT macro
    LDR R5, [R1, R12]    ; the load has an address dependency on R12
                         ; and so is ordered after the flag has been seen


Address dependency with object construction 


When accessing an object-oriented data structure, the address dependency rule means that the observer reading the
object does not require a barrier, even when the object is being initialized:


P1
    STR R5, [R1, #offset]    ; sets new data in a field
    DMB [ST]                 ; ensures all observers observe data before base address is updated
    STR R1, [R2]             ; updates base address
P2
    LDR R1, [R2]             ; reads for base address
    CMP R1, #0               ; checks if it is valid
    BEQ null_trap
    LDR R5, [R1, #offset]    ; uses base address to read field


If the null_trap is not taken, it is required that P2:R5==0x55. This avoids P2 observing a partially constructed object 
from P1. Significantly, P2 does not require a barrier to ensure this behavior. 


P1 requires a barrier to ensure the observed order of the writes by P1. In general, the impact of requiring a barrier 
during the construction phase is much less than the impact of requiring a barrier for every read access. 
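
The object construction pattern above resembles pointer publication in C11, sketched here under stated assumptions. The names struct object, storage, base, p1_publish and p2_read are hypothetical. The release store stands in for the DMB followed by the STR in P1, and memory_order_consume is the C11 ordering intended to rely on an address dependency in the reader. Current compilers typically strengthen consume to acquire, so the generated code may contain a barrier that the architecture itself does not require.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct object {
    uint32_t field;                      /* the word at [R1, #offset] */
};

static struct object storage;            /* hypothetical object built by P1 */
static _Atomic(struct object *) base;    /* base address slot [R2]; NULL until published */

/* P1: initialize the field, then publish the base address. */
void p1_publish(uint32_t value)
{
    storage.field = value;               /* STR R5, [R1, #offset] */
    /* Release store: plays the role of DMB followed by STR R1, [R2]. */
    atomic_store_explicit(&base, &storage, memory_order_release);
}

/* P2: returns 1 and writes the field to *out if the object is published,
 * or returns 0 (the null_trap path) if it is not. */
int p2_read(uint32_t *out)
{
    struct object *p = atomic_load_explicit(&base, memory_order_consume); /* LDR R1, [R2] */
    if (p == NULL) {
        return 0;                        /* BEQ null_trap */
    }
    *out = p->field;                     /* LDR R5, [R1, #offset]: address depends on p */
    return 1;
}
```

The reader's load of the field uses the loaded pointer as its base address, which is the dependency that makes a reader-side barrier unnecessary at the architectural level.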


Causal consistency issues with multiple observers 


The fact that different observers can observe memory accesses in different orders extends, in the absence of barriers, 
to behaviors that do not fit naturally expected causal properties, as the following example shows: 


P1
    STR R0, [R2]         ; sets new data
P2
    WAIT([R2]==1)        ; waits to see new data from P1
    STR R0, [R3]         ; sends flag, must be after the new data has been seen by P2, as stores
                         ; must not be speculative
P3
    WAIT([R3]==1)        ; waits for P2 flag
    AND R12, R12, #0     ; dependency to ensure order
    LDR R0, [R2, R12]    ; reads P1's data


In this example, P3:R0==0 is permissible. P3 is not guaranteed to see the stores from P1 and P2 in any particular
order. This applies despite the fact that the store from P2 can only happen after P2 has observed the store from P1.


This example shows that the ARM memory order model for Normal memory does not conform to causal 
consistency. This means that the apparently transitive causal relationship between two variables is not guaranteed 
to be transitive. 


The following example shows the insertion of a barrier by P2 to create causal consistency: 
P1
    STR R0, [R2]         ; sets new data
P2
    WAIT([R2]==1)        ; waits to see new data from P1
    DMB                  ; ensures P1 data is observed by all observers before any following store
    STR R0, [R3]         ; sends flag
P3
    WAIT([R3]==1)        ; waits for P2 flag
    AND R12, R12, #0     ; dependency to ensure order
    LDR R0, [R2, R12]    ; reads P1 data





This creates causal consistency because a DMB is required to order all accesses that the executing PE observed before 
the DMB, not only those it issued, before any of the accesses that follow the DMB. 
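
A three-observer chain of this shape can be sketched in C11 as follows. The names data_word, flag_word, p1_store, p2_forward and p3_observe are hypothetical. The sequentially consistent fence in p2_forward stands in for the DMB executed by P2. C has no portable way to write the AND R12, R12, #0 dependency, so an acquire fence is used in p3_observe instead, which is a swapped-in technique rather than a translation of the listing.

```c
#include <stdatomic.h>

static atomic_uint data_word;   /* hypothetical: plays the role of [R2] */
static atomic_uint flag_word;   /* hypothetical: plays the role of [R3] */

/* P1: sets new data. */
void p1_store(void)
{
    atomic_store_explicit(&data_word, 1, memory_order_relaxed);
}

/* P2: waits for P1's data, then forwards a flag to P3. */
void p2_forward(void)
{
    while (atomic_load_explicit(&data_word, memory_order_relaxed) != 1) {
        /* WAIT([R2]==1) */
    }
    atomic_thread_fence(memory_order_seq_cst);                    /* role of the DMB */
    atomic_store_explicit(&flag_word, 1, memory_order_relaxed);   /* STR R0, [R3] */
}

/* P3: waits for P2's flag, then reads P1's data. */
unsigned p3_observe(void)
{
    while (atomic_load_explicit(&flag_word, memory_order_relaxed) != 1) {
        /* WAIT([R3]==1) */
    }
    atomic_thread_fence(memory_order_acquire);   /* stands in for the address dependency */
    return atomic_load_explicit(&data_word, memory_order_relaxed);
}
```

With the fence in p2_forward, a p3_observe call that sees the flag is expected to return 1. Without any fence in p2_forward, the C11 model, like the ARM model described here, would permit 0.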
Multiple observers of writes to multiple locations 


The ARM weakly consistent memory model means that different observers can observe writes to different locations 
in different orders, as the following example shows: 





P1
    STR R0, [R1]       ; sets new data
P2
    STR R0, [R2]       ; sets new data
P3
    LDR R10, [R2]      ; reads P2 data before P1 data
    LDR R9, [R1]
    BIC R9, R10, R9    ; R9 <- R10 & ~R9
                       ; R9 contains 1 iff read from [R2] is observed to be 1 and
                       ; read from [R1] is observed to be 0.
P4
    LDR R9, [R1]
    LDR R10, [R2]
    BIC R9, R9, R10    ; R9 <- R9 & ~R10
                       ; R9 contains 1 iff read from [R2] is observed to be 0 and
                       ; read from [R1] is observed to be 1.


In this example, the result P3:R9==1 and P4:R9==1 is permissible. This means that P3 and P4 observed the stores 
from P1 and P2 in different orders. 


The following example shows the use of DMB instructions to ensure sequential consistency: 


P1
    STR R0, [R1]       ; sets new data
P2
    STR R0, [R2]       ; sets new data
P3
    LDR R10, [R2]      ; reads P2 data before P1 data
    DMB
    LDR R9, [R1]
    BIC R9, R10, R9    ; R9 <- R10 & ~R9
                       ; R9 contains 1 iff read from [R2] is observed to be 1 and
                       ; read from [R1] is observed to be 0.
P4
    LDR R9, [R1]       ; reads P1 data before P2 data
    DMB
    LDR R10, [R2]
    BIC R9, R9, R10    ; R9 <- R9 & ~R10
                       ; R9 contains 1 iff read from [R2] is observed to be 0 and
                       ; read from [R1] is observed to be 1.


In this example: 


• The DMB executed by P3 ensures that, if the P3 load from [R2] observes the P2 store to [R2], then all observers
  observe the P2 store to [R2] before they observe the P3 load from [R1].

• The DMB executed by P4 ensures that, if the P4 load from [R1] observes the P1 store to [R1], then all observers
  observe the P1 store to [R1] before they observe the P4 load from [R2].


If the P3 load from [R1] returns 0, then it has not observed the P1 store to [R1]. Also, if the P3 load of [R2] returns 1, 
then all observers must have observed the P2 store to [R2] before they observed the P1 store to [R1]. This means 
that P4 cannot observe the P1 store to [R1] without also observing the P2 store to [R2]. 


Alternatively, if the P4 load from [R2] returns 0, then it has not observed the P2 store to [R2]. If, also, the P4 load 
of [R1] returns 1, then all observers must have observed the P1 store to [R1] before they observed the P2 store to 
[R2]. This means that P3 cannot observe the P2 store to [R2] without also observing the P1 store to [R1]. 


This shows that, of the four possible results for {P3:R9, P4:R9}, the insertion of these barriers makes the result 
{1, 1} impossible. 
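
For comparison, a C11 sketch of the same four-observer test is shown below. It is not a line-by-line translation of the DMB placement: it uses the default sequentially consistent atomic operations for all four accesses, which is the most direct way in C11 to rule out the {1, 1} result. The names x, y, p1_store_x, p2_store_y, p3_observe and p4_observe are hypothetical.

```c
#include <stdatomic.h>

static atomic_uint x;   /* hypothetical: plays the role of [R1], written by P1 */
static atomic_uint y;   /* hypothetical: plays the role of [R2], written by P2 */

void p1_store_x(void) { atomic_store(&x, 1); }   /* sequentially consistent store */
void p2_store_y(void) { atomic_store(&y, 1); }   /* sequentially consistent store */

/* P3: reads [R2] then [R1]; returns 1 iff it saw y == 1 and x == 0. */
unsigned p3_observe(void)
{
    unsigned r10 = atomic_load(&y);
    unsigned r9 = atomic_load(&x);
    return r10 & ~r9;                /* BIC R9, R10, R9 */
}

/* P4: reads [R1] then [R2]; returns 1 iff it saw x == 1 and y == 0. */
unsigned p4_observe(void)
{
    unsigned r9 = atomic_load(&x);
    unsigned r10 = atomic_load(&y);
    return r9 & ~r10;                /* BIC R9, R9, R10 */
}
```

Because all sequentially consistent operations appear in a single total order, p3_observe and p4_observe running concurrently with the two stores cannot both return 1.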
Posting a store before polling for acknowledgement


In the case where an observer stores to a location, and then polls for an acknowledge from a different observer, the 
weak ordering of the memory model can lead to a deadlock, as the following example shows: 


P1
    STR R0, [R2]
    WAIT ([R3]==1)
P2
    WAIT ([R2]==1)
    STR R0, [R3]


In ARMv7 implementations that do not include the Multiprocessing Extensions, this can deadlock because P2
might not observe the store by P1 in finite time. For ARMv7 implementations with the Multiprocessing Extensions
and for ARMv8, this is not an issue as all stores must be observed by all observers within their shareability domain
in finite time.


The addition of a DMB instruction prevents this deadlock in ARMv7 implementations that do not include the 
Multiprocessing Extensions: 


P1
    STR R0, [R2]
    DMB
    WAIT ([R3]==1)
P2
    WAIT ([R2]==1)
    STR R0, [R3]


The DMB executed by P1 ensures that P2 observes the store by P1 before it observes the load by P1. This ensures a 
timely completion. 


The following example is a variant of the previous example, where the two observers poll the same memory 
location: 


P1
    STR R0, [R2]
    WAIT ([R2]==2)
P2
    WAIT ([R2]==1)
    LDR R0, [R2]
    ADD R0, R0, #1
    STR R0, [R2]


In this example, the same deadlock can occur in ARMv7 implementations that do not include the Multiprocessing 
Extensions, because the architecture permits P1 to read the result of its own store to [R2] early, and continue doing 
so for an indefinite amount of time. The addition of a DMB instruction prevents this deadlock: 


P1
    STR R0, [R2]
    DMB
    WAIT ([R2]==2)
P2
    WAIT ([R2]==1)
    LDR R0, [R2]
    ADD R0, R0, #1
    STR R0, [R2]

WFE and WFI and barriers

The Wait For Event and Wait For Interrupt instructions permit the PE to suspend execution and enter a low-power
state. A DSB barrier instruction is required if it is necessary to ensure that memory accesses made before the WFI or
WFE are visible to other observers, unless some other mechanism has ensured this visibility. Examples of other
mechanisms that would guarantee the required visibility are the DMB described in Posting a store before polling for
acknowledgement on page K10-5619, or a dependency on a load.

The following example requires the DSB to ensure that the store is visible:

P1
    STR R0, [R2]
    DSB
Loop
    WFI
    B Loop

However, if the example in Posting a store before polling for acknowledgement on page K10-5619 is extended to
include a WFE, there is no risk of a deadlock. The extended example is:

P1
    STR R0, [R2]
    DMB
Loop
    LDR R12, [R3]
    CMP R12, #1
    WFENE
    BNE Loop
P2
    WAIT ([R2]==1)
    STR R0, [R3]
    DSB
    SEV

In this example:

• The DMB by P1 ensures that P2 observes the store by P1 before it observes the load by P1.

• The dependency of the WFE on the result of the load by P1 means that this load must complete before P1
  executes the WFE.

For more information about SEV, see Use of Wait For Event (WFE) and Send Event (SEV) with locks on
page K10-5621.

K10.6.2 Load-Exclusive, Store-Exclusive and barriers


The Load-Exclusive and Store-Exclusive instructions, described in Synchronization and semaphores on
page B2-108, are predictable only with Normal memory. These instructions do not have any implicit barrier
functionality. Therefore, any use of these instructions to implement locks of any type requires the addition of explicit
barriers.


Acquiring a lock 


A common use of Load-Exclusive and Store-Exclusive instructions is to claim a lock to permit entry into a critical
region. This is typically performed by testing a lock variable that indicates 0 for a free lock and some other value,
commonly 1 or an identifier of the process holding the lock, for a taken lock.


The lack of implicit barriers in the Load-Exclusive and Store-Exclusive instructions means that the mechanism
requires a DMB instruction between acquiring a lock and making the first access to the critical region, to ensure that
all observers observe the successful claim of the lock before they observe any subsequent loads or stores to the
region. This example shows Px acquiring a lock:

Px
Loop
    LDREX R5, [R1]          ; reads lock
    CMP R5, #0              ; checks if 0
    STREXEQ R5, R0, [R1]    ; attempts to store new value
    CMPEQ R5, #0            ; tests if store succeeded
    BNE Loop                ; retries if not
    DMB                     ; ensures that all subsequent accesses are observed after the
                            ; gaining of the lock is observed
    ; loads and stores in the critical region can now be performed


Releasing a lock 


The converse operation of releasing a lock does not require the use of Load-Exclusive and Store-Exclusive 
instructions, because only a single observer is able to write to the lock. However, often it is necessary for any 
observer to observe any memory updates, or any values that are loaded into memory, before they observe the release 
of the lock. Therefore, a DMB usually precedes the lock release, as the following example shows. 


Px
    ; loads and stores in the critical region
    MOV R0, #0
    DMB                     ; ensures all previous accesses are observed before the lock is cleared
    STR R0, [R1]            ; clears the lock
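
The acquire and release sequences can be approximated in C11 as shown in the following sketch. It is illustrative only: lock_word, lock_acquire, lock_release and lock_holder are hypothetical names, and the compare-exchange loop is assumed to be compiled to a Load-Exclusive/Store-Exclusive loop or an equivalent atomic instruction, depending on the target and toolchain. The explicit fences take the place of the two DMB instructions.

```c
#include <stdatomic.h>

static atomic_uint lock_word;   /* hypothetical: plays the role of [R1]; 0 = free */

/* Spins until the lock is claimed for owner_id, which must be nonzero. */
void lock_acquire(unsigned owner_id)
{
    unsigned expected;
    do {
        expected = 0;           /* only a free lock may be claimed */
    } while (!atomic_compare_exchange_weak_explicit(
                 &lock_word, &expected, owner_id,
                 memory_order_relaxed, memory_order_relaxed));
    /* Role of the DMB after the claim: later accesses stay after it. */
    atomic_thread_fence(memory_order_acquire);
}

/* Clears the lock after the critical region. */
void lock_release(void)
{
    /* Role of the DMB before the clear: earlier accesses stay before it. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&lock_word, 0, memory_order_relaxed);
}

/* Returns the current lock value, 0 when the lock is free. */
unsigned lock_holder(void)
{
    return atomic_load_explicit(&lock_word, memory_order_relaxed);
}
```

A caller brackets its critical region with lock_acquire(id) and lock_release(). The weak compare-exchange can fail spuriously, in the same way that a Store-Exclusive can fail, so the retry loop is required.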


Use of Wait For Event (WFE) and Send Event (SEV) with locks 


The ARMv8 architecture includes Wait For Event and Send Event instructions, that can be executed to reduce the 
required number of iterations of a lock-acquire loop, or spinlock, to reduce power. The basic mechanism involves 
an observer that is in a spinlock executing a WFE instruction that suspends execution on that observer until an 
asynchronous exception or an explicit event, sent by some other observer using the SEV instruction, is seen by the 
suspended observer. An observer that holds the lock executes an SEV instruction to send an event after it has released 
the lock. 


The Event signal is a non-memory communication, and therefore the memory update that releases the lock must be
observable by all observers before the SEV instruction is executed and the event is sent. This requires the use of a DSB
instruction, rather than a DMB.


Therefore, the following is an example of lock acquire code using WFE: 


Px
Loop
    LDREX R5, [R1]          ; reads lock
    CMP R5, #0              ; checks if 0
    WFENE                   ; sleeps if the lock is held
    STREXEQ R5, R0, [R1]    ; attempts to store new value
    CMPEQ R5, #0            ; tests if store succeeded
    BNE Loop                ; retries if not
    DMB                     ; ensures that all subsequent accesses are observed after the
                            ; gaining of the lock is observed
    ; loads and stores in the critical region can now be performed


And the following is an example of lock release code using SEV: 





Px
    ; loads and stores in the critical region
    MOV R0, #0
    DMB                     ; ensures all previous accesses are observed before the lock is cleared
    STR R0, [R1]            ; clears the lock
    DSB                     ; ensures completion of the store that cleared the lock before
                            ; sending the event
    SEV

K10.6.3 Using a mailbox to send an interrupt

In some message passing systems, it is common for one observer to update memory and then notify a second
observer of the update by sending an interrupt, using a mailbox.

Although a memory access might be made to initiate the sending of the mailbox interrupt, a DSB instruction is
required to ensure the completion of previous memory accesses.

Therefore, the following sequence is required to ensure that P2 observes the updated value:

P1
    STR R5, [R1]            ; message stored to shared memory location
    DSB [ST]
    STR R1, [R4]            ; R4 contains the address of a mailbox
P2
    ; interrupt service routine
    LDR R5, [R1]

Note

The DSB executed by P1 ensures global observation of the store to [R1]. The interrupt timing ensures that the code
executed by P2 is executed after the global observation of the update to [R1], and therefore must see this update. In
some implementations, this might be implemented by requiring that interrupts flush non-coherent buffers that hold
speculatively loaded data.

K10.6.4 Cache and TLB maintenance instructions and barriers


The following sections describe the use of barriers with cache and TLB maintenance instructions: 
• Data cache maintenance instructions.
• Instruction cache maintenance instructions on page K10-5624.
• TLB maintenance instructions and barriers on page K10-5626.


Data cache maintenance instructions 


The following sections describe the use of barriers with data cache maintenance instructions: 

• Message passing to non-caching observers.
• Multiprocessing message passing to non-caching observers on page K10-5623.
• Invalidating DMA buffers, non-functional example on page K10-5623.
• Invalidating DMA buffers, functional example with single PE on page K10-5623.
• Invalidating DMA buffers, functional example with multiple coherent PEs on page K10-5624.


Message passing to non-caching observers 


The ARMv8 architecture requires the use of DMB instructions to ensure the ordering of data cache maintenance
instructions and their effects. This means the following message passing approaches can be used when 
communicating between caching observers and non-caching observers: 


P1
    STR R5, [R1]        ; updates data (assumed to be in P1's cache)
    DCCMVAC R1          ; cleans cache to point of coherency
    DMB                 ; ensures effects of the clean will be observed before the flag is set
    STR R0, [R4]        ; sends flag to external agent (Non-cacheable location)
E1
    WAIT ([R4] == 1)    ; waits for the flag
    DMB                 ; ensures that flag has been seen before reading data
    LDR R5, [R1]        ; reads the data

In this example, it is required that E1:R5==0x55.


Multiprocessing message passing to non-caching observers 


The broadcast nature of the cache maintenance instructions in ARMv8, and in ARMv7 implementations that include 
the Multiprocessing Extensions, combined with properties of barriers, means that the message passing principle for 
non-caching observers is: 


P1
    STR R5, [R1]        ; updates data (assumed to be in P1's cache)
    DMB [ST]            ; ensures new data is observed before the flag to P2 is set
    STR R0, [R2]        ; sends flag to P2
P2
    WAIT ([R2] == 1)    ; waits for flag from P1
    DMB                 ; ensures cache clean is observed after P1 flag is observed
    DCCMVAC R1          ; cleans cache to point of coherency - this cleans the cache of P1
    DMB                 ; ensures effects of the clean are observed before the flag to E1 is set
    STR R0, [R4]        ; sends flag to E1
E1
    WAIT ([R4] == 1)    ; waits for flag from P2
    DMB                 ; ensures that flag has been observed before reading the data
    LDR R5, [R1]        ; reads the data


In this example, it is required that E1:R5==0x55. The clean operation executed by P2 affects the data location in the 
P1 cache. The cast-out from the P1 cache is guaranteed to be observed before P2 updates [R4]. 


Invalidating DMA buffers, non-functional example 


The basic scheme for communicating with an external observer that is a process that passes data into a Cacheable
memory region must take account of the architectural requirement that regions with a Normal Cacheable attribute
can be allocated into a cache at any time, for example as a result of speculation. The following example shows this
possibility:

P1
    DCIMVAC R1          ; ensures caches are not dirty. A clean operation could be
                        ; used but the DMA overwrites this region so an invalidate operation
                        ; is sufficient and usually more efficient
    DMB                 ; ensures cache invalidation is observed before the next store is observed
    STR R0, [R3]        ; sends flag to external agent
    WAIT ([R4]==1)      ; waits for a different flag from an external agent
    DMB                 ; observes flag from external agent before reading new data. However [R1]
                        ; could have been brought into cache earlier
    LDR R5, [R1]
E1
    WAIT ([R3] == 1)    ; waits for flag
    STR R5, [R1]        ; stores new data
    DMB
    STR R0, [R4]        ; sends a flag


If a speculative access occurs, there is no guarantee that the cache line containing [R1] is not brought back into the
cache after the cache invalidation, but before [R1] is written by E1. Therefore, the result P1:R5=0 is permissible.


Invalidating DMA buffers, functional example with single PE 





P1
    DCIMVAC R1          ; ensures cache is not dirty. A clean operation could be
                        ; used but the DMA overwrites this region so an invalidate operation
                        ; is sufficient and usually more efficient
    DMB                 ; ensures cache invalidation is observed before the next store is observed
    STR R0, [R3]        ; sends flag to external agent
    WAIT ([R4]==1)      ; waits for a different flag from an external agent
    DMB                 ; ensures that cache invalidate is observed after the flag
                        ; from external agent is observed
    DCIMVAC R1          ; ensures cache discards stale copies before use
    LDR R5, [R1]
E1
    WAIT ([R3] == 1)    ; waits for flag
    STR R5, [R1]        ; stores new data
    DMB [ST]
    STR R0, [R4]        ; sends a flag


In this example, the result P1:R5 == 0x55 is required. Including a cache invalidation after the store by E1 to [R1] is
observed ensures that the line is fetched from external memory after it has been updated.


Invalidating DMA buffers, functional example with multiple coherent PEs 


The broadcasting of cache maintenance instructions, and the use of DMB instructions to ensure their observability, 
means that the previous example extends naturally to a multiprocessor system. Typically this requires a transfer of 
ownership of the region that the external observer is updating. 


P0
    (Use data from [R1], potentially using [R1] as scratch space)
    DMB
    STR R0, [R2]        ; signals release of [R1]
    WAIT ([R2] == 0)    ; waits for new value from DMA
    DMB
    LDR R5, [R1]
P1
    WAIT ([R2] == 1)    ; waits for release of [R1] by P0
    DCIMVAC R1          ; ensures caches are not dirty, invalidate is sufficient
    DMB
    STR R0, [R3]        ; requests new data for [R1]
    WAIT ([R4] == 1)    ; waits for new data
    DMB
    DCIMVAC R1          ; ensures caches discard stale copies before use
    DMB
    MOV R0, #0
    STR R0, [R2]        ; signals availability of new [R1]
E1
    WAIT ([R3] == 1)    ; waits for new data request
    STR R5, [R1]        ; sends new [R1]
    DMB [ST]
    STR R0, [R4]        ; indicates new data available to P1


In this example, the result P0:R5 == 0x55 is required. The DMB issued by P1 after the first data cache invalidation
ensures that the effect of the cache invalidation on P0 is seen by E1 before the store by E1 to [R1]. The DMB issued by
P1 after the second data cache invalidation ensures that its effects are seen before the store of 0 to the semaphore
location in [R2].


Instruction cache maintenance instructions 


The following sections describe the use of barriers with instruction cache maintenance instructions: 





• Ensuring the visibility of updates to instructions for a uniprocessor on page K10-5625.
• Ensuring the visibility of updates to instructions for a multiprocessor on page K10-5625.

Ensuring the visibility of updates to instructions for a uniprocessor


On a single PE, the agent that causes instruction fetches, or instruction cache linefills, is a separate memory system 
observer from the agent that causes data accesses. Therefore, any operations to invalidate the instruction cache can 
rely only on seeing updates to memory that are complete. This must be ensured by the use of a DSB instruction. 


Also, instruction cache maintenance instructions are only guaranteed to complete after the execution of a DSB, and 
an ISB is required to discard any instructions that might have been prefetched before the instruction cache 
invalidation completed. Therefore, on a uniprocessor, to ensure the visibility of an update to code and to branch to 
it, the following sequence is required: 


P1
    STR R11, [R1]       ; R11 contains a new instruction to store in program memory
    DCCMVAU R1          ; clean to PoU makes new instructions visible to instruction cache
    DSB
    ICIMVAU R1          ; ensures instruction cache and branch predictor discard stale data
    BPIMVA R1
    DSB                 ; ensures completion of the invalidation
    ISB                 ; ensures instruction fetch path observes new instruction cache state
    BX R1
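
From C, application code does not usually issue this sequence directly. As a hedged illustration, GCC and Clang provide the __builtin___clear_cache builtin, which on ARM targets is typically implemented by the runtime or operating system with cache maintenance and barriers of this kind. The exact instructions used are implementation specific. The function name install_code is hypothetical, and the sketch assumes that dst lies in memory that the caller has separately made executable before branching to it.

```c
#include <stddef.h>
#include <string.h>

/* Copies n bytes of instruction encodings to dst, then asks the toolchain to
 * make them visible to instruction fetch on the executing PE. */
void install_code(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);    /* role of STR R11, [R1] */

    /* Role of the clean, invalidate and barrier sequence. On targets with
     * coherent instruction fetch this builtin may expand to nothing. */
    __builtin___clear_cache((char *)dst, (char *)dst + n);
}
```

The builtin affects only the calling PE's view of the instruction stream. Any other PE that will execute the new code still needs its own context synchronization.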


Ensuring the visibility of updates to instructions for a multiprocessor 


ARMv8, and an ARMv7 implementation that includes the Multiprocessing Extensions, require a PE that executes
an instruction cache maintenance instruction to execute a DSB instruction to ensure completion of the maintenance
operation. This ensures that the cache maintenance instruction is complete on all PEs in the Inner Shareable
shareability domain.


An ISB is not broadcast, and so does not affect other PEs. This means that any other PE must perform its own ISB 
synchronization after it knows that the update is visible, if it is necessary to ensure its synchronization with the 
update. The following example shows how this might be done: 


P1
    STR R11, [R1]       ; R11 contains a new instruction to store in program memory
    DCCMVAU R1          ; clean to PoU makes new instructions visible to instruction cache
    DSB                 ; ensures completion of the clean on all processors
    ICIMVAU R1          ; ensures instruction cache/branch predictor discards stale data
    BPIMVA R1
    DSB                 ; ensures completion of the instruction cache and branch predictor
                        ; invalidation on all PEs
    STR R0, [R2]        ; sets flag to signal completion
    ISB                 ; synchronizes context on this PE
    BX R1               ; branches to new code
P2-Px
    WAIT ([R2] == 1)    ; waits for flag signaling completion
    ISB                 ; synchronizes context on this processor
    BX R1               ; branches to new code


Nonfunctional approach 


The following sequence does not have the same effect, because a DSB is not required to complete the instruction 
cache maintenance instructions that other PEs issue: 


P1
    STR R11, [R1]       ; R11 contains a new instruction to store in program memory
    DCCMVAU R1          ; clean to PoU makes new instructions visible to instruction cache
    DSB                 ; ensure completion of the clean on all PEs
    ICIMVAU R1          ; ensure instruction cache/branch predictor discards stale data
    BPIMVA R1
    DMB                 ; ensure ordering of the store after the invalidation
                        ; DOES NOT guarantee completion of instruction cache/branch
                        ; predictor on other PEs
    STR R0, [R2]        ; sets flag to signal completion
    DSB                 ; ensures completion of the invalidation on all PEs
    ISB                 ; synchronizes context on this PE
    BX R1               ; branches to new code
P2-Px
    WAIT ([R2] == 1)    ; waits for flag signaling completion
    DSB                 ; this DSB does not guarantee completion of P1's ICIMVAU/BPIMVA
    ISB
    BX R1


In this example, P2...Px might not see the updated region of code at R1. 


TLB maintenance instructions and barriers 


The following sections describe the use of barriers with TLB maintenance instructions: 
• Ensuring the visibility of updates to translation tables for a uniprocessor.
• Ensuring the visibility of updates to translation tables for a multiprocessor.
• Paging memory in and out on page K10-5627.


Ensuring the visibility of updates to translation tables for a uniprocessor 


On a single PE, the agent that causes translation table walks is a separate memory system observer from the agent 
that causes data accesses. Therefore, any operations to invalidate the TLB can only rely on seeing updates to 
memory that are complete. This must be ensured by the use of a DSB instruction. 


In the ARMv8 architecture, and in an ARMv7 implementation that includes the Multiprocessing Extensions,
translation table walks must look in the data or unified caches at L1, so such systems do not require data cache 
cleaning. 


After the translation tables update, any old copies of entries that might be held in the TLBs must be invalidated. This 
operation is only guaranteed to affect all instructions, including instruction fetches and data accesses, after the 
execution of a DSB and an ISB. Therefore, the code for updating a translation table entry is: 


P1
    STR R11, [R1]       ; updates the translation table entry
    DSB                 ; ensures visibility of the update to translation table walks
    TLBIMVA R10
    BPIALL
    DSB                 ; ensures completion of the BP and TLB invalidation
    ISB                 ; synchronizes context on this PE
    ; new translation table entry can be relied upon at this point and all accesses
    ; generated by this observer using the old mapping have been completed


Importantly, by the end of this sequence, all accesses that used the old translation table mappings have been 
observed by all observers. 


An example of this is where a translation table entry is marked as invalid. Such a system must provide a mechanism 
to ensure that any access to a region of memory being marked as invalid has completed before any action is taken 
as a result of the region being marked as invalid. 


Ensuring the visibility of updates to translation tables for a multiprocessor 


The same code sequence can be used in a multiprocessing system. In the ARMv8 architecture, and in an ARMv7
implementation that includes the Multiprocessing Extensions, a PE that executes a TLB maintenance instruction 
must execute a DSB instruction to ensure completion of the maintenance operation. This ensures that the TLB 
maintenance instruction is complete on all PEs in the Inner Shareable shareability domain. 


The completion of a DSB that completes a TLB maintenance instruction ensures that all accesses that used the old 
mapping have completed. 


P1
    STR R11, [R1]       ; updates the translation table entry
    DSB                 ; ensures visibility of the update to translation table walks
    TLBIMVAIS R10
    BPIALLIS
    DSB                 ; ensures completion of the BP and TLB invalidation
    ISB                 ; Note ISB is not broadcast and must be executed locally on other PEs
    ; new translation table entry can be relied upon at this point and all accesses generated by any
    ; observers affected by the broadcast TLBIMVAIS operation using the old mapping have completed


The completion of the TLB maintenance instruction is guaranteed only by the execution of a DSB by the observer 
that performed the TLB maintenance instruction. The execution of a DSB by a different observer does not have this 
effect, even if the DSB is known to be executed after the TLB maintenance instruction is observed by that different 
observer. 


Paging memory in and out 


In a multiprocessor system there is a requirement to ensure the visibility of translation table updates when paging 
regions of memory into RAM from a backing store. This might, or might not, also involve paging existing locations 
in memory from RAM to a backing store. In such situations, the operating system selects one or more pages of 
memory that might be in use but are suitable to discard, with or without copying to a backing store, depending on 
whether or not the region of memory is writable. Disabling the translation table mappings for a page, and ensuring 
the visibility of that update to the translation tables, prevents agents accessing the page. 


For this reason, it is important that the DSB that is performed after the TLB invalidation ensures that no other updates 
to memory using those mappings are possible. 


An example sequence for the paging out of an updated region of memory, and the subsequent paging in of memory, 
is as follows: 


P1 
STR R11, [R1] ; updates the translation table for the region being paged out 
DSB ; ensures visibility of the update to translation table walks 
TLBIMVAIS R10 ; invalidates the old entry 
DSB ; ensures completion of the invalidation on all processors 
ISB ; ensures visibility of the invalidation 


BL SaveMemoryPageToBackingStore 

BL LoadMemoryFromBackingStore 

DSB ; ensures completion of the memory transfer (this could be part of 
; LoadMemoryFromBackingStore) 


ICIALLUIS ; also invalidates the branch predictor 

STR R9, [R1] ; creates a new translation table entry with a new mapping 

DSB ; ensures completion of instruction cache and branch predictor invalidation 
; and ensures visibility of the new translation table mapping 

ISB ; ensures synchronization of this instruction stream 


This example assumes the memory copies are performed by an observer that is coherent with the caches of PE P1. 
This observer might be P1 itself, using a specific paging mapping. For clarity, the example omits the functional 
descriptions of SaveMemoryPageToBackingStore and LoadMemoryFromBackingStore. LoadMemoryFromBackingStore is 
required to ensure that the memory updates that it makes are visible to instruction fetches. 


In this example, the use of ICIALLUIS to invalidate the entire instruction cache is a simplification, that might not 
be optimal for performance. An alternative approach involves invalidating all of the lines in the caches using 
ICIMVAU operations. This invalidation must be done when the mapping used for the ICIMVAU operations is valid 
but not executable. 


Appendix K11 
ARM Pseudocode Definition 


This appendix provides a definition of the pseudocode that is used in this manual, and defines some helper 
procedures and functions that are used by pseudocode. It contains the following sections: 


About the ARM pseudocode on page K11-5630. 

Pseudocode for instruction descriptions on page K11-5631. 

Data types on page K11-5633. 

Operators on page K11-5638. 

Statements and control structures on page K11-5644. 

Built-in functions on page K11-5650. 

Miscellaneous helper procedures and functions on page K11-5653. 
ARM pseudocode definition index on page K11-5655. 


Note 





This appendix is not a formal language definition for the pseudocode. It is a guide to help understand the use of 
ARM pseudocode. This appendix is not complete. Changes are planned for future releases. 


K11.1 About the ARM pseudocode 


The ARM pseudocode provides precise descriptions of some areas of the ARM architecture. This includes 
description of the decoding and operation of all valid instructions. Pseudocode for instruction descriptions on 
page K11-5631 gives general information about this instruction pseudocode, including its limitations. 


The following sections describe the ARM pseudocode in detail: 
° Data types on page K11-5633. 
° Operators on page K11-5638. 
° Statements and control structures on page K11-5644. 


Built-in functions on page K11-5650 and Miscellaneous helper procedures and functions on page K11-5653 
describe some built-in functions and pseudocode helper functions that are used by the pseudocode functions that are 
described elsewhere in this manual. ARM pseudocode definition index on page K11-5655 contains the indexes to 
the pseudocode. 


K11.1.1 General limitations of ARM pseudocode 


The pseudocode statements IMPLEMENTATION_DEFINED, SEE, UNDEFINED, and UNPREDICTABLE indicate behavior that 
differs from that indicated by the pseudocode being executed. If one of them is encountered: 


° Earlier behavior indicated by the pseudocode is only specified as occurring to the extent required to 
determine that the statement is executed. 


° No subsequent behavior indicated by the pseudocode occurs. 


For more information, see Special statements on page K11-5648. 


K11.2 Pseudocode for instruction descriptions 


Each instruction description includes pseudocode that provides a precise description of what the instruction does, 
subject to the limitations described in General limitations of ARM pseudocode on page K11-5630 and Limitations 
of the instruction pseudocode on page K11-5632. 


In the instruction pseudocode, instruction fields are referred to by the names shown in the encoding diagram for the 
instruction. Instruction encoding diagrams and instruction pseudocode gives more information about the 
pseudocode provided for each instruction. 


K11.2.1 Instruction encoding diagrams and instruction pseudocode 


Instruction descriptions in this manual contain: 


° An Encoding section, containing one or more encoding diagrams, each followed by some encoding-specific 
pseudocode that translates the fields of the encoding into inputs for the common pseudocode of the 
instruction, and picks out any encoding-specific special cases. 


° An Operation section, containing common pseudocode that applies to all of the encodings being described. 
The Operation section pseudocode contains a call to the EncodingSpecificOperations() function, either at its 
start or only after a condition code check performed by if ConditionPassed() then. 


An encoding diagram specifies each bit of the instruction as one of the following: 


° An obligatory 0 or 1, represented in the diagram as 0 or 1. If this bit does not have this value, the encoding 
corresponds to a different instruction. 


° A should be 0 or 1, represented in the diagram as (0) or (1). If this bit does not have this value, the instruction 
is CONSTRAINED UNPREDICTABLE. For more information, see SBZ or SBO fields in T32 and A32 instructions 
on page K1-5460. 


° A named single bit or a bit in a named multi-bit field. The cond field in bits[31:28] of many A32/T32 
instructions has some special rules associated with it. 


An encoding diagram matches an instruction if all obligatory bits are identical in the encoding diagram and the 
instruction, and one of the following is true: 


° The encoding diagram is not for an A32/T32 instruction. 
° The encoding diagram is for an A32/T32 instruction that does not have a cond field in bits[31:28]. 


° The encoding diagram is for an A32/T32 instruction that has a cond field in bits[31:28], and bits[31:28] of 
the instruction are not 0b1111. 
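

The matching rule above can be sketched in Python. This is only an illustration under stated assumptions: the representation of an encoding diagram as a 32-character pattern string, the function name diagram_matches, and the example pattern are hypothetical, and are not part of the architecture or of the ARM pseudocode.

```python
def diagram_matches(pattern, instr, is_a32_t32, has_cond):
    """Return True if a 32-bit instruction matches an encoding diagram.

    pattern is written with bit 31 on the left. '0' and '1' are obligatory
    bits; any other character stands for a named or should-be bit.
    """
    assert len(pattern) == 32
    for pos, ch in enumerate(pattern):
        instr_bit = (instr >> (31 - pos)) & 1
        if ch in "01" and instr_bit != int(ch):
            return False  # an obligatory bit differs
    if is_a32_t32 and has_cond:
        # A diagram with a cond field does not match when bits[31:28] are 0b1111.
        return ((instr >> 28) & 0xF) != 0b1111
    return True


# Hypothetical diagram: cond in bits[31:28], obligatory 0000100 in bits[27:21].
EXAMPLE_PATTERN = "cccc" + "0000100" + "x" * 21
```

With this sketch, diagram_matches(EXAMPLE_PATTERN, 0xE0800000, True, True) is TRUE, while the same bit pattern with bits[31:28] set to 0b1111 does not match.
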


In the context of the instruction pseudocode, the execution model for an instruction is: 


1. Find all encoding diagrams that match the instruction. It is possible that no encoding diagram matches. In 
that case, abandon this execution model and consult the relevant instruction set chapter instead to find out 
how the instruction is to be treated. The bit pattern of such an instruction is usually reserved and UNDEFINED, 
though there are some other possibilities. For example, unallocated hint instructions are documented as being 
reserved and executed as NOPs. 


2. If the operation pseudocode for the matching encoding diagrams starts with a condition code check, perform 
that check. If the condition code check fails, abandon this execution model and treat the instruction as a NOP. 
If there are multiple matching encoding diagrams, either all or none of their corresponding pieces of common 
pseudocode start with a condition code check. 


3. Perform the encoding-specific pseudocode for each of the matching encoding diagrams independently and in 
parallel. Each such piece of encoding-specific pseudocode starts with a bitstring variable for each named bit 
or multi-bit field in its corresponding encoding diagram, named the same as the bit or multi-bit field and 
initialized with the values of the corresponding bit or bits from the bit pattern of the instruction. 


In a few cases, the encoding diagram contains more than one bit or field with the same name. In these cases, the 
values of the different instances of those bits or fields must be identical. The encoding-specific pseudocode 
contains a special case using the Consistent() function to specify what happens if they are not identical. 
Consistent() returns TRUE if all instruction bits or fields with the same name as its argument have the same 
value, and FALSE otherwise. 


4. If there are multiple matching encoding diagrams, all but one of the corresponding pieces of pseudocode must 
contain a special case that indicates that it does not apply. Discard the results of all such pieces of pseudocode 
and their corresponding encoding diagrams. 


There is now one remaining piece of pseudocode and its corresponding encoding diagram left to consider. 
This pseudocode might also contain a special case, most commonly one indicating that it is CONSTRAINED 
UNPREDICTABLE. If so, abandon this execution model and treat the instruction according to the special case. 


5. Check the should be bits of the encoding diagram against the corresponding bits of the bit pattern of the 
instruction. If any of them do not match, abandon this execution model and treat the instruction as 
CONSTRAINED UNPREDICTABLE, see SBZ or SBO fields in T32 and A32 instructions on page K1-5460. 


6. Perform the rest of the operation pseudocode for the instruction description that contains the encoding 
diagram. That pseudocode starts with all variables set to the values they were left with by the 
encoding-specific pseudocode. 


The ConditionPassed() call in the common pseudocode, if present, performs step 2, and the 
EncodingSpecificOperations() call performs steps 3 and 4. 


K11.2.2 Limitations of the instruction pseudocode 


The pseudocode descriptions of instruction functionality have a number of limitations. These are mainly due to the 
fact that, for clarity and brevity, the pseudocode is a sequential and mostly deterministic language. 


These limitations include: 


Pseudocode does not describe the ordering requirements when an instruction generates multiple memory 
accesses. For a description of the ordering requirements on memory accesses see Ordering requirements on 
page E2-2333. 


Pseudocode does not describe the exact rules when an UNDEFINED instruction fails its condition code check. 
In such cases, the UNDEFINED pseudocode statement lies inside the if ConditionPassed() then ... structure, 
either directly or in the EncodingSpecificOperations() function call, and so the pseudocode indicates that the 
instruction executes as a NOP. Conditional execution of undefined instructions on page G1-3851 describes 
the exact rules. 


Pseudocode does not describe the exact ordering requirements when a single floating-point instruction 
generates more than one floating-point exception and one or more of those floating-point exceptions is 
trapped. Combinations of floating-point exceptions on page E1-2305 describes the exact rules. 





Note 


There is no limitation in the case where all the floating-point exceptions are untrapped, because the 
pseudocode specifies the same behavior as the cross-referenced section. 





An exception can be taken during execution of the pseudocode for an instruction, either explicitly as a result 
of the execution of a pseudocode function such as Abort(), or implicitly, for example if an interrupt is taken 
during execution of an LDM instruction. If this happens, the pseudocode does not describe the extent to which 
the normal behavior of the instruction occurs. To determine that, see the descriptions of the exceptions in 
Handling exceptions that are taken to an Exception level using AArch32 on page G1-3812. 


K11.3 Data types 


This section describes: 

° General data type rules. 

° Bitstrings. 

° Integers on page K11-5634. 

° Reals on page K11-5634. 

° Booleans on page K11-5634. 

° Enumerations on page K11-5635. 

° Structures on page K11-5635. 

° Tuples on page K11-5636. 

° Arrays on page K11-5637. 


K11.3.1 General data type rules 


ARM architecture pseudocode is a strongly typed language. Every literal and variable is of one of the following 
types: 

° Bitstring. 

° Integer. 

° Boolean. 

° Real. 

° Enumeration. 

° Tuple. 

° Struct. 

° Array. 


The type of a literal is determined by its syntax. A variable can be assigned to without an explicit declaration. The 
variable implicitly has the type of the assigned value. For example, the following assignments implicitly declare the 
variables x, y and z to have types integer, bitstring of length 1, and Boolean, respectively. 


x = 1; 
y = '1'; 
z = TRUE; 


Variables can also have their types declared explicitly by preceding the variable name with the name of the type. 
The following example declares explicitly that a variable named count is an integer. 


integer count; 


This is most often done in function definitions for the arguments and the result of the function. 


The remaining subsections describe each data type in more detail. 


K11.3.2 Bitstrings 


This section describes the bitstring data type. 


Syntax 

bits(N) The type name of a bitstring of length N. 
bit A synonym of bits(1). 

Description 


A bitstring is a finite-length string of 0s and 1s. Each length of bitstring is a different type. The minimum permitted 
length of a bitstring is 0. 


Bitstring literals are written as a single quotation mark, followed by the string of 0s and 1s, followed by 
another single quotation mark. For example, the two literals of type bit are '0' and '1'. Spaces can be 
included in bitstrings for clarity. 


The bits in a bitstring are numbered from left to right N-1 to 0. This numbering is used when accessing the bitstring 
using bitslices. In conversions to and from integers, bit N-1 is the most significant bit and bit 0 is the least 
significant bit. This order matches the order in which bitstrings derived from encoding diagrams are printed. 


Every bitstring value has a left-to-right order, with the bits being numbered in standard little-endian order. That is, 
the leftmost bit of a bitstring of length N is bit (N-1) and its rightmost bit is bit 0. This order is used as the 
most-significant-to-least-significant bit order in conversions to and from integers. For bitstring literals and 
bitstrings that are derived from encoding diagrams, this order matches the way that they are printed. 


Bitstrings are the only concrete data type in pseudocode, corresponding directly to the values that are 
manipulated in registers, memory locations, and instructions. All other data types are abstract. 
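

The bit numbering and the unsigned conversion described above can be sketched in Python. This is a minimal illustration that models a bitstring as a Python string of '0' and '1' characters. The helper names bit and uint are assumptions for this example: uint is only loosely modelled on the idea of interpreting a bitstring as an integer, and is not the definition used by the manual.

```python
def bit(x, i):
    """Return bit i of bitstring x; the leftmost character is bit N-1."""
    digits = x.replace(" ", "")  # spaces are allowed in bitstrings for clarity
    n = len(digits)
    assert 0 <= i < n
    return digits[n - 1 - i]


def uint(x):
    """Interpret bitstring x as an unsigned integer; bit N-1 is most significant."""
    digits = x.replace(" ", "")
    return sum(1 << i for i in range(len(digits)) if bit(digits, i) == "1")
```

For the bitstring '1000', bit 3 is '1' and bit 0 is '0', and the unsigned value is 8.
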


K11.3.3 Integers 


This section describes the data type for integer numbers. 


Syntax 


integer The type name for the integer data type. 


Description 


Pseudocode integers are unbounded in size and can be either positive or negative. That is, they are mathematical 
integers rather than what computer languages and architectures commonly call integers. Computer integers are 
represented in pseudocode as bitstrings of the appropriate length, and the pseudocode provides functions to interpret 
those bitstrings as integers. 


Integer literals are normally written in decimal form, such as 0, 15, -1234. They can also be written in C-style 
hexadecimal form, such as 0x55 or 0x80000000. Hexadecimal integer literals are treated as positive unless they have 
a preceding minus sign. For example, 0x80000000 is the integer +2^31. If -2^31 needs to be written in hexadecimal, it 
must be written as -0x80000000. 
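

The difference between an unbounded integer and a 32-bit computer integer can be sketched in Python, whose integers are also unbounded. The helpers uint and sint below are assumptions for this example. They are loosely modelled on the unsigned and signed interpretations of a bitstring, and operate on Python strings of '0' and '1' characters.

```python
def uint(x):
    """Unsigned interpretation of a non-empty bitstring."""
    return int(x, 2)


def sint(x):
    """Two's complement interpretation of a non-empty bitstring."""
    value = uint(x)
    # Subtract 2^N when the most significant bit is set.
    return value - (1 << len(x)) if x[0] == "1" else value


# The 32-bit pattern with only bit 31 set.
word = "1" + "0" * 31
```

Here the literal 0x80000000 is the integer +2^31, uint(word) is also +2^31, and sint(word) is -2^31, which as a literal must be written -0x80000000.
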


K11.3.4 Reals 


This section describes the data type for real numbers. 


Syntax 


real The type name for the real data type. 


Description 


Pseudocode reals are unbounded in size and precision. That is, they are mathematical real numbers, not computer 
floating-point numbers. Computer floating-point numbers are represented in pseudocode as bitstrings of the 
appropriate length, and the pseudocode provides functions to interpret those bitstrings as reals. 


Real literals are written in decimal form with a decimal point. This means 0 is an integer literal, 
but 0.0 is a real literal. 


K11.3.5 Booleans 


This section describes the Boolean data type. 


Syntax 

boolean The type name for the Boolean data type. 
TRUE, FALSE The two values a Boolean variable can take. 


Description 


A Boolean is a logical TRUE or FALSE value. 


Note 


This is not the same type as bit, which is a bitstring of length 1. A Boolean can only take on one of two values: TRUE 
or FALSE. 








K11.3.6 Enumerations 


This section describes the enumeration data type. 


Syntax and examples 
enumeration Keyword to define a new enumeration type. 


enumeration Example {Example_One, Example_Two, Example_Three}; 


A definition of a new enumeration called Example, which can take on the values Example_One, 
Example_Two, Example_Three. 


Description 
An enumeration is a defined set of named values. 
An enumeration must contain at least one named value. A named value must not be shared between enumerations. 


Enumerations must be defined explicitly, although a variable of an enumeration type can be declared implicitly by 
assigning one of the named values to it. By convention, each named value starts with the name of the enumeration 
followed by an underscore. The name of the enumeration is its type name, or type, and the named values are its 
possible values. 


K11.3.7 Structures 


This section describes the structure data type. 


Syntax and examples 
type The keyword used to declare the structure data type. 


type ShiftSpec is (bits(2) shift, integer amount) 


An example definition for a new structure called ShiftSpec that contains a bitstring member called 
shift and an integer member named amount. Structure definitions must not be terminated with a 
semicolon. 


ShiftSpec abc; 


A declaration of a variable named abc of type ShiftSpec. 


abc.shift 


Syntax to refer to the individual members within the structure variable. 


Description 


A structure is a compound data type composed of one or more data items. The data items can be of different data 
types. This can include compound data types. The data items of a structure are called its members and are named. 


In the syntax section, the example defines a structure called ShiftSpec with two members. The first is a bitstring of 
length 2 named shift and the second is an integer named amount. After declaring a variable of that type named abc, 
the members of this structure are referred to as abc.shift and abc.amount. 


Every definition of a structure creates a different type, even if the number and type of their members are identical. 
For example: 


type ShiftSpec1 is (bits(2) shift, integer amount) 
type ShiftSpec2 is (bits(2) shift, integer amount) 


ShiftSpec1 and ShiftSpec2 are two different types despite having identical definitions. This means that the value in 
a variable of type ShiftSpec1 cannot be assigned to a variable of type ShiftSpec2. 


K11.3.8 Tuples 


This section describes the tuple data type. 


Examples 


(bits(32) shifter_result, bit shifter_carry_out) 


An example of the tuple syntax. 


(shift_t, shift_n) = ('00', 0); 


An example of assigning values to a tuple. 


Description 


A tuple is an ordered set of data items, separated by commas and enclosed in parentheses. The items can be of 
different types and a tuple must contain at least one data item. 


Tuples are often used as the return type for functions that return multiple results. For example, in the syntax section, 
the example tuple is the return type of the function Shift_C(), which performs a standard A32/T32 shift or rotation. 
Its return type is a tuple containing two data items, with the first of type bits(32) and the second of type bit. 


Each tuple is a separate compound data type. The compound data type is represented as a comma-separated list of 
ordered data types between parentheses. This means that the example tuple at the start of this section is of type 
(bits(32), bit). The general principle that types can be implied by an assignment extends to implying the type of 
the elements in the tuple. For example, in the syntax section, the example assignment implicitly declares: 


° shift_t to be of type bits(2). 

° shift_n to be of type integer. 

° (shift_t, shift_n) to be a tuple of type (bits(2), integer). 


K11.3.9 Arrays 

This section describes the array data type. 

Syntax 

array The type name for the array data type. 

array data_type array_name[A..B]; 

Declaration of an array of type data_type, which might be a compound data type. It is named 
array_name and is indexed with an integer range from A to B. 

Description 
An array is an ordered set of fixed size containing items of a single data type. This can include compound data types. 
Pseudocode arrays are indexed by either enumerations or integer ranges. An integer range is represented by the 
lower inclusive end of the range, then .., then the upper inclusive end of the range. 
For example, the following declares an array of 31 bitstrings of length 64, indexed from 0 to 30: 

array bits(64) _R[0..30]; 

Arrays are always explicitly declared, and there is no notation for a literal array. Arrays always contain at 
least one data item, because: 

° Enumerations always contain at least one named value. 
° Integer ranges always contain at least one integer. 

An array declared with an enumeration type as the index must be accessed using enumeration values of that 
enumeration type. An array declared with an integer range type as the index must be accessed using integer values 
from that inclusive range. Accessing such an array with an integer value outside of the range is a coding error. 
Arrays do not usually appear directly in pseudocode. The items that syntactically look like arrays in pseudocode are 
usually array-like functions such as R[i], MemU[address, size] or Elem[vector, i, size]. These functions package 
up and abstract additional operations normally performed on accesses to the underlying arrays, such as register 
banking, memory protection, endian-dependent byte ordering, exclusive-access housekeeping and Advanced SIMD 
element processing. See Function and procedure calls on page K11-5644. 


K11.4 Operators 


This section describes: 

° Relational operators. 

° Boolean operators. 

° Bitstring operators on page K11-5639. 

° Arithmetic operators on page K11-5639. 

° The assignment operator on page K11-5640. 
° Precedence rules on page K11-5642. 

° Conditional expressions on page K11-5642. 

° Operator polymorphism on page K11-5642. 


K11.4.1 Relational operators 


The following operations yield results of type boolean. 


Equality and non-equality 


If two variables x and y are of the same type, their values can be tested for equality by using the expression x == y 
and for non-equality by using the expression x != y. In both cases, the result is of type boolean. 


Both x and y must be of type bits(N), real, enumeration, boolean, or integer. Named values from an enumeration 
can only be compared if they are both from the same enumeration. An exception is that a bitstring can be tested for 
equality with an integer to allow a d==15 test. 


A special form of comparison is defined with a bitstring literal that can contain bit values '0', '1', and 'x'. Any bit 
with value 'x' is ignored in determining the result of the comparison. For example, if opcode is a 4-bit bitstring, the 
expression opcode == '1x0x' matches the values '1000', '1100', '1001', and '1101'. This is known as a bitmask. 





Note 


This special form is permitted in the implied equality comparisons in the when parts of case ... of ... structures. 
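

The bitmask comparison can be sketched in Python, treating bitstrings as strings of '0' and '1' characters. The function name matches is an assumption for this example and does not correspond to any pseudocode function.

```python
def matches(value, mask):
    """Compare a bitstring with a bitmask literal; 'x' positions are ignored."""
    assert len(value) == len(mask)
    return all(m == "x" or m == v for v, m in zip(value, mask))


# Every 4-bit value that matches the bitmask '1x0x'.
matched = [format(v, "04b") for v in range(16) if matches(format(v, "04b"), "1x0x")]
```

In this sketch, matched holds exactly the four values '1000', '1001', '1100', and '1101'.
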





Comparisons 


If x and y are integers or reals, then x < y, x <= y, x > y, and x >= y are less than, less than or equal, greater than, 
and greater than or equal comparisons between them, producing Boolean results. 


Set membership with IN 


<expression> IN {<set>} produces TRUE if <expression> is a member of <set>. Otherwise, it is FALSE. <set> must be 
a list of expressions separated by commas. 


K11.4.2 Boolean operators 


If x is a Boolean expression, then !x is its logical inverse. 


If x and y are Boolean expressions, then x && y is the result of ANDing them together. As in the C language, if x is 
FALSE, the result is determined to be FALSE without evaluating y. 


Note 


This is known as short circuit evaluation. 








If x and y are booleans, then x || y is the result of ORing them together. As in the C language, if x is TRUE, the result 
is determined to be TRUE without evaluating y. 


Note 


If x and y are booleans or Boolean expressions, then the result of x != y is the same as the result of exclusive-ORing 
x and y together. The operator EOR only accepts bitstring arguments. 





K11.4.3 Bitstring operators 


The following operations can be applied only to bitstrings. 


Logical operations on bitstrings 
If x is a bitstring, NOT(x) is the bitstring of the same length obtained by logically inverting every bit of x. 


If x and y are bitstrings of the same length, x AND y, x OR y, and x EOR y are the bitstrings of that same length obtained 
by logically ANDing, logically ORing, and exclusive-ORing corresponding bits of x and y together. 


Bitstring concatenation and slicing 


If x and y are bitstrings of lengths N and M respectively, then x:y is the bitstring of length N+M constructed by 
concatenating x and y in left-to-right order. 


The bitstring slicing operator addresses specific bits in a bitstring. This can be used to create a new bitstring from 
extracted bits or to set the value of specific bits. Its syntax is x<integer_list>, where x is the integer or bitstring 
being sliced, and <integer_list> is a comma-separated list of integers enclosed in angle brackets. The length of the 
resulting bitstring is equal to the number of integers in <integer_list>. In x<integer_list>, each of the integers in 
<integer_list> must be: 

° >= 0. 


° < Len(x) if x is a bitstring. 


The definition of x<integer_list> depends on whether integer_list contains more than one integer: 


° If integer_list contains more than one integer, x<i, j, k, ..., n> is defined to be the concatenation 
x<i> : x<j> : x<k> : ... : x<n>. 

° If integer_list consists of just one integer i, x<i> is defined to be: 


— If x is a bitstring, '0' if bit i of x is a zero and '1' if bit i of x is a one. 


— If x is an integer, let y be the unique integer in the range 0 to 2^(i+1)-1 that is congruent to x modulo 
2^(i+1). Then x<i> is '0' if y < 2^i and '1' if y >= 2^i. 
Loosely, this definition treats an integer as equivalent to a sufficiently long two’s complement 
representation of it as a bitstring. 


In <integer_list>, the range notation i:j with i >= j is shorthand for the integers in order from i down to j, with 
both end values included. For example, instr<31:28> represents instr<31, 30, 29, 28>. 
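

The definition of slicing an integer can be sketched in Python, following the modular rule above term by term. The function names int_bit and int_slice are assumptions for this example, and the results are returned as Python strings of '0' and '1' characters.

```python
def int_bit(x, i):
    """x<i> for an integer x, using the modular definition."""
    assert i >= 0
    y = x % (1 << (i + 1))  # the unique y in 0 .. 2^(i+1)-1 congruent to x
    return "1" if y >= (1 << i) else "0"


def int_slice(x, hi, lo):
    """x<hi:lo>: the concatenation x<hi> : x<hi-1> : ... : x<lo>."""
    assert hi >= lo >= 0
    return "".join(int_bit(x, i) for i in range(hi, lo - 1, -1))
```

For example, int_slice(5, 3, 0) gives '0101', and int_slice(-1, 3, 0) gives '1111', which is consistent with treating -1 as a sufficiently long two's complement bitstring.
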


x<integer_list> is assignable provided x is an assignable bitstring and no integer appears more than once in 
<integer_list>. In particular, x<i> is assignable if x is an assignable bitstring and 0 <= i < Len(x). 


Encoding diagrams for registers frequently show named bits or multi-bit fields. For example, the encoding diagram 
for the APSR shows its bit<31> as N. In such cases, the syntax APSR.N is used as a more readable synonym for 
APSR<31> as named bits can be referred to with the same syntax as referring to members of a struct. A 
comma-separated list of named bits enclosed in angle brackets following the register name allows multiple bits to 
be addressed simultaneously. For example, APSR.<N, C, Q> is synonymous with APSR<31, 29, 27>. 


K11.4.4 Arithmetic operators 


Most pseudocode arithmetic is performed on integer or real values, with operands obtained by conversions from 
bitstrings and results converted back to bitstrings. As these data types are the unbounded mathematical types, no 
issues arise about overflow or similar errors. 


Unary plus and minus 


If x is an integer or real, then +x is x unchanged, and -x is x with its sign reversed. Both are of the same type as x. 


Addition and subtraction 


If x and y are integers or reals, x+y and x-y are their sum and difference. Both are of type integer if x and y are both 
of type integer, and real otherwise. 


There are two cases where the types of x and y can be different. A bitstring and an integer can be added together to 
allow the operation PC + 4. An integer can be subtracted from a bitstring to allow the operation PC - 2. 


If x and y are bitstrings of the same length N, so that N = Len(x) = Len(y), then x+y and x-y are the least significant 
N bits of the results of converting x and y to integers and adding or subtracting them. Signed and unsigned 
conversions produce the same result: 


x+y = (SInt(x) + SInt(y))<N-1:0> 
    = (UInt(x) + UInt(y))<N-1:0> 
x-y = (SInt(x) - SInt(y))<N-1:0> 
    = (UInt(x) - UInt(y))<N-1:0> 


If x is a bitstring of length N and y is an integer, x+y and x-y are the bitstrings of length N defined by x+y = x + y<N-1:0> 
and x-y = x - y<N-1:0>. Similarly, if x is an integer and y is a bitstring of length M, x+y and x-y are the bitstrings of 
length M defined by x+y = x<M-1:0> + y and x-y = x<M-1:0> - y. 
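

The claim that signed and unsigned conversions give the same N-bit sum and difference can be checked with a short Python sketch. The helpers uint, sint, to_bits, add, sub, and add_int are assumptions for this example. They model bitstrings as Python strings and are not the pseudocode definitions.

```python
def uint(x):
    """Unsigned interpretation of a non-empty bitstring."""
    return int(x, 2)


def sint(x):
    """Two's complement interpretation of a non-empty bitstring."""
    return uint(x) - (1 << len(x)) if x[0] == "1" else uint(x)


def to_bits(value, n):
    """value<n-1:0> for an integer value, as a bitstring of length n."""
    assert n >= 1
    return format(value % (1 << n), "0{}b".format(n))


def add(x, y):
    """x+y for two bitstrings of the same length."""
    assert len(x) == len(y)
    return to_bits(uint(x) + uint(y), len(x))


def sub(x, y):
    """x-y for two bitstrings of the same length."""
    assert len(x) == len(y)
    return to_bits(uint(x) - uint(y), len(x))


def add_int(x, y):
    """x+y for a bitstring x and an integer y, as in PC + 4."""
    return add(x, to_bits(y, len(x)))
```

In this sketch, add('1111', '0001') wraps to '0000' and sub('0000', '0001') wraps to '1111'. Replacing uint with sint inside add and sub gives the same bitstrings.

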
Multiplication 


If x and y are integers or reals, then x * y is the product of x and y. It is of type integer if x and y are both of type 
integer, and real otherwise. 


Division and modulo 
If x and y are reals, then x/y is the result of dividing x by y, and is always of type real. 
If x and y are integers, then x DIV y and x MOD y are defined by: 


x DIV y = RoundDown(x/y) 
x MOD y = x - y * (x DIV y) 


It is a pseudocode error to use any of x/y, x MOD y, or x DIV y in any context where y can be zero. 
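

The DIV and MOD definitions can be sketched in Python using exact rational arithmetic. The function names div and mod are assumptions for this example. RoundDown is modelled here with math.floor, which rounds towards minus infinity.

```python
import math
from fractions import Fraction


def div(x, y):
    """x DIV y = RoundDown(x/y), with x/y computed exactly."""
    assert y != 0  # using DIV where y can be zero is a pseudocode error
    return math.floor(Fraction(x, y))


def mod(x, y):
    """x MOD y = x - y * (x DIV y)."""
    return x - y * div(x, y)
```

Because the quotient is rounded down rather than towards zero, div(-7, 2) is -4 and mod(-7, 2) is 1. This agrees with the Python // and % operators, but differs from integer division in C, which truncates towards zero.
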


Scaling 

If x and n are of type integer, then: 

° x << n = RoundDown(x * 2^n). 

° x >> n = RoundDown(x * 2^(-n)). 
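

The scaling definitions can be sketched in Python with exact rational arithmetic. The function names shl and shr are assumptions for this example, and RoundDown is modelled with math.floor.

```python
import math
from fractions import Fraction


def shl(x, n):
    """x << n = RoundDown(x * 2^n)."""
    return math.floor(x * Fraction(2) ** n)


def shr(x, n):
    """x >> n = RoundDown(x * 2^(-n))."""
    return math.floor(x * Fraction(2) ** (-n))
```

For example, shl(3, 2) is 12, shr(5, 1) is 2, and shr(-5, 1) is -3, because -2.5 rounds down to -3.
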


Raising to a power 


If x is an integer or a real and n is an integer then x^n is the result of raising x to the power of n, and: 
° If x is of type integer then x^n is of type integer. 
° If x is of type real then x^n is of type real. 


K11.4.5 The assignment operator 


The assignment operator is the = character, which assigns the value of the right-hand side to the left-hand side. An 
assignment statement takes the form: 


<assignable_expression> = <expression>; 


The following subsection defines valid expression syntax. 


General expression syntax 


An expression is one of the following: 


° A literal. 

° A variable, optionally preceded by a data type name to declare its type. 

° The word UNKNOWN preceded by a data type name to declare its type. 

° The result of applying a language-defined operator to other expressions. 

° The result of applying a function to other expressions. 


Variable names normally consist of alphanumeric and underscore characters, starting with an alphabetic or 
underscore character. 


Each register defined in an ARM architecture specification defines a correspondingly named pseudocode bitstring 
variable, and that variable has the stated behavior of the register. For example, if a bit of a register is defined as 
RAZ/WI, then the corresponding bit of its variable reads as '0' and ignores writes. 


An expression like bits(32) UNKNOWN indicates that the result of the expression is a value of the given type, but the 
architecture does not specify what value it is and software must not rely on such values. The value produced must 
not: 

• Return information that cannot be accessed at the current or a lower level of privilege using instructions that 
are not UNPREDICTABLE or CONSTRAINED UNPREDICTABLE and do not return UNKNOWN values. 

• Be promoted as providing any useful information to software. 


Note 


UNKNOWN values are similar to the definition of UNPREDICTABLE, but do not indicate that the entire architectural 
state becomes unspecified. 








Only the following expressions are assignable. This means that these are the only expressions that can be placed on 
the left-hand side of an assignment. 


• Variables. 

• The results of applying some operators to other expressions. 

The description of each language-defined operator that can generate an assignable expression specifies the 
circumstances under which it does so. For example, those circumstances might require that one or more of 
the expressions the operator operates on is an assignable expression. 

• The results of applying array-like functions to other expressions. The description of an array-like function 
specifies the circumstances under which it can generate an assignable expression. 


Note 


If the right-hand side in an assignment is a function returning a tuple, an item in the assignment destination can be 
written as - to indicate that the corresponding item of the assigned tuple value is discarded. For example: 





(shifted, -) = LSL_C(operand, amount); 
The expression on the right-hand side itself can be a tuple. For example: 


(x, y) = (function_1(), function_2()); 
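
Python has a close analogue of this discard notation, shown in the sketch below. The lsl_c function is a hypothetical stand-in that returns a (result, carry_out) pair of bitstrings held as Python strings. It is an assumption made for illustration, not the manual's definition of LSL_C.

```python
from typing import Tuple


def lsl_c(operand: str, amount: int) -> Tuple[str, str]:
    """Hypothetical stand-in: shift left and return (result, carry_out)."""
    assert amount > 0
    n = len(operand)
    extended = operand + "0" * amount  # operand followed by 'amount' zero bits
    result = extended[-n:]             # the low n bits
    carry_out = extended[-(n + 1)]     # the bit just above the result
    return result, carry_out


# Equivalent of: (shifted, -) = LSL_C(operand, amount);
shifted, _ = lsl_c("1011", 1)

# The right-hand side can itself be a tuple of expressions.
x, y = (len("1011"), shifted)
```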





Every expression has a data type. 
• For a literal, this data type is determined by the syntax of the literal. 

• For a variable, there are the following possible sources for the data type: 
— An optional preceding data type name. 

— A data type the variable was given earlier in the pseudocode by recursive application of this rule. 







— A data type the variable is being given by assignment, either by direct assignment to the variable, or 
by assignment to a list of which the variable is a member. 


It is a pseudocode error if none of these data type sources exists for a variable, or if more than one of them 
exists and they do not agree about the type. 


• For a language-defined operator, the definition of the operator determines the data type. 

• For a function, the definition of the function determines the data type. 


K11.4.6 Precedence rules 
The precedence rules for expressions are: 


1. Literals, variables and function invocations are evaluated with higher priority than any operators using their 
results, but see Boolean operators on page K11-5638. 


2. Operators on integers follow the normal operator precedence rules of exponentiation before multiply/divide 
before add/subtract, with sequences of multiply/divides or add/subtracts evaluated left-to-right. 


3. Other expressions must be parenthesized to indicate operator precedence if ambiguity is possible, but need 
not be if all permitted precedence orders under the type rules necessarily lead to the same result. For example, 
if i, j and k are integer variables, i > 0 && j > 0 && k > 0 is acceptable, but i > 0 && j > 0 || k > 0 is not. 


K11.4.7 Conditional expressions 


If x and y are two values of the same type and t is a value of type boolean, then if t then x else y is an expression 
of the same type as x and y that produces x if t is TRUE and y if t is FALSE. 


K11.4.8 Operator polymorphism 


Operators in pseudocode can be polymorphic, with different functionality when applied to different data types. Each 
resulting form of an operator has a different prototype definition. For example, the operator + has forms that act on 
various combinations of integers, reals and bitstrings. 


Table K11-1 summarizes the operand types valid for each unary operator and the result type. Table K11-2 
summarizes the operand types valid for each binary operator and the result type. 


Table K11-1 Result and operand types permitted for unary operators 

Operator    Operand Type    Result Type 
-           integer         integer 
            real            real 
NOT         bits(N)         bits(N) 
!           boolean         boolean 





Table K11-2 Result and operand types permitted for binary operators 

Operator        First operand type    Second operand type    Result type 
==              bits(N)               bits(N)                boolean 
                integer               integer 
                real                  real 
                enumeration           enumeration 
                boolean               boolean 
!=              bits(N)               bits(N)                boolean 
                integer               integer 
                real                  real 
<, >, <=, >=    integer               integer                boolean 
                real                  real 
+, -            integer               integer                integer 
                real                  real                   real 
                bits(N)               bits(N)                bits(N) 
                bits(N)               integer                bits(N) 
<<, >>          integer               integer                integer 
*               integer               integer                integer 
                real                  real                   real 
                bits(N)               bits(N)                bits(N) 
/               real                  real                   real 
DIV             integer               integer                integer 
MOD             integer               integer                integer 
                bits(N)               integer 
&&, ||          boolean               boolean                boolean 
AND, OR, EOR    bits(N)               bits(N)                bits(N) 
^               integer               integer                integer 
                real                  integer                real 
K11.5 Statements and control structures 
This section describes the statements and program structures available in the pseudocode: 
• Statements and Indentation. 
• Function and procedure calls. 
• Conditional control structures on page K11-5646. 
• Loop control structures on page K11-5647. 
• Special statements on page K11-5648. 
• Comments on page K11-5649. 

K11.5.1 Statements and Indentation 
A simple statement is either an assignment, a function call, or a procedure call. Each statement must be terminated 
with a semicolon. 
Indentation normally indicates the structure in compound statements. The statements contained in structures such 
as if .. then .. else .. or procedure and function definitions are indented more deeply than the statement structure 
itself. The end of a compound statement structure is indicated by returning to the original indentation level or less. 
Standard indentation uses four spaces for each level of indent. 

K11.5.2 Function and procedure calls 
This section describes how functions and procedures are defined and called in the pseudocode. 
Procedure and function definitions 
A procedure definition has the form: 


<procedure name>(<argument prototypes>) 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 


where <argument prototypes> consists of zero or more argument definitions, separated by commas. Each argument 
definition consists of a type name followed by the name of the argument. 


Note 


This first definition line is not terminated by a semicolon. This distinguishes it from a procedure call. 








A function definition is similar, but also declares the return type of the function: 


<return type> <function name>(<argument prototypes>) 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 


Note 


A function or procedure name can include a ".". This is a convention used for functions that have similar but 
different behaviors in AArch32 and AArch64 states. 








Array-like functions are similar, but are written with square brackets and have two forms. These two forms exist 
because reading from and writing to an array element require different functions. They are frequently used in 
memory operations. An array-like function definition with a return type is equivalent to reading from an array. For 
example: 


<return type> <function name>[<argument prototypes>] 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 

Its related function definition with no return type is equivalent to writing to an array. For example: 

<function name>[<argument prototypes>] = <value prototype> 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 


The value prototype determines what data type can be written to the array. The two related functions must share the 
same name, but the value prototype and return type can be different. 
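
The paired read and write forms of an array-like function resemble Python's __getitem__ and __setitem__ methods. The sketch below is a loose analogy only. The ByteMemory class and its byte-wide storage are assumptions made for illustration, not anything the manual defines.

```python
class ByteMemory:
    """Hypothetical byte store exposing array-like read and write forms."""

    def __init__(self) -> None:
        self._bytes = {}

    def __getitem__(self, address: int) -> int:
        # Read form, like: <return type> <function name>[<argument prototypes>]
        return self._bytes.get(address, 0)

    def __setitem__(self, address: int, value: int) -> None:
        # Write form, like: <function name>[<argument prototypes>] = <value prototype>
        self._bytes[address] = value & 0xFF


mem = ByteMemory()
mem[0x10] = 0x1AB       # calls the write form
low_byte = mem[0x10]    # calls the read form
```

Both forms share one name, but each runs its own statements, which is why the read and write definitions are written separately.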


Procedure calls 


A procedure call has the form: 


<procedure_name>(<arguments>); 


Return statements 
A procedure return has the form: 
return; 


A function return has the form: 
return <expression>; 


where <expression> is of the type declared in the function prototype line. 


K11.5.3 Conditional control structures 


This section describes how conditional control structures are used in the pseudocode. 


if... then... else... 


In addition to being a ternary operator, a multi-line if .. then .. else .. structure can act as a control structure and 
has the form: 


if <boolean_expression> then 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 
elsif <boolean_expression> then 
    <statement a>; 
    <statement b>; 
    ... 
    <statement z>; 
else 
    <statement A>; 
    <statement B>; 
    ... 
    <statement Z>; 

The block of lines consisting of elsif and its indented statements is optional, and multiple elsif blocks can be used. 
The block of lines consisting of else and its indented statements is optional. 


Abbreviated one-line forms can be used when the then part, and the else part if it is present, contain only simple 
statements, such as: 


if <boolean_expression> then <statement 1>; 
if <boolean_expression> then <statement 1>; else <statement A>; 
if <boolean_expression> then <statement 1>; <statement 2>; else <statement A>; 


Note 


In these forms, <statement 1>, <statement 2> and <statement A> must be terminated by semicolons. This and the 
fact that the else part is optional distinguish its use as a control structure from its use as a ternary operator. 








case ... of ... 
A case ... of ... structure has the form: 
case <expression> of 
    when <literal values1> 
        <statement 1>; 
        <statement 2>; 
        ... 
        <statement n>; 
    when <literal values2> 
        <statement 1>; 
        <statement 2>; 
        ... 
        <statement n>; 
    .. more "when" groups if required .. 
    otherwise 
        <statement A>; 
        <statement B>; 
        ... 
        <statement Z>; 


In this structure, <literal values1> and <literal values2> consist of literal values of the same type as <expression>, 
separated by commas. There can be additional when groups in the structure. Abbreviated one line forms of when and 
otherwise parts can be used when they contain only simple statements. 


If <expression> has a bitstring type, the literal values can also include bitstring literals containing 'x' bits, known 
as bitmasks. For details see Equality and non-equality on page K11-5638. 
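
Matching a bitstring against when literals that contain 'x' bits can be sketched in Python as below. The bitstrings are held as Python strings, and the names matches and classify and the example patterns are hypothetical.

```python
def matches(value: str, pattern: str) -> bool:
    """Return True if value matches pattern, where 'x' matches either bit."""
    if len(value) != len(pattern):
        raise ValueError("value and pattern must have the same length")
    return all(p == "x" or p == v for v, p in zip(value, pattern))


def classify(op: str) -> str:
    """Emulates: case op of / when '00x' / when '010', '011' / otherwise."""
    if matches(op, "00x"):
        return "group A"
    elif any(matches(op, literal) for literal in ("010", "011")):
        return "group B"
    else:
        return "other"
```

A when group with several comma-separated literals is taken if any one of them matches, which the any() call models.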
K11.5.4 Loop control structures 


This section describes the three loop control structures used in the pseudocode. 


repeat ... until ... 
A repeat .. until ... structure has the form: 


repeat 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 
until <boolean_expression>; 


It executes the statement block at least once, and the loop repeats until <boolean_expression> evaluates to TRUE. 
Variables explicitly declared inside the loop body have scope local to that loop and cannot be accessed outside 
the loop body. 


while ... do 
A while ... do structure has the form: 


while <boolean_expression> do 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 


It begins executing the statement block only if the Boolean expression is true. The loop then runs until the 
expression is false. 
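
The practical difference between repeat .. until and while .. do is when the condition is first tested. The Python sketch below emulates both. The function names and the counting body are hypothetical and chosen only to make the difference observable.

```python
def count_repeat_until(start: int, limit: int) -> int:
    """Emulates repeat .. until n >= limit; the body runs before the first test."""
    n = start
    iterations = 0
    while True:
        n += 1
        iterations += 1
        if n >= limit:
            break
    return iterations


def count_while(start: int, limit: int) -> int:
    """Emulates while n < limit do; the test happens before the body runs."""
    n = start
    iterations = 0
    while n < limit:
        n += 1
        iterations += 1
    return iterations
```

When the condition is already satisfied on entry, the repeat form still executes its body once and the while form does not execute it at all.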


for... 
A for ... structure has the form: 


for <assignable_expression> = <integer_expr1> to <integer_expr2> 
    <statement 1>; 
    <statement 2>; 
    ... 
    <statement n>; 


The <assignable_expression> is initialized to <integer_expr1> and compared to <integer_expr2>. If <integer_expr1> 
is less than <integer_expr2>, the loop body is executed and the <assignable_expression> incremented by one. This 
repeats until <assignable_expression> is more than or equal to <integer_expr2>. 


There is an alternate form: 
for <assignable_expression> = <integer_expr1> downto <integer_expr2> 


where <assignable_expression> is decremented after the loop body executes, and the loop continues until 
<assignable_expression> is less than or equal to <integer_expr2>. 


K11.5.5 Special statements 


This section describes statements with particular architecturally-defined behaviors. 


UNDEFINED 
This subsection describes the statement: 
UNDEFINED; 


This statement indicates a special case that replaces the behavior defined by the current pseudocode, apart from 
behavior required to determine that the special case applies. The replacement behavior is that the Undefined 
Instruction exception is taken. 
UNPREDICTABLE 
This subsection describes the statement: 
UNPREDICTABLE; 


This statement indicates a special case that replaces the behavior defined by the current pseudocode, apart from 
behavior required to determine that the special case applies. The replacement behavior is UNPREDICTABLE. 


SEE... 
This subsection describes the statement: 
SEE <reference>; 


This statement indicates a special case that replaces the behavior defined by the current pseudocode, apart from 
behavior required to determine that the special case applies. The replacement behavior is that nothing occurs as a 
result of the current pseudocode because some other piece of pseudocode defines the required behavior. The 
<reference> indicates where that other pseudocode can be found. 


It usually refers to another instruction, but can also refer to another encoding or note of the same instruction. 


IMPLEMENTATION_DEFINED 
This subsection describes the statement: 
IMPLEMENTATION_DEFINED {"<text>"}; 


This statement indicates a special case that replaces the behavior defined by the current pseudocode, apart from 
behavior required to determine that the special case applies. The replacement behavior is IMPLEMENTATION 
DEFINED. An optional <text> field can give more information. 


K11.5.6 Comments 


The pseudocode supports two styles of comments: 
• // starts a comment that is terminated by the end of the line. 
• /* starts a comment that is terminated by */. 

/* */ comments cannot be nested, and the first */ ends the comment. 





Note 


Comment lines do not require a terminating semicolon. 
K11.6 Built-in functions 


This section describes: 
• Bitstring manipulation functions. 
• Arithmetic functions on page K11-5651. 


K11.6.1 Bitstring manipulation functions 


The following bitstring manipulation functions are defined: 


Bitstring length and most significant bit 


If x is a bitstring: 


• The bitstring length function Len(x) returns the length of x as an integer. 
• TopBit(x) is the leftmost bit of x, that is, TopBit(x) = x<Len(x)-1>. 


Bitstring concatenation and replication 


If x is a bitstring and n is an integer with n >= 0: 


• Replicate(x, n) is the bitstring of length n*Len(x) consisting of n copies of x concatenated together. 
• Zeros(n) = Replicate('0', n). 
• Ones(n) = Replicate('1', n). 
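
A minimal Python sketch of these three functions follows, assuming that bitstrings are held as Python strings of '0' and '1' characters. The lower-case names are hypothetical.

```python
def replicate(x: str, n: int) -> str:
    """n copies of x concatenated together; requires n >= 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return x * n


def zeros(n: int) -> str:
    return replicate("0", n)


def ones(n: int) -> str:
    return replicate("1", n)
```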


Bitstring count 


If x is a bitstring, BitCount(x) is an integer result equal to the number of bits of x that are ones. 
Testing a bitstring for being all zero or all ones 


If x is a bitstring: 
• IsZero(x) produces TRUE if all of the bits of x are zeros and FALSE if any of them are ones. 

• IsZeroBit(x) produces '1' if all of the bits of x are zeros and '0' if any of them are ones. 


IsOnes(x) and IsOnesBit(x) work in the corresponding ways. This means: 


IsZero(x) = (BitCount(x) == 0) 
IsOnes(x) = (BitCount(x) == Len(x)) 
IsZeroBit(x) = if IsZero(x) then '1' else '0' 
IsOnesBit(x) = if IsOnes(x) then '1' else '0' 
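
The four identities above translate directly into Python. This sketch assumes bitstrings held as Python strings, and the lower-case function names are hypothetical.

```python
def bit_count(x: str) -> int:
    """Number of bits of x that are ones."""
    return x.count("1")


def is_zero(x: str) -> bool:
    return bit_count(x) == 0


def is_ones(x: str) -> bool:
    return bit_count(x) == len(x)


def is_zero_bit(x: str) -> str:
    return "1" if is_zero(x) else "0"


def is_ones_bit(x: str) -> str:
    return "1" if is_ones(x) else "0"
```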


Lowest and highest set bits of a bitstring 
If x is a bitstring, and N = Len(x): 


• LowestSetBit(x) is the minimum bit number of any of the bits of x that are ones. If all of its bits are zeros, 
LowestSetBit(x) = N. 


• HighestSetBit(x) is the maximum bit number of any of the bits of x that are ones. If all of its bits are zeros, 
HighestSetBit(x) = -1. 


• CountLeadingZeroBits(x) is the number of zero bits at the left end of x, in the range 0 to N. This means: 


CountLeadingZeroBits(x) = N - 1 - HighestSetBit(x). 


• CountLeadingSignBits(x) is the number of copies of the sign bit of x at the left end of x, excluding the sign 
bit itself, and is in the range 0 to N-1. This means: 


CountLeadingSignBits(x) = CountLeadingZeroBits(x<N-1:1> EOR x<N-2:0>). 
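
The definitions above can be checked with a small Python sketch. It assumes bitstrings held as Python strings whose leftmost character is bit N-1. All function names are hypothetical.

```python
def highest_set_bit(x: str) -> int:
    """Maximum bit number of a one bit, or -1 if x is all zeros."""
    idx = x.find("1")  # string index of the leftmost one
    return -1 if idx < 0 else len(x) - 1 - idx


def lowest_set_bit(x: str) -> int:
    """Minimum bit number of a one bit, or N if x is all zeros."""
    idx = x.rfind("1")  # string index of the rightmost one
    return len(x) if idx < 0 else len(x) - 1 - idx


def count_leading_zero_bits(x: str) -> int:
    return len(x) - 1 - highest_set_bit(x)


def eor(a: str, b: str) -> str:
    """Bitwise exclusive OR of two equal-length bitstrings."""
    return "".join("1" if p != q else "0" for p, q in zip(a, b))


def count_leading_sign_bits(x: str) -> int:
    # x<N-1:1> drops bit 0, and x<N-2:0> drops bit N-1.
    return count_leading_zero_bits(eor(x[:-1], x[1:]))
```

The EOR of the two slices has a one wherever adjacent bits differ, so its leading zeros count the copies of the sign bit that follow the sign bit itself.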


Zero-extension and sign-extension of bitstrings 


If x is a bitstring and i is an integer, then ZeroExtend(x, i) is x extended to a length of i bits, by adding sufficient 
zero bits to its left. That is, if i == Len(x), then ZeroExtend(x, i) = x, and if i > Len(x), then: 


ZeroExtend(x, i) = Replicate('0', i-Len(x)) : x 


If x is a bitstring and i is an integer, then SignExtend(x, i) is x extended to a length of i bits, by adding sufficient 
copies of its leftmost bit to its left. That is, if i == Len(x), then SignExtend(x, i) = x, and if i > Len(x), then: 


SignExtend(x, i) = Replicate(TopBit(x), i-Len(x)) : x 

It is a pseudocode error to use either ZeroExtend(x, i) or SignExtend(x, i) in a context where it is possible that 
i < Len(x). 
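
A minimal Python sketch of both extensions, assuming non-empty bitstrings held as Python strings with the leftmost character as the top bit. The function names are hypothetical, and raising ValueError for i < Len(x) is this sketch's choice for what the text calls a pseudocode error.

```python
def zero_extend(x: str, i: int) -> str:
    """Extend x to i bits by adding zero bits on the left."""
    if i < len(x):
        raise ValueError("i must not be less than Len(x)")
    return "0" * (i - len(x)) + x


def sign_extend(x: str, i: int) -> str:
    """Extend x to i bits by adding copies of its leftmost bit on the left."""
    if i < len(x):
        raise ValueError("i must not be less than Len(x)")
    return x[0] * (i - len(x)) + x
```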

Converting bitstrings to integers 

If x is a bitstring, SInt(x) is the integer whose two's complement representation is x. 

UInt(x) is the integer whose unsigned representation is x. 


Int(x, unsigned) returns either SInt(x) or UInt(x) depending on the value of its second argument. 
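
These conversions can be sketched in Python as follows, assuming non-empty bitstrings held as Python strings. The names uint, sint and to_int are hypothetical.

```python
def uint(x: str) -> int:
    """Integer whose unsigned representation is x."""
    return int(x, 2)


def sint(x: str) -> int:
    """Integer whose two's complement representation is x."""
    value = int(x, 2)
    # A set top bit contributes -2^(N-1) rather than +2^(N-1).
    return value - (1 << len(x)) if x[0] == "1" else value


def to_int(x: str, unsigned: bool) -> int:
    """Selects uint or sint according to the second argument."""
    return uint(x) if unsigned else sint(x)
```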


K11.6.2 Arithmetic functions 


This section defines built-in arithmetic functions. 


Absolute value 


If x is either of type real or integer, Abs(x) returns the absolute value of x. The result is the same type as x. 
Rounding and aligning 


If x is a real: 


• RoundDown(x) produces the largest integer n such that n <= x. 
• RoundUp(x) produces the smallest integer n such that n >= x. 
• RoundTowardsZero(x) produces: 
— RoundDown(x) if x > 0.0. 
— 0 if x == 0.0. 
— RoundUp(x) if x < 0.0. 

If x and y are both of type integer, Align(x, y) = y * (x DIV y), and is of type integer. 


If x is of type bitstring and y is of type integer, Align(x, y) = (Align(UInt(x), y))<Len(x)-1:0>, and is a bitstring 
of the same length as x. 


It is a pseudocode error to use either form of Align(x, y) in any context where y can be 0. In practice, Align(x, y) 
is only used with y a constant power of two, and the bitstring form used with y = 2^n has the effect of producing its 
argument with its n low-order bits forced to zero. 
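
The rounding and alignment rules can be sketched in Python as below. Python floats stand in for the real type, bitstrings are held as strings, and all function names are hypothetical.

```python
import math


def round_down(x: float) -> int:
    return math.floor(x)


def round_up(x: float) -> int:
    return math.ceil(x)


def round_towards_zero(x: float) -> int:
    # Matches RoundDown for x > 0, 0 for x == 0, and RoundUp for x < 0.
    return math.trunc(x)


def align(x: int, y: int) -> int:
    """Integer form: y * (x DIV y), where DIV rounds down."""
    if y == 0:
        raise ZeroDivisionError("y must not be zero")
    return y * (x // y)


def align_bits(x: str, y: int) -> str:
    """Bitstring form: align UInt(x), then keep the low Len(x) bits."""
    n = len(x)
    aligned = align(int(x, 2), y)
    return format(aligned % (1 << n), "0{}b".format(n))
```

With y = 8, which is 2^3, align_bits forces the three low-order bits of its argument to zero, matching the observation in the text.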


Maximum and minimum 


If x and y are integers or reals, then Max(x, y) and Min(x, y) are their maximum and minimum respectively. x and 
y must both be of type integer or of type real. The function returns a value of the same type as its operands. 
K11.7 Miscellaneous helper procedures and functions 


This section lists the prototypes of miscellaneous helper procedures and functions used by the pseudocode, together 
with a brief description of the effect of the procedure or function. The pseudocode does not define the operation of 
these helper procedures and functions. 


Note 


Chapter J1 ARMv8 Pseudocode also has an entry for each of these functions, but currently these entries do not say 
anything about the effect of the function. When this information is added in Chapter J1 this section will be removed 
from the manual. 








K11.7.1 EndOfInstruction() 


This procedure terminates processing of the current instruction. 


EndOfInstruction(); 


K11.7.2 Hint_Debug() 


This procedure supplies a hint to the debug system. 


Hint_Debug(bits(4) option); 


K11.7.3 Hint_PreloadData() 


This procedure performs a preload data hint. 


Hint_PreloadData(bits(32) address); 


K11.7.4 Hint_PreloadDataForWrite() 


This procedure performs a preload data hint with a probability that the use will be for a write. 


Hint_PreloadDataForWrite(bits(32) address); 


K11.7.5 Hint_PreloadInstr() 


This procedure performs a preload instructions hint. 


Hint_PreloadInstr(bits(32) address); 


K11.7.6 Hint_Yield() 


This procedure performs a Yield hint. 


Hint_Yield(); 


K11.7.7 IsExternalAbort() 


This function returns TRUE if the abort currently being processed is an external abort and FALSE otherwise. It is used 
only in exception entry pseudocode. 


boolean IsExternalAbort(Fault type) 
assert type != Fault_None; 


boolean IsExternalAbort(FaultRecord fault); 
K11.7.8 IsAsyncAbort() 


This function returns TRUE if the abort currently being processed is an asynchronous abort, and FALSE otherwise. It 
is used only in exception entry pseudocode. 


boolean IsAsyncAbort(Fault type) 
assert type != Fault_None; 


boolean IsAsyncAbort(FaultRecord fault); 


K11.7.9 LSInstructionSyndrome() 


This function returns the extended syndrome information for a fault reported in the HSR. 


bits(11) LSInstructionSyndrome(); 


K11.7.10 ProcessorID() 


This function returns an integer that uniquely identifies the executing PE in the system. 


integer ProcessorID(); 


K11.7.11 RemapRegsHaveResetValues() 


This function returns TRUE if the remap registers PRRR and NMRR have their IMPLEMENTATION DEFINED reset 
values, and FALSE otherwise. 


boolean RemapRegsHaveResetValues(); 


K11.7.12 ResetControlRegisters() 


This function resets the System registers and memory-mapped control registers that have architecturally-defined 
reset values to those values. For more information about the affected registers see: 


• PE state on reset to AArch64 state on page D1-1518. 
• PE state on reset into AArch32 state on page G1-3869. 


AArch64.ResetControlRegisters(boolean ResetIsCold) 
AArch32.ResetControlRegisters(boolean ResetIsCold) 


K11.7.13 ThisInstr() 
This function returns the bitstring encoding of the currently-executing instruction. 


bits(32) ThisInstr(); 


Note 


Currently, this function is used only on 32-bit instruction encodings. 








K11.7.14 ThisInstrLength() 
This function returns the length, in bits, of the current instruction. This means it returns 32 or 16: 


integer ThisInstrLength(); 
• Table K11-3 which contains the pseudocode data types. 
• Table K11-4 which contains the pseudocode operators. 
• Table K11-5 on page K11-5656 which contains the pseudocode keywords and control structures. 
• Table K11-6 on page K11-5657 which contains the statements with special behaviors. 


Table K11-3 Index of pseudocode data types 

Keyword        Meaning 
array          Type name for the array type 
bit            Keyword equivalent to bits(1) 
bits(N)        Type name for the bitstring of length N data type 
boolean        Type name for the Boolean data type 
enumeration    Keyword to define a new enumeration type 
integer        Type name for the integer data type 
real           Type name for the real data type 
type           Keyword to define a new structure 

Table K11-4 Index of pseudocode operators 

Operator    Meaning 
-           Unary minus on integers or reals 
            Subtraction of integers, reals and bitstrings 
            Used in the left-hand side of an assignment or a tuple to discard the result 
+           Unary plus on integers or reals 
            Addition of integers, reals and bitstrings 
.           Extract named member from a list 
            Extract named bit or field from a register 
:           Bitstring concatenation 
            Integer range in bitstring extraction operator 
!           Boolean NOT 
!=          Comparison for inequality 
(...)       Around arguments of procedure or function 
[...]       Around array index 
            Around arguments of array-like function 
*           Multiplication of integers, reals, and bitstrings 
/           Division of reals 
&&          Boolean AND 
<           Less than comparison of integers and reals 
<...>       Slicing of specified bits of bitstring or integer 
<<          Multiply integer by power of 2 
<=          Less than or equal comparison of integers and reals 
=           Assignment operator 
==          Comparison for equality 
>           Greater than comparison of integers and reals 
>=          Greater than or equal comparison of integers and reals 
>>          Divide integer by power of 2 
||          Boolean OR 
^           Exponential operator 
AND         Bitwise AND of bitstrings 
DIV         Quotient from integer division 
EOR         Bitwise EOR of bitstrings 
IN          Tests membership of a certain expression in a set of values 
MOD         Remainder from integer division 
NOT         Bitwise inversion of bitstrings 
OR          Bitwise OR of bitstrings 

Table K11-5 Index of pseudocode keywords and control structures 

Operator                  Meaning 
/* */                     Comment delimiters 
//                        Introduces comment terminated by end of line 
case .. of ..             Control structure for the 
FALSE                     One of two values a Boolean can take (other than TRUE) 
for .. = .. to ..         Loop control structure, counting up from the initial value to the upper limit 
for .. = .. downto ..     Loop control structure, counting down from the initial value to the lower limit 
if .. then .. else ..     Condition expression selecting between two values 
if .. then .. else ..     Conditional control structure 
otherwise                 Introduces default case in case .. of .. control structure 
repeat .. until ..        Loop control structure that runs at least once until the termination condition is satisfied 
return                    Procedure or function return 
TRUE                      One of two values a Boolean can take (other than FALSE) 
when                      Introduces specific case in case .. of .. control structure 
while .. do ..            Loop control structure that runs until the termination condition is satisfied 

Table K11-6 Index of special statements 

Keyword                   Meaning 
IMPLEMENTATION_DEFINED    Describes IMPLEMENTATION DEFINED behavior 
SEE                       Points to other pseudocode to use instead 
UNDEFINED                 Cause Undefined Instruction exception 
UNKNOWN                   Unspecified value 
UNPREDICTABLE             Unspecified behavior 
Appendix K12 
Registers Index 


This appendix provides indexes to the register descriptions in this manual. It contains the following sections: 


Introduction and register disambiguation on page K12-5660. 

Alphabetical index of AArch64 registers and system instructions on page K12-5665. 
Functional index of AArch64 registers and system instructions on page K12-5674. 
Alphabetical index of AArch32 registers and system instructions on page K12-5684. 
Functional index of AArch32 registers and system instructions on page K12-5692. 
Alphabetical index of memory-mapped registers on page K12-5702. 

Functional index of memory-mapped registers on page K12-5707. 
K12.1 Introduction and register disambiguation 


In some sections of this manual, registers are referred to by a general name, where the description applies to more 
than one context. Generally, this is one of the following: 

• The description applies to both AArch32 state and AArch64 state, and therefore the register names could 
apply to either AArch32 System registers or AArch64 System registers. 

• The description applies to multiple Exception levels, and therefore at a particular Exception level the register 
names need to take the appropriate Exception level suffix, _EL0, _EL1, _EL2, or _EL3. 

The following sections disambiguate the general register names: 

• Register name disambiguation by Execution state. 
• Register name disambiguation by Exception level on page K12-5663. 


K12.1.1 Register name disambiguation by Execution state 


Table K12-1 disambiguates the general names of the registers by Execution state. 


Table K12-1 Disambiguation of general names of registers by Execution state 

General name     Short description                                AArch64 register     AArch32 register 
CONTEXTIDR       Context ID                                       CONTEXTIDR_EL1       CONTEXTIDR 
DBGBCR           Debug Breakpoint Control Registers               DBGBCR<n>_EL1        DBGBCR<n> 
DBGBVR           Debug Breakpoint Value Registers                 DBGBVR<n>_EL1        DBGBVR<n> 
                                                                                       DBGBXVR<n> 
DBGCLAIMCLR      Debug CLAIM Tag Clear register                   DBGCLAIMCLR_EL1      DBGCLAIMCLR 
DBGCLAIMSET      Debug CLAIM Tag Set register                     DBGCLAIMSET_EL1      DBGCLAIMSET 
DBGDTRRX         Debug Data Transfer Register, Receive            DBGDTRRX_EL0         DBGDTRRXint 
DBGDTRTX         Debug Data Transfer Register, Transmit           DBGDTRTX_EL0         DBGDTRTXint 
DBGPRCR          Debug Power Control Register                     DBGPRCR_EL1          DBGPRCR 
DBGVCR           Debug Vector Catch Register                      DBGVCR32_EL2         DBGVCR 
DBGWCR           Debug Watchpoint Control Registers               DBGWCR<n>_EL1        DBGWCR<n> 
DBGWVR           Debug Watchpoint Value Registers                 DBGWVR<n>_EL1        DBGWVR<n> 
DCCINT           Debug Comms Channel Interrupt Enable Register    MDCCINT_EL1          DBGDCCINT 
DCCSR            Debug Comms Channel Status Register              MDCCSR_EL0           DBGDSCRint 
DBGAUTHSTATUS    Debug Authentication Status                      DBGAUTHSTATUS_EL1    DBGAUTHSTATUS 
DLR              Debug Link Register                              DLR_EL0[31:0]        DLR 
DSCR             Debug System Control Register                    MDSCR_EL1            DBGDSCRext 
DSPSR            Debug Saved PE State Register                    DSPSR_EL0            DSPSR 
FAR              Fault Address Register                           FAR_EL1              DFAR, IFAR 
                                                                  FAR_EL2              HDFAR, HIFAR 
                                                                  FAR_EL3              FAR_EL3 
                                                                  HPFAR_EL2            HPFAR 
HCR              Hypervisor Configuration Register                HCR_EL2              HCR 
                                                                                       HCR2 
HDCR             Hyp or EL2 Debug Control Register                MDCR_EL2             HDCR 
HSCTLR           Hypervisor System Control Register               SCTLR_EL2            HSCTLR 
HTTBR            EL2 Translation Table Base Register              TTBR0_EL2            HTTBR 
ISR              Interrupt Status Register                        ISR_EL1              ISR 
MPIDR            Multiprocessor Affinity Register                 MPIDR_EL1            MPIDR 
OSDLR            OS Double-Lock Register                          OSDLR_EL1            DBGOSDLR 
OSDTRRX          OS Lock Data Transfer Register, Receive          OSDTRRX_EL1          DBGDTRRXext 
OSDTRTX          OS Lock Data Transfer Register, Transmit         OSDTRTX_EL1          DBGDTRTXext 
OSECCR           OS Lock Exception Catch Control Register         OSECCR_EL1           DBGOSECCR 
OSLAR            OS Lock Access Register                          OSLAR_EL1            DBGOSLAR 
OSLSR            OS Lock Status Register                          OSLSR_EL1            DBGOSLSR 
SCR              Secure Configuration Register                    SCR_EL3              SCR 
SCTLR            System Control Register                          SCTLR_EL1            SCTLR (NS) 
                                                                  SCTLR_EL2            HSCTLR 
                                                                  SCTLR_EL3            SCTLR (S) 
SDCR             Secure or EL3 Debug Configuration Register       MDCR_EL3             SDCR 
SDER             Secure Debug Enable Register                     SDER32_EL3           SDER 
SPSR             Saved Program Status Register                    SPSR_EL1             SPSR (general description) 
                                                                  SPSR_EL2             SPSR_abt 
                                                                  SPSR_EL3             SPSR_fiq 
                                                                                       SPSR_hyp 
                                                                                       SPSR_irq 
                                                                                       SPSR_mon 
                                                                                       SPSR_svc 
                                                                                       SPSR_und 
TCR              Translation Control Register                     TCR_EL1              TTBCR(NS) 
                                                                  TCR_EL2              HTCR 
                                                                  TCR_EL3              TTBCR(S) 
                                                                  VTCR_EL2             VTCR 
TTBR             Translation Table Base Register                  TTBR0_EL1            TTBR0 
                                                                  TTBR0_EL2            TTBR1 
                                                                  TTBR0_EL3            HTTBR 
                                                                  TTBR1_EL1            VTTBR 
                                                                  VTTBR_EL2 
VCR              PL1&0 stage 2 Translation Control Register       VTCR_EL2             VTCR 
VBAR             Vector Base Address Register                     VBAR_EL1             VBAR 
                                                                  VBAR_EL2             HVBAR 
                                                                  VBAR_EL3             MVBAR 
VTTBR            PL1&0 stage 2 Translation Table Base Register    VTTBR_EL2            VTTBR 





Table K12-2 disambiguates the general names of the System registers that provide access to the Performance Monitors by Execution state.


Table K12-2 Disambiguation of general names of the Performance Monitors System registers by Execution state

General name  Short description  AArch64 register  AArch32 register
PMCCFILTR  Cycle Count Filter Register  PMCCFILTR_EL0  PMCCFILTR
PMCCNTR  Cycle Count Register  PMCCNTR_EL0  PMCCNTR
PMCEID0  Performance Monitors Common Event Identification register 0  PMCEID0_EL0  PMCEID0
PMCEID1  Performance Monitors Common Event Identification register 1  PMCEID1_EL0  PMCEID1
PMCNTENCLR  Performance Monitors Count Enable Clear register  PMCNTENCLR_EL0  PMCNTENCLR
PMCNTENSET  Performance Monitors Count Enable Set register  PMCNTENSET_EL0  PMCNTENSET
PMCR  Performance Monitors Control Register  PMCR_EL0  PMCR
PMEVCNTR<n>  Performance Monitors Event Count Registers, n = 0-30  PMEVCNTR<n>_EL0  PMEVCNTR<n>
PMEVTYPER<n>  Performance Monitors Event Type Registers, n = 0-30  PMEVTYPER<n>_EL0  PMEVTYPER<n>
PMINTENCLR  Performance Monitors Interrupt Enable Clear register  PMINTENCLR_EL1  PMINTENCLR
PMINTENSET  Performance Monitors Interrupt Enable Set register  PMINTENSET_EL1  PMINTENSET
PMOVSCLR  Performance Monitors Overflow Flag Status Register  PMOVSCLR_EL0  PMOVSR
PMOVSSET  Performance Monitors Overflow Flag Status Set register  PMOVSSET_EL0  PMOVSSET
PMSELR  Performance Monitors Event Counter Selection Register  PMSELR_EL0  PMSELR
PMSWINC  Performance Monitors Software Increment register  PMSWINC_EL0  PMSWINC
PMUSERENR  Performance Monitors User Enable Register  PMUSERENR_EL0  PMUSERENR
PMXEVCNTR  Performance Monitors Selected Event Count Register  PMXEVCNTR_EL0  PMXEVCNTR
PMXEVTYPER  Performance Monitors Selected Event Type Register  PMXEVTYPER_EL0  PMXEVTYPER
Table K12-3 disambiguates the general names of the System registers that provide access to the Generic Timer by Execution state.


Table K12-3 Disambiguation of general names of the Generic Timer System registers by Execution state

General name  Short description  AArch64 register  AArch32 register
CNTFRQ  Counter-timer Frequency register  CNTFRQ_EL0  CNTFRQ
CNTHCTL  Counter-timer Hypervisor Control register  CNTHCTL_EL2  CNTHCTL
CNTHP_CTL  Counter-timer Hypervisor Physical Timer Control register  CNTHP_CTL_EL2  CNTHP_CTL
CNTHP_CVAL  Counter-timer Hypervisor Physical Timer CompareValue register  CNTHP_CVAL_EL2  CNTHP_CVAL
CNTHP_TVAL  Counter-timer Hypervisor Physical Timer TimerValue register  CNTHP_TVAL_EL2  CNTHP_TVAL
CNTKCTL  Counter-timer Kernel Control register  CNTKCTL_EL1  CNTKCTL
CNTP_CTL  Counter-timer Physical Timer Control register  CNTP_CTL_EL0  CNTP_CTL
CNTP_CVAL  Counter-timer Physical Timer CompareValue register  CNTP_CVAL_EL0  CNTP_CVAL
CNTP_TVAL  Counter-timer Physical Timer TimerValue register  CNTP_TVAL_EL0  CNTP_TVAL
CNTPCT  Counter-timer Physical Count register  CNTPCT_EL0  CNTPCT
CNTPS_CTL  Counter-timer Physical Secure Timer Control register  CNTPS_CTL_EL1  -
CNTPS_CVAL  Counter-timer Physical Secure Timer CompareValue register  CNTPS_CVAL_EL1  -
CNTPS_TVAL  Counter-timer Physical Secure Timer TimerValue register  CNTPS_TVAL_EL1  -
CNTV_CTL  Counter-timer Virtual Timer Control register  CNTV_CTL_EL0  CNTV_CTL
CNTV_CVAL  Counter-timer Virtual Timer CompareValue register  CNTV_CVAL_EL0  CNTV_CVAL
CNTV_TVAL  Counter-timer Virtual Timer TimerValue register  CNTV_TVAL_EL0  CNTV_TVAL
CNTVCT  Counter-timer Virtual Count register  CNTVCT_EL0  CNTVCT
CNTVOFF  Counter-timer Virtual Offset register  CNTVOFF_EL2  CNTVOFF
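
As an illustration of how Table K12-3 might be applied in a tool, the following Python sketch derives the AArch32 name of a Generic Timer register from its AArch64 name. It assumes, based only on the rows of the table, that the AArch32 name is the AArch64 name without its _ELn suffix and that the three CNTPS_* registers have no AArch32 counterpart. The function name aarch32_timer_name is hypothetical and is not part of the architecture.

```python
import re

# AArch64 Generic Timer registers that Table K12-3 lists with "-" in the
# AArch32 column, that is, with no AArch32 equivalent.
NO_AARCH32_EQUIVALENT = {"CNTPS_CTL_EL1", "CNTPS_CVAL_EL1", "CNTPS_TVAL_EL1"}


def aarch32_timer_name(aarch64_name):
    """Return the AArch32 name for an AArch64 Generic Timer register.

    Returns None when the table lists no AArch32 equivalent.
    Assumption: the mapping is "drop the _ELn suffix", as the table rows suggest.
    """
    if aarch64_name in NO_AARCH32_EQUIVALENT:
        return None
    match = re.fullmatch(r"(CNT\w+?)_EL[0-3]", aarch64_name)
    if match is None:
        raise ValueError(f"not a Generic Timer register name: {aarch64_name!r}")
    return match.group(1)
```

For example, aarch32_timer_name("CNTHP_CTL_EL2") gives "CNTHP_CTL", matching the corresponding row of the table.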
K12.1.2 Register name disambiguation by Exception level 


Table K12-4 disambiguates the general names of the AArch64 System registers by Exception level. 


Table K12-4 Disambiguation of AArch64 System registers by Exception level

General form    EL0     EL1             EL2        EL3
AFSR0_ELx       -       AFSR0_EL1       AFSR0_EL2  AFSR0_EL3
AFSR1_ELx       -       AFSR1_EL1       AFSR1_EL2  AFSR1_EL3
CONTEXTIDR_ELx  -       CONTEXTIDR_EL1  -          -
CPTR_ELx        -       -               CPTR_EL2   CPTR_EL3
ELR_ELx         -       ELR_EL1         ELR_EL2    ELR_EL3
ESR_ELx         -       ESR_EL1         ESR_EL2    ESR_EL3
FAR_ELx         -       FAR_EL1         FAR_EL2    FAR_EL3
MAIR_ELx        -       MAIR_EL1        MAIR_EL2   MAIR_EL3
RMR_ELx         -       RMR_EL1         RMR_EL2    RMR_EL3
RVBAR_ELx       -       RVBAR_EL1       RVBAR_EL2  RVBAR_EL3
SCTLR_ELx       -       SCTLR_EL1       SCTLR_EL2  SCTLR_EL3
SP_ELx          SP_EL0  SP_EL1          SP_EL2     SP_EL3
SPSR_ELx        -       SPSR_EL1        SPSR_EL2   SPSR_EL3
TCR_ELx         -       TCR_EL1         TCR_EL2    TCR_EL3
TTBR0_ELx       -       TTBR0_EL1       TTBR0_EL2  TTBR0_EL3
TTBR1_ELx       -       TTBR1_EL1       -          -
VBAR_ELx        -       VBAR_EL1        VBAR_EL2   VBAR_EL3
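
The general forms in Table K12-4 can be expanded mechanically. The following Python sketch is one possible way to do so, assuming the Exception levels are transcribed exactly from the table. The names REGISTER_ELS and expand_general_form are hypothetical helpers and do not come from the architecture.

```python
# Exception levels at which each general form has a register,
# transcribed from Table K12-4 (a "-" entry means no register at that level).
REGISTER_ELS = {
    "AFSR0": (1, 2, 3),
    "AFSR1": (1, 2, 3),
    "CONTEXTIDR": (1,),
    "CPTR": (2, 3),
    "ELR": (1, 2, 3),
    "ESR": (1, 2, 3),
    "FAR": (1, 2, 3),
    "MAIR": (1, 2, 3),
    "RMR": (1, 2, 3),
    "RVBAR": (1, 2, 3),
    "SCTLR": (1, 2, 3),
    "SP": (0, 1, 2, 3),
    "SPSR": (1, 2, 3),
    "TCR": (1, 2, 3),
    "TTBR0": (1, 2, 3),
    "TTBR1": (1,),
    "VBAR": (1, 2, 3),
}


def expand_general_form(general_form):
    """Expand a general form such as 'SCTLR_ELx' into its per-level register names."""
    base, _, suffix = general_form.rpartition("_")
    if suffix != "ELx" or base not in REGISTER_ELS:
        raise ValueError(f"unknown general form: {general_form!r}")
    return [f"{base}_EL{level}" for level in REGISTER_ELS[base]]
```

For example, expand_general_form("CPTR_ELx") yields only the EL2 and EL3 registers, because the table shows no CPTR register at EL0 or EL1.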
K12.2 Alphabetical index of AArch64 registers and system instructions
This section is an index of AArch64 registers and system instructions in alphabetical order. 
Table K12-5 Alphabetical index of AArch64 Registers

Register  Description, see
ACTLR_EL1  ACTLR_EL1, Auxiliary Control Register (EL1) on page D7-1896
ACTLR_EL2  ACTLR_EL2, Auxiliary Control Register (EL2) on page D7-1897
ACTLR_EL3  ACTLR_EL3, Auxiliary Control Register (EL3) on page D7-1898
AFSR0_EL1  AFSR0_EL1, Auxiliary Fault Status Register 0 (EL1) on page D7-1899
AFSR0_EL2  AFSR0_EL2, Auxiliary Fault Status Register 0 (EL2) on page D7-1900
AFSR0_EL3  AFSR0_EL3, Auxiliary Fault Status Register 0 (EL3) on page D7-1901
AFSR1_EL1  AFSR1_EL1, Auxiliary Fault Status Register 1 (EL1) on page D7-1902
AFSR1_EL2  AFSR1_EL2, Auxiliary Fault Status Register 1 (EL2) on page D7-1903
AFSR1_EL3  AFSR1_EL3, Auxiliary Fault Status Register 1 (EL3) on page D7-1904
AIDR_EL1  AIDR_EL1, Auxiliary ID Register on page D7-1905
AMAIR_EL1  AMAIR_EL1, Auxiliary Memory Attribute Indirection Register (EL1) on page D7-1906
AMAIR_EL2  AMAIR_EL2, Auxiliary Memory Attribute Indirection Register (EL2) on page D7-1908
AMAIR_EL3  AMAIR_EL3, Auxiliary Memory Attribute Indirection Register (EL3) on page D7-1909
AT S12E0R  AT S12E0R, Address Translate Stages 1 and 2 EL0 Read on page C5-366
AT S12E0W  AT S12E0W, Address Translate Stages 1 and 2 EL0 Write on page C5-367
AT S12E1R  AT S12E1R, Address Translate Stages 1 and 2 EL1 Read on page C5-368
AT S12E1W  AT S12E1W, Address Translate Stages 1 and 2 EL1 Write on page C5-369
AT S1E0R  AT S1E0R, Address Translate Stage 1 EL0 Read on page C5-370
AT S1E0W  AT S1E0W, Address Translate Stage 1 EL0 Write on page C5-371
AT S1E1R  AT S1E1R, Address Translate Stage 1 EL1 Read on page C5-372
AT S1E1W  AT S1E1W, Address Translate Stage 1 EL1 Write on page C5-373
AT S1E2R  AT S1E2R, Address Translate Stage 1 EL2 Read on page C5-374
AT S1E2W  AT S1E2W, Address Translate Stage 1 EL2 Write on page C5-375
AT S1E3R  AT S1E3R, Address Translate Stage 1 EL3 Read on page C5-376
AT S1E3W  AT S1E3W, Address Translate Stage 1 EL3 Write on page C5-377
CCSIDR_EL1  CCSIDR_EL1, Current Cache Size ID Register on page D7-1910
CLIDR_EL1  CLIDR_EL1, Cache Level ID Register on page D7-1912
CNTFRQ_EL0  CNTFRQ_EL0, Counter-timer Frequency register on page D7-2256
CNTHCTL_EL2  CNTHCTL_EL2, Counter-timer Hypervisor Control register on page D7-2258
CNTHP_CTL_EL2  CNTHP_CTL_EL2, Counter-timer Hypervisor Physical Timer Control register on page D7-2260
CNTHP_CVAL_EL2  CNTHP_CVAL_EL2, Counter-timer Hypervisor Physical Timer CompareValue register on page D7-2262
CNTHP_TVAL_EL2  CNTHP_TVAL_EL2, Counter-timer Hypervisor Physical Timer TimerValue register on page D7-2263
CNTKCTL_EL1  CNTKCTL_EL1, Counter-timer Kernel Control register on page D7-2264
CNTP_CTL_EL0  CNTP_CTL_EL0, Counter-timer Physical Timer Control register on page D7-2267
CNTP_CVAL_EL0  CNTP_CVAL_EL0, Counter-timer Physical Timer CompareValue register on page D7-2269
CNTP_TVAL_EL0  CNTP_TVAL_EL0, Counter-timer Physical Timer TimerValue register on page D7-2270
CNTPCT_EL0  CNTPCT_EL0, Counter-timer Physical Count register on page D7-2272
CNTPS_CTL_EL1  CNTPS_CTL_EL1, Counter-timer Physical Secure Timer Control register on page D7-2273
CNTPS_CVAL_EL1  CNTPS_CVAL_EL1, Counter-timer Physical Secure Timer CompareValue register on page D7-2275
CNTPS_TVAL_EL1  CNTPS_TVAL_EL1, Counter-timer Physical Secure Timer TimerValue register on page D7-2276
CNTV_CTL_EL0  CNTV_CTL_EL0, Counter-timer Virtual Timer Control register on page D7-2277
CNTV_CVAL_EL0  CNTV_CVAL_EL0, Counter-timer Virtual Timer CompareValue register on page D7-2279
CNTV_TVAL_EL0  CNTV_TVAL_EL0, Counter-timer Virtual Timer TimerValue register on page D7-2280
CNTVCT_EL0  CNTVCT_EL0, Counter-timer Virtual Count register on page D7-2282
CNTVOFF_EL2  CNTVOFF_EL2, Counter-timer Virtual Offset register on page D7-2283
CONTEXTIDR_EL1  CONTEXTIDR_EL1, Context ID Register (EL1) on page D7-1914
CPACR_EL1  CPACR_EL1, Architectural Feature Access Control Register on page D7-1916
CPTR_EL2  CPTR_EL2, Architectural Feature Trap Register (EL2) on page D7-1918
CPTR_EL3  CPTR_EL3, Architectural Feature Trap Register (EL3) on page D7-1920
CSSELR_EL1  CSSELR_EL1, Cache Size Selection Register on page D7-1922
CTR_EL0  CTR_EL0, Cache Type Register on page D7-1924
CurrentEL  CurrentEL, Current Exception Level on page C5-294
DACR32_EL2  DACR32_EL2, Domain Access Control Register on page D7-1926
DAIF  DAIF, Interrupt Mask Bits on page C5-296
DBGAUTHSTATUS_EL1  DBGAUTHSTATUS_EL1, Debug Authentication Status register on page D7-2148
DBGBCR<n>_EL1  DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15 on page D7-2150
DBGBVR<n>_EL1  DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 on page D7-2153
DBGCLAIMCLR_EL1  DBGCLAIMCLR_EL1, Debug Claim Tag Clear register on page D7-2156
DBGCLAIMSET_EL1  DBGCLAIMSET_EL1, Debug Claim Tag Set register on page D7-2158
DBGDTR_EL0  DBGDTR_EL0, Debug Data Transfer Register, half-duplex on page D7-2160
DBGDTRRX_EL0  DBGDTRRX_EL0, Debug Data Transfer Register, Receive on page D7-2162
DBGDTRTX_EL0  DBGDTRTX_EL0, Debug Data Transfer Register, Transmit on page D7-2164
DBGPRCR_EL1  DBGPRCR_EL1, Debug Power Control Register on page D7-2166
DBGVCR32_EL2  DBGVCR32_EL2, Debug Vector Catch Register on page D7-2168
DBGWCR<n>_EL1  DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15 on page D7-2172
DBGWVR<n>_EL1  DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15 on page D7-2175
DC CISW  DC CISW, Data or unified Cache line Clean and Invalidate by Set/Way on page C5-348
DC CIVAC  DC CIVAC, Data or unified Cache line Clean and Invalidate by VA to PoC on page C5-350
DC CSW  DC CSW, Data or unified Cache line Clean by Set/Way on page C5-351
DC CVAC  DC CVAC, Data or unified Cache line Clean by VA to PoC on page C5-353
DC CVAU  DC CVAU, Data or unified Cache line Clean by VA to PoU on page C5-354
DC ISW  DC ISW, Data or unified Cache line Invalidate by Set/Way on page C5-355
DC IVAC  DC IVAC, Data or unified Cache line Invalidate by VA to PoC on page C5-357
DC ZVA  DC ZVA, Data Cache Zero by VA on page C5-359
DCZID_EL0  DCZID_EL0, Data Cache Zero ID register on page D7-1928
DLR_EL0  DLR_EL0, Debug Link Register on page D7-2177
DSPSR_EL0  DSPSR_EL0, Debug Saved Program Status Register on page D7-2178
ELR_EL1  ELR_EL1, Exception Link Register (EL1) on page C5-300
ELR_EL2  ELR_EL2, Exception Link Register (EL2) on page C5-301
ELR_EL3  ELR_EL3, Exception Link Register (EL3) on page C5-303
ESR_EL1  ESR_EL1, Exception Syndrome Register (EL1) on page D7-1930
ESR_EL2  ESR_EL2, Exception Syndrome Register (EL2) on page D7-1931
ESR_EL3  ESR_EL3, Exception Syndrome Register (EL3) on page D7-1932
ESR_ELx  ESR_ELx, Exception Syndrome Register (ELx) on page D7-1933
FAR_EL1  FAR_EL1, Fault Address Register (EL1) on page D7-1965
FAR_EL2  FAR_EL2, Fault Address Register (EL2) on page D7-1967
FAR_EL3  FAR_EL3, Fault Address Register (EL3) on page D7-1969
FPCR  FPCR, Floating-point Control Register on page C5-304
FPEXC32_EL2  FPEXC32_EL2, Floating-Point Exception Control register on page D7-1971
FPSR  FPSR, Floating-point Status Register on page C5-308
HACR_EL2  HACR_EL2, Hypervisor Auxiliary Control Register on page D7-1976
HCR_EL2  HCR_EL2, Hypervisor Configuration Register on page D7-1977
HPFAR_EL2  HPFAR_EL2, Hypervisor IPA Fault Address Register on page D7-1989
HSTR_EL2  HSTR_EL2, Hypervisor System Trap Register on page D7-1991
IC IALLU  IC IALLU, Instruction Cache Invalidate All to PoU on page C5-361
IC IALLUIS  IC IALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable on page C5-362
IC IVAU  IC IVAU, Instruction Cache line Invalidate by VA to PoU on page C5-363
ID_AA64AFR0_EL1  ID_AA64AFR0_EL1, AArch64 Auxiliary Feature Register 0 on page D7-1993
ID_AA64AFR1_EL1  ID_AA64AFR1_EL1, AArch64 Auxiliary Feature Register 1 on page D7-1995
ID_AA64DFR0_EL1  ID_AA64DFR0_EL1, AArch64 Debug Feature Register 0 on page D7-1996
ID_AA64DFR1_EL1  ID_AA64DFR1_EL1, AArch64 Debug Feature Register 1 on page D7-1998
ID_AA64ISAR0_EL1  ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0 on page D7-1999
ID_AA64ISAR1_EL1  ID_AA64ISAR1_EL1, AArch64 Instruction Set Attribute Register 1 on page D7-2001
ID_AA64MMFR0_EL1  ID_AA64MMFR0_EL1, AArch64 Memory Model Feature Register 0 on page D7-2002
ID_AA64MMFR1_EL1  ID_AA64MMFR1_EL1, AArch64 Memory Model Feature Register 1 on page D7-2005
ID_AA64PFR0_EL1  ID_AA64PFR0_EL1, AArch64 Processor Feature Register 0 on page D7-2006
ID_AA64PFR1_EL1  ID_AA64PFR1_EL1, AArch64 Processor Feature Register 1 on page D7-2008
ID_AFR0_EL1  ID_AFR0_EL1, AArch32 Auxiliary Feature Register 0 on page D7-2009
ID_DFR0_EL1  ID_DFR0_EL1, AArch32 Debug Feature Register 0 on page D7-2011
ID_ISAR0_EL1  ID_ISAR0_EL1, AArch32 Instruction Set Attribute Register 0 on page D7-2014
ID_ISAR1_EL1  ID_ISAR1_EL1, AArch32 Instruction Set Attribute Register 1 on page D7-2017
ID_ISAR2_EL1  ID_ISAR2_EL1, AArch32 Instruction Set Attribute Register 2 on page D7-2020
ID_ISAR3_EL1  ID_ISAR3_EL1, AArch32 Instruction Set Attribute Register 3 on page D7-2023
ID_ISAR4_EL1  ID_ISAR4_EL1, AArch32 Instruction Set Attribute Register 4 on page D7-2026
ID_ISAR5_EL1  ID_ISAR5_EL1, AArch32 Instruction Set Attribute Register 5 on page D7-2029
ID_MMFR0_EL1  ID_MMFR0_EL1, AArch32 Memory Model Feature Register 0 on page D7-2031
ID_MMFR1_EL1  ID_MMFR1_EL1, AArch32 Memory Model Feature Register 1 on page D7-2034
ID_MMFR2_EL1  ID_MMFR2_EL1, AArch32 Memory Model Feature Register 2 on page D7-2038
ID_MMFR3_EL1  ID_MMFR3_EL1, AArch32 Memory Model Feature Register 3 on page D7-2041
ID_MMFR4_EL1  ID_MMFR4_EL1, AArch32 Memory Model Feature Register 4 on page D7-2044
ID_PFR0_EL1  ID_PFR0_EL1, AArch32 Processor Feature Register 0 on page D7-2046
ID_PFR1_EL1  ID_PFR1_EL1, AArch32 Processor Feature Register 1 on page D7-2048
IFSR32_EL2  IFSR32_EL2, Instruction Fault Status Register (EL2) on page D7-2051
ISR_EL1  ISR_EL1, Interrupt Status Register on page D7-2055
MAIR_EL1  MAIR_EL1, Memory Attribute Indirection Register (EL1) on page D7-2057
MAIR_EL2  MAIR_EL2, Memory Attribute Indirection Register (EL2) on page D7-2059
MAIR_EL3  MAIR_EL3, Memory Attribute Indirection Register (EL3) on page D7-2061
MDCCINT_EL1  MDCCINT_EL1, Monitor DCC Interrupt Enable Register on page D7-2183
MDCCSR_EL0  MDCCSR_EL0, Monitor DCC Status Register on page D7-2185
MDCR_EL2  MDCR_EL2, Monitor Debug Configuration Register (EL2) on page D7-2187
MDCR_EL3  MDCR_EL3, Monitor Debug Configuration Register (EL3) on page D7-2191
MDRAR_EL1  MDRAR_EL1, Monitor Debug ROM Address Register on page D7-2195
MDSCR_EL1  MDSCR_EL1, Monitor Debug System Control Register on page D7-2197
MIDR_EL1  MIDR_EL1, Main ID Register on page D7-2063
MPIDR_EL1  MPIDR_EL1, Multiprocessor Affinity Register on page D7-2065
MVFR0_EL1  MVFR0_EL1, AArch32 Media and VFP Feature Register 0 on page D7-2067
MVFR1_EL1  MVFR1_EL1, AArch32 Media and VFP Feature Register 1 on page D7-2070
MVFR2_EL1  MVFR2_EL1, AArch32 Media and VFP Feature Register 2 on page D7-2073
NZCV  NZCV, Condition Flags on page C5-311
OSDLR_EL1  OSDLR_EL1, OS Double Lock Register on page D7-2201
OSDTRRX_EL1  OSDTRRX_EL1, OS Lock Data Transfer Register, Receive on page D7-2203
OSDTRTX_EL1  OSDTRTX_EL1, OS Lock Data Transfer Register, Transmit on page D7-2205
OSECCR_EL1  OSECCR_EL1, OS Lock Exception Catch Control Register on page D7-2207
OSLAR_EL1  OSLAR_EL1, OS Lock Access Register on page D7-2209
OSLSR_EL1  OSLSR_EL1, OS Lock Status Register on page D7-2211
PAR_EL1  PAR_EL1, Physical Address Register on page D7-2075
PMCCFILTR_EL0  PMCCFILTR_EL0, Performance Monitors Cycle Count Filter Register on page D7-2216
PMCCNTR_EL0  PMCCNTR_EL0, Performance Monitors Cycle Count Register on page D7-2218
PMCEID0_EL0  PMCEID0_EL0, Performance Monitors Common Event Identification register 0 on page D7-2220
PMCEID1_EL0  PMCEID1_EL0, Performance Monitors Common Event Identification register 1 on page D7-2222
PMCNTENCLR_EL0  PMCNTENCLR_EL0, Performance Monitors Count Enable Clear register on page D7-2224
PMCNTENSET_EL0  PMCNTENSET_EL0, Performance Monitors Count Enable Set register on page D7-2226
PMCR_EL0  PMCR_EL0, Performance Monitors Control Register on page D7-2228
PMEVCNTR<n>_EL0  PMEVCNTR<n>_EL0, Performance Monitors Event Count Registers, n = 0 - 30 on page D7-2231
PMEVTYPER<n>_EL0  PMEVTYPER<n>_EL0, Performance Monitors Event Type Registers, n = 0 - 30 on page D7-2233
PMINTENCLR_EL1  PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register on page D7-2237
PMINTENSET_EL1  PMINTENSET_EL1, Performance Monitors Interrupt Enable Set register on page D7-2239
PMOVSCLR_EL0  PMOVSCLR_EL0, Performance Monitors Overflow Flag Status Clear Register on page D7-2241
PMOVSSET_EL0  PMOVSSET_EL0, Performance Monitors Overflow Flag Status Set register on page D7-2243
PMSELR_EL0  PMSELR_EL0, Performance Monitors Event Counter Selection Register on page D7-2245
PMSWINC_EL0  PMSWINC_EL0, Performance Monitors Software Increment register on page D7-2247
PMUSERENR_EL0  PMUSERENR_EL0, Performance Monitors User Enable Register on page D7-2249
PMXEVCNTR_EL0  PMXEVCNTR_EL0, Performance Monitors Selected Event Count Register on page D7-2251
PMXEVTYPER_EL0  PMXEVTYPER_EL0, Performance Monitors Selected Event Type Register on page D7-2253
REVIDR_EL1  REVIDR_EL1, Revision ID Register on page D7-2079
RMR_EL1  RMR_EL1, Reset Management Register (if EL2 and EL3 not implemented) on page D7-2080
RMR_EL2  RMR_EL2, Reset Management Register (if EL2 implemented and EL3 not implemented) on page D7-2082
RMR_EL3  RMR_EL3, Reset Management Register (if EL3 implemented) on page D7-2084
RVBAR_EL1  RVBAR_EL1, Reset Vector Base Address Register (if EL2 and EL3 not implemented) on page D7-2086
RVBAR_EL2  RVBAR_EL2, Reset Vector Base Address Register (if EL3 not implemented) on page D7-2087
RVBAR_EL3  RVBAR_EL3, Reset Vector Base Address Register (if EL3 implemented) on page D7-2088
S3_<op1>_<Cn>_<Cm>_<op2>  S3_<op1>_<Cn>_<Cm>_<op2>, IMPLEMENTATION DEFINED registers on page D7-2089
SCR_EL3  SCR_EL3, Secure Configuration Register on page D7-2090
SCTLR_EL1  SCTLR_EL1, System Control Register (EL1) on page D7-2094
SCTLR_EL2  SCTLR_EL2, System Control Register (EL2) on page D7-2101
SCTLR_EL3  SCTLR_EL3, System Control Register (EL3) on page D7-2105
SDER32_EL3  SDER32_EL3, AArch32 Secure Debug Enable Register on page D7-2213
SP_EL0  SP_EL0, Stack Pointer (EL0) on page C5-313
SP_EL1  SP_EL1, Stack Pointer (EL1) on page C5-314
SP_EL2  SP_EL2, Stack Pointer (EL2) on page C5-316
SP_EL3  SP_EL3, Stack Pointer (EL3) on page C5-318
SPSel  SPSel, Stack Pointer Select on page C5-319
SPSR_abt  SPSR_abt, Saved Program Status Register (Abort mode) on page C5-320
SPSR_EL1  SPSR_EL1, Saved Program Status Register (EL1) on page C5-323
SPSR_EL2  SPSR_EL2, Saved Program Status Register (EL2) on page C5-328
SPSR_EL3  SPSR_EL3, Saved Program Status Register (EL3) on page C5-333
SPSR_fiq  SPSR_fiq, Saved Program Status Register (FIQ mode) on page C5-338
SPSR_irq  SPSR_irq, Saved Program Status Register (IRQ mode) on page C5-341
SPSR_und  SPSR_und, Saved Program Status Register (Undefined mode) on page C5-344
TCR_EL1  TCR_EL1, Translation Control Register (EL1) on page D7-2109
TCR_EL2  TCR_EL2, Translation Control Register (EL2) on page D7-2114
TCR_EL3  TCR_EL3, Translation Control Register (EL3) on page D7-2117
TLBI ALLE1  TLBI ALLE1, TLB Invalidate All, EL1 on page C5-379
TLBI ALLE1IS  TLBI ALLE1IS, TLB Invalidate All, EL1, Inner Shareable on page C5-380
TLBI ALLE2  TLBI ALLE2, TLB Invalidate All, EL2 on page C5-381
TLBI ALLE2IS  TLBI ALLE2IS, TLB Invalidate All, EL2, Inner Shareable on page C5-382
TLBI ALLE3  TLBI ALLE3, TLB Invalidate All, EL3 on page C5-383
TLBI ALLE3IS  TLBI ALLE3IS, TLB Invalidate All, EL3, Inner Shareable on page C5-384
TLBI ASIDE1  TLBI ASIDE1, TLB Invalidate by ASID, EL1 on page C5-385
TLBI ASIDE1IS  TLBI ASIDE1IS, TLB Invalidate by ASID, EL1, Inner Shareable on page C5-387
TLBI IPAS2E1  TLBI IPAS2E1, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1 on page C5-389
TLBI IPAS2E1IS  TLBI IPAS2E1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1, Inner Shareable on page C5-390
TLBI IPAS2LE1  TLBI IPAS2LE1, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1 on page C5-392
TLBI IPAS2LE1IS  TLBI IPAS2LE1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1, Inner Shareable on page C5-393
TLBI VAAE1  TLBI VAAE1, TLB Invalidate by VA, All ASID, EL1 on page C5-395
TLBI VAAE1IS  TLBI VAAE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable on page C5-397
TLBI VAALE1  TLBI VAALE1, TLB Invalidate by VA, All ASID, Last level, EL1 on page C5-399
TLBI VAALE1IS  TLBI VAALE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable on page C5-401
TLBI VAE1  TLBI VAE1, TLB Invalidate by VA, EL1 on page C5-403
TLBI VAE1IS  TLBI VAE1IS, TLB Invalidate by VA, EL1, Inner Shareable on page C5-405
TLBI VAE2  TLBI VAE2, TLB Invalidate by VA, EL2 on page C5-407
TLBI VAE2IS  TLBI VAE2IS, TLB Invalidate by VA, EL2, Inner Shareable on page C5-409
TLBI VAE3  TLBI VAE3, TLB Invalidate by VA, EL3 on page C5-411
TLBI VAE3IS  TLBI VAE3IS, TLB Invalidate by VA, EL3, Inner Shareable on page C5-413
TLBI VALE1  TLBI VALE1, TLB Invalidate by VA, Last level, EL1 on page C5-415
TLBI VALE1IS  TLBI VALE1IS, TLB Invalidate by VA, Last level, EL1, Inner Shareable on page C5-417
TLBI VALE2  TLBI VALE2, TLB Invalidate by VA, Last level, EL2 on page C5-419
TLBI VALE2IS  TLBI VALE2IS, TLB Invalidate by VA, Last level, EL2, Inner Shareable on page C5-421
TLBI VALE3  TLBI VALE3, TLB Invalidate by VA, Last level, EL3 on page C5-423
TLBI VALE3IS  TLBI VALE3IS, TLB Invalidate by VA, Last level, EL3, Inner Shareable on page C5-425
TLBI VMALLE1  TLBI VMALLE1, TLB Invalidate by VMID, All at stage 1, EL1 on page C5-427
TLBI VMALLE1IS  TLBI VMALLE1IS, TLB Invalidate by VMID, All at stage 1, EL1, Inner Shareable on page C5-428
TLBI VMALLS12E1  TLBI VMALLS12E1, TLB Invalidate by VMID, All at Stage 1 and 2, EL1 on page C5-429
TLBI VMALLS12E1IS  TLBI VMALLS12E1IS, TLB Invalidate by VMID, All at Stage 1 and 2, EL1, Inner Shareable on page C5-430
TPIDR_EL0  TPIDR_EL0, EL0 Read/Write Software Thread ID Register on page D7-2120
TPIDR_EL1  TPIDR_EL1, EL1 Software Thread ID Register on page D7-2121
TPIDR_EL2  TPIDR_EL2, EL2 Software Thread ID Register on page D7-2122
TPIDR_EL3  TPIDR_EL3, EL3 Software Thread ID Register on page D7-2123
TPIDRRO_EL0  TPIDRRO_EL0, EL0 Read-Only Software Thread ID Register on page D7-2124
TTBR0_EL1  TTBR0_EL1, Translation Table Base Register 0 (EL1) on page D7-2125
TTBR0_EL2  TTBR0_EL2, Translation Table Base Register 0 (EL2) on page D7-2127
TTBR0_EL3  TTBR0_EL3, Translation Table Base Register 0 (EL3) on page D7-2129
TTBR1_EL1  TTBR1_EL1, Translation Table Base Register 1 (EL1) on page D7-2131
VBAR_EL1  VBAR_EL1, Vector Base Address Register (EL1) on page D7-2133
VBAR_EL2  VBAR_EL2, Vector Base Address Register (EL2) on page D7-2135
VBAR_EL3  VBAR_EL3, Vector Base Address Register (EL3) on page D7-2137
VMPIDR_EL2  VMPIDR_EL2, Virtualization Multiprocessor ID Register on page D7-2138
VPIDR_EL2  VPIDR_EL2, Virtualization Processor ID Register on page D7-2140
VTCR_EL2  VTCR_EL2, Virtualization Translation Control Register on page D7-2142
VTTBR_EL2  VTTBR_EL2, Virtualization Translation Table Base Register on page D7-2145
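
The S3_<op1>_<Cn>_<Cm>_<op2> entry in Table K12-5 is a name pattern and does not identify one register. As a minimal sketch, the following Python function builds such a name from its numeric fields. The field ranges used for validation (3-bit op1 and op2, 4-bit Cn and Cm) are an assumption based on the usual System register encoding widths, and the function name impl_defined_register_name is hypothetical. The architecture might further restrict which Cn values are IMPLEMENTATION DEFINED; this sketch does not check that.

```python
def impl_defined_register_name(op1, cn, cm, op2):
    """Format an S3_<op1>_<Cn>_<Cm>_<op2> register name from its fields.

    Assumed field ranges: op1 and op2 are 0-7, Cn and Cm are 0-15.
    """
    fields = (("op1", op1, 7), ("Cn", cn, 15), ("Cm", cm, 15), ("op2", op2, 7))
    for label, value, limit in fields:
        if not 0 <= value <= limit:
            raise ValueError(f"{label} must be in the range 0-{limit}, got {value}")
    return f"S3_{op1}_C{cn}_C{cm}_{op2}"
```

For example, impl_defined_register_name(0, 15, 0, 0) produces the string S3_0_C15_C0_0.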
K12.3 Functional index of AArch64 registers and system instructions


This section is an index of the AArch64 registers and system instructions, divided by functional group. 


K12.3.1 Special-purpose registers 


This section is an index to the registers in the Special-purpose registers functional group. 


Table K12-6 Special-purpose registers

Register  Description, see
DLR_EL0  DLR_EL0, Debug Link Register on page D7-2177
DSPSR_EL0  DSPSR_EL0, Debug Saved Program Status Register on page D7-2178
ELR_EL1  ELR_EL1, Exception Link Register (EL1) on page C5-300
ELR_EL2  ELR_EL2, Exception Link Register (EL2) on page C5-301
ELR_EL3  ELR_EL3, Exception Link Register (EL3) on page C5-303
FPCR  FPCR, Floating-point Control Register on page C5-304
FPSR  FPSR, Floating-point Status Register on page C5-308
SP_EL0  SP_EL0, Stack Pointer (EL0) on page C5-313
SP_EL1  SP_EL1, Stack Pointer (EL1) on page C5-314
SP_EL2  SP_EL2, Stack Pointer (EL2) on page C5-316
SP_EL3  SP_EL3, Stack Pointer (EL3) on page C5-318
SPSR_abt  SPSR_abt, Saved Program Status Register (Abort mode) on page C5-320
SPSR_EL1  SPSR_EL1, Saved Program Status Register (EL1) on page C5-323
SPSR_EL2  SPSR_EL2, Saved Program Status Register (EL2) on page C5-328
SPSR_EL3  SPSR_EL3, Saved Program Status Register (EL3) on page C5-333
SPSR_fiq  SPSR_fiq, Saved Program Status Register (FIQ mode) on page C5-338
SPSR_irq  SPSR_irq, Saved Program Status Register (IRQ mode) on page C5-341
SPSR_und  SPSR_und, Saved Program Status Register (Undefined mode) on page C5-344





K12.3.2 VMSA-specific registers 


This section is an index to the registers in the Virtual memory control registers functional group. 


Table K12-7 VMSA-specific registers

Register  Description, see
AMAIR_EL1  AMAIR_EL1, Auxiliary Memory Attribute Indirection Register (EL1) on page D7-1906
AMAIR_EL2  AMAIR_EL2, Auxiliary Memory Attribute Indirection Register (EL2) on page D7-1908
AMAIR_EL3  AMAIR_EL3, Auxiliary Memory Attribute Indirection Register (EL3) on page D7-1909
CONTEXTIDR_EL1  CONTEXTIDR_EL1, Context ID Register (EL1) on page D7-1914
DACR32_EL2  DACR32_EL2, Domain Access Control Register on page D7-1926
MAIR_EL1  MAIR_EL1, Memory Attribute Indirection Register (EL1) on page D7-2057
MAIR_EL2  MAIR_EL2, Memory Attribute Indirection Register (EL2) on page D7-2059
MAIR_EL3  MAIR_EL3, Memory Attribute Indirection Register (EL3) on page D7-2061
TCR_EL1  TCR_EL1, Translation Control Register (EL1) on page D7-2109
TCR_EL2  TCR_EL2, Translation Control Register (EL2) on page D7-2114
TCR_EL3  TCR_EL3, Translation Control Register (EL3) on page D7-2117
TTBR0_EL1  TTBR0_EL1, Translation Table Base Register 0 (EL1) on page D7-2125
TTBR0_EL2  TTBR0_EL2, Translation Table Base Register 0 (EL2) on page D7-2127
TTBR0_EL3  TTBR0_EL3, Translation Table Base Register 0 (EL3) on page D7-2129
TTBR1_EL1  TTBR1_EL1, Translation Table Base Register 1 (EL1) on page D7-2131
VTCR_EL2  VTCR_EL2, Virtualization Translation Control Register on page D7-2142
VTTBR_EL2  VTTBR_EL2, Virtualization Translation Table Base Register on page D7-2145





K12.3.3 ID registers 


This section is an index to the registers in the Identification registers functional group. 


Table K12-8 ID registers

Register  Description, see
AIDR_EL1  AIDR_EL1, Auxiliary ID Register on page D7-1905
CCSIDR_EL1  CCSIDR_EL1, Current Cache Size ID Register on page D7-1910
CLIDR_EL1  CLIDR_EL1, Cache Level ID Register on page D7-1912
CSSELR_EL1  CSSELR_EL1, Cache Size Selection Register on page D7-1922
CTR_EL0  CTR_EL0, Cache Type Register on page D7-1924
DCZID_EL0  DCZID_EL0, Data Cache Zero ID register on page D7-1928
ID_AA64AFR0_EL1  ID_AA64AFR0_EL1, AArch64 Auxiliary Feature Register 0 on page D7-1993
ID_AA64AFR1_EL1  ID_AA64AFR1_EL1, AArch64 Auxiliary Feature Register 1 on page D7-1995
ID_AA64DFR0_EL1  ID_AA64DFR0_EL1, AArch64 Debug Feature Register 0 on page D7-1996
ID_AA64DFR1_EL1  ID_AA64DFR1_EL1, AArch64 Debug Feature Register 1 on page D7-1998
ID_AA64ISAR0_EL1  ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0 on page D7-1999
ID_AA64ISAR1_EL1  ID_AA64ISAR1_EL1, AArch64 Instruction Set Attribute Register 1 on page D7-2001
ID_AA64MMFR0_EL1  ID_AA64MMFR0_EL1, AArch64 Memory Model Feature Register 0 on page D7-2002
ID_AA64MMFR1_EL1  ID_AA64MMFR1_EL1, AArch64 Memory Model Feature Register 1 on page D7-2005
ID_AA64PFR0_EL1  ID_AA64PFR0_EL1, AArch64 Processor Feature Register 0 on page D7-2006
ID_AA64PFR1_EL1  ID_AA64PFR1_EL1, AArch64 Processor Feature Register 1 on page D7-2008
ID_AFR0_EL1  ID_AFR0_EL1, AArch32 Auxiliary Feature Register 0 on page D7-2009
ID_DFR0_EL1  ID_DFR0_EL1, AArch32 Debug Feature Register 0 on page D7-2011
ID_ISAR0_EL1  ID_ISAR0_EL1, AArch32 Instruction Set Attribute Register 0 on page D7-2014
ID_ISAR1_EL1  ID_ISAR1_EL1, AArch32 Instruction Set Attribute Register 1 on page D7-2017
ID_ISAR2_EL1  ID_ISAR2_EL1, AArch32 Instruction Set Attribute Register 2 on page D7-2020
ID_ISAR3_EL1  ID_ISAR3_EL1, AArch32 Instruction Set Attribute Register 3 on page D7-2023
ID_ISAR4_EL1  ID_ISAR4_EL1, AArch32 Instruction Set Attribute Register 4 on page D7-2026
ID_ISAR5_EL1  ID_ISAR5_EL1, AArch32 Instruction Set Attribute Register 5 on page D7-2029
ID_MMFR0_EL1  ID_MMFR0_EL1, AArch32 Memory Model Feature Register 0 on page D7-2031
ID_MMFR1_EL1  ID_MMFR1_EL1, AArch32 Memory Model Feature Register 1 on page D7-2034
ID_MMFR2_EL1  ID_MMFR2_EL1, AArch32 Memory Model Feature Register 2 on page D7-2038
ID_MMFR3_EL1  ID_MMFR3_EL1, AArch32 Memory Model Feature Register 3 on page D7-2041
ID_MMFR4_EL1  ID_MMFR4_EL1, AArch32 Memory Model Feature Register 4 on page D7-2044
ID_PFR0_EL1  ID_PFR0_EL1, AArch32 Processor Feature Register 0 on page D7-2046
ID_PFR1_EL1  ID_PFR1_EL1, AArch32 Processor Feature Register 1 on page D7-2048
MIDR_EL1  MIDR_EL1, Main ID Register on page D7-2063
MPIDR_EL1  MPIDR_EL1, Multiprocessor Affinity Register on page D7-2065
MVFR0_EL1  MVFR0_EL1, AArch32 Media and VFP Feature Register 0 on page D7-2067
MVFR1_EL1  MVFR1_EL1, AArch32 Media and VFP Feature Register 1 on page D7-2070
MVFR2_EL1  MVFR2_EL1, AArch32 Media and VFP Feature Register 2 on page D7-2073
REVIDR_EL1  REVIDR_EL1, Revision ID Register on page D7-2079
VMPIDR_EL2  VMPIDR_EL2, Virtualization Multiprocessor ID Register on page D7-2138
VPIDR_EL2 VPIDR_EL2, Virtualization Processor ID Register on page D7-2140 
K12.3.4 Performance monitors registers 


This section is an index to the registers in the Performance Monitors registers functional group. 


Table K12-9 Performance monitors registers


Register Description, see

PMCCFILTR_EL0 PMCCFILTR_EL0, Performance Monitors Cycle Count Filter Register on page D7-2216
PMCCNTR_EL0 PMCCNTR_EL0, Performance Monitors Cycle Count Register on page D7-2218
PMCEID0_EL0 PMCEID0_EL0, Performance Monitors Common Event Identification register 0 on page D7-2220
PMCEID1_EL0 PMCEID1_EL0, Performance Monitors Common Event Identification register 1 on page D7-2222
PMCNTENCLR_EL0 PMCNTENCLR_EL0, Performance Monitors Count Enable Clear register on page D7-2224
PMCNTENSET_EL0 PMCNTENSET_EL0, Performance Monitors Count Enable Set register on page D7-2226
PMCR_EL0 PMCR_EL0, Performance Monitors Control Register on page D7-2228
PMEVCNTR<n>_EL0 PMEVCNTR<n>_EL0, Performance Monitors Event Count Registers, n = 0 - 30 on page D7-2231
PMEVTYPER<n>_EL0 PMEVTYPER<n>_EL0, Performance Monitors Event Type Registers, n = 0 - 30 on page D7-2233
PMINTENCLR_EL1 PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register on page D7-2237
PMINTENSET_EL1 PMINTENSET_EL1, Performance Monitors Interrupt Enable Set register on page D7-2239
PMOVSCLR_EL0 PMOVSCLR_EL0, Performance Monitors Overflow Flag Status Clear Register on page D7-2241
PMOVSSET_EL0 PMOVSSET_EL0, Performance Monitors Overflow Flag Status Set register on page D7-2243
PMSELR_EL0 PMSELR_EL0, Performance Monitors Event Counter Selection Register on page D7-2245
PMSWINC_EL0 PMSWINC_EL0, Performance Monitors Software Increment register on page D7-2247
PMUSERENR_EL0 PMUSERENR_EL0, Performance Monitors User Enable Register on page D7-2249
PMXEVCNTR_EL0 PMXEVCNTR_EL0, Performance Monitors Selected Event Count Register on page D7-2251
PMXEVTYPER_EL0 PMXEVTYPER_EL0, Performance Monitors Selected Event Type Register on page D7-2253





K12.3.5 Debug registers


This section is an index to the registers in the Debug registers functional group. 


Table K12-10 Debug registers


Register Description, see

DBGAUTHSTATUS_EL1 DBGAUTHSTATUS_EL1, Debug Authentication Status register on page D7-2148
DBGBCR<n>_EL1 DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15 on page D7-2150
DBGBVR<n>_EL1 DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 on page D7-2153
DBGCLAIMCLR_EL1 DBGCLAIMCLR_EL1, Debug Claim Tag Clear register on page D7-2156
DBGCLAIMSET_EL1 DBGCLAIMSET_EL1, Debug Claim Tag Set register on page D7-2158
DBGDTR_EL0 DBGDTR_EL0, Debug Data Transfer Register, half-duplex on page D7-2160
DBGDTRRX_EL0 DBGDTRRX_EL0, Debug Data Transfer Register, Receive on page D7-2162
DBGDTRTX_EL0 DBGDTRTX_EL0, Debug Data Transfer Register, Transmit on page D7-2164
DBGPRCR_EL1 DBGPRCR_EL1, Debug Power Control Register on page D7-2166
DBGVCR32_EL2 DBGVCR32_EL2, Debug Vector Catch Register on page D7-2168
DBGWCR<n>_EL1 DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15 on page D7-2172
DBGWVR<n>_EL1 DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15 on page D7-2175
DLR_EL0 DLR_EL0, Debug Link Register on page D7-2177
DSPSR_EL0 DSPSR_EL0, Debug Saved Program Status Register on page D7-2178
MDCCINT_EL1 MDCCINT_EL1, Monitor DCC Interrupt Enable Register on page D7-2183
MDCCSR_EL0 MDCCSR_EL0, Monitor DCC Status Register on page D7-2185
MDCR_EL2 MDCR_EL2, Monitor Debug Configuration Register (EL2) on page D7-2187
MDCR_EL3 MDCR_EL3, Monitor Debug Configuration Register (EL3) on page D7-2191
MDRAR_EL1 MDRAR_EL1, Monitor Debug ROM Address Register on page D7-2195
MDSCR_EL1 MDSCR_EL1, Monitor Debug System Control Register on page D7-2197
OSDLR_EL1 OSDLR_EL1, OS Double Lock Register on page D7-2201
OSDTRRX_EL1 OSDTRRX_EL1, OS Lock Data Transfer Register, Receive on page D7-2203
OSDTRTX_EL1 OSDTRTX_EL1, OS Lock Data Transfer Register, Transmit on page D7-2205
OSECCR_EL1 OSECCR_EL1, OS Lock Exception Catch Control Register on page D7-2207
OSLAR_EL1 OSLAR_EL1, OS Lock Access Register on page D7-2209
OSLSR_EL1 OSLSR_EL1, OS Lock Status Register on page D7-2211
SDER32_EL3 SDER32_EL3, AArch32 Secure Debug Enable Register on page D7-2213





K12.3.6 Generic timer registers 


This section is an index to the registers in the Generic Timer registers functional group. 


Table K12-11 Generic timer registers


Register Description, see

CNTFRQ_EL0 CNTFRQ_EL0, Counter-timer Frequency register on page D7-2256
CNTHCTL_EL2 CNTHCTL_EL2, Counter-timer Hypervisor Control register on page D7-2258
CNTHP_CTL_EL2 CNTHP_CTL_EL2, Counter-timer Hypervisor Physical Timer Control register on page D7-2260
CNTHP_CVAL_EL2 CNTHP_CVAL_EL2, Counter-timer Hypervisor Physical Timer CompareValue register on page D7-2262
CNTHP_TVAL_EL2 CNTHP_TVAL_EL2, Counter-timer Hypervisor Physical Timer TimerValue register on page D7-2263
CNTKCTL_EL1 CNTKCTL_EL1, Counter-timer Kernel Control register on page D7-2264
CNTP_CTL_EL0 CNTP_CTL_EL0, Counter-timer Physical Timer Control register on page D7-2267
CNTP_CVAL_EL0 CNTP_CVAL_EL0, Counter-timer Physical Timer CompareValue register on page D7-2269
CNTP_TVAL_EL0 CNTP_TVAL_EL0, Counter-timer Physical Timer TimerValue register on page D7-2270
CNTPCT_EL0 CNTPCT_EL0, Counter-timer Physical Count register on page D7-2272
CNTPS_CTL_EL1 CNTPS_CTL_EL1, Counter-timer Physical Secure Timer Control register on page D7-2273
CNTPS_CVAL_EL1 CNTPS_CVAL_EL1, Counter-timer Physical Secure Timer CompareValue register on page D7-2275
CNTPS_TVAL_EL1 CNTPS_TVAL_EL1, Counter-timer Physical Secure Timer TimerValue register on page D7-2276
CNTV_CTL_EL0 CNTV_CTL_EL0, Counter-timer Virtual Timer Control register on page D7-2277
CNTV_CVAL_EL0 CNTV_CVAL_EL0, Counter-timer Virtual Timer CompareValue register on page D7-2279
CNTV_TVAL_EL0 CNTV_TVAL_EL0, Counter-timer Virtual Timer TimerValue register on page D7-2280
CNTVCT_EL0 CNTVCT_EL0, Counter-timer Virtual Count register on page D7-2282
CNTVOFF_EL2 CNTVOFF_EL2, Counter-timer Virtual Offset register on page D7-2283





K12.3.7 Cache maintenance system instructions 


This section is an index to the registers in the Cache maintenance instructions functional group. 


Table K12-12 Cache maintenance system instructions


Register Description, see

DC CISW DC CISW, Data or unified Cache line Clean and Invalidate by Set/Way on page C5-348
DC CIVAC DC CIVAC, Data or unified Cache line Clean and Invalidate by VA to PoC on page C5-350
DC CSW DC CSW, Data or unified Cache line Clean by Set/Way on page C5-351
DC CVAC DC CVAC, Data or unified Cache line Clean by VA to PoC on page C5-353
DC CVAU DC CVAU, Data or unified Cache line Clean by VA to PoU on page C5-354
DC ISW DC ISW, Data or unified Cache line Invalidate by Set/Way on page C5-355
DC IVAC DC IVAC, Data or unified Cache line Invalidate by VA to PoC on page C5-357
DC ZVA DC ZVA, Data Cache Zero by VA on page C5-359
IC IALLU IC IALLU, Instruction Cache Invalidate All to PoU on page C5-361
IC IALLUIS IC IALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable on page C5-362
IC IVAU IC IVAU, Instruction Cache line Invalidate by VA to PoU on page C5-363





K12.3.8 Address translation system instructions 


This section is an index to the registers in the Address translation instructions functional group. 


Table K12-13 Address translation system instructions


Register Description, see

AT S12E0R AT S12E0R, Address Translate Stages 1 and 2 EL0 Read on page C5-366
AT S12E0W AT S12E0W, Address Translate Stages 1 and 2 EL0 Write on page C5-367
AT S12E1R AT S12E1R, Address Translate Stages 1 and 2 EL1 Read on page C5-368
AT S12E1W AT S12E1W, Address Translate Stages 1 and 2 EL1 Write on page C5-369
AT S1E0R AT S1E0R, Address Translate Stage 1 EL0 Read on page C5-370
AT S1E0W AT S1E0W, Address Translate Stage 1 EL0 Write on page C5-371
AT S1E1R AT S1E1R, Address Translate Stage 1 EL1 Read on page C5-372
AT S1E1W AT S1E1W, Address Translate Stage 1 EL1 Write on page C5-373
AT S1E2R AT S1E2R, Address Translate Stage 1 EL2 Read on page C5-374
AT S1E2W AT S1E2W, Address Translate Stage 1 EL2 Write on page C5-375
AT S1E3R AT S1E3R, Address Translate Stage 1 EL3 Read on page C5-376
AT S1E3W AT S1E3W, Address Translate Stage 1 EL3 Write on page C5-377





K12.3.9 TLB maintenance system instructions 


This section is an index to the registers in the TLB maintenance instructions functional group. 


Table K12-14 TLB maintenance system instructions


Register Description, see

TLBI ALLE1 TLBI ALLE1, TLB Invalidate All, EL1 on page C5-379
TLBI ALLE1IS TLBI ALLE1IS, TLB Invalidate All, EL1, Inner Shareable on page C5-380
TLBI ALLE2 TLBI ALLE2, TLB Invalidate All, EL2 on page C5-381
TLBI ALLE2IS TLBI ALLE2IS, TLB Invalidate All, EL2, Inner Shareable on page C5-382
TLBI ALLE3 TLBI ALLE3, TLB Invalidate All, EL3 on page C5-383
TLBI ALLE3IS TLBI ALLE3IS, TLB Invalidate All, EL3, Inner Shareable on page C5-384
TLBI ASIDE1 TLBI ASIDE1, TLB Invalidate by ASID, EL1 on page C5-385
TLBI ASIDE1IS TLBI ASIDE1IS, TLB Invalidate by ASID, EL1, Inner Shareable on page C5-387
TLBI IPAS2E1 TLBI IPAS2E1, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1 on page C5-389
TLBI IPAS2E1IS TLBI IPAS2E1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, EL1, Inner Shareable on page C5-390
TLBI IPAS2LE1 TLBI IPAS2LE1, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1 on page C5-392
TLBI IPAS2LE1IS TLBI IPAS2LE1IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, EL1, Inner Shareable on page C5-393
TLBI VAAE1 TLBI VAAE1, TLB Invalidate by VA, All ASID, EL1 on page C5-395
TLBI VAAE1IS TLBI VAAE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable on page C5-397
TLBI VAALE1 TLBI VAALE1, TLB Invalidate by VA, All ASID, Last level, EL1 on page C5-399
TLBI VAALE1IS TLBI VAALE1IS, TLB Invalidate by VA, All ASID, EL1, Inner Shareable on page C5-401
TLBI VAE1 TLBI VAE1, TLB Invalidate by VA, EL1 on page C5-403
TLBI VAE1IS TLBI VAE1IS, TLB Invalidate by VA, EL1, Inner Shareable on page C5-405
TLBI VAE2 TLBI VAE2, TLB Invalidate by VA, EL2 on page C5-407
TLBI VAE2IS TLBI VAE2IS, TLB Invalidate by VA, EL2, Inner Shareable on page C5-409
TLBI VAE3 TLBI VAE3, TLB Invalidate by VA, EL3 on page C5-411
TLBI VAE3IS TLBI VAE3IS, TLB Invalidate by VA, EL3, Inner Shareable on page C5-413
TLBI VALE1 TLBI VALE1, TLB Invalidate by VA, Last level, EL1 on page C5-415
TLBI VALE1IS TLBI VALE1IS, TLB Invalidate by VA, Last level, EL1, Inner Shareable on page C5-417
TLBI VALE2 TLBI VALE2, TLB Invalidate by VA, Last level, EL2 on page C5-419
TLBI VALE2IS TLBI VALE2IS, TLB Invalidate by VA, Last level, EL2, Inner Shareable on page C5-421
TLBI VALE3 TLBI VALE3, TLB Invalidate by VA, Last level, EL3 on page C5-423
TLBI VALE3IS TLBI VALE3IS, TLB Invalidate by VA, Last level, EL3, Inner Shareable on page C5-425
TLBI VMALLE1 TLBI VMALLE1, TLB Invalidate by VMID, All at stage 1, EL1 on page C5-427
TLBI VMALLE1IS TLBI VMALLE1IS, TLB Invalidate by VMID, All at stage 1, EL1, Inner Shareable on page C5-428
TLBI VMALLS12E1 TLBI VMALLS12E1, TLB Invalidate by VMID, All at Stage 1 and 2, EL1 on page C5-429
TLBI VMALLS12E1IS TLBI VMALLS12E1IS, TLB Invalidate by VMID, All at Stage 1 and 2, EL1, Inner Shareable on page C5-430





K12.3.10 Base system registers 


This section is an index to the registers in the Base system registers functional group.


Table K12-15 Base system registers


Register Description, see

ACTLR_EL1 ACTLR_EL1, Auxiliary Control Register (EL1) on page D7-1896
ACTLR_EL2 ACTLR_EL2, Auxiliary Control Register (EL2) on page D7-1897
ACTLR_EL3 ACTLR_EL3, Auxiliary Control Register (EL3) on page D7-1898
AFSR0_EL1 AFSR0_EL1, Auxiliary Fault Status Register 0 (EL1) on page D7-1899
AFSR0_EL2 AFSR0_EL2, Auxiliary Fault Status Register 0 (EL2) on page D7-1900
AFSR0_EL3 AFSR0_EL3, Auxiliary Fault Status Register 0 (EL3) on page D7-1901
AFSR1_EL1 AFSR1_EL1, Auxiliary Fault Status Register 1 (EL1) on page D7-1902
AFSR1_EL2 AFSR1_EL2, Auxiliary Fault Status Register 1 (EL2) on page D7-1903
AFSR1_EL3 AFSR1_EL3, Auxiliary Fault Status Register 1 (EL3) on page D7-1904
CPACR_EL1 CPACR_EL1, Architectural Feature Access Control Register on page D7-1916
CPTR_EL2 CPTR_EL2, Architectural Feature Trap Register (EL2) on page D7-1918
CPTR_EL3 CPTR_EL3, Architectural Feature Trap Register (EL3) on page D7-1920
CurrentEL CurrentEL, Current Exception Level on page C5-294
DAIF DAIF, Interrupt Mask Bits on page C5-296
ESR_EL1 ESR_EL1, Exception Syndrome Register (EL1) on page D7-1930
ESR_EL2 ESR_EL2, Exception Syndrome Register (EL2) on page D7-1931
ESR_EL3 ESR_EL3, Exception Syndrome Register (EL3) on page D7-1932
ESR_ELx ESR_ELx, Exception Syndrome Register (ELx) on page D7-1933
FAR_EL1 FAR_EL1, Fault Address Register (EL1) on page D7-1965
FAR_EL2 FAR_EL2, Fault Address Register (EL2) on page D7-1967
FAR_EL3 FAR_EL3, Fault Address Register (EL3) on page D7-1969
FPEXC32_EL2 FPEXC32_EL2, Floating-Point Exception Control register on page D7-1971
HACR_EL2 HACR_EL2, Hypervisor Auxiliary Control Register on page D7-1976
HCR_EL2 HCR_EL2, Hypervisor Configuration Register on page D7-1977
HPFAR_EL2 HPFAR_EL2, Hypervisor IPA Fault Address Register on page D7-1989
HSTR_EL2 HSTR_EL2, Hypervisor System Trap Register on page D7-1991
IFSR32_EL2 IFSR32_EL2, Instruction Fault Status Register (EL2) on page D7-2051
ISR_EL1 ISR_EL1, Interrupt Status Register on page D7-2055
NZCV NZCV, Condition Flags on page C5-311
PAR_EL1 PAR_EL1, Physical Address Register on page D7-2075
RMR_EL1 RMR_EL1, Reset Management Register (if EL2 and EL3 not implemented) on page D7-2080
RMR_EL2 RMR_EL2, Reset Management Register (if EL2 implemented and EL3 not implemented) on page D7-2082
RMR_EL3 RMR_EL3, Reset Management Register (if EL3 implemented) on page D7-2084
RVBAR_EL1 RVBAR_EL1, Reset Vector Base Address Register (if EL2 and EL3 not implemented) on page D7-2086
RVBAR_EL2 RVBAR_EL2, Reset Vector Base Address Register (if EL3 not implemented) on page D7-2087
RVBAR_EL3 RVBAR_EL3, Reset Vector Base Address Register (if EL3 implemented) on page D7-2088
S3_<op1>_<Cn>_<Cm>_<op2> S3_<op1>_<Cn>_<Cm>_<op2>, IMPLEMENTATION DEFINED registers on page D7-2089
SCR_EL3 SCR_EL3, Secure Configuration Register on page D7-2090
SCTLR_EL1 SCTLR_EL1, System Control Register (EL1) on page D7-2094
SCTLR_EL2 SCTLR_EL2, System Control Register (EL2) on page D7-2101
SCTLR_EL3 SCTLR_EL3, System Control Register (EL3) on page D7-2105
SPSel SPSel, Stack Pointer Select on page C5-319
TPIDR_EL0 TPIDR_EL0, EL0 Read/Write Software Thread ID Register on page D7-2120
TPIDR_EL1 TPIDR_EL1, EL1 Software Thread ID Register on page D7-2121
TPIDR_EL2 TPIDR_EL2, EL2 Software Thread ID Register on page D7-2122
TPIDR_EL3 TPIDR_EL3, EL3 Software Thread ID Register on page D7-2123
TPIDRRO_EL0 TPIDRRO_EL0, EL0 Read-Only Software Thread ID Register on page D7-2124
VBAR_EL1 VBAR_EL1, Vector Base Address Register (EL1) on page D7-2133
VBAR_EL2 VBAR_EL2, Vector Base Address Register (EL2) on page D7-2135
VBAR_EL3 VBAR_EL3, Vector Base Address Register (EL3) on page D7-2137


K12.4 Alphabetical index of AArch32 registers and system instructions 
This section is an index of AArch32 registers and system instructions in alphabetical order. 
Table K12-16 Alphabetical index of AArch32 Registers 
Register Description, see 
ACTLR ACTLR, Auxiliary Control Register on page G6-4232
ACTLR2 ACTLR2, Auxiliary Control Register 2 on page G6-4234
ADFSR ADFSR, Auxiliary Data Fault Status Register on page G6-4236
AIDR AIDR, Auxiliary ID Register on page G6-4238
AIFSR AIFSR, Auxiliary Instruction Fault Status Register on page G6-4240
AMAIR0 AMAIR0, Auxiliary Memory Attribute Indirection Register 0 on page G6-4242
AMAIR1 AMAIR1, Auxiliary Memory Attribute Indirection Register 1 on page G6-4244
APSR APSR, Application Program Status Register on page G6-4246
ATS12NSOPR ATS12NSOPR, Address Translate Stages 1 and 2 Non-secure Only PL1 Read on page G6-4248
ATS12NSOPW ATS12NSOPW, Address Translate Stages 1 and 2 Non-secure Only PL1 Write on page G6-4250
ATS12NSOUR ATS12NSOUR, Address Translate Stages 1 and 2 Non-secure Only Unprivileged Read on page G6-4252
ATS12NSOUW ATS12NSOUW, Address Translate Stages 1 and 2 Non-secure Only Unprivileged Write on page G6-4254
ATS1CPR ATS1CPR, Address Translate Stage 1 Current state PL1 Read on page G6-4256
ATS1CPW ATS1CPW, Address Translate Stage 1 Current state PL1 Write on page G6-4258
ATS1CUR ATS1CUR, Address Translate Stage 1 Current state Unprivileged Read on page G6-4260
ATS1CUW ATS1CUW, Address Translate Stage 1 Current state Unprivileged Write on page G6-4262
ATS1HR ATS1HR, Address Translate Stage 1 Hyp mode Read on page G6-4264
ATS1HW ATS1HW, Address Translate Stage 1 Hyp mode Write on page G6-4266
BPIALL BPIALL, Branch Predictor Invalidate All on page G6-4268
BPIALLIS BPIALLIS, Branch Predictor Invalidate All, Inner Shareable on page G6-4269
BPIMVA BPIMVA, Branch Predictor Invalidate by VA on page G6-4270
CCSIDR CCSIDR, Current Cache Size ID Register on page G6-4272
CLIDR CLIDR, Cache Level ID Register on page G6-4274
CNTFRQ CNTFRQ, Counter-timer Frequency register on page G6-4804
CNTHCTL CNTHCTL, Counter-timer Hyp Control register on page G6-4806
CNTHP_CTL CNTHP_CTL, Counter-timer Hyp Physical Timer Control register on page G6-4808
CNTHP_CVAL CNTHP_CVAL, Counter-timer Hyp Physical CompareValue register on page G6-4810
CNTHP_TVAL CNTHP_TVAL, Counter-timer Hyp Physical Timer TimerValue register on page G6-4812
CNTKCTL CNTKCTL, Counter-timer Kernel Control register on page G6-4814
CNTP_CTL CNTP_CTL, Counter-timer Physical Timer Control register on page G6-4817
CNTP_CVAL CNTP_CVAL, Counter-timer Physical Timer CompareValue register on page G6-4820
CNTP_TVAL CNTP_TVAL, Counter-timer Physical Timer TimerValue register on page G6-4822
CNTPCT CNTPCT, Counter-timer Physical Count register on page G6-4824
CNTV_CTL CNTV_CTL, Counter-timer Virtual Timer Control register on page G6-4826
CNTV_CVAL CNTV_CVAL, Counter-timer Virtual Timer CompareValue register on page G6-4828
CNTV_TVAL CNTV_TVAL, Counter-timer Virtual Timer TimerValue register on page G6-4830
CNTVCT CNTVCT, Counter-timer Virtual Count register on page G6-4832
CNTVOFF CNTVOFF, Counter-timer Virtual Offset register on page G6-4834
CONTEXTIDR CONTEXTIDR, Context ID Register on page G6-4276
CP15DMB CP15DMB, Data Memory Barrier System instruction on page G6-4278
CP15DSB CP15DSB, Data Synchronization Barrier System instruction on page G6-4280
CP15ISB CP15ISB, Instruction Synchronization Barrier System instruction on page G6-4282
CPACR CPACR, Architectural Feature Access Control Register on page G6-4284
CPSR CPSR, Current Program Status Register on page G6-4288
CSSELR CSSELR, Cache Size Selection Register on page G6-4291
CTR CTR, Cache Type Register on page G6-4293
DACR DACR, Domain Access Control Register on page G6-4296
DBGAUTHSTATUS DBGAUTHSTATUS, Debug Authentication Status register on page G6-4669
DBGBCR<n> DBGBCR<n>, Debug Breakpoint Control Registers, n = 0 - 15 on page G6-4672
DBGBVR<n> DBGBVR<n>, Debug Breakpoint Value Registers, n = 0 - 15 on page G6-4676
DBGBXVR<n> DBGBXVR<n>, Debug Breakpoint Extended Value Registers, n = 0 - 15 on page G6-4679
DBGCLAIMCLR DBGCLAIMCLR, Debug Claim Tag Clear register on page G6-4681
DBGCLAIMSET DBGCLAIMSET, Debug Claim Tag Set register on page G6-4683
DBGDCCINT DBGDCCINT, DCC Interrupt Enable Register on page G6-4685
DBGDEVID DBGDEVID, Debug Device ID register 0 on page G6-4687
DBGDEVID1 DBGDEVID1, Debug Device ID register 1 on page G6-4690
DBGDEVID2 DBGDEVID2, Debug Device ID register 2 on page G6-4692
DBGDIDR DBGDIDR, Debug ID Register on page G6-4694
DBGDRAR DBGDRAR, Debug ROM Address Register on page G6-4697
DBGDSAR DBGDSAR, Debug Self Address Register on page G6-4700
DBGDSCRext DBGDSCRext, Debug Status and Control Register, External View on page G6-4702
DBGDSCRint DBGDSCRint, Debug Status and Control Register, Internal View on page G6-4707
DBGDTRRXext DBGDTRRXext, Debug OS Lock Data Transfer Register, Receive, External View on page G6-4710
DBGDTRRXint DBGDTRRXint, Debug Data Transfer Register, Receive on page G6-4712
DBGDTRTXext DBGDTRTXext, Debug OS Lock Data Transfer Register, Transmit on page G6-4714
DBGDTRTXint DBGDTRTXint, Debug Data Transfer Register, Transmit on page G6-4716
DBGOSDLR DBGOSDLR, Debug OS Double Lock Register on page G6-4718
DBGOSECCR DBGOSECCR, Debug OS Lock Exception Catch Control Register on page G6-4720
DBGOSLAR DBGOSLAR, Debug OS Lock Access Register on page G6-4722
DBGOSLSR DBGOSLSR, Debug OS Lock Status Register on page G6-4724
DBGPRCR DBGPRCR, Debug Power Control Register on page G6-4726
DBGVCR DBGVCR, Debug Vector Catch Register on page G6-4728
DBGWCR<n> DBGWCR<n>, Debug Watchpoint Control Registers, n = 0 - 15 on page G6-4735
DBGWFAR DBGWFAR, Debug Watchpoint Fault Address Register on page G6-4739
DBGWVR<n> DBGWVR<n>, Debug Watchpoint Value Registers, n = 0 - 15 on page G6-4741
DCCIMVAC DCCIMVAC, Data Cache line Clean and Invalidate by VA to PoC on page G6-4298
DCCISW DCCISW, Data Cache line Clean and Invalidate by Set/Way on page G6-4300
DCCMVAC DCCMVAC, Data Cache line Clean by VA to PoC on page G6-4302
DCCMVAU DCCMVAU, Data Cache line Clean by VA to PoU on page G6-4304
DCCSW DCCSW, Data Cache line Clean by Set/Way on page G6-4306
DCIMVAC DCIMVAC, Data Cache line Invalidate by VA to PoC on page G6-4308
DCISW DCISW, Data Cache line Invalidate by Set/Way on page G6-4310
DFAR DFAR, Data Fault Address Register on page G6-4312
DFSR DFSR, Data Fault Status Register on page G6-4314
DLR DLR, Debug Link Register on page G6-4743
DSPSR DSPSR, Debug Saved Program Status Register on page G6-4745
DTLBIALL DTLBIALL, Data TLB Invalidate All on page G6-4320
DTLBIASID DTLBIASID, Data TLB Invalidate by ASID match on page G6-4322
DTLBIMVA DTLBIMVA, Data TLB Invalidate by VA on page G6-4324
ELR_hyp ELR_hyp, Exception Link Register (Hyp mode) on page G6-4326
FCSEIDR FCSEIDR, FCSE Process ID register on page G6-4328
FPEXC FPEXC, Floating-Point Exception Control register on page G6-4330
FPSCR FPSCR, Floating-Point Status and Control Register on page G6-4335
FPSID FPSID, Floating-Point System ID register on page G6-4341
HACR HACR, Hyp Auxiliary Configuration Register on page G6-4344
HACTLR HACTLR, Hyp Auxiliary Control Register on page G6-4346
HACTLR2 HACTLR2, Hyp Auxiliary Control Register 2 on page G6-4348
HADFSR HADFSR, Hyp Auxiliary Data Fault Status Register on page G6-4350
HAIFSR HAIFSR, Hyp Auxiliary Instruction Fault Status Register on page G6-4352
HAMAIR0 HAMAIR0, Hyp Auxiliary Memory Attribute Indirection Register 0 on page G6-4354
HAMAIR1 HAMAIR1, Hyp Auxiliary Memory Attribute Indirection Register 1 on page G6-4356
HCPTR HCPTR, Hyp Architectural Feature Trap Register on page G6-4358
HCR HCR, Hyp Configuration Register on page G6-4362
HCR2 HCR2, Hyp Configuration Register 2 on page G6-4372
HDCR HDCR, Hyp Debug Control Register on page G6-4749
HDFAR HDFAR, Hyp Data Fault Address Register on page G6-4374
HIFAR HIFAR, Hyp Instruction Fault Address Register on page G6-4376
HMAIR0 HMAIR0, Hyp Memory Attribute Indirection Register 0 on page G6-4378
HMAIR1 HMAIR1, Hyp Memory Attribute Indirection Register 1 on page G6-4381
HPFAR HPFAR, Hyp IPA Fault Address Register on page G6-4384
HRMR HRMR, Hyp Reset Management Register on page G6-4386
HSCTLR HSCTLR, Hyp System Control Register on page G6-4388
HSR HSR, Hyp Syndrome Register on page G6-4393
HSTR HSTR, Hyp System Trap Register on page G6-4413
HTCR HTCR, Hyp Translation Control Register on page G6-4415
HTPIDR HTPIDR, Hyp Software Thread ID Register on page G6-4418
HTTBR HTTBR, Hyp Translation Table Base Register on page G6-4420
HVBAR HVBAR, Hyp Vector Base Address Register on page G6-4422
ICIALLU ICIALLU, Instruction Cache Invalidate All to PoU on page G6-4424
ICIALLUIS ICIALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable on page G6-4426
ICIMVAU ICIMVAU, Instruction Cache line Invalidate by VA to PoU on page G6-4428
ID_AFR0 ID_AFR0, Auxiliary Feature Register 0 on page G6-4430
ID_DFR0 ID_DFR0, Debug Feature Register 0 on page G6-4432
ID_ISAR0 ID_ISAR0, Instruction Set Attribute Register 0 on page G6-4435
ID_ISAR1 ID_ISAR1, Instruction Set Attribute Register 1 on page G6-4438
ID_ISAR2 ID_ISAR2, Instruction Set Attribute Register 2 on page G6-4441
ID_ISAR3 ID_ISAR3, Instruction Set Attribute Register 3 on page G6-4444
ID_ISAR4 ID_ISAR4, Instruction Set Attribute Register 4 on page G6-4447
ID_ISAR5 ID_ISAR5, Instruction Set Attribute Register 5 on page G6-4450
ID_MMFR0 ID_MMFR0, Memory Model Feature Register 0 on page G6-4452
ID_MMFR1 ID_MMFR1, Memory Model Feature Register 1 on page G6-4455
ID_MMFR2 ID_MMFR2, Memory Model Feature Register 2 on page G6-4459
ID_MMFR3 ID_MMFR3, Memory Model Feature Register 3 on page G6-4463
ID_MMFR4 ID_MMFR4, Memory Model Feature Register 4 on page G6-4466
ID_PFR0 ID_PFR0, Processor Feature Register 0 on page G6-4468
ID_PFR1 ID_PFR1, Processor Feature Register 1 on page G6-4470
IFAR IFAR, Instruction Fault Address Register on page G6-4474
IFSR IFSR, Instruction Fault Status Register on page G6-4476
ISR ISR, Interrupt Status Register on page G6-4481
ITLBIALL ITLBIALL, Instruction TLB Invalidate All on page G6-4483
ITLBIASID ITLBIASID, Instruction TLB Invalidate by ASID match on page G6-4485
ITLBIMVA ITLBIMVA, Instruction TLB Invalidate by VA on page G6-4487
JIDR JIDR, Jazelle ID Register on page G6-4489
JMCR JMCR, Jazelle Main Configuration Register on page G6-4491
JOSCR JOSCR, Jazelle OS Control Register on page G6-4493
MAIR0 MAIR0, Memory Attribute Indirection Register 0 on page G6-4495
MAIR1 MAIR1, Memory Attribute Indirection Register 1 on page G6-4498
MIDR MIDR, Main ID Register on page G6-4501
MPIDR MPIDR, Multiprocessor Affinity Register on page G6-4504
MVBAR MVBAR, Monitor Vector Base Address Register on page G6-4506
MVFR0 MVFR0, Media and VFP Feature Register 0 on page G6-4508
MVFR1 MVFR1, Media and VFP Feature Register 1 on page G6-4512
MVFR2 MVFR2, Media and VFP Feature Register 2 on page G6-4515
NMRR NMRR, Normal Memory Remap Register on page G6-4517
NSACR NSACR, Non-Secure Access Control Register on page G6-4520
PAR PAR, Physical Address Register on page G6-4524
PMCCFILTR PMCCFILTR, Performance Monitors Cycle Count Filter Register on page G6-4759
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K12.4 Alphabetical index of AArch32 registers and system instructions 


Table K12-16 Alphabetical index of AArch32 Registers (continued) 

Register Description, see 

PMCCNTR PMCCNTR, Performance Monitors Cycle Count Register on page G6-4762 

PMCEID0 PMCEID0, Performance Monitors Common Event Identification register 0 on page G6-4765
PMCEID1 PMCEID1, Performance Monitors Common Event Identification register 1 on page G6-4767 
PMCNTENCLR PMCNTENCLR, Performance Monitors Count Enable Clear register on page G6-4769 
PMCNTENSET PMCNTENSET, Performance Monitors Count Enable Set register on page G6-4771 

PMCR PMCR, Performance Monitors Control Register on page G6-4773 

PMEVCNTR<n> PMEVCNTR<n>, Performance Monitors Event Count Registers, n = 0 - 30 on page G6-4777 
PMEVTYPER<n> PMEVTYPER<n>, Performance Monitors Event Type Registers, n = 0 - 30 on page G6-4779 
PMINTENCLR PMINTENCLR, Performance Monitors Interrupt Enable Clear register on page G6-4783 
PMINTENSET PMINTENSET, Performance Monitors Interrupt Enable Set register on page G6-4785 
PMOVSR PMOVSR, Performance Monitors Overflow Flag Status Register on page G6-4787 
PMOVSSET PMOVSSET, Performance Monitors Overflow Flag Status Set register on page G6-4789 
PMSELR PMSELR, Performance Monitors Event Counter Selection Register on page G6-4791 
PMSWINC PMSWINC, Performance Monitors Software Increment register on page G6-4794 
PMUSERENR PMUSERENR, Performance Monitors User Enable Register on page G6-4796 
PMXEVCNTR PMXEVCNTR, Performance Monitors Selected Event Count Register on page G6-4799 
PMXEVTYPER PMXEVTYPER, Performance Monitors Selected Event Type Register on page G6-4801 
PRRR PRRR, Primary Region Remap Register on page G6-4531 

REVIDR REVIDR, Revision ID Register on page G6-4534 

RMR (at EL1) RMR (at EL1), Reset Management Register on page G6-4536 

RMR (at EL3) RMR (at EL3), Reset Management Register on page G6-4538 

RVBAR RVBAR, Reset Vector Base Address Register on page G6-4540 

SCR SCR, Secure Configuration Register on page G6-4542 

SCTLR SCTLR, System Control Register on page G6-4547 

SDCR SDCR, Secure Debug Control Register on page G6-4753 

SDER SDER, Secure Debug Enable Register on page G6-4756 

SPSR SPSR, Saved Program Status Register on page G6-4554 

SPSR_abt SPSR_abt, Saved Program Status Register (Abort mode) on page G6-4557 

SPSR_fiq SPSR_fiq, Saved Program Status Register (FIQ mode) on page G6-4561

SPSR_hyp SPSR_hyp, Saved Program Status Register (Hyp mode) on page G6-4565 

SPSR_irq SPSR_irq, Saved Program Status Register (IRQ mode) on page G6-4569

SPSR_mon SPSR_mon, Saved Program Status Register (Monitor mode) on page G6-4573 
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Table K12-16 Alphabetical index of AArch32 Registers (continued) 

Register Description, see 

SPSR_svc SPSR_svc, Saved Program Status Register (Supervisor mode) on page G6-4577

SPSR_und SPSR_und, Saved Program Status Register (Undefined mode) on page G6-4581 

TCMTR TCMTR, TCM Type Register on page G6-4585 

TLBIALL TLBIALL, TLB Invalidate All on page G6-4587 

TLBIALLH TLBIALLH, TLB Invalidate All, Hyp mode on page G6-4589 

TLBIALLHIS TLBIALLHIS, TLB Invalidate All, Hyp mode, Inner Shareable on page G6-4591 

TLBIALLIS TLBIALLIS, TLB Invalidate All, Inner Shareable on page G6-4593 

TLBIALLNSNH TLBIALLNSNH, TLB Invalidate All, Non-Secure Non-Hyp on page G6-4595 

TLBIALLNSNHIS TLBIALLNSNHIS, TLB Invalidate All, Non-Secure Non-Hyp, Inner Shareable on page G6-4597 

TLBIASID TLBIASID, TLB Invalidate by ASID match on page G6-4599 

TLBIASIDIS TLBIASIDIS, TLB Invalidate by ASID match, Inner Shareable on page G6-4601 

TLBIIPAS2 TLBIIPAS2, TLB Invalidate by Intermediate Physical Address, Stage 2 on page G6-4603 

TLBIIPAS2IS TLBIIPAS2IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Inner Shareable on page G6-4605

TLBIIPAS2L TLBIIPAS2L, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level on page G6-4607

TLBIIPAS2LIS TLBIIPAS2LIS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, Inner Shareable on page G6-4609

TLBIMVA TLBIMVA, TLB Invalidate by VA on page G6-4611 

TLBIMVAA TLBIMVAA, TLB Invalidate by VA, All ASID on page G6-4613 

TLBIMVAAIS TLBIMVAAIS, TLB Invalidate by VA, All ASID, Inner Shareable on page G6-4615 

TLBIMVAAL TLBIMVAAL, TLB Invalidate by VA, All ASID, Last level on page G6-4617 

TLBIMVAALIS TLBIMVAALIS, TLB Invalidate by VA, All ASID, Last level, Inner Shareable on page G6-4619 

TLBIMVAH TLBIMVAH, TLB Invalidate by VA, Hyp mode on page G6-4621 

TLBIMVAHIS TLBIMVAHIS, TLB Invalidate by VA, Hyp mode, Inner Shareable on page G6-4623 

TLBIMVAIS TLBIMVAIS, TLB Invalidate by VA, Inner Shareable on page G6-4625 

TLBIMVAL TLBIMVAL, TLB Invalidate by VA, Last level on page G6-4627 

TLBIMVALH TLBIMVALH, TLB Invalidate by VA, Last level, Hyp mode on page G6-4629 

TLBIMVALHIS TLBIMVALHIS, TLB Invalidate by VA, Last level, Hyp mode, Inner Shareable on page G6-4631 

TLBIMVALIS TLBIMVALIS, TLB Invalidate by VA, Last level, Inner Shareable on page G6-4633 

TLBTR TLBTR, TLB Type Register on page G6-4635 

TPIDRPRW TPIDRPRW, PL1 Software Thread ID Register on page G6-4637

TPIDRURO TPIDRURO, PL0 Read-Only Software Thread ID Register on page G6-4639
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Table K12-16 Alphabetical index of AArch32 Registers (continued) 

Register Description, see 

TPIDRURW TPIDRURW, PL0 Read/Write Software Thread ID Register on page G6-4641
TTBCR TTBCR, Translation Table Base Control Register on page G6-4643 

TTBR0 TTBR0, Translation Table Base Register 0 on page G6-4648

TTBR1 TTBR1, Translation Table Base Register 1 on page G6-4652

VBAR VBAR, Vector Base Address Register on page G6-4656 

VMPIDR VMPIDR, Virtualization Multiprocessor ID Register on page G6-4658 
VPIDR VPIDR, Virtualization Processor ID Register on page G6-4660 

VTCR VTCR, Virtualization Translation Control Register on page G6-4663 
VTTBR VTTBR, Virtualization Translation Table Base Register on page G6-4666 
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K12.5 Functional index of AArch32 registers and system instructions 


This section is an index of the AArch32 registers and system instructions, divided by functional group. Each of the 
following sections lists the registers for a functional group: 


Special-purpose registers. 

VMSA-specific registers on page K12-5693. 

ID registers on page K12-5693. 

Performance monitors registers on page K12-5695. 

Debug registers on page K12-5695. 

Generic timer registers on page K12-5697. 

Cache maintenance system instructions on page K12-5697. 
Address translation system instructions on page K12-5698. 

TLB maintenance system instructions on page K12-5698. 

Legacy feature registers and system instructions on page K12-5699. 


Base system registers on page K12-5700. 


K12.5.1 Special-purpose registers 


This section is an index to the registers in the Processor state registers functional group. 


Table K12-17 Special-purpose registers 





Register Description, see

DLR DLR, Debug Link Register on page G6-4743
DSPSR DSPSR, Debug Saved Program Status Register on page G6-4745
ELR_hyp ELR_hyp, Exception Link Register (Hyp mode) on page G6-4326
FPSCR FPSCR, Floating-Point Status and Control Register on page G6-4335
SPSR SPSR, Saved Program Status Register on page G6-4554
SPSR_abt SPSR_abt, Saved Program Status Register (Abort mode) on page G6-4557
SPSR_fiq SPSR_fiq, Saved Program Status Register (FIQ mode) on page G6-4561
SPSR_hyp SPSR_hyp, Saved Program Status Register (Hyp mode) on page G6-4565
SPSR_irq SPSR_irq, Saved Program Status Register (IRQ mode) on page G6-4569
SPSR_mon SPSR_mon, Saved Program Status Register (Monitor mode) on page G6-4573
SPSR_svc SPSR_svc, Saved Program Status Register (Supervisor mode) on page G6-4577
SPSR_und SPSR_und, Saved Program Status Register (Undefined mode) on page G6-4581
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VMSA-specific registers 


This section is an index to the registers in the Virtual memory control registers functional group. 


Table K12-18 VMSA-specific registers 

Register Description, see 

AMAIR0 AMAIR0, Auxiliary Memory Attribute Indirection Register 0 on page G6-4242
AMAIR1 AMAIR1, Auxiliary Memory Attribute Indirection Register 1 on page G6-4244
CONTEXTIDR CONTEXTIDR, Context ID Register on page G6-4276
DACR DACR, Domain Access Control Register on page G6-4296
HAMAIR0 HAMAIR0, Hyp Auxiliary Memory Attribute Indirection Register 0 on page G6-4354
HAMAIR1 HAMAIR1, Hyp Auxiliary Memory Attribute Indirection Register 1 on page G6-4356
HMAIR0 HMAIR0, Hyp Memory Attribute Indirection Register 0 on page G6-4378
HMAIR1 HMAIR1, Hyp Memory Attribute Indirection Register 1 on page G6-4381
HTCR HTCR, Hyp Translation Control Register on page G6-4415
HTTBR HTTBR, Hyp Translation Table Base Register on page G6-4420
MAIR0 MAIR0, Memory Attribute Indirection Register 0 on page G6-4495
MAIR1 MAIR1, Memory Attribute Indirection Register 1 on page G6-4498
NMRR NMRR, Normal Memory Remap Register on page G6-4517
PRRR PRRR, Primary Region Remap Register on page G6-4531
TTBCR TTBCR, Translation Table Base Control Register on page G6-4643
TTBR0 TTBR0, Translation Table Base Register 0 on page G6-4648
TTBR1 TTBR1, Translation Table Base Register 1 on page G6-4652

VTCR VTCR, Virtualization Translation Control Register on page G6-4663 

VTTBR VTTBR, Virtualization Translation Table Base Register on page G6-4666 





K12.5.3 ID registers 


This section is an index to the registers in the Identification registers functional group. 


Table K12-19 ID registers 

Register Description, see 

AIDR AIDR, Auxiliary ID Register on page G6-4238 

CCSIDR CCSIDR, Current Cache Size ID Register on page G6-4272 
CLIDR CLIDR, Cache Level ID Register on page G6-4274 
CSSELR CSSELR, Cache Size Selection Register on page G6-4291 
CTR CTR, Cache Type Register on page G6-4293 

FPSID FPSID, Floating-Point System ID register on page G6-4341 
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K12.5 Functional index of AArch32 registers and system instructions 


Table K12-19 ID registers (continued) 

Register Description, see 
ID_AFR0 ID_AFR0, Auxiliary Feature Register 0 on page G6-4430
ID_DFR0 ID_DFR0, Debug Feature Register 0 on page G6-4432
ID_ISAR0 ID_ISAR0, Instruction Set Attribute Register 0 on page G6-4435
ID_ISAR1 ID_ISAR1, Instruction Set Attribute Register 1 on page G6-4438
ID_ISAR2 ID_ISAR2, Instruction Set Attribute Register 2 on page G6-4441
ID_ISAR3 ID_ISAR3, Instruction Set Attribute Register 3 on page G6-4444
ID_ISAR4 ID_ISAR4, Instruction Set Attribute Register 4 on page G6-4447
ID_ISAR5 ID_ISAR5, Instruction Set Attribute Register 5 on page G6-4450
ID_MMFR0 ID_MMFR0, Memory Model Feature Register 0 on page G6-4452
ID_MMFR1 ID_MMFR1, Memory Model Feature Register 1 on page G6-4455
ID_MMFR2 ID_MMFR2, Memory Model Feature Register 2 on page G6-4459
ID_MMFR3 ID_MMFR3, Memory Model Feature Register 3 on page G6-4463
ID_MMFR4 ID_MMFR4, Memory Model Feature Register 4 on page G6-4466
ID_PFR0 ID_PFR0, Processor Feature Register 0 on page G6-4468
ID_PFR1 ID_PFR1, Processor Feature Register 1 on page G6-4470
MIDR MIDR, Main ID Register on page G6-4501
MPIDR MPIDR, Multiprocessor Affinity Register on page G6-4504
MVFR0 MVFR0, Media and VFP Feature Register 0 on page G6-4508
MVFR1 MVFR1, Media and VFP Feature Register 1 on page G6-4512
MVFR2 MVFR2, Media and VFP Feature Register 2 on page G6-4515
REVIDR REVIDR, Revision ID Register on page G6-4534 
TCMTR TCMTR, TCM Type Register on page G6-4585 
TLBTR TLBTR, TLB Type Register on page G6-4635 
VMPIDR VMPIDR, Virtualization Multiprocessor ID Register on page G6-4658 
VPIDR VPIDR, Virtualization Processor ID Register on page G6-4660 
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K12.5 Functional index of AArch32 registers and system instructions 


K12.5.4 Performance monitors registers 


This section is an index to the registers in the Performance Monitors registers functional group. 


Table K12-20 Performance monitors registers 

Register Description, see 

PMCCFILTR PMCCFILTR, Performance Monitors Cycle Count Filter Register on page G6-4759 
PMCCNTR PMCCNTR, Performance Monitors Cycle Count Register on page G6-4762 

PMCEID0 PMCEID0, Performance Monitors Common Event Identification register 0 on page G6-4765
PMCEID1 PMCEID1, Performance Monitors Common Event Identification register 1 on page G6-4767 
PMCNTENCLR PMCNTENCLR, Performance Monitors Count Enable Clear register on page G6-4769 
PMCNTENSET PMCNTENSET, Performance Monitors Count Enable Set register on page G6-4771 

PMCR PMCR, Performance Monitors Control Register on page G6-4773 

PMEVCNTR<n> PMEVCNTR<n>, Performance Monitors Event Count Registers, n = 0 - 30 on page G6-4777 
PMEVTYPER<n> PMEVTYPER<n>, Performance Monitors Event Type Registers, n = 0 - 30 on page G6-4779 
PMINTENCLR PMINTENCLR, Performance Monitors Interrupt Enable Clear register on page G6-4783 
PMINTENSET PMINTENSET, Performance Monitors Interrupt Enable Set register on page G6-4785 
PMOVSR PMOVSR, Performance Monitors Overflow Flag Status Register on page G6-4787 
PMOVSSET PMOVSSET, Performance Monitors Overflow Flag Status Set register on page G6-4789 
PMSELR PMSELR, Performance Monitors Event Counter Selection Register on page G6-4791
PMSWINC PMSWINC, Performance Monitors Software Increment register on page G6-4794 
PMUSERENR PMUSERENR, Performance Monitors User Enable Register on page G6-4796 
PMXEVCNTR PMXEVCNTR, Performance Monitors Selected Event Count Register on page G6-4799 
PMXEVTYPER PMXEVTYPER, Performance Monitors Selected Event Type Register on page G6-4801 








K12.5.5 Debug registers 


This section is an index to the registers in the Debug registers functional group. 


Table K12-21 Debug registers 





Register Description, see

DBGAUTHSTATUS DBGAUTHSTATUS, Debug Authentication Status register on page G6-4669
DBGBCR<n> DBGBCR<n>, Debug Breakpoint Control Registers, n = 0 - 15 on page G6-4672
DBGBVR<n> DBGBVR<n>, Debug Breakpoint Value Registers, n = 0 - 15 on page G6-4676
DBGBXVR<n> DBGBXVR<n>, Debug Breakpoint Extended Value Registers, n = 0 - 15 on page G6-4679
DBGCLAIMCLR DBGCLAIMCLR, Debug Claim Tag Clear register on page G6-4681
DBGCLAIMSET DBGCLAIMSET, Debug Claim Tag Set register on page G6-4683
DBGDCCINT DBGDCCINT, DCC Interrupt Enable Register on page G6-4685


Table K12-21 Debug registers (continued) 

Register Description, see 

DBGDEVID DBGDEVID, Debug Device ID register 0 on page G6-4687 

DBGDEVID1 DBGDEVID1, Debug Device ID register 1 on page G6-4690 

DBGDEVID2 DBGDEVID2, Debug Device ID register 2 on page G6-4692 

DBGDIDR DBGDIDR, Debug ID Register on page G6-4694 

DBGDRAR DBGDRAR, Debug ROM Address Register on page G6-4697 

DBGDSAR DBGDSAR, Debug Self Address Register on page G6-4700 

DBGDSCRext DBGDSCRext, Debug Status and Control Register, External View on page G6-4702 
DBGDSCRint DBGDSCRint, Debug Status and Control Register, Internal View on page G6-4707 
DBGDTRRXext DBGDTRRXext, Debug OS Lock Data Transfer Register, Receive, External View on page G6-4710
DBGDTRRXint DBGDTRRXint, Debug Data Transfer Register, Receive on page G6-4712 
DBGDTRTXext DBGDTRTXext, Debug OS Lock Data Transfer Register, Transmit on page G6-4714 
DBGDTRTXint DBGDTRTXint, Debug Data Transfer Register, Transmit on page G6-4716 
DBGOSDLR DBGOSDLR, Debug OS Double Lock Register on page G6-4718 

DBGOSECCR DBGOSECCR, Debug OS Lock Exception Catch Control Register on page G6-4720 
DBGOSLAR DBGOSLAR, Debug OS Lock Access Register on page G6-4722 

DBGOSLSR DBGOSLSR, Debug OS Lock Status Register on page G6-4724 

DBGPRCR DBGPRCR, Debug Power Control Register on page G6-4726 

DBGVCR DBGVCR, Debug Vector Catch Register on page G6-4728 

DBGWCR<n> DBGWCR<n>, Debug Watchpoint Control Registers, n = 0 - 15 on page G6-4735 
DBGWFAR DBGWFAR, Debug Watchpoint Fault Address Register on page G6-4739
DBGWVR<n> DBGWVR<n>, Debug Watchpoint Value Registers, n = 0 - 15 on page G6-4741 
DLR DLR, Debug Link Register on page G6-4743 

DSPSR DSPSR, Debug Saved Program Status Register on page G6-4745 

HDCR HDCR, Hyp Debug Control Register on page G6-4749 

SDCR SDCR, Secure Debug Control Register on page G6-4753 

SDER SDER, Secure Debug Enable Register on page G6-4756 








K12.5.6 Generic timer registers


This section is an index to the registers in the Generic Timer registers functional group. 


Table K12-22 Generic timer registers 





Register Description, see

CNTFRQ CNTFRQ, Counter-timer Frequency register on page G6-4804
CNTHCTL CNTHCTL, Counter-timer Hyp Control register on page G6-4806
CNTHP_CTL CNTHP_CTL, Counter-timer Hyp Physical Timer Control register on page G6-4808
CNTHP_CVAL CNTHP_CVAL, Counter-timer Hyp Physical CompareValue register on page G6-4810
CNTHP_TVAL CNTHP_TVAL, Counter-timer Hyp Physical Timer TimerValue register on page G6-4812
CNTKCTL CNTKCTL, Counter-timer Kernel Control register on page G6-4814
CNTP_CTL CNTP_CTL, Counter-timer Physical Timer Control register on page G6-4817
CNTP_CVAL CNTP_CVAL, Counter-timer Physical Timer CompareValue register on page G6-4820
CNTP_TVAL CNTP_TVAL, Counter-timer Physical Timer TimerValue register on page G6-4822
CNTPCT CNTPCT, Counter-timer Physical Count register on page G6-4824
CNTV_CTL CNTV_CTL, Counter-timer Virtual Timer Control register on page G6-4826
CNTV_CVAL CNTV_CVAL, Counter-timer Virtual Timer CompareValue register on page G6-4828
CNTV_TVAL CNTV_TVAL, Counter-timer Virtual Timer TimerValue register on page G6-4830
CNTVCT CNTVCT, Counter-timer Virtual Count register on page G6-4832
CNTVOFF CNTVOFF, Counter-timer Virtual Offset register on page G6-4834





K12.5.7 Cache maintenance system instructions


This section is an index to the registers in the Cache maintenance instructions functional group. 


Table K12-23 Cache maintenance system instructions 

Register Description, see 

BPIALL BPIALL, Branch Predictor Invalidate All on page G6-4268 

BPIALLIS BPIALLIS, Branch Predictor Invalidate All, Inner Shareable on page G6-4269 
BPIMVA BPIMVA, Branch Predictor Invalidate by VA on page G6-4270 

DCCIMVAC DCCIMVAC, Data Cache line Clean and Invalidate by VA to PoC on page G6-4298
DCCISW DCCISW, Data Cache line Clean and Invalidate by Set/Way on page G6-4300
DCCMVAC DCCMVAC, Data Cache line Clean by VA to PoC on page G6-4302
DCCMVAU DCCMVAU, Data Cache line Clean by VA to PoU on page G6-4304

DCCSW DCCSW, Data Cache line Clean by Set/Way on page G6-4306 

DCIMVAC DCIMVAC, Data Cache line Invalidate by VA to PoC on page G6-4308 
DCISW DCISW, Data Cache line Invalidate by Set/Way on page G6-4310 
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Table K12-23 Cache maintenance system instructions (continued) 





Register Description, see

ICIALLU ICIALLU, Instruction Cache Invalidate All to PoU on page G6-4424
ICIALLUIS ICIALLUIS, Instruction Cache Invalidate All to PoU, Inner Shareable on page G6-4426
ICIMVAU ICIMVAU, Instruction Cache line Invalidate by VA to PoU on page G6-4428


K12.5.8 Address translation system instructions


This section is an index to the registers in the Address translation instructions functional group.


Table K12-24 Address translation system instructions

Register Description, see

ATS12NSOPR ATS12NSOPR, Address Translate Stages 1 and 2 Non-secure Only PL1 Read on page G6-4248
ATS12NSOPW ATS12NSOPW, Address Translate Stages 1 and 2 Non-secure Only PL1 Write on page G6-4250
ATS12NSOUR ATS12NSOUR, Address Translate Stages 1 and 2 Non-secure Only Unprivileged Read on page G6-4252
ATS12NSOUW ATS12NSOUW, Address Translate Stages 1 and 2 Non-secure Only Unprivileged Write on page G6-4254
ATS1CPR ATS1CPR, Address Translate Stage 1 Current state PL1 Read on page G6-4256
ATS1CPW ATS1CPW, Address Translate Stage 1 Current state PL1 Write on page G6-4258
ATS1CUR ATS1CUR, Address Translate Stage 1 Current state Unprivileged Read on page G6-4260
ATS1CUW ATS1CUW, Address Translate Stage 1 Current state Unprivileged Write on page G6-4262
ATS1HR ATS1HR, Address Translate Stage 1 Hyp mode Read on page G6-4264
ATS1HW ATS1HW, Address Translate Stage 1 Hyp mode Write on page G6-4266


K12.5.9 TLB maintenance system instructions


This section is an index to the registers in the TLB maintenance instructions functional group.


Table K12-25 TLB maintenance system instructions

Register Description, see

DTLBIALL DTLBIALL, Data TLB Invalidate All on page G6-4320 
DTLBIASID DTLBIASID, Data TLB Invalidate by ASID match on page G6-4322 
DTLBIMVA DTLBIMVA, Data TLB Invalidate by VA on page G6-4324 
ITLBIALL ITLBIALL, Instruction TLB Invalidate All on page G6-4483 
ITLBIASID ITLBIASID, Instruction TLB Invalidate by ASID match on page G6-4485 
ITLBIMVA ITLBIMVA, Instruction TLB Invalidate by VA on page G6-4487 
TLBIALL TLBIALL, TLB Invalidate All on page G6-4587 
TLBIALLH TLBIALLH, TLB Invalidate All, Hyp mode on page G6-4589 
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Table K12-25 TLB maintenance system instructions (continued) 

Register Description, see 

TLBIALLHIS TLBIALLHIS, TLB Invalidate All, Hyp mode, Inner Shareable on page G6-4591 

TLBIALLIS TLBIALLIS, TLB Invalidate All, Inner Shareable on page G6-4593 

TLBIALLNSNH TLBIALLNSNH, TLB Invalidate All, Non-Secure Non-Hyp on page G6-4595 

TLBIALLNSNHIS TLBIALLNSNHIS, TLB Invalidate All, Non-Secure Non-Hyp, Inner Shareable on page G6-4597 

TLBIASID TLBIASID, TLB Invalidate by ASID match on page G6-4599 

TLBIASIDIS TLBIASIDIS, TLB Invalidate by ASID match, Inner Shareable on page G6-4601 

TLBIIPAS2 TLBIIPAS2, TLB Invalidate by Intermediate Physical Address, Stage 2 on page G6-4603 

TLBIIPAS2IS TLBIIPAS2IS, TLB Invalidate by Intermediate Physical Address, Stage 2, Inner Shareable on page G6-4605

TLBIIPAS2L TLBIIPAS2L, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level on page G6-4607

TLBIIPAS2LIS TLBIIPAS2LIS, TLB Invalidate by Intermediate Physical Address, Stage 2, Last level, Inner Shareable on page G6-4609

TLBIMVA TLBIMVA, TLB Invalidate by VA on page G6-4611 

TLBIMVAA TLBIMVAA, TLB Invalidate by VA, All ASID on page G6-4613 

TLBIMVAAIS TLBIMVAAIS, TLB Invalidate by VA, All ASID, Inner Shareable on page G6-4615 

TLBIMVAAL TLBIMVAAL, TLB Invalidate by VA, All ASID, Last level on page G6-4617 

TLBIMVAALIS TLBIMVAALIS, TLB Invalidate by VA, All ASID, Last level, Inner Shareable on page G6-4619 

TLBIMVAH TLBIMVAH, TLB Invalidate by VA, Hyp mode on page G6-4621 

TLBIMVAHIS TLBIMVAHIS, TLB Invalidate by VA, Hyp mode, Inner Shareable on page G6-4623 

TLBIMVAIS TLBIMVAIS, TLB Invalidate by VA, Inner Shareable on page G6-4625 

TLBIMVAL TLBIMVAL, TLB Invalidate by VA, Last level on page G6-4627 

TLBIMVALH TLBIMVALH, TLB Invalidate by VA, Last level, Hyp mode on page G6-4629 

TLBIMVALHIS TLBIMVALHIS, TLB Invalidate by VA, Last level, Hyp mode, Inner Shareable on page G6-4631 

TLBIMVALIS TLBIMVALIS, TLB Invalidate by VA, Last level, Inner Shareable on page G6-4633 





K12.5.10 Legacy feature registers and system instructions


This section is an index to the registers in the Legacy feature registers functional group. 


Table K12-26 Legacy feature registers and system instructions 





Register Description, see

CP15DMB CP15DMB, Data Memory Barrier System instruction on page G6-4278
CP15DSB CP15DSB, Data Synchronization Barrier System instruction on page G6-4280
CP15ISB CP15ISB, Instruction Synchronization Barrier System instruction on page G6-4282
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Table K12-26 Legacy feature registers and system instructions (continued) 





Register Description, see

FCSEIDR FCSEIDR, FCSE Process ID register on page G6-4328
JIDR JIDR, Jazelle ID Register on page G6-4489
JMCR JMCR, Jazelle Main Configuration Register on page G6-4491
JOSCR JOSCR, Jazelle OS Control Register on page G6-4493





K12.5.11 Base system registers 


This section is an index to the registers in the functional group. 


Table K12-27 Base system registers 

Register Description, see 
ACTLR ACTLR, Auxiliary Control Register on page G6-4232 
ACTLR2 ACTLR2, Auxiliary Control Register 2 on page G6-4234 
ADFSR ADFSR, Auxiliary Data Fault Status Register on page G6-4236 
AIFSR AIFSR, Auxiliary Instruction Fault Status Register on page G6-4240 
APSR APSR, Application Program Status Register on page G6-4246 
CPACR CPACR, Architectural Feature Access Control Register on page G6-4284 
CPSR CPSR, Current Program Status Register on page G6-4288 
DFAR DFAR, Data Fault Address Register on page G6-4312 
DFSR DFSR, Data Fault Status Register on page G6-4314 
FPEXC FPEXC, Floating-Point Exception Control register on page G6-4330 
HACR HACR, Hyp Auxiliary Configuration Register on page G6-4344 
HACTLR HACTLR, Hyp Auxiliary Control Register on page G6-4346 
HACTLR2 HACTLR2, Hyp Auxiliary Control Register 2 on page G6-4348 
HADFSR HADFSR, Hyp Auxiliary Data Fault Status Register on page G6-4350 
HAIFSR HAIFSR, Hyp Auxiliary Instruction Fault Status Register on page G6-4352 
HCPTR HCPTR, Hyp Architectural Feature Trap Register on page G6-4358 
HCR HCR, Hyp Configuration Register on page G6-4362 
HCR2 HCR2, Hyp Configuration Register 2 on page G6-4372 
HDFAR HDFAR, Hyp Data Fault Address Register on page G6-4374 
HIFAR HIFAR, Hyp Instruction Fault Address Register on page G6-4376 
HPFAR HPFAR, Hyp IPA Fault Address Register on page G6-4384 
HRMR HRMR, Hyp Reset Management Register on page G6-4386 
HSCTLR HSCTLR, Hyp System Control Register on page G6-4388 
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Table K12-27 Base system registers (continued) 

Register Description, see 

HSR HSR, Hyp Syndrome Register on page G6-4393 

HSTR HSTR, Hyp System Trap Register on page G6-4413 

HTPIDR HTPIDR, Hyp Software Thread ID Register on page G6-4418 

HVBAR HVBAR, Hyp Vector Base Address Register on page G6-4422 

IFAR IFAR, Instruction Fault Address Register on page G6-4474 

IFSR IFSR, Instruction Fault Status Register on page G6-4476 

ISR ISR, Interrupt Status Register on page G6-4481 

MVBAR MVBAR, Monitor Vector Base Address Register on page G6-4506 

NSACR NSACR, Non-Secure Access Control Register on page G6-4520 

PAR PAR, Physical Address Register on page G6-4524 

RMR (at EL1) RMR (at EL1), Reset Management Register on page G6-4536 

RMR (at EL3) RMR (at EL3), Reset Management Register on page G6-4538 

RVBAR RVBAR, Reset Vector Base Address Register on page G6-4540 

SCR SCR, Secure Configuration Register on page G6-4542 

SCTLR SCTLR, System Control Register on page G6-4547 

TPIDRPRW TPIDRPRW, PL1 Software Thread ID Register on page G6-4637
TPIDRURO TPIDRURO, PL0 Read-Only Software Thread ID Register on page G6-4639
TPIDRURW TPIDRURW, PL0 Read/Write Software Thread ID Register on page G6-4641
VBAR VBAR, Vector Base Address Register on page G6-4656 
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K12.6 Alphabetical index of memory-mapped registers 


This section is an index of memory-mapped registers in alphabetical order. 


Table K12-28 Alphabetical index of Memory-Mapped Registers 

Register Description, see 

ASICCTL ASICCTL, CTI External Multiplexer Control register on page H9-5077 

CNTACR<n> CNTACR<n>, Counter-timer Access Control Registers, n = 0 - 7 on page I3-5200

CNTCR CNTCR, Counter Control Register on page I3-5202

CNTCV CNTCV, Counter Count Value register on page I3-5204

CNTEL0ACR CNTEL0ACR, Counter-timer EL0 Access Control Register on page I3-5206

CNTFID0 CNTFID0, Counter Frequency ID on page I3-5208

CNTFID<n> CNTFID<n>, Counter Frequency IDs, n > 0 on page I3-5209

CNTFRQ CNTFRQ, Counter-timer Frequency on page I3-5211

CNTNSAR CNTNSAR, Counter-timer Non-secure Access Register on page I3-5213

CNTP_CTL CNTP_CTL, Counter-timer Physical Timer Control on page I3-5215

CNTP_CVAL CNTP_CVAL, Counter-timer Physical Timer CompareValue on page I3-5217

CNTP_TVAL CNTP_TVAL, Counter-timer Physical Timer TimerValue on page I3-5219

CNTPCT CNTPCT, Counter-timer Physical Count on page I3-5221

CNTSR CNTSR, Counter Status Register on page I3-5223

CNTTIDR CNTTIDR, Counter-timer Timer ID Register on page I3-5225

CNTV_CTL CNTV_CTL, Counter-timer Virtual Timer Control on page I3-5227

CNTV_CVAL CNTV_CVAL, Counter-timer Virtual Timer CompareValue on page I3-5229

CNTV_TVAL CNTV_TVAL, Counter-timer Virtual Timer TimerValue on page I3-5231

CNTVCT CNTVCT, Counter-timer Virtual Count on page I3-5233

CNTVOFF CNTVOFF{<n>}, Counter-timer Virtual Offset on page I3-5235

CNTVOFF<n> CNTVOFF{<n>}, Counter-timer Virtual Offset on page I3-5235

CounterID<n> CounterID<n>, Counter ID registers, n = 0 - 11 on page I3-5237

CTIAPPCLEAR CTIAPPCLEAR, CTI Application Trigger Clear register on page H9-5078 

CTIAPPPULSE CTIAPPPULSE, CTI Application Pulse register on page H9-5079 

CTIAPPSET CTIAPPSET, CTI Application Trigger Set register on page H9-5080 

CTIAUTHSTATUS CTIAUTHSTATUS, CTI Authentication Status register on page H9-5081 

CTICHINSTATUS CTICHINSTATUS, CTI Channel In Status register on page H9-5083 

CTICHOUTSTATUS CTICHOUTSTATUS, CTI Channel Out Status register on page H9-5084 

CTICIDR0 CTICIDR0, CTI Component Identification Register 0 on page H9-5085

CTICIDR1 CTICIDR1, CTI Component Identification Register 1 on page H9-5086 
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Table K12-28 Alphabetical index of Memory-Mapped Registers (continued) 

Register Description, see 

CTICIDR2 CTICIDR2, CTI Component Identification Register 2 on page H9-5087
CTICIDR3 CTICIDR3, CTI Component Identification Register 3 on page H9-5088
CTICLAIMCLR CTICLAIMCLR, CTI Claim Tag Clear register on page H9-5089
CTICLAIMSET CTICLAIMSET, CTI Claim Tag Set register on page H9-5090
CTICONTROL CTICONTROL, CTI Control register on page H9-5091
CTIDEVAFF0 CTIDEVAFF0, CTI Device Affinity register 0 on page H9-5093
CTIDEVAFF1 CTIDEVAFF1, CTI Device Affinity register 1 on page H9-5094
CTIDEVARCH CTIDEVARCH, CTI Device Architecture register on page H9-5095
CTIDEVID CTIDEVID, CTI Device ID register 0 on page H9-5097
CTIDEVID1 CTIDEVID1, CTI Device ID register 1 on page H9-5099
CTIDEVID2 CTIDEVID2, CTI Device ID register 2 on page H9-5100
CTIDEVTYPE CTIDEVTYPE, CTI Device Type register on page H9-5101
CTIGATE CTIGATE, CTI Channel Gate Enable register on page H9-5102
CTIINEN<n> CTIINEN<n>, CTI Input Trigger to Output Channel Enable registers, n = 0 - 31 on page H9-5103
CTIINTACK CTIINTACK, CTI Output Trigger Acknowledge register on page H9-5104
CTIITCTRL CTIITCTRL, CTI Integration mode Control register on page H9-5106
CTILAR CTILAR, CTI Lock Access Register on page H9-5108
CTILSR CTILSR, CTI Lock Status Register on page H9-5109
CTIOUTEN<n> CTIOUTEN<n>, CTI Input Channel to Output Trigger Enable registers, n = 0 - 31 on page H9-5111
CTIPIDR0 CTIPIDR0, CTI Peripheral Identification Register 0 on page H9-5112
CTIPIDR1 CTIPIDR1, CTI Peripheral Identification Register 1 on page H9-5113
CTIPIDR2 CTIPIDR2, CTI Peripheral Identification Register 2 on page H9-5114
CTIPIDR3 CTIPIDR3, CTI Peripheral Identification Register 3 on page H9-5115
CTIPIDR4 CTIPIDR4, CTI Peripheral Identification Register 4 on page H9-5116
CTITRIGINSTATUS CTITRIGINSTATUS, CTI Trigger In Status register on page H9-5117
CTITRIGOUTSTATUS CTITRIGOUTSTATUS, CTI Trigger Out Status register on page H9-5118





DBGAUTHSTATUS_EL1 DBGAUTHSTATUS_EL1, Debug Authentication Status register on page H9-4988
DBGBCR<n>_EL1 DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15 on page H9-4990
DBGBVR<n>_EL1 DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 on page H9-4993
DBGCLAIMCLR_EL1 DBGCLAIMCLR_EL1, Debug Claim Tag Clear register on page H9-4996
DBGCLAIMSET_EL1 DBGCLAIMSET_EL1, Debug Claim Tag Set register on page H9-4998
DBGDTRRX_EL0 DBGDTRRX_EL0, Debug Data Transfer Register, Receive on page H9-5000







DBGDTRTX_EL0 DBGDTRTX_EL0, Debug Data Transfer Register, Transmit on page H9-5002
DBGWCR<n>_EL1 DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15 on page H9-5004
DBGWVR<n>_EL1 DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15 on page H9-5007

EDAA32PFR EDAA32PFR, External Debug AArch32 Processor Feature Register on page H9-5009 

EDACR EDACR, External Debug Auxiliary Control Register on page H9-5011 

EDCIDR0 EDCIDR0, External Debug Component Identification Register 0 on page H9-5012

EDCIDR1 EDCIDR1, External Debug Component Identification Register 1 on page H9-5013

EDCIDR2 EDCIDR2, External Debug Component Identification Register 2 on page H9-5014 

EDCIDR3 EDCIDR3, External Debug Component Identification Register 3 on page H9-5015 

EDCIDSR EDCIDSR, External Debug Context ID Sample Register on page H9-5016 

EDDEVAFF0 EDDEVAFF0, External Debug Device Affinity register 0 on page H9-5017

EDDEVAFF1 EDDEVAFF1, External Debug Device Affinity register 1 on page H9-5018

EDDEVARCH EDDEVARCH, External Debug Device Architecture register on page H9-5019 

EDDEVID EDDEVID, External Debug Device ID register 0 on page H9-5021 

EDDEVID1 EDDEVID1, External Debug Device ID register 1 on page H9-5023 

EDDEVID2 EDDEVID2, External Debug Device ID register 2 on page H9-5024 

EDDEVTYPE EDDEVTYPE, External Debug Device Type register on page H9-5025 

EDDFR EDDFR, External Debug Feature Register on page H9-5026 

EDECCR EDECCR, External Debug Exception Catch Control Register on page H9-5028 

EDECR EDECR, External Debug Execution Control Register on page H9-5030 

EDESR EDESR, External Debug Event Status Register on page H9-5032 

EDITCTRL EDITCTRL, External Debug Integration mode Control register on page H9-5034 

EDITR EDITR, External Debug Instruction Transfer Register on page H9-5036 

EDLAR EDLAR, External Debug Lock Access Register on page H9-5038 

EDLSR EDLSR, External Debug Lock Status Register on page H9-5039 

EDPCSR EDPCSR, External Debug Program Counter Sample Register on page H9-5041 

EDPFR EDPFR, External Debug Processor Feature Register on page H9-5043 

EDPIDR0 EDPIDR0, External Debug Peripheral Identification Register 0 on page H9-5046

EDPIDR1 EDPIDR1, External Debug Peripheral Identification Register 1 on page H9-5047

EDPIDR2 EDPIDR2, External Debug Peripheral Identification Register 2 on page H9-5048 

EDPIDR3 EDPIDR3, External Debug Peripheral Identification Register 3 on page H9-5049 

EDPIDR4 EDPIDR4, External Debug Peripheral Identification Register 4 on page H9-5050 

EDPRCR EDPRCR, External Debug Power/Reset Control Register on page H9-5051 

EDPRSR EDPRSR, External Debug Processor Status Register on page H9-5054 

EDRCR EDRCR, External Debug Reserve Control Register on page H9-5062 

EDSCR EDSCR, External Debug Status and Control Register on page H9-5064 

EDVIDSR EDVIDSR, External Debug Virtual Context Sample Register on page H9-5069 

EDWAR EDWAR, External Debug Watchpoint Address Register on page H9-5071 

MIDR_EL1 MIDR_EL1, Main ID Register on page H9-5073

OSLAR_EL1 OSLAR_EL1, OS Lock Access Register on page H9-5075 

PMAUTHSTATUS PMAUTHSTATUS, Performance Monitors Authentication Status register on page I3-5146
PMCCFILTR_EL0 PMCCFILTR_EL0, Performance Monitors Cycle Counter Filter Register on page I3-5148
PMCCNTR_EL0 PMCCNTR_EL0, Performance Monitors Cycle Counter on page I3-5150
PMCEID0 PMCEID0, Performance Monitors Common Event Identification register 0 on page I3-5152
PMCEID1 PMCEID1, Performance Monitors Common Event Identification register 1 on page I3-5154
PMCFGR PMCFGR, Performance Monitors Configuration Register on page I3-5156
PMCIDR0 PMCIDR0, Performance Monitors Component Identification Register 0 on page I3-5158
PMCIDR1 PMCIDR1, Performance Monitors Component Identification Register 1 on page I3-5159
PMCIDR2 PMCIDR2, Performance Monitors Component Identification Register 2 on page I3-5160
PMCIDR3 PMCIDR3, Performance Monitors Component Identification Register 3 on page I3-5161
PMCNTENCLR_EL0 PMCNTENCLR_EL0, Performance Monitors Count Enable Clear register on page I3-5162
PMCNTENSET_EL0 PMCNTENSET_EL0, Performance Monitors Count Enable Set register on page I3-5164
PMCR_EL0 PMCR_EL0, Performance Monitors Control Register on page I3-5166
PMDEVAFF0 PMDEVAFF0, Performance Monitors Device Affinity register 0 on page I3-5169
PMDEVAFF1 PMDEVAFF1, Performance Monitors Device Affinity register 1 on page I3-5170
PMDEVARCH PMDEVARCH, Performance Monitors Device Architecture register on page I3-5171
PMDEVTYPE PMDEVTYPE, Performance Monitors Device Type register on page I3-5173
PMEVCNTR<n>_EL0 PMEVCNTR<n>_EL0, Performance Monitors Event Count Registers, n = 0 - 30 on page I3-5174
PMEVTYPER<n>_EL0 PMEVTYPER<n>_EL0, Performance Monitors Event Type Registers, n = 0 - 30 on page I3-5175
PMINTENCLR_EL1 PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register on page I3-5178
PMINTENSET_EL1 PMINTENSET_EL1, Performance Monitors Interrupt Enable Set register on page I3-5180
PMITCTRL PMITCTRL, Performance Monitors Integration mode Control register on page I3-5182
PMLAR PMLAR, Performance Monitors Lock Access Register on page I3-5184
PMLSR PMLSR, Performance Monitors Lock Status Register on page I3-5185
PMOVSCLR_EL0 PMOVSCLR_EL0, Performance Monitors Overflow Flag Status Clear register on page I3-5187
PMOVSSET_EL0 PMOVSSET_EL0, Performance Monitors Overflow Flag Status Set register on page I3-5189
PMPIDR0 PMPIDR0, Performance Monitors Peripheral Identification Register 0 on page I3-5191
PMPIDR1 PMPIDR1, Performance Monitors Peripheral Identification Register 1 on page I3-5192
PMPIDR2 PMPIDR2, Performance Monitors Peripheral Identification Register 2 on page I3-5193
PMPIDR3 PMPIDR3, Performance Monitors Peripheral Identification Register 3 on page I3-5194
PMPIDR4 PMPIDR4, Performance Monitors Peripheral Identification Register 4 on page I3-5195
PMSWINC_EL0 PMSWINC_EL0, Performance Monitors Software Increment register on page I3-5196
Functional index of memory-mapped registers

This section is an index of the memory-mapped registers, divided by functional group.

ID registers

This section is an index to the registers in the Identification registers functional group.

Table K12-29 ID registers

Register Description, see
EDAA32PFR EDAA32PFR, External Debug AArch32 Processor Feature Register on page H9-5009
EDDFR EDDFR, External Debug Feature Register on page H9-5026
EDPFR EDPFR, External Debug Processor Feature Register on page H9-5043
MIDR_EL1 MIDR_EL1, Main ID Register on page H9-5073





Performance monitors registers 


This section is an index to the registers in the Performance Monitors registers functional group. 


Table K12-30 Performance monitors registers 





Register Description, see
PMAUTHSTATUS PMAUTHSTATUS, Performance Monitors Authentication Status register on page I3-5146
PMCCFILTR_EL0 PMCCFILTR_EL0, Performance Monitors Cycle Counter Filter Register on page I3-5148
PMCCNTR_EL0 PMCCNTR_EL0, Performance Monitors Cycle Counter on page I3-5150
PMCEID0 PMCEID0, Performance Monitors Common Event Identification register 0 on page I3-5152
PMCEID1 PMCEID1, Performance Monitors Common Event Identification register 1 on page I3-5154
PMCFGR PMCFGR, Performance Monitors Configuration Register on page I3-5156
PMCIDR0 PMCIDR0, Performance Monitors Component Identification Register 0 on page I3-5158
PMCIDR1 PMCIDR1, Performance Monitors Component Identification Register 1 on page I3-5159
PMCIDR2 PMCIDR2, Performance Monitors Component Identification Register 2 on page I3-5160
PMCIDR3 PMCIDR3, Performance Monitors Component Identification Register 3 on page I3-5161
PMCNTENCLR_EL0 PMCNTENCLR_EL0, Performance Monitors Count Enable Clear register on page I3-5162
PMCNTENSET_EL0 PMCNTENSET_EL0, Performance Monitors Count Enable Set register on page I3-5164
PMCR_EL0 PMCR_EL0, Performance Monitors Control Register on page I3-5166
PMDEVAFF0 PMDEVAFF0, Performance Monitors Device Affinity register 0 on page I3-5169
PMDEVAFF1 PMDEVAFF1, Performance Monitors Device Affinity register 1 on page I3-5170
PMDEVARCH PMDEVARCH, Performance Monitors Device Architecture register on page I3-5171
PMDEVTYPE PMDEVTYPE, Performance Monitors Device Type register on page I3-5173
PMEVCNTR<n>_EL0 PMEVCNTR<n>_EL0, Performance Monitors Event Count Registers, n = 0 - 30 on page I3-5174





PMEVTYPER<n>_EL0 PMEVTYPER<n>_EL0, Performance Monitors Event Type Registers, n = 0 - 30 on page I3-5175
PMINTENCLR_EL1 PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register on page I3-5178
PMINTENSET_EL1 PMINTENSET_EL1, Performance Monitors Interrupt Enable Set register on page I3-5180
PMITCTRL PMITCTRL, Performance Monitors Integration mode Control register on page I3-5182
PMLAR PMLAR, Performance Monitors Lock Access Register on page I3-5184
PMLSR PMLSR, Performance Monitors Lock Status Register on page I3-5185
PMOVSCLR_EL0 PMOVSCLR_EL0, Performance Monitors Overflow Flag Status Clear register on page I3-5187
PMOVSSET_EL0 PMOVSSET_EL0, Performance Monitors Overflow Flag Status Set register on page I3-5189
PMPIDR0 PMPIDR0, Performance Monitors Peripheral Identification Register 0 on page I3-5191
PMPIDR1 PMPIDR1, Performance Monitors Peripheral Identification Register 1 on page I3-5192
PMPIDR2 PMPIDR2, Performance Monitors Peripheral Identification Register 2 on page I3-5193
PMPIDR3 PMPIDR3, Performance Monitors Peripheral Identification Register 3 on page I3-5194
PMPIDR4 PMPIDR4, Performance Monitors Peripheral Identification Register 4 on page I3-5195
PMSWINC_EL0 PMSWINC_EL0, Performance Monitors Software Increment register on page I3-5196





Debug registers 


This section is an index to the registers in the Debug registers functional group. 


Table K12-31 Debug registers 





Register Description, see
DBGAUTHSTATUS_EL1 DBGAUTHSTATUS_EL1, Debug Authentication Status register on page H9-4988
DBGBCR<n>_EL1 DBGBCR<n>_EL1, Debug Breakpoint Control Registers, n = 0 - 15 on page H9-4990
DBGBVR<n>_EL1 DBGBVR<n>_EL1, Debug Breakpoint Value Registers, n = 0 - 15 on page H9-4993
DBGCLAIMCLR_EL1 DBGCLAIMCLR_EL1, Debug Claim Tag Clear register on page H9-4996
DBGCLAIMSET_EL1 DBGCLAIMSET_EL1, Debug Claim Tag Set register on page H9-4998
DBGDTRRX_EL0 DBGDTRRX_EL0, Debug Data Transfer Register, Receive on page H9-5000
DBGDTRTX_EL0 DBGDTRTX_EL0, Debug Data Transfer Register, Transmit on page H9-5002
DBGWCR<n>_EL1 DBGWCR<n>_EL1, Debug Watchpoint Control Registers, n = 0 - 15 on page H9-5004
DBGWVR<n>_EL1 DBGWVR<n>_EL1, Debug Watchpoint Value Registers, n = 0 - 15 on page H9-5007
EDACR EDACR, External Debug Auxiliary Control Register on page H9-5011
EDCIDR0 EDCIDR0, External Debug Component Identification Register 0 on page H9-5012
EDCIDR1 EDCIDR1, External Debug Component Identification Register 1 on page H9-5013
EDCIDR2 EDCIDR2, External Debug Component Identification Register 2 on page H9-5014

EDCIDR3 EDCIDR3, External Debug Component Identification Register 3 on page H9-5015 
EDCIDSR EDCIDSR, External Debug Context ID Sample Register on page H9-5016 
EDDEVAFF0 EDDEVAFF0, External Debug Device Affinity register 0 on page H9-5017
EDDEVAFF1 EDDEVAFF1, External Debug Device Affinity register 1 on page H9-5018 
EDDEVARCH EDDEVARCH, External Debug Device Architecture register on page H9-5019 
EDDEVID EDDEVID, External Debug Device ID register 0 on page H9-5021 

EDDEVID1 EDDEVID1, External Debug Device ID register 1 on page H9-5023 

EDDEVID2 EDDEVID2, External Debug Device ID register 2 on page H9-5024 
EDDEVTYPE EDDEVTYPE, External Debug Device Type register on page H9-5025 

EDECCR EDECCR, External Debug Exception Catch Control Register on page H9-5028 
EDECR EDECR, External Debug Execution Control Register on page H9-5030 

EDESR EDESR, External Debug Event Status Register on page H9-5032 

EDITCTRL EDITCTRL, External Debug Integration mode Control register on page H9-5034 
EDITR EDITR, External Debug Instruction Transfer Register on page H9-5036 

EDLAR EDLAR, External Debug Lock Access Register on page H9-5038 

EDLSR EDLSR, External Debug Lock Status Register on page H9-5039 

EDPCSR EDPCSR, External Debug Program Counter Sample Register on page H9-5041 
EDPIDR0 EDPIDR0, External Debug Peripheral Identification Register 0 on page H9-5046
EDPIDR1 EDPIDR1, External Debug Peripheral Identification Register 1 on page H9-5047
EDPIDR2 EDPIDR2, External Debug Peripheral Identification Register 2 on page H9-5048 
EDPIDR3 EDPIDR3, External Debug Peripheral Identification Register 3 on page H9-5049 
EDPIDR4 EDPIDR4, External Debug Peripheral Identification Register 4 on page H9-5050 
EDPRCR EDPRCR, External Debug Power/Reset Control Register on page H9-5051 
EDPRSR EDPRSR, External Debug Processor Status Register on page H9-5054 

EDRCR EDRCR, External Debug Reserve Control Register on page H9-5062 

EDSCR EDSCR, External Debug Status and Control Register on page H9-5064 
EDVIDSR EDVIDSR, External Debug Virtual Context Sample Register on page H9-5069 
EDWAR EDWAR, External Debug Watchpoint Address Register on page H9-5071 
OSLAR_EL1 OSLAR_EL1, OS Lock Access Register on page H9-5075
K12.7.4 Cross-trigger interface registers
This section is an index to the registers in the Cross-Trigger Interface registers functional group. 
Table K12-32 Cross-trigger interface registers 
Register Description, see 
ASICCTL ASICCTL, CTI External Multiplexer Control register on page H9-5077 
CTIAPPCLEAR CTIAPPCLEAR, CTI Application Trigger Clear register on page H9-5078 
CTIAPPPULSE CTIAPPPULSE, CTI Application Pulse register on page H9-5079 
CTIAPPSET CTIAPPSET, CTI Application Trigger Set register on page H9-5080 
CTIAUTHSTATUS CTIAUTHSTATUS, CTI Authentication Status register on page H9-5081 
CTICHINSTATUS CTICHINSTATUS, CTI Channel In Status register on page H9-5083 
CTICHOUTSTATUS CTICHOUTSTATUS, CTI Channel Out Status register on page H9-5084 
CTICIDR0 CTICIDR0, CTI Component Identification Register 0 on page H9-5085
CTICIDR1 CTICIDR1, CTI Component Identification Register 1 on page H9-5086
CTICIDR2 CTICIDR2, CTI Component Identification Register 2 on page H9-5087 
CTICIDR3 CTICIDR3, CTI Component Identification Register 3 on page H9-5088 
CTICLAIMCLR CTICLAIMCLR, CTI Claim Tag Clear register on page H9-5089 
CTICLAIMSET CTICLAIMSET, CTI Claim Tag Set register on page H9-5090 
CTICONTROL CTICONTROL, CTI Control register on page H9-5091 
CTIDEVAFF0 CTIDEVAFF0, CTI Device Affinity register 0 on page H9-5093
CTIDEVAFF1 CTIDEVAFF1, CTI Device Affinity register 1 on page H9-5094
CTIDEVARCH CTIDEVARCH, CTI Device Architecture register on page H9-5095 
CTIDEVID CTIDEVID, CTI Device ID register 0 on page H9-5097 
CTIDEVID1 CTIDEVID1, CTI Device ID register 1 on page H9-5099
CTIDEVID2 CTIDEVID2, CTI Device ID register 2 on page H9-5100 
CTIDEVTYPE CTIDEVTYPE, CTI Device Type register on page H9-5101 
CTIGATE CTIGATE, CTI Channel Gate Enable register on page H9-5102 
CTIINEN<n> CTIINEN<n>, CTI Input Trigger to Output Channel Enable registers, n = 0 - 31 on page H9-5103 
CTIINTACK CTIINTACK, CTI Output Trigger Acknowledge register on page H9-5104
CTIITCTRL CTIITCTRL, CTI Integration mode Control register on page H9-5106
CTILAR CTILAR, CTI Lock Access Register on page H9-5108 
CTILSR CTILSR, CTI Lock Status Register on page H9-5109 
CTIOUTEN<n> CTIOUTEN<n>, CTI Input Channel to Output Trigger Enable registers, n = 0 - 31 on page H9-5111 
CTIPIDR0 CTIPIDR0, CTI Peripheral Identification Register 0 on page H9-5112
CTIPIDR1 CTIPIDR1, CTI Peripheral Identification Register 1 on page H9-5113 

CTIPIDR2 CTIPIDR2, CTI Peripheral Identification Register 2 on page H9-5114 
CTIPIDR3 CTIPIDR3, CTI Peripheral Identification Register 3 on page H9-5115 
CTIPIDR4 CTIPIDR4, CTI Peripheral Identification Register 4 on page H9-5116 
CTITRIGINSTATUS CTITRIGINSTATUS, CTI Trigger In Status register on page H9-5117 
CTITRIGOUTSTATUS CTITRIGOUTSTATUS, CTI Trigger Out Status register on page H9-5118
A32 instruction
A word that specifies an operation to be performed by a PE that is executing in an Exception level that is using
AArch32 and is in A32 state. A32 instructions must be word-aligned.

A32 instructions were previously called ARM instructions.

See also A32 state, A64 instruction, T32 instruction.

A32 state
The AArch32 Instruction set state in which the PE executes A32 instructions.

A32 state was previously called ARM state.

See also T32 instruction, T32 state.

A64 instruction
A word that specifies an operation to be performed by a PE that is executing in an Exception level that is using
AArch64. A64 instructions must be word-aligned.

See also A32 instruction, T32 instruction.

AArch32
The 32-bit Execution state. In AArch32 state, addresses are held in 32-bit registers, and instructions in the base
instruction sets use 32-bit registers for their processing. AArch32 state supports the T32 and A32 instruction sets.

See also AArch64, A32 instruction, T32 instruction.

AArch64
The 64-bit Execution state. In AArch64 state, addresses are held in 64-bit registers, and instructions in the base
instruction set can use 64-bit registers for their processing. AArch64 state supports the A64 instruction set.

See also AArch32, A64 instruction.

Abort
An exception caused by an illegal memory access. Aborts can be caused by the external memory system or the
MMU.

Addressing mode
Means a method for generating the memory address used by a load/store instruction.

Advanced SIMD
A feature of the ARM architecture that provides SIMD operations on a register file of SIMD and floating-point
registers. Where an implementation supports both Advanced SIMD and floating-point instructions, these
instructions operate on the same register file.

Aligned


A data item stored at an address that is exactly divisible by the highest power of 2 that divides exactly into its size 
in bytes. Aligned halfwords, words and doublewords therefore have addresses that are divisible by 2, 4 and 8 
respectively. 


An aligned access is one where the address of the access is aligned to the size of each element of the access. 
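
As a minimal sketch of the Aligned definition above, and not part of the architecture text, the following Python helper checks whether an address is aligned for a data item of a given size. The name `is_aligned` and its parameters are illustrative assumptions.

```python
def is_aligned(address: int, size_bytes: int) -> bool:
    """Return True if `address` is aligned for a data item of `size_bytes` bytes.

    Follows the glossary wording: the address must be exactly divisible by the
    highest power of 2 that divides exactly into the size in bytes.
    """
    if address < 0 or size_bytes <= 0:
        raise ValueError("address must be non-negative and size must be positive")
    # size & -size isolates the lowest set bit of the size, which is the
    # largest power of 2 that divides size_bytes exactly.
    alignment = size_bytes & -size_bytes
    return address % alignment == 0


# Halfwords, words, and doublewords need addresses divisible by 2, 4, and 8.
halfword_ok = is_aligned(0x1002, 2)    # 0x1002 is divisible by 2
word_ok = is_aligned(0x1002, 4)        # 0x1002 is not divisible by 4
doubleword_ok = is_aligned(0x1008, 8)  # 0x1008 is divisible by 8
print(halfword_ok, word_ok, doubleword_ok)
```

For sizes that are themselves powers of two, the check reduces to the usual test that the address is a multiple of the size.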


Architecturally executed 


An instruction is architecturally executed only if it would be executed in a simple sequential execution of the 
program. When such an instruction has been executed and retired it has been architecturally executed. Any 
instruction that, in a simple sequential execution of a program, is treated as a NOP because it fails its condition code 
check, is an architecturally executed instruction. 


In a PE that performs speculative execution, an instruction is not architecturally executed if the PE discards the 
results of a speculative execution. 


See also Condition code check, Simple sequential execution. 


Architecturally mapped 


Where this manual describes a register as being architecturally mapped to another register, this indicates that, in an 
implementation that supports both of the registers, the two registers access the same state. 


Architecturally UNKNOWN 


An architecturally UNKNOWN value is a value that is not defined by the architecture but must meet the requirements 
of the definition of UNKNOWN. Implementations can define the value of the field, but are not required to do so. 


See also IMPLEMENTATION DEFINED. 


ARM core registers
Some older documentation uses ARM core registers to refer to the following set of registers for execution in
AArch32 state:

• The 13 general-purpose registers, R0-R12, that software can use for processing.
• SP, the stack pointer, that can also be referred to as R13.
• LR, the link register, that can also be referred to as R14.
• PC, the program counter, that can also be referred to as R15.

See also General-purpose registers.

ARM instruction
See A32 instruction.

Associativity
See Cache associativity.

Atomicity
Describes either single-copy atomicity or multi-copy atomicity. Atomicity in the ARM architecture on page B2-81
defines these forms of atomicity for the ARM architecture.

See also Multi-copy atomicity, Single-copy atomicity.

Banked register
A register that has multiple instances, with the instance that is in use depending on the PE mode, Security state, or
other PE state.

Base register
A register specified by a load/store instruction that is used as the base value for the address calculation for the
instruction. Depending on the instruction and its addressing mode, an offset can be added to or subtracted from the
base register value to form the virtual address that is sent to memory.


Base register writeback
Describes writing back a modified value to the base register used in an address calculation.

Behaves as if
Where this manual indicates that a PE behaves as if a certain condition applies, all descriptions of the operation of
the PE must be re-evaluated taking account of that condition, together with any other conditions that affect
operation.
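
The Base register and Base register writeback entries above can be illustrated with a small sketch. The mode names follow the usual ARM offset, pre-indexed, and post-indexed addressing forms; the function `compute_access` and its parameters are hypothetical, and the sketch ignores register width and address wrapping.

```python
from typing import Tuple


def compute_access(base: int, offset: int, mode: str, add: bool = True) -> Tuple[int, int]:
    """Return (access_address, new_base_value) for a load/store.

    mode is one of:
      "offset"       - address is base +/- offset, base register unchanged
      "pre-indexed"  - address is base +/- offset, written back to the base
      "post-indexed" - address is base, then base +/- offset is written back
    """
    offset_address = base + offset if add else base - offset
    if mode == "offset":
        return offset_address, base
    if mode == "pre-indexed":
        return offset_address, offset_address
    if mode == "post-indexed":
        return base, offset_address
    raise ValueError(f"unknown addressing mode: {mode}")


# Walk one base value through each form with an offset of 8 bytes.
for mode in ("offset", "pre-indexed", "post-indexed"):
    address, new_base = compute_access(0x1000, 8, mode)
    print(mode, hex(address), hex(new_base))
```

Only the pre-indexed and post-indexed forms perform base register writeback; they differ in whether the access uses the updated or the original base value.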
Big-endian memory
Means that, for example:

• A byte or halfword at a word-aligned address is the most significant byte or halfword in the word at that
address.
• A byte at a halfword-aligned address is the most significant byte in the halfword at that address.

See also Endianness, Little-endian memory.

Blocking
Describes an operation that does not permit following instructions to be executed before the operation completes.

A non-blocking operation can permit following instructions to be executed before the operation completes, and in
the event of encountering an exception does not signal an exception to the PE. This enables implementations to retire
following instructions while the non-blocking operation is executing, without the need to retain precise PE state.

Branch prediction
Is where a PE selects a future execution path to fetch along. For example, after a branch instruction, the PE can
choose to speculatively fetch either the instruction following the branch or the instruction at the branch target.

See also Prefetching.

Breakpoint
A debug event triggered by the execution of a particular instruction, specified by one or both of the address of the
instruction and the state of the PE when the instruction is executed.

Byte
An 8-bit data item.
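
The Big-endian memory entry above can be checked with a short sketch. The value `WORD` and the variable names are arbitrary choices for illustration; the sketch models memory as a Python `bytes` object whose index 0 is the lowest, word-aligned address.

```python
WORD = 0x11223344

# Lay the word out in memory, lowest address first, in big-endian order.
memory = WORD.to_bytes(4, byteorder="big")

# The byte at the word-aligned address (index 0) is the most significant byte.
first_byte = memory[0]

# The halfword at the word-aligned address is the most significant halfword.
first_halfword = int.from_bytes(memory[0:2], byteorder="big")

# For contrast, little-endian memory places the least significant byte first.
little = WORD.to_bytes(4, byteorder="little")

print(hex(first_byte), hex(first_halfword), hex(little[0]))
```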


Cache associativity
The number of locations in a cache set to which an address can be assigned. Each location is identified by its way
value.

Cache level
The position of a cache in the cache hierarchy. In the ARM architecture, the lower numbered levels are those closest
to the PE. For more information, see Terms used in describing the maintenance instructions on page D3-1699.

Cache line
The basic unit of storage in a cache. Its size in words is always a power of two, usually 4 or 8 words. A cache line
must be aligned to a suitable memory boundary. A memory cache line is a block of memory locations with the same
size and alignment as a cache line. Memory cache lines are sometimes loosely called cache lines.

Cache lockdown
Enables critical software and data to be loaded into the cache so that the cache lines containing them are not
subsequently reallocated. It alleviates the delays caused by accessing a cache in a worst-case situation. This ensures
that all subsequent accesses to the software and data concerned are cache hits and so complete quickly.

Cache miss
A memory access that cannot be processed at high speed because the data it addresses is not in the cache.

Cache sets
Areas of a cache, divided up to simplify and speed up the process of determining whether a cache hit occurs. The
number of cache sets is always a power of two.

Cache way
A cache way consists of one cache line from each cache set. The cache ways are indexed from 0 to (Associativity-1).
Each cache line in a cache way is chosen to have the same index as the cache way. For example, cache way n consists
of the cache line with index n from each cache set.

Coherence order
See Coherent.

Coherent
Data accesses from a set of observers to a byte in memory are coherent if accesses to that byte in memory by the
members of that set of observers are consistent with there being a single total order of all writes to that byte in
memory by all members of the set of observers. This single total order of all writes to that memory location is the
coherence order for that byte in memory.
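
The relationship between the Cache sets, Cache way, and Cache associativity entries above can be sketched as a small model. The helper names are hypothetical, and the address-to-set mapping in `set_index` is a conventional choice assumed for illustration; the glossary does not define it.

```python
from typing import List


def build_cache(num_sets: int, associativity: int) -> List[List[str]]:
    """Model a cache as a list of sets, each holding `associativity` lines."""
    if num_sets <= 0 or num_sets & (num_sets - 1):
        raise ValueError("the number of cache sets must be a power of two")
    return [[f"set{s}/line{w}" for w in range(associativity)]
            for s in range(num_sets)]


def cache_way(cache: List[List[str]], n: int) -> List[str]:
    """Cache way n: the cache line with index n from each cache set."""
    return [cache_set[n] for cache_set in cache]


def set_index(address: int, line_bytes: int, num_sets: int) -> int:
    """Assumed mapping: consecutive memory cache lines go to consecutive sets."""
    return (address // line_bytes) % num_sets


cache = build_cache(num_sets=4, associativity=2)
print(cache_way(cache, 1))          # one line from each of the 4 sets
print(set_index(0x1040, 64, 4))     # set selected for address 0x1040
```

In this model the ways are indexed from 0 to associativity minus 1, and an address can be assigned to any of the `associativity` locations in its set.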


Condition code check 


The process of determining whether a conditional instruction executes normally or is treated as a NOP. For an 
instruction that includes a condition code field, that field is compared with the condition flags to determine whether 
the instruction is executed normally. For a T32 instruction in an IT block, the value of PSTATE.IT determines 
whether the instruction is executed normally. 


See also Condition code field, Condition flags, Conditional execution. 
Condition code field
A 4-bit field in an instruction that specifies the condition under which the instruction executes.

See also Condition code check.

Condition flags
The N, Z, C, and V bits of PSTATE, an SPSR, or FPSCR. See the register descriptions for more information.

See also Condition code check, PSTATE.


Conditional execution 
When a conditional instruction starts executing, if the condition code check returns TRUE, the instruction executes 
normally. Otherwise, it is treated as a NOP. 


See also Condition code check. 
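
The condition code entries above can be tied together with a sketch of the check itself. The 4-bit encoding used here is the standard ARM condition encoding (EQ, NE, CS, and so on), which this glossary excerpt does not spell out; the function names `condition_holds` and `execute_conditionally` are illustrative assumptions.

```python
from typing import Callable, Optional


def condition_holds(cond: int, n: bool, z: bool, c: bool, v: bool) -> bool:
    """Compare a 4-bit condition code field with the N, Z, C, V condition flags."""
    if not 0 <= cond <= 0b1111:
        raise ValueError("the condition code field is 4 bits wide")
    base = cond >> 1
    if base == 0b000:
        result = z                     # EQ / NE
    elif base == 0b001:
        result = c                     # CS / CC
    elif base == 0b010:
        result = n                     # MI / PL
    elif base == 0b011:
        result = v                     # VS / VC
    elif base == 0b100:
        result = c and not z           # HI / LS
    elif base == 0b101:
        result = n == v                # GE / LT
    elif base == 0b110:
        result = (n == v) and not z    # GT / LE
    else:
        result = True                  # AL
    # Bit 0 inverts the sense of the test, except for the value 0b1111.
    if (cond & 1) and cond != 0b1111:
        result = not result
    return result


def execute_conditionally(cond: int, flags: dict,
                          operation: Callable[[], int]) -> Optional[int]:
    """Run `operation` if the check passes; otherwise behave as a NOP."""
    if condition_holds(cond, flags["n"], flags["z"], flags["c"], flags["v"]):
        return operation()
    return None


flags = {"n": False, "z": True, "c": False, "v": False}
print(execute_conditionally(0b0000, flags, lambda: 42))  # EQ with Z set
print(execute_conditionally(0b0001, flags, lambda: 42))  # NE with Z set
```

The sketch covers only instructions with a condition code field. For a T32 instruction in an IT block, the manual states that PSTATE.IT determines whether the instruction executes normally.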


CONSTRAINED UNPREDICTABLE 
Where an instruction can result in UNPREDICTABLE behavior, the ARMv8 architecture specifies a narrow range of 
permitted behaviors. This range is the range of CONSTRAINED UNPREDICTABLE behavior. All implementations that 
are compliant with the architecture must follow the CONSTRAINED UNPREDICTABLE behavior. 


Execution at Non-secure EL1 or EL0 of an instruction that is CONSTRAINED UNPREDICTABLE can be implemented
as generating a trap exception that is taken to EL2, provided that at least one instruction that is not UNPREDICTABLE 
and is not CONSTRAINED UNPREDICTABLE causes a trap exception that is taken to EL2. 


In body text, the term CONSTRAINED UNPREDICTABLE is shown in SMALL CAPITALS. 
See also UNPREDICTABLE. 


Context switch
The saving and restoring of computational state when switching between different threads or processes. In this
manual, the term context switch describes any situation where the context is switched by an operating system and
might or might not include changes to the address space.


Context synchronization event 


One of:

• Performing an ISB operation. An ISB operation is performed when an ISB instruction is executed and does
not fail its condition code check.
• Taking an exception.
• Returning from an exception.
• Exit from Debug state.
• Executing a DCPS instruction.
• Executing a DRPS instruction.

The architecture requires a context synchronization event to guarantee visibility of any change to a System register.

Note
Context synchronization events were previously described as context synchronization operations.

Debugger
In most of this manual, debugger refers to any agent that is performing debug. However, some chapters or parts of
this manual require a more rigorous definition, and define debugger locally. See:

• Definition of a debugger in the context of self-hosted debug on page D2-1626.
• Definition of a debugger in the context of self-hosted debug on page G2-3922.
• Definition of a debugger in the context of external debug on page H1-4840.

Deprecated
Something that is present in the ARM architecture for backwards compatibility. Whenever possible software must
avoid using deprecated features. Features that are deprecated but are not optional are present in current
implementations of the ARM architecture, but might not be present, or might be deprecated and OPTIONAL, in future
versions of the ARM architecture.

See also OPTIONAL.
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Digital signal processing (DSP) 
Algorithms for processing signals that have been sampled and converted to digital form. DSP algorithms often use 
saturated arithmetic. 
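The saturated arithmetic that DSP algorithms often use clamps an out-of-range result instead of letting it wrap around. The following Python model is a minimal sketch of that difference for signed 32-bit addition. The function names and the choice of a 32-bit width are assumptions made for illustration, and the model does not represent any particular instruction.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1

def wrapping_add32(a: int, b: int) -> int:
    """Signed 32-bit addition that wraps around on overflow."""
    total = (a + b) & 0xFFFFFFFF                 # keep only the low 32 bits
    return total - (1 << 32) if total & 0x80000000 else total

def saturating_add32(a: int, b: int) -> int:
    """Signed 32-bit addition that clamps to the representable range."""
    return max(INT32_MIN, min(INT32_MAX, a + b))
```

With these definitions, adding 1 to the largest representable value wraps to the smallest value under normal arithmetic, but stays at the largest value under saturated arithmetic.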


Direct Memory Access (DMA) 
An operation that accesses main memory directly, without the PE performing any accesses to the data concerned. 


Direct read A direct read of a System register is a read performed by a System register access instruction. 
For more information, see Direct read on page D7-1891. 
See also Direct write, Indirect read, Indirect write. 

Direct write A direct write of a System register is a write performed by a System register access instruction. 
For more information, see Direct write on page D7-1891. 


See also Direct read, Indirect read, Indirect write. 


DMA See Direct Memory Access (DMA). 
DNM See Do-Not-Modify (DNM). 
Domain In the ARM architecture, domain is used in the following contexts. 


Shareability domain Defines a set of observers for which the shareability attributes make the data or unified 
caches transparent for data accesses. 


Power domain Defines a block of logic with a single, common, power supply. 


Memory regions domain 


When using the Short-descriptor translation table format, defines a collection of Sections, 
Large pages and Small pages of memory, that can have their access permissions switched 
rapidly by writing to the Domain Access Control Register (DACR). ARM deprecates any 
use of memory regions domains. 


Do-Not-Modify (DNM) 
Means the value must not be altered by software. DNM fields read as UNKNOWN values, and must only be written 
with the value read from the same field on the same PE. 


Double-precision value 
Consists of two consecutive 32-bit words that are interpreted as a basic double-precision floating-point number 
according to the IEEE Standard for Floating-point Arithmetic. 
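As an illustration, the following Python sketch combines two 32-bit words into one double-precision value. The function name is hypothetical, and treating the first argument as the least significant word is an assumption of this example. Which of the two consecutive words holds the most significant half depends on the endianness in use.

```python
import struct

def double_from_words(low_word: int, high_word: int) -> float:
    """Combine two 32-bit words into one IEEE 754 double-precision value."""
    bits = ((high_word & 0xFFFFFFFF) << 32) | (low_word & 0xFFFFFFFF)
    # Reinterpret the 64-bit pattern as a double, without numeric conversion.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

For example, the words 0x00000000 (low) and 0x3FF00000 (high) together encode the value 1.0.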


Doubleword A 64-bit data item. Doublewords are normally at least word-aligned in ARM systems. 


Doubleword-aligned 
Means that the address is divisible by 8. 
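The alignment terms in this glossary all reduce to a divisibility test on the address. A minimal Python sketch, using a hypothetical helper name, checks alignment for any power-of-two size, such as 2 for halfword-aligned, 8 for doubleword-aligned, or 16 for quadword-aligned:

```python
def is_aligned(address: int, size_bytes: int) -> bool:
    """Return True if address is divisible by size_bytes (a power of two)."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size_bytes must be a power of two")
    # For a power of two, divisibility means the low address bits are all zero.
    return (address & (size_bytes - 1)) == 0
```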


DSP See Digital signal processing (DSP). 


Effective value ° In some cases, the description of a control a specifies that when control a is active it causes a register control 
field b to be treated as having a fixed value for all purposes other than direct reads, or direct reads and direct 
writes, of the register containing control field b. When control a is active that fixed value is described as the 
Effective value of register control field b. For example, when the value of HCR.DC is 1, the Effective value 
of HCR.VM is 1, regardless of its actual value. 


° In other cases, in some contexts a register control field b is not implemented or is not accessible, but behavior 
of the PE is as if control field b was implemented and accessible, and had a particular value. In this case, that 
value is the Effective value of register control field b. 


° Otherwise, the Effective value of a register control field is the value of that field. 

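The HCR.DC example can be sketched as a small Python function. This is an illustrative model only. The function name is hypothetical, and the bit positions used for HCR.VM and HCR.DC are assumptions of this sketch that should be checked against the register description.

```python
HCR_VM = 1 << 0     # assumed bit position of HCR.VM
HCR_DC = 1 << 12    # assumed bit position of HCR.DC

def effective_hcr_vm(hcr: int) -> int:
    """Return the Effective value of HCR.VM for a given HCR register value."""
    if hcr & HCR_DC:
        return 1                        # HCR.DC == 1 fixes the Effective value at 1
    return 1 if hcr & HCR_VM else 0     # otherwise, the actual value of the field
```

A direct read of the register is unaffected in this model: it still returns the actual stored value of HCR.VM.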
Endianness An aspect of the system memory mapping. 
See also Big-endian memory and Little-endian memory. 


Exception Handles an event. For example, an exception could handle an external interrupt or an undefined instruction. 


Exception vector 
A fixed address that contains the address of the first instruction of the corresponding exception handler. 


Execution stream 
The stream of instructions that would have been executed by sequential execution of the program. 


Explicit access 
A read from memory, or a write to memory, generated by a load or store instruction executed by the PE. Reads and 
writes generated by hardware translation table accesses are not explicit accesses. 


External abort 
An abort that is generated by the external memory system. 

Fast Context Switch Extension (FCSE) 
Modifies the behavior of an ARM memory system to enable multiple programs running on the ARM PE to use 
identical address ranges, while ensuring that the addresses they present to the rest of the memory system differ. From 
ARMv6, ARM deprecates any use of the FCSE. The FCSE is: 

° Optional in an ARMv7 implementation that does not include the Multiprocessing Extensions. 
° Obsolete from the introduction of the Multiprocessing Extensions. 


FCSE See Fast Context Switch Extension (FCSE). 


Flat address mapping 


Is where the physical address for every access is equal to its virtual address. 


Flush-to-zero mode 


A processing mode that optimizes the performance of some floating-point algorithms by replacing the denormalized 
operands and intermediate results with zeros, without significantly affecting the accuracy of their final results. 
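The effect on a single value can be sketched in Python as follows. This is a software model of the idea, not a description of how hardware implements the mode. The function name is hypothetical, and keeping the sign of the flushed value is an assumption of this sketch.

```python
import math
import sys

def flush_to_zero(value: float) -> float:
    """Replace a denormalized double-precision value with a zero of the same sign."""
    # sys.float_info.min is the smallest positive normalized double.
    if value != 0.0 and abs(value) < sys.float_info.min:
        return math.copysign(0.0, value)
    return value
```

Normalized values, zeros, infinities, and NaNs pass through this function unchanged.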


General-purpose registers 
The registers that the base instructions use for processing: 

° In AArch32 state the general-purpose registers are R0-R14, that can also be described as R0-R12, SP, LR. 

Note 
Older documentation defines the AArch32 general-purpose registers as R0-R12, and the ARM core registers 
as R0-R12, SP, LR, and PC. 

° In AArch64 state the general-purpose registers are: 
— W0-W30 when accessed as 32-bit registers. 
— X0-X30 when accessed as 64-bit registers. 

See also High registers, Low registers. 


Halfword A 16-bit data item. Halfwords are normally halfword-aligned in ARM systems. 


Halfword-aligned 
Means that the address is divisible by 2. 


High registers 
In AArch32 state, the general-purpose registers R8-R14. Most 16-bit T32 instructions cannot access the high 
registers. 

Note 
In some contexts, high registers refers to R8-R15, meaning R8-R14 and the PC. 

See also General-purpose registers, Low registers. 


High vectors 
An alternative location for the exception vectors. The high vector address range is near the top of the address space, 
rather than at the bottom. 


IGNORED Indicates that the architecture guarantees that the bit or field is not interpreted or modified by hardware. 

In body text, the term IGNORED is shown in SMALL CAPITALS. 


Immediate and offset fields 
Are unsigned unless otherwise stated. 


Immediate value 
A value that is encoded directly in the instruction and used as numeric data when the instruction is executed. Many 
A64, A32, and T32 instructions can be used with an immediate argument. 


IMP An abbreviation used in diagrams to indicate that one or more bits have IMPLEMENTATION DEFINED behavior. 


IMPLEMENTATION DEFINED 
Means that the behavior is not architecturally defined, but must be defined and documented by individual 
implementations. 


In body text, the term IMPLEMENTATION DEFINED is shown in SMALL CAPITALS. 


Index register A register specified in some load and store instructions. The value of this register is used as an offset to be added to 
or subtracted from the base register value to form the virtual address that is sent to memory. Some instruction forms 
permit the index register value to be shifted before the addition or subtraction. 


Indirect read When an instruction uses a System register value to establish operating conditions, that use of the System register 
is an indirect read of the System register. 


For more information, including additional examples of indirect reads, see Indirect read on page D7-1891. 
See also Direct read, Direct write, Indirect write. 


Indirect write An indirect write of a System register occurs when the contents of a register are updated by some mechanism other 
than a Direct write to that register. For example, an indirect write to a register might occur as a side-effect of 
executing an instruction that does not perform a direct write to the register, or because of some operation performed 
by an external agent. 


For more information, see Indirect write on page D7-1891. 
See also Direct read, Direct write, Indirect read. 


Inline literals These are constant addresses and other data items held in the same area as the software itself. They are automatically 
generated by compilers, and can also appear in assembler code. 


Intermediate physical address (IPA) 
In an implementation of virtualization, the address to which a Guest OS maps a VA. A hypervisor might then map the 
IPA to a PA. Typically, the Guest OS is unaware of the translation from IPA to PA. 


See also Physical address (PA), Virtual address (VA). 
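The relationship between VA, IPA, and PA can be sketched as two chained lookups. In this Python model, the page size, the dictionary-based tables, and all names and addresses are hypothetical values chosen for illustration. Real translation uses translation tables held in memory.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

def translate(address: int, page_table: dict) -> int:
    """Map an address through a page-number table, keeping the page offset."""
    page, offset = divmod(address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

# Hypothetical mappings: the Guest OS controls the first, the hypervisor the second.
guest_va_to_ipa = {0x10: 0x80}
hypervisor_ipa_to_pa = {0x80: 0x42}

def va_to_pa(va: int) -> int:
    """Translate a VA to an IPA, and then translate that IPA to a PA."""
    ipa = translate(va, guest_va_to_ipa)
    return translate(ipa, hypervisor_ipa_to_pa)
```

The Guest OS sees only the first mapping, which is why it is typically unaware of the translation from IPA to PA.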


Interworking A method of working that permits branches between software using the A32 and T32 instruction sets. 
IPA See Intermediate physical address (IPA). 
Level See Cache level. 


Level of Coherence (LoC) 
The last level of cache that must be cleaned or invalidated when cleaning or invalidating to the point of coherency. 
For more information, see Terms used in describing the maintenance instructions on page D3-1699. 


See also Cache level, Point of coherency (PoC). 


Level of Unification, Inner Shareable (LoUIS) 
The last level of cache that must be cleaned or invalidated when cleaning or invalidating to the point of unification 
for the Inner Shareable shareability domain. For more information, see Terms used in describing the maintenance 
instructions on page D3-1699. 


See also Cache level, Point of unification (PoU). 


Level of Unification, uniprocessor (LoUU) 
For a PE, the last level of cache that must be cleaned or invalidated when cleaning or invalidating to the point of 
unification for that PE. For more information, see Terms used in describing the maintenance instructions on 
page D3-1699. 


See also Cache level, Point of unification (PoU). 


Line See Cache line. 


Little-endian memory 


Means that, for example: 


° A byte or halfword at a word-aligned address is the least significant byte or halfword in the word at that 
address. 
° A byte at a halfword-aligned address is the least significant byte in the halfword at that address. 


See also Big-endian memory, Endianness. 
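The two statements in the Little-endian memory entry can be checked with a short Python sketch that uses the standard struct module to lay out a word in little-endian byte order. The example value and variable names are arbitrary choices for illustration.

```python
import struct

word = 0x11223344
memory = struct.pack("<I", word)    # the four bytes, at ascending addresses

# The byte and the halfword located at the word-aligned address (offset 0).
byte_at_word_address = memory[0]
halfword_at_word_address = struct.unpack_from("<H", memory, 0)[0]
```

Here the byte at offset 0 is 0x44 and the halfword at offset 0 is 0x3344, which are the least significant byte and halfword of the word.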


Load/Store architecture 
An architecture where data-processing operations only operate on register contents, not directly on memory 
contents. 


LoC See Level of Coherence (LoC). 
LoUIS See Level of Unification, Inner Shareable (LoUIS). 
LoUU See Level of Unification, uniprocessor (LoUU). 
Lockdown See Cache lockdown. 


Low registers 
In AArch32 state, general-purpose registers R0-R7. Unlike the high registers, all T32 instructions can access the 
low registers. 


See also General-purpose registers, High registers. 


Memory barrier 
See Memory barriers on page B2-87. 


Memory coherency 


The problem of ensuring that when a memory location is read, either by a data read or an instruction fetch, the value 
actually obtained is always the value that was most recently written to the location. This can be difficult when there 
are multiple possible physical locations, such as main memory and at least one of a write buffer and one or more 
levels of cache. 


Memory Management Unit (MMU) 


Provides detailed control of the part of a memory system that provides a single stage of address translation. Most of 
the control is provided using translation tables that are held in memory, and define the attributes of different regions 
of the physical memory map. 


Memory Protection Unit (MPU) 
A hardware unit whose registers provide simple control of a limited number of protection regions in memory. 


Miss See Cache miss. 
MMU See Memory Management Unit (MMU). 
MPU See Memory Protection Unit (MPU). 


Multi-copy atomicity 
The form of atomicity described in Requirements for multi-copy atomicity on page B2-82. 
See also Atomicity, Single-copy atomicity. 


NaN Not a Number. A floating-point value that can be used when neither a numeric value nor an infinity is 
appropriate. A NaN can be a quiet NaN, that propagates through most floating-point operations, or a signaling 
NaN, that causes an Invalid Operation floating-point exception when used. For more information, see the 
IEEE Standard for Floating-point Arithmetic. 


Natural eviction A natural eviction is an eviction that occurs in the course of the normal operation of the memory system, rather than 
because of an operation that explicitly causes an eviction from the cache, such as the execution of a cache 
maintenance instruction. Typically, a natural eviction occurs when the caching algorithm requires data to be cached 
but the cache does not have room for that data. 


Observer A PE or mechanism in the system, such as a peripheral device, that can generate reads from or writes to memory. 


Obsolete Obsolete indicates something that is no longer supported by ARM. When an architectural feature is described as 
obsolete, this indicates that the architecture has no support for that feature, although an earlier version of the 
architecture did support it. 


Offset addressing 
Means that the memory address is formed by adding or subtracting an offset to or from the base register value. 


OPTIONAL When applied to a feature of the architecture, OPTIONAL indicates a feature that is not required in an implementation 
of the ARM architecture: 


° If a feature is OPTIONAL and deprecated, this indicates that the feature is being phased out of the architecture. 
ARM expects such a feature to be included in a new implementation only if there is a known 
backwards-compatibility reason for the inclusion of the feature. 


A feature that is OPTIONAL and deprecated might not be present in future versions of the architecture. 


° A feature that is OPTIONAL but not deprecated is, typically, a feature added to a version of the ARM 
architecture after the initial release of that version of the architecture. ARM recommends that such features 
are included in all new implementations of the architecture. 


In body text, these meanings of the term OPTIONAL are shown in SMALL CAPITALS. 


Note: Do not confuse these ARM-specific uses of OPTIONAL with other uses of optional, where it has its usual 
meaning. These include: 


° Optional arguments in the syntax of many instructions. 


° Behavior determined by an implementation choice, for example the optional byte order reversal in an 
ARMv7-R implementation, where the SCTLR.IE bit indicates the implemented option. 


See also Deprecated. 
PA See Physical address (PA). 
PE See Processing element (PE). 


Physical address (PA) 
An address that identifies a location in the physical memory map. 


See also Intermediate physical address (IPA), Virtual address (VA). 
PoC See Point of coherency (PoC). 
PoU See Point of unification (PoU). 


Point of coherency (PoC) 
For a particular MVA, the point at which all agents that can access memory are guaranteed to see the same copy of 
a memory location. For more information, see Terms used in describing the maintenance instructions on 
page D3-1699. 


Point of unification (PoU) 
For a particular PE, the point by which the instruction and data caches and the translation table walks of that PE are 
guaranteed to see the same copy of a memory location. For more information, see Terms used in describing the 
maintenance instructions on page D3-1699. 


Post-indexed addressing 
Means that the memory address is the base register value, but an offset is added to or subtracted from the base 
register value and the result is written back to the base register. 







Prefetching Prefetching refers to speculatively fetching instructions or data from the memory system. In particular, instruction 
prefetching is the process of fetching instructions from memory before the instructions that precede them, in simple 
sequential execution of the program, have finished executing. Prefetching an instruction does not mean that the 
instruction has to be executed. 


In this manual, references to instruction or data fetching apply also to prefetching, unless the context explicitly 
indicates otherwise. 





Note 


The Prefetch Abort exception can be generated on any instruction fetch, and is not limited to speculative instruction 
fetches. 





See also Simple sequential execution. 


Pre-indexed addressing 
Means that the memory address is formed in the same way as for offset addressing, but the memory address is also 
written back to the base register. 
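The Offset addressing, Pre-indexed addressing, and Post-indexed addressing entries differ only in which address is used for the access and in what is written back to the base register. The following Python sketch models that difference. The function names are hypothetical, each function returns the pair (memory address, new base register value), and register width and wrap-around are ignored for simplicity.

```python
def offset_addressing(base: int, offset: int) -> tuple:
    """Access base + offset. The base register is left unchanged."""
    return base + offset, base

def pre_indexed_addressing(base: int, offset: int) -> tuple:
    """Access base + offset, and write that address back to the base register."""
    address = base + offset
    return address, address

def post_indexed_addressing(base: int, offset: int) -> tuple:
    """Access the unmodified base, then write base + offset back to the base register."""
    return base, base + offset
```

A negative offset models the case where the offset is subtracted from the base register value.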


Processing element (PE) 
The abstract machine defined in the ARM architecture, as documented in an ARM Architecture Reference Manual. 
A PE implementation compliant with the ARM architecture must conform with the behaviors described in the 
corresponding ARM Architecture Reference Manual. 


Protection region 
A memory region whose position, size, and other properties are defined by Memory Protection Unit registers. 


Protection Unit See Memory Protection Unit (MPU). 


Pseudo-instruction 
UAL assembler syntax that assembles to an instruction encoding that is expected to disassemble to a different 
assembler syntax, and is described in this manual under that other syntax. For example, MOV <Rd>, <Rm>, LSL #<n> 
is a pseudo-instruction that is expected to disassemble as LSL <Rd>, <Rm>, #<n>. 


PSTATE An abstraction of process state information. All of the instruction sets provide instructions that operate on elements 
of PSTATE. 


See also Condition flags. 
Quadword A 128-bit data item. Quadwords are normally at least word-aligned in ARM systems. 


Quadword-aligned 
Means that the address is divisible by 16. 


Quiet NaN A NaN that propagates unchanged through most floating-point operations. 
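As an illustration, the following Python sketch classifies a 32-bit single-precision bit pattern. It assumes the common encoding in which the most significant fraction bit distinguishes a quiet NaN from a signaling NaN. That encoding detail and the function name are assumptions of this sketch, not statements taken from this glossary.

```python
def classify_nan32(bits: int) -> str:
    """Classify a single-precision bit pattern as a quiet NaN, a signaling NaN, or neither."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return "not a NaN"              # a finite value, or an infinity
    # Assumed convention: most significant fraction bit set means quiet.
    return "quiet NaN" if fraction & 0x400000 else "signaling NaN"
```

The propagation of a quiet NaN can also be observed directly in Python, where adding a number to float("nan") yields a NaN again.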

RAO See Read-As-One (RAO). 

RAZ See Read-As-Zero (RAZ). 

RAO/SBOP In versions of the ARM architecture before ARMv8, Read-As-One, Should-Be-One-or-Preserved on writes. 


In ARMv8, RES1 replaces this description. 

See also UNK/SBOP, Read-As-One (RAO), RES1, Should-Be-One-or-Preserved (SBOP). 
RAO/WI Read-As-One, Writes Ignored. 

Hardware must implement the field as Read-as-One, and must ignore writes to the field. 

Software can rely on the field reading as all 1s, and on writes being ignored. 

This description can apply to a single bit that reads as 1, or to a field that reads as all 1s. 


See also Read-As-One (RAO). 


RAZ/SBZP In versions of the ARM architecture before ARMv8, Read-As-Zero, Should-Be-Zero-or-Preserved on writes. 
In ARMv8, RES0 replaces this description. 
See also UNK/SBZP, Read-As-Zero (RAZ), RES0, Should-Be-Zero-or-Preserved (SBZP). 

RAZ/WI Read-As-Zero, Writes Ignored. 
Hardware must implement the field as Read-as-Zero, and must ignore writes to the field. 
Software can rely on the field reading as all 0s, and on writes being ignored. 
This description can apply to a single bit that reads as 0, or to a field that reads as all 0s. 
See also Read-As-Zero (RAZ). 


Read-allocate cache 
A cache in which a cache miss on reading data causes a cache line to be allocated into the cache. 


Read-As-One (RAO) 
Hardware must implement the field as reading as all 1s. 


Software: 
° Can rely on the field reading as all 1s. 
° Must use a SBOP policy to write to the field. 


This description can apply to a single bit that reads as 1, or to a field that reads as all 1s. 
See also RAO/SBOP, RAO/WI, RES1. 


Read-As-Zero (RAZ) 
Hardware must implement the field as reading as all 0s. 


Software: 
° Can rely on the field reading as all 0s. 
° Must use a SBZP policy to write to the field. 


This description can apply to a single bit that reads as 0, or to a field that reads as all 0s. 
See also RAZ/SBZP, RAZ/WI, RES0. 


Read, modify, write 
In a read, modify, write instruction sequence, a value is read to a general-purpose register, the relevant fields updated 
in that register, and the new value written back. 
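The three steps of such a sequence can be sketched in Python. The register here is modeled as a plain dictionary entry, and the function name, accessor callbacks, and values are hypothetical. On a real PE, the read and the write would be performed by register access instructions.

```python
def read_modify_write(read_register, write_register, mask: int, new_bits: int) -> int:
    """Update only the bits selected by mask, preserving every other bit."""
    value = read_register()                        # read
    value = (value & ~mask) | (new_bits & mask)    # modify the relevant field
    write_register(value)                          # write the new value back
    return value

# Usage with a hypothetical register modeled as a dictionary entry.
register = {"value": 0xA5A500F0}
read_modify_write(lambda: register["value"],
                  lambda v: register.update(value=v),
                  mask=0x000000FF, new_bits=0x0000003C)
```

After the call, only the low byte has changed. All bits outside the mask are written back with the values that were read.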


RES0 A reserved bit or field with Should-Be-Zero-or-Preserved (SBZP) behavior, or equivalent read-only or write-only 
behavior. Used for fields in register descriptions, and for fields in architecturally-defined data structures that are held 
in memory, for example in translation table descriptors. 


Within the architecture, there are some cases where a register bit or field: 

° Is RES0 in some defined architectural context. 
° Has different defined behavior in a different architectural context. 


Note 

° RES0 is not used in descriptions of instruction encodings. 
° Where an AArch32 System register is Architecturally mapped to an AArch64 System register, and a bit or 
field in that register is RES0 in one Execution state and has defined behavior in the other Execution state, this 
is an example of a bit or field with behavior that depends on the architectural context. 


This means the definition of RES0 for fields in read/write registers is: 


If a bit is RES0 in all contexts 
For a bit in a read/write register, it is IMPLEMENTATION DEFINED whether: 

1. The bit is hardwired to 0. In this case: 
° Reads of the bit always return 0. 
° Writes to the bit are ignored. 
2. The bit can be written. In this case: 
° An indirect write to the register sets the bit to 0. 
° A read of the bit returns the last value successfully written, by either a direct or an 
indirect write, to the bit. 
If the bit has not been successfully written since reset, then the read of the bit returns 
the reset value if there is one, or otherwise returns an UNKNOWN value. 
° A direct write to the bit must update a storage location associated with the bit. 
° The value of the bit must have no effect on the operation of the PE, other than 
determining the value read back from the bit, unless this Manual explicitly defines 
additional properties for the bit. 

Whether RES0 bits or fields follow behavior 1 or behavior 2 is IMPLEMENTATION DEFINED on a 
field-by-field basis. 
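The two permitted behaviors can be compared with a small Python model of a single bit. This is an illustrative sketch only. The class and method names are hypothetical, and the model assumes a reset value of 0, whereas the text above permits a read before any write to return an UNKNOWN value when there is no reset value.

```python
class Res0Bit:
    """Model of one read/write register bit that is RES0 in all contexts."""

    def __init__(self, hardwired: bool, reset_value: int = 0):
        self.hardwired = hardwired    # True selects behavior 1, False selects behavior 2
        self._stored = reset_value    # storage location, used only by behavior 2

    def direct_write(self, value: int) -> None:
        if not self.hardwired:
            self._stored = value & 1  # behavior 2: update the storage location
        # behavior 1: the write is ignored

    def indirect_write(self) -> None:
        if not self.hardwired:
            self._stored = 0          # behavior 2: an indirect write sets the bit to 0

    def read(self) -> int:
        # behavior 1 always reads as 0, behavior 2 returns the last value written
        return 0 if self.hardwired else self._stored
```

Because a bit that follows behavior 2 can read back as 1 after a direct write of 1, the model shows why portable software cannot assume that a RES0 bit reads as 0.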


If a bit is RES0 only in some contexts 

For a bit in a read/write register, when the bit is described as RES0: 

° An indirect write to the register sets the bit to 0. 

° A read of the bit must return the value last successfully written to the bit, by either a direct or 
an indirect write, regardless of the use of the register when the bit was written. 
If the bit has not been successfully written since reset, then the read of the bit returns the reset 
value if there is one, or otherwise returns an UNKNOWN value. 

° A direct write to the bit must update a storage location associated with the bit. 

° While the use of the register is such that the bit is described as RES0, the value of the bit must 
have no effect on the operation of the PE, other than determining the value read back from 
that bit, unless this Manual explicitly defines additional properties for the bit. 


Considering only contexts that apply to a particular implementation, if there is a context in which a 
bit is defined as RES0, another context in which the same bit is defined as RES1, and no context in 
which the bit is defined as a functional bit, then it is IMPLEMENTATION DEFINED whether: 

° Writes to the bit are ignored, and reads of the bit return an UNKNOWN value. 

° The value of the bit can be written, and a read returns the last value written to the bit. 


The RES0 description can apply to bits or fields that are read-only, or are write-only: 
° For a read-only bit, RES0 indicates that the bit reads as 0, but software must treat the bit as UNKNOWN. 
° For a write-only bit, RES0 indicates that software must treat the bit as SBZ. 


A bit that is RES0 in a context is reserved for possible future use in that context. To preserve forward compatibility, 
software: 
° Must not rely on the bit reading as 0. 
° Must use an SBZP policy to write to the bit. 


This RES0 description can apply to a single bit, or to a field for which each bit of the field must be treated as RES0. 
In body text, the term RES0 is shown in SMALL CAPITALS. 


See also Read-As-Zero (RAZ), RES1, Should-Be-Zero-or-Preserved (SBZP), UNKNOWN. 


RES1 A reserved bit or field with Should-Be-One-or-Preserved (SBOP) behavior, or equivalent read-only or write-only 
behavior. Used for fields in register descriptions, and for fields in architecturally-defined data structures that are held 
in memory, for example in translation table descriptors. 


Within the architecture, there are some cases where a register bit or field: 

° Is RES1 in some defined architectural context. 
° Has different defined behavior in a different architectural context. 


Note 

° RES1 is not used in descriptions of instruction encodings. 
° Where an AArch32 System register is Architecturally mapped to an AArch64 System register, and a bit or 
field in that register is RES1 in one Execution state and has defined behavior in the other Execution state, this 
is an example of a bit or field with behavior that depends on the architectural context. 





This means the definition of RES1 for fields in read/write registers is: 


If a bit is RES1 in all contexts 


For a bit in a read/write register, it is IMPLEMENTATION DEFINED whether: 


1. The bit is hardwired to 1. In this case: 
° Reads of the bit always return 1. 
° Writes to the bit are ignored. 
2. The bit can be written. In this case: 
° An indirect write to the register sets the bit to 1. 
° A read of the bit returns the last value successfully written, by either a direct or an 
indirect write, to the bit. 
If the bit has not been successfully written since reset, then the read of the bit returns 
the reset value if there is one, or otherwise returns an UNKNOWN value. 
° A direct write to the bit must update a storage location associated with the bit. 
° The value of the bit must have no effect on the operation of the PE, other than 
determining the value read back from the bit, unless this Manual explicitly defines 
additional properties for the bit. 


Whether RES1 bits or fields follow behavior 1 or behavior 2 is IMPLEMENTATION DEFINED on a 
field-by-field basis. 


If a bit is RES1 only in some contexts 

For a bit in a read/write register, when the bit is described as RES1: 

° An indirect write to the register sets the bit to 1. 

° A read of the bit must return the value last successfully written to the bit, regardless of the 
use of the register when the bit was written. 

Note 
As indicated in this list, this value might be written by an indirect write to the register. 

If the bit has not been successfully written since reset, then the read of the bit returns the reset 
value if there is one, or otherwise returns an UNKNOWN value. 

° A direct write to the bit must update a storage location associated with the bit. 

° While the use of the register is such that the bit is described as RES1, the value of the bit must 
have no effect on the operation of the PE, other than determining the value read back from 
that bit, unless this Manual explicitly defines additional properties for the bit. 


Considering only contexts that apply to a particular implementation, if there is a context in which a 
bit is defined as RES0, another context in which the same bit is defined as RES1, and no context in 
which the bit is defined as a functional bit, then it is IMPLEMENTATION DEFINED whether: 

° Writes to the bit are ignored, and reads of the bit return an UNKNOWN value. 
° The value of the bit can be written, and a read returns the last value written to the bit. 


The RES1 description can apply to bits or fields that are read-only, or are write-only: 
° For a read-only bit, RES1 indicates that the bit reads as 1, but software must treat the bit as UNKNOWN. 
° For a write-only bit, RES1 indicates that software must treat the bit as SBO. 


A bit that is RES1 in a context is reserved for possible future use in that context. To preserve forward compatibility, 
software: 
° Must not rely on the bit reading as 1. 
° Must use an SBOP policy to write to the bit. 


This RES1 description can apply to a single bit, or to a field for which each bit of the field must be treated as RES1. 
In body text, the term RES1 is shown in SMALL CAPITALS. 

See also Read-As-One (RAO), RES0, Should-Be-One-or-Preserved (SBOP), UNKNOWN. 


Reserved Unless otherwise stated: 

° Instructions that are reserved or that access reserved registers have UNPREDICTABLE or CONSTRAINED 
UNPREDICTABLE behavior. 

° Bit positions described as reserved are: 
— In an RW or WO register, RES0. 
— In an RO register, UNK. 


See also CONSTRAINED UNPREDICTABLE, RES0, RES1, UNDEFINED, UNK, UNPREDICTABLE. 


RISC Reduced Instruction Set Computer. 


Rounding error 
The value of the rounded result of an arithmetic operation minus the exact result of the operation. 


Rounding mode 
Specifies how the exact result of a floating-point operation is rounded to a value that is representable in the 
destination format. The rounding modes are defined by the IEEE Standard for Floating-point Arithmetic, see 
Floating-point standards, and terminology on page A1-48. 


Saturated arithmetic 
Integer arithmetic in which a result that would be greater than the largest representable number is set to the largest 
representable number, and a result that would be less than the smallest representable number is set to the smallest 
representable number. Signed saturated arithmetic is often used in DSP algorithms. It contrasts with the normal 
signed integer arithmetic used in ARM processors, in which overflowing results wrap around from +2^31-1 to -2^31 
or vice versa. 


SBO See Should-Be-One (SBO). 
SBOP See Should-Be-One-or-Preserved (SBOP). 
SBZ See Should-Be-Zero (SBZ). 
SBZP See Should-Be-Zero-or-Preserved (SBZP). 


Security hole 
A mechanism by which execution at the current level of privilege can achieve an outcome that cannot be achieved 
at the current or a lower level of privilege using instructions that are not UNPREDICTABLE and are not CONSTRAINED 
UNPREDICTABLE. The ARM architecture forbids security holes. 


See also CONSTRAINED UNPREDICTABLE, UNPREDICTABLE. 


Self-modifying code 


Code that writes one or more instructions to memory and then executes them. When using self-modifying code, you 
must use cache maintenance and barrier instructions to ensure synchronization. For more information, see Caches 
and memory hierarchy on page B2-70. 


Set See Cache sets. 


Should-Be-One (SBO) 


Hardware must ignore writes to the field. 


ARM strongly recommends that software writes the field as all 1s. If software writes a value that is not all 1s, it must 
expect an UNPREDICTABLE or CONSTRAINED UNPREDICTABLE result. 


This description can apply to a single bit that should be written as 1, or to a field that should be written as all 1s. 


See also CONSTRAINED UNPREDICTABLE, UNPREDICTABLE. 


Should-Be-One-or-Preserved (SBOP) 


From the introduction of the ARMv8 architecture, the description Should-Be-One-or-Preserved (SBOP) is 
superseded by RES1. 


Note 


The ARMv7 Large Physical Address Extension modified the definition of SBOP for register bits that are SBOP in 
some but not all contexts. The behavior of these bits is covered by the RES1 definition, but not by the generic 
definition of SBOP given here. 








Hardware must ignore writes to the field. 


When writing this field, software must either write all 1s to this field or, if the register is being restored from a 
previously read state, write the previously read value to this field. If this is not done, then the result is 
UNPREDICTABLE. 


This description can apply to a single bit that should be written as its preserved value or as 1, or to a field that should 
be written as its preserved value or as all 1s. 


See also CONSTRAINED UNPREDICTABLE, UNPREDICTABLE. 
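
The "write the previously read value" option in the definition above amounts to a read-modify-write in which only the fields that software owns are changed. The following C sketch shows that pattern. The register layout, the `FIELD_MASK` constant, and the function name `update_preserving` are assumptions made for illustration and do not describe any real register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only:
 *   bits [7:0]   software-writable field
 *   bits [31:8]  reserved, written back with the value that was read */
#define FIELD_MASK UINT32_C(0x000000FF)

/* Build the value to write back: change only the writable field and
 * preserve every other bit exactly as it was read. */
uint32_t update_preserving(uint32_t read_value, uint32_t new_field)
{
    return (read_value & ~FIELD_MASK) | (new_field & FIELD_MASK);
}
```

For example, `update_preserving(0xABCD1234, 0x56)` gives `0xABCD1256`: the upper 24 bits are returned unchanged.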


Should-Be-Zero (SBZ) 


Hardware must ignore writes to the field. 


ARM strongly recommends that software writes the field as all 0s. If software writes a value that is not all 0s, it must 
expect an UNPREDICTABLE or CONSTRAINED UNPREDICTABLE result. 


This description can apply to a single bit that should be written as 0, or to a field that should be written as all 0s. 


See also CONSTRAINED UNPREDICTABLE, UNPREDICTABLE. 


Should-Be-Zero-or-Preserved (SBZP) 


From the introduction of the ARMv8 architecture, the description Should-Be-Zero-or-Preserved (SBZP) is 
superseded by RES0. 


Note 


The ARMv7 Large Physical Address Extension modified the definition of SBZP for register bits that are SBZP in 
some but not all contexts. The behavior of these bits is covered by the RES0 definition, but not by the generic 
definition of SBZP given here. 


Hardware must ignore writes to the field. 


When writing this field, software must either write all 0s to this field or, if the register is being restored from a 
previously read state, write the previously read value to this field. If this is not done, then the result is 
UNPREDICTABLE. 


This description can apply to a single bit that should be written as its preserved value or as 0, or to a field that should 
be written as its preserved value or as all 0s. 


See also CONSTRAINED UNPREDICTABLE, UNPREDICTABLE. 


Signaling NaNs 


Cause an Invalid Operation exception whenever any floating-point operation receives a signaling NaN as an 
operand. Signaling NaNs can be used in debugging, to track down some uses of uninitialized variables. 
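
A minimal sketch of how a signaling NaN can be recognized from a raw single-precision bit pattern. It assumes the common convention in which the most significant fraction bit distinguishes quiet NaNs (1) from signaling NaNs (0). The function name `is_signaling_nan32` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Classify a raw single-precision bit pattern. A NaN has an all-ones
 * exponent and a nonzero fraction. Assumption: the most significant
 * fraction bit is the quiet bit (1 = quiet, 0 = signaling). */
bool is_signaling_nan32(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & UINT32_C(0x007FFFFF);
    bool quiet_bit = (fraction & UINT32_C(0x00400000)) != 0;
    return exponent == 0xFFu && fraction != 0 && !quiet_bit;
}
```

The check works on the bit pattern instead of a `float` value, so the test itself never passes a signaling NaN through a floating-point operation.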







Signed immediate and offset fields 
Are encoded in two’s complement notation unless otherwise stated. 
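
As a sketch of what two's complement encoding means for a field narrower than a register, the following C function sign-extends the low bits of an encoded field. The name `sign_extend` and its parameters are illustrative assumptions, not part of the architecture pseudocode.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low `bits` bits of `field` as a two's complement value.
 * `bits` must be in the range 1..32. */
int32_t sign_extend(uint32_t field, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t sign = UINT32_C(1) << (bits - 1);
    uint32_t mask = (sign << 1) - 1u;   /* all ones when bits == 32 */
    uint32_t value = field & mask;
    /* (value ^ sign) - sign maps [sign, mask] onto the negative range. */
    return (int32_t)((int64_t)(value ^ sign) - (int64_t)sign);
}
```

For a 12-bit field, `sign_extend(0xFFF, 12)` is -1, `sign_extend(0x800, 12)` is -2048, and `sign_extend(0x7FF, 12)` is 2047.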


SIMD Single-Instruction, Multiple-Data. 


The SIMD instructions in AArch32 state are: 


• The instructions summarized in Parallel addition and subtraction instructions on page F1-2378. 
• The Advanced SIMD instructions summarized in Advanced SIMD and floating-point instructions on 
page E1-2300, when operating on vectors. 


Note 


In ARMv7, some VFP instructions can operate on vectors. However, ARM deprecates those instruction uses, 
and strongly recommends that Advanced SIMD instructions are always used for vector operations. 
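
To illustrate the single-instruction, multiple-data idea, the following C sketch adds four independent unsigned 8-bit lanes that are packed into 32-bit words. It is loosely modeled on a four-lane byte addition and omits any status flags. The function name `add_u8x4` is hypothetical, and the loop is a scalar model of what one SIMD instruction does in a single operation.

```c
#include <assert.h>
#include <stdint.h>

/* Add four independent unsigned 8-bit lanes packed into 32-bit words.
 * Each lane wraps modulo 256, and no carry crosses a lane boundary. */
uint32_t add_u8x4(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        unsigned shift = lane * 8;
        uint32_t sum = ((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu);
        result |= (sum & 0xFFu) << shift;
    }
    return result;
}
```

For example, `add_u8x4(0x01FF7F10, 0x01010101)` gives `0x02008011`, whereas an ordinary 32-bit addition of the same operands gives `0x03008011` because the carry out of the `0xFF` lane propagates into the next byte.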








Simple sequential execution 
The behavior of an implementation that fetches, decodes and completely executes each instruction before 
proceeding to the next instruction. Such an implementation performs no speculative accesses to memory, including 
to instruction memory. The implementation does not pipeline any phase of execution. In practice, this is the 
theoretical execution model that the architecture is based on, and ARM does not expect this model to correspond to 
a realistic implementation of the architecture. 


Single-copy atomicity 
The form of atomicity described in Properties of single-copy atomic accesses on page B2-82. 


See also Atomicity, Multi-copy atomicity. 


Single-precision value 
A 32-bit word that is interpreted as a basic single-precision floating-point number according to the IEEE Standard 
for Floating-point Arithmetic. 
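
A short sketch of how the 32-bit word divides into sign, exponent, and fraction fields, assuming that the C type `float` uses the IEEE 754 binary32 format on the host. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision value. */
struct f32_fields {
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 30:23, biased by 127 */
    uint32_t fraction;  /* bits 22:0 */
};

/* Assumption: `float` is the IEEE 754 binary32 format on this platform. */
struct f32_fields decode_f32(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* well-defined type punning */
    struct f32_fields f = {
        .sign = bits >> 31,
        .exponent = (bits >> 23) & 0xFFu,
        .fraction = bits & UINT32_C(0x007FFFFF),
    };
    return f;
}
```

Under that assumption, `decode_f32(1.0f)` has sign 0, exponent 127, and fraction 0, and `decode_f32(-2.5f)` has sign 1, exponent 128, and fraction `0x200000`.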


Spatial locality The observed effect that after a program has accessed a memory location, it is likely to also access nearby memory 
locations in the near future. Caches with multi-word cache lines exploit this effect to improve performance. 


Special-purpose register 
One of a specified set of registers for which all direct and indirect reads and writes to the register appear to occur in 
program order relative to other instructions, without the need for any explicit synchronization: 


• Special-purpose registers on page C5-293 specifies the AArch64 Special-purpose registers. 
• AArch32 Special-purpose registers on page G1-3802 lists the AArch32 Special-purpose registers. 


T32 instruction One or two halfwords that specify an operation to be performed by a PE that is executing in an Exception level that 
is using AArch32 and is in T32 state. T32 instructions must be halfword-aligned. 


T32 instructions were previously called Thumb instructions. 


See also A32 instruction, A64 instruction, T32 state. 


T32 state The AArch32 Instruction set state in which the PE executes T32 instructions. 


T32 state was previously called Thumb state. 


See also A32 state, T32 instruction. 


Taken locally An exception that is taken locally means an exception that is taken to the default Exception level, and is not routed 
to another Exception level. 


Temporal locality 
The observed effect that after a program has accessed a memory location, it is likely to access the same memory 
location again in the near future. Caches exploit this effect to improve performance. 


Thumb instruction 
See T32 instruction. 


TLB See Translation Lookaside Buffer (TLB). 


TLB lockdown A way to prevent specific translation table walk results being accessed. This ensures that accesses to the associated 
memory areas never cause a translation table walk. 


Translation Lookaside Buffer (TLB) 
A memory structure containing the results of translation table walks. TLBs help to reduce the average cost of a 
memory access. Usually, there is a TLB for each memory interface of the ARM implementation. 


Translation table 
A table held in memory that defines the properties of memory areas of various sizes from 1KB to 1MB. 


Translation table walk 
The process of doing a full translation table lookup. It is performed automatically by hardware. 


Trap enable bits In VFPv2, VFPv3U, and VFPv4U, determine whether trapped or untrapped exception handling is selected. If 
trapped exception handling is selected, the way it is carried out is IMPLEMENTATION DEFINED. 


Unaligned An unaligned access is an access where the address of the access is not aligned to the size of an element of the access. 


Unaligned memory accesses 
Are memory accesses that are not, or might not be, appropriately halfword-aligned, word-aligned, or 
doubleword-aligned. 
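
A minimal sketch of an alignment check, for illustration. The function name `is_aligned` is hypothetical. The `size` argument is the access size in bytes: 2 for halfword, 4 for word, and 8 for doubleword alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if `address` is a multiple of `size`.
 * `size` must be a power of two: 2 for halfword, 4 for word,
 * 8 for doubleword alignment. */
bool is_aligned(uint64_t address, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (address & (size - 1)) == 0;
}
```

For example, address `0x1002` is halfword-aligned but not word-aligned, so `is_aligned(0x1002, 2)` is true and `is_aligned(0x1002, 4)` is false.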


Unallocated Except where otherwise stated in this manual, an instruction encoding is unallocated if the architecture does not 
assign a specific function to the entire bit pattern of the instruction, but instead describes it as CONSTRAINED 
UNPREDICTABLE, UNDEFINED, UNPREDICTABLE, or as an unallocated hint instruction. 


A bit in a register is unallocated if the architecture does not assign a function to that bit. 
See also CONSTRAINED UNPREDICTABLE, UNDEFINED, UNPREDICTABLE. 


UNDEFINED Indicates cases where an attempt to execute a particular encoding bit pattern generates an exception, that is taken to 
the current Exception level, or to the default Exception level for taking exceptions if the UNDEFINED encoding was 
executed at EL0. This applies to: 


• Any encoding that is not allocated to any instruction. 
• Any encoding that is defined as never accessible at the current Exception level. 
• Some cases where an enable, disable, or trap control means an encoding is not accessible at the current 
Exception level. 


If the generated exception is taken to an Exception level that is using AArch32 then it is taken as an Undefined 
Instruction exception. 


Note 


On reset, the default Exception level for taking exceptions from EL0 is EL1. However, an implementation might 
include controls that can change this, effectively making EL1 inactive. See the description of the Exception model 
for more information. 








In body text, the term UNDEFINED is shown in SMALL CAPITALS. 


See also Undefined Instruction exception on page G1-3849. 


Unified cache Is a cache used for both processing instruction fetches and processing data loads and stores. 


Unindexed addressing 
Means addressing in which the base register value is used directly as the virtual address to send to memory, without 
adding or subtracting an offset. In most types of load/store instruction, unindexed addressing is performed by using 
offset addressing with an immediate offset of 0. 


In ARMv7 and earlier versions of the ARM architecture, and in the M-profile, the LDC, LDC2, STC, and STC2 
instructions have an explicit unindexed addressing mode that permits the offset field in the instruction to specify 
additional coprocessor options. 


UNK An abbreviation indicating that software must treat a field as containing an UNKNOWN value. 


Hardware must implement the bit as read as 0, or all 0s for a multi-bit field. Software must not rely on the field 
reading as zero. 


See also UNKNOWN. 


UNK/SBOP 


Hardware must implement the field as Read-As-One, and must ignore writes to the field. 


Software must not rely on the field reading as all 1s, and except for writing back to the register it must treat the value 
as if it is UNKNOWN. Software must use an SBOP policy to write to the field. 


This description can apply to a single bit that should be written as its preserved value or as 1, or to a field that should 
be written as its preserved value or as all 1s. 


See also Read-As-One (RAO), Should-Be-One-or-Preserved (SBOP), UNKNOWN. 


UNK/SBZP 


Hardware must implement the bit as Read-As-Zero, and must ignore writes to the field. 


Software must not rely on the field reading as all 0s, and except for writing back to the register must treat the value 
as if it is UNKNOWN. Software must use an SBZP policy to write to the field. 


This description can apply to a single bit that should be written as its preserved value or as 0, or to a field that should 
be written as its preserved value or as all 0s. 


See also Read-As-Zero (RAZ), Should-Be-Zero-or-Preserved (SBZP), UNKNOWN. 


UNKNOWN 


An UNKNOWN value does not contain valid data, and can vary from moment to moment, instruction to instruction, 
and implementation to implementation. An UNKNOWN value must not return information that cannot be accessed at 
the current or a lower level of privilege using instructions that are not UNPREDICTABLE, are not CONSTRAINED 
UNPREDICTABLE, and do not return UNKNOWN values. 


An UNKNOWN value must not be documented or promoted as having a defined value or effect. 


In body text, the term UNKNOWN is shown in SMALL CAPITALS. 


See also CONSTRAINED UNPREDICTABLE, UNDEFINED, UNK, UNPREDICTABLE. 


UNPREDICTABLE 


Means the behavior cannot be relied upon. UNPREDICTABLE behavior must not perform any function that cannot be 
performed at the current or a lower level of privilege using instructions that are not UNPREDICTABLE. 


UNPREDICTABLE behavior must not be documented or promoted as having a defined effect. 


An instruction that is UNPREDICTABLE can be implemented as UNDEFINED. 


Execution at Non-secure EL1 or EL0 of an instruction that is UNPREDICTABLE can be implemented as generating a 
trap exception that is taken to EL2, provided that at least one instruction that is not UNPREDICTABLE and is not 
CONSTRAINED UNPREDICTABLE causes a trap exception that is taken to EL2. 


In body text, the term UNPREDICTABLE is shown in SMALL CAPITALS. 


See also CONSTRAINED UNPREDICTABLE, UNDEFINED. 


VA See Virtual address (VA). 


VFP In ARMv7, an extension to the ARM architecture, that provides single-precision and double-precision 
floating-point arithmetic. 


Virtual address (VA) 


An address generated by an ARM PE. This means it is an address that might be held in the program counter of the 
PE. For a PMSA implementation, the virtual address is identical to the physical address. 


See also Intermediate physical address (IPA), Physical address (PA). 





Watchpoint A debug event triggered by an access to memory, specified in terms of the address of the location in memory being 
accessed. 

Way See Cache way. 


WI Writes Ignored. In a register that software can write to, a WI attribute applied to a bit or field indicates that the bit 
or field ignores the value written by software and retains the value it had before that write. 


See also RAO/WI, RAZ/WI, RES0, RES1. 


Word A 32-bit data item. Words are normally word-aligned in ARM systems. 


Word-aligned Means that the address is divisible by 4. 


Write-allocate cache 
A cache in which a cache miss on storing data causes a cache line to be allocated into the cache. 


Write-back cache 
A cache in which when a cache hit occurs on a store access, the data is only written to the cache. Data in the cache 
can therefore be more up-to-date than data in main memory. Any such data is written back to main memory when 
the cache line is cleaned or reallocated. Another common term for a write-back cache is a copy-back cache. 


Write-through cache 
A cache in which when a cache hit occurs on a store access, the data is written both to the cache and to main memory. 
This is normally done via a write buffer, to avoid slowing down the PE. 


Write buffer A block of high-speed memory that optimizes stores to main memory. 